PhD (Doctor of Philosophy)
Jeffrey D. Dawson
High-frequency time series data have become increasingly common. In many settings, such as the medical sciences or economics, these series may additionally display semi-reflective boundaries. Such boundaries, whether physically existing, arbitrarily set, or determined by inherent qualities of the series, may be exceeded, yet the probable consequences of exceeding them offer incentives to return to mid-range levels. In a lane control setting, Dawson, Cavanaugh, Zamba, and Rizzo (2010) developed a weighted third-order autoregressive model that uses flat, linear, and quadratic projections with a signed error term to depict key features of driving behavior, where the probability of a negative residual is predicted via logistic regression. In this driving application, the intercept (Λ0) of the logistic regression model describes the central tendency of a particular driver, while the slope parameter (Λ1) can be interpreted intuitively as the propensity of the series to return to mid-range levels. We therefore call this the "re-centering" parameter, though this is a slight misnomer, since the logistic model describes not the position of the series but the probability of a negative residual. Within this framework, a multi-step estimation algorithm, which we label the Single-Pass method, was provided.
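To make the structure of such a model concrete, the following is a minimal simulation sketch, not the implementation of Dawson et al. (2010). It assumes that the three projections are the standard constant, linear, and quadratic extrapolations from the three most recent observations, that the projection weights sum to one, that the error magnitude is the absolute value of a normal draw, and that the logistic model for a negative residual uses the previous position as its covariate. The names `projections`, `simulate`, `lam0`, `lam1`, and `sigma` are hypothetical and chosen only for illustration.

```python
import numpy as np

def projections(y1, y2, y3):
    """Flat, linear, and quadratic extrapolations of the next value.

    y1 is the most recent observation, y3 the oldest of the three.
    """
    flat = y1                               # repeat the last value
    linear = 2.0 * y1 - y2                  # continue the last straight line
    quadratic = 3.0 * y1 - 3.0 * y2 + y3    # continue the last parabola
    return np.array([flat, linear, quadratic])

def simulate(n, beta, lam0, lam1, sigma, seed=0):
    """Simulate a series under the assumptions stated in the lead-in."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)    # assumed to sum to one
    y = np.zeros(n)                         # first three values start at zero
    for t in range(3, n):
        pred = beta @ projections(y[t - 1], y[t - 2], y[t - 3])
        # Logistic model for the probability of a negative residual
        # (assumption: the covariate is the previous position).
        p_neg = 1.0 / (1.0 + np.exp(-(lam0 + lam1 * y[t - 1])))
        sign = -1.0 if rng.random() < p_neg else 1.0
        # Signed error: a random sign applied to a non-negative magnitude.
        y[t] = pred + sign * abs(rng.normal(0.0, sigma))
    return y

series = simulate(500, beta=[0.6, 0.3, 0.1], lam0=0.0, lam1=2.0, sigma=0.05)
```

In this sketch a positive `lam1` makes a negative residual more likely when the series sits above zero, which is one way the "re-centering" behavior described above could arise.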
In addition to investigating the statistical properties of the Single-Pass method, we investigate several other estimation techniques. These include an Iterated Grid Search, which utilizes the underlying likelihood model, and four modified versions of the Single-Pass method. These Modified Single-Pass (MSP) techniques, respectively, use unconstrained least squares estimation for the vector of projection coefficients (Β), use unconstrained linear regression with a post-hoc application of the summation constraint, reduce the regression model to include only the flat and linear projections, or implement the Least Absolute Shrinkage and Selection Operator (LASSO). For each technique, mean bias, confidence intervals, and coverage probabilities were calculated; these indicated that, of the modifications, only the first two were promising alternatives.
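As a rough illustration of the second modification, the sketch below fits the projection coefficients by unconstrained least squares and then applies a summation constraint afterwards. The abstract does not say how the constraint is imposed post hoc; dividing the coefficients by their sum so that they total one is an assumption used here purely as a placeholder, as is the form of the three projections. The function name `fit_projection_weights` is hypothetical.

```python
import numpy as np

def fit_projection_weights(y):
    """Unconstrained least squares fit, then a post-hoc rescaling (assumed)."""
    y = np.asarray(y, dtype=float)
    y1, y2, y3 = y[2:-1], y[1:-2], y[:-3]   # lags 1, 2, 3 aligned with y[3:]
    # Design matrix: flat, linear, and quadratic projections (assumed forms).
    X = np.column_stack([y1, 2.0 * y1 - y2, 3.0 * y1 - 3.0 * y2 + y3])
    target = y[3:]
    beta_raw, *_ = np.linalg.lstsq(X, target, rcond=None)
    # Placeholder for the summation constraint: rescale to sum to one.
    beta = beta_raw / beta_raw.sum()
    # Indicator of a negative residual, the outcome a logistic model would use.
    negative = (target - X @ beta) < 0
    return beta_raw, beta, negative
```

The returned indicator could then serve as the response in a logistic regression to estimate Λ0 and Λ1, although that step and its covariate are not specified in this abstract.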
In a driving application, we therefore considered these two modified techniques along with the Single-Pass method and the Iterated Grid Search. Although each of these methods remains biased, with generally lower than ideal coverage probabilities, in a lane control setting each is able to distinguish between two populations based on disease status. We also found that the re-centering parameter, estimated from data collected in a driving simulator among a control population, is significantly correlated with neuropsychological outcomes as well as with driving errors committed on-road. Several of these correlations were apparent regardless of the estimation technique, indicating real-world validity of the model across related assessments. Additionally, the Iterated Grid Search produces the most distinct estimates, with generally lower bias and improved coverage, except for the estimate of Λ1. However, this method also requires potentially large time and memory commitments compared with the other techniques considered. Thus the optimal estimation scheme depends on the situation. When feasible, the Iterated Grid Search appears to be the best overall method currently available. However, if time or memory is a limiting factor, or if a reliable estimate of the re-centering parameter with reasonably accurate estimation of the Β vector is desired, the Modified Single-Pass technique that uses unconstrained linear regression followed by the summation constraint is a sensible alternative.
Copyright 2013 Amy May Johnson