Date of Degree
PhD (Doctor of Philosophy)
Variable selection procedures for high dimensional data have been proposed and studied by a large amount of literature in the last few years. Most of the previous research focuses on the selection properties as well as the point estimation properties. In this paper, our goal is to construct the confidence intervals for some low-dimensional parameters in the high-dimensional setting. The models we study are the partially penalized linear and accelerated failure time models in the high-dimensional setting. In our model setup, all variables are split into two groups. The first group consists of a relatively small number of variables that are more interesting. The second group consists of a large amount of variables that can be potentially correlated with the response variable. We propose an approach that selects the variables from the second group and produces confidence intervals for the parameters in the first group. We show the sign consistency of the selection procedure and give a bound on the estimation error. Based on this result, we provide the sufficient conditions for the asymptotic normality of the low-dimensional parameters. The high-dimensional selection consistency and the low-dimensional asymptotic normality are developed for both linear and AFT models with high-dimensional data.
Accelerated failure time model, Asymptotic normality, Confidence interval, LASSO, Minimax concave penalty, Sign consistency
ix, 81 pages
Includes bibliographical references (pages 78-81).
Copyright 2014 Hao Chai