Access restricted until 02/23/2019
PhD (Doctor of Philosophy)
Driven by the rapid development of, and growing need for, information technologies, researchers increasingly focus on high-dimensional data. Much work has been done on point estimation with oracle inequalities, coefficient estimation, and variable selection in high-dimensional regression models. However, there have been few studies of statistical inference for the regression coefficients. We therefore propose a regularized efficient score estimation and testing (RESET) approach for treatment effects in the presence of nuisance parameters, either low-dimensional or high-dimensional, in generalized linear models (GLMs). Building on the RESET method, we also develop a two-step approach to the same problem.
The RESET approach is based on estimating the efficient score function of the treatment parameters. The idea is to remove the influence of the nuisance parameters on the treatment parameters and to construct an efficient score function that can be used to estimate and test the treatment effect. The RESET approach can be used in both low-dimensional and high-dimensional settings. Simulation results show that it is comparable with the commonly used maximum likelihood estimators in most low-dimensional cases. We prove that the RESET estimator is consistent under some regularity conditions in both low-dimensional and high-dimensional linear models. We also show that the efficient score function of the treatment parameters follows a chi-square distribution, on the basis of which regularized efficient score tests are constructed to test for the treatment effect in both low-dimensional and high-dimensional GLMs.
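For orientation, the classical (unregularized) efficient score construction can be written as follows. This is a sketch in assumed notation, not the thesis's own: $\theta$ denotes the treatment parameters (of dimension $d$), $\eta$ the nuisance parameters, $\ell$ the full-sample log-likelihood, and $I$ the corresponding Fisher information. The chi-square limit shown is the standard fixed-dimensional result under the usual regularity conditions. How RESET regularizes the projection onto the nuisance scores in high dimensions is not specified in this abstract.

```latex
% Requires amsmath. All symbols are assumed notation for illustration.
\begin{align*}
S_\theta &= \frac{\partial \ell}{\partial \theta}, \qquad
S_\eta = \frac{\partial \ell}{\partial \eta}, \qquad
I = \begin{pmatrix}
      I_{\theta\theta} & I_{\theta\eta} \\
      I_{\eta\theta}   & I_{\eta\eta}
    \end{pmatrix} \\
S^{*}_{\theta} &= S_\theta - I_{\theta\eta}\, I_{\eta\eta}^{-1}\, S_\eta
  && \text{(efficient score)} \\
I^{*}_{\theta} &= I_{\theta\theta} - I_{\theta\eta}\, I_{\eta\eta}^{-1}\, I_{\eta\theta}
  && \text{(efficient information)} \\
T &= S^{*\top}_{\theta}\, \bigl(I^{*}_{\theta}\bigr)^{-1} S^{*}_{\theta}
  \;\xrightarrow{d}\; \chi^{2}_{d} \quad \text{under } H_0
  && \text{(score test statistic)}
\end{align*}
```

Subtracting $I_{\theta\eta} I_{\eta\eta}^{-1} S_\eta$ makes $S^{*}_{\theta}$ uncorrelated with the nuisance score, which is the sense in which the influence of the nuisance parameters is removed.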
The two-step approach is mainly intended for high-dimensional inference. It combines the RESET approach with a first step that selects "promising" variables in order to reduce the dimension of the regression model. The minimax concave penalty is adopted for its oracle property, meaning that it tends to select the "correct" variables asymptotically. The simulation results show that this approach still requires some improvement, which will be part of our future research.
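As an illustration of the penalty named above, here is a minimal Python sketch of the standard minimax concave penalty (MCP) and its univariate thresholding rule for a standardized predictor. The function names `mcp_penalty` and `mcp_threshold` and the parameter names `lam` and `gamma` are assumptions chosen for this example. This is not the thesis's implementation, and the screening step actually used there may differ.

```python
def mcp_penalty(t, lam, gamma):
    """Standard MCP evaluated at one coefficient t.

    lam   : regularization level (assumed name), lam >= 0
    gamma : concavity parameter (assumed name), gamma > 1
    """
    if lam < 0 or gamma <= 1:
        raise ValueError("need lam >= 0 and gamma > 1")
    a = abs(t)
    if a <= gamma * lam:
        # Lasso-like near zero, with the penalization rate tapering off.
        return lam * a - a * a / (2.0 * gamma)
    # Constant beyond gamma * lam: large coefficients get no extra penalty.
    return 0.5 * gamma * lam * lam


def mcp_threshold(z, lam, gamma):
    """Minimizer of 0.5 * (z - b)**2 + mcp_penalty(b, lam, gamma) over b."""
    if lam < 0 or gamma <= 1:
        raise ValueError("need lam >= 0 and gamma > 1")
    a = abs(z)
    if a <= lam:
        return 0.0  # small effects are set exactly to zero (variable dropped)
    if a <= gamma * lam:
        sign = 1.0 if z > 0 else -1.0
        # Soft-thresholded value, rescaled to reduce shrinkage.
        return sign * (a - lam) / (1.0 - 1.0 / gamma)
    return z  # large effects are left unshrunk


if __name__ == "__main__":
    for z in (0.5, 2.0, 5.0):
        print(z, mcp_threshold(z, lam=1.0, gamma=3.0))
```

The threshold rule shows why the penalty is attractive for a screening step: weak signals are removed exactly, while strong signals are returned without the shrinkage bias that the lasso would introduce.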
Finally, both the RESET and the two-step approaches are applied to a real data example to demonstrate their use, followed by a conclusion on the problems investigated here and a discussion of directions for future research.
With the rapid development of, and growing need for, information technologies, researchers often encounter high-dimensional data (data with more parameters to estimate than observations). One of the main problems they face is estimating the effect of a specific factor on a response variable while controlling for the influence of other factors. Few studies have addressed this problem. To tackle this challenge, we propose a new approach that is based on likelihood theory and combined with parameter penalization. This approach is called the regularized efficient score estimation and testing (RESET) approach.
In this dissertation, the methodology and asymptotic properties of RESET are presented in detail. To study its finite-sample performance, simulation studies are conducted to compare RESET with other estimation and testing methods in both low-dimensional and high-dimensional generalized linear models. In these simulations, RESET shows both advantages and disadvantages relative to the other approaches.
Finally, the utility of RESET for high-dimensional data is demonstrated by applying the method to a breast cancer dataset, followed by a conclusion and a discussion of directions for future research.
xvi, 156 pages
Includes bibliographical references (pages 154-156).
Copyright © 2016 Lixi Yu
Yu, Lixi. "Regularized efficient score estimation and testing (RESET) approach in low-dimensional and high-dimensional GLM." PhD (Doctor of Philosophy) thesis, University of Iowa, 2016.