Document Type

Dissertation

Date of Degree

Summer 2011

Degree Name

PhD (Doctor of Philosophy)

Degree In

Statistics

First Advisor

Kung-Sik Chan

Abstract

In many high-dimensional problems, the dependence structure among the variables can be quite complex. An appropriate use of regularization techniques, coupled with other classical statistical methods, can often improve estimation and prediction accuracy and facilitate model interpretation by seeking a parsimonious model representation that involves only the subset of relevant variables. We propose two regularized stochastic regression approaches for efficiently estimating certain sparse dependence structures in the data. We first consider a multivariate regression setting, in which the large number of responses and predictors may be associated through only a few channels/pathways, and each of these associations may involve only a few responses and predictors. We propose a regularized reduced-rank regression approach, in which model estimation and rank determination are conducted simultaneously and the resulting regularized estimator of the coefficient matrix admits a sparse singular value decomposition (SVD). Second, we consider model selection for subset autoregressive moving-average (ARMA) models, for which automatic selection methods do not directly apply because the innovation process is latent. We propose to identify the optimal subset ARMA model by fitting a penalized regression, e.g. adaptive Lasso, of the time series on its own lags and the lags of the residuals from a long autoregression fitted to the time-series data, where the residuals serve as proxies for the innovations. Computational algorithms and regularization parameter selection methods for both proposed approaches are developed, and their properties are explored both theoretically and by simulation. Under mild regularity conditions, the proposed methods are shown to be selection consistent and asymptotically normal, and to enjoy the oracle properties. We apply the proposed approaches to several problems across disciplines, including cancer genetics, ecology and macroeconomics.
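
The subset ARMA procedure described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names (lag_matrix, adaptive_lasso, select_subset_arma), the coordinate-descent solver, the OLS-based adaptive weights, the long autoregression order and the fixed penalty value lam are all assumptions made for illustration, and the series is assumed to have mean zero after centering. The abstract states that a regularization parameter selection method is developed but does not describe it, so lam is simply passed in here.

```python
import numpy as np

def lag_matrix(x, max_lag, start):
    """Columns x[t-1], ..., x[t-max_lag] for t = start, ..., len(x)-1."""
    n = len(x)
    return np.column_stack([x[start - k : n - k] for k in range(1, max_lag + 1)])

def soft_threshold(z, t):
    return np.sign(z) * max(abs(z) - t, 0.0)

def adaptive_lasso(X, y, lam, gamma=1.0, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j| by coordinate
    descent, with weights w_j = 1/|b_ols_j|^gamma (an illustrative choice)."""
    n, d = X.shape
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    w = 1.0 / (np.abs(b_ols) + 1e-12) ** gamma
    b = b_ols.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b                       # current residual
    for _ in range(n_iter):
        for j in range(d):
            r += X[:, j] * b[j]         # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam * w[j]) / col_sq[j]
            r -= X[:, j] * b[j]         # put the updated coordinate back
    return b

def select_subset_arma(y, p, q, long_order, lam):
    """Step 1: fit a long AR by least squares and keep its residuals as
    proxies for the innovations. Step 2: adaptive Lasso of y_t on
    y_{t-1..t-p} and the residual lags e_{t-1..t-q}."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    n = len(y)
    X_ar = lag_matrix(y, long_order, long_order)
    phi = np.linalg.lstsq(X_ar, y[long_order:], rcond=None)[0]
    e = np.full(n, np.nan)
    e[long_order:] = y[long_order:] - X_ar @ phi
    start = long_order + max(p, q)      # first t with all lags available
    Z = np.hstack([lag_matrix(y, p, start), lag_matrix(e, q, start)])
    b = adaptive_lasso(Z, y[start:], lam)
    return b[:p], b[p:]                 # AR coefficients, MA coefficients

# Simulated example: AR term at lag 2 and MA term at lag 1 only.
rng = np.random.default_rng(0)
n = 600
eps = rng.standard_normal(n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 2] + eps[t] + 0.5 * eps[t - 1]

ar_coef, ma_coef = select_subset_arma(y, p=4, q=4, long_order=15, lam=0.05)
print("AR lags kept:", np.flatnonzero(ar_coef) + 1)
print("MA lags kept:", np.flatnonzero(ma_coef) + 1)
```

Coefficients shrunk exactly to zero correspond to lags excluded from the subset model. Which lags survive depends on lam, gamma and long_order, so in practice these would be chosen by a data-driven criterion rather than fixed by hand.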

Keywords

Biconvexity, Multivariate regression, Oracle properties, Reduced-rank regression, Seasonal ARIMA models, Sparsity

Pages

ix, 137 pages

Bibliography

Includes bibliographical references (pages 133-137).

Copyright

Copyright 2011 Kun Chen
