Document Type


Date of Degree

Summer 2011

Degree Name

PhD (Doctor of Philosophy)

Degree In


First Advisor

Kung-Sik Chan


In many high-dimensional problems, the dependence structure among the variables can be quite complex. Appropriate use of regularization techniques, coupled with classical statistical methods, can often improve estimation and prediction accuracy and facilitate model interpretation by seeking a parsimonious model representation that involves only the subset of relevant variables. We propose two regularized stochastic regression approaches for efficiently estimating certain sparse dependence structures in the data. First, we consider a multivariate regression setting in which a large number of responses and predictors may be associated through only a few channels or pathways, each of which may involve only a few responses and predictors. We propose a regularized reduced-rank regression approach in which model estimation and rank determination are conducted simultaneously, and the resulting regularized estimator of the coefficient matrix admits a sparse singular value decomposition (SVD). Second, we consider model selection for subset autoregressive moving-average (ARMA) models, to which automatic selection methods do not directly apply because the innovation process is latent. We propose to identify the optimal subset ARMA model by fitting a penalized regression, e.g. the adaptive Lasso, of the time series on its own lags and on the lags of the residuals from a long autoregression fitted to the time-series data, where the residuals serve as proxies for the innovations. Computational algorithms and regularization parameter selection methods are developed for both approaches, and their properties are explored both theoretically and by simulation. Under mild regularity conditions, the proposed methods are shown to be selection consistent and asymptotically normal, and to enjoy the oracle properties. We apply the proposed approaches to problems in several disciplines, including cancer genetics, ecology and macroeconomics.
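The subset ARMA procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names (`lag_matrix`, `lasso_cd`, `subset_arma_sketch`), the default lag orders, the fixed penalty `lam`, the use of ordinary least squares estimates to form the adaptive weights, and the plain coordinate-descent solver are all assumptions made for illustration. The regularization parameter selection methods mentioned in the abstract are not reproduced here.

```python
import numpy as np


def lag_matrix(x, lags, start):
    """Columns hold x[t - k] for each k in lags, for rows t = start, ..., len(x) - 1."""
    n = len(x)
    return np.column_stack([x[start - k : n - k] for k in lags])


def lasso_cd(X, y, lam, weights, n_iter=500, tol=1e-8):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()  # current residual y - X @ b
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            old = b[j]
            rho = X[:, j] @ r / n + col_sq[j] * old
            # Soft-thresholding update with a coefficient-specific penalty.
            new = np.sign(rho) * max(abs(rho) - lam * weights[j], 0.0) / col_sq[j]
            if new != old:
                r -= X[:, j] * (new - old)
                b[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return b


def subset_arma_sketch(y, long_ar=10, p_max=3, q_max=3, lam=0.05):
    """Return (ar_coefs, ma_coefs); zeros mark lags dropped by the penalty.

    All default values are hypothetical and chosen only for illustration.
    """
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    n = len(y)

    # Step 1: fit a long autoregression by least squares; its residuals
    # serve as proxies for the latent innovations.
    X_ar = lag_matrix(y, range(1, long_ar + 1), long_ar)
    phi, *_ = np.linalg.lstsq(X_ar, y[long_ar:], rcond=None)
    eps = np.full(n, np.nan)
    eps[long_ar:] = y[long_ar:] - X_ar @ phi

    # Step 2: regress y_t on its own lags and on lagged residuals,
    # using only rows where every lagged residual is available.
    start = max(long_ar + q_max, p_max)
    X = np.hstack([
        lag_matrix(y, range(1, p_max + 1), start),
        lag_matrix(eps, range(1, q_max + 1), start),
    ])
    target = y[start:]

    # Adaptive weights from an initial least-squares fit (an assumption here).
    b_init, *_ = np.linalg.lstsq(X, target, rcond=None)
    weights = 1.0 / (np.abs(b_init) + 1e-12)

    b = lasso_cd(X, target, lam, weights)
    return b[:p_max], b[p_max:]
```

Calling `subset_arma_sketch(y)` on a one-dimensional series returns two coefficient vectors. Entries that are exactly zero correspond to autoregressive or moving-average lags excluded from the selected subset model.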


Biconvexity, Multivariate regression, Oracle properties, Reduced-rank regression, Seasonal ARIMA models, Sparsity


ix, 137 pages


Includes bibliographical references (pages 133-137).


Copyright 2011 Kun Chen