Document Type


Date of Degree

Fall 2011

Degree Name

PhD (Doctor of Philosophy)

Degree In


First Advisor

Pendergast, Jane F

First Committee Member

Cavanaugh, Joseph E

Second Committee Member

Lang, Joseph B

Third Committee Member

Smith, Brian J

Fourth Committee Member

Oleson, Jacob J


Longitudinal data analysis is common in biomedical research area. Generalized estimating equations (GEE) approach is widely used for longitudinal marginal models. The GEE method is known to provide consistent regression parameter estimates regardless of the choice of working correlation structure, provided the square root of n consistent nuisance parameters are used. However, it is important to use the appropriate working correlation structure in small samples, since it improves the statistical efficiency of β estimate. Several working correlation selection criteria have been proposed (Rotnitzky and Jewell, 1990; Pan, 2001; Hin and Wang, 2009; Shults et. al, 2009). However, these selection criteria have the same limitation in that they perform poorly when over-parameterized structures are considered as candidates. In this dissertation, new working correlation selection criteria are developed based on generalized eigenvalues. A set of generalized eigenvalues is used to measure the disparity between the bias-corrected sandwich variance estimator under the hypothesized working correlation matrix and the model-based variance estimator under a working independence assumption. A summary measure based on the set of the generalized eigenvalues provides an indication of the disparity between the true correlation structure and the misspecified working correlation structure. Motivated by the test statistics in MANOVA, three working correlation selection criteria are proposed: PT (Pillai's trace type criterion),WR (Wilks' ratio type criterion) and RMR (Roy's Maximum Root type criterion). The relationship between these generalized eigenvalues and the CIC measure is revealed.

In addition, this dissertation proposes a method to penalize for the over-parameterized working correlation structures. The over-parameterized structure converges to the true correlation structure, using extra parameters. Thus, the true correlation structure and the over-parameterized structure tend to provide similar variance estimate of the estimated β and similar working correlation selection criterion values. However, the over-parameterized structure is more likely to be chosen as the best working correlation structure by "the smaller the better" rule for criterion values. This is because the over-parameterization leads to the negatively biased sandwich variance estimator, hence smaller selection criterion value. In this dissertation, the over-parameterized structure is penalized through cluster detection and an optimization function. In order to find the group ("cluster") of the working correlation structures that are similar to each other, a cluster detection method is developed, based on spacings of the order statistics of the selection criterion measures. Once a cluster is found, the optimization function considering the trade-off between bias and variability provides the choice of the "best" approximating working correlation structure.

The performance of our proposed criterion measures relative to other relevant criteria (QIC, RJ and CIC) is examined in a series of simulation studies.


Generalized Eigenvalue, Generalized Estimating Equation, Longitudinal data, Model Selection, Penalization, Working Correlation Structure


xxi, 232 pages


Includes bibliographical references (pages 230-232).


Copyright 2011 Mijin Jang

Included in

Biostatistics Commons