Document Type

Dissertation

Date of Degree

Fall 2011

Degree Name

PhD (Doctor of Philosophy)

Degree In

Biostatistics

First Advisor

Joseph Cavanaugh

Abstract

Model selection criteria frequently arise from constructing estimators of discrepancy measures used to assess the disparity between the data-generating model and a fitted approximating model. The widely known Akaike information criterion (AIC) results from utilizing Kullback's directed divergence (KDD) as the targeted discrepancy. Under appropriate conditions, AIC serves as an asymptotically unbiased estimator of KDD. The directed divergence is an asymmetric measure of separation between two statistical models, meaning that an alternate directed divergence may be obtained by reversing the roles of the two models in the definition of the measure. The sum of the two directed divergences is Kullback's symmetric divergence (KSD).
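For reference, with g denoting the data-generating model and f a fitted approximating model (this notation is assumed here, not fixed by the abstract), the two directed divergences and their symmetrized sum can be written as:

```latex
% Kullback's directed divergence of f from g (the target of AIC):
I(g, f) = \mathrm{E}_g\!\left[\log g(Y)\right] - \mathrm{E}_g\!\left[\log f(Y)\right],

% the alternate directed divergence, with the roles of g and f reversed:
I(f, g) = \mathrm{E}_f\!\left[\log f(Y)\right] - \mathrm{E}_f\!\left[\log g(Y)\right],

% Kullback's symmetric divergence is the sum of the two:
J(g, f) = I(g, f) + I(f, g).
```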

A comparison of the two directed divergences indicates an important distinction between the measures. When used to evaluate fitted approximating models that are improperly specified, the directed divergence that serves as the basis for AIC is more sensitive toward detecting overfitted models, whereas its counterpart is more sensitive toward detecting underfitted models. Since KSD combines the information in both measures, it functions as a gauge of model disparity that is arguably more balanced than either of its individual components. With this motivation, we propose three estimators of KSD for use as model selection criteria in the setting of generalized linear models: KICo, KICu, and QKIC. These statistics function as asymptotically unbiased estimators of KSD under different assumptions and frameworks.

As with AIC, KICo and KICu are both justified for large-sample maximum likelihood settings; however, asymptotic unbiasedness holds under more general assumptions for KICo and KICu than for AIC. KICo serves as an asymptotically unbiased estimator of KSD in settings where the distribution of the response is misspecified. The asymptotic unbiasedness of KICu holds when the candidate model set includes underfitted models. QKIC is a modification of KICo in which the likelihood is replaced by the quasi-likelihood. QKIC can be used as a model selection tool when generalized estimating equations, a quasi-likelihood-based method, are used for parameter estimation. We examine the performance of KICo, KICu, and QKIC relative to other relevant criteria in simulation experiments. We also apply QKIC in a model selection problem for a randomized clinical trial investigating the effect of antidepressants on the temporal course of disability after stroke.
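The penalty contrast between a directed-divergence criterion and a symmetric-divergence criterion can be sketched in a small Gaussian simulation. The sketch below uses AIC = -2 log L + 2k alongside the classical symmetric-divergence criterion KIC = -2 log L + 3k (Cavanaugh, 1999) as a stand-in; the dissertation's KICo, KICu, and QKIC are not reproduced here, and all variable names and simulation settings are illustrative.

```python
import numpy as np

def gaussian_loglik(y, X):
    """Maximized Gaussian log-likelihood of a linear model fit by least squares."""
    n = len(y)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / n  # MLE of the error variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1)

def aic(loglik, k):
    # AIC: asymptotically unbiased estimator of Kullback's directed divergence
    return -2 * loglik + 2 * k

def kic(loglik, k):
    # Classical KIC (Cavanaugh, 1999), targeting Kullback's symmetric divergence;
    # the heavier 3k penalty reflects KSD's added sensitivity to overfitting
    return -2 * loglik + 3 * k

rng = np.random.default_rng(0)
n = 200
X_full = np.column_stack([np.ones(n)] + [rng.normal(size=n) for _ in range(6)])
beta = np.array([1.0, 0.5, -0.5, 0.0, 0.0, 0.0, 0.0])  # only 3 nonzero terms
y = X_full @ beta + rng.normal(size=n)

# Score nested candidate models of increasing size (k columns, k+1 parameters
# counting the error variance); smaller criterion values are preferred.
for k in range(1, 8):
    ll = gaussian_loglik(y, X_full[:, :k])
    print(k, round(aic(ll, k + 1), 1), round(kic(ll, k + 1), 1))
```

Because KIC adds a strictly larger penalty per parameter than AIC at the same log-likelihood, the model it selects can never be larger than the one AIC selects, which mirrors the overfitting/underfitting trade-off discussed above.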

Keywords

Generalized linear model, Kullback's symmetric divergence, Model selection

Pages

xii, 114 pages

Bibliography

Includes bibliographical references (pages 110-114).

Copyright

Copyright 2011 Laura Acion

