Date of Degree
PhD (Doctor of Philosophy)
Model selection criteria frequently arise from constructing estimators of discrepancy measures used to assess the disparity between the data generating model and a fitted approximating model. The widely known Akaike information criterion (AIC) results from utilizing Kullback's directed divergence (KDD) as the targeted discrepancy. Under appropriate conditions, AIC serves as an asymptotically unbiased estimator of KDD. The directed divergence is an asymmetric measure of separation between two statistical models, meaning that an alternate directed divergence may be obtained by reversing the roles of the two models in the definition of the measure. The sum of the two directed divergences is Kullback's symmetric divergence (KSD).
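The asymmetry of the directed divergence, and the way the symmetric divergence is built from its two orientations, can be illustrated numerically. The following is a minimal sketch only, assuming the usual Kullback-Leibler form for discrete distributions. The function names and the two probability vectors are hypothetical and chosen for illustration, and the scaling convention used in the thesis itself may differ.

```python
import math


def directed_divergence(p, q):
    """Directed divergence of q from p: sum_i p_i * log(p_i / q_i).

    Assumes p and q are probability vectors on the same support and
    that q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_divergence(p, q):
    """Symmetric divergence: the sum of the two directed divergences."""
    return directed_divergence(p, q) + directed_divergence(q, p)


# Hypothetical "generating" and "fitted" distributions (illustration only).
generating = [0.5, 0.3, 0.2]
fitted = [0.7, 0.2, 0.1]

forward = directed_divergence(generating, fitted)   # one orientation
reverse = directed_divergence(fitted, generating)   # roles of the models reversed
ksd = symmetric_divergence(generating, fitted)

print(f"forward directed divergence: {forward:.6f}")
print(f"reverse directed divergence: {reverse:.6f}")
print(f"symmetric divergence:        {ksd:.6f}")
```

Reversing the roles of the two distributions changes the directed divergence, whereas the symmetric divergence is unchanged when its arguments are swapped.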
A comparison of the two directed divergences indicates an important distinction between the measures. When used to evaluate fitted approximating models that are improperly specified, the directed divergence which serves as the basis for AIC is more sensitive towards detecting overfitted models, whereas its counterpart is more sensitive towards detecting underfitted models. Since KSD combines the information in both measures, it functions as a gauge of model disparity which is arguably more balanced than either of its individual components. With this motivation, we propose three estimators of KSD for use as model selection criteria in the setting of generalized linear models: KICo, KICu, and QKIC. These statistics function as asymptotically unbiased estimators of KSD under different assumptions and frameworks.
As with AIC, KICo and KICu are both justified for large-sample maximum likelihood settings; however, asymptotic unbiasedness holds under more general assumptions for KICo and KICu than for AIC. KICo serves as an asymptotically unbiased estimator of KSD in settings where the distribution of the response is misspecified. The asymptotic unbiasedness of KICu holds when the candidate model set includes underfitted models. QKIC is a modification of KICo. In the development of QKIC, the likelihood is replaced by the quasi-likelihood. QKIC can be used as a model selection tool when generalized estimating equations, a quasi-likelihood-based method, are used for parameter estimation. We examine the performance of KICo, KICu, and QKIC relative to other relevant criteria in simulation experiments. We also apply QKIC in a model selection problem for a randomized clinical trial investigating the effect of antidepressants on the temporal course of disability after stroke.
Generalized linear model, Kullback's symmetric divergence, Model selection
xii, 114 pages
Includes bibliographical references (pages 110-114).
Copyright 2011 Laura Acion