DOI

10.17077/etd.lrb1o573

Document Type

Dissertation

Date of Degree

Fall 2016

Access Restrictions

Access restricted until 02/23/2021

Degree Name

PhD (Doctor of Philosophy)

Degree In

Biostatistics

First Advisor

Michael P. Jones

First Committee Member

Gideon K.D. Zamba

Second Committee Member

Kai Wang

Third Committee Member

Jeffrey D. Long

Fourth Committee Member

Kung-Sik Chan

Abstract

Luo and Tsai (2012) proposed a semiparametric proportional likelihood ratio model that is suitable for modeling a nonlinear monotonic relationship between a response variable and a covariate. Extending the generalized linear model, this model leaves the probability distribution unspecified and estimates it from the data. In this thesis, we extend the model to longitudinal data by incorporating random effects into the linear predictor. Using this model as the conditional density of the response variable given the random effects, we present a maximum likelihood approach to model estimation and inference. Two numerical estimation procedures are developed for response variables with finite support, one based on the Newton-Raphson algorithm and the other on the generalized expectation-maximization (GEM) algorithm. In both procedures, Gauss-Hermite quadrature is employed to approximate the integrals.

Upon convergence, the observed information matrix is estimated through second-order numerical differentiation of the log-likelihood function. Asymptotic properties of the maximum likelihood estimator are established under certain regularity conditions, and simulation studies are conducted to assess its finite-sample properties and to compare the proposed model with the generalized linear mixed model. The proposed method is illustrated in an analysis of data from a multi-site observational study of prodromal Huntington's disease.
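The role of Gauss-Hermite quadrature in a model of this kind can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a single normally distributed random intercept, a scalar covariate, and a conditional probability mass function on a finite support proportional to `baseline[y] * exp(eta * y)`. The function names (`normal_expectation`, `subject_marginal_likelihood`) and all parameter values are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def normal_expectation(h, sigma, n_nodes=20):
    """Approximate E[h(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature.

    hermgauss returns nodes and weights for the weight function exp(-x^2),
    so the substitution b = sqrt(2) * sigma * x and the factor 1/sqrt(pi)
    convert the rule to a normal expectation. h must accept a NumPy array.
    """
    nodes, weights = hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma * nodes
    return float(np.sum(weights * h(b)) / np.sqrt(np.pi))


def subject_marginal_likelihood(y_idx, x, beta, sigma, support, baseline,
                                n_nodes=20):
    """Marginal likelihood of one subject's repeated responses.

    Assumed (illustrative) conditional model: given random intercept b,
    P(Y = support[k] | x, b) is proportional to
    baseline[k] * exp((x * beta + b) * support[k]).
    y_idx holds the support index observed at each visit.
    """
    support = np.asarray(support, dtype=float)
    baseline = np.asarray(baseline, dtype=float)

    def conditional(b):
        # Product over visits of the conditional pmf, one value per node.
        out = np.ones_like(b)
        for k, xi in zip(y_idx, x):
            eta = xi * beta + b
            unnorm = baseline[None, :] * np.exp(eta[:, None] * support[None, :])
            out = out * unnorm[:, k] / unnorm.sum(axis=1)
        return out

    return normal_expectation(conditional, sigma, n_nodes)


# Hypothetical inputs: a four-point support and two visits for one subject.
support = [0.0, 1.0, 2.0, 3.0]
baseline = [0.1, 0.2, 0.3, 0.4]
lik = subject_marginal_likelihood(y_idx=[3, 2], x=[0.5, 1.0], beta=0.3,
                                  sigma=1.0, support=support,
                                  baseline=baseline)
```

In a full estimation procedure, the logarithm of this quantity would be summed over subjects and maximized over the regression coefficients, the random-effect variance, and the baseline probabilities. The number of quadrature nodes is a tuning choice that trades accuracy against computation.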

Public Abstract

Huntington's disease (HD) is a fatal neurodegenerative genetic disorder that causes motor abnormalities, mental decline, and behavioral symptoms. Neurobiological Predictors of Huntington's Disease (PREDICT-HD) is an international, multisite, longitudinal observational study of subjects who are at risk for HD. Participants' motor, cognitive, behavioral, and functional status, along with clinical diagnosis, were assessed annually. The total functional capacity (TFC) is a composite score that evaluates participants' functioning in occupation, handling finances, domestic chores, and activities of daily living. It ranges from 0 to 13, with 13 indicating normal functioning. It has been demonstrated that the TFC is a reliable indicator of disease progression. Identifying predictors of TFC decline is an important goal of PREDICT-HD research.

Given how the TFC is constructed, it is problematic to assume that this composite score follows any particular distribution. For example, the generalized linear mixed model (GLMM), a common model for longitudinal discrete data, requires specification of the probability distribution, which is restrictive in many cases. To relax this constraint, we propose a more flexible model that does not depend on specification of the underlying distribution but estimates it from the data. Our model builds upon the proportional likelihood ratio model proposed by Luo and Tsai (2012) by incorporating both fixed effects and random effects. The estimation procedure is based on the generalized expectation-maximization algorithm, with its validity established theoretically through rigorous proof and numerically through extensive simulation. The proposed model is then applied to the PREDICT-HD data to identify predictors of TFC decline.

Keywords

GLMM, Longitudinal data, Misspecification, Mixed Model, Proportional likelihood ratio model

Pages

x, 112 pages

Bibliography

Includes bibliographical references (pages 108-112).

Copyright

Copyright © 2016 Hongqian Wu

Available for download on Tuesday, February 23, 2021

Included in

Biostatistics Commons
