Document Type


Date of Degree

Spring 2012

Degree Name

PhD (Doctor of Philosophy)

Degree In


First Advisor

Joseph Cavanaugh

Second Advisor

Gideon Zamba


Time series of counts are frequently encountered in biomedical and public health applications. For example, in disease surveillance, public health officials often monitor the occurrence of rare infections over time, and the resulting time series can be used to track changes in disease activity. For rare diseases with low infection rates, the observed counts typically contain a high frequency of zeros (zero-inflation), but the counts can also be very large during an outbreak. Failure to account for zero-inflation may result in misleading inference and the detection of spurious associations.
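
As a point of reference for the term "zero-inflated", one common formulation is the zero-inflated Poisson (ZIP) distribution, sketched below. The symbols \omega (probability of a structural zero) and \lambda (Poisson mean) are notation assumed here for illustration; the thesis itself may use a different parameterization or a different count distribution.

\[
P(Y = y) =
\begin{cases}
\omega + (1 - \omega)\, e^{-\lambda}, & y = 0, \\[4pt]
(1 - \omega)\, \dfrac{e^{-\lambda} \lambda^{y}}{y!}, & y = 1, 2, \ldots
\end{cases}
\qquad
E(Y) = (1 - \omega)\lambda, \quad
\operatorname{Var}(Y) = (1 - \omega)\lambda\,(1 + \omega\lambda).
\]

Because the variance exceeds the mean whenever \omega > 0, a plain Poisson model fitted to such data will tend to understate the variability of the counts.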

In this thesis, we develop two classes of statistical models for zero-inflated time series. The first part of the thesis introduces a class of observation-driven models in a partial likelihood framework. The expectation-maximization (EM) algorithm is applied to obtain the maximum partial likelihood estimator (MPLE). We establish the asymptotic theory of the MPLE under certain regularity conditions. The performance of different partial-likelihood-based model selection criteria is compared under model misspecification. In the second part of the thesis, we introduce a class of parameter-driven models in a state-space framework. To estimate the model parameters, we devise a Monte Carlo EM algorithm, where particle filtering and particle smoothing methods are employed to approximate the high-dimensional integrals in the E-step of the algorithm. Upon convergence, Louis' formula is used to find the observed information matrix. The proposed models are illustrated with simulated data and an application based on public health surveillance for syphilis, a sexually transmitted disease (STD) that remains a major public health challenge in the United States. An R package, called ZIM (Zero-Inflated Models), has been developed to fit both observation-driven and parameter-driven models.
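
To give a rough sense of how an EM algorithm handles zero-inflation, the following Python sketch fits a zero-inflated Poisson distribution to independent counts. It is a deliberately simplified illustration, not the method of the thesis: it assumes i.i.d. data with no covariates or time dependence, uses the full likelihood instead of a partial likelihood, and does not use or reproduce the ZIM package. The function names `simulate_zip` and `fit_zip_em` and the parameters `omega` and `lam` are hypothetical.

```python
import numpy as np


def simulate_zip(n, omega, lam, seed=0):
    """Draw n i.i.d. zero-inflated Poisson counts (illustrative helper)."""
    rng = np.random.default_rng(seed)
    structural_zero = rng.random(n) < omega   # zeros from the inflation component
    counts = rng.poisson(lam, size=n)         # counts from the Poisson component
    counts[structural_zero] = 0
    return counts


def fit_zip_em(y, n_iter=500, tol=1e-10):
    """EM estimates of (omega, lam) for i.i.d. zero-inflated Poisson data."""
    y = np.asarray(y, dtype=float)
    is_zero = y == 0
    # Crude starting values (an assumption; any interior values would do).
    omega, lam = 0.5 * is_zero.mean(), max(y.mean(), 1e-8)
    for _ in range(n_iter):
        # E-step: posterior probability that an observed zero is structural.
        p_structural = omega / (omega + (1.0 - omega) * np.exp(-lam))
        z = np.where(is_zero, p_structural, 0.0)
        # M-step: closed-form updates given the expected memberships z.
        omega_new = z.mean()
        lam_new = y.sum() / (1.0 - z).sum()
        converged = abs(omega_new - omega) + abs(lam_new - lam) < tol
        omega, lam = omega_new, lam_new
        if converged:
            break
    return omega, lam


if __name__ == "__main__":
    y = simulate_zip(n=20000, omega=0.4, lam=3.0, seed=1)
    omega_hat, lam_hat = fit_zip_em(y)
    print(f"omega_hat = {omega_hat:.3f}, lam_hat = {lam_hat:.3f}")
```

The latent indicator of a structural zero plays the role of the missing data: the E-step replaces it with its conditional expectation and the M-step then has closed-form updates. In a regression or time series setting, the M-step would instead involve weighted fits for the two model components.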


viii, 74 pages


Includes bibliographical references (pages 71-74).


Copyright 2012 Ming Yang

Included in

Biostatistics Commons