Document Type


Date of Degree

Spring 2010

Degree Name

PhD (Doctor of Philosophy)

Degree In


First Advisor

Zhang, Ying

First Committee Member

Huang, Jiang

Second Committee Member

Jones, Michael

Third Committee Member

Chan, Kung-Sik

Fourth Committee Member

Chaloner, Kathryn


In this thesis, we propose to analyze panel count data using a spline-based sieve generalized estimating equation method with a semiparametric proportional mean model $E(N(t) \mid Z) = \Lambda_0(t)\, e^{\beta_0^T Z}$. The natural log of the baseline mean function, $\log \Lambda_0(t)$, is approximated by a monotone cubic B-spline function. The estimates of the regression parameters and spline coefficients are the roots of the spline-based sieve generalized estimating equations (sieve GEE). The proposed method avoids assuming any parametric structure for the baseline mean function or the underlying counting process. Selecting an appropriate covariance matrix that represents the true correlation between the cumulative counts improves estimation efficiency.

In addition to the parameters in the proportional mean function, estimation that accounts for over-dispersion and autocorrelation involves an extra nuisance parameter $\sigma^2$, which can be estimated using the method of moments proposed by Zeger (1988). The parameters in the mean function are then estimated by solving the pseudo generalized estimating equation with $\sigma^2$ replaced by its estimate, $\sigma_n^2$. We show that the estimate of $(\beta_0, \Lambda_0)$ based on this two-stage approach is still consistent and can converge at the optimal convergence rate in the nonparametric/semiparametric regression setting. The asymptotic normality of the estimate of $\beta_0$ is also established. We further propose a spline-based projection variance estimating method and show its consistency.
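The moment step can be illustrated with a minimal sketch. It assumes the over-dispersed variance form Var(N) = μ + σ²μ², which is the form Zeger (1988) uses for count data; the abstract does not state the variance model, so this form, the function name `moment_overdispersion`, and the truncation at zero are all assumptions made for illustration rather than the thesis's own estimator.

```python
def moment_overdispersion(counts, means):
    """Moment estimate of sigma^2, ASSUMING Var(Y) = mu + sigma^2 * mu^2.

    counts: observed counts y_i
    means:  fitted means mu_i for the same observations
    (Hypothetical helper for illustration only.)
    """
    if len(counts) != len(means):
        raise ValueError("counts and means must have the same length")
    # Excess of the squared residuals over the Poisson variance mu_i.
    numerator = sum((y - m) ** 2 - m for y, m in zip(counts, means))
    denominator = sum(m ** 2 for m in means)
    if denominator == 0:
        raise ValueError("fitted means must not all be zero")
    # A variance parameter cannot be negative, so truncate at zero.
    return max(numerator / denominator, 0.0)


sigma2_hat = moment_overdispersion([0, 4, 1, 7], [1.0, 2.0, 3.0, 4.0])
```

In a two-stage scheme of the kind described above, a value like `sigma2_hat` would be plugged into the working covariance matrix before the estimating equation for the mean parameters is solved.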

Simulation studies are conducted to investigate the finite-sample performance of the sieve semiparametric GEE estimates, as well as of different variance estimating methods, at different sample sizes. The covariance matrix that accounts for over-dispersion generally increases estimation efficiency when over-dispersion is present in the data. Finally, the proposed method with different covariance matrices is applied to a real data set from a bladder tumor clinical trial.


Counting process, Generalized Estimating Equation, Monotone polynomial splines, Over-dispersion, Semiparametric model


x, 153 pages


Includes bibliographical references (pages 149-153).


Copyright 2010 Lei Hua

Included in

Biostatistics Commons