## Theses and Dissertations

Dissertation

Spring 2012

#### Degree Name

PhD (Doctor of Philosophy)

Biostatistics

Ying Zhang

#### Abstract

Monotone functions, such as growth functions and cumulative distribution functions, are often of interest in the statistical literature. In this dissertation, we propose a nonparametric least-squares method for estimating monotone functions induced from stochastic processes in which the starting time of the process is subject to interval censoring. We apply this method to estimate the mean function of tumor growth, using data from either animal experiments or tumor screening programs, to investigate tumor progression. In this type of application, the tumor onset time is only known to lie within an interval. The proposed method can also be used to estimate the cumulative distribution function of the elapsed time between two related events in human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS) studies, such as the HIV transmission time between two partners and the AIDS incubation time from HIV infection to AIDS onset. In these applications, both the initial event and the subsequent event are only known to occur within some intervals; such data are called doubly interval-censored data. The common property of these stochastic processes is that the starting time of the process is subject to interval censoring.

A unified two-step nonparametric estimation procedure is proposed for these problems. In the first step, the nonparametric maximum likelihood estimate (NPMLE) of the cumulative distribution function for the starting time of the stochastic process is computed within the framework of interval-censored data. In the second step, a specially designed least-squares objective function is constructed with this NPMLE plugged in, and the nonparametric least-squares estimate (NPLSE) of the mean function of tumor growth, or of the cumulative distribution function of the elapsed time, is obtained by minimizing that objective function. Modern empirical process theory is applied to prove the consistency of the proposed NPLSE. Extensive simulation studies provide numerical evidence for the validity of the NPLSE. The proposed estimation method is applied to two real scientific applications. In the first application, the California Partners' Study, we estimate the distribution function of the HIV transmission time between two partners. In the second application, the NPLSEs of the mean functions of tumor growth are computed for tumors with different stages at diagnosis, based on data from a cancer surveillance program, the SEER program. An ad hoc nonparametric statistic is designed to test the difference between two monotone functions in this context.

In this dissertation, we also propose a numerical algorithm, the projected Newton-Raphson algorithm, to compute non- and semi-parametric estimates for M-estimation problems subject to linear equality or inequality constraints. By combining the Newton-Raphson algorithm with the dual method for strictly convex quadratic programming, the projected Newton-Raphson algorithm shows the desired convergence rate. Compared to the well-known iterative convex minorant algorithm, the projected Newton-Raphson algorithm achieves much faster convergence when computing the non- and semi-parametric maximum likelihood estimates for panel count data.
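As a generic illustration of least-squares estimation under a monotonicity constraint, the following minimal sketch fits a nondecreasing sequence to fully observed data with the standard pool-adjacent-violators algorithm. It is not the dissertation's NPLSE, which additionally plugs in an NPMLE to handle the interval-censored starting time, and it is not the projected Newton-Raphson algorithm. The function name `pava` and the toy data are assumptions made for this example.

```python
def pava(y, w=None):
    """Weighted least-squares nondecreasing fit (pool-adjacent-violators).

    Illustrative only: assumes fully observed responses `y`, ordered by
    the covariate, with optional positive weights `w`.
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool adjacent blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand block means back to one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


# Toy data (hypothetical): the dip from 3 to 2 is pooled into one level.
fitted = pava([1, 3, 2, 4])
print(fitted)
```

Each violating pair is replaced by its weighted mean, so the output is the nondecreasing sequence closest to the data in weighted squared error.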

#### Keywords

Doubly interval-censored data, Empirical processes, HIV/AIDS, Interval-censored data, Monotone function, Tumor growth

2, x, 159 pages

#### Bibliography

Includes bibliographical references (pages 154-159).