Document Type

Dissertation

Date of Degree

Spring 2017

Degree Name

PhD (Doctor of Philosophy)

Degree In

Psychological and Quantitative Foundations

First Advisor

Stephen B. Dunbar


A wealth of research on high school dropouts has accumulated over several decades. Compared with those who complete high school, the average high school dropout is estimated to cost the economy approximately $250,000 more over his or her lifetime through lower tax contributions, higher reliance on Medicaid and Medicare, higher rates of criminal activity, and higher reliance on welfare (Levin & Belfield, 2007). The nation suffers not only from the loss in revenue but also from the lower education level of the population. Individuals who drop out of high school are less likely to be in the labor force than adults who earned a high school credential, and they fare worse in many aspects of life.

In many studies of high school dropouts, an important challenge is determining an appropriate structural form for the statistical model used to make inferences and predictions. Many useful survival analysis models have been developed to study, within a competing risks framework, the probability of dropping out and the probability of graduating; however, few methods exist for establishing the competing risks structural form of a model when the data contain two educational milestones: dropout and graduation.

In this dissertation, we first use data from the National Education Longitudinal Study (NELS: 88/2000) to propose a discrete-time competing risks hazard model, together with a corresponding model selection process, for studying the contributions of students' academic ability, family background, school characteristics, and vocational education to the probabilities of graduating from or dropping out of high school. The model is intended to overcome shortcomings of the traditional models used in previous research.
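
For readers unfamiliar with the model class, one common way to write a discrete-time competing risks hazard is the multinomial logit form sketched below. This is an illustrative assumption, not necessarily the specification adopted in the dissertation; the symbols T_i, J_i, x_i, alpha and beta are introduced here only for the example.

```latex
% Assumed notation: T_i = period in which student i's high school spell ends,
% J_i = type of exit, x_i = covariates (e.g. ability, family background).

% Cause-specific hazard: probability of exit type j in period t,
% given that the student is still enrolled at the start of period t.
\[
h_{ij}(t) = \Pr(T_i = t,\ J_i = j \mid T_i \ge t,\ x_i),
\qquad j \in \{\text{dropout},\ \text{graduation}\}
\]

% Multinomial logit link, with "still enrolled" as the reference outcome.
\[
h_{ij}(t) = \frac{\exp(\alpha_{jt} + x_i^{\top}\beta_j)}
                 {1 + \sum_{k} \exp(\alpha_{kt} + x_i^{\top}\beta_k)}
\]

% Probability of remaining enrolled beyond period t.
\[
S_i(t) = \prod_{s \le t} \Bigl(1 - \sum_{j} h_{ij}(s)\Bigr)
\]
```

Under this form, each exit type has its own period-specific intercept and coefficient vector, so a covariate may raise the dropout hazard while lowering the graduation hazard.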

Within educational research, missing data are a very common occurrence and can easily complicate model selection. Handling missing data inappropriately can lead to biased and inaccurate inferences. This dissertation applies four missing data techniques to the key attributes: listwise deletion, dummy variable adjustment, mean imputation, and multiple imputation. Recommendations are offered for future research on handling missing data in educational research.
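
As a rough illustration of how the three simpler techniques differ, the following minimal Python sketch applies them to a toy data set. The records, variable names (`test_score`, `ses`) and helper functions are hypothetical and are not drawn from NELS or from the dissertation; multiple imputation is omitted because it requires an imputation model and the pooling of several completed data sets.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing value.
records = [
    {"test_score": 52.0, "ses": 0.3},
    {"test_score": None, "ses": -0.1},
    {"test_score": 48.0, "ses": None},
    {"test_score": 60.0, "ses": 0.5},
]


def listwise_deletion(rows, keys):
    """Keep only rows in which every listed variable is observed."""
    return [r for r in rows if all(r[k] is not None for k in keys)]


def mean_imputation(rows, key):
    """Replace missing values of `key` with the mean of the observed values."""
    observed = [r[key] for r in rows if r[key] is not None]
    fill = mean(observed)
    return [{**r, key: fill if r[key] is None else r[key]} for r in rows]


def dummy_variable_adjustment(rows, key, fill=0.0):
    """Fill missing values with a constant and add a 0/1 missingness flag."""
    flag = key + "_missing"
    return [
        {**r, key: fill if r[key] is None else r[key], flag: int(r[key] is None)}
        for r in rows
    ]


complete_cases = listwise_deletion(records, ["test_score", "ses"])
mean_filled = mean_imputation(records, "test_score")
flagged = dummy_variable_adjustment(records, "ses")
```

Listwise deletion shrinks the sample, mean imputation keeps every row but understates variability, and dummy variable adjustment keeps every row while letting a model estimate a separate effect for missingness.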

Finally, we outline the implementation of the proposed methodology. This research has potential theoretical merit and implications for educational policy. The dissertation adds to the limited body of quantitative literature on high school dropouts. A discrete-time competing risks hazard model for predicting the probability of dropping out could become part of a powerful tool for identifying students at risk of dropping out.


Competing risk, Dropout, Graduation, Survival analysis


xii, 126 pages


Includes bibliographical references (pages 96-102).


Copyright © 2017 Fan Yang