Document Type


Date of Degree

Spring 2017

Degree Name

PhD (Doctor of Philosophy)

Degree In

Psychological and Quantitative Foundations

First Advisor

Stephen B. Dunbar

First Committee Member

Catherine J. Welch

Second Committee Member

Timothy N. Ansley

Third Committee Member

Brandon C. LeBeau

Fourth Committee Member

David B. Bills


A wealth of research on high school dropouts has been conducted over several decades. It is estimated that, compared with those who complete high school, the average high school dropout costs the economy approximately $250,000 more over his or her lifetime in terms of lower tax contributions, higher reliance on Medicaid and Medicare, higher rates of criminal activity, and higher reliance on welfare (Levin & Belfield, 2007). The nation suffers not only because of the loss in revenue but also as a result of the lower education level of the population. Individuals who drop out of high school are less likely to be in the labor force than adults who earned a high school credential, and they fare worse in many aspects of life.

In many studies of high school dropouts, an important challenge is determining an appropriate structural form for the statistical model used to make inferences and predictions. Many useful survival analysis models have been developed to study, within a competing risks framework, the probability of dropping out and the probability of graduating; however, few methods exist for establishing the actual competing risks structural form of a model when the data contain two educational milestones: dropout and graduation.

In this dissertation, we first used data collected from the National Education Longitudinal Study (NELS: 88/2000) and proposed a discrete time competing risks hazard model, together with the corresponding model selection process, to study the contributions of students’ academic ability, family background, school characteristics, and vocational education to the probabilities of students graduating from or dropping out of high school. This model overcomes shortcomings of the traditional models used in previous research.
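
For readers unfamiliar with the model class, one common way to write a discrete time competing risks hazard is as a multinomial logit over the two events. The block below is a generic textbook-style sketch, not necessarily the exact specification used in the dissertation; the symbols (T_i, J_i, x_i, alpha, beta) are illustrative assumptions.

```latex
% Cause-specific hazard for student i in period t,
% with event type j in {dropout, graduation}:
h_{ij}(t) = \Pr\left(T_i = t,\; J_i = j \mid T_i \ge t,\; \mathbf{x}_i\right)

% A multinomial logit form, with "still enrolled" as the reference outcome:
h_{ij}(t) = \frac{\exp\left(\alpha_{jt} + \mathbf{x}_i^{\top}\boldsymbol{\beta}_j\right)}
                 {1 + \sum_{k \in \{\text{dropout},\, \text{graduation}\}}
                      \exp\left(\alpha_{kt} + \mathbf{x}_i^{\top}\boldsymbol{\beta}_k\right)}

% Probability of remaining enrolled through period t:
S_i(t) = \prod_{s \le t} \left(1 - \sum_{j} h_{ij}(s)\right)
```

Under this form, each event type has its own period-specific intercepts and covariate effects, so a predictor such as family background can relate differently to dropping out than to graduating.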

Within educational research, missing data are a very common occurrence and can easily complicate the model selection problem. Handling missing data inappropriately can lead to bias and inaccurate inferences. This dissertation applies four missing data techniques to the key attributes: listwise deletion, dummy variable adjustment, mean imputation, and multiple imputation. Recommendations are offered for future research on handling missing data in educational research.
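
To make the simpler techniques concrete, the following is a minimal Python sketch of listwise deletion, mean imputation, and dummy variable adjustment on a toy data set. The records, the `score` field, and the function names are hypothetical and are not taken from the dissertation or from NELS. Multiple imputation is omitted because it requires an explicit imputation model and pooling across several completed data sets.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing test score.
records = [
    {"id": 1, "score": 52.0},
    {"id": 2, "score": None},
    {"id": 3, "score": 61.0},
    {"id": 4, "score": None},
    {"id": 5, "score": 46.0},
]


def listwise_deletion(rows, key):
    """Drop every record whose value for `key` is missing."""
    return [dict(r) for r in rows if r[key] is not None]


def mean_imputation(rows, key):
    """Replace missing values with the mean of the observed values."""
    observed_mean = mean(r[key] for r in rows if r[key] is not None)
    return [
        {**r, key: observed_mean if r[key] is None else r[key]}
        for r in rows
    ]


def dummy_variable_adjustment(rows, key, fill=0.0):
    """Fill missing values with a constant and add a 0/1 missingness flag."""
    return [
        {
            **r,
            key: fill if r[key] is None else r[key],
            key + "_missing": int(r[key] is None),
        }
        for r in rows
    ]


complete_cases = listwise_deletion(records, "score")
mean_filled = mean_imputation(records, "score")
flagged = dummy_variable_adjustment(records, "score")
```

Each function returns new records rather than modifying the input, so the same raw data can be passed through every technique and the resulting model estimates compared.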

Finally, we outline the implementation of the proposed methodology. This research has both theoretical merit and implications for educational policy. My dissertation adds to the limited body of quantitative literature on high school dropouts. A discrete time competing risks hazard model for predicting the probability of dropping out could become part of a powerful tool for identifying students at risk of dropping out.

Public Abstract

Every year, over 1.2 million students drop out of high school in the United States. That’s 7,000 students every day. Dropping out of high school has been viewed as a serious educational and social problem. Because they leave high school before completion, most dropouts have serious educational deficiencies that severely limit their economic and social well-being throughout their adult lives. These individual consequences lead to social costs of billions of dollars.

Over the last 40 years, researchers have made great headway in understanding the reasons students drop out of high school. Several statistical models have been developed to predict which students are at risk of dropping out, enabling the adoption of proactive measures to alleviate the situation. However, few methods exist for establishing the actual competing risks structural form of a model when the data contain two educational milestones: dropout and graduation.

Using data from the National Education Longitudinal Study (NELS: 88/2000), this study examined how student academic ability, family background, school characteristics, and vocational education influenced students’ decisions on graduating from or dropping out of high school. Additionally, this study examined how missing data techniques influenced students’ engagement on leaving high school.

These findings have two major implications for dropout prevention. First, dropout prevention programs should adopt statistical modeling that tracks all students from the beginning of high school in order to identify at-risk students and provide them with additional support. Second, more extensive efforts should be made to investigate the data missing in the system.


Competing risk, Dropout, Graduation, Survival analysis


xii, 126 pages


Includes bibliographical references (pages 96-102).


Copyright © 2017 Fan Yang