Spring 2010

PhD (Doctor of Philosophy)

John Geweke


In this dissertation I examine the effects of sample selection on the probability of stroke among older adults. If study subjects are selected into the sample based on some non-experimental selection process, then statistical analysis may produce inconsistent estimates.

Chapter 1 develops a model of non-ignorable selection for a discrete outcome variable, such as whether or not a stroke occurred. I begin by noting that the literature contains relatively few applications of the Heckman model to a discrete outcome variable, and that those applications are limited to the bivariate case. I then extend the Bayesian multivariate probit model of Chib and Greenberg (1998), broadly following the logic of Heckman's original (1979) work. The resulting model is general enough to handle multiple selection equations as well as discrete and continuous outcome equations.

The first extension of the multivariate probit model of Chib and Greenberg (1998) allows some of the outcomes to be missing. In particular, stroke occurrence is missing whenever a person is not selected into the sample. In the latent variable representation, this implies that the multivariate normal distribution is not truncated in the direction of the missing outcome. I also use a Cholesky factorization of the variance matrix to avoid the Metropolis-Hastings step in the Gibbs sampler.
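
The role of truncation in the latent variable representation can be illustrated with a minimal sketch. It is not the sampler developed in the dissertation: it assumes a bivariate case with one selection equation and one binary outcome, unit variances, a known correlation rho, known latent means, and a threshold of zero. The function names (draw_truncated_normal, update_latents) are hypothetical, and the Cholesky step is not shown. The selection latent variable is always truncated according to the observed selection indicator, while the outcome latent variable is truncated only when the outcome is observed.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def draw_truncated_normal(mean, sd, lower, upper, rng):
    """Inverse-CDF draw from N(mean, sd^2) restricted to (lower, upper).

    A bound of None means the interval is unbounded on that side.
    Simple illustration only: not numerically robust far in the tails.
    """
    a = STD_NORMAL.cdf((lower - mean) / sd) if lower is not None else 0.0
    b = STD_NORMAL.cdf((upper - mean) / sd) if upper is not None else 1.0
    u = a + (b - a) * rng.random()
    # Keep u strictly inside (0, 1) so that inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mean + sd * STD_NORMAL.inv_cdf(u)


def update_latents(z_y, selected, y, mu_s, mu_y, rho, rng):
    """One update of the latent pair (z_s, z_y) for a single person.

    Assumptions (illustrative, not from the dissertation): unit variances,
    correlation rho, and outcomes coded as 1 when the latent variable is
    positive.
    """
    sd = (1.0 - rho ** 2) ** 0.5

    # Selection is always observed, so z_s is always truncated.
    mean_s = mu_s + rho * (z_y - mu_y)
    if selected:
        z_s = draw_truncated_normal(mean_s, sd, 0.0, None, rng)
    else:
        z_s = draw_truncated_normal(mean_s, sd, None, 0.0, rng)

    mean_y = mu_y + rho * (z_s - mu_s)
    if not selected:
        # Outcome is missing: draw from the untruncated conditional normal.
        z_y = rng.gauss(mean_y, sd)
    elif y == 1:
        z_y = draw_truncated_normal(mean_y, sd, 0.0, None, rng)
    else:
        z_y = draw_truncated_normal(mean_y, sd, None, 0.0, rng)
    return z_s, z_y
```

For a person who is not selected, repeated calls such as update_latents(0.0, False, None, 0.0, 0.0, 0.5, random.Random(1)) keep z_s at or below zero while leaving z_y free to take either sign, which is the sense in which the distribution is not truncated in the direction of the missing outcome.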

Chapter 2 evaluates how severe the problem of sample selection is in the Assets and HEAlth Dynamics among the Oldest Old (AHEAD) data set. I start with the more restrictive assumption of ignorable selection. In particular, I apply the propensity score method used in a recent paper by Wolinsky et al. (2009) and find no selection effects in the study of stroke. I then apply the model developed in Chapter 1, which rests on the less restrictive assumption of non-ignorable selection, and again find no evidence of selection. Thus, the main substantive contribution of this chapter is the finding that selection effects are absent under both the ignorable and the non-ignorable sample selection models.


Copyright 2010 Maksym Obrizan

