Document Type

Dissertation

Date of Degree

Spring 2011

Degree Name

PhD (Doctor of Philosophy)

Degree In

Statistical Genetics

First Advisor

Veronica J. Vieland

Abstract

In this dissertation, the posterior probability of linkage (PPL) framework is extended to the analysis of case-control (CC) data, and three new linkage disequilibrium (LD) statistics are introduced. These statistics measure the evidence for or against LD, rather than testing the null hypothesis of no LD, and they therefore avoid the need for multiple testing corrections. They are suitable not only for CC designs but also for family data, ranging from trios to complex pedigrees, all under the same statistical framework, which allows for the unified analysis of these disparate data structures. They also provide the other core advantages of the PPL framework, including the use of sequential updating to accumulate LD evidence across potentially heterogeneous subsets of data; parameterization in terms of a very general trait likelihood, which simultaneously considers dominant, recessive, and additive models; and a straightforward mechanism for modeling two-locus epistasis. Finally, being implemented within the PPL framework, the new statistics readily allow linkage information obtained from distinct data to be incorporated into LD analyses in the form of a prior probability distribution. Performance of the proposed LD statistics is examined using simulated data. In addition, the effects of key modeling violations on performance are assessed. These statistics are also applied to a previously published type 1 diabetes (T1D) family dataset containing a few candidate genes with previously reported weak associations, and to a previously published T1D CC dataset from a genome-wide association (GWA) study in which some strong associations were reported. The new LD statistics under the PPLD framework confirm most of the findings in the published work and also identify some new SNPs suspected of being associated with T1D.
Sequential updating between the family dataset and the CC dataset dramatically increases the association signal strength for a CTLA4 SNP genotyped in both studies. Linkage information gleaned from the family dataset is also incorporated into the LD analysis of the CC dataset to demonstrate the utility of this unique feature of the PPL framework, and specifically of the new LD statistics.

Keywords

Association Mapping, Gene Mapping, Linkage Disequilibrium, Statistical Genetics, Type 1 Diabetes

Pages

ix, 129 pages

Bibliography

Includes bibliographical references (pages 121-129).

Copyright

Copyright 2011 Yungui Huang
