Document Type


Date of Degree

Spring 2017

Access Restrictions


Degree Name

PhD (Doctor of Philosophy)

Degree In

Biomedical Engineering

First Advisor

Kai Tan

First Committee Member

Terry Braun

Second Committee Member

Michael Mackey

Third Committee Member

Kai Wang

Fourth Committee Member

Xiaodong Wang


Network biology has proven to be a powerful tool for representing and analyzing complex molecular networks, and it has been applied successfully to help understand a variety of biological processes. However, our current knowledge of the dynamics of gene networks during disease progression is rather limited. Moreover, network construction is a prerequisite for network analysis, and when the number of samples is limited, state-of-the-art computational methods for network construction are not robust because of low statistical power. In addition, molecular networks have been used extensively to improve the inference accuracy of causal coding variants, but this potential has not been investigated to the same extent for noncoding variants.

To address these limitations, I first developed the inference of multiple differential modules (iMDM) algorithm to study network dynamics. This method can identify both unique and shared modules from multiple gene networks, each of which represents a different perturbation condition. Using the iMDM algorithm, I identified different types of modules to understand heart failure progression and disease dynamics.

Next, I developed a computational framework to construct condition-specific transcriptional regulatory networks, together with a computational method to rank the transcription factors within such a network. Applying this framework to RNA-seq data for hematopoietic stem cell development, I constructed the corresponding transcriptional regulatory network and identified key transcription factors that play important roles in this process.

Finally, I developed Annotation of Regulatory Variants using Integrated Networks (ARVIN), a network-based algorithm for identifying causal genetic variants for diseases. By applying ARVIN to various diseases, I obtained a systems-level understanding of the gene circuitry that is affected by all enhancer mutations in a given disease.

Public Abstract

Genes and proteins often work together in an intricate network rather than acting in isolation. These biological networks contain abundant information revealing the overall physical and functional landscape of a biological system. Network analysis has been shown to be a powerful approach to studying biological phenomena because it provides a global picture of molecular interactions in different cell types and disease states. Existing network analysis methods mostly rely on mining protein-protein interaction networks, transcriptional regulatory networks (TRNs) or gene co-expression networks (Aittokallio and Schwikowski, 2006; Bebek and Yang, 2007; Gitter et al., 2011; Huang and Fraenkel, 2009; Ourfali et al., 2007). Healthy development and disease progression are both driven by dynamic changes in the activity and connectivity of gene pathways, and network biology provides powerful tools for studying such dynamic changes (Cho et al., 2012).

Currently, there is a lack of computational methods that enable the analysis of multiple gene networks to understand these dynamic events. In addition, many computational methods require a large number of gene expression profiles to construct network models. Unfortunately, the number of samples available for a particular condition is usually too small for these methods. Network analysis has also been applied to infer causal genetic variants for diseases (Lee et al., 2009; Zhang et al., 2013a). Although molecular networks can improve the inference accuracy of causal coding variants, their utility has not been examined for causal non-coding variants (Jia et al., 2011; Lee et al., 2011; Linghu et al., 2009; Moreau and Tranchevent, 2012). To address these problems, I developed three network-based methods to: 1) identify dynamic events across multiple gene networks during healthy development and disease progression; 2) construct condition-specific gene networks with a limited number of samples; and 3) infer causal non-coding variants for human diseases.


Complex diseases, Genetic variants, Hematopoietic stem cell, Network Biology


xviii, 183 pages


Includes bibliographical references (pages 168-183).


Copyright © 2017 Long Gao