Document Type


Date of Degree

Spring 2018

Access Restrictions

Access restricted until 07/03/2019

Degree Name

PhD (Doctor of Philosophy)

Degree In


First Advisor

J. Robert Manak

First Committee Member

Josep Comeron

Second Committee Member

Prakash Nadkarni

Third Committee Member

Benjamin W. Darbro

Fourth Committee Member

Nick Street


During the Human Genome Project, the first billion bases were sequenced in four years, whereas the second billion bases were sequenced in four months (NHGRI, 2013). As efforts were made to improve every aspect of sequencing in this project, cost fell as speed increased (NHGRI, 2013). The Human Genome Project ended in April 2003, but research into faster and cheaper ways to sequence DNA remains active to date (NHGRI, 2013). On the one hand, these advancements have allowed the convenient and unbiased generation and interrogation of a variety of omics datasets; on the other hand, they have contributed substantially to the ever-increasing size of biological data. Informatics techniques are therefore indispensable tools in biology and medicine because they can efficiently store and probe large datasets. Bioinformatics is a specialized domain within informatics that focuses on the storage, organization, and analysis of biological data (NHGRI, 2013). Here, I have applied informatics approaches such as database design and web development to biological datasets to create a novel web-based resource that allows users to conveniently explore the comprehensive transcriptome of a common aquatic tunicate, Oikopleura dioica (O. dioica), and to access its associated annotations across key developmental time points. This unique resource will contribute substantially to studies on the development, evolution, and genetics of chordates using O. dioica as a model.

Mendelian or single-gene disorders such as cystic fibrosis, sickle-cell anemia, Huntington’s disease, and Rett syndrome run across generations in families (Chial, 2008). Allelic variations associated with Mendelian disorders primarily reside in the protein-coding regions of the genome, collectively called the exome (Stenson et al., 2009). Therefore, sequencing the exome rather than the whole genome is an efficient and practical approach to discovering etiologic variants in our genome (Bamshad et al., 2011). Renal agenesis (RA) is a severe form of congenital anomalies of the kidney and urinary tract (CAKUT) in which children are born with one kidney (unilateral renal agenesis) or no kidneys (bilateral renal agenesis) (Brophy et al., 2017; Yalavarthy & Parikh, 2003). In this study, we applied exome sequencing to selected patients in an RA pedigree that followed a Mendelian mode of disease transmission. Exome sequencing and molecular techniques, combined with my bioinformatics analysis, have led to the discovery of a novel RA gene called GREB1L (Brophy et al., 2017). This study demonstrates the validity of exome sequencing and bioinformatics techniques for narrowing down disease-associated mutations in the human genome. Additionally, the results from this study have contributed substantially to understanding the molecular basis of CAKUT. The discovery of novel etiologic variants will enhance our understanding of human diseases and development.

The high-throughput sequencing technique RNA-Seq has revolutionized the field of transcriptome analysis (Z. Wang, Gerstein, & Snyder, 2009). Concisely, a cDNA library is prepared from an RNA sample using the enzyme reverse transcriptase (Nottingham et al., 2016). Next, the cDNA is fragmented and sequenced on a sequencing platform of choice, and the resulting reads are either mapped to a reference genome or an assembled transcriptome, or assembled de novo to generate a transcriptome (Grabherr et al., 2011; Nottingham et al., 2016). Mapping allows high-resolution detection of transcript boundaries, quantification of transcript expression, and identification of novel transcripts in the genome. We have applied RNA-Seq to analyze gene expression patterns in the water flea Daphnia pulex (D. pulex) in two independent studies, to work out the genetic details underlying heavy-metal-induced stress (unpublished) and predator-induced phenotypic plasticity (PIPP) (Rozenberg et al., 2015). My bioinformatics analysis of the RNA-Seq data has facilitated the discovery of key biological processes participating in the metal-induced stress response and predator-induced defense mechanisms in D. pulex. These studies contribute to the fields of ecotoxicogenomics and phenotypic plasticity, and have given us mechanistic insight into the impact of toxicant and predator exposure on D. pulex at the biomolecular level.


Exome Sequencing, Genome browser, Heavy metal treatment, Phenotypic plasticity, Renal agenesis, RNA-Seq


xix, 210 pages


Includes bibliographical references (pages 171-210).


Copyright © 2018 Mrutyunjaya Parida

Available for download on Wednesday, July 03, 2019