DOI

10.17077/etd.dljipt1k

Document Type

Thesis

Date of Degree

Fall 2017

Access Restrictions

Access restricted until 01/31/2020

Degree Name

MS (Master of Science)

Degree In

Biomedical Engineering

First Advisor

Michael J. Schnieders

Second Advisor

Richard J.H. Smith

First Committee Member

Michael Mackey

Second Committee Member

Thomas Casavant

Third Committee Member

Terry Braun

Abstract

With recent advances and cost reductions in next-generation sequencing (NGS), the amount of genetic sequence data is increasing rapidly. However, before patient-specific genetic information reaches its full potential to advance clinical diagnostics, the immense degree of genetic heterogeneity that contributes to human disease must be more fully understood. For example, although large numbers of genetic variations are discovered during clinical use of NGS, annotating and understanding the impact of such coding variations on protein phenotype remains a bottleneck (e.g., what is the molecular mechanism behind deafness phenotypes?). Fortunately, computational methods are emerging that can be used to efficiently study protein coding variants, and thereby overcome the bottleneck brought on by rapid adoption of clinical sequencing.

To study proteins via physics-based computational algorithms, high-quality 3D structural models are essential. These protein models can be obtained using a variety of numerical optimization methods that operate on physics-based potential energy functions. Accurate protein structures serve as input to downstream variation analysis algorithms. In this work, we applied a novel amino acid side-chain optimization algorithm, which operated on an advanced model of atomic interactions (i.e., the AMOEBA polarizable force field), to a set of 164 protein structural models implicated in deafness. The resulting models were evaluated with the MolProbity structure validation tool. MolProbity “scores” were originally calibrated to predict the quality of X-ray diffraction data used to generate a given protein model (e.g., a MolProbity score of 1.0 Å or lower indicates a protein model from high-quality data, while a score of 4.0 Å or higher reflects relatively poor data). In this work, the side-chain optimization algorithm improved the mean MolProbity score from 2.65 Å (42nd percentile) to nearly atomic resolution at 1.41 Å (95th percentile). However, side-chain optimization with the AMOEBA many-body potential function is computationally expensive. Thus, a second contribution of this work is a parallelization scheme that utilizes NVIDIA graphics processing units (GPUs) to accelerate the side-chain repacking algorithm. With the use of one GPU, our side-chain optimization algorithm achieved a 25-fold speed-up compared to using two Intel Xeon E5-2680v4 central processing units (CPUs). We expect the GPU acceleration scheme to lessen demand on computing resources dedicated to protein structure optimization efforts and thereby dramatically expand the number of protein structures available to aid in interpretation of missense variations associated with deafness.

Keywords

AMOEBA, Dead-End Elimination, GPU Acceleration, High Performance Computing, Polarizable Force Field, Protein Structure

Pages

xi, 38 pages

Bibliography

Includes bibliographical references (pages 35-38).

Copyright

Copyright © 2017 Mallory RaNae Tollefson

Available for download on Friday, January 31, 2020
