College of Engineering
BSE (Bachelor of Science in Engineering)
Session and Year of Graduation
Honors Major Advisor
Cancer genomics, in the context of informing clinical decisions with tumor genotype, is a field characterized by high-dimensional data. Computational approaches for evaluating the feature sets used in machine learning methods are essential for yielding accurate predictive and prognostic models. Additionally, the publicly available results of the Broad Institute’s Firehose cancer genomics analysis pipeline present a wealth of information that may be useful for cancer genotyping. Power analysis and classifier comparison are performed with the goal of evaluating a gene-based mutation significance feature set (MutSig) from Firehose. These analyses reveal that while the MutSig features likely contain some prognostic information, the methods with which they are currently integrated do not provide enough predictive power to result in clinically useful decision support. Results also suggest that Random Forest or other bagged classifiers are potentially good candidates for feature selection and model building in this context.
genomics, cancer, machine learning, classification, bioinformatics, Firehose
Copyright © 2017 Michael Rendleman