Major Department

Biomedical Engineering


College of Engineering


BSE (Bachelor of Science in Engineering)

Session and Year of Graduation

Spring 2017

Honors Major Advisor

Edwin Dove

Thesis Mentor

Tom Casavant


Cancer genomics, in the context of informing clinical decisions with tumor genotype, is a field characterized by high-dimensional data. Computational approaches for evaluating sets of features to be used in machine learning methods are essential for yielding accurate predictive and prognostic models. Additionally, the publicly available results of the Broad Institute’s Firehose cancer genomics analysis pipeline present a wealth of information that may be useful for cancer genotyping. Power analysis and classifier comparison are performed with the goal of evaluating a gene-based mutation significance feature set (MutSig) from Firehose. These analyses reveal that while the MutSig features likely contain some prognostic information, the methods with which they are currently integrated do not provide enough predictive power to yield clinically useful decision support. Results also suggest that Random Forest or other bagged classifiers are potentially good candidates for feature selection and model building in this context.


genomics, cancer, machine learning, classification, bioinformatics, Firehose

Total Pages

12 pages


Copyright © 2017 Michael Rendleman