Date of Degree
PhD (Doctor of Philosophy)
Street, W. Nick
Segre, Alberto M.
First Committee Member
Street, W. Nick
Second Committee Member
Segre, Alberto M.
In many circumstances, predictions elicited from induced classification models are useful to a certain extent, as such predictions provide insight into what the future may hold. Such models, in and of themselves, hold little value beyond making such predictions, as they are unable to inform their user as to how to change a predicted outcome. Consider, for example, a health care domain where a classification model has been induced to learn the mapping from patient characteristics to disease outcome. A patient may want to know how to lessen their probability of developing such a disease.
In this document, four different approaches to inverse classification, the process of turning predictions into prescriptions by working backwards through an induced classification model to optimize for a particular outcome of interest, are explored. The first study develops an inverse classification framework designed to produce instance-specific, real-world feasible recommendations that optimally improve the probability of a good outcome, while being as classifier-permissive as possible. Real-world feasible recommendations are obtained by imposing constraints that specify which features can be optimized over and that account for user-specific preferences. Assumptions are made as to the differentiability of the classification function, permitting the use of classifiers with exploitable gradient information, such as support vector machines (SVMs) and logistic regression. Our results show that the framework produces real-world recommendations that successfully reduce the probability of a negative outcome.
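As a rough illustration of the gradient-based idea described above, the following minimal sketch lowers the predicted risk of a single instance under a logistic regression model while changing only the features marked as changeable and keeping the total change within a budget. It is not the framework developed in the thesis: the function names, the toy weights, the L1 budget, and the simple rescaling step (a feasibility fix, not an exact projection) are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_risk(weights, bias, x):
    """Predicted probability of the negative outcome under a logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def inverse_classify(weights, bias, x0, changeable, budget, step=0.1, iters=200):
    """Gradient descent on predicted risk over the changeable features only.

    `budget` caps the total absolute change (a hypothetical linear cost).
    When a step exceeds the budget, the change vector is rescaled back
    into the feasible region.
    """
    x = list(x0)
    for _ in range(iters):
        p = predict_risk(weights, bias, x)
        # d/dx_j sigmoid(w.x + b) = p * (1 - p) * w_j
        for j in changeable:
            x[j] -= step * p * (1.0 - p) * weights[j]
        total = sum(abs(x[j] - x0[j]) for j in changeable)
        if total > budget:
            scale = budget / total
            for j in changeable:
                x[j] = x0[j] + scale * (x[j] - x0[j])
    return x

# Toy model and instance (made-up numbers); feature 2 is treated as unchangeable.
weights, bias = [0.8, -0.5, 1.2], -0.3
x0 = [1.0, 0.5, 2.0]
x_new = inverse_classify(weights, bias, x0, changeable=[0, 1], budget=1.0)
print(predict_risk(weights, bias, x0), predict_risk(weights, bias, x_new))
```

The unchangeable feature is left untouched, and the recommendation is simply the difference between `x_new` and `x0` on the changeable features.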
In the second study, we further relax our assumptions as to the differentiability of the classifier, allowing virtually any classification function to be used. Correspondingly, we adjust our optimization methodology. To this end, three heuristic-based optimization methods are devised. Furthermore, non-linear (quadratic) relationships between feature changes and so-called cost, which accounts for user preferences, are explored. The results suggest that non-differentiable classifiers, such as random forests, can be successfully navigated using the specified framework and the updated, heuristic-based optimization methodology. Furthermore, the findings suggest that regularizers that encourage sparse solutions should be used when quadratic/non-linear cost-change relationships are specified.
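To make the gradient-free setting concrete, the sketch below treats the classifier as a black box and applies a plain greedy hill climb under a quadratic cost budget. It is not one of the three heuristics devised in the thesis, which are not named in this abstract; the hand-written `toy_risk` function stands in for a non-differentiable model such as a tree ensemble, and all names and numbers are hypothetical.

```python
def quadratic_cost(x, x0, cost_weights):
    """Weighted squared change from the original instance (assumed cost form)."""
    return sum(c * (a - b) ** 2 for c, a, b in zip(cost_weights, x, x0))

def hill_climb(risk_fn, x0, changeable, cost_weights, budget, step=0.25, iters=100):
    """Greedy coordinate search that only queries risk_fn (no gradients).

    Each round tries +/- step on every changeable feature, skips candidates
    whose quadratic cost exceeds the budget, and keeps the candidate with the
    lowest risk if it strictly improves on the current one.
    """
    x = list(x0)
    best = risk_fn(x)
    for _ in range(iters):
        best_move = None
        for j in changeable:
            for delta in (step, -step):
                cand = list(x)
                cand[j] += delta
                if quadratic_cost(cand, x0, cost_weights) > budget:
                    continue
                r = risk_fn(cand)
                if r < best:
                    best, best_move = r, cand
        if best_move is None:
            break  # no feasible improving move remains
        x = best_move
    return x, best

def toy_risk(x):
    """Stand-in for a piecewise-constant classifier: average of three stumps."""
    votes = [
        1.0 if x[0] > 1.0 else 0.0,
        1.0 if x[1] < 2.0 else 0.0,
        1.0 if x[2] > 0.5 else 0.0,
    ]
    return sum(votes) / len(votes)

x0 = [1.2, 1.8, 0.9]
cost_weights = [1.0, 1.0, 1.0]
x_new, risk_new = hill_climb(toy_risk, x0, changeable=[0, 1],
                             cost_weights=cost_weights, budget=0.2)
print(toy_risk(x0), risk_new, x_new)
```

A strict-improvement rule like this one can stall on flat regions of a piecewise-constant risk surface, which is one reason more elaborate heuristics are of interest in this setting.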
The third study takes a longitudinal approach to the problem, exploring the effects of applying the inverse classification process to instances across time. Furthermore, we explore the use of added temporal linkages, in the form of features representing past predicted outcome probability (i.e., risk), on the inverse classification results. We also explore and propose a solution to a missing-data subproblem that frequently arises in longitudinal data settings.
In the fourth and final study, a causal formulation of the inverse classification framework is provided and explored. The formulation encompasses a Gaussian Process-based method of inducing causal classifiers, which is subsequently leveraged when the inverse classification process is applied. Furthermore, the addition of certain dependencies is explored. The results suggest the importance of including such dependencies and the benefits of taking a causal approach to the problem.
causal learning, Classification, heuristics, Inverse classification, Machine learning, optimization
xv, 147 pages
Includes bibliographical references (pages 141-147).
Copyright © 2018 Michael Timothy Lash
Lash, Michael Timothy. "Optimizing outcomes via inverse classification." PhD (Doctor of Philosophy) thesis, University of Iowa, 2018.