Document Type

Thesis

Date of Degree

Fall 2016

Degree Name

MS (Master of Science)

Degree In

Electrical and Computer Engineering

First Advisor

Guadalupe Canahuate

Abstract

Vehicular crashes are the leading cause of death for young adult drivers; however, very little life course research focuses on drivers in their 20s. Moreover, most analyses of crash data are limited to simple correlation and regression. This thesis proposes a data-driven approach that uses machine-learning techniques to improve the quality of analysis.

We examine over 10 years of data from the Iowa Department of Transportation, first transforming all of it into a format suitable for analysis. The ages of the drivers involved in each crash are then discretized into groups to support better analysis. In doing this, we hope to better discover the relationship between driver age and the factors present in a given crash.

We use machine learning algorithms to determine the important attributes for each age group, with the goal of improving the predictivity of individual methods. This thesis follows a Knowledge Discovery workflow: we preprocess and transform the data into a usable state, then perform data mining to discover results and produce knowledge.

We hope to use this knowledge to improve predictivity for different age groups of drivers, using around 60 variables for most sets and 10 variables for some. We also explore future directions in which this data could be analyzed.

Keywords

Car Crashes, Data Analysis, Knowledge Discovery, Vehicles

Pages

vii, 39

Bibliography

39

Copyright

Copyright © 2016 John Dietrich Tollefson
