Date of Degree
MS (Master of Science)
Electrical and Computer Engineering
Vehicular crashes are the leading cause of death for young adult drivers; however, very little life course research focuses on drivers in their 20s. Moreover, most analyses of crash data are limited to simple correlation and regression. This thesis proposes a data-driven approach that uses machine-learning techniques to enhance the quality of analysis.
We examine over 10 years of data from the Iowa Department of Transportation, first transforming all of it into a format suitable for analysis. The ages of the drivers involved in each crash are then discretized into groups based on the ages of the drivers present. In doing this, we hope to better uncover the relationship between driver age and the factors present in a given crash.
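As a rough illustration of the discretization step, the sketch below maps driver ages to age groups and summarizes the groups present in one crash. The bin edges, the group labels, and the function names `age_group` and `crash_age_groups` are assumptions made for this example; the abstract does not state the actual cut points or how the grouping was implemented.

```python
from bisect import bisect_right

# Hypothetical bin edges and labels. The abstract does not give the
# actual cut points used in the thesis.
AGE_EDGES = [20, 30, 65]
AGE_LABELS = ["under_20", "20s", "30_to_64", "65_plus"]


def age_group(age):
    """Map a driver age in years to a discrete age-group label."""
    # bisect_right counts how many edges are <= age, which is the bin index.
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


def crash_age_groups(driver_ages):
    """Return the distinct age groups of the drivers in one crash, youngest first."""
    groups = {age_group(a) for a in driver_ages}
    return sorted(groups, key=AGE_LABELS.index)
```

Under these assumed edges, a crash involving drivers aged 22, 27, and 45 would be described by the groups `20s` and `30_to_64`.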
We use machine learning algorithms to determine the important attributes for each age group, with the goal of improving the predictive performance of individual methods. The general format of this thesis follows a Knowledge Discovery workflow: we preprocess and transform the data into a usable state, then perform data mining to discover results and produce knowledge.
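One common way to rank attributes against a discrete target such as age group is information gain. The sketch below is only an illustration of that general idea; the abstract does not name the algorithms used in the thesis, and the function names and the row format (one dict per crash record) are assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    # Group the target labels by each distinct value of the attribute.
    by_value = {}
    for row in rows:
        by_value.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(s) / len(rows) * entropy(s) for s in by_value.values())
    return base - remainder


def rank_attributes(rows, attributes, target):
    """Order attribute names from most to least informative about `target`."""
    return sorted(
        attributes,
        key=lambda a: information_gain(rows, a, target),
        reverse=True,
    )
```

Here each row is assumed to be a dict keyed by attribute name, with the discretized age group stored under the key passed as `target`.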
We hope to use this knowledge to improve prediction for the different age groups of drivers, using around 60 variables for most data sets and 10 variables for some. We also explore future directions in which this data could be analyzed.
Car Crashes, Data Analysis, Knowledge Discovery, Vehicles
vii, 39 pages
Includes bibliographical references (page 39).
Copyright © 2016 John Dietrich Tollefson
Additional Files
AppendixAAttributeDocumentation.pdf (41 kB)
AppendixBQueryDocumentation.pdf (45 kB)
AppendixCtreescannerjava.pdf (36 kB)
AppendixDAdditionalTables.pdf (149 kB)