
Group 3 Final Project Paper - Camry White's E-portfolio (camrywhite.weebly.com/uploads/2/3/6/1/23610564/isds_4180_projec…)

Apr 30, 2020


dariahiddleston

Group 3 Final Project Paper

In our final project for ISDS 4180, we were asked to analyze and interpret crash data from the Louisiana Highway Safety Research Group with one basic question in mind: which factors are most likely to impact injury for drivers between the ages of 18 and 21 in passenger cars, trucks, and SUVs on US highways, state roads, and the interstate? As a group, our first task was to narrow down all of the factors into the top 12 variables. After choosing which factors we wanted to analyze, we began our data analysis using SQL Server, PowerPivot, Tableau, and JMP.

We decided to use the factors: Seatbelt, Alcohol, Drugs, Distractions, Airbag, Ejection, Gender, Speeding, Precipitation, Vehicle Age, Vehicle Type.

Gender is a factor we chose to see whether injury differs between men and women.

Vehicle Age is a factor because we think older vehicles, when involved in accidents, do not protect the driver as well as newer models. VehicleAge was determined by year of manufacture: vehicles manufactured in 2006 or earlier were classified as ‘Old’, and vehicles manufactured after 2006 were classified as ‘New’.

Vehicle Type is a factor because we think that injury depends a lot on the type of vehicle the driver is in. We compare cars, trucks, and SUVs to determine which vehicle types have more injuries.

We chose seatbelt use as a factor because we think that wearing a seatbelt affects injury in the event of an accident. The SeatbeltUsed classification was determined as follows: if ProtectionsSystemCode is ‘A-None Used’, then SeatbeltUsed is ‘No’; if ProtectionsSystemCode is ‘B-Shoulder Belt Only Used’, ‘C-Lap Belt Only Used’, or ‘D-Shoulder and Lap Belt Used’, then SeatbeltUsed is ‘Yes’. We used only these codes because our analysis is focused on teen drivers of vehicles, so child restraint and helmet usage are irrelevant to our data.

Alcohol is a factor because we think driving drunk can slow your reaction time and is more likely to cause accidents than driving sober. More accidents can lead to more injuries. The AlcoholUsed classification was determined by whether PredictedAlcoholCode is Yes or No.

Drug Usage is a factor because, like alcohol, we think drugs can impair a driver’s ability to navigate the road. An impaired driver is more likely to be involved in an accident. The DrugsUsed classification was determined as follows: if DrugsCode is ‘A-Test Not Given’, then DrugsUsed is ‘No’; if DrugsCode is ‘C-Test Refused’ or ‘D-Drugs Reported’, then DrugsUsed is ‘Yes’.

Distractions is a factor because we feel that a driver who is paying attention to a phone or iPad, eating, applying makeup, or looking anywhere but ahead is more likely to get in an accident. We think a distracted driver will get in a more sudden crash, which would cause more injuries. The DriverDistracted classification was determined as follows: if DriverDistractionCode is ‘A-Cell Phone’, ‘B-Other Electronic Device’, ‘C-Other Inside the Vehicle’, or ‘D-Other Outside the Vehicle’, then DriverDistracted is ‘Yes’; if DriverDistractionCode is ‘E-Not Distracted’, then DriverDistracted is ‘No’.

Airbag is a factor because we want to determine whether airbag deployment has an effect on injury in the event of a crash. We think that crashes in which the airbag deployed will show more injuries, because airbags usually deploy in more severe crashes. Airbags also tend to cause injuries to the upper body because of the force of their deployment. The AirbagDeployed classification was determined as follows: if AirbagCode is ‘A-Deployed’, then AirbagDeployed is ‘Yes’; if AirbagCode is ‘B-Non Deployed’ or ‘C-Non Deployed/Switch Off’, then AirbagDeployed is ‘No’.

Ejection is a factor because we believe that being ejected has a significant effect on whether a driver is injured. The DriverEjected classification was determined as follows: if EjectionCode is ‘A-Not Ejected’, then DriverEjected is ‘No’; if EjectionCode is ‘B-Totally Ejected’ or ‘C-Partially Ejected’, then DriverEjected is ‘Yes’.

Speeding is a factor because we think that a crash at high speed is more likely to cause injury. The SpeedingInvolved classification was determined as follows: if ViolationsCode is ‘A-Exceeding Stated Speed Limit’ or ‘B-Exceeding Safe Speed Limit’, then SpeedingInvolved is ‘Yes’.

Precipitation is a factor because we think that a slippery road could cause more accidents and more injuries. The PrecipitationInvolved classification was determined as follows: if WeatherCode is ‘C-Rain’, ‘E-Sleet/Hail’, or ‘F-Snow’, then PrecipitationInvolved is ‘Yes’.

Using SQL Server and the factors and classifications listed above, we created a View to include our chosen factors with the appropriate classifications. The View helps us limit what we see to only the data that we’re analyzing.
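The recoding rules described above can be sketched as lookup tables. This is only an illustration in Python, not the SQL Server View the group built: the names `RECODE_RULES`, `recode`, and `vehicle_age` are hypothetical, and the sketch assumes each raw code begins with its letter (for example ‘B-Shoulder Belt Only Used’). The paper does not say how codes outside the listed ones were treated, so the sketch leaves them unclassified.

```python
# Lookup tables transcribing the recoding rules stated in the paper.
# Keys are the leading letter of each raw code. Codes the paper does not
# mention are deliberately absent and come back as None (unclassified).
RECODE_RULES = {
    "SeatbeltUsed": {"A": "No", "B": "Yes", "C": "Yes", "D": "Yes"},
    "DrugsUsed": {"A": "No", "C": "Yes", "D": "Yes"},
    "DriverDistracted": {"A": "Yes", "B": "Yes", "C": "Yes", "D": "Yes", "E": "No"},
    "AirbagDeployed": {"A": "Yes", "B": "No", "C": "No"},
    "DriverEjected": {"A": "No", "B": "Yes", "C": "Yes"},
    "SpeedingInvolved": {"A": "Yes", "B": "Yes"},
    "PrecipitationInvolved": {"C": "Yes", "E": "Yes", "F": "Yes"},
}


def recode(factor, raw_code):
    """Return 'Yes', 'No', or None for a raw code such as 'A-None Used'."""
    if not raw_code:
        return None
    letter = raw_code.strip()[:1].upper()
    return RECODE_RULES[factor].get(letter)


def vehicle_age(model_year):
    """Classify a vehicle as 'Old' (2006 or earlier) or 'New' (after 2006)."""
    return "Old" if model_year <= 2006 else "New"


print(recode("SeatbeltUsed", "D-Shoulder and Lap Belt Used"))
print(vehicle_age(2006))
```

Keeping the rules in one table makes it easy to see which codes each factor covers and which ones the paper leaves unspecified.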


The next step in our analysis was using PowerPivot to create tables and charts to help us understand the relationships between injury and the different factors. Using PowerPivot, we found that some factors we expected to matter did not have a significant impact on driver injury. The factors that had the most impact on driver injury are AirbagDeployed, AlcoholInvolved, DriverEjected, DrugsInvolved, and SeatbeltUsed. For each of these factors, the share of drivers injured differed substantially depending on whether the factor was present.
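The comparison described above, the percentage of drivers injured when a factor is present versus absent, can be sketched in a few lines. This is a minimal Python illustration rather than the PowerPivot workbook itself; the function name `injury_rate_by_level`, the `Injury` field with ‘Yes’/‘No’ values, and the sample rows are all assumptions made up for the example and are not the project's data.

```python
from collections import defaultdict


def injury_rate_by_level(records, factor):
    """Percent of drivers injured for each level ('Yes'/'No') of one factor."""
    injured = defaultdict(int)
    total = defaultdict(int)
    for row in records:
        level = row.get(factor)
        if level is None or row.get("Injury") is None:
            continue  # skip rows that could not be classified
        total[level] += 1
        if row["Injury"] == "Yes":
            injured[level] += 1
    return {level: 100.0 * injured[level] / total[level] for level in total}


# Made-up rows for illustration only; these are not the crash data.
sample = [
    {"SeatbeltUsed": "Yes", "Injury": "No"},
    {"SeatbeltUsed": "Yes", "Injury": "No"},
    {"SeatbeltUsed": "Yes", "Injury": "No"},
    {"SeatbeltUsed": "Yes", "Injury": "Yes"},
    {"SeatbeltUsed": "No", "Injury": "Yes"},
    {"SeatbeltUsed": "No", "Injury": "No"},
]

rates = injury_rate_by_level(sample, "SeatbeltUsed")
print(rates)
```

The same function can be called once per factor to produce the kind of present-versus-absent comparison reported in the following paragraphs.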

When the airbag was deployed, 23% of drivers were injured, compared to 2.4% of drivers when the airbag was not deployed. Since more drivers were injured when the airbag was deployed, this supports our thought that airbags deploy in more severe crashes.

When alcohol was present, 29% of drivers were injured, compared to 4% when alcohol was not present. Even though alcohol was not involved in most crashes, the data shows that drivers were more likely to be injured when alcohol was present.

When the driver was ejected, 86.9% of drivers were injured, compared to only 3.9% when the driver was not ejected. As with alcohol, most drivers involved in crashes were not ejected, but drivers who were ejected were more likely to be injured.


When drugs were present, 28.75% of drivers were injured compared to 4.31% of drivers being injured when drugs were not present. This tells us that when drugs are present the driver is more likely to be injured.

When a seatbelt was used, 3.66% of drivers were injured, compared to 48.44% of drivers who were not wearing a seatbelt. This leads us to believe that wearing a seatbelt reduces the likelihood of injury.

When speeding was a factor in a crash, 13.45% of drivers were injured compared to 4.6% of drivers injured when speeding was not involved. This leads us to believe that drivers who were speeding at the time of the crash are more likely to be injured.

The other factors we tested, DriverDistracted, PrecipitationInvolved, and SpeedingInvolved, did not have as much of an impact on injury as we were expecting, driver distractions in particular. There was only a 1% difference between injured and non-injured drivers when distractions were present. We believed distracted drivers, speeding cars, and slippery roads would lead to more serious accidents and therefore more serious injuries, but the data specific to teen drivers did not support our initial beliefs.

The next step in analyzing what affects teen driver injury was using Tableau to create an interactive dashboard to further analyze our factors. Using Tableau, we compared the factors that the PowerPivot analysis showed to have an impact on injury. We created a dashboard to see side by side which factors had higher numbers of injuries and how the presence or absence of each factor affected injury. Through Tableau's analysis, we noticed that when alcohol was involved, the driver was more likely to not be using a seatbelt. Not wearing a seatbelt contributed to more injuries in which the driver was ejected from the car. We also found that in cases where the airbag was deployed, more drivers were wearing a seatbelt and were not ejected. We surmised that these factors are correlated with one another and with driver injury, and that future injuries could be attributed to similar combinations of factors.


Tableau Dashboard


Our next step was to use JMP to run a nominal logistic regression between our dependent variable, Injury, and the rest of the factors. Our reason for using this software was to compare our findings from PowerPivot and Tableau against a statistical model. We left out null values for Injury and brought in DrugsInvolved, AlcoholInvolved, AirbagDeployed, DriverEjected, DriverDistracted, and SeatbeltUsed. “Yes” became 0 and “No” became 1. Our analysis consisted of interpreting the odds ratios and using the prediction profiler to visualize the data graphically. All of the factors we brought in were significant and did impact injury.


Interpretation of the odds ratios for each factor:

Seatbelt Used: Not wearing a seatbelt during a crash increases your odds of being injured by a factor of 9.5956594. Comparatively, wearing a seatbelt during a crash reduces your odds of being injured to 0.1042138 times the odds without one.


Airbag Deployed: An airbag not deploying during a crash reduces your odds of being injured to 0.0927601 times the odds when it deploys. An airbag deploying during a crash increases your odds of being injured by a factor of 10.7805.

Alcohol Involved: When alcohol is not involved in a crash, your odds of being injured are reduced to 0.3908443 times the odds when it is involved. When alcohol is involved in a crash, your odds of being injured increase by a factor of 2.5585634.

Driver Distracted: In accidents where the driver is not distracted, your odds of being injured increase by a factor of 1.193. In accidents where the driver is distracted, your odds of being injured are reduced to 0.83 times the odds when not distracted. This suggests that driver distraction has little impact on injury; we included the variable to confirm that it had little to no effect on the model.

Drugs Involved: In accidents where drugs are not involved, your odds of being injured are reduced to 0.5785143 times the odds when they are involved. In accidents involving drugs, your odds of being injured increase by a factor of 1.7285657.

Driver Ejected: In accidents where the driver is not ejected, your odds of being injured are reduced to 0.0532707 times the odds when the driver is ejected. In accidents where the driver is ejected, your odds of being injured increase by a factor of 18.77205.

Based on the odds ratios in our JMP analysis, we ranked the factors that impact injury as follows in order: DriverEjected, AirbagDeployed, SeatbeltUsed, AlcoholInvolved, DrugsInvolved, and lastly DriverDistracted.
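The ranking above can be reproduced from the reported odds ratios. The Python sketch below is only a check on the published numbers, not part of the JMP analysis: it confirms that each pair of odds ratios is reciprocal (the ratio for one level of a two-level factor is one divided by the ratio for the other level) and then orders the factors by how far the odds ratio is from 1. The names `reported`, `strength`, and `ranking` are hypothetical.

```python
# Odds ratio pairs as reported in the paper: (larger value, smaller value).
# For SeatbeltUsed the larger value is for NOT wearing a seatbelt, and for
# DriverDistracted it is for NOT being distracted.
reported = {
    "SeatbeltUsed": (9.5956594, 0.1042138),
    "AirbagDeployed": (10.7805, 0.0927601),
    "AlcoholInvolved": (2.5585634, 0.3908443),
    "DriverDistracted": (1.193, 0.83),
    "DrugsInvolved": (1.7285657, 0.5785143),
    "DriverEjected": (18.77205, 0.0532707),
}


def strength(odds_ratio):
    """Distance of an odds ratio from 1, treating 4.0 and 0.25 as equally strong."""
    return max(odds_ratio, 1.0 / odds_ratio)


# Each pair should be (approximately) reciprocal.
for name, (larger, smaller) in reported.items():
    print(f"{name}: 1/{larger} = {1.0 / larger:.4f}, reported {smaller}")

# Rank factors from strongest to weakest association with injury.
ranking = sorted(reported, key=lambda name: strength(reported[name][0]), reverse=True)
print(ranking)
```

Sorting by distance from 1 rather than by raw value matters because an odds ratio of 0.1 reflects as strong an association as an odds ratio of 10.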

The Prediction Profiler results were used to display how the prediction model changes for each factor individually. It shows the response to each variable while the remaining variables are held constant. It could also be used to compare future or past nominal logistic regression models efficiently.

In conclusion, the results from our data analysis support our predictions that airbag deployment, alcohol involvement, driver distractions, drug involvement, and driver ejection are factors in teen driver injury in crashes. We believe the factors that most affect injury are, in order: 1) Ejection 2) Airbag Deployment 3) Seatbelt Usage.

To reduce teen driver injury, we believe that more emphasis should be placed on wearing seatbelts, keeping your car’s safety features up to date, reducing driver distractions, and expanding programs to prevent driving under the influence. Newly manufactured cars should come equipped with sensors that won’t allow the driver to exceed 25 miles per hour until the seatbelt has been buckled. Most cars already alert drivers that they aren’t wearing a seatbelt, but if the car would not accelerate until the seatbelt is buckled, more drivers would wear seatbelts. We also think more focus should be placed on the ‘Click It or Ticket’ and ‘Buckle Up or Pay Up’ campaigns. Wearing seatbelts would also lead to fewer ejections, which would lead to fewer injuries. To reduce driver distractions, cell phone engineers could require Bluetooth to be activated when a driver gets in the car, which would allow the phone to be locked while the car is in motion. This would reduce cell phone distractions because drivers could not access their phones while driving.

With new transportation services becoming available, DUIs have been declining. Uber Technologies, a popular transportation service, has estimated that DUI rates in Seattle decreased by more than 10% (Newsroom, Uber). We believe that bouncers should take the keys of each person entering a bar; before leaving, patrons would need to pass a breathalyzer test to get their keys back, or they would have to call a taxi, Uber, Lyft, or a friend for a ride home. We also believe that more DUI checkpoints should be set up around college towns (since our focus is on ages 18-21) so that drivers who choose to get in their cars impaired will not be able to get far. With these changes, we believe there can be a reduction in crashes with injuries.