David Budet Mariel Castro Jason Jaworski Yevgeny Khait Florangel Marte Client: Richard Washington, NYC Transit Authority Data Mining Customer & Employee-Related Subway Incidents: Phase II

Data Mining Customer & Employee-Related Subway Incidents: Phase II

Dec 30, 2015


Samantha Cook

Transcript
Page 1: Data Mining Customer & Employee-Related Subway Incidents: Phase II

David Budet, Mariel Castro, Jason Jaworski, Yevgeny Khait, Florangel Marte

Client: Richard Washington, NYC Transit Authority

Data Mining Customer & Employee-Related Subway Incidents: Phase II

Page 2: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Presentation Summary

Project Description
Review
Progression
City Crime vs. Subway Crime
Results: Customer Assaults
Results: Employee Assaults
Results: Robberies (Simple Theft)
Results: Train Delays
Weka ID3 Decision Trees
Future Research Avenues

Page 3: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Project Description

Phase I concentrated on looking at incidents and identifying reasons for aggression, specifically what effects delays had on aggression incidents

Phase II is more specifically concentrated on subway assaults and possible correlations with the data’s attributes

Main focus of both phases: analysis of a dataset of incidents which occurred in the New York City Subway system over multiple years and mining of the data to establish relationships and trends

Page 4: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Review

The first half of the study focused on mining data with Microsoft SQL Server 2008 and the program Weka. Utilizing these tools and team methodologies, we determined which stations and train lines had the most:

Violent assaults against customers and employees

Delays

Simple thefts (unarmed robberies, pick-pocketing, etc.)

Page 5: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Progression

The second half of the study had a more regional focus. The team:

Acquired US Census data regarding crime and population in NYC

Normalized the Census crime data and subway crime data by population for Manhattan, Brooklyn, Queens and the Bronx

Analyzed subway crime as a microcosm of overall NYC crime for 2007

Created an interactive JavaScript map pinpointing the stations with the most violent incidents and delays
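The population normalization described on this slide can be sketched as a per-capita rate. This is a minimal illustration, not the team's actual procedure: the function name `rate_per_100k`, the borough labels and every number below are placeholders invented for the example, not Census or NYC Transit figures.

```python
def rate_per_100k(incidents: int, population: int) -> float:
    """Return the number of incidents per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return incidents * 100_000 / population


# Placeholder figures for illustration only; NOT the Census or subway data.
boroughs = {
    "Borough A": {"incidents": 500, "population": 1_000_000},
    "Borough B": {"incidents": 300, "population": 400_000},
}

rates = {
    name: rate_per_100k(row["incidents"], row["population"])
    for name, row in boroughs.items()
}

# Rank boroughs from the highest to the lowest per-capita rate.
ranking = sorted(rates, key=rates.get, reverse=True)
```

In this toy input, the borough with fewer raw incidents ranks first once population is taken into account, which is the kind of effect normalization is meant to expose.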

Page 6: Data Mining Customer & Employee-Related Subway Incidents: Phase II

City Crime vs. Subway Crime

In comparing overall crime in New York City for 2007 to crime in the NYC Subway system:

We found that Manhattan, though the third largest borough in terms of population, accounted for over half the crime in NYC

The Bronx has the smallest population of the four boroughs researched but, in terms of crime per resident, had the second highest rate of crime

Subway crime accounts for a smaller percentage of overall crime in Manhattan than in the other three boroughs researched

Page 7: Data Mining Customer & Employee-Related Subway Incidents: Phase II

City Crime vs. Subway Crime

Page 8: Data Mining Customer & Employee-Related Subway Incidents: Phase II

City Crime vs. Subway Crime

When normalized for population, subway crime in Brooklyn and Queens accounts for a greater percentage of overall crime than in Manhattan and the Bronx, signaling that these two boroughs may have more dangerous, or more incident-prone, stations than Manhattan or the Bronx.
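The comparison on this slide rests on subway crime expressed as a share of a borough's overall crime. A minimal sketch of that calculation follows; the function name `subway_share` and the input figures are placeholders for illustration, not values from the study.

```python
def subway_share(subway_incidents: int, total_incidents: int) -> float:
    """Return subway incidents as a percentage of all incidents."""
    if total_incidents <= 0:
        raise ValueError("total_incidents must be positive")
    return 100 * subway_incidents / total_incidents


# Placeholder figures for illustration only; NOT the study's data.
share = subway_share(subway_incidents=25, total_incidents=1_000)
```

If both counts are first divided by the same borough population, the population cancels and the share is unchanged, so this percentage can be computed from either raw or per-capita figures.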

Page 9: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Findings: Customer Assaults

The stations with the most assaults (all types of assault) against customers from 2005 to 2007 were 59th Street, 14th Street and 125th Street.
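A ranking like the one above comes down to counting matching incidents per station. The sketch below shows one way to do that in Python; the team worked in Microsoft SQL Server 2008 and Weka, so this is an illustration rather than their query. The record layout, field names and station labels are hypothetical.

```python
from collections import Counter

# Hypothetical incident records; field names and values are illustrative only.
incidents = [
    {"station": "Station X", "victim": "customer", "kind": "assault"},
    {"station": "Station Y", "victim": "customer", "kind": "assault"},
    {"station": "Station X", "victim": "customer", "kind": "assault"},
    {"station": "Station X", "victim": "employee", "kind": "assault"},
    {"station": "Station Z", "victim": "customer", "kind": "theft"},
]


def top_stations(records, victim, kind, n=3):
    """Return the n stations with the most incidents of the given kind and victim type."""
    counts = Counter(
        r["station"]
        for r in records
        if r["victim"] == victim and r["kind"] == kind
    )
    return counts.most_common(n)


top_customer_assaults = top_stations(incidents, "customer", "assault")
```

The same function can be pointed at a train-line field instead of a station field to produce a per-line ranking.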

Page 10: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Findings: Customer Assaults

Between 2005 & 2007, the highest number of assaults (all types) committed against customers took place on the A, 2 and 4 lines.

Page 11: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Findings: Employee Assaults

Stations with more than 5 total assaults (all types of assault) against employees between 2005 and 2007:

Page 12: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Findings: Employee Assaults

Between 2005 & 2007, the highest number of assaults (all types) committed against employees took place on the 6, 2 and A lines.

Page 13: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Findings: Robberies (Simple Theft)

Page 14: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Findings: Robberies (Simple Theft)

Page 15: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Findings: Train Delays

Number of delays by month over 3 year period:

Page 16: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Findings: Train Delays

Page 17: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Findings: Train Delays

Page 18: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Weka ID3 Decision Tree

Page 19: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Weka ID3 Decision Tree
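The tree diagrams from these slides are not reproduced in the transcript. As background, ID3 chooses each split by picking the attribute with the highest information gain. The sketch below illustrates that selection step only; it is not Weka's implementation, and the attribute names (`delay`, `period`, `assault`) and rows are hypothetical rather than taken from the project's dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on one attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """Attribute ID3 would split on first: the one with the highest gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy rows; NOT the project's data.
rows = [
    {"delay": "yes", "period": "peak", "assault": "yes"},
    {"delay": "yes", "period": "offpeak", "assault": "yes"},
    {"delay": "no", "period": "peak", "assault": "no"},
    {"delay": "no", "period": "offpeak", "assault": "no"},
]

root = best_attribute(rows, ["delay", "period"], "assault")
```

In this toy input, `delay` separates the two classes perfectly while `period` carries no information, so `delay` would become the root split.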

Page 20: Data Mining Customer & Employee-Related Subway Incidents: Phase II

Future Research Avenues

MTA and project team can separately mine an identical data set and introduce an objective methodology for determining the best results and techniques from both databases

Continue in-depth data mining

Identify and research other algorithms in Weka conducive to mining and correlating NYC Subway data (we propose the next team utilize clustering analysis via the algorithm SimpleKMeans)

Investigate possible correlations between neighborhood income levels and stations where subway crime is prevalent

Continue to expand and build on the JavaScript map
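To give a sense of the clustering analysis proposed above, here is a minimal k-means sketch in plain Python. It is not Weka's SimpleKMeans and makes simplifying assumptions: the first k points serve as initial centroids, the number of iterations is fixed, and each point is a hypothetical pair of per-station counts invented for the example.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Basic k-means: assign points to the nearest centroid, then update centroids."""
    # Assumption: seed with the first k points (real tools usually seed randomly).
    centroids = [tuple(p) for p in points[:k]]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical (assault count, delay count) pairs per station; NOT project data.
stations = [(1, 1), (9, 9), (1, 2), (8, 9), (2, 1), (9, 8)]
centroids, clusters = k_means(stations, k=2)
```

With this toy input, the low-count and high-count stations end up in separate clusters, which is the kind of grouping a follow-up team could look for in the real incident data.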