San Francisco Crime Analysis – Report
-By Rohit Dandona and Sameer Darekar
Introduction:
From 1934 to 1963, San Francisco was infamous for housing some of the world's most notorious
criminals on the inescapable island of Alcatraz. Today, the city is known more for its tech scene than its
criminal past. But with rising wealth inequality, housing shortages, and a proliferation of expensive
digital toys riding BART to work, there is no scarcity of crime in the City by the Bay. This project examines
the San Francisco Police Department crime records between January 1st 2003 and May 13th 2015. It
visualizes the trends of major crimes and drug offenses in the city across time and locations. This analysis has
been carried out to help law enforcement officers refine their strategies based on time and
location.
R and Tableau have been used for data processing and visualizations.
Dataset:
This dataset contains incidents derived from the SFPD Crime Incident Reporting system. The data ranges
from 1/1/2003 to 5/13/2015.
Data fields:
Dates - timestamp of the crime incident
Category - category of the crime incident
Descript - detailed description of the crime incident
DayOfWeek - the day of the week
PdDistrict - name of the Police Department District
Resolution - how the crime incident was resolved
Address - the approximate street address of the crime incident
X - Longitude
Y - Latitude
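To make the record layout concrete, the fields listed above can be sketched as a small Python structure. This is only an illustration and not part of the authors' R and Tableau workflow: the `Incident` class, the `read_incidents` helper, and the one-row sample (including its description text and rounded coordinates) are hypothetical and invented for the example.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Incident:
    """One row of the dataset, using the field names listed above."""
    dates: str        # timestamp of the crime incident
    category: str     # category of the crime incident
    descript: str     # detailed description
    day_of_week: str  # the day of the week
    pd_district: str  # Police Department District
    resolution: str   # how the incident was resolved
    address: str      # approximate street address
    x: float          # longitude
    y: float          # latitude


def read_incidents(handle):
    """Read incidents from a CSV file object whose header uses the field names."""
    return [
        Incident(
            dates=row["Dates"],
            category=row["Category"],
            descript=row["Descript"],
            day_of_week=row["DayOfWeek"],
            pd_district=row["PdDistrict"],
            resolution=row["Resolution"],
            address=row["Address"],
            x=float(row["X"]),
            y=float(row["Y"]),
        )
        for row in csv.DictReader(handle)
    ]


# Hypothetical one-row sample, invented purely for illustration.
SAMPLE = (
    "Dates,Category,Descript,DayOfWeek,PdDistrict,Resolution,Address,X,Y\n"
    "05/13/2015 23:53:00,LARCENY/THEFT,EXAMPLE DESCRIPTION,Wednesday,"
    "NORTHERN,NONE,1500 Block of LOMBARD ST,-122.42,37.80\n"
)

incidents = read_incidents(io.StringIO(SAMPLE))
```

Keeping X and Y as floats, with X as longitude and Y as latitude, mirrors the field list and avoids swapping the two when plotting.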
Data-Preprocessing:
The date field in the dataset is a timestamp in the “MM/dd/yyyy hh:mm:ss” format, which is of little
use for analysis in its raw form.
Tableau provides a function called “DATEPART” that parses a timestamp and extracts a specific
segment of it. For example:
DATEPART('hour',[Dates])
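For readers working outside Tableau, the same kind of extraction can be sketched in Python with the standard library. The `date_part` helper and the `TIMESTAMP_FORMAT` constant are hypothetical names, and the format string assumes the hour is recorded on a 24-hour clock, which the report does not state explicitly.

```python
from datetime import datetime

# Assumption: "MM/dd/yyyy hh:mm:ss" with a 24-hour clock.
TIMESTAMP_FORMAT = "%m/%d/%Y %H:%M:%S"


def date_part(part, timestamp):
    """Return one integer component of a timestamp string."""
    parsed = datetime.strptime(timestamp, TIMESTAMP_FORMAT)
    parts = {
        "year": parsed.year,
        "month": parsed.month,
        "day": parsed.day,
        "hour": parsed.hour,
        "minute": parsed.minute,
    }
    return parts[part]


hour = date_part("hour", "05/13/2015 23:53:00")
```

Here `date_part("hour", ...)` plays the role of `DATEPART('hour',[Dates])` for a single value.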
Part of the analysis required splitting the day into five separate parts. A calculated field of the
following form derives this column in Tableau from the hour of the timestamp:
IF DATEPART('hour',[Dates]) >= 4 AND DATEPART('hour',[Dates]) < 8 THEN 'EARLY MORNING(4 AM to 8 AM)'
ELSEIF DATEPART('hour',[Dates]) >= 8 AND DATEPART('hour',[Dates]) < 12 THEN 'MORNING(8 AM to 12 PM)'
ELSEIF DATEPART('hour',[Dates]) >= 12 AND DATEPART('hour',[Dates]) < 17 THEN 'AFTERNOON(12 PM to 5 PM)'
ELSEIF DATEPART('hour',[Dates]) >= 17 AND DATEPART('hour',[Dates]) < 21 THEN 'EVENING (5 PM to 9 PM)'
ELSE 'NIGHT(9 PM to 4 AM)' END
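The five-way split can also be sketched as a plain Python function keyed on the hour of day (0 to 23). This is an illustrative sketch, not the authors' Tableau field: the `part_of_day` name is hypothetical, and each range is assumed to be half-open so that every hour lands in the bucket its label names (for example, 8:00 to 8:59 counts as morning).

```python
def part_of_day(hour: int) -> str:
    """Map an hour of the day (0-23) to one of five labeled buckets."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    if 4 <= hour < 8:
        return 'EARLY MORNING(4 AM to 8 AM)'
    if 8 <= hour < 12:
        return 'MORNING(8 AM to 12 PM)'
    if 12 <= hour < 17:
        return 'AFTERNOON(12 PM to 5 PM)'
    if 17 <= hour < 21:
        return 'EVENING (5 PM to 9 PM)'
    # Remaining hours: 21, 22, 23 and 0 through 3.
    return 'NIGHT(9 PM to 4 AM)'


labels = [part_of_day(h) for h in range(24)]
```

Because the night bucket wraps around midnight, it is handled as the fallthrough case rather than as a single numeric range.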
The address data takes one of the following two forms:
1. 1500 Block of LOMBARD ST
2. OAK ST / LAGUNA ST
The first is a location along a street, while the second is the intersection of two streets. We
extracted the street names from both forms for plotting the bubble chart shown in the analysis section. The rest
of the columns are clean and required no further processing.
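The report does not show how the street names were extracted, so the following Python sketch is one plausible approach rather than the authors' method. It assumes that block addresses always follow the "<number> Block of <street>" pattern and that intersections are separated by "/"; the `extract_streets` helper and `BLOCK_PATTERN` are hypothetical names.

```python
import re

# Assumption: block addresses look like "<number> Block of <street>".
BLOCK_PATTERN = re.compile(r"^\d+\s+Block\s+of\s+(.+)$", re.IGNORECASE)


def extract_streets(address):
    """Return the street name(s) contained in an address string."""
    address = address.strip()
    if "/" in address:
        # Intersection: one street on each side of the slash.
        return [part.strip() for part in address.split("/") if part.strip()]
    match = BLOCK_PATTERN.match(address)
    if match:
        # Block address: keep only the street after "Block of".
        return [match.group(1).strip()]
    # Unrecognized form: return it unchanged rather than guess.
    return [address]


block_streets = extract_streets("1500 Block of LOMBARD ST")
intersection_streets = extract_streets("OAK ST / LAGUNA ST")
```

Returning a list in both cases lets an intersection contribute to the count of each of its two streets when aggregating per street.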
Analysis of Crime Trends:
The dataset contains 39 categories of reported incidents, including “Other offenses”
and “Non-criminal”.
Larceny and theft was the most common crime in San Francisco between January 1st 2003 and May
13th 2015, with a total of 174,900 reported incidents. The next two highest categories, “Other
offenses” (98,281 reported incidents) and “Non-criminal” (98,172 reported incidents), are omitted
from the analysis. Instead, the focus is on the following high-volume crime categories: