Breaking Down Crime in Cleveland Neighborhoods Using Spatial Statistics
Hasani Wheat
Cleveland State University
Levin College of Urban Affairs
PDD 643- Advanced GIS
Dr. Sung-Gheel Jang
December 16, 2011
Contents
Abstract
Background
Goals and Objectives
Data
Methods and Analysis
Discussion and Summary
References
Appendix
1. Attribute Tables
2. Maps
3. Results
4. Data Dictionary
Abstract
The association of Cleveland, Ohio with the presence of crime has been an ongoing topic of
conversation for both residents and non-residents for many years. In Cleveland, some
neighborhoods have a reputation for being dangerous because of foreclosures, high
vacancy rates, and other housing and demographic conditions. Often, neighborhoods
that are viewed as unsafe because of their perceived high levels of crime turn out to be less
unsafe than perceived once the data are presented. Another problem is that crime data for
Cleveland neighborhoods lack availability and detail. While there are maps that analyze crime in
the Cleveland area, most do not show the most detailed scale of analysis used here, the
Census block group. Additionally, few available maps display crime data as current as
2010. This paper will detect hot and cold spots in Cleveland neighborhoods using crime data
from the Cleveland Police Department via NEO CANDO. Taking it a step further, this paper will
also look at the outliers that are near or within those hot and cold spot clusters. Identifying where
these outliers are located and in which Cleveland neighborhood will be critical for conducting
further research and analysis of crime. Using Census block groups instead of a less
detailed boundary file such as Census tracts will allow for more data to be clustered for the hot
spot analysis. The Census block group data can also be used to establish where the outliers of
crime types are in Cleveland neighborhoods.
Background
This project, which uses spatial statistics to display crime in Cleveland
neighborhoods, is important because the maps and data will make it easier to see the
concentration of crime in detail. The perception that Cleveland is an unsafe place
because of its crime is prevalent in the minds of many people, including people who live in
Cleveland neighborhoods. In order to change the minds of Cleveland residents, a visual
representation of the types of crime present in Cleveland neighborhoods needs
to be presented to the public. The problem statement for this project is: Is there a
statistically significant relationship between the count of a particular crime type and
the location of a census block group within a particular neighborhood?
As mentioned in the proposal of this project, in order to get a better understanding of the
levels of crime in Cleveland, there must be a categorical breakdown of where crime is located in
Cleveland and more specifically, its neighborhoods. Although the objective is to break down any
myths associated with crime types in a given Cleveland neighborhood, graphical representation
is also critical. Cleveland as a whole is not a dangerous place, and by providing maps of which
crimes affect, or do not affect, which neighborhoods, the hope is to alleviate the fears that
people have about those crimes. For this project, only a handful of maps are presented;
however, these maps are a small representation of the types of crime and where crime is
located. In a nutshell, the questions that the project will attempt to address are: 1.) Where is a
particular crime type located in Cleveland, based on hot and cold clustering? 2.) Where are the
outliers in a hot or cold cluster? 3.) How can the results of the hot and cold clustering, as
well as the outliers within these clusters, be assessed in a real life situation?
The audience should expect three things from this project: a visual representation of
whether each Census block group in a Cleveland neighborhood falls in a high crime cluster, a low
crime cluster, or an area in between for a particular type of crime; an explanation of the hot
and cold spot results of a particular crime for each neighborhood; and an explanation of where
the outliers are within or near a cluster of high or low crime. Once these analyses of various
crimes are created, criminal activity can be explained in a spatial context by comparing the
results with socio-economic factors such as vacancy and foreclosure rates (Greenberg & Rohe, 1984).
Easy access to the information would help make people aware that the data exist.
Policymakers and city officials will be able to seek out
the approximate areas where crime is located. For example, according to the Cleveland
Police Department, the North Broadway neighborhood has one of the highest average rates of
violent crime among the 36 Cleveland Statistical Planning Areas. By selecting the area that has
the highest rate of crime, policymakers will be able to extract this location from the rest of
the dataset, zoom in on the location, compare it with the Statistical Planning Area Map, study the
area, and make recommendations on how to reduce the crime count (see Table 1, Maps #1
& #2).
The research that was influential to the development of this project includes “Extend
Crime Analysis with ArcGIS Spatial Statistics Tools” by Lauren Scott and Nathan Warmerdam,
“Mapping Crime: Understanding Hot Spots” by the U.S. Department of Justice, Office of Justice
Programs, and the “Rebuilding Blocks” article by Randall McShepard & Fran Stewart from
PolicyBridge. The Scott and Warmerdam article provided me with an introduction to the
importance of creating hot and cold spots using statistical analysis to depict the data clearly. The
“Mapping Crime” article stresses the importance of establishing theories that help explain
why the high or low crime areas shown in the maps are the way they are, and how
policymakers and other people of interest can make decisions based on these theories.
The “Rebuilding Blocks” article addresses the fear and the perception of crime in Cleveland and
its neighborhoods mentioned earlier in this paper.
As for the cluster and outlier analysis portion of the project, I also referred to several
works. A 2007 course project titled “Drug Arrests in High Gun Crime Locations of Dallas Texas”
by Josh Taylir focuses on whether gun crimes are spatially clustered, random, or dispersed,
using the Local Moran's I statistical tool. The other article that helped me to understand the
importance of cluster and outlier analysis comes, fittingly, from the statistical tool's
namesake, Luc Anselin. The article, “Review of Cluster Analysis Software,” addresses the
importance of the Anselin Local Moran's I statistical tool. Some of the programs mentioned in
the review are CrimeStat and GeoDa, two recognizable programs that evaluate maps on a
statistical basis. The importance of this article is that many statistical programs utilize
Local Moran's I as a tool; ArcGIS takes it a step further by incorporating the statistic into a
mapping tool, which is a valuable asset for viewing clusters and outliers of crime in a
Cleveland neighborhood. Additionally, in the presentation by Scott and Warmerdam
titled “Spatial Statistics for Public Health and Safety,” the authors further define
cluster and outlier analysis as “gaining a better understanding of feature distribution through
degree of clustering or dispersion across study area” (Scott and Warmerdam, “Spatial
Statistics”).
Goals and Objectives
The goal of this project is to analyze the locations of crime types in Cleveland
neighborhoods using Spatial Statistics tools. These tools are Hot Spot
Analysis, which uses the Getis-Ord Gi* statistic, and Cluster and Outlier Analysis, which uses
the Anselin Local Moran's I statistic. By creating a visual of where crime types are located in the
Cleveland area, people will better understand where in Cleveland different crime types are
prevalent.
The objectives that will help meet the goal are to identify which Cleveland
neighborhoods have high counts of crime (statistically significant hot spots), which
have low counts of crime (statistically significant cold spots), and which have crime counts
in the middle (the majority of census block groups, which fall within range of the mean).
An additional objective is to seek out Census block groups in a high or low cluster whose
crime count differs sharply from the rest of the cluster. In other words, the analysis is similar
to that of the hot spots; however, a census block group within or near a cluster of low crime
may have a higher than usual count compared with its surrounding census block groups. The
same is true of high crime clusters that contain a couple of census block groups with lower than
expected crime counts.
Data
Looking at the breakdown of specific crimes in Cleveland in the 2010 crime reports
from City Data (http://www.city-data.com/crime/crime-Cleveland-Ohio.html), I wanted to use one
variable that represents a common problem in Cleveland neighborhoods and one that represents a
serious but infrequent crime problem. The two crime types chosen for this project
were auto thefts as the common problem (3,503 instances, or 822.2 per 100,000
people) and violent crimes as the serious but infrequent problem (a violent crime index of 706.9
for Cleveland compared with the U.S. average of 222.7) (City Data). In order to
create the maps and to provide the data needed to determine what constitutes a more or
less dangerous area for each type of crime, I acquired the following information:
Spatial:
Name | Source | Use
Census County Boundary- Block Group | ESRI TigerFile- 2010 | Boundaries used to extract Cleveland census block groups
Cleveland SPAs06 Conflated to 2010 Blocks | Northern Ohio Data & Information Systems | Feature class that highlights the boundaries of each Cleveland neighborhood
Cleveland Zoning Map | Cleveland Planning Commission | Static map; to compare with (Getis-Ord statistic) hot and cold spot clusters created in ArcGIS
Non-Spatial:
Name | Source | Use
The following selected crimes: 2010 Violent Crimes; 2010 Auto Thefts | Cleveland Police Department, Crime Analysis Unit, retrieved from NEO CANDO | The primary variables which will be used for Hot and Cold Spot Analysis and Cluster and Outlier Analysis
Once the variables were gathered and loaded into ArcGIS, the crime tables were joined
with the census block groups to produce the initial classification maps, which display the
counts of a particular crime by census block group (see Map #3 for the Violent Crime
classification map and Map #4 for the Auto Thefts classification map). A data dictionary was
then established to clarify the fields in the attribute table of each table or
feature class. For this project, the data dictionary covers the four datasets that make
up the initial feature classes used later for the hot spot analysis and the cluster and
outlier analysis: the tables for 2010 violent crimes and 2010 auto thefts in Cleveland, the
boundary file for Cleveland Block Groups, which is derived from the Ohio Block Groups, and the
Cleveland Neighborhood Layer (see Appendix 4- Data Dictionary). The data dictionary breaks
down each field in the attribute tables used to create the initial maps. Without this
information, the maps could not be created, and neither could the analyses in the next step.
Additionally, the analyses in the next steps produce their
own attribute tables. A data dictionary is especially useful for determining what the field
names mean in terms of statistical analysis for the hot spot and cluster-outlier maps.
After producing the initial block group maps that show the concentration of crime, I
manipulated the newly created hot spot feature class to produce a cleaner version of the map. To
do so, I searched for the Feature to Raster tool in the
Spatial Analyst toolbox. Using the hot spot feature class as the input, the tool converts the
vector data into raster data, which gives a smoother display that shows the results more
clearly. An example of this raster output is the
hot spots map that was created with the Feature to Raster tool (see Map #5). In comparing the
vector data with the raster data of the hot spots analysis of auto theft (compare Maps #4 and #5),
one can see similarities between the two maps, especially at the most extreme low and high
values of crime, although the classification schemes of the two maps are different. The key
to comparing the two datasets is to look at the extreme values. Additionally, adjusting the
classification scheme may bring out even more similarities in the presentation of the data.
Creating a raster version of the hot spot data may not be entirely necessary in this case, but the
visual may be easier to read, especially if the map is put on a poster or presented in
PowerPoint.
Methods and Analysis
- Getis-Ord Gi*
ModelBuilder Work Flow that depicts the usefulness of the Statistical variables created from
the Hot Spot Analysis using Getis-Ord Gi*. Arrows added in Paint.
The work flow created in ModelBuilder shown above is used to develop the Hot Spot
Analysis. Hot Spot Analysis is the first step in developing a sense of where crime
types in Cleveland are located and how intense they are, as indicated by the color
assigned to each value. In Hot Spot Analysis, the Getis-Ord Gi* statistical tool analyzes
the counts of a crime type in each census block group together with the counts in neighboring
block groups and creates the outputs GiZScore and GiPValue. The Z Score indicates how many
standard deviations a value falls from the mean of a standard normal distribution. In most
cases, the values will fall in the middle while a small
number of values will be extremes in comparison to the middle values.
In this project, each census block group has a count of a particular type of crime that
occurred during 2010. For instance, census block group 1011.01 has a recorded violent
crime count of 8 in 2010. This count is compared with those of the other block groups
across the entire Cleveland area, which are divided into neighborhoods. Taking
all of this into consideration, the Getis-Ord Gi* statistic produces a Z Score for each block
group. Z Scores between -1.96 and 1.96 are not statistically significant at the 95% confidence
level; all other values act as extremes. These extremes are the
statistically significant values: values to the left of -1.96 are significant
cold spots and values to the right of 1.96 are significant hot spots. The
crime data are sorted by Z Score and displayed on the map.
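The calculation described above can be sketched in a few lines of Python. This is a minimal illustration of the standard Gi* formula under simplifying assumptions, not the ArcGIS implementation: the weights are binary, the adjacency lists are hypothetical, and the seven counts are made up for the example.

```python
import math

def getis_ord_gi_star(values, neighbors):
    """Return the Gi* z-score for every feature.

    values    -- list of crime counts, one per block group
    neighbors -- list of index lists; neighbors[i] holds the block groups
                 treated as adjacent to i (binary weights, hypothetical)
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation S = sqrt(sum(x^2)/n - mean^2)
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        # Gi* includes the feature itself in its own neighborhood
        hood = set(neighbors[i]) | {i}
        w_sum = len(hood)      # sum of binary weights
        w_sq_sum = len(hood)   # sum of squared binary weights
        local_sum = sum(values[j] for j in hood)
        numerator = local_sum - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical counts for seven block groups arranged in a row
counts = [2, 3, 2, 4, 20, 25, 22]
adjacency = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5]]
z_scores = getis_ord_gi_star(counts, adjacency)
```

In this toy row, the block group at index 5 sits between two other high counts, so its score passes the 1.96 threshold, while the low counts at the other end of the row receive negative scores.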
The P Value complements the Z Score by giving the probability that the observed spatial
pattern could have been produced by random chance. The closer the Z Score is to 0, the closer
the P Value is to 1, and the less evidence there is of a hot or cold spot. For example, a
P Value of .99 corresponds to a Z Score near 0. On the opposite end of the spectrum, as seen in
Table 2 below, the more negative or positive a Z Score is, the closer the P Value is to 0. For
instance, the screenshot of the attribute table shows a record that has a Z Score of -3.86
(indicating a cold spot for a crime type) with a corresponding P Value of .00011. If the Z
Score were 3.86 (indicating a hot spot for a crime type), the corresponding P Value would be
the same. This follows the standard bell curve, seen below.
Source: Royal Bunnykins UK. http://royalbunnykins.co.uk/is/ru-z-score-table-normal-distribution/.
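The link between a Z Score and its P Value can be checked with the standard normal curve. The sketch below assumes the P Value is two-sided, which is consistent with the values quoted in this paper; the function name is illustrative only.

```python
import math

def two_sided_p_value(z_score):
    """Two-sided p-value of a z-score under the standard normal curve."""
    return math.erfc(abs(z_score) / math.sqrt(2))

p_cold = two_sided_p_value(-3.86)      # cold spot example from Table 2
p_hot = two_sided_p_value(3.86)        # mirror-image hot spot
p_middle = two_sided_p_value(0.44484)  # a Z Score near the middle
```

Because the curve is symmetric, the cold spot and hot spot examples give the same P Value of about .00011, while a Z Score near the middle gives a large P Value.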
The SQL Queries for this part of the project are based on the violent crime data.
With the map created through Hot Spot Analysis, the SQL Queries complement the map by asking for
specific information. In this project, I asked the database to pull the violent crime count,
the Z Score, and the P Value for the Source ID representing my own block group as well as for a
randomly chosen census block group that has a statistically significant count of violent crimes. By
comparing the two block groups, I can see how different, statistically, violent crime
occurrences in 2010 are in the two block groups. The results are discussed in the next section.
- Local Moran's I
ModelBuilder Work Flow that depicts the usefulness of the Statistical variables created from
the Cluster and Outlier Analysis using Anselin Local Moran's I. Arrows added in Paint.
The Local Moran's I statistic used for the cluster and outlier analysis part of the project
is important for determining where the outliers in a cluster are located. In the above ModelBuilder
work flow that was created for this project, the most important variables are those in the aqua
circle. The five variables are LMiIndex, LMiZScore, LMiPValue, Source ID, and Cluster-Outlier
Type (COType for short). While the Source ID establishes a unique ID for each feature, the
LMiIndex, the LMiZScore, and the LMiPValue are variables from Local Moran's I that identify
spatial clusters of features with attribute values similar in magnitude (ArcGIS Resource Center-
Desktop Help 10.0).
Arguably, the most important variable for establishing strong results with the cluster and outlier
analysis is the COType. The COType takes five different kinds of values. The first three are:
not significant, which corresponds to the not significant results of the hot spot analysis;
“HH”, which marks a census block group with a high crime count surrounded by block groups that
also have high counts; and “LL”, which marks a census block group with a low crime count
surrounded by block groups that also have low counts. These three values represent cluster types.
The more important part of the analysis is the two values that represent the outlier portion of the
output: “LH”, which marks low crime counts that are near or within high clusters of
crime, and “HL”, which marks high crime counts that are near or within low clusters of crime.
While the Cluster and Outlier Analysis output may look similar to the Hot Spots Analysis, the
outliers represent the subtle difference between the two.
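The logic behind the four COType labels can be sketched as follows. This is a simplified illustration using a common textbook form of Local Moran's I with row-standardized binary weights; it omits the significance test that ArcGIS applies before assigning a label, and the counts and adjacency lists are hypothetical.

```python
def local_moran_quadrants(values, neighbors):
    """Return (local_i, label) per feature; every feature needs a neighbor."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    m2 = sum(d * d for d in deviations) / n  # variance of the counts
    results = []
    for i in range(n):
        # Spatial lag: average deviation of the neighbors (feature excluded)
        lag = sum(deviations[j] for j in neighbors[i]) / len(neighbors[i])
        local_i = deviations[i] * lag / m2
        own = "H" if deviations[i] > 0 else "L"
        around = "H" if lag > 0 else "L"
        results.append((local_i, own + around))
    return results

# Hypothetical row of ten block groups: a high cluster containing one low
# count (index 2) and a low cluster containing one high count (index 7)
counts = [20, 22, 2, 21, 23, 1, 2, 19, 3, 1]
adjacency = [[1], [0, 2], [1, 3], [2, 4], [3, 5],
             [4, 6], [5, 7], [6, 8], [7, 9], [8]]
results = local_moran_quadrants(counts, adjacency)
labels = [label for _, label in results]
```

A negative index value signals an outlier: the feature differs from its surroundings, which is why the low count at index 2 is labeled LH and the high count at index 7 is labeled HL.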
The SQL Queries that were developed for the Cluster and Outlier Analysis of Auto Thefts in
Cleveland Neighborhoods separate the two sets of outliers from the rest of the data. The outliers,
block groups with higher than expected auto theft counts near a cluster of generally lower
values and block groups with lower than expected counts near a cluster of generally higher
values, open up further debate as to what action to take in response.
Once these outliers are found, people of interest can further investigate how
to mitigate auto thefts in block groups with higher than expected counts, or how lessons from
block groups with lower than expected counts could help surrounding block groups and
neighborhoods reduce their own high auto theft counts. The results of the
analysis are discussed further in the next section.
Discussion and Summary
ModelBuilder Work Flow that showcases the importance of the Union to retrieve results for
Cluster and Outlier Analysis. This importance is also extended to the Hot Spot Analysis.
Square added in Paint.
Two maps were created in ModelBuilder for the purpose of this project. The
first is the Hot Spots Analysis map. In the appendix, there are two versions, one vector and
one raster, which reflect the ModelBuilder work flow results (see Maps #6 and #7). The second
is the Cluster and Outlier Analysis map; the ModelBuilder Work Flow seen above produces the
outputs seen in the appendix as both vector and
raster maps (see Maps #8 and #9). The most important step in the ModelBuilder process
for both the Hot Spots Analysis and the Cluster and Outlier Analysis is to use the Union
tool to combine the Neighborhood Boundary dataset with the Hot Spot or Cluster and
Outlier Analysis feature class, which already has the information about the specific type of
crime. The result overlays a rough outline of the neighborhoods on the map, so the existing
results can be aggregated by neighborhood. These results are seen in the appendix.
For example, using the Union tool to combine the datasets allows you to see the
approximate area of each census block group under the name of a Cleveland neighborhood. To
produce Result #1 for all 36 Statistical Planning Areas, I sorted the names of the Cleveland
neighborhoods alphabetically along with their z-score results, exported each neighborhood's
rows into Excel, averaged the z-scores to get one result per neighborhood, and recorded the
name of the neighborhood and its average z-score until all 36 Cleveland neighborhoods
were completed. Once all neighborhood z-scores were gathered, I pulled
out the statistically significant ones (meaning the z-scores that averaged above 1.96 for
hot spots and below -1.96 for cold spots). These results are seen in Appendix 3: the most
dangerous neighborhoods for violent crime by z-score are compiled in Result #2 and the least
dangerous neighborhoods, where the average z-score is to the left of the mean, are compiled in
Result #3.
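The averaging and filtering steps carried out in Excel can be sketched as follows. The neighborhood names and z-scores here are hypothetical placeholders, not values from the project, and the function names are illustrative only.

```python
from collections import defaultdict

def average_z_by_neighborhood(rows):
    """rows: iterable of (neighborhood, z_score) pairs, one per block group."""
    grouped = defaultdict(list)
    for name, z in rows:
        grouped[name].append(z)
    # Sort alphabetically, then average each neighborhood's z-scores
    return {name: sum(zs) / len(zs) for name, zs in sorted(grouped.items())}

def split_significant(averages, threshold=1.96):
    """Separate averages above +threshold (hot) and below -threshold (cold)."""
    hot = {k: v for k, v in averages.items() if v > threshold}
    cold = {k: v for k, v in averages.items() if v < -threshold}
    return hot, cold

# Hypothetical block group rows; "A", "B", "C" are placeholder neighborhoods
rows = [("A", 2.5), ("A", 3.1), ("B", 0.2), ("B", -0.4),
        ("C", -2.2), ("C", -2.6)]
averages = average_z_by_neighborhood(rows)
hot, cold = split_significant(averages)
```

With these placeholder rows, neighborhood A averages above 1.96 and lands in the hot list, C averages below -1.96 and lands in the cold list, and B falls in between and appears in neither.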
The results for the Cluster and Outlier Analysis show the neighborhoods with
statistically significant high and low auto theft counts that tend to cluster together. The results
also show the neighborhoods that have outliers within or near the clusters, with
counts of auto thefts that are lower or higher than the surrounding block groups in the cluster.
Result #4 lists the high auto theft clusters with their corresponding
neighborhoods. Note that some neighborhoods appear on this list several times and
other neighborhoods do not appear on this list at all. Result #5 is a similar list
for clusters with a
low auto theft count. The interesting portion of the analysis is the determination of the outliers.
The auto theft counts that are higher than expected in areas with a low count of auto thefts are
listed by neighborhood in Result #6. The auto theft counts that are lower than expected in areas
with a high count of auto thefts are listed by neighborhood in Result #7.
SQL Queries- Hot Spots Analysis
SQL Query #1
Looking at the data, I was curious to see some of the statistics for my own census block group.
After identifying the area where I live on the census block group map (Lee-Miles neighborhood),
its census block group number (1223.01), and its source ID (1889), I entered the following SQL
statement:
The results show that the census block group I live in, which has a source ID of 1889, has a
violent crime count of 8 and a Z Score of .44, which falls in the middle of the values
represented for violent crimes and thus is not statistically significant. The P Value of about
.66 likewise indicates that the pattern around this block group is consistent with random
chance. The map in the appendix shows the results from the SQL Query (see Map #10).
SQL Query #2
For comparison with my location, I chose an area in a statistically
significant high crime cluster that I had identified earlier in a separate analysis.
The area, a census block group in the North Broadway neighborhood, was picked at random,
and the following SQL query was produced:
The results show that this particular census block group (1146.1) with a source ID of 1589 has a
2010 violent crime count of 20. The Z Score of 4.64 is far beyond the 1.96 threshold for
statistical significance, and the low P Value of
.00000342 predictably matches the high Z Score. These scores show that this portion of North
Broadway lies in a strong hot spot for violent crime.
For comparison's sake, I put the values of the Lee-Miles census block group where I live
and the North Broadway census block group together.
Source ID | 2010 Violent Crime Count | GiZScore | GiPValue
1889 (Lee-Miles) | 8 | .44484 | .656434
1589 (North Broadway) | 20 | 4.6439 | .00000342
Difference | 12 | 4.19906 | -.65643058
The results show that there is a big difference in each of the categories in terms of how safe the
block groups are with respect to violent crimes. Perhaps the most striking is the
difference in the Z Score; the North Broadway block group sits at the extreme right of
the values represented while the Lee-Miles block group is very close to the middle of the
dataset. The map in the appendix shows the results from the SQL Query (see Map
#11).
SQL Query- Cluster and Outlier Analysis
SQL Query #3
In developing the cluster and outlier analysis, I wanted to see how many values represented the
high values of auto thefts within or near a low cluster of auto theft in census block groups. The
“HL” designation of the COType is the unique identifier to establish those values. The SQL
Query is shown below.
Analyzing all of the auto theft counts in Cleveland census block groups, the search pulled
six outliers from the clusters. These outliers have a higher than expected auto
theft count compared with their neighboring block groups. This could suggest that some local
factor, such as gang or drug activity, sets these block groups apart from their surroundings.
Suggestions may then be made as to how these particular areas can be brought in line with
their neighbors. The map in the
appendix shows the results from the SQL Query (see Map #12).
SQL Query #4
Similarly, it is equally important to look at Cleveland census block groups that are outliers in a
good way. There are several areas with only a few auto theft incidents that are
surrounded by a cluster of high auto theft incidents. Using the SQL query below, these cases are
pulled from the database.
This is the end result of what the database has pulled. When highlighted on a map, the majority
of these areas are in or near the Downtown neighborhood and the neighborhoods surrounding
it. The map in the appendix shows the results from the SQL Query (see Map #13).
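The two outlier selections in SQL Queries #3 and #4 amount to filtering the attribute table on the COType field. The sketch below shows that filter on made-up rows; it is not the query used in the project, the `SourceID` key is a hypothetical stand-in for the Source ID field, and the counts are invented for illustration.

```python
def select_by_cotype(records, cotype):
    """Return the records whose COType field equals the requested label."""
    return [r for r in records if r["COType"] == cotype]

# Hypothetical attribute rows; SourceID and MV2010 values are made up
records = [
    {"SourceID": 1, "MV2010": 14, "COType": "HH"},
    {"SourceID": 2, "MV2010": 12, "COType": "HL"},
    {"SourceID": 3, "MV2010": 1, "COType": "LH"},
    {"SourceID": 4, "MV2010": 0, "COType": "LL"},
    {"SourceID": 5, "MV2010": 5, "COType": ""},  # not significant
]
high_in_low = select_by_cotype(records, "HL")  # as in SQL Query #3
low_in_high = select_by_cotype(records, "LH")  # as in SQL Query #4
```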
Although the language of statistical analysis may not be common among
everyday individuals, a map that shows in great detail, by census block group, where different
types of crime are heavily concentrated is just as important, if not
more. The maps and work flow from ModelBuilder used to create the vector and raster maps simplify
the process of determining where the lightest or heaviest concentration of crime is located in
Cleveland. SQL queries also simplify the data by allowing the user to ask for only the
information needed.
The tools of Spatial Statistics can benefit people who want to address
crime. At the least, Spatial Statistics can give the everyday person a snapshot of what types of
crime are concentrated in their neighborhood. At best, Spatial Statistics can be a tool for
policymakers in the Planning profession as well as in numerous public sector positions such as police,
firefighters, etc. to highlight, brainstorm, and seek to reduce crime in their cities,
neighborhoods and, more specifically, at the block group level as indicated by the various maps
presented in this project. Spatial Statistics can be a beneficial tool to reduce or even eradicate
different types of crime in Cleveland neighborhoods in the near future if the tool is put into the
right hands. Through a bit of training, education, and commitment to the cause of reducing crime
in Cleveland neighborhoods, the people of Cleveland can realize the potential power of Spatial
Statistics.
References
ArcGIS Resource Center- Desktop Help 10.0, Cluster and Outlier Analysis,
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/How_Cluster_and_Outlier_Analysis_Anselin_Local_Moran_s_I_works/005p00000012000000/
Crime in Cleveland by Year. City-Data. http://www.city-data.com/crime/crime-Cleveland-Ohio.html.
Eck, John E., Spencer Chainey, James G. Cameron, Michael Leitner, and Ronald E. Wilson.
2005. Mapping Crime: Understanding Hot Spots. Washington, DC: National Institute of
Justice.
Greenberg, Stephanie & William Rohe. 1984. Neighborhood Design and Crime. Journal of the
American Planning Association. 50:48-61.
Grubesic, Tony H., & Alan T. Murray. 2001. Detecting Hot Spots Using Cluster Analysis and
GIS. Proceedings from the Fifth Annual International Crime Mapping Research
Conference. Dallas, TX.
McShepard, Randall & Fran Stewart. October 2009. Rebuilding Blocks. Policy Bridge.
http://www.policy
bridge.org/uploaded_files/NeighborhoodReport_10_05_09_file_1255630039.pdf.
Scott, Lauren & Nathan Warmerdam. June 02, 2005. Extend Crime Analysis with ArcGIS
Spatial Statistics Tools. ArcUser Magazine.
Scott, Lauren & Nathan Warmerdam. Spatial Statistics for Public Health and Safety. 2008.
http://proceedings.esri.com/library/userconf/hss06/docs/spatial.pdf
Taylir, Josh. “Drug Arrests in High Gun Crime Locations of Dallas Texas: Methodology of
Cluster Analysis using GIS and SPSS.” 2007.
http://geography.unt.edu/~pdong/courses/4550/reports/Taylir_Josh_2007.pdf
Appendix 1- Attribute Tables
Table #1: Hot Spots Attribute Table Data
Table #2: Analyzing the Z Score and the P Value for Cleveland Neighborhoods
Appendix 2- Maps
Map #1: Extraction of North Broadway High Crime Rate Area from Cleveland Neighborhoods
Source: Cleveland City Planning Commission- Statistical Planning Area Map for North Broadway Neighborhood
Map #2: Approximation of Where the Selected Area of High Crime is Located in North Broadway
Map #3: A depiction of the initial Cleveland Block Groups that displays a classification scale of where the violent
crimes are located.
Map #4: A depiction of the initial Cleveland Block Groups that displays a classification scale of where the motor
vehicle thefts are located.
Map #5: Raster Version of the Hot Spots Map
Map #6- Hot Spots Map from ModelBuilder work flow
Map #7- Raster-based version of the Hot Spots Map from ModelBuilder work flow
Map #8- Cluster and Outlier Analysis Map created from the ModelBuilder Work Flow
Map #9- A Raster-Based Cluster and Outlier Analysis Map of Violent Crimes in Cleveland Neighborhoods
25
Map #10: Location of the Lee-Miles Block Group Where I Live (SQL Query #1)
Map #11: Location of a Randomly Chosen, Statistically Significant High Crime Block Group in the North Broadway
Neighborhood (SQL Query #2)
Map #12: Results of SQL Query #3
Map #13: Results of SQL Query #4
Appendix 3- Results
Result #1: Average Z Score for Violent Crimes in Cleveland Neighborhoods
Result #2: Most Dangerous Cleveland Neighborhoods for Violent Crime According to Average Z Score
Result #3: Least Dangerous Cleveland Neighborhoods for Violent Crime According to Average Z Score
Result #4: HH Values of Auto Thefts in Cleveland Neighborhoods
Result #5: LL Values of Auto Thefts in Cleveland Neighborhoods
Result #6: HL Values of Auto Thefts in Cleveland Neighborhoods
Result #7: LH Values of Auto Thefts in Cleveland Neighborhoods
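Results #1 through #3 rank neighborhoods by the average Z score of the block groups they contain. The following is only a minimal sketch of that averaging step, not the workflow used to produce the results. It assumes each block group has already been matched to a neighborhood name (for example, through a spatial join), and the rows below are made-up placeholder values, not data from this study.

```python
from collections import defaultdict

def average_z_by_neighborhood(rows):
    """Average block-group Z scores per neighborhood, highest first.

    rows: iterable of (neighborhood_name, z_score) pairs, one per block group.
    """
    totals = defaultdict(lambda: [0.0, 0])  # name -> [sum of z, count]
    for name, z in rows:
        totals[name][0] += z
        totals[name][1] += 1
    averages = {name: s / c for name, (s, c) in totals.items()}
    # Highest average Z score (most clustered high values) comes first.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

# Hypothetical placeholder rows: (neighborhood, GiZScore of one block group).
rows = [("A", 3.0), ("A", 2.0), ("B", -1.0), ("B", 0.5), ("C", -2.0)]
ranking = average_z_by_neighborhood(rows)
for name, avg in ranking:
    print(name, avg)
```

Sorting in descending order puts the neighborhoods with the highest average Z score at the top of the list and those with the lowest at the bottom.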
Appendix 4- Data Dictionary
Table: Violent Crime (Total violent crime includes four offenses: homicide, rape, robbery, and
aggravated assault).
Field Name     Data Type       Length   Description
ObjectID       Long Integer    -        Unique ID of Object
BlockGr        String          47       Census Block Group #
CntyNme        String          15       County Name
V2010          Double          -        2010 Violent Crime Count in a Census Block Group
StFid00_BG     String          15       ID for Ohio, Cuyahoga County, Cleveland, and specific block group
Table: Auto Thefts
Field Name     Data Type       Length   Description
ObjectID       Long Integer    -        Unique ID of Object
BlockGr        String          47       Census Block Group #
CntyNme        String          15       County Name
MV2010         Double          -        2010 Auto Thefts Count in a Census Block Group
StFid00_BG     String          15       ID for Ohio, Cuyahoga County, Cleveland, and specific block group
Cleveland Block Groups (derived from Ohio Block Groups- bg39_d00.shp) (Geometry Type:
Polygon)
Field Name     Data Type                  Length   Description
FID            Object ID (Long Integer)   -        Unique ID of Block Group
Shape          Geometry                   -        Shape of Block Group
Area           Double                     -        Area of Block Group
Perimeter      Double                     -        Perimeter of Block Group
BG_39_D00_     Double Precision           11       Block Group Data
BG_39_D00_I    Double Precision           11       Block Group Data
State          String                     2        Census State #
County         String                     3        Census County #
Tract          String                     6        Census Tract #
BlkGroup       String                     1        Census Block Group #
Name           String                     90       Dup. of Census Block Group #
LSAD           String                     2        Type of Boundary
LSAD_TRANS     String                     50
STFID00_BG     String                     50       ID for Ohio, Cuyahoga County, Cleveland, and specific block group
Cleveland_SPAs06_conflated_to_2010_blocks v1 (Geometry Type: Line)
Field Name     Data Type                  Length   Description
FID            Object ID (Long Integer)   -        Unique ID
Shape          Polygon                    -        Shape of Neighborhood Layer
Name           String                     100      Name of Cleveland Neighborhoods
When the Hot Spots Analysis is run, new fields are added to the attribute table of the Cleveland
Block Groups layer. Below is an example of the hot spots analysis data for Violent Crimes in
Cleveland neighborhoods:
Field Name     Data Type       Length   Description
ObjectID       Long Integer    -        Unique ID of Object
Shape          Geometry        -        Shape of Census Block Group
Shape_Length   Double          -        Length of the shape
Shape_Area     Double          -        Area of the shape
GiZScore       Float           -        Z score from Getis-Ord Gi*; tests whether a measure of crime is statistically significant
GiPValue       Float           -        P value from Getis-Ord Gi*; probability that the crime pattern in a particular block group arose by chance
Source_ID      Long            -        ID of the Source polygon
V2010          Double          -        2010 Violent Crime Count in a Census Block Group
Note: The fields are the same for the auto thefts hot spots analysis that was run, with the exception
of the last variable (it is MV2010 instead of V2010).
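To make the GiZScore and GiPValue fields more concrete, the following is a minimal sketch of how a Getis-Ord Gi* Z score and its two-sided P value can be computed. It is not the ArcGIS Hot Spot Analysis tool itself. The five counts and the simple "neighbors in a row" weights matrix are made-up assumptions for illustration, and the function name is hypothetical.

```python
import math
import numpy as np

def getis_ord_gi_star(values, weights):
    """Return (z_scores, p_values) for the Getis-Ord Gi* statistic.

    values  : 1-D sequence of counts, one per block group (e.g. V2010)
    weights : n x n spatial weights matrix; each unit counts as its own
              neighbor (non-zero diagonal), which is what the "*" means
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    mean = x.mean()
    std = math.sqrt((x ** 2).sum() / n - mean ** 2)  # population std dev
    w_sum = w.sum(axis=1)            # sum of weights for each unit
    w_sq_sum = (w ** 2).sum(axis=1)  # sum of squared weights for each unit
    numerator = w @ x - mean * w_sum
    denominator = std * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    z = numerator / denominator
    # Two-sided P value from the standard normal distribution.
    p = np.array([math.erfc(abs(zi) / math.sqrt(2)) for zi in z])
    return z, p

# Hypothetical example: five block groups in a row, each neighboring
# itself and the units immediately to its left and right.
counts = [1, 2, 30, 35, 3]
w = np.eye(5) + np.eye(5, k=1) + np.eye(5, k=-1)
z, p = getis_ord_gi_star(counts, w)
for i, (zi, pi) in enumerate(zip(z, p)):
    print(f"block group {i}: GiZScore={zi:.3f}  GiPValue={pi:.3f}")
```

A large positive Z score marks a block group surrounded by high counts (a hot spot), a large negative Z score marks one surrounded by low counts (a cold spot), and a small P value indicates that the clustering is unlikely to be the result of chance.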
Similarly, when the Cluster and Outlier Analysis is run, new data is entered into the attribute
table. Below is an example of the cluster and outlier analysis data for Auto Thefts in Cleveland
neighborhoods:
Field Name             Data Type       Length   Description
ObjectID               Long Integer    -        Unique ID of the Object
Shape                  Geometry        -        Shape of Census Block Group
Source_ID              Long            -        ID of the Source Polygon
Shape_Length           Double          -        Length of the shape
Shape_Area             Double          -        Area of the shape
LmiIndex IDW 23048     Float           -        Index of the Values established by the Z Score and P Value
LmiZScore IDW 23048    Float           -        Z score from Local Moran's I; tests whether a measure of crime is statistically significant
LmiPValue IDW 23048    Float           -        P value from Local Moran's I; probability that the crime pattern in a particular block group arose by chance
CoType IDW 23048       String          -        Classifies the clusters and the outliers as well as data that is viewed as not statistically significant
MV2010                 Double          -        2010 Motor Vehicle Thefts Count in a Census Block Group
Note: The data is the same for the violent crimes cluster and outlier analysis that was run with
the exception of the last variable (it is V2010 instead of MV2010).
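The LmiIndex and CoType fields can be illustrated with a minimal sketch of the Local Moran's I index and the HH, LL, HL, and LH labels. This is an assumption-laden simplification, not the ArcGIS Cluster and Outlier Analysis tool: it labels every unit and leaves out the Z score and P value test that the tool uses to mark features as not statistically significant. The counts, the weights matrix, and the function name are hypothetical.

```python
import numpy as np

def local_morans_i(values, weights):
    """Return (index, labels) for a simplified Local Moran's I.

    values  : 1-D sequence of counts, one per block group (e.g. MV2010)
    weights : n x n spatial weights matrix (the diagonal is ignored)
    labels  : "HH", "LL", "HL" or "LH" for every unit; the first letter
              is the unit itself, the second letter is its neighbors
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float).copy()
    np.fill_diagonal(w, 0.0)  # a unit is not its own neighbor here
    n = x.size
    dev = x - x.mean()        # deviation of each unit from the mean
    # Variance of the deviations over all units except unit i.
    s2 = ((dev ** 2).sum() - dev ** 2) / (n - 1)
    lag = w @ dev             # weighted sum of the neighbors' deviations
    index = dev / s2 * lag
    labels = [
        ("H" if d > 0 else "L") + ("H" if l > 0 else "L")
        for d, l in zip(dev, lag)
    ]
    return index, labels

# Hypothetical example: five block groups in a row, each neighboring
# the units immediately to its left and right.
counts = [1, 2, 30, 35, 3]
w = np.eye(5, k=1) + np.eye(5, k=-1)
index, labels = local_morans_i(counts, w)
for i, (ii, lab) in enumerate(zip(index, labels)):
    print(f"block group {i}: LmiIndex={ii:.3f}  type={lab}")
```

A positive index means a unit resembles its neighbors (HH or LL clusters), while a negative index means it differs from them (HL or LH outliers).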