Crime and Statistics

Breaking Down Crime in Cleveland Neighborhoods Using Spatial Statistics

Hasani Wheat

Cleveland State University

Levin College of Urban Affairs

PDD 643- Advanced GIS

Dr. Sung-Gheel Jang

December 16, 2011

Contents

Abstract
Background
Goals and Objectives
Data
Methods and Analysis
Discussion and Summary
References
Appendix
1. Attribute Tables
2. Maps
3. Results
4. Data Dictionary

Abstract

Cleveland, Ohio has long been associated with crime in conversations among residents and
non-residents alike. Some Cleveland neighborhoods have a reputation for being dangerous
because of foreclosures, high vacancy rates, and other housing and demographic conditions.
Often, neighborhoods that are viewed as unsafe because of their perceived high crime turn out
to be less unsafe than perceived once the data are examined. A second problem is that crime
data for Cleveland neighborhoods lack availability and detail. While there are maps that analyze
crime in the Cleveland area, most do not use the most detailed scale of analysis, the Census
block group, and few display crime as recent as 2010. This paper detects hot and cold spots in
Cleveland neighborhoods using crime data from the Cleveland Police Department, obtained via
NEO CANDO. It then goes a step further and examines the outliers that lie near or within those
hot and cold spot clusters. Identifying where these outliers are located, and in which Cleveland
neighborhood, will be critical for further research and analysis of crime. Using Census block
groups instead of a less detailed boundary file such as Census tracts provides more features to
cluster in the hot spot analysis. The block group data can also be used to establish where the
outliers for each crime type are in Cleveland neighborhoods.

Background

This project uses spatial statistics to display crime in Cleveland neighborhoods. It is
important because the maps and data make it easier to see the concentration of crime in detail.
The perception that Cleveland is an unsafe place because of its crime is prevalent in the minds
of many people, including people who are from Cleveland neighborhoods. To change that
perception, a visual representation of the types of crime present in Cleveland neighborhoods
needs to be put in front of the public. The problem statement for this project is: Is there a
statistically significant relationship between the count of a particular type of crime and the
location of a particular census block group within a particular neighborhood?

As mentioned in the proposal for this project, a better understanding of the levels of crime
in Cleveland requires a categorical breakdown of where crime is located in Cleveland and, more
specifically, in its neighborhoods. Although the objective is to break down myths associated
with crime types in a given Cleveland neighborhood, graphical representation is also critical.
Cleveland as a whole is not a dangerous place. By providing maps of which crimes affect which
neighborhoods, and which crimes do not, the hope is to alleviate the fears that people have
about a particular crime. Only a few maps are presented for this project; they are a small
sample of the types of crime and where crime is located. In a nutshell, the project attempts to
address three questions: 1.) Where are the hot and cold clusters of a particular crime type
located in Cleveland? 2.) Where are the outliers in a hot or cold cluster? 3.) How can the
results of the hot and cold clustering, and the outliers within these clusters, be assessed in a
real-life situation?

The audience should expect three things from this project: a visual representation, by
Census block group, of whether each area of a Cleveland neighborhood is a high-crime cluster,
a low-crime cluster, or an area that falls between the extremes for a particular type of crime;
an explanation of the hot and cold spot results for that crime in each neighborhood; and an
explanation of where the outliers are within or near a cluster of high or low crime. Once these
analyses are created, criminal activity can be explained in a spatial context by comparing the
results with socio-economic factors such as vacancy and foreclosure rates (Greenberg & Rohe,
1984).

Easy access to this information would help make people aware that the data exist.
Policymakers and city officials will be able to identify the approximate areas where crime is
located. For example, according to the Cleveland Police Department, the North Broadway
neighborhood has one of the highest average rates of violent crime of the 36 Cleveland
Statistical Planning Areas. By selecting the area that has the highest rate of crime,
policymakers can extract this location from the rest of the dataset, zoom in on it, compare it
with the Statistical Planning Area Map, study the area, and make recommendations on how to
reduce the crime count (see Table 1, Maps #1 & #2).

Some of the research that was influential to the development of this project was “Extend
Crime Analysis with ArcGIS Spatial Statistics Tools” by Lauren Scott and Nathan Warmerdam,
“Mapping Crime: Understanding Hot Spots” by the U.S. Department of Justice, Office of Justice
Programs, and “Rebuilding Blocks” by Randall McShepard and Fran Stewart of PolicyBridge. The
Scott and Warmerdam article introduced me to the value of creating hot and cold spots with
statistical analysis to depict the data clearly. “Mapping Crime” stresses the importance of
establishing theories that explain why the high or low crime areas shown in the maps are the way
they are, and how policymakers and other interested parties can make decisions based on these
theories. “Rebuilding Blocks” addresses the fear and perception of crime in Cleveland and its
neighborhoods mentioned earlier in this paper.

For the cluster and outlier analysis portion of the project, I also referred to several
research and literary works. A 2007 course project titled “Drug Arrests in High Gun Crime
Locations of Dallas Texas” by Josh Taylir examines whether gun crimes are spatially clustered,
random, or dispersed using the Local Moran’s I statistical tool. The other article that helped
me understand the importance of cluster and outlier analysis was written by the statistic’s
namesake, Luc Anselin. The article, “Review of Cluster Analysis Software,” addresses the
importance of the Anselin Local Moran’s I statistical tool. Among the programs mentioned in the
review are CrimeStat and GeoDa, two well-known programs that evaluate maps on a statistical
basis. The article shows that many statistical programs offer Local Moran’s I as a tool; ArcGIS
takes it a step further by building the statistic into a tool that maps clusters and outliers of
crime in a Cleveland neighborhood. Additionally, in their presentation “Spatial Statistics for
Public Health and Safety,” Scott and Warmerdam describe cluster and outlier analysis as
“gaining a better understanding of feature distribution through degree of clustering or
dispersion across study area” (Scott and Warmerdam, “Spatial Statistics”).

Goals and Objectives

The goal of this project is to analyze the locations of crime types in Cleveland
neighborhoods with Spatial Statistics tools. These tools are Hot Spot Analysis, which uses the
Getis-Ord Gi* statistic, and Cluster and Outlier Analysis, which uses the Anselin Local Moran’s
I statistic. By creating a visual of where crime types are located in the Cleveland area, people
will better understand where in Cleveland different crime types are prevalent.

The objectives that support this goal are to identify Cleveland neighborhoods that have
high counts of crime (statistically significant hot spots), Cleveland neighborhoods that have
low counts of crime (statistically significant cold spots), and Cleveland neighborhoods whose
crime counts fall in the middle (the majority of census block groups, which are within range of
the mean). A further objective is to find Census block groups in a high or low cluster that have
a higher or lower crime count than the rest of the cluster. In other words, the analysis is
similar to that of the hot spots; however, a census block group within or near a cluster of low
crime may have a higher count than its surrounding block groups. Likewise, a high crime cluster
may contain a few census block groups with lower than expected crime counts.

Data

Looking at the breakdown of specific crimes in Cleveland in the 2010 crime reports from
City Data (http://www.city-data.com/crime/crime-Cleveland-Ohio.html), I wanted to use one
variable that represents a common problem in Cleveland neighborhoods and one that represents a
serious but infrequent crime problem. The two crime types chosen for this project were auto
thefts as the common problem (3,503 auto thefts, or 822.2 per 100,000 people) and violent crimes
as the serious but infrequent problem (a violent crime index of 706.9 for Cleveland against a
U.S. average of 222.7) (City Data). To create the maps and to provide the data needed to
determine the most and least dangerous areas for each type of crime, I acquired the following
information:

Spatial:

Name | Source | Use
Census County Boundary- Block Group | ESRI TigerFile- 2010 | Boundaries used to extract Cleveland census block groups
Cleveland SPAs06 Conflated to 2010 Blocks | Northern Ohio Data & Information Systems | Feature class that highlights the boundaries of each Cleveland neighborhood
Cleveland Zoning Map | Cleveland Planning Commission | Static map; to compare with (Getis-Ord statistic) hot and cold spot clusters created in ArcGIS

Non-Spatial:

Name | Source | Use
The following selected crimes: 2010 Violent Crimes; 2010 Auto Thefts | Cleveland Police Department, Crime Analysis Unit, retrieved from NEO CANDO | The primary variables used for Hot and Cold Spot Analysis and Cluster and Outlier Analysis

The variables were gathered and loaded into ArcGIS, and the crime tables were joined to
the census block groups to produce the initial classification maps, which display the counts of
a particular crime by census block group (see Map #3 for the Violent Crime classification map
and Map #4 for the Auto Thefts classification map). A data dictionary was then established to
clarify the fields in the attribute table of each table or feature class. For this project, the
data dictionary covers the four datasets that make up the initial feature classes used later for
the hot spot analysis and the cluster and outlier analysis: the tables for 2010 violent crimes
and 2010 auto thefts in Cleveland, the boundary file for Cleveland Block Groups, which is
derived from the Ohio Block Groups, and the Cleveland Neighborhood Layer (see Appendix 4- Data
Dictionary). The data dictionary breaks down each field in the attribute tables used to create
the initial maps. Without this information, the maps could not be created, and the analyses in
the next step would not exist. The analyses in the next steps also produce their own attribute
tables, and a data dictionary is especially useful for determining what the field names mean in
statistical terms for the hot spot and cluster-outlier maps.

After producing the initial block group maps that show the concentration of crime, I
converted the newly created hot spot feature class into a cleaner version of the map. To do so,
I searched for the Feature to Raster tool, which ArcGIS lists under its Conversion Tools. With
the hot spot feature class as the input, the vector data is converted into a raster surface that
presents the results more cleanly. An example is the hot spots map that was created with the
Feature to Raster tool (see Map #5). Comparing the vector data with the raster data of the hot
spots analysis of auto theft (compare Maps #4 and #5), one can see similarities between the two
maps, especially at the most extreme low and high values of crime, although the classification
schemes of the two maps are different. The key to comparing the two datasets is to look at the
extreme values. Adjusting the classification scheme may bring out even more similarities in the
presentation of the data. Creating a raster version of the hot spot data may not be entirely
necessary in this case, but the visual may be easier to read, especially if the map is put on a
poster or presented in PowerPoint.
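The vector-to-raster conversion can be illustrated outside of ArcGIS. The sketch below is not the Feature to Raster tool itself. It assumes each block group is a simple polygon carrying one z-score, and it gives every raster cell the value of the polygon that contains the cell center. The function names, the two square "block groups," and their z-scores are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def features_to_raster(features, x_min, y_min, cell_size, n_cols, n_rows, nodata=None):
    """Return a grid (list of rows, top row first) of polygon values.

    features -- list of (polygon, value) pairs
    """
    grid = []
    for row in range(n_rows):
        # Row 0 is the top of the raster, so y decreases as row increases
        y = y_min + (n_rows - row - 0.5) * cell_size
        line = []
        for col in range(n_cols):
            x = x_min + (col + 0.5) * cell_size
            value = nodata
            for polygon, z in features:
                if point_in_polygon(x, y, polygon):
                    value = z
                    break
            line.append(value)
        grid.append(line)
    return grid


# Two hypothetical square block groups side by side, with made-up z-scores
left = [(0, 0), (2, 0), (2, 2), (0, 2)]
right = [(2, 0), (4, 0), (4, 2), (2, 2)]
grid = features_to_raster([(left, -2.5), (right, 3.1)], 0, 0, 1, 4, 2)
for line in grid:
    print(line)
```

A smaller cell size gives a raster that follows the block group outlines more closely, at the cost of more cells to store and draw.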

Methods and Analysis

- Getis-Ord Gi*

ModelBuilder Work Flow that depicts the usefulness of the Statistical variables created from

the Hot Spot Analysis using Getis-Ord Gi*. Arrows added in Paint.

The work flow created in ModelBuilder, shown above, is used to develop the Hot Spot
Analysis. Hot Spot Analysis is the first step in developing a sense of where crime types in
Cleveland are located and how strongly they are concentrated, as indicated by the color
assigned to each value. In Hot Spot Analysis, the Getis-Ord Gi* statistical tool analyzes the
counts of a crime type in each census block group and creates the outputs GiZScore and
GiPValue. The Z Score indicates where a value falls on the standard normal distribution. In
most cases, the values fall in the middle, while a small number of values are extremes in
comparison to the middle values.

In this project, each census block group has a count of a particular type of crime that
occurred during 2010. For instance, census block group 1011.01 has a recorded violent crime
count of 8 for 2010. This count is compared with those of the other block groups that make up
the Cleveland area, which are divided into neighborhoods. Taking all of this into
consideration, Getis-Ord Gi* produces a Z Score for each block group. Z Scores from -1.96 to
1.96 are not statistically significant at the 95 percent confidence level; all other values are
extremes. These extremes are the statistically significant values: values to the left of -1.96
are significant cold spots and values to the right of 1.96 are significant hot spots. The crime
data is sorted by Z Score and displayed on the map.
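To make the Z Score less of a black box, the calculation can be sketched in plain Python. This is a minimal illustration of the published Getis-Ord Gi* formula, not the ArcGIS implementation. It assumes a binary weights matrix in which each block group is a neighbor of itself and of the block groups next to it, and the eight counts are hypothetical values arranged in a row.

```python
import math


def gi_star_z_scores(values, weights):
    """Return one Getis-Ord Gi* z-score per feature.

    values  -- crime count for each feature
    weights -- weights[i][j] is 1 if j is a neighbor of i (or i itself), else 0
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    z_scores = []
    for i in range(n):
        row = weights[i]
        sum_w = sum(row)
        sum_w2 = sum(w * w for w in row)
        local_sum = sum(w * v for w, v in zip(row, values))
        # Local sum compared with what the global mean would predict
        numerator = local_sum - mean * sum_w
        denominator = std * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        z_scores.append(numerator / denominator)
    return z_scores


def classify(z, threshold=1.96):
    """Label a z-score using the 95 percent confidence thresholds."""
    if z > threshold:
        return "hot spot"
    if z < -threshold:
        return "cold spot"
    return "not significant"


# Hypothetical counts for eight block groups arranged in a row
counts = [2, 3, 2, 4, 3, 18, 20, 19]
n = len(counts)
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]
z_scores = gi_star_z_scores(counts, weights)
labels = [classify(z) for z in z_scores]
for count, z, label in zip(counts, z_scores, labels):
    print(count, round(z, 2), label)
```

Because Gi* looks at each feature together with its neighbors, a block group is flagged as a hot spot only when high counts sit next to other high counts.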

The P Value complements the Z Score. It is the probability that a Z Score at least as
extreme as the one observed would arise if the crime counts were distributed randomly across
the block groups; it is not the probability that a crime will or will not take place. The
closer the Z Score is to 0, the closer the P Value is to 1. For example, a P Value of .99
corresponds to a Z Score near 0. At the opposite end of the spectrum, as seen in Table 2, the
more negative or positive a Z Score is, the closer the P Value is to 0. For instance, the
screenshot of the attribute table shows a row with a Z Score of -3.86 (indicating a cold spot
for a crime type) and a corresponding P Value of .00011. A Z Score of 3.86 (indicating a hot
spot for a crime type) would have the same P Value. This follows the standard bell curve, seen
below.

Source: Royal Bunnykins UK. http://royalbunnykins.co.uk/is/ru-z-score-table-normal-distribution/.
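The link between the two outputs can be checked with a few lines of Python. The sketch assumes that the P Value is the two-tailed area under the standard normal curve beyond the Z Score, which is consistent with the values quoted in this paper; the function name is hypothetical.

```python
import math


def two_tailed_p_value(z):
    """Area under the standard normal curve at least as far from 0 as z."""
    return math.erfc(abs(z) / math.sqrt(2))


# Z Scores quoted in this paper
for z in (-3.86, 3.86, 0.44484, 4.6439):
    print(f"z = {z:8.5f}   p = {two_tailed_p_value(z):.8f}")
```

Because the absolute value of the Z Score is used, a cold spot at -3.86 and a hot spot at 3.86 receive the same P Value.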

The SQL queries for this project are based on the violent crime data. The Hot Spot
Analysis creates the map, and the SQL queries complement the map by asking for specific
information. In this project, I asked the database to pull the general violent crime per 1,000
people, the Z Score, and the P Value for the Source ID representing my neighborhood, as well as
for a randomly chosen census block group that has a statistically significant count of violent
crimes. By comparing the two block groups, I can see how large the statistical difference in
2010 violent crime occurrences is between them. The results are discussed in the next section.

- Local Moran’s I

ModelBuilder Work Flow that depicts the usefulness of the Statistical variables created from
the Cluster and Outlier Analysis using Anselin Local Moran’s I. Arrows added in Paint.

The Local Moran’s I statistic used for the cluster and outlier analysis is important for
determining where the outliers in a cluster are located. In the ModelBuilder work flow above,
which was created for this project, the most important variables are those in the aqua circle.
The five variables are LMiIndex, LMiZScore, LMiPValue, Source ID, and Cluster-Outlier Type
(COType for short). While the Source ID establishes a unique ID for each feature, the LMiIndex,
LMiZScore, and LMiPValue are outputs of Local Moran’s I, which identifies spatial clusters of
features with attribute values similar in magnitude (ArcGIS Resource Center- Desktop Help
10.0).

Arguably, the most important variable for establishing strong results with the cluster and
outlier analysis is the COType, which takes five different kinds of values. The first three are
the values that are not significant, which correspond to the not significant results of the hot
spot analysis; “HH,” which marks a census block group with a high crime count surrounded by
block groups that also have high counts; and “LL,” which marks a census block group with a low
crime count surrounded by block groups that also have low counts. These three values cover the
clusters and the features that are not significant. The more important part of the analysis is
the two types that represent the outlier portion of the output: “LH,” which represents low
crime counts that are near or within the high clusters of crime, and “HL,” which represents
high crime counts that are near or within the low clusters of crime. While the Cluster and
Outlier Analysis output may look similar to the Hot Spot Analysis, the outliers represent the
subtle difference between the two.
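The four letter codes can be illustrated with a simplified Local Moran’s I calculation. The sketch below is an assumption-laden illustration and not the ArcGIS tool: it uses Anselin’s basic form of the statistic with row-standardized neighbors, it omits the significance test that ArcGIS applies before assigning a COType, and the ten counts are hypothetical block groups arranged in a row.

```python
def local_moran(values, neighbors):
    """Return (index, quadrant) pairs, one per feature.

    values    -- crime count for each feature
    neighbors -- neighbors[i] lists the positions of the features adjacent to i
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance = sum(d * d for d in deviations) / n
    results = []
    for i in range(n):
        # Average deviation of the neighbors (row-standardized spatial lag)
        lag = sum(deviations[j] for j in neighbors[i]) / len(neighbors[i])
        index = deviations[i] * lag / variance
        # First letter: the feature itself; second letter: its neighbors
        quadrant = ("H" if deviations[i] > 0 else "L") + ("H" if lag > 0 else "L")
        results.append((index, quadrant))
    return results


# Hypothetical auto theft counts for ten block groups arranged in a row
counts = [18, 20, 3, 19, 17, 2, 3, 16, 2, 3]
neighbors = [
    [j for j in (i - 1, i + 1) if 0 <= j < len(counts)]
    for i in range(len(counts))
]
results = local_moran(counts, neighbors)
for count, (index, quadrant) in zip(counts, results):
    print(count, round(index, 2), quadrant)
```

A positive index means the feature resembles its neighbors (HH or LL), while a negative index marks a feature that differs from them (HL or LH).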

The SQL queries developed for the Cluster and Outlier Analysis of auto thefts in Cleveland
neighborhoods separate the two sets of outliers from the rest of the data. The outliers are
block groups with higher than expected auto theft counts within a cluster of generally low
values, and block groups with lower than expected counts within a cluster of generally high
values. They open up further debate about what action to take in response to each outlier. Once
these outliers are found, interested parties can investigate how to mitigate auto thefts in
block groups with higher than expected counts, or look to the low outliers for lessons that
surrounding block groups and neighborhoods can use to reduce their own high auto theft counts.
The results of the analysis are discussed further in the next section.

Discussion and Summary

ModelBuilder Work Flow that showcases the importance of the Union to retrieve results for

Cluster and Outlier Analysis. This importance is also extended to the Hot Spot Analysis.

Square added in Paint.

Two maps were created in ModelBuilder for this project. The first is the Hot Spot Analysis
map. The appendix contains two versions, one vector and one raster, that reflect the results of
its ModelBuilder work flow (see Maps #6 and #7). The second is the Cluster and Outlier Analysis
map; the ModelBuilder work flow seen above produces the vector and raster outputs in the
appendix (see Maps #8 and #9). The most important step in the ModelBuilder process for both the
Hot Spot Analysis and the Cluster and Outlier Analysis is to use the Union tool to combine the
Neighborhood Boundary dataset with the Hot Spot or Cluster and Outlier Analysis feature class,
which already holds the information about the specific type of crime. The output overlays the
neighborhood outlines on the analysis results so that the results can be aggregated by
neighborhood. These results are seen in the appendix.

For example, using the Union tool to combine the datasets shows the approximate area of
each census block group under the name of a Cleveland neighborhood. To produce Result #1 for
all 36 Statistical Planning Areas, I sorted the Cleveland neighborhood names alphabetically
along with their z-score results, exported each neighborhood’s rows to Excel, averaged the
z-scores to get one result per neighborhood, and recorded the neighborhood name with its
average z-score until all 36 Cleveland neighborhoods were completed. Once all neighborhood
z-scores were gathered, I pulled out the statistically significant ones (the neighborhoods whose
z-scores averaged above 1.96 for hot spots or below -1.96 for cold spots). These results are
seen in the appendix: the most dangerous neighborhoods for violent crime by z-score are compiled
in Result #2, and the least dangerous neighborhoods, where violent crime falls to the left of
the mean, are compiled in Result #3.
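The sorting, averaging, and filtering steps described above were done by hand in Excel. As a rough equivalent, they can be sketched in Python. The neighborhood labels and z-scores below are placeholders and not results from this project, and the function names are hypothetical.

```python
from collections import defaultdict


def average_z_by_neighborhood(rows):
    """Average the block group z-scores within each neighborhood.

    rows -- iterable of (neighborhood name, z-score) pairs
    """
    grouped = defaultdict(list)
    for neighborhood, z in rows:
        grouped[neighborhood].append(z)
    # Sorted so the output is alphabetical, as in the Excel work flow
    return {name: sum(zs) / len(zs) for name, zs in sorted(grouped.items())}


def significant(averages, threshold=1.96):
    """Split the averages into hot (above threshold) and cold (below -threshold)."""
    hot = {name: z for name, z in averages.items() if z > threshold}
    cold = {name: z for name, z in averages.items() if z < -threshold}
    return hot, cold


# Placeholder rows: one (neighborhood, z-score) pair per block group
rows = [
    ("Neighborhood A", 2.5), ("Neighborhood A", 3.1),
    ("Neighborhood B", -2.2), ("Neighborhood B", -2.6),
    ("Neighborhood C", 0.4), ("Neighborhood C", -0.8),
]
averages = average_z_by_neighborhood(rows)
hot, cold = significant(averages)
print(averages)
print("hot:", list(hot), "cold:", list(cold))
```

Scripting the step makes it easy to repeat for another crime type without re-exporting each neighborhood by hand.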

The results for the Cluster and Outlier Analysis show the neighborhoods with statistically
significant high and low auto theft counts that tend to cluster together. The results also show
the neighborhoods that have outliers within or near the clusters, that is, block groups whose
auto theft counts are lower or higher than the surrounding block groups in the cluster. Result
#4 lists the high auto theft clusters with their corresponding neighborhoods. Note that some
neighborhoods appear on this list several times and other neighborhoods do not appear at all.
Result #5 is a similar list for clusters with a low auto theft count. The interesting portion of
the analysis is the determination of the outliers. The auto theft counts that are higher than
expected in areas with a low count of auto thefts are listed by neighborhood in Result #6. The
auto theft counts that are lower than expected in areas with a high count of auto thefts are
listed by neighborhood in Result #7.

SQL Queries- Hot Spots Analysis

SQL Query #1

Looking at the data, I was curious to see the statistics for my own census block group. After
identifying where I live on the census block group map (Lee-Miles neighborhood), the census
block group number (1223.01), and the source ID (1889), I entered the following SQL statement:

The results show that the census block group that I live in, which has a source ID of 1889, has
a violent crime rate of 8 per 1,000 people and a Z Score of .44, which falls in the middle of
the values for violent crime and thus is not statistically significant. The P Value of about .66
confirms this: a Z Score this close to 0 is consistent with a random distribution of violent
crime, so the block group is neither a hot spot nor a cold spot. The map in the appendix shows
the results from the SQL query (see Map #10).

SQL Query #2

To compare with my location, I chose an area in a statistically significant high crime cluster
that I had identified earlier in a separate analysis. A census block group in the North
Broadway neighborhood was picked at random, and the following SQL query was produced:

The results show that this census block group (1146.1), with a source ID of 1589, has a 2010
violent crime rate of 20 per 1,000 people. The Z Score of 4.64 is well beyond the 1.96
threshold for statistical significance, and the low P Value of .00000342 predictably matches the
high Z Score. These scores show that this portion of North Broadway is a significant hot spot
and far from a safe community in terms of violent crime.

For comparison’s sake, I put the values of the Lee-Miles census block group where I live and
the North Broadway census block group together.

Source ID | 2010 Violent Crime Count | GiZScore | GiPValue
1889 (Lee-Miles) | 8 | .44484 | .656434
1589 (North Broadway) | 20 | 4.6439 | .00000342
Difference | 12 | 4.19906 | -.65643058

The results show a big difference in each category in terms of how safe the block groups are
from violent crime. Perhaps the most striking is the difference in the Z Score: the North
Broadway block group sits at the extreme right of the values, while the Lee-Miles block group is
very close to the middle of the dataset. The map in the appendix shows the results from the SQL
query (see Map #11).
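The two attribute queries and the comparison can be reproduced outside of ArcGIS with Python’s built-in sqlite3 module. This is only a sketch: the table name hot_spots and its column names are hypothetical stand-ins for the hot spot attribute table, and the queries are not the statements that were entered in ArcGIS. The two rows use the values reported in this section.

```python
import sqlite3

# In-memory table standing in for the hot spot attribute table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hot_spots ("
    "source_id INTEGER, v2010 REAL, gi_z_score REAL, gi_p_value REAL)"
)
conn.executemany(
    "INSERT INTO hot_spots VALUES (?, ?, ?, ?)",
    [(1889, 8, 0.44484, 0.656434), (1589, 20, 4.6439, 0.00000342)],
)


def lookup(source_id):
    """Return (crime value, Z Score, P Value) for one block group."""
    return conn.execute(
        "SELECT v2010, gi_z_score, gi_p_value FROM hot_spots WHERE source_id = ?",
        (source_id,),
    ).fetchone()


lee_miles = lookup(1889)
north_broadway = lookup(1589)
# North Broadway minus Lee-Miles, column by column
difference = tuple(b - a for a, b in zip(lee_miles, north_broadway))
print(lee_miles, north_broadway, difference)
```

Using the source ID as the query parameter keeps the statement the same for any block group that needs to be compared.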

SQL Query- Cluster and Outlier Analysis

SQL Query #3

In developing the cluster and outlier analysis, I wanted to see how many census block groups
had high auto theft values within or near a low cluster of auto theft. The “HL” value of the
COType field identifies those block groups. The SQL query is shown below.

Out of all of the auto theft counts in Cleveland census block groups, the search pulled six
outliers from the clusters. These outliers have a higher than expected auto theft count compared
with their neighboring block groups. This could suggest that something is present in these block
groups that is absent from their surroundings, such as a gang or a drug cartel, although the
analysis alone cannot confirm the cause. Suggestions may then be made as to how these areas can
stop being outliers. The map in the appendix shows the results from the SQL query (see Map #12).

SQL Query #4

It is equally important to look at Cleveland census block groups that are outliers in a good
way. There are several areas where a block group with only a few auto theft incidents sits
within or near a cluster of high auto theft incidents. The SQL query below pulls these cases
from the database.

This is the end result of what the database pulled. When highlighted on a map, the majority of
these block groups are in or near the Downtown neighborhood and the neighborhoods surrounding
it. The map in the appendix shows the results from the SQL query (see Map #13).
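Both outlier selections follow the same pattern: filter the cluster and outlier attribute table on the COType field. The sketch below shows that pattern with Python’s sqlite3 module. The table name, column names, and rows are hypothetical placeholders, not the project’s data or the statements entered in ArcGIS.

```python
import sqlite3

# In-memory table standing in for the cluster and outlier attribute table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cluster_outlier (source_id INTEGER, neighborhood TEXT, co_type TEXT)"
)
conn.executemany(
    "INSERT INTO cluster_outlier VALUES (?, ?, ?)",
    [
        (1, "Neighborhood A", "HH"),
        (2, "Neighborhood A", "LH"),
        (3, "Neighborhood B", "LL"),
        (4, "Neighborhood B", "HL"),
        (5, "Neighborhood C", ""),  # not significant
        (6, "Neighborhood C", "LH"),
    ],
)


def select_by_type(co_type):
    """Return (source_id, neighborhood) rows with the given COType."""
    return conn.execute(
        "SELECT source_id, neighborhood FROM cluster_outlier "
        "WHERE co_type = ? ORDER BY source_id",
        (co_type,),
    ).fetchall()


high_outliers = select_by_type("HL")  # high counts near low clusters
low_outliers = select_by_type("LH")   # low counts near high clusters
print(high_outliers)
print(low_outliers)
```

Swapping the parameter to "HH" or "LL" would return the cluster members instead of the outliers.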

The language of statistical analysis may not be common among everyday individuals, but a
map that shows in great detail, by census block group, where different types of crime are
heavily concentrated is just as important, if not more so. The maps and the ModelBuilder work
flows that create the vector and raster maps simplify the process of determining where the
lightest or heaviest concentrations of crime are located in Cleveland. The SQL queries also
simplify the data by allowing the user to ask for only the information needed.

The tools of Spatial Statistics can benefit people who want to solve their crime problems.
At the very least, Spatial Statistics can give the everyday person a snapshot of what types of
crime are concentrated in their neighborhood. At best, Spatial Statistics can be a tool for
policymakers in the planning profession, as well as for public sector workers such as police
and firefighters, to highlight, brainstorm about, and seek to correct the problems of crime in
their cities, in their neighborhoods, and more specifically at the block group level, as the
maps presented in this project indicate. Spatial Statistics can be a beneficial tool for
reducing or even eradicating different types of crime in Cleveland neighborhoods in the near
future if it is put into the right hands. With a bit of training, education, and commitment to
the cause of reducing crime in Cleveland neighborhoods, the people of Cleveland can realize the
potential power of Spatial Statistics.

References

ArcGIS Resource Center- Desktop Help 10.0. Cluster and Outlier Analysis.
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/How_Cluster_and_Outlier_Analysis_Anselin_Local_Moran_s_I_works/005p00000012000000/

Crime in Cleveland by Year. City-Data. http://www.city-data.com/crime/crime-Cleveland-Ohio.html.

Eck, John E., Spencer Chainey, James G. Cameron, Michael Leitner, and Ronald E. Wilson.
2005. Mapping Crime: Understanding Hot Spots. Washington, DC: National Institute of
Justice.

Greenberg, Stephanie & William Rohe. 1984. Neighborhood Design & Crime. Journal of the
American Planning Association. 50:48-61.

Grubesic, Tony H., & Alan T. Murray. 2001. Detecting Hot Spots Using Cluster Analysis and
GIS. Proceedings from the Fifth Annual International Crime Mapping Research
Conference. Dallas, TX.

McShepard, Randall & Fran Stewart. October 2009. Rebuilding Blocks. Policy Bridge.
http://www.policybridge.org/uploaded_files/NeighborhoodReport_10_05_09_file_1255630039.pdf.

Scott, Lauren & Nathan Warmerdam. June 02, 2005. Extend Crime Analysis with ArcGIS
Spatial Statistics Tools. ArcUser Magazine.

Scott, Lauren & Nathan Warmerdam. 2008. Spatial Statistics for Public Health and Safety.
http://proceedings.esri.com/library/userconf/hss06/docs/spatial.pdf

Taylir, Josh. 2007. “Drug Arrests in High Gun Crime Locations of Dallas Texas: Methodology of
Cluster Analysis using GIS and SPSS.”
http://geography.unt.edu/~pdong/courses/4550/reports/Taylir_Josh_2007.pdf

Appendix 1- Attribute Tables

Table #1: Hot Spots Attribute Table Data

Table #2: Analyzing the Z Score and the P Value for Cleveland Neighborhoods

Appendix 2- Maps

Map #1: Extraction of North Broadway High Crime Rate Area from Cleveland Neighborhoods

Source: Cleveland City Planning Commission- Statistical Planning Area Map for North Broadway Neighborhood

Map #2: Approximation of Where the Selected Area of High Crime is Located in North Broadway

Map #3: The initial Cleveland Block Groups, symbolized on a classification scale showing where violent

crimes are located.

Map #4: The initial Cleveland Block Groups, symbolized on a classification scale showing where motor

vehicle thefts are located.

Map #5: Raster Version of the Hot Spots Map

Map #6- Hot Spots Map from ModelBuilder work flow

Map #7- Raster-based version of the Hot Spots Map from ModelBuilder work flow

Map #8- Cluster and Outlier Analysis Map created from the ModelBuilder Work Flow

Map #9- A Raster-Based Cluster and Outlier Analysis Map of Violent Crimes in Cleveland Neighborhoods

Map #10- Location of the Lee-Miles block group where I live (SQL Query #1)

Map #11- Location of a randomly chosen, statistically significant high-crime block in the North Broadway

neighborhood (SQL Query #2)

Map #12- Results of SQL Query #3

Map #13- Results of SQL Query #4

Appendix 3- Results

Result #1- Average Z-Score for Violent Crimes in Cleveland Neighborhoods

Result #2- Most Dangerous Cleveland Neighborhoods for Violent Crime According to Average Z-Score

Result #3- Least Dangerous Cleveland Neighborhoods for Violent Crime According to Average Z-Score

Result #4- HH Values of Auto Thefts for Cleveland Neighborhoods

Result #5- LL Values of Auto Thefts in Cleveland Neighborhoods

Result #6- HL Values of Auto Thefts in Cleveland Neighborhoods

Result #7- LH Values of Auto Thefts in Cleveland Neighborhoods

Appendix 4- Data Dictionary

Table: Violent Crime (Total violent crime includes four offenses: homicide, rape, robbery, and

aggravated assault).

Field Name    Data Type      Length   Description
ObjectID      Long Integer   -        Unique ID of Object
BlockGr       String         47       Census Block Group #
CntyNme       String         15       County Name
V2010         Double         -        2010 Violent Crime Count in a Census Block Group
StFid00_BG    String         15       ID for Ohio, Cuyahoga County, Cleveland, and specific block group

Table: Auto Thefts

Field Name    Data Type      Length   Description
BlockGr       String         47       Census Block Group #
CntyNme       String         15       County Name
MV2010        Double         -        2010 Auto Thefts Count in a Census Block Group
StFid00_BG    String         15       ID for Ohio, Cuyahoga County, Cleveland, and specific block group

Cleveland Block Groups (derived from Ohio Block Groups- bg39_d00.shp) (Geometry Type:

Polygon)

Field Name    Data Type                  Length   Description
FID           Object ID (Long Integer)   -        Unique ID of Block Group
Shape         Geometry                   -        Shape of Block Group
Area          Double                     -        Area of Block Group
Perimeter     Double                     -        Perimeter of Block Group
BG_39_D00_    Double Precision           11       Block Group Data
BG_39_D00_I   Double Precision           11       Block Group Data
State         String                     2        Census State #
County        String                     3        Census County #
Tract         String                     6        Census Tract #
BlkGroup      String                     1        Census Block Group #
Name          String                     90       Duplicate of Census Block Group #
LSAD          String                     2        Type of Boundary
LSAD_TRANS    String                     50
STFID00_BG    String                     50       ID for Ohio, Cuyahoga County, Cleveland, and specific block group

Cleveland_SPAs06_conflated_to_2010_blocks v1 (Geometry Type: Line)

Field Name    Data Type                  Length   Description
FID           Object ID (Long Integer)   -        Unique ID
Shape         Polygon                    -        Shape of Neighborhood Layer
Name          String                     100      Name of Cleveland Neighborhoods

When the Hot Spots Analysis is run, new fields are added to the attribute table of the former

Cleveland Block Groups layer. Below is an example of the hot spots analysis data for Violent

Crimes in Cleveland neighborhoods:



Field Name     Data Type      Length   Description
Shape          Geometry       -        Shape of Census Block Group
Shape_Length   Double         -        Length of the shape
Shape_Area     Double         -        Area of the shape
GiZScore       Float          -        Z score from Getis-Ord Gi*; indicates whether a measure of crime is statistically significant
GiPValue       Float          -        P value from Getis-Ord Gi*; the probability of seeing clustering this strong in a particular block group under spatial randomness
Source_ID      Long           -        ID of the Source polygon
V2010          Double         -        2010 Violent Crime Count in a Census Block Group

Note: The data is the same for the auto thefts hot spots analysis that was run with the exception

of the last variable (it is MV2010 instead of V2010).
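
As a rough illustration of how values like GiZScore and GiPValue can arise, the following Python sketch computes a Getis-Ord Gi* z-score and a two-sided p-value for a toy row of five block groups. The counts, the binary neighbor weights (each block group counts itself and its adjacent block groups), and the function name gi_star are assumptions made for illustration only. They are not the data or the spatial weights used in this paper, and the ArcGIS tool may differ in its details.

```python
import math


def gi_star(values, weights):
    """Return a list of (z_score, p_value) pairs, one per feature.

    values:  crime counts, one per block group (toy data here)
    weights: weights[i][j] is the spatial weight between features i and j;
             for Gi* the feature itself is included (weights[i][i] > 0)
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation of the counts (population form).
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    results = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        # Weighted local sum minus what the global mean would predict.
        numerator = sum(wij * xj for wij, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        z = numerator / denominator
        # Two-sided p-value under a standard normal distribution.
        p = math.erfc(abs(z) / math.sqrt(2))
        results.append((z, p))
    return results


# Hypothetical counts for five block groups arranged in a row.
counts = [2, 3, 4, 30, 35]
# Assumed binary weights: 1 for the feature itself and its immediate neighbors.
w = [[1 if abs(i - j) <= 1 else 0 for j in range(5)] for i in range(5)]

for count, (z, p) in zip(counts, gi_star(counts, w)):
    print(f"count={count:>2}  GiZScore={z:+.2f}  GiPValue={p:.3f}")
```

In this toy row, the two high-count block groups at the right end receive positive z-scores and the three low-count block groups receive negative ones, which is the contrast a hot spots map symbolizes.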

Similarly, when the Cluster and Outlier Analysis is run, new data is entered into the attribute

table. Below is an example of the cluster and outlier analysis data for Auto Thefts in Cleveland

neighborhoods:


Field Name            Data Type      Length   Description
ObjectID              Long Integer   -        Unique ID of the Object
Shape                 Geometry       -        Shape of Census Block Group
Source_ID             Long           -        ID of the Source Polygon
Shape_Length          Double         -        Length of the shape
Shape_Area            Double         -        Area of the shape
LmiIndex IDW 23048    Float          -        Local Moran's I index value, from which the Z Score and P Value are derived
LmiZScore IDW 23048   Float          -        Z score from Local Moran's I; indicates whether a measure of crime is statistically significant
LmiPValue IDW 23048   Float          -        P value from Local Moran's I; the probability of seeing a pattern this strong in a particular block group under spatial randomness
CoType IDW 23048      String         -        Classifies the clusters and the outliers, as well as data that is viewed as not statistically significant
MV2010                Double         -        2010 Motor Vehicle Thefts Count in a Census Block Group

Note: The data is the same for the violent crimes cluster and outlier analysis that was run with

the exception of the last variable (it is V2010 instead of MV2010).
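
To make the relationship between the index, the p-value, and the CoType field more concrete, the following Python sketch computes a local Moran's I index for a toy row of five block groups and then assigns HH, LL, HL, or LH labels. The counts, the binary neighbor weights, the single assumed p-value, the "Not significant" label, and the function names local_moran_i and cotype are all hypothetical and chosen for illustration. The formula shown is one common form of the statistic and is not necessarily identical to what the ArcGIS tool computes.

```python
def local_moran_i(values, weights):
    """Return the local Moran's I index for each feature.

    weights[i][j] is the spatial weight between features i and j;
    the feature itself is excluded (j != i).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    indices = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Variance-like term computed from every feature except i.
        s2 = sum(dev[j] ** 2 for j in others) / (n - 1)
        # Spatial lag: weighted sum of the neighbors' deviations.
        lag = sum(weights[i][j] * dev[j] for j in others)
        indices.append(dev[i] / s2 * lag)
    return indices


def cotype(value, mean, local_i, p_value, alpha=0.05):
    """Label a feature as a cluster (HH, LL) or an outlier (HL, LH)."""
    if p_value >= alpha:
        return "Not significant"  # label chosen for this sketch
    if local_i > 0:
        # Positive index: the feature resembles its neighbors (cluster).
        return "HH" if value > mean else "LL"
    # Negative index: the feature differs from its neighbors (outlier).
    return "HL" if value > mean else "LH"


# Hypothetical counts for five block groups arranged in a row.
counts = [2, 3, 4, 30, 35]
# Assumed binary weights: 1 for immediate neighbors, 0 otherwise (self excluded).
w = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]

mean_count = sum(counts) / len(counts)
indices = local_moran_i(counts, w)
# A single assumed p-value stands in for the LmiPValue field.
labels = [cotype(c, mean_count, li, 0.01) for c, li in zip(counts, indices)]
print(labels)
```

In this toy row, the middle block group has a low count but sits next to a high-count neighbor, so it is labeled LH, while the two high-count block groups at the right end are labeled HH.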
