Spatial Data Analysis

Why Geography is important.

What is spatial analysis?

• From Data to Information
  – beyond mapping: added value
  – transformations, manipulations and application of analytical methods to spatial (geographic) data
• Lack of locational invariance
  – analyses where the outcome changes when the locations of the objects under study change
    » median center, clusters, spatial autocorrelation
  – where matters
    » In an absolute sense (coordinates)
    » In a relative sense (spatial arrangement, distance)
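The idea of locational invariance can be illustrated with a small sketch. The coordinates and values below are made up for illustration, and an attribute-weighted mean center is used as a simpler stand-in for the median center named on the slide. The ordinary mean of the values ignores location, so it stays the same when the values are reassigned to different places, while the weighted center moves.

```python
import numpy as np

def weighted_mean_center(coords, values):
    """Attribute-weighted mean center of a set of points."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    return (coords * values[:, None]).sum(axis=0) / values.sum()

# Hypothetical locations (x, y) and attribute values.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [4.0, 4.0]])
values = np.array([1.0, 2.0, 3.0, 10.0])

# Same values, different locations: swap the first and last observation.
relocated = values[[3, 1, 2, 0]]

# Aspatial statistic: identical before and after the swap.
mean_original = values.mean()
mean_relocated = relocated.mean()

# Spatial statistic: changes when the locations of the values change.
center_original = weighted_mean_center(coords, values)
center_relocated = weighted_mean_center(coords, relocated)
```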

Components of Spatial Analysis

• Visualization
  – Showing interesting patterns
• Exploratory Spatial Data Analysis (ESDA)
  – Finding interesting patterns
• Spatial Modeling, Regression
  – Explaining interesting patterns

Implementation of Spatial Analysis

• Beyond GIS
  – Analytical functionality not part of typical commercial GIS
    » Analytical extensions
  – Exploration requires interactive approach
    » Training requirements
    » Software requirements
  – Spatial modeling requires specialized statistical methods
    » Explicit treatment of spatial autocorrelation
    » Space-time is not space + time
• ESDA and Spatial Econometrics

What Is Special About Spatial Data?

• Location, Location, Location
  – “where” matters
• Dependence is the rule
  – spatial interaction, contagion, externalities, spill-overs, copycatting
  – First Law of Geography (Tobler)
    » everything depends on everything else, but closer things more so
• Spatial heterogeneity
  – Lack of stationarity in first-order statistics
    » Pertains to the spatial or regional differentiation observed in the value of a variable
  – Spatial drift (e.g., a trend surface)
  – Spatial association
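Spatial drift in the form of a trend surface can be sketched as an ordinary least-squares fit of the variable on the coordinates. The example below assumes a first-order (planar) surface z = b0 + b1·x + b2·y and uses made-up, noise-free data so that the fitted coefficients can be checked by eye. Real data would call for a noise term and possibly higher-order terms.

```python
import numpy as np

def fit_trend_surface(coords, z):
    """Least-squares fit of z = b0 + b1*x + b2*y; returns (b0, b1, b2)."""
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(z, dtype=float)
    design = np.column_stack([np.ones(len(coords)), coords[:, 0], coords[:, 1]])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    return coef

# Hypothetical data: a 5 x 5 grid of locations with an exact planar drift.
xs, ys = np.meshgrid(np.arange(5.0), np.arange(5.0))
coords = np.column_stack([xs.ravel(), ys.ravel()])
z = 1.0 + 2.0 * coords[:, 0] - 3.0 * coords[:, 1]

b0, b1, b2 = fit_trend_surface(coords, z)

# What is left after removing the drift (zero here, because z is exactly planar).
residuals = z - (b0 + b1 * coords[:, 0] + b2 * coords[:, 1])
```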

Nature of Spatial Data

• Spatially referenced data (“georeferenced”)
  » “attribute” data associated with location
  » where matters
• Example: Spatial Objects
  – points: x, y coordinates
    » cities, stores, crimes, accidents
  – lines: arcs, from node, to node
    » road network, transmission lines
  – polygons: series of connected arcs
    » provinces, cities, census tracts

GIS Data Model

• Discretization of geographical reality necessitated by the nature of computing devices (Goodchild)
  – raster (grid) vs. vector (polygon)
  – field view (regions, segments) vs. object view (objects in a plane)
• Data model implies spatial sampling and spatial errors

3 Classes of Spatial Data

• Geostatistical Data
  – points as sample locations (“field” data as opposed to “objects”)
    » Continuous variation over space
• Lattice/Regional Data
  – polygons or points (centroids)
    » Discrete variation over space, observations associated with regular or irregular areal units
• Point Patterns
  – points on a map (occurrences of events at locations in space)
    » Observations of a variable are made at location X
    » Assumption that the spatial arrangement is directly related to the interaction between units of observation

Visualization and ESDA

• Objective
  – highlighting and detecting pattern
• Visualization
  – mapping spatial distributions
  – outlier detection
  – smoothing rates
• ESDA
  – dynamically linked windows
  – linking and brushing

Mapping patterns

http://www.cdc.gov/nchs/data/gis/atmapfh.pdf

ESDA

http://www.public.iastate.edu/~arcview-xgobi/

Spatial Process

• Spatial Random Field
  – { Z(s): s ∈ D }
    » s ∈ R^d: generic data location (vector of coordinates)
    » D ⊂ R^d: index set (subset of potential locations)
    » Z(s): random variable at s, with realization z(s)
  – Examples
    » s are x, y coordinates of house sales, Z is sales price at s
    » s are counties, Z is crime rate in s

Point Pattern Analysis

• Objective
  – assessing spatial randomness
• Interest in location itself
  – complete spatial randomness
  – clustering, dispersion
• Distance-based statistics
  – nearest neighbors
  – number of events within given radius
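The two distance-based statistics listed above can be sketched in a few lines. The nearest-neighbor part uses the Clark-Evans ratio (observed mean nearest-neighbor distance divided by its expectation under complete spatial randomness, 1 / (2·sqrt(n / area))), which is one common choice and not necessarily the statistic the slide had in mind. Edge effects are ignored, and the point sets are made up for illustration.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance matrix, with the diagonal set to infinity."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)
    return dist

def clark_evans_ratio(points, area):
    """Ratio < 1 suggests clustering, > 1 dispersion (no edge correction)."""
    dist = pairwise_distances(points)
    observed = dist.min(axis=1).mean()
    expected = 0.5 / np.sqrt(len(dist) / area)
    return observed / expected

def counts_within_radius(points, radius):
    """Number of other events within `radius` of each event."""
    return (pairwise_distances(points) <= radius).sum(axis=1)

# Hypothetical patterns in a 5 x 5 study area.
gx, gy = np.meshgrid(np.arange(5) + 0.5, np.arange(5) + 0.5)
regular = np.column_stack([gx.ravel(), gy.ravel()])   # dispersed: unit grid
clustered = 2.5 + 0.04 * (regular - 2.5)              # same shape, shrunk to a clump

ratio_regular = clark_evans_ratio(regular, area=25.0)
ratio_clustered = clark_evans_ratio(clustered, area=25.0)
neighbor_counts = counts_within_radius(regular, radius=1.0)
```

Counting events within a radius for a range of radii is the building block of Ripley's K function, which this sketch does not compute.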

Point Patterns

• Spatial process
  – index set D is point process, s is random
• Data
  – mapped pattern
    » examples: location of disease, gang shootings
• Research question
  – interest focuses on detecting absence of spatial randomness (cluster statistics)
  – clustered points vs dispersed points

Geostatistical Data

• Spatial Process
  – index set D is fixed subset of R^d (continuous)
• Data
  – sample points from underlying continuous surface
    » examples: mining, air quality, house sales price
• Research Question
  – interest focuses on modeling continuous spatial variation
  – spatial interpolation (kriging)

Variogram Modeling (Geostatistics)

• Objective
  – modeling continuous variation across space
• Variogram
  – estimating how spatial dependence varies with distance
  – modeling distance decay
• Kriging
  – optimal spatial prediction
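The variogram step can be sketched with the classical (method-of-moments) empirical semivariogram, γ(h) = (1 / 2N(h)) Σ (z_i − z_j)², where the sum runs over the N(h) pairs of sample points whose separation falls in distance bin h. The estimator choice, the bin edges and the transect data are assumptions for illustration. Fitting a variogram model and kriging are not shown.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Return (gamma, pair_counts) for the distance bins (edge[b], edge[b+1]]."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # each pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2

    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)                   # NaN marks an empty bin
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        in_bin = (dist > bin_edges[b]) & (dist <= bin_edges[b + 1])
        counts[b] = in_bin.sum()
        if counts[b] > 0:
            gamma[b] = 0.5 * sq_diff[in_bin].mean()
    return gamma, counts

# Hypothetical transect: 10 sample points one unit apart, values rising by 1 per unit.
coords = np.column_stack([np.arange(10.0), np.zeros(10)])
values = np.arange(10.0)

gamma, counts = empirical_semivariogram(coords, values, bin_edges=[0.5, 1.5, 2.5, 3.5])
```

For this deterministic transect every pair at lag h differs by exactly h, so gamma equals h²/2 per bin. With real data the curve would typically level off at a sill, which is the distance decay a variogram model tries to capture.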

Lattice or Regional Data

• Spatial process
  – index set D is fixed collection of countably many points in R^d
  – finite, discrete spatial units
• Data
  – fixed points or discrete locations (regions)
    » examples: county tax rates, state unemployment
• Research question
  – interest focuses on statistical inference
  – estimation, specification tests

Spatial Autocorrelation

• Objective
  – hypothesis test on spatial randomness of attributes = value and location
• Global and local autocorrelation statistics: Moran’s I, Geary’s c, G(d), LISA
• Visualization of spatial autocorrelation
  – Moran scatterplot
  – LISA maps
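As a rough sketch of a global statistic, Moran's I can be computed as I = (n / S0) · (z'Wz) / (z'z), where z holds deviations from the mean and S0 is the sum of all weights. The example assumes binary rook-contiguity weights on a small regular grid and a one-sided permutation test. The helper names, the grid and the two test patterns are illustrative, not taken from the slides.

```python
import numpy as np

def rook_weights(nrows, ncols):
    """Binary contiguity matrix: cells sharing an edge are neighbors."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:                 # neighbor below
                W[i, i + ncols] = W[i + ncols, i] = 1.0
            if c + 1 < ncols:                 # neighbor to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
    return W

def morans_i(x, W):
    z = np.asarray(x, dtype=float)
    z = z - z.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

def permutation_p_value(x, W, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, W)
    sims = np.array([morans_i(rng.permutation(x), W) for _ in range(n_perm)])
    return (np.sum(sims >= observed) + 1) / (n_perm + 1)

W = rook_weights(4, 4)
rows, cols = np.divmod(np.arange(16), 4)

checkerboard = (rows + cols) % 2      # neighbors always differ
gradient = cols.astype(float)         # values rise smoothly from west to east

i_checkerboard = morans_i(checkerboard, W)
i_gradient = morans_i(gradient, W)
p_gradient = permutation_p_value(gradient, W)
```

The checkerboard gives the extreme negative case and the gradient a clearly positive value. Shuffling values across locations is what makes this a test on value and location jointly.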

Spatial process models

• How is the spatial association generated?
  – Spatial autoregressive process (SAR)
    » Y = ρWY + ε
  – Spatial moving average process (SMA)
    » Y = (I + ρW) ε
  – ε: vector of independent errors
  – W: distance weights matrix
  – In SAR, correlation is fairly persistent with increasing distance, whereas with SMA it decays to zero fairly quickly.
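The contrast between the two processes can be checked numerically. With unit-variance independent errors, the SAR process has covariance (I − ρW)⁻¹(I − ρW)⁻¹' and the SMA process has covariance (I + ρW)(I + ρW)'. The sketch below assumes locations on a line, row-standardized adjacency weights in place of a general distance-based W, and ρ = 0.8. All of these are illustrative choices.

```python
import numpy as np

def line_weights(n):
    """Row-standardized weights for n locations on a line (adjacent = neighbor)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def sar_covariance(W, rho):
    a_inv = np.linalg.inv(np.eye(len(W)) - rho * W)
    return a_inv @ a_inv.T

def sma_covariance(W, rho):
    b = np.eye(len(W)) + rho * W
    return b @ b.T

def correlations_with(cov, i):
    """Correlation between location i and every location."""
    sd = np.sqrt(np.diag(cov))
    return cov[i] / (sd[i] * sd)

n, rho, center = 21, 0.8, 10
W = line_weights(n)

sar_corr = correlations_with(sar_covariance(W, rho), center)
sma_corr = correlations_with(sma_covariance(W, rho), center)

# One realization of each process from the same error vector.
eps = np.random.default_rng(0).normal(size=n)
y_sar = np.linalg.solve(np.eye(n) - rho * W, eps)   # solves Y = rho*W*Y + eps
y_sma = (np.eye(n) + rho * W) @ eps                 # Y = (I + rho*W) eps
```

In this setup the SMA correlation is exactly zero beyond second-order neighbors (for example `sma_corr[center + 3]`), while the SAR correlation at the same lag is still clearly positive.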

• Spatial process—the rule governing the trajectory of the system as a chain of changes in state.

• Spatial pattern—the map of a single realization of the underlying spatial process (the data available for analysis).

• Say you conduct a regression analysis. If the residuals do not display spatial autocorrelation, then there is no need to add “space” to the model. Examine spatial autocorrelation in the residuals using Moran’s I, Geary’s c, or G(d).
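The residual check can be sketched as follows: fit OLS, then compute Moran's I on the residuals. The data are simulated on a line with row-standardized adjacency weights, once with independent errors and once with SAR errors (ρ = 0.8). All of these are assumptions for illustration. The permutation p-value treats residuals as exchangeable, which is only an approximation for regression residuals.

```python
import numpy as np

def line_weights(n):
    """Row-standardized weights for n locations on a line."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def morans_i(values, W):
    z = values - values.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

def ols_residuals(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def pseudo_p_value(values, W, n_perm=499, seed=1):
    """One-sided permutation p-value for positive autocorrelation."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, W)
    sims = np.array([morans_i(rng.permutation(values), W) for _ in range(n_perm)])
    return (np.sum(sims >= observed) + 1) / (n_perm + 1)

rng = np.random.default_rng(42)
n = 100
W = line_weights(n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])
eps = rng.normal(size=n)

y_independent = X @ beta + eps
y_spatial = X @ beta + np.linalg.solve(np.eye(n) - 0.8 * W, eps)  # SAR errors

i_independent = morans_i(ols_residuals(X, y_independent), W)
i_spatial = morans_i(ols_residuals(X, y_spatial), W)
p_spatial = pseudo_p_value(ols_residuals(X, y_spatial), W)
```

A small p-value for the second data set would point toward adding a spatial component to the model. A value of I near zero for the first would not.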

Perspectives on spatial process models

• Finding out how the variable Y relates to its value in surrounding locations (the spatial lag) while controlling for the influence of other explanatory variables.

• Estimating the relation between the explanatory variables X and the dependent variable after the spatial effect has been controlled for (this is referred to as spatial filtering or spatial screening).

• The expected value of the dependent variable at each location is a function not only of explanatory variables at that location, but of the explanatory variables at all other locations as well.
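The last perspective can be made concrete under the assumption of a spatial lag model with covariates, Y = ρWY + Xβ + ε, whose reduced form gives E[Y] = (I − ρW)⁻¹Xβ. The slides state the SAR process without X, so this model form is an assumption. In the sketch below (hypothetical locations on a line, made-up β and ρ), raising x at a single location shifts the expected value at every location, which does not happen when ρ = 0.

```python
import numpy as np

def line_weights(n):
    """Row-standardized weights for n locations on a line."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def expected_y(X, beta, W, rho):
    """E[Y] = (I - rho*W)^(-1) X beta under the assumed spatial lag model."""
    return np.linalg.solve(np.eye(len(W)) - rho * W, X @ beta)

n = 7
W = line_weights(n)
beta = np.array([1.0, 2.0])                        # intercept and slope (made up)
X = np.column_stack([np.ones(n), np.zeros(n)])

# Raise the explanatory variable by one unit at the middle location only.
X_changed = X.copy()
X_changed[3, 1] += 1.0

effect_with_lag = expected_y(X_changed, beta, W, 0.5) - expected_y(X, beta, W, 0.5)
effect_no_lag = expected_y(X_changed, beta, W, 0.0) - expected_y(X, beta, W, 0.0)
```

With ρ = 0.5 every entry of `effect_with_lag` is positive, and the effect at the changed location exceeds β = 2 because of feedback through the neighbors.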
