Spatial Data Analysis
Post on 16-Jan-2016
Why Geography is important.
What is spatial analysis?
• From Data to Information
  – beyond mapping: added value
  – transformations, manipulations and application of analytical methods to spatial (geographic) data
• Lack of locational invariance
  – analyses where the outcome changes when the locations of the objects under study change
    » median center, clusters, spatial autocorrelation
  – where matters
    • In an absolute sense (coordinates)
    • In a relative sense (spatial arrangement, distance)
Components of Spatial Analysis
• Visualization
  – Showing interesting patterns
• Exploratory Spatial Data Analysis (ESDA)
  – Finding interesting patterns
• Spatial Modeling, Regression
  – Explaining interesting patterns
Implementation of Spatial Analysis
• Beyond GIS
  – Analytical functionality not part of typical commercial GIS
    » Analytical extensions
  – Exploration requires interactive approach
    » Training requirements
    » Software requirements
  – Spatial modeling requires specialized statistical methods
    » Explicit treatment of spatial autocorrelation
    » Space-time is not space + time
• ESDA and Spatial Econometrics
What Is Special About Spatial Data?
• Location, Location, Location
  – “where” matters
• Dependence is the rule
  – spatial interaction, contagion, externalities, spill-overs, copycatting
  – First Law of Geography (Tobler)
    • everything depends on everything else, but closer things more so
• Spatial heterogeneity
  – Lack of stationarity in first-order statistics
    • Pertains to the spatial or regional differentiation observed in the value of a variable
  – Spatial drift (e.g., a trend surface)
  – Spatial association
Nature of Spatial Data
• Spatially referenced data (“georeferenced”)
  » “attribute” data associated with location
  » where matters
• Example: Spatial Objects
  – points: x, y coordinates
    » cities, stores, crimes, accidents
  – lines: arcs, from node, to node
    » road network, transmission lines
  – polygons: series of connected arcs
    » provinces, cities, census tracts
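The three object types above can be sketched as small data structures. This is only an illustration: the class names `Point`, `Arc` and `Polygon` are hypothetical, and the polygon is simplified to a single ring of vertices rather than a series of connected arcs.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Point:
    """A point object: x, y coordinates (e.g., a store or a crime location)."""
    x: float
    y: float


@dataclass(frozen=True)
class Arc:
    """A line object: an ordered chain of vertices between two nodes."""
    vertices: tuple  # tuple of Point; first is the from-node, last the to-node

    @property
    def from_node(self):
        return self.vertices[0]

    @property
    def to_node(self):
        return self.vertices[-1]

    def length(self):
        # Sum of straight-line segment lengths along the chain.
        pairs = zip(self.vertices, self.vertices[1:])
        return sum(hypot(b.x - a.x, b.y - a.y) for a, b in pairs)


@dataclass(frozen=True)
class Polygon:
    """An areal object: a closed ring of vertices (first vertex not repeated)."""
    ring: tuple  # tuple of Point

    def area(self):
        # Shoelace formula for a simple (non-self-intersecting) ring.
        n = len(self.ring)
        total = 0.0
        for i in range(n):
            a, b = self.ring[i], self.ring[(i + 1) % n]
            total += a.x * b.y - b.x * a.y
        return abs(total) / 2.0


road = Arc((Point(0.0, 0.0), Point(3.0, 4.0)))
tract = Polygon((Point(0, 0), Point(2, 0), Point(2, 2), Point(0, 2)))
print(road.length(), tract.area())
```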
GIS Data Model
• Discretization of geographical reality necessitated by the nature of computing devices (Goodchild)
  – raster (grid) vs. vector (polygon)
  – field view (regions, segments) vs. object view (objects in a plane)
• Data model implies spatial sampling and spatial errors
3 Classes of Spatial Data
• Geostatistical Data
  – points as sample locations (“field” data as opposed to “objects”)
    • Continuous variation over space
• Lattice/Regional Data
  – polygons or points (centroids)
    • Discrete variation over space, observations associated with regular or irregular areal units
• Point Patterns
  – points on a map (occurrences of events at locations in space)
    • Observations of a variable are made at location X
    • Assumption that the spatial arrangement is directly related to the interaction between units of observation
Visualization and ESDA
• Objective
  – highlighting and detecting pattern
• Visualization
  – mapping spatial distributions
  – outlier detection
  – smoothing rates
• ESDA
  – dynamically linked windows
  – linking and brushing
Mapping patterns
http://www.cdc.gov/nchs/data/gis/atmapfh.pdf
ESDA
http://www.public.iastate.edu/~arcview-xgobi/
Spatial Process
• Spatial Random Field
  – { Z(s): s ∈ D }
    » s ∈ R^d: generic data location (vector of coordinates)
    » D ⊂ R^d: index set (subset of potential locations)
    » Z(s): random variable at s, with realization z(s)
  – Examples
    • s are x, y coordinates of house sales, Z sales price at s
    • s are counties, Z is crime rate in s
Point Pattern Analysis
• Objective
  – assessing spatial randomness
• Interest in location itself
  – complete spatial randomness
  – clustering, dispersion
• Distance-based statistics
  – nearest neighbors
  – number of events within given radius
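The two distance-based statistics listed above can be sketched in a few lines. The first function compares the observed mean nearest-neighbor distance with its expectation under complete spatial randomness, 1 / (2 * sqrt(n / area)), which is the Clark-Evans ratio; that choice of statistic is an assumption, since the slides do not name one. No edge correction is applied, and the function names and sample coordinates are hypothetical.

```python
from math import hypot, sqrt


def nearest_neighbor_distances(points):
    """Distance from each event to its closest other event."""
    result = []
    for i, (xi, yi) in enumerate(points):
        result.append(min(hypot(xi - xj, yi - yj)
                          for j, (xj, yj) in enumerate(points) if j != i))
    return result


def clark_evans_ratio(points, area):
    """Observed mean nearest-neighbor distance over its expectation under CSR.

    Values below 1 suggest clustering, values above 1 suggest dispersion.
    """
    n = len(points)
    observed = sum(nearest_neighbor_distances(points)) / n
    expected = 1.0 / (2.0 * sqrt(n / area))
    return observed / expected


def counts_within_radius(points, radius):
    """Number of other events within `radius` of each event."""
    counts = []
    for i, (xi, yi) in enumerate(points):
        counts.append(sum(1 for j, (xj, yj) in enumerate(points)
                          if j != i and hypot(xi - xj, yi - yj) <= radius))
    return counts


# Two made-up patterns in a 10 x 10 study area.
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
dispersed = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]
print(clark_evans_ratio(clustered, area=100.0))
print(clark_evans_ratio(dispersed, area=100.0))
print(counts_within_radius(clustered, radius=0.12))
```

In practice the ratio would be compared against its sampling distribution under randomness before calling a pattern clustered or dispersed.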
Point Patterns
• Spatial process
  – index set D is point process, s is random
• Data
  – mapped pattern
    » examples: location of disease, gang shootings
• Research question
  – interest focuses on detecting absence of spatial randomness (cluster statistics)
  – clustered points vs dispersed points
Geostatistical Data
• Spatial Process
  – index set D is fixed subset of R^d (continuous)
• Data
  – sample points from underlying continuous surface
    » examples: mining, air quality, house sales price
• Research Question
  – interest focuses on modeling continuous spatial variation
  – spatial interpolation (kriging)
Variogram Modeling (Geostatistics)
• Objective
  – modeling continuous variation across space
• Variogram
  – estimating how spatial dependence varies with distance
  – modeling distance decay
• Kriging
  – optimal spatial prediction
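As a rough illustration of how the variogram captures dependence as a function of distance, the sketch below computes an empirical semivariogram with the classical (Matheron) estimator: half the average squared difference between values, grouped by the distance separating each pair. The function name, the binning scheme and the transect data are assumptions made for the example; fitting a variogram model and kriging are not shown.

```python
from math import hypot


def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Return (bin center, semivariance, pair count) for each non-empty bin.

    Pairs are assigned to distance bins [k * bin_width, (k + 1) * bin_width).
    Pairs farther apart than n_bins * bin_width are ignored.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            b = int(d // bin_width)
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    result = []
    for b in range(n_bins):
        if counts[b]:
            center = (b + 0.5) * bin_width
            result.append((center, sums[b] / (2.0 * counts[b]), counts[b]))
    return result


# Made-up samples along a transect with a steady trend in the measured value.
coords = [(float(x), 0.0) for x in range(5)]
values = [0.0, 1.0, 2.0, 3.0, 4.0]
for center, gamma, pairs in empirical_semivariogram(coords, values, 1.0, 3):
    print(center, gamma, pairs)
```

A semivariance that rises with distance, as it does for this transect, is the distance-decay pattern that a fitted variogram model would then summarize.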
Lattice or Regional Data
• Spatial process
  – index set D is fixed collection of countably many points in R^d
  – finite, discrete spatial units
• Data
  – fixed points or discrete locations (regions)
    » examples: county tax rates, state unemployment
• Research question
  – interest focuses on statistical inference
  – estimation, specification tests
Spatial Autocorrelation
• Objective
  – hypothesis test on spatial randomness of attributes = value and location
• Global and local autocorrelation statistics: Moran’s I, Geary’s c, G(d), LISA
• Visualization of spatial autocorrelation
  – Moran scatterplot
  – LISA maps
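A minimal sketch of the global Moran's I statistic with a permutation test of spatial randomness follows. It assumes a simple binary contiguity matrix for units arranged in a chain; the helper names (`chain_weights`, `morans_i`, `permutation_p_value`) and the data are hypothetical, and the p-value is one-sided (positive autocorrelation only).

```python
import random


def chain_weights(n):
    """Binary contiguity weights for n units in a row (neighbors = adjacent units)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in z)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """Shuffle values over locations and count permutations at least as extreme."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


# Similar values sit next to each other, so I should be clearly positive.
trend = list(range(1, 13))
observed, p_value = permutation_p_value(trend, chain_weights(12))
print(observed, p_value)
```

The permutation step is what ties "value" to "location": the values are held fixed and only their assignment to locations is randomized.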
Spatial process models
• How is the spatial association generated?
  – Spatial autoregressive process (SAR)
    • Y = ρWY + ε
  – Spatial moving average process (SMA)
    • Y = (I + ρW) ε
  – ε: vector of independent errors
  – W: distance weights matrix
  – In SAR, correlation is fairly persistent with increasing distance, whereas with SMA it decays to zero fairly quickly.
• Spatial process—the rule governing the trajectory of the system as a chain of changes in state.
• Spatial pattern—the map of a single realization of the underlying spatial process (the data available for analysis).
• Say you conduct a regression analysis. If the residuals do not display spatial autocorrelation, there is no need to add “space” to the model. Test the residuals for spatial autocorrelation using Moran’s I, Geary’s c, or G(d).
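The contrast between SAR and SMA can be checked numerically from the covariance each process implies. With independent unit-variance errors, Y = Mε has covariance MM', where M = (I − ρW)^(-1) for SAR and M = I + ρW for SMA. The sketch below assumes 15 units in a row with row-standardized contiguity weights (a stand-in for the distance weights matrix) and ρ = 0.8; all function names are hypothetical.

```python
from math import sqrt


def chain_weights(n):
    """Row-standardized contiguity weights for n units in a row."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            w[i][j] = 1.0 / len(nbrs)
    return w


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse(a):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def implied_correlations(kind, rho, n=15, max_lag=4):
    """Correlation between the middle unit and units 0..max_lag steps away."""
    w = chain_weights(n)
    sign = -1.0 if kind == "SAR" else 1.0
    m = [[(1.0 if i == j else 0.0) + sign * rho * w[i][j] for j in range(n)]
         for i in range(n)]
    if kind == "SAR":
        m = inverse(m)  # Y = (I - rho W)^(-1) eps
    cov = matmul(m, [list(col) for col in zip(*m)])  # M M' for unit-variance eps
    c = n // 2
    return [cov[c][c + lag] / sqrt(cov[c][c] * cov[c + lag][c + lag])
            for lag in range(max_lag + 1)]


sar = implied_correlations("SAR", 0.8)
sma = implied_correlations("SMA", 0.8)
for lag in range(5):
    print(lag, round(sar[lag], 3), round(sma[lag], 3))
```

Under these assumptions the SMA correlation is exactly zero once two units share no neighbor (lag 3 and beyond on this chain), while the SAR correlation stays positive at every lag.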
Perspectives on spatial process models
• Finding out how the variable Y relates to its value in surrounding locations (the spatial lag) while controlling for the influence of other explanatory variables.
• When the interest is in the relation between the explanatory variables X and the dependent variable, after the spatial effect has been controlled for (this is referred to as spatial filtering or spatial screening).
• The expected value of the dependent variable at each location is a function not only of explanatory variables at that location, but of the explanatory variables at all other locations as well.
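The last point follows from the reduced form of a spatial lag model. The derivation below assumes the SAR process extended with explanatory variables Xβ (the slides give only Y = ρWY + ε), and that I − ρW is invertible.

```latex
\begin{aligned}
Y &= \rho W Y + X\beta + \varepsilon \\
(I - \rho W)\,Y &= X\beta + \varepsilon \\
Y &= (I - \rho W)^{-1}\,(X\beta + \varepsilon) \\
E[Y \mid X] &= (I - \rho W)^{-1} X\beta
             = \left(I + \rho W + \rho^{2} W^{2} + \cdots\right) X\beta
\end{aligned}
```

The series expansion holds when the spectral radius of ρW is below 1, for example |ρ| < 1 with a row-standardized W. The terms ρW, ρ²W², and so on carry the explanatory variables of neighbors, neighbors of neighbors, and beyond into the expected value at each location.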