www.spatialanalysisonline.com Chapter 5 Part B: Spatial Autocorrelation and regression modelling
Mar 26, 2015
3rd edition www.spatialanalysisonline.com 2
Autocorrelation
Time series correlation model: $\{x_{t,1}\}$, $t = 1, 2, 3, \dots, n-1$ and $\{x_{t,2}\}$, $t = 2, 3, 4, \dots, n$
Spatial Autocorrelation
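Correlation coefficient, $\{x_i\}$, $i = 1, 2, 3, \dots, n$ and $\{y_i\}$, $i = 1, 2, 3, \dots, n$:
$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2\,\sum_{i=1}^{n}(y_i-\bar{y})^2}}$$
Time series correlation model, $\{x_{t,1}\}$, $t = 1, 2, 3, \dots, n-1$ and $\{x_{t,2}\}$, $t = 2, 3, 4, \dots, n$
Mean values:
$$\bar{x}_{.1} = \frac{1}{n-1}\sum_{t=1}^{n-1} x_t, \qquad \bar{x}_{.2} = \frac{1}{n-1}\sum_{t=2}^{n} x_t$$
Lag 1 autocorrelation (large n):
$$\bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t, \qquad r_{.1} = \frac{\sum_{t=1}^{n-1}(x_t-\bar{x})(x_{t+1}-\bar{x})}{\sum_{t=1}^{n}(x_t-\bar{x})^2}$$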
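The large-n lag 1 autocorrelation can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the two sample series are assumptions, not part of the original slides.

```python
def lag1_autocorrelation(x):
    """Large-n lag-1 autocorrelation: sum of products of deviations one step
    apart, divided by the total sum of squared deviations (one overall mean)."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

trend = [1, 2, 3, 4, 5, 6, 7, 8]        # smooth series: positive autocorrelation
zigzag = [1, -1, 1, -1, 1, -1, 1, -1]   # alternating series: negative autocorrelation

print(lag1_autocorrelation(trend))
print(lag1_autocorrelation(zigzag))
```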
Spatial Autocorrelation
Classical statistical model assumptions
Independence vs dependence in time and space
Tobler’s first law:
  “All things are related, but nearby things are more related than distant things”
Spatial dependence and autocorrelation
Correlation and Correlograms
Spatial Autocorrelation
Covariance and autocovariance
Lags – fixed or variable interval
Correlograms and range
Stationary and non-stationary patterns
Outliers
Extending concept to spatial domain
  Transects
  Neighbourhoods and distance-based models
Spatial Autocorrelation
Global spatial autocorrelation
  Dataset issues: regular grids; irregular lattice (zonal) datasets; point samples
  Simple binary coded regular grids – use of Joins counts
  Irregular grids and lattices – extension to x,y,z data representation
  Use of x,y,z model for point datasets
Local spatial autocorrelation
  Disaggregating global models
Spatial Autocorrelation
Joins counts (50% 1’s)
A. Completely separated pattern (+ve)
B. Evenly spaced pattern (-ve)
C. Random pattern
Spatial Autocorrelation
Joins count
Binary coding
Edge effects
Double counting
Free vs non-free sampling
Expected values (free sampling)
  1-1 = 15/60, 0-0 = 15/60, 0-1 or 1-0 = 30/60
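A minimal sketch of a joins count follows. It assumes a 6x6 grid (which has 60 rook's-move joins, matching the denominators above), p = 0.5, and a checkerboard as the "evenly spaced" pattern. The function names and test grids are illustrative assumptions.

```python
def join_counts(grid):
    """Count 1-1, 0-0 and 0-1 joins under rook's-move adjacency.
    Only right and down neighbours are visited, so no join is double counted."""
    rows, cols = len(grid), len(grid[0])
    counts = {"1-1": 0, "0-0": 0, "0-1": 0}
    for r in range(rows):
        for c in range(cols):
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < rows and cc < cols:
                    a, b = grid[r][c], grid[rr][cc]
                    if a != b:
                        counts["0-1"] += 1
                    elif a == 1:
                        counts["1-1"] += 1
                    else:
                        counts["0-0"] += 1
    return counts

def expected_free_sampling(n_joins, p):
    """Expected counts when each cell is independently coded 1 with probability p."""
    return {"1-1": n_joins * p * p,
            "0-0": n_joins * (1 - p) ** 2,
            "0-1": 2 * n_joins * p * (1 - p)}

separated = [[1, 1, 1, 0, 0, 0] for _ in range(6)]                  # like pattern A
checkerboard = [[(r + c) % 2 for c in range(6)] for r in range(6)]  # like pattern B

print(join_counts(separated))
print(join_counts(checkerboard))
print(expected_free_sampling(60, 0.5))
```

Comparing the observed counts with the expected values is the basis of the z-scores used in the joins count test.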
Spatial Autocorrelation
Joins counts
A. Completely separated (+ve)
B. Evenly spaced (-ve)
C. Random
Spatial Autocorrelation
Joins count – some issues
Multiple z-scores
Binary or k-class data
Rook’s move vs other moves
First order lag vs higher orders
Equal vs unequal weights
Regular grids vs other datasets
Global vs local statistics
Sensitivity to model components
Spatial Autocorrelation
Irregular lattice – (x,y,z) and adjacency tables
Cell data (empty cells shown as "."):
      .   +4.55  +5.54
  +2.24   -5.15  +9.02
  +3.10   -4.39  -2.09
      .   +0.46  -3.06

Cell coordinates (row/col):
  1,1  1,2  1,3
  2,1  2,2  2,3
  3,1  3,2  3,3
  4,1  4,2  4,3

Cell numbering:
  .  3   7
  1  4   8
  2  5   9
  .  6  10

x,y,z view:
  x  y  z
  1  2  +4.55
  1  3  +5.54
  2  1  +2.24
  2  2  -5.15
  2  3  +9.02
  3  1  +3.10
  3  2  -4.39
  3  3  -2.09
  4  2  +0.46
  4  3  -3.06

Adjacency matrix, total 1’s=26
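The adjacency matrix can be rebuilt from the x,y,z view. The sketch below assumes rook's-move adjacency (shared edges only), which reproduces the total of 26 ones quoted above. The function name is an illustrative assumption.

```python
# (row, col, z) for the ten occupied cells, listed in cell-number order 1..10
cells = [
    (2, 1, 2.24), (3, 1, 3.10), (1, 2, 4.55), (2, 2, -5.15), (3, 2, -4.39),
    (4, 2, 0.46), (1, 3, 5.54), (2, 3, 9.02), (3, 3, -2.09), (4, 3, -3.06),
]

def rook_adjacency(cells):
    """Binary weights: w_ij = 1 when cells i and j share an edge, else 0."""
    n = len(cells)
    W = [[0] * n for _ in range(n)]
    for i, (ri, ci, _) in enumerate(cells):
        for j, (rj, cj, _) in enumerate(cells):
            if abs(ri - rj) + abs(ci - cj) == 1:
                W[i][j] = 1
    return W

W = rook_adjacency(cells)
total = sum(map(sum, W))   # each of the 13 joins is counted twice (i-j and j-i)
print(total)
```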
Spatial Autocorrelation
“Spatial” (auto)correlation coefficient
  Coordinate (x,y,z) data representation for cells
  Spatial weights matrix (binary or other), W={wij}
  From last slide: Σ wij=26
Coefficient formulation – desirable properties
  Reflects co-variation patterns
  Reflects adjacency patterns via weights matrix
  Normalised for absolute cell values
  Normalised for data variation
  Adjusts for number of included cells in totals
Spatial Autocorrelation
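Moran’s I
$$I = \frac{1}{p}\,\frac{\sum_i\sum_j w_{ij}(z_i-\bar{z})(z_j-\bar{z})}{\sum_i (z_i-\bar{z})^2}, \quad\text{where } p = \sum_i\sum_j w_{ij}/n, \text{ hence } p = 26/10 \text{ for our 10 cell example}$$
TSA model
$$r_{.1} = \frac{\sum_t (x_t-\bar{x})(x_{t+1}-\bar{x})}{\sum_t (x_t-\bar{x})^2}$$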
Spatial Autocorrelation
A. Computation of variance/covariance-like quantities, matrix C
B. C*W: Adjustment by multiplication of the weighting matrix, W
Moran I = 10*16.19/(26*196.68) = 0.0317 ≈ 0
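The worked value can be checked with a short sketch. It uses the ten cell values in cell-number order and the 13 rook's-move joins of the lattice example. The helper names are illustrative assumptions.

```python
z = [2.24, 3.10, 4.55, -5.15, -4.39, 0.46, 5.54, 9.02, -2.09, -3.06]  # cells 1..10
joins = [(1, 2), (3, 4), (4, 5), (5, 6), (7, 8), (8, 9), (9, 10),   # within columns
         (3, 7), (1, 4), (4, 8), (2, 5), (5, 9), (6, 10)]           # within rows

def weights_from_joins(joins, n):
    """Symmetric binary weights matrix from a list of 1-based cell pairs."""
    W = [[0] * n for _ in range(n)]
    for a, b in joins:
        W[a - 1][b - 1] = W[b - 1][a - 1] = 1
    return W

def morans_i(z, W):
    """Moran's I = (1/p) * sum_ij w_ij d_i d_j / sum_i d_i^2, with p = sum(w)/n."""
    n = len(z)
    mean = sum(z) / n
    d = [v - mean for v in z]                      # deviations from the mean
    cross = sum(W[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
    ss = sum(v * v for v in d)
    p = sum(map(sum, W)) / n
    return cross / (p * ss)

W = weights_from_joins(joins, len(z))
I = morans_i(z, W)
print(I)
```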
Spatial Autocorrelation
Moran’s I
Modification for point data
  Replace weights matrix with distance bands, width h
  Pre-normalise z values by subtracting means
  Count number of other points in each band, N(h)
$$I = \frac{1}{p}\,\frac{\sum_i\sum_j w_{ij}(z_i-\bar{z})(z_j-\bar{z})}{\sum_i (z_i-\bar{z})^2}, \quad\text{where } p = \sum_i\sum_j w_{ij}/n$$
$$I(h) = N(h)\,\frac{\sum_i\sum_j z_i z_j}{\sum_i z_i^2}$$
Spatial Autocorrelation
Moran I Correlogram
[Figure panels: source data points; lag distance bands, h; correlogram]
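A correlogram for point data might be sketched as below. This sketch makes several assumptions: it applies the general weights form of Moran's I with binary weights (w_ij = 1 when the distance between i and j falls in the band), takes the band total as the count of ordered pairs in the band, and uses half-open bands [k*h, (k+1)*h). The function name and sample transect are illustrative.

```python
import math

def moran_correlogram(points, h, n_bands):
    """points: list of (x, y, z). Returns one Moran's I per distance band
    [k*h, (k+1)*h), or None for a band that contains no pairs."""
    n = len(points)
    mean = sum(p[2] for p in points) / n
    d = [p[2] - mean for p in points]     # pre-normalise z by subtracting the mean
    ss = sum(v * v for v in d)
    cross = [0.0] * n_bands               # sum of d_i * d_j per band
    pairs = [0] * n_bands                 # number of ordered pairs (i, j) per band
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
            k = int(dist // h)
            if k < n_bands:
                cross[k] += d[i] * d[j]
                pairs[k] += 1
    return [n * c / (m * ss) if m else None for c, m in zip(cross, pairs)]

# Eight points on a transect with a steady trend in z
points = [(float(i), 0.0, float(i + 1)) for i in range(8)]
correlogram = moran_correlogram(points, h=1.5, n_bands=2)
print(correlogram)
```

For a trending series like this one, the index falls as the lag band moves outwards, which is the typical shape of a correlogram.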
Spatial Autocorrelation
Geary C
Co-variation model uses squared differences rather than products
Similar approach is used in geostatistics
$$C = \frac{1}{2p}\,\frac{\sum_i\sum_j w_{ij}(z_i - z_j)^2}{\sum_i (z_i-\bar{z})^2}, \quad\text{where } p = \sum_i\sum_j w_{ij}/(n-1)$$
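As a sketch, Geary's C can be computed for the same 10-cell lattice used in the Moran's I example. It assumes binary rook's-move weights, and the function name is an illustrative assumption.

```python
z = [2.24, 3.10, 4.55, -5.15, -4.39, 0.46, 5.54, 9.02, -2.09, -3.06]  # cells 1..10
joins = [(1, 2), (3, 4), (4, 5), (5, 6), (7, 8), (8, 9), (9, 10),
         (3, 7), (1, 4), (4, 8), (2, 5), (5, 9), (6, 10)]

def gearys_c(z, joins):
    """joins: unordered pairs of 1-based cell numbers; each pair contributes
    w_ij = w_ji = 1, so sums over i and j count every join twice."""
    n = len(z)
    mean = sum(z) / n
    ss = sum((v - mean) ** 2 for v in z)
    sq_diff = 2 * sum((z[a - 1] - z[b - 1]) ** 2 for a, b in joins)
    p = 2 * len(joins) / (n - 1)          # p = sum of weights / (n - 1)
    return sq_diff / (2 * p * ss)

C = gearys_c(z, joins)
print(C)
```

Values of C near 1 indicate little or no spatial autocorrelation, which agrees with the near-zero Moran's I for this lattice.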
Spatial Autocorrelation
Extending SA concepts
  Distance formula weights vs bands
  Lattice models with more complex neighbourhoods and lag models (see GeoDa)
  Disaggregation of SA index computations (row-wise) with/without row standardisation (LISA)
Significance testing
  Normal model
  Randomisation models
  Bonferroni/other corrections
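A randomisation test for Moran's I can be sketched as follows. The assumptions here are a one-sided test for positive autocorrelation, 999 random permutations, a fixed seed, and binary symmetric weights given as a list of joins. None of these choices come from the slides.

```python
import random

def morans_i(z, joins):
    """Moran's I for binary symmetric weights given as unordered 0-based pairs."""
    n = len(z)
    mean = sum(z) / n
    d = [v - mean for v in z]
    cross = sum(d[a] * d[b] for a, b in joins)
    return n * cross / (len(joins) * sum(v * v for v in d))

def randomisation_test(z, joins, n_perm=999, seed=0):
    """Shuffle the z values over the fixed locations and count how often the
    permuted index is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = morans_i(z, joins)
    shuffled = list(z)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, joins) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)   # pseudo p-value

z = [float(v) for v in range(12)]          # strong trend along a transect
joins = [(i, i + 1) for i in range(11)]    # each cell joined to its successor
observed, p_value = randomisation_test(z, joins)
print(observed, p_value)
```

When many local statistics are tested at once, as with LISA, the resulting p-values would need a Bonferroni or similar correction.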
Regression modelling
Simple regression – a statistical perspective
  One (or more) dependent (response) variables
  One or more independent (predictor) variables
  Linear regression is linear in coefficients:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots, \quad\text{or}\quad y = x\beta$$
  Vector/matrix form often used
  Over-determined equations & least squares
Regression modelling
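Ordinary Least Squares (OLS) model
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \dots + \varepsilon_i, \quad\text{or}\quad y = X\beta + \varepsilon$$
  Minimise sum of squared errors (or residuals)
  Solved for coefficients by matrix expression:
$$\hat{\beta} = (X^T X)^{-1} X^T y, \qquad \operatorname{var}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$$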
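The matrix expression translates directly into NumPy. The sketch below is illustrative: the function name and data are assumptions, and σ² is estimated from the residuals with n − k degrees of freedom, which the slide does not specify.

```python
import numpy as np

def ols(X, y):
    """Return (beta_hat, var_beta_hat) from the normal equations."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)     # residual-based estimate of sigma^2
    return beta, sigma2 * XtX_inv

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])   # intercept column plus one predictor
y = 1.0 + 2.0 * x                            # exact line, so residuals vanish
beta, var_beta = ols(X, y)
print(beta)
```

The explicit inverse mirrors the slide's formula. For ill-conditioned problems `np.linalg.lstsq` is usually the more stable choice.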
Regression modelling
OLS – models and assumptions
  Model – simplicity and parsimony
  Model – over-determination, multi-collinearity and variance inflation
Typical assumptions
  Data are independent random samples from an underlying population
  Model is valid and meaningful (in form and statistically)
  Errors are iid
    • Independent; no heteroskedasticity; common distribution
  Errors are distributed N(0,σ²)
Regression modelling
Spatial modelling and OLS
  Positive spatial autocorrelation is the norm, hence dependence between samples exists
  Datasets often non-Normal >> transformations may be required (Log, Box-Cox, Logistic)
  Samples are often clustered >> spatial declustering may be required
  Heteroskedasticity is common
  Spatial coordinates (x,y) may form part of the modelling process
Regression modelling
OLS vs GLS
OLS assumes no co-variation
  Solution:
$$\hat{\beta} = (X^T X)^{-1} X^T y$$
GLS models co-variation:
  y ~ N(μ,C) where C is a positive definite covariance matrix
  y = Xβ + u where u is a vector of random variables (errors) with mean 0 and variance-covariance matrix C
  Solution:
$$\hat{\beta} = (X^T C^{-1} X)^{-1} X^T C^{-1} y, \qquad \operatorname{var}(\hat{\beta}) = (X^T C^{-1} X)^{-1}$$
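The GLS solution can be sketched in the same way. The covariance matrix below is an assumed exponential decay with distance between 1-D sample locations, and the data values are made up for illustration.

```python
import numpy as np

def gls(X, y, C):
    """Generalised least squares for y = X beta + u with var(u) = C."""
    C_inv = np.linalg.inv(C)
    var_beta = np.linalg.inv(X.T @ C_inv @ X)
    beta = var_beta @ X.T @ C_inv @ y
    return beta, var_beta

s = np.arange(6.0)                                   # 1-D sample locations
X = np.column_stack([np.ones_like(s), s])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1, 10.9])
C = np.exp(-np.abs(s[:, None] - s[None, :]) / 2.0)   # covariance decays with distance

beta_gls, var_gls = gls(X, y, C)
beta_ols, _ = gls(X, y, np.eye(len(y)))              # C = I reduces GLS to OLS
print(beta_gls, beta_ols)
```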
Regression modelling
GLS and spatial modelling
  y ~ N(μ,C) where C is a positive definite covariance matrix (C must be invertible)
  C may be modelled by inverse distance weighting, contiguity (zone) based weighting, explicit covariance modelling…
Other models
  Binary data – Logistic models
  Count data – Poisson models
Regression modelling
Choosing between models
Information content perspective and AIC
$$AIC = -2\ln(L) + 2k$$
$$AICc = -2\ln(L) + 2k\left(\frac{n}{n-k-1}\right)$$
where n is the sample size, k is the number of parameters used in the model, and L is the likelihood function
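Both criteria are simple to compute once a model's log-likelihood is known. The function names and the example figures below are illustrative assumptions.

```python
def aic(log_likelihood, k):
    """Akaike Information Criterion: -2 ln(L) + 2k."""
    return -2.0 * log_likelihood + 2.0 * k

def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC: -2 ln(L) + 2k * n / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return -2.0 * log_likelihood + 2.0 * k * n / (n - k - 1)

# Hypothetical model: ln(L) = -120 with 3 parameters fitted to 50 observations
print(aic(-120.0, 3))
print(aicc(-120.0, 3, 50))
```

The model with the lower value is preferred. The correction term matters most when n is small relative to k.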
Regression modelling
Some ‘regression’ terminology
  Simple linear
  Multiple
  Multivariate
  SAR
  CAR
  Logistic
  Poisson
  Ecological
  Hedonic
  Analysis of variance
  Analysis of covariance
Regression modelling
Spatial regression – trend surfaces and residuals (a form of ESDA)
General model:
$$y = f(x_1, x_2, w)$$
  y - observations, f( , , ) - some function, (x1,x2) - plane coordinates, w - attribute vector
Linear trend surface plot
Residuals plot
2nd and 3rd order polynomial regression
Goodness of fit measures – coefficient of determination
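A polynomial trend surface fit might look like the sketch below. It uses only the plane coordinates and ignores the attribute vector w. The function name and the synthetic surface are illustrative assumptions.

```python
import numpy as np

def trend_surface(x1, x2, y, order=1):
    """Least-squares polynomial trend surface in the plane coordinates.
    Returns (coefficients, residuals, coefficient of determination)."""
    # All monomials x1^i * x2^j with i + j <= order (includes the constant term)
    terms = [x1**i * x2**j for i in range(order + 1) for j in range(order + 1 - i)]
    A = np.column_stack(terms)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return coef, resid, r2

gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
x1, x2 = gx.ravel(), gy.ravel()
y = 2.0 + 0.5 * x1 - 0.3 * x2 + 0.2 * x1**2      # synthetic quadratic surface

_, resid_1, r2_linear = trend_surface(x1, x2, y, order=1)
_, resid_2, r2_quadratic = trend_surface(x1, x2, y, order=2)
print(r2_linear, r2_quadratic)
```

Mapping the residuals from the linear fit would show the systematic curvature that the second-order surface removes.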
Regression modelling
Regression & spatial autocorrelation (SA)
Analyse the data for SA
If SA ‘significant’ then
  Proceed and ignore SA, or
  Permit the coefficient, β, to vary spatially (GWR), or
  Modify the regression model to incorporate the SA
Regression modelling
Geographically Weighted Regression (GWR)
  Coefficients, β, allowed to vary spatially, β(t)
  Model:
$$y = X\beta(t) + \varepsilon$$
  Coefficients determined by examining neighbourhoods of points, t, using distance decay functions (fixed or adaptive bandwidths)
  Weighting matrix, W(t), defined for each point
  Solution:
$$\hat{\beta}(t) = (X^T W(t) X)^{-1} X^T W(t) y$$
  GLS:
$$\hat{\beta} = (X^T C^{-1} X)^{-1} X^T C^{-1} y$$
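The local solution can be sketched by solving one weighted least-squares problem per point. The Gaussian distance decay kernel, the fixed bandwidth and the synthetic data are assumptions made for illustration, since the slides do not fix a kernel.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Local coefficients beta(t) at every data point t, using a Gaussian
    distance-decay kernel with a fixed bandwidth."""
    betas = np.empty((len(y), X.shape[1]))
    for t, c in enumerate(coords):
        d = np.linalg.norm(coords - c, axis=1)       # distances from point t
        w = np.exp(-0.5 * (d / bandwidth) ** 2)      # diagonal of W(t)
        XtW = X.T * w                                # equals X.T @ diag(w)
        betas[t] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

n = 20
coords = np.column_stack([np.arange(n, dtype=float), np.zeros(n)])
x = 3.0 * np.cos(np.arange(n))
X = np.column_stack([np.ones(n), x])

y_global = 1.0 + 2.0 * x                      # same relationship everywhere
y_local = (1.0 + 0.1 * coords[:, 0]) * x      # slope drifts along the transect

b_global = gwr_coefficients(coords, X, y_global, bandwidth=3.0)
b_local = gwr_coefficients(coords, X, y_local, bandwidth=3.0)
print(b_local[0], b_local[-1])
```

With a spatially constant relationship every local fit returns the global coefficients. With the drifting slope the local estimates change along the transect, which is the pattern GWR is designed to expose.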
Regression modelling
Geographically Weighted Regression
  Sensitivity – model, decay function, bandwidth, point/centroid selection
  ESDA – mapping of surface, residuals, parameters and SEs
  Significance testing
    Increased apparent explanation of variance
    Effective number of parameters
    AICc computations
Regression modelling
Geographically Weighted Regression
  Count data – GWPR
    use of offsets
    Fitting by ILSR methods
  Presence/Absence data – GWLR
    True binary data
    Computed binary data - use of re-coding, e.g. thresholding
    Fitting by ILSR methods
Regression modelling
Regression & spatial autocorrelation (SA)
Modify the regression model to incorporate the SA, i.e. produce a Spatial Autoregressive model (SAR)
Many approaches – including:
  SAR – e.g. pure spatial lag model, mixed model, spatial error model etc.
  CAR – a range of models that assume the expected value of the dependent variable is conditional on the (distance weighted) values of neighbouring points
  Spatial filtering – e.g. OLS on spatially filtered data
Regression modelling
SAR models
Pure spatial lag:
$$y = \rho W y + \varepsilon$$
  ρ: autoregression parameter; W: spatial weights matrix
Re-arranging:
$$y = (I - \rho W)^{-1}\varepsilon$$
MRSA model (linear regression added):
$$y = X\beta + \rho W y + \varepsilon$$
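The re-arranged form gives a direct way to generate data from these models. The sketch below assumes a row-standardised weights matrix for a simple chain of cells and arbitrary parameter values. The function names are illustrative.

```python
import numpy as np

def row_standardised_chain(n):
    """Weights for n cells in a line, each joined to its immediate neighbours,
    scaled so that every row sums to 1."""
    W = np.zeros((n, n))
    for i in range(n - 1):
        W[i, i + 1] = W[i + 1, i] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def simulate_mrsa(X, beta, rho, W, eps):
    """Solve (I - rho W) y = X beta + eps for y (the MRSA model).
    With X beta = 0 this is the pure spatial lag model."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)

rng = np.random.default_rng(0)
n, rho = 8, 0.6
W = row_standardised_chain(n)
eps = rng.normal(size=n)

y_lag = simulate_mrsa(np.zeros((n, 1)), np.zeros(1), rho, W, eps)
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
beta = np.array([1.0, 0.5])
y_mrsa = simulate_mrsa(X, beta, rho, W, eps)
print(y_lag)
```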
Regression modelling
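SAR models
Spatial error model (linear regression + spatial error):
$$y = X\beta + \varepsilon, \quad\text{where } \varepsilon = \lambda W\varepsilon + u$$
  λWε: spatial weighted error vector; u: iid error vector
Substituting and re-arranging:
$$y = X\beta + \lambda W(y - X\beta) + u, \quad\text{or}\quad y = X\beta + \lambda W y - \lambda W X\beta + u$$
  Xβ: linear regression (global); λWy: SAR lag; λWXβ: local trend; u: iid error vector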
Regression modelling
CAR models
Standard CAR model:
$$E(y_i \mid \text{all } y_{j \neq i}) = \mu_i + \rho \sum_{j \neq i} w_{ij}(y_j - \mu_j)$$
  μ_i: expected value at i; ρ: autoregression parameter; Σ w_ij(y_j − μ_j): weighted mean for neighbourhood of i
Local weights matrix – distance or contiguity
Variance:
$$\operatorname{var}(y) = (I - \rho W)^{-1} M$$
Different models for W and M provide a range of CAR models
Regression modelling
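Spatial filtering
Apply a spatial filter to the data to remove SA effects
Model the filtered data
Example: y = Xβ + ε
$$y - \rho W y = X\beta - \rho W X\beta + \varepsilon, \quad\text{or}$$
$$(I - \rho W)\,y = (I - \rho W)\,X\beta + \varepsilon, \quad\text{hence}$$
$$y = X\beta + (I - \rho W)^{-1}\varepsilon$$
  (I − ρW): spatial filter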
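A sketch of OLS on spatially filtered data follows. It assumes ρ is known, whereas in practice it would have to be estimated, and it uses a row-standardised chain of cells with synthetic data. All names are illustrative.

```python
import numpy as np

def spatially_filtered_ols(X, y, W, rho):
    """Apply the filter (I - rho W) to y and X, then fit OLS to the filtered data."""
    F = np.eye(len(y)) - rho * W
    beta, *_ = np.linalg.lstsq(F @ X, F @ y, rcond=None)
    return beta

n = 50
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W /= W.sum(axis=1, keepdims=True)          # row-standardise

rng = np.random.default_rng(0)
rho, beta_true = 0.5, np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), np.linspace(0.0, 10.0, n)])
eps = rng.normal(scale=0.01, size=n)       # iid errors
y = X @ beta_true + np.linalg.solve(np.eye(n) - rho * W, eps)

beta_hat = spatially_filtered_ols(X, y, W, rho)
print(beta_hat)
```

After filtering, the errors in the transformed model are the iid ε, so the usual OLS assumptions apply to the filtered data.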