Page 1
Tests for Spatial Clustering
global statistic
- aggregate / points
  - K-function
  - Grimson’s method
  - Cuzick and Edwards’ method
  - Join Count
- aggregate data
  - Geary’s C
  - Moran’s I
local statistic
- spatial scan statistic
- LISA statistic
- geographical analysis machine (GAM)
Page 2
K-Function
- summary of local dependence of spatial process -> second order process
- expresses expected number of further events within a given distance of a randomly chosen event, scaled by the intensity (events per unit area)
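The definition above can be sketched as a naive estimator. This is a minimal illustration, not the method used for the examples in these slides: it applies no edge correction, and the names `ripley_k`, `points`, `distances` and `area` are assumptions introduced here.

```python
import numpy as np

def ripley_k(points, distances, area):
    """Naive K-function estimate (no edge correction).

    K(d) = area / (n * (n - 1)) * number of ordered pairs (i, j), i != j,
    whose separation is <= d.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances between all events.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep ordered pairs only, dropping each event's distance to itself.
    pair_dist = dist[~np.eye(n, dtype=bool)]
    return np.array([
        area * np.count_nonzero(pair_dist <= d) / (n * (n - 1))
        for d in distances
    ])

# Toy data (hypothetical): four events on the corners of a unit square.
corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
k_values = ripley_k(corners, distances=[0.5, 1.0, 1.5], area=1.0)
```

Under complete spatial randomness K(d) is approximately pi * d^2, so estimates above that curve suggest clustering at distance d. In case-control data a common approach is to compare the K-functions of cases and controls rather than to rely on the randomness benchmark.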
Page 3
Example: K-Function for Newcastle Disease Outbreak
[Figure: map of Controls and Cases plotted by X-Coordinate and Y-Coordinate, with two accompanying plots whose horizontal axes run from 50 to 200]
Page 4
TB Case-Control Study in Central North Island of NZ
[Figure: map of case and control locations; cases = red, controls = blue]
Page 5
Cuzick and Edwards’ Test applied to TB Case-Control Study
Page 6
Local Spatial Autocorrelation
[Figure: two map panels, Local Moran and Local Geary, each with its own legend]
Legend: 0 to 0.69; 0.69 to 2.03; 2.03 to 3.58; 3.58 to 5.8; 5.8 to 10.18
Legend: -0.59 to -0.21; -0.21 to 0.12; 0.12 to 0.6; 0.6 to 2.1; 2.1 to 6.75
Page 7
Spatial Scan Statistic
- no pre-specified cluster size
- can take confounding into account
- also does time-space clustering
method
- increasing circles (cylinders if including time)
- compare risk within with outside circle
- most likely cluster -> circle with maximum likelihood (more than expected number of cases)
SaTScan software (public domain)
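The method outlined above can be sketched for the purely spatial Poisson case. This is a simplified illustration under stated assumptions, not SaTScan's implementation: circles are centred on the data locations only, equidistant locations are added one at a time, and the function names and toy data are hypothetical.

```python
import numpy as np

def poisson_llr(c, e, total):
    """Log likelihood ratio for a circle with c observed and e expected cases.

    Only excesses count: circles with c <= e score 0.
    """
    if c <= e:
        return 0.0
    inside = c * np.log(c / e)
    outside = 0.0 if c == total else (total - c) * np.log((total - c) / (total - e))
    return inside + outside

def scan_most_likely_cluster(xy, cases, pop, max_frac=0.5):
    """Return (llr, centre index, radius) of the circle with the largest LLR."""
    xy = np.asarray(xy, dtype=float)
    cases = np.asarray(cases, dtype=float)
    pop = np.asarray(pop, dtype=float)
    total_cases, total_pop = cases.sum(), pop.sum()
    best = (0.0, None, 0.0)
    for i in range(len(xy)):
        dist = np.linalg.norm(xy - xy[i], axis=1)
        order = np.argsort(dist)
        c_cum = np.cumsum(cases[order])   # cases inside the growing circle
        p_cum = np.cumsum(pop[order])     # population inside the growing circle
        for k in range(len(order)):
            if p_cum[k] > max_frac * total_pop:
                break                     # circle has grown too large
            expected = total_cases * p_cum[k] / total_pop
            llr = poisson_llr(c_cum[k], expected, total_cases)
            if llr > best[0]:
                best = (llr, i, dist[order[k]])
    return best

# Toy data (hypothetical): two neighbouring high-count sites, three distant ones.
sites = [(0, 0), (1, 0), (10, 0), (0, 10), (10, 10)]
llr, centre, radius = scan_most_likely_cluster(
    sites, cases=[10, 9, 1, 1, 1], pop=[100] * 5
)
```

Significance is not assessed here; in practice the maximum log likelihood ratio is compared with its distribution over data sets simulated under the null hypothesis.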
Page 8
Example - SaTScan
locations of den sites of tuberculous and non-tuberculous possums
Page 9
Example - SaTScan cont.
MOST LIKELY CLUSTER
1. Coordinates / radius..: (348630,708744) / 126.65
Population............: 56
Number of cases.......: 34 (16.44 expected)
Overall relative risk.: 2.07
Log likelihood ratio..: 15.86
P-value...............: 0.001
SECONDARY CLUSTERS
2. Coordinates / radius..: (348491,708496) / 33.35
Population............: 5
Number of cases.......: 5 (1.47 expected)
Overall relative risk.: 3.41
Log likelihood ratio..: 6.25
P-value...............: 0.337
3. Coordinates / radius..: (348369,708453) / 80.55
Population............: 8
Number of cases.......: 7 (2.35 expected)
Overall relative risk.: 2.98
Log likelihood ratio..: 6.13
P-value...............: 0.365
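As a reading aid, the reported "Overall relative risk" values are consistent with the ratio of observed to expected cases. The short check below uses only the numbers printed in the output above; the variable names are assumptions.

```python
# Observed cases, expected cases and reported relative risk per cluster,
# copied from the SaTScan output above.
clusters = [
    {"cases": 34, "expected": 16.44, "reported_rr": 2.07},
    {"cases": 5, "expected": 1.47, "reported_rr": 3.41},
    {"cases": 7, "expected": 2.35, "reported_rr": 2.98},
]

# Observed / expected for each cluster.
ratios = [c["cases"] / c["expected"] for c in clusters]

# Largest gap between the recomputed ratio and the reported value.
max_gap = max(abs(r - c["reported_rr"]) for r, c in zip(ratios, clusters))
```

The recomputed ratios agree with the reported values to within about 0.01. The small gap for the second cluster is likely because the printed expected count is itself rounded.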
Page 10
Example - SaTScan cont.
Page 11
Space-Time Scan Statistic
MOST LIKELY CLUSTER
1.Census areas included.: 75, 26, 77, 76, 29, 32
Coordinates / radius..: (389631,216560) / 59840.47
Time frame............: 1997/1/1 - 1999/12/31
Population............: 4847
Number of cases.......: 1507 (632.85 expected)
Overall relative risk.: 2.38
Log likelihood ratio..: 509.4
Monte Carlo rank......: 1/1000
P-value...............: 0.001
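The "Monte Carlo rank" and "P-value" lines are linked: a rank of 1 out of 1000 gives p = 1/1000. A minimal sketch of that calculation follows, assuming 999 simulated data sets plus the real one. The simulated values are placeholders rather than SaTScan output, and the function name is hypothetical.

```python
import numpy as np

def monte_carlo_p_value(observed_llr, simulated_llrs):
    """Rank the observed statistic among the simulated ones; p = rank / (n_sim + 1)."""
    simulated = np.asarray(simulated_llrs, dtype=float)
    rank = 1 + int(np.count_nonzero(simulated >= observed_llr))
    return rank, rank / (simulated.size + 1)

# Placeholder null distribution: 999 values, all far below the observed 509.4.
simulated = np.linspace(0.0, 20.0, 999)
rank, p_value = monte_carlo_p_value(509.4, simulated)

# Observed / expected cases from the output above.
relative_risk = 1507 / 632.85
```

With 999 replications the smallest attainable p-value is 0.001, which is why that value appears so often in scan statistic output.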
Page 12
Framework for Spatial Data Analysis
Visualization
Exploration
Modelling
Attribute data
Feature data
Databases
Maps
Describe patterns
Test hypotheses
GIS
DBMS
Statistical Software
Page 13
Modelling
- explain and predict spatial structure
- hypothesis testing
methods
- data mining
- statistical and simulation modelling
- multi-criteria/multi-objective decision modelling
problem -> spatial dependence
Page 14
3D Risk Map for FMD Outbreak Occurrence in Thailand (based on random effects logistic regression analysis)
Page 15
Recent Developments in Spatial Regression Modelling
generalised linear mixed models (GLMM)
- use random effect term to reflect spatial structure
- impose spatial covariance structures
- Bayesian estimation, Markov chain Monte Carlo (MCMC), Gibbs sampling
autologistic regression
- include spatial covariate
- MCMC estimation
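One common form of the spatial covariate in autologistic regression is an autocovariate: the weighted mean of the response at neighbouring sites. The sketch below assumes a binary response `y` and a neighbour weight matrix `W`. Both names, and the toy data, are illustrative rather than taken from the slides.

```python
import numpy as np

def autocovariate(y, W):
    """Weighted mean of the response over each site's neighbours.

    W[i, j] > 0 marks j as a neighbour of i; sites with no neighbours get 0.
    """
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1)
    result = np.zeros_like(y)
    has_neighbours = row_sums > 0
    result[has_neighbours] = (W @ y)[has_neighbours] / row_sums[has_neighbours]
    return result

# Toy data (hypothetical): four sites in a row, each adjacent to the next.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
y = np.array([1, 0, 1, 1])      # disease status per site
autocov = autocovariate(y, W)   # would enter the logistic model as a covariate
```

Because the covariate depends on the response itself, ordinary maximum likelihood is not appropriate, which is why the slide lists MCMC estimation.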
Page 16
Bayesian Regression Modelling
Bayesian inference combines
- information from data (likelihood)
- prior distributions for unknown parameters
to generate
- posterior distribution of the unknown parameters (and, through them, of the dependent variable)
allows modelling of data heterogeneity, addresses multiplicity issues
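The combination of likelihood and prior can be illustrated with a case that needs no MCMC. This swaps in a simple conjugate model as an assumption: observed count ~ Poisson(expected * RR) with a Gamma(shape, rate) prior on the relative risk, which gives a Gamma(shape + observed, rate + expected) posterior. It is not the spatial model used later in these slides, and the numbers are made up.

```python
import numpy as np

def raw_smr(observed, expected):
    """Standardised morbidity ratio: observed / expected count."""
    return observed / expected

def posterior_relative_risk(observed, expected, prior_shape=1.0, prior_rate=1.0,
                            n_draws=5000, seed=0):
    """Posterior mean and 95% interval of RR under a gamma-Poisson model."""
    shape = prior_shape + observed      # prior shape updated by the data
    rate = prior_rate + expected        # prior rate updated by the exposure
    rng = np.random.default_rng(seed)
    draws = rng.gamma(shape, 1.0 / rate, size=n_draws)
    lower, upper = np.quantile(draws, [0.025, 0.975])
    return shape / rate, (lower, upper)

# Hypothetical area: 3 cases observed where 1.0 was expected.
smr = raw_smr(3, 1.0)
mean_rr, interval = posterior_relative_risk(3, 1.0, prior_shape=2.0, prior_rate=2.0)
```

The posterior mean lies between the prior mean of 1 and the raw SMR of 3. This is the shrinkage that stabilises estimates for areas with small expected counts.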
Page 17
TB Reactor Risk Modelling
dependent variable -> observed TB reactors per county in 1999 in GB
Poisson regression model
- MCMC estimation
- expected no. TB reactors
- two random effects (convolution prior)
  - spatial: conditionally autoregressive (CAR) prior
  - non-spatial: exchangeable normal prior
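A model of this kind is commonly written as below. The notation is an assumption, not taken from the slides: O_i and E_i are the observed and expected reactor counts in county i, theta_i is the relative risk, u_i is the spatially structured effect, v_i is the unstructured effect, the sum runs over the neighbours of county i, and n_i is the number of those neighbours.

```latex
\begin{align}
O_i \mid \theta_i &\sim \mathrm{Poisson}(E_i\,\theta_i), \\
\log \theta_i &= \alpha + u_i + v_i, \\
u_i \mid u_{j \neq i} &\sim \mathrm{N}\!\left(\frac{1}{n_i}\sum_{j \in \partial i} u_j,\; \frac{\sigma_u^2}{n_i}\right), \\
v_i &\sim \mathrm{N}\!\left(0,\; \sigma_v^2\right).
\end{align}
```

Covariates, where available, would enter the linear predictor alongside the intercept.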
Page 18
Raw Standardised Morbidity Ratio
BUGS software with GeoBUGS extension
Page 19
Example – Kernel Density Plots
[Figure: kernel density plots of RR[25] (sample: 5000; horizontal axis 0.0 to 15.0) and RR[31] (sample: 5000; horizontal axis 0.5 to 2.0)]
Page 20
Raw SMR and Posterior Relative Risk Maps
[Figure: two maps, raw SMR and Bayes’ RR estimates]
Page 21
Medians and 95% CI of Posterior Relative Risks
[Figure: median and 95% CI of the posterior relative risk for each county; horizontal axis: County (Shetland Islands through Isle of Wight); vertical axis: Relative risk, 0 to 4.5]
Page 22
Model Residuals and RR Significance
[Figure: density plot of RR[31], sample: 10000; horizontal axis 0.5 to 2.0]
Page 23
Relative Importance of Structured versus Unstructured Random Effect
[Figure: sdratio plotted against iteration (1 to 10000); vertical axis 0.0 to 0.8]
Page 24
Multi-Criteria Decision Making using GIS
decision -> choice between alternatives
- vaccinate wildlife or not
criterion -> evidence used to decide on decision
- factors and constraints
  - presence of wildlife reservoir
  - cattle stocking density
  - access to wildlife for vaccine delivery
decision rule -> procedure for selection and combination of criteria
Page 25
Multi-Criteria Decision Making in GIS cont.
evaluation -> application of decision rules
- multi-criteria evaluations
  - boolean overlays
  - weighted linear combinations
uncertainty
- database uncertainty
- decision rule uncertainty -> fuzzy versus crisp sets
- decision risk -> likelihood of decision being wrong -> Bayesian probability theory, Dempster-Shafer Theory
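A weighted linear combination with a boolean overlay can be sketched on small rasters. The layer names echo the wildlife vaccination example, but the cell values, weights and function name are hypothetical.

```python
import numpy as np

def weighted_linear_combination(factors, weights, constraints=()):
    """Suitability = sum of weight * factor, masked by boolean constraints.

    Factors are rasters scaled to 0..1; weights are normalised to sum to 1;
    each constraint is a boolean raster (False excludes the cell).
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    score = sum(w * np.asarray(f, dtype=float) for w, f in zip(weights, factors))
    for constraint in constraints:
        score = score * np.asarray(constraint, dtype=bool)
    return score

# Hypothetical 2 x 2 rasters.
reservoir = [[1.0, 0.5], [0.0, 1.0]]        # factor: wildlife reservoir presence
stocking = [[0.2, 0.8], [1.0, 0.6]]         # factor: cattle stocking density
access = [[True, True], [True, False]]      # constraint: vaccine delivery possible

suitability = weighted_linear_combination(
    [reservoir, stocking], weights=[3, 1], constraints=[access]
)
```

The constraint acts as a crisp (boolean) mask, while the factors trade off against each other through their weights. Choosing those weights is part of the decision rule uncertainty noted above.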
Page 26
Dempster-Shafer Theory
- extension of Bayesian probability theory
- data uncertainty included in calculation -> belief in hypothesis not complement of belief in negation (sensitivity of diagnosis)
- collect different sources of evidence for presence/absence (data, expert knowledge)
- re-express as probability
- combine evidence as mass of support for particular hypothesis
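Combining two sources of evidence can be sketched with Dempster's rule of combination over the frame {present, absent}. The mass values below are invented for illustration, and the function and variable names are assumptions.

```python
from itertools import product

PRESENT = frozenset({"present"})
ABSENT = frozenset({"absent"})
EITHER = PRESENT | ABSENT   # mass here is uncommitted (ignorance)

def combine(m1, m2):
    """Dempster's rule: multiply masses, pool by intersection, renormalise."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b   # evidence pointing to disjoint sets
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Hypothetical evidence: survey data and expert knowledge.
survey = {PRESENT: 0.6, EITHER: 0.4}
expert = {PRESENT: 0.5, ABSENT: 0.2, EITHER: 0.3}
pooled = combine(survey, expert)
```

Mass left on the whole frame (`EITHER`) is what separates this from a Bayesian treatment: support not committed to "present" is not automatically assigned to "absent".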
Page 27
More about Dempster-Shafer Theory
belief
- total support for hypothesis
- degree of hard evidence supporting hypothesis
plausibility
- degree to which hypothesis cannot be disbelieved
- degree to which conditions appear to be right for hypothesis, even though hard evidence is lacking
Page 28
Even more about Dempster-Shafer Theory
belief interval
- range between belief and plausibility
- degree of uncertainty in establishing presence/absence of hypothesis
- areas with high belief interval suitable for collection of new data
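Belief, plausibility and the belief interval can be computed from a mass assignment as follows. This is a minimal sketch over the frame {present, absent}; the masses and names are hypothetical.

```python
PRESENT = frozenset({"present"})
ABSENT = frozenset({"absent"})
EITHER = PRESENT | ABSENT

def belief(mass, hypothesis):
    """Sum of mass on non-empty subsets of the hypothesis (hard evidence)."""
    return sum(m for s, m in mass.items() if s and s <= hypothesis)

def plausibility(mass, hypothesis):
    """Sum of mass on sets that overlap the hypothesis (not disbelieved)."""
    return sum(m for s, m in mass.items() if s & hypothesis)

def belief_interval(mass, hypothesis):
    """Plausibility minus belief: the uncertainty left about the hypothesis."""
    return plausibility(mass, hypothesis) - belief(mass, hypothesis)

# Hypothetical mass assignment for one map cell.
mass = {PRESENT: 0.5, ABSENT: 0.2, EITHER: 0.3}
bel = belief(mass, PRESENT)
pl = plausibility(mass, PRESENT)
width = belief_interval(mass, PRESENT)
```

Here the plausibility of presence equals 1 minus the belief in absence, and the interval width equals the uncommitted mass. Cells with a wide interval are the ones where new data would be most informative.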
Page 29
Example – East Coast Fever Occurrence in Zimbabwe
[Figure: two maps]
Belief in T. parva Presence
Belief interval for T. parva Presence (Degree of uncertainty)
Page 30
Landscape Structure
quantify landscape structure/composition
habitat features as a whole
Page 31
TB Infected Herds around Hauhungaroa Ranges in NZ
[Figure: map with legend]
Vegetation classes: pasture, pasture/scrub, scrubland, forest
Farm boundaries
Page 32
Framework for Spatial Data Analysis
Visualization
Exploration
Modelling
Attribute data
Feature data
Databases
Maps
Describe patterns
Test hypotheses
GIS
DBMS
Statistical Software
Page 33
Conclusion
spatial analysis essential component of epidemiological analysis
key ideas
- visualization -> extremely effective for analysis and presentation
- exploration -> cluster detection methods (beware of type I error)
- modelling -> Bayesian modelling and decision analysis techniques