10.05.2004 EPIET Workshop Bordeaux, May 2004
Intro analysis of surveillance data, EPISOUTH, Madrid, September 2007
Spatial Analysis of Surveillance Data
Fernando Simón, Francisco Luquero, Victor Flores, Denis
Coulombier
Early Neonatal Mortality
Cartogram: Malaria. N. cases
Purpose of spatial analysis
Describe spatial distribution of data
– Counts, rates, RR …
– Mapping
Identify spatial association of cases
– Identify clusters, OB …
– Analysis of point processes
Estimating
– Counts, rates, RR …
– Geostatistics
Descriptive Analysis
Dot-density maps for count of cases
Administrative area maps for rates
– Choice of administrative areas
– Rates to account for population
– Standardised rates to account for population structure
“Isorate” maps for sentinel surveillance
GIS when case coordinates available
Dot Density Map
Notification of Tuberculosis in France, 1996
4-Week Period Ending 31/12/1996
Spatial Distribution of Polio Cases
Albania, April-September 1996
[Map panels: April, May, June, July, August, September]
Descriptive Analysis: Place and Rates
Count of cases does not represent risk
Administrative areas have different populations
Population may vary over time
– Seasons
– Population influx (refugees)
Rates allow risks to be compared
Choice of administrative areas (problem of small numbers of cases)
Choice of ranges
Notification Rate of Tuberculosis in France, 1996
Cases/100,000
Distribution of cases of PERTUSSIS
Lebanon, as of week 2003-15
[Map legend] Cases/100,000/Y: 0.178 – 0.554; 0.555 – 0.872; 0.873 – 1.741; 1.742 – 3.554; No report
Choosing map data
If we map number of cases instead of rates:
Is the disease risk in A really similar to B?
Misleading because underlying population may be greater in A
[Map: areas A and B]
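A small numeric sketch may make the point concrete. The case counts and populations below are hypothetical (they do not come from the slides); the helper name `rate_per_100k` is also only illustrative. Two areas with the same count can carry very different risks once the population is taken into account.

```python
# Hypothetical figures for two areas, A and B (not taken from the slides).
areas = {
    "A": {"cases": 50, "population": 500_000},
    "B": {"cases": 50, "population": 100_000},
}


def rate_per_100k(cases, population):
    """Crude rate expressed as cases per 100,000 population."""
    return cases * 100_000 / population


rates = {name: rate_per_100k(d["cases"], d["population"]) for name, d in areas.items()}

for name, rate in rates.items():
    # Same count in both areas, but the rate in B is five times the rate in A.
    print(f"Area {name}: {areas[name]['cases']} cases, {rate:.1f} per 100,000")
```

On a map of counts, A and B would share the same shade; on a map of rates they would not.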
Use of Standardised Rates
[Diagram: age structure linked to both disease and place]
Population structure varies across places, independently of disease
Disease occurrence varies across ages, independently of place
Confounding: age, independently related to disease and to location
Use of Standardised Rates
Direct standardisation
Indirect standardisation
Value of rate affected by the reference population
For comparison only
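The two methods can be sketched with a few lines of arithmetic. All numbers below are hypothetical and the function names (`direct_rate`, `smr`) are illustrative choices, not part of any tool mentioned in the slides. Direct standardisation applies the area's age-specific rates to a reference population; indirect standardisation applies reference rates to the area's population and compares observed with expected cases.

```python
# All numbers are hypothetical and only illustrate the arithmetic.
# Age-specific data for one area: (cases, population) per age group.
area = {"0-14": (5, 10_000), "15-64": (20, 20_000), "65+": (45, 5_000)}

# Reference (standard) population: size of each age group and age-specific rates.
ref_pop = {"0-14": 20_000, "15-64": 60_000, "65+": 20_000}
ref_rates = {"0-14": 0.0004, "15-64": 0.0008, "65+": 0.006}


def crude_rate(area):
    """Total cases divided by total population, ignoring age structure."""
    cases = sum(c for c, _ in area.values())
    pop = sum(p for _, p in area.values())
    return cases / pop


def direct_rate(area, ref_pop):
    """Direct method: apply the area's age-specific rates to the reference population."""
    expected = sum(cases / pop * ref_pop[g] for g, (cases, pop) in area.items())
    return expected / sum(ref_pop.values())


def smr(area, ref_rates):
    """Indirect method: observed cases over cases expected at the reference rates."""
    observed = sum(cases for cases, _ in area.values())
    expected = sum(ref_rates[g] * pop for g, (_, pop) in area.items())
    return observed / expected


print(f"Crude rate:  {crude_rate(area) * 100_000:.0f} per 100,000")
print(f"Direct rate: {direct_rate(area, ref_pop) * 100_000:.0f} per 100,000")
print(f"SMR:         {smr(area, ref_rates):.2f}")
```

With these made-up numbers the crude rate is 200 per 100,000, the directly standardised rate is 250 per 100,000 and the SMR is 1.4. Changing `ref_pop` changes the standardised rate, which is why the value is meaningful for comparison only.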
Distribution of Death by Falls by Province, Canada, 1998
[Map panels: Age Standardized Rate per 100,000; Crude death rate per 100,000]
General mortality, 1995-1997
Not smoothed
Smoothed
BoxMap
[Map panels: Equal interval; Equal n. of records; Natural breaks]
Choropleths: colors or patterns
Choropleth Maps
Natural break
The story we tell depends upon how we choose to create the
legend
Equal ranges
Equal counts
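How much the legend changes the story can be seen by classifying the same values in two ways. The sketch below uses a hypothetical, deliberately skewed list of rates and only the Python standard library; the function names are illustrative. Natural breaks are not covered here because they need an optimisation step (for example the Jenks method) that is beyond a short example.

```python
import statistics

# Hypothetical rates per 100,000 for 12 areas (skewed on purpose).
rates = [0.4, 0.9, 1.1, 1.3, 1.8, 2.0, 2.4, 3.1, 4.0, 6.5, 12.0, 28.4]


def equal_interval_breaks(values, k):
    """Break points for k classes of equal width between minimum and maximum."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_count_breaks(values, k):
    """Break points for k classes holding roughly the same number of areas."""
    return statistics.quantiles(values, n=k, method="inclusive")


def class_sizes(values, breaks):
    """Number of values per class; a value equal to a break goes to the upper class."""
    sizes = [0] * (len(breaks) + 1)
    for v in values:
        sizes[sum(v >= b for b in breaks)] += 1
    return sizes


interval_sizes = class_sizes(rates, equal_interval_breaks(rates, 4))
count_sizes = class_sizes(rates, equal_count_breaks(rates, 4))
print("Equal ranges:", interval_sizes)
print("Equal counts:", count_sizes)
```

With equal ranges, 10 of the 12 areas fall in the lowest class and one class is empty, so the map looks almost uniform with one or two hot spots. With equal counts, each class holds 3 areas and the same data look evenly graded.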
Choice of Data Breakdown in Classes
[Chart comparing class breaks by method: equal area, equal interval, natural breaks, quartiles, mean/st. dev.; value axis marked 0.378, 7.400, 14.423, 21.445, 28.467]
Hypothesis Testing: Place
Remove confounding (standardisation)
Detection of clusters
– Unexpected events: dot-maps
  • Test for spatial correlation by nearest neighbour
– Events with baseline historical data
  • Test for spatial correlation by contiguity analysis
Risk factor identification
– Overlaying exposure and outcome
– Test for cross-correlation
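The slides do not say which nearest-neighbour test is meant. As one possible illustration, the sketch below computes the Clark-Evans index, a common nearest-neighbour statistic: the observed mean distance to the nearest neighbour divided by the distance expected under complete spatial randomness. It assumes a study region of known area and applies no edge correction; the point sets are hypothetical.

```python
import math


def clark_evans(points, area):
    """Clark-Evans index R and z score for a point pattern (no edge correction).

    R < 1 suggests clustering, R > 1 suggests regular spacing, R near 1
    is compatible with a random pattern.
    """
    n = len(points)
    nearest = []
    for i, (x1, y1) in enumerate(points):
        nearest.append(
            min(math.hypot(x1 - x2, y1 - y2)
                for j, (x2, y2) in enumerate(points) if j != i)
        )
    observed = sum(nearest) / n
    density = n / area
    expected = 0.5 / math.sqrt(density)      # mean distance under randomness
    se = 0.26136 / math.sqrt(n * density)    # standard error under randomness
    return observed / expected, (observed - expected) / se


# Hypothetical patterns in a 5 x 5 study region.
grid = [(i + 0.5, j + 0.5) for i in range(5) for j in range(5)]
cluster = [(2 + 0.01 * i, 2 + 0.01 * j) for i in range(5) for j in range(5)]

r_grid, z_grid = clark_evans(grid, area=25.0)
r_cluster, z_cluster = clark_evans(cluster, area=25.0)
print(f"Regular grid: R = {r_grid:.2f}, z = {z_grid:.1f}")
print(f"Tight cluster: R = {r_cluster:.2f}, z = {z_cluster:.1f}")
```

For real surveillance data the expected distance should account for the uneven distribution of the population at risk, which this sketch ignores.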
Distribution of cases of Botulism, France, Week 42-45, 2000
Testing for Contiguities: Grimson Method
Sudden Infant Death Syndrome by County, North Carolina, 1974-1978
[Map legend] Death per 100000: 0.000000-0.001250; 0.001251-0.002051; 0.002052-0.002592; 0.002593-0.006114
Observed contiguity in high risk counties: 24
Expected contiguity in high risk counties: 16.3
Contiguity standard deviation: 3.46
z statistic: 2.07, p=0.038
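The idea behind a contiguity test can be sketched as follows: count how many pairs of neighbouring areas are both high risk, then ask how often a count that large arises when the high-risk labels are placed at random. The Grimson method uses analytic formulas for the expected count and its standard deviation; the sketch below swaps in a simple permutation test instead, on a hypothetical 4 x 4 lattice of areas. Function names, the lattice and the choice of high-risk areas are all assumptions made for illustration.

```python
import random


def high_risk_joins(adjacency, high):
    """Count pairs of adjacent areas that are both flagged as high risk."""
    return sum(1 for a, b in adjacency if a in high and b in high)


def contiguity_permutation_test(adjacency, areas, high, n_perm=2000, seed=0):
    """Compare the observed join count with counts under random relabelling."""
    rng = random.Random(seed)
    observed = high_risk_joins(adjacency, high)
    sims = [
        high_risk_joins(adjacency, set(rng.sample(areas, len(high))))
        for _ in range(n_perm)
    ]
    expected = sum(sims) / n_perm
    # One-sided p value: share of relabellings with at least as many joins.
    p_upper = (1 + sum(s >= observed for s in sims)) / (n_perm + 1)
    return observed, expected, p_upper


# Hypothetical 4 x 4 lattice of areas; neighbours share an edge.
side = 4
areas = [(r, c) for r in range(side) for c in range(side)]
adjacency = [((r, c), (r, c + 1)) for r in range(side) for c in range(side - 1)]
adjacency += [((r, c), (r + 1, c)) for r in range(side - 1) for c in range(side)]

# Four high-risk areas forming a block in one corner.
high = {(0, 0), (0, 1), (1, 0), (1, 1)}

observed, expected, p_upper = contiguity_permutation_test(adjacency, areas, high)
print(f"Observed joins: {observed}, mean under relabelling: {expected:.2f}, p = {p_upper:.3f}")
```

With real data, `adjacency` would be derived from the county boundaries and `high` from the top class of the rate map.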
Point Processes
Why identify geographic patterns?
Geographic patterns range from completely clustered to
completely dispersed. A pattern between these extremes is said to
be random.
Knowing there is a pattern in your data is useful if you need to
gain a better understanding of a geographic phenomenon.
Spatial Processes questions
Is there any systematic pattern, or are my data distributed at random?
Possibilities: clustering, regularity
Scale of the clustering
The pattern is due to:
• natural variation in the population
• obvious a priori heterogeneity
• association with proximity to other features of interest
Are events that aggregate in space also clustered in time?
Is this clustered?
YES!
Is this clustered?
NO:
It's regularly dispersed
Is this clustered?
Maybe: (Complete Spatial Randomness)
Point processes
The process is stationary if the joint distribution of N(A) is
invariant to translation by an arbitrary amount x
The process is isotropic if the joint distribution of N(A) is
invariant to rotation through an arbitrary angle
Point processes
Theorem 2: For a homogeneous planar Poisson process
RANDOM UNIFORM CLUSTERED
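The three patterns can be imitated with simulated points and summarised with a simple global statistic. The sketch below is only an illustration under stated assumptions: points live in a hypothetical 10 x 10 square, the region is cut into a 5 x 5 grid of quadrats, and the variance-to-mean ratio of the quadrat counts is used as the summary (about 1 for a random pattern, below 1 for a uniform one, above 1 for a clustered one). The cluster centres and spread are arbitrary.

```python
import random


def quadrat_vmr(points, size, k):
    """Variance-to-mean ratio of point counts in a k x k grid over a square."""
    counts = [0] * (k * k)
    for x, y in points:
        col = min(int(x / size * k), k - 1)
        row = min(int(y / size * k), k - 1)
        counts[row * k + col] += 1
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return var / mean


rng = random.Random(1)
size, k = 10.0, 5

# RANDOM: 100 points placed independently and uniformly in the square.
random_pts = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(100)]

# UNIFORM: 100 points on a regular lattice.
uniform_pts = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]

# CLUSTERED: 100 points scattered tightly around three arbitrary centres.
centres = [(3.0, 3.0), (7.0, 5.0), (5.0, 9.0)]
clustered_pts = []
for _ in range(100):
    cx, cy = rng.choice(centres)
    x = min(max(rng.gauss(cx, 0.4), 0.0), size)
    y = min(max(rng.gauss(cy, 0.4), 0.0), size)
    clustered_pts.append((x, y))

vmr_random = quadrat_vmr(random_pts, size, k)
vmr_uniform = quadrat_vmr(uniform_pts, size, k)
vmr_clustered = quadrat_vmr(clustered_pts, size, k)
print(f"random: {vmr_random:.2f}  uniform: {vmr_uniform:.2f}  clustered: {vmr_clustered:.2f}")
```

The result depends on the quadrat size, which is one reason the scale of clustering has to be considered explicitly.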
Global and Local Tests
Cluster detection methods
Global (first-order) tests detect the presence or absence of
clustering over the whole study region without specifying the
spatial location.
Local (second-order) tests additionally specify the location and,
if extended to consider temporal patterns, can specify
spatio-temporal clusters.
A special case of local tests is the focussed test which is used
to detect raised incidence of disease around some pre-specified
source, such as an incinerator.
Choosing a method
There are two critical aspects: statistical power and confounding.
Methods that can control for known confounding effects should be
used in the first instance.
Statistical power is the ability to detect a real effect. The
literature reports both the ability of methods to identify true
clusters (true positives) and the frequency with which they report
clusters falsely (false positives).
Comparative evaluations of statistical power, often made by running
competing cluster methods against a set of simulated data with
known properties, can provide guidance in the choice and
application of particular methods.
Confounding is the erroneous attribution of a disease cluster to a
factor that is related both to an exposure and to a disease
outcome. Examples:
– Change in background population density
– Demographic factors such as age, gender or ethnicity
Maps for Sentinel Systems
Incidence of diarrhea in France, 1995
Cases / 100,000 population
Source: Réseau National Télématique des Maladies
Transmissibles
Distribution of Syndromic Influenza, France, Week 1-19, 2003
Interpretation of Significant Tests
The role of artefacts
– Errors…
The role of confounding
– Rates (time)
– Standardised rates (place)
The role of chance
– Statistical testing (place dependency)
True disease pattern
George W. Comstock
The art of epidemiological thinking is to draw conclusions
from imperfect data
Epi Map Mapping
Shapefiles + Data file (.mdb)
Slides based on a previous presentation by Paolo D’Ancona for
EPIET
Epi Map
GIS component of Epi-Info
Programmed with MapObjects language (Shapefiles), developed by ESRI (makers of ArcGIS® and ArcView)
Compatible with ArcView 3.x and ArcGIS
Shapefiles
GIS data set
– Data represented by coordinates
– Point, line, area (polygon)
– Attributes stored in separate dbf file
  • Names and other information
Structure of a simple dataset
Main file: xy.shp
Index file: xy.shx
dBASE file or Access: xy.dbf or xy.mdb
The project file
– In ArcView: xy.apr
– In Epi Map: xy.map
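Because a shapefile is really a set of files sharing one base name, a quick check that none is missing can save time before opening it in a mapping tool. The helper below is a hypothetical sketch, not part of Epi Map or any ESRI product; it takes the list of file names found in a folder and reports which of the three core components (.shp, .shx, .dbf) are absent.

```python
from pathlib import Path

# Core components of a shapefile data set, as listed on the slide.
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_components(shp_path, existing_names):
    """Return the required component file names that are not in existing_names.

    The comparison ignores letter case, since file names such as XY.DBF
    are common on Windows.
    """
    stem = Path(shp_path).stem
    present = {name.lower() for name in existing_names}
    return [stem + s for s in REQUIRED_SUFFIXES if (stem + s).lower() not in present]


# Hypothetical folder content: the index file is missing.
print(missing_components("xy.shp", ["xy.shp", "XY.DBF", "xy.map"]))
```

In practice `existing_names` could be built with `[p.name for p in Path(folder).iterdir()]`, where `folder` is the directory holding the data set.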
ESRI shapefile (EPI MAP)
Europe.shp: Polygons
Europe.dbf: Attributes (names), e.g. Greece, France
Europe.shx: File Structure
ESRI shapefile & Access data table (EPI MAP)
Europe.shp: Polygons
Europe.dbf: Attributes
Europe.shx: File Structure
Access data table
– Attributes to match (e.g. Greece, France)
– Count variables
– Other information (Size, Pop...)
Data file (.mdb)
Contains
– Geographical variable
  • To link data to specific features in shapefile
  • Unique relationship
– At least one numeric variable
  • Disease count, rate, etc.
Individual data
– Must be processed to produce a summary file
– Only one record per geographic entity
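The step from individual data to a summary file can be sketched as a simple aggregation. The line list, the population figures and the function name `summarise` below are hypothetical; the area names only echo the examples on the earlier slides. The output has exactly one record per geographic entity, keyed by the same name that would be used to match the shapefile attributes.

```python
from collections import Counter

# Hypothetical line list: one record per case, with the geographic variable "area".
cases = [
    {"id": 1, "area": "Greece"},
    {"id": 2, "area": "France"},
    {"id": 3, "area": "France"},
    {"id": 4, "area": "Greece"},
    {"id": 5, "area": "France"},
]

# Hypothetical populations (not real figures), keyed by the same area names.
population = {"Greece": 1_000_000, "France": 2_000_000, "Spain": 1_500_000}


def summarise(cases, population):
    """Build one record per area with the case count and the rate per 100,000."""
    counts = Counter(rec["area"] for rec in cases)
    unknown = set(counts) - set(population)
    if unknown:
        # A case that cannot be linked to an area would silently vanish from the map.
        raise ValueError(f"areas not found in population table: {sorted(unknown)}")
    return [
        {"area": a, "cases": counts.get(a, 0), "rate": counts.get(a, 0) * 100_000 / pop}
        for a, pop in sorted(population.items())
    ]


summary = summarise(cases, population)
for row in summary:
    print(row)
```

Areas with no cases still get a record with a count of zero, so that they are drawn on the map instead of appearing as missing data.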
= Epi-Info Map file
Dot density Choropleth