Spatio-Temporal Cluster Detection Using AMOEBA
Jimmy Kroon, Pennsylvania State University
Advisor: Dr. Frank Hardisty
Jan 20, 2016
This is a parody – Original Art: http://projectswordtoys.blogspot.com/2009/05/project-sword-annual-1967.html
Outline
• Introduction – Clustering and Project Direction
• The Spatial Scan Statistic and SaTScan
• AMOEBA
• Proposed Spatio-Temporal AMOEBA Method
• Software, Data, and Progress
Cluster Detection
Cluster: “a geographically and/or temporally bounded group of occurrences of sufficient size and concentration to be unlikely to have occurred by chance” (Knox, 1989)
Two Typical Uses
• Disease Surveillance: Week of 2/7/2010 (Data: Google Flu Trends; Analysis: GeoDa)
• Epidemiological Studies: Brain Cancer in NM (Kulldorff et al. 1998)
Clusters : SaTScan : AMOEBA : ST AMOEBA : Progress
Time in Spatial Analysis
Time Matters:
• Many geographic phenomena are dynamic.
• Spatial patterns we see probably change over time.
• The American Association of Geographers describes temporal geography as a ‘frontier’ of GIScience.
Spatio-temporal clusters may exhibit behaviors not seen in purely spatial clusters:
• Growth
• Movement
• Splits / Joins
Research Problem
Primary: No method exists for determining the true extent of irregularly shaped clusters in spatio-temporal datasets.
Secondary: Spatial AMOEBA has not been implemented in R.
Project Goals
• A demonstration of spatio-temporal cluster detection based on the AMOEBA procedure.
• R scripts for running spatial and spatio-temporal AMOEBA will be contributed to the R community.
The Spatial Scan Statistic
• Scan the data with a moving ‘window’, calculating a test statistic (a likelihood ratio in Kulldorff's formulation) for the spatial units that fall within the window.
• Select the window(s) with the highest statistic value as possible cluster(s).
• The spatial scan statistic is by far the most popular cluster detection technique, largely due to the availability of SaTScan software by Martin Kulldorff.
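The moving-window idea can be sketched in a few lines of Python. This is a minimal illustration, not SaTScan's implementation: it assumes circular windows centred on each unit's centroid, a Poisson model, and Kulldorff's log-likelihood ratio as the window score. The function names, the `max_fraction` cap, and the input layout are hypothetical.

```python
import math

def poisson_llr(c, e, total):
    """Log-likelihood ratio for c observed vs e expected cases in a window.

    Only windows with more cases than expected (c > e) score above zero.
    """
    if e <= 0 or c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # Guard the 0 * log(0) case when every case falls inside the window.
    outside = 0.0 if c == total else (total - c) * math.log((total - c) / (total - e))
    return inside + outside

def circular_scan(points, cases, population, max_fraction=0.5):
    """Return (best_score, centre_index, member_indices) over circular windows.

    points     : list of (x, y) centroids, one per spatial unit
    cases      : observed case count per unit
    population : population at risk per unit
    max_fraction is an assumed cap on the share of population in a window.
    """
    total_cases = sum(cases)
    total_pop = sum(population)
    best = (0.0, None, [])
    for centre, (cx, cy) in enumerate(points):
        # Grow the window by adding units in order of distance from the centre.
        order = sorted(range(len(points)),
                       key=lambda j: math.hypot(points[j][0] - cx, points[j][1] - cy))
        c = 0
        pop = 0
        for k, j in enumerate(order):
            c += cases[j]
            pop += population[j]
            if pop > max_fraction * total_pop:
                break
            expected = total_cases * pop / total_pop
            score = poisson_llr(c, expected, total_cases)
            if score > best[0]:
                best = (score, centre, order[:k + 1])
    return best
```

Calling `circular_scan` with unit centroids, case counts, and populations returns the single highest-scoring window; secondary clusters and significance testing are left out of this sketch.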
Drawbacks of the Spatial Scan Statistic
Clusters that are not similar in shape to the scanning window can produce errors:
• False inclusions
• False exclusions
• Thin clusters identified as multiple small clusters
• Holes in clusters cannot be detected
The Elliptical Spatial Scan Statistic
• Must choose shapes a priori to avoid pre-selection bias
See Kulldorff et al. 2006
AMOEBA
AMOEBA Clusters
• Ecotope-Based – Regions of contiguous spatial units that are related in terms of z-value
• Multidirectional – Search in all directions
• Optimum – Procedure takes place at the finest spatial scale possible and is capable of revealing all spatial association present in the dataset (Aldstadt and Getis, 2006)
AMOEBA
Defining an Ecotope
• Add a seed location (one polygon) to the ecotope
• Calculate Gi* (Getis-Ord local autocorrelation statistic)
• Search in all directions for contiguous polygons
• Those that increase Gi* are added to the growing ecotope for that seed location
• Keep searching for more neighbors, growing the ecotope until Gi* no longer increases
Repeat – creating ecotopes for each polygon in the dataset
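The growth loop above can be sketched in Python. This is a simplified illustration under stated assumptions, and may differ from the published procedure: Gi* is computed with binary weights over the current ecotope members, only one neighbor is added per step (the one that raises Gi* the most), and only high-value clusters are sought. The names `gi_star` and `grow_ecotope` and the dictionary-based neighbor list are hypothetical.

```python
import math

def gi_star(values, members):
    """Getis-Ord Gi* with binary weights: 1 for units in `members`, 0 otherwise."""
    n = len(values)
    k = len(members)
    mean = sum(values) / n
    variance = max(sum(v * v for v in values) / n - mean ** 2, 0.0)
    sd = math.sqrt(variance)
    if sd == 0 or k == n:
        return 0.0  # statistic is undefined; treat as no association
    numerator = sum(values[i] for i in members) - k * mean
    denominator = sd * math.sqrt((n * k - k * k) / (n - 1))
    return numerator / denominator

def grow_ecotope(seed, values, neighbors):
    """Grow an ecotope from `seed` until Gi* stops increasing.

    values    : list of attribute values, indexed by unit id
    neighbors : dict mapping unit id -> iterable of contiguous unit ids
    Returns (set_of_member_ids, final_gi_star).
    """
    ecotope = {seed}
    best = gi_star(values, ecotope)
    while True:
        # Units contiguous to the current ecotope but not yet inside it.
        frontier = {j for i in ecotope for j in neighbors[i]} - ecotope
        candidate, candidate_score = None, best
        for j in sorted(frontier):
            score = gi_star(values, ecotope | {j})
            if score > candidate_score:
                candidate, candidate_score = j, score
        if candidate is None:
            break  # no neighbor increases Gi*
        ecotope.add(candidate)
        best = candidate_score
    return ecotope, best
```

Running `grow_ecotope` once per polygon, with each polygon as the seed, yields one ecotope per spatial unit.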
The R Neighbor Object
Finding an Ecotope with AMOEBA [sequence of nine figure slides]
AMOEBA
From Ecotopes to Clusters
• Rank ecotopes by final Gi*
• Select the ecotope with the highest Gi* as a cluster
• Eliminate intersecting ecotopes
• Select the ecotope with the next highest Gi* as a second cluster
• Repeat
• Probability of clusters can be tested using Monte Carlo simulation
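The ranking and elimination steps above can be sketched as follows. This is an illustrative sketch, not the authors' code: ecotopes are assumed to arrive as `(members, gi_star)` pairs, and the Monte Carlo p-value uses the common rank-based estimate, where the simulated scores would come from re-running the whole procedure on randomly permuted data. Both function names are hypothetical.

```python
def select_clusters(ecotopes):
    """Pick non-intersecting ecotopes in descending order of final Gi*.

    ecotopes : list of (set_of_unit_ids, gi_star) pairs
    Returns the selected pairs, highest Gi* first.
    """
    ranked = sorted(ecotopes, key=lambda e: e[1], reverse=True)
    clusters = []
    claimed = set()  # units already assigned to a selected cluster
    for members, score in ranked:
        if members & claimed:
            continue  # intersects an earlier cluster, so it is eliminated
        clusters.append((members, score))
        claimed |= members
    return clusters

def monte_carlo_p(observed, simulated):
    """Rank-based p-value: share of scores (observed included) >= observed."""
    at_least = sum(1 for s in simulated if s >= observed)
    return (1 + at_least) / (1 + len(simulated))
```

For example, with 999 permutations, an observed Gi* that exceeds every simulated maximum gets p = 0.001.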
Incorporating Time into AMOEBA
Remember: spatio-temporal clusters may exhibit behaviors not seen in purely spatial clusters.
• Growth
• Movement
• Splits / Joins
Visualize temporal data as layers of data with time extending vertically through the layers.
• Each spatio-temporal unit has spatial neighbors and temporal neighbors
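One way to picture the layered structure is as a neighbor list keyed by (polygon, time step). The sketch below is an assumption about how such a list could be built, not the project's R or Python code: spatial neighbors are the contiguous polygons in the same layer, and temporal neighbors are taken to be the same polygon in the layers directly before and after. The function name and input layout are hypothetical.

```python
def build_st_neighbors(spatial_neighbors, n_steps):
    """Build a spatio-temporal neighbor list from a purely spatial one.

    spatial_neighbors : dict mapping polygon id -> list of contiguous polygon ids
    n_steps           : number of time layers
    Returns a dict mapping (polygon, t) -> list of neighboring (polygon, t) units.
    """
    st = {}
    for t in range(n_steps):
        for poly, nbrs in spatial_neighbors.items():
            # Spatial neighbors: contiguous polygons within the same layer.
            linked = [(q, t) for q in nbrs]
            # Temporal neighbors (assumed): the same polygon one layer down and up.
            if t > 0:
                linked.append((poly, t - 1))
            if t < n_steps - 1:
                linked.append((poly, t + 1))
            st[(poly, t)] = linked
    return st
```

With units keyed this way, an ecotope can grow sideways within a layer and up or down through time using the same contiguity search as the spatial case.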
The Spatio-Temporal Scan Statistic
See Kulldorff et al. 1998
Spatio-Temporal AMOEBA [sequence of thirteen figure slides]
Software Environment and Test Data
The R Project
• Free, open source statistical software
• Extendable with user contributed packages
• www.r-project.org
Google Flu Trends
• Estimates flu incidence levels using aggregated data about user searches for certain keywords
• 90% accurate compared to CDC data
• State-level data, updated daily
• www.google.org/googleflu
SEER (Surveillance, Epidemiology, and End Results)
• National Cancer Institute incidence, survival, and mortality data
AMOEBA ArcToolbox for ArcGIS Python Scripts by Jared Aldstadt and Yeming Fan (Aldstadt, 2010)
Google Flu Trends – Feb 1, 2009
Spatio-Temporal AMOEBA in Python: 2009 Flu Epidemic
Hmmm…
R Programming Progress
Complete …
• Geoprocessing tasks
• Create spatio-temporal neighbor list
• Delineate ecotopes
• Sort and eliminate intersecting ecotopes
• Returns primary cluster PolyIDs that match the Python results
To Do …
• Monte Carlo simulation
• Process results and add to the output shapefile
• Test, test, test
References
Aldstadt, Jared, and Arthur Getis. 2006. Using AMOEBA to Create a Spatial Weights Matrix and Identify Spatial Clusters. Geographical Analysis 38: 327-343.
Aldstadt, Jared. 2010. Spatial Analysis Tools (ArcGIS). Spatial Analysis Tools. http://www.acsu.buffalo.edu/~geojared/tools.htm.
Bellec, S, D Hémon, J Rudant, A Goubin, and J Clavel. 2006. Spatial and space–time clustering of childhood acute leukaemia in France from 1990 to 2000: a nationwide study. British Journal of Cancer
Duczmal, Luiz, Martin Kulldorff, and Lan Huang. 2006. Evaluation of Spatial Scan Statistics for Irregularly Shaped Clusters. Journal of Computational and Graphical Statistics 15(2): 428-442.
Knox, G. 1989. Detection of Clusters. In Methodology of Enquiries into Disease Clustering, ed. P Elliott, 17-22. London: Small Area Health Statistics Unit.
Kulldorff, Martin, William Athas, Eric Feuer, Barry Miller, and Charles Key. 1998. Evaluating cluster alarms: A space-time scan statistic and brain cancer in Los Alamos, New Mexico. American Journal of Public Health 88(9): 1377-1380.
Kulldorff, Martin, Lan Huang, Linda Pickle, and Luiz Duczmal. 2006. An elliptic spatial scan statistic. Statistics in Medicine 25(22): 3929.
Kulldorff, Martin. 1999. Geographic Information Systems (GIS) and community health: Some statistical issues. Journal of Public Health Management and Practice 5(2): 100-106.
Original artwork for parody title slide: http://projectswordtoys.blogspot.com/2009/05/project-sword-annual-1967.html