Clusters : SaTScan : AMOEBA

Spatio-Temporal Cluster Detection
Jimmy Kroon
Pennsylvania State University
Advisor: Dr. Frank Hardisty
This is a parody – Original Art:
Introduction – Clustering and Project Direction
The Spatial Scan Statistic and SaTScan
Proposed Spatio-Temporal AMOEBA Method
Software, Data, and Progress
Cluster Detection
“a geographically and/or temporally bounded group of
occurrences of sufficient size and concentration to be unlikely
to have occurred by chance” (Knox, 1989)
Two Typical Uses
Disease Surveillance
Epidemiological Studies
Week of 2/7/2010
Data: Google Flu Trends – Analysis: GeoDa
Brain Cancer in NM
Kulldorff et al. 1998
Clusters : SaTScan : AMOEBA : ST AMOEBA : Progress
Time in Spatial Analysis
Time Matters:
• Many geographic phenomena are dynamic.
• Spatial patterns we see probably change over time
• The American Association of Geographers describes temporal
geography as a ‘frontier’ of GIScience.
Spatio-temporal clusters may exhibit behaviors not seen in purely
spatial clusters.
• Growth
• Movement
• Splits / Joins
Research Problem
No method exists for determining the true extent of
irregularly shaped clusters in spatio-temporal datasets.
Secondary: Spatial AMOEBA has not been implemented in R
Project Goals
A demonstration of spatio-temporal cluster detection
based on the AMOEBA procedure.
R scripts for running spatial and spatio-temporal
AMOEBA will be contributed to the R community.
The Spatial Scan Statistic
Scan the data with a moving ‘window’, calculating a likelihood ratio that
compares the rate inside the window with the rate outside it.
Select the window(s) with the highest likelihood ratio as possible
cluster(s).
The spatial scan statistic is by far the most popular cluster detection
technique, largely due to the availability of SaTScan software by Martin Kulldorff.
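A minimal sketch of a circular scan may make the window idea concrete. It assumes the Poisson model, in which each window is scored by a log likelihood ratio of observed to expected cases. The function names, the centroid-based windows, and the 50% population cap are illustrative assumptions; this is not SaTScan's code.

```python
import math

def poisson_llr(cases_in, expected_in, total_cases):
    """Log likelihood ratio for one window; 0 when the window has no excess."""
    if expected_in <= 0 or cases_in <= expected_in:
        return 0.0
    inside = cases_in * math.log(cases_in / expected_in)
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    outside = cases_out * math.log(cases_out / expected_out) if cases_out > 0 else 0.0
    return inside + outside

def circular_scan(coords, cases, population, max_pop_fraction=0.5):
    """Grow a circle around every centroid and keep the best-scoring window."""
    total_cases = sum(cases)
    total_pop = sum(population)
    best_llr, best_members = 0.0, set()
    for center in coords:
        # Units ordered by distance from the current window center.
        order = sorted(range(len(coords)),
                       key=lambda i: math.dist(center, coords[i]))
        members, c, p = set(), 0, 0
        for i in order:
            c += cases[i]
            p += population[i]
            if p > max_pop_fraction * total_pop:
                break  # window would exceed the assumed population cap
            members.add(i)
            expected = total_cases * p / total_pop
            llr = poisson_llr(c, expected, total_cases)
            if llr > best_llr:
                best_llr, best_members = llr, set(members)
    return best_llr, best_members

# Hypothetical data: five units on a line, cases concentrated in the first two.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
cases = [10, 10, 1, 1, 1]
population = [100, 100, 100, 100, 100]
llr, members = circular_scan(coords, cases, population)
print(sorted(members), round(llr, 2))
```

Because every candidate window is a circle, the result can only approximate clusters of other shapes.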
Drawbacks of the Spatial Scan Statistic
Clusters that are not similar in shape to the scanning window can
produce errors.
• False inclusions
• False exclusions
• Thin clusters identified as multiple small clusters
• Holes in clusters not detected
The Elliptical Spatial Scan Statistic
Must choose shapes a priori to avoid pre-selection bias
See Kulldorff et al. 2006
AMOEBA Clusters
AMOEBA: A Multidirectional Optimum Ecotope-Based Algorithm
Ecotope-Based – Regions of contiguous spatial units that are related
in terms of z-value
Multidirectional – Search in all directions.
Optimum – Procedure takes place at the finest spatial scale possible
and is capable of revealing all spatial association present in the
dataset (Aldstadt and Getis, 2006).
Defining an Ecotope
Add a seed location (one polygon) to the ecotope
Calculate Gi* (Getis-Ord local autocorrelation statistic)
Search in all directions for contiguous polygons
Those that increase Gi* are added to the growing ecotope for that
seed location
Keep searching for more neighbors, growing the ecotope until
Gi* no longer increases
Repeat – creating ecotopes for each polygon in the dataset
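The steps above can be sketched in a few lines of Python. This is a simplified, greedy illustration rather than the published procedure: it adds one contiguous unit at a time (whichever raises Gi* the most) and only searches for high-value ecotopes. The Gi* form used here assumes binary weights, with every member of the ecotope weighted 1. The names `gi_star` and `grow_ecotope` are hypothetical.

```python
import math

def gi_star(values, members):
    """Getis-Ord Gi* for a set of units, assuming binary weights."""
    n = len(values)
    k = len(members)
    mean = sum(values) / n
    variance = max(sum(v * v for v in values) / n - mean ** 2, 0.0)
    denom = math.sqrt(variance) * math.sqrt((n * k - k * k) / (n - 1))
    if denom == 0:
        return 0.0  # constant data, or the ecotope covers every unit
    return (sum(values[i] for i in members) - mean * k) / denom

def grow_ecotope(values, neighbors, seed):
    """Grow an ecotope from a seed until Gi* no longer increases."""
    ecotope = {seed}
    best = gi_star(values, ecotope)
    while True:
        # Contiguous units not yet in the ecotope.
        frontier = {j for i in ecotope for j in neighbors[i]} - ecotope
        if not frontier:
            break
        score, candidate = max((gi_star(values, ecotope | {j}), j) for j in frontier)
        if score <= best:
            break  # no neighbor increases Gi*
        ecotope.add(candidate)
        best = score
    return ecotope, best

# Hypothetical data: six polygons in a row, high values in the middle.
values = [1, 1, 9, 10, 9, 1]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
ecotopes = {seed: grow_ecotope(values, neighbors, seed) for seed in neighbors}
print(ecotopes[3])
```

Running the loop once per seed yields one ecotope per polygon, matching the final step on the slide.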
The R Neighbor Object
Finding an Ecotope with AMOEBA
From Ecotopes to Clusters
Rank ecotopes by final Gi*
Select that with the highest Gi* as a cluster
Eliminate intersecting ecotopes
Select the ecotope with the next highest Gi* as a second cluster
Probability of clusters can be tested using Monte Carlo
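The ranking and elimination steps, plus one common form of Monte Carlo test, might look like the sketch below. The function names are hypothetical. The permutation scheme (shuffling values across locations and recomputing a caller-supplied statistic) is an assumption about how the test could be run, not a description of the published procedure.

```python
import random

def select_clusters(ecotopes):
    """Pick non-overlapping ecotopes in descending order of final Gi*.

    `ecotopes` is a list of (member_set, score) pairs.
    """
    ranked = sorted(ecotopes, key=lambda e: e[1], reverse=True)
    selected, used = [], set()
    for members, score in ranked:
        if used.isdisjoint(members):  # skip ecotopes intersecting a chosen cluster
            selected.append((members, score))
            used |= members
    return selected

def monte_carlo_p_value(values, observed, statistic, n_sim=999, seed=0):
    """Share of random permutations scoring at least as high as `observed`.

    `statistic` is any function mapping a list of values to a score,
    for example the best Gi* found after re-running the ecotope search.
    """
    rng = random.Random(seed)
    shuffled = list(values)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)

# Hypothetical ecotopes: two of them overlap a stronger one.
candidates = [({1, 2, 3}, 2.5), ({3, 4}, 2.0), ({5, 6}, 1.5), ({6, 7}, 1.0)]
print(select_clusters(candidates))
```

Adding 1 to both the numerator and the denominator counts the observed data as one of the simulations, so the p-value can never be exactly zero.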
Incorporating Time into AMOEBA
Remember: spatio-temporal clusters may exhibit behaviors
not seen in purely spatial clusters.
• Growth
• Movement
• Splits / Joins
Visualize temporal data as stacked layers, with time extending
vertically through the layers.
• Each spatio-temporal unit has spatial neighbors and temporal neighbors
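One way to picture the layered structure is as a neighbor list keyed by (time, unit) pairs. The sketch below assumes that a unit's temporal neighbors are the same polygon in the layers directly before and after it, and that diagonal links (a different polygon in a different layer) are excluded. The slides do not state these choices, and the function name is hypothetical.

```python
def space_time_neighbors(spatial_neighbors, n_periods):
    """Build a neighbor list for stacked time layers.

    `spatial_neighbors` maps each unit id to its contiguous units
    within a single layer. Keys of the result are (time, unit) pairs.
    """
    st_neighbors = {}
    for t in range(n_periods):
        for unit, adjacent in spatial_neighbors.items():
            # Spatial neighbors: contiguous units in the same layer.
            linked = [(t, j) for j in adjacent]
            # Temporal neighbors: the same unit one layer down and one layer up.
            if t > 0:
                linked.append((t - 1, unit))
            if t < n_periods - 1:
                linked.append((t + 1, unit))
            st_neighbors[(t, unit)] = linked
    return st_neighbors

# Hypothetical example: three polygons in a row, observed over two periods.
spatial = {0: [1], 1: [0, 2], 2: [1]}
for key, linked in space_time_neighbors(spatial, 2).items():
    print(key, linked)
```

With a neighbor list in this form, an ecotope search written for purely spatial data can grow through time as well as space without changing its logic.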
The Spatio-Temporal Scan Statistic
See Kulldorff et al. 1998
Spatio-Temporal AMOEBA
Software Environment and Test Data
The R Project
• Free, open source statistical software
• Extendable with user contributed packages
Google Flu Trends
• Estimates flu incidence levels using aggregated data about user
searches for certain keywords
• 90% accurate compared to CDC data
• State-level data - updated daily
SEER (Surveillance Epidemiology and End Results)
• National Cancer Institute incidence, survival, and mortality data
AMOEBA ArcToolbox for ArcGIS
Python Scripts by Jared Aldstadt and Yeming Fan (Aldstadt, 2010)
Google Flu Trends – Feb 1, 2009
Spatio-Temporal AMOEBA in Python: 2009 Flu Epidemic
R Programming Progress
Complete …
Geoprocessing tasks
• Create spatio-temporal neighbor list
• Delineate ecotopes
• Sort ecotopes and eliminate intersecting ones
• Returns primary cluster PolyIDs that match the Python results
To Do …
• Monte Carlo simulation
• Process results and add to the output
• Test, test, test
Aldstadt, Jared, and Arthur Getis. 2006. Using AMOEBA to Create a Spatial Weights Matrix and
Identify Spatial Clusters. Geographical Analysis 38: 327-343.
Aldstadt, Jared. 2010. Spatial Analysis Tools (ArcGIS). Spatial Analysis Tools.
Bellec, S, D Hémon, J Rudant, A Goubin, and J Clavel. 2006. Spatial and space–time clustering of
childhood acute leukaemia in France from 1990 to 2000: a nationwide study. British Journal of
Duczmal, Luiz, Martin Kulldorff, and Lan Huang. 2006. Evaluation of Spatial Scan Statistics for
Irregularly Shaped Clusters. Journal of Computational and Graphical Statistics 15(2): 428-442.
Knox, G. 1989. Detection of Clusters. In Methodology of Enquiries into Disease Clustering, ed. P
Elliott, 17-22. London: Small Area Health Statistics Unit.
Kulldorff, Martin, William Athas, Eric Feuer, Barry Miller, and Charles Key. 1998. Evaluating
cluster alarms: A space-time scan statistic and brain cancer in Los Alamos, New Mexico.
American Journal of Public Health 88(9): 1377-1380.
Kulldorff, Martin, Lan Huang, Linda Pickle, and Luiz Duczmal. 2006. An elliptic spatial scan statistic.
Statistics in Medicine 25(22): 3929.
Kulldorff, Martin. 1999. Geographic Information Systems (GIS) and community health: Some statistical
issues. Journal of Public Health Management and Practice 5(2): 100-106.
Original artwork for parody title slide: