Clusters : SaTScan : AMOEBA

Spatio – Temporal
Cluster Detection
Using AMOEBA
Jimmy Kroon
Pennsylvania State University
Advisor: Dr. Frank Hardisty
This is a parody – Original Art: http://projectswordtoys.blogspot.com/2009/05/project-sword-annual-1967.html
Outline
• Introduction – Clustering and Project Direction
• The Spatial Scan Statistic and SaTScan
• AMOEBA
• Proposed Spatio-Temporal AMOEBA Method
• Software, Data, and Progress
Cluster Detection
Cluster:
“a geographically and/or temporally bounded group of
occurrences of sufficient size and concentration to be unlikely
to have occurred by chance” (Knox, 1989)
Two Typical Uses
Disease Surveillance
Epidemiological Studies
Week of 2/7/2010
Data: Google Flu Trends – Analysis: GeoDa
Brain Cancer in NM
Kulldorff et al. 1998
Clusters : SaTScan : AMOEBA : ST AMOEBA : Progress
Time in Spatial Analysis
Time Matters:
• Many geographic phenomena are dynamic.
• Spatial patterns we see probably change over time
• The American Association of Geographers describes temporal
geography as a ‘frontier’ of GIScience.
Spatio-temporal clusters may exhibit behaviors not seen in purely
spatial clusters.
• Growth
• Movement
• Splits / Joins
Research Problem
Primary:
No method exists for determining the true extent of
irregularly shaped clusters in spatio-temporal datasets.
Secondary: Spatial AMOEBA has not been implemented in R
Project Goals
• A demonstration of spatio-temporal cluster detection based on the AMOEBA procedure.
• R scripts for running spatial and spatio-temporal AMOEBA will be contributed to the R community.
The Spatial Scan Statistic
• Scan the data with a moving ‘window’, calculating a likelihood ratio that compares the rate inside the window with the rate outside it.
• Select the window(s) with the highest likelihood ratio as possible cluster(s).
• The spatial scan statistic is by far the most popular cluster detection technique, largely due to the availability of SaTScan software by Martin Kulldorff.
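The moving-window idea can be sketched in a few lines of Python. This is only an illustration, not SaTScan's implementation: it assumes circular windows centred on each unit, a Poisson model with expected cases proportional to population, and a window cap of 50% of the total population. The function names (`poisson_llr`, `circular_scan`) and the cap parameter are hypothetical.

```python
import math

def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected cases."""
    if c <= e:
        return 0.0  # only excess (high-rate) windows are scored
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if total > c else 0.0
    return inside + outside

def circular_scan(points, cases, population, max_pop_fraction=0.5):
    """Return (best_llr, member_indices) over circles centred on each unit.

    points: list of (x, y) centroids; cases, population: per-unit counts.
    """
    total_cases = sum(cases)
    total_pop = sum(population)
    best = (0.0, [])
    for cx, cy in points:
        # Order units by distance from the centre; each prefix is one circle.
        order = sorted(
            range(len(points)),
            key=lambda j: math.hypot(points[j][0] - cx, points[j][1] - cy),
        )
        c = pop = 0
        for k, j in enumerate(order):
            c += cases[j]
            pop += population[j]
            if pop > max_pop_fraction * total_pop:
                break  # window has grown past the assumed size cap
            e = total_cases * pop / total_pop  # expected cases under the null
            llr = poisson_llr(c, e, total_cases)
            if llr > best[0]:
                best = (llr, sorted(order[: k + 1]))
    return best

# Six units in a row, with a two-unit excess in the middle.
points = [(float(x), 0.0) for x in range(6)]
llr, members = circular_scan(points, [1, 1, 10, 10, 1, 1], [100] * 6)
print(members, round(llr, 2))
```

Because every candidate window here is a circle, a long or curved cluster can only be approximated by one or more circles, which is the limitation the following slides discuss.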
Drawbacks of the Spatial Scan Statistic
Clusters that are not similar in shape to the scanning window can
produce errors.
• False inclusions
• False exclusions
• Identify thin clusters as multiple small clusters
• Cannot detect holes in clusters
The Elliptical Spatial Scan Statistic
• Must choose shapes a priori to avoid pre-selection bias
See Kulldorff et al. 2006
AMOEBA
• Ecotope-Based – Regions of contiguous spatial units that are related in terms of z-value
• Multidirectional – Search in all directions
• Optimum – Procedure takes place at the finest spatial scale possible and is capable of revealing all spatial association present in the dataset (Aldstadt and Getis, 2006)
AMOEBA Clusters
AMOEBA
Defining an Ecotope
• Add a seed location (one polygon) to the ecotope
• Calculate Gi* (Getis-Ord local autocorrelation statistic)
• Search in all directions for contiguous polygons
• Those that increase Gi* are added to the growing ecotope for that seed location
• Keep searching for more neighbors, growing the ecotope until Gi* no longer increases
Repeat – creating ecotopes for each polygon in the dataset
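The steps above can be sketched in Python. This is a simplified illustration, not the published AMOEBA code: it assumes binary weights in Gi*, a neighbor list stored as a dictionary of unit index to adjacent unit indices, and a growth rule that ranks the contiguous candidates by value and adds the top-ranked group that raises the statistic the most. The function names `gi_star` and `grow_ecotope` are hypothetical.

```python
import math

def gi_star(values, members):
    """Getis-Ord Gi* for a set of units, using binary (0/1) weights."""
    n = len(values)
    m = len(members)
    mean = sum(values) / n
    sd = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    if sd == 0 or m == n:
        return 0.0  # statistic is undefined; treat as no association
    numerator = sum(values[j] for j in members) - mean * m
    denominator = sd * math.sqrt((n * m - m * m) / (n - 1))
    return numerator / denominator

def grow_ecotope(seed, values, neighbors):
    """Grow an ecotope from one seed until Gi* stops becoming more extreme."""
    members = {seed}
    current = gi_star(values, members)
    # High-value seeds grow toward larger Gi*, low-value seeds toward smaller.
    sign = 1.0 if current >= 0 else -1.0
    while True:
        frontier = {j for i in members for j in neighbors[i]} - members
        if not frontier:
            break
        # Most promising candidates first (ties broken by unit index).
        ranked = sorted(frontier, key=lambda j: (-sign * values[j], j))
        best_score, best_k = sign * current, 0
        for k in range(1, len(ranked) + 1):
            score = sign * gi_star(values, members | set(ranked[:k]))
            if score > best_score:
                best_score, best_k = score, k
        if best_k == 0:
            break  # no group of contiguous units improves the statistic
        members |= set(ranked[:best_k])
        current = sign * best_score
    return members, current

# Six units in a chain; units 2 and 3 carry the high values.
values = [1, 1, 10, 10, 1, 1]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
ecotopes = {seed: grow_ecotope(seed, values, neighbors) for seed in neighbors}
print(ecotopes[2])
```

Running the function once per unit, as in the last lines, gives one ecotope per seed, which is the input the ranking step needs.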
The R Neighbor Object
Finding an Ecotope with AMOEBA
AMOEBA
From Ecotopes to Clusters
• Rank ecotopes by final Gi*
• Select that with the highest Gi* as a cluster
• Eliminate intersecting ecotopes
• Select the ecotope with the next highest Gi* as a second cluster
• Repeat
• Probability of clusters can be tested using Monte Carlo simulation
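A minimal Python sketch of the ranking and elimination steps, followed by a rank-based Monte Carlo p-value. It assumes each ecotope is stored as a (Gi*, set of unit ids) pair and that only high-value clusters are ranked; low-value clusters would need the absolute value of Gi*. How the simulated statistics are generated (for example by permuting the data and re-running the procedure) is left out. Both function names are hypothetical.

```python
def select_clusters(ecotopes):
    """Pick non-overlapping ecotopes in descending order of final Gi*.

    ecotopes: list of (gi_star, set_of_unit_ids) pairs, one per seed.
    """
    clusters = []
    claimed = set()  # units already assigned to a selected cluster
    for score, members in sorted(ecotopes, key=lambda e: e[0], reverse=True):
        # An ecotope that intersects an earlier, stronger cluster is dropped.
        if claimed.isdisjoint(members):
            clusters.append((score, members))
            claimed |= members
    return clusters

def monte_carlo_p_value(observed, simulated):
    """Share of runs (observed included) at least as extreme as observed."""
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)

ecotopes = [(1.6, {1, 2, 3}), (2.2, {2, 3}), (0.9, {5}), (1.1, {0, 1})]
print(select_clusters(ecotopes))
print(monte_carlo_p_value(2.2, [0.5, 1.0, 2.5, 1.2]))
```

With 999 simulated datasets, the smallest p-value this estimator can return is 0.001.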
Incorporating Time into AMOEBA
Remember - Spatio-temporal
clusters may exhibit behaviors
not seen in purely spatial clusters.
• Growth
• Movement
• Splits / Joins
Visualize temporal data as a stack of layers, with time extending vertically through the stack.
• Each spatio-temporal unit has spatial neighbors and temporal neighbors
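The layered view can be sketched as a neighbor list keyed by (unit, time period). This is an illustration under stated assumptions: the slide does not define temporal neighbors, so the sketch assumes they are the same unit in the periods immediately before and after, and that spatial adjacency is the same in every period. The function name `spatio_temporal_neighbors` is hypothetical.

```python
def spatio_temporal_neighbors(spatial_neighbors, n_periods):
    """Map each (unit, t) pair to its spatial and temporal neighbors.

    spatial_neighbors: dict of unit id -> list of adjacent unit ids.
    n_periods: number of time layers, indexed 0 .. n_periods - 1.
    """
    st = {}
    for t in range(n_periods):
        for unit, adjacent in spatial_neighbors.items():
            # Spatial neighbors: adjacent units within the same time layer.
            links = [(other, t) for other in adjacent]
            # Temporal neighbors (assumed): the same unit one layer down and up.
            if t > 0:
                links.append((unit, t - 1))
            if t < n_periods - 1:
                links.append((unit, t + 1))
            st[(unit, t)] = links
    return st

spatial = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
st = spatio_temporal_neighbors(spatial, n_periods=3)
print(st[("B", 1)])
```

A list built this way can be searched like a purely spatial one, so an ecotope may grow across layers as well as within them.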
The Spatio-Temporal Scan Statistic
See Kulldorff et al. 1998
Spatio-Temporal AMOEBA
Software Environment and Test Data
The R Project
• Free, open source statistical software
• Extendable with user contributed packages
• www.r-project.org
Google Flu Trends
• Estimates flu incidence levels using aggregated data about user
searches for certain keywords
• 90% accurate compared to CDC data
• State-level data - updated daily
• www.google.org/googleflu
SEER (Surveillance Epidemiology and End Results)
• National Cancer Institute incidence, survival, and mortality data
AMOEBA ArcToolbox for ArcGIS
Python Scripts by Jared Aldstadt and Yeming Fan (Aldstadt, 2010)
Google Flu Trends – Feb 1, 2009
Spatio-Temporal AMOEBA in Python: 2009 Flu Epidemic
Hmmm…
R Programming Progress
Complete …
Geoprocessing tasks
• Create spatio-temporal neighbor list
• Delineate ecotopes
• Sort and eliminate intersecting ecotopes
• Returns primary cluster PolyIDs that match the Python results
To Do …
• Monte Carlo simulation
• Process results and add to the output shapefile
• Test, test, test
References
Aldstadt, Jared, and Arthur Getis. 2006. Using AMOEBA to Create a Spatial Weights Matrix and
Identify Spatial Clusters. Geographical Analysis 38: 327-343.
Aldstadt, Jared. 2010. Spatial Analysis Tools (ArcGIS). Spatial Analysis Tools.
http://www.acsu.buffalo.edu/~geojared/tools.htm.
Bellec, S, D Hémon, J Rudant, A Goubin, and J Clavel. 2006. Spatial and space–time clustering of
childhood acute leukaemia in France from 1990 to 2000: a nationwide study. British Journal of
Cancer.
Duczmal, Luiz, Martin Kulldorff, and Lan Huang. 2006. Evaluation of Spatial Scan Statistics for
Irregularly Shaped Clusters. Journal of Computational and Graphical Statistics 15(2): 428-442.
Knox, G. 1989. Detection of Clusters. In Methodology of Enquiries into Disease Clustering, ed. P
Elliott, 17-22. London: Small Area Health Statistics Unit.
Kulldorff, Martin, William Athas, Eric Feuer, Barry Miller, and Charles Key. 1998. Evaluating
cluster alarms: A space-time scan statistic and brain cancer in Los Alamos, New Mexico.
American Journal of Public Health 88(9): 1377-1380.
Kulldorff, Martin, Lan Huang, Linda Pickle, and Luiz Duczmal. 2006. An elliptic spatial scan statistic.
Statistics in Medicine 25(22): 3929.
Kulldorff, Martin. 1999. Geographic Information Systems (GIS) community health: Some statistical
issues. Journal of Public Health Management and Practice 5(2): 100-106.
Original artwork for parody title slide: http://projectswordtoys.blogspot.com/2009/05/project-sword-annual-1967.html