
Center for Statistical Ecology
and Environmental Statistics
Surveillance Hotspots Systems for Digital Government
By G. P. Patil1, R. Acharya2, R. Modarres3, W. L. Myers4, and S. L. Rathbun5
1 Center for Statistical Ecology and Environmental Statistics, Department of Statistics, Penn State University
2 Department of Computer Science and Engineering, Penn State University
3 Department of Statistics, George Washington University
4 School of Forest Resources and Office for Remote Sensing and Spatial Information Resources, Penn State Institutes of Environment, Penn State University
5 Department of Health Administration, Biostatistics and Epidemiology, University of Georgia
This material is based upon work supported by the National Science Foundation under Grant No. 0307010.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the
author(s) and do not necessarily reflect the views of the National Science Foundation.
This project is funded, in part, under a grant with the Pennsylvania Department of Health using Tobacco
Settlement Funds. The Department specifically disclaims responsibility for any analyses, interpretations
or conclusions.
[Invited paper in preparation for Encyclopedia of Digital Government]
Technical Report Number 2005-0203
TECHNICAL REPORTS AND REPRINTS SERIES
February 2005
Department of Statistics
The Pennsylvania State University
University Park, PA 16802
G. P. Patil
Distinguished Professor and Director
Tel: (814)865-9442 Fax: (814)865-1278
Email: gpp@stat.psu.edu
http://www.stat.psu.edu/~gpp
http://www.stat.psu.edu/hotspots
I N T R O D U C T I O N
Geoinformatic surveillance for spatial and temporal hotspot detection and
prioritization is crucial in the 21st century. A hotspot may be any unusual
phenomenon, anomaly, aberration, outbreak, elevated cluster, or critical area.
Government agencies require hotspot delineation and prioritization for
monitoring, etiology, management, or early warning. Responsible factors may
be natural, accidental or intentional, with relevance to both infrastructure and
security.
This article describes multi-disciplinary research based on novel methods
for hotspot detection and prioritization, driven by a diverse variety of case
studies of interest to agencies, academia, and the private sector. These case
studies concern critical societal issues, such as public health, ecosystem
health, biodiversity and threats to biodiversity, emerging infectious disease,
water management and conservation, carbon sources and sinks, persistent
poverty, environmental justice, crop pathogens, invasive species
management, biosurveillance, biosecurity, disease biogeoinformatics, social
networks, sensor networks, hospital networks and syndromic surveillance,
video mining, early warning, tsunami inundation, remote sensing, and
disaster management.
Our approach involves an innovation of the popular circle-based
spatial scan statistic. In particular, it employs the notion of an upper level set
and is accordingly called the upper level set scan statistic system, pointing to
the next generation of sophisticated analytical and computational systems,
effective for the detection of arbitrarily shaped hotspots along
spatiotemporal dimensions. It also involves a novel prioritization scheme that
uses Hasse diagrams and partially ordered sets to work with multiple
indicators and stakeholder criteria, without having to reduce the indicators
to a single index. It is accordingly called the poset prioritization and
ranking system. See Patil and Taillie (2004a, 2004b).
The following websites have additional information:
(1) http://www.stat.psu.edu/hotspots/
(2) http://www.stat.psu.edu/~gpp/
(3) http://www.digitalgovernment.org/news/stories/2004/1104/1104_hotspots_heyman.jsp
U P P E R   L E V E L   S E T   S C A N   S T A T I S T I C   H O T S P O T   S Y S T E M
Patil and Taillie (2004a, 2004b) introduce an innovation of the circle-based
spatial and spatiotemporal scan statistic that is popular in the health area.
It employs the notion of an upper level set, and is accordingly called the
upper level set (ULS) scan statistic, pointing to a sophisticated analytical
and computational system as the next generation of the present-day popular
SaTScan (Kulldorff and Nagarwalla, 1995; Kulldorff, 1997; Kulldorff et al.,
1998; Kulldorff, 2001; Mostashari et al., 2002; Waller, 2002).
Fig. 1. Limitations of circular scanning windows. (Left) An irregularly shaped cluster,
perhaps a cholera outbreak along a winding river floodplain. Small circles miss much of
the outbreak and large circles include many unwanted cells. (Right) Circular windows
may report a single irregularly shaped cluster as a series of small clusters.
Background Theory of Scan Statistics
The spatial scan statistic concerns the following situation: A region $R$ of
Euclidean space is tessellated or subdivided into cells, which will be denoted
by the symbol $a$. Data are available in the form of a count $Y_a$ on each cell $a$.
In addition, a “size” value $A_a$ is associated with each cell. The cell sizes $A_a$
are regarded as fixed and known, while the cell counts $Y_a$ are independent
random variables. Two distributional settings are commonly studied:
• Binomial: The size $A_a = N_a$ is a positive integer and
$Y_a \sim \mathrm{Binomial}(N_a, p_a)$, where $p_a$ is an unknown parameter
attached to cell $a$ with $0 < p_a < 1$.
• Poisson: The size $A_a$ is a positive real number and
$Y_a \sim \mathrm{Poisson}(\lambda_a A_a)$, where $\lambda_a > 0$ is an unknown
parameter attached to cell $a$.
Each distributional model has a simple interpretation. For the
binomial, $N_a$ people reside in cell $a$ and each person contracts a certain
disease independently with probability $p_a$. The cell count $Y_a$ is the number of
diseased people. For the Poisson, $A_a$ is the size (e.g., area or some adjusted
population size) of the cell $a$, and $Y_a$ is a realization of a Poisson process
with intensity $\lambda_a$. In each scenario, the responses $Y_a$ are independent; it is
assumed that spatial variability can be accounted for by cell-to-cell variation
in model parameters.
The spatial scan statistic seeks to identify “hotspots” or “clusters” of cells
having an elevated response with respect to the remainder of the region.
Elevated response means large values for the rates (or intensities),
$$G_a = Y_a / A_a,$$
instead of the raw counts $Y_a$. The scan statistic easily accommodates other
adjustments, such as for age or gender.
A collection of cells from the tessellation should satisfy several geometric
properties before it can be considered as a candidate for a hotspot cluster.
First, the union of the cells should comprise a geographically connected
subset of the region $R$ (Fig. 2). Such collections of connected cells will be
referred to as zones $Z$, and the set of all zones is denoted by $\Omega$. Second, the
zone should not be excessively large. Otherwise, the zone instead of its
exterior would constitute background. This restriction is generally achieved
by limiting the search for hotspots to zones comprising less than, say, fifty
percent of the region.
Fig. 2. A tessellated region. The collection of shaded cells in the left-hand diagram is
connected and, therefore, constitutes a zone in $\Omega$. The collection on the right is not
connected.
The notion of a hotspot is inherently vague and lacks any a priori
definition. There is no “true” hotspot in the statistical sense of a true
parameter value. A hotspot is instead defined by its estimate, provided the
estimate is statistically significant. To this end, the scan statistic adopts a
hypothesis testing model in which the hotspot occurs as an unknown zonal
parameter in the statement of the alternative hypothesis.
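The article does not spell out the test statistic. One common choice under the Poisson setting is the log likelihood ratio of the circular scan statistic (Kulldorff, 1997), and the sketch below assumes that form. The function name `poisson_llr` and the toy numbers are illustrative assumptions, not part of the system described here.

```python
import math

def poisson_llr(y_zone, a_zone, y_total, a_total):
    """Log likelihood ratio for an elevated rate inside a candidate zone.

    Assumes the Poisson model with a common rate under the null hypothesis,
    so the expected zone count is proportional to the zone size.
    """
    expected = y_total * a_zone / a_total   # expected count inside the zone
    y_out = y_total - y_zone                # observed count outside the zone
    e_out = y_total - expected              # expected count outside the zone
    # Only zones whose rate exceeds the outside rate are evidence of a hotspot.
    if y_zone * e_out <= y_out * expected:
        return 0.0

    def term(observed, exp):
        return observed * math.log(observed / exp) if observed > 0 else 0.0

    return term(y_zone, expected) + term(y_out, e_out)

# Toy example: a zone holding 10% of the total size but 30% of the cases.
elevated = poisson_llr(y_zone=30, a_zone=10.0, y_total=100, a_total=100.0)
# A zone with a lower rate than its exterior scores zero.
depressed = poisson_llr(y_zone=5, a_zone=10.0, y_total=100, a_total=100.0)
print(elevated, depressed)
```

The estimated hotspot would be the candidate zone that maximizes this quantity, with significance usually assessed by simulation under the null model.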
The traditional spatial scan statistic uses expanding circles to determine a
reduced list $\Omega_0$ of candidate zones $Z$. By their very construction, these
candidate zones tend to be compact in shape and may do a poor job of
approximating actual clusters. The reduced parameter space of the circular
scan statistic is determined entirely by the geometry of the tessellation and
does not involve the data in any way. We propose a scan statistic that takes
an adaptive point of view in which $\Omega_0$ depends very much upon the data.
Furthermore, $\Omega_0$ induces a tree structure useful for visualization and
expressing uncertainty of hotspot clusters in the form of a hotspot confidence
set on the tree.
Although the traditional spatial scan statistic is applicable only to
tessellated data, the ULS approach has an abstract graph (i.e., vertices and
edges) as its starting point. Accordingly, this approach can also be applied to
data defined over networks, such as subway, water or highway systems.
There is complete flexibility regarding the definition of adjacency. For
example, one may declare two cells as adjacent (i) if their boundaries have at
least one point in common, (ii) if their common boundary has positive length,
or (iii) in the case of a drainage network, if the flow is from one cell to the
next.
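The first two adjacency definitions can be illustrated on a rectangular grid of square cells. This is a minimal sketch under that assumption. The function name `grid_adjacency` and the rule labels `"point"` and `"edge"` are hypothetical names chosen for the example.

```python
from itertools import product

def grid_adjacency(n_rows, n_cols, rule="edge"):
    """Return {cell: set of adjacent cells} for a grid of square cells.

    rule="point": boundaries share at least one point (definition i),
                  so diagonal neighbors count.
    rule="edge":  the common boundary has positive length (definition ii),
                  so only side-by-side neighbors count.
    """
    if rule == "edge":
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif rule == "point":
        offsets = [(dr, dc) for dr, dc in product((-1, 0, 1), repeat=2)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("rule must be 'edge' or 'point'")

    adjacency = {}
    for r, c in product(range(n_rows), range(n_cols)):
        adjacency[(r, c)] = {
            (r + dr, c + dc)
            for dr, dc in offsets
            if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
        }
    return adjacency

edge_graph = grid_adjacency(3, 3, rule="edge")
point_graph = grid_adjacency(3, 3, rule="point")
print(len(edge_graph[(1, 1)]), len(point_graph[(1, 1)]))
```

Either dictionary can serve as the abstract graph on which the ULS approach operates. A directed drainage network (definition iii) would need its own edge list.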
ULS Scan Statistic
The ULS scan statistic is an adaptive approach in which the reduced
parameter space $\Omega_0 = \Omega_{ULS}$ is determined from the data using the empirical
cell rates
$$G_a = Y_a / A_a.$$
These rates determine a function $a \mapsto G_a$ defined over the cells in the
tessellation. This function has only finitely many values and each level $g$
defines an upper level set (ULS)
$$U_g = \{a : G_a \ge g\}.$$
[Figure 3: schematic response “surface” over region $R$, with rate $G$ on the vertical axis, levels $g$ and $g'$ marked, and zones $Z_1$ through $Z_6$ labeled.]
Fig. 3. Schematic response surface with two response levels, $g$ and $g'$. The upper level
set determined by $g$ has three connected components, $Z_1$, $Z_2$ and $Z_3$; that determined
by $g'$ has $Z_4$, $Z_5$ and $Z_6$ as its connected components. The diagram also illustrates the
three ways in which connectivity can change as the level drops from $g$ to $g'$: (i) zones
$Z_1$ and $Z_2$ grow in size and eventually coalesce into a single zone $Z_4$, (ii) zone $Z_3$
simply grows to $Z_5$, and (iii) zone $Z_6$ is newly emergent.
Since upper level sets do not have to be geographically connected (Fig. 3),
we take the reduced list of candidate zones $\Omega_{ULS}$ to consist of all connected
components of all possible upper level sets. The zones in $\Omega_{ULS}$ are plausible
as potential hotspots since they are portions of upper level sets of the
response rate. The number of zones is small enough for practical maximum
likelihood search; in fact, the size of $\Omega_{ULS}$ does not exceed the number of
cells in the tessellation.
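The construction of the candidate zones can be sketched directly from the definition: for each distinct rate level, form the upper level set and split it into connected components. The code below is a simple, unoptimized illustration. The function name `uls_zones`, the five-cell path graph, and the counts are assumptions made for the example.

```python
def uls_zones(rates, adjacency):
    """Return all connected components of all upper level sets.

    rates:     {cell: empirical rate G_a}
    adjacency: {cell: set of neighboring cells}
    """
    zones = set()
    for g in sorted(set(rates.values()), reverse=True):
        upper = {a for a, rate in rates.items() if rate >= g}
        unvisited = set(upper)
        while unvisited:
            start = unvisited.pop()
            component, stack = {start}, [start]
            while stack:  # depth-first search restricted to the upper level set
                cell = stack.pop()
                for nb in adjacency[cell]:
                    if nb in unvisited:
                        unvisited.remove(nb)
                        component.add(nb)
                        stack.append(nb)
            zones.add(frozenset(component))
    return zones

# Toy data: five cells along a path a1 - a2 - a3 - a4 - a5.
counts = {"a1": 9, "a2": 2, "a3": 6, "a4": 8, "a5": 1}
sizes = {cell: 2.0 for cell in counts}
rates = {cell: counts[cell] / sizes[cell] for cell in counts}
adjacency = {
    "a1": {"a2"}, "a2": {"a1", "a3"}, "a3": {"a2", "a4"},
    "a4": {"a3", "a5"}, "a5": {"a4"},
}
zones = uls_zones(rates, adjacency)
for zone in sorted(zones, key=len):
    print(sorted(zone))
```

On this toy input the number of zones equals the number of cells, consistent with the bound stated above.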
A ULS-tree can be defined on the reduced parameter space $\Omega_{ULS}$. Its
nodes are the zones $Z \in \Omega_{ULS}$ and are therefore collections of vertices from
the abstract graph. Leaf nodes are typically singleton vertices at which the
response rate attains a local maximum. The root node consists of all
connected vertices in the abstract graph. Fig. 4 shows the tree structure for
the surface from Fig. 3.
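Because two connected components of upper level sets are either nested or disjoint, the tree can be recovered from the zones alone by linking each zone to the smallest zone that strictly contains it. The sketch below assumes the zones have already been computed. The function name `uls_tree` and the toy zones (five cells on a path) are illustrative only.

```python
def uls_tree(zones):
    """Map each zone to its parent: the smallest zone strictly containing it."""
    parent = {}
    for z in zones:
        supersets = [w for w in zones if z < w]  # proper supersets of z
        parent[z] = min(supersets, key=len) if supersets else None
    return parent

# Hypothetical candidate zones for five cells a1..a5 arranged on a path.
zones = [
    frozenset({"a1"}),
    frozenset({"a4"}),
    frozenset({"a3", "a4"}),
    frozenset({"a1", "a2", "a3", "a4"}),
    frozenset({"a1", "a2", "a3", "a4", "a5"}),
]
parent = uls_tree(zones)

children = {}
for z, p in parent.items():
    if p is not None:
        children.setdefault(p, []).append(z)

root = [z for z in zones if parent[z] is None]
leaves = [z for z in zones if z not in children]
junctions = [z for z in zones if len(children.get(z, [])) >= 2]
print("root:", [sorted(z) for z in root])
print("leaves:", [sorted(z) for z in leaves])
print("junctions:", [sorted(z) for z in junctions])
```

Here the leaves are the two local maxima, and the single junction node is the zone in which their components coalesce.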
[Figure 4: schematic intensity “surface” (intensity $G$ on the vertical axis, levels $g$ and $g'$, zones $Z_1$ through $Z_6$) with the corresponding tree and junction nodes A, B and C. N.B. The intensity surface is cellular (piece-wise constant), with only finitely many levels. A, B, C are junction nodes where multiple zones coalesce into a single zone.]
Fig. 4. ULS connectivity tree for the schematic surface displayed in Fig. 3. The four leaf
nodes correspond to surface peaks. The root node represents the entire region. Junction
nodes (A, B and C) occur when two (or more) connected components coalesce into a
single connected component.
A consequence of the adaptivity of the ULS approach is that $\Omega_{ULS}$ must
be recalculated for each replicate in a simulation study. Efficient algorithms
are needed for this calculation. Several generic algorithms are available in
the computer science literature (Cormen et al., 2001, Section 22.3 for
depth-first search; Knuth, 1973, p. 353 or Press et al., 1992, Section 8.6 for
transitive closure).
Hotspot Membership Rating
Zonal estimation uncertainty is visually depicted by inner and outer
envelopes, where the outer envelope consists of all cells belonging to at least
one zone in the confidence set. Cells in the inner envelope belong to all of
the zones in the confidence set. In other words, the outer envelope is the
union of all zones in the confidence set while the inner envelope is their
intersection (Fig. 5; Fig. 6).
[Figure 5: nested regions labeled inner envelope, MLE, and outer envelope.]
Fig. 5. Estimation uncertainty in hotspot delineation. Cells in the inner envelope belong
to all plausible estimates (at specified confidence level); cells in the outer envelope
belong to at least one plausible estimate. The MLE is nested between the two envelopes.
A numerical rating may also be assigned to each cell for inclusion in
the hotspot. The rating is the percentage of zones in the confidence set that
include the cell under consideration. The inner envelope consists of cells
receiving a 100% rating while the outer envelope contains the cells with a
nonzero rating. A map of these ratings, with the superimposed MLE,
provides a visual display of uncertainty of the hotspot delineation.
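The rating and the two envelopes follow mechanically once a confidence set of zones is in hand. The sketch below assumes the confidence set is given as a list of cell sets. The function name `membership_ratings` and the three toy zones are hypothetical.

```python
def membership_ratings(confidence_set):
    """Percentage of zones in the confidence set that contain each cell."""
    n_zones = len(confidence_set)
    cells = set().union(*confidence_set)
    return {
        cell: 100.0 * sum(cell in zone for zone in confidence_set) / n_zones
        for cell in cells
    }

# Hypothetical confidence set of three plausible hotspot delineations.
confidence_set = [
    {"c2", "c3"},
    {"c2", "c3", "c4"},
    {"c1", "c2", "c3"},
]
ratings = membership_ratings(confidence_set)

inner_envelope = {cell for cell, r in ratings.items() if r == 100.0}
outer_envelope = {cell for cell, r in ratings.items() if r > 0.0}
print(sorted(inner_envelope), sorted(outer_envelope))
```

The inner envelope computed from the ratings coincides with the intersection of the zones, and the outer envelope with their union, as described above.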
Typology of Space-Time Hotspots
Scan statistic methods extend readily to the detection of hotspots in
space-time. A space-time version of the circle-based scan statistic employs
cylindrical extensions of spatial circles, but cylinders are often unable to
adequately represent the temporal evolution of a hotspot (Fig. 7). The
space-time generalization of the ULS scan statistic can detect arbitrarily
shaped hotspots in space-time (Patil and Taillie, 2004a). This lets us classify
space-time hotspots into various evolutionary types, a few of which appear
on the left-hand side of Fig. 8. The merging hotspot is particularly
interesting because, while it comprises a connected zone in space-time,
several of its time slices are spatially disconnected. The diagrams in Fig. 8
are motivated by a study on “trajectories of persistent poverty in the US”
being conducted by Amy Glasmeier of Penn State University.
[Figure 6: tessellated region $R$ and its ULS tree, with labels for the MLE, a junction node, an alternative hotspot delineation, and an alternative hotspot locus.]
Fig. 6. A confidence set of hotspots on the ULS tree. The different connected
components correspond to different hotspot loci while the nodes within a connected
component correspond to different delineations of that hotspot—all at the appropriate
confidence level.
[Figure 7: space on the horizontal axis and time (census year, 1970 to 2000) on the vertical axis. Panels: a hotspot with its cylindrical approximation, and a case in which the cylindrical approximation sees a single hotspot as multiple hotspots.]
Fig. 7. Temporal evolution of a spatial hotspot is represented by the shape of the hotspot
in space-time. Cylinders may not adequately capture this shape.
[Figure 8: four panels labeled Stationary Hotspot, Shifting Hotspot, Expanding Hotspot, and Merging Hotspot, each with space (census tract) on the horizontal axis and time (census year, 1970 to 2000) on the vertical axis; a further set of panels shows time slices.]
Fig. 8. The four diagrams on the left depict different types of space-time hotspots. The
spatial dimension is represented schematically on the horizontal axis while time is on the
vertical axis. The diagrams on the right show the trajectory (sequence of time slices) of a
merging hotspot.
P A R T I A L L Y   O R D E R E D   S E T   H O T S P O T   P R I O R I T I Z A T I O N   S Y S T E M
The prioritization system of hotspot geoinformatics is concerned with
the ranking of a finite collection of objects when a suite of indicator values is
available for each member of the collection. The objects can be represented
as a configuration of points in indicator space, but the different indicators
typically convey different comparative messages and there is no unique way
to rank the objects while taking all indicators into account. A traditional
approach is to assign a composite numerical score to each object by
combining the indicator information in some fashion. Consciously or
otherwise, every such composite involves judgments (often arbitrary or
controversial) about tradeoffs or substitutability among indicators.
Rather than attempting to combine indicators, Patil and Taillie (2004b)
take the view that the relative positions in indicator space determine only a
partial ordering and that a given pair of objects may not be inherently
comparable. Working with Hasse diagrams of the partial order, they study
the collection of all rankings compatible with the partial order. In this way,
an interval of possible ranks is assigned to each object. The intervals can be
very wide. Noting, however, that ranks near the ends of each interval are
usually infrequent under linear extensions, a distribution is obtained over the
interval of possible ranks. This distribution, called the rank-frequency
distribution, is unimodal, log-concave and represents the degree of ambiguity
involved in attempting to assign a rank to the corresponding object.
Stochastic ordering of distributions imposes a partial order on the
collection of rank-frequency distributions. This collection of distributions is
in one-to-one correspondence with the original collection of objects, and the
induced ordering on these objects is called the cumulative rank-frequency
(CRF) ordering; it extends the original partial order. For example, Fig. 9
shows the Hasse diagram for a small partially ordered set (poset) with six
objects, labeled a through f. The decision tree on the right enumerates all
possible linear extensions of the poset, where each path through the tree
determines a linear extension. In this example, there are a total of 16 linear
extensions. Object a is assigned rank 1 by nine of those extensions, rank 2
by five of the extensions, and rank 3 by the remaining two extensions. The
cumulative rank frequencies for object a are thus 9, 9+5=14, and 9+5+2=16.
These determine a cumulative rank profile for object a, as shown in Fig. 10,
and similarly for the other five objects.
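The counting in this example can be reproduced by brute-force enumeration. The Hasse diagram itself is not recoverable from the text, so the cover relations below are a hypothetical six-object poset chosen only because it yields the same totals quoted above (16 linear extensions, with object a at ranks 1, 2, 3 in 9, 5, 2 of them). It should not be read as the poset of Fig. 9. All names are illustrative.

```python
from itertools import permutations

OBJECTS = "abcdef"
# Hypothetical relations (x, y), meaning "x ranks above y".
COVERS = [("a", "c"), ("b", "d"), ("c", "e"), ("c", "f"), ("d", "f")]

def linear_extensions(objects, covers):
    """Yield every ranking (best first) compatible with the partial order."""
    for perm in permutations(objects):
        position = {x: i for i, x in enumerate(perm)}
        if all(position[x] < position[y] for x, y in covers):
            yield perm

def rank_frequencies(objects, covers):
    """freq[x][r] = number of linear extensions that give x rank r + 1."""
    freq = {x: [0] * len(objects) for x in objects}
    for extension in linear_extensions(objects, covers):
        for r, x in enumerate(extension):
            freq[x][r] += 1
    return freq

def cumulative(counts):
    """Running totals of a list of counts."""
    total, out = 0, []
    for c in counts:
        total += c
        out.append(total)
    return out

def crf_dominates(crf, x, y):
    """True if the cumulative profile of x lies on or above that of y."""
    return all(cx >= cy for cx, cy in zip(crf[x], crf[y]))

n_extensions = sum(1 for _ in linear_extensions(OBJECTS, COVERS))
freq = rank_frequencies(OBJECTS, COVERS)
crf = {x: cumulative(freq[x]) for x in OBJECTS}
print(n_extensions)   # 16
print(freq["a"])      # [9, 5, 2, 0, 0, 0]
print(crf["a"])       # [9, 14, 16, 16, 16, 16]
print(crf_dominates(crf, "a", "b"))
```

Enumerating all permutations is only feasible for very small posets. It is used here purely to make the definitions concrete.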
[Figure 9: (left) Hasse diagram of Poset B on objects a through f; (right) linear extension decision tree.]
Fig. 9. Hasse diagram and corresponding linear extension tree. The linear extension tree
enumerates all admissible linear extensions of the poset. Dashed links in the decision
tree are not implied by the partial order and are called jumps. If one tries to trace the
linear extension in the original Hasse diagram, a jump would be required at each dashed
link.
[Figure 10: cumulative frequency (0 to 16) on the vertical axis versus rank (1 to 6) on the horizontal axis, with one profile for each object a through f.]
Fig. 10. Cumulative rank-frequency distribution for the poset in Fig. 9.
For this example, the six profiles are stacked one above the other, thus
determining a linear ordering of the objects. The CRF operator treats each
linear extension as an equal “voter” in determining the CRF ranking. It is
possible to generalize to a weighted CRF operator by giving linear extensions
differential weights either on mathematical grounds (e.g., number of jumps)
or empirical grounds (e.g., indicator concordance). Explicit enumeration of
all possible linear extensions is computationally impractical unless the
number of objects is quite small. When enumeration is impractical, the
rank-frequencies can be estimated using discrete Markov chain Monte Carlo
(MCMC) methods.
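The article does not say which Markov chain is used. One standard choice, assumed here, is the lazy adjacent-transposition chain (often attributed to Karzanov and Khachiyan), whose stationary distribution is uniform over linear extensions. The sketch reuses a hypothetical six-object poset for which exact enumeration gives object a rank 1 in 9 of 16 extensions. The function name, seed, and run length are illustrative.

```python
import random

# Hypothetical relations (x, y), meaning "x ranks above y".
COVERS = {("a", "c"), ("b", "d"), ("c", "e"), ("c", "f"), ("d", "f")}

def estimate_rank_frequencies(start, covers, n_steps, rng):
    """Estimate rank-frequency distributions by a random walk on linear extensions.

    start must itself be a linear extension (best object first).
    """
    state = list(start)
    n = len(state)
    counts = {x: [0] * n for x in state}
    for _ in range(n_steps):
        i = rng.randrange(n - 1)
        # Lazy step: attempt a swap only half the time (keeps the chain aperiodic).
        # Two adjacent objects may be swapped unless the upper one must precede
        # the lower one; for adjacent objects that can only be a listed relation.
        if rng.random() < 0.5 and (state[i], state[i + 1]) not in covers:
            state[i], state[i + 1] = state[i + 1], state[i]
        for rank, x in enumerate(state):
            counts[x][rank] += 1
    return {x: [c / n_steps for c in counts[x]] for x in counts}

rng = random.Random(0)
estimate = estimate_rank_frequencies("abcdef", COVERS, n_steps=200_000, rng=rng)
# Exact values for object a under this poset are 9/16, 5/16, 2/16, 0, 0, 0.
print([round(p, 3) for p in estimate["a"]])
```

Successive states are correlated, so the run length needed for a given accuracy grows with the size of the poset. No burn-in or thinning is applied in this sketch.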
The resulting prioritization system has the following innovative features:
• Ranks and prioritizes hotspots;
• Uses multiple indicators and stakeholder criteria without integrating
indicators into an index;
• Employs Hasse diagrams, partially ordered sets, and Markov chain
Monte Carlo computations, leading to several key applications,
including:
   - Early warning systems;
   - Identification of critical areas for focused investigation.
In the area of Health Policy, Health Statistics, and Disease Etiology,
the prioritization component may be combined with a hotspot detection
component to yield a three-stage surveillance system:
• First stage screening: Identification of significant clusters (hotspots)
by an upper level set version of the scan statistic;
• Second stage screening: Rank and prioritize significant hotspots
using likelihood values and other attributes such as raw intensity
values, remediation-feasibility scores, socio-economic and
demographic factors;
• Third stage screening: Follow up hotspots for etiology and/or
intervention.
For more details, see Patil and Taillie (2004b).
S E L E C T   C A S E   S T U D I E S
In response to an ever increasing volume of georeferenced data, government
agencies require a new generation of decision support systems for early
detection, surveillance, and prioritization of hotspots. A decision support
framework for geographic and network surveillance, using systems involving
upper level sets and partially ordered sets, is applicable to a variety of
important case studies, such as:
1. Cyber security and computer network diagnostics;
2. Tasking of self-organizing surveillance mobile sensor networks;
3. Drinking water quality and water utility vulnerability;
4. Surveillance network and early warning;
5. West Nile virus;
6. Crop pathogens and bioterrorism;
7. Disaster management: oil spill detection, monitoring, and prioritization;
8. Network analysis of biological integrity in freshwater streams.
The framework can be applied to irregular networks, such as those
formed by streams (Fig. 11), political units, social networks, and the internet.
When applied to data collected over both space and time, the ULS scan
statistic system may be used to detect shifting poverty hotspots (Fig. 11),
coalescence of neighboring hotspots, or their growth.
Fig. 11. Data on a network of streams (left), and shifting poverty hotspots (right).
Protecting the nation’s computer networks from cyber attack is an important
homeland security priority requiring diagnostic tools for detecting security
attacks and infrastructure failures. A probabilistic finite state automaton
(PFSA) describing a network element is obtained from its data stream
output. The variational distance between the stochastic languages generated
by normal and crisis automata may be used to form a crisis index. The ULS
scan statistic is then applied to the crisis indices over a collection of network
elements for hotspot detection. These hotspots and their prioritization can be
used to detect coordinated attacks geographically spread over a network.
Additional applications of PFSA include the tasking of self-organizing
surveillance mobile sensor networks, geotelemetry with wireless sensor
networks, video mining networks, and syndromic surveillance in public
health.
Fig. 12. Framework for probabilistic finite state automata (left), and a metric for
measuring the distance between two finite state automata (right).
The National Tsunami Hazard Mitigation Program (NTHMP) is the
first systematic national effort for the production of inundation maps
essential for tsunami hazard planning and mitigation. Without a clear
understanding of what areas are most at risk, it is not possible to develop
effective emergency response plans involving population and infrastructure
vulnerability and evacuation routes (Gonzalez, 2001). Inundation maps
enable the construction of tsunami risk maps, where risk is the hazard times
the exposure; for example, the probability that a particular grid cell is struck
by a tsunami times the number of people occupying that cell. These form
risk surfaces defined over tessellations of grid cells in regions under
consideration. For purposes of optimal disaster management planning, it is
essential to have the capability to recognize priority high risk areas with
minimal false alarms. Tsunami disaster management motivates further
research, expanding the scope of the methods to geospatial continuous-response
risk variables with skewed distributions, and to hotspot trajectories
representing changing spatial patterns of inundated areas with increasing
tsunami severity.
Understanding the latter typology may impact planning of evacuation routes.
Under an expanding hotspot scenario, traffic is always directed outwards
from the hotspots, but under merging hotspots, a portion of the traffic may be
directed through regions between hotspots when the tsunami is predicted to
be small. Another significant contribution to tsunami disaster management
will be to prioritize and rank risk hotspots, detected at specified confidence
levels with respect to multiple criteria, stakeholders, and indicators without
reduction to a single index. Examples of such criteria may include the
number of people at risk and the economic value of infrastructure, buildings,
and their contents.
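The risk definition used above (hazard times exposure, cell by cell) amounts to a one-line computation. The sketch below works it through for three grid cells. The cell labels, probabilities, and population figures are invented for illustration and do not come from any inundation model.

```python
def risk_surface(hazard, exposure):
    """Cell-by-cell risk: probability of being struck times people exposed."""
    return {cell: hazard[cell] * exposure[cell] for cell in hazard}

# Hypothetical inputs for three grid cells.
hazard = {"g1": 0.10, "g2": 0.02, "g3": 0.25}    # probability cell is struck
exposure = {"g1": 1200, "g2": 5000, "g3": 40}    # people occupying the cell

risk = risk_surface(hazard, exposure)
priority = sorted(risk, key=risk.get, reverse=True)
for cell in priority:
    print(cell, round(risk[cell], 1))
```

Note that the cell with the highest hazard (g3) ranks last once exposure is taken into account. Ranking on this single product is itself a composite index; the poset approach described earlier is meant for the case where several such criteria must be kept separate.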
Fig. 13. Portion of a community projected to be inundated by a tsunami as predicted
under a tsunami inundation model for a given earthquake scenario (blue region on the
left). Two typologies expected under tsunamis of increasing severity (right).
C O N C L U S I O N
Government agencies often require concise summaries of
georeferenced data to support their decisions regarding the geographic
allocation of resources. Geoinformatic surveillance for spatial and
spatiotemporal hotspot detection and prioritization is a critical need for the
21st century. A hotspot can mean an unusual phenomenon, anomaly,
aberration, outbreak, or critical area. Hotspot delineation and prioritization
may be required for etiology, management, or early warning.
The article briefly describes a prototype Geoinformatic Hotspot
Surveillance (GHS) system for hotspot delineation and prioritization (Fig.
14) in a variety of case studies of critical societal importance. The prototype
system comprises modules for (1) hotspot detection and delineation, and
(2) hotspot prioritization.
[Figure 14: diagram titled “Geoinformatic Surveillance System”. Boxes: geoinformatic spatio-temporal data from a variety of data products and data sources with agencies, academia, and industry; spatially distributed response variables; hotspot analysis; prioritization; decision support systems. Annotations: masks, filters; indicators, weights.]
Fig. 14. Framework for the Geoinformatic Hotspot Surveillance (GHS) system.
R E F E R E N C E S
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. 2001.
Introduction to Algorithms, Second Edition. MIT Press, Cambridge,
Massachusetts.
Gonzalez, F. I. 2001. The NTHMP inundation mapping program. In
Proceedings of the International Tsunami Symposium 2001. Seattle,
August 7-10, pp. 29–54.
Knuth, D. E. 1973. The Art of Computer Programming: Volume 1,
Fundamental Algorithms, Second Edition. Addison-Wesley, Reading,
Massachusetts.
Kulldorff, M. 1997. A spatial scan statistic. Communications in
Statistics: Theory and Methods 26, 1481–1496.
Kulldorff, M. 2001. Prospective time-periodic geographical disease
surveillance using a scan statistic. Journal of the Royal Statistical Society,
Series A 164, 61–72.
Kulldorff, M., Feuer, E. J., Miller, B. A., and Freedman, L. S. 1997. Breast
cancer clusters in Northeast United States: A geographic analysis. American
Journal of Epidemiology 146, 161–170.
Kulldorff, M. and Nagarwalla, N. 1995. Spatial disease clusters: Detection
and inference. Statistics in Medicine 14, 799–810.
Kulldorff, M., Rand, K., Gherman, G., Williams, G., and Defrancesco, D.
1998. SaTScan version 2.1: Software for the spatial and space-time scan
statistics. National Cancer Institute, Bethesda, MD.
Mostashari, F., Kulldorff, M., and Miller, J. 2002. Dead bird clustering: An
early warning system for West Nile virus activity. Manuscript prepared for
the New York City West Nile Virus Surveillance Working Group. Under
review.
Patil, G. P. 2005. Geoinformatic surveillance of hotspot detection,
prioritization, and early warning. Demo for 6th Annual National Conference
on Digital Government Research, Atlanta, GA.
Patil, G. P., and Taillie, C. 2004a. Upper level set scan statistic for detecting
arbitrarily shaped hotspots. Environmental and Ecological Statistics 11,
183–197.
Patil, G. P., and Taillie, C. 2004b. Multiple indicators, partially ordered sets,
and linear extensions: Multi-criterion ranking and prioritization.
Environmental and Ecological Statistics 11, 199–228.
Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. 1992.
Numerical Recipes in C, Second Edition. Cambridge University Press,
Cambridge.
NOTE: This research was supported by the National Science Foundation
Digital Government Program Award Number EIA-0307010. Partner federal
agencies include DOD, DOT, EPA, NASA, NCHS, NCI, NIEHS, USFS, and
USGS with USGS as the coordinating agency. The contents have not been
subjected to Agency review and therefore do not necessarily reflect the views
of the Agencies and no official endorsement should be inferred.
T E R M S   A N D   D E F I N I T I O N S
Hotspots: A connected subset of the study region with statistically significant
elevated rates of disease, poverty, accidents, or any other relevant georeferenced
phenomenon.
Upper Level Set Scan Statistic System: Adaptive system for hotspot detection and
delineation based on upper level sets in georeferenced data.
Poset Prioritization and Ranking System: Nonparametric approach to ranking
objects using multiple indicators based on cumulative rank functions constructed from
Hasse diagrams and linear extension trees.
Hotspot Rating: Confidence level that a given cell belongs to a hotspot.
Typology of Space-Time Hotspots: Classification of the trajectories of hotspots
over time when the upper level set scan statistic system is applied to space-time data.
Early Warning: Alert to a pending disaster.
Digital Government Case Studies: Investigations of interest to society
demonstrating the efficacy of proposed digital informatic approaches to handling
government databases.