Cascading Spatio-Temporal Pattern Discovery

P. Mohan, S. Shekhar, J. Shine, J. Rogers
Presented by: Atanu Roy, Akash Agrawal
Akash Agrawal
CSci 8715
Motivation
• Applications in domains like
– Public safety
– Climate modeling
– Natural disaster planning
The Problem
• Input
– ST dataset consisting of a set of boolean event types over a common ST framework
– A directed neighborhood relation
– A threshold on the cascade participation index (CPI)
• Output
– CSTPs (cascading ST patterns) with CPI ≥ threshold
• Objective
– Minimize computation cost
• Constraints
– Correctness, completeness
Key Challenges
• Absence of natural transactions & overlap across instances
• Exponential cardinality of candidate patterns
• Computationally complex ST neighborhood
• Conflicting demands of computational scalability and statistical interpretation
Related Works
• Spatio-temporal frequent patterns
– Unordered (ST Co-occurrence)
– Partially ordered: this work (Cascading ST patterns)
– Totally ordered (ST Sequences)
– Others
• ST Co-occurrence [Celik et al. 2008, Cao et al. 2006]
– Designed for moving-object datasets by treating trajectories as location time series.
– Does not capture partially ordered relationships over space and time.
• ST Sequence [Huang et al. 2008, Cao et al. 2005]
– Totally ordered patterns modeled as a chain.
– Does not account for multiply connected (e.g. nonlinear) patterns.
– Misses non-linear semantics.
– No ST statistical interpretation.
Slide Courtesy: Pradeep Mohan. Used in the class for demonstrating “Articulating Novelty”.
Novel & Better!
• Novelty
– Implementation of a partially ordered ST framework
– First introduction of a spatio-temporal statistical interpretation
– Novel interest measure
– Two filtering strategies
– New measure (clumpiness degree)
– Tested on novel datasets
• Better
– Bottleneck analysis shows that most of the running time is spent on interest measure evaluation
– Computes the interest measure using ST partitioning
– Algebraic cost model for filtering
– Comparison shows better performance than the authors’ previous work
Key Concepts
Filters
• Upper Bound (UB) filter*:
– Uses an anti-monotone upper bound.
– The bound reflects the maximum possible value of the interest measure.
• Multi-resolution Spatio-Temporal (MST) filter*:
– There exists a low-dimensional embedding in space and time.
– Used to compute a coarse CPI, which is proved never to underestimate the exact CPI.
– Patterns whose coarse CPI falls below the threshold can be pruned.
– Saves time because the exact CPI computation is very expensive.
* The paper should have addressed the point that the filters are complementary in nature and should be used together to achieve the desired results.
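The "never underestimates" property can be illustrated with a small sketch. It assumes a simplified two-event pattern A → B, a hypothetical directed neighborhood (B follows A within distance r and time t), and a CPI taken as the minimum participation ratio of the two event types. None of these details are spelled out on the slides, and the grid construction below is an illustration of the idea rather than the paper's filter.

```python
from math import floor, hypot

def is_neighbor(a, b, r, t):
    """Assumed directed ST neighborhood: b follows a within time t and distance r."""
    (ax, ay, at), (bx, by, bt) = a, b
    return 0 < bt - at <= t and hypot(bx - ax, by - ay) <= r

def exact_cpi(A, B, r, t):
    """Assumed CPI for A -> B: minimum participation ratio over both event types."""
    part_a = sum(any(is_neighbor(a, b, r, t) for b in B) for a in A)
    part_b = sum(any(is_neighbor(a, b, r, t) for a in A) for b in B)
    return min(part_a / len(A), part_b / len(B))

def cell(p, r, t):
    """Map an instance (x, y, time) to a coarse grid cell of size r x r x t."""
    x, y, ts = p
    return (floor(x / r), floor(y / r), floor(ts / t))

def cells_adjacent(ca, cb):
    """True if some point in cb could follow some point in ca as a neighbor."""
    return (abs(cb[0] - ca[0]) <= 1 and abs(cb[1] - ca[1]) <= 1
            and 0 <= cb[2] - ca[2] <= 1)

def coarse_cpi(A, B, r, t):
    """Cell-level CPI: every true neighbor pair lies in adjacent cells,
    so this value can only be greater than or equal to exact_cpi."""
    cells_a = {cell(a, r, t) for a in A}
    cells_b = {cell(b, r, t) for b in B}
    part_a = sum(any(cells_adjacent(cell(a, r, t), cb) for cb in cells_b) for a in A)
    part_b = sum(any(cells_adjacent(ca, cell(b, r, t)) for ca in cells_a) for b in B)
    return min(part_a / len(A), part_b / len(B))

def may_be_prevalent(A, B, r, t, threshold):
    """Cheap filter: a pattern failing this test cannot reach the threshold."""
    return coarse_cpi(A, B, r, t) >= threshold

# Toy instances as (x, y, time) tuples; values are made up for illustration.
A = [(0.0, 0.0, 0.0), (2.9, 0.0, 0.0), (0.9, 0.0, 0.5)]
B = [(0.5, 0.0, 0.5), (1.8, 0.0, 0.9)]
coarse = coarse_cpi(A, B, r=1.0, t=1.0)
exact = exact_cpi(A, B, r=1.0, t=1.0)
```

On this toy data the coarse value is 1.0 while the exact value is 2/3: the coarse estimate is looser but never lower, so it is safe to prune on it and only compute the exact measure for the survivors.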
Description
• Description: for each size-k pattern
– Apply the UB filter
– for k in (1, 2, …, n) do
• Generate size-k candidates using CSTPs of size (k-1) recursively
• Apply the multi-resolution ST (MST) filter to discard non-prevalent candidates
• Generate pattern instances and compute the CPI
• Prune non-prevalent patterns and output the prevalent CSTPs
– end for
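The loop above has a level-wise filter-and-refine shape. A minimal sketch of that shape follows, assuming hypothetical callbacks (`extend`, `upper_bound`, `exact_measure`) that are not from the paper. To stay runnable it is demonstrated on toy item sets with support as the interest measure, not on real CSTPs, which are directed patterns over event types scored by CPI.

```python
from itertools import combinations

def mine_levelwise(singletons, extend, upper_bound, exact_measure, threshold, max_size):
    """Generic level-wise miner: cheap upper-bound filter, then exact refinement."""
    prevalent = {}
    frontier = list(singletons)
    for k in range(1, max_size + 1):
        survivors = []
        for cand in frontier:
            if upper_bound(cand) < threshold:
                continue                      # pruned without the expensive step
            score = exact_measure(cand)       # expensive interest measure
            if score >= threshold:
                prevalent[cand] = score
                survivors.append(cand)
        if not survivors:
            break
        frontier = extend(survivors, k + 1)   # size k+1 candidates from size k
    return prevalent

# Toy stand-in for the real problem (assumption): item sets and support.
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]

def support(cand):
    return sum(cand <= tx for tx in transactions) / len(transactions)

def item_upper_bound(cand):
    # Support of a set cannot exceed the support of its rarest item.
    return min(support(frozenset({item})) for item in cand)

def extend(survivors, size):
    # Keep a candidate only if all of its (size-1)-subsets were prevalent.
    known = set(survivors)
    items = sorted(set().union(*survivors))
    return [frozenset(combo) for combo in combinations(items, size)
            if all(frozenset(sub) in known for sub in combinations(combo, size - 1))]

singletons = [frozenset({item}) for item in "ABC"]
result = mine_levelwise(singletons, extend, item_upper_bound, support,
                        threshold=0.5, max_size=3)
strict = mine_levelwise(singletons, extend, item_upper_bound, support,
                        threshold=0.6, max_size=3)
```

The design point is the ordering inside the inner loop: the bound is consulted first, so the exact measure is only evaluated for candidates that could still meet the threshold.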
Validations
• Mathematical proofs & statistical interpretation
– Diggle et al.’s K-function
• Determination of the impact of filtering
• Comparison of the performance of the two different CSTPM algorithms
Assumptions
• Uses Euclidean distance instead of real network distance.
• Helpful only when the network is very well connected.
• In the real world, Euclidean distance is rarely the “true” distance between two points.
• Fails to capture dynamic constraints.
– A police patrol cannot cross a river unless there is a bridge.
– Washington Ave. is closed to vehicular traffic for the next few years.
• The most intuitive choice is the underlying spatial network distance instead.
– esp. road networks
– River networks
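The gap between the two distances can be made concrete with a small sketch. The coordinates, node names, and road layout below are invented for illustration: a station and an incident sit on opposite banks of a river, and the only bridge is three units downstream. Network distance is computed with a plain Dijkstra search.

```python
import heapq
from math import dist

# Hypothetical layout: the river runs between y = 0 and y = 1.
coords = {
    "station": (0.0, 0.0),
    "south_bank": (3.0, 0.0),
    "north_bank": (3.0, 1.0),
    "incident": (0.0, 1.0),
}
roads = [("station", "south_bank"),
         ("south_bank", "north_bank"),   # the bridge
         ("north_bank", "incident")]

def build_graph(coords, roads):
    """Undirected road graph; each edge is weighted by its segment length."""
    graph = {node: [] for node in coords}
    for u, v in roads:
        w = dist(coords[u], coords[v])
        graph[u].append((v, w))
        graph[v].append((u, w))
    return graph

def network_distance(graph, src, dst):
    """Shortest-path length via Dijkstra; infinity if dst is unreachable."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > best.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

graph = build_graph(coords, roads)
euclidean = dist(coords["station"], coords["incident"])
by_road = network_distance(graph, "station", "incident")

# Closing the bridge models a dynamic constraint such as a road closure.
closed = build_graph(coords, [r for r in roads if r != ("south_bank", "north_bank")])
by_road_closed = network_distance(closed, "station", "incident")
```

Here the straight-line distance is 1.0 but the drive is 7.0, and with the bridge closed the incident is unreachable, so a Euclidean neighborhood would count the two locations as neighbors when the network says otherwise.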
Assumptions
• ST events are boolean.
– Domains such as climate studies have attributes that can take real values.
• ST non-stationarity and the choice of directed neighborhood relation are beyond the scope.
– Events like drunk driving can be considered non-stationary and will change over time.
Critique
• The approach used for candidate generation can be improved further to reduce the computational complexity.
– Hash indices for checking sub-graph isomorphism could be tried.
• Joins can also be used for shortest path computation.
Thank You
1. P. Mohan, S. Shekhar, J. A. Shine and J. P. Rogers, "Cascading spatio-temporal pattern discovery: A summary of results," in SDM, 2010, pp. 327-338.
2. J. A. Shine, J. P. Rogers, S. Shekhar and P. Mohan, "Discovering partially ordered patterns of Terrorism via Spatio-temporal Data Mining," in 16th Army Conference on Applied Statistics, Cary, NC, USA, 2010.
3. J. A. Shine, J. P. Rogers, S. Shekhar and P. Mohan, "Cascade models for spatio-temporal pattern discovery," in 1st USACE Research and Development Conference, Memphis, TN, USA, 2009.
4. M. Celik, S. Shekhar, B. George, J. P. Rogers and J. A. Shine, "Discovering and quantifying mean streets: A summary of results," 2007.