
Constructing Popular Routes
from Uncertain Trajectories
Authors of Paper:
Ling-Yin Wei
(National Chiao Tung University, Hsinchu)
Yu Zheng
(Microsoft Research Asia)
Wen-Chih Peng
(National Chiao Tung University, Hsinchu)
Paper reviewed by:
Aniruddha Desai
(University of Washington, Tacoma)
Applications
Scope: Infer popular routes from a set of
uncertain trajectories
◦ Trip Planning (Travel / Tourism)
◦ Traffic Management (Transportation)
◦ Animal Movement studies
Spatial Trajectories
What is a trajectory?
Sequence of points: Location (Lat, Long) & Time-stamp
What are the moving objects?
Humans, vehicles, animals, etc.
How are the trajectories collected?
Ubiquitous location acquisition technologies / devices
using GPS
Uncertainty and Inference

Trajectories generated at low or irregular
frequencies.

Routes between consecutive points on
trajectories are uncertain.

To infer a popular route we need to find
similarity between two uncertain trajectories –
this is hard to measure.
“RICK”
Route Inference framework based on
Collective Knowledge
Approach: aggregate uncertain trajectories in a mutually
reinforcing way: uncertain + uncertain => certain
Datasets:
◦ Real datasets used for conducting extensive
experiments
◦ Check-in dataset from Foursquare: 6,600
trajectories from Manhattan (at least 3 check-ins
each)
◦ 15,000 taxi trajectories in Beijing.
How does it work?
RICK overview: a user-specified query consists of a
location sequence & a time span; RICK infers the top-k
popular routes that pass through these locations within
the given time span.
Region Construction



Historical uncertain trajectories are used to construct a
routable graph in a gridded space based on spatiotemporal characteristics.
The grid cell size (“l”) represents the granularity of inferences.
Data points (or grid “cells”) are “spatially close” if:
|x - x’| <= 1 and |y - y’| <= 1
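A minimal sketch of the gridding step and the "spatially close" test above. It assumes points are already projected to planar coordinates in metres; the names to_cell and spatially_close are hypothetical, not from the paper.

```python
from typing import Tuple

Cell = Tuple[int, int]  # integer grid indices (x, y)


def to_cell(x_m: float, y_m: float, cell_size: float) -> Cell:
    """Map a planar coordinate (assumed to be in metres) to its grid cell."""
    return (int(x_m // cell_size), int(y_m // cell_size))


def spatially_close(a: Cell, b: Cell) -> bool:
    """|x - x'| <= 1 and |y - y'| <= 1 on grid indices."""
    return abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1
```

With a cell size of 100 m, a point at (250, 90) falls in cell (2, 0), which is spatially close to cell (3, 1) but not to cell (4, 0).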
Region Construction (cont’d…)

Data points are “st-correlated” (spatio-temporally
correlated) if they are spatially close (Rule 1 or Rule 2)
and they mutually satisfy a temporal constraint q.

The connection support of a cell pair is compared against
a threshold C to decide connectivity in the graph.

Neighbor: If the connection support of a cell pair is
>= C then they are neighbors.
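One possible reading of these definitions, as a sketch only. The slides do not say how connection support is computed, so this assumes it is the number of st-correlated point pairs between two distinct cells, and that the temporal constraint is |t - t'| <= q. Both are assumptions, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def spatially_close(a, b):
    return abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1


def st_correlated(p, p2, q):
    """p = (cell, timestamp). Assumed rule: close cells and |t - t'| <= q."""
    return spatially_close(p[0], p2[0]) and abs(p[1] - p2[1]) <= q


def connection_support(points, q):
    """Assumed definition: count st-correlated point pairs per cell pair."""
    support = defaultdict(int)
    for p, p2 in combinations(points, 2):
        if p[0] != p2[0] and st_correlated(p, p2, q):
            support[frozenset((p[0], p2[0]))] += 1
    return support


def neighbor_pairs(support, C):
    """Cell pairs whose connection support is >= C are neighbors."""
    return {pair for pair, s in support.items() if s >= C}
```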
Region Construction (cont’d…)


Region: regions are constructed from cell pairs whose
connection support reaches the specified threshold
value ‘C’.
Cell pairs are merged into regions using an efficient
recursive algorithm; Time complexity: O(cnm²)
Where c = minimum loop iterations
n = size (cardinality) of the set of cells in the grid space
m = size (cardinality) of the dataset
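The merging idea can be sketched as follows. This is not the paper's recursive algorithm: it swaps in a union-find structure and assumes a region is simply a connected group of neighboring cells. The name build_regions is hypothetical.

```python
def build_regions(cells, neighbor_pairs):
    """Group cells into regions: connected components of the neighbor relation."""
    parent = {c: c for c in cells}

    def find(c):
        # Walk to the root, halving the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in neighbor_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two regions

    regions = {}
    for c in cells:
        regions.setdefault(find(c), set()).add(c)
    return list(regions.values())
```

A cell with no neighbors ends up as a region of its own.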
Edge Inference

After the regions are constructed we infer edges.

Two types of Edges:
◦ Edges within each region
◦ Edges among regions
Edge Inference (cont’d…)

Each vertex represents a cell and each edge indicates a
transition relationship and has two attributes:
◦ Transition support
◦ Travel time

Virtual bidirected edges between cells (vertices) are
generated if cells are neighbors in a region.

A shortest-path inference approach is used. The direction,
transition support and travel time of each edge
on the shortest path are stored.

Redundant edges and edges whose transition support is
0 are eliminated
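A small sketch of how an edge with its two attributes might be represented, and of the zero-support pruning step. The Edge class, its field names and the travel-time unit (seconds) are assumptions; "redundant" edges are not defined in these slides, so only the zero-support rule is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: tuple               # source cell (grid indices)
    dst: tuple               # destination cell (grid indices)
    transition_support: int  # how strongly the data supports src -> dst
    travel_time: float       # assumed to be in seconds


def prune_edges(edges):
    """Drop edges whose transition support is 0."""
    return [e for e in edges if e.transition_support > 0]
```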
Route Inference

Two phases:
◦ Route generation
◦ Route refinement

Route generation:
◦ Top-k coarse routes are discovered with the routable
graph
Route Inference (cont’d…)

If a query location cannot be mapped to a graph vertex,
we use MINDIST (nearest neighbor algorithm) to find
the cells close to the query location.

Local Routes: the top-k local routes between any two
consecutive cells are searched in the cell sequence by
an A*-like algorithm.

Route score is computed based on the range of time
interval between the two query locations.

Based on the top-k local routes, the top-k global routes are
found by a branch-and-bound search approach.
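To illustrate the top-k local route search, here is a best-first search that returns the k cheapest loop-free paths between two vertices. It is a stand-in, not the paper's A*-like algorithm: it uses no heuristic (equivalent to A* with h = 0), and a generic non-negative edge cost replaces the paper's route score. The function name and graph layout are hypothetical.

```python
import heapq


def top_k_routes(graph, source, target, k):
    """graph: {node: [(neighbor, cost), ...]} with non-negative costs.

    Returns up to k (cost, path) tuples, cheapest first.
    """
    heap = [(0.0, [source])]  # partial routes ordered by accumulated cost
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for nxt, w in graph.get(node, []):
            if nxt not in path:  # keep routes loop-free
                heapq.heappush(heap, (cost + w, path + [nxt]))
    return found
```

Because partial routes are expanded in order of cost, the first k routes that reach the target are the k cheapest. The search can be exponential on dense graphs, which is why a lower bound for pruning matters in practice.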
Route Inference (cont’d…)
Two-Layer Routing Algorithm
◦ Before searching for local routes, region sequences are
generated to reduce the search space by using a lower
bound of the transition times between the regions
with respect to two given cells.

Thus, multiple region sequences are possible
Route Inference (cont’d…)
Route Refinement:

Use the historical data points (from trajectories that traverse
the cells on the rough route) that are located in the cells on
the generated route.

Apply linear regression to the set of points in each cell to
derive a line segment.

Concatenate the line segments in the order of the inferred
route.
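The refinement steps can be sketched as an ordinary least-squares fit per cell. This assumes planar (x, y) points and clips each fitted line to the x-range of the cell's points; fit_segment and refine_route are hypothetical names, and the paper may derive segment endpoints differently.

```python
def fit_segment(points):
    """Fit y = a*x + b by least squares; return the segment's two endpoints."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        # All x equal: the best-fit line is vertical.
        ys = [y for _, y in points]
        return (mx, min(ys)), (mx, max(ys))
    sxy = sum((x - mx) * (y - my) for x, y in points)
    a = sxy / sxx
    b = my - a * mx
    x0 = min(x for x, _ in points)
    x1 = max(x for x, _ in points)
    return (x0, a * x0 + b), (x1, a * x1 + b)


def refine_route(cells_in_order, points_by_cell):
    """One segment per cell, concatenated in the order of the inferred route."""
    return [fit_segment(points_by_cell[c])
            for c in cells_in_order if points_by_cell.get(c)]
```

Cells without historical points are skipped here; endpoints are ordered by x, not by travel direction.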
Performance Evaluation


Inferred routes are compared against the ground truth from
raw trajectories.
Two metrics used:
◦ NDTW - normalized dynamic time warping distance
◦ MD - maximum distance between the inferred route and the raw trajectory of the ground truth.
RICK is compared with an existing approach, MPR (Most
Popular Route), as the baseline.
◦ Time efficiency is tested (avg. query time 0.5 secs).
◦ RICK outperforms the baseline by generating routes
300-700m closer to the ground truth than those of
the baseline.
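For readers unfamiliar with the two metrics named on this slide, a minimal sketch follows. The DTW recurrence is the standard one; the normalization (dividing by the length of the longer sequence) and the MD definition (largest distance from an inferred point to its nearest ground-truth point) are assumptions, since the slides do not give the paper's exact formulas.

```python
import math


def dtw(a, b):
    """Classic dynamic time warping distance between two point sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def ndtw(a, b):
    """Assumed normalization: DTW divided by the longer sequence length."""
    return dtw(a, b) / max(len(a), len(b))


def max_distance(route, truth):
    """Assumed MD: worst-case distance from a route point to the truth."""
    return max(min(math.dist(p, q) for q in truth) for p in route)
```

For a route that runs parallel to the ground truth at a constant offset of 1 unit, both functions return 1.0.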

Visualization of Results
Visualization of the query: “Central Park -> The Museum of
Modern Art -> Times Square -> Empire State Building ->
SoHo”, for top-1 (most popular) route inferred by RICK
Note:
The route does not just connect the query locations, but
passes through other attractions along the “inferred” most
popular route.
Strengths

Thorough / Credible
The authors have conducted extensive experiments on
real data. Their results show that the route inference
framework is effective, efficient and measurably
accurate.

Organized / Easy to understand
The content of the paper is very well organized and can
be easily understood even by a naïve reader.

Illustrations (where provided) are very effective in
describing spatial concepts.
Weaknesses

Connection Support: Not explained sufficiently;
diagrams would have helped explain this key concept.

Route generated using A*-like algorithm: The role of
the A*-like algorithm in generating the inferred route
is not explained adequately.

NDTW: “Normalized dynamic time warping” distance
is not explained adequately; diagrams would have helped
explain this key performance metric better.
Thank you!
Q&A