Tracking Moving Objects in Anonymized Trajectories
Nikolay Vyahhi (1), Spiridon Bakiras (2), Panos Kalnis (3), and Gabriel Ghinita (3)
(1) St. Petersburg State University
(2) John Jay College, City Univ. of New York
(3) National University of Singapore

Motivation
Collection of Trajectory Data
  Example: Traffic monitoring system
    GPS or sensors deployed across a city
    Queries: Predict traffic conditions
  Data expected to be anonymous
    Remove ID
Reconstruction of original trajectories
  E.g., Police tracking a suspect

Problem Statement
Given a large database with anonymized spatio-temporal measurements, reconstruct the original object trajectories
Requirements
  Efficiency (large databases)
  Accuracy (useful results)

Problem Statement
Input: A series of M snapshots S_i, each containing exactly N measurements from timestamp t_i
Output: A set of N trajectories
Each measurement can be associated with a single trajectory
Figure example: M = N = 3

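The input and output described above could be represented as follows. This is only a minimal sketch: the type names `Point`, `Snapshot`, and `Trajectory` and the helper `is_valid_output` are hypothetical and do not come from the slides. A trajectory is assumed to store, for each timestamp, the index of one measurement in that snapshot.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # one anonymized measurement (x, y)
Snapshot = List[Point]        # the N measurements of one timestamp
Trajectory = List[int]        # index of the chosen measurement per snapshot


def is_valid_output(snapshots: List[Snapshot], tracks: List[Trajectory]) -> bool:
    """Check that every measurement belongs to exactly one trajectory."""
    n = len(snapshots[0])
    # There must be N tracks and every snapshot must hold N measurements.
    if len(tracks) != n or any(len(s) != n for s in snapshots):
        return False
    # Every track must pick one measurement in each of the M snapshots.
    if any(len(t) != len(snapshots) for t in tracks):
        return False
    # At each timestamp the tracks must use every measurement index once.
    for i in range(len(snapshots)):
        if sorted(t[i] for t in tracks) != list(range(n)):
            return False
    return True
```

With M = N = 3, three tracks such as `[0, 1, 2]`, `[1, 0, 0]`, and `[2, 2, 1]` pass the check, because each timestamp uses the indices 0, 1, and 2 exactly once.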
Related work: Multiple Target Tracking
This problem is closely related to multiple target tracking (MTT) algorithms
  Studied in the field of radar technology
Three major categories
  Nearest neighbor (NN)
  Joint probabilistic data association (JPDA)
  Multiple hypothesis tracking (MHT)

Related work: NN and JPDA
They work in a single scan of the dataset
Greedy approach: in each timestamp, every sample is associated with a single track
Objective: minimize the error across all associations in the current timestamp
Performance:
  Efficient: can work in polynomial time
  Greedy approach results in many false associations

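To illustrate the single-scan, greedy idea, here is a deliberately simplified nearest-neighbor step. It is not the NN or JPDA algorithm from the literature: the function name `greedy_nn_step` is hypothetical, tracks are simply processed in list order, and Euclidean distance to the last known position is assumed as the error.

```python
import math


def greedy_nn_step(track_ends, measurements):
    """Assign each track end to its nearest still-unassigned measurement.

    Returns a list whose entry i is the index of the measurement chosen
    for track i. Tracks are processed in order, so earlier tracks get
    first pick. Assumes len(measurements) >= len(track_ends).
    """
    free = set(range(len(measurements)))
    assignment = []
    for end in track_ends:
        # Break distance ties by the smaller measurement index.
        best = min(free, key=lambda j: (math.dist(end, measurements[j]), j))
        free.remove(best)
        assignment.append(best)
    return assignment
```

Because each decision uses only the current timestamp, one early wrong pick can force wrong picks for later tracks, which is the source of the false associations mentioned above.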
Related work: MHT
Multiple hypotheses are maintained
  Joint probabilities are calculated recursively when new measurements are received
Each association is based on both previous and subsequent data (multiple scans)
Infeasible hypotheses are eventually eliminated
Performance:
  Very accurate
  Computational and space complexity is exponential in the number of measurements

Comparison
MHT: very accurate, very slow
NN and JPDA: fast, large errors
Our approach: very accurate, much faster than MHT

Our Approach
MCMF: Min-cost Max-flow
Transform the tracking problem into a min-cost max-flow problem
Min-cost max-flow (graph algorithm)
  Input: a weighted graph G with two special nodes (source s and destination t)
  Objective: among all maximum flows that can be sent from s to t, find one with the minimum cost
  Well-known algorithms exist that work in polynomial time

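The min-cost max-flow problem itself can be sketched in a few lines. The code below is one generic textbook way to solve it (successive shortest paths with Bellman-Ford), not necessarily the implementation used by the authors. The function name `min_cost_max_flow` and the edge format `(u, v, capacity, cost)` are assumptions made for this example.

```python
def min_cost_max_flow(n, edges, s, t):
    """Return (flow, cost) for a directed graph with nodes 0..n-1.

    edges: list of (u, v, capacity, cost). Negative costs are allowed
    as long as the graph has no negative-cost cycle.
    """
    # Residual graph: edge e and its reverse e ^ 1 are stored as a pair.
    to, cap, cost = [], [], []
    adj = [[] for _ in range(n)]
    for u, v, c, w in edges:
        adj[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        adj[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    inf = float("inf")
    flow = total = 0
    while True:
        dist = [inf] * n
        prev = [-1] * n            # residual edge used to reach each node
        dist[s] = 0
        for _ in range(n - 1):     # Bellman-Ford relaxation rounds
            changed = False
            for u in range(n):
                if dist[u] == inf:
                    continue
                for e in adj[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]
                        prev[to[e]] = e
                        changed = True
            if not changed:
                break
        if dist[t] == inf:         # no augmenting path left
            return flow, total
        # Find the bottleneck capacity on the path, then push that much.
        push, v = inf, t
        while v != s:
            push = min(push, cap[prev[v]])
            v = to[prev[v] ^ 1]    # tail of the edge that reached v
        v = t
        while v != s:
            cap[prev[v]] -= push
            cap[prev[v] ^ 1] += push
            v = to[prev[v] ^ 1]
        flow += push
        total += push * dist[t]
```

For example, on a four-node graph with unit capacities, the call returns both the amount of flow and its total cost:

```python
edges = [(0, 1, 1, 1), (0, 2, 1, 4), (1, 2, 1, 1), (1, 3, 1, 5), (2, 3, 1, 1)]
print(min_cost_max_flow(4, edges, 0, 3))  # (2, 11)
```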
Transformation
All edges have capacity 1
Node id (t_i, p_i, p_j): the object moves from location p_i in timestamp t_i to location p_j in timestamp t_{i+1}

Calculating the Cost Values
Assume two successive measurements (p_i and p_j) belong to the same track
Use these values to predict the next location
Calculate the error (i.e., cost) for every possible location p_k

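The slides do not state which prediction model or error measure is used. As one possible reading, the sketch below assumes linear extrapolation (the object repeats its last displacement) and uses the Euclidean distance between the predicted and the candidate location as the cost. The names `predict_next` and `association_cost` are hypothetical.

```python
import math


def predict_next(p_i, p_j):
    """Extrapolate linearly: repeat the displacement from p_i to p_j."""
    return (2 * p_j[0] - p_i[0], 2 * p_j[1] - p_i[1])


def association_cost(p_i, p_j, p_k):
    """Distance between the predicted location and the candidate p_k."""
    return math.dist(predict_next(p_i, p_j), p_k)
```

Under these assumptions, an object seen at (0, 0) and then (1, 0) is predicted at (2, 0), so a candidate at (2, 0) has cost 0 and a candidate at (2, 3) has cost 3.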
Limitation of this Approach
Problem: A single measurement can be associated with multiple tracks!

Solution: Create a Block for each Measurement
Block for kth measurement of mth timestamp (p_{m,k})
  Corresponds to all partial tracks p_{m-1,i} -> p_{m,k} -> p_{m+1,j}
  A block containing a flow is marked as active
  The only possible route inside an active block is through the reverse path of the existing flow

Block Functionality
Block for p_{2,1}
  Original track: p_{1,1} -> p_{2,1} -> p_{3,1}
  New track: p_{1,1} -> p_{2,1} -> p_{3,2}
Block for p_{3,1}
  Original track: p_{2,1} -> p_{3,1} -> p_{4,1}
  New track: p_{2,2} -> p_{3,1} -> p_{4,1}

Improving the Running Time
Flow network is too large
  Inefficient, since the solution requires multiple shortest path calculations
Assume any object can travel at most R_max distance between two consecutive timestamps. R_max depends on
  The maximum speed of the objects
  The time interval between two timestamps
This significantly reduces the number of vertices and edges inside each block

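The pruning rule can be sketched as a simple distance filter between two consecutive snapshots. The function name `feasible_pairs` is hypothetical, and Euclidean distance is assumed; the slides only say that an object travels at most R_max between timestamps.

```python
import math


def feasible_pairs(snapshot_a, snapshot_b, r_max):
    """List index pairs (i, j) whose distance is at most r_max."""
    return [(i, j)
            for i, p in enumerate(snapshot_a)
            for j, q in enumerate(snapshot_b)
            if math.dist(p, q) <= r_max]
```

Only the surviving pairs would need vertices and edges in the flow network. This brute-force version compares all N x N pairs; a spatial index could avoid that, but that is beyond what the slides describe.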
The Tracking Algorithm
Successive Shortest Path Algorithm
  At each iteration, send a single flow unit across the shortest path from s to t
  Total of N iterations in our case
Most efficient implementation:
  Dijkstra with Fibonacci heap for priority queue
  Graph contains negative weights, but can utilize vertex potentials to avoid this (provided that there are no negative weight cycles)
  Bellman-Ford also works very well

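The vertex-potential trick can be illustrated on a small graph. The sketch below computes potentials h with Bellman-Ford, reweights every edge (u, v, w) to w + h[u] - h[v], and then runs Dijkstra on the reweighted graph. It assumes every node is reachable from the source and uses Python's binary heap (`heapq`) rather than a Fibonacci heap. All function names are hypothetical, and a full successive-shortest-path solver would also update the potentials after each iteration.

```python
import heapq


def bellman_ford(n, edges, s):
    """Shortest distances from s; edges are (u, v, w), no negative cycles."""
    dist = [float("inf")] * n
    dist[s] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist


def reduced_edges(edges, h):
    """Reweight each edge to w + h[u] - h[v], which is never negative."""
    return [(u, v, w + h[u] - h[v]) for u, v, w in edges]


def dijkstra(n, edges, s):
    """Dijkstra on non-negative weights, using a binary heap."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
    dist = [float("inf")] * n
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue               # stale queue entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist
```

A distance d' found on the reweighted graph maps back to the original graph as d' + h[v] - h[s], so shortest paths are preserved while all weights become non-negative.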
Dealing with Negative Weight Cycles
Negative weight cycles do appear in MCMF calculations
In this case, follow a greedy approach:
  Output all the tracks that are discovered so far
    They might not be optimal
  Remove all vertices and edges associated with these tracks from the flow network
  Start a new min-cost max-flow calculation on the reduced graph

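The slides do not say how a negative weight cycle is detected. One standard way, shown here only as an assumption, is the extra relaxation check of Bellman-Ford: if any edge can still be relaxed after n - 1 rounds, a negative cycle exists. The function name `has_negative_cycle` is hypothetical, and the sketch covers detection only, not the greedy restart described above.

```python
def has_negative_cycle(n, edges):
    """Bellman-Ford check; edges are (u, v, w) over nodes 0..n-1."""
    dist = [0] * n                 # start at 0 everywhere to cover all components
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Any further improvement means some cycle has negative total weight.
    return any(dist[u] + w < dist[v] for u, v, w in edges)
```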
Complexity
Computational:
  N iterations of a shortest path algorithm
  O(M N^2 K (log(MNK) + K)) for Dijkstra with Fibonacci heap
  K is the average number of feasible associations (due to R_max) per measurement
Space:
  O(M N K^2) for storing the graph

Experimental Evaluation
Data generator:
  Road map of San Francisco city
  For each object, randomly select a starting point and a destination point
  The object then follows the shortest path between the two points
  At each timestamp, every object i covers a distance d_i in [0, R_max]
  Number of measurements: 50,000 to 500,000

Experimental Evaluation
Competitor: Global Nearest Neighbor (GNN)
  Employs clustering within each snapshot
  Considered the best single scan algorithm; runs in O(M N C^2) time (C is the average cluster size)
Performance metrics:
  CPU time
  Success rate: percentage of partial tracks (triplets) that agree with original data

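The success rate could be computed along the following lines. This is a sketch under assumptions: a track is taken to be a list of measurement indices (one per timestamp), a triplet is three consecutive entries of a track together with its starting timestamp, and the rate is measured relative to the reconstructed triplets. The function names `triplets` and `success_rate` are hypothetical.

```python
def triplets(tracks):
    """Collect (timestamp, a, b, c) for every three consecutive points."""
    return {(m, t[m], t[m + 1], t[m + 2])
            for t in tracks
            for m in range(len(t) - 2)}


def success_rate(original, reconstructed):
    """Percentage of reconstructed triplets also present in the original."""
    truth = triplets(original)
    found = triplets(reconstructed)
    return 100.0 * len(found & truth) / len(found)
```

For instance, if two tracks over four timestamps are reconstructed correctly for the first three timestamps but swapped at the last one, two of the four triplets agree and the success rate is 50%.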
Variable N
[Plots: CPU time [sec] and success rate [%] for varying N]

Variable R_max (speed)
[Plots: CPU time [sec] and success rate [%] for varying R_max]

Points to Remember
Multiple-Target Tracking
  Large Anonymized Trajectory Databases
Existing methods are either inefficient or inaccurate
We proposed a polynomial time solution based on a novel transformation of the MTT problem into a min-cost max-flow problem
  Very accurate
  Need to improve the running time

Bibliography on LBS Privacy
http://anonym.comp.nus.edu.sg