Compression of GPS Trajectories
Minjie Chen, Mantao Xu and Pasi Fränti
Speech and Image Processing Unit (SIPU)
School of Computing
University of Eastern Finland, FINLAND
Presented on Apr 10th at the Data Compression Conference,
Snowbird, Utah, USA.
Users upload GPS files to OpenStreetMap.org
Dataset in MOPSI Project
Many GPS trajectories are collected by geo-positioning devices
to record the movement of people, cars, animals and so on.
Each trajectory includes latitude, longitude and time information.
Microsoft Geolife dataset
Example of GPS Trajectories
BerlinMOD Cycling dataset
Animal Movement
Plenty of data space is needed on the client side to store these data.
The storage cost is around 43 KB/hour in binary format and
300+ KB/hour in GPX format if the data is collected at a 1-second interval.
For 10,000 users, it is 30 GB/day, 10 TB/year.
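The storage figures can be checked with a little arithmetic. The sketch below assumes 12 bytes per binary point (three 4-byte fields for latitude, longitude and time) and 1 KB = 1000 bytes; neither is stated on the slide. The slide also does not say how many hours per day each user records, so the code only derives the value implied by the 30 GB/day figure.

```python
# Back-of-the-envelope check of the storage figures.
# Assumptions (not from the slide): 12 bytes per point, 1 KB = 1000 bytes.
POINTS_PER_HOUR = 3600      # one fix per second
BYTES_PER_POINT = 12        # assumed: latitude, longitude, time at 4 bytes each

binary_kb_per_hour = POINTS_PER_HOUR * BYTES_PER_POINT / 1000

GPX_KB_PER_HOUR = 300       # from the slide ("300+ KB/hour")
USERS = 10_000              # from the slide
DAILY_TOTAL_GB = 30         # from the slide

# Recording hours per user per day implied by the slide's totals.
implied_hours_per_day = DAILY_TOTAL_GB * 1e6 / (USERS * GPX_KB_PER_HOUR)
yearly_tb = DAILY_TOTAL_GB * 365 / 1000

print(f"binary: {binary_kb_per_hour:.1f} KB/hour")
print(f"implied recording time: {implied_hours_per_day:.1f} hours/user/day")
print(f"yearly total: {yearly_tb:.2f} TB")
```

Under these assumptions the binary rate comes out close to the quoted 43 KB/hour, and the yearly total is consistent with roughly 10 TB.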
Geolife and MOPSI
BerlinMOD
From http://onestepahead.de
Trajectory simplification (TS)
Top-down time-ratio (TD-TR)
Open Window (OW)
Threshold-guided algorithm
STTrace
Spatial join
SQUISH
Generic Remote Trajectory Simplification (GRTS)
Multi-resolution Polygonal Approximation (MRPA)
With different error measures
synchronous Euclidean distance (SED)
position, speed and orientation
spatial join
Fréchet distance
local integral square synchronous Euclidean distance
Maximum synchronous Euclidean distance (max SED) is used as
the error metric.
The errors were measured through distances between pairs of
temporally synchronized positions.
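A minimal sketch of the max SED measure, assuming points are given as (x, y, t) tuples in metres and seconds (a projected coordinate system, which is an assumption of this example). The function names `sed` and `max_sed` are illustrative only. Each removed point is compared with the position interpolated at the same timestamp on the simplified segment.

```python
import math

def sed(p, a, b):
    """Synchronous Euclidean distance of point p from segment a-b.

    The segment is sampled at p's timestamp by linear interpolation
    in time, then the plain Euclidean distance is taken.
    """
    (xa, ya, ta), (xb, yb, tb) = a, b
    x, y, t = p
    if tb == ta:                      # degenerate segment: no time span
        return math.hypot(x - xa, y - ya)
    r = (t - ta) / (tb - ta)          # fraction of the segment's time elapsed
    xs = xa + r * (xb - xa)
    ys = ya + r * (yb - ya)
    return math.hypot(x - xs, y - ys)

def max_sed(original, kept_idx):
    """Max SED between a trajectory and its simplification.

    kept_idx is the sorted list of indices of retained points and must
    include the first and last index of the original trajectory.
    """
    worst = 0.0
    for i0, i1 in zip(kept_idx, kept_idx[1:]):
        for j in range(i0 + 1, i1):   # points dropped between two kept points
            worst = max(worst, sed(original[j], original[i0], original[i1]))
    return worst

traj = [(0.0, 0.0, 0.0), (10.0, 5.0, 10.0), (30.0, 0.0, 20.0)]
print(max_sed(traj, [0, 2]))          # error when the middle point is dropped
```

Keeping all points gives a max SED of zero, so a simplification algorithm can use this function to test a candidate against an error tolerance.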
Reduction
The reduced data points are saved directly with a fixed bit
length
Supports both the visualization process and effective
trajectory queries in a database.
Compression (This is discussed in this paper)
Optimizes both for the reduction and the quantization in the
encoding process
A better compression ratio, appropriate for data storage.
[Figure: the original route has 575 points in this example; simplified versions with 44, 13 and 6 points are shown.]
Only lossy compression of vector data is considered (no
timestamp information)
Uniform quantization
Product scalar quantization
Clustering-based method
Reference line approach
Combine scalar quantization and reduction via Dynamic Programming
UK map
differential coordinates
For GPS trajectories, speed and direction change are robust
variables for the encoding process
[Figure: two pairs of plots showing distance (m) and speed (m/s) versus time.]
Speed and direction changes are incorporated in the encoding
process instead of using the differential coordinates.
Line simplification and quantization are combined in order to
seek an approximation result for compression.
A greedy solution is used for the trajectory approximation in this
paper.
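To illustrate the change of representation, the sketch below converts a trajectory into per-segment time difference, speed and direction change, and back. It assumes planar (x, y, t) points in metres and seconds with strictly increasing timestamps, and it involves no quantization; the function names are hypothetical and this is not the authors' encoder.

```python
import math

def to_speed_and_turn(points):
    """Convert (x, y, t) points to per-segment (dt, speed, dtheta).

    dtheta is the change of heading relative to the previous segment,
    wrapped to [-pi, pi). The first segment is measured from heading 0.
    """
    out = []
    prev_heading = 0.0
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        dx, dy = x1 - x0, y1 - y0
        speed = math.hypot(dx, dy) / dt
        heading = math.atan2(dy, dx)
        dtheta = (heading - prev_heading + math.pi) % (2 * math.pi) - math.pi
        out.append((dt, speed, dtheta))
        prev_heading = heading
    return out

def from_speed_and_turn(start, segments):
    """Rebuild (x, y, t) points from a start point and (dt, speed, dtheta)."""
    x, y, t = start
    heading = 0.0
    points = [start]
    for dt, speed, dtheta in segments:
        heading += dtheta
        x += speed * dt * math.cos(heading)
        y += speed * dt * math.sin(heading)
        t += dt
        points.append((x, y, t))
    return points

route = [(0.0, 0.0, 0.0), (10.0, 0.0, 5.0), (10.0, 10.0, 10.0), (0.0, 10.0, 20.0)]
segments = to_speed_and_turn(route)
rebuilt = from_speed_and_turn(route[0], segments)
```

For steady movement the speed and direction-change values stay within a narrow range, whereas differential coordinates change whenever the heading does, which is the motivation given on the slide.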
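[Figure: quantization of the segment from p_i to p_j (approximated as p_i' to p_j'), plotted on axes from -10 to 10. The speed step l_v is inversely proportional to the time difference Δt_i, and the direction-change step l_θ is given by an inverse tangent whose argument depends on v_i* Δt_i.]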
Lossless Compression by adaptive arithmetic coding
Probability estimation

$$p(\Delta t(k)) = \frac{p' + \delta_t / rtsp_{max}}{1 + \delta_t}$$

where

$$p' = \frac{r_t(\Delta t(k) / tsp_{min})}{\sum_{s=1}^{rtsp_{max}} r_t(s)}, \qquad rtsp_{max} = \max_q(\Delta t(q)) / tsp_{min}, \quad q = 1, 2, \ldots, m-1.$$

Updating

$$r_t^{k+1}(s) = \begin{cases} 1 + \mu_t \, r_t^{k}(s), & s = \Delta t_k / tsp_{min} \\ \mu_t \, r_t^{k}(s), & \text{else} \end{cases}$$

tspmin: minimum sampling time interval
δt = 0.01, bias factor
rt: rtspmax × 1 vector
μt = 0.995, forgetting factor, higher weight for recent encoded values

[Figure: estimated probability (0 to 0.2) versus Δt (0 to 80).]
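The adaptive probability model for time differences can be sketched as follows. The class uses the slide's parameters (bias factor δt = 0.01, forgetting factor μt = 0.995) and assumes that, after each encoded value, every count is multiplied by the forgetting factor and the count of the observed symbol is then increased by one. The uniform fallback for an empty model and the class name are assumptions of this example. A real arithmetic coder would consume these probabilities; here only the ideal code length is computed.

```python
import math

class AdaptiveTimeModel:
    """Adaptive symbol probabilities with a forgetting factor (sketch)."""

    def __init__(self, n_symbols, delta=0.01, mu=0.995):
        self.counts = [0.0] * n_symbols  # r_t: one count per multiple of tspmin
        self.delta = delta               # bias factor
        self.mu = mu                     # forgetting factor

    def prob(self, s):
        """Probability of symbol s, where s = dt / tspmin (1-based)."""
        n = len(self.counts)
        total = sum(self.counts)
        # Assumption: fall back to a uniform estimate before any update.
        empirical = self.counts[s - 1] / total if total > 0 else 1.0 / n
        return (empirical + self.delta / n) / (1.0 + self.delta)

    def update(self, s):
        """Decay all counts, then reinforce the symbol just encoded."""
        self.counts = [self.mu * c for c in self.counts]
        self.counts[s - 1] += 1.0

def ideal_code_length(symbols, n_symbols):
    """Total bits an ideal arithmetic coder would spend on the sequence."""
    model = AdaptiveTimeModel(n_symbols)
    bits = 0.0
    for s in symbols:
        bits += -math.log2(model.prob(s))
        model.update(s)
    return bits

# Mostly 1-second sampling with an occasional 5-second gap.
print(ideal_code_length([1] * 200 + [5] + [1] * 200, n_symbols=10))
```

The bias term keeps every probability strictly positive, so a time difference that has never been seen can still be encoded.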
Predict mean and variance; the quantized level is determined by the time difference.
Predict speed and variance from previously encoded values.

[Figure: speed (m/s) versus time (s).]

$$p(spd^*(k)) = \frac{p' + \delta_{spd} / nlv_{spd}(k)}{1 + \delta_{spd}}$$

$$spd_{pred}(k) = nc_1 \sum_i spd^*(i) \, \Delta t(i) \, \exp\left(-\frac{1}{2}\left(\frac{t(k) - t(i)}{w}\right)^2\right)$$

where the sum runs over previously encoded points i in a time window before t(k).

$$\sigma^2_{spd_{pred}}(k) = nc_2 \sum_i \Delta t(i) \left( (spd^*(i) - spd_{pred}(k))^2 + \ldots \right)$$

where the remaining terms are error terms in 1/Δt²(i), one of them involving σ_GPS.
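As a rough illustration of predicting speed from previously encoded values, the sketch below forms a weighted mean and variance in which each earlier segment is weighted by its duration and by a Gaussian-shaped term in its time distance from the current point. The `window` and `width` parameters and their default values are hypothetical, and the additional error terms of the slide's variance formula are left out, so this shows the general idea rather than the exact predictor.

```python
import math

def predict_speed(history, t_now, window=60.0, width=20.0):
    """Predict the mean and variance of the next speed value.

    history: list of (t, dt, speed) for already encoded segments.
    window:  only segments with t in [t_now - window, t_now] are used (assumed).
    width:   scale of the Gaussian-shaped time weighting (assumed).
    Returns (None, None) when no segment falls inside the window.
    """
    weighted = []
    for t, dt, speed in history:
        if t_now - window <= t <= t_now:
            weight = dt * math.exp(-0.5 * ((t_now - t) / width) ** 2)
            weighted.append((weight, speed))
    total = sum(w for w, _ in weighted)
    if total == 0.0:
        return None, None
    mean = sum(w * s for w, s in weighted) / total
    var = sum(w * (s - mean) ** 2 for w, s in weighted) / total
    return mean, var

history = [(0.0, 5.0, 2.0), (5.0, 5.0, 2.5), (10.0, 5.0, 3.0)]
mean, var = predict_speed(history, t_now=15.0)
print(mean, var)
```

The predicted mean and variance would then parameterize the probability of each quantized speed level passed to the arithmetic coder.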
Quantized level determined by time difference and speed
[Figure: probability distributions of the direction change, P(Δθk), P(Δθ0) and P(Δθ0 | Δθk), plotted over -3 to 3 and updated with each encoded value.]
Time cost is 2 s for 10,000 points using a Matlab implementation.
[Figure: encoding and decoding time cost (s / 10,000 points) versus max SED (1, 3, 10, 30, 100 m).]
The bit-rate is only 35% of that of existing "compression" algorithms followed by 7-Zip (Lempel-Ziv-Markov chain algorithm) on the Geolife dataset.
[Figure: bit-rate (KB/hour) versus max SED (1, 3, 10, 30, 100 m) for TD-TR+7-Zip, VMC and GTC.]
KB/h on the compression algorithm
Estimated Storage Cost for a long time period
Visualization of GPS trajectory compression (original versus compressed)
3 m maxSED, 0.36 KB/h: maxSED = 3 m, meanSED = 1.5 m; original file is 99549 bytes and compressed file is 544 bytes, bit-rate is 0.35562 KB/h
10 m maxSED, 0.19 KB/h: maxSED = 10 m, meanSED = 4.9 m; original file is 99549 bytes and compressed file is 283 bytes, bit-rate is 0.185 KB/h
50 m maxSED, 0.06 KB/h: maxSED = 49.8 m, meanSED = 26.4 m; original file is 99549 bytes and compressed file is 129 bytes, bit-rate is 0.084328 KB/h
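The reported file sizes and bit-rates can be related with a short calculation. The sketch below assumes 1 KB = 1000 bytes, which the slide does not state, and derives the recording duration implied by each compressed size and bit-rate together with the compression ratio against the 99549-byte original.

```python
ORIGINAL_BYTES = 99549

# label -> (compressed size in bytes, reported bit-rate in KB/h), from the slide
cases = {
    "3 m": (544, 0.35562),
    "10 m": (283, 0.185),
    "50 m": (129, 0.084328),
}

def recording_hours(size_bytes, rate_kb_per_hour):
    """Duration implied by a file size and a bit-rate (assumes 1 KB = 1000 bytes)."""
    return size_bytes / 1000 / rate_kb_per_hour

def compression_ratio(size_bytes):
    """How many times smaller the compressed file is than the original."""
    return ORIGINAL_BYTES / size_bytes

for label, (size, rate) in cases.items():
    print(f"{label} max SED: {recording_hours(size, rate):.2f} h of data, "
          f"ratio {compression_ratio(size):.0f}:1")
```

With this unit assumption all three cases imply the same recording duration, which suggests they describe the same route at different error tolerances.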
A demo is published on http://cs.joensuu.fi/~mchen/GPSTrajComp.htm
KB/h on the compression algorithm
The bit-rate can be reduced by around 30%, 20% and
15% for 1 m, 3 m and 10 m max SED, respectively.
The bit-rate does not change for 30 m and 100 m max
SED.
A state-of-the-art lossy compression algorithm for GPS
trajectories, with a bit-rate of 0.39 KB/h on the Geolife dataset
The encoding curve is approximated by both data reduction and
quantization, applied to the speed and direction change variables.




Extensions can be made in the following directions:
Online compression
Improvement of the approximation and encoding process by
dynamic programming (improves 15%-20%)
In urban areas, the road network can be considered
Consider the similarity of multiple trajectories (only time needs
to be encoded in the similar parts)
N. Meratnia and R. A. de By. "Spatiotemporal Compression Techniques for Moving Point Objects", Advances in
Database Technology, vol. 2992, 551–562, 2004.
M. Potamias, K. Patroumpas, T. Sellis, "Sampling Trajectory Streams with Spatiotemporal Criteria", Scientific and
Statistical Database Management (SSDBM), 275-284, 2006.
H. Cao, O. Wolfson, G. Trajcevski, "Spatio-temporal data reduction with deterministic error bounds", VLDB Journal,
15(3), 211-228, 2006.
A. Akimov, A. Kolesnikov and P. Fränti, "Coordinate quantization in vector map compression", IASTED Conference on
Visualization, Imaging and Image Processing (VIIP’04), 748-753, 2004.
S. Shekhar, S. Huang, Y. Djugash, J. Zhou, "Vector map compression: a clustering approach", ACM Int. Symp. Advances
in Geographic Inform, 74-80, 2002.
A. Kolesnikov, "Optimal encoding of vector data with polygonal approximation and vertex quantization", SCIA’05, LNCS,
vol. 3540, 1186–1195. 2005.
M. Chen, M. Xu and P. Fränti, "Fast dynamic quantization algorithm for vector map compression", IEEE Int. Conf. on
Image Processing, 4289-4292, September 2010.
Y. Chen, K. Jiang, Y. Zheng, C. Li, N. Yu, "Trajectory Simplification Method for Location-Based Social Networking
Services", ACM GIS workshop on Location-based social networking services, 33-40, 2009.
J. Muckell, J. H. Hwang, C. T. Lawson, S. S. Ravi, "Algorithms for compressing GPS trajectory data: an empirical
evaluation", SIGSPATIAL International Conference on Advances in Geographic Information Systems, 402-405, 2010.
J. Muckell, J. H. Hwang, V. Patil, C. T. Lawson, F. Ping , S. S. Ravi, "SQUISH: an online approach for GPS trajectory
compression", International Conference on Computing for Geospatial Research & Applications, 1-8, 2011.
M. Chen, M. Xu and P. Fränti, "A Fast O(N) Multi-resolution Polygonal Approximation Algorithm for GPS Trajectory
Simplification", IEEE Transactions on Image Processing (in press).
G. Kellaris, N. Pelekis and Y. Theodoridis, "Trajectory Compression under Network Constraints", Lecture Notes in
Computer Science, Vol. 5644, pp.392-398, 2009.
F. Schmid, K. F. Richter and P. Laube, "Semantic Trajectory Compression", Lecture Notes in Computer Science, Vol. 5644,
pp.411-416, 2009.
M. Koegel, M. Mauve, "On the Spatio-Temporal Information Content and Arithmetic Coding of Discrete Trajectories",
International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, Copenhagen,
Denmark, December 2011.
[Figure: speed in the x direction and speed in the y direction (m/s) versus time.]
[Figure: speed (m/s) and direction change versus time.]