Compression of GPS Trajectories Minjie Chen, Mantao Xu and Pasi Fränti Speech and Image Processing Unit (SIPU) School of Computing University of Eastern Finland, FINLAND Presented on Apr 10th for Data Compression Conference, Snowbird, Utah, USA. User upload GPS file to OpenstreetMap.org Dataset in MOPSI Project Many GPS Trajectories are collected by Geo-position devices to depict the movement of human, car, animals... It includes latitude, longitude and time information Microsoft Geolife dataset Example of GPS Trajectories BerlinMOD Cycling dataset Animal Movement Plenty of date space are needed in client side to store these data In GPX format, Storage cost is around : 43KB/hour(binary) , 300+KB/hour(GPX) if the data is collected at 1 second interval. For 10,000 users, it is 30GB/day, 10TB/year. Geolife and MOPSI BerlinMOD From http://onestepahead.de Trajectory simplification (TS) Top-down time-ratio (TD-TR) Open Window (OW) Threshold-guided algorithm STTrace Spatial join SQUISH Generic Remote Trajectory Simplification (GRTS) Multi-resolution Polygonal Approximation (MRPA) With different error measures synchronous Euclidean distance (SED) position, speed and orientation spatial join Fréchet distance local integral square synchronous Euclidean distance Maximum Synchronous Euclidean distance (max SED) is used as the error metrics. The errors were measured through distances between pairs of temporally synchronized positions. Reduction The reduced data points are saved directly with a fixed bit length Support both the visualization process and the effective trajectory queues in database. Compression (This is discussed in this paper) Optimizes both for the reduction and the quantization in the encoding process A better compression ratio, appropriate for data storage. 44 points 13 points The original route has 575 points in this example 6 points Only lossy compression of Vector data are considered (No timestamp information) Uniform quantization Product scalar quantization Clustering-based method Reference line approach Combine scalar quantization and reduction via Dynamic Programming UK map differential coordinates For GPS Trajectory, speed and direction change will be robust variant in the encoding process distance speed 1000 60 40 Speed (m/s) Distance (m) 800 600 400 20 200 0 0 4,000 8,000 12,000 0 16,000 0 4000 Time 8000 12000 16000 12000 16000 Time 10000 60 40 6000 Speed (m/s) Distance (m) 8000 4000 20 2000 0 0 4000 8000 Time 12000 16000 0 0 4000 8000 Time Speed and direction changes are incorporated in the encoding process instead of using the differential coordinates. Line simplification and quantization are combined in order to seek an approximation result for compression. A greedy solution is used for the trajectory approximation in this paper. 10 2 pj ' pj 5 0 pi ' pi lv 2 / ti -5 l 2 tan 1 -10 -10 -5 0 2 / 2 vi* ti 2 / 2 5 10 Lossless Compression by adaptive arithmetic coding Probability estimation p t / rtspmax p( t ( k ) ti ti ) 1 t k where p rt ( t (k ) / mintsp ) / rtspmax r ( s), s 1 t rtspmax max( t ( q)) / tspmin , q 1,2,..., m 1. Updating 0.15 Probability k 1 0.2 0.1 0.05 1 t rt ( s ), s tk / tspmin rt ( s ) else t rt ( s ), 0 0 20 tspmin: minimum sampling time internal δt = 0.01, bias factor rt : rtspmax x 1 vector μt = 0.995, forgetting factor, higher weight for recent encoded values 40 t 60 80 Predict mean and variance, quantized level determined by time difference Predict speed and variance by previous enccoded value Speed(m/s) 4 3 2 1 0 0 200 400 600 Time(s) p( spd * (k )) p spd / nlvspd ( k ) 1 spd 1 t ( k ) t (i ) 2 spd pred ( k ) nc1 spd * (i ) ( t (i ) exp( ( ) )), t (i ) t ( k ) d t ( k ) 2 w i t spdpred k nc 2 t (i ) (( spd (i ) spd pred (k )) 2 * i 2 2 2 6t 2 (i ) 2 GPS 2t 2 (i ) Quantized level determined by time difference and speed 0.06 0.1 0.05 Encoded Value 0.08 P( ) k 0.03 0.02 0 0.06 P(Δθk) 0.04 0.02 0.01 -3 -2 -1 0 1 2 0 3 0 Update -3 -2 -1 0 k P(Δθ0) 0.2 0.15 P(0| ) k P(Δθ0) P(0) 0.04 0.1 P(Δθ0 |Δθk) 0.05 0 -3 -2 -1 0 0 1 2 3 1 2 3 Time Cost (s / 10,000 points) 3 2.5 2 Time cost is 2s for 10,000 points using Matlab implementation Encoding 1.5 1 Decoding 0.5 0 1 3 10 Max SED (m) 30 100 Bit-rate (KB/hour) 1.5 Only 35% comparing with those “compression” algorithm + 7-Zip (Lempel-Ziv Markov chain Algorithm) on Geolife dataset TD-TR+7-Zip 1 VMC 0.5 GTC 0 1 3 10 30 100 Max SED (m) KB/h on the compression algorithm KB/h on the compression algorithm Estimated Storage Cost for a long time period 3m maxSED, 0.36 KB/h 10m maxSED, 0.19KB/h Visualization of GPS trajectory compression Visualization of GPS trajectory compression original compressed 50m maxSED, 0.06KB/h Visualization of GPS trajectory compression original compressed original compressed maxSED =3m meanSED=1.5m maxSED =10m meanSED=4.9m maxSED =49.8m meanSED=26.4m original file is 99549 bytes and compressed file is 544 bytes, bitrate is 0.35562KB/h original file is 99549 bytes and compressed file is 283 bytes, bitrate is 0.185KB/h original file is 99549 bytes and compressed file is 129 bytes, bitrate is 0.084328KB/h A demo is published on http://cs.joensuu.fi/~mchen/GPSTrajComp.htm KB/h on the compression algorithm The bit-rate can be reduced around 30%, 20%, 15% for 1m, 3m, 10m max SED. Bit-rate will not be changed for 30m, 100m max SED. State-of-the-art lossy compression algorithm for GPS Trajectories with 0.39KB/h bit-rate for geolife dataset Approximate the encoding curve by both data reduction and quantization, on speed and direction change variant. Extension can be done on: Online compression Improvement of approximation and encoding process by dynamic programming (improve 15%-20%) In urban area, road network can be considered Consider similarity of multiple Trajectories (only time is needed to encode in similar part) N. Meratnia and R. A. de By. "Spatiotemporal Compression Techniques for Moving Point Objects", Advances in Database Technology, vol. 2992, 551–562, 2004. M. Potamias, K. Patroumpas, T. Sellis, "Sampling Trajectory Streams with Spatiotemporal Criteria", Scientific and Statistical Database Management (SSDBM), 275-284, 2006. H. Cao, O. Wolfson, G. Trajcevski, "Spatio-temporal data reduction with deterministic error bounds", VLDB Journal, 15(3), 211-228, 2006. A. Akimov, A. Kolesnikov and P. Fränti, "Coordinate quantization in vector map compression", IASTED Conference on Visualization, Imaging and Image Processing (VIIP’04), 748-753, 2004. S. Shekhar, S. Huang, Y. Djugash, J. Zhou, "Vector map compression: a clustering approach", ACM Int. Symp. Advances in Geographic Inform, 74-80, 2002. A. Kolesnikov, "Optimal encoding of vector data with polygonal approximation and vertex quantization", SCIA’05, LNCS, vol. 3540, 1186–1195. 2005. M. Chen, M. Xu and P. Fränti, "Fast dynamic quantization algorithm for vector map compression", IEEE Int. Conf. on Image Processing, 4289-4292, September 2010.” Y. Chen, K. Jiang, Y. Zheng, C. Li, N. Yu, "Trajectory Simplification Method for Location-Based Social Networking Services", ACM GIS workshop on Location-based social networking services, 33-40, 2009. J. Muckell, J. H. Hwang, C. T. Lawson, S. S. Ravi, "Algorithms for compressing GPS trajectory data: an empirical evaluation", SIGSPATIAL International Conference on Advances in Geographic Information Systems, 402-405, 2010. J. Muckell, J. H. Hwang, V. Patil, C. T. Lawson, F. Ping , S. S. Ravi, "SQUISH: an online approach for GPS trajectory compression", International Conference on Computing for Geospatial Research & Applications, 1-8, 2011. M. Chen, M. Xu and P. Fränti, "A Fast O(N) Multi-resolution Polygonal Approximation Algorithm for GPS Trajectory Simplification", IEEE Transactions on Image Processing (in press). G. Kellaris, N. Pelekis and Y. Theodoridis, "Trajectory Compression under Network Constraints", Lecture Notes in Computer Science, Vol. 5644, pp.392-398, 2009. F. Schmid, K. F. Richter and P. Laube, "Semantic Trajectory Compression", Lecture Notes in Computer Science, Vol. 5644, pp.411-416, 2009. M. Koegel, M. Mauve,”On the Spatio-Temporal Information Content and Arithmetic Coding of Discrete Trajectories”, International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, Copenhagen, Denmark, December 2011. Speed at x direction Speed at y direction 60 40 50 40 Speed (m/s) Speed (m/s) 30 20 30 20 10 10 0 0 4000 8000 12000 0 16000 0 4000 Time 8000 12000 16000 Time Speed Direction Change 60 1.5 Direction Change 1 Speed (m/s) 40 20 0.5 0 -0.5 -1 -1.5 0 0 4000 8000 Time 12000 16000 -2 0 4000 8000 Time 12000 16000