T-Drive: Driving Directions Based on Taxi Trajectories
Jing Yuan, Yu Zheng, Chengyang Zhang, Wenlei Xie, Xing Xie, Guangzhong Sun and Yan Huang
Microsoft Research; Computer Science Department, University of North Texas, 2010
Presented by Salem Othman, Kent State University, Nov-4-2011
Email: Sothman@kent.edu
http://www.samtaxicabservices.com/

Background
  - How long does it really take to drive from point A to point B at 5:00 pm?
  - Shortest time.
  - Shortest distance.

Background (cont.)
  - Practically fastest route

Motivation
  - Big cities have a large number of taxicabs equipped with GPS sensors
  - Historical GPS trajectories
  - Taxi drivers are experienced drivers
http://barrycarguythomas.blogspot.com/2011/05/monday-another-taxi-story.html

Goal
  - Model the dynamic traffic patterns
  - Model the intelligence of experienced drivers
http://www.asnowtech.com/genetics-of-human-intelligence-2171215.html
http://www.fastcompany.com/1644403/microsoft-predestination-can-predict-where-youre-going

Outline
  - Challenges faced in building the system
  - Methodology
      - Trajectory preprocessing
      - Landmark graph construction
      - Travel time estimation
      - Route computing
  - Experiments
  - Conclusion
  - References

Challenges faced in building the system
  - Intelligence modeling: can we answer any user query?
  - Data sparseness and coverage: can we accurately estimate the speed pattern of each road segment?
  - Low sampling rate problem: is there uncertainty about the routes traversed by a taxi?

Step 1: Preprocessing
  - Taxi trajectory: a sequence of GPS points pertaining to one trip.
  - Road segment: a directed edge, one-way or bidirectional
  - Trajectory segmentation: partition a GPS log into taxi trajectories
  - Map matching: map each GPS point of a trip to the corresponding road segment
[Figure: Taxi #6, 3 days; points a and b, road segments R1 to R4]

Step 2: Building the landmark graph
  - Landmark: one of the top-k road segments most frequently traversed by taxis
  - Select the top-k road segments
  - Connect two landmarks with a landmark edge
[Figure: A) Matched taxi trajectories; B) Detected landmarks; C) A landmark graph]

Step 3: Travel time estimation
  - Travel times gather around a few values, like a set of clusters.
  - V-clustering: find the best split point, the one with minimal weighted average variance
  - E-clustering: split the x-axis into several time slots
  - Compute the distribution of travel time in each time slot

Step 4: Route computing
Rough routing (in the landmark graph)
  - Search the m nearest landmarks for the source and destination points.
  - For each pair of landmarks, find the time-dependent fastest route.
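
The rough-routing search can be illustrated with a small time-dependent Dijkstra over a landmark graph. This is a minimal sketch under stated assumptions, not the authors' algorithm: the function names (fastest_route, slotted), the landmark labels, the travel times in minutes, and the 17:00 to 19:00 peak window are all hypothetical, and each landmark edge is assumed to carry a single travel time per time slot rather than a distribution. A label-setting search like this is exact only when leaving later never gets you there earlier (the FIFO property), which the slides do not state.

```python
import heapq

INF = float("inf")


def slotted(off_peak, peak, peak_start=17 * 60, peak_end=19 * 60):
    """Hypothetical edge cost: travel time (minutes) depends on the departure slot."""
    def cost(t):
        # t is minutes since midnight; 1440 minutes per day.
        return peak if peak_start <= t % 1440 < peak_end else off_peak
    return cost


def fastest_route(graph, source, target, depart):
    """Time-dependent Dijkstra (assumes FIFO edge costs, no waiting at landmarks).

    graph maps a landmark to a list of (neighbor, cost_fn) pairs, where
    cost_fn(t) is the travel time when entering the edge at time t.
    Returns (path, travel_time), or (None, inf) if target is unreachable.
    """
    best = {source: depart}      # earliest known arrival time per landmark
    prev = {}                    # predecessor on the best route found so far
    heap = [(depart, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            break
        if t > best.get(u, INF):
            continue             # stale heap entry
        for v, cost in graph.get(u, []):
            arrive = t + cost(t)
            if arrive < best.get(v, INF):
                best[v] = arrive
                prev[v] = u
                heapq.heappush(heap, (arrive, v))
    if target not in best:
        return None, INF
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[target] - depart


# Toy landmark graph; labels and numbers are made up for illustration.
graph = {
    "r1": [("r3", slotted(5, 20)), ("r6", slotted(8, 9))],
    "r3": [("r9", slotted(5, 6))],
    "r6": [("r9", slotted(4, 4))],
}

am = fastest_route(graph, "r1", "r9", 10 * 60)   # (['r1', 'r3', 'r9'], 10)
pm = fastest_route(graph, "r1", "r9", 17 * 60)   # (['r1', 'r6', 'r9'], 13)
print(am)
print(pm)
```

The same origin and destination get different routes at 10:00 and at 17:00 because the r1 to r3 edge slows down in the peak slot. To cover the m nearest landmarks of the source and destination, one could call fastest_route for each landmark pair and keep the best result.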
Refined routing (in the real road network)
  - Dynamic programming
[Figure: A) A rough route; B) The refined routing; C) A fastest path]
[Diagram labels: Taxi Trajectories, A Time-dependent Landmark Graph, Rough Routing, A Road Network, Refined Routing]

Experiments
Data
  - Road network of Beijing: 106,579 nodes and 141,380 segments
  - Taxi trajectories: 33,000 taxis over 3 months
      - Total distance: 400 million km
      - Total GPS points: 790 million
      - Average sampling interval: 3.1 minutes; average distance between consecutive points: 600 meters
      - 4.96 million trajectories
Evaluation framework
  - Landmark graph
  - Based on synthetic queries
  - In-the-field
[Figure panels: K=500, K=4000]

Conclusion
  - 60-70% of the suggested routes are faster than those of the competing methods
  - 20% of the routes are the same as those of the competing methods
  - On average, 50% of the routes are at least 20% faster than those of the competing approaches

References
[1] Jing Yuan, Yu Zheng, Chengyang Zhang, Wenlei Xie, Xing Xie, Guangzhong Sun, Yan Huang. T-Drive: Driving Directions Based on Taxi Trajectories. In Proceedings of ACM SIGSPATIAL Conference on Advances in Geographical Information Systems (ACM SIGSPATIAL GIS 2010).
[2] Yin Lou, Chengyang Zhang, Yu Zheng, Xing Xie. Map-Matching for Low-Sampling-Rate GPS Trajectories.
In Proceedings of ACM SIGSPATIAL Conference on Geographical Information Systems (ACM SIGSPATIAL GIS 2009).
[3] Jing Yuan, Yu Zheng. An Interactive Voting-based Map Matching Algorithm. In Proceedings of the International Conference on Mobile Data Management 2010 (MDM 2010).

Thank you