T-Share: A Large-Scale Dynamic Taxi Ridesharing Service
Shuo Ma, Yu Zheng, Ouri Wolfson
Microsoft Research Asia; University of Illinois at Chicago

Background
• Taxi-sharing is of great social and environmental importance
  – Serves more demand: peak hours vs. off-peak hours
  – Reduces energy consumption and air-pollutant emissions
  – Can lower taxi fares while increasing the income of taxi drivers

[Figure: T-Share mobile client. Driver view: original, scheduled, and newly scheduled routes with the next pickup and delivery points. Rider view: ride request form, confirmation with estimated pickup time and taxi fare, and ride-joining request showing riders added, fare saving, and travel time delay.]

Background
• Challenges
  – Dynamic
    • Dynamic queries: anytime and anywhere, lazy users
    • Dynamic taxis
    • Real-time query processing
  – Large-scale: millions of users and tens of thousands of taxis
• Wide range of applications
  – Private vehicles
  – Logistics industry, for transporting goods

Value
• Government
  – Saves 800 million liters of gasoline per year
    • Enough to supply 1M cars for 10 months
    • Worth about 1 billion USD
    • 1.64 billion kg of CO2 emissions
• Passengers
  – Serving rate increased by 300%
  – Save 42% of expenses on average
• Taxi drivers: profit increases by 16% on average

Problem Definition
• Query Q = <Q.o, Q.d, Q.wp, Q.wd>
  – Origin and destination: Q.o and Q.d
  – Time window for pickup: Q.wp = (Q.wp.e, Q.wp.l)
  – Time window for delivery: Q.wd = (Q.wd.e, Q.wd.l)
• Given a fixed number of taxis traveling on a road network and a stream of queries, we aim to serve each query Q in the stream by dispatching the taxi that satisfies Q with the minimum increase in travel distance.
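The problem definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Query` class mirrors the fields Q.o, Q.d, Q.wp, and Q.wd, while `dispatch`, the `extra_distance` callback, and the sample values are hypothetical names introduced here. The callback is assumed to return the increase in travel distance for a taxi that can satisfy the query, or `None` when the taxi cannot satisfy it.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, Optional, Tuple

Point = Tuple[float, float]   # assumed (x, y) location; the slides do not fix a representation
Window = Tuple[float, float]  # (early bound e, late bound l)


@dataclass(frozen=True)
class Query:
    o: Point    # Q.o, origin
    d: Point    # Q.d, destination
    wp: Window  # Q.wp, pickup time window
    wd: Window  # Q.wd, delivery time window


def dispatch(
    query: Query,
    taxis: Iterable[Hashable],
    extra_distance: Callable[[Hashable, Query], Optional[float]],
) -> Optional[Hashable]:
    """Return the taxi that satisfies the query with the smallest added distance.

    Returns None when no taxi can satisfy the query.
    """
    best_taxi, best_cost = None, None
    for taxi in taxis:
        cost = extra_distance(taxi, query)
        if cost is None:  # this taxi cannot meet the time windows
            continue
        if best_cost is None or cost < best_cost:
            best_taxi, best_cost = taxi, cost
    return best_taxi


# Hypothetical example: added distance per taxi, None meaning "cannot satisfy".
q = Query(o=(0.0, 0.0), d=(3.0, 4.0), wp=(0.0, 300.0), wd=(0.0, 1200.0))
added = {"taxi_a": 2.5, "taxi_b": None, "taxi_c": 1.0}
chosen = dispatch(q, added, lambda taxi, _query: added[taxi])
```

Here `chosen` is the taxi with the smallest added distance among those that can serve the query. How that added distance is computed is left to the callback, which is where the searching and scheduling steps would plug in.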
Architecture
[Figure: system architecture. Modules: Communication Interface, Taxi Searching, Scheduling, Index Updating, and the Spatio-Temporal Index of taxis. Two flows: the service-providing data flow and the taxi status updating flow. Notation: Q = <t, o, wp, d, wd>; R = Ru || Rv; V = real-time position.]

Spatio-Temporal Index
• Grid-based approximation
• Select an anchor node in each grid cell
• Grid distance matrix M = [D_{ij}], where D_{ij} = (t_{ij}, d_{ij}) holds the travel time and distance between the anchor nodes c_i and c_j of grid cells g_i and g_j
[Figure: A) Grid-partitioned map; B) Grid distance matrix]

Spatio-Temporal Index
• For each grid cell
  – Spatially-ordered grid cell list g.l_s (spatial closeness)
  – Temporally-ordered grid cell list g.l_t (temporal closeness)
  – Taxi list g.l_v, sorted by arrival time
[Figure: the three lists kept for grid cell g_i. Temporally-ordered cells run from earliest to latest, spatially-ordered cells from nearest to furthest, and taxis (each with an arrival time t_a) from earliest to latest.]

Taxi Searching
• Single-side taxi search
  – Q.o is located in g_7
  – t_{i7} + t_{cur} ≤ Q.wp.l
  – Merge taxi lists
• Problem
  – Many candidate taxis
  – Scheduling process is heavy
[Figure: single-side search around g_7. Grid cells are visited in order of temporal closeness, and taxis are selected from each cell's taxi list up to a cutoff such as Q.wp.l − t_{37} for g_3.]

Dual-Side Taxi Searching
• Origin side
  – Q.o in g_7
  – t_{i7} + t_{cur} ≤ Q.wp.l
• Destination side
  – Q.d in g_2
  – t_{cur} + t_{j2} ≤ Q.wd.l
[Figure: dual-side search example with candidate sets So (origin side) and Sd (destination side). Step 1: So ∩ Sd = {}. Step 2: So ∩ Sd = {}. Step 3: So ∩ Sd = {Taxi10, Taxi17}.]

Scheduling Module
• Calculate a
schedule for each candidate taxi

Scheduling Module
• Feasibility check
  – Two steps: first insert Q.o, then Q.d
  – Do not change the order of the existing schedule
  – Minimize the increase in travel distance
• Given a schedule V.s composed of n points
  – n + 1 positions to insert Q.o
  – n − i + 1 positions to insert Q.d, when Q.o is inserted at position i
  – O(n^2) possible ways of insertion
[Figure: candidate positions for inserting Q.o into the schedule Q1.o, Q2.o, Q1.d, Q2.d]

Scheduling Module
• Feasibility check (using Q.o as an example)
  – t_d = (Q2.o → Q.o) + (Q.o → Q1.d) + t_w − (Q2.o → Q1.d)
  – t_w: the time spent waiting for the passenger
  – (Q.o)_st = Q.wp.l − (projected arrival time at Q.o)
  – (Q.d)_st = Q.wd.l − (projected arrival time at Q.d)
  – If t_d ≥ Min{Q1.d.st, Q2.d.st}, fail
[Figure: one possible way to insert a new query Q into the original schedule, with Q.o between Q2.o and Q1.d and Q.d between Q1.d and Q2.d. The figure marks the points for the time window check and the points for the slack time check.]

Scheduling Module
• Lazy Shortest Path Calculation
  – Find a lower bound of the travel time between two points
  – t_{OD} ≥ t_{ij} − (c_i → O) − (D → c_j)
  – Derivation, by the triangle inequality:
    1. (c_i → O) + t_{OD} ≥ (c_i → D)
    2. (c_i → D) + (D → c_j) ≥ t_{ij}, so (c_i → D) ≥ t_{ij} − (D → c_j)
    3. (c_i → O) + t_{OD} ≥ t_{ij} − (D → c_j)
[Figure: A) Grid-partitioned map with O in g_i and D in g_j; B) Grid distance matrix M with D_{ij} = (t_{ij}, d_{ij})]

Pricing Scheme
• Taxi fare per mile is higher for multiple passengers than for a single passenger
• The taxi fare of shared distances is evenly split among the riding passengers
• Fare = f * (d_1 + Σ_{i=2}^{n} (α + 1) * d_i / i), where d_i is the distance traveled with i passengers on board
• Total_profit = f * (D_single + (1 + α) * D_shared), where D_single and D_shared are the distances driven with one passenger and with multiple passengers

Evaluation
• Settings
  – A trajectory dataset generated by over 33,000 taxis in Beijing over 3 months
  – Experimental platform built on the data
• Big data
  – 400 million kilometres
  – 790 million points
  – 20 million trips (46% occupied)

Evaluation
• Experimental platform
  – Learn the distribution of queries on the road network over time of day from the data
  – Assume the arrival of queries follows a Poisson distribution
  – Learn the transition probability between different road segments
[Figure: number of queries by hour of day, with "extracted" and "2-inflated" series]

Settings of experimental platform
  Definition                            Value
  The start time of simulation          9 am
  The end time of simulation            9:30 am
  The number of taxis                   2,980
  The pickup window size                5 minutes
  The length of a time bin              5 minutes
  The number of time bins in a frame    12
  Number of queries                     27,000

Evaluation
• Baselines
  – No ridesharing (NR)
  – Single-side and first-fit ridesharing (SF)
  – Single-side and best-fit ridesharing (SB)
  – Dual-side and first-fit ridesharing (DF)
  – Dual-side and best-fit ridesharing (DB)

Results
• Effectiveness
[Figure: satisfaction rate and relative distance rate vs. delta (1 to 6) for SF, SB, DF, DB, and NR]

Results
• Efficiency
[Figure: number of grid cells, taxis, and road nodes accessed per query vs. delta (1 to 6) for SF, SB, DF, and DB; number of road nodes accessed per query with and without the lazy strategy]
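The lower bound from the Lazy Shortest Path Calculation slide lets the scheduler reject an insertion without running a shortest-path search. The sketch below is a minimal illustration, assuming the anchor-to-point travel times (c_i → O) and (D → c_j) and the grid-matrix entry t_ij are already known. The function names, the `time_budget` parameter, and the numbers are hypothetical and do not come from the talk.

```python
def lower_bound_travel_time(t_ij: float, ci_to_o: float, d_to_cj: float) -> float:
    """Lower bound on t_OD from the triangle inequality.

    t_OD >= t_ij - (c_i -> O) - (D -> c_j); clamped at 0 because a travel
    time cannot be negative.
    """
    return max(0.0, t_ij - ci_to_o - d_to_cj)


def can_skip_shortest_path(
    t_ij: float, ci_to_o: float, d_to_cj: float, time_budget: float
) -> bool:
    """True when even the optimistic bound exceeds the available time.

    In that case the insertion is infeasible and the exact shortest path
    from O to D does not need to be computed.
    """
    return lower_bound_travel_time(t_ij, ci_to_o, d_to_cj) > time_budget


# Hypothetical numbers, in seconds.
bound = lower_bound_travel_time(t_ij=600.0, ci_to_o=60.0, d_to_cj=90.0)
skip_tight = can_skip_shortest_path(600.0, 60.0, 90.0, time_budget=400.0)
skip_loose = can_skip_shortest_path(600.0, 60.0, 90.0, time_budget=500.0)
```

With these numbers the bound is 450 seconds, so a 400-second budget rules the insertion out immediately, while a 500-second budget still requires the exact shortest path before the insertion can be accepted.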
[Figure: results for grid sizes 15*15, 20*20, 25*25, and 30*30]

Conclusion
• Win-win-win scenario
• Candidate taxi selection based on a spatio-temporal index
  – Dual-side search saves 50% of the computational load
  – Its effectiveness is similar to that of single-side search
• Taxi scheduling based on
  – Feasibility check
  – Lazy shortest path computing, which saves 83% of the computational load
• Serves 720k queries per hour on a single machine
• Future work
  – Consider more constraints: monetary constraints
  – Dynamic time estimation
  – Other factors, such as social trust and credit

Thanks!
Yu Zheng
yuzheng@microsoft.com