Algorithm for finding optimal paths in a public transit network with Real-Time Data

I want to get from here to here on buses and trains with real-time information, and I want directions FAST! (3 sec on my mobile phone)

We introduce a new algorithm which overcomes the problem of computing shortest paths in a transit network that pulls real-time data from a third-party Application Programming Interface (API).

Why real-time data in trip planning?
More accurate predictions for trip time
Better suggestions for alternative routes
Customer satisfaction

Figure: change in satisfaction with public transit. < 1% less satisfied, 7% no change, 44% somewhat more satisfied; the remainder were much more satisfied.
Figure: percentage of trips for which transit trip planners under-estimate actual trip time, comparing a real-time transit trip planner with a schedule-based transit trip planner (bar labels: 91%, 85%).

Our first thought was: let's build a graph of the transit network (bus stops as nodes, and real-time travel times as links) and run Dijkstra's!

Problem: We can't retrieve all link travel times from the real-time API at once, because there are restrictions on how often programs can hit the API. Retrieving travel times for all links would take > 1 min for large transit agencies.

We need information from third-party sources:
Route configurations: locations of bus stops, sequence of stops visited
Real-time arrival API: estimated arrival at each bus stop

We have to intelligently retrieve only a subset of estimated arrivals at particular bus stops.

How do we determine these bus stops? First, consider these observations:
1. If the origin and destination are not served by the same bus route, humans intuitively plan trips by finding transfer points to connect between bus routes.
2. The set of feasible paths from any bus stop along Route X to any bus stop along Route Y is a subset of all feasible paths from the origin station of Route X to the terminus station of Route Y.
Based on these observations, build a lookup table for all possible paths from the origin of every bus route to the terminus of every other bus route.
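A minimal sketch of how observation 2 can be used at query time, assuming a toy network: the table stores paths from the origin of Route X to the terminus of Route Y, and a query between two intermediate stops keeps only the stored paths that are still reachable. The route names, stop names, `PATH_TABLE` layout, and `feasible_paths` function are hypothetical illustrations, not the poster's actual data structures, and only one-transfer paths are shown.

```python
from typing import Dict, List, Tuple

# Hypothetical toy network: each route is an ordered list of stops,
# from its origin stop to its terminus.
ROUTES: Dict[str, List[str]] = {
    "X": ["x1", "x2", "x3", "x4"],
    "Y": ["y1", "y2", "y3", "y4"],
}

# A transfer is (stop to get off on the first route, stop to board on the second).
Transfer = Tuple[str, str]

# Hypothetical precomputed table: (first route, last route) -> one-transfer
# paths from the origin of the first route to the terminus of the last route.
PATH_TABLE: Dict[Tuple[str, str], List[Transfer]] = {
    ("X", "Y"): [("x2", "y1"), ("x4", "y3")],
}


def feasible_paths(route_a: str, origin: str, route_b: str, dest: str) -> List[Transfer]:
    """Keep the precomputed paths that are still usable from `origin` to `dest`.

    A stored path survives only if the rider boards route_a before the
    transfer stop and leaves route_b after the boarding stop.
    """
    pos_a = ROUTES[route_a].index  # position of a stop along route_a
    pos_b = ROUTES[route_b].index  # position of a stop along route_b
    return [
        (alight, board)
        for alight, board in PATH_TABLE.get((route_a, route_b), [])
        if pos_a(origin) < pos_a(alight) and pos_b(board) < pos_b(dest)
    ]


print(feasible_paths("X", "x1", "Y", "y4"))  # both stored paths remain
print(feasible_paths("X", "x3", "Y", "y4"))  # only the transfer at x4 remains
```

Because every query result is a subset of a stored origin-to-terminus entry, only the stops on the surviving paths need arrival predictions from the real-time API.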
Jariyasunant et al. (2010) "Mobile Transit Trip Planning with Real-Time Data." Proceedings of Transportation Research Board 2010.

Figure: change in satisfaction with public transit (48% much more satisfied). Ferris, B., Watkins, K., and Borning, A. (2010) "OneBusAway: Results from Providing Real-Time Arrival Information for Public Transit." Proceedings of CHI 2010.

Algorithm Flowchart

Preparing Data
Inputs: GTFS files and the real-time API.
Create Route Configurations: extract route, direction, stop, latitude, longitude, sequence #, and agency.
Link Bus Stops with Real-Time Feed: each stop must be linked to a real-time URL by a unique stopcode.
Refine Route Configurations: see Figure 4 in paper.
Output: Route Configurations.

Static Precomputation (Pre-computation of Lookup Tables)
Build Geolocation Lookup Table: store a list of bus stops with their latitude and longitude. Output: Geo-Location Database.
Build Service-Time Lookup Table: store a list of the hours of the day each bus route is running. Output: Service-Time Database.
Find Transfer Points and Routes: list the pairs of stops within a reasonable transfer distance (arbitrarily chosen as half a mile).
Build Path Lookup Table: store all paths within 4 transfers from the origin stop of every bus route to the terminus of every bus route in each direction. Exclude paths that take two transfers more than any existing feasible path between an O-D pair. Output: Path Lookup Database.

Real-Time Origin-Destination Query
Retrieve Pre-computed Paths (not all paths are physically possible to make).
Retrieve Real-Time Information From API and sort routes by shortest travel time.
Flowchart timing label: ≈0.3 secs.

How do we measure the performance of the algorithm?
By measuring the response time (< 3 seconds). The response time depends on the number of possible paths (from origin to destination) retrieved from the path lookup database, which in turn affects the number of bus stop predictions needed from the real-time API.
Flowchart labels: ≈0.1 secs; 5 shortest paths.

Our Experiment: 770,000 tests in 77 different cities
Pick a random origin and destination in a city, compute directions, and record:
1) Response time
2) Number of paths pulled from lookup table
3) Number of requests sent to real-time API
Repeat x10,000 for each city.
This example: 3 paths, 8 arrival requests.

Does it scale?
The size of the lookup table increases polynomially as the number of routes served by a transit agency grows.
Figure: size of lookup table for 77 different cities, plotted against the number of routes served. The size of the bubbles represents the total number of bus stops in the transit network.
D.C.*: 63 million pre-computed paths, 1.2 GB needed to store in memory, 312 routes served, 194,000 unique bus stop/route-direction pairs.
*Washington D.C. was the largest network (by number of transit routes operated) for which route configuration files were available, and was therefore tested to show the algorithm working in the worst-case scenario.

Jerald Jariyasunant, Eric Mai, Raja Sengupta. U.C. Berkeley

Result of 10,000 simulations in Washington D.C.
The two graphs to the left and right represent 10,000 simulations for Washington D.C. The performance of the algorithm is affected most by the time to query the real-time API, which is the bottleneck. The number of queries increases with the distance people are willing to walk to catch a bus.
Number of bus stop predictions needed from real-time API: 99th percentile: 82 queries. 50th percentile: 26 queries.
Number of possible paths (x100): 99th percentile: 1900 paths.
50th percentile of possible paths: 224 paths.

Figure: bus stop predictions needed from real-time API for different walking radii. Axes: real-time predictions needed; frequency (out of 10,000).
Figure axis label: total # of paths in lookup table.

Real-time query steps (flowchart labels)
Find Nearest Stops: look up running transit routes within ½ mile.
Remove Irrelevant Paths.
Calculate real-time travel times.
Flowchart timing labels: ≈1.5 secs, ≈0.1 secs.
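The "Find Nearest Stops" step, which looks for stops within ½ mile, could be sketched as a great-circle distance filter over the geolocation lookup table. This is only an illustration under stated assumptions: the stop codes, coordinates, `haversine_miles`, and `nearest_stops` are hypothetical, and a linear scan stands in for whatever index the actual Geo-Location Database uses.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_stops(
    lat: float,
    lon: float,
    stops: Dict[str, Tuple[float, float]],
    radius_miles: float = 0.5,
) -> List[str]:
    """Return stop codes within radius_miles of (lat, lon), closest first."""
    hits = [
        (haversine_miles(lat, lon, s_lat, s_lon), code)
        for code, (s_lat, s_lon) in stops.items()
    ]
    return [code for dist, code in sorted(hits) if dist <= radius_miles]


# Hypothetical geolocation table: stop code -> (latitude, longitude)
STOPS = {
    "A": (38.9010, -77.0300),
    "B": (38.9050, -77.0300),
    "C": (38.9200, -77.0300),
}

print(nearest_stops(38.9000, -77.0300, STOPS))  # stops within half a mile, closest first
```

Widening `radius_miles` returns more candidate stops, which matches the poster's remark that the number of real-time queries grows with the distance people are willing to walk.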