VTrack: Accurate, Energy-Aware Road Traffic Delay Estimation using Mobile Phones
By: Michael Glus, MSEE
EEL 6788

Agenda
• Introduction
• Challenges
• Overview of algorithms used
• Travel time estimation
• Evaluation of results
• Related work
• Conclusion
• References

Introduction
• Traffic congestion is a serious, growing problem, with over 4 billion hours spent in traffic in 2007
• As the world gets “smarter”, so should the roadways
• Using vehicles as data collection points is not a new idea, but cell phones make it possible to collect much better and more accurate data

What is VTrack?
• VTrack is a system for travel time estimation using sensor data (GPS or WiFi)
• It was developed by students at MIT CSAIL with support from four National Science Foundation grants
• The goal is to mitigate long traffic delays using this data and to inform users of potential traffic issues

Key Applications
• Two key applications to support
1. Detecting and visualizing “hotspots”
• A hotspot is a road segment whose observed travel time far exceeds its normal travel time
• The goal is to display these hotspots to the user via a web browser
• Users can select their geographic area and see all the traffic hotspots
• Both false hotspots and missed hotspots must be minimized

Key Applications Continued
2.
Real-time route planning
• Users are most concerned about the end-to-end time spent in a commute
• Route planning can use past and current data to give the user the fastest possible route to their destination
• Because the planning happens in real time, the application can tell the user to alter their driving path if a hotspot arises suddenly

VTrack Architecture
• Users run an application on their cell phone that reports to a server
• The server runs an algorithm to estimate travel time

Server Diagram (figure)

Challenge #1
• The first challenge in estimating travel delays is the energy consumption of the device that transmits the data
• Cell phones that transmit frequently can drain their batteries quickly
• Users cannot be forced to keep their phones plugged in all the time while data is collected

Challenge #2
• The second challenge is sensor unreliability
• Will users always have their phone in data collection mode?
• How will we know where the users are (i.e., how accurate is the sensor)?
• This leads into the debate between GPS and WiFi

GPS vs. WiFi
• GPS
– GPS not available on all phones
– Power hungry (up to 20x vs.
WiFi) – Outages in tunnels or users pockets – High resolution • WiFi – Less resolution (only to 50-100m) – Consumes less power – Needs more processing to determine user location 11 Overcoming the Challenges • Algorithm use – Process streams of time-stamped position samples using a Hidden Markov Model (HMM) to model vehicle trajectory over a map • Map Matching – Map matching is used to associate each position sample with the most likely point on the map and then produces travel time estimates within seconds 12 Algorithms • HMM is not a new idea, but VTrack is using it in a slightly different way • VTrack uses HMM to evaluate time estimates that come from noisy and sparsely sampled locations • The estimates from these locations are especially important in energy conscious settings 13 HMM • HMM is a process that uses different states (roads) and observations about those states (data samples) to obtain its output • The sequence of roads traveled is unknown, so the HMM uses probabilities to determine state transition (road usage) – VTrack doesn’t know when a user will turn so it uses these transition probabilities to determine the most likely sequence of roads used 14 Algorithms Continued • Viterbi decoding is used on top of HMM – This is a programming technique that finds the maximum likelihood sequence of states (roads) • Using HMM and Viterbi together produces a robust method for determining route estimation 15 Map Matching Process • Prior to HMM, data is processed to eliminate bad points and outages • Outages are dealt with by inserting interpolated points in the regions where an outage occurs – This assumes constant speed on the line, but it works well for map matching accuracy • The output of map matching is the most likely road segment that each point in the raw trajectory came from. 

Map Matching Process (figure)

Travel Time Estimation
• Tleft(S) is the time between the unobserved entry point of S and the first observed point in S
• Tright(S) is the time between the last observed point in S and the unobserved exit point from S
• The time interval between the last point in the segment preceding S (Sprev) and the first point in S is divided equally between Tright(Sprev) and Tleft(S)
• This must be done for each road segment S

Travel Time Estimation Errors
• The main source of error is inaccuracy in the map-matched output, which can occur for two reasons:
1. Outages during transition times
– If a car moves from one segment to another during a transition time without observed samples, we don’t know whether some delay occurred during that time
2. Noisy position samples
– A car could be just entering a segment, but with WiFi the sample could place the car near the end of the segment; this would lead to an extremely inaccurate delay estimate
• Travel times for small segments (with lengths on the order of the noise in that location) were found to be nearly impossible to estimate

Evaluation
• VTrack was evaluated on a large data set of location estimates (GPS and WiFi) from actual drives. This data was obtained from CarTel (the other presentation of the afternoon)
• Evaluation is based on:
– Data and method of obtaining the data
– Route planning
– Hotspot detection
– Energy vs. accuracy

Evaluation-Data and Method Used
• Obtaining “ground truth” is the most challenging part of evaluating delay estimation, because the results need something to be compared against
• Aggressive data cleaning was used in VTrack to produce high-confidence ground truth:
– For each GPS point g in a drive, a set of segments Sg within a 15m radius was considered
– A search was done to match the sequence of points g to a continuous sequence of segments Xg.
This ensures each GPS point is matched to one of its neighboring segments
– The drive is searched for outages of 10+ seconds and split into multiple drives on either side of each outage
– Each g is projected to the closest point on Xg to obtain the corresponding ground truth
• Three constraints hold with this method:
– No gap exceeds 10 seconds
– Each GPS point is matched to a point within 15m
– The resulting segments form an unbroken drive

Evaluation-Data and Method Used
• Validation of delays was attempted by performing short drives around Boston, MA
• A GPS-equipped Android phone recorded the phone’s location, and an app was used to mark turns
• A human operator pressed a button whenever the car stopped, turned, or went through an intersection
• The travel times were compared, with an average error of 4.7% for a 30-minute drive in Boston and 8.1% for another 30-minute drive in the Cambridge area (a Boston suburb)
• Most of this error was attributed to the human operator, because it is difficult to mark exactly where segment transitions occur

Evaluation-Data and Method Used
• Data from 3998 drives was collected from 25 cars equipped with GPS and WiFi sensors
• The data was cleaned: traces under 2km and 200 samples were discarded, along with traces with 10 or fewer segments traveled
• The resulting data set was 2145 drives, which equates to around 800 hours of drive time

Map of Evaluation Drives (figure)

Evaluation-Route Planning
• For route planning, VTrack chose to minimize total expected drive time
• A set of clean drives Dgt is selected along with a set of “noisy” drives Dnoisy.
The algorithm is run on Dnoisy to obtain a travel time
• Sgt is the set of road segments with ground truth travel times
• An induced graph Ggt is constructed on the set Sgt
• Dijkstra’s algorithm is run on Ggt, and the computed travel time is then compared with the ground truth travel time using an optimality gap
• Optimality gap = (time(Dnoisy) - time(Dgt)) / time(Dgt)

Evaluation-Route Planning
• This figure shows CDFs of the optimality gap
• The 90th percentile gap is 10-15% for WiFi, which implies that 90% of the simulated commutes found paths that were no more than 10-15% worse than the optimal path
• We can see how the other sampling strategies fared

Evaluation-Route Planning
• WiFi plus GPS every 20 seconds is outperformed by GPS plus interpolation, which suggests the VTrack map matching technique works better than WiFi over 20 seconds
• A hybrid strategy of GPS and WiFi is better than either one by itself (see the first CDF graph)

Evaluation-Hotspot Detection
• A road segment has a “high delay” if the observed travel time on that segment differs from the travel time estimated with scaled speed limits by at least threshold seconds
• Two metrics were used:
1. Success rate
2. False positive rate

Evaluation-Hotspot Detection
• This graph shows the success rate vs. threshold in seconds
• GPS hovers around an 80-90% success rate because GPS is constantly available
• WiFi is around 65%, which is attributed to outages. If WiFi were 100% available, it would have near-GPS success rates

Evaluation-Hotspot Detection
• The false positive rate is low for almost all threshold levels
• This indicates the algorithm is not too aggressive; otherwise it would flag too many segments as hotspots
• This is a desirable result

Evaluation-Energy vs.
Accuracy
• Power consumption on an iPhone was compared between GPS and WiFi
• The iPhone GPS is extremely power-hungry compared to WiFi
• Sampling WiFi continuously consumes almost the same power as the phone simply being on
• It may be that only the iPhone has poor GPS power management, so it can’t be concluded that GPS is a bad choice

Evaluation-Energy vs. Accuracy
• Using these numbers, it can be shown that GPS is about 25 times the cost of WiFi
• This is only an iPhone example, though, and it should be noted that WiFi performs about the same as GPS sampled every 40 seconds. If the platform being used is optimized for GPS usage, sampling GPS every 20 seconds may consume less power than WiFi
• In that case, it would be prudent to use GPS
• The choice depends on the platform being used and on how it manages power for WiFi and GPS

Related Work
• Mobile Millennium project at UC Berkeley
– Built software to report real-time traffic delays on mobile phones
• Nericell project
– Monitors road conditions and traffic using smartphones
– Road surface quality is an objective as well as traffic conditions
• Yoon et al.
use GPS data to classify road conditions as good or bad, including delays

Conclusion
• VTrack is a system for mobile phones that accurately estimates road travel times from a series of position samples; it was evaluated on route planning and hotspot detection
• It addresses two key challenges: energy consumption, and obtaining accurate travel times from inaccurate samples
• Using the algorithm developed, the authors showed that they could identify highly delayed roads and provide accurate travel time estimates for route planning

Future Work
• Some potential future work:
– Developing an online, adaptive algorithm that selects the best sensor, taking available energy, node position, and trajectory into account
– Improving the algorithm to predict future travel times on segments using historical travel times and some real-time data, to make traffic routing even better

References
• A. Thiagarajan, L. Ravindranath, K. LaCurts, S. Madden, H. Balakrishnan, S. Toledo, and J. Eriksson. VTrack: Accurate, Energy-Aware Road Traffic Delay Estimation Using Mobile Phones. In Proc. 7th ACM SenSys, Berkeley, CA, November 2009.
• The Mobile Millennium project. http://traffic.berkeley.edu
• P. Mohan, V. N. Padmanabhan, and R. Ramjee. Nericell: Rich Monitoring of Road and Traffic Conditions Using Mobile Smartphones. In SenSys ’08: Proceedings of the 6th ACM Conference on Embedded Network Sensor Systems, 2008.
• J. Yoon, B. Noble, and M. Liu. Surface Street Traffic Estimation. In MobiSys, 2007.

Questions
• If you have any questions regarding the presentation, please don’t hesitate to send me an email at michael.glus@knights.ucf.edu and I will respond within a day or two