Overview of algorithms used
Travel time estimation
Evaluation of results
Related work
• Traffic congestion is a serious, growing problem
in today's society with over 4 billion hours
spent in traffic in 2007
• As the world gets “smarter” so should the
• Using vehicles as data collection points is
not a new idea, but since cell phones
emerged we can get much better and more
accurate data
What is VTrack?
• VTrack is a system for travel time estimation
using sensor data (GPS or WiFi)
• This idea was developed using four National
Science Foundation grants by students at MIT
• We want to mitigate long traffic delays using
this data and inform customers of any potential
traffic issues
Key Applications
• Two key applications to support
1. Detecting and visualizing “hotspots”
• A hotspot is a road segment whose observed
travel time far exceeds its normal travel time
• Goal is to display these hotspots to the user via a
web browser
• User can select their geographic area and see all
the traffic spots
• Must minimize both false hotspots and missed hotspots
Key Applications Continued
2. Real-time Route planning
Users are most concerned about end-to-end time spent in a commute
Route planning can use past and current
data to give the user the fastest possible
route to their destination
Since the planning is in real-time, the
application can update the user to alter
their driving path if a hotspot arises
VTrack Architecture
• Users run
from their cell
that reports
to server
• Server runs
algorithm to
travel time
Server Diagram
Challenge # 1
The first challenge for estimating travel
delays is the energy consumption of the
device that is transmitting the data
Cell phones that transmit frequently can
drain a battery quickly
Cannot force users to keep phones
plugged in all the time while obtaining
Challenge # 2
• The second challenge is sensor unreliability
• Will users always have their phone in data
collection mode?
• How will we know where the users are (i.e.,
the accuracy of the sensor)?
• This leads into the debate of GPS and WiFi
GPS vs. WiFi
• GPS
– Not available on all phones
– Power hungry (up to 20x vs. WiFi)
– Outages in tunnels or users' pockets
– High resolution
• WiFi
– Less resolution (only to 50-100m)
– Consumes less power
– Needs more processing to determine user
Overcoming the Challenges
• Algorithm use
– Process streams of time-stamped position
samples using a Hidden Markov Model
(HMM) to model vehicle trajectory over a
• Map Matching
– Map matching is used to associate each
position sample with the most likely point on
the map and then produces travel time
estimates within seconds
• HMM is not a new idea, but VTrack is using it in
a slightly different way
• VTrack uses HMM to evaluate time estimates
that come from noisy and sparsely sampled
• The estimates from these locations are
especially important in energy conscious
• HMM is a process that uses different states
(roads) and observations about those states
(data samples) to obtain its output
• The sequence of roads traveled is unknown, so
the HMM uses probabilities to determine state
transition (road usage)
– VTrack doesn’t know when a user will turn so it uses
these transition probabilities to determine the most
likely sequence of roads used
Algorithms Continued
• Viterbi decoding is used on top of the HMM
– This is a dynamic programming technique that finds
the maximum likelihood sequence of states
• Using HMM and Viterbi together produces
a robust method for determining route
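A minimal sketch of Viterbi decoding over a toy HMM, where the hidden states are road segments and the observations are noisy position samples. The three segments on a straight road, the 0.5 stay/advance transition probabilities, and the Gaussian emission score with a 50 m spread are all illustrative assumptions, not VTrack's actual model parameters.

```python
import math

def viterbi(states, observations, log_start, log_trans, log_emit):
    """Return the maximum likelihood sequence of hidden states.

    log_start[s]     log-probability of starting in state s
    log_trans[p][s]  log-probability of moving from state p to state s
    log_emit(s, o)   log-probability of observing o while in state s
    """
    # best[s] = (score of the best path ending in s, that path)
    best = {s: (log_start[s] + log_emit(s, observations[0]), [s])
            for s in states}
    for obs in observations[1:]:
        nxt = {}
        for s in states:
            # choose the predecessor that maximizes the path score
            prev = max(states, key=lambda p: best[p][0] + log_trans[p][s])
            score = best[prev][0] + log_trans[prev][s] + log_emit(s, obs)
            nxt[s] = (score, best[prev][1] + [s])
        best = nxt
    return max(best.values(), key=lambda entry: entry[0])[1]

NEG = -1e9  # stands in for log(0): transition not allowed

# Hypothetical one-way road: segments A, B, C centered at x = 0, 100, 200 m.
centers = {"A": 0.0, "B": 100.0, "C": 200.0}
states = list(centers)
log_start = {s: math.log(1 / 3) for s in states}
log_trans = {
    "A": {"A": math.log(0.5), "B": math.log(0.5), "C": NEG},
    "B": {"A": NEG, "B": math.log(0.5), "C": math.log(0.5)},
    "C": {"A": NEG, "B": NEG, "C": 0.0},
}

def log_emit(segment, x, sigma=50.0):
    # Gaussian noise around the segment center (constant term dropped)
    return -((x - centers[segment]) ** 2) / (2 * sigma ** 2)

samples = [5.0, 40.0, 210.0, 190.0]  # sparse, noisy positions in meters
path = viterbi(states, samples, log_start, log_trans, log_emit)
print(path)
```

Because A cannot jump directly to C, the decoder places the second sample on B even though it lies closer to A, which is the kind of correction the transition probabilities are meant to provide.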
Map Matching Process
• Prior to HMM, data is processed to
eliminate bad points and outages
• Outages are dealt with by inserting
interpolated points in the regions where
an outage occurs
– This assumes constant speed on the line, but
it works well for map matching accuracy
• The output of map matching is the most
likely road segment that each point in the
raw trajectory came from.
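The outage handling described above can be sketched as linear interpolation between the samples on either side of a gap, which is what the constant-speed assumption amounts to. The `max_gap` and `step` values and the flat (t, x, y) sample layout are assumptions for illustration only.

```python
def fill_outages(samples, max_gap=10.0, step=1.0):
    """Insert interpolated points wherever consecutive samples are more
    than max_gap seconds apart.

    samples: list of (t, x, y) tuples sorted by time t in seconds.
    max_gap and step are illustrative values, not taken from VTrack.
    """
    if not samples:
        return []
    filled = [samples[0]]
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        gap = t1 - t0
        if gap > max_gap:
            k = 1
            while t0 + k * step < t1:
                t = t0 + k * step
                f = (t - t0) / gap  # fraction of the outage elapsed
                # constant speed along the straight line between endpoints
                filled.append((t, x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
                k += 1
        filled.append((t1, x1, y1))
    return filled

trace = [(0.0, 0.0, 0.0), (1.0, 10.0, 0.0), (21.0, 210.0, 0.0)]
filled = fill_outages(trace, max_gap=10.0, step=5.0)
for point in filled:
    print(point)
```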
Travel Time Estimation
Tleft(S) is the time between the unobserved
entry point into S and the first observed point in S
Tright(S) is the time between the last observed point in S
and the unobserved exit point from S
To estimate them, take the time interval between
the last observed point in the segment preceding S
(Sprev) and the first observed point in S, and
divide it equally between Tright(Sprev) and Tleft(S)
This must be done for each road segment, S
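A minimal sketch of this estimate, assuming the input is a time-ordered list of map-matched samples and that each segment is visited once per drive. It reads the split as follows: the gap between the last sample on Sprev and the first sample on S is halved, with one half credited to the exit of Sprev and the other to the entry of S. That reading, and summing Tleft(S), the observed time on S, and Tright(S) into one per-segment total, are assumptions made for illustration.

```python
def segment_times(matched):
    """Estimate per-segment travel time from map-matched samples.

    matched: time-ordered list of (timestamp_seconds, segment_id);
    each segment is assumed to appear in one contiguous run.
    """
    # first and last observed timestamp on each segment, in visit order
    spans = {}
    for t, seg in matched:
        first, _ = spans.get(seg, (t, t))
        spans[seg] = (first, t)

    order = list(spans)
    times = {}
    for i, seg in enumerate(order):
        first, last = spans[seg]
        t_left = t_right = 0.0
        if i > 0:
            # half of the unobserved gap since the previous segment
            t_left = (first - spans[order[i - 1]][1]) / 2
        if i < len(order) - 1:
            # half of the unobserved gap before the next segment
            t_right = (spans[order[i + 1]][0] - last) / 2
        times[seg] = t_left + (last - first) + t_right
    return times

matched = [(0, "S1"), (4, "S1"), (10, "S2"), (14, "S2"), (20, "S3")]
times = segment_times(matched)
print(times)
```

With this split, the per-segment estimates add up to the total time spanned by the samples, so no time is lost or counted twice at segment boundaries.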
Travel Time Estimation Errors
Main source of error is inaccuracy in the map matched
output which can occur for two reasons:
Outages during transition times
Noisy position samples
If a car is moving from one segment to another during a
transition time without observed samples, we don’t know if
some delay occurred during that time
A car location could be just entering a segment, but with WiFi,
the sample could estimate the car is near the end of the
segment; this would lead to an extremely inaccurate delay
It was found that determining travel times for small
segments (with lengths near the order of magnitude
of noise in that location) was nearly impossible to
• VTrack was evaluated on a large data set
(GPS and WiFi) of location estimates from
actual drives completed. This info was
obtained from CarTel (the other
presentation of the afternoon)
• Evaluation is based on:
– Data and Method of obtaining the data
– Route Planning
– Hotspot Detection
– Energy vs. Accuracy
Evaluation-Data and Method Used
• Obtaining “ground truth” is the most challenging part of delay
estimation because the results need something to be compared
against
• An aggressive data cleaning was used in VTrack to produce high
– For each GPS point g in a drive a set of segments Sg within a
15m radius was considered
– A search was done to match the sequence of points g to a
previous continuous sequence of segments Xg. This ensures
each GPS point is matched to a neighbor
– The drive is searched for outages of 10+ seconds and split
into multiple drives on either side of each outage
– Each g is projected to the closest point on Xg to obtain a
corresponding ground truth
• Three constraints hold with this method:
– No gap exceeds 10 seconds
– Each GPS point is matched to another point within 15m
– The resulting segments form an unbroken drive
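The outage-splitting step above can be sketched as a single pass over the time-ordered GPS points. The (t, lat, lon) tuple layout is an assumption; the 10-second limit comes from the constraints listed above.

```python
def split_at_outages(points, max_gap=10.0):
    """Split a drive into sub-drives wherever consecutive GPS points
    are more than max_gap seconds apart.

    points: list of (t, lat, lon) tuples sorted by time t in seconds.
    """
    drives, current = [], []
    for p in points:
        if current and p[0] - current[-1][0] > max_gap:
            drives.append(current)  # close the drive before the outage
            current = []
        current.append(p)
    if current:
        drives.append(current)
    return drives

# Hypothetical coordinates; only the timestamps matter here.
points = [(t, 42.36, -71.09) for t in (0, 1, 2, 20, 21, 40)]
drives = split_at_outages(points)
print([len(d) for d in drives])
```

Every resulting sub-drive satisfies the first constraint by construction, since a new drive starts at each gap longer than `max_gap`.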
Evaluation-Data and Method Used
• Validation of delays was attempted by performing short
drives around Boston, MA
• An Android GPS equipped phone was used to record
phone location and an app was used to mark turns
• A human operator pressed a button whenever the car
stopped, turned or went through an intersection
• The travel times were compared, with an average of
4.7% error for a 30 min drive in Boston and an 8.1%
error for another 30 min drive in Cambridge area
(Boston suburb)
• Most of this error was attributed to humans because it’s
difficult to mark exactly where segment transitions are
Evaluation-Data and Method Used
• 3998 drives' worth of data was taken from
25 cars equipped with GPS and WiFi
• The data was cleaned: traces under 2 km
and under 200 samples were discarded, along
with traces with 10 or fewer segments
• The resultant data set was 2145 drives
which equates to around 800 hours of
drive time
Map of Evaluation Drives
Evaluation-Route Planning
• For route planning, VTrack chose to minimize total
expected drive time
• A set of clean drives Dgt is selected along with a set of
“noisy” drives Dnoisy. The algorithm is run on Dnoisy to
obtain a travel time
• Sgt is a set of road segments with ground truth travel
• An induced graph Ggt is constructed on the set Sgt
• Dijkstra’s algorithm is run on Ggt, then this computed
travel time is compared with the ground truth travel
time using an optimality gap
• Optimality gap = (time(Dnoisy) - time(Dgt)) / time(Dgt)
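A minimal sketch of the comparison, assuming a route is picked with Dijkstra's algorithm on noisy travel time estimates and then timed against the ground truth graph. The four-node graph, its edge weights, and that reading of how the two times are obtained are illustrative assumptions.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm. graph: {node: {neighbor: seconds}}.
    Returns the list of nodes on a fastest path from source to target."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        raise ValueError("target unreachable")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def path_time(graph, path):
    """Total travel time of a path under the given edge weights."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

# Hypothetical travel times in seconds between intersections a..d.
g_gt = {"a": {"b": 60, "c": 30}, "b": {"d": 30}, "c": {"d": 80}, "d": {}}
g_noisy = {"a": {"b": 70, "c": 25}, "b": {"d": 40}, "c": {"d": 60}, "d": {}}

best = shortest_path(g_gt, "a", "d")       # optimal route under ground truth
chosen = shortest_path(g_noisy, "a", "d")  # route picked from noisy estimates
t_gt = path_time(g_gt, best)
t_noisy = path_time(g_gt, chosen)          # true cost of the chosen route
gap = (t_noisy - t_gt) / t_gt
print(best, chosen, round(gap, 3))
```

A gap of 0 means the noisy estimates still led to the optimal route; here the noisy graph steers the driver onto a route that is about 22% slower in reality.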
Evaluation-Route Planning
• This figure shows CDFs of
the optimality gap
• It shows the 90th
percentile gap is 10-15%
for WiFi, which implies
that 90% of the
simulated commutes
found paths that were no
more than 10-15% worse
than the optimal path
• We can see how the
other sampling theories
Evaluation-Route Planning
• WiFi plus GPS every 20
seconds is outperformed
by GPS plus interpolation
which suggests the
VTrack map matching
technique works better
than WiFi over 20
• A hybrid strategy of GPS
and WiFi is better than
either one by itself
(reference first CDF
Evaluation-Hotspot Detection
A road segment has a “high delay” if the
observed travel time on that segment
differs from the travel time estimated
with scaled speed limits by at least
threshold seconds
Two metrics were used:
1. Success rate
2. False positive rate
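A minimal sketch of the hotspot test and the two metrics. The slides do not define the metrics, so the definitions used here are assumptions: success rate is the fraction of ground truth hotspots that were detected, and false positive rate is the fraction of reported hotspots that are not ground truth hotspots. The segment names and times are made up for illustration.

```python
def hotspots(travel_time, expected_time, threshold):
    """Segments whose travel time exceeds the expected time
    (e.g. from scaled speed limits) by at least threshold seconds."""
    return {seg for seg, t in travel_time.items()
            if t - expected_time[seg] >= threshold}

def success_rate(detected, actual):
    # assumed definition: share of true hotspots that were found
    return len(detected & actual) / len(actual) if actual else 0.0

def false_positive_rate(detected, actual):
    # assumed definition: share of reported hotspots that are not real
    return len(detected - actual) / len(detected) if detected else 0.0

# Hypothetical per-segment times in seconds.
expected = {"s1": 30, "s2": 20, "s3": 40, "s4": 25}
truth = {"s1": 95, "s2": 22, "s3": 110, "s4": 30}
estimate = {"s1": 90, "s2": 85, "s3": 60, "s4": 28}

actual = hotspots(truth, expected, threshold=60)
detected = hotspots(estimate, expected, threshold=60)
print(sorted(actual), sorted(detected))
print(success_rate(detected, actual), false_positive_rate(detected, actual))
```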
Evaluation-Hotspot Detection
• This graph shows the
success rate vs.
threshold in sec
• GPS hovers around 80-90% success rate due to
GPS constantly being
• WiFi is around 65% but
this is attributed to
outages. If WiFi was
100% available, it would
have near-GPS success
Evaluation-Hotspot Detection
• The false positive rate
is low for almost all
threshold levels
• This indicates the
algorithm is not too
aggressive, otherwise
it would deem too
many hotspots
• This is a desirable
Evaluation-Energy vs. Accuracy
• The power consumption on an iPhone was tested between GPS and
• We can see that the iPhone GPS is extremely power-hungry
compared to the WiFi
• WiFi sampled continuously consumes almost the same power
as the phone simply being on
• It may be that only iPhone has poor GPS power management, so it
can’t be concluded that GPS is a bad choice
Evaluation-Energy vs. Accuracy
• Using these numbers, it can be shown that GPS
is about 25 times the cost of WiFi
• This is only an iPhone example though, and it
should be noted that WiFi performs about the
same as GPS sampled every 40 seconds. If the
platform being used is optimized for GPS usage,
it may be true that sampling GPS every 20
seconds consumes less power than WiFi
• In that case, it would be prudent to use the GPS
• It depends on the platform being used, and how
its power is managed for WiFi and GPS
Related Work
• Mobile Millennium project at UC Berkeley
– Built software to report real-time traffic delays on
mobile phones
• Nericell project
– Monitors road conditions and traffic using
– Road surface quality is an objective as well as traffic
• Yoon et al. use GPS data to classify road
conditions as good or bad, including delays
• VTrack presented a system for mobile phones to
accurately estimate road travel times using a
series of position samples, and evaluated it on route
planning and hotspot detection
• The two key challenges addressed were
energy consumption and obtaining accurate
travel times from inaccurate samples
• Using the algorithm developed, it was shown
that they could identify highly delayed roads
and provide accurate travel time estimates for
route planning
Future Work
• Some potential future work:
– A plan to develop an online, adaptive
algorithm that selects the best sensor,
taking available energy, node position and
trajectory into account
– Improving the quality of the algorithm
presented to predict future travel times on
segments using historic travel times and
some real-time data to make traffic routing
even better
