

Critical Analysis Presentation:

T-Drive: Driving Directions based on Taxi Trajectories

Authors of Paper: Jing Yuan, Yu Zheng, Chengyang Zhang, Wenlei Xie, Xing Xie, Guangzhong Sun, Yan Huang

Presentation by: Yashu Chamber, Zhe Jiang

CSCI: 8715

Instructor: Dr. Shashi Shekhar


Problem Statement

INPUT : Taxi trajectory data collected using GPS; a user query with a start point q_s, an end point q_d, and a departure time t_d.

OUTPUT : A route from q_s to q_d with departure time t_d in a dynamic road network G_r = (V_r, E_r), which is learned from a trajectory archive A.

OBJECTIVE : The route should be fastest.

CONSTRAINTS :

– a dynamic road network G_r = (V_r, E_r) in which travel time costs change over time and space

– low sample rates of GPS points in trajectory archives

Significance To Course

Trajectory data is both spatial and temporal in nature. This work could complement the course content as a novel technique in spatial and temporal data mining.

This paper shows an interesting application of spatial and temporal data mining to “smart driving directions” recommendation.


Challenges

• difficulty in modeling the intelligence of taxi drivers based on historical taxi trajectories

• the sparseness and low coverage of trajectory data

• the low sampling rate of GPS points

Major Contributions

1. Introduce the notion of a “landmark graph” that models the intelligence of taxi drivers based on taxi trajectories.

2. Propose a “Variance-Entropy-Based Clustering” method to summarize the distributions of travel times between any two landmarks at any time.

3. Build a system that suggests the fastest driving directions based on a real-world dataset, and evaluate its performance with both synthetic queries and in-the-field experiments.

Key Concepts

1. Trajectory : A sequence of GPS points pertaining to one trip.

2. Road Segments : A directed edge (one-way or bidirectional) between two segment terminal points.

3. Landmarks : The ‘k’ most frequently traversed road segments.

4. Landmark Edges : An edge connects two landmarks if a sufficient number of trajectories (at least δ) pass through both landmarks and the travel time between them is less than a threshold t_max.

5. Rough Routing : The sequence of landmark edges that forms the fastest route for a given user (based on the user’s speed).

6. Refined Routing : Each landmark edge in the rough route is expanded into the sequence of road segments that forms the fastest route.

Approach

Trajectory Preprocessing

Trajectory segmentation: Segment the GPS log of each taxi into individual trips.
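The segmentation step can be illustrated with a minimal sketch. It assumes a simple rule that is not taken from the paper: a new trip starts whenever the time gap between consecutive GPS points exceeds a threshold. The point layout, the function name segment_trips and the parameter max_gap_s are hypothetical; the authors' actual criterion may differ.

```python
from typing import List, Tuple

# Assumed point layout: (timestamp_seconds, latitude, longitude).
GpsPoint = Tuple[float, float, float]


def segment_trips(log: List[GpsPoint], max_gap_s: float = 300.0) -> List[List[GpsPoint]]:
    """Split a GPS log into trips wherever the gap between consecutive
    points exceeds max_gap_s (a hypothetical threshold)."""
    trips: List[List[GpsPoint]] = []
    current: List[GpsPoint] = []
    # Sort by timestamp so the gap test is meaningful.
    for point in sorted(log, key=lambda p: p[0]):
        if current and point[0] - current[-1][0] > max_gap_s:
            trips.append(current)  # close the running trip
            current = []
        current.append(point)
    if current:
        trips.append(current)
    return trips


example_log = [
    (0.0, 39.90, 116.40),
    (60.0, 39.91, 116.41),
    (1000.0, 39.95, 116.45),  # long pause before this point: new trip
    (1060.0, 39.96, 116.46),
]
example_trips = segment_trips(example_log)
```

With the example log above, the long pause between the second and third points splits the log into two trips of two points each.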

Landmark Graph Construction

– Identify the ‘k’ most frequently traversed road segments, which the authors denote as “landmarks”.

– Two landmarks are connected by an edge (“landmark edge”) if a sufficient number of trajectories (at least δ) pass through both landmarks and the travel time between them is less than a threshold t_max.
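The two construction steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: trajectories are assumed to be already map-matched to road segments, each visit is a (segment id, arrival time) pair, and an edge is only considered between consecutive landmark visits within one trajectory. The function name build_landmark_graph and the data layout are hypothetical; k, delta and t_max correspond to ‘k’, δ and t_max in the text.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Visit = Tuple[str, float]  # (road_segment_id, arrival_time_seconds), assumed layout
Edge = Tuple[str, str]


def build_landmark_graph(
    trajectories: List[List[Visit]], k: int, delta: int, t_max: float
) -> Tuple[Set[str], Dict[Edge, List[float]]]:
    """Return the top-k landmarks and the landmark edges with their
    observed travel times."""
    # Step 1: count how many trajectories traverse each segment.
    freq = Counter(seg for traj in trajectories for seg in {s for s, _ in traj})
    landmarks = {seg for seg, _ in freq.most_common(k)}

    # Step 2: collect travel times between consecutive landmark visits.
    travel_times: Dict[Edge, List[float]] = defaultdict(list)
    for traj in trajectories:
        hits = [(s, t) for s, t in traj if s in landmarks]
        for (u, tu), (v, tv) in zip(hits, hits[1:]):
            if u != v and 0 <= tv - tu < t_max:
                travel_times[(u, v)].append(tv - tu)

    # Keep only edges supported by at least delta observations.
    edges = {e: ts for e, ts in travel_times.items() if len(ts) >= delta}
    return landmarks, edges


example_trajectories = [
    [("a", 0.0), ("x", 50.0), ("b", 100.0)],
    [("a", 0.0), ("b", 120.0), ("y", 200.0)],
    [("a", 10.0), ("b", 500.0)],  # too slow: exceeds t_max below
]
example_landmarks, example_edges = build_landmark_graph(
    example_trajectories, k=2, delta=2, t_max=200.0
)
```

The list of travel times kept per edge is the raw material that a summarization step (such as the clustering method the authors propose) would work on; that step is not sketched here.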

Route Computing

Rough Routing: Identify a sequence of landmark edges representing the fastest route.

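Refined Routing: Expand each landmark edge in the rough route into a sequence of road segments representing the fastest route.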
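As a rough illustration of routing on a graph whose costs depend on departure time, the sketch below uses a plain time-dependent Dijkstra search. This is a swapped-in textbook technique, not the routing algorithm from the paper, and it assumes that leaving a node later never lets a driver arrive earlier (the FIFO property). The function rough_route, the adjacency-list layout and the cost callback are hypothetical.

```python
import heapq
from typing import Callable, Dict, List, Tuple


def rough_route(
    graph: Dict[str, List[str]],
    cost: Callable[[str, str, float], float],
    source: str,
    target: str,
    depart: float,
) -> Tuple[List[str], float]:
    """Time-dependent Dijkstra: cost(u, v, t) is the travel time of
    edge (u, v) when entered at time t. Returns (path, total_time)."""
    best = {source: depart}  # earliest known arrival time per node
    prev: Dict[str, str] = {}
    heap = [(depart, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            break
        if t > best.get(u, float("inf")):
            continue  # stale heap entry
        for v in graph.get(u, []):
            arrival = t + cost(u, v, t)
            if arrival < best.get(v, float("inf")):
                best[v] = arrival
                prev[v] = u
                heapq.heappush(heap, (arrival, v))
    if target not in best:
        return [], float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[target] - depart


example_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
base_cost = {("A", "B"): 10.0, ("A", "C"): 20.0, ("B", "D"): 10.0, ("C", "D"): 10.0}


def example_cost(u: str, v: str, t: float) -> float:
    # Hypothetical congestion: edge B->D is slow when entered before t = 100.
    if (u, v) == ("B", "D") and t < 100.0:
        return 50.0
    return base_cost[(u, v)]


congested = rough_route(example_graph, example_cost, "A", "D", depart=0.0)
free_flow = rough_route(example_graph, example_cost, "A", "D", depart=200.0)
```

In this toy network the best route changes with departure time: leaving at t = 0 avoids the congested edge by going through C, while leaving at t = 200 goes through B.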

Figures Describing The Approach

Source: T-Drive: Driving Directions Based on Taxi Trajectories. Author: Jing Yuan

Validation Methodology

1. Evaluating landmark graphs.

RESULT: Used to validate the modeling efficiency of the method.

2. Evaluation based on synthetic queries.

The authors randomly generate 1200 queries and compare against the speed-constraint-based method (SC), which is used in Google and Bing maps, and the real-time-traffic-analysis-based method (RT).

RESULT: The suggested method outperforms the state-of-the-art methods RT and SC.

3. In-the-field evaluation.

i. The same driver traverses routes suggested by their method and by competing techniques.

ii. Different drivers (with similar skills) travel different routes suggested by different methods.

Strengths and Weaknesses of the methodology.

The authors did a good job of evaluating the system using different methodologies, including real and synthetic evaluations using a real dataset.

Assumptions

1. Normal users are allowed on the same roads where taxi trajectories exist. The travel time cost of taxis is the same as that of other types of vehicles.

2. Travel time cost is periodic over the week, i.e. events that change traffic flow patterns (accidents, road works, emergencies and so on) are rare.

3. Taxi drivers know the fastest route and do not make “roundabout trips”.

4. *Historical taxi trajectories cover all potential points.

Suggestions

Remove data sparseness and low coverage from the challenges section and mention them in a scope paragraph, since they are not overcome.

Provide more justification for deciding the travel time cost for a user based on the user optimism index.

Provide a theoretical cost model of the suggested method.

Add more in-the-field evaluations; the current sample size (two) is too small.
