Driving with Knowledge from the Physical World

Urban Computing with Taxicabs
Yu Zheng
Microsoft Research Asia
Motivation
Urban computing for urban planning
Developing countries: Urbanization and city planning
Developed countries: Urban reconstruction, city renewal, and suburbanization
Questions
What’s wrong with the city configurations?
Does an implemented urban plan really work?
GPS-equipped taxis are mobile sensors
Rank  City             Country/Region  Taxicabs
1     Mexico City      Mexico          103,000+
2     Bangkok          Thailand        80,000+
3     Seoul            South Korea     73,000+
4     Beijing          China           67,000
5     Tokyo            Japan           60,000
6     Shanghai         China           50,000+
7     New York City    USA             48,300
8     Buenos Aires     Argentina       45,000
9     Moscow           Russia          40,000 (1,000,000)
10    São Paulo        Brazil          37,000
11    Tianjin          China           35,000
12    Taipei           Taiwan          31,000+
13    New Taipei City  Taiwan          23,500
14    Singapore        Singapore       23,000
15    Osaka            Japan           20,000
16    Hong Kong        China           18,000+
17    Wuhan            China           18,000
18    London           England         17,000
19    Harbin           China           17,000
20    Guangzhou        China           16,000+
21    Shenyang         China           15,000+
22    Paris            France          15,000
What We Do
Detect flawed urban planning using taxi trajectories
Evaluate implemented city configurations
Alert city planners to unrecognized problems
Challenges
City-wide traffic modeling
Representing flaws and revealing their relationships
Methodology
Partition a city into regions with major roads
Methodology
Partition the trajectory dataset into time slots
[Figure: average taxi speed (km/h) by time of day, 8:00 to 22:00, one panel for workdays and one for rest days]
Time    Work day         Rest day
Slot 1  7:00am-10:30am   9:00am-12:30pm
Slot 2  10:30am-4:00pm   12:30pm-7:30pm
Slot 3  4:00pm-7:30pm    7:30pm-9:00am
Slot 4  7:30pm-7:00am    --
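The slot boundaries in the time table can be turned into a small lookup. The sketch below is only an illustration: the function name slot_of and the rest_day flag are assumptions, and deciding which calendar days count as rest days is left to the caller.

```python
from datetime import time

# Slot boundaries copied from the slide's table; each entry is (start, slot number).
WORKDAY_STARTS = [(time(7, 0), 1), (time(10, 30), 2), (time(16, 0), 3), (time(19, 30), 4)]
RESTDAY_STARTS = [(time(9, 0), 1), (time(12, 30), 2), (time(19, 30), 3)]


def slot_of(t: time, rest_day: bool) -> int:
    """Return the slot number that contains time-of-day t (hypothetical helper)."""
    starts = RESTDAY_STARTS if rest_day else WORKDAY_STARTS
    # Times before the first boundary belong to the last (overnight) slot.
    slot = starts[-1][1]
    for start, number in starts:
        if t >= start:
            slot = number
    return slot
```

For example, slot_of(time(8, 0), rest_day=False) falls in Slot 1, while the same time on a rest day falls in the overnight Slot 3.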
Methodology
Project taxi trajectories onto these regions
Building a region graph for each time slot
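A minimal sketch of the projection step, under stated assumptions: region_of is a hypothetical lookup from a GPS point to a region id (the slides do not say how this lookup is done), and a transition is taken to be any hop between two consecutive, different regions along a trajectory.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

Point = Tuple[float, float]  # (x, y) position; the coordinate layout is an assumption
Edge = Tuple[Hashable, Hashable]


def region_transitions(
    points: Iterable[Point],
    region_of: Callable[[Point], Hashable],
) -> List[Edge]:
    """Map each GPS point to a region and return the region-to-region hops."""
    transitions = []
    previous = None
    for p in points:
        current = region_of(p)
        if previous is not None and current != previous:
            transitions.append((previous, current))
        previous = current
    return transitions


def build_region_graph(trajectories, region_of) -> Dict[Edge, int]:
    """Collect the directed edges seen in one time slot, with their hop counts."""
    graph: Dict[Edge, int] = {}
    for points in trajectories:
        for edge in region_transitions(points, region_of):
            graph[edge] = graph.get(edge, 0) + 1
    return graph
```

Calling build_region_graph once per time slot, on the trajectories that fall in that slot, gives one region graph per slot.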
Methodology
Extracting features from each edge
|S|: Number of taxis
E(V): Expectation of speed
θ = E(D) / CenDist(r1, r2): ratio of the expected travel distance to the distance between the two regions' centroids
[Figure: trajectories Tr1 and Tr2 (points p0 to p8) crossing regions r1, r2, and r3, with CenDist(r1, r3) marked; 3D plot of edges over |S|, E(V) (km/h), and θ]
M = [a_i,j]: matrix over regions r0 ... rn, with one entry a_i,j for each pair of regions
a_i,j = <|S|, E(V), θ>
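A sketch of how the three edge features might be computed for one time slot. Several things here are assumptions rather than the authors' method: the input record layout, counting |S| as distinct taxis, reading θ as E(D) / CenDist(ri, rj), and using planar centroid coordinates in kilometres.

```python
from collections import defaultdict
from math import hypot
from statistics import mean


def edge_features(transitions, centroids):
    """Return {(ri, rj): (|S|, E(V), theta)} for one time slot.

    transitions: iterable of (ri, rj, taxi_id, speed_kmh, distance_km)
    centroids:   {region: (x_km, y_km)} in a planar coordinate system
    """
    grouped = defaultdict(list)
    for ri, rj, taxi_id, speed, dist in transitions:
        grouped[(ri, rj)].append((taxi_id, speed, dist))

    features = {}
    for (ri, rj), rows in grouped.items():
        taxis = len({taxi_id for taxi_id, _, _ in rows})   # |S|
        exp_speed = mean(speed for _, speed, _ in rows)     # E(V)
        exp_dist = mean(dist for _, _, dist in rows)        # E(D)
        (x1, y1), (x2, y2) = centroids[ri], centroids[rj]
        cen_dist = hypot(x2 - x1, y2 - y1)                  # CenDist(ri, rj)
        features[(ri, rj)] = (taxis, exp_speed, exp_dist / cen_dist)
    return features
```

A θ close to 1 means taxis travel almost straight between the two regions; a larger θ means they take a detour.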
Methodology
Select edges with |S| above average
Detect skyline edges according to <E(V), θ>
Select edges with large θ and small E(V)
No point on the skyline is dominated by any other point
[Figure: A) A skyline point, plotted as θ against E(V); B) Seeking a skyline]
Point  E(V)  θ
1      24    1.6
2      20    2.4
3      30    2.8
4      22    2.0
5      18    1.4
6      34    2.4
7      30    2.0
8      36    3.2
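A brute-force sketch of the skyline step, using the eight sample points from panel B read as (point id, E(V), θ). The dominance rule is an assumption inferred from "big θ and small E(V)": point a dominates point b when a has speed no higher and θ no lower than b, and is strictly better in at least one of the two.

```python
def dominates(a, b):
    """True if a is at least as slow and at least as indirect as b, and not equal."""
    (va, ta), (vb, tb) = a, b
    return va <= vb and ta >= tb and (va < vb or ta > tb)


def skyline(points):
    """Return the ids of points that no other point dominates."""
    return sorted(
        pid
        for pid, p in points.items()
        if not any(dominates(q, p) for qid, q in points.items() if qid != pid)
    )


# Sample points from the slide: id -> (E(V), theta).
SAMPLE = {
    1: (24, 1.6), 2: (20, 2.4), 3: (30, 2.8), 4: (22, 2.0),
    5: (18, 1.4), 6: (34, 2.4), 7: (30, 2.0), 8: (36, 3.2),
}
```

Under this rule, skyline(SAMPLE) returns [2, 3, 5, 8]; for instance, point 1 is dropped because point 2 is both slower and more indirect. The pairwise comparison is quadratic in the number of edges, which is fine for a sketch but not tuned for a city-scale graph.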
Methodology
Mining skyline patterns
To avoid false alerts
Deep understanding
Step (1): Building skyline graphs. Formulate a skyline graph for each time slot (Slot 1, Slot 2, Slot 3) of each day (Day 1, Day 2, Day 3)
Step (2): Mining frequent patterns from the skyline graphs; each pattern is labeled with its support (examples shown: Support=1.0, Support=2/3)
[Figure: skyline graphs over regions r1 to r9 for each slot and day, and the patterns mined from them]
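To illustrate what a support value such as 2/3 means, the sketch below counts in how many days a pattern appears among that slot's skyline graphs. It is not the mining algorithm used in the talk: it assumes each daily skyline graph is a set of directed edges, considers only single edges and two-edge chains, and the function name and region ids are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def frequent_patterns(daily_graphs, min_support):
    """Return {pattern: support} for edges and two-edge chains.

    daily_graphs: list of sets of directed edges (ri, rj), one set per day
    min_support:  minimum fraction of days in which a pattern must occur
    """
    counts = Counter()
    for edges in daily_graphs:
        patterns = {(e,) for e in edges}
        # Chain two edges when the first ends where the second starts.
        patterns |= {(a, b) for a in edges for b in edges if a[1] == b[0] and a != b}
        counts.update(patterns)
    days = len(daily_graphs)
    return {
        p: Fraction(c, days)
        for p, c in counts.items()
        if Fraction(c, days) >= min_support
    }
```

With three days of graphs, a pattern seen on every day has support 1 and a pattern seen on two of the three days has support 2/3; patterns below min_support are treated as one-off events rather than recurring flaws.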
Evaluations
Datasets
                                   2009.3-5  2010.3-6
Number of taxis                    29,286    30,121
Effective days                     89        116
Number of points, total            679M      1,730M
Number of points, per taxi/day     306       528
Distance (km), total               310M      600M
Distance (km), per taxi/day        128       171
Average sampling rate (s)          100       74
Ave. dist. between two points (m)  457       349
[Figure panels: 2009 and 2010; Rest Days and Workdays]
Results
Some flaws occurring in 2009 disappeared in 2010
Example 1: Two roads opened in late 2009
[Figure: map of regions r1 to r4, labeled Entrance, The 4th Ring Road, Wangjing West Road, and Nanhuqu West Road]
Results
Some flaws occurring in 2009 still exist in 2010
Example 1: Subway lines 14 and 15
[Figure: regions r1 to r3 with subway lines 14 and 15. A) Overview; B) Lines 14 and 15 in Wangjing; C) Line 14 passing the New CBD]
Conclusion
Video
The Released Dataset: T-Drive taxi trajectories
A demo in the demo session on Sept. 20.
Thanks!
Yu Zheng
http://research.microsoft.com/en-us/people/yuzheng/