Computerized Trip Classification of GPS Data: A Proposed Framework

Terry Griffin, Yan Huang, Ranette Halverson
Midwestern State University, Wichita Falls
University of North Texas, Denton
Midwestern State University, Wichita Falls TX
Introduction and Motivation
• Why derive trip purpose?
• Many transportation departments run studies that require Travel Diaries (TDs) or Origin-Destination (OD) matrices.
• TDs and OD matrices require a great deal of user interaction.
• In this paper we propose a framework that could remove the human factor from the creation of TDs and OD matrices.
• It does so by passively collecting GPS data.
Overview of the Presentation
•Some Background
•Trip Purpose Classification
•Data Collection
•Data Preparation
•Data Aggregation
•Clustering
•Generating Random Data
•Results
•Conclusions
Background
To create a trip classification model, we first need to know:
•What is a trip?
– GPS streams
•How do we classify that trip?
– Clustering
– Decision Trees
Background
GPS Streams
What is a GPS stream?
•The logged GPS data can be described as a collection of points (P1, P2, ..., Pn)
•Each point is defined by a Latitude (Lat) and Longitude (Lon) pair, accompanied by the Time of Day (ToD).
•The entire set becomes:
(P1[Lat, Lon, ToD], P2[Lat, Lon, ToD], ..., Pn[Lat, Lon, ToD])
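One way to hold such a stream in code is sketched below. The names GpsPoint, GpsStream and make_stream are illustrative, not from the paper, and the coordinates are made up.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class GpsPoint:
    """One logged fix: latitude and longitude in degrees plus a timestamp."""
    lat: float
    lon: float
    tod: datetime  # time of day at which the fix was recorded


# A stream is the time-ordered list (P1, P2, ..., Pn).
GpsStream = List[GpsPoint]


def make_stream(rows) -> GpsStream:
    """Build a stream from (lat, lon, iso_timestamp) tuples, sorted by time."""
    points = [GpsPoint(lat, lon, datetime.fromisoformat(ts)) for lat, lon, ts in rows]
    return sorted(points, key=lambda p: p.tod)


# Made-up fixes, deliberately given out of order.
stream = make_stream([
    (33.8740, -98.5210, "2006-03-01T08:05:00"),
    (33.8712, -98.5195, "2006-03-01T08:00:00"),
])
```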
Background
GPS Streams
What is a GPS stream?
Each stream is typically recorded:
• continuously, at a user-defined interval
• or only while the receiver is moving
Each stream yields Points Of Interest (POI)
Background
Clustering
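DBSCAN – Density-Based Clustering
•Eps: radius of the neighborhood around a point
•MinPts: minimum number of points in an Eps-neighborhood for a point to count as a core point
•Density reachability
•Density connectivity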
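A minimal, unoptimized sketch of DBSCAN shows how these four terms fit together. It assumes the points are already planar (x, y) coordinates in a common unit such as feet. It is a textbook version for illustration, not the implementation used in this work.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
NOISE = -1


def region_query(points: Sequence[Point], i: int, eps: float) -> List[int]:
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points: Sequence[Point], eps: float, min_pts: int) -> List[Optional[int]]:
    """Return one label per point: 0, 1, 2, ... for clusters, NOISE otherwise."""
    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: density-reachable, not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is also core: keep expanding
                queue.extend(j_neighbors)
    return labels


# Two tight groups and one stray point (made-up coordinates).
pts = [(0, 0), (1, 0), (0, 1), (50, 50), (51, 50), (50, 51), (200, 200)]
labels = dbscan(pts, eps=2.0, min_pts=3)
```

Points reached through a chain of core points end up in the same cluster (density connectivity); the stray point has too few neighbors and stays labeled as noise.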
Background
Decision Trees
• What is a decision tree?
1. A tool for classification and prediction
2. A tree-like structure that represents rules
3. Each node is either:
– a leaf node, which indicates the value of the target attribute (class) of examples, or
– a decision node, which specifies a test to be carried out on a single attribute value, with one branch and sub-tree for each possible outcome of the test.
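The two node types can be sketched as a small data structure. The tree below is hand-built for illustration over weather-style attributes; the humidity threshold of 75 and the names Leaf, Decision and classify are assumptions, not output from any learner.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Leaf:
    label: str  # value of the target attribute (class)


@dataclass
class Decision:
    test: Callable[[dict], str]   # maps an example to one outcome of the test
    branches: Dict[str, "Node"]   # one sub-tree per possible outcome


Node = Union[Leaf, Decision]


def classify(node: Node, example: dict) -> str:
    """Follow the branch chosen by each test until a leaf is reached."""
    while isinstance(node, Decision):
        node = node.branches[node.test(example)]
    return node.label


# Hand-built illustrative tree (threshold 75 is an assumption).
tree = Decision(
    test=lambda e: e["outlook"],
    branches={
        "sunny": Decision(
            test=lambda e: "high" if e["humidity"] > 75 else "normal",
            branches={"high": Leaf("Don't Play"), "normal": Leaf("Play")},
        ),
        "overcast": Leaf("Play"),
        "rain": Decision(
            test=lambda e: "windy" if e["windy"] else "calm",
            branches={"windy": Leaf("Don't Play"), "calm": Leaf("Play")},
        ),
    },
)

result = classify(tree, {"outlook": "rain", "humidity": 80, "windy": False})
```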
Background
Decision Trees
Example Decision Tree
Given

ATTRIBUTE   | POSSIBLE VALUES
============+=======================
outlook     | sunny, overcast, rain
temperature | continuous
humidity    | continuous
windy       | true, false

and

OUTLOOK  | TEMPERATURE | HUMIDITY | WINDY | PLAY
=====================================================
sunny    |     85      |    85    | false | Don't Play
sunny    |     80      |    90    | true  | Don't Play
overcast |     83      |    78    | false | Play
rain     |     70      |    96    | false | Play
rain     |     68      |    80    | false | Play
rain     |     65      |    70    | true  | Don't Play
overcast |     64      |    65    | true  | Play
….

You get
Background
Decision Trees
Example Decision Tree (Golf)
Background
Decision Trees
1.Entropy – measures the impurity (lack of homogeneity) of an arbitrary collection of examples
2.Information gain – measures how well a given attribute separates the training examples according to their target classification
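Both measures are short to compute. The sketch below uses the standard definitions (entropy as the negative sum of p·log2 p over class proportions, gain as the entropy drop after splitting on an attribute) and applies them to the seven rows shown in the golf example table; only the categorical attributes are used, and the function names are illustrative.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (base 2) of a collection of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows: List[Dict[str, str]], attribute: str, target: str) -> float:
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups: Dict[str, List[str]] = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# The seven rows visible in the example table (categorical columns only).
rows = [
    {"outlook": "sunny", "windy": "false", "play": "Don't Play"},
    {"outlook": "sunny", "windy": "true", "play": "Don't Play"},
    {"outlook": "overcast", "windy": "false", "play": "Play"},
    {"outlook": "rain", "windy": "false", "play": "Play"},
    {"outlook": "rain", "windy": "false", "play": "Play"},
    {"outlook": "rain", "windy": "true", "play": "Don't Play"},
    {"outlook": "overcast", "windy": "true", "play": "Play"},
]

gain_outlook = information_gain(rows, "outlook", "play")
gain_windy = information_gain(rows, "windy", "play")
```

On these rows outlook yields the larger gain, so a gain-based learner would test it before windy.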
Trip Purpose Classification
•To find and classify trip purposes for a given GPS stream, we follow a series of steps:
– Data Collection
– Data Preparation
– Data Aggregation
– Actual Classification
Trip Purpose Detection
Data Collection
•Tools
– Palm m515 (hardware)
– Magellan GPS Companion (hardware)
– Cetus GPS 1.1 (software)
•Method
– Continuous
– Movement only (caused problems)
•Collected
– 6 weeks of continuous data for 1 individual
– A randomly generated data set
Trip Purpose Detection
Data Preparation
• Data cleansing
• Compute trip stop lengths from the raw GPS data.
– Continuous
– Movement only
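The slides do not spell out how stop lengths are computed, so the following is only one plausible sketch for each recording method. It assumes fixes are already projected to planar feet; the thresholds (radius_ft, min_stay, min_gap) and function names are hypothetical.

```python
import math
from datetime import datetime, timedelta
from typing import List, Tuple

Fix = Tuple[float, float, datetime]  # (x_feet, y_feet, timestamp), assumed projected


def stops_movement_only(fixes: List[Fix], min_gap: timedelta) -> List[Tuple[Fix, timedelta]]:
    """Movement-only logs: a long silence between two fixes is treated as a stop."""
    stops = []
    for prev, cur in zip(fixes, fixes[1:]):
        gap = cur[2] - prev[2]
        if gap >= min_gap:
            stops.append((prev, gap))  # stopped at prev for the length of the gap
    return stops


def stops_continuous(fixes: List[Fix], radius_ft: float,
                     min_stay: timedelta) -> List[Tuple[Fix, timedelta]]:
    """Continuous logs: a stop is a run of fixes within radius_ft of the run's first fix."""
    stops = []
    start = 0
    for i in range(1, len(fixes) + 1):
        moved = i == len(fixes) or math.dist(fixes[i][:2], fixes[start][:2]) > radius_ft
        if moved:
            stay = fixes[i - 1][2] - fixes[start][2]
            if stay >= min_stay:
                stops.append((fixes[start], stay))
            start = i
    return stops


t0 = datetime(2006, 3, 1, 8, 0)  # made-up logs for illustration
continuous_log = [(0.0, 0.0, t0 + timedelta(minutes=m)) for m in range(11)]
continuous_log += [(500.0, 0.0, t0 + timedelta(minutes=11)),
                   (1000.0, 0.0, t0 + timedelta(minutes=12))]
continuous_stops = stops_continuous(continuous_log, radius_ft=50.0,
                                    min_stay=timedelta(minutes=5))

movement_log = [(0.0, 0.0, t0), (100.0, 0.0, t0 + timedelta(minutes=1)),
                (120.0, 0.0, t0 + timedelta(minutes=61)),
                (300.0, 0.0, t0 + timedelta(minutes=62))]
movement_stops = stops_movement_only(movement_log, min_gap=timedelta(minutes=10))
```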
Trip Purpose Detection
Data Aggregation
•Single points are not meaningful
•Only after many points are “clustered” together can we really gain information.
•Each balloon is a “POI” (cluster)
•Each balloon gives us:
– Average time of day
– Average length of stay
– Longest length of stay
– Earliest arrival time
– Etc…
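As a sketch, the per-cluster aggregates could be computed from a list of visits to that cluster. It assumes each visit is an (arrival time, length of stay) pair and reads "average time of day" as the average arrival hour; the plain mean ignores wrap-around at midnight. Field names are illustrative.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Dict, List, Tuple

Visit = Tuple[datetime, timedelta]  # (arrival time, length of stay)


def hour_of_day(t: datetime) -> float:
    """Time of day as fractional hours, e.g. 08:30 -> 8.5."""
    return t.hour + t.minute / 60 + t.second / 3600


def aggregate(visits: List[Visit]) -> Dict[str, float]:
    """Summarize all visits to one cluster (POI)."""
    arrivals = [hour_of_day(a) for a, _ in visits]
    stays = [s.total_seconds() / 60 for _, s in visits]  # minutes
    return {
        "avg_arrival_hour": mean(arrivals),
        "earliest_arrival_hour": min(arrivals),
        "avg_stay_min": mean(stays),
        "longest_stay_min": max(stays),
        "visits": len(visits),
    }


# Two made-up visits to the same cluster.
poi = aggregate([
    (datetime(2006, 3, 1, 8, 0), timedelta(hours=8)),
    (datetime(2006, 3, 2, 9, 0), timedelta(hours=9)),
])
```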
Trip Purpose Detection
Data Aggregation
•From these aggregate values we can build and train our decision tree.
Trip Purpose Detection
Classifying Points of Interest
Identified Clusters:
Trip Purpose Detection
Classifying Points of Interest
•Example tree created by C4.5:
Trip Purpose Detection
Classifying Points of Interest
Identified Clusters:
Random Data
x - current time of day
µ - time of day at which the probability of going to a location should be high
σ - time window (standard deviation) around µ
d - control parameter, d = (d1, d2), d ∈ {(0,1), (-1,0), (-1,1)}
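The generating formula itself did not survive on this slide. Because µ and σ are described as a target time and a standard deviation, one plausible reading is a Gaussian-shaped weight centered on µ; the sketch below is that assumption only. The control parameter d is left out because its role cannot be recovered from the text, and the function names are hypothetical.

```python
import math
import random


def visit_weight(x: float, mu: float, sigma: float) -> float:
    """Assumed Gaussian-shaped weight in (0, 1]: highest when x equals mu.

    x, mu and sigma are in hours. The control parameter d from the slide
    is not modeled here.
    """
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))


def goes_there(x: float, mu: float, sigma: float, rng: random.Random) -> bool:
    """Draw whether a simulated traveler visits the location at time x."""
    return rng.random() < visit_weight(x, mu, sigma)


rng = random.Random(0)  # fixed seed so a run is repeatable
at_peak = visit_weight(12.0, mu=12.0, sigma=1.0)
one_sigma_off = visit_weight(13.0, mu=12.0, sigma=1.0)
```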
Results
Random Data
•50 generations
•For each generation we varied Eps and MinPts
– Eps: neighborhoods from 15x15 feet to 200x200 feet (5 distinct sizes)
– MinPts: values from 2 to 10
•As each cluster was found, it was classified using a classification tree based on the data generated for that test.
•Each cluster was assigned a level of correctness (all points in the cluster correctly identified = 1)
•We used 20% of the generated data to train the tree.
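A rough sketch of the two bookkeeping steps follows. It assumes the level of correctness is the fraction of a cluster's points whose predicted purpose matches the generated one, which is consistent with "all correct = 1" but is not stated on the slide. The split helper, its seed and the labels are illustrative.

```python
import random
from typing import List, Sequence, Tuple


def cluster_correctness(true_labels: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of points in one cluster that were identified correctly."""
    hits = sum(t == p for t, p in zip(true_labels, predicted))
    return hits / len(true_labels)


def train_test_split(items: List, train_fraction: float, seed: int) -> Tuple[List, List]:
    """Shuffle a copy of items and cut off the first train_fraction for training."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 20% of the generated records train the tree; the rest are held out.
records = list(range(100))  # stand-in for generated records
train, test = train_test_split(records, train_fraction=0.2, seed=1)

perfect = cluster_correctness(["work"] * 4, ["work"] * 4)
half = cluster_correctness(["work", "work", "home", "home"], ["work"] * 4)
```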
Future Work
Future Plans
• Create a GPS database
– $5000 grant for GPS devices (fall 2006)
– Additional University funds
• Fill a gap in GPS research
Conclusions
•This classification tool has potential, but needs real validation
•It would be nice to obtain a large data set
•Future…
– possibly predict the next trip stop based on Markov chains
•Questions?