Mining Individual Life Pattern Based on Location
History: A Paradigm and Framework
Yu Zheng @ Microsoft Research Asia
On behalf of Ye Yang
March 16, 2009
Background
GPS-enabled devices have become prevalent
These devices enable us to record our location history with GPS trajectories
Human location history is a large and valuable resource, given the large number of GPS phones
Motivation
Human location history
not only represents an individual’s life regularity
but also implies the person’s tastes/preferences
[Figure: example places in a location history: University, Movie center, Supermarket, Microsoft]
Motivation
An individual’s life pattern
can be used to model and predict a person’s behaviors/preferences
and enable valuable applications
context-aware computing
personalized recommendation
Challenges
How to model an individual’s location history
A life pattern can have multiple representations/definitions
E.g., John typically leaves home at 8:30 am
E.g., Matt usually goes to a cinema once a month
E.g., Mary goes shopping after visiting a Starbucks
Different applications need different patterns
Many mining algorithms
Duplicated effort
What we do
Propose a model representing an individual’s location history
Define the paradigm of individual life patterns
Present a framework for mining individual life patterns
1: Modeling Location History
GPS logs P and GPS trajectory
Each GPS point: Latitude, Longitude, Time
p1: Lat1, Lngt1, T1
p2: Lat2, Lngt2, T2
...
pn: Latn, Lngtn, Tn
Stay points S = {s1, s2, …, sn}
A stay point stands for a geo-region where a user has stayed for a while
A stay point carries a semantic meaning beyond a raw GPS point
[Figure: a GPS trajectory through points p1 to p7, with a stay point S marked on it]
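The slide defines a stay point only informally, as a region where the user stayed for a while. A minimal sketch of one common way to detect such regions is shown below. The thresholds `dist_m` and `min_stay`, the function names, and the choice to average the window's coordinates are all assumptions made for illustration, not values or methods stated in the slides.

```python
import math
from datetime import datetime, timedelta

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def detect_stay_points(points, dist_m=200.0, min_stay=timedelta(minutes=20)):
    """points: list of (lat, lng, time) tuples sorted by time.

    Returns a list of (mean_lat, mean_lng, arrive_time, leave_time).
    dist_m and min_stay are assumed thresholds, not taken from the slides.
    """
    stays = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        # Extend the window while points remain within dist_m of anchor point i.
        while j < n and haversine_m(points[i][0], points[i][1],
                                    points[j][0], points[j][1]) <= dist_m:
            j += 1
        arrive, leave = points[i][2], points[j - 1][2]
        if leave - arrive >= min_stay:
            window = points[i:j]
            lat = sum(p[0] for p in window) / len(window)
            lng = sum(p[1] for p in window) / len(window)
            stays.append((lat, lng, arrive, leave))
            i = j  # continue after the stay
        else:
            i += 1  # no stay anchored here; slide the anchor forward
    return stays

# Hypothetical trace: 30 minutes at one spot, then a distant point.
t0 = datetime(2009, 3, 16, 8, 0)
trace = [(39.98, 116.31, t0 + timedelta(minutes=10 * k)) for k in range(4)]
trace.append((40.05, 116.40, t0 + timedelta(minutes=45)))
stays = detect_stay_points(trace)
```

Each detected stay point keeps its arrival and departure time, which is what the later arrival-time and stay-duration patterns need.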
1: Modeling Location History
Location history:
$LocH = (s_1 \xrightarrow{\Delta t_1} s_2 \xrightarrow{\Delta t_2} \cdots \xrightarrow{\Delta t_{n-1}} s_n)$
represented by a sequence of stay points
with transition intervals
[Figure: map of stay points S1 to S10 grouped into clusters C1 (Home), C2 (Supermarket), C3 (Company), and C4 (Restaurant)]
Stay point sequences:
Day 1: S1 → S2 → S3 → S4
Day 2: S4 → S5 → S6 → S7
Day 3: S7 → S8 → S9 → S10
Stay point cluster sequences:
Day 1: C1 → C3 → C2 → C1
Day 2: C1 → C2 → C4 → C1
Day 3: C1 → C3 → C4 → C3
1: Modeling Location History
Considering the scale of a location
[Figure: two-level hierarchy over the stay points]
A: C1 (Home) = {S1, S4, S7}, C2 (Supermarket) = {S5, S3}
B: C3 (Company) = {S2, S8, S10}, C4 (Restaurant) = {S6, S9}
Cluster level:
Day 1: C1 → C3 → C2 → C1
Day 2: C1 → C2 → C4 → C1
Day 3: C1 → C3 → C4 → C3
Region level:
Day 1: A → B → A → A
Day 2: A → A → B → A
Day 3: A → B → B → B
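To make the change of scale concrete, the sketch below lifts a stay point sequence to the cluster level and then to the region level with two lookup tables. The memberships are read off the day sequences in the example (for instance, Day 1 pairs S1, S2, S3, S4 with C1, C3, C2, C1); the names `STAY_TO_CLUSTER`, `CLUSTER_TO_REGION`, and `lift` are hypothetical.

```python
# Memberships inferred from the example's day sequences (an assumption of this sketch).
STAY_TO_CLUSTER = {
    "S1": "C1", "S4": "C1", "S7": "C1",
    "S3": "C2", "S5": "C2",
    "S2": "C3", "S8": "C3", "S10": "C3",
    "S6": "C4", "S9": "C4",
}
CLUSTER_TO_REGION = {"C1": "A", "C2": "A", "C3": "B", "C4": "B"}

def lift(sequence, mapping):
    """Rewrite a sequence at a coarser geospatial scale."""
    return [mapping[item] for item in sequence]

day1_stays = ["S1", "S2", "S3", "S4"]
day1_clusters = lift(day1_stays, STAY_TO_CLUSTER)     # ['C1', 'C3', 'C2', 'C1']
day1_regions = lift(day1_clusters, CLUSTER_TO_REGION)  # ['A', 'B', 'A', 'A']

day3_stays = ["S7", "S8", "S9", "S10"]
day3_clusters = lift(day3_stays, STAY_TO_CLUSTER)     # ['C1', 'C3', 'C4', 'C3']
day3_regions = lift(day3_clusters, CLUSTER_TO_REGION)  # ['A', 'B', 'B', 'B']
```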
1:Modeling Location History
Build a tree using a hierarchical clustering algorithm
Each node represents a cluster of stay points
Different levels denote different geospatial granularity
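[Figure: tree hierarchy with cities (City 1, …, City i, …, City n) at the top, communities (Community A, Community B) below them, and places (Home, Supermarket, Company, Restaurant) at the leaves]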
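The slides do not name the hierarchical clustering algorithm. As a rough sketch of the idea, the code below uses single-linkage grouping on planar coordinates and cuts it at several distance thresholds, which yields nested clusters from fine to coarse. The algorithm choice, the planar coordinates, the thresholds, and the function names are all assumptions.

```python
import math

def single_linkage_cut(points, threshold):
    """Label points so that any two points linked by a chain of hops, each at most
    `threshold` long, share a label. This equals cutting a single-linkage
    dendrogram at height `threshold`."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    labels = {}
    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]

def build_levels(points, thresholds):
    """One labelling per threshold, from the finest level to the coarsest."""
    return [single_linkage_cut(points, t) for t in sorted(thresholds)]

# Hypothetical stay point coordinates in kilometres on a flat plane.
stay_points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0), (10.0, 0.0)]
fine, coarse = build_levels(stay_points, thresholds=[0.2, 2.0])
```

Because a larger threshold can only merge groups, every fine cluster sits inside exactly one coarse cluster, which is the tree property the slide relies on.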
1: Modeling Location History
An individual’s location history can be represented as sequences of stay point clusters, with transition times between consecutive clusters, at different geospatial scales.
[Figure: the same three days shown at three scales, from regions A and B, to clusters C1 (Home), C2 (Supermarket), C3 (Company), C4 (Restaurant), to stay points S1 to S10]
Region level:
Day 1: A → B → A → A
Day 2: A → A → B → A
Day 3: A → B → B → B
Cluster level:
Day 1: C1 → C3 → C2 → C1
Day 2: C1 → C2 → C4 → C1
Day 3: C1 → C3 → C4 → C3
Stay point level:
Day 1: S1 → S2 → S3 → S4
Day 2: S4 → S5 → S6 → S7
Day 3: S7 → S8 → S9 → S10
2: The Paradigm of Life Pattern
Life Pattern $P$: $P := P_c \parallel P_{nc}$ (where $\parallel$ separates alternatives)
Conditional Life Pattern $P_c$: $P_c := P_{nc}^1 \mid P_{nc}^2$
Life Association Rule: $P_{nc}^1 \rightarrow P_{nc}^2$
Non-Conditional Life Pattern $P_{nc}$: $P_{nc} := P_s \parallel P_{ns}$
Sequential Life Pattern $P_s$
Non-Sequential Life Pattern $P_{ns}$
Location dimension: City, Community, Restaurants
Time dimension: Year, Month, Week, Day
2: The Paradigm of Life Pattern
Atomic life pattern
$A := \mathrm{visit}(X)\,.\,(?\,\mathrm{arv}([t_1, t_2]))\,.\,(?\,\mathrm{stay}([\tau_1, \tau_2]))$
E.g., Mary typically arrives at the “Starbucks” between 2 and 3 pm.
E.g., Mary typically stays in the “Starbucks” for 1 to 1.5 hours.
E.g., Mary typically arrives at the “Starbucks” between 2 and 3 pm, and stays there for 1 to 1.5 hours.
Non-sequential life pattern
$P_{ns} := A \parallel (P_{ns} \wedge A)$
E.g., Typically, Mary leaves home around 9 am.
E.g., Typically, Mary leaves home around 9 am and comes back around 7 pm.
Sequential life pattern
$P_s := A \parallel (P_s \rightarrow A)$
E.g., John usually goes to a Starbucks café after shopping at the Outlets (Outlets → Starbucks)
E.g., John usually visits Outlets → Starbucks → restaurants
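One way to read the grammar is as a small data structure: an atomic pattern names a place and optionally constrains the arrival window and the stay duration, a non-sequential pattern is a conjunction of atomic patterns, and a sequential pattern is an ordered chain of them. The sketch below follows that reading; the class and function names, and the use of fractional hours, are illustrative assumptions rather than anything defined in the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Visit:
    place: str
    arrive_hour: float  # hour of day, e.g. 14.5 for 2:30 pm
    stay_hours: float

@dataclass(frozen=True)
class AtomicPattern:
    """visit(X) with optional arv([t1, t2]) and stay([tau1, tau2]) constraints."""
    place: str
    arrive: Optional[Tuple[float, float]] = None
    stay: Optional[Tuple[float, float]] = None

    def matches(self, v: Visit) -> bool:
        if v.place != self.place:
            return False
        if self.arrive is not None and not (self.arrive[0] <= v.arrive_hour <= self.arrive[1]):
            return False
        if self.stay is not None and not (self.stay[0] <= v.stay_hours <= self.stay[1]):
            return False
        return True

def matches_non_sequential(patterns, visits):
    """Every atomic pattern is matched by some visit, in any order."""
    return all(any(p.matches(v) for v in visits) for p in patterns)

def matches_sequential(patterns, visits):
    """The atomic patterns are matched in order; other visits may occur in between."""
    remaining = iter(visits)  # shared iterator, so each match consumes earlier visits
    return all(any(p.matches(v) for v in remaining) for p in patterns)

# Hypothetical day: shopping at the Outlets, then Starbucks.
day = [Visit("Outlets", 13.0, 1.5), Visit("Starbucks", 14.5, 1.2)]
outlets = AtomicPattern("Outlets")
starbucks = AtomicPattern("Starbucks", arrive=(14.0, 15.0), stay=(1.0, 1.5))

in_order = matches_sequential([outlets, starbucks], day)
reversed_order = matches_sequential([starbucks, outlets], day)
any_order = matches_non_sequential([starbucks, outlets], day)
```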
3: The Framework for Life Pattern Mining
[Figure: framework for life pattern mining, in four stages]
Modeling Location History: GPS Traces, Log Parsing, GPS Log, Stay Point Detection, Stay Points Clustering, Stay Point Sequences
Mining Atomic Life Patterns: Location History, Temporal Sampling and Partition, Location Selection, Time Condition, Location Condition, Life Sequence Dataset, Mining Atomic Life Pattern
Mining Non-Conditioned Life Patterns: Atomic Patterns, Atomic Pattern Combination, Non-Sequential Patterns, Frequent Sequence Mining, Sequential Patterns
Mining Conditioned Life Patterns: Conditioned Patterns
3: The Framework for Life Pattern Mining
Mining atomic life patterns
A user needs to specify
the geo-region that interests them (location condition)
the time span and/or temporal type of interest (temporal condition)
a suggested support value (S)
E.g., show me my life patterns at the Sigma building on weekends over the last year
E.g., show me my life patterns on Fridays during 2008 in Beijing
Algorithms like FP-growth, MAFIA, CHARM and CLOSET+ can be used here
Possible results
1. In the last year, you typically arrived at the Sigma building around 10-11 am and stayed 4-6 hours;
you visited the Sigma building every two weekends.
……
2. In 2008, you went to Xidan once a month;
you visited there in the evening.
Typically, you spent 2-3 hours in Xidan;
you went to a movie center every three weeks.
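The slide points to frequent itemset miners such as FP-growth for this step. As a much simpler stand-in, the sketch below only counts, for one selected place, the fraction of days on which the user arrived within each one-hour bin, and keeps the bins that reach the support value S. The record layout, the one-hour binning, and the function name are assumptions; this is plain counting, not FP-growth.

```python
from collections import Counter

def frequent_arrival_hours(days, place, min_support):
    """days: list of per-day visit lists; each visit is (place, arrive_hour, stay_hours).

    Returns {hour_bin: support} for bins whose support is at least min_support.
    The days are assumed to be already filtered by the temporal condition.
    """
    counts = Counter()
    for visits in days:
        # A set, so each hour bin is counted at most once per day.
        hours = {int(arrive) for p, arrive, _ in visits if p == place}
        counts.update(hours)
    n = len(days)
    return {h: c / n for h, c in counts.items() if c / n >= min_support}

# Hypothetical weekend days over the selected time span.
weekend_days = [
    [("Sigma", 10.5, 5.0)],
    [("Sigma", 10.2, 4.5)],
    [("Sigma", 14.0, 2.0)],
    [("Xidan", 19.0, 2.5)],
]
arrivals = frequent_arrival_hours(weekend_days, "Sigma", min_support=0.5)
```

Stay durations could be binned and counted the same way, and a real itemset miner would then find combinations of arrival and stay bins that are frequent together.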
3: The Framework for Life Pattern Mining
Mining non-conditioned life patterns based on atomic patterns
Combine atomic patterns
E.g., In the last year, you went to Xidan once a month; in most cases, you visited there in the evening on weekends and spent 2-3 hours there.
Mining sequential life patterns
Algorithms like CloSpan, etc.
E.g., In 2008, you typically traveled to Xidan from the Sigma building on weekends.
More specifically, you usually left the Sigma building around 7 pm and spent 30 to 50 minutes on the way.
Sigma building --(30-50 min)--> Xidan
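To illustrate what a sequential miner looks for, the sketch below counts ordered pairs of places across daily sequences and keeps the pairs whose support reaches a threshold. It is a naive length-2 counter written for illustration, not CloSpan, and the function name and threshold are assumptions. The input reuses the three daily cluster sequences from the location history example.

```python
from collections import Counter
from itertools import combinations

def frequent_ordered_pairs(sequences, min_support):
    """Support of (earlier, later) place pairs, counted at most once per sequence.

    Gaps are allowed: in C1, C3, C2 the pair (C1, C2) counts.
    """
    counts = Counter()
    for seq in sequences:
        # combinations() keeps positional order, so each tuple is (earlier, later).
        counts.update(set(combinations(seq, 2)))
    n = len(sequences)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

daily_sequences = [
    ["C1", "C3", "C2", "C1"],  # Day 1
    ["C1", "C2", "C4", "C1"],  # Day 2
    ["C1", "C3", "C4", "C3"],  # Day 3
]
frequent = frequent_ordered_pairs(daily_sequences, min_support=0.6)
```

With these three days, five ordered pairs occur on two days out of three, for example C1 followed later by C3.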
3: The Framework for Life Pattern Mining
Mining conditional life patterns
$\Pr[P_{nc}^1 \mid P_{nc}^2] = \dfrac{\Pr[P_{nc}^1 \wedge P_{nc}^2]}{\Pr[P_{nc}^2]}$
Patterns with one or two conditions would be more useful
E.g., typically, you go to the Zhongguancun movie center if you leave the Sigma building before 4 pm on weekends.
If you leave the Sigma building after 7 pm on weekends, you usually visit Xidan.
If you stayed in Xidan for more than 3 hours, you went to a Thai-food restaurant.
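The conditional probability can be estimated by counting days: the number of days on which both patterns hold, divided by the number of days on which the condition holds. The sketch below does exactly that, with each pattern passed in as a predicate over a day record. The day record layout (`left_sigma_hour`, `visited`) and the function name are hypothetical.

```python
def conditional_confidence(days, consequent, condition):
    """Estimate Pr[consequent | condition] as
    (# days where both hold) / (# days where the condition holds).
    Returns None when the condition never holds."""
    cond_days = [d for d in days if condition(d)]
    if not cond_days:
        return None
    both = sum(1 for d in cond_days if consequent(d))
    return both / len(cond_days)

# Hypothetical weekend day records.
weekend_days = [
    {"left_sigma_hour": 15.5, "visited": {"movie center"}},
    {"left_sigma_hour": 15.0, "visited": {"movie center"}},
    {"left_sigma_hour": 15.8, "visited": {"Xidan"}},
    {"left_sigma_hour": 19.5, "visited": {"Xidan"}},
]

left_before_4pm = lambda d: d["left_sigma_hour"] < 16.0
went_to_movies = lambda d: "movie center" in d["visited"]

confidence = conditional_confidence(weekend_days, went_to_movies, left_before_4pm)
```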
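Experiments
60 devices and 138 users
From May 2007 to present
[Figure: two pie charts describing the users. Age groups: age<=22, 22<age<=25, 26<=age<29, age>=30. Occupations: Microsoft employees, employees of other companies, government staff, college students. Slice percentages as shown: 9%, 16%, 18%, 30%, 14%, 45%, 58%, 10%]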
Experiments
Select 10 volunteers out of the 138 users
Partition their location histories into two parts
Mine patterns separately
Investigate the predictability of the detected life patterns
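The slides do not define how predictability is computed. One plausible reading of the split experiment, offered here purely as an assumption, is the fraction of patterns mined from the first part of a location history that are mined again from the second part. The function name and the pattern encoding below are hypothetical.

```python
def predictability(patterns_first, patterns_second):
    """Fraction of patterns from the first part that reappear in the second part.
    Returns None when the first part yields no patterns.
    This definition is an assumption, not taken from the slides."""
    first = set(patterns_first)
    if not first:
        return None
    return len(first & set(patterns_second)) / len(first)

# Hypothetical mined patterns, encoded as ordered place pairs.
first_half = {("C1", "C3"), ("C1", "C2"), ("C2", "C1")}
second_half = {("C1", "C3"), ("C2", "C1"), ("C4", "C1")}
score = predictability(first_half, second_half)
```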
Experiments
The predictability of life patterns
[Figure: predictability (y-axis, 0.7 to 1) versus support (x-axis, 0 to 0.9) for sequential and non-sequential patterns]
Experiment
A case study on non-conditioned patterns
One-year GPS logs of each volunteer
[Figure: two bar charts of mean scores (0 to 5) for how interesting and how representative the mined patterns are; one chart groups by All Days, Workdays, Holidays, the other by Day, Week, Month]
Experiments
A case study on conditioned patterns
Condition 1: not visiting the most frequent place;
Condition 2: visiting the second most frequent place;
Condition 3: visiting the second most frequent place while not visiting the most frequent place.
[Figure: bar chart of mean scores (0 to 5), interesting and representative, for Cond. 1, Cond. 2, and Cond. 3]
Conclusion
Propose a model representing an individual’s location history
Define the paradigm of individual life patterns
Present a framework for mining individual life patterns
Thanks!
yuzheng@microsoft.com