NextPlace - Interactive Computing Lab

NextPlace: A Spatio-Temporal
Prediction Framework for
Pervasive Systems
Salvatore Scellato1, Mirco Musolesi, Cecilia Mascolo1, Vito Latora,
and Andrew T. Campbell2
1 Computer Laboratory, University of Cambridge, UK
2 Department of Computer Science, Dartmouth College, USA
Pervasive’11
Motivation
• The ability to predict people's future locations enables
▫ A rich set of novel pervasive applications and systems
 Advertisement
 Leisure event reports and notifications
▫ With pervasive technology, these could be implemented
 in a more effective way
 avoiding the delivery of information to uninterested users
 providing a better user experience
NextPlace
• A new prediction framework based on nonlinear time series analysis
▫ For forecasting user behavior
▫ In different locations
▫ From a spatio-temporal point of view
• Estimates the duration of a visit to a certain location and of the intervals between two subsequent visits
▫ That is, when users will visit their most important places
• Does not focus on the transitions between different locations
Predictability & Intuition
• Any prediction of future user behavior is based on the assumption of determinism
▫ Determinism: future events are determined by past events
• Human activities are characterized by a certain degree of regularity and predictability
▫ Because in human societies daily and weekly routines are well-established
• Intuition
▫ The sequence of important locations that an individual visits every day is more or less fixed
▫ Example
 if a woman periodically goes to the gym on Mondays and Thursdays, she may change her routine for those days, but the changed routine will be more or less the same over different weeks.
• Two steps of NextPlace
▫ How to isolate the user’s significant places
▫ How to estimate future times of arrival and residence times in the different significant places
• The detailed nonlinear prediction model is omitted here
Significant Place Extraction: GPS
• Many solutions have been presented in the literature [2,11,18]
• Intuition: permanence at a place is directly proportional to the importance that the user attributes to it
• Approach: 2-D Gaussian distribution weighted by the residence time at each GPS point
▫ The value of the variance for the Gaussian distributions: 10 meters
• Frequency map
▫ Contains peaks which give information about the position of popular locations
▫ Significant places: regions that are above a certain threshold T
(a) Frequency map
(b) Significant places
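The frequency-map idea above can be sketched in a few lines. This is only an illustration under stated assumptions: GPS points are taken to be already projected to local (x, y) coordinates in meters, the map is sampled on a regular grid, the 10 meter value is used as the Gaussian spread `sigma`, and the threshold is expressed as a fraction of the map maximum. The function names are hypothetical and not taken from the paper.

```python
import math

def frequency_map(points, residence, xs, ys, sigma=10.0):
    """Sum one 2-D Gaussian per GPS point, weighted by residence time.

    points    : list of (x, y) positions in meters (assumed local projection)
    residence : residence time spent at each point (same length as points)
    xs, ys    : grid coordinates at which the map is sampled
    sigma     : Gaussian spread in meters (assumed interpretation of "10 meters")
    """
    fmap = [[0.0 for _ in xs] for _ in ys]
    for (px, py), weight in zip(points, residence):
        for i, y in enumerate(ys):
            for j, x in enumerate(xs):
                d2 = (x - px) ** 2 + (y - py) ** 2
                fmap[i][j] += weight * math.exp(-d2 / (2.0 * sigma ** 2))
    return fmap

def significant_cells(fmap, t_fraction):
    """Mark grid cells whose value exceeds t_fraction times the map maximum."""
    peak = max(max(row) for row in fmap)
    return [[value > t_fraction * peak for value in row] for row in fmap]
```

Connected regions of marked cells would then play the role of significant places; grouping cells into regions is left out of this sketch.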
Significant Place Extraction: WiFi
• Intuition: the most frequently seen access points are natural candidates to represent significant places
• Methodology
▫ An access point is considered a significant place if the user has a sequence of at least n visits to it
 In the setting: n = 20
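A minimal sketch of this rule, assuming each visit is recorded as an `(access_point, start, duration)` tuple for a single user. The record layout and the function name are assumptions made for illustration.

```python
from collections import Counter

def significant_access_points(visits, n=20):
    """Return the access points that the user visited at least n times.

    visits : iterable of (access_point, start, duration) tuples (assumed layout)
    n      : minimum number of visits; the slides use n = 20
    """
    counts = Counter(ap for ap, _start, _duration in visits)
    return {ap for ap, count in counts.items() if count >= n}
```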
Predicting User Behavior
• Algorithm description
▫ The history of a user's visits to each of their significant locations is considered
▫ For each location, try to predict when the next visits will take place and for how long they will last
• Procedure
▫ Create two time series from the sequence of previous visits
 C = (c_1, c_2, …, c_n), time series of the visit start times
 D = (d_1, d_2, …, d_n), time series of the visit durations
▫ Search the time series C for sequences of m consecutive values (c_{i-m+1}, …, c_i) that are closely similar to the last m values (c_{n-m+1}, …, c_n)
▫ Estimate the next value of time series C by averaging all the values c_{i+1} that follow each sequence found
▫ Select the corresponding sequences (d_{i-m+1}, …, d_i) in D
▫ Estimate the next value of time series D by averaging all the values d_{i+1} that follow these sequences
• Does not consider the type of place visited, the purpose of the visit, correlations between visited places, …
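The procedure above can be sketched as follows. Since the detailed nonlinear model is omitted from these slides, the notion of "closely similar" is an assumption here: two windows match when every pair of corresponding start times differs by at most `eps`. Start times are assumed to be minutes since midnight, and wrap-around at midnight is ignored. The function name and the `eps` parameter are hypothetical.

```python
def predict_next_visit(C, D, m=3, eps=30.0):
    """Estimate the next visit start time and duration for one location.

    C   : past visit start times (assumed: minutes since midnight)
    D   : past visit durations, aligned with C
    m   : number of consecutive values compared
    eps : assumed similarity tolerance, in the same unit as C
    Returns (start, duration), or None if no similar window is found.
    """
    n = len(C)
    if n <= m:
        return None
    last = C[n - m:]
    next_starts, next_durations = [], []
    # i is the index of the last element of a candidate window;
    # it stops at n - 2 so that a following value C[i + 1] exists.
    for i in range(m - 1, n - 1):
        window = C[i - m + 1:i + 1]
        if max(abs(a - b) for a, b in zip(window, last)) <= eps:
            next_starts.append(C[i + 1])
            next_durations.append(D[i + 1])
    if not next_starts:
        return None
    return (sum(next_starts) / len(next_starts),
            sum(next_durations) / len(next_durations))
```

Returning `None` when nothing matches is a design choice of this sketch; a caller could fall back to a smaller `m` or a larger `eps` in that case.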
Example
• Last three visits of a certain user to a location
▫ Monday at 6:30pm
▫ Monday at 10:00pm
▫ Tuesday at 8:15 am
• Find sequences that are numerically close to (6:30pm, 10:00pm, 8:15am)
▫ i.e., (6:10pm, 9:50pm, 8:35am) and (6:35pm, 10:10pm, 8:00am)
• Assume that the next visits that follow these subsequences
▫ Start at 1:10pm and 12:40pm
▫ Last for 40 and 30 minutes
• Estimate the next visit at 12:55pm for 35 minutes
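The arithmetic behind this estimate, written out with start times converted to minutes after midnight (1:10pm = 790, 12:40pm = 760). The symbols $\hat{c}_{n+1}$ and $\hat{d}_{n+1}$ are introduced here only for illustration:

```latex
\[
\hat{c}_{n+1} = \frac{790 + 760}{2} = 775 \text{ min} \;(= 12{:}55\,\text{pm}),
\qquad
\hat{d}_{n+1} = \frac{40 + 30}{2} = 35 \text{ min}
\]
```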
Validation: Datasets
• Cabspotting: movement traces of taxi cabs in San Francisco, with GPS coordinates of approximately 500 taxis
• CenceMe GPS: GPS traces collected during the deployment of CenceMe [21] at Dartmouth College
• Dartmouth WiFi: extracted from SNMP logs of the WiFi LAN of the Dartmouth College campus
• Ile Sans Fils: a non-profit organization which operates a network of free WiFi hotspots in Montreal, Canada; over 45,000 users and 140 hotspots
Validation: Careful Choice of a Suitable Threshold
• GPS-based: threshold T for frequency map
▫ T: a fraction of the maximum value of the frequency map
▫ T=0.10 for Cabspotting
▫ T=0.15 for CenceMe GPS
Validation: Predictability Test
• Mean quadratic prediction error
▫ $\varepsilon = \frac{1}{N} \sum_{n=1}^{N} (s_n - p_n)^2$
▫ $s_n$ = values of the time series
▫ $p_n$ = predicted values
• Predictability error: mean quadratic prediction error / variance of the time series ($\varepsilon / \sigma^2$)
▫ If this ratio is close to 1, the mean quadratic prediction error is large → no determinism is present
▫ If this ratio is close to 0, the mean quadratic prediction error is small → a high degree of determinism
Evaluation
• Methodology
▫ NPm : NextPlace with m = 1, 2, 3
▫ M1, M2: first-order and second-order Markov-based
▫ L: NextPlace with linear predictor
• Definition of correctness
▫ Suppose we predict, at time T, that the user will be at location L at time TP = T + delta T
▫ The prediction is correct only if the user is at L at any time during the interval [TP - theta, TP + theta]
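The correctness criterion can be sketched as an interval-overlap check, assuming the ground truth is available as a list of `(location, start, end)` visits with times in the same unit as TP and theta. The record layout and the function name are assumptions.

```python
def prediction_is_correct(tp, location, actual_visits, theta):
    """Check whether the user is at `location` at any time in [tp - theta, tp + theta].

    tp            : predicted time TP = T + delta T
    location      : predicted location L
    actual_visits : list of (location, start, end) ground-truth visits (assumed layout)
    theta         : error margin around tp
    """
    lo, hi = tp - theta, tp + theta
    # A visit counts if its [start, end] interval overlaps [lo, hi].
    return any(loc == location and start <= hi and end >= lo
               for loc, start, end in actual_visits)
```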