Geographical and Temporal Similarity Measurement in Location-based Social Networks
Zhengwu Yuan
Yanli Jiang
Gyözö Gidofalvi
Chongqing University of Posts and Telecommunications
KTH – Royal Institute of Technology
Outline
• Introduction
• Related work
• A hierarchical spatio-temporal similarity measure in LBSN
• Empirical evaluations
2013-11-05
MobiGIS 2013, Orlando, FL
Introduction
[Diagram labels: Internet technology; mobile Internet technology; space location technology; location-based social network; user similarity]
LBSN Applications
Information Layout of LBSN
Gao et al. Data Analysis on Location-Based Social Networks. 2011
Traditional: Cosine Similarity
• Given a set of commonly rated items I_{AB}, the cosine similarity between two users A and B based on their respective ratings R_{A,i} and R_{B,i} on items i ∈ I_{AB} is:
\[
sim(A, B) = \frac{\sum_{i \in I_{AB}} R_{A,i} \, R_{B,i}}{\sqrt{\sum_{i \in I_{AB}} R_{A,i}^2} \; \sqrt{\sum_{i \in I_{AB}} R_{B,i}^2}}
\]
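As a minimal sketch of the cosine similarity defined on this slide, the following Python function computes it over the commonly rated items only. The dictionary layout, the function name `cosine_similarity`, the example ratings, and the choice to return 0.0 when there are no co-rated items are illustrative assumptions, not part of the original slides.

```python
import math

def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity over the items rated by both users (I_AB)."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0  # assumption: no co-rated items means zero similarity
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero rating vector has no direction
    return dot / (norm_a * norm_b)

# Hypothetical ratings: the co-rated items are "cafe" and "museum".
alice = {"cafe": 4, "museum": 2, "park": 5}
bob = {"cafe": 2, "museum": 1, "gym": 3}
sim_ab = cosine_similarity(alice, bob)
print(round(sim_ab, 6))
```

On the co-rated items, Bob's ratings are exactly half of Alice's, so the two rating vectors point in the same direction and the similarity is 1 up to floating-point rounding. This illustrates that plain cosine similarity ignores differences in rating scale.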
Traditional: Adjusted Cosine Similarity
• Given a set of commonly rated items I_{AB}, the adjusted cosine similarity between two users A and B based on the sets of their individually rated items I_A and I_B and their average individual ratings on these items, \bar{R}_A and \bar{R}_B, is:
\[
sim(A, B) = \frac{\sum_{i \in I_{AB}} (R_{A,i} - \bar{R}_A)(R_{B,i} - \bar{R}_B)}{\sqrt{\sum_{i \in I_A} (R_{A,i} - \bar{R}_A)^2} \; \sqrt{\sum_{i \in I_B} (R_{B,i} - \bar{R}_B)^2}}
\]
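The adjusted cosine similarity can be sketched in Python as follows, assuming that each user's mean is taken over all items that user rated (I_A or I_B), the numerator runs over the co-rated items, and each denominator term runs over the user's own items. The function name, data layout, and example ratings are hypothetical.

```python
import math

def adjusted_cosine_similarity(ratings_a, ratings_b):
    """Adjusted cosine: mean-centred ratings, norms over each user's own items."""
    if not ratings_a or not ratings_b:
        return 0.0  # assumption: a user without ratings has zero similarity
    mean_a = sum(ratings_a.values()) / len(ratings_a)  # mean over I_A
    mean_b = sum(ratings_b.values()) / len(ratings_b)  # mean over I_B
    common = set(ratings_a) & set(ratings_b)           # I_AB
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den_a = math.sqrt(sum((r - mean_a) ** 2 for r in ratings_a.values()))
    den_b = math.sqrt(sum((r - mean_b) ** 2 for r in ratings_b.values()))
    if den_a == 0 or den_b == 0:
        return 0.0  # assumption: constant raters carry no preference signal
    return num / (den_a * den_b)

# Hypothetical ratings; both users rated "cafe" and "museum".
alice = {"cafe": 5, "museum": 1, "park": 3}   # mean 3
bob = {"cafe": 4, "museum": 2, "gym": 6}      # mean 4
adj_sim = adjusted_cosine_similarity(alice, bob)
print(round(adj_sim, 6))
```

Here the numerator is (2)(0) + (-2)(-2) = 4 and both denominator terms equal sqrt(8), so the result is 0.5 up to rounding.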
Traditional: Pearson Correlation Coefficient
• Given a set of commonly rated items I_{AB}, the Pearson correlation coefficient between two users A and B based on their ratings R_{A,i} and R_{B,i} and their average ratings on these items, \bar{R}_A and \bar{R}_B, is:
\[
sim(A, B) = \frac{\sum_{i \in I_{AB}} (R_{A,i} - \bar{R}_A)(R_{B,i} - \bar{R}_B)}{\sqrt{\sum_{i \in I_{AB}} (R_{A,i} - \bar{R}_A)^2} \; \sqrt{\sum_{i \in I_{AB}} (R_{B,i} - \bar{R}_B)^2}}
\]
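A minimal Python sketch of the Pearson correlation coefficient between two users follows. It assumes that the means are computed over the co-rated items only, which is a common convention but is not stated explicitly on the slide; the function name and example ratings are likewise hypothetical.

```python
import math

def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over the co-rated items I_AB."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # assumption: correlation is undefined for fewer than 2 items
    a = [ratings_a[i] for i in common]
    b = [ratings_b[i] for i in common]
    mean_a = sum(a) / len(a)  # assumption: mean over co-rated items only
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    den_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if den_a == 0 or den_b == 0:
        return 0.0  # assumption: constant ratings give no correlation signal
    return num / (den_a * den_b)

# Hypothetical ratings; co-rated items are "cafe", "museum", and "park".
alice = {"cafe": 5, "museum": 1, "park": 3, "zoo": 4}
bob = {"cafe": 4, "museum": 3, "park": 2, "gym": 1}
pearson_ab = pearson_similarity(alice, bob)
print(round(pearson_ab, 6))
```

With these ratings the centred vectors are (2, -2, 0) and (1, 0, -1), giving a numerator of 2 and a denominator of sqrt(8) * sqrt(2) = 4, so the coefficient is 0.5 up to rounding.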
Similarity in LBSN
• Similarity along (a combination of) different dimensions:
  - Content layer, e.g.: Ye’11, McKenzie’13
  - Social layer, e.g.: Ye’12
  - Geographical layer, e.g.: Li’08
  - Semantic locations / categories of locations, e.g.: Xiao’10, Bao’12, Ye’11
  - Temporal sequential similarity, e.g.: Li’08
  - Check-in temporal similarity, e.g.: Ye’11
A Hierarchical Spatio-Temporal Similarity Measure in LBSN
• Assumptions about user similarity:
  - The closer two users' check-ins are in time and geographical location, the more similar the two users are.
  - The more check-ins two users have in nearby locations at similar times, the more similar the two users are.
  - Similarity changes with the level of detail.
• Proposed method:
  - Extract spatio-temporal (ST) clusters from user check-ins at different spatio-temporal levels of detail.
  - For each ST level of detail, measure the cosine similarity between users using the classical Vector Space Model (VSM), with vectors composed of the number of user visits to the different ST clusters.
  - Calculate the weighted combination of the similarities at the different ST levels of detail.
Hierarchical Spatio-Temporal Clustering
• Spatio-temporal variant of DBSCAN: ST-DBSCAN [Birant’07]
  - An object is a core object if the number of objects within its spatial (Eps_space) and temporal (Eps_time) neighborhood is at least MinPts.
  - The definitions of Directly Density-Reachable (DDR), Density-Reachable (DR), and Density-Connected are straightforward extensions.
• Clusters are extracted at different levels of detail.
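The core-object test can be illustrated with a short Python sketch. It is not the ST-DBSCAN implementation used by the authors: it assumes check-ins are given as planar (x, y, t) tuples, uses Euclidean distance (real latitude/longitude data would need a geodesic distance), and counts the object itself as part of its own neighborhood. The function names and sample points are hypothetical.

```python
import math

def st_neighbors(points, idx, eps_space, eps_time):
    """Indices of points within both the spatial and the temporal radius of points[idx]."""
    x, y, t = points[idx]
    return [
        j for j, (xj, yj, tj) in enumerate(points)
        if math.hypot(x - xj, y - yj) <= eps_space and abs(t - tj) <= eps_time
    ]

def is_core(points, idx, eps_space, eps_time, min_pts):
    """Core-object test: at least min_pts objects in the ST neighborhood.

    Assumption: the neighborhood includes the object itself.
    """
    return len(st_neighbors(points, idx, eps_space, eps_time)) >= min_pts

# Hypothetical check-ins as (x, y, t): three close in space and time,
# one close in space but much later, one far away.
points = [(0, 0, 0), (1, 0, 10), (0, 1, 20), (0.5, 0.5, 500), (50, 50, 5)]

fine = [is_core(points, i, eps_space=2, eps_time=60, min_pts=3) for i in range(len(points))]
coarse = [is_core(points, i, eps_space=2, eps_time=600, min_pts=3) for i in range(len(points))]
print(fine)    # [True, True, True, False, False]
print(coarse)  # [True, True, True, True, False]
```

Widening Eps_time from 60 to 600 turns the late check-in into a core object, which is one way to picture how a coarser level of detail merges check-ins that a finer level keeps apart.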
Vector Space Model
Define the user-location matrix within a certain period as
\[
V_{l\,(m \times n)} =
\begin{pmatrix}
V_{1,1} & V_{1,2} & \cdots & V_{1,n-1} & V_{1,n} \\
V_{2,1} & V_{2,2} & \cdots & V_{2,n-1} & V_{2,n} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
V_{m,1} & V_{m,2} & \cdots & V_{m,n-1} & V_{m,n}
\end{pmatrix}
\]
where m is the total number of users, n is the number of ST clusters discovered by ST-DBSCAN(Eps_space, Eps_time, MinPts), V_{i,j} is the number of check-ins by user i in ST cluster j, and l is the level of detail in the clustering hierarchy.
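A user-location matrix of this kind can be built from clustered check-ins with a few lines of Python. The sketch below assumes each clustered check-in is available as a (user, cluster) pair with cluster ids 0 to n-1, and that noise points carry the id -1 and are ignored; these conventions and the function name are illustrative assumptions.

```python
from collections import Counter

def build_user_cluster_matrix(assignments, users, n_clusters):
    """Return V where V[i][j] is the number of check-ins of users[i] in cluster j.

    assignments: iterable of (user_id, cluster_id) pairs, one per check-in.
    Pairs whose cluster_id is outside 0..n_clusters-1 (e.g. noise, -1) are ignored.
    """
    counts = Counter(assignments)
    return [[counts[(u, c)] for c in range(n_clusters)] for u in users]

# Hypothetical check-in assignments for two users and three ST clusters.
assignments = [("A", 0), ("A", 0), ("A", 2), ("B", 0), ("B", 1), ("B", -1)]
V = build_user_cluster_matrix(assignments, users=["A", "B"], n_clusters=3)
print(V)  # [[2, 0, 1], [1, 1, 0]]
```

One such matrix would be built per level of detail l, since each level produces its own set of ST clusters.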
User Similarity
• User similarity at a given cluster hierarchy level is the cosine similarity of the location vectors of the users:
\[
sim(A, B) = \cos(U_A, U_B) = \frac{U_A \cdot U_B}{\|U_A\| \, \|U_B\|}
\]
• The overall similarity of users is calculated across the cluster hierarchy levels as follows:
\[
sim_{overall} = \sum_{i=1}^{N} sim_i \times \frac{i}{\sum_{j=1}^{N} j}
\]
where sim_i is the similarity at hierarchy level i and N is the number of levels.
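The two steps on this slide can be sketched in Python as follows, assuming that the similarity at level i is weighted by i / (1 + 2 + ... + N). The function names and the per-level vectors are hypothetical, and the sketch does not state which end of the hierarchy is the finest.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equally long count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: a user with no clustered check-ins matches nobody
    return dot / (norm_u * norm_v)

def overall_similarity(level_sims):
    """Weighted combination: level i (1-based) gets weight i / sum(1..N)."""
    n = len(level_sims)
    if n == 0:
        return 0.0
    total = n * (n + 1) / 2
    return sum(i * s for i, s in enumerate(level_sims, start=1)) / total

# Hypothetical location vectors of users A and B at three hierarchy levels.
levels_a = [[3, 1], [2, 1, 1], [1, 1, 0, 1, 1]]
levels_b = [[3, 1], [1, 2, 1], [0, 1, 1, 1, 0]]
sims = [cosine(a, b) for a, b in zip(levels_a, levels_b)]
combined = overall_similarity(sims)
print([round(s, 4) for s in sims], round(combined, 4))
```

Because the weights sum to 1, the combined value stays in the same range as the per-level similarities.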
Dataset
• Gowalla check-in datasets from the Stanford Network Analysis Project (SNAP) for US cities:
Evaluation Metrics
• Precision and recall (“relative overlap”) of the visits of a user u_r and its most similar user u to the Top-N ST clusters / POIs:
Results
• ST generalization at different levels of detail improves performance.
• Combining similarity measures at different ST levels of detail improves precision and recall and outperforms the fine-grained method (see ST-DBSCAN).
• Considering the number (not only the existence) of check-ins in different ST clusters improves performance (see Jaccard).
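The difference between using check-in counts and using only the existence of check-ins can be illustrated with a small Python comparison. It assumes that the Jaccard baseline treats each user's visited ST clusters as a set, which the slide does not spell out; the count vectors and function names are hypothetical, not data from the evaluation.

```python
import math

def jaccard(u, v):
    """Jaccard similarity of the sets of clusters with at least one check-in."""
    set_u = {j for j, c in enumerate(u) if c > 0}
    set_v = {j for j, c in enumerate(v) if c > 0}
    union = set_u | set_v
    return len(set_u & set_v) / len(union) if union else 0.0

def cosine(u, v):
    """Cosine similarity of the check-in count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical check-in counts over three ST clusters.
user_a = [10, 1, 0]
user_b = [1, 10, 0]   # visits the same clusters as A, with opposite emphasis
user_c = [9, 2, 0]    # visits the same clusters as A, with similar emphasis

print(jaccard(user_a, user_b), jaccard(user_a, user_c))  # 1.0 1.0
print(round(cosine(user_a, user_b), 3), round(cosine(user_a, user_c), 3))
```

Jaccard cannot tell B and C apart because all three users visited the same two clusters, whereas the count-based cosine similarity ranks C well above B as a match for A.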
Conclusions
• We have proposed a new method to calculate user similarity in LBSN based on the spatial and temporal properties of the user check-in data.
• The method can be applied to recommend locations or friends in LBSN, because the core of a recommender system is measuring the similarity between users or items.
Thank you for your attention!
Q/A?