Geographical and Temporal Similarity Measurement in Location-based Social Networks

Zhengwu Yuan, Yanli Jiang, Gyözö Gidofalvi
Chongqing University of Posts and Telecommunications
KTH – Royal Institute of Technology

Outline
- Introduction
- Related work
- A hierarchical spatio-temporal similarity measure in LBSN
- Empirical evaluations

2013-11-05 MobiGIS 2013, Orlando, FL

Introduction
[Diagram labels: Mobile Internet technology; Internet technology; Location technology; Space; Location-based Social Network; User Similarity]

LBSN Applications
[Figure: LBSN applications]

Information Layout of LBSN
[Figure; source: Gao et al., Data Analysis on Location-Based Social Networks, 2011]

Related work

Traditional: Cosine Similarity
Given a set of commonly rated items $I_{AB}$, the cosine similarity between two users $A$ and $B$ based on their respective ratings $R_{A,i}$ and $R_{B,i}$ on items $i \in I_{AB}$ is:
\[
\mathrm{sim}(A,B) = \frac{\sum_{i \in I_{AB}} R_{A,i} R_{B,i}}{\sqrt{\sum_{i \in I_{AB}} R_{A,i}^{2}} \; \sqrt{\sum_{i \in I_{AB}} R_{B,i}^{2}}}
\]

Traditional: Adjusted Cosine Similarity
Given a set of commonly rated items $I_{AB}$, the adjusted cosine similarity between two users $A$ and $B$ based on the sets of their individually rated items $I_A$ and $I_B$ and their average individual ratings on these items, $\bar{R}_A$ and $\bar{R}_B$, is:
\[
\mathrm{sim}(A,B) = \frac{\sum_{i \in I_{AB}} (R_{A,i} - \bar{R}_A)(R_{B,i} - \bar{R}_B)}{\sqrt{\sum_{i \in I_{A}} (R_{A,i} - \bar{R}_A)^{2}} \; \sqrt{\sum_{i \in I_{B}} (R_{B,i} - \bar{R}_B)^{2}}}
\]

Traditional: Pearson Correlation Coefficient
Given a set of commonly rated items $I_{AB}$, the Pearson correlation coefficient between two users $A$ and $B$ based on their average individual ratings $\bar{R}_A$ and $\bar{R}_B$ is:
\[
\mathrm{sim}(A,B) = \frac{\sum_{i \in I_{AB}} (R_{A,i} - \bar{R}_A)(R_{B,i} - \bar{R}_B)}{\sqrt{\sum_{i \in I_{AB}} (R_{A,i} - \bar{R}_A)^{2}} \; \sqrt{\sum_{i \in I_{AB}} (R_{B,i} - \bar{R}_B)^{2}}}
\]

Similarity in LBSN
Similarity along (a combination of) different dimensions:
- Content layer, e.g.: Ye’11, McKenzie’13
- Social layer, e.g.: Ye’12
- Geographical layer, e.g.: Li’08
- Semantic locations / categories of locations, e.g.: Xiao’10, Bao’12, Ye’11
- Temporal sequential similarity, e.g.: Li’08
- Check-in temporal similarity, e.g.: Ye’11

A Hierarchical Spatio-Temporal Similarity Measure in LBSN
Assumptions about user similarity:
- The closer in time and geographical location the places that two users visit, the more similar the two users are to each other.
- The larger the number of check-ins of two users in nearby locations at similar times, the more similar the two users are to each other.
- Similarity changes with the level of detail.
Proposed method:
- Extract spatio-temporal (ST) clusters from user check-ins at different spatio-temporal levels of detail.
- For each ST level of detail, measure the cosine similarity between users using the classical Vector Space Model (VSM), with vectors composed of the number of user visits in the different ST clusters.
- Calculate the weighted combination of the similarities at the different ST levels of detail.

Hierarchical Spatio-Temporal Clustering
- Spatio-temporal variant of DBSCAN: ST-DBSCAN [Birant’07]
- An object is a core object if the number of objects within its spatial (Eps_space) and temporal (Eps_time) neighborhood is at least MinPts.
- The definitions of Directly Density-Reachable (DDR), Density-Reachable (DR), and Density-Connected are straightforward extensions.
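The core-object condition stated above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `CheckIn` record, the planar coordinates, the time unit, and the choice to count the object itself as part of its own neighborhood are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class CheckIn:
    """Hypothetical check-in record (not from the slides)."""
    x: float  # planar coordinate, e.g. metres east (assumed projection)
    y: float  # planar coordinate, e.g. metres north (assumed projection)
    t: float  # time stamp in hours (assumed unit)


def st_neighbors(points, p, eps_space, eps_time):
    """Return the points within BOTH the spatial and the temporal radius of p.

    The point p itself satisfies both conditions and is therefore included,
    as in the usual DBSCAN convention (an assumption here).
    """
    return [
        q for q in points
        if hypot(q.x - p.x, q.y - p.y) <= eps_space
        and abs(q.t - p.t) <= eps_time
    ]


def is_core(points, p, eps_space, eps_time, min_pts):
    """Core-object test: at least min_pts objects in the ST neighborhood."""
    return len(st_neighbors(points, p, eps_space, eps_time)) >= min_pts


# Toy data: three check-ins close in space and time, one far away in space,
# and one nearby in space but far away in time.
points = [
    CheckIn(0, 0, 10.0),
    CheckIn(50, 0, 10.5),
    CheckIn(0, 60, 11.0),
    CheckIn(5000, 0, 10.0),
    CheckIn(10, 10, 20.0),
]
core_flags = [is_core(points, p, eps_space=100, eps_time=1, min_pts=3)
              for p in points]
```

Running the test with larger `eps_space` and `eps_time` values merges more check-ins into the same neighborhood, which is one plausible way to obtain the coarser levels of the cluster hierarchy.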
Clusters at different levels of detail: [figure]

Vector Space Model
Define the user-location matrix within a certain period as
\[
V_{l\,(m \times n)} =
\begin{bmatrix}
V_{1,1} & V_{1,2} & \cdots & V_{1,n-1} & V_{1,n} \\
V_{2,1} & V_{2,2} & \cdots & V_{2,n-1} & V_{2,n} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
V_{m,1} & V_{m,2} & \cdots & V_{m,n-1} & V_{m,n}
\end{bmatrix}
\]
where $m$ is the total number of users, $n$ is the number of ST clusters discovered by ST-DBSCAN(Eps_space, Eps_time, MinPts), $V_{i,j}$ is the number of check-ins by user $i$ in ST cluster $j$, and $l$ is the level of detail in the clustering hierarchy.

User Similarity
User similarity at a given level of the cluster hierarchy is the cosine similarity of the users' location vectors:
\[
\mathrm{sim}(A,B) = \cos(U_A, U_B) = \frac{U_A \cdot U_B}{\lVert U_A \rVert \, \lVert U_B \rVert}
\]
The overall similarity of users is calculated across the cluster hierarchy levels as follows:
\[
\mathrm{sim}_{\mathrm{overall}} = \sum_{i=1}^{N} \mathrm{sim}_i \cdot \frac{w_i}{\sum_{j=1}^{N} w_j}
\]
where $\mathrm{sim}_i$ is the similarity at level $i$, $w_i$ is the weight of level $i$, and $N$ is the number of levels.

Empirical evaluations

Dataset
Check-in datasets from Gowalla, obtained from the Stanford Network Analysis Project, for the US cities: [table]

Evaluation Metrics
Precision and recall (“relative overlap”) of the visits of a user $u_r$ and its most similar user $u$ to the Top-N ST clusters / POIs: [formulas]

Results
- ST generalization at different levels of detail improves performance.
- Combining similarity measures at different ST levels of detail improves precision and recall and outperforms the fine-grained method (see ST-DBSCAN).
- Considering the number (not only the existence) of check-ins in different ST clusters improves performance (see Jaccard).

Conclusions
- We have proposed a new method for calculating user similarity in LBSN based on the spatial and temporal properties of the user check-in data.
- The method can be applied to recommend locations or friends in LBSN, because the core of a recommender system is measuring the similarity between users or items.

Thank you for your attention! Q/A?
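Backup: the per-level cosine similarity and the weighted combination from the User Similarity slide can be sketched in a few lines of Python. This is an illustrative sketch only; the check-in count vectors and the level weights below are made-up values, and the function names are hypothetical rather than taken from the talk.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors.

    Returns 0.0 when either vector is all zeros (a convention assumed here).
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def overall_similarity(level_sims, weights):
    """Weighted combination of per-level similarities, normalized by the weight sum."""
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, level_sims)) / total


# Rows of the user-location matrix for users A and B at two levels of detail.
# Each entry is the number of check-ins of the user in one ST cluster.
fine_A, fine_B = [3, 0, 1], [0, 2, 1]   # fine level: three small clusters
coarse_A, coarse_B = [3, 1], [2, 1]     # coarse level: two larger clusters

level_sims = [cosine(fine_A, fine_B), cosine(coarse_A, coarse_B)]
weights = [1.0, 2.0]                    # hypothetical level weights
sim_overall = overall_similarity(level_sims, weights)
```

In this toy case the two users share only one fine-grained cluster, so the fine-level similarity is low, while at the coarse level their visits fall into the same clusters and the similarity is high; the weighted combination lies between the two.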