Urban Point-of-Interest Recommendation by Mining User Check-in Behaviors
Josh Jia-Ching Ying, Eric Hsueh-Chan Lu, Wen-Ning Kuo and Vincent S. Tseng
Institute of Computer Science and Information Engineering, National Cheng Kung University
No.1, University Road, Tainan City 701, Taiwan (R.O.C.)
Intelligent DataBase System Lab, NCKU, Taiwan

Outline
• Introduction
  • Background
  • Motivation
  • Challenges
• Proposed Method – UPOI-Mine
• Experimental Results
• Conclusions

Introduction – Background
• The market for Location-Based Services (LBSs) in urban areas has grown rapidly.
• Effective and efficient urban POI recommendation techniques are desirable.
• Location-Based Social Network (LBSN) data is widely used for building POI recommendation models.

Introduction – Background (cont.)
• LBSN data is heterogeneous.

Introduction – Motivation
• Can we accurately capture a user's preferences by analyzing the check-in activities of the user and the user's friends?

Introduction – Challenges
• How to understand user preference from LBSN data?
• How to extract useful features from heterogeneous data?
• How to precisely estimate the relevance of a user-POI pair based on the extracted features?
• How to integrate heterogeneous information?
Proposed Method – UPOI-Mine
[Figure: framework overview]
• LBSN Dataset: Location Types, Check-in Data, Social Links
• Offline: UPOI-Mine, with the features Individual Preference (IP), POI Popularity (PP), and Social Factor (SF)
• Online: Recommender

Feature Extraction
[Figure: framework overview, repeated]

Social Factor (SF)
• Weighted summation over the friends of user i:
$$SF(user_i, POI_j) = \sum_{k=1}^{|F|} \left[ Interest_{k,j} \times Relation_{i,k} \right]$$
$$Relation_{i,k} = w \times CheckSim_{i,k} + (1 - w) \times DisSim_{i,k}$$
$$Interest_{k,j} = \frac{checkin_{k,j}}{\sum_{s=1}^{|S|} checkin_{k,s}}$$
• w: weight
• F: the set of user i's friends
• S: the set of POIs
• checkin_{k,*}: check-ins of user k at POI *

Social Factor – Relation
• Check-in Similarity (CheckSim): based on the two users' check-in logs
• Relative Distance Similarity (DisSim): based on the two users' geographic distance

Relation – CheckSim
Friend indicator:

|   | i | j | k | … |
| i | 0 | 1 | 0 | … |
| j | 1 | 0 | 1 | … |
| k | 0 | 1 | 0 | … |
| … | … | … | … | … |

Check-in counts per POI:

| POI ID | A | B  | C | D | E | … |
| user i | 1 | 0  | 2 | 5 | 0 | … |
| user j | 0 | 10 | 0 | 1 | 0 | … |
| user k | 1 | 1  | 0 | 0 | 0 | … |
| user l | 1 | 1  | 1 | 1 | 1 | … |
| …      | … | …  | … | … | … | … |

CheckSim is the cosine similarity of the two users' check-in vectors:
$$CheckSim_{i,j} = \frac{(1 \times 0) + (0 \times 10) + (2 \times 0) + (5 \times 1) + (0 \times 0)}{\sqrt{1^2 + 0^2 + 2^2 + 5^2 + 0^2} \times \sqrt{0^2 + 10^2 + 0^2 + 1^2 + 0^2}} = 0.0908$$

Relation – DisSim
Distance (Max_i = 1000):

|   | i   | j   | k  | … |
| i | 0   | 100 | 10 | … |
| j | 100 | 0   | 50 | … |
| k | 10  | 50  | 0  | … |
| … | …   | …   | …  | … |

Friend indicator:

|   | i | j | k | … |
| i | 0 | 1 | 0 | … |
| j | 1 | 0 | 1 | … |
| k | 0 | 1 | 0 | … |
| … | … | … | … | … |

$$DisSim_{i,j} = 1 - \frac{100}{1000} = 0.9$$

Social Factor – Example
• w = 0.1
• User A's relation to User B: CheckSim(A, B) = 0.5, DisSim(A, B) = 0.03
$$\text{Social Factor from User B} = \frac{10}{100} \times \left[ 0.1 \times 0.5 + (1 - 0.1) \times 0.03 \right] = 0.0077$$
• Social Factor from User C, User D, User E: computed in the same way
• The social factor of user A to POI k is the sum of these contributions.
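The Social Factor definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the dictionary layout for a friend, and the placeholder key "other" (used only to bring User B's total to 100 check-ins) are assumptions made for this example. The numbers are the ones shown on the CheckSim, DisSim, and Social Factor example slides.

```python
import math


def check_sim(a, b):
    """Cosine similarity between two users' per-POI check-in count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dis_sim(distance, max_distance):
    """Relative distance similarity: 1 - distance / max_distance."""
    return 1.0 - distance / max_distance


def interest(friend_checkins, poi):
    """Share of a friend's check-ins that were made at the given POI."""
    total = sum(friend_checkins.values())
    return friend_checkins.get(poi, 0) / total if total else 0.0


def social_factor(friends, poi, w):
    """Sum of Interest * Relation over friends.

    friends: list of dicts with keys 'checkins', 'check_sim', 'dis_sim'
    (a hypothetical layout chosen for this sketch).
    """
    return sum(
        interest(f["checkins"], poi)
        * (w * f["check_sim"] + (1 - w) * f["dis_sim"])
        for f in friends
    )


# Check-in vectors of user i and user j over POIs A..E (from the slide).
sim_ij = check_sim([1, 0, 2, 5, 0], [0, 10, 0, 1, 0])

# Distance 100 with a maximum distance of 1000 (from the slide).
dis_ij = dis_sim(100, 1000)

# User B: 10 of 100 check-ins at POI k; "other" is a placeholder bucket.
user_b = {"checkins": {"POI_k": 10, "other": 90}, "check_sim": 0.5, "dis_sim": 0.03}
sf_from_b = social_factor([user_b], "POI_k", w=0.1)

print(round(sim_ij, 4), round(dis_ij, 1), round(sf_from_b, 4))
```

Passing all of a user's friends to `social_factor` in one list gives the full social factor for that user-POI pair; the call above covers only User B's contribution.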
• User B: 10 check-ins at POI k out of 100 total check-ins, so Interest(B, POI k) = 10/100

Individual Preference (IP)
• Individual Preference (IP) combines two preferences:
  • HPref_{i,h}: preference of user i for highlight h
  • CPref_{i,c}: preference of user i for category c
$$IP(user_i, POI_j) = \alpha \times \sum_{c \in C} CPref_{i,c} \times I_{ctg(c)}(POI_j) + (1 - \alpha) \times \sum_{h \in H} HPref_{i,h} \times \frac{HCount_{h,j}}{\sum_{g \in H} HCount_{g,j}}$$
where C is the set of categories, H is the set of highlights, \alpha is the weight of the category term, and I_{ctg(c)}(POI_j) is an indicator function defined as
$$I_{ctg(c)}(POI_j) = \begin{cases} 1 & \text{if } POI_j \in ctg(c) \\ 0 & \text{otherwise} \end{cases}$$

Individual Preference – HPref & CPref
POIs visited by User1:

| POI | Category | Highlights | Check-in count |
| A   | c1       | h1, h2     | 5 |
| B   | c2       | h1, h2     | 1 |
| C   | c2       | h2         | 2 |
| D   | c3       | h3         | 2 |

User1's check-ins per category:

| Category | POIs | Check-ins | CPref_{i,c} |
| c1       | A    | 5         | 5/10 = 0.5 |
| c2       | B, C | 1 + 2     | (1 + 2)/10 = 0.3 |
| c3       | D    | 2         | 2/10 = 0.2 |
| c4       |      | 0         | 0 |
| Total    |      | 10        | |

User1's check-ins per highlight:

| Highlight | POIs    | Check-ins | HPref_{i,h} |
| h1        | A, B    | 5 + 1     | (5 + 1)/16 = 0.375 |
| h2        | A, B, C | 5 + 1 + 2 | (5 + 1 + 2)/16 = 0.5 |
| h3        | D       | 2         | 2/16 = 0.125 |
| h4        |         | 0         | 0 |
| h5        |         | 0         | 0 |
| Total     |         | 16        | |

• CPref and HPref are the proportion of the user's check-ins that carry the label.

Individual Preference – Example
• Each POI has exactly one category.
• Each POI can have many highlights.
• POI A: Category: Hotdog & Sausages; Highlights: Coffee (12), Cheese (88). The numbers in parentheses are the highlight counts.

User A's preference table:

| Category          | CPref |
| Seafood           | 0.5 |
| Hotdog & Sausages | 0.1 |
| Fast food         | 0.1 |
| Steak             | 0.3 |

| Highlight   | HPref |
| Coffee      | 0.5 |
| Sightseeing | 0.1 |
| Ice Cream   | 0.1 |
| Cheese      | 0.3 |

Individual Preference – Example (cont.)
With \alpha = 0.2, the CPref of Hotdog & Sausages (0.1), and the HPref of Coffee (0.5) and Cheese (0.3) from User A's preference table:
$$IP(User_A, POI_A) = 0.2 \times 0.1 + (1 - 0.2) \times \left( 0.5 \times \frac{12}{100} + 0.3 \times \frac{88}{100} \right) = 0.2792$$

POI Popularity (PP)
• POI Popularity is the relative popularity of a POI, normalized within its category:
$$RP_j = \frac{checkins_j}{\sum_{POI_k \in CS} checkins_k}$$
where CS is the set of POIs in the same category as POI j.
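The Individual Preference and relative-popularity definitions above can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names and the parameter name `alpha` (the weight of the category term) are assumptions, and the check-in counts passed to `relative_popularity` are hypothetical. The User1 and User A inputs are taken from the preference tables on the slides, with Cheese at its tabulated HPref of 0.3.

```python
def preference_table(checkins, labels_of):
    """Proportion of a user's check-ins that carry each label.

    checkins:  {poi: check-in count}
    labels_of: {poi: [labels]}  (categories or highlights)
    """
    totals = {}
    for poi, count in checkins.items():
        for label in labels_of[poi]:
            totals[label] = totals.get(label, 0) + count
    grand_total = sum(totals.values())
    return {label: c / grand_total for label, c in totals.items()}


def individual_preference(cpref, hpref, category, highlight_counts, alpha):
    """alpha * category preference + (1 - alpha) * count-weighted highlight preference."""
    # A POI has exactly one category, so the indicator sum keeps one CPref term.
    category_term = cpref.get(category, 0.0)
    total = sum(highlight_counts.values())
    highlight_term = 0.0
    if total:
        highlight_term = sum(
            hpref.get(h, 0.0) * n / total for h, n in highlight_counts.items()
        )
    return alpha * category_term + (1 - alpha) * highlight_term


def relative_popularity(poi, checkins_in_category):
    """Check-ins at the POI divided by all check-ins in its category."""
    return checkins_in_category[poi] / sum(checkins_in_category.values())


# User1's check-ins and POI labels from the HPref & CPref slide.
user1 = {"A": 5, "B": 1, "C": 2, "D": 2}
categories = {"A": ["c1"], "B": ["c2"], "C": ["c2"], "D": ["c3"]}
highlights = {"A": ["h1", "h2"], "B": ["h1", "h2"], "C": ["h2"], "D": ["h3"]}
cpref = preference_table(user1, categories)
hpref = preference_table(user1, highlights)

# User A's preference table and POI A from the example slides.
cpref_a = {"Seafood": 0.5, "Hotdog & Sausages": 0.1, "Fast food": 0.1, "Steak": 0.3}
hpref_a = {"Coffee": 0.5, "Sightseeing": 0.1, "Ice Cream": 0.1, "Cheese": 0.3}
ip = individual_preference(
    cpref_a, hpref_a, "Hotdog & Sausages", {"Coffee": 12, "Cheese": 88}, alpha=0.2
)

# Hypothetical check-in counts for three POIs in one category.
rp = relative_popularity("p1", {"p1": 30, "p2": 50, "p3": 20})

print(cpref, hpref, ip, rp)
```

Highlights the user has never checked in at are simply absent from `hpref` and contribute zero, which matches the zero rows in the tables.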
POI Popularity – Example
• POI: Frank; Category: Hot Dogs

| Hot Dogs | Check-in count |
| Frank    | 4,032 |
| KKK      | 25 |
| …        | … |
| Total    | 100,000 |

$$RP_{Frank} = \frac{4{,}032}{100{,}000} = 0.04032$$

Relevance Estimation
[Figure: framework overview, repeated]

Relevance Estimation – Example
• To estimate the relevance of each user-POI pair, we use these features to learn a Regression-Tree Model. Relevance is the target.

| User ID | POI ID | SF    | PP   | IP    | Relevance |
| 1       | A      | 0.2   | 0.1  | 0.001 | 3 |
| 1       | B      | 0.05  | 0.2  | 0.1   | 5 |
| 1       | C      | 0.004 | 0.1  | 0.9   | 1 |
| …       | …      | …     | …    | …     | … |
| N       | D      | 0.5   | 0.15 | 0.06  | 2 |

Relevance Estimation – Regression-Tree Model
• The Regression-Tree Model has shown excellent performance for numerical value prediction:
  • Demographic Prediction
  • Bio Life Cycle Analysis
  • Prediction of Geographical Natural
• Learning steps:
  1. Building the initial tree
  2. Linear regression model for each leaf node
  3. Pruning the tree

Recommender
[Figure: framework overview, repeated]

Experimental Evaluation
• Experimental dataset: Gowalla Dataset
  • Near or within New York City
  • 1,964,919 POIs
  • 18,159 people
  • 5,341,191 Check-ins
  • 392,246 Friendship Links

Experimental Evaluation
• Experimental measurements
• Normalized Discounted Cumulative Gain (NDCG): measures the ranking quality of the relevance scores of the top-p POIs in the recommendation list.
$$DCG[i] = \begin{cases} G[i], & \text{if } i = 1 \\ DCG[i-1] + G[i], & \text{if } i < b \\ DCG[i-1] + \dfrac{G[i]}{\log_b i}, & \text{if } i \ge b \end{cases}$$
$$NDCG@p = \frac{DCG[p]}{IDCG[p]}$$
• Mean Absolute Error (MAE): measures the error of the relevance scores of all POIs.
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| f_i - y_i \right|$$

Experimental Evaluation (cont.)
• Ground truth: a check-in count x is mapped to a relevance score between 1 and 5:
$$Relevance(x) = \begin{cases} 3 + \dfrac{x - avg}{max - avg} \times 2, & \text{if } x \ge avg \\ 3 - \dfrac{x - avg}{min - avg} \times 2, & \text{if } x < avg \end{cases}$$
• Example with avg = 200:

| POI ID | Check-in | Relevance |
| A      | 50       | 1 |
| B      | 50       | 1 |
| C      | 500      | 5 |
| D      | 200      | 3 |

• Baselines
  • TrustWalker: M. Jamali, M. Ester. TrustWalker: A Random Walk Model for Combining Trust-based and Item-based Recommendation. Proceedings of KDD, pages 397-406, Paris, 2009.
  • Multi-Factor CF-based: M. Ye, P. Yin, W.-C. Lee and Dik-Lun Lee. Exploiting Geographical Influence for Collaborative Point-of-Interest Recommendation. Proceedings of SIGIR, pages 1046-1054, Beijing, China, 2011.

Comparison of Various Features
• The Individual Preference is more important than Social Factor for urban POI recommendation.
[Figure: bar charts of MAE and NDCG@10 for the compared features; MAE data labels on the chart: 0.62, 0.63, 0.65, 0.59]

Comparison of Various Features (cont.)
[Figure: bar charts of MAE and NDCG@10]
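The evaluation measures and the ground-truth mapping from the Experimental Evaluation slides can be sketched as below. This is a minimal sketch under stated assumptions, not the evaluation code used in the experiments: the function names are chosen for this example, the log base defaults to b = 2, and the ideal ranking for IDCG is assumed to be the same candidate list sorted by descending gain.

```python
import math


def relevance_from_checkins(x, avg, lo, hi):
    """Map a check-in count onto the 1..5 relevance scale, with avg mapped to 3."""
    if x >= avg:
        # Guard against hi == avg (all counts equal), an edge case added here.
        return 3 + 2 * (x - avg) / (hi - avg) if hi != avg else 3.0
    return 3 - 2 * (x - avg) / (lo - avg)


def dcg(gains, b=2):
    """Cumulative gain, discounted by log_b(rank) from rank b onward (b >= 2)."""
    total = 0.0
    for i, g in enumerate(gains, start=1):
        total += g if i < b else g / math.log(i, b)
    return total


def ndcg_at(ranked_gains, p, b=2):
    """DCG of the top-p list divided by the DCG of the ideal top-p list."""
    ideal = sorted(ranked_gains, reverse=True)
    idcg = dcg(ideal[:p], b)
    return dcg(ranked_gains[:p], b) / idcg if idcg else 0.0


def mae(predicted, actual):
    """Mean absolute error between predicted and true relevance scores."""
    return sum(abs(f - y) for f, y in zip(predicted, actual)) / len(actual)


# Ground-truth example from the slide: avg = 200, counts for POIs A..D.
counts = {"A": 50, "B": 50, "C": 500, "D": 200}
avg = sum(counts.values()) / len(counts)
relevance = {
    poi: relevance_from_checkins(x, avg, min(counts.values()), max(counts.values()))
    for poi, x in counts.items()
}

# A perfectly ordered list and a reversed one, using those relevance scores.
best = ndcg_at([5, 3, 1, 1], p=4)
worst = ndcg_at([1, 1, 3, 5], p=4)
error = mae([3, 5, 1], [3, 4, 3])

print(relevance, best, worst, error)
```

NDCG rewards placing high-relevance POIs near the top of the list, while MAE looks only at how far each predicted score is from its ground-truth score, so the two can rank recommenders differently.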
Comparison with Existing Recommenders
[Figure: horizontal bar chart of NDCG@10 (axis 0% to 90%) for the following methods]
• TrustWalker
• Multi-Factor CF-based
• Multi-Factor CF-based (geographic influence)
• Multi-Factor CF-based (user preference influence)
• Multi-Factor CF-based (social influence)
• Our approach (PP)
• Our approach (SF)
• Our approach (IP)
• Our approach (All)

Comparison with Existing Recommenders (cont.)
[Figure: horizontal bar chart of MAE (axis 0.00 to 2.50) for the same methods]

Conclusions
• We proposed UPOI-Mine, a novel urban POI recommendation approach that mines users' preferences.
• We proposed three kinds of useful features:
  • Social Factor
  • Individual Preference
  • POI Popularity
• Through a series of experiments on the real Gowalla dataset, we validated UPOI-Mine and showed that it has excellent performance under various conditions.
• The Individual Preference is more important than Social Factor for urban POI recommendation.

Thank you for your attention.
Questions?