Urban Point-of-Interest Recommendation by
Mining User Check-in Behaviors
Josh Jia-Ching Ying, Eric Hsueh-Chan Lu, Wen-Ning Kuo and Vincent S. Tseng
Institute of Computer Science and Information Engineering
National Cheng Kung University
No.1, University Road, Tainan City 701, Taiwan (R.O.C.)
Intelligent DataBase
System Lab, NCKU, Taiwan
Outline
 Introduction
 Background
 Motivation
 Challenges
 Proposed Method – UPOI-Mine
 Experimental Results
 Conclusions
Introduction – Background
 The markets of Location-Based Services (LBSs) in
urban areas have grown rapidly.
 Effective and efficient urban POI recommendation
techniques are desirable.
 Location Based Social Network (LBSN) data is widely
used for building POI recommendation models.
Introduction – Background (cont.)
 heterogeneous data
Introduction – Motivation
We cannot accurately capture a user's preferences by
analyzing the check-in activities of the user and the user's friends
Introduction – Challenges
 How to understand user preference from LBSN data?
 How to extract useful features from heterogeneous data?
 How to precisely estimate the relevance of a user-POI
pair based on the extracted features?
 How to integrate heterogeneous information?
Proposed Method – UPOI-Mine
Offline: UPOI-Mine
LBSN Dataset: Location Types, Check-in Data, Social Links
Extracted features: Individual Preference (IP), POI Popularity (PP), Social Factor (SF)
Online: Recommender
Feature Extraction
Social Factor (SF)
Weighted summation over the friends of user i, with Relation_{i,k} as the weight:
\[
SF(user_i, POI_j) = \sum_{k=1}^{|F|} \left[ Interest_{k,j} \times Relation_{i,k} \right]
\]
\[
Relation_{i,k} = w \times CheckSim_{i,k} + (1 - w) \times DisSim_{i,k}
\]
\[
Interest_{k,j} = \frac{checkin_{k,j}}{\sum_{s=1}^{|S|} checkin_{k,s}}
\]
F: friends of user i
S: the set of POIs
U: the set of user i's friends
checkin_{k,*}: check-ins of user k at POI *
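The three Social Factor formulas can be sketched in Python as follows. This is only an illustration: the function names and the dictionary layout for a friend (`checkins`, `check_sim`, `dis_sim`) are assumptions, and the sample numbers reuse the values of the Social Factor example in these slides (10 of 100 check-ins at the POI, CheckSim 0.5, DisSim 0.03, w = 0.1).

```python
def interest(checkins_k, poi):
    """Share of friend k's check-ins that fall on the given POI."""
    total = sum(checkins_k.values())
    return checkins_k.get(poi, 0) / total if total else 0.0


def relation(check_sim, dis_sim, w):
    """Blend check-in similarity and distance similarity with weight w."""
    return w * check_sim + (1 - w) * dis_sim


def social_factor(friends, poi, w):
    """Sum Interest x Relation over all friends of the target user.

    `friends` is a list of dicts with the (hypothetical) keys
    'checkins', 'check_sim' and 'dis_sim'.
    """
    return sum(
        interest(f["checkins"], poi) * relation(f["check_sim"], f["dis_sim"], w)
        for f in friends
    )


# One friend (user B): 10 check-ins at POI_k out of 100 in total.
friend_b = {"checkins": {"POI_k": 10, "elsewhere": 90},
            "check_sim": 0.5, "dis_sim": 0.03}
sf = social_factor([friend_b], "POI_k", w=0.1)
print(round(sf, 4))  # 0.0077
```

Passing more friends in the list simply adds their contributions to the sum.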
Social Factor – Relation
 Check-in Similarity (CheckSim)
 based on their check-in log
 Relative Distance Similarity (DisSim)
 based on their geographic distance
Relation – CheckSim
CheckSim is the cosine similarity of two users' check-in count vectors. For users i and j:
\[
CheckSim_{i,j} = \frac{(1 \times 0) + (0 \times 10) + (2 \times 0) + (5 \times 1) + (0 \times 0)}{\sqrt{1^2 + 0^2 + 2^2 + 5^2 + 0^2} \times \sqrt{0^2 + 10^2 + 0^2 + 1^2 + 0^2}} \approx 0.0908
\]

Check-in counts
POI ID    A    B    C    D    E
user i    1    0    2    5    0
user j    0    10   0    1    0
user k    1    1    0    0    0
user l    1    1    1    1    1
…         …    …    …    …    …

Friend Indicator
     i    j    k    …
i    0    1    0    …
j    1    0    1    …
k    0    1    0    …
…    …    …    …    …
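The CheckSim value in this example is a cosine similarity over per-POI check-in counts. A minimal sketch, assuming each user is represented as a list of counts in the same POI order (A to E); the function name is illustrative:

```python
import math


def check_sim(u, v):
    """Cosine similarity between two check-in count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0  # users without check-ins get 0


user_i = [1, 0, 2, 5, 0]   # check-ins at POIs A..E
user_j = [0, 10, 0, 1, 0]
print(round(check_sim(user_i, user_j), 4))  # 0.0908
```

Users i and j share only POI D, which is why the similarity is low despite both being active.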
Relation – DisSim
 Distance indicates dissimilarity: the larger the distance between two users, the lower their similarity
 Maxi=1000
Distance
     i      j      k      …
i    0      100    10     …
j    100    0      50     …
k    10     50     0      …
…    …      …      …      …

Friend Indicator
     i    j    k    …
i    0    1    0    …
j    1    0    1    …
k    0    1    0    …
…    …    …    …    …

With a distance of 100 between users i and j and a maximum distance of 1000:
\[
DisSim_{i,j} = 1 - \frac{100}{1000} = 0.9
\]
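The DisSim computation shown here amounts to one minus the distance normalized by the maximum distance. A small sketch under that reading; the function name is illustrative, and clamping distances above the maximum is an added assumption rather than something the slides state:

```python
def dis_sim(distance, max_distance):
    """Relative distance similarity: 1 minus the normalized distance."""
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    # Assumption: distances beyond the maximum are clamped so the
    # similarity never drops below 0.
    return 1 - min(distance, max_distance) / max_distance


print(dis_sim(100, 1000))  # 0.9
```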
Social Factor – Example
w = 0.1
User A → POI_k: social factor to be estimated

Relation between User A and User B:
CheckSim(A, B) = 0.5
DisSim(A, B) = 0.03

User B:
#Check-ins at POI_k: 10
#Total check-ins: 100
Interest(B, POI_k) = 10 / 100

\[
\text{Social Factor from User B} = \frac{10}{100} \times [0.1 \times 0.5 + (1 - 0.1) \times 0.03] = 0.0077
\]
Social Factor from User C = .....
Social Factor from User D = .....
Social Factor from User E = .....
-------------------------------
Sum = social factor of user A to POI_k
Individual Preference (IP)
• Individual Preference (IP)
• HPref_{i,h}: preference of user i for highlight h
• CPref_{i,c}: preference of user i for category c
\[
IP(user_i, POI_j) = \alpha \times \sum_{c \in C} CPref_{i,c} \times I_{ctg(c)}(POI_j) + (1 - \alpha) \times \sum_{h \in H} \left( HPref_{i,h} \times \frac{HCount_{h,j}}{\sum_{g \in H} HCount_{g,j}} \right)
\]
where I_{ctg(c)}(POI_j) is an indicator function defined as
\[
I_{ctg(c)}(POI_j) = \begin{cases} 1, & \text{if } POI_j \in ctg(c) \\ 0, & \text{otherwise} \end{cases}
\]
Individual Preference – HPref & CPref
Check-ins of User1
POI    Category    Highlights    Check-in count
A      c1          h1, h2        5
B      c2          h1, h2        1
C      c2          h2            2
D      c3          h3            2
Total                            10

Category preference of User1
Category    POIs    Check-ins    CPref_{i,c}
c1          A       5            5/10 = 0.5
c2          B, C    1 + 2        3/10 = 0.3
c3          D       2            2/10 = 0.2
c4          -       0            0
Total               10

Highlight preference of User1
Highlight    POIs       Check-ins    HPref_{i,h}
h1           A, B       5 + 1        6/16 = 0.375
h2           A, B, C    5 + 1 + 2    8/16 = 0.5
h3           D          2            2/16 = 0.125
h4           -          0            0
h5           -          0            0
Total                   16

CPref and HPref are the proportion of the user's check-ins that carry the label.
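The category and highlight preferences in this example are plain proportions of check-ins per label. A minimal sketch that recomputes them from User1's four POIs; the data layout and the function name are assumptions made for illustration:

```python
from collections import defaultdict

# POI -> (category, highlights, check-in count) for one user
visits = {
    "A": ("c1", ["h1", "h2"], 5),
    "B": ("c2", ["h1", "h2"], 1),
    "C": ("c2", ["h2"], 2),
    "D": ("c3", ["h3"], 2),
}


def label_preferences(visits):
    """Return (CPref, HPref): the share of check-ins carrying each label."""
    cat_counts = defaultdict(int)
    hl_counts = defaultdict(int)
    for category, highlights, count in visits.values():
        cat_counts[category] += count
        for h in highlights:  # a check-in counts once per highlight of the POI
            hl_counts[h] += count
    cat_total = sum(cat_counts.values()) or 1  # avoid division by zero
    hl_total = sum(hl_counts.values()) or 1
    cpref = {c: n / cat_total for c, n in cat_counts.items()}
    hpref = {h: n / hl_total for h, n in hl_counts.items()}
    return cpref, hpref


cpref, hpref = label_preferences(visits)
print(cpref)  # {'c1': 0.5, 'c2': 0.3, 'c3': 0.2}
print(hpref)  # {'h1': 0.375, 'h2': 0.5, 'h3': 0.125}
```

Because a POI can carry several highlights, the highlight total (16) is larger than the number of check-ins (10).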
Individual Preference – Example
 There is only one category for one POI.
 There are many highlights for one POI.
POI
Category: Hotdog & Sausages
Highlight (with counts of each highlight): Coffee(12), Cheese(88)

User A's pref table
Category             CPref
Seafood              0.5
Hotdog & Sausages    0.1
Fast food            0.1
Steak                0.3

Highlight      HPref
Coffee         0.5
Sightseeing    0.1
Ice Cream      0.1
Cheese         0.3
Individual Preference – Example (cont.)
POI A
Category: Hotdog & Sausages
Highlight: Coffee(12), Cheese(88)
α = 0.2

From User A's pref table: CPref(Hotdog & Sausages) = 0.1, HPref(Coffee) = 0.5, HPref(Cheese) = 0.3.
\[
IP(User A, POI A) = \underbrace{0.2 \times 0.1}_{CPref} + \underbrace{(1 - 0.2) \times \left( 0.5 \times \frac{12}{100} + 0.3 \times \frac{88}{100} \right)}_{HPref} = 0.2792
\]
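As a rough sketch, the Individual Preference formula can be evaluated as below. It relies on the stated fact that a POI has exactly one category, so the category sum reduces to a single lookup; the function and parameter names are illustrative, and the inputs are User A's preference tables and the POI's highlight counts from this example.

```python
def individual_preference(cpref, hpref, poi_category, highlight_counts, alpha):
    """alpha * category preference + (1 - alpha) * count-weighted highlight preference."""
    # Only the POI's own category passes the indicator function.
    category_term = cpref.get(poi_category, 0.0)
    total = sum(highlight_counts.values())
    highlight_term = (
        sum(hpref.get(h, 0.0) * n / total for h, n in highlight_counts.items())
        if total else 0.0
    )
    return alpha * category_term + (1 - alpha) * highlight_term


cpref_a = {"Seafood": 0.5, "Hotdog & Sausages": 0.1, "Fast food": 0.1, "Steak": 0.3}
hpref_a = {"Coffee": 0.5, "Sightseeing": 0.1, "Ice Cream": 0.1, "Cheese": 0.3}

ip = individual_preference(
    cpref_a, hpref_a,
    poi_category="Hotdog & Sausages",
    highlight_counts={"Coffee": 12, "Cheese": 88},
    alpha=0.2,
)
print(round(ip, 4))
```

A larger α shifts weight toward the category match; a smaller α favors the highlights.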
POI Popularity (PP)
 POI Popularity
 Relative Popularity of POI
 Normalized based on category
\[
RP_j = \frac{checkins_j}{\sum_{POI_k \in CS} checkins_k}
\]
where CS is the set of POIs in the same category as POI_j.
POI Popularity – Example
Frank
Category: Hot Dogs

Hot Dogs    Check-in count
Frank       4,032
KKK         25
……          …
total       100,000

\[
RP_{Frank} = \frac{4,032}{100,000} = 0.04032
\]
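Relative popularity is a POI's check-in count divided by the total for its category. The sketch below uses the Frank example; since the slide elides most Hot Dogs POIs, a single hypothetical entry (`other_hot_dog_pois`) stands in for them so that the category total is 100,000, and the `Cafe X` entry is likewise invented to show that other categories are ignored.

```python
def relative_popularity(poi, checkins, category_of):
    """Check-ins of `poi` divided by all check-ins in the same category."""
    category = category_of[poi]
    total = sum(n for p, n in checkins.items() if category_of[p] == category)
    return checkins[poi] / total if total else 0.0


checkins = {
    "Frank": 4032,
    "KKK": 25,
    "other_hot_dog_pois": 95943,  # hypothetical stand-in for the elided rows
    "Cafe X": 700,                # hypothetical POI in another category
}
category_of = {
    "Frank": "Hot Dogs",
    "KKK": "Hot Dogs",
    "other_hot_dog_pois": "Hot Dogs",
    "Cafe X": "Coffee",
}

print(relative_popularity("Frank", checkins, category_of))  # 0.04032
```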
Relevance Estimation
Relevance Estimation – Example
To estimate the relevance of each user-POI pair, we use these features
to learn a Regression-Tree Model. The Relevance column is the target.

User ID    POI ID    SF       PP      IP       Relevance (target)
1          A         0.2      0.1     0.001    3
1          B         0.05     0.2     0.1      5
1          C         0.004    0.1     0.9      1
…          …         …        …       …        …
N          D         0.5      0.15    0.06     2

These rows are the training input of the Regression-Tree Model.
Relevance Estimation – Regression-Tree Model
 Regression-Tree Model has shown excellent performance
for numerical value prediction
• Demographic Prediction
• Bio Life Cycle Analysis
• Prediction of Geographical Natural
 Learning Steps:
 1. Building the initial tree
 2. Linear regression model for each leaf node
 3. Pruning the tree
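To make the tree-building step concrete, here is a deliberately simplified sketch in plain Python. It is not the model described in the slides: leaves predict the mean target instead of a fitted linear regression, and a depth limit replaces pruning. The training rows are the four (SF, PP, IP) → Relevance examples from the Relevance Estimation table; all function names are illustrative.

```python
from statistics import mean


def sse(ys):
    """Sum of squared errors of ys around their mean."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def build_tree(rows, targets, depth=0, max_depth=3):
    """Grow a regression tree, choosing at each node the split with the lowest SSE."""
    if depth >= max_depth or len(rows) < 2 or sse(targets) == 0:
        return {"leaf": mean(targets)}
    best = None  # (score, feature index, threshold)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2  # candidate threshold between neighbors
            left = [y for r, y in zip(rows, targets) if r[f] <= thr]
            right = [y for r, y in zip(rows, targets) if r[f] > thr]
            if not left or not right:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, thr)
    if best is None:  # all rows identical: nothing to split on
        return {"leaf": mean(targets)}
    _, f, thr = best
    left = [(r, y) for r, y in zip(rows, targets) if r[f] <= thr]
    right = [(r, y) for r, y in zip(rows, targets) if r[f] > thr]
    return {
        "feature": f,
        "threshold": thr,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the splits down to a leaf and return its value."""
    while "leaf" not in tree:
        side = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["leaf"]


# Features are (SF, PP, IP); targets are the relevance levels.
X = [(0.2, 0.1, 0.001), (0.05, 0.2, 0.1), (0.004, 0.1, 0.9), (0.5, 0.15, 0.06)]
y = [3, 5, 1, 2]
tree = build_tree(X, y)
print([predict(tree, row) for row in X])  # [3, 5, 1, 2]
```

With only four training rows the tree memorizes them; on real data the depth limit, or a proper pruning step, is what keeps the tree from overfitting.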
Recommender
Experimental Evaluation
 Experimental dataset – Gowalla Dataset
 Near or within New York City
 1,964,919 POIs
 18,159 people
 5,341,191 Check-ins
 392,246 Friendship Links
Experimental Evaluation
 Experimental measurements
 Normalized Discounted Cumulative Gain (NDCG)
\[
DCG[i] = \begin{cases} G[i], & \text{if } i = 1 \\ DCG[i-1] + G[i], & \text{if } i < b \\ DCG[i-1] + \dfrac{G[i]}{\log_b i}, & \text{if } i \ge b \end{cases}
\]
\[
NDCG@p = \frac{DCG[p]}{IDCG[p]}
\]
 Measures the ranking quality of the relevance scores of the top-k POIs in
the recommendation list
 Mean Absolute Error (MAE)
\[
MAE = \frac{1}{n} \sum_{i=1}^{n} |f_i - y_i|
\]
 Measures the error of the relevance scores over all POIs
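Both measurements are short to compute. The sketch below follows the piecewise DCG definition with log base b (b = 2 by default) and assumes the ideal DCG is obtained by sorting the same gains in descending order; function names and sample values are illustrative.

```python
import math


def dcg(gains, p, b=2):
    """Discounted cumulative gain of the first p gains (b must be >= 2)."""
    total = 0.0
    for i, g in enumerate(gains[:p], start=1):
        # Ranks below b are not discounted; later ranks are divided by log_b(i).
        total += g if i < b else g / math.log(i, b)
    return total


def ndcg(gains, p, b=2):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True), p, b)
    return dcg(gains, p, b) / ideal if ideal else 0.0


def mae(predicted, actual):
    """Mean absolute error between predicted and true relevance scores."""
    return sum(abs(f - y) for f, y in zip(predicted, actual)) / len(actual)


print(ndcg([5, 3, 1], p=3))         # 1.0 (already in ideal order)
print(mae([3, 5, 1], [2, 5, 3]))    # 1.0
```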
Experimental Evaluation (cont.)
 Ground Truth
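The relevance of a POI is derived from its check-in count x:
\[
Relevance(x) = \begin{cases} 3 + \dfrac{x - avg}{max - avg} \times 2, & \text{if } x \ge avg \\ 3 - \dfrac{x - avg}{min - avg} \times 2, & \text{if } x < avg \end{cases}
\]
avg = 200

POI ID    Check-in    Relevance
A         50          1
B         50          1
C         500         5
D         200         3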
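The ground-truth mapping rescales check-in counts to a 1 to 5 relevance score centered on 3 at the average. A minimal sketch, assuming avg, min and max are taken over the listed check-in counts (the slides do not say over which set they are computed); the function name is illustrative:

```python
def relevance(x, avg, lo, hi):
    """Map a check-in count to [1, 5], with the average mapped to 3."""
    if x >= avg:
        # Guard the degenerate case where every count equals the average.
        return 3 + 2 * (x - avg) / (hi - avg) if hi != avg else 3.0
    return 3 - 2 * (x - avg) / (lo - avg)


counts = {"A": 50, "B": 50, "C": 500, "D": 200}
avg = sum(counts.values()) / len(counts)          # 200.0
lo, hi = min(counts.values()), max(counts.values())

scores = {poi: relevance(x, avg, lo, hi) for poi, x in counts.items()}
print(scores)  # {'A': 1.0, 'B': 1.0, 'C': 5.0, 'D': 3.0}
```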
 Baseline
 Trust Walker
 M. Jamali, M. Ester. TrustWalker: A Random Walk Model for Combining
Trust-based and Item-based Recommendation. Proceedings of KDD,
pages 397-406, Paris, 2009.
 Multi-Factor CF-based
 M. Ye, P. Yin, W.-C. Lee and Dik-Lun Lee. Exploiting Geographical
Influence for Collaborative Point-of-Interest Recommendation.
Proceedings of SIGIR, pages 1046-1054, Beijing, China, 2011.
Comparison of Various Features
 The Individual Preference is more important than Social Factor for urban
POI recommendation.
[Figure: bar charts comparing the features by MAE (axis 0.5 to 0.7; bar labels 0.62, 0.63, 0.65, 0.59) and by NDCG@10 (axis 0% to 100%)]
Comparison of Various Features (cont.)
[Figure: bar charts comparing the features by MAE (axis 0.5 to 0.7) and by NDCG@10 (axis 0% to 100%)]
Comparison with Existing Recommenders
[Figure: NDCG@10 (axis 0% to 90%) of the following recommenders]
TrustWalker
Multi-Factor CF-based
Multi-Factor CF-based (geographic influence)
Multi-Factor CF-based (user preference influence)
Multi-Factor CF-based (social influence)
Our approach (PP)
Our approach (SF)
Our approach (IP)
Our approach (All)
Comparison with Existing Recommenders (cont.)
[Figure: MAE (axis 0.00 to 2.50) of the following recommenders]
TrustWalker
Multi-Factor CF-based
Multi-Factor CF-based (geographic influence)
Multi-Factor CF-based (user preference influence)
Multi-Factor CF-based (social influence)
Our approach (PP)
Our approach (SF)
Our approach (IP)
Our approach (All)
Conclusions
 We proposed a novel urban POIs recommendation which is
called UPOI-Mine, based on mining users’ preferences.
 We proposed three kinds of useful features:
 Social Factor
 Individual Preference
 POI Popularity
 Through a series of experiments by the real dataset Gowalla
 We have validated our proposed UPOI-Mine and shown that
UPOI-Mine has excellent performance under various
conditions.
 The Individual Preference is more important than Social
Factor for urban POI recommendation.
Thank you for your attention
Questions?