Personalizing Web Page Recommendation
via Collaborative Filtering and
Topic-Aware Markov Model
Qingyan Yang, Ju Fan, Jianyong Wang, Lizhu Zhou
Database Research Group, DCS&T, Tsinghua University
Agenda
Motivation
Recommender framework
Experimental evaluation
Conclusions
4/8/2015
DB Group, DCS&T, Tsinghua University
Motivation
• The Web is growing explosively
▪ By the end of 2009 (source: the 25th Internet Report, 2010)
◦ 33,600,000,000 Web pages in China
◦ Twice as many as in 2003
• Finding desired information is more difficult.
▪ Users often wander aimlessly on the Web without
reaching pages that interest them
▪ Or they spend a long time finding the
information they expect.
Web page recommendation
• Objective
▪ To understand users' navigation behavior
▪ To show pages that match a user's interests at a
specific time
• Existing popular solutions
▪ Markov model and its variants
▪ Temporal relations are important.
◦ If the browsing sequence is "A B C … A B C … A B C",
then C is recommended when A and B are visited one after another.
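The temporal idea above can be sketched as a plain n-th order Markov recommender. This is a minimal illustration, not the implementation behind the talk: the function names `train` and `recommend` and the filler pages X and Y in the sample session are hypothetical.

```python
from collections import Counter, defaultdict


def train(sessions, order=2):
    """Count which page follows each run of `order` consecutive pages."""
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(len(session) - order):
            state = tuple(session[i:i + order])
            counts[state][session[i + order]] += 1
    return counts


def recommend(counts, recent_pages, order=2):
    """Return the most frequent successor of the last `order` pages, or None."""
    state = tuple(recent_pages[-order:])
    if state not in counts:
        return None
    return counts[state].most_common(1)[0][0]


# Hypothetical session in which "A B" is always followed by "C".
counts = train([list("ABCXABCYABC")])
suggestion = recommend(counts, ["A", "B"])
```

With this session, `suggestion` is the page that followed "A B" most often. Every user who visits A and then B gets the same answer, which is the limitation the next slide points out.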
Limitations
• No personalized recommendations
▪ All users receive the same results
• Topic information of pages is neglected.
▪ Two pages visited in sequence may be
very different in topic.
PIGEON: our solution
• Personalized Web page recommendation
• Two novel features
▪ Personalization
◦ Meet the preferences of different users
[Figure: a Web page with the callout "I am a blog about finance"]
▪ Topical coherence
◦ Stay relevant to users' current missions
Recommender framework
Data representation
• Navigation graph
▪ Node: Web page
▪ Edge: jump relation
▪ Weight: relation frequency
[Figure: example navigation graph over Web pages A to M; edges are jump relations weighted by frequency]
• Navigation log records

Time         User ID    IP address      Target   Source
(09:44:44)   (0e0c…)    (211.90.-.-)    A        ()
(09:44:58)   (0e0c…)    (211.90.-.-)    B        A
(10:14:29)   (0e0c…)    (211.90.-.-)    G        A
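As a rough sketch of how log records like these could be turned into the weighted navigation graph, the snippet below counts each (source, target) jump. The function name `build_navigation_graph` and the tuple layout of a record are assumptions made for illustration. The user ID and IP values are copied from the slide, elisions included.

```python
from collections import Counter


def build_navigation_graph(records):
    """Map each (source, target) jump to the number of times it occurs.

    Each record is assumed to be (time, user_id, ip, target, source);
    an empty source means the page was entered directly, so no edge is added.
    """
    edges = Counter()
    for _time, _user_id, _ip, target, source in records:
        if source:
            edges[(source, target)] += 1
    return edges


# The three sample rows from the slide.
records = [
    ("09:44:44", "0e0c…", "211.90.-.-", "A", ""),
    ("09:44:58", "0e0c…", "211.90.-.-", "B", "A"),
    ("10:14:29", "0e0c…", "211.90.-.-", "G", "A"),
]
graph = build_navigation_graph(records)
```

Here `graph` holds one edge A→B and one edge A→G, each with weight 1. Repeated jumps in a larger log would raise the corresponding weights.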
Topic discovery
• Basic idea
▪ We assume that pages with similar URLs or
involved in jump relations are topically relevant.
• URL features
▪ Keywords, e.g., http://dblp.uni-trier.de/db/index.html
▪ Expanded by manifold-based keyword propagation
• Web page clustering
▪ Each cluster represents one topic
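To make the URL-keyword idea concrete, here is a small sketch that tokenizes a URL into keywords and compares two keyword sets. It is only an illustration under stated assumptions: the `STOP_TOKENS` list is hypothetical, and Jaccard overlap is a simple stand-in, not the manifold-based keyword propagation or the clustering algorithm used in the talk.

```python
import re
from urllib.parse import urlparse

# Hypothetical list of tokens treated as uninformative.
STOP_TOKENS = {"www", "com", "html", "htm", "index"}


def url_keywords(url):
    """Split the host and path of a URL into lowercase keyword tokens."""
    parsed = urlparse(url)
    tokens = re.split(r"[^A-Za-z0-9]+", parsed.netloc + "/" + parsed.path)
    return {t.lower() for t in tokens if t and t.lower() not in STOP_TOKENS}


def jaccard(a, b):
    """Overlap of two keyword sets, between 0.0 and 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


keywords = url_keywords("http://dblp.uni-trier.de/db/index.html")
same_site = jaccard(keywords, url_keywords("http://dblp.uni-trier.de/db/conf/"))
```

Pages from the same site and directory share most of their tokens, so their overlap is high, which is the intuition behind treating similar URLs as topically relevant.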
Example
[Figure: the example navigation graph over Web pages A to M with weighted jump-relation edges]
Topic-Aware Markov Model
• Take n-grams as states, e.g., n = 2
▪ Session: A B C D B C A
▪ Topic subsequences: A C C A and B D B
▪ Temporal states: AB, BC, CD, DB, CA
▪ Topical states: AC, CC, CA, BD, DB
• Web page preference score
▪ Maximum likelihood estimation
▪ e.g., P(D|BC) = f(BCD)/f(BC) = 1/2
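The state construction and the estimate P(D|BC) = 1/2 can be reproduced with a short sketch. It assumes the session is A B C D B C A and that pages A and C share one topic while B and D share another, which is how the slide's example reads. The helper names and the `topic_of` mapping are hypothetical.

```python
from collections import Counter


def ngrams(seq, n):
    """All runs of n consecutive items in seq, as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def topic_subsequences(session, topic_of):
    """Split a session into one subsequence per topic, keeping visit order."""
    groups = {}
    for page in session:
        groups.setdefault(topic_of[page], []).append(page)
    return list(groups.values())


def preference(seq, state, page):
    """Maximum likelihood estimate f(state + page) / f(state)."""
    state = tuple(state)
    f_state = Counter(ngrams(seq, len(state)))[state]
    f_next = Counter(ngrams(seq, len(state) + 1))[state + (page,)]
    return f_next / f_state if f_state else 0.0


session = list("ABCDBCA")
topic_of = {"A": 0, "C": 0, "B": 1, "D": 1}  # assumed topic assignment

temporal_states = set(ngrams(session, 2))
topical_states = set()
for sub in topic_subsequences(session, topic_of):
    topical_states.update(ngrams(sub, 2))

p_d_given_bc = preference(session, "BC", "D")
```

BC occurs twice in the session and is followed by D once, so `p_d_given_bc` matches the 1/2 on the slide. The topical states come from consecutive pages within the same topic, even when other pages were visited in between.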
Personalized Recommender
• Collaborative filtering
▪ Basic idea
$$\tilde{s}(u, p) = k \sum_{u'} \mathrm{sim}(u, u') \, s(u', p)$$
◦ u: active user; p: Web page
◦ sim(u, u'): user similarity
◦ s(u', p): Web page preference
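A minimal sketch of this weighted sum follows. The slide does not define k, so the code assumes it is the usual normalizer, one over the sum of the similarities. The function name, the dictionary layout, and the sample numbers are hypothetical.

```python
def predict_preference(active_user, page, similarity, preference):
    """Similarity-weighted average of other users' preference for a page.

    similarity[u][v] is sim(u, v); preference[v][p] is s(v, p).
    Assumption: k = 1 / (sum of similarities to the other users).
    """
    neighbours = [
        (other, sim)
        for other, sim in similarity.get(active_user, {}).items()
        if other != active_user
    ]
    total = sum(abs(sim) for _, sim in neighbours)
    if total == 0:
        return 0.0
    k = 1.0 / total
    return k * sum(
        sim * preference.get(other, {}).get(page, 0.0)
        for other, sim in neighbours
    )


# Hypothetical values: user "u" is much closer to "v" than to "w".
similarity = {"u": {"v": 0.9, "w": 0.1}}
preference = {"v": {"D": 0.5}, "w": {"D": 1.0}}
score = predict_preference("u", "D", similarity, preference)
```

Because "v" carries nine times the weight of "w", the predicted score for page D sits close to what "v" prefers.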
User Similarity
• User profile
▪ A set of topics
• Similarity measurement
▪ Topic similarity
▪ Maximum weight matching
▪ Example: three matched topic pairs with similarities 0.9, 1.0, and 0.8
$$\mathrm{sim}(u_1, u_2) = \frac{0.9 + 0.8 + 1.0}{3} = 0.9$$
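The matching step can be sketched by brute force for small profiles. Two things here are assumptions rather than details from the talk: the total weight is divided by the size of the larger profile, and the off-diagonal values in the sample matrix are made up so that the best matching is 0.9, 1.0, and 0.8.

```python
from itertools import permutations


def user_similarity(topic_sim):
    """Average weight of a maximum weight matching between two topic profiles.

    topic_sim[i][j] is the similarity between topic i of the first user and
    topic j of the second. Brute force, so only suitable for small profiles.
    """
    rows, cols = len(topic_sim), len(topic_sim[0])
    if rows > cols:
        # Transpose so that every row can be matched to a distinct column.
        topic_sim = [list(column) for column in zip(*topic_sim)]
        rows, cols = cols, rows
    best = max(
        sum(topic_sim[i][j] for i, j in enumerate(assignment))
        for assignment in permutations(range(cols), rows)
    )
    return best / cols  # assumption: normalize by the larger profile


# Hypothetical topic similarity matrix; the diagonal is the best matching.
topic_sim = [
    [0.9, 0.2, 0.1],
    [0.3, 1.0, 0.4],
    [0.2, 0.5, 0.8],
]
sim_u1_u2 = user_similarity(topic_sim)
```

For larger profiles a polynomial-time assignment algorithm would replace the permutation search, but the quantity being computed stays the same.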
Experiment settings
• Data set
▪ 1,402,371 records of 375 users in 34 days
▪ The first 30 days are used for training and the remaining 4 days for testing
• Metrics are precision and recall
• Comparative methods

Method     Temporal   Topical   Personalized
Baseline   Y          -         -
TAMM       Y          Y         -
PIGEON     Y          Y         Y
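For reference, precision and recall of a recommendation list can be computed as below. This uses the standard set-based definitions. How the talk pairs recommendations with the pages a user actually visited in the test period is not stated, so the inputs here are hypothetical.

```python
def precision_recall(recommended, visited):
    """Return (precision, recall) of recommended pages against visited pages."""
    recommended, visited = set(recommended), set(visited)
    hits = len(recommended & visited)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(visited) if visited else 0.0
    return precision, recall


# Hypothetical case: two pages recommended, three pages actually visited.
precision, recall = precision_recall(["C", "D"], ["C", "E", "F"])
```

One of the two recommendations was visited, and one of the three visited pages was recommended, so precision is one half and recall is one third.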
Experimental evaluation
[Figure: evaluation results in two panels, "1st-order model" and "2nd-order model"]
Conclusions
• By taking user similarities into account, we can
recommend Web pages that meet different users'
preferences.
• We discover the topics users are interested in using an effective
graph-based clustering algorithm.
• We devise a topic-aware Markov model to learn
navigation patterns that contribute to topically
coherent recommendations.
THANKS