AAAI10_Wu

Modeling Dynamic Multi-topic
Discussions in Online Forums
Hao Wu, Jiajun Bu, Chun Chen, Can Wang,
Guang Qiu, Lijun Zhang and Jianfeng Shen*
Zhejiang University, China
*Zhejiang Health Information Center, China
July 13, AAAI’2010
Atlanta, GA, USA
Social Media
• Web 2.0 applications socialize users online
• Online Forums
– Distinct platform for knowledge sharing and information exchange
Reveals how information propagates on the Internet.
Modeling the process of topic discussions and
predicting user activity is an interesting problem!
2
Benefits of Modeling
• Understand online human interactions and
group forming
Social network analysis
• Improve applications, e.g., recommender systems
• Track new ideas and technology
• Mine opinions about products
User review
3
Environment of Online Forums
• Great complexity
– 433,839 threads
– 13,599,245 posts
• Randomness
– Usually no well-defined friendships or co-authorships
– Free to post
– Topic drifts in a single thread
• Open questions
– What are the mechanisms underlying users' participation?
– From which perspective should the process of topic discussion be viewed?
– How to make use of the properties of topics and temporal features for modeling?
– How to measure the importance of a user in discussions?
Modeling Dynamic Multi-topic Discussions is challenging!
4
Outline
• Motivation and Intuitions
• Topic Flow Models
• Experimental Results
• Summary
5
Topic Flow Model (TFM)
A newcomer reads some of the
previous comments before posting.
The information (topic) flows from
early participants to late participants.
Reply Link
Topic Flow
Topic diffuses through the
underlying social networks
6
Basic Topic Flow Model (B-TFM)
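Thread documents: $d \in D$
Social network (built from the thread documents $D$)
– $R_{ij}^{d}$: frequency of reply links $i \to j$ in thread $d$
– $C_{i}^{d}$: frequency of user $i$'s posts in thread $d$
Peer-influence
$w_{ij} = \sum_{d \in D} R_{ij}^{d}$
Self-preference
$y_{i} = \sum_{d \in D} C_{i}^{d}$
Normalization
$S = \lambda D^{-1} W + (1 - \lambda)\, \mathbf{1}\mathbf{1}^{T} / n$
$q_{i} = y_{i} / \|y\|_{1}$
Topic Flow (Random Walk With Restart)
$p^{(t+1)} = \mu S^{T} p^{(t)} + (1 - \mu)\, q$
$p^{*} = (1 - \mu)(I - \mu S^{T})^{-1} q$
ParticipationRank $p^{*}$: measures the susceptibility of a user to
an 'infective' topic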
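The random walk with restart above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name, the parameter names `lam` and `mu`, their default values, and the uniform treatment of users with no outgoing reply links (where the row normalization is undefined) are all assumptions made for the example.

```python
def participation_rank(reply_counts, post_counts, lam=0.85, mu=0.85, iters=200):
    """Power iteration for p = mu * S^T p + (1 - mu) * q.

    reply_counts[i][j]: w_ij, reply-link frequency summed over all threads.
    post_counts[i]:     y_i, number of posts by user i summed over all threads.
    lam, mu:            hypothetical defaults, not values from the slides.
    """
    n = len(post_counts)
    # Row-normalize W and smooth with the uniform matrix 11^T / n.
    S = []
    for row in reply_counts:
        total = sum(row)
        S.append([
            # Assumption: a user with no reply links gets a uniform row.
            lam * (w / total if total else 1.0 / n) + (1 - lam) / n
            for w in row
        ])
    # Restart distribution q = y / ||y||_1.
    y_sum = sum(post_counts)
    q = [y / y_sum for y in post_counts]
    p = q[:]
    for _ in range(iters):
        # (S^T p)_j = sum_i S[i][j] * p[i]
        p = [
            mu * sum(S[i][j] * p[i] for i in range(n)) + (1 - mu) * q[j]
            for j in range(n)
        ]
    return p


# Toy data: three users, reply counts and post counts are made up.
scores = participation_rank([[0, 2, 1], [1, 0, 0], [0, 3, 0]], [3, 1, 3])
```

Because `S` is row-stochastic and `q` sums to one, the scores stay a probability distribution across iterations, and the iteration converges to the closed-form fixed point for any `mu` below one.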
7
Topic-specific TFM (T-TFM)
• Different interaction patterns according to different topics,
e.g., iPhone vs. FIFA World Cup
$w_{ij}^{z} = \sum_{d \in D} P(z \mid d)\, R_{ij}^{d}$
$y_{i}^{z} = \sum_{d \in D} P(z \mid d)\, C_{i}^{d}$
• $P(z \mid d)$ obtained using Latent Dirichlet
Allocation [Blei 2003]
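As a rough sketch of the topic-specific weighting, the snippet below accumulates reply and post counts per thread, each scaled by P(z | d). The thread data layout (`replies`, `posts` dictionaries) and the function name are hypothetical, and the topic probabilities are assumed to come from an LDA model fitted elsewhere.

```python
def topic_weights(threads, topic_probs, z, n_users):
    """Topic-specific peer-influence matrix W^z and self-preference vector y^z.

    threads[d]["replies"]: {(i, j): reply-link count in thread d} (assumed layout)
    threads[d]["posts"]:   {i: number of posts by user i in thread d}
    topic_probs[d][z]:     P(z | d), e.g. from a separately fitted LDA model
    """
    W = [[0.0] * n_users for _ in range(n_users)]
    y = [0.0] * n_users
    for d, thread in enumerate(threads):
        pz = topic_probs[d][z]
        for (i, j), count in thread["replies"].items():
            W[i][j] += pz * count   # w_ij^z += P(z|d) * R_ij^d
        for i, count in thread["posts"].items():
            y[i] += pz * count      # y_i^z += P(z|d) * C_i^d
    return W, y


# Toy data: two threads, two users, two topics (all values made up).
threads = [
    {"replies": {(0, 1): 2}, "posts": {0: 2, 1: 1}},
    {"replies": {(1, 0): 1}, "posts": {0: 1, 1: 1}},
]
topic_probs = [[0.75, 0.25], [0.5, 0.5]]
W0, y0 = topic_weights(threads, topic_probs, z=0, n_users=2)
```

Running the basic random walk once per topic on each `(W, y)` pair would then give one ranking per topic.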
8
Time-sensitive T-TFM (TT-TFM)
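• Forgetting Mechanism: time lapses between past threads and now
$w_{ij}^{z} = \sum_{d \in D} \exp(-\delta t_{d})\, P(z \mid d)\, R_{ij}^{d}$
$y_{i}^{z} = \sum_{d \in D} \exp(-\delta t_{d})\, P(z \mid d)\, C_{i}^{d}$
Time Lapse Factor: $\exp(-\delta t_{d})$, where $t_{d}$ is the time lapse of thread $d$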
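The forgetting mechanism can be illustrated by multiplying each thread's contribution by an exponential decay in its time lapse. In this sketch the decay-rate name `delta`, the unit of `time_lapses` (for example days), and the thread data layout are assumptions made for the example.

```python
import math


def time_sensitive_topic_weights(threads, topic_probs, time_lapses, z, n_users, delta):
    """Time-decayed, topic-specific W^z and y^z.

    threads[d]["replies"]: {(i, j): reply-link count in thread d} (assumed layout)
    threads[d]["posts"]:   {i: number of posts by user i in thread d}
    topic_probs[d][z]:     P(z | d)
    time_lapses[d]:        t_d, time elapsed since thread d (unit is an assumption)
    delta:                 decay rate; the symbol name is an assumption
    """
    W = [[0.0] * n_users for _ in range(n_users)]
    y = [0.0] * n_users
    for d, thread in enumerate(threads):
        # Older threads (larger t_d) contribute exponentially less.
        weight = math.exp(-delta * time_lapses[d]) * topic_probs[d][z]
        for (i, j), count in thread["replies"].items():
            W[i][j] += weight * count
        for i, count in thread["posts"].items():
            y[i] += weight * count
    return W, y


# Toy data: the second thread is 10 time units old (all values made up).
threads = [
    {"replies": {(0, 1): 2}, "posts": {0: 2, 1: 1}},
    {"replies": {(1, 0): 1}, "posts": {0: 1, 1: 1}},
]
topic_probs = [[0.75, 0.25], [0.5, 0.5]]
W0, y0 = time_sensitive_topic_weights(
    threads, topic_probs, time_lapses=[0.0, 10.0], z=0, n_users=2, delta=0.1
)
```

Setting `delta` to zero removes the decay, so the result reduces to the topic-specific weights without forgetting.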
9
Evaluation: Prediction
• ParticipationRank $p^{*}$ (indicator)
– A user's willingness to participate in the
discussion of a topic
• Train on past data, produce a ranking, then predict:
does a user join the discussion (post at least once)?
• Synthesize for T-TFM and TT-TFM:
$p_{F}^{*} = \sum_{z \in Z} \sum_{d \in D} P(z \mid d)\, p_{z}^{*}$
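One way to read the synthesis step is as a weighted sum of the per-topic rankings, where each topic's weight is its total probability mass over the threads. The sketch below follows that reading; the function names and the input layout are hypothetical, and the per-topic scores are assumed to have been computed beforehand.

```python
def synthesize_rank(topic_ranks, topic_probs):
    """Combine per-topic ParticipationRank vectors into one score per user.

    topic_ranks[z][i]: p*_z for user i under topic z (computed elsewhere)
    topic_probs[d][z]: P(z | d) for each thread d
    """
    n_users = len(topic_ranks[0])
    scores = [0.0] * n_users
    for z, p_z in enumerate(topic_ranks):
        # sum_d P(z|d): how much of the thread collection is about topic z.
        topic_mass = sum(doc[z] for doc in topic_probs)
        for i, value in enumerate(p_z):
            scores[i] += topic_mass * value
    return scores


def rank_users(scores):
    """User indices sorted from most to least likely to participate."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy data: two topics, three users, two threads (all values made up).
topic_ranks = [[0.5, 0.25, 0.25], [0.25, 0.25, 0.5]]
topic_probs = [[0.75, 0.25], [0.5, 0.5]]
ranking = rank_users(synthesize_rank(topic_ranks, topic_probs))
```

The combined scores are not normalized to sum to one, which does not affect the ordering used for prediction.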
10
Outline
• Motivation and Intuitions
• Topic Flow Models
• Experimental Results
• Summary
11
Experiments
• Dataset (www.honda-tech.com)
– Two communities: Drag Racing and Honda/Acura
– Across one year, from 09/01/2008 to 08/31/2009.
posted more than the average
number of posts per user.
• Evaluations
– Divide the data into 12 consecutive time windows
– Generate a ranking from each one-month window, and
predict user posting activity in the following week
12
Results
13
Model Selection
•  = 0.3 and 0.1
• T = 30 and 40
•  = 0.01
14
Summary
• An intuitive model of discussions in online forums
• Topic Flow Models (TFM)
– Consider both peer-influence and self-preference
– Property of latent topics
– Temporal feature: forgetting mechanism
• Evaluation on prediction of user activity
• Future work:
– Utilize the web structure of online forums
– More data sets e.g.,
– Build recommendation system
15
Thanks!
Any questions?
16