Mining Cross-network Association for
YouTube Video Promotion
Ming Yan
Institute of Automation, Chinese Academy of Sciences
May 15, 2014
Outline
• Motivation
• Three-stage Framework
• Some Visualization
• Further Discussion
Background
• Large quantities of video are consumed on YouTube, and
the trend is growing year by year.
• More than 1 billion unique users visit
YouTube each month.
• Over 6 billion hours of video are watched
each month on YouTube.
• 100 hours of video are uploaded to
YouTube every minute.
• YouTube exhibits limited propagation efficiency, and many
videos remain unknown to the wider public.
• Long-tail effect in the video view count
distribution.
• Short active life span for most videos.
Background
• YouTube video popularity is limited by its internal mechanisms:
  • Internal search
  • Related video recommendation
  • Channel subscription
  • Front page highlight
• External referrers such as social media websites have emerged as
important sources that lead users to YouTube videos.
• Twitter has quickly grown into
the top referrer source for web video
discovery.
Motivation
• For a specific YouTube video, identify the proper Twitter followees
so as to maximize the video's dissemination to their followers.
[Figure: a Twitter followee refers a YouTube video, which the Twitter followers then watch. Example annotation: "Got 1 billion views in 5 months".]
Challenge
• The heterogeneous knowledge association between
YouTube videos and Twitter followees
  • user-perceived
• How to define the "properness" of a candidate Twitter
followee for a specific YouTube video
  • interestingness
  • virtual cost
Our Twitter followee identification scheme aims
to find the optimal Twitter followee, i.e., the one whose followers are
most likely to show interest in the target video.
User-perceived Solution
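• Illustration example
[Figure: users bridge the two networks. On Twitter they follow candidate followees (referrers); on YouTube they view and favor videos. The user association between the two indicates which referrer gives better promotion.]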
Framework
• Three Stages
Heterogeneous Topic Modeling
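Input
• YouTube videos $\mathcal{V}$: each video $v \in \mathcal{V}$ is represented as $[w_v, f_v]$ (words and keyframes)
• Twitter users $u \in \mathcal{U}^T$ with their
follower set
Output
• Twitter user distribution $U^T$, i.e., $p(z^T \mid u)$
• YouTube video distribution $V$, i.e., $p(z^Y \mid v)$
[Figure: on the Twitter side, each user $u$ with followee set $\mathcal{U}_u^{followee}$ (shown as Username / @TwitterID, e.g., ACM Multimedia 2014 @acmmm14, NBA @NBA, Britney Spears @britneyspears, Bill Gates @BillGates, …) is passed through LDA to obtain the Twitter user distribution $U^T$. On the YouTube side, each video $v$ with words $w$ and keyframes $f$ is passed through iCorr-LDA to obtain the YouTube video distribution $V$.]
Topic Modeling Approach
• On YouTube Side:
Propose an inverse Corr-LDA (iCorr-LDA) model to
discover the multimodal topics of YouTube videos.
• On Twitter Side:
Standard LDA on the Twitter followee-follower social graph.
  • user as document
  • user's followees as words
[Figure: topic model in plate notation. Recoverable symbols: $\alpha$, $\theta$, $z$, $y$, $w$, $f$, $\beta$, $\mu$, $\sigma$; plate sizes $N$, $M$, $|\mathcal{V}|$.]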
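The Twitter-side modeling (each user is a document, the accounts they follow are its words) can be illustrated with a small collapsed Gibbs sampler for LDA. This is only a minimal sketch: the toy follow graph, the user names, the hyperparameters, and the helper `lda_gibbs` are hypothetical and are not taken from the slides, which do not state which LDA inference procedure was used.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
              n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA.

    docs: list of documents, each a list of word ids (here: followee ids).
    Returns theta, where theta[d][k] estimates p(z_k | document d).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + n_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom
                      for t in range(n_topics)])
    return theta


# Hypothetical toy follow graph: user -> accounts the user follows.
following = {
    "user_a": ["@NBA", "@acmmm14", "@BillGates"],
    "user_b": ["@britneyspears", "@NBA"],
    "user_c": ["@BillGates", "@acmmm14"],
}
vocab = sorted({f for fs in following.values() for f in fs})
followee_id = {name: i for i, name in enumerate(vocab)}
users = sorted(following)
docs = [[followee_id[f] for f in following[u]] for u in users]

theta = lda_gibbs(docs, n_topics=2, vocab_size=len(vocab))
user_topics = dict(zip(users, theta))  # estimate of p(z^T | u) per user
```

Each row of `theta` plays the role of one user's $p(z^T \mid u)$; on real data a library implementation of LDA would normally replace this hand-written sampler.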
Cross-network Topic Association
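Input
• Twitter user and video distributions
$U^T$ and $V$ (output of stage 1)
• YouTube, Twitter and overlapped
user sets $\mathcal{U}^Y$, $\mathcal{U}^T$, $\mathcal{U}_o$
• Each YouTube user's interested video set $\mathcal{V}_u$
Output
• Distribution transfer function
$\mathcal{F}: \mathbf{u}^Y \to \mathbf{u}^T$
($\mathbf{u}^Y$: the aggregated YouTube user
distribution)
[Figure: for each YouTube user (username), the interested videos $\mathcal{V}_u$ are aggregated from the video distribution $V$ into the YouTube user distribution $U^Y$, i.e., $p(z^Y \mid u)$. Association mining over the overlapped users $\mathcal{U}_o$ of $\mathcal{U}^T$ and $\mathcal{U}^Y$ yields $\mathcal{F}$, which links the YouTube topic space $z^Y$ to the Twitter topic space $z^T$ (Twitter user distribution $U^T$).]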
Approach
• YouTube User Aggregation
• Association Mining
Cross-network Topic Association
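• YouTube User Aggregation
[Figure: user $u$'s interested videos $v_1, v_2, \ldots, v_n$, each with topic distribution $p(z_k^Y \mid v)$ and weight $w_1, w_2, \ldots, w_n$, are combined into the user's topic distribution.]
$$p(z_k^Y \mid u_i) = \sum_{v \in \mathcal{V}_{u_i}} \frac{N_v^{(f)} + N_v^{(w)}}{N^{(f)} + N^{(w)}} \cdot p(z_k^Y \mid v)$$
• $N_v^{(f)}$, $N_v^{(w)}$: the total number of keyframes and words in video $v$
• $N^{(f)}$, $N^{(w)}$: the total number of keyframes and words in $u_i$'s video set $\mathcal{V}_{u_i}$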
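The aggregation rule weights each video's topic distribution by its share of the keyframes and words in the user's video set. A minimal sketch follows; the dictionary layout and the toy numbers are hypothetical, chosen only for illustration.

```python
def aggregate_user_topics(videos):
    """Aggregate per-video topic distributions into one user distribution.

    videos: list of dicts with keys
      'n_keyframes' -> N_v^(f), 'n_words' -> N_v^(w),
      'topics'      -> list of p(z_k^Y | v) over topics k.
    """
    # N^(f) + N^(w): keyframes and words over the whole video set.
    total = sum(v["n_keyframes"] + v["n_words"] for v in videos)
    n_topics = len(videos[0]["topics"])
    user_topics = [0.0] * n_topics
    for v in videos:
        weight = (v["n_keyframes"] + v["n_words"]) / total
        for k, p in enumerate(v["topics"]):
            user_topics[k] += weight * p
    return user_topics


# Hypothetical interested-video set of one user.
videos = [
    {"n_keyframes": 10, "n_words": 30, "topics": [0.7, 0.2, 0.1]},
    {"n_keyframes": 20, "n_words": 40, "topics": [0.1, 0.6, 0.3]},
]
p_user = aggregate_user_topics(videos)
print(p_user)  # approximately [0.34, 0.44, 0.22]
```

Because the weights sum to one, the result is again a probability distribution whenever every input `topics` vector is one.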
Cross-network Topic Association
• Association Mining
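Goal:
• To obtain the association between the YouTube video
space and the Twitter user space, i.e., $\mathcal{F}: \mathbf{u}^Y \to \mathbf{u}^T$.
Approach:
• Transition Probability-based Association
• Regression-based Association
• Latent Attribute-based Association
Explicit association/transition matrix:
$A = [a_{ij}]$, s.t. $\mathbf{u}^Y \to \mathbf{u}^T$
[Figure: the overlapped users $\mathcal{U}_o$ shared by $\mathcal{U}^T$ and $\mathcal{U}^Y$ connect the Twitter topic space $z^T$ and the YouTube topic space $z^Y$ through the matrix $A$ (association mining).]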
Cross-network Topic Association
• Transition Probability-based Association
• Regression-based Association
  • $U_o^T$, $U_o^Y$: the overlapped
  users' distribution matrices
  on Twitter and YouTube
  • $q = 1$: a lasso problem, which can be solved effectively by LARS and the
  feature-sign algorithm
  • $q = 2$: a ridge regression problem, which has an analytical solution
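For the $q = 2$ case, the following minimal sketch shows the usual closed-form ridge solution. It assumes the objective is $\min_A \lVert U_o^Y A - U_o^T \rVert_F^2 + \lambda \lVert A \rVert_F^2$ with one overlapped user per row; the exact objective is not shown in the slide text, so this form, the regularization weight `lam`, and the synthetic data are all assumptions.

```python
import numpy as np


def ridge_association(U_Y, U_T, lam=0.1):
    """Closed-form ridge estimate of the association matrix A.

    U_Y: (n_users, K_Y) YouTube topic distributions of overlapped users.
    U_T: (n_users, K_T) Twitter topic distributions of the same users.
    Returns A of shape (K_Y, K_T) minimizing
    ||U_Y A - U_T||_F^2 + lam * ||A||_F^2 (assumed objective).
    """
    k_y = U_Y.shape[1]
    gram = U_Y.T @ U_Y + lam * np.eye(k_y)
    # Solve the normal equations instead of forming an explicit inverse.
    return np.linalg.solve(gram, U_Y.T @ U_T)


# Synthetic overlapped users (hypothetical data).
rng = np.random.default_rng(0)
U_Y = rng.dirichlet(np.ones(4), size=50)   # 50 users, 4 YouTube topics
A_true = rng.random((4, 3))                # maps to 3 Twitter topics
U_T = U_Y @ A_true

A_hat = ridge_association(U_Y, U_T, lam=1e-8)
u_T_pred = U_Y[0] @ A_hat                  # transfer one user's distribution
```

With a learned `A_hat`, the transfer $\mathbf{u}^Y \to \mathbf{u}^T$ is a single matrix product. The output is not guaranteed to be a valid probability distribution, so renormalization may be needed in practice.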
Cross-network Topic Association
• Latent Attribute-based Association (non-linear)
  • only on overlapped users
  • on all users
• Innovation: discover the shared latent structure behind the
two topic spaces. (After projection to the latent attribute
space, a user's YouTube and Twitter distributions share the
same coefficients.)
• Only on overlapped users
  • [Equation not recovered; annotated term: shared latent user attribute]
  • With a simple transformation, the problem can be solved efficiently by the
  sparse coding algorithm.
Cross-network Topic Association
• Latent attribute discovery on all users (plenty of non-overlapped users are considered in this scheme)
• Objective function (with $S^Y = [S_o, S_{non}^Y]$, $S^T = [S_o, S_{non}^T]$)
• Iteratively solved via three sub-problems
Referrer Identification
Input
• Distribution transfer function $\mathcal{F}$
• Test videos $\mathbf{v}_t$
• Twitter followee set $\mathcal{U}^{followee}$
Output
• Twitter followee rank for each video
$v \in \mathbf{v}_t$
Approach
• Direct product-based matching
• Weighted product-based matching
[Figure: a test YouTube video $v_t$ (words $w$, keyframes $f$) yields $p(z^Y \mid u_t)$; distribution transfer maps it to $p(z^T \mid u_t)$, which is matched against the candidate Twitter followees $\mathcal{U}_t^{followee}$.]
Referrer Identification
• Direct product-based matching
• Weighted product-based matching
  • The Ranking SVM algorithm is used to train the weights:
    • Feature:
    • Training label: a designed properness score, with one term in charge of the coverage of the
    interested audiences and one term in charge of the virtual cost
  • With the learnt model parameter $h^*$:
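Direct product-based matching can be sketched as ranking candidate followees by the inner product between the video's transferred Twitter-side distribution $p(z^T \mid u_t)$ and each followee's topic vector. The scoring formula is not shown in the slide text, so the plain inner product, the way each followee vector is built, and the toy numbers below are assumptions.

```python
def rank_followees(video_topics_T, followee_topics):
    """Rank candidate followees for one test video.

    video_topics_T: p(z^T | u_t), the video's distribution after transfer.
    followee_topics: dict mapping followee handle -> Twitter topic vector
                     (assumed here to summarize the followee's audience).
    Returns a list of (handle, score) sorted by decreasing score.
    """
    scores = {
        handle: sum(p * q for p, q in zip(video_topics_T, topics))
        for handle, topics in followee_topics.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical topic vectors over three Twitter topics.
video_T = [0.7, 0.2, 0.1]
followees = {
    "@NBA": [0.8, 0.1, 0.1],
    "@britneyspears": [0.1, 0.8, 0.1],
    "@BillGates": [0.1, 0.1, 0.8],
}
ranking = rank_followees(video_T, followees)
print([handle for handle, _ in ranking])
# ['@NBA', '@britneyspears', '@BillGates']
```

The weighted variant would replace the uniform per-topic product with learned weights, for example from a ranking model, instead of summing the products directly.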
Some Visualization
Further Discussion
Some Extensible Applications
• Examining the value of Twitter followees (our work can be
viewed as valuing a Twitter followee w.r.t. its promotion efficiency
for YouTube videos),
e.g., the followee has a lot of young female followers.
• Advertising (our work corresponds to advertising media selection),
e.g., anchor text generation (i.e., optimizing the video description for
promotion) and advertising slot bidding (i.e., followee reshare time
selection).
• Other user-bridged cross-network applications
Challenge
Data is hard to get!
[Figure: a user bridging different networks. Recoverable labels: Tweet, Topic, user, Taobao, recommend, Video; step markers 1 and 2.]