SIGIR2010_publish

Understanding Web Browsing Behaviors
through Weibull Analysis of Dwell Time
Chao Liu, Ryen White, Susan Dumais
Microsoft Research at Redmond
Dwell Time as Implicit User Feedback


The most significant indicator of document relevance besides clickthroughs [Kelly and Belkin, SIGIR’01, SIGIR’04]
Leveraged in various applications
  Learning to rank [Agichtein et al., SIGIR’06]
  Query expansion [Buscher et al., SIGIR’09]
  BrowseRank, assuming an exponential dist. [Liu et al., SIGIR’08]
  …
Questions Addressed in this Study

Questions:
  How do we model the dwell time distribution Pr(t|d)?
  What does Pr(t|d) tell us about user browsing behaviors?
  How is the distribution related to page-level features, and can we predict the distribution based on page-level features?
Takeaways



We propose to model Pr(t|d) using Weibull distributions
The fitted Weibull distribution exhibits a strong negative aging effect, which indicates a “screen-and-glean” browsing behavior
We can predict Pr(t|d) based on page features, which effectively extends the application of dwell time to scenarios where dwell time data is not available
Outline

A Primer on Weibull Analysis
  Weibull distribution and analysis
  Hazard function and aging effects
Weibull Analysis on Dwell Time
  Goodness-of-Fit
  Screen-and-glean browsing pattern
  Screening by categories
Predicting Dwell Time Distribution
  Prediction performance
  Feature importance
Conclusions
Weibull Analysis

Weibull analysis is a method for modeling positive data sets, such as time-to-failure data
  Predicting product life
  Comparing reliability of competing product designs
  Establishing warranty policies or proactively managing spare parts inventories
Success beyond reliability engineering
  Survival analysis, weather forecasting, fading channels in wireless communication, the length of labor strikes, AIDS mortality and earthquake probabilities, etc.
Unfortunately, no prior Weibull analysis on Web data, although the Web abounds with temporal data
  Page dwell time, session length, time-to-first-click, etc.
Weibull Distribution

2-parameter Weibull distribution



λ: scale parameter
k: shape parameter
Exponential dist. when k = 1
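
For reference, the standard two-parameter Weibull density and cumulative distribution function, written with the scale λ and shape k named on this slide (the symbols f and F are introduced here for illustration):

```latex
f(x;\lambda,k) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}},
\qquad
F(x;\lambda,k) = 1 - e^{-(x/\lambda)^{k}},
\qquad x \ge 0,\ \lambda > 0,\ k > 0.
```

Setting k = 1 gives f(x) = (1/λ) e^{-x/λ}, which is the exponential density.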
Weibull Analysis

Hazard function at time x



Instantaneous failure rate (or hazard rate) at time x
Amount of risk associated with an x-survivor at time x
Hazard function for Weibull distributions
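
A worked form of the two definitions above, in standard reliability notation. The symbols f (density), F (cumulative distribution function) and X (the random lifetime) are assumed here for illustration:

```latex
h(x) = \lim_{\Delta \to 0^{+}} \frac{\Pr(x \le X < x + \Delta \mid X \ge x)}{\Delta}
     = \frac{f(x)}{1 - F(x)}

% Weibull case: f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}},
% and 1 - F(x) = e^{-(x/\lambda)^{k}}, so the exponential factors cancel:
h(x;\lambda,k) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}
```

The sign of the exponent k - 1 decides whether the hazard falls, stays flat, or rises with x.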
Aging Effects from Hazard Function

k = 1: No aging
  Constant failure rate
  Exponential distribution
0 < k < 1: Negative aging
  Decreasing failure rate
  An initial screening has to be passed in order to survive longer
  Smaller k means harsher screening
k > 1: Positive aging
  Increasing failure rate
  Little to no screening at the beginning, but life becomes tougher as time goes by
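
The three aging regimes can be checked numerically. The sketch below evaluates the standard Weibull hazard h(x) = (k/λ)(x/λ)^(k-1) at a few time points; the function names, the scale λ = 10 and the sample times are illustrative choices, not values from the study.

```python
def weibull_hazard(x: float, lam: float, k: float) -> float:
    """Weibull hazard rate h(x) = (k / lam) * (x / lam) ** (k - 1), for x > 0."""
    if x <= 0 or lam <= 0 or k <= 0:
        raise ValueError("x, lam and k must all be positive")
    return (k / lam) * (x / lam) ** (k - 1)


def aging_type(k: float) -> str:
    """Classify the aging effect implied by the shape parameter k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if k < 1:
        return "negative"   # hazard decreases over time
    if k == 1:
        return "none"       # constant hazard (exponential case)
    return "positive"       # hazard increases over time


# Illustrative values only: scale 10, three time points, one k per regime.
times = [1.0, 5.0, 25.0]
for k in (0.5, 1.0, 2.0):
    rates = [weibull_hazard(t, lam=10.0, k=k) for t in times]
    print(f"k={k}: {aging_type(k)} aging, hazard at {times} = {rates}")
```

With k < 1 the printed hazards shrink as time grows, which is the "screening" reading: a visitor who has already stayed a while is less likely to leave in the next instant.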
Weibull Analysis on Dwell Time and Beyond

             Reliability Analysis   Dwell Time Analysis          Click Analysis        …
Data         Time-to-failure        Time-to-abandon              Time-to-first-click   …
Hazard       Failure rate           Abandon rate                 Click rate            …
E(t|t>t0)    Mean residual life     Mean residual time on page   How soon to click …   …
…            …                      …                            …                     …
The Web abounds with temporal data
  Time to first click, session length, eye fixation, …
Weibull analysis goes well beyond hazard functions
  Failure forecasting, corrective actions, …
…
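
The E(t|t>t0) row can be made concrete for a Weibull dwell time. The sketch below computes the mean residual time, i.e. the expected additional time on the page given that the visitor has already stayed t0, by numerical integration. The function name, the integration settings and the parameter values are assumptions for illustration, not part of the study.

```python
import math


def mean_residual_time(t0: float, lam: float, k: float,
                       v_max: float = 60.0, steps: int = 20000) -> float:
    """E[T - t0 | T > t0] for T ~ Weibull(scale=lam, shape=k).

    Uses the substitution u = (t / lam) ** k, which turns the survival
    integral into (lam / k) * integral_0^inf (u0 + v) ** (1/k - 1) * exp(-v) dv
    with u0 = (t0 / lam) ** k. The integral is truncated at v_max and
    evaluated with the midpoint rule. Accuracy degrades for t0 = 0 with
    k > 1, where the integrand is singular at v = 0.
    """
    if t0 < 0 or lam <= 0 or k <= 0:
        raise ValueError("need t0 >= 0, lam > 0 and k > 0")
    u0 = (t0 / lam) ** k
    h = v_max / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * h                      # midpoint of the i-th slice
        total += (u0 + v) ** (1.0 / k - 1.0) * math.exp(-v)
    return (lam / k) * total * h


# Illustrative comparison: memoryless (k = 1) versus negative aging (k = 0.5).
for t0 in (1.0, 10.0, 40.0):
    print(t0, mean_residual_time(t0, 10.0, 1.0), mean_residual_time(t0, 10.0, 0.5))
```

For k = 1 the result stays at λ whatever t0 is (the exponential distribution is memoryless). For k < 1 it grows with t0: the longer a visitor has already stayed, the longer the expected remaining stay.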
Goodness-of-Fit Comparison


Dwell time collected for 205,873 pages (URLs) in the English (US) market, each with at least 10k dwell times
Comparison on Goodness-of-Fit (GoF)
  Dwell times for each page are split into training (80%) and testing (20%) sets
  Models are fitted on the training set and evaluated on the testing set
  Metrics: log-likelihood and Kolmogorov–Smirnov distance
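
A minimal sketch of this protocol for a single page, assuming maximum-likelihood fitting (the slides do not say which estimator was used) and synthetic dwell times in place of real logs. The generating parameters (scale 30, shape 0.6), the sample size and all function names are illustrative assumptions. The exponential model is the Weibull with k fixed at 1.

```python
import math
import random


def fit_weibull(xs):
    """Maximum-likelihood fit of a 2-parameter Weibull; returns (lam, k)."""
    if len(xs) < 2 or min(xs) <= 0:
        raise ValueError("need at least two strictly positive observations")
    top = max(xs)
    logs = [math.log(x / top) for x in xs]   # rescaled so every log is <= 0
    mean_log = sum(logs) / len(logs)

    def score(k):
        # Profile-likelihood equation for the shape; its root is the MLE of k.
        pw = [math.exp(k * l) for l in logs]
        return sum(p * l for p, l in zip(pw, logs)) / sum(pw) - 1.0 / k - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0:                     # grow the bracket until the sign changes
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("data look constant; the shape is unbounded")
    for _ in range(60):                      # plain bisection on the shape
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    lam = top * (sum(math.exp(k * l) for l in logs) / len(logs)) ** (1.0 / k)
    return lam, k


def log_likelihood(xs, lam, k):
    """Sum of Weibull log-densities over the observations."""
    return sum(math.log(k / lam) + (k - 1) * math.log(x / lam) - (x / lam) ** k
               for x in xs)


def ks_distance(xs, lam, k):
    """Kolmogorov-Smirnov distance between the empirical and the Weibull CDF."""
    ordered, n, dist = sorted(xs), len(xs), 0.0
    for i, x in enumerate(ordered, start=1):
        cdf = 1.0 - math.exp(-((x / lam) ** k))
        dist = max(dist, abs(cdf - i / n), abs(cdf - (i - 1) / n))
    return dist


rng = random.Random(0)
dwell = [rng.weibullvariate(30.0, 0.6) for _ in range(10000)]  # synthetic data
split = int(0.8 * len(dwell))
train, test = dwell[:split], dwell[split:]

lam_w, k_w = fit_weibull(train)
lam_e = sum(train) / len(train)              # exponential MLE: scale = sample mean

print("Weibull     ", lam_w, k_w, log_likelihood(test, lam_w, k_w), ks_distance(test, lam_w, k_w))
print("Exponential ", lam_e, 1.0, log_likelihood(test, lam_e, 1.0), ks_distance(test, lam_e, 1.0))
```

A higher held-out log-likelihood and a smaller KS distance both indicate a better fit. In the study this comparison is made per page on real dwell times.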
Fitting λ and k
What’s the initial screening?
Strong Negative Aging
Screen-and-glean browsing pattern?
P(k|Category): Aging Effect w.r.t. Categories
Screening is harsher for less-entertaining topics
Dwell Time Prediction from Page Features

Why predict dwell time?
  Extend dwell time to pages with little or no dwell time data
  Enable third parties to leverage dwell time even if they don’t have access to real dwell time data
  Gain insights into what elements affect dwell time
Why use only page-level features?
  Users decide how long to stay on a page based on their experience and perception of it, rather than on PageRank, for example
  Advanced features like PageRank and inlink counts may not be available to all parties
Experiment Setup

5000 randomly sampled pages, with fitted λ and k as the target values
  Pages are crawled using a dynamic crawler, which parses the HTML, executes all dynamic components (e.g., redirections, Flash, JavaScript), and finally renders the page
  “login” pages are removed as they are likely due to time-out redirection
  4771 pages left
Page-level features
  HtmlTag: frequencies of 93 HTML tags
  Content: frequencies of top-1000 terms
  Dynamic: statistics from dynamic crawling
Regressor: Multiple Additive Regression Trees (MART)
  Effectiveness and feature interpretability
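
A rough sketch of how an HtmlTag-style feature vector could be built, using Python's standard `html.parser`. The tag vocabulary and the sample page below are hypothetical: the 93 tags used in the study are not listed in these slides, and the study works on pages rendered by a dynamic crawler rather than on raw HTML.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how often each HTML start tag occurs in a document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; self-closing tags also arrive here.
        self.counts[tag] += 1


def html_tag_features(html, tag_vocabulary):
    """Return tag frequencies in the order given by tag_vocabulary."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return [parser.counts.get(tag, 0) for tag in tag_vocabulary]


# Hypothetical vocabulary and page, for illustration only.
vocabulary = ["a", "img", "p", "script"]
page = ('<html><body><p>Hi <a href="#">x</a></p>'
        '<p><img src="a.png"/></p><script>var a = 1;</script></body></html>')
features = html_tag_features(page, vocabulary)
print(dict(zip(vocabulary, features)))
```

Vectors like this, one per page, would be paired with that page's fitted λ and k as regression targets.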
Prediction Results

Comparisons with various feature configurations
  Baseline returns the mean λ and k
  Prediction outperforms the baseline
  HtmlTag and Dynamic are similarly effective on their own, and complementary when combined
  Content > HtmlTag+Dynamic
  Content+Dynamic is the best: Dynamic captures what users experience after clicks, whereas Content shows what users would see in the end
Important Features
Conclusions

The first Weibull analysis on Web dwell time
  Draws an analogy between dwell time and lifetime
  Opens the door to Weibull analysis for temporal implicit feedback
Dwell time exhibits a strong negative aging effect, which hints at a prevalent “screen-and-glean” browsing pattern
  Harsher screening for less-entertaining topics
Feasible to predict dwell time based on page-level features
  Extending applicability to less-visited pages and parties without dwell time data
Future work
  Improving prediction accuracy through better feature engineering
  Weibull analysis for IR
Acknowledgments






Yutaka Suzue
Krysta Svore
Qiang Wu
Wen-tau Yih
Xiaoxin Yin
Alice Zheng
Q&A
Thank You!