Modelling Relevance and User Behaviour in Sponsored Search using Click-Data

Adarsh Prasad, IIT Delhi
Advisors: Dinesh Govindaraj, SVN Vishwanathan*
Group: Revenue and Relevance
* Visiting Researcher from Purdue

Overview
• Click data is a natural source of information for deciding which ads to show in answer to a query. It can be thought of as the result of users voting in favour of the documents they find interesting.
• This information can be fed into the ranker to tune search parameters, or used as training points for the ranker.
• The aim of the project is to develop a model that takes in click data and outputs constraints or updated ranking scores as input to the ranker.

Motivation
• The quality of training points is of critical importance for learning a ranking function.
• Currently, labeled data is collected using human judges. Human labeling is time-consuming and labor-intensive.
• We need to ensure the “temporal relevance” of ads: something relevant today might not be relevant 6 months later. Labeling must therefore be repeated, which calls for automating the labeling process.

Main Difficulty – Presentation Bias
• Results at lower positions are less likely to be clicked even if they are relevant. (Position)
• Clicks depend on the other ads being shown. (Externalities)

Example [1]
Query: myspace, URL = www.myspace.com, Market = U.K.
Ranking 1:
  Pos 1: uk.myspace.com: ctr = 0.97
  Pos 2: www.myspace.com: ctr = 0.11
Ranking 2:
  Pos 1: www.myspace.com: ctr = 0.97

[1] Olivier Chapelle et al. A Dynamic Bayesian Network Click Model for Web Search Ranking

Procedure
For learning a web search function, clicks can be used as a target [2] or as a feature [3].
• Use of click data as a target: useful for markets with few editorial judgments.
• Train on pairwise preferences from two sets: PE from editorial judgments and PC from click modeling.
• Minimize a pairwise loss over both preference sets, PE and PC.

Target
1. Derive preference relations from the click pattern and feed them as constraints to the ranker (Rocky-Road):
   • Position and order-of-click based constraints [4]
   • Aggregate constraints

Feature
1. Sample clicked ads and label them as relevant.
2. Types of sampling:
   • Random
   • Position-based weighted: a user clicking the ml-4 ad is a stronger signal of relevance than a user clicking ml-1
3. Feed them to the binary classifier.

[2] Joachims. Optimizing Search Engines using Clickthrough Data
[3] Agichtein et al. Improving Web Search Ranking by Incorporating User Behavior Information
[4] Joachims et al. Accurately Interpreting Clickthrough Data as Implicit Feedback

Results
Fisher Score = (μ1 − μ2) / √(σ1² + σ2²)

Fisher Score, by match type (EXACTMATCH, BROADMATCH, PHRASEMATCH, SMARTMATCH):
• Sampling: +0.39%, +1.02%
• Position and Order Constraints: +1.22%, +5.93%, +4.15%, +0.38%
• Aggregate Constraints: +0.2%, +5.17%, +0.77%, +0.5%
• Row not recoverable: -0.06%; -0.5% (SMARTMATCH)

Log Loss (Label Based):
                                  SAME     SUPERSET   DISJOINT
Sampling                          +5.72%   +4.22%     -6.28%
Position and Order Constraints    +3.1%    +2.28%     -3.9%
Aggregate Constraints             +5.28%   +7.4%      -11.3%

Weighted LL:
Sampling                          +0.001%
Position and Order Constraints    +3.07%
Aggregate Constraints             +1.75%

Background on Click Models
• Use CTR (click-through rate) data.
• Pr(click) = Pr(examination) × Pr(click | examination), where Pr(click | examination) is the relevance.
• User browsing models are needed to estimate Pr(examination).

Notation
• Φ(i): result at position i
• Examination event: Ei = 1 if the user examined Φ(i), 0 otherwise
• Click event: Ci = 1 if the user clicked on Φ(i), 0 otherwise

Examination Hypothesis
Richardson et al., WWW 2007:
Pr(Ci = 1) = Pr(Ei = 1) Pr(Ci = 1 | Ei = 1)
• αi = Pr(Ei = 1): position bias
  • Depends solely on position.
  • Can be estimated by looking at the CTR of the same result in different positions.
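
As a rough illustration of the last point: under the examination hypothesis, the CTR of an ad at position i factors as αi times the ad's relevance, so the ratio of one ad's CTRs at two positions cancels the relevance and leaves αi / αj. The Python sketch below averages these ratios across ads. The function name `relative_position_bias`, the input layout, and the CTR numbers are all made up for illustration and do not come from the project.

```python
from collections import defaultdict


def relative_position_bias(ctr_by_ad_and_position, reference_position=1):
    """Estimate alpha_i / alpha_ref for every position i.

    ctr_by_ad_and_position maps an ad id to {position: observed CTR}.
    Only ads that were also shown at the reference position contribute,
    because the ratio needs the same ad at both positions.
    """
    ratios = defaultdict(list)
    for ad, ctr_by_pos in ctr_by_ad_and_position.items():
        ref_ctr = ctr_by_pos.get(reference_position)
        if not ref_ctr:  # ad never shown at the reference position, or zero CTR
            continue
        for pos, ctr in ctr_by_pos.items():
            # CTR(ad, pos) / CTR(ad, ref) = alpha_pos / alpha_ref (relevance cancels)
            ratios[pos].append(ctr / ref_ctr)
    # Average the per-ad ratios for each position.
    return {pos: sum(v) / len(v) for pos, v in sorted(ratios.items())}


# Hypothetical CTRs for two ads observed at several positions.
observed = {
    "ad_a": {1: 0.20, 2: 0.10, 3: 0.05},
    "ad_b": {1: 0.10, 2: 0.05},
}
bias = relative_position_bias(observed)
print(bias)
```

Averaging ratios is only one simple choice; it ignores how many impressions back each CTR, so a weighted estimate would likely be preferable on real logs.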
Using Prior Clicks
R1 R2 R3 R4 R5 (click on R1): Pr(E5 | C1) = 0.3
R1 R2 R3 R4 R5 (clicks on R1 and R3): Pr(E5 | C1, C3) = 0.5

Examination depends on prior clicks
• Cascade model
• Dependent click model (DCM)
• User browsing model (UBM) [Dupret & Piwowarski, SIGIR 2008]
  • More general and more accurate than Cascade and DCM.
  • Conditions Pr(examination) on the closest prior click.
• Bayesian browsing model (BBM) [Liu et al., KDD 2009]
  • Same user behavior model as UBM.
  • Uses the Bayesian paradigm for relevance.

User browsing model (UBM)
• Use the position of the closest prior click to predict Pr(examination).
Pr(Ei = 1 | C1:i-1) = αi βi,p(i)
  αi: position bias
  p(i): position of the closest prior click
Pr(Ci = 1 | C1:i-1) = Pr(Ei = 1 | C1:i-1) Pr(Ci = 1 | Ei = 1)
• Prior clicks don’t affect relevance.

Other Related Work
• Examination depends on prior clicks and prior relevance
  • Click chain model (CCM)
  • General click model (GCM)
• Post-click models
  • Dynamic Bayesian model
  • Session utility model

User Browsing in Sponsored Search
• Is user browsing in sponsored search similar to browsing in web search?
• The usual assumption in organic search is that users examine and click in a linear top-to-bottom fashion.
• We observed that in sponsored search, where few results are returned, a fair share (~30%) of users click out of order.
• Users behaving in a non-linear fashion is a strong signal, which may contain important information.
• We combine the position and temporal behavior of the user. The statistic x that is counted is the difference between the positions of successive clicks in time order.
Example: if the user clicks on ml1 and then ml2, then x = -1; if ml2 and then ml1, then x = 1; and so on.

A New Model
• Allow users to move in a non-linear fashion.
• Also incorporate the notion of externalities, i.e. perceived relevance changes with other clicks.
For learning the parameters, we can use the EM algorithm:
(1) In the E step, we estimate the hidden variables with a forward-backward algorithm.
(2) In the M step, we have closed-form solutions that maximize the expected log-likelihood.
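
The click-order statistic x that motivates the non-linear model can be computed from click logs along the following lines. This is a minimal Python sketch, assuming each session is a list of clicked mainline positions in time order (ml1 as 1, ml2 as 2, and so on). The function names, the sample sessions, and the rule that a session counts as "out of order" when any x > 0 are assumptions made for illustration, not the project's definitions.

```python
from collections import Counter


def click_order_differences(click_positions):
    """Return x for each pair of successive clicks: earlier position minus later position.

    Matches the slide's example: [1, 2] gives [-1] and [2, 1] gives [1].
    """
    return [a - b for a, b in zip(click_positions, click_positions[1:])]


def out_of_order_fraction(sessions):
    """Fraction of multi-click sessions with at least one upward move (x > 0).

    Assumption: a session is 'out of order' if the user ever clicks a
    higher-placed ad after a lower-placed one.
    """
    multi = [s for s in sessions if len(s) > 1]
    if not multi:
        return 0.0
    out = sum(any(x > 0 for x in click_order_differences(s)) for s in multi)
    return out / len(multi)


# Hypothetical sessions: clicked positions in the order the clicks happened.
sessions = [[1, 2], [2, 1], [1, 3, 2], [1]]

# Histogram of x over all successive click pairs.
histogram = Counter(x for s in sessions for x in click_order_differences(s))
print(histogram)
print(out_of_order_fraction(sessions))
```

Single-click sessions carry no ordering information, so the sketch leaves them out of the denominator; whether the ~30% figure was computed this way is not stated in the slides.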