Cassini: A Simulation Framework for Evaluating Designs for Sponsored Search Markets
Soam Acharya, Prabhakar Krishnamurthy, Ketan Deshpande, Tak W. Yan, Chi-Chao Chang
Yahoo! Inc.
2821 Mission College Boulevard
Santa Clara, CA 95054
Topics
• Overview/Motivation
• Requirements
• Architecture
• Methodology
• Applications/Results
• Future Directions
Cassini Overview
• What is it?
– Discrete Event Simulation System
• Supports simulations of different marketplace designs,
policies and technologies
• Provides rapid assessment of
revenue-per-search (RPS), click-through-rate
(CTR) and cost-per-click (CPC) impact
– Compare % change vs. a baseline
– Other metrics calculated depend on specific
experiment
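The headline metrics and the comparison against a baseline can be illustrated with a minimal Python sketch. The names `RunTotals`, `metrics` and `pct_change` and all counts are hypothetical, not taken from Cassini. CTR is assumed here to be clicks per ad impression.

```python
from dataclasses import dataclass


@dataclass
class RunTotals:
    """Aggregate counts from one simulation run (hypothetical structure)."""
    searches: int
    impressions: int
    clicks: int
    revenue: float


def metrics(t: RunTotals) -> dict:
    """Compute RPS, CTR and CPC from run totals."""
    return {
        "rps": t.revenue / t.searches,   # revenue per search
        "ctr": t.clicks / t.impressions, # assumed: clicks per ad impression
        "cpc": t.revenue / t.clicks,     # cost per click
    }


def pct_change(experiment: dict, baseline: dict) -> dict:
    """Percentage change of each metric relative to the baseline run."""
    return {k: 100.0 * (experiment[k] - baseline[k]) / baseline[k] for k in baseline}


# Made-up totals for a baseline run and an experimental run.
baseline = metrics(RunTotals(searches=1000, impressions=4000, clicks=200, revenue=100.0))
experiment = metrics(RunTotals(searches=1000, impressions=4000, clicks=220, revenue=121.0))
delta = pct_change(experiment, baseline)
```

With these made-up totals, RPS rises by about 21% and CTR and CPC by about 10% each relative to the baseline.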
Contribution
• General purpose sponsored search
auction simulator
– Built-in support
• Auction structure, ranking, and payment
policies, budgets
• Advertisers, campaigns, bids
• User click model
• Search events
– Extensible, modular architecture
Motivation
• Alternative: live tests
• Problems
– Expensive
– Time consuming
• Preparation, SLAs
• Must run long enough for statistical significance
– Incomplete
• Not possible to explore all aspects of marketplace
– E.g., long-term effects on advertisers
Requirements for a Simulation
Framework
• Mimic Sponsored Search Auction mechanisms
– Ranking, budgeting, pricing
• User behavior
– Click model
– Use actual log traces as input
• Advertiser behavior
– Advertiser action controls
• Performance
– Process large quantities of data
– Need to complete large numbers of runs quickly
• Others:
– Extensible
– Support for market mechanisms
Overall Architecture
[Diagram: components shown are Query Trace, Ad Server, Ad Information, External Ad Ranker, Ad Information, Budget Filtering, Ranking, Pricing, Offline Click Model Generation, Click Model, Click Generator, Budget & Advertiser Management, YSM Impression & Click Logs, Metric Computation, Output DB, Simulation Log Output]
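The flow through the architecture components (budget filtering, ranking, pricing, click generation, budget management) can be sketched for a single query as follows. This is an illustration under stated assumptions, not Cassini's implementation. Ranking by bid times a per-ad click propensity, generalized second-price pricing, and the position factors are all assumptions, and `Ad`, `run_auction` and `POSITION_FACTOR` are hypothetical names.

```python
import random
from dataclasses import dataclass


@dataclass
class Ad:
    advertiser: str
    bid: float      # maximum cost per click
    quality: float  # assumed click propensity of the ad, in (0, 1]


# Assumed position effects for the click model (hypothetical values).
POSITION_FACTOR = [1.0, 0.6, 0.4]


def run_auction(ads, budgets, rng, slots=3):
    """Process one query; return a list of (advertiser, position, clicked, price)."""
    # 1. Budget filtering: drop ads whose advertiser cannot afford its own bid.
    eligible = [a for a in ads if budgets[a.advertiser] >= a.bid]
    # 2. Ranking: order by bid * quality (an assumed ranking policy).
    ranked = sorted(eligible, key=lambda a: a.bid * a.quality, reverse=True)
    results = []
    for pos, ad in enumerate(ranked[:slots]):
        # 3. Pricing: generalized second price (assumed), i.e. the lowest bid
        #    that would still keep this position against the next-ranked ad.
        if pos + 1 < len(ranked):
            nxt = ranked[pos + 1]
            price = min(ad.bid, nxt.bid * nxt.quality / ad.quality)
        else:
            price = 0.0  # no competitor below; a reserve price would go here
        # 4. Click generation: Bernoulli draw from a static click model.
        clicked = rng.random() < ad.quality * POSITION_FACTOR[pos]
        # 5. Budget management: charge the advertiser only on a click.
        if clicked:
            budgets[ad.advertiser] -= price
        results.append((ad.advertiser, pos, clicked, price))
    return results


# Made-up ads and budgets; advertiser "d" cannot afford its bid and is filtered out.
ads = [Ad("a", 1.00, 0.10), Ad("b", 0.80, 0.20), Ad("c", 0.50, 0.10), Ad("d", 2.00, 0.05)]
budgets = {"a": 5.0, "b": 5.0, "c": 5.0, "d": 1.0}
results = run_auction(ads, budgets, random.Random(0))
```

Keeping each stage as a separate step mirrors the modular layout of the diagram: any one policy (ranking, pricing, click model) could be swapped without touching the others.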
Methodological Issues
• Sponsored search auctions are complex
– Advertisers adapt to events and outcomes
– Users adapt to market structure and policies and
auction outcomes
– Advertiser budgets introduce dependencies
between markets
• Input and event space is multidimensional
with interactions
– Simulation of joint distribution can be too time
consuming
Approach
• Simplifying assumptions in current version of Cassini
– Advertiser actions are at equilibrium
– Static user click model
– Each auction is independent
• Except when budget management designs are being evaluated
• Approach
– Take samples of actual historical search traffic
– Focus on only the most significant sources of variation in traffic
• Weekend vs. weekday traffic
• Samples from different times in history
• Sampling
– Using full-day traffic for simulation is infeasible
– Random sample of searches works well except with budgets
• Best option: ignore budgets unless they are the focus of experimentation
• A very small proportion of traffic can provide reasonably good estimates of RPS (revenue per search)
• With budgets, estimates are biased upwards
– Otherwise, reasonably small (almost) closed micro-markets can be used
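The random-sampling idea can be illustrated with a short Python sketch that estimates RPS from a small uniform sample of searches and reports a standard error. The function `estimate_rps` and the synthetic traffic are hypothetical. Budgets are ignored here, so the upward bias noted for the budget case is not modeled.

```python
import random
import statistics


def estimate_rps(revenue_per_search, fraction, rng):
    """Estimate RPS from a uniform random sample of searches.

    Returns (estimate, standard_error).
    """
    n = max(2, int(len(revenue_per_search) * fraction))
    sample = rng.sample(revenue_per_search, n)  # sampling without replacement
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / n ** 0.5
    return mean, stderr


rng = random.Random(7)
# Synthetic per-search revenue: most searches earn nothing (made-up data).
traffic = [rng.choice([0.0, 0.0, 0.0, 0.25, 0.5]) for _ in range(10_000)]
estimate, stderr = estimate_rps(traffic, 0.01, rng)  # 1% of the traffic
true_rps = statistics.fmean(traffic)
```

Comparing `estimate` with `true_rps` across several sampling fractions is one simple way to judge how small a sample remains usable for a given traffic trace.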
Micro-market Sampling
• A micro-market is a collection of accounts and keywords such that all spend due to these accounts and keywords occurs within the collection
• Run simulations with multiple micro-markets
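One way to find fully closed collections is to treat accounts and keywords as nodes of a bipartite graph, with an edge for each bid, and take its connected components. The union-find sketch below assumes that approach. It is not necessarily how Cassini selects micro-markets, and the name `micro_markets` and the example bids are hypothetical.

```python
from collections import defaultdict


def micro_markets(bids):
    """Group accounts and keywords into closed collections.

    `bids` is an iterable of (account, keyword) pairs. Two nodes land in the
    same collection when a chain of bids connects them, so no account in a
    collection bids on a keyword outside it.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for account, keyword in bids:
        a, k = find(("account", account)), find(("keyword", keyword))
        if a != k:
            parent[a] = k  # merge the two components

    groups = defaultdict(lambda: {"accounts": set(), "keywords": set()})
    for node in list(parent):
        kind, name = node
        groups[find(node)][kind + "s"].add(name)
    return list(groups.values())


# Made-up bids: acct1 and acct2 share a keyword, acct3 is isolated.
markets = micro_markets([
    ("acct1", "shoes"),
    ("acct2", "shoes"),
    ("acct2", "boots"),
    ("acct3", "laptops"),
])
```

On real bid data, exact components may be too large to be useful, which is presumably why "almost" closed micro-markets are of interest; this sketch handles only the exact case.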
Applications of Cassini at Yahoo!
• Screened candidate ranking
algorithms for live testing
• Evaluated different design options for
matching algorithms
• Estimated the potential of budget
optimization
• Others
Evaluation of Matching Algorithms
[Figure: Impact of Matching Methods on % CTR Change from Baseline. X-axis: % Ads Dropped (0% to 100%). Y-axis: % CTR change from baseline (-1% to 6%). Series: Method 1, Method 2, Method 3.]
Cassini – Future Directions
• Advertiser bidding agent
– Support automated, adaptive bidding agent
– Allow different bidding strategies to be implemented
– Bidding languages
• Scale to full traffic
• Open interface (other groups within Yahoo)
• Self-service architecture
Backup
Related Work
• Simulations of sponsored search auction
designs
– Feng, Bhargava, Pennock; Kitts, LeBlanc
• Simulations of other types of auctions
– Yankee Auctions (Bapna, Goes, Gupta); FCC
Spectrum Auctions (Csirik et al.)
• Bidding Agents, Bots
– Wurman, Wellman, and Walsh; Jennings; Powell
Design Decisions
• Query driven metaphor
• Allow collaboration:
– leverage models from other groups
• Multiple iterations for the same set of inputs
• Maintain as much state as possible:
– new metrics easily computed
• Generate as much state as possible
– Copious quantities of log files
– Turn off for performance
Advertiser Actions
• Static
• Adjust bids, budgets
• Target
– Individual advertisers
– Groups
• Predefined
• Randomly pick a certain percentage of
advertisers within each group
– E.g., select 25% of advertisers in cluster E
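A stochastic advertiser action of this kind can be sketched as follows: pick a given percentage of advertisers in each cluster at random and raise their bids by a randomly drawn percentage. The function `apply_bid_increase`, the normal distribution for the increase, and the example bids are assumptions for illustration.

```python
import random


def apply_bid_increase(bids_by_cluster, settings, rng):
    """Return new bids after a stochastic bid increase.

    bids_by_cluster: {cluster: {advertiser: bid}}
    settings: {cluster: (proportion_pct, mean_pct, std_dev_pct)}
    """
    new_bids = {}
    for cluster, bids in bids_by_cluster.items():
        proportion, mean, std_dev = settings[cluster]
        advertisers = sorted(bids)
        # Number of advertisers to act on in this cluster.
        k = round(len(advertisers) * proportion / 100)
        chosen = set(rng.sample(advertisers, k))
        new_bids[cluster] = {}
        for adv, bid in bids.items():
            if adv in chosen:
                # Percentage increase; the normal draw is an assumption.
                increase = rng.gauss(mean, std_dev)
                bid = max(0.0, bid * (1 + increase / 100))
            new_bids[cluster][adv] = bid
    return new_bids


# Select 25% of advertisers in cluster E; raise their bids by 10% (std dev 0).
bids = {"E": {"e1": 1.0, "e2": 2.0, "e3": 3.0, "e4": 4.0}}
settings = {"E": (25, 10, 0)}
new_bids = apply_bid_increase(bids, settings, random.Random(3))
```

The same pattern would apply to budget changes by substituting budgets for bids.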
Cassini Implementation Notes
• # of lines:
– 22K lines of C++, Perl, shell, SQL
– 56K lines of C++ libraries
• Performance
– Single instance per machine
– 80K unique queries over one day
• Several million queries
• Order of hours
– Capacity: memory bound
– Speed: Disk I/O bound
Advertiser Actions Examples
Setting 1
         Bid Increase %               Budget Increase %
Cluster  Proportion  Mean  Std Dev    Mean  Std Dev
A        85          18    4          10    4
B        85          8     2          5     2
C        85          5     2          10    4
D        85          5     2          5     4
E        85          5     2          5     2
F        85          0     2          10    4
G        85          5     2          5     2
H        85          5     2          0     2
I        85          18    4          10    4
J        85          24    5          5     2
K        85          5     2          0     2
UNK      40          10    4          5     2

Setting 2
         Bid Increase %               Budget Increase %
Cluster  Proportion  Mean  Std Dev    Mean  Std Dev
A        100         30    0          10    0
B        100         15    0          5     0
C        100         10    0          10    0
D        100         10    0          5     0
E        100         10    0          5     0
F        100         0     0          10    0
G        100         10    0          5     0
H        100         10    0          0     0
I        100         30    0          10    0
J        100         30    0          5     0
K        100         10    0          0     0
UNK      100         10    0          5     0
Simulation Setup
• Inputs
– Bid Landscape
• Accounts, ads, bids
• Budgets
– Other
• User click model
• Advertiser actions
• Events
– Search
– Clicks
– Advertiser actions
• Bid and budget changes (stochastic)
• Calibration
What do we want to use simulation for?
Design Exploration
• Use reference simulation run to verify “invariants”:
data and parameters
– Similar to production set-up whose performance is well understood
– Compare to actual performance over a number of data
samples
• Comparison to bucket tests
Future Directions
• Open interfaces for click model, ranking algo,
matching algo, etc.
– New click models
• Self-service – ease of use, user interface, job
management
– Leverage work from Yahoo Pipes, other log/data processing
groups.
• Better analysis support – pre- and post- simulation
analysis
– See above