Intro to Computational Advertising - Courses

advertisement
Introduction to Computational
Advertising
ISM293
University of California, Santa Cruz
Spring 2009
Instructors: Ram Akella, Andrei Broder, and Vanja
Josifovski
1
This lecture
Logistics of the course
Introduction & Overview of Computational Advertising
1.
2.
Classic advertising
Web advertising
y
y
y
y
y
The marketplace size
The actors: Publishers, Advertisers, Users, & “Ad agency”
Advertising as information
Trends in Computational Advertising
y
y
y
y
y
y
y
3.
2
Graphical ads
Sponsored search
Content match
Connection with recommender systems
Mobile advertising
Open markets
Summary
Instructor
y Prof. Ram Akella
y Professor Information Systems
and Technology Management
at University Of California Santa Cruz
y Ph.D., EECS, Indian Institute of Science, Bangalore, India, 1976-1982
y B.S., EE, Indian Institute of Technology, Madras, India, 1971-1976
y http://www.soe.ucsc.edu/~akella/
y akella@soe.ucsc.edu
3
Instructor
y Dr. Vanja Josifovski
y Principal Research Scientist
y
y
y
y
y
4
at Yahoo! Research.
Research Area: Computational Advertising
Previously at IBM Research working on databases and enterprise
search
M.Sc. from University of Florida, PhD from Linkopings University in
Sweden.
vanjaj@yahoo-inc.com
http://research.yahoo.com/bouncer_user/88
Instructor
y Dr. Andrei Broder
y Fellow and Vice President
y
y
y
y
y
5
for Search & Computational Advertising
in Yahoo! Research.
Chief Scientist of Yahoo’s Advertising Technology Group.
Research interests: computational advertising, web search, contextdriven information supply, and randomized algorithms.
B. Sc. Summa cum Laude from Technion, M.Sc. and Ph.D. in
Computer Science at Stanford University under Don Knuth.
broder@yahoo-inc.com
http://research.yahoo.com/bouncer_user/17
Contacts
y Course Website:
y www.soe.ucsc.edu/classes/ism293/Spring09/
y TAs:
y Bin An ban@soe.ucsc.edu
y Chunye Wang cwang@soe.ucsc.edu
y Course Group Email:
y Ism293-group@soe.ucsc.edu
y Subscribing to ISM293-group email (enrolled students only)
http://www.soe.ucsc.edu/mailman/listinfo/ism293-group
y Lecture: 6-9:30 PM Wed.
y Office Hours:
y 5~6 PM Wednesdays, by appointment only!
y Room: TBD
y For now, please use the course group email to reach the
instructors and TAs
6
Lecture Content
1. Introduction and Overview
2. Information Retrieval for Computational Advertising
3. Marketplace Design
4. Machine Learning Techniques
5-6.Sponsored Search
7. Content Match
8 Graphical Ads and Guaranteed Delivery
9. Behavioral Targeting
10. Overflow lecture: catch up, or optional topics:
y
y
7
Emerging Adverting Media or
Evaluation of On-line Advertising system
General lecture structure
y Overview: 1~1.5 hours
y In depth: 1 hour
y Discussion: 20~30 Minutes
8
Boot camp
y Boot camp is twice per week, 5:00 to 6:00 PM
Mon. and Wed. in order to help students fill in
gaps, expand on some topics, & understand the
course better. We initially want to cover Linear
Algebra, the Lemur toolkit, and Elementary
Statistics and Probability
y Generally, TAs will lead the boot camps.
y Prof. Ram Akella will lead some boot camps as
well.
9
Homework
y One homework per week
y Based on reading research paper, typically
answer a question in the following style:
y Why does the algorithms in the paper work?
y Try to extend the idea in the paper from a different
view of angle?
y How to modify it for a new scenario?
y Some homeworks will be conventional exercise
style.
10
Project
y There will be one course project divided into
several mini projects. Examples:
y Crawling Ads, Indexing, and so on (TBD)
11
Exam
y Two quizzes or one midterm (TBD)
y One final exam
12
Grading
y Homework
y 20%
y Exam
y 40%
y Project
y 40%
13
Questions?
y We welcome suggestions about all
aspects of the course!
y E-mail to
Ism293-group@soe.ucsc.edu
14
Introduction to Computational
Advertising
15
Disclaimers
y This talk presents the opinions of the authors. It
does not necessarily reflect the views of Yahoo!
Inc or any other entity
y Algorithms, techniques, features, etc mentioned
here might or might not be in use by Yahoo! Or
any other company
16
Computational advertising – main
challenge
Find the "best match" between a
given user in a given context and a
suitable advertisement.
y Examples
y Context = Web search results
y Context = Publisher page
Sponsored search
Content match,
banners
y Other contexts: mobile, video, newspapers, etc
17
What is “Computational
Advertising”?
y New scientific sub-discipline, at the intersection of
y Large scale search and text analysis
y Information retrieval
y Statistical modeling
y Machine learning
y Classification
y Optimization
y Microeconomics
y Recommender systems
18
Establishing a new discipline…
19
Establishing a new discipline…
20
Key messages
1.
2.
Computational advertising = A principled way to find the
"best match" between a given user in a given context and a
suitable advertisement.
The financial scale for computational advertising is huge
⇒ Small constants matter
⇒ Expect plenty of further research
3.
Advertising is a form of information.
⇒ Adding ads to a context is similar to the integration problem of other
types of information
⇒ Finding the “best ad” is a type of information retrieval problem with
multiple, possibly contradictory utility functions
4.
New application domains and new techniques are
emerging every day
⇒ Good area for research + new businesses
21
Classic Advertising
22
Brand advertising
Goal: create a distinct favorable image
23
Direct marketing
Advertising that involves a "direct response”: buy,
subscribe, vote, donate, etc, now or soon
24
Long history….
Japan ,1806
25
USA,1890
Lots of computational this and
that …
y
y
y
y
y
y
y
y
y
Computational Biology
Computational Chemistry
Computational Finance
Computational Geometry
Computational Neuroscience
Computational Physics
Computational Mechanics
Computational Economics
…
All are about mixing an old science with large scale
computing capabilities
26
What’s computational about it?
y Classical:
y Relatively few venues – magazines, billboards,
y
y
y
y
newspapers, handbills, TV, etc
High cost per venue ($3Mil for a Super Bowl TV ad)
No personalization possible
Targeting by the wisdom of ad-people
Hard to measure ROI
y Computational – almost the exact opposite:
y
y
y
y
y
27
Billions of opportunities
Billions of creatives
Totally personalizable
Tiny cost per opportunity
Much more quantifiable
Revenue flow basics
y What do advertisers pay?
y CPM = cost per thousand impressions
y Typically used for graphical/banner ads (brand advertising)
y Could be paid in advance
“Guaranteed delivery”
y CPC = cost per click
y Typically used for textual ads
y CPT/CPA = cost per transaction/action a.k.a. referral fees or affiliate
fees
y Typically used for shopping (“buy from our sponsors”), travel, etc.
y … but now also used for textual ads (risk mitigation)
y What do publishers get?
y Whatever advertisers pay minus rev-share (revenue-share) paid to
intermediaries
y What do intermediaries get?
y Whatever advertisers pay minus TAC (traffic acquisition costs) paid to
publishers
28
US Online Advertising Spending
29
Still growing (15% FH 2008 vs.
FH 2007)
Source: IAB
30
US Online vs. Offline
advertising spend
31
The advertising $$ budget vs. the
human time budget
32
Computational Advertising
Landscape
33
Graphical ads
34
Graphical ad
More graphical ads
35
Textual ads
Ads driven by search keywords –
“sponsored search” (a.k.a. “keyword
driven ads”, “paid search”, etc)
2. Ads directly driven by the content of a
web page – “context match” (a.k.a.
“context driven ads”, “contextual ads”,
etc)
Textual ads are heavily related to Search
and IR
1.
36
Sponsored search:
Text-based ads driven by a keyword search
37
More sponsored search:
Text-based ads driven by a keyword search
38
Content match
39
The actors: Publishers, Advertisers, Users, &
“Ad agency”
Advertisers
Publishers
Ad agency
Users
40
Dual roles
Advertisers
Publishers
Ad agency
Users
y Sponsored search:
y Pub = AA (Yahoo!, Google)
y Content match:
y Pub = AA (Yahoo! content)
y Pub = Adv (“House Ads”)
41
Advertising as information
y “I do not regard advertising as entertainment or
an art form, but as a medium of information….”
[David Ogilvy, 1985]
y “Advertising as Information” [Nelson, 1974]
y Irrelevant ads are annoying; relevant ads are
interesting
y Vogue, Skiing, etc are mostly ads and advertorials
42
Finding the “best ad” as an Information
Retrieval (IR) problem
Analyze the “query” and extract queryfeatures
1.
•
Query = full context (content, user,
environment, etc)
Analyze the documents (= ads) and
extract doc-features
3. Devise a scoring function = predicates
on q-features and d-features + weights
4. Build a search engine that produces
quickly the ads that maximize the scoring
function
2.
43
Behind the scenes…
Setting the ad retrieval problem:
Ads corpus =
Bid phrase(s) + Title + Creative + URL + Landing Page + …
Query features =
Search Keywords + Outside Knowledge Expansion + Context features
Context features (for sponsored search) =
Location + User data + Previous searches + …
Context features (for context match) =
Location + User data + Page topic + Page keywords …
Search problem similar to web search, but
y Ad database is smaller
y Ad database entries are “small pages” [+ URL]
y Ranking depends also on bids
y Ranking depends also on click-through-rate
44
What is the best match?
y An ad has different utilities for publishers,
advertisers, users
y Quality (utility) Factor (QF) is different
y A-QF, U-QF, P-QF
y The ad agency has its own economic interest
y Might have different types of ads that are not
easily compared
y Might have economics/contractual obligations
that need to be fulfilled.
45
A bit deeper: how does the
matching happens?
46
A Semantic Approach to Contextual Advertising
[SIGIR 2007]
[AB, M. Fontoura, V. Josifovski, & L. Riedel]
y What is more important: the words or the context?
y Contextual ad matching based on a combination of
semantic and syntactic features.
y Classify both ads and pages into a 6000 nodes commercial
taxonomy
y The class information captures the “about-ness” of pages
and ads
47
Matching ads and pages
winter
sports
skiing
JOE’S SKIING BLOG
….
My Atomic skis floated on
the fresh powder…
….pages
48
snowboarding
(imperfect) match
Atomic boards
75% off!!
Semantic and syntactic score
Semantic component – weighted taxonomy distance
Syntactic component - term vector cosine
49
Final score
50
Results
51
The crystal ball chapter: Trends in
Computational Advertising
A very biased sample
52
The Recommender Systems
Connection
53
The ad matching problem as a
recommendation challenge
y The traditional IR approach is based on a fixed
query
results correspondence
y For ads we
y Need CTR probability or user utility rather than
top-K results
y Have a continuous click-through feedback
y Challenge: incorporate the feedback
1.Long term loop: improve the ranking function
y
ML based ranking
2.Short term loop: use the statistics we have for a
particular (query, ads) pair
y
54
Closest to recommender systems
A generic view – dyadic
interaction systems
y D. Agarwal, B.C. Chen “Feature based factorization
model for dyadic data” [In preparation]
y Dyadic Interaction data pervasive
y Recommendation systems (user-movie, user-music,
user-book)
y Web advertising (match ads to webpage/query)
y Content Optimization (match articles to users)
y Unit of measurement : dyad
(i, j)
y i= user, webpage,..; j= movies, ads,…
y Measure some response: ratings, click-rates,…
y Often have meta-data on dyadic elements
y Demographics, genres,….
y Goal: predict response for unknown dyads
y Better match-making, prediction
55
Challenge: clicks are very rare
y Need to aggregate clicks/recommendations
y Pages belong to site section to sites
Page ∈ CNN/sports ∈ CNN
y Ads belong to Campaign to Advertisers Ad ∈ Ford
Focus ∈ Ford
y Similar to preference estimation for
Annie Hall ∈ Woody Allen Comedies ∈
Comedies
y D. Agarwal, A. B, D. Chakrabarti, D. Diklic, V. J., and
M. Sayyadian. Estimating rates of rare events at
multiple resolutions. [KDD, 2007]
56
Challenge: how to find new ads?
y Similar to music recommender systems: need to
explore new songs but need to keep some
similarity
y Can take advantage of semantic taxonomy
y S. Pandey, D. Agarwal, D. Chakrabarti, and V. Josifovski.
Bandits for taxonomies: A model-based approach. [SDM,
2007]
57
Mobile advertising
58
How popular is mobile advertising?
(Limbo’s Mobile Advertising Report, US Users, 4th
quarter 2007)
59
Specific features (and targeting
dimensions) for mobile advertising
y High precision demographics
y Location
y Geo (GPS or tower triangulation)
y Location functionality: Shopping mall, at home, airport, etc
y Personal context: working, on vacation, etc
y Social context
y Alone, with colleagues, with family, with friends, travelling, …
y
y
y
y
y
y
y
y
60
Language
Handset capabilities
Bandwidth
Operator
Time of day
Speed (walking vs. driving)
Recent history
Etc
What can change the game?
y Instant couponing
y One time use coupons sent via SMS/MMS to
handset
y Handset as contactless credit/cash card
y Interaction with point-of-sale via optical readers
(one way), Bluetooth (two way), or wireless
network
y Direct purchase (tickets, take-out food, etc)
y Prevalent high precision GPS
y I-Phone, G1-Phone, …
y High res touch screen, closer to web experience
y Lots of things we haven’t thought about …
61
What can change the game?
Inference from noisy data
y What can you infer from lots of people reporting
their position every few seconds?
y
y
y
y
y
62
Real time traffic …
Opening hours …
Night clubs going out of fashion …
Very accurate life-style predictions
Interesting gossip ☺
Marketplace
63
Advertiser agencies:
The Open Exchange
Bids $0.75 via Network…
Bids $0.50
Bids $0.60
Ad.com
AdSense
Bids $0.65—WINS!
Has ad
impression
to sell -AUCTIONS
Transparency and value
64
… which becomes
$0.45 bid
So many topics, so little time…
y Publishers & media
y Internet Protocol Television (IPTV)
y Variable-data printing (VDP) newspapers
y …
y Users
y
y
y
y
Fluid legal landscape wrt regulation, privacy, etc.
As usual willing to trade privacy for functionality
OpenSocial, Universal Profile, etc
…
y Advertisers
y Move towards performance advertising
y Fine granularity ROI computation, analytics
y See R. Lewis & D. Reiley “Measuring the Effects of Advertising on Sales
via a Controlled Experiment on Yahoo!”
y …
y Advertising agencies
y
y
y
y
65
Market becoming more efficient (arbitrage, exchange)
Less is more – manage user attention
Single user multi-media campaigns
…
Summary
66
Key messages
1.
2.
Computational advertising = A principled way to find the
"best match" between a given user in a given context and a
suitable advertisement.
The financial scale for computational advertising is huge
⇒ Small constants matter
⇒ Expect plenty of further research
3.
Advertising is a form of information.
⇒ Adding ads to a context is similar to the integration problem of other
types of information
⇒ Finding the “best ad” is a type of information retrieval problem with
multiple, possibly contradictory utility functions
4.
New application domains and new techniques are
emerging every day
⇒ Good area for research + new businesses
67
Thank you!
broder@yahoo-inc.com
vanjaj@yahoo-inc.com
http://research.yahoo.com
68
68
This talk is Copyright 2009.
Authors retain all rights, including copyrights and
distribution rights. No publication or further
distribution in full or in part permitted without
explicit written permission
69
Download