Privad Overview and Private Auctions

Paul Francis (MPI-SWS)
Ruichuan Chen (MPI-SWS)
Bin Cheng (NEC Research)
Alexey Reznichenko (MPI-SWS)
Saikat Guha (MSR India)

Can we replace current advertising systems with one that is private enough, and targets at least as well?

"Replace current advertising systems": follows today's business model
• Advertisers bid for ad space, pay for clicks
• Publishers provide ad space, get paid for clicks
• Deal with click fraud
• Scales adequately

"Private enough":
• Most users don't care about privacy
• But privacy advocates do, and so do governments
• Privacy advocates need to be convinced

Our approach:
• "As private as possible"
• While still satisfying other goals
• Hope that this is good enough

A principle: Increased privacy begets better targeting

Today's advertising model (simplified)

[Diagram: Client, Trackers, Publishers, Advertisers, Broker (Ad exchange)]

• Trackers track users, compile user profile
• Trackers may share profiles with advertisers?
• Client gets webpage with adbox
• Client tells broker of page
• Broker launches auction (for given user visiting given webpage ...). Also handles click fraud etc.
• (Alternatively, the publisher could have launched the auction)
• Advertisers present bids and ads
• Broker picks winners, delivers ads
• User waits for this exchange
• Various reporting of results ...

Privad Basic Architecture

[Diagram: Publishers, Advertisers, Broker, Dealer, Clients (U, SA); "A" marks an ad]

• Learn interest in tennis shoes
• Anonymous request for tennis shoes
• Relevant and non-relevant ads stored locally

Chan: {Interest, Region, Language}
Ad: {AdID, AdvID, Content, Targeting, ...}
Key K unique to this request
• Dealer knows Client requests some channel
• Broker knows some Client requests this channel
• Dealer cannot link requests

ICCCN 2010

• Webpage with adbox
• Ad is delivered locally
  – Minimal delay
  – May or may not be related to page context
• View or click is reported to Broker via Dealer

Report: {AdID, PubID, EvType}
• Dealer learns client X clicked on some ad
• Broker learns some client clicked on ad Y
• At Broker, multiple clicks from same client appear as clicks from multiple clients

List of suspected rid's
• rid: Report ID
  – Unique for every report
  – Used to (indirectly) inform Dealer of suspected attacking Clients
• Dealer remembers rid↔Client mappings
• Client with many reported rid's is suspect

Many interesting challenges
• Click fraud and auction fraud
• 2nd-price, pay-per-click auction
• How to do profiling
• Protecting user from malicious advertisers
  – ... and still have good targeting
• Gathering usage statistics and correlations
• Accommodating multiple clients
• Dynamic bidding for ad boxes
• Co-existing with today's systems

Advertising auctions today
• Almost all auctions are second price
• Most auctions are Pay Per Click (PPC)

Second Price Auction

Bidder     Current bid   Maximum bid
Bidder 2   $7            $9
Bidder 3   $6            $6
Bidder 1   $5            $5

• Winner pays bid+δ of next ranked bidder
• Bidders can safely bid maximum from the start
• Bidder 2 is 1st ranked – Pays $6+1¢=$6.01
• Bidder 3 is 2nd ranked – Pays $5+1¢=$5.01

What about PPC (pay per click)?

Expected Revenue = Bid × Click Probability
Ad Rank = Bid × Click Probability

Bidder     Bid   P(C)   Expected revenue
Bidder 1   $5    0.4    $2.0
Bidder 2   $9    0.1    $0.9
Bidder 3   $6    0.1    $0.6

What does bidder 1 pay???
• Certainly not $9+1¢=$9.01

Google Second Price Auction

Ad Rank = Bid × Click Probability
CPC(clicked) = Bid(next) × P(C)(next) / P(C)(clicked)
(the figures below add δ = 1¢, as in the plain second-price case)

Bidder     Bid   P(C)   CPC
Bidder 1   $5    0.4    $2.26
Bidder 2   $9    0.1    $6.01
Bidder 3   $6    0.1    $?
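The CPC figures on the Google second-price slide can be reproduced with a few lines of Python. This is only a sketch of the rule as stated on the slides: the names `Ad`, `rank_ads`, `cpc` and the constant `DELTA` are illustrative, and the 1¢ increment is assumed from the slide figures.

```python
from dataclasses import dataclass
from typing import List, Optional

DELTA = 0.01  # assumed 1-cent increment, matching the slide figures


@dataclass
class Ad:
    name: str
    bid: float      # maximum bid per click, in dollars
    p_click: float  # click probability P(C)


def rank_ads(ads: List[Ad]) -> List[Ad]:
    """Order ads by Ad Rank = Bid x Click Probability, highest first."""
    return sorted(ads, key=lambda a: a.bid * a.p_click, reverse=True)


def cpc(ranked: List[Ad], i: int) -> Optional[float]:
    """CPC for the ad at position i: Bid_next x P(C)_next / P(C)_clicked + delta.

    Returns None for the last-ranked ad, which has no next bidder
    (the "$?" entry on the slide).
    """
    if i + 1 >= len(ranked):
        return None
    nxt = ranked[i + 1]
    return round(nxt.bid * nxt.p_click / ranked[i].p_click + DELTA, 2)


ADS = [Ad("Bid1", 5.0, 0.4), Ad("Bid2", 9.0, 0.1), Ad("Bid3", 6.0, 0.1)]

if __name__ == "__main__":
    ranked = rank_ads(ADS)
    for pos, ad in enumerate(ranked):
        print(ad.name, cpc(ranked, pos))
```

Bidder 1 bids least but ranks first because of its higher click probability, and its price per click is set by the ad rank of the bidder below it rather than by that bidder's raw bid.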
What is the Click Probability???
• Historical click performance of the ad
• Landing page quality
• Relevance to the user
• User click through rates
• ...

Today all this is known by the broker (ad network)

In a non-tracking advertising system, the broker knows nothing about the user!

Known at broker (call it G):
• Historical click performance of the ad
• Landing page quality
• ...

Known at user (call it U):
• Relevance to the user
• User click through rates
• ...

Second price auction with broker and user components
• Ranking by revenue potential:
  – Assume that Click Probability = G × U
  – Ad Rank = Bid × G × U
• Second-Price cost per click:
  – CPC = Bn × (Gn × Un) / (Gc × Uc), where n is the next-ranked ad and c is the clicked ad

Non-tracking advertising revisited
• User profile at client
• Privacy goals at broker:
  – Anonymity: No user identifier tied to any user profile attributes
  – Unlinkability: Individual user profile attributes cannot be linked

Finally: Problem Statement
• Satisfy anonymity and unlinkability goals in a system that runs this auction (ranking by Bid × G × U, second-price CPC as above)
• Where Bid and G are known at broker
• And U is known at client

[Diagram: Basic Architecture]

Two questions
• Where do we do the ranking?
• Where do we do the CPC computation?

Do CPC at Broker:
• Don't want to reveal advertiser's Bid
• Fraud
Three flavors of Non-Tracking auctions

Broker holds (Bid, G); Client holds (U).

Flavor          Ranking done at   Values sent there
Rank@Client     Client            Bid, G
Rank@Broker     Broker            U
Rank@3rdParty   3rd party         U and Bid, G

Rank@Client: message flow
• Broker (Bid, G) to Client (U): A (the ad ID), value of (B × G), E[B,G] (+ targeting etc.)
• Client computes ranking: (B × G) × U
• [Time passes; the user clicks an ad]
• Client to Broker: Ac (clicked ad ID), ((Bn × Gn) × Un / Uc), E[Bc, Gc]
• Broker decrypts E[Bc, Gc]
• Broker computes CPC: ((Bn × Gn) × Un / Uc) / Gc
• Broker checks that CPC ≤ Bc

Notes:
• User information is obscured by hiding it within the composite value ((Bn × Gn) × Un / Uc)
• Bc and Gc may have changed between ranking and CPC calculation
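The Rank@Client exchange can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the Privad implementation: the class and method names are made up for the example, the numbers in the usage note are hypothetical, and E[B, G] is emulated by an opaque token that only the broker can resolve instead of real encryption.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Offer:
    ad_id: str   # A, the ad ID
    bg: float    # value of (B x G), sent in the clear
    sealed: str  # stand-in for E[B, G]; opaque to the client


class Broker:
    """Knows Bid and G for every ad; never receives U on its own."""

    def __init__(self) -> None:
        # Assumption: a private token table emulates encryption under a
        # broker-only key. A real system would use actual encryption.
        self._sealed: Dict[str, Tuple[float, float]] = {}

    def offer(self, ad_id: str, bid: float, g: float) -> Offer:
        token = secrets.token_hex(8)
        self._sealed[token] = (bid, g)  # snapshot of (B, G) at ranking time
        return Offer(ad_id, bid * g, token)

    def cpc_for_click(self, composite: float, sealed: str) -> Optional[float]:
        bid_c, g_c = self._sealed[sealed]     # "decrypts" E[Bc, Gc]
        cpc = composite / g_c                 # ((Bn x Gn) x Un / Uc) / Gc
        return cpc if cpc <= bid_c else None  # reject reports with CPC > Bc


class Client:
    """Knows the user component U for each ad."""

    def __init__(self, u: Dict[str, float]) -> None:
        self.u = u

    def rank(self, offers: List[Offer]) -> List[Offer]:
        # Ranking: (B x G) x U, highest first.
        return sorted(offers, key=lambda o: o.bg * self.u[o.ad_id], reverse=True)

    def click_report(self, ranked: List[Offer], pos: int) -> Tuple[str, float, str]:
        # The last-ranked ad has no next bidder; this sketch does not handle it.
        clicked, nxt = ranked[pos], ranked[pos + 1]
        composite = nxt.bg * self.u[nxt.ad_id] / self.u[clicked.ad_id]
        return clicked.ad_id, composite, clicked.sealed
```

With hypothetical values (bid, G, U) of (5, 0.5, 0.8), (9, 0.2, 0.5) and (6, 0.25, 0.4), the first ad ranks highest and a click on it costs 2.25, which passes the CPC ≤ Bc check. A report whose composite value is inflated tenfold fails the check and is rejected. The broker only ever sees the ratio Un / Uc folded into the composite, never either user score by itself.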
All three auction designs introduce various system delays
• precompute and cache ranking
• use out-of-date bid information
• do not immediately reflect changes in bids

Changes in bids constitute the main source of churn

Advertisers constantly update their bids to
• show ads in a preferred position
• meet target number of impressions
• respond to market changes

How detrimental are auction delays?
• Broker perspective: How much revenue is lost due to these delays?
• Advertiser perspective: How do they affect advertisers' rankings?

Bing's Auction log
• 2TB of log data spanning 48 hours
• 150M auctions with 18M ads
• Trace record for an auction includes:
  – All participating ads
  – Bids and quality scores
  – Whether ad was shown and clicked

Understanding effect of churn on revenue
Idea:
• Simulate auctions with stale bid information
• Compute auctions at time t using bids recorded at time t-x
• Compare generated revenue to auctions with up-to-date bid information

We cannot predict changes in clicking behavior when rankings change
[Diagram: the trace ranks A, B and records a click; the simulation ranks B, A. Same position click? Same ad click?]

We simulate five click models:
1. 100% same position
2. 75% same position, 25% same ad
3. 50%-50%
4. 25% same position, 75% same ad
5. 100% same ad

Bid staleness and change in revenue
[Figure: change in revenue (axis from -0.1% to 0.1%) versus bid staleness (1m to 2d), one curve per click model: same position, 75%-25%, 50%-50%, 25%-75%, same ad]

Bid staleness and change in ranking
[Figure: average % of auctions affected per advertiser (axis from 0 to 20) versus bid staleness (1m to 2d); series: rank increased, rank decreased, became visible, became invisible]

Computing U
• So far, we assume we know the user component of click probability
• Hard to compute purely at client
  – Not enough history
• Unlinkably gather click stats from clients, compute U, feed back to clients

Assume a set of factors X={x1, x2, …, xL}
• Level of interest in ad's product/service
• Targeting/user match quality
• Webpage context
• User's historic CTR
• ...
Clients report {Ad-ID, X, click}
Broker computes U = f(X), delivers f(X) along with ad

Problem if X={x1, x2, …, xL} fingerprints user
Possible mitigating factors:
• Level of interest: Many interests change, many interests don't correlate that well
• Targeting match quality: Different ads have different targeting
• Webpage context: Can be coarse-grained
• User's historic CTR: Can be coarse-grained

Future work
So far, designs appear practical, but:
• Can we accurately compute user score U?
  – And without violating privacy ...
• Are there new forms of click fraud?
• Need experience in practice ...

User Statistics
Broker and advertiser want to know deep statistical information about users
• What kind of targeting works best?
• When should ads be shown?
• Are users interested in A also interested in B?
• How can conversion rates be improved?
Centralized systems have full knowledge
How can Privad privately provide this information?
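One standard way to answer counting questions such as "how many users are interested in both A and B?" without exposing any single user is to add random noise to the true count, which is the idea behind differential privacy. The sketch below uses the Laplace mechanism in the simple single-trusted-database setting. It is not Privad's distributed protocol, and the function names, the example profiles and the choice of epsilon are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, Iterable


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(
    records: Iterable[Dict[str, bool]],
    predicate: Callable[[Dict[str, bool]], bool],
    epsilon: float,
    rng: random.Random,
) -> float:
    """Answer a counting query with Laplace noise.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed only so the demo is repeatable
    # Hypothetical user profiles: interest flags for two products.
    profiles = [{"tennis": i % 2 == 0, "running": i % 4 == 0} for i in range(100)]
    print(noisy_count(profiles, lambda p: p["tennis"] and p["running"], 1.0, rng))
```

A smaller epsilon means more noise and stronger protection for each individual record, at the cost of less accurate answers.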
Differential Privacy
• Differential Privacy adds noise to answers of DB queries
• Such that presence or absence of single DB element cannot be determined
• Normally modeled as a single trusted DB
[Diagram: Query, Trusted DB, True Answer, Add Noise, Noisy Answer]

Distributed Differential Privacy
[Diagram: Dealer and multiple DBs; Query (cleartext); True Answers (encrypted); Noisy Answers (encrypted)]

A couple URLs
• adresearch.mpi-sws.org
• trackingfree.org

• Backups, trust model

[Diagram: Publishers, Advertisers, Broker, Dealer, Clients (U, SA)]

Software Agent
• Generate user profiles locally at the client
• In other words, Adware!

Dealer
• Anonymizes client-broker communications
• Cannot eavesdrop
• Helps with clickfraud

Client/broker messages:
• Contain minimal info (no PII)
• Cannot be linked to same client
• Unlinkability and anonymity

Client
[Diagram: Browser sandbox; Reference Monitor (trusted, open); Software Agent (untrusted, black-box); encrypted and cleartext channels]

Trust
[Diagram labels: Possibly malicious; "Pretty honest but tempted"; "Honest but very curious"; Doesn't collude]

Privacy and threat models???
• Honest but curious isn't quite right
  – We expect the broker to do "what it can get away with", but cautiously
• Plus we need to make privacy advocates comfortable
• No formal privacy model
  – Formal models are too narrow and restrictive

Dealer and Software Agent are new components
• How are they incentivized?

Dealer:
• Legally bound to follow protocols, not collude
• Execute open-source software, open to inspection

Client: Various options:
• Provide benefit: free software, content, ...
  – Like adware!
• Bundle with browser or OS

• Local threats ...

Please suspend disbelief, imagine that we succeed ...

Perfectly Private Advertising System
[Diagram: Publishers, Advertisers, Broker, Dealer, Clients (U, SA)]
• Ad targeted to: "Man" AND "Married" AND "Has girlfriend"
• Click: Advertiser gets (very) personal information about users
• "Honey, why are you getting ads for sexy lingerie?"

[Chart: Privacy (Less to More) versus Targeting (Worse to Better), with "Privad" and "???" marked]