Advertising in Online Social Networks

How Does Ad Revenue Work?
Types of Ads
• Search Ads
• Contextual Ads
Learning User Profiles
Trackers link user activities to form large user profiles
SIGCOMM 2013
Implications of Giving Users Control
• Pros:
– Personalization
– Better Security
– Revenue for Service
• Cons:
– Lack of Privacy
Issues with Ads
• Click-Spam
• Protecting User Privacy
Estimating Click-spam – Main Idea
• Normal path: both non-spammers and spammers click ads; a fraction of non-spammers buy
– How many of the clicks come from non-spammers?
• Black-box path: both non-spammers and spammers click ads; the black box loses the spammers and some non-spammers; some non-spammers buy
• Equate the ratios of buyers to non-spammers on the two paths
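The ratio argument can be made concrete with a small sketch. It assumes that every click surviving the black box comes from a non-spammer and that non-spammers buy at the same rate on both paths; the function and parameter names are illustrative and do not come from the slides.

```python
def estimate_spam_fraction(normal_clicks, normal_buyers,
                           filtered_clicks, filtered_buyers):
    """Estimate the click-spam fraction on the normal path.

    Assumption: clicks that survive the black box are all non-spam, and
    non-spammers buy at the same rate on both paths, so
    normal_buyers / non_spam_clicks == filtered_buyers / filtered_clicks.
    """
    if normal_clicks <= 0:
        raise ValueError("normal_clicks must be positive")
    if filtered_clicks <= 0 or filtered_buyers <= 0:
        raise ValueError("need clicks and buyers on the filtered path")
    # Solve the equated ratios for the unknown number of non-spam clicks.
    non_spam_clicks = normal_buyers * filtered_clicks / filtered_buyers
    # The estimate cannot exceed the clicks actually observed.
    non_spam_clicks = min(non_spam_clicks, normal_clicks)
    return 1.0 - non_spam_clicks / normal_clicks


# Made-up numbers: 1000 clicks with 20 buyers on the normal path,
# 200 surviving clicks with 10 buyers on the black-box path.
spam_fraction = estimate_spam_fraction(1000, 20, 200, 10)
```

With these made-up numbers the buy rate among non-spammers is 5%, so 20 buyers imply about 400 non-spam clicks and a spam fraction of about 60%.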
Dissecting Black box – Hurdles
• Hurdle: an extra click required to view the site
– Both non-spammers and spammers click ads
– The hurdle loses spammers and some non-spammers
– Some spammers and non-spammers see the content
– Some non-spammers buy
• Different hurdles have different hardness
– 5 sec wait, Click to continue
• Send only a fraction of traffic through hurdles
– To minimize impact on user experience
• Perfect hurdle would block all spam
– In reality, some spammers get through (False Negatives)
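Sending only a fraction of the traffic through a hurdle can be sketched as a deterministic, hash-based sampling decision. This is one possible way to do it, not the method used in the work the slides describe; the click identifier and the function name are assumptions made for illustration.

```python
import hashlib


def through_hurdle(click_id: str, fraction: float) -> bool:
    """Decide whether a click is routed through the hurdle.

    Hypothetical helper: hashes the click identifier into [0, 1) and
    compares it with the sampling fraction, so the same click always
    gets the same decision.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    digest = hashlib.sha256(click_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return bucket < fraction


# Route roughly 10% of clicks through the hurdle.
sampled = [cid for cid in (f"click-{i}" for i in range(10000))
           if through_hurdle(cid, 0.10)]
```

Hashing rather than drawing a random number keeps the decision stable if the same click identifier is seen twice.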
Dissecting Black box – Bluff Ads[1]
• Bluff ad: junk ad text with normal keywords, same targeting
– Normal users unlikely to click
– Spammers and curious users may click on an ad
• Maximum False Negative rate known for each hurdle
– Can be subtracted out
[1] Fighting online click fraud using bluff ads [CCR 2010]
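One possible reading of "can be subtracted out" is sketched below. It assumes a bluff ad is run with the same keywords, targeting and hurdle as the normal ad, that clicks on the bluff ad which pass the hurdle are spam (an upper bound, since curious users also click), and that spam arrives at the same rate per impression on both ads. All names and numbers are illustrative assumptions, not taken from the slides.

```python
def valid_clicks_after_hurdle(normal_passed, normal_impressions,
                              bluff_passed, bluff_impressions):
    """Subtract the estimated false negatives from the clicks that passed.

    Assumption: bluff-ad clicks that pass the hurdle bound the spam that
    slips through, at the same per-impression rate as for the normal ad.
    """
    if normal_impressions <= 0 or bluff_impressions <= 0:
        raise ValueError("impression counts must be positive")
    # Scale the bluff ad's passing clicks to the normal ad's traffic volume.
    est_spam_passed = bluff_passed * normal_impressions / bluff_impressions
    # A count of valid clicks cannot be negative.
    return max(0.0, normal_passed - est_spam_passed)


# Made-up numbers: 120 clicks passed on the normal ad (10000 impressions),
# 10 clicks passed on the bluff ad (5000 impressions).
valid = valid_clicks_after_hurdle(120, 10000, 10, 5000)
```

Because curious users also click bluff ads, the subtraction is conservative: it may remove some valid clicks but should not leave spam uncounted under the stated assumptions.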
Uh-oh. How do we validate?
• No ground truth!
• Compare against search ads on Google and Bing
Results – Validation using search ads
[Figure: Valid Traffic Fraction (Normalized), axis from 0 to 1.25, comparing Ad Network's Estimate with Our Estimate for the keywords celebrity, yoga and lawnmower on Ad Networks A, B and C]
• Clicks charged are close to the estimated valid clicks
Protecting User Privacy
Trackers Link User Requests
Multiple requests are linkable by remote trackers if they share the same identifiers.
User → Tracker: Req. 1 (128.208.7.x), header: cookie(…)
User → Tracker: Req. 2 (128.208.7.x), header: cookie(…)
• Important identifiers for Web tracking:
– Application info. (cookie, JS localStorage, Flash)
– IP Address
Approach: Pseudonym Abstraction
• Pseudonym = A set of all identifying features that
persist across an activity
• Allow a user to manage a large number of unlinkable
pseudonyms
– User can choose which ones are used for which
operations.
Example (Alice → Tracker):
– Pseudonym1 (Cookie1, IP1): medical information
– Pseudonym2 (Cookie2, IP2): location-related (Alice’s home)
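The abstraction can be sketched as a small data structure: each pseudonym bundles an IP address with its own cookie jar, and a store hands out one pseudonym per activity. The class names, the activity labels and the documentation-range addresses are assumptions for illustration, not the system's actual API.

```python
from dataclasses import dataclass, field
import ipaddress


@dataclass
class Pseudonym:
    """All identifying state that persists across one activity."""
    name: str
    ip: ipaddress.IPv6Address
    cookies: dict = field(default_factory=dict)  # separate jar per pseudonym


class PseudonymStore:
    """Hypothetical store mapping each activity to its own pseudonym."""

    def __init__(self, addresses):
        self._free = list(addresses)
        self._by_activity = {}

    def for_activity(self, activity: str) -> Pseudonym:
        if activity not in self._by_activity:
            if not self._free:
                raise RuntimeError("no unused IP addresses left")
            name = f"Pseudonym{len(self._by_activity) + 1}"
            self._by_activity[activity] = Pseudonym(name, self._free.pop(0))
        return self._by_activity[activity]


store = PseudonymStore([ipaddress.IPv6Address("2001:db8::1"),
                        ipaddress.IPv6Address("2001:db8::2")])
medical = store.for_activity("medical")
location = store.for_activity("location")
medical.cookies["tracker.example"] = "Cookie1"
location.cookies["tracker.example"] = "Cookie2"
```

Because the two pseudonyms share neither a cookie nor an IP address, a tracker that sees both has no common identifier with which to link them.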
How We Want to Use Pseudonyms
1. Application-Layer Design
– Application with a Policy Engine on Alice’s host
– Pseudonym1 (Cookie1, IP1): medical, sent to the Tracker
– Pseudonym2 (Cookie2, IP2): location
2. Network-Layer Design
– OS holds many IP addresses, including IP1 and IP2
– DHCP, Routers
Application-Layer Design
• The application needs to assign different pseudonyms to different activities.
– How pseudonyms are used depends on the user and the application.
– APIs are provided to define policies.
• Policy in Web browsing: a function of the request information and the state of the browser.
– Window ID, tab ID, request ID, URL, whether the request is going to the first party, etc.
Sample Pseudonym Policies for the Web
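Example: the user reads an article on politics at news.com, which embeds content from facebook.com, and also visits facebook.com directly.
– P1: request to news.com
– P2: request to facebook.com embedded in the news.com page
– P3: request to facebook.com as the first party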
• Default: P1 = P2 = P3
• Per-Request: P1 != P2 != P3
• Per-First Party: P1 = P2 != P3
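The three sample policies can be written as functions from request information to a pseudonym label. This is a minimal sketch: the Request fields and the label strings are assumptions chosen for illustration, not the browser API the slides refer to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    request_id: int
    host: str              # host the request is sent to
    first_party_host: str  # host of the page the user is visiting


def default_policy(req: Request) -> str:
    # One pseudonym for everything: P1 = P2 = P3.
    return "default"


def per_request_policy(req: Request) -> str:
    # A fresh pseudonym for every request: P1 != P2 != P3.
    return f"request-{req.request_id}"


def per_first_party_policy(req: Request) -> str:
    # One pseudonym per first-party site: P1 = P2 != P3.
    return f"first-party-{req.first_party_host}"


p1 = Request(1, "news.com", "news.com")          # the article itself
p2 = Request(2, "facebook.com", "news.com")      # content embedded in news.com
p3 = Request(3, "facebook.com", "facebook.com")  # direct visit to facebook.com

labels = [per_first_party_policy(r) for r in (p1, p2, p3)]
```

Under the per-first-party policy the embedded facebook.com request shares a pseudonym with news.com rather than with the direct facebook.com visit.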
Facebook cannot learn of the user’s visit to news.com.
Pseudonyms in Action
[Diagram: on Alice’s host, the application’s Policy Engine selects Pseudonym1 (Cookie1, IP1) or Pseudonym2 (Cookie2, IP2) for traffic to the Tracker; the OS, DHCP and Routers provide the IP addresses (2. Network-Layer Design)]
Network-Layer Design Considerations
1. Many IP addresses for an end-host
2. Proper mixing
3. Efficient routing
4. Easy revocation
5. Support for small networks
1) IPv6 Allows Many IPs per Host
• IPv6 address: 128 bits
• Small networks get a /64 address space (about 1.8e19 addresses)
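The 1.8e19 figure follows from 128 − 64 = 64 free bits, that is 2^64 addresses. A quick check with Python's standard ipaddress module, using the documentation prefix 2001:db8::/64 as an arbitrary example network:

```python
import ipaddress

# Any /64 network leaves 64 bits for addresses inside it.
network = ipaddress.ip_network("2001:db8::/64")
address_count = network.num_addresses  # 2**64
summary = f"{address_count:.1e}"       # scientific notation, one decimal
```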
2, 3) Symmetric Encryption for Mixing and Routing
IPv6 address (128 bits):
– Network Prefix: to route the packet “to” the network
– Remaining bits: to route the packet “within” the network; networks can use this part as they want
Base address (128 bits): Network Prefix :: Subnet :: Host :: Pseudonym
Encrypted address (128 bits): Network Prefix :: Encrypted ID
• Use symmetric-key encryption: Encrypt turns Subnet :: Host :: Pseudonym into the Encrypted ID, and Decrypt recovers it
• End-hosts know only encrypted IP addresses
• Routers use the base addresses to forward packets
– By longest-prefix matching on subnet::host, so the size of the routing table does not change
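The encrypt/decrypt step can be sketched with a toy keyed permutation on the lower 64 bits. The slides do not name a cipher, so the 4-round Feistel network built from HMAC-SHA256 below is a stand-in chosen only because it needs nothing beyond the standard library; it is not a vetted cipher. The 16/16/32-bit split of Subnet :: Host :: Pseudonym is likewise an assumption.

```python
import hashlib
import hmac
import ipaddress

ROUNDS = 4
MASK32 = 0xFFFFFFFF
MASK64 = 2**64 - 1


def _round(key: bytes, rnd: int, half: int) -> int:
    # Keyed round function: HMAC over the round number and a 32-bit half.
    msg = bytes([rnd]) + half.to_bytes(4, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:4], "big")


def encrypt64(key: bytes, value: int) -> int:
    left, right = value >> 32, value & MASK32
    for rnd in range(ROUNDS):
        left, right = right, left ^ _round(key, rnd, right)
    return (left << 32) | right


def decrypt64(key: bytes, value: int) -> int:
    left, right = value >> 32, value & MASK32
    for rnd in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, rnd, left), left
    return (left << 32) | right


def encrypted_address(key, prefix, subnet, host, pseudonym):
    """Build Prefix :: Encrypted ID from the base fields (assumed layout)."""
    if prefix.prefixlen != 64:
        raise ValueError("this sketch assumes a /64 network prefix")
    if not (0 <= subnet < 2**16 and 0 <= host < 2**16 and 0 <= pseudonym < 2**32):
        raise ValueError("field out of range")
    base_id = (subnet << 48) | (host << 32) | pseudonym
    return ipaddress.IPv6Address(int(prefix.network_address) | encrypt64(key, base_id))


def base_fields(key, address):
    """Router side: recover (subnet, host, pseudonym) for forwarding."""
    base_id = decrypt64(key, int(address) & MASK64)
    return base_id >> 48, (base_id >> 32) & 0xFFFF, base_id & MASK32


key = b"network-secret"  # hypothetical key held by the network's routers
prefix = ipaddress.ip_network("2001:db8:0:1::/64")
addr1 = encrypted_address(key, prefix, subnet=3, host=7, pseudonym=1)
addr2 = encrypted_address(key, prefix, subnet=3, host=7, pseudonym=2)
```

Two pseudonyms of the same host map to different addresses that both stay inside the prefix, while a router holding the key recovers subnet::host and forwards as usual.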
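Routing Example
[Diagram: a packet addressed to Prefix :: Encrypted ID crosses the Internet to the ISP ( Prefix :: … ); inside the ISP, the Encrypted ID corresponds to Sub::Host::Pseudo, which is used to forward the packet]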
Determining the Worthiness of a User
• Categorize the user
– Infer intent from web-site visits
• Determine intent (willingness to buy)
Cost Of Preserving Privacy