Fighting Fire With Fire:
Crowdsourcing Security Threats
and Solutions on the Social Web
Gang Wang, Christo Wilson, Manish Mohanlal, Ben Y. Zhao
Computer Science Department, UC Santa Barbara.
gangw@cs.ucsb.edu
A Little Bit About Me

• 3rd-year PhD student @ UCSB
• Intern at MSR Redmond, 2011
• Intern at LinkedIn (Security Team), 2012
• Research interests:
  – Security and privacy
  – Online social networks
  – Crowdsourcing
  – Data-driven analysis and modeling
Recap: Threats on the Social Web

• Social spam is a serious problem
  – 10% of wall posts with URLs on Facebook are spam
  – 70% of that spam is phishing
• Sybils underlie many attacks on online social networks
  – Spam, spear phishing, malware distribution
  – Sybils blend completely into the social graph
• Existing countermeasures are ineffective
  – Blacklists only catch 28% of spam
  – Sybil detectors from the literature do not work in practice
Sybil Accounts on Facebook

• In-house estimates
  – Early 2012: 54 million
  – August 2012: 83 million (8.7% of the user base)
• Fake likes
  – VirtualBagel: a useless site, 3,000 likes in 1 week
  – 75% from Cairo, age 13-17
• Sybil attacks happen at large scale
• Advertisers are fleeing Facebook
Sybil Accounts on Twitter

• Suspicious follower growth
  – 4,000 new followers per day
  – 100,000 new followers in 1 day
  – 92% of Newt Gingrich's followers are Sybils
• Russian political protests on Twitter
  – 25,000 Sybils sent 440,000 tweets
  – 1 million Sybils controlled overall
• Twitter is vital infrastructure
• Sybils are usurping Twitter for political ends
Talk Outline

1. Malicious crowdsourcing sites – crowdturfing [WWW’12]
   – Spam and Sybils generated by real people
   – A huge threat in China, and a growing threat in the US
2. Crowdsourced Sybil detection [NDSS’13]
   – If attackers can do it, why not defenders?
   – Can humans detect Sybils? A user study
   – Is this cost effective?
   – Design of a crowdsourced Sybil detection system
Outline

• Intro
• Crowdturfing
  – Crowdsourcing overview
  – What is crowdturfing?
  – How bad is it?
  – Crowdturfing in the US
• Crowdsourced Sybil Detection
• Conclusion
High Quality Sybils and Spam

• We tend to think of spam as “low quality”
  – e.g. “MaxGentleman is the bestest male enhancement system avalable.
    http://cid-ce6ec5.space.live.com/”
• What about high quality spam and Sybils?
  – [Example: a convincing fake “Gang Wang” profile built from stock photographs]
• Open questions
  – What is the scope of this problem?
  – Is it generated manually or mechanically?
  – What are the economics?
Black Market Crowdsourcing

• Amazon’s Mechanical Turk
  – Admins remove spammy jobs
• Black market crowdsourcing websites
  – Spam and fake accounts, generated by real people
  – A major force in China, expanding in the US and India

Crowdturfing = Crowdsourcing + Astroturfing
Crowdturfing Workflow

• Customers
  – Initiate campaigns
  – May be legitimate businesses
• Campaign agents
  – Manage campaigns and workers; send tasks to workers
  – Verify completed tasks via workers’ reports
• Workers
  – Complete tasks for money
  – Control Sybils on other websites
Crowdturfing in China

Site     | Active Since | Total Campaigns | Workers | Reports | $ for Workers | $ for Site
Zhubajie | Nov. 2006    | 76K             | 169K    | 6.3M    | $2.4M         | $595K

[Chart: site growth over time for Zhubajie and Sandaha, Jan. 2008–Jan. 2011 –
campaigns per month and dollars per month, on a log scale]
Spreading Spam on Weibo

[CDF of approximate audience size per campaign, from 100 to 10 million users]
• 50% of campaigns reach >100,000 users
• 8% of campaigns reach >1 million users
• Campaigns reach huge audiences – but how effective are they?
How Effective is Crowdturfing?

• We initiated our own campaigns as a customer
  – 4 benign ad campaigns promoting real e-commerce sites (CPC = $0.01)
  – All clicks routed through our measurement server

Campaign: “Vacation” – advertise a discount vacation through a travel agent
(cost $15, 100 tasks)

Target | Reports | Clicks | Cost per Click
Weibo  | 108     | 28     | $0.21
QQ     | 118     | 187    | $0.09
Forums | 123     | 3      | $0.90

• The travel agency reported sales statistics
  – 2 sales/month before our campaign
  – 11 sales within 24 hours after our campaign
  – Each trip sells for $1,500!
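For readers who want to replicate the measurement setup, here is a minimal
sketch of a click-logging redirect server. It assumes Flask; the destination
URL and log format are illustrative, not the setup we actually deployed.

```python
# Minimal click-logging redirect server (sketch; assumes Flask).
import time
from flask import Flask, redirect, request

app = Flask(__name__)
DESTINATION = "http://example-travel-agent.com/"  # hypothetical landing page

@app.route("/c/<campaign_id>")
def track_click(campaign_id):
    # Log one line per click: timestamp, campaign, client IP, referrer.
    with open("clicks.log", "a") as log:
        log.write(f"{time.time()}\t{campaign_id}\t{request.remote_addr}\t"
                  f"{request.referrer}\n")
    # Then forward the visitor to the advertised site.
    return redirect(DESTINATION, code=302)

if __name__ == "__main__":
    app.run(port=8080)
```

Workers are given the tracking URL instead of the real one, so every click a
campaign generates passes through the log before reaching the landing page.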
Crowdturfing in America

US Sites (ordered from legit to black market) | % Crowdturfing
Mechanical Turk                               | 12%
MinuteWorkers                                 | 70%
MyEasyTasks                                   | 83%
Microworkers                                  | 89%
ShortTasks                                    | 95%

• Other studies support these findings
  – Freelancer: 28% spam jobs – bulk OSN accounts, likes, and spam;
    connections to botnet operators
  – Poultry markets: $20 for 1,000 Twitter followers; run as Ponzi schemes
Takeaways

• Identified a new threat: crowdturfing
  – Growing exponentially in size and revenue in China
  – $1 million per month on just one site
  – Cost effective: $0.21 per click
• Starting to grow in the US and other countries
  – Mechanical Turk, Freelancer
  – Twitter follower markets
• A huge problem for existing security systems
  – Little to no automation to detect it
  – Turing tests fail: the workers are real people
Outline

• Intro
• Crowdturfing
• Crowdsourced Sybil Detection
  – Open questions
  – User study
  – Accuracy analysis
  – System design
• Conclusion
Crowdsourcing Sybil Defense

• Defenders are losing the battle against OSN Sybils
• Idea: build a crowdsourced Sybil detector
  – Leverage human intelligence
  – Scalable
• Open questions
  – How accurate are users?
  – What factors affect detection accuracy?
  – Is crowdsourced Sybil detection cost effective?
19

Two groups of users
Crowdturfing
Site
Experts – CS professors, masters, and PhD students
 Turkers – crowdworkers from Mechanical Turk and Zhubajie


Three ground-truth datasets of full user profiles
Renren – given to us by Renren Inc.
 Facebook US and India
Stock Picture
 Crawled
 Legitimate profiles – 2-hops from our own profiles
 Suspicious profiles – stock profile images
 Banned suspicious profiles = Sybils

Testers may skip around
and revisit profiles
Real or fake?
Navigation Buttons
Why?
Progress
Classifying
Profiles
Browsing
Profiles
Screenshot of Profile
(Links Cannot be Clicked)
20
Experiment Overview

Dataset                   | # Sybil Profiles | # Legit. Profiles
Renren (from Renren Inc.) | 100              | 100
Facebook US (crawled)     | 32               | 50
Facebook India (crawled)  | 50               | 49

Test Group     | # of Testers | Profiles per Tester
Chinese Expert | 24           | 100
Chinese Turker | 418          | 10
US Expert      | 40           | 50
US Turker      | 299          | 12
India Expert   | 20           | 100
India Turker   | 342          | 12

• Fewer experts, but more profiles per expert
Individual Tester Accuracy

[CDF of accuracy per tester (%), for Chinese and US experts and turkers]
• 80% of experts have >90% accuracy – experts prove that humans can be accurate
• Turker accuracy is not so good – turkers need extra help
Accuracy of the Crowd

• Treat each classification by each tester as a vote
  – The majority makes the final decision

Dataset        | Test Group     | False Positives | False Negatives
Renren         | Chinese Expert | 0%              | 3%
Renren         | Chinese Turker | 0%              | 63%
Facebook US    | US Expert      | 0%              | 10%
Facebook US    | US Turker      | 2%              | 19%
Facebook India | India Expert   | 0%              | 16%
Facebook India | India Turker   | 0%              | 50%

• False positive rates are excellent – almost zero; experts perform okay
• Turkers miss lots of Sybils – they need extra help against false negatives
• What can be done to improve accuracy?
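The vote aggregation above is simple enough to sketch in a few lines of
Python; the example data values are made up, not taken from the study.

```python
# Majority-vote aggregation: each tester's classification is one vote,
# the majority decides, and errors are counted against ground truth.
from collections import Counter

def majority_vote(votes):
    """votes: list of 'sybil'/'legit' labels from different testers."""
    return Counter(votes).most_common(1)[0][0]

def error_rates(profiles):
    """profiles: list of (ground_truth, votes) pairs."""
    fp = fn = sybils = legits = 0
    for truth, votes in profiles:
        decision = majority_vote(votes)
        if truth == "legit":
            legits += 1
            fp += decision == "sybil"   # real user flagged as fake
        else:
            sybils += 1
            fn += decision == "legit"   # Sybil that slipped through
    return fp / legits, fn / sybils

# Example: one legit profile and one Sybil, three votes each.
rates = error_rates([("legit", ["legit", "legit", "sybil"]),
                     ("sybil", ["legit", "sybil", "sybil"])])
print(rates)  # (0.0, 0.0)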
Eliminating Inaccurate Turkers

[Chart: false negative rate (%) vs. turker accuracy threshold (%), for the
China, US, and India turker groups]
• Most workers are >40% accurate
• Dramatic improvement: false negatives drop from 60% to 10% (China)
• Only a subset of workers is removed (<50%)
• Getting rid of inaccurate turkers is a no-brainer
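A sketch of the filtering step, assuming each turker has answered some
ground-truth (“gold”) profiles; the function names and the 0.4 default
threshold are illustrative.

```python
# Drop turkers whose accuracy on gold profiles is below a threshold,
# then re-run the majority vote over the remaining answers.

def turker_accuracy(answers, gold):
    """answers: {profile_id: label}; gold: {profile_id: true label}."""
    graded = [pid for pid in answers if pid in gold]
    if not graded:
        return 0.0
    correct = sum(answers[pid] == gold[pid] for pid in graded)
    return correct / len(graded)

def filter_turkers(all_answers, gold, threshold=0.4):
    """Keep only turkers whose gold-profile accuracy is >= threshold."""
    return {turker: answers
            for turker, answers in all_answers.items()
            if turker_accuracy(answers, gold) >= threshold}
```

Sweeping `threshold` from 0 toward 0.5 reproduces the curve above: false
negatives fall sharply while fewer than half of the workers are removed.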
How Many Classifications Do You Need?

[Chart: false positive and false negative rates (%) vs. classifications per
profile, for China, US, and India]
• Only 4-5 classifications per profile are needed for error rates to converge
• Fewer classifications = less cost
How to turn our results into a system?

1. Scalability
   – OSNs with millions of users
2. Performance
   – Improve turker accuracy
   – Reduce costs
3. Privacy
   – Preserve user privacy when giving data to turkers
Our System Architecture

• Filtering layer
  – Social network heuristics and user reports flag suspicious profiles
  – Leverages existing techniques; helps the system scale
• Crowdsourcing layer
  – Turker selection: inaccurate turkers are rejected; the rest are split
    into accurate and very accurate groups
  – Accurate turkers classify suspicious profiles; very accurate turkers
    (and an OSN employee) handle the hard cases and confirm Sybils
  – Filtering inaccurate turkers maximizes the usefulness of high-accuracy
    turkers
  – Continuous quality control; locates malicious workers
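One way to read the turker-selection box is as continuous, accuracy-based
tiering. The sketch below is my interpretation; the tier names and cutoffs
are illustrative, not the deployed values.

```python
# Continuous quality control: each turker's running accuracy on gold
# profiles decides which tier of the crowdsourcing layer they work in.

TIERS = [(0.90, "very_accurate"),   # top tier: confirms the hard cases
         (0.70, "accurate"),        # main voting pool
         (0.40, "probation")]       # still graded, votes not counted

def assign_tier(correct, graded):
    """Place a turker into a tier by running accuracy; reject the rest."""
    if graded == 0:
        return "probation"          # new workers start ungraded
    accuracy = correct / graded
    for cutoff, tier in TIERS:
        if accuracy >= cutoff:
            return tier
    return "rejected"               # consistently wrong: likely malicious
```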
Trace Driven Simulations

• Simulate 2,000 profiles; error rates drawn from survey data; vary 4
  parameters
• Setup: 5 classifications per profile from accurate turkers; 2 more from
  very accurate turkers when the vote is controversial (splits in the
  20-50% range); 90% decision threshold
• Results: an average of 6 classifications per profile, <1% false
  positives, <1% false negatives
• Results++: with an average of 8 classifications per profile, <0.1% false
  positives and <0.1% false negatives
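The escalation logic can be sketched as follows. The exact rule for
“controversial” votes is my reading of the parameters above, and the error
rates here are placeholders for the survey-derived distributions.

```python
import random

def vote(is_sybil, error_rate):
    """One turker's vote on 'is this a Sybil?'; wrong w.p. error_rate."""
    correct = random.random() >= error_rate
    return is_sybil if correct else not is_sybil

def classify(is_sybil, acc_err=0.20, vacc_err=0.05,
             band=(0.2, 0.5), n_acc=5, n_vacc=2, threshold=0.5):
    """Two-layer vote: escalate controversial profiles to the top tier."""
    votes = [vote(is_sybil, acc_err) for _ in range(n_acc)]
    frac = sum(votes) / len(votes)
    if band[0] <= frac <= band[1]:          # controversial: escalate
        votes += [vote(is_sybil, vacc_err) for _ in range(n_vacc)]
        frac = sum(votes) / len(votes)
    return frac > threshold, len(votes)     # (flagged as Sybil?, # votes)

# Trace-driven version: draw acc_err/vacc_err per turker from the accuracy
# distributions measured in the user study instead of fixed constants.
results = [classify(is_sybil=True) for _ in range(2000)]
fn_rate = sum(not flagged for flagged, _ in results) / len(results)
```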
Estimating Cost

• Estimated cost for a real-world social network: Tuenti
  – 12,000 profiles to verify daily
  – 14 full-time employees
  – Annual salary 30,000 EUR (~$20 per hour) → $2,240 per day
• Crowdsourced Sybil detection
  – 20 sec per profile, 8-hour day → 50 turkers
  – Facebook wage ($1 per hour) → $400 per day
• Cost with malicious turkers
  – Estimate that 25% of turkers are malicious → 63 turkers
  – $1 per hour → $504 per day
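The arithmetic behind these figures, spelled out. The one assumption is
that “20 sec per profile” means 20 seconds per classification, with the
~6 classifications per profile from the simulations; that reading makes
all three daily totals line up.

```python
import math

profiles_per_day = 12_000

# Tuenti's in-house approach: 14 employees, ~$20/hour, 8-hour days.
employee_cost = 14 * 20 * 8                 # = $2,240 per day

# Crowdsourced: ~6 classifications per profile (simulation result),
# 20 seconds each, turkers working 8-hour days at $1/hour.
classifications = profiles_per_day * 6      # 72,000 per day
turker_hours = classifications * 20 / 3600  # = 400 hours
turkers = turker_hours / 8                  # = 50 turkers
crowd_cost = turkers * 8 * 1                # = $400 per day

# Pad the workforce by 25% to absorb malicious turkers:
padded = math.ceil(turkers * 1.25)          # = 63 turkers
padded_cost = padded * 8 * 1                # = $504 per day

print(employee_cost, crowd_cost, padded_cost)   # 2240 400.0 504
```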
Takeaways

• Humans can differentiate between real and fake profiles
• Crowdsourced Sybil detection is feasible
• Designed a crowdsourced Sybil detection system
  – False positives and negatives <1%
  – Resistant to infiltration by malicious workers
  – Sensitive to user privacy
  – Low cost
• Augments existing security systems
Outline

• Intro
• Crowdturfing
• Crowdsourced Sybil Detection
• Conclusion
  – Summary of my work
  – Future work
Key Contributions

1. Identified a novel threat: crowdturfing
   – End-to-end spam measurements, from customers to the web
   – Insider knowledge of social spam
2. Novel defense: crowdsourced Sybil detection
   – User study proves the feasibility of this approach
   – Built an accurate, scalable system
   – Possible deployment in real OSNs – LinkedIn and Renren
Ongoing Work

1. Twitter follower markets
   – Locate customers who purchase Twitter followers in bulk
   – Study the unfollow dynamics of customers
   – Develop systems to detect customers in the wild
2. Sybil detection using server-side clickstreams
   – Build click models based on clickstream logs
   – Extract the click patterns of Sybil and normal users
   – Develop systems to detect Sybils
Questions?
Thank you!
Potential Project Ideas

• Malware distribution in cellular networks
  – Identify malware-related cellular network traffic
  – Coordinated malware distribution campaigns
  – Feature-based detection
• Advertising traffic analysis in mobile apps
  – Characterize ad traffic
  – How effective are app-displayed ads at getting click-throughs?
  – Is malware delivered through ads?
Preserving User Privacy

• Showing profiles to crowdworkers raises privacy issues
• Solution: reveal profile information in context
  – Crowdsourced evaluation by friends: sees friend-only profile information
  – Crowdsourced evaluation by turkers: sees only public profile information
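A sketch of the context-dependent profile views described above; the field
names and the visibility map are illustrative.

```python
# Filter the same profile differently depending on who evaluates it:
# turkers see only public fields, friends also see friend-only fields.

VISIBILITY = {
    "name": "public", "profile_photo": "public", "wall_posts": "public",
    "friend_list": "friends", "photos": "friends", "birthday": "friends",
}

def view_for(profile, evaluator):
    """evaluator: 'turker' sees public fields; 'friend' sees friends-only too."""
    allowed = {"public"} if evaluator == "turker" else {"public", "friends"}
    return {field: value for field, value in profile.items()
            if VISIBILITY.get(field, "private") in allowed}
```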
Clickstream Sybil Detection

[Two Markov-chain diagrams comparing the click transitions of Sybil and
normal sessions – states: initial, share, browse profiles, message, photo,
friend invite, final; edges labeled with transition percentages]

• Clickstream detection of Sybils uses three kinds of features:
  1. Absolute number of clicks
  2. Time between clicks
  3. Page traversal order
• Challenges
  – Real-time detection
  – Massive scalability
  – Low overhead
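A minimal sketch of the click-model idea behind the diagrams: fit one
Markov chain per class over page categories, then label a session by which
chain gives it higher likelihood. This simplification captures only page
traversal order (not click counts or timing), and all names are illustrative.

```python
import math
from collections import defaultdict

def fit_chain(sessions, smooth=1e-3):
    """Fit transition probabilities from sessions, e.g.
    [['initial', 'share', 'final'], ...], with add-constant smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in sessions:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1
    return {a: {b: (n + smooth) / (sum(nxt.values()) + smooth * len(nxt))
                for b, n in nxt.items()}
            for a, nxt in counts.items()}

def log_likelihood(session, chain, floor=1e-6):
    """Score a session under a chain; unseen transitions get a tiny floor."""
    return sum(math.log(chain.get(a, {}).get(b, floor))
               for a, b in zip(session, session[1:]))

def is_sybil(session, sybil_chain, normal_chain):
    """Label by whichever class model explains the clickstream better."""
    return (log_likelihood(session, sybil_chain) >
            log_likelihood(session, normal_chain))
```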
Are Workers Real People?

[Chart: % of reports from workers vs. hour of the day, for Zhubajie (ZBJ)
and Sandaha (SDH)]
• Activity tracks a human daily rhythm: low in the late night/early
  morning, higher through the work day and evening, with lunch and dinner
  visible in the curve
• The diurnal pattern suggests the workers are real people
Crowdsourced Sybil Detection

• How to detect crowdturfed Sybils?
  – They blur the line between real and fake
  – Difficult to detect algorithmically
• Anecdotal evidence that people can spot Sybils
  – 75% of friend requests from Sybils are rejected
  – Can people distinguish real from fake in general?
• User studies: experts, turkers, undergrads
  – What features give Sybils away?
  – Are certain Sybils tougher than others?
• Integration of human and machine intelligence
Survey Fatigue

[Two panels – US experts and US turkers: accuracy (%) and time per profile
(seconds) vs. profile order]
• All testers speed up over time
• Accuracy holds steady as testers progress: no meaningful fatigue effect
Sybil Profile Difficulty

[Chart: average accuracy per Sybil profile (%), for turkers and experts,
with Sybil profiles ordered by turker accuracy]
• Some Sybils are stealthier than others; a few profiles are really difficult
• Experts catch more of the tough Sybils than turkers do
• Experts perform well even on the most difficult Sybils