Fighting Spam, Phishing and Online Scams at the Network Level
Nick Feamster
Georgia Tech
with Anirudh Ramachandran, Shuang Hao, Nadeem Syed, Alex Gray, Sven Krasser, Santosh Vempala
Spam: More than Just a Nuisance
• ~95% of all email traffic is spam
– Image and PDF spam (PDF spam ~12%)
• As of August 2007, one in every 87 emails constituted a phishing attack
• Targeted attacks on the rise
– 20k-30k unique phishing attacks per month
Source: CNET (January 2008), APWG
Filtering
• Prevent unwanted traffic from reaching a user’s inbox by distinguishing spam from ham
• Question: What features best differentiate spam from legitimate mail?
– Content-based filtering: What is in the mail?
– IP address of sender: Who is the sender?
– Behavioral features: How is the mail sent?
Conventional Approach: Content Filters
• Trying to hit a moving target: images, PDFs, Excel sheets, ...and even mp3s!
Problems with Content Filtering
• Low cost of evasion: Spammers can easily alter the features of an email’s content
• Customized emails are easy to generate: Content-based filters need fuzzy hashes over content, etc.
• High cost to filter maintainers: Filters must be continually updated as content-changing techniques become more sophisticated
Another Approach: IP Addresses
• Problem: IP addresses are ephemeral
• Every day, 10% of senders are from previously unseen IP addresses
• Possible causes
– Dynamic addressing
– New infections
Idea: Network-Based Filtering
• Filter email based on how it is sent, in addition to simply what is sent
• Network-level properties are less malleable
– Set of target recipients
– Hosting or upstream ISP (AS number)
– Membership in a botnet (spammer, hosting infrastructure)
– Network location of sender and receiver
Challenges
• Understanding the network-level behavior
– What behaviors do spammers have?
– How well do existing techniques work?
• Building classifiers using network-level features
– Key challenge: Which features to use?
– Algorithms: SpamTracker and SNARE
• Building the system
– Dynamism: Behavior itself can change
– Scale: Lots of email messages (and spam!) out there
Data Collection: Spam and BGP
• Spam traps: Domains that receive only spam
• BGP monitors: Watch network-level reachability
• 17-month study: August 2004 to December 2005
Data Collection: MailAvenger
• Highly configurable SMTP server
• Collects many useful statistics
BGP “Spectrum Agility”
• Hijack IP address space using BGP
• Send spam
• Withdraw the IP address space
• A small club of persistent players appears to be using this technique
• Common short-lived prefixes and ASes (announcements lasting ~10 minutes):
– 61.0.0.0/8 (AS 4678)
– 66.0.0.0/8 (AS 21562)
– 82.0.0.0/8 (AS 8717)
• Accounts for somewhere between 1-10% of all spam (some clearly intentional, others might be route flapping)
Why Such Big Prefixes?
• Visibility: Route typically won’t be filtered (nice and short)
• Flexibility: Client IPs can be scattered throughout dark space within a large /8
– Same sender usually returns with different IP addresses
Characteristics of Agile Senders
• IP addresses are widely distributed across the /8 space
• IP addresses typically appear only once at our sinkhole
• Depending on the /8, 60-80% of these IP addresses were not reachable by traceroute when we spot-checked
• Some IP addresses were in allocated, albeit unannounced, space
• Some AS paths associated with the routes contained reserved AS numbers
Other Findings
• Top senders: Korea, China, Japan
– Still about 40% of spam coming from the U.S.
• More than half of sender IP addresses appeared only once
• ~90% of spam sent to traps came from Windows hosts
What about IP-based blacklists?
Two Metrics
• Completeness: The fraction of spamming IP addresses that are listed in the blacklist
• Responsiveness: The time for the blacklist to list an IP address after the first occurrence of spam
(A sketch of computing both metrics follows.)
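Both metrics fall out of a simple join between trap logs and blacklist snapshots. A minimal sketch, assuming hypothetical inputs: first_spam maps each spamming IP to the time of its first trapped message, and listed_at maps an IP to the time the blacklist first listed it (absent if never listed).

```python
from datetime import datetime, timedelta

def blacklist_metrics(first_spam, listed_at):
    """Completeness and responsiveness of a blacklist.

    first_spam: dict, spamming IP -> time of first trapped spam
    listed_at:  dict, IP -> time the blacklist first listed it
    (both are hypothetical input structures for illustration)
    """
    # Completeness: fraction of spamming IPs the blacklist lists at all
    listed = [ip for ip in first_spam if ip in listed_at]
    completeness = len(listed) / len(first_spam)

    # Responsiveness: delay between first spam and listing, for listed IPs
    delays = [listed_at[ip] - first_spam[ip] for ip in listed]
    avg_delay = sum(delays, timedelta()) / len(delays) if delays else None
    return completeness, avg_delay

# Toy example: one of two spamming IPs gets listed, six hours late
t0 = datetime(2007, 3, 1)
first_spam = {"10.0.0.1": t0, "10.0.0.2": t0}
listed_at = {"10.0.0.1": t0 + timedelta(hours=6)}
print(blacklist_metrics(first_spam, listed_at))  # completeness 0.5, delay 6h
```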
Completeness and Responsiveness
• 10-35% of spam is unlisted at the time of receipt
• 8.5-20% of these IP addresses remain unlisted even after one month
Data: Trap data from March 2007; Spamhaus from March and April 2007
Completeness of IP Blacklists
• ~95% of bots are listed in one or more blacklists (~80% listed on average)
• Only about half of the IPs spamming from short-lived BGP routes are listed in any blacklist
• Spam from IP-agile senders tends to be listed in fewer blacklists
[Figure: fraction of all spam received vs. number of DNSBLs listing the spammer]
What’s Wrong with IP Blacklists?
• Based on an ephemeral identifier (the IP address)
– More than 10% of all spam comes from IP addresses not seen within the past two months
• Dynamic renumbering of IP addresses
• Stealing of IP addresses and IP address space
• Compromised machines
• IP addresses of senders have considerable churn
• Often require a human to notice/validate the behavior
– Spamming is compartmentalized by domain and not analyzed across domains
Ephemeral: Addresses Keep Changing
• About 10% of IP addresses were never seen before in the trace
[Figure: fraction of IP addresses not previously seen]
Low Volume to Each Domain
• Most spammers send very little spam, regardless of how long they have been spamming
[Figure: amount of spam vs. sender lifetime (seconds)]
Where do we go from here?
• Option 1: Stronger sender identity
– Stronger sender identity/authentication may make reputation systems more effective
– May require changes to hosts, routers, etc.
• Option 2: Filtering based on sender behavior
– Can be done on today’s network
– Identifying features may be tricky, and some may require network-wide monitoring capabilities
Outline
• Understanding the network-level behavior
– What behaviors do spammers have?
– How well do existing techniques work?
• Building classifiers using network-level features
– Key challenge: Which features to use?
– Algorithms: SpamTracker and SNARE
• Building the system (SpamSpotter)
– Dynamism: Behavior itself can change
– Scale: Lots of email messages (and spam!) out there
SpamTracker
• Idea: Blacklist sending behavior (“behavioral blacklisting”)
– Identify sending patterns commonly used by spammers
• Intuition: It is much more difficult for a spammer to change the technique by which mail is sent than it is to change the content
SpamTracker Approach
• Construct a behavioral fingerprint for each sender
• Cluster senders with similar fingerprints
• Filter new senders that map to existing clusters
Some Patterns of Sending are Invariant
• Example: A sender at 76.17.114.xxx spams domain1.com, domain2.com, and domain3.com; after DHCP reassignment to 24.99.146.xxx, it spams the same domains
• The spammer’s sending pattern has not changed
• IP blacklists cannot make this connection
SpamTracker: Identify the Invariant
• A known spammer at 24.99.146.xxx spams domain1.com, domain2.com, and domain3.com; clustering on sending behavior yields its behavioral fingerprint
• An unknown sender at 76.17.114.xxx (e.g., a fresh infection) spams the same domains; clustering on its sending behavior yields a similar fingerprint!
Building the Classifier: Clustering
• Feature: Distribution of email sending volumes across recipient domains
• Clustering approach (sketched in code below)
– Build an initial seed list of bad IP addresses
– For each IP address, compute a feature vector: volume per domain per time interval
– Collapse into a single IP × domain matrix
– Compute clusters
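A minimal sketch of this pipeline, not the authors' implementation: it assumes a pre-aggregated input of (ip, domain, volume) triples for one time interval, uses cosine similarity (normalized dot products) between sending patterns as the affinity, and applies spectral clustering, the algorithm the summary slide names for SpamTracker. Taking the cluster fingerprint as the mean sending pattern of its members is also an assumption.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import cosine_similarity

def cluster_senders(records, n_clusters=10):
    """Cluster sender IPs by their sending pattern across recipient domains.

    records: iterable of (ip, domain, volume) triples for one time
    interval (a hypothetical, pre-aggregated input format)
    """
    ips = sorted({ip for ip, _, _ in records})
    domains = sorted({d for _, d, _ in records})
    ip_idx = {ip: i for i, ip in enumerate(ips)}
    dom_idx = {d: j for j, d in enumerate(domains)}

    # Collapse into a single IP x domain matrix of sending volumes
    M = np.zeros((len(ips), len(domains)))
    for ip, dom, vol in records:
        M[ip_idx[ip], dom_idx[dom]] += vol

    # Pairwise similarity of sending patterns (normalized dot products)
    S = cosine_similarity(M)

    labels = SpectralClustering(
        n_clusters=n_clusters, affinity="precomputed"
    ).fit_predict(S)

    # Fingerprint per cluster: mean sending pattern of its members
    fingerprints = np.array(
        [M[labels == c].mean(axis=0) for c in range(n_clusters)]
    )
    return ips, domains, labels, fingerprints
```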
Clustering: Output and Fingerprint
• For each cluster, compute a fingerprint vector over recipient domains
• New IPs will be compared to this “fingerprint”
[Figure: IP × IP matrix; intensity indicates pairwise similarity]
Classifying IP Addresses
• Given a “new” IP address, build a feature vector based on its sending pattern across domains
• Compute the similarity of this sending pattern to that of each known spam cluster (see the sketch below)
– Normalized dot product of the two feature vectors
– Spam score is the maximum similarity to any cluster
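The scoring step is a few lines on top of the clustering sketch above; this follows the slide's description (normalized dot product, maximum over clusters) and reuses the hypothetical fingerprints array from that sketch.

```python
import numpy as np

def spam_score(new_pattern, fingerprints):
    """Maximum normalized dot product (cosine similarity) between a new
    sender's pattern and any cluster fingerprint."""
    v = np.asarray(new_pattern, dtype=float)
    v_norm = np.linalg.norm(v)
    if v_norm == 0.0:
        return 0.0
    scores = []
    for f in fingerprints:
        f_norm = np.linalg.norm(f)
        if f_norm > 0.0:
            scores.append(float(v @ f) / (v_norm * f_norm))
    return max(scores, default=0.0)
```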
Evaluation
• Emulate the performance of a system that could observe sending patterns across many domains
– Build clusters/train on a given time interval
• Evaluate classification
– Relative to labeled logs
– Relative to IP addresses that were eventually listed
Data
• 30 days of Postfix logs from an email hosting service
– Time, remote IP, receiving domain, accept/reject
– Allows us to observe sending behavior over a large number of domains
– Problem: About 15% of accepted mail is also spam, which complicates validating SpamTracker
• 30 days of the Spamhaus database from the month following the Postfix logs
– Allows us to determine whether SpamTracker detects some sending IPs earlier than Spamhaus
Classification Results
• SpamTracker scores largely separate spam from ham, but are not always so accurate!
[Figure: distribution of SpamTracker scores for ham vs. spam]
Early Detection Results
• Compare SpamTracker scores on “accepted” mail to the Spamhaus database
– About 15% of accepted mail was later determined to be spam
– Can SpamTracker catch this?
• Of 620 emails that were accepted but sent from IPs that were blacklisted within one month, 65 had a score larger than 5 (85th percentile)
Evasion
• Problem: Malicious senders could add noise
– Solution: Use a smaller number of trusted domains
• Problem: Malicious senders could change sending behavior to emulate “normal” senders
– Need a more robust set of features...
Improving Classification
• Lower overhead
• Faster detection
• Better robustness (i.e., to evasion, dynamism)
• Use additional features and combine them for more robust classification
– Temporal: Interarrival times, diurnal patterns
– Spatial: Sending patterns of groups of senders
Outline
• Understanding the network-level behavior
– What behaviors do spammers have?
– How well do existing techniques work?
• Building classifiers using network-level features
– Key challenge: Which features to use?
– Algorithms: SpamTracker and SNARE
• Building the system (SpamSpotter)
– Dynamism: Behavior itself can change
– Scale: Lots of email messages (and spam!) out there
SNARE: Automated Sender Reputation
• Goal: Sender reputation from a single packet? (or at least as little information as possible)
– Lower overhead
– Faster classification
– Less malleable
• Key challenge: What features satisfy these properties and can distinguish spammers from legitimate senders?
Sender-Receiver Geodesic Distance
• 90% of legitimate messages travel 2,200 miles or less (computation sketched below)
[Figure: distribution of sender-receiver geodesic distance]
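This feature needs only a geolocation lookup plus the great-circle distance. A minimal sketch, where ip_to_latlon stands in for whatever IP geolocation database is available (hypothetical here):

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8

def geodesic_distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def sender_receiver_distance(sender_ip, receiver_ip, ip_to_latlon):
    """Distance feature for one message. ip_to_latlon is a stand-in for
    an IP geolocation lookup returning (lat, lon)."""
    return geodesic_distance_miles(*ip_to_latlon(sender_ip),
                                   *ip_to_latlon(receiver_ip))
```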
Density of Senders in IP Space
• For spammers, the k nearest senders are much closer in IP space (one way to compute this is sketched below)
[Figure: distance to the k nearest senders in IP space, spammers vs. legitimate senders]
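One way to approximate this density feature, assuming each IPv4 address is treated as a 32-bit integer and the monitor keeps a sorted list of recently seen sender IPs (both assumptions, not details from the deck):

```python
import bisect
from ipaddress import IPv4Address

def avg_knn_distance(ip, recent_senders, k=20):
    """Average numeric distance from `ip` to its k nearest neighbors
    among recently seen sender IPs.

    recent_senders: sorted list of sender IPs as integers (hypothetical
    state kept by the monitor)
    """
    x = int(IPv4Address(ip))
    i = bisect.bisect_left(recent_senders, x)
    # Merge outward from the insertion point to collect the k closest
    lo, hi, dists = i - 1, i, []
    while len(dists) < k and (lo >= 0 or hi < len(recent_senders)):
        d_lo = x - recent_senders[lo] if lo >= 0 else None
        d_hi = recent_senders[hi] - x if hi < len(recent_senders) else None
        if d_hi is None or (d_lo is not None and d_lo <= d_hi):
            dists.append(d_lo)
            lo -= 1
        else:
            dists.append(d_hi)
            hi += 1
    return sum(dists) / len(dists) if dists else float("inf")
```

A low average distance means the sender sits in a dense clump of recent senders, which the data above suggests is characteristic of spammers (e.g., bots in the same compromised subnet).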
Combining Features
• Put the features into the RuleFit classifier (workflow sketched below)
• 10-fold cross-validation on one day of query logs from a large spam-filtering appliance provider
• Uses only network-level features
• Completely automated
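RuleFit itself ships in neither scikit-learn nor the standard library, so as a stand-in illustration of the same workflow (network-level features only, 10-fold cross-validation) this sketch substitutes a gradient-boosted tree ensemble; the feature columns are assumptions based on the features discussed above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical feature matrix: one row per message, columns such as
# sender-receiver geodesic distance, average distance to the k nearest
# senders in IP space, sender AS number, time of day.
rng = np.random.default_rng(0)
X = rng.random((1000, 4))             # placeholder feature values
y = rng.integers(0, 2, 1000)          # placeholder labels (ham=0, spam=1)

clf = GradientBoostingClassifier()    # stand-in for RuleFit
scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validation
print("mean accuracy: %.3f" % scores.mean())
```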
Outline
• Understanding the network-level behavior
– What behaviors do spammers have?
– How well do existing techniques work?
• Building classifiers using network-level features
– Key challenge: Which features to use?
– Algorithms: SpamTracker and SNARE
• Building the system (SpamSpotter)
– Dynamism: Behavior itself can change
– Scale: Lots of email messages (and spam!) out there
Real-Time Blacklist Deployment: Approach
• As mail arrives, lookups are received at the blacklist (BL)
• Queries provide a proxy for sending behavior
• Train based on received data
• Return a score
Challenges
• Scalability: How to collect and aggregate data, and form the signatures, without imposing too much overhead?
• Dynamism: When to retrain the classifier, given that sender behavior changes?
• Reliability: How should the system be replicated to better defend against attack or failure?
• Sensor placement: Where should monitors be placed to best observe behavior and construct features?
Design Choice: Augment DNSBL
• Expressive queries
– Spamhaus: $ dig 55.102.90.62.zen.spamhaus.org
• Answer: 127.0.0.3 (listed in the exploits block list)
– SpamSpotter: $ dig receiver_ip.receiver_domain.sender_ip.rbl.gtnoise.net
• e.g., dig 120.1.2.3.gmail.com.1.1.207.130.rbl.gtnoise.net
• Answer: 127.1.3.97 (SpamSpotter score = -3.97; one possible decoding is sketched below)
• Also a source of data
– Unsupervised algorithms work with unlabeled data
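The slide does not spell out how the score is packed into the returned A record, but one decoding consistent with the example (127.1.3.97 yielding -3.97) treats the second octet as a sign flag and the last two octets as the integer and hundredths parts. A sketch under that assumption:

```python
def decode_spamspotter_score(answer):
    """Decode a SpamSpotter score from a DNSBL-style A record.

    Assumed encoding (inferred from the slide's example, not specified
    there): 127.s.i.f, where s = 1 marks a negative score and the
    value is i + f/100, so "127.1.3.97" -> -3.97.
    """
    o = [int(part) for part in answer.split(".")]
    assert o[0] == 127, "DNSBL answers live in 127/8"
    sign = -1 if o[1] == 1 else 1
    return sign * (o[2] + o[3] / 100.0)

print(decode_spamspotter_score("127.1.3.97"))  # -3.97
```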
Design Choice: Sampling
• Relatively small training samples can achieve low false positive rates
[Figure: false positive rate vs. sample size]
Sampling: Training Time
[Figure: training time vs. sample size]
Dynamism: Accuracy over Time
[Figure: classification accuracy as a function of time since training]
Improvements
• Accuracy
– Synthesizing multiple classifiers
– Incorporating user feedback
– Learning algorithms with bounded false positives
• Performance
– Caching/Sharing
– Streaming
• Security
– Learning in adversarial environments
Summary: Network-Based Behavioral Filtering
• Spam is increasing, and spammers are becoming agile
– Content filters are falling behind
– IP-based blacklists are evadable
• Up to 30% of spam is not listed in common blacklists at receipt; ~20% remains unlisted after a month
• Complementary approach: Behavioral blacklisting based on network-level features
– Blacklist based on how messages are sent
– SpamTracker: Spectral clustering; catches a significant amount of spam faster than existing blacklists
– SNARE: Automated sender reputation; ~90% of the accuracy of existing systems with lightweight features
– SpamSpotter: Putting it all together in an RBL system
References
• Anirudh Ramachandran and Nick Feamster, “Understanding the Network-Level Behavior of Spammers,” ACM SIGCOMM, 2006.
• Anirudh Ramachandran, Nick Feamster, and Santosh Vempala, “Filtering Spam with Behavioral Blacklisting,” ACM CCS, 2007.
• Nadeem Syed, Shuang Hao, Nick Feamster, Alex Gray, and Sven Krasser, “SNARE: Spatio-temporal Network-level Automatic Reputation Engine,” Georgia Tech Technical Report GT-CSE-08-02.
• Anirudh Ramachandran, Shuang Hao, Hitesh Khandelwal, Nick Feamster, and Santosh Vempala, “A Dynamic Reputation Service for Spotting Spammers,” Georgia Tech Technical Report GT-CS-08-09.
Additional History: Message Size Variance
• Senders of legitimate mail have a much higher variance in the sizes of the messages they send
• Surprising: Including this feature (and others requiring more history) can actually decrease the accuracy of the classifier
[Figure: message size range vs. classification, from certain spam through likely spam and likely ham to certain ham]