Spamming Techniques and Control By Neha Gupta Research Assistant, MINDLAB

Spamming Techniques and
Control
By Neha Gupta
Research Assistant, MINDLAB
University of Maryland-College Park
Contents






What is Spamming?
Cost, history and types of spam
Spam Statistics
Insight into Spammers' Minds
Spamming tricks and techniques
Spam Control Methods and Feasibility
What is Spamming?







Spamming is the abuse of electronic messaging systems to send unsolicited bulk messages, usually to promote products or services.
The most widely recognized form is email spam.
Other forms:
instant messaging spam
Usenet newsgroup spam
web search engine spam ("spamdexing")
spam in blogs
mobile phone messaging spam
Costs of Spams




Consumption of computer and network
resources.
Race between spammers and those who
try to control them.
Lost mail and lost time.
Spam cost United States organizations alone more than $10 billion in 2004.
History of Spam



The Internet was first established for educational and military purposes.
Probably the first spam was sent by an employee of Digital Equipment Corporation on the ARPANET in May 1978.
Canter and Siegel posted an advertisement for the "Green Card Lottery" to 6000 newsgroups in 1994.
Global Spam Categories







Product Email Attacks
Financial Email Attacks
Adult Email Attacks
Scams Email Attacks
Health Email Attacks
Leisure Email Attacks
Internet Email Attacks
Spam Statistics
About Spammers


Refer to themselves as 'bulk marketers', 'online e-mail marketers', or 'mail bombers'.
One of the main reasons people started spamming was its extremely low start-up cost (~ 1500 K).
Spam activities

Sending spam to sell their products
  Examples: pirated software and other easily distributable products
Harvesting email addresses
  Build lists of email addresses and sell them to other spammers.
Affiliate programs (the most common type)
  Click-through rate
  Commissions
  Can make $150-2000 per campaign
Spam Tricks

Top-to-bottom HTML encoding
  Code words as individual letters
Zero font size
Embedded image
  Text messages are embedded in images
Adding spaces or characters
  B*U*Y or B-U-Y
Misspelling
  Replace 'l' by '1', 'O' by '0'
Hashing
  Legitimate message attached with short spam message.
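The two character-level tricks above (inserted separators and look-alike digits) can be illustrated with a short Python sketch. The function names and the substitution table are assumptions made for illustration only; a real filter would need to handle many more variants, and this naive mapping also rewrites digits in legitimate text.

```python
import re

# Hypothetical look-alike table: digits that stand in for letters.
LOOKALIKES = str.maketrans({"1": "l", "0": "o"})


def obfuscate(word: str, sep: str = "*") -> str:
    """Insert a separator between letters, e.g. BUY -> B*U*Y."""
    return sep.join(word)


def normalize(token: str) -> str:
    """Undo both tricks: drop separators, then map look-alike digits back."""
    stripped = re.sub(r"[^A-Za-z0-9]", "", token)
    return stripped.translate(LOOKALIKES).lower()


print(obfuscate("BUY"))      # B*U*Y
print(normalize("B-U-Y"))    # buy
print(normalize("F0110W"))   # follow
```

A keyword filter that compares normalized tokens rather than raw tokens would see "B*U*Y" and "B-U-Y" as the same word.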
Ways to Send Spam/Bulk Mail
Multiple ISPs
Spoofing email addresses
Hacking/Viruses
Using Multiple ISPs

Example: spammers send short bursts of
messages every 20 seconds from 6
different computers using different ISPs
and in 12 hour time span can average
over 1.3 million messages.
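The arithmetic behind this example can be checked in a few lines of Python. The burst size is not stated on the slide; it is derived here from the other figures, so treat it as an implied value rather than a reported one.

```python
BURST_INTERVAL_S = 20        # one burst every 20 seconds (from the example)
COMPUTERS = 6                # six machines on different ISPs
WINDOW_S = 12 * 60 * 60      # 12-hour time span, in seconds
TOTAL_MESSAGES = 1_300_000   # total quoted in the example

bursts_per_computer = WINDOW_S // BURST_INTERVAL_S
total_bursts = bursts_per_computer * COMPUTERS
messages_per_burst = TOTAL_MESSAGES / total_bursts  # implied, not given

print(bursts_per_computer, total_bursts, round(messages_per_burst))
# 2160 12960 100
```

In other words, each burst only needs to carry about 100 messages for the six machines to reach 1.3 million in 12 hours.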
Spoofing email addresses

Email uses SMTP (Simple Mail Transfer Protocol), documented in RFC 821.
It was designed to be simple and easily usable.
Open relay SMTP servers
No need to verify your identity
Operates on port 25

Spoofing…
>telnet mail.abc.com 25
220 ss71.shared.server-system.net ESMTP Sendmail 8.12.11/8.12.11;
Fri, 8 March 2007 10:17:19 -0800
helo xyz.com
250 ss71.shared.server-system.net Hello [12.178.219.195], pleased to
meet you
mail from:
250 OK
rcpt to: jkl@mail.yahoo.com
DATA
Blah blah blah ..
<CRLF>.<CRLF>
250 OK
QUIT
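The session above shows why spoofing is easy: every identifying value is a plain string typed by the client. A minimal Python sketch of the client side of that dialogue makes this explicit. It only builds the command lines and does not open a network connection; the function name and the example sender address are hypothetical.

```python
def smtp_client_commands(helo_domain, mail_from, rcpt_to, body):
    """Return the lines a client sends in a minimal RFC 821 session.

    Nothing here is verified: the HELO domain and the MAIL FROM
    address are whatever the caller chooses to claim.
    """
    return [
        f"HELO {helo_domain}",
        f"MAIL FROM:<{mail_from}>",
        f"RCPT TO:<{rcpt_to}>",
        "DATA",
        body,
        ".",      # a line with a single dot ends the message
        "QUIT",
    ]


for line in smtp_client_commands(
    "xyz.com", "anyone@example.org", "jkl@mail.yahoo.com", "Blah blah blah .."
):
    print(line)
```

An open relay accepts these lines from any host, which is why the claimed sender cannot be trusted on its own.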
Phishing


Phishers attempt to fraudulently acquire
sensitive information, such as usernames,
passwords and credit card details, by
masquerading as a trustworthy entity in
an electronic communication.
eBay and PayPal are two of the most targeted companies, and online banks are also common targets.
Zombies


More than 80 percent of all spam
worldwide comes from zombie PCs owned
by businesses, universities, and average
computer owners, says MessageLabs, an
e-mail security service provider.
Zombie PCs are computers that have been
infected by malicious code that allows
spammers to use them to send e-mail.
Spam Control Ideas
Content or Point Based Spam Filtering
Postage/Stamp Based Spam Filtering

Content/Point Based Spam
Filtering




Rule Based Approach
Whitelist/Verification filters
Distributed adaptive blacklists
Bayesian filters
Rule Based Approach
Email is compared with a set of rules to determine whether it is spam, with a weight given to each rule. E.g. SpamAssassin

Advantages




Very effective with a given set of rules/conditions
Accuracy 90-95%
No need of training
Rules can be updated

Disadvantages


No self-learning facility available for the filter.
Spammers with knowledge of the rules can design spam to deceive the method.
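A weighted rule set of this kind can be sketched in Python as follows. The three rules, their weights, and the threshold are invented for illustration; they are not SpamAssassin's actual rules or scores.

```python
import re

# Hypothetical rules: (description, pattern, weight).
RULES = [
    ("mentions a free offer", re.compile(r"\bfree\b", re.I), 1.5),
    ("spaced-out BUY", re.compile(r"b[\*\-\s]u[\*\-\s]y", re.I), 2.5),
    ("three or more exclamation marks", re.compile(r"!{3,}"), 1.0),
]
THRESHOLD = 3.0  # assumed cut-off; tuned by hand in a real deployment


def score(message: str) -> float:
    """Sum the weights of all rules whose pattern appears in the message."""
    return sum(weight for _, pattern, weight in RULES if pattern.search(message))


def is_spam(message: str) -> bool:
    return score(message) >= THRESHOLD


print(score("B*U*Y now, FREE pills!!!"))   # 5.0
print(is_spam("Lunch is free today"))      # False
```

The sketch also shows the disadvantage listed above: a spammer who knows the rules can avoid the word "free" and the exact "B*U*Y" pattern and stay under the threshold.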
Blacklist Approach


Spammers and open relays that are found to be sources of spam are blacklisted.
Blacklist can be maintained both at
personal and server level.

Advantages



Useful when servers are compromised and used for sending spam to hundreds of thousands of users.
Can be a better option when used at ISP level.
Tools like Razor and Pyzor can be used for this purpose.

Disadvantages

As soon as the spammer learns that the computer is being detected, he can use a different computer.
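A minimal Python sketch of a source blacklist, assuming sources are identified by IP address. The class name is hypothetical and the addresses come from the ranges reserved for documentation; this is not how Razor or Pyzor work internally.

```python
class Blacklist:
    """Set of reported spam sources; could live on one PC or on a server."""

    def __init__(self):
        self._blocked = set()

    def report(self, source_ip: str) -> None:
        """Record a source that was found sending spam."""
        self._blocked.add(source_ip)

    def allows(self, source_ip: str) -> bool:
        return source_ip not in self._blocked


blacklist = Blacklist()
blacklist.report("192.0.2.10")             # compromised server gets reported
print(blacklist.allows("192.0.2.10"))      # False: mail from it is rejected
print(blacklist.allows("198.51.100.7"))    # True: a fresh machine gets through
```

The last line is the disadvantage in practice: moving to an unlisted computer bypasses the list until that machine is reported too.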
Whitelist Approach

An aggressive technique for spam filtering.
Used in mailing lists; for example, only users subscribed to the mailing list can send messages to the list.
Mail from an unknown email address requires a confirmation message the first time that address posts. A confirmation reply adds the address to the whitelist.
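The confirmation flow can be sketched in Python as below. The class and method names are hypothetical; a real system would also send the confirmation email and expire held messages.

```python
class ConfirmingWhitelist:
    """Deliver mail from known senders; hold mail from unknown ones."""

    def __init__(self, members=()):
        self.whitelist = set(members)
        self.pending = {}  # sender -> messages held until confirmation

    def receive(self, sender: str, message: str) -> str:
        if sender in self.whitelist:
            return "delivered"
        self.pending.setdefault(sender, []).append(message)
        return "confirmation requested"

    def confirm(self, sender: str) -> list:
        """Called when the sender replies to the confirmation message."""
        self.whitelist.add(sender)
        return self.pending.pop(sender, [])  # held messages to release


wl = ConfirmingWhitelist(members=["alice@example.org"])
print(wl.receive("alice@example.org", "hi"))    # delivered
print(wl.receive("bob@example.org", "hello"))   # confirmation requested
print(wl.confirm("bob@example.org"))            # ['hello']
print(wl.receive("bob@example.org", "again"))   # delivered
```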
Bayesian Spam Filters
(Statistical Models)


Use probabilistic approach
Have to be trained, not self learning.

Advantages





Very popular
Can be customized for each user
No need for a centralized mechanism
Everyone relies on them
Disadvantages


False Positives
Based on words.
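A word-based probabilistic filter can be sketched as a small naive Bayes classifier in Python. This is a generic textbook formulation with Laplace smoothing, not the exact method of any particular product, and it assumes at least one training message per class. The training sentences are invented.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Count the words of one message labelled 'spam' or 'ham'."""
        self.words[label].update(text.lower().split())
        self.messages[label] += 1

    def _log_score(self, text: str, label: str) -> float:
        counts = self.words[label]
        total = sum(counts.values())
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        # Log prior plus Laplace-smoothed log likelihood of each word.
        score = math.log(self.messages[label] / sum(self.messages.values()))
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total + vocab))
        return score

    def is_spam(self, text: str) -> bool:
        return self._log_score(text, "spam") > self._log_score(text, "ham")


f = NaiveBayesFilter()
f.train("buy cheap pills now", "spam")
f.train("cheap pills buy", "spam")
f.train("meeting at noon tomorrow", "ham")
f.train("lunch at noon", "ham")
print(f.is_spam("cheap pills"))       # True
print(f.is_spam("meeting at noon"))   # False
```

Because each user trains the filter on their own mail, the word statistics differ per user, which is the customization advantage listed above. The dependence on words is also visible: a misspelled "p1lls" is simply an unseen word to this model.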
Postage/Stamp Method


A proactive measure against spam.
Based on economics.
"When sending an email to someone, the sender attaches a stamp to his message, a token that is costly to the sender but demonstrates his good faith."
Types of Postage Payment Methods

Monetary Payment Method




The first time a sender sends a message, he includes a cheque redeemable as money by the recipient's stamp-processing software.
Postage can be returned in the reply.
After that, each party is on the other's whitelist.
Obstacle: security problems related to e-cash.
Postage ~ computing resources


The sender's software performs a computationally expensive calculation whose result is relatively easy for the receiver to check.
E.g. calculation of a hash message digest, as used in the CAMRAM project.
Payment ~Human Time


Automated reply from the recipient's software.
The sender connects to a web page and proves to be human by spending time on a simple test that, to date, only humans can pass.
CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart
Implementation of Stamp Payment
Protocols



Standardize an email postage payment protocol.
MUA (Mail User Agent) modification is necessary.
Stamps will be attached to emails in envelopes and headers; care should be taken in picking the encoding convention.
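As a rough illustration of carrying a stamp in a message header, the Python sketch below uses the standard library's email package. The header name X-Postage-Stamp and the stamp string format are hypothetical; no standard for them is described here. Keeping the stamp to plain ASCII sidesteps the header-encoding concern mentioned above.

```python
from email.message import EmailMessage

STAMP_HEADER = "X-Postage-Stamp"  # hypothetical header name


def attach_stamp(msg: EmailMessage, stamp: str) -> EmailMessage:
    """Sender-side MUA: add the stamp as an extra header."""
    msg[STAMP_HEADER] = stamp
    return msg


def read_stamp(msg: EmailMessage):
    """Receiver-side MUA: return the stamp, or None if there is none."""
    return msg.get(STAMP_HEADER)


msg = EmailMessage()
msg["To"] = "jkl@mail.yahoo.com"
msg.set_content("hello")
attach_stamp(msg, "v1:12:4095")   # made-up stamp value

print(read_stamp(msg))              # v1:12:4095
print(read_stamp(EmailMessage()))   # None
```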
Business Models for Spreading
Postage




Sale of services to IT departments.
Sale of ready-to-use software.
Investment of deposits on postage
accounts.
Sale of marketing services
Conclusion


Spam costs time and resources.
The design of any information centric
system should be such that it can prevent
the misuse of resources by malicious
users.
References




http://www.symantec.com/avcenter/reference/Symantec_Spam_Report__January_2007.pdf
http://fare.tunes.org
An Essay on Spam, Paul Graham
Norman Report, Why spammers spam
Acknowledgements


Prof. Ashok Agrawala
Mudit Agrawal- proof reading
VIDEO CLIP
http://video.google.com/videoplay?docid=8246463980976635143&q=luis+von+ahn
THANKS & QUESTIONS