Abstract
Over the past 5 years the Internet has changed vastly from its academic origins to become the
lifeblood of global business communications. Unfortunately, along with all the obvious benefits
associated with such connectivity there is a downside. This paper aims to investigate the carrier
class virus phenomenon and propose solutions.
Historically, computer viruses spread primarily via booting from infected disks or executing
infected files or documents. In every case, a human element has factored in a virus's ability to
pollinate. However, this situation has recently changed and with it so have some important factors
regarding virus protection.
Email-aware viruses such as Melissa, Happy99 or ExploreZip are able to pollinate themselves
instantly and with great efficiency. Worse still, the powerful scripting languages built into
today’s email clients and office suites make creating such viruses a comparatively easy task.
Statistically, an average of 1 in every 1,500 emails contains a virus.
In order to help counter this trend towards email-capable viruses, ISPs must take more
responsibility for the email they forward and provide an effective front line of defense for their
customers. In this presentation, MessageLabs will introduce supporting data to highlight important
new issues and trends in virus protection and detail how ISPs can efficiently and easily integrate
virus scanning into their networks.
The Internet is at the core of the problem so it is logical that it should be at the core of the solution.
Contents:
1. Emerging Virus Trends
2. Why Are Viruses So Prolific?
   2.1 Networked Plumbing
   2.2 Too Much Functionality
   2.3 Common Platforms
3. Current Solutions are Outdated
4. Evolving a New Approach: Scanning For Viruses at the Internet Level
   4.1 How Does the Virus Scanning System work?
   4.2 How is the Virus Scanning System Deployed?
   4.3 Skeptic™
5. Mail Encryption
6. Making Use of Live Statistics
http://www.messagelabs.com/
Mark Sunner, MessageLabs, November 2000
Section 1.
Emerging Virus Trends
Recent developments in the industry have seen virus writers incorporating email capabilities into
viruses with devastating effects. Older techniques used to increase the impact or longevity of a
virus, such as stealth and polymorphism, have become old hat. By using email as a method of
distribution, it is possible to write a virus capable of infecting thousands of computers in a matter
of minutes. Worse still, creating such viruses has become easy due to the immense
programming capabilities contained within today’s powerful office suites.
Since the first PC virus appeared on the scene a little over 12 years ago, the number of known
viruses has grown to approximately 40,000-50,000. However, whilst this figure may sound dramatic,
it is not an indication of the overall threat: at any point in time only a very small proportion of
viruses are actually in the wild causing damage. The current number of viruses in the wild is
estimated to be around 400. Although this figure excludes variants, it still seems small when
contrasted against the number of documented incidents.
Traditionally, virus incidents are hard to track and obtain accurate data on, as this information is
often kept “under wraps”. In December 1999, Dell Computer Corporation were publicly exposed
as being hit by the Funlove virus.
Section 2.
Why are Viruses So Prolific?
It is easy to see that the number of incidents has risen sharply over the past 18 months. In order
to see what conclusions can be drawn and what preventative steps can be taken, let’s first take a
closer look at the three main contributory elements.
2.1
Networked Plumbing
We’re all connected! The corporate world has just spent the last 5 years feverishly gluing itself
together. The major driving force behind this is email.
Five years ago, the chances of the author of this paper being able to email you, the reader, were
probably around 50/50. Nowadays, having an email address at work is taken as a given. Aside
from the corporate world, domestic connectivity is catching up fast. According to a recent Email
Marketing Report (February 2000), there were 409 million email accounts worldwide in 1999, up
from 234 million the previous year, a growth rate of roughly 75%. With current estimates
projecting 700M+ mail accounts worldwide by the year 2005, email will be truly woven into the
fabric of society.
Email is a type of plumbing that links us all together – globally. It does not take a rocket scientist
to see that viruses engineered to exploit this global connectivity have the potential to wreak
havoc worldwide. This is what we are seeing right now. According to a recent study published by
ICSA Labs, The Computer Virus Prevalence Survey 2000 (Nov 2000), there have been very
significant changes in virus distribution methods – in 1996, 9% of viruses were distributed by
email, in 2000 email was the main method of virus distribution at 87%.
2.2
Too much functionality?
Consider the Scenario:
Date: 25th December 1978, Christmas Day morning.
Venue: My parents’ house.
It’s Christmas day and I’ve just opened my largest Christmas present. It’s a chemistry set and I’m
a very happy 8 year old boy! I can’t believe my luck! I can clearly remember taking out and ogling
nearly 100 test tubes full of various potions and powders.
I can also remember my parents returning to the house, after having been with friends for just an
hour. They find me standing in the center of the living room, looking very guilty. The living room
was proudly sporting a new look, a fairly uniform dark purple splatter. Having been handed all the
essential tools (chemicals, a large test tube, methylated spirit and a wick) it was just a matter of
time before I’d try my hand at some form of explosive.
All this brings me to the powerful programming languages now incorporated as standard in
today’s Office Application Suites and browsers. A while ago, writing a virus was just like writing
any program and required a bit of savvy on behalf of the author. Rapid development tools we
now take for granted simply did not exist in the past and this fact kept many would-be virus
writers at bay. Putting the obvious malicious aspect to one side for a moment this was a job for
men not boys. However the proliferation of the Internet has accelerated the need for new
development tools and environments that can exploit this connectivity.
Visual Basic for Applications (VBA) and Visual Basic Script (VBS) both present obvious
advantages for power users who want better integration between applications, but they also
present virus authors with a perfect toolkit for quickly developing viruses based around Office
applications. More recently, the ability to easily gain access to just about any conceivable email
function is responsible for the sharp increase in the number of worm type macro viruses now
propagating in the wild. This is a growing trend, which is visible on just about every virus
prevalence table available.
Given the right tools, it is only a matter of time before a certain combination will cause havoc.
2.3
Common Platforms: Microsoft Outlook
Consider the Scenario:
Hybridization was the term given to a crude form of genetic engineering back in the 1960s. The
idea was simple: cross the biggest, hardiest, fastest-growing varieties of corn (or any other crop)
until you end up with SUPERCORN. SUPERCORN is so much better than regular corn. It yields
more bushels per acre and is more resistant to disease. Soon there are millions of acres of
SUPERCORN that are not only the same variety, but since it is derived from a single master plant
it is genetically identical. And that's the dark side of hybridization, because if a new disease
comes along that does bother SUPERCORN, it affects the entire crop. There is always the risk
that the entire crop will die all at once.
This brings us on to the Love Bug, the latest in a recent spate of computer virus furor that caused
billions of dollars in damage. Hybridization comes into this drama because the I Love You worm
isn't just a computer virus or a PC virus or a Windows virus or even an e-mail virus. I Love You is
specifically a Microsoft Outlook/Visual Basic virus. It takes advantage of features in this
SUPERCORN of e-mail programs to cause damage to the greatest possible number of users.
Our networks, desktops and office suites are literally littered with SUPERCORN like Outlook.
This is a strong argument for genetic diversity in software because what made the Love Bug so
costly was Microsoft's success at getting people to use its software.
In summary:
• We are all totally connected via the Internet and email.
• Writing viruses is easier than ever before.
• Once flaws are discovered in common software platforms, virus writers will exploit them,
  impacting a massive installed base.
Section 3.
Current Solutions are Outdated
An important conclusion that can be drawn is that anti-virus solutions have been slow in evolving
to meet the new virus threat. Anti-virus solutions in the main are still focused on the desktop or
local gateway. A desktop strategy should certainly always exist as people will always continue to
use suspect floppy disks and CDs from a magazine. Such solutions are however inadequate at
dealing with a global outbreak such as the Love Bug.
To help illustrate this, suppose we could step into a time machine and travel back four years...
Four years in the past, somewhere in the Philippines, our bad guy sits hunched over his keyboard
in a darkened room illuminated only by the phosphor glow of his monitor. He is about to release a
virus into a CompuServe forum. However, back then, the virus he is about to release will not
become a threat to the “computer user” over in the UK for weeks, if not months or years.
The essential ingredient for virus pollination at this time is people. The speed at which
people can exchange floppy disks, .exe files and office documents is a very limiting factor – at
this point, we are not all glued together. The anti-virus strategies around at this time were
adequate for dealing with the virus problem.
Our time machine now whizzes forward to the present day. This time, our bad guy sits at his flat
screen and posts his email-aware virus into a newsgroup. It’s a simple bit of Visual Basic Script
(VBS) and thanks to COM, it is easily able to locate and exploit Microsoft Outlook, the dominant
mail client.
Depending on whose address book contains my email details, this virus threat could arrive within
minutes. My desktop AV software does not know about this new virus nor can it catch it
heuristically. It takes, on average, 6 hours for my AV vendor to obtain a sample and make a
signature available. As everyone is then trying to obtain the signature at the same time from my
AV vendor’s public web site, the vendor’s FTP server becomes unavailable. In effect, a pseudo
denial of service attack is being performed.
Working in this way places a huge burden on network managers and dramatically increases the
amount of anti-virus activity necessary to provide effective protection. Tasks such as
maintaining multiple scanners and updating virus signatures several times a day are often
impractical for network administrators.
Section 4.
Evolving a New Approach: Scanning for Viruses at the
Internet Level
In order to provide better virus protection we need to implement scanning and detection systems
higher up the food chain of mail delivery where economies of scale make a more sophisticated
approach possible. The logic lies in scanning for email viruses at the Internet level.
MessageLabs has been providing a virus scanning service at the ISP level for 2 years. The
system took just under a year to develop and essentially integrated several commercial virus
scanners into an ISP’s mail infrastructure, plus a rules-based system for emergency outbreaks
(such as Lovebug). The greatest hurdle to overcome with ISP virus scanning is scalability.
From the first live date the Virus Scanning System started to intercept approximately 40 viruses
each day, instantly vindicating the approach. Today the system has evolved into a carrier class
solution and regularly intercepts in excess of 1000 viruses per day. As an important aside the
system also generates valuable data about what viruses are actually in the wild in real-time and
the effectiveness of many commercial scanners against our ‘live list’.
The chart below details the progression of MessageLabs virus scanning technology and
effectiveness over the last year. Some of the major lessons learned along the way were that more
than one virus scanner is needed and that the overriding threat to computer users lay within
macro viruses.
We ultimately settled on 3 scanners, finding that this number gave the most effective results
without reducing performance. Significant re-coding of the mail platform was necessary in order
to provide the self-tuning required to ensure that latency remained low.
Failure to use more than one scanner resulted in an average 3% non-detection rate. Given that a
single virus could prove devastating, this was thought to be unacceptable.
The average figure for virus detection across all email is 1 virus in every 1,500 emails. From this
we are able to estimate that an average of roughly 67 viruses will pass through an ISP’s network
for every 100,000 messages handled (100,000 / 1,500 ≈ 66.7). This formula makes it easy for any
ISP to estimate the potential effectiveness of implementing virus scanning technology.
Analysis over a long period of time has shown that a greater percentage of viruses come from
‘free’ mail accounts than from general private domains. Further investigation revealed that the
average number of viruses contained within popular free mail accounts soars to 1 in 500. From
this information we are able to broadly estimate that the average number of viruses passing
through an ISP increases to 200 in 100,000 for free mail messages handled. We suspect webmail
vastly increases the promiscuous use of multiple computers for handling documents and hence
increases the chance of infection.
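The estimate above is simple proportional arithmetic, and an ISP can repeat it for its own traffic volume. The following Python sketch is illustrative only; the function name is our own and the two rates are the averages quoted in the text.

```python
def expected_viruses(messages: int, one_in: int) -> float:
    """Expected number of infected messages at a rate of 1 virus per `one_in` emails."""
    if one_in <= 0:
        raise ValueError("one_in must be positive")
    return messages / one_in


# Rates quoted in the text: 1 in 1,500 across all email, 1 in 500 for free mail accounts.
general = expected_viruses(100_000, 1500)
free_mail = expected_viruses(100_000, 500)
print(f"general: {general:.1f}, free mail: {free_mail:.1f}")
# prints "general: 66.7, free mail: 200.0"
```

These are long-run averages; during an outbreak the observed figures would be expected to be considerably higher.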
4.1
How does the Virus Scanning System Work?
The system, known as the MessageLabs Virus Control Center (VCC), runs on scalable
architecture comprising a cluster of towers densely populated with high performance servers,
Cisco Catalysts and load distributors which host McAfee, F-Secure and Vfind virus scanning
software. We have a rolling program to locate tower clusters at major peering locations around
the world. Currently installations are deployed in London, Amsterdam and New York. Additional
Towers will be deployed on a Global basis during Q2 and Q3 2001.
Each tower comprises 2 Cisco load balancers, 2 Cisco high performance 100 Mbps switches, 24
industrial PCs each running the MessageLabs proprietary mail engine, and temperature and fan
monitors linked to all PCs. A management system (PC) ensures that load is not only distributed
but also tuned dynamically. If an individual mail server queue becomes excessive the
management system lowers the delivery priority of the affected system. Excessively large
emails are handled by a separate “Big-email server” to permit a more even flow. Should either
the management server or big email server fail then an election is forced and other systems will
take over either role. In order to perform the scan, we intercept all mail as it passes through our
system so that it can be processed and examined for viruses before being allowed to continue to
its ultimate destination.
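The load tuning described above can be pictured with a small sketch. This is not the MessageLabs management system; it is a minimal Python illustration in which the size cut-off, the queue limit and all names are assumptions chosen for the example.

```python
from dataclasses import dataclass

# Both thresholds are assumptions for illustration; the paper does not state them.
BIG_EMAIL_BYTES = 5 * 1024 * 1024   # size at which mail goes to the "big-email server"
QUEUE_LIMIT = 200                   # queue depth regarded as excessive


@dataclass
class MailServer:
    name: str
    queue_len: int = 0


def delivery_weight(server: MailServer) -> float:
    """Full priority up to the limit, then priority falls as the queue grows."""
    if server.queue_len <= QUEUE_LIMIT:
        return 1.0
    return QUEUE_LIMIT / server.queue_len


def choose_server(servers, big_server, message_size):
    """Send very large mail to the dedicated server, other mail to the best-placed one."""
    if message_size >= BIG_EMAIL_BYTES:
        return big_server
    # Highest weight wins; ties are broken in favour of the shortest queue.
    return max(servers, key=lambda s: (delivery_weight(s), -s.queue_len))
```

For example, with one scanning server holding 500 queued messages and another holding 10, a normal-sized message is handed to the second, while a 10 MB message goes to the big-email server regardless of queue depth.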
After being delivered to a tower the SMTP session is distributed to one of the scanning mail
servers. The session is authenticated against the customer database to ensure that it is either
coming from or going to a known customer. If authentication succeeds then mail and associated
file attachments are decoded using open standard formats (nested and combinations are also
decoded) and passed to the binary queue for scanning. During this process if an abnormal file
structure is unpacked such as a “Zip of Death” expanding to several terabytes the file is rejected
and an error message sent. Otherwise the file is passed through three commercial virus scanners
and Skeptic™, MessageLabs’ own proprietary heuristics and rules-based scanner, which is used to
detect the very latest viruses for which no signature is available.
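One way to picture the “Zip of Death” check is to compare the size an archive claims it will unpack to against a ceiling before unpacking anything. The Python sketch below uses only the standard zipfile module; the two limits and the function names are assumptions for illustration, not the values or code used in the VCC.

```python
import io
import zipfile

MAX_UNPACKED_BYTES = 100 * 1024 * 1024  # assumed ceiling on total unpacked size
MAX_RATIO = 100                         # assumed ceiling on unpacked/packed ratio


def is_archive_bomb(data: bytes, max_unpacked: int = MAX_UNPACKED_BYTES,
                    max_ratio: int = MAX_RATIO) -> bool:
    """Return True if the ZIP's declared unpacked size looks abnormal.

    The sizes are read from the archive headers, which can be forged, so a
    fuller check would also count bytes while unpacking and recurse into
    nested archives.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        declared = sum(info.file_size for info in archive.infolist())
    if declared > max_unpacked:
        return True
    return declared > max_ratio * max(len(data), 1)


def make_zip(name: str, payload: bytes) -> bytes:
    """Helper for the example: build a one-file ZIP in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, payload)
    return buffer.getvalue()
```

A megabyte of zero bytes compresses to roughly a kilobyte, so `is_archive_bomb(make_zip("zeros.bin", b"\0" * 1_000_000))` trips the ratio test, whereas a small text attachment passes.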
If all is well the message finally passes to the Processed queue where an optional corporate
“Scan successful” message can be added. If a virus is detected the email is moved to a
quarantine area where it will remain retrievable for a period of 10 days. Once 10 days have
elapsed the email is destroyed.
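The routing decision at the end of the scan can be sketched as follows. The scanners here are stand-in functions rather than real scanner APIs, and the queue names and dictionary layout are hypothetical; only the 10-day retention period comes from the text.

```python
from datetime import datetime, timedelta

QUARANTINE_DAYS = 10  # retention period stated in the text


def scan_message(payload: bytes, scanners) -> list:
    """Run every scanner and return the names of those that flagged the payload."""
    return [name for name, detect in scanners.items() if detect(payload)]


def route(payload: bytes, scanners, now: datetime) -> dict:
    """Quarantine if any scanner reports a virus, otherwise pass to the processed queue."""
    hits = scan_message(payload, scanners)
    if hits:
        return {"queue": "quarantine", "hits": hits,
                "destroy_after": now + timedelta(days=QUARANTINE_DAYS)}
    return {"queue": "processed", "hits": [], "destroy_after": None}


def is_expired(entry: dict, now: datetime) -> bool:
    """True once a quarantined message has reached the end of its retention period."""
    return entry["destroy_after"] is not None and now >= entry["destroy_after"]
```

Treating a single positive result as sufficient is what allows several scanners to cover one another's blind spots, at the cost of inheriting every scanner's false alarms.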
A separate Health monitor process constantly monitors the status of the mail server to ensure
that all processes are running smoothly, security trip wires have not activated and that adequate
disk space always remains. Health reports are then fed back to the management system, which
is monitoring the health of the tower as a whole. These reports are in turn fed back to our central
monitoring system which keeps an eye on our worldwide network.
Several external processes exist to get updated information such as new customers or
configuration changes of existing customers and also virus signature updates. As new data is
received it is encrypted and delivered to a distribution host. Once there, a trigger is sent (push) to
inform all the relevant towers that new data is waiting to be collected. The affected towers then
collect (pull) the new data in via the management system. This distribution method is part of an
overall security policy ensuring no communication is allowed into the towers other than SMTP.
4.2
How is the VCC Service Deployed?
Getting a customer’s mail scanned by the VCC is a very simple process but varies slightly
depending on whether the customer has a leased line or dial-up connection. Both scenarios are
described as follows:
Leased Line:
Signing up leased line customers onto the service simply entails making the VCC the
lowest-numbered (highest-priority) MX record for a given domain. Once the scan has been
completed the VCC then relays the mail onto
its final destination. Outbound mail is relayed through the VCC by making a simple change to the
customer’s outgoing mail gateway.
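The effect of the MX change can be illustrated with a few lines of Python. Sending mail servers try the MX record with the lowest preference number first and fall back to higher numbers only if it cannot be reached. The host names and preference values below are hypothetical.

```python
def delivery_order(mx_records):
    """Return hosts in the order a sending server tries them.

    `mx_records` is a list of (preference, host) pairs. Lower preference
    numbers are tried first. Real mail servers pick at random among hosts
    with equal preference; sorting by host name here just keeps the sketch
    deterministic.
    """
    return [host for _, host in sorted(mx_records)]


# Hypothetical records for a customer domain after sign-up.
records = [
    (20, "vcc-backup.example.net"),   # second scanning site, used if the first is down
    (10, "vcc.example.net"),          # primary scanning site
]
print(delivery_order(records))
# prints "['vcc.example.net', 'vcc-backup.example.net']"
```

The same mechanism gives the multi-site fallback: a second scanning site listed at a higher preference number receives mail only when the first cannot be contacted.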
ISDN/Dial up:
The dialup scenario is slightly more complex depending on whether the final mail relay can be
contacted. In the majority of cases, mail for dial up customers is simply queued at the ISP until
the customer connects and triggers mail delivery with something like a finger command or pulls
the mail via POP. In either case it is necessary to have a private DNS structure to prevent mail
looping back to the VCC where public DNS and the corresponding MX record will point. Creating
a private DNS structure is a simple process.
Scanning mail via the VCC introduces virtually no perceptible delay on the overall delivery of mail.
During a four-week sample period processing approximately 1 million messages per day
MessageLabs calculated that the average message size being processed was 66K and that the
time taken to process these messages under normal load was 1.2 seconds.
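To put these figures in perspective, the Python sketch below converts them into an average arrival rate and an average number of messages in the system at any moment. It assumes traffic is spread evenly over the day, which real mail traffic is not, and uses Little's law (average number in system = arrival rate x time in system). It is a back-of-envelope check, not a capacity model.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def arrival_rate(messages_per_day: float) -> float:
    """Average messages per second, assuming an even spread across the day."""
    return messages_per_day / SECONDS_PER_DAY


def avg_in_flight(messages_per_day: float, seconds_per_message: float) -> float:
    """Little's law: average messages in the system at any one time."""
    return arrival_rate(messages_per_day) * seconds_per_message


rate = arrival_rate(1_000_000)
in_flight = avg_in_flight(1_000_000, 1.2)
print(f"{rate:.1f} msg/s, {in_flight:.1f} messages in flight")
# prints "11.6 msg/s, 13.9 messages in flight"
```

Peak-hour rates will be well above the daily average, so these values are a lower bound for sizing purposes.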
Scalability and resilience
From the beginning the architecture behind the VCC solution was designed to be both scalable
and resilient. Resilience starts within the towers, which have dual load balancers, dual switches
and 24 mail servers for ultra redundancy. Towers are also always deployed as a pair so if a
whole tower should fail it will always have an immediate twin ready to take over. Should a whole
site fail, multi-site redundancy is achieved via MX records, which permit a dark site to take over
the handling of mail.
MessageLabs’ BackOffice architecture uses MS SQL Server 7 for Windows NT. A centralised
server cluster incorporates load balancing and redundancy to provide high performance,
scalability and fault tolerance for management of customer configuration data, and customer
statistics data. Additional offsite servers provide further contingency.
4.3
Skeptic – MessageLabs’ proprietary heuristics and rules based scanner
Skeptic is MessageLabs’ own virus scanner. Using our wealth of mail experience, we have
developed a set of heuristics, which Skeptic uses to detect new viruses in email.
Skeptic has been very successful, trapping the following viruses BEFORE signatures were
available:
ExploreZip
Irok
JS/Kak.A
JS/Kak.days
NewApt
Lifestages.Worm
PrettyPark.variant
VBS/Fireburn.A
VBS/LoveBug
WinExt.worm
WScript\Unicle.worm
As an example of email heuristics, an email going to 20 or more recipients, and containing an
office document with a macro, would be suspicious. If the macro also contained code that mass
emailed, that would be very suspicious!
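The example heuristic can be written down directly. The sketch below is not Skeptic's code; it is a minimal Python rendering of the rule as stated, with hypothetical names, and it assumes the facts about the message (recipient count, presence of a macro, whether the macro sends mail) have already been extracted elsewhere.

```python
from dataclasses import dataclass


@dataclass
class MessageFacts:
    recipient_count: int
    has_office_macro: bool          # an attached Office document contains a macro
    macro_sends_mail: bool = False  # that macro contains mass-mailing code


def suspicion(facts: MessageFacts, recipient_threshold: int = 20) -> str:
    """Grade a message using the example heuristic from the text."""
    if facts.recipient_count >= recipient_threshold and facts.has_office_macro:
        return "very suspicious" if facts.macro_sends_mail else "suspicious"
    return "normal"


print(suspicion(MessageFacts(recipient_count=50, has_office_macro=True)))
# prints "suspicious"
```

A production heuristic would weigh many more signals than this single rule; the point is that each signal is cheap to test once the message has been decoded.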
Following the discovery of Bubbleboy we have added analysis of HTML formatted email - we
search for virus-like executable scripts buried within HTML. We also added heuristics for various
other scripting languages, such as VBS and JS. This enabled us to detect and trap the LoveBug
virus over 10 hours before conventional anti-virus companies were able to make their signatures
public. We are now often finding ourselves in the position of being aware of new viruses, or of
having samples of new viruses before virus signatures are available. We are able to configure
Skeptic very quickly to counter such threats. For instance, we can search for emails containing
certain attachments, or with certain text in the subject line or body text. We can check if emails
contain Office documents which contain macros. We can check whether attachments have specific
MD5 checksums, and can also combine these checks to be very specific and cut down the chance
of false alarms. Usually, we have detection configured, tested and running within 20 minutes. Solutions
from other vendors often require a wait of several hours, days or even weeks for the public
signatures to arrive.
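A combined rule of the kind described above might look like the following Python sketch. The rule structure and field names are our own illustration rather than Skeptic's configuration format; the example values are chosen to resemble the LoveBug outbreak.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Every condition that is set must hold. A rule with nothing set matches all mail."""
    subject_contains: Optional[str] = None
    attachment_suffix: Optional[str] = None
    attachment_md5: Optional[str] = None


def matches(rule: Rule, subject: str, attachments) -> bool:
    """`attachments` is a list of (filename, content_bytes) pairs."""
    if (rule.subject_contains is not None
            and rule.subject_contains.lower() not in subject.lower()):
        return False
    if rule.attachment_suffix is None and rule.attachment_md5 is None:
        return True

    def attachment_ok(name: str, data: bytes) -> bool:
        if (rule.attachment_suffix is not None
                and not name.lower().endswith(rule.attachment_suffix.lower())):
            return False
        if (rule.attachment_md5 is not None
                and hashlib.md5(data).hexdigest() != rule.attachment_md5):
            return False
        return True

    # The suffix and checksum conditions must be met by the same attachment.
    return any(attachment_ok(name, data) for name, data in attachments)


rule = Rule(subject_contains="ILOVEYOU", attachment_suffix=".vbs")
print(matches(rule, "ILOVEYOU", [("LOVE-LETTER-FOR-YOU.TXT.vbs", b"...")]))
# prints "True"
```

Requiring several conditions at once is what keeps false alarms down: a subject line or a file extension alone would catch legitimate mail, while the combination is far more specific.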
Section 5.
Mail Encryption
It is sometimes argued that scanning for viruses at either a gateway or ISP level becomes
obsolete if the email in transit is encrypted. Whilst on the surface this is seemingly a valid point, it
has no basis in current fact: the most recent statistics indicate that the prevalence of
encrypted email is extremely minor.
Encryption will undoubtedly become an important issue in the future. As such, the solution
employed will involve a dual keyed approach whereby the higher level scanning facility will have a
copy of the potential recipient’s private key.
Section 6.
Making use of Live Statistics
Finally, along with many anti-virus vendors, MessageLabs contributes to the wildlist. The wildlist
(www.wildlist.org) is widely considered a de facto meeting point between anti-virus vendors. The
data presented is essentially a compiled report of all viruses seen actively in the wild by all
contributing vendors.
By scanning for viruses at the Internet level, using 4 anti-virus scanners, the actual number of
viruses detected and intercepted is significantly higher than with a single vendor’s scanner
deployed at the desktop or gateway. Therefore this type of virus prevention not only prevents
viruses entering a company’s network, but it also provides the anti-virus industry with continuous,
real-time virus statistics. This data is used by the anti-virus vendors to develop signatures and
further increase their virus knowledge base.