Automated Worm Fingerprinting

Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage

Introduction

Problem: how to react quickly to worms?

CodeRed 2001

 Infected ~360,000 hosts within 11 hours

Sapphire/Slammer (376 bytes) 2003

 Infected ~75,000 hosts within 10 minutes

Existing Approaches

Detection

 Ad hoc intrusion detection

Characterization

 Manual signature extraction

 Isolates and decompiles a new worm

Look for and test unique signatures

Can take hours or days

Existing Approaches

 Containment

 Updates to anti-virus and network filtering products

Earlybird

Automatically detect and contain new worms

Two observations

Some portion of the content in existing worms is invariant

Rare to see the same string recurring from many sources to many destinations

Earlybird

Automatically extracted signatures for all known worms observed during deployment

 Including Blaster, MyDoom, and Kibuv.B, hours or days before any public signatures were distributed

Few false positives

Background and Related Work

 Slammer scanned almost the entire IP address space in under 10 minutes

 Limited only by bandwidth constraints

The SQL Slammer Worm: 30 Minutes After “Release”

- Infections doubled every 8.5 seconds

- Spread 100X faster than Code Red

- At peak, scanned 55 million hosts per second.

Network Effects Of The SQL Slammer Worm

 At the height of infections

Several ISPs noted significant bandwidth consumption at peering points

Average packet loss approached 20%

South Korea lost almost all Internet service for a period of time

Financial ATMs were affected

Some airline ticketing systems overwhelmed

Signature-Based Methods

 Pretty effective if signatures can be generated quickly

For CodeRed, within 60 minutes

For Slammer, within 1–5 minutes

Worm Detection

 Three classes of methods

Scan detection

Honeypots

Behavioral techniques

Scan Detection

Look for unusual frequency and distribution of address scanning

Limitations

 Not suited to worms that spread in a nonrandom fashion (e.g. via e-mail, IM, or P2P apps)

 Worms based on a target list

 Worms that spread topologically

Scan Detection

 More limitations

Detects infected sites

Does not produce a signature

Honeypots

Monitored idle hosts with unpatched vulnerabilities

 Used to isolate worms

Limitations

Manual extraction of signatures

Depends on the honeypot being infected quickly

Behavioral Detection

Looks for unusual system call patterns

e.g. sending a packet from the same buffer that holds a received packet

Can detect slow-moving worms

Limitations

Needs application-specific knowledge

Cannot infer a large-scale outbreak

Characterization

Process of analyzing and identifying a new worm

Current approaches

Use a priori vulnerability signatures

Automated signature extraction

Vulnerability Signatures

Example

 Slammer Worm

 UDP traffic to port 1434 longer than 100 bytes (triggers the buffer overflow)

Can be deployed before the outbreak

 Can only be applied to well-known vulnerabilities

Some Automated Signature Extraction Techniques

 Allows viruses to infect decoy programs

Extracts the modified regions of the decoy

Uses heuristics to identify invariant code strings across infected instances

Some Automated Signature Extraction Techniques

 Limitation

 Assumes the presence of a virus in a controlled environment

Some Automated Signature Extraction Techniques

Honeycomb

 Finds the longest common substrings among sets of strings found in messages

Autograph

 Uses network-level data to infer worm signatures

Limitations

 Scaling and the need for fully distributed deployments

Containment

 Mechanism used to deter the spread of an active worm

Host quarantine

 Via IP ACLs on routers or firewalls

String-matching

Connection throttling

 On all outgoing connections

Host Quarantine

 Preventing an infected host from talking to other hosts

 Via IP ACLs on routers or firewalls

Defining Worm Behavior

Content invariance

 Portions of a worm are invariant (e.g. the decryption routine)

Content prevalence

 Appears frequently on the network

Address dispersion

 To spread quickly, a worm must contact many distinct addresses, so its address distribution is far more uniform than that of normal traffic

Finding Worm Signatures

 These traffic patterns are sufficient for detecting worms

Relatively straightforward

Extract all possible substrings

Raise an alarm when

FrequencyCounter[substring] > threshold1

SourceCounter[substring] > threshold2

DestCounter[substring] > threshold3
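A minimal Python sketch of this brute-force version of content sifting, purely illustrative (the thresholds and the 40-byte substring length are taken from later slides); this is not the EarlyBird code:

from collections import defaultdict

SUBSTRING_LEN = 40            # fixed substring length (see later slides)
PREVALENCE_THRESHOLD = 3      # threshold1
SOURCE_THRESHOLD = 30         # threshold2
DEST_THRESHOLD = 30           # threshold3

frequency = defaultdict(int)  # FrequencyCounter
sources = defaultdict(set)    # distinct sources seen per substring
dests = defaultdict(set)      # distinct destinations seen per substring

def process_packet(src, dst, payload):
    # Extract every substring of the fixed length and update the three counters.
    for i in range(len(payload) - SUBSTRING_LEN + 1):
        s = payload[i:i + SUBSTRING_LEN]
        frequency[s] += 1
        sources[s].add(src)
        dests[s].add(dst)
        if (frequency[s] > PREVALENCE_THRESHOLD
                and len(sources[s]) > SOURCE_THRESHOLD
                and len(dests[s]) > DEST_THRESHOLD):
            print("ALERT: suspected worm signature:", s[:16], "...")

This is exactly the naive algorithm; the rest of the talk is about making it cheap enough to run at line rate.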

Practical Content Sifting

 Characteristics

Small processing requirements

Small memory requirements

Allows arbitrary deployment strategies

Estimating Content Prevalence

 Finding the packet payloads that appear at least x times among the N packets sent during a given interval

Estimating Content Prevalence

Table[payload]

 A 1 GB table would fill in about 10 seconds

Table[hash[payload]]

 A 1 GB table would fill in about 4 minutes

Like tracking millions of ants to find a few elephants

Hash collisions lead to false positives

Multistage Filters
[Singh et al. 2002]

[Figure: multistage filter animation]

Each stage is an array of counters, indexed by an independent hash of the packet content.

Every incoming string increments one counter in each stage; a string is inserted into the (much smaller) memory of prevalent content only when its counters in all stages reach the threshold.

Counter collisions are OK: they can only cause extra insertions, never missed ones, so there are no false negatives (guaranteed detection).

Conservative Updates

[Figure: conservative update animation; gray = all prior packets]

When a string's stage counters differ, incrementing the larger ones is redundant; only the counters currently at the minimum need to be raised, which further reduces false positives.
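A Python sketch of a multistage filter with conservative update, under assumed parameters (4 stages of 500,000 8-bit counters, matching the memory slide later); the per-stage hash construction is illustrative, not the one used by EarlyBird:

import hashlib

NUM_STAGES = 4
BINS_PER_STAGE = 500_000
THRESHOLD = 3

counters = [[0] * BINS_PER_STAGE for _ in range(NUM_STAGES)]

def _bins(fingerprint: bytes):
    # One independent hash per stage, derived here from SHA-1 of (stage, fingerprint).
    return [int.from_bytes(hashlib.sha1(bytes([s]) + fingerprint).digest()[:4], "big")
            % BINS_PER_STAGE
            for s in range(NUM_STAGES)]

def update(fingerprint: bytes) -> bool:
    """Return True once this fingerprint's counters in all stages reach THRESHOLD."""
    bins = _bins(fingerprint)
    new_min = min(counters[s][b] for s, b in enumerate(bins)) + 1
    for s, b in enumerate(bins):
        # Conservative update: raise only the counters below the new minimum.
        if counters[s][b] < new_min:
            counters[s][b] = min(new_min, 255)   # 8-bit saturating counters
    return all(counters[s][b] >= THRESHOLD for s, b in enumerate(bins))

Requiring all stages to agree is what guarantees no false negatives; the conservative update skips the redundant increments, so fewer non-prevalent strings ever reach the threshold.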

Detecting Common Strings

Cannot afford to detect all substrings

Maybe can afford to detect all strings of a small fixed length (e.g. 40 bytes)

Example: “A horse is a horse, of course, of course”, with characters c_1, c_2, c_3, … and a window of length 5:

F_1 = (c_1 p^4 + c_2 p^3 + c_3 p^2 + c_4 p + c_5) mod M
F_2 = (c_2 p^4 + c_3 p^3 + c_4 p^2 + c_5 p + c_6) mod M
    = (c_1 p^5 + c_2 p^4 + c_3 p^3 + c_4 p^2 + c_5 p + c_6 - c_1 p^5) mod M
    = (p F_1 + c_6 - c_1 p^5) mod M

Each fingerprint can therefore be computed incrementally from the previous one (a Rabin fingerprint), with constant work per position.

Still too expensive…
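A Python sketch of the incremental fingerprint computation derived above (Karp-Rabin style); the multiplier p, modulus M, and the 40-byte window are illustrative stand-ins, not EarlyBird's actual constants:

P = 1_000_003                 # multiplier p (illustrative)
M = (1 << 61) - 1             # modulus M (illustrative)
WINDOW = 40                   # fixed substring length

def fingerprints(payload: bytes):
    """Yield the fingerprint of every WINDOW-byte substring with O(1) work per position."""
    if len(payload) < WINDOW:
        return
    p_w = pow(P, WINDOW, M)                 # p^w, weight of the byte that slides out
    f = 0
    for c in payload[:WINDOW]:
        f = (f * P + c) % M                 # F_1 = (c_1 p^{w-1} + ... + c_w) mod M
    yield f
    for i in range(WINDOW, len(payload)):
        # F_{k+1} = (p F_k + c_new - c_old p^w) mod M, as in the derivation above
        f = (f * P + payload[i] - payload[i - WINDOW] * p_w) % M
        yield f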

Estimating Address Dispersion

Not sufficient to count the number of source and destination pairs

 e.g. a mail sent to a mailing list

 Two sources (the mail server and the sender)

 Many destinations

Need to count the unique source and destination traffic flows

 For each substring

Bitmap counting – direct bitmap
[Estan et al. 2003]

[Figure: direct bitmap animation]

Set a bit in the bitmap using a hash of each incoming packet's flow ID, e.g. HASH( green ) = 10001001.

Different flows usually hash to different bits; packets from the same flow always hash to the same bit.

Collisions are OK; the estimate compensates for them.

As the bitmap fills up, the estimate gets inaccurate.

Solution: use more bits.

Problem: memory scales with the number of flows.
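A Python sketch of a direct bitmap with the standard linear-counting estimate; the 64-bit size and SHA-1-based hash are illustrative assumptions, not the parameters from [Estan et al. 2003]:

import hashlib, math

BITMAP_BITS = 64

class DirectBitmap:
    """Direct bitmap for counting distinct flow IDs (e.g. sources or destinations)."""

    def __init__(self):
        self.bits = 0

    def add(self, flow_id: bytes):
        # Packets from the same flow always set the same bit; collisions are OK.
        pos = int.from_bytes(hashlib.sha1(flow_id).digest()[:4], "big") % BITMAP_BITS
        self.bits |= 1 << pos

    def estimate(self) -> float:
        # Linear-counting estimate: n ~ b * ln(b / z), where z = number of zero bits.
        z = BITMAP_BITS - bin(self.bits).count("1")
        if z == 0:
            return float("inf")      # bitmap full: the estimate is no longer usable
        return BITMAP_BITS * math.log(BITMAP_BITS / z)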

Bitmap counting – virtual bitmap

[Figure: virtual bitmap animation]

Solution: (a) store only a portion of the bitmap, (b) multiply the estimate by a scaling factor.

Problem: the estimate is inaccurate when few flows are active.
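A Python sketch of the virtual-bitmap idea: flow IDs are hashed into a large virtual space, only one slice of which is stored, and the slice estimate is scaled back up. All sizes here are illustrative assumptions:

import hashlib, math

VIRTUAL_BITS = 4096      # size of the full (virtual) bitmap
PHYSICAL_BITS = 64       # the slice of it that is actually stored

class VirtualBitmap:
    def __init__(self):
        self.bits = 0

    def add(self, flow_id: bytes):
        pos = int.from_bytes(hashlib.sha1(flow_id).digest()[:4], "big") % VIRTUAL_BITS
        if pos < PHYSICAL_BITS:      # flows hashing outside the stored slice are ignored
            self.bits |= 1 << pos

    def estimate(self) -> float:
        z = PHYSICAL_BITS - bin(self.bits).count("1")
        if z == 0:
            return float("inf")
        # Estimate the flows that landed in the slice, then scale up to the full space.
        slice_estimate = PHYSICAL_BITS * math.log(PHYSICAL_BITS / z)
        return slice_estimate * (VIRTUAL_BITS / PHYSICAL_BITS)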

Bitmap counting – multiple bitmaps

[Figure: multiple bitmaps animation]

Solution: use many bitmaps, each accurate for a different range of flow counts, and read the estimate from the bitmap covering the current range.

Bitmap counting – multiresolution bitmap

[Figure: multiresolution bitmap animation]

Problem: must update up to three bitmaps per packet.

Solution: combine the bitmaps into a single multiresolution bitmap.

Multiresolution Bitmaps

Still too expensive to scale

Scaled bitmap

Recycles the hash space when too many bits are set

Adjusts the scaling factor accordingly

 E.g., 1 bit represents 2 flows as opposed to a single flow

Too CPU-Intensive

A packet with 1,000 bytes of payload

 Needs 960 fingerprints for a string length of 40

Prone to Denial-of-Service attacks

CPU Scaling

Obvious approach: sampling

- Random sampling may miss many substrings

Solution: value sampling

Track only certain substrings

 e.g. last 6 bits of fingerprint are 0

P(not tracking a worm)

= P(not tracking any of its substrings)

CPU Scaling

 Example

Track only substrings with last 6 bits = 0s

String length = 40

1,000-char string → 960 substrings → 960 fingerprints

‘11100…101010’ … ‘10110…000000’ …

Use only fingerprints of the form ‘xxxxx…000000’ as signatures

 About 960 / 2^6 = 15 signatures tracked
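A small Python sketch of value sampling as described above: a fingerprint is tracked only if its low 6 bits are all zero, so roughly 1 in 64 substrings survives.

SAMPLE_MASK = 0x3F            # low 6 bits of the fingerprint

def sampled(fingerprint: int) -> bool:
    # Track this substring only if its fingerprint ends in six zero bits.
    return (fingerprint & SAMPLE_MASK) == 0

For the 1,000-byte packet above, about 960 / 64 = 15 of the 960 fingerprints would be tracked on average.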

CPU Scaling

P(finding a 100-byte signature) = 55%

P(finding a 200-byte signature) = 92%

P(finding a 400-byte signature) = 99.64%

Putting It Together

[Figure: EarlyBird processing pipeline with a Content Prevalence Table and an Address Dispersion (AD) Table keyed by substring fingerprints (src cnt, dest cnt)]

For each packet, compute the substring fingerprints over its payload.

If an AD entry already exists for a fingerprint, update its source and destination counters; once the counters exceed the dispersion threshold, report the key as a suspicious worm signature.

Otherwise, update the fingerprint's counter in the Content Prevalence Table; once that counter exceeds the prevalence threshold, create an AD entry.
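The sketch below strings together the illustrative helpers from the earlier sketches (fingerprints, sampled, update, DirectBitmap) to mirror this flow; it follows the slide's structure, not the authors' actual implementation:

DISPERSION_THRESHOLD = 30     # 30 sources and 30 destinations (see parameter tuning)

dispersion = {}               # fingerprint -> (source bitmap, destination bitmap)

def sift(src: bytes, dst: bytes, payload: bytes):
    for fp in fingerprints(payload):
        if not sampled(fp):
            continue                           # value sampling: skip ~63/64 substrings
        entry = dispersion.get(fp)
        if entry is None:
            # Not yet prevalent: update the multistage filter (content prevalence table).
            if update(fp.to_bytes(8, "big")):  # prevalence threshold reached in all stages
                dispersion[fp] = (DirectBitmap(), DirectBitmap())
            continue
        src_bm, dst_bm = entry                 # AD entry exists: update dispersion counts
        src_bm.add(src)
        dst_bm.add(dst)
        if (src_bm.estimate() >= DISPERSION_THRESHOLD
                and dst_bm.estimate() >= DISPERSION_THRESHOLD):
            print("ALERT: suspected worm signature, fingerprint", hex(fp))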

Putting It Together

Sample frequency: 1/64

String length: 40

Use 4 hash functions to update prevalence table

 Multistage filter reset every 60 seconds

System Design

 Two major components

Sensors

 Sift through traffic for a given address space

 Report signatures

An aggregator

Coordinates real-time updates

Distributes signatures

Implementation and Environment

Written in C and MySQL (5,000 lines)

rrd-tools library for graphical reporting

PHP scripting for administrative control

Prototype executes on a 1.6GHz AMD Opteron 242 1U server

 Linux 2.6 kernel

EarlyBird

Processes 1TB of traffic per day

Can keep up with 200Mbps of continuous traffic

Parameter Tuning

Prevalence threshold: 3

 Very few signatures repeat

Address dispersion threshold

30 sources and 30 destinations

Reset every few hours

Reduces the number of reported signatures down to ~25,000

Parameter Tuning

 Tradeoff between speed and accuracy

 Can detect Slammer in 1 second instead of 5 seconds

 But with 100x more reported signatures

Performance

200Mbps

Can be pipelined and parallelized to achieve 40Gbps

Memory Consumption

Prevalence table

4 stages

 Each with ~500,000 bins (8 bits/bin)

2MB total

Address dispersion table

25K entries (28 bytes each)

< 1 MB

Total: < 4MB
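As a quick check of these figures: 4 stages × 500,000 bins × 1 byte per bin = 2 MB for the prevalence table, and 25,000 entries × 28 bytes ≈ 0.7 MB for the dispersion table, so both structures together fit comfortably under 4 MB.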

Trace-Based Verification

 Main sources of false positives

 Common protocol headers (e.g. HTTP, SMTP): about 2,000 signatures, whitelisted

 SPAM e-mails

 BitTorrent (many-to-many downloads)

False Negatives

So far none

Detected every worm outbreak

Inter-Packet Signatures

An attacker might evade detection by splitting an invariant string across packets

With ~7MB of extra memory, EarlyBird can keep per-flow state and fingerprint across packet boundaries

Live Experience with EarlyBird

 Detected precise signatures

CodeRed variants

MyDoom mail worm

Sasser

Kibvu.B

Variant Content

 Polymorphic viruses

Semantically equivalent but textually distinct code

Invariant decoding routine

Extensions

Self configuration

Slow worms

Containment

 How to handle false positives?

If too aggressive, EarlyBird becomes a target for DoS attacks

An attacker can fool the system to block a target message

Coordination

Trust of deployed servers

Validation

Policy

Conclusions

EarlyBird is a promising approach

To detect unknown worms in real time

To extract signatures automatically

To detect SPAM with minor changes

Wire-speed signature learning is viable
