Automated Worm Fingerprinting
Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage

Introduction
- Problem: how to react quickly to worms?
- CodeRed (2001): infected ~360,000 hosts within 11 hours
- Sapphire/Slammer (376 bytes, 2003): infected ~75,000 hosts within 10 minutes

Existing Approaches
- Detection
  - Ad hoc intrusion detection
- Characterization
  - Manual signature extraction
    - Isolate and decompile a new worm
    - Look for and test unique signatures
    - Can take hours or days
- Containment
  - Updates to anti-virus and network filtering products

EarlyBird
- Automatically detects and contains new worms
- Two observations
  - Some portion of the content in existing worms is invariant
  - It is rare to see the same string recurring from many sources to many destinations
- Automatically extracted the signatures of all known worms
  - Also Blaster, MyDoom, and Kibvu.B, hours or days before any public signatures were distributed
- Few false positives

Background and Related Work
- Slammer scanned almost all IP addresses in under 10 minutes
- Limited only by bandwidth constraints

The SQL Slammer Worm: 30 Minutes After “Release”
- Infections doubled every 8.5 seconds
- Spread 100x faster than Code Red
- At peak, scanned 55 million hosts per second

Network Effects of the SQL Slammer Worm
- At the height of infections
  - Several ISPs noted significant bandwidth consumption at peering points
  - Average packet loss approached 20%
  - South Korea lost almost all Internet service for a period of time
  - Financial ATMs were affected
  - Some airline ticketing systems were overwhelmed

Signature-Based Methods
- Effective if signatures can be generated quickly enough
  - For CodeRed: within 60 minutes
  - For Slammer: within 1 to 5 minutes

Worm Detection
- Three classes of methods
  - Scan detection
  - Honeypots
  - Behavioral techniques

Scan Detection
- Looks for unusual frequency and distribution of address scanning
- Limitations
  - Not suited to worms that spread in a nonrandom fashion (e.g., email worms)
  - Detects infected sites but does not produce a signature

Honeypots
- Monitored idle hosts with untreated vulnerabilities
- Used to isolate worms
- Limitations
  - Manual extraction of signatures
  - Depend on quick infections

Behavioral Detection
- Looks for unusual system call patterns
  - e.g., sending a packet from the same buffer that contains a received packet
- Can detect slow-moving worms
- Limitations
  - Needs application-specific knowledge
  - Cannot infer a large-scale outbreak

Characterization
- Process of analyzing and identifying a new worm
- Current approaches
  - Use a priori vulnerability signatures
  - Automated signature extraction

Vulnerability Signatures
- Example: Slammer worm
  - UDP traffic on port 1434 that is longer than 100 bytes (buffer overflow)
- Can be deployed before the outbreak
- Can only be applied to well-known vulnerabilities

Some Automated Signature Extraction Techniques
- Let viruses infect decoy programs
  - Extract the modified regions of the decoy
  - Use heuristics to identify invariant code strings across infected instances
  - Limitation: assumes the presence of a virus in a controlled environment
- Honeycomb
  - Finds longest common subsequences among sets of strings found in messages
- Autograph
  - Uses network-level data to infer worm signatures
- Limitations of Honeycomb and Autograph
  - Scale and full distributed deployments

Containment
- Mechanism used to deter the spread of an active worm
- Host quarantine
  - Via IP ACLs on routers or firewalls
- String matching
- Connection throttling
  - On all outgoing connections

Defining Worm Behavior
- Content invariance
  - Portions of a worm are invariant (e.g., the decryption routine)
- Content prevalence
  - The invariant content appears frequently on the network
- Address dispersion
  - Destination addresses are distributed more uniformly, so that the worm spreads fast

Finding Worm Signatures
- Traffic pattern is sufficient for detecting worms
- Relatively straightforward
  - Extract all possible substrings
  - Raise an alarm when all three hold:
    - FrequencyCounter[substring] > threshold1
    - SourceCounter[substring] > threshold2
    - DestCounter[substring] > threshold3

Detecting Common Strings
- Cannot afford to detect all substrings
- Maybe can afford to detect all strings with a small fixed length
- Example with fixed length 5 over the text "A horse is a horse, of course, of course"
  - First window: $F_1 = (c_1 p^4 + c_2 p^3 + c_3 p^2 + c_4 p + c_5) \bmod M$
  - Next window, shifted by one character: $F_2 = (c_2 p^4 + c_3 p^3 + c_4 p^2 + c_5 p + c_6) \bmod M$

Too CPU-Intensive
- Each packet with a payload of s bytes has s-β+1 strings of length β
- A packet with 1,000 bytes of payload needs 961 (roughly 960) fingerprints for a string length of 40
- Still too expensive
- Prone to denial-of-service attacks

CPU Scaling
- Random sampling may miss many substrings
- Solution: value sampling
  - Track only certain substrings, e.g., those whose fingerprint has its last 6 bits equal to 0
  - P(not tracking a worm) = P(not tracking any of its substrings)

CPU Scaling: Example
- Track only substrings whose fingerprint ends in six 0 bits
- String length = 40
- A 1,000-character string has about 960 substrings, hence about 960 fingerprints
  - e.g., '11100…101010', …, '10110…000000', …
- Track only 'xxxxx…000000' fingerprints
- Expect about 960 / 2^6 = 15 substrings of the string to end in six 0s

CPU Scaling: Results
- P(finding a 100-byte signature) = 55%
- P(finding a 200-byte signature) = 92%
- P(finding a 400-byte signature) = 99.64%

Estimating Content Prevalence
- Goal: find the packet payloads that appear at least x times among the N packets sent during a given interval
- Table[payload]
  - A 1 GB table is filled in 10 seconds
- Table[hash[payload]]
  - A 1 GB table is filled in 4 minutes
- Tracking millions of ants to track a few elephants
- Collisions lead to false positives

Multistage Filters [Singh et al. 2002]
- Each item in the stream is hashed (e.g., Hash(Pink), Hash(Green)) into an array of counters
- Collisions are OK
- When a counter reaches the threshold, the item is inserted into packet memory (e.g., packet1: 1, packet2: 1)
- With several stages (Stage 1, Stage 2), an item is inserted only when its counters in all stages reach the threshold
- No false negatives (guaranteed detection)

Conservative Updates
- [Figure: counter stages; gray = all prior packets; two of the increments are marked "Redundant"]

Estimating Address Dispersion
- Not sufficient to count the number of source and destination pairs
  - e.g., sending a mail to a mailing list
    - Two sources: the mail server and the sender
    - Many destinations
- Need to count the unique source and destination traffic flows for each substring

Bitmap Counting: Direct Bitmap [Estan et al. 2003]
- Set bits in the bitmap using a hash of the flow ID of incoming packets
- Different flows have different hash values
- Packets from the same flow always hash to the same bit
- Collisions are OK; the estimates compensate for them
- As the bitmap fills up, estimates get inaccurate
  - Solution: use more bits
  - Problem: memory scales with the number of flows
- Hash values used in the figures: HASH(green)=10001001, HASH(blue)=00100100, HASH(violet)=10010101, HASH(orange)=11110011, HASH(pink)=11100000, HASH(yellow)=01100011

Bitmap Counting: Virtual Bitmap
- Solution
  a) Store only a portion of the bitmap
  b) Multiply the estimate by a scaling factor
- Problem: the estimate is inaccurate when few flows are active

Bitmap Counting: Multiple Bitmaps
- Solution: use many bitmaps, each accurate for a different range
- Use the bitmap that is accurate for the current range to estimate the number of flows
- Problem: must update up to three bitmaps per packet

Bitmap Counting: Multiresolution Bitmap
- Solution: combine the bitmaps into one (the figure ORs them together)

Putting It Together
- Tables
  - Content Prevalence Table (multistage filters): key, cnt
  - Address Dispersion (AD) Table (scalable counters): key, src cnt, dest cnt
- Per-packet flow
  - Compute the substring fingerprints (keys) from the header and payload
  - Does an AD entry exist for the key?
    - Yes: update the src and dest counters
      - Counters > dispersion threshold? Report the key as a suspicious worm
    - No: update the counter for the key in the Content Prevalence Table
      - cnt > prevalence threshold? Create an AD entry

Putting It Together: Parameters
- Sample frequency: 1/64
- String length: 40
- 4 hash functions are used to update the prevalence table
  - Multistage filter is reset every 60 seconds

System Design
- Two major components
  - Sensors
    - Sift through traffic for a given address space
    - Report signatures
  - An aggregator
    - Coordinates real-time updates
    - Distributes signatures

Implementation and Environment
- Written in C and MySQL (5,000 lines)
- Prototype executes on a 1.6 GHz AMD Opteron 242 1U server
  - Linux 2.6 kernel

EarlyBird Performance
- Processes 1 TB of traffic per day
- 200 Mbps of continuous traffic
- Can be pipelined and parallelized to achieve 40 Gbps

Parameter Tuning
- Prevalence threshold: 3
  - Very few signatures repeat
- Address dispersion threshold
  - 30 sources and 30 destinations
  - Reset every few hours
  - Reduces the number of reported signatures to ~25,000
- Tradeoff between speed and accuracy
  - Can detect Slammer in 1 second as opposed to 5 seconds
    - But with 100x more reported signatures

Memory Consumption
- Prevalence table
  - 4 stages, each with ~500,000 bins (8 bits/bin)
  - 2 MB total
- Address dispersion table
  - 25K entries (28 bytes each)
  - < 1 MB
- Total: < 4 MB

Trace-Based Verification
- Two main sources of false positives
  - 2,000 common protocol headers (e.g., HTTP, SMTP), which are whitelisted
  - SPAM e-mails
- BitTorrent
  - Many-to-many download

False Negatives
- So far none
  - Detected every worm outbreak

Inter-Packet Signatures
- An attacker might evade detection by splitting an invariant string across packets
- With 7 MB of extra memory, EarlyBird can keep per-flow state and fingerprint across packets

Live Experience with EarlyBird
- Detected precise signatures
  - CodeRed variants
  - MyDoom mail worm
  - Sasser
  - Kibvu.B

Conclusions
- EarlyBird is a promising approach
  - To detect unknown worms in real time
  - To extract signatures automatically
  - To detect SPAM with minor changes
- Wire-speed signature learning is viable
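
Sketch: rolling fingerprints with value sampling

The Detecting Common Strings and CPU Scaling slides describe fingerprinting every fixed-length substring and then tracking only those whose fingerprint ends in six 0 bits. The Python sketch below illustrates that idea with a plain polynomial rolling hash of the form shown on the slide. The base `p`, the modulus `M`, and all function names are assumptions chosen for illustration, not values taken from the slides or from the EarlyBird implementation.

```python
def direct_fingerprint(window: bytes, p: int = 257, M: int = (1 << 61) - 1) -> int:
    """Evaluate (c1*p^(k-1) + ... + ck) mod M from scratch (Horner's rule)."""
    f = 0
    for c in window:
        f = (f * p + c) % M
    return f


def fingerprints(payload: bytes, beta: int = 40, p: int = 257, M: int = (1 << 61) - 1):
    """Yield (offset, fingerprint) for every beta-byte substring of payload."""
    if len(payload) < beta:
        return
    f = direct_fingerprint(payload[:beta], p, M)
    yield 0, f
    top = pow(p, beta - 1, M)  # weight of the byte that leaves the window
    for i in range(beta, len(payload)):
        # Remove the oldest byte, shift the rest by p, append the newest byte.
        f = ((f - payload[i - beta] * top) * p + payload[i]) % M
        yield i - beta + 1, f


def sampled_fingerprints(payload: bytes, beta: int = 40, bits: int = 6):
    """Value sampling: keep only fingerprints whose low `bits` bits are all 0."""
    mask = (1 << bits) - 1
    return [(off, f) for off, f in fingerprints(payload, beta) if f & mask == 0]
```

Each step of the rolling update costs a constant amount of work, so a 1,000-byte payload yields its 961 fingerprints without rehashing each 40-byte window. Because sampling depends on the fingerprint value rather than on the position, the same substring is either always tracked or never tracked, wherever it appears.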
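
Sketch: multistage filter with conservative update

The Multistage Filters and Conservative Updates slides survive only as figure fragments, so a small Python sketch may help make the mechanism concrete. It assumes one counter array per stage, a salted BLAKE2b hash per stage, and a "count > threshold" test; the class name, the hash choice, and the default sizes are illustrative assumptions rather than details from the slides.

```python
import hashlib


class MultistageFilter:
    def __init__(self, stages: int = 4, bins: int = 1 << 16, threshold: int = 3):
        self.stages = stages
        self.bins = bins
        self.threshold = threshold
        self.counters = [[0] * bins for _ in range(stages)]

    def _slots(self, key: bytes):
        # One hash per stage, obtained by salting BLAKE2b with the stage number.
        slots = []
        for s in range(self.stages):
            h = hashlib.blake2b(key, digest_size=8, salt=bytes([s])).digest()
            slots.append((s, int.from_bytes(h, "big") % self.bins))
        return slots

    def estimate(self, key: bytes) -> int:
        """Smallest counter for key; never below the key's true count."""
        return min(self.counters[s][i] for s, i in self._slots(key))

    def add(self, key: bytes) -> bool:
        """Count one occurrence; return True once the estimate exceeds the threshold."""
        slots = self._slots(key)
        low = min(self.counters[s][i] for s, i in slots)
        # Conservative update: raise only the counters that sit at the minimum.
        # Larger counters already account for this occurrence.
        for s, i in slots:
            if self.counters[s][i] == low:
                self.counters[s][i] = low + 1
        return low + 1 > self.threshold
```

Collisions can only inflate a counter, so the estimate is an upper bound on the true count. That is why the filter can report false positives but never misses a key that really exceeds the threshold.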
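
Sketch: direct bitmap flow counting

The direct bitmap slides say that collisions are acceptable because the estimate compensates for them. A minimal Python sketch of that idea follows, assuming the standard linear-counting estimate b * ln(b / z) for a bitmap of b bits with z bits still zero. The class name, the hash function, and the bitmap size are assumptions made for illustration.

```python
import hashlib
import math


class DirectBitmap:
    def __init__(self, bits: int = 1024):
        self.b = bits
        self.bitmap = bytearray(bits)  # one byte per bit, for readability

    def add(self, flow_id: bytes) -> None:
        # Packets of the same flow always set the same bit.
        h = int.from_bytes(hashlib.blake2b(flow_id, digest_size=8).digest(), "big")
        self.bitmap[h % self.b] = 1

    def estimate(self) -> float:
        zeros = self.b - sum(self.bitmap)
        if zeros == 0:
            return float("inf")  # bitmap saturated: no usable estimate
        # Linear-counting estimate; the logarithm corrects for collisions.
        return self.b * math.log(self.b / zeros)
```

Repeated packets from one flow do not change the bitmap, so the structure counts distinct flows rather than packets. As more bits are set, each remaining zero bit carries more weight and the estimate becomes less stable, which is the accuracy problem the virtual and multiresolution bitmap slides address.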
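
Sketch: the per-packet decision flow

The Putting It Together slide is a flow diagram: look up the fingerprint in the address dispersion table, otherwise count it in the prevalence table and promote it once it is prevalent enough. The Python sketch below follows that flow but substitutes exact dictionaries and sets for the multistage filter and scalable counters, purely to keep the example short. The class name, the method name, and the choice to seed a new entry with the promoting packet's addresses are assumptions.

```python
from collections import defaultdict


class Detector:
    def __init__(self, prevalence_threshold: int = 3, dispersion_threshold: int = 30):
        self.prevalence_threshold = prevalence_threshold
        self.dispersion_threshold = dispersion_threshold
        self.prevalence = defaultdict(int)  # key -> occurrence count
        self.dispersion = {}                # key -> (set of sources, set of destinations)

    def observe(self, key: int, src: str, dst: str) -> bool:
        """Process one sampled fingerprint; return True if the key looks like a worm."""
        entry = self.dispersion.get(key)
        if entry is None:
            # No AD entry yet: only count prevalence.
            self.prevalence[key] += 1
            if self.prevalence[key] > self.prevalence_threshold:
                self.dispersion[key] = ({src}, {dst})
            return False
        sources, destinations = entry
        sources.add(src)
        destinations.add(dst)
        return (len(sources) > self.dispersion_threshold
                and len(destinations) > self.dispersion_threshold)
```

Requiring both the source and the destination set to exceed the threshold is what separates a spreading worm from the mailing-list case on the address dispersion slide, where one sender reaches many destinations.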