zero-day attack

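Network-Based Detection Methods for Defending Against Zero-day Attacks
April 12, 2007
Ikkyun Kim (ikkim21@etri.re.kr)
Information Security Research Division, Electronics and Telecommunications Research Institute (ETRI)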
Contents
Vulnerability & Zero-day Attack
Intrusion Detection
Detection Model
Research Trends: Zero-day Attack Detection
Vulnerabilities
Laws of Vulnerabilities
 Half-life of critical vulnerabilities is 21 days
 Half of the most prevalent vulnerabilities are replaced by new vulnerabilities every year
 Lifespan of some vulnerabilities and worms is unlimited
 80% of worms and automated exploits occur in the first two half-lives
* Source: Gerhard Eschelbeck (Qualys), Black Hat 2004
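As a rough worked example of the half-life law, the sketch below assumes that the fraction of still-vulnerable systems decays exponentially, which is what the term "half-life" suggests. The function name `fraction_unpatched` is hypothetical and the exponential model is an assumption, not something stated on the slide.

```python
HALF_LIFE_DAYS = 21  # half-life of critical vulnerabilities quoted on the slide


def fraction_unpatched(days, half_life=HALF_LIFE_DAYS):
    """Fraction of systems still vulnerable after `days`.

    Assumption: simple exponential decay, halving every `half_life` days.
    """
    return 0.5 ** (days / half_life)


# The "first two half-lives" are the first 42 days after disclosure.
for day in (0, 21, 42, 63):
    print(day, fraction_unpatched(day))
```

Under this assumption a quarter of systems are still vulnerable at the end of the 42-day window in which most worms and automated exploits appear.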
Zero-day Attacks
The time gap between vulnerability disclosure and the release of a worm that exploits it is decreasing:
Vulnerability                | Disclosure    | Worm released          | Gap
SQL Server buffer overflow   | July 24, 2002 | Jan 25, 2003 (Slammer) | 185 days
LSASS buffer overflow        | Apr 11, 2004  | Apr 30, 2004 (Sasser)  | 19 days
…..                          | …..           | 2006 ?? (future worm)  | 0-day attack
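The two gaps quoted in the timeline can be checked with a few lines of date arithmetic. The dates are the ones given on the slide; the names `TIMELINE` and `disclosure_to_worm_gap` are illustrative only.

```python
from datetime import date

# (vulnerability disclosure, worm release), as given on the slide.
TIMELINE = {
    "Slammer": (date(2002, 7, 24), date(2003, 1, 25)),
    "Sasser": (date(2004, 4, 11), date(2004, 4, 30)),
}


def disclosure_to_worm_gap(disclosed, released):
    """Number of days between vulnerability disclosure and worm release."""
    return (released - disclosed).days


gaps = {name: disclosure_to_worm_gap(*dates) for name, dates in TIMELINE.items()}
print(gaps)
```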
• A zero-day attack is a computer threat that exposes undisclosed or unpatched computer application vulnerabilities (as defined by Wikipedia).
• May 2005: zero-day exploits for unknown vulnerabilities in Mozilla Firefox
Intrusion Detection - I
Misuse Analysis: Signature-based
[Figure: signature-based IDS architecture. Data path: Network, Packet Sensor, Audit Data, Packet Parsing, Comparison Objects, Detection Engine (Pattern Matching against the Signature Database), Alert, Response Manager (Response). The Expert Manager handles rule management for the Signature Database.]
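The pattern-matching step of a signature-based detection engine can be sketched as a lookup of known byte patterns in each payload. This is a minimal illustration, not a production engine: the `Signature` class, the signature names and the two example patterns are assumptions chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    name: str
    pattern: bytes


# Illustrative signature database (hypothetical entries).
SIGNATURES = [
    Signature("codered-like-ida-request", b"GET /default.ida?"),
    Signature("x86-nop-sled", b"\x90" * 16),
]


def match_signatures(payload, signatures=SIGNATURES):
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [sig.name for sig in signatures if sig.pattern in payload]


print(match_signatures(b"GET /default.ida?XXXX HTTP/1.0\r\n\r\n"))
print(match_signatures(b"GET /index.html HTTP/1.0\r\n\r\n"))
```

Because only known patterns are matched, an attack with no signature in the database passes unnoticed, which is why this approach alone cannot stop zero-day attacks.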
Signature-based ID
DPI: High-Performance Pattern Matching
 By 2006, 75 percent of Global 2000 enterprises will replace or augment their firewall approach with deep packet inspection capabilities
 By 2005, enterprises will no longer use software-based application proxy firewalls
Source: Deep Packet Inspection: Next Phase of Firewall Evolution (21 November 2002, Gartner)
 Today: deep packet inspection capability in ASIC-based appliances
• TippingPoint: IPS 5000E
• TopLayer: IPS 5500
• Cisco: IPS 4255
 By 2009 the UTM space will be the largest single market.
Intrusion Detection - II
Anomaly Detection
[Figure: anomaly detection architecture. Data path: Network, Packet Sensor, Audit Data, Packet Parsing, Learning & Comparison Objects, Anomaly Analysis Engine (Statistical, Data Mining, Neural Net.) with a learned Profile, Alert, Report.]
Zero-day Attack Protection
Anomaly Detection + Signature Generation + High Performance FW (IPS)
 Current Signature Generation Process
 New worm outbreak
 Report of anomalies from people via phone/email/newsgroup
 Worm trace is captured
 Manual analysis by security experts
 Signature generation
 Labor-intensive, Human-mediated
Control Flow Hijacking Worm Model
* Source [Crandall05]
Epsilon-Gamma-Pi Model
[Figure: exploit packet layout. IP header | TCP/UDP header | upper-layer payload: Exploit, (Return Addr), NOP sled, Decryption Code, Attack Code]
• Epsilon (ε) = Exploit Vector
• Gamma (γ) = Bogus Control Data
• Pi (π) = Payload
ε-γ-π Model
CodeRed II Case
• Epsilon (ε) = HTTP Header
• Gamma (γ) = Return Address
• Pi (π) = CodeRed shellcode
* Source [Crandall05]
Control hijacking - example
Buffer Overflow
[Figure: normal stack vs. smashed stack]
Recent Worm Exploits
Worm Exploits
Polymorphic Worm
• For each mutation (Mutation A, Mutation B, Mutation C), the worm randomly generates a new key and the corresponding decryptor code
• The decryptor decrypts the worm body and executes it
• To detect an unknown mutation of a known virus, emulate CPU execution until the current sequence of instruction opcodes matches the known sequence for the virus body
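The effect of re-keying can be illustrated with a single-byte XOR cipher applied to an inert placeholder string. This is only a sketch of the idea: real polymorphic engines also mutate the decryptor itself, and the names `xor_bytes`, `mutate` and `BODY` are hypothetical.

```python
import os
from typing import Optional, Tuple


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def mutate(body: bytes, key: Optional[int] = None) -> Tuple[int, bytes]:
    """Return (key, encoded body); a random non-zero key is drawn if none is given."""
    if key is None:
        key = os.urandom(1)[0] or 1  # key 0 would leave the body unchanged
    return key, xor_bytes(body, key)


BODY = b"WORM-BODY-PLACEHOLDER"  # inert stand-in for the invariant body
key_a, mutation_a = mutate(BODY, key=0x41)
key_b, mutation_b = mutate(BODY, key=0x5A)

print(mutation_a.hex())
print(mutation_b.hex())
print(xor_bytes(mutation_a, key_a) == xor_bytes(mutation_b, key_b) == BODY)
```

The two mutations share no fixed byte pattern on the wire, yet both decode to the same body, which is why emulation-based detection looks at the decoded instruction sequence rather than at the raw bytes.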
Polymorphic Engine
Mutation Engine
 ADMutate alters each of these elements
 NOP substitution with operationally inert commands
 Shell code encoded by XORing with a randomly generated key
 Return address modulated: least significant byte altered to jump into different parts of the NOPs
[Figure: exploit layout before and after ADMutate. NOP substitutes | polymorphic XOR decoder | XOR'ed machine code: execve (/bin/sh) | modulated pointers to the NOP substitutes]
Metamorphic Code - Examples
 Code reordering
 Instructions that are independent are re-ordered
    Original:         Reordered:
    MOV EAX, [X]      MOV EBX, [Y]
    MOV EBX, [Y]      MOV EAX, [X]
    ADD EAX, EBX      ADD EAX, EBX
    MOV [X], EAX      MOV [X], EAX
 Garbage code insertion
 Instructions are inserted that are semantic no-ops (they do not affect the code or the registers, and therefore the execution)
    Original:         With garbage code:
    MOV EAX, [X]      MOV EAX, [X]
    MOV EBX, [Y]      MOV EBX, [Y]
    ADD EAX, EBX      ADD EAX, EBX
    MOV [X], EAX      PUSH ESI
                      MOV [X], EAX
                      POP ESI
 Equivalent code replacement
 Register renaming, or semantically equivalent code
    Original:         Replaced:
    MOV EAX, [X]      XOR EAX, EAX
    MOV EBX, [Y]      ADD EAX, [X]
    ADD EAX, EBX      ADD EAX, [Y]
    MOV [X], EAX      MOV [X], EAX
 Register reassignment
 Swaps the usage of the registers; causes extensive “minor” changes in the code sequence
Zero-Day Attack Detection
Research Trends
 Network-based
 Prevalence Model
• Autograph/Polygraph
• Earlybird
 Other Type
• PayL
• PacketVaccine
 Malicious Code detection
• SigFree
• Polymorphic Detection - Network execution
• Control Flow Graph
 Host-based
• MINOS
• DACODA
Prevalence Model – (1)
[1] EarlyBird
* Source [Singh04]
 Key observation: define worm behavior
 Content invariance
• Portions of a worm are invariant (e.g. the decryption routine)
 Content prevalence
• The invariant content appears frequently on the network
 Address dispersion
• Destination addresses are distributed more uniformly, so that the worm spreads fast
 Two consequences
 Content prevalence: 1/64 of substrings sampled, Rabin fingerprinting of 40-byte substrings, prevalence threshold of 3
 Address dispersion: threshold of 30 sources and 30 destinations
 Packet content examination can be evaded with simple polymorphism
(Stefan Savage, UCSD *)
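The combination of content prevalence and address dispersion can be sketched as a table keyed by substring fingerprints. This is a simplified illustration, not the EarlyBird implementation: it hashes every 40-byte substring with SHA-1 instead of using sampled Rabin fingerprints and multi-stage filters, and the class and constant names are hypothetical. The thresholds are the ones quoted on the slide.

```python
import hashlib
from collections import defaultdict

SUBSTRING_LEN = 40          # bytes per fingerprinted substring
PREVALENCE_THRESHOLD = 3    # minimum number of packets carrying the substring
DISPERSION_THRESHOLD = 30   # minimum distinct sources and destinations


class PrevalenceTable:
    def __init__(self):
        self.count = defaultdict(int)
        self.sources = defaultdict(set)
        self.destinations = defaultdict(set)

    def observe(self, src, dst, payload):
        """Record one packet; return fingerprints that now exceed all thresholds."""
        suspicious = set()
        seen_in_packet = set()
        for i in range(len(payload) - SUBSTRING_LEN + 1):
            # Stand-in for a Rabin fingerprint (assumption of this sketch).
            fp = hashlib.sha1(payload[i:i + SUBSTRING_LEN]).digest()[:8]
            if fp in seen_in_packet:
                continue  # count each substring once per packet
            seen_in_packet.add(fp)
            self.count[fp] += 1
            self.sources[fp].add(src)
            self.destinations[fp].add(dst)
            if (self.count[fp] >= PREVALENCE_THRESHOLD
                    and len(self.sources[fp]) >= DISPERSION_THRESHOLD
                    and len(self.destinations[fp]) >= DISPERSION_THRESHOLD):
                suspicious.add(fp)
        return suspicious


table = PrevalenceTable()
worm_content = bytes(range(40))
for host in range(30):
    flagged = table.observe(f"10.0.0.{host}", f"10.0.1.{host}", worm_content)
print(len(flagged))
```

Content that is repeated many times between a single pair of hosts, such as a popular file download, stays below the dispersion thresholds and is not flagged.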
Prevalence Model – (2)
[2] Autograph
* Source [Kim 04]
 Key Observations
 TCP worms that propagate via scanning
 Worm’s payloads share a common substring
• Vulnerability exploit part is not easily mutable, Not polymorphic
 Step 1: Select suspicious flows using heuristics
 Flows from scanners are suspicious
 Step 2: Generate signature using content-prevalence analysis
 All instances of a worm have a common byte pattern specific to the worm
 Content-based Payload Partitioning (COPP)
• Partition if Rabin fingerprint of a sliding window matches Breakmark
• Configurable parameters: content block size (minimum, average, maximum),
breakmark, sliding window
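Content-based payload partitioning can be sketched as follows. The sketch cuts a block wherever the hash of the trailing window, taken modulo the average block size, equals the breakmark, subject to minimum and maximum block sizes. It uses a simple polynomial hash recomputed per window in place of a Rabin fingerprint, and all parameter values are illustrative assumptions.

```python
def window_hash(window: bytes) -> int:
    """Simple polynomial hash, standing in for a Rabin fingerprint."""
    h = 0
    for b in window:
        h = (h * 257 + b) % 1_000_003
    return h


def copp_partition(payload, window=4, average=16, minimum=8, maximum=64, breakmark=7):
    """Split payload into variable-length content blocks.

    A block ends when hash(last `window` bytes) % average == breakmark,
    but never before `minimum` bytes and never after `maximum` bytes.
    """
    blocks, start = [], 0
    for end in range(1, len(payload) + 1):
        size = end - start
        if size < max(minimum, window):
            continue
        at_break = window_hash(payload[end - window:end]) % average == breakmark
        if at_break or size >= maximum:
            blocks.append(payload[start:end])
            start = end
    if start < len(payload):
        blocks.append(payload[start:])  # trailing partial block
    return blocks


blocks = copp_partition(bytes(range(256)) * 2)
print([len(b) for b in blocks])
```

Because block boundaries depend on content rather than on byte offsets, the same worm bytes tend to produce the same blocks even when they appear at different positions in different flows, which is what makes per-block prevalence counting meaningful.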
A protocol through which multiple distributed Autograph monitors may share information
Prevalence Model – (3)
[3] Polygraph
 No one substring is specific enough
 BUT, there are multiple substrings
 Protocol framing
 Value used to overwrite return address
 (Parts of poorly obfuscated code)
 Approach: combine the substrings (3-byte tokens) into a single signature
* Source [Newsome05]
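One way to combine several short substrings is a conjunction signature: a flow matches only if it contains every token. The sketch below derives such a signature from fixed 3-byte tokens shared by all suspicious samples and absent from innocuous traffic. It is a simplification for illustration (Polygraph itself defines several signature classes), and the function names and sample payloads are hypothetical.

```python
def ngrams(data: bytes, n: int = 3) -> set:
    """All n-byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def conjunction_signature(samples, innocuous=(), n=3):
    """Tokens present in every suspicious sample and in no innocuous flow."""
    common = set.intersection(*(ngrams(s, n) for s in samples))
    for flow in innocuous:
        common -= ngrams(flow, n)
    return common


def matches(signature, payload: bytes) -> bool:
    """A payload matches only if it contains all tokens of the signature."""
    return bool(signature) and all(token in payload for token in signature)


samples = [
    b"GET /x.ida?AAAA\xef\xbe\xad\xde1111",
    b"GET /y.ida?BBBBBB\xef\xbe\xad\xde2222",
]
innocuous = [b"GET /index.html HTTP/1.1"]
signature = conjunction_signature(samples, innocuous)

print(sorted(signature))
print(matches(signature, b"GET /z.ida?CC\xef\xbe\xad\xde"))
print(matches(signature, innocuous[0]))
```

Protocol framing such as `GET` is shared with normal traffic and is dropped, while the request suffix and the bytes overwriting the return address survive as tokens.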
Summary of Prevalence Model Detection
Prevalence Model             | Earlybird                                 | Autograph/Polygraph
Traffic                      | Contents Prevalence -> Address dispersion | Suspicious Flow Selection -> Contents Prevalence
Preprocessing                | 1/64 string sample                        | Session reassembly
Suspicious Traffic           | NA                                        | Port scan detection; session success rate; address dispersion
Prevalent Content Extraction | Multi-stage hash                          | Longest common substring
Signature size               | 40 bytes                                  | 3-byte tokens
Whitelist                    | Heuristic                                 | Heuristic
Demerits                     | High false positives; sampled traffic; heuristic-based whitelist management | Does not handle polymorphic worms; session reassembly degrades processing power; high session failure rate due to P2P services
Trends                       | No more super-worm outbreaks since 2004
Payload Anomaly
* Source [Wang05]
[4] PAYL
 Compute a “normal profile” of a site’s unique content flow, and use
this information to detect anomalous data
 n-gram
• is the sequence of n adjacent byte values in a packet payload
• A sliding window with width n is passed over the whole payload one byte
at a time and the frequency of each n-gram is computed
• The frequency count distribution represents a model of the content flow
(“statistical centroid”)
 Compare the similarity between test data and the trained models
• Mahalanobis distance
• If the distance of a test datum is greater than the threshold, the system
issues an alert
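The modelling and comparison steps can be sketched for the 1-gram case (n = 1), where the model is the mean and standard deviation of each byte value's relative frequency. The distance below is the simplified Mahalanobis distance, the sum over all byte values of |x_i - mean_i| / (stdev_i + alpha), with a smoothing term alpha to avoid division by zero. The function names, the value of alpha and the sample payloads are assumptions of this sketch.

```python
from collections import Counter
from statistics import mean, pstdev


def byte_distribution(payload: bytes):
    """Relative frequency of each of the 256 byte values (1-gram model)."""
    counts = Counter(payload)
    total = len(payload) or 1
    return [counts.get(value, 0) / total for value in range(256)]


def train(payloads):
    """Build the "centroid": per-byte mean and standard deviation."""
    dists = [byte_distribution(p) for p in payloads]
    means = [mean(column) for column in zip(*dists)]
    stdevs = [pstdev(column) for column in zip(*dists)]
    return means, stdevs


def distance(payload, model, alpha=0.001):
    """Simplified Mahalanobis distance between a payload and the model."""
    means, stdevs = model
    x = byte_distribution(payload)
    return sum(abs(xi - mi) / (si + alpha) for xi, mi, si in zip(x, means, stdevs))


normal = [
    b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
    b"GET /about.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
    b"GET /img/logo.png HTTP/1.1\r\nHost: example.com\r\n\r\n",
]
model = train(normal)
attack = b"GET /default.ida?" + b"N" * 200 + b"%u9090%u6858"

print(distance(normal[0], model))
print(distance(attack, model))
```

An alert threshold would be placed between the two printed distances; in practice it is tuned on training traffic for each site and port.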
[Figure: character (byte-value) distribution of a normal HTTP request vs. CodeRed II]
Jump Address Detection –(1)
[5] Packet Vaccine
* Source [X. Wang, CCS2006]
 Vaccine Generation
• Based on detection of anomalous packet payloads, e.g. a byte sequence resembling a jump address, and randomization of selected contents
 Exploit Detection
• Detects an exploit attempt: the vaccine should now trigger an exception in a vulnerable program
 Vulnerability Diagnosis
• Correlates the exception with the vaccine to acquire information regarding the exploit, in particular the corrupted pointer content and its location in the exploit packet
 Signature Generation
• Creates variations of the original exploit to probe the vulnerable program, in an effort to identify necessary exploit conditions for generation of a signature
Jump Address Detection –(2)
[5] Packet Vaccine
* Source [X. Wang, CCS2006]
 Vaccine Generation
 A key step in most exploits is to inject a jump address to redirect the control flow of a vulnerable program
 Such an address points to
• the stack or heap in a code-injection attack
• a global library entry in an existing-code attack
 Approach
 Check every 4-byte sequence (32-bit system) or 8-byte sequence (64-bit system)
 Randomize those which fall in the address range of the potential jump targets in a protected program
 The randomized packet should cause an exception: segmentation fault (SEGV) or illegal instruction fault (ILL)
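The scan-and-randomize step can be sketched as below. The sketch assumes little-endian words (as on x86) and a caller-supplied list of address ranges for the protected program; the function names and the example stack range are hypothetical and chosen only for illustration.

```python
import random
import struct


def in_ranges(value, ranges):
    return any(low <= value < high for low, high in ranges)


def random_word_outside(ranges, rng, word_size, fmt):
    """Draw random bytes until the word no longer looks like a jump target."""
    while True:
        word = bytes(rng.randrange(256) for _ in range(word_size))
        (value,) = struct.unpack(fmt, word)
        if not in_ranges(value, ranges):
            return word


def generate_vaccine(packet, address_ranges, rng=None, word_size=4):
    """Return (vaccine, offsets): the packet with address-like words scrambled."""
    rng = rng or random.Random()
    fmt = "<I" if word_size == 4 else "<Q"  # little-endian, 32-bit or 64-bit
    vaccine = bytearray(packet)
    scrambled = []
    offset = 0
    while offset + word_size <= len(packet):
        (value,) = struct.unpack_from(fmt, packet, offset)
        if in_ranges(value, address_ranges):
            vaccine[offset:offset + word_size] = random_word_outside(
                address_ranges, rng, word_size, fmt)
            scrambled.append(offset)
            offset += word_size
        else:
            offset += 1  # check every byte offset
    return bytes(vaccine), scrambled


STACK_RANGE = [(0xBFFF0000, 0xC0000000)]  # illustrative 32-bit stack region
packet = b"AAAA" + struct.pack("<I", 0xBFFFF7A0) + b"BBBB"
vaccine, offsets = generate_vaccine(packet, STACK_RANGE, random.Random(0))
print(offsets, vaccine.hex())
```

If the original packet was an exploit, the scrambled word sends control to an invalid location, so the protected program faults instead of running the injected code; a benign packet is usually unaffected by the change.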
Attack Scenario
BLASTER Worm
 DCOM object: insufficient bounds checking
 RPC Endpoint Mapper listens on: 139, 135, 445, 593
Attacker / Victim message sequence:
1. Probe: connection scan attempt (TCP 135)
2. RPC DCOM request (TCP 135), causing a buffer overflow on the victim
3. Shell code: binds port 4444 (victim listening on TCP 4444)
4. Attacker starts a TFTP server
5. Remote command attempt
5.1 tftp <host> GET msblast.exe
6. TFTP download request (UDP 69)
7. Delivery of the main worm body “msblast.exe” (TFTP: UDP 69)
7.1 start msblast.exe
8. SYN flooding of WindowsUpdate.com
Malicious Code Detection Approach –(1)
Malicious Code
 Assumptions
 Buffer overflow attacks typically contain executables whereas legitimate client requests
never contain executables in most Internet services
 if a packet contains executables it would be an attack
Services that accept data only:
Windows platforms                                   | Linux platforms
Web service (port 80)                               | Apache web server (port 80)
Remote access services (ports 111, 137, 138, 139)   | BIND (port 53)
MS-SQL servers (port 1434)                          | SNMP (port 161)
Workstation services (ports 139, 445)               | Mail (port 25)
Database servers (ports 1521, 3306, 5432)
Malicious Code Detection
[6] SigFree
* Source [Wang06]
 SigFree blocks attacks by detecting the presence of code
 Signature free
 Immunized from most attack-side obfuscation methods
 Generic code-data separation criteria
 Transparency
 Negligible throughput degradation
 Economical deployment with very low maintenance cost
 Scope
 Web service (port 80) buffer overflow attacks
• Strictly speaking, it is not a buffer-overflow detection algorithm but an executable-code detection algorithm
 Application-level attacks such as data manipulation and SQL injection are out of scope
 IA-32 (Intel)
 Packet based (no reassembly)
 Assumption: normal requests do not contain executable code
SigFree Overview
* Source [Wang06]
[6] SigFree
 SigFree architecture
• Scheme 1: exploits the OS characteristics of a program (faster)
• Scheme 2: exploits the data flow characteristics of a program (more robust)
[Figure: SigFree overview; labels include “Extended instruction flow graph” and “All Possible instruction”]
SigFree - Limitation
[6] SigFree
* Source [Wang06]
 Limitations
 SigFree cannot fully handle branch-function-based obfuscation
 SigFree cannot detect shellcode that is written in alphanumeric form
 SigFree cannot detect malicious code that consists of fewer useful instructions than the current threshold of 15
 SigFree cannot detect encrypted executable code
Network-Level Execution – (1)
* Source [Michalls06]
[7] Polymorphic Shellcode Detection
 executes every potential instruction sequence, aiming to identify the execution
behavior of polymorphic shellcodes
 compares their execution profile against the behavior observed to be inherent to
polymorphic shellcodes.
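[Figure: detection pipeline. The input stream is scanned by byte shifting from start to end; at each position the bytes are disassembled and executed while memory reads are counted. Chains with invalid memory accesses or invalid instructions are dropped. If the memory read loop of the decryptor exceeds the threshold, the stream is classified as an attack.]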
Network-Level Execution – (2)
[7] Polymorphic Shellcode Detection
* Source [Michalls06]
Pattern 1
During decryption, the decryptor must read the encrypted payload in order to decrypt it.
Criterion 1: the number of payload reads in an execution chain > Payload Read Threshold (PRT)
Pattern 2
A mandatory operation of every polymorphic shellcode is to find its location in memory using some form of “Get PC (%eip)” code.
Criterion 2: the chain executes some form of “Get PC (%eip)” code
[Figure: an execution chain, with the “Get PC” code and the part of the chain that performs payload reads highlighted]
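The two criteria can be sketched as follows. Note that this is a static byte scan for two common x86 GetPC idioms (`call $+5` followed by `pop`, and `fnstenv [esp-0xc]`), not the CPU emulation the detector relies on, so it only illustrates what the emulator looks for. Requiring both criteria together is an assumption of this sketch, the PRT value is left to the caller, and the function names are hypothetical.

```python
import re

GETPC_PATTERNS = [
    re.compile(rb"\xe8\x00\x00\x00\x00[\x58-\x5f]"),  # call $+5 ; pop r32
    re.compile(rb"\xd9\x74\x24\xf4"),                  # fnstenv [esp-0xc]
]


def contains_getpc(code: bytes) -> bool:
    """True if the bytes contain one of the known GetPC idioms."""
    return any(pattern.search(code) for pattern in GETPC_PATTERNS)


def flag_chain(payload_reads: int, executed_getpc: bool, prt: int) -> bool:
    """Flag an execution chain that ran GetPC code and read the payload
    more than `prt` times (both criteria combined, by assumption)."""
    return executed_getpc and payload_reads > prt


stream = b"\x90\x90\xe8\x00\x00\x00\x00\x5e\x31\xc0"
print(contains_getpc(stream))
print(flag_chain(payload_reads=100, executed_getpc=contains_getpc(stream), prt=32))
```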
PW Detection : CFG – (1)
[8] Control Flow Graph Extraction
* Source [Kruegel 05]
 Perform a linear disassembly from the first byte of a stream to extract the machine instructions
 Remove invalid basic blocks (resulting from the disassembly of non-code byte streams)
 A block is invalid:
• if it contains one or more invalid instructions,
• if it is on a path to an invalid block, or
• if it ends in a control transfer instruction that jumps into the middle of another instruction
PW Detection : CFG – (2)
[8] CFG Construction
* Source [Kruegel 05]
 Robustness to modification
 Junk insertion, register renaming, code transposition, instruction substitution
 Uniqueness
 Different executable regions should map to different fingerprints
Example instruction stream:
• MOV A 10
• MOV B 10
• ADD B
• JMP BLOCK2
• MOV A 15
• MOV B 20
• MUL B
• The CFG is built from a linear disassembly of the byte stream
• Nodes: sequences of instructions without any jumps
• Edges: jump instructions making the transition from one node to another
CFG of binary code: cluster of closely connected nodes
CFG of a random sequence: isolated nodes
PW Detection : CFG – (3)
[8] Graph Coloring
* Source [Kruegel 05]
 Classify instructions into 14 sets
• A 14-bit colour value is associated with each node (1 bit corresponding to 1 class)
• When one or more instructions of a certain class appear in the basic block, the corresponding bit of the basic block's colour value is set to 1
• E.g.
    MOV A, B      00000000000010
    MUL A, 10     00000000000001
    PUSH A        00000000010000
    Node colour:  00000000010011
• Append the 14-bit colour value to each node in the adjacency matrix of the subgraph
• Concatenate the rows as before to get the new fingerprint
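The node colour in the example is the bitwise OR of the class bits of the instructions in the basic block. The sketch below reproduces it; the `CLASS_BIT` table contains only the three classes shown in the example (the full scheme has 14), and its name is hypothetical.

```python
# Bit position of each instruction class, taken from the example above.
CLASS_BIT = {"MUL": 0, "MOV": 1, "PUSH": 4}


def node_colour(mnemonics):
    """OR together one bit per instruction class present in the basic block."""
    colour = 0
    for mnemonic in mnemonics:
        colour |= 1 << CLASS_BIT[mnemonic]
    return colour


colour = node_colour(["MOV", "MUL", "PUSH"])
print(format(colour, "014b"))  # prints 00000000010011
```

Repeating an instruction class within the block does not change the colour, which makes the fingerprint robust to instruction substitution within a class.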
Classification of Malicious Code Detection
Static vs. Dynamic Analysis
 Static Analysis: structure of executables
 Analysis without “execution”
 Disassembling -> string analysis
 Frequency analysis
 Structure of a program is described by its control flow graph (CFG)
 Cannot be used to detect novel malware instances
 Used to recognize obfuscated variants of the same code instance
 Dynamic Analysis: behavior of executables
 File is “executed” in a safe environment
• VMware, sandbox
 Behavior of a program is the observable effect that it has on its environment
• RegMon, FileMon, syscall monitoring
 Considers the behavior of a whole class of malware
Payload Analysis
[Figure: payload analysis flow. Known file types (executables, exploits, known files) are passed to static analysis and dynamic analysis. Data with an unknown file type is passed to crypto analysis, or the “responsible” program (loader) is found first.]
Host-based Detection – (1)
[9] MINOS
* Source [Crandall04]
 Tagged architecture that tracks the integrity of every memory word
 Network data is tainted
 Control data (return pointers, function pointers, jump targets, etc.) should not be tainted
 Taint tracking with every instruction
 Great for catching worms
 Uses the γ mapping
 Implemented a full-system tagging scheme in a virtual machine
 Linux (modified kernel)
• Tracks integrity in the file system
• Virtual memory swapping
 Windows (unmodified)
• Works great as a honeypot for catching worms
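The tagging idea can be illustrated with a toy memory model in which every word carries a taint bit, data operations propagate the bit, and a control transfer through a tainted word is refused. This is a conceptual sketch only; MINOS is a hardware and emulator-level design, and the class and method names here are hypothetical.

```python
class ControlDataAttack(Exception):
    """Raised when tainted data is about to be used as a control-flow target."""


class TaintTrackingMemory:
    def __init__(self):
        self._value = {}
        self._tainted = {}

    def store_network(self, addr, value):
        """Data arriving from the network is tainted."""
        self._value[addr] = value
        self._tainted[addr] = True

    def store_trusted(self, addr, value):
        """Data produced by the program itself is untainted."""
        self._value[addr] = value
        self._tainted[addr] = False

    def copy(self, dst, src):
        """Copying a word copies its taint bit."""
        self._value[dst] = self._value[src]
        self._tainted[dst] = self._tainted[src]

    def add(self, dst, a, b):
        """The result is tainted if either operand is tainted."""
        self._value[dst] = self._value[a] + self._value[b]
        self._tainted[dst] = self._tainted[a] or self._tainted[b]

    def jump_via(self, addr):
        """Use a word as control data; refuse if it is tainted."""
        if self._tainted[addr]:
            raise ControlDataAttack(f"tainted control data at {addr!r}")
        return self._value[addr]


memory = TaintTrackingMemory()
memory.store_trusted("return_address", 0x08048000)
print(hex(memory.jump_via("return_address")))

memory.store_network("buffer", 0xBFFFF7A0)
memory.copy("return_address", "buffer")  # simulated overflow
try:
    memory.jump_via("return_address")
except ControlDataAttack as error:
    print("blocked:", error)
```

The check fires regardless of how the payload is encoded, because it looks only at where the control data came from (the γ part of the exploit), not at its byte content.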
Host-based Detection – (2)
[10] DACODA
* Source [Crandall05]
 DAvis malCODe Analyzer
 Discover invariants in the exploit vector (ε)
 Symbolic execution on the system trace during attacks that
Minos catches
 Used for an empirical analysis of polymorphism and metamorphism
 Quantify and understand the limits
Bibliography
1) [Singh04] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated worm fingerprinting. In OSDI, 2004.
2) [Kim 04] H.-A. Kim and B. Karp. Autograph: Toward automated, distributed worm signature detection. In USENIX Security Symposium, pages 271-286, 2004.
3) [Newsome05] J. Newsome, B. Karp, and D. Song. Polygraph: Automatically generating signatures for polymorphic worms. In Proceedings of the IEEE Symposium on Security and Privacy, May 2005.
4) [Wang05] K. Wang, G. Cretu, and S. J. Stolfo. Anomalous payload-based worm detection and signature generation. In Proceedings of the 14th USENIX Security Symposium, Baltimore, MD, USA, July 31 - August 5, 2005.
5) [X. Wang 06] XiaoFeng Wang, Zhuowei Li, Jun Xu, Michael K. Reiter, Chongkyung Kil, and Jong Youl Choi. Packet Vaccine: Black-box Exploit Detection and Signature Generation. In CCS, 2006.
6) [Wang06] SigFree: A Signature-free Buffer Overflow Attack Blocker. In USENIX Security Symposium, 2006.
7) [Michalls06] Michalis Polychronakis, Kostas G. Anagnostakis, and Evangelos P. Markatos. Network-Level Polymorphic Shellcode Detection Using Emulation. In DIMVA, 2006.
8) [Kruegel05] C. Kruegel, E. Kirda, D. Mutz, W. Robertson, and G. Vigna. Polymorphic worm detection using structural information of executables. In Proceedings of the 8th International Symposium on Recent Advances in Intrusion Detection (RAID), September 2005.
9) [Crandall 04] Jedidiah R. Crandall and Frederic T. Chong. Minos: Control Data Attack Prevention Orthogonal to Memory Model. In IEEE/ACM International Symposium on Microarchitecture, pages 221-232, IEEE Computer Society, 2004.
10) [Crandall 05] J. R. Crandall, Z. Su, S. F. Wu, and F. T. Chong. On Deriving Unknown Vulnerabilities from Zero-Day Polymorphic and Metamorphic Worm Exploits. In ACM CCS, pages 235-248, November 2005.