Down to the Bare Metal: Using Processor Features for Binary Analysis

Annual Computer Security Applications Conference (ACSAC) 2012
Carsten Willems (1), Ralf Hund (1), Andreas Fobian (1), Thorsten Holz (1), Amit Vasudevan (2)
(1) Ruhr-University Bochum, Germany
(2) Carnegie Mellon University
左昌國
2013/02/25 Seminar @ ADLab, NCU-CSIE
Outline
• Introduction
• Software Emulators
• Delusion Attacks
• Binary Analysis with Branch Tracing
• Experiments
• Limitations
• Conclusion
Introduction
• Binary (malware or vulnerable software) analysis
• Static
• Dynamic
• Number of execution paths
• (For behavior analysis) monitoring every instruction or only critical points
• Native machine or emulation/virtualization
Introduction
• Native Machine
• The analysis result must be unaffected by malicious code
• Reverting to clean states
• Lack of monitoring abilities
• Emulator
• Artificial environment detection
• Delusion attacks
• No explicit test
Introduction
• Contributions:
• Introducing several delusion attacks
• An approach to perform behavior analysis
• Branch tracing feature of x86 CPU
• Implementing a prototype that shows the usefulness of this approach
Software Emulators
• BOCHS
• QEMU
• Dynamic Translation
• Guest code block (before branch) → intermediate code → optimization → translated to host instruction code block (Translation Block, TB) → saving TBs in code cache
• Isolated Memory
• BitBlaze and Anubis
• Taint Propagation Tracking
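The dynamic translation pipeline listed above can be illustrated with a toy model. This is only a sketch: the three-instruction guest language, the `TranslationCache` class, and the "optimization" (folding consecutive adds) are assumptions made for illustration and do not reflect QEMU's real intermediate code or data structures.

```python
# Toy guest program: address -> (mnemonic, operand). Purely illustrative.
GUEST_CODE = {
    0: ("add", 5),
    1: ("add", 7),
    2: ("jmp", 4),    # a branch ends the first guest code block
    3: ("add", 100),  # never reached
    4: ("add", 1),
    5: ("halt", 0),
}


class TranslationCache:
    """Translates guest blocks on first use and caches them by start address."""

    def __init__(self, code):
        self.code = code
        self.cache = {}        # guest start address -> translated block
        self.translations = 0  # how many blocks had to be translated

    def translate(self, pc):
        """Translate the guest block starting at pc, up to the next branch."""
        operands = []
        while True:
            mnemonic, operand = self.code[pc]
            if mnemonic == "jmp":
                next_pc = operand
                break
            if mnemonic == "halt":
                next_pc = None
                break
            operands.append(operand)
            pc += 1
        total = sum(operands)  # "optimization": fold all adds into one
        self.translations += 1

        def block(acc):
            # The translated block: one host-level operation plus the successor.
            return acc + total, next_pc

        return block

    def run(self, pc=0, acc=0):
        while pc is not None:
            if pc not in self.cache:       # cache miss: translate once
                self.cache[pc] = self.translate(pc)
            acc, pc = self.cache[pc](acc)  # cache hit: reuse translated block
        return acc


tc = TranslationCache(GUEST_CODE)
print(tc.run(), tc.translations)  # second run() reuses the cached blocks
```

Because translated blocks are reused from the cache, the emulator must notice when the guest overwrites code that has already been translated, which is the problem the self-modifying code discussion addresses.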
Delusion Attacks - Motivation
• Current emulator detection techniques consist of two steps:
(1) Probing for the existence of a non-native system environment
(2) Depending on the outcome of (1), performing different actions
• These techniques are easy to spot and mitigate
• For example, with powerful analysis methods like multi-path execution
• This paper proposes detection methods that have no explicit check and no conditional branch
Delusion Attacks – Basic Principle
• Self-Modifying Code (SMC)
• On a native system, handling SMC correctly is complicated
• Instruction prefetch
• Multi-processor environment
• Modern CPUs can handle these problems correctly
• In an emulator, the CPU facilities for SMC detection cannot be utilized
• SMC detection must be implemented in software
• Keeping a list of the addresses of all instructions → huge overhead
• Most emulators (like QEMU) use page fault handling for SMC detection
• All executable memory pages are set read-only
• A memory write to executable memory triggers the page fault handler
• In the handler, if the target memory should be writable (it is writable in the guest OS):
1. Memory protection is changed to writable
2. The memory write instruction is executed again
3. Memory protection is changed back to read-only
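The three handler steps can be sketched as a small simulation. The class name `PageProtectedMemory`, the 4 KiB page size, and the `smc_events` log are hypothetical names chosen for this sketch. It also assumes the guest OS considers every page writable, so the handler always takes the three-step path.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class PageProtectedMemory:
    """Emulator-side memory in which executable pages are kept read-only."""

    def __init__(self, size, executable_pages):
        self.data = bytearray(size)
        self.read_only = set(executable_pages)  # all executable pages start read-only
        self.smc_events = []                    # addresses of detected code writes

    def write(self, addr, value):
        page = addr // PAGE_SIZE
        if page in self.read_only:
            self._page_fault(addr, value, page)  # write hit a protected code page
        else:
            self.data[addr] = value

    def _page_fault(self, addr, value, page):
        # Assumption: the page is writable from the guest OS's point of view.
        self.smc_events.append(addr)  # the emulator learns that code was modified
        self.read_only.discard(page)  # 1. memory protection is changed to writable
        self.data[addr] = value       # 2. the memory write is executed again
        self.read_only.add(page)      # 3. memory protection is changed back to read-only


mem = PageProtectedMemory(2 * PAGE_SIZE, executable_pages={1})
mem.write(10, 0x90)             # data page: plain write, no fault
mem.write(PAGE_SIZE + 5, 0xCC)  # code page: handled through the fault path
print(mem.smc_events)
```

The important property for the attacks that follow is that the fault is handled per write, in the middle of whatever instruction issued it.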
Delusion Attacks – REP MOVS
• rep movs instruction
• Copies a number of bytes, words, or double words within an implicit loop
• esi: source memory location
• edi: destination memory location
• ecx: loop counter, decremented on each iteration; the loop stops at 0
• On a real machine, the copy loop executes atomically
• In an emulator, if the destination is a code address,
• The first loop iteration triggers the page fault handler
• The handler makes the page writable, re-executes the write operation, and makes the page read-only again
• The instruction is re-read from memory (second loop iteration)
• …
Delusion Attacks – REP MOVS
On a real machine:
    lea eax, BENIGNCODE
    lea ebx, MALICIOUSCODE
    lea esi, NEW
    lea edi, OLD
    mov ecx, 2
    OLD+0x0: rep movsd
    OLD+0x2: nop
    OLD+0x3: nop
    OLD+0x4: call eax    // BENIGNCODE
    OLD+0x6: nop
    OLD+0x7: nop
             ret
    NEW+0x0: nop
    NEW+0x1: nop
    NEW+0x2: nop
    NEW+0x3: nop
    NEW+0x4: call ebx    // MALICIOUSCODE
    NEW+0x6: nop
    NEW+0x7: nop
• Both double words are copied from NEW to OLD without re-reading the instruction: ecx = 0, eip = OLD+0x2
• OLD+0x4 now holds call ebx, so MALICIOUSCODE is executed
Delusion Attacks – REP MOVS
In QEMU:
    lea eax, BENIGNCODE
    lea ebx, MALICIOUSCODE
    lea esi, NEW
    lea edi, OLD
    mov ecx, 2
    OLD+0x0: rep movsd
    OLD+0x2: nop
    OLD+0x3: nop
    OLD+0x4: call eax    // BENIGNCODE
    OLD+0x6: nop
    OLD+0x7: nop
             ret
    NEW+0x0: nop
    NEW+0x1: nop
    NEW+0x2: nop
    NEW+0x3: nop
    NEW+0x4: call ebx    // MALICIOUSCODE
    NEW+0x6: nop
    NEW+0x7: nop
• The first iteration writes to a read-only code page → page fault → the page is made writable, the first double word is copied (ecx = 1), and the page is made read-only again
• The instruction at OLD+0x0 is re-read from memory; it is now a nop (eip = OLD+0x1), so the second double word is never copied
• OLD+0x4 still holds call eax, so BENIGNCODE is executed
Delusion Attacks - INVD
• Many kinds of caches are available on a contemporary system
• In an emulator, there is no explicit cache support, and all cache-related instructions have no effect
• On a real machine
• A modification made in the cache is not written back to memory immediately
• On an emulated machine
• The modification is written directly to RAM
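The contrast between the two bullets can be modeled in a few lines. This is a deliberately simplified sketch: `WriteBackCachedMemory` and `UncachedEmulatorMemory` are hypothetical classes, the cache is reduced to a dictionary of dirty bytes, and real hardware gives no guarantee that a write stays in the cache until the following invd.

```python
class WriteBackCachedMemory:
    """Real-machine model: writes stay in a cache until they are written back."""

    def __init__(self, ram):
        self.ram = bytearray(ram)
        self.dirty = {}  # addr -> value held only in the cache

    def write(self, addr, value):
        self.dirty[addr] = value

    def read(self, addr):
        return self.dirty.get(addr, self.ram[addr])

    def wbinvd(self):
        # Write back all dirty data, then invalidate the cache.
        for addr, value in self.dirty.items():
            self.ram[addr] = value
        self.dirty.clear()

    def invd(self):
        # Invalidate the cache WITHOUT writing back: dirty data is lost.
        self.dirty.clear()


class UncachedEmulatorMemory:
    """Emulator model: no cache, cache instructions have no effect."""

    def __init__(self, ram):
        self.ram = bytearray(ram)

    def write(self, addr, value):
        self.ram[addr] = value

    def read(self, addr):
        return self.ram[addr]

    def wbinvd(self):
        pass

    def invd(self):
        pass


def run(memory_cls) -> str:
    a = 0                                  # assumed address of label A
    mem = memory_cls(bytes([0xFF, 0xD3]))  # A: call ebx (FF D3)
    esi = a + 1
    mem.wbinvd()
    mem.write(esi, 0xD0)                   # would turn the code into call eax (FF D0)
    mem.invd()
    second_byte = mem.read(a + 1)
    return {0xD3: "MALICIOUSCODE", 0xD0: "BENIGNCODE"}[second_byte]


print("real machine:", run(WriteBackCachedMemory))
print("emulator:    ", run(UncachedEmulatorMemory))
```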
Delusion Attacks - INVD
On a real machine:
    lea eax, BENIGNCODE
    lea ebx, MALICIOUSCODE
    lea esi, A
    inc esi                     // esi = A+0x1
    wbinvd
    mov byte ptr [esi], 0xD0    // the modification is done in cache, not yet written back to memory
    invd                        // the cache is now invalidated
    A: call ebx                 // FF D3 = call ebx, FF D0 = call eax
• The modified byte is discarded together with the cache, so A still holds call ebx and MALICIOUSCODE is executed
Delusion Attacks - INVD
In QEMU:
    lea eax, BENIGNCODE
    lea ebx, MALICIOUSCODE
    lea esi, A
    inc esi                     // esi = A+0x1
    wbinvd
    mov byte ptr [esi], 0xD0    // the modification is directly written to memory
    invd
    A: call eax                 // was call ebx (FF D3), now FF D0 = call eax
• A now holds call eax, so BENIGNCODE is executed
Delusion Attacks - LEAVE
• The leave instruction is equivalent to:
    mov esp, ebp
    pop ebp
Binary Analysis with Branch Tracing
• On x86 and x86-64 processors from Intel and AMD, the branch tracing (BT) facilities can record the source and destination address of every branch operation
• This information can be used to reconstruct the execution/decision path taken during execution
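How a list of (source, destination) pairs yields an execution path can be sketched as follows. The function `reconstruct_path`, the entry address, and the sample records are made up for illustration; the sketch assumes the trace starts at a known entry point and that code between two recorded branches executes linearly.

```python
from typing import List, Tuple


def reconstruct_path(entry: int,
                     branches: List[Tuple[int, int]]) -> Tuple[List[Tuple[int, int]], int]:
    """Turn branch records into the linear address ranges that were executed.

    Each returned range (start, end) means: execution ran linearly from
    `start` up to the branch instruction at `end`. The second return value
    is the address where execution continued after the last recorded branch.
    """
    ranges = []
    start = entry
    for source, destination in branches:
        ranges.append((start, source))  # straight-line code up to the branch
        start = destination             # execution resumes at the branch target
    return ranges, start


# Hypothetical trace: two branches recorded after entering at 0x1000.
trace = [(0x1010, 0x2000), (0x2008, 0x1014)]
ranges, resume_at = reconstruct_path(0x1000, trace)
for lo, hi in ranges:
    print(f"executed {lo:#x} .. {hi:#x}")
print(f"continues at {resume_at:#x}")
```

The granularity is therefore one entry per taken branch, which is coarser than per-instruction tracing but finer than logging function calls only.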
Experiment 1: Binning of Malicious PDF Documents
• Fuzzing, a kind of automated vulnerability analysis, produces a large number of crash reports
• Binning: a technique to group crash reports by similar root cause
• This technique can also be used to group a set of exploits by the category of the exploited vulnerability
• Binning is easy to realize by comparing the control-flow paths generated from BT logs
Experiment 1: Binning of Malicious PDF Documents
• CWXDetector
• A tool that is capable of detecting exploitation attempts and extracting shellcode
• It does not become active before the execution of the first shellcode instruction
→ no information can be gained about the underlying vulnerability
• Combining BT with CWXDetector makes it possible to trace back from the execution of the first shellcode instruction to the root cause of the vulnerability
• The experiment
• 4,869 malicious PDF documents
• Each file exploits some kind of vulnerability in Acrobat Reader 9.00
Experiment 1: Binning of Malicious PDF Documents
• Normalization
• Because of ASLR, the branch addresses are recorded as relative addresses
• Collapsing loops
• Removing internal exception handling of the Windows system
• Ignoring the shellcode part
• Clustering algorithm
• DBSCAN
• Jaro-Winkler distance
• Measures the difference between two strings
• Similar strings → higher score
• Similar prefix → higher score
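The two scoring properties named above can be seen in a plain implementation of the standard Jaro-Winkler similarity. This is a textbook sketch, not the code used in the experiment; the prefix weight `p = 0.1` and the prefix cap of 4 characters are the commonly used defaults and are assumptions here. How normalized BT logs are encoded as strings is not shown.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)  # how far apart matches may be
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
print(jaro_winkler("abcdxx", "abcdyy") > jaro_winkler("xxabcd", "yyabcd"))
```

Since the score is a similarity, a clustering algorithm that expects distances would typically use `1 - jaro_winkler(a, b)`.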
Experiment 1: Binning of Malicious PDF Documents
k: minimum cluster size
ε: maximum distance of two objects to belong to the same cluster
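The role of the two parameters can be illustrated with a compact, generic DBSCAN. This is a minimal sketch and not the implementation used in the experiment: the function `dbscan` is written for this example, `k` is interpreted as the minimum number of neighbors (including the point itself) needed to start a cluster, and the sample data are plain integers with absolute difference as the distance.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")
NOISE = -1


def dbscan(items: Sequence[T],
           distance: Callable[[T, T], float],
           eps: float,
           k: int) -> List[int]:
    """Return one cluster label per item; NOISE (-1) marks unclustered items."""
    labels: List[Optional[int]] = [None] * len(items)

    def neighbors(i: int) -> List[int]:
        # All items within eps of item i (including i itself).
        return [j for j in range(len(items))
                if distance(items[i], items[j]) <= eps]

    cluster = 0
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < k:
            labels[i] = NOISE  # too few neighbors to start a cluster
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border item: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= k:  # dense enough: keep growing the cluster
                queue.extend(reachable)
        cluster += 1
    return [label if label is not None else NOISE for label in labels]


values = [10, 11, 12, 50, 51, 52, 90]
print(dbscan(values, lambda a, b: abs(a - b), eps=1, k=2))
```

A larger ε merges more traces into the same cluster, while a larger k leaves more traces unclustered.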
Experiment 1: Binning of Malicious PDF Documents
• Comparing with Wepawet
• 5 different vulnerability signatures (only addressing exploits of Acrobat Reader 9.00)
• A small number of samples were not detected as exploiting Acrobat Reader 9.00
→ manually verified → Wepawet is wrong
• Some samples are labeled incorrectly
→ manually verified → Wepawet is wrong
• Performance
• Time from opening the documents to the execution of shellcode
• Min: 11s (2s w/o BT)
• Max: 406s (117s w/o BT)
• Avg: 129s (11s w/o BT)
Experiment 2: Enriching BT Logs
Experiment 3: Practical Delusion Attack with a PDF File
• See Appendix B of the technical report (T.R.)
• In Anubis, this sample behaved normally
Limitations
• The data from BT logs is coarse-grained
• The prototype could be detected by timing measurements
• An attacker running in ring 0 can disable BT
• The approach could be combined with a hardware-assisted hypervisor
Conclusion
• Many analysis techniques utilize software emulators
• Attackers still have methods to evade analysis in an emulated environment
• A new approach for dynamic code analysis that uses CPU-assisted branch tracing offers a granularity between instruction- and function-level monitoring with reasonable overhead
• Practical results show that BT traces contain enough information to assist some tasks in malware and vulnerability analysis