Inspector Gadget: Automated Extraction of Proprietary Gadgets from

31st IEEE Symposium on Security & Privacy, 2010
Clemens Kolbitsch
Thorsten Holz
Secure Systems Lab
Vienna University of Technology
Christopher Kruegel
University of California
Engin Kirda
Institute Eurecom
Outline
Introduction
 System Overview
 Automated Extraction
 Gadget Preparation and Replay
 Gadget Inversion
 Evaluation

Introduction

Malware is the driving force behind
many of the attacks on the Internet
today.

It is now increasingly deployed as
software that can be remotely controlled.
How to analyze…

Static analysis
 Obfuscation, etc.

Dynamic analysis
 It does not support automatically extracting
specific functionality from the malware.
 Ex: domain generation algorithm of samples that
use domain flux
 Ex: the decoding function
This paper aims…

Presenting a novel approach to automatically
extract from a given malware the instructions
that are responsible for a certain activity of the
sample

First, INSPECTOR performs dynamic program
slicing on the malware to extract a slice with
“interesting” behavior.

Second, it generates a stand-alone gadget
based on the extracted slice.
Advantages of
the extracted gadgets
Reduce our exposure to the malicious
code
 Immediately carry out a certain
operation the malware performs
 Identify in-memory buffers that hold
decrypted data
 Some gadgets can be inverted.

System Overview
Automated Extraction

Generating Activity Logs
 Anubis[web] performs dynamic malware
analysis based on a processor emulator (QEMU).
○ Records all executed instructions
○ Marks each byte returned by a system call and
tracks it with taint analysis
○ Records all memory accesses
 Once an analyst has spotted an interesting
behavior, she can instruct INSPECTOR to
extract a gadget.
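
As a rough illustration of the taint marking described above, the following Python sketch propagates taint labels through a simplified activity log. The log format, the location names, and the function `propagate_taint` are assumptions made for this example; they are not Anubis's actual data structures.

```python
def propagate_taint(log):
    """Propagate taint labels through a simplified activity log.

    Each log entry is one of (hypothetical format):
      ("syscall", name, [dst, ...])  bytes returned by a system call
      ("op", dst, [src, ...])        an instruction writing dst from srcs
    Returns a dict mapping each location to its set of (syscall, offset) labels.
    """
    taint = {}
    for entry in log:
        if entry[0] == "syscall":
            _, name, dsts = entry
            # Every returned byte gets its own label.
            for offset, dst in enumerate(dsts):
                taint[dst] = {(name, offset)}
        else:
            _, dst, srcs = entry
            # The destination inherits the union of its sources' labels.
            labels = set()
            for src in srcs:
                labels |= taint.get(src, set())
            taint[dst] = labels
    return taint


log = [
    ("syscall", "recv", ["buf0", "buf1"]),
    ("op", "eax", ["buf0"]),
    ("op", "ebx", ["eax", "buf1"]),
    ("op", "ecx", []),  # loaded from a constant, so untainted
]
result = propagate_taint(log)
```

Here `result["ebx"]` carries both `recv` labels, while `result["ecx"]` is empty because it never touched system call output.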
Automated Extraction (cont.)

Selecting and Extracting Algorithms
 An analyst has to select the relevant flow
manually.
○ In the HTTP download, she may select
WriteFile, or CreateFile.
 Extract a slice
○ Attempts to find all necessary data sources
required to calculate the parameters passed to
the function call.
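
The slice extraction can be pictured as a backward walk over the recorded trace. The sketch below is a minimal version under strong assumptions: the trace is a list of hypothetical `(writes, reads)` pairs, only data dependencies are followed, and control dependencies are ignored. It is not INSPECTOR's implementation.

```python
def backward_slice(trace, criterion):
    """Compute a data-dependence backward slice.

    trace:     list of (writes, reads) pairs, one per executed instruction
    criterion: locations holding the parameters of the selected call
    Returns (indices of instructions in the slice, remaining data sources).
    """
    needed = set(criterion)
    kept = []
    for idx in range(len(trace) - 1, -1, -1):
        writes, reads = trace[idx]
        if needed & set(writes):
            kept.append(idx)
            # The written locations are now explained; their inputs are needed.
            needed -= set(writes)
            needed |= set(reads)
    kept.reverse()
    return kept, needed


trace = [
    (["eax"], ["mem:url"]),          # 0: load part of the URL
    (["ebx"], []),                   # 1: unrelated constant
    (["ecx"], ["eax"]),              # 2: transform eax
    (["edx"], ["ebx"]),              # 3: unrelated
    (["arg0"], ["ecx", "mem:key"]),  # 4: compute the call parameter
]
slice_indices, sources = backward_slice(trace, {"arg0"})
```

In this toy trace the slice keeps instructions 0, 2, and 4, and the locations left in `sources` (`mem:url` and `mem:key`) are the data sources the gadget would need as input.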
Selecting and Extracting
Algorithms

Forward Searching and Backward Slicing
 The behavior selected by an analyst is not the
intended endpoint.
 The analyst should specify something as an
endpoint where the forward searching stops.

Heuristics for Detecting Endpoints
 Calls to string comparison functions, or execution of
code containing string handling instructions
 The data has been processed by a sequence of
mathematical instructions.
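
One way to picture the forward search is to follow the selected data through the trace until a heuristic endpoint is reached. The sketch below covers only the string-comparison heuristic; the trace format, the list of function names, and `find_endpoint` are assumptions for illustration.

```python
# Hypothetical list of comparison routines treated as endpoints.
STRING_COMPARE_CALLS = {"strcmp", "strncmp", "memcmp", "lstrcmpA"}


def find_endpoint(trace, start_locations):
    """Follow data forward and return the index of the first endpoint.

    trace: list of (kind, name, writes, reads), kind being "op" or "call".
    Returns None when no comparison call consumes the followed data.
    """
    followed = set(start_locations)
    for idx, (kind, name, writes, reads) in enumerate(trace):
        touches = bool(followed & set(reads))
        if kind == "call" and name in STRING_COMPARE_CALLS and touches:
            return idx
        if touches:
            followed |= set(writes)
        else:
            # Overwritten with unrelated data, so stop following it.
            followed -= set(writes)
    return None


trace = [
    ("op", "mov", ["eax"], ["buf"]),
    ("op", "xor", ["eax"], ["eax", "key"]),
    ("op", "mov", ["out"], ["eax"]),
    ("call", "strcmp", [], ["out", "const"]),
    ("op", "mov", ["ebx"], ["out"]),
]
endpoint = find_endpoint(trace, {"buf"})
```

The search stops at the `strcmp` call, which suggests that the decoded data in `out` is complete at that point.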
Selecting and Extracting
Algorithms (cont.)

Closure Analysis
 INSPECTOR can decide to deliberately
exclude certain dependencies.
○ Conditional jump
○ A behavior is only triggered under a certain
condition
Gadget Preparation and Replay

Gadget Format and Relocation
 Dynamically loadable library (DLL)
 All references to absolute code addresses
are rewritten to use relative addressing
 Extract all static memory areas into a data
file
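
To show why relative addressing makes extracted code relocatable, the sketch below encodes an x86 near `CALL` (opcode `E8`), whose 32-bit displacement is measured from the end of the instruction. The slides do not say which instructions INSPECTOR rewrites or how, so the helper names and addresses here are assumptions.

```python
import struct

CALL_REL32_LEN = 5  # one opcode byte plus a 4-byte displacement


def encode_rel32_call(instr_addr, target_addr):
    """Encode a near CALL at instr_addr that transfers control to target_addr."""
    disp = target_addr - (instr_addr + CALL_REL32_LEN)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target out of rel32 range")
    return b"\xE8" + struct.pack("<i", disp)


def decode_rel32_target(instr_addr, encoded):
    """Recover the absolute target of an encoded near CALL."""
    (disp,) = struct.unpack("<i", encoded[1:5])
    return instr_addr + CALL_REL32_LEN + disp


original = encode_rel32_call(0x00401000, 0x00401100)
relocated = encode_rel32_call(0x10001000, 0x10001100)
```

Both encodings are identical byte strings because the distance between caller and callee is unchanged, which is what lets the same code run at whatever base address the DLL is loaded.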
Gadget Preparation and Replay
(cont.)

Gadget Player
 Memory Management
○ Preinitialized memory areas
○ Provide the player with a complete view of the
memory buffers accessible to the gadget.
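
A minimal model of such a memory view is sketched below. The class names, the region layout, and the fault behaviour are assumptions for illustration; the point is only that every access is resolved against buffers the player knows about, and anything else is rejected.

```python
class GadgetMemoryFault(Exception):
    """Raised when an access falls outside every mapped region."""


class GadgetMemory:
    """Hypothetical view of the buffers a gadget may touch."""

    def __init__(self):
        self._regions = []  # list of (base address, bytearray)

    def map_region(self, base, data):
        # Preinitialized area, e.g. static data dumped during extraction.
        self._regions.append((base, bytearray(data)))

    def _locate(self, addr, size):
        for base, buf in self._regions:
            if base <= addr and addr + size <= base + len(buf):
                return buf, addr - base
        raise GadgetMemoryFault(f"access outside gadget memory at {addr:#x}")

    def read(self, addr, size):
        buf, off = self._locate(addr, size)
        return bytes(buf[off:off + size])

    def write(self, addr, data):
        buf, off = self._locate(addr, len(data))
        buf[off:off + len(data)] = data


memory = GadgetMemory()
memory.map_region(0x00403000, b"http://example.invalid/")
prefix = memory.read(0x00403000, 4)
```

Because the player owns every buffer, it can inspect them after the gadget returns, for instance to pick up decrypted data.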
Gadget Preparation and Replay
(cont.)

Execution Containment
 Must isolate the gadget from the player’s
memory
 Some choices
○ Emulation
 Performance consideration
○ Our approach
 Memory management rewrites the memory accesses
 Using a separate thread
 Redirect the API or system call to environment
interface
○ Other approaches
 SFI, Native Client[web]
Gadget Preparation and Replay
(cont.)

Environment Interface
 During gadget start-up, the player registers a
callback function inside the gadget
○ Invoked by the gadget each time it makes a system or
Windows API call
○ The callback can be changed by the analyst
Gadget Preparation and Replay
(cont.)

Callback Handling
 The gadget player can return fake
information to the gadget
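
The interplay between the callback and fake return values can be sketched as follows. `GadgetPlayer`, `toy_gadget`, and the arithmetic inside the gadget are hypothetical stand-ins; only the API name `GetTickCount` refers to a real Windows function, and it is not actually called here.

```python
class GadgetPlayer:
    """Hypothetical player that answers a gadget's API calls."""

    def __init__(self):
        self.handlers = {}
        self.call_log = []

    def register(self, api_name, handler):
        # The analyst can swap handlers to change what the gadget sees.
        self.handlers[api_name] = handler

    def environment_callback(self, api_name, *args):
        self.call_log.append((api_name, args))
        handler = self.handlers.get(api_name)
        if handler is None:
            raise NotImplementedError(f"no handler for {api_name}")
        return handler(*args)


def toy_gadget(env):
    """Stand-in for extracted code: derives a value from the system clock."""
    tick = env("GetTickCount")
    return (tick * 31337) & 0xFFFF


player = GadgetPlayer()
player.register("GetTickCount", lambda: 1000)  # fake, fixed time
first = toy_gadget(player.environment_callback)
second = toy_gadget(player.environment_callback)
```

With a fixed fake clock the gadget becomes deterministic, and replacing the handler lets the analyst ask what the gadget would compute at a different time.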
Gadget Inversion

Main idea
 First, extract the gadget that is responsible
for stealing and encoding the data
 Second, compute the input that leads to the
output observed in the network dump

Use brute force, guided by the data
dependencies
Gadget Inversion
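$o \in O$, where $O$ is the set of output bytes
$i \in I$, where $I$ is the set of input bytes
$o_v$ is the expected value of output byte $o$
Dependent input bytes: $D_o = \{\, i \mid i \in I \wedge o \text{ depends on } i \,\}$
Candidate inputs: $C_o = \{\, v_{i_1} \times \cdots \times v_{i_n} \mid i_1, \ldots, i_n \in D_o \,\}$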
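
The sketch below illustrates dependency-guided brute force on a toy encoder. For each output byte it enumerates values only for the input bytes that output depends on, and keeps the assignments that reproduce the expected value. The toy gadget, the hand-written dependency map, and the incremental joining of partial assignments are assumptions of this example, not the paper's exact procedure.

```python
from itertools import product


def toy_gadget(data, key=0x5A):
    """Toy encoder: each byte is XORed with the previous plaintext byte."""
    out = bytearray()
    prev = key
    for b in data:
        out.append(b ^ prev)
        prev = b
    return bytes(out)


def invert(gadget, n_inputs, deps, expected):
    """Find inputs whose output matches expected.

    deps[o] lists the input positions that output byte o depends on.
    """
    partials = [{}]  # each dict maps input position -> chosen value
    for o, want in enumerate(expected):
        survivors = []
        for partial in partials:
            free = [i for i in deps[o] if i not in partial]
            for values in product(range(256), repeat=len(free)):
                trial = {**partial, **dict(zip(free, values))}
                # Positions not yet chosen are zero; they do not affect o.
                candidate = bytes(trial.get(i, 0) for i in range(n_inputs))
                if gadget(candidate)[o] == want:
                    survivors.append(trial)
        partials = survivors
    return [bytes(p.get(i, 0) for i in range(n_inputs)) for p in partials]


secret = b"key=42"
observed = toy_gadget(secret)
deps = {0: [0], **{k: [k - 1, k] for k in range(1, len(secret))}}
recovered = invert(toy_gadget, len(secret), deps, observed)
```

For this six-byte input the search needs about 256 gadget runs per output byte instead of 256^6 for a blind search, which is why small dependency sets matter.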

Gadget Inversion

Implementation
 Uses taint tracking to obtain the dependency information

Applicability

 Base64:
○ 3 input bytes are encoded into 4 output bytes
○ Each output byte depends on at most 2 input bytes
Gadget Inversion
 XOR
○ Using a constant key → each output byte depends on 1 input byte
○ Using the content as key → each output byte depends on 2 input bytes
 Strong Encryption
○ Ex: RSA
○ Each output byte depends on all input bytes
○ Inversion is infeasible
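
The dependency counts for Base64 and constant-key XOR can be checked empirically. The sketch below uses input perturbation, a deliberately simpler substitute for the taint tracking mentioned above: it changes one input byte at a time and records which output bytes move. The helper names are assumptions.

```python
import base64


def output_dependencies(encode, sample):
    """Map each output position to the input positions that influence it."""
    baseline = encode(sample)
    deps = {o: set() for o in range(len(baseline))}
    for i in range(len(sample)):
        for value in range(256):
            mutated = sample[:i] + bytes([value]) + sample[i + 1:]
            changed = encode(mutated)
            for o, (a, b) in enumerate(zip(baseline, changed)):
                if a != b:
                    deps[o].add(i)
    return deps


def xor_constant(data, key=0x5A):
    """Constant-key XOR, used only for comparison."""
    return bytes(b ^ key for b in data)


base64_deps = output_dependencies(base64.b64encode, b"abc")
xor_deps = output_dependencies(xor_constant, b"abc")
```

For the three-byte sample, `base64_deps` is `{0: {0}, 1: {0, 1}, 2: {1, 2}, 3: {2}}`, so no output byte depends on more than two input bytes, while `xor_deps` maps every output byte to exactly one input byte.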
Gadget Inversion

Possible Extensions
 Extract algebraic formulae
○ Constraint solver
 Input parallelization
○ Check multiple input candidates
Evaluation

Domain Flux: Conficker[web]
Evaluation

Fetching Binary Updates: Pushdo
 Over a period of 16 days
 IP addresses changed for 3 C&C servers

Binary Update Decryption: Pushdo
 The Pushdo client appends a random key to the
URL in order to fetch an encrypted file.
 Invert the program to find the key
Evaluation

Binary Update Generation: Pushdo
 Invert the decryption algorithm
 Redirect the connection to our server
 140 bytes → 44 seconds
Evaluation

Template-based Spamming: Cutwail
 XOR-based encryption
 Stores the template in memory