Polymorphic Malware Detection
Connor Schnaith, Taiyo Sogawa
9 April 2012
Motivation
• “5000 new malware samples per day”
  -- David Perry of Trend Micro
• Large variance between attacks
• Polymorphic attacks
• Perform the same function
• Altered immediate values or addressing
• Added extraneous instructions
• Current detection methods insufficient
• Signature-based matching not accurate
• Behavioral-based detection requires human analysis and
engineering
Malware Families
• Classified into related clusters (families)
o Tracking of development
o Correlating information
o Identifying new variants
• Based on similarity of code
• Example families:
o Koobface
o Bredolab
o PoisonIvy
o Conficker (7 mil. infected)
Source: Carrera, Ero, and Peter Silberman. "State of Malware: Family Ties." Media.blackhat.com. 2010. Web. 7 Apr. 2012.
<https://media.blackhat.com/bh-eu-10/presentations/Carrera_Silberman/BlackHat-EU-2010-Carrera-Silberman-State-of-Malwareslides.pdf>.
~300 samples of malware with 60% similarity threshold
Current Research
• Techniques for identifying malicious behavior
• Mining and clustering
• Building behavior trees
• Industry
• ThreatFire and Sana Security developing behavioral-based
malware detection
Design challenges
• Discerning malicious portions of code
o Dynamic program slicing
o Accounting for control-flow dependencies
• Reliable automation
o Must be reliable without human intervention
o Minimal false positives
Holmes: Main Ideas
• Two major tasks
o Mining significant behaviors from a set of samples
o Synthesizing an optimally discriminative specification from multiple sets of samples
• Key distinction in approach
o "positive" set - malicious samples
o "negative" set - benign samples
o Malware is fully described in the positive set, while not fully described in the negative set
Main Ideas: behavior mining
• Extracts portions of the dependence graphs of programs in the positive set that correspond to behaviors significant to the programs’ intent
• The algorithm determines which behaviors are significant (next slide)
• Can be thought of as contrasting the graphs of positive programs against the graphs of negative programs and extracting the subgraphs that provide the best contrast
Main ideas: behavior mining
• A "behavior" is a data dependence graph G = (V, E, α, β) (a minimal sketch of this structure follows below)
o V is the set of vertices, which correspond to operations (system calls)
o E is the set of edges, which correspond to dependencies between operations
o α is the labeling function that associates nodes with the operations they represent
o β is the labeling function that associates edges with the logic formulas that represent the dependencies
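A minimal Python sketch of one way such a behavior graph could be represented; the class and field names are our own illustration, not the data structures used by Holmes.

from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple

@dataclass
class BehaviorGraph:
    # V: vertex ids; each vertex stands for one operation (system call)
    vertices: Set[int] = field(default_factory=set)
    # E: directed edges (src, dst) for dependencies between operations
    edges: Set[Tuple[int, int]] = field(default_factory=set)
    # alpha: maps each vertex to the operation it represents, e.g. "NtCreateFile"
    alpha: Dict[int, str] = field(default_factory=dict)
    # beta: maps each edge to a predicate over the two operation invocations
    # (the "logic formula" the dependency must satisfy)
    beta: Dict[Tuple[int, int], Callable[[dict, dict], bool]] = field(default_factory=dict)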
Main ideas: behavior mining
• A program P exhibits a behavior G if it can produce an execution trace T with the following properties (a brute-force matching sketch follows below)
o every operation in the behavior corresponds to an operation invocation in T whose arguments satisfy the associated logical constraints
o the logic formula on each edge connecting behavior operations is satisfied by the corresponding pair of operation invocations in T
• Must capture information flow in the dependence graphs
o two key characteristics
 the path taken by the data through the program
 security labels assigned to the data source and the data sink
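A brute-force sketch of the "exhibits" check just described, reusing the BehaviorGraph sketch above; trace entries are assumed to be dicts with an "op" field plus arguments. Holmes matches against dependence graphs extracted from traces, so this is only illustrative.

from itertools import permutations

def exhibits(behavior, trace):
    # trace: list of operation invocations, e.g. {"op": "NtCreateFile", ...}
    verts = list(behavior.vertices)
    for combo in permutations(range(len(trace)), len(verts)):
        assign = dict(zip(verts, combo))
        # node labels: each behavior operation must match an invoked operation
        if any(trace[assign[v]]["op"] != behavior.alpha[v] for v in verts):
            continue
        # edge formulas: every dependency predicate must hold for its pair
        if all(pred(trace[assign[u]], trace[assign[w]])
               for (u, w), pred in behavior.beta.items()):
            return True
    return False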
Security Label - Description
NameOfSelf - The name of the currently executing program
IsRegistryKeyForBootList - A Windows registry key listing software set to start on boot
IsRegistryKeyForWindows - A registry key that contains configuration settings for the operating system
IsSystemDirectory - The Windows system directory
IsRegistryKeyForBugfix - The Windows registry key containing the list of installed bugfixes and patches
IsRegistryKeyForWindowsShell - The Windows registry key controlling the shell
IsDevice - A named kernel device
IsExecutableFile - An executable file
Main ideas: behavior mining
• Information gain is used to determine whether a behavior is significant; a behavior that is not significant is ignored when constructing the dependence graph
• Information gain is defined in terms of Shannon entropy: it measures how much additional information is gained toward deciding whether a graph G belongs to G+ or G- (formulas below)
• Shannon entropy
o H(G+ ∪ G-) corresponds to the uncertainty that a graph G belongs to G+ or G-
o partitioning G+ and G- into smaller subsets decreases that uncertainty
o the subsets are defined by which graphs contain a candidate subgraph, tested via subgraph isomorphism
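For reference, the standard Shannon-entropy form of information gain these slides appeal to, written in the slides' G+/G- notation (the exact weighting Holmes uses may differ):

H(S) = -\sum_{c \in \{+,-\}} p_c \log_2 p_c, \quad S = G^+ \cup G^-

\mathrm{Gain}(S, g) = H(S) - \frac{|S_g|}{|S|} H(S_g) - \frac{|S_{\neg g}|}{|S|} H(S_{\neg g})

where p_c is the fraction of graphs in S labeled c, S_g is the set of graphs in S containing the subgraph g (by subgraph isomorphism), and S_{\neg g} = S \setminus S_g.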
Main ideas: behavior mining
• A significant behavior g is a subgraph of a dependence graph in G+ such that Gain(G+ ∪ G-, g) is maximized
• Information gain is used as the quality measure to guide the behavior-mining process (a small computation sketch follows below)
• Some non-significant actions can still be passed as significant
o these actions may or may not throw off the algorithm that determines whether a program is malicious
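A small Python sketch of the information-gain quality measure; contains_subgraph stands in for the subgraph-isomorphism test and is an assumed callable, not taken from the Holmes implementation.

import math

def entropy(n_pos, n_neg):
    total = n_pos + n_neg
    h = 0.0
    for n in (n_pos, n_neg):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h

def information_gain(pos_graphs, neg_graphs, g, contains_subgraph):
    # split G+ u G- by whether each dependence graph contains the candidate g
    pos_in = sum(contains_subgraph(G, g) for G in pos_graphs)
    neg_in = sum(contains_subgraph(G, g) for G in neg_graphs)
    pos_out, neg_out = len(pos_graphs) - pos_in, len(neg_graphs) - neg_in
    total = len(pos_graphs) + len(neg_graphs)   # assumes at least one sample
    h_split = ((pos_in + neg_in) / total) * entropy(pos_in, neg_in) \
            + ((pos_out + neg_out) / total) * entropy(pos_out, neg_out)
    return entropy(len(pos_graphs), len(neg_graphs)) - h_split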
Main ideas: behavior mining
• Significant behaviors mined from the malware family Ldpinch
o Leaking bugfix information over the network
o Adding a new entry to the system autostart list
o Bypassing the firewall to allow malicious traffic
• Could say that any program exhibiting all three of these behaviors should be flagged as malicious
o This statement is too specific
i. Doesn't account for variations within a family
ii. Smaller subsets of behaviors that include only one of these actions could still be malicious
iii. Need discriminative specifications
Main ideas: discriminative
specifications
• Creates clusters of behaviors that can be classified into a characteristic subset
o A program matches the specification if it matches all of the behaviors in a subset (matching rule sketched below)
o "Discriminative" in that it matches the malicious programs but not the benign ones
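A sketch of that matching rule: treating a specification as a set of behavior subsets, a sample matches if it exhibits every behavior in at least one subset. Function and argument names are illustrative.

def matches_specification(spec, sample_behaviors):
    # spec: iterable of behavior subsets (each a set/frozenset of behavior ids)
    # sample_behaviors: set of behavior ids the sample exhibits
    # match = all behaviors of at least one subset are present in the sample
    return any(subset <= sample_behaviors for subset in spec)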
Main ideas: discriminative
specifications
• Each subset of behaviors induces a cluster of samples
o The mined malicious and benign samples are organized into these clusters
o Goal: find an optimal clustering that organizes the malicious samples into the positive subset and the benign samples into the negative subset
Main ideas: discriminative
specifications
• Three-part algorithm
o Formal concept analysis
o Simulated annealing
o Constructing optimal specifications
• Formal concept analysis
o O is a cluster of samples
o A is the set of mined behaviors shared by the samples in O
o A concept is the pair (A, O)
o Set of concepts: {c1, c2, c3, ..., cN}
o Behavior specification: S(c1, c2, c3, ..., cN)
Main ideas: discriminative
specifications
Formal Concept Analysis (continued)
• Begins by constructing all concepts and computing pairwise intersections of the intent sets of these concepts
• Repeated until a fixpoint is reached and no new concepts can be constructed
• When the algorithm terminates, we are left with an explicit listing of all the sample clusters that can be specified in terms of one or more mined behaviors (a sketch of this construction follows below)
• Goal is to find {c1, c2, c3, ..., cN} such that S(c1, c2, c3, ..., cN) is optimal (based on a threshold)
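A sketch of the fixpoint construction just described, under the assumption that each sample is represented by the set of behaviors it exhibits; variable names are ours.

def build_concepts(sample_behaviors):
    # sample_behaviors: dict mapping sample id -> frozenset of behavior ids
    # start from per-sample intents, then add pairwise intersections of
    # intents until no new intent appears (fixpoint)
    intents = {frozenset(b) for b in sample_behaviors.values()}
    changed = True
    while changed:
        changed = False
        for a in list(intents):
            for b in list(intents):
                inter = a & b
                if inter and inter not in intents:   # skip the empty intent
                    intents.add(inter)
                    changed = True
    # each concept pairs an intent A with its extent O (samples exhibiting all of A)
    return [(a, {s for s, b in sample_behaviors.items() if a <= b})
            for a in intents]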
Main ideas: discriminative
specifications
Simulated annealing
• Probabilistic technique for finding an approximate solution to a global optimization problem (generic skeleton below)
• At each step, a candidate solution i is examined and one of its neighbors j is selected for comparison
• The algorithm moves to j with some probability, so worse neighbors can be accepted and local optima escaped
• A cooling parameter T is reduced throughout the process; when it reaches a minimum, the process stops
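A generic simulated-annealing skeleton of the kind described; the neighbor and scoring functions are placeholders, not Holmes's actual specification-scoring code.

import math, random

def simulated_annealing(initial, neighbors, score, t_start=1.0, t_min=1e-3, cooling=0.95):
    # maximize score(); worse moves are accepted with probability exp(delta / T)
    current, best = initial, initial
    t = t_start
    while t > t_min:
        candidate = random.choice(neighbors(current))
        delta = score(candidate) - score(current)
        if delta > 0 or random.random() < math.exp(delta / t):
            current = candidate
            if score(current) > score(best):
                best = current
        t *= cooling   # cooling schedule: lower the temperature each step
    return best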
Main ideas: discriminative
specifications
Constructing Optimal Specifications
• Inputs: a threshold t, a set containing positive and negative samples, and the set of behaviors mined with the previous process
• The synthesis algorithm is called SpecSynth (outline sketched below)
o Constructs the full set of concepts
o Removes redundant concepts
o Runs simulated annealing until convergence, then returns the best solution
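Tying the earlier sketches together into a SpecSynth-shaped outline; the redundancy filter and scoring function here are illustrative stand-ins, not the published algorithm.

def spec_synth(pos_samples, neg_samples, threshold):
    # pos_samples / neg_samples: dict of sample id -> frozenset of behavior ids
    concepts = build_concepts({**pos_samples, **neg_samples})
    # illustrative redundancy filter: keep intents covering at least one malicious sample
    # (assumes at least one concept survives the filter)
    intents = [a for a, o in concepts if o & set(pos_samples)]

    def score(spec):
        hits = sum(matches_specification(spec, b) for b in pos_samples.values())
        fps = sum(matches_specification(spec, b) for b in neg_samples.values())
        return hits - fps            # stand-in for the paper's optimality measure

    def neighbors(spec):
        # flip one concept in or out of the candidate specification
        return [tuple(i for i in spec if i != c) if c in spec else spec + (c,)
                for c in intents]

    best = simulated_annealing(tuple(intents[:1]), neighbors, score)
    return best if score(best) >= threshold else None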
Holmes: Mining and Clustering
Evaluation and Results:
Holmes
• Used six malware families to develop specifications
• Tested final product against 19 malware families
• Collected 912 malware samples and 49 benign samples
Holmes Continued
• Experiments carried out over varying threshold values (t)
• System accuracy is highly sensitive to the threshold
• Perhaps only effective for a specific subset of malware
Holmes Scalability
• Worst-case complexity is exponential
• Behaviors of repeated executions (Stration and Delf)
took 12-48 hours to analyze
• Scalability for Holmes is a nightmare!
“scary and scaled”
USENIX
• The Advanced Computing Systems Association
• (Unix Users Group)
• 2009 article: automatic behavior matching
o Behavior graphs (slices)
o Tracking data and control dependencies
o Matching functions
o Performance evaluations
Source: Kolbitsch, Clemens. "Effective and Efficient Malware Detection at the End Host." Usenix Security
Symposium (2009). Web. 8 Apr. 2012. <http://www.iseclab.org/papers/usenix_sec09_slicing.pdf>.
USENIX: Producing
Behavior Graphs
• Instruction log
o Traces instruction dependencies
o Slicing alone doesn't reflect stack manipulation
• Memory log (a toy dependency-recovery sketch follows below)
o Records accessed memory locations
Partial behavior graph of Netsky (Kolbitsch et al)
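A toy sketch of how data dependencies between system calls might be recovered from such logs: connect a later call to an earlier one when it reads a memory location the earlier call wrote. This is a simplification of the tainting/slicing machinery in the paper, and the log fields are assumed.

def build_behavior_graph(call_log):
    # call_log: list of dicts like
    #   {"op": "NtCreateFile", "reads": {addr, ...}, "writes": {addr, ...}}
    # returns edges (i, j) meaning call j depends on data produced by call i
    edges = set()
    last_writer = {}     # memory location -> index of the last call that wrote it
    for j, call in enumerate(call_log):
        for loc in call["reads"]:
            if loc in last_writer:
                edges.add((last_writer[loc], j))
        for loc in call["writes"]:
            last_writer[loc] = j
    return edges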
USENIX: Behavior Slices to
Functions
• Use instruction and memory log to determine input
arguments
• Identify repeated instructions as loops
• Include memory read functions
• We can now compare to known malware
Evaluation
Six families used for development (mostly mass-mailing worms)
Expanded test set
Performance Evaluation
• Installed Internet Explorer, Firefox, Thunderbird, Putty, and Notepad on a Windows XP test machine
• Single-core 1.8 GHz Pentium 4 processor with 1 GB RAM
USENIX Limitations
• Evading the system emulator
o The USENIX detector uses the Qemu emulator
o delays
o time-triggered behavior
o command-and-control mechanisms
• Modifying the algorithm's behavior
o A more fundamental change, but cannot be detected using the same signatures
• End-host based system
o Cannot track network activity
Questions/Discussion