General Description

Anti-malware
Security Projects
236349
Contact Persons:
Primary: Tomer Brand, TomerB@Microsoft.com, 052-6119010
Secondary: Daniel Radu, Daniel.Radu@microsoft.com
Marian Radu, Marian.Radu@microsoft.com
General comments:
1) The following projects will be guided by Microsoft security researchers based
in Munich, Germany. Students who choose these projects will gain:

Hands-on experience with anti-malware / anti-cyber challenges
Initial experience of working in a global environment with experts from around
the globe (there will also be a point of contact in Israel).
2) The projects are stack-ranked by difficulty level (from easiest to hardest).
Project 1:
Automatic function identification in binary code
General Description
Finding functions in a binary is a fundamental problem in reverse engineering
because Intel processors make no distinction between code and data.
Current methods rely on pattern matching against compiler-generated code
(usually the function prologue and epilogue) and are therefore not accurate enough:
they fail when the compiler changes, when the code is obfuscated, or when the
prologue and epilogue of a function are non-standard.
This function-finding capability would be integrated into an automatic
process that classifies and identifies malware files and tools used in
cyber-attacks.
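The weakness of prologue matching can be seen in a minimal Python sketch. It assumes 32-bit x86 code and searches only for the byte pattern 55 8B EC (push ebp; mov ebp, esp). The sample bytes, the base address and the function name find_prologues are hypothetical and chosen purely for illustration; real signature-based tools use far richer pattern sets.

```python
# Classic 32-bit x86 prologue emitted by many compilers:
#   55      push ebp
#   8B EC   mov  ebp, esp
PROLOGUE = bytes([0x55, 0x8B, 0xEC])


def find_prologues(code: bytes, base: int = 0) -> list:
    """Return every address at which the prologue byte pattern occurs."""
    hits = []
    pos = code.find(PROLOGUE)
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(PROLOGUE, pos + 1)
    return hits


# Hypothetical code section containing three tiny functions.
code = bytes([
    0x55, 0x8B, 0xEC, 0x5D, 0xC3,  # f1: push ebp; mov ebp, esp; pop ebp; ret
    0x55, 0x89, 0xE5, 0x5D, 0xC3,  # f2: same prologue, other encoding of mov ebp, esp
    0x31, 0xC0, 0xC3,              # f3: xor eax, eax; ret (no frame pointer at all)
])

found = find_prologues(code, base=0x401000)
print([hex(addr) for addr in found])
```

Only the first of the three functions is reported: the second uses the alternative encoding 89 E5 of the same instruction, and the third has no frame setup at all. Both are ordinary compiler output, with no obfuscation involved.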
Goals
Students will be required to implement a tool that performs static analysis of a
Portable Executable (PE) program and finds all internal functions (not including imports).

The input will be stripped binary files (without any debug info), together with a
pointer to the program entry point.
Prerequisites
Compilation course
Computer Security course
Basic Cryptography course
Recommended Reading

IDA F.L.I.R.T. Technology: In-Depth

BYTEWEIGHT: Learning to Recognize Functions in Binary Code
Project 2:
Improving de-compilation using symbolic execution (SMT solvers, abstract interpretation)
General Description
IDA is the de facto standard tool used by researchers throughout the anti-malware industry.
IDA has a plug-in that decompiles x86 code back to C. Because IDA performs all of its
analysis statically, de-compilation fails if the disassembly it encounters contains:

Data embedded between instructions

Indirect branch instructions

Obfuscated code that does not follow the compiler-generated “style”

Etc.
Goals
The goal is to leverage symbolic execution to retrieve additional
information and embed it back into IDA in order to improve de-compilation results.
More specifically, we would like to produce SMT equations for compiler functions.
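As a rough illustration of what an "SMT equation" for a function could look like, the sketch below symbolically executes a toy straight-line instruction list and prints the result as an SMT-LIB bit-vector term. Everything here is an assumption made for the example: the tuple-based instruction format, the register names and the helper symbolic_execute are hypothetical and are not part of IDA or of any solver API. CPU flags, memory and the declarations of the symbolic inputs are omitted.

```python
# Toy instruction format (hypothetical): (mnemonic, destination, source).
# The source is either a register name or an integer immediate.
OPS = {"add": "bvadd", "xor": "bvxor", "shl": "bvshl"}


def const(value: int) -> str:
    """Render a 32-bit constant as an SMT-LIB hexadecimal literal."""
    return "#x%08x" % (value & 0xFFFFFFFF)


def symbolic_execute(instructions, inputs):
    """Map each register to an SMT-LIB term over the symbolic inputs."""
    state = {reg: reg + "_in" for reg in inputs}

    def term(operand):
        return const(operand) if isinstance(operand, int) else state[operand]

    for mnemonic, dst, src in instructions:
        if mnemonic == "mov":
            state[dst] = term(src)
        else:
            state[dst] = "(%s %s %s)" % (OPS[mnemonic], state[dst], term(src))
    return state


# eax = ((eax + ebx) << 1) ^ 0xFF
function_body = [
    ("add", "eax", "ebx"),
    ("shl", "eax", 1),
    ("xor", "eax", 0xFF),
]

final = symbolic_execute(function_body, inputs=["eax", "ebx"])
equation = "(assert (= eax_out %s))" % final["eax"]
print(equation)
# (assert (= eax_out (bvxor (bvshl (bvadd eax_in ebx_in) #x00000001) #x000000ff)))
```

A term of this kind could be handed to an SMT solver to ask questions that static pattern matching cannot answer, such as which input values make a computed branch target fall inside a given address range.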
Prerequisites
Compilation course
Computer Security course
Basic Cryptography course
Recommended Reading

Disassembly Challenges
Project 3:
Using symbolic execution and SMT solvers to reason about a loop’s exit criteria and reduction of complexity
General Description
When the anti-malware engine scans a file that is about to be launched, it emulates
the file (executes it in a local sandbox) in an attempt to identify malicious behavior. During
emulation, loops stand out because they are resource intensive and, in a significant number
of cases, cause the emulation to terminate early.
Malware (ab)uses loops to hide its behavior from the malware scanner’s emulator.
Being able to tell, for a given loop:

Whether it will terminate

What kind of computation it is performing

Whether it is inefficient and can be optimized
would help the emulation process and increase the anti-malware engine’s ability to
detect malware before it is actually executed by the OS.
Goals
Identify loops and their intent in order to replace an expensive loop with a less
expensive, non-iterative piece of code with similar side effects, and then continue emulation.
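A minimal Python sketch of the idea, assuming a hypothetical delay loop that only accumulates its counter in a 32-bit register: once the loop's intent is recognized as a sum, the emulator could compute the closed form instead of iterating. The function names are illustrative, and a real loop would also need its memory and flag side effects modeled.

```python
MASK = 0xFFFFFFFF  # emulate 32-bit register arithmetic


def delay_loop(n: int) -> int:
    """What the emulator would execute: n iterations, accumulating the counter."""
    acc = 0
    for i in range(n):
        acc = (acc + i) & MASK
    return acc


def delay_loop_summary(n: int) -> int:
    """Non-iterative replacement: sum of 0..n-1 in closed form, same wrap-around."""
    return (n * (n - 1) // 2) & MASK


# Spot-check that the summary agrees with the loop, including a case
# (n = 100000) where the 32-bit accumulator wraps around.
for n in (0, 1, 2, 1000, 70000, 100000):
    assert delay_loop(n) == delay_loop_summary(n)
```

The spot-check only samples a few inputs. Establishing that the replacement is equivalent for every n is exactly the kind of question an SMT solver or an inductive argument would have to answer.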
Prerequisites
Compilation course
Computer Security course
Basic Cryptography course
Project 4:
Function matching using code semantics
General Description
Syntax is highly fluid: one can produce many different implementations, in terms of
code structure, that ultimately perform the same task. This gives an attacker a lot of
power to hide their real intent and evade security products.
This project aims to statically identify functions based on semantics rather
than syntax, making identification resilient to obfuscation, compiler changes, etc. This could be
useful in recognizing:

Crypto algorithms

Standard library functions (atoi, printf, etc.)

Malicious functions
In the general case this problem is unsolvable (deciding whether two programs are
semantically equivalent is undecidable), but it can be reduced to restricted scenarios in which it is doable.
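One restricted scenario can be sketched in Python: comparing functions by their observable behavior on a fixed set of probe inputs rather than by their code. This is only an approximation chosen for illustration (matching fingerprints suggest, but do not prove, equivalence), and the probe values and function names are hypothetical. The project itself targets static analysis, for which a sketch like this would serve at most as a baseline.

```python
import hashlib

MASK = 0xFFFFFFFF  # 32-bit semantics
PROBES = [0, 1, 2, 3, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 0xDEADBEEF]


def fingerprint(func) -> str:
    """Hash the outputs of func on a fixed probe set."""
    outputs = [func(x) & MASK for x in PROBES]
    return hashlib.sha256(repr(outputs).encode()).hexdigest()


# Three syntactically different ways to negate a 32-bit value.
def negate_plain(x):
    return -x


def negate_twos_complement(x):
    return (~x) + 1


def negate_obfuscated(x):
    return (x - 1) ^ MASK


# A different function that looks similar to the ones above.
def bitwise_not(x):
    return ~x


same = (fingerprint(negate_plain)
        == fingerprint(negate_twos_complement)
        == fingerprint(negate_obfuscated))
different = fingerprint(bitwise_not) != fingerprint(negate_plain)
print(same, different)
```

The three negation variants share one fingerprint although their code has nothing in common, while bitwise_not is told apart despite its syntactic similarity to negate_twos_complement.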
Goals
The goal is to define a language for the semantic level / intent of a
function, and to build a tool that analyzes programs and describes them using the
defined language.

The input would be a binary and a breakdown of its internal functions in
decompiled or assembly form.

The expected output would be:
o Identification of crypto algorithms (and even distinguishing
between encrypt and decrypt routines)
o Identification of authentication methods
Prerequisites
Compilation course
Computer Security course
Basic Cryptography course
Recommended Reading

Fast location of similar code fragments using semantic “juice”