Integrity & Malware Dan Fleck CS469 Security Engineering Some of the slides are modified with permission from Quan Jia. 1 Coming up: Integrity – Who Cares? Integrity – Who Cares? • IARPA – Funded GMU (and others) to research software validation. • Securely Taking On New Executable Software Of Uncertain Provenance (STONESOUP) is a multi-year Intelligence Advanced Research Projects Activity (IARPA) • Overall Goal: Eliminate effects of software vulnerabilities in code. • GMU’s Part: Determine if a program has been compromised (taken over) by malware to do something it shouldn’t be doing and report to another component • How we do it: Binary Instrumentation! Coming up: Program Compromise: SQL Injection 2 3 Program Compromise: SQL Injection • SQL Injection – by inserting code into the data, a poorly written program can accidentally run unexpected SQL: • name: Dan Fleck’; update personnel set password=‘abc123’; • Code executes like this: • select account number from accounts where name=Dan Fleck’; update personnel set password=‘abc123’; • Return oriented programming (more direct) 3 4 Coming up: Program Compromise: Buffer Overflow Program Compromise: Buffer Overflow Overflowing: • Input data to a program that gets stored in a local variable (variable “A” for example) • No bounds checking… can overwrite the Return Address • strcpy in C does not check bounds! Coming up: Program Exploit: Buffer Overflow 4 5 Program Exploit: Buffer Overflow Exploiting: • Get a program to store information on the heap somehow • Add to the stack: • NOP sled • Shell Code • Return address into the NOP sled Coming up: Protections 5 6 Protections • Windows Data Execution Protection – don’t execute code on the stack! W X protection – no memory address can be both writeable and executable • (Circa 2004 – Win XP sp 2) • gcc –fstack-protector --- add in stack canaries • Can we do it without executing code on the stack? • Return oriented programming (ROP) 6 7 Coming up: Return Oriented Programming Return Oriented Programming • Re-use code that already exists in the system. • Steps: • Take control on the stack (somehow). Put variables you need on the stack for a specific function call, and jump to the function – Solar 1997 • Nergal 2001 – Phrack article – how to chain multiple function calls • Defense: hardware supported non-executable segments introduced. No longer could store function arguments on the stack, must be in registers. • Stealth 2005 – use chunks of functions that start by copying values from the stack into registers. DEPLib created which automates this approach. Loops and conditionals unsupported. • Shacham 2007 - introduces “return oriented programming” which allows loops and conditionals (These chunks are “gadgets”) Coming up: Return Oriented Programming 7 8 Return Oriented Programming • Gadget: • small piece of binary code that does some simple operation and returns: • add r1, r2 • return • Find lots of these by disassembling the binary program to create a gadget library (use automated tools (e.g. ROPGadget)) • Chain them together by modifying the stack 8 9 Coming up: ROP Ref: http://www.drdobbs.com/security/anatomy-of-a-stack-smashing-attack-and-h/240001832 ROP Normal program uses instruction pointer ROP program uses stack pointer 10 9 Coming up: Defenses Defenses • Many groups are trying to stop ROP. • One student we work with did okay • http://www.microsoft.com/security/bluehatprize/ • But what about the next thing? and the next? • Can we check the integrity of the program based on behavior? 10 11 Coming up: Monitoring Behavior Monitoring Behavior • Another approach is to monitor the behaviors of an application and determine if it’s out of “normal” • Challenges: • What is normal? • Speed versus security tradeoff • others… • Our specific part is resource-based attacks: • Does the program use more resources than it should? • Disk, CPU, memory, network, semaphores. Coming up: PIN Tools 11 12 PIN Tools • Intel’s PIN tool is a dynamic binary instrumentation tool • Lets you run a program and instrument it while it is running. • For example, find all the function calls and add in your own code before/after the call. • Lets see an example: http://software.intel.com/sites/landingpage/pintool/docs/61206/ Pin/html/index.html#EXAMPLES 12 13 Coming up: What is Instrumentation? What is Instrumentation? • A technique that inserts extra code into a program to collect runtime information. • Program analysis : performance profiling, error detection, capture & replay • Architectural study : processor and cache simulation, trace collection • Binary translation : Modify program behavior, emulate unsupported instructions 13 14 Coming up: Instrumentation Approaches 13 Instrumentation Approaches • Source Code Instrumentation (SCI) – instrument source programs • Binary Instrumentation (BI) • – instrument binary executable directly 14 15 Coming up: SCI Example (Code Coverage) 14 SCI Example (Code Coverage) Original Program • void foo() { • bool found=false; • for (int i=0; i<100; ++i) { • if (i==50) break; • if (i==20) found=tru e; • } • printf("foo\n"); • } Coming up: Binary Instrumentation (BI) 15 Instrumented Program • • • • • • • • • • char inst[5]; void foo() { bool found=false; inst[0]=1; for (int i=0; i<100; ++i) { if (i==50) { inst[1]=1;break;}if (i==20) { inst[2]=1;found=true;} inst[3]=1; } printf("foo\n"); inst[4]=1; } 15 16 Binary Instrumentation (BI) • Static binary instrumentation – inserts additional code and data before execution and generates a persistent modified executable • Dynamic binary instrumentation – inserts additional code and data during execution without making any permanent modifications to the executable. 16 17 Coming up: BI Example – Instruction Count 16 BI Example – Instruction Count counter++; sub $0xff, %edx counter++; cmp %esi, %edx counter++; jle <L1> counter++; mov $0x1, %edi counter++; add $0x10, %eax 17 18 17 Coming up: BI Example – Instruction Trace BI Example – Instruction Trace Print(ip); sub $0xff, %edx Print(ip); cmp %esi, %edx Print(ip); jle <L1> Print(ip); mov $0x1, %edi Print(ip); add $0x10, %eax 18 19 18 Coming up: Advantages Advantages • Binary instrumentation • • • Language independent Machine-level view Instrument legacy/proprietary software • Dynamic instrumentation • • • • No need to recompile or relink Discover code at runtime Handle dynamically-generated code Attach to running processes 19 20 Coming up: Advantages of Pin Instrumentation 19 Advantages of Pin Instrumentation • Easy-to-use Instrumentation: • Uses dynamic instrumentation - Do not need source code, recompilation, post-linking • Programmable Instrumentation: • Provides rich APIs to write in C/C++ your own instrumentation tools (called Pintools) • Multiplatform: • Supports x86, x86-64, Itanium, Xscale • OS’s: Windows, Linux, OSX, Android • Robust: • Instruments real-life applications: Database, web browsers, … • Instruments multithreaded applications • Supports signals • Efficient: • Applies compiler optimizations on instrumentation code Coming up: Widely Used and Supported 20 20 21 Widely Used and Supported • Large user base in academia and industry • 30,000+ downloads • 700+ citations • Active mailing list (Pinheads) • Actively developed at Intel • Intel products and internal tools depend on it • Nightly testing of 25000 binaries on 15 platforms 21 22 21 Coming up: Using Pin Using Pin • Launch and instrument an application $ pin –t pintool.so –- application Instrumentation engine (provided in the kit) Instrumentation tool (write your own, or use one provided in the kit) Attach to and instrument an application $ pin –t pintool.so –pid 1234 22 23 22 Coming up: Pin and Pintools Pin and Pintools • Pin – the instrumentation engine • Pintool – the instrumentation program • Pin provides the framework and API, Pintools run on Pin to perform meaningful tasks. • Pintools – Written in C/C++ using Pin APIs – Many open source examples provided with the Pin kit – Certain Do’s and Don’ts apply 23 24 Coming up: Pin Instrumentation Capabilities 23 Pin Instrumentation Capabilities • Replace application functions with your own. • Fully examine any application instruction – insert a call to your instrumenting function whenever that instruction executes. • Pass a large set of supported parameters to your instrumenting function. • Register values (including IP), Register values by reference (for modification) • Memory addresses read/written by the instruction • Full register context • Track function calls including syscalls and examine/change arguments. • Track application threads. • Intercept signals. • Instrument a process tree. ……… 24 25 24 Coming up: Pintool 1: Instruction Count Pintool 1: Instruction Count counter++; sub $0xff, %edx counter++; cmp %esi, %edx counter++; jle <L1> counter++; mov $0x1, %edi counter++; add $0x10, %eax 25 26 See icount example 25 Coming up: Pintool 1: Invocation Pintool 1: Invocation • Windows examples: •> pin.exe -t inscount0.dll -- dir.exe •> pin.exe -t inscount0.dll -o incount.out -- gzip.exe FILE • Linux examples: •$ pin -t inscount0.so -- /bin/ls •$ pin -t inscount0.so -o incount.out -- gzip FILE 26 27 Coming up: Pintool 1: Invocation 26 Pintool 1: ManualExamples/inscount0.cpp #include <iostream> #include "pin.h" UINT64 icount = 0; void docount() { icount++; } analysis routine instrumentation routine void Instruction(INS ins, void *v) { INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END); } switch to pin stack void Fini(INT32 code, void *v) save registers { std::cerr << "Count " << icount << endl; } call docount restore registers int main(int argc, char * argv[]) switch to app stack { PIN_Init(argc, argv); INS_AddInstrumentFunction(Instruction, 0); PIN_AddFiniFunction(Fini, 0); 27 28 PIN_StartProgram(); return 0; } 27 Coming up: Pin Instrumentation APIs Pin Instrumentation APIs • Basic APIs are architecture independent: • Provide common functionalities like determining: • Control-flow changes • Memory accesses • Architecture-specific APIs • E.g., Info about segmentation registers on IA32 • Call-based APIs: • Instrumentation routines • Analysis routines 28 29 Coming up: Pintool 2: Instruction Trace Pintool 2: Instruction Trace Print(ip); sub $0xff, %edx Print(ip); cmp %esi, %edx Print(ip); jle <L1> Print(ip); mov $0x1, %edi Print(ip); add $0x10, %eax 29 30 29 Coming up: Pintool 2: Instruction Trace Pintool 2: #include <stdio.h> #include "pin.H" FILE * trace; ManualExamples/itrace.cpp argument to analysis routine void printip(void *ip) { fprintf(trace, "%p\n", ip); } analysis routine void Instruction(INS ins, void *v) { INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printip, IARG_INST_PTR, IARG_END); instrumentation routine } void Fini(INT32 code, void *v) { fclose(trace); } int main(int argc, char * argv[]) { trace = fopen("itrace.out", "w"); PIN_Init(argc, argv); INS_AddInstrumentFunction(Instruction, 0); PIN_AddFiniFunction(Fini, 0); PIN_StartProgram(); return 0; } Coming up: Examples of Arguments to Analysis Routine 30 31 Examples of Arguments to Analysis Routine • IARG_INST_PTR • Instruction pointer (program counter) value • IARG_UINT32 <value> • An integer value • IARG_REG_VALUE <register name> • Value of the register specified • IARG_BRANCH_TARGET_ADDR • Target address of the branch instrumented • IARG_MEMORY_READ_EA • Effective address of a memory read And many more … (refer to the Pin manual for details) 31 32 31 Coming up: Instrumentation Points Instrumentation Points • Instrument points relative to an instruction: • Before (IPOINT_BEFORE) • After: • Fall-through edge (IPOINT_AFTER) • Taken edge (IPOINT_TAKEN) cmp %esi, %edx count() count() jle <L1> mov $0x1, %edi count() <L1>: mov $0x8,%edi 32 33 Coming up: Instrumentation Granularity Instrumentation Granularity Instrumentation can be done at three different granularities: • Instruction • Basic block • A sequence of instructions terminated at a control-flow changing instruction • Single entry, single exit • Trace • A sequence of basic blocks terminated at an unconditional control-flow changing instruction • Single entry, multiple exits sub $0xff, %edx cmp %esi, %edx jle <L1> mov $0x1, %edi add $0x10, %eax jmp <L2> 1 Trace, 2 BBs, 6 insts jumpmix example – which types of jump instructions are called? 33 Coming up: Alternative Hw #2 33 34 Alternative Hw #2 • Instead of the given HW #2 you can write a PIN tool • Task: Write a PIN tool that monitors which files are opened by a program and stores it to a log. • Turn in your program and the output of it running in lieu of Hw#2. • Note: This is much harder than the original Hw#2, but more fun . 34 35 Coming up: Lessons 34 Lessons • Integrity of programs can be violated in many ways • Many defenses exist • Monitoring programs for normal can be done through binary instrumentation • PIN is one example of a powerful binary instrumentation tool 35 Coming up: References References • http://blog.zynamics.com/2010/03/12/a-gentle-introductionto-return-oriented-programming/ • http://cseweb.ucsd.edu/~hovav/dist/geometry.pdf • Stealth: http://www.suse.de/~krahmer/no-nx.pdf • Nergel, http://www.phrack.com/issues.html?issue=58&id=4#article • http://cseweb.ucsd.edu/~hovav/dist/rop.pdf 36 End of presentation