Understand stack buffer overflow attack and defense
Controls against program threats

Computer Emergency Response Team (CERT)
  - At Carnegie Mellon University, http://www.cert.org
  - Formed in 1988, after the Morris Worm
  - A long report on the Morris Worm by Eugene Spafford of Purdue CERIAS:
    http://portal.acm.org/citation.cfm?id=66093.66095

    Year    Total vulnerabilities
    2007    7,236
    2006    8,064
    2005    5,990
    2004    3,780
    2003    3,784
    2002    4,129
    2001    2,437
    2000    1,090
    1999      417
    1998      262
    1997      311
    1996      345
    1995      171

Program control: specification & verification
  - Typical approach: specify a should-do list
  - Security needs: a shouldn't-do list
  - No silver bullet to achieve security effortlessly
      1. Exhaustive testing of all program states is infeasible
      2. Software engineering techniques evolve rapidly

Some definitions and types of flaws
  - Program security flaw: unexpected behavior
  - Error: human mistake
  - Fault: an incorrect step, command, process, or data definition in a program
  - Failure: departure from the system's required behavior

A taxonomy of program flaws, Landwehr [LAN93]
  - Intentional flaws
      - Malicious
      - Non-malicious
  - Inadvertent flaws
      - Validation, domain, serialization/aliasing
      - Identification/authentication
      - Boundary condition violation
      - Logic errors

Buffer overflows
  - A buffer (or array or string) is a space in which data can be held
  - In memory, finite capacity
  - Checking bounds takes time/space
  - Java checks array bounds: no buffer overflow

    char sample[10];
    int i;
    for (i = 0; i <= 9; i++)
        sample[i] = 'A';
    sample[10] = 'B';    /* one element past the end of the array: overflow */

  - An overflow into system space can replace code that runs with kernel privilege

[Figure 3-1: Places Where a Buffer Can Overflow]

Process memory region
  - Text region: code, read-only data
  - Data region: initialized and uninitialized static variables
  - If the data region expands or stack space runs out, new memory is added between the data and stack segments
  - Stack: an abstract data type
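
Returning to the sample array on the buffer overflow slide above: the bound check that Java performs on every array access can be written by hand in C. The following is a minimal sketch only. The names checked_store and fill_sample, and the 0 / -1 return convention, are hypothetical and introduced here for illustration; they are not part of the C standard library or of the original slides.

```c
#include <stddef.h>

/* Hypothetical helper (not a standard library function): store c at
 * buf[idx] only if idx lies inside the buffer's capacity.
 * Returns 0 on success and -1 if the write would overflow. */
int checked_store(char *buf, size_t capacity, size_t idx, char c)
{
    if (buf == NULL || idx >= capacity) {
        return -1;              /* reject the out-of-range write */
    }
    buf[idx] = c;
    return 0;
}

/* Mirrors the slide's loop plus the out-of-range write sample[10] = 'B',
 * but routes every store through checked_store.
 * Returns the number of stores that were rejected. */
int fill_sample(void)
{
    char sample[10];
    int rejected = 0;
    size_t i;

    for (i = 0; i <= 9; i++) {
        if (checked_store(sample, sizeof sample, i, 'A') != 0) {
            rejected++;
        }
    }
    if (checked_store(sample, sizeof sample, 10, 'B') != 0) {
        rejected++;             /* index 10 is one past the end */
    }
    return rejected;
}
```

Called from a program's main, fill_sample() rejects exactly one store, the write at index 10. The extra comparison on every access is the time cost that the slide mentions.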

Stack
  - Last in, first out (LIFO)
  - PUSH: add an element at the top of the stack
  - POP: remove the element at the top of the stack
  - The stack is used for procedure calls: jump and return
  - Use the stack to:
      - dynamically allocate local variables
      - pass parameters to functions
      - return values from functions
  - Aleph One. Smashing the stack for fun and profit. 1996.
    http://www.phrack.com/issues.html?issue=49&id=14

How a process uses its stack: call stack, stack pointer
  - A stack buffer overflow occurs when information is written into the memory allocated to a variable on the stack, but the size of this information exceeds what was allocated at compile time
  - Heap buffer overflow: used in many drive-by downloads
  - Run-time stack, call stack, control stack, execution stack (all the same): what a process uses to keep track of the sequence of subroutines called and the local variables encountered
  - Stack frame: the consecutive stack space for each calling function that has not yet finished execution
  - Top stack frame: belongs to the function that just got called and is being executed
  - Stack pointer: memory location of the top of the stack
      - Stored in a register
  (Avi Kak, Computer Security lecture notes)

Another example

    // example1.c
    void function(int a, int b, int c) {
        char buffer1[5];
        char buffer2[10];
    }

    int main(void) {
        function(1, 2, 3);
        return 0;
    }

  sfp: saved frame pointer

    Bottom of memory                                          Top of memory
               buffer2      buffer1    sfp    ret    a      b      c
    <-----   [          ] [         ] [    ] [    ] [    ] [    ] [    ]
    Top of stack                                            Bottom of stack

An example of stack and stack frame

    // ex.c
    #include <stdio.h>

    int foo(int i);
    int bar(int j);

    int main() {
        int x = foo(10);
        printf("the value of x = %d\n", x);
        return 0;
    }

    int foo(int i) {
        int ii = i + i;
        int iii = bar(ii);
        int iiii = iii;
        return iiii;
    }

    int bar(int j) {
        int jj = j + j;
        return jj;
    }

    stack_ptr--> jj
                 return-address to caller     stack frame for bar
                 j
                 ------------------------
                 iii
                 ii
                 return-address to caller     stack frame for foo
                 i
                 ------------------------
                 x
                 argc                         stack frame for main
                 argv

Partial assembly code of ex.c in ex.S

        .globl bar
        .type  bar, @function
    bar:
        pushl  %ebp
        movl   %esp, %ebp
        movl   8(%ebp), %eax
        addl   %eax, %eax
        popl   %ebp
        ret
        .size  bar, .-bar
        .globl foo
        .type  foo, @function
    foo:
        pushl  %ebp
        movl   %esp, %ebp
        subl   $4, %esp
        movl   8(%ebp), %eax
        addl   %eax, %eax
        movl   %eax, (%esp)
        call   bar
        leave
        ret
        .size  foo, .-foo

  Compile with: gcc -S -O ex.c -o ex.S

Inspecting the call stack: assembly code of ex.c
  Intel x86 32-bit architecture

    foo:
        pushl  %ebp             push the value stored in register ebp onto the stack
        movl   %esp, %ebp       move the value in register esp to ebp
        subl   $4, %esp         subtract 4 from the value in esp (stack grows)
        movl   8(%ebp), %eax    move i into an accumulator
        addl   %eax, %eax       i + i
        movl   %eax, (%esp)     move the accumulator content into the stack location
                                pointed to by esp, so that local variable ii becomes
                                the argument to bar
        call   bar              call bar
        leave
        ret

  - esp: stack pointer, the top of the stack
  - ebp: base pointer (aka frame pointer), used to reach the parameters and local variables of the current stack frame
  - eip: instruction pointer, the next CPU instruction to be executed
  (Avi Kak, Computer Security lecture notes)

Another buffer overflow example

    #include <string.h>

    void function(char *str) {
        char buffer[16];
        strcpy(buffer, str);    /* no length check: copies 256 bytes into 16 */
    }

    int main(void) {
        char large_string[256];
        int i;
        for (i = 0; i < 255; i++)
            large_string[i] = 'A';
        large_string[255] = '\0';    /* terminate the string so strcpy stops */
        function(large_string);
        return 0;
    }

               buffer              sfp    ret    *str
    <-----   [                  ] [    ] [    ] [    ]
    Top of stack                            Bottom of stack
               (sfp, ret and *str are overwritten by 'A' (0x414141…))

  - The return address is overwritten and becomes 0x41414141
  - You get a segmentation fault
  - Worse, an attacker can change the flow of the program

Observing buffer overflow in action: try this at home

    // buffover2.c
    #include <stdio.h>

    int foo(void);

    int main() {
        while (1) foo();
    }

    int foo(void) {
        unsigned int yy = 0;
        char buffer[5];
        char ch;
        int i = 0;
        printf("Say something: ");
        /* Deliberately no bound check: input longer than 4 characters overflows buffer */
        while ((ch = getchar()) != '\n') buffer[i++] = ch;
        buffer[i] = '\0';
        printf("You said: %s\n", buffer);
        printf("The variable yy: %u\n", yy);
        return 0;
    }

  Compile with: gcc -fno-stack-protector buffover2.c -o buffover2
  (Avi Kak, Computer Security lecture notes)

Heap overflow
  - The heap is located above program code/global data
  - Used for dynamic data structures, e.g., linked lists
  - An overflow can affect the memory following the buffer
  - Unlike the stack, there is no return address to overwrite
  - So the attack aims to overwrite a pointer to a function
      - E.g., a list of records containing data and their processing function

Another type of buffer overflow (usually in web applications)
  - Overflow when passing parameters to a routine
    http://www.somesite.com/userinout.asp?param1=(808)555-1212&param2=2009Jan17
  - The web developer may allocate just 20 bytes for param1
  - How does the program handle a long phone number, e.g., 1000 digits?

Defense against buffer overflow
  - Boundary checking, sanity checking by developers
  - Canaries: a known value on the stack just before the return address (the canary word)
      - Check the canary when the function is about to return
      - StackGuard by Crispin Cowan (a gcc extension)
  - Non-executable stacks
  - Address randomization
  - Compiler boundary checking
      - In Java
      - The JVM may still be susceptible to buffer overflow attacks
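
The canary defense listed above can be imitated in plain C to show the idea. This is a hedged sketch, not how StackGuard is implemented: a real canary is placed and checked by compiler-generated code, whereas here the "stack frame" is simulated inside one byte array so that the overflow stays within memory the program owns. The names BUF_SIZE, CANARY_VALUE, FRAME_SIZE and copy_and_check_canary, the canary constant, and the fake return address are all assumptions made for illustration, and the layout comments assume a 4-byte unsigned int.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE     8
#define CANARY_VALUE 0xDEADC0DEu   /* arbitrary known value chosen for this sketch */

/* Simulated frame, laid out in one byte array:
 *   [ buffer: BUF_SIZE bytes ][ canary word ][ fake return slot ] */
#define FRAME_SIZE (BUF_SIZE + 2 * sizeof(unsigned int))

/* Copies input into the simulated buffer with no check against BUF_SIZE
 * (the bug), then verifies the canary the way a function epilogue would.
 * Returns 1 if the canary is intact, 0 if an overflow was detected. */
int copy_and_check_canary(const char *input)
{
    unsigned char frame[FRAME_SIZE];
    unsigned int canary = CANARY_VALUE;
    unsigned int ret_slot = 0x08048000u;   /* made-up "return address" */
    unsigned int seen;
    size_t len = strlen(input) + 1;        /* include the terminating NUL */

    /* Function entry: place the canary just before the return slot. */
    memcpy(frame + BUF_SIZE, &canary, sizeof canary);
    memcpy(frame + BUF_SIZE + sizeof canary, &ret_slot, sizeof ret_slot);

    /* Clamp to the whole frame so the demonstration itself stays well defined. */
    if (len > FRAME_SIZE) {
        len = FRAME_SIZE;
    }
    memcpy(frame, input, len);             /* may run past the buffer into the canary */

    /* Function exit: compare the stored canary with the known value. */
    memcpy(&seen, frame + BUF_SIZE, sizeof seen);
    return seen == CANARY_VALUE;
}
```

An input of up to 7 characters fits in the 8-byte buffer together with its terminator and leaves the canary intact. Anything longer overwrites at least the first canary byte, so the check fails before the (simulated) return address would be used.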