The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)
In Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS '07)
Shacham (UC San Diego)
Presented by: WANG Zhi

Code and Behavior
• Does “good” code always exhibit “good” behavior?

Malicious code
• Insert and execute malicious code

Preventing malicious code
• Check the integrity of good code in the system.
• Isolate “bad” code that has been introduced into the system (e.g., W-xor-X, antivirus scanners).

W-xor-X model
• Memory is marked either writable or executable, but never both.
• Widely deployed: Linux (via PaX patches); OpenBSD; Windows (since XP SP2); OS X (since 10.5); …
• Hardware support: Intel “XD” bit, AMD “NX” bit
• So it is very difficult to execute injected malicious code.

Ordinary programming
• The instruction pointer (%eip) determines which instruction to fetch and execute.
• Once the processor has executed the instruction, it automatically advances %eip to the next instruction.
• Control flow works by changing the value of %eip.

Return-oriented programming
• The “ret” instruction transfers program control to the return address located on top of the stack.
• Return-oriented programming introduces no new instructions; it uses a carefully crafted stack to link existing code snippets together through their ret instructions.
• Return-oriented programming builds a new execution path out of existing “good” code.

Return-oriented programming
• The stack pointer (%esp) determines which instruction sequence to fetch and execute.
• The processor doesn’t automatically increment %esp, but the “ret” at the end of each instruction sequence does.

Useful Instruction Sequences
• Useful instruction sequence: to be potentially useful, an instruction sequence need only end in a return instruction.
• The libc library contains enough useful instruction sequences.
• An attacker who controls the stack can make the victim program undertake arbitrary computation.

Side-effect of the x86 variable length instruction set

Finding Useful sequences
• Galileo Algorithm:
• First, identify the locations of ret (c3 byte).
• Second, scan backwards from each such location.
• Does the single byte immediately preceding represent a valid one-byte instruction?
• Do the two bytes immediately preceding represent a valid two-byte instruction, or two valid one-byte instructions?
• And so on, up to the maximum length.

Instruction Sequence Challenges
• Code sequences are difficult to use:
• short; each performs a small unit of work
• no standard function prologue/epilogue
• haphazard interface, not an ABI

Gadget

Gadget design
• Gadgets are built from found code sequences.
• Gadgets are an intermediate organizational unit and perform well-defined operations, such as:
  • load-store operations
  • arithmetic & logic operations
  • control flow
  • invoking system calls
• The set of gadgets is Turing complete, so return-oriented programming can construct arbitrary computations.

Loading a Constant to Register

Loading From Memory

Storing To Memory

Arithmetic And Logic
• Simple add into %eax

Return-Oriented Shellcode
• An application: return-oriented shellcode.
• The shellcode invokes the execve system call to run a shell. This requires:
1. Setting the system call index in %eax;
2. Setting the path of the program to run in %ebx;
3. Setting the argument vector argv in %ecx;
4. Setting the environment vector envp in %edx.

Return-oriented Shellcode
• In this case, the binary encoding of the shellcode given above is:
• 3e 78 03 03 07 7f 02 03 0b 0b 0b 0b 18 ff ff 4f
• 30 7f 02 03 4f 37 05 03 bd ad 06 03 34 ff ff 4f
• 07 7f 02 03 2c ff ff 4f 30 ff ff 4f 55 d7 08 03
• 34 ff ff 4f ad fb ca de 2f 62 69 6e 2f 73 68 00

Extension
• Return-oriented programming has been extended to the SPARC, Atmel AVR, PowerPC, Z80, and ARM processors.
• It can also use return-like instruction sequences, such as a pop followed by an indirect jump: “pop x; jmp *x”.

References
• [CCS08] Buchanan et al.: When good instructions go bad: Generalizing return-oriented programming to RISC. CCS 2008
• [USENIX Security09] Hund et al.: Return-oriented rootkits: Bypassing kernel code integrity protection mechanisms

Conclusions
• Code injection is not necessary for arbitrary exploitation.
• Defenses that distinguish “good code” from “bad code” are useless against such attacks.
• Return-oriented programming is likely possible on every architecture, not just x86.

Thanks.
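
Appendix: a sketch of the backward scan

The backward scan described under “Finding Useful sequences” can be sketched roughly as follows. This is a simplified illustration, not the paper's Galileo implementation. The names `find_sequences` and `toy_is_valid` are hypothetical. The decoder is a toy that accepts only a handful of one-byte 32-bit x86 opcodes; a real scan would plug in a full x86 disassembler. The limits `max_insn_len` and `max_insns` are arbitrary values chosen for the example.

```python
RET = 0xC3  # one-byte near-return opcode on x86


def toy_is_valid(chunk: bytes) -> bool:
    """Toy stand-in for a real x86 decoder (an assumption of this sketch).

    Accepts only a few one-byte 32-bit opcodes: nop (0x90) and
    inc/dec/push/pop on a 32-bit register (0x40-0x5f).
    """
    return len(chunk) == 1 and (chunk[0] == 0x90 or 0x40 <= chunk[0] <= 0x5F)


def find_sequences(code: bytes, is_valid=toy_is_valid,
                   max_insn_len: int = 4, max_insns: int = 3):
    """Return sorted (offset, bytes) pairs for sequences that end in ret.

    Step 1: locate every c3 byte.
    Step 2: from each one, walk backwards, trying to prepend one valid
    instruction of 1..max_insn_len bytes at a time.
    """
    found = set()

    def extend(start: int, end: int, depth: int) -> None:
        # code[start:end] is a run of valid instructions ending in ret.
        found.add((start, code[start:end]))
        if depth == max_insns:
            return
        for n in range(1, max_insn_len + 1):
            if start - n < 0:
                break  # ran off the beginning of the buffer
            if is_valid(code[start - n:start]):
                extend(start - n, end, depth + 1)

    for pos, byte in enumerate(code):
        if byte == RET:
            extend(pos, pos + 1, 0)
    return sorted(found)


if __name__ == "__main__":
    # pop %eax; pop %ebx; ret; nop; ret  (preceded by one junk byte)
    sample = bytes([0x00, 0x58, 0x5B, 0xC3, 0x90, 0xC3])
    for offset, seq in find_sequences(sample):
        print(offset, seq.hex())
```

Because the scan starts at every c3 byte and not at instruction boundaries, it can also surface sequences that the compiler never emitted, which is the side effect of the variable-length instruction set noted earlier.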