Memory Attacks and Protection through Software Diversity
Fall 2014
Presenter: Kun Sun, Ph.D.

Outline
  - What is an SoK paper?
  - Two SoK papers
    1. Eternal War in Memory. László Szekeres*, Mathias Payer, Tao Wei, Dawn Song. IEEE Symposium on Security and Privacy, 2013.
    2. Automated Software Diversity. Per Larsen, Andrei Homescu*, Stefan Brunthaler, Michael Franz. IEEE Symposium on Security and Privacy, 2014.
  *Thanks to László and Andrei for sharing their slides.

What is an SoK paper?
  - Systematization of Knowledge (SoK) papers have been part of the IEEE S&P conference since 2010.
  - These papers provide high value to our community but might otherwise not be accepted because they lack novel research contributions:
    - survey papers that provide useful perspectives on major research areas;
    - papers that support or challenge long-held beliefs with compelling evidence;
    - papers that provide an extensive and realistic evaluation of competing approaches to solving specific problems.

SoK paper
  - The goal is to encourage work that evaluates, systematizes, and contextualizes existing knowledge:
    - identify areas that have enjoyed much research attention,
    - point out open areas with unsolved challenges, and
    - present a prioritization that can guide researchers to make progress on solving important challenges.

Part I: Eternal War in Memory
  - C/C++ is unsafe
  - Everybody runs C/C++ code
  - That code surely has exploitable vulnerabilities

Overview
  - What are the attacks?
  - What are the deployed protections?
  - Which protections are not deployed?
  - Why aren't they deployed?

Attack Model
  - First step: Memory Corruption Attack
  - Second step:
    - Code Corruption Attack
    - Control Flow Hijacking
    - Data-only Attack
    - Information leak

Memory Corruption Attack
  - Memory errors (bugs)
    - C and C++ are inherently memory unsafe.
    - Memory errors allow the attacker to read and modify the program's internal state in unintended ways.
    - Writing an array beyond its bounds, dereferencing a null pointer, or reading an uninitialized variable results in undefined behavior.
  - Finding and fixing all the programming bugs is infeasible.

Memory Corruption
  - More aggressive behavior: the attacker changes internal program state
  - Most often: memory contents
    - Values of variables on the stack/heap
    - Code pointers
  - Attack vectors
    - Buffer overflows
    - Use-after-free/double-free
    - Uninitialized variables
    - Format strings

Classic Stack Smashing Attack
  [Figure: attack-model flowchart. Nodes, top to bottom: make pointer out-of-bounds / make pointer dangling; use pointer to write / use pointer to read; modify a code pointer ... to target code address; use pointer by indirect call/jmp / use pointer by ret instruction; execute gadgets or functions / execute injected code. Outcome: control-flow hijack.]

Use-after-free Exploits
  [Figure: same flowchart.]

Heap overflow
Corrupting Newer Pointers
  [Figure: same flowchart with an added "modify a data pointer" node.]

Modifying the code itself
  [Figure: same flowchart with an added path "modify code ... to attacker specified code". Outcome of that path: code corruption.]

Code Integrity
  [Figure: same flowchart; "Read-only code pages" blocks the "modify code ... to attacker specified code" path.]
  - Challenge: in just-in-time compilation, there is a small time window during which the generated code sits on a writable page.

Control Flow Hijacking
  - Attacker takes control of program execution
  - Program executes code under attacker control
  - Other name: arbitrary code execution

Non-executable data
  [Figure: attack-model flowchart (memory corruption steps leading to code corruption or control-flow hijack) with "Read-only code pages" in place; annotation: e.g., Stack. "Non-exec. data pages" blocks "execute injected code".]
  - Inserted shellcode cannot be executed.

Return-oriented programming
  [Figure: attack-model flowchart with "Read-only code pages" and "Non-exec. data pages" in place; the "execute gadgets or functions" step remains available.]
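The difference between the two slides above can be sketched with a toy model in Python. This is not exploit code: it assumes a hypothetical machine where "code pages" are a small table of existing routines and the "stack" is a non-executable address range. The class `ToyMachine`, the addresses, and the three gadgets are all invented for illustration.

```python
"""Toy model of W^X and return-oriented programming (illustrative only)."""


class ExecutionFault(Exception):
    """Raised when control reaches an address that is not executable."""


class ToyMachine:
    def __init__(self):
        self.regs = {"a": 0, "b": 0}
        # Hypothetical executable code: address -> routine already in the program.
        self.code = {
            0x1000: self._load_a_7,
            0x1010: self._load_b_5,
            0x1020: self._add_b_to_a,
        }
        # Writable but non-executable stack region (hypothetical range).
        self.stack_range = range(0x7000, 0x8000)

    def _load_a_7(self):
        self.regs["a"] = 7

    def _load_b_5(self):
        self.regs["b"] = 5

    def _add_b_to_a(self):
        self.regs["a"] += self.regs["b"]

    def run(self, return_addresses):
        """Simulate successive `ret` instructions popping attacker-chosen addresses."""
        for addr in return_addresses:
            if addr in self.stack_range:
                # W^X: data pages are writable but never executable.
                raise ExecutionFault(f"{addr:#x} is on a non-executable page")
            if addr not in self.code:
                raise ExecutionFault(f"{addr:#x} is not mapped as code")
            self.code[addr]()  # run the gadget, then "return" to the next address
        return self.regs["a"]


def try_injected_shellcode():
    """Jump to bytes the attacker wrote onto the stack."""
    machine = ToyMachine()
    try:
        machine.run([0x7100])
    except ExecutionFault:
        return "blocked"
    return "executed"


def try_rop_chain():
    """Chain only addresses of code that already exists."""
    machine = ToyMachine()
    return machine.run([0x1000, 0x1010, 0x1020])
```

In this sketch `try_injected_shellcode()` is stopped by the page check, while `try_rop_chain()` still computes a result from existing code, which is the reason non-executable data alone does not prevent code reuse.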
Return integrity
  [Figure: attack-model flowchart with "Read-only code pages" and "Non-exec. data pages" in place; "Stack canaries" guards "use pointer by ret instruction".]

Hijacking Indirect Calls and Jumps
  [Figure: same flowchart; the "use pointer by indirect call/jmp" step is not covered by stack canaries.]

Address Space Randomization
  [Figure: same flowchart; "ASLR" guards the "... to target code address" step.]
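As a rough illustration of what address space randomization buys, the sketch below estimates how many blind guesses an attacker needs to hit a randomized address. It assumes the address is drawn uniformly from 2^bits candidates and that a wrong guess simply fails; the function names and the bit counts used in any call are placeholders, not measurements of a real system.

```python
"""Expected brute-force effort against a randomized address (toy model)."""
import random


def success_probability(entropy_bits):
    """Chance that a single blind guess hits the randomized address."""
    return 1 / (2 ** entropy_bits)


def expected_guesses(entropy_bits, rerandomized_each_try):
    """Expected number of guesses until the first hit.

    If the layout is re-randomized after every failed try, guesses are
    independent (geometric distribution): 2**bits on average.
    If the layout stays fixed, the attacker can enumerate candidates
    without repetition: (2**bits + 1) / 2 on average.
    """
    n = 2 ** entropy_bits
    return float(n) if rerandomized_each_try else (n + 1) / 2


def simulate_fixed_layout(entropy_bits, seed):
    """Count the guesses needed when scanning all candidates in order."""
    rng = random.Random(seed)
    secret = rng.randrange(2 ** entropy_bits)
    for attempt, guess in enumerate(range(2 ** entropy_bits), start=1):
        if guess == secret:
            return attempt
    raise AssertionError("unreachable: the secret is always in range")
```

Each extra bit of entropy doubles the expected effort in this model, so a layout with few random bits offers little protection against repeated guessing.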
Data-only Attack
  - The target of the corruption can be any security-critical data in memory:
    - security-critical variables
    - configuration data
    - the representation of the user identity or keys

Program Data Tampering
  - Interfere with program execution
  - Examples
    - Bypassing DRM/copy protection
    - Videogame cheating: FPE, GW (DoS)
    - Online game?

Data-only attack
  [Figure: attack-model flowchart extended with a data path: modify data ... to attacker specified value; use corrupted data variable. Outcome: data-only attack. Defenses shown: read-only code pages, ASLR, stack canaries, non-exec. data pages.]

Information Leaks
  - Steal program information/state in memory, process metadata, registers, files, etc.
    1. Information that is valuable by itself, e.g., credit card numbers, passwords, encryption keys.
    2. Information that facilitates further attacks, e.g., memory addresses to bypass ASLR.
  - Side-channel attacks

Information leakage
  [Figure: attack-model flowchart extended with a leak path: output data; interpret the leaked value. Outcome: information leak.]
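The point that a leaked memory address can be used to bypass ASLR can be sketched as follows. The model assumes a loader that randomizes only the base address of a library while the offsets inside it stay fixed; the offsets in `OFFSETS`, the address range, and all function names are hypothetical.

```python
"""Toy model: one leaked pointer defeats base-address randomization."""
import random

PAGE_SIZE = 0x1000

# Hypothetical offsets inside a library; an attacker reads them from a copy of the binary.
OFFSETS = {"printf": 0x4F0, "system": 0x9A0}


def load_library(rng):
    """Pick a random, page-aligned base, as a randomizing loader might."""
    base = rng.randrange(0x10000, 0x80000) * PAGE_SIZE
    return {name: base + off for name, off in OFFSETS.items()}


def attacker_resolve(leaked_printf_address, wanted):
    """Recover the base from one leaked address, then derive another address."""
    base = leaked_printf_address - OFFSETS["printf"]
    return base + OFFSETS[wanted]


def demo(seed):
    """Return True if the attacker's computed address matches the real one."""
    rng = random.Random(seed)
    addresses = load_library(rng)
    leak = addresses["printf"]  # e.g., disclosed through a memory-read bug
    guess = attacker_resolve(leak, "system")
    return guess == addresses["system"]
```

Under these assumptions the attacker never has to guess: a single disclosed pointer reveals the whole layout of that module.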
Bypassing ASLR with user scripting
  [Figure: attack-model flowchart with all four outcomes (code corruption, control-flow hijack, data-only attack, information leak); the ASLR guard on "... to target code address" is no longer shown.]

Protection Techniques
  - Deterministic protection
    - A low-level reference monitor (RM) enforces a deterministic safety policy.
    - By hardware, like W⊕X
    - By software, adding the RM into the code
  - Probabilistic protection
    - Built on randomization or encryption
    - Instruction Set Randomization
    - Address Space Randomization
    - Data Space Randomization

Deployed Protections (hijack protection)

  Policy               Technique      Weakness          Perf.   Comp.
  W⊕X                  Page flags     JIT               1x      Good
  Return integrity     Stack cookies  Direct overwrite  1x      Good
  Address space rand.  ASLR           Info-leak         1.1x    Good
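The "direct overwrite" weakness listed for stack cookies can be illustrated with a toy stack frame. The sketch assumes a frame laid out as buffer cells, then the cookie, then the saved return address; the layout, the names, and the 32-bit cookie size are illustrative assumptions rather than any compiler's actual scheme.

```python
"""Toy model of a stack cookie (canary) and its 'direct overwrite' weakness."""
import secrets

BUF_LEN = 4


class CanaryError(Exception):
    """Raised when the cookie no longer matches at function return."""


def new_frame(return_address):
    """Build a frame: buffer cells, then cookie, then saved return address."""
    cookie = secrets.randbits(32)
    frame = [0] * BUF_LEN + [cookie, return_address]
    return frame, cookie


def function_return(frame, cookie):
    """Epilogue check: verify the cookie before using the return address."""
    if frame[BUF_LEN] != cookie:
        raise CanaryError("stack smashing detected")
    return frame[BUF_LEN + 1]


def linear_overflow(frame, data):
    """Unchecked copy that runs past the buffer, like an unbounded string copy."""
    for i, value in enumerate(data):
        frame[i] = value


def direct_overwrite(frame, index, value):
    """Write through a corrupted pointer or index straight to one slot."""
    frame[index] = value
```

A contiguous overflow must cross the cookie to reach the return address, so the epilogue check catches it. A write that lands directly on the return-address slot leaves the cookie intact and passes the check.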
Control-flow Integrity
  [Figure: attack-model flowchart with all four outcomes (code corruption, control-flow hijack, data-only attack, information leak); "CFI" guards "use pointer by indirect call/jmp" and "use pointer by ret instruction".]

CFI
  Each indirect-jump target carries an ID, and each indirect jump is preceded by a check of that ID:

    p = &f
    check ID
    jmp p
    q = &g
    check ID
    jmp q

    ID: f() { … }
    ID: g() { … }

Over-approximation problem
  When a pointer may refer to more than one target, it is unclear which ID to check:

    p = &f
    check ID
    jmp p
    if (…) q = &f else q = &g
    check ?
    jmp q

    ID: f() { … }
        g() { … }

  - Because q may point to either f or g, both targets must carry the same ID, so the check before "jmp p" also accepts g.
  - The same effect applies to library functions that share an ID:

    ID: printf() { … }
    ID: system() { … }

Modularity problem

    ID: printf() { … }
    ID: system() { … }

CFI (hijack protection)

  Policy               Technique      Weakness          Perf.   Comp.
  W⊕X                  Page flags     JIT               1x      Good
  Return integrity     Stack cookies  Direct overwrite  1x      Good
  Address space rand.  ASLR           Info-leak         1.1x    Good
  Control-flow integ.  CFI            Over-approx.      1.4x    Libraries

Memory Safety
  [Figure: attack-model flowchart with all four outcomes; "Pointer metadata tracking & checking" guards the first two steps (make pointer out-of-bounds / dangling; use pointer to write / read).]

Data integrity
  [Figure: same flowchart; "Object metadata tracking & checking" guards the "modify ..." step.]
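The "pointer metadata tracking & checking" idea from the Memory safety slide can be sketched with a toy checked pointer. This minimal sketch assumes each pointer carries the base and bound of the object it was derived from and is checked on every access. It covers only spatial errors (out-of-bounds), not dangling pointers, and the names `CheckedPointer` and `allocate` are invented for illustration.

```python
"""Toy pointer with bounds metadata, checked on every access."""
from dataclasses import dataclass


class BoundsViolation(Exception):
    """Raised instead of letting an out-of-bounds access proceed."""


@dataclass(frozen=True)
class CheckedPointer:
    memory: list  # shared backing store standing in for the heap
    base: int     # first valid index of the pointed-to object
    bound: int    # one past the last valid index
    addr: int     # current pointer value

    def offset(self, delta):
        """Pointer arithmetic: the metadata travels with the new pointer."""
        return CheckedPointer(self.memory, self.base, self.bound, self.addr + delta)

    def _check(self):
        # Dereferencing outside [base, bound) is rejected.
        if not (self.base <= self.addr < self.bound):
            raise BoundsViolation(
                f"address {self.addr} outside [{self.base}, {self.bound})"
            )

    def load(self):
        self._check()
        return self.memory[self.addr]

    def store(self, value):
        self._check()
        self.memory[self.addr] = value


def allocate(memory, start, length):
    """Hand out a pointer whose metadata covers exactly one object."""
    return CheckedPointer(memory, start, start + length, start)
```

The extra check on every load and store is where the run-time overhead of such schemes comes from; in this toy model a write one element past a buffer raises `BoundsViolation` instead of corrupting the neighboring object.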
Data-flow integrity
  [Figure: attack-model flowchart with all four outcomes (code corruption, control-flow hijack, data-only attack, information leak); "DFI" overlay.]

Summary

  Group               Policy               Technique      Weakness           Perf.   Comp.
  Hijack protection   W⊕X                  Page flags     JIT                1x      Good
  Hijack protection   Return integrity     Stack cookies  Direct overwrite   1x      Good
  Hijack protection   Address space rand.  ASLR           Info-leak          1.1x    Good
  Hijack protection   Control-flow integ.  CFI            Over-approx.       1.4x    Libraries
  Generic protection  Memory safety        SB+CETS        None               2-4x    Good
  Generic protection  Data integrity       WIT            Over-approx., …    1.2x    Libraries
  Generic protection  Data space rand.     DSR            Over-approx., …    1.3x    Libraries
  Generic protection  Data-flow integrity  DFI            Over-approx.       2-3x    Libraries

Challenges in Practical Usage
  Why are most defense mechanisms not used in practice?
    1. The performance overhead of the approach outweighs the potential protection.
    2. The approach is not compatible with all currently used features (e.g., in legacy programs).
    3. The approach is not robust, and the offered protection is not complete.
    4. The approach depends on changes to the compiler toolchain or to the source code, while the toolchain is not publicly available.

Run-time Performance
  - Performance significantly impacts adoption
  - Lowest overhead (negligible)
    - DEP
    - ASLR
    - Stack canaries
  - Diversity/performance trade-off
    - Pre-distribution: small overhead (1-11%)
    - Post-distribution: larger overhead (5-265%)

Part II: Automated Software Diversity

Security Problem
  - Attacks rely on offline information about the target
    - Memory layout
    - Memory contents (including program code)
    - Deterministic algorithms
  - In one word: predictability
  - Huge advantage for the attacker
    - Attack one instance of a program => attack them all

Probabilistic Protection
  - Deterministic protection
    - A low-level reference monitor (RM) enforces a deterministic safety policy.
    - By hardware, like W⊕X
    - By software, adding the RM into the code
  - Probabilistic protection
    - Built on randomization or encryption
    - Instruction Set Randomization
    - Address Space Randomization
    - Data Space Randomization

Deterministic vs. Nondeterministic algorithm
  - A deterministic algorithm, given a particular input, always produces the same output and always passes through the same sequence of states.
  - A nondeterministic algorithm can exhibit different behaviors on different runs, even for the same input. Sources of nondeterminism:
    - external state, e.g., user input, a global variable, a hardware timer value, a random value, or stored disk data;
    - timing-sensitive operations, e.g., multiple processors writing to the same data at the same time;
    - hardware errors that change state in an unexpected way.

Software Diversity Threat (Why Diversity?)
  - Information Leaks
  - Memory Corruption Attacks
  - Control Flow Hijacking
    - Code Injection
    - Code Reuse
  - Program Tampering
  - Reverse Engineering
  - Combination of the above

Reverse Engineering
  - Goal: discover program semantics (algorithms)
  - Attacker has access to the program binary
  - Security through obscurity
  - Most frequent mitigation: obfuscation

What Have We Learned?
  - Attacks are increasing in complexity
    - Work around defenses (incl. diversity)
    - Multi-tiered attacks

Example: JIT Code Reuse
  - State-of-the-art multi-tier attack
    - Attacker finds the address of valid code
    - Uses an information leak to read all code pages
    - Scans code pages for reusable code snippets (ROP gadgets)
    - Redirects control flow to the attack payload (memory corruption)
  - Ultimate goal: control flow hijacking and more
  Source: K. Z. Snow, F. Monrose, L. V. Davi, A. Dmitrienko, C. Liebchen, and A.-R. Sadeghi. Just-in-time code reuse: On the effectiveness of fine-grained address space layout randomization. In Proceedings of the 34th IEEE Symposium on Security and Privacy, S&P '13, 2013.

Example: Blind ROP
  - Novel ROP attack (from S&P 2014)
  - Blind information leak
    - Locate gadgets by trial and error
    - Build a gadget chain that writes memory to a socket
    - Used to leak the entire program binary
  - Requires no a priori knowledge of the program

What to Diversify?
  - Program Code
    - Control Flow Hijacking
    - Reverse Engineering
    - Program Tampering
  - Program Data
    - Information Leaks
    - Memory Corruption
    - Program Tampering

Granularity of Transformations
  - Instruction-level
  - Basic block-level
  - Loop-level
  - Function-level
  - Program-level
  - System-level

Diversifying Transformations
  - Instruction-level
    - Instruction Substitution
    - Equivalent Instructions
    - Instruction Reordering
    - Register Allocation Randomization
    - Garbage Code Insertion
  - Basic block-level
    - Basic Block Reordering
    - Opaque Predicates
    - Branch Function Insertion
  - Loop-level

Diversifying Transformations
  - Function-level
    - Stack Layout Rand.
    - Function Parameter Rand.
    - Randomized Inlining/Outlining/Splitting
    - Control Flow Flattening
  - Program-level
    - Function Reordering
    - Base Address Rand. (ASLR)
    - Program Encoding Rand.
    - Data Randomization
    - Library Entry Point Rand.
  - System-level

When to Diversify?
  - Design
  - Implementation
  - Compilation
  - Linking
  - Installation
  - Loading
  - Execution
  - Update

Pre-distribution Diversification
  - Disassembly is hard!
    - Indirect branches (targets not known until run time)
    - Data inside the code section
  - Compile-time information makes diversification easier
    - Garbage Code Insertion
    - Register Allocation Randomization
    - Stack Layout Randomization
    - more...

Post-distribution Diversification
  - Distributed diversification
    - Lower costs for developers
    - Introduces latency
    - Generally faster diversification
  - More diversity
    - Between each run
    - During execution

Security
  - Entropy
    - Measure entropy / number of different versions
    - Quantitative, not qualitative
    - ASLR: entropy matters!
  - Logical/formal argument
  - Concrete attacks
    - Test a real attack against a hardened binary
    - Attack-specific metrics

Moving Forward
  - Hybrid Approaches
  - Error Reporting & Debugging
  - Automated Security Evaluation
  - Side-channels / Information Leaks

References
  1. Per Larsen, Andrei Homescu, Stefan Brunthaler, and Michael Franz. SoK: Automated Software Diversity. In Proceedings of the 2014 IEEE Symposium on Security and Privacy (SP '14), 2014.
  2. Laszlo Szekeres, Mathias Payer, Tao Wei, and Dawn Song. SoK: Eternal War in Memory. In Proceedings of the 2013 IEEE Symposium on Security and Privacy (SP '13), 2013.
  3. R. Pucella and F. B. Schneider. Independence from obfuscation: a semantic framework for diversity. In 19th IEEE Computer Security Foundations Workshop, 2006, pp.12 pp.,241.
  4. K. Z. Snow, F. Monrose, L. V. Davi, A. Dmitrienko, C. Liebchen, and A.-R. Sadeghi. Just-in-time code reuse: On the effectiveness of fine-grained address space layout randomization. In Proceedings of the 34th IEEE Symposium on Security and Privacy (S&P '13), 2013.

Questions?