GhostRider: A Hardware-Software System for Memory Trace Oblivious Computation
Chang Liu, Austin Harris, Martin Maas, Michael Hicks, Mohit Tiwari, and Elaine Shi

Cloud computing raises privacy concerns for sensitive data
[Figure: Data & Program; caption "Privacy is less concerned!"]
• Malicious insiders or intruders can potentially perform physical attacks to snoop sensitive data
[Figure labels: Insider, Data & Program, bus, Intruder]

1st-generation secure processors encrypt memory
• E.g., secure processors (AEGIS, XOM, AISE-BMT), IBM Cryptographic Coprocessors, Intel SGX
• Access patterns to even encrypted data leak sensitive information.
[Figure labels: Secure processor; Breast cancer, Liver problem, Kidney problem]

How Easy Are Physical Attacks?
• E.g., replace DRAM DIMMs with NVDIMMs that have non-volatile storage to record accesses

2nd-generation secure processors: Oblivious RAM
• Hide access patterns
• Poly-logarithmic cost per access: O(poly log N)
• SLOW!
[Figure: the secure processor issues Read M[i] through an ORAM scheme]

[Stefanov et al., 2013] Path ORAM: An Extremely Simple Oblivious RAM Protocol. (Best CCS Paper 2013)
[Maas et al., 2013] Phantom: Practical Oblivious Computation in a Secure Processor. In Proc. of CCS 2013.
[Fletcher et al., 2015] Freecursive ORAM: [Nearly] Free Recursion and Integrity Verification for Position-based Oblivious RAM. In Proc. of ASPLOS 2015.

To speed up ORAM-based secure processors
Key observation: not all data's access patterns leak information
• Avoid placing everything into one giant ORAM
• A hybrid memory model
  • Normal memory (DRAM)
  • Encrypted memory (ERAM)
• Use multiple smaller ORAM banks instead of a big one
• Security must NOT BE SACRIFICED

Memory Trace Obliviousness (MTO) [3]
[Figure: secure processor with an ERAM controller, an ORAM controller, and a DRAM controller]

[3] Chang Liu, Michael Hicks, and Elaine Shi. "Memory Trace Oblivious Program Execution." In Proc. of CSF 2013.
(2013 NSA Best Scientific Cybersecurity Paper)

Example: FindMax

    int max(public int n, secret int h[]) {
        public int i = 0;
        secret int m = 0;
        while (i < n) {
            if (h[i] > m) then m = h[i];
            i++;
        }
        return m;
    }

• h[] need not be in ORAM. Encryption suffices.

Dynamic Memory Accesses: Main loop in Dijkstra
• dis[]: Not in ORAM
• vis[], e[][]: Inside ORAM

    for (int i = 1; i < n; ++i) {
        int bestj = -1, bestdis = -1;
        /* Pick the closest unvisited vertex. */
        for (int j = 0; j < n; ++j)
            if (!vis[j] && (bestdis < 0 || dis[j] < bestdis)) {
                bestj = j;
                bestdis = dis[j];
            }
        vis[bestj] = 1;
        /* Relax the edges leaving bestj. */
        for (int j = 0; j < n; ++j)
            if (!vis[j] && (bestdis + e[bestj][j] < dis[j]))
                dis[j] = bestdis + e[bestj][j];
    }

Our Goal
• Build a compiler to automate MTO analysis
  • to integrate with an ORAM-capable secure processor
• Design a type system to formally enforce obliviousness

Challenges: Integrating Hardware With Compiler
• Implicit cache will leak information (not MTO!)
• Nondeterministic timing
  • Timing channel leaks information
• MTO assembly code generation

GhostRider: A Compiler-Hardware Codesigned Approach
[Figure: the compiler (secure type checker, optimizer) formally enforces MTO and emits assembly code for the secure processor. The secure processor has an extended instruction set, a scratchpad, and DRAM, ERAM, and ORAM 1 … ORAM n controllers. Security guarantee: MTO ⇒ cache channel, timing channel, termination channel.]

Architecture Overview
• Instructions have deterministic timings
• Users can ship their code and data securely using standard methods
• Software-controlled scratchpad to replace an implicit cache
• Joint ORAM-ERAM memory system

Extended Instruction Set L_T
• Data transfer between memory and scratchpad
• Data transfer between scratchpad and registers

MTO for L_T
• y := a[x]
• a is placed in ERAM

    t1 ← r_x div size_blk
    t1 ← t1 + startblk_a
    t2 ← r_x mod size_blk
    ldb k1 ← E[t1]
    ldw r_y ← k1[t2]

• Input: x = 513 (secret input)
• Assume size_blk = 512
• Trace: fetch, fetch, fetch, eread(1)
• The address in eread(1) depends on x!
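The block and offset arithmetic in the ERAM example above can be checked with a short C sketch. It is only an illustration, not GhostRider's implementation: the function names are hypothetical, and START_BLK_A = 0 is an assumption chosen so that x = 513 yields the eread(1) event shown on the slide.

```c
#include <assert.h>
#include <stdio.h>

/* SIZE_BLK comes from the slide; START_BLK_A = 0 is an assumed value. */
enum { SIZE_BLK = 512, START_BLK_A = 0 };

/* t1: the block address that `ldb k1 <- E[t1]` puts on the memory bus. */
int eram_block_addr(int x) {
    assert(x >= 0);
    return x / SIZE_BLK + START_BLK_A;
}

/* t2: the word offset inside the block; used only within the scratchpad. */
int block_offset(int x) {
    assert(x >= 0);
    return x % SIZE_BLK;
}

/* Prints the bus-visible event next to the on-chip offset for one index. */
void print_eram_event(int x) {
    printf("x=%d: eread(%d) visible, offset %d on chip\n",
           x, eram_block_addr(x), block_offset(x));
}
```

Calling print_eram_event(513) and print_eram_event(5) reports different block addresses, so an observer of the bus learns which block of a[] the secret index falls in. Encryption hides the contents of the block but not this address.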
MTO for L_T
• y := a[x]
• a is placed in an ORAM o

    t1 ← r_x div size_blk
    t1 ← t1 + startblk_a
    t2 ← r_x mod size_blk
    ldb k1 ← o[t1]
    ldw r_y ← k1[t2]

• Input: x = 513 (secret input)
• Assume size_blk = 512
• Trace: fetch, fetch, fetch, o, fetch
• Only the ORAM bank name o appears in the trace, not the address computed from x

Memory Trace Oblivious Type System Overview
• The type system deals with assembly code
• The type system computes a trace pattern for a given program
  • and checks that the trace pattern does not depend on secret input
• In trace patterns, the type system keeps track of symbolic values instead of actual values
• To deal with branching instructions, the type system allows a limited form of code patterns containing branching
  • branching is only allowed in the IF-code pattern and the LOOP-code pattern

Compiler Implementation
[Figure: compiler pipeline; labels in slide order]
• C program with security annotations
• Standard information-flow style type system: n is public, x is secret, a is secret
• Memory allocation: a is in ORAM o, n is in DRAM, x is in ERAM
• Basic compilation (software caching)
• Padding (IF-code block)
• Program in L_T (may not type check)
• Typed program in L_T
• Register allocation
• Type checker

FPGA Implementation

Evaluation - simulator
[Figure: results chart; legend includes "Non-secure"]
• Compare all strategies with the non-secure baseline, which is also implemented in our prototype
• Up to 10x speedup from our method (Final) compared with the baseline, and it never slows a program down
• Little overhead over the non-secure baseline for some programs (e.g., sum, findmax)
• For programs whose memory trace patterns heavily depend on the input, the speedup is small

Evaluation - FPGA

Conclusion and Future Work
• We built GhostRider, a co-designed compiler and architecture that supports memory trace oblivious computation in the cloud
  • Hybrid memory model implementation
  • A compiler to automate memory trace obliviousness analysis
  • A security type system to formally enforce obliviousness
• Evaluations show up to 10x speedup from GhostRider
• In the future, we plan to extend the current system to compile larger-scale programs
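As a source-level illustration of why the secret-dependent branch in FindMax needs care, here is a minimal C sketch. GhostRider handles such branches by padding IF-code blocks in assembly. The sketch below swaps in a different, simpler technique, a branch-free select, so that the update to m executes the same operations whether or not h[i] > m. The names select_u32 and find_max_oblivious are hypothetical, and a C compiler gives no formal guarantee that it will not reintroduce a branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a when cond is 1 and b when cond is 0, without branching. */
static uint32_t select_u32(uint32_t cond, uint32_t a, uint32_t b) {
    uint32_t mask = (uint32_t)0 - cond;  /* 1 -> all ones, 0 -> all zeros */
    return (a & mask) | (b & ~mask);
}

/* Scans h[0..n-1] sequentially; the access order depends only on public n. */
uint32_t find_max_oblivious(size_t n, const uint32_t h[]) {
    assert(h != NULL || n == 0);
    uint32_t m = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t v = h[i];
        /* Same operations on both outcomes of the secret comparison. */
        m = select_u32((uint32_t)(v > m), v, m);
    }
    return m;
}
```

Because the loop reads h[] in a fixed order, the array's access pattern reveals nothing about its contents, which matches the FindMax slide's point that encryption alone suffices for h[].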