Secure Programs via Game-based Synthesis Somesh Jha, Tom Reps, and Bill Harris 1 One-slide summary • Secure programming on a conventional OS is intractable • Privilege-aware OS’s take secure programming from intractable to challenging • Our program rewriter takes secure programming from challenging to simple 2 Outline 1. Motivation, problem statement 1. Motivation, problem statement 2. Previous work: Capsicum [CAV ’12, Oakland ’13] 3. Ongoing work: HiStar 4. Open challenges 3 Secure Programming is Intractable • 81 exploits in CVE since Sept. 2013 • Many exploit a software bug to carry out undesirable system operations • 2013-5751: exploit SAP NetWeaver to traverse a directory • 2013-5979: exploit bad filename handling in Xibo to read arbitrary files • 2013-5725: exploit ByWord 4 How to Carry Out an Exploit software vulnerability + OS privilege = security exploit 5 The Conventional-OS Solution software vulnerability + OS privilege = security exploit 6 The ProgramVerification Solution software vulnerability + OS privilege = security exploit 7 Priv.-aware OS • Introduce explicit privileges over all system objects, primitives that update privileges • Programs call primitives to manage privilege 8 ThePriv.-aware OS Solution ( ) software vulnerability + OS privilege primitives = security exploitmonitor 9 TheCapsicumPriv.-aware OS [Watson ’10] • Privilege: ambient authority (Amb) to open descriptors to system objects • Primitives: program calls cap_enter() to manage Amb 10 ’s Rules ofCapsicum Amb 1. When a process is created, it has the Amb value of its parent • After a process calls cap_enter(), it does not have Amb • If a process does not have Amb, then it can never obtain Amb 11 gzip • main() { • file_nms = parse_cl(); • for (f in file_nms): • L0: (in, out) = L1: compress(in, out); open2(f); /usr/local •} http://evil.com 12 gzip with AMB A simple policy • When gzip calls open2() at L0, be able to open have • it should AMB descriptors When gzip calls compress() at L1, able to open it should not have AMB descriptors 13 gzip with AMB main() { file_nms = parse_cl(); for (f in file_nms): ? cap_enter() L0: (in, out) = open2(f); ? L1: compress(in, out); } L0: AMB L1: no AMB 14 Programming Capsicum Challenges 1. Amb policies are not explicit 2. cap_enter primitive has subtle temporal effects 15 gzip Programming Challenges { main() file_nms = parse_cl(); AMB for (f in file_nms): no AMB AMB L0: (in, out) = open2(f);no AMB AMB cap_enter(); L1: compress(in, out); no AMB } L0: AMB L1: no AMB 16 Rules of Capsicum’s Amb 1. When a process is created, it has the AMB value of its parent 1. After a process calls cap_enter(), it never has AMB 2. If a process does not have Amb, then it can never obtain Amb 17 • • • • Instrumenting gzip main() { file_nms = AMB parse_cl(); AMB for (f in file_nms): AMB L0: (in, out) = sync_fork(); open2(f); cap_enter(); • • no AMB sync_join(); L1: compress(in, out); } L0: AMB L1: no AMB 18 Capsicum Challenges Not Appearing in This Talk • Program can construct capability from each UNIX descriptor • Capability has a vector of 63 access rights (~1 for every system call on a descriptor) • Programs can assume new capabilities via a Remote Procedure Call (RPC) 19 Instrumenting Programs with CapWeave 1. Programmer writes an explicit Amb policy 2. CapWeave instruments program to invoke primitives so that it satisfies the policy 20 gzipwith CapWeave main() { file_nms = parse_cl(); for (f in file_nms): L0: (in, out) = open2(f); L1: compress(in, out); } Policy L0: AMB Cur(p) => (pc[L0](p) => AMB(p) L1: no AMB & (pc[L1](p) => !AMB(p)) 21 main() { file_nms = parse_cl(); for (f in CapWeave file_nms): L0: (in, out) = open2(f); L1: compress(in, out); void main() { L0: open2(...); } Instrumente sync_fork(); d Cur(p) => Program cap_enter(); L1: compress(); sync_join(); (pc[L0](p) => AMB(p) } Policy & (pc[L1](p) => !AMB(p)) 22 The Next 700 Policy Weavers • Analogous challenges with Decentralized Information Flow Control (DIFC) • Asbestos [Efstathopoulos ‘05] • HiStar [Zeldovich ’06] • Flume [Krohn ‘07] 23 Programmer gzip() { file_nms = parse_cl(); ... } Capsicum Designer Policy Cur(p) =>cap_enter: Amb’(p) (pc[L0](p) => AMB(p)) & (pc[L1](p) => !AMB(p)) := Amb(p) & ... Weaver CapWeave Generator gzip() { file_nms = parse_cl(); sync_fork(); cap_enter(); ... } 24 Programmer wrapper() { exec(...); ... } HiStar Designer Policy forall w, s. Flows(w, s) => ... create_cat(&c): Flows’(p, q) := Flows(p, q) || ... Weaver Generator HiWeave scanner() { create_cat(&c); exec(...); ... } 25 Outline 1. Motivation, problem statement 2. Previous work: Capsicum 1. 3. Ongoing work: HiStar 4. Open challenges 26 CapWeave Algorithm Inputs: Program P, Amb Policy Q Output: Instrumentation of P that always satisfies Q •Build finite IP# ⊇ instrumented runs that violate Q 27 1. Building IP#: Inputs Program Amb Policy main() { file_nms = parse_cl(); for (f in file_nms): L0: (in, out) = open2(f); L1: compress(in, out); } 28 L0: Amb L1: no Amb parse_cl cap_enter noop L0:open2() 1. Building IP#: Output L0:open2() sync_join() noop sync_fork() noop noop noop noop cap_enter() cap_enter() L1:compress() L1:compress() L1:compress() L1:compress() noop L1: no Amb L0:open2() 29 parse_cl cap_enter noop L0:open2() 1. Building IP#: Output L0:open2() sync_join() noop sync_fork() noop noop noop noop cap_enter() cap_enter() L1:compress() L1:compress() L1:compress() L1:compress() noop L0:open2() L0: Amb 30 Building IP# • Basic idea: construct IP# as a forward exploration of an abstract state space 31 1(a). IP#: Define Abstract State-space Q# 𝛼 Q 32 1(b). IP#: Define Abstract Transformers 𝜏[cap_enter]# Q# 𝛼 Q 𝜏[cap_enter] 33 1(c). Explore Abstract State Space 𝜏[cap_enter]# 𝜏[parse_cl]# Q# 𝛼 Q init 𝜏[noop]# ... .. . cap_enter ... parse_cl L0 L0’ noop 34 ... parse_cl cap_enter noop L0:open2() L0:open2() sync_join() noop sync_fork() noop noop noop 𝜏[parse_cl]# cap_enter() noop cap_enter() L1:compress() L1:compress() L1:compress() L1:compress() noop L0:open2() 35 State-Structure Exploration If a concrete state is a logical structure, ... A Q D A B B C 36 State-Structure Exploration properties are FOL formulas, ... ∀p. A(p) ⇒ ((B(p) ⇒ C(p)) ⋀ (D(p) ⇒ ¬C(p))) 37 State-Structure Exploration ...and semantics is given as predicate updates, ... A’(x) = A(x) ⋁ ∃ y. C(y) ⋀ B(q, p) B’(x, y) = B(x, y) ⋁ (C(x) ⋀ D(y)) 𝜏[action] ≡ C’(x) = ... D’(x) = ... 38 State-Structure Exploration ...then abstract space and transformers can be generated automatically [Sagiv ’99] 𝜏[action]# Q# Q 𝛼 𝜏[action] 39 𝛼 Capsicum Semantics A 1. D Q≡ A B B C 𝜏[action] ≡ 2. A’(x) = A(x) ⋁ ∃ y. C(y) ⋀ B(q, p B’(x, y) = B(x, y) ⋁ (C(x) ⋀ D(y)) C’(x) = ... D’(x) = ... 40 Capsicum State as Structure Parent Amb Cur L1 Amb ⊭ ∀ p. Cur(p) ⋀ L1(p) ⇒ ¬ Amb(p) 41 Capsicum State as Structure Parent Amb Cur L1 Amb ⊨ ∀ p. Cur(p) ⋀ L1(p) ⇒ ¬ Amb(p) 42 Capsicum Structure Transformers Cur Amb Action sync_fork() Parent Fresh Cur Amb Structure Transformer Intro Fresh Amb’(p) := Amb(p) ⋁ ( Fresh(p) ⋀ ∃ q. Cur(q) ⋀ Amb(q)) 43 Capsicum Structure Transformers Parent Amb Action cap_enter() Fresh Cur Amb Structure Transformer Amb’(p) := Amb(p) ⋀ ¬Cur(p) 44 Building IP#: Summary • If semantics is given as transforms of logical structures, we can generate an approximation of runs that cause a violation • Capsicum semantics can be modeled as structure transforms 45 CapWeave Algorithm Inputs: Program P, Amb Policy Q Output: Instrumentation of P that always satisfies Q •Build finite IP# ⊇ instrumented runs that violate Q 1. From IP#, build safety game G won by violations of Q 46 Two-Player Safety Games • In an Attacker state, the Attacker chooses the next input • In a Defender state, the Defender chooses the next input • Attacker wants to reach an accepting state 47 a x y b b w y z c x d c y x d d y d y b 48 Instrumentation as a Game Capsicum Instrumentation Two-player Games Program instructions Attacker actions Capsicum primitives Defender actions Policy violations Attacker wins Satisfying instrumentation Winning Defender strategy 49 gzip IP# parse_cl cap_enter noop L0:open2() L0:open2() sync_join() noop sync_fork() noop noop noop noop cap_enter() cap_enter() L1:compress() L1:compress() L1:compress() L1:compress() noop L0:open2() 50 gzip Safety Game parse_cl cap_enter noop L0:open2() L0:open2() sync_join() noop sync_fork() noop noop noop noop cap_enter() cap_enter() L1:compress() L1:compress() L1:compress() L1:compress() noop L0:open2() 51 CapWeave Algorithm Inputs: Program P, Amb Policy Q Output: Instrumentation of P that always satisfies Q •Build finite IP# ⊇ instrumented runs that violate Q •From IP#, build safety game G won by violations of Q 1. From winning strategy for G, generate primitive controller for P 52 CapWeave Performance Name Progra Policy Weaving m LoC Time kLoC bzip2-1.0.6 8 70 4m57s gzip-1.2.4 9 68 3m26s php-cgi-5.3.2 852 114 46m36s tar-1.25 108 49 0m08s tcpdump-4.1.1 87 52 0m09s wget-1.12 64 35 0m10s 53 Performance on Included Tests Name capweav Base Hand Diff. e Time Overhd Overhd (%) Overhd 0.593 bzip2-1.0.6 0.909 s 0.036 gzip-1.2.4 1.111 s 0.289 php-cgi-5.3.2 1.170 s 0.156 tar-1.25 13.301 s 54 1.099 20.90 1.278 15.03 1.938 65.64 21.917 64.78 Outline 1. Motivation, problem statement 2. Previous work: Capsicum 3. Ongoing work: HiStar 1. 4. Open challenges 55 The HiStar Priv-aware OS [Zeldovich ’06] • Privilege: OS allows flow between processes • Primitives: system calls update labels, which define allowed flows • Very powerful: mutually untrusting login (?!) 56 Sandboxing a Virus Scanner launcher() { exec(“/bin/scanner”); } wrapper() { child = sync_fork(&launcher); while (true) { read(child, buf); sanitize(buf); write(netd, buf); } } 57 A Flow Policy for a Virus Scanner • Information should never transitively flow from the scanner to the network, unless it goes through the wrapper • Information should always flow from the scanner to the wrapper • Information should always flow from the wrapper to the network 58 Rules for HiStar’s Flow • A process’s label maps each category to low or high • If process p calls create_cat, then each process is low in c, and p can declassify c • Each process may raise its level at each category • Each process may relinquish declassification of any category 59 Rules for HiStar’s Flow • Information can flow from p to q if for each category: • • The level of p is lower than the level of q at c, or p can declassify c 60 Sandboxing a Virus launcher()Scanner { drop_declass(x); exec(“/bin/scanner”); } wrapper() { create_cat(&x); raise(x); child = sync_fork(&launcher); while (true) { read(child, buf); sanitize(buf); write(netd, buf); } } 61 HiStar Challenges Not Appearing in This Talk • There are actually four levels • Each process has to manage its clearance • Processes can create labeled closures (calling a closure implicitly performs two label operations and three ordering checks) 62 Cap Hi Weave Algorithm Inputs: Program P, Policy Q Flow Amb Output: Instrumentation of P that satisfies Q •Build IP# ⊇ instrumented runs that violate Q (using semantics) Capsicum HiStar •From IP#, build safety game G won by violations of Q •From winning strategy for G, generate primitive controller for P 63 Capsicu HiSta Semanti mr cs A 1. D Q≡ A B B C 𝜏[action] ≡ 2. A’(x) = A(x) ⋁ ∃ y. C(y) ⋀ B(q, p B’(x, y) = B(x, y) ⋁ (C(x) ⋀ D(y)) C’(x) = ... D’(x) = ... 64 HiStar State as Structure Netd Label Flows Low Low Label Wrap Cur Flows x High Flows Label Scan ⊭ ∀ w, s, n. Wrap(w) ⋀ Scan(s) ⋀ Netd(n) ⇒ Flows(s, w)65⋀ Flows(w, n) HiStar State as Structure Netd Label Flows Low Low Label Wrap Decl Cur Flows x High Flows Label Scan ⊨ ∀ w, s, n. Wrap(w) ⋀ Scan(s) ⋀ Netd(n) ⇒ Flows(s, w)66⋀ Flows(w, n) HiStar State Transformers Flows Cur Low Label Fresh x Decl Action Structure Transform Intro Fresh create_cat(&x) Decl’(p, c) := Decl(p, c) ⋁ (Cur(p) ⋀ Fresh(c)) 67 HiStar State Transformers Flows Cur High Low Label Fresh x Decl Action raise(&x) Structure Transform Intro Fresh High’(l, c) := High(l, c) ⋁ ∃ p. Cur(p) & Label(p, l) & x(c) 68 Summary: HiStar Semantics • We can define the HiStar semantics as FOL predicate transforms and automatically generate a weaver for HiStar • FOL predicate transforms can describe capability and DIFC semantics 69 Scanner Game noop noop noop sync_fork create_cat(&c) raise(c) sync_fork sync_fork exec noop 70 drop_decl(c) exec ... HiWeave 𝛼 Performance • Generates code for clamwrap in < 3 mins 71 Programmer scanner() { sync_fork(); ... } HiStar Designer create_cat(&c): Decl’(p, c) := Decl(p, c) || ... Policy forall w, s, n. Wrap(w) && ... Weaver Generator HiWeave scanner() { create_cat(&c); sync_fork(); ... } 72 Outline 1. Motivation, problem statement 2. Previous work: Capsicum 3. Ongoing work: HiStar 4. Open challenges 73 Open Challenges • Automating abstraction refinement • Automating error diagnosis • Compositional synthesis • Optimizing generated code • Designing a policy logic 74 Automating Abstraction Refinement • Picking the right abstraction predicates requires a lot of design effort • Can we refine the abstraction predicates via counter-strategies? 75 Automating Error Diagnosis • When weaver fails, it has a counterstrategy • How can we simplify these when presenting them to the user? 76 Compositional Synthesis • Real programs are structured as a composition of processes • Policies are expressed naturally as conjunction of local, global policies • Can we adapt compositional verification? [Long, ’89] 77 HiStar Logger Local (security) policy: only Logger should be able to modify log Logger log.txt 78 HiStar Logger Global (functionality) policy: under certain conditions, Logger will append log on behalf of Environment Logger Environment log.txt 79 Optimizing Generated Code • Mean-payoff games present an appealing cost model, but have high complexity in general • Can we apply any domain specific optimizations? 80 Designing a Policy Logic • The weaver generator allows a policy writer to declare policies purely over privileges • What logic over privileges is easiest for a policy writer to understand? • How do we evaluate value added? 81 Our Collaborators Capsicum-dev QuickTime™ and a Photo - JPEG decompress or are needed to s ee this pic ture. Pawel Jakub Dawidek Khilan Gudka MIT-LL Ben Laurie HiStar Jeffrey Seibert Michael Zhivich 82 Nickolai Zeldovich Peter Neumann TVLA Mooly Sagiv Questions? scanner() { sync_fork(); ... } create_cat(&c): Decl’(p, c) := Decl(p, c) || ... Policy forall w, s, n. Wrap(w) && ... Weaver Generator HiWeave scanner() { create_cat(&c); sync_fork(); ... } 83 Extra Slides 84 Three-valued logic • Values: true, false, and unknown • true & unknown = unknown • false & unknown = false 85 Three-valued Structures Parent Amb Cur Parent Amb Cur Amb Cur Parent Amb Amb Cur Parent Parent 86 ... Abstraction Function alpha_{Cur} Amb Cur Parent Amb Amb Cur Parent Parent 87 Abstraction Function alpha_{Cur} Amb? Cur Parent Amb Cur Parent Parent 88 Abstract Fork (def) Amb Cur Amb Amb Cur Fresh Amb Parent Parent Action fork() Semantics Intro Fresh Amb(p) := Amb(p) || (Fresh(p) & E q. Cur(q) & Amb(q)) 89 Abstract Fork (definite) Amb Cur Fresh Amb Amb Cur Fresh Amb Parent Parent Parent Action fork() Semantics Intro Fresh Amb(p) := Amb(p) || (Fresh(p) & E q. Cur(q) & Amb(q)) 90