Secure Certifying Compilation
What do you want to type check today?
David Walker
Cornell University

Extensible Systems
Many systems have programmable interfaces.
[Figure: Code; Download, Link & Execute; System Interface]
  – printers and editors (postscript printers, emacs, Word)
  – browsers and servers (applets, plugins, CGI-scripts)
  – operating systems (virus scanners)
  – networks (active networks, JINI)

April 12, 2000 David Walker, Cornell University

Extensible Systems: Pros
• Client-side customization
  – plug in your own devices, 3rd-party utilities
• Preservation of market-share
  – vendors can add features, improve functionality easily
• System maintenance and evolution
  – software subscriptions

Extensible Systems: Cons
• Security
  – extensibility opens system to malicious attacks
  – how do we prevent misuse of resources?
• Reliability
  – flexibility makes it hard to reason about system evolution
  – how do we limit damage done by erroneous extensions?

Extensible Systems: Reality
• Strong economic and engineering pros
  – Mobile code, systems with programmable interfaces will proliferate
• A necessity: practical technology for increasing the security and reliability of extensible systems

Outline
• Framework for improved reliability and security
  – Idea I: certifying compilation
  – Idea II: security via code instrumentation
• An instance [popl '00]
  – Security automaton specifications
  – A dependently-typed target language (TAL)
• Related work & research directions

Certified Code
[Figure: Untrusted Code; Certificate; Secure Code; Download & Check; System Interface; Link & Execute]
• Attach annotations/certificate (types, proofs, ...)
to untrusted object code extensions
• Certificates make verification feasible
• Move away from trust-based security & reliability

Certifying Compilation
[Figure: High-level Program; Compile; certificate; Annotated IR; Optimize; Transmit]
• Low-level certificate generation must be automated
• Necessary components:
  1) a source-level programming language
  2) a compiler to compile and optimize source programs while preserving the certificate
  3) a certifying target language

Question
How should we obtain the initial certificate?

Answer
• Use a type-safe language
• Type inference relieves the tedium of proof construction
• Programmers will rewrite programs so they type check

Certifying Compilation So Far
[Figure: Type Safe High-level Program; Compile; types; Typed Program; Optimize; Transmit]
  1) a strongly typed source-level programming language
  2) a type-preserving compiler to compile and optimize source programs
  3) a certificate language for type-safety properties

Certifying Compilers
• Proof-Carrying Code [Necula & Lee]
  – an expressive base logic that can encode many security policies
  – in practice, logic is extended with a type system
  – compilers produce type safety proofs
• Typed Assembly Language [Morrisett, Walker, et al]
  – flexible type constructor language that can encode high-level abstractions
  – guarantees type safety properties

Conventional Type Safety
• Conventional types ensure basic safety:
  – basic operations performed correctly
  – abstraction/interfaces hide data representations and system code
• Conventional types don't describe complex security policies
  – eg: policies that depend upon history
    • Melissa virus reads Outlook contacts list and then sends 50 emails

Outline
• Framework for improved reliability and security
  – Idea I: certifying compilation
  – Idea II: security via code instrumentation
• An instance [popl '00]
  – Security automaton specifications
  – A dependently-typed target language (TAL)
• Related work & research directions

Flexible Security Policies
[Figure: High-level Extension; Compiler; Security Policy; Instrument; Analyze & Optimize]
• Specify policies independently of extensible system
• Compiler instruments extensions
• Easier to understand, debug, evolve policies

Security Policy Specifications
• Requirement: a language for specifying security policies
• Features:
  – Notation for specifying events of interest
    • "network send" and "file read" are security-sensitive
  – Notation for specifying illegal behaviour
    • a privacy policy: "no send after read"
  – A feasible compilation strategy
    • must be able to prevent programs from violating the policy

Examples
• SFI [Wahbe et al]
  – events are read, write, jump
  – enforce memory safety properties
• SASI [Erlingsson & Schneider], Naccio [Evans & Twyman]
  – flexible policy languages
  – not certifying compilers

Putting it Together
  – define policies in a high-level, flexible and system-independent specification language
  – instrument system extensions both with dynamic security checks and static information
  – preserve proof of security policy during compilation and optimization
  – verify certified compiler output to reduce TCB

Outline
• Framework for improved reliability and security
  – Idea I: certifying compilation
  – Idea II: security via code instrumentation
• An instance [popl '00]
  – Security automaton specifications
  – A dependently-typed target language (TAL)
• Related work & research directions

Secure Certified Code
• Overview of Architecture
• Security Automata [Erlingsson & Schneider]
  – How to specify security properties
  – A simple compilation strategy
• A dependently-typed target language (TAL)
  – A brief introduction to TAL
  – Extensions for certifying security properties
    • theoretical core language proven sound
    • can express any security automaton policy

Security Architecture
[Figure: Security Automaton Specification; High-level Extension; System Interface; Instrument; Annotate; Secure Typed Extension; Secure Typed Interface; Type Check; Optimize; Transmit; Secure Executable]

Security Automata
• A general mechanism for specifying security policies
• Specify any safety property
  – access control policies:
    • "cannot access file foo"
  – resource bound policies:
    • "allocate no more than 1M of memory"
  – the Melissa policy:
    • "no network send after file read"

Example
[Figure: automaton with states start, has read, bad; edges labeled read(f) and send]
• Policy: No send operation after a read operation
• States: start, has read, bad
• Inputs (program operations): send, read
• Transitions (state x input -> state):
  – start x read(f) -> has read

Example Cont'd
[Figure: automaton with states start, has read, bad; edges labeled read(f) and send]
• S.A. monitors program execution
• Entering the bad state = security violation

  % untrusted program     % s.a.: start state
  send();                 % ok -> start
  read(f);                % ok -> has read
  send();                 % bad, security violation

Bounding Resource Use
[Figure: automaton with states 0 ... i ... n-1 and bad; edges labeled malloc (i)]
• Policy: "allocate fewer than n bytes"

Enforcing S.A. Specs
• Every security-relevant operation has an associated function: checkop
• Trusted, provided by policy writer
• checkop implements the s. a.
transition function

  checksend (state) =
    if state = start then start
    else halt               % terminates execution

Enforcing S.A. Specs
• Easy, wrap all function calls in checks:

  send()   =>   let next_state = checksend(current_state) in send()

• Improve performance using program analysis

Outline
• Technology for improved reliability and security
  – Idea I: certifying compilation
  – Idea II: security via code instrumentation
• Secure certifying compilation [popl '00]
  – Security automaton specifications
  – A dependently-typed target language (TAL)
• Related work & research directions

Brief TAL Overview
[Figure: Typecheck; Link]
• Assembly or machine code with typing annotations
• Object files checked separately and linked together
• Ensures basic safety without run-time checks
  – Memory safety: can't read/write arbitrary memory
  – Control-flow safety: can't execute arbitrary data
  – Type abstraction: TAL can encode and enforce high-level abstract data types

A TAL Compiler
• TAL is practical
  – We compile "safe C" (aka Popcorn)
  – No pointer arithmetic, unsafe casts
  – ML-style data types, polymorphism, exceptions
  – Some simple optimizations
    • null-check elimination, inlining, register allocation
  – The compiler bootstraps
    • most compiler hacking by Grossman, Morrisett, Smith

Other TAL Features
• Memory management features
  – Stack types
  – Aliasing
  – Region-based MM
  – See Dave's thesis
• Other features
  – Dynamic linking
  – Run-time code generation
• http://www.cs.cornell/talc

Typing Assembly Code
• Programs divided into labeled code blocks
• Each block has a code type: {eax: τ1, ebx: τ2, ...}
• Code types specify expected register contents
  – Assume code type to check the block
  – Prove control transfers (jumps) meet
the assumptions

  Foo : {eax: int, ecx: {eax: int}}
    mov ebx, 3;      % {eax: int, ebx: int, ecx: {eax: int}}
    add eax, ebx;    % OK
    jmp ecx          % OK

Increasing Expressiveness
• Basic types ensure standard type safety
  – functions and data used as intended and cannot be confused
  – security checks can't be circumvented
• Introduce a logic into the type system to express security invariants
• Use the logic to encode the s.a. policy
• Use the logic to prove checks unnecessary

Target Language Predicates
• States (for compile-time reasoning)
    • constants: start, has read, bad, ...
    • variables: ρ1, ρ2, ...
• Predicates:
  – describe security states
    • instate(ρ)
  – describe relationships between states
    • transsend(ρ1,ρ2)
  – describe dependencies between values
    • (see the paper)

Preconditions
• Code types can specify preconditions:

  foo: [ρ, instate(ρ), ρ ≠ bad].{eax: τ1, ecx: τ2}

• A typical use:

  bar: {...}
    ...              % Known: instate(start)
    ...
    jmp foo [start]
      - instantiate polymorphic variable
      - prove residual preconditions
        - eg: instate(start), start ≠ bad
      - hope proofs are easy (syntactic matching)
      - otherwise place explicit proof at call site
        - eg: jmp foo [start, Proof, Proof]

Postconditions
• Expressed as a precondition on the return address type:

  bar: { eax: τ1, ecx: [instate(has read)].{eax: τ2} }

• Before returning, bar proves instate(has read)
• After return, assume instate(has read)

Encoding Security Automata
• Each security-relevant function has a type specifying 3 preconditions, 1 postcondition
• the send function:
  – P1: instate(curr)
  – P2: transsend(curr,next)
  – P3: next ≠ bad
    Pre: P1, P2, P3
  – P4: instate(next)
    Post: P4

  send: [curr,next,P1,P2,P3].{ ecx: [P4].{ } }

Technical Note
• State predicates behave linearly
  – as in linear logic, each state predicate is used once
  – instate(curr) is "consumed" at send call site
    • can't be used in future proofs
    • can't fool type system into thinking code continues to be in state curr
  – instate(next) is "produced" on return
    • will be used when next calling a security-sensitive function

Compile-time & Run-time
• Compile-time reasoning depends on runtime values

  foo:
    mov eax, state    % should represent the current state
    mov ecx, ret1
    jmp checksend     % state argument, state result in eax
  ret1:
    push eax          % save next state on the stack
    mov ecx, ret2
    jmp send          % must establish precondition for send
  checksend:          % postcond. == precond.
                      %   for ret1, send

Checksend
• A type for checksend (first try)

  checksend: [curr,P1].{eax:state, ecx:[next,P1,P2,P3].{eax:state} }
  where P1 = instate(curr), P2 = transsend(curr,next), P3 = next ≠ bad

• No correspondence between run-time argument and static information

  mov eax, wrong_state; mov ecx, next; jmp checksend

Checksend
• Solution: provide very precise types
• Singleton types
  – A type containing one value
  – eax : state(start)
    • means eax contains a data structure that represents exactly the start state and no other state
  – eax : state(ρ)
    • eax contains data representing the unknown state ρ
    • useful in many contexts
  – Similar to Dependent ML [Xi & Pfenning]

Using Singletons
• checksend
  – implements the automaton transition function
    • intuitively has type state -> state
    • singletons help relate run-time values to compile-time predicates

  [curr,P1].{eax:state(curr),ecx:[next,P1,P2,P3].{eax:state(next)}}

  – P1 = instate(curr), P2 = transsend(curr,next), P3 = next ≠ bad

Using Checksend

  foo: { … }
    ...                      % Assume: instate(curr), eax : state(curr)
    mov ecx, ret1
    jmp checksend[curr]
  ret1: [next, instate(curr), transsend(curr,next), next ≠ bad].{eax:state(next)}.
    push eax;
    mov ecx, ret2;
    jmp send [curr,next]     % P1 & P2 & P3 ==> ok
  ret2: ...

Optimization
• Analysis of s.a.
structure makes redundant check elimination possible
  – eg:
    [Figure: automaton with states start, has read, bad; edges labeled read(f) and send]
  – identify transsend(start,start) as valid

Optimization
[Figure: Policy; High-level Interface; Low-level Interface; entries send: ', read: ', checksend: ', checkread: ']
Axiom A = transsend(start,start)

Optimization

  loop : [instate(start)].{ }
    mov ecx, loop
    jmp send [start,start,By A];

  send: [curr,next,instate(curr),transsend(curr,next), next ≠ bad].{ecx: [P4].{ }}

• Type-checker is simple but general
• Typical optimizations
  – redundant check removal
  – loop invariant removal

Implementation
• TALx86 implementation is sufficient for these encodings
  – includes polymorphism, higher-order type constructors, logical connectives (,,), singleton types, ....
• Lots more work to be done
  – axioms in module interfaces
  – policy compiler

Outline
• Technology for improved reliability and security
  – Idea I: certifying compilation
  – Idea II: security via code instrumentation
• Secure certifying compilation [popl '00]
  – Security automaton specifications
  – A certifying target language
• Related work & research directions

Research Directions
• Design of policy languages
  – What kinds of logics can we compile & certify?
    • Mawl [Sandholm & Schwartzbach]
    • TALres [Crary & Weirich]
• Design of safety architecture
  – How do we "clean up" after halting a program?
  – Support for mutually distrustful agents
• Policy-directed optimizations

Summary
• A recipe for secure certified code:
  – types
    • ensure basic safety without run-time overhead
    • add a logic to encode complex invariants
  – policy-directed code instrumentation
    • specify security policies independently of the rest of the system
    • use dynamic checking to enforce policies when they can't be proven statically
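
Sketch: dynamic enforcement of "no send after read"
The dynamic half of the recipe (a security automaton whose transition functions wrap each security-relevant call) can be sketched in Python. This is a minimal illustration under stated assumptions, not the talk's implementation. check_send follows the checksend definition on the "Enforcing S.A. Specs" slide. Everything else is hypothetical and introduced only for illustration: the State enum, the SecurityViolation exception (standing in for halt), check_read (its transitions are read off the example automaton figure), and the Monitor wrapper. The static half, where types prove some checks redundant, is not modeled here.

```python
from enum import Enum


class State(Enum):
    """Automaton states from the 'no send after read' example."""
    START = "start"
    HAS_READ = "has read"
    BAD = "bad"


class SecurityViolation(Exception):
    """Hypothetical stand-in for 'halt': raised instead of entering BAD."""


def check_send(state: State) -> State:
    # checksend(state) = if state = start then start else halt
    if state is State.START:
        return State.START
    raise SecurityViolation("send attempted after read")


def check_read(state: State) -> State:
    # Assumed from the automaton figure: a read moves start -> has read,
    # and further reads stay in has read.
    if state in (State.START, State.HAS_READ):
        return State.HAS_READ
    raise SecurityViolation("read attempted in bad state")


class Monitor:
    """Wraps each security-relevant operation in its check function."""

    def __init__(self) -> None:
        self.state = State.START
        self.log: list[str] = []  # operations that were actually performed

    def send(self) -> None:
        # let next_state = checksend(current_state) in send()
        self.state = check_send(self.state)
        self.log.append("send")

    def read(self, f: str) -> None:
        self.state = check_read(self.state)
        self.log.append(f"read({f})")


def run_untrusted(monitor: Monitor) -> None:
    """The untrusted program from the example: send(); read(f); send();"""
    monitor.send()      # ok -> start
    monitor.read("f")   # ok -> has read
    monitor.send()      # violation: raises SecurityViolation
```

Calling run_untrusted(Monitor()) performs the first send and the read, then raises SecurityViolation on the second send, which mirrors the trace on the "Example Cont'd" slide. Raising before the operation runs is the design point: the check decides the next state first, so the offending send is never executed.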