An Introduction to Proof-Carrying Code

David Walker, Princeton University
(slides kindly donated by George Necula; modified by David Walker)

Motivation
• Extensible systems can be more flexible and more efficient than client-server interaction
[Diagram: a client-server architecture versus an extensible system, in which an extension runs inside the host]

Example: Deep-Space Onboard Analysis
• Data: > 10 MB/sec
• Bandwidth: < 1 KB/sec
• Latency: > hours
Note:
• efficiency (cycles, bandwidth)
• safety-critical operation
Source: NASA Jet Propulsion Lab

More Examples of Extensible Systems
Code | Host
• Device driver | Operating system
• Applet | Web browser
• Loaded procedure | Database server
• DCOM component | DCOM client
• …

Concerns Regarding Extensibility
• Safety and reliability concerns
  – How do we protect the host from the extensions?
  – Extensions of unknown origin ⇒ potentially malicious
  – Extensions of known origin ⇒ potentially erroneous
• Complexity concerns
  – How can we do this without having to trust a complex infrastructure?
• Performance concerns
  – How can we do this without compromising performance?
• Other concerns (not addressed here)
  – How do we ensure privacy and authenticity?
  – How do we protect the component from the host?

Approaches to Component Safety
• Digital signatures
• Run-time monitoring and checking
• Bytecode verification
• Proof-carrying code

Assurance Support: Digital Signatures
[Diagram: code → checker → host]
• Example properties:
  – “Microsoft produced this software”
  – “Verisoft tested the software with test suite 17”
• No direct connection with program semantics
• Microsoft recently recommended that Microsoft be removed from one’s list of trusted code signers

Run-Time Monitoring and Checking
[Diagram: code → monitor → host]
• A monitor detects attempts to violate the safety policy and stops the execution
  – hardware-enforced memory protection
  – software fault isolation (sandboxing)
  – Java stack inspection
• Relatively simple; effective for many properties
• Either inflexible or expensive on its own

Java Bytecode
[Diagram: source code → compiler → JVM bytecode → checker → host]
• Relatively simple; overall an excellent idea
• Large trusted computing base
  – a commercial, optimizing JIT is 200,000-500,000 LOC
  – when is the last time you wrote a bug-free 200,000-line program?
• Java-specific; somewhat limited policies

Proof-carrying code
[Diagram: source code → compiler → native code + proof → checker → host]
• Flexible interfaces, like the JVM model
• Small trusted computing base (minimum of ~3000 LOC)
• Can be somewhat more language/policy independent
• Building an optimizing, type-preserving compiler is much harder than building an ordinary compiler

Question: Isn’t it hard, perhaps impossible, to check properties of assembly language?
Actually, no, not really, provided we have a proof to guide the checker.
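To make the checker’s job concrete, here is a minimal sketch, in OCaml, of what the trusted host-side infrastructure might look like. The formula and proof representations, the axiom names, and the safety_condition stub are all illustrative assumptions, not the encoding used by any real PCC system; the point is only that the host needs a small proof checker, never a trusted compiler.

```ocaml
(* A minimal sketch of the host side of PCC. Types and names are
   illustrative inventions, not any real PCC tool's interface. *)

(* The safety policy is phrased as a logical formula the host trusts. *)
type formula =
  | True
  | And of formula * formula
  | Implies of formula * formula
  | Pred of string * int list          (* e.g. Pred ("readable", [addr]) *)

(* Proofs are explicit objects shipped alongside the code. *)
type proof =
  | TruthIntro
  | AndIntro of proof * proof
  | ImpliesIntro of formula * proof
  | Axiom of string                    (* host-approved axioms only *)

(* The extension: untrusted machine code plus its safety proof. *)
type extension = { code : string (* binary *); proof : proof }

(* The only component the host must trust: a small proof checker.
   It checks that [p] proves [goal]; it never inspects the compiler. *)
let rec check (p : proof) (goal : formula) : bool =
  match p, goal with
  | TruthIntro, True -> true
  | AndIntro (p1, p2), And (f1, f2) -> check p1 f1 && check p2 f2
  | ImpliesIntro (assumption, body), Implies (f1, f2) ->
      assumption = f1 && check body f2   (* a real checker also tracks hypotheses *)
  | Axiom name, Pred _ -> List.mem name ["safe_read"; "safe_write"]
  | _ -> false

(* Hypothetical stub: the host derives the safety condition from the code
   itself (e.g. with a verification-condition generator), so a valid proof
   of the wrong theorem is still rejected. *)
let safety_condition (_code : string) : formula = True

let install_and_run (ext : extension) =
  if check ext.proof (safety_condition ext.code)
  then print_endline "proof ok: extension accepted"
  else print_endline "proof rejected: extension refused"

let () = install_and_run { code = "\x90\x90"; proof = TruthIntro }
```

In a real system the safety condition would come from a verification-condition generator or a type checker for the binary, and the proof would be produced automatically by a certifying compiler rather than written by hand.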
Proof-Carrying Code: An Analogy
[Figure; legend: code, proof]

Proof-carrying code
[Diagram: source code → compiler → native code + proof → checker → host]
Question: Well, aren’t you just avoiding the real problem then? Isn’t it extremely hard to generate the proof?
Yes. But there is a trick.

PCC + Type-Preserving Compilation
[Diagram: typed source code → type-preserving compiler → native code + proof → checker → host]
The trick: we fool the programmer into doing our proof for us!
• We convince them to program in a type-safe language.
• We design our compiler to translate the typing derivation into a proof of safety.
• We can always make this work for type-safety properties.

Good Things About PCC
1. Someone else does the really hard work (the compiler writer)
   • Hard to prove safety, but easy to check a proof
   • Research over the last 5-10 years indicates we can produce proofs of type-safety properties for assembly language
2. Requires minimal trusted infrastructure
   • Trust the proof checker, but not the compiler
   • Again, recent research shows the PCC TCB can be as small as ~3000 LOC
3. Agnostic to how the code and proof are produced
   • Not compiler-specific; hand-optimized code is OK
   • Only limited by the logic that is used (and we can use very general logics)
4. Can be much more general than the JVM type system
5. Coexists peacefully with cryptography
   • Signatures are a syntactic checksum
   • Proofs are a semantic checksum
   • (see Appel & Felten’s proof-carrying authorization)

The Different Flavors of PCC
• Type-theoretic PCC [Morrisett, Walker, et al. 1998]
  – source-level types are translated into low-level types for machine-language or assembly-language programs
  – the proof of safety is a typing derivation that is verified by a type checker
• Logical PCC [Necula, Lee, 1996, 1997]
  – low-level types are encoded as logical predicates
  – a verification-condition generator runs over the program and emits a theorem which, if true, implies the safety of the program
  – the proof of safety is a proof of this theorem
• Foundational PCC [Appel et al. 2000]
  – the semantics of the machine is encoded directly in logic
  – a type system for the machine is built up directly from the machine semantics and proven correct using a general-purpose logic (e.g., higher-order logic)
  – the total TCB is approximately 3000 LOC

The Common Theme
• Every general-purpose system for proof-carrying code relies upon a type system for checking low-level program safety. Why?
  – building a proof of safety for low-level programs is hard
  – success depends upon being able to structure these proofs in a uniform, modular fashion
  – types provide the framework for developing well-structured safety proofs
• In the following lectures, we will study the low-level typing mechanisms that are the basis for powerful systems of proof-carrying code
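As a preview of those low-level typing mechanisms, here is a minimal sketch, in OCaml, of a type checker for a made-up three-instruction assembly language. The instruction set, the ty and reg types, and the typing rules are invented for illustration and are far simpler than any real typed assembly language; the sketch only shows how per-register typing information can make a straight-line block of machine code mechanically checkable.

```ocaml
(* A minimal sketch of low-level type checking in the spirit of
   type-theoretic PCC. All names and rules here are illustrative. *)

type ty = Int | Ptr of ty                  (* word-sized int, pointer to ty *)

type reg = R1 | R2 | R3

(* A register-file type: the type each register is assumed to hold. *)
module RegMap = Map.Make (struct type t = reg let compare = compare end)
type reg_file_ty = ty RegMap.t

type instr =
  | Mov of reg * reg                       (* rd := rs *)
  | AddImm of reg * int                    (* rd := rd + n, integers only *)
  | Load of reg * reg                      (* rd := mem[rs], rs must be a pointer *)

(* Checking one instruction transforms the register-file type, or fails. *)
let check_instr (rf : reg_file_ty) (i : instr) : reg_file_ty option =
  match i with
  | Mov (rd, rs) ->
      (match RegMap.find_opt rs rf with
       | Some t -> Some (RegMap.add rd t rf)
       | None -> None)
  | AddImm (rd, _) ->
      (match RegMap.find_opt rd rf with
       | Some Int -> Some rf
       | _ -> None)                        (* no arithmetic on pointers *)
  | Load (rd, rs) ->
      (match RegMap.find_opt rs rf with
       | Some (Ptr t) -> Some (RegMap.add rd t rf)
       | _ -> None)                        (* loads only through typed pointers *)

(* A straight-line block is safe if every instruction checks in sequence.
   The "proof" a certifying compiler ships is, in essence, this typing
   information for every program point. *)
let rec check_block rf = function
  | [] -> true
  | i :: rest ->
      (match check_instr rf i with
       | Some rf' -> check_block rf' rest
       | None -> false)

let () =
  let rf = RegMap.(empty |> add R1 (Ptr Int) |> add R2 Int) in
  let ok = check_block rf [ Load (R3, R1); AddImm (R3, 4); Mov (R2, R3) ] in
  Printf.printf "block accepted: %b\n" ok
```

The discipline enforced here (no arithmetic on pointers, loads only through typed pointers) is exactly the kind of invariant a PCC proof certifies: discovering it is the compiler's hard problem, but checking it on the host is cheap and simple.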