Secure Certifying Compilation
What do you want to
type check today?
David Walker
Cornell University
Extensible Systems
Many systems have programmable interfaces.
[Figure: code is downloaded, linked, and executed through a system interface]
– printers and editors (PostScript printers, emacs, Word)
– browsers and servers (applets, plugins, CGI scripts)
– operating systems (virus scanners)
– networks (active networks, JINI)
April 12, 2000
David Walker, Cornell University
Extensible Systems: Pros
• Client-side customization
– plug in your own devices, 3rd-party utilities
• Preservation of market-share
– vendors can add features, improve
functionality easily
• System maintenance and evolution
– software subscriptions
Extensible Systems: Cons
• Security
– extensibility opens system to malicious
attacks
– how do we prevent misuse of resources?
• Reliability
– flexibility makes it hard to reason about
system evolution
– how do we limit damage done by erroneous
extensions?
Extensible Systems: Reality
• Strong economic and engineering pros
– Mobile code, systems with programmable
interfaces will proliferate
• A necessity: practical technology for
increasing the security and reliability of
extensible systems
Outline
• Framework for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• An instance [POPL '00]
– Security automaton specifications
– A dependently-typed target language (TAL)
• Related work & research directions
Certified Code
[Figure: untrusted code with an attached certificate is downloaded and checked; the resulting secure code is linked and executed through the system interface]
• Attach annotations/certificate (types, proofs, ...) to
untrusted object code extensions
• Certificates make verification feasible
• Move away from trust-based security & reliability
Certifying Compilation
[Figure: High-level Program → Compile → Annotated IR (with certificate) → Optimize → Transmit]
• Low-level certificate generation must be automated
• Necessary components:
1) a source-level programming language
2) a compiler to compile and optimize source programs while preserving the certificate
3) a certifying target language
Question
How should we obtain the initial certificate?
Answer
• Use a type-safe language
• Type inference relieves the tedium of
proof construction
• Programmers will rewrite programs so
they type check
Certifying Compilation So Far
[Figure: Type-Safe High-level Program → Compile → Typed Program (with types) → Optimize → Transmit]
1) a strongly typed source-level programming language
2) a type-preserving compiler to compile and optimize source programs
3) a certificate language for type-safety properties
Certifying Compilers
• Proof-Carrying Code [Necula & Lee]
– an expressive base logic that can encode many
security policies
– in practice, logic is extended with a type system
– compilers produce type safety proofs
• Typed Assembly Language [Morrisett, Walker, et al.]
– flexible type constructor language that can
encode high-level abstractions
– guarantees type safety properties
Conventional Type Safety
• Conventional types ensure basic safety:
– basic operations performed correctly
– abstraction/interfaces hide data
representations and system code
• Conventional types don't describe complex
security policies
– e.g., policies that depend upon history
• Melissa virus reads Outlook contacts list and then
sends 50 emails
Outline
• Framework for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• An instance [POPL '00]
– Security automaton specifications
– A dependently-typed target language (TAL)
• Related work & research directions
Flexible Security Policies
[Figure: a high-level extension and a security policy enter the compiler, which instruments, analyzes, and optimizes the extension]
• Specify policies independently of extensible system
• Compiler instruments extensions
• Easier to understand, debug, evolve policies
Security Policy Specifications
• Requirement: a language for specifying
security policies
• Features:
– Notation for specifying events of interest
• "network send" and "file read" are security-sensitive
– Notation for specifying illegal behaviour
• a privacy policy: "no send after read"
– A feasible compilation strategy
• must be able to prevent programs from violating the
policy
Examples
• SFI [Wahbe et al.]
– events are read, write, jump
– enforce memory safety properties
• SASI [Erlingsson & Schneider], Naccio [Evans & Twyman]
– flexible policy languages
– not certifying compilers
Putting it Together
– define policies in a high-level, flexible and
system-independent specification language
– instrument system extensions both with
dynamic security checks and static information
– preserve proof of security policy during
compilation and optimization
– verify certified compiler output to reduce TCB
Outline
• Framework for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• An instance [POPL '00]
– Security automaton specifications
– A dependently-typed target language (TAL)
• Related work & research directions
Secure Certified Code
• Overview of Architecture
• Security Automata [Erlingsson & Schneider]
– How to specify security properties
– A simple compilation strategy
• A dependently-typed target language (TAL)
– A brief introduction to TAL
– Extensions for certifying security properties
• theoretical core language proven sound
• can express any security automaton policy
Security Architecture
[Figure: a security automaton specification, a high-level extension, and the system interface are instrumented and annotated to give a secure typed extension and a secure typed interface; the extension is optimized and transmitted, then type checked to yield a secure executable]
Security Automata
• A general mechanism for specifying
security policies
• Specify any safety property
– access control policies:
• “cannot access file foo”
– resource bound policies:
• “allocate no more than 1M of memory”
– the Melissa policy:
• “no network send after file read”
Example
[Figure: automaton with states start, has read, and bad; edges are labeled read(f) and send]
Policy: No send operation after a read operation
States: start, has read, bad
Inputs (program operations): send, read
Transitions (state x input -> state):
– start x read(f) -> has read
Example Cont’d
[Figure: automaton with states start, has read, and bad; edges are labeled read(f) and send]
• S.A.s monitor program execution
• Entering the bad state = security violation
% untrusted program      % s.a.: start state
send();                  % ok -> start
read(f);                 % ok -> has read
send();                  % bad, security violation
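The monitored trace above can be replayed with a small table-driven automaton. This is only a sketch: the names `State`, `TRANSITIONS`, `step`, and `monitor` are illustrative, the self-loop on read in the has read state is an assumption read off the diagram labels, and any undefined (state, operation) pair is treated as a violation.

```python
from enum import Enum


class State(Enum):
    START = "start"
    HAS_READ = "has read"
    BAD = "bad"


# Transition function (state x input -> state) for "no send after read".
# The (HAS_READ, "read") self-loop is an assumption, not stated in the text.
TRANSITIONS = {
    (State.START, "send"): State.START,
    (State.START, "read"): State.HAS_READ,
    (State.HAS_READ, "read"): State.HAS_READ,
    (State.HAS_READ, "send"): State.BAD,
}


def step(state, op):
    """Apply one program operation; unknown pairs count as violations."""
    return TRANSITIONS.get((state, op), State.BAD)


def monitor(trace):
    """Return the states visited, stopping at the first security violation."""
    state = State.START
    visited = [state]
    for op in trace:
        state = step(state, op)
        visited.append(state)
        if state is State.BAD:
            break
    return visited


# The untrusted program from the slide: send(); read(f); send();
states = monitor(["send", "read", "send"])
print([s.value for s in states])  # ['start', 'start', 'has read', 'bad']
```

The final send drives the automaton into bad, which matches the annotations next to the untrusted program.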
Bounding Resource Use
[Figure: automaton with states 0, ..., i, ..., n-1 and bad; edges are labeled malloc(i)]
• Policy: "allocate fewer than n bytes"
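A resource bound can be written as an automaton whose states are the running byte count. The sketch below assumes the states are 0 through n-1 plus bad, as the diagram suggests; `make_malloc_automaton` is a hypothetical name, and rejecting negative request sizes is an added assumption.

```python
BAD = "bad"


def make_malloc_automaton(n):
    """Build the transition function for "allocate fewer than n bytes".

    States are the total bytes allocated so far (0 .. n-1) plus BAD.
    """
    def step(state, i):
        # Once in BAD, stay in BAD. Negative sizes are rejected (assumption).
        if state == BAD or i < 0:
            return BAD
        total = state + i
        return total if total < n else BAD

    return step


step = make_malloc_automaton(1024)
state = 0
for request in (400, 600, 100):
    state = step(state, request)
    print(request, "->", state)
```

With n = 1024 the first two requests move the automaton to states 400 and 1000, and the third would bring the total to 1100, so it enters bad.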
Enforcing S.A. Specs
• Every security-relevant operation has
an associated function: checkop
• Trusted, provided by policy writer
• checkop implements the s.a. transition function
checksend (state) =
if state = start then
start
else
halt % terminates execution
Enforcing S.A. Specs
• Easy: wrap every security-relevant function call in a check:
send()
becomes
let next_state = checksend(current_state) in
send()
• Improve performance using program
analysis
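The wrapping strategy can be sketched as follows. `checksend` follows the definition given earlier, with halt modelled as an exception; `checkread`, `Monitor`, `guard`, and `PolicyViolation` are hypothetical names introduced only for this illustration.

```python
import functools


class PolicyViolation(Exception):
    """Stands in for 'halt': the extension is terminated."""


def checksend(state):
    # Transition function for send: legal only in the start state.
    if state == "start":
        return "start"
    raise PolicyViolation("send attempted in state " + repr(state))


def checkread(state):
    # Assumed counterpart for read: always legal, moves to "has read".
    return "has read"


class Monitor:
    def __init__(self):
        self.state = "start"

    def guard(self, check):
        """Wrap a function so that `check` runs before every call."""
        def decorate(fn):
            @functools.wraps(fn)
            def wrapped(*args, **kwargs):
                # next_state = check(current_state), then the real call
                self.state = check(self.state)
                return fn(*args, **kwargs)
            return wrapped
        return decorate


monitor = Monitor()
log = []


@monitor.guard(checksend)
def send():
    log.append("send")


@monitor.guard(checkread)
def read(f):
    log.append("read " + f)


send()
read("contacts")
try:
    send()  # rejected: the automaton is in "has read"
except PolicyViolation:
    log.append("halted")

print(log)  # ['send', 'read contacts', 'halted']
```

Every call pays for a dynamic check here, which is the cost that program analysis aims to reduce.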
Outline
• Technology for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• Secure certifying compilation [POPL '00]
– Security automaton specifications
– A dependently-typed target language (TAL)
• Related work & research directions
Brief TAL Overview
[Figure: annotated object files are typechecked, then linked]
• Assembly or machine code with typing annotations
• Object files checked separately and linked together
• Ensures basic safety without run-time checks
– Memory safety: can't read/write arbitrary memory
– Control-flow safety: can't execute arbitrary data
– Type abstraction: TAL can encode and enforce high-level
abstract data types
A TAL Compiler
• TAL is practical
– We compile "safe C" (aka Popcorn)
– No pointer arithmetic, unsafe casts
– ML-style data types, polymorphism, exceptions
– Some simple optimizations
• null-check elimination, inlining, register allocation
– The compiler bootstraps
• most compiler hacking by Grossman, Morrisett, Smith
Other TAL Features
• Memory management features
– Stack types
– Aliasing
– Region-based MM
– See Dave’s thesis
• Other features
– Dynamic linking
– Run-time code generation
• http://www.cs.cornell/talc
Typing Assembly Code
• Programs divided into labeled code blocks
• Each block has a code type: {eax:τ1, ebx:τ2, ...}
• Code types specify expected register contents
– Assume code type to check the block
– Prove control transfers (jumps) meet the assumptions
Foo : {eax: int, ecx: {eax: int}}
    mov ebx, 3;      % {eax: int, ebx: int, ecx: {eax: int}}
    add eax, ebx;    % OK
    jmp ecx          % OK
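The checking discipline for the Foo block can be imitated with a toy checker. This is a drastic simplification and not the TAL type system: it models only integer-immediate mov, add, and jmp through a register, and `check_block` and `satisfies` are hypothetical names.

```python
INT = "int"


def satisfies(regs, expected):
    """True if every register the target expects holds the expected type."""
    return all(regs.get(r) == t for r, t in expected.items())


def check_block(code_type, instructions):
    """Check a block under its code type; return the final register typing."""
    regs = dict(code_type)  # assume the code type on entry
    for instr in instructions:
        op, args = instr[0], instr[1:]
        if op == "mov":
            dst, imm = args
            if not isinstance(imm, int):
                raise TypeError("mov: only integer immediates are modelled")
            regs[dst] = INT
        elif op == "add":
            dst, src = args
            if regs.get(dst) != INT or regs.get(src) != INT:
                raise TypeError("add: both operands must hold int")
        elif op == "jmp":
            (target,) = args
            expected = regs.get(target)
            if not isinstance(expected, dict):
                raise TypeError("jmp: " + target + " does not hold a code pointer")
            if not satisfies(regs, expected):
                raise TypeError("jmp: registers do not meet the target's code type")
            return regs
        else:
            raise TypeError("unknown instruction " + op)
    raise TypeError("block must end in a jump")


# Foo : {eax: int, ecx: {eax: int}}
foo_type = {"eax": INT, "ecx": {"eax": INT}}
foo_body = [("mov", "ebx", 3), ("add", "eax", "ebx"), ("jmp", "ecx")]
final = check_block(foo_type, foo_body)
print(final)
```

Replacing the last instruction with a jump through eax would be rejected, because eax holds an int rather than a code pointer.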
Increasing Expressiveness
• Basic types ensure standard type safety
– functions and data used as intended and cannot
be confused
– security checks can’t be circumvented
• Introduce a logic into the type system to
express security invariants
• Use the logic to encode the s.a. policy
• Use the logic to prove checks unnecessary
Target Language Predicates
• States (for compile-time reasoning)
• constants: start, has read, bad, ...
• variables: ρ1, ρ2, ...
• Predicates:
– describe security states
• instate(ρ)
– describe relationships between states
• transsend(ρ1,ρ2)
– describe dependencies between values
• (see the paper)
Preconditions
• Code types can specify preconditions:
foo: [, instate(),   bad].{eax:1, ecx:2}
• A typical use:
bar: {...}
...
% Known: instate(start)
...
jmp foo [start]
April 12, 2000
- instantiate polymorphic variable 
- prove residual preconditions
- eg: instate(start), start  bad
- hope proofs are easy (syntactic matching)
- otherwise place explicit proof at call site
- eg: jmp foo [start, Proof, Proof]
David Walker, Cornell University
35
Postconditions
• Expressed as a precondition on the return
address type:
bar: { eax: τ1, ecx: [instate(has read)].{eax: τ2} }
• Before returning, bar proves instate(has read)
• After return, assume instate(has read)
Encoding Security Automata
• Each security-relevant function has a type
specifying 3 preconditions, 1 postcondition
• the send function:
– P1: instate(curr)
– P2: transsend(curr,next)
– P3: next  bad
Pre: P1, P2, P3
– P4: instate(next)
Post: P4
send: [curr,next,P1,P2,P3].{ ecx: [P4].{ } }
Technical Note
• State predicates behave linearly
– as in linear logic, each state predicate is used once
– instate(curr) is "consumed" at send call site
• can't be used in future proofs
• can't fool type system into thinking code continues
to be in state curr
– instate(next) is "produced" on return
• will be used when next calling a security-sensitive
function
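The interplay of the three preconditions and the linear state predicate can be sketched as a check on a set of available facts. This is an illustration only, not the type system of the paper: `AXIOMS` and `check_call` are hypothetical names, and the transition axioms listed are assumptions based on the "no send after read" automaton.

```python
# Assumed valid instances of trans_op(curr, next) for each operation.
AXIOMS = {
    "send": {("start", "start"), ("has read", "bad")},
    "read": {("start", "has read"), ("has read", "has read")},
}


def check_call(op, facts, curr, nxt):
    """Check a call to a security-relevant function instantiated at [curr, nxt].

    `facts` is a frozenset of available linear facts such as
    ("instate", "start"). On success, instate(curr) is consumed and
    instate(nxt) is produced for the return address.
    """
    p1 = ("instate", curr)
    if p1 not in facts:
        raise TypeError("P1 fails: instate(%s) is not available" % curr)
    if (curr, nxt) not in AXIOMS[op]:
        raise TypeError("P2 fails: trans%s(%s,%s) is not provable" % (op, curr, nxt))
    if nxt == "bad":
        raise TypeError("P3 fails: next state is bad")
    return (facts - {p1}) | {("instate", nxt)}


facts = frozenset({("instate", "start")})
facts = check_call("send", facts, "start", "start")
facts = check_call("read", facts, "start", "has read")
print(sorted(facts))  # [('instate', 'has read')]

# instate(start) was consumed by the read, so it cannot be reused:
try:
    check_call("send", facts, "start", "start")
except TypeError as err:
    print(err)
```

Because the old fact is removed rather than kept, code cannot claim to still be in the start state after a read, and a send from has read fails P3.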
Compile-time & Run-time
• Compile-time reasoning depends on runtime values
foo:
    mov eax, state     % should represent the current state
    mov ecx, ret1
    jmp checksend      % state argument, state result in eax
ret1:
    push eax           % save next state on the stack
    mov ecx, ret2
    jmp send           % must establish precondition for send
checksend:             % postcond. == precond. for ret1, send
Checksend
• A type for checksend (first try)
checksend:
[curr,P1].{eax:state, ecx:[next,P1,P2,P3].{eax:state} }
where
P1 = instate(curr), P2 = transsend(curr,next), P3 = next  bad
• No correspondence between run-time argument
and static information
mov eax, wrong_state; mov ecx, next; jmp checksend
Checksend
• Solution: provide very precise types
• Singleton types
– A type containing one value
– eax : state(start)
• means eax contains a data structure that represents
exactly the start state and no other state
– eax : state()
• eax contains data representing the unknown state 
• useful in many contexts
– Similar to Dependent ML [Xi & Pfenning]
Using Singletons
• checksend
– implements the automaton transition function
• intuitively has type state -> state
• singletons help relate run-time values to compile-time predicates
[curr,P1].{eax:state(curr),ecx:[next,P1,P2,P3].{eax:state(next)}}
– P1 = instate(curr), P2 = transsend(curr,next), P3 = next  bad
Using Checksend
foo: { … }
...
% Assume: instate(curr), eax : state(curr)
mov ecx, ret1
jmp check_send[curr]
ret1: [next, instate(curr), transsend(curr,next), next  bad].
{eax:state(next)}.
push eax;
mov ecx, ret2;
jmp send [curr,next] % P1 & P2 & P3 ==> ok
ret2: ...
Optimization
• Analysis of s.a. structure makes
redundant check elimination possible
– e.g., in the "no send after read" automaton:
[Figure: automaton with states start, has read, and bad; edges are labeled send and read(f)]
– identify transsend(start,start) as valid
Optimization
[Figure: Policy, High-level Interface, and Low-level Interface; the low-level interface gives types to send, read, checksend, and checkread]
Axiom A = transsend(start,start)
Optimization
loop : [instate(start)].{ }
mov ecx, loop
jmp send [start,start,By A];
send: [curr,next,instate(curr),transsend(curr,next),
next  bad].{ecx: [P4].{ }}
• Type-checker is simple but general
• Typical optimizations
– redundant check removal
– loop invariant removal
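Redundant check removal can be approximated by a simple data-flow pass over a straight-line sequence of operations. The sketch below is not the proof-based mechanism of the type checker: it tracks the set of states the automaton might be in and requests a dynamic check only where bad is reachable. `plan_checks` is a hypothetical name, and the transition table assumes the "no send after read" automaton.

```python
# Assumed transition table for the "no send after read" automaton.
TRANSITIONS = {
    ("start", "send"): "start",
    ("start", "read"): "has read",
    ("has read", "read"): "has read",
    ("has read", "send"): "bad",
}


def plan_checks(ops, possible):
    """Decide, per operation, whether a dynamic check is still needed.

    `possible` is the set of states the automaton may be in on entry.
    Returns a list of (op, needs_check) pairs.
    """
    plan = []
    for op in ops:
        successors = {TRANSITIONS[(s, op)] for s in possible}
        if successors == {"bad"}:
            raise ValueError(op + " always violates the policy here")
        # A check is needed only if some possible state leads to bad.
        plan.append((op, "bad" in successors))
        # After a passed check (or a safe op), bad is ruled out.
        possible = successors - {"bad"}
    return plan


# Entry state known to be start: no checks needed (cf. Axiom A).
print(plan_checks(["send", "send"], {"start"}))
# [('send', False), ('send', False)]

# Entry state unknown: the first send is checked, the second is redundant.
print(plan_checks(["send", "send"], {"start", "has read"}))
# [('send', True), ('send', False)]
```

In the second call the first check establishes that the state is start, so the check on the following send can be dropped.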
Implementation
• TALx86 implementation is sufficient for
these encodings
– includes polymorphism, higher-order type
constructors, logical connectives,
singleton types, ...
• Lots more work to be done
– axioms in module interfaces
– policy compiler
Outline
• Technology for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• Secure certifying compilation [POPL '00]
– Security automaton specifications
– A certifying target language
• Related work & research directions
Research Directions
• Design of policy languages
– What kinds of logics can we compile & certify?
• Mawl [Sandholm & Schwartzbach]
• TALres [Crary & Weirich]
• Design of safety architecture
– How do we "clean up" after halting a program?
– Support for mutually distrustful agents
• Policy-directed optimizations
Summary
• A recipe for secure certified code:
– types
• ensure basic safety without run-time overhead
• add a logic to encode complex invariants
– policy-directed code instrumentation
• specify security policies independently of the rest of
the system
• use dynamic checking to enforce policies when
they can’t be proven statically