A Trustworthy Proof Checker
Andrew W. Appel
Neophytos G. Michael
Roberto Virga
Princeton University
Aaron Stump
Stanford University
A trustworthy proof checker for proofs of properties of machine-code programs.
FCS & VERIFY, July 2002
6/21/2016
Trusted Computing Base
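[Figure: two kinds of trusted base, side by side.
 Theorem: a^n + b^n ≠ c^n. Its Proof rests on the Axioms, which are the trusted base.
 Operating System: programs such as gcc, emacs, netscape, rogomatic, and make rest on the Kernel, which is the trusted base.]
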
The problem: Mobile Code Security
Code Producer: Source Program -> Compiler -> Code
Code Consumer: Execute ?

    load r3, 4(r2)
    add r2,r4,r1
    store 1, 0(r7)
    store r1, 4(r7)
    add r7,0,r3
    add r7,8,r7
    beq r3, .-20

Consumer resources at stake: private files, network access, launch control, etc.

Existing Practice: Hardware VM protection
Code Producer: Source Program -> Compiler -> Machine Code
Code Consumer: Execute under the Operating System's virtual memory, which guards the protected resources

Disadvantages:
- Large trusted code base of the O.S.
- Clumsy, slow interfaces between trusted & untrusted code

Existing Practice: Bytecode Verification
Code Producer: Java Program -> Compiler -> ByteCode
Code Consumer: Bytecode Verifier -> OK -> Just-in-time Compiler -> Native code -> Execute
Trusted Computing Base: includes the just-in-time compiler

Advantage:
- Clean, fast, O-O interface between trusted & untrusted code
Disadvantage:
- Huge trusted computing base: JIT

Foundational Proof-Carrying Code
Code Producer: Source Program -> Compiler -> Native Code, plus Hints
               Hints and Machine Spec + Policy -> Prover -> Safety Proof (a nested proof term)
Code Consumer: Native Code and Safety Proof -> Checker (with Machine Spec + Policy) -> OK -> Execute
Trusted Computing Base: Machine Spec + Policy, and the Checker

Trusted Computing Base
- The minimal set of code that must be trusted
- Our goal: make the TCB as small as possible
- The TCB consists of two pieces:
  - The safety policy (a predicate in Higher-Order Logic that characterizes whether a program is safe to execute)
  - The proof-checker (a small C program that checks safety proofs)

Trusted Computing Base (cont.)
1. Safety Policy
   a) Choose a logical framework (programming language for logic): we choose LF
   b) Choose an object logic (axioms, inference rules): we choose Higher-Order Logic
   c) Represent our theorem in the object logic: we will explain...
2. Proof Checking
   Build a proof-checker for the logical framework: we use Twelf to prove theorems, but for checking we want something smaller and simpler . . .

LF, Twelf, and Higher Order Logic
- What is LF? (Harper et al. 1993)
  - A Logical Framework for defining and presenting logics
  - Based on a general treatment of syntax, rules, and proofs by means of a typed first-order λ-calculus
  - Its type system has three levels of terms:
    - Objects
    - Types -- that classify objects
    - Kinds -- that classify families of types
  - Equality is taken as βη-conversion
  - The judgments-as-types principle
- We use the Twelf implementation of LF (Pfenning et al. 99)
- We implement a standard HOL with arithmetic

Programming in Twelf
Define formula constructors (an LF signature):
num  : type.
form : type.
imp  : form -> form -> form.
...

Define proof constructors (axioms):
pf    : form -> type.
imp_i : (pf A -> pf B) -> pf (A imp B).
imp_e : pf (A imp B) -> pf A -> pf B.
...

Theorems, proof checking in HOL

Proof of logical transitivity:
imp_trans: pf (A imp B) -> pf (B imp C)
-> pf (A imp C) =
[p1 : pf (A imp B)]
[p2 : pf (B imp C)]
imp_i [p3 : pf A]
imp_e p2 (imp_e p1 p3).
This shows the general form of a Twelf definition:
name : τ = exp.

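To make the checking step concrete, here is a minimal Python sketch of a checker for just the two rules imp_i and imp_e, applied to the imp_trans proof. It is not LF or Twelf: the class names (Atom, Imp, Hyp, ImpI, ImpE) and the function infer are hypothetical, and the hypothesis formula is passed to ImpI explicitly because this sketch does no term reconstruction.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Imp:
    left: "Form"
    right: "Form"


Form = Union[Atom, Imp]


@dataclass(frozen=True)
class Hyp:
    """A hypothesis: an assumed proof of `form`."""
    form: Form


@dataclass(frozen=True)
class ImpI:
    """imp_i: from a function (pf A -> pf B), build pf (A imp B)."""
    hyp_form: Form
    body: Callable[["Proof"], "Proof"]


@dataclass(frozen=True)
class ImpE:
    """imp_e: from pf (A imp B) and pf A, build pf B."""
    fun: "Proof"
    arg: "Proof"


Proof = Union[Hyp, ImpI, ImpE]


def infer(p: Proof) -> Form:
    """Return the formula that p proves, or raise TypeError if p is ill-formed."""
    if isinstance(p, Hyp):
        return p.form
    if isinstance(p, ImpI):
        # Check the body under a fresh hypothesis of the declared formula.
        return Imp(p.hyp_form, infer(p.body(Hyp(p.hyp_form))))
    if isinstance(p, ImpE):
        f = infer(p.fun)
        if not isinstance(f, Imp) or f.left != infer(p.arg):
            raise TypeError("imp_e applied to mismatched proofs")
        return f.right
    raise TypeError("not a proof term")


def imp_trans(p1: Proof, p2: Proof) -> Proof:
    """Mirror of the Twelf term: imp_i [p3] imp_e p2 (imp_e p1 p3)."""
    a = infer(p1).left  # assumes p1 proves an implication
    return ImpI(a, lambda p3: ImpE(p2, ImpE(p1, p3)))
```

For example, with hypotheses for A imp B and B imp C, infer(imp_trans(p1, p2)) yields the formula A imp C, and swapping the arguments makes infer raise TypeError.
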
The safety policy
“This program accesses memory only in range 0-1000”
“This program never executes an illegal instruction.”
Step I: define access predicates
readable(x) = 0 ≤ x ≤ 1000
writable(x) = 0 ≤ x ≤ 1000
Step II: define legal instructions . . .

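As a small illustration, the two access predicates can be written directly as Python functions. This is only a sketch of the example policy on the slide; the constant names LOW and HIGH are assumptions introduced here.

```python
# Bounds of the example policy "accesses memory only in range 0-1000".
# LOW and HIGH are hypothetical names, not part of the original specification.
LOW, HIGH = 0, 1000


def readable(x: int) -> bool:
    """True when address x may be read under the example policy."""
    return LOW <= x <= HIGH


def writable(x: int) -> bool:
    """True when address x may be written under the example policy."""
    return LOW <= x <= HIGH
```

A real policy could give the two predicates different ranges; they coincide here only because the slide's example does.
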
Machine states, step relation
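Machine State = Register bank + memory
(r,m) ↦ (r',m') : the step relation is a map between machine states

[Figure: register bank r (entries 0, 1, 2, 3, ..., psr, pc) and memory m, mapped by the step to register bank r' and memory m'; the value shown changes from 7 in r to 8 in r'.]
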
Machine instruction = step relation
add r1 := r2 + r3  ≡
    m' = m,  r'(1) = r(2) + r(3),
    r'(pc) = 1 + r(pc),
    ∀i. i ≠ 1 ∧ i ≠ pc → r'(i) = r(i)

[Figure: register banks r and r' (entries 0, 1, 2, 3, ..., psr, pc) with memories m and m'; the values shown are 7, 2, 6 in r and 8, 2, 6 in r'.]

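The idea that an instruction is a relation between states, not a function that is run, can be sketched in Python as a predicate over a before-state and an after-state. The representation is an assumption made for illustration: register banks are dicts keyed by 0, 1, 2, 3, "psr" and "pc", memories are dicts, and add_step is a hypothetical name.

```python
PC = "pc"  # hypothetical key for the program counter in the register bank


def add_step(r, m, rp, mp, d=1, s1=2, s2=3):
    """True iff (r, m) |-> (rp, mp) is a step of  add r_d := r_s1 + r_s2.

    Each conjunct below corresponds to one clause of the relation on the slide.
    """
    return (
        mp == m                                   # memory unchanged
        and rp[d] == r[s1] + r[s2]                # destination gets the sum
        and rp[PC] == 1 + r[PC]                   # program counter advances
        and all(rp[i] == r[i] for i in r if i != d and i != PC)  # rest unchanged
    )
```

The predicate only accepts or rejects a proposed pair of states; it never computes the successor itself, which is what makes it a specification.
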
Instruction decoding; memory policy
load ri := m(rj+k)

w =  | op | d | s1 | s2 |
     | 3  | i | j  | k  |

(r,m) ↦ (r',m')  ≡
    ∃ w,i,j,k.
        m(r(pc)) = w
      ∧ w = 3·2^12 + i·2^8 + j·2^4 + k
      ∧ m' = m
      ∧ readable(r(j) + k)
      ∧ r'(i) = m(r(j) + k)
      ∧ r'(pc) = 1 + r(pc)
      ∧ ∀x. x ≠ i ∧ x ≠ pc → r'(x) = r(x)
    ∨ (...) ∨ (...) ...

[Figure: register bank r (entries 0, 1, 2, 3, ..., psr, pc), with the value 7 shown, and memory m holding the instruction word w.]

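The decoding clause and the memory-policy clause of the load instruction can be sketched together in Python. The sketch assumes each of the four fields is 4 bits wide (which is what the weights 2^12, 2^8, 2^4 suggest, though the slide does not state a range for i, j, k), and it reuses the example bounds 0 to 1000 for readable. The names encode_load, decode and load_step are hypothetical.

```python
def encode_load(i, j, k):
    """Instruction word for  load r_i := m(r_j + k), with opcode 3."""
    return 3 * 2**12 + i * 2**8 + j * 2**4 + k


def decode(w):
    """Split w into the fields (op, d, s1, s2), assuming 4 bits per field."""
    return (w >> 12) & 0xF, (w >> 8) & 0xF, (w >> 4) & 0xF, w & 0xF


def readable(x):
    """Example access policy: addresses 0 through 1000 may be read."""
    return 0 <= x <= 1000


def load_step(r, m, rp, mp):
    """True iff (r, m) |-> (rp, mp) is a legal step of a load instruction."""
    op, i, j, k = decode(m.get(r["pc"], 0))
    if op != 3:
        return False
    addr = r[j] + k
    return (
        mp == m                      # memory unchanged
        and readable(addr)           # the policy: the address must be readable
        and rp[i] == m.get(addr)     # destination gets the loaded word
        and rp["pc"] == 1 + r["pc"]  # program counter advances
        and all(rp[x] == r[x] for x in r if x != i and x != "pc")
    )
```

Because readable is a conjunct of the relation, a load from a forbidden address is simply not a step: the machine has no successor state there, which is how the policy is built into the step relation.
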
Making the specification concise & trustworthy
Described in [Michael & Appel 2000]
- Separate syntax from semantics
- Factor the semantics
- Use “New Jersey Machine-Code Toolkit” to describe syntax
- Automatically translate NJMCT descriptions into concise and readable higher-order logic

Specifying safe execution
- The ↦ relation includes only the legal instructions
- Safety means, “no matter how many instructions you execute, the next instruction is legal”
- The program is meant to be loaded at some start address

loaded(m,start,prog) =
    ∀i ∈ dom(prog). m(start+i) = prog(i)

Example:
loaded(m,100, (9017;4214;8099;4010;6231;1008))

[Figure: memory m holding the words 9017, 4214, 8099, 4010, 6231, 1008 starting at address 100.]

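The loaded predicate translates almost word for word into Python. In this sketch memory is assumed to be a dict from addresses to words and the program a tuple of words, so dom(prog) is the set of indices 0 to len(prog) - 1.

```python
def loaded(m, start, prog):
    """True iff every word prog[i] sits in memory m at address start + i."""
    return all(m.get(start + i) == word for i, word in enumerate(prog))
```

With the slide's example program placed at addresses 100 to 105, loaded(m, 100, prog) holds and loaded(m, 101, prog) does not.
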
Safety theorem
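safe(prog) =
    ∀r,m,start.
        loaded(m,start,prog) ∧ r(pc)=start
        → ∀r',m'. r,m ↦* r',m'
            → ∃r'',m''. r',m' ↦ r'',m''

(↦* means zero or more steps of ↦.)

Theorem to be proved:
safe(9017;4214;8099;4010;6231;1008)

[Figure: register bank r with pc = start and memory m holding 9017, 4214, 8099, 4010, 6231, 1008 from address start, with the rest of memory unknown (?). The definition of safe is part of the Trusted Computing Base.]
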
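The shape of the safety property (every reachable state has a successor) can be illustrated with a bounded check in Python. This is only a testing aid under strong assumptions: it follows a single deterministic step function from a single start state for n steps, whereas the theorem quantifies over all register banks, memories and start addresses and must be proved. The names safe_up_to and step are hypothetical.

```python
def safe_up_to(step, state, n):
    """Check that the start state and the next n - 1 states each have a successor.

    `step` returns the successor state, or None when no legal instruction
    applies (the machine is stuck).
    """
    for _ in range(n):
        nxt = step(state)
        if nxt is None:
            return False  # a reachable state with no legal next step
        state = nxt
    return True
```

A toy machine that counts down and gets stuck at zero passes the check for small n and fails once n reaches the stuck state, while a machine that loops forever passes for every n.
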
Size of Safety Specification (Sparc)
Safety Specification   Lines   Definitions   Share of lines
Logic                    135        61             7%
Arithmetic               160        94             9%
Machine Syntax           460       334            25%
Machine Semantics       1005       692            53%
Safety Predicate         105        25             6%
Total                   1865      1206

Representation Issues in the Specification
- Eliminating Redundancy in LF terms
- Dealing with Arithmetic
- Representation of Axioms and Trusted Definitions:
  - Encoding Higher-Order Logic in LF
  - Polymorphic programming in Twelf
  - Explicit versus implicit programming in Twelf
  - Avoiding term reconstruction

Eliminating Redundancy
- LF signatures contain lots of redundant information
    imp_i : {A: form}{B: form}
            (pf A -> pf B) -> pf (A imp B).
- Twelf’s answer: parameters can be “declared” implicit
    imp_i : (pf A -> pf B) -> pf (A imp B).
- Implicit parameters in the TCB mean type reconstruction in the checker
  - The algorithm is large and complex
  - It relies on higher-order unification, which is undecidable (some valid proofs may fail)

Eliminating Redundancy (cont.)
- On the TCB side:
  - We write axioms & trusted definitions in fully explicit style
- On the proving side:
  - Implicit versus explicit LF term sizes
  - Other approaches to this problem: Necula’s LFi, oracle-based checking
  - We represent proofs as DAGs with structure sharing of common sub-expressions
    - Proof-size blowup is avoided
    - The checker does not need to parse proofs
    - But the constant factor is not so good
- A tradeoff: TCB size versus Proof Size

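Why structure sharing avoids the blowup can be seen by counting nodes. In the Python sketch below, terms are nested tuples (op, arg1, arg2) with strings as leaves; this encoding and the names tree_size, dag_size and tower are assumptions for illustration, not the checker's format.

```python
def tree_size(t):
    """Number of nodes when the term is written out as a tree."""
    if isinstance(t, str):
        return 1
    return 1 + sum(tree_size(child) for child in t[1:])


def dag_size(t, seen=None):
    """Number of distinct sub-terms, i.e. nodes needed when equal sub-terms are shared."""
    seen = set() if seen is None else seen
    if t in seen:
        return 0  # already counted: a DAG stores this sub-term once
    seen.add(t)
    if isinstance(t, str):
        return 1
    return 1 + sum(dag_size(child, seen) for child in t[1:])


def tower(n):
    """A term whose two children are identical at every level."""
    t = "a"
    for _ in range(n):
        t = ("imp", t, t)
    return t
```

For tower(10) the tree has 2047 nodes while the DAG needs only 11, so a fully explicit term that repeats its sub-terms can stay small once they are shared.
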
Term Reconstruction in the Prover
- Twelf’s term reconstruction algorithm (a.k.a. “type inference”) is extremely useful in writing proofs
- Outside the TCB, write “compatibility lemmas” to interface with proofs that are written in implicit style

The Proof Checker
- A small C program (~ 803 lines, 1/3 of the TCB)
- Type checks explicit LF proofs and loads and executes only safe programs
- Makes no use of libraries except read and _exit

Component          Lines   Share
Error Messaging       14      2%
Input/Output          29      4%
Parser               428     52%
DAG creation         111     14%
Type checking        167     21%
Main program          54      7%
Total                803

Why do we need a parser?
- Not for proofs -- they are transmitted to the checker in DAG form
- For axioms! Humans can’t read axioms and trusted definitions in DAG form, and therefore can’t trust them.
  (see Pollack ‘98, “How to believe a machine-checked proof”)

DAG representation of proofs & types
- Each DAG node is 5 words:
    op      opcode
    arg1    left child
    arg2    right child
    type    computed type
    match   weak head normal form
- Entire DAG is transmitted as a single block

Proof-checking measurements
- In the paper, we report a time of 74 seconds to check a benchmark proof (~ 6,000 lines)
- We have improved this to 0.48 seconds
  - Checker marks closed terms
  - Avoid traversing closed terms during substitutions
  - Adds 20 lines to the Proof Checker

[Figure: the DAG node layout with a "cl" (closed) mark added next to op: op cl, arg1, arg2, type, match.]

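The closed-term optimization can be sketched in Python with de Bruijn-indexed terms that record, at construction time, how many enclosing binders they depend on. A term is closed when that count is zero, and substitution returns such a term untouched instead of walking into it. The tuple encoding and all function names are assumptions, and the sketch only handles the simple case of substituting a closed value for the single free variable of a term.

```python
# Terms are tuples whose last element is the number of enclosing binders the
# term depends on (0 means the term is closed).
def var(i):
    return ("var", i, i + 1)


def const(name):
    return ("const", name, 0)


def app(f, a):
    return ("app", f, a, max(f[-1], a[-1]))


def lam(body):
    return ("lam", body, max(body[-1] - 1, 0))


def is_closed(t):
    return t[-1] == 0


def subst(t, value, depth=0, stats=None):
    """Replace de Bruijn variable `depth` in t by the closed term `value`.

    Assumes variable `depth` is the only free variable of t, so no index
    shifting is needed. `stats`, if given, counts the nodes visited.
    """
    if stats is not None:
        stats["visited"] += 1
    if is_closed(t):
        return t  # a closed term cannot mention the variable: skip it entirely
    tag = t[0]
    if tag == "var":
        return value if t[1] == depth else t
    if tag == "app":
        return app(subst(t[1], value, depth, stats),
                   subst(t[2], value, depth, stats))
    return lam(subst(t[1], value, depth + 1, stats))  # tag == "lam"
```

Substituting into an application whose argument is a large closed term visits only three nodes (the application, the variable, and the root of the closed term), however big that closed term is.
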
Smallest possible TCB
[Figure: bar chart of TCB size in 1000s of lines (axis from 0 to 140) for FPCC, SpecialJ, BulletTrain, and Kaffe, with each bar split into "Compiler or Checker" and "Core Runtime".]

Future Work
- Machine descriptions for other CPUs (Mips, Sparc so far)
- TCB is really small but proof sizes are large. Work on finding the right tradeoff between TCB size and proof size:
  - Compress the DAG in some way
  - Use another compressed form of the LF syntactic notation
  - Add a simple Prolog interpreter to the TCB that “rediscovers” the proof based on the sequence of TAL instructions given to the checker
  - TCB no longer minimal but proof sizes greatly reduced