Advanced Development of Certified OS Kernels

Zhong Shao
Bryan Ford
Yale University
November 2010
http://flint.cs.yale.edu/ctos
Focus Areas: Operating Systems, Programming Languages, Formal Methods
Certifying a computing host?
Formal proofs for resilience, extensibility, security?
Need to reason about:
• human behaviors
• cosmic rays + natural disasters
• hardware failure
• software
VIEW #1: bug-free host impossible. Treat it as a biological system.
Certifying a computing host?
Formal proofs for resilience, extensibility, security?
HW & Env Model
Need to reason about:
• human behaviors
• cosmic rays + natural disasters
• hardware failure
• software
VIEW #2: focus on software since it is a rigorous mathematical entity!
Certified software?
[Figure: SOFTWARE (binary machine code) running on top of a HW & Env Model]
Formal specs & proofs for resilience, extensibility, security?
Find a mathematical proof showing that if the HW/Env follows its model, the software will run according to its specification.
Certified OS kernel?
[Figure: application & other system SW running on certified kernel SW, which runs on a HW & Env Model]
Formal specs & proofs for resilience, extensibility, security?
TODO:
• design & develop new OS kernel that can “crash-proof” the entire SW
• need new programming language for writing certified kernels
• need new formal methods for automating proofs & specs
Our approach
• Clean-slate OS kernel design & development
– extensibility via certified plug-ins
– resilience via history-based accountability & recovery
– security via information flow control
• New PLs for building certified kernels
– vanilla C & assembly but w. specialized program logics
– DSL-centric framework for certified linking & programming
– new DSLs & history logics for certifying kernel plugins
• New formal methods for automating proofs & specs
– VeriML and type-safe proof scripts
– automated program verifiers & tools
Clean-slate kernel design
• No more “base kernel” per se
– it is nothing but interacting plug-ins
• The entire kernel is composed of modular, replaceable, and
individually certifiable plug-ins
• Different plug-in classes implement different kernel functions, embodying different safety and correctness models
– device drivers for specific types of HW
– resource managers (schedulers, memory managers, file sys)
– protected executors implementing different secure “sandboxes”
– boot loaders and initialization modules
Extensibility via certified plug-ins
[Figure: layered architecture of the certified kernel.
Applications: legacy x86 app, Java/MSIL app, NaCl app, legacy OS + app.
Protected executors: native executor, typesafe executor, SFI executor, VMM executor.
Resource managers: two schedulers, FS, Net.
Device drivers: four drivers.
Hardware: two CPUs, two GPUs, disk, NIC.]
• each kernel extension is not just “safe” but also “semantically correct”
• protected executors replace the traditional “red line”
• efficient nested virtualization and inter-process communication (IPC)
History-based accountability
• Novel kernel primitives & executors for supporting resilience
– keep a complete history log
– replay, backtrack, recover as we wish
• How to make this efficient?
– enforce “determinism” to avoid logging nondeterministic events
– system-call atomicity for consistent check-pointing (as in Fluke)
• New techniques for history optimization & compression
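The replay idea above can be sketched in a few lines. This is only an illustration, assuming a hypothetical deterministic transition function `step` and a hypothetical `HistoryLog` that records external inputs only; none of these names come from the kernel design itself. Because `step` is deterministic, the log never needs to store derived state, and any earlier state can be rebuilt by replaying a prefix of the log.

```python
from dataclasses import dataclass, field


def step(state: int, event: int) -> int:
    """Deterministic transition: the next state depends only on (state, event)."""
    return (state * 31 + event) % 1_000_003


@dataclass
class HistoryLog:
    """Hypothetical history log: stores external inputs, never derived state."""
    initial: int = 0
    events: list = field(default_factory=list)

    def record(self, event: int) -> None:
        self.events.append(event)

    def replay(self, upto=None) -> int:
        """Rebuild the state after the first `upto` events (all events if None)."""
        state = self.initial
        for event in self.events[:upto]:
            state = step(state, event)
        return state


def run_live(log: HistoryLog, inputs) -> int:
    """Run the system on live inputs, logging each one before applying it."""
    state = log.initial
    for event in inputs:
        log.record(event)
        state = step(state, event)
    return state


log = HistoryLog(initial=7)
live_state = run_live(log, [3, 1, 4, 1, 5])
replayed_state = log.replay()            # full replay reproduces the live state
backtracked_state = log.replay(upto=2)   # backtrack to the state after two events
```

If `step` consulted a clock or a random source, the log would also have to record those values; enforcing determinism is what keeps the log small.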
Information flow control (IFC)
• New kernel primitives for explicit control of IFC labels
– follow previous work on HiStar & Loki
– but want to have the security monitors (or plug-ins) certified
– and also enforce IFC across heterogeneous “executors”
– challenge: how does language-based IFC (e.g., Jif) differ from OS-based IFC?
• New techniques addressing covert timing channels
– timeshare the processes without restriction
– but enforce “determinism” to prevent each process from reading the time
– a “read time” request would lead to an IFC “taint” fault
– the handler will migrate the process to a non-time-shared CPU
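The timing-channel mechanism above can be mimicked with a small simulation. This is a sketch under stated assumptions: `Process`, `TaintFault`, `read_time`, `handle_taint_fault`, and `sys_read_time` are hypothetical names, and the "CPU" is just a string field. It shows only the control flow: a read-time request on a time-shared CPU faults, the handler taints and migrates the process, and the request is then retried.

```python
class TaintFault(Exception):
    """Raised when a process on a time-shared CPU asks for the current time."""


class Process:
    def __init__(self, name):
        self.name = name
        self.cpu = "shared"      # every process starts on a time-shared CPU
        self.labels = set()      # IFC labels currently carried by the process


def read_time(proc, clock):
    # On a time-shared CPU the clock value could reveal how other processes
    # were scheduled, so the request is refused with a taint fault.
    if proc.cpu == "shared":
        raise TaintFault(proc.name)
    return clock()


def handle_taint_fault(proc):
    # Fault handler: mark the process as time-tainted and move it to a
    # CPU that is not time-shared.
    proc.labels.add("time")
    proc.cpu = "dedicated"


def sys_read_time(proc, clock):
    """Entry point: try the read, and on a taint fault migrate and retry."""
    try:
        return read_time(proc, clock)
    except TaintFault:
        handle_taint_fault(proc)
        return read_time(proc, clock)


reader = Process("reader")
bystander = Process("bystander")
observed = sys_read_time(reader, lambda: 42)
```

A process that never asks for the time, such as `bystander` here, keeps running on the shared CPU with no restriction.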
Outline of this talk
• Clean-slate OS kernel design & development
– extensibility via certified plug-ins
– resilience via history-based accountability & recovery
– security via information flow control
• New PLs for building certified kernels
– vanilla C & assembly but w. specialized program logics
– DSL-centric framework for certified linking & programming
– new DSLs & history logics for certifying kernel plugins
• New formal methods for automating proofs & specs
– VeriML and type-safe proof scripts
– automated program verifiers & tools
Components of a certified framework
[Figure: the dependability claim & spec, the proof, and the machine code are fed to a proof checker, which answers Yes or No. The HW & env model covers CPUs, devices & memory, and humans & the physical world.]
• certified software (proof + machine code)
• dependability claim & spec
• HW & env model
• mechanized meta-logic
• proof checker
Case study: a Mini-OS
• 1300 lines of 16-bit x86 code
• Bootable!
[Figure: the Mini-OS. Threads, each with its own context (ctxt); a thread library (spawn, yield, exit, lock, monitors, …); a scheduler; keyboard (KBD) and timer interrupts; a bootloader. The diagram is labeled 33 MHz.]
How to certify this code?
Certifying the Mini-OS
1300 lines of code:
• bootloader
• scheduler
• timer int. handler
• thread lib: spawn, exit, yield, …
• sync. lib: locks and monitors
• keyboard driver
• keyboard int. handler
• …
Many challenges:
• Code loading
• Low-level code: C/Assembly
• Concurrency
• Interrupts
• Device drivers / IO
• Certifying the whole system
• Many different features
• Different abstraction levels
Domain-specific program logics
• One logic for all code
– Consider all possible interactions.
– Very difficult!
• Reality: domain-specific logics
– Only limited combinations of features are used.
– It’s simpler to use a specialized logic for each combination.
– Interoperability between logics
For each DSL, use as much automation as possible!
[Figure: four logics labeled L1, L2, L3, L4]
Our solution
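[Figure: the OS is split into code modules C1 … Cn; each module Ci is paired with its own logic Li; the logics sit on OCAP, which in turn sits on the formalized HW & env model and the mechanized meta-logic.]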
A toy machine
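(program) P ::= (C, S, pc)
(code heap) C ::= {f ↝ I}*
(state) S ::= (H, R)
(data heap) H: memory cells at addresses 0, 1, 2, …
(register file) R: registers r1, r2, r3, …, rn
(instr. seq.) I: addu …; lw …; sw …; …; j f
[Figure: code heap C with blocks f1: I1, f2: I2, f3: I3, …; data heap H; register file R; program counter pc.]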
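The toy machine above is small enough to execute directly. The interpreter below is a minimal sketch, assuming MIPS-like meanings for `addu`, `lw`, and `sw` (the slide elides their operands) and adding a hypothetical `halt` instruction so that the example terminates; the tuple encoding of instructions is also an assumption.

```python
def run(code, heap, regs, pc, max_steps=1000):
    """Run a program P = (C, S, pc) with state S = (H, R).

    `code` is the code heap C: a map from labels to instruction sequences.
    Each sequence must end in ("j", label) or the hypothetical ("halt",).
    """
    heap, regs = dict(heap), dict(regs)   # work on copies of H and R
    for _ in range(max_steps):
        for ins in code[pc]:
            op = ins[0]
            if op == "addu":              # rd := rs + rt (mod 2^32)
                _, rd, rs, rt = ins
                regs[rd] = (regs[rs] + regs[rt]) % 2**32
            elif op == "lw":              # rd := H[R[rs] + off]
                _, rd, off, rs = ins
                regs[rd] = heap[regs[rs] + off]
            elif op == "sw":              # H[R[rs] + off] := rt
                _, rt, off, rs = ins
                heap[regs[rs] + off] = regs[rt]
            elif op == "j":               # jump to another code block
                pc = ins[1]
                break
            elif op == "halt":            # assumption: not part of the slide
                return heap, regs, pc
            else:
                raise ValueError(f"unknown instruction {op!r}")
        else:
            raise ValueError(f"block {pc!r} does not end in a jump")
    raise RuntimeError("step budget exhausted")


# Code heap with three blocks: load two cells, add them, store the sum.
code = {
    "f1": [("lw", "r1", 0, "r3"), ("lw", "r2", 1, "r3"), ("j", "f2")],
    "f2": [("addu", "r1", "r1", "r2"), ("sw", "r1", 2, "r3"), ("j", "f3")],
    "f3": [("halt",)],
}
final_heap, final_regs, final_pc = run(
    code, heap={0: 5, 1: 7, 2: 0}, regs={"r1": 0, "r2": 0, "r3": 0}, pc="f1"
)
```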
Program specifications
(spec): a finite map from code labels f to specifications, in the same style as the code heap
[Figure: the toy machine from the previous slide, with specifications numbered 1, 2, 3 attached to code blocks f1, f2, f3.]
Domain-specific logics
How to link modules?
[Figure: modules (C1, L1) … (Cn, Ln) sit on the OCAP rules, the formalized HW & env model, and the mechanized meta-logic; different logics may use different forms of specification.]
How to link modules?
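[Figure: one module contains “call f”; the function f is certified in a different logic.
• a type-style specification {r1:τ1, …, rn:τn}, with interpretation ( _ )t
• a Hoare-style specification {P}_{Q}, with interpretation ( _ )h
• the interpretations yield assertions a and a']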
How to link modules (cont’d)?
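[Figure: repeated from the previous slide: specifications {r1:τ1, …, rn:τn} and {P}_{Q}, interpretations ( _ )t and ( _ )h, assertions a and a'.]
How to define interpretation?
• Encode the invariant enforced in our invariant-based proof methodology.
• a should be expressive enough to encode Inv.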
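One way to picture the interpretation functions is as translators from each logic's specification form into a common kind of assertion: a predicate on machine states. The sketch below assumes that reading. `interp_types` and `interp_hoare` are hypothetical stand-ins for ( _ )t and ( _ )h, and `implies_on` only tests the linking condition on sample states; it does not prove it, whereas the framework described here establishes such facts by proof in the meta-logic.

```python
def interp_types(reg_types):
    """Stand-in for ( _ )t: a register typing {r: type} becomes a state predicate."""
    def assertion(heap, regs):
        return all(isinstance(regs.get(r), ty) for r, ty in reg_types.items())
    return assertion


def interp_hoare(pre):
    """Stand-in for ( _ )h: a Hoare-style precondition is already a state predicate."""
    def assertion(heap, regs):
        return bool(pre(heap, regs))
    return assertion


def implies_on(samples, a, a_prime):
    """Test (not prove) that a implies a' on the given sample states."""
    return all(a_prime(h, r) for h, r in samples if a(h, r))


# Caller side, Hoare-style: r1 is non-negative and r2 is a valid heap address.
a = interp_hoare(lambda h, r: r["r1"] >= 0 and r["r2"] in h)
# Callee side, type-style: f expects both registers to hold integers.
a_prime = interp_types({"r1": int, "r2": int})

samples = [
    ({0: 1}, {"r1": 3, "r2": 0}),
    ({0: 1}, {"r1": -1, "r2": 0}),
    ({}, {"r1": 2, "r2": "x"}),
]
call_is_safe = implies_on(samples, a, a_prime)      # caller's a entails callee's a'
converse_holds = implies_on(samples, a_prime, a)    # the converse fails on sample 2
```

Once both specifications live in the same assertion language, checking a cross-logic call reduces to an implication between assertions, which is why a must be expressive enough to encode each logic's invariant.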
The OCAP framework [TLDI'07,VSTTE’08]
an Open framework for Certified Assembly Programming
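[Figure: logics such as TAL, XCAP, SCAP, AIM, … serve as L1 … Ln for certifying modules C1 … Cn. Each logic Li is shown sound through its interpretation ( )Li into the OCAP inference rules; OCAP soundness rests on the formalized HW & env model and the mechanized meta-logic.]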
DSLs for writing certified plug-ins
• SCAP: stack-based control abstractions [PLDI’06]
• SAGL: modular concurrency verification [ESOP’07]
• CMAP: dynamic thread creation [ICFP’05]
• XCAP: embedded code pointers [POPL’06]
• GCAP: dynamic loading & self-modifying code [PLDI’07a]
• Certified garbage collectors & linking w. mutators [PLDI’07b,TASE’07]
• Certified context switch libraries [TPHOLs07]
• AIM: preemptive thread impl. w. HW interrupts [PLDI’08,VSTTE’08]
• Certified code running on relaxed memory models [ESOP’10]
• HLRG: certified code w. optimistic concurrency [CONCUR’10]
See http://flint.cs.yale.edu for more details
New OCAP & DSLs
• More realistic HW & environment modeling
• Extend OCAP to certify advanced security & correctness properties
– semantic model parameterized over the HW & env semantics
– identify invariants for different plug-in classes & executors
– certified linking of heterogeneous components
• New DSLs to certify kernel plug-ins
– virtual memory management
– thread & process management & IPC
– file system
New OCAP & DSLs (cont’d)
• New DSLs for deterministic concurrency
• New DSLs for information flow control (IFC)
– language-based IFC vs OS-based IFC
– variable- vs file or process granularity
– relationship w. rely-guarantee & concurrent separation logic
• New DSLs for persistence, recovery, and SW transaction
– based on our new history logic HLRG [CONCUR’10]
– combining temporal reasoning with local rely-guarantee
– pre/post conditions and invariants specify history traces
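To make "conditions that specify history traces" concrete, the sketch below treats a history as a plain list of states and writes conditions as predicates over the whole list. It is an illustration only: `always`, `once`, and `stepwise` are hypothetical helper names, and the example checks one recorded trace at run time, whereas a history logic such as HLRG reasons about all possible traces statically.

```python
def always(pred, trace):
    """pred holds in every state of the history."""
    return all(pred(s) for s in trace)


def once(pred, trace):
    """pred held in at least one state of the history."""
    return any(pred(s) for s in trace)


def stepwise(rel, trace):
    """rel holds between every pair of consecutive states (a guarantee-style condition)."""
    return all(rel(before, after) for before, after in zip(trace, trace[1:]))


# Hypothetical history of a versioned cell, as used in optimistic concurrency.
history = [
    {"version": 0, "value": "a"},
    {"version": 1, "value": "b"},
    {"version": 1, "value": "b"},
    {"version": 2, "value": "c"},
]

# Invariant over the history: versions are never negative.
inv_ok = always(lambda s: s["version"] >= 0, history)
# Guarantee between steps: the version never goes backwards.
monotone = stepwise(lambda s, t: s["version"] <= t["version"], history)
# Postcondition of a reader that returned "b": that value was current at some point.
read_ok = once(lambda s: s["value"] == "b", history)
# The same postcondition fails for a value that never appeared.
bogus_read = once(lambda s: s["value"] == "z", history)
```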
Certified thread impl. in Coq [PLDI’08]
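[Figure: layered stack of the Coq development, top to bottom:
• locks, condition variables
• timer handler, yield/sleep
• switch, block, unblock
• AIM logic & soundness (26,000 lines)
• OCAP, separation logic, SCAP
• utilities, e.g. queues (4,000 lines)
• x86 semantics, a subset (3,300 lines)
• Coq (higher-order logic with inductive definitions)
Other per-layer sizes shown in the figure: 12,000, 26,000, 6,300, 1,300, and 1,700 lines.]
Around 82,000 lines of Coq code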
See http://flint.cs.yale.edu/publications/aim.html
Related projects
• seL4 [Klein et al SOSP’09] in Isabelle/HOL
– 8700 lines of C and 600 lines of assembly
– 7500 lines of C certified in 24 person years
– the rest is not certified (assembly, initialization & virtual memory)
– no concurrency, interrupts, mem alloc in the kernel
• Verve [Yang & Hawblitzel PLDI’10] in Boogie/Z3
– 1400 lines of assembly (nucleus) + C# kernel from Singularity
– the nucleus certified in 9 person months
– C# kernel compiled to TAL via a type-preserving compiler
– no proof objects; linking not certified; no meta theory for TAL
Challenge: need both automation (from first-order provers) & expressiveness (from Coq / HOL)
VeriML [ICFP’10]
• Proofs are more effectively done by writing new tactics:
we define them as “functions that operate on logical terms (specs & proofs)
and produce other logical terms”
• VeriML --- a new general purpose PL for manipulating logical terms
– ML core calculus (keep expressivity)
– extended w. dependent types for logical terms
– but can still “operate on” logical terms
– use a logic similar to HOL w. inductive defs & explicit proof objects
• VeriML type system guarantees validity of logical terms & safe
handling of binding
See http://flint.cs.yale.edu/publications/veriml.html
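The notion of a tactic as a function from logical terms to logical terms can be illustrated outside VeriML. The Python sketch below is not VeriML code: it uses a hypothetical tuple encoding of a tiny propositional fragment, a tactic `auto` that maps a goal to a proof term (or `None`), and a small checker `check`. Python has no dependent types, so the guarantee that VeriML's type system gives statically is approximated here by running the checker on the result.

```python
# Formulas: ("true",), ("false",), ("and", A, B), ("or", A, B)
# Proofs:   ("trivial",), ("pair", p, q), ("left", p), ("right", p)


def check(proof, goal):
    """Small trusted checker: does `proof` establish `goal`?"""
    tag = proof[0]
    if tag == "trivial":
        return goal == ("true",)
    if tag == "pair":
        return (goal[0] == "and"
                and check(proof[1], goal[1])
                and check(proof[2], goal[2]))
    if tag == "left":
        return goal[0] == "or" and check(proof[1], goal[1])
    if tag == "right":
        return goal[0] == "or" and check(proof[1], goal[2])
    return False


def auto(goal):
    """Tactic: takes a logical term (the goal), returns a proof term or None."""
    tag = goal[0]
    if tag == "true":
        return ("trivial",)
    if tag == "and":
        p, q = auto(goal[1]), auto(goal[2])
        return ("pair", p, q) if p is not None and q is not None else None
    if tag == "or":
        p = auto(goal[1])
        if p is not None:
            return ("left", p)
        q = auto(goal[2])
        return ("right", q) if q is not None else None
    return None  # ("false",) and anything unrecognised: no proof found


goal = ("and", ("true",), ("or", ("false",), ("true",)))
proof = auto(goal)
```

In this untyped setting a buggy tactic is caught only when `check` runs; the point of typing tactics is to rule out such bugs before the tactic is ever executed.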
VeriML vs Coq
Three ways to write tactics in Coq:
• ML
– untyped tactics, high barrier; requires knowledge of implementation internals
• Ltac
– untyped tactics, somewhat limited programming model
• Proof-by-reflection
– strong static guarantees but very limited programming model
VeriML enables all points from no static guarantees to strong ones, yet with the full ML programming model
Automated program verifiers & tools
• Build certified program verifiers for each DSL
– some are decidable
• Develop new VeriML tactics
– certifying compiler, linker, assembler
– static analysis
– decision procedures (e.g., Omega, SMT solvers)
• Connecting with first-order theorem provers
– let them generate hints or witnesses
– add an additional validation phase to build the proof objects
• Better proof witness: type-safe VeriML proof scripts
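The "hints plus validation phase" pattern can be shown with a deliberately simple stand-in. In the sketch below the untrusted prover is a brute-force SAT search and the witness is a satisfying assignment; `untrusted_solver` and `validate` are hypothetical names, and clauses use a DIMACS-like encoding (positive integer for a variable, negative for its negation). Only `validate` needs to be trusted. Unsatisfiability results would need a richer witness format, which this sketch does not cover.

```python
from itertools import product


def untrusted_solver(clauses, num_vars):
    """Stand-in for an external prover: returns a candidate assignment or None."""
    for bits in product([False, True], repeat=num_vars):
        candidate = {var + 1: bit for var, bit in enumerate(bits)}
        if all(any(candidate[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return candidate
    return None


def validate(clauses, witness):
    """Small trusted checker: accept the witness only if every clause is satisfied."""
    for clause in clauses:
        satisfied = False
        for lit in clause:
            value = witness.get(abs(lit))
            if value is not None and value == (lit > 0):
                satisfied = True
                break
        if not satisfied:
            return False
    return True


# (x1 or x2) and (not x1 or x2) and (not x2 or x3)
clauses = [[1, 2], [-1, 2], [-2, 3]]
witness = untrusted_solver(clauses, num_vars=3)
accepted = validate(clauses, witness)
forged_rejected = not validate(clauses, {1: True, 2: False, 3: False})
```

The solver can be arbitrarily complex or buggy without affecting soundness, because a wrong answer is rejected by the validator rather than trusted.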
Conclusions
[Figure: the overall picture. Application & other system SW (legacy x86 app, Java/MSIL app, NaCl app, legacy OS + app) runs on the certified kernel (protected executors: native, typesafe, SFI, VMM; resource managers: schedulers, FS, Net; device drivers), which runs on the HW & Env Model (CPUs, GPUs, disk, NIC).]
Formal specs & proofs for resilience, extensibility, security?
Key innovations:
• new OS kernel that can “crash-proof” the entire SW
• new PLs for writing certified kernel plug-ins (new OCAP + DSLs)
• new formal methods for automating proofs & specs (VeriML)
Advanced Development of Certified OS Kernels
Prof. Zhong Shao (PI) & Prof. Bryan Ford (Co-PI), Yale University
[Figure: certified kernel architecture. Legacy x86 app, Java/MSIL app, NaCl app, and legacy OS + app run on the native, typesafe, SFI, and VMM executors; the certified kernel comprises protected executors, resource managers (schedulers, FS, Net), and device drivers for CPUs, GPUs, disk, and NIC.]
STATUS QUO
Components in traditional OS kernels can interfere with each other in arbitrary ways.
• A single kernel bug can wreck the entire system’s integrity & protection
• Poor support for recovery & security
NEW INSIGHTS
Only a limited set of features at a certain abstraction layer is used in specific kernel modules.
• Structuring the kernel using certified abstraction layers will minimize unwanted interference & maximize modularity and extensibility
PROPOSED ACHIEVEMENT
MAIN OBJECTIVE:
To develop a novel certified OS kernel that offers (1) safe & application-specific extensibility, (2) provable security properties with information flow control, and (3) accountability & recovery from hardware or application failures.
KEY INNOVATIONS:
• Secure & flexible kernel via certified plug-ins
• History-based accountability & recovery mechanism
• Provably correct security monitor for IFC
• A new DSL-centric open framework for certified decomposition, programming & linking
• New DSLs/history-logic for certifying kernel modules
• Novel VeriML language & tools that can combine automation with modular proofs
EXPECTED IMPACT
• Machine-checkable formal guarantees about OS kernel safety and security
• Reliable crash recovery & accountability mechanisms
• A solid base for building adaptive immunity mechanisms
• A new programming paradigm for building certified bug-free software
OTHER UNIQUE ASPECTS
• Synergistic co-development effort combining novel advances in OS, prog lang & env, and formal methods
• New VeriML/OCAP programming environment for building certified system software
A crash-proof computing host needs to have a certified OS kernel to serve as its bedrock.