SLAM (PLDI 2003)

advertisement
Software Model Checking with
SLAM
Thomas Ball
Testing, Verification and Measurement
Sriram K. Rajamani
Software Productivity Tools
Microsoft Research
http://research.microsoft.com/slam/
People behind SLAM
Summer interns
– Sagar Chaki, Todd Millstein, Rupak Majumdar (2000)
– Satyaki Das, Wes Weimer, Robby (2001)
– Jakob Lichtenberg, Mayur Naik (2002)
Visitors
– Giorgio Delzanno, Andreas Podelski, Stefan Schwoon
Windows Partners
– Byron Cook, Vladimir Levin, Abdullah Ustuner
Outline
• Part I: Overview (30 min)
– overview of SLAM process
– demonstration (Static Driver Verifier)
– lessons learned
• Part II: Basic SLAM (1 hour)
– foundations
– basic algorithms (no pointers)
• Part III: Advanced Topics (30 min)
– pointers + procedures
– imprecision in aliasing and mod analysis
Part I: Overview of SLAM
What is SLAM?
SLAM is a software model checking project at
Microsoft Research
– Goal: Check C programs (system software)
against safety properties using model checking
– Application domain: device drivers
Starting to be used internally inside Windows
– Working on making this into a product
Rules
Static Driver Verifier
Read for
understanding
New API rules
Development
Precise
API Usage Rules
(SLIC)
Defects
Drive testing
tools
Software Model
Checking
100% path
coverage
Source Code
Testing
SLAM – Software Model Checking
• SLAM innovations
– boolean programs: a new model for software
– model creation (c2bp)
– model checking (bebop)
– model refinement (newton)
• SLAM toolkit
– built on MSR program analysis infrastructure
SLIC
• Finite state language for stating rules
– monitors behavior of C code
– temporal safety properties (security automata)
– familiar C syntax
• Suitable for expressing control-dominated
properties
– e.g. proper sequence of events
– can encode data values inside state
State Machine
for Locking
state {
enum {Locked,Unlocked}
s = Unlocked;
}
Rel
Acq
Unlocked
Locked
Rel
Acq
Error
Locking Rule in
SLIC
KeAcquireSpinLock.entry {
if (s==Locked) abort;
else s = Locked;
}
KeReleaseSpinLock.entry {
if (s==Unlocked) abort;
else s = Unlocked;
}
The SLAM Process
c2bp
prog. P
SLIC rule
slic
prog. P’
boolean
program
bebop
predicates
path
newton
Example
Does this code
obey the
locking rule?
do {
KeAcquireSpinLock();
nPacketsOld = nPackets;
if(request){
request = request->Next;
KeReleaseSpinLock();
nPackets++;
}
} while (nPackets != nPacketsOld);
KeReleaseSpinLock();
Example
Model checking
boolean program
(bebop)
do {
KeAcquireSpinLock();
U
L
if(*){
L
L
KeReleaseSpinLock();
U
L
U
L
U
U
E
}
} while (*);
KeReleaseSpinLock();
Example
Is error path feasible
in C program?
(newton)
do {
KeAcquireSpinLock();
U
L
nPacketsOld = nPackets;
L
L
U
L
U
L
U
U
E
if(request){
request = request->Next;
KeReleaseSpinLock();
nPackets++;
}
} while (nPackets != nPacketsOld);
KeReleaseSpinLock();
Example
Add new predicate
b : (nPacketsOld == nPackets) to boolean program
(c2bp)
do {
KeAcquireSpinLock();
U
L
nPacketsOld = nPackets; b = true;
L
L
U
L
U
L
U
U
E
if(request){
request = request->Next;
KeReleaseSpinLock();
nPackets++; b = b ? false : *;
}
} while (nPackets != nPacketsOld); !b
KeReleaseSpinLock();
Example
b : (nPacketsOld == nPackets)
do {
KeAcquireSpinLock();
U
L
b = true;
b L
if(*){
b L
b U
b L
!b U
b L
U
b U
E
KeReleaseSpinLock();
b = b ? false : *;
}
} while ( !b );
KeReleaseSpinLock();
Model checking
refined
boolean program
(bebop)
Example
b : (nPacketsOld == nPackets)
do {
KeAcquireSpinLock();
U
L
b = true;
b L
if(*){
b L
b U
b L
b L
b U
!b U
KeReleaseSpinLock();
b = b ? false : *;
}
} while ( !b );
KeReleaseSpinLock();
Model checking
refined
boolean program
(bebop)
Observations about SLAM
• Automatic discovery of invariants
– driven by property and a finite set of (false) execution paths
– predicates are not invariants, but observations
– abstraction + model checking computes inductive invariants
(boolean combinations of observations)
• A hybrid dynamic/static analysis
– newton executes path through C code symbolically
– c2bp+bebop explore all paths through abstraction
• A new form of program slicing
– program code and data not relevant to property are dropped
– non-determinism allows slices to have more behaviors
Part I: Demo
Static Driver Verifier results
Part I: Lessons Learned
SLAM
• Boolean program model has proved itself
• Successful for domain of device drivers
– control-dominated safety properties
– few boolean variables needed to do proof or find real
counterexamples
• Counterexample-driven refinement
– terminates in practice
– incompleteness of theorem prover not an issue
What is hard?
• Abstracting
– from a language with pointers (C)
– to one without pointers (boolean programs)
• All side effects need to be modeled by
copying (as in dataflow)
• Open environment problem
What stayed fixed?
• Boolean program model
• Basic tool flow
• Repercussions:
– newton has to copy between scopes
– c2bp has to model side-effects by value-result
– finite depth precision on the heap is all
boolean programs can do
What changed?
• Interface between newton and c2bp
• We now use predicates for doing more
things
• refine alias precision via aliasing predicates
• newton helps resolve pointer aliasing imprecision
in c2bp
Scaling SLAM
• Largest driver we have processed has ~60K
lines of code
• Largest abstractions we have analyzed have
several hundred boolean variables
• Routinely get results after 20-30 iterations
• Out of 672 runs we do daily, 607 terminate within
20 minutes
Scale and SLAM components
• Out of 67 runs that time out, tools that take longest time:
– bebop: 50, c2bp: 10, newton: 5, constrain: 2
• C2bp:
– fast predicate abstraction (fastF) and incremental predicate
abstraction (constrain)
– re-use across iterations
• Newton:
– biggest problems are due to scope-copying
• Bebop:
– biggest issue is no re-use across iterations
– solution in the works
SLAM Status
•
2000-2001
– foundations, algorithms, prototyping
– papers in CAV, PLDI, POPL, SPIN, TACAS
•
March 2002
– Bill Gates review
•
May 2002
– Windows committed to hire two people with model checking background to
support Static Driver Verifier (SLAM+driver rules)
•
July 2002
– running SLAM on 100+ drivers, 20+ properties
•
September 3, 2002
– made initial release of SDV to Windows (friends and family)
•
April 1, 2003
– made wide release of SDV to Windows (any internal driver developer)
Part II: Basic SLAM
CTypes

::=
void | bool | int | ref 
Expressions
e
::=
c | x | e1 op e2 | &x | *x
LExpression
l
::=
x | *x
Declaration
d
::=

Statements
s
::=
skip
|
assume(e)
|
l = e
|
l = f (e1 ,e2 ,…,en )
|
return x
|
s1 ; s2 ;… ; sn
x1,x2 ,…,xn
| goto L1,L2 …Ln
Procedures
p
::=
 f (x1: 1,x2 : 2,…,xn : n )
Program
g
::=
d1 d 2 … dn p1 p2 … pn
| L: s
C-Types

::=
void | bool | int
Expressions
e
::=
c | x | e1 op e2
LExpression
l
::=
x
Declaration
d
::=

Statements
s
::=
skip
|
assume(e)
|
l = e
|
f (e1 ,e2 ,…,en )
|
return
|
s1 ; s2 ;… ; sn
x1,x2 ,…,xn
| goto L1,L2 …Ln
Procedures
p
::=
f (x1: 1,x2 : 2,…,xn : n )
Program
g
::=
d1 d 2 … dn p1 p2 … pn
| L: s
BP
Types

::=
void | bool
Expressions
e
::=
c | x | e1 op e2
LExpression
l
::=
x
Declaration
d
::=

Statements
s
::=
skip
|
assume(e)
|
l = e
|
f (e1 ,e2 ,…,en )
|
return
|
s1 ; s2 ;… ; sn
x1,x2 ,…,xn
| goto L1,L2 …Ln
Procedures
p
::=
f (x1: 1,x2 : 2,…,xn : n )
Program
g
::=
d1 d 2 … dn p1 p2 … pn
| L: s
Syntactic sugar
goto L1, L2;
if (e) {
S1;
} else {
S2;
}
S3;
L1: assume(e);
S1;
goto L3;
L2: assume(!e);
S2;
goto L3;
L3: S3;
Example, in C
int g;
main(int x, int y){
cmp(x, y);
if (!g) {
if (x != y)
assert(0);
}
}
void cmp (int a , int b) {
if (a == b)
g = 0;
else
g = 1;
}
Example, in C-int g;
main(int x, int y){
void cmp(int a , int b) {
goto L1, L2;
cmp(x, y);
L1: assume(a==b);
g = 0;
return;
assume(!g);
assume(x != y)
assert(0);
}
L2: assume(a!=b);
g = 1;
return;
}
The SLAM Process
c2bp
prog. P
SLIC rule
slic
prog. P’
boolean
program
bebop
predicates
path
newton
c2bp: Predicate Abstraction for
C Programs
Given
• P : a C program
• F = {e1,...,en}
– each ei a pure boolean expression
– each ei represents set of states for which ei is true
Produce a boolean program B(P,F)
• same control-flow structure as P
• boolean vars {b1,...,bn} to match {e1,...,en}
• properties true of B(P,F) are true of P
Assumptions
Given
• P : a C program
• F = {e1,...,en}
– each ei a pure boolean expression
– each ei represents set of states for which ei is true
• Assume: each ei uses either:
– only globals (global predicate)
– local variables from some procedure (local predicate for that
procedure)
• Mixed predicates:
– predicates using both local variables and global variables
– complicate “return” processing
– covered in advanced topics
C2bp Algorithm
• Performs modular abstraction
– abstracts each procedure in isolation
• Within each procedure, abstracts each
statement in isolation
– no control-flow analysis
– no need for loop invariants
void cmp (int a , int b) {
goto L1, L2
int g;
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
Preds: {x==y}
{g==0}
{a==b}
void cmp (int a , int b) {
goto L1, L2
int g;
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
void cmp ( {a==b} ) {
decl {g==0} ;
main( {x==y} ) {
Preds: {x==y}
{g==0}
{a==b}
}
}
void equal (int a , int b) {
goto L1, L2
int g;
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
decl {g==0} ;
main( {x==y} ) {
cmp( {x==y} );
assume( {g==0} );
assume( !{x==y} );
assert(0);
}
Preds: {x==y}
void cmp ( {a==b} ) {
goto L1, L2;
L1: assume( {a==b} );
{g==0} = T;
return;
{g==0}
{a==b}
L2: assume( !{a==b} );
{g==0} = F;
return;
}
C-Types

::=
void | bool | int
Expressions
e
::=
c | x | e1 op e2
LExpression
l
::=
x
Declaration
d
::=

Statements
s
::=
skip
x1,x2 ,…,xn
| goto L1,L2 …Ln
| assume(e)
| l = e
| f (e1 ,e2 ,…,en )
| return
| s1 ; s2 ;… ; sn
Procedures
p
::=
f (x1: 1,x2 : 2,…,xn : n )
Program
g
::=
d1 d 2 … dn p1 p2 … pn
| L: s
Abstracting Assigns via WP
• Statement y=y+1 and F={ y<4, y<5 }
– {y<4}, {y<5} = ((!{y<5} || !{y<4}) ? F : *), {y<4} ;
• WP(x=e,Q) = Q[x -> e]
• WP(y=y+1, y<5)
(y<5) [y -> y+1]
(y+1<5)
(y<4)
=
=
=
WP Problem
• WP(s, ei) not always expressible via { e1,...,en }
• Example
– F = { x==0, x==1, x<5 }
– WP( x=x+1 , x<5 ) = x<4
– Best possible: x==0 || x==1
Abstracting Expressions via F
• F = { e1,...,en }
• ImpliesF(e)
– best boolean function over F that implies e
• ImpliedByF(e)
– best boolean function over F implied by e
– ImpliedByF(e) = !ImpliesF(!e)
ImpliesF(e) and ImpliedByF(e)
e
ImpliedByF(e)
ImpliesF(e)
Computing ImpliesF(e)
• minterm m = d1 && ... && dn
– where di = ei or di = !ei
• ImpliesF(e)
– disjunction of all minterms that imply e
• Naïve approach
– generate all 2n possible minterms
– for each minterm m, use decision procedure to
check validity of each implication me
• Many optimizations possible
Abstracting Assignments
• if ImpliesF(WP(s, ei)) is true before s then
– ei is true after s
• if ImpliesF(WP(s, !ei)) is true before s then
– ei is false after s
{ei} = ImpliesF(WP(s, ei))
ImpliesF(WP(s, !ei))
? true :
? false
: *;
Assignment Example
Statement in P:
y = y+1;
Predicates in E:
{x==y}
Weakest Precondition:
WP(y=y+1, x==y) = x==y+1
ImpliesF( x==y+1 ) = false
ImpliesF( x!=y+1 ) = x==y
Abstraction of assignment in B:
{x==y} = {x==y} ? false : *;
Absracting Assumes
• WP( assume(e) , Q) = eQ
• assume(e) is abstracted to:
assume( ImpliedByF(e) )
• Example:
F = {x==2, x<5}
assume(x < 2) is abstracted to:
assume( {x<5} && !{x==2} )
Abstracting Procedures
• Each predicate in F is annotated as being
either global or local to a particular
procedure
• Procedures abstracted in two passes:
– a signature is produced for each procedure in
isolation
– procedure calls are abstracted given the
callees’ signatures
Abstracting a procedure call
• Procedure call
– a sequence of assignments from actuals to formals
– see assignment abstraction
• Procedure return
– NOP for C-- with assumption that all predicates
mention either only globals or only locals
– with pointers and with mixed predicates:
• Most complicated part of c2bp
• Covered in the advanced topics section
void cmp (int a , int b) {
Goto L1, L2
int g;
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
decl {g==0} ;
main( {x==y} ) {
{x==y}
void cmp ( {a==b} ) {
Goto L1, L2
L1: assume( {a==b} );
{g==0} = T;
return;
{g==0}
cmp( {x==y} );
assume( {g==0} );
assume( !{x==y} );
assert(0);
}
{a==b}
L2: assume( !{a==b} );
{g==0} = F;
return;
}
Precision
• For program P and E = {e1,...,en}, there exist two “ideal”
abstractions:
– Boolean(P,E) : most precise abstraction
– Cartesian(P,E) : less precise abtraction, where each boolean
variable is updated independently
– [See Ball-Podelski-Rajamani, TACAS 00]
• Theory:
– with an “ideal” theorem prover, c2bp can compute Cartesian(P,E)
• Practice:
– c2bp computes a less precise abstraction than Cartesian(P,E)
– we use Das/Dill’s technique to incrementally improve precision
– with an “ideal” theorem prover, the combination of c2bp +
Das/Dill can compute Boolean(P,E)
The SLAM Process
c2bp
prog. P
SLIC rule
slic
prog. P’
boolean
program
bebop
predicates
path
newton
Bebop
• Model checker for boolean programs
• Based on CFL reachability
– [Sharir-Pnueli 81] [Reps-Sagiv-Horwitz 95]
• Iterative addition of edges to graph
– “path edges”: <entry,d1>  <v,d2>
– “summary edges”: <call,d1>  <ret,d2>
Symbolic CFL reachability
 Partition path edges by their “target”
 PE(v) = { <d1,d2> | <entry,d1>  <v,d2> }
 What is <d1,d2> for boolean programs?
 A bit-vector!
 What is PE(v)?
 A set of bit-vectors
 Use a BDD (attached to v) to represent PE(v)
BDDs
• Canonical representation
of
– boolean functions
– set of (fixed-length)
bitvectors
– binary relations over finite
domains
• Efficient algorithms for
common dataflow
operations
– transfer function
– join/meet
– subsumption test
void cmp ( e2 ) {
[5]Goto L1, L2
[6]L1: assume( e2 );
[7] gz = T; goto L3;
[8]L2: assume( !e2 );
[9]gz = F; goto L3
[10] L3: return;
}
BDD at line [10] of cmp:
e2=e2’ & gz’=e2’
Read: “cmp leaves e2 unchanged and
sets gz to have the same final value as e2”
decl gz ;
main( e ) {
[1] equal( e );
gz=gz’& e=e’
e=e’& gz’=e’
[2] assume( gz );
[3] assume( !e );
[4] assert(F);
}
void cmp ( e2 ) {
[5]Goto L1, L2
[6]L1: assume( e2 );
[7] gz = T; goto L3;
[8]L2: assume( !e2 );
[9]gz = F; goto L3
[10] L3: return;
}
e=e’ & gz’=1 & e’=1
e2=e
gz=gz’& e2=e2’
gz=gz’& e2=e2’& e2’=T
e2=e2’& e2’=T & gz’=T
gz=gz’& e2=e2’& e2’=F
e2=e2’& e2’=F & gz’=F
e2=e2’& gz’=e2’
Bebop: summary
 Explicit representation of CFG
 Implicit representation of path edges and
summary edges
 Generation of hierarchical error traces
O(N)
 Complexity: O(E * 2 )
 E is the size of the CFG
 N is the max. number of variables in scope
The SLAM Process
c2bp
prog. P
SLIC rule
slic
prog. P’
boolean
program
bebop
predicates
path
newton
Newton
• Given an error path p in boolean program
B
– is p a feasible path of the corresponding C
program?
• Yes: found an error
• No: find predicates that explain the infeasibility
– uses the same interfaces to the theorem
provers as c2bp.
Newton
• Execute path symbolically
• Check conditions for inconsistency using
theorem prover (satisfiability)
• After detecting inconsistency:
– minimize inconsistent conditions
– traverse dependencies
– obtain predicates
Symbolic simulation for C-Domains
– variables: names in the program
– values: constants + symbols
State of the simulator has 3 components:
– store: map from variables to values
– conditions: predicates over symbols
– history: past valuations of the store
Symbolic simulation Algorithm
Input: path p
For each statement s in p do
match s with
Assign(x,e):
let val = Eval(e) in
if (Store[x]) is defined then
History[x] := History[x]  Store[x]
Store[x] := val
Assume(e):
let val = Eval(e) in
Cond := Cond and val
let result = CheckConsistency(Cond) in
if (result == “inconsistent”) then
GenerateInconsistentPredicates()
End
Say “Path p is feasible”
Symbolic Simulation: Caveats
• Procedure calls
– add a stack to the simulator
– push and pop stack frames on calls and returns
– implement mappings to keep values “in scope” at
calls and returns
• Dependencies
– for each condition or store, keep track of which values
where used to generate this value
– traverse dependency during predicate generation
int g;
void cmp (int a , int b) {
Goto L1, L2
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
int g;
void cmp (int a , int b) {
Goto L1, L2
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
Global:
}
main:
Conditions:
(1)
x:
X
(2)
y:
Y
int g;
void cmp (int a , int b) {
Goto L1, L2
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
Global:
}
main:
Conditions:
(1)
x:
X
(2)
y:
Y
Map:
XA
cmp:
(3)
a:
A
(4)
b:
B
YB
int g;
void cmp (int a , int b) {
Goto L1, L2
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
Global:
(6) g: 0
Conditions:
main:
(1)
x:
X
(2)
y:
Y
Map:
XA
cmp:
(3)
a:
A
(4)
b:
B
YB
(5) (A == B)
[3, 4]
int g;
void cmp (int a , int b) {
Goto L1, L2
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
Global:
(6) g: 0
Conditions:
main:
(1)
x:
X
(2)
y:
Y
cmp:
(3)
a:
A
(4)
b:
B
Map:
(5) (A == B)
[3, 4]
XA
(6) (X == Y)
[5]
YB
int g;
void cmp (int a , int b) {
Goto L1, L2
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
Global:
(6) g: 0
Conditions:
main:
(1)
x:
X
(2)
y:
Y
cmp:
(3)
a:
A
(4)
b:
B
(5) (A == B)
[3, 4]
(6) (X == Y)
[5]
(7) (X != Y)
[1, 2]
int g;
void cmp (int a , int b) {
Goto L1, L2
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
Global:
(6) g: 0
Conditions:
main:
(1)
x:
X
(2)
y:
Y
cmp:
(3)
a:
A
(4)
b:
B
Contradictory!
(5) (A == B)
[3, 4]
(6) (X == Y)
[5]
(7) (X != Y)
[1, 2]
int g;
void cmp (int a , int b) {
Goto L1, L2
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
Global:
(6) g: 0
Conditions:
main:
(1)
x:
X
(2)
y:
Y
cmp:
(3)
a:
A
(4)
b:
B
Contradictory!
(5) (A == B)
[3, 4]
(6) (X == Y)
[5]
(7) (X != Y)
[1, 2]
int g;
void cmp (int a , int b) {
Goto L1, L2
main(int x, int y){
L1: assume(a==b);
g = 0;
return;
cmp(x, y);
assume(!g);
assume(x != y)
assert(0);
L2: assume(a!=b);
g = 1;
return;
}
}
Predicates after simplification:
{ x == y, a == b }
Part III: Advanced Topics
CTypes

::=
void | bool | int | ref 
Expressions
e
::=
c | x | e1 op e2 | &x | *x
LExpression
l
::=
x | *x
Declaration
d
::=

Statements
s
::=
skip
|
assume(e)
|
l = e
|
l = f (e1 ,e2 ,…,en )
|
return x
|
s1 ; s2 ;… ; sn
x1,x2 ,…,xn
| goto L1,L2 …Ln
Procedures
p
::=
 f (x1: 1,x2 : 2,…,xn : n )
Program
g
::=
d1 d 2 … dn p1 p2 … pn
| L: s
Two Problems
• Extending SLAM tools for pointers
• Dealing with imprecision of alias analysis
Pointers and SLAM
• With pointers, C supports call by reference
– Strictly speaking, C supports only call by value
– With pointers and the address-of operator, one can
simulate call-by-reference
• Boolean programs support only call-by-value-result
– SLAM mimics call-by-reference with call-by-value-result
• Extra complications:
– address operator (&) in C
– multiple levels of pointer dereference in C
What changes with pointers?
• C2bp
– abstracting assignments
– abstracting procedure returns
• Newton
– simulation needs to handle pointer accesses
– need to copy local heap across scopes to match Bebop’s
semantics
– need to refine imprecise alias analysis using predicates
• Bebop
– remains unchanged!
Assignments + Pointers
Statement in P:
*p = 3
Predicates in E:
{x==5}
Weakest Precondition:
WP( *p=3 , x==5 ) = x==5
What if *p and x alias?
Correct Weakest Precondition:
(p==&x and 3==5) or (p!=&x and x==5)
We use Das’s pointer analysis [PLDI 2000] to prune
disjuncts representing infeasible alias scenarios.
Abstracting Procedure Return
• Need to account for
– lhs of procedure call
– mixed predicates
– side-effects of procedure
• Boolean programs support only call-byvalue-result
– C2bp models all side-effects using return
processing
Abstracting Procedure Returns
• Let a be an actual at call-site P(…)
– pre(a) = the value of a before transition to P
• Let f be a formal of a procedure P
– pre(f) = the value of f upon entry to P
predicate
{x==1}
{x==2}
call/return relation
Q() {
int x = 1;
x = R(x)
}
int R (int f) {
f=x
int r = f+1;
pre(f) == pre(x) f = 0;
return r;
x=r
}
WP(f=x, f==pre(f) ) = x==pre(f)
WP(x=r, x==2) = r==2
call/return assign
s = R(T);
{x==2} = s & {x==1};
}
{r==pre(f)+1}
x==pre(f) is true at the call to R
pre(f)==pre(x) and pre(x)==1 and r==pre(f)+1 implies r==2
{x==1}
Q() {
{x==1},{x==2} = T,F;
{f==pre(f)}
s
bool R ( {f==pre(f)} ) {
{r==pre(f)+1} = {f==pre(f)};
{f==pre(f)} = *;
return {r==pre(f)+1};
}
predicate
{x==1}
{x==2}
call/return relation
Q() {
int x = 1;
x = R(x)
}
int R (int f) {
f=x
int r = f+1;
pre(f) == pre(x) f = 0;
return r;
x=r
}
WP(f=x, f==pre(f) ) = x==pre(f)
WP(x=r, x==2) = r==2
call/return assign
{f==pre(f)}
{r==pre(f)+1}
x==pre(f) is true at the call to R
pre(f)==pre(x) and pre(x)==1 and r==pre(f)+1 implies r==2
{x==1}
Q() {
{x==1},{x==2} = T,F;
s = R(T);
{x==1}, {x==2} = *, s & {x==1};
}
s
bool R ( {f==pre(f)} ) {
{r==pre(f)+1} = {f==pre(f)};
{f==pre(f)} = *;
return {r==pre(f)+1};
}
Extending Pre-states
• Suppose formal parameter is a pointer
– eg. P(int *f)
• pre( *f )
– value of *f upon entry to P
– can’t change during P
• * pre( f )
– value of dereference of pre( f )
– can change during P
predicate
{x==1}
{x==2}
Q() {
int x = 1;
R(&x);
}
call/return relation
call/return assign
{a==pre(a)}
{*pre(a)==pre(*a)}
a = &x
pre(a) = &x
int R (int *a) {
*a = *a+1;
}
{*pre(a)=pre(*a)+1}
pre(x)==1 and pre(*a)==pre(x) and *pre(a)==pre(*a)+1 and pre(a)==&x
implies x==2
{x==1
s
}
Q() {
{x==1},{x==2} = T,F;
s = R(T,T);
{x==2} = s & {x==1};
}
bool R ( {a==pre(a)}, {*pre(a)==pre(*a)} ) {
{*pre(a)==pre(*a)+1} =
{a==pre(a)} & {*pre(a)==pre(*a)} ;
return {*pre(a)==pre(*a)+1};
}
Newton: what changes with
pointers?
• Simulation needs to handle pointer
accesses
• Need to copy local heap across scopes to
match Bebop’s semantics
main(int *x){
assume(*x < 5);
foo(x);
}
void foo (int *a) {
assume(*a > 5);
assert(0);
}
main(int *x){
void foo (int *a) {
assume(*a > 5);
assert(0);
}
assume(*x < 5);
foo(x);
}
Predicates after simplification:
*x < 5 , *a < 5, *a > 5
main:
(1)
x:
X
(2)
*X:
Y [1]
foo:
Conditions:
Contradictory!
(3)
a:
(4)
*A:
A
B [3]
(5) (Y< 5)
[1,2]
(6) (B < 5) [3,4,5]
(7) (B > 5) [3,4]
SLAM Imprecision due to Alias
Imprecision (Flow Insensitivity)
x = 0;
y = 0;
p = &y
{x==0} = T;
skip;
skip;
Pts-to(p)
=
{ &x, &y}
{x==0} = *;
*p = *p + 1;
{x==0}
assume(x!=0);
assume( !{x==0} );
p = &x;
skip;
Newton: Path is Infeasible
{x==0} = T;
skip;
skip;
x = 0;
y = 0;
p = &y
x = 0;
y = 0;
p = &y
assume(!{x==0} ); assume(x!=0);
if (p == &x)
x = x + 1;
else
y = y + 1;
skip;
assume(x!=0);
{x==0} = *;
*p = *p + 1;
p = &x;
p = &x;
Consider values in abstract trace
{x==0} = T;
skip;
skip;
x = 0;
y = 0;
p = &y
{x==0} = *;
*p = *p + 1;
assume(!{x==0} ); assume(x!=0);
skip;
p = &x;
{x==0} is true
{x==0} is false
Consider values in abstract trace
{x==0} = T;
skip;
skip;
x = 0;
y = 0;
p = &y
WP(*p = *p +1, !(x==0) )
{x==0} = *;
*p = *p + 1;
assume(!{x==0} ); assume(x!=0);
skip;
p = &x;
= ((p == &x) and !(x==-1))
or ((p != &x) and !(x==0))
(p!=&x) gets
added as a
predicate
What changes with pointers?
• C2bp
– abstracting assignments
– abstracting procedure returns
• Newton
– simulation needs to handle pointer accesses
– need to copy local heap across scopes to match Bebop’s
semantics
– need to refine imprecise alias analysis using predicates
• Bebop
– remains unchanged!
What worked well?
•
•
•
•
•
Specific domain problem
Safety properties
Shoulders & synergies
Separation of concerns
Summer interns & visitors
–
–
–
–
Sagar Chaki, Todd Millstein, Rupak Majumdar (2000)
Satyaki Das, Wes Weimer, Robby (2001)
Jakob Lichtenberg, Mayur Naik (2002)
Giorgio Delzanno, Andreas Podelski, Stefan Schwoon
• Windows Partners
– Byron Cook, Vladimir Levin, Abdullah Ustuner
Future Work
• Concurrency
– SLAM analyzes drivers one thread at a time
– Work in progress to analyze interleavings between threads
• Rules and environment-models
– Large scale development or rules and environment-models is a
challenge
– How can we simplify and manage development of rules?
• Modeling C semantics faithfully
• Theory:
– Prove that SLAM will make progress on any property and any
program
– Identify classes of programs and properties on which SLAM will
terminate
Further Reading
See papers, slides from:
http://research.microsoft.com/slam
Glossary
Model checking
Checking properties by systematic exploration of the state-space of a
model. Properties are usually specified as state machines, or using
temporal logics
Safety properties
Properties whose violation can be witnessed by a finite run of the system.
The most common safety properties are invariants
Reachability
Specialization of model checking to invariant checking. Properties are
specified as invariants. Most common use of model checking. Safety
properties can be reduced to reachability.
Boolean programs
“C”-like programs with only boolean variables. Invariant checking and
reachability is decidable for boolean programs.
Predicate
A Boolean expression over the state-space of the program eg. (x < 5)
Predicate abstraction
A technique to construct a boolean model from a system using a given set
of predicates. Each predicate is represented by a boolean variable in the
model.
Weakest precondition
The weakest precondition of a set of states S with respect to a statement T
is the largest set of states from which executing T, when terminating,
always results in a state in S.
Download