Automatic Predicate Abstraction of C-Programs

Shilpa Seshadri
Universität Paderborn
Prof. Dr. Heike Wehrheim, Daniel Wonisch, Nils Timm, Steffen Ziegert
Agenda
• Motivation
• Introduction
• C2BP Algorithm
• SLAM Toolkit
• Future Work
• Conclusion
• Discussion
April 7, 2015
Motivation
• Model checking
  - A verification technique for finite-state systems
  - Widely used for validation and debugging
  - State-space explosion sometimes limits the use of model checking tools
  - Hence, model checkers operate on abstractions of systems
• Software systems are typically infinite-state systems
• Abstraction is therefore critical
• Predicate abstraction of programs is one approach
• Model checking needs a finite state space ⇒ check an abstraction of the software system
Model Checking
• Algorithmic exploration of the state space of the system
• Several advances in the past decade:
  - symbolic model checking
  - symmetry reductions
  - partial order reductions
  - compositional model checking
  - bounded model checking using SAT solvers
• Most hardware companies use a model checker in the validation cycle
Abstraction
[Figure: a program (infinite state) is abstracted into a finite-state model, which is the input to the model checker.]
Example program (infinite state):
    void add(Object o) {
        buffer[head] = o;
        head = (head+1)%size;
    }
    Object take() {
        …
        tail=(tail+1)%size;
        return buffer[tail];
    }
Abstraction (A simplified view)
• Abstraction is an effective tool in verification
• Given a transition system, we want to generate an abstract transition system which is easier to analyze
• However, we want to make sure that
  - If a property holds in the abstract transition system, it also holds in the original (concrete) transition system
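A minimal Python sketch of this guarantee. Everything here is an assumption made for illustration and does not come from the slides: the six-state toy system, the parity mapping `alpha`, and the helper `reachable`. The abstract system gets a transition between two abstract states whenever some pair of concrete states mapped to them has one, so every concrete run has an abstract counterpart.

```python
def reachable(init, transitions):
    """Return the set of states reachable from init via the transition pairs."""
    seen, frontier = {init}, [init]
    while frontier:
        s = frontier.pop()
        for (a, b) in transitions:
            if a == s and b not in seen:
                seen.add(b)
                frontier.append(b)
    return seen

# Hypothetical concrete system: six states forming two disjoint cycles.
concrete = {(0, 2), (2, 4), (4, 0), (1, 3), (3, 5), (5, 1)}

def alpha(s):
    """Abstraction map (an assumption): keep only the parity of the state."""
    return "even" if s % 2 == 0 else "odd"

# Abstract transitions: merge all concrete states with the same alpha value.
abstract = {(alpha(a), alpha(b)) for (a, b) in concrete}

abstract_reach = reachable(alpha(0), abstract)
concrete_reach = reachable(0, concrete)

# Property: no odd state is reachable. It holds in the abstract system ...
holds_abstract = "odd" not in abstract_reach
# ... and therefore it also holds in the concrete system.
holds_concrete = all(s % 2 == 0 for s in concrete_reach)
```

The abstract check explores two states instead of six; the saving grows with the size of the concrete system.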
Abstraction (A simplified view)
• If the property does not hold in the abstract transition system, what can we do?
• We can refine the abstract transition system (split some states that we merged)
  - The refined transition system should still be an abstraction of the concrete transition system
• Then, we can recheck the property on the refined transition system
  - If the property still does not hold, we can refine again
Abstraction Refinement Loop
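[Figure: abstraction refinement loop. An initial abstraction turns the actual program into a Boolean program; the model checker verifies it and reports "no error" or "bug found"; a spurious counterexample triggers abstraction refinement, which produces a new Boolean program.]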
Predicate Abstraction
• An automated abstraction technique that can be used to reduce the state space of a program
• The basic idea is to remove some variables from the program and keep only information about a set of predicates over them
• Predicate abstraction performs such abstractions automatically
A Very Simple Example
• Assume that we have two integer variables x, y
• We want to abstract the program using a single predicate “x = y”
• We divide the states of the program into two sets:
  1. The states where “x = y” is true
  2. The states where “x = y” is false, i.e., “x ≠ y”
• We then merge all the states in the same set
  - This is an abstraction
  - Basically, we forget everything except the value of the predicate “x = y”
A Very Simple Example
• We will represent the predicate “x = y” as the boolean variable B in the abstract program
  - “B = true” will mean “x = y” and
  - “B = false” will mean “x ≠ y”
• Assume that we want to abstract the following program, which contains only one statement:
    y := y + 1
Predicate Abstraction, Step 1
• Calculate preconditions based on the predicate
    {x = y + 1} y := y + 1 {x = y}
      precondition for B being true after executing the statement y := y + 1
      In temporal logic notation: {x = y + 1} ⇒ AX {x = y}
    {x ≠ y + 1} y := y + 1 {x ≠ y}
      precondition for B being false after executing the statement y := y + 1
      In temporal logic notation: {x ≠ y + 1} ⇒ AX {x ≠ y}
Predicate Abstraction, Step 2
• Use decision procedures to determine whether the predicates used for abstraction imply any of the preconditions
    x = y ⇒ x = y + 1 ? No
    x ≠ y ⇒ x = y + 1 ? No
    x = y ⇒ x ≠ y + 1 ? Yes
    x ≠ y ⇒ x ≠ y + 1 ? No
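The four implication checks of Step 2 can be reproduced with a rough sketch. It assumes a brute-force search over a small integer range in place of a real decision procedure, so a "Yes" is only evidence within that range, not a proof. The names `DOMAIN`, `implies`, and the lambdas are hypothetical.

```python
from itertools import product

DOMAIN = range(-5, 6)  # bounded search space (an assumption, not a proof)

def implies(p, q):
    """True if p(x, y) implies q(x, y) for every (x, y) in the bounded domain."""
    return all(q(x, y) for x, y in product(DOMAIN, DOMAIN) if p(x, y))

eq = lambda x, y: x == y              # predicate represented by B
neq = lambda x, y: x != y             # its negation
pre_true = lambda x, y: x == y + 1    # precondition for B being true afterwards
pre_false = lambda x, y: x != y + 1   # precondition for B being false afterwards

results = {
    "B => pre_true": implies(eq, pre_true),
    "!B => pre_true": implies(neq, pre_true),
    "B => pre_false": implies(eq, pre_false),
    "!B => pre_false": implies(neq, pre_false),
}
```

Only the third implication holds, matching the Yes/No answers listed for Step 2.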
Predicate Abstraction, Step 3
• Generate abstract code
Predicate abstraction with respect to the predicate “x = y”:
  Concrete statement:
    y := y + 1
  1) Compute preconditions
    {x = y + 1} y := y + 1 {x = y}
    {x ≠ y + 1} y := y + 1 {x ≠ y}
  2) Check implications
    x = y ⇒ x = y + 1 ? No
    x ≠ y ⇒ x = y + 1 ? No
    x = y ⇒ x ≠ y + 1 ? Yes
    x ≠ y ⇒ x ≠ y + 1 ? No
  3) Generate abstract code
    IF B THEN B := false
    ELSE B := true | false
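As a sanity check on the generated abstract code, the sketch below compares the concrete statement with its abstract counterpart. It assumes a bounded range of test values; the function names are hypothetical. The abstract statement returns the set of values B may take next, so `{True, False}` models the nondeterministic choice `true | false`.

```python
def concrete_step(x, y):
    """Concrete statement: y := y + 1."""
    return x, y + 1

def abstract_step(b):
    """Abstract statement: IF B THEN B := false ELSE B := true | false.
    Returns the set of possible next values of B."""
    return {False} if b else {True, False}

def pred(x, y):
    """The abstraction predicate x = y, represented by B."""
    return x == y

# Check over a bounded range (an assumption) that every concrete step
# is covered by some abstract step.
sound = all(
    pred(*concrete_step(x, y)) in abstract_step(pred(x, y))
    for x in range(-5, 6)
    for y in range(-5, 6)
)
```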
Automatic Predicate Abstraction
• First proposed by Graf and Saïdi, and reflected in T. Ball’s work
• Concrete states are mapped to abstract states under a finite set of predicates
• Designed and implemented for
  - Finite state systems
  - Infinite state systems specified as guarded commands
• Not implemented for a programming language such as C
Predicate Abstraction of C (c2bp)
• Performs automatic predicate abstraction of C programs
• Input: a C program P and a set of predicates E
  - predicate = pure C boolean expression
• Output: a boolean program BP(P,E) that is
  - a sound abstraction of P
  - a precise (boolean) abstraction of P
• Results
  - separate compilation (predicate abstraction) in the presence of procedures and pointers
Predicate abstraction by C2BP
[Figure: C2BP takes a program P and a set of predicates E as input and produces the Boolean program BP(P,E).]
Boolean program BP(P, E):
• a C program with bool as the only type
  - plus some additional constructs
  - same control structure as P
Given
• P: a C program
• E = {e1,...,en}: a set of C boolean expressions over the variables in P
  - No side effects, no procedure calls
Produces a boolean program B
• Same control-flow structure as P
• Properties true of B are true of P
Formal Properties of C2BP
• soundness
  - B has a superset of the feasible paths in P
  - If {ei} is true (false) at some point on a path in B, then ei is true (false) at that point along a corresponding path in P
• complexity
  - linear in the size of the program
  - exponential in the number of predicates
BEBOP model checker
• A symbolic model checker for Boolean programs
• Performs interprocedural dataflow analysis using binary decision diagrams (BDDs)
• Used to analyze the boolean program
• Based on context-free language (CFL) reachability (see Glossary)
What is SLAM?
SLAM is a software model checking project at Microsoft Research
• Goal: automatically check C programs (system software) against safety properties using model checking
• Safety property: “nothing bad happens”. An example: a lock is never released without first being acquired
• Application domain: device drivers
• Counterexample-driven refinement
  - terminates in practice
SLAM
• Input
  - API usage rules
  - client C source code “as is”
• Analysis
  - create, explore and refine boolean program abstractions
• Output
  - Error traces (minimize noise)
  - Verification (soundness)
Static Driver Verifier
[Figure: Static Driver Verifier workflow. Labels: Rules; Read for understanding; New API rules; Development; Precise API Usage Rules (SLIC); Defects; Drive testing tools; Software Model Checking; Testing; 100% path coverage; Source Code.]
SLAM Toolkit
• The SLAM toolkit was developed to find errors in Windows device drivers
• Windows device drivers are required to interact with the Windows kernel according to certain interface rules
• The SLAM toolkit has an interface specification language called SLIC (Specification Language for Interface Checking), which is used for writing these interface rules
• The SLAM toolkit instruments the driver code with assertions based on these interface rules
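To give a feel for what such an interface rule checks, here is a small Python sketch of a finite-state monitor for the lock rule mentioned earlier (a lock is never released without first being acquired). This is not SLIC syntax and not how SLAM instruments drivers; the event names `"acquire"` and `"release"` and the function `check_lock_rule` are hypothetical.

```python
def check_lock_rule(events):
    """Return the index of the first event that violates the rule, or None.

    Monitor states: "unlocked" and "locked". Only the rule
    "never release without a prior acquire" is checked.
    """
    state = "unlocked"
    for i, event in enumerate(events):
        if event == "acquire":
            state = "locked"
        elif event == "release":
            if state != "locked":
                return i          # released without a prior acquire
            state = "unlocked"
    return None

ok_trace = ["acquire", "release", "acquire", "release"]
bad_trace = ["acquire", "release", "release"]
```

`check_lock_rule(bad_trace)` flags the second release (index 2). An instrumented program would place the equivalent check as an assertion at each call site.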
Windows Device Drivers & SLIC
• Kernel presents a very complex interface to the driver
  - stack of drivers
  - NT kernel is multi-threaded
• Correct API usage is described by finite state protocols
• SLIC
  - Finite state language for stating rules
  - monitors behavior of C code
  - temporal safety properties
  - familiar C syntax
Newton
• Given an error path p in boolean program B, it checks
  - is p a feasible path of the corresponding C program?
    - Yes: found an error
    - No: find predicates that explain the infeasibility
• Uses the same interfaces to the theorem provers as c2bp.
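The feasibility question can be illustrated with a rough sketch. It assumes a path is a list of `("assume", predicate)` and `("assign", update)` steps over two integer variables, and it uses brute force over a small range where Newton would call a theorem prover. The representation and the name `is_feasible` are hypothetical, and a `None` result only means no witness exists within the searched range.

```python
from itertools import product

def is_feasible(path, domain=range(-3, 4)):
    """Return an initial (x, y) that can follow the path, or None.

    Brute force over a small domain stands in for the theorem prover.
    """
    for x0, y0 in product(domain, domain):
        state = (x0, y0)
        for kind, fn in path:
            if kind == "assume":
                if not fn(*state):
                    break             # this initial state cannot follow the path
            else:                     # "assign"
                state = fn(*state)
        else:
            return (x0, y0)           # every assume held along the path
    return None

inc_y = ("assign", lambda x, y: (x, y + 1))
# Candidate path on which x = y stays true across y := y + 1: infeasible.
spurious = [("assume", lambda x, y: x == y), inc_y,
            ("assume", lambda x, y: x == y)]
# Candidate path on which x = y becomes false after the increment: feasible.
real = [("assume", lambda x, y: x == y), inc_y,
        ("assume", lambda x, y: x != y)]
```

For the spurious path, the contradiction between `x == y` before and after the increment is the kind of fact a new predicate would have to capture.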
How SLAM does it
• Model checking a C program directly is not feasible! Still, model checking is very effective at the model level ...
• Idea: automatically extract an (abstract) model from the C source. But even this is hard:
  - which aspects should be retained and which hidden?
  - how to extract it?
• Idea:
  - Start with a very abstract model, whose extraction is quite trivial.
  - Incrementally refine the abstraction as needed.
Traditional approach
[Figure: traditional approach. The model checker operates on a finite state machine (FSM) model; the source code is a sequential C program.]
SLAM
[Figure: SLAM approach. Model checker: data flow analysis implemented using BDDs. Model: a Boolean program (a push-down model, in place of finite state machines), obtained by abstraction. Source code: a sequential C program with C data structures, pointers, procedure calls, parameter passing, scoping, and control flow.]
SLAM Soundness
• Idea: SLAM constructs sound abstractions! If A is a constructed abstraction of P, A preserves P’s control structure.
• Therefore, theorem:
    paths(P) ⊆ paths(A)
  Every possible execution path of P is a possible execution path of A.
• Therefore, theorem: if A satisfies the SLIC spec, so does P!
SLAM completeness
• Unfortunately, the converse of the previous theorem is generally not true: an execution path (including an error path) in A may not be an execution path in P
  - so, an error found in A may be a false error
• If A produces false errors, we can try to refine it (to make it more precise) into a new model A’ such that:
    paths(P) ⊆ paths(A’) ⊆ paths(A)
  (suggesting an iterative procedure ...)
SLAM main iteration
[Figure: flowchart of the main iteration]
1. Instrument program P with property φ, giving the instrumented program P'.
2. Abstraction: from P' and the initial predicates, build an abstraction A of P'.
3. Model checking: A |= φ ?
   - No violation: property φ is valid.
   - Violation by an error path π: is π feasible in P?
     - Yes: property φ is invalid.
     - No: refine the abstraction and repeat from step 2.
But verification is generally undecidable; hence this iteration may not terminate.
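The iteration can be summarized as a driver loop. The sketch below is a skeleton, not SLAM's implementation: the function `refinement_loop`, its five callables, and the iteration cap are all assumptions, and the toy stand-ins at the bottom only exercise the control flow.

```python
def refinement_loop(program, abstract, model_check, is_feasible, refine,
                    predicates, max_iterations=10):
    """Hypothetical driver for the iteration; all callables are assumptions.

    abstract(program, predicates)  -> abstraction A
    model_check(A)                 -> None if no violation, else an error path
    is_feasible(program, path)     -> True if the path is a real execution
    refine(predicates, path)       -> a larger predicate set
    """
    for _ in range(max_iterations):      # the real loop need not terminate
        abstraction = abstract(program, predicates)
        path = model_check(abstraction)
        if path is None:
            return "valid", predicates
        if is_feasible(program, path):
            return "invalid", path
        predicates = refine(predicates, path)
    return "unknown", predicates

# Toy stand-ins: the abstraction is just the predicate set, and the model
# checker stops reporting a (spurious) path once "x == y" is tracked.
verdict, preds = refinement_loop(
    program="P'",
    abstract=lambda prog, ps: ps,
    model_check=lambda a: None if "x == y" in a else ["error path"],
    is_feasible=lambda prog, path: False,
    refine=lambda ps, path: ps | {"x == y"},
    predicates=frozenset(),
)
```

The cap on iterations reflects the undecidability remark: without it, the loop may run forever on some inputs.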
Pointers and SLAM
• Abstracting from a language with pointers (C) to one without pointers (boolean programs) is a challenge
• With pointers, C supports call by reference
  - Strictly speaking, C supports only call by value
  - With pointers and the address-of operator, one can simulate call-by-reference
• Boolean programs support only call-by-value-result
  - SLAM mimics call-by-reference with call-by-value-result
• Extra complications:
  - address operator (&) in C
  - multiple levels of pointer dereference in C
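A tiny sketch of the call-by-value-result idea, written in Python for brevity. It stands in for a C function such as `void increment(int *p) { *p = *p + 1; }` called as `increment(&x)`; that C function and the names below are illustrative assumptions, not taken from the slides.

```python
def increment(cell):
    """Callee works on its own copy and returns the final value (the 'result')."""
    cell = cell + 1
    return cell

x = 5
# Call-by-value-result: copy x in, run the callee, copy the result back out.
x = increment(x)
```

The two conventions can differ when arguments alias each other, which is one reason pointers complicate the abstraction.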
Challenges of predicate abstraction
Pointers: two related sub-problems treated in a uniform way
• assignments through dereferenced pointers in the original C program
• pointers and pointer dereferences in the predicates for the abstraction
Procedures: allow procedural abstraction in Boolean programs. They also have:
• global variables
• procedures with local variables
• call-by-value parameter passing
• procedural abstraction: signatures constructed in isolation
Cont’d …
Procedure calls: the abstraction process is challenging in the presence of pointers
• after a call, the caller must conservatively update local state modified by the procedure
• a sound and precise approach that takes side-effects into account
Unknown values: it is not always possible to determine the effect of a statement in the C program in terms of the input predicate set E
• such non-determinism is handled in BP via the non-deterministic control expression ‘*’, which implicitly expresses a 3-valued domain for boolean variables
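One way to picture the 3-valued domain is a helper that returns the set of values a boolean variable may take after a statement. The name `choose`, its two flags, and the set encoding of ‘*’ are assumptions made for this sketch.

```python
UNKNOWN = frozenset({True, False})   # models the control expression '*'

def choose(pos, neg):
    """Possible values of a boolean variable after a statement.

    pos: the predicates establish that the variable becomes true
    neg: the predicates establish that the variable becomes false
    Neither: the value is not determined by the predicates, so return '*'.
    """
    if pos:
        return frozenset({True})
    if neg:
        return frozenset({False})
    return UNKNOWN
```

In the earlier single-statement example, the ELSE branch `B := true | false` corresponds to `choose(False, False)`.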
Assumptions over a C program:
• All intra-procedural control flow is by if and goto
• All expressions are free of side-effects and short-circuit evaluation
• No expression contains multiple pointer dereferences (e.g. **p)
• Function calls occur only at the topmost level of expressions
Weakest Precondition
• For a statement ‘s’ and a predicate ‘φ’, let WP(s, φ) denote the weakest liberal precondition of φ with respect to ‘s’
• For an assignment statement,
  - By definition, WP(x = e, φ) is φ with all occurrences of x replaced with e, denoted φ[e/x]
  - For example, WP(x = x+1, x < 5) = (x+1) < 5 = (x < 4)
• Given S and Q, what is the weakest P’ satisfying {P’} S {Q}?
  - P’ is called the weakest precondition of S with respect to Q, written WP(S, Q)
  - to check {P} S {Q}, check P ⇒ P’
• C2BP uses decision procedures (i.e., a theorem prover) to strengthen the weakest precondition so that it can be expressed over the predicates in E
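The assignment rule can be checked on the slide's own example. The sketch below treats predicates as Python functions, so substitution φ[e/x] becomes "evaluate φ in the updated state"; the helper name `wp_assign` and the bounded test range are assumptions.

```python
def wp_assign(update, post):
    """Weakest precondition of an assignment: evaluate post in the updated state.
    This is the semantic counterpart of the substitution φ[e/x]."""
    return lambda x: post(update(x))

post = lambda x: x < 5            # φ: x < 5
update = lambda x: x + 1          # statement: x = x + 1
pre = wp_assign(update, post)     # (x + 1) < 5

# Over a bounded integer range (an assumption), pre agrees with x < 4.
agrees = all(pre(x) == (x < 4) for x in range(-100, 101))
```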
SLAM Future Work
• More impact
  - Static Driver Verifier (internal, external)
• More features
  - Heap abstractions
  - Concurrency
• More languages
  - C# and CIL
Predicate abstraction overview
PA Problem: given (P, E) where
• P is a C program
• E = {φ1, …, φn} is a set of pure boolean C expressions over variables and constants of the C language
Compute BP(P, E), which is a boolean program that
• has the same control structure as P
• contains only boolean variables V = {b1, …, bn} where bi = {φi} represents predicate φi
• is guaranteed to be an abstraction of P (superset of traces modulo …)
SLAM – Software Model Checking
• SLAM innovations
  - boolean programs: a new model for software
  - model creation (c2bp)
  - model checking (bebop)
  - model refinement (newton)
• SLAM toolkit
  - built on MSR program analysis infrastructure
• SLAM is Microsoft’s fully automated tool to verify the correctness of C programs
• More info: http://www.research.microsoft.com/slam/
Glossary
Model checking: Checking properties by systematic exploration of the state space of a model. Properties are usually specified as state machines or using temporal logics.
Safety properties: Properties whose violation can be witnessed by a finite run of the system. The most common safety properties are invariants.
Reachability: Specialization of model checking to invariant checking. Properties are specified as invariants. Most common use of model checking. Safety properties can be reduced to reachability.
Boolean programs: “C”-like programs with only boolean variables. Invariant checking and reachability are decidable for boolean programs.
Predicate: A Boolean expression over the state space of the program, e.g. (x < 5).
Predicate abstraction: A technique to construct a boolean model from a system using a given set of predicates. Each predicate is represented by a boolean variable in the model.
Weakest precondition: The weakest precondition of a set of states S with respect to a statement T is the largest set of states from which executing T, when terminating, always results in a state in S.
Thank You for your Attention!
Questions are welcome