SAT Unbounded Model Checking Edmund M. Clarke Carnegie Mellon University

SAT-based Bounded and
Unbounded Model Checking
Edmund M. Clarke
Carnegie Mellon University
Joint research with C. Bartzis, A. Biere, P. Chauhan, A. Cimatti,
T. Heyman, D. Kroening, J. Ouaknine, R. Raimi, O. Strichman, and Y. Zhu
Why am I giving this talk?
I have an ulterior motive for this talk.
Second Edition!
Need a chapter on SAT for the second edition.
Outline of Talk
1. Motivation
2. Bounded Model Checking
3. Complete methods using SAT
a. Induction
b. Unbounded Model Checking
--- with cube enlargement
--- with circuit co-factoring
--- with interpolants
Outline of Talk (what this talk covers)
1. Motivation: yes
2. Bounded Model Checking: yes
3. Complete methods using SAT
a. Induction: no
b. Unbounded Model Checking
--- with cube enlargement: yes
--- with circuit co-factoring: maybe
--- with interpolants: no
SAT Solver Progress 1960-2010
[Figure: number of variables (Vars, log scale from 1 to 100000) that SAT solvers can handle, plotted against Year (1960 to 2010).]
Model Checking (CE81,QS82)



Specification – temporal logic
Model – finite state transition graph
Advantages:





Always terminates
Automatic
Usually fast
Can handle partially specified models
Counterexample if specification is false
Symbolic Model Checking




Method used by most “industrial strength”
model checkers.
Uses Boolean encoding for state machine
and sets of states.
Can handle much larger designs – hundreds
of state variables.
BDDs traditionally used to represent
Boolean functions.
Problems with BDDs



BDDs are a canonical representation. Often become too
large.
Variable ordering must be uniform along paths.
Selecting the right variable ordering is very important for obtaining
small BDDs.
--- Often time-consuming or needs manual intervention.
--- Sometimes, no space-efficient variable ordering exists.
This talk describes alternative approaches
to model checking that use SAT procedures.
Advantages of SAT Procedures




SAT procedures also operate on Boolean
formulas but do not use canonical forms.
Do not suffer from the potential space
explosion of BDDs.
Different split orderings possible on
different branches.
Very efficient implementations exist.
Bounded Model Checking
A. Biere, A. Cimatti, E. Clarke, Y. Zhu, Symbolic Model
Checking without BDDs, TACAS’99
Bounded Model Checking as SAT
Given a property p (e.g. “signal_a = signal_b”):
Is there a state reachable in k cycles that satisfies ¬p?
[Figure: a path s0, s1, s2, ..., sk-1, sk, with each state labeled by whether p holds.]
Bounded Model Checking: Safety
The reachable states in k steps are captured by:
$$I(s_0) \wedge \bigwedge_{i=0}^{k-1} R(s_i, s_{i+1})$$
The property p fails in one of the k steps:
$$\bigvee_{i=0}^{k} \neg p(s_i)$$
Bounded Model Checking: Safety
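The safety property p is valid up to step k iff W(k) is unsatisfiable:
$$W(k) = I(s_0) \wedge \bigwedge_{i=0}^{k-1} R(s_i, s_{i+1}) \wedge \bigvee_{i=0}^{k} \neg p(s_i)$$
[Figure: a path s0, s1, s2, ..., sk-1, sk, with each state labeled by whether p holds.]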
Bounded Model Checking: Safety
Example: a two-bit counter
[Figure: state cycle 00 → 01 → 10 → 11 → 00.]
Initial state: I: ¬l ∧ ¬r
Transition: R: l’ = (l ⊕ r) ∧ r’ = ¬r
Property: G (¬l ∨ ¬r)
For k = 2, W(k) is unsatisfiable. For k = 3, W(k) is satisfiable.
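The counter example is small enough to check by hand or by exhaustive search. The sketch below is only an illustration: it enumerates all candidate paths instead of calling a SAT solver, and the function names (`init`, `trans`, `prop`, `w_satisfiable`) are made up for this note.

```python
from itertools import product

# A state is a pair (l, r) of bits.
STATES = list(product([0, 1], repeat=2))

def init(s):
    l, r = s
    return l == 0 and r == 0              # I: not l and not r

def trans(s, t):
    (l, r), (l2, r2) = s, t
    return l2 == (l ^ r) and r2 == 1 - r  # R: l' = l xor r, r' = not r

def prop(s):
    l, r = s
    return not (l and r)                  # p: not l or not r

def w_satisfiable(k):
    """Return a path s0..sk satisfying W(k), or None if W(k) is unsatisfiable."""
    for path in product(STATES, repeat=k + 1):
        if (init(path[0])
                and all(trans(path[i], path[i + 1]) for i in range(k))
                and any(not prop(s) for s in path)):
            return path
    return None

print(w_satisfiable(2))  # None
print(w_satisfiable(3))  # ((0, 0), (0, 1), (1, 0), (1, 1))
```

A real BMC tool would hand the same three conjuncts to a SAT solver as a propositional formula rather than enumerate paths.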
Bounded Model Checking: Liveness
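There is no counterexample of length k to the
liveness property Fp iff W(k) is unsatisfiable:
$$W(k) = I(s_0) \wedge \bigwedge_{i=0}^{k-1} R(s_i, s_{i+1}) \wedge \bigwedge_{i=0}^{k} \neg p(s_i) \wedge \bigvee_{l=0}^{k-1} (s_l = s_k)$$
[Figure: a path s0, s1, s2, ..., sk-1, sk on which every state satisfies ¬p.]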
BMC formula for arbitrary LTL
(Standard translation)
[Figure: a path with positions i, l, and k marked.]
Size of resulting formula: O(k|M| + k^3|φ|)
With sharing of subformulas this becomes O(k|M| + k^2|φ|)
A fixpoint based translation
T. Latvala, A. Biere, K. Heljanko, and T. Junttila:
“Simple Bounded LTL Model Checking” FMCAD 04

Idea: for lasso-shaped Kripke structures, the
semantics of LTL and CTL coincide.


Add a formula that isolates a lasso-shaped path.
Use the fixpoint characterization of CTL, e.g.
E[φ U ψ] = ψ ∨ (φ ∧ EX E[φ U ψ])
Overall formula
[Figure: given the model, the LTL formula, and the bound, the overall formula combines a part that isolates a lasso-shaped path with the fixpoint formula.]
Loop constraints
• If li is true then there exists a loop at position i.
• At most one li is true.
Fixpoint formula
[Figure: truth values (False, True) of the formula along a path with positions i, j, and k.]
Size of resulting formula: O(k(|M| + |φ|))
Generating the BMC formula
(Based on the Vardi-Wolper algorithm)

A labeled Büchi automaton is a 5-tuple
B = ⟨S, S0, δ, L, F⟩
S: states
S0: initial states
δ: transition relation
L: labels
F: final states
Acceptance condition:
An infinite word w is accepted iff the
execution of w on B passes through a
final state an infinite number of times.
LTL model checking
Given
Transition system M
LTL property φ
1. Translate ¬φ into a Büchi automaton B
2. Compute the product automaton P = M × B
3. Check if P is empty:
Is a fair loop reachable?
Generating the BMC formula
E. Clarke, D. Kroening, J. Ouaknine, and O. Strichman:
“Computational challenges in Bounded Model Checking” STTT 05
Encode all paths of P that start at an initial
state and are k steps long.
Require that
--- at least one path contains a loop.
--- at least one state in the loop is final.
Generating the BMC formula
1. Start from the initial state s0.
2. Follow k transitions.
3. Choose a state sl where the loop starts (sl = sk).
4. Require that some state in the loop is final.
Bounded Model Checking
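[Flowchart: start with k = 0 and run BMC(M, φ, k). If the result is SAT, a counterexample has been found. If it is UNSAT and k ≥ CT, the property holds. Otherwise increment k (k++) and repeat, stopping if resources are exceeded.]
CT is the completeness threshold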
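The iteration over k can be written as a short driver loop. This is a sketch under assumptions: `check(k)` is a hypothetical callback that returns a counterexample or None, and `max_k` is a stand-in for "resources exceeded".

```python
def bmc_loop(check, ct, max_k=None):
    """Iterate k = 0, 1, ... until a counterexample, the threshold, or the budget."""
    k = 0
    while max_k is None or k <= max_k:
        cex = check(k)                  # one bounded check, e.g. a SAT call
        if cex is not None:
            return ("counterexample", k, cex)
        if k >= ct:                     # no counterexample up to the threshold
            return ("holds", k, None)
        k += 1
    return ("resources exceeded", max_k, None)

# Usage with a stub checker that finds a counterexample at depth 3.
print(bmc_loop(lambda k: "cex" if k == 3 else None, ct=5))
```

Without a known completeness threshold, only the first and last outcomes are possible, which is why plain BMC is a bug-finding method.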
The Completeness Threshold


Computing CT is as hard as model checking.
Idea: Compute an over-approximation to the
actual CT

Consider system P as a graph.

Compute CT from structure of P.
Basic notions



Diameter D(M) = longest
shortest path between any
two reachable states.
Recurrence Diameter
RD(M) = longest loop-free
path between any two
reachable states.
The initialized versions:
DI(M) and RDI(M)
start from an initial state.
DI(M) = D(M) = 2
RDI(M) = RD(M) = 3
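For an explicit graph, the four quantities can be computed directly. The sketch below uses a hypothetical 4-state graph chosen for this note (not the one drawn on the slide) and brute-force search, so it only scales to tiny examples.

```python
from collections import deque

def shortest_dists(graph, src):
    """BFS distances from src to every state reachable from it."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def longest_simple_path(graph, src, seen=None):
    """Length of the longest loop-free path starting at src (exponential search)."""
    seen = (seen or frozenset()) | {src}
    return max((1 + longest_simple_path(graph, v, seen)
                for v in graph[src] if v not in seen), default=0)

def diameters(graph, init):
    """Return (D, RD, DI, RDI) restricted to states reachable from init."""
    reach = shortest_dists(graph, init)
    d = max(max(shortest_dists(graph, s).values()) for s in reach)
    rd = max(longest_simple_path(graph, s) for s in reach)
    d_i = max(reach.values())
    rd_i = longest_simple_path(graph, init)
    return d, rd, d_i, rd_i

G = {0: [1, 2], 1: [2, 3], 2: [3], 3: []}   # hypothetical example graph
print(diameters(G, 0))  # (2, 3, 2, 3)
```

In this graph state 3 is at distance 2 from state 0 via 0 → 2 → 3, while the longest loop-free path 0 → 1 → 2 → 3 has length 3.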
CT for safety properties

Theorem: for AGp properties CT = DI(M)
[Figure: a path of length ≤ DI(M) from s0.]
For AFp properties this does not hold
[Figure: an example with DI(M) = 3 but CT = 4.]
CT for liveness properties

Theorem: for AFp properties CT = RDI(M)+1
[Figure: a path from s0.]
Theorem: for an LTL property φ, CT = ?
CT for arbitrary LTL properties
[Figure: the shortest counterexample from s0, annotated with the bounds ≤ dI(P), ≤ d(P), and ≤ rdI(P).]
Theorem [CKOS 05]
A Completeness Threshold for any LTL
property φ is min(rdI(P)+1, dI(P)+d(P))
Why take the minimum?
Example 1: dI(P)+d(P) = 6 > rdI(P)+1 = 4
Example 2: dI(P)+d(P) = 2 < rdI(P)+1 = 4
Formulation of diameter in QBF
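State s is reachable in j steps:
$$\mathit{Reach}_j(s) := \exists s_0, \ldots, s_j .\; I(s_0) \wedge \bigwedge_{i=0}^{j-1} R(s_i, s_{i+1}) \wedge s = s_j$$
Thus, k is greater than or equal to the diameter d if
$$\forall s .\; \mathit{Reach}_{k+1}(s) \rightarrow \exists j \le k .\; \mathit{Reach}_j(s)$$
Infeasible to compute the diameter using a poly-time
algorithm for shortest paths, because the state graph is exponential in the number of state variables.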
SAT-based Diameter Computation
M. Mneimneh, K. Sakallah, “SAT-based
Sequential Depth Computation”, ASP-DAC 03
1. Check if there is a state s reachable in c
steps but not reachable in less than c
steps.
2. Increment c, until no state is reachable in
c steps.

May enumerate many states in 1.

Recurrence diameter as SAT
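Find maximal n that satisfies:
$$\bigwedge_{i=0}^{n-1} R(s_i, s_{i+1}) \wedge \bigwedge_{0 \le i < j \le n} s_i \neq s_j$$
The pairwise state comparison has size O(n^2).
Optimization: Use a sorting network to obtain an ordered
permutation of the states [Kroening & Strichman]
[Figure: states s0, s1, s2 pass through comp & swap elements, yielding the sorted states s0’, s1’, s2’; network size O(n log n).]
Now compare only neighboring states: O(n)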
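The idea behind the optimization can be seen outside of SAT: once the states are sorted, any duplicate must sit next to its twin. The sketch below uses Python's built-in sort as a stand-in for the sorting network, so it illustrates the comparison counts only, not the propositional encoding.

```python
def all_distinct_pairwise(states):
    """Compare every pair of states: O(n^2) comparisons."""
    n = len(states)
    return all(states[i] != states[j]
               for i in range(n) for j in range(i + 1, n))

def all_distinct_sorted(states):
    """Sort first, then compare only neighbors: O(n) comparisons after sorting."""
    ordered = sorted(states)
    return all(a != b for a, b in zip(ordered, ordered[1:]))

path = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(all_distinct_pairwise(path), all_distinct_sorted(path))                          # True True
print(all_distinct_pairwise(path + [(0, 1)]), all_distinct_sorted(path + [(0, 1)]))    # False False
```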
Complexity of BMC: Formula size



Original translation: O(k|M| + k^2|φ|)
Automata based translation: O(k|M| 2^|φ|)
Fixpoint based translation: O(k(|M| + |φ|))
Complexity of BMC





Size of SAT instance is O(k(|M| + |φ|))
k can become as large as the diameter of the
system, which is exponential in the number
of state variables in the worst case.
SAT is exponential time.
Therefore, SAT based BMC has doubly
exponential complexity.
But LTL model checking is singly exponential!
Why use SAT based BMC?



Infeasible to represent P explicitly.
Identify shallow errors efficiently.
In many cases rd(P) and d(P) are not
exponential and can be rather small.


E.g. hardware components without counters
Modern SAT solvers are very successful
in practice.
Unbounded Model Checking
using Cube Enlargement
P. Chauhan, E. Clarke, and D. Kroening: “Using SAT based
Image Computation for Reachability Analysis” CMU-CS-03-151
Reachability analysis





Consider a system with state variables x
and inputs i.
S0(x) is the set of initial states.
T(x,i,x’) is the transition relation.
We want to compute the set of reachable
states Sreach.
Iterative process: Compute the states
reachable in 1 step, 2 steps, …
Image computation and Reachability


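The set of immediate successors of states
S(x) is given by:
$$\mathit{Img}(S)(x') = \exists x\, \exists i .\; S(x) \wedge T(x, i, x')$$
The set of all reachable states is the least
fixpoint:
$$S_{reach} = \mu Z .\; S_0 \vee \mathit{Img}(Z)$$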
Computing Reachability


Si+1 is the set of new states directly
reachable from Si
Then Sreach is the union of all Si
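A minimal explicit-state sketch of this layered computation follows. It assumes a hypothetical function `successors(s)` that returns the image of a single state; a symbolic tool would compute the image of the whole set at once.

```python
def reachable(init_states, successors):
    """Return (S_reach, [S_0, S_1, ...]) where each S_i holds only new states."""
    s_reach = set(init_states)
    frontier = set(init_states)            # S_0
    layers = [frontier]
    while frontier:
        image = {t for s in frontier for t in successors(s)}
        frontier = image - s_reach         # S_{i+1}: states not seen before
        s_reach |= frontier
        if frontier:
            layers.append(frontier)
    return s_reach, layers

# Usage: a modulo-4 counter starting at 0.
print(reachable({0}, lambda s: [(s + 1) % 4]))
```

The loop stops when a layer adds nothing new, which is the fixpoint.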
SAT based image computation

The transition relation T(x,i,x’) is represented
as a CNF formula (a set of clauses).


If not already in CNF, it can be converted in
polynomial time.
The set of newly reachable states after each
step, Si, as well as their union Sreach, are
represented in DNF (a set of cubes).
--- Obviously ¬Sreach is in CNF.
SAT based image computation
Union of sets of cubes
Si+1 contains all solutions to
Si(x) ∧ T(x, i, x’) ∧ ¬Sreach(x’)
projected on x’ and renamed to x
The image computation step



Si is in DNF
Convert to CNF by introducing new variables
Solve the CNF formula
Si(x) ∧ T(x,i,x’) ∧ ¬Sreach(x’)
Solution is a cube d
Project d to x’ and rename to x
Add d to Sreach(x) and Si+1(x)
Repeat until the formula becomes unsat
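The enumeration loop can be sketched as follows. Everything here is an assumption made for illustration: `solve` is a brute-force stand-in for a SAT solver, states are stored as full assignments rather than cubes, and `counter_trans` is a made-up transition relation (a 2-bit counter with one enable input).

```python
from itertools import product

def solve(formula, n_x, n_i):
    """Stand-in for a SAT call: return one (x, i, x') satisfying formula, or None."""
    bits = lambda n: product([0, 1], repeat=n)
    for x, i, x2 in product(bits(n_x), bits(n_i), bits(n_x)):
        if formula(x, i, x2):
            return x, i, x2
    return None

def image_step(s_i, trans, s_reach, n_x, n_i):
    """Enumerate solutions of S_i(x) & T(x,i,x') & ~S_reach(x') until unsat."""
    s_next = set()
    while True:
        sol = solve(lambda x, i, x2: x in s_i and trans(x, i, x2)
                    and x2 not in s_reach, n_x, n_i)
        if sol is None:
            return s_next
        d = sol[2]            # project the solution onto x' and rename to x
        s_reach.add(d)        # growing S_reach also blocks d in later calls
        s_next.add(d)

def counter_trans(x, i, x2):
    nxt = (2 * x[0] + x[1] + i[0]) % 4    # count up only when the input is 1
    return x2 == (nxt >> 1, nxt & 1)

s_reach = {(0, 0)}
print(image_step({(0, 0)}, counter_trans, s_reach, 2, 1))  # {(0, 1)}
```

Each solver call yields one new state here, which is exactly the inefficiency that cube enlargement addresses.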

Efficiency issues


The number of satisfying assignments can
be exponential in the number of variables.
Therefore two problems:
Enumeration of full assignments is slow.


Solution: Cube enlargement
The representation of Sreach and Si can
grow too large.

Solution: Systematically combine cubes using
an appropriate data structure.
Cube enlargement


SAT solvers like zChaff return complete
assignments (minterms).
Partial assignments (cubes) are better,
because they represent multiple minterms.
For example, over the variables x1, ..., x4, the cube x1 ∧ x4 represents
4 minterms:
x1 ∧ ¬x2 ∧ ¬x3 ∧ x4
x1 ∧ ¬x2 ∧ x3 ∧ x4
x1 ∧ x2 ∧ ¬x3 ∧ x4
x1 ∧ x2 ∧ x3 ∧ x4
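The relationship between a cube and its minterms can be made concrete with a few lines of code. In this sketch a cube is a dictionary from variable names to 0/1, which is a representation chosen for this note.

```python
from itertools import product

def minterms(cube, variables):
    """Yield every complete assignment over variables that the cube covers."""
    free = [v for v in variables if v not in cube]
    for values in product([0, 1], repeat=len(free)):
        yield {**cube, **dict(zip(free, values))}

cube = {"x1": 1, "x4": 1}
for m in minterms(cube, ["x1", "x2", "x3", "x4"]):
    print(m)
```

A cube with j unassigned variables stands for 2^j minterms, so each literal dropped during enlargement halves the number of solver calls needed to cover the same states.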
Efficient cube set representation




Cubes are stored in a hash table of tries.
Each trie is associated to a unique subset of
state variables.
Whenever a new cube d is inserted, the
corresponding trie is searched for cubes d’
that differ only in one literal.
The merged cube (without the differing
literal) is stored instead of d and d’.
Efficient cube set representation
Hash table
Hash keys
{x1, x2}
{x1, x7 , x8}
{x2, x3 , x4}
{x2, x4 }
…
Tries
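New cube over the variables x2, x3, x4
1. Identify the appropriate hash table entry: {x2, x3, x4}
2. Look for matching cubes
3. If a match is found, delete the matching cube and insert the merged cube (over x2, x4) under the entry {x2, x4}
[Figure: tries for the hash table entries {x2, x3, x4} and {x2, x4}.]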
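The merging rule can be sketched with ordinary Python containers. This is a simplification: a plain set stands in for each trie, and the class and method names are invented for this note.

```python
from collections import defaultdict

class CubeSet:
    """Cubes grouped by their variable set; a set stands in for each trie."""
    def __init__(self):
        self.table = defaultdict(set)     # frozenset(vars) -> set of cubes

    def insert(self, cube):
        """cube maps var -> 0/1. Merge with a stored cube differing in one literal."""
        key = frozenset(cube)
        for var in cube:
            twin = dict(cube)
            twin[var] = 1 - cube[var]     # same cube with one literal flipped
            twin_item = frozenset(twin.items())
            if twin_item in self.table[key]:
                self.table[key].discard(twin_item)
                merged = {v: b for v, b in cube.items() if v != var}
                return self.insert(merged)    # the merged cube may merge again
        self.table[key].add(frozenset(cube.items()))

    def cubes(self):
        return [dict(c) for group in self.table.values() for c in group]

cs = CubeSet()
cs.insert({"x2": 1, "x3": 1, "x4": 1})
cs.insert({"x2": 1, "x3": 0, "x4": 1})
print(cs.cubes())  # [{'x2': 1, 'x4': 1}] (key order may vary)
```

Grouping by variable set means a candidate partner is found with at most one lookup per literal of the new cube.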
Related work

[Gupta et al, FMCAD 00 and ICCAD 01]


[K. McMillan, CAV 02]




Mixed BDD / SAT approach
Sets of states represented in CNF
CNF clauses stored in ZDDs
Conflict analysis for cube enlargement
[H. Kang and I. Park, DAC 03]


Offline Espresso to reduce the number of cubes
No cube enlargement
Unbounded Model Checking
using Circuit Cofactoring
M. Ganai, A. Gupta and P. Ashar,
“Efficient SAT-based Unbounded Symbolic Model
Checking Using Circuit Cofactoring”, ICCAD 04
SAT-based Image Computation




The SAT-based procedure enumerates all
state cube solutions.
Each invocation of the SAT solver generates
one new state cube.
A blocking clause representing the negation
of the state cube is added at each step.
The main problem is that the required
number of steps can be very large.
Main Contribution

Use circuit cofactoring to capture a large set
of states at each enumeration step.



--- Fewer enumeration steps
Use circuit graph simplification to compact
the captured states.
Use a hybrid SAT solver that works on both
OR/INVERTER circuits and CNF.
Definitions






State variables X.
Input variables U.
Partial assignment α: X ∪ U → {0,1}.
State cube s is the projection of α on X.
Input cube u is the projection of α on U.
Minterm m is a complete assignment to U
extending u.
Example
X = {x1, x2}
U = {u1, u2}
α = x1 ∧ ¬u2
s = x1
u = ¬u2
m = u1 ∧ ¬u2
Cofactors of Boolean functions


Cofactors of f(v1,…,v,…) with respect to
variable v are fv = f(v1,…,1,…) and fv’ = f(v1,…,0,…)
Cofactor of f with respect to cube c is fc
--- Obtained by cofactoring f with respect to
each literal in c.
Example
Producing larger sets of states

Given a formula f and a satisfying
assignment cube s
1. Isolate the “input part” of s and complete it
by picking values for unassigned inputs.
2. Cofactor f with respect to the satisfying input
minterm m.
3. Use the function fm obtained in 2 to
represent the set of satisfying states.
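The three steps can be tried on a small function. The function `f` below is hypothetical, picked for this note rather than taken from the slides, and cofactoring is done by substitution into a Python predicate instead of on a circuit graph.

```python
from itertools import product

def cofactor(f, minterm):
    """Fix the input variables of f to the values in minterm."""
    return lambda state: f({**state, **minterm})

def f(a):
    # Hypothetical example: (u1 and not x2) or (u2 and x1)
    return (a["u1"] and not a["x2"]) or (a["u2"] and a["x1"])

# Suppose the solver returned the cube u1=1, x2=0.
# Step 1: complete the input part with u2=1.
m = {"u1": 1, "u2": 1}
# Step 2: cofactor f with respect to m.
f_m = cofactor(f, m)
# Step 3: f_m now describes a set of states.
states = [dict(zip(["x1", "x2"], bits)) for bits in product([0, 1], repeat=2)]
captured = [s for s in states if f_m(s)]
print(captured)
```

Here the state cube x2=0 covers two states, while the cofactor covers three, so one enumeration step captures more of the solution set.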
Example

u1 and u2 are primary inputs.
x1 and x2 are state variables.

We want to compute: ∃u1 u2 . f
Example cont’



The SAT solver returns <u1=1,x2=0> as
the first assignment.
Step 1: Complete the input part of the
assignment by choosing u2=1 .
Step 2: Cofactor f with respect to the
satisfying input minterm m=u1u2. We get:
Example cont’


fm represents more states than the
satisfying cube x2’
We needed just one enumeration step to
capture the entire solution set
SAT-based existential quantification
The returned value of C should correspond to ∃B f(A,B):
C ⇔ ∃B f(A,B)
C is a union of cofactors of f with respect
to B, therefore
--- C ⇒ ∃B f(A,B)
When the algorithm terminates
--- f(A,B) ∧ ¬C is unsat, therefore ∀B (¬f(A,B) ∨ C) is valid
--- C contains no variables in B, therefore ∀B (¬f(A,B)) ∨ C
--- ∃B f(A,B) ⇒ C
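The termination argument can be checked on a toy instance. In this sketch the SAT call is replaced by exhaustive search, C is kept as a list of B-minterms whose cofactors are OR-ed together, and the function `f` is a hypothetical example; none of these choices come from the original tool.

```python
from itertools import product

def exists_b(f, a_vars, b_vars):
    """Compute C equivalent to (exists B. f) as an OR of cofactors of f."""
    chosen = []                               # B-minterms used for cofactoring

    def c(a):
        return any(f({**a, **m}) for m in chosen)

    def solve():
        # Stand-in for a SAT call on f(A,B) and not C(A).
        for bits in product([0, 1], repeat=len(a_vars) + len(b_vars)):
            full = dict(zip(a_vars + b_vars, bits))
            if f(full) and not c({v: full[v] for v in a_vars}):
                return full
        return None

    while (sol := solve()) is not None:
        chosen.append({v: sol[v] for v in b_vars})   # add one more cofactor to C
    return c, chosen

def f(a):
    # Hypothetical example: (u1 and not x2) or (u2 and x1)
    return (a["u1"] and not a["x2"]) or (a["u2"] and a["x1"])

c, chosen = exists_b(f, ["x1", "x2"], ["u1", "u2"])
print(len(chosen))  # 2
```

For this f the result is equivalent to x1 ∨ ¬x2, reached after two enumeration steps instead of one per satisfying state.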
Hybrid SAT-solver




Represents original circuit with 2-input
OR/INVERTER gates
Represents learned constraints with CNF
Finds partial satisfying assignments
Dynamically removes inactive clauses
Other applications of SAT in
formal verification

[D. Kroening, F. Lerda, and E. Clarke, TACAS 04]
--- Bounded Model Checking for Software
[G. Audemard, A. Cimatti, A. Kornilowicz, and R. Sebastiani, FORTE 02]
--- Bounded Model Checking for Timed Systems
[H. Jain, D. Kroening, N. Sharygina, E. Clarke, DAC 05]
--- Word level predicate abstraction and refinement for verifying RTL Verilog
For more information …

“A survey of Recent Advances in SAT-based
Formal Verification” by Mukul R Prasad,
Armin Biere and Aarti Gupta, STTT.