SAT-Based Decision Procedures for Subsets of First-Order Logic Carnegie Mellon University

SAT-Based Decision Procedures for Subsets of First-Order Logic
Part I: Equality with Uninterpreted Functions
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant

Overall Outline
Background
• SAT-based Decision Procedures
Part I: Equality with Uninterpreted Functions
• Translating to propositional formula
• Exploiting positive equality and sparse transitivity
Part II: Separation Logic
• Restricted form of addition
• Translating to propositional formula
• Hybrid encoding techniques

Decision Procedures in Formal Verification
[Figure: RTL/source code + specification → abstraction → formal model + specification → verification → OK or Error. Verification uses a decision procedure for a decidable fragment of first-order logic.]
Applications: Out-of-order, Pipelined Microprocessors; Cache Coherence Protocols; Device Drivers; Compiler Validation; …

SAT-based Decision Procedures
[Figure, two flows]
EAGER ENCODING: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable
LAZY ENCODING: Input Formula → Approximate Boolean Encoder → Boolean Formula → SAT Solver → unsatisfiable, or satisfying assignment → First-order Conjunctions SAT Checker → satisfiable, or unsatisfiable with an additional clause returned to the SAT Solver

Lazy Encoding Characteristics
[Figure: First-order Conjunctions SAT Checker built from a Theory Combiner over Uninterpreted Functions, Linear Arithmetic, Bit Vectors, and further theories up to Theory N]
+ Can be extended to handle wide variety of theories
+ Clean & modular design
– Does not scale well
  - Number of calls to conjunction checker typically exponential in formula size
  - Each call independent: nothing learned in one call can be exploited by another

Eager Encoding Characteristics
[Figure: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable]
– Must encode all information about domain properties into Boolean formula
– Some properties can give exponential blowup
+ Lets SAT solver do all of the work
Good Approach for Some Domains
• Modern SAT solvers have remarkable capacity
  - Good at extracting relevant portions out of very large formulas
  - Learns about formula properties as search proceeds
  - Focus of this talk

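Data and Function Abstraction
Bit-vectors to (unbounded) Integers
[Figure: the bits $x_0, x_1, x_2$ through $x_{n-1}$ of a word abstracted to a single integer $x$]
Common Operations
• If-then-else: ITE(p, x, y), a multiplexor that selects x when p = 1 and y when p = 0
• Test for equality: x = y
Functional units to Uninterpreted Functions
[Figure: ALU with inputs x and y replaced by a function block f]
• $a = x \land b = y \Rightarrow f(a,b) = f(x,y)$
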
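Abstract Modeling of Microprocessor
[Figure: pipelined datapath with IF/ID, ID/EX, and EX/WB pipeline registers; PC, instruction memory, register file, immediate (Imm) and operand (Adat) paths, control blocks, and equality comparators on the register identifiers Rd, Ra, Rb; the +4 incrementer, ALU, and other data-transforming blocks are replaced by functions F1, F2, F3]
For any Block that Transforms or Evaluates Data:
• Replace with generic, unspecified function
• Also view instruction memory as function
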
EUF: Equality with Uninterpreted Functions
• Decidable fragment of first order logic
Formulas (F)                                        Boolean Expressions
  $\lnot F$, $F_1 \land F_2$, $F_1 \lor F_2$        Boolean connectives
  $T_1 = T_2$                                       Equation
  $P(T_1, \ldots, T_k)$                             Predicate application
Terms (T)                                           Integer Expressions
  $ITE(F, T_1, T_2)$                                If-then-else
  $Fun(T_1, \ldots, T_k)$                           Function application
Functions (Fun)                                     Integer → Integer
  f                                                 Uninterpreted function symbol
  Read, Write                                       Memory operations
Predicates (P)                                      Integer → Boolean
  p                                                 Uninterpreted predicate symbol

EUF Decision Problem
Circuit Representation of Formula
[Figure: example formula drawn as a circuit over the inputs $x_0$, $d_0$, $e_0$, $e_1$, with two applications of f, two equality comparators, and T/F multiplexors]
Truth Values
• Dashed Lines
• Model Control
• Logical connectives
• Equations
Integer Values
• Solid lines
• Model Data
• Uninterpreted functions
• If-Then-Else operation
Task
• Determine whether formula $F$ is universally valid
  - True for all interpretations of variables and function symbols
  - Often expressed as (un)satisfiability problem
    » Prove that formula $\lnot F$ is not satisfiable

Finite Model Property for EUF
[Figure: the same example circuit, with its distinct terms $x_0$, $d_0$, $f(x_0)$, $f(d_0)$ called out]
Observation
• Any formula has limited number of distinct expressions
• Only property that matters is whether or not different terms are equal

Boolean Encoding of Integer Values
Expression    Possible Values    Bit Encoding
$x_0$         {0}                0         0
$d_0$         {0,1}              0         $b_{10}$
$f(x_0)$      {0,1,2}            $b_{21}$  $b_{20}$
$f(d_0)$      {0,1,2,3}          $b_{31}$  $b_{30}$
For Each Expression
• Either equal to or distinct from each preceding expression
Boolean Encoding
• Use Boolean values to encode integers over small range
• EUF formula can be translated into propositional logic
  - Logic circuit with multiplexors, comparators, logic gates
  - Tautology iff original formula valid

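The ranges in the table can be sanity-checked by brute force. The following Python sketch is an illustration added here, not part of the original slides, and its function names are hypothetical. It assumes that only the pattern of equalities among the four expressions matters, and checks that letting the i-th expression range over i values already realizes every such pattern.

```python
from itertools import product

def equality_pattern(values):
    """Canonical form of 'which positions are equal': each position is
    labeled with the index of the first position holding the same value."""
    return tuple(values.index(v) for v in values)

def small_domain_patterns(n):
    """Patterns reachable when expression i ranges over {0, ..., i}."""
    ranges = [range(i + 1) for i in range(n)]
    return {equality_pattern(vals) for vals in product(*ranges)}

def all_patterns(n):
    """Patterns reachable when every expression ranges over {0, ..., n-1}."""
    return {equality_pattern(vals) for vals in product(range(n), repeat=n)}

# Four expressions, as in the table: x0, d0, f(x0), f(d0).
print(len(small_domain_patterns(4)))  # 15
print(small_domain_patterns(4) == all_patterns(4))  # True
```

The 24 restricted assignments cover the same 15 equality patterns as all 256 unrestricted ones, which is why two bits per expression are enough in this example.
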
Some History of EUF Decision Procedures
• Ackermann, 1954
  - Quantifier-free decision problem can be decided based on finite instantiations
• Burch & Dill, CAV '94
  - Automatic decision procedure
    » Davis-Putnam enumeration
    » Congruence closure to enforce functional consistency
• Boolean approaches
  - Goel, et al, CAV '98
    » Attempted with BDDs, but didn't get good results
  - Bryant, German, Velev, CAV '99
    » Could verify microprocessor using BDDs
  - Velev & Bryant, DAC 2001
    » Demonstrated power of modern SAT procedures

Exploiting Positive Equality
• Bryant, German, Velev CAV '99
• First successful use of Boolean methods for EUF
Positive Equality
• Equations that appear in unnegated form
Exploiting
• Can greatly reduce number of cases required to show validity
  - Only need to consider maximally diverse interpretations
• Reduce number of Boolean variables in bit-level encoding

Diverse Interpretations: Illustration
Task
• Verify someone's obscure code for 4X4 array transpose
    void trans(int a[4][4])
    {
        int t;
        for (t = 4; t < 15; t++)
            if (~t&2 || t&8 && ~t&1) {
                int r = t&0x3;
                int c = t>>2;
                /* Only operations on array elements */
                int val = a[r][c];
                a[r][c] = a[c][r];
                a[c][r] = val;
            }
    }
Observation
• Array elements altered only by copying one to another
• Just need to make sure right set of copies performed

Verifying Array Code
Test for trans4
src                           dest
 0  1  2  3                    0  4  8 12
 4  5  6  7     trans4 →       1  5  9 13
 8  9 10 11                    2  6 10 14
12 13 14 15                    3  7 11 15
Single Test Adequate
• Unique value for each possible source element
  - "Maximally Diverse"
• If dest[r][c] = src[c][r], then must have copied proper value

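The single-test argument can be tried out directly. The sketch below is a Python port of the slide's C routine, added for illustration; the names `trans` and `passes_diverse_test` are choices made here. It fills the source array with 16 distinct values and checks the transpose condition once.

```python
def trans(a):
    """Port of the slide's C routine: swaps a[r][c] with a[c][r] in place.
    The loop bounds and the bit test are copied unchanged."""
    for t in range(4, 15):
        if (~t & 2) or ((t & 8) and (~t & 1)):
            r = t & 0x3
            c = t >> 2
            a[r][c], a[c][r] = a[c][r], a[r][c]

def passes_diverse_test(routine):
    """Run one test in which every source element has a unique value."""
    src = [[4 * r + c for c in range(4)] for r in range(4)]
    dest = [row[:] for row in src]          # routine works in place on a copy
    routine(dest)
    return all(dest[r][c] == src[c][r] for r in range(4) for c in range(4))

print(passes_diverse_test(trans))            # True
print(passes_diverse_test(lambda a: None))   # False: a routine that does nothing
```

Because the routine only copies elements, a pass on this one maximally diverse input shows that each destination cell received the right source cell.
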
Characteristics of Array Verification
Correctness Condition
  $src[0][0] = dest[0][0] \land src[0][1] = dest[1][0] \land$
  $src[0][2] = dest[2][0] \land \ldots$
  $\ldots \land$
  $src[3][2] = dest[2][3] \land src[3][3] = dest[3][3]$
Properties
• All equations are in positive form
• Worst case test is one that tends to make things unequal
  - I.e., maximally diverse interpretation
• All maximally diverse interpretations isomorphic
  - Only need to try one to prove all handled correctly

Equations in Processor Verification
[Figure: the pipelined datapath (IF/ID, ID/EX, EX/WB; PC, instruction memory, register file, ALU, +4, equality comparators on register identifiers)]
Data Types             Equations
Register Ids           Control stalling & forwarding
Instruction Address    Only top-level verification condition
Program Data           Only top-level verification condition

Exploiting Equation Structure
Positive Equations
• In top-level verification condition
• Can use maximally diverse interpretation
Negative Equations
• Pipeline control logic
  - Between register IDs
  - Operation depends on whether or not two IDs are equal
• Must use general encoding
  - Encode with Boolean variables
  - Cover all possibilities of IDs that match and/or don't match

Application of Positive Equality
[Figure: the example circuit evaluated under a single diverse interpretation, with distinct integer values (5, 6, 7, 8) assigned to the terms $x_0$, $d_0$, $f(x_0)$, $f(d_0)$ and propagated through the multiplexors and comparators]
Observation
• All equations are positive in this formula
• Can consider single, diverse interpretation for terms

Function Elimination: Ackermann's Method
Replace All Function Applications by Integer Variables
• Introduce new domain variable
• Enforce functional consistency by global constraints
[Figure: applications $f(x_1)$ and $f(x_2)$ replaced by variables $vf_1$ and $vf_2$, with the constraint $x_1 = x_2 \Rightarrow vf_1 = vf_2$]
• Unclear how to restrict evaluation to diverse interpretations

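As a rough illustration of the global constraints, the Python sketch below handles the simplest case: several applications of one unary function f to distinct argument names. The function names and the tuple representation are assumptions made for this example, not notation from the slides.

```python
from itertools import combinations

def ackermann_constraints(args):
    """For applications f(a) with a in args (assumed distinct names), return
    the fresh variable for each application and one consistency constraint
    per pair, encoded as ((a, b), (vf_a, vf_b)) meaning a = b => vf_a = vf_b."""
    fresh = {a: f"vf{i}" for i, a in enumerate(args, start=1)}
    constraints = [((a, b), (fresh[a], fresh[b]))
                   for a, b in combinations(args, 2)]
    return fresh, constraints

def consistent(constraints, env):
    """Check an assignment of integers to all variables against the constraints."""
    return all(env[a] != env[b] or env[va] == env[vb]
               for (a, b), (va, vb) in constraints)

fresh, cons = ackermann_constraints(["x1", "x2", "x3"])
print(fresh["x2"], len(cons))  # vf2 3
```

The number of constraints grows quadratically with the number of applications, since every pair of applications needs its own implication.
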
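Function Elimination: ITE Method
General Technique
• Introduce new domain variable
• Nested ITE structure maintains functional consistency
[Figure: $f(x_1)$ replaced by $vf_1$; $f(x_2)$ replaced by $ITE(x_2 = x_1, vf_1, vf_2)$; $f(x_3)$ replaced by $ITE(x_3 = x_1, vf_1, ITE(x_3 = x_2, vf_2, vf_3))$]
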
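The nested structure can be built mechanically. The Python sketch below is an illustration under assumptions: terms are nested tuples, variables are strings, and the helper names `ite_term` and `evaluate` are invented here. It constructs the replacement for the i-th application of f and evaluates it under an assignment.

```python
from itertools import product

def ite_term(i, args, fresh):
    """Replacement for the i-th application f(args[i]): test the argument
    against each earlier argument, earliest first, else use fresh[i]."""
    term = fresh[i]
    for j in range(i - 1, -1, -1):
        term = ("ITE", ("=", args[i], args[j]), fresh[j], term)
    return term

def evaluate(term, env):
    """Evaluate a variable name, an ("=", a, b) test, or an ITE tuple."""
    if isinstance(term, str):
        return env[term]
    if term[0] == "=":
        return evaluate(term[1], env) == evaluate(term[2], env)
    _, cond, then_t, else_t = term
    return evaluate(then_t, env) if evaluate(cond, env) else evaluate(else_t, env)

ARGS, FRESH = ["x1", "x2", "x3"], ["vf1", "vf2", "vf3"]
TERMS = [ite_term(i, ARGS, FRESH) for i in range(3)]

def functionally_consistent():
    """Equal arguments must give equal results, for every assignment tried."""
    for xs, vs in product(product(range(3), repeat=3), product(range(2), repeat=3)):
        env = dict(zip(ARGS + FRESH, xs + vs))
        res = [evaluate(t, env) for t in TERMS]
        if any(xs[i] == xs[j] and res[i] != res[j]
               for i in range(3) for j in range(3)):
            return False
    return True

print(functionally_consistent())  # True
```

No side constraints are needed: consistency is built into the terms themselves, which is what later allows fixed values to be substituted for the fresh variables.
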
Generating Diverse Encoding
Replacing Application
• Use fixed values rather than variables
• Application results equal iff arguments equal
[Figure: the nested ITE structure for $f(x_1)$, $f(x_2)$, $f(x_3)$ with the fixed values 5, 6, 7 in place of $vf_1$, $vf_2$, $vf_3$]

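A small check of the "equal iff arguments equal" property, written in Python as an illustrative sketch. The constants 5, 6, 7 follow the slide; the function names and the three-value argument domain are assumptions made for the example.

```python
from itertools import product

DIVERSE = [5, 6, 7]  # fixed, pairwise distinct values, as on the slide

def diverse_result(i, xs):
    """Value of the i-th application under the nested-ITE scheme with
    constants: reuse the constant of the earliest equal argument."""
    for j in range(i):
        if xs[i] == xs[j]:
            return DIVERSE[j]
    return DIVERSE[i]

def equal_iff_args_equal(domain=range(3)):
    """Check, for every argument assignment, that two results are equal
    exactly when the corresponding arguments are equal."""
    for xs in product(domain, repeat=3):
        res = [diverse_result(i, xs) for i in range(3)]
        for i in range(3):
            for j in range(3):
                if (res[i] == res[j]) != (xs[i] == xs[j]):
                    return False
    return True

print(equal_iff_args_equal())  # True
```

With variables in place of the constants, only the "if" direction is guaranteed; distinct constants add the "only if" direction, which is what makes the interpretation maximally diverse.
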
Benefits of Positive Equality
Microprocessor Benchmarks
• 1xDLX: Single issue, RISC processor
• 2xDLX-EX-BP: Dual issue processor with exception handling & branch prediction
• 9VLIW-BP: 9-way VLIW processor with branch prediction
Measurements
• Using BerkMin SAT solver
Benchmark                 Using Pos. Eq.    No Pos. Eq.
1xDLX          buggy      0.02              2
               good       0.07              229
2xDLX-EX-BP    buggy      4                 15
               good       15                > 24hrs
9VLIW-BP       buggy      10                > 24hrs
               good       224               > 24hrs

Benefits of Positive Equality
Microprocessor Benchmarks
• Velev & Bryant, JSC '02
• 1xDLX: Single issue, RISC processor
• 2xDLX-EX-BP: Dual issue processor with exception handling & branch prediction
• 9VLIW-BP: 9-way VLIW processor with branch prediction
Measurements
• Using BerkMin SAT solver
Benchmark                 Using Pos. Eq.    No Pos. Eq.
1xDLX          good       0.02              2
               buggy      0.07              229
2xDLX-EX-BP    good       4                 15
               buggy      15                > 24hrs
9VLIW-BP       good       10                > 24hrs
               buggy      224               > 24hrs

Revisiting Encoding Techniques
$x = y \land y = z \land z \neq x$   Satisfiable?
Small Domain (SD)
  $x_1x_0 = y_1y_0 \land y_1y_0 = z_1z_0 \land z_1z_0 \neq x_1x_0$
• Use bit-level encodings of bounded integers
• Implicitly encode properties of equality logic
Per-Constraint Encoding (EIJ)
  $e_{xy} \land e_{yz} \land \lnot e_{xz}$
Transitivity Constraints
  $(e_{yz} \land e_{zx} \Rightarrow e_{xy}) \land$
  $(e_{xy} \land e_{yz} \Rightarrow e_{xz}) \land$
  $(e_{xy} \land e_{xz} \Rightarrow e_{yz})$
• Introduce explicit Boolean variable for each equation
• Additional transitivity constraints to express properties of equality logic

Per-Constraint Encoding
  - Introduced by Goel et al., CAV '98
  - Exploiting sparse structure by Bryant & Velev, CAV 2000
Procedure
• Initial formula $F$
  - Want to prove valid
  - Prove that $\lnot F$ is not satisfiable
• Replace each equation $x = y$ by Boolean variable $e_{xy}$
  - Gives formula $F_{sat}$
• Generate formula expressing transitivity constraints
  - Gives formula $F_{trans}$
• Use SAT solver to show that $F_{sat} \land F_{trans}$ is not satisfiable
Motivation
• Provides SAT solver with more direct representation of underlying problem

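The procedure can be traced on the three-variable example $x = y \land y = z \land z \neq x$. The Python sketch below is illustrative only: exhaustive enumeration stands in for a real SAT solver, $e_{zx}$ and $e_{xz}$ are treated as the same variable, and all function names are assumptions.

```python
from itertools import product

def f_sat(e_xy, e_yz, e_xz):
    """Boolean skeleton of x=y and y=z and z!=x, one variable per equation."""
    return e_xy and e_yz and not e_xz

def f_trans(e_xy, e_yz, e_xz):
    """The three transitivity constraints for the triangle x, y, z."""
    implies = lambda a, b: (not a) or b
    return (implies(e_yz and e_xz, e_xy)
            and implies(e_xy and e_yz, e_xz)
            and implies(e_xy and e_xz, e_yz))

def satisfiable(formula):
    """Brute-force stand-in for a SAT solver over three Boolean variables."""
    return any(formula(*bits) for bits in product([False, True], repeat=3))

print(satisfiable(f_sat))                                     # True
print(satisfiable(lambda *e: f_sat(*e) and f_trans(*e)))      # False
```

The skeleton alone is satisfiable, which is why the transitivity constraints must be conjoined before the formula is handed to the solver.
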
Graph Interpretation of Transitivity
Transitivity Violation
• Cycle in graph
• Exactly one edge has $e_{i,j}$ = false
[Figure: cycle in which every edge is labeled = except one edge labeled $\neq$]

Exploiting Chords
Chord
• Edge connecting two nonadjacent vertices in cycle
Property
• Sufficient to enforce transitivity constraints for all chord-free cycles
• If transitivity holds for all chord-free cycles, then holds for arbitrary cycles

Enumerating Chord-Free Cycles
Strategy
• Enumerate chord-free cycles in graph
• Each cycle of length k yields k transitivity constraints
Problem
• Potentially exponential number of chord-free cycles
[Figure: graph built from k repeated units numbered 1, 2, ···, k, with $2^k + k$ chord-free cycles]

Adding Chords
Strategy
• Add edges to graph to reduce number of chord-free cycles
[Figure: the k-unit graph before and after adding chords: $2^k + k$ chord-free cycles before, $2k + 1$ chord-free cycles after]
Trade-Off
• Reduces formula size
• Increases number of relational variables

Chordal Graph
Definition
• Every cycle of length > 3 has a chord
Goal
• Add minimum number of edges to make graph chordal
Relation to Sparse Gaussian Elimination
• Choose pivot ordering that minimizes fill-in
• NP-hard
• Simple heuristics effective

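One simple heuristic of this kind is greedy minimum-degree elimination. The Python sketch below is an assumption about what such a heuristic could look like, not the specific heuristic used in the cited work; the function name `make_chordal` is hypothetical. It eliminates vertices one at a time and records the fill-in edges needed to connect each eliminated vertex's remaining neighbors.

```python
def make_chordal(vertices, edges):
    """Greedy minimum-degree elimination on an undirected graph without
    self-loops. Returns the list of fill-in edges that were added; the
    input edges plus the fill-in edges form a chordal graph."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    fill = []
    remaining = set(vertices)
    while remaining:
        # Pick the vertex with the fewest remaining neighbors (ties by name).
        v = min(remaining, key=lambda u: (len(adj[u] & remaining), u))
        nbrs = sorted(adj[v] & remaining)
        # Make the remaining neighbors of v pairwise adjacent.
        for i, a in enumerate(nbrs):
            for b in nbrs[i + 1:]:
                if b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill.append((a, b))
        remaining.remove(v)
    return fill

# A 4-cycle needs exactly one chord.
print(make_chordal([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)]))  # [(2, 4)]
```

Each fill-in edge corresponds to one extra relational variable, so the heuristic trades a few added variables for a graph in which only triangles need transitivity constraints.
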
1xDLX-C Equation Structure
Vertices
• For each $v_i$
• 13 different register identifiers
Edges
• For each equation
• Control stalling and forwarding logic
• 27 relational variables
  - Out of 78 possible

Adding Chordal Edges to 1xDLX-C
Original
• 27 relational variables
• 286 cycles
• 858 clauses
Augmented
• 33 relational variables
• 40 cycles
• 120 clauses

2xDLX-CCt Equation Structure
Equations
• Between 25 different register identifiers
• 143 relational variables
  - Out of 300 possible

Adding Chordal Edges to 2xDLX-CCt
Original
• 143 relational variables
• 2,136 cycles
• 8,364 clauses
Augmented
• 193 relational variables
• 858 cycles
• 2,574 clauses

Choosing Encoding Method
Comparison
• Formula length n with m integer variables & function applications
• Worst-case complexity
                       Small Domain             Per-Constraint
Boolean Variables      $O(m \log m)$            $O(m^2)$
Formula Size           $O(n + m^2 \log m)$      $O(n + m^3)$
Per-Constraint Encoding Works Well in Practice
• Generates slightly larger formulas than small domain
• Better performance by SAT solver

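The $O(m^2)$ bound on per-constraint variables reflects one relational variable per unordered pair of terms. As a worked check, added here for illustration, the pair counts for the 13 and 25 register identifiers of the two benchmarks reproduce the "out of 78" and "out of 300" figures quoted earlier:

```latex
\binom{m}{2} = \frac{m(m-1)}{2}, \qquad
\binom{13}{2} = \frac{13 \cdot 12}{2} = 78, \qquad
\binom{25}{2} = \frac{25 \cdot 24}{2} = 300
```

Only 27 and 143 of these pairs actually occur as equations in the two benchmarks, which is the sparsity that the chord-based construction exploits.
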
Encoding Comparison
Benchmarks
• Velev & Bryant, JSC '02
• Superscalar, out-of-order datapath
• 2–6 instructions issued in parallel
Measurements
• Using BerkMin SAT solver
               Per-Constraint                  Small Domain
Issue Width    Vars     Clauses    Time        Vars    Clauses    Time
2              139      8,213      1.6         81      1,294      1.7
3              308      33,270     15          127     3,780      19
4              553      96,480     65          194     8,362      99
5              857      240,892    154         249     15,647     255
6              1,243    528,962    1,957       304     26,738     3,206