SAT-Based Decision Procedures for Subsets of First-Order Logic Carnegie Mellon University

SAT-Based Decision Procedures for Subsets of First-Order Logic
Part II: Separation Logic
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
Outline
Background
  SAT-based Decision Procedures
Equality with Uninterpreted Functions
  Translating to propositional formula
  Exploiting positive equality and sparse transitivity
Separation Logic
  Translating to propositional formula
  Hybrid encoding techniques
Separation Logic with Uninterpreted Functions (SUF)
Suitable for verifying a wider class of systems
Terms (T): Integer Expressions
  ITE(F, T1, T2)            If-then-else
  Fun(T1, …, Tk)            Function application
  T+1                       Increment
  T–1                       Decrement
Formulas (F): Boolean Expressions
  ¬F, F1 ∧ F2, F1 ∨ F2      Boolean connectives
  T1 = T2                   Equation
  T1 < T2                   Inequality
  Pred(T1, …, Tk)           Predicate application
SUF → Separation Logic
Eliminate function and predicate applications using fresh variables and ITE expressions [Bryant, German, Velev, CAV’99]
  f(x) → v1 and f(y) → ITE(x = y, v1, v2)
Terms (T): Integer Expressions
  ITE(F, T1, T2)            If-then-else
  v                         Integer variable (replaces function application Fun(T1, …, Tk))
  T+1                       Increment
  T-1                       Decrement
Formulas (F): Boolean Expressions
  ¬F, F1 ∧ F2, F1 ∨ F2      Boolean connectives
  T1 = T2                   Equation (separation predicate)
  T1 < T2                   Inequality (separation predicate)
  b                         Boolean variable (replaces predicate application Pred(T1, …, Tk))
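The elimination of function applications can be sketched in a few lines of Python. This is only an illustration of the f(x) → v1, f(y) → ITE(x = y, v1, v2) pattern; the helper names `eliminate_applications` and `show`, the tuple representation of terms, and the `v1, v2, …` naming are assumptions made for this sketch, not part of any tool.

```python
def eliminate_applications(arg_terms, prefix="v"):
    """Replace successive applications f(a1), f(a2), ... of one function symbol.

    arg_terms: argument names in order of appearance (hypothetical input form).
    Returns one replacement term per application: either a fresh variable name
    or a nested tuple ("ITE", ("=", a, b), then_term, else_term).
    """
    fresh = [f"{prefix}{i + 1}" for i in range(len(arg_terms))]
    results = []
    for i, arg in enumerate(arg_terms):
        # Value used when arg differs from every earlier argument.
        term = fresh[i]
        # Wrap from the most recent earlier argument outwards, so the
        # earliest argument is tested first in the final term.
        for j in range(i - 1, -1, -1):
            term = ("ITE", ("=", arg_terms[j], arg), fresh[j], term)
        results.append(term)
    return results


def show(term):
    """Pretty-print a term produced by eliminate_applications."""
    if isinstance(term, str):
        return term
    if term[0] == "=":
        return f"{term[1]} = {term[2]}"
    return f"ITE({show(term[1])}, {show(term[2])}, {show(term[3])})"


for t in eliminate_applications(["x", "y"]):
    print(show(t))
# v1
# ITE(x = y, v1, v2)
```

With a third application f(z), the same loop yields ITE(x = z, v1, ITE(y = z, v2, v3)), so equal arguments always map to equal values.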
Eager Boolean Encoding Methods for Separation Logic
Separation Logic Formula
  → Small Domain Encoding (SD) or Per-Constraint Encoding (EIJ)
  → Boolean Formula
  → SAT Solver
  → satisfiable/unsatisfiable
Small Domain Encoding (SD)
[Bryant, Lahiri, Seshia, CAV’02]
x ≥ y ∧ y ≥ z ∧ z ≥ x+1
0x1x0 ≥ 0y1y0 ∧ 0y1y0 ≥ 0z1z0 ∧ 0z1z0 ≥ 0x1x0 + 1
Observation:
  To check satisfiability, need to consider all possible relative orderings of finitely many expressions
  [Figure: possible orderings of x, x+1, y, and z along an axis of increasing values]
Can use Boolean encoding of finite range of values
  4 values in this case, so 2-bit encoding
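The small-domain idea can be illustrated by searching the finite range directly. The sketch below enumerates the 4-value range from the slide instead of building a bit-level Boolean formula, so it shows the principle, not the SD encoding itself. The function name `sd_satisfiable` and the `(a, b, c)` triple format (meaning a ≥ b + c) are assumptions for this example.

```python
from itertools import product


def sd_satisfiable(variables, constraints, domain_size):
    """Try every assignment of values 0..domain_size-1 to the variables.

    constraints: list of (a, b, c) triples, each meaning a >= b + c.
    Returns a satisfying assignment as a dict, or None if the range has none.
    """
    for values in product(range(domain_size), repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(env[a] >= env[b] + c for a, b, c in constraints):
            return env
    return None


# x >= y, y >= z, z >= x + 1 over a 4-value (2-bit) range
constraints = [("x", "y", 0), ("y", "z", 0), ("z", "x", 1)]
print(sd_satisfiable(["x", "y", "z"], constraints, 4))      # None
print(sd_satisfiable(["x", "y", "z"], constraints[:2], 4))  # {'x': 0, 'y': 0, 'z': 0}
```

An actual SD encoder would instead express each comparison over the bits x1x0, y1y0, z1z0 and pass the result to a SAT solver.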
Per-Constraint Encoding (EIJ)
[Strichman, Seshia, Bryant, CAV’02]
x ≥ y ∧ y ≥ z ∧ z ≥ x+1
  e1: x ≥ y
  e2: y ≥ z
  e3: z ≥ x+1
New Separation Predicate
  e4: x ≥ z
Overall Boolean Encoding
  e1 ∧ e2 ∧ e3
  ∧ (e1 ∧ e2 ⇒ e4)
  ∧ (e4 ⇒ ¬e3)
  The last two conjuncts are the Transitivity Constraints
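The per-constraint encoding of this example is small enough to check by hand or by exhaustive search. The sketch below tries all 16 assignments to e1..e4 in place of a SAT solver; the function name `eij_example_satisfiable` is hypothetical.

```python
from itertools import product


def eij_example_satisfiable():
    """Check e1 & e2 & e3 together with the two transitivity constraints."""
    for e1, e2, e3, e4 in product([False, True], repeat=4):
        overall = e1 and e2 and e3             # the original formula
        trans1 = (not (e1 and e2)) or e4       # e1 & e2 => e4
        trans2 = (not e4) or (not e3)          # e4 => not e3
        if overall and trans1 and trans2:
            return True
    return False


print(eij_example_satisfiable())  # False
```

Without the two transitivity constraints, e1 ∧ e2 ∧ e3 alone would be satisfiable as a Boolean formula, which is why they must be added.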
Enforcing Transitivity Constraints
Graph Representation of Separation Constraints
  Directed multigraph where edges labeled by constants
  x ≥ y + c1 is drawn as an edge between x and y labeled c1
  [Figure: multigraph on nodes x, y, z with edges labeled c1, c2, c3, c4 and derived edges labeled c1 + c2, c1 + c4, c3 + c2, c3 + c4]
Fourier-Motzkin Elimination
  Eliminate nodes in succession
  Possibly exponential growth in edges
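One node-elimination step on the constraint graph can be sketched as follows. The edge format `(src, dst, c)`, read as src ≥ dst + c, and the function name `eliminate_node` are assumptions for this illustration; the slides do not fix an edge direction.

```python
def eliminate_node(edges, node):
    """Eliminate one node from a constraint multigraph.

    edges: set of (src, dst, c) triples, each read as src >= dst + c.
    Every pair (incoming edge, outgoing edge) through `node` produces a
    derived edge whose constant is the sum of the two constants.
    """
    incoming = [(s, c) for s, d, c in edges if d == node and s != node]
    outgoing = [(d, c) for s, d, c in edges if s == node and d != node]
    remaining = {(s, d, c) for s, d, c in edges if s != node and d != node}
    for s, c_in in incoming:
        for d, c_out in outgoing:
            remaining.add((s, d, c_in + c_out))
    return remaining


# Two parallel edges into y and two out of y, with sample constants
edges = {("x", "y", 1), ("x", "y", 3), ("y", "z", 2), ("y", "z", 4)}
print(sorted(eliminate_node(edges, "y")))
# [('x', 'z', 3), ('x', 'z', 5), ('x', 'z', 7)]
```

With symbolic constants the four sums c1 + c2, c1 + c4, c3 + c2, c3 + c4 stay distinct, so k incoming and k outgoing edges can produce k² derived edges per step; repeating this over many nodes is the source of the possible exponential growth.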
Introducing New Predicates
[Figure: multigraph on nodes x, y, z with edge x ≥ y + c1, edges labeled c1, c2, c3, c4, and derived edges labeled c1 + c2, c1 + c4, c3 + c2, c3 + c4]
Sample Predicates
  e1: x ≥ y + c1
  e2: y ≥ z + c2
  e3: x ≥ z + c1 + c2
  e4: x ≥ y + c2
Sample Transitivity Constraint
  e1 ∧ e2 ⇒ e3
Sample Ordering Constraint (for c1 < c2)
  e4 ⇒ e1
Comparing Eager Encoding Methods
Of SD and EIJ encoding methods, which one is better?
Comparison with respect to
  Size of resulting Boolean formula
  Performance of SAT solver
Size of Boolean Encoding: SD better than EIJ
Let N be size of original separation logic formula
  Size of a directed acyclic graph representation
SD encoding size is worst-case O(N^2)
EIJ encoding size is worst-case O(2^N)
  Can generate O(2^N) transitivity constraints
Example: N = 6813
  Method    Boolean Encoding Size
  EIJ       > 1000000
  SD        54465
Impact on SAT problem: SD vs EIJ
Experimentally compared zChaff performance on SD and EIJ encodings of several unsatisfiable formulas
Sample result:
  Method   # Boolean variables   # CNF Clauses   # Conflict Clauses   zChaff Time (sec)
  EIJ      57211                 169387          150                  0.56
  SD       23112                 67699           15811                21.63
EIJ better than SD for zChaff
Impact on SAT: Why is EIJ better than SD?
Conjecture: For SD, SAT solver has to “discover” transitivity constraints as conflict clauses
  Violation of transitivity constraint might be discovered only after assigning bits of several bit-vectors
EIJ adds all such constraints a priori
  Less learning and backtracking required by the SAT solver
Eager Encoding Tradeoffs
SD encoding
  + Polynomial size encoding
  − Worse for SAT solvers
EIJ encoding
  − Worst-case exponential size encoding
  + Better for SAT solvers
Can we automatically select between SD and EIJ based on the input formula?
Selection Strategy
[Seshia, Lahiri, Bryant, DAC ’03]
Estimate number of transitivity constraints, C
C > T?
  YES: Use SD encoding
  NO: Use EIJ encoding
Problem:
  Computationally hard to estimate number of transitivity constraints
Can we use a different metric?
  Idea: Identify feature of the input formula that varies monotonically with run-time of EIJ (but not with run-time of SD)
A Good Formula Feature: Number of
Separation Predicates
Revised Selection Strategy
Count number of separation predicates, m
m > T?
  YES: Use SD encoding
  NO: Use EIJ encoding
Easy to count number of separation predicates
Very approximate measure of # of transitivity constraints
  Constraints only relate predicates that share variables
Also need to automate setting of threshold T
  Statistically estimate from “training” set of benchmarks
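Identifying Variable Classes
[Figure: formula built from ∧ and ∨ nodes over the predicates u ≥ v, u = v-2, x ≥ y, y ≥ z, z ≥ x+1]
Predicates over {x,y,z} share variables
Predicates over {u,v} share variables
Assignments to {u,v} are independent of those to {x,y,z}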
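Variable classes of this kind can be computed with a union-find pass over the predicates. The sketch below is one possible way to do it, not necessarily how the authors' tool works; `variable_classes` and its input format (one set of variable names per predicate) are assumptions.

```python
def variable_classes(predicates):
    """Group variables that are linked through shared predicates.

    predicates: iterable of sets of variable names, one set per predicate.
    Returns the classes as a sorted list of sorted variable lists.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for vars_in_pred in predicates:
        vs = sorted(vars_in_pred)
        for v in vs:
            find(v)  # register every variable, even in unary predicates
        for other in vs[1:]:
            parent[find(other)] = find(vs[0])  # merge into one class

    classes = {}
    for v in list(parent):
        classes.setdefault(find(v), set()).add(v)
    return sorted(sorted(c) for c in classes.values())


# u >= v, x >= y, z >= x+1, u = v-2, y >= z
preds = [{"u", "v"}, {"x", "y"}, {"z", "x"}, {"u", "v"}, {"y", "z"}]
print(variable_classes(preds))  # [['u', 'v'], ['x', 'y', 'z']]
```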
Hybrid Encoding Technique
Separation Logic Formula
Compute
  1. Variable classes based on predicates
  2. Number of separation predicates for each class
Local decision per class
  {x,y,z}, m1: m1 > T? YES: SD, NO: EIJ
  {u,v}, mk: mk > T? YES: SD, NO: EIJ
Encode each class using SD or EIJ based on local decision
Encoded Boolean Formula
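The per-class decision can be sketched as below. The function name `choose_encodings`, the input formats, and the threshold value 2 are all hypothetical; in the slides T is estimated from training benchmarks.

```python
def choose_encodings(classes, predicates, threshold):
    """Pick SD or EIJ for each variable class.

    classes: list of frozensets of variable names.
    predicates: list of sets of variable names, one entry per separation
        predicate (each predicate's variables lie inside one class).
    threshold: the cutoff T; a class with more than T predicates uses SD.
    """
    decisions = {}
    for cls in classes:
        m = sum(1 for p in predicates if p <= cls)  # predicates in this class
        decisions[cls] = "SD" if m > threshold else "EIJ"
    return decisions


classes = [frozenset({"x", "y", "z"}), frozenset({"u", "v"})]
# x >= y, z >= x+1, y >= z, u >= v, u = v-2
preds = [{"x", "y"}, {"z", "x"}, {"y", "z"}, {"u", "v"}, {"u", "v"}]
result = choose_encodings(classes, preds, threshold=2)
print(result[frozenset({"x", "y", "z"})])  # SD  (3 predicates > 2)
print(result[frozenset({"u", "v"})])       # EIJ (2 predicates, not > 2)
```

Because assignments to different classes are independent, each class can be encoded with its own method and the results combined in one Boolean formula.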
Automatically Selecting a Threshold Value: Intuition
EIJ run time increases drastically beyond a certain number of separation predicates
Automatically Selecting a Threshold Value using Clustering
Cluster total time (Y-axis) values, minimizing variance of each cluster
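A minimal sketch of threshold selection by clustering, under several assumptions: run times are split into exactly two clusters, the split minimizes the summed squared deviation from each cluster mean, and T is taken as the largest predicate count in the low-time cluster. The function name `two_cluster_threshold` and the sample data are made up for illustration and are not measurements from the benchmarks.

```python
def two_cluster_threshold(samples):
    """Pick a threshold from (num_predicates, total_time) training samples.

    Sorts samples by run time, tries every split into a low and a high
    cluster, and keeps the split with the smallest total squared deviation
    from the cluster means. Needs at least two samples.
    """
    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    by_time = sorted(samples, key=lambda s: s[1])
    times = [t for _, t in by_time]
    best_split = min(range(1, len(times)),
                     key=lambda k: sse(times[:k]) + sse(times[k:]))
    # Largest predicate count whose EIJ run time fell in the low cluster
    return max(m for m, _ in by_time[:best_split])


# Made-up training data: (separation predicates, EIJ total time in seconds)
training = [(100, 0.5), (200, 0.8), (400, 1.2), (700, 900.0), (900, 1200.0)]
print(two_cluster_threshold(training))  # 400
```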
Experimental Evaluation Setup
Compared Hybrid against
  SD and EIJ encodings
  Cooperating Validity Checker (CVC) based on lazy encoding method [Stump et al. ’02]
  Stanford Validity Checker (SVC), non-SAT-based [Barrett et al. ’96]
  CVC & SVC can handle more expressive logics than SUF
Benchmarks
  49 unsatisfiable SUF formulas
    Load-store unit, out-of-order unit, device driver code, compiler validation, DLX pipeline
  Threshold value calculated from subset of 16 benchmarks
    Worked well for 39 out of the 49 benchmarks
Setup
  Used zChaff SAT solver
  Imposed timeout of 1800 sec. on total time (Encoding+SAT)
Hybrid vs. SD (39/49 benchmarks)
[Figure: plot with regions labeled "Hybrid better" and "SD better"]
Hybrid vs. EIJ (39/49 benchmarks)
[Figure: plot with regions labeled "Hybrid better" and "EIJ better"]
Hybrid vs. Lazy Encoding (CVC) (39/49 benchmarks)
[Figure: plot with regions labeled "Hybrid better" and "CVC better"]
Hybrid vs. Non-SAT-based Procedure (SVC) (39/49 benchmarks)
[Figure: plot with regions labeled "Hybrid better" and "SVC better"]
SD outperforms Hybrid on 10/49 benchmarks
[Figure: plot with regions labeled "Hybrid better" and "SD better"]
Conclusions & Ongoing Work
Hybrid combination of EIJ and SD encodings
  is robust to formula variations
  outperforms lazy encoding methods (CVC)
  outperforms non-SAT-based methods (SVC)
Ongoing & Future work
  Alternate estimators for number of transitivity constraints
  Threshold setting technique based on clustering applies to other CAD problems too
  Combination of lazy and eager encoding techniques might perform well on satisfiable formulas?
More on UCLID project webpage
http://www.cs.cmu.edu/~uclid