A Hybrid SAT-based Decision Procedure for Separation Logic with Uninterpreted Functions

Sanjit A. Seshia
Joint work with
Shuvendu K. Lahiri & Randal E. Bryant
Carnegie Mellon University, USA
June 2003
Decision Procedures in Formal Verification
[Figure: verification flow. RTL/source code + specification → (abstraction) → formal model + specification → (verification) → OK or Error. Verification poses a formula to a decision procedure for a decidable fragment of first-order logic, which answers satisfiable/unsatisfiable.]
Applications: out-of-order, pipelined microprocessors; cache coherence protocols; device drivers; compiler validation; …
–2–
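Data and Function Abstraction
Bit-vectors to (unbounded) integers
– The bits x0, x1, x2, …, xn-1 are abstracted to a single integer x
Functional units to uninterpreted functions
– A functional unit such as an ALU becomes a function symbol f
– Functional consistency: a = x ∧ b = y ⇒ f(a,b) = f(x,y)
Common operations
– If-then-else: ITE(p, x, y) selects x when p is 1 and y when p is 0
– Test for equality: x = y
– Test for ordering: x < y
– Counters: x + 1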
–3–
Separation Logic with Uninterpreted Functions (SUF)
Sufficiently expressive for the aforementioned applications
System property expressed as SUF formula F
– Efficiently decided via translation to SAT
Terms (T): integer expressions
– ITE(F, T1, T2): if-then-else
– Fun(T1, …, Tk): function application
– T+1: increment
– T-1: decrement
Formulas (F): Boolean expressions
– ¬F, F1 ∧ F2, F1 ∨ F2: Boolean connectives
– T1 = T2: equation
– T1 < T2: inequality
– Pred(T1, …, Tk): predicate application
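As a concrete reading of the grammar above, here is a minimal sketch of an evaluator for SUF terms and formulas under one given interpretation. The tuple representation, the tag names, and the `env` dictionary are illustrative assumptions, not part of the talk; integer variables are assumed to appear as plain names.

```python
# Hypothetical tuple encoding of the SUF grammar (an assumption for illustration):
#   terms:    "x" | ("ite", F, T1, T2) | ("fun", name, T1, ..., Tk)
#             | ("inc", T) | ("dec", T)
#   formulas: ("not", F) | ("and", F1, F2) | ("or", F1, F2)
#             | ("eq", T1, T2) | ("lt", T1, T2) | ("pred", name, T1, ..., Tk)

def eval_term(term, env):
    """Evaluate a term to an int; env maps names to ints or Python callables."""
    if isinstance(term, str):  # integer variable
        return env[term]
    tag = term[0]
    if tag == "ite":
        chosen = term[2] if eval_formula(term[1], env) else term[3]
        return eval_term(chosen, env)
    if tag == "fun":
        return env[term[1]](*[eval_term(a, env) for a in term[2:]])
    if tag == "inc":
        return eval_term(term[1], env) + 1
    if tag == "dec":
        return eval_term(term[1], env) - 1
    raise ValueError(f"unknown term tag: {tag!r}")

def eval_formula(formula, env):
    """Evaluate a formula to a bool under the same interpretation."""
    tag = formula[0]
    if tag == "not":
        return not eval_formula(formula[1], env)
    if tag == "and":
        return eval_formula(formula[1], env) and eval_formula(formula[2], env)
    if tag == "or":
        return eval_formula(formula[1], env) or eval_formula(formula[2], env)
    if tag == "eq":
        return eval_term(formula[1], env) == eval_term(formula[2], env)
    if tag == "lt":
        return eval_term(formula[1], env) < eval_term(formula[2], env)
    if tag == "pred":
        return bool(env[formula[1]](*[eval_term(a, env) for a in formula[2:]]))
    raise ValueError(f"unknown formula tag: {tag!r}")

# One sample interpretation: f doubles its argument.
env = {"x": 1, "y": 3, "f": lambda a: 2 * a}
holds = eval_formula(("lt", ("fun", "f", "x"), "y"), env)  # f(x) < y
```

A decision procedure has to consider every interpretation of `f`, not just one; the evaluator only shows what a single interpretation assigns to each syntactic form.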
–4–
SAT-based Decision Procedures
EAGER ENCODING
– Input formula → satisfiability-preserving Boolean encoder → Boolean formula → SAT solver → satisfiable/unsatisfiable
LAZY ENCODING
– Input formula → approximate Boolean encoder → Boolean formula → SAT solver → unsatisfiable, or a satisfying assignment
– Satisfying assignment → first-order conjunctions SAT checker → satisfiable, or unsatisfiable with an additional clause added to the Boolean formula
–5–
Talk Outline

SUF → Separation Logic → SAT
– Two eager encoding techniques
– Pros and cons of each technique

Combining eager encoding techniques
– The Hybrid eager encoding technique

Experimental results
– Superior performance to lazy encoding methods
and non-SAT-based decision procedures

Conclusions
–6–
SUF → Separation Logic
Eliminate function and predicate applications using fresh variables and ITE expressions [Bryant, German, Velev, CAV’99]
– f(x) → v1 and f(y) → ITE(x = y, v1, v2)
Terms (T): integer expressions
– ITE(F, T1, T2): if-then-else
– v: integer variable (replaces function application Fun(T1, …, Tk))
– T+1: increment
– T-1: decrement
Formulas (F): Boolean expressions
– ¬F, F1 ∧ F2, F1 ∨ F2: Boolean connectives
– T1 = T2: equation (separation predicate)
– T1 < T2: inequality (separation predicate)
– b: Boolean variable (replaces predicate application Pred(T1, …, Tk))
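The elimination step can be sketched for a single unary function symbol as follows. This is a minimal illustration of the nested-ITE scheme on the slide, not the UCLID implementation; the tuple representation and the names `eliminate_applications` and `value` are assumptions made for the example.

```python
def eliminate_applications(args, prefix="v"):
    """Replace successive applications f(a1), ..., f(an) by fresh variables.

    The i-th application becomes
      ITE(a1 = ai, v1, ITE(a2 = ai, v2, ... vi))
    so that equal arguments always yield the same value.
    """
    replacements = []
    for i, arg in enumerate(args):
        term = f"{prefix}{i + 1}"  # fresh variable for the i-th application
        # Wrap from the innermost default outward; earlier applications win.
        for j in range(i - 1, -1, -1):
            term = ("ite", ("eq", args[j], arg), f"{prefix}{j + 1}", term)
        replacements.append(term)
    return replacements

def value(term, env):
    """Evaluate a variable name or an ("ite", ("eq", a, b), t, e) term."""
    if isinstance(term, str):
        return env[term]
    _, (_, a, b), then_term, else_term = term
    return value(then_term if value(a, env) == value(b, env) else else_term, env)

fx, fy = eliminate_applications(["x", "y"])
# fx is "v1"; fy is ("ite", ("eq", "x", "y"), "v1", "v2"), matching the slide.
```

When x and y are equal, both replacement terms evaluate to v1, which is exactly the functional consistency property the original function symbol guaranteed.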
–7–
Eager Boolean Encoding Methods for Separation Logic
Separation logic formula → Small Domain encoding (SD) or Per-Constraint encoding (EIJ) → Boolean formula → SAT solver → satisfiable/unsatisfiable
–8–
Small Domain Encoding (SD)
[Bryant, Lahiri, Seshia, CAV’02]
x ≥ y ∧ y ≥ z ∧ z ≥ x+1
⟨0 x1 x0⟩ ≥ ⟨0 y1 y0⟩ ∧ ⟨0 y1 y0⟩ ≥ ⟨0 z1 z0⟩ ∧ ⟨0 z1 z0⟩ ≥ ⟨0 x1 x0⟩ + 1
Observation:
To check satisfiability, need to consider all possible relative orderings of finitely many expressions
[Figure: two example orderings of x, x+1, y, and z along an axis of increasing values]
Can use Boolean encoding of finite range of values
– 4 values in this case, so 2-bit encoding
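The 2-bit encoding of the example can be checked with a small sketch. Exhaustive enumeration of the six bits stands in for the SAT solver here, which is an assumption made to keep the example self-contained; the function name `sd_satisfying_bits` is hypothetical.

```python
from itertools import product

def sd_satisfying_bits(formula):
    """Search all 2-bit assignments to x, y, z (4 values each).

    Returns the bits (x1, x0, y1, y0, z1, z0) of a satisfying assignment,
    or None if the formula has no solution in the small domain.
    """
    for x1, x0, y1, y0, z1, z0 in product((0, 1), repeat=6):
        x, y, z = 2 * x1 + x0, 2 * y1 + y0, 2 * z1 + z0
        if formula(x, y, z):
            return (x1, x0, y1, y0, z1, z0)
    return None

# The slide's formula: x >= y and y >= z and z >= x + 1.
slide_formula = lambda x, y, z: x >= y and y >= z and z >= x + 1
result = sd_satisfying_bits(slide_formula)  # None: no ordering satisfies it
```

The search finds nothing because the three constraints chain into x ≥ x+1. Since the 4-value range covers every relative ordering of the expressions, failing here means the formula is unsatisfiable over the integers as well.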
–9–
Per-Constraint Encoding (EIJ)
[Strichman, Seshia, Bryant, CAV’02]
x ≥ y ∧ y ≥ z ∧ z ≥ x+1
Boolean variables for separation predicates
– e1: x ≥ y
– e2: y ≥ z
– e3: z ≥ x+1
– e4: x ≥ z (new separation predicate)
Transitivity constraints
– e1 ∧ e2 ⇒ e4
– e4 ⇒ ¬e3
Overall Boolean encoding
– (e1 ∧ e2 ∧ e3) ∧ (e1 ∧ e2 ⇒ e4) ∧ (e4 ⇒ ¬e3)
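The effect of the transitivity constraints on this example can be seen in a short sketch. A truth-table search stands in for the SAT solver, and the function names are hypothetical.

```python
from itertools import product

def eij_encoding(e1, e2, e3, e4):
    """Boolean encoding of x >= y, y >= z, z >= x+1 with transitivity constraints."""
    skeleton = e1 and e2 and e3            # the original conjunction
    trans1 = (not (e1 and e2)) or e4       # e1 and e2 imply e4 (x >= z)
    trans2 = (not e4) or (not e3)          # e4 implies not e3
    return skeleton and trans1 and trans2

def skeleton_only(e1, e2, e3, e4):
    """The same conjunction with the transitivity constraints dropped."""
    return e1 and e2 and e3

def satisfiable(boolean_formula, num_vars):
    """Brute-force satisfiability check over all assignments."""
    return any(boolean_formula(*bits)
               for bits in product((False, True), repeat=num_vars))

with_constraints = satisfiable(eij_encoding, 4)      # False
without_constraints = satisfiable(skeleton_only, 4)  # True
```

Without the transitivity constraints the Boolean skeleton is trivially satisfiable, so the constraints are what carry the arithmetic meaning of the predicates into the SAT problem.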
– 10 –
Comparing Eager Encoding Methods


Of SD and EIJ encoding methods, which one is
better?
Comparison with respect to
– Size of resulting Boolean formula
– Performance of SAT solver
– 11 –
Size of Boolean Encoding: SD better than EIJ
Let N be size of original separation logic formula
– Size of a directed acyclic graph representation
SD encoding size is worst-case O(N^2)
EIJ encoding size is worst-case O(2^N)
– Can generate O(2^N) transitivity constraints
Example: N = 6813
Method    Boolean Encoding Size
EIJ       > 1000000
SD        54465
– 12 –
Impact on SAT problem: SD vs EIJ
– Experimentally compared zChaff performance on SD and EIJ encodings of several unsatisfiable formulas
– Sample result:
Method    # Boolean variables    # CNF Clauses    # Conflict Clauses    zChaff Time (sec)
EIJ       57211                  169387           150                   0.56
SD        23112                  67699            15811                 21.63
EIJ better than SD for zChaff
– 13 –
Impact on SAT: Why is EIJ better
than SD?

Conjecture: For SD, SAT solver has to
“discover” transitivity constraints as conflict
clauses
– Violation of transitivity constraint might be discovered only
after assigning bits of several bit-vectors

EIJ adds all such constraints a priori
– Less learning and backtracking required by the SAT solver
– 14 –
Eager Encoding Tradeoffs

SD encoding
+ Polynomial size encoding
– Worse for SAT solvers

EIJ encoding
– Worst-case exponential size encoding
+ Better for SAT solvers

Can we automatically select between SD and EIJ
based on the input formula?
– 15 –
Selection Strategy
Estimate number of transitivity constraints, C
– C > T? YES: use SD encoding
– C > T? NO: use EIJ encoding
Problem:
– Computationally hard to estimate number of transitivity constraints
Can we use a different metric?
– Idea: Identify feature of the input formula that varies monotonically with run-time of EIJ (but not with run-time of SD)
– 16 –
A Good Formula Feature: Number of
Separation Predicates
– 17 –
A Good Formula Feature: Number of
Separation Predicates
– 18 –
Revised Selection Strategy
Count number of separation predicates, m
– m > T? YES: use SD encoding
– m > T? NO: use EIJ encoding
+ Easy to count number of separation predicates
– Very approximate measure of # of transitivity constraints
– Constraints only relate predicates that share variables
Also need to automate setting of threshold T
– Statistically estimate from “training” set of benchmarks
– 19 –
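Identifying Variable Classes
[Figure: formula built from ∧ and ∨ nodes over the separation predicates u ≥ v, u = v-2, x ≥ y, y ≥ z, and z ≥ x+1]
– x ≥ y, y ≥ z, z ≥ x+1: {x,y,z} shared
– u ≥ v, u = v-2: {u,v} shared
Assignments to {u,v} are independent of those to {x,y,z}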
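Grouping variables into classes amounts to computing connected components over the "appear in the same predicate" relation. The sketch below uses a union-find structure for this; the choice of union-find and the representation of each predicate as a pair of variable names are assumptions for illustration, not details given in the talk.

```python
def variable_classes(predicates):
    """Partition variables into classes of variables linked by predicates.

    predicates: iterable of (var_a, var_b) pairs, one per separation
    predicate; constants such as +1 or -2 do not affect the partition.
    Returns a sorted list of sorted variable lists.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in predicates:
        parent[find(a)] = find(b)  # merge the two classes

    classes = {}
    for v in list(parent):
        classes.setdefault(find(v), set()).add(v)
    return sorted(sorted(c) for c in classes.values())

# Predicates from the slide: u >= v, u = v-2, x >= y, y >= z, z >= x+1.
slide_predicates = [("u", "v"), ("u", "v"), ("x", "y"), ("y", "z"), ("z", "x")]
classes = variable_classes(slide_predicates)
# classes is [["u", "v"], ["x", "y", "z"]]
```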
– 20 –
Hybrid Encoding Technique
Separation logic formula
Compute
1. Variable classes based on predicates
2. Number of separation predicates for each class
– Class {x,y,z} with count m1: m1 > T? YES: SD; NO: EIJ
– …
– Class {u,v} with count mk: mk > T? YES: SD; NO: EIJ
Encode each class using SD or EIJ based on local decision
Encoded Boolean formula
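The per-class decision can be sketched as below. The function name `hybrid_plan`, the pair representation of predicates, and the threshold value 2 are all hypothetical and chosen only to make the example small.

```python
def hybrid_plan(predicates, classes, threshold):
    """Pick an encoding for each variable class.

    predicates: list of (var_a, var_b) pairs, one per separation predicate.
    classes: iterable of variable collections (the variable classes).
    Returns {frozenset(class): (predicate_count, "SD" or "EIJ")}.
    """
    plan = {}
    for cls in classes:
        members = frozenset(cls)
        # Count the separation predicates whose variables lie in this class.
        m = sum(1 for a, b in predicates if a in members and b in members)
        plan[members] = (m, "SD" if m > threshold else "EIJ")
    return plan

predicates = [("x", "y"), ("y", "z"), ("z", "x"), ("u", "v"), ("u", "v")]
plan = hybrid_plan(predicates, [{"x", "y", "z"}, {"u", "v"}], threshold=2)
# {x,y,z} has 3 predicates, above the threshold, so it gets SD;
# {u,v} has 2 predicates, so it gets EIJ.
```

Because assignments to different classes are independent, each class can be encoded with its own method and the resulting Boolean formulas combined.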
– 21 –
Automatically Selecting a Threshold
Value: Intuition
EIJ run time increases drastically beyond
a certain number of separation predicates
– 22 –
Automatically Selecting a Threshold
Value using Clustering
Cluster total time (Y-axis) values, minimizing variance of each cluster
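One way to read this step is as a two-cluster split of the one-dimensional run-time values. The sketch below tries every cut point of the sorted times and keeps the one with the smallest total within-cluster squared deviation. Taking T as the largest predicate count in the fast cluster is an assumption of this sketch, and the sample data is made up for illustration.

```python
def sum_squared_deviation(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def split_times(times):
    """Return (largest fast time, smallest slow time) for the best two-way split."""
    ordered = sorted(times)
    best_cost, best_k = None, None
    for k in range(1, len(ordered)):
        cost = sum_squared_deviation(ordered[:k]) + sum_squared_deviation(ordered[k:])
        if best_cost is None or cost < best_cost:
            best_cost, best_k = cost, k
    return ordered[best_k - 1], ordered[best_k]

def estimate_threshold(samples):
    """samples: list of (num_separation_predicates, total_time_sec) pairs.

    Assumed rule: T is the largest predicate count among fast-cluster runs.
    """
    fast_max, _ = split_times([t for _, t in samples])
    return max(m for m, t in samples if t <= fast_max)

# Made-up training data: (number of separation predicates, EIJ total time).
training = [(10, 0.1), (20, 0.2), (30, 0.4), (40, 100.0), (50, 120.0)]
threshold = estimate_threshold(training)  # 30 for this data
```

The sketch needs at least two samples and assumes that times on either side of the cut are distinct.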
– 23 –
Experimental Evaluation Setup

Compared Hybrid against
– SD and EIJ encodings
– Cooperating Validity Checker (CVC) based on lazy encoding
method [Stump et al.’02]
– Stanford Validity Checker (SVC), a non-SAT-based procedure [Barrett et al. ’96]
– CVC & SVC can handle more expressive logics than SUF

Benchmarks
– 49 unsatisfiable SUF formulas
– Load-store unit, out-of-order unit, device driver code,
compiler validation, DLX pipeline
– Threshold value calculated from subset of 16 benchmarks
– Worked well for 39 out of the 49 benchmarks

Setup
– Used zChaff SAT solver
– Imposed timeout of 1800 sec. on total time (Encoding+SAT)
– 24 –
Hybrid vs. SD (39/49 benchmarks)
Hybrid better
SD better
– 25 –
Hybrid vs. EIJ (39/49 benchmarks)
Hybrid better
EIJ better
– 26 –
Hybrid vs. Lazy Encoding (CVC)
(39/49 benchmarks)
Hybrid better
CVC better
– 27 –
Hybrid vs. Non-SAT-based Procedure
(SVC) (39/49 benchmarks)
Hybrid better
SVC better
– 28 –
SD outperforms Hybrid on 10/49
benchmarks
Hybrid better
SD better
– 29 –
Conclusions & Ongoing Work

Hybrid combination of EIJ and SD encodings
– is robust to formula variations
– outperforms lazy encoding methods (CVC)
– outperforms non-SAT-based methods (SVC)

Ongoing & Future work
– Alternate estimators for number of transitivity
constraints
– Threshold setting technique based on clustering
applies to other CAD problems too
– Combination of lazy and eager encoding
techniques might perform well on satisfiable
formulas?

More on UCLID project webpage
http://www.cs.cmu.edu/~uclid
– 30 –