SAT-Based Decision Procedures for Subsets of First-Order Logic
Part I: Equality with Uninterpreted Functions

Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant

Overall Outline

  Background
    SAT-based decision procedures
  Part I: Equality with Uninterpreted Functions
    Translating to propositional formula
    Exploiting positive equality and sparse transitivity
  Part II: Separation Logic
    Restricted form of addition
    Translating to propositional formula
    Hybrid encoding techniques

Decision Procedures in Formal Verification

  [Figure: RTL / source code + specification -> abstraction -> formal model + specification -> verification -> OK / Error. Verification uses a decision procedure for a decidable fragment of first-order logic.]

  Applications: out-of-order, pipelined microprocessors; cache coherence protocols; device drivers; compiler validation; …

SAT-based Decision Procedures

  Eager encoding
    Input formula -> satisfiability-preserving Boolean encoder -> Boolean formula -> SAT solver -> satisfiable / unsatisfiable
  Lazy encoding
    Input formula -> approximate Boolean encoder -> Boolean formula -> SAT solver
    If the SAT solver reports unsatisfiable: input formula is unsatisfiable
    If it reports satisfiable: satisfying assignment is passed to the first-order conjunctions SAT checker
      Checker reports satisfiable: input formula is satisfiable
      Checker reports unsatisfiable: additional clause is added to the Boolean formula

Lazy Encoding Characteristics

  [Figure: first-order conjunctions SAT checker built from a theory combiner over uninterpreted functions, linear arithmetic, bit vectors, …, theory N]

  + Can be extended to handle wide variety of theories
  + Clean & modular design
  - Does not scale well
      Number of calls to conjunction checker typically exponential in formula size
      Each call independent: nothing learned in one call can be exploited by another

Eager Encoding Characteristics

  [Figure: input formula -> satisfiability-preserving Boolean encoder -> Boolean formula -> SAT solver -> satisfiable / unsatisfiable]

  - Must encode all information about domain properties into Boolean formula
  - Some properties can give exponential blowup
  + Lets SAT solver do all of the work

  Good Approach for Some Domains
    Modern SAT solvers have remarkable capacity
    Good at extracting relevant portions out of
very large formulas
    Learns about formula properties as search proceeds
  Focus of this talk

Data and Function Abstraction

  Common Operations
    Bit-vectors (x0, x1, x2, …, xn-1) to (unbounded) integers (x)
    If-then-else: ITE(p, x, y) yields x when p = 1 and y when p = 0
    Test for equality: x = y
    Functional units (e.g., ALU) to uninterpreted functions f
      a = x ∧ b = y ⇒ f(a,b) = f(x,y)

Abstract Modeling of Microprocessor

  [Figure: pipelined datapath with IF/ID, ID/EX, and EX/WB stages (PC, instruction memory, register file, ALU, control), with data-transforming blocks replaced by functions F1, F2, F3]

  For any Block that Transforms or Evaluates Data:
    Replace with generic, unspecified function
    Also view instruction memory as function

EUF: Equality with Uninterp. Functs

  Decidable fragment of first-order logic

  Formulas (F): Boolean expressions
    ¬F, F1 ∧ F2, F1 ∨ F2     Boolean connectives
    T1 = T2                   Equation
    P(T1, …, Tk)              Predicate application
  Terms (T): Integer expressions
    ITE(F, T1, T2)            If-then-else
    Fun(T1, …, Tk)            Function application
  Functions (Fun): Integer -> Integer
    f                         Uninterpreted function symbol
    Read, Write               Memory operations
  Predicates (P): Integer -> Boolean
    p                         Uninterpreted predicate symbol

EUF Decision Problem

  Circuit Representation of Formula
    Truth values
      Dashed lines
      Model control
      Logical connectives, equations
    Integer values
      Solid lines
      Model data
      Uninterpreted functions, if-then-else operation

  [Figure: example circuit over x0 and d0 with two applications of f, multiplexors selected by e0 and e1, and equality comparators]

  Task
    Determine whether formula F is universally valid
      True for all interpretations of variables and function symbols
    Often expressed as (un)satisfiability problem
      » Prove that formula ¬F is not satisfiable

Finite Model Property for EUF

  [Figure: the example circuit, with its distinct terms x0, d0, f(x0), f(d0)]

  Observation
    Any formula has limited number of distinct expressions
    Only property that matters is whether or not different terms are equal

Boolean Encoding of Integer Values

    Expression   Possible Values   Bit Encoding
    x0           {0}               0   0
    d0           {0,1}             0   b10
    f(x0)        {0,1,2}           b21 b20
    f(d0)        {0,1,2,3}         b31 b30

  For Each Expression
    Either equal to or distinct from each preceding
expression

  Boolean Encoding
    Use Boolean values to encode integers over small range
    EUF formula can be translated into propositional logic
      Logic circuit with multiplexors, comparators, logic gates
      Tautology iff original formula valid

Some History of EUF Decision Procedures

  Ackermann, 1954
    Quantifier-free decision problem can be decided based on finite instantiations
  Burch & Dill, CAV '94
    Automatic decision procedure
      » Davis-Putnam enumeration
      » Congruence closure to enforce functional consistency
  Boolean approaches
    Goel, et al., CAV '98
      » Attempted with BDDs, but didn't get good results
    Bryant, German, Velev, CAV '99
      » Could verify microprocessor using BDDs
    Velev & Bryant, DAC 2001
      » Demonstrated power of modern SAT procedures

Exploiting Positive Equality

  Bryant, German, Velev, CAV '99
    First successful use of Boolean methods for EUF
  Positive Equality
    Equations that appear in unnegated form
  Exploiting
    Can greatly reduce number of cases required to show validity
      Only need to consider maximally diverse interpretations
    Reduce number of Boolean variables in bit-level encoding

Diverse Interpretations: Illustration

  Task
    Verify someone's obscure code for 4x4 array transpose

    void trans(int a[4][4]) {
        int t;
        for (t = 4; t < 15; t++)
            if (~t&2 || t&8 && ~t&1) {
                int r = t&0x3;
                int c = t>>2;
                int val = a[r][c];    /* only operations on array elements */
                a[r][c] = a[c][r];
                a[c][r] = val;
            }
    }

  Observation
    Array elements altered only by copying one to another
    Just need to make sure right set of copies performed

Verifying Array Code

  Test for trans4

    src                        dest
     0  1  2  3                 0  4  8 12
     4  5  6  7    trans4       1  5  9 13
     8  9 10 11    ------>      2  6 10 14
    12 13 14 15                 3  7 11 15

  Single Test Adequate
    Unique value for each possible source element
      "Maximally diverse"
    If dest[r][c] = src[c][r], then must have copied proper value

Characteristics of Array Verification

  Correctness Condition
    src[0][0] = dest[0][0]
    src[0][1] = dest[1][0]
    src[0][2] = dest[2][0]
    …
    src[3][2] = dest[2][3]
    src[3][3] = dest[3][3]
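The single-test argument can be tried out directly. The following is a minimal C sketch, assuming the in-place routine from the illustration: trans is that routine with a bounds assertion added, while no_op and passes_diverse_test are hypothetical names introduced here and do not come from the slides. The checker fills the array with 16 distinct values (one maximally diverse input) and tests the correctness condition dest[r][c] = src[c][r].

```c
#include <assert.h>
#include <stdbool.h>

/* Transpose routine from the slide; the assert is an added bounds check. */
void trans(int a[4][4]) {
    for (int t = 4; t < 15; t++) {
        if (~t & 2 || (t & 8 && ~t & 1)) {
            int r = t & 0x3;
            int c = t >> 2;
            assert(r < 4 && c < 4);
            int val = a[r][c];      /* only operations on array elements */
            a[r][c] = a[c][r];
            a[c][r] = val;
        }
    }
}

/* Hypothetical faulty "transpose" that copies nothing, for contrast. */
void no_op(int a[4][4]) { (void)a; }

/* Hypothetical checker: run f on the maximally diverse input (every
   element distinct) and test dest[r][c] == src[c][r] for all r, c. */
bool passes_diverse_test(void (*f)(int a[4][4])) {
    int src[4][4], dest[4][4];
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            src[r][c] = dest[r][c] = 4 * r + c;  /* unique value per element */
    f(dest);                                      /* transform in place */
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            if (dest[r][c] != src[c][r])
                return false;
    return true;
}
```

Calling passes_diverse_test(trans) should return true, and passes_diverse_test(no_op) should return false. Because the routine only copies elements, one input with all-distinct values is enough to expose any wrong copy.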
  Properties
    All equations are in positive form
    Worst case test is one that tends to make things unequal
      I.e., maximally diverse interpretation
    All maximally diverse interpretations isomorphic
      Only need to try one to prove all handled correctly

Equations in Processor Verification

  [Figure: pipelined datapath with IF/ID, ID/EX, and EX/WB stages (PC, instruction memory, register file, ALU, control)]

    Data Types            Equations
    Register IDs          Control stalling & forwarding
    Instruction address   Only top-level verification condition
    Program data          Only top-level verification condition

Exploiting Equation Structure

  Positive Equations
    In top-level verification condition
    Can use maximally diverse interpretation
  Negative Equations
    Pipeline control logic
    Between register IDs
    Operation depends on whether or not two IDs are equal
    Must use general encoding
      Encode with Boolean variables
      All possibilities of IDs that match and/or don't match

Application of Positive Equality

  [Figure: the example circuit evaluated under a single diverse interpretation of the terms x0, d0, f(x0), f(d0)]

  Observation
    All equations are positive in this formula
    Can consider single, diverse interpretation for terms

Function Elimination: Ackermann's Method

  Replace All Function Applications by Integer Variables
    Introduce new domain variable
    Enforce functional consistency by global constraints
      x1 = x2 ⇒ vf1 = vf2
  Unclear how to restrict evaluation to diverse interpretations

Function Elimination: ITE Method

  General Technique
    Introduce new domain variable
    Nested ITE structure maintains functional consistency

  [Figure: applications f(x1), f(x2), f(x3) replaced by vf1, ITE(x2 = x1, vf1, vf2), and ITE(x3 = x1, vf1, ITE(x3 = x2, vf2, vf3))]

Generating Diverse Encoding

  Replacing Application
    Use fixed values rather than variables
    Application results equal iff arguments equal

  [Figure: the same nested ITE structure with constants 5, 6, 7 in place of vf1, vf2, vf3]

Benefits of Positive Equality

  Microprocessor Benchmarks
    1xDLX: Single issue, RISC processor
    2xDLX-EX-BP: Dual issue processor with exception handling & branch prediction
    9VLIW-BP: 9-way VLIW processor with branch prediction

  Measurements
    Using BerkMin SAT solver

    Benchmark      Version   Using Pos. Eq.   No Pos. Eq.
    1xDLX          buggy     0.02             2
                   good      0.07             229
    2xDLX-EX-BP    buggy     4                15
                   good      15               > 24hrs
    9VLIW-BP       buggy     10               > 24hrs
                   good      224              > 24hrs

Benefits of Positive Equality

  Microprocessor Benchmarks
    Velev & Bryant, JSC '02
    1xDLX: Single issue, RISC processor
    2xDLX-EX-BP: Dual issue processor with exception handling & branch prediction
    9VLIW-BP: 9-way VLIW processor with branch prediction

  Measurements
    Using BerkMin SAT solver

    Benchmark      Version   Using Pos. Eq.   No Pos. Eq.
    1xDLX          good      0.02             2
                   buggy     0.07             229
    2xDLX-EX-BP    good      4                15
                   buggy     15               > 24hrs
    9VLIW-BP       good      10               > 24hrs
                   buggy     224              > 24hrs

Revisiting Encoding Techniques

  x = y ∧ y = z ∧ z ≠ x    Satisfiable?

  Small Domain (SD)
    x1x0 = y1y0 ∧ y1y0 = z1z0 ∧ z1z0 ≠ x1x0
    Use bit-level encodings of bounded integers
    Implicitly encode properties of equality logic

  Per-Constraint Encoding (EIJ)
    exy ∧ eyz ∧ ¬exz
    Transitivity constraints
      exy ∧ eyz ⇒ exz
      exy ∧ exz ⇒ eyz
      exz ∧ eyz ⇒ exy
    Introduce explicit Boolean variable for each equation
    Additional transitivity constraints to express properties of equality logic

Per-Constraint Encoding

  Introduced by Goel et al., CAV '98
  Exploiting sparse structure by Bryant & Velev, CAV 2000

  Procedure
    Initial formula F
      Want to prove valid
      Prove that ¬F is not satisfiable
    Replace each equation x = y by Boolean variable exy
      Gives formula Fsat
    Generate formula expressing transitivity constraints
      Gives formula Ftrans
    Use SAT solver to show that Fsat ∧ Ftrans not satisfiable

  Motivation
    Provides SAT solver with more direct representation of underlying problem

Graph Interpretation of Transitivity

  Transitivity Violation
    Cycle in graph
    Exactly one edge has ei,j = false

  [Figure: cycle of equality edges in which a single edge is false]

Exploiting Chords

  Chord
    Edge connecting two nonadjacent vertices in cycle
  Property
    Sufficient to enforce transitivity constraints for all chord-free cycles
    If transitivity holds for all chord-free cycles, then holds for arbitrary cycles

Enumerating Chord-Free
Cycles

  Strategy
    Enumerate chord-free cycles in graph
    Each cycle of length k yields k transitivity constraints
  Problem
    Potentially exponential number of chord-free cycles

  [Figure: graph built from k units numbered 1, 2, …, k, with 2^k + k chord-free cycles]

Adding Chords

  Strategy
    Add edges to graph to reduce number of chord-free cycles

  [Figure: the same graph before and after adding chords: 2^k + k chord-free cycles reduced to 2k + 1 chord-free cycles]

  Trade-Off
    Reduces formula size
    Increases number of relational variables

Chordal Graph

  Definition
    Every cycle of length > 3 has a chord
  Goal
    Add minimum number of edges to make graph chordal
  Relation to Sparse Gaussian Elimination
    Choose pivot ordering that minimizes fill-in
    NP-hard
    Simple heuristics effective

1xDLX-C Equation Structure

  Vertices
    For each vi
    13 different register identifiers
  Edges
    For each equation
    Control stalling and forwarding logic
    27 relational variables
      Out of 78 possible

Adding Chordal Edges to 1xDLX-C

    Graph       Relational variables   Cycles   Clauses
    Original    27                     286      858
    Augmented   33                     40       120

2xDLX-CCt Equation Structure

  Equations
    Between 25 different register identifiers
    143 relational variables
      Out of 300 possible

Adding Chordal Edges to 2xDLX-CCt

    Graph       Relational variables   Cycles   Clauses
    Original    143                    2,136    8,364
    Augmented   193                    858      2,574

Choosing Encoding Method

  Comparison
    Formula length n with m integer variables & function applications
    Worst-case complexity

                         Small Domain         Per-Constraint
    Boolean variables    O(m log m)           O(m^2)
    Formula size         O(n + m^2 log m)     O(n + m^3)

  Per-Constraint Encoding Works Well in Practice
    Generates slightly larger formulas than small domain
    Better performance by SAT solver

Encoding Comparison

  Benchmarks
    Velev & Bryant, JSC '02
    Superscalar, out-of-order datapath
    2–6 instructions issued in parallel

  Measurements
    Using BerkMin SAT solver

                  Per-Constraint               Small Domain
    Issue Width   Vars    Clauses   Time       Vars   Clauses   Time
    2             139     8,213     1.6        81     1,294     1.7
    3             308     33,270    15         127    3,780     19
    4             553     96,480    65         194    8,362     99
    5             857     240,892   154        249    15,647    255
    6             1,243   528,962   1,957      304    26,738    3,206
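
The role of the transitivity constraints in the per-constraint encoding can be checked on the three-variable example x = y ∧ y = z ∧ z ≠ x. The following is a minimal C sketch under stated assumptions: it enumerates all eight assignments to exy, eyz, exz by brute force instead of calling a SAT solver, and the function names (implies, f_sat, f_trans, count_models) are hypothetical, introduced only for this illustration.

```c
#include <stdbool.h>

/* Logical implication a => b. */
static bool implies(bool a, bool b) { return !a || b; }

/* Fsat for  x = y  AND  y = z  AND  z != x,
   with each equation replaced by a Boolean variable. */
static bool f_sat(bool exy, bool eyz, bool exz) {
    return exy && eyz && !exz;
}

/* Ftrans: the three transitivity constraints for the triangle x, y, z. */
static bool f_trans(bool exy, bool eyz, bool exz) {
    return implies(exy && eyz, exz)
        && implies(exy && exz, eyz)
        && implies(exz && eyz, exy);
}

/* Count assignments satisfying Fsat, optionally conjoined with Ftrans.
   Brute-force enumeration stands in for the SAT solver here. */
int count_models(bool with_transitivity) {
    int n = 0;
    for (int bits = 0; bits < 8; bits++) {
        bool exy = (bits & 1) != 0;
        bool eyz = (bits & 2) != 0;
        bool exz = (bits & 4) != 0;
        if (f_sat(exy, eyz, exz) &&
            (!with_transitivity || f_trans(exy, eyz, exz)))
            n++;
    }
    return n;
}
```

Without the constraints, count_models(false) finds one spurious model (exy and eyz true, exz false). With them, count_models(true) finds none, which matches the fact that the original formula is unsatisfiable.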