My slides (PPT)

Introduction to Satisfiability Modulo Theories (SMT)
Clark Barrett, NYU
Sanjit A. Seshia, UC Berkeley
ICCAD Tutorial, November 2, 2009

Boolean Satisfiability (SAT)
[Figure: Boolean circuit over inputs $p_1, p_2, \dots, p_n$, built from $\lor$, $\land$, and $\neg$ gates, with output $\phi$]
Is there an assignment to the variables $p_1, p_2, \dots, p_n$ such that $\phi$ evaluates to 1?

C. Barrett & S. A. Seshia ICCAD 2009 Tutorial 2

Satisfiability Modulo Theories
[Figure: the same circuit, with each input $p_i$ now a theory atom such as $x = y$, $x + 2z \geq 1$, $w \,\&\, \mathtt{0xFFFF} = x$, or $x \,\%\, 26 = v$]
Is there an assignment to the variables $x, y, z, w$ such that $\phi$ evaluates to 1?

Satisfiability Modulo Theories
• Given a formula in first-order logic, with associated background theories, is the formula satisfiable?
  – Yes: return a satisfying solution
  – No [generate a proof of unsatisfiability]

Applications of SMT
• Hardware verification at higher levels of abstraction (RTL and above)
• Verification of analog/mixed-signal circuits
• Verification of hybrid systems
• Software model checking
• Software testing
• Security: Finding vulnerabilities, verifying electronic voting machines, …
• Program synthesis
• …

References
• Satisfiability Modulo Theories. Clark Barrett, Roberto Sebastiani, Sanjit A. Seshia, and Cesare Tinelli. Chapter 8 in the Handbook of Satisfiability, Armin Biere, Hans van Maaren, and Toby Walsh, editors, IOS Press, 2009. (available from our webpages)
• SMTLIB: A repository for SMT formulas (common format) and tools
• SMTCOMP: An annual competition of SMT solvers

Roadmap for this Tutorial
• Background and Notation
• Survey of Theories
• Theory Solvers
• Approaches to SMT Solving
  – Lazy Encoding to SAT
  – Eager Encoding to SAT
• Conclusion

Roadmap for this Tutorial (current section: Background and Notation)
First-Order Logic
• A formal notation for mathematics, with expressions involving
  – Propositional symbols
  – Predicates
  – Functions and constant symbols
  – Quantifiers
• In contrast, propositional (Boolean) logic only involves propositional symbols and operators

First-Order Logic: Syntax
• As with propositional logic, expressions in first-order logic are made up of sequences of symbols.
• Symbols are divided into logical symbols and non-logical symbols or parameters.
• Example: $(x = y) \land (y = z) \land (f(z) \geq f(x) + 1)$

First-Order Logic: Syntax
• Logical Symbols
  – Propositional connectives: $\lor$, $\land$, $\neg$, $\to$, $\leftrightarrow$
  – Variables: $v_1, v_2, \dots$
  – Quantifiers: $\forall$, $\exists$
• Non-logical symbols/Parameters
  – Equality: $=$
  – Functions: $+$, $-$, $\%$, bit-wise $\&$, $f()$, concat, …
  – Predicates: $\leq$, is_substring, …
  – Constant symbols: 0, 1.0, null, …

Quantifier-free Subset
• We will largely restrict ourselves to formulas without quantifiers ($\forall$, $\exists$)
• This is called the quantifier-free subset/fragment of first-order logic with the relevant theory

Logical Theory
• Defines a set of parameters (non-logical symbols) and their meanings
• This definition is called a signature.
• Example of a signature: Theory of linear arithmetic over integers
  – Signature is $(0, 1, +, -, \leq)$ interpreted over $\mathbb{Z}$

Roadmap for this Tutorial (current section: Survey of Theories)
Some Useful Theories
• Equality (with uninterpreted functions)
• Linear arithmetic (over $\mathbb{Q}$ or $\mathbb{Z}$)
• Difference logic (over $\mathbb{Q}$ or $\mathbb{Z}$)
• Finite-precision bit-vectors
  – integer or floating-point
• Arrays / memories
• Misc.: Non-linear arithmetic, strings, inductive datatypes (e.g. lists), sets, …

Theory of Equality and Uninterpreted Functions (EUF)
• Also called the “free theory”
  – Because function symbols can take any meaning
  – Only property required is congruence: that these symbols map identical arguments to identical values, i.e., $x = y \Rightarrow f(x) = f(y)$
• SMTLIB name: QF_UF

Data and Function Abstraction with EUF
[Figure, three panels:
 Bit-vectors to Abstract Domain (e.g. $\mathbb{Z}$): bits $x_0, x_1, x_2, \dots, x_{n-1}$ become a single term $x$;
 Common Operations: If-then-else $\mathrm{ITE}(p, x, y)$ and Test for equality $x = y$;
 Functional units to Uninterpreted Functions: an ALU becomes a function $f$]
$a = x \land b = y \Rightarrow f(a,b) = f(x,y)$

Hardware Abstraction with EUF
[Figure: pipelined processor datapath (IF/ID, ID/EX, EX/WB stages; PC, instruction memory, register file, ALU, +4 incrementer) with functional blocks replaced by uninterpreted functions]
• For any block that transforms or evaluates data:
  – Replace with generic, unspecified function
  – Also view instruction memory as function

Example QF_UF (EUF) Formula
$(x = y) \land (y = z) \land (f(x) \neq f(z))$
Transitivity: $(x = y) \land (y = z) \Rightarrow (x = z)$
Congruence: $(x = z) \Rightarrow (f(x) = f(z))$

Equivalence Checking of Program Fragments
    int fun1(int y) {
        int x, z;
        z = y;
        y = x;
        x = z;
        return x*x;
    }

    int fun2(int y) {
        return y*y;
    }
SMT formula $\phi$, satisfiable iff the programs are non-equivalent:
$(z = y \land y_1 = x \land x_1 = z \land \mathit{ret}_1 = x_1 * x_1) \land (\mathit{ret}_2 = y * y) \land (\mathit{ret}_1 \neq \mathit{ret}_2)$
What if we use SAT to check equivalence?
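The transitivity and congruence steps in the example QF_UF formula above can be mechanized with a union-find structure. The following is a minimal sketch under stated assumptions, not the procedure used by any particular solver: terms are plain strings, and the helper names (`UnionFind`, `euf_satisfiable`, `apps`) are hypothetical and chosen only for illustration.

```python
# Minimal congruence-closure sketch for conjunctions of equalities and
# disequalities over variables and unary uninterpreted functions.
# All names here are illustrative assumptions, not a solver API.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, t):
        self.parent.setdefault(t, t)
        while self.parent[t] != t:
            self.parent[t] = self.parent[self.parent[t]]  # path halving
            t = self.parent[t]
        return t

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def euf_satisfiable(equalities, disequalities, applications):
    """`applications` maps an application term such as 'f(x)' to a
    (function_name, argument_term) pair."""
    uf = UnionFind()
    for a, b in equalities:
        uf.union(a, b)  # equivalence classes give reflexivity, symmetry, transitivity
    changed = True
    while changed:  # congruence: equal arguments force equal applications
        changed = False
        apps_list = list(applications.items())
        for i, (t1, (f1, a1)) in enumerate(apps_list):
            for t2, (f2, a2) in apps_list[i + 1:]:
                same_args = f1 == f2 and uf.find(a1) == uf.find(a2)
                if same_args and uf.find(t1) != uf.find(t2):
                    uf.union(t1, t2)
                    changed = True
    # Satisfiable iff no disequality joins two terms in the same class.
    return all(uf.find(a) != uf.find(b) for a, b in disequalities)


apps = {"f(x)": ("f", "x"), "f(z)": ("f", "z")}
# (x = y) AND (y = z) AND (f(x) != f(z)): the slide's example
unsat_case = euf_satisfiable([("x", "y"), ("y", "z")], [("f(x)", "f(z)")], apps)
# Dropping y = z leaves f(x) and f(z) free to differ
sat_case = euf_satisfiable([("x", "y")], [("f(x)", "f(z)")], apps)
print(unsat_case, sat_case)  # False True
```

Dropping the equality $y = z$ makes the formula satisfiable, which shows that both the transitivity step and the congruence step are needed for the conflict.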
Equivalence Checking of Program Fragments
Programs fun1 and fun2 and SMT formula $\phi$ as on the previous slide.
Using SAT to check equivalence (w/ Minisat):
  32 bits for y: Did not finish in over 5 hours
  16 bits for y: 37 sec.
   8 bits for y: 0.5 sec.

Equivalence Checking of Program Fragments
Programs fun1 and fun2 as on the previous slide.
SMT formula $\phi'$:
$(z = y \land y_1 = x \land x_1 = z \land \mathit{ret}_1 = \mathit{sq}(x_1)) \land (\mathit{ret}_2 = \mathit{sq}(y)) \land (\mathit{ret}_1 \neq \mathit{ret}_2)$
Using EUF solver: 0.01 sec

Equivalence Checking of Program Fragments
    int fun1(int y) {
        int x;
        x = x ^ y;
        y = x ^ y;
        x = x ^ y;
        return x*x;
    }

    int fun2(int y) {
        return y*y;
    }
Does EUF still work? No! Must reason about bit-wise XOR. Need a solver for bit-vector arithmetic. Solvable in less than a sec. with a current bit-vector solver.

Finite-Precision Bit-Vector Arithmetic (QF_BV)
  – Fixed width data words
    • Can model int, short, long, etc.
  – Arithmetic operations
    • E.g., add/subtract/multiply/divide & comparisons
    • Two’s complement and unsigned operations
  – Bit-wise logical operations
    • E.g., and/or/xor, shift/extract and equality
  – Boolean connectives

Linear Arithmetic (QF_LRA, QF_LIA)
• Boolean combination of linear constraints of the form $a_1 x_1 + a_2 x_2 + \dots + a_n x_n \sim b$
• The $x_i$ could be in $\mathbb{Q}$ or $\mathbb{Z}$, and $\sim\ \in \{\geq, >, \leq, <, =\}$
• Many applications, including:
  – Verification of analog circuits
  – Software verification, e.g., of array bounds
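Returning to the XOR-swap version of fun1 above: why bit-level reasoning is needed can be seen with a small exhaustive check. This is only a sketch under stated assumptions. The word width is reduced to 8 bits, arithmetic is assumed to wrap around (C leaves signed overflow undefined), and the uninitialized local `x` is modeled as an extra input that may take any value. It enumerates inputs rather than calling a bit-vector solver.

```python
# Exhaustive check, at a reduced bit width, that the XOR-swap version of
# fun1 agrees with fun2. WIDTH = 8 is an assumption made for tractability.

WIDTH = 8
MASK = (1 << WIDTH) - 1


def fun1(x, y):
    # x stands for the uninitialized local of the C code: any value is possible.
    x = x ^ y
    y = x ^ y
    x = x ^ y
    return (x * x) & MASK  # wraparound multiplication at WIDTH bits


def fun2(y):
    return (y * y) & MASK


def find_counterexample():
    """Return an (x, y) pair on which the fragments differ, or None."""
    for x in range(MASK + 1):
        for y in range(MASK + 1):
            if fun1(x, y) != fun2(y):
                return (x, y)
    return None


counterexample = find_counterexample()
print(counterexample)  # None: no input distinguishes the two fragments
```

Enumeration grows as $2^{2 \cdot \mathrm{WIDTH}}$, so it is not a practical substitute for a bit-vector solver at 32 bits.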
Difference Logic (QF_IDL, QF_RDL)
• Boolean combination of linear constraints of the form $x_i - x_j \sim c_{ij}$ or $x_i \sim c_i$, with $\sim\ \in \{\geq, >, \leq, <, =\}$ and the $x_i$ in $\mathbb{Q}$ or $\mathbb{Z}$
• Applications:
  – Software verification (most linear constraints are of this form)
  – Processor datapath verification
  – Job shop scheduling / real-time systems
  – Timing verification for circuits

Arrays/Memories
• SMT solvers can also be very effective in modeling data structures in software and hardware
  – Arrays in programs
  – Memories in hardware designs: e.g. instruction and data memories, CAMs, etc.

Theory of Arrays (QF_AX): Select and Store
• Two interpreted functions: select and store
  – select(A,i): Read from A at index i
  – store(A,i,d): Write d to A at index i
• Two main axioms:
  – $\mathrm{select}(\mathrm{store}(A,i,d), i) = d$
  – $\mathrm{select}(\mathrm{store}(A,i,d), j) = \mathrm{select}(A,j)$ for $i \neq j$
• One other axiom:
  – $(\forall i.\ \mathrm{select}(A,i) = \mathrm{select}(B,i)) \Rightarrow A = B$

Equivalence Checking of Program Fragments
    int fun1(int y) {
        int x[2];
        x[0] = y;
        y = x[1];
        x[1] = x[0];
        return x[1]*x[1];
    }

    int fun2(int y) {
        return y*y;
    }
SMT formula $\phi''$:
$[\, x_1 = \mathrm{store}(x,0,y) \land y_1 = \mathrm{select}(x_1,1) \land x_2 = \mathrm{store}(x_1,1,\mathrm{select}(x_1,0)) \land \mathit{ret}_1 = \mathit{sq}(\mathrm{select}(x_2,1)) \,] \land (\mathit{ret}_2 = \mathit{sq}(y)) \land (\mathit{ret}_1 \neq \mathit{ret}_2)$

Roadmap for this Tutorial (current section: Theory Solvers)

Over to Clark…

Roadmap for this Tutorial (current section: Approaches to SMT Solving, Eager Encoding to SAT)
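The select/store axioms from the Theory of Arrays slide can be exercised on a concrete model. The sketch below is illustrative only. It assumes arrays are represented as immutable Python tuples, which is one convenient model and not how an SMT solver represents them, and it uses squaring as a stand-in for the uninterpreted function sq. The function names are hypothetical.

```python
# Functional model of the array theory: store returns a new array value
# instead of mutating its argument. The tuple representation is an assumption.

def select(a, i):
    return a[i]


def store(a, i, d):
    return a[:i] + (d,) + a[i + 1:]


def check_axioms(a, values):
    """Check both read-over-write axioms on array `a` for every index and value."""
    n = len(a)
    for i in range(n):
        for d in values:
            updated = store(a, i, d)
            assert select(updated, i) == d  # select(store(A,i,d), i) = d
            for j in range(n):
                if i != j:
                    # select(store(A,i,d), j) = select(A,j) for i != j
                    assert select(updated, j) == select(a, j)
    return True


def fun1_encoded(x, y):
    # Mirrors the store/select steps of the SMT encoding of the array fun1.
    x1 = store(x, 0, y)
    x2 = store(x1, 1, select(x1, 0))
    return select(x2, 1) ** 2  # squaring stands in for sq


axioms_hold = check_axioms((7, 9), values=range(4))
equivalent = all(
    fun1_encoded((a0, a1), y) == y * y
    for a0 in range(4) for a1 in range(4) for y in range(-3, 4)
)
print(axioms_hold, equivalent)  # True True
```

The equivalence check passes for every initial array content tried, which matches the intuition that the value read back from index 1 is always the value written at index 0.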
Eager Approach to SMT
[Figure: Input Formula → Satisfiability-preserving Boolean Encoder (EAGER ENCODING) → Boolean Formula → SAT Solver → satisfiable / unsatisfiable]
SAT Solver involved in Theory Reasoning
Key Ideas:
• Small-domain encoding
  – Constrain model search
• Rewrite rules
• Abstraction-based methods (eager + lazy)
Example Solvers: UCLID, STP, Spear, Boolector, Beaver, …

Theories
• Eager Encoding Methods have been demonstrated for the following Theories:
  – Equality & Uninterpreted Functions
  – Integer Linear Arithmetic
  – Restricted Lambda expressions
    • Arrays, memories, etc.
  – Finite-precision Bit-Vector Arithmetic
  – Strings

UCLID Operation
[Figure: Input Formula → Lambda Expansion for Arrays → $\lambda$-free Formula → Function & Predicate Elimination → Linear/Bitvector Arithmetic Formula → Encoding Arithmetic → Boolean Formula → Boolean Satisfiability]
• Operation
  – Series of transformations leading to Boolean formula
  – Each step is validity (satisfiability) preserving
  – Each step performs optimizations
http://uclid.eecs.berkeley.edu

Rewrites: Eliminating Function Applications
  – Two applications of an uninterpreted function $f$ in a formula: $f(x_1)$ and $f(x_2)$
• Ackermann’s Encoding: $f(x_1) \mapsto vf_1$, $f(x_2) \mapsto vf_2$, with the added constraint $x_1 = x_2 \Rightarrow vf_1 = vf_2$
• Bryant, German, Velev’s Encoding: $f(x_1) \mapsto vf_1$, $f(x_2) \mapsto \mathrm{ITE}(x_1 = x_2, vf_1, vf_2)$

Small-Domain Encoding
• Consider an SMT formula $\phi(x_1, x_2, \dots, x_n)$ where $x_i \in D_i$
• Small-domain encoding/Finite instantiation: Derive finite set $S_i \subset D_i$ s.t. $|S_i| \ll |D_i|$
  – In some cases, $S_i$ is finite where $D_i$ is infinite
• Encode each $x_i$ to take values only in $S_i$
  – Could be done by encoding to SAT
• Example: Integer Linear Arithmetic (QF_LIA)
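The two function-elimination encodings above, combined with a small finite domain, can be sketched as follows. This is an illustrative enumeration, not UCLID's implementation. The function names are hypothetical, and the domain size is an assumption: the sketch relies on the fact that, for formulas built only from equalities, a domain with as many values as there are variables is enough.

```python
from itertools import product

# After elimination the variables are x1, x2 plus fresh vf1, vf2 standing
# for f(x1), f(x2). Four variables, so four domain values (an assumption
# that is adequate for equality-only formulas).
DOMAIN = range(4)


def ackermann_sat(body):
    """body(x1, x2, fx1, fx2) is the formula with f(x1), f(x2) as parameters."""
    for x1, x2, vf1, vf2 in product(DOMAIN, repeat=4):
        functional = (x1 != x2) or (vf1 == vf2)  # x1 = x2  =>  vf1 = vf2
        if functional and body(x1, x2, vf1, vf2):
            return True
    return False


def bgv_sat(body):
    """Same check with the ITE-based encoding: no side constraint is needed."""
    for x1, x2, vf1, vf2 in product(DOMAIN, repeat=4):
        fx2 = vf1 if x1 == x2 else vf2           # ITE(x1 = x2, vf1, vf2)
        if body(x1, x2, vf1, fx2):
            return True
    return False


def violates_congruence(x1, x2, fx1, fx2):
    return x1 == x2 and fx1 != fx2               # x1 = x2 AND f(x1) != f(x2)


def distinct_outputs(x1, x2, fx1, fx2):
    return fx1 != fx2                            # f(x1) != f(x2)


results = [
    ackermann_sat(violates_congruence), bgv_sat(violates_congruence),
    ackermann_sat(distinct_outputs), bgv_sat(distinct_outputs),
]
print(results)  # [False, False, True, True]
```

Both encodings agree on these two formulas. The difference is where functional consistency lives: as an explicit side constraint in Ackermann's encoding, and inside the term for $f(x_2)$ in the ITE-based one.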
Solving QF_LIA is NP-complete
• In NP:
  – If a satisfying solution exists, then one exists within a bound $d$
    • $\log d$ is polynomial in input size
  – Expression for $d$ [Papadimitriou, ‘82]: $(n+m) \cdot (b_{\max} + 1) \cdot (m \cdot a_{\max})^{2m+3}$
  – Input size:
    • $m$: # constraints
    • $n$: # variables
    • $b_{\max}$: largest constant (absolute value)
    • $a_{\max}$: largest coefficient (absolute value)

Small-domain encoding / Finite Instantiation: Naïve approach
• Steps
  – Calculate the solution bound $d$
  – Encode each integer variable with $\lceil \log d \rceil$ bits & translate to Boolean formula
  – Run SAT solver
• Problem: For QF_LIA, $d$ is $\Omega(m^m)$
  – $\Omega(m \log m)$ bits per variable
• Solution: Exploit special cases and domain-specific structure

Special Case 1: Equality Logic
• Linear constraints are equalities $x_i = x_j$
• Result: $d = n$
  – $x_1 \neq x_2 \land x_2 \neq x_3 \land x_1 \neq x_3$: 3-valued domain is needed: {1, 2, 3}
  – $x_1 = x_2 \land x_2 \neq x_3 \land x_1 \neq x_3$: Can find solution with domain {1, 2}
[Pnueli et al., Information and Computation, 2002]

Special Case 2: Difference Logic
• Boolean combination of difference-bound constraints
  – $x_i \geq x_j + b$, $\pm x_i \geq b$
• Result: $d = n \cdot (b_{\max} + 1)$ [Bryant, Lahiri, Seshia, CAV’02]
• Proof sketch: satisfying solution corresponds to shortest path in constraint graph
  – Longest such path has length $\leq n \cdot (b_{\max} + 1)$
• Tighter formula-specific bounds possible

Special Case 3: Generalized 2SAT
• Generalized 2SAT constraints
  – $x_i + x_j \geq b$, $-x_i - x_j \geq b$, $x_i - x_j \geq b$, $x_i \geq b$
• $d = 2 \cdot n \cdot (b_{\max} + 1)$ [Seshia, Subramani, Bryant, ’04]

Full Integer Linear Arithmetic
• Can we avoid the $m^m$ blow-up?
• In fact, yes.
  The idea is to derive a new parameterized solution bound $d$
  – Formalize parameters that the bound really depends on
  – Parameters characterize sparse structure
    • Occurs especially in software verification; also in many high-level hardware models
  – [Seshia & Bryant, LICS’04, LMCS’05]

Structure of Linear Constraints in Software Verification
• Characteristics of studied benchmarks
  – Mostly difference constraints
    • Only 3% of constraints were NOT difference constraints
  – Non-difference constraints are sparse
    • At most 6 variables per constraint (total number of variables in 1000s)
• Some similar observations: Pratt’77, ESC/Java-Simplify-TR’03

Parameterized Solution Bound
• New parameters:
  – $k$ non-difference constraints,
  – $w$ variables per constraint (width)
• Our solution bound: $n \cdot (b_{\max} + 1) \cdot (w \cdot a_{\max})^{k}$
• Previous: $(n+m) \cdot (b_{\max} + 1) \cdot (m \cdot a_{\max})^{2m+3}$
  ($m$: #constraints, $n$: #variables, $b_{\max}$: max |constant|, $a_{\max}$: max |coefficient|)
• Direct dependence on $m$ eliminated (and $k \ll m$)

Example
[Figure: $\phi$ is a Boolean combination ($\land$, $\lor$, $\neg$) of the three constraints below]
  $x_1 - x_2 \geq 1$
  $x_1 + 2 x_2 + x_3 > -3$
  $x_2 - x_4 \geq 0$
  $m$  #constraints       3        $k$  #non-difference  1
  $n$  #variables         4        $w$  width            3
  $b_{\max}$  max |constant|     3
  $a_{\max}$  max |coefficient|  2
Previous: $d = (4+3) \cdot (3+1) \cdot (3 \cdot 2)^{9} = 282{,}175{,}488$
New: $d = 4 \cdot (3+1) \cdot (3 \cdot 2)^{1} = 96$

Summary of d Values
  Logic                             Solution Bound $d$
  Equality logic                    $n$
  Difference logic                  $n \cdot (b_{\max} + 1)$
  Generalized 2SAT logic            $2 \cdot n \cdot (b_{\max} + 1)$
  Full Integer Linear Arithmetic    $n \cdot (b_{\max} + 1) \cdot (a_{\max}^{k} \cdot w^{k})$

Abstraction-Based Methods
• For some logics, one cannot easily compute a closed-form expression for the small domain
• Example: Bit-Vector Arithmetic
• In such cases, an abstraction-refinement approach can be used to compute formula-specific small domains

Bit-Vector Arithmetic: Some History
• B.C.
  (Before Chaff)
  – String operations (concatenate, field extraction)
  – Linear arithmetic with bounds checking
  – Modular arithmetic
• SAT-Based “Bit Blasting”
  – Generate Boolean circuit based on bit-level behavior of operations
    • Handles arbitrary operations
  – Check with best available SAT solver
  – Effective in many applications
    • CBMC [Clarke, Kroening, Lerda, TACAS ’04]
    • Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV ’05]

Research Challenge
• Is there a better way than bit blasting?
• Requirements
  – Provide same functionality as with bit blasting
    • Must support all bit-vector operators
  – Exploit word-level structure
  – Improve on performance of bit blasting
• Current approaches are based on two core ideas:
  1. Simplification: Simplify input formula using word-level rewrite rules and solvers
  2. Abstraction: Can use automatic abstraction-refinement to solve simplified formula

Bit-Vector SMT Solvers, circa Spring 2009
Current Techniques with Sample Tools
  – Proof-based abstraction-refinement: UCLID [Bryant et al., TACAS ’07]
  – Solver for linear modular arithmetic to simplify the formula: STP [Ganesh & Dill, CAV’07]
  – Automatic parameter tuning for SAT: Spear [Hutter et al., FMCAD ’07]
  – Rewrites, underapproximation, efficient SAT engine: Boolector [Brummayer & Biere, TACAS’09]
  – Equality/constant propagation, logic optimization, special rules for non-linear ops: Beaver [Jha et al., CAV’09]
  – DPLL(T) framework, layered approach, rewriting: CVC3 [Barrett et al.], MathSAT [Bruttomesso et al.], Yices [Dutertre et al.], Z3 [de Moura et al.]
Abstraction-Refinement
• Deciding Bit-Vector Arithmetic with Abstraction [Bryant et al., TACAS ’07, STTT ’09]
  – Use bit blasting as core technique
  – Apply to simplified versions of formula: under and over approximations
  – Generate successive approximations until a solution is found or formula shown unsatisfiable
  – Inspired by McMillan & Amla’s proof-based abstraction for finite-state model checking
• Small Motivating Example: $(x + y \neq y + x) \land (x * y \neq y * x)$
  – Sufficient to prove the left-hand conjunct unsat

Approximations to Formula
• Overapproximation $\phi^+$: More solutions. If unsatisfiable, then so is $\phi$
• Original Formula $\phi$
• Underapproximation $\phi^-$: Fewer solutions. Satisfying solution also satisfies $\phi$
• Example Approximation Techniques
  – Underapproximating
    • Restrict word-level variables to smaller ranges of values
  – Overapproximating
    • Replace subformula with Boolean variable

Starting Iterations
[Figure: $\phi$ with initial underapproximation $\phi_1^-$]
• Initial Underapproximation
  – (Greatly) restrict ranges of word-level variables
  – Intuition: Satisfiable formula often has small-domain solution

First Half of Iteration
[Figure: $\phi_1^-$ is checked. If SAT, then done. If UNSAT, the proof is used to generate overapproximation $\phi_1^+$]
• SAT Result for $\phi_1^-$
  – Satisfiable
    • Then have found solution for $\phi$
  – Unsatisfiable
    • Use UNSAT proof to generate overapproximation $\phi_1^+$

Second Half of Iteration
[Figure: $\phi_1^+$ is checked. If UNSAT, then done. If SAT, the solution is used to generate refined underapproximation $\phi_2^-$]
• SAT Result for $\phi_1^+$
  – Unsatisfiable: then have shown $\phi$ unsatisfiable
  – Satisfiable: solution indicates variable ranges that must be expanded
    • Generate refined underapproximation
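The underapproximation half of this loop can be sketched on a small formula, $\phi := (x = y + 2) \land (x^2 > y^2)$. This is a deliberately simplified illustration under several assumptions: words are 4 bits wide with wraparound arithmetic, every variable's range is widened uniformly by one bit per round, and brute-force enumeration stands in for the SAT solver. The proof-based overapproximation step, which in the method described above decides which ranges to widen, is omitted. The names `phi` and `solve_by_widening` are hypothetical.

```python
from itertools import product

FULL_WIDTH = 4                 # assumption: 4-bit words keep the search tiny
MASK = (1 << FULL_WIDTH) - 1


def phi(x, y):
    # phi := (x = y + 2) AND (x^2 > y^2), with wraparound at FULL_WIDTH bits
    return x == ((y + 2) & MASK) and ((x * x) & MASK) > ((y * y) & MASK)


def solve_by_widening(formula):
    """Return (model, width) for the first satisfiable underapproximation,
    or (None, FULL_WIDTH) if the unrestricted formula has no model."""
    for width in range(1, FULL_WIDTH + 1):
        values = range(1 << width)       # variables limited to `width` low bits
        for x, y in product(values, repeat=2):
            if formula(x, y):
                return (x, y), width     # a model of the restriction is a model of phi
    # At FULL_WIDTH the restriction equals phi itself, so UNSAT here is final.
    return None, FULL_WIDTH


model, width = solve_by_widening(phi)
print(model, width)  # (2, 0) 2
```

With 1-bit ranges the formula has no model, and widening to 2 bits yields $x = 2$, $y = 0$. Termination follows because the ranges are finite and the last restriction coincides with $\phi$.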
Example
  $\phi := (x = y+2) \land (x^2 > y^2)$
  $\phi_1^- := (x[1] = y[1]+2) \land (x[1]^2 > y[1]^2)$: UNSAT, look at proof
  $\phi_1^+ := (x = y+2)$: SAT, $x = 2$, $y = 0$
  $\phi_2^- := (x[2] = y[2]+2) \land (x[2]^2 > y[2]^2)$: SAT, done.

Iterative Behavior
[Figure: overapproximations $\phi_1^+, \phi_2^+, \dots, \phi_k^+$ above $\phi$; underapproximations $\phi_1^-, \phi_2^-, \dots, \phi_k^-$ below $\phi$]
• Underapproximations
  – Successively more precise abstractions of $\phi$
  – Allow wider variable ranges
• Overapproximations
  – No predictable relation
  – UNSAT proof not unique

Overall Effect
[Figure: the sequence terminates with UNSAT on some $\phi_k^+$ or with SAT on some $\phi_k^-$]
• Soundness
  – Only terminate with solution on underapproximation
  – Only terminate as UNSAT on overapproximation
• Completeness
  – Successive underapproximations approach $\phi$
  – Finite variable ranges guarantee termination
    • In worst case, get $\phi_k^- = \phi$

Roadmap for this Tutorial (current section: Conclusion)

Summary of Ideas: Modeling
• Philosophy: Model systems in first-order logic + suitable theories
• Widely-used theories:
  – Equality and uninterpreted functions
  – Linear arithmetic
  – Bit-vector arithmetic
  – Arrays

Summary of Ideas: Lazy Methods
• Philosophy: Extend DPLL framework from SAT to SMT
• Literals assigned by SAT are sent to Theory Solver
• Theory Solver determines if literals are satisfiable in the theory
• Key optimizations: small explanations, early conflict detection, theory propagation
Summary of Ideas: Eager Methods
• Philosophy: Constrain solution space with logic-specific methods
• Small-domain encoding
  – Compute bounds that work for any formula in the logic
• Abstraction-refinement of domains
  – Compute formula-specific small domains
• Rewrite rules: high level and bit level
  – Simplify formula before and after bit-blasting

Challenges and Opportunities
• Solvers for new theories
  – Strings
  – Non-linear arithmetic
  – Can we exploit domain-specific structure?
• Parallel SMT
• Better support for quantifiers
• Better proof/interpolant generation

Join the SMT Community
• We need your new, exciting applications!
• Contribute to SMT-LIB
• Create new solvers, compete in SMTCOMP
Slides and book chapter available on our websites:
  Clark: http://cs.nyu.edu/~barrett
  Sanjit: http://www.eecs.berkeley.edu/~sseshia
