SAT and SMT solvers
Ayrat Khalimov (based on Georg Hofferek's slides)
AKDV 2014

Motivation
• SAT solvers:
  - They gave model checking a huge boost
• First-order theories
  - Very expressive
  - Efficient SMT solvers
But:
• What are they?
• How do solvers work?

Institute for Applied Information Processing and Communications

Outline
• Propositional SAT solver
  - DPLL algorithm
• Predicate Logic (aka First-Order Logic)
  - Syntax
  - Semantics
• First-Order Logic
• First-Order Theories
• SMT solver
  - Eager Encoding
  - Lazy Encoding
  - DPLL(T)

Scope of Solvers
[Figure: nested scopes of the three solver classes]
• SAT solvers: propositional logic
• SMT solvers: theory of equality, linear integer arithmetic, difference logic, theory of arrays, …
• Theorem provers: first-order logic

Notation
• Propositional variables
  - e.g., a, b, c, d, …
• A literal is a variable or its negation
  - e.g., a, ¬b, …
• A partial assignment A is a conjunction of literals
  - e.g., A = ¬a ∧ d
• A clause is a disjunction of literals
  - e.g., c = a ∨ b
• φ is a CNF formula (i.e.
  conjunction of clauses):
  - e.g., φ = (a ∨ b ∨ ¬d) ∧ c
• φ[A] is φ with all variables set according to A
  - e.g., for A = ¬a ∧ d: φ[A] = (FALSE ∨ b ∨ ¬TRUE) ∧ c = b ∧ c

SAT Solver
• Input: formula in CNF
• Output: Satisfiable (+ model) or Unsatisfiable (+ refutation proof)

DPLL Algorithm
• Due to Davis, Putnam, Logemann, Loveland
  - two papers: 1960, 1962
• Basis for all modern SAT solvers

CNF as a Set of Clauses
• Formula: φ = (a ∨ ¬b ∨ c) ∧ (¬a ∨ ¬d) ∧ (¬c)
• Set representation: { {a, ¬b, c}, {¬a, ¬d}, {¬c} }

Idea of DPLL-based SAT Solvers
• Recursively search for an assignment A such that φ[A] is TRUE
  - This proves φ satisfiable
  - A is a satisfying model
• If no such A exists
  - φ is unsatisfiable

Setting Literals
• Compute φ[l], for a literal l:
  - Remove all clauses that contain l:
    • They are true
  - Remove all literals ¬l:
    • They are false (i.e., a ∨ ¬l becomes a, and the clause ¬l becomes empty)
  - An empty clause is false
  - An empty set of clauses is true

Truth Value of a CNF
clause ∧ clause ∧ clause ∧ ⋯ ∧ clause, written as the set { …, …, …, … }
• At least one clause is empty:
  - FALSE
• Clause set empty:
  - TRUE
• Otherwise:
  - Unassigned literals left

DPLL Algorithm
  // sat(φ, A) = TRUE iff φ[A] is satisfiable
  // sat(φ, true) = TRUE iff φ is satisfiable
  sat(φ, A){
    if(φ[A] = true) return TRUE;
    if(φ[A] = false) return FALSE;
    // Some unassigned variables left
    l = pick unassigned variable;
    AT = A ∧ l;
    if(sat(φ, AT)) return TRUE;
    AF = A ∧ ¬l;
    if(sat(φ, AF)) return TRUE;
    return FALSE;
  }

DPLL Example
• Formula to check: φ = (¬a ∨ b) ∧ (¬b ∨ c) ∧ (¬c ∨ ¬a)
  1. sat(φ, true)
  2. sat(φ, a)
  3. sat(φ, a ∧ b)
  4. sat(φ, a ∧ b ∧ c)    unsat
  5. sat(φ, a ∧ b ∧ ¬c)   unsat
  6. sat(φ, a ∧ ¬b)       unsat
  7. sat(φ, ¬a)
  8. sat(φ, ¬a ∧ b)
  9. sat(φ, ¬a ∧ b ∧ c)   sat

Boolean Constraint Propagation (BCP)
• Unit clause:
  - a clause with a single unassigned literal (all its other literals are false)
  - Examples:
    • (a)
    • (¬b)
• Unit clause exists ⇒ set its literal
  - Very simple but very important heuristic!
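The recursive search, the literal-setting rules and BCP described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used by any real solver: the integer encoding of literals (v for a variable, -v for its negation) and the names set_literal, unit_propagate and sat are assumptions made for this example, and the branching choice is deliberately naive.

```python
from typing import Dict, FrozenSet, List, Optional

Clause = FrozenSet[int]   # assumed encoding: literal v, or -v for its negation
CNF = List[Clause]        # a CNF formula is a list (conjunction) of clauses


def set_literal(cnf: CNF, lit: int) -> CNF:
    """Compute cnf[lit]: drop satisfied clauses, delete falsified literals."""
    result = []
    for clause in cnf:
        if lit in clause:
            continue                     # the clause is true: remove it
        result.append(clause - {-lit})   # -lit is false: remove it from the clause
    return result


def unit_propagate(cnf: CNF, assignment: Dict[int, bool]) -> CNF:
    """BCP: while a unit clause exists, set its only literal."""
    while True:
        unit = next((c for c in cnf if len(c) == 1), None)
        if unit is None:
            return cnf
        (lit,) = unit
        assignment[abs(lit)] = lit > 0
        cnf = set_literal(cnf, lit)


def sat(cnf: CNF, assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a satisfying (possibly partial) assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    cnf = unit_propagate(cnf, assignment)
    if not cnf:
        return assignment                # empty clause set: TRUE
    if any(len(c) == 0 for c in cnf):
        return None                      # an empty clause: FALSE
    var = abs(next(iter(cnf[0])))        # naive choice: a variable of the first clause
    for lit in (var, -var):              # try the literal, then its negation
        model = sat(set_literal(cnf, lit), {**assignment, var: lit > 0})
        if model is not None:
            return model
    return None


if __name__ == "__main__":
    # (¬a ∨ b) ∧ (¬b ∨ c) ∧ (¬c ∨ ¬a) with a=1, b=2, c=3
    phi = [frozenset(c) for c in ([-1, 2], [-2, 3], [-3, -1])]
    print(sat(phi))  # every model found sets variable 1 (a) to False
```

Returning the assignment instead of a bare TRUE mirrors the "Satisfiable (+ model)" output of a SAT solver; the assignment may be partial, because the search stops as soon as the clause set is empty.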
DPLL with BCP
  sat(φ, A){
    while(unit clause occurs){
      // l is the only unassigned literal in the unit clause
      A = A ∧ l;
    }
    if(φ[A] = true) return TRUE;
    if(φ[A] = false) return FALSE;
    l = pick unassigned variable;
    AT = A ∧ l;
    if(sat(φ, AT)) return TRUE;
    AF = A ∧ ¬l;
    if(sat(φ, AF)) return TRUE;
    return FALSE;
  }

Example
• Formula to check: φ = (¬a ∨ b) ∧ (¬b ∨ c) ∧ (¬c ∨ ¬a)
  1. sat(φ, true)
  2. sat(φ, a)
  3. [BCP]: sat(φ, a ∧ b)
  4. [BCP]: sat(φ, a ∧ b ∧ c)   unsat
  5. sat(φ, ¬a)
  6. sat(φ, ¬a ∧ b)
  7. sat(φ, ¬a ∧ b ∧ c)         sat

Can we do better?
  sat(φ, A){
    while(unit clause occurs){
      // l is the only unassigned literal in the unit clause
      A = A ∧ l;
    }
    if(φ[A] = true) return TRUE;
    if(φ[A] = false) return FALSE;
    l = pick unassigned variable;
    AT = A ∧ l;
    if(sat(φ, AT)) return TRUE;
    AF = A ∧ ¬l;
    if(sat(φ, AF)) return TRUE;
    return FALSE;
  }

Pure Literals
• Pure literal:
  - Literal for an unassigned variable
  - The variable appears in one phase only
• Pure literals ⇒ set them to true

DPLL with BCP and Pure Literals
  sat(φ, A){
    while(unit clause c occurs){        // BCP
      let l be the only unassigned literal in c;
      A = A ∧ l;
    }
    while(pure literal l exists){       // Pure literals
      A = A ∧ l;
    }
    if(φ[A] = true) return TRUE;
    if(φ[A] = false) return FALSE;
    l = pick a literal that does not occur in A;
    AT = A ∧ l;
    if(sat(φ, AT)) return TRUE;
    AF = A ∧ ¬l;
    if(sat(φ, AF)) return TRUE;
    return FALSE;
  }

Example
• Formula to check: φ = (¬a ∨ b) ∧ (¬b ∨ c) ∧ (¬c ∨ ¬a)
  1. sat(φ, true)       [¬a pure]
  2. sat(φ, ¬a)         [¬b pure]
  3. sat(φ, ¬a ∧ ¬b)    sat

Can we do better?
  sat(φ, A){
    while(unit clause l occurs) A = A ∧ l;
    while(pure literal l exists) A = A ∧ l;
    if(φ[A] = true) return TRUE;
    if(φ[A] = false) return FALSE;
    l = pick a literal that does not occur in A;
    AT = A ∧ l;
    if(sat(φ, AT)) return TRUE;
    AF = A ∧ ¬l;
    if(sat(φ, AF)) return TRUE;
    return FALSE;
  }

Learning: informal
• Whenever we get a conflict
  - analyze it
  - add clauses to avoid it in the future

2013-03-08

Learning
[Figure, built up over several slides: a numbered clause list (clauses 1 to 7) over the variables a, b, c, next to the search tree explored on it; the signs of the literals are not recoverable]
• Without learning: the search decides c first and then runs into UNSAT in the branches on a
• The problem is with a: no need to set c = true!
• With learning: after the decision on a, clause 6 implies b, and clause 7 then becomes false
• We learn: a unit clause on a

Learning & Backtracking
[Figure: the same example annotated with decision levels 0, 1 and 2; the clause list is now numbered 1 to 8]
• The conflict occurs at level 2; we learn the unit clause on a
• Jumping back to level 0 is smart
• At level 0 the learned clause is propagated: clause 4 implies b, and clause 5 then becomes false
• We learn: UNSAT, because no decision was necessary

Backtrack Level
• Three important possibilities
  1. Backtrack as usual
  2. Restart for every learned clause
  3. Go to the earliest level in which the conflict clause is a unit clause
• Option 3 often performs better

Can we do better?
(learning is not shown)
  sat(φ, A){
    while(unit clause l occurs) A = A ∧ l;
    while(pure literal l exists) A = A ∧ l;
    if(φ[A] = true) return TRUE;
    if(φ[A] = false) return FALSE;
    l = pick a literal that does not occur in A;   // how to pick literals?
    AT = A ∧ l;
    if(sat(φ, AT)) return TRUE;
    AF = A ∧ ¬l;
    if(sat(φ, AF)) return TRUE;
    return FALSE;
  }

Effect of picking heuristics on SAT solver performance
[Figure] Source: Armin Biere's slides: http://fmv.jku.at/rerise14/rerise14-sat-slides.pdf

Can we do better? Special cases
• Horn clauses can be solved in polynomial time
• Cut width algorithm
Source: http://gauss.ececs.uc.edu/SAT/

Syntax of Predicate Logic
• Two sorts:
  - Objects: "Terms"
    • Numbers
    • Strings
    • Elements of sets
    • …
  - Truth values: "Formulas"
    • IsEven(42)

From Terms to Formulas
• P(x, f(y)) is a formula
  - P is a predicate
  - x and f(y) are terms

FOL formulae: informal definition
  ∀x ∀z ∃y: P(x, z) ∧ F(y) → P(t(x, y), z)
• quantifiers over variables
• functions: unary, binary, etc.
• predicates: unary, binary, etc.
• Can FO formulae quantify over functions/predicates?
• Can FO formulae have free (non-quantified) variables?
• * Can FO formulae have 'uninterpreted' functions?
• * Can an FO formula have an infinite number of atoms?

Syntax of Predicate Logic
• Variables 𝕍
  - x, y, z, …
• Functions 𝔽
  - f, g, h, … (arity > 0)
  - constants (arity = 0)
• Predicates ℙ
  - P, Q, R, … (with arity > 0)
• Terms 𝕋 and Formulae defined next

Terms 𝕋
• A variable is a term
• A constant is a term
• If t1, t2, …, tn are terms and f is an n-ary function, then f(t1, t2, …, tn) is a term

Formulae (with preconditions)
• P(t1, t2, …, tn)
  - Preconditions: terms t1, t2, …, tn; n-ary predicate symbol P
• ¬φ, φ ∧ ψ, φ ∨ ψ, φ → ψ
  - Preconditions: formulae φ, ψ
• ∀x φ, ∃x φ
  - Preconditions: variable x

True and False FO formulae
• Functions and predicates in FO formulae are 'uninterpreted'
  - they can be any function or predicate
• Variables in FO formulae have no domains
  - what can x, y be?
• What does it mean that this formula is true? Or false?
    ∀x ∃y: P(x, y) → P(f(x, y), y)
• It depends on the model.

Model M for (𝔽, ℙ, 𝕍)
• Non-empty set D
  - Domain for variables
  - Possibly infinite
• For constants c ∈ 𝔽: concrete element c^M ∈ D
• For functions f ∈ 𝔽 (of arity n): concrete function f^M: D^n → D
• For predicates P ∈ ℙ (of arity n): subset P^M ⊆ D^n
  - i.e., the set of tuples on which P is true

Semantics of Predicate Logic
• Formula φ
  - Over 𝔽, ℙ, 𝕍
• Model M
  - For 𝔽, ℙ, 𝕍
• Does M ⊨ φ hold?
  - (φ has no free variables)
  - Inductive definition

Semantics of Predicate Logic
• For φ of the form ∀x ψ
  - M ⊨ ∀x ψ iff M ⊨ ψ[x↦a] for all a ∈ D
• For φ of the form ∃x ψ
  - M ⊨ ∃x ψ iff M ⊨ ψ[x↦a] for at least one a ∈ D
• For φ of the form ¬ψ, ψ1 ∧ ψ2, ψ1 ∨ ψ2
  - Like in propositional logic
• No free variables ⇒ every predicate P(t1, t2, …, tn) has concrete arguments

Examples
• Let model M be:
  - D = {1, 2}
  - P(1,1) = P(1,2) = T, all others give F
  - f(1, ·) = 1, f(2,1) = 1, f(2,2) = 2
• φ = ∀x ∃y: P(x, y) → P(f(x, y), y)
Does M ⊨ φ?

Satisfiable FO formulae
∀x ∃y: P(x, y) → P(f(x, y), y) is satisfiable means there is a model:
• there is a non-empty domain D for x, y
  - for example, D = {1, 2}
• there is a predicate P and a function f:
  - for example, P = {(1,2)}, i.e. P(1,2) = true, P(2,·) = false
  - for example, f = {(1,2) ↦ 1, …}, i.e. f(1,2) = 1
such that M ⊨ ∀x ∃y: P(x, y) → P(f(x, y), y)

Valid FO formulae
∀x P(x) → ∀x P(f(x)) is valid iff it is satisfied by every model
Let us check, for example, the model:
• D = {1, 2}
• P = {1, 2}
  - i.e., P(1) = P(2) = T
• f is any function from {1, 2} to {1, 2}

Some facts about our world
• Gödel proved that
  - every valid FO formula has a finite proof.
• Church and Turing proved that
  - no algorithm exists that can decide whether an FO formula is valid
  [Figure: an FO formula is fed to a deduction algorithm, which outputs a proof if the formula is valid; if it is invalid, the algorithm may never terminate]

Notion of "Theory"
• Application domain: Arithmetic
  - Structures & objects: numbers (integers, rationals, reals)
• Application domain: Computer programs
  - Structures & objects: arrays, bitvectors
  - Predicates & functions: Array-Read, Array-Write, …

Definition of a Theory
First-Order Theory 𝒯:
1. Signature Σ
   - Constants
   - Predicates
   - Functions
   Σ-formula: a formula whose (non-logical) symbols come from Σ only
2. Set of axioms 𝒜
   - Sentences (= formulas without free variables) with symbols from Σ only
Σ and 𝒜 are possibly infinite

Example: Theory of Equality 𝒯_E
• Signature Σ_E = {=, a0, b0, c0, d0, …}
  - Binary equality predicate =
  - Arbitrary constant symbols
  - (no other function/predicate symbols!)
• Axioms 𝒜_E:
  1. ∀x. x = x (reflexivity)
  2. ∀x. ∀y. x = y → y = x (symmetry)
  3. ∀x. ∀y. ∀z. x = y ∧ y = z → x = z (transitivity)

Model View
[Figure: the set of all possible models, with the models satisfying all axioms as a subset]
• We check satisfiability and validity only w.r.t. models that satisfy the axioms
  - "Satisfiability modulo (= 'with respect to') theories"

𝒯-Satisfiability
• Green: models satisfying all axioms
• Violet: models satisfying the formula in question
• 𝒯-satisfiable iff the green and violet sets overlap
[Figure: three diagrams labelled 𝒯-satisfiable, 𝒯-satisfiable, not 𝒯-satisfiable]

𝒯-Validity
• Green: models satisfying all axioms
• Violet: models satisfying the formula in question
• 𝒯-valid iff every green model is also violet
[Figure: three diagrams labelled 𝒯-valid, 𝒯-valid, not 𝒯-valid]

Theory Formulas vs. FO Formulas
• A theory formula φ is 𝒯-valid iff the FO formula 𝒜 → φ is valid
• A theory formula φ is 𝒯-satisfiable iff the FO formula 𝒜 ∧ φ is satisfiable

Fragment of a Theory
• Syntactically restricted subset
  - Quantifier-free fragment
  - Conjunctive fragment
    • e.g.: a = b ∧ b = c ∧ b ≠ d

Scope of Solvers
[Figure repeated: SAT solvers cover propositional logic; SMT solvers cover theories such as equality, linear integer arithmetic, difference logic, arrays, …; theorem provers cover first-order logic]

Deciding Satisfiability (quantifier-free theory): main methods
1. Eager Encoding
   - Equisatisfiable propositional formula
   - one fat SAT call
2. Lazy Encoding
   - Theory solver for the conjunctive fragment
   - Blocking clauses
   - numerous SAT calls
3.
   DPLL(T)

Example: Theory of Uninterpreted Functions and Equality 𝒯_UE
• Signature Σ_UE = {=, a, b, c, d, …}
  - Binary equality predicate =
  - Arbitrary constant and function symbols
• Axioms 𝒜_UE:
  1.-3. same as in 𝒜_E (reflexivity, symmetry, transitivity)
  4. ∀x̄. ∀ȳ. (⋀_i x_i = y_i) → f(x̄) = f(ȳ) (function congruence)
     Axiom schema: template for an (infinite number of) axioms

Two-Stage Eager Encoding
  (quant.-free) 𝒯_UE formula
    → Ackermann's Reduction →
  equisatisfiable 𝒯_E formula
    → Graph-based Reduction →
  equisatisfiable propositional formula
    → SAT Solver

Ackermann's Reduction (from 𝒯_UE to 𝒯_E)
• Fresh variables
  - f(x) ⇝ f_x, g(x) ⇝ g_x, ...
• Functional constraints
  - x = y → f_x = f_y
• 𝒯_E formula: φ_E = φ_FC ∧ φ_UE, where the function applications in φ_UE are replaced by their fresh variables

Exercise: perform Ackermann's Reduction for
  φ_UE ≔ f(g(x)) = f(y) ∨ z = g(y) ∧ z ≠ f(z)

Graph-Based Reduction (from 𝒯_E to propositional)
• Non-polar equality graph
  - Node per variable
  - Edge per (dis)equality
• Make it chordal
  - No chord-free cycles (size > 3)
[Figure: example equality graph over the nodes a, b, c, d, e, f, g]

Graph-Based Reduction (from 𝒯_E to propositional)
• Fresh propositional variables
  - a = b ⇝ e_{a=b}
  - Order! b = a ⇝ e_{a=b}
• Triangle (a, b, c):
  - Transitivity constraints
  - (e_{c=b} ∧ e_{b=a} → e_{c=a}) ∧ (e_{c=b} ∧ e_{c=a} → e_{b=a}) ∧ (e_{c=a} ∧ e_{b=a} → e_{c=b})
• φ_prop = φ_TC ∧ φ_E, which is passed to the SAT solver

Exercise: perform Graph-Based Reduction for
  φ_E ≔ a = b ∧ b = c ∧ c = d ∧ d ≠ a

Summary: Eager Encoding
  (quant.-free) 𝒯_UE formula
    → Ackermann's Reduction →
  equisatisfiable 𝒯_E formula φ_E = φ_FC ∧ φ_UE
    → Graph-based Reduction →
  equisatisfiable propositional formula φ_prop = φ_TC ∧ φ_E
    → SAT Solver

Lazy Encoding
[Figure: loop between a SAT solver and a theory solver]
• The SAT solver runs on the propositional skeleton skel(φ); if it answers UNSAT, φ is UNSAT
• Otherwise its assignment of literals is handed to the theory solver
• If the theory solver accepts the assignment, φ is SAT; otherwise it returns a blocking clause to the SAT solver

Conjunctive (quant.-free) Fragment of 𝒯_UE
• Conjunction of theory literals, where literals are:
  - t1 = t2, f(t2) = t1, t1 ≠ t2, t1 ≠ f(t2), …

Congruence-Closure Algorithm
• Equivalence classes
  - introduce a class for each term t1, t2, f(t1), …
  - t1 = t2: merge the classes of t1 and t2 into one larger class
  - two classes share terms: merge the classes!
    (repeat)
  - ti, tj from the same class: merge the classes of f(ti), f(tj) (repeat)
• Check disequalities tk ≠ tl
  - tk, tl in the same class: UNSAT!
  - Otherwise (no disequality is violated): SAT!

Exercise: perform Congruence Closure for
  φ_UE ≔ a = b ∧ b = c ∧ d = e ∧ e ≠ a ∧ f(a) ≠ f(c)

Lazy Encoding
[Figure repeated: SAT solver on skel(φ), assignment of literals, theory solver, blocking clause, outcomes SAT and UNSAT]

DPLL(T)
[Figure: flowchart with the nodes Start, Decide, BCP/PL, Theory Solver, Analyze Conflict, Learn & Backtrack and Add Clauses, and the outcomes SAT and UNSAT; the edges are labelled "partial assignment", "full assignment", "conflict" and "theory propagation / conflict"]

Scope of Solvers
[Figure repeated: SAT solvers cover propositional logic; SMT solvers cover theories such as equality, linear integer arithmetic, difference logic, arrays, …; theorem provers cover first-order logic]

Summary
• Propositional SAT problem
  - DPLL
• First-order theories
  - Examples: 𝒯_E, 𝒯_UE
• Satisfiability modulo theories
  - Eager Encoding
  - Lazy Encoding
  - DPLL(T)

Self-check: learning targets
• Explain Satisfiability Modulo Theories
• Describe the Theory of Uninterpreted Functions and Equality
• Explain and use
  - Ackermann's Reduction
  - Graph-based Reduction
  - Congruence Closure
  - DPLL
  - DPLL(T)

Some reading
• History of satisfiability: http://gauss.ececs.uc.edu/SAT/articles/FAIA1850003.pdf
• SAT basics: http://gauss.ececs.uc.edu/SAT/articles/sat.pdf
• Conflict Driven Clause Learning: http://gauss.ececs.uc.edu/SAT/articles/FAIA185-0131.pdf
• Armin Biere's slides: http://fmv.jku.at/rerise14/rerise14-sat-slides.pdf
• SAT game: http://www.cril.univartois.fr/~roussel/satgame/satgame.php?level=1&lang=eng
• Logic and Computability classes by Georg: http://www.iaik.tugraz.at/content/teaching/bachelor_courses/logik_und_berechenbarkeit/
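As a companion to the "Congruence Closure" learning target, the procedure from the lazy-encoding part can be sketched in Python as below. It is a naive illustration rather than an efficient implementation: the term encoding (a string for a constant, a tuple such as ("f", "a") for f(a)) and the function names are assumptions made for this example, and the quadratic merge loop stands in for the optimized data structures a real theory solver would use.

```python
from itertools import combinations
from typing import Dict, List, Tuple, Union

Term = Union[str, Tuple]   # assumed encoding: "a" is a constant, ("f", "a") is f(a)


def subterms(t: Term) -> List[Term]:
    """Return t together with all of its subterms."""
    if isinstance(t, str):
        return [t]
    result = [t]
    for arg in t[1:]:
        result.extend(subterms(arg))
    return result


def congruence_closure(equalities, disequalities) -> bool:
    """Return True iff the conjunction of the given literals is satisfiable."""
    terms = set()
    for s, t in list(equalities) + list(disequalities):
        terms.update(subterms(s))
        terms.update(subterms(t))

    parent: Dict[Term, Term] = {t: t for t in terms}   # one class per term

    def find(t: Term) -> Term:
        while parent[t] != t:
            t = parent[t]
        return t

    def union(s: Term, t: Term) -> None:
        parent[find(s)] = find(t)

    for s, t in equalities:                # merge the classes of equated terms
        union(s, t)

    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:                         # repeat until no further merge happens
        changed = False
        for s, t in combinations(apps, 2):
            same_symbol = s[0] == t[0] and len(s) == len(t)
            if (find(s) != find(t) and same_symbol
                    and all(find(x) == find(y) for x, y in zip(s[1:], t[1:]))):
                union(s, t)                # congruent arguments: merge f(..) classes
                changed = True

    # UNSAT as soon as one disequality relates two terms of the same class
    return all(find(s) != find(t) for s, t in disequalities)


if __name__ == "__main__":
    # a = b ∧ b = c ∧ d = e ∧ e ≠ a ∧ f(a) ≠ f(c)
    eqs = [("a", "b"), ("b", "c"), ("d", "e")]
    diseqs = [("e", "a"), (("f", "a"), ("f", "c"))]
    print(congruence_closure(eqs, diseqs))  # False, i.e. UNSAT
```

On the exercise formula, a, b and c end up in one class, so f(a) and f(c) are merged and the disequality f(a) ≠ f(c) is violated.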