SAT Genealogy Alexander Nadel, Intel, Haifa, Israel The Technion, Haifa, Israel July 3, 2012 1 Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 2 Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction (Note: we won’t use implication graphs for explanation, but rather the duality between resolution and search) 3 What is SAT? Find a variable assignment (AKA solution or model) that satisfies a propositional formula, or prove that there are no solutions SAT solvers operate on CNF formulas: Any formula can be reduced to CNF CNF Formula: F = ( a + c ) ( b + c ) ( a’ + b’ + c’ ) clause negative literal positive literal 4 SAT: Theory and Practice Theory: SAT is the first known NP-complete problem One can check a solution in polynomial time Can one find a solution in polynomial time? Stephen Cook, 1971 The P=NP question… Practice: Amazingly, nowadays SAT solvers can solve industrial problems with millions of clauses and variables SAT has numerous applications in formal verification, planning, bioinformatics, combinatorics, … 5 Approaches to SAT Solving Backtrack search: DFS search for a solution – the baseline approach for industrial-strength solvers; in focus today Look-ahead: BFS search for a solution – helpful for certain classes of formulas; recently, there have been attempts to combine it with backtrack search Local search – helpful mostly for randomly generated formulas 6 Early Days of SAT Solving Agenda Resolution Backtrack Search 7 Resolution: a Way to Derive New Valid Clauses Resolution over a pair of clauses with exactly one pivot variable, i.e., a variable appearing in both polarities: •Known to be invented by Davis&Putnam, 1960 •Had been invented independently by Löwenheim in the early 1900s (as well as the DP algorithm, presented next) •According to Chvátal&Szemerédi, 1988 (JACM) a + b + c’ + f g + h’ + c + f a + b + g + h’ + f - The resolvent clause is a logical consequence of the two source clauses DP Algorithm: Davis&Putnam, 1960 Remove the variables one by one by resolution over all the clauses containing that variable DP is sound and complete (a + b + c)(b + c’ + f’)(b’ + e) → (a + c + e)(c’ + e + f) → (a + e + f) SAT (a + b) (a + b’) (a’ + c)(a’ + c’) → (a) (a’ + c)(a’ + c’) → (c)(c’) → () UNSAT 9 Backtrack Search or DLL: Davis-Logemann-Loveland, 1962 a+b b’ + c b’ + c’ a’ + b Backtrack Search or DLL: Davis-Logemann-Loveland, 1962 a+b b’ + c b’ + c’ a’ + b a’ Backtrack Search or DLL: Davis-Logemann-Loveland, 1962 a+b a’ Decision level 1 b’ + c b’ + c’ a’ + b a is the decision variable; a’ is the decision literal Backtrack Search or DLL: Davis-Logemann-Loveland, 1962 a+b a’ b’ + c b’ + c’ a’ + b b’ Decision level 2 Backtrack Search or DLL: Davis-Logemann-Loveland, 1962 a+b a’ b’ + c b’ + c’ a’ + b b’ a+b A conflict: a blocking clause – a clause falsified by the current assignment – is encountered.
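Before continuing the DLL walkthrough below (the search has just hit its first conflict), a brief aside: the resolution step introduced above, which the DP algorithm applies repeatedly, can be sketched in a few lines of code. The C++ fragment below is only an illustration with an ad-hoc encoding (a positive integer denotes a variable, a negative integer its negation); it is not taken from any solver.

#include <set>
#include <cstdlib>
#include <iostream>

// A clause is a set of literals; literal v is the variable v, -v its negation.
using Clause = std::set<int>;

// Resolve c1 and c2 on pivot variable v (assumed to occur positively in one
// clause and negatively in the other): the resolvent contains all remaining
// literals of both clauses and is a logical consequence of the two sources.
Clause resolve(const Clause& c1, const Clause& c2, int v) {
    Clause res;
    for (int lit : c1) if (std::abs(lit) != v) res.insert(lit);
    for (int lit : c2) if (std::abs(lit) != v) res.insert(lit);
    return res;
}

int main() {
    // (a + b + c' + f) resolved with (g + h' + c + f) on c gives
    // (a + b + g + h' + f), as in the slide's example.
    // Encoding: a=1, b=2, c=3, f=4, g=5, h=6.
    Clause c1 = {1, 2, -3, 4}, c2 = {5, -6, 3, 4};
    for (int lit : resolve(c1, c2, 3)) std::cout << lit << ' ';
    std::cout << '\n';  // prints: -6 1 2 4 5
    return 0;
}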
Backtrack Search or DLL: DavisLogemann-Loveland, 1962 a+b a’ b’ + c b’ + c’ a’ + b b’ a+b b Backtrack and flip Backtrack Search or DLL: DavisLogemann-Loveland, 1962 a+b a’ b’ + c b’ + c’ a’ + b Decision level 1 b’ a+b b c’ b’ + c Decision level 2 Backtrack Search or DLL: DavisLogemann-Loveland, 1962 a+b a’ b’ + c b’ + c’ a’ + b b’ b Decision level 1 a+b c’ b’ + c c b’ + c’ Backtrack Search or DLL: DavisLogemann-Loveland, 1962 a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b b c’ b’ + c c b’ + c’ Backtrack Search or DLL: DavisLogemann-Loveland, 1962 a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b b c’ b’ + c b c b’ + c’ Backtrack Search or DLL: DavisLogemann-Loveland, 1962 a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b b c’ b’ + c b c b’ + c’ c’ b’ + c Backtrack Search or DLL: DavisLogemann-Loveland, 1962 a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b b c’ b’ + c b c b’ + c’ c’ b’ + c c b’ + c’ Backtrack Search or DLL: DavisLogemann-Loveland, 1962 a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b b c’ b’ + c b’ b c a’ + b b’ + c’ c’ b’ + c c b’ + c’ Backtrack Search or DLL: DavisLogemann-Loveland, 1962 UNSAT! a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b b b’ b c’ b’ + c c a’ + b b’ + c’ c’ b’ + c c b’ + c’ Core SAT Solving: the Principles DLL could solve problems with <2000 clauses How can modern SAT solvers solve problems with millions of clauses and variables? The major principles: Learning and pruning Locality and dynamicity Block already explored paths Focus the search on the relevant data Well-engineered data structures Extremely fast propagation 24 Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 25 Duality between Basic Backtrack Search and Resolution One can associate a resolution derivation with every invocation of DLL over an unsatisfiable formula Duality between Basic Backtrack Search and Resolution a+b b’ + c b’ + c’ a’ + b Duality between Basic Backtrack Search and Resolution a+b b’ + c b’ + c’ a’ + b a’ Duality between Basic Backtrack Search and Resolution a+b a’ b’ + c b’ + c’ a’ + b b’ a+b Duality between Basic Backtrack Search and Resolution a+b a’ b’ + c b’ + c’ a’ + b b’ b a+b •A parent clause P(x) is associated with every flip operation for variable x. 
It contains: •The flipped literal •A subset of previously assigned falsified literals •The parent clause justifies the flip: its existence proves that the explored subspace has no solutions Duality between Basic Backtrack Search and Resolution a+b a’ b’ + c b’ + c’ a’ + b b’ a+b b c’ b’ + c Duality between Basic Backtrack Search and Resolution a+b a’ b’ + c b’ + c’ a’ + b b’ a+b b c’ b’ + c c Duality between Basic Backtrack Search and Resolution a+b a’ b’ + c b’ + c’ a’ + b b’ a+b b c’ b’ + c c b’ + c’ Duality between Basic Backtrack Search and Resolution a+b a’ b’ + c b’ + c’ a’ + b b’ b Pnew b’ a+b c’ P(c) b’ + c c Pold b’ + c’ • Backtracking over a flipped variable x can be associated with a resolution operation: • P = P(x) P • P is to become the parent clause for the upcoming flip • P is initialized with the last blocking clause Duality between Basic Backtrack Search and Resolution a+b a’ b’ + c b’ + c’ a’ + b b’ a P(b) a+b Pnew b b’ c’ b’ + c Pold c b’ + c’ • Backtracking over a flipped variable x can be associated with a resolution operation: • P = P(x) P • P is to become the parent clause for the upcoming flip • P is initialized with the last blocking clause Duality between Basic Backtrack Search and Resolution a+b b’ + c b’ + c’ a’ + b a’ (a) b’ a+b a a b b’ c’ b’ + c c b’ + c’ •The parent clause P(a) is derived by resolution. •The resolution proof (a) of the parent clause is called parent resolution Duality between Basic Backtrack Search and Resolution a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b a b b’ c’ b’ + c b c b’ + c’ Duality between Basic Backtrack Search and Resolution a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b a b b’ c’ b’ + c b c b’ + c’ c’ b’ + c Duality between Basic Backtrack Search and Resolution a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b a b b’ c’ b’ + c b c b’ + c’ c’ P(c) b’ + c c Duality between Basic Backtrack Search and Resolution a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b a b b’ c’ b’ + c b c b’ + c’ c’ b’ + c c b’ + c’ Duality between Basic Backtrack Search and Resolution a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b a b b’ c’ b’ + c b c b’ b’ + c’ c’ P(c) b’ + c Pnew c Pold b’ + c’ Duality between Basic Backtrack Search and Resolution a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b a b b’ c’ b’ + c b’ b c b’ + c’ (b) b’ c’ b’ + c c b’ + c’ Duality between Basic Backtrack Search and Resolution a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b a b b’ c’ b’ + c b’ b c b’ b’ + c’ c’ b’ + c a’ + b c b’ + c’ Duality between Basic Backtrack Search and Resolution a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b a a’ b b’ c’ b’ + c Pnew c b’ b P(b) Pold b’ b’ + c’ c’ b’ + c a’ + b c b’ + c’ Duality between Basic Backtrack Search and Resolution a+b a’ a’ + b a P(a) b’ + c b’ + c’ Pnew b’ a+b Pold a a’ b b’ c’ b’ + c b’ b c b’ b’ + c’ c’ b’ + c a’ + b c b’ + c’ Duality between Basic Backtrack Search and Resolution a+b a’ a b’ + c b’ + c’ a’ + b b’ a+b a a’ b b’ c’ b’ + c b’ b c b’ b’ + c’ c’ b’ + c a’ + b c b’ + c’ Duality between Basic Backtrack Search and Resolution The final trace of DLL is both a decision tree (top-down view) and a resolution refutation (bottom-up view) Variables associated with the edges are both decision variables in the tree and pivot variables for the resolution A forest of parent resolutions is maintained The forest converges to one resolution refutation in the end (for an UNSAT formula) a’ b’ a+b a a a’ b b’ b’ b c’ c b’ + c b’ + c’ b’ c’ b’ + c a’ + b c b’ + c’ Conflict Clause Recording The idea: update the instance with conflict clauses, that is some of the clauses generated by resolution Introduced 
in SAT by Bayardo&Schrag, 1997 (rel_sat) a’ b’ a+b a a a’ b b’ b’ b c’ c b’ + c b’ + c’ b’ c’ b’ + c a’ + b c b’ + c’ Conflict Clause Recording Assume the brown clause below was recorded a’ b’ a+b a a a’ b b’ b’ b c’ c b’ + c b’ + c’ b’ c’ b’ + c a’ + b c b’ + c’ Conflict Clause Recording Assume the brown clause below was recorded The violet part would not have been explored It is redundant a’ b’ a+b a a a’ b b’ b’ b c’ c b’ + c b’ + c’ b’ c’ b’ + c a’ + b c b’ + c’ Conflict Clause Recording Assume the brown clause below was recorded The violet part would not have been explored It is redundant a’ b’ a+b a a a’ b b b’ c’ c b’ + c b’ + c’ b’ a’ + b Conflict Clause Recording Most of the modern solvers record every non-trivial parent clause (since Chaff) : recorded : not recorded a’ b’ c’ c a e’ b d’ d f’ f e g’ g Enhancing CCR: Local Conflict Clause Recording The parent-based scheme is asymmetric w.r.t polarity selection a’ b’ c’ c a e’ b d’ d f’ f e g’ g Enhancing CCR: Local Conflict Clause Recording The parent-based scheme is asymmetric w.r.t polarity selection Solution: record an additional local conflict clause: a would-be conflict clause if the last polarity selection was flipped Dershowitz&Hanna&Nadel, 2007 (Eureka) : local conflict clause a’ b’ c’ c a e’ b d’ d f’ f e g’ g Managing Conflict Clauses Keeping too many clauses slows down the solver Deleting irrelevant clauses is very important. Some of the strategies: Size-based: remove too long clauses Age-based: remove clauses that weren’t used for BCP Marques-Silva&Sakallah, 1996 (GRASP) Goldberg&Novikov, 2002 (Berkmin) Locality-based (glue): remove clauses, whose literals are assigned far away in the search tree Audemard&Simon, 2009 (Glucose) 55 Modern Conflict Analysis Next, we present the following two techniques, commonly used in modern SAT solvers: Non-chronological backtracking (NCB) 1UIP scheme GRASP GRASP&Chaff Both techniques prune the search tree and the associated forest of parent resolutions Non-Chronological Backtracking (NCB) NCB is an additional pruning operation before flipping: eliminate all the decision levels adjacent to the decision level of the flipped literal, so that the parent clause is still falsified e’ e d’ (e) a+b a’ b’ + c b’ + c’ a’ + b … b’ a+b a •Assume we are about to flip a b b’ c’ b’ + c c b’ + c’ Non-Chronological Backtracking (NCB) NCB is an additional pruning operation before flipping: eliminate all the decision levels adjacent to the decision level of the flipped literal, so that the parent clause is still falsified e’ e d’ (e) a+b a’ b’ + c b’ + c’ a’ + b … b’ a+b a •Assume we are about to flip a •Eliminate irrelevant decision levels b b’ c’ b’ + c c b’ + c’ Non-Chronological Backtracking (NCB) NCB is an additional pruning operation before flipping: eliminate all the decision levels adjacent to the decision level of the flipped literal, so that the parent clause is still falsified a+b a’ b’ + c b’ + c’ a’ + b … b’ a+b a a b b’ c’ b’ + c c b’ + c’ •Assume we are about to flip a •Eliminate irrelevant decision levels •Flip 1UIP Scheme 1UIP Scheme 1UIP scheme consists of: A stopping condition for backtracking: stop whenever P contains one variable of the last decision level, called the 1UIP variable 1UIP Scheme 1UIP scheme consists of: A stopping condition for backtracking: stop whenever P contains one variable of the last decision level, called the 1UIP variable a+b a’ b’ + c b’ + c’ a’ + b b’ a+b b P b’ c’ b’ + c c b’ + c’ 1UIP Scheme 1UIP scheme consists of: A stopping condition for backtracking: 
stop whenever P contains one variable of the last decision level, called the 1UIP variable A rewriting operation: consider the 1UIP variable as a decision variable and P as its parent clause a+b a’ b’ + c b’ + c’ a’ + b b’ a+b b P b’ c’ b’ + c c b’ + c’ 1UIP Scheme 1UIP scheme consists of: A stopping condition for backtracking: stop whenever P contains one variable of the last decision level, called the 1UIP variable A rewriting operation: consider the 1UIP variable as a decision variable and P as its parent clause a+b a’ b’ + c b’ + c’ a’ + b b’ a+b b P b’ c’ b’ + c c b’ + c’ 1UIP Scheme 1UIP scheme consists of: A stopping condition for backtracking: stop whenever P contains one variable of the last decision level, called the 1UIP variable A rewriting operation: consider the 1UIP variable as a decision variable and P as its parent clause a+b a’ b’ + c b’ + c’ a’ + b b’ a+b b b’ c’ b’ + c c b’ + c’ 1UIP Scheme 1UIP scheme consists of: A stopping condition for backtracking: stop whenever P contains one variable of the last decision level, called the 1UIP variable A rewriting operation: consider the 1UIP variable as a decision variable and P as its parent clause A pruning technique: eliminate all the disconnected variables of the last decision level (along with their parent resolutions) a+b a’ b’ + c b’ + c’ a’ + b b’ a+b b b’ c’ b’ + c c b’ + c’ 1UIP Scheme 1UIP scheme consists of: A stopping condition for backtracking: stop whenever P contains one variable of the last decision level, called the 1UIP variable A rewriting operation: consider the 1UIP variable as a decision variable and P as its parent clause A pruning technique: eliminate all the disconnected variables of the last decision level (along with their parent resolutions) a+b b’ + c b’ + c’ b a’ + b b’ b’ c’ b’ + c c b’ + c’ Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 68 Boolean Constraint Propagation The unit clause rule A clause is unit if all of its literals but one are assigned to 0. The remaining literal is unassigned, e.g.: a = 0, b = 1, c is unassigned a + b’ + c Boolean Constraint Propagation (BCP) Pick unassigned variables of unit clauses as decisions whenever possible 80-90% of running time of modern SAT solvers is spent in BCP Introduced already in the original DLL 69 Data Structures for Efficient BCP Naïve: for each clause hold pointers to all its literals How to minimize the number of clause visits? When can a clause become unit? All literals in a clause but one are assigned to 0 For an N-literal clause, this can only occur after N-1 of the literals have been assigned to 0 So, theoretically, one could completely ignore the first N-2 assignments to this clause. The solution: one picks two literals in each clause to watch and thus can ignore any assignments to the other literals in the clause. 
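To make the two-watched-literals idea concrete, below is a minimal, self-contained sketch of how a solver might visit a clause only when one of its two watched literals is falsified. The data layout and names (Solver, propagateFalsified, and so on) are invented for this example; they are not the code of SATO, Chaff, or any other solver, which use considerably more refined structures.

#include <vector>
#include <utility>
#include <cstdlib>
#include <iostream>

// Literals are non-zero ints: v or -v. A clause watches its first two literals.
struct Clause { std::vector<int> lits; };  // lits[0] and lits[1] are the watches

struct Solver {
    std::vector<Clause> clauses;
    std::vector<std::vector<int>> watch;   // per literal: ids of clauses watching it
    std::vector<int> value;                // per variable: 0 unassigned, 1 true, -1 false

    int idx(int lit) const { return 2 * std::abs(lit) + (lit < 0); }
    int litValue(int lit) const { int v = value[std::abs(lit)]; return lit > 0 ? v : -v; }

    // Called when literal 'falsified' has just become false (its variable was
    // assigned the opposite value). Returns false on conflict; collects literals
    // of clauses that became unit in unitsOut.
    bool propagateFalsified(int falsified, std::vector<int>& unitsOut) {
        std::vector<int> cls = watch[idx(falsified)];
        watch[idx(falsified)].clear();
        for (std::size_t k = 0; k < cls.size(); ++k) {
            Clause& c = clauses[cls[k]];
            // Put the falsified watch at position 1.
            if (c.lits[0] == falsified) std::swap(c.lits[0], c.lits[1]);
            // Try to move the watch to an unfalsified, currently unwatched literal.
            bool moved = false;
            for (std::size_t i = 2; i < c.lits.size(); ++i) {
                if (litValue(c.lits[i]) >= 0) {          // unassigned or true
                    std::swap(c.lits[1], c.lits[i]);
                    watch[idx(c.lits[1])].push_back(cls[k]);
                    moved = true;
                    break;
                }
            }
            if (moved) continue;
            watch[idx(falsified)].push_back(cls[k]);     // no replacement: keep watching
            if (litValue(c.lits[0]) == -1) {             // other watch is false too: conflict
                for (std::size_t j = k + 1; j < cls.size(); ++j)
                    watch[idx(falsified)].push_back(cls[j]);
                return false;
            }
            if (litValue(c.lits[0]) == 0) unitsOut.push_back(c.lits[0]);  // clause is unit
        }
        return true;
    }
};

int main() {
    Solver s;
    s.value.assign(10, 0);
    s.watch.assign(20, {});
    // Clause (a + b + c) with a=1, b=2, c=3; it watches a and b.
    s.clauses.push_back({{1, 2, 3}});
    s.watch[s.idx(1)].push_back(0);
    s.watch[s.idx(2)].push_back(0);
    std::vector<int> units;
    s.value[1] = -1;                       // assign a = 0
    s.propagateFalsified(1, units);        // the watch moves from a to c; b untouched
    s.value[3] = -1;                       // assign c = 0
    s.propagateFalsified(3, units);        // clause becomes unit: b is forced
    for (int u : units) std::cout << "unit literal: " << u << '\n';
    return 0;
}

In this sketch, assigning a = 0 only moves the watch from a to c, and the clause is not visited again until c is also falsified, at which point it is detected as unit – the same kind of watch movement shown in the example on the following slides. The scheme itself is credited as follows: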
Introduced by Zhang, 1997 (SATO solver); enhanced by Moskewicz& Madigan&Zhao&Zhang&Malik, 2001 (Chaff) 70 Watched Lists : Example a W b c d e f g h W 71 Watched Lists : Example a W b c d e f g h a’ W 72 Watched Lists : Example a b c W d e f g h a’ W •The clause is visited •The corresponding watch moves to any unassigned literal •No pointers to the previously visited literals are saved 73 Watched Lists : Example a b W c d e f g h W a’ c’ 74 Watched Lists : Example a b c d W e f g h W a’ c’ •The clause is not visited! 75 Watched Lists : Example a b W c d e f g h a’ W c’ e’ g’ 76 Watched Lists : Example a b c d W e f g h a’ W c’ e’ g’ •The clause is not visited! 77 Watched Lists : Example a b c d e f g W h a’ W c’ e’ g’ h’ 78 Watched Lists : Example a b c W d e f g h a’ W c’ e’ g’ h’ •The clause is visited •The corresponding watch moves to any unassigned literal •No pointers to the previously visited literals are saved 79 Watched Lists : Example a b c d e f W g h a’ W c’ e’ g’ h’ f’ 80 Watched Lists : Example a b W c d e f g h a’ W c’ e’ g’ h’ f’ 81 Watched Lists : Example a b W c d e f g h a’ W c’ e’ g’ h’ f’ b’ 82 Watched Lists : Example a b W c d e f g h a’ W c’ e’ g’ h’ f’ b’ •The watched literal b is visited. It is identified that the clause became unit! 83 Watched Lists : Example a b W c d e f g h a’ W c’ e’ g’ h’ f’ b’ Backtrack • b is unassigned : the watches do not move • No need to visit the clause during backtracking! 84 Watched Lists : Example a b W c d e f g h a’ W c’ e’ g’ h’ f’ Backtrack b’ • f is unassigned : the watches do not move 85 Watched Lists : Example a b d e f g h a’ W c’ e’ Backtrack W c g’ h’ f’ b’ • When all the literals are unassigned, the watches pointers do not get back to their initial positions 86 Watched Lists : Caching Chu&Harwood&Stuckey, 2008 Divide the clauses into various cache levels to improve cache performance Most of the modern solvers put one literal of each clause in the WL Special data structures for clauses of length 2 and 3 87 Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 88 Decision Heuristics Which literal should be chosen at each decision point? Critical for performance! 
Old-Days’ Static Decision Heuristics Go over all clauses that are not satisfied Compute some function f(A) for each literal— based on frequency Choose literal with maximal f(A) Variable-based Dynamic Heuristics: VSIDS VSIDS was the first dynamic heuristic (Chaff) Each literal is associated with a counter Initialized to number of occurrences in input Counter is increased when the literal participates in a conflict clause Occasionally, counters are halved Literal with the maximal counter is chosen Breakthrough compared to static heuristics: Dynamic: focuses search on recently used variables and clauses Extremely low overhead Enhancements to VSIDS Adjusting the scope: increase the scores for every literal in the newly generated parent resolution (Berkmin) Additional dynamicity: multiply scores by 95% after each conflict, rather than occasionally halve the scores Eén&Sörensson, 2003 (Minisat) 92 The Clause-Based Heuristic (CBH) The idea: use relevant clauses for guiding the decision heuristic The Clause-Based Heuristic or CBH (Eureka) All the clauses (both initial and conflict clauses) are organized in a list The next variable is chosen from the top-most unsatisfied clause After a conflict: All the clauses that participate in the newly derived parent resolution are moved to the top, then The conflict clause is placed at the top Partial clause-based heuristics: Berkmin, HaifaSAT CBH: More CBH is even more dynamic than VSIDS: prefers variables from very recent conflicts CBH tends to pick interrelated variables: Variables whose joint assignment increases the chances of: Satisfying clauses in satisfiable branches Quickly reaching conflicts in unsatisfiable branches Variables appearing in the same clause are interrelated: Picking variables from the same clause, results in either that: the clause becomes satisfied, or there’s a contradiction 94 Polarity Selection Phase Saving: Strichman, 2000; Pipatsrisawat&Darwiche, 2007 (RSAT) Assign a new decision variable the last polarity it was assigned: dynamicity rules again 95 Decision Heuristics: the Current Status Everybody uses phase saving Most of the SAT solvers use VSIDS Intel’s Eureka uses CBH for most of the instances and VSIDS for tiny instances only We plan to compare VSIDS and CBH thoroughly in our new solver Fiver 96 Core SAT Solving: the Major Enhancements to DLL Boolean Constraint Propagation Conflict Analysis and Learning Decision Heuristics Restart Strategies Pre- and Inter- Processing The slides on restarts are based on Vadim Ryvchin’s SAT’08 presentation 97 Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 98 Restarts Restarts: the solver backtracks to decision level 0, when certain criteria are met crucial impact on performance Motivation: Dynamicity: refocus the search on relevant data Variables identified as important will be pick first by the decision heuristic after the restart Avoid spending too much time in ‘bad’ branches 99 Restart Criteria Restart after a certain number of conflicts has been encountered either: Since the previous restart: global Higher than a certain decision level: local Gomes&Selman&Kautz, 1998 Ryvchin&Strichman, 2008 Next: methods to calculate the threshold on the number of conflicts Holds for both global and local schemes 
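As a concrete illustration of the global restart criterion above (restart once a given number of conflicts has accumulated since the previous restart), here is a small sketch. The threshold schedule used is the Luby series discussed on the following slides, computed with the standard recurrence; the structure and names (RestartPolicy, onConflict) are invented for the example and do not come from any particular solver.

#include <cstdint>
#include <iostream>

// i-th element (1-based) of the Luby sequence: 1 1 2 1 1 2 4 1 1 2 1 1 2 4 8 ...
uint64_t luby(uint64_t i) {
    // If i == 2^k - 1 the value is 2^(k-1); otherwise recurse on the tail.
    for (uint64_t k = 1; ; ++k) {
        if (i == (1ULL << k) - 1) return 1ULL << (k - 1);
        if (i < (1ULL << k) - 1) return luby(i - (1ULL << (k - 1)) + 1);
    }
}

struct RestartPolicy {
    uint64_t unit;            // base threshold, e.g. 512 for Luby(512)
    uint64_t restarts = 0;    // restarts performed so far
    uint64_t conflicts = 0;   // conflicts since the previous restart

    bool onConflict() {       // call after every conflict; true = restart now
        if (++conflicts < unit * luby(restarts + 1)) return false;
        conflicts = 0;
        ++restarts;
        return true;          // the solver would backtrack to decision level 0
    }
};

int main() {
    for (int i = 1; i <= 8; ++i) std::cout << luby(i) << ' ';   // 1 1 2 1 1 2 4 1
    std::cout << '\n';
    RestartPolicy p{512};
    uint64_t simulated = 0;
    while (!p.onConflict()) ++simulated;                        // conflicts before the first restart
    std::cout << "first restart after " << simulated + 1 << " conflicts\n";   // 512
    return 0;
}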
100 Restart Strategies Arithmetic (or fixed) series. Parameters: x, y. Init(t) = x Next(t) = t + y Examples: Arithm(2000, 0), Arithm(1000, 10) (plot: threshold vs. restart number for both series) 101 Restart Strategies (cont.) Luby et al. series. Parameter: x. Init(t) = x Next(t) = ti*x, where ti is the i-th element of the Luby sequence: ti = 1 1 2 1 1 2 4 1 1 2 1 1 2 4 8 1 1 2 1 1 2 4 1 1 2 1 1 2 4 8 16 1 1 2 1 1 2 4 1 1 2 1 1 2 4 8 … Ruan&Horvitz&Kautz, 2003 (plot: threshold vs. restart number for Luby(512)) 102 Restart Strategies (cont.) Inner-Outer Geometric series. Parameters: x, y, z. Init(t) = x; if (t*y < z) then Next(t) = t*y, else Next(t) = x and Next(z) = z*y (plot: threshold vs. restart number for Inner-Outer(100, 1.1, 100)) Armin Biere, 2007 (Picosat) 103 Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 104 Preprocessing and Inprocessing The idea: Simplify the formula prior to (pre-) and during (in-) the search History: Freeman, 1995 (POSIT): first mention of preprocessing in the context of SAT Eén&Biere, 2005 (SatELite): a commonly used efficient preprocessing procedure Heule&Järvisalo&Biere (2010-2012): a series of papers on inprocessing Used in the current state-of-the-art solvers Lingeling and CryptoMinisat Nadel&Ryvchin&Strichman (2012): apply SatELite in incremental SAT solving 105 Inprocessing Techniques SatELite: Subsumption: remove clause (C+D) if (C) exists Self-subsuming resolution: replace (D+l’) by (D) if (C+l) exists such that C ⊆ D Variable elimination: apply DP to variables whose elimination does not increase the number of clauses Example: (a+b)(a+b’)(a’+c)(a’+c’) → (a)(a’+c)(a’+c’) Example of other techniques: Failed literal elimination with BCP: Repeat for a certain subset of literals on decision level 0: Propagate a literal l with BCP.
If a conflict emerges, l must be 0, so the formula can be simplified 106 Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 107 Extensions to SAT Nowadays, SAT solving is much more than finding one solution to a given problem Extensions to SAT: Incremental SAT under assumptions Simultaneous SAT (SSAT): SAT over multiple properties at once Diverse solution generation Minimal Unsatisfiable Core (MUC) extraction Push/pop support Model minimization ALL-SAT XOR clauses support ISSAT: assumptions are implications … 108 Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 109 Incremental SAT Solving under Assumptions The challenge: speed up solving of related SAT instances by enabling re-use of relevant data Incremental SAT solving has numerous applications Next, we review a prominent application in Formal Verification of Hardware 110 Reasoning about Circuit Properties with SAT-based Bounded Model Checking (BMC) BMC: given a circuit and a property, does the property hold for the first n cycles? Unroll: generate a combinational instantiation of the circuit for each cycle Run a SAT solver for each cycle over: The translation of the unrolled circuit to CNF The negation of the property at that cycle The property holds for n cycles iff all the SAT solver invocations return UNSAT 111 BMC Example a b g h c The property: b’→h’ BMC Example: Cycle 0 g a b h c The property: b’→h’ a b ci g h A user-given initial value BMC Example: Cycle 0 g a b h c The property: b’→h’ g + a’ + b’ g’ + a g’ + b g a b ci The negation of the property b’→h’: h h + g’ + ci’ h’ + g h’ + ci b’ h UNSAT! BMC Example: Cycle 1 g a b h c The property: b’→h’ a b ci g h cx ax bx gx hx BMC Example: Cycle 1 g a b h c The property: b’→h’ The negation of the property bx’→hx’: g + a’ + b’ g’ + a g’ + b g a b ci h h + g’ + ci’ h’ + g h’ + ci cx gx + ax’ + bx’ gx’ + ax gx’ + bx ax gx bx cx + h’ cx’ + h bx’ hx hx hx + gx’ + cx’ hx’ + gx hx’ + cx UNSAT!
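A schematic driver loop for the per-cycle BMC flow above might look as follows. CnfFormula, unrollCircuitToCnf, negatePropertyAtCycle, and isSatisfiable are hypothetical placeholders (stubbed out here so the fragment compiles); in a real flow they would translate the unrolled netlist to CNF and invoke an actual SAT solver.

#include <iostream>
#include <vector>

// Hypothetical placeholders: empty stubs that only show where the CNF
// translation and the SAT-solver calls would go.
struct CnfFormula { std::vector<std::vector<int>> clauses; };

CnfFormula unrollCircuitToCnf(int /*cycle*/)    { return {}; }  // circuit clauses for cycles 0..cycle
CnfFormula negatePropertyAtCycle(int /*cycle*/) { return {}; }  // clauses encoding the negated property
bool isSatisfiable(const CnfFormula&)           { return false; }  // stand-in for one SAT-solver invocation

// BMC up to bound n: the property holds for the first n cycles iff every
// per-cycle query (unrolled circuit plus negated property) is UNSAT.
bool bmcHolds(int n) {
    for (int cycle = 0; cycle < n; ++cycle) {
        CnfFormula query = unrollCircuitToCnf(cycle);
        CnfFormula neg = negatePropertyAtCycle(cycle);
        query.clauses.insert(query.clauses.end(), neg.clauses.begin(), neg.clauses.end());
        if (isSatisfiable(query)) {
            std::cout << "counterexample at cycle " << cycle << "\n";
            return false;   // the property fails at this depth
        }
    }
    return true;            // UNSAT at every cycle 0..n-1: the property holds up to n
}

int main() { std::cout << (bmcHolds(3) ? "holds up to bound" : "fails") << "\n"; }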
Re-Using Relevant Information from Previous Cycles C0 and C1 hold globally S0 and S1 hold solely for a particular cycle a b ci g h C0: cycle 0 C1: cycle 1 g + a’ + b’ g’ + a g’ + b gx + ax’ + bx’ gx’ + ax gx’ + bx h + g’ + ci’ h’ + g h’ + ci hx + gx’ + cx’ hx’ + gx hx’ + cx b’ h bx’ hx cx S0: cycle 0-specific bx hx cx + h’ cx’ + h S1: cycle 1-specific The property: b’h’ 117 Pervasive Clause Learning; MarquesSilva&Sakallah, 1997 (GRASP); Strichman, 2001 C0: cycle 0 g + a’ + b’ g’ + a g’ + b gx + ax’ + bx’ gx’ + ax gx’ + bx h + g’ + ci’ h’ + g h’ + ci hx + gx’ + cx’ hx’ + gx hx’ + cx b’ h bx’ hx S0: cycle 0-specific cx + h’ cx’ + h S1: cycle 1-specific Cycle 0: create a CNF instance C0 S0 and solve it C1: cycle 1 Let C0* be the set of pervasive conflict clauses, that is conflict clauses that depend only on C0 Cycle 1: create a CNF instance C0 C1 S1 C0* and solve it 118 Pervasive Clause Learning; MarquesSilva&Sakallah, 1997 (GRASP); Strichman, 2001 C0: cycle 0 g + a’ + b’ g’ + a g’ + b C1: cycle 1 C0* a + h’ h + g’ + ci’ h’ + g h’ + ci b’ h hx + gx’ + cx’ hx’ + gx hx’ + cx bx’ hx cx + h’ cx’ + h S1: cycle 1-specific Cycle 0: create a CNF instance C0 S0 and solve it g S0: cycle 0-specific gx + ax’ + bx’ gx’ + ax gx’ + bx Let C0* be the set of pervasive conflict clauses, that is conflict clauses that depend only on C0 Cycle 1: create a CNF instance C0 C1 S1 C0* and solve it 119 Incremental SAT Solving under Assumptions; Eén&Sörensson, 2003 (Minisat) C0: cycle 0 g + a’ + b’ g’ + a g’ + b gx + ax’ + bx’ gx’ + ax gx’ + bx h + g’ + ci’ h’ + g h’ + ci hx + gx’ + cx’ hx’ + gx hx’ + cx b’ h bx’ hx S0: cycle 0-specific cx + h’ cx’ + h S1: cycle 1-specific Cycle 0: create a CNF instance C0 and solve it under the assumptions S0 C1: cycle 1 S0 clauses are not part of the instance, instead: The literals of S0 are used as the first decision, or assumptions The solver stops, whenever one of the assumptions must be flipped Cycle 1: add the clauses C1 to the same instance and solve under the assumptions S1 120 Incremental SAT Solving: More Minisat’s method is the state-of-the-art Advantages: GRASP’s method advantage Re-uses a single solver instance: heuristics are incremental All the clauses are re-used Assumptions are unit clauses: preprocessing can use them to simplify the formula Incremental SAT solving was not compatible with preprocessing Nadel&Ryvchin&Strichman 2012: Make incremental SAT solving compatible with SatELite Show a way to treat assumptions efficiently 121 Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 122 Simultaneous SAT (SSAT) A SAT-based algorithm to efficiently solve chunks of related properties in one SAT solver invocation For example, one can solve multiple properties during BMC Khasidashvili&Nadel&Palti&Hanna, 2005 Khasidashvili&Nadel, 2011 123 Example: Solve Both p1 and p2 p1 C1 p2 C2 Incremental SAT-based Approach p1 C1 p2 C2 Translate C1 to CNF formula F Solve F under the assumption p1’ Update F with clause projection of C2\C1 Solve F under the assumption p2’ SSAT Approach p1 C1 p2 C2 Translate both C1 and C2 to CNF formula F Find the status of both p1 and p2 in the same invocation of the SAT solver Advantages of SSAT approach to Incremental SAT-based Approach Looks at all the properties at 
once One solution can falsify more than one property May find conflict clauses (lemmas) relevant for solving many POs SSAT: the Algorithm Interface Input A combinational formula F (in CNF) A list of proof objectives (POs) p1,p2,…,pn Output Each pi is either falsifiable (a model of F with pi = 0 exists, i.e., F ∧ pi’ is SAT) or valid (pi always holds given F, i.e., F ∧ pi’ is UNSAT) 128 SSAT Algorithm Interface Example F = (a + b) ∧ c’ ∧ a’ POs: a, b, c, a’, b’, c’ a is falsifiable: a = 0; b = 1; c = 0 is the model b is valid: there is no model of F where b = 0 In other words, (a + b) ∧ c’ ∧ a’ ∧ b’ is UNSAT c is falsifiable: a = 0; b = 1; c = 0 is the model a’ is valid: no model of F where a = 1 b’ is falsifiable with a = 0; b = 1; c = 0 c’ is valid: no model of F where c = 1 • Both l and l’ may be falsifiable •Example: F = a + b; PO: a 129 Basic SSAT Algorithm SSAT(F; P={p1,p2,…,pn}) F is initialized with the clause projection of the union of the cones of all the properties While (P is non-empty) Pick any s ∈ P Solve F under the assumption s’ If satisfiable by a satisfying assignment σ T := {s} ∪ {other POs in P falsified by σ} Return to the user that the POs in T are falsifiable P := P \ T If unsatisfiable Return that s is valid P := P \ {s} SSAT: More How to boost SSAT Take further advantage of reasoning about all the POs at once Pick all the POs as decision variables and assign them 0 Fairness: rotate unsolved POs Set an internal time threshold for an attempt to solve one PO When the threshold expires: Move the unresolved PO to the end of the unsolved-POs list Switch to another PO SSAT is widely used at Intel Applied as the core reasoning engine for simultaneous model checking algorithms we developed Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 132 DiversekSet: Generating Diverse Solutions DiversekSet in SAT: generate a user-given number of diverse solutions, given a CNF formula Nadel, 2011 The problem has multiple applications at Intel 133 Application: Semi-formal FPV (figure: deep bugs reached from new initial states lying beyond the max FV bound of the original initial states) Multi-Threaded Search to Enhance Coverage Choosing a single path through waypoints may miss the bug Must search along multiple diverse paths Diversification Quality as the Average Hamming Distance Quality: the average Hamming distance between the solutions, normalized to [0…1] Example: m = 4 solutions over q = 3 variables (a b c): σ1 = 0 0 0, σ2 = 1 1 0, σ3 = 0 1 1, σ4 = 1 0 0 Hamming distances matrix: D(σ1,σ2) = 2, D(σ1,σ3) = 2, D(σ1,σ4) = 1, D(σ2,σ3) = 2, D(σ2,σ4) = 1, D(σ3,σ4) = 3 Q = 2 / (q·m·(m−1)) · Σ_{i=1..m} Σ_{j=i+1..m} D(σi,σj) = 11/18 ≈ 0.61 Algorithms for DiversekSet in SAT in a Glance The idea: Adapt a modern CDCL SAT solver for DiversekSet Make minimal changes to remain efficient Compact algorithms: Invoke the SAT solver once to generate all the solutions Restart after a solution is generated Modify the polarity and variable selection heuristics for generating diverse solutions Algorithms for DiversekSet in SAT in a Glance Cont. Polarity-based algorithms: Change solely the polarity selection heuristic pRand: pick the polarity randomly pGuide: pick the polarity so as to improve the diversification quality pGuide outperforms pRand in terms of both diversification quality and performance Quality can be improved further by taking BCP into account and adapting the variable ordering Balance the number of 0’s and 1’s assigned to a variable by picking 0 (respectively 1) as the polarity when the variable has so far been assigned 1 (respectively 0) more times Agenda Introduction Early Days of SAT Solving Core SAT Solving Conflict Analysis and Learning Boolean Constraint Propagation Decision Heuristics Restart Strategies Inprocessing Extensions to SAT Incremental SAT Solving under Assumptions Simultaneous Satisfiability (SSAT) Diverse Solutions Generation High-level (group-oriented) MUC Extraction 146 Unsatisfiable Core Extraction An unsatisfiable core is an unsatisfiable subset of an unsatisfiable set of constraints An unsatisfiable core is minimal if removal of any constraint makes it satisfiable (a local minimum) Has numerous applications Example Application: Proof-based Abstraction Refinement for Model Checking; McMillan et al.,’03; Gupta et al.,’03 Inputs: model M, property P Output: does P hold under M? Abstraction turns latches/gates into free inputs Start with an empty abstract model A := { } Model check A: if P is valid in A – No Bug If a counterexample C at depth k is found, run BMC(M,P,k): if C is not spurious – Bug If C is spurious – refine: A := A ∪ latches/gates in the UNSAT core of BMC(M,P,k), and repeat The UNSAT core is used for refinement The UNSAT core is required in terms of latches/gates Example Application 2: Assumption Minimization for Compositional Formal Equivalence Checking (FEC); Cohen et al.,’10 (figure labels: Assumption, Assertion, Outputs) FEC verifies the equivalence between the design (RTL) and its implementation (schematics). The whole design is too large to be verified at once. FEC is done on small sub-blocks, restricted with assumptions. Assumptions required for the proof of equivalence of sub-blocks must be proved relative to the driving logic. MUC extraction in terms of assumptions is vital for feasibility.
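One common way to obtain a core directly in terms of interesting constraints is to add a fresh selector literal to every clause of each interesting constraint and solve under the selectors as assumptions; on an UNSAT answer, the assumptions involved in the final conflict identify a (not necessarily minimal) high-level core. The sketch below only illustrates this general idea; the SatSolver interface is a made-up stub, and this is not the specific algorithms cited on the following slides.

#include <iostream>
#include <set>
#include <vector>

// Hypothetical, stubbed solver interface: solveUnderAssumptions() answers the
// query, and after an UNSAT answer conflictingAssumptions() returns the subset
// of assumption literals the solver actually used. Real incremental solvers
// expose equivalent functionality; the names here are invented.
struct SatSolver {
    void addClause(const std::vector<int>&) {}
    bool solveUnderAssumptions(const std::vector<int>&) { return false; }
    std::vector<int> conflictingAssumptions() { return {}; }
};

// Add selector s_i to every clause of interesting constraint i (clause + ¬s_i),
// load the remainder as-is, and solve under the assumptions s_1..s_n. On UNSAT,
// the failed assumptions give a high-level core, which can then be shrunk
// further toward a minimal one.
std::set<int> highLevelCore(SatSolver& solver, const std::vector<int>& selectors) {
    if (solver.solveUnderAssumptions(selectors)) return {};   // satisfiable: no core
    std::set<int> core;
    for (int s : solver.conflictingAssumptions()) core.insert(s);
    return core;                                              // selectors of core groups
}

int main() {
    SatSolver solver;
    std::vector<int> selectors = {101, 102, 103};             // one per interesting constraint
    std::set<int> core = highLevelCore(solver, selectors);
    std::cout << "core size: " << core.size() << "\n";
    return 0;
}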
Inputs Traditionally, a Clause-Level UC Extractor is the Workhorse Clause-level UC extraction: given a CNF formula, extract an unsatisfiable subset of its clauses F = ( a + b ) ( b’ + c ) (c’ ) (a’ + c ) ( b + c ) ( a + b + c’ ) U1 = ( a + b ) (b’ + c ) ( c’ ) ( a’ + c ) ( b + c ) ( a + b + c’ ) U2 = ( a + b ) ( b’ + c ) ( c’ ) ( a’ + c ) ( b + c ) ( a + b + c’ ) U3 = ( a + b ) ( b’ + c ) ( c’ ) ( a’ + c ) ( b + c ) ( a + b + c’ ) Dozens of papers on clause-level UC extraction since 2002 Traditional UC Extraction for Practical Needs: the Input The user is interested in a MUC in terms of these constraints An interesting constraint The remainder (the rest of the formula) Traditional UC Extraction: Example Input 1 Proof-based abstraction refinement An unrolled latch The rest of the unrolled circuit Traditional UC Extraction: Example Input 1 Assumption minimization for FEV An assumption Equivalence between sub-block RTL and implementation Traditional UC Extraction: Stage 1: Translate to Clauses An interesting constraint The remainder (the rest of the formula) Each small square is a propositional clause, e.g. (a + b’) Traditional UC Extraction: Stage 2: Extract a Clause-Level UC An interesting constraint The remainder (the rest of the formula) Colored squares belong to the clause-level UC Traditional UC Extraction: Stage 3: Map the Clause-Level UC Back to the Interesting Constraints An interesting constraint The remainder (the rest of the formula) The UC contains three interesting constraints High-Level Unsatisfiable Core Extraction Real-world applications require reducing the number of interesting constraints in the core rather than clauses Latches for abstraction refinement Assumptions for compositional FEV Most of the algorithms for UC extraction are clause-level High-level UC: extracting a UC in terms of interesting constraints only Liffiton&Sakallah, 2008; Nadel, 2010; Ryvchin&Strichman, 2011 Small/Minimal Clause-Level UC Small/Minimal High-Level UC A small clause-level UC, but the high-level UC is the largest possible: A large clause-level UC, but the high-level UC is empty: High-Level Unsatisfiable Core Extraction: Main Results Minimal UC extraction: high-level algorithms solve Intel families that are out of reach for clause-level algorithms Non-minimal UC extraction: high-level algorithms are preferable 2-3x boost on difficult benchmarks Thanks! 160