Revisiting Generalizations
Ken McMillan, Microsoft Research
Aws Albarghouthi, University of Toronto

Generalization
• Interpolants are generalizations
• We use them as a way of forming conjectures and lemmas
• Many proof search methods use interpolants as generalizations
  • SAT/SMT solvers
  • CEGAR, lazy abstraction, etc.
  • Interpolant-based invariants, IC3, etc.
There is widespread use of the idea that interpolants are generalizations, and that generalizations help to form proofs.

In this talk
• We will consider measures of the quality of these generalizations
• I will argue:
  • The evidence for these generalizations is often weak
  • This motivates a retrospective approach: revisiting prior generalizations in light of new evidence
  • This suggests new approaches to interpolation and to the proof search methods that use it
We'll consider how these ideas may apply in SMT and IC3.

Criteria for generalization
• A generalization is an inference that in some way covers a particular case
  • Example: a learned clause in a SAT solver
• We require two properties of a generalization:
  • Correctness: it must be true
  • Utility: it must make our proof task easier
A useful inference is one that occurs in a simple proof.
Let us consider what evidence we might produce for correctness and utility of a given inference…

What evidence can we provide?
• Evidence for correctness:
  • Proof (best)
  • Bounded proof (pretty good)
  • True in a few cases (weak)
• Evidence for utility:
  • Useful for one truth assignment
  • Useful for one program path

Interpolants as generalizations
• Consider a bounded model checking formula:
  Init(s₀) ∧ T(s₀,s₁) ∧ T(s₁,s₂) ∧ ⋯ ∧ ¬P(s₅), with an interpolant I extracted partway along the unfolding (here, after 3 steps)
• Evidence for the interpolant as a conjectured invariant:
  • Correctness: true after 3 steps
  • Utility: implies the property after 3 steps
• Note the duality
  • In invariant generation, correctness and utility are time-reversal duals
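To make the two kinds of evidence concrete, here is a minimal sketch, assuming Python with z3 (z3py). The toy increment/decrement system and the candidate interpolant y ≤ x are illustrative choices echoing the program example later in the talk, not part of the original slides.

```python
# Minimal sketch: check the two kinds of evidence for a candidate interpolant
# in a BMC unfolding (assumes z3py; the system and candidate are illustrative).
from z3 import Ints, And, Or, Implies, Not, Solver, unsat

def valid(fml):
    # A formula is valid iff its negation is unsatisfiable.
    s = Solver()
    s.add(Not(fml))
    return s.check() == unsat

def init(x, y):
    return And(x == 0, y == 0)

def trans(x, y, x1, y1):
    # Either increment both counters, or decrement both when x != 0.
    return Or(And(x1 == x + 1, y1 == y + 1),
              And(x != 0, x1 == x - 1, y1 == y - 1))

def prop(x, y):
    return Implies(x == 0, y <= 0)

state = [Ints('x%d y%d' % (i, i)) for i in range(7)]   # s0 .. s6

def path(i, j):
    # Conjunction of the transition steps from state i to state j.
    return And(*[trans(*state[k], *state[k + 1]) for k in range(i, j)])

x3, y3 = state[3]
I = y3 <= x3                                           # candidate interpolant

# Correctness evidence: I is true after 3 steps from the initial states.
print(valid(Implies(And(init(*state[0]), path(0, 3)), I)))    # True

# Utility evidence: from I, the property holds 3 steps later.
print(valid(Implies(And(I, path(3, 6)), prop(*state[6]))))    # True
```

Both checks pass here, which is exactly the correctness/utility evidence pair from the slide, and the time-reversal duality is visible in the symmetry of the two queries.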
CDCL SAT solvers
• A learned clause is a generalization (interpolant)
• Evidence for correctness:
  • Proof by resolution (strong!)
• Evidence for utility:
  • Simplifies proof of the current assignment (weak!)
• In fact, CDCL solvers produce many clauses of low utility that are later deleted
• Retrospection
  • CDCL has a mechanism for revisiting prior generalizations in the light of new evidence
  • This is called "non-chronological backtracking"

Retrospection in CDCL
[Figure: the decision stack. A conflict yields learned clause C₁, which contradicts the decision ¬x, so we backtrack to that level; a later conflict yields learned clause C₂, which contradicts an earlier decision, so we backtrack further and C₁ drops out of the proof.]
• CDCL can replace an old generalization with a new one that covers more cases
• That is, the utility evidence for the new clause is better

Retrospection in CEGAR
• CEGAR considers one counterexample path at a time
[Figure: two program paths. The first path has two possible refutations of equal utility, using predicate p or predicate q; the next path can only be refuted using q.]
• Greater evidence for q than for p
  • Two cases against one
• However, most CEGAR methods cannot remove the useless predicate p at this point

Retrospection and Lazy SMT
• Lazy SMT is a form of CEGAR
  • "program path" → truth assignment (disjunct in DNF)
• A theory lemma is a generalization (interpolant)
  • Evidence for correctness: proof
  • Evidence for utility: handles one disjunct
• Can lead to many irrelevant theory lemmas
• Difficulties of retrospection in lazy SMT
  • Incrementally revising theory lemmas
  • Architecture may prohibit useful generalizations

Diamond example with lazy SMT
  (x₁ = x₀ + 1 ∨ x₁ = x₀)
∧ (x₂ = x₁ + 1 ∨ x₂ = x₁)
∧ (x₃ = x₂ + 1 ∨ x₃ = x₂)
∧ (x₄ = x₃ + 1 ∨ x₄ = x₃)
∧ x₄ < x₀
• Theory lemmas correspond to program paths:
  ¬(x₁ = x₀ + 1 ∧ x₂ = x₁ + 1 ∧ x₃ = x₂ ∧ x₄ = x₃ ∧ x₄ < x₀)
  ¬(x₁ = x₀ ∧ x₂ = x₁ ∧ x₃ = x₂ + 1 ∧ x₄ = x₃ + 1 ∧ x₄ < x₀)
  … (16 lemmas, exponential in the number of diamonds)
• Lemmas have low utility because each covers only one case
• The lazy SMT framework does not allow higher-utility inferences

Diamond example (cont.)
• We can produce higher-utility inferences by structurally decomposing the problem:
  x₁ ≥ x₀, x₂ ≥ x₀, x₃ ≥ x₀ (and hence x₄ ≥ x₀, contradicting x₄ < x₀)
• Each covers many paths
• The proof is linear in the number of diamonds

Compositional SMT
• To prove unsatisfiability of A ∧ B…
• Infer an interpolant I such that A → I and B → ¬I
• The interpolant decomposes the proof structurally
• Enumerate disjuncts (samples) of A and B separately, using the SMT solver as a black box:
  S_A, S_B ← ∅
  repeat:
    choose I so that S_A → I and S_B → ¬I, covering the samples as simply as possible
    if not A → I, add a disjunct to S_A and continue…
    if not B → ¬I, add a disjunct to S_B and continue…
    otherwise, A ∧ B is unsatisfiable!
• With each new sample, we reconsider the interpolant to maximize utility
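A minimal sketch of this sample-driven loop, assuming Python with z3; `choose_separator` is a hypothetical placeholder for the sample-fitting step described on the following slides, and A and B are given as explicit lists of disjuncts.

```python
# Sketch of the compositional loop (assumes z3py; choose_separator is a
# hypothetical stand-in for the "fit the samples simply" step).
from z3 import Solver, Or, Not, sat, is_true

def failing_disjunct(disjuncts, goal):
    # If (∨ disjuncts) → goal is not valid, return a disjunct witnessing
    # the failure; otherwise return None.
    s = Solver()
    s.add(Or(*disjuncts), Not(goal))
    if s.check() != sat:
        return None
    m = s.model()
    for d in disjuncts:
        if is_true(m.eval(d, model_completion=True)):
            return d        # the countermodel lies in this disjunct

def compositional_interpolant(A_disjuncts, B_disjuncts, choose_separator):
    SA, SB = [], []
    while True:
        # Refit the interpolant to all samples seen so far.
        I = choose_separator(SA, SB)
        a = failing_disjunct(A_disjuncts, I)
        if a is not None:
            SA.append(a)    # new evidence on the A side
            continue
        b = failing_disjunct(B_disjuncts, Not(I))
        if b is not None:
            SB.append(b)    # new evidence on the B side
            continue
        return I            # A → I and B → ¬I, so A ∧ B is unsatisfiable
```

On the diamond example, a loop of this shape can converge to inferences like x₃ ≥ x₀ after a few samples, instead of enumerating all 16 path lemmas.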
Example in linear rational arithmetic
  A = (x ≤ 1 ∧ y ≤ 3)          [A₁]
    ∨ (1 ≤ x ≤ 2 ∧ y ≤ 2)      [A₂]
    ∨ (2 ≤ x ≤ 3 ∧ y ≤ 1)      [A₃]
  B = (x ≥ 2 ∧ y ≥ 3)          [B₁]
    ∨ (x ≥ 3 ∧ 2 ≤ y ≤ 3)      [B₂]
• A and B can be seen as sets of convex polytopes
• An interpolant I is a separator for these sets

Compositional approach
1. Choose two samples from A and B and compute an interpolant: y ≤ 2.5
2. The point (1,3) is in A but not in I, so add the new sample A₁ containing it and update the interpolant to x + y ≤ 4
3. The interpolant now covers all disjuncts
Notice that we reconsidered our first interpolant choice in light of further evidence.

Comparison to Lazy SMT
• Interpolant from a lazy SMT solver proof:
  (x ≤ 2 ∧ y ≤ 2) ∨ ((x ≤ 1 ∨ y ≤ 1) ∧ ((x ≤ 2 ∧ y ≤ 2) ∨ (x ≤ 1 ∨ y ≤ 1)))
• Each half-space corresponds to a theory lemma
• Theory lemmas have low utility
  • Four lemmas cover six cases

Why is the simpler proof better?
• A simple fact that covers many cases may indicate an emerging pattern (say, further disjuncts B₃, B₄ continuing the sequence)
• Greater complexity allows overfitting
• Especially important in invariant generation

Finding simple interpolants
• We break this problem into two parts:
  • Search for large subsets of the samples that can be separated by linear half-spaces
  • Synthesize an interpolant as a Boolean combination of these separators
• The first part can be accomplished by well-established methods, using an LP solver and Farkas' lemma
• The Boolean function synthesis problem is also well studied, though we may wish to use more lightweight methods

Farkas' lemma and linear separators
• Farkas' lemma says that inconsistent rational linear constraints can be refuted by summation with nonnegative multipliers:
  a · (x − y ≤ 0)
  b · (2y − 2x ≤ −1)
  sum:  (0 ≤ −1)
  constraints: a ≥ 0, b ≥ 0, a − 2b = 0, 2b − a = 0, −b < 0
  solve: a = 2, b = 1
• The proof of unsat can be found by an LP solver
• We can use this to discover a linear interpolant for two sets of convex polytopes S_A and S_B

Finding separating half-spaces
• Use LP to simultaneously solve for:
  • A linear separator I of the form c·x ≤ d
  • A proof that Aᵢ → I for each Aᵢ in S_A
  • A proof that Bᵢ → ¬I for each Bᵢ in S_B
• The separator I is an interpolant for S_A ∧ S_B
• The evidence for utility of I is the size of S_A and S_B
  • Thus, we search for large sample sets that can be linearly separated
• We can also make I simpler by setting as many coefficients in c to zero as possible (see the sketch below)
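As a concrete illustration of the Farkas refutation above, a minimal sketch that solves for the multipliers a and b, assuming Python with z3's exact rational arithmetic standing in for a dedicated LP solver.

```python
# Sketch: find Farkas multipliers for  a*(x - y <= 0) + b*(2y - 2x <= -1)
# (assumes z3py as a stand-in for an LP solver).
from z3 import Reals, Solver, sat

a, b = Reals('a b')
s = Solver()
s.add(a >= 0, b >= 0)     # multipliers must be nonnegative
s.add(a - 2*b == 0)       # coefficient of x in the sum must vanish
s.add(2*b - a == 0)       # coefficient of y in the sum must vanish
s.add(-b < 0)             # right-hand side of the sum must be negative
assert s.check() == sat
print(s.model())          # one solution: a = 2, b = 1, giving 0 <= -1
```

The same encoding extends to the separator search: treat the coefficients c and d of the half-space as additional unknowns, with one set of Farkas multipliers per sample polytope witnessing Aᵢ → (c·x ≤ d) and another per Bᵢ witnessing Bᵢ → ¬(c·x ≤ d).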
Half-spaces to interpolants
• When every pair of samples in S_A × S_B is separated by some half-space, we can build an interpolant as a Boolean combination of the half-spaces
[Figure: the half-spaces partition the plane into regions, each region a cube over the half-spaces; regions containing A samples must be true, regions containing B samples must be false, and the rest are don't-cares.]
• In practice, we don't have to synthesize an optimal combination

Sequential verification
• We can extend our notions of evidence and retrospection to sequential verification
• A "case" may be some sequence of program steps
• Consider a simple sequential program:
    x = y = 0;
    while (*) { x++; y++; }
    while (x != 0) { x--; y--; }
    assert(y <= 0);
  We wish to discover the invariant {y ≤ x}.
• Execute the loops twice:
    {True}
    x = y = 0;
    x++; y++;
    x++; y++;
    [x != 0]; x--; y--;
    [x != 0]; x--; y--;
    [x == 0]; [y > 0]
    {False}
• Choose interpolants at each step, in the hope of obtaining an inductive invariant
  • One choice: {y = 0}, {y = 1}, {y = 2}, {y = 1}, {y = 0}
    • These predicates have low utility, since each covers just one case
    • As a result, we "overfit" and do not discover the emerging pattern
  • A better choice: {x ≥ y} at every step
    • These interpolants cover all the cases with just one predicate
    • In fact, they are inductive

Sequential interpolation strategy
• Compute interpolants for all steps simultaneously
• Collect A (pre) and B (post) samples at each step
• Utility of a half-space is measured by how many sample pairs it separates in total
[Figure: samples in the (x, y) plane. At step 0 there is low evidence; at step 1 there is better evidence for y ≤ x (good!) over y ≤ 0 (bad!).]

Are simple interpolants enough?
• Occam's razor says simplicity prevents over-fitting
• This is not the whole story, however
• Consider different separators of similar complexity:
  y ≤ x (good!) versus y < (3/2)x (yikes!)
• Simplicity of the interpolant may not be sufficient for a good generalization

Proof simplicity criterion
• Empirically, the strange separators seem to be avoided by minimizing proof complexity
  • Maximize zero coefficients in Farkas proofs
  • This can be done in a greedy way
• Note the difference between applying Occam's razor in synthetic and analytic cases
  • We are generalizing not from data but from an argument
  • We want the argument to be as simple as possible

Value of retrospection
• 30 small programs over integers
• Tricky inductive invariants involving linear constraints
• Some disjunctive, most conjunctive

  Tool             % solved
  Comp. SMT        100
  CPAchecker       57
  UFO              57
  InvGen with AI   70
  InvGen w/o AI    60

• Better evidence for generalizations
  • Better fits the observed cases
  • Results in better convergence
• Still must trade off cost vs. utility in generalizing
  • Avoid excessive search while maintaining convergence

Value of retrospection (cont.)
• Bounded model checking of the inc/dec program
• Apply compositional SMT to the BMC unfolding
  • The first half of the unfolding is A, the second half is B
• Compare to standard lazy SMT using Z3
[Plot: runtime vs. unfolding depth. The exponential number of theory lemmas is achieved in practice; computing one interpolant gives a power-½ speedup.]

Incremental inductive methods (IC3)
• Compute a sequence interpolant by local proof over frames
  I = F₀ ⇒ F₁ ⇒ F₂ ⇒ … ⇒ F₆, where each Fᵢ ∧ Tᵢ ⇒ Fᵢ₊₁
• Working backward from the goal: a failed check such as F₅ ∧ T₅ ⇒ ¬s₆ (Cex: t) yields the new goal ¬t one frame earlier; a valid check means the interpolant successively weakens
• Inductive generalization: infer a clause φ satisfying
    I ⇒ φ
    F₄ ∧ φ₄ ∧ T₄ ⇒ φ₅
    φ ⇒ ¬t
  that is, φ is inductive relative to F₄

Generalization in IC3
[Figure: frames F₀ … F₆ along the unfolding; inductive generalization infers a clause blocking a bad state at the last frame.]
• Think of inferred clauses as speculated invariants
• Evidence for correctness:
  • The clause is proved up to k steps
  • Relative inductiveness (except initial inferences)
    • The clause is "as good as" previous speculations
• Evidence for utility:
  • Rules out one bad sample
• Note the asymmetry here
  • Correctness and utility are dual
  • IC3 prefers correctness to utility
• Is the cost paid for this evidence too high?

Retrospection in IC3?
• Find a relatively inductive clause that covers as many bad states as possible:
    I ⇒ φ
    F₄ ∧ φ₄ ∧ T₄ ⇒ φ₅
    φ ⇒ ¬s ∧ ¬t ∧ ⋯
• Search might be inefficient, since each test is a SAT problem
• Possible solution: relax correctness evidence
  • Sample on both the A and B sides
  • Correctness evidence is invariance in a sample of behaviors, rather than in all behaviors up to depth k

Generalization problem
• Find a clause to separate the A and B samples:
    A samples (reachable states): ¬p ∧ q ∧ ¬r,  p ∧ q ∧ ¬r,  ¬p ∧ ¬q ∧ r
    B samples (bad states):        p ∧ q ∧ r,   p ∧ ¬q ∧ r,  ¬p ∧ q ∧ r
    Separator: ¬q ∨ ¬r (true in all A samples, false in many B samples)
• This is another two-level logic optimization problem
• This suggests logic optimization techniques could be applied to computing interpolants in incremental inductive style
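A minimal sketch of this separation step, in plain Python on the slide's sample cubes; the greedy literal selection is an illustrative heuristic, not the exact two-level optimization a logic synthesis tool would perform.

```python
# Greedy sketch of clause separation over the slide's samples (plain Python;
# a real tool would solve the two-level optimization exactly).
A = [{'p': 0, 'q': 1, 'r': 0}, {'p': 1, 'q': 1, 'r': 0}, {'p': 0, 'q': 0, 'r': 1}]
B = [{'p': 1, 'q': 1, 'r': 1}, {'p': 1, 'q': 0, 'r': 1}, {'p': 0, 'q': 1, 'r': 1}]

literals = [(v, s) for v in 'pqr' for s in (0, 1)]   # (variable, sign)

def lit_true(lit, cube):
    v, s = lit
    return cube[v] == s

# Repeatedly add the literal covering the most uncovered A samples,
# breaking ties toward literals false in more B samples.
clause, uncovered = [], list(A)
while uncovered:
    best = max(literals, key=lambda l: (sum(lit_true(l, a) for a in uncovered),
                                        sum(not lit_true(l, b) for b in B)))
    clause.append(best)
    uncovered = [a for a in uncovered if not lit_true(best, a)]

blocked = [b for b in B if not any(lit_true(l, b) for l in clause)]
print("clause:", clause)                       # a clause true in every A sample
print("blocks", len(blocked), "of", len(B), "B samples")
```

On these samples the greedy pass finds a two-literal clause that is true in all three A samples and false in two of the three B samples, the same quality of evidence as the slide's separator ¬q ∨ ¬r.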
Questions to ask
• For a given inference technique, we can ask:
  • Where does generalization occur?
  • Is there good evidence for generalizations?
  • Can retrospection be applied?
    • What form do separators take?
    • How can I solve for separators? At what cost?
    • Theories: arrays, nonlinear arithmetic, …
  • Can I improve the cost?
    • Shift the tradeoff of utility and correctness
    • Can proof complexity be minimized?
  • Ultimately, is the cost of better generalization justified?

Conclusion
• Many proof search methods rely on interpolants as generalizations
  • Useful to the extent they make the proof simpler
• Evidence of utility in existing methods is weak
  • Usually amounts to utility in one case
  • Can lead to many useless inferences
• Retrospection: revisit inferences on new evidence
  • For example, non-chronological backtracking
  • Allows a more global view of the problem
  • Reduces commitment to inferences based on little evidence

Conclusion (cont.)
• Compositional SMT
  • Modular approach to interpolation
  • Finds simple proofs covering many cases
  • Constraint-based search method
  • Improves convergence of invariant discovery
    • Exposes the emerging pattern in loop unfoldings
• Think about methods in terms of:
  • The form of generalization
  • The quality of evidence for correctness and utility
  • The cost vs. benefit of the evidence provided

Cost of retrospection
• An early retrospective approach is due to Anubhav Gupta
  • Finite-state localization abstraction method
  • Finds the optimal localization covering all abstract counterexamples
  • In practice, "quick and dirty" is often better than optimal
• Compositional SMT is usually slower than direct SMT
• However, if bad generalizations imply divergence, then the cost of retrospection is justified
• We need to understand when revisiting generalizations is justified

Inference problem
• Find a clause to separate the A and B samples
  • The clause must have a true literal in every A sample
  • Its literals must be false in many B samples
• This is a binate covering problem
  • Well studied in logic synthesis
  • Related to Bloem et al.
• Can still use relative inductive strengthening (see the sketch below)
  • Better utility and evidence
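A minimal sketch of the relative inductiveness test behind that strengthening, assuming Python with z3; the frame F, the candidate clause φ, and the transition relation T are illustrative, not from the talk.

```python
# Sketch of a relative inductiveness check (assumes z3py; the frame F, the
# candidate clause phi, and the transition relation T are illustrative).
from z3 import Bools, And, Or, Not, Solver, substitute, unsat

p, q, r, p1, q1, r1 = Bools('p q r p1 q1 r1')
cur, nxt = [p, q, r], [p1, q1, r1]

F   = Or(Not(p), Not(r))                        # current frame
phi = Or(Not(q), Not(r))                        # candidate separating clause
T   = And(p1 == q, q1 == And(p, q), r1 == r)    # toy transition relation

phi_next = substitute(phi, list(zip(cur, nxt)))

# phi is inductive relative to F iff  F ∧ phi ∧ T ⇒ phi'  is valid,
# i.e., its negation is unsatisfiable.  (A full IC3 check would also
# require Init ⇒ phi.)
s = Solver()
s.add(F, phi, T, Not(phi_next))
print("relatively inductive" if s.check() == unsat else "must weaken or drop phi")
```

Each such test is a single SAT/SMT query, which is why a retrospective search over clauses covering many bad states must budget these queries carefully.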