Revisiting Generalizations

Ken McMillan
Microsoft Research
Aws Albarghouthi
University of Toronto
Generalization
• Interpolants are generalizations
• We use them as a way of forming conjectures and
lemmas
• Many proof search methods use interpolants as
generalizations
• SAT/SMT solvers
• CEGAR, lazy abstraction, etc.
• Interpolant-based invariants, IC3, etc.
There is widespread use of the idea that interpolants are generalizations,
and generalizations help to form proofs
In this talk
• We will consider measures of the quality of these
generalizations
• I will argue:
• The evidence for these generalizations is often weak
• This motivates a retrospective approach: revisiting prior
generalizations in light of new evidence
• This suggests new approaches to interpolation and
proof search methods that use it.
We’ll consider how these ideas may apply in SMT and IC3.
Criteria for generalization
• A generalization is an inference that in some way
covers a particular case
Example: a learned clause in a SAT solver
• We require two properties of a generalization:
• Correctness: it must be true
• Utility: it must make our proof task easier
A useful inference is one that occurs in a simple proof
Let us consider what evidence we might produce for correctness and
utility of a given inference…
What evidence can we provide?
• Evidence for correctness:
• Proof (best)
• Bounded proof (pretty good)
• True in a few cases (weak)
• Evidence for utility:
• Useful for one truth assignment
• Useful for one program path
Interpolants as generalizations
• Consider a bounded model checking formula:
𝐼 ∧ 𝑇0 ∧ 𝑇1 ∧ 𝑇2 ∧ 𝑇3 ∧ 𝑇4 ∧ 𝑇5 ∧ ¬π‘ƒ
(with the interpolant taken after the first three steps, between 𝑇2 and 𝑇3)
• Evidence for interpolant as conjectured invariant
• Correctness: true after 3 steps
• Utility: implies property after three steps
• Note the duality
• In invariant generation, correctness and utility are time-reversal duals.
CDCL SAT solvers
• A learned clause is a generalization (interpolant)
• Evidence for correctness:
• Proof by resolution (strong!)
• Evidence for utility:
• Simplifies proof of current assignment (weak!)
• In fact, CDCL solvers produce many clauses of low utility
that are later deleted.
• Retrospection
• CDCL has a mechanism of revisiting prior generalization
in the light of new evidence
• This is called “non-chronological backtracking”
Retrospection in CDCL
(Diagram: the decision stack π‘Ÿ, …, π‘₯, ¬π‘₯, ending in a conflict ⊥.)
• A conflict yields learned clause 𝐢1, which contradicts the decision ¬π‘₯: backtrack to just before ¬π‘₯.
• A later conflict yields learned clause 𝐢2, which contradicts the much earlier decision π‘Ÿ: backtrack to just before π‘Ÿ.
• 𝐢1 drops out of the proof on backtrack!
• CDCL can replace an old generalization with a new
one that covers more cases
• That is, utility evidence of the new clause is better
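As a concrete illustration, here is a minimal Python sketch of the standard backjump computation: after learning a clause, the solver backtracks to the second-highest decision level among the clause's literals. The signed-integer literal encoding and the `level` map are assumptions made for illustration, not part of the talk.

```python
# Hypothetical sketch: compute the non-chronological backtrack level.
# Literals are signed integers (DIMACS style); level[v] is the decision
# level at which variable v was assigned.
def backjump_level(learned_clause, level):
    levels = sorted({level[abs(lit)] for lit in learned_clause}, reverse=True)
    if len(levels) <= 1:
        return 0                 # unit (or empty) clause: back to the root level
    return levels[1]             # second-highest level is the backjump target

# Example: a clause over variables assigned at levels 2, 5, and 9
# backtracks to level 5, skipping levels 6-8 entirely.
assert backjump_level([3, -7, 12], {3: 2, 7: 5, 12: 9}) == 5
```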
Retrospection in CEGAR
CEGAR considers one counterexample path at a time.
• This path has two possible refutations of equal utility: use predicate 𝑃 here (𝑃/¬π‘ƒ), or predicate 𝑄 (𝑄/¬π‘„).
• The next path can only be refuted using 𝑄.
• Greater evidence for 𝑄 than for 𝑃: two cases against one.
• However, most CEGAR methods cannot remove
the useless predicate 𝑃 at this point
Retrospection and Lazy SMT
• Lazy SMT is a form of CEGAR
• “program path” → truth assignment (disjunct in DNF)
• A theory lemma is a generalization (interpolant)
• Evidence for correctness: proof
• Evidence for utility: Handles one disjunct
• Can lead to many irrelevant theory lemmas
• Difficulties of retrospection in lazy SMT
• Incrementally revising theory lemmas
• Architecture may prohibit useful generalizations
Diamond example with lazy SMT
π‘₯4 < π‘₯0
π‘₯1 = π‘₯0 + 1 π‘₯2 = π‘₯1 + 1 π‘₯3 = π‘₯2 + 1 π‘₯4 = π‘₯3 + 1
∨
π‘₯1 = π‘₯0
∨
∨
π‘₯2 = π‘₯1
π‘₯3 = π‘₯2
∨
π‘₯4 = π‘₯3
• Theory lemmas correspond to program paths:
¬(π‘₯1 = π‘₯0 + 1 ∧ π‘₯2 = π‘₯1 + 1 ∧ π‘₯3 = π‘₯2 ∧ π‘₯4 = π‘₯3 ∧ π‘₯4 < π‘₯0 )
¬(π‘₯1 = π‘₯0 ∧ π‘₯2 = π‘₯1 ∧ π‘₯3 = π‘₯2 + 1 ∧ π‘₯4 = π‘₯3 + 1 ∧ π‘₯4 < π‘₯0 )
… (16 lemmas, exponential in number of diamonds)
• Lemmas have low utility because each covers only one case
• Lazy SMT framework does not allow higher utility inferences
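For concreteness, here is a small sketch of the diamond formula in Z3's Python API (the `z3` package is an assumed dependency). The formula is linear in size, but a lazy SMT solver refutes it path by path, producing up to one theory lemma per path:

```python
# Sketch: the 4-diamond BMC formula. Unsatisfiable, but lazy SMT may
# need up to 2^4 theory lemmas, one per path through the diamonds.
from z3 import Ints, Or, Solver, unsat

x = Ints('x0 x1 x2 x3 x4')
s = Solver()
for i in range(4):
    # each diamond: either increment or copy the previous variable
    s.add(Or(x[i + 1] == x[i] + 1, x[i + 1] == x[i]))
s.add(x[4] < x[0])               # error condition; unreachable, so unsat
assert s.check() == unsat
```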
Diamond example (cont.)
π‘₯4 < π‘₯0
π‘₯1 = π‘₯0 + 1
π‘₯2 = π‘₯1 + 1
π‘₯3 = π‘₯2 + 1
π‘₯4 = π‘₯3 + 1
π‘₯1 = π‘₯0
π‘₯2 = π‘₯1
π‘₯3 = π‘₯2
π‘₯4 = π‘₯3
π‘₯1 ≥ π‘₯0
π‘₯2 ≥ π‘₯0
π‘₯3 ≥ π‘₯0
• We can produce higher utility inferences by
structurally decomposing the problem
• Each covers many paths
• Proof is linear in number of diamonds
Compositional SMT
• To prove unsatisfiability of 𝐴 ∧ 𝐡 …
• Infer an interpolant 𝐼 such that 𝐴 → 𝐼 and 𝐡 → ¬πΌ.
• The interpolant decomposes the proof structurally
• Enumerate disjuncts (samples) of 𝐴, 𝐡 separately.
1. 𝑆𝐴 , 𝑆𝐡 ← ∅
2. Choose 𝐼 so 𝑆𝐴 → 𝐼 and 𝑆𝐡 → ¬πΌ (choosing 𝐼 to cover the samples as simply as possible)
3. If not 𝐴 → 𝐼 (using the SMT solver as a black box), add a disjunct to 𝑆𝐴 and go to 2
4. If not 𝐡 → ¬πΌ, add a disjunct to 𝑆𝐡 and go to 2
5. Otherwise, 𝐴 ∧ 𝐡 is unsatisfiable!
With each new sample, we reconsider the interpolant to maximize utility (a sketch of the loop follows).
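A sketch of this loop in Python with Z3 as the black-box solver. Representing 𝐴 and 𝐡 as explicit lists of disjuncts, and leaving the separator synthesis abstract (`choose_interpolant` is a placeholder for the LP-based method described later), are simplifying assumptions:

```python
# Sketch of the compositional interpolation loop (assumes the z3 package).
from z3 import Solver, Not, sat

def implies(f, g):
    s = Solver()
    s.add(f, Not(g))
    return s.check() != sat              # f -> g is valid iff f & ~g is unsat

def compositional_interpolant(A_disjuncts, B_disjuncts, choose_interpolant):
    SA, SB = [], []                      # samples seen so far
    while True:
        # assumes the samples remain separable; if choose_interpolant
        # ever fails, A & B is satisfiable
        I = choose_interpolant(SA, SB)   # cover the samples as simply as possible
        # counterexample to A -> I: a disjunct of A not covered by I
        d = next((d for d in A_disjuncts if not implies(d, I)), None)
        if d is not None:
            SA.append(d)
            continue
        # counterexample to B -> ~I: a disjunct of B overlapping I
        d = next((d for d in B_disjuncts if not implies(d, Not(I))), None)
        if d is not None:
            SB.append(d)
            continue
        return I                         # A -> I and B -> ~I: interpolant found
```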
Example in linear rational arithmetic
𝐴 = (π‘₯ ≤ 1 ∧ 𝑦 ≤ 3)          [𝐴1]
  ∨ (1 ≤ π‘₯ ≤ 2 ∧ 𝑦 ≤ 2)      [𝐴2]
  ∨ (2 ≤ π‘₯ ≤ 3 ∧ 𝑦 ≤ 1)      [𝐴3]
𝐡 = (π‘₯ ≥ 2 ∧ 𝑦 ≥ 3)          [𝐡1]
  ∨ (π‘₯ ≥ 3 ∧ 2 ≤ 𝑦 ≤ 3)      [𝐡2]
• 𝐴 and 𝐡 can be seen as sets of convex polytopes. An interpolant 𝐼 is a separator for these sets.
Compositional approach
1. Choose two samples from 𝐴 and 𝐡 and compute an interpolant 𝑦 ≤ 2.5
2. The point (1,3) is in 𝐴 but not in 𝐼: add new sample 𝐴1 containing the point (1,3) and update the interpolant to π‘₯ + 𝑦 ≤ 4
3. The interpolant now covers all disjuncts
Notice we reconsidered our first interpolant choice in light of further evidence.
Comparison to Lazy SMT
• Interpolant from a lazy SMT solver proof:
π‘₯ ≤ 2∧ 𝑦 ≤ 2 ∨
( π‘₯ ≤ 1∨ 𝑦 ≤ 1 ∧
( π‘₯ ≤ 2∧ 𝑦 ≤ 2 ∨
(π‘₯ ≤ 1 ∨ 𝑦 ≤ 1)))
• Each half-space corresponds to a theory lemma
• Theory lemmas have low utility
• Four lemmas cover six cases
Why is the simpler proof better?
• A simple fact that covers many cases may indicate
an emerging pattern...
• Greater complexity allows overfitting
• Especially important in invariant generation
Finding simple interpolants
• We break this problem into two parts
• Search for large subsets of the samples that can be
separated by linear half-spaces.
• Synthesize an interpolant as a Boolean combination of
these separators.
The first part can be accomplished by well-established
methods, using an LP solver and Farkas’ lemma. The Boolean
function synthesis problem is also well studied, though we may
wish to use more light-weight methods.
Farkas’ lemma and linear separators
• Farkas’ lemma says that inconsistent rational linear
constraints can be refuted by summation:
π‘Ž (π‘₯ − 𝑦 ≤ 0)
𝑏 (2𝑦 − 2π‘₯ ≤ −1)
(0 ≤ −1)
constraints
π‘Ž≥0
𝑏≥0
π‘Ž − 2𝑏 = 0
2𝑏 − π‘Ž = 0
−𝑏 < 0
solve
π‘Ž=2
𝑏=1
• The proof of unsat can be found by an LP solver
• We can use this to discover a linear interpolant for
two sets of convex polytopes 𝑆𝐴 and 𝑆𝐡 .
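The multiplier constraints above form a small LP. A sketch with SciPy (an assumed dependency), normalizing the strict inequality −𝑏 < 0 to 𝑏 ≥ 1:

```python
# Find the Farkas multipliers for the example with an LP solver.
# Variables a, b >= 0; the x coefficient of a*(x - y) + b*(2y - 2x)
# must vanish (the y coefficient 2b - a is the same constraint), and
# the right-hand side -b must be negative, normalized here to b >= 1.
from scipy.optimize import linprog

res = linprog(c=[1, 1],                        # any objective; minimize a + b
              A_eq=[[1, -2]],                  # coefficient of x: a - 2b = 0
              b_eq=[0],
              bounds=[(0, None), (1, None)])   # a >= 0, b >= 1
a, b = res.x
print(a, b)                                    # -> 2.0 1.0
```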
Finding separating half-spaces
• Use LP to simultaneously solve for:
• A linear separator of the form 𝒄𝒙 ≤ 𝑏
• A proof that 𝐴𝑖 → 𝐼 for each 𝐴𝑖 in 𝑆𝐴
• A proof that 𝐡𝑖 → ¬πΌ for each 𝐡𝑖 in 𝑆𝐡
• The separator 𝐼 is an interpolant for 𝑆𝐴 ∧ 𝑆𝐡
• The evidence for utility of 𝐼 is the size of 𝑆𝐴 and 𝑆𝐡
• Thus, we search for large sample sets that can be linearly
separated.
• We can also make 𝐼 simpler by setting as many
coefficients in 𝑐 to zero as possible.
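A simplified sketch of the LP search: here the samples are finite point sets rather than polytopes (for polytopes, Farkas' lemma supplies the entailment proofs), and we ask for a margin-1 separator 𝒄𝒙 ≤ 𝑏. SciPy is again an assumed dependency:

```python
# Sketch: find a half-space c.x <= b separating two labeled point sets,
# as a feasibility LP with unit margin. A simplification of the talk's
# polytope setting, for illustration only.
import numpy as np
from scipy.optimize import linprog

def separating_halfspace(A_pts, B_pts):
    A_pts, B_pts = np.asarray(A_pts, float), np.asarray(B_pts, float)
    n = A_pts.shape[1]
    # variables: c (n coords) and b; constraints:
    #   c.x - b <= -1 for x in A_pts,   -c.x + b <= -1 for x in B_pts
    A_ub = np.vstack([np.hstack([A_pts, -np.ones((len(A_pts), 1))]),
                      np.hstack([-B_pts, np.ones((len(B_pts), 1))])])
    b_ub = -np.ones(len(A_pts) + len(B_pts))
    res = linprog(c=np.zeros(n + 1), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (n + 1))
    # returns (c, b) with c.x <= b on the A side, or None if inseparable
    return (res.x[:n], res.x[n]) if res.success else None
```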
Half-spaces to interpolants
• When every pair of samples in 𝑆𝐴 × π‘†π΅ is separated by some half-space, we can build an interpolant as a Boolean combination.
(Diagram: samples 𝐴1, 𝐴2, 𝐡1, 𝐡2 in an arrangement of half-spaces. Each region is a cube over the half-space predicates: regions containing 𝐴 samples must be true, regions containing 𝐡 samples must be false, and the remaining regions are don't-cares.)
In practice, we don’t have to synthesize an optimal combination
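A small sketch of the combination step: label each sample with its sign vector (cube) over the discovered half-spaces. If every 𝐴/𝐡 pair is separated by some half-space, no cube is shared, and the disjunction of the 𝐴-side cubes is an interpolant. No optimization is attempted here; the helper names are illustrative:

```python
# Build a Boolean combination of half-spaces covering the A samples.
def sign_vector(pt, halfspaces):          # halfspaces: list of (c, b) pairs
    return tuple(sum(ci * xi for ci, xi in zip(c, pt)) <= b
                 for c, b in halfspaces)

def interpolant_cubes(A_pts, B_pts, halfspaces):
    a_cubes = {sign_vector(p, halfspaces) for p in A_pts}
    b_cubes = {sign_vector(p, halfspaces) for p in B_pts}
    # guaranteed disjoint if every A/B pair is separated by some half-space
    assert a_cubes.isdisjoint(b_cubes)
    return a_cubes                        # DNF: one cube per distinct A region
```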
Sequential verification
• We can extend our notions of evidence and
retrospection to sequential verification
• A “case” may be some sequence of program steps
• Consider a simple sequential program:
x = y = 0;
while(*)
x++; y++;
while(x != 0)
x--; y--;
assert (y <= 0);
Wish to discover invariant:
{y ≤ x}
Execute the loops twice:

{True}
x = y = 0;
x++; y++;
x++; y++;
[x!=0]; x--; y--;
[x!=0]; x--; y--;
[x == 0];
[y > 0];
{False}

Choose interpolants at each step, in the hope of obtaining an inductive invariant.
• One choice: {y = 0}, {y = 1}, {y = 2}, {y = 1}, {y = 0}. These predicates have low utility, since each covers just one case. As a result, we “overfit” and do not discover the emerging pattern.
• The alternative: {x ≥ y} (that is, {y ≤ x}) at every step. These interpolants cover all the cases with just one predicate. In fact, they are inductive (see the sketch below).
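A quick Z3 sketch (assuming the `z3` package) confirming that {y ≤ x} is inductive for both loops and implies the assertion:

```python
# Check that y <= x is an inductive invariant of the inc/dec program.
from z3 import Ints, Implies, And, Not, Solver, unsat

x, y, x1, y1 = Ints('x y x1 y1')
inv = y <= x

def valid(f):
    s = Solver(); s.add(Not(f)); return s.check() == unsat

assert valid(Implies(And(x == 0, y == 0), inv))            # initiation
assert valid(Implies(And(inv, x1 == x + 1, y1 == y + 1),
                     y1 <= x1))                            # first loop body
assert valid(Implies(And(inv, x != 0, x1 == x - 1, y1 == y - 1),
                     y1 <= x1))                            # second loop body
assert valid(Implies(And(inv, x == 0), y <= 0))            # final assertion
```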
Sequential interpolation strategy
• Compute interpolants for all steps simultaneously
• Collect 𝐴 (pre) and 𝐡 (post) samples at each step
• Utility of a half-space is measured by how many sample pairs it separates in total.
(Plot over π‘₯, 𝑦:)
• Step 0: low evidence: 𝑦 ≤ 0 (bad!)
• Step 1: better evidence: 𝑦 ≤ π‘₯ (good!)
Are simple interpolants enough?
• Occam’s razor says simplicity prevents over-fitting
• This is not the whole story, however
• Consider different separators of similar complexity
(Plot over π‘₯, 𝑦: two separators of the samples, 𝑦 ≤ π‘₯ (good!) and 𝑦 < (3/2)π‘₯ (yikes!).)
Simplicity of the interpolant may not be sufficient for a good generalization.
Proof simplicity criterion
• Empirically, the strange separators seem to be
avoided by minimizing proof complexity
• Maximize zero coefficients in Farkas proofs
• This can be done in a greedy way
• Note the difference between applying Occam’s razor
in synthetic and analytic cases
• We are generalizing not from data but from an argument
• We want the argument to be as simple as possible
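A greedy sketch of this heuristic, applied here to the separator's coefficients (a simplification: the talk minimizes zero coefficients in the Farkas proofs themselves). It reuses `separating_halfspace` from the earlier sketch; forcing coefficient 𝑖 to zero is equivalent to separating the samples with coordinate 𝑖 dropped:

```python
# Greedily zero out separator coefficients while separation survives.
# Reuses separating_halfspace from the earlier sketch (a hypothetical
# helper defined above, not a library function).
import numpy as np

def zeroable_coefficients(A_pts, B_pts):
    A_pts, B_pts = np.asarray(A_pts, float), np.asarray(B_pts, float)
    n = A_pts.shape[1]
    zeroed = []
    for i in range(n):
        keep = [j for j in range(n) if j != i and j not in zeroed]
        # a separator with these coefficients forced to zero exists iff
        # the projections onto the remaining coordinates are separable
        if keep and separating_halfspace(A_pts[:, keep], B_pts[:, keep]):
            zeroed.append(i)
    return zeroed   # coordinates whose coefficients can be set to zero
```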
Value of retrospection
• 30 small programs over integers
• Tricky inductive invariants involving linear constraints
• Some disjunctive, most conjunctive
Tool:       Comp. SMT   CPAChecker   UFO   InvGen with AI   InvGen w/o AI
% solved:   100         57           57    70               60
• Better evidence for generalizations
• Better fits the observed cases
• Results in better convergence
• Still must trade off cost v. utility in generalizing
• Avoid excessive search while maintaining convergence
Value of retrospection (cont)
• Bounded model checking of inc/dec program
• Apply compositional SMT to the BMC unfolding
• First half of unfolding is 𝐴, second half is 𝐡
• Compare to standard lazy SMT using Z3
(Plot: the exponential number of theory lemmas is achieved in practice; computing one interpolant gives a power-½ speedup.)
Incremental inductive methods (IC3)
• Compute a sequence interpolant by local proof:
  𝐹0 ⇒ 𝐹1 ⇒ 𝐹2 ⇒ 𝐹3 ⇒ 𝐹4 ⇒ 𝐹5 ⇒ 𝐹6, with 𝐹0 = 𝐼, along transitions 𝑇0 … 𝑇6
  (the interpolant successively weakens along the sequence)
• Check: 𝐹6 ∧ 𝑇6 ⇒ 𝑃′. A counterexample state 𝑠′ creates the goal ¬π‘  at frame 6; then check 𝐹5 ∧ 𝑇5 ⇒ ¬π‘ ′, and so on, pushing counterexamples backward until a check is valid.
• Inductive generalization: to discharge a goal ¬π‘  at frame 5, infer a clause 𝑐 satisfying:
  𝐼 ⇒ 𝑐
  𝐹4 ∧ 𝑐 ∧ 𝑇4 ⇒ 𝑐′
  𝑐 ⇒ ¬π‘ 
  That is, 𝑐 is inductive relative to 𝐹4 (a check sketched below); 𝑐 is conjoined to the frames up to 𝐹5.
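A toy Z3 sketch of the relative inductiveness check. The two-variable transition system, the frame, and the bad state are made-up examples, purely for illustration:

```python
# Relative inductiveness check: init => c, F & c & T => c', c => ~bad.
from z3 import Bools, And, Not, Implies, Solver, unsat, substitute, BoolVal

a, b, a1, b1 = Bools('a b a1 b1')

def valid(f):
    s = Solver(); s.add(Not(f)); return s.check() == unsat

def prime(f):                        # rename state variables to next-state
    return substitute(f, (a, a1), (b, b1))

init = And(Not(a), Not(b))
T = And(a1 == Not(a), b1 == b)       # toy transition: a toggles, b stays
F = BoolVal(True)                    # current frame (toy)
bad = b                              # bad state s
c = Not(b)                           # candidate clause

def relatively_inductive(c, F, bad):
    return (valid(Implies(init, c)) and
            valid(Implies(And(F, c, T), prime(c))) and
            valid(Implies(c, Not(bad))))

assert relatively_inductive(c, F, bad)
```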
Generalization in IC3
¬π‘ 
𝑐
𝑐
𝑇0
𝐹0
Inductive generalization
𝑐
𝑐
𝑐
𝑇1
𝐹1
𝑇2
𝐹2
𝑇3
𝐹3
𝑐
𝑇4
𝐹4
(bad state)
𝑐
𝑇5
𝐹5
𝑇6
𝐹6
• Think of inferred clauses as speculated invariants
• Evidence for correctness
• Clause is proved up to π‘˜ steps
• Relative inductiveness (except initial inferences)
• Clause is “as good as” previous speculations
• Evidence for utility
• Rules out one bad sample
• Note the asymmetry here
• Correctness and utility are dual
• IC3 prefers correctness to utility
• Is the cost paid for this evidence too high?
Retrospection in IC3?
• Find a relatively inductive clause that covers as
many bad states as possible
𝐼 ⇒ 𝑐
𝐹4 ∧ 𝑐 ∧ 𝑇4 ⇒ 𝑐′
𝑐 ⇒ ¬π‘  ∧ ¬π‘ ′ ∧ β‹―
• Search might be inefficient, since each test is a SAT
problem
• Possible solution: relax correctness evidence
• Sample on both A and B side
• Correctness evidence is invariance in a sample of
behaviors rather than all up to depth k.
Generalization problem
• Find a clause to separate A and B samples
A samples (reachable states):  ¬π‘Ž ∧ 𝑏 ∧ ¬π‘,   ¬π‘Ž ∧ ¬π‘ ∧ 𝑐,   π‘Ž ∧ ¬π‘ ∧ 𝑐
B samples (bad states):        π‘Ž ∧ 𝑏 ∧ ¬π‘,   π‘Ž ∧ 𝑏 ∧ 𝑐,   ¬π‘Ž ∧ 𝑏 ∧ 𝑐
Separator: ¬π‘Ž ∨ ¬π‘ (true in all A samples, false in many B samples)
This is another two-level logic optimization problem. This suggests logic
optimization techniques could be applied to computing interpolants in
incremental inductive style.
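A lightweight greedy sketch of this separation problem (not an exact binate-covering solver): pick literals that cover the remaining 𝐴 samples, preferring those false in many 𝐡 samples. On the samples above it finds ¬π‘ ∨ ¬π‘Ž:

```python
# Hypothetical sketch: greedily build a clause that is true in every
# A-sample and false in as many B-samples as possible. Samples are dicts
# mapping variable name -> bool; a literal is a (var, polarity) pair.
def lit_true(sample, lit):
    var, pol = lit
    return sample[var] == pol

def greedy_clause(a_samples, b_samples, variables):
    clause, uncovered = [], list(a_samples)
    while uncovered:
        lits = [(v, p) for v in variables for p in (True, False)]
        # primary: satisfy many uncovered A-samples; tie-break: be false
        # in many B-samples (higher utility)
        best = max(lits, key=lambda l: (
            sum(lit_true(s, l) for s in uncovered),
            sum(not lit_true(s, l) for s in b_samples)))
        clause.append(best)
        uncovered = [s for s in uncovered if not lit_true(s, best)]
    return clause

A = [dict(a=0, b=1, c=0), dict(a=0, b=0, c=1), dict(a=1, b=0, c=1)]
B = [dict(a=1, b=1, c=0), dict(a=1, b=1, c=1), dict(a=0, b=1, c=1)]
print(greedy_clause(A, B, 'abc'))   # -> [('b', False), ('a', False)]
```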
Questions to ask
• For a given inference technique we can ask
• Where does generalization occur?
• Is there good evidence for generalizations?
• Can retrospection be applied?
• What form do separators take?
• How can I solve for separators? At what cost?
• Theories: arrays, nonlinear arithmetic,…
• Can I improve the cost?
• Shift the tradeoff of utility and correctness
• Can proof complexity be minimized?
• Ultimately, is the cost of better generalization
justified?
Conclusion
• Many proof search methods rely on interpolants as
generalizations
• Useful to the extent they make the proof simpler
• Evidence of utility in existing methods is weak
• Usually amounts to utility in one case
• Can lead to many useless inferences
• Retrospection: revisit inferences on new evidence
• For example, non-chronological backtracking
• Allows more global view of the problem
• Reduces commitment to inferences based on little
evidence
Conclusion (cont)
• Compositional SMT
• Modular approach to interpolation
• Find simple proofs covering many cases
• Constraint-based search method
• Improves convergence of invariant discovery
• Exposes emerging pattern in loop unfoldings
• Think about methods in terms of
• The form of generalization
• Quality of evidence for correctness and utility
• Cost v. benefit of the evidence provided
Cost of retrospection
• Early retrospective approach due to Anubhav Gupta
• Finite-state localization abstraction method
• Finds optimal localization covering all abstract cex’s.
• In practice, “quick and dirty” often better than optimal
• Compositional SMT usually slower than direct SMT
• However, if bad generalizations imply divergence,
then the cost of retrospection is justified.
• Need to understand when revisiting generalizations is
justified.
Inference problem
• Find a clause to separate A and B samples
• Clause must have a literal from every A sample
• Literals must be false in many B samples
• This is a binate covering problem
• Well studied in logic synthesis
• Related to Bloem et al.
• Can still use relative inductive strengthening
• Better utility and evidence