Quantified Invariants in Rich Domains using Model Checking and Abstract Interpretation
Anvesh Komuravelli, CMU
Joint work with Ken McMillan
© Anvesh Komuravelli

The Problem
Quantified Invariants!
Array-Manipulating Program P + Assertions
Automatic analysis for assertion failures
• Safe + Proof
• Unsafe + CEX
• Unknown + Partial Proof

Quantified Invariants, Typically
Specialized Abstract Domains
E.g. Segmentation abstraction, Indexed Predicate Abstraction, Points-to Analysis, etc.
• Restrictive
• False warnings
Unrestricted Model Checking
E.g. Interpolation-based
• Hard to find the right quantifiers
• Divergence
Rich-enough abstract domain?

The abstract domain
Quantified variables
    i := 0;
    while (i < n) {
      //
      a[i] := c;
      i++;
    }
    assume (0 ≤ k < n)
    assert (a[k] = c)
Predicate signature
Abstract Domain
Goal: Find a quantifier-free interpretation of the predicates

Guess-and-check doesn’t work anymore!
    i := 0;
    while (i < n) {
      //
      a[i] := c;
      i++;
    }
    assume (0 ≤ k < n)
    assert (a[k] = c)
Given a guess for P, how to check if it suffices?
FOL validity is undecidable!
Can we still use existing model checkers?

Let’s look at the VCs
    i := 0;
    while (i < n) {
      //
      a[i] := c;
      i++;
    }
    assume (0 ≤ k < n)
    assert (a[k] = c)

Let’s look at the VCs
Pulled to the outermost scope

Let’s look at the VCs
Real challenge!
Find a sufficient set of witnesses

Let’s look at the VCs
Reduces to quantifier-free invariant generation (use an off-the-shelf model checker)

Two Goals
Quantified variables
    i := 0;
    while (i < n) {
      //
      a[i] := c;
      i++;
    }
    assume (0 ≤ k < n)
    assert (a[k] = c)
Predicate signature
Abstract Domain
Goal 1: Find a sufficient set of witnesses for j
Goal 2: Find a quantifier-free interpretation of the predicates

A Strategy
Eager Syntactic Pattern Matching [BMR13]
Guess some witnesses
Check if they suffice using a model checker
Y: Found Proof
N: Give up!
• Unguided instantiation
• Worst-case unbounded
• Grows exponentially with number of quantified vars
• May choke the model checker
• No fall-back strategy
[BMR13]: On Solving Universally Quantified Horn Clauses, Bjørner, McMillan, Rybalchenko, SAS’13

Our Strategy
Guess some witnesses
Check if they suffice using a model checker
Y: Found Proof
N: CEX
Refine the guess
Constraint on the witness
Guess-and-check, but of the witnesses and not the invariant itself

Obtaining Strong Constraints
Generalized Counterexamples
Strong Constraints
Symbolic Counterexamples
• Number of variables = O(size)
• Constraint solving becomes harder (easily diverging)
Ground Counterexamples + Abstract Interpretation

Note – one witness suffices!
May not be expressible!
is equivalent to

Concrete vs. Abstract

The algorithm
[B] [L] [E]

The algorithm
[B] [L] [E]
B P(k0,v0,i0,c0) L P(k1,v1,i1,c1) L P(k2,v2,i2,c2) E
Instantiate, Check

The algorithm
B P(k0,v0,i0,c0) L P(k1,v1,i1,c1) L P(k2,v2,i2,c2) E
Instantiate, Check, Analyze

The algorithm
B ? L P(0,0,0,0) P(0,0,1,0) ? L ? P(0,0,2,0) E ?
Instantiate, Check, Analyze

The algorithm
B ?
L P(0,0,0,0) P(0,1,0,0) ? L ? P(0,2,0,0) E ?
Use k for j
Instantiate, Check, Analyze

The algorithm
[B] [L] [E]
Instantiate

The algorithm
[B] [L] [E]
Instantiate …

Finding a new witness
Given
Constraint
Check
local vars
quantified variable
Skolem Template f
Solve for t using sampling-based approach
restrict to linear templates

Quantifier Alternation using Sampling
Quantifier Elimination
Pick candidate tc
? Y: Return tc
? N: CEX lc, New candidate tc
Cheap QE of integers
Add lc to existing samples S
Eliminate arrays (thanks to Nikolaj for the discussion)
CEX S
Source of Divergence!

Abstract Post, in practice
1. Cheap QE tricks, case-split on equalities on j, etc.
2. Under-approximate, otherwise.
Solve
Generalize models
1. Cheap QE tricks, case-split on array-index arguments, etc.
2. Under-approximate, otherwise.
Solve an SMT problem
Generalize models

Experiments
• Implemented “qe_array” tactic in Z3
• Prototype in Python using Z3Py interface for witness generation
• Automatically generated “sufficient witnesses” for small array-manipulating programs [BMR13] – array init, find, copy, concatenate, reverse, etc.
• Used GPDR engine in Z3 to solve for quantifier-free predicates
• Up to two universal quantifiers per predicate
• Witness was just a local variable in the VC

Moving forward…
Scalability
• Handle large programs (with multiple procedures)
• How to pick relevant “set” of witnesses?
• Can we synthesize guards to combine them into a single witness?
Implementation-wise
• Cache previous AI results
• Reuse bounded proofs – Proof-based Abstraction
• Lazy QE – postponing to later steps?
Alternatives
• Use over-approximations of reachable states
• Witness may not exist – need to refine the approximation

Questions?
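
Backup: witness instantiation on the array-init example
The VC slides reduce the quantified problem to quantifier-free checks once witnesses for j are fixed. Below is a minimal sketch of that reduction, not the talk's algorithm or its Z3-based prototype. It assumes a candidate interpretation P(j, a, i, c) := (0 ≤ j < i → a[j] = c), a hypothetical bound N on the array length, and hand-picked witnesses (the head's own j in the loop VC, a caller-supplied witness at the exit VC). It enumerates a small finite domain instead of calling a model checker, so it is a sanity check rather than a proof.

```python
from itertools import product

N = 3            # hypothetical bound on array length (assumption, for enumeration only)
VALUES = (0, 1)  # hypothetical small value domain for array cells and c


def P(j, a, i, c):
    """Assumed quantifier-free interpretation: 0 <= j < i implies a[j] == c."""
    return not (0 <= j < i) or a[j] == c


def check_instantiated_vcs(exit_witness):
    """Check the three VCs of the array-init loop with quantifiers instantiated.

    exit_witness(k, i, n) returns the term substituted for the universally
    quantified j in the hypothesis of the exit VC.
    """
    for a in product(VALUES, repeat=N):
        for c in VALUES:
            for n in range(N + 1):
                for i in range(N + 1):
                    for j in range(-1, N + 1):
                        # Init VC: i = 0 implies P(j, a, i, c).
                        if i == 0 and not P(j, a, i, c):
                            return False
                        # Loop VC: hypothesis instantiated with the head's j.
                        if i < n and P(j, a, i, c):
                            a_next = a[:i] + (c,) + a[i + 1:]  # a[i] := c
                            if not P(j, a_next, i + 1, c):
                                return False
                    for k in range(N):
                        # Exit VC: hypothesis instantiated with exit_witness.
                        w = exit_witness(k, i, n)
                        if i >= n and 0 <= k < n and P(w, a, i, c):
                            if a[k] != c:
                                return False
    return True


def witness_k(k, i, n):
    """Witness suggested on the slides: use k for j."""
    return k


def witness_zero(k, i, n):
    """A deliberately poor guess: always instantiate j with 0."""
    return 0
```

With witness_k all three instantiated VCs hold on the bounded domain. With witness_zero the exit VC fails (for example when a[0] = c but a[1] ≠ c), which is the kind of counterexample a refinement step would use to constrain the next witness guess.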