Art of Invariant Generation applied to Symbolic Bound Computation
Part 3

Sumit Gulwani (Microsoft Research, Redmond, USA)
Oregon Summer School, July 2009

Art of Invariant Generation

1. Program Transformations
   – Reduce need for sophisticated invariant generation.
   – E.g., control-flow refinement, loop-flattening/peeling, non-standard cut-points, quantitative attributes instrumentation.
2. Colorful Logic
   – Language of Invariants
   – E.g., arithmetic, uninterpreted fns, lists/arrays
3. Fixpoint Brush
   – Automatic generation of invariants in some shade of logic, e.g., conjunctive/k-disjunctive/predicate abstraction.
   – E.g., Iterative, Constraint-based, Proof Rules

Program Transformations

• Control-flow Refinement
  – Reduces need for disjunctive/non-linear invariants.
• Quantitative Attributes Instrumentation
  – Reduces need for invariants that refer to numerical heap properties.
• Loop Flattening/Peeling
  – Reduces need for disjunctive invariants.
• Non-standard choice of cut-points
  – Reduces need for disjunctive invariants.
Program Transformations: References

• Control-flow Refinement
  – Control-Flow Refinement and Progress Invariants for Bound Analysis; Gulwani, Jain, Koskinen; PLDI '09
• Quantitative Attributes Instrumentation
  – SPEED: Precise and Efficient Static Estimation of Program Computational Complexity; Gulwani, Mehra, Chilimbi; POPL '09
• Non-standard choice of cut-points
  – Program Analysis as Constraint Solving; Gulwani, Srivastava, Venkatesan; PLDI '08

Program Transformations: Control-flow Refinement

Example: Loop with multiple phases

  Inputs: int n, m
  Assume(0<n<m)
  x := n+1;
  while (x≠n)
    if (x≤m) x++;
    else x := 0;

Transition System Representation:

  x' := n+1;
  while (*)
    P1: assume(x≠n ∧ x≤m); x' := x+1;
    P2: assume(x≠n ∧ x>m); x' := 0;

Control Flow Refinement:

  x' := n+1;
  while (*) {assume(x≠n ∧ x≤m); x' := x+1;}
  assume(x≠n ∧ x>m); x' := 0;
  while (*) {assume(x≠n ∧ x≤m); x' := x+1;}

• Control-flow Refinement: Transform a loop with multiple paths into a code fragment with simpler loops.
• For the above example, (P1 | P2)* reduces to P1+ P2 P1+.
• This implies a bound of (m-n) + 1 + n = m+1.

Control-Flow Refinement

• Recall algebraic equivalence:
    (P1|P2)* = Skip | (P1|P2) (P1|P2)*
  – Used by iteration-based tools to compute fixed-points.
• Now consider a different algebraic equivalence:
    (P1|P2)* = Skip | P1+ | P2+ | P1+ P2 (P1|P2)* | P2+ P1 (P1|P2)*
  – Here the focus is on the action when P1 and P2 interleave.
1. Expand a loop (P1 | P2)* using the above rule.
2. Use an invariant generation tool to check feasibility of the above cases and accordingly expand recursively.
• The expanded code fragment with simpler loops is easier to analyze. Invariants of simpler loops correspond to disjunctive invariants over the original loop.

Program Transformations: Quantitative Attributes Instrumentation

Example: Loop iterating over a data-structure

  BreadthFirstTraversal(List L):
    ToDo.Init();
    L.MoveTo(L.Head(),ToDo);
    c := 0;
    while (!ToDo.IsEmpty())
      c++;
      e := ToDo.Head();
      ToDo.Delete(e);
      foreach successor s in e.Successors()
        if (L.contains(s)) L.MoveTo(s,ToDo);

• Bound may require reference to quantitative attributes of a data-structure. E.g., Len(L): Length of list L.
• Inductive invariant for the outer while-loop:
    c ≤ Old(Len(L)) - Len(L) - Len(ToDo) ∧ Len(L) ≥ 0 ∧ Len(ToDo) ≥ 0
• This implies a bound of Old(Len(L)) for the while loop.

User-defined Quantitative Attributes

• User describes semantics of quantitative attributes by stating how they are updated by various data-structure methods.

  Data Structure Operation    Updates to Quantitative Attributes
  L.Delete(e);                Len(L)--;
  L.MoveTo(e,L');             Len(L)--; Len(L')++;

• Paper gives examples of quantitative attributes for trees, bit-vectors, composite structures (e.g., list of lists)
  – Trees: Height, Number of nodes
  – Bit-vectors: Number of 1 bits
  – List of lists: Sum of # of nodes in all nested lists.

Symbolic Bound Computation Problem

We will now sketch a solution to the symbolic bound computation problem using the techniques learned. (Joint work with Florian Zuleger, TU-Darmstadt.)

We proceed by starting out with special cases and then generalizing.

• π is immediately inside a loop.
  – Loop has only one transition/path s.
  – Loop has two transitions s1 ∨ s2.
  – Loop has multiple transitions s1 ∨ … ∨ sn.
  – Loop has nested loops.
• π can be any control-location.
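Returning to the breadth-first traversal example: the following is a minimal Python sketch, not the SPEED implementation, of how quantitative-attribute instrumentation might look. The function name `bfs_with_attributes`, the dictionary-based successor map, and the guard for an empty list are assumptions made for illustration. The counters `len_L` and `len_todo` play the role of Len(L) and Len(ToDo) and are updated exactly as in the attribute-update table; the inductive invariant from the slide is asserted after every iteration.

```python
from collections import deque


def bfs_with_attributes(successors, nodes):
    """Breadth-first traversal instrumented with Len(L) and Len(ToDo).

    `successors` maps a node to its successor list (hypothetical encoding).
    Returns (c, Old(Len(L))): loop iterations and the initial list length.
    """
    L = list(nodes)            # nodes not yet discovered
    todo = deque()             # the ToDo worklist
    old_len_L = len(L)         # Old(Len(L))
    len_L, len_todo = len(L), 0

    def move_to_todo(node):
        # L.MoveTo(e, ToDo): Len(L)--; Len(ToDo)++
        nonlocal len_L, len_todo
        L.remove(node)
        todo.append(node)
        len_L -= 1
        len_todo += 1

    if L:                      # guard added for the empty-list case
        move_to_todo(L[0])

    c = 0
    while todo:
        c += 1
        e = todo.popleft()     # e := ToDo.Head(); ToDo.Delete(e)
        len_todo -= 1          # Delete: Len(ToDo)--
        for s in successors.get(e, ()):
            if s in L:
                move_to_todo(s)
        # The instrumented counters track the real lengths.
        assert len_L == len(L) and len_todo == len(todo)
        # Inductive invariant from the slide.
        assert c <= old_len_L - len_L - len_todo
        assert len_L >= 0 and len_todo >= 0
    return c, old_len_L
```

Calling `bfs_with_attributes` on any finite graph returns a count `c` that never exceeds the initial length, which is the bound Old(Len(L)) that the invariant implies.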
Bounding Iterations of Loops with one transition

Consider the loop
  while (cond) X := F(X)
Transition system representation
  s: cond ∧ X'=F(X)

Example: Transition system representation of the loop
  while (x < n) {x++; n--;}
is
  x<n ∧ x'=x+1 ∧ n'=n-1

Algorithm:
1. Find a ranking function r for transition s:
   – r is a ranking fn for s if: s ⇒ (r>0 ∧ r[X'/X] ≤ r-1)
2. Output Max(0,r)
   – Claim: Bound(s) ≤ Max(0,r)

Finding Ranking Functions

• Iterative Forward
  – Instrument counter c and find an upper bound n. n-c is a ranking function.
• Constraint-based
  – Assume a template a0 + Σi ai xi for the ranking function r and then solve for the ai's in the constraint ∃ai (s ⇒ (r>0 ∧ r[X'/X] ≤ r-1)) using Farkas' lemma
  – Goal directed
  – Complete PTIME method for synthesis of linear ranking fns.
    • Podelski, Rybalchenko; VMCAI '04
• Proof Rules
  – Most scalable, and effective for several domains.
  – We discuss design of a rank computer RankC based on some proof rules that can be discharged using SMT solvers.

Arithmetic Iteration Patterns

If s ⇒ (e>0 ∧ e[X'/X] < e), then e ∈ RankC(s)
• RankC(i'=i+1 ∧ i<n ∧ i<m ∧ n'=n ∧ m'≤m) = { n-i, m-i }
• RankC(n>0 ∧ n'≤n ∧ A[n]≠A[n']) = { n }

If s ⇒ (e≥1 ∧ e[X'/X] ≤ e/2), then log e ∈ RankC(s)
• RankC(i'≤i/2 ∧ i>1) = { log i }
• RankC(i'=2×i ∧ i>0 ∧ n>i ∧ n'=n) = { log (n/i) }

Boolean Iteration Pattern

If s ⇒ e ∧ ¬e[X'/X], then Bool2Int(e) ∈ RankC(s)
• RankC(flag'=false ∧ flag) = { Bool2Int(flag) }
• RankC(x'=100 ∧ x<100) = { Bool2Int(x < 100) }

Bit-vector Iteration Pattern

If s ⇒ (LSB(x') < LSB(x) ∧ x≠0), then LSB(x) ∈ RankC(s)
• RankC(x'=x << 1 ∧ x≠0) = { LSB(x) }
• RankC(x'=x&(x-1) ∧ x≠0) = { LSB(x) }

Data-structure Iteration Patterns

If s ⇒ (x≠z ∧ Dist(x',z) < Dist(x,z)), then Dist(x,z) ∈ RankC(s)
• RankC(x≠Null ∧ x'=x.next) = { Dist(x,Null) }
• RankC(Mem' = Update(Mem,x.next,x.next.next) ∧ x≠Null ∧ x.next≠Null) = { Dist(x,Null) }

Symbolic Bound Computation Problem

• Bounding Loop Iterations
  – Loop has only one transition/path s
    • Constraint-based (Linear), Proof Rules
  – Loop has two transitions s1 ∨ s2
    • Proof Rules
  – Loop has multiple transitions s1 ∨ … ∨ sn
  – Loop has nested loops.
• Bounding Visits(π), where π is any control-location.

Proof Rule for Max Composition

Let r1 ∈ RankC(s1), r2 ∈ RankC(s2).

Cooperative Interference CI(s1,r1,s2,r2):
• Non-enabling condition: s1 ∘ s2 = false
• Rank decrease condition: s1 ⇒ r2[x'/x] ≤ Max(r1,r2)-1

Proof Rule: If CI(s1,r1,s2,r2) and CI(s2,r2,s1,r1), then:
  Bound(s1 ∨ s2) = Max(0, r1, r2)

Example:
  s1 = (n'=n-1 ∧ j<n-1 ∧ j'≥j ∧ i'=i)
  s2 = (n'=n-1 ∧ i<n-1 ∧ i'≥i+1 ∧ j'≥i+2)
  r1 = n-j-1, r2 = n-i-1
  Bound(s1 ∨ s2) = Max(0, n-j-1, n-i-1)

Proof Rule for Additive Composition

Let r1 ∈ RankC(s1), r2 ∈ RankC(s2).

Non-Interference NI(s1,s2,r2):
• Non-enabling condition: s1 ∘ s2 = false
• Rank preserving condition: s1 ⇒ r2[x'/x] ≤ r2

Proof Rule: If NI(s1,s2,r2) and NI(s2,s1,r1), then:
  Bound(s1 ∨ s2) = Max(0,r1) + Max(0,r2)

Example:
  s1 = (z>x ∧ x<n ∧ x'=x+1 ∧ Same({z,n}))
  s2 = (z≤x ∧ x<n ∧ z'=z+1 ∧ Same({x,n}))
  r1 = n-x, r2 = n-z
  Bound(s1 ∨ s2) = Max(0, n-x) + Max(0, n-z)

Proof Rule for Multiplicative Composition

Let r1 ∈ RankC(s1), r2 ∈ RankC(s2).

Proof Rule: If NI(s2,s1,r1), then:
  Bound(s1 ∨ s2) = Max(0,r1) + Max(0,r2) + Max(0,u2)*Max(0,r1)
where u2(X) is an upper bound on r2[X'/X] as implied by s1.
Example:
  s1 = (i'=i-1 ∧ i>0 ∧ j'=j-1 ∧ j>0 ∧ Same({k',m'}))
  s2 = (j'=m ∧ k'=k-1 ∧ k>0 ∧ Same({i',m'}))
  RankC(s1) = {i,j}, RankC(s2) = {k}
  Bound(s1 ∨ s2) = Max(0,k) + Max(0,m,j)*Max(0,k)
  or, Max(0,k) + Max(0,i) [Additive Composition]

Symbolic Bound Computation Problem

• Bounding Loop Iterations
  – Loop has only one transition/path s
    • Constraint-based (Linear), Proof Rules
  – Loop has two transitions s1 ∨ s2
    • Proof Rules
  – Loop has multiple transitions s1 ∨ … ∨ sn
    • Proof Rules
  – Loop has nested loops.
• Bounding Visits(π), where π is any control-location.

Algorithm: ComputeBound(s1 ∨ … ∨ sn)

  for i ∈ {1,…,n}: Iter[si] := ⊥;
  do {
    for i ∈ {1,…,n} and r ∈ RankC(si):
      J := { j | ¬NI(sj,si,r) };
      if (Iter[si] = ⊥) ∧ (∀j∈J: Iter[sj] ≠ ⊥)
        factor := Σ_{j∈J} Iter[sj];
        Let u(x) be an upper bound on r[x'/x] implied by TC(∨_{j≠i} sj);
        Iter[si] := Max(0,r) + Max(0,u) * factor;
  } while any change in Iter array;
  If (∀ 1≤j≤n: Iter[sj] ≠ ⊥), return Σ_{1≤j≤n} Iter[sj];
  Else return "Potentially Unbounded";

Symbolic Bound Computation Problem

• Bounding Loop Iterations
  – Loop has only one transition/path s
    • Constraint-based (Linear), Proof Rules
  – Loop has two transitions s1 ∨ s2
    • Proof Rules
  – Loop has multiple transitions s1 ∨ … ∨ sn
    • Control-flow Refinement + Proof Rules
  – Loop has nested loops
    • Iterative Forward: Recursively replace each nested loop by the transitive closure of its transition system.
• Bounding Visits(π), where π is any control-location.

Example: TransitiveClosure

  Input: int n
  flag := true;
  while (flag) {
    flag := false;
    while (n>0 ∧ nondet()) {
      n := n-1;
      flag := true;
    }
  }

• To compute a precise bound of n for the outer loop, we need to summarize the behavior of the nested loop (n>0 ∧ n'=n-1 ∧ flag'=true) by the following transitive closure:
    (n>0 ∧ n'≤n-1 ∧ flag'=true) ∨ (flag'=flag)
• Observe that n'≤n is also a transitive closure, but it is too abstract to even conclude that the outer loop terminates.
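As an informal sanity check of the nested-loop example, the sketch below simulates the program in Python and counts how often the outer loop body runs. It is only an illustration under stated assumptions: `outer_iterations` models nondet() by a coin flip, and `worst_case_outer_iterations` is a hypothetical adversarial scheduler that lets the inner loop run exactly once per outer iteration. Both function names are invented here. Under this reading, the outer back edge is taken at most Max(0, n) times, so the body executes at most 1 + Max(0, n) times.

```python
import random


def outer_iterations(n, rng):
    """Run the flag-controlled nested loop once.

    nondet() is modelled by a fair coin flip drawn from `rng`.
    Returns the number of times the outer loop body executes.
    """
    flag = True
    count = 0
    while flag:
        count += 1
        flag = False
        while n > 0 and rng.random() < 0.5:
            n -= 1
            flag = True
    return count


def worst_case_outer_iterations(n):
    """Adversarial nondet(): the inner loop runs exactly once per
    outer iteration whenever n > 0, maximising outer iterations."""
    flag = True
    count = 0
    while flag:
        count += 1
        flag = False
        if n > 0:
            n -= 1
            flag = True
    return count


if __name__ == "__main__":
    rng = random.Random(0)
    for n in range(0, 6):
        observed = max(outer_iterations(n, rng) for _ in range(1000))
        print(n, observed, worst_case_outer_iterations(n))
```

The weaker summary n'≤n would allow the inner loop to set the flag without decreasing n, which is why it cannot rule out an unbounded number of outer iterations.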
TransitiveClosure Definition

• We say that a relation R is TransitiveClosure(T) if
    Id ⇒ R and R ∘ T ⇒ R
  where Id is the relation X'=X
• Example of TransitiveClosure(s1 ∨ s2)
    s1: i'=i+1 ∧ j'=0
    s2: i'=i ∧ j'=j+1
    (i'≥i+1) ∨ (i'=i ∧ j'≥j)

Convexity-like Assumption

• Let s'1 ∨ … ∨ s'm be a transitive closure of s1 ∨ … ∨ sn. Then,
  – Id ⇒ (s'1 ∨ … ∨ s'm)
  – (s'1 ∨ … ∨ s'm) ∘ (s1 ∨ … ∨ sn) ⇒ (s'1 ∨ … ∨ s'm)
    Or, equivalently, s'j ∘ si ⇒ (s'1 ∨ … ∨ s'm)
• Convexity-like assumption:
    Id ⇒ s'δ where δ ∈ {1,…,m}
    s'j ∘ si ⇒ s'σ(j,i) where σ(j,i) ∈ {1,…,m}
• A transitive closure with m disjuncts for a relation with n disjuncts corresponds to an integer δ and a map σ: {1,…,m}×{1,…,n} → {1,…,m}
• We describe an algorithm for computing a transitive closure that is stronger than s'1 ∨ … ∨ s'm, given δ and map σ.

TransitiveClosure Algorithm

  s'δ := Id;
  for j ∈ {1,…,m}∖{δ}: s'j := false;
  do {
    for j ∈ {1,…,m} and i ∈ {1,…,n}:
      s'k := Join(s'k, s'j ∘ si), where k = σ(j,i)
  } until no change in (s'1, …, s'm)
  Return (s'1 ∨ … ∨ s'm)

• The m*n disjuncts s'j ∘ si are merged/joined into m disjuncts using the map σ.
• The distinguishing key idea is to stick with the same merging criterion determined by σ for all iterations.
• Precision Proof: s'j stays stronger than the desired solution.
• Termination may require widening, and precision is not guaranteed in that setting.

Symbolic Bound Computation Problem

• Bounding Loop Iterations
  – Loop has only one transition/path s
    • Constraint-based (Linear), Proof Rules
  – Loop has two transitions s1 ∨ s2
    • Proof Rules
  – Loop has multiple transitions s1 ∨ … ∨ sn
    • Control-flow Refinement + Proof Rules
  – Loop has nested loops
    • Quantitative Attributes + Iterative Forward (Linear+UF)
• Bounding Visits(π), where π is any control-location.
  – Generate a transition system for π.

Algorithm: GenerateTransitionSystem(π)

1. Split π into π_start and π_end.
2. If the graph between π_start and π_end is a DAG, then expand the DAG into a set/disjunction of paths, each path representing a relation among X and X'.
[Figure, Step 1: control location π split into π_start and π_end]

Algorithm: GenerateTransitionSystem(π) (continued)

3. If the graph G between π_start and π_end is not a DAG:
   Foreach top-level loop L in graph G:
   a. T := TransitionSystem(π_Header(L))
   b. Remove back-edges, place TransitiveClosure(T) at the beginning of π_Header(L).
   Expand the resultant DAG into a set/disjunction of paths.

[Figure, Step 3(b): TransitiveClosure placed at π_Header]

Algorithm for Symbolic Bound Computation

Input: Control Location π

1. T := GenerateTransitionSystem(π)
   – T is a relation in DNF form among X and X', where
     • X: variables live at π
     • X': values of variables X in the next visit to π
   – Quantitative Attributes + Iterative Forward (Linear+UF)
2. B := 1+ComputeBound(T)
   – B denotes a symbolic bound in terms of inputs to T
   – Control-flow Refinement + Constraint-based (Linear), Proof Rules
3. B' := TranslateBound(B, π)
   – Backward propagation based on Proof Rules

Output: Bound B'

Conclusion

• Symbolic Bound Analysis
  – An application area waiting to benefit from advances in invariant generation technology.
  – Several important/open/challenging problems
    • Concurrent Procedures, Average-case Bounds
• Art of Invariant Generation
  – Program Transformations + Colorful Logic + Fixpoint Brush
  – An effective solution for bound analysis involves using a variety of choices along each of these three dimensions.