The Reachability-Bound Problem

Sumit Gulwani
(Microsoft Research, Redmond)
Joint work with
Florian Zuleger
(TU Darmstadt)
Sudeep Juvekar
(UC-Berkeley)
The Reachability-Bound Problem
Let π be some control location inside a procedure.
• Safety: Is π never visited?
– Violation is a finite trace
• Liveness: Is π visited at most a finite number of times?
– Violation is an infinite trace
• Reachability-Bound: Symbolic bound on the maximum number of visits to π.
– Quantitative question as opposed to Boolean.
– Checking validity of a given bound is a safety property.
– Checking precision is not even a trace property.
– The problem is challenging!
Motivation 1: Resource Bound Analysis
• Programs consume a variety of resources.
– CPU time, Memory, Network Bandwidth, Power
• It is important to bound use of such resources.
– Economic incentives
– Better user experience
– Hard constraints on availability of resources
• Real-time/embedded systems, Low power/bandwidth devices
• This requires computing bounds on the number of visits to
control-locations that consume these resources.
– Memory Allocated = Σ_π [Visits(π) × BytesAllocated(π)]
– Asymptotic Time Complexity = Σ_H [Visits(H)], where H
ranges over loop headers.
Motivation 2: Quantitative Analysis of Data
• Program execution affects certain quantitative
properties of data.
– Secrecy: information leakage.
– Robustness: error/uncertainty propagation.
• Bounding such properties requires computing a bound on the
number of visits to control-locations that affect such
properties of the data.
Example (.Net Library)
Inputs: int n, bool[] A
i := 0;
π1: while (i < n) {
  j := i+1;
  π2: while (j < n) {
    if (A[j]) { π3: B[n] := new C(); }
    j++; }
  i++; }
• Time Complexity = Visits(π1) + Visits(π2)
– Visits(π1) ≤ n and Visits(π2) ≤ n²
• Memory Allocated = Visits(π3) × SizeOf(C)
– Visits(π3) ≤ n²
Example (.Net Library)
Inputs: int n, bool[] A
i := 0;
π1: while (i < n) {
  j := i+1;
  π2: while (j < n) {
    if (A[j]) { π3: B[n] := new C(); j--; n--; }
    j++; }
  i++; }
• Time Complexity = Visits(π1) + Visits(π2)
– Visits(π1) ≤ n and Visits(π2) ≤ n²
• Memory Allocated = Visits(π3) × SizeOf(C)
– Visits(π3) ≤ n
– Nested loop does not necessarily imply quadratic complexity.
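The two example variants can be compared by direct simulation. The sketch below is only an illustration, not part of the analysis: the function name `count_visits`, the `shrink` flag, and the choice to count a visit each time a loop body (or the allocation) is entered are assumptions made here.

```python
def count_visits(n, A, shrink):
    """Simulate the example loop and count visits to pi1, pi2 and pi3.

    shrink=False is the first variant; shrink=True adds the
    'j--; n--' statements of the second variant.
    A must have at least n entries.
    """
    visits = {"pi1": 0, "pi2": 0, "pi3": 0}
    i = 0
    while i < n:
        visits["pi1"] += 1          # entering the outer loop body
        j = i + 1
        while j < n:
            visits["pi2"] += 1      # entering the inner loop body
            if A[j]:
                visits["pi3"] += 1  # the allocation site
                if shrink:
                    j -= 1
                    n -= 1
            j += 1
        i += 1
    return visits


first = count_visits(5, [True] * 5, shrink=False)
second = count_visits(5, [True] * 5, shrink=True)
print(first)   # quadratic number of allocations
print(second)  # at most n - 1 allocations
```

With an all-true array of length 5, the first variant allocates 10 times while the second allocates 4 times, which is consistent with the n² and n bounds on the slides.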
Algorithm: A variety of fixed-point techniques
Examine the loop induced by the control-flow graph starting
at π, and the next visit to it.
• Loop has one path.
– Compute ranking function using constraint-based or proof
rule based technique.
• Loop has multiple paths.
– Compose ranking functions for paths using proof rules.
– One proof rule each for Max, Sum, and Product composition.
• Loop has inner loops.
– Summarize inner loops by precise disjunctive invariants using
forward iterative technique (abstract interpretation).
• Loop has other loops before it.
– Perform backward symbolic execution (using proof rules to
trace across loops) to express bound in terms of inputs.
Algorithm: A variety of fixed-point techniques
Current case: loop has one path.
– Compute ranking function using constraint-based or proof
rule based technique.
Ranking Function: Arithmetic Loops
Inputs: uint n,m
i := j := 0;
π: while (i<n ∧ j<m) {
  j++;
  i++; }
Visits(π) ≤ Min(n,m)
• There is one path between π and the next visit to it.
Path 1: i<n ∧ j<m ∧ j'=j+1 ∧ i'=i+1 ∧ Same({n,m})
• n-i is a ranking function for path 1 because
– n-i > 0
– n-i decreases in each iteration, i.e., (n'-i') < (n-i)
• Visits(π) ≤ Value of n-i immediately before the loop = n
• Similarly, m-j is also a ranking function and Visits(π) ≤ m
Computing Ranking Functions: Proof Rule Technique
• Guess a ranking function e
– For each (syntactically appearing) inequality e1 ≥ e2
in the path relation P, guess e1-e2 to be a candidate.
• Check whether e is a ranking function by validating
the following constraints using an SMT solver.
P ⇒ e ≥ 0
P ⇒ (e[X'/X] ≤ e-1)
• The proof rule based technique extends readily to
cases other than integer arithmetic.
– E.g., loops that iterate over bit-vectors or data structures
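The guess-and-check step can be illustrated on Path 1 of the arithmetic loop. The sketch below swaps the SMT solver for a brute-force check over a small range of states, so it only tests the two constraints and does not prove them; the helper name `is_ranking_function` and the state encoding as dictionaries are assumptions made for this example.

```python
from itertools import product


def is_ranking_function(guard, step, rank, states):
    """Bounded check of  P => rank >= 0  and  P => rank(X') <= rank(X) - 1."""
    for s in states:
        if guard(s):
            t = step(s)
            if rank(s) < 0 or rank(t) > rank(s) - 1:
                return False
    return True


# Path 1: i<n and j<m and j'=j+1 and i'=i+1 and Same({n,m})
states = [dict(i=i, j=j, n=n, m=m)
          for i, j, n, m in product(range(6), repeat=4)]
guard = lambda s: s["i"] < s["n"] and s["j"] < s["m"]
step = lambda s: dict(s, i=s["i"] + 1, j=s["j"] + 1)

# Candidates e1-e2 guessed from the inequalities n > i and m > j,
# plus one deliberately wrong guess.
candidates = {
    "n-i": lambda s: s["n"] - s["i"],
    "m-j": lambda s: s["m"] - s["j"],
    "i-n": lambda s: s["i"] - s["n"],
}
results = {name: is_ranking_function(guard, step, rank, states)
           for name, rank in candidates.items()}
print(results)
```

Both n-i and m-j pass the check, while the wrong guess i-n is rejected because it is negative whenever the guard holds.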
Computing Ranking Functions: Constraint-based Technique
• The proof-rule based technique is not complete.
Consider the following example.
– P: x≥0 ∧ y≥0 ∧ x'=y ∧ y'=x-1
– Neither x nor y is a ranking function, but x+y is.
• There is a “complete” method to find linear ranking
functions [Podelski, Rybalchenko, VMCAI '04]
– Let ranking function be of form a1x + a2y + a3
– We want to find a1, a2, a3 such that for all x,y
• P ⇒ (a1x+a2y+a3) ≥ 0 and
• P ⇒ (a1x'+a2y'+a3) ≤ (a1x+a2y+a3) - 1
– Farkas' Lemma can be used to reduce the above system
of quantified constraints to a system of linear inequalities.
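To make the template idea concrete, the sketch below searches small integer coefficients a1, a2, a3 for the example P. It is not the Farkas-based reduction: it simply enumerates coefficients and tests the two constraints on a bounded set of states, and the name `is_linear_ranking` and the search ranges are assumptions chosen for illustration.

```python
from itertools import product


def is_linear_ranking(a1, a2, a3, bound=8):
    """Bounded test of the two template constraints for
    P: x>=0 and y>=0 and x'=y and y'=x-1."""
    for x, y in product(range(bound), repeat=2):  # states with x>=0, y>=0
        x2, y2 = y, x - 1                         # x'=y, y'=x-1
        before = a1 * x + a2 * y + a3
        after = a1 * x2 + a2 * y2 + a3
        if before < 0 or after > before - 1:
            return False
    return True


# Enumerate small coefficient triples (a1, a2, a3).
found = [c for c in product(range(-2, 3), repeat=3) if is_linear_ranking(*c)]
print(found)
```

The surviving templates all have a1 = a2, so x+y (coefficients 1, 1, 0) is found while x alone and y alone are rejected, matching the slide.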
Ranking Function: Bitvector Loops (SQL)
Input: bitvector b
π: while (b ≠ 0)
  b := b & (b-1);
Visits(π) ≤ Ones(b)
Input: bitvector b
π: while (b ≠ 0)
  b := b << 1;
Visits(π) ≤ RMB(b)
Input: bitvector b
π: while (BitScanForward(&id1,b)) {
  b := b | ((1 << id1)-1);
  if (BitScanForward(&id2,~b)) break;
  b := b & (~((1 << id2)-1)); }
Visits(π) ≤ Min { Ones(b), RMB(b)/2 }
Ones(b): # of 1 bits in bitvector b
RMB(b): position of right-most 1-bit
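The first bound can be checked by running the loop. This is a small sketch using Python integers as bitvectors; the function name `clear_lowest_bit_iterations` is an assumption made here, and Ones(b) is computed by counting the '1' characters in the binary representation.

```python
def clear_lowest_bit_iterations(b: int) -> int:
    """Count iterations of: while (b != 0) b := b & (b-1)."""
    visits = 0
    while b != 0:
        visits += 1
        b &= b - 1   # clears the lowest set bit of b
    return visits


for b in (0, 1, 0b1010, 0b1111, 0b10000000):
    ones = bin(b).count("1")
    print(b, clear_lowest_bit_iterations(b), ones)
```

Each iteration clears exactly one set bit, so the iteration count equals Ones(b) for non-negative inputs.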
Ranking Function: Data-structure Loops
Input: List L
π: while (L ≠ Null)
  L := L.Next;
Visits(π) ≤ Length(L, Next)
Input: ICollection C
π: foreach (Element e in C)
…
Requires analysis of the C.MoveNext() method.
If the method is virtual, we define Visits(π) to be C.Count.
Algorithm: A variety of fixed-point techniques
Current case: loop has multiple paths.
– Compose ranking functions for paths using proof rules.
– One proof rule each for Max, Sum, and Product composition.
Composition of Ranking Functions
Inputs: uint n,m
i := j := 0;
π: while (j<m ∨ i<n) {
  j++; i++; }
Path 1: j<m ∧ j'=j+1 ∧ i'=i+1
Path 2: i<n ∧ j'=j+1 ∧ i'=i+1
Visits(π) ≤ Max(n,m)
Inputs: uint n,m
i := j := 0;
π: while (i<n)
  if (j<m) j++;
  else i++;
Path 1: i<n ∧ j<m ∧ j'=j+1
Path 2: i<n ∧ j≥m ∧ i'=i+1
Visits(π) ≤ n + m
Inputs: uint n,m
i := j := 0;
π: while (i<n)
  if (j<m) j++;
  else {i++; j:=0;}
Path 1: i<n ∧ j<m ∧ j'=j+1
Path 2: i<n ∧ j≥m ∧ i'=i+1 ∧ j'=0
Visits(π) ≤ n × (1+m)
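The three loops above can be run directly to see the Max, Sum, and Product shapes of their bounds. The sketch below is a plain simulation; the function names `loop_max`, `loop_sum`, and `loop_product` are labels chosen here, and a visit is counted each time the loop body is entered.

```python
def loop_max(n, m):
    """while (j<m or i<n) { j++; i++; }"""
    i = j = visits = 0
    while j < m or i < n:
        visits += 1
        j += 1
        i += 1
    return visits


def loop_sum(n, m):
    """while (i<n) if (j<m) j++; else i++;"""
    i = j = visits = 0
    while i < n:
        visits += 1
        if j < m:
            j += 1
        else:
            i += 1
    return visits


def loop_product(n, m):
    """while (i<n) if (j<m) j++; else { i++; j := 0; }"""
    i = j = visits = 0
    while i < n:
        visits += 1
        if j < m:
            j += 1
        else:
            i += 1
            j = 0
    return visits


print(loop_max(3, 5), loop_sum(3, 5), loop_product(3, 5))
```

For n = 3 and m = 5 the three loops run 5, 8, and 18 times, that is Max(n,m), n + m, and n × (1+m). Only the reset of j in the third loop separates the additive from the multiplicative case.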
Proof Rule for Additive Composition
Let r1, r2 be ranking functions for p1, p2 respectively.
Non-Interference NI(p1,p2,r2):
• Non-enabling condition: p1 ∘ p2 = false
• Rank preserving condition: p1 ⇒ r2[x'/x] ≤ r2
Proof Rule: If NI(p1,p2,r2) and NI(p2,p1,r1), then:
Bound(p1 ∨ p2) = Max(0, r1) + Max(0,r2)
Example:
p1: (i<n ∧ i'=i+1 ∧ Same({j,n,m}) )
p2: (j<m ∧ j'=j+1 ∧ Same({i,n,m}) )
r1: n-i, r2: m-j
Bound(p1 ∨ p2) = Max(0, n-i) + Max(0, m-j)
= n + m
Proof Rule for Multiplicative Composition
Let r1, r2 be ranking functions for p1, p2 respectively.
Proof Rule: If NI(p2,p1,r1), then:
Bound(p1 ∨ p2) = Max(0,r1) + Max(0,r2)
+ Max(0,u2)*Max(0,r1)
where u2(X) is an upper bound on r2[X'/X] as implied by p1.
Example:
p1: (i<n ∧ i'=i+1 ∧ j'=0 ∧ Same({n,m}))
p2: (j<m ∧ j'=j+1 ∧ Same({i,n,m}))
r1: n-i, r2: m-j
Bound(p1 ∨ p2) = Max(0,n-i) * [1 + Max(0,m-j)]
= n * (1+m)
Proof Rule for Max Composition
Let r1, r2 be ranking functions for p1, p2 respectively.
Cooperative Interference CI(p1,r1,p2,r2):
• Non-enabling condition: p1 ∘ p2 = false
• Rank decrease condition: p1 ⇒ r2[x'/x] ≤ Max(r1,r2)-1
Proof Rule: If CI(p1, r1, p2, r2) and CI(p2,r2,p1,r1), then:
Bound(p1 ∨ p2) = Max(0, r1, r2)
Example:
p1: (i<n ∧ i'=i+1 ∧ j'=j+1 ∧ Same({n,m}) )
p2: (j<m ∧ i'=i+1 ∧ j'=j+1 ∧ Same({n,m}) )
r1: n-i, r2: m-j
Bound(p1 ∨ p2) = Max(0, n-i, m-j)
= Max(n,m)
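The rank decrease condition can be tested on small states. The sketch below checks only that one condition (not the non-enabling condition), by brute force rather than with a solver; the helper `rank_decrease_holds` and the encoding of a path as a function returning the next state, or None when its guard is false, are assumptions made for this illustration.

```python
from itertools import product


def rank_decrease_holds(path, r_own, r_other, states):
    """Bounded test of: path => r_other[x'/x] <= Max(r_own, r_other) - 1."""
    for s in states:
        t = path(s)  # None when the guard of the path is false
        if t is not None and r_other(t) > max(r_own(s), r_other(s)) - 1:
            return False
    return True


states = [dict(i=i, j=j, n=n, m=m)
          for i, j, n, m in product(range(5), repeat=4)]
n_minus_i = lambda s: s["n"] - s["i"]
m_minus_j = lambda s: s["m"] - s["j"]

# Max example: both paths increment i and j together.
p1 = lambda s: dict(s, i=s["i"] + 1, j=s["j"] + 1) if s["i"] < s["n"] else None
p2 = lambda s: dict(s, i=s["i"] + 1, j=s["j"] + 1) if s["j"] < s["m"] else None
max_ok = (rank_decrease_holds(p1, n_minus_i, m_minus_j, states)
          and rank_decrease_holds(p2, m_minus_j, n_minus_i, states))

# Additive example: the path that increments j leaves n-i unchanged.
q1 = lambda s: (dict(s, j=s["j"] + 1)
                if s["i"] < s["n"] and s["j"] < s["m"] else None)
sum_ok = rank_decrease_holds(q1, m_minus_j, n_minus_i, states)

print(max_ok, sum_ok)
```

The condition holds for the Max example and fails for the additive one, which is why the latter gets the weaker bound n + m.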
Algorithm: A variety of fixed-point techniques
Current case: loop has inner loops.
– Summarize inner loops by precise disjunctive invariants using
forward iterative technique (abstract interpretation).
Transitive Closure
• A loop with body T can be replaced by TransitiveClosure(T).
• We say that a relation R is TransitiveClosure(T) if
Id ⇒ R and R ∘ T ⇒ R
where Id is the relation X'=X
• Precise transitive closures can be computed using iterative
fixed-point techniques such as abstract interpretation or
model checking.
• Example of TransitiveClosure(s1 ∨ s2)
s1: i'=i+1 ∧ j'=0
s2: i'=i ∧ j'=j+1
(i'≥i+1 ∧ j'≥0) ∨ (i'=i ∧ j'≥j)
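The example closure can be sanity-checked by enumerating what s1 ∨ s2 can reach in a few steps. This sketch only tests that the stated relation covers every state reached within a bounded depth; it does not compute the closure by abstract interpretation, and the names `step`, `closure_holds`, and `reachable` are assumptions made here.

```python
def step(state):
    """One application of s1 (i'=i+1, j'=0) or s2 (i'=i, j'=j+1)."""
    i, j = state
    return [(i + 1, 0), (i, j + 1)]


def closure_holds(src, dst):
    """The relation (i'>=i+1 and j'>=0) or (i'=i and j'>=j)."""
    (i, j), (i2, j2) = src, dst
    return (i2 >= i + 1 and j2 >= 0) or (i2 == i and j2 >= j)


def reachable(src, depth):
    """All states reachable from src in at most `depth` steps."""
    seen = {src}
    frontier = {src}
    for _ in range(depth):
        frontier = {t for s in frontier for t in step(s)} - seen
        seen |= frontier
    return seen


ok = all(closure_holds(src, dst)
         for src in [(0, 0), (2, 5), (3, 1)]
         for dst in reachable(src, 6))
print(ok)
```

Zero steps correspond to Id, which the second disjunct covers with j' = j.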
Example (.Net Library)
Inputs: int n, bool[] A
i := 0;
π1: while (i < n) {
  j := i+1;
  π2: while (j < n) {
    if (A[j]) { π3: B[n] := new C(); j--; n--; }
    j++; }
  i++; }
Visits(π3) ≤ n
Split Control Location
[Figure: control-flow graphs of the example with control location π3 split into π3a and π3b. Flowchart nodes: begin, i := 0, i<n, j := i+1, j<n, A[j], B[n] := new C, j--; n--, j := j+1, i := i+1, end.]
Transition System Generation
[Figure: control-flow graph from π3a to π3b, with loops replaced by the transitive closures T1' and T2'.]
• Transition-system T1 of inner loop:
(j≥n ∧ i<n-1 ∧ i'=i+1 ∧ j'=i+2)
• T1' = TransitiveClosure(T1) =
(i'=i ∧ j'=j) ∨ (j≥n ∧ i<n-1 ∧ i'>i ∧ j'≥i+2)
• Transition-system T2 of outer loop:
(j<n-1 ∧ j'=j+1 ∧ i'=i) ∨ (i<n-1 ∧ i'>i ∧ j'≥i+2)
• T2' = TransitiveClosure(T2) =
(j'≥j ∧ i'=i) ∨ (i<n-1 ∧ i'>i ∧ j'≥i+2)
• Transition-system(π3):
(n'=n-1 ∧ j<n-1 ∧ j'≥j ∧ i'=i) ∨
(n'=n-1 ∧ i<n-1 ∧ i'>i ∧ j'≥i+2)
Reachability-Bound Computation
1. Transition-system(π3)
P1: (n'=n-1 ∧ j<n-1 ∧ j'≥j ∧ i'=i) ∨
P2: (n'=n-1 ∧ i<n-1 ∧ i'>i ∧ j'≥i+2)
2. n-1-j is a ranking function for P1.
n-1-i is a ranking function for P2.
3. Proof Rule for Max Composition yields a bound of
Max(0, n-1-i, n-1-j), which involves variables live at π3.
4. During first visit to π3, we have i≥0 ∧ j≥1. This yields
a bound of Max(0,n-1) in terms of procedure inputs.
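The substitution in step 4 can be spelled out as follows. This is a sketch that uses only the facts stated above, namely i ≥ 0 and j ≥ 1 at the first visit to π3.

```latex
\begin{align*}
\mathrm{Visits}(\pi_3) &\le \max(0,\; n-1-i,\; n-1-j) \\
  &\le \max(0,\; n-1-0,\; n-1-1) && \text{since } i \ge 0,\ j \ge 1 \\
  &= \max(0,\; n-1,\; n-2) \\
  &= \max(0,\; n-1).
\end{align*}
```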
Algorithm: A variety of fixed-point techniques
Current case: loop has other loops before it.
– Perform backward symbolic execution (using proof rules to
trace across loops) to express bound in terms of inputs.
Backward Symbolic Execution (.Net Library)
Inputs: List<int> C1, List<int> C2
List<int> C3 = new List<int>();
AddElements(C3,C1);
DeleteElements(C3,C2);
π: foreach (int e in C3)
…
Visits(π) = C3.Count ≤ C1.Count
AddElements(List<int> L1, List<int> L2)
foreach (int e in L2)
L1.Add(e);
DeleteElements(List<int> L1, List<int> L2)
foreach (int e in L2)
if (L1.Contains(e)) L1.Delete(e);
• Backward Propagation may require tracing back across
procedure calls and loops.
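The relation C3.Count ≤ C1.Count can be observed on a direct transcription of the example. The sketch below uses Python lists in place of List<int>, with `append` and `remove` standing in for Add and Delete; the function `visits` and its counting loop are assumptions added for illustration.

```python
def add_elements(l1, l2):
    """AddElements: append every element of l2 to l1."""
    for e in l2:
        l1.append(e)


def delete_elements(l1, l2):
    """DeleteElements: remove one occurrence from l1 per matching element of l2."""
    for e in l2:
        if e in l1:
            l1.remove(e)


def visits(c1, c2):
    """Number of iterations of the final foreach loop over C3."""
    c3 = []
    add_elements(c3, c1)
    delete_elements(c3, c2)
    count = 0
    for _ in c3:
        count += 1
    return count


print(visits([1, 2, 2, 3], [2, 5]))
```

Additions contribute at most C1.Count elements and deletions never add any, so the printed count never exceeds the length of the first input.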
Backward Symbolic Execution across Loops
n := m
while (e) {
S1
π: n := n+3;
S2
}
n_after ≤ n_before + 3 × Visits(π)
Use algorithm for computing Visits to relate values of
a variable before and after a loop.
SPEED Tool
• Computes symbolic computational complexity of procedures.
• Built over Phoenix Compiler Infrastructure and analyzes
.Net binaries.
• Uses Z3 SMT solver as the logical reasoning engine.
– Can reason about various data-types: arithmetic, bit-vector,
boolean, list/collection variables.
• Takes between 0.1 and 1 second to analyze each loop.
• Success ratio of 60-90% for computing loop bounds.
• Representative failure cases:
– Lack of global invariant analysis.
• for (i:=0; i<n; i := i+g);
• for (i:=0; ig; i := i+1);
– Failure to resolve virtual method calls.
Limitations and potential Extensions
• Worst-case bounds (as opposed to average bounds)
– Challenge: Requires modeling average/representative inputs.
– Use profiling/user-annotations to rule out exceptional paths.
• Static cost model for timing analysis
– Challenge: Difficult to model low-level architectural details
like caches, pipelines.
– Profiling may help generate a precise cost model.
• Imprecision (may generate higher bounds than possible)
– Challenge: Undecidable problem in general.
– Possible to generate proof of precision of bounds.
• Sequential Programs (as opposed to Concurrent programs)
– Challenge: Variety of concurrent programming models;
scheduling policies; # of processors
– Might be possible to model some of them.
Related Work
• Detailed lecture notes available at
http://www.cs.uoregon.edu/research/summerschool/summer09
• Bound computation using Recurrence Relations
– Albert, Arenas, Genaim, Puebla, SAS '08
• Termination
– Disjunctively well-founded ranking functions
• Cook, Podelski, Rybalchenko, PLDI 2006
– Size-change abstraction
• Ben-Amram, CAV 2009
• Worst Case Execution Time
– R. Wilhelm et al., ACM TECS 2007
Conclusion
• Bound Computation: An important application area that
can leverage advances in static program analysis.
• An effective solution involved a variety of techniques
for reasoning about loops/fixed-points.
– Iterative techniques for summarizing inner loops.
– Constraint-based techniques for ranking functions.
– Proof-rule based technique for composition of ranking
functions and bound computation in terms of inputs.
• Several important/open/challenging problems.
– Concurrent Procedures, Average-case Bounds