Data-Flow Analysis

(Chapter 8)
Furman Michael
Outline
• What is Data-Flow Analysis?
• An example: Reaching Definitions
• Basic Concepts: Lattices, Flow Functions, and Fixed Points
• Taxonomy of Data-Flow Problems and Solutions
• Iterative Data-Flow Analysis
Data-Flow Analysis
• Input: A control flow graph
• Output: A control flow graph with “global” information at
every basic block
Examples
– Constant expressions: x+y*z
– Live variables
The purpose of data-flow analysis is to provide global information about
how a procedure manipulates its data.
For example, constant-propagation analysis seeks to determine whether
all assignments to a particular variable that may provide the value of that
variable at some particular point necessarily give it the same constant value.
If so, a use of the variable at that point can be replaced by the constant.
Data-flow analysis should, like any optimization, always attempt to get the
greatest possible benefit from the analyses and code-improvement
transformations without ever transforming correct code into incorrect code.
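As a small illustration of the constant-propagation case described above, the sketch below shows a function before and after the transformation. The functions `before` and `after` and the constant 4 are invented for this example and are not taken from the slides; a real compiler would perform the rewrite on its intermediate representation rather than on C source.

```c
/* Hypothetical example: every assignment that can supply the value of k
   at the return statement assigns the same constant, 4. */
int before(int flag)
{
    int k;
    if (flag)
        k = 4;      /* definition d1 */
    else
        k = 4;      /* definition d2, same constant */
    return k * 10;  /* all definitions reaching this use give k = 4 */
}

/* What constant propagation, followed by constant folding, could produce:
   the use of k is replaced by 4, and 4 * 10 is folded to 40. */
int after(int flag)
{
    (void)flag;     /* the branch no longer influences the result */
    return 40;
}
```

Both functions return the same value for every input, which is the safety requirement stated above: the transformation must never turn correct code into incorrect code.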
Compiler structure
String of characters → Scanner → tokens → Parser → AST → Semantic analyzer → IR → Code generator → object code
Shared components: symbol table and access routines; OS interface
Fig 1. Compiler structure
Optimizing compiler structure
String of characters → Front end → IR → Control-flow analysis → CFG → Data-flow analysis → CFG + information → Program transformation → IR → Instruction selection → object code
Fig 2. Optimizing compiler structure
An Example
Reaching Definitions
• A definition is an assignment to a variable
• A definition d reaches a basic block if there exists an
execution path from d to the basic block along which the value
assigned at d is still in effect, that is, the variable is not reassigned
Running Example
unsigned int fib(unsigned int m)
{
    unsigned int f0 = 0, f1 = 1, f2, i;
    if (m <= 1) {
        return m;
    }
    else {
        for (i = 2; i <= m; i++) {
            f2 = f0 + f1;
            f0 = f1;
            f1 = f2;
        }
        return f2;
    }
}
1:      receive m(val)
2:      f0 ← 0
3:      f1 ← 1
4:      if m <= 1 goto L3
5:      i ← 2
6:  L1: if i <= m goto L2
7:      return f2
8:  L2: f2 ← f0 + f1
9:      f0 ← f1
10:     f1 ← f2
11:     i ← i + 1
12:     goto L1
13: L3: return m
Block   Contents                                                          Successors
B0      Entry                                                             B1
B1      1: receive m(val); 2: f0 ← 0; 3: f1 ← 1; 4: if m <= 1 goto L3     B2, B6
B2      5: i ← 2                                                          B3
B3      6: L1: if i <= m goto L2                                          B4, B5
B4      7: return f2                                                      B7
B5      8: L2: f2 ← f0 + f1; 9: f0 ← f1; 10: f1 ← f2; 11: i ← i + 1; 12: goto L1   B3
B6      13: L3: return m                                                  B7
B7      Exit
Fig 3. Control Flow Graph
From the definition of reaching definitions it is easy to derive the following
table, which shows which definitions (by line number) reach each block:
Block   Reaching definitions
B0      ∅
B1      ∅
B2      2,3
B3      2,3,5,8,9,10,11
B4      2,3,5,8,9,10,11
B5      2,3,5,8,9,10,11
B6      2,3
B7      2,3,5,8,9,10,11
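There are two options for handling line 1, which receives the parameter m:
1) Line 1 gives m a value, so it counts as a definition; it then also
reaches every block after B1, that is, B2 through B7.
2) Line 1 contains no assignment, so it is not a definition. The table
above follows this option.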
Difficulties in
Data-Flow Analysis
• In general, it is recursively undecidable whether a definition
actually reaches some other point.
• Also, which definitions reach a point may depend on the input data.
For example, in the C code below, if the actual parameter n of function f is 1,
then the definition in line 5 actually reaches lines 7 and 8; if n > 0 and
n ≠ 1, then the definition in line 4 actually reaches lines 7 and 8.
If n <= 0, the loop body never executes, so neither line 4 nor line 5 reaches
lines 7 and 8.
What are the reaching definitions at line 10?
Once again, that depends on the actual parameter. The set may even be empty:
if n > 0 but function g never decrements n, the loop does not terminate and
line 10 is never reached.
1  int g(int m, int i);
2  int f(int n)
3  {
4      int i = 0, j;
5      if (n == 1) i = 2;
6      while (n > 0) {
7          j = i + 1;
8          n = g(n, i);
9      }
10     return j;
11 }
Iterative Computation of Reaching Definitions
• Optimistically assume that no definition reaches any block
• Every basic block “generates” new definitions and “preserves” other
definitions
• No definition reaches ENTRY
• Iteratively compute more and more definitions at every basic block
• The process must terminate
• The final solution is unique and conservative
Iterative Computation of Reaching Definitions
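\[ \mathit{RCin}(\mathit{ENTRY}) = \emptyset \]
A definition may reach the beginning of a block if it may reach
the end of one of its predecessors:
\[ \mathit{RCin}(B) = \bigcup_{B' \in \mathit{Pred}(B)} \mathit{RCout}(B') \]
A definition may reach the end of a basic block if and only if
1) it reaches the beginning of the block and is preserved by the block, or
2) it is generated in the block:
\[ \mathit{RCout}(B) = \mathit{GEN}(B) \cup (\mathit{RCin}(B) \cap \mathit{PRSV}(B)) \]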
Basic Block   GEN          PRSV
B0            ∅            2,3,5,8,9,10,11
B1            2,3          5,8,11
B2            5            2,3,8,9,10
B3            ∅            2,3,5,8,9,10,11
B4            ∅            2,3,5,8,9,10,11
B5            8,9,10,11    ∅
B6            ∅            2,3,5,8,9,10,11
B7            ∅            2,3,5,8,9,10,11
After first iteration:
Basic Block   RCin     RCout
B0            ∅        ∅
B1            ∅        2,3
B2            2,3      2,3,5
B3            2,3,5    2,3,5
B4            2,3,5    2,3,5
B5            2,3,5    8,9,10,11
B6            2,3      2,3
B7            2,3,5    2,3,5
After one more iteration:
Basic Block   RCin               RCout
B0            ∅                  ∅
B1            ∅                  2,3
B2            2,3                2,3,5
B3            2,3,5,8,9,10,11    2,3,5,8,9,10,11
B4            2,3,5,8,9,10,11    2,3,5,8,9,10,11
B5            2,3,5,8,9,10,11    8,9,10,11
B6            2,3                2,3
B7            2,3,5,8,9,10,11    2,3,5,8,9,10,11
Iterative Computation of
Reaching Definitions
Using Bit-Vectors
• Represent every definition with a bit:
1 means the definition may reach the given point, 0 means it
does not reach the point.
• PRSV and GEN are bit-vectors
The rules are now expressed with bitwise operations:
\[ \mathit{RCin}(\mathit{ENTRY}) = \langle 000 \ldots 000 \rangle \]
\[ \mathit{RCin}(i) = \bigvee_{j \in \mathit{Pred}(i)} \mathit{RCout}(j) \]
\[ \mathit{RCout}(i) = \mathit{GEN}(i) \vee (\mathit{RCin}(i) \wedge \mathit{PRSV}(i)) \]
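The bit-vector rules can be sketched in C for the running fib example. This is a minimal illustration under stated assumptions, not a definitive implementation: the bit assignment (definitions at IR lines 2, 3, 5, 8, 9, 10, 11 mapped to bits 0 to 6), the edge list read off the control flow graph, and the function name `reaching_definitions` are all choices made for this sketch. PRSV(B2) is taken to exclude the definitions of i (lines 5 and 11), since B2 assigns i.

```c
#include <stdint.h>

enum { NBLOCKS = 8 };   /* B0 (entry) through B7 (exit) */

/* Assumed encoding: one bit per definition, named after its IR line. */
enum { D2 = 1 << 0, D3 = 1 << 1, D5 = 1 << 2, D8 = 1 << 3,
       D9 = 1 << 4, D10 = 1 << 5, D11 = 1 << 6, ALL = 0x7F };

static const uint8_t GEN[NBLOCKS] =
    { 0, D2 | D3, D5, 0, 0, D8 | D9 | D10 | D11, 0, 0 };
static const uint8_t PRSV[NBLOCKS] =
    { ALL, D5 | D8 | D11, D2 | D3 | D8 | D9 | D10, ALL, ALL, 0, ALL, ALL };

/* CFG edges as {from, to} pairs. */
static const int EDGES[][2] = { {0, 1}, {1, 2}, {1, 6}, {2, 3}, {3, 4},
                                {3, 5}, {5, 3}, {4, 7}, {6, 7} };
enum { NEDGES = sizeof EDGES / sizeof EDGES[0] };

/* Iterates the two equations until nothing changes.
   Returns the number of passes made over the blocks,
   including the final pass that detects no change. */
int reaching_definitions(uint8_t rc_in[NBLOCKS], uint8_t rc_out[NBLOCKS])
{
    int passes = 0, changed = 1;

    for (int b = 0; b < NBLOCKS; b++)
        rc_in[b] = rc_out[b] = 0;       /* start: no definition reaches anything */

    while (changed) {
        changed = 0;
        passes++;
        for (int b = 0; b < NBLOCKS; b++) {
            uint8_t in = 0;
            for (int e = 0; e < NEDGES; e++)    /* OR over all predecessors */
                if (EDGES[e][1] == b)
                    in |= rc_out[EDGES[e][0]];
            uint8_t out = GEN[b] | (in & PRSV[b]);
            if (in != rc_in[b] || out != rc_out[b])
                changed = 1;
            rc_in[b] = in;
            rc_out[b] = out;
        }
    }
    return passes;
}
```

Called with two arrays of eight bytes, the function leaves the reaching-definition sets in `rc_in` and `rc_out`; for example, `rc_in[3]` ends up with all seven bits set, matching the row for B3. Blocks are visited in index order and each pass reuses values updated earlier in the same pass, which is one of several valid iteration strategies.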
Complete Join-Lattices
A lattice L consists of
1) a set of values
2) two operations called meet ($\sqcap$) and join ($\sqcup$).
Properties:
1) For all $x, y \in L$ there exist unique $z, w \in L$ such that $x \sqcap y = z$
and $x \sqcup y = w$ (closure)
2) For all $x, y \in L$, $x \sqcap y = y \sqcap x$ and $x \sqcup y = y \sqcup x$ (commutativity)
3) For all $x, y, z \in L$, $(x \sqcap y) \sqcap z = x \sqcap (y \sqcap z)$ and
$(x \sqcup y) \sqcup z = x \sqcup (y \sqcup z)$ (associativity)
4) There are two unique elements of L called bottom ($\bot$) and top ($\top$),
such that for all $x \in L$, $x \sqcap \bot = \bot$ and $x \sqcup \top = \top$
Example 1: Bit vectors
BV(n) will be used to denote the lattice of bit vectors of
length n.
$\bot = \langle 0 \ldots 0 \rangle$
$\top = \langle 1 \ldots 1 \rangle$
$\sqcap$ is bitwise and
$\sqcup$ is bitwise or
Example 2: ICP
Elements: $\bot$, $\top$, all the integers and the Booleans
Properties:
1) For all $n \in ICP$, $n \sqcap \bot = \bot$
2) For all $n \in ICP$, $n \sqcup \top = \top$
The meet of any two elements is found by following the lines downward from
them until they meet, and the join is found by following the lines upward until
they join.
                          ⊤
false  ...  -2  -1  0  1  2  ...  true
                          ⊥
(every element of the middle row is connected by a line to both ⊤ and ⊥)
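The meet and join rules for ICP can be sketched in C as follows. The type `icp`, its tags, and the function names are hypothetical, introduced only for this illustration; the sketch assumes that two distinct middle-row elements have nothing in common except $\bot$ below and $\top$ above, as the diagram suggests.

```c
#include <stdbool.h>

/* Hypothetical encoding of one ICP element. */
typedef enum { ICP_BOTTOM, ICP_INT, ICP_BOOL, ICP_TOP } icp_kind;

typedef struct {
    icp_kind kind;
    int value;      /* integer value, or 0/1 for a Boolean; unused otherwise */
} icp;

bool icp_equal(icp a, icp b)
{
    if (a.kind != b.kind)
        return false;
    if (a.kind == ICP_INT || a.kind == ICP_BOOL)
        return a.value == b.value;
    return true;    /* both top or both bottom */
}

/* Meet: follow the lines downward until they meet. */
icp icp_meet(icp a, icp b)
{
    if (a.kind == ICP_TOP) return b;
    if (b.kind == ICP_TOP) return a;
    if (icp_equal(a, b)) return a;
    return (icp){ ICP_BOTTOM, 0 };   /* distinct elements only share bottom */
}

/* Join: follow the lines upward until they join. */
icp icp_join(icp a, icp b)
{
    if (a.kind == ICP_BOTTOM) return b;
    if (b.kind == ICP_BOTTOM) return a;
    if (icp_equal(a, b)) return a;
    return (icp){ ICP_TOP, 0 };      /* distinct elements only share top */
}
```

For instance, the meet of the integers 2 and 3 is bottom and their join is top, while the meet of 2 with top is 2 itself.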
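• A partial order $\sqsubseteq$ on the elements of L can be defined as follows:
$x \sqsubseteq y$ if and only if $x \sqcap y = x$
• $x \sqsubseteq y \Leftrightarrow$ x “covers” fewer states than y $\Leftrightarrow$
x is more precise than y
• The height of a lattice is the length of the longest strictly increasing chain
$x_1 \sqsubset x_2 \sqsubset \dots \sqsubset x_k$,
where
$\bot = x_1$ and
$\top = x_k$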
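For bit vectors, the partial order defined above reduces to a single bitwise test. The sketch below assumes BV(8) stored in one byte; the function names `bv_meet`, `bv_join`, and `bv_leq` are made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* BV(8) in one byte: meet is bitwise and, join is bitwise or. */
uint8_t bv_meet(uint8_t x, uint8_t y) { return x & y; }
uint8_t bv_join(uint8_t x, uint8_t y) { return x | y; }

/* x is below or equal to y exactly when the meet of x and y gives back x,
   i.e. every bit set in x is also set in y. */
bool bv_leq(uint8_t x, uint8_t y) { return bv_meet(x, y) == x; }
```

With this definition `bv_leq(0x05, 0x07)` holds, while 0x05 and 0x03 are incomparable: neither `bv_leq(0x05, 0x03)` nor `bv_leq(0x03, 0x05)` holds, which shows that the order is only partial.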
Functions on Lattices
• A function mapping a lattice to itself, $f: L \to L$, is monotonic
if for all $x, y$: $x \sqsubseteq y \Rightarrow f(x) \sqsubseteq f(y)$
Example:
$f: BV(3) \to BV(3)$ defined by $f(\langle x_1 x_2 x_3 \rangle) = \langle x_1 1 x_3 \rangle$ is monotonic.
• A fixed point of a function $f: L \to L$
is an element $x$ of L such that $f(x) = x$.
Example: $f: BV(1) \to BV(1)$ with
$f(0) = 0$ and $f(1) = 1$.
Both 0 and 1 are fixed points of f.
• For a monotonic function $f: L \to L$, the effective height of L relative to
f is the length of the longest strictly increasing chain
obtained by iterating application of f,
that is, the maximal n such that there exist $x_1$,
$x_2 = f(x_1)$,
$x_3 = f(x_2)$,
...,
$x_n = f(x_{n-1})$
such that
$x_1 \sqsubset x_2 \sqsubset \dots \sqsubset x_n \sqsubseteq \top$
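Because BV(3) has only eight elements, the monotonicity and fixed-point definitions above can be checked by brute force. The sketch below encodes $\langle x_1 x_2 x_3 \rangle$ in the low three bits of a byte with $x_1$ as the highest of the three; the helper names and the counterexample function `flip` are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum { BV3_SIZE = 8 };   /* BV(3) has 2^3 elements */

/* The example function f(<x1 x2 x3>) = <x1 1 x3>: force the middle bit to 1. */
uint8_t f(uint8_t x) { return x | 0x2; }

/* Hypothetical counterexample: bitwise complement within three bits. */
uint8_t flip(uint8_t x) { return (uint8_t)(~x & 0x7); }

bool bv_leq(uint8_t x, uint8_t y) { return (x & y) == x; }

/* Exhaustively checks that x <= y implies g(x) <= g(y) over BV(3). */
bool is_monotonic(uint8_t (*g)(uint8_t))
{
    for (uint8_t x = 0; x < BV3_SIZE; x++)
        for (uint8_t y = 0; y < BV3_SIZE; y++)
            if (bv_leq(x, y) && !bv_leq(g(x), g(y)))
                return false;
    return true;
}

/* Counts the elements x of BV(3) with g(x) = x. */
int count_fixed_points(uint8_t (*g)(uint8_t))
{
    int n = 0;
    for (uint8_t x = 0; x < BV3_SIZE; x++)
        if (g(x) == x)
            n++;
    return n;
}
```

Here `is_monotonic(f)` holds, and `f` has four fixed points, namely the vectors whose middle bit is already 1. The complement function fails the test because it reverses the order: 000 is below 111, but their images 111 and 000 are not in that order.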
The Join (Meet) Over All Paths
• A data-flow solution that is precise under the assumption that
every control-flow path is executable
• Let $G = \langle N, E \rangle$ be a flow graph
• Let $Path(B)$ represent the set of all paths from entry to any node
$B \in N$, and let $p$ be any element of $Path(B)$
• Let $F_B(\,)$ be the flow function representing flow through block $B$,
and let $F_p(\,)$ represent the composition of the flow functions
encountered in following the path $p$.
For example, if $B_1 = entry, \dots, B_n = B$ are the blocks making up the
path $p$ to $B$, then
\[ F_p = F_{B_n} \circ \dots \circ F_{B_2} \circ F_{B_1} \]
• Let $Init$ be the lattice value associated with the entry block
• The MOP at a block B
\[ MOP(B) = \bigsqcap_{p \in Path(B)} F_p(Init) \]
• The JOP at a block B
\[ JOP(B) = \bigsqcup_{p \in Path(B)} F_p(Init) \]
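The composition $F_p$ can be illustrated on the fib example by applying the reaching-definitions flow functions along individual paths. This is only a sketch: it reuses an assumed bit encoding (definitions at IR lines 2, 3, 5, 8, 9, 10, 11 as bits 0 to 6) with GEN and PRSV values chosen to match the running example, and the names `flow` and `path_flow` are hypothetical. It evaluates two sample paths rather than the full, infinite set $Path(B)$.

```c
#include <stdint.h>

/* Assumed encoding: one bit per definition, named after its IR line. */
enum { D2 = 1 << 0, D3 = 1 << 1, D5 = 1 << 2, D8 = 1 << 3,
       D9 = 1 << 4, D10 = 1 << 5, D11 = 1 << 6, ALL = 0x7F };

/* GEN and PRSV for blocks B0..B7 of the running example. */
static const uint8_t GEN[8] =
    { 0, D2 | D3, D5, 0, 0, D8 | D9 | D10 | D11, 0, 0 };
static const uint8_t PRSV[8] =
    { ALL, D5 | D8 | D11, D2 | D3 | D8 | D9 | D10, ALL, ALL, 0, ALL, ALL };

/* Flow function of one block: F_B(x) = GEN(B) | (x & PRSV(B)). */
uint8_t flow(int block, uint8_t x)
{
    return GEN[block] | (x & PRSV[block]);
}

/* F_p(init) for the path p = path[0], ..., path[len-1]:
   the block functions are applied in path order, which is the
   composition F_Bn( ... F_B2(F_B1(init)) ... ). */
uint8_t path_flow(const int *path, int len, uint8_t init)
{
    uint8_t x = init;
    for (int k = 0; k < len; k++)
        x = flow(path[k], x);
    return x;
}
```

For the path B0, B1, B2, B3, B4 with Init = 0 the result is the set {2, 3, 5}; for the path that goes once around the loop, B0, B1, B2, B3, B5, B3, B4, it is {8, 9, 10, 11}. Joining the two with bitwise or already yields all seven definitions for B4.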
Dimensions for
Data-Flow Problems
• The information provided
• “Relational” vs. independent attributes
• The type of lattice and functions used
powersets, ICPn, ..., unbounded heights
• The direction of information flow
forward, backward, bidirectional
Example Data-Flow Problems
• Reaching Definitions
• Available Expressions
• Live Variables
• Upward Exposed Uses
• Copy-Propagation Analysis
• Constant-Propagation Analysis
• Partial-Redundancy Analysis
Data-Flow Analysis Algorithms
• Allen’s strongly connected regions
• Kildall’s iterative algorithm
• Ullman’s T1-T2 analysis
• Kennedy’s node-listing algorithm
• Farrow, Kennedy, and Zucconi’s graph grammar approach
• Rosen’s high-level approach
• Structural analysis
• Slotwise analysis