Data Flow Analysis
CS2210 Lecture 14
CS2210 Compiler Design 2004/5

Lattices
■ D = (S, ≤)
  ■ S set of elements
  ■ ≤ induces a partial order
    ■ Reflexive, transitive & anti-symmetric
■ ∀ x,y ∈ S:
  ■ meet ∧(x,y) (greatest lower bound)
  ■ join ∨(x,y) (least upper bound)
  ■ Both are again elements of S (closure property)
■ Unique Top (T) & Bottom (⊥) elements: x ∧ ⊥ = ⊥ and x ∨ T = T
■ Meet and join are commutative and associative
■ Height of lattice: longest path through partial order from top to bottom

Lattices in Data Flow Analysis
■ Model information by elements of a lattice domain
  ■ Top = best case info
  ■ Bottom = worst case info
  ■ Initial info for optimistic analyses (at least back edges: top)
  ■ If a ≤ b then a is a conservative approximation of b
  ■ Merge function = meet (∧): the most precise element that's a conservative approximation of both input elements

Some Typical Lattice Domains
■ Two point lattice: ⊥ and T
  ■ Boolean property
  ■ A tuple of two point lattices = bit vector
■ Lifted set: set of incomparable values and ⊥ and T
  ■ Example?
■ Powerset lattice: set of all subsets of S, ordered somehow (often by ⊆)
  ■ T = {}, ⊥ = S or vice versa
  ■ Collecting analysis
  ■ Isomorphic to tuple of booleans indicating membership in subset of elements of S

Product (aka. Tuple) Lattices
■ Often useful to break complex lattice into a tuple of lattices, one per variable analyzed
■ D_T = <S_T, ≤_T> = <S, ≤>^N
  ■ S_T = S_1 × S_2 × … × S_N
  ■ ≤_T pointwise ordering
  ■ T_T = <T_D, …, T_D>, bottom: tuple of bottoms
  ■ Height(D_T) = N * height(D)
■ Example?
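
One possible answer to the "Example?" prompts above, as a minimal sketch: a lifted set of integers (as used for constant propagation) and a product lattice with one component per variable. The names TOP, BOTTOM, meet, leq and meet_env are assumptions made for this illustration, not taken from the slides.

```python
from typing import Dict, Union

# Hypothetical markers for the extra elements of the lifted lattice.
TOP = "TOP"        # best-case info: no conflicting value seen yet
BOTTOM = "BOTTOM"  # worst-case info: not a constant

Value = Union[int, str]


def meet(a: Value, b: Value) -> Value:
    """Greatest lower bound in the lifted lattice of integers."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    if a == BOTTOM or b == BOTTOM:
        return BOTTOM
    # Two plain integers: equal values stay, incomparable values drop to BOTTOM.
    return a if a == b else BOTTOM


def leq(a: Value, b: Value) -> bool:
    """a <= b holds exactly when meet(a, b) == a."""
    return meet(a, b) == a


def meet_env(e1: Dict[str, Value], e2: Dict[str, Value]) -> Dict[str, Value]:
    """Pointwise meet in the product lattice (both maps share the same keys)."""
    return {var: meet(e1[var], e2[var]) for var in e1}


# Merging the info from two predecessors of a join point.
merged = meet_env({"x": 1, "y": TOP, "z": 3}, {"x": 1, "y": 2, "z": 4})
print(merged)
```

Here x stays constant, y takes the only value seen so far, and z drops to BOTTOM because the two inputs disagree.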

Analysis of Loops with Lattices
■ F = flow function for loop body
■ F(info-at-loop-head) = info at back edge:
  ■ F_0 = d_entry ∧ T = d_entry
  ■ F_1 = d_entry ∧ B(F_0) = F(F_0) = F(d_entry)
  ■ F_2 = d_entry ∧ B(F_1) = F(F(F_0))
  ■ F_k = d_entry ∧ B(F_{k-1}) = F(F(…(F(d_entry))…))
■ Repeat until F_{k+1} = F_k

Termination of Iterative Analysis
■ Sufficient conditions
  ■ Flow functions (F) are monotonic: d1 ≤ d2 ⇒ F(d1) ≤ F(d2)
  ■ Lattice is of finite height
■ Start at T
■ Each application of F either changes nothing (fixed point reached) or goes down at least one level
■ Eventually hit fixed-point or bottom
  ■ At most height(D) - 1 applications

Examples
■ Lattices for
  ■ Constant propagation
  ■ Live variables
  ■ Reaching definitions

Distributive Lattices
■ A lattice is distributive :⇔ ∀x,y,z ∈ D: (x ∧ y) ∨ z = (x ∨ z) ∧ (y ∨ z) and (x ∨ y) ∧ z = (x ∧ z) ∨ (y ∧ z)
■ Example:
  ■ Live variables, only elements T and ⊥ (easy exercise)
■ Counterexample
  ■ Constant propagation: (1 ∨ 2) ∧ 3 = T ∧ 3 = 3, but (1 ∧ 3) ∨ (2 ∧ 3) = ⊥ ∨ ⊥ = ⊥

Meet-Over-All Paths Solution
■ Flow function for basic block B: F_B; for a path p along B1 … Bn: F_p = F_Bn ° … ° F_B1 (F_B1 is applied first)
■ MOP(B) = ∧_{p ∈ Path(B)} F_p(Init)
  ■ Init is initial info at entry block
■ MOP is most precise solution we can hope for
■ MOP computation is undecidable for general monotone flow functions
  ■ I.e. there is no algorithm that is guaranteed to work for all flow graphs
  ■ Use approximation: Maximum Fixed-Point solution (MFP)

Important Results
■ Monotone flow functions
  ■ Iterative algorithm guaranteed to produce the MFP solution
■ Distributive (and hence monotone) flow functions
  ■ MFP = MOP
  ■ Flow functions of bit-vector problems, i.e., functions f: BV^n -> BV^n of the gen/kill form f(x) = gen ∨ (x ∧ ¬kill), are distributive

Important Data Flow Problems
■ Reaching definitions
  ■ Which definitions of a variable v reach a particular use of v
■ Available expressions
  ■ What expressions are available at a particular program point (e.g., x*y is available in variable t1)
■ Live variables
  ■ For a given program point, is there a use of the variable along some path to exit
■ Upwards exposed uses
  ■ What uses of variables at particular points are reached by particular definitions
■ Copy propagation
  ■ From x := y to a use of x, no assignments to y?
■ Constant propagation
■ Partial redundancy elimination
  ■ Original formulation forward & backward flow problem

Worklist Algorithm for IDFA
procedure Worklist_Iterate(N, entry, F, dfin, Init)
    N: in set of Node
    entry: in Node
    F: in Node × L -> L
    dfin: out Node -> L
    Init: in L
begin
    B, P: Node
    Worklist: set of Node
    effect, totaleffect: L
    /* initialize every node to TOP first, so the entry value is not overwritten */
    for each B in N do
        dfin(B) := TOP
    od
    dfin(entry) := Init
    Worklist := N - {entry}
    repeat
        B := choose(Worklist)
        Worklist -= {B}
        /* meet the effects of all predecessors of B */
        totaleffect := TOP
        for each P in Pred(B) do
            effect := F(P, dfin(P))
            totaleffect ∧= effect
        od
        /* if the info at B changed, its successors must be revisited */
        if dfin(B) != totaleffect then
            dfin(B) := totaleffect
            Worklist U= Succ(B)
        fi
    until Worklist = {}
end

Lattices of Flow Functions (1)
■ Can define lattice of monotone flow functions over lattices:
  ■ L lattice, define L^F as the set of monotone functions from L -> L, i.e., f ∈ L^F ⇔ ∀x,y ∈ L: x ≤ y ⇒ f(x) ≤ f(y)
  ■ Meet defined as: ∀ f,g ∈ L^F, ∀x ∈ L: (f ∧ g)(x) = f(x) ∧ g(x)
  ■ Top: ∀ x ∈ L: T^F(x) = T
  ■ Bottom: ∀ x ∈ L: ⊥^F(x) = ⊥
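
Returning to the Worklist_Iterate procedure above, the following is a rough Python sketch of the same idea, applied to reaching definitions on a tiny loop. The function signature, the CFG, and the gen/kill sets are assumptions made up for this illustration. The sketch also assumes that dfin(entry) must keep its initial value, so the entry node is never put back on the worklist.

```python
def worklist_iterate(nodes, entry, succ, pred, flow, meet, top, init):
    """Iterate to a fixed point; returns the info at the entry of each node."""
    dfin = {b: top for b in nodes}
    dfin[entry] = init
    worklist = set(nodes) - {entry}
    while worklist:
        b = worklist.pop()
        # Meet the effects of all predecessors of b.
        total = top
        for p in pred[b]:
            total = meet(total, flow(p, dfin[p]))
        if dfin[b] != total:
            dfin[b] = total
            # The info at b changed, so its successors must be revisited.
            worklist |= set(succ[b]) - {entry}
    return dfin


# Hypothetical CFG: entry -> B1, B1 -> B2, B2 -> B1 (back edge), B1 -> B3.
succ = {"entry": ["B1"], "B1": ["B2", "B3"], "B2": ["B1"], "B3": []}
pred = {"entry": [], "B1": ["entry", "B2"], "B2": ["B1"], "B3": ["B1"]}

# B1 contains definition d1 of x, B2 contains definition d2 of x.
gen = {"entry": frozenset(), "B1": frozenset({"d1"}),
       "B2": frozenset({"d2"}), "B3": frozenset()}
kill = {"entry": frozenset(), "B1": frozenset({"d2"}),
        "B2": frozenset({"d1"}), "B3": frozenset()}

# Reaching definitions: meet is set union, so TOP is the empty set.
reach = worklist_iterate(
    nodes=succ.keys(),
    entry="entry",
    succ=succ,
    pred=pred,
    flow=lambda p, x: gen[p] | (x - kill[p]),
    meet=lambda a, b: a | b,
    top=frozenset(),
    init=frozenset(),
)
print(reach)
```

Only d2 reaches the top of B1 (via the back edge), while d1 reaches B2 and B3. The order in which nodes are taken from the worklist does not change the result, only the number of iterations.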

Lattices of Flow Functions (2)
■ Identity function: id(x) = x, ∀x ∈ L
■ Function composition: (f ° g)(x) = f(g(x))
  ■ L^F is closed under composition
  ■ f^0 := id, f^n := f ° f^(n-1) for n ≥ 1
■ Kleene closure f*
  ■ ∀x ∈ L: f*(x) := lim_{n→∞} (id ∧ f)^n(x)
  ■ L^F is closed under Kleene closure (for finite height lattices)
    ■ Sufficient to have finite effective height (relative to function f) =: longest strictly descending chain of the form f(x), f^2(x), f^3(x), …

Control-Tree-Based Data Flow Analysis
■ Recall two approaches for control flow analysis
  ■ Interval analysis
  ■ Structural analysis
■ Control-tree-based data flow analysis uses the intervals / control structures identified to perform data flow analysis

Structural Data Flow Analysis
■ Example: if-then
  ■ F_if-then = (F_then ° F_if/Y) ∧ F_if/N
■ Propagate info to substructures:
  ■ in(if) = in(if-then)
  ■ in(then) = F_if/Y(in(if))
[Figure: if-then region with nodes "if" and "then"; edges labeled F_if/Y, F_if/N and F_then; the collapsed region is labeled F_if-then]

While Loops
■ F_loop = (F_body ° F_while/Y)*
■ F_while-loop = F_while/N ° F_loop
■ Propagate info to substructures:
  ■ in(while) = F_loop(in(while-loop))
  ■ in(body) = F_while/Y(in(while))
[Figure: while-loop region with nodes "while" and "body"; edges labeled F_while/Y, F_while/N and F_body; the collapsed region is labeled F_while-loop]

Structural Analysis for Backward Flow Problems
■ Problem
  ■ Single-entry but multiple-exit control structures
    ■ Have multiple entries in backward flow
    ■ Have to meet (∧) possible exits
    ■ Details in the book
■ Single-entry, single-exit structures
  ■ Can turn equations around

Static Single Assignment Form

Reading
■ Rest of chapter 8
■ Chapter 9

Variable Webs
■ Web = maximal union of intersecting du-chains
[Figure: example CFG with blocks entry, B1 to B6 and exit; statements shown: z > 1, x := 1, z > 2, x := 2, y := x+1, z := x-3, x := 4, z := x+7]

Static Single Assignment Form
■ THE standard IR in optimizing compilers
■ Properties
  ■ Every use of a variable has at most one reaching definition
  ■ Separates values from the locations
  ■ Makes many optimizations more effective:
    ■ Constant propagation, value numbering, invariant code motion & removal, strength reduction, PRE
■ Mechanism:
  ■ Create new target names for definitions by subscripting variables
  ■ Introduce Φ-functions at merge points as pseudodefinitions
  ■ Adjust uses
■ Want to minimize # Φ-functions
  ■ Minimizes translation overhead for code generation

Dominance Frontier
■ DF(x) = {y | ∃z ∈ pred(y) so that x dom z and not x sdom y}
  ■ Set of all CFG nodes for which x dominates a predecessor but does not strictly dominate the node itself
  ■ Direct computation is quadratic in number of CFG nodes
  ■ Use recursive equations and solve iteratively

Dominance Frontier
■ DF_local(x) = {y ∈ succ(x) | idom(y) ≠ x}
■ DF_up(x,z) = {y ∈ DF(z) | idom(y) ≠ x}
■ DF(x) = DF_local(x) ∪ ⋃_{z ∈ N, idom(z) = x} DF_up(x,z)

Example
[Figure: example CFG with nodes 1 to 13]
■ Nodes dominated by 5: {5, 6, 7, 8}
■ DF(5) = {4, 5, 12, 13}

Dominance Frontier Algorithm
Procedure ComputeDF(n)
begin
    S := {}
    /* this loop computes DF_local(n) */
    for each node y in succ(n) do
        if idom(y) != n then
            S U= {y}
        fi
    od
    /* add the part of each child's frontier that n does not strictly dominate */
    for each child c of n in the dominator tree do
        compute DF[c]
        for each element w of DF[c] do
            if n does not dominate w or n = w then
                S U= {w}
            fi
        od
    od
    DF[n] := S
end
■ Algorithm is O(E + sizeof(DFs))
■ In practice linear in size of graph

Dominance Frontier Criterion
■ Whenever node x contains a definition of some variable a, then any node in the dominance frontier of x needs a Φ-function for a
  ■ Dominance frontier = earliest points where definition is not guaranteed to be unique
  ■ Since Φ-functions are definitions themselves, have to iterate

Iterated Dominance Frontier
■ Define DF for set of nodes: DF(S) = ⋃_{x ∈ S} DF(x)
■ Iterated Dominance Frontier: DF^+(S) = lim_{i→∞} DF^i(S), where DF^(i+1)(S) = DF(S ∪ DF^i(S)), and DF^1(S) = DF(S)
■ If S is set of nodes that assign to variable x (including the entry node) then DF^+(S) is the set of nodes that need Φ-functions for x

Example
■ On board
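
In place of the board example, here is a small Python sketch of the ComputeDF procedure and the iterated dominance frontier DF^+(S). The CFG, the function names, and the naive dominator computation are assumptions for illustration only; the sketch assumes every node appears as a key of the successor map and is reachable from the entry node.

```python
def dominators(succ, entry):
    """Dom sets via Dom(n) = {n} ∪ ⋂ Dom(p) over all predecessors p of n."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def dominance_frontiers(succ, entry):
    """DF for every node, following the recursive ComputeDF scheme."""
    dom = dominators(succ, entry)
    # The immediate dominator is the strict dominator with the largest Dom set.
    idom = {n: max(dom[n] - {n}, key=lambda d: len(dom[d]))
            for n in succ if n != entry}
    children = {n: [] for n in succ}
    for n, d in idom.items():
        children[d].append(n)
    df = {}

    def compute_df(n):
        s = {y for y in succ[n] if idom.get(y) != n}  # DF_local(n)
        for c in children[n]:
            compute_df(c)
            for w in df[c]:
                if n not in dom[w] or n == w:  # n does not strictly dominate w
                    s.add(w)
        df[n] = s

    compute_df(entry)
    return df


def iterated_df(df, s):
    """DF^+(S): repeat DF(S ∪ DF^i(S)) until nothing changes."""
    def df_of_set(nodes):
        out = set()
        for x in nodes:
            out |= df[x]
        return out

    result = df_of_set(s)
    while True:
        nxt = df_of_set(set(s) | result)
        if nxt == result:
            return result
        result = nxt


# Hypothetical CFG: a loop headed by "a" whose body is an if-then-else.
cfg = {"entry": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"],
       "d": ["a", "exit"], "exit": []}
df = dominance_frontiers(cfg, "entry")
# Variable x is assigned in b (plus the implicit definition at entry).
phi_nodes = iterated_df(df, {"entry", "b"})
print(df, phi_nodes)
```

In this graph DF(b) = {d}, and because the Φ-function placed at d is itself a definition, the iteration also adds DF(d) = {a}, so both the join point d and the loop header a need a Φ-function for x.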