More on Loop Optimization
Data Flow Analysis
CS 480

A broader goal for loop optimization
• General idea:
  – Link every assignment of a variable to the set of all places it might possibly be used
  – Link every use of a variable to the set of all places it might possibly have been assigned
• In general, a use might have several assignments, and an assignment will almost always have many uses

How to do this
• Fundamental equation, for each node:
    out = (in - kill) union gen
  – Out: variables (and values) that flow out of a node
  – In: variables (and values) that flow in to a node
  – Kill: variables killed by a node
  – Gen: variables (and values) generated by a node

Note that the kill definition is slightly different
• For common subexpressions, kill was defined as “possibly altered”
• For this we need to define kill in terms of “possibly survive”: if an assignment might possibly survive a node, the node does not kill it
    x = 12
    *p = 42
    …
    if (x < b)   // could use of x come from assign? (yes, unless *p definitely overwrites x)

Apply equations repeatedly
• Apply equations to each node in the graph until nothing changes
• Will almost certainly need to do this several times
    x = 3
    while (x < a)   // link use of x to assignment
        if (b > 0)
            x = x + b
        else
            x = x - b

Once we have this information, what can we do with it?
• What is a variable that is used but never assigned to?
• What is a variable that is set but never used?

What about this one?
    max = 142
    ….   // lots of code, none of which changes max
    if (a < (max - 1)) ….   // called constant propagation

Dead code elimination
• Can possibly discover, at compile time, that a whole section of code can never be executed.
• Think that no good programmer would ever do this? How many people have done this?
    debug = false
    …
    …
    if (debug) { … }
    …
• Why generate code that is not executed?

Copy Propagation
• After x = y, x and y are aliases of each other until one or the other is reassigned.
• Can look for common subexpressions that might not be obvious
    z = (x + 42) * w
    g = (y + 42) * w

From previous lectures
• Loop invariant code (expressions whose operands are not set inside the loop)
• Induction variables and the resulting reduction in strength

Common subexpressions that span nodes
• Previously we only looked at common subexpressions within a single node
• What about expressions that span nodes? These are called available expressions
    a[x + 42] = 17       // subscript is common
    if (b < 39)
        a[x + 42] = 18   // can be saved and reused

Or can be distributed out
• Or, common subexpressions can be moved back to a previous common node. These are called busy expressions
    if (a < 12)
        b[x + 32] = 17
    else
        b[x + 32] = 49   // can move subscript before if

Live-dead analysis
• When registers are used to hold variables (more next week), we need to know when a variable is no longer being used, so that its register can be freed
• Called “live-dead analysis”

New one: polymorphism elimination
    var w : window
    ….
    w = new textWindow();
    ….
    w.draw();   // if we can ensure that w is a textWindow, can probably generate faster code

Bottom line
• Data flow analysis, matching every variable set and every variable use, is one of the most powerful tools in the optimizer writer's set of tricks
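
Worked sketch: applying the equations repeatedly
The "apply equations until nothing changes" step can be sketched in a few lines of Python for the while-loop example (x = 3, then a loop that adds or subtracts b). This is only an illustration under stated assumptions, not the course's implementation: the node names (entry, loop_test, ...), the definition labels d1, d2, d3, and the function name reaching_definitions are all hypothetical, the if is assumed to be the loop body, and "in" for a node is assumed to be the union of "out" over its predecessors.

```python
# Reaching definitions for the while-loop example, solved by applying
#   out = (in - kill) union gen
# to every node until nothing changes.
# All node names and definition labels below are hypothetical.

# Control flow graph: node -> list of successor nodes.
SUCCESSORS = {
    "entry": ["loop_test"],            # d1: x = 3
    "loop_test": ["branch", "exit"],   # while (x < a)   (uses x)
    "branch": ["then", "else"],        # if (b > 0)
    "then": ["loop_test"],             # d2: x = x + b
    "else": ["loop_test"],             # d3: x = x - b
    "exit": [],
}

# Definitions of x generated by each node, and the other definitions of x
# that the node definitely overwrites.
GEN = {"entry": {"d1"}, "then": {"d2"}, "else": {"d3"}}
KILL = {"entry": {"d2", "d3"}, "then": {"d1", "d3"}, "else": {"d1", "d2"}}


def reaching_definitions(successors, gen, kill):
    """Return (in_sets, out_sets, passes) after iterating to a fixed point."""
    # Build the predecessor lists from the successor lists.
    preds = {node: [] for node in successors}
    for node, succs in successors.items():
        for succ in succs:
            preds[succ].append(node)

    in_sets = {node: set() for node in successors}
    out_sets = {node: set() for node in successors}
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for node in successors:
            # in = union of out over all predecessors (assumption, see lead-in)
            new_in = set()
            for pred in preds[node]:
                new_in |= out_sets[pred]
            # The fundamental equation from the slides.
            new_out = (new_in - kill.get(node, set())) | gen.get(node, set())
            if new_in != in_sets[node] or new_out != out_sets[node]:
                in_sets[node], out_sets[node] = new_in, new_out
                changed = True
    return in_sets, out_sets, passes


if __name__ == "__main__":
    in_sets, out_sets, passes = reaching_definitions(SUCCESSORS, GEN, KILL)
    for node in SUCCESSORS:
        print(node, "in:", sorted(in_sets[node]), "out:", sorted(out_sets[node]))
    print("passes:", passes)
```

• On the first pass only d1 reaches the loop test, because the back edges from the loop body have not yet contributed anything
• A later pass adds d2 and d3, which is why the equations have to be applied more than once
• The use of x in the while condition ends up linked to all three assignments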