Escape Analysis for Java
Will von Rosenberg
Noah Wallace

Points-to vs. Escape Analysis
Points-to
– Memory disambiguation
– Determines whether two pointers can resolve to the same memory location
– For correctness, pointers that may resolve to the same memory location must lead to the same location in the points-to graph
Escape Analysis
– Identifies objects that might escape a (dynamic) scope, such as a method invocation or a thread object
– Pointers in the connection graph may lead to different nodes and the result can still be correct
– Can safely ignore the calling context

Reasons for Escape Analysis
If an object does not escape the method it was created in, it can be allocated on the stack instead of the heap; heap allocation is (supposedly) expensive in time.
If an object does not escape the thread it was created in, it does not need to be synchronized, so lock() and unlock() need not be used on it. These synchronization operations are inherently expensive in time.

Escape Analysis Definitions and Propositions
An object is said to escape a Method if the lifetime of the object exceeds the lifetime of the Method, that is, the scope of the object is greater than the scope of the Method it was created in.
An object is said to escape a Thread if another thread, different from the one that created it, uses (locks) that object.
If an object does not escape the method, !Escapes(O, M), and that method was invoked in thread T, then the object does not escape the thread either, !Escapes(O, T).
(Proposition 2.3)

Escape Lattice
GlobalEscape – Escapes all Methods and Threads
ArgEscape – Escapes the Method in which it was created, but not the thread in which that Method was invoked
NoEscape – Escapes neither the Method nor the Thread

Connection Graph
Captures the connectivity relationship among objects

Connection Graph (cont’d)
Fid() – A number that identifies a field within a class. This field identifier (or offset) is unique within the class and can be compared across instances of the same class.
The path notation (a figure on the original slide) refers to a reference node followed by an arbitrary number of deferred edges that lead to a points-to edge.
PointsTo() – Covers all such paths, where m, through a sequence of deferred edges, points to n. PointsTo(m) returns the set of all nodes n that m points to.

Intraprocedural Analysis
Flow-sensitive and flow-insensitive variants
Simplify the presentation by
– Splitting a multiple-level reference expression into a sequence of two-level references
– The ByPass() function: removes all edges of a node, incoming and outgoing, and redirects the incoming deferred edges to the node's successors, which gives a more precise graph for flow-sensitive analysis

Intra (cont’d)
p = new r()
– FS (flow-sensitive): apply ByPass(p), then create a new object node with a points-to edge from p to it.
– FI (flow-insensitive): create a new object node with a points-to edge from p to it. ByPass(p) is not called.
p = q
– FS: apply ByPass(p) and add a deferred edge from p to q.
– FI: skip ByPass(p) and add a deferred edge from p to q.
p.f = q
– FS and FI are treated the same: ByPass(p) is not applied, and deferred edges are added from each node in V to q, where V is the set of field nodes f of the objects in PointsTo(p).
– A phantom object node is created if PointsTo(p) is empty, and field nodes are created if V is empty.
p = q.f
– FS: apply ByPass(p) and add deferred edges from p to the nodes in V, where V is the set of field nodes f of the objects in PointsTo(q).
– FI: skip ByPass(p) and add deferred edges from p to the nodes in V.
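
Sketch: connection graph with PointsTo() and ByPass()
The rules for p = new r() and p = q above can be tried out on a toy graph. The following is a minimal sketch, not the analysis as actually implemented: the class ConnectionGraphSketch, its Node type, and the method names assignNew and assignCopy are hypothetical names chosen for illustration, and field nodes and phantom nodes are left out to keep it short. It assumes ByPass(p) redirects p's incoming deferred edges to p's successors before removing p's edges.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class ConnectionGraphSketch {
    /** Hypothetical node type: reference nodes carry edges, object nodes are leaves here. */
    static final class Node {
        final String name;
        final Set<Node> pointsTo = new LinkedHashSet<>(); // points-to edges to object nodes
        final Set<Node> deferred = new LinkedHashSet<>(); // deferred edges to reference nodes
        Node(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    private final Map<String, Node> refs = new LinkedHashMap<>();
    private int nextObjectId = 0;

    /** Returns the reference node for a variable, creating it on first use. */
    Node ref(String name) {
        return refs.computeIfAbsent(name, Node::new);
    }

    /** PointsTo(m): objects reached by zero or more deferred edges, then one points-to edge. */
    Set<Node> pointsTo(Node m) {
        Set<Node> result = new LinkedHashSet<>();
        Set<Node> seen = new HashSet<>();
        Deque<Node> work = new ArrayDeque<>();
        work.push(m);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (!seen.add(n)) continue; // already visited (deferred edges may form cycles)
            result.addAll(n.pointsTo);
            work.addAll(n.deferred);
        }
        return result;
    }

    /** ByPass(p): redirect incoming deferred edges to p's successors, then drop p's edges. */
    void byPass(Node p) {
        for (Node r : refs.values()) {
            if (r != p && r.deferred.remove(p)) {
                r.pointsTo.addAll(p.pointsTo);
                r.deferred.addAll(p.deferred);
            }
        }
        p.pointsTo.clear();
        p.deferred.clear();
    }

    /** p = new r(): FS applies ByPass(p) first, FI only adds the new points-to edge. */
    Node assignNew(String p, boolean flowSensitive) {
        Node pn = ref(p);
        if (flowSensitive) byPass(pn);
        Node obj = new Node("obj" + nextObjectId++);
        pn.pointsTo.add(obj);
        return obj;
    }

    /** p = q: FS applies ByPass(p) first, then both variants add a deferred edge p -> q. */
    void assignCopy(String p, String q, boolean flowSensitive) {
        Node pn = ref(p);
        Node qn = ref(q);
        if (pn == qn) return; // p = p changes nothing
        if (flowSensitive) byPass(pn);
        pn.deferred.add(qn);
    }

    public static void main(String[] args) {
        for (boolean fs : new boolean[] {true, false}) {
            ConnectionGraphSketch g = new ConnectionGraphSketch();
            g.assignNew("q", fs);       // q = new r()  creates obj0
            g.assignCopy("p", "q", fs); // p = q
            g.assignNew("q", fs);       // q = new r()  creates obj1
            System.out.println((fs ? "FS" : "FI") + " PointsTo(p) = " + g.pointsTo(g.ref("p")));
        }
        // prints "FS PointsTo(p) = [obj0]" and then "FI PointsTo(p) = [obj0, obj1]"
    }
}
```

In the flow-sensitive run, the second assignment to q bypasses q, so p keeps pointing only at the object q referred to when p = q executed. In the flow-insensitive run, p still defers to q and therefore picks up both objects.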
Interprocedural Analysis
Method Entry
Method Exit
Immediately before method invocation
Immediately after method invocation

Method Entry
Create phantom reference nodes
– f1 → a1: f1 is the reference node for the formal parameter and a1 is the phantom reference node for the actual parameter.
– EscapeState[f1] = NoEscape, because f1 behaves like a local variable that is created and deleted in the Method.
– EscapeState[a1] = ArgEscape, because a1 stands for something created outside the method, so it can leave the method.

Method Exit
Partitions the graph into three subgraphs
– Those nodes reachable from a GlobalEscape node
– Those nodes reachable from an ArgEscape node
– Those nodes that are left are NoEscape
The union of the GlobalEscape and ArgEscape subgraphs is deemed the NonLocalGraph.
The NoEscape subgraph is deemed the LocalGraph.

Immediately Before a Method
Each parameter (object node) of the caller is mapped to an object node in the callee.
Once inside the callee, the phantom reference nodes are created, referring back to the parameter object nodes.
These correspondences are tracked both for the return values of the nodes and so that the connection graph can be reconstructed if the same function is called again.

Immediately After a Method
The nodes mapped before the call, a1, a2…aN, then take on the escape state of the corresponding phantom nodes in the callee.
The edges are updated as well, showing the new relationships that were created by the callee.

Updating Caller Nodes
This ensures that each node ai in the caller function maps to the corresponding âi in the callee function.
Through recursion, we also ensure that the set PointsTo(ai) equals the set PointsTo(âi).

Results
The percentage of objects that may be allocated on the stack exceeds 70% of all dynamically created objects in three out of ten benchmarks.
11% to 92% of all lock operations are eliminated in the ten programs.
The execution time reduction ranges from 2% to 23%.

Results Final

We’re done!
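
Backup sketch: Method Exit partitioning
The Method Exit step described earlier (split the graph by what is reachable from GlobalEscape and ArgEscape nodes) can be sketched as a small reachability pass. This is an illustrative sketch only: the class EscapePartitionSketch, its Node type, and the methods propagate and partition are hypothetical names, and all edge kinds are folded into one successor list for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EscapePartitionSketch {
    /** Declared from least to most escaping, so compareTo follows the lattice order. */
    enum EscapeState { NO_ESCAPE, ARG_ESCAPE, GLOBAL_ESCAPE }

    /** Hypothetical node type; successors merge points-to, deferred and field edges. */
    static final class Node {
        final String name;
        EscapeState state;
        final List<Node> successors = new ArrayList<>();
        Node(String name, EscapeState state) { this.name = name; this.state = state; }
    }

    /** Pushes GlobalEscape, then ArgEscape, to every node reachable from a node in that state. */
    static void propagate(Collection<Node> nodes) {
        EscapeState[] order = { EscapeState.GLOBAL_ESCAPE, EscapeState.ARG_ESCAPE };
        for (EscapeState s : order) {
            Deque<Node> work = new ArrayDeque<>();
            for (Node n : nodes) {
                if (n.state == s) work.push(n);
            }
            while (!work.isEmpty()) {
                Node n = work.pop();
                for (Node succ : n.successors) {
                    if (succ.state.compareTo(s) < 0) { // only ever raise a state
                        succ.state = s;
                        work.push(succ);
                    }
                }
            }
        }
    }

    /** Splits node names into NonLocalGraph (Global or Arg) and LocalGraph (NoEscape). */
    static Map<String, List<String>> partition(Collection<Node> nodes) {
        propagate(nodes);
        Map<String, List<String>> parts = new LinkedHashMap<>();
        parts.put("NonLocalGraph", new ArrayList<>());
        parts.put("LocalGraph", new ArrayList<>());
        for (Node n : nodes) {
            String key = n.state == EscapeState.NO_ESCAPE ? "LocalGraph" : "NonLocalGraph";
            parts.get(key).add(n.name);
        }
        return parts;
    }

    public static void main(String[] args) {
        Node f1 = new Node("f1", EscapeState.NO_ESCAPE);   // formal parameter
        Node a1 = new Node("a1", EscapeState.ARG_ESCAPE);  // phantom node for the actual
        Node objA = new Node("objA", EscapeState.NO_ESCAPE);
        Node staticField = new Node("staticField", EscapeState.GLOBAL_ESCAPE);
        Node objG = new Node("objG", EscapeState.NO_ESCAPE);
        Node tmp = new Node("tmp", EscapeState.NO_ESCAPE); // purely local variable
        Node objT = new Node("objT", EscapeState.NO_ESCAPE);
        f1.successors.add(a1);
        a1.successors.add(objA);
        staticField.successors.add(objG);
        tmp.successors.add(objT);
        List<Node> nodes = Arrays.asList(f1, a1, objA, staticField, objG, tmp, objT);
        System.out.println(partition(nodes));
        // prints {NonLocalGraph=[a1, objA, staticField, objG], LocalGraph=[f1, tmp, objT]}
    }
}
```

Only the objects left in the LocalGraph (here objT) would be candidates for stack allocation in this toy example. objA is reachable from the caller's argument, and objG is reachable from a static field.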

…but wait…
Congrats to Noah and Kaleem (hopefully) on their last day of class for a while
Does anybody have anything to do tomorrow?
– Who’s 21? (Raise your hand!)
– Who wants to go to Flats? For margaritas?
– PhDs allowed!