Free-Me: A Static Analysis for Individual Object Reclamation Samuel Z. Guyer Kathryn S. McKinley Tufts University University of Texas at Austin Daniel Frampton Australian National University THE UNIVERSITY OF TEXAS AT AUSTIN Motivation Automatic memory reclamation (GC) No need for explicit “free” Garbage collector reclaims memory Eliminates many programming errors Problem: when do we get memory back? Frequent GCs: Reclaim memory quickly, with high overhead Infrequent GCs: Lower overhead, but lots of garbage in memory 2 Example Read a token (new String) void parse(InputStream stream) { Look up in while (not_done) { symbol table String idName = stream.readToken(); Identifier id = symbolTable.lookup(idName); if (id == null) { id = new Identifier(idName); If not there, create symbolTable.add(idName, id); new identifier, add } computeOn(id); to symbol table Compute on }} identifier Notice: String idName is often garbage Memory: 3 Solution void parse(InputStream stream) { while (not_done) { String idName = stream.readToken(); Identifier id = symbolTable.lookup(idName) if (id == null) { id = new Identifier(idName); symbolTable.add(idName, id); } else free(idName); computeOn(id); String idName is garbage, }} free immediately Garbage does not accumulate Memory: 4 Our approach Adds free() automatically FreeMe compiler pass inserts calls to free() Preserve software engineering benefits 1.7X performance Can’t determine lifetimesPotential: for all objects malloc/free vs GC Works with the garbage collector in on tight heaps Implementation of free() depends collector (Hertz & Berger, OOPSLA 2005) Goal: Incremental, “eager” memory reclamation Results: reduce GC load, improve performance 5 Outline Motivation Analysis Results Related work Conclusions 6 FreeMe Analysis Goal: Determine when an object becomes unreachable Within a method, for allocation site “p = new A” where can we place a call to “free(p)”? Not a whole-program analysis* Idea: pointer analysis + liveness I’ll describe the interprocedural parts later Pointer analysis for reachability Liveness analysis for when 7 Pointer Analysis String idName = stream.readToken(); Identifier id = symbolTable.lookup(idName); if (id == null) { id = new Identifier(idName); symbolTable.add(idName, id); } computeOn(id); id Connectivity graph symbolTable Variables Allocation sites Globals (statics) (global) Identifier idName Analysis algorithm Flow-insensitive, field-insensitive readToken String 8 Adding liveness Key: An object is reachable only when all incoming pointers are live idName (global) readToken String Identifier Reachability is union of all these live ranges From a variable: Live range of the variable From a global: Live from the pointer store onward From other object: Live from the pointer store until source object becomes unreachable 9 Liveness Analysis Computed as sets of edges Variables String idName = stream.readToken(); idName Identifier id = symbolTable.lookup(idName); Heap pointers Identifier (global) readToken String if (id == null) id = new Identifier(idName); symbolTable.add(idName, id); computeOn(id); 10 Where can we free it? Where object exists readToken String String idName = stream.readToken(); Identifier id = symbolTable.lookup(idName); -minus Where reachable if (id == null) id = new Identifier(idName); symbolTable.add(idName, id); Compiler inserts call to free(idName) computeOn(id); 11 Interprocedural component Detection of factory methods String idName = stream.readToken(); Return value is a new object Can be freed by the caller Effects of methods called Hashtable.add: (0 → 1) (0 → 2) symbolTable.add(idName, id); Describes how parameters are connected Compilation strategy: Summaries pre-computed for all methods Free-me only applied to hot methods 12 Implementation in JikesRVM FreeMe added to OPT compiler Run-time: depends on collector Mark/sweep Free-list: free() operation Generational mark/sweep Unbump: move nursery “bump pointer” backward Unreserve: reduce copy reserve Very low overhead Run longer without collecting 13 Volume freed – in MB 100% 74 91 98 105 180 183 263 271 103 348 515 523 716 822 1544 8195 Increasing alloc size Increasing alloc size SPEC benchmarks DaCapo benchmarks 50% 0% 14 Volume freed – in MB 100% 74 91 98 105 180 183 263 271 103 348 515 523 716 822 1544 8195 73 FreeMe Mean: 32% 73 45 163 50% 673 30 75 34 24 16 0% 0 222 278 1607 57 22 15 Compare to stack-like behavior Notice: Stacks and regions won’t work for example idName escapes some of the time Not biased: 35% vs 65% Comparison: restrict placement of free() Object must not escape No conditional free No factory methods Other approaches: Optimistic, dynamic stack allocation Scalar replacement [Azul, Corry 06] 16 Volume freed – in MB 100% 74 91 98 105 180 183 263 271 103 348 515 523 716 822 1544 8195 73 45 Stack-like Mean: 20% 73 50% 103 21 1566 15 0% 0 6 16 3 28 14 35 56 146 17 Mark/sweep – GC time All benchmarks 30% 9% 18 Mark/sweep – time 20% All benchmarks 15% 6% 19 GenMS – time All benchmarks 20 GenMS – GC time Why doesn’t this help? Note: the number of GCs is greatly reduced All benchmarks FreeMe mostly finding short-lived objects Nursery reclaims dead objects for free (cost ~ survivors) 21 Bloat – GC time 12% 22 Related work Compile-time memory management Stack allocation Functional languages Shape analysis [Gay 00, Blanchet 03, Choi 03] Tied to stack frames or other scopes Objects from a site must not escape Region inference [Barth 77, Hughs 92, Mazur 01] [Shaham 03, Cherem 06] [Tofte 96, Hallenberg 02] Cheap reclamation – no scoping constraints Similar all-or-nothing limitation 23 Conclusions FreeMe analysis GC + explicit free() Finds many objects to free: often 30% - 60% Most are short-lived objects Advantage over stack/region allocation: no need to make decision at allocation time Generational collectors Embedded applications: Nursery works very well Compile-ahead Abandon techniques that replace nursery? Memory constrained Mark-sweep collectors Non-moving collectors 50% to 200% speedup Works better as memory gets tighter 24 Thank You 25