Free-Me: A Static Analysis for Individual Object Reclamation

Samuel Z. Guyer, Tufts University
Kathryn S. McKinley, University of Texas at Austin
Daniel Frampton, Australian National University
Motivation

Automatic memory reclamation (GC)
  No need for explicit “free”
  Garbage collector reclaims memory
  Eliminates many programming errors
Problem: when do we get memory back?
  Frequent GCs: reclaim memory quickly, but with high overhead
  Infrequent GCs: lower overhead, but lots of garbage in memory
Example
void parse(InputStream stream) {
  while (not_done) {
    // Read a token (new String)
    String idName = stream.readToken();
    // Look up in symbol table
    Identifier id = symbolTable.lookup(idName);
    // If not there, create new identifier, add to symbol table
    if (id == null) {
      id = new Identifier(idName);
      symbolTable.add(idName, id);
    }
    // Compute on identifier
    computeOn(id);
  }
}

Notice: String idName is often garbage
Solution
void parse(InputStream stream) {
  while (not_done) {
    String idName = stream.readToken();
    Identifier id = symbolTable.lookup(idName);
    if (id == null) {
      id = new Identifier(idName);
      symbolTable.add(idName, id);
    }
    else free(idName);  // String idName is garbage, free immediately
    computeOn(id);
  }
}

Garbage does not accumulate
Our approach


Adds free() automatically
  FreeMe compiler pass inserts calls to free()
  Preserve software engineering benefits
Can’t determine lifetimes for all objects
  Works with the garbage collector
  Implementation of free() depends on collector
Potential: 1.7X performance
  malloc/free vs GC in tight heaps (Hertz & Berger, OOPSLA 2005)
Goal:
  Incremental, “eager” memory reclamation
Results: reduce GC load, improve performance
Outline





Motivation
Analysis
Results
Related work
Conclusions
FreeMe Analysis

Goal:
  Determine when an object becomes unreachable
  Within a method, for allocation site “p = new A”,
  where can we place a call to “free(p)”?
  Not a whole-program analysis*
Idea: pointer analysis + liveness
  Pointer analysis for reachability
  Liveness analysis for when
* I’ll describe the interprocedural parts later
Pointer Analysis
String idName = stream.readToken();
Identifier id = symbolTable.lookup(idName);
if (id == null) {
  id = new Identifier(idName);
  symbolTable.add(idName, id);
}
computeOn(id);
Connectivity graph
  Variables
  Allocation sites
  Globals (statics)
Analysis algorithm
  Flow-insensitive, field-insensitive
[Figure: connectivity graph with nodes id, idName, symbolTable (global), Identifier, and the readToken String]
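The graph construction on this slide can be sketched as follows. This is a minimal illustration, not FreeMe's actual implementation: the class `ConnectivitySketch`, its methods, and the node labels are hypothetical, and the edges are one reading of the example code rather than a copy of the slide's figure. Stores ignore the field name (field-insensitive), and all statements are applied until nothing changes, so their order does not matter (flow-insensitive).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of a flow-insensitive, field-insensitive connectivity graph. */
final class ConnectivitySketch {
    // node name -> names of the nodes it may point to
    private final Map<String, Set<String>> edges = new HashMap<>();
    // recorded statements: {"copy" | "store", left operand, right operand}
    private final List<String[]> stmts = new ArrayList<>();

    private Set<String> out(String node) {
        return edges.computeIfAbsent(node, k -> new TreeSet<>());
    }

    /** var = new Site(...), or var names a global object. */
    void pointsTo(String var, String site) { out(var).add(site); }

    /** dst = src */
    void copy(String dst, String src) { stmts.add(new String[] {"copy", dst, src}); }

    /** base.anyField = value; the field name is deliberately dropped. */
    void store(String base, String value) { stmts.add(new String[] {"store", base, value}); }

    /** Re-apply every statement until no edge is added (fixed point). */
    void solve() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] s : stmts) {
                if (s[0].equals("copy")) {
                    changed |= out(s[1]).addAll(out(s[2]));
                } else {
                    for (String obj : new ArrayList<>(out(s[1]))) {
                        for (String val : new ArrayList<>(out(s[2]))) {
                            changed |= out(obj).add(val);
                        }
                    }
                }
            }
        }
    }

    Set<String> targets(String node) { return Collections.unmodifiableSet(out(node)); }

    /** The parse() loop body, with statements recorded out of program order on purpose. */
    static ConnectivitySketch parseExample() {
        ConnectivitySketch g = new ConnectivitySketch();
        g.store("symbolTable", "idName");          // symbolTable.add(idName, id)
        g.store("symbolTable", "id");
        g.pointsTo("idName", "readToken String");  // idName = stream.readToken()
        g.pointsTo("id", "Identifier");            // id = new Identifier(idName)
        g.pointsTo("symbolTable", "(global)");     // assumed: the table is a static
        g.solve();
        return g;
    }

    public static void main(String[] args) {
        System.out.println(parseExample().targets("(global)")); // [Identifier, readToken String]
    }
}
```

In this sketch the global ends up pointing at both the Identifier and the readToken String, which is why the String cannot be freed on the path that adds it to the symbol table.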
Adding liveness

Key: An object is reachable only while at least one incoming pointer is live
  From a variable: Live range of the variable
  From a global: Live from the pointer store onward
  From other object: Live from the pointer store until source object becomes unreachable
Reachability is the union of all these live ranges
[Figure: graph with nodes idName, (global), Identifier, and the readToken String]
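The three rules above can be sketched with bit sets over program points. This is an illustrative sketch under stated assumptions, not the analysis as implemented: the class `LiveRangeSketch`, its method names, the straight-line numbering of program points ("just before statement i"), and the choice to follow only the path where the identifier is new are all hypothetical simplifications.

```java
import java.util.BitSet;

/** Hypothetical sketch: reachability as the union of incoming-pointer live ranges. */
final class LiveRangeSketch {
    /** Program points from (inclusive) to (exclusive). */
    static BitSet range(int from, int to) {
        BitSet b = new BitSet();
        b.set(from, to);
        return b;
    }

    /** Pointer from a variable: live over the variable's live range. */
    static BitSet fromVariable(BitSet varLiveRange) {
        return (BitSet) varLiveRange.clone();
    }

    /** Pointer from a global: live from the point after the store onward. */
    static BitSet fromGlobal(int firstPointAfterStore, int numPoints) {
        return range(firstPointAfterStore, numPoints);
    }

    /** Pointer from another object: live after the store, while the source is reachable. */
    static BitSet fromObject(int firstPointAfterStore, int numPoints, BitSet sourceReachable) {
        BitSet b = range(firstPointAfterStore, numPoints);
        b.and(sourceReachable);
        return b;
    }

    /** Reachability is the union of all incoming live ranges. */
    static BitSet reachable(BitSet... incoming) {
        BitSet union = new BitSet();
        for (BitSet b : incoming) {
            union.or(b);
        }
        return union;
    }

    public static void main(String[] args) {
        // Points: 0 readToken, 1 lookup, 2 new Identifier, 3 symbolTable.add,
        //         4 computeOn, 5 end of loop body.
        BitSet viaIdName = fromVariable(range(1, 4)); // idName is used by statements 1..3
        BitSet viaGlobal = fromGlobal(4, 6);          // stored into the table by statement 3
        System.out.println(reachable(viaIdName, viaGlobal)); // {1, 2, 3, 4, 5}
    }
}
```

On this path the String stays reachable through the symbol table after `idName` dies, so there is no point at which it can be freed.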
Liveness Analysis

Computed as sets of edges
  Variables
  Heap pointers
String idName = stream.readToken();
Identifier id = symbolTable.lookup(idName);
if (id == null) {
  id = new Identifier(idName);
  symbolTable.add(idName, id);
}
computeOn(id);
[Figure: connectivity graph over idName, Identifier, (global), and the readToken String]
Where can we free it?

For the readToken String: where object exists, minus where reachable
String idName = stream.readToken();
Identifier id = symbolTable.lookup(idName);
if (id == null) {
  id = new Identifier(idName);
  symbolTable.add(idName, id);
}
computeOn(id);
Compiler inserts call to free(idName)
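The "exists minus reachable" step is a set difference over program points. The sketch below is a hedged illustration only: the class `FreePointSketch`, the straight-line point numbering ("just before statement i"), and the restriction to the path where the lookup succeeds are assumptions made for the example.

```java
import java.util.BitSet;

/** Hypothetical sketch: candidate free points = where the object exists minus where it is reachable. */
final class FreePointSketch {
    /** Program points from (inclusive) to (exclusive). */
    static BitSet points(int from, int to) {
        BitSet b = new BitSet();
        b.set(from, to);
        return b;
    }

    static BitSet freePoints(BitSet exists, BitSet reachable) {
        BitSet free = (BitSet) exists.clone();
        free.andNot(reachable); // set difference
        return free;
    }

    public static void main(String[] args) {
        // Path where lookup succeeds. Points:
        // 0 readToken, 1 lookup, 2 computeOn, 3 end of loop body.
        BitSet exists = points(1, 4);    // the String exists once readToken has returned
        BitSet reachable = points(1, 2); // idName is live only until lookup uses it
        BitSet free = freePoints(exists, reachable);
        System.out.println(free);               // {2, 3}
        System.out.println(free.nextSetBit(0)); // 2
    }
}
```

The earliest candidate, point 2, sits just before `computeOn(id)`, which matches the `else free(idName);` placement shown on the Solution slide.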
Interprocedural component

Detection of factory methods
  String idName = stream.readToken();
  Return value is a new object
  Can be freed by the caller
Effects of methods called
  symbolTable.add(idName, id);
  Hashtable.add: (0 → 1), (0 → 2)
  Describes how parameters are connected
Compilation strategy:
  Summaries pre-computed for all methods
  FreeMe only applied to hot methods
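A method summary of the kind shown for Hashtable.add can be sketched as a list of parameter-index edges that is instantiated at each call site. This is an assumption-laden sketch: the class `SummarySketch` and its fields are hypothetical, and treating index 0 as the receiver is one reading of the slide's (0 → 1), (0 → 2) notation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a per-method summary and its use at a call site. */
final class SummarySketch {
    /** True for factory methods: the return value is a new object the caller may free. */
    final boolean returnsNewObject;
    /** Edges {from, to} over parameter indices; index 0 is assumed to be the receiver. */
    final int[][] paramEdges;

    SummarySketch(boolean returnsNewObject, int[][] paramEdges) {
        this.returnsNewObject = returnsNewObject;
        this.paramEdges = paramEdges;
    }

    /** Map each summary edge onto the actual arguments of one call. */
    List<String> apply(String... actuals) {
        List<String> edges = new ArrayList<>();
        for (int[] e : paramEdges) {
            edges.add(actuals[e[0]] + " -> " + actuals[e[1]]);
        }
        return edges;
    }

    public static void main(String[] args) {
        SummarySketch add = new SummarySketch(false, new int[][] {{0, 1}, {0, 2}});
        // symbolTable.add(idName, id)
        System.out.println(add.apply("symbolTable", "idName", "id"));
        // [symbolTable -> idName, symbolTable -> id]
    }
}
```

Because the summary is computed once per method, the caller's analysis only has to substitute actual arguments, which fits the strategy of pre-computing summaries for all methods while running the full analysis on hot methods only.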
Implementation in JikesRVM

FreeMe added to OPT compiler
Run-time: depends on collector
  Mark/sweep
    Free-list: free() operation
  Generational mark/sweep
    Unbump: move nursery “bump pointer” backward
    Unreserve: reduce copy reserve
      Very low overhead
      Run longer without collecting
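The two nursery operations can be illustrated with a toy bump-pointer allocator. This is a simplified sketch and not Jikes RVM code: the class `BumpNurserySketch` is hypothetical, addresses are plain offsets, the copy reserve is modeled as the number of allocated bytes that might still survive, and the rule that only the most recently allocated object can be unbumped is an assumption of this sketch. There is no double-free checking.

```java
/** Hypothetical toy model of a bump-pointer nursery with unbump and unreserve. */
final class BumpNurserySketch {
    private final int capacity;
    private int cursor = 0;      // the "bump pointer": next free offset
    private int copyReserve = 0; // bytes set aside for possible survivors (simplified)

    BumpNurserySketch(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the object's offset, or -1 when full (a real VM would collect here). */
    int alloc(int size) {
        if (cursor + size > capacity) {
            return -1;
        }
        int addr = cursor;
        cursor += size;
        copyReserve += size;
        return addr;
    }

    /** Returns true if the space was handed back by moving the bump pointer. */
    boolean free(int addr, int size) {
        copyReserve -= size;          // unreserve: this object will never be copied
        if (addr + size == cursor) {  // unbump: only when it is the newest object
            cursor = addr;
            return true;
        }
        return false;
    }

    int cursor() { return cursor; }

    int copyReserve() { return copyReserve; }

    public static void main(String[] args) {
        BumpNurserySketch nursery = new BumpNurserySketch(64);
        int a = nursery.alloc(16);
        int b = nursery.alloc(24);
        System.out.println(nursery.free(b, 24)); // true
        System.out.println(nursery.cursor());    // 16
    }
}
```

Both operations are a handful of arithmetic instructions, which is consistent with the slide's "very low overhead" claim; the benefit comes from delaying the next collection.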
Volume freed – in MB
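[Chart: percentage of allocated volume freed (0% to 100%) for the SPEC benchmarks and the DaCapo benchmarks, each group ordered by increasing allocation size. Values across the top (MB): 74, 91, 98, 105, 180, 183, 263, 271, 103, 348, 515, 523, 716, 822, 1544, 8195]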
Volume freed – in MB
FreeMe mean: 32%
[Chart: percentage of allocated volume freed by FreeMe per benchmark (0% to 100%), same benchmarks as the previous chart; per-bar MB labels are not recoverable from the extraction]
Compare to stack-like behavior
Notice: Stacks and regions won’t work for example
  idName escapes some of the time
  Not biased: 35% vs 65%
Comparison: restrict placement of free()
  Object must not escape
  No conditional free
  No factory methods
Other approaches:
  Optimistic, dynamic stack allocation [Azul, Corry 06]
  Scalar replacement
Volume freed – in MB
Stack-like mean: 20%
[Chart: percentage of allocated volume freed per benchmark (0% to 100%) under the stack-like restriction; per-bar MB labels are not recoverable from the extraction]
Mark/sweep – GC time
[Chart: mark/sweep GC time, all benchmarks; callouts: 30%, 9%]
Mark/sweep – time
[Chart: mark/sweep time, all benchmarks; callouts: 20%, 15%, 6%]
GenMS – time
[Chart: GenMS time, all benchmarks]
GenMS – GC time
[Chart: GenMS GC time, all benchmarks]
Why doesn’t this help?
  FreeMe mostly finding short-lived objects
  Nursery reclaims dead objects for free (cost ~ survivors)
Note: the number of GCs is greatly reduced
Bloat – GC time
[Chart: GC time for Bloat; callout: 12%]
Related work

Compile-time memory management
  Functional languages [Barth 77, Hughes 92, Mazur 01]
  Shape analysis [Shaham 03, Cherem 06]
Stack allocation [Gay 00, Blanchet 03, Choi 03]
  Tied to stack frames or other scopes
  Objects from a site must not escape
Region inference [Tofte 96, Hallenberg 02]
  Cheap reclamation – no scoping constraints
  Similar all-or-nothing limitation
Conclusions

FreeMe analysis
  Finds many objects to free: often 30% - 60%
  Most are short-lived objects
GC + explicit free()
  Advantage over stack/region allocation: no need to make decision at allocation time
  Generational collectors
    Nursery works very well
    Abandon techniques that replace nursery?
  Mark-sweep collectors
    50% to 200% speedup
    Works better as memory gets tighter
Embedded applications:
  Compile-ahead
  Memory constrained
  Non-moving collectors
Thank You