CS 2130

Garbage Collection
CSCI 2720
Spring 2005
Static vs. Dynamic Allocation
• Early versions of Fortran
– All memory was static
• C
– Mix of static and dynamic allocation
– Dynamic allocation must be managed 100% by programmer
• malloc
• realloc
• calloc
• free
• Lisp
– Completely dynamic
– Separate programmer from machine
Garbage Collection
• Sometimes called Automatic Memory Management
(OO)
• Affects design of programs
– Tendency to use painless features
– Does have cost
• Part of overall heap management problem
• Not the only solution
• Two flavors
– Constant sized allocation units
– Variable sized allocation units
• C does not have Garbage Collection!
What is Garbage Collection?
• Program(mer) requests allocation of memory from
heap.
• If allocation is granted, memory is allocated and
address is returned and stored in pointer variable.
• Contents of pointer variable may be copied so that
multiple pointers may exist pointing to same location
• The allocated area becomes "garbage" if it is no
longer being referenced by any pointer.
• Typically garbage collection occurs when the runtime
system no longer has any free memory to allocate
How to Find Garbage
• Root Set
– Set of all pointers that are either global or on activation stack
• Live data: all memory referenced by root set pointers OR by
pointers in memory that is itself referenced (directly or
indirectly) by root set pointers
• Everything else on the heap is garbage
• Think about a linked list!
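The root-set idea above can be sketched as a small graph traversal. This is only an illustration under assumed names: the `heap` dictionary (address to list of outgoing pointers) and the function `reachable` are hypothetical, not part of any real runtime.

```python
def reachable(root_set, heap):
    """Return the set of heap addresses reachable from the root set.

    heap is an assumed toy model: it maps an address to the list of
    addresses that the block at that address points to.
    """
    live = set()
    worklist = list(root_set)
    while worklist:
        addr = worklist.pop()
        if addr is None or addr in live:
            continue  # null pointer, or block already visited
        live.add(addr)
        worklist.extend(heap[addr])  # follow pointers stored in the block
    return live


# A three-node linked list (1 -> 2 -> 3) plus block 4, which points
# into the list but is not referenced by anything itself.
heap = {1: [2], 2: [3], 3: [], 4: [3]}
live = reachable([1], heap)
garbage = set(heap) - live
```

Holding only the head of the linked list in the root set keeps every node alive, while block 4 is garbage even though it still points at live data.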
Abstract GC Algorithm
1. Stop the machine.
2. Partition the heap into live data and garbage.
3. Mark or rearrange heap so that garbage can be
reused.
4. Restart the machine.
When to Garbage Collect?
• When unable to allocate. (May be the worst possible time.)
• When remaining free space is low.
• Periodically.
• When user program pauses for terminal or disk I/O.
• Note: Good news?
– Memory is plentiful
– Virtual memory makes memory appear larger (cost?)
How to Decide?
• Which collector algorithm will be used
• Whether the application program is interactive
• How much memory is available on the machine
• The allocation behavior of the program
• etc.
Some Typical GC Algorithms
• Reference Counters
• Stop and Copy
• Generational
• Mark/Sweep
Reference Counters
• Each allocated block of memory contains a counter.
– Each time another pointer starts pointing to the block the
counter is incremented
– Each time a pointer stops pointing at a block the counter is
decremented
– If the counter = 0 the block is returned to the free memory list
• Problems
– If the blocks are small the storage taken up by counters
becomes significant
– Execution time penalty
– Circular structures pose difficulties (not insurmountable)
Known as the Eager Approach
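A minimal sketch of the counter bookkeeping described above, assuming a toy `Block` class and hypothetical helpers `inc_ref`, `dec_ref`, and `store` (none of these names come from a real allocator). It also shows why circular structures are a problem.

```python
class Block:
    """Toy heap block (assumed model): a counter plus outgoing pointers."""

    def __init__(self, name):
        self.name = name
        self.count = 0    # number of pointers currently referencing this block
        self.fields = []  # pointers stored inside this block


free_list = []  # blocks that have been returned to free memory


def inc_ref(block):
    block.count += 1


def dec_ref(block):
    block.count -= 1
    if block.count == 0:
        # Freeing a block also drops every pointer stored inside it.
        for child in block.fields:
            dec_ref(child)
        block.fields = []
        free_list.append(block)


def store(parent, child):
    """Make parent point at child."""
    parent.fields.append(child)
    inc_ref(child)


# Chain a -> b: dropping the only pointer to a frees both blocks at once.
a, b = Block("a"), Block("b")
inc_ref(a)  # a root pointer to a
store(a, b)
dec_ref(a)  # the root pointer goes away
chain_freed = sorted(blk.name for blk in free_list)

# Cycle c <-> d: the counters never reach zero, so nothing is freed.
c, d = Block("c"), Block("d")
inc_ref(c)  # a root pointer to c
store(c, d)
store(d, c)
dec_ref(c)  # the root pointer goes away, but d still points at c
cycle_counts = (c.count, d.count)
```

Blocks are reclaimed the moment their counter hits zero, which is what makes the approach eager. The cycle at the end is unreachable yet both counters stay at 1.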
Stop and Copy
• Heap is divided into two partitions (to-space and
from-space)
• When GC runs, copy all live allocations from the from-space
partition to the to-space partition
– To-space partition now contains contiguous memory
– This will typically run faster on modern hardware (Caches)
• Swap to-space and from-space (labels)
• Bad things
– Requires twice as much memory
– Will repeatedly copy large long-lived things (needlessly)
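The copy step can be sketched roughly as follows, using a breadth-first scan in the style usually attributed to Cheney. The object layout (a dict with `data` and `ptrs`) and the function name `stop_and_copy` are assumptions made for this illustration only.

```python
def stop_and_copy(from_space, roots):
    """Copy every object reachable from roots into a fresh to-space.

    Assumed toy model: a space is a Python list, an address is an index,
    and an object is {"data": ..., "ptrs": [address or None, ...]}.
    Returns (to_space, new_roots).
    """
    to_space = []
    forward = {}  # forwarding addresses: from-space index -> to-space index

    def copy(addr):
        if addr is None:
            return None
        if addr not in forward:
            forward[addr] = len(to_space)
            obj = from_space[addr]
            # Pointers still refer to from-space here; the scan fixes them.
            to_space.append({"data": obj["data"], "ptrs": list(obj["ptrs"])})
        return forward[addr]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):  # to-space grows while it is being scanned
        obj = to_space[scan]
        obj["ptrs"] = [copy(p) for p in obj["ptrs"]]
        scan += 1
    return to_space, new_roots


from_space = [
    {"data": "garbage1", "ptrs": []},   # 0: unreachable
    {"data": "A", "ptrs": [3]},         # 1: root object
    {"data": "garbage2", "ptrs": [1]},  # 2: unreachable, points at A
    {"data": "B", "ptrs": [1, None]},   # 3: points back at A (a cycle)
]
to_space, new_roots = stop_and_copy(from_space, [1])
```

After the call the live objects sit next to each other at the start of to-space, and the whole of from-space can be reused. The forwarding table keeps the A/B cycle from being copied twice.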
Generational
• Overcomes the problem of repeatedly copying large
long-lived objects?
– Observation: Most allocated data dies young.
• Idea: Use multiple generation spaces, with the to-space of a
younger generation equal to the from-space of an older generation.
– Collect from generation 0 from-space to generation 1 to-space.
– Collect from generation 1 from-space to generation 2 to-space, etc.
• Only the oldest generation needs its own to-space.
• Collect younger generations more frequently than
older ones.
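A deliberately simplified sketch of a young-generation collection with only two generations. All names (`minor_collect`, `young`, `old`, `refs`) are hypothetical, objects are modeled as plain ids rather than copied memory, and every old object is treated as a root, which is a conservative shortcut assumed here for brevity.

```python
def minor_collect(young, old, refs, roots):
    """Collect only the young generation (assumed toy model).

    young, old: sets of object ids; refs: id -> list of ids it points to.
    Survivors are promoted into old; the set of freed ids is returned.
    """
    # Assumption: any pointer from a root or from an old object keeps a
    # young object alive. The old generation itself is not examined.
    worklist = [o for o in roots if o in young]
    for o in old:
        worklist.extend(t for t in refs.get(o, []) if t in young)

    survivors = set()
    while worklist:
        obj = worklist.pop()
        if obj in survivors:
            continue
        survivors.add(obj)
        worklist.extend(t for t in refs.get(obj, []) if t in young)

    freed = young - survivors
    old |= survivors  # promotion: survivors now live in the older generation
    young.clear()     # the young space is empty and ready for reuse
    for obj in freed:
        refs.pop(obj, None)
    return freed


young = {"t1", "t2", "keep"}
old = {"cache"}
refs = {"cache": ["keep"], "t1": ["t2"], "t2": [], "keep": []}
freed = minor_collect(young, old, refs, roots=["cache"])
```

The two temporaries die young and are reclaimed without touching the old generation, while `keep` is promoted and will not be looked at again until an older generation is collected.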
Mark & Sweep
• Languages such as Lisp or Scheme are based on
constant-size memory cells: the cons cell
Cons cell
[Figure: example lists X and Y and how they are stored internally as cons cells; labels include foo, blarg, bar, baz, and ()]
But where are free cells?
Free List
[Figure: the free list, a chain of unused cells reached from Free and terminated by ()]
Allocating a cons cell means taking the first
cell off the free list. Deallocation just reverses
the process.
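Free-list allocation of fixed-size cells can be sketched as below. The classes `Cell` and `CellHeap` and their method names are hypothetical, chosen for this illustration; the free list is threaded through the `cdr` field of the unused cells.

```python
class Cell:
    """A fixed-size cons cell with two fields."""

    def __init__(self):
        self.car = None
        self.cdr = None


class CellHeap:
    """Assumed toy heap: n cells, all initially chained on the free list."""

    def __init__(self, n):
        self.free = None
        for _ in range(n):
            cell = Cell()
            cell.cdr = self.free  # link the new cell in front of the list
            self.free = cell

    def cons(self, car, cdr):
        """Allocate: take the first cell off the free list."""
        if self.free is None:
            raise MemoryError("free list is empty")
        cell = self.free
        self.free = cell.cdr
        cell.car, cell.cdr = car, cdr
        return cell

    def release(self, cell):
        """Deallocate: push the cell back onto the front of the free list."""
        cell.car = None
        cell.cdr = self.free
        self.free = cell

    def free_count(self):
        n, cell = 0, self.free
        while cell is not None:
            n, cell = n + 1, cell.cdr
        return n


heap = CellHeap(3)
x = heap.cons("foo", heap.cons("blarg", None))  # the list (foo blarg)
after_alloc = heap.free_count()
heap.release(x)
after_release = heap.free_count()
```

Both operations touch only the head of the list, so each takes constant time. Note that `release(x)` returns just the first cell of the list; the second cell stays allocated until something frees it.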
[Figure: the example heap shown internally, with the free list (Free) alongside lists X and Y]
Clear?
Mark -- Sweep Algorithm
• Each block must contain bit (mark bit)
• Initially all blocks are unmarked
• Starting at each symbol perform a depth-first search
marking all blocks reachable (mark means in-use)
• Sweep through all blocks.
– If marked: Unmark
– If unmarked: move to free list
• Note: Algorithm must be only thing running
• Garbage collection is only done when necessary
– i.e. When free list is empty
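The two phases above can be sketched over a small array of cons cells. The names `Cell`, `mark`, `mark_sweep`, and `length` are assumptions for this illustration, and the sketch rebuilds the free list from scratch, which matches collecting only when the free list is empty.

```python
class Cell:
    def __init__(self):
        self.car = None
        self.cdr = None
        self.marked = False  # the mark bit


def mark(root):
    """Depth-first search from root, using an explicit stack."""
    stack = [root]
    while stack:
        c = stack.pop()
        if not isinstance(c, Cell) or c.marked:
            continue  # an atom, nil, or a cell that was already visited
        c.marked = True  # mark means in-use
        stack.append(c.car)
        stack.append(c.cdr)


def mark_sweep(cells, symbols):
    """Run one collection; return the head of the rebuilt free list."""
    for root in symbols.values():
        mark(root)
    free_head = None
    for c in cells:  # sweep through all blocks
        if c.marked:
            c.marked = False  # unmark, ready for the next collection
        else:
            c.car = None
            c.cdr = free_head  # move the cell to the free list
            free_head = c
    return free_head


def length(cell):
    n = 0
    while cell is not None:
        n, cell = n + 1, cell.cdr
    return n


cells = [Cell() for _ in range(5)]
c0, c1, c2, c3, c4 = cells
c0.car, c0.cdr = "foo", c1    # X = (foo blarg)
c1.car = "blarg"
c2.car = "bar"                # Y = (bar)
c3.car, c3.cdr = "old", c4    # a list that no symbol references
c4.car = "older"
symbols = {"X": c0, "Y": c2}
free_head = mark_sweep(cells, symbols)
```

Only the two cells that no symbol can reach end up on the free list, and every surviving cell is left unmarked for the next run.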
Mark
[Figure: step-by-step animation of the mark phase on the example heap (Free List, X, Y, shown internally)]
Sweep
[Figure: step-by-step animation of the sweep phase on the example heap (Free List, X, Y, shown internally)]
Done
[Figure: final state of the example heap after mark and sweep (Free List, X, Y, shown internally)]
Simple?
What about variable sized cells?
Variable Sized Cells
• Have all problems and needs of single-sized cells
• Have the following additional problems
– Sweeping each cell becomes more difficult. Need to have
size of each cell at beginning of cell.
– Where exactly are the pointers in the cells? One solution is
to add system pointers, which has extra cost
– Free space must be managed as previously discussed
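The need for a size at the beginning of each cell can be seen in a sketch of the sweep. The layout is an assumption for illustration: the heap is a flat list, and each block starts with a header tuple `(size, marked)` followed by `size - 1` payload words. Merging adjacent free blocks is also an assumed policy, one common way of managing the free space.

```python
def sweep(heap):
    """Sweep a heap of variable-sized blocks (assumed toy layout).

    Each block begins with a header (size, marked); size counts the header.
    Returns the free extents as (start, size) pairs, with adjacent garbage
    blocks merged into a single extent.
    """
    free_extents = []
    addr = 0
    while addr < len(heap):
        size, marked = heap[addr]  # without the size, the next header is lost
        if marked:
            heap[addr] = (size, False)  # live block: unmark it
        elif free_extents and sum(free_extents[-1]) == addr:
            # Previous free extent ends exactly here: merge the two.
            start, prev_size = free_extents[-1]
            free_extents[-1] = (start, prev_size + size)
            heap[start] = (prev_size + size, False)
        else:
            free_extents.append((addr, size))
        addr += size  # jump to the next block header
    return free_extents


heap = [
    (3, True), "a", "b",        # live block at address 0
    (2, False), "c",            # garbage at address 3
    (4, False), "d", "e", "f",  # garbage at address 5, adjacent to the above
    (2, True), "g",             # live block at address 9
    (3, False), "h", "i",       # garbage at address 11
]
free_extents = sweep(heap)
```

The sweep can only step from block to block because each header records the block length. The two neighboring garbage blocks are reported as one extent of six words.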
Questions?