Heap Management

CPSC 388 – Compiler Design and Construction

Areas of Memory Used by Program

 Program Code

 Static Data

 Heap

 Stack

Heap

 Used for dynamically allocated memory

 Important operations include allocation and deallocation

 In C++, Pascal, and Java, allocation is done via “new” (an operator in C++ and Java, a standard procedure in Pascal)

 In C allocation is done via the “malloc” function call

 De-allocation is either done automatically, or the programmer must specify when to de-allocate memory:

 Pascal – dispose

 C++ – delete

 C – free

 Java – garbage collection

Managing the Heap

 Available memory is managed using a free list: a list of available “chunks”

 Each chunk includes:

 Size of chunk

 Address of the next item on the free list

 The chunk itself

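Initial Heap Free List

[Figure: a 104-byte heap, addresses 0 to 103. The size field at address 0 holds 100; the next field at address 4 holds null (\). First Free points to this single chunk.]

Request is made to allocate 20 bytes

Uses first portion of first chunk (after size field) and returns address 4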

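Heap Free List After Allocating 20 Bytes

[Figure: addresses 0 to 103. Allocated chunk: size field at address 0 holds 20, data at 4 to 23. Free chunk: size field at address 24 holds 76, next field at address 28 holds null (\). First Free points to the chunk at address 24.]

Request is made to allocate 10 bytes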

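Heap Free List After Allocating 10 Bytes

[Figure: addresses 0 to 103. Allocated chunk: size field at address 0 holds 20, data at 4 to 23. Allocated chunk: size field at address 24 holds 10, data at 28 to 37. Free chunk: size field at address 38 holds 62, next field at address 42 holds null (\). First Free points to the chunk at address 38.]

First chunk is freed

Adds chunk to front of free list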

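Heap Free List After Freeing the First Chunk

[Figure: addresses 0 to 103. Free chunk: size field at address 0 holds 20, next field at address 4 points to the chunk at address 38. Allocated chunk: size field at address 24 holds 10, data at 28 to 37. Free chunk: size field at address 38 holds 62, next field at address 42 holds null (\). First Free points to the chunk at address 0.]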
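
The allocate, allocate, free sequence traced on these slides can be reproduced in code. The following is a minimal sketch in C, not a production allocator. It assumes 4-byte size and next fields, a first-fit search, and a 104-byte array standing in for the heap; addresses are offsets into that array. The names heap_init, heap_alloc, heap_free, get_field and put_field are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define HEAP_BYTES 104   /* addresses 0..103, as in the diagrams */
#define FIELD 4          /* assumed width of a size or next field */
#define NIL (-1)         /* stands in for the null pointer "\" */

unsigned char heap[HEAP_BYTES];
int32_t first_free = NIL;   /* address of the size field of the first free chunk */

int32_t get_field(int32_t addr) { int32_t v; memcpy(&v, heap + addr, FIELD); return v; }
void put_field(int32_t addr, int32_t v) { memcpy(heap + addr, &v, FIELD); }

/* One free chunk covering the whole heap: size 100, next null. */
void heap_init(void) {
    put_field(0, HEAP_BYTES - FIELD);
    put_field(FIELD, NIL);
    first_free = 0;
}

/* First fit. Returns the address just after the size field, or NIL. */
int32_t heap_alloc(int32_t n) {
    int32_t prev = NIL, c = first_free;
    if (n < FIELD) n = FIELD;             /* a freed chunk must have room for a next field */
    while (c != NIL) {
        int32_t size = get_field(c), next = get_field(c + FIELD);
        if (size >= n) {
            int32_t replacement = next;   /* what takes this chunk's place on the list */
            if (size - n >= 2 * FIELD) {  /* split: the remainder needs size and next fields */
                int32_t rest = c + FIELD + n;
                put_field(rest, size - n - FIELD);
                put_field(rest + FIELD, next);
                put_field(c, n);
                replacement = rest;
            }
            if (prev == NIL) first_free = replacement;
            else put_field(prev + FIELD, replacement);
            return c + FIELD;
        }
        prev = c;
        c = next;
    }
    return NIL;
}

/* Adds the chunk to the front of the free list (no coalescing in this sketch). */
void heap_free(int32_t addr) {
    int32_t c = addr - FIELD;
    put_field(c + FIELD, first_free);
    first_free = c;
}
```

Starting from heap_init(), heap_alloc(20) returns 4 and leaves a 76-byte free chunk at address 24; heap_alloc(10) returns 28 and leaves 62 bytes at address 38; heap_free(4) puts the chunk at address 0 at the head of the list with its next field holding 38.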

Operations on Free List

 Request space

 Find a satisfactory chunk

 Free Space

 Return to Free List

 Goals for Operations

 Only fail to satisfy request for n bytes if there are not n bytes available on free list

 Do both operations quickly

Questions to Consider

 Given a request for n bytes, which n bytes to return?

 Given a de-allocation of a chunk, how to coalesce it with neighboring free chunks?

Techniques for Allocation

 Best Fit: Find the chunk on the freelist with the smallest size greater than or equal to allocation request

 May require a search of the entire freelist (SLOW!)

 Leaves lots of little pieces of free storage on the list
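
As a rough illustration of the best-fit search, here is a small C sketch. The FreeChunk type and the best_fit function are hypothetical names, and the sketch only selects a chunk; unlinking and splitting it are left out.

```c
#include <stddef.h>

typedef struct FreeChunk {
    size_t size;              /* usable bytes in this free chunk */
    struct FreeChunk *next;   /* next chunk on the freelist, or NULL */
} FreeChunk;

/* Returns the smallest chunk with size >= n, or NULL if none fits. */
FreeChunk *best_fit(FreeChunk *first_free, size_t n) {
    FreeChunk *best = NULL;
    for (FreeChunk *c = first_free; c != NULL; c = c->next) {
        if (c->size >= n && (best == NULL || c->size < best->size)) {
            best = c;
            if (c->size == n) break;   /* an exact fit cannot be improved */
        }
    }
    return best;
}
```

Unless an exact fit turns up, the loop visits every chunk, which is the cost the slide calls slow.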

Techniques for Allocation

 First Fit: Use the first chunk with size greater than or equal to n.

 Faster than best-fit.

 Produces little pieces of free storage at the front of the list, which slows later searches
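
A sketch of first fit in C, under simplifying assumptions: sizes are tracked but headers are ignored, and the requested bytes are simply carved out of the chosen chunk. FreeChunk and first_fit_carve are hypothetical names.

```c
#include <stddef.h>

typedef struct FreeChunk {
    size_t size;              /* usable bytes in this free chunk */
    struct FreeChunk *next;   /* next chunk on the freelist, or NULL */
} FreeChunk;

/* Takes n bytes from the first chunk that is large enough.
   Returns that chunk, or NULL if no chunk can satisfy the request. */
FreeChunk *first_fit_carve(FreeChunk *first_free, size_t n) {
    for (FreeChunk *c = first_free; c != NULL; c = c->next) {
        if (c->size >= n) {
            c->size -= n;     /* the leftover stays where it is on the list */
            return c;
        }
    }
    return NULL;
}
```

With chunks of 12, 40 and 100 bytes, requests for 10 and then 30 bytes leave pieces of 2 and 10 bytes at the front of the list, and every later search has to step over them.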

Techniques for Allocation

 Circular First Fit: Make the freelist circular (i.e. have last item point back to the first item).

 Satisfy requests using the first chunk with size greater than or equal to n.

 Change the freelist pointer to point to next chunk after allocated one.
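
Circular first fit might be sketched in C as follows. The FreeChunk type, the circular_first_fit function and the rover parameter (the moving freelist pointer) are hypothetical names; the chosen chunk is not unlinked or split here.

```c
#include <stddef.h>

typedef struct FreeChunk {
    size_t size;              /* usable bytes in this free chunk */
    struct FreeChunk *next;   /* next chunk; the last one points back to the first */
} FreeChunk;

/* Searches the circular list starting at *rover. On success, moves *rover
   to the chunk after the chosen one and returns the chosen chunk. */
FreeChunk *circular_first_fit(FreeChunk **rover, size_t n) {
    FreeChunk *start = *rover;
    if (start == NULL) return NULL;       /* empty list */
    FreeChunk *c = start;
    do {
        if (c->size >= n) {
            *rover = c->next;             /* next search starts after this chunk */
            return c;
        }
        c = c->next;
    } while (c != start);                 /* stop after one full lap */
    return NULL;
}
```

Because each search resumes where the previous one stopped, small leftovers do not pile up at one end of the list.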

Techniques for de-allocation

 Use a doubly-linked list

 Each Chunk has a previous and next pointer

 One bit of the size field is reserved to indicate whether the chunk is “free” or “in-use”.

 Check free bit of storage after chunk

 If following chunk is free then coalesce

 Follow Example on Board

Techniques for De-allocation

 Can also coalesce with preceding chunk if you keep the size of chunk at beginning and end of chunk

 Follow example on board

 Note that NO pointers need to be updated: the preceding chunk is already on the free list, so only its size changes
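
In place of the board example, here is a hedged C sketch of coalescing with size tags at both ends of each chunk. It makes several assumptions: the heap is an array of 32-bit words, sizes are counted in words, the low bit of each tag is the free bit, and the free-list links themselves are left out. All names (set_chunk, free_and_coalesce, and so on) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_WORDS 32
uint32_t heap[HEAP_WORDS];

/* A tag holds the chunk size (in words) with the free bit in the lowest bit. */
uint32_t make_tag(uint32_t size, int is_free) { return (size << 1) | (is_free ? 1u : 0u); }
uint32_t chunk_size(size_t h) { return heap[h] >> 1; }
int chunk_is_free(size_t h)   { return (int)(heap[h] & 1u); }

/* Writes the same tag at the beginning (h) and at the end of the chunk. */
void set_chunk(size_t h, uint32_t size, int is_free) {
    heap[h] = make_tag(size, is_free);
    heap[h + size + 1] = make_tag(size, is_free);
}

/* Frees the chunk whose first tag is at h and merges it with free neighbours.
   Returns the index of the first tag of the resulting free chunk. */
size_t free_and_coalesce(size_t h) {
    uint32_t size = chunk_size(h);
    size_t next = h + size + 2;
    if (next < HEAP_WORDS && chunk_is_free(next)) {
        size += chunk_size(next) + 2;     /* absorb the following chunk and its two tags */
    }
    if (h > 0 && (heap[h - 1] & 1u)) {    /* end tag of the preceding chunk says "free" */
        uint32_t prev_size = heap[h - 1] >> 1;
        h -= prev_size + 2;               /* the merged chunk starts where the previous one did */
        size += prev_size + 2;
    }
    set_chunk(h, size, 1);
    return h;
}
```

The end tag is what makes the backward step possible: the word just before a chunk gives the size of its predecessor, so the predecessor's start can be found without walking the heap.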

Automatic or Explicit De-allocation

 In C++ and C de-allocation must be done explicitly

 In Java de-allocation is done automatically (by the garbage collector)

 Making it Automatic reduces burden on the programmer (and eliminates some types of errors)

Errors of Explicit De-allocation

 Storage Leaks

Some storage is never freed even though it is inaccessible

Listnode *p = malloc( sizeof(Listnode) );
...        // no copy from p in this code
p = ...;   // the only pointer to the node is over-written, so the node can never be freed

Errors of Explicit De-allocation

 Dangling pointers

 A pointer that points to memory that has been freed

 May read garbage

 May mess up free list

 May corrupt other variables

Example Dangling Pointers

Listnode *p, *q;
p = malloc( sizeof(Listnode) );
q = p;
...        // no assignment to q in this code
free(p);
...        // no assignment to q in this code
*q = ...;  // q is dangling: it points to storage that has been freed

Detecting Dangling Pointers

 Add a new field, the lock, to every allocated chunk (alongside the size field)

 Add a new field, the key, to every pointer (in addition to the address it stores)

 If a pointer’s key does not match the lock of the chunk it points to, throw an error

Detecting Dangling Pointers

 Each free chunk’s lock is set to 0

 When a chunk is allocated, its lock and the key of the returned pointer are both assigned the same new value (values are always increasing)

 When storage is freed, set its lock back to zero

 When a pointer is dereferenced, the compiler generates code that first checks that the key matches the lock and causes an error if it does not
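
The lock-and-key scheme can be imitated by hand in C. This is a sketch under stated assumptions: a small fixed pool stands in for the heap so that a lock can still be read after its chunk is freed, and the check that a compiler would generate is written as an ordinary function. Chunk, CheckedPtr, checked_alloc, checked_deref and checked_free are hypothetical names.

```c
#include <stddef.h>

#define POOL_CHUNKS 4

typedef struct { unsigned long lock; int data; } Chunk;          /* lock is 0 while free */
typedef struct { Chunk *addr; unsigned long key; } CheckedPtr;   /* address plus key */

Chunk pool[POOL_CHUNKS];        /* zero-initialised, so every lock starts at 0 */
unsigned long next_value = 1;   /* always increasing, never reused */

/* Allocates a free chunk and gives lock and key the same new value. */
CheckedPtr checked_alloc(void) {
    CheckedPtr p = { NULL, 0 };
    for (size_t i = 0; i < POOL_CHUNKS; i++) {
        if (pool[i].lock == 0) {
            pool[i].lock = next_value;
            p.addr = &pool[i];
            p.key = next_value++;
            break;
        }
    }
    return p;
}

/* The check made before every dereference: NULL signals a key/lock mismatch. */
int *checked_deref(CheckedPtr p) {
    if (p.addr == NULL || p.key == 0 || p.addr->lock != p.key) return NULL;
    return &p.addr->data;
}

/* Freeing resets the lock, so every outstanding copy of the pointer stops matching. */
void checked_free(CheckedPtr p) {
    if (checked_deref(p) != NULL) p.addr->lock = 0;
}
```

If q is a copy of p, then after checked_free(p) a call to checked_deref(q) reports a mismatch. It keeps doing so even when the same chunk is handed out again, because the new allocation receives a larger lock value.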

Automatic De-allocation

 Determine if a chunk of storage is no longer accessible to the program

 Make de-allocation efficient, avoid long pauses in program’s execution during de-allocation

 Two Approaches:

 Reference Counting

 Garbage Collection

Reference Counting

 Include invisible field in every chunk of storage: its reference count field.

 Value of field is the number of pointers that point to the chunk.

 Value is initialized to 1 when chunk is allocated and updated:

 When a pointer is copied, a new reference is created, so the reference count of chunk must be incremented

 When a non-null pointer’s value is over-written, a reference is removed, so the reference count of the chunk (before the over-write) must be decremented.

 When a reference count becomes zero, nothing points to the chunk, so it can be de-allocated and added to the free list. If the chunk contains pointers to other chunks, then their reference counts must be decremented.
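
The bookkeeping described above would normally be generated by the compiler. Written out by hand in C, it might look like the sketch below, which uses malloc and free in place of the free list. Node, node_new, release, assign and the live_nodes counter are hypothetical names introduced for illustration.

```c
#include <stdlib.h>

typedef struct Node {
    int refcount;          /* the "invisible" reference count field */
    struct Node *next;     /* a pointer stored inside the chunk */
} Node;

int live_nodes = 0;        /* demonstration only: nodes not yet de-allocated */

/* Count starts at 1 for the pointer that receives the result. */
Node *node_new(void) {
    Node *n = malloc(sizeof *n);
    if (n != NULL) { n->refcount = 1; n->next = NULL; live_nodes++; }
    return n;
}

/* Removes one reference; frees the node when the count reaches zero. */
void release(Node *n) {
    if (n == NULL) return;
    if (--n->refcount == 0) {
        release(n->next);  /* pointers inside the chunk disappear with it */
        free(n);
        live_nodes--;
    }
}

/* Performs *slot = value while keeping both reference counts correct. */
void assign(Node **slot, Node *value) {
    if (value != NULL) value->refcount++;   /* increment first, in case *slot == value */
    Node *old = *slot;
    *slot = value;
    release(old);                           /* the over-written reference is removed */
}
```

For example, after Node *p = node_new(); Node *q = NULL; assign(&q, p); the count is 2. assign(&p, NULL) lowers it to 1, and assign(&q, NULL) lowers it to 0 and frees the node.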

Problems with Reference Counting

 Slows Program Execution

 Every write into a pointer must test to see if old value is null.

 Requires updates to reference counts

 Cyclic structures cannot be de-allocated

var p: Nodeptr;    { p is a pointer to a node }
new(p);            { p points to new storage, reference count is 1 }
p^.next := p;      { next field of node points to node, so now reference count is 2 }
p := nil;          { p's value is over-written, so node's reference count is decremented (from 2 to 1) }

The count never reaches zero, so the node is never freed. In fact, it is inaccessible (it points to itself, no other pointer points to it), but we can't tell that just from the reference count.

Garbage Collection

 Wait until no storage is left, then:

 Find all accessible objects

 Free all other (inaccessible) objects

 Several Approaches to Garbage Collection

 Mark and Sweep

 Stop and Copy

Mark and Sweep

 Two Phases

 Mark phase finds and marks all accessible objects

 Sweep phase sweeps through the heap, collecting all of the garbage and putting it back on the freelist

 Another “invisible” value in each chunk called mark bit

 Initialized to 0

 Set to 1 if the chunk is reached during mark phase

Mark Phase

Put all “active” pointers on a worklist (“active” means the pointer is on the stack or in the static data area)

While worklist is not empty do:
    p = select_pointer(worklist)    (this removes p from the worklist)
    if p’s object’s mark-bit is zero:
        change it to one
        put all pointers in p’s object on worklist
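
The worklist algorithm translates fairly directly into C. The sketch below makes some assumptions: each object has at most two pointer fields, at most MAX_OBJECTS objects exist, and roots are handled one at a time so that a fixed-size worklist cannot overflow. Object, mark_phase and the constants are hypothetical names.

```c
#include <stddef.h>

#define MAX_OBJECTS 16   /* assumed upper bound on the number of heap objects */
#define MAX_FIELDS 2     /* assumed number of pointer fields per object */

typedef struct Object {
    int mark;                            /* mark bit: 0 until the object is reached */
    struct Object *fields[MAX_FIELDS];   /* pointers stored in the object, NULL if unused */
} Object;

/* roots holds the "active" pointers (those on the stack or in static data). */
void mark_phase(Object **roots, size_t nroots) {
    /* Each object pushes its fields at most once, which bounds the worklist. */
    Object *worklist[1 + MAX_OBJECTS * MAX_FIELDS];
    for (size_t r = 0; r < nroots; r++) {
        size_t top = 0;
        worklist[top++] = roots[r];
        while (top > 0) {
            Object *p = worklist[--top];          /* select and remove a pointer */
            if (p == NULL || p->mark) continue;   /* nothing to do, or already reached */
            p->mark = 1;
            for (size_t i = 0; i < MAX_FIELDS; i++) {
                worklist[top++] = p->fields[i];
            }
        }
    }
}
```

Testing the mark bit before pushing an object's fields is what makes the loop terminate on cyclic structures: an object that is reached a second time is skipped.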

Sweep Phase

 Looks at every chunk of storage in heap

 How? Start at the first chunk in the heap and use each chunk’s size field to step to the next chunk

 If mark-bit for chunk is 0 add to freelist

 If mark-bit for chunk is 1 change to 0

 When adding to freelist coalesce neighbor chunks

 See example on board
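
In place of the board example, a sweep over a toy heap can be sketched in C. The assumptions are: the heap is an array of header-sized units, each chunk is a header followed by size units of storage, every unmarked chunk is either garbage or already free, and the freelist is rebuilt from scratch in address order. Header, sweep_phase, first_free and NONE are hypothetical names.

```c
#include <stddef.h>

#define HEAP_UNITS 12
#define NONE HEAP_UNITS      /* sentinel index meaning "no chunk" */

typedef struct {
    size_t size;             /* units of storage that follow this header */
    int mark;                /* mark bit set by the mark phase */
    size_t next_free;        /* index of the next free chunk, used only on the freelist */
} Header;

Header heap[HEAP_UNITS];
size_t first_free = NONE;

void sweep_phase(void) {
    size_t tail = NONE;      /* last chunk placed on the rebuilt freelist */
    size_t run = NONE;       /* free chunk that the current chunk can merge into */
    first_free = NONE;
    for (size_t h = 0; h < HEAP_UNITS; ) {
        size_t next = h + 1 + heap[h].size;        /* the size field locates the next chunk */
        if (heap[h].mark) {
            heap[h].mark = 0;                      /* survivor: reset for the next collection */
            run = NONE;
        } else if (run != NONE) {
            heap[run].size += 1 + heap[h].size;    /* coalesce with the free chunk before it */
        } else {
            heap[h].next_free = NONE;              /* start a new freelist entry */
            if (tail == NONE) first_free = h;
            else heap[tail].next_free = h;
            tail = run = h;
        }
        h = next;
    }
}
```

With chunks at indices 0, 3, 5 and 9 (sizes 2, 1, 3 and 2) and only the first and last marked, the sweep leaves one free chunk of size 5 at index 3 and clears both mark bits.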

Stop and Copy Garbage Collection

 Heap is divided into two parts:

 Old space used for allocation of new chunks

 New space used for garbage collection

 First-free pointer points to first free space in old space

 When allocation request is made for n bytes, if space is available in old space then make allocation, otherwise perform garbage collection
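
Allocation by advancing the first-free pointer is short enough to show in full. This C sketch assumes a 64-byte old space and ignores alignment; bump_alloc, old_space and SPACE_BYTES are hypothetical names, and a NULL result marks the point where a collection would be triggered.

```c
#include <stddef.h>

#define SPACE_BYTES 64
unsigned char old_space[SPACE_BYTES];
size_t first_free = 0;   /* offset of the first free byte in old space */

/* Returns n bytes from old space, or NULL when a garbage collection is needed. */
void *bump_alloc(size_t n) {
    if (n > SPACE_BYTES - first_free) return NULL;
    void *p = &old_space[first_free];
    first_free += n;     /* everything beyond this offset is still free */
    return p;
}
```

There is no list to search: one comparison and one addition per request.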

Stop and Copy Garbage Collection

 Find all accessible objects (following same method as mark and sweep)

 Copy each accessible object from old space to new space (no mark bit is needed)

 After making all copies, reverse role of old and new space

 First-free pointer points to beginning of the “new” old space

Stop and Copy Garbage Collection

 When chunk is copied from old to new, ALL pointers to chunk must be updated

 A forwarding pointer is left behind in old space and used to update other pointers to same object

 Follow example on board
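
In place of the board example, the copying step with forwarding pointers can be sketched in C. The sketch assumes fixed-size objects with two pointer fields, a dedicated forwarding field in each object (real collectors often overwrite part of the old object instead), and a scan pointer that walks the copies already made. The final role swap of the two spaces is left out. Object, copy_object, collect, space_a and space_b are hypothetical names.

```c
#include <stddef.h>

#define SPACE_OBJECTS 8

typedef struct Object {
    struct Object *forwarding;   /* NULL until copied, then the address of the copy */
    struct Object *fields[2];    /* pointers stored in the object */
    int value;
} Object;

Object space_a[SPACE_OBJECTS];   /* old space in this sketch */
Object space_b[SPACE_OBJECTS];   /* new space in this sketch */
Object *new_space = space_b;
size_t new_used = 0;             /* number of objects copied so far */

/* Copies p's object unless that was already done; returns its new address. */
Object *copy_object(Object *p) {
    if (p == NULL) return NULL;
    if (p->forwarding == NULL) {
        Object *copy = &new_space[new_used++];
        *copy = *p;
        copy->forwarding = NULL;
        p->forwarding = copy;    /* forwarding pointer left behind in old space */
    }
    return p->forwarding;
}

/* Copies everything reachable from the roots and updates all pointers. */
void collect(Object **roots, size_t nroots) {
    size_t scan = 0;
    new_used = 0;
    for (size_t i = 0; i < nroots; i++) roots[i] = copy_object(roots[i]);
    while (scan < new_used) {    /* copies whose fields still point into old space */
        Object *o = &new_space[scan++];
        for (size_t f = 0; f < 2; f++) o->fields[f] = copy_object(o->fields[f]);
    }
}
```

When two fields point to the same object, the first visit makes the copy and the second visit finds the forwarding pointer, so both fields end up pointing to the same new address.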

Advantages of Stop and Copy

 Allocation is Cheaper (no need for searching free list, just advance first-free pointer)

 No Freelist, just one chunk of free memory, no need to coalesce chunks

 Cheaper than mark and sweep – no need to scan entire heap

 Compaction places the surviving objects closer together (fewer cache misses, fewer page faults)

Identifying Pointers

 Automatic de-allocation requires the ability to find all pointers on the stack, in static data, and inside heap objects. Possible approaches:

 Give every word a one-bit tag (0 for non-pointer, 1 for pointer)

 Maintain separate bit-map of tags

 Associate with each variable and each object a type tag.
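
One way to realise the one-bit tag is to keep it in the lowest bit of each word. This placement is an assumption of the sketch, not something the slides specify; it relies on pointers being at least 2-byte aligned and costs integers one bit of range. The names Word, tag_int, tag_ptr, is_pointer, untag_int and untag_ptr are hypothetical.

```c
#include <stdint.h>

typedef uintptr_t Word;   /* one tagged machine word */

Word tag_int(uintptr_t n)   { return n << 1; }                     /* tag bit 0: not a pointer */
Word tag_ptr(void *p)       { return (Word)p | 1u; }               /* tag bit 1: pointer */
int is_pointer(Word w)      { return (int)(w & 1u); }
uintptr_t untag_int(Word w) { return w >> 1; }
void *untag_ptr(Word w)     { return (void *)(w & ~(Word)1); }
```

A collector scanning the stack would call is_pointer on each word and follow only the words for which it returns 1.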

Summary

 Two methods of Storage De-allocation

 Programmer controlled

 Automatic

 Programmer controlled errors include:

 Storage leaks

 Corrupted memory via dangling pointers

 Automatic De-allocation

 Reference counting

 High space and time overhead

 Cannot free cyclic structures

 Cost is distributed over the execution of program

 Garbage collection

 Mark and Sweep

 Stop and Copy
