Experience with Safe Memory Management in Cyclone
Michael Hicks
University of Maryland, College Park
Joint work with
Greg Morrisett - Harvard
Dan Grossman - UW
Trevor Jim - AT&T
Cyclone
• Derived from C, having similar goals
– Exposes low-level data representations, provides fine-grained operations
• But safe
– Restrictions to C (e.g., (int *)1 not allowed)
– Additions and types to regain flexibility
• Today: balancing safety and flexibility when managing memory
Goal: Programmer Control
• Many memory management choices
– Garbage collection
– Stack allocation
– malloc/free
– Reference counting (Linux, COM)
– Arenas (bulk free) (Apache, LCC)
• Depends on the application
Unifying Theme: Region types
• Conceptually divide memory into regions
– Different kinds of regions (e.g., not just bulk-free)
• Associate every pointer with a region
• Prevent dereferencing pointers into dead regions
int *`r x;  // x points into region `r
*x = 3;     // deref allowed if `r is live
(inference often obviates `r annotations)
Liveness tracked by a type-and-effect system (Tofte & Talpin)
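To make the idea of pairing each pointer with a region concrete, here is a minimal sketch in plain C rather than Cyclone. Cyclone establishes liveness statically through its type-and-effect system; this sketch only emulates the rule with a runtime check. The names `region_t`, `region_int_ptr`, `region_store`, and `region_demo` are hypothetical and not part of Cyclone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical runtime stand-in for a region: it only tracks liveness. */
typedef struct {
    bool live;
} region_t;

/* A pointer paired with the region it points into (compare int *`r). */
typedef struct {
    int *ptr;
    const region_t *region;
} region_int_ptr;

/* Store through the pointer only if its region is still live. */
static bool region_store(region_int_ptr p, int value) {
    if (p.ptr == NULL || p.region == NULL || !p.region->live) {
        return false;
    }
    *p.ptr = value;
    return true;
}

/* Demo: a store succeeds while the region is live and is refused afterwards. */
int region_demo(void) {
    int cell = 0;
    region_t r = { true };
    region_int_ptr x = { &cell, &r };

    bool ok_live = region_store(x, 3);  /* allowed: region is live */
    r.live = false;                     /* the region "dies" */
    bool ok_dead = region_store(x, 4);  /* refused: region is dead */

    assert(cell == 3);                  /* the second store never happened */
    return (ok_live ? 1 : 0) + (ok_dead ? 2 : 0);
}
```

Calling `region_demo()` returns 1: only the store made while the region was live went through. In Cyclone the second store would be rejected at compile time, with no check at run time.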
Outline
• Motivation and basic approach
• Regions description
– Basics: LIFO arenas, stack and heap regions
– Unique and reference-counted pointers
– Dynamic arenas
• Programming experience
• Experimental measurements
• Conclusions
LIFO Arenas
• Dynamic allocation mechanism
• Lifetime of entire arena is scoped
– At conclusion of scope, all data allocated in the arena is freed.
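The arena mechanism can be sketched in plain C as a bump allocator. This is an illustration under assumptions, not Cyclone's runtime: the names `arena_t`, `arena_init`, `arena_alloc`, `arena_destroy`, and `arena_demo` are hypothetical, the capacity is fixed, and 16-byte alignment is assumed to suit the platform. Nothing here ties the arena's lifetime to a lexical scope as Cyclone does; the caller must destroy it by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump-pointer arena: one backing buffer and one offset. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} arena_t;

/* Create an arena with a fixed capacity; returns 0 on success. */
static int arena_init(arena_t *a, size_t capacity) {
    assert(a != NULL);
    a->base = malloc(capacity);
    a->capacity = (a->base != NULL) ? capacity : 0;
    a->used = 0;
    return (a->base != NULL) ? 0 : -1;
}

/* Constant-time allocation: round the size up and bump the offset. */
static void *arena_alloc(arena_t *a, size_t size) {
    assert(a != NULL);
    const size_t align = 16;  /* assumed alignment, see lead-in */
    size_t rounded = (size + align - 1) / align * align;
    if (rounded < size || rounded > a->capacity - a->used) {
        return NULL;          /* overflow or arena exhausted */
    }
    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Bulk free: every object allocated in the arena dies at once. */
static void arena_destroy(arena_t *a) {
    assert(a != NULL);
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}

/* Demo: two allocations share one arena and are released together. */
int arena_demo(void) {
    arena_t a;
    if (arena_init(&a, 1024) != 0) {
        return -1;
    }
    int *xs = arena_alloc(&a, 4 * sizeof(int));
    int *ys = arena_alloc(&a, 4 * sizeof(int));
    if (xs == NULL || ys == NULL) {
        arena_destroy(&a);
        return -1;
    }
    for (int i = 0; i < 4; i++) {
        xs[i] = i;
        ys[i] = 10 * i;
    }
    int sum = xs[3] + ys[3];  /* 3 + 30 */
    arena_destroy(&a);        /* frees xs and ys in one step */
    return sum;
}
```

Individual objects have no `free` of their own, which is the bulk-free trade-off: allocation and deallocation are cheap, but every object lives until the whole arena goes away.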
LIFO Arena Example
FILE *infile = …
Image *i;
if (tag(infile) == HUFFMAN) {
  region<`r> h;                      // region `r created
  struct hnode *`r huff_tree;
  huff_tree = read_tree(h,infile);   // allocates with h
  i = decode_image(infile,huff_tree,…);
  // region `r deallocated upon exit of scope
} else …
Stack and Heap Regions
• Stack regions
– Degenerate case of LIFO arena which does not allow dynamic allocation
– Essentially activation records
• Heap region
– A global region `H that is always live
– Like a LIFO arena that never dies; objects reclaimed by a garbage collector
Scoped Regions Summary
Region Variety   Allocation   Deallocation                     Aliasing
                 (objects)    (what)           (when)          (objects)
Stack            static       whole region     exit of scope   free
LIFO Arena       dynamic      whole region     exit of scope   free
Heap             dynamic      single objects   GC              free
See PLDI '02 paper for more details
Benefits
• No runtime access checks
• Arena/stacks
– costs are constant-time
• region allocation
• region deallocation
• object creation
– useful for
• Temporary data (e.g., local variables)
• Callee-allocates data (rprintf)
• Lots of C-style code
Limitations
• Lack of control over memory usage
– Spurious retention of regions and their objects
– Fragmentation
– Extra space required by the garbage collector
• Lack of control over CPU usage
– Garbage collection is “one-size-fits-all”
• Hard to tune
– Cannot avoid GC in some cases: LIFO arenas not expressive enough
• E.g., objects with overlapping lifetimes
Overcoming the Limitations
• Allow greater control over lifetimes
– Object lifetimes
• Unique pointers and reference-counted pointers
– Arena lifetimes
• Dynamic arenas
• But not for nothing ...
– Restrictions on aliasing
– Possibility of memory leaks
Unique Region
• Distinguished region name `U
• Individual objects can be freed manually
• An intraprocedural, flow-sensitive analysis
– ensures that a unique pointer is not used after it is consumed (i.e., freed)
– treats copies as destructive; i.e., only one usable copy of a pointer to the same memory
– Loosely based on affine type systems
Unique Pointer Example
void foo() {
  int *`U x = malloc(sizeof(int));
  int *`U y = x;  // consumes x
  *x = 5;         // disallowed
  free(y);        // consumes y
  *y = 7;         // disallowed
}
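The destructive-copy rule can be emulated in plain C, with the caveat that C detects a consumed pointer only at run time, whereas Cyclone's flow analysis rejects such code at compile time. The following is a sketch under that assumption; `unique_int`, `unique_new`, `unique_move`, `unique_free`, `unique_is_consumed`, and `unique_demo` are hypothetical names, not Cyclone APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical owning handle: at most one handle holds a non-NULL pointer. */
typedef struct {
    int *ptr;
} unique_int;

/* Allocate a fresh object; the handle is consumed if allocation fails. */
unique_int unique_new(int value) {
    unique_int u = { malloc(sizeof(int)) };
    if (u.ptr != NULL) {
        *u.ptr = value;
    }
    return u;
}

/* Destructive copy: ownership moves to the result and src is consumed. */
unique_int unique_move(unique_int *src) {
    assert(src != NULL);
    unique_int dst = { src->ptr };
    src->ptr = NULL;
    return dst;
}

/* Free the object and consume the handle. */
void unique_free(unique_int *u) {
    assert(u != NULL);
    free(u->ptr);
    u->ptr = NULL;
}

/* A consumed handle is visible here at run time, not at compile time. */
int unique_is_consumed(const unique_int *u) {
    return u->ptr == NULL;
}

/* Demo mirroring foo(): move x into y, then free y. */
int unique_demo(void) {
    unique_int x = unique_new(5);
    if (unique_is_consumed(&x)) {
        return -1;                          /* allocation failed */
    }
    unique_int y = unique_move(&x);         /* consumes x */
    int x_consumed = unique_is_consumed(&x);
    int value = *y.ptr;                     /* y is the only usable copy */
    unique_free(&y);                        /* consumes y */
    int y_consumed = unique_is_consumed(&y);
    return value + 10 * x_consumed + 100 * y_consumed;
}
```

`unique_demo()` returns 115: the value 5 was read through `y`, and both handles ended up consumed. The two lines marked "disallowed" in the slide correspond to dereferencing a handle for which `unique_is_consumed` is true.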
Temporary Aliasing
• Problem: Non-aliasing too restrictive
• Partial solution: Allow temporary, lexically scoped aliasing under acceptable conditions
– Makes unique pointers easier to use
– Increases code reuse
Alias construct
extern void f(int *`r x); // `r any scoped region
void foo() {
  int *`U x = malloc(sizeof(int));
  *x = 3;
  { alias <`r>int *`r y = x; // `r fresh
    f(y);                    // y aliasable, but x consumed
  }                          // x unconsumed
  free(x);
}
Alias inference
extern void f(int *`r x); // `r any scoped region
void foo() {
  int *`U x = malloc(sizeof(int));
  *x = 3;
  f(x);   // alias inserted here automatically
  free(x);
}
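The effect of the alias scope, where the owner is unusable while the alias exists and usable again afterwards, can be imitated in plain C. This is a runtime sketch under assumptions, not how Cyclone implements alias (which is a static check): `unique_int`, `with_alias`, `add_one`, and `alias_demo` are hypothetical names, and the storage is a local variable only to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical owning handle, as in a unique pointer. */
typedef struct {
    int *ptr;
} unique_int;

/* Stand-in for f(int *`r x): works on any temporarily aliased pointer. */
static void add_one(int *x) {
    *x += 1;
}

/* Lend the pointee to f; the owner looks consumed until f returns. */
void with_alias(unique_int *owner, void (*f)(int *)) {
    assert(owner != NULL && owner->ptr != NULL && f != NULL);
    int *borrowed = owner->ptr;
    owner->ptr = NULL;      /* owner consumed inside the alias scope */
    f(borrowed);
    owner->ptr = borrowed;  /* owner unconsumed at scope exit */
}

/* Demo: x is lent to add_one and is usable again afterwards. */
int alias_demo(void) {
    int cell = 3;
    unique_int x = { &cell };
    with_alias(&x, add_one);
    return (x.ptr == &cell) ? *x.ptr : -1;
}
```

`alias_demo()` returns 4. Wrapping the borrow in a function keeps the "consume, use, restore" steps together, which is roughly what the lexical scope of alias guarantees statically.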
Reference-counted Pointers
• Distinguished region `RC
• Objects allocated in `RC have a hidden reference-count field
• Aliasing tracked as with unique pointers. Explicit aliasing/freeing via
`a *`RC alias_refptr(`a *`RC);
void drop_refptr(`a *`RC);
Reference-counting Example
struct conn * `RC cmd_pasv(struct conn * `RC c) {
  struct ftran * `RC f;
  int sock = socket(...);
  f = alloc_new_ftran(sock,alias_refptr(c));
  c->transfer = alias_refptr(f);
  listen(f->sock, 1);
  f->state = 1;
  drop_refptr(f);
  return c;
}
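The alias/drop discipline can be sketched in plain C with an explicit count field. This is an assumption-laden illustration, not Cyclone's runtime: `rc_int`, `rc_new`, `rc_alias`, `rc_drop`, and `rc_demo` are hypothetical names that only mirror the roles of `alias_refptr` and `drop_refptr`, and the `rc_live_objects` counter exists purely so the demo can observe when the object is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted cell; count plays the hidden-field role. */
typedef struct {
    size_t count;
    int value;
} rc_int;

static int rc_live_objects = 0;  /* instrumentation for the demo only */

/* Allocate an object holding one reference. */
rc_int *rc_new(int value) {
    rc_int *p = malloc(sizeof *p);
    if (p != NULL) {
        p->count = 1;
        p->value = value;
        rc_live_objects++;
    }
    return p;
}

/* Role of alias_refptr: create one more reference to the same object. */
rc_int *rc_alias(rc_int *p) {
    assert(p != NULL && p->count > 0);
    p->count++;
    return p;
}

/* Role of drop_refptr: give up one reference; free on the last one. */
void rc_drop(rc_int *p) {
    assert(p != NULL && p->count > 0);
    if (--p->count == 0) {
        free(p);
        rc_live_objects--;
    }
}

/* Demo mirroring cmd_pasv: store an alias, then drop the local reference. */
int rc_demo(void) {
    rc_int *f = rc_new(1);
    if (f == NULL) {
        return -1;
    }
    rc_int *stored = rc_alias(f);      /* like c->transfer = alias_refptr(f) */
    rc_drop(f);                        /* like drop_refptr(f) */
    int still_live = rc_live_objects;  /* object survives via stored */
    int value = stored->value;
    rc_drop(stored);                   /* last reference: object is freed */
    return 100 * still_live + 10 * value + rc_live_objects;
}
```

`rc_demo()` returns 110: the object was still live after the local reference was dropped, its value was 1, and no objects remained once the stored reference was dropped. Forgetting the final `rc_drop` would leak the object, which is the leak risk the slides mention for these pointers.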
Regions Summary
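Region Variety   Allocation   Deallocation                     Aliasing
                 (objects)    (what)           (when)          (objects)
Stack            static       whole region     exit of scope   free
LIFO             dynamic      whole region     exit of scope   free
Dynamic          dynamic      whole region     manual          free
Heap             dynamic      single objects   GC              free
Unique           dynamic      single objects   manual          restricted
Refcounted       dynamic      single objects   manual          restricted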
Ensuring Uniformity and Reuse
• Many different idioms could be hard to use
– Duplicated library functions
– Hard-to-change application code
• We have solved this problem by
– Using region types as a unifying theme
– Region polymorphism with kinds
• E.g., functions independent of arguments’ regions
– All regions can be treated as if lexical
• Temporarily, under correct circumstances
• Using alias and open constructs
Programming Experience
Boa         web server
BetaFTPD    FTP server
Epic        image compression
Kiss-FFT    portable Fourier transform
MediaNet    streaming overlay network
CycWeb      web server
CycScheme   Scheme interpreter
Application Characteristics
MediaNet Datastructures
Experimental Measurements
• Platform
– Dual 1.6 GHz AMD Athlon MP 2000
• 1 GB RAM
• Switched Myrinet
– Linux 2.4.20 (RedHat)
• Software
– C code: gcc 3.2.2
– Cyclone code: cyclone 0.8
– GC: BDW conservative collector 6.2a4
– malloc/free: Lea allocator 2.7.2
Bottom Line
• CPU time
– Most applications do not benefit from switching from BDW GC to manual approach
– MediaNet is the exception
• Memory usage
– Can reduce memory footprint and working set size by 2 to 10 times by using manual techniques
Throughput: Webservers
Throughput: MediaNet
Memory Usage: Web (I)
Memory Usage: Web (II)
Memory Usage: Web (III)
Memory Usage: MediaNet
(4 KB packets)
Related Work
• Regions
– ML-Kit (foundation for Cyclone’s type system)
– RC
– Reaps
– Walker/Watkins
• Uniqueness
– Wadler, Walker/Watkins, Clean
– Alias types, Calculus of Capabilities, Vault
– Destructive reads (e.g., Boyland)
Future Work
• Tracked pointers sometimes painful; want
– Better inference (e.g. for alias)
– Richer API (restrict; autorelease)
• Prevent leaks
– unique and reference-counted pointers
• Specified aliasing
– for doubly-linked lists, etc.
• Concurrency
Conclusions
• High degree of control, safely:
• Sound mechanisms for programmer-controlled memory management
– Region-based vs. object-based deallocation
– Manual vs. automatic reclamation
• Region-annotated pointers within a simple framework
– Scoped regions as the unifying theme (alias, open)
– Region polymorphism, for code reuse
More Information
• Cyclone homepage
– http://www.cs.umd.edu/projects/cyclone/
• Has papers, the benchmarks from this paper, and a free distribution
– Read about it, write some code!