Alias Types
What do you want to
type check today?
David Walker
Cornell University
Types in Compilation
[Diagram: terms and types are carried from typed source through typed intermediate to typed target code]
• Type-preserving compilers [Java,Til(t),Touchstone,Popcorn]
– produce certified code
– improve reliability & security
April 12, 2000
David Walker, Cornell University
High-level vs Low-level
• Typed high-level languages
– simple & concise
• programmers must be able to diagnose errors
• type inference improves productivity
• Typed low-level languages
– expressive
• capable of encoding multiple source languages
• capable of encoding multiple compilation strategies
• may focus on checking rather than inference
Memory Management
• Typed high-level languages
– simple & concise
• automatic memory management
• Typed low-level languages
– expressive
• support for alternative memory management
techniques, compiler optimizations
• explicit memory allocation, initialization, recycling,
and deallocation
Goals
• Study memory management invariants
• Make invariants explicit in a type system
– provide compiler writers, systems hackers
with flexibility & safety
• Today
– one particular type system
Hazards
• When memory is recycled, it may be used to
store objects of different types
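free(x)
let y = <λx.x>
[Diagram: x points to a cell holding 3; free(x) puts the cell on free_list; y then reuses the cell to hold λx.x while x still points to it]
• x must no longer be used as an integer reference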
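The hazard can be reproduced in a small simulation. This is only a sketch: the `Heap` class, its `alloc` and `free` methods, and the variable names are hypothetical and not part of the talk.

```python
class Heap:
    """Toy heap: a list of cells plus a free list of recycled addresses."""

    def __init__(self):
        self.cells = []
        self.free_list = []

    def alloc(self, value):
        # Prefer a recycled cell; otherwise grow the heap.
        if self.free_list:
            addr = self.free_list.pop()
            self.cells[addr] = value
        else:
            addr = len(self.cells)
            self.cells.append(value)
        return addr

    def free(self, addr):
        # The cell is recycled, but old copies of `addr` still exist.
        self.free_list.append(addr)


heap = Heap()
x = heap.alloc(3)            # x refers to a cell holding an integer
heap.free(x)                 # the cell goes onto the free list
y = heap.alloc(lambda v: v)  # the same cell now holds a function
stale_is_int = isinstance(heap.cells[x], int)  # x is now a stale reference
```

After the second allocation, `x` and `y` name the same cell, so reading through `x` as an integer would be unsound.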
MM Tradeoffs
• Safe memory management involves deciding
amongst tradeoffs:
– aliasing: are multiple references to an object allowed?
– first-class: where can references be stored?
– reuse: can memory be reused at different types?
[Diagram: design space with three axes: Aliasing, First-class, Reuse]
ML Refs
[Diagram: ML refs located in the Aliasing / First-class / Reuse design space]
• Unlimited aliasing
• First-class
• Limited reuse
– refs obey the type invariance principle
– reuse is limited to objects of the same type
• explicit deallocation is disallowed
Stack Allocation
[Diagram: stack allocation located in the Aliasing / First-class / Reuse design space]
• Unlimited reuse
• Some aliasing
• Not first-class
• Examples
– Algol, stack-based (typed) assembly language
Linear Typing
[Diagram: linear typing located in the Aliasing / First-class / Reuse design space]
• Immediate reuse
• First-class
• No aliasing
– one reference to an object of linear type
Alias Types
[Diagram: alias types located in the Aliasing / First-class / Reuse design space]
• Unlimited reuse
• First-class
• Some aliasing
Outline
• Alias types [with Fred Smith, Greg Morrisett]
– The basics: concrete store types
• Types for describing store shape
• Type checking
– Abstraction mechanisms
• Polymorphic, Existential & Recursive types
• Wrap-up
– Implementation & research directions
Alias Analysis
• Alias analysis
– the problem of discovering aliasing relationships in
unannotated programs (often in a subset of C)
– goals:
• program optimization
• uncovering hazards in unsafe programs
– vast literature: Jones & Muchnick, Deutsch,
Ghiya & Hendren, Steensgaard, Evans, Sagiv &
Reps & Wilhelm, ...
Our Problem
• Checking aliasing & typing in safe languages
– used in a certifying compiler
– integrated with a rich type system (TAL)
• typing and aliasing are inter-dependent
• aliasing relationships encoded using types
• can express dependencies between functions & data
• sound: standard proof techniques imply type safety [Wright & Felleisen]
Linear Types
• Linear types ensure there is one access path
to any memory object
x : int ⊗ (int ⊗ int)
[Diagram: pointers x and x' drawn against a cell holding 2 and a pointer to a second cell holding 5 and 7]
• A single-use constraint preserves the invariant
x : int ⊗ (int ⊗ int)
let y,z = x in ...
y : int, z : (int ⊗ int)
x is implicitly recycled
[Diagram: after the let, y = 2 and z points to the cell holding 5 and 7]
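The single-use constraint can be mimicked at run time. The `Linear` wrapper below is a hypothetical illustration: a linear type system rejects the second use statically, whereas this sketch only detects it dynamically.

```python
class Linear:
    """Wraps a value that may be consumed exactly once."""

    def __init__(self, value):
        self._value = value
        self._used = False

    def take(self):
        # Consuming the value invalidates this access path.
        if self._used:
            raise RuntimeError("linear value used twice")
        self._used = True
        value, self._value = self._value, None
        return value


# x : int ⊗ (int ⊗ int), with the inner pair itself linear
x = Linear((2, Linear((5, 7))))
y, z = x.take()              # let y,z = x in ...

try:
    x.take()                 # a second use of x must be rejected
    second_use_rejected = False
except RuntimeError:
    second_use_rejected = True
```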
Aliasing
• User data structures involve aliasing:
– circular lists, queues, ...
• Compilers introduce more aliasing:
– displays, some implementations of exceptions
– transformations/optimizations: register allocation,
destination-passing style
• Bottom line:
– There are countless situations in which the
single access path invariant is too restrictive
Alias Types
• Main idea: split an object type into two parts
– an address (a "name" for the object)
• multiple occurrences represent aliasing, multiple
access paths
– a type describing object contents
address: 0x3466
memory/object contents: <int,int>
Store Types
• Store types
{l1 ↦ <int,int>}
l1: 4 7
• Store type composition
{l1 ↦ <int,int>} ⊕
{l2 ↦ <int,int>} ⊕
{l3 ↦ <int,int>}
l1: 4 7
l2: 1 2
l3: 9 8
Store Types
• Store component types are unordered:
{l1 ↦ <int>} ⊕ {l2 ↦ <char>} = {l2 ↦ <char>} ⊕ {l1 ↦ <int>}
• No aliasing/duplication of store types
– one type associated with each address
– no contraction rule
{l1 ↦ <int,int>} ⊕ {l1 ↦ <int,int>} ≠ {l1 ↦ <int,int>}
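One way to picture store types is as finite maps from addresses to contents types, where composition is a disjoint union. The sketch below assumes that reading; `compose` and the string encoding of types are hypothetical.

```python
def compose(*components):
    """Join store-type components; each address may appear only once."""
    store = {}
    for component in components:
        for addr, contents in component.items():
            if addr in store:
                # No contraction rule: duplicating a component is an error.
                raise ValueError(f"duplicate address {addr}")
            store[addr] = contents
    return store


a = {"l1": ("int",)}
b = {"l2": ("char",)}
ab = compose(a, b)
ba = compose(b, a)   # same store type: components are unordered
```

Calling `compose(a, a)` raises `ValueError`, mirroring the rule that one type is associated with each address.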
Aliasing
• Pointers have singleton type
– x : ptr(l1)
– "x points to the object at address l1"
– aliases == pointers to objects with the same name
– eg: x : ptr(l1), y : ptr(l1)
[Diagram: x and y both point to the object at l1]
Aliasing
• A dag:
{ l1 ↦ <int,ptr(l3)> } ⊕
{ l2 ↦ <char,ptr(l3)> } ⊕
{ l3 ↦ <int,int> }
x : ptr(l1), y : ptr(l2)
[Diagram: x points to a cell holding 4, y to a cell holding 'a'; both cells point to a shared cell holding 5 and 7]
• A cycle:
{ l1 ↦ <int,ptr(l1)> }
[Diagram: a cell holding 4 that points to itself]
Type Checking
• Store types vary between program points:
{ l1 ↦ τ1 } ⊕ { l2 ↦ τ2 } ⊕ ...
instruction
{ l1 ↦ τ1' } ⊕ { l2 ↦ τ2 } ⊕ ...
instruction
{ l1 ↦ τ1' } ⊕ { l2 ↦ τ2' } ⊕ ...
Example
• Initializing data structures:
{ l ↦ <Top,Top> }, x : ptr(l)
x.1 := 3;
{ l ↦ <int,Top> }, x : ptr(l)
x.2 := 'a';
{ l ↦ <int,char> }, x : ptr(l)
[Diagram: x points to a cell whose two uninitialized fields become 3 and ‘a’]
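The initialization sequence can be traced with a tiny checker that threads the store type through each assignment. This is a sketch under simplifying assumptions: types are strings, and `assign`, `env` and `s0` to `s2` are hypothetical names.

```python
def assign(store, env, var, field, value_type):
    """Check `var.field := v` and return the store type after it.

    The field's old type is discarded and replaced (a strong update).
    """
    loc = env[var]                     # var : ptr(loc)
    if loc not in store:
        raise TypeError(f"{loc} not present in store")
    fields = list(store[loc])
    fields[field - 1] = value_type     # fields are 1-indexed, as in x.1
    new_store = dict(store)
    new_store[loc] = tuple(fields)
    return new_store


env = {"x": "l"}                       # x : ptr(l)
s0 = {"l": ("Top", "Top")}
s1 = assign(s0, env, "x", 1, "int")    # x.1 := 3
s2 = assign(s1, env, "x", 2, "char")   # x.2 := 'a'
```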
Example
• Use of a pointer requires proper store type:
{ l1 ↦ <int,int> }, x:ptr(l1)
let z = x in
{ l1 ↦ <int,int> }, x:ptr(l1), z:ptr(l1)
free (z);
{ }, x:ptr(l1), z:ptr(l1)
let w = x.1 in % Wrong: l1 not present in store
....
[Diagram: x and z alias the cell holding 4 and 3; after free (z) both point to deallocated memory]
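The rejected program can be replayed against a small checker. As before this is only a sketch: `free`, `project` and the dictionary encoding of store types are hypothetical.

```python
def free(store, env, var):
    """Check `free(var)`: drop the pointed-to location from the store type."""
    loc = env[var]
    if loc not in store:
        raise TypeError(f"{loc} not present in store")
    return {l: t for l, t in store.items() if l != loc}


def project(store, env, var, field):
    """Check `var.field`: the location must still be in the store type."""
    loc = env[var]
    if loc not in store:
        raise TypeError(f"{loc} not present in store")
    return store[loc][field - 1]


store = {"l1": ("int", "int")}
env = {"x": "l1"}
env["z"] = env["x"]              # let z = x: z gets the same singleton type
store = free(store, env, "z")    # l1 leaves the store type

try:
    project(store, env, "x", 1)  # let w = x.1
    rejected = False
except TypeError:
    rejected = True
```

Because `x` and `z` share the name `l1`, freeing through `z` also invalidates `x`, without the checker needing to track the two pointers separately.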
Functions
• Function types specify input & output store:
f : { l1 ↦ <int,int> }.τ1 → { l1 ↦ <char,char> }.τ2
• A call site:
{ l1 ↦ <int,int> }, x : τ1
let y = f (x) in
{ l1 ↦ <char,char> }, x : τ1, y : τ2
...
• Technical note: calculus formalized in continuation-passing style
Outline
• Alias types [with Fred Smith, Greg Morrisett]
– The basics: concrete store types
• Types for describing store shape
• Type checking
– Abstraction mechanisms
• Polymorphic, Existential & Recursive types
• Wrap-up
– Implementation & research directions
Location Polymorphism
deref: { 0x12 ↦ <int> }.ptr(0x12) → { 0x12 ↦ <int> }.int
– Only concrete location 0x12 can be dereferenced
– Add location polymorphism:
deref: ∀[ρ1].{ ρ1 ↦ <int> }.ptr(ρ1) → { ρ1 ↦ <int> }.int
– The dependence between pointer and memory block is
preserved
Example
deref: ∀[ρ1].{ ρ1 ↦ <int> }.ptr(ρ1) → { ρ1 ↦ <int> }.int

let ρ, x = new(1) in
{ ρ ↦ <Top> }, x : ptr(ρ)
x.1 := 3;
{ ρ ↦ <int> }, x : ptr(ρ)
let y = deref [ρ] (x) in
{ ρ ↦ <int> }, x : ptr(ρ), y : int
[Diagram: x points to the cell at 0x12 holding 3]
– From now on, I will stop mentioning concrete locations
Another Difficulty
• Currently, deref can only be used in a store with one
reference:
deref: ∀[ρ1].{ ρ1 ↦ <int> }.ptr(ρ1) → { ρ1 ↦ <int> }.int
let ρ, x = new(1) in x.1 := 3;
let ρ', y = new(1) in y.1 := 7;
{ ρ ↦ <int> } ⊕ { ρ' ↦ <int> }
let _ = deref [ρ] (x) ...
% { ρ ↦ <int> } ⊕ { ρ' ↦ <int> } ≠ { ρ ↦ <int> }
Subtyping?
deref: ∀[ρ1].{ ρ1 ↦ <int> }.ptr(ρ1) → { ρ1 ↦ <int> }.int
– Subtyping (weakening) makes store components
unusable:
{ ρ ↦ <int> } ⊕ { ρ' ↦ <int> } ≤ { ρ ↦ <int> }
let _ = deref [ρ] (x) in
{ ρ ↦ <int> }
% ρ' inaccessible
Store Polymorphism
– Store polymorphism hides store size and shape from
callee & preserves it across the call
deref: ∀[ε,ρ1]. ε ⊕ { ρ1 ↦ <int> }.ptr(ρ1) → ε ⊕ { ρ1 ↦ <int> }.int
– ε: store preserved across the call
Example
deref: ∀[ε,ρ1]. ε ⊕ { ρ1 ↦ <int> }.ptr(ρ1) → ε ⊕ { ρ1 ↦ <int> }.int
– deref may be called with different references and
preserves the store at each step:
x: ptr(ρ), y: ptr(ρ'),
{ ρ ↦ <int> } ⊕ { ρ' ↦ <int> }
let _ = deref [{ ρ' ↦ <int> },ρ] (x) in % OK
{ ρ ↦ <int> } ⊕ { ρ' ↦ <int> }
let _ = deref [{ ρ ↦ <int> },ρ'] (y) in % OK
{ ρ ↦ <int> } ⊕ { ρ' ↦ <int> }
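The two calls can be checked mechanically by splitting the caller's store into the part deref needs and the rest. The sketch below assumes store types are dictionaries; `call_deref` and the names `rho`, `rho'` are hypothetical stand-ins for the location variables.

```python
def call_deref(store, env, var):
    """Check a call to deref on `var` under store polymorphism.

    deref needs {rho1 -> <int>}; everything else in the caller's store
    instantiates the store variable and is handed back unchanged.
    """
    loc = env[var]                                        # instantiates rho1
    if store.get(loc) != ("int",):
        raise TypeError(f"{loc} does not have type <int>")
    rest = {l: t for l, t in store.items() if l != loc}   # instantiates epsilon
    result_store = {**rest, loc: ("int",)}                # epsilon joined with {rho1 -> <int>}
    return result_store, "int"


store = {"rho": ("int",), "rho'": ("int",)}
env = {"x": "rho", "y": "rho'"}
s1, t1 = call_deref(store, env, "x")   # rest of store is {rho' -> <int>}
s2, t2 = call_deref(s1, env, "y")      # rest of store is {rho -> <int>}
```

Both calls return the full store, so neither location becomes inaccessible, unlike the weakening approach.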
Example: A stack
foo: ∀[ε,sp,caller]. {sp ↦ <int,ptr(caller)>} ⊕ ε . ptr(sp) → ....
– {sp ↦ <int,ptr(caller)>}: stack frame
– int: function argument on stack
– ptr(caller): pointer to caller's frame
– ε: rest of the stack
– ptr(sp): stack pointer
[Diagram: stack with frames labelled sp and caller]
• O'Hearn & Reynolds
• Stack-based TAL
Aliasing
• Simple stack is purely linear
• Displays
– links to lexically enclosing scopes
– links for dynamic control
• Exceptions
– link to enclosing exception handler
– links for dynamic control
[Diagram: stack frames with links to a display and to the enclosing handler]
Displays
[Diagram: sp points to frame lex1; display points to a record of links to lex1 and lex2; each frame links to its caller, lex1caller or lex2caller]
sp : ptr(lex1), display : ptr(display)
{lex1 ↦ <...,ptr(lex1caller)>} ⊕
{lex2 ↦ <...,ptr(lex2caller)>} ⊕
{display ↦ <ptr(lex1),ptr(lex2)>}
So Far
• Alias tracking to a fixed depth (illustrated with k=2)
• Roughly corresponds to k-limited analyses
• No way to specify repeated patterns
Outline
• Alias types [with Fred Smith, Greg Morrisett]
– The basics: concrete store types
• Types for describing store shape
• Type checking
– Abstraction mechanisms
• Polymorphic, Existential & Recursive types
• Wrap-up
– Implementation & research directions
Existential Types
• Existential Types
– hide object names so they can only be
referenced locally
1
2
3
pack
1
April 12, 2000
2
3
- 2 only accessible
through 1
David Walker, Cornell University
38
Existential Introduction
{ρ1 ↦ <ptr(ρ2)>} ⊕ {ρ2 ↦ τ2} ⊕ ...
– <ptr(ρ2)>: reference to ρ2 in location ρ1
– {ρ2 ↦ τ2}: top-level name & storage
[Diagram: ρ1 points to ρ2, which points onward]
Existential Introduction
{ρ1 ↦ <ptr(ρ2)>} ⊕ {ρ2 ↦ τ2} ⊕ ...
– {ρ2 ↦ τ2}: top-level name & storage
pack
{ρ1 ↦ ∃[ρ2]. {ρ2 ↦ τ2} . <ptr(ρ2)> } ⊕ ...
– ∃[ρ2]: hide name
– {ρ2 ↦ τ2}: local storage
– the packed type describes the object in location ρ1
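The effect of pack on the store type can be sketched as a function that moves one component out of the top-level store and into another object's type. The tuple encoding `("exists", name, local_store, body)` and the function `pack` are hypothetical illustrations, not the calculus itself.

```python
def pack(store, outer, hidden):
    """Move `hidden` out of the top-level store and into `outer`'s type.

    ("exists", name, local_store, body) stands for an existential that
    binds `name` and owns `local_store`.
    """
    if outer == hidden or outer not in store or hidden not in store:
        raise TypeError("pack needs two distinct locations present in the store")
    packed = ("exists", hidden, {hidden: store[hidden]}, store[outer])
    rest = {l: t for l, t in store.items() if l not in (outer, hidden)}
    return {**rest, outer: packed}


store = {"rho1": ("ptr(rho2)",), "rho2": ("int",)}
packed_store = pack(store, "rho1", "rho2")
```

After packing, `rho2` no longer appears at the top level, so it can only be reached by first unpacking the object at `rho1`.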
Example
• Alternatives in a sum type may
encapsulate data structures
{ 1  < > + [].{  <char> }.<int, ptr()> }
 1:
or
2
‘c’
Recursive Types
• Recursive types describe repeated
patterns in the store
– μα.τ
– standard roll/unroll coercions witness the
isomorphism
Linear Lists
{ ρ1 ↦ μ list . < > + ∃[ρ] . { ρ ↦ list } . < int,ptr(ρ) > }
– < >: null
– { ρ ↦ list }: hidden tail
– < int,ptr(ρ) >: head
[Diagram: ρ1 heads the list 2, 7, 9]
• Interior nodes can only be accessed
through predecessors
In-place Append
{ 1  list }  { 2  list }
 1: 2
7
3
 2: 2
{ 1  <int,ptr(next)> }  { next  list }  { 2  list }
 1: 2
7
3
 2: 2
{1  <int,ptr(next)>}  {next  <int,ptr(next’)>}  {next’  list}  {2  list}
 1: 2
April 12, 2000
7
3
David Walker, Cornell University
 2: 2
44
Append Invariant
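start : ptr(ρ1), next : ptr(next), second : ptr(ρ2)
ε ⊕ { next ↦ <int,ptr(end)> } ⊕ { end ↦ list } ⊕ { ρ2 ↦ list }
[Diagram: start points to the head of the first list at ρ1; next points to a later node holding 3, whose tail is end; second points to the head of the second list at ρ2]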
In-place Append
... ⊕ {next ↦ <int,ptr(next’)>} ⊕ {next’ ↦ <int,ptr(ρ2)>} ⊕ {ρ2 ↦ list}
... ⊕ {next ↦ <int,ptr(next’)>} ⊕ { next’ ↦ list}
{ ρ1 ↦ list }
[Diagram, repeated at each step: ρ1 heads the combined list 2, 7, 3, 2]
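For reference, the operation being typed is the ordinary destructive append. The untyped sketch below shows only the run-time behaviour; the `Cell` class and helper names are hypothetical, and none of the store-type bookkeeping from the slides is modelled.

```python
class Cell:
    """A list node; tail=None plays the role of the null alternative."""

    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


def append_in_place(first, second):
    """Destructively append `second` to the non-empty list `first`."""
    node = first
    while node.tail is not None:   # step to the next node
        node = node.tail
    node.tail = second             # the last node now points to `second`
    return first


def to_list(cell):
    """Collect the heads of a linked list into a Python list."""
    out = []
    while cell is not None:
        out.append(cell.head)
        cell = cell.tail
    return out


first = Cell(2, Cell(7, Cell(3)))
second = Cell(2)
result = to_list(append_in_place(first, second))
```

Each loop iteration corresponds to unrolling one more node of the first list; the final assignment is the step after which the whole structure can be folded back into a single list at the first location.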
Trees
:
{   tree.<> + [1,2].{1  tree}{2  tree}.<ptr(1),ptr(2)>}
:
{   dag.<> + [1].{1  dag}.<ptr(1),ptr(1)>}
Other Possibilities
– circular lists:
{ ρ1 ↦ μ clist . <ptr(ρ1)> + ∃[ρ] . { ρ ↦ clist } . < int,ptr(ρ) > }
[Diagram: ρ1 heads the list 2, 7, 9, whose last node points back to ρ1]
– queues
– doubly-linked lists, trees with parent pointers
• require parametric recursive types
– destination-passing style [Wadler, Larus, Cheng & Okasaki, Minamide]
– link-reversal algorithms [Deutsch & Schorr & Waite, Sobel & Friedman]
Limitations
• All (useable) access paths must be known
statically
• A tree with leaves linked in a list
– can be described but not used
– how do you unfold the interior nodes of an
arbitrary tree when traversing the list?
...
Outline
• Alias types [with Fred Smith, Greg Morrisett]
– The basics: concrete store types
• Types for describing store shape
• Type checking
– Abstraction mechanisms
• Polymorphic, Existential & Recursive types
• Wrap-up
– Implementation & research directions
Implementation
• Currently in Typed Assembly Language:
– Initialization of data structures
– Run-time code generation
• code templates are copied into buffers, changing buffer type
– Alias tracking ensures consistency in the
presence of operations that alter object type
– Intuitionistic extension
• must-alias information, limited reuse
Research Directions
• Language design
– Source language support for safe, explicit MM
– Application domains
• embedded, real-time systems
– Platforms: Popcorn, Cyclone, ??
• Popcorn: safe C + polymorphism, exceptions, ML data
types & pattern matching
• Cyclone: gives programmers control over data layout
• ??: gives programmers control over MM
Research Directions
• Further exploration of MM invariants:
– A single region of memory stores multiple objects
[Tofte&Talpin]
– Region deallocation frees all objects in that region
simultaneously
[Diagrams: Regions and Objects in Regions, each located in the Aliasing / First-class / Reuse design space]
Summary
• Low-level languages require operations
for explicit memory reuse
• Types ensure safety by encoding rich
memory management invariants
• Reading:
– ESOP '00, http://www.cs.cornell.edu/talc