Alias Types
What do you want to type check today?
David Walker, Cornell University
April 12, 2000

Types in Compilation
[Figure: terms and types carried from Typed Source through Typed Intermediate to Typed Target]
• Type-preserving compilers [Java, Til(t), Touchstone, Popcorn]
  – produce certified code
  – improve reliability & security

High-level vs Low-level
• Typed high-level languages
  – simple & concise
    • programmers must be able to diagnose errors
    • type inference improves productivity
• Typed low-level languages
  – expressive
    • capable of encoding multiple source languages
    • capable of encoding multiple compilation strategies
    • may focus on checking rather than inference

Memory Management
• Typed high-level languages
  – simple & concise
    • automatic memory management
• Typed low-level languages
  – expressive
    • support for alternative memory management techniques, compiler optimizations
    • explicit memory allocation, initialization, recycling, and deallocation

Goals
• Study memory management invariants
• Make invariants explicit in a type system
  – provide compiler writers, systems hackers with flexibility & safety
• Today
  – one particular type system

Hazards
• When memory is recycled, it may be used to store objects of different types
[Figure: x points to a cell holding 3; free(x) returns the cell to free_list; let y = <λx.x> reuses the cell while x still points to it]
• x must not be used as an integer reference

MM Tradeoffs
• Safe memory management involves deciding amongst tradeoffs:
  – aliasing: are multiple references to an object allowed?
  – first-class: where can references be stored?
  – reuse: can memory be reused at different types?
[Figure: diagram labeled Aliasing, First-class, Reuse]

ML Refs
[Figure: position marked on the Aliasing / First-class / Reuse diagram]
• Unlimited aliasing
• First-class
• Limited reuse
  – refs obey the type invariance principle
  – reuse is limited to objects of the same type
    • explicit deallocation is disallowed

Stack Allocation
[Figure: position marked on the Aliasing / First-class / Reuse diagram]
• Unlimited reuse
• Some aliasing
• Not first-class
• Examples
  – Algol, stack-based (typed) assembly language

Linear Typing
[Figure: position marked on the Aliasing / First-class / Reuse diagram]
• Immediate reuse
• First-class
• No aliasing
  – one reference to an object of linear type

Alias Types
[Figure: position marked on the Aliasing / First-class / Reuse diagram]
• Unlimited reuse
• First-class
• Some aliasing

Outline
• Alias types [with Fred Smith, Greg Morrisett]
  – The basics: concrete store types
    • Types for describing store shape
    • Type checking
  – Abstraction mechanisms
    • Polymorphic, Existential & Recursive types
• Wrap-up
  – Implementation & research directions

Alias Analysis
• Alias analysis
  – the problem of discovering aliasing relationships in unannotated programs (often in a subset of C)
  – goals:
    • program optimization
    • uncovering hazards in unsafe programs
  – vast literature: Jones & Muchnick, Deutsch, Ghiya & Hendren, Steensgaard, Evans, Sagiv & Reps & Wilhelm, ...

Our Problem
• Checking aliasing & typing in safe languages
  – used in a certifying compiler
  – integrated with a rich type system (TAL)
• typing and aliasing are inter-dependent
• aliasing relationships encoded using types
• can express dependencies between functions & data
• sound: standard proof techniques imply type safety [Wright & Felleisen]

Linear Types
• Linear types ensure there is one access path to any memory object
    x : int ⊗ (int ⊗ int)
• A single-use constraint preserves the invariant
    x : int ⊗ (int ⊗ int)
    let y,z = x in ...
    y : int, z : (int ⊗ int)
  x is implicitly recycled
[Figure: memory cells holding 5, 7, 2, before and after the let]

Aliasing
• User data structures involve aliasing:
  – circular lists, queues, ...
• Compilers introduce more aliasing:
  – displays, some implementations of exceptions
  – transformations/optimizations: register allocation, destination-passing style
• Bottom line:
  – There are countless situations in which the single access path invariant is too restrictive

Alias Types
• Main idea: split an object type into two parts
  – an address (a "name" for the object)
    • multiple occurrences represent aliasing, multiple access paths
  – a type describing object contents
[Figure: 0x3466 (address) and <int,int> (memory/object contents)]

Store Types
• Store types
    {l1 ↦ <int,int>}
  [Figure: l1: 4 7]
• Store type composition
    {l1 ↦ <int,int>} ⊕ {l2 ↦ <int,int>} ⊕ {l3 ↦ <int,int>}
  [Figure: l1: 4 7, l2: 1 2, l3: 9 8]

Store Types
• Store component types are unordered:
    {l1 ↦ <int>} ⊕ {l2 ↦ <char>} = {l2 ↦ <char>} ⊕ {l1 ↦ <int>}
• No aliasing/duplication of store types
  – one type associated with each address
  – no contraction rule
    {l1 ↦ <int,int>} ≠ {l1 ↦ <int,int>} ⊕ {l1 ↦ <int,int>}

Aliasing
• Pointers have singleton type
  – x : ptr(l1): "x points to the object at address l1"
  – aliases == pointers to objects with the same name
  – e.g.: x : ptr(l1), y : ptr(l1)
[Figure: x and y both point to the object at l1]

Aliasing
• A dag:
    {l1 ↦ <int,ptr(l3)>} ⊕ {l2 ↦ <char,ptr(l3)>} ⊕ {l3 ↦ <int,int>}
    x : ptr(l1), y : ptr(l2)
  [Figure: x points to a cell holding 4, y to a cell holding 'a'; both cells point to a shared cell holding 5, 7]
• A cycle:
    {l1 ↦ <int,ptr(l1)>}
  [Figure: a cell holding 4 that points to itself]

Type Checking
• Store types vary between program points:
    {l1 ↦ τ1} ⊕ {l2 ↦ τ2} ⊕ ...
    instruction
    {l1 ↦ τ1'} ⊕ {l2 ↦ τ2} ⊕ ...
    instruction
    {l1 ↦ τ1'} ⊕ {l2 ↦ τ2'} ⊕ ...

Example
• Initializing data structures:
    {l ↦ <Top,Top>}, x : ptr(l)
    x.1 := 3;
    {l ↦ <int,Top>}, x : ptr(l)
    x.2 := 'a';
    {l ↦ <int,char>}, x : ptr(l)
  [Figure: x points to a cell whose fields go from ?, ? to 3, 'a']

Example
• Use of a pointer requires proper store type:
    {l1 ↦ <int,int>}, x:ptr(l1)
    let z = x in
    {l1 ↦ <int,int>}, x:ptr(l1), z:ptr(l1)
    free (z);
    { }, x:ptr(l1), z:ptr(l1)
    let w = x.1 in      % Wrong: l1 not present in store
    ....
  [Figure: x, then x and z, point to a cell holding 4, 3; after free the cell contents are unknown]

Functions
• Function types specify input & output store:
    f : {l1 ↦ <int,int>}.τ1 → {l1 ↦ <char,char>}.τ2
• A call site:
    {l1 ↦ <int,int>}, x : τ1
    let y = f (x) in
    {l1 ↦ <char,char>}, x : τ1, y : τ2
    ...
• Technical note: calculus formalized in continuation-passing style

Location Polymorphism
    deref: {0x12 ↦ <int>}.ptr(0x12) → {0x12 ↦ <int>}.int
  – Only concrete location 0x12 can be dereferenced
  – Add location polymorphism:
    deref: ∀[ρ1].{ρ1 ↦ <int>}.ptr(ρ1) → {ρ1 ↦ <int>}.int
  – The dependence between pointer and memory block is preserved

Example
    deref: ∀[ρ1].{ρ1 ↦ <int>}.ptr(ρ1) → {ρ1 ↦ <int>}.int

    let ρ, x = new(1) in
    {ρ ↦ <Top>}, x : ptr(ρ)
    x.1 := 3;
    {ρ ↦ <int>}, x : ptr(ρ)
    let y = deref [ρ] (x) in
    {ρ ↦ <int>}, x : ptr(ρ), y : int
  [Figure: x points to 0x12: 3]
  – From now on, I will stop mentioning concrete locations

Another Difficulty
• Currently, deref can only be used in a store with one reference:
    deref: ∀[ρ1].{ρ1 ↦ <int>}.ptr(ρ1) → {ρ1 ↦ <int>}.int

    let ρ, x = new(1) in
    x.1 := 3;
    let ρ', y = new(1) in
    y.1 := 7;
    {ρ ↦ <int>} ⊕ {ρ' ↦ <int>}
    let _ = deref [ρ] (x) ...      % {ρ ↦ <int>} ⊕ {ρ' ↦ <int>} ≠ {ρ ↦ <int>}

Subtyping?
    deref: ∀[ρ1].{ρ1 ↦ <int>}.ptr(ρ1) → {ρ1 ↦ <int>}.int
  – Subtyping (weakening) makes store components unusable:
    {ρ ↦ <int>} ⊕ {ρ' ↦ <int>} ≤ {ρ ↦ <int>}
    let _ = deref [ρ] (x) in
    {ρ ↦ <int>}      % ρ' inaccessible

Store Polymorphism
  – Store polymorphism hides store size and shape from callee & preserves it across the call
    deref: ∀[ε,ρ1]. ε ⊕ {ρ1 ↦ <int>}.ptr(ρ1) → ε ⊕ {ρ1 ↦ <int>}.int
    (ε: store preserved across the call)

Example
    deref: ∀[ε,ρ1].
      ε ⊕ {ρ1 ↦ <int>}.ptr(ρ1) → ε ⊕ {ρ1 ↦ <int>}.int
  – deref may be called with different references and preserves the store at each step:
    x: ptr(ρ), y: ptr(ρ'), {ρ ↦ <int>} ⊕ {ρ' ↦ <int>}
    let _ = deref [{ρ' ↦ <int>}, ρ] (x) in      % OK
    {ρ ↦ <int>} ⊕ {ρ' ↦ <int>}
    let _ = deref [{ρ ↦ <int>}, ρ'] (y) in      % OK
    {ρ ↦ <int>} ⊕ {ρ' ↦ <int>}

Example: A stack
    foo: ∀[ε,sp,caller]. ε ⊕ {sp ↦ <int,ptr(caller)>} . ptr(sp) → ....
[Figure labels: rest of the stack; stack frame; stack pointer; function argument on stack; pointer to caller's frame; cells sp and caller]
• O'Hearn & Reynolds
• Stack-based TAL

Aliasing
• Simple stack is purely linear
• Displays
  – links to lexically enclosing scopes
  – links for dynamic control
• Exceptions
  – link to enclosing exception handler
  – links for dynamic control
[Figure labels: display; enclosing handler]

Displays
[Figure: sp, display, and frames lex1, lex1caller, lex2, lex2caller]
    sp : ptr(lex1), display : ptr(display)
    {lex1 ↦ <...,ptr(lex1caller)>} ⊕
    {lex2 ↦ <...,ptr(lex2caller)>} ⊕
    {display ↦ <ptr(lex1),ptr(lex2)>}

So Far
• Alias tracking to a fixed depth k=2
• Roughly corresponds to k-limited analyses
• No way to specify repeated patterns

Existential Types
• Existential Types
  – hide object names so they can only be referenced locally
[Figure: objects ρ1, ρ2, ρ3 before and after pack]
  – ρ2 only accessible through ρ1

Existential Introduction
    {ρ1 ↦ <ptr(ρ2)>} ⊕ {ρ2 ↦ τ2} ⊕ ...
[Figure labels: top-level name & storage; reference to ρ2 in location ρ1]

Existential Introduction
    {ρ1 ↦ <ptr(ρ2)>} ⊕ {ρ2 ↦ τ2} ⊕ ...
    pack
    {ρ1 ↦ ∃[ρ2].{ρ2 ↦ τ2}.<ptr(ρ2)>} ⊕ ...
[Figure labels: top-level name & storage; hide name; local storage; the object in location ρ1]

Example
• Alternatives in a sum type may encapsulate data structures
    {ρ1 ↦ < > + ∃[ρ].{ρ ↦ <char>}.<int, ptr(ρ)>}
  [Figure: ρ1 is either empty, or holds 2 and points to a cell holding 'c']

Recursive Types
• Recursive types describe repeated patterns in the store
  – μα.τ
  – standard roll/unroll coercions witness the isomorphism

Linear Lists
    {ρ1 ↦ μlist. <> + ∃[ρ].{ρ ↦ list}.<int, ptr(ρ)>}
[Figure: list at ρ1 holding 2, 7, 9; labels: null (<>), hidden tail ({ρ ↦ list}), head (<int, ptr(ρ)>)]
• Interior nodes can only be accessed through predecessors

In-place Append
    {ρ1 ↦ list} ⊕ {ρ2 ↦ list}
    {ρ1 ↦ <int,ptr(next)>} ⊕ {next ↦ list} ⊕ {ρ2 ↦ list}
    {ρ1 ↦ <int,ptr(next)>} ⊕ {next ↦ <int,ptr(next')>} ⊕ {next' ↦ list} ⊕ {ρ2 ↦ list}
[Figure: at each step, list ρ1 holding 2, 7, 3 and list ρ2 holding 2]

Append Invariant
    start : ptr(ρ1), next : ptr(next), second : ptr(ρ2)
    {next ↦ <int,ptr(end)>} ⊕ {end ↦ list} ⊕ {ρ2 ↦ list}
[Figure: start points to the head of list ρ1, next to an interior node followed by end, second to list ρ2]

In-place Append
    ... ⊕ {next ↦ <int,ptr(next')>} ⊕ {next' ↦ <int,ptr(ρ2)>} ⊕ {ρ2 ↦ list}
    ... ⊕ {next ↦ <int,ptr(next')>} ⊕ {next' ↦ list}
    {ρ1 ↦ list}
[Figure: at each step, the single list at ρ1 holding 2, 7, 3, 2]

Trees
    {ρ ↦ μtree. <> + ∃[ρ1,ρ2].{ρ1 ↦ tree} ⊕ {ρ2 ↦ tree}.<ptr(ρ1),ptr(ρ2)>}
    {ρ ↦ μdag. <> + ∃[ρ1].{ρ1 ↦ dag}.<ptr(ρ1),ptr(ρ1)>}

Other Possibilities
– circular lists:
    [Figure: circular list at ρ1 holding 2, 7, 9]
    {ρ1 ↦ μclist. <ptr(ρ1)> + ∃[ρ].{ρ ↦ clist}.
      <int,ptr(ρ)>}
– queues
– doubly-linked lists, trees with parent pointers
  • require parametric recursive types
– destination-passing style [Wadler, Larus, Cheng&Okasaki, Minamide]
– link-reversal algorithms [Deutsch&Schorr&Waite, Sobel&Friedman]

Limitations
• All (usable) access paths must be known statically
• A tree with leaves linked in a list
  – can be described but not used
  – how do you unfold the interior nodes of an arbitrary tree when traversing the list?

Implementation
• Currently in Typed Assembly Language:
  – Initialization of data structures
  – Run-time code generation
    • code templates are copied into buffers, changing buffer type
  – Alias tracking ensures consistency in the presence of operations that alter object type
  – Intuitionistic extension
    • must-alias information, limited reuse

Research Directions
• Language design
  – Source language support for safe, explicit MM
  – Application domains
    • embedded, real-time systems
  – Platforms: Popcorn, Cyclone, ??
    • Popcorn: safe C + polymorphism, exceptions, ML data types & pattern matching
    • Cyclone: gives programmers control over data layout
    • ??: gives programmers control over MM

Research Directions
• Further exploration of MM invariants:
  – A single region of memory stores multiple objects [Tofte&Talpin]
  – Region deallocation frees all objects in that region simultaneously
[Figure: two Aliasing / First-class / Reuse diagrams, one for Regions and one for Objects in Regions]

Summary
• Low-level languages require operations for explicit memory reuse
• Types ensure safety by encoding rich memory management invariants
• Reading:
  – ESOP '00, http://www.cs.cornell.edu/talc
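
Sketch: tracking a store type
The type-checking examples from the talk (initializing a pair field by field, then using a pointer after its alias has been freed) can be mimicked with a small Python model. This is only an illustrative sketch, not the talk's calculus and not the TAL implementation: the store type is represented as a dictionary from location names to tuples of field types, a pointer type ptr(l) is represented by the location name itself, and the names StoreTypeError, new, assign, read, free and demo are all hypothetical.

```python
class StoreTypeError(Exception):
    """An operation is not justified by the current store type."""


def _require(store, loc, field=None):
    # A pointer may only be used if its location is described by the store type.
    if loc not in store:
        raise StoreTypeError(f"{loc} not present in store")
    if field is not None and not 1 <= field <= len(store[loc]):
        raise StoreTypeError(f"{loc} has no field {field}")


def new(store, loc, size):
    """Allocate `loc` with `size` uninitialized fields, each of type Top."""
    if loc in store:
        # One type per address: a location cannot be described twice.
        raise StoreTypeError(f"{loc} is already described by the store type")
    return {**store, loc: ("Top",) * size}


def assign(store, loc, field, ty):
    """Type `x.field := v` for x : ptr(loc) and v : ty (fields are 1-based)."""
    _require(store, loc, field)
    fields = list(store[loc])
    fields[field - 1] = ty  # the field's type changes at this program point
    return {**store, loc: tuple(fields)}


def read(store, loc, field):
    """Type `x.field` for x : ptr(loc); returns the field's current type."""
    _require(store, loc, field)
    return store[loc][field - 1]


def free(store, loc):
    """Type `free(x)` for x : ptr(loc): the location leaves the store type."""
    _require(store, loc)
    return {k: v for k, v in store.items() if k != loc}


def demo():
    env = {"x": "l1"}                           # x : ptr(l1)
    store = new({}, "l1", 2)                    # {l1 -> <Top,Top>}
    store = assign(store, env["x"], 1, "int")   # {l1 -> <int,Top>}
    store = assign(store, env["x"], 2, "char")  # {l1 -> <int,char>}
    env["z"] = env["x"]                         # let z = x: z : ptr(l1) as well
    store = free(store, env["z"])               # store type is now empty
    try:
        read(store, env["x"], 1)                # let w = x.1
    except StoreTypeError as err:
        return store, str(err)
    return store, None
```

In this model, aliases are simply variables that carry the same location name, so freeing through z removes the only description of l1 and the later read through x is rejected. The sketch covers concrete locations only; location and store polymorphism, existential types and recursive types are left out.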