Lecture 22: Shameless Self-Promotion

From: bnelson@netcom.com (Bob Nelson)
Subject: Re: NT vs. Linux
Date: Fri, 5 Jul 1996 05:11:22 GMT
Newsgroups: comp.os.linux.advocacy, comp.sys.ibm.pc.hardware, comp.os.ms-windows.win95.misc, comp.os.ms-windows.nt.misc, alt.flame, alt.fan.billgates, alt.destroy.microsoft

Toni Anzlovar (toni.anzlovar@kiss.uni-lj.si) wrote:
> Why does everybody want to RUN WORD? Why does nobody want to write and edit
> text?

Simple. A *tremendous* number of documents are written using Microsoft Word. One that is particularly ironic is the guide to LCLint -- a very popular lint tool -- often the lint of choice in the linux world.

CS655: Programming Languages
David Evans
University of Virginia Computer Science
http://www.cs.virginia.edu/~evans

Menu
• Garbage Collection
• Theory of Type Qualifiers
  – PS3, Question 5
• LCLint

27 July 2016, University of Virginia CS 655

Static Storage Allocation
• All storage allocated at compile time
• Advantages:
  – Fast
  – Safe (cannot run out of memory)
• Disadvantages:
  – Limited expressiveness
    • No recursion, no dynamic structures
  – Inefficient (sizes must be known at compile time)
• FORTRAN

Stack Allocation
• Activation records
  – Storage allocated on procedure entrance, deallocated on procedure exit
• Advantages over static allocation:
  – Supports recursion
  – Sizes of local structures may vary
• But: storage lifetimes fixed to procedures
• Algol60

Heap Allocation
• Dynamically allocate storage
• Advantages
  – Storage size and lifetime controlled by programmer
• Disadvantages
  – Storage size and lifetime controlled by programmer
  – Heap fills up with garbage

What is Garbage?
• Allocated memory that will never be used again
• Conservative predictions:
  – Java, CLU, LISP, ML
    • Objects that are not reachable
  – C/C++
    • Reachability is much harder because of pointer arithmetic, casting
  – Linda
    • No way to tell

Reference Counting
• Every allocated object has an associated reference counter, rc
• Creating a new object, rc = 0
• Assignment (creating a reference), rc++
• Losing a reference: rc--; if rc == 0 free object
• Advantages:
  – Overhead distributed
• Disadvantages:
  – High overhead on assignments, block exits
  – Can't reclaim cyclic structures

Cyclic structures
[Figure: x refers to a List whose nodes are linked through their next fields; one node is labelled with count 2, the others with count 1.]
    x := new List ();
Cannot be reclaimed!

Reachability
• "Root" is reachable
• Object is reachable, if there is a reachable reference to it
• Roots:
  – CLU
    • Every object reference on the stack, own variables
  – Java
    • Every object reference on the stack, static (global) references

Mark and Sweep
[Two figure slides: object graph with a root.]

Mark and Sweep
• Stop everything
• Mark objects reachable from roots
  – Just follow all references recursively
• Reclaim everything that isn't marked
• Disadvantages
  – Long pauses (what gave GC a bad name)
• Advantages
  – Simple, no overhead except when GC'ing

Stop and Copy
• Divide storage into two spaces
• Stop everything
• Start from roots, copy all reachable objects to new space (switching references to point to new space as you go)
• Advantages:
  – Improves locality: better cache behavior
• Disadvantages:
  – Have to waste memory (need new space to copy into)
  – Changes references (okay if language has address transparency)

Garbage Collecting in C/C++
• Problem: what is reachable?
• Approach 1:
  – Keep table of malloc'ed objects
  – Assume all values that look like pointers (have value in address range) are pointers, and make all pointer-like values on stack, in registers, in static storage the roots
  – Any object not reachable from the roots is garbage

Test Program 1
    #include <stdlib.h>

    char *evil () {
        char *s = malloc (1000);
        long int adr1 = (long int) s & 0xFFFF0000;  /* high bits of the address */
        long int adr2 = (long int) s & 0x0000FFFF;  /* low bits of the address */
        s = malloc (1000);
        /* GC happens here: no value that looks like a pointer to the first block remains */
        s = (char *) (adr1 | adr2);
        return s;
    }

Test Fragment 2
    char *s = ...; // read a new string
    int len = 0;
    while (*s != '\0') {
        *s = tolower (*s);
        s++;
        len++;
    }
    s = s - len; // point back to string start

Boehm [PLDI 96]
• ANSI limits pointer arithmetic to within an allocated object
• Source code transformations to explicitly mark live references (put them in GC roots)
• Check source code for casts from nonpointer to pointer

GC Summary
• After 40 years, still an active research area
  – PLDI 2000: 3 GC papers (of 31)
    • Concurrent; generational; dealing with contaminated storage
• Performance penalty can be low or negative (improved cache behavior)

Alternatives to GC
• Never reclaim storage
  – Works fine for most PC applications, but not for embedded systems
  – Loses locality: the real reason to GC most programs
• Manually reclaim storage
  – Buggy, dangerous and time-consuming
  – Support manual reclamation with static checking

Type Qualifiers
    $q\ \tau \le \tau$    negative (narrowing) qualifier
    $\tau \le q\ \tau$    positive (widening) qualifier
Is unsigned (in C) a qualifier?
No, neither unsigned int $\le$ int nor int $\le$ unsigned int is true.

Subtyping Rule
$$\frac{Q \le Q' \qquad \tau_i = \tau'_i \quad \forall i \in [1..n]}{Q\ c(\tau_1, \ldots, \tau_n) \le Q'\ c(\tau'_1, \ldots, \tau'_n)} \quad \text{[type-constructor]}$$
Why not $\tau_i \le \tau'_i$?

PS3, Question 5: Subtyping
$$\frac{S \le T}{\texttt{array}[S] \le \texttt{array}[T]} \quad \text{[monotonic-arrays]}$$
$$\frac{S = T}{\texttt{array}[S] \le \texttt{array}[T]} \quad \text{[specialization of type-constructor]}$$

PS3, Question 5
• Goal: call Scrunch (p) with a p that allows attacker to violate type safety
• [monotonic-arrays] means typeof(p) must be $\le$ array[String], so we can have typeof(p) = array[EvilString] where EvilString $\le$ String
• Method overriding allows attacker to define EvilString.concat $\le$ String.concat

Overriding String.concat
$$\frac{P_1 \le Q_1,\ \ldots,\ P_n \le Q_n, \quad S \le T}{\texttt{proc}\ (P_1, \ldots, P_n)\ \texttt{returns}\ (S) \le \texttt{proc}\ (Q_1, \ldots, Q_n)\ \texttt{returns}\ (T)} \quad \text{[monotonic-procs]}$$
So,
    String EvilString.concat (EvilString s) $\le$ String String.concat (String s)

Implementing EvilString
    class EvilString extends String {
        private int hackPointer;
        String concat (EvilString s) {
            this.hackPointer = // forge an address
            return super.concat (s);
        }
        ...
    }
Can attacker make Scrunch call EvilString.concat (s) where s $\not\le$ EvilString?

No!
Scrunch calls s.concat after initializing,
    String s = "";
    ...
    s = s.concat ((String) ar[i]);
    ...
so, EvilString.concat is never called! Attacker loses.
If the initialization were replaced with,
    String s = a[0];
then Scrunch would call EvilString.concat(EvilString) if a[0] is EvilString, and a[1] is String $\not\le$ EvilString.

Real Java
• Has the [monotonic-arrays] rule!
    String [] s;
    SubString [] t;
    ...
    s = t;
    s[0] = new String ("test");
    SubString tt = t[0];
• The assignment to s[0] produces an ArrayStoreException

const
• const is a positive qualifier: $\tau \le \texttt{const}\ \tau$
  – $\tau$ values can be cast to const $\tau$
  – const $\tau$ values cannot be cast to $\tau$ (Note: ANSI C allows the cast, but writing through the result is undefined behavior when the object was defined const)
• const qualified values can be initialized but not updated
• From the standard library (string.h):
    char *strcpy (char *, const char *);

Assign
$$\frac{A \vdash e_1 : \texttt{ref}(\tau_2) \qquad A \vdash e_2 : \tau_2}{A \vdash e_1 := e_2 : \texttt{unit}} \quad \text{[assign]}$$
$$\frac{A \vdash e_1 : \lnot\texttt{const}\ \texttt{ref}(\tau_2) \qquad A \vdash e_2 : \tau_2}{A \vdash e_1 := e_2 : \texttt{unit}} \quad \text{[assign']}$$

Call
$$\frac{A \vdash e_1 : \tau_2 \to \tau \qquad A \vdash e_2 : \tau_2}{A \vdash e_1\ (e_2) : \tau} \quad \text{[call]}$$
• No changes necessary: $\tau \le \texttt{const}\ \tau$
• No $\texttt{const}\ \tau \le \tau$ rule means $\texttt{const}\ \tau \not\le \tau$
• So, existing [call] rule disallows passing const $\tau$ as a $\tau$ parameter.

LCLint Approach
• Programmers add annotations (formal specifications)
  – Simple and precise
  – Describe programmer's intent:
    • Types, memory management, data hiding, aliasing, modification, nullness, etc. (project group 3: buffer overflows)
• LCLint detects inconsistencies between annotations and code
  – Simple (fast!)
    dataflow analyses

Sample Annotation: only
    extern only char *gptr;
    extern only out null void *malloc (int);
• Reference (return value) owns storage
• No other persistent (non-local) references to it
• Implies obligation to transfer ownership
• Transfer ownership by:
  – Assigning it to an external only reference
  – Returning it as an only result
  – Passing it as an only parameter: e.g.,
        extern void free (only void *);

Example
    extern only out null void *malloc (int);    (in library)

    1 int dummy (void) {
    2 int *ip= (int *) malloc (sizeof (int));
    3 *ip = 3;
    4 return *ip;
    5 }

LCLint output:
    dummy.c:3:4: Dereference of possibly null pointer ip: *ip
       dummy.c:2:13: Storage ip may become null
    dummy.c:4:14: Fresh storage ip not released before return
       dummy.c:2:43: Fresh storage ip allocated

only
• Try: only is a negative qualifier: $\texttt{only}\ \tau \le \tau$
• From stdlib:
    only void *malloc (size_t);
    void free (only void *);
• Does call rule work? (only pass onlys as onlys)
• But, after call state is changed:
    only char *x;
    ...
    free (x);
    free (x);

Operational Semantics
• After passing as only, becomes dead.
• Configuration: < Instructions, PC, Store >
  $\mathit{Store}: \mathit{loc} \to \langle \mathit{value},\ \mathit{state} \in \{ \mathit{only}, \mathit{dead}, \ldots \} \rangle$

Pass as Only
$$\frac{\mathit{Instructions}[PC] = f\,(e) \quad\land\quad \mathit{Store}(f) = \langle v_f, \mathit{only} \rangle \quad\land\quad \mathit{Store}(e) = \langle v_e, \mathit{only} \rangle}{PC = PC + 1; \quad \mathit{Store}' = \mathit{Store}[e \mapsto \langle\ \ ,\ \mathit{dead} \rangle]}$$

Assign Only
$$\frac{\mathit{Instructions}[PC] = l := r \quad\land\quad \mathit{Store}(l) = \langle v_l, \mathit{only} \rangle \quad\land\quad \mathit{Store}(r) = \langle v_r, \mathit{only} \rangle}{PC = PC + 1; \quad \mathit{Store}' = \mathit{Store}[l \mapsto \langle v_r, \mathit{only} \rangle][r \mapsto \langle\ \ ,\ \mathit{dead} \rangle]}$$

Challenge Problems Still Remaining
• How do you handle declarations?
• How do you handle block exits?
• Would denotational semantics work better?
• What about a combination of static and operational?
• How do you handle other annotations consistently?

Summary
• Theory of Type Qualifiers uses:
  – Static Semantics (Typing Judgments)
  – Subtyping Rules
  – Type Polymorphism
  – Type Inference
  – Lambda Calculus
  – Operational Semantics
• If you understand everything in this paper, you know 75% of what you need for the final.

Charge
• Next time:
  – Wacky Programming Paradigms
  – Guidelines for Rotunda Presentations
  – Signup for Final Timeslots
• Project Final Reports due Friday
  – All team members should read complete drafts of your report