The Bugs and the Bees
Research in Programming Languages and Security

David Evans
evans@cs.virginia.edu
http://www.cs.virginia.edu/~evans
University of Virginia
Department of Computer Science

Background
• Joined UVA, November 1999
• BS/MS ’94 and PhD 2000 from MIT
• Funding for three new students (but will probably only accept 1 or 2)
• Courses
  – Security (CS551)
  – Grad. Programming Languages (CS655)

Menu
• The Bugs
  – LCLint: How do we help good people write better programs?
  – How do we prevent bad programs from doing bad things?
• The Bees: “Programming the Swarm”
  – How can we program large collections of devices?

A Gross Oversimplification
[Figure: bugs detected (none to all) versus effort required (low to unfathomable); labeled points: Formal Verifiers, Compilers]

Requirements
• No interaction required
  – As easy to use as a compiler
• Fast checking
  – As fast as a compiler
• Gradual learning/effort curve
  – Little needed to start
  – Clear payoff relative to user effort

Approach
• Programmers add annotations (formal specifications)
  – Simple and precise
  – Describe programmer’s intent:
    • Types, memory management, data hiding, aliasing, modification, null-ity, etc.
• LCLint detects inconsistencies between annotations and code
  – Simple (fast!)
    dataflow analyses

Sample Annotation: only
    extern only char *gptr;
    extern only out null void *malloc (int);
• Reference (return value) owns storage
• No other persistent (non-local) references to it
• Implies obligation to transfer ownership
• Transfer ownership by:
  – Assigning it to an external only reference
  – Returning it as an only result
  – Passing it as an only parameter: e.g., extern void free (only void *);

Example
In library:
    extern only null void *malloc (int);

    1 int dummy (void) {
    2   int *ip= (int *) malloc (sizeof (int));
    3   *ip = 3;
    4   return *ip;
    5 }

LCLint output:
    dummy.c:3:4: Dereference of possibly null pointer ip: *ip
       dummy.c:2:13: Storage ip may become null
    dummy.c:4:14: Fresh storage ip not released before return
       dummy.c:2:43: Fresh storage ip allocated

LCLint Status
• Public distribution since 1993
• Effective checking of >100K line programs (checks about 1K lines per second)
  – Detects lots of real bugs in real programs (including itself, of course)
  – Thousands of users, Linux Journal, etc.
• Checks include type abstractions, modifications, globals, memory leaks, dead storage, naming conventions, undefined behavior, incomplete definition...

Where do we go from here?
• Extensible Checking
  – Allow users to define new annotations and associated checking
• Integrate run-time checking
  – Combine static and run-time checking to enable additional checking and completeness guarantees
• Generalize framework
  – Support static checking for multiple source languages in a principled way

LCLint
• More information: lclint.cs.virginia.edu
  PATV 2000, PLDI ’96, FSE ’94
• Students: David Larochelle, Chris Barker, Vic Ludwig
• Current funding: NASA (joint with John Knight)
• Previous funding: DARPA, NSF, ONR, DEC

[Figure labels: Untrusted Program, Safe Program]

Naccio Motivation
• Weaknesses in existing code safety systems:
  – Limited range of policies
  – Policy definition is ad hoc and platform dependent
• Enforcement is tied to a particular architecture
• Can we solve them without sacrificing efficiency or convenience? Yes!

Naccio Overview
• General method for defining policies
  – Abstract resources
  – Platform independent
• System architecture for enforcing policies
  – Prototypes for JavaVM classes, Win32 executables
[Figure labels: Program, Safety Policy, Safe Program]

Problem
[Figure: User’s View and System View. Labels: Program, Policy, Platform Interface, WriteFile (fHandle, …), tar cf *, System Library, OS Kernel, Resources, Files, Disk]

Safety Policy Definition
• Resource descriptions: abstract operational descriptions of resources (files, network, threads, display, …)
• Platform interface: mapping between system events (e.g., Java API calls, Win32 API calls) and abstract resources
• Resource use policy: constraints on manipulating those resources

Naccio Architecture
[Figure: stages grouped as “Per policy” and “Per application”. Labels: Safety policy definition, Policy compiler, Policy-enforcing system library, Program, Policy description file, Application transformer]
Version of program that:
• Uses policy-enforcing system library
• Satisfies low-level code safety
Current platforms:
  JavaVM – program is a collection of Java classes
  Win32 – program is a Win32 executable and DLLs

Open Issues
• Low-Level Code Safety for Win32
  – How can you prevent a malicious programmer
    from tampering with checking code?
• Policy Development
  – What is the correct policy for different environments?
• User Interface
  – How can you present policy violations to naive users in a sensible way?

Naccio Summary
• Method for defining a large class of policies
  – Using abstract resources
• General architecture for code safety
• Encouraging results so far
  – Win32 (Andrew Twyman, MIT MEng ’99): need to implement low-level safety
  – JavaVM: believed to be secure
For more information: http://naccio.cs.virginia.edu
IEEE Security & Privacy ’99, my PhD thesis

Programming the Swarm

Really Brief History of Computer Science
1950s: Programming in the small...
  Programmable computers
  Learned that programming is hard
  Birth of higher-order languages
  Tools for reasoning about trivial programs
1970s: Programming in the large...
  Abstraction, objects
  Methodologies for development
  Tools for reasoning about component-based systems
2000s: Programming in the Swarm!

Programming the Swarm: Long-Range Goal
[Figure labels: Cement, 10 GFlop]

What’s Changing
• Execution platforms
  – Not computers (98% of microprocessors sold this year)
  – Small and cheap
• Execution environment
  – Interact with physical world
  – Unpredictable, dynamic
• Programs
  – Old style of programming won’t work
  – Is there a new paradigm?

Swarm Programming
• Primitives describe group behaviors
  – What are the primitives?
  – How are they specified?
• Important to understand both functional (how the state changes) and non-functional (power use, robustness, efficiency, etc.) properties
• Construct complex behaviors by composing primitives
  – Predict behavior of result
  – Pick the right primitives based on description of desired non-functional properties

Group Behavior
[Figure: axes “smarter behavior” and “smarter devices”; labeled points: Bee house-hunting, Concrete Bridge, Manchester United, Jazz Octet, CS IM Team, Ant Routing, Disperse (demo), Canada Geese, Stadium Wave]

Finding an Advisor
• Don’t rely on the matching process
  – This is the last resort!
• Find someone with whom you can work well
  – Can’t tell this from one breakout session
  – Meet at least twice with potential advisors before matching forms

Summary – Project Decision Tree
• People are basically good?
  – Yes: LCLint (Annotation-Assisted Lightweight Static Checking)
  – No: People are basically evil?
    • Yes: Policy-Directed Code Safety
    • No: People are basically crazy?
      – Yes: Programming the Swarm
      – No: ?
Sign up for meeting times (form going around).
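
Backup: LCLint example, repaired
As a follow-up to the LCLint example (dummy.c), here is a minimal sketch of how the two reported problems might be addressed: check the possibly null result of malloc before dereferencing it, and release the fresh storage before returning. The -1 error value and the helper names make_int and demo are assumptions made for illustration and are not from the original slides. The annotations are written in the stylized-comment form (/*@only@*/, /*@null@*/) used in C source files; the slides show them in shorthand.

```c
#include <assert.h>
#include <stdlib.h>

/* Repaired version of the slide's dummy(): the null check addresses the
   "possibly null pointer" warning, and free() addresses the "fresh
   storage not released" warning. */
int dummy(void)
{
    int result;
    int *ip = (int *) malloc(sizeof(int));

    if (ip == NULL) {
        return -1;  /* assumed error value; not in the original example */
    }
    *ip = 3;
    result = *ip;
    free(ip);       /* release the storage this function owns */
    return result;
}

/* Hypothetical helper: instead of freeing, transfer ownership to the
   caller by returning the storage as an only (and possibly null) result. */
/*@only@*/ /*@null@*/ int *make_int(int value)
{
    int *p = (int *) malloc(sizeof(int));

    if (p != NULL) {
        *p = value;
    }
    return p;
}

/* Usage sketch: the caller now holds the only reference, so it must
   check for null and eventually pass the storage to free(). */
void demo(void)
{
    int *p = make_int(7);

    if (p != NULL) {
        assert(*p == 7);
        free(p);
    }
}
```

The two functions illustrate two ways to meet the obligation that comes with only storage: dummy releases it locally, while make_int hands it on as an only result.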