Design and Implementation of Object Oriented Dynamic Programming Languages
Jonathan Bachrach
Greg Sullivan
Kostas Arkoudous

Who am I?
• Jonathan Bachrach
• PhD in CS: neural networks applied to robotic control
• Experience in real-time high performance electronic music
• Involved in design of Sather and Dylan
• Five years experience writing Dylan’s runtime and compiler at Harlequin

The Seminar
• 6.894, 26-311, 1-2.30, 3-0-9, 3 EDP’s
• Theme
• Proto running example
• Lectures
• Readings + Presentations
• Panels
• Assignments
• Projects

Today
• Taste of the Seminar
• Evolutionary Programming
• Proto
• Language Design in the New Millennium
• Language Implementation Techniques
• Seminar Details

Tomorrow
• Motivation and overview of OODL
• Implementation perspective
• Greg’s favorite topics

Language Renaissance
• Perl
• Python
• MetaHTML
• PHP
• TCL
• Cecil
• C#
• Java
• JavaScript
• Sather
• Rebol
• Curl
• Dylan
• Isis
• Limbo
• …

The Stage
• Increasing demands for quick time to market
• Moore’s Law for hardware, but is software keeping up?
• Most time spent after initial deployment
• Complex environments
  – Distributed
  – Agents
  – Real world
  – Continuous, non-deterministic, dynamic inputs

Conventional Programming
• Assume user knows exactly what they want before programming
• Assume that specifications don’t change over time
• Problem: Forces premature decisions and implementation effort
• Example: Java class-centric methods

Evolutionary Programming
• Need to play with prototypes before knowing what you really want
  – Support rapid prototyping
  – Smooth transition to delivery
• Requirements change all the time
  – Late binding allows developers and programs alike to update behavior

Language Design: User Oriented Goals
• Perlis: “A language that doesn't affect the way you think about programming, is not worth knowing.”
• Which user-oriented goals do you think would lead to the best language?

Proto
• Goals
• Examples
• Relation
• Definition
• State

Proto Hello World
  (format out "hello world")

Proto Goals
• Simple
• Productive
• Powerful
• Extensible
• Dynamic
• Efficient
• Real-time
• Teaching and research vehicle
• Electronic music is the domain that keeps it honest

Proto Ancestors
• Language design is difficult
  – Leverage proven ideas
  – Make progress in selective directions
• Ancestors
  – Scheme
  – Cecil
  – Dylan

Proto <=> Scheme
Proto:
• Concise naming
• Procedural macros
• Objects all the way
Scheme:
• Long-winded naming
• Rewrite rule only
• Only records

Proto <=> Cecil
Proto:
• Prefix syntax
• Scheme inspired special forms
Cecil:
• Infix syntax
• Smalltalk inspired special forms

Proto <=> Dylan
Proto:
• Prefix syntax
• Prototype-based
• Procedural macros
• Rationalized collection protocol / hierarchy
• Always open
• Predicate types
Dylan:
• Infix syntax
• Class-based
• Rewrite-rule only …
• Conflated collection protocol / hierarchy
• Sealing
• Fixed set of types

Object Orientation
• Assume you know OO basics
• Motivations:
  – Abstraction
  – Reuse
  – Extensibility

Prototype-based OO
• Simplified object model
• No classes
• Cloning is the basic operation for both instance and prototype creation
• Prototypes are special objects that serve as classes
• Inheritance follows cloned-from relation

Proto: OO & MM
  (dv <point> (isa <any>))
  (slot <point> (point-x <int>) 0)
  (slot <point> (point-y <int>) 0)
  (dv p1 (isa <point>))
  (dm + ((p1 <point>) (p2 <point>) => <point>)
    (isa <point>
      (set point-x (+ (point-x p1) (point-x p2)))
      (set point-y (+ (point-y p1) (point-y p2)))))

Language Design: User Goals -- The “ilities”
• Learnability
• Understandability
• Writability
• Modifiability
• Runnability
• Interoperability

Learnability
• Simple
• Small
• Regular
• Gentle learning curve
• Perlis: “Symmetry is a complexity reducing concept…; seek it everywhere.”

Proto: Learnability
• Simple and Small:
  – 16 special forms: if, seq, set, fun, let, loc, lab, fin, dv, dm, dg, isa, slot, ds, ct, quote
  – 7 macros: try, rep, mif, and, or, select, case
• Gentle Learning Curve:
  – Graceful transition from functional to object-oriented programming
  – Perlis: “Purely applicative languages are poorly applicable.”

Proto: Special Forms
  IF         (IF ,test ,then ,else)
  SEQ        (SEQ ,@forms)
  SET        (SET ,name ,form) | (SET (,name ,@args) ,form)
  LET        (LET ((,var ,init) …) ,@body)
  FUN        (FUN ,sig ,@body)
  LOC        (LOC ((,name ,sig ,@body) …) ,@body)
  LAB        (LAB ,name ,@body)
  FIN        (FIN ,protected-form ,@cleanup-forms)
  DV         (DV ,var ,form)
  DM         (DM ,name ,sig ,@body)
  DG         (DG ,name ,sig)
  ISA        (ISA (,@parents) ,@slot-inits)
  SLOT       (SLOT ,owner ,var ,init)
  sig        (,@vars) | (,@vars => ,var)
  var        ,name | (,name ,type)
  slot-init  (SET ,name ,value)

Understandability
• Natural notation
• Simple to predict behavior
• Modular
• Models application domain
• Concise

Proto: Understandability
• Describable by a small interpreter
  – Size of interpreter is a measure of complexity of language
• Regular syntax
  – Debatable whether prefix is natural, but it’s simple, regular and easy to implement

Writability
• Expressive features and abstraction mechanisms
• Concise notation
• Domain-specific features and support
• No error-prone features
• Internal correctness checks (e.g., typechecking) to avoid errors

Tradeoff One: Abstraction <=> Writability
• Abstraction can obscure code
  – Example: abuse of macros
• Concision can obscure code
  – Example: APL

Tradeoff Two: Domain Specific <=> Simplicity
• Challenge is to introduce simple domain specific features that don’t hair up the language
  – CL has reader macros for introducing new token types

Proto: Error Proneness
• No out of language errors
  – At worst, all errors will be caught in language at runtime
  – At best, potential errors such as “no applicable methods” will be caught statically, earlier and in batch
• Unbiased dispatching and inheritance
  – Example: Method selection is not biased by argument order, unlike the lexicographic ordering in CLOS

Design Principle Two: Planned Serendipity
• Serendipity:
  – M-W: the faculty or phenomenon of finding valuable or agreeable things not sought for
• Orthogonality
  – Collection of few independent powerful features combinable without restriction
• Consistency

Proto: Serendipity
• Objects all the way down
• Slots accessed only through calls to generics
• Simple orthogonal special forms
• Expression oriented
• Example:
  – Exception handling can be built out of a few special forms: lab, fin, loc, …

Modifiability
• Minimal redundancy
• Hooks for extensibility included automatically
• No features that make it hard to change code later

Proto: Extensible Syntax
• Syntactic Abstraction
• Procedural macros
• WYSIWYG
  – Pattern matching
  – Code generation
• Example:
  (ds (unless ,test ,@body)
    `(if (not ,test) (seq ,@body)))

Proto: Multimethods
• Can add methods outside original class definition:
  – (dm jb-print ((x <node>)) …)
  – (dm jb-print ((x <str>)) …)

Proto: Generic Accessors
• All slot access goes through generic function calls
• Can easily redefine these generics without affecting client code

Runnability
• Features for programmers to control efficiency
• Analyzable by compilers and other tools
• Note: this will be a running theme!

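The multimethod and unbiased-dispatch slides above can be illustrated outside Proto. What follows is a minimal Python sketch, not Proto's implementation: the `Generic` class, its `add_method` decorator, and the `Node` class are hypothetical names invented for this example. It shows methods being attached to a generic function outside any class definition (including for the built-in `str`), and it picks a method only when that method is at least as specific as every other applicable one in every argument position, so no argument is favored over another.

```python
class Generic:
    """A generic function: a named collection of methods chosen by argument types."""

    def __init__(self, name):
        self.name = name
        self.methods = []  # list of (specializers, function) pairs

    def add_method(self, *specializers):
        """Decorator that registers a method; callable from anywhere, at any time."""
        def register(fn):
            self.methods.append((specializers, fn))
            return fn
        return register

    def __call__(self, *args):
        # Step 1: find the methods whose specializers accept all arguments.
        applicable = [
            (specs, fn) for specs, fn in self.methods
            if len(specs) == len(args)
            and all(isinstance(a, s) for a, s in zip(args, specs))
        ]
        if not applicable:
            raise TypeError(f"no applicable method for {self.name}")
        # Step 2: a method wins only if it is at least as specific as every
        # other applicable method in every argument position (no ordering bias).
        for specs, fn in applicable:
            if all(
                all(issubclass(s, o) for s, o in zip(specs, other))
                for other, _ in applicable
            ):
                return fn(*args)
        raise TypeError(f"ambiguous methods for {self.name}")


class Node:  # hypothetical stand-in for <node>
    def __init__(self, label):
        self.label = label


jb_print = Generic("jb-print")


@jb_print.add_method(Node)
def _(x):
    return f"<node {x.label}>"


@jb_print.add_method(str)  # a method on a built-in type, added from outside it
def _(x):
    return f'"{x}"'
```

Calling `jb_print(Node("a"))` selects the first method and `jb_print("hi")` the second. This sketch does the naive find-and-compare work on every call, which is the cost the implementation slides later set out to remove.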
Tradeoff Three: Runnability <=> Simplicity
• Much of a language design can be dedicated to providing mechanisms to control efficiency (e.g., sealing in Dylan)
• Efficiency mechanisms can obscure algorithms
• Perlis: “A programming language is low level when its programs require attention to the irrelevant.”

Proto: Optional Types
• All bindings and parameters can take optional types
• Rapid prototype without types
• Add types for documentation and efficiency
• Example:
  (dm format (s msg (args …)) …)
  (dm format ((s <stream>) (msg <str>) (args …)) …)

Proto: Pay as You Go
• Don’t charge for features not used
• Pay more for features used in more complicated ways
• Examples:
  – Dispatch
    • Just a function call if the method is unambiguous from argument types
    • Otherwise requires dynamic method lookup
  – Proto’s bind-exit, called “lab”
    • Local exits are set + goto
    • Non-local exits must create a frame and stack-allocate an exit closure

Interoperability
• Portability
• Foreign Language Interface
• Directly or indirectly after mindshare

The Rub
• Support for evolutionary programming creates a serious challenge for implementors
• Straightforward implementations would exact a tremendous performance penalty

The Game
• Every Problem in CS can be Solved with another Level of Indirection -- ancient proverb
• Every Optimization Problem can be Viewed as Removing a Level of Indirection -- jb
• Example: Procedures <=> Inlining

Implementation Techniques
• System code techniques:
  – Often called runtime optimizations
  – Turbo charges user code
  – Example: caching
• User code rewriting techniques:
  – Often called compile-time optimizations
  – Example: inlining

Implementing Prototypes
• Problem
  – No classes, only objects
  – But objects need descriptions of slots and state
  – Would be too expensive for each object to have a copy of these descriptions
• A solution:
  – Create classes called traits on demand
    • When you are cloned or
    • When slots are added to you
  – Objects contain their values and share traits through a traits pointer

Proto: Traits on Demand
[Figure: the <point> prototype and its clones p1 and p2; each object holds its own x and y slot values and refers to shared traits.]

Implementing Generic Dispatch
• Multimethod dispatch
  – Considers all arguments
  – Orders methods based on specializer type specificity
• Naïve description:
  – Find applicable methods
  – Sort applicable methods
  – Call most specific method
• Problem: too expensive

Dispatch Technique One: Dispatch Cache
• Hierarchical cache
  – Each level discriminates one argument using an associative mechanism
  – Keys are concrete object-traits
  – Values are either
    • Cache for remaining arguments or
    • Method to call with given arguments
• Problems
  – Too many indirections:
    • generic call + method call
    • key and value lookups

Engine Node Dispatch
• Glenn Burke and myself at Harlequin, Inc. circa 1996
  – Partial Dispatch: Optimizing Dynamically-Dispatched Multimethod Calls with Compile-Time Types and Runtime Feedback, 1998
• Shared decision tree built out of executable engine nodes
• Incrementally grows trees on demand upon miss
• Engine nodes are executed to perform some action, typically tail calling another engine node and eventually tail calling the chosen method
• Appropriate engine nodes can be used to handle monomorphic, polymorphic, and megamorphic discrimination cases, corresponding to single, linear, and table lookup

Engine Node Dispatch Picture
  define method \+ (x :: <i>, y :: <i>) … end;
  define method \+ (x :: <f>, y :: <f>) … end;
Seen (<i>, <i>) and (<f>, <f>) as inputs.
[Figure: engine nodes for \+ after these two calls: a <generic> whose discriminator is a <linear-engine> with <i> and <f> entries, <mono-engine> nodes, and the \+ <i>,<i> and \+ <f>,<f> <method> objects with their entry points (MEP).]

Dispatch Technique Two: Decision Tree
• Intelligently number traits with ids
  – Children tend to be in a contiguous id range
• Label all traits according to most applicable method
• Build the decision tree over the largest class id ranges obtained by merging
• Implement with simple numeric operations and JIT code generation
• Problem:
  – Generic call + method call
  – Lost inlining opportunities

Class Numbering
[Figure: a class hierarchy of <object>, <a>, <b>, <c>, <d>, and <e>, numbered 0 through 5.]

Binary Search Tree Picture
• From Chambers and Chen, OOPSLA-99

Code Rewriting Technique One: Inlining Dispatch
• Inline when number of methods is small
• When methods themselves are small
• Example: List operations
• Twist: Partially inline dispatch when there is a spike in the method frequencies
• Example: Numeric operations
• Problem: Lose downstream type inference possibilities because the result of the call gets merged back into all other possibilities

Inlining Dispatch Example One
  (dm empty? ((x <lst>)) #f)
  (dm empty? ((x nil)) #t)
  (dm null? ((x <lst>)) (empty? x))
  =>
  (dm null? ((x <lst>))
    (if (== x nil)
        #t
        (if (isa? x <lst>) #f (error …))))

Inlining Dispatch Example Two
  (dm sum+1 (x y) (+ (+ x y) 1))
  =>
  (dm sum+1 (x y)
    (let ((t (if (and (isa? x <int>) (isa? y <int>))
                 (i+ x y)
                 (+ x y))))
      (if (isa? t <int>) (i+ t 1) (+ t 1))))

Code Rewriting Technique Two: Code Splitting
• Clone code below typecase so each branch can run with tighter types
• Problem: potential code explosion

Code Splitting Example
  (dm sum+1 (x y) (+ (+ x y) 1))
  =>
  (dm sum+1 (x y)
    (if (and (isa? x <int>) (isa? y <int>))
        (i+ (i+ x y) 1)
        (+ (+ x y) 1)))

Summary
• Touched on evolutionary programming
• Introduced Proto
• Discussed language design
• Sketched out several implementation techniques to gain back performance

Today’s Reading List
• Dijkstra: Goto harmful
• Hoare: Hints on PL design
• Steele: Growing a Language
• Gabriel: Worse is Better
• Chambers: Synergies Between Object-Oriented Programming Language Design and Implementation Research
• Chambers: The Cecil Language
• Norvig: Adaptive Software
• Cook: Interfaces and Specifications for the Smalltalk-80 Collection Classes
• XP: www.extremeprogramming.com
• *Lee: Advanced Language Implementation
• *Queinnec: Lisp In Small Pieces

Open Problems
• Extensible enough language to stem need for domain specific languages
• Best of both worlds of performance and dynamism
• Simplest combination of language + implementation
• …

Credits
• Craig Chambers’ notes on language design for UW CSE-505

Course
• Seminar style format
• Encourage discussion
• Identify and make progress towards the solution of open problems
• Course mostly not about language design
  – Will use Proto as a point of discussion
  – Will talk about language design implications along the way

Prerequisites
• 6.001 (SICP)
• 6.035 (Computer Language Engineering)
• 6.821 (Programming Languages) preferred
• Or permission of instructor

Lectures
• Intro I & II
• Interpreters and VMs
• Objects and Types
• Runtime Object Systems
• Reflection
• Memory Management
• Macros
• Type Inference
• OO Optimizations
• Partial Evaluation
• Dynamic Compilation
• Proof-Based Compilation
• Practical Aspects

Presentation Talking Points
• Enumerate design parameters
• Identify open problems and make progress towards their solution
• Identify and understand tradeoffs
• Note language design implications

Paper Presentations
• Each student will present one or two papers judged important to the course
• Each presentation will be about a half hour
• A reading list is available on the course web site
• Other research can be presented at instructor’s discretion
• Three slots in the following categories:
  – Language Design, Interpreters & VMs
  – Types and Object Systems, Reflection, and Memory Management
  – Macros, Type Inference, and OO Optimizations, Partial Evaluation and Dynamic Compilation

Panels
• Local experts
• System, Compiler, Language Design
• Positions
• Open Problems
• Secret Tricks Revealed
• Q & A

Assignments
• Small one week projects in Proto such as:
  – Metacircular interpreter
  – Simple VM
  – Fast isa?
  – Fast method dispatch
  – Slot layout
  – Simple GC
  – Type inferencer
  – Specialization
  – Feedback guided dispatch
  – Profile guided inlining

Project
• Substantial project involving original language design and/or implementation
• Produce a design and project plan
• Working implementation
• Present an overview of the project
• 10+ page write-up

Grades
• Will be based on the deliverables and class participation

First Assignment
• Survey
• Due by Thursday
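
As a possible starting point for the “Fast isa?” assignment, the class numbering idea from the decision tree slide can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the course's solution: it assumes single inheritance (so descendants always form one contiguous id range, rather than only tending to), and the function names and the example hierarchy are hypothetical.

```python
def number_classes(children, root):
    """Assign preorder ids so each class's descendants form one contiguous range.

    `children` maps a class name to the list of its direct subclasses.
    Returns {name: (low, high)}, where low is the class's own id and high is
    the largest id among the class and its descendants.
    """
    ranges = {}
    counter = 0

    def visit(name):
        nonlocal counter
        low = counter
        counter += 1
        for child in children.get(name, []):
            visit(child)
        ranges[name] = (low, counter - 1)

    visit(root)
    return ranges


def isa(ranges, cls, ancestor):
    """True if `cls` is `ancestor` or a descendant of it: two integer comparisons."""
    low, high = ranges[ancestor]
    return low <= ranges[cls][0] <= high


# Hypothetical hierarchy (assumed for illustration, not taken from the slide):
# <object> has children <a> and <b>; <a> has <c> and <d>; <b> has <e>.
hierarchy = {"<object>": ["<a>", "<b>"], "<a>": ["<c>", "<d>"], "<b>": ["<e>"]}
ranges = number_classes(hierarchy, "<object>")
```

With this numbering, `isa(ranges, "<c>", "<a>")` holds while `isa(ranges, "<e>", "<a>")` does not, and the same range test is what a dispatch decision tree would branch on.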