Compilation 2007
Domain-Specific Languages
Syntax Extensions
Michael I. Schwartzbach
BRICS, University of Aarhus

GPL Problem Solving
The General Purpose Language (GPL) approach:
• analyze the problem domain
• express the conceptual model as an OO design
• program a framework
Pros:
• predictable and familiar result
• (relatively) low cost of implementation
Cons:
• difficult to fully exploit domain-specific knowledge
• only available to general programmers

DSL Problem Solving
The DSL approach:
• analyze the problem domain
• express the conceptual model as a language design
• implement a compiler or interpreter
Pros:
• possible to exploit all domain-specific knowledge
• also available to domain experts
Cons:
• (relatively) high cost of implementation
• risk of Babylonian confusion

Variations of DSLs
A stand-alone DSL:
• a novel language with unique syntax and features
• example: LaTeX
An embedded DSL:
• an existing GPL extended with DSL features
• example: JSP
An external DSL:
• a stand-alone DSL invoked from a GPL
• example: SQL invoked from Java (JDBC)

From DSL to GPL
A stand-alone DSL may evolve into a GPL:
• Fortran = Formula Translation
• Algol = Algorithmic Language
• Cobol = Common Business Oriented Language
• Lisp = List Processing Language
• Simula = Simulation Language
A (successful) DSL design should plan for growth

Using Domain-Specific Knowledge
Domain-specific syntax:
• directly denote high-level concepts
Domain-specific analysis:
• consider global properties of the application
• domain-specific syntax clarifies the behavior
Domain-specific optimization:
• exploit domain-specific analysis results
GPL frameworks cannot provide these benefits

The Joos Peephole Language
A stand-alone DSL:
• no general-purpose computing is required
Domain concepts:
• bytecodes
• patterns
• templates
Implemented using:
• a parser
• a static checker
• an interpreter

DSL Syntax for Peepholes
    pattern dup_istore_pop x:
      x ~ dup istore (i0) pop
      -> 3 istore (i0)

GPL Syntax Alternative
    boolean dup_istore_pop(InstructionList x) {
      int i0;
      if (is_dup(x) && is_istore(x.next) && is_pop(x.next.next)) {
        i0 = (int)x.next.getArg();
        x = replace(x,3,new ArrayList().add(new Iistore(i0)));
        return true;
      }
      return false;
    }
Much harder to write correctly
Fixed implementation strategy

DSL Analysis for Peepholes
Formal type and scope rules:
    \frac{\vdash E : \mathrm{bytecodes}[\rho \to \rho'] \qquad \vdash P[\rho' \to \rho'']}{\vdash E \sim P : \mathrm{boolean}[\rho \to \rho'']}
    \frac{\vdash E : \mathrm{boolean}[\rho \to \rho']}{\vdash\ !E : \mathrm{boolean}[\rho \to \rho]}
    \frac{\vdash E_1 : \mathrm{boolean}[\rho \to \rho'] \qquad \vdash E_2 : \mathrm{boolean}[\rho' \to \rho'']}{\vdash E_1 \mathbin{\&\&} E_2 : \mathrm{boolean}[\rho \to \rho'']}
This is checked by a phase in the DSL interpreter

GPL Analysis Alternative
Lots of yellow Post-it notes
These cannot be checked by the Java compiler

The JWIG Language
An embedded DSL (in Java):
• lots of general-purpose computing is required
Domain concepts:
• XML templates
• Web services
• sessions
Implemented using:
• a syntax extension
• a static analysis
• a framework

DSL Syntax for JWIG
    public class test extends Service {
      String userid;
      public class Login extends Session {
        XML wrap = [[<html>
                       <body bgcolor="yellow">
                         <[contents]>
                       </body>
                     </html>]];
        public void main() {
          XML login = [[<form>
                          Userid: <input type="text" name="userid">
                          <input type="submit"/>
                        </form>]];
          show wrap<[contents = login];
          userid = receive userid;
          show wrap<[contents = "Welcome "+userid];
        }
      }
    }

GPL Syntax Alternative
    XML login = XML.make("<form>\nUserid: <input type=\"text\" name=\"userid\">\n<input type=\"submit\"/>\n</form>");
    show(wrap.plug("contents",login));
    userid = receive("userid");
The DSL syntax maps directly to method calls in an underlying Java framework
Avoiding escapes makes the syntax more legible
But this is just a thin layer of syntactic sugar
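To make the "thin layer of syntactic sugar" concrete, here is a minimal sketch in C++ of the kind of framework call that `wrap<[contents = login]` could map to. `XmlTemplate`, `plug`, and `str` are hypothetical names chosen for illustration and are not the JWIG framework's API. The sketch assumes a template is a plain string whose gaps are written `<[name]>`, and it does no XML parsing or validation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an XML template with named gaps such as <[contents]>.
class XmlTemplate {
public:
    explicit XmlTemplate(std::string text) : text_(std::move(text)) {}

    // Returns a new template in which the first occurrence of the gap is
    // replaced by value. A missing gap is only noticed here, at run time.
    XmlTemplate plug(const std::string& gap, const std::string& value) const {
        const std::string marker = "<[" + gap + "]>";
        const std::size_t pos = text_.find(marker);
        if (pos == std::string::npos) {
            throw std::invalid_argument("no such gap: " + gap);
        }
        std::string result = text_;
        result.replace(pos, marker.size(), value);
        return XmlTemplate(result);
    }

    const std::string& str() const { return text_; }

private:
    std::string text_;
};

// Small self-check mirroring the wrap/contents example from the slides.
inline void check_plug_example() {
    const XmlTemplate wrap("<html><body><[contents]></body></html>");
    const XmlTemplate page = wrap.plug("contents", "Welcome");
    assert(page.str() == "<html><body>Welcome</body></html>");
}
```

In this sketch a misspelled gap name is detected only when the call executes. That is the kind of error a domain-specific static analysis can rule out before the program runs.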

DSL Analysis for JWIG
A static analysis that at compile time guarantees:
• only well-formed and valid XML is ever generated
• only existing form fields are ever received
• only existing gaps are ever plugged
This is a DSL analysis that is performed on the resulting compiled class files

JWIG Implementation Model
[Diagram: JWIG syntax -> jwigc -> Java syntax -> javac -> .class files -> jwiga -> analysis results, with the JWIG framework as a further component]

Syntax Extensions
Programmers may want to extend the syntax of their programming language:
• introduce domain-specific syntax
• abbreviate common idioms
• define language extensions
• ensure consistency
Such extensions are introduced through macros

Macros
Macros are as old as programming
They are used as an orthogonal abstraction mechanism
Two different flavors:
• lexical macros
• syntactic macros
Dictionary entry:
    Main Entry: 2macro
    Pronunciation: 'ma-(")krO
    Function: noun
    Inflected Form(s): plural macros
    Etymology: short for macroinstruction
    Date: 1959
    "a single computer instruction that stands for a sequence of operations"

Lexical Macros
Operate on sequences of tokens
Are handled by a preprocessor
Are independent of the host language syntax
Examples:
• CPP
• TeX

CPP - The C Preprocessor
Integrated into C compilers
Also works as a stand-alone expander
Intercepts directives such as:
• #define
• #undef
• #ifdef
• #if
• #include

Lexical Macro Example
CPP macro to square a number:
    #define square(X) X * X
    square(z + 1)  =>  z + 1 * z + 1, which parses as z + (1 * z) + 1
Adding parentheses as a hack:
    #define square(X) (X) * (X)
    square(z + 1)  =>  (z + 1)*(z + 1)

Parsing Problem
    #define swap(X,Y) { int t=X; X=Y; Y=t; }
    if (a > b) swap(a,b);
    else b=0;
    *** test.c:3: parse error before 'else'

Parsing Problem Hack
    #define swap(X,Y) do { int t=X; X=Y; Y=t; } while (0)
    if (a > b) swap(a,b);
    else b=0;

Expansion Time
    #define A 87
    #define B A
    #undef A
    #define A 42
    B  =>  ???
Eager expansion (definition time):
    B  =>  87
Lazy expansion (invocation time):
    B  =>  A  =>  42
CPP is lazy

Expansion Order
    #define id(X) X
    #define one(X) id(X)
    #define two a,b
    one(two)  =>  ???
Inner (call-by-value):
    one(two)  =>  one(a,b)  =>  *** arity error 'one'
Outer (call-by-name):
    one(two)  =>  id(two)  =>  two  =>  a,b

Expansion Order in CPP
CPP uses a pragmatic "argument prescan":
    one(two)  =>  id(a,b)  =>  *** arity error 'id'
Useful for composing macros:
    #define succ(X) ((X)+1)
    #define call7(X) X(7)
    call7(succ)  =>  succ(7)  =>  ((7)+1)

Recursive Expansion
    #define x 1+x
    x  =>  ???
Definition time:
    *** recursive definition
Invocation time:
    x  =>  1+x  =>  1+1+x  =>  1+1+1+x  =>  ...
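The lexical macro pitfalls above (operator precedence, statement parsing, and lazy expansion) can be reproduced with any C or C++ compiler. The following is a minimal sketch rather than a definitive treatment. The macro and function names (`SQUARE_NAIVE`, `SQUARE`, `SWAP`, `VALUE_A`, `VALUE_B`, `order_or_reset`, and so on) are chosen for illustration, with `VALUE_A` and `VALUE_B` playing the roles of `A` and `B` from the slides. `SQUARE` also parenthesizes the whole body, which goes one step beyond the hack shown on the slide.

```cpp
#include <cassert>

// Unparenthesized body: the argument tokens are substituted verbatim.
#define SQUARE_NAIVE(X) X * X
// Parenthesizing each use of the argument (and, additionally, the whole body).
#define SQUARE(X) ((X) * (X))

// Expansion time: VALUE_B is defined while VALUE_A is 87, but it is only
// expanded when it is used, and by then VALUE_A has been redefined to 42.
#define VALUE_A 87
#define VALUE_B VALUE_A
#undef VALUE_A
#define VALUE_A 42

// do { ... } while (0) makes the expansion a single statement, so the
// invocation can be followed by ';' and then by 'else'.
#define SWAP(X, Y) do { int t = (X); (X) = (Y); (Y) = t; } while (0)

// Expands to z + 1 * z + 1, which is z + (1 * z) + 1.
int naive_square_of_successor(int z) { return SQUARE_NAIVE(z + 1); }

// Expands to ((z + 1) * (z + 1)).
int square_of_successor(int z) { return SQUARE(z + 1); }

// Lazy expansion: VALUE_B => VALUE_A => 42.
int lazily_expanded_b() { return VALUE_B; }

// The if/else from the "Parsing Problem" slide, using the do-while form.
void order_or_reset(int& a, int& b) {
    if (a > b) SWAP(a, b);
    else b = 0;
}

// Small self-check of the behavior described in the comments above.
inline void check_lexical_macro_examples() {
    assert(naive_square_of_successor(3) == 7);   // 3 + (1 * 3) + 1
    assert(square_of_successor(3) == 16);        // (3 + 1) * (3 + 1)
    assert(lazily_expanded_b() == 42);
}
```

Calling `check_lexical_macro_examples()` from a `main` function exercises the three cases. Replacing `SWAP` with the brace-only definition from the slide makes `order_or_reset` fail to compile, because the `;` after the expanded block ends the `if` statement before the `else`.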

Recursive Expansion in CPP
CPP uses a pragmatic "intercept-and-ignore":
    int x = 2;
    #define x 1+x
    x  =>  1+x
Maintain a stack of macro invocations
Ignore invocations of macros already on the stack
At runtime the value of x is 3

TeX Macros
    \def \vector #1[#2..#3] { $({#1}_{#2},\ldots,{#1}_{#3})$ }
    \vector \phi[0..n-1]  =>  $({\phi}_{0},\ldots,{\phi}_{n-1})$
Flexible invocation syntax
Parsing ambiguities (chooses shortest invocation)
Expansion is lazy and outer
Recursion is permitted (conditions allowed)

Syntactic Macros
Operate on sequences of ASTs
Are handled by the parser
Are integrated with the host language syntax
Examples:
• C++ templates
• Jakarta Tool Suite

C++ Templates
Integrated into C++ compilers
Intended as a genericity mechanism
But often used as a macro language
Macros accept ASTs for:
• identifiers
• constants
• types
The result is always an AST for a declaration

Syntactic Macro Example
    template <class T> T GetMax(T x, T y) { return (x>y?x:y); }
    int i,j;
    int max = GetMax <int> (i,j);
Template bodies are parsed at definition time (unlike CPP macros)
Templates are syntactically expanded
Heavy use of templates yields bloated code (unlike Java generics that are not macros)

Metaprogramming
C++ templates:
• perform compile time constant folding of arguments
• allow multiple template definitions and pattern matching
This combination enables metaprogramming:
• Turing-complete computations during compilation
Template libraries exist for:
• booleans
• control structures
• functions
• variables
• data structures

Metaprogramming Example
    template <int X, int Y> struct pow { static const int n = X*pow<X,Y-1>::n; };
    template <int X> struct pow<X,0> { static const int n = 1; };
    const int z = pow<5,3>::n;
The value 125 is assigned to z at compile time

Metaprogramming for Specialization
    template <int I> inline float dot(float *a, float *b) { return dot<I-1>(a,b) + a[I]*b[I]; }
    template <> inline float dot<0>(float *a, float *b) { return a[0]*b[0]; }
    float x[3], y[3];
    float z = dot<2>(x,y);
The call expands to:
    float z = x[0]*y[0] + x[1]*y[1] + x[2]*y[2];
The overhead of control structures is removed

Jakarta Tool Suite
JTS extends Java with simple syntactic macros
Macros accept ASTs for:
• AST_QualifiedName
• AST_Exp
• AST_Stm
• AST_FieldDecl
• AST_Class
• AST_TypeName
The result is an AST specified as:
• exp{ ... }exp
• stm{ ... }stm
• mth{ ... }mth
• cls{ ... }cls

Hygienic Macros
    macro swap(AST_QualifiedName x, AST_QualifiedName y) local temp
      stm{ int temp = x; x = y; y = temp; }stm

    int temp = 42;
    int tump = 87;
    #swap(temp,tump);
Potential name clash problem:
    int temp = temp; temp = tump; tump = temp;
But local names are renamed uniquely:
    int temp143 = temp; temp = tump; tump = temp143;

The MetaFront System
Macros are special cases of transformations:
[Diagram: metafront takes an input language A, an output language B, a transformation x: A => B, and an input program program.a, and produces the output program program.b]
Inductive transformations allow:
• arbitrary nonterminals
• arbitrary invocation syntax

MetaFront Example
    language Maybe extends Java
      stm[maybe] -> maybe <stm> ;

    transformation Maybe2Java: Maybe => Java {
      transformer Xstm: stm => stm;
      Xstm[maybe](S) S.xstm => xS ==>
        << if (rnd(2)==1) <xS> >>
    }
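The name clash that hygienic macros avoid can also be observed with a lexical CPP macro, which performs no renaming of local names. The following is a minimal sketch in C++ and not JTS code. `SWAP_UNHYGIENIC`, `swap_hygienic`, `IntPair`, and the two wrapper functions are hypothetical names chosen for illustration. The macro's local variable is initialized to 0 first, so that the captured self-assignment reads a defined value.

```cpp
#include <cassert>

// Unhygienic lexical macro: its local name 'temp' is not renamed, so an
// argument that is also called 'temp' is captured by the inner declaration.
#define SWAP_UNHYGIENIC(X, Y) \
    do { int temp = 0; temp = (X); (X) = (Y); (Y) = temp; } while (0)

// A function template gets its own scope, so its local 'temp' cannot
// clash with names at the call site.
template <typename T>
void swap_hygienic(T& x, T& y) {
    T temp = x;
    x = y;
    y = temp;
}

struct IntPair {
    int first;
    int second;
};

// Inside the expansion every 'temp' refers to the macro's local variable,
// so the caller's 'temp' is never written and 'tump' is assigned its own value.
IntPair swap_with_macro(int temp, int tump) {
    SWAP_UNHYGIENIC(temp, tump);
    return {temp, tump};
}

// The same call through the function template swaps the two values.
IntPair swap_with_function(int temp, int tump) {
    swap_hygienic(temp, tump);
    return {temp, tump};
}

// Small self-check using the values 42 and 87 from the slide.
inline void check_hygiene_examples() {
    const IntPair clash = swap_with_macro(42, 87);
    assert(clash.first == 42 && clash.second == 87);  // nothing was swapped
    const IntPair ok = swap_with_function(42, 87);
    assert(ok.first == 87 && ok.second == 42);
}
```

With arguments named anything other than `temp`, the macro swaps as intended, which is what makes this kind of capture easy to miss. Automatic renaming of local names, as in the `temp143` example, removes the dependence on the caller's choice of identifiers.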