Compilation 2012
Domain-Specific Languages and Syntax Extensions
Jan Midtgaard
Michael I. Schwartzbach
Aarhus University
GPL Problem Solving
 The General Purpose Language (GPL) approach:
• analyze the problem domain
• express the conceptual model as an OO/FP/… design
• program a framework/library
• express concrete application as framework/library client
 Pros:
• predictable and familiar result
• (relatively) low cost of implementation
 Cons:
• difficult to fully exploit domain-specific knowledge
• only available to general programmers
DSL Problem Solving
 The DSL approach:
• analyze the problem domain
• express the conceptual model as a language design
• implement a compiler or interpreter
 Pros:
• possible to exploit all domain-specific knowledge
• also available to domain experts
 Cons:
• (relatively) high cost of implementation
• risk of Babylonian confusion
• lack of tool support (IDE,…)
• hard to combine DSLs or DSL and GPL this way
Variations of DSLs
 A stand-alone DSL:
• a novel language with unique syntax and features
• example: LaTeX
 An embedded DSL:
• an existing GPL extended with DSL features
• example: JSP
 An external DSL:
• a stand-alone DSL invoked from a GPL
• example: SQL invoked from Java (JDBC)
From DSL to GPL
 A stand-alone DSL may evolve into a GPL:
• Fortran: Formula Translation
• Algol: Algorithmic Language
• Cobol: Common Business Oriented Language
• Lisp: List Processing Language
• Simula: Simulation Language
• ML: Meta Language
 A (successful) DSL design should plan for growth
Using Domain-Specific Knowledge
 Domain-specific syntax:
• domain-specific syntax clarifies the behavior
• directly denote high-level concepts
 Domain-specific analysis:
• consider global properties of the application
 Domain-specific optimization:
• exploit domain-specific analysis results
 GPL frameworks cannot provide these benefits
The Ocamlyacc/Menhir Languages
 A stand-alone (or external) DSL:
• no general-purpose computing is required
 Domain concepts:
• Context-free grammars
• Tokens / terminals
• Non-terminals and productions
 Implemented using:
• a lexer+parser (hand-written or ocamllex/ocamlyacc)
• a symbol checker + analysis
• a parse table builder + emitter (Menhir contains different
table/code/Coq backends)
DSL Syntax for Grammars
…
start  : start PLUS term    { }
       | start MINUS term   { }
       | term               { };
term   : term STAR factor   { }
       | term SLASH factor  { }
       | factor             { };
factor : ID                 { }
       | LPAR start RPAR    { };
 The BNF syntax closely matches the domain at hand
GPL Alternatives
 Parsing can be done in a number of ways:
• Hand-written (lexer and) parser (more next week)
• Hand-written parser table
• Parser combinators
• …
 Harder to write correctly
 Fixed implementation strategy
 In contrast, (OCaml)yacc and Menhir decouple the
language description from the workings of the
language parser
DSL Analysis for Grammars
 Symbol checking:
• Checks non-terminal and terminal names
• Checks indexes ($1) for validity (bounds + data)
 Menhir also type checks the productions (by type
checking the action code)
 Analyses grammar for useless productions
(reachability) and removes them
 Checks grammar for LALR/LR(1) conformance
 These are checked by phases in the
ocamlyacc/menhir compiler
GPL Analysis Alternative
 Lots of yellow PostIt notes:
 These cannot (all) be checked by a GPL compiler,
e.g., OCaml or Java.
The JWIG Language
 An embedded DSL (in Java):
• lots of general-purpose computing is required
 Domain concepts:
• XML templates
• Web services
• sessions
 Implemented using:
• a syntax extension
• a static analysis
• a framework
DSL Syntax for JWIG
public class test extends Service {
  String userid;
  public class Login extends Session {
    XML wrap = [[<html>
                   <body bgcolor="yellow">
                     <[contents]>
                   </body>
                 </html>]];
    public void main() {
      XML login = [[<form>
                      Userid: <input type="text" name="userid">
                      <input type="submit"/>
                    </form>]];
      show wrap<[contents = login];
      userid = receive userid;
      show wrap<[contents = "Welcome "+userid];
    }
  }
}
GPL Syntax Alternative
XML login = XML.make("<form>\nUserid: <input type=\"text\" name=\"userid\">\n"
    + "<input type=\"submit\"/>\n</form>");
show(wrap.plug("contents",login));
userid = receive("userid");
 The DSL syntax maps directly to method calls in
an underlying Java framework
 Avoiding escapes makes the syntax more legible
 But this is just a thin layer of syntactic sugar
DSL Analysis for JWIG
 A static analysis that at compile time guarantees:
• only well-formed and valid XML is ever generated
• only existing form fields are ever received
• only existing gaps are ever plugged
 This is a DSL analysis that is performed on the
resulting compiled class files
JWIG Implementation Model
[Diagram: JWIG syntax → jwigc → Java syntax → javac → .class files → jwiga → analysis results; the JWIG framework is shown as a separate component]
Syntax Extensions
 Programmers may want to extend the syntax of
their programming language:
• introduce domain-specific syntax
• abbreviate common idioms
• define language extensions
• ensure consistency
 Such extensions are introduced through macros
Macros
 Macros are as old as programming
 Are used as an orthogonal abstraction mechanism
 Two different flavors:
• lexical macros
• syntactic macros
Main Entry: 2macro
Pronunciation: 'ma-(")krO
Function: noun
Inflected Form(s): plural macros
Etymology: short for macroinstruction
Date: 1959
“a single computer instruction that
stands for a sequence of operations”
Lexical Macros
 Operate on sequences of tokens
 Are handled by a preprocessor
 Are independent of the host language syntax
 Examples:
• CPP
• TeX
CPP - The C Preprocessor
 Integrated into C compilers
 Also works as a stand-alone expander
 Intercepts directives such as:
• #define
• #undef
• #ifdef
• #if
• #include
Lexical Macro Example
 CPP macro to square a number:
#define square(X) X * X
square(z + 1) → z + 1 * z + 1, which parses as z + (1 * z) + 1
 Adding parentheses as a hack:
#define square(X) (X) * (X)
square(z + 1) → (z + 1)*(z + 1)
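The precedence problem can be checked at compile time. The sketch below uses illustrative macro names (`SQUARE_BARE`, `SQUARE_ARGS`, `SQUARE_FULL` are not from the slides) and assumes a C++11 or later compiler. It also shows that parenthesizing only the arguments still misbehaves when the invocation sits inside a larger expression, so the usual remedy is to parenthesize the whole body as well.

```cpp
// Three variants of the squaring macro (names are illustrative).
#define SQUARE_BARE(X) X * X          // the original definition
#define SQUARE_ARGS(X) (X) * (X)      // arguments parenthesized
#define SQUARE_FULL(X) ((X) * (X))    // arguments and whole body parenthesized

constexpr int z = 2;

constexpr int bare = SQUARE_BARE(z + 1);          // z + 1 * z + 1
constexpr int args = SQUARE_ARGS(z + 1);          // (z + 1) * (z + 1)
constexpr int divided_args = 8 / SQUARE_ARGS(z);  // 8 / (z) * (z), i.e. (8 / z) * z
constexpr int divided_full = 8 / SQUARE_FULL(z);  // 8 / ((z) * (z))
```

With z equal to 2, `bare` is 5 rather than 9, `args` is 9, `divided_args` is 8, and `divided_full` is 2.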
Parsing Problem
#define swap(X,Y) { int t=X; X=Y; Y=t; }
if (a > b) swap(a,b); else b=0;
*** test.c:3: parse error before 'else'
Parsing Problem Hack
#define swap(X,Y) do { int t=X; X=Y; Y=t; } while (0)
if (a > b) swap(a,b); else b=0;
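The parse error arises because the block form expands to `{ ... };` and the extra semicolon ends the `if` statement before `else` is seen. The do/while(0) form is a single statement that consumes the trailing semicolon. A minimal sketch, assuming C++14 (for the `constexpr` function body); `SWAP_INT` and `swap_or_reset` are illustrative names:

```cpp
// The do/while(0) form of the swap macro; arguments are parenthesized defensively.
#define SWAP_INT(X, Y) do { int t = (X); (X) = (Y); (Y) = t; } while (0)

// Mirrors the slide's statement: swap when a > b, otherwise reset b.
// Returns a * 100 + b so that both results can be inspected at once.
constexpr int swap_or_reset(int a, int b) {
    if (a > b) SWAP_INT(a, b); else b = 0;
    return a * 100 + b;
}
```

Here `swap_or_reset(5, 3)` yields 305 (the values were swapped) and `swap_or_reset(1, 2)` yields 100 (the else branch ran).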
Expansion Time
#define A 87
#define B A
#undef A
#define A 42
B → ???
 Eager expansion (definition time):
B → 87
 Lazy expansion (invocation time):
B → A → 42
 CPP is lazy
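The lazy behavior can be confirmed directly. In this small sketch the constant name `b_value` is an assumption added for illustration and a C++11 or later compiler is assumed; the four directives are the ones from the slide.

```cpp
#define A 87
#define B A      // the body of B is the token A, not the value 87
#undef A
#define A 42

// B is expanded here, at the point of use: B → A → 42.
constexpr int b_value = B;

// Clean up the single-letter macros so they do not leak into later code.
#undef A
#undef B
```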
Expansion Order
#define id(X) X
#define one(X) id(X)
#define two a,b
one(two) → ???
 Inner ("call-by-value"):
one(two) → one(a,b) → *** arity error 'one'
 Outer ("call-by-name"):
one(two) → id(two) → two → a,b
Expansion Order in CPP
 CPP uses a pragmatic "argument prescan":
one(two) → id(a,b) → *** arity error 'id'
 Useful for composing macros:
#define succ(X) ((X)+1)
#define call7(X) X(7)
call7(succ) → succ(7) → ((7)+1)
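The composition example compiles as written. During the prescan, `succ` is not followed by a parenthesis, so it is passed through unexpanded; after substitution the rescan sees `succ(7)` and expands it. A minimal sketch, where the constant name `eight` is an assumption for illustration and C++11 or later is assumed:

```cpp
#define succ(X) ((X) + 1)
#define call7(X) X(7)

// call7(succ) → succ(7) → ((7) + 1)
constexpr int eight = call7(succ);
```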
Recursive Expansion
#define x 1+x
x → ???
 Definition time:
*** recursive definition
 Invocation time:
x → 1+x → 1+1+x → 1+1+1+x → ...
Recursive Expansion in CPP
 CPP uses a pragmatic "intercept-and-ignore":
int x = 2;
#define x 1+x
x → 1+x
 Maintain a stack of macro invocations
 Ignore invocations of macros already on the stack
 At runtime the value of x is 3
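A small check of this behavior, assuming C++11 or later; the name `y` is an assumption for illustration. The `x` inside the macro body is left alone during rescanning, so it refers to the variable.

```cpp
constexpr int x = 2;

#define x 1 + x        // the x in the body is not expanded again
constexpr int y = x;   // expands once to: 1 + x, where x is the variable above
#undef x               // restore the plain meaning of x for later code
```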
TeX Macros
\def \vector #1[#2..#3] {
$({#1}_{#2},\ldots,{#1}_{#3})$
}
\vector \phi[0..n-1]
$({\phi}_{0},\ldots,{\phi}_{n-1})$
 Flexible invocation syntax
 Parsing ambiguities (chooses shortest invocation)
 Expansion is lazy and outer
 Recursion is permitted (conditions allowed)
Syntactic Macros
 Operate on sequences of ASTs
 Are handled by the parser
 Are integrated with the host language syntax
 Examples:
• C++ templates
• Jakarta Tool Suite
C++ Templates
 Integrated into C++ compilers
 Are intended as a genericity mechanism
 But are often used as a macro language
 Macros accept ASTs for:
• identifiers
• constants
• types
 The result is always an AST for a declaration
Syntactic Macro Example
template <class T>
T GetMax(T x, T y) { return (x>y?x:y); }
int i,j;
int max = GetMax <int> (i,j);
 Template bodies are parsed at definition time
(unlike CPP macros)
 Templates are syntactically expanded
 Heavy use of templates yields bloated code
(unlike Java generics that are not macros)
Metaprogramming
 C++ templates:
• perform compile time constant folding of arguments
• allow multiple template definitions and pattern matching
 This combination enables metaprogramming:
• Turing-complete computations during compilation
 Template libraries exist for:
• booleans
• control structures
• functions
• variables
• data structures
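As a rough illustration of such building blocks, the sketch below defines a compile-time conditional and a recursive "loop". The names `if_`, `factorial` and `small_or_big` are hypothetical and do not come from any particular template library; C++11 or later is assumed.

```cpp
#include <type_traits>

// A compile-time conditional: selects Then or Else based on a boolean constant.
template <bool Condition, class Then, class Else>
struct if_ { using type = Then; };

// Pattern matching on the first argument: this specialization handles 'false'.
template <class Then, class Else>
struct if_<false, Then, Else> { using type = Else; };

// A compile-time "loop" expressed as recursion: the factorial of N.
template <unsigned N>
struct factorial { static constexpr unsigned value = N * factorial<N - 1>::value; };

// The base case stops the recursion.
template <>
struct factorial<0> { static constexpr unsigned value = 1; };

// Combine both: pick a storage type depending on a computed constant.
using small_or_big =
    if_<(factorial<5>::value < 256), unsigned char, unsigned int>::type;
```

Since `factorial<5>::value` is 120, `small_or_big` is `unsigned char`. The specializations play the role of pattern matching and constant folding drives the computation.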
Metaprogramming Example
template <int X, int Y>
struct pow { static const int n=X*pow<X,Y-1>::n; };
template <int X>
struct pow<X,0> { static const int n = 1; };
const int z = pow<5,3>::n;
 The value 125 is assigned to z at compile time
Metaprogramming for Specialization
template <int I>
inline float dot(float *a, float *b)
{ return dot<I-1>(a,b) + a[I]*b[I]; }
template <>
inline float dot<0>(float *a, float *b)
{ return a[0]*b[0]; }
float x[3], y[3];
float z = dot<2>(x,y);
float z = x[0]*y[0] + x[1]*y[1] + x[2]*y[2];
 The overhead of control structures is removed
Jakarta Tool Suite
 JTS extends Java with simple syntactic macros
 Macros accept ASTs for:
• AST_QualifiedName
• AST_Exp
• AST_Stm
• AST_FieldDecl
• AST_Class
• AST_TypeName
 The result is an AST specified as:
• exp{ ... }exp
• stm{ ... }stm
• mth{ ... }mth
• cls{ ... }cls
Hygienic Macros
macro swap(AST_QualifiedName x, AST_QualifiedName y)
local temp
stm{ int temp = x; x = y; y = temp; }stm
int temp = 42;
int tump = 87;
#swap(temp,tump);
 Potential name clash problem:
int temp = temp; temp = tump; tump = temp;
 But local names are renamed uniquely:
int temp143 = temp; temp = tump; tump = temp143;
 Hygienic macros are available in Scheme, various macro
extensions of Java such as JSE, …
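CPP has no hygiene, but the renaming step can be loosely imitated by generating the local name from the line number. This is only a sketch of the idea, not real hygiene: a user variable could still collide with the generated name, and two invocations on the same line would share it. All macro names below are illustrative, and C++14 is assumed.

```cpp
// Two-level concatenation so that __LINE__ is expanded before pasting.
#define CONCAT_INNER(A, B) A##B
#define CONCAT(A, B) CONCAT_INNER(A, B)

// The temporary's name T is supplied by the caller of SWAP_WITH.
#define SWAP_WITH(X, Y, T) do { int T = (X); (X) = (Y); (Y) = T; } while (0)

// Generates a name of the form swap_temp_<line> for the temporary.
#define SWAP_INT(X, Y) SWAP_WITH(X, Y, CONCAT(swap_temp_, __LINE__))

// Same variable names as on the slide: a local called temp must not be captured.
constexpr int swapped_pair() {
    int temp = 42;
    int tump = 87;
    SWAP_INT(temp, tump);      // the macro's local is swap_temp_<line>, not temp
    return temp * 100 + tump;  // encodes both values in one result
}
```

Here `swapped_pair()` yields 8742, so the two variables were exchanged even though one of them is called `temp`.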