Code generation - uri=userpages.uni

advertisement
Code generation
Course "Software Language Engineering"
University of Koblenz-Landau
Department of Computer Science
Ralf Lämmel
Software Languages Team
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
1
Motivation
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
2
2!
1!
(Intermediate) code generation
Intermediate Code Generation!
•! Facilitates retargeting: enables attaching a back
end for the new machine to an existing front end!
Lexical Analysis Intermediate"
and"
Front end!
Back end!
code!
Lexical Analyzer Generators!
Target"
machine"
code!
•! Enables3!
machine-independent code optimization!
Chapter
Intermediate code (representation; IR) facilitates
retargeting (addressing a new back end) and
machine-independent optimization.
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
3
1!
Compiler architecture
2!
Position of a Code Generator in
the Compiler Model!
Lexical Analysis and"
Code"
Lexical Analyzer
Front-End!Generators!
Optimizer!
Source"
program!
Intermediate"
code!
Intermediate"
code!
Code"
Generator!
Target"
program!
Chapter
3!
Lexical error"
Syntax error"
Semantic error!
Symbol Table!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
4
Different options for target code
Absolute machine code (directly executable code)
Relocatable machine code (subject to linking step)
Assembly code (which is suitable for debugging)
Bytecode (P code, intermediate representation)
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
The focus in
this course
5
Obvious requirements
Intermediate and final code must be correct.
That is, code generation must be semantics-preserving.
Code must be fast and resource-friendly.
We need a cost model of the code languages.
We may need weights in code generation.
We may use optimizations to reduce costs.
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
6
1!
Intermediate representations
targeted by code generation
3!
Intermediate Representations!
•! Graphical representations
Lexical Analysis
and" (e.g. AST)!
•! Postfix notation: operations on values stored
Lexical Analyzer
Generators!
on operand
stack (similar to JVM bytecode)!
•! Three-address code: (e.g. triples and quads)"
! x := y op z !
Chapter
3!
•! Two-address code:"
! x := op y"
which is the same as x := x op y!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
7
7!
1!
Postfix Notation!
a := b * -c + b * -c!
Lexical
Analysis and"
a b c uminus * b c uminus * + assign!
Bytecode (for example)!
Lexical Analyzer
Generators!
Postfix notation
represents!
operations on a stack!
Chapter 3!
Pro: !easy to generate!
Cons: !stack operations are more"
!
difficult to optimize!
iload 2
iload 3
ineg
imul
iload 2
iload 3
ineg
imul
iadd
istore 1
//
//
//
//
//
//
//
//
//
//
push b
push c
uminus
*
push b
push c
uminus
*
+
store a
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
8
8!
1!
Three-Address Code!
a := b * -c + b * -c!
Lexical Analysis and"
Lexical Analyzer Generators!
t1
t2
t3
t4
t5
a
:=
:=
:=
:=
:=
:=
- c
b * t1
- c
b * t3
t2 + t4
t5
Chapter 3!
Linearized representation"
of a syntax tree!
t1
t2
t5
a
:=
:=
:=
:=
- c
b * t1
t2 + t2
t5
Linearized representation"
of a syntax DAG!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
9
Big questions of code generation
What is the “language” of intermediate representations?
How “context-free” is the language?
What is the distance between high-level and IR language?
How to define a mapping between the languages?
What sort of optimizations are possible on IR?
What backends are available for IR?
...
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
10
Syntax-directed translation
re
h
ig
h
g
in
p
p
a
m
f
o
s
n
a
e
m
l
a
A typic
level to lower-level code
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
11
1!
9!
Three-Address Statements!
•! Assignment statements: x := y op z, x := op y!
•! Indexed assignments: x := y[i], x[i] := y!
•! Pointer assignments: x := &y, x := *y, *x := y!
•! Copy statements: x := y!
•! Unconditional jumps: goto lab!
Chapter 3!
•! Conditional jumps: if x relop y goto lab!
•! Function calls: param x… call p, n"
return y!
Various such
“P code machines”
exist ...
Lexical Analysis and"
Lexical Analyzer Generators!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
12
1!
10!
Syntax-Directed Translation into
Three-Address Code!
Productions"
S ! id := E"
| while E do S"
E ! E + E"
| E * E"
| - E"
3!
| (Chapter
E )"
| id!
| num"
Synthesized attributes:"
S.code !
!three-address code for S!
S.begin!
!label to start of S or nil!
S.after !
!label to end of S or nil!
E.code!
!three-address code for E!
E.place!
!a name holding the value of E"
Lexical Analysis and"
Lexical Analyzer Generators!
gen(E.place ‘:=’ E1.place ‘+’ E2.place)!
Code generation!
t3 := t1 + t2
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
13
1!
11!
Syntax-Directed Translation into
Three-Address Code (cont’d)!
Productions"
S ! id := E!
S ! while E"
do S1"
E ! E1 + E2"
Semantic rules!
S.code := E.code || gen(id.place ‘:=’ E.place); S.begin := S.after := nil"
(see next slide)!
Lexical Analysis and"
Lexical Analyzer Generators!
E ! E1 * E2"
E ! - E1"
E.place := newtemp();"
E.code := E1.code || E2.code || gen(E.place ‘:=’ E1.place ‘+’ E2.place)"
E.place := newtemp();"
E.code := E1.code || E2.code || gen(E.place ‘:=’ E1.place ‘*’ E2.place)!
E.place := newtemp();"
E.code := E1.code || gen(E.place ‘:=’ ‘uminus’ E1.place)!
E.place := E1.place"
E.code := E1.code!
E.place := id.name"
E.code := ‘’"
E.place := newtemp();"
E.code := gen(E.place ‘:=’ num.value)!
Chapter 3!
E ! ( E1 )"
E ! id!
E ! num!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
14
1!
12!
Syntax-Directed Translation into
Three-Address Code (cont’d)!
Lexical
Analysis and" S.begin:!
Production"
S ! while E do S !
Lexical Analyzer
Generators!
Semantic rule"
1
E.code!
if E.place = 0 goto S.after!
S.code!
goto S.begin!
S.begin := newlabel()"
S.after := newlabel()!
S.after:! …!
S.code := gen(S.begin ‘:’) ||"
!
E.code ||"
!
gen(‘if’ E.place ‘=‘ ‘0’ ‘goto’ S.after) ||"
!
S1.code ||"
!
gen(‘goto’ S.begin) ||"
!
gen(S.after ‘:’)!
Chapter 3!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
15
1!
13!
Example!
i := 2 * n + k!
while i do!
i := i - k"
Lexical Analysis and"
Lexical Analyzer Generators!
Chapter 3!
t1 := 2
t2 := t1 * n
t3 := t2 + k
i := t3
L1: if i = 0 goto L2
t4 := i - k
i := t4
goto L1
L2:
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
16
18!
1!
Symbol Tables for Scoping!
struct S
{ int a;
int b;
} s;
We need a symbol table"
for the fields of struct S!
Need symbol table"
for global variables"
and functions!
Lexical
Analysis
and"
void swap(int&
a, int&
b)
{ int t;
t = a;
Lexical Analyzer
Generators!
a = b;
b = t;
}
Chapter 3!
void somefunc()
{ …
swap(s.a, s.b);
…
}
Need symbol table for arguments"
and locals for each function!
Check: s is global and has fields a and b"
Using symbol tables we can generate"
code to access s and its fields!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
17
Additional aspects
Register allocation
Array operations
Procedure calls
Exceptions
No details provided on this
matter. See any textbook on
compiler construction. This would
be another course.
...
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
18
Portable code machines
Important kind of target for
code generation
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
19
Examples of P code machines
Java Virtual Machine
MATLAB precompiled code
The machine of UCSD Pascal
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
20
Source versus P code
public static int gcd(int, int);
public static int gcd(int x, int y) {
while (x != y) {
if (x > y)
x = x - y;
else
y = y - x;
}
return x;
}
0: iload_0
1: iload_1
2: if_icmpeq 24
5: iload_0
6: iload_1
7: if_icmple 17
10: iload_0
11: iload_1
12: isub
13: istore_0
14: goto 0
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
17:
18:
19:
20:
21:
24:
25:
iload_1
iload_0
isub
istore_1
goto 0
iload_0
ireturn
21
Characteristics of P code
Portability of code generation
Simplicity of code generation
Simplicity of interpretation
Compact size of P code
Debugging of P code
Efficiency by JITing or hardware implementation
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
22
Variation points in P code machines
The extent of using registers
Available machine registers (such as SP)
The amount of stacks
Availability of heap
Supported datatypes
Calling conventions
...
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
23
A quick look at compiler optimization
This is a huge area making up for the bigg
er part of
compiler construction which we cannot cove
r here in any
useful way. Some context is provided for m
otivation.
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
24
Major classification
Peephole optimization (pretty local)
Optimizations requiring global program analysis
Optimizations requiring IR in “normal” form
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
25
Peephole optimization
Perform optimization over small window of code.
Typical examples:
Constant folding (evaluate constant subexpressions)
Strength reduction (replace slow by fast operations)
Common sub-expression elimination (simple forms thereof)
Remove redundant operations
Algebraic transformations
Special case operations (inc vs. add 1)
...
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
26
An example for
strength reduction for Java Bytecode
...
aload 1
aload 1
mul
...
...
aload 1
dup
mul
...
[http://en.wikipedia.org/wiki/Peephole_optimization, 4 Dec 2012]
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
27
An example for
removing redundant operations
Source
Bytecode
a = b + c;
d = a + e;
MOV
ADD
MOV
MOV
ADD
MOV
b, R0
c, R0
R0, a
a, R0
e, R0
R0, d
#
#
#
#
#
#
Copy
Add
Copy
Copy
Add
Copy
b to the register
c to the register
the register to a
a to the register
e to the register
the register to d
Eliminate
past regular
generation
[http://en.wikipedia.org/wiki/Peephole_optimization, 4 Dec 2012]
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
28
An example for
removing redundant operations for Z80 :-)
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
POP HL
POP DE
POP BC
POP AF
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
POP HL
POP DE
POP BC
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
POP HL
POP DE
PUSH DE
PUSH HL
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
POP HL
PUSH HL
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
[http://en.wikipedia.org/wiki/Peephole_optimization, 4 Dec 2012]
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
29
15!
1!
Transformations!
An exampleAlgebraic
for
an algebraic
transformation
on three-address
statements
•! Change
arithmetic operations
to transform
blocks to algebraic equivalent forms!
Lexical Analysis and"
Lexical Analyzer Generators!
t1 := a - a
t2 := b + t1
t3 := 2 * t2
t1 := 0
t2 := b
t3 := t2 << 1
Chapter 3! Bit shift being cheaper
than multiplication with 2.
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
30
SSA form: a key concept in
compiler optimization
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
31
Static single assignment form (SSA form)
A property of IR: each variable is assigned exactly once!
SSA enables or enhances compiler optimizations
Regular
SSA
y := 1
y := 2
x := y
y1 := 1
y2 := 2
x1 := y2
It requires some analysis to see that the
first assignment is “dead” and that the
third assignment relies on the second.
These facts are more
obvious in SSA form.
See also: http://en.wikipedia.org/wiki/Static_single_assignment_form
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
32
SSA conversion
Replace the target of each assignment with a new variable.
Replace each reference with the right “version”.
Regular
y := 1
y := 2
x := y
SSA
y1 := 1
y2 := 2
x1 := y2
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
33
Optimizations benefiting from SSA
ik
n.w
//e
rg
ia.o
iped
k
/wi
global value numbering
or m
strength reduction
f
ent_
gnm
assi
partial redundancy elimination
e_
ingl
c_s
tati
i/S
dead code elimination
:
http
sparse conditional constant propagation
:
also
value range propagation
See
constant propagation
register allocation
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
34
Control-flow graphs
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
35
2!
1!
Flow Graphs!
•! A flow graph is a graphical depiction of a
sequence of instructions with control flow
Lexicaledges!
Analysis and"
Lexical Analyzer
Generators!
•! A flow graph
can be defined at the
intermediate code level or target code level!
ChapterMOV3!1,R0
MOV n,R1
JMP L2
L1: MUL 2,R0
SUB 1,R1
L2: JMPNZ R1,L1
MOV 0,R0
MOV n,R1
JMP L2
L1: MUL 2,R0
SUB 1,R1
L2: JMPNZ R1,L1
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
36
1!
3!
Basic Blocks!
•! A basic block is a sequence of consecutive
with
exactly one entry point and
Lexical instructions
Analysis
and"
one exit point (with natural flow or a branch
Lexical Analyzer
Generators!
instruction)!
MOV 1,R0
MOV n,R1
JMP L2
L1: MUL 2,R0
SUB 1,R1
L2: JMPNZ R1,L1
Chapter 3!
MOV 1,R0
MOV n,R1
JMP L2
L1: MUL 2,R0
SUB 1,R1
L2: JMPNZ R1,L1
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
37
1!
4!
Basic Blocks and Control Flow
Graphs!
•! A control flow graph (CFG) is a directed
graph with basic blocks Bi as vertices and
Lexical Analysis
and"
with edges Bi ! Bj iff Bj can be executed
Lexical Analyzer
Generators!
immediately
after Bi!
MOV 1,R0
MOV n,R1
JMP L2
L1: MUL 2,R0
SUB 1,R1
L2: JMPNZ R1,L1
Chapter 3!
MOV 1,R0
MOV n,R1
JMP L2
L1: MUL 2,R0
SUB 1,R1
L2: JMPNZ R1,L1
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
38
1!
5!
Successor and Predecessor
Blocks!
•! Suppose the CFG has an edge B1 ! B2!
–! Basic block Band"
is a predecessor of B
Lexical Analysis
–! Basic block B is a successor of B
Lexical Analyzer Generators!
1
2
2
Chapter 3!
!
1!
MOV 1,R0
MOV n,R1
JMP L2
L1: MUL 2,R0
SUB 1,R1
L2: JMPNZ R1,L1
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
39
1!
6!
Partition Algorithm for Basic
Blocks!
Input: !A sequence of three-address statements!
Output: A list of basic blocks with each three-address statement"
in exactly one block!
Lexical Analysis and"
Lexical Analyzer
Generators!
1.! Determine the
set of leaders, the first statements if basic blocks!
a)! The first statement is the leader!
b)! Any statement that is the target of a goto is a leader!
c)! Any statement that immediately follows a goto is a leader!
2.! For each leader, its basic block consist of the leader and all"
statements up to but not including the next leader or the end"
of the program !
Chapter 3!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
40
1!
7!
Loops!
•!
A
loop
is
a
collection
of
basic
blocks,
such
Lexical Analysis and"
that!
Lexical Analyzer
Generators!
–! All blocks in the collection are strongly
connected!
Chapter 3!
–! The collection has a unique entry, and the only
way to reach a block in the loop is through the
entry!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
41
1!
8!
Loops (Example)!
B1:!
MOV 1,R0
MOV n,R1
JMP L2
Strongly connected"
components:!
Lexical Analysis and"
Lexical Analyzer Generators! SCC={!{B2,B3},"
!
{B4} }!
B2:! L1: MUL 2,R0
SUB 1,R1
B3:! L2: JMPNZ R1,L1
Chapter 3!
B4:! L3: ADD 2,R2
SUB 1,R0
JMPNZ R0,L3
Entries:!
B3, B4!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
42
1!
10!
Transformations on Basic Blocks!
•! A code-improving transformation is a code
optimization to improve speed or reduce code size!
•! Global transformations are performed across basic
blocks!
•! Local transformations are only performed on
single basic blocks!
Chapter 3!
•! Transformations must be safe and preserve the
meaning of the code!
Lexical Analysis and"
Lexical Analyzer Generators!
–! A local transformation is safe if the transformed basic
block is guaranteed to be equivalent to its original form!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
43
1!
11!
Common-Subexpression
Elimination!
•! Remove redundant computations!
Lexical Analysis and"
Lexical Analyzer Generators!
a
b
c
d
:=
:=
:=
:=
b
a
b
a
+
+
-
c
d
c
d
a
b
c
d
:=
:=
:=
:=
b + c
a - d
b + c
b
Chapter 3!
t1
t2
t3
t4
:=
:=
:=
:=
b * c
a - t1
b * c
t2 + t3
t1 := b * c
t2 := a - t1
t4 := t2 + t1
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
44
12!
1!
Dead Code Elimination!
•! Remove unused statements!
Lexical Analysis
b := aand"
+ 1
b := a + 1
a := b + c
…
Lexical Analyzer …Generators!
Assuming a is dead (not used)!
Chapter 3!
if true goto L2
b := x + y
…
Remove unreachable code!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
45
1!
13!
Renaming Temporary Variables!
•! Temporary variables that are dead at the end
Lexicalof Analysis
and"
a block can be safely renamed!
Lexical Analyzer Generators!
t1 := b + c
t2 := a - t1
t1 := t1 * d
d := t2 + t1
Chapter 3!
t1 := b + c
t2 := a - t1
t3 := t1 * d
d := t2 + t3
Normal-form block!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
46
1!
14!
Interchange of Statements!
•! Independent statements can be reordered!
Lexical Analysis and"
Lexical Analyzer
Generators!
t1 := b + c
t1 := b + c
t2 := a - t1
t3 := t1 * d
d := t2 + t3
Chapter 3!
t3 := t1 * d
t2 := a - t1
d := t2 + t3
Note that normal-form blocks permit all"
statement interchanges that are possible!
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
47
1!
15!
Algebraic Transformations!
Lexical
Analysis
and"operations to transform
•! Change
arithmetic
blocks to algebraic
equivalent forms!
Lexical Analyzer
Generators!
Chapter
3!
t1 := a
- a
t2 := b + t1
t3 := 2 * t2
t1 := 0
t2 := b
t3 := t2 << 1
COP5621 Compiler Construction"
Copyright Robert van Engelen, Florida State University, 2007-2009!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
48
Alias analysis
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
49
Alias analysis -- illustration
p.foo = 1;
q.foo = 2;
i = p.foo + 3;
Knowing whether or not two pointer variables alias
each other may heavily influence optimizations.
There are three possible alias cases here:
1. The variables p and q cannot alias.
2. The variables p and q must alias.
3. It cannot be conclusively determined at compile time if p and q alias or not.
If p and q cannot alias, then i = p.foo + 3; can be changed to i = 4. If p and q
must alias, then i = p.foo + 3; can be changed to i = 5. In both cases, we are
able to perform optimizations from the alias knowledge. On the other hand, if it is not
known if p and q alias or not, then no optimizations can be performed and the whole of
the code must be executed to get the result. Two memory references are said to have a
may-alias relation if their aliasing is unknown.
[http://en.wikipedia.org/wiki/Alias_analysis, 4 Dec 2012]
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
50
Alias analysis
Alias analysis is a technique in compiler theory, used to determine if
a storage location may be accessed in more than one way. Two
pointers are said to be aliased if they point to the same location.
Alias analysis techniques are usually classified by flow-sensitivity
and context-sensitivity. They may determine may-alias or must-alias
information. The termalias analysis is often used interchangeably
with term points-to analysis, a specific case.
[http://en.wikipedia.org/wiki/Alias_analysis, 4 Dec 2012]
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
51
Concluding remarks
Basic concepts:
P code, SSA, CFG, alias analysis
Typical optimizations:
constant folding, common subexpression elimination
Compiler architecture:
lexer, parser, AST, IR, optimization, code generation
Compiler implementation frameworks:
LLVM
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
52
Code generation and intermediate
representations in practice -- The LLVM
experience. (Overview and Demo)
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
53
LLVM overview
[The LLVM Compiler Framework and Infrastructure, LCPC Tutorial, 2004 by Chris Lattner]
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
54
!!"#$%&'()*+,$-./0+'$
•  !!"#$1$!&2$!+3+*$"),045*$#567)8+$$
•  97+$!!"#$%&'()*+,$:8;,5/0,4604,+$
–  <,&3)=+/$,+4/5>*+$6&'(&8+80/$;&,$>4)*=)8?$6&'()*+,/$
–  @+=46+$07+$A'+B6&/0$0&$>4)*=$5$8+2$6&'()*+,$
–  C4)*=$/05A6$6&'()*+,/D$E:9/D$0,56+F>5/+=$&(A')G+,/D$HHH$
•  97+$!!"#$%&'()*+,$I,5'+2&,J$
–  K8=F0&F+8=$6&'()*+,/$4/)8?$07+$!!"#$)8;,5/0,4604,+$
–  %$58=$%LL$5,+$,&>4/0$58=$5??,+//)3+M$
•  E535D$-67+'+$58=$&07+,/$5,+$)8$=+3+*&('+80$
–  K')0$%$6&=+$&,$85A3+$6&=+$;&,$NOPD$-(5,6D$<&2+,<%$
%7,)/$!5Q8+,$R$lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
55
!"#$$%&#'()#*%++,-%./(&/0$012%
•  !"$%++,-%!"#$%&'()*+$#%,-.*(/0$(
–  !"$%./((/0%3)045)4$6%)07%1)#4$16'07$&$07$01%89%
–  801$#0)3%:89;%)07%$<1$#0)3%:&$#2'21$01;%#$&#$2$01)=/0%
•  >%./33$.=/0%/?%@$336'01$4#)1$7%3'A#)#'$2%
–  >0)3*2$2B%/&=('C)=/02B%./7$%4$0$#)1/#2B%D8!%./(&'3$#B%
4)#A)4$%./33$.=/0%25&&/#1B%&#/E3'04B%F%
•  >%./33$.=/0%/?%1//32%A5'31%?#/(%1"$%3'A#)#'$2%
–  >22$(A3$#2B%)51/()=.%7$A544$#B%3'0G$#B%./7$%4$0$#)1/#B%
./(&'3$#%7#'H$#B%(/753)#%/&=('C$#B%F%
I"#'2%+)J0$#%K%lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
56
!"#$%%&'$()(**$(+,-./#0$
•  10+,$2"#$".3"$/#4#/5$.2$.6$7$62789709$:+,-./#0;$
–  (+,-7<=/#$>.2"$62789709$,7?#@/#6$
–  A6#6$B(($CDE$($789$(**$-706#0$
C file
llvmgcc
.o file
C++ file
llvmg++
.o file
Compile Time
llvm linker
executable
Link Time
•  F.6<83G.6".83$H#72G0#6;$
–  A6#6$%%&'$+-<,.I#065$8+2$B(($+-<,.I#06$
–  D+$@/#6$:+827.8$%%&'$JK)=L2#:+9#5$8+2$,7:".8#$:+9#$
–  MN#:G27=/#$:78$=#$=L2#:+9#$OPJ! 9Q$+0$,7:".8#$:+9#$
("0.6$%7R8#0$S$lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
57
!"#$%%&'$()(**$(+,-./#0$12+345$
64738708$2+,-./#0$+0973.:7;+3<$=".2"$>?#?$%%&'$7?$
,.8/#@#/$ABC$
D$%739>79#$?-#2.E2$F0+34G#38$/+=#0?$2+8#$4+$%%&'$AB$
D$%739>79#)4709#4$.38#-#38#34$+-;,.:#0?$.,-0+@#$2+8#$
D$(+8#$9#3#074+0$2+3@#04?$%%&'$2+8#$4+$4709#4$1#H9H$
AIJK5$2+8#$
("0.?$%7L3#0$D$lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
58
!"##$#%&'()*+,'-&).%&+./*/0/#&
1/#2$3'.&"2'&/4&567.'4'.'#8'&+).)*'9'.2-&
!"#$%&''(()%*"+#$!"#$,-.$/$
$$$0(#10"$-234$
5$
!"#$%&''(0).$/$
$$$0(#10"$%&''(()6.4$
5$
?($@&"#A$
!"#$%&''(()!"#$-.$/$
$$$0(#10"$-234"
5$
!"#$%&''(0).$/$
$$$0(#10"$%&''(()6.4$
5$
%*89!'(+$#*$
!"#$%&''(()%*"+#$!"#$7-.$/$
$$$0(#10"$7-234$$$!!"#$#%&'"(%)*"
5$
!"#$%&''(0).$/$
$$$!"#$#894$$$$$$$$$&!!"+,)-."%/0$-,$
$$$#89$:$64$$$$$$$$&!!"#$#%&'"+,%&$"
$$$0(#10"$%&''((),#89.4$
5$
! ;'!8!"&#(<$'*&<$!"$%&''(($
! ;'!8!"&#(<$+#*0($!"$%&''(0$
! ;'!8!"&#(<$+#&%=$+'*#$>*0$ #89 $
1:.$2&;)<#'.&=&lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
59
!"#$%&$'"%&$"()*+$
•  ,-./%)-&$%0'-)1)23-*/)(4$(0(4#&%&5$
–  6/&'$3"(07-$'"-$1)2'2'#1-$28$'"-$3(44--$
–  6/&'$/1*('-$(44$3(44$&%'-&$!$9-$:/&'$!"#$$(44$3(44-)&$
–  !"('$(;2/'$3(44-)&$2/'&%*-$'"-$')(0&4(<20$/0%'+$
•  ,-./%)-&$(4%(&$(0(4#&%&5$
–  ,-8-)-03-$32/4*$(4%(&$2'"-)$12%0'-)&$%0$3(44--$
–  6/&'$=029$'"('$42(*-*$>(4/-$*2-&0 '$3"(07-$8)2:$
8/03<20$-0')#$'2$'"-$42(*$
–  6/&'$=029$'"-$12%0'-)$%&$02'$;-%07$&'2)-*$'")2/7"$
•  ,-8-)-03-$:%7"'$02'$;-$'2$($&'(3=$2;?-3'@$
A")%&$B(C0-)$D$lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
60
!""#$%&'$%("')*)%(+',('-"./$0)12.)'
3'40)'
00*.&--'
3'("'!!=7'
M9"%()%8'
--O '
!!=7'>?'
E,9+)9'
5"'40)'
366'40)'
00*.&66'
5"'40)'
3"./$0)12.)'
J/2.$K)9'
366'("'!!=7'
M9"%()%8'
3"./$0)12.)'
J/2.$K)9'
&--,+ '
--O/0P+ '
&--,+ '
!!=7'
=)9$4)9'
7"8$4)8'*)9+$"%'":';33'
<.$(+'!!=7'>?',+'()@('40)'
!"A)9+'3'BCD'("'!!=7'
FG'!!=7'B%,0H+$+'I'
J/2.$K,2"%'E,++)+'
!!=7'5L-'
M$0)'N9$()9'
7"8$4)8'*)9+$"%'":';66'
<.$(+'!!=7'>?',+'()@('40)'
!"A)9+'366'BCD'("'!!=7'
Q),8';0"L,0'<0$.$%,2"%R'>E'3"%+(,%('E9"/,&,2"%R'Q),8'B9&P.)%('
<0$.$%,2"%R'>%0$%$%&R'?),++"-$,2"%R'!>37R'!""/'J/(+R'7)."9H'
E9"."2"%R'Q),8'C("9)'<0$.$%,2"%R'BQ3<R'ST'
3U9$+'!,V%)9'W'lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
61
!""#$%&'$%("')*)%(+',('-$%#./0)'
1"'2-)'
1"'2-)'
1"'2-)'
1"'2-)'
!!@A'
!$%#)3'
--*0'-$%#)3'
)4)56(,7-)'
!$%#./0)'
B</0$C)3'
HI'!!@A'J%,-K+$+'L'
B</0$C,/"%'M,++)+'
B</"%,--K' $%()3%,-$C)+ N'
0,3#+'0"+('D6%5/"%+',+'
$%()3%,-O'("'$0<3"*)'FMB'
M)3D)5('<-,5)'D"3',3&60)%('
<3"0"/"%'"</0$C,/"%P'
175'2-)'D"3'!!@A'EFG'
8,/*)'9":)'
;,5#)%:'
--5 '
9'9":)'
;,5#)%:'
--5'=0,35>?5 '
8,/*)')4)56(,7-)'
9'9"0<$-)3'
&55 '
8,/*)'
)4)56(,7-)'
!$%#'$%'%,/*)'1"'2-)+'
,%:'-$73,3$)+'>)3)'
9>3$+'!,Q%)3'='lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
62
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
63
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
64
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
65
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
66
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
67
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
68
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
69
!"#$%&'()*%+$,-().*%/$0-$1123$
!"#$%&''(()%*"+#$!"#$,-.$/$
$$$0(#10"$,-234$$$!!"#$%&"
5$
!"#$%&''(0).$/$
$$$!"#$64$$$$$$$$$$!!"$'"()%*+$
$$$6$7$84$$$$$$$$$!!"()$,-"
$$$0(#10"$%&''(()96.4$
5$
!"#(0"&'$!"#$:%&''(()!"#,$:-.$/$
$$$:#;<=3$7$'*&>$!"#,$:-$
$$$:#;<=?$7$&>>$!"#$:#;<=3@$3$
$$$0(#$!"#$:#;<=?"
5$
!"#$:%&''(0).$/$
$$$:6$7$&''*%&$!"#$
$$$+#*0($!"#$8@$!"#,$:6$
$$$:#;<=A$7$%&''$!"#$:%&''(()!"#,$:6.$
$$$0(#$!"#$:#;<=A$
5$
1.75%#$
9**$*-'/8:80-#%8$'#%$
.70%#7'*.<%8 $
40',5$'**-,'6-7$.8$%&)*.,.0$
(-80$="7,6-78$.7$(-80$
%&)*.,.0$.7$0;%$1123$
.7$1123$
#%)#%8%70'6-7$
,'8%8$
>;#.8$1'?7%#$@$lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
70
!"#$%&'()*%+$,%-.#%,$/#'0-12#('320$
!"#$%"&'(!"#()*&''$$+!"#=(),/(0(
((()#12-7(4(':&5(!"#=(),(
((()#12-3(4(&55(!"#()#12-76(7(
(((%$#(!"#()#12-3!
8(
!"#()*&''$%+/(0(
((()9(4(&'':*&(!"#(
(((;#:%$(!"#(<6(!"#=()9(
((()#12->(4(*&''(!"#()*&''$$+!"#=()9/(
(((%$#(!"#()#12->(
8(
!/5%#$/#'0-12#('320$
:),'/%$'**$8'**$-./%-$21$
45'06%$/5%$)#2/2/7)%$12#$
90-%#/$*2',$.0-/#"8320-$
;<(%(=#%6>$8*%'0-$")$
8'**%% $
.0/2$'**$8'**%#-$
/5%$1"08320$
/5%$#%-/$
!"#$%"&'(!"#()*&''$$+!"#(),-.&'/(0(
((()#12-3(4(&55(!"#(),-.&'6(7(
(((%$#(!"#()#12-3!
8(
!"#()*&''$%+/(0(
((()9(4(&'':*&(!"#(
(((;#:%$(!"#(<6(!"#=()9(
((()#12-7(4(':&5(!"#=()9(
((()#12->(4(*&''(!"#()*&''$$+)#12-7/(
(((%$#(!"#()#12->(
8(
!"#()*&''$%+/(0(
((()#12->(4(*&''(!"#()*&''$$+!"#(</(
(((%$#(!"#()#12->(
8(
45#.-$?'@0%#$A$lattner@cs.uiuc.edu
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
71
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
72
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
73
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
74
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
75
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
76
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
77
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
78
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
79
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
80
© 2012 Ralf Framework
Lämmel, Softwareand
Language
Engineering http://softlang.wikidot.com/course:sle
[The LLVM Compiler
Infrastructure,
LCPC Tutorial, 2004 by Chris Lattner]
81
Various omissions
Exception handling, string literals
Tools, applications, benchmarks
...
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
82
LLVM
The LLVM Compiler Infrastructure (Homepage)
Getting Started with the LLVM System (How to install ...?)
Kaleidoscope: Implementing a Language with LLVM
llvm/build/examples/Kaleidoscope
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
83
Implementing a language
with LLVM
Sample code (e
xtracted from LL
VM user guide)
https://github.
available online
com/slecourse/
:
slecourse/tree/
master/sources
/llv
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
m
84
Kaleidoscope: Implementing a Language with LLVM
LLVM Tutorial: Table of Contents
Implementing a Parser and AST
Implementing Code Generation to LLVM IR
Adding JIT and Optimizer Support
Extending the language: control flow
Extending the language: user-defined operators
ml
.ht
dex
/in
:
ce rial
ur uto
So cs/t
do
g/
.o r
vm
/ll
p:/
htt
Tutorial Introduction and the Lexer
Extending the language: mutable variables / SSA construction
Conclusion and other useful LLVM tidbits
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
85
Kaleidoscope
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
86
Recursive Fibonacci
# Compute the x'th fibonacci number.
def fib(x)
if x < 3 then
1
else
fib(x-1)+fib(x-2)
# This expression will compute the 40th number.
fib(40)
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
87
Iterative Fibonacci
def binary : 1 (x y) y;
User-defined
sequencing operator
def fibi(x)
var a = 1, b = 1, c in
(for i = 3, i < x in
c=a+b:
a=b:
Use of mutable variables.
b = c) :
b;
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
88
A simple parser
for Kaleidoscope
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
89
A simple parser for Kaleidoscope
A simple lexer
A recursive descent parser
Plain old C++ objects for AST
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
90
A simple lexer
if (isdigit(LastChar) || LastChar == '.') {
// Number: [0-9.]+
std::string NumStr;
do {
NumStr += LastChar;
LastChar = getchar();
} while (isdigit(LastChar) || LastChar == '.');
NumVal = strtod(NumStr.c_str(), 0);
return tok_number;
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
91
A recursive descent parser
/// primary
///
::= identifierexpr
///
::= numberexpr
///
::= parenexpr
static ExprAST *ParsePrimary() {
switch (CurTok) {
default: return Error("unknown token when expecting an expression");
case tok_identifier: return ParseIdentifierExpr();
case tok_number:
return ParseNumberExpr();
case '(':
return ParseParenExpr();
}
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
92
Plain old C++ objects for AST
/// NumberExprAST - Expression class for numeric literals like "1.0".
class NumberExprAST : public ExprAST {
double Val;
public:
NumberExprAST(double val) : Val(val) {}
};
/// numberexpr ::= number
static ExprAST *ParseNumberExpr() {
ExprAST *Result = new NumberExprAST(NumVal);
getNextToken(); // consume the number
return Result;
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
93
A simple code generator
for Kaleidoscope
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
94
A simple code generator for Kaleidoscope
Add virtual code generation methods to AST classes
Expressions return SSA registers (“Value” in LLVM)
Use some house-keeping data structures
Use uniqued IR, e.g., for literals
Use of name hints for intermediate results
Functions return, well, functions (“Function” in LLVM)
Use an error method to report, for example, missing decls
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
95
Add virtual code generation methods to AST classes
/// ExprAST - Base class for all expression nodes.
class ExprAST {
public:
virtual ~ExprAST() {}
virtual Value *Codegen() = 0;
};
/// NumberExprAST - Expression class for numeric literals like "1.0".
class NumberExprAST : public ExprAST {
double Val;
public:
NumberExprAST(double val) : Val(val) {}
virtual Value *Codegen();
};
...
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
96
Use some house-keeping data structures
Scope for
functions
The builder
used for code
generation
static Module *TheModule;
static IRBuilder<> Builder(getGlobalContext());
static std::map<std::string, Value*> NamedValues;
Remember values for
variables (parameters)
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
97
Floating point literals
Arbitrary precision
numbers
Value *NumberExprAST::Codegen() {
return ConstantFP::get(getGlobalContext(), APFloat(Val));
}
Uniqued IR for
numbers; no “new”!
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
98
Variable references
Value *VariableExprAST::Codegen() {
// Look this variable up in the function.
Value *V = NamedValues[Name];
return V ? V : ErrorV("Unknown variable name");
}
Mapping to be initialized by
code for function declaration.
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
99
Binary expressions
Value *BinaryExprAST::Codegen() {
Value *L = LHS->Codegen();
Value *R = RHS->Codegen();
if (L == 0 || R == 0) return 0;
Use of name hints for
intermediate results
switch (Op) {
case '+': return Builder.CreateFAdd(L, R, "addtmp");
case '-': return Builder.CreateFSub(L, R, "subtmp");
case '*': return Builder.CreateFMul(L, R, "multmp");
case '<':
L = Builder.CreateFCmpULT(L, R, "cmptmp");
// Convert bool 0/1 to double 0.0 or 1.0
return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
"booltmp");
default: return ErrorV("invalid binary operator");
}
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
100
Function calls
Value *CallExprAST::Codegen() {
// Look up the name in the global module table.
Function *CalleeF = TheModule->getFunction(Callee);
if (CalleeF == 0)
return ErrorV("Unknown function referenced"); Get
// If argument mismatch error.
if (CalleeF->arg_size() != Args.size())
return ErrorV("Incorrect # arguments passed");
pointer to
function
std::vector<Value*> ArgsV;
for (unsigned i = 0, e = Args.size(); i != e; ++i) {
ArgsV.push_back(Args[i]->Codegen());
if (ArgsV.back() == 0) return 0;
Generate code
}
for
arguments
return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
101
Function prototypes
Function *PrototypeAST::Codegen() {
// Make the function type: double(double,double) etc.
std::vector<Type*> Doubles(Args.size(),
Type::getDoubleTy(getGlobalContext()));
FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
Doubles, false);
Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
... error handling omitted ...
// Set names for all arguments.
unsigned Idx = 0;
for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
++AI, ++Idx) {
AI->setName(Args[Idx]);
// Add arguments to variable symbol table.
NamedValues[Args[Idx]] = AI;
}
return F;
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
102
Function definitions
Function *FunctionAST::Codegen() {
NamedValues.clear();
Function *TheFunction = Proto->Codegen();
if (TheFunction == 0) return 0;
BasicBlock *BB = BasicBlock::Create(
getGlobalContext(), "entry", TheFunction);
Builder.SetInsertPoint(BB);
Create a new block
if (Value *RetVal = Body->Codegen()) {
// Finish off the function.
Builder.CreateRet(RetVal);
verifyFunction(*TheFunction);
return TheFunction;
}
TheFunction->eraseFromParent();
return 0;
}
Generate code for body
Perform verification
Error in body.
Remove function
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
103
Some optimizations
for Kaleidoscope
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
104
Trivial constant folding
Kaleidoscope def test(x) 1+2+x;
Relies on basic
optimization during
IR construction
IR define double @test(double %x) {
entry:
%addtmp = fadd double 3.000000e+00, %x
ret double %addtmp
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
105
Missing constant folding
Kaleidoscope def test(x) (1+2+x)*(x+(1+2));
IR define double @test(double %x) {
entry:
%addtmp = fadd double 3.000000e+00, %x
%addtmp1 = fadd double %x, 3.000000e+00
%multmp = fmul double %addtmp, %addtmp1
ret double %multmp
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
x+3
computed
twice!
106
Missing constant folding
Kaleidoscope def test(x) (1+2+x)*(x+(1+2));
IR define double @test(double %x) {
entry:
%addtmp = fadd double 3.000000e+00, %x
%multmp = fmul double %addtmp, %addtmp
ret double %multmp
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
Only
computed
once!
107
Register per-functions optimizations
FunctionPassManager OurFPM(TheModule);
// Basic setup logic.
OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
// Provide basic AliasAnalysis support for GVN.
OurFPM.add(createBasicAliasAnalysisPass());
// Do simple "peephole" optimizations and bit-twiddling optzns.
OurFPM.add(createInstructionCombiningPass());
// Reassociate expressions.
OurFPM.add(createReassociatePass());
// Eliminate Common SubExpressions.
OurFPM.add(createGVNPass());
// Simplify the control flow graph (deleting unreachable blocks, etc).
OurFPM.add(createCFGSimplificationPass());
OurFPM.doInitialization();
TheFPM = &OurFPM; // Inform code generation via global.
MainLoop(); // Run the main "interpreter loop" now.
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
108
Revised function definitions
if (Value *RetVal = Body->Codegen()) {
// Finish off the function.
Builder.CreateRet(RetVal);
// Validate the generated code, checking for consistency.
verifyFunction(*TheFunction);
// Optimize the function.
TheFPM->run(*TheFunction);
return TheFunction;
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
109
JIT-based execution/evaluation
for Kaleidoscope
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
110
Wanted!
$ ./toy
ready> 4+5;
ready> Read top-level expression:
define double @0() {
entry:
ret double 9.000000e+00
}
Evaluated to 9.000000
ready>
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
111
Create an IR interpreter
static ExecutionEngine *TheExecutionEngine;
...
int main() {
..
// Create the JIT. This takes ownership of the module.
TheExecutionEngine = EngineBuilder(TheModule).create();
..
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
112
Original main loop
/// top ::= definition | external | expression | ';'
static void MainLoop() {
while (1) {
fprintf(stderr, "ready> ");
switch (CurTok) {
case tok_eof:
return;
case ';':
getNextToken(); break; // ignore top-level “;”.
case tok_def:
HandleDefinition(); break;
case tok_extern: HandleExtern(); break;
default:
HandleTopLevelExpression(); break;
}
}
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
113
Eval top-level expression past parsing
static void HandleTopLevelExpression() {
// Evaluate a top-level expression into an anonymous function.
if (FunctionAST *F = ParseTopLevelExpr()) {
if (Function *LF = F->Codegen()) {
LF->dump();
// Dump the function for exposition purposes.
// JIT the function, returning a function pointer.
void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
// Cast it to the right type (no args, returns a double).
// Thus, call it as a native function.
double (*FP)() = (double (*)())(intptr_t)FPtr;
fprintf(stderr, "Evaluated to %f\n", FP());
}
... continue with parsing ...
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
114
Control flow
for Kaleidoscope
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
115
Wanted!
IR
define double @fib(double %x) {
entry:
%cmptmp = fcmp ult double %x, 3.000000e+00
br i1 %cmptmp, label %ifcont, label %else
def fib(x)
if x < 3
then 1
else fib(x-1)+fib(x-2)
else:
; preds = %entry
%subtmp = fadd double -1.000000e+00, %x
%calltmp = call double @fib(double %subtmp)
%subtmp1 = fadd double -2.000000e+00, %x
%calltmp2 = call double @fib(double %subtmp1)
%addtmp = fadd double %calltmp, %calltmp2
br label %ifcont
Kaleidoscope
ifcont:
; preds = %entry, %else
%iftmp = phi double [ %addtmp, %else ], [ 1.000000e+00, %entry ]
ret double %iftmp
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
116
Extend lexer (trivial)
tok_if = -6, tok_then = -7, tok_else = -8,
...
if (IdentifierStr == "def") return tok_def;
if (IdentifierStr == "extern") return tok_extern;
if (IdentifierStr == "if") return tok_if;
if (IdentifierStr == "then") return tok_then;
if (IdentifierStr == "else") return tok_else;
return tok_identifier;
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
117
Extend AST (trivial)
/// IfExprAST - Expression class for if/then/else.
class IfExprAST : public ExprAST {
ExprAST *Cond, *Then, *Else;
public:
IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
: Cond(cond), Then(then), Else(_else) {}
virtual Value *Codegen();
};
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
118
Extend parser (trivial)
/// ifexpr ::= 'if' expression 'then' expression 'else' expression
static ExprAST *ParseIfExpr() {
getNextToken(); // eat the if.
ExprAST *Cond = ParseExpression();
if (!Cond) return 0;
if (CurTok != tok_then) return Error("expected then");
getNextToken(); // eat the then
ExprAST *Then = ParseExpression();
if (Then == 0) return 0;
if (CurTok != tok_else) return Error("expected else");
getNextToken();
ExprAST *Else = ParseExpression();
if (!Else) return 0;
return new IfExprAST(Cond, Then, Else);
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
119
Extend parser (trivial)
static ExprAST *ParsePrimary() {
switch (CurTok) {
default: return Error("unknown token when expecting an expression");
case tok_identifier: return ParseIdentifierExpr();
case tok_number:
return ParseNumberExpr();
case '(':
case tok_if:
return ParseParenExpr();
return ParseIfExpr();
}
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
120
Code generation for if-then-else 1/5
Value *IfExprAST::Codegen() {
Value *CondV = Cond->Codegen();
if (CondV == 0) return 0;
// Convert condition to a bool by comparing equal to 0.0.
CondV = Builder.CreateFCmpONE(CondV,
ConstantFP::get(getGlobalContext(), APFloat(0.0)),
"ifcond");
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
121
Code generation for if-then-else 2/5
Function *TheFunction = Builder.GetInsertBlock()->getParent();
// Create blocks for the then and else cases.
// Insert the 'then' block at the end of the function.
BasicBlock *ThenBB = BasicBlock::Create(
getGlobalContext(), "then", TheFunction);
BasicBlock *ElseBB = BasicBlock::Create(
getGlobalContext(), "else");
BasicBlock *MergeBB = BasicBlock::Create(
getGlobalContext(), "ifcont");
Builder.CreateCondBr(CondV, ThenBB, ElseBB);
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
122
Code generation for if-then-else 3/5
// Emit then value.
Builder.SetInsertPoint(ThenBB);
Value *ThenV = Then->Codegen();
if (ThenV == 0) return 0;
Builder.CreateBr(MergeBB);
// Codegen of 'Then' can change the current block.
// Update ThenBB for the PHI.
ThenBB = Builder.GetInsertBlock();
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
123
Code generation for if-then-else 4/5
// Emit else block.
TheFunction->getBasicBlockList().push_back(ElseBB);
Builder.SetInsertPoint(ElseBB);
Value *ElseV = Else->Codegen();
if (ElseV == 0) return 0;
Builder.CreateBr(MergeBB);
// Codegen of 'Else' can change the current block.
// Update ElseBB for the PHI.
ElseBB = Builder.GetInsertBlock();
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
124
Code generation for if-then-else 5/5
// Emit merge block.
TheFunction->getBasicBlockList().push_back(MergeBB);
Builder.SetInsertPoint(MergeBB);
PHINode *PN = Builder.CreatePHI(
Type::getDoubleTy(getGlobalContext()), 2,
"iftmp");
PN->addIncoming(ThenV, ThenBB);
PN->addIncoming(ElseV, ElseBB);
return PN;
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
125
User-defined operators
for Kaleidoscope
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
126
User-defined operators
Treat operators like functions
Use of precedence parsing
Use of external functions
Demo
http://llvm.org/docs/tutorial/LangImpl6.html
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
127
Mutable variables for
for Kaleidoscope
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
128
Wanted! Iterative Fibonacci
def binary : 1 (x y) y;
User-defined
sequencing operator
def fibi(x)
var a = 1, b = 1, c in
(for i = 3, i < x in
c=a+b:
a=b:
Use of mutable variables.
b = c) :
b;
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
129
SSA revisited
We need SSA for optimizations.
LLVM IR supports SSA.
Functional Kaleidoscope implied SSA trivially.
Let’s add mutable variables and see how IR is retained.
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
130
Illustrative C example
int G, H;
int test(_Bool Condition) {
int X;
if (Condition)
X = G;
The value of X obviously
else
depends on the executed path.
X = H;
return X;
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
131
Desired SSA
@G = weak global i32 0
@H = weak global i32 0
; type of @G is i32*
; type of @H is i32*
define i32 @test(i1 %Condition) {
entry:
br i1 %Condition, label %cond_true, label %cond_false
cond_true:
%X.0 = load i32* @G
br label %cond_next
cond_false:
%X.1 = load i32* @H
br label %cond_next
Extra merge step:
select the right value to use
based on where control flow is
coming from.
cond_next:
%X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
ret i32 %X.2
}
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
132
Desired SSA
SSA required (by LLVM) for registers.
Too difficult to generate in a frontend.
SSA not required (by LLVM) for memory:
Model mutable variables via stack.
Rely on optimization mem2reg.
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
133
Mutable variables based on stack alloca
@G = weak global i32 0
@H = weak global i32 0
; type of @G is i32*
; type of @H is i32*
define i32 @test(i1 %Condition) {
entry:
%X = alloca i32
; type of %X is i32*.
br i1 %Condition, label %cond_true, label %cond_false
cond_true:
%X.0 = load i32* @G
store i32 %X.0, i32* %X
br label %cond_next
cond_false:
%X.1 = load i32* @H
store i32 %X.1, i32* %X
br label %cond_next
cond_next:
%X.2 = load i32* %X
ret i32 %X.2
}
; Update X
1.
2.
3.
4.
Each mutable variable becomes a stack allocation.
Each read of the variable becomes a load from the stack.
Each update of the variable becomes a store to the stack.
The variable represents for the stack address.
; Update X
; Read X
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
134
Revision
static std::map<std::string, Value*> NamedValues;
static std::map<std::string, AllocaInst*> NamedValues;
Add allocations at the entry block of each function.
Codegen for variables reference needs to load from stack.
Codegen for assginments needs to store on stack.
© 2012 Ralf Lämmel, Software Language Engineering http://softlang.wikidot.com/course:sle
135
Download