Lecture 15: Putting it all together
From parsing to code generation
Saman Amarasinghe
Fall 2005
6.035 ©MIT Fall 1998

How to make the computer understand?
• Write a program using a programming language
• Microprocessors talk in assembly language
[Figure: a program written in a programming language is translated by the compiler into assembly language]

Example (input program)
int expr(int n)
{
    int d;
    d = 4 * n * n * (n + 1) * (n + 1);
    return d;
}

Example (Output assembly code)
Unoptimized Code
expr:
.LFB2:
        pushq   %rbp
.LCFI0:
        movq    %rsp, %rbp
.LCFI1:
        movl    %edi, -4(%rbp)
        movl    -4(%rbp), %eax
        movl    %eax, %edx
        imull   -4(%rbp), %edx
        movl    -4(%rbp), %eax
        incl    %eax
        imull   %eax, %edx
        movl    -4(%rbp), %eax
        incl    %eax
        imull   %edx, %eax
        sall    $2, %eax
        movl    %eax, -8(%rbp)
        movl    -8(%rbp), %eax
        leave
        ret

Optimized Code
expr:
.LFB2:
        movl    %edi, %eax
        imull   %edi, %eax
        incl    %edi
        imull   %edi, %eax
        imull   %edi, %eax
        sall    $2, %eax
        ret

Anatomy of a Computer
Program (character stream)
  → Lexical Analyzer (Scanner) → Token Stream
  → Syntax Analyzer (Parser) → Parse Tree
  → Semantic Analyzer → Intermediate Representation
  → Code Optimizer → Optimized Intermediate Representation
  → Code Generator → Assembly code

Lexical Analysis
• Lexical analyzer creates tokens out of a text stream
• Tokens are defined using regular expressions
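The scanner stage can be sketched in a few lines. This is a minimal illustration, assuming Python's standard `re` module; the token names (`NUM`, `ID`, `OP`, `LPAREN`, `RPAREN`) and the `tokenize` helper are hypothetical and not taken from the lecture. Each token class is defined by one regular expression, and the function turns a character stream into a token stream.

```python
import re

# Hypothetical token classes; each one is defined by a regular expression.
TOKEN_SPEC = [
    ("NUM",    r"[0-9]+"),
    ("ID",     r"[A-Za-z_][A-Za-z_0-9]*"),
    ("OP",     r"[+\-*/=]"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP",   r"\s+"),          # whitespace is matched but not emitted
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Turn a character stream into a list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("d = 4 * n * (n + 1)"))
```

A production scanner would usually be generated from the same kind of specification by a tool such as lex or jlex rather than written by hand.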
Examples of Regular Expressions

Regular Expression                   Strings matched
a                                    “a”
a·b                                  “ab”
a|b                                  “a” “b”
ε                                    “”
a*                                   “” “a” “aa” “aaa” …
(a | ε) · b                          “ab” “b”
num = 0|1|2|3|4|5|6|7|8|9            “0” “1” “2” “3” …
posint = num · num*                  “8” “6035” …
int = (ε | -) · posint               “-42” “1024” …
real = int · (ε | (. · posint))      “-12.56” “12” “1.414” …
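The lecture notes that executing a DFA is straightforward. As a hedged illustration, here is a hand-built DFA for the `real` definition in the table above. The state names (`start`, `sign`, `int`, `dot`, `frac`) and the `matches_real` helper are assumptions made for this sketch, and the automaton was written by hand rather than derived by the NFA-to-DFA construction.

```python
DIGITS = "0123456789"

# Hand-built DFA for: real = int · (ε | (. · posint))
# with int = (ε | -) · posint and posint = num · num*.
# A missing (state, character) entry means the input is rejected.
TRANSITIONS = {
    ("start", "-"): "sign",   # optional leading minus
    ("int", "."): "dot",      # optional fractional part begins
}
for d in DIGITS:
    TRANSITIONS[("start", d)] = "int"
    TRANSITIONS[("sign", d)] = "int"
    TRANSITIONS[("int", d)] = "int"
    TRANSITIONS[("dot", d)] = "frac"
    TRANSITIONS[("frac", d)] = "frac"

ACCEPTING = {"int", "frac"}


def matches_real(s):
    """Run the DFA over s: one table lookup per input character."""
    state = "start"
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING


for text in ["-12.56", "12", "1.414", "12.", ".5", "-"]:
    print(repr(text), matches_real(text))
```

The loop does constant work per character and never backtracks, which is why scanners are normally run as DFAs.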
Lexical Analysis
• Lexical analyzer creates tokens out of a text stream
• Tokens are defined using regular expressions
• Regular expressions can be mapped to Nondeterministic Finite Automata (NFA)
  – by simple construction
• NFA is transformed to a DFA
  – Transformation algorithm
  – Executing a DFA is straightforward

Syntax Analysis (parsing)
• Defining a language using context-free grammars
• Classification of Grammars
[Figure: nested classes of grammars, outermost to innermost: Context free, unambiguous, LR(k), LR(1), LALR(1), SLR(1), LR(0); regular is also marked]

Syntax Analysis (parsing)
• Defining a language using context-free grammars
Example: A CFG for expressions
<expr> → <expr> <op> <expr>
<expr> → ( <expr> )
<expr> → - <expr>
<expr> → num
<op> → +
<op> → *

Parse Tree Example
num ‘*’ ‘(’ num ‘+’ num ‘)’

<expr>
    <expr>
        num
    <op>
        *
    <expr>
        (
        <expr>
            <expr>
                num
            <op>
                +
            <expr>
                num
        )
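The parse tree for `num * ( num + num )` can be reproduced with a small hand-written parser. This is only a sketch: the expression grammar as written is ambiguous and left-recursive, so the code below restructures it into a loop over operands and picks one left-to-right parse with no operator precedence. The `parse` function and the nested-tuple tree format are assumptions of this example, not part of the lecture.

```python
def parse(tokens):
    """Parse a list of token strings and return a parse tree as nested tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def primary():
        tok = peek()
        if tok == "num":
            take()
            return ("<expr>", "num")                    # <expr> → num
        if tok == "(":
            take("(")
            inner = expr()
            take(")")
            return ("<expr>", "(", inner, ")")          # <expr> → ( <expr> )
        if tok == "-":
            take("-")
            return ("<expr>", "-", primary())           # <expr> → - <expr>
        raise SyntaxError(f"unexpected token {tok!r}")

    def expr():
        tree = primary()
        while peek() in ("+", "*"):
            op = ("<op>", take())                       # <op> → +  or  <op> → *
            tree = ("<expr>", tree, op, primary())      # <expr> → <expr> <op> <expr>
        return tree

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree


print(parse(["num", "*", "(", "num", "+", "num", ")"]))
```

Every tuple starts with the nonterminal it derives, followed by its children, so the printed result has the same shape as the tree above. A table-driven LR parser would accept a larger class of grammars than this hand-written approach.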
LR(k) Parser Engine
[Figure by MIT OCW: the parser engine reads the Current Symbol, maintains a Symbol Stack and a State Stack, and chooses the Parser Action from the ACTION and Goto tables]

         ACTION                                        Goto
State    (              )              $              X
s0       shift to s2    error          error          goto s1
s1       error          error          accept
s2       shift to s2    shift to s5    error          goto s3
s3       error          shift to s4    error
s4       reduce (2)     reduce (2)     reduce (2)
s5       reduce (3)     reduce (3)     reduce (3)

[Figure: parser states and their items]
s0:  <S> → • <X> $
     <X> → • <Y>
     <X> → • (
     <Y> → • ( <Y> )
     <Y> → •
s1:  <X> → ( •
     <Y> → ( • <Y> )
     <Y> → • ( <Y> )
     <Y> → •
s2:  <Y> → ( • <Y> )
     <Y> → • ( <Y> )
     <Y> → •
s3:  <Y> → ( <Y> • )
s4:  <Y> → ( <Y> ) •
s5:  <S> → <X> • $
s6:  <X> → <Y> •

         ACTION                                            Goto
State    (              )              $                   X          Y
s0       shift to s1    reduce (5)     reduce (5)          goto s5    goto s6
s1       shift to s2    reduce (5)     reduce (3 or 5)                goto s3
s2       shift to s2    reduce (5)     reduce (5)                     goto s3
s3       error          shift to s4    error
s4       error          reduce (4)     reduce (4)
s5       error          error          accept
s6       error          error          reduce (2)

Semantic Analysis
• Building a symbol table
• Static Checking
  – Flow-of-control checks
  – Uniqueness checks
  – Type checking
• Dynamic Checking
  – Array bounds check
  – Null pointer dereference check

Translation to Intermediate Format
• Goal: Remain Largely Machine Independent But Move Closer to Standard Machine Model
  – From high-level IR to a low-level IR
• Eliminate Structured Flow of Control
• Convert to a Flat Address Space
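To make "Eliminate Structured Flow of Control" concrete, the sketch below lowers a tiny structured IR into a flat list of instructions that use only labels and jumps. The tuple-based statement format, the `ifFalse ... goto` spelling, and the `lower` function are all hypothetical choices for this illustration; the lecture does not prescribe a particular low-level IR.

```python
import itertools


def lower(stmts):
    """Flatten structured statements into labels, jumps, and simple assignments."""
    counter = itertools.count()
    out = []

    def new_label():
        return f"L{next(counter)}"

    def emit_block(block):
        for stmt in block:
            kind = stmt[0]
            if kind == "assign":                  # ("assign", dest, expr_text)
                out.append(f"{stmt[1]} = {stmt[2]}")
            elif kind == "if":                    # ("if", cond, then_block, else_block)
                else_lbl, end_lbl = new_label(), new_label()
                out.append(f"ifFalse {stmt[1]} goto {else_lbl}")
                emit_block(stmt[2])
                out.append(f"goto {end_lbl}")
                out.append(f"{else_lbl}:")
                emit_block(stmt[3])
                out.append(f"{end_lbl}:")
            elif kind == "while":                 # ("while", cond, body_block)
                top_lbl, end_lbl = new_label(), new_label()
                out.append(f"{top_lbl}:")
                out.append(f"ifFalse {stmt[1]} goto {end_lbl}")
                emit_block(stmt[2])
                out.append(f"goto {top_lbl}")
                out.append(f"{end_lbl}:")
            else:
                raise ValueError(f"unknown statement kind {kind!r}")

    emit_block(stmts)
    return out


program = [("while", "i < n", [("assign", "i", "i + 1")])]
print("\n".join(lower(program)))
```

After lowering, nesting is gone: the only control constructs left are labels, an unconditional `goto`, and a conditional branch, which is close to what most machines provide.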
Code Optimizations
• Generate code as good as hand-crafted by a good assembly programmer
• Have stable, robust performance
• Abstract the architecture away from the programmer
  – Exploit architectural strengths
  – Hide architectural weaknesses

Code Optimizations
• Algebraic simplification
• Common subexpression elimination
• Copy propagation
• Constant propagation
• Dead-code elimination
• Register allocation
• Instruction Scheduling
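Two of the optimizations listed above, constant propagation and dead-code elimination, can be sketched on straight-line three-address code. The `(dest, op, a, b)` tuple format and both function names are assumptions made for this example, and the sketch deliberately ignores branches, loops, and side effects, all of which a real optimizer must handle with data-flow analysis.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def propagate_constants(code):
    """Replace variables with known constants and fold constant operations."""
    known = {}                       # variable -> constant value at this point
    result = []
    for dest, op, a, b in code:
        a = known.get(a, a)          # substitute operands that are known constants
        b = known.get(b, b)
        if op == "copy":
            value = a
        elif isinstance(a, int) and isinstance(b, int):
            value = OPS[op](a, b)    # both operands constant: fold now
        else:
            value = None
        if isinstance(value, int):
            known[dest] = value
            result.append((dest, "copy", value, None))
        else:
            known.pop(dest, None)    # dest no longer holds a known constant
            result.append((dest, op, a, b))
    return result


def eliminate_dead_code(code, live_out):
    """Drop instructions whose result is never used (straight-line code only)."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(x for x in (a, b) if isinstance(x, str))
            kept.append((dest, op, a, b))
    kept.reverse()
    return kept


code = [
    ("a", "copy", 4, None),
    ("b", "*", "a", "n"),
    ("c", "+", "a", 1),
    ("d", "*", "b", "n"),
    ("unused", "+", "c", "c"),
]
optimized = eliminate_dead_code(propagate_constants(code), live_out={"d"})
for instr in optimized:
    print(instr)
```

The passes reinforce each other: once the constant `4` is substituted into the use of `a`, the assignment to `a` itself becomes dead and is removed by the second pass.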
Compiler Project!
• You guys build a full-blown compiler from the ground up!!!
• From decaf to working code

Compiler Derby
• Who has the fastest compiler in the east???
• Will give you the program 12 hours in advance
  – Test and make all the optimizations work
  – DO NOT ADD PROGRAM SPECIFIC HACKS!
• Wednesday, December 14th at 11:00AM, location TBA
  – refreshments provided
How will you use 6.035 knowledge?
• As an informed programmer
• As a designer of simple languages to aid other
programming tasks
• As an engineer dealing with new computer
architectures
• As a true compiler hacker
1. Informed Programmer
• Now you know what the compiler is doing
– don’t treat it as a black box
– don’t trust it to do the right thing!
• Implications
– performance
– debugging
– correctness
1. Informed Programmer
• What did you learn in 6.035?
  – How optimizations work or why they did not work
  – How to read and understand optimized code

2. Language Extensions
• In many applications and systems, you may need to:
  – implement a simple language
    • handle input
    • define an interface
    • command and control
  – extend a language
    • add new functionality
    • modify semantics
    • help with optimizations
2. Language Extensions
• What you learned in 6.035
  – define tokens and languages using regular expressions and CFGs
  – use tools such as jlex, lex, javacup, yacc
  – build intermediate representations
  – perform simple transformations on the IR

3. Computer Architectures
• Many special purpose processors
  – in your cell phone, car engine, watch, etc.
• Designing new architectures
• Adapting compiler back-ends for new architectures
3. Designing New Architectures
• Great advances in VLSI technology
  – very fast and small transistors
  – scaling up to a billion transistors
  – but slow and limited wires and I/O
• A computer architecture is a combination of hardware and compiler
  – need to know what a compiler can do and what hardware needs to do
  – If the compiler can do it, don’t waste hardware resources.

3. Designing New Architectures
• What did you learn in 6.035?
  – Capabilities of a compiler: what is simple and what is hard to do
  – How to think like a compiler writer
3. Back-end support
• Every new architecture needs a new backend
  – Even if the ISA is the same, different resource constraints
  – How to handle new features
• Instruction scheduling

3. Back-end support
• What did you learn in 6.035?
  – Intermediate representations
  – Transforming/optimizing the IR
  – Process of generating assembly from a high-level IR
  – Assembly interface issues (e.g., calling conventions)
  – Register allocation issues
  – Code scheduling issues
4. Compiler Hacking
• Theory
• Algorithms
• Implementation

4. Compiler Hacking
• Theory:
  – Develop general, abstract concepts
  – Prove correctness, optimality etc.
• Examples
  – parse theory
  – lattices and data-flow
  – abstract interpretation
  – The language ML

4. Compiler Hacking
• Algorithms:
  – Design a solution to a given problem (Mostly new optimizations)
  – Use many techniques such as graph theory, number theory, etc.
  – May have to limit the scope and find good heuristics
• Examples
  – partial redundancy elimination
  – register allocation by graph coloring
  – using multi-granular (MMX) operations

4. Compiler Hacking
• Implementation:
  – Develop a new compiler
  – Issues of designing very complex software
  – Putting theory and algorithms into practice
• Examples
  – A JIT for Java
  – A query optimization engine for SQL
  – A rasterizer for PostScript
Where to Look for Current Research?
• PLDI – Programming Language Design and Implementation Conference
• Code Generation / Machine specific
  – Micro Conference
  – ASPLOS – Architectural Support for Programming Languages and Operating Systems
  – CGO – Code Generation and Optimization
• Language Theory
  – POPL – Principles of Programming Languages
  – OOPSLA – Object-Oriented Programming, Systems, Languages, and Applications
• Program Analysis
  – SAS – Static Analysis Symposium
  – PPoPP – Principles and Practice of Parallel Programming

What did you learn in 6.035?