ITS 015: Compiler Construction

Compiler Construction: Overview

Today's Goals
- Summary of the subjects we've covered
- Perspectives and final remarks

High-level View
- Definitions
  - A compiler consumes source code and produces target code
    - usually translates high-level language programs into machine code
  - An interpreter consumes an executable program and produces results
    - acts as a virtual machine for the input code

Why Study Compilers?
- Compilers are important
  - Enabling technology for languages and software development
  - Allow programmers to focus on problem solving by hiding hardware complexity
  - Responsible for good system performance
- Compilers are useful
  - Language processing is broadly applicable
- Compilers are fun
  - Combine theory and practice
  - Overlap with other CS subjects
  - Hard problems
  - Engineering and trade-offs
  - You got a taste of this in the labs!

Structure of Compilers

The Front-end

Lexical Analysis
- Scanner
  - Maps the character stream into tokens
- Automated scanner construction (a subset-construction sketch follows below)
  - Define tokens using regular expressions (REs)
  - Construct an NFA (nondeterministic finite automaton) to recognize the REs
  - Transform the NFA into a DFA
    - Convert NFA to DFA through subset construction
    - DFA minimization (set splitting)
  - Build the scanner from the DFA
- Tools: ANTLR, lex

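A minimal sketch of the subset construction, assuming the NFA is stored as a
dict mapping (state, symbol) to a set of successor states, with None standing
in for epsilon moves (this encoding is chosen here for illustration, not taken
from the labs):

    from collections import deque

    def epsilon_closure(states, nfa):
        """All NFA states reachable from `states` via epsilon (None) moves."""
        stack, closure = list(states), set(states)
        while stack:
            s = stack.pop()
            for t in nfa.get((s, None), ()):
                if t not in closure:
                    closure.add(t)
                    stack.append(t)
        return frozenset(closure)

    def subset_construction(nfa, start, alphabet):
        """Build a DFA whose states are sets of NFA states."""
        start_state = epsilon_closure({start}, nfa)
        dfa_trans, seen = {}, {start_state}
        worklist = deque([start_state])
        while worklist:
            current = worklist.popleft()
            for a in alphabet:
                moved = set()
                for s in current:
                    moved |= nfa.get((s, a), set())
                target = epsilon_closure(moved, nfa)
                if not target:
                    continue
                dfa_trans[(current, a)] = target
                if target not in seen:       # newly discovered DFA state
                    seen.add(target)
                    worklist.append(target)
        return dfa_trans, start_state
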
Syntax Analysis
- Parsing languages using a CFG (context-free grammar)
- CFG grammar theory
  - Derivations
  - Parse trees
  - Grammar ambiguity
- Parsing
  - Top-down parsing
    - recursive descent
    - table-driven LL(1)
  - Bottom-up parsing
    - LR(1) shift-reduce parsing
    - Operator precedence parsing

Top-down Predictive Parsing
- Basic idea: build the parse tree from the root; given A → α | β, use the
  look-ahead symbol to choose between α and β
- Recursive descent (see the sketch below)
- Table-driven LL(1)
- Left-recursion elimination

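A minimal recursive-descent sketch for a small hypothetical expression grammar
(not a grammar from the course), with left recursion already eliminated; the
single look-ahead token decides which alternative to expand:

    # Hypothetical grammar (left recursion eliminated):
    #   E  -> T E'     E' -> '+' T E' | empty     T -> NUM | '(' E ')'
    class Parser:
        def __init__(self, tokens):
            self.tokens, self.pos = tokens, 0

        def peek(self):
            return self.tokens[self.pos] if self.pos < len(self.tokens) else None

        def eat(self, expected):
            if self.peek() != expected:
                raise SyntaxError(f"expected {expected!r}, got {self.peek()!r}")
            self.pos += 1

        def parse_E(self):                  # E -> T E'
            return self.parse_E_rest(self.parse_T())

        def parse_E_rest(self, left):       # E' -> '+' T E' | empty
            if self.peek() == '+':          # look-ahead picks the alternative
                self.eat('+')
                return self.parse_E_rest(('+', left, self.parse_T()))
            return left                     # empty production

        def parse_T(self):                  # T -> NUM | '(' E ')'
            if self.peek() == '(':
                self.eat('(')
                e = self.parse_E()
                self.eat(')')
                return e
            tok = self.peek()
            self.pos += 1
            return tok                      # treat any other token as a number

    print(Parser(['1', '+', '2', '+', '3']).parse_E())
    # -> ('+', ('+', '1', '2'), '3')
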
Bottom-up Shift-Reduce Parsing
- Build the reverse rightmost derivation
- The key is to find the handle (the right-hand side of a production)
  - All active handles include the top of stack (TOS)
  - Shift inputs until TOS is the right end of a handle
- The language of handles is regular (finite)
  - Build a handle-recognizing DFA
  - The ACTION & GOTO tables encode the DFA (driver sketch below)

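A minimal table-driven shift-reduce driver for the toy grammar
(1) E -> E + n, (2) E -> n. The ACTION/GOTO tables below were worked out by
hand for this example and are illustrative, not taken from the lecture:

    ACTION = {   # (state, look-ahead) -> parser action
        (0, 'n'): ('shift', 2),
        (1, '+'): ('shift', 3), (1, '$'): ('accept', None),
        (2, '+'): ('reduce', 2), (2, '$'): ('reduce', 2),
        (3, 'n'): ('shift', 4),
        (4, '+'): ('reduce', 1), (4, '$'): ('reduce', 1),
    }
    GOTO  = {(0, 'E'): 1}
    RULES = {1: ('E', 3), 2: ('E', 1)}      # rule -> (lhs, length of rhs)

    def parse(tokens):
        tokens = tokens + ['$']
        stack, i = [0], 0                   # stack of DFA states
        while True:
            action = ACTION.get((stack[-1], tokens[i]))
            if action is None:
                raise SyntaxError(f"unexpected {tokens[i]!r}")
            kind, arg = action
            if kind == 'shift':             # push the next DFA state, consume input
                stack.append(arg)
                i += 1
            elif kind == 'reduce':          # pop |rhs| states, follow GOTO on the lhs
                lhs, length = RULES[arg]
                del stack[len(stack) - length:]
                stack.append(GOTO[(stack[-1], lhs)])
            else:
                return True                 # accept

    print(parse(['n', '+', 'n', '+', 'n']))   # True
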
Semantic Analysis
- Analyze context and semantics
  - types and other semantic checks
- Attribute grammars
  - associate evaluation rules with grammar productions
- Ad hoc approaches
  - build the symbol table (sketch below)

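A minimal sketch of the scoped symbol table such checks rely on; the class and
method names are chosen here for illustration:

    class SymbolTable:
        def __init__(self):
            self.scopes = [{}]                # stack of name -> type mappings

        def enter_scope(self):
            self.scopes.append({})

        def exit_scope(self):
            self.scopes.pop()

        def declare(self, name, typ):
            if name in self.scopes[-1]:
                raise TypeError(f"redeclaration of {name}")
            self.scopes[-1][name] = typ

        def lookup(self, name):
            for scope in reversed(self.scopes):   # innermost scope wins
                if name in scope:
                    return scope[name]
            raise TypeError(f"undeclared identifier {name}")

    st = SymbolTable()
    st.declare('x', 'int')
    st.enter_scope()
    st.declare('x', 'float')
    print(st.lookup('x'))   # float (inner declaration shadows the outer one)
    st.exit_scope()
    print(st.lookup('x'))   # int
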
Intermediate Representation
- The front end translates the program into an IR for further analysis and
  optimization
  - The IR encodes the compiler's knowledge of the program
  - Largely machine-independent, but moves closer to the standard machine model
- AST: high-level tree IR
- Linear IR: low-level
  - ILOC three-address code (lowering sketch below)
  - Assembly-level operations
  - Exposes control flow and memory addressing
  - Unlimited virtual registers

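A minimal sketch of lowering an expression AST to ILOC-style three-address code
with unlimited virtual registers; the AST encoding and exact opcode spellings
are illustrative assumptions:

    import itertools

    new_reg = (f"r{i}" for i in itertools.count(1)).__next__   # fresh virtual registers

    def lower(node, code):
        """node is ('num', k), ('var', name), or (op, left, right); returns a register."""
        kind = node[0]
        if kind == 'num':
            r = new_reg()
            code.append(f"loadI {node[1]} => {r}")
            return r
        if kind == 'var':
            r = new_reg()
            code.append(f"load {node[1]} => {r}")
            return r
        op = {'+': 'add', '*': 'mult'}[kind]
        r1, r2 = lower(node[1], code), lower(node[2], code)
        r = new_reg()
        code.append(f"{op} {r1}, {r2} => {r}")
        return r

    code = []
    lower(('+', ('var', 'a'), ('*', ('num', 2), ('var', 'b'))), code)
    print('\n'.join(code))
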
Procedure Abstraction
- The procedure is the key language construct for building large systems
- Name space
- Caller-callee interface: the linkage convention
  - Control transfer
  - Context protection
  - Parameter passing and return values
- Run-time support for nested scopes
  - Activation records, access links, displays
- Inheritance and dynamic dispatch for OO languages (vtable sketch below)
  - multiple inheritance
  - virtual method tables

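A minimal sketch of dynamic dispatch through an explicit virtual method table,
mimicking the indirection a compiler emits for single inheritance (the object
layout here is illustrative):

    def shape_area(self):    return 0
    def circle_area(self):   return 3.14159 * self['r'] ** 2

    SHAPE_VTABLE  = {'area': shape_area}
    CIRCLE_VTABLE = {**SHAPE_VTABLE, 'area': circle_area}   # override the 'area' slot

    def new_circle(r):
        # every object carries a pointer to its class's vtable
        return {'vtable': CIRCLE_VTABLE, 'r': r}

    def call_virtual(obj, method, *args):
        # dispatch: indirect call through the object's vtable
        return obj['vtable'][method](obj, *args)

    c = new_circle(2.0)
    print(call_virtual(c, 'area'))   # 12.56636
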
The Back-end
- Instruction selection
  - Mapping IR into assembly code
  - Assumes a fixed storage mapping & code shape
  - Combining operations, using addressing modes
- Instruction scheduling
  - Reordering operations to hide latencies
  - Assumes a fixed program (set of operations)
  - Changes demand for registers
- Register allocation
  - Deciding which values will reside in registers
  - Changes the storage mapping, may add false sharing
  - Concerns about placement of data & memory operations

Code Generation
- Expressions
  - Recursive tree walk on the AST
  - Direct integration with the parser
- Assignment
- Array references
- Boolean & relational values
- If-then-else
- Case statements
- Loops (code-shape sketch below)
- Procedure calls

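A minimal sketch of the code shape for a while loop, emitted as ILOC-like
pseudo assembly; label and instruction spellings are illustrative:

    import itertools
    new_label = (f"L{i}" for i in itertools.count(100)).__next__

    def gen_while(cond_code, cond_reg, body_code):
        """cond_code leaves a boolean in cond_reg; returns the loop's instruction list."""
        top, body, done = new_label(), new_label(), new_label()
        return ([f"{top}:"] + cond_code +
                [f"cbr {cond_reg} -> {body}, {done}",   # true/false branch targets
                 f"{body}:"] + body_code +
                [f"jump -> {top}",
                 f"{done}:"])

    print('\n'.join(gen_while(["cmp_LT r1, r2 => r3"], "r3",
                              ["add r1, r4 => r1"])))
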
Instruction Selection
- Hand-coded tree-walk code generators
- Automatic instruction selection via pattern matching
  - Peephole matching
  - Tree-pattern matching through tiling

Instruction Scheduling
- The problem: given a code fragment for some target machine and the latencies
  of each individual operation, reorder the operations to minimize execution time
- Build the precedence (dependence) graph
- List scheduling (sketch below)
  - an NP-complete problem in general; heuristics work well for basic blocks
  - forward list scheduling
  - backward list scheduling
- Scheduling for larger regions
  - EBBs and cloning
  - Trace scheduling

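A minimal sketch of forward list scheduling on a single-issue machine; the
dependence-graph encoding and the longest-latency-first heuristic are
simplifications chosen for illustration (real schedulers typically rank by
critical-path distance):

    def list_schedule(latency, deps):
        """latency: op -> cycles; deps: op -> set of ops that must finish first."""
        scheduled = {}                     # op -> issue cycle
        cycle = 0
        remaining = set(latency)
        while remaining:
            # ops whose predecessors have all completed by this cycle
            ready = [op for op in remaining
                     if all(p in scheduled and scheduled[p] + latency[p] <= cycle
                            for p in deps.get(op, ()))]
            if ready:
                # simple heuristic: issue the ready op with the longest latency
                op = max(ready, key=lambda o: latency[o])
                scheduled[op] = cycle
                remaining.remove(op)
            cycle += 1                     # one issue slot per cycle
        return scheduled

    lat  = {'load1': 3, 'load2': 3, 'add': 1, 'store': 3}
    deps = {'add': {'load1', 'load2'}, 'store': {'add'}}
    print(list_schedule(lat, deps))
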
Register Allocation
- Local register allocation
  - top-down
  - bottom-up
- Global register allocation
  - Find live ranges
  - Build an interference graph GI
  - Construct a k-coloring of the interference graph
  - Map colors onto physical registers

Web-based Live Ranges
- Connect common defs and uses
- Solve the reaching-definitions data-flow problem!

Interference Graph
- The interference graph, GI
  - Nodes in GI represent live ranges
  - Edges in GI represent individual interferences
    - For x, y ∈ GI, <x,y> ∈ GI iff x and y interfere
  - A k-coloring of GI can be mapped into an allocation to k registers

Key Observation on Coloring
- Any vertex n that has fewer than k neighbors in the interference graph
  (n° < k) can always be colored!
- Remove nodes with n° < k to form GI′; a k-coloring of GI′ extends to a
  k-coloring of GI

Chaitin's Algorithm
1. While ∃ vertices with < k neighbors in GI:
   - Pick any vertex n such that n° < k and put it on the stack
   - Remove that vertex and all edges incident to it from GI
     (this will lower the degree of n's neighbors)
2. If GI is non-empty (all vertices have k or more neighbors) then:
   - Pick a vertex n (using some heuristic) and spill the live range
     associated with n
   - Remove vertex n from GI, along with all edges incident to it, and put it
     on the stack
   - If this causes some vertex in GI to have fewer than k neighbors, then go
     to step 1; otherwise, repeat step 2
3. If no vertex was spilled, successively pop vertices off the stack and color
   each in the lowest color not used by any neighbor; otherwise, insert spill
   code, recompute GI, and start again from step 1

Briggs' Improvement
Nodes with k or more neighbors can still be colored if some of their neighbors
end up with the same color, so defer the decision:
1. While ∃ vertices with < k neighbors in GI:
   - Pick any vertex n such that n° < k and put it on the stack
   - Remove that vertex and all edges incident to it from GI
     (this may create vertices with fewer than k neighbors)
2. If GI is non-empty (all vertices have k or more neighbors) then:
   - Pick a vertex n (using some heuristic), push n onto the stack, and remove
     n from GI, along with all edges incident to it
   - If this causes some vertex in GI to have fewer than k neighbors, then go
     to step 1; otherwise, repeat step 2
3. Successively pop vertices off the stack and color each in the lowest color
   not used by any neighbor
   - If some vertex cannot be colored, pick an uncolored vertex to spill,
     spill it, and restart at step 1
(A coloring sketch in this optimistic style follows below.)

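A minimal sketch of optimistic (Briggs-style) coloring: simplify the graph onto
a stack, then select colors while popping; spilling is only reported here, not
actually performed:

    def color_graph(graph, k):
        """graph: node -> set of neighbours (undirected). Returns (colors, spilled)."""
        degrees = {n: len(graph[n]) for n in graph}
        removed, stack = set(), []

        # Simplify: prefer nodes with degree < k; if none exists,
        # optimistically push a high-degree node anyway.
        while len(removed) < len(graph):
            candidates = [n for n in graph if n not in removed]
            low = [n for n in candidates if degrees[n] < k]
            n = low[0] if low else max(candidates, key=lambda x: degrees[x])
            stack.append(n)
            removed.add(n)
            for m in graph[n]:
                if m not in removed:
                    degrees[m] -= 1

        # Select: pop and assign the lowest color not used by colored neighbours.
        colors, spilled = {}, []
        while stack:
            n = stack.pop()
            used = {colors[m] for m in graph[n] if m in colors}
            free = [c for c in range(k) if c not in used]
            if free:
                colors[n] = free[0]
            else:
                spilled.append(n)      # a real allocator would insert spill code and retry
        return colors, spilled

    g = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b', 'd'}, 'd': {'c'}}
    print(color_graph(g, 2))           # the a-b-c triangle forces one spill with k = 2
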
The Middle-end: Optimizer

Principles of Compiler Optimization
- Safety
  - Does applying the transformation change the results of executing the code?
- Profitability
  - Is there a reasonable expectation that applying the transformation will
    improve the code?
- Opportunity
  - Can we efficiently and frequently find places to apply the optimization?
- An optimizing compiler combines
  - Program analysis
  - Program transformation

Program Analysis
- Control-flow analysis
- Data-flow analysis

Control Flow Analysis
- Basic blocks (partitioning sketch below)
- Control flow graph
- Dominator tree
- Natural loops
- Dominance frontiers
  - the join points for SSA
  - where to insert φ-nodes

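A minimal sketch of partitioning a linear instruction list into basic blocks by
finding leaders; the instruction encoding (single-target branches only) is a
simplification for illustration:

    def basic_blocks(instrs):
        """instrs: list of (opcode, label_or_None) tuples."""
        # A leader is the first instruction, every branch target,
        # and every instruction that follows a branch.
        leaders = {0}
        labels = {arg: i for i, (op, arg) in enumerate(instrs) if op == 'label'}
        for i, (op, arg) in enumerate(instrs):
            if op in ('jump', 'cbr'):
                leaders.add(labels[arg])          # the branch target starts a block
                if i + 1 < len(instrs):
                    leaders.add(i + 1)            # so does the fall-through point
        cuts = sorted(leaders) + [len(instrs)]
        return [instrs[a:b] for a, b in zip(cuts, cuts[1:])]

    code = [('label', 'L1'), ('add', None), ('cbr', 'L1'),
            ('sub', None), ('jump', 'L2'),
            ('label', 'L2'), ('ret', None)]
    for block in basic_blocks(code):
        print(block)
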
Data Flow Analysis
- "Compile-time reasoning about the run-time flow of values"
  - represent the effects of each basic block
  - propagate facts around the control flow graph

DFA: The Big Picture
- Set up a system of equations that relate program properties at different
  program points in terms of the properties at "nearby" program points
- Transfer function
  - Forward analysis: compute OUT(B) in terms of IN(B)
    - available expressions, reaching definitions
  - Backward analysis: compute IN(B) in terms of OUT(B)
    - variable liveness, very busy expressions
- Meet function for join points
  - Forward analysis: combine OUT(p) of the predecessors to form IN(B)
  - Backward analysis: combine IN(s) of the successors to form OUT(B)

Available Expressions
For a basic block b:
- IN(b): expressions available on entry to b
- OUT(b): expressions available on exit from b
- Local sets
  - def(b): expressions defined in b and still available on exit
  - killed(b): expressions killed in b
    - an expression is killed in b if one of its operands is assigned in b
- Transfer function
  - OUT(b) = def(b) ∪ (IN(b) − killed(b))
- Meet function
  - IN(b) = ∩_{p ∈ pred(b)} OUT(p)
(An iterative solver sketch follows below.)

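A minimal iterative solver for these equations over a small CFG; the blocks,
predecessor lists, and local sets below are made up for illustration:

    def available_expressions(blocks, preds, DEF, KILL, universe):
        """blocks[0] is the entry block; returns (IN, OUT) per block."""
        IN  = {b: set() for b in blocks}
        OUT = {b: set(universe) for b in blocks}   # optimistic initialization
        entry = blocks[0]
        changed = True
        while changed:
            changed = False
            for b in blocks:
                if b == entry:
                    new_in = set()                 # nothing is available at entry
                else:
                    new_in = set(universe)
                    for p in preds[b]:
                        new_in &= OUT[p]           # meet: intersect over predecessors
                new_out = DEF[b] | (new_in - KILL[b])   # transfer function
                if new_in != IN[b] or new_out != OUT[b]:
                    IN[b], OUT[b] = new_in, new_out
                    changed = True
        return IN, OUT

    blocks = ['B1', 'B2', 'B3']
    preds  = {'B1': [], 'B2': ['B1', 'B3'], 'B3': ['B2']}
    DEF    = {'B1': {'a+b'}, 'B2': {'a+b', 'c*d'}, 'B3': set()}
    KILL   = {'B1': set(), 'B2': set(), 'B3': {'c*d'}}
    print(available_expressions(blocks, preds, DEF, KILL, {'a+b', 'c*d'}))
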
More Data Flow Problems
- AVAIL equations
  - AVAIL_IN(n) = ∩_{p ∈ pred(n)} AVAIL_OUT(p)
  - AVAIL_OUT(n) = def(n) ∪ (AVAIL_IN(n) − killed(n))
- More data flow problems: reaching definitions, liveness

  direction | meet function ∪       | meet function ∩
  ----------+-----------------------+-----------------------
  forward   | reaching definitions  | available expressions
  backward  | variable liveness     | very busy expressions

Compiler Optimization
- Local optimization
  - DAG-based CSE
  - Value numbering (sketch below)
- Global optimization enabled by DFA
  - Global CSE (AVAIL)
  - Constant propagation (def-use chains)
  - Dead code elimination (use-def chains)
- Advanced topic: SSA

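A minimal sketch of local value numbering over one basic block of three-address
tuples; the tuple format is illustrative, and only commutative operators are
handled:

    def value_number(block):
        """block: list of (dst, op, src1, src2). Returns the redundant computations."""
        vn = {}                  # variable name or expression key -> value number
        counter = [0]
        redundant = []

        def number_of(name):
            if name not in vn:
                vn[name] = counter[0]
                counter[0] += 1
            return vn[name]

        for dst, op, a, b in block:
            # sort operand value numbers so a+b and b+a get the same key
            key = (op,) + tuple(sorted((number_of(a), number_of(b))))
            if key in vn:
                redundant.append((dst, key))   # this value was already computed
                vn[dst] = vn[key]
            else:
                vn[key] = counter[0]
                vn[dst] = counter[0]
                counter[0] += 1
        return redundant

    block = [('t1', '+', 'a', 'b'),
             ('t2', '+', 'b', 'a'),    # redundant with t1 (commutativity)
             ('t3', '*', 'a', 'b')]
    print(value_number(block))
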
Perspective
- Front end: essentially a solved problem
- Middle end: domain-specific languages
- Back end: new architectures
- Verifying compilers, reliability, security

Interesting Stuff We Skipped
- Interprocedural analysis
- Alias (pointer) analysis
- Garbage collection
- Check the literature references in EaC

How will you use the knowledge?
- As an informed programmer
- As an informed small-language designer
- As an informed hardware engineer
- As a compiler writer

Informed Programmer
- "Knowledge is power"
  - The compiler is no longer a black box
  - You know how the compiler works
- Implications
  - Use of language features
    - Avoid those that can cause problems
    - Give the compiler hints
  - Code optimization
    - Don't optimize prematurely
    - Don't write complicated code
  - Debugging
    - Understand the compiled code

Solving Problems the Compiler Way
- Solve problems from a language/compiler perspective
  - Implement a simple language
  - Extend an existing language

Informed Hardware Engineer
- Compiler support for programmable hardware
  - pervasive computing
  - new back ends for new processors
- Designing new architectures
  - what the compiler can and cannot do
  - how to expose and use the compiler to manage hardware resources

Compiler Writer
- Make a living by writing compilers!
  - Theory
  - Algorithms
  - Engineering
- We have built
  - a scanner
  - a parser
  - an AST builder and type checker
  - a register allocator
  - an instruction scheduler
- We have used compiler generation tools
  - ANTLR, lex, yacc, etc.
- You are on track to jump into compiler development!

Final Remarks
- Compiler construction
  - Theory
  - Implementation
- How to use what you learned in this lecture?
  - As an informed programmer
  - As an informed small-language designer
  - As an informed hardware engineer
  - As a compiler writer
- … and live happily ever after