Crafting a Compiler with C
Chapter 1 Introduction
Teacher: Dr. Lawrence Y. Deng
contact: [email protected]
Grading: Attendance 10%, Midterm exam 30%, Homework 20%, Final exam + Team project 40%
Textbook:
“Crafting a Compiler with C”, C. N. Fischer and R. J. LeBlanc, Jr., 1991, The Benjamin/Cummings Publishing Company
Reference:
“Compilers: Principles, Techniques, and Tools”, A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman, 2nd ed., 2006, Addison-Wesley
Outline
• 1.1 Overview and History
• 1.2 What Do Compilers Do?
• 1.3 The Structure of a Compiler
• 1.4 The Syntax and Semantics of Programming Languages
• 1.5 Compiler Design and Programming Language Design
• 1.6 Compiler Classifications
• 1.7 Influences on Computer Design
Overview and History
• Compilers are fundamental to modern computing.
• They act as translators, transforming human-oriented programming languages into computer-oriented machine languages.
Programming Language (Source) → Compiler → Machine Language (Target)
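Compilers are fundamental to modern computing:
• Natural Language Processing
– Information retrieval, machine translation, intelligent interfaces (interface design), writing assistance, speech recognition and synthesis, OCR
– Web page generation, retrieval, and extraction; multimedia and multilingual support
– Sign language, behavior detection, network monitoring, game playing
– Content management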
Reference Technologies
• Finite State Machine Network
• Data Classification
• Web Content Structure/Content/Usage Mining
Overview and History (Cont’d)
• The first real compiler
– FORTRAN compilers of the late 1950s, IBM
– 18 person-years to build, with an ad hoc structure
• Compiler technology is more broadly applicable and has been employed in rather unexpected areas.
– Text-formatting languages like nroff and troff; preprocessor packages like eqn, tbl, pic
– Silicon compilers for the creation of VLSI circuits
– Command languages of operating systems
– Query languages of database systems
• Efficient? (compared with assembly programs)
– Not bad, and programs are much easier to write
– High-level languages are feasible.
• Today, we can build a simple compiler in a few months.
• Crafting an efficient and reliable compiler is still challenging.
1.2 What Do Compilers Do?
• Compilers may be distinguished according to
the kind of target code they generate:
– Pure Machine Code
– Augmented Machine Code
– Virtual Machine Code
• JVM, P-code
What Do Compilers Do? (Cont’d)
• Another way that compilers differ from one another is in the format of
the target machine code they generate
– Assembly Language Format
– Relocatable Binary Format
• A linkage step is required
– Memory-Image (Load-and-Go) Format
What Do Compilers Do? (Cont’d)
• Another kind of language processor, called an interpreter, differs from a compiler in that it executes programs without explicitly performing a translation
(Source Program Encoding, Data) → Interpreter → Output
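A minimal sketch of the interpreter idea, in Python. The function below walks an encoded source program and computes the output directly from the input data, without ever producing target code. The nested-tuple encoding and the name `interpret` are assumptions chosen for illustration; they do not come from the textbook.

```python
def interpret(node, data):
    """Evaluate an encoded expression directly against the input data."""
    kind = node[0]
    if kind == "num":            # ("num", 2): a literal value
        return node[1]
    if kind == "var":            # ("var", "b"): look the value up in the data
        return data[node[1]]
    if kind == "add":            # ("add", left, right)
        return interpret(node[1], data) + interpret(node[2], data)
    if kind == "mul":            # ("mul", left, right)
        return interpret(node[1], data) * interpret(node[2], data)
    raise ValueError(f"unknown node kind: {kind!r}")


# Hypothetical encoding of the expression b*2+0
program = ("add", ("mul", ("var", "b"), ("num", 2)), ("num", 0))
print(interpret(program, {"b": 21}))  # prints 42
```

Note that the program is re-examined every time it runs, which is one reason interpretation tends to be slower than executing compiled code.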
1.3 The Structure of a Compiler
• Any compiler must perform two major tasks
– Analysis of the source program
– Synthesis of a machine-language program
The Structure of a Compiler (Cont’d)
Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Semantic Routines → Intermediate Representation → Optimizer → Code Generator → Target Machine Code
Symbol and Attribute Tables (used by all phases of the compiler)
Figure: The structure of a Syntax-Directed Compiler
The structure of a compiler
source program: a:=b*2+0
→ scanner → id:=id*int+int
→ parser → syntax trees
→ semantic routine (1. types are correct) → load b / mul 2 / add 0 / sta a
→ optimizer → load b / shl 1 / sta a
→ code generator → target code: 001F 2b4A 309F
symbol table (used by the phases above)
The Structure of a Compiler (Cont’d)
• Scanner
– The scanner begins the analysis of the source
program by reading the input, character by
character, and grouping characters into
individual words and symbols (tokens)
– The tokens are encoded and then are fed to the
parser for syntactic analysis
– For details, see the bottom of page 8.
• Scanner generators
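The scanning step described above can be sketched as a small hand-written scanner in Python. It reads the input character by character and groups characters into (kind, text) tokens. The token kinds (`id`, `int`, `:=`, and single-character operators) are assumptions made for this illustration; a real scanner would normally be produced by a scanner generator from regular expressions.

```python
def scan(source):
    """Group the characters of `source` into a list of (kind, text) tokens."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():                      # skip blanks between tokens
            i += 1
        elif ch.isdigit():                    # integer literal: one or more digits
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append(("int", source[start:i]))
        elif ch.isalpha() or ch == "_":       # identifier: letter, then letters/digits
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(("id", source[start:i]))
        elif source.startswith(":=", i):      # two-character assignment symbol
            tokens.append((":=", ":="))
            i += 2
        elif ch in "+-*/()":                  # single-character symbols
            tokens.append((ch, ch))
            i += 1
        else:
            raise SyntaxError(f"illegal character {ch!r} at position {i}")
    return tokens


# Prints the token kinds: id := id * int + int
print(" ".join(kind for kind, _ in scan("a:=b*2+0")))
```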
The Structure of a Compiler (Cont’d)
• Parser
– Given a formal syntax specification (typically as a
context-free grammar [CFG]), the parser reads tokens
and groups them into units as specified by the
productions of the CFG being used.
– While parsing, the parser verifies correct syntax, and if
a syntax error is found, it issues a suitable diagnostic.
– As syntactic structure is recognized, the parser either
calls corresponding semantic routines directly or builds
a syntax tree.
<2> Parser: identify syntactic structure
- Syntax is specified by grammars, e.g.,
IfStmt → if ( Exp ) then Stmt
Exp → …
- Parsers can be generated from grammars.
grammar → yacc or llgen → parser
- Error repair, e.g., ")" is missing
if ( a > 3 then
- A parser may build a syntax tree or call semantic routines directly.
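As a rough illustration of these points, here is a hand-written recursive-descent parser in Python for a tiny expression grammar. The grammar, the class name `Parser`, and the (kind, text) token format are assumptions for this sketch; it is hand-coded rather than generated by yacc or llgen. It builds a syntax tree as nested tuples, and it reports a diagnostic when a ")" is missing instead of attempting error repair.

```python
class Parser:
    """Recursive-descent parser for the (assumed) grammar:
         Exp    -> Term { "+" Term }
         Term   -> Factor { "*" Factor }
         Factor -> id | int | "(" Exp ")"
    """

    def __init__(self, tokens):
        self.tokens = list(tokens) + [("eof", "")]   # sentinel marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def expect(self, kind):
        # Verify the next token; issue a diagnostic on a syntax error.
        if self.peek() != kind:
            raise SyntaxError(
                f'expected "{kind}" but found "{self.peek()}" at token {self.pos}')
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def exp(self):
        tree = self.term()
        while self.peek() == "+":
            self.expect("+")
            tree = ("+", tree, self.term())
        return tree

    def term(self):
        tree = self.factor()
        while self.peek() == "*":
            self.expect("*")
            tree = ("*", tree, self.factor())
        return tree

    def factor(self):
        if self.peek() in ("id", "int"):
            return self.expect(self.peek())          # leaf node: (kind, text)
        self.expect("(")
        tree = self.exp()
        self.expect(")")
        return tree


def parse(tokens):
    parser = Parser(tokens)
    tree = parser.exp()
    parser.expect("eof")                             # no trailing tokens allowed
    return tree


# Tokens for b*2+0
tokens = [("id", "b"), ("*", "*"), ("int", "2"), ("+", "+"), ("int", "0")]
print(parse(tokens))
```

Calling `parse` on the tokens of `( a + 3` raises a `SyntaxError` saying that `")"` was expected, which corresponds to the diagnostic step; a production parser would go further and try to repair the error and continue.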
The Structure of a Compiler (Cont’d)
• Semantic Routines
– Perform two functions
• Check the static semantics of each construct
• Do the actual translation
– The heart of a compiler
• Optimizer
– The IR code generated by the semantic routines is
analyzed and transformed into functionally equivalent
but improved IR code.
– This phase can be very complex and slow
– Peephole optimization
<3> Semantic routines
- Check that variables are defined, that types are correct, etc.
Ex. x := x + true
- Generate IR (3-address code), e.g.,
ADD A, B, C
- Hand-coded
- Associated with grammar rules
- Formalized as attribute grammars (AG)
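The two jobs of the semantic routines, checking and translating, can be sketched together in Python. The function below walks a syntax tree given as nested tuples, checks that variables are defined and that operand types are correct, and emits three-address code. The tree format, the symbol table as a plain dict, the temporaries `t1`, `t2`, and the operand order (two sources, then the destination) are all assumptions made for this illustration.

```python
import itertools


def translate(tree, symbols, code, temps):
    """Check the static semantics of `tree` and append three-address code.

    Returns (place, type): where the value lives and what type it has.
    """
    kind = tree[0]
    if kind in ("int", "bool"):                 # literals carry their own type
        return tree[1], kind
    if kind == "id":                            # variables must be defined
        if tree[1] not in symbols:
            raise NameError(f"undefined variable {tree[1]}")
        return tree[1], symbols[tree[1]]
    # Binary arithmetic operator: both operands must be int.
    opcode = {"+": "ADD", "*": "MUL"}[kind]
    left, ltype = translate(tree[1], symbols, code, temps)
    right, rtype = translate(tree[2], symbols, code, temps)
    if ltype != "int" or rtype != "int":
        raise TypeError(f"operands of {kind} must be int, got {ltype} and {rtype}")
    result = f"t{next(temps)}"                  # fresh temporary for the result
    code.append(f"{opcode} {left}, {right}, {result}")
    return result, "int"


def compile_expr(tree, symbols):
    code = []
    translate(tree, symbols, code, itertools.count(1))
    return code


# Syntax tree for b*2+0, with b declared as an int
tree = ("+", ("*", ("id", "b"), ("int", "2")), ("int", "0"))
for line in compile_expr(tree, {"b": "int"}):
    print(line)
# MUL b, 2, t1
# ADD t1, 0, t2
```

With `x` declared as an int, the tree for `x + true` is rejected with a `TypeError`, which mirrors the type-checking example on the slide.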
• Peephole optimization:
– Simple and effective
– Ex 1. STORE R1, A
LOAD A, R1
– Ex 2. MUL R1, #1
– Ex 3. ADD R1, #0
– Such innocent-looking code can actually be generated by a
naive compiler.
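The three patterns above can be removed by a very small peephole pass. The Python sketch below assumes instructions are plain strings in the `OP operand, operand` form used on the slide, and that the code is a straight-line sequence with no labels or jumps (a jump target between two instructions would make the Ex 1 rewrite unsafe). The function name `peephole` is hypothetical.

```python
def operands_of(instr):
    """Split 'OP a, b' into ('OP', ['a', 'b'])."""
    op, _, args = instr.partition(" ")
    return op, [a.strip() for a in args.split(",")]


def peephole(code):
    """Return `code` with three redundant instruction patterns removed."""
    out = []
    for instr in code:
        op, operands = operands_of(instr)
        # Ex 2 and Ex 3: multiplying by 1 or adding 0 changes nothing.
        if (op, operands[1:]) in (("MUL", ["#1"]), ("ADD", ["#0"])):
            continue
        # Ex 1: LOAD A, R1 right after STORE R1, A reloads a value
        # that is already in R1.
        if op == "LOAD" and out:
            prev_op, prev_operands = operands_of(out[-1])
            if prev_op == "STORE" and prev_operands == operands[::-1]:
                continue
        out.append(instr)
    return out


naive = ["STORE R1, A", "LOAD A, R1", "MUL R1, #1", "ADD R1, #0", "STORE R1, B"]
print(peephole(naive))  # prints ['STORE R1, A', 'STORE R1, B']
```

The pass only ever looks at the current instruction and the one kept before it, which is what makes peephole optimization cheap compared with global analysis.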
The Structure of a Compiler (Cont’d)
• One-pass compiler
– Used when no optimization is required
– Merges code generation with the semantic
routines and eliminates the use of an IR
• Compiler writing tools
– Compiler generators or compiler-compilers
• E.g. scanner and parser generators
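To make the one-pass idea concrete, here is a hedged Python sketch in which the parsing routines emit target code the moment a construct is recognized, so no syntax tree or IR is ever built. The target is a hypothetical stack machine with `LOAD`, `PUSH`, `ADD`, and `MUL` instructions, and the input is assumed to be a list of already-scanned token strings; none of these names come from the textbook.

```python
def compile_one_pass(tokens):
    """Translate infix tokens straight to stack-machine code, with no IR."""
    code = []
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if tok == "(":
            exp()
            if peek() != ")":
                raise SyntaxError('")" is missing')
            advance()
        elif tok.isdigit():
            code.append(f"PUSH {tok}")       # emit code as soon as it is parsed
        elif tok.isidentifier():
            code.append(f"LOAD {tok}")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        factor()
        while peek() == "*":
            advance()
            factor()
            code.append("MUL")               # operands are already on the stack

    def exp():
        term()
        while peek() == "+":
            advance()
            term()
            code.append("ADD")

    exp()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return code


print(compile_one_pass(["b", "*", "2", "+", "0"]))
# prints ['LOAD b', 'PUSH 2', 'MUL', 'PUSH 0', 'ADD']
```

Because each instruction is emitted immediately, there is no point at which the whole program is available for analysis, which is why this design fits cases where optimization is not needed.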
1.5 Compiler Design and
Programming Language Design
• An interesting aspect is how programming
language design and compiler design
influence one another.
• Programming languages that are easy to
compile have many advantages
– See the 2nd paragraph of page 16.
1.5 Compiler Design and
Programming Language Design
(Cont’d)
• Languages such as Snobol and APL are usually
considered noncompilable
• What attributes must be found in a programming
language to allow compilation?
– Can the scope and binding of each identifier reference
be determined before execution begins?
– Can the type of an object be determined before execution
begins?
– Can existing program text be changed or added to
during execution?
1.6 Compiler Classifications
• Diagnostic compilers
- Report and repair compile-time errors.
- Add run-time checks, e.g., array subscripts.
- Should be used in the real world.
- vs. production compilers
• Optimizing compilers
- For production use
- 'Optimal' is suspicious because optimality is
1. theoretically undecidable
2. practically infeasible
- Perform a lot of transformations with unknown effects.
Ex. place data in registers
→ fast access
→ slow procedure call
• Retargetable compilers
- Localize machine dependence.
- Difficult to implement
- Less efficient object code