COMP2010 – Compilers
Introduction
Dr. Licia Capra
UCL/CS
ABOUT ME
Lecturer: Dr. Licia Capra
Room: 7.17 Malet Street Building
Phone: 020 7679 3708
Email: l.capra@cs.ucl.ac.uk
Office Hours: Appointments via Email
Web Page: http://www.cs.ucl.ac.uk/staff/l.capra
Research
• Mobile & Pervasive Computing
• Recommender Systems
• Trust Management
• Content Sharing & Distribution
• Ontology & Tagging
ABOUT YOU
Students
• 2nd Year Undergrads
– Any affiliate student?
– Any non-CS student?
Pre-requisites
• 1007-1008 (Java Programming)
• 1001 Computer Architecture I (MIPS, SPIM MIPS simulator)
Emails
• Use your CS/UCL email account to write to me
• Register for 2010@cs.ucl.ac.uk
ABOUT THE COURSE
Who
• Lecturer (Me)
• Teaching Assistant: Andrew Cox
ABOUT THE COURSE
When
• Term 2 Course
• 30 Lectures
– Wed 11am-1pm [Drayton Ricardo LT]
– Fri 10am-11am [MPEB 1.03]
• Problem Classes
– Tue 2pm-3pm [MPEB 1.02] – Start 19/01
Course Website
http://www.cs.ucl.ac.uk/staff/l.capra/teaching/2010.html
ABOUT THE COURSE
Course Material
• Lecture Slides
• Exercises/Problem Classes
• Code Examples
Books
• “Compilers – Principles, Techniques and Tools”, by A.V. Aho, R. Sethi, J.D. Ullman. Addison Wesley
ABOUT THE COURSE
Books
• “Modern Compiler Implementation in Java”, by A.W. Appel. Cambridge University Press
• “Modern Compiler Design”, by D. Grune et al. John Wiley and Sons Ltd
• “Advanced Compiler Design and Implementation”, by S.S. Muchnick. Morgan Kaufmann
Read at least one!
ABOUT THE COURSE
Other Books
• “Linkers and Loaders”, by John R. Levine
• “Building an Optimizing Compiler”, by Robert Morgan
• “Advanced Compiling for High Performance”, by Kennedy
• “Object-Oriented Compiler Construction”, by Jim Holmes
Conferences (a few!)
• Int. Conf. on Compiler Construction (CC)
• Int. Conf. on Programming Languages and Compilers (PLC)
• Principles of Programming Languages (POPL)
• European Conf. on OO Programming (ECOOP)
• Int. Conf. on Functional Programming (ICFP)
• Int. Conf. on Logic Programming (ICLP)
• Parallel Architectures and Compilation Techniques (PACT)
ABOUT THE COURSE
SIG
• ACM Special Interest Group on Programming Languages
  http://www.acm.org/sigs/sigplan/
Tools
• JLex - A Lexical Analyzer Generator for Java
  http://www.cs.princeton.edu/~appel/modern/java/JLex/
• JFlex - The Fast Scanner Generator for Java
  http://www.jflex.de/
• CUP – LALR Parser Generator for Java
  http://www.cs.princeton.edu/~appel/modern/java/CUP/
ABOUT THE COURSE
Coursework
• 2 Submissions – 1 Overall Mark:
– Part I: Parsing – due Wednesday 24/02
– Part II: AST and Semantic Analyser – due Wednesday 24/03
• Actions to take NOW:
– Form groups of 3/4 people each
– Groups to email me as soon as formed
Assessment
• 20% coursework
• 80% written examination (2.5 hours)
NOTE ON PLAGIARISM…
2010 GOALS
• Understand
… the code structure
… the language semantics
… the relationship between source and machine code
• Learn
… theory (mathematical models and algorithms)
… practice (apply theory to build a real compiler)
• Build
… a compiler!
WHAT ARE COMPILERS?
• Compilers: translate a computer program from one language to another
Source language (high-level language: Java, C, Pascal, …) → COMPILER → Target language (assembly language)
COMPILER → Error messages
WHY DO WE NEED COMPILERS?
• Too difficult to write, debug, maintain programs written in assembly language
• Source code: optimised for human readability
• Machine code: optimised for hardware
• Goal of compilers: translate a source code program into an equivalent machine code program efficiently
TRANSLATION CORRECTNESS
• Translation is a complex process
– Source language and target language are very different
• Solution
– Split compilation into different phases
– From language-specific to machine-specific representation
SIMPLIFIED COMPILER STRUCTURE
Source code: if (b==0) a=b;
→ Analysis
→ Intermediate representation
→ Synthesis
→ Target code:
  CMP CX,0
  CMOVZ DX,CX
ANALYSIS
Source code (character stream)
→ Lexical Analysis (Scanner)
→ Token stream
→ Syntax Analysis (Parser)
→ Abstract Syntax Tree (AST)
→ Semantic Analysis
→ Decorated AST
SYNTHESIS
Decorated AST
→ Intermediate Code Generator (end of front-end)
→ Intermediate code
→ Optimiser (start of back-end)
→ Intermediate code
→ Code Generator
→ Target program
LEXICAL ANALYSIS
• Goal: recognise words and symbols in the source program and group them into tokens
– Natural language: “I like classical music”
  Tokens: “I” “like” “classical” “music”
– Programming language: “if (b==0) a=b”
  Tokens: “if” “(” “b” “==” “0” “)” “a” “=” “b”
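The grouping of characters into tokens can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token classes, the names TOKEN_SPEC, KEYWORDS and tokenize, and the tiny language they cover are hypothetical and are not taken from the course tools (JLex, JFlex).

```python
import re

# Hypothetical token classes for a tiny language; order matters,
# so "==" is tried before "=".
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"==|=|[();]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"if"}


def tokenize(source):
    """Turn a character stream into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


token_texts = [text for _, text in tokenize("if (b==0) a=b")]
```

Here token_texts holds the same nine tokens listed on the slide, in the same order.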
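SYNTAX ANALYSIS
• Goal: recognise the phrase structure
– Natural language: “I like classical music”
  I (noun), like (verb), classical (adj), music (noun)
  subject: “I”; predicate: “like classical music”; object: “classical music”
  sentence = subject + predicate
– Programming language: “if (b==0) a=b”
  if (b==0): test
  a=b: assignment
  If-statement = test + assignment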
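As a rough illustration of recognising phrase structure, the sketch below parses the token sequence of the if-statement by hand and returns a small tree. It assumes a hypothetical one-rule grammar (if-statement = test + assignment); the function name and the tuple shape of the tree are illustrative choices, not the output format of CUP.

```python
def parse_if_statement(tokens):
    """Parse: 'if' '(' operand '==' operand ')' name '=' operand."""
    pos = 0

    def expect(text=None):
        # Consume one token; if `text` is given, the token must equal it.
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        if text is not None and token != text:
            raise SyntaxError(f"expected {text!r}, got {token!r}")
        pos += 1
        return token

    expect("if")
    expect("(")
    left = expect()
    expect("==")
    right = expect()
    expect(")")
    target = expect()
    expect("=")
    value = expect()
    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after statement")
    return ("if", ("test", "==", left, right), ("assign", target, value))


tree = parse_if_statement(["if", "(", "b", "==", "0", ")", "a", "=", "b"])
```

The resulting tree has the if-statement at the root with the test and the assignment as its two children, mirroring the structure shown on the slide.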
SEMANTIC ANALYSIS
• Goal: check whether the source program is semantically valid
– Natural language: “Classical music likes I”
  Classical (adj), music (noun), likes (verb), I (noun)
  (syntax is correct, semantics is wrong)
– Programming language: if (b==0) a=“foo”
  if (b==0): test
  a=“foo”: assignment
  If `a’ is an integer, the semantic analysis will report an error
(end of analysis)
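The type error in the assignment example can be sketched as a lookup in a symbol table followed by a type comparison. This is a minimal sketch under assumptions: the symbol table is a plain dictionary from variable name to declared type, and the type names "int" and "string" and both function names are hypothetical.

```python
def literal_type(value):
    """Map a literal to a toy source-language type name."""
    if isinstance(value, bool):
        raise TypeError("booleans are not part of this toy language")
    if isinstance(value, int):
        return "int"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported literal {value!r}")


def check_assignment(symbol_table, target, value):
    """Return a list of semantic errors for the assignment `target = value`."""
    if target not in symbol_table:
        return [f"undeclared variable '{target}'"]
    declared = symbol_table[target]
    actual = literal_type(value)
    if declared != actual:
        return [f"cannot assign {actual} to '{target}' of type {declared}"]
    return []


symbols = {"a": "int", "b": "int"}       # assumed declarations
errors = check_assignment(symbols, "a", "foo")
no_errors = check_assignment(symbols, "a", 0)
```

With `a` declared as an integer, assigning “foo” yields one error, while assigning 0 yields none.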
ERROR HANDLING AND SYMBOL TABLE
Source code (character stream)
→ Lexical Analysis (Scanner)
→ Token stream
→ Syntax Analysis (Parser)
→ Abstract Syntax Tree (AST)
→ Semantic Analysis
→ Decorated AST
Symbol Table and Error Handler: shared by the analysis phases
INTERMEDIATE CODE GENERATOR
• Goal: create intermediate code that is
– easier to optimise than binary code
– portable
• Example (3-address code):
CJUMP(b==0,L1,L2)
LABEL(L1)
a=b
LABEL(L2)
…
(end of front-end)
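One way to picture how such 3-address code comes out of the tree is the sketch below. It assumes a hypothetical tuple-shaped AST for the if-statement and a simple counter for fresh labels; the names make_label_factory and gen_if are illustrative only.

```python
import itertools


def make_label_factory():
    """Return a function that produces fresh labels L1, L2, ..."""
    counter = itertools.count(1)
    return lambda: f"L{next(counter)}"


def gen_if(ast, new_label):
    """Translate ("if", ("test", op, left, right), ("assign", target, value))."""
    kind, (_, op, left, right), (_, target, value) = ast
    if kind != "if":
        raise ValueError(f"unsupported node {kind!r}")
    true_label = new_label()   # where to go when the test holds
    end_label = new_label()    # where to go otherwise
    return [
        f"CJUMP({left}{op}{right},{true_label},{end_label})",
        f"LABEL({true_label})",
        f"{target}={value}",
        f"LABEL({end_label})",
    ]


ast = ("if", ("test", "==", "b", "0"), ("assign", "a", "b"))
intermediate_code = gen_if(ast, make_label_factory())
```

For the running example this produces the four instructions shown on the slide.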
OPTIMISER
• Goal: transform the intermediate code so that it runs faster and/or uses less space
• Example:
(intermediate code)
CJUMP(b==0,L1,L2)
LABEL(L1)
a=b
LABEL(L2)
(optimised intermediate code)
CJUMP(b==0,L1,L2)
LABEL(L1)
a=0
LABEL(L2)
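The rewrite of a=b into a=0 works because, on the branch taken when b==0 holds, b is known to be 0. The sketch below shows that idea on the string form of the intermediate code. It is a simplification under assumptions: each label is reached only from the CJUMP that names it as its true target, and the function name and regular expressions are hypothetical.

```python
import re

CJUMP = re.compile(r"CJUMP\((\w+)==(\d+),(\w+),(\w+)\)")
ASSIGN = re.compile(r"(\w+)=(\w+)")


def propagate_branch_constant(code):
    """Replace uses of x by c inside the block entered when x==c is true."""
    optimised = []
    facts_at_label = {}   # label -> (variable, constant) known on entry
    known = None          # fact valid in the current block, if any
    for instr in code:
        jump = CJUMP.fullmatch(instr)
        if jump:
            variable, constant, true_label, _ = jump.groups()
            facts_at_label[true_label] = (variable, constant)
            known = None
        elif instr.startswith("LABEL("):
            label = instr[len("LABEL("):-1]
            known = facts_at_label.get(label)
        else:
            assign = ASSIGN.fullmatch(instr)
            if assign and known:
                target, source = assign.groups()
                if source == known[0]:
                    instr = f"{target}={known[1]}"
                if target == known[0]:
                    known = None  # variable overwritten, fact no longer holds
        optimised.append(instr)
    return optimised


before = ["CJUMP(b==0,L1,L2)", "LABEL(L1)", "a=b", "LABEL(L2)"]
after = propagate_branch_constant(before)
```

A real optimiser would establish such facts with data-flow analysis over a control-flow graph rather than by scanning labels.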
CODE GENERATOR
• Goal: generate target program (usually assembly code) from optimised intermediate code
• Example:
CMP ECX,0
CMOVZ [EBP+8],0
(end of synthesis)
(end of back-end)
(end of compilation)
OVERALL COMPILER STRUCTURE
Source code (character stream)
→ Lexical Analysis (Scanner)
→ Token stream
→ Syntax Analysis (Parser)
→ Abstract Syntax Tree (AST)
→ Semantic Analysis
→ Decorated AST
→ Intermediate Code Generation
→ Intermediate Code
→ Optimisation
→ Optimised Intermediate Code
→ Code Generation
→ Target code
Symbol Table and Error Handler: shared by the phases
PHASES AND PASSES
• Good compilers must be FAST!
• Phases 1-4 executed in 1 pass
Source program → Scanner → Parser → Semantic Analyser → Intermediate Code Gen. → Intermediate code
Phases 1-4: 1 pass
Phase 5: 1+ passes
Phase 6: 1 pass
Total: 3+ passes
Unlike phases, passes produce real output
COMPILERS VS. INTERPRETERS
Compiler
Source program → Compiler → Executable program
Data → Executable program → Program output
Interpreter
Source program + Data → Interpreter → Program output
COMPILERS VS. INTERPRETERS
Hybrid approach
Source program → P-code Compiler → P-code program
P-code program + Data → P-code Interpreter → Program output
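Python itself is a convenient illustration of the hybrid approach, since CPython first compiles source text to bytecode and then interprets it. The sketch below uses the built-in compile and exec functions to show the two steps separately; treating bytecode as the “P-code” and the dictionary env as the program's “Data” is an analogy chosen for this example, not something stated on the slide.

```python
# Source program: the running example, written in Python syntax.
source = "if b == 0:\n    a = b\n"

# Step 1, "P-code compiler": source text -> code object (bytecode).
p_code = compile(source, "<example>", "exec")

# Step 2, "P-code interpreter": run the bytecode on some data.
env = {"a": 5, "b": 0}
exec(p_code, env)
result = env["a"]
```

After execution, result is 0 because the test b == 0 holds and a is overwritten with the value of b.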