Systems Software - Compilers and
Assemblers
Oct-06
Damien Costello, Dept of Computing & Maths, GMIT

Compilers
• Translates a source language program into a low level object language
  – May be the assembly language or the machine language of that particular computer.
    • If the object file is assembly language, then an assembler converts it to machine language.
  – A Linker then combines the object files with libraries to create the final executable file.

Compilers
• design problem
  – express every high level language program in machine instructions
  – high level source language => large distance to machine language
  – therefore use intermediate language in translation process (2 steps not 1)

Compilers
• Difference between Interpreters and Compilers
  – Interpreter works similarly to us in figuring out what the code does.
    • Check for syntax errors, locate the start and then execute the statements.
  – Compiler translates the source to object code after checking for errors. Can be either assembly language or machine language.
• Net result – interpreter translates the program into actions specified by the code.

Compilers
• An interpreter does not produce an object program.
  – May translate the source into an internal intermediate code that it can execute more efficiently.
  – May execute the source program statements directly.

Compilers
• What about speed?
  – Interpreter works as we do.
    • Each time, it has to figure out what statements mean.
  – Compilers generate machine code which is executed at top speed.
    • Up to 100 times faster than an interpreter.

Compilers
• What about logic errors (divide by zero)?
  – Interpreter may stop and ask for corrective action, show you where the problem is and so on.
  – Compilers generate the object code which then runs independently, may simply abort on error.
    • Debuggers have helped a lot.
  – However, the statements required to keep the environment informed may affect the efficiency of the code

Compilers
• Would be nice to have both an interpreter and a compiler for a given language.
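The speed point above (an interpreter has to work out what a statement means each time it runs) can be sketched in Python. This is only an analogy: Python's built-in `compile` produces bytecode rather than machine code, and the expression in `source` is a made-up example, so the timings printed are illustrative and will not reproduce the "100 times" figure.

```python
import timeit

source = "x * x + 2 * x + 1"  # hypothetical source statement

def run_interpreted(x):
    # The source text is re-analysed on every call.
    return eval(source, {"x": x})

# Translate once, up front, into a code object.
code_object = compile(source, "<example>", "eval")

def run_compiled(x):
    # Only the already-translated code object is executed.
    return eval(code_object, {"x": x})

if __name__ == "__main__":
    print(run_interpreted(3), run_compiled(3))
    print("re-analysed each time:",
          timeit.timeit(lambda: run_interpreted(3), number=10_000))
    print("translated once:      ",
          timeit.timeit(lambda: run_compiled(3), number=10_000))
```

Both functions return the same value; the difference is only where the analysis work is done.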
Compilers - Analysis-Synthesis
– analysis
  • recognise structure and meaning of source
– synthesis
  • construct the desired target

Analysis-Synthesis
• related view
• Front-end
  – Reads the source program
  – source-language-dependent
  – target-language-independent
• Back-end
  – Interpreter – execute program
  – Compiler – generate object code
  – source-language-independent
  – target-language-dependent
Compilers
• performs translation of high level language to intermediate or machine language
• reads program in source language and translates to target language
• target program may need further processing
• compiler generates assembly code
• translated by an assembler to relocatable code
• linker/loader combines it with object files from libraries
• final code can run on machine

[Figure: source → compiler → assembly → assembler → relocatable machine code → linker/loader (with object files from libraries) → absolute machine code]
Compiler Components
• Scanner
  – reads the source as string of characters
  – recognises streams of words and symbols (tokens)
  – Starts building the symbol table
    • character string constants, spelling of identifiers
  – often considered part of parser as they work closely
• Parser
  – Knows the syntax of the source language.
    • Grammar and rules
  – It controls the translation process.
  – Sends “get” messages to the scanner object.

Compiler Components
• constrainer
  – Helps to enforce type and declaration rules
  – Adds to symbol table
    • semantic information about identifiers
• code generator
  – constructs the target machine code
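The scanner's job can be sketched as follows. This is a minimal illustration, not the course's implementation: the token classes (`number`, `identifier`, `symbol`), the function name `scan` and the layout of the symbol table entries are all assumptions.

```python
import re

# One alternative per token class; whitespace matches none of them and is skipped.
TOKEN = re.compile(
    r"(?P<number>\d+)|(?P<identifier>[A-Za-z_]\w*)|(?P<symbol>\S)"
)

def scan(source):
    """Turn a string of characters into tokens and start the symbol table."""
    tokens, symbol_table = [], {}
    for match in TOKEN.finditer(source):
        kind, text = match.lastgroup, match.group()
        tokens.append((kind, text))
        if kind == "identifier":
            # The scanner records only the spelling; semantic
            # information about the identifier is added later.
            symbol_table.setdefault(text, {"spelling": text})
    return tokens, symbol_table

tokens, table = scan("total = total + 42")
print(tokens)
print(table)
```

A parser would then request these tokens one at a time, which corresponds to the "get" messages mentioned above.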
Compiler Components
[Figure: block diagram of the compiler components, divided into Front End and Back End. Boxes: Source Buffer, Scanner, Parser, Constrainer, Code Generator, Symbol Table, List Buffer, Object Buffer. Data labels: Token, ICode, Machine Code. Messages: Go, Get, GetChar, Put, PutLine, Enter, Search.]

Compiler Components
• List buffer – for source listings, error messages and other printed information
• Symbol Table – used to keep track of information about certain tokens (identifiers, function calls).
• Icode – intermediate code – a predigested version of the source.
  – Taken with the Symbol Table, this provides a clean interface between the front and back ends.

Compiler Components
• can also include one or more optimising modules
• two kinds of optimisation
  – distinguished by whether they depend on characteristics of target machine code
• Machine Independent Optimisation
  – operates on the Icode (sometimes called the Abstract Syntax Tree)
    • reshaping for more efficient code
• Peephole Optimisation
  – most common form of machine dependent optimiser
    • operates on machine code for local improvements
    • small number of instructions considered

Language Definition
• programmer and compiler writer need strict definition of high level language
• programmer
  – consult language definition to find out constructs and meanings
• compiler writer
  – provide for every construct a translation according to its meaning

Language Definition
• some sequences of words are correct
• others are incorrect or ill formed
• grammar or syntax
  – defines which sequences are correct
  – set of rules that define how words can be arranged to form sentences
  – rules provide every sentence with structure
    • can be used as an instrument to recognise the structure of sentence
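To make the peephole idea concrete, here is a small sketch that looks at a window of three instructions at a time. The instruction set (`PUSH`, `ADD`, `PRINT` for a stack machine) is hypothetical, and folding constant additions is just one possible local improvement; a real peephole optimiser works on the actual target machine code.

```python
def peephole(code):
    """Replace PUSH a; PUSH b; ADD with PUSH a+b, using a 3-instruction window."""
    out = []
    for instr in code:
        out.append(instr)
        # Inspect only the last three instructions emitted so far.
        if (len(out) >= 3
                and out[-1] == ("ADD",)
                and out[-2][0] == "PUSH"
                and out[-3][0] == "PUSH"):
            a, b = out[-3][1], out[-2][1]
            out[-3:] = [("PUSH", a + b)]
    return out

program = [("PUSH", 2), ("PUSH", 3), ("ADD",),
           ("PUSH", 4), ("ADD",), ("PRINT",)]
print(peephole(program))
```

Because the window is applied to the already-improved output, the result of one fold can take part in the next one.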
Language Definition
• semantics
  – define meaning of well formed sentences
  – give meaning of every language structure recognised by syntax
• pragmatics
  – play a role in the language description
  – characteristics of specific implementation
    • restrictions of implementation of language

Syntax
• consider English sentences
  – subject, verb, object
• notions of sentence, subject, verb, object, article, noun
  – denote parts of sentence
  – syntactic categories
• in formal languages - tokens
Syntax
• syntactic categories + tokens = grammar symbols
• syntactic categories
  – defined using rewriting rules
    • sentence → subject verb object
    • subject → article noun
    • verb → bites
    • object → article noun
    • article → a | the
    • noun → man | dog
  – choices denoted by vertical bar

Syntax
• sentence denotes notion
  – set of all strings of tokens that satisfy definition of sentence
• language definition described is a generation scheme
  – sentence generated by starting at “sentence” and successively applying rewriting rules

Syntax
[Figure: parse tree for “the dog bites a man”]
sentence
  subject
    article: the
    noun: dog
  verb: bites
  object
    article: a
    noun: man
• interior node and children correspond to rewriting rule

Syntax
• tokens (a the dog man bites) are terminal symbols
  – end of generation process
• sentence, subject, object are non terminals
• rewriting rules are called production rules
  – produce sentences of the language
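The generation scheme can be tried out directly. The sketch below stores the six rewriting rules from the slides in a dictionary and applies them starting from `sentence`; the names `GRAMMAR` and `generate` are illustrative, and the approach only terminates because this particular grammar has no recursive rules.

```python
from itertools import product

# Each nonterminal maps to its alternatives (the choices separated by "|").
GRAMMAR = {
    "sentence": [["subject", "verb", "object"]],
    "subject": [["article", "noun"]],
    "verb": [["bites"]],
    "object": [["article", "noun"]],
    "article": [["a"], ["the"]],
    "noun": [["man"], ["dog"]],
}

def generate(symbol):
    """Return every token sequence that can be derived from symbol."""
    if symbol not in GRAMMAR:
        # Terminal symbol: the generation process ends here.
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        parts = [generate(s) for s in alternative]
        for combination in product(*parts):
            results.append([tok for part in combination for tok in part])
    return results

sentences = [" ".join(tokens) for tokens in generate("sentence")]
print(len(sentences))
print("the dog bites a man" in sentences)
```

The grammar yields 16 sentences in total: four possible subjects, one verb and four possible objects.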
Syntax
• sentence
  – start symbol
  – distinguished nonterminal where generation process starts
• context free grammar
  – productions apply in any context in which the nonterminals occur
• language is context free
  – defined by means of context free grammar

Syntax
• Context free grammar consists of
  – set of terminals or tokens
    • (representation of tokens in sentences)
  – set of nonterminals
    • do not occur in sentences
  – start symbol
  – set of production rules
    • left side and right side
    • string containing zero tokens - empty string
Syntax
• in high level language
  – need semantics of every language structure
  – cannot assume to know the meanings of words
  – evaluation of arithmetic expressions
    • define precedence
      – for every operator
    • define result type for every combination of operand types

Parse Trees
• depicts how a string in the language is derived from start symbol
• parse tree properties
  – root is labelled by the start symbol
  – leaf is labelled by a token
  – interior node is labelled by nonterminal
  – leaves, spelled from left to right
    • yield of tree, generated or derived
• ambiguity ( 3 - 2 + 1)
  – where expression can have more than one parse tree
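The ambiguity in 3 - 2 + 1 can be made visible by enumerating its parse trees. The sketch assumes a deliberately ambiguous grammar, expr → expr op expr | number, which is not given in the slides; trees are represented as nested tuples `(op, left, right)`.

```python
def parses(tokens):
    """Return every parse tree of tokens under expr -> expr op expr | number."""
    if len(tokens) == 1:
        return [tokens[0]]
    trees = []
    # Any operator position may serve as the root of the tree.
    for i in range(1, len(tokens), 2):
        for left in parses(tokens[:i]):
            for right in parses(tokens[i + 1:]):
                trees.append((tokens[i], left, right))
    return trees

def evaluate(tree):
    """Compute the value that a given parse tree assigns to the expression."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a - b

trees = parses([3, "-", 2, "+", 1])
for tree in trees:
    print(tree, "=", evaluate(tree))
```

The two trees correspond to 3 - (2 + 1) = 0 and (3 - 2) + 1 = 2, which is why a language definition has to fix precedence and associativity for every operator.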