Uploaded by codewith ib

compiler ppt

COMPILER
Presented to:
Sir Naeem
Presented by:
Sahil (BSIT-2024-015)
COMPILER
A compiler is a computer program that transforms
source code written in a programming language (the
source language) into another computer language (the
target language), with the latter often taking a binary form
known as object code
It creates an executable program
Cause
Software for early computers was written in
assembly language
The benefits of reusing software on
different CPUs started to become
significantly greater than the cost of writing
a compiler
The first real compilers were the FORTRAN compilers of the late 1950s,
which took 18 person-years to build
Structure of Compiler
Any compiler must perform two major tasks
Analysis of the source program
Synthesis of a machine-language program
THE STRUCTURE OF A COMPILER (2)
Source Program (Character Stream) -> Scanner -> Tokens -> Parser -> Syntactic Structure -> Semantic Routines -> Intermediate Representation -> Optimizer -> Code Generator -> Target machine code
Symbol and Attribute Tables (Used by all Phases of The Compiler)
THE STRUCTURE OF A COMPILER (3)
Scanner
•The scanner begins the analysis of the source program by reading the input, character by character, and grouping characters into individual words and symbols (tokens)
•RE (Regular Expression)
•NFA (Non-deterministic Finite Automaton)
•DFA (Deterministic Finite Automaton)
•LEX
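The grouping of characters into tokens can be sketched in Python with regular expressions. This is a minimal illustration, not the scanner used in the project: the token class names (NUMBER, IDENT, RELOP, OP) and the helper name tokenize are assumptions made for the example.

```python
import re

# Hypothetical token classes for illustration; each one is a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z][A-Za-z0-9_]*"),
    ("RELOP",  r"==|!=|>=|<=|>|<"),
    ("OP",     r"[+\-*/%=]"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP",   r"\s+"),   # whitespace separates tokens but is not a token
    ("ERROR",  r"."),     # any other character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Read the character stream and group it into (kind, text) tokens."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens
```

Calling tokenize("flg == 1") yields an identifier, a relational operator, and a number. A generator such as LEX builds a DFA from the same kind of regular expressions instead of trying them one by one.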
THE STRUCTURE OF A COMPILER (4)
Parser
•Given a formal syntax specification (typically as a context-free grammar [CFG]), the parser reads tokens and groups them into units as specified by the productions of the CFG being used.
•As syntactic structure is recognized, the parser either calls corresponding semantic routines directly or builds a syntax tree.
•CFG (Context-Free Grammar)
•BNF (Backus-Naur Form)
•GAA (Grammar Analysis Algorithms)
•LL, LR, SLR, LALR Parsers
•YACC
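As a rough illustration of how a parser groups tokens according to grammar productions, here is a small recursive-descent (LL-style) parser in Python that builds a syntax tree for arithmetic expressions. The three-production grammar in the comments and the nested-tuple tree shape are assumptions chosen for the example, not the grammar of the project.

```python
import re

def parse_expression(text):
    """Parse an arithmetic expression into a nested-tuple syntax tree."""
    # Simplified scanning for the sketch: characters outside these
    # patterns are silently ignored.
    tokens = re.findall(r"\d+|[A-Za-z]\w*|[-+*/%()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():      # expr -> term (('+' | '-') term)*
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():      # term -> factor (('*' | '/' | '%') factor)*
        node = factor()
        while peek() in ("*", "/", "%"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():    # factor -> NUMBER | IDENT | '(' expr ')'
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token in {"+", "-", "*", "/", "%", ")"}:
            raise SyntaxError(f"unexpected token {token!r}")
        return advance()

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

Each function corresponds to one production, so operator precedence falls out of the grammar: in i+9*k+b the multiplication ends up deeper in the tree than the additions.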
THE STRUCTURE OF A COMPILER (5)
Semantic Routines
•Perform two functions
  •Check the static semantics of each construct
  •Do the actual translation
•The heart of a compiler
•Syntax Directed Translation
•Semantic Processing Techniques
•IR (Intermediate Representation)
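A static-semantics check can be sketched with a symbol table in Python. The function name check_program, its input format (lists of pairs), and the error messages are hypothetical and chosen only to keep the example short; the lower-casing reflects the rule that identifiers are not case sensitive.

```python
def check_program(declarations, assignments):
    """Return a list of static-semantic errors.

    declarations: list of (type_name, identifier) pairs
    assignments:  list of (identifier, literal_value) pairs
    """
    symbols = {}   # symbol table: lower-cased identifier -> declared type
    errors = []
    for type_name, name in declarations:
        key = name.lower()
        if key in symbols:
            errors.append(f"'{name}' is declared twice")
        else:
            symbols[key] = type_name
    for name, value in assignments:
        declared = symbols.get(name.lower())
        if declared is None:
            errors.append(f"'{name}' is used but not declared")
        elif declared == "int" and not isinstance(value, int):
            errors.append(f"cannot assign {value!r} to int '{name}'")
        elif declared == "string" and not isinstance(value, str):
            errors.append(f"cannot assign {value!r} to string '{name}'")
    return errors
```

An empty result means the checked constructs are well formed; a real semantic routine would then go on to emit intermediate code for them.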
THE STRUCTURE OF A COMPILER (6)
Optimizer
•The IR code generated by the semantic routines is analyzed and transformed into functionally equivalent but improved IR code
•This phase can be very complex and slow
•Peephole optimization, loop optimization, register allocation, code scheduling
•Register and Temporary Management
•Peephole Optimization
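One of the simplest peephole optimizations is algebraic simplification over a window of a single instruction. The sketch below assumes a three-address IR written as (dest, op, left, right) tuples; that format and the "copy" opcode are assumptions made for the example, not the project's IR.

```python
def peephole(ir):
    """Simplify three-address instructions of the form (dest, op, left, right)."""
    optimized = []
    for dest, op, left, right in ir:
        # Algebraic identities: x * 1, x + 0 and x - 0 all equal x.
        if (op == "*" and right == 1) or (op in ("+", "-") and right == 0):
            op, right = "copy", None
        # Copying a variable onto itself has no effect, so drop it.
        if op == "copy" and dest == left:
            continue
        optimized.append((dest, op, left, right))
    return optimized
```

With this pass a statement such as k=k*1 produces no instructions at all, while i=i+9 is left unchanged.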
THE STRUCTURE OF A COMPILER (7)
Code Generator
•Interpretive Code Generation
•Generating Code from Tree/DAG
•Grammar-Based Code Generator
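Generating code from a tree can be sketched as a post-order walk. The example below targets a hypothetical stack machine (PUSH, POP, ADD, ...) rather than the 8086 code shown elsewhere in this deck, and assumes expression trees written as nested tuples of the form (operator, left, right) with strings as leaves.

```python
# Hypothetical stack-machine opcodes, one per arithmetic operator.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV", "%": "MOD"}

def generate(tree):
    """Emit stack-machine instructions for a nested-tuple expression tree."""
    if isinstance(tree, tuple):
        op, left, right = tree
        # Post-order walk: code for both operands first, then the operator.
        return generate(left) + generate(right) + [OPCODES[op]]
    return [f"PUSH {tree}"]   # a leaf is a variable name or a constant

def generate_assignment(target, tree):
    """Evaluate the expression, then store the result in the target variable."""
    return generate(tree) + [f"POP {target}"]
```

For the tree of i+9*k+b the multiplication is emitted before the first addition, because the walk reaches the deeper subtree first.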
THE STRUCTURE OF A COMPILER (8)
Scanner [Lexical Analyzer] -> Tokens
Parser [Syntax Analyzer] -> Parse tree
Semantic Process [Semantic Analyzer] -> Abstract Syntax Tree w/ Attributes
Code Generator [Intermediate Code Generator] -> Non-optimized Intermediate Code
Code Optimizer -> Optimized Intermediate Code
Code Generator -> Target machine code
Language Description
Identifier Rules
•An identifier can have a maximum length of 6 characters.
•Identifiers are not case sensitive.
•An identifier can only contain alphanumeric characters (a-z, A-Z, 0-9) and the underscore (_).
•The first character of an identifier must be a letter (a-z, A-Z).
•Keywords cannot be used as identifiers.
•No special characters, such as semicolon, period, whitespace, slash, or comma, are permitted in or as an identifier.
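The identifier rules can be checked with one regular expression plus a keyword lookup, as in this Python sketch. The slides do not give an official keyword list, so the KEYWORDS set below is an assumption collected from the statement forms used in this deck, and is_valid_identifier is a hypothetical helper name.

```python
import re

# Assumed keyword set (not an official list from the slides).
KEYWORDS = {
    "int", "string", "char", "declaration", "start", "end",
    "if", "then", "endif", "switch", "cases", "break", "endcase",
    "repeat", "until", "while", "endwhile", "for", "endfor",
    "input", "output",
}

# One letter followed by at most five letters, digits or underscores,
# which enforces both the first-character rule and the length limit of 6.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,5}")

def is_valid_identifier(name):
    """Return True if name satisfies the identifier rules."""
    if IDENTIFIER.fullmatch(name) is None:
        return False
    # Identifiers are not case sensitive, so compare keywords in lower case.
    return name.lower() not in KEYWORDS
```

For example, flg and mes1 are accepted, while counter (seven characters), 1abc (starts with a digit), and While (a keyword in any letter case) are rejected.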
Data Types:
Our language supports only 3 data types
•Integer
•String
•Character
Expressions
1.Arithmetic operators (+, -, *, /, %)
2.Unary operator
3.Parentheses
4.Only integer operands are supported
5.Relational expressions (>, <, >=, <=, ==, !=)
6.Character string and integer constants
Statements
•Declaration statement : int a;
•Declaration and initialisation : int a=5;
•Assignment statement : a=6;
Conditional statement
Simple if (nesting not allowed)
if then
Endif
Switch Statement (nesting not allowed)
Switch()
Cases
Value 1:
Break;
Value n:
break;
Endcase
Repetition Statement (nesting not allowed)
a.Repeat
Until ()
b.While (relational expression)
Endwhile
c.For = start value, end value, inc/dec
………
Endfor
4
I/O Statement
•Input ;
•Output ;
Program Structure
Declaration:
Start
End
Sample Program I
#mode 10
declaration
int r
int c
int in
int flg
start
r=0
flg = 1
while( flg == 1 )
if( c == 0) then
flg = 0
endif
c = c-1
endwhile
end
OUTPUT 1
START:
MOV AX, @DATA
MOV DS, AX
MOV AX,
MOV r, AX
MOV AX,
MOV flg, AX
LB01:
MOV AX,
CMP AX,
JNE LB01
MOV AX,
CMP AX,
JNE LB01
MOV AX,
MOV flg, AX
LB02:
MOV AX,
SUB AX,
MOV c, AX
JMP LB01
LB03:
MOV AX, 4C00H
INT 21H
END START
Sample Program II
#mode 10
declaration
int a ; b
int i
int k
string mes1
end
start
k=k*1
if(i<9 )then
i=i+9
k=k*1
endif
i=i-45
repeat
i=i+9*k+b
k=k*1
output "Hello World"
input k
until(i<2 )
while(k>3 )
i=i+9
k=k*1
endwhile
OUTPUT
START:
MOV AX, @DATA
MOV DS, AX
MOV AX, k
MUL 1
MOV k, AX
MOV AX, i
CMP AX, 9
JGE LB01
MOV AX, i
ADD AX, 9
MOV i, AX
MOV AX, k
MUL 1
MOV k, AX
LB01:
MOV AX, i
SUB AX, 45
MOV i, AX
LB02:
MOV AX, i
ADD AX, 9
MUL k
ADD AX, b
MOV i, AX
MOV AX, k
MUL 1
MOV k, AX
OUTPUT
LEA DX, "Hello World"
CALL MESSAGE
CALL INDEC
MOV k, AX
MOV AX, i
CMP AX, 2
JGE LB01
LB03:
MOV AX,
CMP AX, 3
JLE LB01
MOV AX, i
ADD AX, 9
MOV i, AX
MOV AX, k
MUL 1
MOV k, AX
JMP LB01
MOV AX, i
ADD AX, 9
MOV i, AX
MOV AX, k
MUL 1
MOV k, AX
JMP LB01
LB04:
MOV AX, 4C00H
INT 21H
END START
Feasibility and future scope
With the growth of technology, ease of working is given
priority.
We have moved from C and C++ to Python, Ruby, etc., which
require fewer lines of code.
Our project can be extended into a new language that is
easy to learn, faster, has more built-in features, and has many
more qualities of a good programming language.
Conclusion
In a compiler, intermediate code generation is independent of
the target machine, and the conversion of intermediate code
to target code is independent of the source language.
Thus we have implemented the front end of the compilation process.
It includes 3 phases of compilation:
•lexical analysis
•syntax analysis
•semantic analysis
followed by intermediate code generation.