Ex. No.: 1
Lexical Analyzer Using C
Aim:
To write a program for implementing a Lexical Analyzer in ‘C’
Algorithm:
STEP 1: Create files for storing keywords, Operators, and a program as an input.
STEP 2: Read the source file as input.
STEP 3: Scan the first character.
STEP 4: Check whether the character begins with an alphabet or a digit using the
functions isalpha() and isdigit().
STEP 5: If isdigit() returns true then print it as a constant.
STEP 6: If isalpha() returns true check it with keywords in the keyword file.
STEP 7: If any word matches with the keywords in the file then print it as keyword
otherwise print it as an identifier.
Pseudo code:
BEGIN
IMPORT headers
DECLARE POINTERS fsur, fkey, fopr
SET set TO 0
ASSIGN source.txt FILE TO fsur
ASSIGN keyword.txt FILE to fkey
ASSIGN operator.txt FILE to fopr
READ keywords AND identifiers
IF input IS an ALPHABET
THEN ASSIGN input character c TO str[i]
END IF
ASSIGN ‘\0’ TO str[i]
REWIND fkey
IF strcmp(key,str) EQUALS 0
THEN SET set EQUAL to 1
BREAK
END IF
IF set EQUALS 0
PRINT “identifier”
IF input IS a DIGIT
PRINT “numeric literal”
IF input IS a STRING
PRINT “string literal”
IF input IS a CHARACTER
PRINT “character literal”
IF input IS a preprocessor directive
PRINT “preprocessor”
IF input IS a bracket
PRINT “open symbol” OR “close symbol”
IF input IS a punctuation
PRINT “punctuation”
IF input EQUALS “/”
PRINT “comment lines”
IF input IS an OPERATOR
PRINT “operator”
RETURN 0
Explanation:
A compiler is responsible for converting a high-level language into machine language. There
are several phases involved in this, and lexical analysis is the first phase. The lexical analyzer
is the part of the compiler that detects the tokens of the program and sends them to the syntax
analyzer. A token is the smallest entity of the code: a keyword, identifier, constant, string
literal, or symbol. The process of forming tokens from an input stream of characters is called
tokenization.
Different types of tokens in C.
Keywords: for, if, include, etc.
Identifiers: variables, functions, etc.
Separators: ',', ';', etc.
Operators: '-', '=', '++', etc.
Example 1:
Consider the expression Sum=3+2; in the C programming language:
Lexeme    Token category
Sum       identifier
=         assignment operator
3         integer
+         addition operator
2         integer
;         end of the statement
Example 2:
Consider the expression if ( x > 3.1 ) in the C programming language:

Lexeme    Token category
if        keyword
(         open parenthesis
x         identifier
>         comparison operator
3.1       floating-point literal
)         close parenthesis
Sample Input and Output:
Input:
// demo program- source.txt
# define N 30
int main()
{
float a = 61.37;
char str1 = 's';
printf("%c", str1);
return 0;
}
Keyword.txt
define int main float char printf return if while for
Operator.txt
+ - * /
= % ==
> < >= <=
Output:
# is a preprocessor symbol
define is a keyword
N is an identifier
30 is a numeric literal
int is a keyword
main is a keyword
( is an open brace
) is a close brace
{ is an open brace
float is a keyword
a is an identifier
= is an operator
61.37 is a numeric literal
; is end of statement
char is a keyword
str1 is an identifier
= is an operator
s is a character literal
printf is a keyword
( is an open brace
"%c" is a string literal
, is a punctuation
str1 is an identifier
) is a close brace
; is end of statement
return is a keyword
0 is a numeric literal
} is a close brace
Result:
Thus, the program to implement lexical analyzer using C was created and its output was verified.
Ex. No.: 2
To Ignore Redundant Spaces, Tabs, and New Lines Using
C Program
Aim:
To write a program in C to ignore redundant spaces, tabs, and new lines.
Algorithm:
STEP 1: Read the source file.
STEP 2: Recognize numbers using isdigit() function
STEP 3: Recognize keywords and Identifiers using isalpha() function
STEP 4: Write function to ignore whitespace, tabs and newline
STEP 5: Close the files and print the output.
STEP 6: End.
Pseudo code:
BEGIN
IMPORT libraries
DECLARE input variable
DECLARE a character variable AS previous
READ source FILE
READ next character FROM input
IF character EQUALS space OR tab OR newline
IF previous IS NOT space OR tab OR newline
PRINT space
SET previous flag TO TRUE
IF character NOT a space OR tab OR newline
PRINT character
SET previous flag TO FALSE
END
Explanation:
In C programming, white space refers to characters such as space, tab, and newline. These
characters are often used for formatting and are not significant in the logic of a program.
However, they can make reading and analyzing the code more difficult. A scanner reads the
source program one character at a time, carving the source program into a sequence of atomic
units called tokens.
The lexical analyzer is the first phase of the compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. Upon receiving
a “get next token” command from the parser, the lexical analyzer reads input characters until it
can identify the next token.
One of its tasks is stripping comments and white space, in the form of blank, tab, and newline
characters, out of the source program. Another is correlating error messages from the compiler
with the source program. Removing redundant white space can improve the readability and
maintainability of your code: by reducing clutter and improving the structure, you make the code
easier to understand and modify over time.
Sample Input and Output:
Input:
Enter the c program
int main()
{
int a=10,20;
char ch;
float f;
}
Output:
The numbers in the program are: 10 20
The keywords and identifiers are:
int is a keyword
main is an identifier
int is a keyword
a is an identifier
char is a keyword
ch is an identifier
float is a keyword
f is an identifier
Special characters are ( ) { = , ; ; ; }
Total no. of lines are:5
Result:
Thus, the program to ignore redundant spaces, tabs and new lines was executed successfully.
Ex. No.: 3
Lexical Analyzer Using LEX Tool
Aim:
To write a program to implement Lexical Analyzer using LEX tool.
Algorithm:
STEP 1: Define and include necessary declarations and Regular definitions.
STEP 2: Define the Translation Rule section for Keywords, Identifiers, Constants,
operators and Literal Strings
STEP 3: Read the input from a file.
STEP 4: Lex tool will compare the input with patterns defined under translation
rule section
STEP 5: If any of the pattern matches with the input respective action will be
performed.
STEP 6: End.
Pseudo code:
BEGIN
IMPORT headers
DECLARE lex symbols
DECLARE id
READ input FROM file
CALL yylex()
DEFINE translation rules WITH patterns AND actions
COMPARE using lex tool
DEFINE auxiliary function()
SET integer expression
SET float expression
SET operator expression
SET identifier expression
SET punctuation expression
SET preprocessor expression
SET string literal expression
IF pattern MATCHES
PERFORM actions
PRINT output
END
Explanation:
LEX is a tool that automatically generates a lexical analyzer (a finite automaton). It takes a
LEX source program as its input and produces a lexical analyzer as its output. The generated
lexical analyzer converts the input string entered by the user into tokens.
In programs with a structured input, two tasks occur over and over: dividing the input into
meaningful units, and then discovering the relationships among the units. For a C program, the
units are variable names, constants, and strings. This division into units (called tokens) is known
as lexical analysis, or LEXING. LEX helps by taking a set of descriptions of possible tokens and
producing a routine called a lexical analyzer, LEXER, or scanner.
Lex file format
A Lex program is separated into three sections by %% delimiters. The format of Lex source is
as follows:
{ definitions }
%%
{ rules }
%%
{ user subroutines }
• Definitions include declarations of constants, variables, and regular definitions.
• Rules define statements of the form p1 {action1} p2 {action2} ... pn {actionn}, where pi
describes a regular expression and actioni describes the action the lexical analyzer should
take when pattern pi matches a lexeme.
• User subroutines are auxiliary procedures needed by the actions.
• The subroutines can be loaded with the lexical analyzer and compiled separately.
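A small sketch of how the three sections fit together for token recognition. The patterns and messages below are illustrative choices, not the exact lab program:

```lex
%{
#include <stdio.h>
%}
%%
"int"|"float"|"char"|"void"|"return"   { printf("A keyword: %s\n", yytext); }
[a-zA-Z_][a-zA-Z0-9_]*                 { printf("An identifier: %s\n", yytext); }
[0-9]+                                 { printf("An integer: %s\n", yytext); }
"+"|"-"|"*"|"/"|"="                    { printf("An operator: %s\n", yytext); }
[ \t\n]                                ; /* skip whitespace */
.                                      { printf("A punctuation: %s\n", yytext); }
%%
int yywrap(void) { return 1; }
int main(void) { yylex(); return 0; }
```

Compiled with lex and cc as shown under "Commands to execute" below, it echoes a token class for each lexeme read from standard input.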
Sample Lex program:
Lex program for Sum of two numbers
%{
#include<stdio.h>
%}
%%
%%
main()
{
int a=5,b=10;
printf("sum=%d",a+b);
return 0;
}
Commands to execute:
$ lex filename.l    // will create a file called lex.yy.c
$ cc lex.yy.c -lfl  // compiling and linking the flex library
$ ./a.out
Sample Input and Output:
Input:
source.c
# include<stdio.h>
void main()
{
int a;
a = 10 + 2;
printf("%d", a);
}
Output:
# include<stdio.h> is a preprocessor
A keyword: void
A keyword: main
( is a punctuation
) is a punctuation
{ is a punctuation
A keyword: int
An identifier: a
; is a punctuation
An identifier: a
An operator: =
An integer: 10
An operator: +
An integer: 2
; is a punctuation
A keyword: printf
( is a punctuation
"%d" is a literal string
, is a punctuation
An identifier: a
) is a punctuation
; is a punctuation
} is a punctuation
Result:
Thus, Program to implement Lexical Analyzer using ‘lex’ tool was executed and verified.
Ex. No.: 4
Predictive Parser
Aim:
To write a program to implement a Predictive Parser.
Algorithm:
STEP 1: Initialize stack and push starting symbol onto it.
STEP 2: Initialize index for input string
STEP 3: Repeat until stack is empty
STEP 4: Get top of stack
STEP 5: If symbol is a terminal, match it with current input token
STEP 6: If the symbols match, move to the next input token
STEP 7: If the symbols don't match, the input is not in the language
STEP 8: If symbol is a non-terminal, use the prediction table to get the production to
use production = predictionTable[symbol][input[inputIndex]]
STEP 9: If a production is found, push its symbols onto the stack in reverse order
STEP 10: If no production is found, the input is not in the language
STEP 11: If the stack is empty and the input has been fully processed, the input is in
the language return "Input is in the language" else return "Error: Input not
in language"
Pseudo code:
BEGIN
IMPORT headers
INITIALIZE stack AND PUSH start symbol
INITIALIZE index
REPEAT until stack IS EMPTY
GET top OF stack
IF symbol EQUALS terminal
MATCH WITH current token
IF symbol EQUALS match
MOVE TO next token
IF symbol NOT MATCHED
RETURN “input is not in the language”
IF symbol IS a non-terminal
SET production = predictionTable[symbol][input[inputIndex]]
IF production FOUND
PUSH ONTO stack IN reverse order
IF production NOT FOUND
RETURN “input is not in the language”
IF stack IS EMPTY
RETURN “Input is in the language”
ELSE
RETURN “Error: Input not in language”
Explanation:
The parser is that phase of the compiler which takes a token string as input and with the help
of existing grammar, converts it into the corresponding Intermediate Representation (IR). The
parser is also known as Syntax Analyzer.
A predictive parser is a recursive descent parser with no backtracking or backup. It is a
top-down parser that does not require backtracking. At each step, the choice of the rule to be
expanded is made based on the next terminal symbol.
Example problem:
Construct Predictive parser or LL(1) PARSER
E->E+T| T
T->T*F | F
F-> (E)| id
Solution:
Step 1: Remove Left Recursion
E -> TE'
E' -> +TE' | ε
T -> FT'
T' -> *FT' | ε
F -> (E) | id
Step 2: Find FIRST( ) & FOLLOW( )
FIRST(E) = { (, id }
FIRST(T) = { (, id }
FIRST(F) = { (, id }
FIRST(E') = { +, ε }
FIRST(T') = { *, ε }
FOLLOW(E) = { ), $ }
FOLLOW(E') = { ), $ }
FOLLOW(T) = { +, ), $ }
FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }
Step 3: Construct Predictive table:
Non-Terminal   id        +          *          (         )        $
E              E->TE'                          E->TE'
E'                       E'->+TE'                        E'->ε    E'->ε
T              T->FT'                          T->FT'
T'                       T'->ε      T'->*FT'             T'->ε    T'->ε
F              F->id                           F->(E)
Sample Input and Output:
Input:
S->A
A->Bb
A->Cd
B->aB
B->@
C->Cc
C->@
Output:
Predictive parsing table:
------------------------------------------------------------
        a         b         c         d         $
------------------------------------------------------------
S       S->A      S->A      S->A      S->A
------------------------------------------------------------
A       A->Bb     A->Bb     A->Cd     A->Cd
------------------------------------------------------------
B       B->aB     B->@      B->@      B->@
------------------------------------------------------------
C                 C->@      C->@      C->@
------------------------------------------------------------
Result:
Thus, Program to implement Predictive parser was executed and verified successfully.
Ex. No.: 5
Arithmetic Calculator Using LEX and YACC
Aim:
To write a program to implement an Arithmetic Calculator using LEX and YACC
Algorithm:
Create a lex Program named “sample.l”
STEP 1: Include the necessary library files for token definitions and ‘C’ declarations
STEP 2: Define the rules for regular definitions and patterns
STEP 3: Define the functions invoked in the translation rules
Create a yacc program named “sample.y”
STEP 1: Include the necessary declarations for yacc specifications and ‘C’ headers
STEP 2: Define the translation rules and the input structure specifications for the
grammar rules
STEP 3: Define the functions that are invoked in the rules
STEP 4: Read the input
STEP 5: If the given input matches with the defined rules, the yacc tool executes
the respective actions.
STEP 6: End.
Pseudo code:
File Name: sample.l
BEGIN
IMPORT libraries
DEFINE rules FOR patterns
DEFINE functions
INVOKE translation rules
END
File Name: sample.y
BEGIN
DECLARE headers
DEFINE translation rules
AND input structure
DECLARE tokens
DEFINE lex AND yacc
DEFINE function yyparse() and CALL yylex()
CHECK input WITH rules
EXECUTE function
END
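As an illustration of how the two files cooperate, here is a minimal integer calculator. The lab's version also handles functions such as log and tan; this sketch covers only + - * / and parentheses, so it is an assumption-laden skeleton rather than the full program. sample.l:

```lex
%{
#include "y.tab.h"
#include <stdlib.h>
%}
%%
[0-9]+     { yylval = atoi(yytext); return NUMBER; }
[-+*/()\n] { return *yytext; }
[ \t]      ;
%%
int yywrap(void) { return 1; }
```

sample.y:

```yacc
%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
%}
%token NUMBER
%left '+' '-'
%left '*' '/'
%%
line : expr '\n'      { printf("Answer = %d\n", $1); }
     ;
expr : expr '+' expr  { $$ = $1 + $3; }
     | expr '-' expr  { $$ = $1 - $3; }
     | expr '*' expr  { $$ = $1 * $3; }
     | expr '/' expr  { $$ = $1 / $3; }
     | '(' expr ')'   { $$ = $2; }
     | NUMBER         { $$ = $1; }
     ;
%%
int main(void) { return yyparse(); }
```

The %left declarations resolve the shift/reduce ambiguity of the grammar and give * and / higher precedence than + and -. Build with the commands shown in the "How to execute" section below.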
Explanation:
Yacc (for “yet another compiler compiler”) is the standard parser generator for the Unix operating
system. An open-source program, yacc generates code for the parser in the C programming
language. The acronym is usually rendered in lowercase but is occasionally seen as YACC or
Yacc.
How to execute:
$ yacc -d file.y
$ lex file.l
$ cc lex.yy.c y.tab.c -ll -ly -lm
$ ./a.out
Sample Input and Output:
Input:
Enter the Expression: log100
Output:
Answer = 2
Input:
Enter the Expression: tan45
Output:
Answer = 0.999954
Result:
Thus, the program to implement an arithmetic calculator using LEX and YACC was executed
successfully.
Ex. No.: 6
Generate Three Address Code for a C Program Using LEX and YACC
Aim:
To write a program to generate three address code for a simple program.
Algorithm:
STEP 1: Read the input expression from the user using the gets() function.
STEP 2: Read and check each and every character of the given input in order to
find the highest precedence operator.
STEP 3: After finding the highest precedence operator, assign the subexpression
involving that operator to a temporary name and print it.
STEP 4: Similarly find the next highest precedence operator and repeat Step 3 until ‘\0’
is reached.
STEP 5: END
Pseudo code:
READ string
READ character y
IF y EQUALS ‘+’ OR ‘-’ OR ‘*’ OR ‘/’ OR ‘%’
IMPLEMENT function compute()
CHECK precedence
ADVANCE pointer
PERFORM arithmetic operation
ASSIGN subexpression TO temp
DECREMENT stpos
INCREMENT endpos
ASSIGN end EQUAL TO start
CONCAT buff, expr, stpos
ASSIGN ‘t’ TO buff[stpos]
COPY expr, buff
REPEAT until ‘\0’ IS reached
END
Explanation:
A compiler can broadly be divided into two phases based on the way they compile.
1. Analysis Phase
2. Synthesis Phase
Analysis Phase (Front end of the Compiler):
Known as the front end of the compiler, the analysis phase reads the source program, divides it
into core parts, and then checks for lexical, grammar, and syntax errors.
The analysis phase generates an intermediate representation of the source program and a symbol
table, which are fed to the synthesis phase as input.
The front end of a compiler is the part that takes the source language and produces an intermediate
representation. For example, a C compiler front end will take an input file containing C statements
and translate that into some intermediate form.
A compiler back end would convert that to a specific machine language.
Three address code is a type of intermediate code which is easy to generate and can be easily
converted to machine code.
It makes use of at most three addresses and one operator to represent an expression, and the value
computed at each instruction is stored in a temporary variable generated by the compiler. The
compiler decides the order of operations given by the three-address code.
General representation – a = b op c
where a, b, and c represent operands such as names, constants, or compiler-generated temporaries,
and op represents the operator.
Example 1:
Source statement: x = a + b* c + d
Three address code:
t1 = b* c
t2 = a + t1
x = t2 + d
Example 2:
Source statement: (a+b)*(c+d) -(a+b+c)
Three address code:
t1 = a + b
t2 = c + d
t3 = t1 * t2
t4= a + b
t5 = t4 + c
t6 = t3 - t5
Sample Input and Output:
Input:
Enter the Expression
a=a+(3-(b–2)*4)
Output:
t1 = b – 2
t2 = t1 * 4
t3 = 3 – t2
t4 = a + t3
a = t4
Result:
Thus, the program to generate three address code for a simple program was successfully executed.
Ex. No.: 7
Code Optimization Technique – Constant Folding
Aim:
To write a program to implement the Constant folding Code Optimization techniques.
Algorithm:
STEP 1: Identify the tokens in the program and store it in a separate array.
STEP 2: Read through the tokens to identify any assignment to constants.
STEP 3: Scan through the tokens to find if there is any occurrence of the token in an
arithmetic expression which is a variable assigned to a constant.
STEP 4: Replace the occurrence of the token with the constant.
STEP 5: Finally write the set of tokens after replacement of the variable with the constant.
Pseudo Code:
//Constant folding
BEGIN
INCLUDE header files
DEFINE struct with variables
READ input
GET tokens by get_tokens
OPEN input file inside MAIN
WHILE true
READ char by getc and store in ch
WHILE ch true
BUFFER ++
ELSE break
DEFINE readinput
DECLARE variables
COPY buffer to temp by strcpy
READ tokens and compare
DEFINE gettokens
READ char in token
AND
COMPARE with DEFINED operator
IF match replace
END
Explanation:
Code optimization is a program modification strategy that endeavours to enhance the intermediate
code, so a program utilises the least potential memory, minimises its CPU time and offers high
speed. Code optimization is essential to enhance the execution and efficiency of a source code. It
is mandatory to deliver efficient target code by lowering the number of instructions in a program.
Code Optimization Techniques:
• Compile time evaluation
• Constant propagation
• Copy propagation
• Dead code elimination
• Unreachable code elimination
• Induction variable and strength reduction
Examples:
Compile Time Evaluation:
x = 12.4
y = x/2.3
Evaluate x/2.3 as 12.4/2.3 at compile time.
Constant Propagation:
pi = 3.14
a = pi * r * r
//After Optimization
a = 3.14 * r * r
Copy Propagation:
//Before Optimization
c = a * b
x = a
d = x * b + 4
//After Optimization
c = a * b
x = a
d = a * b + 4
Dead Code Elimination:
//Before Elimination
c = a * b
x = a
d = a * b + 4
//After Elimination
c = a * b
d = a * b + 4
Unreachable Code Elimination:
#include <iostream>
using namespace std;
int main()
{
int num;
num=10;
cout << "GFG!";
return 0;
cout << num; //unreachable code
}
//after elimination of unreachable code
int main()
{
int num;
num=10;
cout << "GFG!";
return 0;
}
Induction Variable and Strength Reduction:
i = 1;
while (i < 10)
{
y = i * 4;
i++;
}
//After Reduction
t = 4;
while (t < 40)
{
y = t;
t = t + 4;
}
Sample Input and Output:
main()
{
float pi=3.14,r,a;
scanf("%f",&r);
a=pi*r*r;
printf("a = %f", a);
return 0;
}
$ ./a.out
main()
{
float pi = 3.14 , r , a ;
scanf("%f" , &r) ;
a = 3.14 * r * r ;
printf("a = %f" , a) ;
return 0 ;
}
Result:
Thus, the program to implement Constant folding Code Optimization is executed and verified.
Ex. No.: 8
Machine Code Generation – Back End Compiler
Aim:
To write a program to implement the Back end of the compiler.
Algorithm:
STEP 1: Read the intermediate codes as input from the user.
STEP 2: Read and check each and every intermediate code in order to find the operator
involved in it.
STEP 3: Based on the operator, we can print the mnemonics names for assembly code
instruction. i.e., if the operator involved is ‘+’ then we can print the equivalent
mnemonic name as ADD.
STEP 4: Similarly do steps 2 & 3 for all the intermediate codes given.
STEP 5: END.
Pseudo code:
BEGIN
DEFINE function isopr(c)
IF c EQUALS ‘+’ OR ‘-’ OR ‘*’ OR ‘/’
THEN RETURN 1
ELSE
RETURN 0
ASSIGN string TO tmp
IF tmp[0] IS a DIGIT
PRINT newline PLUS MOV
CONCAT '#' WITH tmp
CONCAT "R" WITH rno1
ELSE
PRINT newline PLUS MOV CONCAT tmp
CONCAT "R" WITH rno1
END IF
ELSE IF tmp[0] EQUALS 't'
THEN
CALL function addtoreg AND
RETURN rno2
PRINT newline PLUS "MOV R"
CONCAT rno1 AND "R"
CONCAT rno2
END IF
END
Explanation:
A compiler can broadly be divided into two phases based on the way they compile.
1. Analysis Phase
2. Synthesis Phase
Synthesis Phase (Back End of the Compiler):
Known as the back end of the compiler, the synthesis phase generates the target program with the
help of intermediate source code representation and symbol table.
A compiler back end takes that intermediate representation and produces object code. So, for
example, a C compiler front end will take an input file containing C statements and translate that
into some intermediate form. A compiler back end would convert that to a specific machine
language.
Example:
x = a+b*50
Intermediate Code:
temp1 = int to real (50)
temp2 = id3 * temp1
temp3 = id2 + temp2
id1 = temp3
Machine Code:
MOVF id3, R2
MULF #50.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
Sample Input and Output:
Input:
Enter the number of Intermediate Code Entries : 3
Enter the Intermediate Code
t1 = a + b
t2 = 25 * t1
ans = t2
Output:
MOV a, R0
ADD b, R0
MOV #25, R1
MUL R0,R1
MOV R1, ans
Result:
Thus, the program to implement the back end of a Compiler was successfully executed.
Ex. No.: 9
Recursive Descent Parser
Aim:
To write a program to implement a Recursive Descent Parser.
Algorithm:
STEP 1: Read the number of productions.
STEP 2: Read the productions, Start Symbol and input string to be parsed.
STEP 3: Check each and every character of the input string against the right side of
the start symbol's production.
STEP 4: If any non-terminal is present in the production, then substitute its
alternatives from the remaining productions given.
STEP 5: If all the characters of the input string match the production right-side
symbols, then print that the parsing has been completed successfully.
STEP 6: Otherwise print that the parsing is not completed successfully.
STEP 7: End
Pseudo code:
IMPORT headers
DEFINE TRUE 1
DEFINE FALSE 0
READ no of productions
READ productions, start symbol, string
SEARCH start symbol
PERFORM left factoring
DEFINE function TO parse single symbol
GET NEXT token FROM token stream
IF token MATCHES symbol
THEN MOVE TO next symbol
ELSE
PRINT “Input does not match”
END IF
PRINT “Input is processed successfully” WHEN the string IS fully matched
END
Explanation:
A Recursive Descent Parser uses the technique of top-down parsing without backtracking. It can
be defined as a parser that uses various recursive procedures to process the input string, with no
backtracking. It can be implemented simply in any recursive language.
The first symbol of the string on the R.H.S. of a production uniquely determines the correct
alternative to choose. The major approach of recursive-descent parsing is to relate each
non-terminal with a procedure.
The objective of each procedure is to read a sequence of input characters that can be produced by
the corresponding non-terminal and return a pointer to the root of the parse tree for that
non-terminal. The structure of the procedure is prescribed by the productions for the equivalent
non-terminal.
Sample Input and Output:
Input:
Enter the no.of production rules : 3
Enter the production rules
S → cAd
A → ab
A→a
Output:
Enter the starting symbol : S
Enter the input string : cad
Current Production Symbol c
Input: c matches with the Production S → cAd
Current Production Symbol A
A is a non terminal, so expand with A → ab
Current Production Symbol a
Input: a matches with the Production A → ab
Current Production Symbol b
Input: d does not match with the Production A → ab
So, Backtrack and find the next alternative
Taking Next Alternative A → a
Current Production Symbol a
Input: a matches with the Production A → a
Current Production Symbol d
Input: d matches with the Production S → cAd
Input is processed successfully
Result:
Thus, the program to implement Recursive Descent Parser was executed successfully.
Ex. No.: 10
Symbol Table Generation
Aim:
To write a program to generate Symbol Table.
Algorithm:
STEP 1: Initialize an empty symbol table.
STEP 2: Parse the input source code and extract all identifiers (such as variable names,
function names, etc.).
STEP 3: For each identifier, check if the identifier already exists in the symbol table.
   i. If it does, report an error (such as a redeclaration error).
   ii. If it does not, add the identifier to the symbol table with an associated
       scope level, data type, and initial value (if applicable).
STEP 4: Repeat steps 2-3 for all source code files being compiled.
STEP 5: Return the symbol table for use by subsequent compiler phases.
Pseudo code:
BEGIN
INITIALIZE empty ST
PARSE input
EXTRACT identifiers
CHECK if identifier EXIST in ST
TRUE
report ERROR as REDECLARATION
FALSE
ADD to the ST
REPEAT
RETURN ST
END
Explanation:
Symbol Table is an important data structure created and maintained by the compiler in order to
keep track of the semantics of variables; i.e., it stores information about the scope and binding
of names, and about instances of various entities such as variable and function names, classes,
and objects.
It is built in the lexical and syntax analysis phases. The information is collected by the analysis
phases of the compiler and is used by the synthesis phases of the compiler to generate code. It is
used by the compiler to achieve compile-time efficiency.
Items stored in Symbol table:
Variable names and constants
Procedure and function names
Literal constants and strings
Compiler generated temporaries
Labels in source languages
Information used by the compiler from Symbol table:
Data type and name
Declaring procedures
Offset in storage
If structure or record then, a pointer to structure table.
For parameters, whether parameter passing by value or by reference
Number and type of arguments passed to function
Base Address
Example:
main()
{
int counter;
int starPoint;
}
Symbol Table:
Name      Type  Address  Scope
counter   int   0        main
starPoint int   4        main
Sample Input and Output:
Input:
Enter the number of symbols: 4
Enter the symbol names:
x
y
z
s
Output:
Symbol table:
Name  Address
x     0
y     4
z     8
s     12
Result:
Thus, the program to generate symbol table was executed successfully.