Uploaded by Annie Rawat

Lecture [14-15]

advertisement
Semantic Analysis
The Structure of a compiler
Character Stream
Lexical Analyzer
Target
Code
Token Stream
M/c dependent Code
Optimizer
Syntax Analyzer
Syntax tree
Target
Code
Semantic Analyzer
Code Generator
Syntax tree
IR
Intermediate Code
Generator
Front End
March 10 & 21, 2014
IR
M/c Independent Code
Optimizer
Vandana, BITS Pilani, CSF363/IS F342
Back End
2
Syntax Directed Definition
• The grammar and the set of semantic rules
constitute the syntax directed definition.
• A syntax-directed translation is used to define the
translation of a sequence of tokens to some other
value, based on a CFG for the input.
• An information is associated as attributes to the
grammar symbols
• A syntax directed definition (SDD) specifies the
values of attributes associating semantic rules with
the grammar productions
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
3
Examples
• Type of identifiers
• Computation of expressions based on static
values
– X = 37 + 49*2 – 20;
• Intermediate code generation
• Construction of abstract syntax tree
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
4
Some definitions
• Annotated Parse Trees
A parse tree showing the attribute values at each node is called an
annotated parse tree.
• Synthesized Attributes
An attribute is said to be synthesized if its value at a parse tree
node is determined from attribute values at the children of the
node
• Inherited attributes
An inherited attribute is one whose value at a node in a parse tree
is defined in terms of attributes at the parent and/or siblings of that
node.
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
5
Synthesized Attributes
• Synthesized attributes are attributes that are
computed purely bottom-up
• Such a grammar plus semantic actions is
called an S-attributed definition
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
6
Understanding Translation:
syntax directed definition for evaluating an expression
Production
Semantic rule
1
expr exprL + term
expr.t:= exprL.t + term.t
2
expr exprL - term
expr.t:= exprL.t - term.t
3
expr term
expr.t := term.t
4
termdigit
term.t := digitval
(=getValue(ST, digit))
Example
expr
expr
expr
expr
term
digit
-
+
-
term
term
digit
digit
term
digit
expr.t =
Example
expr
-3+9 = 6
expr.t =
expr
-2 - 1=-3
expr.t =
expr2
2-4=–
expr.t =
expr
2
term.t
term=
2
digit
-
term.t =
4 term
digit
+
term.t
term =
1
digit
term.t =
9term
digit
Computing The Translation
To compute the s ynthesized attributed
translation of a string,
• Build the parse tree,
• Use the translation rules to compute the
translation of each nonterminal in the tree,
bottom-up
• The translation of the string is the translation
of the root nonterminal.
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
10
Example : Compute the type of an expression
• Compute the type of an expression that
includes both arithmetic and boolean
operators. (The type is INT, BOOL, or ERROR.)
• First define CFG Rules
• Determine meaning (semantic) of each rule
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
11
Context free grammar rules with SDD
1. exp -> exp + exp
if ((exp2.t == INT) and (exp3.t == INT)
then exp1.t = INT
else exp1.t = ERROR
2. exp -> exp and exp
if ((exp2.t == BOOL) and (exp3.t ==BOOL)
then exp1.t = BOOL
else exp1.t = ERROR
3. exp -> true
exp.t = BOOL
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
12
Context free grammar rules with SDD
4. exp -> false
exp.t = BOOL
5. exp -> int
exp.t = INT
6. exp -> ( exp
)
7. exp -> exp == exp
March 10 & 21, 2014
exp1.t = exp2.t
if ((exp2.t == exp3.t) and (exp2.t != ERROR))
then exp1.t = BOOL
else exp1.t = ERROR
Vandana, BITS Pilani, CSF363/IS F342
13
Generate annotated parse tree for (2+2)==4
exp
exp.t=BOOL
exp
==
exp.t=INT
(
exp
exp.t=INT
2
March 10 & 21, 2014
exp
exp.t=INT
+
)
exp
exp.t=INT
2
Vandana, BITS Pilani, CSF363/IS F342
exp
exp.t=INT
4
annotated
Parse Tree
Parse Tree
14
Translation Schemes
• A translation scheme is a context free
grammar embedded with semantic actions
• Semantic actions are the program fragments
embedded within the right sides of
productions
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
15
Annotated Parse Tree
• A parse tree for an S-attributed definition can
always be annotated by evaluating the
semantic rules for the attributes at each node
bottom up, from the leaves to the root.
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
16
Example of annotated parse tree for (2+2)==4
exp.t=BOOL
==
exp.t=INT
(exp.t=INT)
exp.t=INT
2
March 10 & 21, 2014
+
exp.t=INT
4
exp.t=INT
2
Vandana, BITS Pilani, CSF363/IS F342
annotated
Parse Tree
17
How to update the symbol table to keep the record
of the type of the variables
Lexeme
Token
Int
INT
num1
id
num2
id
char
CHAR
c
id
type
………
……
Use of inherited attributes evaluates
the type
Lexeme
Token
type
Int
INT
integer
num1
id
integer
num2
id
integer
char
CHAR
character
c
id
character
………
……
Define Grammar rules for declaration construct
(sample)
1.
2.
3.
4.
5.
DTL
Tint
Tchar
LL,id
Lid
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
20
Attributes for the nonterminals
• Terminals={int, char, id, ‘,’}
• Non terminals={D,T,L}
• Attribute for T:type (T.type represents the
type of the declared variables)
• Attribute for L: in (L.in represents the
inherited attribute information)
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
21
Grammar rules and associated semantic rules
1.
2.
3.
4.
5.
DTL
Tint
Tchar
LL1,id
Lid
March 10 & 21, 2014
1.
2.
3.
4.
L.in=T.type
T.type=integer
T.type=character
L1.in=L.in
addtype(id.entry,L.in)
5. addtype(id.entry,L.in)
Vandana, BITS Pilani, CSF363/IS F342
22
Parse tree for
int id,id
D
L
T
int
L
,
id
id
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
23
Annotated Parse tree for
INT id,id
D
L
L.in=integer
T.type=integer
T
int
L
L.in=integer
,
id
addtype(id.entry,L.in)
id
Symbol table is updated for both tokens ‘id’
with type of these as integers
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
24
Attributes dependency
• Inherited attributes are convenient for expressing
the dependence of a programming language
construct on the context in which it appears.
• If an attribute b at a node in a parse tree depends
on an attribute c, then the semantic rule for b at
that node must be evaluated after the semantic
rule that defines c.
• Dependency graph depicts the interdependencies
among the inherited and synthesized attributes at
a node
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
25
Dependency of attribute
• expr.t depends on expr1.t and term.t (refer slide 5 ,
expression grammar )
• This dependency can be represented as
expr.t=f(expr1.t,term.t)
• In general, b=f(c1, c2 , c3 , .. , ck) where attribute b
depends on attributes c1, c2 , c3 ,.. , ck
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
26
Dependency Graph
• The graph has a node for each attribute and
an edge to the node for b from the node for c
if attribute b depends on attribute c.
b
c1
March 10 & 21, 2014
c2
c3
Vandana, BITS Pilani, CSF363/IS F342
ck
27
Algorithm to construct Dependency Graph
for each node n in the parse tree do
for each attribute a of the grammar symbol at n do
construct a node in the dependency graph for a;
for each node n in the parse tree do
for each semantic rule b:= f(c1, c2,……..,ck) associated with the
production used at n do
for i:=1 to k do
construct an edge from the node for ci to
the node for b;
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
28
Evaluation Order
• A topological sort of a directed acyclic graph is
any ordering m1, m2,….mk of the node of the
graph such that if mimj is an edge from mi
to mj, then mi appears before mj in the
ordering.
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
29
Topological Sort
• Any topological sort of a dependency graph
gives a valid order in which the semantic rules
associated with the nodes in a parse tree can
be evaluated.
• The dependent attributes c1, c2,……..,ck in a
semantic rule b:= f(c1, c2,……..,ck) are
available at a node before f is evaluated.
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
30
Topological Sort
7
2
4
1
5
6
9
Directed graph
8
3
edge(i,j) indicates the dependency of node j on i
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
31
Topological Sort
7
2
4
1
5
6
9
Directed graph
8
3
Partial topological order: 1,2,3,4,5,6,8,7,9
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
32
Semantic Analyzer
• The principal job of the semantic analyzer is to
enforce static semantic rules
• Constructs a syntax tree (usually first)
• Information gathered is then needed by the
code generator
• An attribute's value at a given node depends on
those of other predecessor or successor nodes
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
33
Semantic Analyzer
• The principal job of the semantic analyzer is to
enforce static semantic rules
• Constructs a syntax tree (usually first)
• Information gathered is then needed by the
code generator
• An attribute's value at a given node depends on
those of other predecessor or successor nodes
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
34
Methods for Evaluating Semantic Rules
• Parse Tree Methods
Most Flexible
But fail if a cycle exists in the dependency graph
• Rule Based Methods
Analyze rules at compiler-generation time
Determine a static ordering at that time
Evaluate nodes in that order at compile time
•
Oblivious Methods
Ignore the parse tree and grammar
Choose a convenient order and use it
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
35
Parse-tree methods
1. Build the parse tree
2. Build the Abstract Syntax Tree (AST) to get
attribute dependence
3. Build the dependency graph
4. Topological sort the graph
5. Evaluate it
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
36
Multi-Pass Approach
• Separate AST construction from semantic checking
phase
• Traverse the AST and perform semantic checks (or
other actions) only after the tree has been built and
its structure is stable
• This approach is less error-prone and is better when
efficiency is not a critical issue
• Attribute evaluation proceeds as tree-walk of the
AST
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
37
Examples of semantic rules
 Variables must be defined before being used
 A variable should not be defined multiple times
 In an assignment stmt, the variable and the
expression must have the same type
 The test expression of an ‘if statement’ must have
boolean type
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
38
Abstract Syntax Trees
• An Abstract Syntax Tree is a condensed form of
parse tree.
• It is useful for representing language constructs
• The keywords and operators do not appear as the
leaves
• The syntax directed translation can be based on
AST as well as the parse trees
• The attributes can be attached to the nodes in
similar way as is done in parse trees
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
39
Example Parse tree and an AST
E
E
+
E
+
E
-
E
7
5
5
7
2
2
Parse Tree
March 10 & 21, 2014
-
Abstract Syntax Tree
Vandana, BITS Pilani, CSF363/IS F342
40
Construction of the Syntax Tree
• Similar to the translation of the infix
expression into postfix form
• A node is created for each operator and
operand
• The sub-expressions are represented by sub
trees whose roots are the children of the
operator nodes
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
41
Construction of the Syntax Tree
+
id
-
To entry
for b
num
id
To entry
for a
4
Syntax tree for a-4+b
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
42
Syntax Directed Definition To Construct
Syntax Tree
Production
Semantic rules
EE1+T
E.nptr:=makenode(‘+’,E1.nptr,T.nptr)
EE1-T
E.nptr:=makenode(‘-’,E1.nptr,T.nptr)
ET
E.nptr:=T.nptr
T(E)
T.nptr:= E.nptr
Tid
T.nptr:=makeleaf(id,id.entry)
Tnum
T.nptr:=makeleaf(num,num.val)
Rule no
Constructing
syntax tree
x3120
NULL +
x3048
E NULL
x3048
E
NULL
T
NULL
x3048
+
T
NULL
x3120
Semantic rules
1
EE1+T
E.nptr:=makenode(‘+’,E1.nptr,T.nptr)
2
EE1-T
E.nptr:=makenode(‘-’,E1.nptr,T.nptr)
3
ET
E.nptr:=T.nptr
4
T(E)
T.nptr:= E.nptr
5
Tid
T.nptr:=makeleaf(id,id.entry)
6
Tnum
T.nptr:=makeleaf(num,num.val)
x3120
x3072
(
NULL
E-
x3096
NULL
)
x3048
IDid
Production
E
x3072
NULL
T
x3072
NULL
-
x3096
NULL
T
NUM
num
x3096
IDid
x3072
• Evaluated Bottom up while
generating the parser
• The parser keeps the values
of S-attributes along with the
grammar symbols on its
stack
Class Assignment: Design Semantic Rules
for AST
<declarationStmt>===> <type> <var_list>
SEMICOLON
<var_list>===> ID <more_ids>
<more_ids>===> COMMA <var_list> | 
<type>===> INT | REAL | STRING | MATRIX
Input : string str1, s2;
•
•
Construct the parse tree.
Identify the information to retain and the one to
throw out.
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
45
<declarationStmt>
<type>
SEMICOLON
<varList>
STRING
<moreIds>
ID
<varList>
COMMA
ID
<moreIds>
Parse tree
EPSILON
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
46
<declarationStmt>
<type>
SEMICOLON
<varList>
STRING
<moreIds>
ID
<varList>
COMMA
ID
Important
information and
structure must
be retained
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
<moreIds>
EPSILON
47
<declarationStmt>
<type>
SEMICOLON
<varList>
STRING
<moreIds>
ID
<varList>
COMMA
ID
Important
information and
structure must
be retained
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
<moreIds>
EPSILON
48
<declarationStmt>
STRING
ID
ID
The subtree is
traversed now
to fetch type
info from the
extreme left
node.
Concrete
Subtree
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
49
<declarationStmt>
<type>
STRING
SEMICOLON
<varList>
<moreIds>
ID
<varList>
COMMA
ID
<moreIds>
Important information: Links
Associate rules to each
production to construct AST
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
EPSILON
50
Semantic Rules for AST
<declarationStmt><type> <var_list>
SEMICOLON
<declarationStmt>.leftChild = <type>.address
<declarationStmt>.rightChild =
<var_list>.address
<var_list> ID <more_ids>
<var_list>.leftChild = ID.address
<var_list>.rightChild = <more_ids>.address
<more_ids> COMMA <var_list>
<more_ids>.address = <var_list>.address
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
51
Attributes for different semantic
activity
•
•
•
•
AST Construction: links or addresses
Type checking: type
Expression evaluation: value
Intermediate Code generation: Three address
code
• Code generation: sequence of lines of
assembly code
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
52
Another Example of semantic rules
• <assignmentStmt_type1>===>
<leftHandSide_singleVar> ASSIGNOP
<rightHandSide_type1> SEMICOLON
• <leftHandSide_singleVar> ===> ID
• <rightHandSide_type1> ===>
<arithmeticExpression> | <sizeExpression>
|<funCallStmt>
• <sizeExpression> ==> SIZE ID
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
53
1. Preserve
valuable
information
2. Preserve
structure
AST
ASSIGNOP
ID
SIZE
ID
Class Assignment:
Design AST rules
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
54
AST vs Parse Tree
•
•
•
•
•
•
•
•
AST is condensed form of a parse tree
Operators appear at internal nodes, not at leaves.
Chains of single productions are collapsed.
Lists are "flattened“ into nodes with arbitrary number of
children
Omits details of concrete syntax (parenthesis, commas, semicolons etc.)
AST is a better structure for later compiler stages
Contains information about essential structure of the program
Source code is only a linearization of the AST
Source : google search
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
55
Semantic Analysis : other applications
• Three Address Code
The general form is x = y op z
• x,y,and z are names, constants, compilergenerated temporaries
• op stands for any operator such as +,-,…
•
(x*5)-y is translated as
t1 = x * 5
t2 = t1 - y
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
56
Syntax tree vs. Three address code
Expression: (A+B*C) + (-B*A) - B
_
+
B
+
*
_
*
A
B
C
A
B
T1 := B * C
T2 = A + T1
T3 = - B
T4 = T3 * A
T5 = T2 + T4
T6 = T5 – B
Three address code is a linearized representation
of a syntax tree in which explicit names correspond to the
interior nodes of the graph.
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
57
Semantic Analysis Applications:
a list
• Generating Intermediate Representation (AST,
Intermediate Code etc.)
• Collecting type information
• Type checking and error reporting
• Semantic errors reporting for undeclared variables
• Collecting Scope information
• Expression Evaluation
• Generating Code
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
58
Symbol table
• Symbol Table is populated with type and scope
information, which in turn associates the offset for each
identifier and is required at run time for allocating memory
to the identifiers
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
59
More on Semantic Analysis
• Type checking, intermediate representation
and code generation will be discussed later
March 10 & 21, 2014
Vandana, BITS Pilani, CSF363/IS F342
60
Download