Lecture # 7 - cse344compilerdesign

advertisement
Lecture # 7
Chapter 4: Syntax Analysis
What is the job of Syntax Analysis?
• Syntax Analysis is also called Parsing or Hierarchical Analysis.
• A Parser implements grammar of the language may it be C,
C++ etc
• The parser obtains a string of tokens from the lexical analyzer
and verifies that the string can be generated by the grammar
for the source language
• The grammar that a parser implements is called a Context
Free Grammar or CFG
Position of a Parser in the Compiler
Model
Source
Program
Token,
tokenval
Lexical
Analyzer
Get next
token
Lexical error
Parser
and rest of
front-end
Intermediate
representation
Syntax error
Semantic error
Symbol Table
3
The Parser
The role of the parser is twofold:
1. To check syntax (= string recognizer)
–
And to report syntax errors accurately
2. To invoke semantic actions
–
For static semantics checking, e.g. type checking of
expressions, functions, etc.
4
What is the difference between Syntax
and Semantic?
• Syntax is the way in which we construct
sentences by following principles and rules.
• Semantics is the interpretations of and
meanings derived from the sentence
transmission and understanding of the
message or in other words are the logical
sentences making sense or not
Error Handling
• A good compiler should assist in identifying and
locating errors
– Lexical errors: important, compiler can easily recover and
continue such as misspelling an identifier, keyword etc.
– Syntax errors: most important for compiler, can almost
always recover such as arithmetic expression with
unbalanced parenthesis
– Static semantic errors: important, can sometimes recover
such as operator applied to incompatible operands
– Dynamic semantic errors: hard or impossible to detect at
compile time, runtime checks are required
– Logical errors: hard or impossible to detect such as infinite
recursive calls
6
Viable-Prefix Property
• The viable-prefix property of parsers allows
early detection of syntax errors
– Goal: detection of an error as soon as possible
without further consuming unnecessary input
– How: detect an error as soon as the prefix of the
input does not match a prefix of any string in the
language
Prefix
…
for (;)
…
Error is
detected here
7
Error Recovery Strategies
• Panic mode
– Discard input until a token in a set of designated
synchronizing tokens is found
• Phrase-level recovery
– Perform local correction on the input to repair the error
• Error productions
– Augment grammar with productions for erroneous
constructs
• Global correction
– Choose a minimal sequence of changes to obtain a global
least-cost correction
8
Context free Grammar
• Context-free grammar is a 4-tuple
G = (N, T, P, S) where
– T is a finite set of tokens (terminal symbols)
– N is a finite set of nonterminals
– P is a finite set of productions of the form

where   N and   (NT)*
– S  N is a designated start symbol
Example Grammar
Context-free grammar for simple expressions:
G = <{expr,op,digit}, {+,-,*,/,0,1,2,3,4,5,6,7,8,9,),(}, P,expr>
with productions P =
expr expr op expr
expr (expr)
expr  digit
digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
op  + | - | * | /
10
Notational Conventions Used
• Terminals: Lower case letters, operator symbols,
punctuation symbols, digits, bolface strings are all
terminals
• Non Terminals: Upper case letters, lower case
italic names are usually non terminals
• Greek letters such as ,, represent strings of
grammars symbols. Thus a generic production
can be written as A 
Examples of CFG
• Write a CFG that generates Even Palindrome
S  aSa | bSb | є
• Write a CFG that generates Odd Palindrome
S  aSa | bSb | a | b
• Write a CFG that generates Equal number of a’s
and b’s
S  aSbS | bSaS | є
Examples
• Write a CFG that generates Equal number of
a’s and b’s
S  aSbScS | aScSbS | bSaScS | bScSaS |
cSaSbS | cSbSaS | є
Practice
a) CFG generating alternating sequence of 0’s
and 1’s
b) CFG in which no consecutive b’s can occur but
consecutive a’s can occur
c) CFG for the following language:
L(G)= {an b2n | n>=0}
Practice Answers
a) S 0A | 1B
A 1B | 0
B 0A | 1
b) S aS | bT |a |b
T aS | a
c) S aSbb | є
Example
• Design a CFG for the language
L(G)= {0n 1m | n <> m}
There are two cases:
– For n>m
– For n<m
– Write two separate set of rules and combine them
Example
• For n>m
S1 AB
B0A1 | Є
A0A | 0
For n<m
S2 XY
X0X1 | Є
Y1Y | 1
Combining both:
S  S1 | S2
Derivations
• The one-step derivation is defined by
A
where A   is a production in the grammar
• In addition, we define
–
–
–
–
 is leftmost lm if  does not contain a nonterminal
 is rightmost rm if  does not contain a nonterminal
Transitive closure * (zero or more steps)
Positive closure + (one or more steps)
• The language generated by G is defined by
L(G) = {w  T* | S + w}
Derivation (Example)
Grammar G = ({E}, {+,*,(,),-,id}, P, E) with
productions P =
EE+E
EE*E
E(E)
E-E
E  id
Example derivations:
E  - E  - id
E rm E + E rm E + id rm id + id
E * E
E * id + id
E + id * id + id
19
Example
• For the given grammar derive the string
9-5+2 using Left most derivation
Then derive the same string using Right most
derivation
Example Grammar
Context-free grammar for simple expressions:
G = <{list,digit}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, list>
with productions P =
list  list + digit
list  list - digit
list  digit
digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
21
Derivation for the Example Grammar
list
 list + digit
 list - digit + digit
 digit - digit + digit
 9 - digit + digit
 9 - 5 + digit
9-5+2
This is an example leftmost derivation, because we replaced
the leftmost nonterminal (underlined) in each step.
Likewise, a rightmost derivation replaces the rightmost
nonterminal in each step
22
Download