Lecture#11 - cse344compilerdesign

advertisement
Lecture # 11
Grammar Problems
Left Recursion
• A grammar is left recursive if for a nonterminal A, there is a derivation A+ A
• There are three types of left recursion:
– direct (A  A x)
– indirect (A  B C, B  A )
– hidden (A  B A, B  )
How to eliminate Left recursion?
• To eliminate direct left recursion replace
A  A1 | A2 | ... | Am | 1 | 2 | ... | n
with
A  1B | 2B | ... | nB
B  1B | 2B | ... | mB | 
Left recursion
• How about this:
SE
E  E+T
ET
T  E-T
TF
F  E*F
F  id
There is direct recursion: EE+T
There is indirect recursion: TE+T, ET
Algorithm for eliminating indirect recursion
List the nonterminals in some order A1, A2, ...,An
for i=1 to n
for j=1 to i-1
if there is a production AiAj,
replace Aj with its rhs
eliminate any direct left recursion on Ai
Eliminating indirect left recursion
ordering: S, E, T, F
SE
E  E+T
ET
T  E-T
TF
F  E*F
F  id
i=S
SE
E  E+T
ET
T  E-T
TF
F  E*F
F  id
i=E
SE
E  TE'
E'+TE'|
T  E-T
TF
F  E*F
F  id
i=T, j=E
SE
E  TE'
E'+TE'|
T  TE'-T
TF
F  E*F
F  id
SE
E  TE'
E'+TE'|
T  FT'
T'  E'-TT'|
F  E*F
F  id
Eliminating indirect left recursion
i=F, j=E
SE
E  TE'
E'+TE'|
T  FT'
T'  E'-TT'|
F  TE'*F
F  id
i=F, j=T
SE
E  TE'
E'+TE'|
T  FT'
T'  E'-TT'|
F  FT'E'*F
F  id
SE
E  TE'
E'+TE'|
T  FT'
T'  E'-TT'|
F  idF'
F'  T'E'*FF'|
Example
• Eliminate Left Recursion from the following
grammar:
• S Aa | b
• A Ac | Sd
• The algorithm is guaranteed to work if the
grammar has no cycle and null productions
Example Left Recursion Elim.
ABC|a
BCA|Ab
CAB|CC|a
i = 1:
i = 2, j = 1:
i = 3, j = 1:
i = 3, j = 2:
Choose arrangement: A, B, C
nothing to do
BCA|Ab

BCA|BCb|ab
(imm) B  C A BR | a b BR
BR  C b BR | 
CAB|CC|a

CBCB|aB|CC|a
CBCB|aB|CC|a

C  C A BR C B | a b BR C B | a B | C C | a
(imm) C  a b BR C B CR | a B CR | a CR
CR  A BR C B CR | C CR | 
8
Grammar problems
• Consider S  if E then S else S | if E then S
– Which of the two productions should we use to
expand non-terminal S when the next token is if?
– We can solve this problem by factoring out the
common part in these rules. This way, we are
postponing the decision about which rule to
choose until we have more information (namely,
whether there is an else or not).
– This is called left factoring
Left factoring
A  1 | 2 |...| n | 
becomes
A  B| 
B  1 | 2 |...| n
Example
• Left factor the following grammar:
• S iEtS | iEtSeS |a
• E b
Grammar problems
• A symbol XV is useless if
– there is no derivation from X to any string in the
language (non-terminating)
– there is no derivation from S that reaches a
sentential form containing X (non-reachable)
• Reduced grammar = a grammar that does not
contain any useless symbols.
Useless symbols
• In order to remove useless symbols, apply two
algorithms:
– First, remove all non-terminating symbols
– Then, remove all non-reachable symbols.
• The order is important!
– For example, consider S + X where 
contains a non-terminating symbol. What will
happen if we apply the algorithms in the wrong
order?
• Concrete example: S  AB | a, A a
Useless symbols
• Example
Initial grammar:
Algorithm 1 (terminating symbols):
S AB | CA
A is in because of A a
A a
C is in because of C b
B CB | AB
D is in because of D d
C cB | b
S is in because A, C are in and S AC
D aD | d
Useless symbols
• Example continued
After algorithm 1:
Algorithm 2 (reachable symbols):
S CA
S is in because it is the start symbol
A a
C and A are in because S is in and S CA
C b
D aD | d
Final grammar:
S CA
A a
C b
Parsing
• Parsing = process of determining if a string of tokens
can be generated by a grammar
• For any CF grammar there is a parser that takes at
most O(n3) time to parse a string of n tokens
• Linear algorithms suffice for parsing programming
language source code
• Top-down parsing “constructs” a parse tree from root
to leaves
• Bottom-up parsing “constructs” a parse tree from
leaves to root
16
Predictive Parsing
• Recursive descent parsing is a top-down parsing
method
– Every nonterminal has one (recursive) procedure
responsible for parsing the nonterminal’s syntactic
category of input tokens
– When a nonterminal has multiple productions, each
production is implemented in a branch of a selection
statement based on input look-ahead information
• Predictive parsing is a special form of recursive
descent parsing where we use one lookahead token
to unambiguously determine the parse operations
17
Download