LR-k-grammars

LR(K) Grammars
Hitesh keelapudi
Bottom-Up Parsing
• LR(k) parsers are bottom-up parsers
• The languages generated by LR(k) grammars are exactly the
deterministic context-free languages
Bottom-Up Parsing
• Start at the leaves and grow toward root
• As input is consumed, encode possibilities in
an internal state
• A powerful parsing technology
• LR grammars
– Construct a right-most derivation of the program, in reverse
– Handle left-recursive grammars; virtually all programming
language grammars are left-recursive
– Make syntax easier to express
Bottom-Up Parsing
• Right-most derivation, built in reverse
– Start with the tokens
– End with the start symbol
– Match substring on RHS of production, replace by
LHS
– Shift-reduce parsers
• Parsers for LR grammars
• Automatic parser generators (yacc, bison)
Bottom-Up Parsing
• Example Bottom-Up Parsing
S → S+E | E
E → num | (S)
(1+2+(3+4))+5
(E+2+(3+4))+5
(S+2+(3+4))+5
(S+E+(3+4))+5
(S+(3+4))+5
(S+(E+4))+5
(S+(S+4))+5
(S+(S+E))+5
(S+(S))+5
(S+E)+5
(S)+5
E+5
S+5
S+E
S
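The reduction sequence in this example can be checked mechanically. The sketch below is only an illustration, not part of the original slides: it assumes `num` is a single digit, encodes each right-hand side as a regular expression (the names `RULES`, `reductions` and `is_valid_reduction_sequence` are hypothetical), and confirms that every line follows from the previous one by exactly one reduction.

```python
import re

# Hypothetical encoding of the example grammar: each entry is
# (left-hand side, regex matching the right-hand side in a sentential form).
RULES = [
    ("S", r"S\+E"),   # S → S+E
    ("S", r"E"),      # S → E
    ("E", r"\(S\)"),  # E → (S)
    ("E", r"\d"),     # E → num (assumed to be a single digit here)
]

def reductions(form):
    """Return every string reachable from `form` by one reduction."""
    results = set()
    for lhs, pattern in RULES:
        for match in re.finditer(pattern, form):
            results.add(form[:match.start()] + lhs + form[match.end():])
    return results

# The sentential forms of the example, from the tokens to the start symbol.
STEPS = [
    "(1+2+(3+4))+5", "(E+2+(3+4))+5", "(S+2+(3+4))+5", "(S+E+(3+4))+5",
    "(S+(3+4))+5", "(S+(E+4))+5", "(S+(S+4))+5", "(S+(S+E))+5",
    "(S+(S))+5", "(S+E)+5", "(S)+5", "E+5", "S+5", "S+E", "S",
]

def is_valid_reduction_sequence(steps):
    """True if each step is obtained from the previous one by one reduction."""
    return all(after in reductions(before)
               for before, after in zip(steps, steps[1:]))
```

Read bottom to top, the same list is a right-most derivation of the input from S.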
Terminology LR(k)
• Left-to-right scan of input
• Right-most derivation
• k symbol lookahead
• [Bottom-up or shift-reduce] parsing or LR
parser
• Perform post-order traversal of parse tree
Shift-Reduce Parsing
• Parsing actions:
– A sequence of shift and reduce operations
• Parser state:
– A stack of terminals and non-terminals (grows to
the right)
• Current derivation step = stack + input
Shift-Reduce Parsing
Derivation step     Stack (terminals &    Unconsumed input    Action
(stack + input)     non-terminals)
(1+2+(3+4))+5       (empty)               (1+2+(3+4))+5       shift
(E+2+(3+4))+5       (E                    +2+(3+4))+5         reduce
(S+2+(3+4))+5       (S                    +2+(3+4))+5         reduce
(S+E+(3+4))+5       (S+E                  +(3+4))+5           reduce
Shift-Reduce Actions
• Parsing is a sequence of shifts and reduces
• Shift: move the look-ahead token to the stack
Stack    Input           Action
(        1+2+(3+4))+5    Shift 1
(1       +2+(3+4))+5
• Reduce: replace the symbols β on top of the stack
with the non-terminal symbol X of the
production X → β (i.e., pop β, push X)
Stack    Input         Action
(S+E     +(3+4))+5     Reduce S → S+E
(S       +(3+4))+5
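The two actions can be sketched as operations on a Python list used as the stack. This is a minimal illustration only; the function names `shift` and `reduce` and the list-of-characters token representation are assumptions made for the example.

```python
def shift(stack, tokens):
    """Move the look-ahead token (front of `tokens`) onto the stack."""
    stack.append(tokens.pop(0))

def reduce(stack, lhs, rhs):
    """Pop the symbols of `rhs` off the top of the stack and push `lhs`."""
    if stack[-len(rhs):] != rhs:
        raise ValueError("right-hand side is not on top of the stack")
    del stack[-len(rhs):]
    stack.append(lhs)

# Shift example from the slide: "(" on the stack, 1 is the look-ahead token.
stack = ["("]
tokens = list("1+2+(3+4))+5")
shift(stack, tokens)

# Reduce example from the slide: S+E on top of the stack, rule S → S+E.
stack2 = ["(", "S", "+", "E"]
reduce(stack2, "S", ["S", "+", "E"])
```

After these calls `stack` holds `(1` and `stack2` holds `(S`, matching the two small tables above.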
Shift-Reduce Parsing
Grammar:
S → S + E | E
E → num | (S)

Derivation        Stack      Input stream      Action
(1+2+(3+4))+5     (empty)    (1+2+(3+4))+5     shift
(1+2+(3+4))+5     (          1+2+(3+4))+5      shift
(1+2+(3+4))+5     (1         +2+(3+4))+5       reduce E → num
(E+2+(3+4))+5     (E         +2+(3+4))+5       reduce S → E
(S+2+(3+4))+5     (S         +2+(3+4))+5       shift
(S+2+(3+4))+5     (S+        2+(3+4))+5        shift
(S+2+(3+4))+5     (S+2       +(3+4))+5         reduce E → num
(S+E+(3+4))+5     (S+E       +(3+4))+5         reduce S → S + E
(S+(3+4))+5       (S         +(3+4))+5         shift
(S+(3+4))+5       (S+        (3+4))+5          shift
(S+(3+4))+5       (S+(       3+4))+5           shift
(S+(3+4))+5       (S+(3      +4))+5            reduce E → num
…
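A trace like this one can be reproduced with a small hand-written shift-reduce loop. The sketch below is not a general LR parser: it assumes single-digit numbers and uses a fixed, greedy order of reduction checks (in the hypothetical helper `pick_reduction`) that happens to be sufficient for this particular grammar.

```python
def pick_reduction(stack):
    """Return (lhs, rhs_length) for the handle on top of the stack, or None."""
    if not stack:
        return None
    if stack[-1].isdigit():
        return ("E", 1)                      # E → num
    if stack[-3:] == ["(", "S", ")"]:
        return ("E", 3)                      # E → (S)
    if stack[-3:] == ["S", "+", "E"]:
        return ("S", 3)                      # S → S+E
    if stack[-1] == "E":
        return ("S", 1)                      # S → E
    return None

def parse(text):
    """Return (accepted, trace); trace rows are (action, stack, input)."""
    tokens = list(text.replace(" ", ""))
    stack, trace, pos = [], [], 0
    while True:
        rule = pick_reduction(stack)
        if rule is not None:
            lhs, length = rule
            del stack[-length:]              # pop the right-hand side
            stack.append(lhs)                # push the left-hand side
            action = "reduce"
        elif pos < len(tokens):
            stack.append(tokens[pos])        # shift the look-ahead token
            pos += 1
            action = "shift"
        else:
            break
        trace.append((action, "".join(stack), "".join(tokens[pos:])))
    return stack == ["S"], trace

accepted, trace = parse("(1+2+(3+4))+5")
```

Each trace row records the stack and the unconsumed input after the action, so stack + input gives the current derivation step.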
BUILDING AN LR(0) PARSER
Let's Build an LR(0) Parser!
• First we shall define a simple grammar
– (1) E → E * B
– (2) E → E + B
– (3) E → B
– (4) B → 0
– (5) B → 1
• We also add a new rule, (0) S → E, which is used
by the parser as a final accepting rule
Items
• To create a parsing table for this grammar we
introduce a special symbol, •, which marks the
current position within a rule: the symbols to its left
have already been read from the input, and the symbols
to its right are what the parser expects next
• E.g. E → E • + B
– This shows that E has already been processed and
the parser is looking for a + symbol next
• A rule with a dot in its right-hand side is called an item
• There is an item for each position the dot symbol
can take along the right-hand side of the rule
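As an illustration, an item can be represented as a pair (rule number, dot position). The sketch below uses that assumed representation, with the hypothetical names `GRAMMAR`, `all_items` and `show`, to enumerate every item of the example grammar, where rule 0 is the added rule S → E.

```python
# Rules of the example grammar; index 0 is the added rule S → E.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "*", "B")),
    ("E", ("E", "+", "B")),
    ("E", ("B",)),
    ("B", ("0",)),
    ("B", ("1",)),
]

def all_items(grammar):
    """One item per rule and per dot position (0 .. length of the RHS)."""
    return [(rule, dot)
            for rule, (_, rhs) in enumerate(grammar)
            for dot in range(len(rhs) + 1)]

def show(item):
    """Render an item such as (2, 1) as 'E → E • + B'."""
    rule, dot = item
    lhs, rhs = GRAMMAR[rule]
    parts = list(rhs[:dot]) + ["•"] + list(rhs[dot:])
    return f"{lhs} → {' '.join(parts)}"
```

A rule with n symbols on its right-hand side yields n + 1 items, so this grammar has 16 items in total.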
Item Sets
• Since a parser may not know which grammar rule to use in
advance, when creating our table we must use sets of items
to consider all the possibilities
• E.g.
– S → • E
– E → • E * B
– E → • E + B
– E → • B
– B → • 0
– B → • 1
• The first line is the initial item for the item set. Because its dot
stands before the nonterminal E, we must also consider every rule that
E could expand to, so we take the closure over E, which adds the 2nd to
4th items. (By extension, we must do the same for B, which adds the
5th and 6th items.)
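The closure step can be sketched as a worklist loop. This is a minimal illustration under assumed conventions: items are (rule number, dot position) pairs and `GRAMMAR` and `closure` are hypothetical names, not anything defined in the slides.

```python
# Rules of the example grammar; index 0 is the added rule S → E.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "*", "B")),
    ("E", ("E", "+", "B")),
    ("E", ("B",)),
    ("B", ("0",)),
    ("B", ("1",)),
]

def closure(items):
    """Add 'X → • ...' for every nonterminal X that follows a dot."""
    result = set(items)
    worklist = list(items)
    while worklist:
        rule, dot = worklist.pop()
        rhs = GRAMMAR[rule][1]
        if dot == len(rhs):
            continue                      # dot at the end: nothing expected
        expected = rhs[dot]
        for number, (lhs, _) in enumerate(GRAMMAR):
            if lhs == expected and (number, 0) not in result:
                result.add((number, 0))
                worklist.append((number, 0))
    return frozenset(result)

# Starting from S → • E reproduces the six items listed above.
item_set_0 = closure({(0, 0)})
```

Terminals never match a left-hand side, so a dot before 0, 1, * or + adds nothing.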
Item Sets for Our Example
• Set 0
– S → • E
– E → • E * B
– E → • E + B
– E → • B
– B → • 0
– B → • 1
• Set 1
– B → 0 •
• Set 2
– B → 1 •
• Set 3
– S → E •
– E → E • * B
– E → E • + B
• Set 4
– E → B •
• Set 5
– E → E * • B
– B → • 0
– B → • 1
• Set 6
– E → E + • B
– B → • 0
– B → • 1
• Set 7
– E → E * B •
• Set 8
– E → E + B •
Transition Portion of Parse Table
Item Set    *    +    0    1    E    B
0           -    -    1    2    3    4
1           -    -    -    -    -    -
2           -    -    -    -    -    -
3           5    6    -    -    -    -
4           -    -    -    -    -    -
5           -    -    1    2    -    7
6           -    -    1    2    -    8
7           -    -    -    -    -    -
8           -    -    -    -    -    -

Each transition can be found by tracing which item set a new
item set was created from
o Item set 7 is created from item set 5 by moving the dot past B
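The item sets and their transitions can be generated together. The sketch below is one possible way to do it, assuming items are (rule number, dot position) pairs; the names `closure`, `goto` and `build_automaton` are hypothetical. The symbol order in `SYMBOLS` is chosen only so that the numbering comes out the same as on the slides.

```python
GRAMMAR = [
    ("S", ("E",)), ("E", ("E", "*", "B")), ("E", ("E", "+", "B")),
    ("E", ("B",)), ("B", ("0",)), ("B", ("1",)),
]
SYMBOLS = ["*", "+", "0", "1", "E", "B"]   # column order of the table

def closure(items):
    """Add 'X → • ...' for every nonterminal X that follows a dot."""
    result, worklist = set(items), list(items)
    while worklist:
        rule, dot = worklist.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs):
            for number, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (number, 0) not in result:
                    result.add((number, 0))
                    worklist.append((number, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot past `symbol` where possible, then take the closure."""
    moved = {(rule, dot + 1) for rule, dot in items
             if dot < len(GRAMMAR[rule][1]) and GRAMMAR[rule][1][dot] == symbol}
    return closure(moved) if moved else None

def build_automaton():
    """Return (item sets, transitions) discovered from item set 0."""
    states = [closure({(0, 0)})]
    transitions = {}
    index = 0
    while index < len(states):             # states grows while we iterate
        for symbol in SYMBOLS:
            target = goto(states[index], symbol)
            if target is None:
                continue
            if target not in states:
                states.append(target)
            transitions[(index, symbol)] = states.index(target)
        index += 1
    return states, transitions

states, transitions = build_automaton()
```

With this ordering, `transitions[(5, "B")]` is 7, which is the "item set 7 comes from item set 5" relationship described above.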
Constructing the Table
• Once the item sets and the transitions are complete,
follow the steps below to finish the table
1) The columns for nonterminals are copied to the goto table.
2) The columns for the terminals are copied to the action
table as shift actions.
3) An extra column for '$' (end of input) is added to the action
table; it contains acc for every item set that contains S → E •.
4) If an item set i contains an item of the form A → w • and
A → w is rule m with m > 0, then the row for state i in the
action table is completely filled with the reduce action rm.
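The four steps can be sketched directly on the data from the previous slides. In this illustration the transitions and the completed items are written out by hand; the names `TRANSITIONS`, `COMPLETED` and `build_tables` are assumptions for the example, and the tables are plain dictionaries keyed by (state, symbol).

```python
TERMINALS = ["*", "+", "0", "1"]
NONTERMINALS = ["E", "B"]

# Transition portion of the parse table: (item set, symbol) -> item set.
TRANSITIONS = {
    (0, "0"): 1, (0, "1"): 2, (0, "E"): 3, (0, "B"): 4,
    (3, "*"): 5, (3, "+"): 6,
    (5, "0"): 1, (5, "1"): 2, (5, "B"): 7,
    (6, "0"): 1, (6, "1"): 2, (6, "B"): 8,
}

# Item sets containing a completed item 'A → w •', mapped to the rule number.
COMPLETED = {1: 4, 2: 5, 3: 0, 4: 3, 7: 1, 8: 2}

def build_tables(transitions, completed):
    """Apply steps 1 to 4 and return the (action, goto) tables."""
    action, goto = {}, {}
    for (state, symbol), target in transitions.items():
        if symbol in NONTERMINALS:
            goto[(state, symbol)] = target            # step 1
        else:
            action[(state, symbol)] = f"s{target}"    # step 2
    for state, rule in completed.items():
        if rule == 0:
            action[(state, "$")] = "acc"              # step 3
        else:
            for symbol in TERMINALS + ["$"]:          # step 4: fill the row
                action[(state, symbol)] = f"r{rule}"
    return action, goto

action, goto = build_tables(TRANSITIONS, COMPLETED)
```

For this grammar no cell is written twice, so the table has no shift-reduce or reduce-reduce conflicts.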
Final Parse Table
         Action                       Goto
State    *     +     0     1     $    E     B
0        -     -     s1    s2    -    g3    g4
1        r4    r4    r4    r4    r4   -     -
2        r5    r5    r5    r5    r5   -     -
3        s5    s6    -     -     acc  -     -
4        r3    r3    r3    r3    r3   -     -
5        -     -     s1    s2    -    -     g7
6        -     -     s1    s2    -    -     g8
7        r1    r1    r1    r1    r1   -     -
8        r2    r2    r2    r2    r2   -     -
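To show how the finished table is used, here is a minimal sketch of a table-driven parser loop. It assumes each input character is one token, that reduce rows also cover the '$' column (as step 4 of the construction implies), and that returning the list of applied rule numbers is a convenient output; `ACTION`, `GOTO`, `RULES` and `parse` are hypothetical names.

```python
# Rule number -> (left-hand side, number of symbols on the right-hand side).
RULES = {1: ("E", 3), 2: ("E", 3), 3: ("E", 1), 4: ("B", 1), 5: ("B", 1)}

ACTION = {
    (0, "0"): "s1", (0, "1"): "s2",
    (3, "*"): "s5", (3, "+"): "s6", (3, "$"): "acc",
    (5, "0"): "s1", (5, "1"): "s2",
    (6, "0"): "s1", (6, "1"): "s2",
}
# Reduce rows are filled across every action column, including '$'.
for state, rule in {1: 4, 2: 5, 4: 3, 7: 1, 8: 2}.items():
    for symbol in "*+01$":
        ACTION[(state, symbol)] = f"r{rule}"

GOTO = {(0, "E"): 3, (0, "B"): 4, (5, "B"): 7, (6, "B"): 8}

def parse(text):
    """Return the rule numbers applied, in order, or None on a syntax error."""
    tokens = list(text) + ["$"]
    stack = [0]                            # stack of states, starting in 0
    applied = []
    position = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[position]))
        if entry is None:
            return None                    # empty cell: reject the input
        if entry == "acc":
            return applied
        if entry[0] == "s":                # shift: push state, consume token
            stack.append(int(entry[1:]))
            position += 1
        else:                              # reduce by rule number entry[1:]
            rule = int(entry[1:])
            lhs, length = RULES[rule]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], lhs)])
            applied.append(rule)

result = parse("1+1")
```

For the input 1+1 the applied rules come out as 5, 3, 5, 2, which read in reverse give a right-most derivation of the input.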
Thank you