LR(k) Grammars
Hitesh keelapudi

Bottom-Up Parsing
• LR(k) parsers are bottom-up parsers
• The languages generated by LR(k) grammars are exactly the deterministic context-free languages

Bottom-Up Parsing
• Start at the leaves and grow toward the root
• As input is consumed, encode possibilities in an internal state
• A powerful parsing technology
• LR grammars
  – Construct a right-most derivation of the program, in reverse
  – Handle left-recursive grammars; virtually all programming-language grammars are left-recursive
  – Easier to express syntax

Bottom-Up Parsing
• Right-most derivation, traced backwards
  – Start with the tokens
  – End with the start symbol
  – Match a substring against the RHS of a production, replace it by the LHS
  – Shift-reduce parsers
• Parsers for LR grammars
• Automatic parser generators (yacc, bison)

Bottom-Up Parsing
• Example bottom-up parse

  S → S + E | E
  E → num | (S)

  (1+2+(3+4))+5
  (E+2+(3+4))+5
  (S+2+(3+4))+5
  (S+E+(3+4))+5
  (S+(3+4))+5
  (S+(E+4))+5
  (S+(S+4))+5
  (S+(S+E))+5
  (S+(S))+5
  (S+E)+5
  (S)+5
  E+5
  S+5
  S+E
  S

Terminology: LR(k)
• L: left-to-right scan of input
• R: right-most derivation
• k: k symbols of lookahead
• Also called bottom-up parsing, shift-reduce parsing, or LR parsing
• Performs a post-order traversal of the parse tree

Shift-Reduce Parsing
• Parsing actions:
  – A sequence of shift and reduce operations
• Parser state:
  – A stack of terminals and non-terminals (grows to the right)
• Current derivation step = stack + unconsumed input

Shift-Reduce Parsing

  Derivation step      Stack (terminals     Unconsumed input    Action
  (stack + input)      & non-terminals)
  (1+2+(3+4))+5                             (1+2+(3+4))+5       shift
  (E+2+(3+4))+5        (E                   +2+(3+4))+5         reduce
  (S+2+(3+4))+5        (S                   +2+(3+4))+5         reduce
  (S+E+(3+4))+5        (S+E                 +(3+4))+5           reduce

  (Intermediate shift steps are omitted here; the full trace appears further below.)

Shift-Reduce Actions
• Parsing is a sequence of shifts and reduces
• Shift: move the look-ahead token onto the stack

  Stack    Input             Action
  (        1+2+(3+4))+5      shift 1
  (1       +2+(3+4))+5

• Reduce: replace the symbols β on top of the stack with the non-terminal X of the production X → β (i.e., pop β, push X)

  Stack    Input             Action
  (S+E     +(3+4))+5         reduce S → S + E
  (S       +(3+4))+5

Shift-Reduce Parsing

  S → S + E | E
  E → num | (S)

  Derivation         Stack    Input stream       Action
  (1+2+(3+4))+5               (1+2+(3+4))+5      shift
  (1+2+(3+4))+5      (        1+2+(3+4))+5       shift
  (1+2+(3+4))+5      (1       +2+(3+4))+5        reduce E → num
  (E+2+(3+4))+5      (E       +2+(3+4))+5        reduce S → E
  (S+2+(3+4))+5      (S       +2+(3+4))+5        shift
  (S+2+(3+4))+5      (S+      2+(3+4))+5         shift
  (S+2+(3+4))+5      (S+2     +(3+4))+5          reduce E → num
  (S+E+(3+4))+5      (S+E     +(3+4))+5          reduce S → S + E
  (S+(3+4))+5        (S       +(3+4))+5          shift
  (S+(3+4))+5        (S+      (3+4))+5           shift
  (S+(3+4))+5        (S+(     3+4))+5            shift
  (S+(3+4))+5        (S+(3    +4))+5             reduce E → num
  …

BUILDING AN LR(0) PARSER

Let's Build an LR(0) Parser!
• First we define a simple grammar
  – (1) E → E * B
  – (2) E → E + B
  – (3) E → B
  – (4) B → 0
  – (5) B → 1
• We also add a new rule, (0) S → E, which the parser uses as its final accepting rule

Items
• To create a parsing table for this grammar we introduce a special symbol, ∙, which marks the parser's current position within a rule: everything to the left of the dot has already been read, and everything to the right is what the parser expects next
• E.g. E → E ∙ + B
  – This shows that the E has already been processed and the parser is looking for a + symbol next
• Each rule with a dot placed in it is called an item
• There is one item for each position the dot can take along the right-hand side of the rule

Item Sets
• Since a parser may not know in advance which grammar rule to use, when creating our table we must use sets of items to consider all the possibilities
• E.g.
  – S → ∙ E
  – E → ∙ E * B
  – E → ∙ E + B
  – E → ∙ B
  – B → ∙ 0
  – B → ∙ 1
• The first line is the initial rule for the item set. Because the dot sits in front of the nonterminal E, we must consider every way an E can begin, so we take the closure over E: every rule for E is added with the dot at its start. (By extension, we must do the same for B, as shown by the 5th and 6th items.)
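The closure step described above can be sketched in a few lines of Python. This is only an illustration under assumptions not in the slides: an item is represented as a pair (rule index, dot position), the names GRAMMAR, closure and show are made up for this sketch, and the dot is printed as an ASCII ".".

```python
# Grammar from the slides; rule 0 is the added accepting rule S -> E.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "*", "B")),
    ("E", ("E", "+", "B")),
    ("E", ("B",)),
    ("B", ("0",)),
    ("B", ("1",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Return the LR(0) closure of a set of (rule_index, dot_position) items."""
    result = set(items)
    worklist = list(items)
    while worklist:
        rule, dot = worklist.pop()
        rhs = GRAMMAR[rule][1]
        # Only a dot standing in front of a nonterminal adds new items.
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                new_item = (index, 0)
                if lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def show(item):
    """Format an item as text, using '.' for the dot."""
    rule, dot = item
    lhs, rhs = GRAMMAR[rule]
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


# Closing over the single item S -> . E yields the six items listed above.
set0 = closure({(0, 0)})
for item in sorted(set0):
    print(show(item))
```

Starting from the single item S → ∙ E, the loop first pulls in the three E rules and then, because E → ∙ B puts the dot in front of B, the two B rules.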
Item Sets for Our Example
• Set 0
  – S → ∙ E
  – E → ∙ E * B
  – E → ∙ E + B
  – E → ∙ B
  – B → ∙ 0
  – B → ∙ 1
• Set 1
  – B → 0 ∙
• Set 2
  – B → 1 ∙
• Set 3
  – S → E ∙
  – E → E ∙ * B
  – E → E ∙ + B
• Set 4
  – E → B ∙
• Set 5
  – E → E * ∙ B
  – B → ∙ 0
  – B → ∙ 1
• Set 6
  – E → E + ∙ B
  – B → ∙ 0
  – B → ∙ 1
• Set 7
  – E → E * B ∙
• Set 8
  – E → E + B ∙

Transition Portion of Parse Table

  Item Set   *    +    0    1    E    B
  0                    1    2    3    4
  1
  2
  3          5    6
  4
  5                    1    2         7
  6                    1    2         8
  7
  8

• Each transition can be found by tracing which item set a new item set was created from
  – E.g., Item Set 7 is spawned from Item Set 5 (by moving the dot past B)

Constructing the Table
• After creating the item sets and the transitions, follow the steps below to finish the table
  1) The columns for nonterminals are copied to the goto table.
  2) The columns for the terminals are copied to the action table as shift actions.
  3) An extra column for '$' (end of input) is added to the action table; it contains acc for every item set that contains S → E ∙.
  4) If an item set i contains an item of the form A → w ∙, and A → w is rule m with m > 0, then the row for state i in the action table is completely filled with the reduce action rm.

Final Parse Table

             Action                     Goto
  State   *    +    0    1    $      E    B
  0                 s1   s2          g3   g4
  1       r4   r4   r4   r4   r4
  2       r5   r5   r5   r5   r5
  3       s5   s6             acc
  4       r3   r3   r3   r3   r3
  5                 s1   s2               g7
  6                 s1   s2               g8
  7       r1   r1   r1   r1   r1
  8       r2   r2   r2   r2   r2
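To see how the finished table drives a parse, here is a minimal Python sketch of a table-driven shift-reduce loop. It is an illustration, not part of the original slides: the names RULES, ACTION, GOTO and parse are assumptions made for this sketch, reduce rows are filled across all five action columns as step 4 prescribes, and input is assumed to be a sequence of single-character tokens.

```python
# Rules numbered as in the grammar: (left-hand side, length of right-hand side).
RULES = {
    1: ("E", 3),  # E -> E * B
    2: ("E", 3),  # E -> E + B
    3: ("E", 1),  # E -> B
    4: ("B", 1),  # B -> 0
    5: ("B", 1),  # B -> 1
}
TERMINALS = ["*", "+", "0", "1", "$"]

# Action table: "sN" = shift and go to state N, "rM" = reduce by rule M.
ACTION = {
    0: {"0": "s1", "1": "s2"},
    1: dict.fromkeys(TERMINALS, "r4"),
    2: dict.fromkeys(TERMINALS, "r5"),
    3: {"*": "s5", "+": "s6", "$": "acc"},
    4: dict.fromkeys(TERMINALS, "r3"),
    5: {"0": "s1", "1": "s2"},
    6: {"0": "s1", "1": "s2"},
    7: dict.fromkeys(TERMINALS, "r1"),
    8: dict.fromkeys(TERMINALS, "r2"),
}
GOTO = {0: {"E": 3, "B": 4}, 5: {"B": 7}, 6: {"B": 8}}


def parse(tokens):
    """Return the rule numbers of the reductions performed, or raise SyntaxError."""
    stack = [0]                      # stack of states, start in state 0
    tokens = list(tokens) + ["$"]    # append the end-of-input marker
    position = 0
    reductions = []
    while True:
        state, lookahead = stack[-1], tokens[position]
        action = ACTION[state].get(lookahead)
        if action is None:
            raise SyntaxError(f"unexpected {lookahead!r} at position {position}")
        if action == "acc":
            return reductions
        number = int(action[1:])
        if action[0] == "s":
            # Shift: push the target state and consume the token.
            stack.append(number)
            position += 1
        else:
            # Reduce: pop one state per right-hand-side symbol, then follow goto.
            lhs, length = RULES[number]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(number)


print(parse("1+1"))  # [5, 3, 5, 2]
```

The stack here holds state numbers rather than grammar symbols; each state already records which symbols have been seen, so the symbols themselves do not need to be stored. The printed list is the sequence of reductions, which read in reverse gives the right-most derivation of the input.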