CS 241 Tutorial 7 Solution
Rob Schluntz
February 17, 2012

1 Context Free Grammar

Formal Definition

A context-free grammar is a 4-tuple (V, Σ, S, R) where:

V – a finite set of non-terminals (variables)
Σ – a finite set of terminals (alphabet symbols)
S ∈ V – the starting non-terminal
R – a finite set of productions (rules)

Exercise

Let Σ = {b, d}. The following is a context free grammar G that specifies the language L:

S → b S d S
S → ε

1. What are the non-terminals in this grammar?
2. What are the terminals in this grammar?
3. How many productions are in this grammar?
4. Determine whether each of the following words is in L by attempting to find a derivation for it.
   (a) b
   (b) d
   (c) bd
   (d) ε
   (e) bbdd
   (f) bdbdbdbd
   (g) db
   (h) bbbbdddbbbddd
   (i) bdbbbdddbd

2 Writing Context Free Grammar

1. Let Σ = {a, b}. Write a context free grammar that specifies the regular expression a*.
2. Let Σ = {a, b}. Let L = { x ∈ Σ* | ∃i such that x = a^i b^i, where i ≥ 0, i ∈ Z }. Write a context free grammar that specifies this language.
3. Let Σ = { “{”, “}”, “[”, “]”, “(”, “)” }. Write a context free grammar that recognizes balanced brackets for { }, ( ), [ ]. The brackets can be nested arbitrarily.
4. Let Σ = {a, b}. Let L = { x ∈ Σ* | x = a^n b^m, m > n ≥ 0 }. Write a context free grammar that specifies this language.

3 Left-most Derivation

Given the following CFG productions:

• S → (f S)
• S → ε
• S → SS
• S → v

Create a left-most canonical derivation for: (f v v (f (f v)) v)

NOTICE: any peculiarities of this grammar? It is ambiguous! For the same input string you can get multiple left-most (or right-most) canonical derivations. How would you fix it?

4 Parse Trees & Ambiguity

Definition: A CFG is ambiguous if there is a string generated by the CFG for which more than one parse tree exists.
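The definition can be checked mechanically on small inputs by counting parse trees. The following is a minimal sketch, not part of the tutorial: it uses a fragment of the Section 3 grammar with the ε production dropped (S → (f S), S → SS, S → v), because keeping ε alongside S → SS would give every string infinitely many parse trees. The function names `tokenize` and `count_parse_trees` are hypothetical.

```python
from functools import lru_cache


def tokenize(text):
    """Split a string such as "(f v)" into the tokens '(', 'f', 'v', ')'."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def count_parse_trees(tokens):
    """Count parse trees for tokens under S -> ( f S ) | S S | v.

    Assumption: the epsilon production of the Section 3 grammar is
    omitted so that the count is always finite.
    """
    toks = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct parse trees deriving toks[i:j] from S.
        n = j - i
        if n <= 0:
            return 0  # nothing derives the empty string in this fragment
        total = 0
        # S -> v
        if n == 1 and toks[i] == "v":
            total += 1
        # S -> ( f S )
        if n >= 4 and toks[i] == "(" and toks[i + 1] == "f" and toks[j - 1] == ")":
            total += count(i + 2, j - 1)
        # S -> S S : try every split point between the two halves
        for k in range(i + 1, j):
            total += count(i, k) * count(k, j)
        return total

    return count(0, len(toks))


if __name__ == "__main__":
    for word in ["v", "v v v", "(f v v (f (f v)) v)"]:
        print(word, "->", count_parse_trees(tokenize(word)))
```

Any word with a count greater than 1 witnesses the ambiguity. Under these assumptions, "v v v" has 2 parse trees and "(f v v (f (f v)) v)" has 5, one for each way of grouping the four items inside the outer brackets with S → SS.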
Comparing Regular Languages and Context-Free Languages

Write a context-free grammar that specifies the set of all syntactically valid regular expressions over Σ_reg, using ‘sym’ to represent a symbol in Σ_reg. Use terminal symbols Σ = {(, ), |, ∗, sym, ε} and non-terminal symbols V = {R}.

Is this grammar ambiguous? Why or why not? If this grammar is ambiguous, construct a CFG for regular expressions that is unambiguous.

“Prove” that RL ⊂ CFL

1. Show that for any regular language L we can construct a context-free grammar G such that L(G) = L.
2. Show that there exists a grammar G such that L(G) cannot be recognized by a DFA.

Solution to Ambiguous Grammar

Why is having an ambiguous grammar a bad thing?

1. Usually, we give meaning to a parse tree. If a string has two distinct parse trees, then we can assign it two different meanings, which can cause problems. For example, for 1 + 2 × 3 we really want 1 + (2 × 3) and not (1 + 2) × 3. The trees below illustrate the different parse trees we can come up with for the expression 1 + 2 × 3.

expr
├── expr
│   └── ID (1)
├── +
└── expr
    ├── expr
    │   └── ID (2)
    ├── ×
    └── expr
        └── ID (3)

Figure 1: 1 + (2 × 3) = 7

expr
├── expr
│   ├── expr
│   │   └── ID (1)
│   ├── +
│   └── expr
│       └── ID (2)
├── ×
└── expr
    └── ID (3)

Figure 2: (1 + 2) × 3 = 9

5 Problem 1: Create the Parse Tree

• Given the following CFG productions:
  – S → N
  – S → ε
  – N → N + N
  – N → ID
• A) Is it ambiguous? Try to create two different parse trees for a string generated by the grammar.
• B) How can this grammar be modified to be unambiguous (if it is ambiguous)?

6 Operator Precedence in CFGs

Operator precedence is expressed in a CFG by using several non-terminals, each of which expands to use a different set of operators. The deeper a non-terminal and its associated operator sit in the CFG’s recursion, the higher that operator’s precedence: it appears lower in the parse tree and is therefore evaluated first.
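To make the layering concrete, here is a small sketch of an evaluator with one function per non-terminal. It assumes the usual layered grammar expr → expr + term | expr − term | term, term → term ∗ factor | factor, factor → (expr) | ID, with integer literals standing in for ID. The left-recursive productions are implemented as loops, a standard substitution for recursive descent, and the names `tokenize` and `evaluate` are hypothetical.

```python
import re


def tokenize(text):
    """Split the input into integer literals, operators and parentheses."""
    return re.findall(r"\d+|[-+*()]", text)


def evaluate(text):
    """Evaluate an arithmetic expression using a layered grammar.

    Assumption: integers play the role of ID. Each nested function
    corresponds to one non-terminal; `term` is called from inside
    `expr`, so '*' binds tighter than '+' and '-'.
    """
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        # expr -> expr + term | expr - term | term (left-associative loop)
        value = term()
        while peek() in ("+", "-"):
            op = advance()
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term():
        # term -> term * factor | factor
        value = factor()
        while peek() == "*":
            advance()
            value *= factor()
        return value

    def factor():
        # factor -> ( expr ) | ID
        tok = advance()
        if tok == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        return int(tok)

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return result


if __name__ == "__main__":
    print(evaluate("1 + 2 * 3"))    # 7
    print(evaluate("(1 + 2) * 3"))  # 9
```

Because multiplication is handled one level further down the call chain than addition, 2 * 3 is grouped before the sum is formed, which matches the 1 + (2 × 3) reading.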
7 Problem 2: Add Exponentiation

• Given the following CFG productions:
  – expr → expr + term
  – expr → expr − term
  – expr → term
  – term → term ∗ factor
  – term → factor
  – factor → (expr)
  – factor → ID
• Add the exponentiation operator (^) to the language, such that it has the highest precedence of all operators. It should work the same as the other operators above.
• Try to create a derivation or parse tree that shows that a use of exponentiation is lower in the parse tree/derivation than any of the other operators.
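One possible way to approach this, offered as a sketch rather than the official solution, is to insert a new non-terminal between term and factor. The name `power` is hypothetical, and the assumed productions are term → term ∗ power | power and power → power ^ factor | factor, which keeps ^ left-associative like the other operators. The code below builds a simplified tree (operators and identifiers only, without the expr/term/factor nodes) and reports how deep each operator sits.

```python
import re


def parse(text):
    """Parse into nested tuples (op, left, right); leaves are identifier strings.

    Assumed grammar (the `power` level is a hypothetical addition):
      expr   -> expr + term | expr - term | term
      term   -> term * power | power
      power  -> power ^ factor | factor
      factor -> ( expr ) | ID
    """
    tokens = re.findall(r"[A-Za-z_]\w*|[-+*^()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def left_assoc(operand, operators):
        # Implements X -> X op operand | operand as a loop that
        # builds a left-leaning tree.
        node = operand()
        while peek() in operators:
            op = advance()
            node = (op, node, operand())
        return node

    def expr():
        return left_assoc(term, ("+", "-"))

    def term():
        return left_assoc(power, ("*",))

    def power():
        return left_assoc(factor, ("^",))

    def factor():
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError("expected an identifier, got " + tok)
        return tok

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return tree


def operator_depths(tree, depth=0):
    """List (operator, depth) pairs in pre-order; the root has depth 0."""
    if isinstance(tree, str):
        return []
    op, left, right = tree
    return ([(op, depth)]
            + operator_depths(left, depth + 1)
            + operator_depths(right, depth + 1))


if __name__ == "__main__":
    print(operator_depths(parse("a + b * c ^ d")))
    # [('+', 0), ('*', 1), ('^', 2)]
```

In this sketch ^ always ends up deepest, so it is applied first. Note that mathematical convention usually treats exponentiation as right-associative; the left-associative form here follows the exercise's request that ^ behave like the other operators.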