CS 3240 – Chapter 6

6.1: Simplifying Grammars
  Substitution
  Removing useless variables
  Removing λ
  Removing unit productions
6.2: Normal Forms
  Chomsky Normal Form (CNF)
6.3: A Membership Algorithm
  CYK Algorithm
  An example of bottom-up parsing

CS 3240 - Normal Forms for Context-Free Languages

Variables in CFGs can often be eliminated.
If a variable is not recursive, you can substitute its rules throughout the grammar.

A ➞ a | aaA | abBc
B ➞ abbA | b
Just substitute directly for B:
A ➞ a | aaA | ababbAc | abbc

A variable is useless if:
  It can't be reached from the start variable, or
  It never leads to a terminal string
    ▪ Due to endless recursion
Both problems can be detected with a dependency graph.

S ➞ aSb | A | λ
A ➞ aA
A is useless (non-terminating):
S ➞ aSb | λ

S ➞ A
A ➞ aA | λ
B ➞ bA
B is useless (non-reachable):
S ➞ A
A ➞ aA | λ

Simplify the following:
S ➞ aS | A | C
A ➞ a
B ➞ aa
C ➞ aCb

Simplify the following grammar:
S ➞ AB | AC
A ➞ aAb | bAa | a
B ➞ bbA | aaB | AB
C ➞ abCa | aDb
D ➞ bD | aC

Any variable that can derive the empty string is said to be nullable.
Note: a variable may be indirectly nullable.
In general: if A ➞ V1V2…Vn, and all the Vi are nullable, then A is also nullable
  ▪ See Theorem 6.3

Consider the following grammar:
S ➞ a | Xb | aYa
X ➞ Y | λ
Y ➞ b | X
Which variables are nullable?
How can we substitute the effect of λ before removing it?
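The nullable rule above (A is nullable if some rule for A consists only of nullable variables) can be applied repeatedly until nothing changes. The following is a minimal sketch of that idea, not a reference implementation. The function name `nullable_variables` and the dictionary encoding are assumptions made for illustration: each variable maps to a list of rule bodies, the empty string stands for λ, and uppercase letters are variables.

```python
def nullable_variables(grammar):
    """Return the set of variables that can derive the empty string.

    grammar: dict mapping a variable to a list of rule bodies (strings).
    The empty string "" stands for λ (an assumed encoding).
    """
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var in nullable:
                continue
            # all() over an empty body is True, so a λ rule marks var directly;
            # otherwise every symbol in the body must already be nullable.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(var)
                changed = True
    return nullable


# The grammar from the slide: S ➞ a | Xb | aYa, X ➞ Y | λ, Y ➞ b | X
grammar = {
    "S": ["a", "Xb", "aYa"],
    "X": ["Y", ""],
    "Y": ["b", "X"],
}
print(sorted(nullable_variables(grammar)))  # ['X', 'Y']
```

X is nullable directly and Y is nullable indirectly through Y ➞ X; S is not nullable because each of its rules contains a terminal.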
First find all nullable variables.
Then substitute (A + λ) for every nullable variable A, and expand.
Then remove λ everywhere from the grammar.
What's left is equivalent to the original grammar, except that the empty string may be lost; we won't worry about that.

Consider the following grammar (again):
S ➞ a | Xb | aYa
X ➞ Y | λ
Y ➞ b | X
How can we substitute the effect of λ before removing it?

S → aSbS | bSaS | λ

S → aSa | bSb | X
X → aYb | bYa
Y → aY | bY | λ

Unit productions often occur in chains:
A ➞ B ➞ C
We must maintain the effect of B and C when substituting for A throughout.
Procedure:
  Find all unit chains
  Rebuild the grammar by:
    ▪ Keeping all non-unit productions
    ▪ Keeping only the effect of all unit productions/chains

S ➞ A | bb
A ➞ B | b
B ➞ S | a
Note that S ⇒* {A,B}, A ⇒* {B,S}, B ⇒* {S,A}
Giving:
S ➞ bb | b | a    // Added non-unit part of A and B
A ➞ b | a | bb    // Added non-unit part of B and S
B ➞ a | bb | b    // Added non-unit part of S and A

1) Determine all variables reachable by unit rules for each variable
2) Keep all non-unit rules
3) Substitute non-unit rules in place of each variable reachable by unit productions

S ➞ aA
A ➞ BB
B ➞ aBb | λ
Now remove nulls and see what happens…
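The three-step unit-production procedure above can be sketched in a few lines. This is an illustrative sketch under assumed conventions, not a definitive implementation: the name `remove_unit_productions` is hypothetical, each variable is a single uppercase letter with its own entry in the dictionary, and a rule body is a string.

```python
def remove_unit_productions(grammar):
    """Replace unit rules A -> B by the non-unit rules of every variable
    reachable from A through unit chains."""

    def is_unit(body):
        # A unit rule has a body that is exactly one variable.
        return len(body) == 1 and body in grammar

    result = {}
    for var in grammar:
        # Step 1: find all variables reachable from var by unit rules only.
        reachable, stack = {var}, [var]
        while stack:
            current = stack.pop()
            for body in grammar[current]:
                if is_unit(body) and body not in reachable:
                    reachable.add(body)
                    stack.append(body)
        # Steps 2 and 3: keep the non-unit rules of var and add the
        # non-unit rules of every variable reachable from it.
        bodies = []
        for source in sorted(reachable):
            for body in grammar[source]:
                if not is_unit(body) and body not in bodies:
                    bodies.append(body)
        result[var] = bodies
    return result


# The unit-chain example: S ➞ A | bb, A ➞ B | b, B ➞ S | a
grammar = {"S": ["A", "bb"], "A": ["B", "b"], "B": ["S", "a"]}
simplified = remove_unit_productions(grammar)
for var, bodies in simplified.items():
    print(var, "->", " | ".join(bodies))
```

On the unit-chain example, every variable ends up with the same three bodies (b, a, bb), which agrees with the hand-derived result, up to the order of the alternatives.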
(Also see the solution for #15 in 6.1)

S ➞ AB
A ➞ B
B ➞ aB | BB | λ

Do things in the following recommended order:
  Remove nulls
  Remove unit productions
  Remove useless variables
  Simplify by substitution as desired

Chomsky Normal Form (CNF) is very important for our purposes.
All CNF rules are of one of the following two forms:
  A ➞ c     (a single terminal)
  A ➞ XY    (exactly two variables)
Begin the transformation only after simplifying the grammar (removing λ, all unit productions, useless variables, etc.)

Convert to CNF:
S ➞ bA | aB
A ➞ bAA | aS | a
B ➞ aBB | bS | b
(NOTE: already has no nulls or units)

Convert the following grammar to CNF:
S ➞ abAB
A ➞ bAB | λ
B ➞ BAa | A | λ

Convert the following grammar to CNF:
S ➞ aS | bS | B
B ➞ bb | C | λ
C ➞ cC | λ

Greibach Normal Form (GNF): every rule is a single terminal character followed by zero or more variables (cV*, c ∈ Σ)
  V → a
  V → aBCD…
λ is allowed only in S → λ
Sometimes you need to make up new variable names.

S → AB
A → aA | bB | b
B → b
Substitute for A in the first rule (i.e., add B to each rule for A):
S → aAB | bBB | bB
The other rules are okay.

S → abSb | aa
Add rules to generate a and b:
S → aBSB | aA
A → a
B → b

The "parsing" problem
How do we know if a string is generated by a given grammar?
Bottom-up parsing (CYK Algorithm)
  An example of dynamic programming
  Requires Chomsky Normal Form (CNF)
  Start by considering the A ➞ c rules
  Build up the parse tree from there

S ➞ XY
X ➞ XA | a | b
Y ➞ AY | a
A ➞ a
Does this grammar generate "baaa"?
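The bottom-up idea described above (start from the A ➞ c rules, then combine shorter substrings into longer ones) can be sketched as a small dynamic-programming table. This is a minimal sketch, assuming the grammar is already in CNF; the names `cyk_table`, `terminal_rules`, and `binary_rules` are illustrative choices, not part of the course material.

```python
def cyk_table(word, terminal_rules, binary_rules):
    """Build the CYK table for word.

    table[(start, length)] is the set of variables that derive the
    substring word[start:start + length].
    """
    n = len(word)
    table = {}
    # Stage 1: substrings of length 1 come from the A -> c rules.
    for i, ch in enumerate(word):
        table[(i, 1)] = {head for head, c in terminal_rules if c == ch}
    # Stage k: a substring of length k is split in every possible way
    # into two shorter pieces whose cells were filled in earlier stages.
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            cell = set()
            for split in range(1, length):
                left = table[(start, split)]
                right = table[(start + split, length - split)]
                for head, (x, y) in binary_rules:
                    if x in left and y in right:
                        cell.add(head)
            table[(start, length)] = cell
    return table


# S ➞ XY, X ➞ XA | a | b, Y ➞ AY | a, A ➞ a
terminal_rules = [("X", "a"), ("X", "b"), ("Y", "a"), ("A", "a")]
binary_rules = [("S", ("X", "Y")), ("X", ("X", "A")), ("Y", ("A", "Y"))]

word = "baaa"
table = cyk_table(word, terminal_rules, binary_rules)
print(sorted(table[(0, 2)]))        # variables deriving "ba"
print(sorted(table[(1, 2)]))        # variables deriving "aa"
print("S" in table[(0, len(word))])  # is the whole string generated?
```

The string is in the language exactly when the start variable S appears in the cell that covers the whole word. The intermediate cells can be compared against a table built by hand.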
CNF yields binary trees.
(Can you find a third parse tree?)

S ➞ XY
X ➞ XA | a | b
Y ➞ AY | a
A ➞ a

Stage 1:
b ⇐ X
a ⇐ X, Y, A

Stage 2:
ba ⇐ X(X,Y,A) = XX, XY, XA ⇐ S, X
aa ⇐ (X,Y,A)(X,Y,A) = XX, XY, XA, YX, YY, YA, AX, AY, AA ⇐ S, X, Y

Stage 3:
baa ⇐ ba a ⇐ (S,X)(X,Y,A) = SX, SY, SA, XX, XY, XA ⇐ S, X
baa ⇐ b aa ⇐ X(S,X,Y) = XS, XX, XY ⇐ S
aaa ⇐ aa a ⇐ (S,X,Y)(X,Y,A) = _________
aaa ⇐ a aa ⇐ (X,Y,A)(S,X,Y) = _________

Stage 4: (you finish…)
baaa:
b aaa ⇐ __________
ba aa ⇐ __________
baa a ⇐ __________

Does the following grammar generate "abbaab"?
S ➞ SAB | λ
A ➞ aA | λ
B ➞ bB | λ