CS 3240 – Chapter 8 Is anbncn context-free? CS 3240 - Properties of Context-Free Languages 2 anbncn is not context-free Neither is ww although wwR is! We will develop a pumping lemma for context-free languages (oh joy! :-) as before, can only be used to show that a language is not CF CS 3240 - Properties of Context-Free Languages 3 How can you tell by looking at a CFG whether its language is infinite or not? CS 3240 - Properties of Context-Free Languages 4 S → aaB A → bBb | λ B → Aa Consider the derivation: S ⇒ aaB ⇒aaAa ⇒ aabBba ⇒ aabAaba ⇒ aabbBbaba ⇒ aabbAababa … CS 3240 - Properties of Context-Free Languages 5 Grammars for infinite CFL must reuse a variable in some derivation S ⇒* uAz ⇒* uvAyz That is, A ⇒* vAy ⇒ u,v,y,z,are derivable strings of characters and variables We can repeat the same choices for A again: S ⇒* uAz ⇒* uvAyz ⇒* uvvAyyz And again and again… (finally stopping with x) S ⇒* uvnAynz ⇒* uvnxynz So for any sufficiently long string, s, we have s = uvnxynz CS 3240 - Properties of Context-Free Languages 6 Hint: think of derivation trees, and the relationship between the depth and the number of leaves in a tree CS 3240 - Properties of Context-Free Languages 7 If a complete binary has depth d, how many leaves does it have? CS 3240 - Properties of Context-Free Languages 8 Consider a path from the root of a tree (S) to a leaf. It is all variables, except for the leaf The longer the string, the deeper the path Eventually a variable must be repeated! CS 3240 - Properties of Context-Free Languages 9 Based on a repeated variable (a type of loop) For sufficiently-long strings (≥ p = 2v), some variable will be a descendant of itself in the parse Every string of sufficient length from an infinite CFL can be written as uvxyz, and pumped as uvixyiz, i ≥ 0: |v| + |y| > 0 |vxy| <= p Intuitively: We’ve already used up the stack to coordinate the anbn prefix Must consider all cases for a proof CFLPL-8.PDF Use the pumping lemma with apbpapbp (N)CFLs are closed under: Union Concatenation Kleene Star “Regular Intersection” (CF ∩ R = CF) (N)CFLs are not closed under: intersection complement CS 3240 - Properties of Context-Free Languages 13 Let S1 be the start symbol for L1, and S2 for L2 Just have a new start symbol point to the OR of the old ones: S => S1 | S2 S1 => … S2 => … S => S1S2 S1 => … S2 => … Rename the old start variable to S1 S => S1S | λ S1 => … Let L1 = anbncm The concatenation of anbn and cm Let L2 = anbmcm The concatenation of an and bmcm These are both context-free L1 ∩ L2 = anbncn We already showed this is not context free Let R be a regular language and L a contextfree language The R∩L is context-free Why? CS 3240 - Properties of Context-Free Languages 18 a,λ b,X ±s s/X f/λ +f Φ f/λ a b ±A B C B A D C D A D C B CS 3240 - Properties of Context-Free Languages 19 a,λ b,X ± sA s/X; B f/λ; C sB s/X; A f/λ; D fC ;D f/λ; A fD ;C f/λ; B + fA ;B f/λ; C fB ;A f/λ; D CS 3240 - Properties of Context-Free Languages 20 jail CS 3240 - Properties of Context-Free Languages 21 Proof by contradiction, derived from the formula for intersection: L1 ∩ L2 = (L1' + L2')' Since the intersection is not closed, but union is, then the complement cannot be. (Otherwise we could compute the intersection, which in general is not CF) Non-determinism is the problem Remember NFAs? To find the complement, we needed to first convert to a DFA, then flip the states Since some CFLs are inherently nondeterministic, they have no deterministic equivalent to “flip” But… what does this say about deterministic CFLs? CS 3240 - Properties of Context-Free Languages 23 DCFLs are closed under: Complement! Concatenation Kleene Star “Regular Intersection” (CF ∩ R = CF) DCFLs are not closed under: Intersection Union! CS 3240 - Properties of Context-Free Languages 24 Consider: L1 = {aibjck | i = j} L2 = {aibjck | j = k} Each of these is DCF (Easy to show – your 7.1 homework was similar) The union is not! It requires non-determinism It’s still context-free, but not Deterministic CF DCFLs always have an associated CFG that is unambiguous Closed under Union, Concatenation, Kleene Star Not closed under intersection, complement CFL ∩ Regular = CFL DCFLs are closed under complement But not union! Swap those two Unanswerable questions Answerable questions Do 2 arbitrary CFGs generate the same language? Is a CFG ambiguous? Is a given NCFL’s complement also CF? Is the intersection of 2 given CFLs CF? Do 2 CFLs have a common word? Is a non-terminal ever used in a productive derivation? Draw the connectivity graph ✔ Does a CFG generate any words? Substitute each “terminating production” (RHS is all terminals) throughout and see what happens ▪ “back substitution method” Is a CFL finite or infinite? Procedure to detect useful, repeated variables S → aA | bB | λ A → a | aCA | bDA | bBa | aAa B → b | aAb | aCB | bDB | bBb C → aCC | bDC D → aCD | bDD First remove useless variables… CS 3240 - Properties of Context-Free Languages 31 S → aA | bB | λ A → a | bBa | aAa B → b | aAb | bBb Pick a non-empty, terminal rule: A → a Back-substitute that rule: S → aa | bB | λ A → a | bBa | aaa B → b | aab | bBb Keep going until we have S → <terminal string>, or we can’t continue. We have S → aa. STOP. CS 3240 - Properties of Context-Free Languages 32 S → aA | bB | λ A → a | bBa | aAa B → b | aAb | bBb All variables are useful. Let’s see if A is repeated, for instance. First, mark all A’s on the right: S → aA | bB | λ A → a | bBa | aAa B → b | aAb | bBb Now mark all variables affected on the left: S → aA | bB | λ A → a | bBa | aAa B → b | aAb | bBb Since A was marked on the left, it is repeated. CS 3240 - Properties of Context-Free Languages 33 S ➞ aA | SB A ➞ baB | λ B ➞ bB | bA Mark on left: S ➞ aA | SB A ➞ baB | λ B ➞ bB | bA Mark A’s on right: S ➞ aA | SB A ➞ baB | λ B ➞ bB | bA A was marked on left. DONE. Now mark corresponding variables on left: S ➞ aA | SB A ➞ baB | λ B ➞ bB | bA Repeat marking on right: S ➞ aA | SB A ➞ baB | λ B ➞ bB | bA CS 3240 - Properties of Context-Free Languages 34