Ch16.doc


Chapter 16

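If each live production is used at most once in a derivation, the language is finite. (In CNF, a live production has the form Nonterminal -> Nonterminal Nonterminal, and a dead production has the form Nonterminal -> terminal.)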

S -> AB

A -> XY

X -> b

Y -> a

B -> a

S => AB => XYB => bYB => baB => baa (the derivation uses one more dead production than live productions)

Now consider a string w, produced by some grammar, whose derivation uses the production Z -> XY twice.

S -> AB

A -> XY

X -> b

Y -> a

B -> XY

S => AB => XYB => bYB => baB => baXY => babY => baba
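The two derivations above can be replayed mechanically. The following is a minimal sketch, not part of the original notes: the dictionary encoding of the grammars and the helper name `replay` are assumptions made for illustration. It expands the leftmost nonterminal at each step and counts how many live and dead productions were applied.

```python
def replay(grammar, start="S"):
    """Expand the leftmost nonterminal until only terminals remain.

    Assumes every nonterminal (an uppercase letter) has exactly one
    production and the grammar is not recursive; both hold for the two
    example grammars above.
    """
    working = start
    steps = [working]
    live = dead = 0
    while any(ch.isupper() for ch in working):
        # Position of the leftmost nonterminal in the working string.
        i = next(k for k, ch in enumerate(working) if ch.isupper())
        body = grammar[working[i]]
        if len(body) == 2:
            live += 1   # Nonterminal -> Nonterminal Nonterminal
        else:
            dead += 1   # Nonterminal -> terminal
        working = working[:i] + body + working[i + 1:]
        steps.append(working)
    return steps, live, dead


# The two example grammars, one production per nonterminal.
first = {"S": "AB", "A": "XY", "X": "b", "Y": "a", "B": "a"}
second = {"S": "AB", "A": "XY", "X": "b", "Y": "a", "B": "XY"}

for grammar in (first, second):
    steps, live, dead = replay(grammar)
    print(" => ".join(steps), f"(live: {live}, dead: {dead})")
```

Both runs should report one more dead production than live productions, matching the remark after the first derivation.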

Somewhere in the derivation sequence there was a working string s_1 Z s_2, and before the production was used the second time there was a working string s_1 s_3 Z s_4.

How did the second Z come to be?

It is either a tree descendant of the first Z or else it comes from something in the old s_2.

Tree descendant example:

Z -> BB

B -> ZA

A -> a

B -> b

A -> AA

A -> BC

C -> BB

A -> a

B -> b

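[Figure: two derivation trees rooted at S, one for each grammar above. In the first tree a second Z appears as a tree descendant of the first Z.]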
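Whether a nonterminal is a tree descendant of itself can be checked by walking a derivation tree and remembering the labels on the current path. The sketch below is illustrative only: the nested-tuple encoding of trees and the function names are assumptions. The first tree uses only the productions Z -> BB, B -> ZA, A -> a, B -> b from the example (rooted at Z), and the second uses only A -> BC, C -> BB, B -> b.

```python
def find_self_descendant(tree, ancestors=()):
    """Return a nonterminal that occurs twice on one root-to-leaf path, or None.

    A tree node is a tuple (label, child, child, ...); a leaf is a plain
    string holding one terminal.
    """
    if isinstance(tree, str):          # terminal leaf: nothing to check
        return None
    label, *children = tree
    if label in ancestors:             # same nonterminal higher on this path
        return label
    for child in children:
        found = find_self_descendant(child, ancestors + (label,))
        if found is not None:
            return found
    return None


def word_of(tree):
    """Concatenate the terminal leaves from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(word_of(child) for child in tree[1:])


# Z -> BB, then the left B -> ZA, so a second Z sits below the first Z.
tree1 = ("Z",
         ("B", ("Z", ("B", "b"), ("B", "b")), ("A", "a")),
         ("B", "b"))

# A -> BC, C -> BB: no nonterminal repeats along any single path.
tree2 = ("A", ("B", "b"), ("C", ("B", "b"), ("B", "b")))

print(word_of(tree1), find_self_descendant(tree1))
print(word_of(tree2), find_self_descendant(tree2))
```

Note that a nonterminal appearing twice in a tree is not enough; in the second tree B appears three times, but never below another B.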

Theorem 33

If G is a CFG in CNF that has p live productions and q dead productions, and if w is a word generated by G that has more than 2^p letters in it, then somewhere in every derivation tree for w there is an example of some nonterminal (call it Z) being used twice where the second Z is a tree descendant of the first Z.

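A word with more than 2^p letters forces the derivation tree to have at least p + 1 rows of live productions, because each row of live productions can at most double the number of symbols. Some path down the tree therefore passes through at least p + 1 live productions. Since there are only p different live productions, one of them is used twice on that path, so its nonterminal is used twice in the tree, as follows: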

[Figure: derivation tree in which a nonterminal appears twice along one path from the root.]
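The counting argument behind Theorem 33 can be checked on a small grammar. The sketch below is an illustration under stated assumptions: the dictionary encoding and the function name `longest_without_repeat` are hypothetical, and the grammar reuses the productions Z -> BB, B -> ZA, A -> a, B -> b from the tree descendant example with Z as the start symbol. It computes the longest word obtainable from trees in which no nonterminal is a tree descendant of itself and compares that length with 2^p.

```python
def longest_without_repeat(grammar, symbol, on_path=frozenset()):
    """Length of the longest word derivable from `symbol` by a tree in which
    no nonterminal occurs twice on a root-to-leaf path (0 if there is none).

    `grammar` maps each nonterminal to a list of CNF bodies: a two-letter
    string of nonterminals (live) or a one-letter terminal (dead).
    """
    if symbol in on_path:
        return 0
    path = on_path | {symbol}
    best = 0
    for body in grammar[symbol]:
        if len(body) == 1:                 # dead production: one terminal
            best = max(best, 1)
        else:                              # live production: two nonterminals
            left = longest_without_repeat(grammar, body[0], path)
            right = longest_without_repeat(grammar, body[1], path)
            if left and right:             # both halves must be derivable
                best = max(best, left + right)
    return best


# Assumed encoding of the example productions, with Z as start symbol.
grammar = {"Z": ["BB"], "B": ["ZA", "b"], "A": ["a"]}

# p = number of live productions in the grammar.
p = sum(len(body) == 2 for bodies in grammar.values() for body in bodies)
longest = longest_without_repeat(grammar, "Z")
print(p, longest, 2 ** p)
```

For this grammar the longest such word is no longer than 2^p, so any longer word must come from a tree in which some nonterminal repeats along a path.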

Pumping Lemma

Let L be a CFL generated by a CFG in CNF with p live productions. Then any word w in L with length > 2^p can be broken into five parts, w = uvxyz, such that

length(vxy) <= 2^p

length(x) > 0

length(v) + length(y) > 0

and such that all the words uv^n xy^n z (n = 1, 2, 3, ...) are in the language L.

Proof: Because the length of w is greater than 2^p, the theorem above guarantees that some nonterminal in the derivation tree of w is a tree descendant of itself. Let P be that self-descended nonterminal.

Let P -> QR be the production for P. Suppose the tree for w looks like this:

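[Figure: derivation tree for w with root S. An upper P expands by P -> QR and has a second P among its descendants; the letters of w are labeled u, v, x, y, z from left to right.]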

The complete tree could look like this:

[Figure: a complete derivation tree containing an upper P and a lower P, with its terminal leaves grouped into u, v, x, y, z.]

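Either u or z, or both, may be empty. However, v and y cannot both be empty. Here x is the part of w generated by the lower P, and vxy is the part generated by the upper P. Because the upper P generates vPy, the lower P can be expanded in the same way again, so it is now clear that v and y can be repeated any number of times.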
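Pumping can be tried out on a concrete grammar. The sketch below is not from the original notes: it assumes a hypothetical CNF grammar for { a^n b^n | n > 0 } (S -> AB | AT, T -> SB, A -> a, B -> b), a textbook-style CYK membership test written for this illustration, and a hand-picked split of aaabbb into u, v, x, y, z. It checks that each pumped word is still generated by the grammar.

```python
def cyk(grammar, word, start="S"):
    """CYK membership test for a grammar in CNF.

    `grammar` maps each nonterminal to a list of bodies: two nonterminals
    (e.g. "AB") or a single terminal (e.g. "a").
    """
    n = len(word)
    if n == 0:
        return False
    # table[i][l - 1] holds the nonterminals that derive word[i:i + l].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        for head, bodies in grammar.items():
            if ch in bodies:
                table[i][0].add(head)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for head, bodies in grammar.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[i][length - 1].add(head)
    return start in table[0][n - 1]


def pump(u, v, x, y, z, n):
    """Build the word u v^n x y^n z."""
    return u + v * n + x + y * n + z


# Hypothetical CNF grammar for { a^n b^n | n > 0 }; S is self-embedded via T.
grammar = {"S": ["AB", "AT"], "T": ["SB"], "A": ["a"], "B": ["b"]}

# Hand-picked split of w = aaabbb: x comes from the lower S, vxy from the upper S.
u, v, x, y, z = "a", "a", "ab", "b", "b"

for n in range(1, 5):
    word = pump(u, v, x, y, z, n)
    print(n, word, cyk(grammar, word))
```

With this split, n = 1 gives back aaabbb itself and each larger n adds one a and one b, so every pumped word stays in the language.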

Example:

L = { a^n b^n c^n | n > 0 }

Assume some word w has more than 2^p letters, for example

w = a^{200} b^{200} c^{200}

Now any method of dividing w = uvxyz into five parts will mean that uv^2 xy^2 z is not in L.

All words in a^n b^n c^n have exactly one occurrence of the substring ab, no matter what n is. If either the v part or the y part has the substring ab in it, then uv^2 xy^2 z will have more than one occurrence of ab, so it cannot be in L. So neither v nor y can contain ab.

Similarly, there is only one occurrence of bc, so neither v nor y can contain bc.

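So v and y must each be a solid block of one letter: all a’s, all b’s, or all c’s. Pumping once then increases the number of at most two of the three letters and leaves the third unchanged, so uv^2 xy^2 z is not in L.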
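The claim that no split of the word survives pumping can be checked by brute force on a small instance. The sketch below is an illustration only: it assumes the short word a^4 b^4 c^4 in place of a^{200} b^{200} c^{200} so that the search finishes quickly, and the helper names are hypothetical. It tries every split w = uvxyz in which v and y are not both empty and keeps those for which uvvxyyz is still in L.

```python
import re


def in_L(s):
    """True when s has the form a^n b^n c^n with n > 0."""
    m = re.fullmatch(r"(a+)(b+)(c+)", s)
    return (m is not None
            and len(m.group(1)) == len(m.group(2)) == len(m.group(3)))


def splits(w):
    """Yield every split w = uvxyz in which v and y are not both empty."""
    n = len(w)
    for i in range(n + 1):
        for j in range(i, n + 1):
            for k in range(j, n + 1):
                for m in range(k, n + 1):
                    if (j - i) + (m - k) > 0:
                        yield w[:i], w[i:j], w[j:k], w[k:m], w[m:]


size = 4                                   # assumed small stand-in for 200
w = "a" * size + "b" * size + "c" * size

# Keep only the splits whose pumped word uvvxyyz is still in L.
survivors = [(u, v, x, y, z) for u, v, x, y, z in splits(w)
             if in_L(u + v + v + x + y + y + z)]
print(len(survivors))
```

The search does not even need the length conditions of the lemma: the list of surviving splits is empty, which matches the argument above.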

Example:

L = { a^n b^m a^n b^m | m > 0, n > 0 }

Let the CFG in CNF have p live productions, and look at the word a^{2^p} b^{2^p} a^{2^p} b^{2^p}. Since length(vxy) <= 2^p, v and y cannot be solid blocks of one letter separated by a clump of the other letter, because vxy would then have to contain the whole separator clump of 2^p letters plus at least one letter on each side, which is more than 2^p letters.

Counting substrings ab and ba, we see that v and y must each be a solid block of one letter. Because of the length condition, v and y must come from the same clump or from two adjacent clumps. Either way, pumping changes the length of at least one clump, and of at most one clump of a’s and at most one clump of b’s. So uvvxyyz is not of the form a^n b^m a^n b^m, although the pumping lemma says it must also be in L. Therefore L is not context-free.
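The role of the length condition in this example can be seen in a brute-force search. The sketch below is illustrative and rests on assumptions: a clump length of 4 stands in for 2^p, and the helper names are hypothetical. It looks for splits w = uvxyz, with v and y not both empty, whose pumped word uvvxyyz is still in L, first with the restriction on length(vxy) and then without it.

```python
import re


def in_L(s):
    """True when s has the form a^n b^m a^n b^m with n, m > 0."""
    m = re.fullmatch(r"(a+)(b+)(a+)(b+)", s)
    return (m is not None
            and len(m.group(1)) == len(m.group(3))
            and len(m.group(2)) == len(m.group(4)))


def surviving_splits(w, max_vxy=None):
    """Yield every split w = uvxyz (v, y not both empty) with uvvxyyz in L."""
    n = len(w)
    for i in range(n + 1):
        for j in range(i, n + 1):
            for k in range(j, n + 1):
                for m in range(k, n + 1):
                    if (j - i) + (m - k) == 0:
                        continue          # v and y both empty
                    if max_vxy is not None and m - i > max_vxy:
                        continue          # vxy longer than allowed
                    u, v, x, y, z = w[:i], w[i:j], w[j:k], w[k:m], w[m:]
                    if in_L(u + v + v + x + y + y + z):
                        yield u, v, x, y, z


clump = 4                                 # assumed stand-in for 2^p
w = ("a" * clump + "b" * clump) * 2

bounded = list(surviving_splits(w, max_vxy=clump))
unbounded = list(surviving_splits(w))
print(len(bounded), len(unbounded) > 0)
```

With the length restriction no split survives, as the argument above predicts. Without it, splits that take v and y from the two clumps of a’s (or the two clumps of b’s) do survive, which is why the length condition is needed for this language.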
