Pumping Lemma for Context-free Languages Lecture 16 COT 4420

Lecture 16
Pumping Lemma for Context-free
Languages
COT 4420
Theory of Computation
Section 8.1
Statement of the CFL Pumping Lemma
Let L be an infinite context-free language.
There exists an integer m such that
every string w ∈ L with |w| ≥ m
can be decomposed as w = uvxyz such
that:
1. |vxy| ≤ m
2. |vy| ≥ 1
3. For all i ≥ 0, uv^i x y^i z ∈ L.
Proof of Pumping Lemma
Let L-{λ} be an infinite context-free language.
We have a grammar G in Chomsky Normal Form
for this language (so G has no unit productions or λ-productions).
In a derivation of a long enough string, since the
number of variables is finite, there must be
some variable that repeats.
• Now let G have k variables, choose m = 2^k,
and choose a string w ∈ L with |w| ≥ m.
• Claim: The parse tree for w must have k+2 or
more levels of nodes.
[Figure: parse tree with k levels of variables above a bottom level of at most 2^(k-1) terminals]
If we have at most k+1 levels in the parse tree of
a CNF grammar then the longest yield has length
2^(k-1).
Why? If we forget about the terminals at the
leaves, a parse tree in a CNF grammar is a binary
tree. Its k levels of variables hold at most 2^(k-1) nodes
on the bottom level, and each of these produces exactly one terminal.
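The level-counting bound can be checked numerically. The following is a minimal Python sketch, assuming levels are counted in nodes with the terminals on the last level; the helper name max_yield is hypothetical and not part of the lecture.

```python
def max_yield(levels: int) -> int:
    """Longest yield of a CNF parse tree with at most `levels` levels of nodes.

    The last level holds terminals; the levels above hold variables.
    A rule A -> BC doubles the number of nodes, a rule A -> a does not.
    """
    if levels < 2:
        return 0  # a tree needs at least one variable and one terminal
    variable_levels = levels - 1
    nodes = 1  # the root S
    for _ in range(variable_levels - 1):
        nodes *= 2  # every variable on this level branches with A -> BC
    return nodes  # each bottom-level variable emits exactly one terminal


for k in range(1, 6):
    # k+1 levels give at most 2^(k-1) terminals, which is less than m = 2^k
    print(k, max_yield(k + 1), 2 ** k)
```

For every k the second column stays strictly below the third, which is why a string of length at least 2^k forces a tree with at least k+2 levels.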
• Since |w| ≥ m = 2^k > 2^(k-1), w cannot be the
yield of any tree that has k+1 or fewer levels.
Therefore, we can conclude that the parse
tree for w must have at least k+2 levels.
• Thus, there is a path from root to a leaf with
at least k+2 nodes.
[Figure: a root-to-leaf path V1 = S (root), V2, V3, ..., Vk+1, σ in a tree with at least k+2 levels; the path has at least k+1 variables followed by a terminal σ]
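• Since this path has at least k+1 variables but there are
only k variables in the grammar, some variable is repeated, say A.
• Take A to be the deepest such variable, so that
only A is repeated in the subtree rooted at the upper A.
[Figure: parse tree with root S and yield u v x y z; the upper A yields vxy and the lower A yields x]
We can write w = uvxyz where u, v, x, y, z are strings
of terminals:
S =>* uAz
A =>* vAy
A =>* x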
Other possible derivations:
S =>* uAz =>* uxz, which is uv^0 x y^0 z
[Figure: parse tree with root S in which A directly yields x, giving the yield u x z]
Other possible derivations:
S =>* uAz =>* uvAyz =>* uvvAyyz =>* uvvxyyz, which is uv^2 x y^2 z
[Figure: parse tree with root S and three nested A's, giving the yield u v v x y y z]
Therefore,
knowing that w = uvxyz ∈ L(G),
we also know:
uv^i x y^i z ∈ L(G) for all i = 0, 1, 2, …
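To see pumping on a language that really is context-free, here is a small Python sketch. It assumes the language { a^n b^n : n ≥ 0 } as an illustrative example (it is not taken from the lecture), and the helper names pump and in_anbn are hypothetical.

```python
def pump(u: str, v: str, x: str, y: str, z: str, i: int) -> str:
    """Return the pumped string u v^i x y^i z."""
    return u + v * i + x + y * i + z


def in_anbn(s: str) -> bool:
    """Membership test for the context-free language { a^n b^n : n >= 0 }."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


# One valid decomposition of w = aaabbb: v and y sit on opposite sides of x.
u, v, x, y, z = "a", "a", "ab", "b", "b"
assert u + v + x + y + z == "aaabbb"

for i in range(5):
    s = pump(u, v, x, y, z, i)
    print(i, s, in_anbn(s))  # every pumped string stays in the language
```

Because v adds a's and y adds b's in equal numbers, every pumped string remains of the form a^n b^n, as the lemma requires for a context-free language.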
Observation 1:
|vy| ≥ 1
Since G has no unit productions and no λ-productions, v
and y cannot both be empty strings: the upper A is expanded by a rule
A → BC, the lower A lies below only one of B and C, and the other
one yields a non-empty part of v or y.
Observation 2:
|vxy| ≤ m
Since A is the deepest
repeated variable, every path in the
subtree rooted at the upper A (shown in green on the slide) has
at most k+1 variables, so this subtree has at most k+2 levels.
Therefore, the yield
of this subtree is no
longer than 2^k = m.
Statement of the CFL Pumping Lemma
Let L be an infinite context-free language.
There exists an integer m such that
every string w ∈ L with |w| ≥ m
can be decomposed as w = uvxyz such
that:
1. |vxy| ≤ m
2. |vy| ≥ 1
3. For all i ≥ 0, uv^i x y^i z ∈ L.
Applications of Pumping Lemma
Example 1: { a^n b^n c^n : n ≥ 0 }
Show that the language L = { a^n b^n c^n : n ≥ 0 } is
not context-free.
Proof: Using the Pumping Lemma, we first assume
for contradiction that L is context-free.
Since L is infinite and context-free, we can apply
the pumping lemma.
Let m be the critical length of the pumping
lemma. Pick a string w in L such that |w| ≥ m.
• We pick w = a^m b^m c^m
• We can write w = uvxyz such that
|vxy| ≤ m and |vy| ≥ 1
• Pumping Lemma says: uv^i x y^i z ∈ L for all i ≥ 0
• We examine all the possible locations of string
vxy in w:
Case 1: vxy is in a^m
Case 2: vxy is in b^m
Case 3: vxy is in c^m
Case 4: vxy overlaps a^m and b^m
Case 5: vxy overlaps b^m and c^m
Since |vxy| ≤ m, vxy cannot contain an a and a c at the same time,
so these five cases are the only ones.
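The case list can be sanity-checked by enumerating every possible window vxy. The sketch below is only an illustration for a small m chosen here, not part of the proof; the function name window_letter_sets is hypothetical.

```python
def window_letter_sets(m: int) -> set[str]:
    """Letter sets of all substrings vxy of a^m b^m c^m with 1 <= |vxy| <= m."""
    w = "a" * m + "b" * m + "c" * m
    seen = set()
    for start in range(len(w)):
        for length in range(1, m + 1):
            window = w[start:start + length]
            if len(window) == length:  # skip windows that run off the end of w
                seen.add("".join(sorted(set(window))))
    return seen


print(sorted(window_letter_sets(4)))  # ['a', 'ab', 'b', 'bc', 'c']
```

No window of length at most m contains both an a and a c, which matches the five cases listed for the location of vxy.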
Case 1: vxy is in a^m
In this case the pumped string uv^2 x y^2 z has
more a’s than b’s and c’s, because |vy| ≥ 1 adds at least one a,
and therefore is not in the language.
Contradiction!!!
Case 2: vxy is in b^m
Similar to case 1, the pumped string
uv^2 x y^2 z has more b’s than
a’s and c’s and therefore is not in the
language.
Contradiction!!!
Case 3: vxy is in c^m
Similar to case 1, the pumped string
uv^2 x y^2 z has more c’s than
a’s and b’s and therefore is not in the
language.
Contradiction!!!
Case 4: vxy overlaps a^m and b^m
In this case, uv^2 x y^2 z has exactly m
c’s but more than m a’s or more than m b’s, and
therefore is not in the language.
• Let’s look at each sub-case and see
what happens …
Case 4: vxy overlaps a^m and b^m
Sub-case 1: v contains only a’s and
y contains only b’s
v = a^{k1}
y = b^{k2}, with k1 + k2 ≥ 1
uv^2 x y^2 z = a^{m+k1} b^{m+k2} c^m ∉ L
Contradiction!!!
Case 4: vxy overlaps a^m and b^m
Sub-case 2: v contains a’s and b’s and
y contains only b’s
v = a^{k1} b^{k2}, with k1, k2 ≥ 1
y = b^{k3}
uv^2 x y^2 z = a^m b^{k2} a^{k1} b^{m+k3} c^m ∉ L
Contradiction!!!
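A concrete instance makes the shape of the pumped string in sub-case 2 easier to see. The values m = 3 and k1 = k2 = k3 = 1, and the choice of an empty x, are assumptions picked here purely for illustration.

```python
m, k1, k2, k3 = 3, 1, 1, 1  # illustrative values, not from the lecture

w = "a" * m + "b" * m + "c" * m
u = "a" * (m - k1)
v = "a" * k1 + "b" * k2               # v straddles the a/b boundary
x = ""                                # assumption: x is empty in this instance
y = "b" * k3
z = "b" * (m - k2 - k3) + "c" * m

assert u + v + x + y + z == w         # the pieces really decompose w
assert len(v + x + y) <= m and len(v + y) >= 1

pumped = u + v * 2 + x + y * 2 + z
expected = "a" * m + "b" * k2 + "a" * k1 + "b" * (m + k3) + "c" * m
print(pumped, pumped == expected)     # aaababbbbccc True
```

The pumped string has an a after a b, so it is not even of the form a*b*c* and cannot be in L.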
Case 4: vxy overlaps a^m and b^m
Sub-case 3: v contains only a’s and
y contains a’s and b’s
Similar to sub-case 2
Case 5: vxy overlaps b^m and c^m
Similar to case 4
In all cases we obtained a contradiction.
Therefore, the original assumption that
L = { a^n b^n c^n : n ≥ 0 }
is context-free must be wrong.
Conclusion: L is not context-free
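The case analysis can be cross-checked by brute force for small values of m. The sketch below is only a sanity check under the assumption that testing i = 0 and i = 2 is enough to expose a failure; it is not a substitute for the proof, and the function names are hypothetical.

```python
def in_L(s: str) -> bool:
    """Membership test for L = { a^n b^n c^n : n >= 0 }."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def some_decomposition_pumps(m: int) -> bool:
    """True if some split w = uvxyz of a^m b^m c^m with |vxy| <= m and
    |vy| >= 1 keeps u v^i x y^i z in L for both i = 0 and i = 2."""
    w = "a" * m + "b" * m + "c" * m
    n = len(w)
    for p in range(n + 1):                      # u = w[:p]
        end = min(p + m, n)                     # vxy must end by here
        for q in range(p, end + 1):             # v = w[p:q]
            for r in range(q, end + 1):         # x = w[q:r]
                for t in range(r, end + 1):     # y = w[r:t], z = w[t:]
                    u, v, x, y, z = w[:p], w[p:q], w[q:r], w[r:t], w[t:]
                    if not v and not y:
                        continue                # violates |vy| >= 1
                    if all(in_L(u + v * i + x + y * i + z) for i in (0, 2)):
                        return True
    return False


for m in range(1, 6):
    print(m, some_decomposition_pumps(m))  # False for each m tried here
```

For each m tried, no admissible decomposition survives pumping, in agreement with the conclusion that L is not context-free.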