Context-Free Languages

CD5560
FABER
Formal Languages, Automata
and Models of Computation
Lecture 9
Mälardalen University
2005
1
Content
The Pumping Lemma for CFL
Applications of the Pumping Lemma for CFL
Midterm Exam 2: Context-Free Languages
2
Pumping Lemma for CFL’s
3
Comparison to Regular Language
Pumping Lemma/Condition
4
What’s the Difference between CFL’s and
Regular Languages?
• In regular languages, a single substring “pumps”
– Consider the language of even-length strings over {a,b}
– We can identify a single substring which can be pumped
• In CFL’s, multiple substrings can “pump”
– Consider the language {a^n b^n | n > 0}
– No single substring can be pumped and allow us to stay in the language
– However, there do exist pairs of substrings which can be pumped, resulting in strings which stay in the language
• This results in a modified pumping condition
5
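A minimal Python sketch of the contrast above, using the string aaabbb from {a^n b^n | n > 0}. The helper names (in_anbn, single_pump_survives) are hypothetical and chosen only for illustration. It checks one string and a handful of values of i, so it illustrates the claim rather than proving it.

```python
def in_anbn(s: str) -> bool:
    """Membership test for {a^n b^n | n > 0}."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n


def single_pump_survives(x: str) -> bool:
    """True if some split x = u v w with v nonempty keeps u v^2 w in the language."""
    for start in range(len(x)):
        for end in range(start + 1, len(x) + 1):
            u, v, w = x[:start], x[start:end], x[end:]
            if in_anbn(u + v * 2 + w):
                return True
    return False


x = "aaabbb"
# No single substring of x can be doubled without leaving the language.
print(single_pump_survives(x))

# A pair of substrings (v = a, y = b) can be pumped together.
u, v, w, y, z = "aa", "a", "", "b", "bb"
print(all(in_anbn(u + v * i + w + y * i + z) for i in range(6)))
```

The first line printed is False and the second is True.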
Modified Pumping Condition
• A language L satisfies the regular language pumping condition if:
– there exists an integer m > 0 such that
– for all strings x in L of length at least m
– there exist strings u, v, w such that
• x = uvw and
• |uv| ≤ m and
• |v| ≥ 1 and
• For all i ≥ 0, u v^i w is in L
• A language L satisfies the CFL pumping condition if:
– there exists an integer m > 0 such that
– for all strings x in L of length at least m
– there exist strings u, v, w, y, z such that
• x = uvwyz and
• |vwy| ≤ m and
• |vy| ≥ 1 and
• For all i ≥ 0, u v^i w y^i z is in L
6
Pumping Lemma
• All CFL’s satisfy the CFL pumping condition
[Figure: nested sets. CFL’s inside “Pumping Languages” inside all languages over {a,b}]
7
Implications
[Figure: nested sets. CFL’s inside “Pumping Languages” inside all languages over {a,b}]
• We can use the pumping lemma to prove a language
L is not a CFL
– Show L does not satisfy the CFL pumping
condition
• We cannot use the pumping lemma to prove a
language is context-free
– Showing L satisfies the pumping condition does
not guarantee that L is context-free
8
Pumping Lemma
What does it mean?
9
Pumping Condition
• A language L satisfies the CFL pumping condition if:
– there exists an integer m > 0 such that
– for all strings x in L of length at least m
– there exist strings u, v, w, y, z such that
• x = uvwyz and
• |vwy| ≤ m and
• |vy| ≥ 1 and
• For all i ≥ 0, u v^i w y^i z is in L
10
v and y can be pumped
1) x in L
2) x = uvwyz
3) For all i ≥ 0, u v^i w y^i z is in L
• Let x = abcdefg be in L
• Then there exist 2 substrings v and y in x such that v and y can be repeated (pumped) in place any number of times and the resulting string is still in L
– u v^i w y^i z is in L for all i ≥ 0
• For example
– v = cd and y = f (so u = ab, w = e, z = g)
• u v^0 w y^0 z = uwz = abeg is in L
• u v^1 w y^1 z = uvwyz = abcdefg is in L
• u v^2 w y^2 z = uvvwyyz = abcdcdeffg is in L
• u v^3 w y^3 z = uvvvwyyyz = abcdcdcdefffg is in L
• …
11
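The pumped strings in the example can be reproduced mechanically. A small Python sketch, assuming the split u = ab, v = cd, w = e, y = f, z = g; the function name pump is hypothetical.

```python
def pump(u: str, v: str, w: str, y: str, z: str, i: int) -> str:
    """Return u v^i w y^i z."""
    return u + v * i + w + y * i + z


parts = ("ab", "cd", "e", "f", "g")  # x = abcdefg split as u, v, w, y, z
for i in range(4):
    print(i, pump(*parts, i))
```

This prints abeg, abcdefg, abcdcdeffg and abcdcdcdefffg for i = 0, 1, 2, 3.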
What the other parts mean
• A language L satisfies the CFL pumping condition if:
– there exists an integer m > 0 such that
– for all strings x in L of length at least m
• x must be in L and have sufficient length
– there exist strings u, v, w, y, z such that
• x = uvwyz and
• |vwy| ≤ m and
– v and y are contained within m characters of x
– Note: these are NOT necessarily the first m characters of x
• |vy| ≥ 1 and
– v and y cannot both be λ (the empty string)
– One of them might be λ, but not both
• For all i ≥ 0, u v^i w y^i z is in L
12
How we use the Pumping Lemma
• We choose a specific language L
– For example, {a^n b^n c^n | n > 0}
• We show that L does not satisfy the pumping condition
• We conclude that L is not context-free
13
Showing L “does not pump”
• A language L satisfies the CFL pumping condition if:
– there exists an integer m > 0 such that
– for all strings x in L of length at least m
– there exist strings u, v, w, y, z such that
• x = uvwyz and
• |vwy| ≤ m and
• |vy| ≥ 1 and
• For all i ≥ 0, u v^i w y^i z is in L
• A language L does not satisfy the CFL pumping condition if:
– for all integers m of sufficient size
– there exists a string x in L of length at least m such that
– for all strings u, v, w, y, z such that
• x = uvwyz and
• |vwy| ≤ m and
• |vy| ≥ 1
– there exists an i ≥ 0 such that u v^i w y^i z is not in L
14
Two Rules of Thumb
• Try to use blocks of at least m characters in x
– For TWOCOPIES, choose x = a^m b^m a^m b^m rather than a^m b a^m b
• Guarantees v and y cannot be in more than 2 blocks of x
• Try i=0 or i=2
– i=0
• This reduces number of occurrences of v and y
– i=2
• This increases number of occurrences of v and y
15
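For small m, the "does not pump" condition and the i = 0 / i = 2 rule of thumb can be tried by brute force. The sketch below uses {a^n b^n c^n | n > 0} and the string x = a^m b^m c^m. All helper names (decompositions, refutes, in_anbncn) are hypothetical. Checking a few values of m only illustrates the argument and is not a proof.

```python
from itertools import combinations_with_replacement


def decompositions(x: str, m: int):
    """Yield every (u, v, w, y, z) with x = uvwyz, |vwy| <= m and |vy| >= 1."""
    n = len(x)
    # a <= b <= c <= d are the cut points between u|v|w|y|z.
    for a, b, c, d in combinations_with_replacement(range(n + 1), 4):
        if d - a <= m and (b - a) + (d - c) >= 1:
            yield x[:a], x[a:b], x[b:c], x[c:d], x[d:]


def in_anbncn(s: str) -> bool:
    """Membership test for {a^n b^n c^n | n > 0}."""
    n = len(s) // 3
    return n > 0 and s == "a" * n + "b" * n + "c" * n


def refutes(x: str, m: int, member, tries=(0, 2)) -> bool:
    """True if every legal decomposition of x leaves the language for some i in tries."""
    return all(
        any(not member(u + v * i + w + y * i + z) for i in tries)
        for u, v, w, y, z in decompositions(x, m)
    )


for m in range(1, 5):
    x = "a" * m + "b" * m + "c" * m
    print(m, refutes(x, m, in_anbncn))
```

Every line prints True: for these m, no decomposition of a^m b^m c^m survives both i = 0 and i = 2.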
Summary
• We use the Pumping Lemma to prove a language is
not a CFL
– Note: this does not work for all non-context-free languages
– Can be strengthened to Ogden’s Lemma
• Choosing a good string x is the first key step
• Choosing a good i is the second key step
• Typically there are several cases for v, w, y
16
More Applications
of
The Pumping Lemma
17
The Pumping Lemma for CFL
For any infinite context-free language L
there exists an integer m such that
for any string w ∈ L with |w| ≥ m
we can write w = uvxyz
with lengths |vxy| ≤ m and |vy| ≥ 1
and u v^i x y^i z ∈ L for all i ≥ 0
18
The Pumping Lemma for CFL
Let G be a context-free grammar.
There exists an integer m such that
any string w ∈ L(G) with |w| ≥ m
can be written w = uvxyz
with lengths |vxy| ≤ m and |vy| ≥ 1
and u v^i x y^i z ∈ L(G) for all i ≥ 0
19
[Figure: hierarchy of language classes]
Unrestricted grammar languages: {a^n b^n c^n : n ≥ 0}
Context-Free Languages (non-regular examples): {a^n b^n}, {w w^R}
Regular Languages: a*b*
20
Theorem
The language
L = {ww : w ∈ {a, b}*}
is not context-free
Proof
Use the Pumping Lemma
for context-free languages
21
L = {ww : w ∈ {a, b}*}
Assume the contrary: that L is context-free
Since L is context-free and infinite we can apply the pumping lemma
Pumping Lemma gives a number m such that:
Pick any string of L with length at least m
we pick: a^m b^m a^m b^m ∈ L
We can write: a^m b^m a^m b^m = uvxyz
with lengths |vxy| ≤ m and |vy| ≥ 1
Pumping Lemma says: u v^i x y^i z ∈ L for all i ≥ 0
We examine all the possible locations of string vxy in a^m b^m a^m b^m
Case 1: vxy is within the first a^m
v = a^k1, y = a^k2, k1 + k2 ≥ 1
a^(m + k1 + k2) b^m a^m b^m = u v^2 x y^2 z ∉ L
However, from Pumping Lemma: u v^2 x y^2 z ∈ L
Contradiction!
Case 2: v is in the first a^m, y is in the first b^m
v = a^k1, y = b^k2, k1 + k2 ≥ 1
a^(m + k1) b^(m + k2) a^m b^m = u v^2 x y^2 z ∉ L
However, from Pumping Lemma: u v^2 x y^2 z ∈ L
Contradiction!
Case 3: v overlaps the first a^m b^m, y is in the first b^m
v = a^k1 b^k2, y = b^k3, k1, k2 ≥ 1
a^m b^k2 a^k1 b^(m + k3) a^m b^m = u v^2 x y^2 z ∉ L
However, from Pumping Lemma: u v^2 x y^2 z ∈ L
Contradiction!
Case 4: v is in the first a^m, y overlaps the first a^m b^m
Analysis is similar to case 3
Other cases: vxy is within the first b^m, or the second a^m, or the second b^m
Analysis is similar to case 1
More cases: vxy overlaps the first b^m and the second a^m, or the second a^m and the second b^m
Analysis is similar to cases 2, 3, 4
There are no other cases to consider
Since |vxy| ≤ m, it is impossible for vxy to overlap three or more of the blocks of a^m b^m a^m b^m
In all cases we obtained a contradiction
Therefore:
The original assumption that
L = {ww : w ∈ {a, b}*}
is context-free must be wrong
Conclusion:
L is not context-free
END OF PROOF
42
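The case analysis for {ww} can be sanity-checked by brute force for small m. A Python sketch, assuming the chosen string a^m b^m a^m b^m and trying only i = 0 and i = 2. The helper names (in_ww, decompositions, every_split_fails) are hypothetical, and a finite check like this illustrates the proof without replacing it.

```python
from itertools import combinations_with_replacement


def in_ww(s: str) -> bool:
    """Membership test for {ww : w in {a, b}*}."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and s[:half] == s[half:]


def decompositions(x: str, m: int):
    """Yield every (u, v, x', y, z) with x = u v x' y z, |v x' y| <= m, |vy| >= 1."""
    n = len(x)
    for a, b, c, d in combinations_with_replacement(range(n + 1), 4):
        if d - a <= m and (b - a) + (d - c) >= 1:
            yield x[:a], x[a:b], x[b:c], x[c:d], x[d:]


def every_split_fails(m: int) -> bool:
    """True if each decomposition of a^m b^m a^m b^m leaves the language at i = 0 or i = 2."""
    s = ("a" * m + "b" * m) * 2
    return all(
        any(not in_ww(u + v * i + mid + y * i + z) for i in (0, 2))
        for u, v, mid, y, z in decompositions(s, m)
    )


for m in range(1, 5):
    print(m, every_split_fails(m))
```

Each line prints True, matching the contradiction reached in every case of the proof.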
[Figure: hierarchy of language classes]
Unrestricted grammar languages: {a^n b^n c^n : n ≥ 0}, {ww}
Context-Free Languages (non-regular examples): {a^n b^n}, {w w^R}
Regular Languages: a*b*
43
Theorem
The language
L = {a^(n!) : n ≥ 0}
is not context-free
Proof
Use the Pumping Lemma
for context-free languages
44
L = {a^(n!) : n ≥ 0}
Assume to the contrary that L is context-free
Since L is context-free and infinite we can apply the pumping lemma
Pumping Lemma gives a magic number m such that:
Pick any string of L with length at least m
we pick: a^(m!) ∈ L
We can write: a^(m!) = uvxyz
with lengths |vxy| ≤ m and |vy| ≥ 1
Pumping Lemma says: u v^i x y^i z ∈ L for all i ≥ 0
We examine all the possible locations of string vxy in a^(m!)
There is only one case to consider
v = a^k1, y = a^k2, 1 ≤ k1 + k2 ≤ m
Let k = k1 + k2, so 1 ≤ k ≤ m
a^(m! + k) = u v^2 x y^2 z
Since 1 ≤ k ≤ m, for m ≥ 2 we have:
m! + k ≤ m! + m
 < m! + m!·m
 = m!(1 + m)
 = (m + 1)!
so m! < m! + k < (m + 1)!
a^(m! + k) = u v^2 x y^2 z ∉ L
However, from Pumping Lemma: u v^2 x y^2 z ∈ L
Contradiction!
We obtained a contradiction
Therefore:
The original assumption that
L = {a^(n!) : n ≥ 0}
is context-free must be wrong
Conclusion:
L is not context-free
END OF PROOF
56
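The key inequality m! < m! + k < (m + 1)! for 1 ≤ k ≤ m can be checked numerically. A short Python sketch; the function name is hypothetical and the tested range is an arbitrary choice. It also shows why the slides require m ≥ 2.

```python
from math import factorial


def strictly_between_factorials(m: int) -> bool:
    """True if m! < m! + k < (m + 1)! for every k with 1 <= k <= m."""
    lo, hi = factorial(m), factorial(m + 1)
    return all(lo < lo + k < hi for k in range(1, m + 1))


# Holds for every m >= 2 in the tested range.
print(all(strictly_between_factorials(m) for m in range(2, 50)))

# Fails for m = 1, because 1! + 1 = 2 = 2!.
print(strictly_between_factorials(1))
```

The first line printed is True and the second is False.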
[Figure: hierarchy of language classes]
Unrestricted grammar languages: {a^(n!) : n ≥ 0}, {a^(n^2) b^n : n ≥ 0}, {a^n b^n c^n : n ≥ 0}, {ww : w ∈ {a, b}*}
Context-free languages: {a^n b^n : n ≥ 0}, {w w^R : w ∈ {a, b}*}
Regular Languages: a*b*
57
Theorem
The language
L = {a^(n^2) b^n : n ≥ 0}
is not context-free
Proof
Use the Pumping Lemma
for context-free languages
58
L = {a^(n^2) b^n : n ≥ 0}
Assume to the contrary that L is context-free
Since L is context-free and infinite we can apply the pumping lemma
Pumping Lemma gives a number m such that:
Pick any string of L with length at least m
we pick: a^(m^2) b^m ∈ L
We can write: a^(m^2) b^m = uvxyz
with lengths |vxy| ≤ m and |vy| ≥ 1
Pumping Lemma says: u v^i x y^i z ∈ L for all i ≥ 0
We examine all the possible locations of string vxy in a^(m^2) b^m
Most complicated case: v is in a^(m^2), y is in b^m
v = a^k1, y = b^k2, 1 ≤ k1 + k2 ≤ m
Most complicated sub-case: k1 ≠ 0 and k2 ≠ 0
a^(m^2 − k1) b^(m − k2) = u v^0 x y^0 z
Since k1 ≠ 0, k2 ≠ 0 and 1 ≤ k1 + k2 ≤ m:
(m − k2)^2 ≤ (m − 1)^2
 = m^2 − 2m + 1
 < m^2 − k1
(the last step holds because k1 ≤ m − k2 ≤ m − 1 < 2m − 1)
so m^2 − k1 ≠ (m − k2)^2
a^(m^2 − k1) b^(m − k2) = u v^0 x y^0 z ∉ L
However, from Pumping Lemma: u v^0 x y^0 z ∈ L
Contradiction!
When we examine the rest of the cases we also obtain a contradiction
In all cases we obtained a contradiction
Therefore:
The original assumption that
L = {a^(n^2) b^n : n ≥ 0}
is context-free must be wrong
Conclusion:
L is not context-free
END OF PROOF
72
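The inequality chain in the most complicated sub-case can be checked numerically for small m. A Python sketch, assuming k1 ≥ 1, k2 ≥ 1 and k1 + k2 ≤ m as in the proof; the function name and the tested range are hypothetical choices.

```python
def subcase_holds(m: int) -> bool:
    """Check (m - k2)^2 <= (m - 1)^2 < m^2 - k1 for all k1, k2 >= 1 with k1 + k2 <= m."""
    return all(
        (m - k2) ** 2 <= (m - 1) ** 2 < m * m - k1
        for k1 in range(1, m)
        for k2 in range(1, m - k1 + 1)
    )


print(all(subcase_holds(m) for m in range(2, 60)))
```

This prints True, so within the tested range a^(m^2 − k1) b^(m − k2) never has the form a^(n^2) b^n.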
Midterm Exam 2
Context-Free Languages
Place: Lambda
Time: Monday 2005-05-16, 13:15-15:00
It is OPEN BOOK.
(This means you are allowed to bring in
one book of your choice.)
It will cover Context-free Languages.
You will have the complete 2 hours to do the test.
73
Check your knowledge before midterm exam!
Selected Examples
of
CF Language Problems
74
Example
Find a CFG for the following language
L = {a^n b^m c^k : k = n + m}
Let G be the grammar with productions:
S → aSc | B
B → bBc | λ
Claim: L(G) = L
75
Proof:
Consider the following derivation:
S ⇒* a^n S c^n ⇒ a^n B c^n ⇒* a^n b^m B c^m c^n ⇒ a^n b^m c^(n + m)
(where the first ⇒* applies S → aSc n times, the second B → bBc m times)
Since all words in L(G) must follow this pattern in their derivations, it is clear that L(G) ⊆ L
Conversely, consider w ∈ L, w = a^n b^m c^(n + m) for some n, m ≥ 0
The derivation
S ⇒* a^n S c^n ⇒ a^n B c^n ⇒* a^n b^m B c^m c^n ⇒ a^n b^m c^(n + m)
clearly produces w for any n, m.
⇒ L ⊆ L(G)
⇒ L = L(G)
G is a CFG for L
END OF PROOF
77
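The derivation pattern in the proof can be replayed step by step. A Python sketch that applies S → aSc n times, then S → B, then B → bBc m times, then B → λ. The function name derive is hypothetical, and string replacement stands in for applying a production to the single nonterminal present.

```python
def derive(n: int, m: int) -> list[str]:
    """Return the sentential forms of the derivation of a^n b^m c^(n+m)."""
    form = "S"
    steps = [form]
    for _ in range(n):              # S -> aSc, n times
        form = form.replace("S", "aSc")
        steps.append(form)
    form = form.replace("S", "B")   # S -> B
    steps.append(form)
    for _ in range(m):              # B -> bBc, m times
        form = form.replace("B", "bBc")
        steps.append(form)
    form = form.replace("B", "")    # B -> lambda
    steps.append(form)
    return steps


print(" => ".join(derive(2, 1)))
```

For n = 2, m = 1 this prints S => aSc => aaScc => aaBcc => aabBccc => aabccc.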
Example
Find a PDA and CFG for the following language
L = {a^(2n) b^(3n) : n ∈ N}
PDA
[Figure: PDA with states qi and qf; transition labels: a^2, λ / a^2 · b^3, a^2 / λ · b^3, a^2 / λ]
Is the automaton deterministic? Yes. It acts in a unique way in each state.
CFG:
S → λ | a^2 S b^3
79
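A rough Python sketch of how a stack-based recognizer for {a^(2n) b^(3n)} might work. It assumes a two-phase reading of the transition labels (push one marker per aa, then pop one marker per bbb), since the arrows of the original figure are not recoverable from the text. It also assumes the empty string is accepted, following the production S → λ. The function name accepts is hypothetical.

```python
def accepts(s: str) -> bool:
    """Recognize a^(2n) b^(3n), n >= 0, with an explicit stack."""
    stack = []
    i = 0
    # First phase (state qi in the sketch): read "aa", push one marker.
    while s.startswith("aa", i):
        stack.append("aa")
        i += 2
    # Second phase (state qf in the sketch): read "bbb", pop one marker.
    while stack and s.startswith("bbb", i):
        stack.pop()
        i += 3
    # Accept only if all input is consumed and the stack is empty.
    return i == len(s) and not stack


for word in ["", "aabbb", "aaaabbbbbb", "aabb", "aaaabbb", "aabbbbbb"]:
    print(repr(word), accepts(word))
```

The first three words are accepted and the last three are rejected.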
Example
Find a PDA and CFG for the following language
L = {x ∈ {a, b}* : n_a = 2 n_b}
PDA
[Figure: PDA; transition labels: a, λ / a · b, λ / b · aa, b / λ · ab, a / λ · ba, a / λ · b, aa / λ]
CFG:
S → λ | aSbSa | aSaSb | bSaSa | SS
81
Example
Prove that the language L is context-free
L = {a^n b^n : n is not a multiple of 5}
Consider the following two languages:
L1 = {w : w is made from a’s and b’s and the length of w is a multiple of 10}
L2 = {a^n b^n : n ≥ 0}
Let L1' denote the complement of L1. We have that L = L1' ∩ L2.
L1 is a regular language, since we can easily build a finite automaton with 10 states that accepts any string in this language.
L1' is regular too, since regular languages are closed under complement.
The language L2 is context-free.
The grammar is: S → aSb | λ
Therefore, the language L = L1' ∩ L2 is also context-free, since context-free languages are closed under regular intersection (Regular Closure).
END OF PROOF
84
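The identity "L equals the complement of L1 intersected with L2" can be checked exhaustively on short strings. A Python sketch; the predicate names are hypothetical and the length bound 12 is an arbitrary choice that includes the case n = 5.

```python
from itertools import product


def in_L2(s: str) -> bool:
    """Membership in {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def in_L1(s: str) -> bool:
    """Membership in L1: length is a multiple of 10."""
    return len(s) % 10 == 0


def in_L(s: str) -> bool:
    """Membership in {a^n b^n : n is not a multiple of 5}."""
    return in_L2(s) and (len(s) // 2) % 5 != 0


identity_holds = all(
    in_L(s) == ((not in_L1(s)) and in_L2(s))
    for length in range(13)
    for s in map("".join, product("ab", repeat=length))
)
print(identity_holds)
```

This prints True. The check works because a^n b^n has length 2n, which is a multiple of 10 exactly when n is a multiple of 5.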
Example
Find a PDA and CFG for the following language
L = {x a^n : n ∈ N, x ∈ {a, b}*, |x| ≥ n}
CFG
S → ASa | A | λ
A → aA | bA | a | b
Production ex.
S ⇒ ASa ⇒ AASaa ⇒ AAaa ⇒* abAaa ⇒ abaAaa ⇒ ababaa
PDA
[Figure: PDA with states qi and qf; transition labels: λ, λ / S · λ, S / ASa | A | λ · λ, A / aA | bA | a | b · a, a / λ · b, b / λ]
86
Example
Find a PDA and CFG for the following language
L = {x ∈ {a, b}* : n_a > n_b, the starting and the finishing symbols are different}
PDA
[Figure: PDA; transition labels: b, λ / b · a, λ / λ · a, λ / a · b, a / λ · a, λ / a · b, a / λ · b, λ / b · a, b / λ · b, λ / b · a, b / λ · a, λ / λ · b, a / λ · λ, a / λ]
CFG, direct construction
• Strings start and finish with different symbols
S → aAb | bAa
• Strings contain at least one more a than b
A → a | aA | AAb | AbA | bAA
(we must have AA here as only one A just balances b)
88
Additional Exercise
89