Context-Free Languages

advertisement
Big picture
●
All languages
●
Decidable
Turing machines
●
NP
●
P
●
Context-free
Context-free grammars, push-down automata
●
Regular
Automata, non-deterministic automata,
regular expressions
Recall:
Theorem: L := {0n 1n : n  0} is not regular
But it is often needed to recognize this language
Example: Programming language syntax have
matching brackets, not regular.
Next: Introduce context-free languages
Why study context-free languages
●
●
Practice with more powerful model
Programming languages: Syntax of C++, Java,
etc. is context-free language (specified by
context-free grammar)
●
Other reasons: human language has structures
that can be modeled as context-free language
Example: Context-free grammar G, S = {0,1}
S→0S1
S→e
Two substitution rules (a.k.a. productions) →
Variables = {S}, Terminals = {0,1}
Derivation of 0011 in grammar:
S  0S1  00S11  0011
L(G) = {0n 1n : n  0}
Example: Context-free grammar G, S = {0,1}
S→A
S→B
A→0A1
A→e
B→1B0
B→e
L(G) = L(A) U L(B)
= {0n 1n : n  0} U {1n 0n : n  0}
Example: Context-free grammar G, S = {0,1}
S→A|B
A→0A1|e
B→1B0|e
Convention: Write A → w|w' for
A → w and A → w'
Definition: A context-free grammar (CFG) G is
a 4 tuple (V, S, R, S) where
●
V is a finite set of variables
●
S is a finite set of terminals (V  S = )
●
R is a finite set of rules, where each rule is
A→w
●
A  V, w  (V U S)*
S  V is the start variable
Example
The language L = {ambn : m > n}
is described by the CFG G = (V, S, R, S)
where:
V = {S, T}
S = {a, b}
R = { S → aS | aT
T → aTb | e }
Derive aaab:
S→?
Example
The language L = {ambn : m > n}
is described by the CFG G = (V, S, R, S)
where:
V = {S, T}
S = {a, b}
R = { S → aS | aT
T → aTb | e }
Derive aaab:
S → aS
→?
Example
The language L = {ambn : m > n}
is described by the CFG G = (V, S, R, S)
where:
V = {S, T}
S = {a, b}
R = { S → aS | aT
T → aTb | e }
Derive aaab:
S → aS
→ aaT
→?
Example
The language L = {ambn : m > n}
is described by the CFG G = (V, S, R, S)
where:
V = {S, T}
S = {a, b}
R = { S → aS | aT
T → aTb | e }
Derive aaab:
S → aS
→ aaT
→ aaaTb
→?
Example
The language L = {ambn : m > n}
is described by the CFG G = (V, S, R, S)
where:
V = {S, T}
S = {a, b}
R = { S → aS | aT
T → aTb | e }
Derive aaab:
S → aS
→ aaT
→ aaaTb
→ aaab
Definition: Let G = (V, S, R, S) be a CFG
we write uAv  uwv and say uAv yields uwv
if A → w is a rule
We say u derives v, written u * v, if
●
●
u = v, or
 u1, u2, … , uk k  1 :
u  u1  u2  …  uk = v
The language of the grammar is L(G) = {w : S * w}
Definition: A language is context-free if
 CFG G : L(G) = L
Example:
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}* |x|  |y| }
G = S → BL
S → RB
L → BL | A
R → RB | A
A → BAB | #
B → 0 | 1
Remark: B * ?
To understand, explain what each piece does!
Example:
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}* |x|  |y| }
G = S → BL
S → RB
L → BL | A
R → RB | A
A → BAB | #
Remark: A * ?
B → 0 | 1
Remark: B * 0, B * 1
Example:
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}* |x|  |y| }
G = S → BL
S → RB
L → BL | A
R → RB | A
Remark: R * ?
A → BAB | #
Remark: A * x#y : |x|=|y|
B → 0 | 1
Remark: B * 0, B * 1
Example:
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}* |x|  |y| }
G = S → BL
S → RB
L → BL | A
Remark: L * ?
R → RB | A
Remark: R * x#y : |x| ≤ |y|
A → BAB | #
Remark: A * x#y : |x|=|y|
B → 0 | 1
Remark: B * 0, B * 1
Example:
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}* |x|  |y| }
G = S → BL
S → RB
Remark: RB * ?
L → BL | A
Remark: L * x#y : |x| ≥ |y|
R → RB | A
Remark: R * x#y : |x| ≤ |y|
A → BAB | #
Remark: A * x#y : |x|=|y|
B → 0 | 1
Remark: B * 0, B * 1
Example:
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}* |x|  |y| }
G = S → BL
Remark: BL * ?
S → RB
Remark: RB * x#y : |x| < |y|
L → BL | A
Remark: L * x#y : |x| ≥ |y|
R → RB | A
Remark: R * x#y : |x| ≤ |y|
A → BAB | #
Remark: A * x#y : |x|=|y|
B → 0 | 1
Remark: B * 0, B * 1
Example:
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}* |x|  |y| }
G = S → BL
Remark: BL * x#y : |x| > |y|
S → RB
Remark: RB * x#y : |x| < |y|
L → BL | A
Remark: L * x#y : |x| ≥ |y|
R → RB | A
Remark: R * x#y : |x| ≤ |y|
A → BAB | #
Remark: A * x#y : |x|=|y|
B → 0 | 1
Remark: B * 0, B * 1
L(G) = L
Example: CFG for expressions in programming
languages
Task: recognize strings like 0 + 0 + 1 x (1 + 0)
S → S+S | S x S | ( S ) | 0 | 1
S→ S+S → 0+S → 0+S+S → 0+0+S
→0+0+SxS → 0+0+1xS
→ 0 + 0 + 1 x (S) → 0 + 0 + 1 x (S + S)
→ 0 + 0 + 1 x (1 + S) → 0 + 0 + 1 x (1 + 0)
We have seen: CFG, definition, and examples
Next: Ambiguity
●
Ambiguity: Some string may have multiple
derivations in a CFG
●
Ambiguity is a problem for compilers:
Compilers use derivation to give meaning to strings.
Example: meaning of 1+0x0 ∈ ∑* is its value, 1 ∈ℕ
If there are two different derivations,
the value may not be well defined.
Example: The string 1+0x0 has two derivations in
S → S+S | S x S | ( S ) | 0 | 1
One derivation:
S → S+S → 1+S → 1+SxS → 1+0xS → 1+0x0
Another derivation:
S→?
Example: The string 1+0x0 has two derivations in
S → S+S | S x S | ( S ) | 0 | 1
One derivation:
S → S+S → 1+S → 1+SxS → 1+0xS → 1+0x0
Another derivation:
S → SxS → ?
Example: The string 1+0x0 has two derivations in
S → S+S | S x S | ( S ) | 0 | 1
One derivation:
S → S+S → 1+S → 1+SxS → 1+0xS → 1+0x0
Another derivation:
S → SxS → Sx0 → ?
Example: The string 1+0x0 has two derivations in
S → S+S | S x S | ( S ) | 0 | 1
One derivation:
S → S+S → 1+S → 1+SxS → 1+0xS → 1+0x0
Another derivation:
S → SxS → Sx0 → S+Sx0 → ?
Example: The string 1+0x0 has two derivations in
S → S+S | S x S | ( S ) | 0 | 1
One derivation:
S → S+S → 1+S → 1+SxS → 1+0xS → 1+0x0
Another derivation:
S → SxS → Sx0 → S+Sx0 → S+0x0 → ?
Example: The string 1+0x0 has two derivations in
S → S+S | S x S | ( S ) | 0 | 1
One derivation:
S → S+S → 1+S → 1+SxS → 1+0xS → 1+0x0
Another derivation:
S → SxS → Sx0 → S+Sx0 → S+0x0 → 1+0x0
We now want to define CFG with no ambiguity
Definition: A derivation is leftmost if at every step
the leftmost variable is expanded
st
Example: the 1 previous derivation was leftmost
S → S+S → 1+S → 1+SxS → 1+0xS → 1+0x0
Definition: A CFG G is un-ambiguous if no string
has two different leftmost derivations.
Example
The CFG
S → S+S | S x S | ( S ) | 0 | 1
is ambiguous because 1+0x0 has two distinct
leftmost derivations
One leftmost derivation:
S → S+S → 1+S → 1+SxS → 1+0xS → 1+0x0
Another leftmost derivation:
S → SxS → S+SxS → 1+SxS → 1+0xS → 1+0x0
Example Instead of using CFG
S → S+S | S x S | ( S ) | 0 | 1
we may use un-ambiguous grammar
S→S+T|T
T→TxF|F
F → 0 | 1 | (S)
Unique leftmost derivation of 1+0x0:
S→?
Example Instead of using CFG
S → S+S | S x S | ( S ) | 0 | 1
we may use un-ambiguous grammar
S→S+T|T
T→TxF|F
F → 0 | 1 | (S)
Unique leftmost derivation of 1+0x0:
S→S+T→?
Example Instead of using CFG
S → S+S | S x S | ( S ) | 0 | 1
we may use un-ambiguous grammar
S→S+T|T
T→TxF|F
F → 0 | 1 | (S)
Unique leftmost derivation of 1+0x0:
S→S+T→T+T→?
Example Instead of using CFG
S → S+S | S x S | ( S ) | 0 | 1
we may use un-ambiguous grammar
S→S+T|T
T→TxF|F
F → 0 | 1 | (S)
Unique leftmost derivation of 1+0x0:
S→S+T→T+T→F+T→?
Example Instead of using CFG
S → S+S | S x S | ( S ) | 0 | 1
we may use un-ambiguous grammar
S→S+T|T
T→TxF|F
F → 0 | 1 | (S)
Unique leftmost derivation of 1+0x0:
S→S+T→T+T→F+T→1+T
→?
Example Instead of using CFG
S → S+S | S x S | ( S ) | 0 | 1
we may use un-ambiguous grammar
S→S+T|T
T→TxF|F
F → 0 | 1 | (S)
Unique leftmost derivation of 1+0x0:
S→S+T→T+T→F+T→1+T
→1+TxF→?
Example Instead of using CFG
S → S+S | S x S | ( S ) | 0 | 1
we may use un-ambiguous grammar
S→S+T|T
T→TxF|F
F → 0 | 1 | (S)
Unique leftmost derivation of 1+0x0:
S→S+T→T+T→F+T→1+T
→1+TxF→1+0xF→1+0x0
Actual Java specification grammar snippet
Cumbersome but un-ambiguous
MultiplicativeExpression:
UnaryExpression
MultiplicativeExpression * UnaryExpression
MultiplicativeExpression / UnaryExpression
MultiplicativeExpression % UnaryExpression
AdditiveExpression:
MultiplicativeExpression
AdditiveExpression + MultiplicativeExpression
AdditiveExpression - MultiplicativeExpression
Next: understand power of context-free languages
Study closure under not, U, o, *
Recall from regular langues: If A, B are regular then
not A
is regular ?
A U B is regular ?
A o B is regular ?
A*
is regular ?
Next: understand power of context-free languages
Study closure under not, U, o, *
Recall from regular langues: If A, B are regular then
not A
regular
A U B regular
A o B regular
A*
regular
Suppose A, B are context-free:
A = L(GA) for CFG GA=(VA, S, RA, SA)
B = L(GB) for CFG GB=(VB, S, RB, SB)
What about
AUB
AoB
A*
S→?
Suppose A, B are context-free:
A = L(GA) for CFG GA=(VA, S, RA, SA)
B = L(GB) for CFG GB=(VB, S, RB, SB)
What about
AUB
S → SA|SB
AoB
A*
S→?
Context-free
Suppose A, B are context-free:
A = L(GA) for CFG GA=(VA, S, RA, SA)
B = L(GB) for CFG GB=(VB, S, RB, SB)
What about
AUB
S → SA|SB
Context-free
AoB
S → S A SB
Context-free
A*
S→?
Suppose A, B are context-free:
A = L(GA) for CFG GA=(VA, S, RA, SA)
B = L(GB) for CFG GB=(VB, S, RB, SB)
What about
AUB
S → SA|SB
Context-free
AoB
S → S A SB
Context-free
A*
S → SSA | e
Context-free
Above all context-free!
In general, (not A) is NOT context-free
Suppose A, B are context-free:
A = L(GA) for CFG GA=(VA, S, RA, SA)
B = L(GB) for CFG GB=(VB, S, RB, SB)
What about
AUB
S → SA|SB
Context-free
AoB
S → S A SB
Context-free
A*
S → SSA | e
Context-free
Above also shows regular ⇨context-free
Context-free languages contain regular languages
Example: Context Free UNION
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}*
|x|  |y| OR x = yR }
yR is the reverse of y:
001R
= 100
11010R = 01011
1R
=1
Example: Context Free UNION
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}*
|x|  |y| OR x = yR }
Write L = L1 U L2, where
L1 = { x#y : |x|  |y| }
L2 = { x#y : x = yR }
Example: Context Free UNION
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}*
|x|  |y| OR x = yR }
Write L = L1 U L2, where
L1 = { x#y : |x|  |y| }
L2 = { x#y : x = yR }
G1= S1 → BL | RB
L → BL | A
Remark: L * x#y : |x| ≥ |y|
R → RB | A
Remark: R * x#y : |x| ≤ |y|
A → BAB | #
Remark: A * x#y : |x|=|y|
B → 0 | 1
Example: Context Free UNION
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}*
|x|  |y| OR x = yR }
Write L = L1 U L2, where
L1 = { x#y : |x|  |y| }
G1= S1 → BL | RB
L → BL | A
R → RB | A
A → BAB | #
B → 0 | 1
L2 = { x#y : x = yR }
G2= S2 → 0S20 | 1S21 | #
Example: Context Free UNION
∑ = {0,1,#}
Give a CFG for L = { x#y : x,y in {0,1}*
|x|  |y| OR x = yR }
Write L = L1 U L2, where
L1 = { x#y : |x|  |y| }
G1= S1 → BL | RB
L → BL | A
R → RB | A
A → BAB | #
B → 0 | 1
L2 = { x#y : x = yR }
G2= S2 → 0S20 | 1S21 | #
Let G = S → S1 | S2
Then, L(G1) = L1 & L(G2) = L2
 L(G) = L1 U L2 = L
Example: Context Free CONCATENATION
m m n n
Give a CFG for L = { 0 1 0 1 : m even and n odd}
Example: Context Free CONCATENATION
m m n n
Give a CFG for L = { 0 1 0 1 : m even and n odd}
Write L = L1 o L2, where
L1 = { 0m1m : m even}
L2 = { 0n1n : n odd}
Example: Context Free CONCATENATION
m m n n
Give a CFG for L = { 0 1 0 1 : m even and n odd}
Write L = L1 o L2, where
L1 = { 0m1m : m even}
G1= S1 → 00S111 | e
L2 = { 0n1n : n odd}
Example: Context Free CONCATENATION
m m n n
Give a CFG for L = { 0 1 0 1 : m even and n odd}
Write L = L1 o L2, where
L1 = { 0m1m : m even}
G1= S1 → 00S111 | e
L2 = { 0n1n : n odd}
G2= S2 → 00S211 | 01
Example: Context Free CONCATENATION
m m n n
Give a CFG for L = { 0 1 0 1 : m even and n odd}
Write L = L1 o L2, where
L1 = { 0m1m : m even}
G1= S1 → 00S111 | e
L2 = { 0n1n : n odd}
G2= S2 → 00S211 | 01
Let G = S → S1S2
Then, L(G1) = L1 & L(G2) = L2
 L(G) = L1 o L2 = L
Example: Context Free STAR
Give a CFG for
L = { w in {0,1}* : w = w1 w2 ▪▪▪ wk , k  0
where each wi is a palindrome }
●
A string w is a palindrome if w = wR
That is, w reads the same forwards and backwards
●
Example: 00100, 1001, and 0 are palindromes;
0011, 01 are not
Example: Context Free STAR
Give a CFG for
L = { w in {0,1}* : w = w1 w2 ▪▪▪ wk , k  0
where each wi is a palindrome }
Write L = L1*, where L1 = {w : w is a palindrome}
Note: In fact, L = {0,1}*, but we will not use that.
Example: Context Free STAR
Give a CFG for
L = { w in {0,1}* : w = w1 w2 ▪▪▪ wk , k  0
where each wi is a palindrome }
Write L = L1*, where L1 = {w : w is a palindrome}
G1= S1 → 0S10 | 1S11 | 0 | 1 | e
Example: Context Free STAR
Give a CFG for
L = { w in {0,1}* : w = w1 w2 ▪▪▪ wk , k  0
where each wi is a palindrome }
Write L = L1*, where L1 = {w : w is a palindrome}
G1= S1 → 0S10 | 1S11 | 0 | 1 | e
Let G = S → SS1 | e.
Then, L(G1) = L1
 L(G) = L1* = L.
Beyond regular
A string and its reversal with C in middle:
S → 0S0 | 1S1 | C
Example: S ⇨* 0001C1000
More generally, to get strings of the form Ak C Bk
use rules: S → A S B | C
Example: S = {0,1,#}, wR is reverse of w
L = {w # x : wR is a substring of x}
Useful to rewrite L as:
Example: S = {0,1,#}, wR is reverse of w
L = {w # x : wR is a substring of x}
= { w#x wR y : w,x,y  {0,1}* }
G :=
S → CB
C → 0C0 | 1C1 | #B
B→0B|1B|e
Remark: B * ?
Example: S = {0,1,#}, wR is reverse of w
L = {w # x : wR is a substring of x}
= { w#x wR y : w,x,y  {0,1}* }
G :=
S → CB
C → 0C0 | 1C1 | #B
Remark: C * ?
B→0B|1B|e
Remark: B * {0,1}*
Example: S = {0,1,#}, wR is reverse of w
L = {w # x : wR is a substring of x}
= { w#x wR y : w,x,y  {0,1}* }
G :=
S → CB
C → 0C0 | 1C1 | #B
Remark: C * w#{0,1}*wR
B→0B|1B|e
Remark: B * {0,1}*
L(G) = L
CFG vs. automata
CFG  non-deterministic pushdown automata (PDA)
A PDA is simply an NFA with a stack.
q1
x, y → z
q2
This means: “read x from the input;
pop y off the stack;
push z onto the stack”
Any of x,y,z may be e.
Example: PDA for L = {0n1n : n  0}
q0
e, e → $
0, e → 0
1, 0 → e
q1
q2
e, e → e
e, $ → e
q3
The $ is a special symbol to recognize end of stack
Idea:
q1 : read and push 0s onto stack until no more
q2 : read 1s and match with 0s popped from stack
Unlike the case for regular automata,
non-deterministic PDA are strictly more powerful
than deterministic PDA.
Compilers must work with deterministic PDA,
an important subclass of context-free languages
Non-context-free languages
Intuition: If L involves regular expressions and/or
nested matchings then probably context-free.
If not, probably not.
{ 0n 1n : n  0 } CF :
000 111
nested
{w w : w  S*} not CF: 1101 1101
not nested
{0n1n2n : n0} not CF: 00 11 22
not nested
Non-context-free languages
There is a pumping lemma for context-free
languages.
Similar to the one for regular, but
simultaneously “pump” string in two parts:
w = u vi x yi z
Context-free pumping lemma:
L is CF language 
 p 0
 w  L, |w|  p
 u,v,x,y,z :
w= uvxyz, |vy|> 0, |vxy| p
 i  0 : uvixyiz  L
Context-free pumping lemma:
L is CF language 
 p 0
 w  L, |w|  p
 u,v,x,y,z :
w= uvxyz, |vy|> 0, |vxy| p
Proof idea:
 i  0 : uvixyiz  L
Let G be CFG : L(G) = L
If w  L is very long, derivation repeats a variable V
(like repeat states in regular P.L.)
vxy = piece of w that V derives: V ⇨* vxy
Because V repeated once, can repeat it again
Context-free pumping lemma:
L is CF language 
 p 0
A
 w  L, |w|  p
 u,v,x,y,z :
w= uvxyz, |vy|> 0, |vxy| p
 i  0 : uvixyiz  L
Useful to prove L NOT context-free.
Use contrapositive:
L context-free language  A
same as
(not A)  L not context-free
Context-free pumping lemma (contrapositive)
 p 0
not A
 w  L, |w|  p
 u,v,x,y,z :
 L not context-free
w = uvxyz, |vy|> 0, |vxy| p
 i  0 : uvixyiz  L
To prove L not context-free it is enough to prove not A
Not A is the stuff in the box.
Context-free pumping lemma (contrapositive)
 p 0
 w  L, |w|  p
 u,v,x,y,z :
 L not context-free
w = uvxyz, |vy|> 0, |vxy| p
 i  0 : uvixyiz  L
Adversary picks p,
You pick w ∈ L of length  p,
Adversary decomposes w = uvxyz, |vy| > 0, |vxy|  p
You pick i  0
Finally, you win if uvixyiz  L
Theorem: L := {an bn cn : n  0} is not context-free.
Proof:
 p 0
Adversary moves p
 w  L, |w|  p
You move w := ap bp cp
 u,v,x,y,z : w = uvxyz,
Adversary moves u,v,x,y,z
|vy|> 0, |vxy| p
You move i := 2
 i  0 : uvixyiz  L
You must show uvvxyyz  L:
vy misses at least one symbol in ∑ = {a,b,c}
since ?
Theorem: L := {an bn cn : n  0} is not context-free.
Proof:
 p 0
Adversary moves p
 w  L, |w|  p
You move w := ap bp cp
 u,v,x,y,z : w = uvxyz,
Adversary moves u,v,x,y,z
|vy|> 0, |vxy| p
You move i := 2
 i  0 : uvixyiz  L
You must show uvvxyyz  L:
vy misses at least one symbol in ∑ = {a,b,c}
since between as and cs there are p bs, and |vy| ≤ p
so uvvxyyz ????
Theorem: L := {an bn cn : n  0} is not context-free.
Proof:
 p 0
Adversary moves p
 w  L, |w|  p
You move w := ap bp cp
 u,v,x,y,z : w = uvxyz,
Adversary moves u,v,x,y,z
|vy|> 0, |vxy| p
You move i := 2
 i  0 : uvixyiz  L
You must show uvvxyyz  L:
vy misses at least one symbol in ∑ = {a,b,c}
since between as and cs there are p bs, and |vy| ≤ p
so uvvxyyz has too few of that symbol, so ∉ L
DONE
Theorem: L := {ai bj ck : 0  i  j  k } is not context-free.
Proof:
Adversary moves p
You move w := ap bp cp
Adversary moves u,v,x,y,z
 p 0
 w  L, |w|  p
 u,v,x,y,z : w = uvxyz,
|vy|> 0, |vxy| p
 i  0 : uvixyiz  L
So far, same as {an bn cn : n  0}.
But now we need a few cases.
Our choice of i depends on u,v,x,y,z
Theorem: L := {ai bj ck : 0  i  j  k } is not context-free.
Proof (cont.):
You have w = apbpcp, with w = uvxyz, |vy|> 0, |vxy| p.
You must pick i  0 and show uvixyiz  L.
If no a's in vy: ?
Theorem: L := {ai bj ck : 0  i  j  k } is not context-free.
Proof (cont.):
You have w = apbpcp, with w = uvxyz, |vy|> 0, |vxy| p.
You must pick i  0 and show uvixyiz  L.
If no a's in vy: uv xy0z has fewer b's or c's than a's.
0
If no c's in vy: ?
Theorem: L := {ai bj ck : 0  i  j  k } is not context-free.
Proof (cont.):
You have w = apbpcp, with w = uvxyz, |vy|> 0, |vxy| p.
You must pick i  0 and show uvixyiz  L.
If no a's in vy: uv xy0z has fewer b's or c's than a's.
0
If no c's in vy: uv2xy2z has more a's or b's than c's.
If no b's in vy:
?
Theorem: L := {ai bj ck : 0  i  j  k } is not context-free.
Proof (cont.):
You have w = apbpcp, with w = uvxyz, |vy|> 0, |vxy| p.
You must pick i  0 and show uvixyiz  L.
If no a's in vy: uv xy0z has fewer b's or c's than a's.
0
If no c's in vy: uv2xy2z has more a's or b's than c's.
If no b's in vy:
You fall in a previous case, since |vxy|  p
DONE
Theorem: L := {s s : s  {0,1}* } is not context-free.
Proof:
 p 0
Adversary moves p
 w  L, |w|  p
You move w := 0p 1p 0p 1p  u,v,x,y,z : w = uvxyz,
|vy|> 0, |vxy| p
 i  0 : uvixyiz  L
Note: To prove L not regular we moved w = 0p 1 0p 1
That move does not work for context-free!
Theorem: L := {s s : s  {0,1}* } is not context-free.
Proof:
 p 0
Adversary moves p
 w  L, |w|  p
You move w := 0p 1p 0p 1p  u,v,x,y,z : w = uvxyz,
Adversary moves u,v,x,y,z
|vy|> 0, |vxy| p
Three cases:
vxy in 1st half of w: ?
 i  0 : uvixyiz  L
Theorem: L := {s s : s  {0,1}* } is not context-free.
Proof:
 p 0
Adversary moves p
 w  L, |w|  p
You move w := 0p 1p 0p 1p  u,v,x,y,z : w = uvxyz,
Adversary moves u,v,x,y,z
|vy|> 0, |vxy| p
Three cases:
 i  0 : uvixyiz  L
vxy in 1st half of w: 2nd half of uv2xy2z starts with 1,
but uv2xy2z still starts with 0.
vxy in 2nd half of w: ?
Theorem: L := {s s : s  {0,1}* } is not context-free.
Proof:
 p 0
Adversary moves p
 w  L, |w|  p
You move w := 0p 1p 0p 1p  u,v,x,y,z : w = uvxyz,
Adversary moves u,v,x,y,z
|vy|> 0, |vxy| p
Three cases:
 i  0 : uvixyiz  L
vxy in 1st half of w: 2nd half of uv2xy2z starts with 1,
but uv2xy2z still starts with 0.
vxy in 2nd half of w: 1st half of uv2xy2z ends with 0,
but uv2xy2z still ends with 1.
vxy touches midpoint: ?
Theorem: L := {s s : s  {0,1}* } is not context-free.
Proof:
 p 0
Adversary moves p
 w  L, |w|  p
You move w := 0p 1p 0p 1p  u,v,x,y,z : w = uvxyz,
Adversary moves u,v,x,y,z
|vy|> 0, |vxy| p
Three cases:
 i  0 : uvixyiz  L
vxy in 1st half of w: 2nd half of uv2xy2z starts with 1,
but uv2xy2z still starts with 0.
vxy in 2nd half of w: 1st half of uv2xy2z ends with 0,
but uv2xy2z still ends with 1.
vxy touches midpoint:
uv0xy0z = 0p 1i 0j 1p with either i < p or j < p.
DONE
L := { w  {a,b}* : w has same number of a and b}
Grammar for L
??
L := { w  {a,b}* : w has same number of a and b}
Grammar for L
S → ε | SS | aSb | bSa
Not clear why this works.
It requires a proof.
Proofs by induction
Let P(n) be any claim
To prove “∀n ≥ 0, P(n) is true” it suffices to prove
Base case: P(0) is true
Induction step: ∀ n : ( (∀ i < n, P(i) ) => P(n) )
Induction hypothesis
You can replace “0” by any fixed value
Example: P(n) = ∑ i=0n i = n(n+1)/2
Claim: ∀ n ≥ 0, P(n)
Proof by induction:
Base case: P(0)
0 = 0(1)/2 = 0 is true
Induction step: ∀ n : ( (∀ i < n, P(i) ) => P(n) )
∑ i=0n i = ??
Example: P(n) = ∑ i=0n i = n(n+1)/2
Claim: ∀ n ≥ 0, P(n)
Proof by induction:
Base case: P(0)
0 = 0(1)/2 = 0 is true
Induction step: ∀ n : ( (∀ i < n, P(i) ) => P(n) )
∑ i=0n i = ∑ i=0n-1 i + n = (n-1)n/2 + n = n(n+1)/2
L := { w  {a,b}* : w has same number of a and b}
S → ε | SS | aSb | bSa
Claim: For any w ∈ {a,b}* , S →* w if and only if w ∈ L
Proof of “only if”: Suppose S →* w. Must show w ∈ L.
This fact is self-evident.
We show a proof by induction nevertheless,
as a warm-up for the other direction,
which is not self-evident.
L := { w  {a,b}* : w has same number of a and b}
S → ε | SS | aSb | bSa
Claim: For any w ∈ {a,b}* , S →* w if and only if w ∈ L
Proof of “only if”: Suppose S →* w. Must show w ∈ L.
Let P(n) = any w ∈ {S,a,b}* such that S →* w in n steps
has same number of a and b.
Base case (n=1): ??
L := { w  {a,b}* : w has same number of a and b}
S → ε | SS | aSb | bSa
Claim: For any w ∈ {a,b}* , S →* w if and only if w ∈ L
Proof of “only if”: Suppose S →* w. Must show w ∈ L.
Let P(n) = any w ∈ {S,a,b}* such that S →* w in n steps
has same number of a and b.
Base case (n=1): ε, SS, aSb, bSa have same number.
Induction step: Suppose S →* w' → w
where S →* w' in n-1 steps.
By induction hypothesis, ??
L := { w  {a,b}* : w has same number of a and b}
S → ε | SS | aSb | bSa
Claim: For any w ∈ {a,b}* , S →* w if and only if w ∈ L
Proof of “only if”: Suppose S →* w. Must show w ∈ L.
Let P(n) = any w ∈ {S,a,b}* such that S →* w in n steps
has same number of a and b.
Base case (n=1): ε, SS, aSb, bSa have same number.
Induction step: Suppose S →* w' → w
where S →* w' in n-1 steps.
By induction hypothesis, w' has same number of a, b.
Since any rule adds same number of a and b, w has too.
L := { w  {a,b}* : w has same number of a and b}
S → ε | SS | aSb | bSa
Claim: For any w ∈ {a,b}* , S →* w if and only if w ∈ L
Proof of “if”: Suppose w ∈ L. Must show S →* w
Let P(n) = ∀ w ∈ {S,a,b}*, |w| = n, S →* w.
Base case: w = ε. Use rule ??
L := { w  {a,b}* : w has same number of a and b}
S → ε | SS | aSb | bSa
Claim: For any w ∈ {a,b}* , S →* w if and only if w ∈ L
Proof of “if”: Suppose w ∈ L. Must show S →* w
Let P(n) = ∀ w ∈ {S,a,b}*, |w| = n, S →* w.
Base case: w = ε. Use rule S → ε
Induction step: Let |w| = n.
This step is more complicated, and is the
“creative step” of this proof.
S → ε | SS | aSb | bSa
Induction step: Let |w| = n.
Define ci := number of a - number of b in w1 w2 … wi
c0 = 0 cn = ??
S → ε | SS | aSb | bSa
Induction step: Let |w| = n.
Define ci := number of a - number of b in w1 w2 … wi
c0 = 0 cn = 0
If ∃ 0 < i < n : ci = 0
then w = ??
S → ε | SS | aSb | bSa
Induction step: Let |w| = n.
Define ci := number of a - number of b in w1 w2 … wi
c0 = 0 cn = 0
If ∃ 0 < i < n : ci = 0
then w = w' w'', where w', w'' ∈ L,
and |w'| <n, |w''| < n
By induction hypothesis. S → * w', S → * w''.
Hence S → SS → * w' S → w' w'' = w
S → ε | SS | aSb | bSa
Induction step: Let |w| = n.
Define ci := number of a - number of b in w1 w2 … wi
c0 = 0 cn = 0
If ∀ 0 < i < n : ci > 0
then w = ??
S → ε | SS | aSb | bSa
Induction step: Let |w| = n.
Define ci := number of a - number of b in w1 w2 … wi
c0 = 0 cn = 0
If ∀ 0 < i < n : ci > 0
then w = a w' b, where w' ∈ L and |w'| < n
By induction hypothesis. S → * w'
Hence S → aSb → * a w' b = w
S → ε | SS | aSb | bSa
Induction step: Let |w| = n.
Define ci := number of a - number of b in w1 w2 … wi
c0 = 0 cn = 0
If ∀ 0 < i < n : ci < 0
then w = ??
S → ε | SS | aSb | bSa
Induction step: Let |w| = n.
Define ci := number of a - number of b in w1 w2 … wi
c0 = 0 cn = 0
If ∀ 0 < i < n : ci < 0
then w = b w' a, where w' ∈ L and |w'| < n
By induction hypothesis. S →* w'
Hence S → bSa → * b w' a = w
S → ε | SS | aSb | bSa
Induction step: Let |w| = n.
Define ci := number of a - number of b in w1 w2 … wi
c0 = 0 cn = 0
These three cover all cases, because two consecutive
ci differ by 1. So the ci cannot change sign without
going through 0
DONE
Download