Theorem 7.2: If L = L(M) for some npda M, then L is a context-free language.

Basic idea of the proof: Build a grammar that imitates the action of the PDA. Specifically, it must show how the stack eventually empties, so the productions in the grammar describe how the stack grows and shrinks.

Given M = (Q, Σ, Γ, δ, q0, Z, F), construct the grammar G = (V, Σ, S, P). Define

V = {S} ∪ {[q, A, p] | p, q ∈ Q and A ∈ Γ}

Thus, there are |Q|²·|Γ| + 1 variables.

The interpretation of [q, A, p] should be: when starting in state q with an A on top of the stack, after the A (and anything that replaced it) is ultimately popped from the stack, M will be in state p. In other words, looking at the number of symbols on the stack, if M is in state q and A is the top of a stack that contains k symbols, then when the stack first has only k - 1 symbols, M will be in state p. (This may require several moves.)

For a string to be accepted by M we must be able to start in state q0 with only Z on the stack, process the string, and finally pop Z so that the stack is empty. Since the state of the machine when this happens is irrelevant, we must anticipate all possible ways this can occur in the grammar. Thus, we have the following productions for S:

S → [q0, Z, q] for every q ∈ Q.

In other words, from the initial configuration the machine may terminate in any state.

If a move in M simply pops the stack, then the corresponding production in the grammar should produce either a terminal symbol or the empty string, i.e. it should not produce any new nonterminals. If δ(q, a, A) contains (p, λ), it means that M, in state q, looking at an a with A on top of the stack, moves to state p and does not push anything new onto the stack. In the grammar this means we want to eliminate the variable [q, A, p], so we add to P the production

[q, A, p] → a.

Similarly, if δ(q, λ, A) contains (p, λ), then add the production

[q, A, p] → λ.

Now, we must take care of moves that put symbols back on the stack, such as δ(q, a, A) contains (p, BA).
To do this requires us to anticipate all possible ways to start in state q with an A on top of the stack and eventually end up in state p with a stack with one less symbol.

Starting with the simplest case, if δ(q, a, A) contains (p, B) then put in the productions

[q, A, r] → a [p, B, r] for every r in Q.

(If it helps, you can interpret this as replacing A at the top of the stack by B, and now we have to worry about how to pop B off to decrease the height of the stack.)

In general, if δ(q, a, A) contains (r, B1B2...Bm) then add all productions of the form

[q, A, p] → a [r, B1, q2][q2, B2, q3] ... [qm, Bm, p]

for all possible ways to choose p and the qi's from the states in Q. (NOTE: the qi's are not necessarily distinct states.) This means when M makes the indicated move, it replaces the A on the stack with the string of Bi's and enters state r. Then to actually decrease the height of the stack requires popping off all of the m symbols just put on the stack, ultimately ending up in state p. Notice that the states match up in adjacent variables from V (i.e. those of the form [q, A, p]), since the right state in a variable tells what state M moved to, and the left state in the next variable tells the state the machine is starting in next. In effect, the grammar must anticipate all possible ways the machine might be able to accomplish this.

If you haven't already guessed it, this can lead to many useless symbols and impossible moves, so in practice you only add variables as needed and toss out any productions that can never be used.

Here's a very simple example of the last way of adding productions to the grammar. Let Q = {p, q, r} and look at one move of M, namely δ(q, a, A) = {(p, BA), (r, B)}.
Then we need the following productions for the transition (p, BA) in δ(q, a, A):

Start in q, end in p                Start in q, end in q                Start in q, end in r
[q, A, p] → a [p, B, p][p, A, p]    [q, A, q] → a [p, B, p][p, A, q]    [q, A, r] → a [p, B, p][p, A, r]
[q, A, p] → a [p, B, q][q, A, p]    [q, A, q] → a [p, B, q][q, A, q]    [q, A, r] → a [p, B, q][q, A, r]
[q, A, p] → a [p, B, r][r, A, p]    [q, A, q] → a [p, B, r][r, A, q]    [q, A, r] → a [p, B, r][r, A, r]

and from the transition (r, B) in δ(q, a, A) we get

[q, A, p] → a [r, B, p]             [q, A, q] → a [r, B, q]             [q, A, r] → a [r, B, r]
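
The bookkeeping in this construction is mechanical, so it can help to see it spelled out as a short program. The following Python sketch is only an illustration and not part of the proof. The names var, productions_for_move and build_grammar are hypothetical, and the representation is an assumption: a transition table is a dictionary mapping (q, a, A) to a set of (state, pushed symbols) pairs, with the empty string standing for λ.

```python
from itertools import product

LAMBDA = ""  # assumption: the empty string stands for lambda


def var(q, A, p):
    """Render the grammar variable [q, A, p] as a string."""
    return f"[{q}, {A}, {p}]"


def productions_for_move(states, q, a, A, r, pushed):
    """Productions contributed by one move (r, pushed) in delta(q, a, A).

    `pushed` is the tuple (B1, ..., Bm) that replaces A; it is empty for a
    pop move. Returns a list of (head, body) pairs, body being a list of
    grammar symbols (an empty body means the right side is lambda).
    """
    prefix = [a] if a != LAMBDA else []
    if not pushed:
        # Pop move: [q, A, r] -> a   (or -> lambda when a is lambda)
        return [(var(q, A, r), prefix)]
    rules = []
    m = len(pushed)
    # Choose q2, ..., qm and the final state p in every possible way.
    for choice in product(states, repeat=m):
        chain = [r, *choice]  # r, q2, ..., qm, p
        body = prefix + [
            var(chain[i], pushed[i], chain[i + 1]) for i in range(m)
        ]
        rules.append((var(q, A, chain[-1]), body))
    return rules


def build_grammar(states, delta, q0, Z):
    """All productions: the start rules S -> [q0, Z, q], then one batch per move."""
    rules = [("S", [var(q0, Z, q)]) for q in states]
    for (q, a, A), moves in delta.items():
        for r, pushed in sorted(moves):
            rules.extend(productions_for_move(states, q, a, A, r, pushed))
    return rules


if __name__ == "__main__":
    # The example above: Q = {p, q, r}, delta(q, a, A) = {(p, BA), (r, B)}.
    states = ["p", "q", "r"]
    for r, pushed in [("p", ("B", "A")), ("r", ("B",))]:
        for head, body in productions_for_move(states, "q", "a", "A", r, pushed):
            print(head, "->", " ".join(body))
```

Run on the example, the move (p, BA) contributes 3² = 9 productions and the move (r, B) contributes 3, matching the two groups listed above. The sketch makes no attempt to discard useless variables or productions that can never be used, so that pruning step would still have to be done separately.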