CS 341: Foundations of Computer Science II Prof. Marvin Nakayama Homework 9 Solutions 1. Let B be the set of all infinite sequences over {0, 1}. Show that B is uncountable, using a proof by diagonalization. Answer: Each element in B is an infinite sequence (b1 , b2 , b3 , . . .), where each bi ∈ {0, 1}. Suppose B is countable. Then we can define a correspondence f between N = {1, 2, 3, . . .} and B. Specifically, for n ∈ N , let f (n) = (bn1 , bn2 , bn3 , . . .), where bni is the ith bit in the nth sequence, i.e., n 1 2 3 4 .. . (b11 , (b21 , (b31 , (b41 , b12 , b22 , b32 , b42 , f (n) b13 , b14 , b23 , b24 , b33 , b34 , b43 , b44 , .. . b15 , b25 , b35 , b45 , . . .) . . .) . . .) . . .) Now define the sequence c = (c1 , c2 , c3 , c4 , c5 , . . .) ∈ B over {0, 1}, where ci = 1−bii . In other words, the ith bit in c is the opposite of the ith bit in the ith sequence. For example, if n f (n) 1 (0, 1, 1, 0, 0, . . .) 2 (1, 0, 1, 0, 1, . . .) 3 (1, 1, 1, 1, 1, . . .) 4 (1, 0, 0, 1, 0, . . .) .. .. . . then we would define c = (1, 1, 0, 0, . . .). Thus, c ∈ B differs from each sequence by at least one bit, so c does not equal f (n) for any n, which is a contradiction. Hence, B is uncountable. 2. Recall that EQCFG = { hG1 , G2 i | G1 and G2 are CFGs and L(G1 ) = L(G2 ) }. Show that EQCFG is undecidable. Answer: We will reduce ALLCFG to EQCFG , where ALLCFG = { hGi | G is a CFG and L(G) = Σ∗ }. Sipser shows that ALLCFG is undecidable. 1 Define CFG G0 = (V, Σ, R, S), where V = {S} and S is the starting variable. There is a rule S → `S in R for every terminal ` ∈ Σ. Also, G0 includes the rule S → ε. For example, if Σ = {a, b}, then the rules in G0 are S → aS | bS | ε. It is easy to see that L(G0 ) = Σ∗ . Let R be a TM that decides EQCFG and construct TM S to decide ALLCFG . Then S works in the following manner. S = “On input hGi, where G is a CFG: 1. Run R on input hG, G0 i, where G0 is the CFG defined above with L(G0 ) = Σ∗ . 2. If R accepts, accept. If R rejects, reject.” In step 1, TM R decides if L(G) = L(G0 ), but since L(G0 ) = Σ∗ , it decides if L(G) = Σ∗ . 3. Show that EQCFG is co-Turing-recognizable. Answer: Recall that EQCFG is a co-Turing-recognizable language if and only if its complement EQCFG is a Turing-recognizable language. Now, EQCFG = C ∪ D, where C = { w | w does not have the form hG1 , G2 i for some CFGs G1 and G1 }, D = { hG1 , G2 i | G1 and G2 are CFGs and L(G1 ) 6= L(G2 ) }. We claim that the set C (consisting of strings that violate the syntax for encoding hG1 , G2 i) is easy to recognize. We do not provide a formal proof though. The set D can be recognized as follows. We convert CFGs G1 and G2 into Chomsky normal form. Then we start enumerating strings in Σ∗ lexicographically, where Σ is the set of terminals for both G1 and G2 . For each string w enumerated, we check whether it can be generated by G1 and by G2 . If both CFGs or neither CFG can generate w, then TM moves on to produce the next string in the lexicographic sequence. Otherwise, exactly one of the CFGs generates the string, and the TM accepts. Thus, D is Turingrecognizable. We showed in a previous homework that the class of Turing-recognizable languages is closed under union, so EQCFG is Turing-recognizable. Let s1 , s2 , s3 , . . . be a list of the strings in Σ∗ listed in lexicographic order. Then a TM T that recognizes EQCFG is as follows: T = “On input hG1 , G2 i, where G1 and G2 are CFGs: 1. Check if G1 and G2 are valid CFGs. If at least one isn’t, accept. 2. Convert G1 and G2 each into equivalent CFGs G01 and G02 , both in Chomsky normal form. 3. Repeat the following for i = 1, 2, 3, . . . 4. Test if both G01 and G02 generate si . If exactly one of them does, accept.” 2 Why did we convert the CFGs into Chomsky normal form? The reason is that there is a procedure that always halts for checking whether a CFG in Chomsky normal form can generate a particular string w or not. 4. Let S TM = { hM i | M is a TM that accepts wR whenever it accepts w }. Show that S TM is undecidable. Answer: The basic idea is to reduce ATM to S TM , where ATM = { hM, wi | M is a TM that accepts w }, which we know is undecidable. Let us assume that S TM is decidable. For convenience, we will denote the TMs that accept S TM and ATM by S and A, respectively. The TM A receives hM, wi as input. It rejects the input if the syntax is incorrect. Otherwise, it constructs a TM M2 from M and w, and feeds hM2 i to S. We describe how to construct TM M2 from M and w below. The TM A accepts hM, wi if and only if S accepts hM2 i. The TM M2 works as follows. It scans the input x and accepts if x ∈ 00∗ 11∗ . Otherwise, it simulates M on w. If M accepts w, then M2 accepts irrespective of what the input for M2 was. Note that the description of M and w is hard-coded into M2 ; it is not fed as input to M2 . By construction, M2 recognizes the language 00∗ 11∗ if M does not accept w; otherwise, M2 recognizes Σ∗ . Recall that L(M2 ) is the language recognized by M2 . When the language L(M2 ) is 00∗ 11∗ , if w ∈ L(M2 ), then w begins with 0 and ends with 1, so wR 6∈ L(M2 ); thus, hM2 i 6∈ S TM in this case. On the other hand, when the language L(M2 ) is Σ∗ , if w ∈ L(M2 ), then wR ∈ L(M2 ) because L(M2 ) contains all strings over Σ = {0, 1}; thus, hM2 i ∈ S TM in this case. Therefore, because S decides S TM by assumption, hM2 i will be accepted by S if and only if M accepts w. This means that TM A will accept hM, wi if and only if S accepts hM2 i. By assumption, S TM is decidable. This implies ATM is decidable, which is a contradiction. We now provide a description of TM A that decides ATM by reducing it to S TM . A = “On input hM, wi, where M is a TM and w is a string: 1. Check if hM, wi is a valid encoding of a TM M and string w. If not, reject. 2. Construct the following TM M2 . M2 = “On input x: 1. If x ∈ 00∗ 11∗ , accept. 2. If x 6∈ 00∗ 11∗ , then run M on input w and accept if M accepts w.” 3. Run S on input hM2 i. 4. If S accepts, accept; if S rejects, reject.” 3