CSE396 Problem Set 4 Answer Key Spring 2016

(1) Write a regular expression $R$ over the alphabet $\{0, 1, 2\}$ such that $L(R)$ equals the language $A$ of numbers in base 3 that are congruent to either 0 or 3 modulo 4. Count the empty string as belonging to $L(R)$ (i.e., it counts as zero), and allow leading 0s in the numbers. First design a DFA $M$ such that $L(M) = A$. Then show the conversion to a regular expression. For good measure, give a PD set of size 4 for this language based on your $M$. As a hint for designing $M$, use four states for the possible congruences. The idea of an “even bank” and “odd bank” like in 1(g) of PS2 may help too. (24 pts.)

Answer: For the DFA $M$ and the algorithm to find a regular expression $R$, see the separate hand-drawn sheet. Briefly, if you draw $M$ with states $q_0, q_2$ for the even congruences mod 4 on the left and $q_1, q_3$ for the odd congruences on the right, it makes a nice square with arcs on 1 going straight across between the “banks,” arcs on 2 going up and down the “even bank” with 0 as self-loops, and the roles of 0 and 2 reversed on the “odd bank.” The sheet shows eliminating the state $q_2$ first; eliminating $q_1$ first comes down to the same 2-state GNFA with both states $q_0, q_3$ accepting and arcs as follows:

$$
\begin{aligned}
\alpha &= T(0,0) = 0 + 20^*2 + 12^*1 \\
\beta &= T(0,3) = 20^*1 + 12^*0 \\
\gamma &= T(3,3) = 2 + 10^*1 + 02^*0 \\
\eta &= T(3,0) = 10^*2 + 02^*1
\end{aligned}
$$

Then $L(M) = L_{0,0} \cup L_{0,3}$. From the notes (which wrote “1,2” in place of “0,3”) the formulas are:

$$
\begin{aligned}
L_{0,0} &= (\alpha \cup \beta\gamma^*\eta)^* \\
L_{0,3} &= \alpha^*\beta(\gamma \cup \eta\alpha^*\beta)^* = (\alpha \cup \beta\gamma^*\eta)^*\beta\gamma^*.
\end{aligned}
$$

The latter form gives $L_{0,3} = L_{0,0}\,\beta\gamma^*$. Then for the final union you can factor out $L_{0,0}$ to give $L(M) = R = L_{0,0}(\epsilon \cup \beta\gamma^*)$. I don’t know whether this particular answer can be usefully simplified further. So long as you made clear you knew $L(M) = L_{0,0} \cup L_{0,3}$ and gave one or other of the above labeled forms, that was OK.
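The expression for $R$ above can be sanity-checked mechanically. The sketch below is not part of the original answer key. It assumes the DFA follows the rule $\delta(q_i, d) = q_{(3i + d) \bmod 4}$, which matches the square diagram described above because appending digit $d$ to a base-3 numeral for $n$ gives $3n + d$. It then compares that DFA against the regular expression on all strings up to a small length. The names `dfa_accepts`, `regex_accepts`, and `all_strings`, and the length bound 7, are illustrative choices.

```python
import re
from itertools import product

ACCEPTING = {0, 3}  # congruence classes 0 and 3 modulo 4


def dfa_accepts(w):
    """Run the 4-state DFA: state i means 'value so far is i mod 4'."""
    q = 0  # the empty string counts as zero
    for ch in w:
        q = (3 * q + int(ch)) % 4
    return q in ACCEPTING


# Arc labels of the 2-state GNFA on q0 and q3, written in Python regex syntax
# ('|' plays the role of '+' / union in the answer key).
ALPHA = "(?:0|20*2|12*1)"   # T(0,0)
BETA = "(?:20*1|12*0)"      # T(0,3)
GAMMA = "(?:2|10*1|02*0)"   # T(3,3)
ETA = "(?:10*2|02*1)"       # T(3,0)

# L_{0,0} = (alpha U beta gamma* eta)*
L00 = f"(?:{ALPHA}|{BETA}{GAMMA}*{ETA})*"
# R = L_{0,0} (epsilon U beta gamma*)
R = re.compile(f"{L00}(?:{BETA}{GAMMA}*)?")


def regex_accepts(w):
    """True when the whole string w matches R."""
    return R.fullmatch(w) is not None


def all_strings(max_len):
    """Yield every string over {0,1,2} of length at most max_len."""
    for n in range(max_len + 1):
        for digits in product("012", repeat=n):
            yield "".join(digits)


# Strings on which the DFA and the regular expression disagree.
mismatches = [w for w in all_strings(7) if dfa_accepts(w) != regex_accepts(w)]
print("mismatches:", mismatches)
```

An empty mismatch list on short strings is only evidence, not a proof; the state-elimination argument is what establishes $L(R) = L(M)$.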
It was not required to do a final yucky substitution of the expressions for $\alpha, \beta, \gamma, \eta$ to get a mess of 0s, 1s, 2s, and *s, though if you followed the text uncritically by adding a new final state $f$ with epsilon-arcs $(q_0, \epsilon, f)$ and $(q_3, \epsilon, f)$ and eliminated state $q_3$ too, you’d be locked into the yuck.

(2) Prove that the following two languages are non-regular via the Myhill-Nerode Theorem. Both use alphabet $\Sigma = \{0, 1\}$. Here $x^R$ stands for $x$ reversed, e.g. $0111^R = 1110$. The condition $x = x^R$ means that $x$ is a palindrome, but the members of $L_a$ are a little more restrictive, including 1001 but not 10001. (2 × 12 = 24 pts.)

(a) $L_a = \{xx^R : x \in \Sigma^*\}$.

(b) $L_b = \{x \in \Sigma^* : \#0(x) > \#1(x)\}$.

Answer to (a): The idea is that “critical first halves” can be $0^n 1$, where the final 1 makes a definite boundary to the rest of the string that has to balance it. So take $S = 0^*1$. Clearly $S$ is infinite. Let any $x, y \in S$, $x \neq y$, be given. Then there are distinct numbers $m, n \in \mathbb{N}$ such that $x = 0^m 1$ and $y = 0^n 1$. Take $z = 10^m$. Then $xz = 0^m 110^m \in L_a$ since the halves balance to make an even-length palindrome, but $yz = 0^n 110^m \notin L_a$ since it is not a palindrome at all: the ‘1’s prevent any way of trying to break the parts with ‘0’s differently to balance them. So $L_a(xz) \neq L_a(yz)$. Since $x, y \in S$ are arbitrary distinct members, this shows $S$ is PD for $L_a$, and since $S$ is infinite, $L_a$ is not regular by the Myhill-Nerode Theorem.

(It is fine to take just $S = 0^*$, but then you have to remember later that $z$ needs to be $110^m$ with two ‘1’s, not one or none. Another mistake was $S = 10^*$ and later $z = 0^m 1$, which is mirror-image wrong: if $m = 3$ and $n = 5$ then $yz = 10^5 0^3 1 = 1000000001 = 10^4 0^4 1$, which belongs to $L_a$ after all.)

Answer to (b): Take $S = 0^*$. Clearly $S$ is infinite. Let any $x, y \in S$, $x \neq y$, be given. Then we can helpfully write $x = 0^m$ and $y = 0^n$ where $m, n$ are natural numbers such that without loss of generality $m < n$. Take $z = 1^m$.
Then $xz = 0^m 1^m \notin L_b$ since $\#0(xz) = m$ is not greater than $\#1(xz) = m$, but $yz = 0^n 1^m \in L_b$ since $n > m$ by the “wlog.” provision. Hence $L_b(xz) \neq L_b(yz)$. Since $x, y \in S$ are arbitrary distinct members, this shows $S$ is PD for $L_b$, and since $S$ is infinite, $L_b$ is not regular by the Myhill-Nerode Theorem.

(3) Prove that the following language is nonregular via the Myhill-Nerode Theorem.

$L_3 = \{ a^i b^j c^k : i = j \text{ or } i \neq k \}$.

(18 pts., for 66 total on the set)

Answer: To make the OR statement critical we should falsify the $i = j$ part, since it comes first in the relevant strings. We can handle it either entirely with our choice of $S$ or split between $S$ and $z$. In the latter case, it helps to make $i = j$ false by fixing $j = 0$, which lets $i$ be any arbitrary positive number. So take $S = a^+ = \{a^n : n \geq 1\}$. Clearly $S$ is infinite. Let any $x, y \in S$, $x \neq y$, be given. Then there are distinct positive natural numbers $m, n \in \mathbb{N}^+$ such that $x = a^m$ and $y = a^n$. Take $z = c^m$. Then $xz = a^m c^m = a^m b^0 c^m \notin L_3$ since it fails both the “$i = j$” and “$i \neq k$” tests when you substitute $i = m$, $j = 0$, and $k = m$ (noting $m \geq 1$ by the definition of $S$). But $yz = a^n c^m \in L_3$ since $n \neq m$, so $L_3(xz) \neq L_3(yz)$. It follows that $S$ is an infinite PD set for $L_3$, so $L_3$ is nonregular.

An example of the other way is to take $S = \{a^m b^{m+1} : m \geq 0\}$. No one says $S$ has to be regular; it only has to be infinite. This automatically falsifies the $i = j$ part. Let any $x, y \in S$, $x \neq y$, be given. Then there are numbers $m \neq n$ such that $x = a^m b^{m+1}$ and $y = a^n b^{n+1}$. (Note how this is truly a general choice from this particular $S$, in contrast to the fallacy mentioned in lecture notes of making $n = m + 1$.) Take $z = c^m$. The rest of the reasoning is as before: $xz = a^m b^{m+1} c^m \notin L_3$, while $yz = a^n b^{n+1} c^m \in L_3$ since $n \neq m$. (There are other correct answers too.)
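The distinguishing suffixes chosen in problems (2) and (3) can be spot-checked by brute force for small exponents. The following Python sketch is not part of the original key; the membership tests `in_La`, `in_Lb`, `in_L3`, the helper `separates`, and the cutoff `BOUND` are illustrative names. It relies on the fact that a string has the form $xx^R$ exactly when it is an even-length palindrome. For 2(b) it uses $z = 1^{\min(m,n)}$, which is the same as the key's $z = 1^m$ under the assumption $m < n$.

```python
import re


def in_La(w):
    """w = x x^R for some x, i.e. w is an even-length palindrome."""
    return len(w) % 2 == 0 and w == w[::-1]


def in_Lb(w):
    """w has strictly more 0s than 1s."""
    return w.count("0") > w.count("1")


_ABC = re.compile(r"(a*)(b*)(c*)")


def in_L3(w):
    """w = a^i b^j c^k with i == j or i != k."""
    match = _ABC.fullmatch(w)
    if match is None:
        return False
    i, j, k = (len(group) for group in match.groups())
    return i == j or i != k


def separates(member, x, y, z):
    """True when exactly one of xz, yz belongs to the language."""
    return member(x + z) != member(y + z)


BOUND = 12  # arbitrary cutoff for the exponents m and n (an assumption)


def check_all(bound=BOUND):
    """Check every pair of distinct exponents m, n below the bound."""
    ok = True
    for m in range(bound):
        for n in range(bound):
            if m == n:
                continue
            # 2(a): x = 0^m 1, y = 0^n 1, z = 1 0^m
            ok &= separates(in_La, "0" * m + "1", "0" * n + "1", "1" + "0" * m)
            # 2(b): x = 0^m, y = 0^n, z = 1^min(m, n)
            ok &= separates(in_Lb, "0" * m, "0" * n, "1" * min(m, n))
            # 3, first way: x = a^m, y = a^n with m, n >= 1, z = c^m
            if m >= 1 and n >= 1:
                ok &= separates(in_L3, "a" * m, "a" * n, "c" * m)
            # 3, second way: x = a^m b^(m+1), y = a^n b^(n+1), z = c^m
            ok &= separates(
                in_L3, "a" * m + "b" * (m + 1), "a" * n + "b" * (n + 1), "c" * m
            )
    return ok


all_ok = check_all()

# The flawed attempt from 2(a): S = 10^*, z = 0^m 1 with m = 3, n = 5.
bad_yz = "1" + "0" * 5 + "0" * 3 + "1"
bad_attempt_fails = in_La(bad_yz)  # yz lands in L_a, so z does not separate

print("all witnesses separate:", all_ok)
print("flawed attempt yields a member of L_a:", bad_attempt_fails)
```

A finite check like this can catch a slip such as swapping $m$ and $n$ in a witness string, but it does not replace the general argument over all $m \neq n$.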