ASSIGNMENT ONE SOLUTIONS
MATH 4805 / COMP 4805 / MATH 5605

(1) (a) (0 + 1)∗010 (finite automata below).

(b) First observe that the following regular expression generates the binary strings with an even number of 0s and an even number of 1s:

r = (00 + 11 + (01 + 10)(00 + 11)∗(01 + 10))∗.

Next observe that the following regular expression generates the binary strings with an even number of 0s and an odd number of 1s, with the additional property that no nonempty prefix has an even number of 0s and an even number of 1s:

s = 1 + (01 + 10)(00 + 11)∗0.

Any string that has an even number of 0s and an odd number of 1s consists of its longest prefix that has an even number of 0s and an even number of 1s, followed by a suffix that has an even number of 0s and an odd number of 1s. By the choice of the prefix, no nonempty prefix of this suffix has an even number of 0s and an even number of 1s. Therefore, a regular expression for the language is rs (finite automata below).

(2) (a) Connect passwords “Must be between 6 and 8 characters long”, so there are a finite number of possibilities. Thus, Connect passwords form a finite language, and hence a recognizable language.

(b) Consider the following languages that are recognizable (or equivalently, regular).
• The finite automaton below accepts strings that have length at least six.
• Given any a, b, c ∈ A, the following regular expression generates strings that have the substring abc: A∗abcA∗. Therefore, by the closure of regular languages under union, the set of strings containing a length-three substring of a fixed login name is a regular language. Therefore, by the closure of regular languages under complement, the set of strings that do not contain a length-three substring of a fixed login name is a regular language.
• The following regular expression generates strings that have a substring that is a postal code: A∗UDUDUDA∗. Therefore, by the closure of regular languages under complement, the set of strings that do not contain a substring that is a postal code is a regular language.
• The following regular expression generates strings that have a substring that is a license plate: A∗UUUDDDA∗. Therefore, by the closure of regular languages under complement, the set of strings that do not contain a substring that is a license plate is a regular language.
• In the previous three bullets, replace the initial regular expressions by (i) A∗cbaA∗, (ii) A∗DUDUDUA∗, and (iii) A∗DDDUUUA∗, respectively, to prove that the set of strings whose reverse does not contain the given substrings is a regular language.
Next we consider avoiding the desired substrings in ww instead of the reverse of w. In the previous three bullets, replace the initial regular expressions by
(i) A∗abcA∗ + bcA∗a + cA∗ab,
(ii) A∗UDUDUDA∗ + DA∗UDUDU + UDA∗UDUD + DUDA∗UDU + UDUDA∗UD + DUDUDA∗U, and
(iii) A∗UUUDDDA∗ + DA∗UUUDD + DDA∗UUUD + DDDA∗UUU + UDDDA∗UU + UUDDDA∗U,
respectively, to prove that the set of strings whose concatenation with themselves (i.e., ww) does not contain the given substrings is a regular language.
Observe that if w is a string of length at least six (as per the first bullet) and s is a string of length at most six (as in the previous three bullets), then some string in w∗ contains s as a substring if and only if ww contains s as a substring. Therefore, the set of strings of length at least six whose Kleene closure does not contain the given substrings is a regular language by the same logic as above.
• The following non-deterministic finite automaton accepts strings that contain a symbol in Q. Therefore, by the closure of regular languages under complement, the set of strings that do not contain a symbol in Q is a regular language.
• Let a_1, a_2, a_3 ∈ A. The following regular expression generates strings that contain only these three symbols: (a_1 + a_2 + a_3)∗. By considering all choices for a_1, a_2, and a_3 and by the closure of regular languages under union, the set of strings containing at most three distinct symbols is a regular language.
By the closure of regular languages under complement, the set of strings that contain at least four distinct symbols is a regular language.
• The following regular expression generates strings whose first six symbols contain at least one symbol from L ∪ U, one symbol from D, and one symbol from P. In this regular expression, the terms are simply the permutations of the multiset {A, A, A, (L + U), D, P} (ensuring that the first six symbols contain the desired three types of symbols) followed by A∗. To be complete, the full regular expression appears below.

AAA(L + U)DPA∗ + AAA(L + U)PDA∗ + AAAD(L + U)PA∗ + AAADP(L + U)A∗ +
AAAP(L + U)DA∗ + AAAPD(L + U)A∗ + AA(L + U)ADPA∗ + AA(L + U)APDA∗ +
AA(L + U)DAPA∗ + AA(L + U)DPAA∗ + AA(L + U)PADA∗ + AA(L + U)PDAA∗ +
AADA(L + U)PA∗ + AADAP(L + U)A∗ + AAD(L + U)APA∗ + AAD(L + U)PAA∗ +
AADPA(L + U)A∗ + AADP(L + U)AA∗ + AAPA(L + U)DA∗ + AAPAD(L + U)A∗ +
AAP(L + U)ADA∗ + AAP(L + U)DAA∗ + AAPDA(L + U)A∗ + AAPD(L + U)AA∗ +
A(L + U)AADPA∗ + A(L + U)AAPDA∗ + A(L + U)ADAPA∗ + A(L + U)ADPAA∗ +
A(L + U)APADA∗ + A(L + U)APDAA∗ + A(L + U)DAAPA∗ + A(L + U)DAPAA∗ +
A(L + U)DPAAA∗ + A(L + U)PAADA∗ + A(L + U)PADAA∗ + A(L + U)PDAAA∗ +
ADAA(L + U)PA∗ + ADAAP(L + U)A∗ + ADA(L + U)APA∗ + ADA(L + U)PAA∗ +
ADAPA(L + U)A∗ + ADAP(L + U)AA∗ + AD(L + U)AAPA∗ + AD(L + U)APAA∗ +
AD(L + U)PAAA∗ + ADPAA(L + U)A∗ + ADPA(L + U)AA∗ + ADP(L + U)AAA∗ +
APAA(L + U)DA∗ + APAAD(L + U)A∗ + APA(L + U)ADA∗ + APA(L + U)DAA∗ +
APADA(L + U)A∗ + APAD(L + U)AA∗ + AP(L + U)AADA∗ + AP(L + U)ADAA∗ +
AP(L + U)DAAA∗ + APDAA(L + U)A∗ + APDA(L + U)AA∗ + APD(L + U)AAA∗ +
(L + U)AAADPA∗ + (L + U)AAAPDA∗ + (L + U)AADAPA∗ + (L + U)AADPAA∗ +
(L + U)AAPADA∗ + (L + U)AAPDAA∗ + (L + U)ADAAPA∗ + (L + U)ADAPAA∗ +
(L + U)ADPAAA∗ + (L + U)APAADA∗ + (L + U)APADAA∗ + (L + U)APDAAA∗ +
(L + U)DAAAPA∗ + (L + U)DAAPAA∗ + (L + U)DAPAAA∗ + (L + U)DPAAAA∗ +
(L + U)PAAADA∗ + (L + U)PAADAA∗ + (L + U)PADAAA∗ + (L + U)PDAAAA∗ +
DAAA(L + U)PA∗ + DAAAP(L + U)A∗ + DAA(L + U)APA∗ + DAA(L + U)PAA∗ +
DAAPA(L + U)A∗ + DAAP(L + U)AA∗ + DA(L + U)AAPA∗ + DA(L + U)APAA∗ +
DA(L + U)PAAA∗ + DAPAA(L + U)A∗ + DAPA(L + U)AA∗ + DAP(L + U)AAA∗ +
D(L + U)AAAPA∗ + D(L + U)AAPAA∗ + D(L + U)APAAA∗ + D(L + U)PAAAA∗ +
DPAAA(L + U)A∗ + DPAA(L + U)AA∗ + DPA(L + U)AAA∗ + DP(L + U)AAAA∗ +
PAAA(L + U)DA∗ + PAAAD(L + U)A∗ + PAA(L + U)ADA∗ + PAA(L + U)DAA∗ +
PAADA(L + U)A∗ + PAAD(L + U)AA∗ + PA(L + U)AADA∗ + PA(L + U)ADAA∗ +
PA(L + U)DAAA∗ + PADAA(L + U)A∗ + PADA(L + U)AA∗ + PAD(L + U)AAA∗ +
P(L + U)AAADA∗ + P(L + U)AADAA∗ + P(L + U)ADAAA∗ + P(L + U)DAAAA∗ +
PDAAA(L + U)A∗ + PDAA(L + U)AA∗ + PDA(L + U)AAA∗ + PD(L + U)AAAA∗

For the overall proof, observe that each bullet above proves that a certain language is regular. Moreover, the intersection of these languages is precisely the set of passwords defined in the question. Therefore, the passwords form a regular language by the closure of regular languages under intersection.

(3) (a) No, L_p is not recognizable. Consider the string w = 1^n 0 1^n, where n is the pumping constant for L_p. Observe that w ∈ L_p and |w| ≥ n. Consider any x, y, z ∈ {0, 1}∗ such that (i) w = xyz, (ii) |xy| ≤ n, and (iii) |y| > 0. By these constraints, x = 1^a, y = 1^b, and z = 1^c 0 1^n where a ≥ 0, b > 0, and a + b + c = n. If i = 0, then xy^i z = xz = 1^{a+c} 0 1^n = 1^{a+c} 0 1^{a+b+c} ∉ L_p since b > 0. Therefore, L_p does not satisfy the pumping property, and hence L_p is not regular.

(b) Yes, L_e is recognizable. Suppose w ∈ {0, 1}∗ and let n_{01} be its number of 01 substrings and let n_{10} be its number of 10 substrings. Observe that

n_{01} = n_{10} + 1   if w begins with 0 and w ends with 1,
n_{01} = n_{10} − 1   if w begins with 1 and w ends with 0,
n_{01} = n_{10}       otherwise,

where the final case includes (i) w = ε, (ii) w begins and ends with 0, and (iii) w begins and ends with 1.
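The three cases above can be sanity-checked by brute force on short strings. The following Python sketch is only a check, not a proof. The helper names count_substring, predicted_difference, and check_cases are hypothetical, and the length bound passed to check_cases is an arbitrary choice.

```python
from itertools import product


def count_substring(w, pattern):
    """Count the (possibly overlapping) occurrences of pattern in w."""
    return sum(
        1
        for k in range(len(w) - len(pattern) + 1)
        if w[k:k + len(pattern)] == pattern
    )


def predicted_difference(w):
    """Value of n_01 - n_10 predicted by the three cases in the text."""
    if w and w[0] == "0" and w[-1] == "1":
        return 1
    if w and w[0] == "1" and w[-1] == "0":
        return -1
    return 0  # empty string, or first and last symbols agree


def check_cases(max_length):
    """Compare the prediction with a direct count for every binary
    string of length at most max_length (an arbitrary bound)."""
    for length in range(max_length + 1):
        for letters in product("01", repeat=length):
            w = "".join(letters)
            actual = count_substring(w, "01") - count_substring(w, "10")
            if actual != predicted_difference(w):
                return False
    return True
```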
Therefore, L_e is generated by the regular expression ε + 0 + 1 + 0(0 + 1)∗0 + 1(0 + 1)∗1. Thus, L_e is regular, and hence L_e is recognizable.

(4) (a) Let n be the pumping constant for L_a (any n ≥ 1 works), and consider an arbitrary w = w_1 w_2 ··· w_m ∈ L_a with m ≥ n, where each w_i ∈ {0, 1, 2}. Observe that w can be expressed as w = 0^r 1^s 2^t where r ≠ 0 ⟹ s = t. Let x = ε, y = w_1, and z = w_2 w_3 ··· w_m. Observe that these choices ensure (i) w = xyz, (ii) |xy| ≤ n, and (iii) |y| > 0. We wish to prove that (iv) xy^i z ∈ L_a for all i ≥ 0. There are two cases to consider. If r ≠ 0, then xy^i z = 0^{r−1+i} 1^s 2^t for all i ≥ 0. In this case, s = t by the implication given above. Therefore, xy^i z = 0^{r−1+i} 1^s 2^s ∈ L_a for all i ≥ 0. On the other hand, if r = 0, then xy^i z = 1^{s−1+i} 2^t (or 2^{t−1+i} when s = 0), which contains no 0s. Therefore, xy^i z ∈ L_a for all i ≥ 0. Therefore, L_a satisfies the pumping property.

(b) The difference between the pumping property and the generalized pumping property is that the former gives conditions on a prefix of length at most n in each string, whereas the latter gives conditions on a substring of length at most n in each string.

(c) Consider the string pws with p = 0, w = 1^n 2^n, and s = ε, where n is the generalized pumping constant. Observe that pws = 0 1^n 2^n ∈ L_a and |w| ≥ n. Consider any x, y, z ∈ {0, 1, 2}∗ such that (i) w = xyz, (ii) |xy| ≤ n, and (iii) |y| > 0. By these constraints, x = 1^a, y = 1^b, and z = 1^c 2^n where a ≥ 0, b > 0, and a + b + c = n. If i = 0, then pxy^i zs = 0xz = 0 1^{a+c} 2^n = 0 1^{a+c} 2^{a+b+c} ∉ L_a since b > 0. Therefore, L_a does not satisfy the generalized pumping property.

(d) If L is a recognizable language, then there exists some finite automaton M = (S, A, s_0, δ, F) such that L(M) = L. Let n = |S| be the number of states in M. Consider a string pws ∈ L with |w| ≥ n. Let w = w_1 w_2 ··· w_m where each w_i ∈ A and m ≥ n. Consider the states q_j = δ∗(s_0, p w_1 w_2 ··· w_j) for j = 0, 1, ..., n. Since there are only |S| = n states, there must be a repetition among q_0, q_1, ..., q_n. Let r and t be chosen such that q_r = q_t and 0 ≤ r < t ≤ n. Now consider the strings x = w_1 w_2 ··· w_r, y = w_{r+1} w_{r+2} ··· w_t, and z = w_{t+1} w_{t+2} ··· w_m, so that w = xyz, |xy| = t ≤ n, and |y| = t − r > 0. Observe that the repeated state implies that δ∗(s_0, pxy^i zs) = δ∗(q_t, zs) for all i ≥ 0. Furthermore, the state δ∗(q_t, zs) ∈ F since pws = pxyzs ∈ L. Therefore, pxy^i zs ∈ L for all i ≥ 0.

(5) (a) The language L_s = {w ∈ A∗ | w has suffix s} is recognizable. Therefore, if L is recognizable then so is L ∩ L_s by the closure of recognizable languages under intersection. Observe that L ∩ L_s consists of exactly the strings in L that have suffix s. Therefore, L ∩ L_s = L/s, and so L/s is recognizable.

(b) Let S = S_1 × S_2, i = {(i_1, i_2)} (this i should have been stated as I in the question since it is a set of states), F = (F_1 × (S_2 \ F_2)) ∪ ((S_1 \ F_1) × F_2), and define δ : S × A → P(S) by δ((x, y), a) = {(δ_1(x, a), δ_2(y, a))} for all (x, y) ∈ S and a ∈ A. Observe that M is essentially a (deterministic) finite automaton and that it accepts the desired language.

(c) If L is recognizable, then there exists a finite automaton M = (S, A, i, δ, F) such that L(M) = L. We will construct an ε-finite automaton M′ = (S′, B, I, δ′, F) as follows. S′ is a set of states that includes those in S as well as additional states to be defined, and the set of initial states I = {i} includes only the single initial state from M. The basic idea is to replace each arc labeled a in M by a path whose arcs are labeled with the symbols in f(a). Formally, define δ′ : S′ × (B ∪ {ε}) → P(S′) as follows. If f(a) = ε for a ∈ A, then for each s ∈ S let δ′(s, ε) contain δ(s, a). If f(a) = b for a ∈ A and b ∈ B, then for each s ∈ S let δ′(s, b) contain δ(s, a).

(6) (a) δ∗(s_1, 1) = s_2 ∈ F and δ∗(s_2, 1) = s_0 ∉ F, so s_1 ≁_M s_2.

(b) See the tables below.
Partition Table for S/∼_M (a "-" denotes an unmarked pair)

        s_1   s_2   s_3   s_4   s_5   s_6
s_0     x_0   x_0   x_0   -     x_0   x_0
s_1           x_1   -     x_0   -     x_1
s_2                 x_1   x_0   x_1   -
s_3                       x_0   -     x_1
s_4                             x_0   x_0
s_5                                   x_1

Table 1

pair     0       1
(0,4)    (1,5)   (4,0)
(1,2)    (3,1)   (2,0)
(1,3)    (3,3)   (2,6)
(1,5)    (3,1)   (2,6)
(1,6)    (3,1)   (2,4)
(2,3)    (1,3)   (0,6)
(2,5)    (1,1)   (0,6)
(2,6)    (1,1)   (0,4)
(3,5)    (3,1)   (6,6)
(3,6)    (3,1)   (6,4)
(5,6)    (1,1)   (6,4)

Table 2

pair     0       1
(0,4)    (1,5)   (4,0)
(1,3)    (3,3)   (2,6)
(1,5)    (3,1)   (2,6)
(2,6)    (1,1)   (0,4)
(3,5)    (3,1)   (6,6)

(c) The language is {w ∈ {0, 1}∗ | w has suffix 0 or 01}. See below.

(Figure: the minimal automaton.)

(7) The finite automaton is given by the transition table below. (The shortest strings that distinguish between each pair of states are given in parentheses.)

state           0            1
→ {a}           {b, d}       {b}          (ε)
∗ {b}           {c}          {b, c}       (1)
∗ {b, d}        {c}          {a, b, c}    (0)
  {c}           {d}          {a}          (10)
∗ {b, c}        {c, d}       {a, b, c}    (11)
∗ {a, b, c}     {b, c, d}    {a, b, c}    (01)
∗ {d}           ∅            {a}          (100)
∗ {c, d}        {d}          {a}          (110)
∗ {b, c, d}     {c, d}       {a, b, c}    (010)
  ∅             ∅            ∅            (1000)

In the above table, → and ∗ denote the start state and final states, respectively.

(8) (a) See the table below. (b) See the table below. (c) See the table below.

state x   E(x)                    E(x)·a        E(x)·b   E(E(x)·a)               E(E(x)·b)
s_1       {s_1, s_2}              {s_4}         ∅        {s_1, s_2, s_3, s_4}    ∅
s_2       {s_2}                   {s_4}         ∅        {s_1, s_2, s_3, s_4}    ∅
s_3       {s_1, s_2, s_3}         {s_2, s_4}    {s_1}    {s_1, s_2, s_3, s_4}    {s_1, s_2}
s_4       {s_1, s_2, s_3, s_4}    {s_2, s_4}    {s_1}    {s_1, s_2, s_3, s_4}    {s_1, s_2}

(d) See the NFA below.

(9) Since L_1 and L_2 are recognizable languages, there are finite automata that accept these languages. Let M_1 = (S_1, A_1, δ_1, F_1, i_1) and M_2 = (S_2, A_2, δ_2, F_2, i_2) be finite automata such that L(M_1) = L_1 and L(M_2) = L_2. Consider the non-deterministic finite automaton M = (S, A, δ, F, i) such that S = S_1 × S_2 × {1, 2}, A = A_1 ∪ A_2, F = F_1 × F_2 × {1}, i = {(i_1, i_2, 1)}, and δ is defined as follows. If a_1 ∈ A_1, then

δ((s_1, s_2, 1), a_1) = {(δ_1(s_1, a_1), s_2, 2)},

and if a_2 ∈ A_2, then

δ((s_1, s_2, 2), a_2) = {(s_1, δ_2(s_2, a_2), 1)},

and for all other inputs, δ returns the empty set.
Observe that M accepts the shuffle of L_1 and L_2. Therefore, the shuffle of two regular languages is also a regular language.
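The construction in (9) can be tried out on small examples. The following Python sketch is illustrative only. The dictionary encoding of a deterministic automaton, the helper names make_shuffle_nfa and accepts, and the two example automata EXAMPLE_M1 and EXAMPLE_M2 are assumptions made here and are not part of the assignment. The sketch builds δ, i, and F as defined above and simulates M on a word. Because the third component of each state forces M to alternate between the two automata, starting with the first, the accepted words interleave a word of L_1 and a word of L_2 one symbol at a time.

```python
def make_shuffle_nfa(m1, m2):
    """Build (delta, initial states, final states) for the automaton M.

    Each input automaton is assumed to be a dict with the keys
    'alphabet', 'delta' (mapping (state, symbol) -> state),
    'start', and 'finals'. This encoding is hypothetical.
    """
    def delta(state, symbol):
        s1, s2, turn = state
        if turn == 1 and symbol in m1["alphabet"]:
            # The first automaton moves; the turn passes to the second.
            return {(m1["delta"][(s1, symbol)], s2, 2)}
        if turn == 2 and symbol in m2["alphabet"]:
            # The second automaton moves; the turn passes back.
            return {(s1, m2["delta"][(s2, symbol)], 1)}
        return set()  # all other inputs

    initial = {(m1["start"], m2["start"], 1)}
    finals = {(f1, f2, 1) for f1 in m1["finals"] for f2 in m2["finals"]}
    return delta, initial, finals


def accepts(nfa, word):
    """Simulate the automaton by tracking the set of reachable states."""
    delta, current, finals = nfa
    for symbol in word:
        current = {t for s in current for t in delta(s, symbol)}
    return bool(current & finals)


# Hypothetical example: L_1 = a* and L_2 = words over {b} of even length.
EXAMPLE_M1 = {
    "alphabet": {"a"},
    "delta": {("q", "a"): "q"},
    "start": "q",
    "finals": {"q"},
}
EXAMPLE_M2 = {
    "alphabet": {"b"},
    "delta": {("even", "b"): "odd", ("odd", "b"): "even"},
    "start": "even",
    "finals": {"even"},
}
```

With these example automata, accepts(make_shuffle_nfa(EXAMPLE_M1, EXAMPLE_M2), "abab") holds, whereas "ab" is rejected because a single b has odd length, and "aabb" is rejected because the symbols do not alternate.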