Chapter 3 Properties of Finite Automata and Regular Expressions

3.1 Minimization of Finite Automata

For a given regular language $L$, it is possible to construct infinitely many finite state automata that accept $L$. The reason is that the transition diagram of a given DFA $M_1$ contains a loop, so we can construct a new DFA $M_2$ with more states such that $L(M_1) = L(M_2) = L$. From $M_2$ we can in turn construct a DFA $M_3$ with even more states and $L(M_3) = L$, and so on. Example 1 illustrates this.

Example 1: The following two DFAs $M_1$ and $M_2$ are equivalent. Both machines accept the set $\{0^n \mid n > 0\}$ over the alphabet $\Sigma = \{0, 1\}$.

[Figure: transition diagrams of $M_1$ (states $q_0$, $q_f$, $q_t$) and $M_2$ (states $q_0$, $q_1$, $q_2$, $q_3$, $q_g$, $q_h$).]

The states $q_1$, $q_2$, and $q_3$ of the machine $M_2$ are equivalent, so we can combine these three states into the single state $q_t$ of the machine $M_1$. The states $q_g$ and $q_h$ of the machine $M_2$ are equivalent, so we can combine these two states into the single state $q_f$ of the machine $M_1$.

For a given regular language $L$, it is possible to find a DFA $M$ with the minimum number of states such that $L(M) = L$. Usually, we first find a DFA $M$ accepting $L$, then find the equivalent states of $M$ and merge them to obtain a new DFA with the minimum number of states.

Definition 1: Two states $p$ and $q$ of a DFA $M$ are equivalent if, for all strings $\omega$ in $\Sigma^*$, the states $\delta(p, \omega)$ and $\delta(q, \omega)$ are both in $F$ or both not in $F$. Two states $p$ and $q$ are distinguishable if they are not equivalent.

Lemma 1: Let $p_1$ and $q_1$ be states of a DFA $M$. If there is a symbol $a$ in $\Sigma$ such that the states $p_2 = \delta(p_1, a)$ and $q_2 = \delta(q_1, a)$ are distinguishable, then $p_1$ and $q_1$ are distinguishable.

Using Lemma 1, we can easily design an algorithm that finds all equivalent states in a DFA and converts it to a DFA with the minimum number of states. This raises the question of whether all DFAs with the minimum number of states accepting the same set are unique up to graph isomorphism.
We shall answer this question later in this section.

For a given DFA $M = (Q, \Sigma, \delta, q_0, F)$, where $Q = \{q_0, q_1, \ldots, q_{n-1}\}$, let $D[i, j] = 0$ denote that the states $q_i$ and $q_j$ are distinguishable, and $D[i, j] = 1$ denote that the states $q_i$ and $q_j$ are equivalent. Let the constants distinguishable = 0 and equivalent = 1.

Algorithm:
    Step 1: $D[i, j] \leftarrow$ equivalent, for $0 \le i, j \le n-1$
    Step 2: for $q_i \in F$ and $q_j \notin F$, do $D[i, j] \leftarrow D[j, i] \leftarrow$ distinguishable
    Step 3: changed $\leftarrow$ false
            for $i \ne j$ with $D[i, j]$ = equivalent do
                for $a \in \Sigma$ do
                    if $q_k = \delta(q_i, a)$, $q_l = \delta(q_j, a)$, and $D[k, l]$ = distinguishable, then
                        $D[i, j] \leftarrow D[j, i] \leftarrow$ distinguishable
                        changed $\leftarrow$ true
    Step 4: if changed = true then goto Step 3

Example 2: Consider the following DFA $M$.

[Figure: transition diagram of the DFA $M$ with states $q_0$, $q_1$, $q_2$, $q_3$, $q_g$, $q_h$, and the minimized DFA with states $q_0$, $q_f$, $q_t$.]

The table $D[i, j]$ computed by the algorithm (lower triangle) is:

          q0   q1   q2   q3   qg
    q1     0
    q2     0    1
    q3     0    1    1
    qg     0    0    0    0
    qh     0    0    0    0    1

The states $q_1$, $q_2$, and $q_3$ are equivalent, and the states $q_g$ and $q_h$ are equivalent.

Program:

    /* D[i][j] == 1: q_i and q_j are equivalent; D[i][j] == 0: distinguishable.
       inF(i) tests whether q_i is final; delta(i, a) is the index of the next state. */
    int i, j, k, l, a, changed;

    /* Step 1: assume every pair is equivalent */
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            D[i][j] = 1;

    /* Step 2: a final and a non-final state are distinguishable */
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            if (inF(i) && !inF(j)) { D[i][j] = 0; D[j][i] = 0; }

    /* Steps 3 and 4: repeat until no new distinguishable pair is found */
    changed = 1;
    while (changed) {
        changed = 0;
        for (i = 0; i < n; i++)
            for (j = 0; j < n; j++)
                for (a = 0; a <= 1; a++)
                    if ((i != j) && D[i][j] == 1) {
                        k = delta(i, a);
                        l = delta(j, a);
                        /* Lemma 1: distinguishable successors make q_i, q_j distinguishable */
                        if (D[k][l] == 0) { D[i][j] = 0; D[j][i] = 0; changed = 1; }
                    }
    }

How do we know that all DFAs with the minimum number of states accepting the same set are unique up to graph isomorphism? We shall prove the Myhill-Nerode theorem to show that the DFA with the minimum number of states is unique. The proof of the Myhill-Nerode theorem is based on the equivalence relation $R_L$: the set of equivalence classes of $R_L$ forms the set of states of the DFA with the minimum number of states. We shall define two equivalence relations, $R_L$ and $R_M$, in order to prove the theorem.

Definition 2: Let $L$ be a language over an alphabet $\Sigma$. Define a relation $R_L$ as follows: for strings $x$ and $y$ in $\Sigma^*$, $x \, R_L \, y$ if, for all strings $z$ in $\Sigma^*$, $xz$ and $yz$ are both in $L$ or both not in $L$.
The relation $R_L$ is an equivalence relation, since for $x$, $y$, and $z$ in $\Sigma^*$, we have (1) $x \, R_L \, x$, (2) $x \, R_L \, y$ implies $y \, R_L \, x$, and (3) $x \, R_L \, y$ and $y \, R_L \, z$ implies $x \, R_L \, z$.

Example 3: Let $L$ be a language over the alphabet $\Sigma = \{0, 1\}$, $L = \{0^n \mid n > 0\}$. Find the equivalence classes of $R_L$.

Solution: The equivalence classes of $R_L$ are:
    $[0] = \{0^n \mid n > 0\}$,
    $[\varepsilon] = \{\varepsilon\}$,
    $[1] = \Sigma^* \setminus \{0^n \mid n \ge 0\}$.
There are 3 equivalence classes of $R_L$. For instance, $0^2 \, R_L \, 0^3$, since (1) if $z = 0^k$ for some $k \ge 0$, then $0^2 z, 0^3 z \in L$, and (2) if $z$ contains at least one 1, then $0^2 z, 0^3 z \notin L$. Therefore, $[0] = \{0^n \mid n > 0\}$.

Example 4: Let $L$ be a language over the alphabet $\Sigma = \{0, 1\}$, $L = \{0^n 1^n \mid n \ge 0\}$. Find the equivalence classes of $R_L$.

Solution: The equivalence classes of $R_L$ are:
    $[0^n] = \{0^n\}$, $n = 1, 2, \ldots$,
    $[0^{n+1} 1] = \{0^n 0^m 1^m \mid m > 0\}$, $n = 1, 2, \ldots$,
    $[01] = \{0^m 1^m \mid m > 0\} = L \setminus \{\varepsilon\}$,
    $[\varepsilon] = \{\varepsilon\}$,
    $[1] = \Sigma^* \setminus \left( \bigcup_{n > 0} \{0^n 0^m 1^m \mid m \ge 0\} \cup L \right)$.
There are infinitely many equivalence classes of $R_L$. For instance, $0^3 1 \, R_L \, 0^4 1^2 \, R_L \, 0^5 1^3 \ldots$, since (1) if $z = 1^2$, then $0^3 1 z = 0^3 1^3$, $0^4 1^2 z = 0^4 1^4$, $0^5 1^3 z = 0^5 1^5, \ldots \in L$, and (2) if $z \ne 1^2$, then $0^3 1 z, 0^4 1^2 z, 0^5 1^3 z, \ldots \notin L$. Therefore, $[0^3 1] = \{0^2 0^m 1^m \mid m > 0\}$. The string $0^2$ itself lies in a separate class: for $z = 0 1^3$ we have $0^2 z = 0^3 1^3 \in L$, but $0^3 1 z \notin L$.

Definition 3: Let $M$ be a DFA. Define a relation $R_M$ as follows: for strings $x$ and $y$ in $\Sigma^*$, $x \, R_M \, y$ if $\delta(q_0, x) = \delta(q_0, y)$.

The relation $R_M$ is an equivalence relation, since for $x$, $y$, and $z$ in $\Sigma^*$, we have (1) $x \, R_M \, x$, (2) $x \, R_M \, y$ implies $y \, R_M \, x$, and (3) $x \, R_M \, y$ and $y \, R_M \, z$ implies $x \, R_M \, z$.

Example 5: Given a DFA $M$ as follows, find the equivalence classes of $R_M$.

[Figure: transition diagram of $M$ with states $q_0$, $q_1$, $q_f$, $q_t$.]

Solution: The equivalence classes of $R_M$ are:
    $[q_0] = \{\varepsilon\}$,
    $[q_1] = \{0\}$,
    $[q_f] = \{0^n \mid n > 1\}$,
    $[q_t] = \Sigma^* \setminus \{0^n \mid n \ge 0\}$.

Lemma 2: Let $L$ be a regular language and $L = L(M)$ for some DFA $M = (Q, \Sigma, \delta, q_0, F)$. Then for a given equivalence class $[q]$ of $R_M$, there is an equivalence class $[\omega]$ of $R_L$ such that $[q] \subseteq [\omega]$. Moreover, each equivalence class $[\omega]$ of the relation $R_L$ is a finite union of equivalence classes of $R_M$.
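Proof: Let $[q]$ be an equivalence class of $R_M$, let $\omega \in [q]$, and let $[\omega]$ be the equivalence class of $R_L$ containing $\omega$. If $x \in [q]$, then $\delta(q_0, x) = q = \delta(q_0, \omega)$. For any string $y \in \Sigma^*$, we have $\delta(q_0, xy) = \delta(q_0, \omega y) = \delta(q, y) = p$. If $p \in F$, then $xy, \omega y \in L$. Otherwise, $xy, \omega y \notin L$. Therefore, $x \, R_L \, \omega$. So we have $x \in [\omega]$ and $[q] \subseteq [\omega]$.

Suppose that $[\omega]$ is an equivalence class of $R_L$. If $\delta(q_0, \omega) \in F$, then $x \in [\omega]$ implies $\delta(q_0, x) \in F$, and if $\delta(q_0, \omega) \notin F$, then $x \in [\omega]$ implies $\delta(q_0, x) \notin F$. The set $S = \{\delta(q_0, x) \mid x \in [\omega]\}$ is finite, since $S \subseteq Q$, and we have either $S \cap F = \emptyset$ or $S \subseteq F$. By the first part of the proof, each class $[q]$ with $q \in S$ is contained in $[\omega]$. Therefore, the equivalence class $[\omega]$ of $R_L$ is a finite union of the equivalence classes $[q]$ of $R_M$, where $q \in S$.

Now we can say that the number of equivalence classes of $R_L$ is less than or equal to that of $R_M$, and that the equivalence classes of $R_M$ are a refinement of the equivalence classes of $R_L$.

Theorem 1 (The Myhill-Nerode Theorem): Let $L$ be a regular language. Then there is a DFA $M = (Q, \Sigma, \delta, q_0, F)$, where
    $Q = \{[\omega] \mid [\omega]$ is an equivalence class of $R_L\}$,
    $q_0 = [\varepsilon]$,
    $F = \{[\omega] \mid [\omega]$ is an equivalence class of $R_L$ and $[\omega] \subseteq L\}$, and
    $\delta([\omega], a) = [\omega a]$ for $a \in \Sigma$,
such that $L(M) = L$ and the number of states is minimum among all DFAs accepting $L$. Moreover, the minimal DFA is unique up to graph isomorphism.

Proof: By Lemma 2, the number of equivalence classes of $R_L$ is finite. Let $Q$ be the set of equivalence classes of $R_L$. Let $F$, $q_0$, and $\delta$ be defined as in the theorem. We have $\omega \in L$ iff $\omega \in L(M)$, since $\delta(q_0, \omega) = \delta([\varepsilon], \omega) = [\varepsilon\omega] = [\omega]$, and $[\omega] \in F$ iff $\omega \in L$. By Lemma 2, every DFA accepting $L$ has at least as many states as $R_L$ has equivalence classes, so $M$ has the minimum number of states.

Example 6: Let $L$ be a language over the alphabet $\Sigma = \{0, 1\}$, $L = \{0^n \mid n > 0\}$. Find a DFA $M$ such that $L(M) = L$.

Solution: The equivalence classes of $R_L$ are:
    $[0] = \{0^n \mid n > 0\}$,
    $[\varepsilon] = \{\varepsilon\}$,
    $[1] = \Sigma^* \setminus \{0^n \mid n \ge 0\}$.
By Theorem 1, the DFA $M$ is as follows.

[Figure: transition diagram of $M$ with states $[\varepsilon]$, $[0]$, $[1]$. On input 0, $[\varepsilon]$ and $[0]$ go to $[0]$; on input 1, $[\varepsilon]$ and $[0]$ go to $[1]$; $[1]$ loops on 0 and 1.]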