Lecture 23
MAS 714
Hartmut Klauck
• Two players, Alice and Bob
• Alice receives an input x 2 § *
• Bob receives y 2 § *
• Alice sends a message to Bob, who decides whether x ± y 2 L
– ± : concatenation, we will also just write xy
• Cost of the game: number of messages used
• Theorem [DFA vs. game]:
– If L is regular, then the optimal number of messages in the game is equal to the smallest number of states in a DFA.
– If L is not regular, then the smallest number of messages is infinite
• How to show that we need many messages?
• Consider the infinite matrix M, rows labeled with x 2 § * and columns labeled with y 2 § * and M[x,y]=1 iff xy 2 L
• Call this matrix the communication matrix of L
• Lemma: The minimum number of messages in the game is equal to the number of distinct rows in the matrix
• Two rows labeled x, x’ are distinct if there is a column y where M[x,y]=1 and M[x’,y]=0
– or vice versa
• If Alice sends the same message on x and x’
Bob cannot distinguish x and x’, so there is an error on xy or on x’y
• On the other hand it is enough for Alice to send the same message for rows that are not distinct
• Parity: M has two distinct rows
– one for x with odd number of 1’s and for even
• {0 n 1 n }:
– Consider the rows labeled 0 k for all k=1,…, 1
– The rows are all different
• Conclusion: {0 n 1 n : n natural} is not a regular language
• Note: We can consider subsets of the rows and columns of M and show that this submatrix has infinitely many distinct rows. Then M also has infinitely many distinct rows
• L={xx: x 2 {0,1}* } is not regular
• Proof: we need to show the number of distinct rows is infinite
• Consider the rows and columns labeled by 0 n 1 for all natural n
• Take two rows labeled by x=0 p 1 and x’=0 r 1
• Clearly there is a 1 in M[x,y] for y=0 p 1 but there is no 1 in M[x’,1]
• Again M contains an infinite identity matrix
• Assume L is regular. There is a DFA M for L.
• Alice simulates M on x, and sends the state of
M reached after reading x to Bob, who continues the simulation
• Bob accepts iff M accepts xy
• Clearly this is a correct protocol, and the number of messages is the size of the set of states
• Now assume that there is a protocol with m messages
• We can assume that Alice uses one message for each distinct row of the matrix, otherwise the protocol is not optimal
– i.e., for two rows that are not distinct she uses the same message
• We construct a DFA with m states corresponding to the m messages
• q
0
[the starting state] is the message that Alice sends on the empty string
• We use m-1 other states, corresponding to messages
• Set of accepting states: messages/states, such that Bob accepts on the message when y is empty string
• Remains to define the transition function
• Assume that on some input x Alice sends message q, and on input xa she sends message q’
• Then we include the transition q,a
q’
• Clearly this defines transitions, but are they consistent?
• I.e., If Alice sends q on x and q’ on xa, and Alice also sends q on x’, then she should send q’ on x’a
• But x and x’ have the same row, hence xa and x’a also have the same row [if not, there is some y such that xay 2 L and x’ay not in L, hence rows of x and x’ not the same]
• Hence the transition function is well-defined.
• We get a DFA with m states for L
• The minimum number of messages in our communication game is usually called the
Myhill-Nerode index
• Theorem: 2-way DFA can recognize only regular languages
• Note:
– Sometimes it is easier to show that L is regular by using a 2-way DFA
– Sometimes 2-way DFA need fewer states
– Proof: We show that the Myhill-Nerode index is finite.
• L={w: third last symbol of w is 1}
• 2-way DFA: move to the right end of the input, move 3 steps left, accept if there is a 1
• L rev for a regular language L is the set of w, such that w read in reverse is in L
– Move to the right end and then simulate the DFA for L while reading the input backwards
• Proof: We show that the Myhill-Nerode index is finite.
• Language L is decided by a 2-way DFA M, where M has n states
• Consider words w=xy
• Suppose the tape-head of M comes back from y into x while M is in state q, and the same happens later in state q’
– q
q’ otherwise there is an infinite loop
• Hence the states on which M crosses from y to x are distinct, and there are at most n such crossings
• Simulation in the communication game:
• We describe the communication protocol
• Let S=[q
1
,...,q m
] be a sequence of any m · n distinct states of M
• Alice simulates M on x, assuming that when M first leaves x it later returns to x in state q
1
– Etc: If M leaves x for the k-th time it returns to x in state q k
• Alice’s message: for every possible S Alice sends the m states in which M leaves x (for even m), and additionally if she would accept, for odd m
• Bob finds the sequence S that is consistent with M’s behavior on y and Alice’s message (S is the correct sequence occurring on M and xy)
• Bob accepts if M accepts
• It is easy to see that this protocol is correct, and uses no more than
O(n 2n ) messages, which is finite
• Definition: a two-way NFA is a nondeterministic
Turing machine that cannot write and cannot leave the tape area that contains the input.
• Theorem: Two-way NFA can decide only regular languages
• Proof: Same as for two-way DFA, Alice and Bob check consistency with respect to the nondeterministic A
– Is there a computation that leads to the communicated sequence of states and accepts?
• Is there a language, for which 2-way NFA need much fewer states than 2-way DFA?
• We know L is regular, if there is a DFA with a finite number of states for L
• Minimum number of states is the Myhill-Nerode
Index
• Questions:
1.
How can we find a minimal DFA for L?
2.
Is the minimal DFA unique (up to renaming states)?
3. Are NFA sometimes smaller than minimal DFA?
4. Are two-way DFA sometimes smaller than DFA?
• Consider the language
L={xi : x 2 {0,1} n and i 2 {0,1} log n and x n-i
=1}
– n is fix, this is a finite language, hence regular
• It is easy to see that there are at least 2 n rows in the communication matrix
– Consider only columns for the different i=n-1,…,0
– Rows labeled x contains the string x at entries i=n-1…0
• Hence any DFA for L has at least 2 n states
• Can give a 2-way DFA with O(n 2 ) states:
– go to the right end of the input, read i into the state, move i steps left, accept if there is a 1
• Consider L={xy: x,y 2 {0,1} n , x
y}
– This is a FINITE language, n is fixed
• Exercise: there is an NFA with O(n 2 ) states for L
• DFA-size: the matrix has more than 2 n distinct rows
– DFA size is exponential
• Hence NFA can be exponentially smaller than
DFA for some languages
• Example where they are not: complement of L
• Uniqueness of the minimal DFA (up to vertex names) follows from the Myhill-Nerode characterization
• The optimal DFA has exactly one state for each set of equal rows of the communication matrix
• Edges/the transition function are also determined uniquely, i.e., if x is in set q and xa is in set q’ then there must be an edge labeled a from q to q’