Mathematical Foundations

Finite Automata (FA)

(Based on Chapter 5 of the textbook Cohen (1997) Introduction to Computer Theory, 2nd Edition, New York, NY: John Wiley & Sons)

Consider a wall switch that turns a light on and off. If the switch is up, the light is on. If it is down, the light is off. There are two possible states the switch can be in. Now consider a hair dryer with two switches that can each be either up or down. The hair dryer can be in one of four states at any point in time (up/up, up/down, down/up, down/down).

Now consider a game of tic-tac-toe. Each possible combination of X's and O's on a board is a different state. In tic-tac-toe the empty board is always the starting state. At each step of the game one player makes a move, which consists of putting an X or an O (depending on whose turn it is) into one of the 9 possible positions. If we think of a game of tic-tac-toe as a machine that receives input, the inputs consist of a character and a position. The machine starts in the empty state and changes states every time it receives an input. Certain states are special in that when they are reached, the game is over. We call such states final states. Certain final states mean a win for X and others mean a win for O. Still others mean the game is a draw. Instead of the term final state, we might use terminal state or accept state, which is the preferred term.

The general model of such a machine is called a finite automaton. The word finite refers to the fact that the machine has a finite number of states. Formally we define a finite automaton as follows. A finite automaton is a collection of three things:

1. A finite set of states, one of which is a start state, and some of which may be designated as accept states.
2. An alphabet Σ.
3. A finite set of transitions that tell for each state and for each letter of the alphabet which state to go to next.
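The three-part definition above maps naturally onto a small data structure. The following is a minimal sketch in Python, not a definitive implementation. The class name FiniteAutomaton, the helper missing_transitions, and the two-state example machine are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiniteAutomaton:
    """Illustrative container for the three parts of the definition."""
    states: frozenset       # 1. a finite set of states ...
    start: str              #    ... one of which is the start state
    accepts: frozenset      #    ... some of which are accept states
    alphabet: frozenset     # 2. the alphabet (Sigma)
    transitions: dict       # 3. (state, letter) -> next state


def missing_transitions(fa):
    """Return the (state, letter) pairs that have no valid next state.

    An empty result means every state has a transition for every letter,
    as part 3 of the definition requires.
    """
    return sorted(
        (s, c)
        for s in fa.states
        for c in fa.alphabet
        if fa.transitions.get((s, c)) not in fa.states
    )


# Hypothetical two-state machine over {a, b} (not one of the textbook figures).
toy = FiniteAutomaton(
    states=frozenset({"p", "q"}),
    start="p",
    accepts=frozenset({"q"}),
    alphabet=frozenset({"a", "b"}),
    transitions={("p", "a"): "q", ("p", "b"): "p",
                 ("q", "a"): "q", ("q", "b"): "p"},
)

print(missing_transitions(toy))  # an empty list when the table is complete
```

Keeping the transitions in a dictionary keyed by (state, letter) makes the "one next state for each state and each letter" requirement easy to check.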
The input to a finite automaton is a string of letters from the alphabet, finite in length, that is read left to right one letter at a time. Beginning in the start state, the machine reads a letter and makes a transition, over and over, until the string ends. If this process ends with the machine in an accept state, we say that the string is accepted. Otherwise the string is rejected.

We draw a Finite Automaton (FA) in the following manner: the machine's states are represented by circles and transitions are represented by arrows. Every transition is marked by a symbol from the alphabet Σ. The start state is marked by a – sign and the final states are marked by a + sign, as shown in the FA for a*ba*.

The finite automaton given above has one start state. Of course, every finite automaton has exactly one start state by definition. The above finite automaton has one final state, although finite automata are allowed to have zero or more final states. The set of strings accepted by a finite automaton is referred to as the language accepted by the finite automaton (this language can also be defined by a regular expression).

The above finite automaton accepts the language defined by a*ba*. Intuitively, the machine starts at the start state and can take the loop transition marked by a zero or more times. It then takes the arrow marked by b to reach the final state, where it can take the loop transition marked by a as many times as it likes, finishing at the final state marked by +. If the transition marked by b is taken from the final state, the machine reaches the non-final state that has two loops. Both loops return to that same state, so the machine can never leave it. This state adds nothing to the language the machine accepts, because a finite automaton's accepted language is defined by tracing the paths from the start state to the final state(s). Paths ending in non-final states do not contribute to the accepted language of the machine.
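The reading procedure and the a*ba* machine described above can be sketched in Python. This is an illustration under stated assumptions: the state names start, final, and dead are labels invented here (the drawing uses only the – and + marks), and the transitions are reconstructed from the prose description rather than copied from the figure.

```python
# Transitions reconstructed from the prose description of the a*ba* machine.
# The state names are hypothetical labels, not taken from the textbook figure.
TRANSITIONS = {
    ("start", "a"): "start",  # loop on a before the b
    ("start", "b"): "final",  # the single b
    ("final", "a"): "final",  # loop on a after the b
    ("final", "b"): "dead",   # a second b leads to the non-final state
    ("dead", "a"): "dead",    # the two loops on the non-final state
    ("dead", "b"): "dead",
}
START = "start"
ACCEPT_STATES = {"final"}


def run(word):
    """Return the list of states visited while reading word left to right."""
    state = START
    visited = [state]
    for letter in word:
        state = TRANSITIONS[(state, letter)]
        visited.append(state)
    return visited


def accepts(word):
    """A word is accepted if reading it ends in an accept state."""
    return run(word)[-1] in ACCEPT_STATES


for w in ["b", "aabaa", "", "aaa", "abab"]:
    print(repr(w), run(w), "accepted" if accepts(w) else "rejected")
```

Printing the visited states makes it easy to see where a rejected string such as abab falls into the non-final state and stays there.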
The above notation for finite automata is given in Daniel I. Cohen, 1997, Introduction to Computer Theory, 2nd Edition. Finite automata can be drawn with different notations. We draw another Finite Automaton (FA) below, which accepts any string from ab*.

Note that the Non-Deterministic Finite Automaton (NFA) for the same regular expression, ab*, given below, looks simpler. Why? Because to meet the definition of an FA there must be one transition for every element of Σ from each state, whereas an NFA has no such requirement.

We draw another finite automaton, which accepts all strings from a*bb*aa*(ba*bb*aa*)*.

Note that some books show the start state with an incoming arrow that is not attached to any other state (instead of a – sign). Accept states or final states are often drawn as double concentric circles in some books (instead of a + sign). States are given names (0, 1, 2) to make discussions of the behavior of the machine easier. Transitions are shown as labeled arcs from one state to another or from a state back to itself. In the above machine, if the machine is in the start state 0 and it reads an a, the transition leaves state 0 and reenters state 0. If it reads a b from state 0, the machine enters state 1. In an FA, each state has a transition leaving it for each letter of the alphabet.

The above machine, when given the string bbabba, will end up in state 2 and accept the string. If it reads the string abbbab, it ends up in state 0 and does not accept the string. We say that the machine rejects abbbab.

Note that the author of the textbook uses a slightly different schema for drawing finite automata. He puts a minus sign in the start state and plus signs in the accept states.

The set of strings accepted by a finite automaton is referred to as the language accepted by the finite automaton. We might describe a finite automaton as a language recognizer, whereas a regular expression is a language generator. For each finite automaton there is a regular expression that defines the same language.
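The traces of bbabba and abbbab, and the claim that a regular expression defines the same language as the machine, can be checked mechanically. The sketch below makes several assumptions: the transitions follow the transition table the text gives for this three-state machine, state 2 is taken to be the only accept state (inferred from the traces, not read from the figure), and Python's re syntax stands in for the regular expression notation. Agreement on all strings up to some length is evidence, not a proof.

```python
import re
from itertools import product

# Rows are states, columns are letters, as in the machine's transition table.
TABLE = {
    0: {"a": 0, "b": 1},
    1: {"a": 2, "b": 1},
    2: {"a": 2, "b": 0},
}
START = 0
ACCEPT_STATES = {2}  # assumption: inferred from the traces described in the text

# The expression a*bb*aa*(ba*bb*aa*)* written in Python's re syntax.
PATTERN = re.compile(r"a*bb*aa*(?:ba*bb*aa*)*")


def final_state(word):
    """Follow the table from the start state and return the last state."""
    state = START
    for letter in word:
        state = TABLE[state][letter]
    return state


def fa_accepts(word):
    return final_state(word) in ACCEPT_STATES


def regex_accepts(word):
    return PATTERN.fullmatch(word) is not None


def disagreements(max_len):
    """List every string over {a, b} up to max_len on which the two differ."""
    return [
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product("ab", repeat=n)
        if fa_accepts("".join(letters)) != regex_accepts("".join(letters))
    ]


print(final_state("bbabba"), fa_accepts("bbabba"))
print(final_state("abbbab"), fa_accepts("abbbab"))
print(disagreements(8))
```

An empty list from disagreements means the recognizer and the generator agreed on every string tested.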
Later we will learn an algorithm for determining the regular expression, but sometimes we can figure it out using our common sense. Look back at the above finite automaton. A regular expression corresponding to that machine is a*bb*aa*(ba*bb*aa*)*. Note that the portion before the parentheses moves you from the start state to the accept state. The portion in the parentheses moves from the accept state back to the accept state, and this expression is "starred" because you can repeat this cycle as many times as necessary.

Another way to represent a finite automaton is with a transition table. Here is the table for the above machine. The rows correspond to states, the columns correspond to characters from the alphabet, and the cell contents correspond to the transitions of the machine.

State   a   b
0       0   1
1       2   1
2       2   0

A drawing of a finite automaton is easier for a human to understand than a table, but implementing a machine with a computer program usually means storing the finite automaton's transitions in a table.

There is not just one finite automaton for a given language. Typically it is possible to draw the finite automaton in any of several different ways. There are also languages for which it is not possible to draw any finite automaton. For example, there is no FA for L3 = { a^n b^n : n > 0 }. Later we will show that any language that can be described by a regular expression can also be described by a finite automaton and vice versa.

A regular expression for the above machine is (baa + abb)*. Can you find a regular expression for this next one?

If you try running some strings through the most recent machine, you may notice that what it takes to get to the accept state is to see either two a's in a row or two b's in a row. So a regular expression for the machine is (a+b)*(aa+bb)(a+b)*.

Let's try one more. Notice that this machine will accept any string consisting entirely of a's. It will also accept any string in which the number of b's is divisible by 3.
Since the start state is not an accept state, the machine does not accept the null string Λ. A regular expression for the machine is (a + ba*ba*b)(a + ba*ba*b)*.

How do you create a finite automaton that will accept any string with an odd number of a's? As the machine reads a's, you need to move back and forth between two states, one where the number of a's so far is even and one where the number is odd. Since we are not concerned with the number of b's, each state should have a self-transition for b (a transition that leaves and enters the same state). Which of the two states should be the start state? Since 0 is even and at the beginning we have seen 0 a's, the start state should be the state that corresponds to having seen an even number of a's. The other state, the one for having seen an odd number of a's, is the accept state. A machine that accepts strings with an even number of a's would look just like the one for odd a's except that the other state is the accept state.

What if we are concerned about the parity (oddness or evenness) of both a's and b's? For example, we might want to accept strings that contain an odd number of a's and an even number of b's. To figure out how many states we need, we must ask how many different combinations of odd and even there are for two letters. The answer is four: even/even, even/odd, odd/even, and odd/odd. The even/even state is the start state. Which states are accept states depends on the language we wish to accept. If we want strings that contain an odd number of a's and an even number of b's, then we make the odd/even state the only accept state. If we want strings in which the parity of a's matches the parity of b's, then we make both even/even and odd/odd accept states.

Here is one final example of a finite automaton. What strings are accepted by this one? There are two accept states in this machine. In order to get to the top one, the first letter of the string must be a and the last letter must be b.
In order to get to the bottom one, the first letter must be b and the last one a. So this machine accepts strings that start and end with different letters. There is no way to do this job with a single accept state. The machine must remember what the first letter was by taking one or the other of the transitions out of the start state. Once the machine has taken either the "high road" or the "low road" it cannot leave that portion of the machine or it would "forget" the first letter.

The following Non-deterministic Finite Automaton accepts any string from the regular expression b*aba*b. Non-deterministic Finite Automata differ from Deterministic FA (or simply FA) because they do not require one outgoing transition for each element of Σ from every state. However, it follows from Kleene's Theorem that Non-deterministic Finite Automata are equivalent to Deterministic FA: both accept exactly the same class of languages. The FA (i.e., the Deterministic FA) for b*aba*b is given below:

Chapter 6 - Transition Graphs

In this chapter we define a new type of finite state machine called a transition graph. These machines were invented in 1957 by John Myhill in order to simplify the proof of a theorem known as Kleene's Theorem. A Transition Graph (TG) is a nondeterministic finite automaton with null transitions. The following features of TGs are emphasized:

1. We allow the labels on the transitions of a transition graph to be strings in addition to single characters (although we prefer single characters).
2. We allow the null string Λ as a transition label. This is interpreted as a "free move" from one state to another, a move that does not involve reading any input.
3. We allow more than one start state.
4. We allow the machine to read more than one character at a time.
5. There can be two transitions leaving a state that both have the same label.

Here is an example of a transition graph that illustrates some of the quirks of such machines. In this machine there are two different accepting paths that the machine may take on the string baab.
There is no path at all for strings other than baab, so the string abb is rejected, as are all other strings except baab. If any path for a particular input leads to an accept state, we say the string is accepted. What if one path for string X leads to an accept state and one to a non-accept state? Then that string is accepted.

Self-Checking Exercises: (You are encouraged to do these exercises. You are not required to submit them.)

1. Construct a Finite Automaton that accepts only the word baa.
2. Construct a Transition Graph that accepts only the word baa.
3. Build a Transition Graph that accepts only those words that have more than 4 letters.
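One way to check answers to exercises like these is to simulate a transition graph in code. The following Python sketch searches every possible path, which matches the rule that a string is accepted if any path leads to an accept state. The function name tg_accepts, the representation of edges as (source, label, target) triples, and the example graph are all hypothetical. The example is deliberately not a solution to the exercises: it accepts strings of the form (ab)*b.

```python
def tg_accepts(start_states, accept_states, edges, word):
    """Return True if some path consumes all of word and ends in an accept state.

    edges is a list of (source, label, target) triples. A label may be several
    letters long, and the empty string "" stands for a null (free) move.
    """
    # A configuration is (current state, number of letters consumed so far).
    pending = [(s, 0) for s in start_states]
    seen = set(pending)
    while pending:
        state, pos = pending.pop()
        if pos == len(word) and state in accept_states:
            return True
        for source, label, target in edges:
            if source == state and word.startswith(label, pos):
                nxt = (target, pos + len(label))
                if nxt not in seen:  # avoids looping forever on null-move cycles
                    seen.add(nxt)
                    pending.append(nxt)
    return False


# Hypothetical example graph (not a solution to the exercises): accepts (ab)*b.
EDGES = [
    ("s", "ab", "s"),  # a two-letter label read in one step
    ("s", "", "m"),    # a null move that reads no input
    ("m", "b", "f"),
]

for w in ["b", "abb", "ababb", "", "ab", "ba"]:
    print(repr(w), tg_accepts({"s"}, {"f"}, EDGES, w))
```

Tracking configurations already seen keeps the search finite even when the graph contains a cycle of null moves.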