Chapter 10: Modeling Computation

Review: We have seen phrase-structure grammars G = (V, T, S, P): formal languages that can be generated by a grammar G [L(G)] or stated as a grammar [G] in special form.
  V: alphabet
  T: terminals
  S: start symbol
  P: production rules

[ex: V = {Sm, 0, 1}, T = {0, 1}, S = Sm, and P = {Sm → λ, Sm → 1Sm00}, and G = (V, T, S, P) generates all bit strings starting with n 1's followed by 2n 0's for n ≥ 0]

Can determine if a string is in L(G) or not: 1100; 100; 10; 111000000 (100 and 111000000 are in L(G); 1100 and 10 are not)
or can generate strings in L(G):
  Sm ⇒ 1Sm00 ⇒ 11Sm0000 ⇒ 111Sm000000 ⇒ 111λ000000 = 111000000

Note: this is { 1^n 0^(2n) | n ≥ 0 }, so this is a context-free or type 2 language/grammar.

Recall: types of grammars are characterized by restrictions on productions:
  type 0; phrase-structure grammar: no restrictions
  type 1; context-sensitive: LHS no longer than RHS
  type 2; context-free: LHS is always a single nonterminal
  type 3; regular: LHS is a nonterminal and RHS is aB or a, or the production is S → λ
and regular ⊂ context-free ⊂ context-sensitive ⊂ phrase-structure.

And finite automata with output: theoretical machines that process strings by changing states and outputting symbols. [Mealy machines have output associated with each transition - this is what we saw on Friday; Moore machines associate output with each state]

M = (S, I, O, f, g, s0)
  S, a finite set of states
  I, a finite input alphabet
  O, a finite output alphabet
  f, a transition function that takes <currentState, inputSymbol> to a new state
  g, an output function that assigns each <currentState, inputSymbol> an output symbol
  s0, the initial (start) state

Start in the initial state s0 with an input string of symbols; read it symbol by symbol and use the transition function f to determine the next state while the output function g determines the output symbol.

Suppose we want to add two bit strings, using subscripts to mark which number each bit belongs to: 0₁1₁0₁1₁1₁ + 0₂1₂1₂1₂0₂ = 11001 (that is, 11 + 14 = 25). To process with a machine, we need to reverse the numbers [1₁1₁0₁1₁0₁ and 0₂1₂1₂1₂0₂] (padding the shorter number with zeros if there is one) and then interleave the numbers: 1₁0₂1₁1₂0₁1₂1₁1₂0₁0₂.
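The reverse, pad, and interleave steps above can be sketched in Python. This is only an illustration: the function name interleave_for_adder is hypothetical (not from the notes), and the bit strings are assumed to be written high-order bit first.

```python
def interleave_for_adder(x: str, y: str) -> str:
    """Reverse two bit strings, pad to equal length, and interleave them.

    Assumes x and y are written high-order bit first, e.g. "01011".
    """
    width = max(len(x), len(y))
    # Pad the shorter number with leading zeros; its value is unchanged.
    x, y = x.zfill(width), y.zfill(width)
    # Reverse so that the low-order bits come first.
    rx, ry = x[::-1], y[::-1]
    # Interleave: a bit of the first number, then the matching bit of the second.
    return "".join(a + b for a, b in zip(rx, ry))


print(interleave_for_adder("01011", "01110"))  # prints 1011011100
```

Reading the result two symbols at a time gives the input pairs 10, 11, 01, 11, 00, one pair per bit position starting from the low-order end.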
Now we have the low-order digits at the beginning of the string, ready to add; we just have to build our machine. Notice that we'll have to know whether we've got a carry bit to add in, so we'll use our states to let us know: s0 means no carry, s1 means a carry of 1. Each input symbol is a pair of bits, one from each number.

             f (next state)           g (output)
             input                    input
  state      00   01   10   11        00   01   10   11
  s0         s0   s0   s0   s1        0    1    1    0
  s1         s0   s1   s1   s1        1    0    0    1

This machine can also be represented graphically.

We can build a machine which recognizes languages. For example, we can create a machine which will recognize when three consecutive 1's occur in an input string, and output a 1 when this occurs. Any output string with a 1 in it indicates there were three consecutive 1's in the input, and if the output string ends in a 1, there were three 1's at the end of the input string:
  s0 - start
  s1 - seen one 1
  s2 - seen two 1's
  s3 - seen three 1's

MAT 2345, Chapter 10 Page 1 of 3

Section 10.3: Finite State Machines with No Output

Recall: V* is the set of all strings over the alphabet V.

Definition. If A and B ⊆ V*, then the concatenation of A and B [AB] is the set of all strings of the form xy, where x is a string in A and y is a string in B.

Notes: this definition is for SETS of strings, and AB is not necessarily the same as BA.

Ex: A = {11, 101}, B = {0, 00, 110}; then AB = {110, 1100, 11110, 1010, 10100, 101110} while BA = {011, 0101, 0011, 00101, 11011, 110101}.

Definition. A^n is defined recursively as: A^0 = {λ} and A^(n+1) = A^n A for n ≥ 0.

EX: A = {11, 0}; then
  A^0 = {λ}
  A^1 = A^0 A = A = {11, 0}
  A^2 = A^1 A = {11, 0}{11, 0} = {1111, 110, 011, 00}
  A^3 = A^2 A = {1111, 110, 011, 00}{11, 0} = {111111, 11110, 11011, 1100, 01111, 0110, 0011, 000}

Definition. If A ⊆ V*, the Kleene closure of A [A*] is the set consisting of concatenations of arbitrarily many strings from A. That is, the union of A^k as k goes from 0 to infinity.

EX: Kleene closures
  A = {0}; then A* = {0^n | n ≥ 0}
  B = {0, 1}; then B* = V*, the set of all strings over the alphabet V = {0, 1}
  C = {11}; then C* = {1^(2n) | n ≥ 0}

Definition.
Finite State Automaton (finite state machine with no output): M = (S, I, f, s0, F) consists of
  S, a finite set of states
  I, a finite input alphabet
  f, a transition function from <currentState, inputSymbol> to nextState
  s0, an initial state
  F, a subset of S, the set of final or accepting states

When we process an input string symbol by symbol, beginning in s0 and using f to go from state to state, the machine accepts the string if we are left in a FINAL state at the end of the input string.

Definition. An FSA is deterministic if for each <currentState, inputSymbol> pair there is a unique nextState given by the transition function (f). [There is no "fuzziness" about where we're going!]

Definition. An FSA is nondeterministic if there may be more than one possible nextState for a pair of state and input value.

Definition. A Nondeterministic FSA M = (S, I, f, s0, F) consists of
  S, a set of states
  I, an input alphabet
  f, a transition function which assigns to each <currentState, inputSymbol> pair a set of states
  s0, a start state
  F, a subset of S, consisting of the final states

Theorem. NFSA and FSA are equivalent. I.e., if a language L is recognized by an NFSA, then L is also recognized by some FSA.

(Kleene closure example: C = {11}; then C* is the set of strings consisting of an even number of 1's.)

Method for finding the deterministic equivalent of a nondeterministic machine: consider combinations (subsets) of states from the NFSA as states for the FSA.

Section 10.4: Language Recognition

Regular sets can be built up from the null set, the empty string, and singleton strings by taking concatenations, unions, and Kleene closures (in arbitrary order). We will see that regular sets are exactly those which are recognizable by an FSA.

  0(0 ∪ 1)* = any string beginning with a 0
  (0*1)* = any string not ending in 0 (i.e., ends in a 1 or is the empty string)

Kleene's Theorem. A set is regular IFF it is recognized by a finite state automaton.
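The method described above for finding the deterministic equivalent of a nondeterministic machine can be sketched in Python. This is a minimal illustration, not the course's own procedure: the function names and the dictionary encoding of f are assumptions, and the sample NFSA (strings over {0, 1} ending in 1) is a hypothetical example that does not come from the notes.

```python
def nfa_to_dfa(alphabet, delta, start, finals):
    """Build a deterministic machine whose states are sets of NFSA states.

    delta maps (state, symbol) -> set of next states; a missing key means
    the NFSA has no move for that pair.
    """
    finals = frozenset(finals)
    dfa_start = frozenset([start])
    dfa_delta, dfa_finals = {}, set()
    todo, seen = [dfa_start], {dfa_start}
    while todo:
        current = todo.pop()
        # A combined state is final if it contains any NFSA final state.
        if current & finals:
            dfa_finals.add(current)
        for symbol in alphabet:
            # Collect every NFSA state reachable from any member of current.
            nxt = frozenset(
                t for s in current for t in delta.get((s, symbol), set())
            )
            dfa_delta[(current, symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return dfa_delta, dfa_start, dfa_finals


def dfa_accepts(dfa_delta, dfa_start, dfa_finals, string):
    """Run the deterministic machine and report whether it ends in a final state."""
    state = dfa_start
    for symbol in string:
        state = dfa_delta[(state, symbol)]
    return state in dfa_finals


# Hypothetical NFSA: on a 1 in q0 it may stay in q0 or guess "last symbol" (q1).
nfa_delta = {("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"}}
machine = nfa_to_dfa("01", nfa_delta, "q0", {"q1"})
print(dfa_accepts(*machine, "011"), dfa_accepts(*machine, "10"))  # True False
```

Only the combined states actually reachable from the start state are generated, so the result here has two states ({q0} and {q0, q1}) rather than all four subsets.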
To prove Kleene's Theorem, we would have to show:
  ==> if a set is regular, it is recognized by an FSA
  <== if a set is recognized by an FSA, it is regular

Definition. The regular expressions over a set I are defined recursively by:
  the symbol ∅ is a regular expression;
  the symbol λ is a regular expression;
  the symbol x is a regular expression whenever x ∈ I;
  the symbols (AB), (A ∪ B), and A* are regular expressions whenever A and B are regular expressions.
where:
  ∅ represents the empty set - the set with no strings
  λ represents the set {λ}, the set containing the empty string
  x represents the set {x} containing the string with one symbol, x
  (AB) represents the concatenation of the sets represented by A and by B
  (A ∪ B) represents the union of the sets represented by A and by B
  A* represents the Kleene closure of the set represented by A

Sets represented by regular expressions are called regular sets.

Strings in the regular sets specified by regular expressions:
  10* = a 1 followed by any number of 0's, including no 0's
  (10)* = any number of copies of 10 (including the null string)
  0 ∪ 01 = the string 0 or the string 01
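The three example expressions can be checked with Python's re module. This is a rough sketch under one assumption: the union symbol ∪ is translated to Python's |, while concatenation and * are written the same way. Python's pattern syntax is much richer than the regular expressions defined here; only this small subset is used, and the helper name in_regular_set is hypothetical.

```python
import re

# Expressions from the notes, translated to Python pattern syntax.
EXAMPLES = {
    "10*": r"10*",
    "(10)*": r"(10)*",
    "0 ∪ 01": r"0|01",
}


def in_regular_set(pattern: str, string: str) -> bool:
    """True if the whole string belongs to the set the pattern represents."""
    return re.fullmatch(pattern, string) is not None


samples = ["", "1", "10", "100", "1010", "0", "01"]
for name, pattern in EXAMPLES.items():
    members = [s for s in samples if in_regular_set(pattern, s)]
    print(name, "contains", members)
```

re.fullmatch is used rather than re.search or re.match because membership in a regular set concerns the entire string, not a substring or prefix.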