DEPARTMENT OF CSE COURSE MATERIAL
U4CSA10 THEORY OF COMPUTATION — SEMESTER: IV
PREPARED BY R. KAVITHA, A.P/CSE

U4CSA10 THEORY OF COMPUTATION    L T P C: 3 0 0 3

OBJECTIVES
To have an understanding of finite state and pushdown automata.
To have a knowledge of regular languages and context-free languages.
To know the relation between regular languages, context-free languages and their corresponding recognizers.
To study the Turing machine and classes of problems.

UNIT I (8 periods)
Languages and Problems: Symbols – Alphabets and strings – Languages – Operations on languages – Alphabetical coding – Types of problems – Representation of graphs – Spanning trees – Decision problems – Function problems – Security problems – Enumeration – Regular expressions – Applications of regular expressions

UNIT II (9 periods)
Fundamental Machines: Basic machine notation – Deterministic Finite Automata (DFA) – Nondeterministic Finite Automata (NFA) – Equivalence of DFA and NFA – Properties of finite-state languages – Machines for five language operations – Closure under complement, union, intersection, concatenation and Kleene star – Equivalence of regular expressions and DFA – Pumping lemma for regular languages – Applications of the pumping lemma

UNIT III (9 periods)
Fundamental Machines: Pushdown automata – Turing machines – Deterministic Turing machine – Multiple-work-tape Turing machine – Nondeterministic Turing machine – Equivalence of deterministic and nondeterministic Turing machines – Undecidable languages – Relation among classes – Grammars – Regular grammars – Context-free grammars – Closure properties of context-free grammars – Parsing with nondeterministic pushdown automata – Parsing with deterministic pushdown automata – Parse trees

UNIT IV (10 periods)
Computational Complexity: Asymptotic notations – Time and space complexity – Simulations – Reducibility – Circuit complexity – Boolean circuit model of computation – Circuit resources – Examples

UNIT V (9 periods)
Polynomial time – P-completeness theory – Examples of P-completeness
– General machine simulation – NAND circuit value problem – Circuit problems and reductions.
TOTAL: 45 periods

TEXT BOOKS
1. John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman, "Introduction to Automata Theory, Languages and Computation", PEA, Second Edition, 2001.
REFERENCE BOOKS
1. Greenlaw, Hoover, "Fundamentals of the Theory of Computation: Principles and Practice", Morgan Kaufmann Publishers, 1998.

LESSON PLAN
Faculty: R. Kavitha (Faculty ID TTS1435), A.P/CSE. Subject: Theory of Computation (U4CSA10). Semester IV, Year II, Sections A & B. The plan covers 59 hours from 16/12/2013 to 8/4/2014; delivery is by chalk & board, with occasional PPT sessions.
Unit I: Introduction; symbols and alphabets; strings; languages; operations on languages; alphabetical coding; types of problems; representation of graphs; spanning trees and enumerations (PPT); decision problems; function problems; security problems and revision; seminar and tutorial.
Unit II: Fundamental machines; basic machine notation; finite automata; DFA and NFA and properties of languages; equivalence of DFA and NFA; machines for five language operations; closure under complement, union, intersection, concatenation and Kleene star (PPT); equivalence of regular expressions and DFA; pumping lemma for regular languages; seminar and tutorial.
Unit III: Fundamental machines; pushdown automata; Turing machines; deterministic Turing machine; multiple-work-tape Turing machine; nondeterministic Turing machine; equivalence of deterministic and nondeterministic Turing machines; undecidable languages; relation among classes, grammars and regular grammars; context-free grammars; closure properties of context-free grammars; tutorial and seminar.
Unit IV: Introduction to computational complexity, with examples; asymptotic notations; time and space complexity; reducibility and simulations; circuit complexity; Boolean circuit model of computation; seminar and tutorial.
Unit V: Polynomial time; examples of P problems; examples of P-completeness; examples of NP-completeness; general machine simulation; NAND circuit value problem; reduction; circuit problems; seminar and tutorial.

Unit 1: Languages
An alphabet is a finite, nonempty set of symbols. We use Σ to denote an alphabet. Note: symbols may be more than one English letter long; e.g. "while" is a single symbol in Pascal.
A string is a finite sequence of symbols from Σ.
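Alphabets and strings can be modeled directly in Python; the following sketch (illustrative, not from the text) uses a set for Σ and ordinary Python strings for strings, listing every string over Σ up to a given length.

```python
from itertools import product

# Illustrative sketch: an alphabet is a finite, nonempty set of symbols;
# a string is a finite sequence of symbols drawn from it.
sigma = {"a", "b"}

def strings_up_to(alphabet, n):
    """All strings over `alphabet` of length 0..n (a finite slice of the
    set of all strings over the alphabet)."""
    result = [""]                      # the empty string has length zero
    for length in range(1, n + 1):
        result += ["".join(p) for p in product(sorted(alphabet), repeat=length)]
    return result

print(strings_up_to(sigma, 2))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```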
The length of a string s, denoted |s|, is the number of symbols in it. The empty string, denoted ε, is the string of length zero.
Σ* denotes the set of all strings composed of zero or more symbols of Σ. Σ+ denotes the set of all strings composed of one or more symbols of Σ; that is, Σ+ = Σ* - {ε}.
A language is a subset of Σ*.
The concatenation of two strings is formed by joining the sequence of symbols in the first string with the sequence of symbols in the second string. If a string S can be formed by concatenating two strings A and B, S = AB, then A is called a prefix of S and B is called a suffix of S.
The reverse of a string S, written S^R, is obtained by reversing the sequence of symbols in the string. For example, if S = abcd then S^R = dcba.
Any string that belongs to a language is said to be a word or a sentence of that language.

Operations on Languages
Languages are sets, so any operation that can be performed on sets can be performed on languages. If L, L1 and L2 are languages, then:
L1 ∪ L2 is a language.
L1 ∩ L2 is a language.
L1 - L2 is a language.
The complement of L, defined as Σ* - L, is a language.
In addition:
L1L2, the concatenation of L1 and L2, is a language (the strings of L1L2 are the strings that have a word of L1 as a prefix and a word of L2 as a suffix).
L^n, the concatenation of L with itself n times, is a language.
L* = {ε} ∪ L ∪ LL ∪ LLL ∪ …, the star closure of L, is a language.
L+ = L ∪ LL ∪ LLL ∪ …, the positive closure of L, is a language.

Sets
A set is a collection of "things" called the elements or members of the set. It is essential to have a criterion for determining, for any given thing, whether it is or is not a member of the given set; this criterion is called the membership criterion of the set. There are two common ways of indicating the members of a set:
List all the elements, e.g. {a, e, i, o, u}.
Provide some sort of algorithm or rule, such as a grammar.

Notation
To indicate that x is a member of set S, we write x ∈ S. We denote the empty set (the set with no members) by {} or ∅.
If every element of set A is also an element of set B, we say that A is a subset of B and write A ⊆ B. If every element of A is also an element of B, but B has some elements not contained in A, we say that A is a proper subset of B and write A ⊂ B.

Operations on Sets
The union of sets A and B, written A ∪ B, is the set containing everything that is in A, or in B, or in both.
The intersection of sets A and B, written A ∩ B, is the set containing exactly those elements that are in both A and B.
The complement of a set A, written Ā (A with a bar drawn over it), is the set containing everything that is not in A. This is almost always used in the context of some universal set U that contains "everything" (meaning everything we are interested in at the moment); then Ā is shorthand for U - A.
The cardinality of a set A, written |A|, is the number of elements in A.
The power set of a set Q, written 2^Q, is the set of all subsets of Q. The notation suggests the fact that a set containing n elements has a power set containing 2^n elements.
Two sets are disjoint if they have no elements in common, that is, if A ∩ B = ∅.

Relations and Functions
A relation on sets S and T is a set of ordered pairs (s, t), where s ∈ S (s is a member of S) and t ∈ T. S and T need not be different. The set of all first elements is the domain of the relation, and the set of all second elements is the range of the relation.
A relation is a function if every element of S occurs once and only once as a first element of the relation. A relation is a partial function if every element of S occurs at most once as a first element of the relation.

Functions and Relations
Consider two sets, which we will call S and T. There are several ways in which S and T may be related. The sets may be identical: every element of one set is also an element of the other.
The sets may be disjoint: no element belongs to both sets, so there is no overlap between them. Set S may be a proper subset of set T: every element of S is also an element of T, but T has some elements that are not in S. Likewise, set T may be a proper subset of set S. Or the sets may overlap: some elements are in both S and T without either being a proper subset of the other (each set contains some elements that are not in the other set).

Relations
However, this page isn't really about sets; it is about relations and functions. For clarity, all our examples use disjoint sets (see figure), but the same definitions apply regardless of the relationship between the two sets; the only difference is that with disjoint sets the examples are easier to follow.
Suppose we have two sets, S and T. A relation on S and T is a set of ordered pairs, where the first thing in each pair is an element of S and the second thing is an element of T. For example, suppose S is the set {A, B, C, D, E} while T is the set {W, X, Y, Z}. Then one relation on S and T is {(A, Y), (C, W), (C, Z), (D, Y)}. There are four ordered pairs in the relation. We can draw this as four arrows going from S to T (see Figure 2): one from A to Y, one from C to W, one from C to Z, and one from D to Y.
The purpose of this page is just to define terms. Giving names to things is not useful unless you can later use those names to talk about the things; the terms defined here are used throughout mathematics and are pretty important, but we do not go into any of that here. We just define terms.
The most important of the terms we will define is "function". You have probably seen this word defined in algebra or calculus, and you may think this is another meaning for the same word. It is important to realize that there is only one meaning in mathematics for the word "function", and this is it.
Moreover, the definition given here is the "best" definition because, since it is given in terms of sets, it is the most general and most applicable; any other definition is just a special case. Anyway, to continue:
The domain of a relation on S and T is the set of elements of S that appear as the first element in an ordered pair of the relation. In the relation {(A, Y), (C, W), (C, Z), (D, Y)} the domain is {A, C, D}. If you look at Figure 2, these are the elements of S that have arrows coming out of them.
The range of a relation on S and T is the set of elements of T that appear as the second element in an ordered pair of the relation. In the same relation the range is {W, Y, Z}. In Figure 2, these are the elements of T that have arrows pointing to them. For some reason the word "codomain" has become popular as a synonym for "range".
In Figure 2, not every element of T has an arrow pointing to it; there are some elements of T, in particular the element X, that do not occur as the second element of any ordered pair. We say that the relation is into T.
Now suppose we have a relation in which every element of T occurs at least once as the second element of an ordered pair. For example, the relation {(A, Y), (B, X), (C, W), (C, Z), (D, Y)} (see Figure 3) is just like the previous relation, except that it also contains the ordered pair (B, X). This relation is onto: every element of set T has at least one arrow pointing to it.
While the word "onto" has a precise definition (the range of the relation is the set T itself), the word "into" is not usually so well defined. "Into" could be used to mean "not onto" (there is at least one element of T that does not have an arrow pointing to it), or it could mean "not necessarily onto" (there might be some element of T that does not have an arrow pointing to it).
Different authors might choose to define "into" in different ways, define it imprecisely, or not define it at all.
Suppose we put in every possible arrow from S to T; that is, from each element of S we draw an arrow to each element of T (see Figure 4). This "largest possible relation" is called the Cartesian product of S and T. Every other relation on S and T is a subset of the Cartesian product. How is it possible for a relation to be a subset of another relation? Remember that a relation is just a set of ordered pairs, or, in our pictures, a set of arrows.
Next, we say that a relation is one-to-one, or 1-1, if no element of S occurs more than once as the first element of an ordered pair, and no element of T occurs more than once as the second element of an ordered pair. In other words, no element of S or T has more than a single arrow attached to it (see Figure 5). This definition holds even when S and T are not disjoint, but then the picture is a little more confusing: an element that is in both S and T could have a single arrow attached to it, as before, but at both ends.

Functions
Now we get to what is perhaps the most important term. Suppose every element of S occurs exactly once as the first element of an ordered pair; in our pictures, every element of S has exactly one arrow coming from it. This kind of relation is called a function. Another word for function is mapping: a function is said to map an element in its domain to an element in its range.
Here are some important facts about a function from S to T:
1. Every element of S is in the domain of the function; that is, every element of S is mapped to some element in the range.
2. If some element of S has no mapping arrow, then the relation is sometimes called a partial function, but it is not a function.
3. No element in the domain maps to more than one element in the range.
4. The mapping is not necessarily onto; some elements of T may not be in the range.
5. The mapping is not necessarily 1-1; some elements of T may have more than one element of S mapped to them.
6. S and T need not be disjoint.
To tell whether a relation on S and T is a function, you can ignore T altogether and just look at S: if every element of S has one and only one arrow coming out of it, then the relation is a function.

Kinds of Functions
Three kinds of functions are important enough to have special names: surjection, injection and bijection.
A surjection, or onto function, from S to T is a function whose range is T itself. That is, every element of T has at least one arrow pointing to it, so the relation is onto; Figure 6 is an example of a surjection. Since the domain of a function from S to T is the entire set S (by the definition of function), and since the range of a surjective function from S to T is the entire set T, every element of both sets is in the relation and has an arrow connected to it. Note also that if there is a surjection from S to T, then S may have more elements than T, or the same number of elements, but it cannot have fewer elements than T: every element of S has exactly one arrow emanating from it, and every element of T needs at least one arriving.
An injection, or one-to-one function, from S to T is a function that is one-to-one. By the definition of function, every element of S has exactly one arrow emanating from it, no more, no less; being one-to-one additionally requires that every element of T have no more than one arrow pointing to it, though there may be elements of T with no arrows pointing to them at all. Figure 7 shows an injection. However, we had to modify the sets a little to get this example: a little thought shows that with an injection from S to T, T must have at least as many elements as S. To get our example we removed some elements from S; we could equally well have added elements to T.
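These kinds of functions have a direct computational reading. The sketch below (the sets and pairs are illustrative, not the text's figures) classifies a relation, given as a set of ordered pairs, as a function, injection, surjection, or both (a bijection).

```python
# Illustrative sketch: classifying a relation from S to T,
# represented as a Python set of (first, second) pairs.

def is_function(rel, S):
    """Every element of S occurs exactly once as a first element."""
    firsts = [s for (s, _) in rel]
    return all(firsts.count(s) == 1 for s in S)

def is_injection(rel, S, T):
    """A function in which no element of T is hit twice."""
    seconds = [t for (_, t) in rel]
    return is_function(rel, S) and len(set(seconds)) == len(seconds)

def is_surjection(rel, S, T):
    """A function whose range is all of T."""
    return is_function(rel, S) and {t for (_, t) in rel} == set(T)

def is_bijection(rel, S, T):
    """Both 1-1 and onto."""
    return is_injection(rel, S, T) and is_surjection(rel, S, T)

S, T = {"A", "B", "C"}, {"X", "Y", "Z"}
f = {("A", "X"), ("B", "Y"), ("C", "Z")}   # 1-1 and onto
g = {("A", "X"), ("B", "X"), ("C", "Z")}   # a function, but not 1-1 or onto
print(is_bijection(f, S, T))   # True
print(is_bijection(g, S, T))   # False
```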
Finally, a function that is both 1-1 and onto is called a bijection. Such a function maps each and every element of S to exactly one element of T, with no elements left over; the figure shows a bijection. Again we had to modify the sets a little, because with a bijection the sets S and T must have exactly the same number of elements. A bijection is particularly interesting because this kind of function has an inverse: if you took a bijection from S to T and reversed the direction of all the arrows, you would have a bijection from T to S. This new function is the inverse of the original function.

Primitive Regular Expressions
A regular expression can be used to define a language. A regular expression represents a "pattern": strings that match the pattern are in the language, and strings that do not match the pattern are not. As usual, the strings are over some alphabet Σ.
The following are the primitive regular expressions:
x, for each x ∈ Σ;
ε, the empty string; and
∅, indicating no strings at all.
Thus if |Σ| = n, there are n + 2 primitive regular expressions defined over Σ.
Here are the languages defined by the primitive regular expressions:
For each x ∈ Σ, the primitive regular expression x denotes the language {x}; that is, the only string in the language is the string "x".
The primitive regular expression ε denotes the language {ε}; the only string in this language is the empty string.
The primitive regular expression ∅ denotes the language {}; there are no strings in this language.

Regular Expressions
Every primitive regular expression is a regular expression. We can compose additional regular expressions by applying the following rules a finite number of times:
If r1 is a regular expression, then so is (r1).
If r1 is a regular expression, then so is r1*.
If r1 and r2 are regular expressions, then so is r1r2.
If r1 and r2 are regular expressions, then so is r1 + r2.
Here's what the above notation means. Parentheses are just used for grouping.
The postfix star indicates zero or more repetitions of the preceding regular expression: if x ∈ Σ, then the regular expression x* denotes the language {ε, x, xx, xxx, …}.
The plus sign, read as "or", denotes the language containing the strings described by either of the component regular expressions: if x, y ∈ Σ, then the regular expression x + y describes the language {x, y}.
Precedence: * binds most tightly, then juxtaposition (concatenation), then +. For example, a + bc* denotes the language {a, b, bc, bcc, bccc, bcccc, …}.

Languages Defined by Regular Expressions
There is a simple correspondence between regular expressions and the languages they denote:

Regular expression      L(regular expression)
x (for each x ∈ Σ)      {x}
ε                       {ε}
∅                       {}
(r1)                    L(r1)
r1*                     (L(r1))*
r1r2                    L(r1)L(r2)
r1 + r2                 L(r1) ∪ L(r2)

Building Regular Expressions
Here are some hints on building regular expressions. We assume Σ = {a, b, c}.
Zero or more: a* means "zero or more a's". To say "zero or more ab's", that is {ε, ab, abab, ababab, …}, you need to say (ab)*. Don't say ab*, because that denotes the language {a, ab, abb, abbb, abbbb, …}.
One or more: since a* means "zero or more a's", you can use aa* (or equivalently a*a) to mean "one or more a's". Similarly, to describe "one or more ab's", that is {ab, abab, ababab, …}, you can use ab(ab)*.
Zero or one: you can describe an optional a with (a + ε).
Any string at all: with Σ = {a, b, c}, you can use (a + b + c)*.
Any nonempty string: any character from Σ followed by any string at all, (a + b + c)(a + b + c)*.
Any string not containing…: to describe any string that doesn't contain an a, use (b + c)*.
Any string containing exactly one…: to describe any string that contains exactly one a, put "any string not containing an a" on either side of the a, like this: (b + c)*a(b + c)*.

Example Regular Expressions
Give regular expressions for the following languages over Σ = {a, b, c}.
All strings containing exactly one a:
(b + c)*a(b + c)*
All strings containing no more than three a's. We can describe a string containing zero, one, two or three a's (and nothing else) as (ε + a)(ε + a)(ε + a); we then put in (b + c)* around each:
(b + c)*(ε + a)(b + c)*(ε + a)(b + c)*(ε + a)(b + c)*
All strings which contain at least one occurrence of each symbol in Σ. The problem here is that we cannot assume the symbols come in any particular order, and we have no way of saying "in any order", so we have to list the possible orders:
abc + acb + bac + bca + cab + cba
To make it easier to see what's happening, let's put an X in every place we want to allow an arbitrary string:
XaXbXcX + XaXcXbX + XbXaXcX + XbXcXaX + XcXaXbX + XcXbXaX
Finally, replacing the X's with (a + b + c)* gives the final, unwieldy answer:
(a+b+c)*a(a+b+c)*b(a+b+c)*c(a+b+c)* +
(a+b+c)*a(a+b+c)*c(a+b+c)*b(a+b+c)* +
(a+b+c)*b(a+b+c)*a(a+b+c)*c(a+b+c)* +
(a+b+c)*b(a+b+c)*c(a+b+c)*a(a+b+c)* +
(a+b+c)*c(a+b+c)*a(a+b+c)*b(a+b+c)* +
(a+b+c)*c(a+b+c)*b(a+b+c)*a(a+b+c)*
All strings which contain no runs of a's of length greater than two. We can fairly easily build an expression containing no a, one a, or one aa: (b + c)*(ε + a + aa)(b + c)*. But if we want to repeat this, we need to be sure to have at least one non-a between repetitions:
(b + c)*(ε + a + aa)((b + c)(b + c)*(ε + a + aa))*
All strings in which all runs of a's have lengths that are multiples of three:
(aaa + b + c)*

Converting Primitive Regular Expressions to NFAs
Every NFA we construct will have a single start state and a single final state. We will build more complex NFAs out of simpler NFAs, each with a single start state and a single final state. The simplest NFAs are those for the primitive regular expressions.
For any x ∈ Σ, the regular expression x denotes the language {x}; the NFA for x represents exactly that language. Note that if this were a DFA, we would have to include arcs for all the other elements of Σ.
The regular expression ε denotes the language {ε}, that is, the language containing only the empty string.
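As an aside (not part of the text), the worked example expressions above can be sanity-checked with Python's re module: Python writes "|" where the text writes "+", the character class [bc] abbreviates (b + c), and "a?" plays the role of (ε + a).

```python
import re

# Sanity checks for the example expressions over the alphabet {a, b, c}.
exactly_one_a    = r"[bc]*a[bc]*"                 # exactly one a
at_most_three_as = r"[bc]*a?[bc]*a?[bc]*a?[bc]*"  # no more than three a's
runs_multiple_3  = r"(aaa|b|c)*"                  # runs of a's are multiples of 3

print(bool(re.fullmatch(exactly_one_a, "bacb")))       # one a: True
print(bool(re.fullmatch(exactly_one_a, "bb")))         # no a: False
print(bool(re.fullmatch(at_most_three_as, "abaca")))   # three a's: True
print(bool(re.fullmatch(at_most_three_as, "aaaa")))    # four a's: False
print(bool(re.fullmatch(runs_multiple_3, "aaabcaaa"))) # runs of 3: True
```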
The regular expression ∅ denotes the empty language: no strings belong to this language, not even the empty string. In the NFA for ∅ the final state is unreachable. Since the final state is unreachable, why bother to have it at all? The answer is that it simplifies the construction if every NFA has exactly one start state and one final state. We could do without this final state, but we would have more special cases to consider, and it doesn't hurt anything to include it.

UNIT 2
DFAs are:
Deterministic: there is no element of choice.
Finite: only a finite number of states and arcs.
Acceptors (automata): they produce only a yes/no answer.
A deterministic finite automaton (acceptor), or DFA, is a quintuple M = (Q, Σ, δ, q0, F), where
Q is a finite set of states,
Σ is a finite set of symbols (the input alphabet),
δ: Q × Σ → Q is the transition function,
q0 ∈ Q is the initial state, and
F ⊆ Q is the set of final states.
A DFA is drawn as a graph, with each state represented by a circle.
Start state: one designated state is the start state.
Final state: some states, possibly including the start state, can be designated as final states.
State transition arc: arcs between states represent state transitions; each such arc is labeled with the symbol that triggers the transition.

Example DFA
Let Q = {q0, q1, q2, q3}, Σ = {0, 1}, F = {q0}, with the transition diagram given by
δ(q0, 0) = q2, δ(q0, 1) = q1; δ(q1, 0) = q3, δ(q1, 1) = q0;
δ(q2, 0) = q0, δ(q2, 1) = q3; δ(q3, 0) = q1, δ(q3, 1) = q2.

Operation
Start with the "current state" set to the start state and a "read head" at the beginning of the input string. While there are still characters in the string: read the next character and advance the read head; from the current state, follow the arc that is labeled with the character just read; the state that the arc points to becomes the next current state. When all characters have been read, accept the string if the current state is a final state; otherwise reject the string.

Sample trace on input 100100:
q0 -1-> q1 -0-> q3 -0-> q1 -1-> q0 -0-> q2 -0-> q0
Since q0 is a final state, the string is accepted.

Implementing a DFA
If you don't object to the goto statement, there is an easy way to implement a DFA:
q0: read char;
    if eof then accept string;
    if char = 0 then goto q2;
    if char = 1 then goto q1;
q1: read char;
    if eof then reject string;
    if char = 0 then goto q3;
    if char = 1 then goto q0;
q2: read char;
    if eof then reject string;
    if char = 0 then goto q0;
    if char = 1 then goto q3;
q3: read char;
    if eof then reject string;
    if char = 0 then goto q1;
    if char = 1 then goto q2;

Implementing a DFA, part 2
If you are not allowed to use a goto statement, you can fake it with a combination of a loop and a case statement:

state := q0;
loop
  case state of
    q0: read char;
        if eof then accept string;
        if char = 0 then state := q2;
        if char = 1 then state := q1;
    q1: read char;
        if eof then reject string;
        if char = 0 then state := q3;
        if char = 1 then state := q0;
    q2: read char;
        if eof then reject string;
        if char = 0 then state := q0;
        if char = 1 then state := q3;
    q3: read char;
        if eof then reject string;
        if char = 0 then state := q1;
        if char = 1 then state := q2;
  end case;
end loop;

Nondeterministic Finite Acceptors
Formal definition of a nondeterministic finite automaton (NFA): M = (Q, Σ, δ, q0, F), where
Q is a finite set of states,
Σ is a finite set of symbols (the input alphabet),
δ: Q × (Σ ∪ {ε}) → 2^Q is the transition function,
q0 ∈ Q is the initial state, and
F ⊆ Q is the set of final states.
These are all the same as for a DFA except for the definition of δ: transitions on ε are allowed in addition to transitions on elements of Σ, and the range of δ is 2^Q rather than Q. This means that the values of δ are not elements of Q, but rather are subsets of Q.
A finite state automaton can be nondeterministic in either or both of two ways:
A state may have two or more arcs emanating from it labeled with the same symbol. When that symbol occurs in the input, either arc may be followed.
A state may have one or more arcs emanating from it labeled with the empty string ε. These arcs may optionally be followed without looking at the input or consuming an input symbol.
Due to nondeterminism, the same string may cause an NFA to end up in one of several different states.
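This "several different states" behaviour can be made concrete with a small sketch (the machine below is illustrative, not one of the text's examples): a set of possible states is carried along the input, following ε-arcs after each symbol.

```python
# Illustrative sketch: from q0, symbol "a" may lead to q0 or q1,
# and q1 has an ε-move to q2.  Missing (state, symbol) pairs mean
# "no transition" (the empty set).
EPS = ""  # we label ε-transitions with the empty string

delta = {
    ("q0", "a"): {"q0", "q1"},
    ("q1", "b"): {"q1"},
    ("q1", EPS): {"q2"},
}

def eps_closure(states):
    """All states reachable from `states` by ε-transitions alone."""
    stack, closure = list(states), set(states)
    while stack:
        for t in delta.get((stack.pop(), EPS), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def step(states, symbol):
    """One input symbol: follow labeled arcs, then ε-arcs."""
    nxt = set()
    for s in states:
        nxt |= delta.get((s, symbol), set())
    return eps_closure(nxt)

current = eps_closure({"q0"})
for ch in "ab":
    current = step(current, ch)
print(sorted(current))   # ['q1', 'q2']
```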
Some of these possible ending states may be final while others are not; the string is accepted if any possible ending state is a final state.

Example NFA
Let M = (Q, Σ, δ, q0, F) with Q = {q0, q1, q2, q3, q4}, Σ = {a, b, c}, and δ given by the following transition table (a blank entry denotes ∅; note that a single (state, symbol) pair may yield a set of possible next states):

State   a           b           c
q0      {q0, q1}    {q0, q1}    {q0, q2}
q1      {q1, q3}    {q1}        {q1}
q2      {q2}        {q2, q4}    {q2}
q3      {q3}

Implementing an NFA
If you think of an automaton as a computer, how does it handle nondeterminism? There are two ways that this could, in theory, be done:
1. When the automaton is faced with a choice, it always (magically) chooses correctly. We sometimes think of the automaton as consulting an oracle which advises it as to the correct choice.
2. When the automaton is faced with a choice, it spawns a new process, so that all possible paths are followed simultaneously.
The first of these alternatives, using an oracle, is sometimes attractive mathematically. But if we want to write a program to implement an NFA, it isn't feasible. There are three ways, two feasible and one not yet feasible, to simulate the second alternative:
1. Use a recursive backtracking algorithm. Whenever the automaton has to make a choice, cycle through all the alternatives and make a recursive call to determine whether any of the alternatives leads to a final state.
2. Maintain a state set or a state vector, keeping track of all the states that the NFA could be in at any given point in the string.
3. Use a quantum computer. Quantum computers explore literally all possibilities simultaneously; they are theoretically possible, but are at the cutting edge of physics, and it may (or may not) be feasible to build such a device.

Recursive Implementation of NFAs
An NFA can be implemented by means of a recursive search, from the start state, for a path (directed by the symbols of the input string) to a final state. Here is a rough outline of such an implementation.
function nfa(state A) returns Boolean:
    local state B, symbol x;
    for each ε-transition from state A to some state B do
        if nfa(B) then return True;
    if there is a next symbol then {
        read next symbol x;
        for each x-transition from state A to some state B do
            if nfa(B) then return True;
        return False;
    }
    else {
        if A is a final state then return True
        else return False;
    }

One problem with this implementation is that it could get into an infinite loop if there is a cycle of ε-transitions. This could be prevented by maintaining a simple counter.

State-Set Implementation of NFAs
Another way to implement an NFA is to keep either a state set or a bit vector of all the states that the NFA could be in at any given time. Implementation is easier if you use a bit vector approach (v(i) is True iff state i is a possible state), since most languages provide vectors, but not sets, as a built-in data type. However, it is a bit easier to describe the algorithm if you use a state set approach, so that is what we will do; the logic is the same in either case.

function nfa(state set A) returns Boolean:
    local state set B, state a, state b, state c, symbol x;
    for each a in A do
        for each ε-transition from a to some state b do
            add b to A;
    while there is a next symbol do {
        read next symbol x;
        B := {};
        for each a in A do
            for each x-transition from a to some state b do
                add b to B;
        for each ε-transition from some state b in B
                to some state c not in B do
            add c to B;
        A := B;
    }
    if any element of A is a final state then return True
    else return False;

Conversion from NFA to DFA
Consider the following NFA. What states can it be in (in the NFA) before reading any input? Obviously the start state A; but there is an ε-transition from A to B, so we could also be in state B. For the DFA we therefore construct the composite state (A, B). State (A, B) lacks a transition for x.
From A, x takes us to A in the NFA, and the ε-transition might then take us on to B; from B, x takes us to B. So in the DFA, x takes us from (A, B) to (A, B).
State (A, B) also needs a transition for y. In the NFA, δ(A, y) = {C} and δ(B, y) = {C}, so we need to add a state (C) and an arc y from (A, B) to (C).
In the NFA, δ(C, x) = {A}, but the ε-transition might or might not then take us on to B, so the x arc from (C) goes to (A, B). From C on y we can reach either B or C, so we need to add the state (B, C) and an arc y from (C) to this new state.
In the NFA, δ(B, x) = {B} and δ(C, x) = {A}, and by an ε-transition we might get back to B, so we need an x arc from (B, C) to (A, B). δ(B, y) = {C}, while from C on y we reach either B or C, so we add an arc labeled y from (B, C) to (B, C).
We now have a transition from every state for every symbol. The only remaining chore is to mark all the final states: B is final in the NFA, so in the DFA every composite state containing B is a final state.

The Pumping Lemma
Here's what the pumping lemma says:
If an infinite language is regular, it can be defined by a DFA.
The DFA has some finite number of states, say n.
Since the language is infinite, some strings of the language must have length greater than n.
For a string of length greater than n accepted by the DFA, the walk through the DFA must contain a cycle.
Repeating the cycle an arbitrary number of times must yield another string accepted by the DFA.
Thus the pumping lemma for regular languages is a way of proving that a given infinite language is not regular. The pumping lemma cannot be used to prove that a given language is regular.
The proof is always by contradiction. A brief outline of the technique is as follows:
Assume the language L is regular.
By the pigeonhole principle, any sufficiently long string in L must repeat some state in the DFA; thus the walk contains a cycle.
Show that repeating the cycle some number of times ("pumping" the cycle) yields a string that is not in L.
Conclude that L is not regular.
Why this is hard: we don't know the DFA (if we did, the language would be regular). Thus, we have to do the proof for an arbitrary DFA that accepts L.
- Since we don't know the dfa, we certainly don't know the cycle.

Why we can sometimes pull it off:
- We get to choose the string, but it must be in L.
- We get to choose the number of times to "pump".

Applying the Pumping Lemma
Here is a more formal statement of the pumping lemma: if L is an infinite regular language, then there exists some positive integer m such that any string w ∈ L whose length is m or greater can be decomposed into three parts, w = xyz, where |xy| ≤ m, |y| ≥ 1, and wi = xy^i z is also in L for all i = 0, 1, 2, 3, ....

Here is what it all means:
- m is a (finite) number chosen so that strings of length m or greater must contain a cycle. Hence m must be equal to or greater than the number of states in the dfa. Remember that we don't know the dfa, so we can't actually choose m; we just know that such an m must exist.
- Since string w has length greater than or equal to m, we can break it into two parts, xy and z, such that xy must contain a cycle. We don't know the dfa, so we don't know exactly where to make this break, but we know that |xy| can be less than or equal to m.
- We let x be the part before the cycle, y the cycle, and z the part after the cycle. (It is possible that x and z contain cycles, but we don't care about that.) Again, we don't know exactly where to make this break.
- Since y is the cycle we are interested in, we must have |y| ≥ 1; otherwise it isn't a cycle.
- By repeating y an arbitrary number of times, xy*z, we must get other strings in L.
- If, despite all the above uncertainties, we can show that the dfa has to accept some string that we know is not in the language, then we can conclude that the language is not regular.

To use this lemma, we need to show:
1. For any choice of m,
2. for some w ∈ L that we get to choose (and we will choose one of length at least m),
3. for any way of decomposing w into xyz, so long as |xy| isn't greater than m and y isn't empty,
4. we can choose an i such that xy^i z is not in L.
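For a language we already know is regular, the lemma's guarantee can be observed directly: walk a DFA, find the first repeated state, and pump the substring read around that repeat. A minimal sketch (the DFA encoding and function names are illustrative, not from the text):

```python
def pump_decomposition(delta, start, s):
    """Split s into x, y, z where y is the substring read around the
    first repeated state of the DFA walk -- the cycle the pumping
    lemma talks about.  delta maps (state, symbol) -> state."""
    seen = {start: 0}          # state -> position at which first seen
    state = start
    for i, ch in enumerate(s):
        state = delta[(state, ch)]
        if state in seen:      # state repeated: a cycle has been traced
            j = seen[state]
            return s[:j], s[j:i + 1], s[i + 1:]
        seen[state] = i + 1
    return None                # string shorter than the number of states

def dfa_accepts(delta, start, finals, s):
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state in finals

# DFA over {a} accepting strings of even length (2 states, so m = 2).
even = {(0, 'a'): 1, (1, 'a'): 0}
x, y, z = pump_decomposition(even, 0, 'aaaa')   # here y = 'aa'
```

Pumping y any number of times keeps the string in this language, exactly as the lemma promises; for a non-regular language no decomposition survives every i, which is what the four-step game above exploits.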
We can view this as a game wherein our opponent makes moves 1 and 3 (choosing m and choosing the decomposition xyz) and we make moves 2 and 4 (choosing w and choosing i). Our goal is to show that we can always beat our opponent. If we can show this, we have proved that L is not regular.

Pumping Lemma Example 1
Prove that L = {a^n b^n : n ≥ 0} is not regular.
1. We don't know m, but assume there is one.
2. Choose a string w = a^n b^n where n > m, so that any prefix of length m consists entirely of a's.
3. We don't know the decomposition of w into xyz, but since |xy| ≤ m, xy must consist entirely of a's. Moreover, y cannot be empty.
4. Choose i = 0. This has the effect of dropping |y| a's out of the string without affecting the number of b's. The resultant string has fewer a's than b's, hence does not belong to L.
Therefore L is not regular.

Pumping Lemma Example 2
Prove that L = {a^n b^k : n > k and n ≥ 0} is not regular.
1. We don't know m, but assume there is one.
2. Choose a string w = a^n b^k where n > m, so that any prefix of length m consists entirely of a's, and k = n − 1, so that there is just one more a than b.
3. We don't know the decomposition of w into xyz, but since |xy| ≤ m, xy must consist entirely of a's. Moreover, y cannot be empty.
4. Choose i = 0. This has the effect of dropping |y| a's out of the string without affecting the number of b's. The resultant string has fewer a's than before, so it has either fewer a's than b's or the same number of each. Either way, the string does not belong to L, so L is not regular.

Pumping Lemma Example 3
Prove that L = {a^n : n is a prime number} is not regular.
1. We don't know m, but assume there is one.
2. Choose a string w = a^n where n is a prime number and n = |xyz| > m + 1. This can always be done because there is no largest prime number. Any prefix of w consists entirely of a's.
3. We don't know the decomposition of w into xyz, but since |xy| ≤ m, it follows that |xz| = n − |y| ≥ n − m > 1. As usual, |y| ≥ 1.
4. Choose i = |xz|. Then |xy^i z| = |xz| + i|y| = |xz| + |xz||y| = |xz|(1 + |y|). Since |xz| and (1 + |y|) are each greater than 1, the product must be a composite number.
Thus |xy^i z| is composite, so xy^i z is not in L; hence L is not regular.

Closure Properties of Regular Sets
Closure
A set is closed under an operation if, whenever the operation is applied to members of the set, the result is also a member of the set. For example, the set of integers is closed under addition, because x + y is an integer whenever x and y are integers. However, the integers are not closed under division: if x and y are integers, x/y may or may not be an integer.

We have defined several operations on languages:
- L1 ∪ L2: strings in either L1 or L2
- L1L2: strings composed of one string from L1 followed by one string from L2
- ¬L1: all strings (over the same alphabet) not in L1
- L1*: zero or more strings from L1 concatenated together
- L1 − L2: strings in L1 that are not in L2
- L1^R: strings in L1, reversed

We will show that the set of regular languages is closed under each of these operations. We will also define the operations of "homomorphism" and "right quotient" and show that the set of regular languages is closed under these operations as well.

Closure II: Union, Concatenation, Negation, Kleene Star, Reverse
General approach:
- Build automata (dfas or nfas) for each of the languages involved.
- Show how to combine the automata to create a new automaton that recognizes the desired language.
- Since the language is represented by an nfa or dfa, conclude that the language is regular.

Union of L1 and L2: Create a new start state. Make a λ-transition from the new start state to each of the original start states.
Concatenation of L1 and L2: Put a λ-transition from each final state of L1 to the initial state of L2. Make the original final states of L1 nonfinal.
Negation of L1: Start with a (complete) dfa, not with an nfa. Make every final state nonfinal and every nonfinal state final.
Kleene star of L1: Make a new start state; connect it to the original start state with a λ-transition.
Make a new final state; connect the original final states (which become nonfinal) to it with λ-transitions. Connect the new start state and new final state with a pair of λ-transitions.
Reverse of L1: Start with an automaton with just one final state. Make the initial state final and the final state initial. Reverse the direction of every arc.

Closure III: Intersection and Set Difference
Just as with the other operations, you prove that regular languages are closed under intersection and set difference by starting with automata for the initial languages and constructing a new automaton that represents the operation applied to them. However, the constructions are somewhat trickier. In these constructions you form a completely new machine whose states are each labeled with an ordered pair of state names: the first element of each pair is a state from L1, and the second element is a state from L2. (Usually you won't need a state for every such pair, just some of them.)
1. Begin by creating a start state whose label is (start state of L1, start state of L2).
2. Repeat the following until no new arcs can be added:
   a. Find a state (A, B) that lacks a transition for some x in Σ.
   b. Add a transition on x from state (A, B) to the state whose label pairs the x-successor of A with the x-successor of B. (If this state doesn't already exist, create it.)
The same construction is used for both intersection and set difference; the distinction is in how the final states are selected.
Intersection: mark a state (A, B) as final if both (i) A is a final state of L1 and (ii) B is a final state of L2.
Set difference: mark a state (A, B) as final if A is a final state of L1 but B is not a final state of L2.

Closure IV: Homomorphism
Note: "homomorphism" is a term borrowed from group theory; what we refer to here as a "homomorphism" is really a special case. Suppose Σ and Γ are alphabets (not necessarily distinct).
Then a homomorphism h is a function from Σ to Γ*. If w is a string in Σ*, then we define h(w) to be the string obtained by replacing each symbol x ∈ Σ by the corresponding string h(x) ∈ Γ*. If L is a language on Σ, then its homomorphic image is a language on Γ. Formally, h(L) = {h(w) : w ∈ L}.

Theorem: If L is a regular language on Σ, then its homomorphic image h(L) is a regular language on Γ. That is, if you replaced every string w in L with h(w), the resultant set of strings would be a regular language on Γ.
Proof: Construct a dfa representing L. This is possible because L is regular. For each arc in the dfa, replace its label x with h(x). If an arc is now labeled with a string of length greater than one, replace the arc with a series of arcs and (new) states, so that each arc is labeled with a single element of Γ or with λ. The result is an nfa that recognizes exactly the language h(L). Since the language h(L) can be specified by an nfa, the language is regular.

Closure V: Right Quotient
Let L1 and L2 be languages on the same alphabet. The right quotient of L1 with L2 is L1/L2 = {w : wx ∈ L1 and x ∈ L2}. That is, the strings of L1/L2 are strings from L1 "with their tails cut off": if some string of L1 can be broken into two parts, w and x, where x is in language L2, then w is in language L1/L2.

UNIT III
Deterministic TM (DTM)
A DTM is a quintuple (Q, Σ, Γ, δ, s), where
– the set of states Q is finite, not containing the halt state h,
– the input alphabet Σ is a finite set of symbols, not including the blank symbol,
– the tape alphabet Γ is a finite set of symbols containing Σ, but not including the blank symbol,
– the start state s is in Q, and
– the transition function δ is a partial function from Q × (Γ ∪ {blank}) to (Q ∪ {h}) × (Γ ∪ {blank}) × {L, R, S}.

Nondeterministic TM (NTM)
An NTM starts working and stops working in the same way as a DTM.
Each move of an NTM can be nondeterministic. The NTM reads the symbol under its tape head, and according to the transition relation on the symbol read and its current state, it chooses one move nondeterministically to:
– write a symbol on the tape,
– move its tape head one cell to the left or right, or keep it stationary, and
– change its state to the next state.
An NTM is a quintuple (Q, Σ, Γ, δ, s), where
– the set of states Q is finite, and does not contain the halt state h,
– the input alphabet Σ is a finite set of symbols, not including the blank symbol,
– the tape alphabet Γ is a finite set of symbols containing Σ, but not including the blank symbol,
– the start state s is in Q, and
– the transition function is δ : Q × (Γ ∪ {blank}) → 2^((Q ∪ {h}) × (Γ ∪ {blank}) × {L, R, S}).

Definition
Let T = (Q, Σ, Γ, δ, s) be a TM. A configuration of T records the state, the tape contents to the left of the head, the symbol under the head, and the tape contents to the right; it can be written as (q, l, a, r) or (q, lar).

Equivalence of NTM and DTM
Theorem: For any NTM Mn, there exists a DTM Md such that:
– if Mn halts on an input with some output, then Md halts on that input with the same output, and
– if Mn does not halt on an input, then Md does not halt on that input.
Proof: Let Mn = (Q, Σ, Γ, δ, s) be an NTM. We construct a 2-tape TM Md from Mn as follows:

Programming Techniques for Turing Machine Construction
A Turing machine is as powerful as a conventional computer. The following are different techniques for constructing a TM to meet high-level needs:
1. Storage in the finite control (state)
2. Multiple tracks
3. Subroutines
4. Checking off symbols

Storage in the State (Storage in the Finite Control):
The finite control can be used to hold a finite amount of information, along with the task of representing a position in the program. The state is written as a pair of elements, one for control and the other storing a symbol.
Figure: Storage in finite control

Multiple Tracks:
It is also possible for a Turing machine's input tape to be divided into several tracks.
Each track can hold one symbol, and the tape alphabet of the TM consists of tuples with one component for each track.
Figure: A three-track Turing machine

Subroutines:
A problem in which the same task is to be repeated many times can be programmed using subroutines. A Turing machine subroutine is a set of states that performs some useful process. The idea is to write part of a TM program to serve as a subroutine, which has its own initial state and a return state for returning to the calling routine. This supports modular, top-down program design.

Multitape Turing Machine:
A multitape Turing machine has a finite control and some finite number of tapes. Each tape is infinite in both directions. The machine has its own initial state and some accepting states. Initially, the finite string of input symbols is placed on the first tape; all the other cells of all the tapes hold the blank, and the tape head of the first tape is at the left end of the input.
Figure: Multitape Turing machine
In one move, the multitape TM can:
- change state,
- print a new symbol on each of the cells scanned by its tape heads, and
- move each of its tape heads, independently, one cell to the left or right, or keep it stationary.

Context Free Grammar (CFG)
A context free grammar is a 4-tuple (V, T, P, S) where
- V is a finite set of variables or non-terminals,
- T is a finite set of terminals,
- P is a finite set of productions, and
- S is the start symbol, with S ∈ V.
Consider the grammar G = ({S}, {a, b}, R, S), where the set of rules (productions) R is
S → aSb
S → SS
S → ε
Since the left-hand side of each production is a single non-terminal, the grammar is context free. This grammar generates strings such as abab, aaabbb, and aababb.

Grammar for Regular Languages
A language defined by a dfa is a regular language. Any dfa can be regarded as a special case of an nfa, and any nfa can be converted to an equivalent dfa; thus a language defined by an nfa is a regular language.
A regular expression can be converted to an equivalent nfa; thus a language defined by a regular expression is a regular language. An nfa can, with some effort, be converted to a regular expression. So dfas, nfas, and regular expressions are all equivalent, in the sense that any language you can define with one of them can be defined by the others as well. We also know that languages can be defined by grammars. Now we will begin to classify grammars, and the first kind we will look at is the regular grammars. As you might expect, regular grammars will turn out to be equivalent to dfas, nfas, and regular expressions.

Classifying Grammars
Recall that a grammar G is a quadruple G = (V, T, S, P), where
- V is a finite set of metasymbols or variables,
- T is a finite set of terminal symbols,
- S ∈ V is a distinguished element of V called the start symbol, and
- P is a finite set of productions.
The above is true for all grammars. We will distinguish among different kinds of grammars based on the form of the productions: if the productions of a grammar all follow a certain pattern, we have one kind of grammar; if they all fit a different pattern, we have a different kind of grammar. In general, productions have the form (V ∪ T)+ → (V ∪ T)*.

In a right linear grammar, all productions have one of the two forms
V → T*V
or
V → T*
That is, the left-hand side must consist of a single variable, and the right-hand side consists of any number of terminals (members of T), optionally followed by a single variable. (Thus, following the arrow, a variable can occur only as the rightmost symbol of the production.)

Right Linear Grammars and NFAs
There is a simple connection between right linear grammars and NFAs, as suggested by the following diagrams.
- A → xB corresponds to an arc labeled x from state A to state B.
- A → xyzB corresponds to a chain of arcs labeled x, y, z leading from state A, through new intermediate states, to state B.
- A → B corresponds to a λ-transition from state A to state B.
- A → x corresponds to an arc labeled x from state A to a final state.
- A → λ corresponds to making state A a final state.

As an example of the correspondence between an nfa and a right linear grammar, the following automaton and grammar both recognize the set of strings consisting of an even number of 0's and an even number of 1's. [Figure omitted.]

Left Linear Grammars
In a left linear grammar, all productions have one of the two forms:
V → VT*
or
V → T*
That is, the left-hand side must consist of a single variable, and the right-hand side consists of an optional single variable followed by any number of terminals. This is just like a right linear grammar except that, following the arrow, a variable can occur only on the left of the terminals rather than only on the right. We won't pay much attention to left linear grammars, because they turn out to be equivalent to right linear grammars. Given a left linear grammar for language L, we can construct a right linear grammar for the same language, as follows:

Step 1: Construct a right linear grammar for the different language L^R. Replace each production A → x of the left linear grammar with a production A → x^R, and replace each production A → Bx with a production A → x^R B.
Step 2: Construct an nfa for L^R from the right linear grammar. (We talked about deriving an nfa from a right linear grammar on an earlier page.) If the nfa has more than one final state, make those states nonfinal, add a new final state, and put λ-transitions from each previously final state to the new final state.
Step 3: Reverse the nfa for L^R to obtain an nfa for L: reverse the direction of the arcs, make the initial state final, and make the final state initial.
Step 4: Construct a right linear grammar for L from the nfa for L. (This is the technique we just talked about on an earlier page.)

Regular Grammars
A regular grammar is either a right linear grammar or a left linear grammar.
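The arc-by-arc correspondence above mechanically turns a right linear grammar into an NFA. Here is a sketch under an assumed encoding (productions as pairs of a nonempty terminal string and an optional variable; all names are illustrative, not from the text):

```python
def grammar_to_nfa(productions, final='#'):
    """Build NFA transitions from a right linear grammar, using the
    correspondence above.  productions: dict variable -> list of
    (terminals, next_variable_or_None); terminals assumed nonempty."""
    trans = {}
    for var, alts in productions.items():
        for terminals, nxt in alts:
            state = var
            for k, sym in enumerate(terminals):
                last = (k == len(terminals) - 1)
                # A -> xyzB becomes a chain of single-symbol arcs
                # through fresh intermediate states.
                tgt = (nxt or final) if last else (var, terminals, k)
                trans.setdefault((state, sym), set()).add(tgt)
                state = tgt
    return trans

def nfa_accepts(trans, start, finals, s):
    current = {start}
    for ch in s:
        nxt = set()
        for q in current:
            nxt |= trans.get((q, ch), set())
        current = nxt
    return bool(current & finals)

# Right linear grammar for a+ :  S -> aS | a
g = {'S': [('a', 'S'), ('a', None)]}
t = grammar_to_nfa(g)
```

A production A → x (no variable after the terminals) ends its chain at the single new final state '#', mirroring the last two diagram cases.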
To be a right linear grammar, every production of the grammar must have one of the two forms V → T*V or V → T*. To be a left linear grammar, every production of the grammar must have one of the two forms V → VT* or V → T*. You do not get to mix the two. For example, consider a grammar with the following productions:
S → aX
X → Sb
S → λ
This grammar is neither right linear nor left linear, hence it is not a regular grammar. We have no reason to suppose that the language it generates is a regular language (one that is accepted by a dfa).

Nondeterministic Pushdown Automaton (NPDA or PDA)
An NPDA is defined by a 7-tuple M = (Q, Σ, Γ, δ, q0, Z0, F), where
Q = finite set of states,
Σ = finite set of input symbols,
Γ = finite set of pushdown symbols (the stack alphabet),
q0 = initial state,
Z0 = initial stack symbol,
F = finite set of accepting states, and
δ = transition function from Q × (Σ ∪ {λ}) × Γ to finite subsets of Q × Γ*.

The language accepted by the PDA M by final state is
L(M) = {w | (q0, w, Z0) ⊢* (p, λ, γ) for some p in F and γ in Γ*}.
That is, the language accepted by final state is the set of all inputs for which some sequence of moves causes the PDA to enter a final state.

A PDA is deterministic in the sense that at most one move is possible from any ID. A PDA M = (Q, Σ, Γ, δ, q0, Z0, F) is deterministic if
a) for each q in Q and Z in Γ, whenever δ(q, λ, Z) is nonempty, then δ(q, a, Z) is empty for all a in Σ, and
b) for no q in Q, Z in Γ, and a in Σ ∪ {λ} does δ(q, a, Z) contain more than one element.

Transition Functions for NPDAs
The transition function for an npda has the form δ : Q × (Σ ∪ {λ}) × Γ → finite subsets of Q × Γ*. δ is now a function of three arguments. The first two are the same as before: the state, and either λ or a symbol from the input alphabet. The third argument is the symbol on top of the stack. Just as the input symbol is "consumed" when the function is applied, the stack symbol is also consumed (removed from the stack).
Note that while the second argument of δ may be λ rather than a member of the input alphabet (so that no input symbol is consumed), there is no such option for the third argument: δ always consumes a symbol from the stack, and no move is possible if the stack is empty. When the δ function is applied, the automaton moves to a new state q ∈ Q and pushes a new string of symbols x ∈ Γ* onto the stack. Since we are dealing with a nondeterministic pushdown automaton, the result of applying δ is a finite set of (q, x) pairs. If we were to draw the automaton, each such pair would be represented by a single arc. As with an nfa, we do not need to specify δ for every possible combination of arguments; for any case where δ is not specified, the transition is to ∅, the empty set.

Show whether the language L = {a^(n²) | n ≥ 1} is context free or not.
Solution: Assume L is context free, and let n be the constant of the pumping lemma for context free languages. Choose a string s in L with |s| = n², and let s = uvwxy where 1 ≤ |vx| and |vwx| ≤ n. By the pumping lemma, uv²wx²y is in L. Since |vx| ≥ 1, |uv²wx²y| > n², so if uv²wx²y were in L, its length would have to be k² for some k ≥ n + 1. But |uv²wx²y| = n² + |vx| ≤ n² + n < n² + 2n + 1 = (n+1)². Therefore |uv²wx²y| lies strictly between n² and (n+1)², so it is not a perfect square. Hence uv²wx²y ∉ L, which is a contradiction. Therefore L = {a^(n²) | n ≥ 1} is not context free.

Closure Properties of CFLs
The closure operations are useful not only in constructing CFLs or proving that certain languages are context free, but also in proving certain languages not to be context free. Here we see some of the operations: substitution, homomorphism, inverse homomorphism, intersection, difference, etc.

Substitutions
Let Σ be an alphabet. For each symbol a in Σ, choose a language La; setting S(a) = La defines the substitution function. If w = a1 a2 ... an is a string in Σ*, then S(w) is the language of all strings x1 x2 ... xn such that string xi is in the language S(ai) for i = 1, 2, ..., n. S(L) is the union of S(w) for all strings w in L.
Example: Let S(0) = {a^n b^n | n ≥ 1} and S(1) = {aa, bb}. S(1) contains only two strings, aa and bb.
S(0) contains one or more a's followed by an equal number of b's. Let w = 10; then S(w) is the concatenation of S(1) and S(0): S(w) = S(1).S(0) = {aa a^n b^n | n ≥ 1} ∪ {bb a^n b^n | n ≥ 1}.

Applications of the Substitution Theorem
Theorem: The context free languages are closed under union, concatenation, closure, and homomorphism.
Proof: Apply the previous (substitution) theorem for each of these operations.
1. Union: Let L1 and L2 be two CFLs. Then L1 ∪ L2 = S(L), where L is the language {1, 2} and S is the substitution given by S(1) = L1 and S(2) = L2.
2. Concatenation: Let L1 and L2 be CFLs. Then the concatenation L1.L2 = S(L), where L is the language {12} and S is the substitution S(1) = L1 and S(2) = L2.
3. Closure and positive closure: If L1 is a CFL, then its closure is given by S(L) = L1*, where L = {1}* and S(1) = L1. The positive closure is given by S(L) = L1+, where L = {1}+.
4. Homomorphism: Let L be a CFL over the alphabet Σ, and let h be a homomorphism on Σ. Let S be the substitution that replaces each symbol a in Σ by the language consisting of the one string h(a), i.e., S(a) = {h(a)} for all a in Σ. Then h(L) = S(L).

Intersection
Theorem: The CFLs are not closed under intersection.
Example: Let L1 = {a^n b^n c^m | n ≥ 1, m ≥ 1} and L2 = {a^m b^n c^n | n ≥ 1, m ≥ 1}. Both are CFLs, but L = L1 ∩ L2 = {a^n b^n c^n | n ≥ 1} is not context free: L1 requires that the number of a's equal the number of b's, while L2 requires that the number of b's equal the number of c's, and no PDA can check both conditions at once. So CFLs are not closed under intersection.

Inverse Homomorphism
If h is a homomorphism and L is any language, then h⁻¹(L) is the set of strings w such that h(w) is in L. The CFLs are closed under inverse homomorphism. The following figure shows how a PDA for h⁻¹(L) simulates a PDA for L using a buffer.
Figure: Inverse homomorphism of a PDA
After reading an input symbol a, h(a) is placed in a buffer.
The symbols of h(a) are used one at a time and fed to the PDA being simulated. Only when the buffer is empty does the PDA read its next input symbol and apply the homomorphism to it.

Design a TM to accept the language L = {0^n 1^n | n ≥ 1}, given a finite sequence of 0's and 1's on its tape.
The Turing machine is designed as follows:
(i) M replaces the leftmost 0 by x, then moves right to the leftmost 1, replacing it by y.
(ii) M then moves left to find the rightmost x, moves one cell right to the leftmost 0, and repeats the cycle.
(iii) While searching for a 1, if a blank is encountered, then M halts without accepting.
(iv) After changing a 1 to a y, if M finds no more 0's, then M checks that no more 1's remain, accepting the string if so.
Assume the set of states Q = {q0, q1, q2, q3, q4}, Σ = {0, 1}, Γ = {0, 1, x, y, B}, and F = {q4}; let q0 be the initial state. At state q0, M replaces the leftmost 0 by x and enters q1. In q1, M searches right for a 1, skipping over 0's and y's; on finding a 1 it changes it to y, entering state q2. From q2, M searches left for an x and then moves right, re-entering q0. At q0, if a y is encountered, M goes to state q3 and checks that no 1's remain. If the y's are followed by a B, state q4 is entered and the string is accepted. For all other combinations, M rejects.

Transition table:
State | 0        | 1        | x        | y        | B
q0    | (q1,x,R) | -        | -        | (q3,y,R) | -
q1    | (q1,0,R) | (q2,y,L) | -        | (q1,y,R) | -
q2    | (q2,0,L) | -        | (q0,x,R) | (q2,y,L) | -
q3    | -        | -        | -        | (q3,y,R) | (q4,B,R)
q4    | -        | -        | -        | -        | -

Example runs:
(i) q0 0011 ⊢ x q1 011 ⊢ x0 q1 11 ⊢ x q2 0y1 ⊢ q2 x0y1 ⊢ x q0 0y1 ⊢ xx q1 y1 ⊢ xxy q1 1 ⊢ xx q2 yy ⊢ x q2 xyy ⊢ xx q0 yy ⊢ xxy q3 y ⊢ xxyy q3 ⊢ xxyyB q4. Accepted.
(ii) q0 011 ⊢ x q1 11 ⊢ q2 xy1 ⊢ x q0 y1 ⊢ xy q3 1. No move is defined, so the string is rejected.
Figure: Transition diagram for 0^n 1^n
M = (Q, Σ, Γ, δ, q0, B, q4), where Q = {q0, q1, q2, q3, q4}, Σ = {0, 1}, Γ = {0, 1, x, y, B}, q0 is the initial state, q4 is the final state, and δ is given in the table above.

Design a Turing machine to check whether the given input is prime or not, using multiple tracks.
Solution: The binary input, greater than two, is placed on the first track.
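Returning for a moment to the 0^n 1^n machine: its transition table can be executed directly. The sketch below encodes the table as a dictionary (R = +1, L = -1; the encoding is an illustrative assumption) and runs the machine until it reaches q4 or has no applicable move.

```python
# delta[(state, symbol)] = (next_state, symbol_written, head_move)
delta = {
    ('q0', '0'): ('q1', 'x', +1), ('q0', 'y'): ('q3', 'y', +1),
    ('q1', '0'): ('q1', '0', +1), ('q1', '1'): ('q2', 'y', -1),
    ('q1', 'y'): ('q1', 'y', +1),
    ('q2', '0'): ('q2', '0', -1), ('q2', 'x'): ('q0', 'x', +1),
    ('q2', 'y'): ('q2', 'y', -1),
    ('q3', 'y'): ('q3', 'y', +1), ('q3', 'B'): ('q4', 'B', +1),
}

def tm_accepts(w, start='q0', final='q4', blank='B'):
    """Run the single-tape DTM for 0^n 1^n.  A missing entry in the
    table means the machine halts without accepting."""
    tape, state, head = dict(enumerate(w)), start, 0
    while state != final:
        move = delta.get((state, tape.get(head, blank)))
        if move is None:
            return False
        state, tape[head], step = move
        head += step
    return True
```

Tracing tm_accepts('0011') by hand reproduces run (i) above, move for move.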
The same input is also placed on the third track. The TM then writes the number two, in binary, on the second track, and divides the third track by the second as follows: the number on the second track is subtracted from the third track as many times as possible, leaving a remainder.
- If the remainder is zero, the number on the first track is not a prime.
- If the remainder is nonzero, increase the number on the second track by one and repeat.
- If the second track ever equals the first, the given number is a prime, because it is divisible only by one and itself.

Example (i), input 8:
Track I:   8  8  8  8  8
Track II:  2  2  2  2  2
Track III: 8  6  4  2  0
The remainder is zero, so the given number 8 is not a prime.

Example (ii), input 5:
Dividing by 2 leaves remainder 1, so the second track is increased to 3; dividing by 3 leaves remainder 2, so the second track is increased to 4, then to 5. Now the number on the second track equals the number on the first track, so the given number 5 is a prime.

Undecidable Problems
A problem whose language is not recursive is said to be an undecidable problem. In other words, a problem which has no algorithm to solve it is called an undecidable problem. Examples:
1. Does Turing machine M halt on input w?
2. For grammars G1 and G2, is L(G1) = L(G2)?

UNIT IV
Asymptotic Notation: O(), o(), Ω(), ω(), and θ()
"Big-O" notation was introduced in P. Bachmann's 1892 book Analytische Zahlentheorie. He used it to say things like "x is O(n²)" instead of "x ≤ n²". The notation works well to compare algorithm efficiencies because we want to say that the growth of effort of a given algorithm approximates the shape of a standard function. Big-O (O()) is one of five standard asymptotic notations. In practice, Big-O is used as a tight upper bound on the growth of an algorithm's effort (this effort is described by the function f(n)), even though, as written, it can also be a loose upper bound.
To make its role as a tight upper bound more clear, "little-o" (o()) notation is used to describe an upper bound that cannot be tight.

Definition (Big-O, O()): Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is O(g(n)) (or f(n) ∈ O(g(n))) if there exists a real constant c > 0 and there exists an integer constant n0 ≥ 1 such that f(n) ≤ c · g(n) for every integer n ≥ n0.

Definition (Little-o, o()): Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is o(g(n)) (or f(n) ∈ o(g(n))) if for any real constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) < c · g(n) for every integer n ≥ n0.

Definition (Big-Omega, Ω()): Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is Ω(g(n)) (or f(n) ∈ Ω(g(n))) if there exists a real constant c > 0 and there exists an integer constant n0 ≥ 1 such that f(n) ≥ c · g(n) for every integer n ≥ n0.

Definition (Little-omega, ω()): Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is ω(g(n)) (or f(n) ∈ ω(g(n))) if for any real constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) > c · g(n) for every integer n ≥ n0.

A graph (omitted here) should help you visualize the relationships between these notations. These definitions have far more similarities than differences.
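The quantifiers in these definitions can be made concrete by exhibiting the witnesses c and n0 and checking the inequality over a finite range. This is numerical evidence rather than a proof, and the function names are illustrative:

```python
def holds_big_o(f, g, c, n0, upto=10_000):
    """Check f(n) <= c * g(n) for all n0 <= n < upto, i.e. test a
    proposed pair of Big-O witnesses (c, n0) over a finite range."""
    return all(f(n) <= c * g(n) for n in range(n0, upto))

f = lambda n: 3 * n * n + 2 * n      # f(n) = 3n^2 + 2n
g = lambda n: n * n                  # g(n) = n^2
ok = holds_big_o(f, g, c=4, n0=2)    # 2n <= n^2 once n >= 2, so this holds
bad = holds_big_o(f, g, c=3, n0=1)   # 3n^2 + 2n > 3n^2 for every n >= 1
```

The failing case shows why the definition lets us pick c: no n0 rescues c = 3 here, but c = 4 works, so f(n) is still O(n²).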
Here's a table that summarizes the key restrictions in these four definitions. [Table omitted.]

Time and Space Complexity
Time complexity analysis means determining the number of steps (operations) needed as a function of the problem size: count the number of steps needed by an algorithm as a function of the problem size. Each atomic operation is counted as one step:
- arithmetic operations
- comparison operations
- other operations, such as assignment and return

Time complexity is a measure of the amount of time required to execute an algorithm. The objectives of time complexity analysis are:
- to determine the feasibility of an algorithm by estimating an upper bound on the amount of work performed, and
- to compare different algorithms before deciding which one to implement.
Analysis is based on the amount of work done by the algorithm. Time complexity expresses the relationship between the size of the input and the run time of the algorithm, and is usually expressed as a proportionality rather than an exact function. To simplify analysis, we sometimes ignore work that takes a constant amount of time, independent of the problem input size. When comparing two algorithms that perform the same task, we often concentrate just on the differences between them. Simplified analysis can be based on:
- the number of arithmetic operations performed
- the number of comparisons made
- the number of times through a critical loop
- the number of array elements accessed, etc.

Simulations
A computer simulation, a computer model, or a computational model is a computer program, run on a single computer or a network of computers, that attempts to simulate an abstract model of a particular system. Computer simulations have become a useful part of the mathematical modeling of many natural systems in physics (computational physics), astrophysics, chemistry, and biology; of human systems in economics, psychology, and social science; and of systems in engineering. Simulation of a system is represented as the running of the system's model.
It can be used to explore and gain new insights into new technology, and to estimate the performance of systems too complex for analytical solutions. Computer simulations vary from computer programs that run a few minutes, to network-based groups of computers running for hours, to ongoing simulations that run for days. The scale of events being simulated by computer simulations has far exceeded anything possible (or perhaps even imaginable) using traditional paper-and-pencil mathematical modeling. Over 10 years ago, a desert-battle simulation of one force invading another involved the modeling of 66,239 tanks, trucks, and other vehicles on simulated terrain around Kuwait, using multiple supercomputers in the DoD High Performance Computer Modernization Program. Other examples include a 1-billion-atom model of material deformation (2002); a 2.64-million-atom model of the ribosome, the complex maker of protein in all organisms, in 2005; a complete simulation of the life cycle of Mycoplasma genitalium in 2012; and the Blue Brain project at EPFL (Switzerland), begun in May 2005, to create the first computer simulation of the entire human brain, right down to the molecular level.

Reducibility
A reduction is a way of converting one problem to another problem, so that a solution to the second problem can be used to solve the first. Finding the area of a rectangle reduces to measuring its width and height; solving a set of linear equations reduces to inverting a matrix. Reducibility involves two problems, A and B:
- If A reduces to B, you can use a solution to B to solve A.
- When A is reducible to B, solving A cannot be "harder" than solving B.
- If A is reducible to B and B is decidable, then A is also decidable.
- If A is undecidable and reducible to B, then B is undecidable.

Circuit Complexity
A Boolean circuit C on n inputs x1, . . . , xn is a directed acyclic graph (DAG) with n nodes of in-degree 0 (the inputs x1, . . .
, xn), one node of out-degree 0 (the output); every node of the graph except the input nodes is labeled AND, OR, or NOT, and has in-degree 2 (for AND and OR) or 1 (for NOT). The Boolean circuit C computes a Boolean function f(x1, . . . , xn) in the obvious way: the value of the function is equal to the value of the output gate of the circuit when the input gates are assigned the values x1, . . . , xn. The size of a Boolean circuit C, denoted |C|, is defined to be the total number of nodes (gates) in the graph representation of C. The depth of a Boolean circuit C is defined as the length of a longest path (from an input gate to the output gate) in the graph representation of the circuit C. A Boolean formula is a Boolean circuit whose graph representation is a tree.

Given a family of Boolean functions f = {fn}n≥0, where fn depends on n variables, we are interested in the sizes of the smallest Boolean circuits Cn computing fn. Let s(n) be a function such that |Cn| ≤ s(n) for all n. Then we say that the Boolean function family f is computable by Boolean circuits of size s(n). If s(n) is a polynomial, then we say that f is computable by polysize circuits. It is not difficult to see that every language in P is computable by polysize circuits. Note that given any language L over the binary alphabet, we can define the Boolean function family {fn}n≥0 by setting fn(x1, . . . , xn) = 1 iff x1 . . . xn ∈ L.

Is the converse true? No! Consider the following family of Boolean functions fn, where fn(x1, . . . , xn) = 1 iff TM Mn halts on the empty tape; here, Mn denotes the nth TM in some standard enumeration of all TMs. Note that each fn is a constant function, equal to 0 or 1. Thus, the family of these fn's is computable by linear-size Boolean circuits. However, this family of fn's is not computable by any algorithm (let alone any polytime algorithm), since the Halting Problem is undecidable.
Thus, in general, the Boolean circuit model of computation is strictly more powerful than the Turing machine model of computation. Still, it is generally believed that NP-complete languages cannot be computed by polysize circuits. Proving a superpolynomial circuit lower bound for any NP-complete language would imply that P ≠ NP. (Check this!) In fact, this is one of the main approaches that has been used in trying to show that P ≠ NP. So far, however, nobody has been able to disprove that every language in NP can be computed by linear-size Boolean circuits of logarithmic depth!

Boolean circuit model of computation: For every n, m ∈ N, a Boolean circuit C with n inputs and m outputs is a directed acyclic graph. It contains n nodes with no incoming edges, called the input nodes, and m nodes with no outgoing edges, called the output nodes. All other nodes are called gates and are labeled with one of ∨, ∧, or ¬ (in other words, the logical operations OR, AND, and NOT). The ∨ and ∧ nodes have fanin (i.e., number of incoming edges) 2 and the ¬ nodes have fanin 1. The size of C, denoted by |C|, is the number of nodes in it. The circuit is called a Boolean formula if each node has at most one outgoing edge.

Definition 6.2 (Circuit families and language recognition): Let T : N → N be a function. A T(n)-sized circuit family is a sequence {Cn}n∈N of Boolean circuits, where Cn has n inputs and a single output, such that |Cn| ≤ T(n) for every n. We say that a language L is in SIZE(T(n)) if there exists a T(n)-size circuit family {Cn}n∈N such that for every x ∈ {0, 1}^n, x ∈ L ⇔ Cn(x) = 1.

Every language is decidable by a circuit family of size O(n·2^n), since the circuit for input length n could contain 2^n "hardwired" bits indicating which inputs are in the language. Given an input, the circuit looks up the answer from this table. (The reader may wish to work out an implementation of this circuit.) The following definition formalizes what we can think of as "small" circuits.
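The "hardwired table" construction above can be sketched in Python: for a fixed input length n, a table of 2^n bits records which strings belong to the language, and the "circuit" merely looks the answer up. The sample language below is an assumption chosen for illustration:

```python
# Sketch of the O(n * 2**n) construction: membership hardwired into a table.

def hardwired_table(n, language):
    """Build the 2**n-entry membership table for strings of length n.
    Entry i holds 1 iff the n-bit binary spelling of i is in the language."""
    return [1 if format(i, '0{}b'.format(n)) in language else 0
            for i in range(2 ** n)]

def lookup_circuit(table, x):
    """Evaluate the table 'circuit' on an n-bit input string x."""
    return table[int(x, 2)]

# Assumed example language: binary strings of length 3 with an even number of 1s.
L3 = {format(i, '03b') for i in range(8)
      if format(i, '03b').count('1') % 2 == 0}
table = hardwired_table(3, L3)
```

Note that this works for any language at all, decidable or not, which is exactly why circuit families are more powerful than Turing machines: nothing requires the tables for different n to be computable from n.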
UNIT V

Polynomial Time: An algorithm is said to run in polynomial time if the number of steps required to complete the algorithm for a given input is O(n^k) for some nonnegative integer k, where n is the complexity (size) of the input. Polynomial-time algorithms are said to be "fast." Most familiar mathematical operations such as addition, subtraction, multiplication, and division, as well as computing square roots, powers, and logarithms, can be performed in polynomial time. Computing the digits of most interesting mathematical constants, including π and e, can also be done in polynomial time.

A polynomial-time reduction that is a many-one reduction is called a polynomial-time many-one reduction, a polynomial transformation, or a Karp reduction. If it is a Turing reduction, it is called a polynomial-time Turing reduction or a Cook reduction. Polynomial-time reductions are important and widely used because they are powerful enough to perform many transformations between important problems, but still weak enough that polynomial-time reductions from problems in NP or co-NP to problems in P are considered unlikely to exist. This notion of reducibility is used in the standard definitions of several complete complexity classes, such as NP-complete, PSPACE-complete and EXPTIME-complete. Within the class P, however, polynomial-time reductions are inappropriate, because any problem in P can be polynomial-time reduced (both many-one and Turing) to almost[1] any other problem in P. Thus, for classes within P such as L, NL, NC, and P itself, log-space reductions are used instead. If a problem has a Karp reduction to a problem in NP, this shows that the problem is in NP. Cook reductions seem to be more powerful than Karp reductions; for example, any problem in co-NP has a Cook reduction to any NP-complete problem, whereas any problems that are in co-NP − NP (assuming they exist) will not have Karp reductions to any problem in NP.
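As a concrete illustration of a Karp (polynomial-time many-one) reduction, here is a classic example that is not from the course text: a graph G has an independent set of size k iff its complement graph has a clique of size k. The reduction itself is the polynomial-time map; the brute-force clique solver below is exponential and is included only to check the reduction on a tiny instance:

```python
# Sketch of a Karp reduction: INDEPENDENT-SET <=p CLIQUE via graph complement.
from itertools import combinations

def complement_graph(vertices, edges):
    """The reduction: map (G, k) for INDEPENDENT-SET to (G', k) for CLIQUE,
    where G' contains exactly the non-edges of G. Runs in polynomial time."""
    all_pairs = {frozenset(p) for p in combinations(vertices, 2)}
    return all_pairs - {frozenset(e) for e in edges}

def has_clique(vertices, edges, k):
    """Brute-force CLIQUE solver (exponential; only for checking)."""
    return any(all(frozenset(p) in edges for p in combinations(subset, 2))
               for subset in combinations(vertices, k))

def has_independent_set(vertices, edges, k):
    """Solve INDEPENDENT-SET using the reduction plus a CLIQUE solver,
    exactly as in the definition: a solution to B yields a solution to A."""
    return has_clique(vertices, complement_graph(vertices, edges), k)
```

For the path 1-2-3, the vertices {1, 3} form an independent set of size 2, but no independent set of size 3 exists.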
While this power is useful for designing reductions, the downside is that certain classes such as NP are not known to be closed under Cook reductions (and are widely believed not to be), so Cook reductions are not useful for proving that a problem is in NP. However, they are useful for showing that problems are in P and other classes that are closed under such reductions.

P Completeness theory: In complexity theory, the notion of P-complete decision problems is useful in the analysis of both: 1. which problems are difficult to parallelize effectively, and 2. which problems are difficult to solve in limited space. Formally, a decision problem is P-complete (complete for the complexity class P) if it is in P and every problem in P can be reduced to it by an appropriate reduction. The specific type of reduction used varies and may affect the exact set of problems. If we use NC reductions, that is, reductions which can operate in polylogarithmic time on a parallel computer with a polynomial number of processors, then all P-complete problems lie outside NC and so cannot be effectively parallelized, under the unproven assumption that NC ≠ P. If we use the weaker log-space reduction, this remains true, but additionally we learn that all P-complete problems lie outside L under the weaker unproven assumption that L ≠ P. In this latter case the set of P-complete problems may be smaller.

There is a theorem of Ladner [77] that plays the same role for the P = NLOGSPACE or P = LOGSPACE question that the Cook–Levin theorem plays for the P = NP question. The decision problem involved is the circuit value problem (CVP): given an acyclic Boolean circuit with several inputs and one output and a truth assignment to the inputs, what is the value of the output? The circuit can be evaluated in deterministic polynomial time; the theorem says that this problem is ≤log-m-complete for P. It follows from the transitivity of ≤log-m that P = NLOGSPACE iff CVP ∈ NLOGSPACE and P = LOGSPACE iff CVP ∈ LOGSPACE.
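The claim that CVP is evaluable in deterministic polynomial time is easy to see in code: one pass over the assignments, computed in index order, suffices. The list-of-tuples encoding below is an assumption for illustration; it mirrors the straight-line-program view of a circuit, where each line refers only to earlier lines:

```python
# Sketch: evaluating the circuit value problem (CVP) in one linear pass.

def circuit_value(program):
    """program[i] is ('CONST', b), ('AND', j, k), ('OR', j, k) or ('NOT', j)
    with j, k < i (this mirrors the acyclicity condition); the value of the
    last line is the circuit's output."""
    value = []
    for instr in program:
        op = instr[0]
        if op == 'CONST':
            value.append(instr[1])
        elif op == 'AND':
            value.append(value[instr[1]] & value[instr[2]])
        elif op == 'OR':
            value.append(value[instr[1]] | value[instr[2]])
        else:  # NOT
            value.append(1 - value[instr[1]])
    return value[-1]

# P0 := 1, P1 := 0, P2 := P0 AND P1, P3 := NOT P2  evaluates to 1.
prog = [('CONST', 1), ('CONST', 0), ('AND', 0, 1), ('NOT', 2)]
```

Each line is processed in constant time, so the whole evaluation is linear in the program length; the hard part of the theorem is not this upper bound but the completeness of CVP for P under log-space reductions.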
Formally, a Boolean circuit is a program consisting of finitely many assignments of the form Pi := 0, Pi := 1, Pi := Pj ∧ Pk (j, k < i), Pi := Pj ∨ Pk (j, k < i), or Pi := ¬Pj (j < i), where each Pi in the program appears on the left-hand side of exactly one assignment. The conditions j, k < i and j < i ensure acyclicity. We want to compute the value of Pn, where n is the maximum index.

Circuit Satisfiability: The circuit satisfiability problem (CIRCUIT-SAT) is the circuit analogue of SAT. Given a Boolean circuit C, is there an assignment to the variables that causes the circuit to output 1?

Theorem 1: CIRCUIT-SAT is NP-complete.

Proof: It is clear that CIRCUIT-SAT is in NP since a nondeterministic machine can guess an assignment and then evaluate the circuit in polynomial time. Now suppose that A is a language in NP. Recall from Lecture 3 that A has a polynomial-time verifier, an algorithm V with the property that x ∈ A if and only if V accepts ⟨x, y⟩ for some y. In addition, from Lecture 5, we know that there is a polynomial-size circuit C equivalent to V. The input of C is the entire input of V, i.e., both x and y. And C can be constructed in polynomial time given the lengths of x and y. The reduction from A to CIRCUIT-SAT operates as follows: given an input x, output a description of the circuit C(x, y) with the x variables set to the given values and the y variables left as variables. The resulting circuit is satisfiable if and only if x ∈ A. And the reduction can be computed in polynomial time because of the uniformity of C.

Circuit satisfiability is a good example of a problem that we don't know how to solve in polynomial time. In this problem, the input is a Boolean circuit: a collection of AND, OR, and NOT gates connected by wires. We will assume that there are no loops in the circuit (so no delay lines or flip-flops). The input to the circuit is a set of m Boolean (TRUE/FALSE) values x1, ..., xm. The output is a single Boolean value.
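The "guess an assignment and then evaluate" step of the NP-membership argument can be sketched by replacing the nondeterministic guess with an exhaustive search over all 2^m assignments. The exponential running time is expected for an NP-complete problem; only the per-assignment check is polynomial. The gate encoding below is an assumption for illustration:

```python
# Sketch: brute-force CIRCUIT-SAT (exponential search, polynomial-time check).
from itertools import product

def evaluate(circuit, assignment):
    """circuit[i] is ('VAR', name), ('AND', j, k), ('OR', j, k) or ('NOT', j);
    gates refer only to earlier indices, and the last gate is the output."""
    value = []
    for gate in circuit:
        op = gate[0]
        if op == 'VAR':
            value.append(assignment[gate[1]])
        elif op == 'AND':
            value.append(value[gate[1]] & value[gate[2]])
        elif op == 'OR':
            value.append(value[gate[1]] | value[gate[2]])
        else:  # NOT
            value.append(1 - value[gate[1]])
    return value[-1]

def circuit_sat(circuit, variables):
    """Return a satisfying assignment if one exists, else None."""
    for bits in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if evaluate(circuit, assignment) == 1:  # the polynomial-time check
            return assignment
    return None

# x AND (NOT y) is satisfied only by x = 1, y = 0.
C = [('VAR', 'x'), ('VAR', 'y'), ('NOT', 1), ('AND', 0, 2)]
```

A satisfying assignment, when one exists, is exactly the polynomial-time-checkable "proof" that places CIRCUIT-SAT in NP.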
Given specific input values, we can calculate the output of the circuit in polynomial (actually, linear) time using depth-first search, since we can compute the output of a k-input gate in O(k) time.

P, NP, and co-NP: A decision problem is a problem whose output is a single Boolean value: YES or NO. Let me define three classes of decision problems:
- P is the set of decision problems that can be solved in polynomial time. Intuitively, P is the set of problems that can be solved quickly.
- NP is the set of decision problems with the following property: If the answer is YES, then there is a proof of this fact that can be checked in polynomial time. Intuitively, NP is the set of decision problems where we can verify a YES answer quickly if we have the solution in front of us.
- co-NP is the opposite of NP. If the answer to a problem in co-NP is NO, then there is a proof of this fact that can be checked in polynomial time.

For example, the circuit satisfiability problem is in NP. If the answer is YES, then any set of m input values that produces TRUE output is a proof of this fact; we can check the proof by evaluating the circuit in polynomial time. It is widely believed that circuit satisfiability is not in P or in co-NP, but nobody actually knows. Every decision problem in P is also in NP: if a problem is in P, we can verify YES answers in polynomial time by recomputing the answer from scratch! Similarly, any problem in P is also in co-NP. One of the most important open questions in theoretical computer science is whether or not P = NP. Nobody knows. Intuitively, it should be obvious that P ≠ NP; the homeworks and exams in this class and others have (I hope) convinced you that problems can be incredibly hard to solve, even when the solutions are obvious in retrospect. But nobody knows how to prove it. A more subtle but still open question is whether NP and co-NP are different.
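The "verify a YES answer quickly" idea can be made concrete with a classic example that is not from the text above: COMPOSITES (is n a composite number?) is in NP, because a nontrivial factor of n serves as a certificate that can be checked in polynomial time, even though finding such a factor may be hard:

```python
# Illustrative sketch: a polynomial-time verifier for the COMPOSITES problem.

def verify_composite(n, factor):
    """Check the certificate: 'factor' witnesses that n is composite.
    The check is a single comparison and a division; finding a valid
    certificate in the first place is the (possibly hard) search problem."""
    return 1 < factor < n and n % factor == 0
```

For instance, the certificate 7 verifies that 91 is composite (91 = 7 × 13), while no certificate will verify the prime 13: the verifier rejects every candidate.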
Even if we can verify every YES answer quickly, there's no reason to think that we can also verify NO answers quickly. For example, as far as we know, there is no short proof that a Boolean circuit is not satisfiable. It is generally believed that NP ≠ co-NP, but nobody knows how to prove it.

The Turing machine simulator: The simulator also supports multi-track machines (the notation of Note 4 is used for tape symbols, so that ⟨a, b⟩ denotes a symbol where a is on the upper track and b on the lower track). The simulator will allow you to vary the number of tracks, since tracks are merely a convenient device for structuring symbol names. As an extra feature, the simulator issues a warning if there is a transition such as (q1,<0,1>,q2,<1,0>,R) that changes more than one track at the same time, since this goes against the idea of tracks. No warning is issued if the transition involves symbols with different numbers of tracks (say, from 2 tracks to 3 tracks).

The Turing machine description language: The user must create a file that contains a description of the machine to be simulated. The description language has been designed with an eye to making the descriptions look as similar as possible to those that can be found in Note 3. Thus, for example, the palindrome recognizer, Mpalin, of Note 3 might be presented to the simulator in the form of the following file:

Q = {q0, q1, q2, q3, q4, q5, q6}
I = q0
F = q6
G = {0, 1, b}
S = {0, 1}
D = {(q0,b,q6,b,R)*, (q0,0,q1,b,R)*, (q0,1,q3,b,R)*, (q1,b,q2,b,L)*, (q1,0,q1,0,R), (q1,1,q1,1,R), (q2,b,q6,b,R)*, (q2,0,q5,b,L)*, (q3,b,q4,b,L)*, (q3,0,q3,0,R), (q3,1,q3,1,R), (q4,b,q6,b,R)*, (q4,1,q5,b,L)*, (q5,b,q0,b,R)*, (q5,0,q5,0,L), (q5,1,q5,1,L)}

Q: The set of states is presented as a list of identifiers, separated by commas, and enclosed in braces, i.e., { and }. The identifiers are arbitrary sequences of alphanumeric characters. (Some special characters, such as { and }, are not allowed.)
I: The initial state of the machine (the qI of Note 3).
F: The final state of the machine (the qF of Note 3).
G: The tape alphabet, Γ, of the machine. The tape symbols can be almost any printing characters. Exceptions include non-printable characters such as (space), and ? (question mark); the latter has a special meaning to be explained shortly. The characters are separated by commas and enclosed in braces.
S: The input alphabet Σ ⊆ Γ.
D: The transition function is presented as a list of quintuples, separated by commas, and enclosed in braces. The components of each quintuple specify, in order, the old state, old tape symbol, new state, new tape symbol, and head movement (L for left, and R for right).

MODEL QUESTION PAPER
U4CSA10 THEORY OF COMPUTATION
Part – A (15 x 2 marks = 30 marks)
Answer ALL Questions. Each question carries 2 marks.
1. What is meant by symbols?
2. Define alphabets and strings.
3. Construct a regular expression for the language which accepts all strings with at least two c's over the set {c, b}.
4. Give the formal definition of a regular language.
5. Write a regular expression to denote a language L which accepts all the strings which begin or end with either 00 or 11.
6. List the types of automata.
7. What is: (i) (0+1)* (ii) (01)* (iii) (0+1) (iv) (0+1)+
8. What are the applications of the pumping lemma? What is the closure property of regular sets?
9. What is a deterministic PDA?
10. What is a left linear grammar?
11. When do we say a problem is decidable? Give an example of an undecidable problem.
12. What is an ambiguous grammar?
13. What is a 2-way infinite tape Turing Machine?
14. What is the storage in FC?
15. Define decidability.

Part – B (5 x 14 marks = 70 marks)
(Answer ALL questions. Each question carries 14 marks)
16. A) What is a: (a) String (b) Regular language? What is a regular expression? Differentiate L* and L+.
(OR)
B) Define: (i) Finite Automaton (FA) (ii) Transition diagram (iii) What are the applications of automata theory?
17.
A) Find the regular expression for the set of all strings denoted by R13(2) from the DFA given below.
(OR)
B) What are the closure properties of CFLs? State the pumping lemma for CFLs. What is the main application of the pumping lemma for CFLs?
18. A) i) Find a CFG with no useless symbols equivalent to: S → AB | CA, B → BC | AB, A → a, C → aB | b.
ii) Construct a CFG without ε-productions from: S → a | Ab | aBa, A → b | ε, B → b | A.
(OR)
B) What is an ambiguous grammar? Show that the grammar P = {S → aS | aSbS | ε} is ambiguous by constructing: (a) two parse trees (b) two leftmost derivations (c) rightmost derivations.
19. A) What is a Turing machine? What are the special features of a TM? Define Turing machine. Define instantaneous description of a TM. Discuss the various techniques for Turing machine construction.
(OR)
B) What are the different types of language acceptance by a PDA, and define them. Define deterministic PDA. Define instantaneous description (ID) in a PDA.
20. A) What is circuit complexity? Discuss in detail the problems under circuit complexity.
(OR)
B) Explain in detail the NAND circuit value problem with an example.