U4CSA10 - Vel Tech University

DEPARTMENT OF CSE
COURSE MATERIAL
U4CSA10 THEORY OF COMPUTATION
SEMESTER: IV
PREPARED BY
R.KAVITHA
A.P/CSE
U4CSA10 THEORY OF COMPUTATION
L T P C
3 0 0 3
OBJECTIVES
• To have an understanding of finite state and pushdown automata.
• To have knowledge of regular languages and context-free languages.
• To know the relation between regular languages, context-free languages and the corresponding recognizers.
• To study the Turing machine and classes of problems.
UNIT I (8 periods)
Languages and Problems: Symbols – Alphabets and strings – Languages – Operations on languages – Alphabetical coding – Types of problems – Representation of graphs – Spanning trees – Decision problems – Function problems – Security problems – Enumeration – Regular expressions – Applications of regular expressions
UNIT II (9 periods)
Fundamental Machines – Basic machine notation – Deterministic Finite Automata (DFA) – Non-deterministic Finite Automata (NFA) – Equivalence of DFA and NFA – Properties of finite state languages – Machine for five language operations – Closure under complement, union, intersection, concatenation and Kleene star – Equivalence of regular expressions and DFA – Pumping lemma for regular languages – Applications of the pumping lemma
UNIT III (9 periods)
Fundamental Machines – Push down automata – Turing machines – Deterministic Turing machine – Multiple work tape Turing machine – Non-deterministic Turing machine – Equivalence of deterministic and non-deterministic Turing machines – Undecidable languages – Relation among classes – Grammars – Regular grammars – Context-free grammars – Closure properties of context-free grammars – Parsing with non-deterministic push down automata – Parsing with deterministic pushdown automata – Parse trees
UNIT IV (10 periods)
Computational Complexity: Asymptotic notations – Time and space complexity – Simulations – Reducibility – Circuit complexity – Boolean circuit model of computation – Circuit resources – Examples
UNIT V (9 periods)
Polynomial time – P-completeness theory – Examples of P-completeness – General machine simulation – NAND circuit value problems – Circuit problems and reduction.
TOTAL: 45 periods
TEXT BOOKS
1. John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman, "Introduction to Automata Theory, Languages and Computation", Pearson Education Asia, Second Edition, 2001.
REFERENCE BOOKS
1. Greenlaw, Hoover, "Fundamentals of the Theory of Computation: Principles and Practice", Morgan Kaufmann Publishers, 1998.
LESSON PLAN

Faculty Name: R.Kavitha              Faculty ID: TTS1435
Subject Name: Theory of Computation  Subject Code: U4CSA10
Semester: IV                         Year: II A & B

Hour | Proposed Date | Topic Covered                                                                 | Unit | Mode of Delivery
1    | 16/12/2013    | Introduction                                                                  | I    | Chalk & Board
2    | 18/12/2013    | Symbols and alphabets                                                         | I    | Chalk & Board
3    | 19/12/2013    | Strings                                                                       | I    | Chalk & Board
4    | 20/12/2013    | Languages                                                                     | I    | Chalk & Board
5    | 21/12/2013    | Operations on languages                                                       | I    | Chalk & Board
6    | 24/12/2013    | Alphabetical coding                                                           | I    | Chalk & Board
7    | 26/12/2013    | Types of problems                                                             | I    | Chalk & Board
8    | 27/12/2013    | Representation of graphs                                                      | I    | Chalk & Board
9    | 28/12/2013    | Spanning trees, enumerations                                                  | I    | PPT
10   | 30/12/2013    | Decision problems                                                             | I    | Chalk & Board
11   | 20/1/2014     | Function problems                                                             | I    | Chalk & Board
12   | 21/1/2014     | Security problems & revision                                                  | I    | Chalk & Board
13   | 22/1/2014     | Seminar & tutorial                                                            | I    | Chalk & Board
14   | 23/1/2014     | Tutorial                                                                      | I    | Chalk & Board
15   | 25/1/2014     | Fundamental machines                                                          | II   | Chalk & Board
16   | 27/1/2014     | Basic machine notation                                                        | II   | Chalk & Board
17   | 28/1/2014     | Finite automata                                                               | II   | Chalk & Board
18   | 29/1/2014     | DFA & NFA and properties of languages                                         | II   | Chalk & Board
19   | 31/1/2014     | Equivalence of DFA and NFA                                                    | II   | Chalk & Board
20   | 3/2/2014      | Machine for five language operations                                          | II   | Chalk & Board
21   | 4/2/2014      | Closure under complement, union, intersection, concatenation and Kleene star  | II   | PPT
22   | 5/2/2014      | Equivalence of regular expressions and DFA                                    | II   | Chalk & Board
23   | 7/2/2014      | Pumping lemma for regular languages                                           | II   | Chalk & Board
24   | 10/2/2014     | Seminar and tutorial                                                          | II   | Chalk & Board
25   | 13/2/2014     | Fundamental machines                                                          | III  | Chalk & Board
26   | 14/2/2014     | Push down automata                                                            | III  | PPT
27   | 15/2/2014     | Turing machines                                                               | III  | Chalk & Board
28   | 18/2/2014     | Deterministic Turing machine                                                  | III  | Chalk & Board
29   | 19/2/2014     | Multiple work tape Turing machine                                             | III  | Chalk & Board
30   | 20/2/2014     | Non-deterministic Turing machine                                              | III  | Chalk & Board
31   | 21/2/2014     | Equivalence of deterministic Turing machines                                  | III  | Chalk & Board
32   | 26/2/2014     | Non-deterministic Turing machines – undecidable languages                     | III  | PPT
33   | 27/2/2014     | Relation among classes – grammars – regular grammars                          | III  | Chalk & Board
34   | 28/2/2014     | Context-free grammar                                                          | III  | Chalk & Board
35   | 28/2/2014     | Closure properties of context-free grammar                                    | III  | Chalk & Board
36   | 1/3/2014      | Tutorial and seminar                                                          | III  | Chalk & Board
37   | 2/3/2014      | Introduction                                                                  | IV   | Chalk & Board
38   | 4/3/2014      | Computational complexity                                                      | IV   | Chalk & Board
39   | 5/3/2014      | Example                                                                       | IV   | Chalk & Board
40   | 6/3/2014      | Example on complete problems                                                  | IV   | Chalk & Board
41   | 7/3/2014      | Asymptotic notations                                                          | IV   | Chalk & Board
42   | 11/3/2014     | Time space complexity                                                         | IV   | Chalk & Board
43   | 12/3/2014     | Reducibility and simulations                                                  | IV   | PPT
44   | 13/3/2014     | Circuit complexity                                                            | IV   | Chalk & Board
45   | 14/3/2014     | Boolean circuit model of computation                                          | IV   | PPT
46   | 17/3/2014     | Seminar and tutorial                                                          | IV   | Chalk & Board
47   | 18/3/2014     | Tutorial                                                                      | IV   | Chalk & Board
48   | 19/3/2014     | Polynomial time                                                               | V    | Chalk & Board
49   | 20/3/2014     | Examples on polynomial-time problems                                          | V    | Chalk & Board
50   | 24/3/2014     | Examples of P-complete problems                                               | V    | PPT
51   | 25/3/2014     | Examples of NP-completeness                                                   | V    | Chalk & Board
52   | 26/3/2014     | General machine simulation                                                    | V    | Chalk & Board
53   | 27/3/2014     | NAND circuit value                                                            | V    | Chalk & Board
54   | 1/4/2014      | Examples of P problems                                                        | V    | PPT
55   | 2/4/2014      | Reduction                                                                     | V    | Chalk & Board
56   | 3/4/2014      | Circuit problems                                                              | V    | Chalk & Board
57   | 4/4/2014      | Seminar & tutorial                                                            | V    | Chalk & Board
58   | 7/4/2014      | (topic not recorded)                                                          | V    | Chalk & Board
59   | 8/4/2014      | (topic not recorded)                                                          | V    | Chalk & Board
Unit-1
Languages
An alphabet is a finite, nonempty set of symbols. We use Σ to denote an alphabet.
Note: a symbol may be more than one English letter long; e.g. while is a single symbol in Pascal.
A string is a finite sequence of symbols from Σ. The length of a string s, denoted |s|, is the number of symbols in it.
The empty string, written λ, is the string of length zero.
Σ* denotes the set of all strings composed of zero or more symbols of Σ. Σ+ denotes the set of all strings composed of one or more symbols of Σ. That is, Σ+ = Σ* − {λ}.
A language is a subset of Σ*.
The concatenation of two strings is formed by joining the sequence of symbols in the
first string with the sequence of symbols in the second string.
If a string S can be formed by concatenating two strings A and B, S=AB, then A is
called a prefix of S and B is called a suffix of S.
The reverse of a string S, written S^R, is obtained by reversing the sequence of symbols in the string. For example, if S = abcd, then S^R = dcba.
Any string that belongs to a language is said to be a word or a sentence of that
language.
Operations on languages
Languages are sets. Therefore any operation that can be performed on sets can be
performed on languages.
If L, L1 and L2 are languages, then:
• L1 ∪ L2 is a language
• L1 ∩ L2 is a language
• L1 − L2 is a language
• −L = Σ* − L, the complement of L, is a language.
In addition,
• L1L2, the concatenation of L1 and L2, is a language (the strings of L1L2 are strings that have a word of L1 as a prefix and a word of L2 as a suffix);
• L^n, the concatenation of L with itself n times, is a language;
• L* = L^0 ∪ L^1 ∪ L^2 ∪ ..., the star closure of L, is a language;
• L+ = L^1 ∪ L^2 ∪ L^3 ∪ ..., the positive closure of L, is a language.
Sets:
A set is a collection of "things" called the elements or members of the set. It is essential to have a criterion for determining, for any given thing, whether it is or is not a member of the given set. This criterion is called the membership criterion of the set.
There are two common ways of indicating the members of a set:
• List all the elements, e.g. {a, e, i, o, u}
• Provide some sort of algorithm or rule, such as a grammar.
To indicate that x is a member of set S, we write x ∈ S.
We denote the empty set (the set with no members) as {} or ∅.
If every element of set A is also an element of set B, we say that A is a subset of B and write A ⊆ B.
If every element of set A is also an element of set B, but B also has some elements not contained in A, we say that A is a proper subset of B and write A ⊂ B.
Operations on sets
The union of sets A and B, written A ∪ B, is a set that contains everything that is in A, or in B, or in both.
The intersection of sets A and B, written A ∩ B, is a set that contains exactly those elements that are in both A and B.
The complement of a set A, written −A (or better, A with a bar drawn over it), is the set containing everything that is not in A. This is almost always used in the context of some universal set U that contains "everything" (meaning everything we are interested in at the moment). Then −A is shorthand for U − A.
The cardinality of a set A, written |A|, is the number of elements in A.
The power set of a set Q, written 2^Q, is the set of all subsets of Q. The notation suggests the fact that a set containing n elements has a power set containing 2^n elements.
Two sets are disjoint if they have no elements in common, that is, if A ∩ B = ∅.
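The power-set fact (a set of n elements has 2^n subsets) can be checked directly. A minimal sketch in Python using itertools; the set Q is an arbitrary example:

```python
from itertools import chain, combinations

def power_set(q):
    """All subsets of q, as a list of sets; len(power_set(q)) == 2**len(q)."""
    items = list(q)
    return [set(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

Q = {"a", "b", "c"}
print(len(power_set(Q)) == 2 ** len(Q))   # True: 8 subsets of a 3-element set
print({"a", "b"} <= Q)                    # subset test: True
print({"a"}.isdisjoint({"b", "c"}))       # disjoint sets: True
```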
Relations and functions
A relation on sets S and T is a set of ordered pairs (s, t), where
• s ∈ S (s is a member of S),
• t ∈ T,
• S and T need not be different,
• the set of all first elements (s) is the domain of the relation, and
• the set of all second elements (t) is the range of the relation.
A relation is a function if every element of S occurs once and only once as a first element of the relation.
A relation is a partial function if every element of S occurs at most once as a first element of the relation.
Functions and Relations
Consider two sets, which we will call S and T. There are several ways in which the two sets S and T may be related:
• The sets may be identical; that is, every element of one set is also an element of the other set.
• The sets may be disjoint; that is, no element belongs to both sets. There is no overlap between the sets.
• Set S may be a proper subset of set T. That is, every element of S is also an element of T, but T has some elements that are not in S.
• Set T may be a proper subset of set S.
• The sets may overlap: some elements are in both S and T, without either being a proper subset of the other (each set contains some elements that are not in the other set).
Relations
However, this page isn't really about sets; this page is really about relations and functions. For clarity, all our examples use disjoint sets (see figure). However, the same definitions apply regardless of the relationship between the two sets. The only difference is that if we didn't use disjoint sets, the examples would be harder to figure out.
Suppose we have two sets, S and T. A relation on S and T is a set of ordered pairs, where the first thing in each pair is an element of S and the second thing is an element of T.
For example, suppose S is the set {A, B, C, D, E} while T is the set {W, X, Y, Z}. Then one relation on S and T is {(A, Y), (C, W), (C, Z), (D, Y)}. There are four ordered pairs in the relation. We can draw this as four arrows going from S to T (see Figure 2): one arrow goes from A to Y, another from C to W, another from C to Z, and another from D to Y.
The purpose of this page is just to define terms. Giving names to things is not important unless you can later use those names to talk about the things. The terms we define here are used throughout mathematics and are pretty important, but we don't go into any of that here. We just define terms.
The most important of the terms we will define is function. You have probably seen this word defined in algebra or calculus, and you may think this is another meaning for the same word. It is important to realize that there is only one meaning in mathematics for the word "function", and this is it. Moreover, the definition given here is the "best" definition because, since the definition is given in terms of sets, it is the most general and most applicable definition. Any other definition is just a special case.
Anyway, to continue,
The domain of a relation on S and T is the set of elements of S that appear as the first element in an ordered pair of the relation. In the relation {(A, Y), (C, W), (C, Z), (D, Y)}, the domain is {A, C, D}. If you look at Figure 2, these are the elements of S that have arrows coming out of them.
The range of a relation on S and T is the set of elements of T that appear as the second element in an ordered pair of the relation. In the same relation, the range is {W, Y, Z}. If you look at Figure 2, these are the elements of T that have arrows pointing to them.
For some reason the word codomain has become popular as a synonym for "range". I think it's an ugly word. If anyone has an explanation for why this word has become popular, I would very much like to hear it.
In Figure 2, not every element of T has an arrow pointing to it. That is, there are some elements of T (in particular, the element X) that do not occur as a second element of an ordered pair. We say that the relation is into T.
Suppose we have a relation in which every element of T occurs at least once as a second element of an ordered pair. For example, the relation {(A, Y), (B, X), (C, Z), (D, Y)} (see Figure 3) is just like the previous relation, except that it also contains the ordered pair (B, X). This relation is onto: every element of set T has an arrow (at least one) pointing to it.
While the word "onto" has a precise definition (the range of the relation is the set T itself), the word "into" is not usually so well defined. "Into" could be used to mean "not onto" (there is at least one element of T that does not have an arrow pointing to it), or it could mean "not necessarily onto", that is, there might be some element of T that does not have an arrow pointing to it. Different authors might choose to define "into" in different ways, or not define it precisely, or just not define it at all.
Suppose we put in every possible arrow from S to T; that is, from each element of S we draw an arrow to each element of T (see Figure 4). This "largest possible relation" is called the Cartesian product of S and T. Every other relation on S and T is a subset of the Cartesian product.
How is it possible for a relation to be a subset of another relation? Remember that a relation is just a set of ordered pairs. Or, in our pictures, a relation is a set of arrows.
Next, we will say that a relation is one-to-one, or 1-1, if no element of S occurs more than once as the first element of an ordered pair, and no element of T occurs more than once as a second element of an ordered pair. In other words, no element of S or T has more than a single arrow attached to it (see Figure 5).
This definition holds even when S and T are not disjoint, but the picture is a little more confusing: an element that is in both S and T could have a single arrow attached to it as before, but at both ends.
Functions
Now we get to what is perhaps the most important term. Suppose every element of S occurs exactly once as the first element of an ordered pair; in our pictures, every element of S has exactly one arrow coming from it. This kind of relation is called a function.
Another word for function is mapping. A function is said to map an element in its domain to an element in its range.
Here are some important facts about a function from S to T:
1. Every element in S is in the domain of the function: that is, every element of S is mapped to some element in the range. If some element in S has no mapping arrow, then the relation is sometimes called a partial function, but it is not a function.
2. No element in the domain maps to more than one element in the range.
3. The mapping is not necessarily onto; some elements of T may not be in the range.
4. The mapping is not necessarily 1-1; some elements of T may have more than one element of S mapped to them.
5. S and T need not be disjoint.
6. To tell whether a relation on S and T is a function, you can ignore T altogether; just look at S. If every element of S has one and only one arrow coming out of it, then the relation is a function.
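Fact 6 above translates directly into code. A small sketch in Python; the sets and relations are the figure examples restated as data:

```python
def is_function(relation, S):
    """A relation (a set of (s, t) pairs) on S and T is a function iff every
    element of S occurs exactly once as a first element; T can be ignored."""
    firsts = [s for s, _ in relation]
    return all(firsts.count(s) == 1 for s in S)

S = {"A", "B", "C", "D", "E"}
r1 = {("A", "Y"), ("C", "W"), ("C", "Z"), ("D", "Y")}
r2 = {("A", "Y"), ("B", "X"), ("C", "Z"), ("D", "Y"), ("E", "X")}
print(is_function(r1, S))  # False: C occurs twice; B and E not at all
print(is_function(r2, S))  # True: each element of S appears exactly once
```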
Kinds of functions
There are three kinds of functions that are important enough to have special names: surjection, injection and bijection.
A surjection, or onto function, from S to T is a function whose range is T itself. That is, every element of T has at least one arrow pointing to it, so the relation is onto. Figure 6 is an example of a surjection.
Since the domain of a function from S to T is the entire set S (by the definition of function), and since the range of a surjective function from S to T is the entire set T, this means that every element of both sets is in the relation (has an arrow connected to it).
Also note that if you have a surjection from S to T, S may have more elements than T, or it may have the same number of elements, but it cannot have fewer elements than T. This is because every element of S has exactly one arrow emanating from it, while an element of T could have many.
An injection, or one-to-one function, from S to T is a function that is one-to-one. By the definition of function, every element of S has exactly one arrow emanating from it, no more, no less. A one-to-one function also requires that no element of T have more than one arrow pointing to it, but there could be elements of T that do not have any arrows pointing to them.
Figure 7 shows an injection. However, we had to modify the sets a little bit to get this example. A little thought will show that with an injection from S to T, T must have at least as many elements as S. To get our example we removed some elements from S. We could equally well have added elements to T.
Finally, a function that is both 1-1 and onto is called a bijection. Such a function maps each and every element of S to exactly one element of T, with no elements left over. Again we had to modify the sets a little, because with a bijection, sets S and T must have exactly the same number of elements.
A bijection is particularly interesting because this kind of function has an inverse. If you took a bijection from S to T and reversed the direction of all the arrows, you would have a bijection from T to S. This new function would be the inverse of the original function.
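These properties are mechanical to check when a function is stored as a dictionary mapping each element of S to its image. A sketch in Python; the function f and the target set T below are illustrative:

```python
def is_injective(f):
    """No element of T has more than one arrow pointing to it."""
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, T):
    """Every element of T has at least one arrow pointing to it."""
    return set(f.values()) == set(T)

def inverse(f):
    """Reverse all the arrows; only a bijection yields a true inverse."""
    return {t: s for s, t in f.items()}

f = {"A": "W", "B": "X", "C": "Y", "D": "Z"}   # a bijection
T = {"W", "X", "Y", "Z"}
print(is_injective(f) and is_surjective(f, T))  # True: f is a bijection
print(inverse(f)["W"])                          # "A": arrows reversed
```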
Primitive Regular Expressions
A regular expression can be used to define a language. A regular expression represents a "pattern"; strings that match the pattern are in the language, strings that do not match the pattern are not in the language.
As usual, the strings are over some alphabet Σ.
The following are primitive regular expressions:
• x, for each x ∈ Σ,
• λ, the empty string, and
• ∅, indicating no strings at all.
Thus, if |Σ| = n, then there are n + 2 primitive regular expressions defined over Σ.
Here are the languages defined by the primitive regular expressions:
• For each x ∈ Σ, the primitive regular expression x denotes the language {x}. That is, the only string in the language is the string "x".
• The primitive regular expression λ denotes the language {λ}; the only string in this language is the empty string.
• The primitive regular expression ∅ denotes the language {}; there are no strings in this language.
Regular Expressions
Every primitive regular expression is a regular expression. We can compose additional regular expressions by applying the following rules a finite number of times:
• If r1 is a regular expression, then so is (r1).
• If r1 is a regular expression, then so is r1*.
• If r1 and r2 are regular expressions, then so is r1r2.
• If r1 and r2 are regular expressions, then so is r1 + r2.
Here's what the above notation means:
• Parentheses are just used for grouping.
• The postfix star indicates zero or more repetitions of the preceding regular expression. Thus, if x ∈ Σ, then the regular expression x* denotes the language {λ, x, xx, xxx, ...}.
• Juxtaposition, as in r1r2, denotes the concatenation of the two component languages.
• The plus sign, read as "or", denotes the language containing strings described by either of the component regular expressions. For example, if x, y ∈ Σ, then the regular expression x + y describes the language {x, y}.
Precedence: * binds most tightly, then juxtaposition, then +. For example, a + bc* denotes the language {a, b, bc, bcc, bccc, bcccc, ...}.
Languages Defined by Regular Expressions
There is a simple correspondence between regular expressions and the languages they denote:

Regular expression    | L(regular expression)
x, for each x ∈ Σ     | {x}
λ                     | {λ}
∅                     | {}
(r1)                  | L(r1)
r1*                   | (L(r1))*
r1 r2                 | L(r1) L(r2)
r1 + r2               | L(r1) ∪ L(r2)
Building Regular Expressions
Here are some hints on building regular expressions. We will assume Σ = {a, b, c}.
Zero or more.
a* means "zero or more a's". To say "zero or more ab's", that is, {λ, ab, abab, ababab, ...}, you need to say (ab)*. Don't say ab*, because that denotes the language {a, ab, abb, abbb, abbbb, ...}.
One or more.
Since a* means "zero or more a's", you can use aa* (or equivalently a*a) to mean "one or more a's". Similarly, to describe "one or more ab's", that is, {ab, abab, ababab, ...}, you can use ab(ab)*.
Zero or one.
You can describe an optional a with (a + λ).
Any string at all.
To describe any string at all (with Σ = {a, b, c}) you can use (a+b+c)*.
Any nonempty string.
This can be written as any character from Σ followed by any string at all: (a+b+c)(a+b+c)*.
Any string not containing....
To describe any string at all that doesn't contain an a (with Σ = {a, b, c}), you can use (b+c)*.
Any string containing exactly one....
To describe any string that contains exactly one a, put "any string not containing an a" on either side of the a, like this: (b+c)*a(b+c)*.
Example Regular Expressions
Give regular expressions for the following languages on Σ = {a, b, c}.

All strings containing exactly one a.
(b+c)*a(b+c)*

All strings containing no more than three a's.
We can describe the strings containing zero, one, two or three a's (and nothing else) as
(λ+a)(λ+a)(λ+a)
Now we allow an arbitrary a-free string X in every position, X(λ+a)X(λ+a)X(λ+a)X, and put in (b+c)* for each X:
(b+c)*(λ+a)(b+c)*(λ+a)(b+c)*(λ+a)(b+c)*

All strings which contain at least one occurrence of each symbol in Σ.
The problem here is that we cannot assume the symbols are in any particular order. We have no way of saying "in any order", so we have to list the possible orders:
abc+acb+bac+bca+cab+cba
To make it easier to see what's happening, let's put an X in every place we want to allow an arbitrary string:
XaXbXcX+XaXcXbX+XbXaXcX+XbXcXaX+XcXaXbX+XcXbXaX
Finally, replacing the X's with (a+b+c)* gives the final unwieldy answer:
(a+b+c)*a(a+b+c)*b(a+b+c)*c(a+b+c)* +
(a+b+c)*a(a+b+c)*c(a+b+c)*b(a+b+c)* +
(a+b+c)*b(a+b+c)*a(a+b+c)*c(a+b+c)* +
(a+b+c)*b(a+b+c)*c(a+b+c)*a(a+b+c)* +
(a+b+c)*c(a+b+c)*a(a+b+c)*b(a+b+c)* +
(a+b+c)*c(a+b+c)*b(a+b+c)*a(a+b+c)*

All strings which contain no runs of a's of length greater than two.
We can fairly easily build an expression containing no a, one a, or one aa:
(b+c)*(λ+a+aa)(b+c)*
but if we want to repeat this, we need to be sure to have at least one non-a between repetitions:
(b+c)*(λ+a+aa)((b+c)(b+c)*(λ+a+aa))*

All strings in which all runs of a's have lengths that are multiples of three.
(aaa+b+c)*
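Two of the answers above can be checked mechanically by translating them into Python `re` syntax: (b+c) becomes [bc], and the λ-alternative disappears into an optional group.

```python
import re

# (b+c)* a (b+c)*  : all strings containing exactly one a
exactly_one_a = re.compile(r"[bc]*a[bc]*")
# (aaa+b+c)*       : all runs of a's have length a multiple of three
runs_mult_3 = re.compile(r"(?:aaa|b|c)*")

print(bool(exactly_one_a.fullmatch("bcabc")))    # True
print(bool(exactly_one_a.fullmatch("baab")))     # False: two a's
print(bool(runs_mult_3.fullmatch("baaacaaab")))  # True: both runs length 3
print(bool(runs_mult_3.fullmatch("baab")))       # False: a run of length 2
```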
Converting Primitive Regular Expressions to NFAs
Every NFA we construct will have a single start state and a single final state. We will build more complex NFAs out of simpler NFAs, each with a single start state and a single final state. The simplest NFAs will be those for the primitive regular expressions.
For any x in Σ, the regular expression x denotes the language {x}. An NFA consisting of a start state, a final state, and a single arc labeled x between them represents exactly that language. Note that if this were a DFA, we would have to include arcs for all the other elements of Σ.
The regular expression λ denotes the language {λ}, that is, the language containing only the empty string; its NFA joins the start state to the final state with a λ arc.
The regular expression ∅ denotes the language {}; no strings belong to this language, not even the empty string. Its NFA has a start state and a final state, but no arc between them.
Since the final state is unreachable, why bother to have it at all? The answer is that it simplifies the construction if every NFA has exactly one start state and one final state. We could do without this final state, but we would have more special cases to consider, and it doesn't hurt anything to include it.
UNIT-2
DFAs are:
• Deterministic: there is no element of choice
• Finite: only a finite number of states and arcs
• Acceptors (automata): they produce only a yes/no answer
A deterministic finite automaton (acceptor), or DFA, is a quintuple
M = (Q, Σ, δ, q0, F)
where
• Q is a finite set of states,
• Σ is a finite set of symbols, the input alphabet,
• δ: Q × Σ → Q is a transition function,
• q0 ∈ Q is the initial state,
• F ⊆ Q is a set of final states.
A DFA is drawn as a graph, with each state represented by a circle.
Start state: one designated state is the start state.
Final state: some states, possibly including the start state, can be designated as final states.
State transition arc: arcs between states represent state transitions; each such arc is labeled with the symbol that triggers the transition.
Example DFA
Let Q = {q0, q1}, Σ = {a, b} and F = {q0}; the transition diagram has the two states q0 and q1.
Operation
Start with the "current state" set to the start state and a "read head" at the beginning of the input string.
While there are still characters in the string: read the next character and advance the read head; from the current state, follow the arc that is labeled with the character just read; the state that the arc points to becomes the next current state.
When all characters have been read, accept the string if the current state is a final state; otherwise reject the string.
Sample trace (for the four-state DFA implemented below, on input 100100):
q0 -1-> q1 -0-> q3 -0-> q1 -1-> q0 -0-> q2 -0-> q0
Since q0 is a final state, the string is accepted.
Implementing a DFA
The machine traced above has four states q0, q1, q2, q3, with q0 the start state and the only final state. If you don't object to the goto statement, there is an easy way to implement it:

q0: read char;
    if eof then accept string;
    if char = 0 then goto q2;
    if char = 1 then goto q1;
q1: read char;
    if eof then reject string;
    if char = 0 then goto q3;
    if char = 1 then goto q0;
q2: read char;
    if eof then reject string;
    if char = 0 then goto q0;
    if char = 1 then goto q3;
q3: read char;
    if eof then reject string;
    if char = 0 then goto q1;
    if char = 1 then goto q2;
Implementing a DFA, part 2
If you are not allowed to use a goto statement, you can fake it with a combination of a loop and a case statement:

state := q0;
loop
  case state of
    q0: read char;
        if eof then accept string;
        if char = 0 then state := q2;
        if char = 1 then state := q1;
    q1: read char;
        if eof then reject string;
        if char = 0 then state := q3;
        if char = 1 then state := q0;
    q2: read char;
        if eof then reject string;
        if char = 0 then state := q0;
        if char = 1 then state := q3;
    q3: read char;
        if eof then reject string;
        if char = 0 then state := q1;
        if char = 1 then state := q2;
  end case;
end loop;
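Both implementations above describe the same machine; in a language with dictionaries, the transition function δ becomes a simple lookup table. A sketch in Python of that four-state DFA, with states and transitions exactly as in the goto version:

```python
# Transition table of the DFA above: start state q0, accepting state q0.
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q1",
    ("q1", "0"): "q3", ("q1", "1"): "q0",
    ("q2", "0"): "q0", ("q2", "1"): "q3",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}

def dfa_accepts(s, start="q0", finals=("q0",)):
    state = start
    for ch in s:                     # read each character in turn
        state = DELTA[(state, ch)]   # follow the labeled arc
    return state in finals           # accept iff we end in a final state

print(dfa_accepts("100100"))  # True, matching the sample trace
print(dfa_accepts("10"))      # False: the walk ends in q3
```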
Nondeterministic Finite Acceptors (Nondeterministic Finite Automata)
Formal Definition
M = (Q, Σ, δ, q0, F)
where
• Q is a finite set of states,
• Σ is a finite set of symbols, the input alphabet,
• δ: Q × (Σ ∪ {λ}) → 2^Q is a transition function,
• q0 ∈ Q is the initial state,
• F ⊆ Q is a set of final states.
These are all the same as for a DFA except for the definition of δ:
• transitions on λ are allowed in addition to transitions on elements of Σ, and
• the range of δ is 2^Q rather than Q. This means that the values of δ are not elements of Q, but rather are sets of elements of Q.
A finite state automaton can be nondeterministic in either or both of two ways.
A state may have two or more arcs emanating from it labeled with the same symbol. When the symbol occurs in the input, either arc may be followed.
A state may have one or more arcs emanating from it labeled with λ, the empty string. These arcs may optionally be followed without looking at the input or consuming an input symbol.
Due to nondeterminism, the same string may cause an NFA to end up in one of several different states, some of which may be final while others are not. The string is accepted if any possible ending state is a final state.
Example NFAs
Let Q = {q0, q1, q2, q3} and Σ = {a, b, c}. Each entry of an NFA's transition table is a set of states rather than a single state; for example, δ(q0, a) = {q0, q1} and δ(q0, c) = {q0, q2}.
Implementing an NFA
If you think of an automaton as a computer, how does it handle nondeterminism? There are two ways that this could, in theory, be done:
1. When the automaton is faced with a choice, it always (magically) chooses correctly. We sometimes think of the automaton as consulting an oracle which advises it as to the correct choice.
2. When the automaton is faced with a choice, it spawns a new process, so that all possible paths are followed simultaneously.
The first of these alternatives, using an oracle, is sometimes attractive mathematically. But if we want to write a program to implement an NFA, that isn't feasible.
There are three ways, two feasible and one not yet feasible, to simulate the second alternative:
1. Use a recursive backtracking algorithm. Whenever the automaton has to make a choice, cycle through all the alternatives and make a recursive call to determine whether any of the alternatives leads to a solution (final state).
2. Maintain a state set or a state vector, keeping track of all the states that the NFA could be in at any given point in the string.
3. Use a quantum computer. Quantum computers explore literally all possibilities simultaneously. They are theoretically possible, but are at the cutting edge of physics. It may (or may not) be feasible to build such a device.
Recursive Implementation of NFAs
An NFA can be implemented by means of a recursive search from the start state for a path, directed by the symbols of the input string, to a final state. Here is a rough outline of such an implementation:

function nfa(state A) returns Boolean:
  local state B, symbol x;
  for each λ transition from state A to some state B do
    if nfa(B) then return True;
  if there is a next symbol then
  { read next symbol x;
    for each x transition from state A to some state B do
      if nfa(B) then return True;
    return False;
  }
  else
  { if A is a final state then return True;
    else return False;
  }

One problem with this implementation is that it could get into an infinite loop if there is a cycle of λ transitions. This could be prevented by maintaining a simple counter.
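A runnable version of the outline, in Python. The transition function is a dict from (state, symbol) to a set of states, with the empty string "" standing in for λ; instead of the counter suggested above, a set of states already visited by λ moves guards against λ-cycles. The example NFA (strings over {a, b} ending in ab, with one λ arc from the start state) is hypothetical, not from the text:

```python
def nfa_accepts(delta, finals, state, s, seen=frozenset()):
    """Recursive backtracking search for an accepting path."""
    if not s and state in finals:
        return True
    # lambda transitions; `seen` prevents looping on a cycle of lambda arcs
    for nxt in delta.get((state, ""), ()):
        if nxt not in seen and nfa_accepts(delta, finals, nxt, s,
                                           seen | {state, nxt}):
            return True
    if s:
        # consume one symbol; the lambda guard resets after real progress
        for nxt in delta.get((state, s[0]), ()):
            if nfa_accepts(delta, finals, nxt, s[1:]):
                return True
    return False

# Hypothetical example: strings over {a, b} ending in "ab"
DELTA = {
    ("s", ""): {"q0"},            # lambda arc out of the start state
    ("q0", "a"): {"q0", "q1"},    # nondeterministic choice on a
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
print(nfa_accepts(DELTA, {"q2"}, "s", "aab"))  # True
print(nfa_accepts(DELTA, {"q2"}, "s", "aba"))  # False
```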
State-Set Implementation of NFAs
Another way to implement an NFA is to keep either a state set or a bit vector of all the states that the NFA could be in at any given time. Implementation is easier if you use a bit-vector approach (v(i) is True iff state i is a possible state), since most languages provide vectors, but not sets, as a built-in data type. However, it's a bit easier to describe the algorithm if you use a state-set approach, so that's what we will do. The logic is the same in either case.
function nfa(state set A) returns Boolean:
  local state set B, state a, state b, state c, symbol x;
  for each λ transition from some state a in A to some state b not in A do
    add b to A;
  while there is a next symbol do
  { read next symbol x;
    B := ∅;
    for each state a in A do
      for each x transition from a to some state b do
        add b to B;
    for each λ transition from some state b in B to some state c not in B do
      add c to B;
    A := B;
  }
  if any element of A is a final state then return True;
  else return False;
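The same outline in runnable Python, with the λ steps gathered into a closure function that iterates to a fixed point. The example NFA is the hypothetical ending-in-ab machine, not one from the text:

```python
def lambda_closure(delta, states):
    """All states reachable from `states` by zero or more lambda ("") arcs."""
    stack, closure = list(states), set(states)
    while stack:
        st = stack.pop()
        for nxt in delta.get((st, ""), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure

def nfa_run(delta, start, finals, s):
    """State-set simulation: A holds every state the NFA could be in."""
    A = lambda_closure(delta, {start})
    for x in s:
        B = set()
        for a in A:
            B |= set(delta.get((a, x), ()))   # all x transitions out of A
        A = lambda_closure(delta, B)           # then follow lambda arcs
    return bool(A & set(finals))

DELTA = {("s", ""): {"q0"}, ("q0", "a"): {"q0", "q1"},
         ("q0", "b"): {"q0"}, ("q1", "b"): {"q2"}}
print(nfa_run(DELTA, "s", {"q2"}, "aab"))  # True: ends in ab
print(nfa_run(DELTA, "s", {"q2"}, "abb"))  # False
```

Unlike the recursive version, this runs in time linear in the length of the string (times a factor for the state-set work), because it never backtracks.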
Conversion from NFA to DFA
From NFA to DFA
Consider the following NFA (diagram omitted).
What states can the NFA be in before reading any input? Obviously the start state A, but there is an ε-transition from A to B, so we could also be in state B. For the DFA we construct the composite state (A, B).
State (A, B) lacks a transition for x. From A, x takes us to A in the NFA, and the ε-transition might then take us to B; from B, x takes us to B. So in the DFA, x takes us from (A, B) to (A, B).
State (A, B) also needs a transition for y. In the NFA, δ(A, y) = C and δ(B, y) = C, so we need to add a state (C) and an arc labeled y from (A, B) to (C).
In the NFA, δ(C, x) = A, but then the ε-transition might or might not take us on to B, so x takes (C) to the existing state (A, B). On y, δ(C, y) is either B or C, so we need to add the state (B, C) and an arc labeled y from (C) to this new state.
In the NFA, δ(B, x) = B and δ(C, x) = A, and by an ε-transition we might get back to B, so we need an x arc from (B, C) to (A, B).
δ(B, y) = C, while δ(C, y) is either B or C, so we have an arc labeled y from (B, C) to (B, C).
We now have a transition from every state for every symbol. The only remaining chore is to mark all the final states: in the NFA, B is a final state, so every composite state containing B is a final state of the DFA.
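The walkthrough above is an instance of the general subset construction, which can be sketched as follows (Python; the example NFA is a hypothetical stand-in for the omitted diagram):

```python
# A minimal subset construction (NFA -> DFA). DFA states are frozensets
# of NFA states; the example NFA below is an assumption for illustration.

def eps_closure(states, eps):
    closure, frontier = set(states), list(states)
    while frontier:
        a = frontier.pop()
        for b in eps.get(a, ()):
            if b not in closure:
                closure.add(b)
                frontier.append(b)
    return frozenset(closure)

def subset_construction(start, delta, eps, alphabet):
    d_start = eps_closure({start}, eps)
    dfa, todo = {}, [d_start]
    while todo:
        S = todo.pop()
        if S in dfa:                 # composite state already processed
            continue
        dfa[S] = {}
        for x in alphabet:
            T = set()
            for a in S:
                T |= delta.get((a, x), set())
            T = eps_closure(T, eps)  # close the successor set under eps-moves
            dfa[S][x] = T
            todo.append(T)
    return d_start, dfa

# An NFA in the spirit of the example: eps-arc A->B, x-loops on A and B,
# y-arcs from A and B to C, and an x-arc from C back to A.
delta = {("A", "x"): {"A"}, ("B", "x"): {"B"},
         ("A", "y"): {"C"}, ("B", "y"): {"C"}, ("C", "x"): {"A"}}
eps = {"A": {"B"}}
start, dfa = subset_construction("A", delta, eps, "xy")
print(sorted(start))            # ['A', 'B']
print(sorted(dfa[start]["y"]))  # ['C']
```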
The Pumping Lemma
Here's what the pumping lemma says:
• If an infinite language is regular, it can be defined by a dfa.
• The dfa has some finite number of states, say n.
• Since the language is infinite, some strings of the language must have length > n.
• For a string of length > n accepted by the dfa, the walk through the dfa must contain a cycle.
• Repeating the cycle an arbitrary number of times must yield another string accepted by the dfa.
Thus the pumping lemma for regular languages is a way of proving that a given infinite language is not regular. The pumping lemma cannot be used to prove that a given language is regular. The proof is always by contradiction. A brief outline of the technique is as follows:
• Assume the language L is regular.
• By the pigeonhole principle, any sufficiently long string in L must repeat some state in the dfa; thus the walk contains a cycle.
• Show that repeating the cycle some number of times ("pumping" the cycle) yields a string that is not in L.
• Conclude that L is not regular.
Why this is hard:
• We don't know the dfa (if we did, the language would be regular). Thus, we have to do the proof for an arbitrary dfa that accepts L.
• Since we don't know the dfa, we certainly don't know the cycle.
Why we can sometimes pull it off:
• We get to choose the string, but it must be in L.
• We get to choose the number of times to "pump".
Applying the Pumping Lemma
Here is a more formal statement of the pumping lemma:
If L is an infinite regular language, then there exists some positive integer m such that any string w ∈ L whose length is m or greater can be decomposed into three parts, xyz, where
• |xy| is less than or equal to m,
• |y| > 0, and
• wi = x y^i z is also in L for all i = 0, 1, 2, 3, ....
Here is what it all means:
• m is a (finite) number chosen so that strings of length m or greater must contain a cycle. Hence m must be equal to or greater than the number of states in the dfa. Remember that we don't know the dfa, so we can't actually choose m; we just know that such an m must exist.
• Since string w has length greater than or equal to m, we can break it into two parts, xy and z, such that xy must contain a cycle. We don't know the dfa, so we don't know exactly where to make this break, but we know that |xy| can be less than or equal to m.
• We let x be the part before the cycle, y be the cycle, and z the part after the cycle. (It is possible that x and z also contain cycles, but we don't care about that.) Again, we don't know exactly where to make this break.
• Since y is the cycle we are interested in, we must have |y| > 0; otherwise it isn't a cycle.
• By repeating y an arbitrary number of times, x y^i z, we must get other strings in L.
• If, despite all the above uncertainties, we can show that the dfa has to accept some string that we know is not in the language, then we can conclude that the language is not regular.
To use this lemma, we need to show:
1. For any choice of m,
2. for some w ∈ L that we get to choose (and we will choose one of length at least m),
3. for any way of decomposing w into xyz, so long as |xy| isn't greater than m and y isn't ε,
4. we can choose an i such that x y^i z is not in L.
We can view this as a game wherein our opponent makes moves 1 and 3 (choosing m and choosing xyz), and we make moves 2 and 4 (choosing w and choosing i). Our goal is to show that we can always beat our opponent. If we can show this, we have proved that L is not regular.
Pumping Lemma Example 1
Prove that L = {a^n b^n : n ≥ 0} is not regular.
1. We don't know m, but assume there is one.
2. Choose a string w = a^n b^n where n > m, so that any prefix of length m consists entirely of a's.
3. We don't know the decomposition of w into xyz, but since |xy| ≤ m, xy must consist entirely of a's. Moreover, y cannot be empty.
4. Choose i = 0. This has the effect of dropping |y| a's out of the string, without affecting the number of b's. The resultant string has fewer a's than b's, hence does not belong to L. Therefore L is not regular.
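The "game" in Example 1 can be checked by brute force for one concrete m (a sketch, not a proof; the function names are ours):

```python
# Playing the pumping-lemma game for L = { a^n b^n } by brute force:
# for one concrete m, check every legal decomposition w = xyz.
# This illustrates the argument; it is not itself a proof for all m.

def in_L(s):
    n = s.count("a")
    return s == "a" * n + "b" * n

def adversary_wins(w, m, i):
    """True if for EVERY split w = xyz with |xy| <= m and |y| >= 1,
    the pumped string x y^i z falls outside L."""
    for j in range(0, m):                 # j = |x|
        for k in range(1, m - j + 1):     # k = |y|, at least 1
            x, y, z = w[:j], w[j:j + k], w[j + k:]
            if in_L(x + y * i + z):
                return False
    return True

m = 5
w = "a" * m + "b" * m            # our move 2: a string in L of length >= m
print(adversary_wins(w, m, 0))   # True: pumping down always leaves L
```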
Pumping Lemma Example 2
Prove that L = {a^n b^k : n > k and n ≥ 0} is not regular.
1. We don't know m, but assume there is one.
2. Choose a string w = a^n b^k where n > m, so that any prefix of length m consists entirely of a's, and k = n − 1, so that there is just one more a than b.
3. We don't know the decomposition of w into xyz, but since |xy| ≤ m, xy must consist entirely of a's. Moreover, y cannot be empty.
4. Choose i = 0. This has the effect of dropping |y| a's out of the string, without affecting the number of b's. The resultant string has fewer a's than before, so it has either fewer a's than b's, or the same number of each. Either way, the string does not belong to L, so L is not regular.
Pumping Lemma Example 3
Prove that L = {a^n : n is a prime number} is not regular.
1. We don't know m, but assume there is one.
2. Choose a string w = a^n where n is a prime number and |xyz| = n > m + 1. This can always be done because there is no largest prime number. Any prefix of w consists entirely of a's.
3. We don't know the decomposition of w into xyz, but since |xy| ≤ m, it follows that |z| > 1. As usual, |y| > 0.
4. Since |z| > 1, |xz| > 1. Choose i = |xz|. Then
|x y^i z| = |xz| + |y| · |xz| = (1 + |y|) · |xz|.
Since (1 + |y|) and |xz| are each greater than 1, the product must be a composite number. Thus |x y^i z| is not prime, so x y^i z is not in L, and L is not regular.
Closure Properties of Regular Sets
Closure I
A set is closed under an operation if, whenever the operation is applied to members of the set, the result is also a member of the set.
For example, the set of integers is closed under addition, because x + y is an integer whenever x and y are integers. However, the integers are not closed under division: if x and y are integers, x/y may or may not be an integer.
We have defined several operations on languages:
L1 ∪ L2    strings in either L1 or L2
L1L2       strings composed of one string from L1 followed by one string from L2
-L1        all strings over the same alphabet that are not in L1
L1*        zero or more strings from L1 concatenated together
L1 - L2    strings in L1 that are not in L2
L1^R       the strings in L1, reversed
We will show that the set of regular languages is closed under each of these operations. We will also define the operations of "homomorphism" and "right quotient" and show that the set of regular languages is also closed under these operations.
Closure II: Union, Concatenation, Negation, Kleene Star, Reverse
General Approach
• Build automata (dfas or nfas) for each of the languages involved.
• Show how to combine the automata to create a new automaton that recognizes the desired language.
• Since the language is represented by an nfa or dfa, conclude that the language is regular.
Union of L1 and L2
• Create a new start state.
• Make an ε-transition from the new start state to each of the original start states.
Concatenation of L1 and L2
• Put an ε-transition from each final state of L1 to the initial state of L2.
• Make the original final states of L1 nonfinal.
Negation of L1
• Start with a (complete) dfa, not with an nfa.
• Make every final state nonfinal and every nonfinal state final.
Kleene Star of L1
• Make a new start state; connect it to the original start state with an ε-transition.
• Make a new final state; connect the original final states (which become nonfinal) to it with ε-transitions.
• Connect the new start state and new final state with a pair of ε-transitions.
Reverse of L1
• Start with an automaton with just one final state.
• Make the initial state final and the final state initial.
• Reverse the direction of every arc.
Closure III: Intersection and Set Difference
Just as with the other operations, you prove that regular languages are closed under intersection and set difference by starting with automata for the initial languages, and constructing a new automaton that represents the operation applied to them. However, the constructions are somewhat trickier.
In these constructions you form a completely new machine, whose states are each labeled with an ordered pair of state names; the first element of each pair is a state from L1, and the second element of each pair is a state from L2. (Usually you won't need a state for every such pair, just some of them.)
1. Begin by creating a start state whose label is (start state of L1, start state of L2).
2. Repeat the following until no new arcs can be added:
(a) Find a state (A, B) that lacks a transition for some x in Σ.
(b) Add a transition on x from state (A, B) to state (δ(A, x), δ(B, x)). (If this state doesn't already exist, create it.)
The same construction is used for both intersection and set difference; the distinction is in how the final states are selected.
Intersection: Mark a state (A, B) as final if both (i) A is a final state in L1 and (ii) B is a final state in L2.
Set difference: Mark a state (A, B) as final if A is a final state in L1 but B is not a final state in L2.
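The pair construction above can be sketched directly (Python; the two example DFAs, "even number of 0's" and "even number of 1's", are assumptions for illustration):

```python
# Product construction for intersection / set difference of two DFAs.
# Each DFA is a dict delta[(state, symbol)] -> state; a sketch, not a library.

def product_dfa(d1, s1, f1, d2, s2, f2, alphabet, mode="intersection"):
    start = (s1, s2)
    delta, finals, seen, todo = {}, set(), set(), [start]
    while todo:
        a, b = todo.pop()
        if (a, b) in seen:
            continue
        seen.add((a, b))
        ok1, ok2 = a in f1, b in f2
        # final-state rule differs between the two operations
        if (mode == "intersection" and ok1 and ok2) or \
           (mode == "difference" and ok1 and not ok2):
            finals.add((a, b))
        for x in alphabet:
            nxt = (d1[(a, x)], d2[(b, x)])
            delta[((a, b), x)] = nxt
            todo.append(nxt)
    return start, delta, finals

def accepts(start, delta, finals, w):
    q = start
    for x in w:
        q = delta[(q, x)]
    return q in finals

# D1 tracks parity of 0's (final: even); D2 tracks parity of 1's (final: even).
d1 = {("e", "0"): "o", ("o", "0"): "e", ("e", "1"): "e", ("o", "1"): "o"}
d2 = {("E", "1"): "O", ("O", "1"): "E", ("E", "0"): "E", ("O", "0"): "O"}
start, delta, finals = product_dfa(d1, "e", {"e"}, d2, "E", {"E"}, "01")
print(accepts(start, delta, finals, "0101"))  # True: two 0's, two 1's
print(accepts(start, delta, finals, "010"))   # False
```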
Closure IV: Homomorphism
Note: "homomorphism" is a term borrowed from group theory; what we refer to as a "homomorphism" here is really a special case.
Suppose Σ and Γ are alphabets (not necessarily distinct). Then a homomorphism h is a function from Σ to Γ*.
If w is a string in Σ*, then we define h(w) to be the string obtained by replacing each symbol x ∈ Σ by the corresponding string h(x) ∈ Γ*.
If L is a language on Σ, then its homomorphic image is a language on Γ. Formally,
h(L) = {h(w) : w ∈ L}
Theorem
If L is a regular language on Σ, then its homomorphic image h(L) is a regular language on Γ. That is, if you replaced every string w in L with h(w), the resultant set of strings would be a regular language on Γ.
Proof
• Construct a dfa representing L. This is possible because L is regular.
• For each arc in the dfa, replace its label x ∈ Σ with h(x) ∈ Γ*.
• If an arc is labeled with a string w of length greater than one, replace the arc with a series of arcs and (new) states, so that each arc is labeled with a single element of Γ. The result is an nfa that recognizes exactly the language h(L).
• Since the language h(L) can be specified by an nfa, the language is regular.
Closure V: Right Quotient
Let L1 and L2 be languages on the same alphabet. The right quotient of L1 with L2 is
L1/L2 = {w : wx ∈ L1 for some x ∈ L2}
That is, the strings in L1/L2 are strings from L1 "with their tails cut off": if some string of L1 can be broken into two parts, w and x, where x is in the language L2, then w is in the language L1/L2.
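For finite languages the definition can be applied literally (an illustrative sketch):

```python
# Right quotient L1/L2 for finite languages, straight from the definition
# L1/L2 = { w : wx is in L1 for some x in L2 }. A toy sketch for intuition.

def right_quotient(L1, L2):
    return {w[:len(w) - len(x)]          # cut the tail x off the end of w
            for w in L1 for x in L2
            if w.endswith(x)}

L1 = {"aab", "ab", "ba"}
L2 = {"b"}
print(sorted(right_quotient(L1, L2)))    # ['a', 'aa']
```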
UNIT-3
Deterministic TM (DTM)
A DTM is a quintuple (Q, Σ, Γ, δ, s), where
– the set of states Q is finite, not containing the halt state h,
– the input alphabet Σ is a finite set of symbols, not including the blank symbol ⊔,
– the tape alphabet Γ is a finite set of symbols containing Σ, but not including the blank symbol ⊔,
– the start state s is in Q, and
– the transition function δ is a partial function from Q × (Γ ∪ {⊔}) to (Q ∪ {h}) × (Γ ∪ {⊔}) × {L, R, S}.
Nondeterministic TM (NTM)
An NTM starts working and stops working in the same way as a DTM, but each move of an NTM can be nondeterministic. In each move, the machine reads the symbol under its tape head and, according to the transition relation on the symbol read and its current state, chooses one move nondeterministically to:
– write a symbol on the tape,
– move its tape head one cell to the left or right, or keep it stationary, and
– change its state to the next state.
Formally, an NTM is a quintuple (Q, Σ, Γ, Δ, s), where
– the set of states Q is finite, and does not contain the halt state h,
– the input alphabet Σ is a finite set of symbols, not including the blank symbol ⊔,
– the tape alphabet Γ is a finite set of symbols containing Σ, but not including the blank symbol ⊔,
– the start state s is in Q, and
– the transition function Δ maps Q × (Γ ∪ {⊔}) to subsets of (Q ∪ {h}) × (Γ ∪ {⊔}) × {L, R, S}.
Definition
• Let T = (Q, Σ, Γ, δ, s) be a TM.
A configuration of T is an element of Q × Γ* × (Γ ∪ {⊔}) × Γ*, recording the current state, the tape contents to the left of the head, the scanned symbol, and the tape contents to the right of the head.
• It can be written as
– (q, l, a, r), or
– (q, l a r).
Equivalence of NTM and DTM
Theorem: For any NTM Mn, there exists a DTM Md such that:
– if Mn halts on input w with output u, then Md halts on input w with output u, and
– if Mn does not halt on input w, then Md does not halt on input w.
Proof (sketch):
Let Mn = (Q, Σ, Γ, Δ, s) be an NTM. We construct a 2-tape TM Md from Mn that tries, in breadth-first order, every finite sequence of choices Mn could make, using its second tape to record the sequence of choices currently being simulated; Md halts exactly when one of the simulated computations of Mn halts.
Programming Techniques for Turing Machine Construction:
A Turing machine is as powerful as a conventional computer. The following are the different techniques for constructing a TM to meet high-level needs.
1. Storage in the finite control (or state)
2. Multiple tracks
3. Subroutines
4. Checking off symbols
Storage in the State (or) Storage in the Finite Control:
The finite control can also be used to hold a finite amount of information, along with its task of representing a position in the program. The state is written as a pair of elements, one for control and the other storing a symbol.
Figure: Storage in finite control
Multiple Tracks:
It is also possible for the Turing machine's input tape to be divided into several tracks. Each track can hold one symbol, and the tape alphabet of the TM consists of tuples with one component for each track.
Figure: A three-track Turing machine
Subroutines:
A problem with the same task to be repeated many times can be programmed using subroutines. A Turing machine subroutine is a set of states that performs some useful process. The idea here is to write part of a TM program to serve as a subroutine, which has its own initial state and a return state for returning to the calling routine. This supports modular, or top-down, program design.
Multitape Turing Machine:
A multitape Turing machine has a finite control and some finite number of tapes. Each tape is infinite in both directions. The machine has its own initial state and some accepting states.
Initially,
• the input, a finite sequence of input symbols, is placed on the first tape,
• all the other cells of all the tapes hold the blank, and
• the tape head of the first tape is at the left end of the input.
Figure: Multitape Turing Machine
In one move, the multitape TM can
• change state,
• print a new symbol on each of the cells scanned by its tape heads, and
• move each of its tape heads, independently, one cell to the left or right, or keep it stationary.
Context Free Grammar (CFG)
A context free grammar is a 4-tuple (V, T, P, S) where
V is a finite set of variables or non-terminals,
T is a finite set of terminals,
P is a finite set of productions, and
S is a start symbol with S ∈ V.
Given a grammar G = ({S}, {a, b}, R, S), the set of rules or productions R is defined as
S → aSb
S → SS
S → ε
The left-hand side of each rule is the single non-terminal S. This grammar generates strings such as ab, abab, aaabbb, and aababb.
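The strings this grammar generates can be enumerated by brute force (a sketch; the cap on sentential-form length is a simplification that suffices for the short strings shown):

```python
# Brute-force enumeration of short strings derivable from the grammar
# S -> aSb | SS | eps, expanding the leftmost S in every possible way.
# A toy sketch: the cap on sentential-form length is a heuristic that
# is sufficient for the small lengths demonstrated here.

def generate(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    rules = ["aSb", "SS", ""]                 # the three productions for S
    results, seen, todo = set(), set(), ["S"]
    while todo:
        form = todo.pop()
        if form in seen or len(form) > max_len + 2:   # cap sentential forms
            continue
        seen.add(form)
        i = form.find("S")
        if i < 0:                             # no variables left: terminal string
            if len(form) <= max_len:
                results.add(form)
            continue
        for rhs in rules:                     # expand the leftmost S each way
            todo.append(form[:i] + rhs + form[i + 1:])
    return results

print(sorted(generate(4)))  # ['', 'aabb', 'ab', 'abab']
```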
Grammars for Regular Languages
• A language defined by a dfa is a regular language.
• Any dfa can be regarded as a special case of an nfa.
• Any nfa can be converted to an equivalent dfa; thus a language defined by an nfa is a regular language.
• A regular expression can be converted to an equivalent nfa; thus a language defined by a regular expression is a regular language.
• An nfa can, with some effort, be converted to a regular expression.
So dfas, nfas, and regular expressions are all equivalent, in the sense that any language you can define with one of these could be defined by the others as well.
We also know that languages can be defined by grammars. Now we will begin to classify grammars, and the first kinds of grammars we will look at are the regular grammars. As you might expect, regular grammars will turn out to be equivalent to dfas, nfas, and regular expressions.
Classifying Grammars
Recall that a grammar G is a quadruple G = (V, T, S, P), where
• V is a finite set of metasymbols, or variables,
• T is a finite set of terminal symbols,
• S ∈ V is a distinguished element of V called the start symbol, and
• P is a finite set of productions.
The above is true for all grammars. We will distinguish among different kinds of grammars based on the form of the productions. If the productions of a grammar all follow a certain pattern, we have one kind of grammar; if the productions all fit a different pattern, we have a different kind of grammar.
In a general grammar, productions have the form
(V ∪ T)+ → (V ∪ T)*
In a right linear grammar, all productions have one of the two forms:
V → T*V
or
V → T*
That is, the left-hand side must consist of a single variable, and the right-hand side consists of any number of terminals (members of T), optionally followed by a single variable. (Thus, following the arrow, a variable can occur only as the rightmost symbol of the production.)
Right Linear Grammars and NFAs
There is a simple connection between right linear grammars and NFAs, suggested by the following correspondences (the original diagrams are omitted):
• A production A → xB corresponds to an arc labeled x from state A to state B.
• A production A → xyzB corresponds to a chain of arcs labeled x, y, z leading from A through new intermediate states to B.
• A production A → B corresponds to an ε-arc from A to B.
• A production A → x corresponds to an arc labeled x from A to a final state.
As an example of the correspondence between an nfa and a right linear grammar, the following automaton and grammar (omitted) both recognize the set of strings consisting of an even number of 0's and an even number of 1's.
Left Linear Grammars
In a left linear grammar, all productions have one of the two forms:
V → VT*
or
V → T*
That is, the left-hand side must consist of a single variable, and the right-hand side consists of an optional single variable followed by any number of terminals. This is just like a right linear grammar except that, following the arrow, a variable can occur only on the left of the terminals, rather than only on the right.
We won't pay much attention to left linear grammars, because they turn out to be equivalent to right linear grammars. Given a left linear grammar, we can construct a right linear grammar for the same language, as follows.
Step 1. Construct a right linear grammar for the reversed language L^R: replace each production A → x of the left linear grammar with a production A → x^R, and replace each production A → Bx with a production A → x^R B.
Step 2. Construct an nfa for L^R from the right linear grammar. (We talked about deriving an nfa from a right linear grammar on an earlier page.) This nfa should have just one final state; if it has more than one, make those states nonfinal, add a new final state, and put ε-transitions from each previously final state to the new final state.
Step 3. Reverse the nfa for L^R to obtain an nfa for L: reverse the direction of the arcs, and make the initial state final and the final state initial. (This is the technique we just talked about on an earlier page.)
Step 4. Construct a right linear grammar for L from the nfa for L.
Regular Grammars
A regular grammar is either a right linear grammar or a left linear grammar.
To be a right linear grammar, every production of the grammar must have one of the two forms V → T*V or V → T*.
To be a left linear grammar, every production of the grammar must have one of the two forms V → VT* or V → T*.
You do not get to mix the two. For example, consider a grammar with the following productions:
S → ε
S → aX
X → Sb
This grammar is neither right linear nor left linear, hence it is not a regular grammar. We have no reason to suppose that the language it generates is a regular language (one that is accepted by a dfa).
Nondeterministic Pushdown Automaton (NPDA) or PDA
An NPDA is defined by a 7-tuple
M = (Q, Σ, Γ, δ, q0, Z0, F)
where
Q = finite set of states
Σ = finite set of input symbols
Γ = finite set of pushdown symbols, also called the stack alphabet
q0 = initial state
Z0 = initial stack symbol
F = finite set of accepting states
δ = transition function,
δ: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ*
Languages accepted by final state
The language accepted by a PDA M by final state is
L(M) = {w | (q0, w, Z0) ⊢* (p, ε, γ) for some p in F and γ in Γ*}
That is, the language accepted by final state is the set of all inputs for which some sequence of moves causes the PDA to enter a final state.
A PDA is deterministic in the sense that at most one move is possible from any ID. A PDA M = (Q, Σ, Γ, δ, q0, Z0, F) is deterministic if
a) for each q in Q and Z in Γ, whenever δ(q, ε, Z) is non-empty, δ(q, a, Z) is empty for all a in Σ, and
b) for no q in Q, Z in Γ, and a in Σ ∪ {ε} does δ(q, a, Z) contain more than one element.
Transition Functions for NPDAs
The transition function for an npda has the form
δ: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ*
δ is now a function of three arguments. The first two are the same as before: the state, and either ε or a symbol from the input alphabet. The third argument is the symbol on top of the stack. Just as the input symbol is "consumed" when the function is applied, the stack symbol is also consumed (removed from the stack).
Note that while the second argument may be ε rather than a member of the input alphabet (so that no input symbol is consumed), there is no such option for the third argument: δ always consumes a symbol from the stack, and no move is possible if the stack is empty.
In the deterministic case, when the function δ is applied, the automaton moves to a new state q ∈ Q and pushes a new string of symbols x ∈ Γ* onto the stack. Since we are dealing with a nondeterministic pushdown automaton, the result of applying δ is a finite set of (q, x) pairs. If we were to draw the automaton, each such pair would be represented by a single arc.
As with an nfa, we do not need to specify δ for every possible combination of arguments. For any case where δ is not specified, the transition is to ∅, the empty set of states.
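A small simulator for such an npda can be sketched as follows (Python; the encoding of δ and the example PDA for {a^n b^n : n ≥ 1}, accepted by final state, are assumptions):

```python
# A small nondeterministic PDA simulator (a sketch). delta maps
# (state, input_symbol_or_"", stack_top) -> set of (new_state, pushed_string);
# pushed_string replaces the popped top, its leftmost symbol becoming the new top.

def npda_accepts(delta, start, Z0, finals, w):
    # configurations: (state, remaining input, stack as string, top at index 0)
    todo, seen = [(start, w, Z0)], set()
    while todo:
        q, rest, stack = todo.pop()
        if (q, rest, stack) in seen:
            continue
        seen.add((q, rest, stack))
        if not rest and q in finals:          # accept by final state
            return True
        if not stack:                          # no move possible on empty stack
            continue
        top, below = stack[0], stack[1:]
        for p, push in delta.get((q, "", top), ()):      # epsilon-moves
            todo.append((p, rest, push + below))
        if rest:
            for p, push in delta.get((q, rest[0], top), ()):
                todo.append((p, rest[1:], push + below))
    return False

# Assumed NPDA for { a^n b^n : n >= 1 }: push A per a, pop A per b,
# then move to the final state qf on the bottom marker Z.
delta = {
    ("q0", "a", "Z"): {("q0", "AZ")},
    ("q0", "a", "A"): {("q0", "AA")},
    ("q0", "b", "A"): {("q1", "")},
    ("q1", "b", "A"): {("q1", "")},
    ("q1", "", "Z"): {("qf", "Z")},
}
print(npda_accepts(delta, "q0", "Z", {"qf"}, "aabb"))  # True
print(npda_accepts(delta, "q0", "Z", {"qf"}, "aab"))   # False
```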
Show whether the language L = {a^(n²) | n ≥ 1} is context free or not.
Soln:
Assume L is context free, and let m be the constant of the pumping lemma for context free languages. Choose z = a^(n²) with n ≥ m, so |z| = n² ≥ m.
Decompose z = uvwxy where
1 ≤ |vx| and |vwx| ≤ m ≤ n.
By the pumping lemma, u v² w x² y is in L. Since
|u v² w x² y| = n² + |vx| > n²,
membership in L would require |u v² w x² y| = k² for some k ≥ n + 1. But
|u v² w x² y| = n² + |vx| ≤ n² + m ≤ n² + n < n² + 2n + 1 = (n + 1)².
Therefore |u v² w x² y| lies strictly between n² and (n + 1)², so it cannot be a perfect square.
Hence u v² w x² y ∉ L, which is a contradiction.
Therefore {a^(n²) | n ≥ 1} is not context free.
Closure Properties of CFL
These operations are useful not only in constructing or proving that certain languages are context free, but also in proving certain languages not to be context free. Here we see some of the operations, such as substitutions, homomorphisms, inverse homomorphism, intersection, difference, etc.
Substitutions
Let Σ be an alphabet. For each symbol a in Σ, choose a language La; setting s(a) = La defines a substitution s.
If w = a1 a2 ... an is a string in Σ*, then s(w) is the language of all strings x1 x2 ... xn such that the string xi is in the language s(ai) for i = 1, 2, ..., n. s(L) is the union of s(w) for all strings w in L.
Example:
Let s(0) = {0^n 1^n | n ≥ 1} and s(1) = {00, 11}.
s(1) contains only two strings, 00 and 11, but s(0) contains one or more 0's followed by an equal number of 1's.
Let w = 10. Then s(w) is the concatenation of s(1) and s(0):
s(w) = s(1).s(0) = {00 0^n 1^n | n ≥ 1} ∪ {11 0^n 1^n | n ≥ 1}.
Applications of the Substitution Theorem
Theorem
The context free languages are closed under union, concatenation, closure, and homomorphism.
Proof:
Apply the previous theorem (the substitution theorem) for all these operations.
1. Union:
Let L1 and L2 be two CFL's. Then the union of L1 and L2 is
L1 ∪ L2 = s(L), where L is the language {1, 2} and s is the substitution given by s(1) = L1 and s(2) = L2.
2. Concatenation:
Let L1 and L2 be CFL's. Then the concatenation of the two languages L1 and L2 is
L1.L2 = s(L), where L is the language {12} and s is the substitution s(1) = L1 and s(2) = L2.
3. Closure and Positive Closure:
If L1 is a CFL, then its closure is given by
s(L) = L1*, where L = {1}* and s(1) = L1.
The positive closure is given by
s(L) = L1+, where L = {1}+.
4. Homomorphism:
Let L be a CFL over the alphabet Σ, and let h be a homomorphism on Σ. Let s be the substitution that replaces each symbol a in Σ by the language consisting of the one string h(a):
s(a) = {h(a)} for all a in Σ.
Then h(L) = s(L).
Intersection
Theorem 1
The CFL's are not closed under intersection.
Example
Let L1 = {a^n b^n c^m | n ≥ 1, m ≥ 1} and L2 = {a^n b^m c^m | n ≥ 1, m ≥ 1}. Both are context free, but L = L1 ∩ L2 = {a^n b^n c^n | n ≥ 1} is not: L1 requires that the number of a's equal the number of b's, while L2 requires that the number of b's equal the number of c's, and no context free grammar can enforce both constraints at once.
So CFL's are not closed under intersection.
Inverse Homomorphism
If h is a homomorphism and L is any language, then h⁻¹(L) is the set of strings w such that h(w) is in L.
The CFL's are closed under inverse homomorphism.
Figure: Inverse homomorphism of a PDA (the machine keeps, besides its state and stack, a buffer holding the unconsumed suffix of h(a))
After reading the input symbol a, h(a) is placed in a buffer. The symbols of h(a) are used one at a time and fed to the PDA being simulated.
Before reading the next input symbol, the PDA checks whether the buffer is empty; if it is, the PDA reads the next input symbol and applies the homomorphism to it.
Design a TM to accept the language L = {0^n 1^n | n ≥ 1}
Given a finite sequence of 0's and 1's on its tape, the Turing machine is designed in the following way.
(i) M replaces the leftmost 0 by x, moves right to the leftmost 1, replacing it by y.
(ii) Then M moves left to find the rightmost x, moves one cell right to the leftmost 0, and repeats the cycle.
(iii) While searching for a 1, if a blank is encountered, then M halts without accepting.
(iv) After changing a 1 to a y, if M finds no more 0's, then M checks that no more 1's remain, accepting the string; otherwise it rejects.
Assume the set of states Q = {q0, q1, q2, q3, q4},
Σ = {0, 1},
Γ = {0, 1, x, y, B},
F = {q4}.
Let q0 be the initial state. At state q0, M replaces the leftmost 0 by x and changes to q1; in q1, M searches right for a 1, skipping over 0's and y's.
If M finds a 1, it changes it to y, entering state q2. From q2, it searches left for an x, and moves one cell right, changing the state to q0.
At q0, if y is encountered, M goes to state q3 and checks that no 1's remain. If the y's are followed by a B, state q4 is entered and the string is accepted. In all other cases, M rejects.
Eg: the transition function δ
State    0          1          x          y          B
q0       (q1,x,R)   -          -          (q3,y,R)   -
q1       (q1,0,R)   (q2,y,L)   -          (q1,y,R)   -
q2       (q2,0,L)   -          (q0,x,R)   (q2,y,L)   -
q3       -          -          -          (q3,y,R)   (q4,B,R)
q4       -          -          -          -          -
(The entry δ(q3, B) = (q4, B, R) is the accepting move described above.)
(i)
q0 0011 ⊢ x q1 011 ⊢ x0 q1 11 ⊢ x q2 0y1 ⊢ q2 x0y1 ⊢ x q0 0y1 ⊢ xx q1 y1 ⊢ xxy q1 1 ⊢ xx q2 yy ⊢ x q2 xyy ⊢ xx q0 yy ⊢ xxy q3 y ⊢ xxyy q3 B ⊢ xxyyB q4.
Accepted.
(ii)
q0 011 ⊢ x q1 11 ⊢ q2 xy1 ⊢ x q0 y1 ⊢ xy q3 1, and no move is defined.
Rejected.
Figure: Transition diagram for 0^n 1^n.
M = (Q, Σ, Γ, δ, q0, B, {q4})
where
Q = {q0, q1, q2, q3, q4}
Σ = {0, 1}
Γ = {0, 1, x, y, B}
q0 = initial state
q4 = final state
δ is given in the table above.
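The machine above can be run directly from its transition table (a sketch; the dictionary encoding of δ and the step bound are assumptions):

```python
# Simulating the 0^n 1^n Turing machine straight from its delta table.
# The tape is a dict indexed by cell; unwritten cells read as the blank "B".

delta = {
    ("q0", "0"): ("q1", "x", 1), ("q0", "y"): ("q3", "y", 1),
    ("q1", "0"): ("q1", "0", 1), ("q1", "1"): ("q2", "y", -1), ("q1", "y"): ("q1", "y", 1),
    ("q2", "0"): ("q2", "0", -1), ("q2", "x"): ("q0", "x", 1), ("q2", "y"): ("q2", "y", -1),
    ("q3", "y"): ("q3", "y", 1), ("q3", "B"): ("q4", "B", 1),
}

def tm_accepts(w, max_steps=10_000):
    tape = {i: c for i, c in enumerate(w)}
    q, head = "q0", 0
    for _ in range(max_steps):
        if q == "q4":                  # final state reached: accept
            return True
        move = delta.get((q, tape.get(head, "B")))
        if move is None:               # no transition defined: halt and reject
            return False
        q, tape[head], d = move        # write, change state, then move head
        head += d
    return False

print(tm_accepts("0011"))  # True
print(tm_accepts("011"))   # False
print(tm_accepts(""))      # False (L requires n >= 1)
```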
Design a Turing machine to check whether the given input is prime or not, using multiple tracks.
Solution:
The binary input, greater than two, is placed on the first track, and the same input is also placed on the third track. The TM then writes the number two in binary form on the second track, and divides the third track by the second, as follows.
The number on the second track is subtracted from the third track as many times as possible, until the remainder is smaller than the divisor. If the remainder is zero, then the number on the first track is not a prime.
If the remainder is non-zero, then the number on the second track is increased by one. If the second track ever equals the first, the given number is a prime, because it is divisible only by one and by itself.
Eg: (i) 8
Track I   Track II   Track III
8         2          8          (1)
8         2          6          (2)
8         2          4          (3)
8         2          2          (4)
8         2          0          (5)
The remainder is zero, so the given number is not a prime number.
Eg: (ii) 5
Track I   Track II   Track III
5         2          5
5         2          3
5         2          1          remainder 1: increase the second track value by 1
5         3          5
5         3          2          remainder 2: increase the second track value by 1
5         4          5
5         4          1          remainder 1: increase the second track value by 1
5         5          5
Here the number on track II equals the number on track I.
∴ The given number is a prime number.
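The three-track strategy can be mimicked arithmetically (a sketch: division by repeated subtraction, exactly as on the tape; the function name is ours):

```python
# The three-track primality test, mimicked arithmetically: divide by
# repeated subtraction; if some divisor 2..n-1 leaves remainder 0, n is
# composite. A sketch of the tape-level strategy, not the TM itself.

def is_prime_by_subtraction(n):
    if n < 2:
        return False
    d = 2                       # track II: the current divisor
    while d < n:                # stop when track II equals track I
        r = n                   # track III: a fresh copy of n
        while r >= d:
            r -= d              # subtract track II from track III
        if r == 0:
            return False        # remainder zero: n is not prime
        d += 1                  # increase track II by one
    return True

print([k for k in range(2, 20) if is_prime_by_subtraction(k)])
# [2, 3, 5, 7, 11, 13, 17, 19]
```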
Undecidable Problems
A problem whose language is not recursive is said to be an undecidable problem. In other words, a problem which has no algorithm to solve it is called an undecidable problem. Examples:
1. Does Turing machine M halt on input w?
2. For grammars G1 and G2, check whether L(G1) = L(G2).
UNIT IV
Asymptotic Notation: O(), o(), Ω(), ω(), and Θ()
"Big-O" notation was introduced in P. Bachmann's 1892 book Analytische Zahlentheorie. He used it to say things like "x is O(n²)" instead of "x ≤ n²". The notation works well to compare algorithm efficiencies because we want to say that the growth of effort of a given algorithm approximates the shape of a standard function.
Big-O (O()) is one of five standard asymptotic notations. In practice, Big-O is used as a tight upper bound on the growth of an algorithm's effort (this effort is described by the function f(n)), even though, as written, it can also be a loose upper bound. To make its role as a tight upper bound clearer, "Little-o" (o()) notation is used to describe an upper bound that cannot be tight.
Definition (Big-O, O()): Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is O(g(n)) (or f(n) ∈ O(g(n))) if there exists a real constant c > 0 and there exists an integer constant n0 ≥ 1 such that f(n) ≤ c · g(n) for every integer n ≥ n0.
Definition (Little-o, o()): Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is o(g(n)) (or f(n) ∈ o(g(n))) if for any real constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) < c · g(n) for every integer n ≥ n0.
Definition (Big-Omega, Ω()): Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is Ω(g(n)) (or f(n) ∈ Ω(g(n))) if there exists a real constant c > 0 and there exists an integer constant n0 ≥ 1 such that f(n) ≥ c · g(n) for every integer n ≥ n0.
Definition (Little-omega, ω()): Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is ω(g(n)) (or f(n) ∈ ω(g(n))) if for any real constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) > c · g(n) for every integer n ≥ n0.
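A Big-O witness (c, n0) can be checked numerically on a finite range (a sketch, not a proof; the helper name is ours):

```python
# Numerically checking a Big-O witness: for f(n) = 3n + 5 and g(n) = n,
# the constants c = 4, n0 = 5 satisfy f(n) <= c * g(n) on every n tested.
# A finite check like this illustrates the definition; it is not a proof.

def witnesses_big_o(f, g, c, n0, upto=10_000):
    return all(f(n) <= c * g(n) for n in range(n0, upto))

f = lambda n: 3 * n + 5
g = lambda n: n
print(witnesses_big_o(f, g, c=4, n0=5))  # True
print(witnesses_big_o(f, g, c=3, n0=5))  # False: 3n + 5 > 3n for every n
```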
These four definitions have far more similarities than differences. The key restrictions can be summarized as follows (the accompanying graph and table are omitted): O() and Ω() require only that some constant c > 0 work, while o() and ω() require that every constant c > 0 work; O() and o() bound f(n) from above by c · g(n), while Ω() and ω() bound it from below.
Time and Space Complexity
Time complexity analysis determines the number of steps (operations) needed as a function of the problem size: count the exact number of steps needed for an algorithm as a function of the problem size. Each atomic operation is counted as one step:
o Arithmetic operations
o Comparison operations
o Other operations, such as assignment and return
Time complexity: A measure of the amount of time required to execute an algorithm
Objectives of time complexity analysis:
• To determine the feasibility of an algorithm by estimating an upper bound on the
amount of work performed
• To compare different algorithms before deciding on which one to implement
Analysis is based on the amount of work done by the algorithm
• Time complexity expresses the relationship between the size of the
input and the run time for the algorithm
• Usually expressed as a proportionality, rather than an exact function
To simplify analysis, we sometimes ignore work that takes a constant
amount of time, independent of the problem input size
• When comparing two algorithms that perform the same task, we often just
concentrate on the differences between algorithms
Simplified analysis can be based on:
 Number of arithmetic operations performed
 Number of comparisons made
 Number of times through a critical loop
 Number of array elements accessed
 etc
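Counting the steps of a critical loop, as described above, can be sketched as follows (the linear-search example is an illustrative assumption):

```python
# Counting the comparisons performed by a critical loop: linear search
# makes one comparison per iteration, so its worst-case count grows
# linearly with the input size. An illustrative sketch.

def linear_search_count(xs, target):
    comparisons = 0
    for x in xs:
        comparisons += 1        # one comparison per loop iteration
        if x == target:
            break
    return comparisons

# Worst case (target absent): the count equals the input size.
for n in (10, 100, 1000):
    print(n, linear_search_count(list(range(n)), -1))
```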
Simulations
A computer simulation, a computer model, or a computational model is a computer
program, run on a single computer, or a network of computers, that attempts to simulate an
abstract model of a particular system. Computer simulations have become a useful part
of mathematical modeling of many natural systems in physics (computational
physics), astrophysics, chemistry and biology, human systems
in economics, psychology, social science, and engineering. Simulation of a system is
represented as the running of the system's model. It can be used to explore and gain new
insights into new technology, and to estimate the performance of systems too complex
for analytical solutions.
Computer simulations vary from computer programs that run a few minutes, to networkbased groups of computers running for hours, to ongoing simulations that run for days. The
scale of events being simulated by computer simulations has far exceeded anything possible
(or perhaps even imaginable) using traditional paper-and-pencil mathematical modeling.
Over 10 years ago, a desert-battle simulation of one force invading another involved the
modeling of 66,239 tanks, trucks and other vehicles on simulated terrain around Kuwait,
using multiple supercomputers in the DoD High Performance Computer Modernization
Program. Other examples include a 1-billion-atom model of material deformation (2002); a
2.64-million-atom model of the complex maker of protein in all organisms, the ribosome, in
2005; a complete simulation of the life cycle of Mycoplasma genitalium in 2012; and
the Blue Brain project at EPFL (Switzerland), begun in May 2005, to create the first
computer simulation of the entire human brain, right down to the molecular level.
Reducibility:
A reduction is a way of converting one problem to another problem, so
that the solution to the second problem can be used to solve the first
problem. Finding the area of a rectangle reduces to measuring its width and height;
solving a set of linear equations reduces to inverting a matrix.
Reducibility involves two problems A and B.
If A reduces to B, you can use a solution to B to solve A.
When A is reducible to B, solving A cannot be "harder" than solving B.
If A is reducible to B and B is decidable, then A is also decidable.
If A is undecidable and reducible to B, then B is undecidable.
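The rectangle example above can be sketched in code (a toy illustration; the function names are assumptions made for this sketch): a solver for problem A is built entirely out of solvers for problem B, so any way of solving B immediately yields a way of solving A.

```python
# Sketch: problem A (computing a rectangle's area) reduces to problem B
# (measuring a side). The solver for A calls only solvers for B.

def measure_width(rect):          # solver for an instance of problem B
    return rect["w"]

def measure_height(rect):         # solver for another instance of problem B
    return rect["h"]

def area(rect):                   # solver for A, built only from solvers for B
    return measure_width(rect) * measure_height(rect)

print(area({"w": 3, "h": 4}))     # 12
```

If measuring a side is "easy", then computing the area is at most as hard, which is the sense in which solving A cannot be harder than solving B.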
Circuit Complexity
A Boolean circuit C on n inputs x1, . . . , xn is a directed acyclic graph (DAG) with n nodes
of in-degree 0 (the inputs x1, . . . , xn), one node of out-degree 0 (the output), and every node
of the graph except the input nodes is labeled by AND, OR, or NOT; it has in-degree 2
(for AND and OR), or 1 (for NOT). The Boolean circuit C computes a Boolean function
f(x1, . . . , xn) in the obvious way: the value of the function is equal to the value of the output
gate of the circuit when the input gates are assigned the values x1, . . . , xn.
The size of a Boolean circuit C, denoted |C|, is defined to be the total number of nodes
(gates) in the graph representation of C. The depth of a Boolean circuit C is defined as the
length of a longest path (from an input gate to the output gate) in the graph representation
of the circuit C.
A Boolean formula is a Boolean circuit whose graph representation is a tree.
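A minimal sketch of this definition (the dict-based circuit encoding is an assumption made for illustration, not standard notation): each gate carries a label AND, OR, or NOT together with its in-edges, and the circuit is evaluated by recursively evaluating predecessors, memoizing so each node is computed once.

```python
# Evaluate a Boolean circuit given as a DAG: input nodes take their values
# from the assignment; AND/OR gates have fan-in 2, NOT has fan-in 1.

def eval_gate(circuit, node, assignment, memo=None):
    if memo is None:
        memo = {}
    if node in assignment:                 # an input node x1..xn
        return assignment[node]
    if node in memo:                       # already evaluated (DAG sharing)
        return memo[node]
    op, args = circuit[node]
    vals = [eval_gate(circuit, a, assignment, memo) for a in args]
    if op == "AND":
        memo[node] = vals[0] and vals[1]
    elif op == "OR":
        memo[node] = vals[0] or vals[1]
    else:                                  # "NOT"
        memo[node] = not vals[0]
    return memo[node]

# A circuit computing f(x1, x2) = x1 XOR x2 out of AND/OR/NOT gates:
circuit = {
    "n1": ("NOT", ["x1"]),
    "n2": ("NOT", ["x2"]),
    "a1": ("AND", ["x1", "n2"]),
    "a2": ("AND", ["n1", "x2"]),
    "out": ("OR", ["a1", "a2"]),           # the unique output gate
}
size = len(circuit) + 2                    # |C| = 5 gates + 2 input nodes = 7
print(eval_gate(circuit, "out", {"x1": True, "x2": False}))   # True
```

The depth of this circuit is 3 (e.g. the path x1 → n1 → a2 → out), matching the longest-path definition above.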
Given a family of Boolean functions f = {f_n}_{n≥0}, where f_n depends on n variables,
we are interested in the sizes of the smallest Boolean circuits C_n computing f_n. Let s(n) be a
function such that |C_n| ≤ s(n) for all n. Then we say that the Boolean function family f is
computable by Boolean circuits of size s(n). If s(n) is a polynomial, then we say that f is
computable by polysize circuits.
It is not difficult to see that every language in P is computable by polysize circuits. Note
that given any language L over the binary alphabet, we can define the Boolean function
family {f_n}_{n≥0} by setting f_n(x1, . . . , xn) = 1 iff x1 . . . xn ∈ L.
Is the converse true? No! Consider the following family of Boolean functions fn, where
fn(x1, . . . , xn) = 1 iff TM Mn halts on the empty tape; here, Mn denotes the nth TM in
some standard enumeration of all TMs. Note that each fn is a constant function, equal to 0
or 1. Thus, the family of these fn’s is computable by linear-size Boolean circuits. However,
this family of fn’s is not computable by any algorithm (let alone any polytime algorithm),
since the Halting Problem is undecidable. Thus, in general, the Boolean circuit model of
computation is strictly more powerful than the Turing machine model of computation.
Still, it is generally believed that NP-complete languages cannot be computed by polysize
circuits. Proving a superpolynomial circuit lower bound for any NP-complete language would
imply that P ≠ NP. (Check this!) In fact, this is one of the main approaches that has been used
in trying to show that P ≠ NP. So far, however, nobody has been able to disprove that every
language in NP can be computed by linear-size Boolean circuits of logarithmic depth!
Boolean circuit model of computation
For every n, m ∈ N, a Boolean circuit C with n inputs and m outputs is a directed
acyclic graph. It contains n nodes with no incoming edges, called the input nodes,
and m nodes with no outgoing edges, called the output nodes. All other nodes
are called gates and are labeled with one of ∨, ∧, or ¬ (in other words, the logical
operations OR, AND, and NOT). The ∨ and ∧ nodes have fan-in (i.e., number of
incoming edges) 2 and the ¬ nodes have fan-in 1. The size of C, denoted by |C|,
is the number of nodes in it.
The circuit is called a Boolean formula if each node has at most one outgoing edge.
Definition 6.2 (Circuit families and language recognition)
Let T : N → N be a function. A T(n)-size circuit family is a sequence {C_n}_{n∈N} of Boolean
circuits, where C_n has n inputs and a single output, such that |C_n| ≤ T(n) for every n.
We say that a language L is in SIZE(T(n)) if there exists a T(n)-size circuit family {C_n}_{n∈N}
such that for every x ∈ {0, 1}^n, x ∈ L ⇔ C_n(x) = 1.
Every language is decidable by a circuit family of size O(n·2^n), since
the circuit for input length n could contain 2^n "hardwired" bits indicating which inputs are in
the language. Given an input, the circuit looks up the answer from this table. (The reader may
wish to work out an implementation of this circuit.) The definition above formalizes what
we can think of as "small" circuits.
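The hardwired-table construction can be sketched as follows (the example language, odd-parity binary strings, is purely an illustration): membership for inputs of length n is decided by a lookup among at most 2^n stored strings, mirroring a circuit that simply ORs together one minterm per member of the language.

```python
# Sketch of the "hardwired table" idea: the truth table of L restricted to
# {0,1}^n is stored in full, and deciding x is a table lookup. No cleverness
# is needed, which is why the bound O(n * 2^n) holds for *every* language.

n = 3
# hardwire the table: all length-n strings that belong to the language
table = {format(i, f"0{n}b") for i in range(2 ** n)
         if format(i, f"0{n}b").count("1") % 2 == 1}

def decide(x):
    return x in table          # circuit analogue: OR over 2^n minterms

print(decide("101"))           # False: "101" has two 1s (even parity)
print(decide("100"))           # True: one 1 (odd parity)
```

Note the table has up to 2^n entries of n bits each, which is where the O(n·2^n) size bound comes from.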
UNIT V
Polynomial Time
An algorithm is said to run in polynomial time if the number of steps required to
complete the algorithm for a given input is O(n^k) for some nonnegative integer k, where n is the
size of the input. Polynomial-time algorithms are said to be "fast." Most familiar
mathematical operations such as addition, subtraction, multiplication, and division, as well as
computing square roots, powers, and logarithms, can be performed in polynomial time.
Computing the digits of most interesting mathematical constants, including π and e, can also
be done in polynomial time.
A polynomial-time reduction is a reduction computable in polynomial time. If it is a many-one
reduction, it is called a polynomial-time many-one reduction, polynomial transformation, or Karp reduction. If
it is a Turing reduction, it is called a polynomial-time Turing reduction or Cook reduction.
Polynomial-time reductions are important and widely used because they are powerful enough
to perform many transformations between important problems, but still weak enough that
polynomial-time reductions from problems in NP or co-NP to problems in P are considered
unlikely to exist. This notion of reducibility is used in the standard definitions of several
complete complexity classes, such as NP-complete, PSPACE-complete and EXPTIME-complete.
Within the class P, however, polynomial-time reductions are inappropriate, because any
problem in P can be polynomial-time reduced (both many-one and Turing) to almost any
other problem in P. Thus, for classes within P such as L, NL, NC, and P itself, log-space
reductions are used instead.
If a problem has a Karp reduction to a problem in NP, this shows that the problem is in NP.
Cook reductions seem to be more powerful than Karp reductions; for example, any problem
in co-NP has a Cook reduction to any NP-complete problem, whereas any problems that are
in co-NP - NP (assuming they exist) will not have Karp reductions to any problem in NP.
While this power is useful for designing reductions, the downside is that certain classes such
as NP are not known to be closed under Cook reductions (and are widely believed not to be),
so they are not useful for proving that a problem is in NP. However, they are useful for
showing that problems are in P and other classes that are closed under such reductions
P Completeness theory
In complexity theory, the notion of P-complete decision problems is useful in the analysis of
both:
1. which problems are difficult to parallelize effectively, and;
2. which problems are difficult to solve in limited space.
Formally, a decision problem is P-complete (complete for the complexity class P) if it is in P
and every problem in P can be reduced to it by using an appropriate reduction.
The specific type of reduction used varies and may affect the exact set of problems. If we use
NC reductions, that is, reductions which can operate in polylogarithmic time on a parallel
computer with a polynomial number of processors, then all P-complete problems lie outside
NC and so cannot be effectively parallelized, under the unproven assumption that NC ≠ P. If
we use the weaker log-space reduction, this remains true, but additionally we learn that all
P-complete problems lie outside L under the weaker unproven assumption that L ≠ P. In this
latter case the set of P-complete problems may be smaller.
There is a theorem of Ladner [77] that plays the same role for the P =NLOGSPACE or P =
LOGSPACE question that the Cook–Levin theorem plays for the P = NP question. The
decision problem involved is the circuit value problem (CVP): given an acyclic Boolean
circuit with several inputs and one output and a truth assignment to the inputs, what is the
value of the output? The circuit can be evaluated in deterministic polynomial time; the
theorem says that this problem is ≤_m^log-complete for P. It follows from the transitivity of
≤_m^log that P = NLOGSPACE iff CVP ∈ NLOGSPACE and P = LOGSPACE iff CVP ∈
LOGSPACE.
Formally, a Boolean circuit is a program consisting of finitely many assignments of the form
Pi := 0,
Pi := 1,
Pi := Pj ∧ Pk, j,k<i,
Pi := Pj ∨ Pk, j,k<i, or
Pi := ¬Pj, j<i,
where each Pi in the program appears on the left-hand side of exactly one
assignment. The conditions j, k < i and j < i ensure acyclicity. We want to compute the value
of Pn, where n is the maximum index.
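An evaluator for this straight-line form can be sketched as follows (the tuple encoding of assignments is an assumption made for illustration): assignments are processed in index order, which is sound because the conditions j, k < i guarantee acyclicity.

```python
# Sketch evaluator for the straight-line form of CVP: a program is a list of
# assignments (i, op, args), evaluated in index order; the answer is the
# value of P_n, where n is the maximum index.

def eval_cvp(program):
    val = {}
    for i, op, args in program:           # j, k < i guarantees acyclicity
        if op == "const":
            val[i] = args[0]              # Pi := 0 or Pi := 1
        elif op == "and":
            val[i] = val[args[0]] and val[args[1]]
        elif op == "or":
            val[i] = val[args[0]] or val[args[1]]
        else:                             # "not"
            val[i] = not val[args[0]]
    return val[max(val)]                  # value of P_n

# P1 := 1, P2 := 0, P3 := P1 OR P2, P4 := NOT P3
program = [(1, "const", [True]),
           (2, "const", [False]),
           (3, "or", [1, 2]),
           (4, "not", [3])]
print(eval_cvp(program))                  # False
```

One pass over the n assignments suffices, which is the deterministic polynomial-time (in fact linear-time) evaluation the theorem above refers to.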
Circuit Satisfiability
The circuit satisfiability problem (CIRCUIT-SAT) is the circuit analogue of SAT.
Given a Boolean circuit C, is there an assignment to the variables that causes the
circuit to output 1?
Theorem 1 CIRCUIT-SAT is NP-complete.
Proof It is clear that CIRCUIT-SAT is in NP since a nondeterministic machine can
guess an assignment and then evaluate the circuit in polynomial time.
Now suppose that A is a language in NP. Recall from Lecture 3 that A has a
polynomial-time verifier, an algorithm V with the property that x ∈ A if and only
if V accepts ⟨x, y⟩ for some y. In addition, from Lecture 5, we know that there is a
polynomial-size circuit C equivalent to V. The input of C is the entire input of V,
i.e., both x and y. And C can be constructed in polynomial time given the lengths of
x and y.
The reduction from A to CIRCUIT-SAT operates as follows: given an input x,
output a description of the circuit C(x, y) with the x variables set to the given values
and the y variables left as variables. The resulting circuit is satisfiable if and only
if x ∈ A. And the reduction can be computed in polynomial time because of the uniformity of C.
Circuit satisfiability is a good example of a problem that we don’t know how to solve in
polynomial time. In this problem, the input is a boolean circuit: a collection of AND, OR,
and NOT gates connected by wires. We will assume that there are no loops in the circuit (so
no delay lines or flip-flops). The input to the circuit is a set of m boolean (TRUE/FALSE)
values x1, . . . , xm. The output is a single boolean value. Given specific input values, we can
calculate the output of the circuit in polynomial (actually, linear) time using depth-first search, since we can compute the output of a k-input gate in O(k) time.
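The obvious brute-force approach to CIRCUIT-SAT can be sketched as follows (the circuit encoding and names are illustrative assumptions): evaluate the circuit once per assignment, for all 2^m assignments, so the running time is exponential in the number of inputs, matching the point above that no polynomial-time method is known.

```python
# Brute-force CIRCUIT-SAT: try every assignment to the m inputs and evaluate
# the circuit for each. Each evaluation is linear; the loop is exponential.
from itertools import product

def eval_circuit(gates, out, assignment):
    # gates must be listed in evaluation order (predecessors before users)
    val = dict(assignment)
    for name, (op, args) in gates.items():
        a = [val[x] for x in args]
        if op == "AND":
            val[name] = a[0] and a[1]
        elif op == "OR":
            val[name] = a[0] or a[1]
        else:                                # "NOT"
            val[name] = not a[0]
    return val[out]

def circuit_sat(gates, out, variables):
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if eval_circuit(gates, out, assignment):
            return assignment                # a satisfying assignment
    return None                              # the circuit is unsatisfiable

# (x1 AND x2) is satisfied only by x1 = x2 = True:
gates = {"g": ("AND", ["x1", "x2"])}
print(circuit_sat(gates, "g", ["x1", "x2"]))   # {'x1': True, 'x2': True}
```

This also illustrates why CIRCUIT-SAT is in NP: a single guessed assignment can be checked by one cheap call to eval_circuit.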
P, NP, and co-NP
A decision problem is a problem whose output is a single boolean value: YES or NO. Let
me define three classes of decision problems:
• P is the set of decision problems that can be solved in polynomial time. Intuitively, P is
the set of problems that can be solved quickly.
• NP is the set of decision problems with the following property: If the answer is YES, then
there is a proof of this fact that can be checked in polynomial time. Intuitively, NP is the set
of decision problems where we can verify a YES answer quickly if we have the solution in
front of us.
• co-NP is the opposite of NP. If the answer to a problem in co-NP is NO, then there is a
proof of this fact that can be checked in polynomial time.
For example, the circuit satisfiability problem is in NP. If the answer is YES, then any set of
m input values that produces TRUE output is a proof of this fact; we can check the proof by
evaluating the circuit in polynomial time. It is widely believed that circuit satisfiability is not
in P or in co-NP, but nobody actually knows.
Every decision problem in P is also in NP. If a problem is in P, we can verify YES answers in
polynomial time by recomputing the answer from scratch! Similarly, any problem in P is also in
co-NP.
One of the most important open questions in theoretical computer science is whether or not P
= NP.
Nobody knows. Intuitively, it should be obvious that P ≠ NP; the homeworks and exams in
this class and others have (I hope) convinced you that problems can be incredibly hard to
solve, even when the solutions are obvious in retrospect. But nobody knows how to prove it.
A more subtle but still open question is whether NP and co-NP are different. Even if we can
verify every YES answer quickly, there’s no reason to think that we can also verify NO
answers quickly. For example, as far as we know, there is no short proof that a boolean
circuit is not satisfiable. It is generally believed that NP ≠ co-NP, but nobody knows how to
prove it.
The Turing machine simulator.
The simulator also supports multi-track machines (the notation of Note 4 is
used for tape symbols, so that ⟨a, b⟩ denotes a symbol where a is on the upper
track and b is on the lower track). The simulator allows you to vary the number
of tracks, since tracks are just a convenient device for certain structured symbol
names. As an extra feature the simulator issues a warning if there is a transition
names. As an extra feature the simulator issues a warning if there is a transition
such as (q1,<0,1>,q2,<1,0>,R) that changes more than one track at the same
time, since this goes against the idea of tracks. No warning is issued if the transition
involves symbols with different numbers of tracks (say from 2 tracks to 3 tracks).
The Turing machine description language. The user must create a
file that contains a description of the machine to be simulated. The description
language has been designed with an eye to making the descriptions look as similar
as possible to those that can be found in Note 3. Thus, for example, the palindrome
recognizer, Mpalin, of Note 3 might be presented to the simulator in the
form of the following file:
Q = {q0, q1, q2, q3, q4, q5, q6}
I = q0
F = q6
G = {0, 1, b}
S = {0, 1}
D = {(q0,b,q6,b,R)*, (q0,0,q1,b,R)*, (q0,1,q3,b,R)*,
(q1,b,q2,b,L)*, (q1,0,q1,0,R), (q1,1,q1,1,R),
(q2,b,q6,b,R)*, (q2,0,q5,b,L)*,
(q3,b,q4,b,L)*, (q3,0,q3,0,R), (q3,1,q3,1,R),
(q4,b,q6,b,R)*, (q4,1,q5,b,L)*,
(q5,b,q0,b,R)*, (q5,0,q5,0,L), (q5,1,q5,1,L)}
Q: The set of states is presented as a list of identifiers, separated by commas, and
enclosed in braces, i.e., { and }. The identifiers are arbitrary sequences of
alphanumeric characters. (Some special characters, such as { and } are not
allowed.)
I: The initial state of the machine (the qI of Note 3).
F: The final state of the machine (the qF of Note 3).
G: The tape alphabet, Γ, of the machine. The tape symbols can be almost
any printing characters. Exceptions include non-printable characters such
as (space), and ? (question mark); the latter has a special meaning to be
explained shortly. The characters are separated by commas and enclosed in
braces.
S: The input alphabet Σ ⊆ Γ.
D: The transition function is presented as a list of quintuples, separated by commas,
and enclosed in braces. The components of each quintuple specify, in
order, the old state, old tape symbol, new state, new tape symbol, and head movement (L for
left, and R for right).
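The quintuple format above maps naturally onto a minimal simulator sketch (the Python encoding below is an assumption for illustration, not the simulator's actual code); the transition table reproduces Mpalin's D set from the sample file, without the * annotations.

```python
# Minimal single-tape TM simulator: delta maps (state, symbol) to
# (new state, new symbol, head move), mirroring the quintuple format.

def run_tm(delta, initial, final, tape_input, blank="b", max_steps=10_000):
    tape = dict(enumerate(tape_input))     # sparse tape, blank elsewhere
    state, head = initial, 0
    for _ in range(max_steps):
        if state == final:
            return True                    # machine halts in qF: accept
        key = (state, tape.get(head, blank))
        if key not in delta:
            return False                   # no applicable move: reject
        state, symbol, move = delta[key]
        tape[head] = symbol
        head += 1 if move == "R" else -1
    return False                           # step budget exhausted

# Mpalin's transition function, transcribed from the description file above:
delta = {("q0", "b"): ("q6", "b", "R"), ("q0", "0"): ("q1", "b", "R"),
         ("q0", "1"): ("q3", "b", "R"),
         ("q1", "b"): ("q2", "b", "L"), ("q1", "0"): ("q1", "0", "R"),
         ("q1", "1"): ("q1", "1", "R"),
         ("q2", "b"): ("q6", "b", "R"), ("q2", "0"): ("q5", "b", "L"),
         ("q3", "b"): ("q4", "b", "L"), ("q3", "0"): ("q3", "0", "R"),
         ("q3", "1"): ("q3", "1", "R"),
         ("q4", "b"): ("q6", "b", "R"), ("q4", "1"): ("q5", "b", "L"),
         ("q5", "b"): ("q0", "b", "R"), ("q5", "0"): ("q5", "0", "L"),
         ("q5", "1"): ("q5", "1", "L")}

print(run_tm(delta, "q0", "q6", "0110"))   # True: "0110" is a palindrome
print(run_tm(delta, "q0", "q6", "01"))     # False: "01" is not
```

The machine repeatedly erases a matching symbol from each end of the input, accepting when the tape is exhausted, which is exactly the behaviour the transition table encodes.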
MODEL QUESTION PAPER
U4CSA10 THEORY OF COMPUTATION
Part – A (15 x 2 marks = 30 marks)
Answer All Questions. Each question carries 2 marks
1. What is meant by symbols?
2. Define Alphabets and Strings.
3. Construct a regular expression for the language which accepts all strings with at least two c's over the set Σ = {c, b}.
4. Give the formal definition of a regular language.
5. Write a regular expression to denote a language L which accepts all the strings which begin or end with either 00 or 11.
6. List the types of automata.
7. What is: (i) (0+1)* (ii) (01)* (iii) (0+1) (iv) (0+1)+
8. What are the applications of the pumping lemma? What are the closure properties of regular sets?
9. What is a deterministic PDA?
10. What is a left linear grammar?
11. When do we say a problem is decidable? Give an example of an undecidable problem.
12. What is an ambiguous grammar?
13. What is a 2-way infinite tape Turing Machine?
14. What is the storage in FA?
15. Define Decidability.
Part – B (5 x14 marks= 70 marks)
(Answer ALL questions. Each question carries 14 marks)
16. A) What is a: (a) String (b) Regular language? What is a regular expression? Differentiate L* and L+.
(OR)
B) Define: (i) Finite Automaton (FA) (ii) Transition diagram (iii) What are the applications of automata theory?
17. A) Find the regular expression for the set of all strings denoted by R13(2) from the DFA given below.
(OR)
B) What are the closure properties of CFLs? State the pumping lemma for CFLs. What is the main application of the pumping lemma for CFLs?
18. A) i) Find a CFG with no useless symbols equivalent to: S → AB | CA, B → BC | AB, A → a, C → aB | b.
ii) Construct a CFG without Є-productions from: S → a | Ab | aBa, A → b | Є, B → b | A.
(OR)
B) What is an ambiguous grammar? Show that the grammar P = {S → aS | aSbS | Є} is ambiguous by constructing: (a) two parse trees (b) two leftmost derivations (c) a rightmost derivation.
19. A) What is a Turing machine? What are the special features of a TM? Define Turing machine. Define the instantaneous description of a TM. Discuss the various techniques for Turing machine construction.
(OR)
B) What are the different types of language acceptance by a PDA? Define them. Define a Deterministic PDA. Define the instantaneous description (ID) in a PDA.
20. A) What is circuit complexity? Discuss in detail the problems under circuit complexity.
(OR)
B) Explain in detail the NAND circuit value problem with an example.