Uploaded by Mr Robot

Lecture - 01 Introduction to Theory of Automata

Theory of Automata
Lecture – 01
Disclaimer: The Contents of this reader are borrowed from the book(s) mentioned in the
reference section.
Since we are interested in the theory of computers, we develop a number of abstract
mathematical models that describe, to varying degrees, the components of computers,
different kinds of computers, and other related machines. We use these models to discuss
the more abstract questions about the limits of what these mechanical devices can do,
rather than the more concrete engineering aspects of computer hardware.
There are separate courses that deal with circuit design, data structures and algorithms,
operating systems, compiler design, and artificial intelligence. All these courses differ
from the study of the theory of computation, or automata, in two ways:
1. They deal with computers that already exist; our models, on the other hand, will
encompass all computers that do exist, will exist, or can ever be dreamed of.
2. They are interested in how best to do things; we shall not be interested in the
question of optimality but rather in the question of possibility: what can and cannot
be done.
The most obvious component of computer theory is the theory of mathematical logic.
When Georg Cantor developed set theory, he also discovered results that appeared
contradictory. Some of his unusual findings could be tolerated (such as that infinity comes
in different sizes), but some could not (such as that some set is bigger than the universal
set). This left a cloud over mathematics that needed to be resolved.
Hilbert wanted a methodical approach: a precise procedure for producing results, similar
to the instructions in a recipe. A set of instructions that is complete, unambiguous, and
simple to follow is called an algorithm. He believed that entire classes of mathematical
problems could be solved by algorithms. Linear algebra, for example, provides just such a
method for solving all systems of linear equations. Hilbert hoped to find algorithms for
solving other classes of mathematical problems, possibly even one that could solve all
mathematical problems in a finite number of steps.
An early calculator could carry out only one preset set of operations at a time. To change
its procedure, the calculator had to be physically rebuilt by rewiring, resetting, or
reconnecting various parts. Von Neumann permanently wired certain operations into the
machine and then designed a central control section that, after reading the input data,
could choose which operation to carry out based on a programme or algorithm encoded in the
input and stored in the computer along with the raw data to be processed. In this way, the
inputs determined which operations were to be performed on themselves. Von Neumann's goal
was to turn the electronic calculator into a real-life model of one of the logicians' ideal
universal-algorithm machines, such as those Turing had described.
Along with the idea of programming a computer came the question: what is the "best"
language in which to write programmes? Many languages were invented, each with its own
intended machines and types of problems. As more languages emerged, it became obvious
that they shared many characteristics and appeared to have similar capabilities and
restrictions. At first this observation was only intuitive, although Turing had already
worked on essentially the same problem from a different perspective.
Many other studies investigate related questions: What is language in
general? How could primitive humans have developed language? How do people
understand it? How do they learn it as children? What ideas can be expressed, and in what
ways? How do people construct sentences from the ideas in their minds? In an effort to
answer such questions, Noam Chomsky developed mathematical models for
the description of languages. His theory advanced to the point that it began to
influence research on computer languages. The languages that people created to talk
with one another and the languages required for humans to communicate with machines
share many fundamental characteristics. Although we do not fully understand how humans
understand language, we do understand how machines process what they are told. As a result,
linguistics, which previously had little connection to mathematics, began to make use of
the formulations of mathematical logic. Metaphorically, we could say that the computer then
took on linguistic abilities. It became a word processor, a translator, and an interpreter of
simple grammar, as well as a compiler of computer languages.
What Does Automata Mean?
 Automata is the plural of automaton, which means “something that works automatically”.
 Automata theory is the study of abstract computational devices and the computational
problems that can be solved using them.
 Abstract devices are (simplified) models of real computations.
 In the 1930s, Turing studied an abstract machine that had all the capabilities of today’s
computers, at least as far as what they could compute.
 Turing's goal was to describe the boundary between what a machine can do and what it
cannot.
 In automata theory we will simulate parts of computers.
 It helps in the design and construction of different kinds of software, and in
understanding what we can expect from our software.
 Automata play a major role in the theory of computation, compiler design, and artificial
intelligence.
Why Study Automata Theory?
Introduction to Finite Automata
Finite automata are a useful model for many important kinds of hardware and software.
Systems as diverse as the following can be described straightforwardly as finite state
machines:
 Parity checkers
 Vending machines
 Communication protocols
 Interactive video games
 Building security devices
 Software for designing and checking the behavior of digital circuits
 Lexical analyzers, which break input text into logical units such as identifiers
 Software for scanning large bodies of text, such as collections of web pages,
to find occurrences of words, phrases, or other patterns
 Software to verify systems that have a finite number of distinct states
Finite Automata Example
The simplest nontrivial finite automaton is an on/off switch. The device remembers
whether it is in the “on” state or the “off” state, and it allows the user to press a button
whose effect depends on the current state of the switch.
In the diagram of a finite automaton:
 A circle represents a state.
 Arcs between the states represent external influences (inputs).
 One state is marked as the start state.
 It is often necessary to designate one or more final (accepting) states.
In the second example, the finite automaton could be part of a lexical analyzer. It
recognizes the word ‘then’. It has five states, each representing a different position in
the word ‘then’ that has been reached so far.
Example 2
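The two automata described above can be sketched as transition tables. This is only an illustrative sketch: the input name "push", the state names, and the helper functions run and accepts_then are assumptions made for this example and do not come from the original figures.

```python
# Transition table for the on/off switch: (state, input) -> next state.
# The input name "push" is an assumed label for pressing the button.
SWITCH = {
    ("off", "push"): "on",
    ("on", "push"): "off",
}

# Recognizer for the word "then": each of the five states is named after
# the part of the word that has been read so far ("" is the start state).
THEN = {
    ("", "t"): "t",
    ("t", "h"): "th",
    ("th", "e"): "the",
    ("the", "n"): "then",
}


def run(transitions, start, inputs):
    """Follow the transitions from `start`; return the state reached,
    or None if some input has no transition from the current state."""
    state = start
    for symbol in inputs:
        state = transitions.get((state, symbol))
        if state is None:
            return None
    return state


def accepts_then(word):
    """True if the automaton ends in its final state "then"."""
    return run(THEN, "", word) == "then"


print(run(SWITCH, "off", ["push", "push", "push"]))  # on
print(accepts_then("then"), accepts_then("the"))     # True False
```

Pressing the button an odd number of times from "off" leaves the switch "on", which shows that the only thing the device has to remember is its current state.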
Structural Representation
The following two notations are not automaton-like, but they play an important role in the
study of automata and their applications.
1. Grammars
A grammar is a finite set of rules defining a language.
Grammars are useful models when designing software that processes data with a recursive
structure. For example, a parser, the component of a compiler that deals with recursively
nested features of programming languages, uses expression rules such as E=>E+E.
2. Regular Expressions
A regular expression denotes the structure of data, especially text strings.
The UNIX-style regular expression [A-Z][a-z]*[ ][A-Z][A-Z] represents a capitalized
word followed by a space and two capital letters, e.g. Ithaca NY. It misses multi-word
city names such as Palo Alto CA; to match those as well we need a more complex regular
expression:
[A-Z][a-z]*([ ][A-Z][a-z]*)*[ ][A-Z][A-Z]
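As a quick check, the two expressions can be tried with Python's standard re module, which accepts this UNIX-style syntax unchanged. The constant names and the helper matches are assumptions introduced for this sketch; fullmatch is used so that the whole text must fit the pattern.

```python
import re

# The two patterns from the text, compiled as-is.
ONE_WORD_CITY = re.compile(r"[A-Z][a-z]*[ ][A-Z][A-Z]")
MULTI_WORD_CITY = re.compile(r"[A-Z][a-z]*([ ][A-Z][a-z]*)*[ ][A-Z][A-Z]")


def matches(pattern, text):
    """True if the entire text fits the pattern."""
    return pattern.fullmatch(text) is not None


print(matches(ONE_WORD_CITY, "Ithaca NY"))       # True
print(matches(ONE_WORD_CITY, "Palo Alto CA"))    # False
print(matches(MULTI_WORD_CITY, "Palo Alto CA"))  # True
```

The second pattern still accepts single-word city names, because the parenthesized group may repeat zero times.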
What is an Abstract Machine?
 A procedure for executing a set of instructions in some formal language.
 It is not intended to be constructed as hardware but is used in thought experiments
about computability.
 E.g. finite state machine, Turing machine.
Automata and Complexity
Automata are essential for the study of the limits of computation:
1. What can a computer do at all? This is the study of decidability.
2. What can a computer do efficiently? This is the study of intractability.
Types of Languages
 Formal Languages (Syntactic languages)
 Informal Languages (Semantic languages)
Direct communication with a computer is not possible; we need the help of a language the
computer understands. We use symbols, alphabets, and words in our daily life. Similarly, in
our theory we will discuss alphabets, letters, and symbols.
In English we distinguish the three different entities: letters, words, and sentences. There
is a certain parallelism between the fact that groups of letters make up words and the fact
that groups of words make up sentences. Not all collections of letters form a valid word,
and not all collections of words form a valid sentence. The analogy can be continued.
Certain groups of sentences make up coherent paragraphs, certain groups of paragraphs
make up coherent stories, and so on.
We need to adopt a definition of a "most universal language structure", that is, a structure
in which the decision of whether a given string of units constitutes a valid larger unit is not
a matter of guesswork but is based on explicitly stated rules.
When we call our study the Theory of Formal Languages, the word "formal" refers to the
fact that all the rules for the language are explicitly stated in terms of what strings of
symbols can occur. No liberties are tolerated, and no reference to any "deeper
understanding" is required. Language will be considered solely as symbols on paper and
not as expressions of ideas in the minds of humans. In this basic model, language is not
communication among intellects, but a game of symbols with formal rules.
The Central Concepts of Automata Theory
Here we will discuss the important definitions required for the study of Automata Theory.
Alphabets
A finite non-empty set of symbols (called letters) is called an alphabet. It is denoted by Σ
(the Greek letter sigma).
Example
Σ = {a,b}
Σ = {0,1} (important because this is the alphabet of the binary language that computers
work with)
Σ = {i,j,k}
An alphabet may include letters, digits, and a variety of operators, including sequential
operators such as GOTO and IF.
Valid / Invalid Alphabets
When defining an alphabet, a letter may consist of a group of symbols, for example
Σ1 = {B, aB, bab, d}.
Care is needed with such alphabets. Consider the string BababB and Σ2 = {B, Ba, bab, d}.
A scanner reading from the left can begin in two different ways:
o (Ba), (bab), (B)
o (B), (abab), (B)
In the first reading every group is a letter of Σ2. In the second, the compiler's scanner
(lexical analyzer) identifies the first symbol B as a letter of Σ2 but cannot identify the
next letter, because no letter of Σ2 begins with a. An alphabet should therefore be defined
so that no such ambiguity arises.
When an alphabet has letters consisting of more than one symbol, no letter should start
with another letter of the same alphabet, i.e. one letter should not be a prefix of
another. However, a letter may end with another letter of the same alphabet, i.e. one
letter may be a suffix of another.
Σ1 = {B, aB, bab, d}
Σ2 = {B, Ba, bab, d}
Σ1 is a valid alphabet, while Σ2 is an invalid alphabet.
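The prefix rule can be checked mechanically. The sketch below is one possible way to do it; the function names is_valid_alphabet and possible_first_letters are assumptions made for this illustration.

```python
from itertools import permutations


def is_valid_alphabet(letters):
    """Return True if no letter is a prefix of a different letter."""
    return not any(b.startswith(a) for a, b in permutations(letters, 2))


def possible_first_letters(letters, string):
    """Letters that a scanner could read at the start of `string`."""
    return sorted(letter for letter in letters if string.startswith(letter))


SIGMA1 = ["B", "aB", "bab", "d"]
SIGMA2 = ["B", "Ba", "bab", "d"]

print(is_valid_alphabet(SIGMA1))                 # True
print(is_valid_alphabet(SIGMA2))                 # False
print(possible_first_letters(SIGMA2, "BababB"))  # ['B', 'Ba']
```

For Σ2 the scanner has two candidates for the first letter of BababB, which is exactly the ambiguity the prefix rule is meant to exclude.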
Strings
A string (sometimes called a word) is a finite sequence of symbols chosen from some alphabet.
For example, 01101 is a string over the binary alphabet Σ = {0,1}. Similarly a, abab,
aaabb, and ababab are strings over the alphabet Σ = {a,b}.
The Empty String
The empty string, denoted by ε (epsilon), λ (small Greek letter lambda), or Λ (capital
Greek letter lambda), is the string with zero occurrences of symbols.
Length of String
The length of a string is the number of positions for symbols in the string. For instance,
01101 has length 5. It uses only two distinct symbols, 0 and 1, but it has 5 positions for
symbols, and it is the number of positions that defines the length.
The standard notation for the length of a string w is |w|; for example, |011| is 3 and |ε| is 0.
Power of Alphabets
If Σ is an alphabet, we can express the set of all strings of a certain length over that
alphabet using exponential notation.
Let Σ be an alphabet.
 Σ^k = the set of all strings of length k over Σ
 Σ^0 = {ε}
Let Σ = {0,1}.
 Σ^1 = {0,1}
 Σ^2 = {00,01,10,11}
 Σ^3 = {000,001,010,011,100,101,110,111}
The set of all strings over the alphabet Σ is denoted by Σ*.
 Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ …
The set of all strings over the alphabet Σ excluding the empty string is denoted by Σ+.
 Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ …
 Σ* = Σ+ ∪ {ε}
Concatenation of Strings
Let x and y be strings. Then xy denotes the concatenation of x and y.
Let x = 010101 and y = 100101; then xy = 010101100101.
For any string w, the equations εw = wε = w hold.
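A small sketch of these identities, assuming Python str values stand in for strings over an alphabet and the empty str "" plays the role of the empty string; the names EPSILON and concat are introduced only for this illustration.

```python
# The empty str "" plays the role of the empty string.
EPSILON = ""


def concat(x, y):
    """Concatenation of two strings: x followed by y."""
    return x + y


x, y = "010101", "100101"
xy = concat(x, y)
print(xy)  # 010101100101

# The empty string is the identity for concatenation.
print(concat(EPSILON, x) == x == concat(x, EPSILON))  # True

# Lengths add under concatenation.
print(len(xy) == len(x) + len(y))  # True
```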
Language
A set of strings, all of which are chosen from some Σ* where Σ is a particular alphabet, is
called a language.
If Σ is an alphabet and L ⊆ Σ*, then L is a language over Σ.
Examples
1. Let L be the language of all strings consisting of n 0’s followed by n 1’s for some
n ≥ 0:
L = {ε, 01, 0011, 000111, …}
2. Let L be the language of all strings with an equal number of 0’s and 1’s:
L = {ε, 01, 10, 0011, 1100, 0101, 1010, 1001, …}
3. Σ* is a language for any alphabet Σ.
4. ∅, the empty language, is a language over any alphabet.
5. {ε} is the language consisting of only the empty string. Note that ∅ ≠ {ε}: the former
contains no strings, while the latter contains one.
Example of a descriptive definition: the language L of strings of odd length over the
alphabet Σ = {a} can be written as L = {a, aaa, aaaaa, …}.
Problems
In the theory of automata, a problem is the question of deciding whether a given string
belongs to a particular language. For example, if Σ is an alphabet and L is a language
over Σ, then the problem can be stated as:
Given a string w in Σ*, decide whether or not w is in L.
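For simple languages this membership problem can be decided by a short function. The sketch below gives one decider for each of three languages: n 0's followed by n 1's, an equal number of 0's and 1's, and odd-length strings of a's. The function names are assumptions for this illustration.

```python
def in_zeros_then_ones(w):
    """Decide membership in the language of n 0's followed by n 1's, n >= 0."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def in_equal_zeros_ones(w):
    """Decide membership in the language with equally many 0's and 1's."""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")


def in_odd_length_as(w):
    """Decide membership in the language of odd-length strings of a's."""
    return set(w) <= {"a"} and len(w) % 2 == 1


print(in_zeros_then_ones("0011"), in_zeros_then_ones("0101"))    # True False
print(in_equal_zeros_ones("1001"), in_equal_zeros_ones("011"))   # True False
print(in_odd_length_as("aaa"), in_odd_length_as("aa"))           # True False
```

Each function returns True or False for every input string, which is what it means for the corresponding problem to be decidable.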
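Defining a Language Formally through Set-Former Notation
{ w | something about w }
1. { w | w consists of an equal number of 0’s and 1’s }
2. { w | w is a binary integer that is prime }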
w can also be replaced by some expression:
{ 0^n 1^n | n ≥ 1 }
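Set-former notation maps closely onto set comprehensions. Because a language may be infinite, the sketch below builds only a finite part of each set, up to an assumed bound; the function names and the bounds are choices made for this illustration.

```python
from itertools import product


def zeros_then_ones(max_n):
    """Finite part of { 0^n 1^n | n >= 1 }, for n up to max_n."""
    return {"0" * n + "1" * n for n in range(1, max_n + 1)}


def binary_strings_up_to(max_len):
    """All strings over {0,1} of length at most max_len."""
    return {"".join(p) for k in range(max_len + 1) for p in product("01", repeat=k)}


# { w | w has an equal number of 0's and 1's }, restricted to |w| <= 2.
equal_counts = {w for w in binary_strings_up_to(2) if w.count("0") == w.count("1")}

print(sorted(zeros_then_ones(3), key=len))  # ['01', '0011', '000111']
print(sorted(equal_counts))                 # ['', '01', '10']
```

The first comprehension replaces w by an expression in the parameter n, and the second keeps w and states a condition on it after the "if", mirroring the two forms of the notation.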
Reference
1. Introduction to Computer Theory by Daniel I. A. Cohen
2. Introduction to Automata Theory, Languages, and Computation by John E.
Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman, 3rd Edition