Defining the language

Welcome to !
Theory Of Automata
1
Text and Reference Material
1. Introduction to Computer Theory, by Daniel I. A. Cohen,
John Wiley and Sons, Inc., 1991, Second Edition
2. Introduction to Languages and the Theory of Computation,
by J. C. Martin, McGraw Hill Book Co., 1997, Second Edition
2
What does automata mean?
It is the plural of automaton, and it means
“something that works automatically”
Automata theory is the study of abstract computing devices or
machines
3
History
 Turing studied an abstract machine
that had all the capabilities of today's computers.
His goal was to describe precisely the boundary between what a
computing machine could do and what it could not.
 In the 1940s and 1950s, simpler kinds of machines, “finite
automata”, were studied by researchers.
They were intended to model brain functions.
 In the 1950s, linguists began the study of formal
grammars.
These serve as the basis of some important software components,
including parts of compilers.
 Finite automata and formal grammars are used in
the design and construction of software.
4
Why Study Automata Theory
Finite automata are a useful model for many kinds of
hardware and software:
Software for designing and checking the
behavior of digital circuits.
The lexical analyzer of a typical compiler.
Software for scanning large bodies of text.
Software for verifying systems of all types that
have a finite number of distinct states, such as
communications protocols.
5
Example
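A simple nontrivial finite automaton is an on/off switch.
The device remembers whether it is in the on state
or the off state.
[Figure: two states, ON and OFF. A Start arrow marks the initial state, and an arc labeled Push leads from each state to the other.]
States are represented by circles.
Arcs are labeled with inputs.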
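
The switch can be sketched as a transition table in Python. This is a minimal sketch, assuming the switch starts in the OFF state (the slide does not say which state the Start arrow enters); the names TRANSITIONS and run_switch are illustrative only.

```python
# Transition table for the on/off switch: (state, input) -> next state.
# Assumption: "push" is the only input and OFF is the start state.
TRANSITIONS = {
    ("OFF", "push"): "ON",
    ("ON", "push"): "OFF",
}


def run_switch(inputs, start="OFF"):
    """Feed a sequence of inputs to the switch and return the final state."""
    state = start
    for symbol in inputs:
        state = TRANSITIONS[(state, symbol)]
    return state


print(run_switch(["push"]))          # ON
print(run_switch(["push", "push"]))  # OFF
```

The only thing the device has to remember is its current state, which is why a finite table is enough.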
6
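Example (Recognizing the Keyword then)
[Figure: a chain of five states, from the start state through T, TH and THE to THEN, with arcs labeled T, H, E and N.]
 Inputs are letters.
 The lexical analyzer examines one character of the program at a time.
 The start state corresponds to the empty string.
 Each state has a transition on the next letter of the keyword.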
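
The chain of states can be sketched in Python by letting each state be the prefix of the keyword read so far. This is a minimal sketch; the function name recognizes_then is hypothetical, and rejecting on any unexpected character is an assumption about how the automaton handles missing transitions.

```python
KEYWORD = "then"


def recognizes_then(text):
    """Return True only when text spells the keyword exactly."""
    state = ""  # start state: the empty string
    for ch in text:
        # Follow the single transition out of the current state, if any.
        if len(state) < len(KEYWORD) and KEYWORD[len(state)] == ch:
            state += ch
        else:
            return False  # no transition for this character: reject
    return state == KEYWORD  # accept only in the final state "then"


print(recognizes_then("then"))  # True
print(recognizes_then("the"))   # False
```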
7
Introduction to languages
A language is a set of symbols that expresses ideas and
allows people to think and communicate
with each other.
Language is a system at many levels.
It is not just a collection of words; a language
consists of rules and patterns that relate
the words to one another.
8
Introduction to languages
There are two types of languages
 Formal Languages (Syntactic languages)
 Informal Languages (Semantic
languages)
9
English Language
 There are three different entities
1. Letters
2. Words
3. Sentences
Groups of letters make words.
Groups of words make sentences.
Not all collections of letters form a valid word.
Not all collections of words form a valid sentence.
If the analogy is continued:
 Collections of sentences make paragraphs.
 Collections of paragraphs make stories.




10
English Language
Humans agree on which sequences are valid and
which are not.
The same situation exists with computer languages: certain
character strings are recognizable as
words (Do, If, End).
Certain strings of words are recognizable as
commands, and commands make up a program,
which can then be compiled to machine code.
11
English Language
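It is more important to have rules for recognizing whether an
input is a valid communication than rules for decoding exactly
what the communication means.
The rules of a language must be able to tell which strings are in and
which are out.
For a natural language it is very hard to state all the rules.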
12
Theory of Formal Language
The word "formal" refers to the fact that all the rules for the language
are explicitly stated in terms of what strings of symbols can
occur.
No liberties are tolerated.
It is a game of symbols with formal rules,
not an expression of ideas in the minds of humans.
13
Formal Languages (Alphabets)
Definition:
A finite non-empty set of symbols (letters), is
called an alphabet. It is denoted by Σ ( Greek
letter sigma).
Example:
Σ={a,b}
Σ={0,1} //important as this is the language
//which the computer understands.
Σ={i,j,k}
14
NOTE:
 A certain version of the language ALGOL has
113 letters.
Its Σ (alphabet) includes letters, digits and a
variety of operators, including sequential
operators such as GOTO and IF.
15
Strings
Definition:
A concatenation of finitely many symbols or letters
from the alphabet is called a string.
Example:
If Σ= {a,b} then the following are strings:
a, abab, aaabb, ababababababababab
16
NOTE:
EMPTY STRING or NULL STRING
Sometimes a string with no symbols or letters
at all is used. It is denoted by λ (small Greek letter
lambda) or Λ (capital Greek letter lambda), and
is called the empty string or null string.
Whatever alphabet is being considered, the null string is
always Λ.
The capital lambda will mostly be used to
denote the empty string in further discussion.
17
Words
 Definition:
Words are strings belonging to some
language.
Example:
If Σ= {x} then a language L can be
defined as
L={x^n : n=1,2,3,…} or L={x,xx,xxx,…}
Here x,xx,… are the words of L
18
NOTE:
The finite set of fundamental units out of
which we build structures is called the alphabet.
A specified set of strings of characters from the
alphabet is called a language.
The strings that are permissible in the language
are called words.
A string may contain only
finitely many letters or symbols.
All words are strings, but not all strings
are words.
19
Note
Two words are considered the same if they have the same
letters in the same order.
There is only one word with no letters, the null word Λ.
 The symbol Λ is not allowed to be part of the
alphabet of any language.
The language that has no words is denoted by the
symbol Ф.
It is not true that Λ is a word in the language Ф.
20
Note
If L = Ф, then L does not contain Λ.
If we want to add Λ to a language L, we use the set-union
operator ‘+’ to form L + { Λ }.
If L does not already contain Λ, this language is not the same as L.
But L + Ф = L.
If we have a method for producing a language and
in a certain instance the method produces nothing,
we can say either that the method produced the language Ф or that it failed.
21
English
The whole alphabet is represented as
Σ= {a, b, c, d, ……}
Sometimes the elements are separated by
commas or spaces, and sometimes uppercase
letters are used.
The strings over this alphabet that are valid words are given by
English-word={all words in a standard
dictionary}
22
English
 This language still has no grammar. To make
a formal definition of sentences, we use a second alphabet, capital gamma:
 Γ ={entries in a standard dictionary, blank space, usual punctuation
marks}
 It produces sentences such as
I am teaching
You are listening
 If we follow only the rules of grammar, we also get
I ate three Tuesdays
You ate cloths
 These are grammatically correct but have no sensible meaning.
 In a formal language these sentences are correct.
 We are interested in syntax alone, not semantics or diction.
 The set of rules defining English is a grammar.
23
Example
My_subject
The alphabet for this language is
{E A P T S W}
There is only one word in this language, specified as follows:
If the earth and the moon ever collide, then
My_subject={SE}
If the earth and the moon never collide, then
My_subject={AT}
24
Example
It is impossible to be certain whether the word
AT is or is not in the language My_subject.
A set of rules must enable us to decide, in a finite
amount of time, whether a given string of alphabet
letters is or is not a word in the language.
There is no requirement that all the letters
of the alphabet appear in the words
selected for the language.
25
Defining Languages
 There are two kinds of rules for defining a language:
 how to test whether a string is a valid word
OR
 how to construct all the words in the language
Example:
If Σ= {a} then a language L1 can be defined as
L1={a, aa, aaa, aaaa, …}
or L1={a^n : n=1,2,3,…}. Here a, aa,… are the words of L1.
For this language, concatenation behaves like addition of exponents.
If aa is concatenated with aaa we get aaaaa; in general,
a^n concatenated with a^m is the word a^(n+m).
 A convenient notation is to write x=aaa and y=aa;
 then xy=aaaaa
26
Defining Languages
 It is not always true that when two words are concatenated
they produce another word in the language.
L2={a, aaa, aaaaa, aaaaaaa, …}
 ={a^odd}
 ={a^(2n+1) for n=0,1,2,3,…}
 If x=aaa and y=aaaaa then
 xy=aaaaaaaa is not in L2, although the alphabets of L2 and L1 are the
same.
 Here xy=yx, but in some cases that is not true. For example,
 x=house and y=boat give
 xy=houseboat and yx=boathouse, so xy ≠ yx
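
The two observations above can be checked with a short Python sketch. The predicates in_l1 and in_l2 are hypothetical helpers that test membership by the descriptions given on the slides (nonempty strings of a's, and strings of a's of odd length).

```python
def in_l1(w):
    """w is a nonempty string made only of the letter a."""
    return len(w) >= 1 and set(w) == {"a"}


def in_l2(w):
    """w is a string of a's of odd length."""
    return in_l1(w) and len(w) % 2 == 1


x, y = "aaa", "aaaaa"
print(in_l2(x), in_l2(y))  # True True
print(in_l2(x + y))        # False: 3 + 5 = 8 is even
print(in_l1(x + y))        # True: L1 is closed under concatenation

# Concatenation is not commutative in general.
print("house" + "boat" == "boat" + "house")  # False
```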
27
Valid/In-valid alphabets
While defining an alphabet, the alphabet may
contain letters that consist of a group of symbols,
for example Σ1= {B, aB, bab, d}.
 Now consider an alphabet
Σ2= {B, Ba, bab, d}
and a string BababB.
28
Valid/In-valid alphabets
This string BababB can be tokenized in two
different ways
 (Ba), (bab), (B)
 (B), (abab), (B)
In the second tokenization the middle group abab
is not a letter of Σ2, so that grouping cannot be
identified as a string defined over Σ2.
29
Valid/In-valid alphabets
When this string is scanned by the
compiler (lexical analyzer), the first symbol B is
identified as a letter belonging to Σ2, but
the lexical analyzer is then unable to identify
the second letter. So, while defining an
alphabet, it should be kept in mind that
no ambiguity should be created.
30
Remarks:
When defining an alphabet whose letters
consist of more than one symbol,
no letter should start with another letter of
the same alphabet, i.e. one letter should not
be a prefix of another. However, a letter
may end with another letter of the same alphabet,
i.e. one letter may be a suffix of another.
31
Conclusion
Σ1= {B, aB, bab, d}
Σ2= {B, Ba, bab, d}
Σ1 is a valid alphabet while Σ2 is an in-valid
alphabet.
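
The prefix rule can be checked mechanically. The following Python sketch lists every pair of letters in which one letter is a proper prefix of another; the function name prefix_conflicts is an illustrative assumption, not part of the lecture.

```python
def prefix_conflicts(alphabet):
    """Return pairs (p, q) where letter p is a proper prefix of letter q."""
    return [(p, q) for p in alphabet for q in alphabet
            if p != q and q.startswith(p)]


sigma1 = ["B", "aB", "bab", "d"]
sigma2 = ["B", "Ba", "bab", "d"]

print(prefix_conflicts(sigma1))  # []
print(prefix_conflicts(sigma2))  # [('B', 'Ba')]
```

An empty result means the alphabet is valid in the sense of the remark above; for Σ2 the pair (B, Ba) is the source of the ambiguity.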
32
Length of Strings
Definition:
The length of string s, denoted by |s|, is the
number of letters in the string.
Example:
Σ={a,b}
s=ababa
|s|=5
33
Length of Strings
Example:
Σ= {B, aB, bab, d}
s=BaBbabBd
Tokenizing=(B), (aB), (bab), (B) , (d)
|s|=5
length(Λ)=0, and conversely if length(w)=0 then w=Λ
34
Reverse of a String
Definition:
The reverse of a string s, denoted by Rev(s)
or s^r, is obtained by writing the letters of s
in reverse order.
Example:
If s=abc is a string defined over Σ={a,b,c}
then Rev(s) or s^r = cba
35
Example:
Σ= {B, aB, bab, d}
s=BaBbabBd
Rev(s)=dBbabaBB
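
Length and reverse over an alphabet with multi-symbol letters both depend on tokenizing first. The Python sketch below assumes a valid (prefix-free) alphabet, so that at each position at most one letter can match; the helper names tokenize, length and rev are hypothetical.

```python
def tokenize(s, alphabet):
    """Split s into letters of a prefix-free alphabet, left to right."""
    tokens, i = [], 0
    while i < len(s):
        for letter in alphabet:
            if s.startswith(letter, i):
                tokens.append(letter)
                i += len(letter)
                break
        else:
            # No letter of the alphabet starts at position i.
            raise ValueError(f"cannot tokenize at position {i}")
    return tokens


def length(s, alphabet):
    """Number of letters (not characters) in s."""
    return len(tokenize(s, alphabet))


def rev(s, alphabet):
    """Write the letters of s in reverse order, keeping each letter intact."""
    return "".join(reversed(tokenize(s, alphabet)))


sigma = ["B", "aB", "bab", "d"]
s = "BaBbabBd"
print(tokenize(s, sigma))  # ['B', 'aB', 'bab', 'B', 'd']
print(length(s, sigma))    # 5
print(rev(s, sigma))       # dBbabaBB
```

Note that reversing character by character would give a different, incorrect result, because the letters aB and bab must stay intact.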
36
Defining Languages
Languages can be defined in different
ways, such as by descriptive definition, by
recursive definition, using regular
expressions (RE) and using finite
automata (FA).
Descriptive definition of language:
The language is defined by describing the
conditions imposed on its words.
37
Defining Languages
Example:
The language L of strings of odd length,
defined over Σ={a}, can be written as
L={a, aaa, aaaaa,…..}
Example:
The language L of strings that do not start
with a, defined over Σ={a,b,c}, can be written
as
L={b, c, ba, bb, bc, ca, cb, cc, …}
38
Defining Languages
Example:
The language L of strings of length 2,
defined over Σ={0,1,2}, can be written as
L={00, 01, 02,10, 11,12,20,21,22}
Example:
The language L of strings ending in 0,
defined over Σ ={0,1}, can be written as
L={0,00,10,000,010,100,110,…}
39
Defining Languages
Example: The language EQUAL, of strings with
number of a’s equal to number of b’s, defined
over Σ={a,b}, can be written as
{Λ ,ab,aabb,abab,baba,abba,…}
Example: The language EVEN-EVEN, of strings
with even number of a’s and even number of
b’s, defined over Σ={a,b}, can be written as
{Λ, aa, bb, aaaa,aabb,abab, abba, baab, baba,
bbaa, bbbb,…}
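
A descriptive definition translates directly into a membership test. The Python sketch below states the conditions for EQUAL and EVEN-EVEN as predicates; the function names in_equal and in_even_even are illustrative assumptions.

```python
def in_equal(w):
    """w is over {a, b} and has as many a's as b's."""
    return set(w) <= {"a", "b"} and w.count("a") == w.count("b")


def in_even_even(w):
    """w is over {a, b} with an even number of a's and of b's."""
    return (set(w) <= {"a", "b"}
            and w.count("a") % 2 == 0
            and w.count("b") % 2 == 0)


print(in_equal("abba"), in_equal("aab"))        # True False
print(in_even_even("baab"), in_even_even("ab"))  # True False
print(in_equal(""), in_even_even(""))            # True True (the null string)
```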
40
Defining Languages
Example: The language INTEGER, of strings
defined over Σ={-,0,1,2,3,4,5,6,7,8,9}, can
be written as
INTEGER = {…,-2,-1,0,1,2,…}
Example: The language EVEN, of strings
defined over Σ={-,0,1,2,3,4,5,6,7,8,9}, can
be written as
EVEN = { …,-4,-2,0,2,4,…}
41
Defining Languages
Example: The language {a^n b^n}, of strings
defined over Σ={a,b}, as
{a^n b^n : n=1,2,3,…}, can be written as
{ab, aabb, aaabbb, aaaabbbb,…}
Example: The language {a^n b^n a^n}, of strings
defined over Σ={a,b}, as
{a^n b^n a^n : n=1,2,3,…}, can be written as
{aba, aabbaa, aaabbbaaa, aaaabbbbaaaa,…}
42
Defining Languages
Example: The language factorial, of strings
defined over Σ={0,1,2,3,4,5,6,7,8,9}, i.e.
{1,2,6,24,120,…}
Example: The language FACTORIAL, of
strings defined over Σ={a}, as
{a^(n!) : n=1,2,3,…}, can be written as
{a,aa,aaaaaa,…}. It is to be noted that the
language FACTORIAL can be defined over
any single-letter alphabet.
43
Defining Languages
Example: The language DOUBLEFACTORIAL,
of strings defined over Σ={a, b}, as
{a^(n!) b^(n!) : n=1,2,3,…}, can be written as
{ab, aabb, aaaaaabbbbbb,…}
Example: The language SQUARE, of strings
defined over Σ={a}, as
{a^(n^2) : n=1,2,3,…}, can be written as
{a, aaaa, aaaaaaaaa,…}
44
Defining Languages
Example: The language
DOUBLESQUARE, of strings defined
over Σ={a,b}, as
{a^(n^2) b^(n^2) : n=1,2,3,…}, can be written as
{ab, aaaabbbb, aaaaaaaaabbbbbbbbb,…}
45
Defining Languages
Example: The language PRIME, of
strings defined over Σ={a}, as
{a^p : p is prime}, can be written as
{aa,aaa,aaaaa,aaaaaaa,aaaaaaaaaaa,…}
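
For languages over a single letter, membership depends only on the length of the string. A minimal Python sketch for PRIME, using trial division; the names is_prime and in_prime are hypothetical.

```python
def is_prime(n):
    """Trial division; adequate for the short strings used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def in_prime(w):
    """w is a string of a's whose length is a prime number."""
    return set(w) <= {"a"} and is_prime(len(w))


print([n for n in range(1, 12) if in_prime("a" * n)])  # [2, 3, 5, 7, 11]
```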
46
An Important language
 PALINDROME:
The language consisting of Λ and the
strings s defined over Σ such that
Rev(s)=s.
It is to be noted that the words of
PALINDROME are called palindromes.
 Example:For Σ={a,b},
PALINDROME={Λ , a, b, aa, bb, aaa, aba,
bab, bbb, ...}
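
The listing above can be reproduced by testing Rev(s)=s on every string up to a given length. This Python sketch assumes single-character letters, so that reversing a string character by character is the same as reversing its letters; the function names are illustrative.

```python
from itertools import product


def is_palindrome(s):
    """Rev(s) = s, for an alphabet of single-character letters."""
    return s == s[::-1]


def palindromes_up_to(alphabet, max_len):
    """All palindromes over alphabet with at most max_len letters."""
    words = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if is_palindrome(w):
                words.append(w)
    return words


# The empty string '' plays the role of the null string.
print(palindromes_up_to("ab", 3))
# ['', 'a', 'b', 'aa', 'bb', 'aaa', 'aba', 'bab', 'bbb']
```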
47
Note
The number of strings of length m defined over an
alphabet of n letters is n^m.
Examples:
The language of strings of length 2, defined
over Σ={a,b} is L={aa, ab, ba, bb} i.e.
number of strings = 2^2 = 4
The language of strings of length 3, defined
over Σ={a,b} is L={aaa, aab, aba, baa, abb,
bab, bba, bbb} i.e. number of strings = 2^3 = 8
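
The count can be confirmed by enumeration. A minimal Python sketch, assuming single-character letters; strings_of_length is a hypothetical helper built on itertools.product.

```python
from itertools import product


def strings_of_length(alphabet, m):
    """All strings of exactly m letters over the given alphabet."""
    return ["".join(t) for t in product(alphabet, repeat=m)]


print(strings_of_length("ab", 2))       # ['aa', 'ab', 'ba', 'bb']
print(len(strings_of_length("ab", 3)))  # 8
```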
48
Exercise
Q) Prove that there are as many palindromes
of length 2n, defined over Σ = {a,b,c}, as
there are of length 2n-1. Determine the
number of palindromes of length 2n defined
over the same alphabet as well.

49
KLEENE STAR Closure
Given Σ, then the KLEENE STAR Closure of
the alphabet Σ, denoted by Σ*, is the
collection of all strings defined over Σ,
including Λ.
It is to be noted that KLEENE STAR Closure
can be defined over any set of strings.
50
Examples
 If Σ = {x}
Then Σ* = {Λ, x, xx, xxx, xxxx, ….}
 If Σ = {0,1}
Then Σ* = {Λ, 0, 1, 00, 01, 10, 11, ….}
 If Σ = {aaB, c}
Then Σ* = {Λ, aaB, c, aaBaaB, aaBc, caaB,
cc, ….}
51
Note
Languages generated by the Kleene star closure
of a set of strings are infinite languages, provided the set
contains at least one string other than Λ. (By
infinite language, it is meant that the
language contains infinitely many words, each
of finite length.)
The words are listed in lexicographic order:
shorter words first, and then the words of the
same length in alphabetical order.
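
Since the closure is usually infinite, a program can only list it up to a length bound. The Python sketch below builds the words of S* with at most max_len characters and sorts them shortest first, then alphabetically. The function name kleene_star is an assumption, and length is measured in characters, which agrees with the number of letters only when every letter is a single character.

```python
def kleene_star(S, max_len):
    """Words of S* with at most max_len characters, in the order above."""
    words = {""}      # the null string is always in the closure
    frontier = {""}
    while frontier:
        new = set()
        for w in frontier:
            for s in S:
                c = w + s
                if len(c) <= max_len and c not in words:
                    new.add(c)
        words |= new
        frontier = new
    return sorted(words, key=lambda w: (len(w), w))


print(kleene_star(["x"], 3))       # ['', 'x', 'xx', 'xxx']
print(kleene_star(["0", "1"], 2))  # ['', '0', '1', '00', '01', '10', '11']
```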
52
Example
Let S={aa, b} then
 S* ={Λ plus any word composed of factors
aa and b}
 S* ={Λ plus all strings of a’s and b’s in which
the a’s occur in even clumps}
 ={Λ, b, aa, bb, aab, baa, bbb, aaaa, aabb, baab, bbaa, bbbb, …}
NOTE: the string aabaaab is not in S*
53
Example
Let S={a, ab} then
 S* ={Λ plus any word composed of factors
a and ab}
 S* ={Λ plus all strings of a’s and b’s except
those that start with b and those that contain
a double b}
 ={Λ, a, aa, ab, aaa, aab, aba, …}
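
Whether a given string belongs to S* can be decided by checking which prefixes can be built from factors of S. The Python sketch below is one way to do this with a table of reachable positions; in_star is a hypothetical name, and the method is an illustration rather than something taken from the lecture.

```python
def in_star(w, S):
    """Return True if w is a concatenation of zero or more factors from S."""
    # reachable[i] is True when the prefix w[:i] is in S*.
    reachable = [True] + [False] * len(w)
    for i in range(len(w)):
        if reachable[i]:
            for s in S:
                if s and w.startswith(s, i):
                    reachable[i + len(s)] = True
    return reachable[len(w)]


print(in_star("aabaaab", ["aa", "b"]))  # False: the clump aaa is odd
print(in_star("baab", ["aa", "b"]))     # True
print(in_star("aab", ["a", "ab"]))      # True
print(in_star("abb", ["a", "ab"]))      # False: contains a double b
```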
54
Example
Parentheses can be letters of the alphabet
 If Σ = {x ( ) }
Then Σ* contains Λ, x, (, ), xx, x(, (x), … (every string of these three letters)
Length(xxxxx)=5
Length( (xx)(xxx) )=9
55
Note
If an alphabet has no letters then its closure is the
language with the null string as its only word.
 If Σ = Ф
Then Σ* = { Λ }
This is not the same situation as
S={ Λ }, for which
S* ={ Λ } also holds, but for a different reason: ΛΛ = Λ
56
Task
 Q)
1) Let S={ab, bb} and T={ab, bb, bbbb}. Show that S* = T*.
2) Let S={ab, bb} and T={ab, bb, bbb}. Show that S* ≠ T* but S* ⊂ T*.
3) Let S={a, bb, bab, abaab} be a set of strings. Are
abbabaabab and baabbbabbaabb in S*? Does any
word in S* have an odd number of b’s?
57
PLUS Operation (+)
The plus operation is the same as the Kleene star closure
except that it does not generate Λ (the null string)
automatically.
Example:
 If Σ = {0,1}
Then Σ+ = {0, 1, 00, 01, 10, 11, ….}
If Σ = {aab, c}
Then Σ+ = {aab, c, aabaab, aabc, caab, cc, ….}
58
Remark
It is to be noted that the Kleene star can also be
applied to a single string, i.e. a* can be considered
to be all possible strings defined over {a}, which
shows that a* generates
Λ, a, aa, aaa, …
It may also be noted that a+ can be considered
to be all possible nonempty strings defined over
{a}, which shows that a+ generates
a, aa, aaa, aaaa, …
59
Theorem 1
For any set S of strings we have
S*=S**
Proof:
Every word in S** is made up of factors from S*.
Every factor from S* is made up of factors from S, so every
word in S** is made up of factors from S.
Hence every word in S** is also a word in S*, which we can write as
S** ⊆ S* --------------------------1
We also know that A ⊆ A* for any set A.
Taking A=S* gives S* ⊆ S** --------------------------2
By 1 and 2
S*=S**
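
The theorem can be sanity-checked on a small case by comparing S* and S** up to a length bound. This Python sketch is an illustration, not a proof; the helper star, the sample set S and the bound 6 are all assumptions chosen for the example.

```python
def star(S, max_len):
    """Words of S* that have at most max_len characters."""
    words = {""}
    frontier = {""}
    while frontier:
        new = set()
        for w in frontier:
            for s in S:
                c = w + s
                if len(c) <= max_len and c not in words:
                    new.add(c)
        words |= new
        frontier = new
    return words


S = {"aa", "b"}
bound = 6
s_star = star(S, bound)
# Any word of S** of length <= bound uses only factors of length <= bound,
# so applying star to the truncated S* loses nothing below the bound.
s_star_star = star(s_star, bound)
print(s_star == s_star_star)  # True
```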
60
TASK
Q1)Is there any case when S+ contains Λ? If
yes then justify your answer.
Q2) Prove that for any set of strings S
i. (S+)*=(S*)*
ii. (S+)+=S+
iii. Is (S*)+=(S+)*
61
Defining Languages Continued…
 Recursive definition of languages
The following three steps are used in recursive
definition
1. Some basic objects (words) are specified in the
language.
2. Rules for constructing more objects (words)
are defined in the language.
3. No objects (strings) except those constructed
in above, are allowed to be in the language.
62
Example
Defining language of POSITIVE
INTEGER
Rule 1:
1 is in INTEGER.
Rule 2:
If x is in INTEGER then x+1 is
also in INTEGER.
Rule 3:
No strings except those constructed by
the rules above are allowed to be in INTEGER.
63
Example
Defining language of EVEN
Even is the set of the all positive whole
numbers divisible by 2
Even is the set of all 2n where
n=1,2,3,4,5,…..
64
Example
 Defining language of EVEN
Rule 1:
2 is in EVEN.
Rule 2:
If x is in EVEN then x+2 is also in EVEN.
Rule 3:
No strings except those constructed by the rules above are
allowed to be in EVEN.
Assignment: state and prove two more recursive definitions
of EVEN
65
Example
Defining language of POSITIVE and
NEGATIVE INTEGER
Rule 1:
1 is in INTEGER.
Rule 2:
If both x and y are in INTEGER then x+y and
x-y are also in INTEGER.
Rule 3:
No strings except those constructed by
the rules above are allowed to be in INTEGER.
66
Example
Defining the language factorial
Rule 1:
As 0!=1, 1 is in factorial.
Rule 2:
If (n-1)! is in factorial, then n!=n*(n-1)! is also in factorial.
Rule 3:
No strings except those constructed by the rules above
are allowed to be in factorial.
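
The recursive rules for factorial can be turned into a generator of words. A minimal Python sketch, assuming count is at least 1; the function name factorial_words is hypothetical.

```python
def factorial_words(count):
    """First `count` distinct words of the language factorial, as strings."""
    value = 1               # Rule 1: 0! = 1, so 1 is in the language
    words = [str(value)]
    n = 1
    while len(words) < count:
        n += 1
        value *= n          # Rule 2: n! = n * (n-1)!
        words.append(str(value))
    return words


print(factorial_words(5))  # ['1', '2', '6', '24', '120']
```

Rule 3 is reflected in the fact that nothing enters the list except through the two rules.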
67
Example
 Defining the language PALINDROME, defined
over Σ = {a,b}
Rule 1:
a and b are in PALINDROME
Rule 2:
if x is a palindrome, then s(x)Rev(s) and xx are also
palindromes, where s belongs to Σ*
Rule 3:
No strings except those constructed by the rules above
are allowed to be in PALINDROME
68
Example
Defining the language {a^n b^n}, n=1,2,3,… ,
of strings defined over Σ={a,b}
Rule 1:
ab is in {a^n b^n}
Rule 2:
if x is in {a^n b^n}, then axb is in {a^n b^n}
Rule 3:
No strings except those constructed by
the rules above are allowed to be in {a^n b^n}
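
The two constructive rules can be applied repeatedly to list the words in order. A minimal Python sketch, assuming max_n is at least 1; the function name anbn is illustrative.

```python
def anbn(max_n):
    """Words a^n b^n for n = 1 .. max_n, built by the recursive rules."""
    words = ["ab"]                           # Rule 1: ab is in the language
    while len(words) < max_n:
        words.append("a" + words[-1] + "b")  # Rule 2: x -> axb
    return words


print(anbn(4))  # ['ab', 'aabb', 'aaabbb', 'aaaabbbb']
```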
69
Example
 Defining the language L, of strings ending in a ,
defined over Σ={a,b}
Rule 1:
a is in L
Rule 2:
if x is in L then s(x) is also in L, where s belongs to Σ*
Rule 3:
No strings except those constructed in above, are
allowed to be in L
70
Example
 Defining the language L, of strings beginning and
ending in same letters , defined over Σ={a, b}
Rule 1:
a and b are in L
Rule 2:
(a)s(a) and (b)s(b) are also in L, where s belongs to Σ*
Rule 3:
No strings except those constructed in above, are
allowed to be in L
71
Example
 Defining the language L, of strings containing aa
or bb , defined over
Σ={a, b}
Rule 1:
aa and bb are in L
Rule 2:
s(aa)t and s(bb)t are also in L, where s and t belong to Σ*
Rule 3:
No strings except those constructed in above, are
allowed to be in L
72
Example
 Defining the language L, of strings containing
exactly aa, defined over
Σ={a, b}
Rule 1:
aa is in L
Rule 2:
s(aa)t is also in L, where s and t belong to b*
Rule 3:
No strings except those constructed in above, are
allowed to be in L
73
Example
 An Important Language ARITHMETIC EXPRESSION (A.E)
Rule 1:
Any number (positive, negative or zero) is in A.E
Rule 2: if x is in A.E, then so are
 (x)
 -x
(provided x does not already start with a – sign)
Rule 3: if x and y are in A.E, then so are
 x+y
 x-y
 x*y
 x/y
 x**y
No strings except those constructed by the rules above are allowed to be in A.E
Example: (2+4)*(7*(9-3)/4)*(2+8)-1
74
Theorem-2
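An arithmetic expression cannot contain the
character $
Proof
$ is not part of any number, so it cannot be introduced by rule 1.
If x does not contain $, then neither (x) nor -x contains it, so rule 2 cannot introduce it.
If neither x nor y contains $, then none of x+y, x-y, x*y, x/y and x**y contains it, so rule 3 cannot introduce it.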
75
Theorem-3
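No A.E can begin or end with the symbol /
Proof
No number begins or ends with /, so rule 1 cannot produce such a string.
If x does not begin or end with /, then neither does (x) or -x, so rule 2 cannot produce one.
If x does not begin with / and y does not end with /, then none of the expressions formed by rule 3 begins or ends with /.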
76
Theorem-4
No A.E contains the substring //
77
Summing Up
Recursive definition of languages: INTEGER,
EVEN, factorial, PALINDROME, {a^n b^n}, and
languages of strings (i) ending in a, (ii)
beginning and ending in the same letter, (iii)
containing aa or bb, (iv) containing exactly aa.
78