Welcome to Theory of Automata

Text and Reference Material
1. Introduction to Computer Theory, by Daniel I. Cohen, John Wiley and Sons, Inc., 1991, Second Edition
2. Introduction to Languages and Theory of Computation, by J. C. Martin, McGraw Hill Book Co., 1997, Second Edition

What does automata mean?
It is the plural of automaton, and it means "something that works automatically".
Automata theory is the study of abstract computing devices, or machines.

History
Turing studied an abstract machine that had all the capabilities of today's computers.
His goal was to describe precisely the boundary between what a computing machine could do and what it could not do.
In the 1940s and 1950s, simpler kinds of machines, called "finite automata", were studied by researchers to model brain functions.
In the 1950s, linguists began to study formal grammars.
These grammars serve as the basis of some important software components, including parts of compilers.
Finite automata and formal grammars are used in the design and construction of software.

Why Study Automata Theory
Finite automata are a useful model for many kinds of hardware and software:
Software for designing and checking the behavior of digital circuits.
The lexical analyzer of a typical compiler.
Software for scanning large bodies of text.
Software for verifying systems of all types that have a finite number of distinct states, such as communication protocols.

Example
A simple nontrivial finite automaton is an on/off switch.
The device remembers whether it is in the on state or the off state.
[Figure: state diagram with states ON and OFF, a Start arrow, and an arc labeled Push in each direction.]
States are represented by circles.
Arcs are labeled with inputs.

Example (Recognizing "then")
[Figure: chain of states for the prefixes of "then": the start state, then T, TH, THE, THEN, reached by arcs labeled T, H, E, N.]
Inputs are letters.
The lexical analyzer examines one character of the program at a time.
The start state corresponds to the empty string.
Each state has a transition on the next letter of "then".

Introduction to languages
A language is a set of symbols that expresses ideas and allows people to think and communicate with each other.
Language is a system at many levels.
Not just a collection of words, language consists of rules and patterns that relate the words to one another.

Introduction to languages
There are two types of languages:
Formal languages (syntactic languages)
Informal languages (semantic languages)

English Language
There are three different entities:
1. Letters
2. Words
3. Sentences
Groups of letters make words.
Groups of words make sentences.
Not all collections of letters form valid words.
Not all collections of words form valid sentences.
If the analogy is continued, collections of sentences make paragraphs and collections of paragraphs make stories.

English Language
Humans agree on which sequences are valid and which are not.
A similar situation exists with computer languages: certain character strings are recognizable words (Do, If, End).
Certain strings of words are recognizable as commands, and commands become a program that can then be compiled into machine code.

English Language
First we decide whether an input is a valid communication; then we need rules for decoding exactly what the communication means.
A language must be able to tell which strings are in and which are out.
These rules are very hard to state.

Theory of Formal Language
"Formal" refers to the fact that all the rules for the language are explicitly stated in terms of what strings of symbols can occur.
No liberties are tolerated.
It is a game of symbols with formal rules, not an expression of ideas in the minds of humans.

Formal Languages (Alphabets)
Definition: A finite non-empty set of symbols (letters) is called an alphabet. It is denoted by Σ (Greek letter sigma).
Example:
Σ={a,b}
Σ={0,1} //important as this is the language which the computer understands.
Σ={i,j,k}

NOTE:
A certain version of the language ALGOL has 113 letters.
Its Σ (alphabet) includes letters, digits and a variety of operators, including sequential operators such as GOTO and IF.

Strings
Definition: A concatenation of finitely many symbols (letters) from the alphabet is called a string.
Example: If Σ={a,b} then a, abab, aaabb, ababababababababab are strings over Σ.

NOTE: EMPTY STRING or NULL STRING
Sometimes a string with no symbols or letters at all is used. It is denoted by λ (small Greek letter lambda) or Λ (capital Greek letter lambda) and is called the empty string or null string.
Whatever alphabet is being considered, the null string is always Λ.
The capital lambda will mostly be used to denote the empty string in further discussion.

Words
Definition: Words are strings belonging to some language.
Example: If Σ={x} then a language L can be defined as
L={x^n : n=1,2,3,…} or L={x, xx, xxx, …}
Here x, xx, … are the words of L.

NOTE:
The finite set of fundamental units out of which we build structures is called the alphabet.
A specified set of strings of characters from the alphabet is called a language.
Those strings that are permissible in the language are called words.
A string can contain only finitely many letters or symbols.
All words are strings, but not all strings are words.

Note
Two words are considered the same if they have the same letters in the same order.
There is only one word with no letters, namely Λ.
The symbol Λ is not allowed to be part of the alphabet of any language.
The language that has no words is denoted by Ф.
It is not true that Λ is a word in the language Ф.

Note
The language Ф does not contain Λ.
If a language L does not contain Λ and we want to add Λ to L, we use the union-of-sets operator '+' to form L + {Λ}.
This language is not the same as L, but L + Ф = L.
If we have a method for producing a language and in a certain instance the method produces nothing, we can say either that the method produced Ф or that it failed.

English
The whole alphabet is represented as Σ={a, b, c, d, …}
Sometimes elements are separated by commas or spaces, and sometimes uppercase letters are used.
From this alphabet, the valid strings are English-word={all words in a standard dictionary}

English
This language still has no grammar. To make a formal definition of sentences, use a second alphabet, capital gamma:
Γ={entries in a standard dictionary, blank space, usual punctuation marks}
This produces sentences such as
I am teaching
You are listening
If we only follow the rules of grammar, we also get
I ate three Tuesdays
You ate cloths
These are grammatically correct but have wrong meanings.
In formal languages these sentences are correct.
We are interested in syntax alone, not semantics or diction.
The set of rules defining English is a grammar.

Example
My_subject
The alphabet for this language is {E A P T S W}
There is only one word in this language, which I wish to specify as follows:
If the earth and moon ever collide then My_subject={SE}
If the earth and moon never collide then My_subject={AT}

Example
It is impossible to be certain whether the word AT is or is not in the language My_subject, so this is not an adequate definition of a language.
A set of rules must enable us to decide, in a finite amount of time, whether a given string of alphabet letters is or is not a word in the language.
It is not required that all the letters of the alphabet appear in the words selected for the language.

Defining Languages
There are two kinds of rules for defining languages:
how to test whether a string is a valid word, OR
how to construct all the words in the language.
Example: If Σ={a} then a language L can be defined as
L={a, aa, aaa, aaaa, …} or L={a^n : n=1,2,3,…}
Here a, aa, … are the words of L.
In this language, concatenation behaves like addition:
if aa is concatenated with aaa, we get aaaaa.
In general, a^n concatenated with a^m is the word a^{n+m}.
A convenient notation is x=aaa and y=aa, so that xy=aaaaa.

Defining Languages
It is not always true that when two words are concatenated they produce another word in the language.
For example, let L2={a, aaa, aaaaa, aaaaaaa, …}={a^odd}={a^{2n+1} : n=0,1,2,3,…}.
If x=aaa and y=aaaaa then xy=aaaaaaaa is not in L2, although the alphabets of L2 and of the earlier language L1={a, aa, aaa, …} are the same.
Also, here xy=yx, but in some cases that is not true: if x=house and y=boat then xy=houseboat and yx=boathouse, so xy ≠ yx.

Valid/In-valid alphabets
While defining an alphabet, the alphabet may contain letters consisting of a group of symbols, for example Σ1={B, aB, bab, d}.
Now consider an alphabet Σ2={B, Ba, bab, d} and a string BababB.

Valid/In-valid alphabets
This string BababB can be tokenized in two different ways:
(Ba), (bab), (B)
(B), (abab), (B)
The second grouping cannot be identified as a string defined over Σ2, because abab is not a letter of Σ2.

Valid/In-valid alphabets
When this string is scanned by the compiler (lexical analyzer), the first symbol B is identified as a letter belonging to Σ2, while the second letter cannot be identified by the lexical analyzer. So while defining an alphabet, it should be kept in mind that ambiguity should not be created.

Remarks:
While defining an alphabet whose letters consist of more than one symbol, no letter should start with another letter of the same alphabet, i.e. one letter should not be the prefix of another. However, a letter may end in another letter of the same alphabet, i.e. one letter may be the suffix of another.

Conclusion
Σ1={B, aB, bab, d}
Σ2={B, Ba, bab, d}
Σ1 is a valid alphabet while Σ2 is an in-valid alphabet.

Length of Strings
Definition: The length of a string s, denoted by |s|, is the number of letters in the string.
Example:
Σ={a,b}
s=ababa
|s|=5

Length of Strings
Example:
Σ={B, aB, bab, d}
s=BaBbabBd
Tokenizing=(B), (aB), (bab), (B), (d)
|s|=5
length(Λ)=0, and if length(w)=0 then w=Λ.

Reverse of a String
Definition: The reverse of a string s, denoted by Rev(s) or s^r, is obtained by writing the letters of s in reverse order.
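The tokenizing, length, and reverse definitions above can be sketched in Python. This is only an illustrative sketch: the function names tokenize, length and rev are hypothetical, and the scanner simply gives up (returns None) when a position matches no letter or more than one letter, which is how it treats the ambiguous alphabet Σ2.

```python
def tokenize(s, alphabet):
    """Split s into letters of `alphabet`, scanning left to right.

    Returns the list of letters, or None when some position matches
    no letter or more than one letter (the ambiguous case).
    """
    tokens = []
    i = 0
    while i < len(s):
        matches = [letter for letter in alphabet if s.startswith(letter, i)]
        if len(matches) != 1:
            return None
        tokens.append(matches[0])
        i += len(matches[0])
    return tokens


def length(s, alphabet):
    """Number of letters (not characters) in s; assumes s tokenizes."""
    return len(tokenize(s, alphabet))


def rev(s, alphabet):
    """Write the letters of s in reverse order; assumes s tokenizes."""
    return "".join(reversed(tokenize(s, alphabet)))


sigma1 = ["B", "aB", "bab", "d"]
sigma2 = ["B", "Ba", "bab", "d"]

print(tokenize("BaBbabBd", sigma1))  # ['B', 'aB', 'bab', 'B', 'd']
print(length("BaBbabBd", sigma1))    # 5
print(rev("BaBbabBd", sigma1))       # dBbabaBB
print(tokenize("BababB", sigma2))    # None: both B and Ba match at position 0
```

Note that the reverse is taken letter by letter, not character by character, which is why the letter aB stays intact inside the reversed string.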
Example: If s=abc is a string defined over Σ={a,b,c} then Rev(s) or s^r = cba

Example:
Σ={B, aB, bab, d}
s=BaBbabBd
Rev(s)=dBbabaBB

Defining Languages
Languages can be defined in different ways, such as descriptive definition, recursive definition, using Regular Expressions (RE) and using Finite Automaton (FA), etc.
Descriptive definition of language: The language is defined by describing the conditions imposed on its words.

Defining Languages
Example: The language L of strings of odd length, defined over Σ={a}, can be written as
L={a, aaa, aaaaa, …}
Example: The language L of strings that do not start with a, defined over Σ={a,b,c}, can be written as
L={b, c, ba, bb, bc, ca, cb, cc, …}

Defining Languages
Example: The language L of strings of length 2, defined over Σ={0,1,2}, can be written as
L={00, 01, 02, 10, 11, 12, 20, 21, 22}
Example: The language L of strings ending in 0, defined over Σ={0,1}, can be written as
L={0, 00, 10, 000, 010, 100, 110, …}

Defining Languages
Example: The language EQUAL, of strings with number of a's equal to number of b's, defined over Σ={a,b}, can be written as
{Λ, ab, aabb, abab, baba, abba, …}
Example: The language EVEN-EVEN, of strings with even number of a's and even number of b's, defined over Σ={a,b}, can be written as
{Λ, aa, bb, aaaa, aabb, abab, abba, baab, baba, bbaa, bbbb, …}

Defining Languages
Example: The language INTEGER, of strings defined over Σ={-,0,1,2,3,4,5,6,7,8,9}, can be written as
INTEGER={…, -2, -1, 0, 1, 2, …}
Example: The language EVEN, of strings defined over Σ={-,0,1,2,3,4,5,6,7,8,9}, can be written as
EVEN={…, -4, -2, 0, 2, 4, …}

Defining Languages
Example: The language {a^n b^n}, of strings defined over Σ={a,b}, as {a^n b^n : n=1,2,3,…}, can be written as
{ab, aabb, aaabbb, aaaabbbb, …}
Example: The language {a^n b^n a^n}, of strings defined over Σ={a,b}, as {a^n b^n a^n : n=1,2,3,…}, can be written as
{aba, aabbaa, aaabbbaaa, aaaabbbbaaaa, …}

Defining Languages
Example: The language factorial, of strings defined over
Σ={0,1,2,3,4,5,6,7,8,9}, i.e. {1, 2, 6, 24, 120, …}
Example: The language FACTORIAL, of strings defined over Σ={a}, as {a^{n!} : n=1,2,3,…}, can be written as
{a, aa, aaaaaa, …}
Note that the language FACTORIAL can be defined over any single-letter alphabet.

Defining Languages
Example: The language DOUBLEFACTORIAL, of strings defined over Σ={a,b}, as {a^{n!} b^{n!} : n=1,2,3,…}, can be written as
{ab, aabb, aaaaaabbbbbb, …}
Example: The language SQUARE, of strings defined over Σ={a}, as {a^{n^2} : n=1,2,3,…}, can be written as
{a, aaaa, aaaaaaaaa, …}

Defining Languages
Example: The language DOUBLESQUARE, of strings defined over Σ={a,b}, as {a^{n^2} b^{n^2} : n=1,2,3,…}, can be written as
{ab, aaaabbbb, aaaaaaaaabbbbbbbbb, …}

Defining Languages
Example: The language PRIME, of strings defined over Σ={a}, as {a^p : p is prime}, can be written as
{aa, aaa, aaaaa, aaaaaaa, aaaaaaaaaaa, …}

An Important language
PALINDROME: The language consisting of Λ and the strings s defined over Σ such that Rev(s)=s.
Note that the words of PALINDROME are called palindromes.
Example: For Σ={a,b}, PALINDROME={Λ, a, b, aa, bb, aaa, aba, bab, bbb, …}

Note
The number of strings of length m defined over an alphabet of n letters is n^m.
Examples:
The language of strings of length 2, defined over Σ={a,b}, is L={aa, ab, ba, bb}, i.e. number of strings = 2^2
The language of strings of length 3, defined over Σ={a,b}, is L={aaa, aab, aba, baa, abb, bab, bba, bbb}, i.e. number of strings = 2^3

Exercise
Q) Prove that there are as many palindromes of length 2n, defined over Σ={a,b,c}, as there are of length 2n-1. Determine the number of palindromes of length 2n defined over the same alphabet as well.

KLEENE STAR Closure
Given Σ, the Kleene Star Closure of the alphabet Σ, denoted by Σ*, is the collection of all strings defined over Σ, including Λ.
Note that the Kleene Star Closure can be defined over any set of strings.
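The counting note and the Kleene Star Closure definition above can be checked with a small Python sketch. Since Σ* is infinite, the hypothetical helper kleene_star below only lists the strings with at most max_len letters; it assumes sigma is an alphabet (no letter is a prefix of another), and the Python empty string "" stands in for Λ.

```python
from itertools import product


def kleene_star(sigma, max_len):
    """Strings over sigma with at most max_len letters, shortest first.

    The empty string "" plays the role of the null string.
    """
    words = []
    for m in range(max_len + 1):
        # product(sigma, repeat=m) yields every sequence of m letters.
        for letters in product(sigma, repeat=m):
            words.append("".join(letters))
    return words


print(kleene_star(["0", "1"], 2))
# ['', '0', '1', '00', '01', '10', '11']

# The number of strings of length m over n letters is n^m.
sigma = ["a", "b"]
length_3 = [w for w in kleene_star(sigma, 3) if len(w) == 3]
print(len(length_3) == len(sigma) ** 3)  # True
```

Bounding the length is a design choice made only so that the enumeration terminates; the closure itself has no such bound.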
Examples
If Σ={x} then Σ*={Λ, x, xx, xxx, xxxx, …}
If Σ={0,1} then Σ*={Λ, 0, 1, 00, 01, 10, 11, …}
If Σ={aaB, c} then Σ*={Λ, aaB, c, aaBaaB, aaBc, caaB, cc, …}

Note
Languages generated by the Kleene Star Closure of a set of strings are infinite languages, provided the set contains at least one non-empty string. (By infinite language, it is meant that the language contains infinitely many words, each of finite length.)
The words are listed in lexicographic order: shorter words first, and words of the same length in alphabetical order.

Example
Let S={aa, b} then
S*={Λ plus any word composed of factors of aa and b}
S*={Λ plus all strings of a's and b's in which the a's occur in even clumps}
={Λ, b, aa, bb, aab, baa, bbb, aaaa, aabb, baab, bbaa, bbbb, …}
NOTE: the string aabaaab is not in S*

Example
Let S={a, ab} then
S*={Λ plus any word composed of factors of a and ab}
S*={Λ plus all strings of a's and b's except those that start with b and those that contain a double b}
={Λ, a, aa, ab, aaa, aab, aba, …}

Example
Parentheses can be letters of the alphabet.
If Σ={x ( )} then Σ*={Λ, x, (, ), xx, x(, x), (x, ((, (), )x, )(, )), …}
Length(xxxxx)=5
Length((xx)(xxx))=9

Note
If an alphabet has no letters, then its closure is a language with the null string as its only word.
If Σ=Ф then Σ*={Λ}
This is not the same as: if S={Λ} then S*={Λ}, which is also true, but because ΛΛ=Λ.

Task
Q)
1) Let S={ab, bb} and T={ab, bb, bbbb}. Show that S*=T*
2) Let S={ab, bb} and T={ab, bb, bbb}. Show that S* ≠ T* but S* ⊂ T*
3) Let S={a, bb, bab, abaab} be a set of strings. Are abbabaabab and baabbbabbaabb in S*? Does any word in S* have an odd number of b's?

PLUS Operation (+)
The Plus Operation is the same as the Kleene Star Closure except that it does not generate Λ (null string) automatically.
Example:
If Σ={0,1} then Σ+={0, 1, 00, 01, 10, 11, …}
If Σ={aab, c} then Σ+={aab, c, aabaab, aabc, caab, cc, …}

Remark
Note that the Kleene Star can also be applied to a single string, i.e.
a* can be considered to be all possible strings defined over {a}, which shows that a* generates Λ, a, aa, aaa, …
Note also that a+ can be considered to be all possible non-empty strings defined over {a}, which shows that a+ generates a, aa, aaa, aaaa, …

Theorem 1
For any set S of strings we have S*=S**
Proof:
Every word in S** is made up of factors from S*.
Every factor from S* is made up of factors from S.
So every word in S** is made up of factors from S.
Hence every word in S** is also a word in S*, which we can write as
S** ⊆ S* --------------------------(1)
We also know that A ⊆ A* for any set A. Taking A=S* gives
S* ⊆ S** --------------------------(2)
By (1) and (2), S*=S**

TASK
Q1) Is there any case when S+ contains Λ? If yes, then justify your answer.
Q2) Prove that for any set of strings S
i. (S+)*=(S*)*
ii. (S+)+=S+
iii. Is (S*)+=(S+)*?

Defining Languages Continued…
Recursive definition of languages
The following three steps are used in a recursive definition:
1. Some basic objects (words) are specified in the language.
2. Rules for constructing more objects (words) are defined in the language.
3. No objects (strings) except those constructed above are allowed to be in the language.

Example
Defining the language of POSITIVE INTEGER
Rule 1: 1 is in INTEGER.
Rule 2: If x is in INTEGER then x+1 is also in INTEGER.
Rule 3: No strings except those constructed above are allowed to be in INTEGER.

Example
Defining the language of EVEN
EVEN is the set of all positive whole numbers divisible by 2.
EVEN is the set of all 2n where n=1,2,3,4,5,…

Example
Defining the language of EVEN
Rule 1: 2 is in EVEN.
Rule 2: If x is in EVEN then x+2 is also in EVEN.
Rule 3: No strings except those constructed above are allowed to be in EVEN.
Assignment: state and prove two more recursive definitions of EVEN.

Example
Defining the language of POSITIVE and NEGATIVE INTEGER
Rule 1: 1 is in INTEGER.
Rule 2: If both x and y are in INTEGER then x+y and x-y are also in INTEGER.
Rule 3: No strings except those constructed above are allowed to be in INTEGER.

Example
Defining the language factorial
Rule 1: As 0!=1, 1 is in factorial.
Rule 2: If (n-1)! is in factorial, then n!=n*(n-1)! is in factorial.
Rule 3: No strings except those constructed above are allowed to be in factorial.

Example
Defining the language PALINDROME, defined over Σ={a,b}
Rule 1: Λ, a and b are in PALINDROME.
Rule 2: If x is in PALINDROME, then s(x)Rev(s) and xx are also in PALINDROME, where s belongs to Σ*.
Rule 3: No strings except those constructed above are allowed to be in PALINDROME.

Example
Defining the language {a^n b^n}, n=1,2,3,…, of strings defined over Σ={a,b}
Rule 1: ab is in {a^n b^n}.
Rule 2: If x is in {a^n b^n}, then axb is in {a^n b^n}.
Rule 3: No strings except those constructed above are allowed to be in {a^n b^n}.

Example
Defining the language L, of strings ending in a, defined over Σ={a,b}
Rule 1: a is in L.
Rule 2: If x is in L then s(x) is also in L, where s belongs to Σ*.
Rule 3: No strings except those constructed above are allowed to be in L.

Example
Defining the language L, of strings beginning and ending in the same letter, defined over Σ={a,b}
Rule 1: a and b are in L.
Rule 2: (a)s(a) and (b)s(b) are also in L, where s belongs to Σ*.
Rule 3: No strings except those constructed above are allowed to be in L.

Example
Defining the language L, of strings containing aa or bb, defined over Σ={a,b}
Rule 1: aa and bb are in L.
Rule 2: s(aa)t and s(bb)t are also in L, where s and t belong to Σ*.
Rule 3: No strings except those constructed above are allowed to be in L.

Example
Defining the language L, of strings containing exactly aa, defined over Σ={a,b}
Rule 1: aa is in L.
Rule 2: s(aa)t is also in L, where s and t belong to b*.
Rule 3: No strings except those constructed above are allowed to be in L.

Example
An Important Language: ARITHMETIC EXPRESSION (A.E)
Rule 1: Any number (positive, negative or zero) is in A.E.
Rule 2: If x is in A.E, then so are (x) and -x (provided that x does not start with
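a minus sign already).
Rule 3: If x and y are in A.E, then so are
x+y
x-y
x*y
x/y
x**y
No strings except those constructed above are allowed to be in A.E.
Example of an A.E: (2+4)*(7*(9-3)/4)*(2+8)-1

Theorem-2
An arithmetic expression cannot contain the character $.
Proof
The character $ is not part of any number, so it cannot be introduced by Rule 1.
If x does not contain $, then neither do (x) and -x, so it cannot be introduced by Rule 2.
If x and y do not contain $, then neither does any expression formed by Rule 3.

Theorem-3
No A.E can begin or end with the symbol /.
Proof
No number begins or ends with /, so such an expression cannot come from Rule 1.
If x does not begin or end with /, then neither do (x) and -x, so it cannot come from Rule 2.
If x and y do not begin or end with /, then neither does any expression formed by Rule 3.

Theorem-4
No A.E contains the substring //.

Summing Up
Recursive definition of languages: INTEGER, EVEN, factorial, PALINDROME, {a^n b^n}, and the languages of strings (i) ending in a, (ii) beginning and ending in the same letter, (iii) containing aa or bb, (iv) containing exactly aa.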
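The three-step recursive definitions summarized above can be sketched as a small Python closure routine. This is an illustrative sketch only: the function generate and its max_len cut-off are assumptions, not part of the lecture. It starts from the basic words (Rule 1), applies the construction rules repeatedly (Rule 2), and admits nothing else (Rule 3). For PALINDROME it uses a simplified variant of Rule 2 in which s is a single letter and Λ is a basic word.

```python
def generate(base, rules, max_len):
    """Close `base` under `rules`, keeping words of at most max_len characters.

    Each rule maps a word x to a list of new words built from x.
    """
    words = set(base)        # Rule 1: the basic objects
    frontier = list(base)
    while frontier:
        x = frontier.pop()
        for rule in rules:   # Rule 2: construct more words from x
            for y in rule(x):
                if len(y) <= max_len and y not in words:
                    words.add(y)
                    frontier.append(y)
    # Rule 3: nothing else is added. Sort shorter words first.
    return sorted(words, key=lambda w: (len(w), w))


# {a^n b^n}: ab is in the language; if x is, so is axb.
anbn = generate(["ab"], [lambda x: ["a" + x + "b"]], 8)
print(anbn)  # ['ab', 'aabb', 'aaabbb', 'aaaabbbb']

# PALINDROME over {a, b}, simplified rule: if x is a palindrome,
# so are axa and bxb. The empty string "" stands for the null string.
pal = generate(["", "a", "b"], [lambda x: ["a" + x + "a", "b" + x + "b"]], 3)
print(pal)  # ['', 'a', 'b', 'aa', 'bb', 'aaa', 'aba', 'bab', 'bbb']
```

The length bound is what makes the loop stop: every rule here only produces longer words, so only finitely many candidates survive the cut-off.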