AUTOMATA AND LANGUAGE THEORY
CS342 - Fall 2023
Lecture 1: Overview & Finite Automata (FA)
Dr. Fawzya Ramadan
Fawzya.ramadan@fayoum.edu.eg
2/21/2023

Overview
• This course focuses on the definitions and properties of fundamental mathematical models of computation (Automata Theory).
• We will be interested in both the inherent capabilities and the limitations of these computational models, as well as their relationships with formal languages.
• The course covers three broad areas:
  • (1) Regular Languages
  • (2) Context-free Languages
  • (3) Turing machines

Automata Theory
• Finite automata (FA):
  • Used in text processing, compilers, and hardware design.
  • They are at the heart of many electromechanical devices such as automatic doors, elevators, dishwashers, calculators, etc.
• Context-Free Grammars (CFG):
  • Used to describe the syntax of essentially every modern programming language.
  • Every modern compiler uses CFG concepts to parse programs.
  • Also used to describe natural languages.
• Turing Machines:
  • These form a simple abstract model of a “real” computer, such as your PC at home.

• Lectures:
  • Wednesday 01:00 - 03:00 pm at Hall 2
• Main Textbook: “Introduction to the Theory of Computation”
• Section:
  • Eng. Asmaa Seoudy
  • Thursday 11:00 - 01:00 pm at Hall 5
  • Thursday 01:00 - 03:00 pm at Hall 5
• Office hours (Tentative):
  • Monday 09 am - 11 am
• Course Classroom
  • Class Room Link: https://classroom.google.com/c/NTkzODMwMjI4NzQz?cjc=lnrhtio
  • Class Room Code: lnrhtio
  • All announcements, lecture slides, assignments, grades, etc. will be published on the course classroom, so check it regularly for updates.
• Reference Books
  • P. Linz. Introduction to Formal Languages and Automata
  • J. Hopcroft, R. Motwani, and J. Ullman.
    Introduction to Automata Theory, Languages, and Computation

Grading
• Assessment Methods:
  • Sheets & Quizzes: 20%
  • Midterm Exam: 20%
  • Final Exam: 60%
• Notes:
  • All assignments are individual.

Course Policies
• Slides:
  • Will usually be uploaded to the web on the morning of the class.
  • The slides are for MY convenience and for helping you recollect the material covered.
  • They are not a substitute for, or a comprehensive summary of, the textbook.
• Resources:
  • We will follow the textbook closely.
  • There are more resources than you can possibly read, including books, lecture slides, and notes.

Course Policies (Cont.)
• Assignments Submission:
  • All assignments will be handed out during the section.
  • All students must meet the submission date.
  • Late submissions are not permitted; they may be accepted only in exceptional circumstances, and then with a loss of some credit.
• Plagiarism:
  • Will be dealt with very strictly.
  • Many students find it helpful to consult their peers while doing assignments, and this is expected. However, it is not acceptable practice to pool thoughts and produce common answers.
  • Students who allow their assignments to be copied are as guilty as those who copy and will be treated accordingly.
  • Copying solutions from the Internet, books, or any other public sources without explicit citations is prohibited!

Course Topics
• Regular Languages and their descriptors:
  • Finite automata
  • Nondeterministic finite automata (NFA)
  • Regular expressions (RE)
  • Closure properties of regular languages.
• Context-free languages and their descriptors:
  • Context-free grammars (CFG)
  • Pushdown automata (PDA)
• Turing Machines (TM)

Course Schedule
Topics                                           Readings
Course Overview & Introduction                   Ch 0
Finite Automata: DFA                             Ch 1.1
Finite Automata: NFA                             Ch 1.2
NFA Equivalence to DFA                           Ch 1.2
Regular Expression: RE                           Ch 1.3
Midterm                                          -
Context-Free Grammar and Languages: CFG & CFL    Ch 2.1
Push Down Automata: PDA                          Ch 2.2
Pumping Lemma                                    Ch 2.3
Turing Machine                                   Ch 3.1
Revision                                         -
Quiz

Courses That Depend on CS342
• CS441: Compiler Construction (Mandatory)
• CS467: Theory of Computation (Elective)

Overview of Languages
• Sentences are the basic building blocks of languages.
• Sentence = Syntax + Semantics
• Grammar is the study of the structure of a sentence.
• Ex.:
  <sentence> ::= <noun phrase> <verb> <noun phrase>
  <noun phrase> ::= <article> <noun>
  “A person entered the room”
  Can you draw the derivation tree of the above sentence?

Natural vs. Formal Languages
Natural Language:
• Rules come after the language
• Imprecise
• Ambiguous
• Hard to process by machine
• Highly flexible
• No special learning effort needed
• Ex. English, French, Arabic, etc.
Formal Language:
• Developed with strict rules
• Precise
• Unambiguous
• Can be processed by machine
• Unfamiliar notations
• Initial learning effort needed
• Ex. C++, Java, Python, etc.

Mathematical Preliminaries (Sets)
• A set is a collection of objects called “elements” or “members”.
• Set membership (∈) and non-membership (∉):
  • A = {1, 2, 3}, 1 ∈ A
  • B = {train, bicycle, bus, airplane}, ship ∉ B
• Represented as:
  • D = {2, 4, 6, …}
  • D = { j : j > 0, and j = 2k for some k > 0 }
  • D = { j : j is positive and even }
• A finite set is a set with a finite number of elements, while an infinite set is a set with an infinite number of elements.
  • C = {a, b, c, d, e, f, g, …, z} is a finite set
  • D = {2, 4, 6, 8, …} is an infinite set

Mathematical Preliminaries (Sets)
• Set operations: Union, Intersection, and Complement
• Subset: A is a subset of B, written A ⊆ B
• Proper subset: a proper subset of a set is any subset other than the set itself (it omits at least one element), whereas the improper subset of a set is the set itself.
• For example, if set A = {2, 4, 6}, then:
  • Subsets: {2}, {4}, {6}, {2,4}, {4,6}, {2,6}, {2,4,6}, and Φ or {}.
  • Proper subsets: {}, {2}, {4}, {6}, {2,4}, {4,6}, {2,6}
  • Improper subset: {2,4,6}
  • Proper subset symbol: A ⊂ B
  • Symbol that also allows the improper subset: A ⊆ B
• Power set (P) is the set of all subsets of a set, including the empty set and the set itself.
  • Example: A = {0, 1}. What is P(A)?
• Empty set (∅) is the set with zero members; it is a subset of every set.
  • ∅ = {}
• Neither order nor repetition of members matters in sets.
  • {1, 2, 3} and {3, 2, 1} and {1, 2, 3, 2} are all the same set.

Mathematical Preliminaries (Sequence)
• A sequence of objects is a list of these objects in some order (e.g. function parameters).
• The sequence 7, 21, 57 is written (7, 21, 57).
• Set vs. Sequence:
  • (7, 21, 57) and (21, 7, 57) are different sequences
  • {7, 21, 57} and {21, 7, 57} are the same set
  • (7, 7, 57) and (7, 57) are different sequences
  • {7, 7, 57} and {7, 57} are the same set
• Finite sequences are often called tuples.
• A sequence with k elements is a k-tuple.
• (7, 21, 57) is a 3-tuple and (1, 5) is a 2-tuple (pair).

Mathematical Preliminaries (Functions and Relations)
• A function is an object that sets up an input-output relationship: it takes an input and produces an output.
• Also called a mapping: it maps a given input to an output.
• f(a) = b means f maps a to b
• Domain (D): the set of possible inputs.
• Range (R): the set of possible outputs.
• f : D → R means f is a function with domain D and range R
• Example: the function add is written as f : Z × Z → Z

Mathematical Preliminaries (Functions and Relations)
• Functions can be described in different ways:
  • Procedure
  • Table: 1-D if the input is one argument, or 2-D if the domain is a Cartesian product of two sets.
• A predicate or property is a function whose range is {TRUE, FALSE}.
• A relation is a property whose domain is a set of k-tuples.
  • Example: “less than” is a relation with 2-tuple input (a binary relation)
  • For a binary relation R, aRb means aRb = TRUE
• A binary relation R is an equivalence relation if:
  • R is reflexive: for every a, aRa
  • R is symmetric: for every a and b, aRb implies bRa
  • R is transitive: for every a, b, and c, aRb and bRc imply aRc

Alphabets
• An alphabet (Σ) is defined to be any nonempty finite set; its members are the symbols of the alphabet.
• Examples:
  • Σ = {0, 1}
  • Σ = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z}

Strings
• A string (w) over an alphabet is a finite sequence of symbols from that alphabet.
• Example:
  • 101011 is a string over Σ = {0, 1}
  • cat is a string over Σ = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z}
• If w is a string over Σ, then the length of w, |w|, is the number of symbols it contains.
  • Example: |101011| = 6
• The empty string (ε) is the string of length zero: |ε| = 0

Languages
• A language (L) is a set of strings.
• A language L over an alphabet Σ is a set of strings over Σ.
• Example: the language of nonempty, even-length strings of a's over the alphabet Σ = {a, b} is:
  • L = {aa, aaaa, aaaaaa, ..., a^k, ...}, where k is a positive even number.
  • L = {a^k | k is a positive even number}
• Σ* is the set of all strings over Σ (i.e. any language L ⊆ Σ*).
• Simple Languages:
  • L = All words in the American Heritage Dictionary
  • L = Σ*, where Σ = {a, b}
  • L = {x ∈ Σ* : |x| = 3} = {aaa, aab, aba, abb, baa, bab, bba, bbb}

Complicated Languages
• The set of strings x ∈ {a, b}* such that x has more a's than b's.
• The set of strings x ∈ {0, 1}* such that x is the binary representation of a prime number.
• All C programs that do not go into an infinite loop.

Be Careful to Distinguish…
• ε: the empty string (a string)
• ∅: the empty set (a set, possibly a language)
• {ε}: the set containing one element, which is the empty string (a language)
• {∅}: the set containing one element, which is the empty set (a set of sets, maybe a set of languages)

Automaton
• An automaton is a machine that automatically performs a range of functions according to a predetermined set of coded instructions.
• Different kinds of automata are distinguished by their temporary memory.
[Figure: block diagram of an automaton with CPU, input memory, program memory, output memory, and temporary memory]

Finite Automata (FA)
• No temporary memory
• Ex: Vending machine with small computing power
[Figure: block diagram of a finite automaton with CPU, input memory, program memory, and output memory]

Pushdown Automata (PDA)
• Temporary memory is a stack
• Ex: Programming languages (medium computing power)
[Figure: block diagram of a pushdown automaton with CPU, input memory, program memory, output memory, and a stack (push, pop)]

Turing Machine
• Temporary memory is random access memory
• Ex: Algorithms (highest computing power)
[Figure: block diagram of a Turing machine with CPU, input memory, program memory, output memory, and random access memory]

FINITE AUTOMATA (FA)

Overview
An automaton is:
• a self-operating machine that can move automatically.
• a machine or control mechanism designed to follow a precise sequence of instructions.
A finite automaton (FA) is a model of a computer with an extremely limited amount of memory.
Examples of FA devices:
• Elevator controller.
• Household appliance controllers, such as dishwashers and electronic thermostats.
• Automatic door controller.
[Figure: Pinocchio automaton (Wikipedia)]

Finite Automata (FA)
• Consider a one-way automatic door.
• The door has two pads that can sense when someone is standing on them: a front pad and a rear pad.
• State Diagram:
[Figure: state diagram with states closed and open. closed stays closed on Rear, Both, and Neither, and moves to open on Front. open stays open on Front, Rear, and Both, and returns to closed on Neither.]

Finite Automata (FA)
• A finite automaton (FA) is the simplest machine that can recognize an infinite language with a finite number of states (a finite memory).
• An FA is a “read once”, “no write” procedure.
• An FA is made up of states and transitions: as it reads each input symbol, it makes a transition to a next state determined by the current state and that symbol.
• Finite automata fall into two classes according to their transition function:
  • Deterministic (DFA)
  • Nondeterministic (NFA)

Finite Automata (FA)
• The automaton receives an input string and produces an output, which is either accept or reject.
• The process proceeds as follows:
  1. Start at the start state of the machine.
  2. Receive the input string symbol by symbol, from left to right.
  3. Read each symbol and make a transition from one state to another.
  4. Produce the output after reading the last symbol: accept if the current state is an accept state, and reject otherwise.

Finite Automata (FA) – State Diagram
• The state diagram of an FA consists of states and transitions.
• One of the states is the start state, indicated by an arrow pointing to it from nowhere.
• Any number of states (possibly none) are accept (final) states, indicated by a double circle.
[Figure: state diagram of a finite automaton called M1 that has three states: q1, q2, and q3]

Finite Automata (FA) Example
[Figure: the finite automaton M1]
• What will be the output when feeding M1 with the strings “1101”, “0100”, and “101000”?
• What type of strings does M1 accept or reject?
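The four processing steps above can be sketched as a short simulator and applied to the three strings in the question. This is a minimal sketch, not the lecture's own code: the transition table `M1_DELTA` is an assumption, since the state diagram does not survive in this text. It is chosen to agree with the start state q1, the accept set F = {q2}, and the language stated for this machine later in the lecture. The names `run_dfa` and `m1_accepts` are hypothetical.

```python
# Assumed transition table for M1 (the diagram is not legible in this text).
# Keys are (current state, input symbol); values are the next state.
M1_DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q2",
    ("q3", "0"): "q2", ("q3", "1"): "q2",
}
M1_START = "q1"
M1_ACCEPT = {"q2"}


def run_dfa(delta, start, accept, w):
    """Simulate a DFA on the string w and return True if it accepts."""
    state = start                        # step 1: start at the start state
    for symbol in w:                     # step 2: read symbols left to right
        state = delta[(state, symbol)]   # step 3: make one transition per symbol
    return state in accept               # step 4: accept iff we end in an accept state


def m1_accepts(w):
    """Run the assumed machine M1 on w."""
    return run_dfa(M1_DELTA, M1_START, M1_ACCEPT, w)


if __name__ == "__main__":
    for w in ["1101", "0100", "101000"]:
        print(w, "accept" if m1_accepts(w) else "reject")
    # Under the assumed table: 1101 accept, 0100 accept, 101000 reject
```

Storing the transition function as a dictionary keyed by (state, symbol) pairs mirrors the table form of a function with a Cartesian-product domain described in the mathematical preliminaries.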

Formal Definition of FA
• A formal definition resolves any uncertainties about what is allowed in an FA.
• The formal definition says that a finite automaton is a 5-tuple (quintuple) of five objects:
  1. Set of states (Q)
  2. Input alphabet (Σ)
  3. Rules for moving: the transition function (δ)
  4. Start state (q0)
  5. Accept states (F)

Formal Definition of DFA
• The formal definition of a DFA is M = (Q, Σ, δ, q0, F), where:
  • Q is a finite set of states.
  • Σ is a finite set of symbols, called the alphabet.
  • δ : Q × Σ → Q is the transition function that defines the rules for moving.
  • q0 ∈ Q is the start state.
  • F ⊆ Q is the set of accept (final) states.

Questions about DFA (Answers)
• Is an FA allowed to have 0 accept states?
  Yes, as F can be the empty set.
• Must a DFA have exactly one transition exiting every state for each possible input symbol?
  Yes, as δ specifies exactly one next state for each possible combination of a state and an input symbol.

Formal Definition of DFA (Example)
• How can we formally define this automaton?
  1. Q = {q1, q2, q3}
  2. Σ = {0, 1}
  3. δ is described as: [transition table]
  4. q1 is the start state
  5. F = {q2}

Language Recognized by FA
• The language of an FA machine M is denoted by L(M).
• L(M) is the set A of all strings that machine M accepts, i.e. L(M) = A.
• We say that M recognizes A.
• Example: L(M) = A, where A = { w | w contains at least one 1 and an even number of 0s follow the last 1 }

Language Recognized by FA
• How many strings can a machine accept, and how many languages can it recognize?
  A machine may accept several strings, but it recognizes only one language.
• If a machine accepts no strings, what language does it recognize?
  The machine recognizes the empty language (∅).

DFA Example 1
• Give the state diagram of a DFA recognizing the following language, with Σ = {0, 1}:
  • L1 = {w | w contains the substring 001}
[Figure: state diagram of a DFA for L1 with states q1, q2, q3, and q4]

DFA Example 2
• Give the state diagram of a DFA recognizing the following language, with Σ = {0, 1}:
  • L2 = {w | w contains either 101 or 11 as a substring}
[Figure: state diagram of a DFA for L2 with states q1, q2, q3, and q4]

END OF LECTURE! See you next week…
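
The two DFA examples above can be checked mechanically. The sketch below encodes one possible four-state DFA for each language and compares it against a direct substring test on every binary string up to a chosen length. The transition tables `D1` and `D2` are assumptions, since the slide diagrams are not legible in this text; they are written to be plausible readings of those diagrams, not a transcription. All function names are hypothetical.

```python
from itertools import product


def run_dfa(delta, start, accept, w):
    """Simulate a DFA on the string w and return True if it accepts."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accept


# Assumed DFA for L1 = {w | w contains the substring 001}.
# q1: no progress, q2: just read "0", q3: just read "00", q4: "001" found.
D1 = {
    ("q1", "0"): "q2", ("q1", "1"): "q1",
    ("q2", "0"): "q3", ("q2", "1"): "q1",
    ("q3", "0"): "q3", ("q3", "1"): "q4",
    ("q4", "0"): "q4", ("q4", "1"): "q4",
}

# Assumed DFA for L2 = {w | w contains either 101 or 11 as a substring}.
# q1: no progress, q2: just read "1", q3: just read "10", q4: pattern found.
D2 = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q4",
    ("q3", "0"): "q1", ("q3", "1"): "q4",
    ("q4", "0"): "q4", ("q4", "1"): "q4",
}


def accepts_l1(w):
    return run_dfa(D1, "q1", {"q4"}, w)


def accepts_l2(w):
    return run_dfa(D2, "q1", {"q4"}, w)


def all_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)


def check(max_len=10):
    """Compare both DFAs with the language definitions on all short strings."""
    for w in all_strings(max_len):
        assert accepts_l1(w) == ("001" in w)
        assert accepts_l2(w) == ("101" in w or "11" in w)
    return True


if __name__ == "__main__":
    print(check(10))
```

An exhaustive check on short strings is not a proof that a DFA recognizes its language, but it catches most mistakes in a hand-drawn transition table.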