Uploaded by CARTOONFLIX

CH 1 - Introduction to FLAT

advertisement
Formal Languages
and
Theory of Automata
Formal Languages and Theory of Automata
Chapter 1– Introduction
Basic Concepts of Finite Automata and
Languages
Outline
• Introduction
• Alphabets and Strings
• What is a Formal Language?
• What is Automata Theory?
• General Model of Automaton
• Introduction to Finite Automata
Introduction
• The term Formal Languages refers to languages that can be
described by a body of systematic rules – simple and precise
mathematical rules.
• The formal language theory as a discipline started from the work
of linguist Noam Chomsky in the 1950s –
– When he attempted to give a precise characterization of the structure of
natural languages.
– His goal was to define the syntax of languages using simple and precise
mathematical rules.
– Later it was found that the syntax of programming languages can be
described using one of Chomsky’s grammatical models called Context
Free Grammars.
Introduction – cont.…
• Formal Language theory mainly deals with the mathematical
properties of strings and collection of strings.
• Soon after the advent of modern electronic computers, people
realized that all forms of information – whether numbers, names,
pictures, or sound waves can be represented as strings.
• Then collection of strings known as languages became central to
computer science.
• Also theoretical computer science studies the mathematical
foundation of computer science
– which investigates the power and limitations of computing devices using
abstract mathematical models of computers.
Introduction – cont.…
• Models should be as simple as possible so that they can be easily
analyzed; yet they have to be powerful enough to be able to perform
relevant computation process.
– The model ignores the implementation details of individual computers and
concentrates on the actual computation process.
• Real computers are modeled with a very simple abstract mathematical
model – called Turing Machine.
– Using it we are able to prove that whether some computational problems can
be algorithmically solvable (decidable) or cannot be algorithmically solvable (i.e.
undecidable).
• Automata theory is the study of abstract machines and automata as
well as the computational problems that can be solved using them.
Introduction – cont.…
• An automaton is an abstract model of machines that perform
computation on an input.
• Automata theory and formal language theory are closely related,
where an automaton is a finite representation of a formal language
that may be an infinite set.
• Throughout the course, we deal with the relationship between formal
languages and automaton by constructing accepting devices for four
types of formal languages:
–
–
–
–
Regulars language and Finite automata
Context free languages and Pushdown automata
Context sensitive languages and Linear-bounded automata
Recursively enumerable languages and Turing machines.
Definitions
• Symbol – An atomic unit, such as a digit, character, lowercase letter, etc. Sometimes a word. [Formal language does
not deal with the “meaning” of the symbols.]
• Alphabet – A finite set of symbols, usually denoted by Σ.
Σ = {0, 1} Σ = {0, a, 4}
Σ = {a, b, c, d}
• String – A finite length sequence of symbols, presumably
from some alphabet.
Alphabets and strings
• A common way to talk about words, number, pairs of words, etc.
is by representing them as strings
• To define strings, we start with an alphabet
An alphabet is a finite set of symbols.
• Examples
S1 = {a, b, c, d, …, z}: the set of letters in English
S2 = {0, 1, …, 9}: the set of (base 10) digits
S3 = {a, b, …, z, #}: the set of letters plus the
special symbol #
S4 = {(, )}: the set of open and closed brackets
Strings
A string over alphabet S is a finite sequence
of symbols in S.
• The empty string will be denoted by e (epsilon)
• Examples
abfbz is a string over S1 = {a, b, c, d, …, z}
9021 is a string over S2 = {0, 1, …, 9}
ab#bc is a string over S3 = {a, b, …, z, #}
))()(() is a string over S4 = {(, )}
Strings
Let: u=ε
w = 0110 y = 0aa
x = aabcaa z = 111
concatenation: wz = 0110111
length:
|w| = 4
reversal:
yR = aa0
|x| = 6
but |u| = 0
Some special sets of strings
•
Σ*
All strings of symbols from Σ
•
Σ+
Σ* - {ε}
Example:
Σ = {0, 1}
•
Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, 001,…}
•
Σ+ = {0, 1, 00, 01, 10, 11, 000, 001,…}
What is Formal Language ?
A (formal) language is a set of strings over an alphabet.
i.e. any subset L of Σ*
• Classes of formal languages (e.g., regular, context-free,
context-sensitive, recursive, recursively enumerable).
• Formal languages are related to programming languages and
natural languages
Formal Language Examples:
Σ = {0, 1}

L = {x | x is in Σ* and x contains an even number of 0’s}

= {00, 010,11100,00100,…}
Σ = {0, 1, 2,…, 9, .}
L = {x | x is in Σ* and x forms a finite length real number}
= {0, 1.5, 9.326,…}
Σ = {a, b, c,…, z, A, B,…, Z}
L = {x | x is in Σ* and x is a CPP reserved word}
= {while,
for, if, int, …}

Formal Language Examples:

Σ = {CPP reserved words} U { (, ), ., :, ;,…} U {Legal CPP
identifiers}

L = {x | x is in Σ* and x is a syntactically correct CPP
program}

Σ = {English words}
L = {x | x is in Σ* and x is a syntactically correct English
sentence}

What is automata theory ?
• Automata theory is the study of abstract computational devices
• Abstract devices are (simplified) models of real computations
• Computations happen everywhere: On your laptop, on your cell
phone, in nature, …
• Why do we need abstract models?
A simple “computer”
BATTERY
input: switch
output: light bulb
actions: flip switch
states: on, off
A simple “computer”
f
BATTERY
start
on
off
f
input: switch
output: light bulb
actions: f for “flip switch”
states: on, off
bulb is on if and only if there is
an odd number of flips
Another “computer”
1
1
start
off
2
BATTERY
off
1
2
2
2
1
2
off
1
on
inputs: switches 1 and 2
actions: 1 for “flip switch 1”
actions: 2 for “flip switch 2”
states: on, off
bulb is on if and only if both
switches were flipped an odd
number of times
A design problem
1
BATTERY
2
3
?
4
5
Can you design a circuit where the light is on if and only if all the
switches were flipped exactly the same number of times?
A design problem
• Such devices are difficult to reason about, because they can be
designed in an infinite number of ways
• By representing them as abstract computational devices, or
automata, we will learn how to answer such questions
These abstract devices can model many things
• They can describe the operation of any “small computer”, like
the control component of an alarm clock or a microwave
• They are also used in lexical analyzers to recognize well formed
expressions in programming languages:
ab1 is a legal name of a variable in C++
5u= is not
Different kinds of automata
• This was only one example of a computational device, and there
are others
• We will look at different devices, and look at the following
questions:
– What can a given type of device compute, and what are its limitations?
– Is one type of device more powerful than another?
Some devices we will see
finite automata
Devices with a finite amount of memory.
Used to model “small” computers.
push-down automata
Devices with infinite memory that can be accessed in
a restricted way.
Used to model parsers, etc.
Turing Machines
Devices with infinite memory.
Used to model any computer.
Preliminaries of automata theory
• How do we formalize the question
Can device A solve problem B?
• First, we need a formal way of describing the problems that we
are interested in solving
Problems
• Examples of problems we will consider
–
–
–
–
Given a word s, does it contain the subword “fool”?
Given a number n, is it divisible by 7?
Given a pair of words s and t, are they the same?
Given an expression with brackets, e.g. (()()), does every left bracket
match with a subsequent right bracket?
• All of these have “yes/no” answers.
• There are other types of problems, that ask “Find this” or “How
many of that” but we won’t look at those.
General Model of Automaton
●
An automaton is defined as a system where
energy, materials and information are
transformed, transmitted and used for performing
some functions without direct participation of man.
●
Examples are
●
●
●
automatic machine tools,
automatic packing machines, and
automatic photo printing machines, etc.
26
General Model of Automaton
●
In computer science the term 'automaton'
means 'discrete automation' and is defined in a
more abstract way as shown in figure below.
27
General Model of Automaton
u
28
General Model of Automaton
●
The characteristics of an automaton are:
4) State relation: The next state of an automaton at
any instant of time is determined by the present
state and input.
5) Output relation: The output is related to either state
only or to both the input and the state.
• It should be noted that at any instant of time the automaton
is in some state.
• On 'reading' an input symbol, the automaton moves to a next
state which is given by the state relation.
29
General Model of Automaton
●
We can note the following with regard to output relation:
i.
An automaton in which the output depends only on the
input is called an automaton without a memory.
ii.
An automaton in which the output depends on the states
as well, is called automaton with a finite memory.
iii. An automaton in which the output depends only on the
states of the machine is called a Moore machine.
iv. An automaton in which the output depends on the state
as well as on the input at any instant of time is called a
Mealy machine.
30
Finite Automata
Example of a finite automaton
f
on
off
f
• There are states off and on, the automaton starts in off and tries
to reach the “good state” on
• What sequences of fs lead to the good state?
• Answer: {f, fff, fffff, …} = {f n: n is odd}
• This is an example of a deterministic finite automata over
alphabet {f}
Deterministic finite automata
• A deterministic finite automata (DFA) is a 5-tuple (Q, S, d, q0, F)
where
–
–
–
–
–
Q is a finite set of states
S (sigma) is an alphabet
d: Q × S → Q is a transition function
q0  Q is the initial state
F  Q is a set of accepting states (or final states).
• In diagrams, the accepting states will be denoted by double loops
Deterministic finite automata
• At the initial time, it is assumed to be in the initial state
q0, with its input mechanism on the leftmost symbol of the
input string.
• During each move of the automation, the input mechanism
advances one position to the right, so each move consumes one
input symbol.
• When the end of the string is reached, the string is accepted if
the automation is in one of its final states; otherwise the string is
rejected.
Example
0
q0
1
1
q1
0,1
0
q2
State transition diagram
For every transition rule d(qi,a)= qj, the graph
has an edge (qi, qj) labeled a.
The vertex associated with q0 is called the initial
vertex, while those labeled with qf are the final
vertices.
Transition table
inputs
states
alphabet S = {0, 1}
Set of states Q = {q0, q1, q2}
initial state q0
accepting states F = {q0, q1}
q0
q1
q2
0
q0
q2
q2
1
q1
q1
q2
Language of a DFA
The language of a DFA (Q, S, d, q0, F) is the set of
all strings over S that, starting from q0 and
following the transitions as the string is read left
to right, will reach some accepting state.
f
M:
on
off
f
• Language of M is {f, fff, fffff, …} = {f n: n is odd}
Examples
0
0
1
q0
q1
1
0
1
1
q0
0
0
q0
q1
1
1
q1
0,1
0
What are the languages of these DFAs?
q2
Examples
• Construct a DFA that accepts the language
( S = {0, 1} )
L = {010, 1}
• Answer
0
q0
qe
1
q1
1
q01
0
q010
Examples
• Construct a DFA that accepts the language
( S = {0, 1} )
L = {010, 1}
• Answer
0
q0
1
0
qe
0
q01
q010
1
0, 1
1
q1
0, 1
qdie
0, 1
Examples
• Construct a DFA over alphabet {0, 1} that accepts all strings that
end in 101
Examples
• Construct a DFA over alphabet {0, 1} that accepts all strings that
end in 101
• Hint: The DFA must “remember” the last 3 bits of the string it is
reading
Download