courses:cs240-201601:finite-automata.pptx (875 KB)

CS240 • Language Theory and Automata • Spring 2016
Finite
Automata
Abstract Machines
• Modern computers are capable of performing a wide
variety of computations
• An abstract machine reads in an input string, and
depending on the input it
– outputs true (accept)
– outputs false (reject)
– gets stuck in an infinite loop and outputs nothing
• We say that a machine recognizes a particular
language if it outputs true for any input string in the
language it is designed to handle, and false
otherwise
• The artificial restriction to such decision problems is
purely for notational convenience
Language Recognition Problems
• Virtually all computational problems can be
recast as language recognition problems
• Examples:
– to determine whether an integer 97 is prime, ask whether 97
is in the language consisting of all primes {2, 3, 5, 7, 11, 13, ... }
– to determine the decimal expansion of the mathematical
constant π, ask whether 7 is the 100th digit of π and so on
Power
• We would like to be able to formally compare
different classes of abstract machines in order to
address questions like
– Is a(n abstract) Mac more powerful than a(n abstract) PC?
– Can Watson do more things than, say, a laptop?
• To accomplish this, we define a notion of power
– We say that machine A is at least as powerful as machine B if machine A
can be "programmed" to recognize all of the languages that B can
– Machine A is more powerful than B, if in addition, it can be programmed to
recognize at least one additional language
– Two machines are equivalent if they can be programmed to recognize
precisely the same set of languages
• Using this definition of power, we will classify several
fundamental machines
• We are interested in designing the most
powerful computer, i.e., the one that can
solve the widest range of language
recognition problems
– our notion of power does not say anything about
how fast a computation can be done
– reflects a more fundamental notion of whether or
not it is even possible to perform some
computation in a finite number of steps
Computing Machine
temporary memory
CPU
input memory
output memory
Program memory
Example
f(x) = x^3
Program memory holds two instructions:
Compute y = x*x
Compute y*x
Step 1: input memory holds x = 2
Step 2: temporary memory holds y = x*x = 4
Step 3: temporary memory holds y*x = 8, and output memory holds f(x) = 8
Different Kinds of Automata
• Automata are distinguished by their
temporary memory
– Finite Automata
• No temporary memory
– Pushdown Automata
• Stack
– Turing Machines
• Random access memory
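Finite Automaton
[Figure: CPU with input memory, output memory, and program memory; the temporary memory is crossed out]
Pushdown Automaton
[Figure: CPU with input memory, output memory, and program memory; the temporary memory is a stack]
Turing Machine
[Figure: CPU with input memory, output memory, and program memory; the temporary memory is random access memory]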
Power of Automata
Finite Automata
Pushdown
Automata
Less power
Solve fewer computational problems
Turing
Machines
More power
Solve more computational problems
Finite Automaton
• Perhaps the simplest type of machine that is still
interesting to study
– Many of its important properties carry over to more complicated
machines
– To understand more complicated machines, we first study FAs
• Captures the basic elements of an abstract
machine
– Reads in a string and outputs true (yes) or false (no)
– An output of True means the string is in the language the FA has
been programmed to recognize, False means it is not
• A useful practical abstraction because
– FAs retain sufficient flexibility to perform interesting tasks
– Hardware requirements for building them are relatively minimal
• FAs recognize a class of simple but highly useful
languages called regular languages
Definition
• A finite automaton is a graph with a finite number
of nodes, called states
• Arcs are labeled with one or more symbols from
some alphabet
• One state is designated the start state or initial
state
• Some states are final states or accepting states
• The language of the FA is the set of strings that
label paths that go from the start state to some
accepting state
Operation
• An FA is always in one of its n states, which we
(typically) name 0 through n-1
– Each state is labeled true (“yes”) or false (“no”)
• Begins in the start state
• As the input characters are read in one at a time,
changes from one state to another in a pre-specified
way
– new state is completely determined by the current state and the
character just read in
• When input is exhausted, outputs true (yes, the
string is in the language) or false (no, the string is not
in the language) according to the label of the state it
is currently in
Transition Diagram
• Represent an FA visually by a graph:
– nodes = states
– arc from q to p is labeled by the set of input
symbols a such that δ(q, a) = p
– No arc if no such a
– Start state indicated by an arrow
– Accepting states (those labeled “yes”) get
double circles
– Non-accepting states are implicitly labeled
“no”
Example
[Figure: transition diagram with states 𝑞1, 𝑞2, 𝑞3. 𝑞1 loops on 0 and goes to 𝑞2 on 1; 𝑞2 goes back to 𝑞1 on 0 and on to 𝑞3 on 1; 𝑞3 loops on 0,1]
• States are {𝑞1, 𝑞2, 𝑞3}
• Transitions are arrows labeled 0 or 1
• Start state is 𝑞1 (it has a regular arrow, labeled start, leading to it)
• The set of accept states is {𝑞3} (𝑞3 has a double circle)
• Each state has 2 arrows exiting it, labeled 0 and 1
– i.e., one for every symbol in the alphabet Σ of the language
• How does this automaton work when we feed it a string
such as 010110?
– Start at the start state 𝑞1
– Read in the input symbols one at a time, and follow the
transition arrow given by the next bit
STEPS:
0: take the arrow from 𝑞1 back to 𝑞1
1: take the arrow from 𝑞1 to 𝑞2
0: take the arrow back to 𝑞1
1: get to 𝑞2
1: get to 𝑞3
0: stay at 𝑞3
Since 𝑞3 is an accept state, the output is accept
How about input 101?
What strings does this machine
accept?
Answer
• The machine accepts exactly the strings with two
consecutive 1s
• The language of the automaton 𝐴, denoted 𝐿(𝐴),
is the set of accepted strings, i.e. the language
that the machine recognizes
– This term comes from linguistics
• We say that the language of 𝐴 is
𝐿(𝐴) = {𝑤 | 𝑤 has substring 11}
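The walkthrough of this three-state machine can be replayed in a few lines of Python. This is only a sketch: the names `DELTA`, `run_dfa`, and `accepts` are invented for illustration, and the table is read off the step-by-step trace on input 010110.

```python
# Transition table of the three-state example, keyed by (state, symbol).
# The entries follow the walkthrough: q1 loops on 0, q3 loops on 0 and 1.
DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q1", ("q2", "1"): "q3",
    ("q3", "0"): "q3", ("q3", "1"): "q3",
}
START = "q1"
ACCEPT = {"q3"}


def run_dfa(word):
    """Return every state visited while reading word, start state included."""
    state = START
    trace = [state]
    for symbol in word:
        state = DELTA[(state, symbol)]  # exactly one move per (state, symbol)
        trace.append(state)
    return trace


def accepts(word):
    """True if the machine ends in an accept state after reading word."""
    return run_dfa(word)[-1] in ACCEPT


if __name__ == "__main__":
    print(run_dfa("010110"))
    print(accepts("010110"), accepts("101"))
```

Comparing `accepts(w)` with the Python test `"11" in w` on many short strings is a quick way to convince yourself that the machine recognizes exactly the strings containing 11.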
What strings does this
automaton accept?
[Figure: transition diagram with states "0 mod 3" (start), "1 mod 3", and "2 mod 3", with arrows labeled 0 and 1]
Try several:
• 10
• 11
• 1101
• etc.
Any pattern? Think binary!
Binary numbers representing values
evenly divisible by 3
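A small Python sketch of this automaton follows. It assumes the usual reading of the diagram: the state is the value of the bits read so far, modulo 3, and reading bit b moves from remainder r to (2r + b) mod 3. The names `MOD3_DELTA` and `divisible_by_3` are illustrative only.

```python
# State r means "the bits read so far encode a number congruent to r mod 3".
# Appending bit b doubles the value and adds b, so r moves to (2*r + b) % 3.
MOD3_DELTA = {
    (0, "0"): 0, (0, "1"): 1,
    (1, "0"): 2, (1, "1"): 0,
    (2, "0"): 1, (2, "1"): 2,
}


def divisible_by_3(bits):
    """Run the three-state DFA on a binary string; accept in state 0."""
    state = 0  # start state "0 mod 3"
    for bit in bits:
        state = MOD3_DELTA[(state, bit)]
    return state == 0


if __name__ == "__main__":
    for word in ["10", "11", "1101", "1111"]:
        print(word, int(word, 2), divisible_by_3(word))
```

Note that the machine never stores the number itself, only one of three remainders, which is why a finite number of states is enough.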
FAs in Action
• Used in
– Text editors and search engines for pattern matching
– Compilers for lexical analysis
– Web browsers for html parsing
– Operating systems for graphical user interfaces
• Serve as the control unit in many physical systems,
including
– Vending machines, elevators, automatic traffic signals
– Computer microprocessors
– Network protocol stacks and old VCR clocks
• Play a key role in natural language processing and
machine learning
– Markov chains are probabilistic FAs used in part of speech tagging, speech
processing, and optical character recognition
String searching FAs
• One of the most important applications of FAs is
searching for patterns in strings
– at the heart of Web search engines like Google
• A simplified example of such a tool is an FA over
the binary alphabet {a, b}
– It accepts all input strings that contain the pattern aabaaabb as a
substring
FA for Newspaper Vending Machine
[Figure: transition diagram whose arrows are labeled with the coins nickel, dime, and quarter]
FA that recognizes simple identifiers
start
letter
letter or digit
other character
(delimiter)
• When we want to design a finite automaton
for a certain task, think as follows:
The states of the automaton represent its memory.
Use different states for different possibilities.
• For example:
– An automaton that accepts iff the string has an even number of
1s will have to count number of 1s mod 2
• You want to have one state for each possibility
– An automaton that accepts iff the first equals the last symbol
will have to keep track of what the first symbol is
• It should have different states for different possibilities of the
first symbol
Example: FA for coin flipping
• Four ways of arranging two coins, depending
on which is heads (H) and which is tails (T)
– HH, HT, TT, TH
• Two operations:
– Flip the first coin (a)
– Flip the second coin (b)
• Assume initially coins are laid out as HH
• What are all possible ways of applying the
operations so that the configuration is TT?
Model as an FA
State   a (flip first coin)   b (flip second coin)
>HH     TH                    HT
TH      HH                    TT
HT      TT                    HH
*TT     HT                    TH
The start state HH is marked with > and the final state TT is marked with *
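The coin model is small enough to simulate directly. The sketch below is one possible encoding, not the only one: a state is a two-letter string such as "HT", and `flip` and `reaches_tt` are hypothetical helper names.

```python
SWAP = {"H": "T", "T": "H"}


def flip(state, op):
    """Apply operation a (flip first coin) or b (flip second coin)."""
    first, second = state
    if op == "a":
        return SWAP[first] + second
    if op == "b":
        return first + SWAP[second]
    raise ValueError("unknown operation: " + op)


def reaches_tt(ops, start="HH"):
    """True if applying the operations in order turns start into TT."""
    state = start
    for op in ops:
        state = flip(state, op)
    return state == "TT"


if __name__ == "__main__":
    for ops in ["ab", "ba", "aab", "aaab", "abab"]:
        print(ops, reaches_tt(ops))
```

Trying a few sequences suggests the answer to the question on the slide: a sequence ends in TT exactly when it contains an odd number of a's and an odd number of b's.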
Conventions
• It helps if we can avoid mentioning the
type of every name by following some
rules:
– Input symbols are a, b, etc., or digits
– Strings of input symbols are u, v, . . . , z
– States are q, p, etc.
Problems
• Build an automaton to recognize:
– The set of strings with an even number of
1s
– The set of strings that start and end with
the same symbol
JFlap
http://www.jflap.org
• Software for experimenting with deterministic
and nondeterministic finite automata (among
others)
• Allows experimenting with construction proofs
from one form to another, such as converting
an NFA to a DFA to a regular expression
Fill out a simple form to download for free
Formal Definition of DFA
• Finite set of states, Q
• Alphabet of input symbols, Σ
• One state is the start/initial state, q0
• Zero or more final/accepting states, the set is F
• A transition function, δ. This function:
– Takes a state and input symbol as arguments
– Returns a state
– One "rule" of δ would be written δ(q, a) = p, where q and p are
states, and a is an input symbol
– Intuitively: if the FA is in state q, and input a is received, then
the FA goes to state p (note: q = p OK)
An FA is represented as a five-tuple: A = (Q, Σ, δ, q0, F).
Example: Clamping Logic
• We may think of an accepting state as representing a
"1" output and non-accepting states as representing
"0" output
• A "clamping" circuit waits for a 1 input, and forever
after makes a 1 output. However, to avoid clamping
on spurious noise, we'll design an FA that waits for
two 1's in a row, and "clamps" only then
• In general, we may think of a state as representing
a summary of the history of what has been seen
in the input so far
• The states we need are:
– State q0, the start state, says that the most recent input
(if there was one) was not a 1, and we have never seen
two 1's in a row.
– State q1 says we have never seen 11, but the previous
input was 1.
– State q2 is the only accepting state, it says that we have
at some time seen 11.
– Thus, A = ({q0, q1, q2}, {0, 1}, δ, q0, {q2}), where δ is
given by:
        0     1
>q0     q0    q1
q1      q0    q2
*q2     q2    q2
By marking the start state with > and accepting states with *, the transition table
that defines δ also specifies the entire FA
Transition Graph
[Figure: transition graph of the clamping FA, with arrows labeled 0, 1, and 0,1]
Extension of δ to Paths
Intuitively, an FA accepts a string
w = a1a2…an if there is a path in
the transition diagram that:
1. Begins at the start state,
2. Ends at an accepting state, and
3. Has sequence of labels a1, a2, …, an.
Formally, we extend transition function δ to δ̂(q, w),
where w can be any string of input symbols:
– Basis: δ̂(q, ε) = q (i.e., on no input, the FA doesn't
go anywhere).
– Induction: δ̂(q, wa) = δ(δ̂(q, w), a), where w is a
string, and a a single symbol (i.e., see where the FA
goes on w, then look for the transition on the last
symbol from that state).
Important fact with a straightforward, inductive proof:
δ̂ really represents paths. That is, if w = a1a2…an, and
δ(pi, ai+1) = pi+1
for all i = 0, 1, . . . , n-1, then δ̂(p0, w) = pn
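The basis and induction cases translate almost word for word into a recursive function. The following is a sketch under assumptions: `delta_hat` and `accepts` are illustrative names, the empty Python string stands for the empty string, and the table is the clamping FA's transition table as read from the example.

```python
# Transition table of the clamping FA, keyed by (state, symbol).
CLAMP_DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}
CLAMP_FINAL = {"q2"}


def delta_hat(delta, q, w):
    """Extended transition function: the state reached from q on string w."""
    if w == "":
        return q  # basis: on no input the FA does not move
    # induction: see where the FA goes on all but the last symbol,
    # then take the single transition on the last symbol
    return delta[(delta_hat(delta, q, w[:-1]), w[-1])]


def accepts(delta, q0, final, w):
    """A string is accepted when delta_hat(q0, w) lands in a final state."""
    return delta_hat(delta, q0, w) in final


if __name__ == "__main__":
    print(delta_hat(CLAMP_DELTA, "q0", "0101"))
    print(accepts(CLAMP_DELTA, "q0", CLAMP_FINAL, "0110"))
```

A loop over the symbols computes the same state without recursion; the recursive form is used here only because it mirrors the definition.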
Formal Definition of Computation
We’ll base our definition on Finite
Automata, of course …
• What our machines do with strings
– i.e. accept or reject a particular string
• Whether our machine recognizes a
language or not
– In terms of strings
– And what term we give to such languages
• Acceptance of Strings
An FA A = (Q, Σ, δ, q0, F) accepts string w
if δ̂(q0, w) ∈ F
• Language of an FA
– FA A accepts the language
L(A) = {w | δ̂(q0, w) ∈ F}
Definition: Regular Language
A language is called a regular
language if some finite automaton
accepts it
OK, what’s an irregular language?
• Answer depends on memory:
• A Finite State Machine has limited memory!
– It cannot store the string that’s been processed to date
– It cannot count characters in a string
• Example: ww (duplicate strings)
– 01101 01101
• Example: a^n b^n (duplicate characters)
– 00000 11111 (here n =5)
• Easy to program: Not a regular language
Empty Concepts
• The empty string (null string)
– 𝜀 can be accepted by a FA
– Initial state == final state. How would that look graphically?
• The empty language
– ∅ = {}
– How would this look graphically?
• Language containing empty string is not
an empty language!
– {𝜀} ≠ ∅ and 𝜀 ≠ ∅
• If an FA accepts no strings, it recognizes
the empty language (not an interesting case.)
Closure properties of regular
languages
• ∪ union:
– 𝐴∪𝐵={𝑤 | 𝑤∈𝐴 or 𝑤∈𝐵}
• ∘ concatenation:
– 𝐴 ∘ 𝐵 = 𝐴𝐵 = {𝑤 | 𝑤 = 𝑥𝑦, 𝑥 ∈ 𝐴, 𝑦 ∈ 𝐵}
• * Kleene star (unary operation)
– 𝐴* ={𝑤 | 𝑤=x1x2···x𝑘, 𝑘≥0, 𝑥𝑖 ∈𝐴}
Traditionally called regular operations
– minimal, because starting from a simple set of regular languages
and applying these three operations we can get to all regular
languages
Example
• If 𝐴 = {a, b} and 𝐵 = {b, c} we get
– 𝐴 ∘ 𝐵 = {ab, ac, bb, bc}
• Note for *, we stick together symbols in
any way we want to get longer string
– For A = {a,b,c}, A* = {𝜀, a, b, c, aa, bb, ab, ba, ac,
bc, aba, abb…}
– We get an infinite language unless 𝐴 ⊆ {𝜀}
– Note 𝜀 ∈ 𝐴*
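For finite languages, the regular operations can be tried out with Python sets. This is a sketch with assumed names: `concat` and `star_up_to` are illustrative, the empty Python string plays the role of 𝜀, and because A* is usually infinite the star is cut off at a maximum string length.

```python
def concat(A, B):
    """Concatenation: every string of A followed by every string of B."""
    return {x + y for x in A for y in B}


def star_up_to(A, max_len):
    """All strings of A* whose length is at most max_len."""
    result = {""}    # k = 0 contributes the empty string
    frontier = {""}  # strings added in the previous round
    while frontier:
        frontier = {w + x for w in frontier for x in A
                    if len(w + x) <= max_len} - result
        result |= frontier
    return result


if __name__ == "__main__":
    print(sorted(concat({"a", "b"}, {"b", "c"})))
    print(sorted(star_up_to({"a", "b", "c"}, 2)))
    print(star_up_to(set(), 3), star_up_to({""}, 3))
```

Union needs no helper because it is the built-in set operator `A | B`. The last line illustrates the remark above: the star of the empty language, and of {𝜀}, contains only the empty string.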
Theorem
• The collection of regular languages
is closed under regular operations
– i.e., if we take 2 regular languages (or 1 regular
language, for *) and apply a regular operation, we
get another regular language
Closure
• We say the integers are “closed” under multiplication and
addition, but not “closed” under division, because if you divide
one by another you might not get an integer
• Closed means “you can’t get out” by using the operation
Proof of closure under ∪
• Show that if 𝐴 and 𝐵 are regular, then so is
𝐴∪𝐵
– Have to show how to construct the automaton for the
union language given the automata that recognize 𝐴 and
𝐵, i.e. given
– 𝑀1 = (𝑄1, Σ, 𝛿1, 𝑞1, 𝐹1) recognizing 𝐴
– 𝑀2 = (𝑄2, Σ, 𝛿2, 𝑞2, 𝐹2) recognizing 𝐵
• Construct 𝑀 = (𝑄, Σ, 𝛿, 𝑞0, 𝐹 ) recognizing
𝐴∪𝐵
– For simplicity, let Σ1 = Σ2 = Σ
• You might think: run the string through 𝑀1, see whether 𝑀1
accepts it, then run the string through 𝑀2 and see whether
𝑀2 accepts it
• But you can’t try something on the whole input string, and try
another thing on the whole input string
You get only 1 pass!
Imagine yourself in the role of 𝑀
• Solution : run both 𝑀1 and 𝑀2 at the
same time
– Imagine putting two fingers on the diagrams of the
automata for 𝑀1 and 𝑀2, and moving them around
according to the input
– At the end, if either finger is on an accept state,
then we accept
• Implement this strategy in 𝑀
Formalization
• Keep track of a state in 𝑀1 and a state
in 𝑀2 as a single state in 𝑀
– Each state in 𝑀 corresponds to a pair of states,
one in 𝑀1 and one in 𝑀2
• Let
𝑄 = 𝑄1 × 𝑄2 = {(𝑞, 𝑟) : 𝑞 ∈ 𝑄1, 𝑟 ∈ 𝑄2}
• How to define 𝛿?
Define 𝛿
• When a new symbol comes in, go to
wherever 𝑞 goes and wherever 𝑟 goes,
individually
𝛿((𝑞, 𝑟), 𝑎) = (𝛿1(𝑞, 𝑎), 𝛿2(𝑟, 𝑎))
• Start state is 𝑞0 = (𝑞1, 𝑞2)
• Accept set is 𝐹 = (𝐹1 × 𝑄2)∪(𝑄1 × 𝐹2)
– Note 𝐹1 × 𝐹2 gives intersection
It is clear by induction that after reading 𝑘 symbols, 𝑀 is in the state (𝑞, 𝑟),
where 𝑞 is the state 𝑀1 reaches and 𝑟 is the state 𝑀2 reaches on the same 𝑘 symbols
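The product construction can be sketched in Python as follows. Everything named here is an assumption made for illustration: a DFA is a plain tuple (states, alphabet, delta, start, finals), and the two sample machines `EVEN_ONES` and `ENDS_IN_0` are not from the slides.

```python
from itertools import product


def union_dfa(m1, m2):
    """Build the product DFA that accepts L(m1) ∪ L(m2)."""
    q1s, sigma, d1, s1, f1 = m1
    q2s, _, d2, s2, f2 = m2  # both machines are assumed to share sigma
    states = set(product(q1s, q2s))  # Q = Q1 × Q2
    delta = {((q, r), a): (d1[(q, a)], d2[(r, a)])
             for (q, r) in states for a in sigma}
    finals = {(q, r) for (q, r) in states if q in f1 or r in f2}
    return states, sigma, delta, (s1, s2), finals


def accepts(m, w):
    """Run a DFA tuple on the string w."""
    _, _, delta, state, finals = m
    for a in w:
        state = delta[(state, a)]
    return state in finals


# Hypothetical sample machines over {0, 1}.
EVEN_ONES = ({"e", "o"}, {"0", "1"},
             {("e", "0"): "e", ("e", "1"): "o",
              ("o", "0"): "o", ("o", "1"): "e"}, "e", {"e"})
ENDS_IN_0 = ({"n", "z"}, {"0", "1"},
             {("n", "0"): "z", ("n", "1"): "n",
              ("z", "0"): "z", ("z", "1"): "n"}, "n", {"z"})

if __name__ == "__main__":
    U = union_dfa(EVEN_ONES, ENDS_IN_0)
    print(len(U[0]), accepts(U, "10"), accepts(U, "1"))
```

Replacing `or` with `and` in the definition of `finals` gives the accept set 𝐹1 × 𝐹2, and with it a machine for the intersection.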
Problem
• Prove that the collection of regular
languages is closed under
concatenation and Kleene star

• As above, the solution involves “keeping track of
multiple possibilities”
• Stay tuned: We will develop a type of finite
automaton that can keep track of multiple possibilities
that simplifies writing these proofs
Determinism
• All of the FAs we have seen so far are
deterministic finite automata (DFAs)
– Only one choice of move from one state to
another for a given input symbol
– A move in each state for every input symbol
[Figure: the DFA with states "0 mod 3" (start), "1 mod 3", and "2 mod 3"; every state has exactly one arrow for 0 and one for 1]
Non-deterministic Finite
Automata
• Allow (deterministic) FA to have a choice of 0
or more next states for each state-input pair
[Figure: NFA with states q0, q1, q2, q3. q0 loops on 0,1 and goes to q1 on 1; q1 goes to q2 on 1; q2 goes to q3 on 0]
• Note : two “1” arrows from 𝑞0
• Possibly several ways to proceed
• Present state does not determine the next
state
– There are several possible futures!
How does the NFA work?
• Multiple alternative computations on the
input
• When more than one possible way to
proceed, take all of them
– Imagine a parallel computer following each of the
paths independently
– When the machine comes to point of nondeterminism, imagine it forking into multiple copies
of itself, each going like a separate thread in a
computer program
What happens when parallel
branches differ in their output?
• One choice might end up at 𝑞3, and another
may end up not at 𝑞3
• Only one path needs to lead to an accept
state, for the entire machine to accept
• If any computational branch leads to an
accepting state, we say the machine accepts
the input
– Acceptance overrules rejection
– Reject only if every possible way to proceed leads to
rejection
Example
[Figure: the NFA with states q0, q1, q2, q3 shown earlier]
Input: 010110
• Begin at the start state 𝑞0
• Read 0; follow the loop back to 𝑞0
• Read 1; there are 2 arrows labeled 1 starting at 𝑞0, so split into 2
paths to represent the 2 different places machine could be: 𝑞0
and 𝑞1
• Read 0; Now each path proceeds independently, because they
represent different threads of computation
– The path at 𝑞0 goes back to 𝑞0
– There is no place for the path at 𝑞1 to go (no arrow with 0 from 𝑞1), so
remove that path
• Only path at 𝑞0 left
Input: 010110 (continued)
• Read 1; Branch into 𝑞0, 𝑞1
• Read 1; Follow 1 arrows from 𝑞0 and 𝑞1 to get to 𝑞0, 𝑞1, 𝑞2
• Read 0; Follow 0 arrows from 𝑞0, 𝑞1, 𝑞2 to get to 𝑞0, 𝑞3
• Each path represents a different thread of the computation
• The machine accepts because at least one path ended up
at an accepting state (𝑞3)
– The NFA accepts this string, i.e. 010110 ∈ 𝐿(𝐵)
– By contrast, 01011 is not in 𝐿(𝐵) because paths end at 𝑞0, 𝑞1, 𝑞2
• All possibilities are reject states
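The "take all paths at once" idea can be simulated by keeping the set of states the NFA could currently be in. The sketch below uses illustrative names (`NFA_DELTA`, `step`, `run_nfa`, `nfa_accepts`), and the transitions are read off the walkthrough of input 010110.

```python
# NFA transitions keyed by (state, symbol); a missing key means "no arrow",
# so the path that is in that state simply dies.
NFA_DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q3"},
}
NFA_FINAL = {"q3"}


def step(states, symbol):
    """Follow every arrow labeled symbol out of every current state."""
    nxt = set()
    for q in states:
        nxt |= NFA_DELTA.get((q, symbol), set())
    return nxt


def run_nfa(word, start="q0"):
    """Return the set of possible states after each prefix of word."""
    current = {start}
    history = [current]
    for symbol in word:
        current = step(current, symbol)
        history.append(current)
    return history


def nfa_accepts(word):
    """Accept if at least one surviving path ends in an accepting state."""
    return bool(run_nfa(word)[-1] & NFA_FINAL)


if __name__ == "__main__":
    for states in run_nfa("010110"):
        print(sorted(states))
    print(nfa_accepts("010110"), nfa_accepts("01011"))
```

Each entry of the history is one "row" of parallel threads; a thread that has no arrow to follow simply drops out of the set.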
Problem
• Design an NFA to accept strings over alphabet
{1, 2, 3} such that the last symbol appears
previously, without any intervening higher
symbol, e.g.,
• … 11
• … 21112
• … 312123
– Trick: use start state to mean "I guess I haven't
seen the symbol that matches the ending symbol
yet"
– Three other states represent a guess that the
matching symbol has been seen, and remembers
what that symbol is
[Figure: NFA with states p, q, r, s, t and arrows labeled with symbols from {1, 2, 3}]
Formal NFA
• N = (Q, Σ, δ, q0, F) where all is as DFA, but:
– δ(q, a) is a set of states, rather than a single state
• Extension to δ̂
– Basis: δ̂(q, ε) = {q}
– Induction: Let:
• δ̂(q, w) = {p1, p2, …, pk}
(the set of states you can reach by starting in state q and processing w)
• δ(pi, a) = Si for i = 1, 2, …, k
(the set of states you can reach by starting in state pi and processing a)
• Then δ̂(q, wa) = S1 ∪ S2 ∪ … ∪ Sk
(the set of states you can reach by starting in state q and processing wa)
• Language of an NFA
– An NFA accepts w if any path from the start state to an
accepting state is labeled w. Formally:
L(N) = {w | δ̂(q0, w) ∩ F ≠ ∅}
Example
Here is a DFA that accepts a language L
consisting of all strings over {a, b} that
begin with either aa or bb
[Figure: DFA with states 0 through 5 and arrows labeled a, b, and a,b]
Suppose we want to make an automaton to
recognize REV(L), the language of all strings
that end in aa or bb
Easy solution: reverse all transitions
and interchange start and final states:
[Figure: the same diagram with every arrow reversed and the start and final states interchanged]
But this is not a DFA! This is an NFA
• More than one start state
• More than one transition labeled with the same symbol
NFAs and DFAs
• Because there is a degree of
choice available in an NFA, is it
more powerful than a DFA?
– That is, can NFAs recognize languages a
DFA cannot?
Equivalence of NFAs and
DFAs
• NFAs and DFAs recognize the same
class of languages
• A bit surprising: NFAs seem more
powerful
Two machines are equivalent if they recognize
the same language
Theorem
Every non-deterministic finite
automaton has an equivalent
deterministic finite automaton
Proof Idea
• If a language is recognized by an NFA,
show the existence of a DFA that also
recognizes it
• Convert NFA to an equivalent DFA that
simulates the NFA
– Proof by construction
• Intuitively, can simulate the NFA by
keeping track of all the states you can
get to on a given input
Proof
1. Let N = (Q, Σ, δ, q0, F) be an NFA recognizing some language A
2. Construct a DFA recognizing A
– M = (Q’, Σ, δ’, q0’, F’)
3. Q’ = the set of subsets of Q
4. For R ∈ Q’ and a ∈ Σ let δ’(R, a) = {q ∈ Q | q ∈ δ(r, a) for some r ∈ R}
– If R is a state of M, it is also a set of states of N (because of 3 above). When M
reads a symbol a in a state R, it shows where a goes from each state in R.
Because each state may go to a set of states, we take the union of all these
sets. This can be written as: δ’(R, a) = ⋃ δ(r, a), with the union taken over all r ∈ R
5. q0’ = {q0}
– M starts in the state corresponding to the collection containing just the start state of N
6. F’ = {R ∈ Q’ | R contains an accept state of N}
– The machine M accepts if one of the possible states that N could be in at this point is
an accept state
Convert the following NFA to a DFA
[Figure: NFA with states q0, q1, q2. q0 loops on a,b and goes to q1 on b; q1 goes to q2 on b]
Construct Q’, the set of subsets of Q :
• Q = {q0, q1, q2}
• Q’ = {∅, {q0}, {q1}, {q2}, {q0, q1}, {q0, q2}, {q1, q2}, {q0, q1, q2}}
Q’ = {∅, {q0}, {q1}, {q2}, {q0, q1}, {q0, q2}, {q1, q2}, {q0, q1, q2}}
For R ∈ Q’ and a ∈ Σ let δ’(R, a) = {q ∈ Q | q ∈ δ(r, a) for some r ∈ R}
State           a       b
{q0}            {q0}    {q0, q1}
{q1}            ∅       {q2}
{q2}            ∅       ∅
{q0, q1}        {q0}    {q0, q1, q2}
{q0, q2}        {q0}    {q0, q1}
{q1, q2}        ∅       {q2}
{q0, q1, q2}    {q0}    {q0, q1, q2}
∅               ∅       ∅
q0’ = {q0}
F’ = {{q2}, {q0, q2}, {q1, q2}, {q0, q1, q2}}
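Useless (unreachable) states
[Figure: the full DFA over all eight subsets; only {q0}, {q0, q1}, and {q0, q1, q2} can be reached from the start state {q0}]
NFA
[Figure: the NFA with states q0, q1, q2]
DFA
[Figure: the DFA restricted to the reachable states {q0}, {q0, q1}, and {q0, q1, q2}]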
Lazy Strategy
• You don’t have to construct all the
possible state sets at the outset
• Lazy strategy: construct state sets as
they appear in the computation
– i.e. start with the start state set, construct the set
for transitions on each input symbol, then
construct the set for transitions from those sets,
etc.
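The lazy strategy is a worklist algorithm, and one possible Python sketch is shown here. The function name and the tuple it returns are assumptions made for illustration; the sample NFA is the three-state machine (q0, q1, q2) from the conversion example, with q2 taken as its accepting state.

```python
from collections import deque


def lazy_subset_construction(nfa_delta, alphabet, start, nfa_finals):
    """Build only the DFA states (sets of NFA states) that are reachable."""
    start_set = frozenset({start})
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for a in sorted(alphabet):
            # union of the NFA moves on a from every state in the current set
            target = frozenset(p for q in current
                               for p in nfa_delta.get((q, a), ()))
            dfa_delta[(current, a)] = target
            if target not in seen:  # first time this set appears
                seen.add(target)
                queue.append(target)
    dfa_finals = {s for s in seen if s & set(nfa_finals)}
    return seen, dfa_delta, start_set, dfa_finals


# The three-state NFA from the conversion example; missing keys mean "no move".
NFA = {
    ("q0", "a"): {"q0"},
    ("q0", "b"): {"q0", "q1"},
    ("q1", "b"): {"q2"},
}

if __name__ == "__main__":
    states, delta, start, finals = lazy_subset_construction(
        NFA, {"a", "b"}, "q0", {"q2"})
    for s in sorted(states, key=len):
        print(sorted(s), sorted(delta[(s, "a")]), sorted(delta[(s, "b")]))
```

On this NFA the loop creates 3 state sets instead of all 8 subsets. If some move led nowhere, the empty set would show up as an ordinary (dead) DFA state.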
Example
[Figure: NFA with states 1 through 10 and arrows labeled a and b]
Start state q0 = {1}
δ(1,a) = {2,3}
δ(1,b) = {4}
δ({2,3},a) = δ({2},a) ∪ δ({3},a)
= ∅ ∪ {7,9}
= {7,9}
δ({2,3},b) = δ({2},b) ∪ δ({3},b)
= {5,6} ∪ {8}
= {5,6,8}
δ({4},a) = ∅
δ({4},b) = {10}
δ({7,9},a) = ∅
δ({7,9},b) = ∅
δ({5,6,8},a) = ∅
δ({5,6,8},b) = ∅
δ({10},a) = ∅
δ({10},b) = ∅
Final states F’ = {{5,6,8},{10}}
The DFA
δ(1,a) = {2,3}
δ(1,b) = {4}
δ({2,3},a) = {7,9}
δ({2,3},b) = {5,6,8}
δ({4},a) = ∅
δ({4},b) = {10}
δ({7,9},a) = ∅
δ({7,9},b) = ∅
δ({5,6,8},a) = ∅
δ({5,6,8},b) = ∅
δ({10},a) = ∅
δ({10},b) = ∅
δ(∅,a) = ∅
δ(∅,b) = ∅
[Figure: DFA with states {1}, {2,3}, {4}, {7,9}, {5,6,8}, {10}, and ∅, drawn from the transitions listed above]
Start state q0 = {1}
Final states F’ = {{5,6,8},{10}}
Problem
Try converting this NFA to a DFA:
[Figure: NFA with states q0, q1, q2, q3 and arrows labeled a, b, and a,b]
Recap
• First there was the DFA
– For every state and every alphabet symbol
there is exactly one move that the machine
can make
– δ: Q × Σ → Q
– δ is a total function: completely defined
• I.e. it is defined for all q ∈ Q and a ∈ Σ
Then, the NFA
• Non-determinism
– When machine is in a given state and reads a
symbol, the machine will have a choice of where
to move to next
– There may be states where, after reading a given
symbol, the machine has nowhere to go
– Applying the transition function will give, not 1
state, but 0 or more states
– Transition function
– δ is a function from Q × Σ to 2^Q
– δ (q, a) = subset of Q (possibly empty)
• And now...
• Introducing...
• The newest member of the FA family...
• The Nondeterministic finite
automaton with ε transitions (NFA-ε)
NFA With ε-Transitions
• For both DFAs and NFAs, you must
read a symbol in order for the machine
to make a move
• In Nondeterministic Finite Automata
with ε transitions
– Can make a move without reading a symbol off the
read tape
– Such a move is called an ε-transition
NFA With ε-Transitions
• Allow ε to be a label on arcs
– Nothing else changes: acceptance of w is still
the existence of a path from the start state to an
accepting state with label w
– But ε can appear on arcs, and means the empty
string (i.e., no visible contribution to w)
– When an arc labeled ε is traversed, no input is
consumed
Example
A path whose arcs are labeled 0, ε, 0, 1, ε spells the string 001
[Figure: NFA-ε with states q, r, s and arrows labeled 0, 1, and ε]
Formal Definition of NFA-ε
• A Non-Deterministic Finite Automaton
with ε-transitions is a 5-tuple (Q, Σ, q0, 𝛿, F) where
– Q is a finite set (of states)
– Σ is a finite alphabet of symbols
– q0 ∈ Q is the start state
– F ⊆ Q is the set of accepting states
– 𝛿 is a function from Q × (Σ ∪ {ε}) to 2^Q (transition
function)
DFAs and NFA-ε’s
• ε-transitions are a convenience, but do
not increase the power of FA’s
• For any NFA-ε there is an equivalent
(i.e., accepts the same language)
DFA
• The construction is similar to the NFA-to-DFA construction
Creating a DFA from an NFA-ε
(or, eliminating ε-transitions)
1. Compute the ε-closure for each state
• Gives the set of states reachable from that
state on ε-transitions only
2. Start state is ε-CLOSE(q0)
3. Compute δ for each a ∈ Σ and each set S (each
of the ε-CLOSE’d sets) as follows:
– If a state p ∈ S can reach state q on input a (not ε!),
then add a transition on input a from S to ε-CLOSE(q)
4. The set of final states includes those sets that
contain at least one accepting state of the NFA-ε
ε-closure example
[Figure: NFA-ε with states p, q0, r, s, t, u, v, w and arrows labeled a, b, and ε]
• Find the set T = ECLOSE({s})
– T = {s} (initial step)
– T = {s, w} (add 𝛿(s, ε))
– T = {s, w, q0} (add 𝛿(w, ε))
– T = {s, w, q0, p, t} (add 𝛿(q0, ε))
– 𝛿(p, ε) = 𝛿(t, ε) = ∅
– We are done: ECLOSE({s}) = T = {s, w, q0, p, t}
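Computing an ε-closure is a plain graph search over the ε arcs. The sketch below assumes only the ε arcs implied by the steps of this example (s to w, w to q0, q0 to p and t); the names `EPS_EDGES` and `eclose` are illustrative, and the arcs labeled a and b are left out because they play no role in the closure.

```python
# ε arcs read off the closure steps of the example; all other states
# have no outgoing ε arc.
EPS_EDGES = {
    "s": {"w"},
    "w": {"q0"},
    "q0": {"p", "t"},
}


def eclose(states, eps_edges):
    """Return all states reachable from the given states using ε arcs only."""
    closure = set(states)  # every state is in its own closure
    stack = list(states)
    while stack:
        q = stack.pop()
        for p in eps_edges.get(q, ()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure


if __name__ == "__main__":
    print(sorted(eclose({"s"}, EPS_EDGES)))
    print(sorted(eclose({"p"}, EPS_EDGES)))
```

The `closure` set doubles as the visited set, so the search terminates even when ε arcs form a cycle.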
Example
[Figure: NFA-ε with states q, r, s and arrows labeled 0, 1, and ε]
1. Compute ε-closure for all states (ε-closure signified by E):
E(q) = {q}
E(r) = {r,s}
E(s) = {r,s}
2. Compute δ:
δ({q},0) = E({s}) = {r,s}
δ({q},1) = E({r}) = {r,s}
δ({r,s},0) = E({q}) = {q}
δ({r,s},1) = E({q}) = {q}
3. Final states FD = {{r,s}}
RESULTING DFA:
[Figure: DFA with states {q} and {r,s}; each state moves to the other on 0,1]
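The four construction steps can be combined into one short Python sketch. Several things here are assumptions, because the diagram is not recoverable from the text: the exact arcs of the NFA-ε (chosen only so that they reproduce the closures and transitions computed in this example), the choice of s as its accepting state, and the names `eclose` and `eps_nfa_to_dfa`.

```python
def eclose(states, eps):
    """All states reachable from the given states using ε arcs only."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p in eps.get(q, ()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure


def eps_nfa_to_dfa(delta, eps, alphabet, start, finals):
    """Subset construction in which every target set is ε-closed."""
    start_set = frozenset(eclose({start}, eps))
    dfa_delta, seen, todo = {}, {start_set}, [start_set]
    while todo:
        S = todo.pop()
        for a in sorted(alphabet):
            moved = set()
            for p in S:  # moves on a real input symbol, never on ε
                moved |= delta.get((p, a), set())
            T = frozenset(eclose(moved, eps))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_finals = {S for S in seen if S & finals}
    return seen, dfa_delta, start_set, dfa_finals


# Assumed arcs, consistent with E(q), E(r), E(s) and the δ values above.
DELTA = {("q", "0"): {"s"}, ("q", "1"): {"r"},
         ("r", "0"): {"q"}, ("s", "1"): {"q"}}
EPS = {"r": {"s"}, "s": {"r"}}

if __name__ == "__main__":
    states, d, start, finals = eps_nfa_to_dfa(DELTA, EPS, {"0", "1"}, "q", {"s"})
    for S in states:
        print(sorted(S), sorted(d[(S, "0")]), sorted(d[(S, "1")]))
```

With these assumed arcs the result has the two states {q} and {r,s}, each moving to the other on both 0 and 1, which matches the resulting DFA of the example.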
Problem
Convert this NFA-ε to a DFA
[Figure: NFA-ε with states q0 through q7 and arrows labeled a and ε]
Problem
Convert this NFA-ε to a DFA
JFlap
Play with http://jflap.org/tutorial/fa/nfa2dfa/index.html
NFAs, NFAs with ε-transitions, and DFAs
describe the same class of languages. Thus to
show a language is a regular language, you can
just build an NFA that recognizes it, rather than a
DFA.
Many times it is more convenient to build an NFA
rather than a DFA, especially if you want to
keep track of multiple possibilities.