Chapter 3 Finite State Automata 3.1 Deterministic Finite Automata

advertisement
Chapter 3
Finite State Automata
3.1 Deterministic Finite Automata (DFA)
Definition 1.1 A deterministic finite automaton (DFA) is a quintuple
M = (Q, ,  , q0 , F ) where Q is a finite set of states,  a finite set of symbols called the
alphabet, q0  Q a distinguished state called the start state, F a subset of Q consisting
of the final or accepting states, and  a total function from Q   to Q called the
transition function.
Example 1.1
Figure 2: Example DFA
Some strings accepted by the machine are:
baab
baaab
babaabaaba
aaaa
All of the above strings are characterized by the presence of at least one aa substring.
According to the definition of a DFA, the following are identified:
Q = {q0 , q1 , q2 }
 = {a, b}
 : Q    Q : (qi , a) |- q j
where i can be equal to j and the mapping is given by the transition table below.
Transition Table:

a
q1
q2
q2
q0
q1
q2
b
q0
q0
q2
A sample computation, on the string abaab , is represented as
[q0,abaab]
|- [q1, baab]
|- [q0 , aab] [q0 , abaab]
|- [q1, ab]
|- [q2 , b]
|- [q2 ,  ]
Definition 1.2 Let M = (Q, ,  , q0 , F ) be a DFA. The language of M , denoted L (M ) ,
is the set of strings in  * accepted by M .
A DFA can be considered as a language acceptor; the language recognized by the
machine is the set of strings that are accepted by its computations. Two machines that
accept the same language are said to be equivalent.
Definition 1.3 The extended transition function ˆ of a DFA with transition function 
is a function from Q to Q  * defined by recursion on the length of the input string.
• Basis: length(w) = 0 . Then w =  and ˆ(qi ,  ) = qi .
length(w) = 1 . Then w = a , for some a   and ˆ(qi , a) =  (qi , a) .
• Recursive step: Let w be a string of length n > 1 . Then w = ua and
ˆ(q , ua) =  (ˆ(q , u), a) .
i
i
The computation of a machine in state qi with string w halts in state ˆ(qi , w) . A string
w is accepted if ˆ(q0 , w)  F . Using this notation, the language of a DFA M is the set
L(M ) = {w | ˆ(q0 , w)  F} .
3.2 Nondeterministic Finite Automata (NFA)
Definition 2.1 A nondeterministic finite automaton is a quintuple M = {Q, ,  , q0 , F}
where Q is a finite set of states,  a finite set of symbols called the alphabet, q0  Q a
distinguished state known as the start state, F a subset of Q consisting of the final or
accepting states, and  a total function from Q   to P(Q) known as the transition
function
.
Note that a deterministic finite automaton is considered a special case of a
nondeterministic one. The transition function of a DFA specifies exactly one state that
may be entered from a given state and on a given input symbol, while an NFA allows
zero, one or more states to be entered. Hence, a string input to an NFA may generate
several distinct computations.
For the language over  = {a, b} where each string has at least one occurrence of a
double a, an NFA can be given with the following transition table:

q0
a
{q0 , q1}
b
{q0 }
q1
{q2 }

q2
{q2 }
{q2 }
Two computations on the string aabaa are given by:
[q1 , abaa]
|- [q0 , aabaa]
|- [q2 , baa]
|- [q2 , ba]
|- [q2 , a]
|- [q2 ,  ]
[q0 , aabaa ]
|- [q0 , abaa]
|- [q0 , baa]
|- [q0 , ba]
|- [q1, a]
|- [q2 ,  ]
and
We will further show that a language accepted by an NFA is also accepted by a DFA.
Definition 2.2 The language of an NFA M , denoted L (M ) , is the set of strings
accepted by M . That is, L( M ) = {w | there is a computation [q0 , w][ qi ,  ] with qi  F }.
3.3 NFA with Epsilon Transitions (NFA-  or  -NFA))
So far, in the discussion of Finite State automatons, the reading head was required to
move at each step of the transitions. Intuitively, an  -transition allows the reading head
of the automaton to remain at a cell during a transition. Such a transition is called an  transition .
Definition 3.1 A nondeterministic finite automaton with  -transitions is a quintuple
M = (Q, ,  , q0 , F ) where Q,  , q0 , and F are as in an NFA. The transition function is
a function from Q  (  { }) to 2Q .
Epsilon transitions can be used to combine existing machines to construct more complex
composite machines. Let M 1 and M 2 be two finite automata which consists of a single
start state and a final state where there are no arcs entering the start state, and no arcs
leaving the accepting state.
Composite machines that accept L(M1 )  L(M 2 ) , L(M1 ) L(M 2 ) , and L(M 1 )* are
constructed from M 1 and M 2 as depicted in Figure 2.
Figure 3: L(M1 )  L(M 2 )
Figure 4: L(M1 ) L(M 2 )
Figure 5: L(M 1 )*
The NFA-  of Figure 5 accepts the language over  = {a, b} where each string has at
least one occurrence of aa or bb. The states of machines M 1 and M 2 are given distinct
names. A possible computation on the string bbaaa is given below.
[ p0 , bbaaa] |- [t0 , bbaaa ]
|- [t1, baaa]
|- [t2 , aaa]
|- [t2 , aa]
|- [t2 , a]
|- [t2 ,  ]
|- [ p2 ,  ]
p0
p1
p2
p`0
P` 2
t0
t1
t2
Figure 6: Sample Union Construction
3.4 Finite Automata and Regular Sets
Theorem 4.1 The set of languages accepted by finite state automata consists precisely of
the regular sets over 
First we will show that every regular set is accepted by some NFA-  . This follows from
the recursive definition of regular sets. The regular sets are built from the basis elements
 ,{ } and the singletons containing a symbol from the alphabet. Machines that accept
these sets are given in Figure 7. The regular sets are constructed from the primitive
regular sets using union, concatenation, and Kleene star operations.
Figure 7: Machines that accept the primitive regular sets
3.4.1 Removing Nondeterminism
Definition 4.1 The  - closure of a state qi , denoted  - closure (qi ) , is defined
recursively by,
i. Basis: qi   - closure (qi ) .
ii. Recursive step: Let q j be an element of  - closure (qi ) . If qk   (q j ,  ) , then
qk   - closure (qi ) .
iii. Closure: q j is in  - closure (qi ) only if it can be obtained from qi by a finite number
of applications of operations in (ii).
Example 4.1 For the NFA-  of Figure 4.1, we derive the DFA of Figure 4.1.
Construction of DM, a DFA Equivalent to NFA-  M (see text)
Figure 3.7: An NFA-ɛ
0, 1
1

(Note: the diagram of the figure is missing a transition from FG to BCE on 1, and
transitions on 0 and 1 at

)
Figure 9: Equivalent DFA
3.4.2 Expression Graphs
Definition 4.2 An expression graph is a labeled directed graph in which the arcs are
labeled by regular expressions. An expression graph, like a state diagram, contains a
distinguished start node and a set of accepting nodes.
Example 4.2
The expression graph given in (fig 10) accepts the regular expressions u * and u *vw* .
Figure 10: Expression Graph
Algorithm: Construction of a Regular Expression from a Finite Automaton
input: state diagram G of a finite automaton and the nodes of G are numbered
1,2, , n
1. Make m copies of G , each of which has one accepting state. Call these graphs
G1 , G2 ,, Gm . Each accepting node G is the accepting node of Gt , for some
t = 1,2, , m .
2. for each Gt , do
• repeat
- choose a node i in Gt , that is neither the start nor the accepting node of Gt .
- delete the node i from Gt according to the procedure:
for every j , k not equal to i (this includes j = k ) do
- if w j ,i   , and wi ,i =

then add an arc from node j to node k labeled
w j ,i wi ,k
- if w j ,i   , wi , k   , and wi ,i 

then add an arc from node j to node k
labeled w j ,i (wi ,i )* wi ,k
- if nodes j and k have arcs labeled w1 , w2 ,, ws connecting them then replace
them by a single arc labeled w1  w2    ws
- remove the node i and all arcs incident to it in Gt until the only nodes in Gt
are the start node and the single accepting node.
end for
• determine the expression accepted by Gt . end for
3. The regular expression accepted by G is obtained by joining the expressions for each
Gt with  .
The deletion of the node i is accomplished by finding all paths j , i, k of length two that
have i as the intermediate node. An arc from j to k is added by passing the node i . If
there is no arc from i to itself, the new arc is labeled by the concatenation of the
expressions on each of the component arcs. If wi ,i 

, then the arc
wi ,i can be
traversed any number of times before following the arc from i to k . The label for the
new arc is w j ,i (wi ,i )* wi ,k . These graph transformations are illustrated in (fig 12).
Figure 12: Expression Graph Transformation
The reduced graph has atmost two nodes, the start node and an accepting node. If these
are the same node, the reduced graph has the form (fig 13(a)), accepting w* . A graph
with distinct start and accepting nodes reduces to (fig 13(b)) and accepts the expression
w1* w2 ( w3  w4 ( w1 )* w2 )* . This expression may be simplified if any of the arcs in the
graph are labeled

.
Figure 13: (a) w* , (b) w1* w2 ( w3  w4 ( w1 )* w2 )*
Example 4.3
1. Example 1: Fig 14(a) shows the original DFA which is reduced to an expression
graph shown in fig 14(b).
Figure 14: Example 4.3 - 1(a),(b)
2. Example 2: Explanation of elimination: Sequence of steps where one state is
eliminated at each step.
- Step 1: Given: fig 15(a)
- Step 2: Eliminating i at this step, fig 15(b)
Figure 15: Example 4.3 - 2(a),(b)
Figure 16: Example 4.3 - 2(c),(d)
- Step 3: After eliminating all but initial and final state in Gi , fig 16(c)
- Step 4: Final regular expression,fig 16(d)
3. Example 3: Fig. 17 shows the different steps where
L = r1*r2 (r3*  r3*r4 r1*r2 r3* )*
= r1*r2 (r3  r4 r1*r2 )*
or
L = r1* fig 16(d)
Figure 17: Example 4.3 - 3
Chapter 4
Regular Languages and Sets
4.1 Regular Grammars and Finite Automata
This chapter corresponds to Chapter 7 of the course textbook
Theorem 1.1 Let G = (V , , P, S ) be a regular grammar. Define the NFA M =
(Q, ,  , S , F ) as follows:
V  {Z } where Z  V , if P contains a rule A  a
• Q=
V
otherwise

B whenever A  aB  P
•  ( A, a) = 
 Z whenever A  a  P
{ A | A    P}  {Z } if Z  Q
• F =
 { A | A    P} otherwise
Then L( M ) = L(G ).
Example 1.1
The grammar G generates and the NFA M accepts the language a* (a  b  )
G : S  aS | bB | a
B  bB | 
Figure 18: NFA accepts a* (a  b  )
The derivation of a string such as aabb is explained below:
In G:
S a
 aaS
 aabB
 aabbB
 aabb
 aabb
In M:
[ S , aabb]
|- [ S , abb]
|- [S , bb]
|- [ B, b]
|- [ B,  ]
Similarly, a regular grammar that accepts the L (M ) is constructed from the automaton
M.
G : S  aS | bB | aZ
B  bB | 
Z 
The transitions provide the S rules and the first B rule. The varepsilon rules are added
since B and Z are accepting states.
Note:
Example 1.2
A regular grammar is constructed from the given DFA in fig 19.
Figure 19: Example 1.2
S  bB | aA
A  aS | bC
B  aC | bS | 
C  aB | bA
4.2 Closure Properties of Regular Sets
A language over an alphabet  is regular if it is
• a regular set (expression) over 
• accepted by DFA, NFA, or NFA- 
• generated by a regular grammar.
Theorem 2.1 Let L1 and L2 be two regular languages. The languages L1 , L2 , L1  L2 ,
*
and L1 are regular languages.
Theorem 2.2 Let L be a regular language over  . The language L is regular.
L = *  L
Theorem 2.3 Let L1 and L2 be regular languages over  . The language L1  L2 is
regular.
Proof: By DeMorgan's law
L1  L2 = L1  L2
The right-hand side of the equality is regular since it is built from L1 and L2 using union
and complementation.
Theorem 2.4 Let L1 be a regular language and L2 be a context-free language. The
language L1  L2 is not necessarily regular.
Proof: Let L1 = a*b* and L2 = {a i b i | i  0} . L2 is context-free since it is generated by
the grammar S  aSb |  . The intersection of L1 and L2 is L2 , which is not regular.
4.3 Pumping Lemma for Regular Languages
Pumping a string refers to constructing new strings by repeating (pumping) substrings in
the original string.
Theorem 3.1 Let L be a regular language that is accepted by a DFA M with n states.
Let w be any string in L with length( w)  n . Then w can be written as xyz with
length( xy)  n, length( y ) > 0 , and xyk z  L for all k  0 .
Example 3.1
Prove that the languge L = {a i b i | i  0} is not regular using the Pumping lemma for
regular languages.
Proof: By contradiction: Assume L is regular; then the pumping lemma holds. Let
w = a nb n .
By splitting a nb n into xyz , we get
x = a i , y = a j , and z = a n i  j b n
where
i  j  n and j > 0
Pumping y to y 2 gives,
a i a j a j a n i  j b n
= a n a j b n  L (contradiction with the pumping lemma).
Therefore, L is not regular.
Example 3.2
The language L = {a i | i is prime } is not regular.
Assume L is regular, and that a DFA with n states accepts L . Let m be a prime greater
than n . The pumping lemma implies that a m can be decomposed as xyz, y   , such
that xyk z is in L for all k  0 .
The length of s = xym1 z must be prime if s is in L. But,
length( xym 1 z ) = length( xyzym )
= length( xyz)  length( y m )
= m  m(length( y ))
= m(1  length( y ))
(4.1)
Since its length is not prime, xym1 z is not in L (contradiction with the pumping
lemma). Hence, L is not regular.
Corollary 3.1 Let DFA M have n states.
• L (M ) is not empty if, and only if, M accepts a string w with length( w) < n .
• L (M ) has an infinite number of strings if, and only if, M accepts a string w where
n  length( z ) < 2n .
Theorem 3.2 Let M be a DFA. There is a decision procedure to determine whether,
• L(M) is empty;
• L(M) is finite;
• L(M) is infinite.
Download