Finite Automata & Regular Languages

advertisement
Finite Automata &
Regular Languages
Sipser, Chapter 1
Deterministic Finite Automata

A DFA or deterministic finite automaton M is a
5-tuple, M = (Q, , , q0, F), where:





Q is a finite set of states of M
 is the finite input alphabet of M
: Q    Q is the state transition function
q0 is the start state of M
F  Q is the set of accepting states or final states
of M
DFA Example


State diagram
Q = { q0, q1 }
 = { 0, 1 }
F = { q1 }
0
M
q0
1
0
q1
1

q0
0
q0
1
q1
q1
q1
q0
State
Table
State table &
state transition function


State table

q0
0
q0
1
q1
q1
q1
q0
State transition function
(q0, 0) = q0, (q0, 1) = q1
(q1, 0) = q1, (q1, 1) = q0
State transitions


If q, q’  Q, s  , and (q, s) = q’,
then we say that q’ is an s-successor of
q, or there is a transition from q to q’
on input s, and we write
q s q’
Example: since (q0, 1) = q1, then there
is a transition from q0 to q1 on input 1,
and we write q0 1 q1.
State sequences

If a string of input symbols
w = s0s1s2 … sk-1 takes M from initial
state q0 to state qk, namely
q0 s0 q1 s1 q2 s2 q3  … s[k-1] qk
then we say that qk is a w-successor of
q0, and write q0 w qk. Also q0q1q2 … qk
is called an admissible state sequence
for w.
Strings accepted by a DFA

Let M = (Q, , , q0, F) be a DFA, and
w = s0s1s2 … sk-1  * be a string over
alphabet . Then M accepts w if there
exists an admissible state sequence
q0q1q2 … qk for w, starting at initial
state q0 and ending with state qk,
where qk  F. That is, M accepts input
string w if M ends up in one of the final
states.
Language recognized by a DFA


The language L(M) that is recognized
by a DFA, M = (Q, , , q0, F), is the set
of all strings accepted by M. That is,
L(M) = { w  * | M accepts w }
= { w  * | q0 w qk, qk  F }.
Example: For the previous DFA, L(M) is
the set of all strings of 0s and 1s with
odd parity, that is, odd number of 1s.
DFA Example 2

Recognizer for 11*01*
1
B
0
1
1
C
A
0
0
Trap
D
0,1
DFA Example 2

M = (Q, , , q0, F), L(M) = 11*01*
Q = { q0=A, B, C, D }
 = { 0, 1 }

0
1
F={C}
A
D
B
B
C
B
C
D
C
D
D
D
DFA Example 3

0
Modulo 3 counter
B
1
0,R
2,R
A
1
2
2
1,R
C
0
DFA Example 3

M = (Q, , , q0, F)
Q = { q0=A, B, C }
 = { 0, 1, 2, R }
F={A} 
0
1
2
R
A
A
B
C
A
B
B
C
A
A
C
C
A
B
A
Regular Languages


A language L  * is called regular if
there exists a DFA M such that L(M)=L.
Earlier, we defined a language L  *
as regular if there exists a T3 or regular
(left-linear or right-linear) grammar G
such that L(G)=L. We shall prove that
these two definitions are equivalent.
Operations on Regular Languages



Let A and B be regular languages:
Union:
A  B = { x | x  A or x  B }
Concatenation:
AB = { xy | x  A and y  B }.
Kleene Closure (A-star)
A* = {x1x2x3 ... xk | k  0 and xi  A }
Examples of regular operations

A = { good, bad }, B = { boy, girl }
A  B = { good, bad, boy, girl }
AB = { goodboy, goodgirl, badboy,
badgirl }
A* = { , good, bad, goodgood,
goodbad, badgood, badbad, … }
Closure under Union

If A and B are regular languages, then
their union, A  B, is a regular
language
Union Machine M(A B)
q1F

q0

M(A)
q2F
r0
p1F

p0
M(B)
p2F
Closure under Concatenation

If A and B are regular languages, then
their concatenation, AB, is a regular
language.
Concatenation Machine M(AB)

Closure under Kleene Star

If A is a regular language, then the
Kleene closure of A, A*, is also a
regular language
Kleene Closure Machine M(A*)

NFAs:
Nondeterministic Finite Automata




Presence of lambda transtitions.
May have more than one initial state.
On input a, state q may have no
transition out.
On input a, state q may have more than
one transition out.
NFAs

A nondeterministic finite automaton M
is a five-tuple M = ( Q, , R, I, F ),
where





Q is a finite set of states
 is the (finite) input alphabet
R is the transition relation, R  QQ
I  Q is the set of initial states
F  Q is the set of final states
Example NFAs


NFA that recognizes the language
0*1  1*0
NFA that recognizes the language
(0  1)*11 (0  1)*
Converting NFAs to DFAs

Given a NFA, M = (Q, , R, I, F), build a
DFA, M’ = (Q’, , , S0, F’) as follows.


The states S0, S1, S2, … of M’ are sets of
states of M.
The initial state of M’ is obtained by putting
together all the initial states of M and all
states reachable from those by 
transitions, and calling this set S0, the
initial state of M’
Converting NFAs to DFAs


For each state Sk already in Q’ in M’, and
for each input symbol a  , put together
into a set Sj all states of M reachable from
each state in Sk on input a. This set Sj may
or may not yet already be in Q’. Also it may
be the empty set . Add to  the transition
from Sk to Sj on input a.
Since there can only be a finite number of
subsets of states of M, this procedure will
stop after a finite number of steps.
Example conversions

Convert the NFA for the language
(0  1)*00  (0  1)*11 to a DFA
0,1
0
A
0
0,1
1
D
C
B
1
E
F
State transition table of NFA
0
1

A
A,B
A
-
B
C
-
-
C
-
-
-
D
D
D,E
-
E
-
F
-
F
-
-
-

State table of DFA
0
1
A,D
A,B,D
A,D,E
A,B,D
A,B,C,D
A,D,E
A,D,E
A,B,D
A,D,E,F
A,B,C,D
A,B,C,D
A,D,E
A,D,E,F
A,B,D
A,D,E,F

State diagram of DFA
ABD

0
ABCD
0
1
0
1
0
AD
0
1
ADE
1
ADEF
1
Regular Expressions (r.e.)







If a  , then the set a = {a} is a r.e.
The set  = {} is a r.e.
The set  = { } is a r.e.
If R and S are r.e., then (R  S) is a r.e.
If R and S are r.e., then (RS) is a r.e.
If R is a r.e., then ( R )* is a r.e.
Any r.e. is obtained by a finite application of
the above rules.
REs and Regular Languages

R.E.s are shorthand notation for regular
languages.
Regex: REs in Unix





[a-f], [^a-f]
R*, R+, R?
{R}
RS
R|S
Minimization of DFAs

Subset construction
(Myhill-Nerode Theorem)
NFAs, DFAs, & Lexical Analyzer
Generators




Sec 3.6: Finite Automata, Aho, Sethi,
Ullman, “Compilers: P.T.T”
Sec 3.7: From REs to NFAs (Thompson’s
Construction)
Sec 3.8: Design of a Lexical Analyzer
generator
Sec 3.9: Optimization of DFA-based
Lexical Analyzers
Download