Automata, Computability, & Complexity
by Elaine Rich
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Slides provided by author
Slides edited for use by MSU Department of
Computer Science – R. Halverson
Regular Expressions
LOTS of problems at end of chapter for
you to practice!
Regular Languages
[Diagram: a Regular Expression is mapped by L to a Regular Language; a Finite State Machine Accepts a Regular Language.]
A method to describe a regular language.
◦ Different from the description given by an FSM
Consists of a set of symbols + a syntax
Symbols
◦ Special symbols: ∅ ε ∪ ( ) * +
◦ Alphabet Σ: from which the strings in the language are made
Regular Expressions
The regular expressions over an alphabet Σ are all and
only the strings that can be obtained as follows:
1. ∅ is a regular expression.
2. ε is a regular expression.
3. Every element of Σ is a regular expression.
4. If α, β are regular expressions, then so is αβ.
5. If α, β are regular expressions, then so is α ∪ β.
6. If α is a regular expression, then so is α*.
7. If α is a regular expression, then so is α+.
8. If α is a regular expression, then so is (α).
Know this definition!
Regular Expression Examples
If Σ = {a, b}, the following are regular
expressions:
∅
ε
a
(a ∪ b)*
abba ∪ ε
b*(aa ∪ bb)+ b
Regular Expressions Define Languages
Define L, a semantic interpretation function for regular
expressions:
1. L(∅) = ∅.
2. L(ε) = {ε}.
3. L(c), where c ∈ Σ, = {c}.
4. L(αβ) = L(α) L(β).
5. L(α ∪ β) = L(α) ∪ L(β).
6. L(α*) = (L(α))*.
7. L(α+) = L(αα*) = L(α) (L(α))*.
If L(α) is equal to ∅, then L(α+) is also equal to ∅.
Otherwise L(α+) is the language formed by concatenating
together one or more strings drawn from L(α).
8. L((α)) = L(α).
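The eight rules of the semantic function can be tried out directly. The following is a minimal Python sketch, not part of the slides: the tuple encoding of expressions and the names `lang`, `concat`, and `star` are assumptions made for illustration, and languages are truncated to strings of length at most n so that the sets stay finite.

```python
# Hypothetical tuple encoding of regular expressions (not from the slides):
# ("empty",), ("eps",), ("sym", c), ("cat", r, s), ("union", r, s),
# ("star", r), ("plus", r). Rule 8 (parentheses) is implicit in the nesting.

def concat(A, B, n):
    """Concatenate two languages, keeping only strings of length <= n."""
    return {x + y for x in A for y in B if len(x + y) <= n}

def star(A, n):
    """Kleene star of A, truncated to strings of length <= n."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string drawn from A.
        frontier = concat(frontier, A, n) - result
        result |= frontier
    return result

def lang(r, n):
    """Strings of length <= n in L(r), following rules 1-8."""
    kind = r[0]
    if kind == "empty":                      # rule 1
        return set()
    if kind == "eps":                        # rule 2
        return {""}
    if kind == "sym":                        # rule 3
        return {r[1]} if n >= 1 else set()
    if kind == "cat":                        # rule 4
        return concat(lang(r[1], n), lang(r[2], n), n)
    if kind == "union":                      # rule 5
        return lang(r[1], n) | lang(r[2], n)
    if kind == "star":                       # rule 6
        return star(lang(r[1], n), n)
    if kind == "plus":                       # rule 7: L(α+) = L(α)(L(α))*
        base = lang(r[1], n)
        return concat(base, star(base, n), n)
    raise ValueError(f"unknown node: {kind}")

a, b = ("sym", "a"), ("sym", "b")
ends_in_b = ("cat", ("star", ("union", a, b)), b)   # (a ∪ b)*b
print(sorted(lang(ends_in_b, 2)))   # ['ab', 'b', 'bb']
```

With this encoding, `lang(("star", ("empty",)), 3)` gives `{""}` and `lang(("plus", ("empty",)), 3)` gives the empty set, matching the special cases stated for rules 6 and 7.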



Rules 1, 3, 4, 5 & 6 give the language its
power to define sets.
Rule 8 has as its only role grouping other
operators.
Rules 2 & 7 appear to add functionality to the
regular expression language, but don't.
2. ε is a regular expression. ∅* = ε
This is like 5^0 = 1: ∅^0 = {ε}.
7. If α is a regular expression, then so is α+. α+ = αα*
L((a ∪ b)*b) = L((a ∪ b)*) L(b)
= (L((a ∪ b)))* L(b)
= (L(a) ∪ L(b))* L(b)
= ({a} ∪ {b})* {b}
= {a, b}* {b}.
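The result of this derivation says that the expression describes exactly the strings over {a, b} that end in b. A quick bounded check, sketched here with Python's `re` module (which writes union as `|`; the helper `strings_up_to` is a hypothetical name introduced for this example):

```python
import re
from itertools import product

def strings_up_to(n, alphabet="ab"):
    """Yield every string over the alphabet with length 0..n."""
    for k in range(n + 1):
        for letters in product(alphabet, repeat=k):
            yield "".join(letters)

pattern = re.compile(r"(a|b)*b")   # (a ∪ b)*b in Python syntax
matched = {w for w in strings_up_to(4) if pattern.fullmatch(w)}
expected = {w for w in strings_up_to(4) if w.endswith("b")}
print(matched == expected)   # True
```

This only tests strings up to length 4, so it is evidence for the derivation rather than a proof of it.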
Examples
By convention, we omit the L( ) portion and use the
expression itself to represent the language. Give a description of each.
L(a*b*) = a*b* = {a}*{b}*
L((a ∪ b)*) = (a ∪ b)* = {a,b}*
L((a ∪ b)*a*b*) = (a ∪ b)*a*b* = {a,b}*{a}*{b}*
L((a ∪ b)*abba(a ∪ b)*) = (a ∪ b)*abba(a ∪ b)*
= {a,b}*{abba}{a,b}*
Give a Regular Expression
L = {w ∈ {a, b}*: |w| is even}
Solution
L = {w ∈ {a, b}*: |w| is even}
((a ∪ b)(a ∪ b))*
OR
(aa ∪ ab ∪ ba ∪ bb)*
Explain how this guarantees an even number of
characters in each string that fits the pattern of the
regular expression.
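One way to test an answer to this question is to compare both expressions against the definition of L on all short strings. The sketch below assumes Python's `re` syntax (`|` for union) and a hypothetical helper `strings_up_to`; it collects any string on which an expression disagrees with "length is even".

```python
import re
from itertools import product

def strings_up_to(n, alphabet="ab"):
    """Yield every string over the alphabet with length 0..n."""
    for k in range(n + 1):
        for letters in product(alphabet, repeat=k):
            yield "".join(letters)

pairs = re.compile(r"((a|b)(a|b))*")      # ((a ∪ b)(a ∪ b))*
blocks = re.compile(r"(aa|ab|ba|bb)*")    # (aa ∪ ab ∪ ba ∪ bb)*

mismatches = [
    w for w in strings_up_to(6)
    if bool(pairs.fullmatch(w)) != (len(w) % 2 == 0)
    or bool(blocks.fullmatch(w)) != (len(w) % 2 == 0)
]
print(mismatches)   # [] means both expressions agree with the definition
```

Each pass through the starred group consumes exactly two characters, which is why only even lengths can match.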
Give a regular expression
L = {w ∈ {a, b}*: w contains an odd
number of a’s}
Solution
L = {w ∈ {a, b}*: w contains an odd number
of a’s}
b* (ab*ab*)* a b*
OR
b* a b* (ab*ab*)*
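Both solutions can be compared against the definition on all short strings. This is a bounded check only, written with Python's `re` module; `strings_up_to` is a hypothetical helper, not something from the slides.

```python
import re
from itertools import product

def strings_up_to(n, alphabet="ab"):
    """Yield every string over the alphabet with length 0..n."""
    for k in range(n + 1):
        for letters in product(alphabet, repeat=k):
            yield "".join(letters)

single_a_last = re.compile(r"b*(ab*ab*)*ab*")    # b* (ab*ab*)* a b*
single_a_first = re.compile(r"b*ab*(ab*ab*)*")   # b* a b* (ab*ab*)*

def odd_as(w):
    """True when w contains an odd number of a's."""
    return w.count("a") % 2 == 1

mismatches = [
    w for w in strings_up_to(7)
    if bool(single_a_last.fullmatch(w)) != odd_as(w)
    or bool(single_a_first.fullmatch(w)) != odd_as(w)
]
print(mismatches)   # [] means both expressions agree with the definition
```

The group (ab*ab*) always contributes two a's, and the lone a outside the star supplies the odd one.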
More Regular Expression Examples
L ( (aa*) ∪ ε ) =
L ( (a ∪ ε)* ) =
L = {w ∈ {a, b}*: there is no more than one b in w}
L = {w ∈ {a, b}*: no two consecutive letters in w are the
same}
Common Idioms
What do these mean?
(α ∪ ε)
(a ∪ b)*
(a ∪ b)+
Operator Precedence in Regular Expressions
           Regular Expressions    Arithmetic Expressions
Highest    Kleene star            exponentiation
           concatenation          multiplication
Lowest     union                  addition
Example    a b* ∪ c d*            x y^2 + i j^2
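Python's `re` module happens to use the same precedence order (repetition, then concatenation, then alternation), so the example expression can be compared with its fully parenthesized form. This is a small sketch; the sample strings are arbitrary choices.

```python
import re

implicit = re.compile(r"ab*|cd*")            # a b* ∪ c d*
explicit = re.compile(r"(a(b*))|(c(d*))")    # star first, then concatenation, then union

samples = ["a", "abbb", "c", "cdd", "abab", "acd", "abcd", ""]
same = all(
    bool(implicit.fullmatch(w)) == bool(explicit.fullmatch(w)) for w in samples
)
print(same)                                           # True
print([w for w in samples if implicit.fullmatch(w)])  # ['a', 'abbb', 'c', 'cdd']
```

Note that "abab" is rejected: the star applies only to b, not to ab.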
The Details Matter
Explain the differences!
These will be components of MANY of
your regular expressions
a* ∪ b* ≠ (a ∪ b)* ≠ (ab)*
(ab)* ≠ a*b*
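To see that these four expressions really define different languages, it helps to list which short strings each one matches. The sketch below uses Python's `re` module with `|` for union; the helper `language` and the length bound of 4 are assumptions made for this illustration.

```python
import re
from itertools import product

def language(pattern, n=4, alphabet="ab"):
    """Strings over the alphabet of length <= n that fully match the pattern."""
    rx = re.compile(pattern)
    words = ("".join(p) for k in range(n + 1) for p in product(alphabet, repeat=k))
    return {w for w in words if rx.fullmatch(w)}

a_star_or_b_star = language(r"a*|b*")   # a* ∪ b*: all a's or all b's
any_string = language(r"(a|b)*")        # (a ∪ b)*: every string
ab_repeated = language(r"(ab)*")        # (ab)*: ab repeated
as_then_bs = language(r"a*b*")          # a*b*: a's followed by b's

# Each line prints "True False": a witness string in one language but not the other.
print("ab" in any_string, "ab" in a_star_or_b_star)
print("abab" in ab_repeated, "abab" in as_then_bs)
print("aab" in as_then_bs, "aab" in ab_repeated)
```

On this bounded sample, a* ∪ b* is contained in a*b*, and every one of the languages is contained in (a ∪ b)*.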
The Details Matter
L1 = {w ∈ {a, b}* : every a is immediately followed by a b}
A regular expression for L1:
A FSM for L1:
L2 = {w ∈ {a, b}* : every a has a matching b somewhere}
A regular expression for L2:
A FSM for L2:
In this course…
● We will make claims that two
methodologies are equivalent, i.e., they have
the same power.
● We will also claim that one methodology
is more powerful than another.
What does that mean?
● Descriptive, Define
6.2 Kleene’s Theorem
Finite state machines & regular expressions define
the same class of languages. i.e. They are
equivalent. i.e. They are equally powerful.
To prove this, we must show:
Theorem: Any language that can be defined with a
regular expression can be accepted by some FSM
and so is regular.
Theorem: Every regular language (i.e., every
language that can be accepted by some DFSM)
can be defined with a regular expression.
For Every Regular Expression
There is a Corresponding FSM
We’ll show this by construction.
That is, for each of the components in the
definition of a Regular Expression (page 128), we
will develop a corresponding finite state machine.
The result will not necessarily be deterministic
The methods in the proof are not necessarily
unique
For Every Regular Expression
There is a Corresponding FSM
For the first 3 components:
∅:
A single element c of Σ:
ε = (∅*):
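[Diagram: M1 for Expression 1 (start state S1) and M2 for Expression 2 (start state S2), with a new start state S and ε-transitions.]
Do any other states need to change? Minimal?
[Diagram: M1 for Expression 1 (start state S1, final state F) connected by ε-transitions to M2 for Expression 2 (start state S2). Is this still F?]
Do any other states need to change? Finals?
[Diagram: M1 for Expression 1 (start state S1, final state F) with a new state SF and ε-transitions.]
Do any other states need to change? Finals?
[Diagram: M1 for Expression 1 with state SF, final state F, and ε-transitions.]
Do any other states need to change? Finals?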
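The construction idea can be sketched in code. The following Python is a minimal illustration of the usual ε-transition constructions for a single symbol, union, concatenation, and Kleene star; it is not taken from the slides, and the class `NFA` and all function names are hypothetical. The machines it builds are nondeterministic and not minimal, and the details may differ from the diagrams in the text.

```python
from itertools import count

_ids = count()   # source of fresh state numbers, so fragments never share states

class NFA:
    """An NFA fragment: one start state, a set of accepting states, and edges."""
    def __init__(self, start, accepts, edges):
        self.start = start
        self.accepts = set(accepts)
        self.edges = edges   # list of (source, label, target); label None means ε

def symbol(c):
    s, f = next(_ids), next(_ids)
    return NFA(s, {f}, [(s, c, f)])

def union(m1, m2):
    s = next(_ids)   # new start state with ε-moves to both old start states
    edges = m1.edges + m2.edges + [(s, None, m1.start), (s, None, m2.start)]
    return NFA(s, m1.accepts | m2.accepts, edges)

def concat(m1, m2):
    # ε-moves from each accepting state of m1 to the start of m2;
    # m1's accepting states stop being accepting.
    edges = m1.edges + m2.edges + [(f, None, m2.start) for f in m1.accepts]
    return NFA(m1.start, m2.accepts, edges)

def star(m1):
    s = next(_ids)   # new start state that also accepts, so ε is in the language
    loops = [(f, None, m1.start) for f in m1.accepts]
    return NFA(s, m1.accepts | {s}, m1.edges + [(s, None, m1.start)] + loops)

def eps_closure(m, states):
    """All states reachable from the given states using only ε-moves."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in m.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(m, w):
    """Simulate the NFA on w by tracking the set of possible states."""
    current = eps_closure(m, {m.start})
    for c in w:
        moved = {dst for src, label, dst in m.edges if src in current and label == c}
        current = eps_closure(m, moved)
    return bool(current & m.accepts)

# (b ∪ ab)*, built bottom-up from fresh fragments
machine = star(union(symbol("b"), concat(symbol("a"), symbol("b"))))
tests = ["", "b", "ab", "bab", "a", "aab", "ba"]
print([w for w in tests if accepts(machine, w)])   # ['', 'b', 'ab', 'bab']
```

Each call to `symbol` allocates new states, so a fragment should be built once per occurrence in the expression rather than reused.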
Example 1
(b ∪ ab)*
An FSM for b
An FSM for a
An FSM for b
An FSM for ab:
Note: This Example 6.5 page 136 is in error in text.
Example 1
(b ∪ ab)*
An FSM for (b ∪ ab):
Can we reduce it?
Example 1
(b ∪ ab)*
An FSM for (b ∪ ab)*:
Reduce??
Do Homework starting page 151.
Simplifying Regular Expressions
Regular expressions describe sets:
● Union is commutative: α ∪ β = β ∪ α.
● Union is associative: (α ∪ β) ∪ γ = α ∪ (β ∪ γ).
● ∅ is the identity for union: α ∪ ∅ = ∅ ∪ α = α.
● Union is idempotent: α ∪ α = α.
Concatenation:
● Concatenation is associative: (αβ)γ = α(βγ).
● ε is the identity for concatenation: α ε = ε α = α.
● ∅ is a zero for concatenation: α ∅ = ∅ α = ∅.
Concatenation distributes over union:
● (α ∪ β) γ = (α γ) ∪ (β γ).
● γ (α ∪ β) = (γ α) ∪ (γ β).
Kleene star:
● ∅* = ε.
● ε* = ε.
● (α*)* = α*.
● α*α* = α*.
● (α ∪ β)* = (α*β*)*.
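Several of these identities can be spot-checked on sample expressions. The sketch below picks arbitrary values for α, β, and γ and compares the two sides on all strings up to length 5, using Python's `re` module with `|` for union. The helper `language` is a hypothetical name, the identities involving ∅ are skipped because `re` has no direct notation for it, and a bounded check like this is evidence rather than a proof.

```python
import re
from itertools import product

def language(pattern, n=5, alphabet="ab"):
    """Strings over the alphabet of length <= n that fully match the pattern."""
    rx = re.compile(pattern)
    words = ("".join(p) for k in range(n + 1) for p in product(alphabet, repeat=k))
    return {w for w in words if rx.fullmatch(w)}

A, B, C = "(ab)", "(b)", "(a*)"   # arbitrary sample choices for α, β, γ

identities = {
    "union is commutative": (f"{A}|{B}", f"{B}|{A}"),
    "union is idempotent": (f"{A}|{A}", A),
    "right distribution": (f"({A}|{B}){C}", f"({A}{C})|({B}{C})"),
    "left distribution": (f"{C}({A}|{B})", f"({C}{A})|({C}{B})"),
    "(α*)* = α*": (f"({A}*)*", f"{A}*"),
    "α*α* = α*": (f"{A}*{A}*", f"{A}*"),
    "(α ∪ β)* = (α*β*)*": (f"({A}|{B})*", f"({A}*{B}*)*"),
}

failed = [name for name, (left, right) in identities.items()
          if language(left) != language(right)]
print(failed)   # [] means every identity held on the sample
```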

End of Chapter – Page 161 +

Try all of the problems – Really!