Automata, Computability, & Complexity by Elaine Rich
Slides provided by author
Slides edited for use by MSU Department of Computer Science – R. Halverson

Regular Expressions

LOTS of problems at end of chapter for you to practice!

Regular Languages

[Diagram: Regular Language L, Regular Expression, Finite State Machine (accepts).]

A method to describe a regular language.
◦ Different from that of a FSM
Consists of a set of symbols + a syntax
Symbols
◦ Special symbols: ∅ ∪ ( ) * +
◦ Alphabet Σ: from which strings in the language are made

Regular Expressions

The regular expressions over an alphabet Σ are all and only the strings that can be obtained as follows:
1. ∅ is a regular expression.
2. ε is a regular expression.
3. Every element of Σ is a regular expression.
4. If α, β are regular expressions, then so is αβ.
5. If α, β are regular expressions, then so is α ∪ β.
6. If α is a regular expression, then so is α*.
7. If α is a regular expression, then so is α+.
8. If α is a regular expression, then so is (α).
Know this definition!

Regular Expression Examples

If Σ = {a, b}, the following are regular expressions:
a
(a ∪ b)*
abba
b*(aa ∪ bb)+ b

Regular Expressions Define Languages

Define L, a semantic interpretation function for regular expressions:
1. L(∅) = ∅.
2. L(ε) = {ε}.
3. L(c), where c ∈ Σ, = {c}.
4. L(αβ) = L(α) L(β).
5. L(α ∪ β) = L(α) ∪ L(β).
6. L(α*) = (L(α))*.
7. L(α+) = L(αα*) = L(α) (L(α))*. If L(α) is equal to ∅, then L(α+) is also equal to ∅. Otherwise L(α+) is the language formed by concatenating together one or more strings drawn from L(α).
8. L((α)) = L(α).

Rules 1, 3, 4, 5 & 6 give the language its power to define sets.
Rule 8 has as its only role grouping other operators.
Rules 2 & 7 appear to add functionality to the regular expression language, but don't.

2. ε is a regular expression.
   ε = ∅*   (this is like 5^0 = 1; ∅^0 = {ε})
7. If α is a regular expression, then so is α+.
   α+ = αα*

Worked example:
L((a ∪ b)*b) = L((a ∪ b)*) L(b)
             = (L((a ∪ b)))* L(b)
             = (L(a) ∪ L(b))* L(b)
             = ({a} ∪ {b})* {b}
             = {a, b}* {b}.
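The semantic interpretation function L can be sketched directly in code. The following is a minimal Python illustration, not taken from the text: regular expressions are written as nested tuples (a representation assumed here for convenience, so rule 8 is handled by the nesting itself), and every language is cut off at strings of length MAX_LEN so that the sets stay finite. The names `L`, `concat`, `star`, and `MAX_LEN` are illustrative choices.

```python
from itertools import product

MAX_LEN = 4  # assumption: only strings up to this length are enumerated


def concat(A, B):
    """Concatenation of two languages, truncated to MAX_LEN."""
    return {x + y for x in A for y in B if len(x + y) <= MAX_LEN}


def star(A):
    """Kleene star of a language, truncated to MAX_LEN."""
    result = {""}      # concatenating zero strings gives epsilon
    frontier = {""}
    while frontier:    # keep extending until no new string appears
        frontier = concat(frontier, A) - result
        result |= frontier
    return result


def L(regex):
    """Bounded version of the semantic interpretation function."""
    op = regex[0]
    if op == "empty":                                   # rule 1
        return set()
    if op == "eps":                                     # rule 2
        return {""}
    if op == "sym":                                     # rule 3
        return {regex[1]}
    if op == "cat":                                     # rule 4
        return concat(L(regex[1]), L(regex[2]))
    if op == "union":                                   # rule 5
        return L(regex[1]) | L(regex[2])
    if op == "star":                                    # rule 6
        return star(L(regex[1]))
    if op == "plus":                                    # rule 7
        inner = L(regex[1])
        return concat(inner, star(inner))
    raise ValueError(f"unknown operator: {op!r}")


# The worked example (a U b)*b, compared with {a, b}*{b} up to MAX_LEN.
a, b = ("sym", "a"), ("sym", "b")
example = ("cat", ("star", ("union", a, b)), b)
all_strings = {"".join(p) for n in range(MAX_LEN + 1)
               for p in product("ab", repeat=n)}
ends_in_b = {w for w in all_strings if w.endswith("b")}
print(L(example) == ends_in_b)
```

Because of the length bound, agreement here is only evidence for short strings, not a proof; the derivation by rules 4, 5, and 6 is what establishes the equality.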
Examples

By convention, we omit the L( ) portion and use the expression itself to represent the language.
Give a description.
L(a*b*) = a*b* = {a}*{b}*
L((a ∪ b)*) = (a ∪ b)* = {a, b}*
L((a ∪ b)*a*b*) = (a ∪ b)*a*b* = {a, b}*{a}*{b}*
L((a ∪ b)*abba(a ∪ b)*) = (a ∪ b)*abba(a ∪ b)* = {a, b}*abba{a, b}*

Give a Regular Expression

L = {w ∈ {a, b}*: |w| is even}

Solution

L = {w ∈ {a, b}*: |w| is even}
((a ∪ b)(a ∪ b))*
OR
(aa ∪ ab ∪ ba ∪ bb)*
Explain how this guarantees an even number of characters in each string that fits the pattern of the regular expression.

Give a Regular Expression

L = {w ∈ {a, b}*: w contains an odd number of a's}

Solution

L = {w ∈ {a, b}*: w contains an odd number of a's}
b* (ab*ab*)* a b*
OR
b* a b* (ab*ab*)*

More Regular Expression Examples

L( (aa*) ∪ ε ) =
L( (a ∪ ε)* ) =
L = {w ∈ {a, b}*: there is no more than one b in w}
L = {w ∈ {a, b}*: no two consecutive letters in w are the same}

Common Idioms

What do these mean?
(α ∪ ε)
(a ∪ b)*
(a ∪ b)+

Operator Precedence in Regular Expressions

           Regular Expressions    Arithmetic Expressions
Highest    Kleene star            exponentiation
           concatenation          multiplication
Lowest     union                  addition
Example    a b* ∪ c d*            x y^2 + i j^2

The Details Matter

Explain the differences! These will be components of MANY of your regular expressions.
a* ∪ b*  versus  (a ∪ b)*
(ab)*  versus  a*b*

The Details Matter

L1 = {w ∈ {a, b}*: every a is immediately followed by a b}
A regular expression for L1:
A FSM for L1:
L2 = {w ∈ {a, b}*: every a has a matching b somewhere}
A regular expression for L2:
A FSM for L2:

In this course…

We will make claims that 2 methodologies are equivalent, i.e., have the same power.
We will also claim that one methodology is more powerful than another.
What does that mean? Descriptive, Define

6.2 Kleene's Theorem

Finite state machines & regular expressions define the same class of languages.
i.e. They are equivalent.
i.e. They are equally powerful.
To prove this, we must show:
Theorem: Any language that can be defined with a regular expression can be accepted by some FSM and so is regular.
Theorem: Every regular language (i.e., every language that can be accepted by some DFSM) can be defined with a regular expression.

For Every Regular Expression There is a Corresponding FSM

We'll show this by construction. That is, for each of the components in the definition of a Regular Expression (page 128), we will develop a corresponding finite state machine.
The result will not necessarily be deterministic.
The methods in the proof are not necessarily unique.

For Every Regular Expression There is a Corresponding FSM

For the first 3 components:
∅:
A single element c of Σ:
ε = (∅*):

[Diagram: M1 for Expression 1 (S1) and M2 for Expression 2 (S2), with state S.]
Do any other states need to change? Minimal?

[Diagram: M1 for Expression 1 (S1, F: is this still F?) and M2 for Expression 2 (S2).]
Do any other states need to change? Finals?

[Diagram: M1 for Expression 1 (S1, F), with state SF.]
Do any other states need to change? Finals?

[Diagram: M1 for Expression 1 (SF, F).]
Do any other states need to change? Finals?

Example 1 (b ∪ ab)*

An FSM for b:
An FSM for a:
An FSM for b:
An FSM for ab:
Note: This example (Example 6.5, page 136) is in error in the text.

Example 1 (b ∪ ab)*

An FSM for (b ∪ ab):
Can we reduce it?

Example 1 (b ∪ ab)*

An FSM for (b ∪ ab)*:
Reduce??

Do Homework starting page 151.

Simplifying Regular Expressions

Regex's describe sets:
● Union is commutative: α ∪ β = β ∪ α.
● Union is associative: (α ∪ β) ∪ γ = α ∪ (β ∪ γ).
● ∅ is the identity for union: α ∪ ∅ = ∅ ∪ α = α.
● Union is idempotent: α ∪ α = α.

Concatenation:
● Concatenation is associative: (αβ)γ = α(βγ).
● ε is the identity for concatenation: αε = εα = α.
● ∅ is a zero for concatenation: α∅ = ∅α = ∅.

Concatenation distributes over union:
● (α ∪ β)γ = (αγ) ∪ (βγ).
● γ(α ∪ β) = (γα) ∪ (γβ).

Kleene star:
● ∅* = ε.
● ε* = ε.
● (α*)* = α*.
● α*α* = α*.
● (α ∪ β)* = (α*β*)*.

End of Chapter – Page 161
Try all of the problems – Really!
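As a companion to the construction of an FSM for (b ∪ ab)* earlier in these slides, here is a minimal Python sketch of one common way to build a nondeterministic machine with ε-transitions from a regular expression. It is an illustration under stated assumptions, not necessarily the exact construction drawn on the slides or in the text: the class `NFA` and the functions `symbol`, `union`, `concat`, `star`, and `accepts` are hypothetical names, and the empty string `""` is used as the label of an ε-transition.

```python
from itertools import count

_ids = count()  # source of fresh state names (assumed helper)


class NFA:
    def __init__(self, start, finals, edges):
        self.start = start    # single start state
        self.finals = finals  # set of accepting states
        self.edges = edges    # set of (source, label, target); "" = epsilon


def symbol(c):
    """Machine for a single element c of the alphabet."""
    s, f = next(_ids), next(_ids)
    return NFA(s, {f}, {(s, c, f)})


def union(m1, m2):
    """New start state with epsilon-transitions to both old start states."""
    s = next(_ids)
    edges = m1.edges | m2.edges | {(s, "", m1.start), (s, "", m2.start)}
    return NFA(s, m1.finals | m2.finals, edges)


def concat(m1, m2):
    """Epsilon-transitions from the finals of m1 to the start of m2."""
    edges = m1.edges | m2.edges | {(f, "", m2.start) for f in m1.finals}
    return NFA(m1.start, set(m2.finals), edges)


def star(m):
    """New accepting start state; finals loop back to the old start."""
    s = next(_ids)
    edges = m.edges | {(s, "", m.start)} | {(f, "", m.start) for f in m.finals}
    return NFA(s, m.finals | {s}, edges)


def eps_closure(m, states):
    """All states reachable from `states` using only epsilon-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in m.edges:
            if src == q and label == "" and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(m, w):
    """Simulate the machine on w by tracking the set of current states."""
    current = eps_closure(m, {m.start})
    for c in w:
        step = {dst for src, label, dst in m.edges
                if src in current and label == c}
        current = eps_closure(m, step)
    return bool(current & m.finals)


# Machine for (b U ab)*, built bottom-up as in the example slides.
m = star(union(symbol("b"), concat(symbol("a"), symbol("b"))))
print(accepts(m, ""), accepts(m, "bab"), accepts(m, "ba"))  # True True False
```

The result is not deterministic and is far from minimal, which matches the slides' remarks that the constructed machine will not necessarily be deterministic and can often be reduced by hand.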