Chapter 3 Regular Languages and Grammars

Section 3.1 Regular Expressions

A regular expression is

operators:

Examples:
1) A variable name:
2) All even length strings:
3) All strings ending in 00 or 11:

Definition: Let Σ be an alphabet. Then
1.
2.
3.

Associated with each regular expression r is a language we denote by L(r).
1.
2.
3.
For regular expressions r1 and r2,
4. L(r1 + r2) = L(r1) ∪ L(r2)
5. L(r1 • r2) = L(r1)L(r2)
6. L((r1)) = L(r1)
7. L(r1*) = (L(r1))*

operator precedence:

Examples: Assume the alphabet is Σ = {a, b}
1) Strings that contain consecutive a’s:
2) The complement of the language in 1)
3) All strings over Σ = {a, b} in which b's occur in clumps of even length.
4) Strings in which the number of a’s is odd.

#6 p. 76 What languages do (Ø*)* and aØ represent?

Section 3.2 Connection between Regular Expressions and Regular Languages

Theorem 3.1 Let r be a regular expression. Then there exists an NFA that accepts L(r). Consequently, L(r) is a regular language.

The NFA constructed in the proof has the following properties:
1.
2.
3.

Proof:
Basis: We first construct automata for the three basis cases: Ø, λ, and a
Hypothesis:
Induction step:

Finishing the Proof
Case 1:
Case 2:
Case 3:

Example: automaton for (a + b)*ab

Regular expressions for regular languages
Basic idea: generalized transition graph
Example: eliminate vertex q2 in the figure below.
Result:

Theorem 3.2: Let L be a regular language. Then there exists a regular expression r such that L = L(r).
Proof:

Finishing the proof
Example 3.10 on page 84.
[Figure: transition graph with states EE, OE, OO]
Thus we have:

Finishing the Example
Now, we need to remove OO
[Figure: transition graph with states EE, OO, EO]
Final diagram is:

Section 3.3 Regular Grammars

Definition 3.3 A grammar G = (V, T, S, P) is said to be right-linear if

Grammars to Automata and vice versa
Right linear grammars generate regular languages
Grammar over {a, b} that generates all strings of odd length ending in b:
S → b
S → bA | aA
A → aS | bS

Algorithm to construct an NFA from the grammar
Note: this discussion differs from that in the book.
1.
2.
3.
4.
5.

Theorem 3.3: Let G = (V, T, S, P) be a right-linear grammar. Then L(G) is a regular language.
Proof sketch: by induction on the length of a derivation.
Basis:
Hypothesis: If w ∈ L(G) and w can be derived in n or fewer steps, then w ∈ L(M); and if w ∈ L(M) and |w| ≤ n, then S derives w in n or fewer steps.
Induction: Suppose S derives w in n + 1 steps.

Finishing the grammar-to-NFA proof

Right linear grammars for regular languages.
The grammar has four parts
N = set of variables
T = set of terminals
P = set of productions or rules
S = start symbol

automaton move          grammar production
δ(q0, a) = q1
δ(q0, b) = q2
δ(q1, a) = q0
δ(q1, b) = q0
δ(q2, a) = q0
δ(q2, b) = q0

{A, B, C}
{a, b}
{A}

Finishing the construction
A → aB | bC
B → aA | bA
C → aA | bA
Productions needed to terminate derivations

Theorem 3.4 If L is a regular language on the alphabet Σ, then there exists a right-linear grammar G = (V, Σ, S, P) such that L = L(G).
V =
T =
S =
P is defined as follows:

Finally, we have Theorem 3.6: A language L is regular iff there exists a regular grammar G such that L = L(G).

Another example of machine to grammar construction
Let S be EE, A be OE, B be OO, and C be EO.
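
The machine-to-grammar construction used in this section can be sketched in Python. This is a minimal illustration, not the book's own presentation: each move δ(p, x) = q becomes a production p → xq, and each final state f gets a terminating production f → λ. The transition table is the three-state automaton from the notes, with A, B, C standing for q0, q1, q2. The function names (dfa_to_right_linear, generates, format_grammar) are hypothetical, and taking A as the only final state is an assumption made here for illustration.

```python
from collections import defaultdict

LAMBDA = ""  # the empty string stands in for lambda


def dfa_to_right_linear(delta, finals):
    """Build right-linear productions from a DFA transition table.

    delta maps (state, symbol) -> state; states double as grammar variables.
    Each move delta(p, x) = q yields the production p -> x q, and each
    final state f yields the terminating production f -> lambda.
    A production is stored as a pair (terminal, variable-or-None).
    """
    productions = defaultdict(list)
    for (state, symbol), target in sorted(delta.items()):
        productions[state].append((symbol, target))
    for state in sorted(finals):
        productions[state].append((LAMBDA, None))
    return dict(productions)


def generates(productions, start, word):
    """Return True if `start` derives `word` using the productions."""
    current = {start}
    for ch in word:
        # Follow every production var -> ch target from the current variables.
        current = {
            target
            for var in current
            for (symbol, target) in productions.get(var, [])
            if symbol == ch and target is not None
        }
    # The derivation can stop only at a variable with a lambda production.
    return any((LAMBDA, None) in productions.get(var, []) for var in current)


def format_grammar(productions):
    """Render the productions as lines such as 'A -> aB | bC'."""
    lines = []
    for var in sorted(productions):
        bodies = [
            "lambda" if target is None else symbol + target
            for (symbol, target) in productions[var]
        ]
        lines.append(f"{var} -> {' | '.join(bodies)}")
    return lines


# Transition table from the notes, with A = q0, B = q1, C = q2.
DELTA = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "A", ("B", "b"): "A",
    ("C", "a"): "A", ("C", "b"): "A",
}
FINALS = {"A"}  # assumption: the notes do not label the final states

grammar = dfa_to_right_linear(DELTA, FINALS)
for line in format_grammar(grammar):
    print(line)
```

Under the assumed final state, the grammar generates exactly the even-length strings over {a, b}, so generates(grammar, "A", "ab") holds while generates(grammar, "A", "a") does not. Choosing a different final-state set changes only which variables receive a λ-production.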