Chapter 4
• Pumping Lemma
• Properties of Regular Languages
• Decidable questions on Regular Languages

Theorem 4.1: Pumping lemma for regular languages
Let L be a regular language. Then there is a constant n (which depends on L) such that for every string w in L with |w| ≥ n, we can break w into three strings, w = xyz, such that
– y ≠ ε,
– |xy| ≤ n, and
– for all i ≥ 0, x y^i z is also in L.

[Figure: a path from the start state q to a state ql, a loop from ql back to ql, and a path from ql to a final state f.]

Proof: Let m be the number of states in the smallest FA accepting L, and take m as the constant of the theorem. Let w = a1 a2 … an where n ≥ m (here n is the length of w). While reading the first m symbols of w, the FA passes through m + 1 states, so by the pigeonhole principle some state ql occurs twice. Hence, if w is in L, there are indices 0 ≤ j < k ≤ m such that
δ(q, a1 a2 … aj) = ql, δ(ql, aj+1 … ak) = ql, and δ(ql, ak+1 … an) = f,
where q is the start state, f is a final state, and δ is the transition function extended to strings.
Choose
x = a1 a2 … aj
y = aj+1 … ak
z = ak+1 … an
Then y ≠ ε and |xy| = k ≤ m. Since δ(ql, y) = ql, it follows that δ(ql, y^i) = ql for all i ≥ 0. So, if the FA accepts w = xyz, it also accepts x y^i z.

Applications of the pumping lemma
Useful to prove that a language L is not a regular set.
Method:
– select an arbitrary n
– choose a string w in L where |w| ≥ n
– for any partition w = xyz such that |xy| ≤ n and |y| ≥ 1, show a contradiction; i.e., show that there is a string x y^k z not in L; k may depend on n, x, y, and z

Example: L = {0^i 1^i | i > 0}
– given arbitrary n, choose w = 0^n 1^n
– for any partition of w as xyz, y is one of 0^j, 1^j, or 0^j 1^l (with |xy| ≤ n, only the case y = 0^j, j ≥ 1, can occur)
– in all cases x y^2 z is not in L
So, L is not a regular set.

Example: L = {w in (a+b)* | # a's = # b's}
Example: L = {w | w = a^p where p is a prime}

Definition of concatenation
Let u, v ∈ Σ*. The concatenation of u and v, written as uv, is a binary operation on the strings of Σ* defined as follows:
1. Basis: If length(v) = 0, then v = ε and uv = u.
2. Recursive step: Let v be a string of length n > 0 and v = wa, where a ∈ Σ and length(w) = n-1. Then uv = (uw)a.

Definition of u^R (reversal of u)
1. Basis: If length(u) = 0, then u = ε and ε^R = ε.
2. Recursive step: If length(u) = n > 0, then u = wa for some string w such that length(w) = n-1 and some a ∈ Σ, and u^R = a w^R.

u is a substring of v if v = xuy
u is a prefix of v if v = ux
u is a suffix of v if v = xu

Theorem 1: (uv)w = u(vw). Concatenation is associative.
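The recursive definitions of concatenation and reversal can be transcribed almost literally into code. The following is a minimal sketch, assuming strings over Σ are modelled as Python str values; the function names concat, reverse, strings_up_to and check_associativity are illustrative choices, not part of the slides. The brute-force check only tests associativity on short strings; it illustrates the theorem and does not prove it.

```python
from itertools import product


def concat(u: str, v: str) -> str:
    """Concatenation uv, by recursion on the length of v."""
    if len(v) == 0:           # basis: v = ε, so uv = u
        return u
    w, a = v[:-1], v[-1]      # recursive step: v = wa with a in Σ
    # uv = (uw)a; Python's + is used only to append the single symbol a
    return concat(u, w) + a


def reverse(u: str) -> str:
    """Reversal u^R, by recursion on the length of u."""
    if len(u) == 0:           # basis: ε^R = ε
        return u
    w, a = u[:-1], u[-1]      # recursive step: u = wa
    return a + reverse(w)     # u^R = a w^R


def strings_up_to(alphabet: str, max_len: int):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)


def check_associativity(alphabet: str = "ab", max_len: int = 3) -> bool:
    """Test (uv)w == u(vw) for all short strings u, v, w (hypothetical helper)."""
    words = list(strings_up_to(alphabet, max_len))
    return all(
        concat(concat(u, v), w) == concat(u, concat(v, w))
        for u in words for v in words for w in words
    )
```

For instance, concat("ab", "ba") evaluates to "abba" and reverse("abc") evaluates to "cba".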
Theorem 2: (uv)^R = v^R u^R

A language L such that no string in L is a proper prefix (suffix) of any other string in L is said to have the prefix (suffix) property.
Example:
– {a}* does not have the prefix property
– {a^i b | i ≥ 0} does have the prefix property

Definition of homomorphism
Let Σ1 and Σ2 be alphabets. A homomorphism is a mapping h: Σ1 → Σ2*. It is extended to Σ1* as follows: h: Σ1* → Σ2* is defined by h(ε) = ε and h(xa) = h(x)h(a) for all x ∈ Σ1* and a ∈ Σ1.

Homomorphism applied to a language
Let L be a language and h be a homomorphism. Then h(L) = {h(w) | w is in L}.

Definition of inverse homomorphism
If h: Σ1 → Σ2* is a homomorphism, then the relation h^-1: Σ2* → P(Σ1*), where P(Σ1*) is the set of all subsets of Σ1*, is called an inverse homomorphism and is defined as follows:
If y ∈ Σ2*, then h^-1(y) = {x | h(x) = y}.
If L ⊆ Σ2*, then h^-1(L) = ∪ {h^-1(y) | y ∈ L} = {x | h(x) ∈ L}.

Examples: Let Σ1 = {0, 1}, Σ2 = {a, b}, L = {0^n 1^n | n > 0}.
Let h(0) = a, h(1) = bb. Then
– h(L) = {a^n b^2n | n > 0}
– h^-1(abb) = {01}
Let h(0) = ab and h(1) = ε. Then
– h(L) = {(ab)^n | n > 0}
– h^-1(ab) = {1^m 0 1^n | m, n ≥ 0}

Closure Properties of Regular Languages
The class of regular sets is closed under
– Union
– Intersection
– Complement
– Difference
– Reversal
– Closure (*)
– Concatenation
– Homomorphism
– Inverse homomorphism

Theorem (4.5): If L is a regular language over an alphabet Σ, then its complement L̄ = Σ* – L is also regular.
Proof: Let L = L(A) for some DFA A = (Q, Σ, δ, q0, F). Let B be the DFA B = (Q, Σ, δ, q0, Q – F). B accepts a string exactly when A rejects it, so L(B) = L̄. Therefore L̄ is regular.

Corollary: If L and M are regular languages, then L ∩ M is regular. (By De Morgan's law, L ∩ M is the complement of L̄ ∪ M̄, and regular languages are closed under union and complement.)

Theorem (4.11): If L is a regular language, so is L^R.
Proof: (Use structural induction on the size of the regular expression E representing L.)
Claim: E^R represents L^R.
Basis: If L is ∅, {ε} or {a}, then the claim is true, since each of these languages is its own reversal.
Induction step:
1) If E = E1 + E2, then E^R = E1^R + E2^R
2) If E = E1 E2, then E^R = E2^R E1^R
3) If E = E1*, then E^R = (E1^R)*

Theorem (4.14): If L is a regular language over an alphabet Σ, and h is a homomorphism on Σ, then h(L) is also regular.
Proof: Let r be a regular expression representing L, and let s be the regular expression obtained from r by replacing every symbol a of Σ by h(a). Then s represents h(L) (use structural induction to prove this). Since s is a regular expression, h(L) is a regular language.

Theorem (4.16): If h is a homomorphism from an alphabet Σ to Δ* and L ⊆ Δ* is regular, then h^-1(L) is also a regular set.
Proof: Let M = (Q, Δ, δ, q0, F) be a DFA such that L = L(M). Let M1 = (Q, Σ, δ1, q0, F), where δ1(q, a) in M1 = δ(q, h(a)) in M, with δ extended to strings. Then M1 accepts x exactly when M accepts h(x), so L(M1) = h^-1(L).

Example: Show that {a^n b a^n | n ≥ 1} is not regular.
Example: Show that {0,1}* - {0^n 1^n | n > 0} is not regular.

Note: The class of regular sets included in Σ* is a Boolean algebra of sets for any alphabet Σ.
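The two DFA constructions used in the proofs of Theorems 4.5 and 4.16 are short enough to write out directly. The following is a minimal sketch, assuming a DFA is stored as a small Python dataclass with a total transition table; the class DFA, the functions complement and inverse_hom, and the example automaton even_as are all illustrative names that do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: frozenset
    alphabet: frozenset
    delta: dict        # total map: (state, symbol) -> state
    start: object
    finals: frozenset

    def run(self, state, word):
        """Extended transition function: the state reached from `state` on `word`."""
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state

    def accepts(self, word):
        return self.run(self.start, word) in self.finals


def complement(m: DFA) -> DFA:
    """Theorem 4.5: same automaton, with final states Q - F."""
    return DFA(m.states, m.alphabet, m.delta, m.start, m.states - m.finals)


def inverse_hom(m: DFA, h: dict) -> DFA:
    """Theorem 4.16: delta1(q, a) = delta(q, h(a)).

    `h` maps each symbol of the source alphabet to a string over m.alphabet.
    """
    delta1 = {(q, a): m.run(q, h[a]) for q in m.states for a in h}
    return DFA(m.states, frozenset(h), delta1, m.start, m.finals)


# Hypothetical example: strings over {a, b} with an even number of a's.
even_as = DFA(
    states=frozenset({"even", "odd"}),
    alphabet=frozenset({"a", "b"}),
    delta={
        ("even", "a"): "odd", ("even", "b"): "even",
        ("odd", "a"): "even", ("odd", "b"): "odd",
    },
    start="even",
    finals=frozenset({"even"}),
)

# The homomorphism from the examples: h(0) = a, h(1) = bb.
h = {"0": "a", "1": "bb"}
pre_image = inverse_hom(even_as, h)   # accepts x iff even_as accepts h(x)
odd_as = complement(even_as)          # accepts strings with an odd number of a's
```

With these definitions, pre_image.accepts("0110") is True because h(0110) = abbbba contains two a's, while pre_image.accepts("01") is False. Because a homomorphism may map a symbol to ε, run returns its starting state on the empty string, which makes delta1(q, a) = q in that case.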