The Pumping Lemma
CS 130: Theory of Computation
HMU textbook, Chapter 4 (sections 4.1 and 4.2)

A language that is not regular
Consider L = {w | w = a^n b^n for some n >= 0}
There is no DFA that accepts L and no regular expression that describes L
How do we prove this?
Intuition: DFAs cannot “count”; that is, a DFA cannot remember that n a’s have been read so that exactly n b’s should follow
Proof technique: by contradiction

Proving that a language is not regular
Suppose there is a DFA that recognizes L, and let k be the number of states in that DFA
Consider recognizing the strings ab, a^2 b^2, a^3 b^3, …, a^k b^k, a^(k+1) b^(k+1)
For each string, note the state the DFA is in just after it reads the last a
There are k+1 strings but only k states, so two of the strings must leave the DFA in the same state (pigeonhole principle)
This means there are two strings, a^r b^r and a^(r+p) b^(r+p) with p >= 1, such that the prefixes a^r and a^(r+p) bring the DFA to the same state, say q_t
That is, q_0 --s--> q_t for both s = a^r and s = a^(r+p)
This in turn means that q_t --v--> q_t for v = a^p
Note also that q_t --w--> q_f for both w = b^r and w = b^(r+p), where q_f denotes an accepting state (not necessarily the same one in both cases)
Implication: a^r b^(r+p) and a^(r+2p) b^(r+p) are accepted by this DFA, yet neither is in L. A contradiction.

Summarizing the strategy
We looked for a state that can be pumped: the substring read along a path that starts and ends at that state can be repeated any number of times (or removed) inside an accepted string, and the result is still accepted
We formalize this notion through the Pumping Lemma

Pumping Lemma
Let L be a regular language
There exists an n such that every z in L with |z| >= n can be broken down into three substrings, z = uvw, with |uv| <= n and |v| >= 1, where, for all i >= 0, u v^i w is in L
Proof of this lemma follows the outline of our earlier argument

Using the pumping lemma in our example
L is all strings of the form a^p b^p
Suppose L is regular. Then we are guaranteed there is an n as specified in the pumping lemma
We choose z = a^n b^n in L and express it as uvw
Since |uv| <= n, v consists entirely of a’s, say v = a^k with k >= 1
This means strings like a^(n-k) b^n (taking i = 0) and a^(n+k) b^n (taking i = 2) should be in L, which is a contradiction
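
The last argument can be checked mechanically for small values of n. The Python sketch below is an illustration only, not part of the course material: the function names in_L, decompositions and pumping_fails_everywhere are hypothetical, and a finite check like this does not replace the proof. For a given n it enumerates every split of a^n b^n into uvw with |uv| <= n and |v| >= 1, and confirms that pumping with i = 0 or i = 2 always produces a string outside L.

```python
def in_L(s: str) -> bool:
    """Membership test for L = { a^m b^m : m >= 0 }."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and s == "a" * half + "b" * half


def decompositions(z: str, n: int):
    """Yield every split z = u + v + w with |uv| <= n and |v| >= 1."""
    for uv_len in range(1, min(n, len(z)) + 1):
        for u_len in range(uv_len):
            yield z[:u_len], z[u_len:uv_len], z[uv_len:]


def pumping_fails_everywhere(n: int, exponents=(0, 2)) -> bool:
    """Return True if every allowed split of a^n b^n leaves L for some i.

    Only the exponents listed in `exponents` are tried (an assumption of
    this sketch); i = 0 and i = 2 match the two strings used in the slides.
    """
    z = "a" * n + "b" * n
    for u, v, w in decompositions(z, n):
        if all(in_L(u + v * i + w) for i in exponents):
            return False  # this split survived pumping for the tested i
    return True


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, pumping_fails_everywhere(n))
```

Because |uv| <= n forces v to lie inside the leading block of a’s, every split changes the number of a’s without touching the b’s, which is why no candidate value of n can satisfy the lemma for this language.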