TDDD65 Introduction to the Theory of Computation
Lecture 3
Gustav Nordh
Department of Computer and Information Science
gustav.nordh@liu.se
2012-09-05

Outline
Context-free Grammars
Ambiguity
Pumping Lemma
Pushdown Automata
Summary of Context-free Languages

Context-free Languages (CFL)
What can be computed with restricted access to unlimited memory?
[Figure: input tape (0 0 1 1), control, stack]
A pushdown automaton (PDA) is an NFA with a stack.

Context-free Languages (CFL)
Recall: DFAs correspond to regular expressions.
Pushdown automata (PDAs) correspond to context-free grammars (CFGs).

Context-free Languages (CFL)
Noam Chomsky (1928 -)

Context-free Grammars: Motivation
Describing (parts of) natural languages
Describing the syntax of programming languages

Example of a Context-free Grammar (CFG)
Example
S → A
S → B
A → 0A1
A → ε
B → aBa
B → bBb
B → a
B → b
B → ε
Example (the same grammar in compact notation)
S → A | B
A → 0A1 | ε
B → aBa | bBb | a | b | ε

A string is in the language
of the grammar if it can be generated by:
1. Writing down the start variable
2. Replacing a variable that is written down by the right-hand side of a rule whose left-hand side is that variable
3. Repeating Step 2 until no variable remains

For example: S ⇒ B ⇒ aBa ⇒ abBba ⇒ abba

Definition of
Context-free Grammar (CFG)
Definition
A context-free grammar (CFG) is a 4-tuple (V, Σ, R, S) where
V is a finite set of variables
Σ is a finite set of terminals, disjoint from V
R is a finite set of rules, each of the form A → w with A ∈ V and w a string of variables and terminals
S ∈ V is the start variable

The language of a CFG
If u, v, and w are strings of variables and terminals, and A → w is a rule of the grammar, we say that uAv yields uwv, written uAv ⇒ uwv.
u derives v, written u ⇒* v, if u = v or if u ⇒ u1 ⇒ u2 ⇒ ··· ⇒ uk ⇒ v for some k ≥ 0.
Definition
The language of a CFG G = (V, Σ, R, S) is {w ∈ Σ* | S ⇒* w}, written L(G).
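The yields and derives relations can be tried out on the example grammar from the earlier slides. The following is a minimal sketch, not part of the lecture: it assumes variables are single uppercase letters, writes ε as the empty string, and only follows leftmost steps (which is enough, since every derivable terminal string also has a leftmost derivation). The names RULES, yields and language_up_to are made up for this illustration. The search is cut off by counting terminals, so it is only guaranteed to terminate for grammars, like this one, where variables cannot pile up without producing terminals.

```python
from collections import deque

# Example grammar from the slides. Uppercase letters are variables,
# everything else is a terminal, and "" stands for ε.
RULES = {
    "S": ["A", "B"],
    "A": ["0A1", ""],
    "B": ["aBa", "bBb", "a", "b", ""],
}

def yields(form, rules):
    """Return every form reachable from `form` in one leftmost step uAv ⇒ uwv."""
    for i, symbol in enumerate(form):
        if symbol in rules:  # leftmost variable found
            return [form[:i] + w + form[i + 1:] for w in rules[symbol]]
    return []  # no variable left: `form` is a string of terminals

def language_up_to(rules, start, max_len):
    """Collect all w in L(G) with |w| <= max_len by breadth-first search over ⇒."""
    seen, queue, words = {start}, deque([start]), set()
    while queue:
        form = queue.popleft()
        if not any(s in rules for s in form):
            words.add(form)  # only terminals remain, so start ⇒* form
            continue
        for nxt in yields(form, rules):
            terminals = sum(1 for s in nxt if s not in rules)
            # Prune forms that already contain too many terminals.
            if terminals <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return words
```

Calling language_up_to(RULES, "S", 4) explores exactly the sentential forms that a derivation such as S ⇒ B ⇒ aBa ⇒ abBba ⇒ abba passes through, and collects the terminal strings it reaches.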
Definition
A language that is generated by some context-free grammar is called a context-free language.

The language of a CFG
Example
S → A | B
A → 0A1 | ε
B → aBa | bBb | a | b | ε
Example
The language generated by A → 0A1 | ε is L_A = {0^n 1^n | n ≥ 0}
Example
The language generated by B → aBa | bBb | a | b | ε is L_B = {s ∈ {a, b}* | s is a palindrome}

CFG, Ambiguity
Definition
A derivation of a string w in a grammar G is a leftmost derivation if at every step the leftmost remaining variable is the one being replaced.
Definition
A string s is derived ambiguously in a CFG G if it has at least two different leftmost derivations. A CFG G is ambiguous if it generates some string ambiguously.
Example
S → A | B
A → 0A1 | ε
B → aBa | bBb | a | b | ε
Is this grammar ambiguous?
Yes! The empty string has two different leftmost derivations:
S ⇒ B ⇒ ε
S ⇒ A ⇒ ε
Equivalent and unambiguous:
Example
S → A | B | ε
A → 0A1 | 01
B → aBa | bBb | a | b | aa | bb

Pushdown Automata
[Figure: input tape (0 0 1 1), control, stack]
CFL, pushdown automata, definition
Definition
A pushdown automaton (PDA) is a 6-tuple (Q, Σ, Γ, δ, q0, F) where
Q is the finite set of states
Σ is the input alphabet
Γ is the stack alphabet
δ : Q × (Σ ∪ {ε}) × (Γ ∪ {ε}) → P(Q × (Γ ∪ {ε})) is the transition function
q0 ∈ Q is the start state
F ⊆ Q is the set of accept states

CFL, pushdown automata
Pushdown automata are nondeterministic!
Theorem
There are languages recognized by PDAs that are not recognized by any deterministic PDA. For example, the language {w w^R | w ∈ {0, 1}*}.

CFL, pushdown automata
Example
Describe a pushdown automaton recognizing the language {0^n 1^n | n ≥ 0}
1. Start pushing the 0’s read onto the stack.
2. When the first 1 appears, start popping a 0 from the stack for each 1 that is read.
3. Should a 0 appear as input in this stage, then reject the string.
4. If the input is finished and 0’s remain on the stack, or if the stack is emptied before the input is finished, then reject the string.
5. Otherwise, accept the string.

CFL, pushdown automata
Theorem
A language is context-free if and only if some pushdown automaton recognizes it.

Non-CFLs
L = {0^n 1^n 2^n | n ≥ 0} is not a CFL.
Why?
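Before answering, it may help to see the five-step procedure for {0^n 1^n | n ≥ 0} written out as code. This is only a sketch that follows the informal description directly, using a Python list as the stack; it does not construct the formal 6-tuple, and the function name accepts_0n1n is made up for this illustration.

```python
def accepts_0n1n(s):
    """Follow the five-step stack procedure for {0^n 1^n | n >= 0}."""
    stack = []
    popping = False  # becomes True once the first 1 has been read
    for ch in s:
        if ch == "0":
            if popping:
                return False  # step 3: a 0 appears after the first 1
            stack.append("0")  # step 1: push each 0 that is read
        elif ch == "1":
            popping = True
            if not stack:
                return False  # step 4: stack emptied before the input is finished
            stack.pop()  # step 2: pop one 0 for each 1
        else:
            return False  # symbol outside the input alphabet {0, 1}
    return not stack  # steps 4 and 5: accept only if no 0's remain


print(accepts_0n1n("0011"))  # accepted
print(accepts_0n1n("001"))   # rejected: a 0 remains on the stack
```

Note that on an accepted input the stack is empty once the last 1 has been read: the count of 0's has been used up by the comparison.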
Imagine a PDA that recognizes L.
When reading a string, the PDA needs to keep track of the number of 0’s so that it can check that the same number of 1’s and 2’s follow.
The number of 0’s is unbounded, so the PDA needs to use its stack for this.
To check that the same number of 1’s follow, the PDA needs to empty its stack.
Now the PDA has no way of checking that the same number of 2’s follow.
This is only the intuition; a rigorous proof uses the pumping lemma.

Pumping Lemma for CFLs
Lemma
If L is a CFL, then there exists a positive integer p (the pumping length) such that every string s ∈ L with |s| ≥ p can be partitioned into five pieces, s = uvxyz, such that the following conditions hold:
|vy| > 0,
|vxy| ≤ p, and
for each i ≥ 0, u v^i x y^i z ∈ L

Summary of Context-free Languages
The context-free languages are the languages generated by context-free grammars.
A context-free grammar is ambiguous if some string can be derived using two different leftmost derivations.
A language is context-free iff it is recognized by a PDA.
There are simple languages that are not context-free, for example {0^n 1^n 2^n | n ≥ 0}.
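To connect the pumping lemma with the claim that {0^n 1^n 2^n | n ≥ 0} is not context-free, the usual argument takes s = 0^p 1^p 2^p and shows that no partition s = uvxyz satisfies all three conditions. The sketch below checks this by brute force for one small, fixed p. It is an illustration rather than a proof: a proof has to argue for every p, and the code only tests i up to a chosen bound max_i. The names in_0n1n2n and pumpable_splits are made up for this example.

```python
def in_0n1n2n(w):
    """Membership test for L = {0^n 1^n 2^n | n >= 0}."""
    n, r = divmod(len(w), 3)
    return r == 0 and w == "0" * n + "1" * n + "2" * n

def pumpable_splits(s, p, member, max_i=3):
    """Return every split s = uvxyz with |vy| > 0 and |vxy| <= p for which
    u v^i x y^i z passes `member` for all i in 0..max_i."""
    good = []
    n = len(s)
    for a in range(n + 1):                 # u = s[:a]
        end = min(n, a + p)                # enforces |vxy| <= p
        for b in range(a, end + 1):        # v = s[a:b]
            for c in range(b, end + 1):    # x = s[b:c]
                for d in range(c, end + 1):  # y = s[c:d], z = s[d:]
                    u, v, x, y, z = s[:a], s[a:b], s[b:c], s[c:d], s[d:]
                    if len(v) + len(y) == 0:
                        continue           # violates |vy| > 0
                    if all(member(u + v * i + x + y * i + z)
                           for i in range(max_i + 1)):
                        good.append((u, v, x, y, z))
    return good
```

With p = 3, pumpable_splits("000111222", 3, in_0n1n2n) finds no valid split. The reason matches the hand proof: since |vxy| ≤ p, the pieces v and y touch at most two of the three blocks, so pumping changes at most two of the three counts.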