CS21 Decidability and Tractability Lecture 6 January 15, 2016 January 15, 2016 CS21 Lecture 6 1 Outline • Context-Free Grammars and Languages – ambiguity – normal form • • equivalence of NPDAs and CFGs non context-free languages January 15, 2016 CS21 Lecture 6 2 CFG example • Arithmetic expressions over {+,*,(,),a} – (a + a) * a –a*a+a+a+a+a • A CFG generating this language: <expr> → <expr> * <expr> <expr> → <expr> + <expr> <expr> → (<expr>) | a January 15, 2016 CS21 Lecture 6 3 CFG example <expr> → <expr> * <expr> <expr> → <expr> + <expr> <expr> → (<expr>) | a • A derivation of the string: a+a*a <expr> <expr> * <expr> <expr> + <expr> * <expr> a + <expr> * <expr> a + a * <expr> a+a*a January 15, 2016 CS21 Lecture 6 4 Parse Trees • Easier way to picture derivation: parse tree <expr> <expr> <expr> a + * <expr> <expr> a a • grammar encodes grouping information; this is captured in the parse tree. January 15, 2016 CS21 Lecture 6 5 CFGs and parse trees <expr> → <expr> * <expr> <expr> → <expr> + <expr> <expr> → (<expr>) | a • Is this a good grammar for arithmetic expressions? – can group wrong way (+ precedence over *) – can also group correct way (ambiguous) January 15, 2016 CS21 Lecture 6 6 Solution to first problem <expr> → <expr> + <term> | <term> <term> → <term> * <factor> | <factor> <factor> → <term> * <factor> <factor> → (<expr>) | a • forces correct precedence in parse tree grouping – within parentheses, * cannot occur as ancestor of + in the parse tree. January 15, 2016 CS21 Lecture 6 7 Parse Trees • parse tree for a + a * a in new grammar: <expr> → <expr> + <term> | <term> <term> → <term> * <factor> | <factor> <factor> → <term> * <factor> <factor> → (<expr>) | a <expr> <expr> + <term> <term> <factor> <term> * <factor> <factor> a a a January 15, 2016 CS21 Lecture 6 8 Ambiguity • Second problem: ambiguous grammar • Definitions: – a string is derived ambiguously if it has two different parse trees – a grammar is ambiguous if its language contains an ambiguously derived string • ambiguity sometimes undesirable • some CFLS are inherently ambiguous January 15, 2016 CS21 Lecture 6 9 Ambiguity • Definition in terms of derivations (rather than parse trees): – order in which we replace terminals in shouldn’t matter (often several orders possible) – define leftmost derivation to be one in which the leftmost non-terminal is always the one replaced – a string is ambiguously derived if it has 2 leftmost derivations January 15, 2016 CS21 Lecture 6 10 Chomsky Normal Form • Useful to deal only with CFGs in a simple normal form • Most common: Chomsky Normal Form (CNF) • Definition: every production has form A → BC or S→ε A→a or where A, B, C are any non-terminals (and B, C are not S) and a is any terminal. January 15, 2016 CS21 Lecture 6 11 Chomsky Normal Form Theorem: Every CFL is generated by a CFG in Chomsky Normal Form. Proof: Transform any CFG into an equivalent CFG in CNF. Four steps: – add a new start symbol – remove “ε-productions” A→ε – eliminate “unit productions” A → B – convert remaining rules into proper form January 15, 2016 CS21 Lecture 6 12 Chomsky Normal Form • add a new start symbol – add production S0 → S • remove “ε-productions” A→ε – for each production with A on rhs, add production with A’s removed: e.g. for each rule R → uAv, add R → uv • eliminate “unit productions” A → B – for each production with B on lhs: B → u, add rule A → u January 15, 2016 CS21 Lecture 6 13 Chomsky Normal Form • convert remaining rules into proper form – replace production of form: A → u1U2u3…uk with: A → U1A1 U1 → u1 A1 → U2A2 U2 already a A2 → U3A3 U3 → u3 non-terminal : Ak-2 → Uk-1Uk Uk-1 → uk-1 Uk → uk January 15, 2016 CS21 Lecture 6 14 Some facts about CFLs • CFLs are closed under – union – concatenation – star (proof?) (proof?) (proof?) • Every regular language is a CFL – proof? January 15, 2016 CS21 Lecture 6 15 NPDA, CFG equivalence Theorem: a language L is recognized by a NPDA iff L is described by a CFG. Must prove two directions: () L is recognized by a NPDA implies L is described by a CFG. () L is described by a CFG implies L is recognized by a NPDA. January 15, 2016 CS21 Lecture 6 16 NPDA, CFG equivalence Proof of (): L is described by a CFG implies L is recognized by a NPDA. 0 # 1 an idea: A → 0A1 A→# January 15, 2016 q1 0 # 1 q2 0 # 1 q3 0 0 0 A # # 1 1 1 $ $ $ : : : CS21 Lecture 6 17 NPDA, CFG equivalence 1. we’d like to non-deterministically guess the derivation, forming it on the stack 2. then scan the input, popping matching symbol off the stack at each step 3. accept if we get to the bottom of the stack at the end of the input. what is wrong with this approach? January 15, 2016 CS21 Lecture 6 18 NPDA, CFG equivalence – only have access to top of stack – combine steps 1 and 2: • allow to match stack terminals with tape during the process of producing the derivation on the stack 0 # 1 q1 A → 0A1 A→# January 15, 2016 0 # 1 q2 0 # 1 q3 0 A # A 1 1 1 $ $ $ CS21 Lecture 6 19 NPDA, CFG equivalence • informal description of construction: – place $ and start symbol S on the stack – repeat: • if the top of the stack is a non-terminal A, pick a production with A on the lhs and substitute the rhs for A on the stack • if the top of the stack is a terminal b, read b from the tape, and pop b from the stack. • if the top of the stack is $, enter the accept state. January 15, 2016 CS21 Lecture 6 20 NPDA, CFG equivalence ε, ε → S$ one transition for each production A→w q ε, A → w b, b → ε ε, $ → r shorthand for: ε, ε → wk-1 one transition for q2 … ε, A → wk each terminal b q1 q January 15, 2016 ε, A → w = w1w2…wk CS21 Lecture 6 ε, ε → w1 qk r 21 NPDA, CFG equivalence Proof of (): L is recognized by a NPDA implies L is described by a CFG. – harder direction – first step: convert NPDA into “normal form”: • single accept state • empties stack before accepting • each transition either pushes or pops a symbol January 15, 2016 CS21 Lecture 6 22 NPDA, CFG equivalence – main idea: non-terminal Ap,q generates exactly the strings that take the NPDA from state p (w/ empty stack) to state q (w/ empty stack) – then Astart, accept generates all of the strings in the language recognized by the NPDA. January 15, 2016 CS21 Lecture 6 23 NPDA, CFG equivalence • Two possibilities to get from state p to q: generated by Ap,r generated by Ar,q stack height p input r q abcabbacacbacbacabacabbabbabaacabbbababaacaccaccccc string taking NPDA from p to q January 15, 2016 CS21 Lecture 6 24 NPDA, CFG equivalence • NPDA P = (Q, Σ, , δ, start, {accept}) • CFG G: – non-terminals V = {Ap,q : p, q Q} – start variable Astart, accept – productions: for every p, r, q Q, add the rule Ap,q → Ap,rAr,q January 15, 2016 CS21 Lecture 6 25 NPDA, CFG equivalence • Two possibilities to get from state p to q: generated by Ar,s stack height r s pop d q p push d input abcabbacacbacbacabacabbabbabaacabbbababaacaccaccccc string taking NPDA from p to q January 15, 2016 CS21 Lecture 6 26 NPDA, CFG equivalence from state p, • NPDA P = (Q, Σ, , δ, start, read{accept}) a, push d, move to state r • CFG G: – non-terminals V = {Ap,qfrom : p, qstate Q}s, read b, pop d, – start variable Astart, accept move to state q – productions: for every p, r, s, q Q, d , and a, b (Σ {ε}) if (r, d) δ(p, a, ε), and (q, ε) δ(s, b, d), add the rule Ap,q → aAr,sb January 15, 2016 CS21 Lecture 6 27 NPDA, CFG equivalence • NPDA P = (Q, Σ, , δ, start, {accept}) • CFG G: – non-terminals V = {Ap,q : p, q Q} – start variable Astart, accept – productions: for every p Q, add the rule Ap,p → ε January 15, 2016 CS21 Lecture 6 28 NPDA, CFG equivalence • two claims to verify correctness: 1. if Ap,q generates string x, then x can take NPDA P from state p (w/ empty stack) to q (w/ empty stack) 2. if x can take NPDA P from state p (w/ empty stack) to q (w/ empty stack), then Ap,q generates string x January 15, 2016 CS21 Lecture 6 29 NPDA, CFG equivalence 1. if Ap,q generates string x, then x can take NPDA P from state p (w/ empty stack) to q (w/ empty stack) – induction on length of derivation of x. – base case: 1 step derivation. must have only terminals on rhs. In G, must be production of form Ap,p → ε. January 15, 2016 CS21 Lecture 6 30 NPDA, CFG equivalence 1. if Ap,q generates string x, then x can take NPDA P from state p (w/ empty stack) to q (w/ empty stack) – assume true for derivations of length at most k, prove for length k+1. – verify case: Ap,q → Ap,rAr,q →k x = yz – verify case: Ap,q → aAr,sb →k x = ayb January 15, 2016 CS21 Lecture 6 31 NPDA, CFG equivalence 2. if x can take NPDA P from state p (w/ empty stack) to q (w/ empty stack), then Ap,q generates string x – induction on # of steps in P’s computation – base case: 0 steps. starts and ends at same state p. only has time to read empty string ε. – G contains Ap,p → ε. January 15, 2016 CS21 Lecture 6 32 NPDA, CFG equivalence 2. if x can take NPDA P from state p (w/ empty stack) to q (w/ empty stack), then Ap,q generates string x – induction step. assume true for computations of length at most k, prove for length k+1. – if stack becomes empty sometime in the middle of the computation (at state r) • y is read going from state p to r • z is read going from state r to q • conclude: Ap,q → Ap,rAr,q →* yz = x January 15, 2016 CS21 Lecture 6 (Ap,r→* y) (Ar,q→* z) 33 NPDA, CFG equivalence 2. if x can take NPDA P from state p (w/ empty stack) to q (w/ empty stack), then Ap,q generates string x – if stack becomes empty only at beginning and end of computation. • • • • first step: state p to r, read a, push d go from state r to s, read string y last step: state s to q, read b, pop d conclude: Ap,q → aAr,sb →* ayb = x January 15, 2016 CS21 Lecture 6 (Ar,s→* y) 34