CDT314 FABER Formal Languages, Automata and Models of Computation
Lecture 7
School of Innovation, Design and Engineering, Mälardalen University, 2012

Content
Midterm results
Regular vs. Non-regular Languages
Context-Free Languages
Context-Free Grammars
Derivation Trees. Ambiguity
Applications
Push-Down Automata, PDA

Midterm 1 Solution
http://www.idt.mdh.se/kurser/cd5560/12_11/examination/Duggor/MIDTERM1-20121127-Solution.pdf

A comment on Midterm 1: The Pumping Lemma for Regular Languages
The Pumping Lemma cannot be used to prove that a language is regular! It states a property that every regular language has, so it can only be used to show that a language is not regular.
An example: if something is a square, it always has four edges (a property of squares). But having proved that something has four edges does not necessarily mean that the object is a square.
http://www2.mat.ua.pt/rosalia/cadeiras/TC/pump.pdf

Time to take the next step: beyond Regular Languages
(Diagram of nested language classes.)
Non-regular languages: $\{a^n b^l c^{n+l} : n, l \ge 0\}$, $\{a^{n!} : n \ge 0\}$
Context-Free Languages: $\{a^n b^n\}$, $\{ww^R\}$
Regular Languages

Automata theory: formal languages and formal grammars
Grammar | Languages | Automaton | Production rules
Type-0 | Recursively enumerable | Turing machine | $\alpha \to \beta$ (no restrictions)
Type-1 | Context-sensitive | Linear-bounded non-deterministic Turing machine | $\alpha A \beta \to \alpha \gamma \beta$
Type-2 | Context-free | Non-deterministic pushdown automaton | $A \to \gamma$
Type-3 | Regular | Finite state automaton | $A \to a$ and $A \to aB$

Context-Free Languages
Based on C Busch, RPI, Models of Computation

Context-Free Languages: Context-Free Grammars, Pushdown Automata

Context-Free Grammars

Grammar: Formal Definition
$G = (V, T, S, P)$
$V$: set of variables
$T$: set of terminal symbols
$S$: start variable
$P$: set of production rules

Repetition: Regular Grammars
Grammar $G = (V, T, S, P)$ with variables $V$, terminal symbols $T$, start variable $S$ and productions $P$.
Right or left linear grammars. Productions of the form $A \to xB$ (right linear) or $A \to Bx$ (left linear), or $C \to x$, where $x$ is a string of terminals.

Definition: Context-Free Grammars
Grammar $G = (V, T, S, P)$ with variables $V$, terminal symbols $T$, start variable $S$ and productions $P$.
Productions of the form $A \to x$, where $x$ is a string of variables and terminals.

Regular vs.
Context-free Grammar
A regular grammar is either right or left linear, whereas a context-free* grammar allows any combination of terminals and non-terminals on the right-hand side of a production. Hence regular grammars are a subset of context-free grammars.
A grammar generating palindromes is not regular:
$S \to ABA$
$A \to$ something
$B \to$ something
*The name context-free grammar is explained by the property that productions are independent of the surrounding symbols. There are also context-sensitive grammars, where productions depend on the context (the symbols that surround variables).

Example 1: A context-free grammar $G$
$S \to aSb$
$S \to \lambda$
A derivation: $S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb$
Another derivation: $S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$
$L(G) = \{a^n b^n : n \ge 0\}$
This describes nested parentheses such as ( ( ( ( ) ) ) ).

Example 2: A context-free grammar $G$
$S \to aSa$
$S \to bSb$
$S \to \lambda$
A derivation: $S \Rightarrow aSa \Rightarrow abSba \Rightarrow abba$
Another derivation: $S \Rightarrow aSa \Rightarrow aaSaa \Rightarrow aaaSaaa \Rightarrow aaabSbaaa \Rightarrow aaabbaaa$
$L(G) = \{ww^R : w \in \{a, b\}^*\}$

Example 3: A context-free grammar $G$
$S \to aSb$
$S \to SS$
$S \to \lambda$
A derivation: $S \Rightarrow SS \Rightarrow aSbS \Rightarrow abS \Rightarrow ab$
Another derivation: $S \Rightarrow SS \Rightarrow aSbS \Rightarrow abS \Rightarrow abaSb \Rightarrow abab$
$L(G) = \{w : n_a(w) = n_b(w), \text{ and } n_a(v) \ge n_b(v) \text{ in any prefix } v\}$
This describes balanced parentheses such as ( ) ( ( ( ) ) ) ( ( ) ).

Example 4: The language $L = \{a^n b^m : n \ne m\}$ is context-free.
For the case $n > m$: $S \to AS_1$, $S_1 \to aS_1b \mid \lambda$, $A \to aA \mid a$.
For the case $n < m$: $S \to S_1B$, $S_1 \to aS_1b \mid \lambda$, $B \to bB \mid b$.
The grammar for the language $L = \{a^n b^m : n \ne m\}$ is:
$S \to AS_1 \mid S_1B$
$S_1 \to aS_1b \mid \lambda$
$A \to aA \mid a$
$B \to bB \mid b$

Definition: Context-Free Grammars
Grammar $G = (V, T, S, P)$ with variables $V$, terminal symbols $T$, start variable $S$ and productions $P$.
Productions of the form $A \to x$, where $x$ is a string of variables and terminals.

Definition: Context-Free Languages
A language $L$ is context-free if and only if there is a context-free grammar $G$ with $L = L(G)$.

Derivation Order
1. $S \to AB$   2. $A \to aaA$   3. $A \to \lambda$   4. $B \to Bb$   5. $B \to \lambda$
Leftmost derivation (the number above each arrow is the production used):
$S \stackrel{1}{\Rightarrow} AB \stackrel{2}{\Rightarrow} aaAB \stackrel{3}{\Rightarrow} aaB \stackrel{4}{\Rightarrow} aaBb \stackrel{5}{\Rightarrow} aab$

Derivation Order
1. $S \to AB$   2. $A \to aaA$   3. $A \to \lambda$   4. $B \to Bb$   5.
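$B \to \lambda$
Rightmost derivation (the number above each arrow is the production used):
$S \stackrel{1}{\Rightarrow} AB \stackrel{4}{\Rightarrow} ABb \stackrel{5}{\Rightarrow} Ab \stackrel{2}{\Rightarrow} aaAb \stackrel{3}{\Rightarrow} aab$

Grammar: $S \to aAB$, $A \to bBb$, $B \to A \mid \lambda$
Leftmost derivation: $S \Rightarrow aAB \Rightarrow abBbB \Rightarrow abAbB \Rightarrow abbBbbB \Rightarrow abbbbB \Rightarrow abbbb$
Rightmost derivation: $S \Rightarrow aAB \Rightarrow aA \Rightarrow abBb \Rightarrow abAb \Rightarrow abbBbb \Rightarrow abbbb$

Derivation Trees

A derivation can be represented in tree form.
Grammar: $S \to AB$, $A \to aaA \mid \lambda$, $B \to Bb \mid \lambda$
The tree grows with each step of the derivation:
$S \Rightarrow AB$: the root $S$ gets the children $A$ and $B$.
$S \Rightarrow AB \Rightarrow aaAB$: $A$ gets the children $a$, $a$, $A$.
$S \Rightarrow AB \Rightarrow aaAB \Rightarrow aaABb$: $B$ gets the children $B$, $b$.
$S \Rightarrow AB \Rightarrow aaAB \Rightarrow aaABb \Rightarrow aaBb$: the inner $A$ gets the child $\lambda$.
$S \Rightarrow AB \Rightarrow aaAB \Rightarrow aaABb \Rightarrow aaBb \Rightarrow aab$: the inner $B$ gets the child $\lambda$. This is the complete derivation tree.
Yield of the derivation tree: $aa\lambda\lambda b = aab$

Partial Derivation Trees
Grammar: $S \to AB$, $A \to aaA \mid \lambda$, $B \to Bb \mid \lambda$
$S \Rightarrow AB$: partial derivation tree with root $S$ and children $A$, $B$.
$S \Rightarrow AB \Rightarrow aaAB$: partial derivation tree in which $A$ has the children $a$, $a$, $A$.
The yield $aaAB$ of this partial derivation tree is a sentential form.

Sometimes, derivation order doesn't matter
Leftmost: $S \Rightarrow AB \Rightarrow aaAB \Rightarrow aaB \Rightarrow aaBb \Rightarrow aab$
Rightmost: $S \Rightarrow AB \Rightarrow ABb \Rightarrow Ab \Rightarrow aaAb \Rightarrow aab$
Both give the same derivation tree.

Ambiguity

Grammar: $E \to E + E \mid E * E \mid (E) \mid a$ (* denotes multiplication)
String: $a + a * a$
A leftmost derivation: $E \Rightarrow E + E \Rightarrow a + E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$
Its derivation tree: the root $E$ has the children $E$, $+$, $E$; the left $E$ yields $a$; the right $E$ has the children $E$, $*$, $E$, each yielding $a$.
Another leftmost derivation: $E \Rightarrow E * E \Rightarrow E + E * E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$
Its derivation tree: the root $E$ has the children $E$, $*$, $E$; the left $E$ has the children $E$, $+$, $E$, each yielding $a$; the right $E$ yields $a$.

The grammar $E \to E + E \mid E * E \mid (E) \mid a$ is ambiguous!
The string $a + a * a$ has two derivation trees, and equivalently two leftmost derivations:
$E \Rightarrow E + E \Rightarrow a + E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$
$E \Rightarrow E * E \Rightarrow E + E * E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$

Definition
A context-free grammar $G$ is ambiguous if some string $w \in L(G)$ has two or more derivation trees (two or more leftmost/rightmost derivations).

Why do we care about ambiguity?
Take $a + a * a$ with $a = 2$ and consider its two derivation trees.

Why do we care about ambiguity?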
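$2 + 2 * 2$
(The two derivation trees from before, with each leaf $a$ replaced by 2.)

Why do we care about ambiguity?
First tree ($+$ at the root, so $2 * 2 = 4$ is computed first): $2 + 2 * 2 = 6$
Second tree ($*$ at the root, so $2 + 2 = 4$ is computed first): $2 + 2 * 2 = 8$
Correct result: $2 + 2 * 2 = 6$

Ambiguity is bad for programming languages.
We want to remove ambiguity!

We fix the ambiguous grammar…
$E \to E + E \mid E * E \mid (E) \mid a$
…by introducing the variables $T$ and $F$, which encode precedence, with parentheses ( ) to indicate grouping.
Non-ambiguous grammar:
$E \to E + T$
$E \to T$
$T \to T * F$
$T \to F$
$F \to (E)$
$F \to a$

Derivation of $a + a * a$ in the non-ambiguous grammar:
$E \Rightarrow E + T \Rightarrow T + T \Rightarrow F + T \Rightarrow a + T \Rightarrow a + T * F \Rightarrow a + F * F \Rightarrow a + a * F \Rightarrow a + a * a$
Unique derivation tree for $a + a * a$: the root $E$ has the children $E$, $+$, $T$; the left $E$ derives $a$ via $T$ and $F$; the right $T$ has the children $T$, $*$, $F$, each deriving $a$.

The grammar $G$: $E \to E + T \mid T$, $T \to T * F \mid F$, $F \to (E) \mid a$ is non-ambiguous.
Every string $w \in L(G)$ has a unique derivation tree.

Inherent Ambiguity
Some context-free languages have only ambiguous grammars!
Example: $L = \{a^n b^n c^m\} \cup \{a^n b^m c^m\}$
$S \to S_1 \mid S_2$
$S_1 \to S_1 c \mid A$
$A \to aAb \mid \lambda$
$S_2 \to aS_2 \mid B$
$B \to bBc \mid \lambda$
The string $a^n b^n c^n$ has two derivation trees: one starts with $S \Rightarrow S_1$ and generates the $c$'s with $S_1 \to S_1 c$, the other starts with $S \Rightarrow S_2$ and generates the $a$'s with $S_2 \to aS_2$.

(Diagram of nested language classes.)
Non-regular languages: $\{a^n b^l c^{n+l} : n, l \ge 0\}$, $\{a^{n!} : n \ge 0\}$
Context-Free Languages: $\{a^n b^n\}$, $\{ww^R\}$
Regular Languages

Applications: Compilers

Program:
    v = 5;
    if (v>5)
        x = 12 + v;
    while (x !=3) {
        x = x - 3;
        v = 10;
    }
    ......
The Compiler translates the program into Machine Code:
    Add v,v,0
    cmp v,5
    jmplt ELSE
    THEN: add x, 12,v
    ELSE:
    WHILE: cmp x,3
    ...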
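Returning to the non-ambiguous grammar $E \to E + T \mid T$, $T \to T * F \mid F$, $F \to (E) \mid a$: the following is a minimal Python sketch of how such a grammar fixes the result of $2 + 2 * 2$. It uses one function per variable, so $*$ is always grouped below $+$ in the tree. Several things here are assumptions and not part of the slides: the terminal $a$ is taken to be an integer literal, the left-recursive rules are replaced by loops (a common way to parse them top-down), and the names `tokenize` and `evaluate` are hypothetical.

```python
import re


def tokenize(text):
    """Split the input into integer literals and the symbols + * ( )."""
    tokens = re.findall(r"\d+|[+*()]", text)
    # Reject anything that is not a digit, +, *, ( or ) (whitespace is ignored).
    if "".join(tokens) != "".join(text.split()):
        raise ValueError(f"unexpected character in {text!r}")
    return tokens


def evaluate(text):
    """Evaluate text according to E -> E + T | T, T -> T * F | F, F -> (E) | a."""
    tokens = tokenize(text)
    pos = 0  # index of the next unread token

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_e():
        # E -> E + T | T, written as a loop: T (+ T)*
        nonlocal pos
        value = parse_t()
        while peek() == "+":
            pos += 1
            value += parse_t()
        return value

    def parse_t():
        # T -> T * F | F, written as a loop: F (* F)*
        nonlocal pos
        value = parse_f()
        while peek() == "*":
            pos += 1
            value *= parse_f()
        return value

    def parse_f():
        # F -> (E) | a, where a is assumed to be an integer literal
        nonlocal pos
        token = peek()
        if token == "(":
            pos += 1
            value = parse_e()
            if peek() != ")":
                raise ValueError("expected ')'")
            pos += 1
            return value
        if token is not None and token.isdigit():
            pos += 1
            return int(token)
        raise ValueError(f"unexpected token {token!r}")

    value = parse_e()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return value


print(evaluate("2 + 2 * 2"))    # the multiplication is grouped first
print(evaluate("(2 + 2) * 2"))  # parentheses force the other grouping
```

With this structure the grouping that gave 8 can only be obtained by writing the parentheses explicitly.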
Compiler
input program → Lexical analyzer → parser → output machine code

A parser "knows" the grammar of the programming language.

Parser
PROGRAM → STMT_LIST
STMT_LIST → STMT; STMT_LIST | STMT;
STMT → EXPR | IF_STMT | WHILE_STMT | { STMT_LIST }
EXPR → EXPR + EXPR | EXPR - EXPR | ID
IF_STMT → if (EXPR) then STMT | if (EXPR) then STMT else STMT
WHILE_STMT → while (EXPR) do STMT

The parser finds the derivation of a particular input.
Grammar: $E \to E + E \mid E * E \mid INT$
Input: 10 + 2 * 5
Derivation: $E \Rightarrow E + E \Rightarrow E + E * E \Rightarrow 10 + E * E \Rightarrow 10 + 2 * E \Rightarrow 10 + 2 * 5$
Derivation tree: the root $E$ has the children $E$, $+$, $E$; the left $E$ yields 10; the right $E$ has the children $E$, $*$, $E$, yielding 2 and 5.
Machine code produced from the derivation tree:
    mult a, 2, 5
    add b, 10, a

Parsing examples

Parser: input string and grammar → derivation

Example:
Input: $aabb$
Grammar: $S \to SS$, $S \to aSb$, $S \to bSa$, $S \to \lambda$
Derivation: ?

Exhaustive Search
Grammar: $S \to SS \mid aSb \mid bSa \mid \lambda$. Find a derivation of $aabb$.
Phase 1: all possible derivations of length 1.
$S \Rightarrow SS$
$S \Rightarrow aSb$
$S \Rightarrow bSa$
$S \Rightarrow \lambda$
Of these, $S \Rightarrow bSa$ and $S \Rightarrow \lambda$ cannot lead to $aabb$ and are dropped.
Phase 2: extend the remaining derivations from Phase 1 by one step.
$S \Rightarrow SS \Rightarrow SSS$
$S \Rightarrow SS \Rightarrow aSbS$
$S \Rightarrow SS \Rightarrow bSaS$
$S \Rightarrow SS \Rightarrow S$
$S \Rightarrow aSb \Rightarrow aSSb$
$S \Rightarrow aSb \Rightarrow aaSbb$
$S \Rightarrow aSb \Rightarrow abSab$
$S \Rightarrow aSb \Rightarrow ab$
Of these, the derivations ending in $bSaS$, $abSab$ and $ab$ cannot lead to $aabb$ and are dropped.
Phase 3: $S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb$

Final result of exhaustive search (top-down parsing)
Input: $aabb$
Grammar: $S \to SS$, $S \to aSb$, $S \to bSa$, $S \to \lambda$
Derivation: $S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb$

Another use of context-free grammars: Context Free Art
http://www.contextfreeart.org/index.html

Context-Free Languages: Context-Free Grammars, Pushdown Automata (stack, automaton)

Pushdown Automata
PDAs

Pushdown Automaton - PDA
Components: Input String, Stack, States

The Stack
A PDA can write symbols on a stack and read them later on.
PUSH: writing a symbol
POP: reading a symbol
All access to the stack is only at the top! (The stack top is written leftmost in the string, e.g. yxz.)
A stack is valuable as it can hold an unlimited amount of information.
The stack allows pushdown automata to recognize some non-regular languages.
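Returning to the exhaustive search (top-down parsing) for the input $aabb$: the phases can be sketched in Python as a breadth-first search over sentential forms. This is a sketch under stated assumptions rather than the method of the slides: the leftmost variable is always the one expanded, a form is dropped when its terminals can no longer match the input, forms that were already seen are not expanded again, and `max_phases` is a hypothetical cut-off that keeps the search finite. The names `PRODUCTIONS`, `can_still_derive` and `exhaustive_search` are also hypothetical.

```python
# Grammar S -> SS | aSb | bSa | lambda; the empty string "" stands for lambda.
PRODUCTIONS = {"S": ["SS", "aSb", "bSa", ""]}


def can_still_derive(form, target, variables):
    """Cheap test that rules out sentential forms that cannot produce target."""
    terminal_count = sum(1 for symbol in form if symbol not in variables)
    if terminal_count > len(target):
        return False  # terminals are never removed again
    first_var = next((i for i, s in enumerate(form) if s in variables), None)
    if first_var is None:
        return form == target  # no variables left: must be the input itself
    # The terminals before the leftmost variable are final.
    return target.startswith(form[:first_var])


def exhaustive_search(target, productions=PRODUCTIONS, start="S", max_phases=8):
    """Return a derivation of target as a list of sentential forms, or None."""
    variables = set(productions)
    frontier = [[start]]  # derivations found in the previous phase
    seen = {start}
    for _ in range(max_phases):
        next_frontier = []
        for derivation in frontier:
            form = derivation[-1]
            # Expand the leftmost variable with every production.
            i = next(k for k, s in enumerate(form) if s in variables)
            for rhs in productions[form[i]]:
                new_form = form[:i] + rhs + form[i + 1:]
                if new_form in seen:
                    continue
                if not can_still_derive(new_form, target, variables):
                    continue
                seen.add(new_form)
                new_derivation = derivation + [new_form]
                if new_form == target:
                    return new_derivation
                next_frontier.append(new_derivation)
        frontier = next_frontier
    return None  # nothing found within max_phases phases


print(exhaustive_search("aabb"))
```

For the input $aabb$ the search stops in the third phase. Because the number of candidate sentential forms can grow quickly with the length of the input, exhaustive search is mainly useful as an illustration.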
The States
A transition from $q_1$ to $q_2$ is labelled $a, b / c$:
$a$: input symbol
$b$: pop symbol (the old stack symbol that is read)
$c$: push symbol (the new stack symbol that is written)

Replace: transition $a, b / c$ from $q_1$ to $q_2$
Input: $a$. Stack before (top first): b h e \$. Stack after: c h e \$.
(An alternative is to start and finish with an empty stack.)

Push: transition $a, \lambda / c$ from $q_1$ to $q_2$
Input: $a$. Stack before (top first): b h e \$. Stack after: c b h e \$.

Pop: transition $a, b / \lambda$ from $q_1$ to $q_2$
Input: $a$. Stack before (top first): b h e \$. Stack after: h e \$.

No change: transition $a, \lambda / \lambda$ from $q_1$ to $q_2$
Input: $a$. Stack before (top first): b h e \$. Stack after: b h e \$.

Formal Definition
A pushdown automaton is defined as a 7-tuple $M = (Q, \Sigma, \Gamma, \delta, q_0, z, F)$
$Q$: states
$\Sigma$: input alphabet
$\Gamma$: stack alphabet
$\delta$: transition function
$q_0$: start state
$z$: stack start symbol
$F$: final states

Example 3.7 Salling: A PDA for simple nested parenthesis strings
Input: ( ( ( ) ) )
States: $s$ (start) and $q$ (end)
Transitions:
$s \to s$ on $(, \lambda / ($
$s \to q$ on $), ( / \lambda$
$q \to q$ on $), ( / \lambda$
(The slides for Time 0 to Time 7 show the input being read one symbol at a time: each "(" is pushed onto the stack in state $s$, and each ")" pops one "(" from the stack, moving to or staying in state $q$.)
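The run of the PDA in Example 3.7 can be sketched in Python as follows. The transition table is taken from the example; the acceptance condition (input consumed, state $q$, empty stack) is an assumption based on the "end" label and on the remark about starting and finishing with an empty stack. The names `TRANSITIONS`, `ACCEPTING` and `run_pda` are hypothetical, and the sketch relies on this particular PDA being deterministic.

```python
# (state, input symbol, popped symbol) -> (next state, pushed symbols)
# The empty string "" plays the role of lambda.
TRANSITIONS = {
    ("s", "(", ""): ("s", "("),   # (, lambda / (   push
    ("s", ")", "("): ("q", ""),   # ), ( / lambda   pop
    ("q", ")", "("): ("q", ""),   # ), ( / lambda   pop
}
ACCEPTING = {"q"}  # assumption: the state marked "end" is accepting


def run_pda(word):
    """Return (accepted, trace); trace lists (state, stack) after each step.

    The stack is written as a string with its top leftmost.
    """
    state = "s"
    stack = []  # the top of the stack is the end of the list
    trace = [(state, "")]
    for symbol in word:
        top = stack[-1] if stack else None
        if top is not None and (state, symbol, top) in TRANSITIONS:
            stack.pop()  # the transition reads (pops) the top symbol
            state, pushed = TRANSITIONS[(state, symbol, top)]
        elif (state, symbol, "") in TRANSITIONS:
            state, pushed = TRANSITIONS[(state, symbol, "")]
        else:
            return False, trace  # no transition applies: reject
        stack.extend(reversed(pushed))
        trace.append((state, "".join(reversed(stack))))
    return state in ACCEPTING and not stack, trace


accepted, trace = run_pda("((()))")
print(accepted)
for time, (state, stack) in enumerate(trace):
    print(f"Time {time}: state {state}, stack '{stack}'")
```

The printed trace starts at Time 0 with an empty stack, before any input symbol has been read; whether this numbering matches the Time 0 to Time 7 slides exactly is not certain from the text.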