Context-Free Grammars: Definition, Derivation, and Examples

CONTEXT-FREE GRAMMAR

Definition of Context-free Grammar: A context-free grammar G is a 4-tuple (quadruple) G = (V, T, S, P) where
V = finite set of objects called variables
T = finite set of objects called terminal symbols
S ∈ V = start variable
P = finite set of production rules, each pairing a variable with a string of variables and terminals

A production rule in P is of the form X → y, where X is a variable and y is a string of symbols from (V ∪ T)*.
• Given a string w of the form w = uxv, we can use the production rule x → y and obtain a new string z = uyv; this step is written w ⇒ z.
• The set of all terminal strings obtained from S by using production rules is the "language" generated by the grammar.
• If G = (V, T, S, P), then L(G) = {w ∈ T* : S ⇒* w}, where ⇒* denotes zero or more derivation steps.
• If w ∈ L(G), then the sequence S ⇒ w1 ⇒ w2 ⇒ w3 ⇒ … ⇒ wn ⇒ w is a "derivation" of the sentence w.
• The strings S, w1, w2, …, wn, which contain variables as well as terminals, are called "sentential forms" of the derivation.

[Figure not reproduced: a sample grammar for S and one of its derivations.]

• String Generators: Grammars specify languages by generating strings in the language using production rules, e.g. S → aBb, B → bBa | Sa, etc.
• Pattern Recognizers: Grammars can be viewed as a notation for describing a family of recognition algorithms.
• Context-freeness: A context-free grammar allows the following:
  – An A-rule can be applied whenever A occurs in a string, irrespective of the context (that is, the non-terminals and terminals around A).

For the grammar S → aSb, S → λ:
S ⇒ aSb ⇒ ab (a^1 b^1)
S ⇒ aSb ⇒ aaSbb ⇒ aabb (a^2 b^2)
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb (a^3 b^3)
…
In general S ⇒* a^n b^n, and S ⇒ λ gives the case n = 0, so L(G) = {a^n b^n : n ≥ 0}.

Example: Given a grammar G = ({S}, {a, b}, S, P) with P defined as S → aSb, S → λ,
(i) obtain a sentence in the language generated by G and a sentential form;
(ii) obtain the language L(G).
Solution
(i) S ⇒ aSb ⇒ ab
S ⇒ aSb ⇒ aaSbb ⇒ aabb
Therefore we have S ⇒* aabb, so a sentence in the language generated by G is aabb. A sentential form of this derivation is aaSbb.
(ii) The rule S → aSb is recursive.
All sentential forms that still contain S have the form w_i = a^i S b^i. Applying the production rule S → aSb, we get a^i S b^i ⇒ a^(i+1) S b^(i+1). This is true for all i. In order to get a sentence we apply S → λ. Therefore we get S ⇒* a^n S b^n ⇒ a^n b^n, and hence L(G) = {a^n b^n : n ≥ 0} (n = 0 gives the empty string λ).

Example: Given G1 = ({A, S}, {a, b}, S, P1) with P1 defined by the production rules
S → aAb | λ
A → aAb | λ
(i) show that L(G1) = {a^n b^n : n ≥ 0};
(ii) show that G1 is equivalent to G, where G = ({S}, {a, b}, S, P) and P is given by S → aSb, S → λ.
Solution
Given P1 as S → aAb | λ; A → aAb | λ:
S ⇒ λ
S ⇒ aAb ⇒ aλb = ab
S ⇒ aAb ⇒ aaAbb ⇒ aabb, i.e. a^2 b^2, and so on.
Therefore L(G1) = {a^n b^n : n ≥ 0}.
Given G = ({S}, {a, b}, S, P) where P is S → aSb, S → λ. The rule S → aSb is recursive. All sentential forms that still contain S have the form w_i = a^i S b^i. Applying the production rule S → aSb, we get a^i S b^i ⇒ a^(i+1) S b^(i+1). This is true for all i. In order to get a sentence, we apply S → λ. Therefore we get S ⇒* a^n S b^n ⇒ a^n b^n. Hence L(G) = {a^n b^n : n ≥ 0}.
Hence G1 is equivalent to G, as both grammars generate {a^n b^n : n ≥ 0}.

Example: Given a grammar G defined by the production rules S → AB, A → Aa, B → Bb, A → a, B → b, show that the word w = a^2 b^4 ∈ L(G), where L(G) is the language generated by G.
Solution
S ⇒ AB ⇒ AaB ⇒ aaB ⇒ aaBb ⇒ aaBbb ⇒ aaBbbb ⇒ aabbbb, i.e. a^2 b^4.
Hence the word w = a^2 b^4 ∈ L(G).

Question: Suppose a context-free grammar G = ({S, A}, {a, b}, P, S) has the following production rules: S → aSb | aAb, A → bAa, A → ba. Determine its language.
Solution:
S ⇒ aAb ⇒ abab
S ⇒ aSb ⇒ aaAbb ⇒ aababb (using S → aAb in the second step)
S ⇒ aSb ⇒ aaSbb ⇒ aaaAbbb ⇒ aaababbb
The rules for S wrap n ≥ 1 matching pairs a…b around A, and A generates b^m a^m with m ≥ 1 (for example S ⇒ aAb ⇒ abAab ⇒ abbaab). Thus L = {a^n b^m a^m b^n : n ≥ 1, m ≥ 1}.

Example: Give a simple description of the language generated by the grammar with productions
(a) S → aA, A → bS, S → λ
(b)
S → Aa, A → B, B → Aa
Solution
(a) For the given production rules:
S ⇒ λ
S ⇒ aA ⇒ abS ⇒ ab
S ⇒ aA ⇒ abS ⇒ abaA ⇒ ababS ⇒ abab
S ⇒ aA ⇒ abS ⇒ abaA ⇒ ababS ⇒ ababaA ⇒ abababS ⇒ ababab, etc.
So the language is L = {(ab)^n | n ≥ 0}.
(b) For the given production rules:
S ⇒ Aa ⇒ Ba ⇒ Aaa ⇒ Baa ⇒ Aaaa ⇒ Baaa ⇒ Aaaaa ⇒ …
No derivation ever terminates in a string of terminals, so the grammar generates the empty language, L = ∅.

Right-Linear Grammars
• In a right-linear grammar, all productions have one of the two forms V → T*V or V → T*, i.e. the LHS must be a single variable and the RHS consists of any number of terminals (members of T) optionally followed by a single variable, e.g. A → xyzB | xB | λ.
• The following automaton and right-linear grammar both recognize the set of strings consisting of an even number of 0's and an even number of 1's.
[Figure not reproduced: the automaton and the right-linear grammar.]

Right-Linear Grammars and NFAs
• This is another right-linear grammar: A → a, A → aB, A → λ, where A, B ∈ V and a is a terminal.

Left-Linear Grammars
• In a left-linear grammar, all productions have one of the two forms V → VT* or V → T*, i.e. the LHS must consist of a single variable, and the RHS consists of an optional single variable followed by any number of terminals, e.g. A → a, A → Ba, A → λ, where A, B ∈ V and a is a terminal.

Example: Determine the context-free languages for the grammars with the following productions, where the start variable is S and the terminals are {a, b}:
(a) S → aSa, S → bSb, S → λ
(b) S → abB, A → aaBb, B → bbAa, A → λ
Solution
(a) S ⇒ aSa ⇒ aaSaa ⇒ aabSbaa ⇒ aabbaa
Each step adds the same symbol at both ends, so the language is L(G) = {ww^R : w ∈ {a, b}*}, the even-length palindromes over {a, b}.
(b) S ⇒ abB ⇒ abbbAa ⇒ abbbaaBba ⇒ abbbaabbAaba ⇒ abbbaabbaaBbaba ⇒ abbbaabbaabbAababa ⇒ abbbaabbaabbababa
The language is L(G) = {ab(bbaa)^n bba(ba)^n : n ≥ 0}.

DERIVATION TREES
A "derivation tree" is an ordered tree in which the nodes are labeled with the left sides of productions and in which the children of a node represent its corresponding right sides.

Definition of a Derivation Tree
Let G = (V, T, S, P) be a CFG.
An ordered tree is a derivation tree for G iff (if and only if) it has the following properties:
i. The root of the derivation tree is labeled S.
ii. Every leaf in the tree has a label from T ∪ {λ}.
iii. Every interior vertex (a vertex which is not a leaf) has a label from V.
iv. If a vertex has label A ∈ V, and its children are labeled (from left to right) a1, a2, …, an, then P must contain a production of the form A → a1 a2 … an.
v. A leaf labeled λ has no siblings; that is, a vertex with a child labeled λ can have no other children.

Sentential Form
For a given CFG with productions S → aA, A → aB, B → bB, B → a, the derivation tree is as shown below:
[Figure not reproduced: derivation tree.]

Right Most/Left Most/Mixed Derivation
Consider the grammar G with productions
1. S → aSS
2. S → b
Leftmost derivation:
S ⇒ aSS ⇒ aaSSS ⇒ aabSS ⇒ aabaSSS ⇒ aababSS ⇒ aababbS ⇒ aababbb
The sequence of rules followed is "1121222".
Mixed derivation:
S ⇒ aSS ⇒ aSb ⇒ aaSSb ⇒ aabSb ⇒ aabaSSb ⇒ aabaSbb ⇒ aababbb
The sequence of rules followed is "1212122".
Rightmost derivation:
S ⇒ aSS ⇒ aSb ⇒ aaSSb ⇒ aaSaSSb ⇒ aaSaSbb ⇒ aaSabbb ⇒ aababbb
The sequence of rules followed is "1211222".

A grammar G is context-free and has the productions S → aAB, A → Bba, B → bB, B → c.
(i) Derive the word acbabc.
(ii) Obtain the derivation tree.
Solution:
(i) The word w = acbabc is derived as follows:
S ⇒ aAB ⇒ a(Bba)B ⇒ acbaB ⇒ acba(bB) ⇒ acbabc
(ii) [Figure not reproduced: derivation tree for acbabc.]

Exercises:
• A CFG is given by the productions S → a, S → aAS, A → bS. Obtain the derivation tree of the word w = abaabaa.
• Given a CFG G = (N, T, P, S) with N = {S}, T = {a, b}, P = {S → aSb, S → ab}, obtain the derivation tree and the language generated, L(G).
• Given G = (N, T, P, S) with N = {E}, S = E, T = {id, +, *, c} and the productions E → E + E, E → E * E, E → E, E → id, obtain the derivation tree.
• Given a CFG G = (N, T, P, S) with N = {S, A}, T = {a, b} and the productions S → aS, S → aA, A → bA, A → b, obtain the derivation tree and L(G).

Question: Sketch the derivation tree for the CFG given by S → aA, A → aB, B → bB, B → a.
Solution: [Figure not reproduced: derivation tree.] Every derivation in this grammar has the form S ⇒ aA ⇒ aaB ⇒ aabB ⇒ … ⇒ aab^n a, so the tree is a single chain S, A, B, B, …, B descending to the right, with one terminal leaf hanging off each variable.

Given a grammar G with production rules S → aB, S → bA, A → aS, A → bAA, A → a, B → bS, B → aBB, B → b, obtain (i) a leftmost derivation and (ii) a rightmost derivation for the string "aaabbabbba".
Solution
(i) Leftmost derivation:
S ⇒ aB ⇒ aaBB ⇒ aaaBBB ⇒ aaabBB ⇒ aaabbB ⇒ aaabbaBB ⇒ aaabbabB ⇒ aaabbabbS ⇒ aaabbabbbA ⇒ aaabbabbba
(ii) Rightmost derivation:
S ⇒ aB ⇒ aaBB ⇒ aaBbS ⇒ aaBbbA ⇒ aaBbba ⇒ aaaBBbba ⇒ aaaBbSbba ⇒ aaaBbaBbba ⇒ aaaBbabbba ⇒ aaabbabbba

Example: Let G = (V, Σ, P, S) be the CFG G = ({S}, {a, b}, {S → λ, S → aSb}, S).
i. Show that L(G) = {a^n b^n | n ≥ 0}.
ii. Draw the derivation tree for aabb.
i. S ⇒ aSb ⇒ aaSbb ⇒ aabb
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaaSbbbb ⇒ aaaabbbb
Thus L(G) = {a^n b^n | n ≥ 0}.
ii. The derivation tree for aabb, written in bracket form with each variable followed by its children, is:
S[ a S[ a S[ λ ] b ] b ]

G = ({S, A, B}, {a, b}, {S → AB, A → aA | λ, B → Bb | λ}, S)
L(G) = L(a*b*)
Leftmost derivation: S ⇒ AB ⇒ aAB ⇒ aB ⇒ aBb ⇒ ab
Rightmost derivation: S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aAb ⇒ ab
Derivation tree (bracket form): S[ A[ a A[ λ ] ] B[ B[ λ ] b ] ]

More Examples of CFGs and CFLs
• S → aSa | aBa, B → bB | b
  L(S) = {a^n b^m a^n : n, m > 0}
• S → aSa | B, B → bB | λ
  L(S) = {a^n b^m a^n | n ≥ 0 ∧ m ≥ 0}
• S → abSc | λ
  L(S) = {(ab)^n c^n | n ≥ 0}
• S → AB, A → aA | a, B → bB | λ, and equivalently S → aS | aB, B → bB | λ
  L(S) = {a^n b^m | m ≥ 0, n > 0} = L(a+ b*)
• S → AbAbA, A → aA | λ, and equivalently S → aS | B, B → bA, A → aA | bC, C → aC | λ
  L(S) = L(a*ba*ba*), the strings over {a, b} with exactly two b's
• In the next two grammars the start variable is E, reached after an even count, and O is reached after an odd count.
  E → λ | aO | bO, O → aE | bE, and equivalently E → λ | aaE | abE | baE | bbE
  L(S) = {w ∈ {a, b}* | length(w) is even}
  E → λ | aE | bO, O → aO | bE
  L(S) = {w ∈ {a, b}* | w has an even number of b's}

Example: Given the grammar G with start variable A and the following productions:
A → AbA
A → B
B → aBa
B → b
Derive the string aabaababa.
Solution:
A ⇒ AbA ⇒ BbA ⇒ aBabA ⇒ aaBaabA ⇒ aabaabA ⇒ aabaabB ⇒ aabaabaBa ⇒ aabaababa

Consider the grammar G = (V, T, P, E) where V = {E, N}, T = {+, *, (, ), 0, 1}, and P contains the following productions:
E → E + E | E * E | (E) | N
N → 0N | 1N | 0 | 1
All the following words are in the language L(G):
0
0 * 1 + 111
(1 + 1) * 0
(1 * 1) + (((0000)) * 1111)
For instance, (1 + 1) * 0 is derived by
E ⇒ E * E ⇒ (E) * E ⇒ (E + E) * E ⇒* (N + N) * N ⇒* (1 + 1) * 0
[Figure not reproduced: derivation tree for 01 + (1 * 0).]
Leftmost derivation:
E ⇒ E + E ⇒ N + E ⇒ 0N + E ⇒ 01 + E ⇒ 01 + (E) ⇒ 01 + (E * E) ⇒ 01 + (N * E) ⇒ 01 + (1 * E) ⇒ 01 + (1 * N) ⇒ 01 + (1 * 0)
Rightmost derivation:
E ⇒ E + E ⇒ E + (E) ⇒ E + (E * E) ⇒ E + (E * N) ⇒ E + (E * 0) ⇒ E + (N * 0) ⇒ E + (1 * 0) ⇒ N + (1 * 0) ⇒ 0N + (1 * 0) ⇒ 01 + (1 * 0)
• A leftmost derivation expands the variables in the order in which a depth-first traversal of the tree from left to right encounters them.
• A rightmost derivation corresponds to the depth-first traversal from right to left.

Ambiguity in Context-free Grammars (CFGs) and Context-free Languages (CFLs)
• A context-free grammar G is called ambiguous if some word has more than one leftmost derivation (equivalently: more than one derivation tree).
• Otherwise the grammar is unambiguous.
E.g. in the grammar above, the word 1 + 0 + 1 has the following two leftmost derivations:
• E ⇒ E + E ⇒ E + E + E ⇒* 1 + E + E ⇒* 1 + 0 + E ⇒* 1 + 0 + 1
• E ⇒ E + E ⇒* 1 + E ⇒ 1 + E + E ⇒* 1 + 0 + E ⇒* 1 + 0 + 1
These correspond to different derivation trees; thus the grammar is ambiguous.

Ambiguity in CFGs
Example:
S → AS | λ
A → A1 | 0A1 | 01
The input string 00111 can be derived in two ways:
Leftmost derivation #1: S ⇒ AS ⇒ 0A1S ⇒ 0A11S ⇒ 00111S ⇒ 00111
Leftmost derivation #2: S ⇒ AS ⇒ A1S ⇒ 0A11S ⇒ 00111S ⇒ 00111

• The grammar G1 = ({S}, {a, b},
P1, S), where P1 contains the productions S → aSb | aaS | ε, is ambiguous because the word aaab has two different leftmost derivations:
S ⇒ aaS ⇒ aaaSb ⇒ aaab and S ⇒ aSb ⇒ aaaSb ⇒ aaab.
• The language {a^(2k+n) b^n | k, n ≥ 0} it generates is not inherently ambiguous, because it is generated by the equivalent unambiguous grammar ({S, A}, {a, b}, P11, S) with productions S → aSb | A, A → aaA | ε.
Note: ε and λ are used synonymously; both denote the empty string.

Why does ambiguity matter?
Given E → E + E | E * E | (E) | a | b | c | 0 | 1, derive the string a * b + c.
LM derivation #1: E ⇒ E + E ⇒ E * E + E ⇒ a * E + E ⇒ a * b + E ⇒ a * b + c
Its parse tree has + at the root and groups the string as (a * b) + c.
LM derivation #2: E ⇒ E * E ⇒ a * E ⇒ a * E + E ⇒ a * b + E ⇒ a * b + c
Its parse tree has * at the root and groups the string as a * (b + c).
The calculated value depends on which of the two parse trees is actually used, and in general (a * b) + c and a * (b + c) have different values.

Removing Ambiguity in Expression Evaluations
• It may be possible to remove ambiguity for some CFLs
  – e.g. in a CFG for expression evaluation, by imposing rules and restrictions such as precedence
  – this implies a rewrite of the grammar
Order of precedence: (), *, +
Ambiguous version:
E → E + E | E * E | (E) | a | b | c | 0 | 1
Modified/unambiguous version:
E → E + T | T
T → T * F | F
F → I | (E)
I → a | b | c | 0 | 1

Inherently Ambiguous CFLs
• However, for some languages, it may not be possible to remove ambiguity.
• A CFL is said to be inherently ambiguous if every CFG that describes it is ambiguous.
Example: L = {a^n b^n c^m d^m | n, m ≥ 1} ∪ {a^n b^m c^m d^n | n, m ≥ 1}
L is inherently ambiguous. This can be proved using the input string a^n b^n c^n d^n. [The proof is beyond the scope of this course; it will be done in Theory of Computing (in Level 400).]

Converting from Grammars to Finite Automata
Convert the following grammar to a finite automaton:
S → aA | cF
A → bB | bA
B → λ
F → λ
Solution: the automaton has one state per variable (S, A, B, F) with start state S, and one transition per production of the form X → aY:
S --a--> A
S --c--> F
A --b--> B
A --b--> A
The accepting states are B and F, the variables with a λ-production.

Convert the following grammars to finite automata.
Right-linear grammar:
S → aA | cF
A → bB | bA
B → λ
F → λ
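Left-linear grammar:
S → λ
A → Sa | Ab
B → Ab
F → Sc
Z → B | F
Solution: both grammars describe the same automaton, with states S, A, B, F, start state S, transitions S --a--> A, S --c--> F, A --b--> A, A --b--> B, and accepting states B and F. In the left-linear grammar, Z is the start variable; S → λ marks the start state and Z → B | F marks the accepting states.

Converting from Finite Automata to Grammars
Note: λ and ε are used interchangeably; both denote the empty string and are not input symbols.
[Figure not reproduced: the automaton being converted.]
The resulting grammar is:
A → aA | bC | aW
C → cC | ε
W → cX
X → ε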
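The grammar-to-automaton conversion can be sketched in a few lines of Python. This is a minimal illustration, not the course's own procedure: it assumes every production has the restricted shape X → aY (one terminal followed by one variable) or X → λ, and the names `RULES`, `grammar_to_nfa`, and `accepts` are made up for this example. It uses the right-linear grammar S → aA | cF, A → bB | bA, B → λ, F → λ from the conversion example.

```python
from collections import defaultdict

# Right-linear grammar from the conversion example.
# Each alternative is (terminal, next_variable), or None for a λ-rule.
# Assumption: only rules of the shape X -> aY or X -> λ are supported.
RULES = {
    "S": [("a", "A"), ("c", "F")],
    "A": [("b", "B"), ("b", "A")],
    "B": [None],
    "F": [None],
}

def grammar_to_nfa(rules):
    """Return (delta, accepting): one state per variable, one transition per X -> aY."""
    delta = defaultdict(set)
    accepting = set()
    for variable, alternatives in rules.items():
        for alternative in alternatives:
            if alternative is None:
                # X -> λ makes X an accepting state.
                accepting.add(variable)
            else:
                terminal, target = alternative
                delta[(variable, terminal)].add(target)
    return delta, accepting

def accepts(delta, accepting, start, word):
    """Simulate the NFA by tracking the set of states reachable so far."""
    current = {start}
    for symbol in word:
        current = {t for state in current for t in delta.get((state, symbol), ())}
    return bool(current & accepting)

delta, accepting = grammar_to_nfa(RULES)
for w in ["c", "ab", "abbb", "a", "ac", ""]:
    print(repr(w), accepts(delta, accepting, "S", w))
```

Tracking a set of current states handles the nondeterminism in A → bB | bA directly, so no conversion to a DFA is needed for a membership test.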

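The definition of ambiguity (some word has more than one leftmost derivation) can be checked for a single word by brute force. The sketch below is only an illustration under stated assumptions: every grammar symbol is a single character, variables are exactly the dictionary keys, the empty string stands for λ, and the search is cut off after `max_steps` derivation steps. The function and variable names are hypothetical. It uses the grammar S → aSb | aaS | λ and the word aaab from the ambiguity section, together with the equivalent grammar S → aSb | A, A → aaA | λ.

```python
def count_leftmost_derivations(rules, start, word, max_steps=20):
    """Count leftmost derivations of `word`, giving up on a branch after max_steps steps."""
    def explore(form, steps):
        # Position of the leftmost variable in the sentential form, if any.
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            return 1 if form == word else 0
        terminals = sum(1 for s in form if s not in rules)
        # Prune: too many steps, too many terminals, or a terminal prefix
        # that already disagrees with the target word.
        if steps >= max_steps or terminals > len(word) or form[:idx] != word[:idx]:
            return 0
        return sum(
            explore(form[:idx] + rhs + form[idx + 1:], steps + 1)
            for rhs in rules[form[idx]]
        )
    return explore(start, 0)

# "" encodes the empty string λ.
AMBIGUOUS = {"S": ["aSb", "aaS", ""]}
UNAMBIGUOUS = {"S": ["aSb", "A"], "A": ["aaA", ""]}

print(count_leftmost_derivations(AMBIGUOUS, "S", "aaab"))
print(count_leftmost_derivations(UNAMBIGUOUS, "S", "aaab"))
```

A count above 1 for some word shows that the grammar is ambiguous. A count of 1 for one word proves nothing about the grammar as a whole, since unambiguity is a statement about every word in the language.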