Context-free grammars

Context free languages
1. Equivalence of context free
grammars
2. Normal forms
Context-free grammars

In a context-free grammar, all
productions are of the form
A -> w,
where A is a nonterminal (possibly the
start symbol S), and w is a string from
(N ∪ T)*
Handles, recursive productions



In the production A -> xw,
the prefix x, if a single symbol, is called the
handle of the production, whether x is in N
or T
A production
A -> Aw
is called left-recursive
A production
A -> wA
is called right-recursive
Repeated sentential forms

If a sentential form wAx occurs more than
once in a derivation,
S-> … -> wAx -> … -> wAx -> …
it is called a repeated sentential form. All
the intervening steps are wasted steps.
Leftmost derivations
Minimal leftmost derivations


A derivation is a leftmost derivation
if at each step only the leftmost
nonterminal symbol is replaced using
some rule of the grammar.
A leftmost derivation is called minimal
if no sentential form is repeated in the
derivation
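
A minimal Python sketch of these two definitions. It assumes single-character symbols with uppercase letters as nonterminals (a convention chosen here, not part of the slides). The function names and the grammar S->A, A->S, A->a are hypothetical, used only to produce a repeated sentential form.

```python
def leftmost_derivation(start, steps):
    """Apply each (lhs, rhs) rule in `steps` to the leftmost nonterminal.

    Returns the list of sentential forms; raises ValueError if a step does
    not rewrite the leftmost nonterminal.
    """
    forms = [start]
    for lhs, rhs in steps:
        form = forms[-1]
        # Index of the leftmost nonterminal (uppercase letter), if any.
        pos = next((i for i, c in enumerate(form) if c.isupper()), None)
        if pos is None or form[pos] != lhs:
            raise ValueError(f"{lhs}->{rhs} is not a leftmost step on {form!r}")
        forms.append(form[:pos] + rhs + form[pos + 1:])
    return forms


def is_minimal(forms):
    """True if no sentential form occurs twice in the derivation."""
    return len(set(forms)) == len(forms)


# Hypothetical grammar: S->A, A->S, A->a
short = leftmost_derivation("S", [("S", "A"), ("A", "a")])
wasteful = leftmost_derivation(
    "S", [("S", "A"), ("A", "S"), ("S", "A"), ("A", "a")]
)
print(short, is_minimal(short))        # ['S', 'A', 'a'] True
print(wasteful, is_minimal(wasteful))  # ['S', 'A', 'S', 'A', 'a'] False
```

Both derivations produce the string a, but only the first is minimal.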
Weak equivalence

Two context-free grammars G1 and G2
are called weakly equivalent if
L(G1) = L(G2)
Example of weak equivalence


G1: S -> S01; S -> 1
L(G1) = { 1(01)* }
G2: S -> S0S; S -> 1
L(G2) = { 1(01)* }
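
A bounded check of this example, as a sketch. It assumes single-character symbols, uppercase letters as nonterminals, and no rule with an empty right-hand side; `language_up_to` is a hypothetical helper name. Comparing strings up to a length bound can refute weak equivalence but cannot prove it.

```python
from collections import deque


def language_up_to(rules, start, max_len):
    """Terminal strings of length <= max_len derivable from `start`.

    Assumes no rule has an empty right-hand side, so sentential forms never
    shrink and any form longer than max_len can be discarded.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Rewriting only the leftmost nonterminal is enough to reach
        # every terminal string of the language.
        pos = next((i for i, c in enumerate(form) if c.isupper()), None)
        if pos is None:
            words.add(form)
            continue
        for lhs, rhs in rules:
            if lhs == form[pos]:
                new = form[:pos] + rhs + form[pos + 1:]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return words


G1 = [("S", "S01"), ("S", "1")]
G2 = [("S", "S0S"), ("S", "1")]
words1 = language_up_to(G1, "S", 7)
words2 = language_up_to(G2, "S", 7)
print(sorted(words1, key=len))  # ['1', '101', '10101', '1010101']
print(words1 == words2)         # True
```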
Strong equivalence

Two CFGs G1 and G2 are called
strongly equivalent if they are
weakly equivalent, and for each string
w of terminals in L(G1) = L(G2),
the minimal leftmost derivations of w
in G1 and the minimal leftmost
derivations of w in G2 are exactly the
same in number, and so can be put into
one-to-one correspondence.
Strong equivalence

Thus G1 and G2 must both be
unambiguous, or must both be
ambiguous in exactly the same number
of ways, for each string w in T*
Weakly equivalent but not
strongly equivalent



G1: Grammar of expressions S:
S -> T | S + T;
T -> F | T * F;
F -> a | ( S );
G2: Grammar of expressions S:
S -> E;
E -> E + E | E * E | (E) | a;
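L(G1) = L(G2) = valid expressions using a, +,
*, (, and ). G1 has operator precedence and is
unambiguous. G2 is ambiguous: for example,
a + a * a has two minimal leftmost derivations
in G2 but only one in G1, so the grammars are
not strongly equivalent.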
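
The difference between the two grammars can be checked by counting derivations. The sketch below counts parse trees, which correspond one-to-one to leftmost derivations. It assumes single-character symbols, uppercase nonterminals, no empty right-hand sides, and no cycles of unit rules (true of both grammars here); `count_derivations` is a hypothetical name.

```python
from functools import lru_cache


def count_derivations(rules, start, word):
    """Count parse trees (equivalently, leftmost derivations) of `word`."""
    by_lhs = {}
    for lhs, rhs in rules:
        by_lhs.setdefault(lhs, []).append(rhs)

    @lru_cache(maxsize=None)
    def sym(x, i, j):
        # Number of ways the single symbol x derives word[i:j].
        if not x.isupper():
            return int(word[i:j] == x)
        return sum(seq(rhs, i, j) for rhs in by_lhs.get(x, []))

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways the symbol string rhs derives word[i:j].
        if len(rhs) == 1:
            return sym(rhs, i, j)
        rest = len(rhs) - 1  # every remaining symbol needs >= 1 character
        return sum(sym(rhs[0], i, k) * seq(rhs[1:], k, j)
                   for k in range(i + 1, j - rest + 1))

    return sym(start, 0, len(word))


G1 = [("S", "T"), ("S", "S+T"), ("T", "F"), ("T", "T*F"),
      ("F", "a"), ("F", "(S)")]
G2 = [("S", "E"), ("E", "E+E"), ("E", "E*E"), ("E", "(E)"), ("E", "a")]

print(count_derivations(G1, "S", "a+a*a"))  # 1
print(count_derivations(G2, "S", "a+a*a"))  # 2
```

Equal counts for every string are what strong equivalence requires; a single string with different counts is enough to rule it out.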
Example: Strong equivalence


G1: S->A; A->1B; A->1; B->0A
L(G1) = { (10)*1 }
G2: S->B; B->A1; B->1; A->B0
L(G2) = { 1(01)* }
Elementary transformations of
context free grammars





substitution
expansion
removal of useless productions
removal of non-generative productions
removal of left recursive productions
Substitution


If G has the A-rule, A->uBv,
and all the B-rules are:
B->w1, B->w2, . . . , B->wk, then
1. Remove the A-rule A->uBv
2. Add the A-rules: A->uw1v, A->uw2v, . . . , A->uwkv
3. Keep all the other rules of G,
including the B-rules
Example of substitution


G1: S->H;
H->TT;
T->S; T->aSb; T->c
G2: S->H;
H->ST; H->aSbT; H->cT;
T->S; T->aSb; T->c;
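
A sketch of the substitution step in Python, reproducing the example above. The representation (rules as (lhs, rhs) pairs of single-character symbols) and the name `substitute` are assumptions made for illustration.

```python
def substitute(rules, a_rule, pos):
    """Substitute the B-rules into `a_rule` at index `pos` of its right side.

    `a_rule` must be one of `rules`, and the symbol at rhs[pos] is the
    nonterminal B whose rules are substituted.
    """
    lhs, rhs = a_rule
    b = rhs[pos]
    u, v = rhs[:pos], rhs[pos + 1:]
    new_rules = [(lhs, u + w + v) for (x, w) in rules if x == b]
    result = []
    for rule in rules:
        if rule == a_rule:
            result.extend(new_rules)  # steps 1 and 2: A->uBv becomes A->u wi v
        else:
            result.append(rule)       # step 3: keep all other rules
    return result


G1 = [("S", "H"), ("H", "TT"), ("T", "S"), ("T", "aSb"), ("T", "c")]
G2 = substitute(G1, ("H", "TT"), 0)   # substitute for the first T in H->TT
for lhs, rhs in G2:
    print(f"{lhs}->{rhs}")
```

The printed rules match G2 above: S->H, H->ST, H->aSbT, H->cT, followed by the three unchanged T-rules.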
Strong equivalence after
substitution

The grammar G, and the grammar G’
obtained by substitution of B into the
A-rule, are strongly equivalent if steps 2
and 3 do not introduce duplicate rules.
Expansion


If a grammar has the A-rule, A->uv
Remove this A-rule, and replace it with
the two rules


A->Xv; X->u; or with
A->uY; Y->v
where X (or Y) is a new non-terminal
symbol of the grammar.
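
A sketch of expansion in Python, covering both variants. Rules are assumed to be (lhs, rhs) pairs of single-character symbols; `expand`, `split`, and `factor_left` are hypothetical names. The sample rule A->(A) is borrowed from a parenthesis grammar.

```python
def expand(rules, a_rule, split, new_symbol, factor_left=True):
    """Replace A->uv by A->Xv, X->u (factor_left) or by A->uY, Y->v.

    `split` is the length of u; `new_symbol` must not occur in the grammar.
    """
    if any(new_symbol == lhs or new_symbol in rhs for lhs, rhs in rules):
        raise ValueError("new_symbol must be a fresh nonterminal")
    lhs, rhs = a_rule
    u, v = rhs[:split], rhs[split:]
    if factor_left:
        pair = [(lhs, new_symbol + v), (new_symbol, u)]
    else:
        pair = [(lhs, u + new_symbol), (new_symbol, v)]
    result = []
    for rule in rules:
        result.extend(pair if rule == a_rule else [rule])
    return result


G = [("S", "A"), ("A", "(A)")]
print(expand(G, ("A", "(A)"), 1, "B"))
# [('S', 'A'), ('A', 'BA)'), ('B', '(')]
print(expand(G, ("A", "(A)"), 2, "D", factor_left=False))
# [('S', 'A'), ('A', '(AD'), ('D', ')')]
```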
Strong equivalence after
expansion

If G is context free, and G’ is obtained
from G by expansion, then G and G’ are
strongly equivalent.
Useful production



A production A->w of a CFG G is useful
if there is a string x from T* such that
S-> . . -> uAv -> uwv -> . . -> x
Otherwise the production, A->w is
useless
Thus, a production that is never used to
derive a string of terminals is useless
Removing useless productions



T-marking
S-marking
Productions that are both T-marked and
S-marked are useful. All other
productions can be removed.
T-marking

Construct a sequence P0, P1, P2, . . . ,
of subsets of P, and a sequence N0, N1,
N2, . . . of subsets of N as follows:




P[0] = empty, N[0] = empty, j = 0
P[j+1] = { A->w in P | w in (N[j] + T)* }
N[j+1] = { A in N | P[j+1] contains a rule
A->w }
Continue until P[j] = P[j+1]; call this set P[T]
S-marking

Construct a sequence Q1, Q2, Q3, . . .
of subsets of P[T] as follows:




Q1 = {S->w in P[T]}
Q[j+1] = Q[j] + {A->w in P[T] | Q[j]
contains a rule B->uAv }
Continue until Q[j] = Q[j+1]; call this set P[S]
The rules in P[S] are the useful productions.
Example: T/S-marking


Rule        T mark   S mark
1. S->H       2        1
2. H->AB      -        -
3. H->aH      2        2
4. H->a       1        2
5. B->Hb      2        -
6. C->aC      -        -
Thus only rules 1, 3, and 4 are useful.
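
A sketch of T-marking and S-marking in Python, run on the six rules of this example. Each rule gets the round in which it is first marked. Rules are assumed to be (lhs, rhs) pairs with uppercase nonterminals; the function names are hypothetical, and rules are indexed from 0 internally.

```python
def t_mark(rules):
    """Map rule index -> round in which the rule is first T-marked."""
    marks, generating = {}, set()   # `generating` plays the role of N[j]
    round_no = 0
    while True:
        round_no += 1
        newly = [i for i, (lhs, rhs) in enumerate(rules)
                 if i not in marks
                 and all(not c.isupper() or c in generating for c in rhs)]
        if not newly:
            return marks
        for i in newly:
            marks[i] = round_no
        generating |= {rules[i][0] for i in newly}


def s_mark(rules, t_marks, start="S"):
    """Map rule index -> round in which a T-marked rule is first S-marked."""
    marks = {i: 1 for i in t_marks if rules[i][0] == start}
    round_no = 1
    while True:
        round_no += 1
        # Nonterminals occurring on the right of rules marked so far.
        reachable = {c for i in marks for c in rules[i][1] if c.isupper()}
        newly = [i for i in t_marks
                 if i not in marks and rules[i][0] in reachable]
        if not newly:
            return marks
        for i in newly:
            marks[i] = round_no


rules = [("S", "H"), ("H", "AB"), ("H", "aH"),
         ("H", "a"), ("B", "Hb"), ("C", "aC")]
t = t_mark(rules)
s = s_mark(rules, t)
useful = sorted(i + 1 for i in s)   # back to the 1-based rule numbers
print(useful)  # [1, 3, 4]
```

Rule 5 (B->Hb) is T-marked but never S-marked, because B is only reachable through the unmarked rule H->AB.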
Strong equivalence after removal of
useless productions

If grammar G’ is obtained from
grammar G after removal of useless
productions of grammar G, then G and
G’ are strongly equivalent.
Removing non-generative
productions

Removing left-recursive rules

Let all the X-rules of grammar G be:
X->u1 | u2 | . . . | uk
X->Xw1 | Xw2 | . . . | Xwh
Then these rules may be replaced by
the following:
X->u1 | u2 | . . . | uk
X->u1Z | u2Z | . . . | ukZ
Z->w1 | w2 | . . . | wh
Z->w1Z | w2Z | . . . | whZ
where Z is a new non-terminal symbol
Example: Removing left-recursive rules

Original grammar:
S->E;
E->T | aT | bT;
E->EaT | EbT;
T->F;
T->TcF | TdF;
F->n | xEy
After removing left-recursive rules:
S->E;
E->T | aT | bT;
E->TG | aTG | bTG;
G->aT | bT;
G->aTG | bTG;
T->F;
T->FH;
H->cF | dF;
H->cFH | dFH;
F->n | xEy
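
A sketch of the replacement in Python, applied to the E-rules and then the T-rules of this example. It handles only immediate left recursion (rules of the form X->Xw with w nonempty), as in the construction above. Rules are assumed to be (lhs, rhs) pairs of single-character symbols; `remove_left_recursion` is a hypothetical name.

```python
def remove_left_recursion(rules, x, z):
    """Replace the left-recursive x-rules, using z as the new nonterminal."""
    x_rhs = [rhs for lhs, rhs in rules if lhs == x]
    us = [r for r in x_rhs if not r.startswith(x)]   # X -> u_i
    ws = [r[1:] for r in x_rhs if r.startswith(x)]   # X -> X w_i
    if not ws:
        return list(rules)                           # nothing to do
    new_x = [(x, u) for u in us] + [(x, u + z) for u in us]
    new_z = [(z, w) for w in ws] + [(z, w + z) for w in ws]
    out, placed = [], False
    for lhs, rhs in rules:
        if lhs != x:
            out.append((lhs, rhs))
        elif not placed:
            out.extend(new_x + new_z)  # new rules go where the first x-rule was
            placed = True
    return out


before = [("S", "E"),
          ("E", "T"), ("E", "aT"), ("E", "bT"), ("E", "EaT"), ("E", "EbT"),
          ("T", "F"), ("T", "TcF"), ("T", "TdF"),
          ("F", "n"), ("F", "xEy")]
after = remove_left_recursion(before, "E", "G")
after = remove_left_recursion(after, "T", "H")
for lhs, rhs in after:
    print(f"{lhs}->{rhs}")
```

The output lists the same 19 rules as the transformed grammar above, with G and H as the new nonterminals.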
Strong equivalence after
removal of left-recursive rules

If grammar G’ is obtained from
grammar G by replacing the
left-recursive rules of G with
right-recursive rules, then G and G’ are
strongly equivalent.
Well-formed grammars

A context free grammar G=(N,T,P,S) is
well-formed if each production has one
of the forms:
S->ε
S->A
A->w
where A ∈ N and w ∈ (N+T)* - N and
each production is useful.
Example of well-formed
grammars

Parenthesis grammar
S->A;
A->AA;
A->(A);
A->();
Chomsky Normal form

A context free grammar G=(N,T,P,S) is
in normal form (Chomsky normal form)
if each production has one of the forms:
S->ε
S->A
A->BC
A->a
where A,B,C ∈ N and a ∈ T.
Example of Chomsky normal
form grammar

Parenthesis grammar:
S->A;
A->AA;
A->(A);
A->();
Chomsky normal form:
S->A;
A->AA;
A->BC;
B->(;
C->AD;
D->);
A->BD;
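
A sketch that checks the four allowed production forms, applied to both grammars of this example. Rules are assumed to be (lhs, rhs) pairs of single-character symbols with uppercase nonterminals, and an empty rhs stands for the empty string; `is_chomsky_normal_form` is a hypothetical name.

```python
def is_chomsky_normal_form(rules, start="S"):
    """True if every rule is S->(empty), S->A, A->BC, or A->a."""
    for lhs, rhs in rules:
        nonterminal = [c.isupper() for c in rhs]
        if lhs == start and (rhs == "" or nonterminal == [True]):
            continue                      # S->(empty) or S->A
        if nonterminal == [True, True]:
            continue                      # A->BC
        if nonterminal == [False]:
            continue                      # A->a
        return False
    return True


parenthesis = [("S", "A"), ("A", "AA"), ("A", "(A)"), ("A", "()")]
normal_form = [("S", "A"), ("A", "AA"), ("A", "BC"), ("B", "("),
               ("C", "AD"), ("D", ")"), ("A", "BD")]
print(is_chomsky_normal_form(parenthesis))  # False
print(is_chomsky_normal_form(normal_form))  # True
```

The check is purely syntactic; it does not confirm that the two grammars generate the same language.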
Chomsky Normal Form
Theorem

From any context free grammar, one
can construct a strongly equivalent
grammar in Chomsky normal form.
Greibach normal form
(standard form)

A context free grammar G=(N,T,P,S) is
in standard form (Greibach normal
form) if each production has one of the
forms:
S->ε
S->A
A->aw
where A ∈ N, a ∈ T, and w ∈ (N+T)*.
Example: converting to
Greibach standard form

First remove left-recursive rules.
Original grammar:
S->E;
E->T;
E->EaT;
T->n;
T->xEy;
After removing left-recursive rules:
S->E;
E->T;
E->TF;
F->aT;
F->aTF;
T->n;
T->xEy;
Converting to Greibach: then substitute to
replace nonterminal handles

Before substitution:
S->E;
E->T;
E->TF;
F->aT;
F->aTF;
T->n;
T->xEy;
After substituting the T-rules into E->T and E->TF:
S->E;
E->n | xEy;
E->nF | xEyF;
F->aT;
F->aTF;
T->n;
T->xEy;
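
A sketch that tests the standard-form condition on the grammar before and after this substitution step. Rules are assumed to be (lhs, rhs) pairs of single-character symbols with uppercase nonterminals, and an empty rhs stands for the empty string; `is_standard_form` is a hypothetical name.

```python
def is_standard_form(rules, start="S"):
    """True if every rule is S->(empty), S->A, or A->aw with a terminal a."""
    for lhs, rhs in rules:
        if lhs == start and (rhs == "" or (len(rhs) == 1 and rhs.isupper())):
            continue                      # S->(empty) or S->A
        if rhs and not rhs[0].isupper():
            continue                      # A->aw: the handle is a terminal
        return False
    return True


before = [("S", "E"), ("E", "T"), ("E", "TF"), ("F", "aT"), ("F", "aTF"),
          ("T", "n"), ("T", "xEy")]
after = [("S", "E"), ("E", "n"), ("E", "xEy"), ("E", "nF"), ("E", "xEyF"),
         ("F", "aT"), ("F", "aTF"), ("T", "n"), ("T", "xEy")]
print(is_standard_form(before))  # False: E->T and E->TF have handle T
print(is_standard_form(after))   # True
```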
Standard Form Theorem

From any context free grammar, one
can construct a strongly equivalent
grammar in standard form (Greibach
normal form).
Pumping Lemma for context
free languages

If L is a context free language, then
there exists a positive integer p such
that: if w ∈ L and |w| > p, then
w = xuyvz, with uv and y nonempty
and x u^k y v^k z ∈ L for all k ≥ 0.
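
A small illustration of the pumped strings, using the language of nonempty balanced parentheses generated by the rules S->A, A->AA, A->(A), A->(). The decomposition of w = (()) is chosen by hand, and `pump` and `is_balanced` are hypothetical helper names. This shows one decomposition that pumps; it is not a proof of the lemma.

```python
def pump(x, u, y, v, z, k):
    """Return the string x u^k y v^k z."""
    return x + u * k + y + v * k + z


def is_balanced(s):
    """True if s, a string of '(' and ')', is balanced."""
    depth = 0
    for c in s:
        depth += 1 if c == "(" else -1
        if depth < 0:
            return False
    return depth == 0


# w = (()) split as x = "", u = "(", y = "()", v = ")", z = ""
x, u, y, v, z = "", "(", "()", ")", ""
pumped = [pump(x, u, y, v, z, k) for k in range(4)]
print(pumped)  # ['()', '(())', '((()))', '(((())))']
print(all(is_balanced(s) for s in pumped))  # True
```

Because y is nonempty, every pumped string is a nonempty balanced string, so it stays in the language for each k tried.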