Context-sensitive Grammar Definition

Phrase-structure grammar
A phrase-structure grammar is a quadruple
G = (V, T, P, S), where V is a finite set of
symbols called nonterminals, T is a finite set of
terminals, P is a finite set of productions α → β
with α ∈ (V ∪ T)*V(V ∪ T)* and β ∈ (V ∪ T)*, and S
is a member of V called the start symbol.
S → ABC
AB → aAD
AB → bAE
DC → BaC
EC → BbC
Da → aD
Db → bD
Ea → aE
Eb → bE
AB → ε
C → ε
aB → Ba
bB → Bb
Context-sensitive Grammar
Definition: A grammar G = (V, T, P, S) is
context sensitive if |α| ≤ |β| for every
production α → β in P.
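The length condition can be checked mechanically. The following is a minimal Python sketch, assuming (for illustration only) that every symbol is a single character and that a production is a pair of strings; the name is_context_sensitive is hypothetical.

```python
def is_context_sensitive(productions):
    """True if |alpha| <= |beta| for every production (alpha, beta)."""
    return all(len(alpha) <= len(beta) for alpha, beta in productions)


# The phrase-structure grammar listed earlier; "" stands for an empty right side.
phrase_structure = [
    ("S", "ABC"), ("AB", "aAD"), ("AB", "bAE"), ("DC", "BaC"), ("EC", "BbC"),
    ("Da", "aD"), ("Db", "bD"), ("Ea", "aE"), ("Eb", "bE"),
    ("AB", ""), ("C", ""), ("aB", "Ba"), ("bB", "Bb"),
]

# A small illustrative grammar whose productions never shrink the string.
non_shrinking = [("S", "ABC"), ("AB", "BA"), ("A", "a"), ("B", "b"), ("C", "c")]

print(is_context_sensitive(phrase_structure))  # False: two right sides are empty
print(is_context_sensitive(non_shrinking))     # True
```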
Definition: A “true” context-sensitive
grammar G = (V, T, P, S) is a grammar in
which each production is of the form
αAβ → αγβ, where α and β are in (V ∪ T)*,
γ is in (V ∪ T)+, and A is in V. The production
αAβ → αγβ is also written as A → γ / α _ β
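To see what the "true" form requires, the sketch below searches a production for a split into left context, rewritten nonterminal, replacement and right context. It assumes single-character symbols and string-valued sides; the function name context_split and the nonterminal set V used in the example are illustrative choices, not part of the definition.

```python
def context_split(lhs, rhs, nonterminals):
    """Return (alpha, A, gamma, beta) with lhs = alpha A beta and
    rhs = alpha gamma beta (gamma nonempty), or None if no split exists."""
    for i, symbol in enumerate(lhs):
        if symbol not in nonterminals:
            continue
        alpha, beta = lhs[:i], lhs[i + 1:]
        gamma_len = len(rhs) - len(alpha) - len(beta)
        # gamma_len >= 1 guarantees the prefix and suffix do not overlap in rhs.
        if gamma_len >= 1 and rhs.startswith(alpha) and rhs.endswith(beta):
            gamma = rhs[len(alpha):len(alpha) + gamma_len]
            return alpha, symbol, gamma, beta
    return None


V = {"S", "A", "B", "C", "R"}  # assumed nonterminal alphabet for the examples
print(context_split("CB", "CR", V))  # ('C', 'B', 'R', ''), i.e. B -> R / C _
print(context_split("Ab", "ab", V))  # ('', 'A', 'a', 'b'), i.e. A -> a / _ b
print(context_split("AB", "BA", V))  # None: both symbols change at once
```

A production such as AB → BA satisfies the length condition but has no such split, which is why the two definitions are stated separately.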
Example:
S  ABC
S  ABCS
AB  BA
AC  CA
BC  CB
BA  AB
CA  AC
CB  BC
Aa
Bb
Cc
1) S → ASCB
2) S → ACB
3) CB → CR
4) CR → BR
5) BR → BC
6) AB → Ab
7) Ab → ab
8) Aa → aa
9) bB → bb
10) bC → bc
11) cC → cc
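A derivation in this grammar can be replayed step by step. The sketch below assumes the productions are numbered 1 to 11 in the order listed, treats each symbol as one character, and always rewrites the leftmost occurrence of a left side; the helper names and the chosen step sequence are illustrative.

```python
# Productions in listing order; index i + 1 is the assumed production number.
PRODUCTIONS = [
    ("S", "ASCB"), ("S", "ACB"), ("CB", "CR"), ("CR", "BR"), ("BR", "BC"),
    ("AB", "Ab"), ("Ab", "ab"), ("Aa", "aa"), ("bB", "bb"), ("bC", "bc"),
    ("cC", "cc"),
]


def apply_leftmost(form, number):
    """Apply production `number` at the leftmost position where it matches."""
    lhs, rhs = PRODUCTIONS[number - 1]
    position = form.find(lhs)
    if position == -1:
        raise ValueError(f"production {number} does not apply to {form}")
    return form[:position] + rhs + form[position + len(lhs):]


def replay(steps, start="S"):
    """Return the list of sentential forms produced by the given steps."""
    forms = [start]
    for number in steps:
        forms.append(apply_leftmost(forms[-1], number))
    return forms


# Productions 3, 4, 5 together swap one CB pair into BC.
steps = [1, 2, 3, 4, 5, 3, 4, 5, 3, 4, 5, 6, 9, 10, 11, 7, 8]
print(" => ".join(replay(steps)))  # ends in aabbcc
```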
Definition: A language L is a context-sensitive language if it is generated by a
context-sensitive grammar.
Theorem: Every context-sensitive language
can be generated by a true context-sensitive
grammar.
Step 1:
Convert all rules of the grammar G to the
form α → β, where α is a string of
nonterminals: replace each terminal a by N_a
and add a production N_a → a, where N_a is a
new nonterminal. Let G1 be the new grammar.
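Step 1 can be sketched in Python as follows. The representation is an assumption made for illustration: each side of a production is a tuple of symbol names, and the new nonterminal for a terminal a is named "N_a" (which presumes no existing symbol already has that name). The function name lift_terminals is hypothetical.

```python
def lift_terminals(productions, terminals):
    """Replace each terminal a by a nonterminal N_a and add N_a -> a."""
    def rename(symbol):
        return f"N_{symbol}" if symbol in terminals else symbol

    # Terminals that actually occur, in a fixed order for reproducible output.
    used = sorted({s for lhs, rhs in productions for s in lhs + rhs
                   if s in terminals})
    lifted = [(tuple(map(rename, lhs)), tuple(map(rename, rhs)))
              for lhs, rhs in productions]
    return lifted + [((f"N_{a}",), (a,)) for a in used]


g = [(("S",), ("a", "S", "B")), (("a", "B"), ("a", "b"))]
for lhs, rhs in lift_terminals(g, {"a", "b"}):
    print(" ".join(lhs), "->", " ".join(rhs))
```

After this step every left side consists of nonterminals only, and terminals appear only in productions of the form N_a → a.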
Step 2:
Let w(G) = max{|β| : α → β is in P}
Convert the grammar G1 to a grammar G2
such that every left side α ∈ V+ and w(G2) ≤ 2
Let π: A_1 … A_m → B_1 … B_n be a production
If n ≤ 2, add it to G2
If 2 ≤ m < n, create two productions
(X is a new nonterminal; both are then
processed again by the cases below):
A_1 … A_m → B_1 … B_{m-1}X
X → B_m … B_n
If m = 1 and n ≥ 3, create n-1 productions
(X_1, …, X_{n-2} are new nonterminals):
A_1 → B_1X_1
X_1 → B_2X_2
…
X_{n-2} → B_{n-1}B_n
If m = n and n ≥ 3, create the n-1 productions
A_1A_2 → B_1X_1
X_1A_3 → B_2X_2
…
X_{n-2}A_n → B_{n-1}B_n
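The case analysis of Step 2 can be written as a short recursive routine. This is a sketch under stated assumptions: sides are tuples of nonterminal names, every input production is noncontracting (m ≤ n), and the fresh nonterminals are named X1, X2, … (assumed not to clash with existing symbols). The names split and bound_width are hypothetical.

```python
from itertools import count


def split(lhs, rhs, fresh):
    """Split one production A_1..A_m -> B_1..B_n into pieces of width <= 2."""
    m, n = len(lhs), len(rhs)
    assert 1 <= m <= n, "expects a noncontracting production"
    if n <= 2:
        return [(lhs, rhs)]
    if 2 <= m < n:
        x = next(fresh)
        # A_1..A_m -> B_1..B_{m-1} X  and  X -> B_m..B_n, each split again.
        return (split(lhs, rhs[:m - 1] + (x,), fresh)
                + split((x,), rhs[m - 1:], fresh))
    # Remaining cases: m == 1 or m == n, with n >= 3: a chain of n-1 productions.
    xs = [next(fresh) for _ in range(n - 2)]
    heads = [lhs[0]] + xs      # A_1, X_1, ..., X_{n-2}
    tails = xs + [rhs[-1]]     # X_1, ..., X_{n-2}, B_n
    chain = []
    for i in range(n - 1):
        left = (heads[i],) if m == 1 else (heads[i], lhs[i + 1])
        chain.append((left, (rhs[i], tails[i])))
    return chain


def bound_width(productions):
    """Apply split to every production, sharing one supply of fresh names."""
    fresh = (f"X{i}" for i in count(1))
    return [p for lhs, rhs in productions for p in split(lhs, rhs, fresh)]


g1 = [(("S",), ("A", "S", "C", "B")),
      (("A", "B"), ("C", "D", "E", "F")),
      (("A", "B", "C"), ("C", "B", "A"))]
for lhs, rhs in bound_width(g1):
    print(" ".join(lhs), "->", " ".join(rhs))
```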
Step 3: Convert G2 to a new grammar G3
Add productions of the form A → α to G3
If AB → CD is a production and if A = C or B = D,
add it to G3
If AB → CD is a production and A ≠ C and B ≠ D,
then add the productions
AB → XB, XB → XY, XY → CY, CY → CD
where X and Y are new nonterminals
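A sketch of Step 3 in Python, assuming the input already has width at most 2, that sides are tuples of symbol names, and that fresh nonterminals can be named Z1, Z2, … without clashes. The function name to_true_form is hypothetical.

```python
from itertools import count


def to_true_form(productions):
    """Replace each AB -> CD with A != C and B != D by four productions
    that each rewrite a single nonterminal in context."""
    fresh = (f"Z{i}" for i in count(1))
    result = []
    for lhs, rhs in productions:
        changes_both = (len(lhs) == 2 and len(rhs) == 2
                        and lhs[0] != rhs[0] and lhs[1] != rhs[1])
        if not changes_both:
            result.append((lhs, rhs))  # already in the required form
            continue
        (a, b), (c, d) = lhs, rhs
        x, y = next(fresh), next(fresh)
        result += [((a, b), (x, b)), ((x, b), (x, y)),
                   ((x, y), (c, y)), ((c, y), (c, d))]
    return result


g2 = [(("C", "B"), ("B", "C")), (("A",), ("a",))]
for lhs, rhs in to_true_form(g2):
    print(" ".join(lhs), "->", " ".join(rhs))
```

Each of the four replacement productions changes exactly one symbol while the other stays fixed as context, so each has the form αAβ → αγβ.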
Definition: Let G = (V, T, P, S) be a
context-sensitive grammar and let w ∈ T^n
for some n ≥ 1. Define a sequence of sets
W_i ⊆ (V ∪ T)* as follows:
W_0 = {S}
for each i ≥ 0,
W_{i+1} = W_i ∪ {β ∈ (V ∪ T)+ | α ⇒ β in G, α
is in W_i, and |β| ≤ n}
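The construction of the sets can be sketched directly from the definition. The code assumes single-character symbols with string-valued sentential forms, and uses the eleven-production grammar for a^n b^n c^n shown earlier as the example; the helper names one_step and next_w are hypothetical.

```python
def one_step(form, productions):
    """Yield every string obtained from `form` by one production application."""
    for lhs, rhs in productions:
        pos = form.find(lhs)
        while pos != -1:
            yield form[:pos] + rhs + form[pos + len(lhs):]
            pos = form.find(lhs, pos + 1)


def next_w(w_i, productions, n):
    """Compute W_{i+1} from W_i, keeping only strings of length <= n."""
    return w_i | {b for a in w_i for b in one_step(a, productions)
                  if len(b) <= n}


ABC = [("S", "ASCB"), ("S", "ACB"), ("CB", "CR"), ("CR", "BR"), ("BR", "BC"),
       ("AB", "Ab"), ("Ab", "ab"), ("Aa", "aa"), ("bB", "bb"), ("bC", "bc"),
       ("cC", "cc")]

w = {"S"}  # W_0
for i in range(4):
    print(f"W_{i} =", sorted(w))
    w = next_w(w, ABC, n=3)
```

With n = 3 the string ASCB is never added, because its length exceeds the bound.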
Proposition: Let W_i be as defined before.
Then we have the following:
1) for each i ≥ 0, W_i ⊆ W_{i+1}
2) if W_k = W_{k+1} for some k, then W_k = W_{k+m} for
all m > 0
3) for each i ≥ 0,
W_i = {α ∈ (V ∪ T)* | S ⇒^m α, |α| ≤ n, m ≤ i}
4) there exists k < max(2·|V ∪ T|^n, n+1) such that
W_k = W_{k+1}
5) let k be the least integer such that W_k = W_{k+1},
then W_k = {α ∈ (V ∪ T)+ | S ⇒* α, |α| ≤ n}
Theorem: Let G = (V, T, P, S) be a context-sensitive grammar. Then there is an
algorithm which, given any w ∈ T*, decides
whether or not w ∈ L(G). (That is, L(G) is
recursive.)
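The decision procedure behind the theorem iterates the sets W_i until they stop growing and then looks for w in the result. The following is a minimal sketch, assuming single-character symbols and a noncontracting grammar given as pairs of strings; the name generates is hypothetical, and the example grammar is the eleven-production grammar for a^n b^n c^n shown earlier.

```python
def one_step(form, productions):
    """Yield every string obtained from `form` by one production application."""
    for lhs, rhs in productions:
        pos = form.find(lhs)
        while pos != -1:
            yield form[:pos] + rhs + form[pos + len(lhs):]
            pos = form.find(lhs, pos + 1)


def generates(productions, start, w):
    """Decide whether w is derivable, for a noncontracting grammar."""
    n = len(w)
    reachable = {start}  # W_0
    while True:
        bigger = reachable | {b for a in reachable
                              for b in one_step(a, productions)
                              if len(b) <= n}
        if bigger == reachable:  # W_k = W_{k+1}: nothing new can appear
            return w in reachable
        reachable = bigger


ABC = [("S", "ASCB"), ("S", "ACB"), ("CB", "CR"), ("CR", "BR"), ("BR", "BC"),
       ("AB", "Ab"), ("Ab", "ab"), ("Aa", "aa"), ("bB", "bb"), ("bC", "bc"),
       ("cC", "cc")]

for word in ["abc", "aabbcc", "aabbc", "acb"]:
    print(word, generates(ABC, "S", word))
```

The loop terminates because only finitely many strings of length at most n exist over V ∪ T, and the length condition |α| ≤ |β| means a derivation of w never needs a longer intermediate string.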