Chapter 6 Formal Languages

Theory of Computation
Formal Languages definitions
A vocabulary/alphabet V is a finite, nonempty set of symbols.
A word over V is a finite length string of symbols from V.
The set V* is the set of all words over V. A language over V is any subset of V*.
Phrase-Structure Grammar
A phrase-structure grammar is a 4-tuple
(V, VT, S, P)
where
V is the vocabulary
VT is the set of terminals
S is the start symbol
P is a set of production rules
E.g.
V = {0, 1, S}
S = S
VT = {0, 1}
P = {S → 0S, S → 1}
Let w1 and w2 be words over V.
Then w1 directly generates/derives w2, written w1 ⇒ w2, if α → β is a production from P,
w1 contains an instance of α, and w2 is identical to w1 with one instance of α
replaced by β.
If w1, w2, w3, …, wn are words over V
and w1 ⇒ w2, w2 ⇒ w3, …, wn-1 ⇒ wn,
then w1 generates/derives wn,
written
w1 ⇒* wn
The language L generated by G, sometimes denoted L(G), is the set
L = {w ∈ VT* | S ⇒* w}
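The definitions above can be tried out mechanically. The following Python sketch represents each production α → β as a pair of strings, computes every word directly derived from a given word, and collects the terminal words reachable from S up to a length bound. The function names `direct_derivations` and `language_up_to` are illustrative and not part of the notes, and the length-based pruning assumes that no production shortens a word, which holds for the example grammar.

```python
from collections import deque

# Example grammar from the notes: V = {0, 1, S}, VT = {0, 1}, start symbol S,
# P = {S -> 0S, S -> 1}. Each production alpha -> beta is a pair of strings.
TERMINALS = {"0", "1"}
START = "S"
PRODUCTIONS = [("S", "0S"), ("S", "1")]


def direct_derivations(word, productions):
    """Return every w2 with word => w2 (one instance of alpha replaced by beta)."""
    results = set()
    for alpha, beta in productions:
        pos = word.find(alpha)
        while pos != -1:
            results.add(word[:pos] + beta + word[pos + len(alpha):])
            pos = word.find(alpha, pos + 1)
    return results


def language_up_to(max_len, start=START, productions=PRODUCTIONS,
                   terminals=TERMINALS):
    """Collect the terminal words w with S =>* w and len(w) <= max_len.

    Assumption: no production shortens a word, so any word longer than
    max_len can be discarded without losing shorter results.
    """
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        word = queue.popleft()
        if all(symbol in terminals for symbol in word):
            found.add(word)  # only terminals left, so word is in L(G)
            continue
        for nxt in direct_derivations(word, productions):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return found


if __name__ == "__main__":
    print(sorted(language_up_to(4), key=len))
```

For the example grammar this enumerates words consisting of zero or more 0s followed by a single 1, which suggests what L(G) looks like without proving it.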
to sent
output (sentence nounphrase verbphrase)
end
to nounphrase
output (sentence "the adjective noun)
end
to verbphrase
output (sentence verb nounphrase)
end
to noun
output pick [girl boy elephant zebra giraffe clown]
end
to adjective
output pick [big happy little funny silly]
end
to verb
output pick [hugs punches likes visits]
end
? print sent
THE FUNNY CLOWN HUGS THE HAPPY ZEBRA
? print sent
THE LITTLE GIRAFFE PUNCHES THE SILLY BOY
G = ({S, B, C, a, b, c}, {a, b, c}, S, P)
where the productions P are
S → aSBC     S → aBC
CB → BC      aB → ab
bB → bb      bC → bc
cC → cc
Using these productions as "rewriting rules" it can be
shown that, starting with S, we can derive any string of
the form a^n b^n c^n (n ≥ 1).
For example,
S ⇒ aSBC
  ⇒ aaBCBC (using S → aBC)
  ⇒ aabCBC (using aB → ab)
  ⇒ aabBCC (using CB → BC)
  ⇒ aabbCC (using bB → bb)
  ⇒ aabbcC (using bC → bc)
  ⇒ aabbcc (using cC → cc)
is a valid derivation of a^2 b^2 c^2.
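A derivation like this one can be checked step by step. The sketch below lists the seven productions of the grammar as (α, β) pairs and confirms that each word in the sequence is obtained from its predecessor by replacing one instance of some α with the matching β. The helper names `is_direct_derivation` and `is_valid_derivation` are made up for this illustration.

```python
# Productions of the a^n b^n c^n grammar, written as (alpha, beta) pairs.
PRODUCTIONS = [
    ("S", "aSBC"), ("S", "aBC"), ("CB", "BC"),
    ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]

# The derivation of aabbcc given in the notes, one word per step.
DERIVATION = ["S", "aSBC", "aaBCBC", "aabCBC", "aabBCC",
              "aabbCC", "aabbcC", "aabbcc"]


def is_direct_derivation(w1, w2, productions=PRODUCTIONS):
    """True if w2 results from w1 by rewriting one instance of some alpha."""
    for alpha, beta in productions:
        pos = w1.find(alpha)
        while pos != -1:
            if w1[:pos] + beta + w1[pos + len(alpha):] == w2:
                return True
            pos = w1.find(alpha, pos + 1)
    return False


def is_valid_derivation(words, productions=PRODUCTIONS):
    """True if every consecutive pair of words is a direct derivation."""
    return all(is_direct_derivation(a, b, productions)
               for a, b in zip(words, words[1:]))


if __name__ == "__main__":
    print(is_valid_derivation(DERIVATION))
```

The check only validates a derivation that is already given; it does not search for one.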
Equivalent Grammars
Two different grammars G1 and G2 may generate the same language, i.e. it may
be that L(G1) = L(G2). Such grammars are said to be equivalent.
There is no general procedure (algorithm) for determining whether two arbitrary grammars
are equivalent (cf. the halting problem).
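Although equivalence cannot be decided in general, two grammars can still be compared on all words up to a chosen length. The sketch below does this for the grammar S → 0S, S → 1 used earlier and a second grammar G2 that is invented here purely for illustration (it is not from the notes). Agreement up to a bound is only evidence, never a proof of equivalence, and the pruning again assumes that no production shortens a word.

```python
def derive_step(word, productions):
    """All words obtained from word by applying one production once."""
    out = set()
    for alpha, beta in productions:
        pos = word.find(alpha)
        while pos != -1:
            out.add(word[:pos] + beta + word[pos + len(alpha):])
            pos = word.find(alpha, pos + 1)
    return out


def words_up_to(productions, terminals, max_len, start="S"):
    """Terminal words derivable from start with length <= max_len.

    Assumption: no production shortens a word.
    """
    seen, frontier, found = {start}, [start], set()
    while frontier:
        word = frontier.pop()
        if all(symbol in terminals for symbol in word):
            found.add(word)
            continue
        for nxt in derive_step(word, productions):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return found


TERMINALS = {"0", "1"}
G1 = [("S", "0S"), ("S", "1")]                           # from the notes
G2 = [("S", "T1"), ("S", "1"), ("T", "0T"), ("T", "0")]  # hypothetical


def agree_up_to(max_len):
    """Compare the two languages on all words of length <= max_len."""
    return (words_up_to(G1, TERMINALS, max_len)
            == words_up_to(G2, TERMINALS, max_len))


if __name__ == "__main__":
    print(agree_up_to(6))
```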
G = ({S, a, b}, {a, b}, S, P)
where the productions P are (λ denotes the empty word)
S → aSa
aSa → aaa
bSb → bab
S → λ

S → bSb
aSa → aba
bSb → bbb

S → bSb
S → b

S → bAS
AS → SA
A → a
G = ({S, a, b}, {a, b}, S, P)
where the productions P are
S → aSa
S → a
S → λ
G = ({S, B, a, b}, {a, b}, S, P)
where the productions P are
S → aBS
BS → SB
B → b
S → λ
Erasing productions
An erasing production takes the form
α → β
where length(α) > length(β).
Context sensitive Grammars (type 1)
A grammar is said to be context sensitive if none of its productions are erasing
productions, with the exception of S → λ (where λ denotes the empty word).
Context free Grammars (type 2)
A grammar is said to be context free if all the productions are non-erasing and of the form
α → β
where length(α) = 1, that is, α is a single non-terminal,
with the exception of S → λ.
Regular Grammars(type 3)
A grammar is said to be regular if all the productions are non-erasing and of the form
α → β
where length(α) = 1 and β is either of the form tN or t, where t is a terminal and N is
a non-terminal,
with the exception of S → λ.
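The three definitions can be turned into a simple test on a set of productions. The sketch below is one possible reading of the definitions above: productions are (α, β) pairs of strings with one character per symbol, the empty string stands for the empty word, and the function name `chomsky_type` is hypothetical. It reports the most restrictive type whose conditions the productions satisfy, returning 0 when an erasing production other than the permitted S production is present.

```python
def chomsky_type(productions, nonterminals, start="S"):
    """Return 3, 2, 1 or 0 according to the definitions in the notes.

    productions  -- iterable of (alpha, beta) string pairs
    nonterminals -- set of single-character non-terminal symbols
    """
    # The production start -> empty word is the one permitted exception.
    rules = [(a, b) for a, b in productions if not (a == start and b == "")]

    # Any other erasing production (left side longer than right side)
    # rules out types 1 to 3.
    if any(len(a) > len(b) for a, b in rules):
        return 0

    # Context free and regular grammars need a single non-terminal on the left.
    if any(len(a) != 1 or a not in nonterminals for a, b in rules):
        return 1

    def regular_rhs(b):
        # Right side must be t or tN (t terminal, N non-terminal).
        if len(b) == 1:
            return b not in nonterminals
        return (len(b) == 2 and b[0] not in nonterminals
                and b[1] in nonterminals)

    if all(regular_rhs(b) for a, b in rules):
        return 3
    return 2


if __name__ == "__main__":
    print(chomsky_type([("S", "0S"), ("S", "1")], {"S"}))
```

The result describes the grammar as written, not the language: a language generated by a type 1 grammar may also have an equivalent grammar of type 2 or 3.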

Chomsky's Hierarchy
type 0    Phrase Structure Grammars
type 1    Context Sensitive Grammars
type 2    Context Free Grammars
type 3    Regular Grammars
Recognition Machines
For every Regular Grammar it is possible to construct a Finite State Machine that will
recognise its language.
For every Context Free Grammar it is possible to construct a Push Down Automaton that
will recognise its language.
For every Context Sensitive Grammar it is possible to construct a Linear Bounded
Automaton (a Turing Machine whose usable tape is bounded by a linear function of the
input length) that will recognise its language.
For every Phrase Structure Grammar it is possible to construct a Turing Machine that will
recognise its language.
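For the regular case the construction is short enough to sketch. In the Python below, each non-terminal becomes a state, a production N → tM becomes a transition from N to M on input t, and a production N → t becomes a transition from N to a single accepting state on input t. The state name `ACCEPT` and the function names are assumptions made for this sketch, the machine is simulated as a non-deterministic one by tracking a set of current states, and the exceptional production S → λ is not handled.

```python
ACCEPT = "ACCEPT"  # hypothetical name for the single accepting state


def build_transitions(productions):
    """Map (state, terminal) to the set of next states.

    Each production is a pair (N, rhs) with rhs of the form "t" or "tM".
    """
    transitions = {}
    for lhs, rhs in productions:
        target = rhs[1] if len(rhs) == 2 else ACCEPT
        transitions.setdefault((lhs, rhs[0]), set()).add(target)
    return transitions


def accepts(word, productions, start="S"):
    """Simulate the machine on word, tracking all reachable states."""
    transitions = build_transitions(productions)
    states = {start}
    for symbol in word:
        states = set().union(
            *(transitions.get((state, symbol), set()) for state in states)
        )
    return ACCEPT in states


# Regular grammar from the start of the chapter: S -> 0S, S -> 1.
PRODUCTIONS = [("S", "0S"), ("S", "1")]

if __name__ == "__main__":
    for candidate in ["1", "0001", "10", "000"]:
        print(candidate, accepts(candidate, PRODUCTIONS))
```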
Backus-Naur Form (BNF)
<identifier>::=<letter>|<identifier><letter>|<identifier><digit>
<letter>::=a|b|c|…|z
<digit>::=0|1|…|9
In BNF, non-terminals are enclosed in < >, the production arrow becomes ::=, and |
stands for "or".
G = ({I, L, D, a, b, c, …, z, 0, 1, …, 9},
{a, b, c, …, z, 0, 1, …, 9}, I, P)
where P is the following
I → L    I → IL    I → ID
L → a    L → b    L → c    L → d    …    L → z
D → 0    D → 1    D → 2    D → 3    …    D → 9
Example
Find a grammar that generates
L = {ww | w ∈ {0,1}*}