
An Algebraic Approach to Equivalence

Among Context-Free Linear Grammars:

An Introductory Paper

Mark DeArman

Math 320

12/11/2003

Introduction

Linguistics is the scientific study of language. Key to the scientific study of any subject is the ability of scientists to represent their findings in the concrete language of mathematics, as opposed to simple empirical results. In this paper, I will introduce the mathematical axioms and structures used to model language and grammar in a way that makes mathematical analysis possible. Three sections divide the content logically so that each section builds upon the last. The only prerequisite knowledge assumed is a basic understanding of abstract algebra, topology, and set theory.

The first section describes the algebraic structures that are the building blocks for a Formal Context-Free Linear Grammar. The purpose of this chapter is to refresh prerequisite knowledge and show applications of those structures to language modeling.

The second section explains the axioms used to define a context-free linear grammar and shows examples of the flexibility of the structure. The purpose of this chapter is to show in detail how a CFG is constructed from the building blocks of Section I. The section concludes with a discussion of transformations between grammars, which are important structures for sentence formation and even translation work in natural language computing.

The final section describes how two CFGs can be analyzed using topological techniques to determine their similarity. The purpose of this section is to show further application of the previous two sections' content. Though the final proof given in this section is incomplete, it is included to promote further study in this area, again applicable to translation work.


Section I : Algebraic Structures

Key to visualizing and solving a problem in mathematics is a deep knowledge of how the structure of that problem is set up. In this chapter, I will explain the use of the various algebraic structures that contribute to the formalization of context-free grammars.

The basic building block of natural language is an alphabet. We define an alphabet as a finite set of distinguishable elements called characters. For example, we can define an alphabet of six characters as a set A such that A = { a₁, a₂, a₃, a₄, a₅, a₆ }. It is important to note that A has no underlying structure and is simply an unordered set of elements (Hockett, 55.)

A semigroup is a non-empty set of elements closed under an associative operation. Let S be a semigroup (S, ∘). Then for any x, y, z ∈ S, (x ∘ y) ∘ z = x ∘ (y ∘ z) holds. In terms of our alphabet A, we are interested in a more specific type of semigroup called a free monoid (Hockett, 52.)

A free monoid is a semigroup with an identity element e = ε (the empty string), whose associative operation is concatenation. Let F be a free monoid; then F(A) is the free monoid over the alphabet A whose elements are all the finite strings over A, so that F(A) = { x₁x₂⋯xᵢ | each xⱼ ∈ A, i ≥ 0 }, where the case i = 0 yields the empty string ε.
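To make the free monoid concrete, the following Python sketch enumerates the strings of F(A) up to a fixed length (the full monoid is of infinite order); the three-character alphabet and the length bound are illustrative choices, not taken from the text.

```python
from itertools import product

# Illustrative three-character alphabet; the six-character alphabet in the text
# is not spelled out, so a small stand-in is used here.
A = {"a", "b", "c"}

def free_monoid_strings(alphabet, max_len):
    """Enumerate the elements of F(alphabet) up to length max_len.
    The empty string "" plays the role of the identity e."""
    strings = [""]   # the identity element
    for n in range(1, max_len + 1):
        strings.extend("".join(chars) for chars in product(sorted(alphabet), repeat=n))
    return strings

print(free_monoid_strings(A, 2))   # ['', 'a', 'b', 'c', 'aa', 'ab', ..., 'cc']

# Concatenation is the monoid operation: it is associative and "" is the identity.
x, y, z = "ab", "c", "ca"
assert (x + y) + z == x + (y + z)
assert "" + x == x == x + ""
```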

Since F(A) contains all the finite strings over A, its order is infinite. For example, F(A) must contain all the individual characters of A along with all the combinations formed by their concatenation. If A is simply the null set, then F(A) contains only the empty string ε. Now let B be an alphabet that properly contains A, and let A continue to be as defined above. Then it follows that F(A) ≠ F(B), since F(B) contains an infinite number of elements which differ from those of F(A) (Hockett, 56.)

It is important to note from the previous example that, since A ⊂ B, there must be some embedding (injective homomorphism) f : F(A) → F(B). If we let H ⊂ F(B) be the image of F(A) under f, then H is a simple example of a language over the free monoid F(B) (Spanier, 2.)

Obviously, such a simple embedding does nothing to characterize a natural language. Our goal in mathematical linguistics is to find mappings f₀, f₁, …, fₙ which generate a language H whose strings mimic natural language. This collection of functions will be developed in the next section into a definition of formal context-free grammars as linear generative grammars (Hockett, 58.)


Section II : Formal Context-Free Grammars

A formal grammar is an absolute description of a subset H ⊂ F(A), where A is some finite alphabet. There are many varieties of formal grammars, but the scope of this paper only addresses a certain class of the larger category: those H that are linearly generated and context-free. Section III will address the implications and advantages of linear generation, though the choice of this name will be evident after the definition of their structure. A context-sensitive grammar, as opposed to the ones developed in this paper, describes a subset H ⊂ F(A) such that H is closed under a binary ordering operator. No further discussion will be devoted to such grammars (Charniak and Wilks, 55.)

Definition: Context-Free Linear Generative Grammar ¹

A linear generative grammar is a system G(A, I, T, R) characterized by the following postulates:

o A is a finite alphabet.
o I is a unique initial character of A.
o T is a proper subset of A – { I } called the terminal subalphabet.
o The characters N = A – T are called the non-terminal or auxiliary subalphabet.
o R is a non-null finite set of rule mappings { R₁, …, Rₘ }. Each rule is a function whose domain is the free monoid F(A) and whose image is some subset thereof. If s is any string over A, then R(s) is unique over A.

Postulate P1.

For every rule R, R(∅) = ∅.

A non-null string over the terminal subalphabet T is a terminal string.

Postulate P2.

If s is a terminal string and R is any rule, then R(s) = ∅.

Let S = (R₁, …, Rₙ), n ≥ 1, be a finite sequence in which each Rᵢ is some rule of R. For a given string s over A let s₁ = R₁(s), …, sₙ = Rₙ(sₙ₋₁). Then S, like R, is a function whose domain is F(A) and whose image is a subset thereof. If, for this sequence, there exists some non-null sₙ over A, then we say the sequence is a rule row. Thus if S(s) ≠ ∅ then we can say S is a rule which takes s as an instring and generates S(s) as an outstring.

Postulate P3.

Given any rule row S, and any string s over A acceptable as an instring to S, S(s) ≠ s.

A rule R such that R(I) ≠ ∅ is an initial rule. If R ∈ S, then S is an initial rule row, and S(I) depends on the choice of S from R. If S(I) generates a terminal string, then S is called a rule chain.

Postulate P4.

Every rule of R appears on at least one chain.

From P3, circuit formation is prohibited because no S can generate itself.

Note that R can contain any number of duplicate rules.

From the definition above, a trivial grammar G can be constructed as an example to find H(G) ⊂ F(A) such that |H(G)| = n with 0 < n < ∞. Granted that this trivial case in no way resembles a natural language, it shows all the steps necessary to find some H(G) which does.

Example One ²

Let A = { I, a } and let T = { a }. Let R contain the single rule R₁ defined by R₁(s) = a if s = I, and R₁(s) = ∅ otherwise.

The alphabet A is a finite set of order two, T is the terminal alphabet of order one, and R contains one rule which transforms the initial string I into a. For all s ∈ F(A), there is only one acceptable instring, s = "I"; thus G generates H(G) ⊂ F(T) of order one, containing only the string "a". Though this example is simplistic, it helps to illustrate the basic procedure. The next example expands upon Example One and shows the generation of a language H(G) with a more complex structure, more applicable to the discussion of Section III.

¹ This definition is taken from Hockett, 59-61, and contains only slight modifications to better fit the needs of this paper.
² This example was derived from Hockett, 62, and contains slight modifications to better fit the needs of the paper.

Example Two ³

Let A = { I , b,B,l,L,p,P } and let T = { b,l,p }.

Let R be the set containing the following rules:

R₁(s) = B L if s = I, and ∅ otherwise.
R₂(s) = s with B replaced by b if B occurs in s, and ∅ otherwise.
R₃(s) = s with L replaced by l P if L occurs in s, and ∅ otherwise.
R₄(s) = s with P replaced by p if P occurs in s, and ∅ otherwise.

Let G(A, I, T, R) be a grammar and let C(I) = s be some permutation of S(I) which generates the terminal string for the system of rules.

Visually, a certain C(I) for this G would be structured as follows:

I
R₁: B L
R₂: b L
R₃: b l P
R₄: b l p

Note that the ordering of the rules within R is not significant, since only permutations of S which generate valid outstrings are acceptable as a choice of C(I). Observe that the structure of C gives rise to a tree topology (Blackett, 165.)

Let T₀ be a topological network formed by the ordering of C(I) for the above permutation of S. Then T₀ is as follows, where each level of the tree represents a transformation by Rₙ which either fixes elements or replaces them based on a given rule:

I
R₁: B L
R₂: b L
R₃: b l P
R₄: b l p
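The derivation shown in T₀ can be reproduced mechanically. In the sketch below, each rule is modeled as a single replacement read off the levels of the tree (B L for I, b for B, l P for L, p for P); this reading of the rules is an illustrative reconstruction rather than a verbatim transcription.

```python
NULL = None  # the null result: the rule does not apply

def replace_once(target, replacement):
    """Build a rule that rewrites the first occurrence of `target` in the instring,
    returning NULL when `target` does not occur (the rule is inapplicable)."""
    def rule(s):
        if s is NULL or target not in s:
            return NULL
        return s.replace(target, replacement, 1)
    return rule

# Rule bodies read off the levels of the tree T0 (an illustrative reconstruction).
R1 = replace_once("I", "B L")
R2 = replace_once("B", "b")
R3 = replace_once("L", "l P")
R4 = replace_once("P", "p")

s = "I"
for name, rule in [("R1", R1), ("R2", R2), ("R3", R3), ("R4", R4)]:
    s = rule(s)
    print(name + ":", s)
# R1: B L
# R2: b L
# R3: b l P
# R4: b l p   <- a terminal string over T = {b, l, p}, so this rule row is a chain
```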

³ This example was based on work throughout Hockett.

Since T₀ is a topological space which is a subspace of F(A), we should be able to choose a transformation c which will transform outstrings of G into outstrings of G′ for a given C and C′:

o Let A be the English alphabet in upper and lower case. Let A′ = A ∪ { '(' , ')' }.
o Let T = A – { x ∈ A | x is upper case }.
o Define a function Coll(T, A) that takes a tree T and replaces all non-terminal characters with unique non-terminal characters from A, maintaining the ordering at the vertices by use of '(' and ')'. Define a second function Flat(T, T) which takes all terminal characters of T and replaces them with x₀ … xₙ.
o From Example Two, let c₁ be a transformation equal to Flat[Coll(T₀, A), T] = I(A x₁ B(C x₂ D x₃)).
o Let c₂ = I(K(G x₃ B x₁) L(M(j) E x₂)) be the collapse of a tree T₁ generated by G′ and C′(I).
o Let c be a function defined as c : c₁ → c₂, which maps x₀ → x₀, …, xₙ → xₙ.

Then any outstring s in C(I) ⊂ H(G) can be transformed into an outstring s′ of C′(I) ⊂ H(G′). Thus we can find morphisms over F(A) which can transform between the rules of R and R′.
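The Coll/Flat idea can be illustrated with a small sketch that collapses a parse tree into a bracketed string: non-terminal vertices are relabeled with fresh upper-case characters, ordering is preserved with parentheses, and terminal leaves are replaced by placeholders x1, x2, …. The tree encoding and the single helper function below are illustrative assumptions, not the paper's exact definitions of Coll and Flat.

```python
import string
from itertools import count

# A tree is (label, [children]); a terminal leaf is just a lower-case string.
# This T0 is the derivation tree of Example Two.
T0 = ("I", [("B", ["b"]), ("L", ["l", ("P", ["p"])])])

def coll_flat(tree, labels=None, counter=None):
    """Collapse a tree into a bracketed string: non-terminal vertices are relabeled
    with fresh upper-case characters (the root I is kept), ordering is preserved
    with parentheses, and terminal leaves become placeholders x1, x2, ..."""
    if labels is None:
        labels = iter(string.ascii_uppercase)
    if counter is None:
        counter = count(1)
    if isinstance(tree, str):                       # terminal leaf
        return "x%d" % next(counter)
    label, children = tree
    new_label = label if label == "I" else next(labels)
    inner = " ".join(coll_flat(c, labels, counter) for c in children)
    return "%s(%s)" % (new_label, inner)

print(coll_flat(T0))   # I(A(x1) B(x2 C(x3)))
```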

Now we must make the transition from modeling random strings of characters to modeling sentences. In order to do this, we must define a bijection s : L → A which maps all the words and grammar parts from a lexicon L into an alphabet A. Then we can choose rules in R that appropriately model sentence structure. As before, some rule sets may yield |H(G)| = ∞ while others may yield finite languages. Most notably, any compound sentences will yield a language of infinite order (Kuroda, 174.) When dealing with languages of infinite order, the algebraic techniques described above tell us nothing about an entire language. In this scenario, the techniques of the next section will help.
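A minimal sketch of the bijection s : L → A, assuming a toy lexicon; each word or grammar part is assigned a distinct character, so that sentences become strings over the alphabet.

```python
# Toy lexicon of words and grammar parts (an illustrative assumption).
lexicon = ["the", "dog", "cat", "sees", "NOUN", "VERB"]

# A bijection s : L -> A, assigning each lexicon entry a distinct character.
s = {word: chr(ord("a") + i) for i, word in enumerate(lexicon)}
s_inv = {ch: word for word, ch in s.items()}

sentence = ["the", "dog", "sees", "the", "cat"]
encoded = "".join(s[w] for w in sentence)   # a string over the alphabet A
decoded = [s_inv[ch] for ch in encoded]

print(encoded)              # 'abdac'
print(decoded == sentence)  # True: the mapping is invertible on the lexicon
```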

With the algebraic tools defined in this section, nothing but computation is necessary to develop an H(G) which is isomorphic to a given natural language. It follows that two distinct natural languages could be modeled as H and H′. Having two distinct models of natural languages leads to the question of whether it is even possible to find transformations between them that work as a translation agent. As a first step, Section III will discuss the mathematical tools that we have to investigate the similarity between two phrase structures of infinite order.


Section III : Topology Applied to Formal Context-Free Grammar Structures

A linear generative grammar defines the structure of the subset H(G). The structure is defined as the semigroup consisting of all s = C(I) ⊂ S(x), x ∈ F(A), which can be finite or infinite. If instead we define H(G) = { S(s) ⊃ Cₙ(I) ≠ ∅ | 0 < n ≤ m, s ∈ F(A) }, with m paths through S(R), then we can construct a topological analog, much as was done for Example Two.

Topology allows us to talk about one thing being near another within a space. The topological space for our tree structures is the free monoid over the alphabet under the influence of the grammar structure. If a homeomorphism can be found between two topological spaces that maps the neighborhoods of one space near enough to those of the other, then the two spaces can be considered structurally equivalent (Kuroda, 175.)

We define our topological space as follows ⁴:

o Let G be a grammar structure, G(A, I, T, R).
o Let F(A) be the free monoid over the alphabet.
o Let H(G) be the language induced by G, and let H have order ∞.

Then ( H(G), F(A) ) meets the axioms for a topological space.

1. ∅ ∈ F(A) and H(G) ∈ F(A).
2. For all U₁, U₂ ∈ F(A), U₁ ∩ U₂ ∈ F(A).
3. If X ⊂ F(A), then ∪X ∈ F(A).
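In the infinite-order case these conditions can only be argued symbolically, but the following sketch checks them for a small finite stand-in family of open sets, which may help make the axioms concrete; the toy family is an assumption for illustration and is not the space ( H(G), F(A) ) itself.

```python
from itertools import chain, combinations

# A toy family of "open" subsets of a small carrier set, standing in for F(A);
# the whole space plays the role of H(G). This is an illustrative assumption only.
space = frozenset({"a", "b", "ab"})
opens = {frozenset(), frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"}), space}

# Axiom 1: the empty set and the whole space belong to the family.
assert frozenset() in opens and space in opens

# Axiom 2: the intersection of any two members belongs to the family.
assert all(u & v in opens for u, v in combinations(opens, 2))

# Axiom 3: the union of every subfamily belongs to the family (the empty union is the empty set).
subfamilies = chain.from_iterable(combinations(list(opens), k) for k in range(len(opens) + 1))
assert all(frozenset().union(*sub) in opens for sub in subfamilies)

print("the toy family satisfies the three axioms")
```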

The base for the space with respect to a given x ∈ H(G) is B(x) = { S(x) ⊃ Cₙ(I) ≠ ∅ | 0 < n ≤ m }. The neighborhood system for the space is then defined as { B(x) }, x ∈ H(G), and meets the axioms for a neighborhood system:

1. For every x ∈ H(G), B(x) ≠ ∅, and for every C ∈ B(x), x ∈ C.
2. If x ∈ C ∈ B(y), then there exists a V ∈ B(x) such that V ⊂ C.
3. For any U₁, U₂ ∈ B(x) there exists a U ∈ B(x) such that U ⊂ U₁ ∩ U₂.

With these preconditions, we can proceed to analyze nearness or structural similarity of two spaces.

Let K and K′ be languages generated by distinct grammars G and G′ respectively.

Let T = (K, F(A)) and T′ = (K′, F(B)) be topological spaces which satisfy the axioms above.

Let P and P′ be arbitrary finite sets such that P ⊂ K and P′ ⊂ K′.

For t ∈ T, prune t so that t ∈ P, and denote this new structure tₚ.

For t′ ∈ T′, prune t′ so that t′ ∈ P′, and denote this new structure ′tₚ.

Define new topological spaces tₚ = (P, K) and ′tₚ = (P′, K′), which satisfy all the axioms above.

If tₚ ∈ ′tₚ, then t′ is near t relative to the pruning set.

Since P ⊂ K for a given K, it is also a subset of B(x), the neighborhood system.
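The pruning comparison can be made concrete on finite samples. The sketch below draws a finite pruning set from each of two toy languages and tests whether the pruned structure of one sits inside the other; the two languages and the simple containment test are illustrative assumptions rather than the paper's formal construction.

```python
# Two toy languages (finite fragments of K and K'), an illustrative assumption.
K  = {"a", "ab", "abb", "abbb"}
K2 = {"a", "ab", "abb", "abbb", "ba"}     # stands in for K'

# Pruning sets: finite subsets chosen from each language.
P  = {s for s in K  if len(s) <= 2}       # prune K  down to short strings
P2 = {s for s in K2 if len(s) <= 2}       # prune K' down to short strings

t_p  = (frozenset(P),  frozenset(K))      # pruned structure t_p  = (P,  K)
t_p2 = (frozenset(P2), frozenset(K2))     # pruned structure 't_p = (P', K')

# "t' is near t relative to the pruning set" is tested here as containment of the
# pruned pieces, a simple stand-in for the membership condition in the text.
near = t_p[0] <= t_p2[0] and t_p[1] <= t_p2[1]
print(near)   # True for this choice of pruning sets
```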

Let f₀ and f₁ be mappings such that f₀, f₁ : (P, K) → (P′, K′) relative to P.

Thus f₀ is homotopic to f₁ relative to P if there exists a mapping Z : (K, P) × I → (K′, P′), where I is the unit interval (Spanier, 23.)

⁴ Engelking, 13.

The larger the sets P and P′, the more likely the homotopy is to exist and be continuous from K into K′, because P ⊂ B(x) (Kuroda, 183.)

The results given in this section are by no means complete. The information is presented here as a starting point for further research into the topology of various classes of phrase structures.

Conclusion

This paper is the culmination of at least a year of research and has done nothing but open further doors awaiting exploration by the author.

I have tried to give a brief overview of basic algebraic structures and their application to linguistics. In the third section, the explanation may have gone off the deep end, and it is obvious that more research needs to be concentrated in this area.

In a final section, which I would have included space permitting, I would have liked to investigate some of the more geometric representations of phrase structure, applying results from geometric topology and graph theory to obtain further analysis. This seems the most promising area of study, since the algebraic homotopy relations do not yield easy or meaningful solutions. Matrix representations of phrase-structure vertex and edge equations might be easy to solve as vector processing gets faster on modern computers.

My conclusion is, of course, as stated earlier: more research needs to be done before any meaningful results can be derived.


o Hockett, Charles F. 1967. Language, Mathematics, and Linguistics. Mouton and Co., Paris.
o Spanier, Edwin H. 1996. Algebraic Topology. McGraw-Hill, New York.
o Kuroda, S. Y. 1987. "A Topological Approach to Structural Equivalences of Formal Languages." Mathematics of Language. University of California, San Diego.
o Engelking, Ryszard. 1989. General Topology. Heldermann Verlag, Berlin.
