1. Introduction
In order for a machine to understand text written in a natural language like English, it should have a suitable internal representation of the meanings of words and be able to use them to construct representations for the meanings of the sentences they constitute. One of the various solutions to this problem that have been proposed uses Conceptual Graphs [1] for the internal representation. It originated in [2] and is described in section 2.
After the internal representation has been specified, the meanings of the words that the system is supposed to understand should somehow be entered into the system to form a kind of dictionary. This is normally done by the programmer, and the definitions should be written in a formal, machine-understandable language.
In section 4 we propose two algorithms for defining the meaning of a verb or a noun to a
system that uses Conceptual Graphs for the internal representation of meanings. These
algorithms allow word senses for nouns and verbs to be defined at runtime, and can thus add flexibility to an NLP system. Moreover, the definitions are given in natural language,
resembling definitions given in a common dictionary.
2. Related Work
Conceptual Graphs (CGs) are a type of semantic network invented by John Sowa [1].
They have been used for knowledge representation in many applications, among which is
Natural Language Processing. See [1.5] for an introduction to Conceptual Graphs.
In 1986, John Sowa and Eileen Way created a semantic interpreter that represents the meaning of English sentences with Conceptual Graphs [2]. To achieve this, the meanings of words are described using CGs called Canonical Graphs, or Word Sense Graphs, which are saved in a lexicon together with the words. These graphs are joined
together in order to form CGs that describe the meaning of the whole sentence. These
manipulations are driven by the parse tree of the sentence. Walling Cyre describes in [3]
the procedure he uses to do this in a similar semantic interpreter that generates CGs from
digital systems requirements expressed in English. Sykes and Konstantinou developed a
similar system for automatically identifying deontic inconsistencies in legal documents
[4].
The idea of making a machine understand a word definition is not new. Harabagiu and Moldovan [5], for example, created a system that transforms definitions of the WordNet [6] electronic dictionary into semantic networks. Barriere [7] also created a system to transform definitions from a children's dictionary into Conceptual Graphs. The main difference between the approach of those two systems and ours is that, using the algorithms proposed here, the new word senses are defined in terms of ones already understandable by the system, and thus the new word senses are "grounded" to the real world. Defining new word senses in terms of old ones has also appeared before, e.g. in SHRDLU [8] (see [9] for a short description of SHRDLU).
3. Word Sense Graphs for Verbs
Here we present a new approach to Word Sense Graphs for verbs, which will be used in the following section.
A typical Word Sense Graph for a verb sense found in the literature would look like the following, which corresponds to the verb "go" (taken from [2]):
[MOBILE-ENTITY] -> (AGNT) -> [GO] -> (DEST) -> [PLACE]
In our approach, the Word Sense Graph would look like this:
[MOBILE-ENTITY] -> (Go) -> [PLACE]
The difference is that the word is not represented by a Concept, as above, but by a
Relation.
The main advantage of this approach is that the resulting graphs are considerably smaller. Because relation arcs are ordered, the information expressed by the relations "agnt" and "dest" in the former graph is not lost; it can be represented by the numbering of the arcs of the relation "go". An easy way to do this is to define the relation type "go" using the following lambda expression
(lambda expressions are defined in par. 6.4 of [1.6], and par. 6.6 describes how a relation type can be defined with one):
[MOBILE-ENTITY: λ_1] -> (AGNT) -> [GO] -> (DEST) -> [PLACE: λ_2]
If this definition is used, then the two CGs above are equivalent, and we can obtain one from the other by replacing the relation "Go" with its definition or vice versa.
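To make this equivalence concrete, the following minimal Java sketch represents the relation-based graph for "go" and expands the relation into its defining lambda expression. The Concept, Relation, Graph and RelationDefinition classes are our own toy structures, introduced only for illustration; they are not part of Notio or any other CG API.

import java.util.*;

// Toy CG structures, for illustration only (not part of Notio or any CG API):
// a graph is a set of concepts plus n-ary relations whose arcs are ordered.
class Concept {
    String type;                              // e.g. "MOBILE-ENTITY"
    Concept(String type) { this.type = type; }
    public String toString() { return "[" + type + "]"; }
}

class Relation {
    String type;                              // e.g. "Go" or "AGNT"
    List<Concept> args;                       // ordered arcs
    Relation(String type, Concept... args) { this.type = type; this.args = Arrays.asList(args); }
}

class Graph {
    List<Concept> concepts = new ArrayList<>();
    List<Relation> relations = new ArrayList<>();
}

// A lambda-expression definition of a relation type: a body graph whose
// i-th parameter concept corresponds to the i-th arc of the defined relation.
class RelationDefinition {
    Graph body;
    List<Concept> parameters;
    RelationDefinition(Graph body, List<Concept> parameters) { this.body = body; this.parameters = parameters; }
}

public class ExpandRelation {

    // Replace relation r by its definition: copy the body graph and merge each
    // parameter concept with the corresponding argument of r.
    static Graph expand(Graph g, Relation r, RelationDefinition def) {
        Graph out = new Graph();
        out.concepts.addAll(g.concepts);
        Map<Concept, Concept> bind = new HashMap<>();
        for (int i = 0; i < def.parameters.size(); i++)
            bind.put(def.parameters.get(i), r.args.get(i));
        for (Concept c : def.body.concepts)
            if (!bind.containsKey(c)) { out.concepts.add(c); bind.put(c, c); }
        for (Relation rel : g.relations)
            if (rel != r) out.relations.add(rel);
        for (Relation rel : def.body.relations)
            out.relations.add(new Relation(rel.type, rel.args.stream().map(bind::get).toArray(Concept[]::new)));
        return out;
    }

    public static void main(String[] args) {
        // [MOBILE-ENTITY] -> (Go) -> [PLACE]
        Concept mover = new Concept("MOBILE-ENTITY"), place = new Concept("PLACE");
        Graph g = new Graph();
        g.concepts.addAll(List.of(mover, place));
        Relation go = new Relation("Go", mover, place);
        g.relations.add(go);

        // Definition of (Go): [MOBILE-ENTITY: λ_1] -> (AGNT) -> [GO] -> (DEST) -> [PLACE: λ_2]
        Concept p1 = new Concept("MOBILE-ENTITY"), goC = new Concept("GO"), p2 = new Concept("PLACE");
        Graph body = new Graph();
        body.concepts.addAll(List.of(p1, goC, p2));
        body.relations.add(new Relation("AGNT", p1, goC));   // arc direction follows the displayed graph
        body.relations.add(new Relation("DEST", goC, p2));
        RelationDefinition goDef = new RelationDefinition(body, List.of(p1, p2));

        for (Relation rel : expand(g, go, goDef).relations)
            System.out.println("(" + rel.type + ") " + rel.args);
        // Prints the AGNT/DEST form, i.e. the usual Word Sense Graph for "go".
    }
}

Running the sketch prints the AGNT/DEST form, illustrating that the two graphs carry the same information once the arc numbering of "Go" is fixed by its definition.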
A disadvantage of this approach is that senses for nouns and verbs are represented in separate hierarchies. A noun created from a verb (e.g. a gerund) can't be represented with the same concept type as the verb (which is the case in the usual approach). This problem can be worked around, e.g. by using as the Word Sense Graph of the noun a context whose descriptor is the Word Sense Graph of the verb.
The fact that relations must have a specified number of arcs poses another problem. If two verbs take a different number of arguments, the relation types representing them cannot have a subtype relation. However, it may be possible to work around this problem, e.g. by using dummy arcs in one of the relation types (arcs that are not used but are there only to raise the valence of the relation type) or by relaxing the requirement that a relation must have the same valence as its super-relation.
Overall, whether this approach is preferable to the usual one should depend on the details of the application. We propose that the future designer of a system like the ones described in [2], [3] and [4] consider both choices.
4. The Algorithms
The following algorithms can be used to define new senses for nouns or verbs. A system
that uses a Phrase Structure Grammar for parsing and Conceptual Graphs for representing
meaning is presupposed. The main idea is to parse the user's (or dictionary's) definition and create a CG from it, as would be done for any sentence, and then manipulate this CG to create the Word Sense Graph for the new word sense.
Algorithm 1 (for nouns):
1. The user gives a word (the noun for which a new word sense is being defined) and a
phrase that is syntactically an NP (noun phrase) and describes the word sense.
2. This phrase is parsed and one or more CGs are produced. In each of these CGs one concept is specified as the head concept of the CG (if more than one CG is produced, then either one of them is chosen, or each one leads to a different word sense).
3. A monadic lambda expression is created from this CG by specifying the head concept
as the unique parameter. This lambda expression is used to define a new concept type.
4. The Canonical Graph for the new word sense is specified to be a single concept having
as type the one that was defined in the previous step and no referent.
Notes:
a. A semantic interpreter like the ones described in [2], [3] and [4] will already contain the code for creating CGs from NP phrases. It would also most probably keep a reference to a head concept, because ... So only a few additional bits of code are required in order to implement this algorithm.
b. Common reasons for getting more than one CG from a single phrase are multiple senses for a single word and syntactic ambiguity.
c. A unique name has to be automatically created for the defined concept type.
Example:
Suppose that the user wants to give a definition for the word "giraffe". The following
actions would take place:
1. The user enters the word and gives as a description the phrase
"A large herbivorous mammal with a long neck"
2. The phrase is analysed, and a CG is produced (or chosen from a set of produced CGs)
that looks like the one below:
[Mammal: *] -> (attr) -> [Size: large]
            -> (attr) -> [Attribute: herbivorous]
            -> (has_a) -> [Neck: *] -> (attr) -> [Length: long]
The head concept of this CG is the one that has "Mammal" as type.
3. The head concept is specified to be a parameter, and so a lambda expression is formed,
which serves as the definition of a new concept type with a unique name, e.g.
"Giraffe_1".
4. If the word "giraffe" was not in the lexicon it is added as a noun, with the following
Canonical Graph:
[Giraffe_1]
If it was, the above Canonical Graph is added among the others.
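The flow of Algorithm 1 can be summarised with the following minimal Java sketch. The CG, ConceptTypeDefinition and NounPhraseParser types are hypothetical placeholders for whatever the host semantic interpreter provides (they are not part of Notio), graphs are kept as plain strings for brevity, and the lambda expression of step 3 is represented simply by recording which concept of the defining graph is the parameter.

import java.util.*;

// Hypothetical placeholders for what a host semantic interpreter would provide;
// only the flow of the four steps of Algorithm 1 is meant to be illustrative.
class CG {
    String display;          // linear form, e.g. "[Mammal:*]->(attr)->[Size:large]"
    String headConceptId;    // identifier of the head concept inside the graph
    CG(String display, String headConceptId) { this.display = display; this.headConceptId = headConceptId; }
}

// A monadic lambda expression: the defining graph plus the concept chosen as parameter.
class ConceptTypeDefinition {
    String typeName;         // unique, automatically generated name, e.g. "Giraffe_1"
    CG body;
    String parameterId;
    ConceptTypeDefinition(String typeName, CG body, String parameterId) {
        this.typeName = typeName; this.body = body; this.parameterId = parameterId;
    }
}

interface NounPhraseParser {
    List<CG> parseNounPhrase(String phrase);   // step 2: one CG per reading of the NP
}

public class DefineNounSense {
    private final NounPhraseParser parser;
    private final Map<String, ConceptTypeDefinition> conceptTypes = new HashMap<>();
    private final Map<String, List<String>> lexicon = new HashMap<>();  // word -> Canonical Graphs
    private int counter = 0;

    DefineNounSense(NounPhraseParser parser) { this.parser = parser; }

    // Steps 1-4 of Algorithm 1.
    void defineNoun(String word, String definingPhrase) {
        for (CG cg : parser.parseNounPhrase(definingPhrase)) {          // step 2 (each reading -> a sense)
            String typeName = capitalize(word) + "_" + (++counter);     // unique type name
            conceptTypes.put(typeName,                                  // step 3: monadic lambda expression
                new ConceptTypeDefinition(typeName, cg, cg.headConceptId));
            String canonicalGraph = "[" + typeName + "]";               // step 4: single concept, no referent
            lexicon.computeIfAbsent(word, w -> new ArrayList<>()).add(canonicalGraph);
        }
    }

    private static String capitalize(String w) { return Character.toUpperCase(w.charAt(0)) + w.substring(1); }

    public static void main(String[] args) {
        // A stub parser returning the giraffe reading from the example above.
        NounPhraseParser stub = phrase -> List.of(new CG(
            "[Mammal:*]->(attr)->[Size:large] ->(attr)->[Attribute:herbivorous]"
            + " ->(has_a)->[Neck:*]->(attr)->[Length:long]", "Mammal"));
        DefineNounSense d = new DefineNounSense(stub);
        d.defineNoun("giraffe", "a large herbivorous mammal with a long neck");
        System.out.println(d.lexicon.get("giraffe"));   // prints [[Giraffe_1]]
    }
}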
Algorithm 2 (for verbs):
Defining a word sense for a verb is harder than it is for a noun. Every verb has a number of role players, like agent, beneficiary, etc. The syntax of the verb phrase determines which object plays each role. For example, the difference between the sentences "Dog bites man" and "Man bites dog", as far as meaning is concerned, is that the roles have been reversed. A definition of a verb sense should contain information about these roles. The following algorithm serves this purpose well.
1. The user enters the word for which the new sense is being defined, together with two
phrases. One of them (phrase 1) displays the usage of the word, and the other (phrase 2)
describes the meaning of the first phrase. Both phrases should syntactically be either a
VP (verb phrase) or an NP + VP. Phrase 1 should contain as many role players as the
verb being defined can have, and all the names of the role players used in phrase 1 should
appear in phrase 2 as well.
2. Phrase 1 is parsed. The parse tree supplies information about the number of role
players that the verb can have, prepositions or particles that it requires, etc.
3. A new relation type (with a unique name) is added to the relation type hierarchy. The
valence of the new type is set equal to the number of role players of the defined verb.
4. The verb is added to the lexicon, together with any syntactic information acquired in step 2. Its Word Sense Graph is a single relation of the newly defined type, all of whose arguments are concepts of the Universal type with no referent (the most general concept).
5. Phrase 1 is analysed, and a CG is produced (using the Word Sense Graph described
above). We will call it cg1.
6. Phrase 2 is analysed, and one or more CGs are produced. One of them is chosen. We
will call it cg2.
7. For every concept of cg1, we find a concept of cg2 with an equivalent referent and designate it λ_i, where i is the number that the former concept has in the only relation of cg1. Thus cg2 is transformed to a lambda expression, with the same valence as the relation type created in step 3.
8. This lambda expression is set as the definition of the relation type created in step 3.
Notes:
a. The new word needn't be used only in phrases that are syntactically equivalent to phrase 1. After the definition of the new word, it will be added to the lexicon together with any syntactic information extracted from phrase 1, and then it can be used in the same way as any other word of the lexicon.
b. The role players of the verb should appear in phrases 1 and 2 expressed in the same words, so that they can be identified in step 7.
c. When phrase 1 is parsed in step 2, no CGs are generated (they couldn't be, since the verb has no Word Sense Graph yet). For this parsing some extra syntactic rules will be needed, containing code that keeps information such as how many role players the verb has, which prepositions it uses, etc.
d. In step 5 a CG is produced from phrase 1 that contains a relation of the newly defined type. This relation type has not been defined yet; its definition will be constructed using this CG in steps 7 and 8.
e. Step 6 could produce multiple CGs (see note "b" to algorithm 1), but step 5 produces only one, because there is only one Word Sense Graph for the verb, namely the one described in step 4. If the verb was already in the lexicon, other Word Sense Graphs should be blocked. Because of the simplicity of phrase 1, no syntactic ambiguity should appear that could lead to multiple CGs. If, however, for some reason more than one CG is produced, one has to be chosen.
f. While phrase 1 will be quite simple, there is no limitation on how complex phrase 2 can be, as long as it belongs to the required syntactic category.
Example:
Suppose that the user wants to give a definition for the verb "receive", while the verb
"give" is already understood by the program. The following could take place:
1. The user enters the word "receive", and the two phrases. Phrase 1 could be "X receives
something from Y" and phrase 2 "Y gives X something". Note that the same names for
the role players of "receive" (X, Y, something) are used in both phrases.
2. Phrase 1 is parsed, and it is inferred that the new verb has three role players, one of
which requires the preposition "from".
3. A new relation type is added to the hierarchy, with a unique name like "receive_12"
and valence 3.
4. "Receive" is added to the lexicon (if it wasn't before) as a ditransitive verb (because it
has three role players) with the information that role player 2 requires the preposition
"from". Its Word Sense Graph is
(receive_12) <-1- [Universal]
             <-2- [Universal]
             -3-> [Universal]
5. Phrase 1 produces the following CG (cg1):
(receive_12) <-1- [Universal: something]
             <-2- [Universal: Y]
             -3-> [Universal: X]
6. Phrase 2 is analysed, and produces some CGs, from which one looking like the
following is chosen:
(give_3) <-1- [Universal: something]
         <-2- [Person: X]
         -3-> [Person: Y]
We call it cg2.
7. The first concept of the relation of cg1 has the same referent as the first concept of cg2, so the latter is designated λ_1. The process is repeated for the other two concepts, and the following lambda expression is formed:
(give_3) <-1- [Universal: λ_1]
         <-2- [Person: λ_3]
         -3-> [Person: λ_2]
8. The above lambda expression is set as the definition of the relation type "receive_12".
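Step 7 is where most of the new code lies: matching the concepts of cg2 to those of cg1 by referent in order to fix the parameter numbering. A minimal Java sketch of that matching is given below. The ConceptNode and RelationNode records are our own toy structures, not part of any CG API, and the data reproduces the receive/give example above.

import java.util.*;

// Toy structures, illustrative only: a relation is its type name plus an
// ordered list of (concept type, referent) pairs for its arcs.
record ConceptNode(String type, String referent) {}
record RelationNode(String type, List<ConceptNode> args) {}

public class DefineVerbSense {
    // Step 7: for every arc i of the single relation of cg1, find the concept
    // of cg2 with the same referent and mark it as lambda parameter i.
    static Map<ConceptNode, Integer> lambdaParameters(RelationNode cg1Relation,
                                                      List<ConceptNode> cg2Concepts) {
        Map<ConceptNode, Integer> params = new LinkedHashMap<>();
        for (int i = 0; i < cg1Relation.args().size(); i++) {
            String referent = cg1Relation.args().get(i).referent();
            for (ConceptNode c : cg2Concepts) {
                if (c.referent().equals(referent)) {
                    params.put(c, i + 1);   // this concept of cg2 becomes lambda_{i+1}
                    break;
                }
            }
        }
        return params;
    }

    public static void main(String[] args) {
        // cg1: (receive_12) <-1- [Universal: something] <-2- [Universal: Y] -3-> [Universal: X]
        RelationNode cg1 = new RelationNode("receive_12", List.of(
            new ConceptNode("Universal", "something"),
            new ConceptNode("Universal", "Y"),
            new ConceptNode("Universal", "X")));

        // cg2: (give_3) <-1- [Universal: something] <-2- [Person: X] -3-> [Person: Y]
        List<ConceptNode> cg2 = List.of(
            new ConceptNode("Universal", "something"),
            new ConceptNode("Person", "X"),
            new ConceptNode("Person", "Y"));

        // Step 8 would store cg2, with these markings, as the definition of "receive_12".
        lambdaParameters(cg1, cg2).forEach((concept, i) ->
            System.out.println("[" + concept.type() + ": " + concept.referent()
                               + "] becomes lambda_" + i));
        // Prints: something -> lambda_1, Y -> lambda_2, X -> lambda_3,
        // which is exactly the marking shown in the lambda expression above.
    }
}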
After a new word sense has been defined with one of these algorithms, using the word in a sentence will give rise to a CG that contains the concept or relation type that was created during the definition. This concept or relation type can be expanded using its definition, leading to a CG that contains only the "primitive" types.
Continuing the previous example, the sentence "Bill receives a book from John" would
produce (possibly among others) the CG:
(receive_12) <-1- [Book]
             <-2- [Person: John]
             -3-> [Person: Bill]
By expanding the type of the relation, we would get the following CG:
(give_3) <-1- [Book]
         <-2- [Person: Bill]
         -3-> [Person: John]
This is exactly the same as the CG we would get from the sentence "John gave a book to Bill".
5. Implementation
The algorithms presented above can easily be incorporated in a system that translates
natural language into Conceptual Graphs, and much of the code needed will be included
in such a system anyway. In particular, step 2 of algorithm 1 and steps 5 and 6 of
algorithm 2 can be performed without extra code.
The algorithms were implemented by the author in a simple semantic interpreter that used Prolog DCG rules to parse English sentences and the Notio Java API [10] to create and manipulate Conceptual Graphs. About ** Prolog rules and ** Java methods were needed to implement both algorithms. The source code is available upon request from the author.
6. Epilogue
Using these algorithms, one could design a semantic interpreter that contains Word Sense Graphs only for words expressing some primitive concepts, constructed from a number of concept and relation types whose semantics are specified for the application, and let any other word or word sense be defined in terms of these words. In this way the semantics of the new word senses will also be specified for the application. In this paper only nouns and verbs were considered, but the ideas presented should be easily extensible to other kinds of words.
These ideas can be used in various applications. For example, an interactive system can be built that initially "understands" a few basic words, and to which the user progressively adds new word senses until he is satisfied with its understanding of text. Another application could be an information extraction system that reads texts and looks up any unknown words that it meets in an electronic dictionary.
References
[1] Sowa, J. F.: Conceptual Structures. Reading, MA: Addison-Wesley Publishing Company, 1984.
[1.5] Conceptual Graphs: http://meganesia.int.gu.edu.au/~phmartin/WebKBtools/doc/CGs.html
[2] Sowa, J. F., and Way, E. C.: Implementing a semantic interpreter using conceptual graphs, IBM Journal of Research and Development, 30 (1986), 57-69.
[3] Cyre, W. R.: Capture, Integration, and Analysis of Digital System Requirements with Conceptual Graphs, IEEE Transactions on Knowledge and Data Engineering, 9(1): 8-23 (1997).
[4] Konstantinou, V., and Sykes, J.: Defining degrees of obligation (automatic identification of inconsistencies in legal documents), SA'2000 International ICSC Congress on Intelligent Systems and Applications, December 11-15, 2000, University of Wollongong, Australia.
[5] Harabagiu, S. M., and Moldovan, D. I.: Knowledge processing on an extended WordNet, in C. Fellbaum, editor, WordNet: An Electronic Lexical Database, pages 379-405, MIT Press, 1998.
[6] WordNet: http://www.cogsci.princeton.edu/~wn/
[7] Barriere, C.: From a children's first dictionary to a lexical knowledge base of conceptual graphs, Ph.D. thesis, Simon Fraser University, June 1997.
[8] Winograd, T.: Understanding Natural Language, Academic Press, Harcourt Brace Jovanovich, San Diego, New York, 1972.
[9] SHRDLU: http://hci.stanford.edu/~winograd/shrdlu/
[10] Southey, F., and Linders, J. G.: Notio - A Java API for Conceptual Graphs, in William Tepfenhart and Walling Cyre, editors, Proceedings of the 7th International Conference on Conceptual Structures (ICCS'99), Lecture Notes in AI, pages 262-271, Blacksburg, Virginia, U.S.A., 1999. Springer-Verlag.