G. M. Emelyanov 2, D. V. Mikhailov 2
2 Yaroslav-the-Wise Novgorod State University,
173003, Russia, Velikii Novgorod, ul. Bol'shaya St. Petersburgskaya, 41, tel.: (8162)627940, e-mail: gem@novsu.ac.ru (G. M. Emelyanov), mdv@novsu.ac.ru (D. V. Mikhailov)
An approach to formalizing the semantic correlations between a lexeme and its lexical correlates in the problem of recognizing situations of synonymy is presented. The situations of synonymy are described on the basis of standard Lexical Functions. The paper presents principles for generalizing independent descriptions of the theory of a word's Lexical Meaning.
Introduction
Applying the apparatus of standard Lexical Functions (LF) within the framework of the “Meaning-Text” approach makes it possible to prove the synonymy of Natural Language (NL) statements on the basis of a finite set $Rls$ of correctly formalizable rules for transforming Deep Syntactic Structures (DSS) [4, 5]. Nevertheless, a significant difficulty in implementing $Rls$ is formalizing the conditions under which the rules apply. For each $rl \in Rls$, the applicability condition is a set of requirements on the syntactic and semantic properties of the lexical blocks replaced by $rl$.
Let us consider the proof of the LF-synonymy of statements as a classical Pattern Recognition (PR) problem. The whole set $L$ of pairs of NL statements between which LF-synonymy can be established (with respect to $Rls$) is the initial set of classified objects. The provability of LF-synonymy for pairs of statements $l \in L$ with respect to a fixed $rl \in Rls$ is then the basis for grouping them into one taxon. The applicability condition of $rl$ thus acts as a precedent, that is, as a typical representative of the taxon $rl$. The PR problem is formulated as follows: a new pair $l \in L$ that did not participate in the taxonomy is presented. It is required to analyse the applicability conditions of the rules and to recognize the class pattern $rl \in Rls$ to which the object $l$ is most similar. The problem statement is to develop a program-realizable representation of the applicability condition by revealing the character of the semantic correlations between a lexeme and its lexical correlates for the basic types of Lexical Functions.
Since the redistribution of sense that is relevant to formalizing the applicability condition is characteristic of situations with parametric LFs, the PR problem given above should be formulated as recognition of the semantic relation expressed by a split value. Revealing and generalizing this relation is directly analogous to describing the semantics of Noun Phrases [1]. Accordingly, for the Lexical Meanings (LM) of the words replaced by $rl$, formalized descriptions are constructed in the form of theories, that is, sets of meaning postulates connecting each replaced word with other words and concepts. However, when the theory of one word is constructed independently by different researchers, the problem arises of generalizing the knowledge obtained in this way. This problem is especially pressing when theories are constructed from NL definitions of LMs using standard conceptual languages [2].

This work is financially supported by RFBR (project №06-01-00028) and by ESIC of NovSU.
Solution methods
Suppose that for $Lec_j^i \in l_j$, $l_j \in L$, the theory of its Lexical Meaning is described by a compound object of the Prolog language:

$$\mathit{lmth}(\mathit{Lec\_j\_i},\ \mathit{Var\_Smth},\ \mathit{Rel\_list}) \tag{1}$$

which describes a set of binary relations $\mathit{Rel}$ between concepts $\mathit{Cncpt1}$ and $\mathit{Cncpt2}$:

$$\mathit{rel2}(\mathit{Rel},\ \mathit{Cncpt1},\ \mathit{Cncpt2}) \tag{2}$$

and recursively defined relations of arbitrary arity:

$$\mathit{rel2\_complex}(\mathit{Rel},\ \mathit{Cncpt},\ \mathit{Rel\_list}) \tag{3}$$

and

$$\mathit{rel\_complex}(\mathit{Rel},\ \mathit{Rel\_list}) \tag{4}$$

by means of a list $\mathit{Rel\_list}$ of structures of kinds (2), (3) and (4).
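As an illustration only, structures (1)-(4) can be mirrored outside Prolog by a few record types. The sketch below uses Python; the class names, the helper `count_statements` and the content of the toy theory are assumptions introduced for this example and are not taken from the definitions used in the paper.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Rel2:
    """Kind (2): binary relation `rel` between two concepts."""
    rel: str
    cncpt1: str
    cncpt2: str


@dataclass(frozen=True)
class Rel2Complex:
    """Kind (3): relation between a concept and a nested list of statements."""
    rel: str
    cncpt: str
    rel_list: Tuple["Statement", ...]


@dataclass(frozen=True)
class RelComplex:
    """Kind (4): relation (for example "or"/"and") over a nested list."""
    rel: str
    rel_list: Tuple["Statement", ...]


Statement = Union[Rel2, Rel2Complex, RelComplex]


@dataclass(frozen=True)
class Lmth:
    """Kind (1): theory of the lexical meaning of the lexeme `lec`."""
    lec: str
    var_smth: str
    rel_list: Tuple[Statement, ...]


def count_statements(rel_list) -> int:
    """Count statements of kinds (2)-(4) at all nesting levels."""
    total = 0
    for stmt in rel_list:
        total += 1
        if not isinstance(stmt, Rel2):
            total += count_statements(stmt.rel_list)
    return total


# Hypothetical toy theory; the relations and concepts are invented.
theory = Lmth(
    lec="aggressor",
    var_smth="X",
    rel_list=(
        Rel2("agent", "X", "attack"),
        Rel2Complex("object", "attack", (Rel2("kind", "Y", "state"),)),
    ),
)
print(count_statements(theory.rel_list))
```

Immutable records are chosen here so that statements can be compared and stored in sets, which is convenient when several independently built theories of one LM are compared.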
An LM given by (1) is a denotation to which logic assigns an extension [3], that is, the class of entities defined by (1). Since, from the philosophical point of view, the sense (or intension [3]) of $Lec_j^i$ is a network of relations between $Lec_j^i$ and other words $Lec_k^m$, the sense of a lexeme can be defined by a set of functions given by statements of kinds (2), (3) and (4) within theories. These functions characterize the concepts designated by $Lec_j^i$. Following the terminology of [3], we call such functions Characteristic Functions (ChF) for a set of Lexical Meanings. As we have shown in [1], each of them can be given either by a single statement or by a group of statements.
When structure (1) is used to describe the theory of the LM $\mathit{Lec\_j\_i}$, the value of each of these functions equals the third argument $\mathit{Cncpt2\_mng\_fn}$ of the relation in some statement of kind (2). $\mathit{Cncpt2\_mng\_fn}$ must designate a concept known to the system (this concept is identified with the Semantic Class (SCl) of some word). The name of the ChF is the first argument $\mathit{Rel\_name\_fn}$ of the first statement of kind (2) or (3) whose first argument designates a known SCl (this SCl must define a relational noun). This statement is found by scanning the list $\mathit{Rel\_list}$ of statements of kind (2) for the given $Lec_j^i$ backwards, starting from the statement whose third argument is the $\mathit{Cncpt2\_mng\_fn}$ mentioned above. Here $\mathit{Rel\_list}$ may also be the list that forms the third argument of a statement of kind (3) whose first argument is $\mathit{Rel\_name\_fn}$. The list formed during this scan of $\mathit{Rel\_list}$ is denoted below by $\mathit{Rel\_list\_fn}$, $\mathit{Rel\_list\_fn} \subseteq \mathit{Rel\_list}$.

The second argument of the statement with $\mathit{Rel\_name\_fn}$ must be the variable $\mathit{Var\_Smth}$, which designates the word interpreted by (1). Each subsequent statement in $\mathit{Rel\_list\_fn}$ must share with the preceding statement at least one argument that designates a variable.
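The backward scan described above can be sketched in Python for the simplest case, a flat list of statements of kind (2). This is a simplified reading, not the authors' implementation: statements are plain `(rel, cncpt1, cncpt2)` tuples, variables are recognized by the Prolog upper-case convention, statements that do not share a variable with the chain are skipped, and all names and data are hypothetical.

```python
def is_variable(term: str) -> bool:
    """Prolog convention: variables start with an upper-case letter."""
    return term[:1].isupper()


def shares_variable(stmt_a, stmt_b) -> bool:
    """True if the two statements have a common argument that is a variable."""
    return any(is_variable(a) and a in stmt_b[1:] for a in stmt_a[1:])


def characteristic_functions(rel_list, var_smth, known_classes):
    """Return (name, value, rel_list_fn) triples found in a flat rel_list.

    rel_list holds statements of kind (2) as (rel, cncpt1, cncpt2) tuples.
    """
    found = []
    for end, (_, _, cncpt2) in enumerate(rel_list):
        if cncpt2 not in known_classes:
            continue  # the value must be a known semantic class
        chain = []
        for stmt in reversed(rel_list[: end + 1]):  # backward scan
            if chain and not shares_variable(stmt, chain[0]):
                continue  # assumption: unlinked statements are skipped
            chain.insert(0, stmt)
            if stmt[0] in known_classes and stmt[1] == var_smth:
                found.append((stmt[0], cncpt2, tuple(chain)))
                break
    return found


# Hypothetical data: relation names and semantic classes are invented.
rel_list = [
    ("agent", "X", "A"),
    ("kind", "A", "attack"),
    ("initiator", "X", "W"),
    ("kind", "W", "war"),
]
known_classes = {"agent", "initiator", "attack", "war"}
chfs = characteristic_functions(rel_list, "X", known_classes)
for name, value, rel_list_fn in chfs:
    print(name, value, rel_list_fn)
```

Nested lists inside statements of kind (3) would need the same scan applied recursively; that case is left out to keep the sketch short.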
According to the definition of sense as intension formulated in [3], outwardly different descriptions (1) of theories of the same LM yield a common set of the ChFs mentioned above. Together these ChFs define the intension for the generalized theory of the LM under consideration.

Proceeding from the definition of intension as a function from possible worlds to extensions [3], and from the recursive nature of meaning postulates, we pose the task of constructing the generalized theory of a given LM from independently obtained variants of its theory as the restoration of the syntactic representation [3] of an extension from the known syntax of the expressions for the ChFs that make up the intension and are written as sets of statements of kinds (2), (3) and (4). We have a ternary relation $I \subseteq G \times M \times W$ between:

a set of objects $G$, which correspond to the variants $lmth_{oj}^{i}$ of the definition of the LM $Lec_j^i$ in the form (1), $G = \{lmth_{oj}^{i}\}$;
a set of attributes $M$, which correspond to the values $\mathit{Cncpt2\_mng\_fn}_{poj}^{i}$ of the Characteristic Functions for $lmth_{oj}^{i}$;
a set $W$ of attribute values. In our task each $w \in W$ is the name $\mathit{Rel\_name\_fn}_{poj}^{i}$ of a ChF whose value belongs to $M$.

The relation $I$ can be considered as a binary relation

$$I_1 \subseteq \{(lmth_{oj}^{i},\ \mathit{Cncpt2\_mng\_fn}_{poj}^{i})\} \times W.$$
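A minimal Python sketch of these two views of the relation follows. It assumes that the ChFs of every definition variant are already available as `(name, value)` pairs; the function names and the sample data are hypothetical.

```python
def ternary_relation(chfs_by_definition):
    """Build I as a set of (object g, attribute m, value w) triples.

    Following the text, the attribute m is the ChF value and w is the ChF name.
    """
    return {
        (g, value, name)
        for g, chfs in chfs_by_definition.items()
        for name, value in chfs
    }


def as_binary(relation):
    """View I as a binary relation between (g, m) pairs and values w."""
    return {((g, m), w) for g, m, w in relation}


# Hypothetical ChFs extracted from two definition variants.
chfs_by_definition = {
    "Definition1": [("agent", "attack"), ("initiator", "war")],
    "Definition2": [("agent", "attack")],
}
I = ternary_relation(chfs_by_definition)
I1 = as_binary(I)
print(sorted(I1))
```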
By the Basic Theorem of Formal Concept Analysis (FCA) [4], proved by G. Birkhoff, a complete lattice can be constructed for any binary relation. This makes it possible to apply the mathematical apparatus of FCA to our problem.
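To make the lattice construction concrete, the following Python sketch enumerates all formal concepts of a small binary context by closing the object intents under intersection, a standard textbook procedure. The context is hypothetical: each definition variant is an object and each `name=value` string stands for one ChF, which is an assumed encoding rather than the one used in the paper.

```python
def formal_concepts(context):
    """Return all formal concepts (extent, intent) of a binary context.

    context maps each object to the frozenset of its attributes.
    """
    all_attributes = frozenset().union(*context.values())
    intents = {all_attributes}
    for attributes in context.values():
        # Close the family of intents under intersection with each object intent.
        intents |= {intent & attributes for intent in intents}
    concepts = []
    for intent in intents:
        extent = frozenset(g for g, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    # Order from the bottom concept (smallest extent) to the top concept.
    return sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0])))


# Hypothetical context with invented attributes.
context = {
    "Definition1": frozenset({"agent=attack", "initiator=war"}),
    "Definition2": frozenset({"agent=attack", "victim=state", "means=force"}),
    "Definition3": frozenset({"agent=attack", "victim=state", "goal=territory"}),
}
concepts = formal_concepts(context)
for extent, intent in concepts:
    print(sorted(extent), sorted(intent))
```

The enumeration is exponential in the worst case, which is acceptable for the handful of definition variants considered here.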
In view of the complex character of postulates of kinds (3) and (4), we extend the set $M$ of formal attributes with the first arguments $\mathit{Rel\_from\_Arg}_{qpoj}^{i}$ of the statements of kind (2) that are elements of $\mathit{Rel\_list\_fn}_{poj}^{i}$ for the given variant $lmth_{oj}^{i}$ of the theory $lmth_j^i$ (we denote the resulting set by $M_1$ and the correspondingly extended set of Formal Concepts (FC) by $G_1$). Like $\mathit{Cncpt2\_mng\_fn}_{poj}^{i}$, $\mathit{Rel\_from\_Arg}_{qpoj}^{i}$ must designate a Semantic Class known to the system. In addition, $\mathit{Rel\_from\_Arg}_{qpoj}^{i}$ is, as a rule, associated with some relation given by the noun that names it. Thus $\mathit{Rel\_from\_Arg}_{qpoj}^{i}$, along with $\mathit{Cncpt2\_mng\_fn}_{poj}^{i}$, will in effect characterize the FC given by the pair $(lmth_{oj}^{i},\ \mathit{Rel\_name\_fn}_{poj}^{i})$. The value of the attribute $\mathit{Rel\_from\_Arg}_{qpoj}^{i}$ equals the third argument $\mathit{Cncpt2\_mng\_fn}_{qpoj}^{i}$ of the first statement of kind (2) in the list $\mathit{Rel\_list\_fn}_{poj}^{i}$ (when this list is scanned forwards); $\mathit{Cncpt2\_mng\_fn}_{qpoj}^{i}$ must designate a Semantic Class known to the system. The search for such a statement and the formation of the corresponding sublist of $\mathit{Rel\_list\_fn}$ are carried out by analogy with the formation of $\mathit{Rel\_list\_fn}$ itself.
Introducing the multi-valued context

$$K = (G_1,\ M_1,\ W,\ I) \tag{5}$$

determines on the set $G$ a relation known in FCA as the “subconcept-superconcept” relation [4]. Moreover, for any subset of objects from $G$ the Least Common Superconcept (LCS) and the Greatest Common Subconcept (GCS) can be given. A set of objects connected by the “subconcept-superconcept” relation with one GCS and/or one LCS is then to be regarded as an area. The roles of the LCS and the GCS can be played by the top concept and the bottom concept of the lattice, respectively [4]. In this paper we require that both the GCS and the LCS of an area be unique. The context (5) can be visualized (Fig. 1) with the specialized software ToscanaJ (http://toscanaj.sourceforge.net), which implements FCA methods.

Fig. 1. LM's definitions for Russian word “агрессор”
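For readers who want to experiment without ToscanaJ, the LCS and GCS of two formal concepts can be computed directly from the derivation operators of FCA: the LCS takes the intersection of the intents, the GCS the intersection of the extents. The Python sketch below works on a one-valued context; the function names and the toy data are assumptions made for illustration.

```python
def extent_of(context, attributes):
    """All objects that have every attribute in `attributes`."""
    return frozenset(g for g, attrs in context.items() if attributes <= attrs)


def intent_of(context, objects):
    """All attributes shared by every object in `objects`."""
    shared = frozenset().union(*context.values())
    for g in objects:
        shared &= context[g]
    return shared


def object_concept(context, g):
    """Smallest concept whose extent contains the object g."""
    return (extent_of(context, context[g]), context[g])


def lcs(context, c1, c2):
    """Least Common Superconcept: intersect the intents, derive the extent."""
    common = c1[1] & c2[1]
    return (extent_of(context, common), common)


def gcs(context, c1, c2):
    """Greatest Common Subconcept: intersect the extents, derive the intent."""
    common = c1[0] & c2[0]
    return (common, intent_of(context, common))


# Hypothetical context; attribute strings are invented.
context = {
    "Definition1": frozenset({"agent=attack", "initiator=war"}),
    "Definition2": frozenset({"agent=attack", "victim=state", "means=force"}),
    "Definition3": frozenset({"agent=attack", "victim=state", "goal=territory"}),
}
c2 = object_concept(context, "Definition2")
c3 = object_concept(context, "Definition3")
lcs_23 = lcs(context, c2, c3)
gcs_23 = gcs(context, c2, c3)
print(sorted(lcs_23[0]), sorted(lcs_23[1]))
print(sorted(gcs_23[0]), sorted(gcs_23[1]))
```

In this toy context the GCS of the two definitions is the bottom concept of the lattice, which matches the remark above that the top and bottom concepts can play the roles of LCS and GCS.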
Using the definition of an area of a lattice introduced above with reference to the elements of the LM definition of a given $Lec_j^i$, we formally define the key rule for generalizing statements of theories (1).

Two compared statements of the kind

$$\mathit{rel2\_complex}(\mathit{Rel},\ \mathit{Cncpt},\ \mathit{Rel\_list}_1) \quad \text{and} \quad \mathit{rel2\_complex}(\mathit{Rel},\ \mathit{Cncpt},\ \mathit{Rel\_list}_2)$$

with coincident first and second arguments become, in the resulting theory, one statement of kind (3) whose third argument includes a statement (4) that unites the statements from the lists $\mathit{Rel\_list}_1$ and $\mathit{Rel\_list}_2$ by an “or” relation, provided the following condition is fulfilled.

The sets of FCs obtained from $\mathit{Rel\_list}_1$ and $\mathit{Rel\_list}_2$ must form, in the lattice for (5), areas with an LCS which has […] In Fig. 1, an “or” relation corresponds to the following pairs of FCs:

(“Definition2_of_aggressor”, “Definition3_of_aggressor”);
“Definition1_of_aggressor” and the LCS for the pair (“Definition2_of_aggressor”, “Definition3_of_aggressor”).

Two compared statements of the kind

$$\mathit{rel2\_complex}(\mathit{Rel},\ \mathit{Cncpt},\ \mathit{Rel\_list}_1) \quad \text{and} \quad \mathit{rel2\_complex}(\mathit{Rel},\ \mathit{Cncpt},\ \mathit{Rel\_list}_2)$$

become, in the resulting theory, one statement of kind (3) whose third argument includes a statement (4) that unites the statements from the lists $\mathit{Rel\_list}_1$ and $\mathit{Rel\_list}_2$ by an “and” relation, provided the following condition is fulfilled.

The statements of the lists $\mathit{Rel\_list}_1$ and $\mathit{Rel\_list}_2$ describe the same FC of (5), but by means of different ChFs. In the example in Fig. 1 this applies to the intent (the set of formal attributes [2]) of the FC “Definition1_of_aggressor” and to the intent of the LCS for the pair (“Definition2_of_aggressor”, “Definition3_of_aggressor”).
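The merging step of the two rules can be sketched as follows. The Python code assumes that the connective ("or" or "and") has already been chosen by checking the lattice conditions stated above; it only builds the resulting statement. Statements are encoded as tuples tagged with the functor name, and the statement (4) keeps the two source lists side by side. Both the encoding and the sample data are hypothetical.

```python
def generalize(stmt1, stmt2, connective):
    """Merge two rel2_complex statements with equal first and second arguments.

    Returns one statement of kind (3) whose list holds a statement of kind (4)
    uniting both source lists, or None if the statements are not comparable.
    """
    if connective not in ("or", "and"):
        raise ValueError("connective must be 'or' or 'and'")
    kind1, rel1, cncpt1, list1 = stmt1
    kind2, rel2, cncpt2, list2 = stmt2
    if kind1 != "rel2_complex" or kind2 != "rel2_complex":
        raise ValueError("both statements must be of kind (3)")
    if (rel1, cncpt1) != (rel2, cncpt2):
        return None  # first and second arguments do not coincide
    if list1 == list2:
        return stmt1  # nothing to unite
    united = ("rel_complex", connective, [list1, list2])  # kind (4)
    return ("rel2_complex", rel1, cncpt1, [united])


# Hypothetical statements from two variants of one theory.
stmt1 = ("rel2_complex", "purpose", "attack", [("rel2", "kind", "G", "seizure")])
stmt2 = ("rel2_complex", "purpose", "attack", [("rel2", "kind", "G", "enslavement")])
merged = generalize(stmt1, stmt2, "or")
print(merged)
```

Keeping the two lists as separate members of the statement (4), rather than concatenating them, preserves the grouping of each source definition under an “or” connective.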
The stated principles of generalizing statements of kind (3) apply to statements of any complexity among those that enter the third argument of statements (3) and are recursively defined on the basis of (3) and (4). Thus, whereas the cardinality n of the set of
ChFs corresponding required extension, does not depend on quantity k of generalized theories, a computing complexity of generalization's process of the given LM's theories depends exclusively from n and amounts O n k
k
(at worst n it is equal to quantity of statements of a kind (2) and (3) at all levels of the description of LM by means of
(1)). As and O
n k k
1 , , n
, O k n k k
1 under k
n .
n under k
1
Experimental validation
The proposed technique for generalizing theories (1) has been tested in the Visual Prolog 5.2 environment on independent lexicographic definitions of the LM of the Russian word “агрессор” (“aggressor”). The variants of the definitions are taken from the Great Soviet Encyclopedia and the thematic dictionary “War and peace” at http://slovari.yandex.ru, and also from [4]. The generalized theory of the LM for “агрессор” is shown in Fig. 2.

Fig. 2. The generalized theory of LM for Russian word “агрессор”

Further research is expected to combine the approach proposed in this paper with methods of generalizing predicates on the basis of truth sets [1].
References
1. Mikhailov D.V., Emelyanov G.M. Model of language's sorts's system in a problem of a statement's semantic pattern's construction at a level of deep syntax // Taurian Herald for Computer Science and Mathematics. - 2006. - №1. - P. 79-90 (in Russian).
2. Emelyanov G.M., Kornyshov A.N., Mikhailov D.V. Conceptually-situational modeling of process of synonymic transformation of the Natural Language statements as machine learning on the basis of precedents // Scientific-theoretical magazine “Artificial intelligence”. - 2006. - №2. - P. 72-75 (in Russian).
3. Gerasimova I.A. Formal grammar and intensional logic // Moscow: Russian Academy of Science, Institute of Philosophy, 2000 (in Russian).
4. Ganter B., Wille R. Formal Concept Analysis - Mathematical Foundations // Berlin: Springer-Verlag, 1999.
5. Mel'cuk I.A., Zholkovsky A.K. Explanatory Combinatorial Dictionary of Modern Russian. Semantico-Syntactic Studies of Russian Vocabulary // Wiener Slawistischer Almanach, Sonderband 14, Wien, 1984.