Example-Based NLP Techniques - A Case Study of Machine Translation

From: AAAI Technical Report WS-92-01. Compilation copyright © 1992, AAAI (www.aaai.org). All rights reserved.

Eiichiro SUMITA and Hitoshi IIDA
ATR Interpreting Telephony Research Laboratories
Hikaridai, Seika, Souraku, Kyoto 619-02, JAPAN
e-mail: {sumita, iida}@atr-la.atr.co.jp
Abstract

This paper proposes Example-Based NLP (EBNLP), which uses EXAMPLES (pairs of two types of data, e.g. a sentence and its parse) extracted from a corpus and the DISTANCE between examples, explaining Example-Based Machine Translation (EBMT) in particular. An EBMT prototype has been implemented to deal with frequent and polysemous linguistic phenomena such as Japanese verbs, case particles and English prepositions. The average success rate for each phenomenon and interesting relationships between success rate and example database parameters are presented.
Recently, the increase of electronically stored documents, the progress of hardware such as memory and parallel computers, the development of very large knowledge bases, and the remarkable success of speech recognition using speech databases have created a need for very large linguistic corpora, drawing attention to corpus-based NLP.

Corpus-based NLP projects vary according to the following parameters:
• Is the corpus monolingual or bilingual?
• How structured is the corpus? Is it in a string, word sequence, syntactic structure, or semantic structure?
• What is the mechanism using the corpus?
• Is knowledge extracted from the corpus or is the corpus used directly?
• What is the quality and size of the corpus?
EBNLP can be explained while observing corpus-based NLP. A monolingual corpus is related to analysis and synthesis research. Based on the co-occurrences or n-grams extracted from a monolingual word- or string-level corpus, a wide range of studies such as the semantic classification of words [Hirschman et al., 1975; Hindle, 1990], spell checking, and the correction of speech recognition errors have been made. Moreover, Smadja has begun new research collecting collocations using n-grams and a parser [Smadja, 1991]. The frequency of dependency between words, extracted from a monolingual syntactic-structure-level corpus, is used to eliminate syntactic ambiguity [Tsutsumi and Tsutsumi, 1989; Nagao, K., 1990].
A bilingual corpus is related to translation research.² A statistical method for aligning sentences in a bilingual string corpus has been proposed by Gale and Church [Gale and Church, 1991]. A statistical method for translating sentences based on mutual information in a bilingual word corpus has been proposed by Brown et al. [Brown et al., 1988]. Independent of this, EBMT, which utilizes bilingual examples (pairs of a source sentence and its translation) and the distance between examples, has emerged [Nagao, M., 1984; Sadler, 1989; Sato, 1991; Sumita and Iida, 1991b; Furuse and Iida, 1992; Watanabe, 1991].

It has been proven that massively parallel computers [Kitano, 1991] and parallel computers [Sumita and Iida, 1991a] are effective in accelerating retrieval from a very large corpus.
" The bilingual corpus is useful for not only
translation but also analysis. A methodfor eliminating
syntactic ambiguityusing bilingual exampleshas been
proposed[Furuse
andIida, 1992].
1 Introduction

Recently, the construction and use of very large corpora has begun to increase [Walker, 1990] and has introduced a variety of corpus-based NLP research projects. The authors propose Example-Based NLP (EBNLP), which uses EXAMPLES (pairs of two types of data, e.g. a sentence and its parse) extracted from a corpus and the DISTANCE between examples. This paper explains Example-Based Machine Translation (EBMT) in detail. Based on the experiments presented in the paper, the authors demonstrate that Example-Based Machine Translation is effective in translating several linguistic phenomena and also present interesting relationships between success rate and example database parameters.
2 Background

The authors' opinion is as follows: (1) The approach by Brown et al., which acquires all necessary information from a bilingual word-level corpus, is expected to be less effective for the translation of language pairs, e.g. English and Japanese, which are drastically different from each other in words and grammar, for example, word order. (2) Sato (his MBT2 model) and Sadler suppose that translation units which are equivalent to each other from the point of translation are linked. There are still many difficulties to overcome in order to build linked bilingual example databases. Unlike these approaches, the authors assume hybrid architectures which incorporate EBMT as a subroutine into conventional machine translation systems.³ We prepared a database of formatted examples for several linguistic phenomena for this paper. We aim to clarify what kinds of phenomena EBMT is suited to and illustrate the relationships between success rate and the example database parameters.
Another instance of EBMT is the retrieval of similar sentences for translation aid. It enables a person to retrieve, from accumulated translation examples, sentences which are similar to the sentence he wants to translate. ETOC retrieves similar sentences by generalizing the input incrementally [Sumita and Tsutsumi, 1988]. CTM measures the distance by counting the number of characters which co-occur in both the input and the example [Sato, 1992].
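The character-counting idea behind CTM can be sketched roughly as follows; this is a simplified stand-in for illustration only, and the normalization used here is an assumption, not Sato's actual formula:

```python
from collections import Counter

def ctm_style_distance(input_str: str, example_str: str) -> float:
    """Distance based on characters co-occurring in both strings.

    Similarity is the size of the multiset intersection of the two
    character bags, normalized by total length; distance = 1 - similarity.
    """
    a, b = Counter(input_str), Counter(example_str)
    shared = sum((a & b).values())  # characters occurring in both strings
    return 1.0 - 2.0 * shared / (len(input_str) + len(example_str))

# Identical strings share every character (distance 0.0);
# strings with no characters in common are maximally distant (1.0).
print(ctm_style_distance("kaigi ni sanka", "kaigi ni shusseki"))
```

A character-level measure like this needs no parser, which is why it suits a translation-aid setting where the input is raw text.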
Even though a corpus is monolingual, a variety of research will emerge based on pairs of two types of data, e.g. a string and its parse, as examples and the distance between the examples. The authors generally call such research EBNLP.

EBNLP makes much of individualities in linguistic phenomena. At the start, it does not acquire abstract knowledge from the corpus but utilizes the corpus directly. Research on compressing the example database while maintaining system performance is an important future task for memory size and processing speed.
This research forms one part of ATR's research which aims to realize interpreting telephony. The ATR corpus contains conversations about registering for an international conference. ATR has collected about 17,000 Japanese sentences (about 270,000 words) to analyze linguistic phenomena and extract statistical information. Each Japanese sentence is morphologically and syntactically analyzed and its English translation attached [Ehara et al., 1990].

³ The authors assume that linguistic phenomena can be divided into two parts: a regular part which is described well by rules and an irregular part which is not. The former relates to model-oriented NLP and the latter relates to data-oriented NLP. It is more realistic to incorporate a data-oriented approach with a model-oriented approach than to use a single approach exclusively.
3 Outline of EBMT

Recent machine translation technology has reached the level where several systems have been commercialized and are used daily. However, the conventional technology is not satisfactory in some points such as target word selection. This paper addresses the word selection problems of several phenomena and proves EBMT feasibility through successful experiments.

Example-Based Machine Translation (EBMT) was proposed by Nagao [Nagao, M., 1984] in order to overcome problems inherent in conventional Machine Translation. In EBMT, (1) a database which consists of examples is prepared as translation knowledge; (2) an example whose source part is similar to the input phrase or sentence is retrieved from the example database; (3) the translation is obtained by replacing the corresponding words in the target expression of the retrieved example.
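The three steps can be sketched minimally as follows; the example database, the flat distance measure, and the translation patterns are invented for illustration and are not the authors' implementation:

```python
# (1) An example database pairing source triples with translation
# patterns; both the entries and the patterns are invented.
EXAMPLE_DB = [
    (("see", "in", "Osaka"), "N2-de V1"),
    (("live", "in", "Kyoto"), "N2-ni V1"),
]

def distance(input_phrase, example_phrase):
    # Placeholder distance: the fraction of mismatching slots.
    # The real measure weights thesaurus-based word distances.
    mismatches = sum(a != b for a, b in zip(input_phrase, example_phrase))
    return mismatches / len(input_phrase)

def translate(input_phrase):
    # (2) Retrieve the example whose source part is nearest the input.
    best_source, best_tp = min(
        EXAMPLE_DB, key=lambda ex: distance(input_phrase, ex[0]))
    # (3) Reuse the retrieved example's translation pattern.
    return best_tp

print(translate(("see", "in", "Nara")))  # nearest example: (see, in, Osaka)
```

The transfer step therefore never consults hand-written rules; all translation knowledge lives in the example database.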
Figure 3.1 EBMT Configuration: (1) Analysis, (2) Example-Based Transfer, (3) Generation; the transfer step consults (4) the Example Database and (5) the Thesaurus.
As shown in Figure 3.1, EBMT uses two databases, i.e., an example database and a thesaurus.⁴ Examples (pairs of a source phrase and its translation) are abstracted from the bilingual corpus. For convenience, a translation pattern (tp), e.g., for the English adverbial preposition "in", "N2 ni V1", "N2 de V1", or "N2 φ V1", is abstracted from an example and stored with the example in the example database. Table 4.1 shows the translation pattern distribution of "in".

⁴ The hierarchy of the Japanese thesaurus is in accordance with the thesaurus of everyday Japanese [Ohno and Hamanishi, 1984]; the hierarchy of the English thesaurus is in accordance with the LONGMAN LEXICON [McArthur, 1981]. To what extent EBMT is sensitive to the hierarchy of the thesaurus is an open and interesting issue.
The distance is measured by the summation of the word distances multiplied by the weight of each word. It is assumed that the input and the example in the example database (referred to as I and E) are represented as lists of the words' syntactic and semantic attribute values (referred to as I_k and E_k) for each phrase. The distance is calculated according to expression (1). Different measures of word distance and word weight have been proposed, as shown in footnotes 5 and 6.

(1) d(I, E) = Σ_k w_k · d(I_k, E_k)
Here, we explain our measure [Sumita and Iida, 1991b] briefly. The word distance d(I_k, E_k) is proportional to the location of the concept which is called the Most Specific Common Abstraction.
EBMT's advantages are as follows:
• By adding examples, the user himself can improve the quality of EBMT.
• Based on a bilingual corpus which embodies translators' expertise, EBMT can generate a high-quality translation.
• The distance functions as an indicator of translation reliability.⁷
• Even though the example database is very large, acceleration is possible using indexing and parallel computing.
4 EBMT for Prepositions

4.1 English Prepositions

• English prepositions occur frequently. Prepositions are basic devices used in constructing English verb and noun phrases and occur frequently. The prepositions dealt with in this paper, i.e., "of", "to", "for", "in", "on", "at", "from", "by", "with", are almost all within the top fifty most frequent words.⁸ The corpus has about 270,000 words. The total number of examples is about 3,300.

• English prepositions are highly polysemous. The LONGMAN DICTIONARY OF CONTEMPORARY ENGLISH [Longman, 1978] lists 21 senses for the preposition "in". Annette Herskovits lists 11 senses for the preposition "in" in locative expressions [Herskovits, 1986].

• English prepositions together with the following nouns modify preceding verbs or nouns. In this paper we call the former adverbial usage (1) and the latter adnominal usage (2).
The distance varies from 0 to 1. In Figure 3.2, A, B, C, D and E represent words, and squares represent concepts, to which corresponding distances are attached.

Figure 3.2 Thesaurus Hierarchy and Distance [thesaurus tree from the root down to the words A, B, C, D and E, with distances such as d(A,B) and d(A,C) attached to the concept levels]

Word weight is computed according to the following expression.⁶ The word weight is the root of the summation of the squares of the frequencies of the translation patterns (tp) in the subset of the example database where E_k = I_k:

(2) W_k = √( Σ_tp freq²(tp) )
⁵ Sato [Sato, 1991] and Sadler [Sadler, 1989] compute word distance using feature vectors which consist of co-occurrence frequencies in the example database.

⁶ Sato computes word weight using an information-based estimation method [Sato, 1991]. The weights are invariable for any input. Considering the translation of "A B C", which consists of three components, "A", "B", and "C", this method can only determine a general tendency for target expression selection, e.g., "In general, A is the most important of the three components, and thus A should receive the highest weight." However, in individual cases, "B or C can often be the most important of the three."
(1) I'll see you {in Osaka}. (V P N)
(2) I'm working for a company {in Kyoto}. (N P N)
⁷ As reported in [Sumita and Iida, 1991b], generally, the smaller the distance, the better the quality. Moreover, the distance functions as an indicator of translation reliability. In the reported experiment, (1) in the cases where d<=0.5, the success rate was 82%; (2) in the cases where d>0.5, the success rate was 37%. The difference between these two cases is remarkable.

⁸ These prepositions occur frequently in other domains, e.g., computer manuals, as well.
The rates of the two usages vary from preposition to preposition (Figure 4.1). For example, the rates of adnominal usage of "of" and of adverbial usage of "to", "by", and "with" are great (over 90%). Other prepositions are less biased. Both adverbial usage and adnominal usage are important, so the two are dealt with uniformly by our approach.
[Figure 4.1 Adverbial Usage vs. Adnominal Usage: for each preposition ("of", "in", "on", "to", "for", "at", "from", "by", "with"), the number of adverbial and adnominal uses (0-1000)]

• The following sample correspondences show how the adverbial usage of "in" corresponds to the Japanese case particles ni, de, and so on:⁹

live in Kyoto → [Kyoto], [live]
plan - in March → [March], [plan]
hold - in Kyoto → [Kyoto], [hold]
meet - in Kyoto → [Kyoto], [meet]
meet - in the afternoon → [afternoon], [meet]
leave - in 10 minutes → [10 minutes], [leave]

Japanese case particles are both frequent and polysemous as well. The correspondences between English prepositions and Japanese case particles are not one-to-one but complicated (Figure 4.2). In contrast, the correspondences between English prepositions and French prepositions are almost one-to-one, e.g., "at" and "à", "in" and "dans", "on" and "sur" correspond to each other.¹²

[Figure 4.2 Correspondence between English Prepositions and Japanese Particles (Adverbial Usage): a many-to-many mapping between the prepositions and the particles ni, de, ni-kanshite, e, and zero]

Table 4.1 shows the distribution of the correspondence, i.e., translation pattern (tp).

Table 4.1 Translation Pattern Distribution (V1 in N2)

tp | frequency | rate
N2 ni V1 | 242 | 0.630
N2 de V1 | 138 | 0.359
N2 φ V1¹¹ | 4 | 0.010
total | 384 | 1.000
4.2 Experiment

4.2.1 Adverbial Usage "in"
Examples are represented in a simple fixed format consisting of three elements of the form (VERB PREP
⁹ Typical case particles are as follows: ga, o, ni, de, to, e, ni-kanshite, and so on, given here in romanization.

¹⁰ English equivalences are bracketed throughout this paper.
¹¹ φ means no case particle.
¹² In fact, there are many exceptions to this simple correspondence. In order to explain the correspondence, Nathalie Japkowicz and Janyce M. Wiebe have proposed a translation method based on conceptualization [Japkowicz and Wiebe, 1991]. Cornelia Zelinsky-Wibbelt has proposed a similar approach for English and German [Zelinsky-Wibbelt, 1990].
NOUN), such as (see in Osaka). The conjecture that the translation of adverbial prepositions depends only on the following noun is not true, because neglecting the verb reduces the success rate. We have to refer to both verb and noun; word weights change on a case-by-case basis. The success rate was 89.8%.
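The need to consult both verb and noun can be seen in a tiny nearest-example lookup over (VERB PREP NOUN) triples; the examples and patterns below are invented for illustration, not taken from the authors' database:

```python
# Invented (VERB PREP NOUN) examples for adverbial "in", each paired
# with a translation pattern; the nearest example decides the pattern.
EXAMPLES = [
    (("live", "in", "Kyoto"), "N2-ni V1"),
    (("hold", "in", "Kyoto"), "N2-de V1"),
]

def triple_distance(a, b):
    # Uniform weights here; the real method weights each slot.
    return sum(x != y for x, y in zip(a, b))

def select_tp(triple):
    return min(EXAMPLES, key=lambda ex: triple_distance(triple, ex[0]))[1]

# The noun alone cannot decide: with the same unseen noun "Osaka",
# the verb drives the choice of case particle.
print(select_tp(("live", "in", "Osaka")))
print(select_tp(("hold", "in", "Osaka")))
```

Both inputs share the noun "Osaka", yet they retrieve different examples and thus different particles, which is the point of keeping the verb in the format.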
4.2.2 General Tendency

Figure 4.3 summarizes adverbial usage and Figure 4.4 summarizes adnominal usage.¹³
• Information within the fixed format is not sufficient to determine the sense. For example, "be in Osaka" is translated differently depending on the subject of "be". This can be eliminated by using larger examples when necessary:¹⁴

The hotel is in Osaka → [hotel], [Osaka], [be]
The party is in Osaka → [party], [Osaka], [be]

[Figure 4.3 Adverbial Success Rate and Figure 4.4 Adnominal Success Rate: success rate (0-100%) for each preposition "of", "in", "on", "to", "for", "at", "from", "by", "with"]

Major causes of failure:
• No similar examples. These failures can be eliminated by adding examples.
• Idiomatic expressions, e.g., "make changes at once" → [immediately], [make changes]. These failures can be eliminated by exception handling or by adding examples.

4.2.3 Relationship between success rate and example database parameters

This section outlines the relationships between the success rate and the example database parameters. In Figures 4.5, 4.6, and 4.7, each square represents a preposition, e.g., adnominal "of", adverbial "in".

• In general, the more examples we have, the better the quality, as reported in a previous paper [Sumita and Iida, 1991b]. However, the relationship between success rate and the number of examples is peculiar to each phenomenon; there is no common equation applicable to all phenomena (Figure 4.5).¹⁵

[Figure 4.5 Success Rate per Number of Examples: success rate vs. number of examples (0-1000); regression y = 91.679 - 1.2239e-3x, R^2 = 0.002]
" If an example element does not have sufficient
influence on translation, its weight is very low. For
example, weight of "be" is lower than others.
,s If the number of examples is small (in our
experiment of prepositions, under 20), the success rate
is not stable, e.g., adverbial "of" and adnominal"to" and
"with" are not very successful, while adnominal"by" is
translated completely. Adverbial "of" and adnominal
"to", "by", and "with" are neglected in this section.
" Adverbial "of" and adnominal "to", "by", and
"with" are shaded in Figures 4.3 and 4.4 as they have
too few examples for their success rates to be
meaningful.
• In general, the more translation patterns, the worse the quality. Figure 4.6 shows the relationship between the number of translation patterns and the success rate.
[Figure 4.6 Success Rate per Number of Translation Patterns: success rate vs. number of translation patterns (0-10); regression y = 98.352 - 4.4187x, R^2 = 0.220]

• In general, the higher the weight, the better the quality. Figure 4.7 shows the relationship between the weight and the success rate. The word weight defined by expression (2) in section 3 is computed in a subset of the example database. In contrast, this weight is computed as the root of the summation of the squares of the frequencies of the translation patterns (tp) in the whole example database.

[Figure 4.7 Success Rate per Weight: success rate vs. weight (0.4-1.0); regression y = 69.341 + 29.725x, R^2 = 0.602]

5 EBMT for Other Phenomena

This section reports on the translation of other linguistic phenomena which we have conducted to date.

5.1 Other Phenomena

Here, an outline of the linguistic phenomena is given. EBMT is applicable to various difficult translation problems, as reported in the paper [Sumita and Iida, 1991b]. Here, we will explain the phenomena, dividing words into two classes from a grammatical point of view: (1) content words such as verbs and nouns, and (2) function words such as prepositions and particles.

Content Word

For verbs particularly, there is a one-to-many mapping¹⁶ from the source language word to the target language word, and selecting the appropriate target language word is an important task. For example, a single polysemous Japanese verb (glossed [drink/eat/take/smoke] below) translates differently depending on its object:

[milk], [drink/eat/take/smoke] → drink milk
[soup], [drink/eat/take/smoke] → eat soup
[medicine], [drink/eat/take/smoke] → take medicine
[cigarette], [drink/eat/take/smoke] → smoke

Because EBMT does not use rules based on semantic markers but uses examples directly and measures the distance based on a thesaurus, EBMT is expected to perform finer word selection.

Function Word

An adverbial preposition is a typical function word. Together with the following noun, it modifies the preceding verb. Tables 5.1 and 5.2 classify function words according to whether the modifier¹⁷ is a noun or a verb and whether the modificand is a noun or a verb. As exemplified in section 4, function words are in general frequent and polysemous (in many cases, they have one-to-many mappings). Consequently, they are important in translation.

Table 5.1 Classification of Function Words (English)
• modifier N, modificand N: adnominal prepositions, e.g. the object of the conference
• modifier N, modificand V: adverbial prepositions, e.g. meet at the station
• modifier V, modificand N: relative clauses etc., e.g. the man who walks
• modifier V, modificand V: conjunctions, e.g. walk while listening
¹⁶ In the ATR corpus, one-to-many verbs are 36.1% (type) and 81.5% (token).

¹⁷ Modifiers are boldfaced in the tables.
Table 5.2 Classification of Function Words (Japanese)¹⁸
• modifier N, modificand N: adnominal case particles, e.g. [conference], [object]
• modifier N, modificand V: adverbial case particles, e.g. [station], [meet]
• modifier V, modificand N: relative clauses etc., e.g. [walk], [man]
• modifier V, modificand V: conjunctions, e.g. [listen], [walk]
Acknowledgements

The authors gratefully acknowledge the help provided by Makoto NAGAO and Satoshi SATO at Kyoto University, and by Akira KUREMATSU, Osamu FURUSE, and other members at ATR Interpreting Telephony Research Laboratories.
5.2 Experiment

We have tested verbs as content words and the top rows of Tables 5.1 and 5.2 as function words. With the average success rates shown below, EBMT is shown to be effective for both content words and function words.

(1) verbs¹⁹ (JE): 87%
(2) adnominal prepositions (EJ): 87%
(3) adverbial prepositions (EJ): 90%
(4) adnominal case particles²⁰ (JE): 78%
(5) adverbial case particles²¹ (JE): 79%
6 Conclusion

This paper has proposed Example-Based NLP (EBNLP), which uses EXAMPLES (pairs of two types of data, e.g. a sentence and its parse) extracted from a corpus and the DISTANCE between the examples. To show EBNLP feasibility, an Example-Based Machine Translation (EBMT) prototype using fixed-format examples, which deals with frequent and polysemous linguistic phenomena such as Japanese verbs, case particles and English prepositions, was implemented. The average success rates were high, i.e., about 80-90%. In addition, interesting relationships between success rate and example database parameters were presented.

Our future plans are as follows:
• Compare EBMT with MT using deep cases.²²
• Test the phenomena shown at the bottom of Table 5.2.
• Establish a method to extrapolate the success rate with a large example database from that with a small example database.
• Study the domain dependency of the example database.
• Carry out research on other EBNLP.

¹⁸ We assume that there is a zero word for some cases.

¹⁹ The average of the success rates of polysemous verbs with a frequency over 100 in the ATR corpus.
²⁰ Average of the success rates of all adnominal case particles [Sumita and Iida, 1991b].

²¹ Average of the success rates of typical adverbial case particles.

²² Several problems have been pointed out concerning the translation method using deep cases [Tsujii and Yamanashi, 1985], [Nigel, 1991]. The authors do not use deep cases in order to avoid such problems. We have to compare EBMT and MT using deep cases from several points of view.

References

[Brown et al., 1988] Brown, P., Cocke, J., Della Pietra, S., Della Pietra, V., Jelinek, F., Mercer, R. and Roossin, P.: "A Statistical Approach to Language Translation," Proc. of Coling '88, pp.71-76, (1988).

[Ehara et al., 1990] Ehara, T., Ogura, K. and Morimoto, T.: "ATR Dialogue Database," Proc. of International Conference on Spoken Language Processing '90, (1990).

[Furuse and Iida, 1992] Furuse, O. and Iida, H.: "Cooperation Between Transfer and Analysis in Example-Based Framework," Proc. of Coling '92, pp.645-651, (1992).

[Gale and Church, 1991] Gale, W. and Church, K.: "A Program for Aligning Sentences in Bilingual Corpora," Proc. of 29th ACL, pp.177-184, (June 1991).

[Herskovits, 1986] Annette Herskovits: "Language and spatial cognition: an interdisciplinary study of the prepositions in English," (Studies in natural language processing), Cambridge University Press, (1986).

[Hindle, 1990] Hindle, D.: "Noun Classification from Predicate-Argument Structures," Proc. of 28th ACL, pp.268-275, (June 1990).

[Hirschman et al., 1975] Hirschman, L., Grishman, R., and Sager, N.: "Grammatically-based automatic word class formation," Information Processing and Management, 11, pp.39-57, (1975).

[Japkowicz and Wiebe, 1991] Nathalie Japkowicz and Janyce M. Wiebe: "A System for Translating Locative Prepositions from English into French," Proc. of 29th ACL, Berkeley, California, pp.153-160, (1991).

[Kitano, 1991] Kitano, H.: "Massively Parallel Artificial Intelligence and Its Application to Natural Language Processing," Proc. of International Workshop on Fundamental Research for the Future Generation of Natural Language Processing (FGNLP), pp.87-107, Kyoto, Japan, (July 1991).

[Longman, 1978] Longman: "Longman Dictionary of Contemporary English," London: Longman Group, (1978).

[McArthur, 1981] McArthur, T.: "Longman Lexicon of Contemporary English," London: Longman Group, (1981).

[Nagao, M., 1984] Nagao, M.: "A Framework of a Mechanical Translation between Japanese and English by Analogy Principle," in Artificial and Human Intelligence, eds. A. Elithorn and R. Banerji, North-Holland, pp.173-180, (1984).

[Nagao, K., 1990] Nagao, K.: "Structural Disambiguation with Knowledge on Word-to-Word Dependencies," Proc. of InfoJapan '90, Part II, pp.97-104, (1990).

[Nigel, 1991] Ward Nigel: "Decomposing Deep Cases," IPSJ, Proc. of 43rd convention, 4G-4, (1991).

[Ohno and Hamanishi, 1984] Ohno, S. and Hamanishi, M.: "Ruigo-Shin-Jiten," Kadokawa, (1984) (in Japanese).

[Sadler, 1989] Sadler, V.: "Working with Analogical Semantics," Foris Publications, (1989).

[Sato, 1991] Sato, S.: "Example-Based Machine Translation," Doctoral Thesis, Kyoto University, (1991).

[Sato, 1992] Sato, S.: "CTM: An Example-Based Translation Aid System Using the Character-Based Best Match Retrieval Method," Proc. of Coling '92, pp.1259-1263, (1992).

[Smadja, 1991] Smadja, F.: "From N-grams to Collocations," Proc. of 29th ACL, pp.279-284, (June 1991).

[Sumita and Iida, 1991a] Sumita, E. and Iida, H.: "Acceleration of Example-Based Machine Translation," (manuscript 1991).

[Sumita and Iida, 1991b] Sumita, E. and Iida, H.: "Experiments and Prospects of Example-Based Machine Translation," Proc. of 29th ACL, pp.185-192, (June 1991).

[Sumita and Iida, 1992] Sumita, E. and Iida, H.: "Example-Based Transfer of Japanese Adnominal Particles into English," IEICE Trans. Inf. & Syst., Vol. E75-D, No. 4, pp.585-594, (1992).

[Sumita and Tsutsumi, 1988] Sumita, E. and Tsutsumi, Y.: "A Translation Aid System Using Flexible Text Retrieval based on Syntax-matching," Proc. of The Second International Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages, CMU, Pittsburgh, (June 1988).

[Tsujii and Yamanashi, 1985] Tsujii, J. and Yamanashi, M.: "Kaku to sono nintei kijun," IPSJ, WGNL 52-3, (1985).

[Tsutsumi and Tsutsumi, 1989] Tsutsumi, Y. and Tsutsumi, T.: "Disambiguation Method Based on the Stochastic Data for Natural Language Parsing," Trans. of IEICE, vol. J72-D-II, no.9, pp.1448-1458, (1989) (in Japanese).

[Walker, 1990] Walker, D.: "The Ecology of Language," Proc. of International Workshop on Electronic Dictionaries, Oiso, Japan, pp.10-22, (Nov. 1990).

[Watanabe, 1991] Watanabe, H.: "A Transfer System Combining Similar Translation Patterns," Proc. of 8th convention of JSSST, E2-3, pp.185-188, (1991) (in Japanese).

[Zelinsky-Wibbelt, 1990] Cornelia Zelinsky-Wibbelt: "The Semantic Representation of Spatial Configurations: a conceptual motivation for generation in Machine Translation," Proc. of the 13th International Conference on Computational Linguistics, vol. 3, pp.299-303, (1990).