Abductive Processes for Answer Justification

Sanda M. Harabagiu
Language Computer Corporation
Dallas, TX 75206, USA
sanda@languagecomputer.com

Steven J. Maiorano
University of Sheffield
Sheffield S1 4DP, UK
stevemai@home.com
Abstract

An answer mined from text collections is deemed correct if an explanation or justification is provided. Due to multiple language ambiguities, abduction, or inference to the best explanation, is a possible vehicle for such justifications. To interpret an answer abductively, world knowledge, lexico-semantic knowledge as well as pragmatic knowledge need to be integrated. This paper presents various forms of abductive processes taking place when answers are mined from texts and knowledge bases.
Introduction
In (Harabagiu and Maiorano 1999) we claimed that finding answers from large collections of texts is a matter of implementing paragraph indexing and applying abductive inference. Subsequent experiments, reported in (Harabagiu et al. 2000), have shown that in fact, answer justification provided by lightweight abduction has the merit of improving the accuracy of a Question Answering (QA) system by over 20%. These results were later confirmed in the TREC-9* evaluations (Voorhees 2001), where the LCC-SMU system was the only one that implemented abductive processes for answer justification.
To evaluate the results of a QA system, the TREC results were considered as five possible answers, of either 50 bytes in length (short answers) or 250 bytes in length (long answers). The scoring, detailed in (Voorhees 2001), uses a Mean Reciprocal Rank (MRR) relying on only six values (1, .5, .33, .25, .2, 0), representing the score each set of five answers obtains. If the first answer is correct, it obtains a score of 1; if the second one is correct, it is scored with .5; if the third one is correct, the score becomes .33; if the fourth is correct, the score is .25; and if the fifth one is correct, the score is .2. Otherwise, it is scored with 0. No extra credit is given if multiple answers are correct.
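To make the scoring concrete, the MRR computation can be sketched as follows (our illustration, not part of any TREC software; the rank of the first correct answer is assumed to be known for each question):

    def mean_reciprocal_rank(ranks):
        """ranks[i] is the rank (1-5) of the first correct answer for
        question i, or None if none of the five answers is correct."""
        scores = [1.0 / r if r is not None else 0.0 for r in ranks]
        return sum(scores) / len(scores)

    # Three questions: correct at rank 1, correct at rank 3, no correct answer.
    print(mean_reciprocal_rank([1, 3, None]))  # (1 + 1/3 + 0) / 3 = ~0.44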
*The Text REtrieval Conference (TREC) is the main research forum for evaluating the performance of Information Retrieval (IR) and Question Answering (QA) systems. TREC is sponsored by the National Institute of Standards and Technology (NIST) and the Defense Advanced Research Projects Agency (DARPA) to annually conduct IR and QA evaluations.
As shown in Figure 1, the accuracy of the five answers returned by the LCC-SMU system was substantially better than that of the other systems. (Paşca and Harabagiu 2001) reports the observation that abductive processes (a) double the precision of QA since they filter out unjustifiable answers, and (b) have the same overall effect as the lexico-semantic alternations of the keywords used to search for the answer. Given this significant contribution of abductive inference to the accuracy of QA systems, we take issue in this paper with the formalization of several abductive processes and their general interaction with the larger concept of context, primarily as viewed by Graeme Hirst in (Hirst 2000).
In (Hobbs et al. 1993) abduction is defined as inference to the best explanation. This is because abduction is different from deduction or induction. In deduction, from (∀x)(a(x) → b(x)) and a(A) one concludes b(A). In induction, from a number of instances a(A) and b(A), one concludes (∀x)(a(x) → b(x)). Abduction is the third possibility, as noted in (Hobbs et al. 1993). From (∀x)(a(x) → b(x)) and b(A), one concludes a(A). If b(A) is the observable evidence, then (∀x)(a(x) → b(x)) is the general principle that can explain b(A)'s occurrence, whereas a(A) is the inferred cause or explanation of b(A). For QA, if b(A) stands for a candidate answer of question a(A), we need to discover the implication (∀x)(a(x) → b(x)). In state-of-the-art QA systems, there are several different abductive processes that concur to the discovery of the justification of an answer.
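The contrast between the three inference directions can be illustrated with a minimal sketch (ours, not from the cited work), where a rule (∀x)(a(x) → b(x)) is stored as a (premise, conclusion) pair:

    # One rule (forall x)(a(x) -> b(x)), stored as (premise, conclusion).
    RULES = [("cabaret", "nightspot")]

    def deduce(fact, rules):
        """Deduction: from a(A) and a(x) -> b(x), conclude b(A)."""
        pred, arg = fact
        return [(concl, arg) for prem, concl in rules if prem == pred]

    def abduce(observation, rules):
        """Abduction: from b(A) and a(x) -> b(x), hypothesize a(A) as
        the best explanation of the observed b(A)."""
        pred, arg = observation
        return [(prem, arg) for prem, concl in rules if concl == pred]

    print(deduce(("cabaret", "MoulinRouge"), RULES))   # [('nightspot', 'MoulinRouge')]
    print(abduce(("nightspot", "MoulinRouge"), RULES)) # [('cabaret', 'MoulinRouge')]

Unlike deduction, the abduced a(A) is only a hypothesis; it must be weighed against competing explanations, which is precisely what answer justification does.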
Abductive processes are determined by (1) lexico-semantic knowledge that enables the retrieval of candidate answers; (2) logical axioms that are generated from various sources (e.g. WordNet (Fellbaum 1998)); (3) pragmatic knowledge, allowing bridging inference; and (4) several forms of coreference information. We stress the fact that for each answer, the lexico-semantic, logical, pragmatic and coreferential knowledge spans a relatively small search space, thus enabling efficient abductive processing. Typically, answers are mined from short paragraphs containing only a few sentences. Moreover, the lexico-semantic information is determined by the keywords employed to retrieve the paragraphs containing candidate answers. As reported in (Harabagiu et al. 2001), alternations of these keywords as well as the topically related concepts of these keywords do not increase substantially the search space for abductive processing; on the contrary, they guide the abductive processes.
[Figure 1: Results of the TREC-9 evaluations, for 50-byte and 250-byte answers.]
This paper is organized as follows. Section 2 describes the answer mining techniques that operate on large on-line text collections. Section 3 presents the techniques used for mining answers from lexical knowledge bases. Section 4 details several abductive processes and their attributes. Section 5 discusses the advantages and differences of using abductive justification as opposed to quantitative ad-hoc information provided by redundancy. Section 6 presents results of our evaluations whereas Section 7 summarizes the conclusions.
Answer Mining from On-line Text Collections

There are questions whose answer can be mined from large text collections because the information they request is explicit in some text snippet. Moreover, for some questions and some text collections, it is easier to mine a question in a vast set of texts rather than in a knowledge base. In this section we describe several abductive processes that account for the mining of answers from text collections:
Appositions
Appositive constructions are widely used in Information Extraction tasks or coreference resolution. Similarly, they provide a simple and reliable way for the abduction of answers from texts. For example, the question "What is Moulin Rouge?" evaluated in TREC-10 is answered by the apposition following the named entity: Moulin Rouge, the famous Parisian nightclub. If appositions are not recognized and the mining of answers is solely based on keywords, many false positives may be encountered. For example, when searching on GOOGLE for "Moulin Rouge", a vast majority of the URLs contain information about the 2001 movie entitled "Moulin Rouge", starring Nicole Kidman and Ewan McGregor. Due to this, the page-ranking algorithm used by GOOGLE categorizes the named entity as the title of a movie. Linguistically, we are dealing in this case with a metonymy, since movies get titles because of the main place or themes they depict, in this case the Parisian cabaret. However, to be able to answer the question, a long chain of inference is needed. One of the reviews of the movie says: "Resurrects a moment in the history of cabaret when spectacle ravished the senses in an atmosphere of bohemian abandon." The elliptic subject of the review refers to the movie. Because of this reference, the first clause introduces the theme of the movie: "a moment in the history of cabaret". This enables the coercion of the name of the movie into the type "cabaret", leading to the same answer that was mined through an apposition from the TREC collection. As opposed to a simple recognition of an apposition, this example shows the special inference chains imposed by on-line documents.

Copulative expressions

Copulative expressions are used for defining features of a wide range of entities. This explains why, for so-called definition questions, recognized by one of the patterns:
(Q-P1): What {is|are} <phrase-to-define>?
(Q-P2): What is the definition of <phrase-to-define>?
(Q-P3): Who {is|was|are|were} <person-name(s)>?

the mining of the answer is done by matching some of the following patterns, as reported in (Paşca and Harabagiu 2001):

(A-P1): <phrase-to-define> {is|are|became}
(A-P2): <phrase-to-define>, {a|the|an}
(A-P3): <phrase-to-define>
</gr-replace>
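A minimal sketch of such pattern matching, using regular expressions as a stand-in for the parser-based matching of the actual system (the patterns below are simplified versions of A-P1 and A-P2):

    import re

    def match_definition_patterns(phrase, text):
        """Return candidate definitions matching simplified A-P1/A-P2."""
        p = re.escape(phrase)
        patterns = [
            rf"{p}\s+(?:is|are|became)\s+([^.]+)",  # A-P1: "X is/are/became ..."
            rf"{p},\s+(?:a|an|the)\s+([^,.]+)",     # A-P2: "X, a/an/the ..."
        ]
        hits = []
        for pat in patterns:
            hits += re.findall(pat, text, flags=re.IGNORECASE)
        return hits

    print(match_definition_patterns(
        "Moulin Rouge",
        "Tourists flock to the Moulin Rouge, the famous Parisian nightclub, every night."))
    # -> ['famous Parisian nightclub']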
A special observation needs to be made with respect to the abduction of answers for definition questions. Another apposition from the same TREC collection of 3 GBytes of text data, corresponding to the question asking about Moulin Rouge, is Moulin Rouge, a Paris landmark. At this point, additional pragmatic knowledge needs to be used, in order to discard this second candidate answer due to its level of generality. One needs to know that the Notre Dame and the Eiffel Tower are also Paris landmarks, but that Moulin Rouge is a cabaret, the Notre Dame is a cathedral, whereas the Eiffel Tower is a monument. Therefore the definition of the Moulin Rouge as a Parisian landmark is too general, since abductions classifying it as a cathedral or monument would be legal, but untruthful.
Unfortunately, pragmatic knowledge is not always readily available, and thus often we need to rely on classification information available either from categorizations provided by popular search engines like GOOGLE, or from ad-hoc categorizations determined by hyperlinks on various Web pages. For example, when searching GOOGLE for "landmark + Paris", the resulting pointers comprise Web pages discussing the Arc de Triomphe and the Eiffel Tower, but also the Hotel Intercontinental, boasted as a Paris landmark, the Maille mustard shop, or even the products of Iguana, a company manufacturing heirloom glass ornaments representing Parisian landmarks such as the Eiffel Tower, the Notre Dame or the Arc de Triomphe. Furthermore, lexical knowledge bases such as WordNet do not classify a cabaret as a landmark, but rather as a place of entertainment.
Possessives
Possessive constructions are useful textual cues for answer mining. For example, when searching for the answer to the TREC-10 question "Who lived in the Neuschwanstein castle?", the possessive "Mad King Ludwig's Neuschwanstein Castle" indicates a possible abduction, due to the multiple interpretations of the semantics of possessive constructs. In the case of a house, even one as luxurious as a castle, the "possessor" is either the architect, the owner or the person that lives in it. Since the question specifically asked for the person who lived in the castle, the other two possibilities of interpretation are filtered out unless there is evidence that King Ludwig built the castle but did not live in it.
Possessive constructions are sometimes combined with appositions, as in the case of the answer to the TREC-10 question "What currency does Argentina use?", which is "The austral, Argentina's currency". However, mining the answer to a very similar TREC-10 question "What currency does Luxemburg use?" relies far more on knowledge of other currencies than on pure syntactic information. In this case, the answer is provided by the complex nominal "Luxemburg franc", which is recognized correctly because the noun "franc" is categorized as CURRENCY by a Named Entity tagger similar to those developed for the Message Understanding Conference (MUC) evaluations. If such a resource is not available and the answer mining relies only on semantic information from WordNet, the abduction cannot be performed, since in WordNet 1.7 the noun "franc" is classified as a "monetary unit", a concept that has no relation to the "currency" concept.
Meronyms
Sometimes questions contain expressions like "made of" or "belongs to" which are indicative of meronymic relations to one of the entities expressed in the question. For example, in the case of the TREC-10 question "What is the statue of liberty made of?", even if Statue of Liberty is not capitalized properly, it is recognized as the New York landmark due to its entry in WordNet. The same entry is sub-categorized as a statue, and the name of its location "New York" can be retrieved from its defining gloss. However, WordNet does not encode any meronymic relations of the type IS-MADE-OF. Instead, the meronymic relation is implied by the prepositional attachment "the 152-foot copper statue in New York", where the nominal expression refers to the Statue of Liberty in the same TREC document. Since "copper" is classified as a substance in WordNet, it allows the abduction of the meronymy between copper and the Statue of Liberty, thus justifying the answer.
A second form of meronymy is illustrated by the TREC-10 question "What county is Modesto, California in?". The processing of this question presupposes the recognition of Modesto as a city name and of California as a state name, which is a function easily performed by most named entity recognizers. Usually the names of US cities are immediately followed by the name or the acronym of the state they belong to, to avoid ambiguity between different cities with the same name but located in different states, e.g. Arlington, VA and Arlington, TX. However, the question asks for the name of a county, not for the name of the state. If gazetteers of all US cities, their counties and states are not available, an IS-PART relation is inferred from the interpretation of prepositional attachments determined by the preposition "in". For example, the correct answer to the above question is indicated by the following attachment: "10 miles west of Modesto in Stanislaus County".
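A minimal sketch of this inference, approximating the parser-produced attachment with a regular expression (the real system interprets the output of a syntactic parser):

    import re

    def infer_is_part(snippet):
        """Hypothesize IS-PART(city, county) from an attachment of the
        preposition "in", e.g. "... Modesto in Stanislaus County"."""
        m = re.search(r"(\w+)\s+in\s+((?:\w+\s)+?County)", snippet)
        if m:
            return ("IS-PART", m.group(1), m.group(2))
        return None

    print(infer_is_part("10 miles west of Modesto in Stanislaus County"))
    # -> ('IS-PART', 'Modesto', 'Stanislaus County')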
A third form of meronymy results from recipes or cooking directions. It is commonsense knowledge that any food is made of a series of ingredients, and the general format of any recipe will list the ingredients along with their measurements. This insight allows the mining of the answer to the TREC-10 question "What fruit is Melba sauce made from?". The expected answer is a kind of fruit. Both peaches and raspberries are fruits. However, we cannot abductively prove that "peach" is the correct answer, because
the text snippet "It is used as sauce for Peach Melba" indicates that the sauce, which corefers with Melba sauce in the same text, is just an ingredient of the Peach Melba. However, the text snippet "2 cups of pureed raspberries" extracted from the recipe of Melba sauce provides evidence that it is the correct answer.

[Figure 2: Answer Type Taxonomy. Top-level answer types include PERSON, ORGANIZATION, LOCATION, LANGUAGE, VALUE, DEFINITION and MONEY, with subtypes such as UNIVERSITY, DEGREE, DATE, PERCENTAGE, AMOUNT, SPEED, DIMENSION, DURATION, COUNT, TEMPERATURE, CONTINENT, GALAXY, COUNTRY, CITY, PROVINCE and OTHER_LOC.]
Answer Mining from Lexical Knowledge Bases
Lexical knowledge bases organized hierarchically represent an important source for mining answers to natural language questions. A special case is represented by the WordNet database, which encodes a vast majority of the concepts from the English language, lexicalized as nouns, verbs, adjectives and adverbs. What is special about WordNet is the fact that each concept is represented by a set of synonym words, also called a synset. In this way, words that have multiple meanings belong to multiple synsets. However, word sense disambiguation does not hinder the process of mining answers from lexico-semantic knowledge bases, because question processing enables a fairly accurate categorization of the expected answer type.
When open-domain natural language questions inquire only about entities or events or some of their attributes or roles, as was the case with the test questions in TREC-8, TREC-9 and the main task in TREC-10, an off-line taxonomy of answer types can be built by relying on the synsets encoded in WordNet (Fellbaum 1998). (Paşca and Harabagiu 2001) describes the semi-automatic procedure for acquiring a taxonomy of answer types, similar to the one illustrated in Figure 2.
An answer type hierarchy enables the mining of the answer of a wide range of questions. For example, when searching for the answer to the question "Who defeated Napoleon at Waterloo?" it is important to know that the expected answer represents some national armies, led by some individuals. In WordNet 1.7, the glossed definition of the synset {Waterloo, battle of Waterloo} is (the battle on 18 June 1815 in which Napoleon met his final defeat; Prussian and British forces under Blucher and the Duke of Wellington routed the French forces under Napoleon). The exact answer, extracted from this gloss, is "Prussian and British forces under Blucher and the Duke of Wellington", representing a noun phrase whose head, Prussian and British forces, is coerced from the names of the two nationalities, which are encoded as concepts in the answer type hierarchy. However, since the expected answer type is PERSON, several entities from the gloss are candidates for the answer: "Napoleon", "Prussian and British forces", "Blucher", "the Duke of Wellington" and "the French forces". While "Napoleon", "Prussian", "British" and "French" are encoded in WordNet, it is to be noted that "Blucher" and "the Duke of Wellington" are not. Nevertheless, they are recognized as names of persons due to the usage of a Named Entity recognizer.
Since the verb "defeat" is not reflexive, both "Napoleon" and "the French forces" are ruled out, the second due to the prepositional attachment "the French forces under Napoleon". Because the only remaining candidates are syntactically dependent through the coordinated prepositional attachment "Prussian and British forces under Blucher and the Duke of Wellington", they represent the exact answer. If a shorter exact answer is wanted, a philosophical debate over who is more important, the general or the army, needs to be resolved.
From the WordNet gloss, "Prussian and British forces" is recognized as the genus of the definition of Waterloo, and thus stands for an answer mined from WordNet.
Answer 1. Score: 280.00
The family gained prominence by financing such massive projects as European railroads and the British military campaign that led to the final defeat of Napoleon Bonaparte at Waterloo in June 1815
Answer 2. Score: 244.69
In 1815, Napoleon Bonaparte met his Waterloo as British and Prussian troops defeated the French forces in Belgium
Answer 3. Score: 244.69
In 1815, Napoleon Bonaparte met his Waterloo as British and Prussian troops defeated the French forces in Belgium
Answer 4. Score: 146.14
In 1815, Napoleon Bonaparte arrived at the island of Saint Helena off the west African coast where he was exiled after his loss to the British at Waterloo
Answer 5. Score: 132.69
Subtitled "World Society 1815-1830," the book concentrates on the period that opens with Napoleon's defeat at Waterloo and closes with the election of Andrew Jackson as president of the U.S. Science and technology are as important to his story as the rise and fall of empires

Table 1: Answers returned by LCC's QAS for the question "Who defeated Napoleon at Waterloo?"
Answer 1. Score: 888.00
Not surprisingly, Leningrad's Communist Party leadership is in the forefront of those opposed to changing it back again
Answer 2. Score: 856.00
The vehicle plant named after Lenin's Young Communist League and known as AZKL shut its car production line yesterday after supplies of engines from
Answer 3. Score: 824.00
"I'm against changing the name of the city," Yuri P. Belov, a secretary of the Leningrad regional Communist Party committee, said in an interview
Answer 4. Score: 820.69
Expressing the interests of the working class and all working people and relying on the great legacy of Marx, Engels and Lenin, the Soviet Communist Party is creatively developing socialist
Answer 5. Score: 820.69
In both Moscow and Leningrad, the local Communist Party has taken control of the newspapers it originally shared with

Table 2: Answers returned by LCC's QAS for the question "Who was Lenin?"
Mining this answer from WordNet is important, as it can determine a better scoring of QA systems that mine answers only from text collections. For example, if this answer were known, it would have improved the precision of LCC's Question Answering Server (QAS™), which produced the answers listed in Table 1. In this case, answers 4 and 5 would have been filtered out and answers 2 and 3 would have taken precedence over answer 1.
Definition questions
Mining answers from knowledge bases has even more importance when processing definition questions. The percentage of questions that ask for definitions of concepts, e.g. "What are capers?" or "What is an antigen?", represented 25% of the questions from the main task of TREC-10, an increase from a mere 9% in TREC-9 and 1% in TREC-8, respectively. Definition questions normally require an increase in the sophistication of the QA system. This is determined by the fact that questions asking about the definition of an entity use few other words that could guide the semantics of the answer mining. Moreover, for a definition, there will be no expected answer type. A simple question such as "Who was Lenin?" can be answered by consulting the WordNet knowledge base rather than a large set of texts like the TREC collection. Lenin is encoded in WordNet, in a synset that lists all his name aliases: {Lenin, Vladimir Lenin, Nikolai Lenin, Vladimir Ilyich Ulyanov}. Moreover, the gloss of this synset lists the definition: (Russian founder of the Bolsheviks and leader of the Russian Revolution and
first head of the USSR (1870-1924)). This is a correct answer, and furthermore, a shorter answer can be obtained by simply accessing the hypernym of the Lenin entry, revealing that he was a communist. When mining the answer for the same question from large collections of texts, as illustrated in Table 2, no correct complete answers can be found, as the information required by the answer is not explicit in any text, and thus only related information is found.
Q912: What is epilepsy?
Q1273: What is an annuity?
Q1022: What is Wimbledon?
Q1152: What is Mardi Gras?
Q1160: What is dianetics?
Q1280: What is Muscular Distrophy?

Table 3: Examples of definition questions.
Several additional definition questions that were evaluated in TREC-10 are listed in Table 3. For some of those questions, the concept that needs to be defined is not encoded in WordNet. For example, both dianetics from Q1160 and Muscular Distrophy from Q1280 are not encoded in WordNet. But whenever the concept is encoded in WordNet, both its gloss and its hypernym help determine the answer. Table 4 illustrates several examples of definition questions that are resolved by the substitution of the concept with its hypernym. The usage of WordNet hypernyms builds on the conjecture that any concept encoded in a dictionary like WordNet is defined by a genus and a differentia. Thus when asking about the definition of a concept, retrieving the genus is sufficient evidence of the explanation of the definition, especially when the genus is identical with the hypernym.
Concept that needs to be defined | Hypernym from WordNet | Candidate answer
What is a shaman? | {priest, non-Christian priest} | Mathews is the priest or shaman
What is a nematode? | {worm} | nematodes, tiny worms in soil
What is anise? | {herb, herbaceous plant} | aloe, anise, rhubarb and other herbs

Table 4: WordNet information employed for mining answers of definition questions.
In WordNet only one third of the glosses use a hypernym of the concept being defined as the genus of the gloss. Therefore the genus, processed as the head of the first Noun Phrase from the gloss, can also be used to mine for answers. For example, the processing of question Q1273 from Table 3 relies on the substitution of annuity with income, the genus of its WordNet gloss, rather than its hypernym, the concept regular payment. The availability of the genus or hypernym also helps the processing of definition questions in which the concept is a named entity, as is the case for questions Q1022 and Q1152 from Table 3. In this way Wimbledon is replaced with a suburb of London and Mardi Gras with holiday.
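Both sources of evidence are easy to reproduce with today's WordNet interfaces. A sketch using NLTK (our substitution; the paper queried WordNet 1.7 directly), where the genus is approximated as the first noun of the gloss rather than the head of its first Noun Phrase:

    from nltk import pos_tag, word_tokenize   # needs the punkt and tagger data
    from nltk.corpus import wordnet as wn     # needs the wordnet data

    def definition_evidence(concept):
        """Return (hypernyms, gloss, approximate genus) for a noun concept."""
        synset = wn.synsets(concept, pos=wn.NOUN)[0]
        gloss = synset.definition()
        nouns = [w for w, t in pos_tag(word_tokenize(gloss)) if t.startswith("NN")]
        return synset.hypernyms(), gloss, (nouns[0] if nouns else None)

    print(definition_evidence("annuity"))
    # hypernym ~ regular payment; the genus of the gloss ~ "income"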
Pertainyms
When a concept A is related by some unspecified semantics to another concept B, it is said that A pertains to B. WordNet encodes such relations, which prove to be very useful for mining answers to natural language questions. For example, in TREC-10, when asking "What is the only artery that carries blood from the heart to the lungs?", the correct answer is mined from WordNet because the adjective "pulmonary" pertains to lungs, therefore the Pulmonary Artery is the expected answer. Knowing the answer makes the retrieval of a text snippet containing it much easier. The textual answer from the TREC collection is: "a connection from the heart to the lungs through the pulmonary artery".
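With a modern WordNet interface the pertainym lookup is direct; a sketch using NLTK (again our substitution for the original WordNet 1.7 access):

    from nltk.corpus import wordnet as wn  # needs the NLTK wordnet data

    # The adjective "pulmonary" pertains to "lung", which licenses
    # "pulmonary artery" as the expected answer.
    for lemma in wn.synsets("pulmonary", pos=wn.ADJ)[0].lemmas():
        print(lemma.name(), "->", [p.name() for p in lemma.pertainyms()])
    # pulmonary -> ['lung']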
Bridging Inference
Questions like "What is the esophagus used for?" or "Why is the sun yellow?" are difficult to process because the answer relies on expert knowledge, from medicine in the former example and from physics in the latter one. Nevertheless, if a lexical dictionary that explains the definitions of concepts is available, some supporting knowledge can be mined. For example, by inspecting WordNet (Fellbaum 1998), in the case of esophagus we can find that it is "the passage between the pharynx and the stomach". Moreover, WordNet encodes several relations, like meronymy, showing that the esophagus is part of the digestive tube or gastrointestinal tract. The glossed definition of the digestive tube shows that one of its functions is digestion.
The information mined from WordNet guides several processes of bridging inference between the question and the expected answer. First, the definition of the concept defined by the WordNet synonym set {esophagus, gorge, gullet} indicates its usage as a passage between two other body parts: the pharynx and the stomach. Thus the query "esophagus AND pharynx AND stomach" retrieves all paragraphs containing relevant connections between the three concepts, including other possible functions of the esophagus. When the query does not retrieve relevant paragraphs, new queries combining esophagus and its holonyms (i.e. gastrointestinal tract) or functions of the holonyms (i.e. digestion) retrieve the paragraphs that may contain the answer. To extract the correct answer, the question and the answer need to be semantically unified.
The difficulty stands in resolving the level of precision required by the unification. Currently, LCC's QAS™ considers an acceptable unification when (a) a textual relation can be established between the elements of the query matched in the answer (e.g. esophagus and gastrointestinal tract); and (b) the textual relation is either a syntactic dependency generated by a parser, a reference relation, or it is induced by matching against a predefined pattern. For example, the pattern "X, particularly Y" accounts for such a relation, granting the validity of the answer "the upper gastrointestinal tract, particularly the esophagus". However, we are aware that such patterns generate multiple false positive results, degrading the performance of the question-answering system.
Abductive Processes
In (Hobbs et al. 1993) abductive inference is used as a vehicle for solving local pragmatics problems. To this end, a first-order logic implementing the Davidsonian reification of eventualities (Davidson 1967) is employed. The ontological assumptions of the Davidsonian notation were reported in (Hobbs et al. 1993). This notation has two advantages: (1) it is as close as possible to English; and (2) it localizes syntactic ambiguities, as was previously shown in (Bear and Hobbs 1988). Moreover, the same notation is used in a larger project first outlined in (Harabagiu et al. 1999) that extends the WordNet lexical database (Fellbaum 1998). The extension involves the disambiguation of the defining gloss of each concept as well as the transformation of each gloss into a logical form. The Logic Form Transformation (LFT) codification is based on syntactic relations such as: (a) syntactic subjects; (b) syntactic objects; (c) prepositional attachments; and (d) adjectival/adverbial adjuncts.
The Davidsonian treatment of actions considers that all predicates lexicalized by verbs or nominalizations have three arguments: action/state/event-predicate(e, x, y), where:
- e represents the eventuality of the action, state or event to have taken place;
- x represents the agent of the action, represented by the syntactic subject of the action, event or state;
- y represents the syntactic direct object of the action, event or state.
In addition, we express only the eventuality of the actions, events or states, considering that the subjects, objects or other themes expressed by prepositional attachments exist by virtue of the actions they participate in.
[Figure 3: Abductive coercion of candidate answers. Candidate answer 1: "the task of solving the paradox of why Athens, the cradle of Western democracy, condemned Socrates to death", with its LFT, evidence axioms for condemn and die, and WordNet troponyms such as how(e) & die(e,?) -> suffocate(e,?) and how(e) & die(e,?) -> drown(e,?). Candidate answer 2: "Socrates Butsikares, a former editor of the New York Times' international edition in Paris, died Thursday of a heart attack at a Staten Island hospital", with its LFT and evidence axioms.]
For example, the LFT of the gloss (a person who inspires fear) defining the WordNet synset² {terror, scourge, threat} is: [person(x) & inspire(e, x, y) & fear(y)]. We note that verbal predicates can sometimes have more than three arguments, e.g. in the case of bitransitive verbs. For example, the gloss of the synset {impart, leave, give, pass on} is (give people knowledge), with the corresponding LFT: [give(e, x, y, z) & knowledge(y) & people(z)], having the subject unspecified.
Since most thematic roles are represented as prepositional attachments, predicates lexicalized by prepositions are used to account for such relations. The preposition predicates always have two arguments: the first argument corresponds to the head of the prepositional attachment whereas the second argument stands for the object that is being attached. The predicative treatment of prepositional attachments was first reported in (Bear and Hobbs 1988). In the work of Bear and Hobbs, the prepositional predicates are used as means for localizing the ambiguities generated by possible attachments. In contrast, when LFTs are generated for WordNet glosses, all syntactic ambiguities are resolved by a high-performance probabilistic parser trained on the Penn Treebank (Marcus et al. 1993). However, the usage of preposition predicates in the LFT is very important, as it suggests the logic relations determined by the combination of syntax and semantics. Paraphrase matching can thus be employed to unify relations bearing the same meaning, thus enabling justifications. The paraphrases are correctly matched also because of the lexical and semantic disambiguation that is produced concurrently with the parse. The probabilistic parser also correctly identifies base NPs, allowing complex nominals to be recognized and transformed into a sequence of predicates.
²A set of synonyms is called a synset.
The predicate nn indicates a complex nominal, whereas its arguments correspond to the components of the complex nominal. For example, the complex nominal baseball team is transformed into [nn(a, b) & baseball(a) & team(b)].
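A sketch of how such predicates can be emitted from pre-parsed material (our illustration of the notation, not the LFT generator itself):

    def lft_clause(verb, subject=None, obj=None):
        """Verb predicates carry the eventuality e, the subject x and
        the object y, as in [person(x) & inspire(e, x, y) & fear(y)]."""
        preds = [f"{verb}(e, x, y)"]
        if subject:
            preds.append(f"{subject}(x)")
        if obj:
            preds.append(f"{obj}(y)")
        return "[" + " & ".join(preds) + "]"

    def lft_complex_nominal(*nouns):
        """Complex nominals become an nn() predicate over fresh arguments."""
        args = [chr(ord("a") + i) for i in range(len(nouns))]
        preds = [f"nn({', '.join(args)})"]
        preds += [f"{n}({a})" for n, a in zip(nouns, args)]
        return "[" + " & ".join(preds) + "]"

    print(lft_clause("inspire", subject="person", obj="fear"))
    # [inspire(e, x, y) & person(x) & fear(y)]
    print(lft_complex_nominal("baseball", "team"))
    # [nn(a, b) & baseball(a) & team(b)]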
The rules that generate LFTs were reported in (Moldovan and Rus 2001). The transformation of WordNet glosses into LFTs is important because they enable two of the most important criteria needed to select the best abduction. These criteria were reported in (Hobbs et al. 1993) and are inspired by Thagard's principles of simplicity and consilience (cf. (Thagard 1978)). Roughly, simplicity requires that the chain of explanations should be as small as possible, whereas consilience requires that the evidence triggering the explanation be as large as possible. In other words, for the abductive processes needed in QA, we expect to have as much evidence as possible to infer from a candidate answer the validity of the question. We also expect to make as few inferences as possible. All the glosses from WordNet represent the search space for evidence supporting an explanation. If at least two different justifications are found, the consilience criterion requires selecting the one having stronger evidence. However, the quantification of evidence is not a trivial matter.
For the example illustrated in Figure 3, the evidence used to justify the first answer of the TREC-8 question comprises axioms about the definition of the condemn concept, the definition of the how question class and the relation between the nominalization death and its verbal root die. The LFT of the first answer, when unified with these axioms, produces a coercion that cannot be unified with the LFT of the question because the causality relation is reversed: the answer explains why Socrates died and not how he died. The
second candidate answer, however, contains the right direction of the causality imposed by the semantic class of the question and is considered a correct, although surprising, answer.

[Figure 4: Abductive justification of candidate answers. Candidate answer 3 ("His critics even find fault with that, arguing that the association with ... makes ...") requires the abductive pattern association with(x) makes(y) -> relation(x, y), together with axioms such as how(e) & die(e,?) -> die(e,?) & cause(e,?,e1), suicide(e,?) -> kill(e,?,?) and kill(e,?,?) -> cause(e,?) & die(e,?), yielding the coercion Socrates(x) & suicide(e, x). Candidate answer 4 ("Socrates, who was sentenced to death for impiety and corruption of youth, ... poisonous ...") is supported by axioms such as how(e) & die(e,?) -> die(e,?) & cause(e,?,e1) and poison(e,x) -> kill(e,x).]
This is due to the ambiguity of the question: it is not clear whether we are asked about the more famous Socrates, i.e. the philosopher, or about somebody whose first name is Socrates. The coercion of the first candidate answer could not use any of the WordNet evidence provided by the troponyms of the verb die.
The simplicity criterion is satisfied because the coercion of the answer always uses the minimal path of backward inference. Hobbs considers that in abduction, the balance between simplicity and consilience is guaranteed by the high degree of redundancy typical in natural language processing. However, we note that redundancy is not always enough. An example is illustrated in Figure 4 by the third candidate answer to the same TREC question. Without an abductive pattern recognizing a relation between concept X and concept Y in the expression "association with X makes Y", it is practically impossible to justify this answer, which is correct. The fourth candidate answer is also correct, and furthermore it is a more specific answer. However, using only the axiomatic knowledge derived from WordNet through the LFTs, it is impossible to show that the third answer is more general than the fourth one. It is therefore hard to quantify the simplicity criterion, due to the generative power of natural language, which also has to be taken into consideration when implementing abductive processes.
When implementing abductive processes, we considered the similarity between the simplicity criterion and the notion of context simpliciter discussed in (Hirst 2000). Hirst notes that context in natural language interpretation cannot be oversimplified and considered only as a source of information, which is its main functionality when employed in knowledge representation. With some very interesting examples, Hirst shows that in natural language, context constrains the interpretation of texts. At a local level, it is known that context is a psychological construct that generates the meaning of words. Miller and Charles argued in (Miller and Charles 1991) that word sense disambiguation (WSD) could be performed easily if we could know all the contexts in which words occur. In a very simplified way, local context can be imagined as a window of words surrounding the targeted concept, but work in WSD shows that modeling context in this way imposes sophisticated ways of deriving the features of these contexts. To us it is clear that the problem of answer justification is by far more complex and difficult than the problem of WSD, and therefore we believe that simplistic representations of context are not beneficial for QA. Hirst notes in (Hirst 2000) that "having discretion in constructing context does not mean having complete freedom".
Arguing that context is a spurious concept that may come dangerously close to being used in incoherent manners, Hirst notes that "knowledge used in interpreting natural language is broader while reasoning is shallow". In other words, the inference path is usually short while the evidential knowledge is rich. We used this observation when building the abductive processes.
To comply with the simplicity and consilience criteria we need (1) to have sufficient textual evidence for justification, which we accomplish by implementing an information retrieval loop that brings forward all paragraphs containing the keywords of the question as well as at least one concept of the expected answer type³; (2) to consider that questions can be unified with answers, and thus provide immediate explanations; (3) to justify the answer through abductions that use axioms generated from WordNet along with information derived by reference resolution or by world knowledge; and (4) to make available the recognition of paraphrases and to generate some normalized dependencies between concepts. However, we find that we also need to take into account the contextual information. To account for our belief, we present an example inspired by the discussion of context from (Hirst 2000). The example is illustrated in Figure 5.

³More details of the feedback loops implemented for QA are reported in (Harabagiu et al. 2001).

[Figure 5: The role of context in abductive processes. Question: "Why are so many ... sent to jail?" Answer: "Jobs and education diminish." Context 1 (Reverend Jesse Jackson, in an article on Martin Luther King's "I have a dream" speech): the "giant sucking sound" is not that of American jobs going to NAFTA and GATT cheap industrial markets; it is the sound of our youth being sucked into the jail industrial complex as jobs and education diminish. Context 2 (Ross Perot's first presidential campaign): if the United States approves NAFTA, the giant sucking sound that we hear will be the sound of thousands of jobs and factories disappearing to Mexico.]
When the question from Figure 5 is posed, the answer is extracted from a paragraph contained in an article by Reverend Jesse Jackson on Martin Luther King's "I have a dream" speech. This paragraph also contains an anaphoric expression, "the giant sucking sound", which is used twice. The anaphor is resolved to the metaphoric explanation that this sound is caused by youngsters being jailed due to the diminishing jobs and education. In the process of interpreting the anaphor, the expression "being sucked into the jail industrial complex" is recognized as a paraphrase of "sent to jail" even if it contains the morphological relation to the "sucking sound". So what is this "sucking sound"? As Hirst first points out in (Hirst 2000)⁴, at the time of the writing of the article containing the answer, Reverend Jesse Jackson alludes to a widely reported remark of Ross Perot. This remark brings forward the context of Perot's speech in his first presidential campaign, when he predicted that NAFTA would enable the disappearance of jobs and factories to Mexico. This example shows that both anaphora resolution, which sometimes requires access to embedded context, and the recognition of causation relations are at the crux of answer justification. The recognition of the causation underlying the text snippet "as jobs and education diminish, our youth are being sucked into the jail industrial complex" is based on cause-effect patterns triggered by cue phrases such as "as". This example also leads to the conclusion that abductive processes that were proven successful in the past TREC evaluations need to be extended to deal with more forms of pragmatic knowledge as well as non-literal expressions (e.g. metaphors).
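The cue-phrase-triggered recognition can be sketched with one simplified pattern (a stand-in for the system's cause-effect patterns, which we do not have access to):

    import re

    # Simplified cause-effect pattern "as <cause>, <effect>".
    CAUSE_EFFECT = re.compile(r"^as\s+(?P<cause>[^,]+),\s*(?P<effect>.+)$", re.I)

    m = CAUSE_EFFECT.match(
        "as jobs and education diminish, "
        "our youth are being sucked into the jail industrial complex")
    if m:
        print("CAUSE :", m.group("cause"))
        print("EFFECT:", m.group("effect"))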
Redundancy vs. Abduction
Definition questions impose a special kind of QA because often the question contains only one concept, the one that needs to be defined. Moreover, in a text collection, the definition may not be matched by one of the predefined patterns, and thus requires additional information to be recognized. Such information may be obtained by using the very large index of a search engine, e.g. google.com. When retrieving documents for the concept to be defined, the most frequent words encountered in the summaries produced by GOOGLE for each document could be considered. Due to the redundancy of these words, we may assume that they carry important information. However, this is not always true. For the question "What is an atom?" we found that none of the redundant words were used in the definition of atom in the WordNet database. Nevertheless this technique is useful for concepts that are not defined in WordNet. When trying to answer the question "Who is Sanda Harabagiu?", GOOGLE returns documents showing that she is a faculty member and an author of papers on QA. Therefore for such questions, redundancy may be considered a reliable source of justifications.
Conclusions
In this paper we have presented several forms of abductive processes that are implemented in a QA system. We also draw attention to more complex abductive explanations that require more than the lexico-semantic information or axiomatic knowledge that can be scanned in a left-to-right back-chaining of the LFT of a candidate answer, as was reported in (Harabagiu et al. 2000). We claim in this paper that the role of abduction in QA will break ground in the interpretation of the intentions of questioners. We discussed the difference between redundancy and abduction in QA and concluded that some forms of information redundancy may be safely considered as abductive processes.

Acknowledgements: This research was partly funded by the ARDA AQUAINT program.
⁴This example was taken from (Hirst 2000).
References
S. Abney, M. Collins, and A. Singhal. Answer extraction. In Proceedings of the 6th Applied Natural Language Processing Conference (ANLP-2000), pages 296-301, Seattle, Washington, 2000.
J. Bear and J. Hobbs. Localizing expressions of ambiguity. In Proceedings of the 2nd Applied Natural Language Processing Conference (ANLP-1988), pages 235-242, 1988.
P. Cohen, R. Schrag, E. Jones, A. Pease, A. Lin, B. Starr, D. Easter, D. Gunning, and M. Burke. The DARPA High-Performance Knowledge Bases Project. Artificial Intelligence Magazine, 19(4):25-49, 1998.
M. Collins. A new statistical parser based on bigram lexical dependencies. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics (ACL-96), pages 184-191, 1996.
D. Davidson. The logical form of action sentences. In N. Rescher, ed., The Logic of Decision and Action, University of Pittsburgh, PA, pages 81-95, 1967.
Christiane Fellbaum (Editor). WordNet - An Electronic Lexical Database. MIT Press, 1998.
A. Graesser and S. Gordon. Question-Answering and the Organization of World Knowledge. Lawrence Erlbaum, 1991.
S. Harabagiu, G.A. Miller and D. Moldovan. WordNet 2 - A Morphologically and Semantically Enhanced Resource. In Proceedings of SIGLEX-99, pages 1-8, Univ. of Maryland, June 1999.
S. Harabagiu and S. Maiorano. Finding answers in large collections of texts: Paragraph indexing + abductive inference. In AAAI Fall Symposium on Question Answering Systems, pages 63-71, November 1999.
S. Harabagiu, M. Paşca and S. Maiorano. Experiments with Open-Domain Textual Question Answering. In Proceedings of the 18th International Conference on Computational Linguistics (COLING-2000), pages 292-298, Saarbrücken, Germany, August 2000.
S. Harabagiu, D. Moldovan, M. Paşca, R. Mihalcea, M. Surdeanu, R. Bunescu, R. Gîrju, V. Rus and P. Morărescu. The role of lexico-semantic feedback in open-domain textual question-answering. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (ACL-2001), pages 274-281, Toulouse, France, 2001.
G. Hirst. Context as a spurious concept. In Proceedings of the Conference on Intelligent Text Processing and Computational Linguistics, pages 273-287, Mexico City, February 2000.
E. Hovy, L. Gerber, U. Hermjakob, C-Y. Lin, D. Ravichandran. Toward semantic-based answer pinpointing. In Proceedings of the Human Language Technology Conference (HLT-2001), San Diego, California, 2001.
J. Hobbs, M. Stickel, D. Appelt, and P. Martin. Interpretation as abduction. Artificial Intelligence, 63, pages 69-142, 1993.
B. Katz. From sentence processing to information access on the world wide web. In AAAI Spring Symposium on Natural Language Processing for the World Wide Web, Stanford, California, 1997.
W. Lehnert. The Process of Question Answering: A Computer Simulation of Cognition. Lawrence Erlbaum, 1978.
Mitch Marcus, Beatrice Santorini and Mary Ann Marcinkiewicz. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313-330, 1993.
G.A. Miller. WordNet: a lexical database. Communications of the ACM, 38(11):39-41, 1995.
G.A. Miller and W.G. Charles. Contextual correlates of semantic similarity. Language and Cognitive Processes, 6(1):1-28, 1991.
D. Moldovan and V. Rus. Transformation of WordNet Glosses into Logic Forms. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (ACL-2001), pages 394-401, Toulouse, France, 2001.
M. Paşca and S. Harabagiu. High Performance Question/Answering. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-2001), pages 366-374, New Orleans, LA, September 2001.
G. Salton, editor. The SMART Retrieval System. Prentice-Hall, New Jersey, 1969.
P.R. Thagard. The best explanation: criteria for theory choice. Journal of Philosophy, pages 79-92, 1978.
E. Voorhees. Overview of the TREC-9 Question Answering Track. In Proceedings of the 9th Text REtrieval Conference (TREC-9), pages 71-79, 2001.