INFEHENCEWITHRECURSIVERULES Stuart C. Shapiro and Donald

advertisement
From: AAAI-80 Proceedings. Copyright © 1980, AAAI (www.aaai.org). All rights reserved.
INFEHENCEWITHRECURSIVERULES
StuartC. Shapiroand DonaldP. McKay
Departmentof ComputarScience
State Universityof New York at Buffalo
Amherst,New York 14226
ABSTRACT
relevantrules and builds a data structureof
processeswhich attemptto derive the answer frcm
the rules and other informationstored in the data
base. Since we are using a semanticnetworkto
representall declarativeinformationavailablein
the system,we do not make a distinctionbetween
"extensional"
and "intensional"
data bases, i.e.
non-rulesand rules are stored in the same data
we do not distinguish
base. More significantly,
"base"frm "defined"relations. Specific
instancesof ANCESTOR maybe storedaswell as
This point of view
a ruledefining ANCESTOR.
contrastswith the basic assurrption
of several
data base questionansweringsystems [3,8,9]. In
addition,the inferencesystemdescribedhere does
not restrictthe left hand side of rules to contain only one literalwhich is a derivedrelation
[3],does not need to recognizecycles in a graph
[3,8] and does not requirethat there be at least
one exit frm a cycle [8l.
Recursiverules, such as "Yourparents'ancestors are your ancestors",althoughvery useful for
theoremproving,naturallanguageunderstanding,
ring and informationretrieval
questions-answe
systems,presentproblemsfor many such systems,
either causinginfiniteloops or requiringthat
arbitrarilymany copiesof them be made. We have
written an inferencesystem that can use recursive
rules without eitherof these problems. The solution appearedautomaticallyfrom a technique
designedto avoid redundantwork. A recursive
rule causes a cycle to be built in an AND/OR graph
of activeprocesses. Each pass of data throughthe
cycle resultingin anotheranswer. Cyclingstops
as soon as either the desiredanswer is produced,
no more answerscan be produced,or resource
bounds are exceeded.
Introduction
Recursiverules, such as "your parents'ancestors are your ancestors",occur naturallyin
inferencesystemsused for theoremproving,
questionanswering,natural languageunderstanding,
and informationretrieval. Transitiverelations,
v(x,y,z)[ANCES'lloR(x,y)
& ANCESToR(y,z)-+
%%SToR(x,z)], inheritancerules, e.g. ~z(x,y,p)
[ISA(x,y)& HAs(y,p)-+HAS(x,p)l,circulardefinitions and equivalencesare all occurrencesof
recursiverules. Yet, recursiverules present
problemsfor system implemantors. Inference
systemswhich use a %aive chaining"algorithmcan
go into an infiniteloop, like a left-to-right
top-downparser given a left recursivegrammar
Sme systemswill fail to use a recur[41.
sive rule more than once, i.e. are incomplete
Other systemsbuild tree-likedata
[6,121.
structures(connection
graphs)containingbranches
thelengthofwhichdependonthenlrmberoftimes
the recursiverule is to be applied [2,131. Since
scme of these build the structurebefore using it,
the correctlengthof these branchesis problematic.
Son-esystemseliminaterecursiverules by deriving
and addingto the data base all implicationsof
the recursiverules in a specialpass before normal
inferenceis done [91.
The inferencesystemof SNePS [13]was designed
ti use rules stored in a fully indexeddata base.
when a questionis asked, the systemretrieves
--7-sTheworkwas supportedinpartbytheNationa1
ScienceFoundationunder Grant No. MCS78-02274.
The structureof processesmay be viewed as
an AND/OR problemreductiongraph in which the
processworkingon the originalquestionis the
mot, and rules are problemreductionoperators.
Partly influencedby Kaplan'sproducer-consumer
model [53,we designedthe system so that if a
processworkingon some problemis about to
create a process for a subproblem,and there is
anotherprocessalreadyworkingon that subproblem,
the parentprocess canmake useof the extant
processand so avoid solvingthe same problem
again. The methodwe employ handlesrecursive
rules with no additionalmechanism. The structure
of processesmay be viewed as an active connection
graph,but, as will be seen below, the size of the
resultingstructureneed not depend on the number
of times a recursiverule will be used.
151
This paper describeshm our systemhandles
recursiverules. Aspectsof the system not
directlyrelevantto this issuewill be abbreviated
orcmitted. Inparticular,detailsof thematch
routinewhich retrievesformulasunifiablewith a
given formulawill not be discussed (butsee [lOI).
The InferenceSysteq
The SNePS inferencesystembuilds a graph of
processes [7,11]to answera question (derive
instancesof a given formula)based on a data base
of assertions(groundatomic formulas)and rules
(non-at&c formulas). Each processhas a set of
signifyingthat TTB has been derived. SW'PK!H
sends to its boss the application46., the substitutionderived from o by replacingeach term
t in o by tB. The effectof the SWITCH is to
changethe answer from the contextof the variables of T to the contextof the variablesof Q.
1n0u.rexample,themmightsendthe
answer
B=k/xl. SWITCHwould then send a\B = {b/x,x/y}\
{c/xl= (b/x,c/y)to the INFER, indicatingthat
Q~\B = P(x,a,y)(b/x,c/y)
= P(b,a,c)has been
derived. The importanceof the factoringof the
mguof Q andT into the sourcebinding0 and the
targetbinding T - a separationwhich the SWITCH
repairs-- is thattheCHAIN canwork onT in the
contextof its originalvariablesand report to
many bosses,each throughits own SWITCH.
A CHAIN processis createdto use a particular substitutioninstance,T, of a particular
formula,Al&...GAk1T to deduce instancesof T-r.
Its answers,which will be sent to a SWITCH,will
Four Processes
be substitutionsB such that T-rBhas been deduced
using the rule. For each Ai, ISilk,the CHAIN
An INFER process is createdto derive
tries to discoverif ANT is deducibleby creating
instancesof a formula,Q. It firstmatchesQ
an INF'ER
process for it. However,an INFER process
againstthe data base to find all formulas
might alreadybe workingon Aia. If ~.=TT,
unifiablewith Q. The result of this match is a
the
alreadyextant INFER is just what the CHAIN wants.
list of triples,<T,-r,W,where T is a retrieved
It takes all the data the INFER has already
formulacalled the target.and 'cand 0 are subcollected,and adds itselfto the INFER'sbosses
stitutionscalled ti?get
bindingand source
so that itwill also get futureanswers. If a
bindingrespectively.
-Essentially-cand o are
is more generalthan -r,the INFER will produceall
factoreaversionsof the mst generalunifier
the data the CHAIN wants, but unwanteddata as
(mgu)ofQ and T. Pairs of the rqu whose variables
well. In this case the CHAIN createsa FILTER
are in Q appear in G, while those whose variables
areinTappearinT.
Any variablein term position processto standbetween it and the INFER. The
FILTER storesa substitutionconsistingof those
is taken from T. Factoringthe q-u obviatesthe
pairs of T for which the term is a constant,and
need for renamingvariables. For exampleif
when it receivesan answer substitutionfrom the
Q=P(x,a,y)and *P(b,y,x), we would have
INFER,it passes it along to its CHAIN only if
o=(b/x,x/y)and T={a/y,x/x)(thepair x/x is
includedto make our algorithmseasier to describe). the stored substitutionis a subsetof the answer.
For example,if T were (a/x,y/z,b/w}
Note that Qa = T-r= P(b,a,x),the variablesin the
and a were
variablepositionof the substitutionpairs of 0
(u/x,v/z,v/wl, a FILTERwould be createdwith a
sutstitutionof {a/x,b/w),insuringthat unwanted
are all and only the variablesin Q, the variables
answerssuch as (c/x,d/z,b/w)
producedby the more
in the variablepositionof 'care all and only the
generalINFER were filteredout. If a is not
variablesin T, all terms of crce from T, and
compatiblewith T, or is less generalthan T, a
the non-variablesin T came from Q.
new INFER must be created. However,if a is less
generalthan T, the old INFER might alreadyhave
For each match <TJ,o> that an INFER finds
for Q, there are two possiblitieswe shall consider. collectedanswersthat the new one can use. These
are takenby the new DJFERand senttoits bosses.
First, T might be an -assertion
in the data base.
In this case, G is an answer (Qcr
has been derived). Also, since thenew INFERwillprcduce all the
additionalanswersthat the old one would (plus
If the INFER has alreadystored 0, it is ignored.
others),the old INFER is eliminatedand its
Otherwise,CTis storedby the INFER and the pair
bosses given to the new INFERwith intervening
<Q,o> is sent to all the INFER'sbosses. For CTto
FILTEXs. The net result is that the same
be a reasonableanswer,it is crucialthat all its
structureof processesis createdregardlessof
vai&les occur in Q. The other case we shall conwhether the mre generalor less generalquestion
sider is the one in which T is the consequentof
was asked first.
some rule of the form Al&...&An1T. (oursystem
allcrws
other forms of rules,but considerationof
A CHAIN receivesanswersfrom INFERS (possibly
this one will sufficefor explaininghow we handle
filtered)in the form of pairs <Ai,Bi>indicating
recursiverules). In this case, the INFER creates
that AiPi, an instanceof the antec&entAi,has
two other processes,a SWITCH and a CHAIN to
beendeduced. Wheneverthe CHAIN collectsa set
derive instancesof T-r. The SWITCH is made the
of consistentsubstitutions(Bi,...,Bn),
one for
CHAIN'sboss, and the INFER the SWITCH'sboss.
each antecedent,it sends an answar to its bosses
It maybe the case that an alreadyextant CHADJ
consistingof the ccgnbination
of Bl,...,Bk(where
maybeused insteadof anewone. This will be
the ambination of 61 = ~tll/vll,...,tln,/vlnlI
discussedbelow.
,...,Bk= ctkl/vkl
,...,tknk/vknkl
is the mgu of
the expressions(vll,...,vlnl,...,vkl,...,vknk)
and
The SWITCHprocesshas a registerwhich is
(tll ,...,tlnl,.
..,tkl,...,tknk)
set to the sourcebinding,O. The answersit
[l,p.1871).
receivesfrom the W
are substitutionsB,
registerswhich containdata, and each processmay
send messagesto other processes. Since, in this
system,the messagesare all answersto scntequestion, we will call a processP2 a boss of a process
PI if PI sends messagesto P2. Sane processes,
calleddata collectors,are distinguish& by two
features: 1) they can have storethan one boss: 2)
they store all massagesthey have sent to their
bosses. The storedmessagesare used for two
purposes: a) it allows the data collectorto avoid
sendingthe sama messagetwice; b) it allows the
data collectorto be given a new boss, which can
inmediately be broughtup to date by being given
all the messagesalreadysent to the other bosses.
Four types of processesare importantto the
discussionof recursiverules. They are called
INFER,CHAIN, SWITCH and FILTER. INFER and CHAIN
are data collectors,SWITCH and FILTER are not.
152
In normalBases for Data Bases, Gallaire,H.,
Minker,J. and Nicolas,J. (edS.),Plenum,New
York, i980.
4. Fikes, R.E., and HendrixG.G., The deduction
component. In UnderstandingSpokenLanguage,
Walker,D.E., ed., ElsevierNorth-Holland,1978,
RecursiveRules Cause Cycles
Just as a CHAIN can make use of alreadyexisting INFERS,an INFER can r&e use of already
existingms,
filteredif necessary. A
recursiverule is a chain of the form
with C unifiable
Al&...&Ak~Bl,Bl&...&Bn~...X,
with at least one of the antecedents,Al say.
When an INFER operateson Al, it will find that C
matchesAl, and it may find that it can use the
CHAIN alreadycreatedfor C. Since this CHAIN is
in fact the INFER'sboss, this will result in a
cycle of processes. The cyclewill producemore
answersas new data is passed around the cycle,
but no infiniteloop will be created since no data
collectorsends any answernore thanonce. (If an
infiniteset of Skolem constantsis generated,the
processwill still terminateif the root goal had
a finitenumberof desired answersspecified[ll,
p.1941).
355-374.
5. Kaplan,R.M., A
multi-processing
approachto
natural languageunderstanding.Proc. National
ComputerConference,AFIPS Press,Montvale,NJ,
1973, 435-440.
6. Klahr, P.,
Planningtechniquesfor rule selection in deductivequestion-answering.In PattemDirectedInferenceSystems,Waterman,D.A., and
Hayes-Roth,R., eds., AcademicPress,New York,
1978, 223-239.
7. McKay, D.P.
and Shapiro,S.C., MULTI--ALISP
based muitiprocessing
system. Proc. 1980 LISP
Conference.StanfordUniversitv.1980.
Performing
8. Naqvi,'S.A.,and Henschen,-L.J.,
inferencesover recursivedata bases. Proc.
First AAAI Conference,StanfordUnivers1980.
9. Reiter,R., On structuringa first order data
base, Proc. SecondNationalConference,Canadian
Societyfor ComputationalStudiesof Intelligence,
Figure 1 shows a structureof processeswhich
we consideran active connectiongraph. It is
built to derive instancesof ANCESToR(William,w)
form the rulesvCrx,y)
[PARENT(x,y)~ANCESToR(x,y)
1
and V‘(x,y,z)
[ANcESToR(x,y)
& PAREMT(y,z)x
ANCESToR(x,z)].The notationfor the rule
instancesis similarto thatpresented in [31.
Note particularlythe SWITCH in the cycle which
allowsnewly derived instancesof the goal
ANCESToR(William,w)
to be treatedas additional
instancesof the antecedentANCESToR(William,y).
A similarstructurewould be built regardlessof
the order of assertingthe two rules, the order
of anteaedentsinthetwoantecedentrule, the
order of executionof the processes,whether the
query had zero,either one, or both variables
ground,or if the twoanteceiientruleused
ANCESTORfor both antecedents.
1978, 50-99.
10. Shapiro,S.C., Representing
andlocatingde-
ductionrules in a semanticnetwork. Proc. Workshop on Pattern-Directed
InferenceSystexns.In
SIGARTNewsletter,63 (June1977), 14-18.
11. Shapiro,S.C., The SNePS sepnanticneixork
processingsystem. In AssociativeNetworks:The
Representation
and Use of Kncwledsebv Camrsuters.
F&dler, N.V., ed., AcadenicPress,NW York, 1979,
179-203.
12. Shortliffe,E-H., ComputerBased Medical Consultations:MYCIN, &rericanElsevier,New York,
1976.
13. Sickel,S., A search techniquefor clause
interconnectivity
graphs,IEEE Transactionson
ComputersVol. C-25, 8 (August1976)I 823-835.
In the SNePS inferencesystem,recursive
rules cause cyclesto be built in a graph structure
of processes. The key featuresof the inference
systemwhich allow recursiverules to be handled
are: I) the processesthat producederivations
(INFERand CHAIN) are data collectors:2) data
collectorsnever send the same answernore than
once: 3) a data collectormay report to mre than
one boss: 4) a new boss may be assignedto a data
collectorat any time -- it will miately
be
given all previouslycollecteddata; 5) variable
contextsare localized,SWITCH changingcontexts
dynamicallyas data flows around the graph: 6)
FILTERsallow more generalproducersto be used
by less generalconsumers.
1. chang, C.-L., and Lee, R.C.-T.,SymbolicLogic
and MechanicalTheoremProving,Acadtic Press,
New York, 1973.
2. chang, C.-L., and Slagle,J.R., Using rewriting
rules for connectiongraphs to prove theorems,
ArtificialIntelligence12, 2 (August1979),
159-180.
3. chang, C.-L., On evaluationof queries containing derivedrelationsin a relationaldata base.
153
ANCZSMR(William,z)
y)
PARETJT(William,y)AN(TlZSMR(William,
Figure 1.
Download