From: AAAI-80 Proceedings. Copyright © 1980, AAAI (www.aaai.org). All rights reserved. INFEHENCEWITHRECURSIVERULES StuartC. Shapiroand DonaldP. McKay Departmentof ComputarScience State Universityof New York at Buffalo Amherst,New York 14226 ABSTRACT relevantrules and builds a data structureof processeswhich attemptto derive the answer frcm the rules and other informationstored in the data base. Since we are using a semanticnetworkto representall declarativeinformationavailablein the system,we do not make a distinctionbetween "extensional" and "intensional" data bases, i.e. non-rulesand rules are stored in the same data we do not distinguish base. More significantly, "base"frm "defined"relations. Specific instancesof ANCESTOR maybe storedaswell as This point of view a ruledefining ANCESTOR. contrastswith the basic assurrption of several data base questionansweringsystems [3,8,9]. In addition,the inferencesystemdescribedhere does not restrictthe left hand side of rules to contain only one literalwhich is a derivedrelation [3],does not need to recognizecycles in a graph [3,8] and does not requirethat there be at least one exit frm a cycle [8l. Recursiverules, such as "Yourparents'ancestors are your ancestors",althoughvery useful for theoremproving,naturallanguageunderstanding, ring and informationretrieval questions-answe systems,presentproblemsfor many such systems, either causinginfiniteloops or requiringthat arbitrarilymany copiesof them be made. We have written an inferencesystem that can use recursive rules without eitherof these problems. The solution appearedautomaticallyfrom a technique designedto avoid redundantwork. A recursive rule causes a cycle to be built in an AND/OR graph of activeprocesses. Each pass of data throughthe cycle resultingin anotheranswer. Cyclingstops as soon as either the desiredanswer is produced, no more answerscan be produced,or resource bounds are exceeded. Introduction Recursiverules, such as "your parents'ancestors are your ancestors",occur naturallyin inferencesystemsused for theoremproving, questionanswering,natural languageunderstanding, and informationretrieval. Transitiverelations, v(x,y,z)[ANCES'lloR(x,y) & ANCESToR(y,z)-+ %%SToR(x,z)], inheritancerules, e.g. ~z(x,y,p) [ISA(x,y)& HAs(y,p)-+HAS(x,p)l,circulardefinitions and equivalencesare all occurrencesof recursiverules. Yet, recursiverules present problemsfor system implemantors. Inference systemswhich use a %aive chaining"algorithmcan go into an infiniteloop, like a left-to-right top-downparser given a left recursivegrammar Sme systemswill fail to use a recur[41. sive rule more than once, i.e. are incomplete Other systemsbuild tree-likedata [6,121. structures(connection graphs)containingbranches thelengthofwhichdependonthenlrmberoftimes the recursiverule is to be applied [2,131. Since scme of these build the structurebefore using it, the correctlengthof these branchesis problematic. Son-esystemseliminaterecursiverules by deriving and addingto the data base all implicationsof the recursiverules in a specialpass before normal inferenceis done [91. The inferencesystemof SNePS [13]was designed ti use rules stored in a fully indexeddata base. when a questionis asked, the systemretrieves --7-sTheworkwas supportedinpartbytheNationa1 ScienceFoundationunder Grant No. MCS78-02274. The structureof processesmay be viewed as an AND/OR problemreductiongraph in which the processworkingon the originalquestionis the mot, and rules are problemreductionoperators. Partly influencedby Kaplan'sproducer-consumer model [53,we designedthe system so that if a processworkingon some problemis about to create a process for a subproblem,and there is anotherprocessalreadyworkingon that subproblem, the parentprocess canmake useof the extant processand so avoid solvingthe same problem again. The methodwe employ handlesrecursive rules with no additionalmechanism. The structure of processesmay be viewed as an active connection graph,but, as will be seen below, the size of the resultingstructureneed not depend on the number of times a recursiverule will be used. 151 This paper describeshm our systemhandles recursiverules. Aspectsof the system not directlyrelevantto this issuewill be abbreviated orcmitted. Inparticular,detailsof thematch routinewhich retrievesformulasunifiablewith a given formulawill not be discussed (butsee [lOI). The InferenceSysteq The SNePS inferencesystembuilds a graph of processes [7,11]to answera question (derive instancesof a given formula)based on a data base of assertions(groundatomic formulas)and rules (non-at&c formulas). Each processhas a set of signifyingthat TTB has been derived. SW'PK!H sends to its boss the application46., the substitutionderived from o by replacingeach term t in o by tB. The effectof the SWITCH is to changethe answer from the contextof the variables of T to the contextof the variablesof Q. 1n0u.rexample,themmightsendthe answer B=k/xl. SWITCHwould then send a\B = {b/x,x/y}\ {c/xl= (b/x,c/y)to the INFER, indicatingthat Q~\B = P(x,a,y)(b/x,c/y) = P(b,a,c)has been derived. The importanceof the factoringof the mguof Q andT into the sourcebinding0 and the targetbinding T - a separationwhich the SWITCH repairs-- is thattheCHAIN canwork onT in the contextof its originalvariablesand report to many bosses,each throughits own SWITCH. A CHAIN processis createdto use a particular substitutioninstance,T, of a particular formula,Al&...GAk1T to deduce instancesof T-r. Its answers,which will be sent to a SWITCH,will Four Processes be substitutionsB such that T-rBhas been deduced using the rule. For each Ai, ISilk,the CHAIN An INFER process is createdto derive tries to discoverif ANT is deducibleby creating instancesof a formula,Q. It firstmatchesQ an INF'ER process for it. However,an INFER process againstthe data base to find all formulas might alreadybe workingon Aia. If ~.=TT, unifiablewith Q. The result of this match is a the alreadyextant INFER is just what the CHAIN wants. list of triples,<T,-r,W,where T is a retrieved It takes all the data the INFER has already formulacalled the target.and 'cand 0 are subcollected,and adds itselfto the INFER'sbosses stitutionscalled ti?get bindingand source so that itwill also get futureanswers. If a bindingrespectively. -Essentially-cand o are is more generalthan -r,the INFER will produceall factoreaversionsof the mst generalunifier the data the CHAIN wants, but unwanteddata as (mgu)ofQ and T. Pairs of the rqu whose variables well. In this case the CHAIN createsa FILTER are in Q appear in G, while those whose variables areinTappearinT. Any variablein term position processto standbetween it and the INFER. The FILTER storesa substitutionconsistingof those is taken from T. Factoringthe q-u obviatesthe pairs of T for which the term is a constant,and need for renamingvariables. For exampleif when it receivesan answer substitutionfrom the Q=P(x,a,y)and *P(b,y,x), we would have INFER,it passes it along to its CHAIN only if o=(b/x,x/y)and T={a/y,x/x)(thepair x/x is includedto make our algorithmseasier to describe). the stored substitutionis a subsetof the answer. For example,if T were (a/x,y/z,b/w} Note that Qa = T-r= P(b,a,x),the variablesin the and a were variablepositionof the substitutionpairs of 0 (u/x,v/z,v/wl, a FILTERwould be createdwith a sutstitutionof {a/x,b/w),insuringthat unwanted are all and only the variablesin Q, the variables answerssuch as (c/x,d/z,b/w) producedby the more in the variablepositionof 'care all and only the generalINFER were filteredout. If a is not variablesin T, all terms of crce from T, and compatiblewith T, or is less generalthan T, a the non-variablesin T came from Q. new INFER must be created. However,if a is less generalthan T, the old INFER might alreadyhave For each match <TJ,o> that an INFER finds for Q, there are two possiblitieswe shall consider. collectedanswersthat the new one can use. These are takenby the new DJFERand senttoits bosses. First, T might be an -assertion in the data base. In this case, G is an answer (Qcr has been derived). Also, since thenew INFERwillprcduce all the additionalanswersthat the old one would (plus If the INFER has alreadystored 0, it is ignored. others),the old INFER is eliminatedand its Otherwise,CTis storedby the INFER and the pair bosses given to the new INFERwith intervening <Q,o> is sent to all the INFER'sbosses. For CTto FILTEXs. The net result is that the same be a reasonableanswer,it is crucialthat all its structureof processesis createdregardlessof vai&les occur in Q. The other case we shall conwhether the mre generalor less generalquestion sider is the one in which T is the consequentof was asked first. some rule of the form Al&...&An1T. (oursystem allcrws other forms of rules,but considerationof A CHAIN receivesanswersfrom INFERS (possibly this one will sufficefor explaininghow we handle filtered)in the form of pairs <Ai,Bi>indicating recursiverules). In this case, the INFER creates that AiPi, an instanceof the antec&entAi,has two other processes,a SWITCH and a CHAIN to beendeduced. Wheneverthe CHAIN collectsa set derive instancesof T-r. The SWITCH is made the of consistentsubstitutions(Bi,...,Bn), one for CHAIN'sboss, and the INFER the SWITCH'sboss. each antecedent,it sends an answar to its bosses It maybe the case that an alreadyextant CHADJ consistingof the ccgnbination of Bl,...,Bk(where maybeused insteadof anewone. This will be the ambination of 61 = ~tll/vll,...,tln,/vlnlI discussedbelow. ,...,Bk= ctkl/vkl ,...,tknk/vknkl is the mgu of the expressions(vll,...,vlnl,...,vkl,...,vknk) and The SWITCHprocesshas a registerwhich is (tll ,...,tlnl,. ..,tkl,...,tknk) set to the sourcebinding,O. The answersit [l,p.1871). receivesfrom the W are substitutionsB, registerswhich containdata, and each processmay send messagesto other processes. Since, in this system,the messagesare all answersto scntequestion, we will call a processP2 a boss of a process PI if PI sends messagesto P2. Sane processes, calleddata collectors,are distinguish& by two features: 1) they can have storethan one boss: 2) they store all massagesthey have sent to their bosses. The storedmessagesare used for two purposes: a) it allows the data collectorto avoid sendingthe sama messagetwice; b) it allows the data collectorto be given a new boss, which can inmediately be broughtup to date by being given all the messagesalreadysent to the other bosses. Four types of processesare importantto the discussionof recursiverules. They are called INFER,CHAIN, SWITCH and FILTER. INFER and CHAIN are data collectors,SWITCH and FILTER are not. 152 In normalBases for Data Bases, Gallaire,H., Minker,J. and Nicolas,J. (edS.),Plenum,New York, i980. 4. Fikes, R.E., and HendrixG.G., The deduction component. In UnderstandingSpokenLanguage, Walker,D.E., ed., ElsevierNorth-Holland,1978, RecursiveRules Cause Cycles Just as a CHAIN can make use of alreadyexisting INFERS,an INFER can r&e use of already existingms, filteredif necessary. A recursiverule is a chain of the form with C unifiable Al&...&Ak~Bl,Bl&...&Bn~...X, with at least one of the antecedents,Al say. When an INFER operateson Al, it will find that C matchesAl, and it may find that it can use the CHAIN alreadycreatedfor C. Since this CHAIN is in fact the INFER'sboss, this will result in a cycle of processes. The cyclewill producemore answersas new data is passed around the cycle, but no infiniteloop will be created since no data collectorsends any answernore thanonce. (If an infiniteset of Skolem constantsis generated,the processwill still terminateif the root goal had a finitenumberof desired answersspecified[ll, p.1941). 355-374. 5. Kaplan,R.M., A multi-processing approachto natural languageunderstanding.Proc. National ComputerConference,AFIPS Press,Montvale,NJ, 1973, 435-440. 6. Klahr, P., Planningtechniquesfor rule selection in deductivequestion-answering.In PattemDirectedInferenceSystems,Waterman,D.A., and Hayes-Roth,R., eds., AcademicPress,New York, 1978, 223-239. 7. McKay, D.P. and Shapiro,S.C., MULTI--ALISP based muitiprocessing system. Proc. 1980 LISP Conference.StanfordUniversitv.1980. Performing 8. Naqvi,'S.A.,and Henschen,-L.J., inferencesover recursivedata bases. Proc. First AAAI Conference,StanfordUnivers1980. 9. Reiter,R., On structuringa first order data base, Proc. SecondNationalConference,Canadian Societyfor ComputationalStudiesof Intelligence, Figure 1 shows a structureof processeswhich we consideran active connectiongraph. It is built to derive instancesof ANCESToR(William,w) form the rulesvCrx,y) [PARENT(x,y)~ANCESToR(x,y) 1 and V‘(x,y,z) [ANcESToR(x,y) & PAREMT(y,z)x ANCESToR(x,z)].The notationfor the rule instancesis similarto thatpresented in [31. Note particularlythe SWITCH in the cycle which allowsnewly derived instancesof the goal ANCESToR(William,w) to be treatedas additional instancesof the antecedentANCESToR(William,y). A similarstructurewould be built regardlessof the order of assertingthe two rules, the order of anteaedentsinthetwoantecedentrule, the order of executionof the processes,whether the query had zero,either one, or both variables ground,or if the twoanteceiientruleused ANCESTORfor both antecedents. 1978, 50-99. 10. Shapiro,S.C., Representing andlocatingde- ductionrules in a semanticnetwork. Proc. Workshop on Pattern-Directed InferenceSystexns.In SIGARTNewsletter,63 (June1977), 14-18. 11. Shapiro,S.C., The SNePS sepnanticneixork processingsystem. In AssociativeNetworks:The Representation and Use of Kncwledsebv Camrsuters. F&dler, N.V., ed., AcadenicPress,NW York, 1979, 179-203. 12. Shortliffe,E-H., ComputerBased Medical Consultations:MYCIN, &rericanElsevier,New York, 1976. 13. Sickel,S., A search techniquefor clause interconnectivity graphs,IEEE Transactionson ComputersVol. C-25, 8 (August1976)I 823-835. In the SNePS inferencesystem,recursive rules cause cyclesto be built in a graph structure of processes. The key featuresof the inference systemwhich allow recursiverules to be handled are: I) the processesthat producederivations (INFERand CHAIN) are data collectors:2) data collectorsnever send the same answernore than once: 3) a data collectormay report to mre than one boss: 4) a new boss may be assignedto a data collectorat any time -- it will miately be given all previouslycollecteddata; 5) variable contextsare localized,SWITCH changingcontexts dynamicallyas data flows around the graph: 6) FILTERsallow more generalproducersto be used by less generalconsumers. 1. chang, C.-L., and Lee, R.C.-T.,SymbolicLogic and MechanicalTheoremProving,Acadtic Press, New York, 1973. 2. chang, C.-L., and Slagle,J.R., Using rewriting rules for connectiongraphs to prove theorems, ArtificialIntelligence12, 2 (August1979), 159-180. 3. chang, C.-L., On evaluationof queries containing derivedrelationsin a relationaldata base. 153 ANCZSMR(William,z) y) PARETJT(William,y)AN(TlZSMR(William, Figure 1.