Opportunities: A unifying framework for planning ...

From: AIPS 1994 Proceedings. Copyright © 1994, AAAI (www.aaai.org). All rights reserved.

Opportunities: A unifying framework for planning and execution

Louise Pryor
School of Computer Science
The University of Birmingham
Edgbaston
Birmingham B15 2TT
UK
l.m.pryor@cs.bham.ac.uk

Gregg Collins
The Institute for the Learning Sciences
Northwestern University
1890 Maple Avenue
Evanston IL 60201
USA
collins@ils.nwu.edu

Abstract

A successful agent in the real world must both plan ahead and react to the unexpected. Ideally, both processes should be carried out in a common framework. In this paper we describe such a framework based on the analysis of opportunities. We argue that planning in advance can be viewed as a matter of anticipating opportunities, while responding to the unexpected should be seen as reacting to opportunities when they arise. We present an opportunistic planning agent, PARETO, that operates in a simulated robot delivery world, and implements our approach.

1. Introduction

The real world is regular enough to make advance planning worthwhile, yet unpredictable enough to make planning to the last detail impossible. An autonomous agent must therefore strike a balance between planning ahead and reacting to changes. For example, a robot on a strange planet must determine which areas to explore, where to take soil samples, what routes to take, and so on. The right choices will depend on details concerning the terrain encountered, the atmospheric conditions, and the results of tests performed on earlier samples--factors that cannot, in general, be predicted in sufficient detail to allow firm decisions to be made in advance. On the other hand, undirected wandering makes little sense: enough will be known in advance to make some decisions that will make a productive mission more likely. The distinction is a matter of available information: some of the information that would be required to construct an optimal plan will not be available before plan execution begins.

Since relevant information may become available at any point, even while a plan is being executed, an agent must be prepared to alter its plans to reflect new information. Ideally, this replanning process will resemble the process of planning in advance as much as possible, so that the same knowledge and processes can be applied. In other words, it is desirable that planning and replanning can be carried out in a common framework. In this paper we propose such a framework, based on the notion of responding to opportunities. We discuss the implementation of this approach in PARETO [1], a system that notices and responds to opportunities as it pursues its goals (Pryor 1994).

[1] Planning and Acting in Realistic Environments by Thinking about Opportunities. Vilfredo Pareto (1848-1923) was an Italian economist, sociologist, and philosopher best known for the notion of Pareto optimality and the Pareto distribution, neither of which is used in this work.

1.1 Planning in an unpredictable world

Traditional AI planning systems (Fikes and Nilsson 1971; Sacerdoti 1977; Chapman 1987), known as classical planners (Wilkins 1988), have effectively decoupled plan construction and plan execution, operating on the assumption that all needed information will be freely available in advance. More specifically, these systems rely on three assumptions about the worlds in which they operate:
• Simplicity: it is possible to know everything about the world that might affect the agent's actions.
• Stasis: there will be no changes in the world except those caused by the agent's actions.
• Certainty: the agent's actions have deterministic results.

Were these assumptions valid, the world would hold no surprises for the agent; hence plans could be specified in exact detail with no fear that an unexpected outcome would force subsequent rethinking. Of course, there are few natural worlds in which the classical assumptions hold up. To take a simple example, consider an everyday human activity like preparing breakfast. For most people, this occurs day after day in the same environment, at the same time, using the same ingredients, and so on. It is an event regular and predictable enough to be described by a script (Schank and Abelson 1977). However, the apparent regularity of the breakfast world is only an artifact of our loftily abstract point of view; down at the level of detail at which we must treat the domain in order to execute a plan successfully, we find a world of wild and capricious unpredictability. The plan for making breakfast may involve any number of actions such as grasping, lifting, and pouring containers of milk, cereal, coffee, and so on; dishes and utensils must be manipulated; appliances must be operated; obstacles on the floor must be circumnavigated. To literally make a complete plan in advance, it would be necessary to know the exact position, orientation, and weight of every relevant object in the kitchen. It would be necessary to know, for example, the precise angle at which the box of cornflakes should be tilted to achieve optimal flow into the bowl, the proper position and altitude of the milk container during pouring to minimize splashing caused by the flakes, and so on.

Clearly, this is completely unrealistic. Even assuming the theoretical possibility of gathering such information in advance, the cost of acquiring, storing, and processing such a quantity of information about the world in general is prohibitive. Furthermore, given inaccuracies in sensors and the interference of other agents, much relevant information is likely to be unavailable in principle.
In other words, the breakfast world, like most natural environments, displays the following characteristics:
• Complexity: it is impossible to know everything.
• Dynamism: changes occur as a result of the actions of other agents or of natural phenomena.
• Uncertainty: the agent cannot be sure what the results of its actions will be.

In an environment that displays these characteristics, the classical planning paradigm breaks down. There are three key reasons for this.

First, inaccurate information may cause wrong decisions to be made during the construction of a plan. For example, believing that there are clean bowls in the cupboard, you might construct a plan that entails opening the cupboard door. If you are wrong about the bowls, this step is unnecessary. The possibility of a faulty plan implies that agents must monitor their plans during execution and be prepared to recover from failures.

Second, information needed to make some decisions may not be available at the time a plan is chosen. For example, you cannot accurately predict the movements of your roommate, which means that there is potential for interference between your plan and your roommate's actions. In order to make an optimal decision on what path to take across the kitchen, you must know where your roommate will be, or at least know that she is out of the way. In general, decisions that fall into this category should be deferred until enough information is available; since the information will in many cases not be available until after execution of the plan has begun, the agent must be prepared to interleave plan construction and plan execution.

Third, decisions may arise that have not been foreseen in the planning process. For example, the telephone might ring while you are pouring a glass of orange juice, forcing you to decide whether to continue pouring or stop and answer the phone.
The agent must recognize circumstances under which the need to make an unforeseen decision arises, and must, if necessary, be able to acquire the information needed to make those decisions. The agent must be able to change its plans during their execution to reflect unforeseen situations.

In sum, the inevitability of the unexpected means that plans made in advance will require modification during execution. Expending effort on the construction of elaborate and detailed plans is therefore often unproductive. A more effective approach in the face of unpredictability is to expend some effort on choosing simple plans, and to expend more effort on adapting those plans as unforeseen circumstances are encountered. This is the approach followed in PARETO.

1.2 Plan execution

The emphasis in the design of PARETO is on recognizing the need for and making unforeseen decisions during plan execution. As an example of the kind of reasoning PARETO is meant to perform, suppose you happen to see a sharp knife as you are looking for a pair of scissors with which to cut string. In such a situation, unless there were some clear reason not to do so, you might well use the knife to achieve your goal and abandon the plan to find scissors. PARETO recognizes and takes advantage of such opportunities.

Instead of reasoning in detail about the interaction between plans for its various goals, as a classical planner would, PARETO constructs separate plans for each of its goals and does not expend effort attempting to anticipate potential interactions in advance. In addition, instead of expending a great deal of effort to gather all available information at planning time, PARETO depends on possibly faulty assumptions about the situation in which it will execute its plans. Obviously, the failure of these assumptions can cause PARETO's plans to fail. PARETO is designed to react quickly and flexibly to unexpected circumstances, rather than to minimize the possibility of an uncertainty arising.

2. Planning and opportunities

The cost of achieving a goal and the benefits of doing so can vary wildly over the goal's lifetime. For example, consider a goal to buy gas for your car. During a rush to make an important meeting, the cost of stopping for gas would be very high, since it is likely to make you late for the meeting. The benefit--essentially the reduced probability of running out of gas--is also high, but is likely outweighed by the cost. During the meeting, the cost of pursuing the goal is still higher, since walking out of the meeting and driving off to buy gas would be a most undesirable course of action; the benefit does not change. On the drive home after work, the cost of buying gas is relatively low--assuming you have no urgent plans--while the benefit increases as the risk of running out of gas grows. At home, the cost of buying gas is again high, as it is now inconvenient to make a special journey, while the benefit stays the same. Cost and benefit continue to vary over time, depending on the exact situation in which you find yourself, until you decide to achieve the goal.

An effective planner must thus not only find workable plans to achieve its goal, but must also, insofar as possible, maximize the benefit and minimize the cost of doing so. In short, it must wait until there is a good opportunity for achieving a given goal. Planning can be seen as the process of predicting when opportunities will arise and what form they will take, and deciding in advance to take advantage of them. Although it is impractical to perform detailed predictions for all possible circumstances, it is often possible to perform simplified predictions. For example, it is routine to plan when to refuel your car based on your knowledge of the amount remaining in the tank, the locations of gas stations, and your travel plans over the next day or so.
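The gas-buying example can be made concrete with a small sketch. It is only an illustration of the idea that a good opportunity is a situation in which benefit outweighs cost by the widest margin; the `Situation` type, the `best_opportunity` helper, and all of the numeric scores are assumptions made for this example and do not come from PARETO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """One moment in a goal's lifetime, with hypothetical cost/benefit scores."""
    name: str
    cost: float
    benefit: float

    @property
    def net_benefit(self) -> float:
        return self.benefit - self.cost


# Illustrative scores for the goal "buy gas"; only their ordering matters.
BUY_GAS = [
    Situation("rushing to meeting", cost=9.0, benefit=6.0),
    Situation("during meeting", cost=10.0, benefit=6.0),
    Situation("driving home", cost=2.0, benefit=7.0),
    Situation("at home", cost=8.0, benefit=7.0),
]


def best_opportunity(situations):
    """Return the situation with the highest positive net benefit, or None."""
    worthwhile = [s for s in situations if s.net_benefit > 0]
    return max(worthwhile, key=lambda s: s.net_benefit, default=None)
```

With these assumed scores, `best_opportunity(BUY_GAS)` picks the drive home, the one situation in which the benefit exceeds the cost. A planner that predicts such scores in advance is anticipating the opportunity; one that recomputes them when circumstances change is reacting to it.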
2.1 Adapting the current plan

A plan that is designed to achieve a particular goal should be revised when a predicted opportunity does not in fact arise, or when a better opportunity comes along that was not considered during the planning process. Returning to an earlier example, if you were to receive a call on your car phone informing you that your meeting had been postponed, you would be presented with an unexpected opportunity to buy gas. In general an agent should notice such unexpected opportunities and consider adopting new plans to take advantage of them. As far as possible, the agent should respond to the unexpected opportunity in the same way as it would respond to the opportunity had it been predicted in advance.

The paradigm of noticing and responding to opportunities thus provides a unifying framework within which to approach both the issue of planning and the issue of plan revision during execution. In this approach, an agent's response to an opportunity should be independent of whether it has been predicted or not. Planning in advance is based on the prediction of future opportunities, while plan revision is based on the recognition of current opportunities.

2.2 Switching between plans

PARETO's approach means that it is pursuing a number of independent plans at any given time. The management of these diverse plans is thus a critical issue in PARETO's design. To manage them successfully, PARETO must distinguish those plans that are being actively pursued from those that are not. In fact, it can generally be assumed that only a handful of the agent's current plans will be pursued actively at any given time. For example, consider a plan to follow a recipe that says "soak the beans overnight." Clearly, pursuing this plan actively once the beans are put in to soak--for example, by sitting and watching them until morning--would be tremendously inefficient.
Instead, the agent should suspend the execution of this plan, and turn its attention to the pursuit of other goals. At any one time, most of an agent's goals are suspended. In order to manipulate plans in this way, PARETO must incorporate general mechanisms for deciding when it should change from following the plan for one goal to following the plan for another.

In general, a good time to attend to a particular goal is when there is an opportunity for that goal, whether predicted or unpredicted. For example, you should return to the soaking beans when you are in a position to proceed with the next step in the recipe. The existence of such an opportunity--to perform the next step in the preparation of the beans--was predicted in the construction of the overall plan. However, an unpredicted opportunity may similarly trigger a change of attention from one goal to another. For example, suppose you have tried and failed to get hold of a friend on the telephone to make some arrangements with her; if you see her in the supermarket, you may well temporarily suspend your goal of doing your weekly shopping to pursue your goal of making the arrangements.

Thus, management of plans is handled naturally within PARETO's paradigm of responding to opportunities. Decisions about changing plans, or reactivating suspended plans, are based on the recognition of current opportunities, while the plans themselves are based on the prediction of future opportunities.

3. PARETO

In this section we describe PARETO, a working system that illustrates how the framework described in this paper is an effective means of combining the execution of plans with appropriate responses to unexpected situations.

[Figure 1: PARETO's world]

3.1 What PARETO does

PARETO operates a simulated robot delivery truck. The simulator was built using TRUCKWORLD (Firby and Hanks 1987; Hanks et al. 1993). In a TRUCKWORLD world, a robot delivery truck travels between locations on a network of roads, encountering and manipulating various objects as it goes.
In PARETO's world (described in detail in Pryor 1994), there are several building sites whose workers use the truck to run delivery errands such as "fetch a hammer," or "fetch something to carry my tools in."

PARETO's world consists of a number of locations linked by roads along which the truck can travel (see figure 1). Three of the locations are building sites, one is the truck's base where it can usually find fuel, and the other locations contain objects that the truck uses to fulfill its delivery goals. Most of these objects are used regularly by the construction workers whose errands the truck runs: hammers, saws, ladders, paint, and so on. There are over 30 different types of object in PARETO's world, of which 20 are used for deliveries. At any moment, there are typically well over 100 different objects at the various locations around the world.

PARETO's world is unpredictable. The truck has limited perception, and can sense only those objects that are at its current location, meaning that from PARETO's perspective the world is complex. It is also dynamic, since objects may spontaneously change location, appear, or disappear. Finally, the results of the truck's actions are uncertain: it may drop objects that it is trying to grasp, and neither the time taken to travel between locations nor the amount of fuel used can be predicted.

PARETO receives delivery orders at unpredictable intervals during its operation, with a typical run involving between seven and twelve separate deliveries. Plans that allow for every possible combination of goals would be far too complex; instead it uses a separate plan for each of its goals. These plans are sketchy--they do not specify in detail every action that should be performed. Instead, they specify the overall strategy that should be used in terms of a few simple steps, and PARETO decides how each step should be performed as it executes the plans.
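The three sources of unpredictability described above (limited perception, spontaneous change, and uncertain action outcomes) can be illustrated with a toy world. This is a minimal sketch under stated assumptions: the class `TruckWorldSketch` and its methods are hypothetical names invented here, and nothing below reflects the actual TRUCKWORLD simulator or its interface.

```python
import random


class TruckWorldSketch:
    """Toy stand-in for a delivery world; not the TRUCKWORLD simulator's API."""

    def __init__(self, roads, objects, start, drop_probability=0.2, seed=0):
        self.roads = roads            # location -> set of neighbouring locations
        self.objects = objects        # location -> list of object names
        self.location = start
        self.cargo = []
        self.drop_probability = drop_probability
        self.rng = random.Random(seed)

    def sense(self):
        """Complexity: only objects at the current location are visible."""
        return list(self.objects.get(self.location, []))

    def travel(self, destination):
        """Move along a road; returns False if no road links the locations."""
        if destination not in self.roads.get(self.location, set()):
            return False
        self.location = destination
        return True

    def grasp(self, name):
        """Uncertainty: a grasp may fail, leaving the object where it was."""
        here = self.objects.get(self.location, [])
        if name not in here or self.rng.random() < self.drop_probability:
            return False
        here.remove(name)
        self.cargo.append(name)
        return True

    def disturb(self, name, source, target):
        """Dynamism: something other than the truck moves an object."""
        if name in self.objects.get(source, []):
            self.objects[source].remove(name)
            self.objects.setdefault(target, []).append(name)
```

An agent acting in such a world cannot tell from `sense()` whether an object it saw earlier is still where it was, and cannot assume that `grasp()` succeeded without checking, which is why a plan written in full detail ahead of time is of limited use.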
PARETO's sketchy plans allow for some of the many contingencies that may arise (for example, they specify what to do when the object being grasped is dropped) but many circumstances cannot be foreseen in the plans. There are two types of situation in which PARETO must respond on the fly to circumstances that it encounters. First, circumstances may dictate that PARETO should switch its attention from one goal to another. For example, PARETO will not continue trying to make a delivery when the truck is running out of fuel, but will instead concentrate on trying to refill its fuel tank [2]. Second, an unforeseen opportunity may arise. This may entail either switching plans, or replacing a current plan with an alternative. For example, suppose the truck has two delivery goals, one for something to carry tools and one for something to cut twine. PARETO may decide to pursue the carry-tools goal first, and set off to the warehouse in which it expects to find a box. If the truck finds a knife or a pair of scissors on the way, however, PARETO will temporarily switch its attention and pick up the cutting tool. If PARETO subsequently encounters a bag that would be suitable to carry tools, it would abandon its plan to find a box and instead pick up and deliver the bag.

PARETO has an efficient mechanism for spotting unpredicted opportunities such as these, and treats them in exactly the same way as it does opportunities that have been predicted in its plans. The next sections explain PARETO's basic operation and characterization of plans and opportunities.

[2] As well as delivery goals, PARETO has preservation goals (Schank and Abelson 1977) to ensure that the truck does not run out of fuel and to keep [...] its surroundings.

3.2 How PARETO works

PARETO's plan execution system is based on Firby's RAPs [3] (Firby 1987; 1989), and is described in (Pryor 1994). When PARETO acquires a new goal, it looks in its library of RAPs (sketchy plans) for one that will achieve the goal. The steps in a RAP specify subgoals that the system must achieve in order to execute the plan successfully. PARETO recursively expands sketchy plans by choosing a plan for each subgoal. Eventually, a subgoal will be achievable by performing a simple action, and no further expansion is required.

[3] Reactive Action Packages.

When PARETO receives a goal, its first action is to place a task aimed at achieving that goal on the task agenda. The task that is placed on the task agenda consists of the goal and the RAP that has been chosen to achieve it. PARETO's execution cycle is summarized in figure 2, and consists of the following steps:
• Choosing a task from the agenda. The task that will be the most productive in terms of PARETO's goals should be chosen. PARETO's task selector operates by looking for opportunities to further its various goals. The ability to recognize unexpected opportunities is a significant change from the task selector used by Firby's RAPs system.
• Processing of the chosen task to fill in the details of the incompletely specified plan that describes the task. A RAP specifies all the different plans that might be used to achieve its goal. For example, a goal to find fuel might be achieved by going to the location of a fuel drum that PARETO knows about, by going to the base which is a source location for fuel, or by wandering around the world until a fuel drum is found. Each such plan is a method of achieving the goal. PARETO chooses one of the methods of the task that is being processed, based on the state of the world at the time that the processing takes place.
• Adding new tasks to the task agenda for each of the new goals created during the previous processing step.
• Reprocessing the original task when each of its subtasks has achieved its goal. The successful execution of the subtasks is not enough to guarantee that the original task will itself have achieved its goal, since PARETO's world is dynamic and some time may pass between the execution of a task's subtasks and the repeat processing of the original task. For example, the truck might succeed in going to the location of a known fuel drum, but the drum might have meanwhile disappeared, or the fuel in it been used by another agent. If the task has succeeded, it is removed from the task agenda, else it is processed as described above.

[Figure 2: PARETO's execution cycle]

To illustrate the execution cycle in action, consider what happens when PARETO receives the two goals in the example above, to deliver something to carry tools and something to cut twine. As each goal is received, the deliver-object RAP is chosen to achieve it and the relevant task is placed on the task agenda. The deliver-object RAP has four steps: PARETO must find a suitable object, load it, travel to the correct location, and unload the object. After PARETO has processed the deliver-object task for the carry-tools goal, there are thus five tasks on the agenda for that goal: deliver-object, find-object, load-payload-object, truck-travel-to, and unload-at. Of these, the deliver-object task is waiting for the other four to complete, and the load-payload-object, truck-travel-to, and unload-at tasks are waiting for their predecessors to complete. If PARETO has no other goals, the next task to be chosen will be the find-object task, which will in turn be expanded and its subgoals placed on the task agenda. If all goes according to plan, all the subtasks will be processed in turn and removed from the task agenda until the unload-at task has been achieved. Finally, PARETO will again process the deliver-object task, find that it has succeeded, and remove it from the task agenda.
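The execution cycle described in this section can be sketched in a few dozen lines. This is an illustrative reading of the four steps, not Firby's RAP system or PARETO's own code: the names `Task`, `select`, `expand`, `run`, and `deliver_object`, the dictionary used as a world model, and the rule that the first applicable method is chosen are all assumptions of the sketch. The selector simply prefers tasks whose goal already holds and then tasks that are not waiting on anything.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(eq=False)  # compare tasks by identity, not by field values
class Task:
    goal: str
    succeeded: Callable[[dict], bool]                 # has the goal been achieved?
    action: Optional[Callable[[dict], None]] = None   # primitive step, if any
    methods: list = field(default_factory=list)       # world -> list of subtasks
    waiting_on: list = field(default_factory=list)    # tasks that must finish first
    subtasks: list = field(default_factory=list)      # current expansion, if any


def remove_with_plan(agenda, task):
    """Drop a task and whatever remains of its current expansion."""
    for sub in task.subtasks:
        remove_with_plan(agenda, sub)
    if task in agenda:
        agenda.remove(task)


def select(agenda, world):
    """Tasks whose goal already holds come first, then tasks that are ready."""
    for task in agenda:
        if task.succeeded(world):
            return task
    for task in agenda:
        if not any(t in agenda for t in task.waiting_on):
            return task
    return None


def expand(task, world):
    """Use the first method that applies now; chain its steps in order."""
    for method in task.methods:
        steps = method(world)
        if steps:
            for prev, step in zip(steps, steps[1:]):
                step.waiting_on.append(prev)
            task.subtasks = steps
            task.waiting_on = list(steps)
            return steps
    return []


def run(agenda, world, max_cycles=50):
    """Choose a task, process it, add new tasks, and reprocess parents."""
    log = []
    for _ in range(max_cycles):
        task = select(agenda, world)
        if task is None:
            break
        if task.succeeded(world):
            remove_with_plan(agenda, task)
            log.append("done " + task.goal)
        elif task.action is not None:
            task.action(world)
            log.append("act " + task.goal)
        else:
            agenda.extend(expand(task, world))
            log.append("expand " + task.goal)
    return log


def deliver_object(kind):
    """Hypothetical four-step sketchy plan: find, load, travel, unload."""
    def steps(world):
        return [
            Task("find " + kind, lambda w: w["seen"] == kind,
                 action=lambda w: w.update(seen=kind)),
            Task("load " + kind, lambda w: w["loaded"] == kind,
                 action=lambda w: w.update(loaded=kind)),
            Task("travel to site", lambda w: w["at"] == "site",
                 action=lambda w: w.update(at="site")),
            Task("unload " + kind, lambda w: w["delivered"] == kind,
                 action=lambda w: w.update(delivered=kind, loaded=None)),
        ]
    return Task("deliver " + kind, lambda w: w["delivered"] == kind, methods=[steps])
```

Running `run([deliver_object("box")], {"seen": None, "loaded": None, "at": "base", "delivered": None})` expands the delivery task and works through its steps in order. Starting instead from a world in which `"seen"` is already `"box"`, the find step is removed without being acted on, which is the kind of easily achieved task that the selector is meant to notice.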
3.3 Opportunities in PARETO

PARETO chooses which task to process next by considering those tasks that are associated with opportunities. When PARETO spots an opportunity, it does not consider whether or not the opportune task is the next step in the plan for the current goal. Thus, it may in effect ignore the existing plan for carrying out a task when an opportunity arises; if it succeeds in taking advantage of the opportunity, the task and its previous plan are simply removed from the agenda. Furthermore, PARETO is not constrained by any notion of the "current" [...]; thus, it can effectively switch plans whenever an opportunity arises.

In our example, PARETO changes its plan for its current goal by deciding to pick up the bag instead of continuing to look for a box. It also switches its attention to another goal by picking up the scissors. However, the goal switch is only temporary, as picking up the scissors does not achieve the task of finding an object that can carry tools, which task remains on the agenda. After picking up the scissors PARETO returns to the carry-tools goal and goes off to deliver the bag. Figure 3 shows PARETO's output as it recognizes and takes advantage of these opportunities.

[Figure 3: PARETO takes advantage of two opportunities. Program trace: PARETO receives orders for something to CARRY-TOOLS and something to CUT-TWINE, sets off to find a box, notes potential opportunities from ITEM-20 (BAG) and ITEM-13 (SCISSORS), takes the unexpected opportunity for FIND-OBJECT CARRY-TOOLS and loads the bag, changes goals from 6 to 7 and loads the scissors, then changes goals from 7 to 6 and goes off to deliver the bag.]

PARETO thus characterizes opportunities as tasks on its task agenda that are easy to achieve. It has an efficient mechanism for recognizing opportunities that uses a filtering process based on the functional characteristics of objects in the world (Pryor 1994). There are two ways in which one of PARETO's tasks may be easily achievable, corresponding to the two types of opportunities we discussed earlier: either it has already succeeded, or it is ready for processing (not waiting for any others to be completed). Tasks that have already succeeded may indicate the presence of an unexpected opportunity, while a task that is ready for processing may indicate the presence of a predicted opportunity--one that has been foreseen in a plan. In our example, the finding of a bag is unexpected: PARETO had planned to find a box.

In short, expected opportunities are associated with tasks that represent the next step in the plan for one of PARETO's goals, and unexpected opportunities are associated with tasks that have already succeeded.

4. Related work

An agent should revise its plans when it encounters an unexpected opportunity, but cannot afford to analyze every situation exhaustively. The centrality of opportunities to the execution of plans in an unpredictable world has received little attention from other researchers in the field.

Most current research on the problem of recognizing the need to make unforeseen decisions fails to address the issue of opportunism explicitly (Bresina and Drummond 1990; Ferguson 1992; Lyons and Hendriks 1992; McDermott 1992). In general, this work relies heavily on projecting the agent's current plans to determine when replanning would be desirable. As projection may involve arbitrarily complex reasoning, this approach fails to address the problem of recognizing the need to make decisions quickly enough that the agent can respond appropriately in a dynamic world.

The earliest work on opportunity recognition, by Hayes-Roth and Hayes-Roth (1979), looked at opportunism in plan construction, but did not consider plan execution. Hammond and his colleagues (Hammond et al. 1993) present a method of opportunity recognition based on recognizing the features involved in a goal's achievement. This approach relies on having specified the plan for the goal in enough detail that the environmental elements involved are already known. It does not allow an agent to recognize opportunities for goals that it has not yet decided how to achieve, and does not allow the recognition of opportunities that require a different method of achievement from that in the current plan. For example, Hammond's approach would not allow an agent to recognize the opportunity discussed above, in which the presence of a bag allows an agent to abandon its plan to find a box. This limited view of opportunity recognition would prevent the use of opportunities as a framework for combining planning and responsiveness.

The impracticality of unlimited replanning has long been recognized as a serious problem in the design of intelligent agents. There are two aspects to the problem: the necessity of limiting the amount of reasoning that is performed, and the necessity of determining when this limited reasoning should occur. Traditional AI approaches have concentrated on limiting the amount of reasoning that is performed, by using either quantitative approximations (Hanks 1990) or qualitative techniques (Wellman 1988; 1990). Anytime algorithms (Dean and Boddy 1988; Boddy and Dean 1989) address the issues of designing reasoning algorithms that will produce an answer in limited time. None of this work addresses the issue of when this reasoning should be performed.

In Hayes-Roth's GUARDIAN system (Hayes-Roth 1990) global control plans are used to direct the agent's reasoning towards important goals. These control plans are changed by global control decisions, which appear to be triggered by the receipt of sensory information. However, Hayes-Roth gives no details of how these decisions are triggered or the process by which they are made. Presumably GUARDIAN's mechanism for making these decisions involves minimal reasoning, as they appear to occur rapidly when necessary, but there is no discussion of this aspect. GUARDIAN thus limits the amount of reasoning that need be done by focusing it on a subset of the agent's goals, but there is no clear answer to the question of when such reasoning should be performed.

Maes (1991) describes a network-based architecture with parameters that adjust the speed with which the agent reacts to changes in its environment. If the environment changes slowly, the agent can perform more reasoning before responding; in very unpredictable environments, the agent must react with little or no reasoning. The parameters change only in response to the unpredictability of the environment, and are not affected by the particular situation in which the agent finds itself. The balance between acting and reasoning changes on a global basis, and no attempt is made to direct the reasoning towards specific goals.

5. Conclusion

To be successful, an agent in the real world must both plan ahead and react to the unexpected. Ideally, both processes should be carried out in a common framework. In this paper we have described such a framework based on the analysis of opportunities, and a computer program, PARETO, that implements our approach.

Acknowledgements

Most of this work was carried out while the first author was at the Institute for the Learning Sciences. This work was supported in part by the AFOSR under grant number AFOSR-91-0341-DEF, and by DARPA, monitored by the AFOSR under contract F49620-88-C-0058. The Institute for the Learning Sciences was established in 1989 with the support of Andersen Consulting, part of The Arthur Andersen Worldwide Organization. The Institute receives additional support from Ameritech and North West Water, Institute Partners, and from IBM.

References

Boddy, M. and T. Dean. 1989. "Solving time-dependent planning problems." In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, Detroit, MI, AAAI.
Bresina, J. and M. Drummond. 1990. "Integrating Planning and Reaction: A preliminary report." In Working notes of the Spring Symposium on Planning in Uncertain, Unpredictable, or Changing Environments, Stanford, CA, AAAI.
Chapman, D. 1987. Planning for Conjunctive Goals. Artificial Intelligence 32: 333-377.
Dean, T. and M. Boddy. 1988. "An analysis of time-dependent planning." In Proceedings of the Seventh National Conference on Artificial Intelligence, St Paul, MN, AAAI.
Ferguson, I. A. 1992. Touring Machines: An architecture for dynamic, rational, mobile agents. Computer Laboratory, University of Cambridge. Technical Report No. 273.
Fikes, R. E. and N. J. Nilsson. 1971. STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence 2: 189-208.
Firby, R. J. 1987. "An investigation into reactive planning in complex domains." In Proceedings of the Sixth National Conference on Artificial Intelligence, Seattle, WA, AAAI.
Firby, R. J. 1989. Adaptive execution in complex dynamic worlds. Department of Computer Science, Yale University. Technical Report YALEU/CSD/RR #672.
Firby, R. J. and S. Hanks. 1987. The simulator manual. Department of Computer Science, Yale University. Technical Report YALEU/CSD/RR #563.
Hammond, K., T. Converse, M. Marks, and C. Seifert. 1993. Opportunism and learning. Machine Learning 10: 279-309.
Hanks, S., M. E. Pollack, and P. R. Cohen. 1993. "Benchmarks, testbeds, controlled experimentation, and the design of agent architectures." AI Magazine, Winter 1993.
Hanks, S. J. 1990. Projecting plans for uncertain worlds. Department of Computer Science, Yale University. Technical Report YALEU/CSD/RR #756.
Hayes-Roth, B. 1990. "Dynamic control planning in intelligent agents." In Working notes of the Spring Symposium on Planning in Uncertain, Unpredictable or Changing Environments, Stanford University.
Hayes-Roth, B. and F. Hayes-Roth. 1979. A cognitive model of planning. Cognitive Science 3 (4): 275-310.
Lyons, D. M. and A. J. Hendriks. 1992. "A practical approach to integrating reaction and deliberation." In Proceedings of the First International Conference on Artificial Intelligence Planning Systems, College Park, Maryland, Morgan Kaufmann.
Maes, P. 1991. "Adaptive action selection." In Proceedings of the Thirteenth Annual Conference of the Cognitive Science Society, Chicago, IL, Lawrence Erlbaum Associates.
McDermott, D. 1992. Transformational planning of reactive behavior. Yale University, Department of Computer Science. YALEU/CSD/RR #941.
Pryor, L. 1994. "Opportunities and Planning in an Unpredictable World." PhD dissertation, in preparation, The Institute for the Learning Sciences, Northwestern University.
Sacerdoti, E. 1977. A structure for plans and behavior. New York: American Elsevier.
Schank, R. C. and R. P. Abelson. 1977. Scripts, plans, goals and understanding. Hillsdale, NJ: Lawrence Erlbaum Associates.
Wellman, M. P. 1988. "Qualitative Probabilistic Networks for Planning under Uncertainty." In Uncertainty in Artificial Intelligence 2, ed. J. F. Lemmer and L. N. Kanal. Amsterdam: Elsevier.
Wellman, M. P. 1990. Formulation of Tradeoffs in Planning Under Uncertainty. London: Pitman.
Wilkins, D. E. 1988. Practical Planning: Extending the Classical AI Planning Paradigm. San Mateo, CA: Morgan Kaufmann.
