Using Behavior Activation Value Histories for Updating...

From: AAAI Technical Report FS-01-01. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved.

Using Behavior Activation Value Histories for Updating Symbolic Facts*

Frank Schönherr and Mihaela Cistelecan† and Joachim Hertzberg and Thomas Christaller
Fraunhofer-Institute for Autonomous Intelligent Systems (AIS)
Schloss Birlinghoven, 53754 Sankt Augustin, Germany
Frank.Schoenherr@ais.fraunhofer.de

Abstract

The paper presents a new technique for extracting symbolic ground facts out of the sensor data stream in autonomous robots for use under hybrid control architectures. The sensor data are used in the form of time series curves of behavior activation values, yielding an image of the environment as perceived through the eyes of useful behaviors. Similar progressions in individual behavior activation curves are aggregated to well-defined patterns, like edges and levels, called qualitative activations. Sets of qualitative activations for different behaviors occurring in the same interval of time are combined into activation gestalts. Sequences of activation gestalts are used for defining chronicles, the recognition of which establishes evidence for the validity of ground facts. The approach is described in general, and examples for a particular behavior-based robot control framework in simulation are presented and discussed.

1 Background and Overview

There are several good reasons to include a behavior-based component in the control of an autonomous mobile robot. There are equally good reasons to include in addition a deliberative component. Having components of both types results in a hybrid control architecture, intertwining the behavior-based and the deliberative processes that go on in parallel. Together, they allow the robot to react to the dynamics and unpredictability of its environment without forgetting the high-level goals to accomplish. Arkin (Arkin 1998, Ch.
6) presents a detailed argument and surveys hybrid control architectures; the many working autonomous robots that use hybrid architectures include the Remote Agent Project (Muscettola et al. 1998; Remote Agent Project 2000) as their highest-flying example.

While hybrid, layered control architectures for autonomous robots, such as Saphira (Konolige et al. 1997) or 3T (Bonasso et al. 1997), are state of the art, some problems remain that make it still a complicated task to build a control system for a concrete robot to work on a concrete task. To quote Arkin (Arkin 1998, p. 207), "the nature of the boundary between deliberation and reactive execution is not well understood at this time, leading to somewhat arbitrary architectural decisions."

One of the problems is to keep up to date the symbolic world representation for the deliberative component, which is an instance of the symbol grounding problem (Harnad 1990). There are solutions to important parts of that problem, such as methods and algorithms for sensor-based localization to reason about future navigation actions: (Fox, Burgard, & Thrun 1999) presents one of the many examples for on-line robot pose determination based on laser scans. If the purpose of deliberation is supposed to be more general than navigation, such as action planning or reasoning about action, then the need arises to sense more generally the recent relevant part of the world state and update its symbolic representation based on these sensor data. We call this representation the current situation.

* This work is partially supported by the German Fed. Ministry for Education and Research (BMBF) in the joint project AgenTec (521-75091-VFG0005B). This paper is an updated version of (Schönherr et al. 2001).
† On leave from Tech. University of Cluj-Napoca, Romania. The work was supported by a Roman-Herzog scholarship of the Alexander von Humboldt foundation.
The naive version of the update problem, "Tell me all that is currently true about the world!", need not be solved, luckily, if the goal is to build a concrete robot to work on a concrete task. Only those facts need updating that, according to the symbolic domain model used for deliberation, are relevant for the robot to work on its task. Moreover, every robot has its sensor horizon, i.e., a border in space and time limiting its sensor range. The term sensor is understood in a broad sense: It includes technical sensors like laser scanners, ultrasound transducers, or cameras; but if, for example, the arena of a delivery robot includes access to the control of an elevator, then a status request by wireless Ethernet to determine the current location of the elevator cabin is a sensor action, and the elevator status is permanently within the sensor horizon. We assume: The world state information within the sensor horizon is sufficient to achieve satisfying robot performance.

This said, the task of keeping the facts of a situation up to date remains: to continually compute, from recent sensor data and the previous situation, a new version of the situation as far as it lies within the sensor horizon. The computation is based on plain, current sensor values as well as histories of situations and sensor readings or aggregates thereof. Practically, we cannot expect to get accurate situation updates instantly; all we can do is make the situation update as recent, comprehensive, and accurate as possible.

This paper contributes to this task an approach of using histories of activation values in behavior-based robot control systems (BBSs) (Matarić 1999) as the main source of information for situation update. This approach is useful for three reasons:

• Activation values are calculated anyway in most BBSs to allow for arbitration or merging between behaviors; they can be used at no additional computation cost, provided that they are sufficiently fine grained.
• An activation value is grounded in sensor readings and, by definition, evaluates them in a way tailored to its respective behavior; using activation values like aggregated sensor readings automatically yields an action-centered way of "looking through the sensors".

• To have a practical hybrid robot control, the symbolic world model must be in accord with the inventory of behaviors anyway; using activation value histories in situation update only makes even more explicit the need to co-design the BBS and deliberative control components.

Our approach is not in principle limited to a particular combination of deliberation component and BBS, as long as the BBS is expressed as a dynamical system and involves a looping computation of activation values for the behaviors. Some additional requirements apply that will be clarified in the paper.

All demo examples are formulated in a concrete BBS framework, namely, Dual Dynamics (DD, (Jaeger & Christaller 1997)), which has also inspired our abstract view of BBSs. Our view of building hybrid robot controllers involving a BBS as reactive component is shaped by our work in progress on the DD&P robot control architecture (Hertzberg et al. 1998; Hertzberg & Schönherr 2001), which blends DD controllers with action plans generated by a classical propositional planner (concretely, IPP (Koehler et al. 1997)) as the central deliberation component. Mind, however: The method for fact extraction from activation value histories of BBSs presented here is potentially applicable in BBS frameworks other than DD, as will be discussed at the end of the paper.

The rest of this paper is organized as follows. In Sec. 2, we present our approach of formulating BBSs as dynamical systems and give a detailed example in Sec. 3. To provide some background concerning complete robot control systems, we then (Sec. 4) sketch how we assume the deliberation component interferes with the BBS control component. Sec.
5 contains the technical contribution of the paper, describing in general as well as by way of example the technique of extracting facts from BBS activation value histories. Sec. 6 discusses the approach and relates it to the literature. Sec. 7 concludes.

2 BBSs as dynamical systems

We assume a BBS consists of two kinds of behaviors: low-level behaviors (LLBs), which are directly connected to the robot actuators, and higher-level behaviors (HLBs), which are connected to LLBs and/or HLBs. Each LLB implements two distinct functions: a target function and an activation function. The target function for the behavior b provides the reference $t_b$ for the robot actuators ("what to do") as follows:

    $t_b = f_b(s^T, s_f^T, \alpha_{LLB}^T)$    (1)

where $f_b$ is a nonlinear vector function with one component for each actuator variable, $s^T$ is the vector of all inputs from sensors, $s_f^T$ is the vector of the sensor-filters, and $\alpha_{LLB}^T$ is the vector of activation values of the LLBs. By sensor-filters (sometimes called virtual sensors) we mean Markovian and non-Markovian functions used for processing specific information from sensors.

The LLB activation function modulates the output of the target function. It provides a value between 0 and 1: a value of 1 means that the behavior fully influences the robot actuators, 0 that it does not influence them, and an intermediate value that it influences them to some degree. It describes when to activate a behavior. For LLB b the activation value is computed from the following differential equation:

    $\dot{\alpha}_{b,LLB} = g_b(\alpha_{b,LLB}, OnF_b, OffF_b, OCT_b)$    (2)

Eq. 2 gives the variation of the activation value $\alpha_{b,LLB}$ of this LLB. $g_b$ is a nonlinear function. $OCT_b$ allows the planner to influence the activation values, see Sec. 4. The scalar variables $OnF_b$ and $OffF_b$ are computed as follows:

    $OnF_b = U_b(s^T, s_f^T, \alpha_{LLB}^T, \alpha_{HLB}^T)$    (3)
    $OffF_b = V_b(s^T, s_f^T, \alpha_{LLB}^T, \alpha_{HLB}^T)$    (4)

where $U_b$ and $V_b$ are nonlinear functions. The variable $OnF_b$ sums up all conditions which recommend activating the respective behavior (on forces), and $OffF_b$ stands for conditions contradictory to the respective behavior (off forces).

The HLBs implement only the activation function. They are allowed to modulate only the LLBs or other HLBs on the same or lower level. In our case, the change of the activation values $\alpha_{b,HLB}$ of the HLBs is computed in the same manner as in Eq. 2.

The reason for updating behavior activation in the form of Eq. 2 is this. By referring to the previous activation value $\alpha_b$, it incorporates a memory of the previous evolution which can be overwritten in case of sudden and relevant changes in the environment, but normally prevents activation values from exhibiting high-frequency oscillations or spikes. At the same time, this form of the activation function provides some low-pass filtering capabilities, suppressing sensor noise or oscillating sensor readings.

Independent from that, it helps to develop stable robot controllers if behavior activations have a tendency of moving towards their boundary values, i.e., 0 or 1 in our formulation. To achieve that, we have implemented $g_b$ in Eq. 2 as a bistable ground form (like in (Bredenfeld et al. 1999) for a RoboCup application of a BBS of the same type) providing some hysteresis effect. Without further influence, this function softly pushes activation values lower/higher than some threshold $\beta$ (typically $\beta = 0.5$) to 0/1. The activation value changes as a result of adding the effects coming from the variables OnF, OffF, OCT and the bistable ground form. Exact formulations of the $g_b$ function are then just technical and unimportant for this paper.

The relative smoothness of activation values achieved by using differential equations and bistability will be helpful later in the technical contribution of this paper (Sec. 5), when it comes to deriving facts from the time series of activation values of the behaviors.

In our BBS formulation, behavior arbitration is achieved using the activation values. As shown in Eqs. 2-4, each behavior can interact with (i.e., encourage or inhibit) every other behavior on the same or lower level.
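The activation dynamics of Eq. 2 can be made concrete with a small numerical sketch. The text leaves the exact form of $g_b$ open, so everything specific below is an assumption made for illustration: the cubic bistable term, the gain `k`, the way the on and off forces enter, and the explicit Euler step `dt` are hypothetical choices, not the formulation used in Dual Dynamics.

```python
def bistable(alpha, beta=0.5, k=4.0):
    """Hypothetical bistable ground form: stable at 0 and 1, unstable at beta."""
    return k * alpha * (1.0 - alpha) * (alpha - beta)


def g(alpha, on_f, off_f, oct_term=0.0):
    """Assumed right-hand side of Eq. 2: ground form plus on/off forces plus OCT."""
    return bistable(alpha) + on_f * (1.0 - alpha) - off_f * alpha + oct_term


def simulate(alpha0, on_f, off_f, steps, oct_term=0.0, dt=0.05):
    """Integrate the activation with explicit Euler steps, clamped to [0, 1]."""
    alpha = alpha0
    history = [alpha]
    for _ in range(steps):
        alpha = alpha + dt * g(alpha, on_f, off_f, oct_term)
        alpha = min(1.0, max(0.0, alpha))
        history.append(alpha)
    return history


if __name__ == "__main__":
    # Without forces, values above/below the threshold drift softly to 1/0.
    print(round(simulate(0.6, 0.0, 0.0, 400)[-1], 3))
    print(round(simulate(0.4, 0.0, 0.0, 400)[-1], 3))
    # A persistent on force activates an inactive behavior.
    print(round(simulate(0.0, 1.0, 0.0, 400)[-1], 3))
```

Under these assumptions the activation changes smoothly and settles near 0 or 1, which is the property the fact extraction in Sec. 5 relies on.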
The model of interaction between behaviors is defined by the variables OnF and OffF.

The output vector or reference vector r of the BBS for the robot actuators is generated by summing all LLB outputs by a mixer, as follows:

    $r = \sum_b \alpha_{b,LLB} \, t_b$    (5)

Together with the form of the activation values, this way of blending the outputs of LLBs avoids discontinuities in the reference values $r_i$ for the single robot actuators, such as sudden changes from full speed forward to full speed backward.

3 An Example

To illustrate the notation of Sec. 2, we give a demonstration problem consisting of the task of following a wall with a robot and entering only those doors that are wide enough to allow the entrance. Figure 1 gives an overview of the main part of our arena. The depicted robot is equipped with a short-distance laser scanner, 4 infrared side-sensors, 4 front/back bumpers, and some dead-reckoning capabilities.

Figure 1: The demo arena and the final robot pose in Example 1. The robot has started its course at the lower right corner, below the round obstacle.

We used a simulator based on the DDDesigner prototype tool (Bredenfeld 2000; Bredenfeld et al. 1999). The tool allows checking isolated behaviors or the whole BBS in designated environmental situations (configurations).

Figure 2: The behavior inventory for our examples as in a screen shot from the DDDesigner tool. Big circles denote HLBs; squares denote LLBs; the hollow icons at the bottom denote robot actuator reference values. Arrows denote control flow between LLBs and actuators. The influence structure between behavior activations is not shown; the small circles are of no importance here.

The control system contains three HLBs and six LLBs, see Fig. 2. RobotDirection and RobotVelocity are the references for the two respective actuators. We have the following HLBs (cf. Fig. 2):

CloseToDoor is activated if there is evidence for a door;
InCorridor is active while the robot moves inside a corridor;
TimeOut was implemented in order to avoid getting stuck in a situation, see Sec. 5.

The LLBs are the following:

TurnToDoor is activated if the robot is level with a door;
GoThruDoor is activated after the behavior TurnToDoor was successful;
FollowRightWall is active when a right wall is followed;
FollowLeftWall is active when a left wall is followed;
AvoidColl is active when there is an obstacle in front of the robot;
Wander is active when no other LLB is active.

Most of the implemented behaviors are common for this kind of task. However, we decided to split the task of passing a door into a sequence of two LLBs. This helps to structure, maintain, and independently improve these two behaviors.

To give a simple example of BBS modeling, here are the "internals" of Wander (Wdr):

    $OnF_{Wdr} = k_1 (1 - \alpha_{CloseToDoor}) (1 - \alpha_{FollowRightWall}) (1 - \alpha_{FollowLeftWall}) (1 - \alpha_{AvoidColl})$    (6)
    $OffF_{Wdr} = k_2 \alpha_{AvoidColl} + k_3 \alpha_{CloseToDoor} + k_4 \alpha_{FollowRightWall} + k_5 \alpha_{FollowLeftWall}$    (7)
    $RobotDirection_{Wdr} = randomDirection()$    (8)
    $RobotVelocity_{Wdr} = mediumSpeed$    (9)
    $t_{Wdr} = (RobotDirection_{Wdr}, RobotVelocity_{Wdr})^T$    (10)
    $\dot{\alpha}_{Wdr} = g_{Wdr}(\alpha_{Wdr}, OnF_{Wdr}, OffF_{Wdr}, OCT_{Wdr})$    (11)

where $k_1, \ldots, k_5$ are empirically chosen constants. randomDirection() could be any function that generates a direction which results in a randomly chosen trajectory. Due to its product form, $OnF_{Wdr}$ can only be noticeably greater than zero if all included $\alpha_b$ are approximately zero. $OffF_{Wdr}$ consists of a sum of terms allowing every included behavior to deactivate Wander. Both terms are simple and can be calculated extremely fast, which is a guideline for most BBSs. The OCT term will be briefly explained in the next section.

Fig. 3 shows the activation value histories generated during a robot run, which will be referred to as Example 1. The robot starts at the lower right edge of Fig. 1 with Wander in control for a very short time, until a wall is perceived. This effect is explained by Eq. 6. While the robot starts to follow the wall, it detects the small round obstacle in front. In consequence, two LLBs are active simultaneously: AvoidColl and FollowLeftWall. Finally, the robot follows the wall, ignores the little gap, and enters the door. In the examples for this paper, FollowRightWall is always inactive and therefore not shown in the activation value curves.

Figure 3: The activation value histories for Example 1. The numbers are time units for reference. Top down: AvoidColl, FollowLeftWall, GoThruDoor, TurnToDoor, Wander, CloseToDoor, InCorridor, TimeOut.

The run exemplifies the purpose of a slow increase in behavior activation. FollowLeftWall should only have a strong influence on the overall robot behavior if a wall is perceived with both side-sensors for some time, so as to be more sure that the robot really has sensed a wall. The small dent in the activation of FollowLeftWall (around the time t = 4) is explained by perceiving free space with one side-sensor. If both side-sensors detected free space, this behavior would be deactivated. The turning to the door is described by rising/falling edges of some activation values. The second rise of AvoidColl (after t = 22) is caused by the door frame, which pops into sight as a close obstacle at the very end of the turning maneuver. Effectively, the collision avoidance guides the robot through the door. Finally, GoThruDoor gets slowly deactivated, allowing other behaviors to take control of the robot.

The HLBs CloseToDoor and InCorridor describe global states, thereby modulating the interaction, activation, and sequencing of the LLBs.

4 From plans to BBSs: Blending behaviors with operators

The technical contribution of this paper is an approach to enhancing the information flow from the BBS to the deliberation part in hybrid robot control systems.
Before coming to that in Section 5, we want to sketch the control flow in the opposite direction, from the planner to the BBS, to complete the picture of the entire robot control architecture that we have in mind. As it is not in the focus of the present paper, this description just states the basic principle, and we refer to work on the DD&P control architecture (Hertzberg et al. 1998; Hertzberg & Schönherr 2001), which elaborates on the approach.

The basic idea is this: An action planner continually maintains a current action plan, based on the current situation and the current set of user-provided or self-generated mission goals. Based on the current plan and the current situation, an execution component picks one of the operators in the plan as the one currently to be executed. Plan execution is done in a plans-as-advice (Pollack 1992) fashion: Executing an operator means stimulating more or less strongly the behaviors working in favor of the operator, and muting those working against its purpose. Which operator stimulates or mutes which behaviors is information that the domain modeler has to provide along with the domain model for the deliberative component and the set of behaviors for the BBS.

Technically, the influence of the current operator is "injected" into the BBS in terms of the Operator-Coupling Terms (OCT) in the activation functions, see Eq. 2. The influence of the current ground operator op gets inside every behavior b through the term $OCT_b$, as follows:

    $OCT_b = \sum_{op \in OP} s_b^{op} \, c_b^{op} \, (Z_b^{op} - \alpha_b)$    (12)

where $s_b^{op}, Z_b^{op} \in \{0, 1\}$ and $c_b^{op}$ is a constant. $s_b^{op} = 1$ iff op influences the behavior b. $c_b^{op}$ models the immediacy or delay of the operator influence on the behavior. $Z_b^{op}$ expresses whether the operator influence is of the stimulating or the muting sort: If $Z_b^{op} = 1$, then the respective behavior is stimulated, and it is muted if $Z_b^{op} = 0$. Z may be a boolean function, returning 0 or 1 conditionally.
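The operator-coupling term of Eq. 12 is simple enough to spell out in a few lines. The sketch below is only an illustration: the names `Coupling` and `oct_term` and the numeric values are hypothetical, and it assumes that the sum runs over the operators currently selected for execution.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Coupling:
    """Coupling of one operator op to one behavior b (hypothetical container)."""
    s: int                 # 1 iff op influences b, else 0
    c: float               # immediacy constant of the influence
    z: Callable[[], int]   # returns 1 (stimulate) or 0 (mute)


def oct_term(alpha_b: float, couplings: Iterable[Coupling]) -> float:
    """Eq. 12: sum of s * c * (Z - alpha_b) over the given operator couplings."""
    return sum(k.s * k.c * (k.z() - alpha_b) for k in couplings)


if __name__ == "__main__":
    fact_base = {"CloseTo(A)"}  # illustrative fact base

    # Z as a boolean function of the fact base, in the spirit of charFct.
    stimulate_if_close = Coupling(
        s=1, c=0.5, z=lambda: 1 if "CloseTo(A)" in fact_base else 0)
    mute = Coupling(s=1, c=0.5, z=lambda: 0)
    unaffected = Coupling(s=0, c=0.5, z=lambda: 1)

    print(oct_term(0.2, [stimulate_if_close]))  # positive: pulls alpha_b up
    print(oct_term(0.8, [mute]))                # negative: pulls alpha_b down
    print(oct_term(0.8, [unaffected]))          # zero: no influence
```

The sign of the term shows the design intent: a stimulating operator pulls the activation towards 1, a muting one towards 0, and an operator with s = 0 leaves the behavior alone, which differs from actively muting it.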
To give an example, assume that the domain model for the deliberation component includes an operator GO-IN-RM(x) modeling the action of some office delivery robot to go (from wherever it is) to and enter room x. Let the behavior inventory be the one specified in Sec. 3. Here is a selection of s, Z, and c variables of these behaviors and how they should be affected by the ground operator GO-IN-RM(A):

$s_{GoThruDoor}^{GO-IN-RM(A)}$ set to 1, as the operator does influence the behavior;
$c_{GoThruDoor}^{GO-IN-RM(A)}$ set to some medium value, causing a tendency to influence activation soon after the operator is chosen as being active;
$Z_{GoThruDoor}^{GO-IN-RM(A)}$ set to charFct(CloseTo(A)), i.e., the characteristic function that returns 1 if CloseTo(A) is currently in the fact base, and 0 else;
$s_{Wander}^{GO-IN-RM(A)}$ set to 1, as the operator should affect its activation (namely, muting it);
$s_{AvoidColl}^{GO-IN-RM(A)}$ set to 0, as collision avoidance should not be affected by the operator (note the difference between not affecting and actively muting an activation value).

5 From BBSs to plans: Extracting facts from activation values

We now turn to the method of extracting facts from activation value histories. It is influenced by previous work on chronicle recognition, such as (Ghallab 1996).

To start, take another look at the activation curves in Fig. 3 in Sec. 3. Some irregular activation time series occur due to the dynamics of the robot/environment interaction, such as early in the AvoidColl and Wander behaviors. However, certain patterns re-occur for single behaviors within intervals of time, such as a value being more or less constantly high or low, and values going up from low to high or vice versa. The idea to extract symbolic facts from activation values is to consider characteristic groups or gestalts of such qualitative activation features occurring in chronicles over time.

To make this precise, we define, first, qualitative activation values (or briefly, qualitative activations) describing these isolated patterns.
In this paper, we consider four of them, which are sufficient for defining and demonstrating the principle, namely, rising/falling edge, high, and low, symbolized by the predicates $\Uparrow e$, $\Downarrow e$, Hi, and Lo, respectively. In general, there may be more qualitative activations of interest, such as a value staying in a medium range over some period of time. For a behavior b and time interval $[t_1, t_2]$, they are defined as

    $Hi(b)[t_1, t_2] \equiv \alpha_b[t] \ge h$ for all $t_1 \le t \le t_2$
    $Lo(b)[t_1, t_2] \equiv \alpha_b[t] \le l$ for all $t_1 \le t \le t_2$
    $\Uparrow e(b)[t_1, t_2] \equiv \alpha_b[t_1] = l$ and $\alpha_b[t_2] = h$ and $\alpha_b$ increases generally monotonically over $[t_1, t_2]$    (13)
    $\Downarrow e(b)[t_1, t_2] \equiv \alpha_b[t_1] = h$ and $\alpha_b[t_2] = l$ and $\alpha_b$ decreases generally monotonically over $[t_1, t_2]$    (14)

for given threshold values $0 \ll h < 1$ and $0 < l \ll 1$, where $\alpha_b[t]$ denotes the value of $\alpha_b$ at time t.

General monotonicity requires another technical definition, which we skip here for brevity. The idea is that some degree of noise should be allowed in, e.g., an increasing edge, making the increase locally non-monotonic. In the rather benign example activation curves in this paper, regular monotonicity suffices. Similarly, it is not always reasonable to use the global constants h, l as Hi and Lo thresholds, respectively. It is possible to use different threshold constants or thresholding functions for different behaviors. We do not go into that here. Then, it makes sense to require a minimum duration for $[t_1, t_2]$ to prevent useless mini intervals of Hi and Lo types from being identified. Finally, the strict equalities in Eqs. 13 and 14 are unrealistic in real robot applications, where two real numbers must be compared, which are seldom strictly equal. Equality $\pm \epsilon$ is the solution of choice here.

The key idea to extract facts from activation histories is to consider patterns of qualitative activations of several behaviors that occur within the same interval of time. We call these patterns activation gestalts.
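The definitions of Hi, Lo and the two edge types (Eqs. 13 and 14) translate directly into checks on a sampled activation series. The following sketch assumes activations sampled at integer time steps and uses regular monotonicity, equality up to a tolerance `eps`, and a minimum duration `min_len` for levels; the function names and the default thresholds h = 0.8, l = 0.2 are illustrative assumptions, not values from the paper.

```python
def level_intervals(alpha, pred, min_len):
    """Maximal index intervals (i, j) on which pred holds for >= min_len samples."""
    out, start = [], None
    for i, a in enumerate(alpha):
        if pred(a):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                out.append((start, i - 1))
            start = None
    if start is not None and len(alpha) - start >= min_len:
        out.append((start, len(alpha) - 1))
    return out


def hi_intervals(alpha, h=0.8, min_len=3):
    """Intervals satisfying Hi: alpha[t] >= h throughout."""
    return level_intervals(alpha, lambda a: a >= h, min_len)


def lo_intervals(alpha, l=0.2, min_len=3):
    """Intervals satisfying Lo: alpha[t] <= l throughout."""
    return level_intervals(alpha, lambda a: a <= l, min_len)


def is_rising_edge(alpha, i, j, h=0.8, l=0.2, eps=0.05):
    """Eq. 13 with equality up to eps and regular (non-strict) monotonicity."""
    seg = alpha[i:j + 1]
    return (i < j and abs(seg[0] - l) <= eps and abs(seg[-1] - h) <= eps
            and all(x <= y for x, y in zip(seg, seg[1:])))


def is_falling_edge(alpha, i, j, h=0.8, l=0.2, eps=0.05):
    """Eq. 14 with equality up to eps and regular (non-strict) monotonicity."""
    seg = alpha[i:j + 1]
    return (i < j and abs(seg[0] - h) <= eps and abs(seg[-1] - l) <= eps
            and all(x >= y for x, y in zip(seg, seg[1:])))


if __name__ == "__main__":
    alpha = [0.1, 0.1, 0.2, 0.5, 0.8, 0.9, 0.9, 0.9]  # made-up samples
    print(lo_intervals(alpha), hi_intervals(alpha))
    print(is_rising_edge(alpha, 2, 4))
```

Tolerating noise inside an edge (general monotonicity) would need a further definition, which the text omits and the sketch does not attempt.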
Activation gestalts are expressed formally by a time-dependent predicate AG over a set Q of qualitative activations of potentially many behaviors. For a time interval $[t, t']$ the truth of $AG(Q)[t, t']$ is defined as the conjunction of conditions on the component qualitative activations $q \in Q$ of behaviors b. One possibility of defining them is the following:

    case $q = Hi(b)$:             $Hi(b)[t, t']$
    case $q = Lo(b)$:             $Lo(b)[t, t']$
    case $q = \Uparrow e(b)$:     $\Uparrow e(b)[t_1, t_2]$ for some $[t_1, t_2] \subseteq [t, t']$, and $Lo(b)[t, t_1]$ and $Hi(b)[t_2, t']$
    case $q = \Downarrow e(b)$:   $\Downarrow e(b)[t_1, t_2]$ for some $[t_1, t_2] \subseteq [t, t']$, and $Hi(b)[t, t_1]$ and $Lo(b)[t_2, t']$

Note that it is not required that different rising or falling edges in Q start or end synchronously among each other or at the interval borders of $[t, t']$; they only must all occur somewhere within that interval.

For example, $AG(\{\Uparrow e(GoThruDoor), \Downarrow e(TurnToDoor), Hi(CloseToDoor)\})$ is true over [20, 24] in the activation histories in Fig. 3; it is also true over [16, 23] (and therefore, also over their union [16, 24]), but not over [16, 25], as CloseToDoor has left its Hi band by time 25, and possibly the same for GoThruDoor, depending on the concrete value of the h threshold.

A chronicle over some interval of time $[t_0, t]$ is a set of activation gestalts over sub-intervals of $[t_0, t]$ with a finite set of n linearly ordered internal interval boundary points $t_0 < t_1 < \ldots < t_n < t$. A ground fact is extracted from the activation history of a BBS as true (or rather, as evident, see the discussion below) at time t if its defining chronicle has been observed over some interval of time ending at t. The defining chronicle must be provided by the domain modeler, of course.

We give as an example the defining chronicle of the fact InRoom that the robot is in some room, such as the one left of the wall in Fig. 1.
InRoom[t] is extracted if the following defining chronicle is true within the interval $[t_0, t]$, where the $t_i$ are existentially quantified†:

      $AG(\{\Downarrow e(GoTrD.)\})[t_4, t]$
    $\wedge\ AG(\{\Uparrow e(GoTrD.), \Downarrow e(Turn2D.), Hi(Close2D.)\})[t_3, t_4]$
    $\wedge\ AG(\{Hi(TurnToDoor), Lo(InCorridor)\})[t_2, t_3]$
    $\wedge\ AG(\{\Uparrow e(Turn2D.), \Uparrow e(Close2D.), \Downarrow e(InCor.)\})[t_1, t_2]$
    $\wedge\ AG(\{Hi(InCorridor)\})[t_0, t_1]$
    $\wedge\ AG(\{Lo(TimeOut)\})[t_0, t_4]$    (15)

Assuming reasonable settings of the Hi and Lo thresholds h, l, the following substitutions of the time variables to time points yield the mapping into the activation histories in Fig. 3: $t = 28$ (right outside the figure), $t_0 = 3$, $t_1 = 12$, $t_2 = 16$, $t_3 = 20$, $t_4 = 24$. As a result, we extract InRoom[24]. This substitution is not unique. For example, postponing $t_0$ until 5 or having $t_1$ earlier at 9 would also work.

This point leads to the process of chronicle recognition: given a working BBS, permanently producing activation values, how are the given defining chronicles of facts checked against that activation value data stream to determine whether some fact starts to hold?

The obvious basis for doing this is to keep track of the qualitative activations as they emerge. That means, for every behavior, there is a process permanently logging the qualitative activations. For those of type Hi and Lo, the sufficiently long time periods of the respective behavior activation above and below the h, l thresholds, resp., have to be recorded and, if adjacent to the current time point, appropriately extended. This would automatically lead to identifying qualitative activations of types Hi and Lo with their earliest start point, such as $t_0 = 3$ for Hi(InCorridor) in the example above. Qualitative activations of types $\Uparrow e$ and $\Downarrow e$ are logged iff their definitions (Eqs. 13 and 14, resp.) are fulfilled in the recent history of activation values. As this logging process is local to every behavior, the complexity is linear, O(B), in the number B of behaviors.
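Given a log of qualitative activations, the case conditions for AG and a defining chronicle such as (15) can be checked mechanically once the time variables are bound. The sketch below does only that offline check; it is not the online matching procedure, and the log entries are invented for illustration rather than read off Fig. 3. The names `covers`, `ag_holds`, and `chronicle_holds`, the tuple encoding, and the treatment of degenerate intervals are all assumptions.

```python
from typing import Dict, List, Tuple

# One logged qualitative activation: (kind, behavior, start, end), with kind
# in {"Hi", "Lo", "up", "down"} ("up"/"down" = rising/falling edge).
QA = Tuple[str, str, int, int]
Qual = Tuple[str, str]  # (kind, behavior)


def covers(log: List[QA], kind: str, b: str, t1: int, t2: int) -> bool:
    """True if some logged level of this kind for behavior b spans [t1, t2]."""
    if t1 >= t2:
        # Assumption: a degenerate interval is implied by the end point of
        # the adjacent edge, so it needs no log entry of its own.
        return True
    return any(k == kind and beh == b and s <= t1 and t2 <= e
               for k, beh, s, e in log)


def ag_holds(log: List[QA], quals: List[Qual], t: int, t_end: int) -> bool:
    """Conjunction of the case conditions for AG(Q)[t, t_end]."""
    for kind, b in quals:
        if kind in ("Hi", "Lo"):
            ok = covers(log, kind, b, t, t_end)
        else:
            before, after = ("Lo", "Hi") if kind == "up" else ("Hi", "Lo")
            ok = any(k == kind and beh == b and t <= s and e <= t_end
                     and covers(log, before, b, t, s)
                     and covers(log, after, b, e, t_end)
                     for k, beh, s, e in log)
        if not ok:
            return False
    return True


def chronicle_holds(log, chronicle, times: Dict[str, int]) -> bool:
    """Check every activation gestalt of a chronicle under fixed bindings."""
    return all(ag_holds(log, quals, times[a], times[z])
               for quals, a, z in chronicle)


# Defining chronicle of InRoom, following (15).
IN_ROOM = [
    ([("down", "GoThruDoor")], "t4", "t"),
    ([("up", "GoThruDoor"), ("down", "TurnToDoor"), ("Hi", "CloseToDoor")], "t3", "t4"),
    ([("Hi", "TurnToDoor"), ("Lo", "InCorridor")], "t2", "t3"),
    ([("up", "TurnToDoor"), ("up", "CloseToDoor"), ("down", "InCorridor")], "t1", "t2"),
    ([("Hi", "InCorridor")], "t0", "t1"),
    ([("Lo", "TimeOut")], "t0", "t4"),
]
TIMES = {"t0": 3, "t1": 12, "t2": 16, "t3": 20, "t4": 24, "t": 28}

# Hand-made log, invented so that the bindings above fit.
LOG: List[QA] = [
    ("Hi", "InCorridor", 3, 12), ("down", "InCorridor", 12, 16),
    ("Lo", "InCorridor", 16, 28),
    ("up", "TurnToDoor", 12, 16), ("Hi", "TurnToDoor", 16, 20),
    ("down", "TurnToDoor", 20, 24),
    ("up", "CloseToDoor", 12, 16), ("Hi", "CloseToDoor", 16, 24),
    ("up", "GoThruDoor", 20, 24), ("down", "GoThruDoor", 24, 28),
    ("Lo", "TimeOut", 0, 28),
]

if __name__ == "__main__":
    print(chronicle_holds(LOG, IN_ROOM, TIMES))
```

A full recognizer would additionally have to search for the bindings of the boundary points instead of receiving them as input.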
† To fit column space, some behavior names are abbreviated.

Qualitative activation logs are then permanently analyzed for whether any of the existing defining chronicles are fulfilled, which may run in parallel to the ongoing process of logging the qualitative activations. An online version of this analysis inspired by (Ghallab 1996) would attempt to match the flow of qualitative activations with all defining chronicles c by means of matching fronts that jump along c's internal interval boundary points $t_i$ and try to bind the next time point $t_{i+1}$ as current matching front such that the recent qualitative activations fit all sub-intervals of c that end in $t_{i+1}$. Note that more than one matching front may be active in every defining chronicle at any time. A matching front in c vanishes if it reaches the end point t (the defining chronicle is true), or else if, while stuck at $t_i$, it is caught up by another matching front at $t_i$, or else if an activation gestalt over an interval ending at $t_{i+1}$ is no longer valid in the current qualitative activation history.

For complexity considerations, assume that C defining chronicles are defined that involve a maximum of N - 1 internal interval boundary points. Assume further that A is the maximal number of qualitative activations occurring in single activation gestalt conjuncts of all defining chronicles. (A is bounded by the number B of behaviors.) Then, one cycle of the online matching of defining chronicles runs in O(ACN) time.

Practically, the necessary computation may be focused by specifying for each defining chronicle a trigger condition, i.e., one of the qualitative activations in the definition that is used to start a monitoring process of the validity of all activation gestalts. For example, in the InRoom definition above, $\Uparrow e(GoThruDoor)$, as occurring in the $[t_3, t_4]$ interval, might be used. Note that the trigger condition need not be part of the earliest activation gestalts in the definition.
On appearance of some trigger condition in the qualitative activation log, we try to match the activation gestalts prior to the trigger with qualitative activations in the log file and, if successful, verify the gestalts after the trigger condition in the qualitative activations as they are being logged.

Figure 4: Robot example 2: final state and activation curves. See text for explanations.

To give an example where the derivation of the InRoom fact fails, consider Fig. 4. The scenario is like before, i.e., the robot starts at the lower right corner, driving upward and trying to enter any door large enough. Different from Fig. 1, no obstacle is present at the beginning, and while the robot tries to enter the detected door, another robot comes from within the room and blocks it. Fig. 4 right shows the scene after the robot has failed to enter the door, and left are the respective activation values.

Like before, while InCorridor is Hi ($t_0 = 3$), the CloseToDoor and TurnToDoor activations rise with InCorridor falling ($t_1 = 9$, $t_2 = 13$); then, TurnToDoor is Hi, while InCorridor is Lo ($t_3 = 20$). But then, mischief strikes. After the long high period of TurnToDoor, TimeOut jumps up, terminating its Lo period, before GoThruDoor has risen. In consequence, $t_4$ cannot be bound, and the fact InRoom is not extracted. If $\Uparrow e(GoThruDoor)$ was used as a trigger condition in the first place, then no unnecessary matching effort was wasted.

Some more general remarks are in place here. Defining chronicles exclusively in terms of activation gestalts is a special case that we have used in this paper to keep matters focused. In general, the obvious other elements may be used for defining them: sensor readings at some time points (be they physical sensors or sensor filters), and the validity of symbolic facts at a time point or over some time interval. Our intention is to provide fact extraction from activation values as a main source of information, not the exclusive one.
That type of information can be added to the logical format of a chronicle definition as in (15). For example, if the exact time point of entering a room with the robot's front is desired as the starting point of the InRoom fact, then this might be determined by the time within the interval $[t_4, t]$ (i.e., within the decrease of the GoThruDoor activation) where some sensor senses open space to the left and right again. As another example, assume that the fact At(DoorA) for the door to some room A may be in the fact base (as derived from a normal localization process). Then $At(DoorA)[t_4]$ could be added to the defining chronicle (15) above to derive not only InRoom[t], but more specifically InRoom(A)[t].

The fact extraction technique does not presume or guarantee anything about the consistency of the facts that get derived over time. Achieving and maintaining consistency, and determining the ramifications of newly emerged facts, remain issues that go beyond fact extraction. Pragmatically, we would not recommend blindly adding a fact as true to the fact base as soon as its defining chronicle has been observed. A consequent of a recognized defining chronicle should be interpreted as evidence for the fact or as a fact hypothesis, which should be added to the robot's knowledge base only by a more comprehensive knowledge base update process, which may even reject the hypothesis in case of conflicting information. A possible solution would be to add some integrity constraints to the defining chronicles. However, this is not within the scope of this paper.

6 Discussion

A physical agent's perception categories must to some degree be in harmony with its actuator capabilities, at least in purposively designed technical artifacts such as working autonomous robots.² Our approach of extracting symbolic facts from behavior activation merely exploits this harmony for intertwining control on a symbolic and a reactive level of a hybrid robot control architecture.
The technical basis for the exploitation is provided by time series of behavior activation values. We have taken them from a special type of behavior-based robot control systems (BBSs), namely, those consisting of behaviors expressed by nonlinear dynamical functions of a particular form, as described in Sec. 2. The idea of having activation values in BBSs is not new; it is also the case, e.g., for the behavior-based fuzzy control part underlying Saphira (Konolige et al. 1997), where the activation values are used for context-dependent blending of behavior outputs, which is similar to their use in our BBS framework. Activation values also provide the degree of applicability of the corresponding motor schemas in (Arkin 1998, p. 141).

² We do not speculate about biological agents in this paper, although we would conjecture that natural selection and parsimony strongly favor this principle.

The activation values of a dynamical system-type BBS are well-suited for fact extraction in that their formal background in dynamical systems theory provides both the motivation and the mathematical inventory to make them change smoothly over time; compare, e.g., the curves in Figures 3 and 4 with the ragged ones in (Saffiotti, Ruspini, & Konolige 1999, Fig. 5.10). This typical smoothness is handy for defining qualitative activations, which aggregate particular patterns in terms of edges and levels of the curves of individual behaviors, and which are recorded as they emerge over time. These then serve as a stable basis for chronicle recognition over qualitative activations of several behaviors. Note, however, that this smoothness is a practical rather than a theoretical issue, and other BBS approaches may serve as bases for fact extraction from activation values.

We want to emphasize that the activation values serve two purposes in our case: first, their normal one to provide a reliable BBS, and second, to deliver the basis for extracting persistent facts, based on their distinctive patterns.
With the second use, we save the domain modeler a significant part of the burden of designing a complicated sensor interpretation scheme only for deriving facts. The behavior activation curves, as a by-product coming for free of the behavior-based robot control, focus on the environment dynamics, be it induced by the robot itself or externally. By construction, these curves aggregate the available sensor data in a way that is particularly relevant for robot action. We have argued that this information can be used as a main source of information about the environment; other information, such as that coming from raw sensor data, from dedicated sensor interpretation processes, or from available symbolic knowledge, could be used in addition.

As activation values are present in a BBS anyway, it is possible to "plug in" the fact extraction machine for a deliberative component to an already existing behavior system like the DD control system in (Bredenfeld et al. 1999). Yet, if a new robot control system is about to be written for a new application area, things could be done better, within the degrees of freedom for variations in behavior and domain model design. The ideal case is that the behavior inventory and the fact set are in harmony, in the sense that such facts get used in the domain model whose momentary validity engraves itself in the activation value history, and such behaviors get used that produce activation values providing evidence for facts. For example, a single WallFollow behavior working for walls on the right and on the left may be satisfactory from the viewpoint of behavior design for a given robot application; for fact extraction, it may be more opportune to split it into FollowLeftWall and FollowRightWall, which would be equally feasible for the behavior control, but allows more targeted facts to be deduced directly.
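The qualitative activations discussed in this section, i.e., levels and edges of a single behavior's activation curve, can be sketched as a simple segmentation of a sampled curve. The following Python fragment is only an illustration under assumptions: the function name `qualitative_activations`, the thresholds `hi` and `lo`, the label names, and the assumption that activations are sampled values in [0, 1] are all hypothetical and not taken from the paper.

```python
def qualitative_activations(samples, hi=0.8, lo=0.2):
    """Segment one activation curve into (label, t_start, t_end) runs.

    samples: activation values, one per time step (assumed in [0, 1]).
    Labels: "Hi" and "Lo" are levels; "Rise" and "Fall" are the edges
    found in the band between the two (assumed) thresholds.
    """
    def label(i):
        v = samples[i]
        if v >= hi:
            return "Hi"
        if v <= lo:
            return "Lo"
        # Inside the transition band, the local slope decides the edge type.
        prev = samples[i - 1] if i > 0 else v
        return "Rise" if v >= prev else "Fall"

    runs = []
    for i in range(len(samples)):
        lab = label(i)
        if runs and runs[-1][0] == lab:
            runs[-1] = (lab, runs[-1][1], i)    # extend the current run
        else:
            runs.append((lab, i, i))            # open a new run
    return runs


# A smooth, made-up activation curve: low, rising, high, falling, low.
curve = [0.0, 0.1, 0.4, 0.7, 0.9, 1.0, 0.9, 0.5, 0.1]
for run in qualitative_activations(curve):
    print(run)
```

On a smooth curve such a segmentation yields a few long runs; on a ragged curve the same thresholds would produce many short ones, which is one way to read the remark that smoothness is a practical convenience rather than a theoretical requirement.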
Apart from such design-level interdependencies, which are non-trivial but not specific to our approach, we are aiming at a control architecture with a deliberative and a behavior-based part as two abreast modules with no hierarchy, as sketched in (Hertzberg et al. 1998). The fact extraction scheme leaves the possibility to unplug the deliberative part from the robot control, which we think is essential for robustness of the whole robot system.

Our technique is complementary to anchoring symbols to sensor data as described in (Coradeschi & Saffiotti 2001). It differs from that line of work in two main respects. First, we use sensor data as aggregated in activation value histories only, not raw sensor data. Second, we aim at extracting ground facts rather than establishing a correspondence between percepts and references to physical objects. The limit of our approach is that it is inherently robot-centered in the sense that we can only arrive at information that has to do directly with the robot action. The advantage is that, due to its specificity, it is conceptually and algorithmically simpler than symbol anchoring in general.

7 Conclusion

We have presented a new approach for extracting information about symbolic facts from activation curves in behavior-based robot control systems. Updating the symbolic environment situation is a crucial issue in hybrid robot control architectures in order to bring to bear the reasoning capabilities of the deliberative control part on the physical robot action as exerted by the reactive part. Unlike standard approaches to sensing the environment in robotics, we are using the information hidden in the temporal development of the data, rather than their momentary values. Therefore, our method promises to yield environment information that is complementary to normal sensor interpretation techniques, which can and should be used in addition.
We have presented the technique in principle as well as in terms of selected demo examples in a robot simulator, which has allowed us to judge the approach feasible and to design the respective algorithms. The computational complexity of the recognition process is in O(BCN), where B is the number of behaviors, C the number of chronicle definitions, and N the maximal "length" of a chronicle definition in terms of intermediate time points internal to the chronicle definition.

The approach will be applied in the context of the hybrid robot control architecture DD&P (Hertzberg et al. 1998) to generate an important part of the information that is used to update the symbolic world model from the sensor data stream. We are implementing an add-on tool for the DD-Designer (Bredenfeld et al. 1999) which allows systematic analysis and testing of our ideas and experiments with additional types of qualitative activations.

Acknowledgements

Thanks to Herbert Jaeger, the GMD AiS.BE-Team, and especially Hans-Ulrich Kobialka for their support on this work.

References

Arkin, R. 1998. Behavior-Based Robotics. MIT Press.

Bonasso, P.; Firby, J.; Gat, E.; Kortenkamp, D.; Miller, D.; and Slack, M. 1997. Experiences with an architecture for intelligent, reactive agents. J. Expt. Theor. Artif. Intell. 9:237-256.

Bredenfeld, A.; Göhring, W.; Günther, H.; Jaeger, H.; Kobialka, H.-U.; Plöger, P.-G.; Schöll, P.; Siegberg, A.; Streit, A.; Verbeek, C.; and Wilberg, J. 1999. Behavior engineering with dual dynamics models and design tools. In Proc. 3rd Int. Workshop on RoboCup at IJCAI-99, 57-62.

Bredenfeld, A. 2000. Integration and evolution of model-based tool prototypes. In 11th IEEE International Workshop on Rapid System Prototyping, 142-147.

Coradeschi, S., and Saffiotti, A. 2001. Perceptual anchoring of symbols for action. In Proc. of the 17th IJCAI Conf., 407-412.

Fox, D.; Burgard, W.; and Thrun, S. 1999. Markov localization for mobile robots in dynamic environments. J. Artif. Intell. Research (JAIR) 11.

Ghallab, M. 1996. On chronicles: Representation, on-line recognition and learning. In Aiello; Doyle; and Shapiro, eds., Proc. Principles of Knowledge Representation and Reasoning (KR-96), 597-606. Morgan Kaufmann.

Harnad, S. 1990. The symbol grounding problem. Physica D 42:335-346.

Hertzberg, J., and Schönherr, F. 2001. Concurrency in the DD&P robot control architecture. In Sebaaly, M. F., ed., Proc. of The Int. NAISO Congr. on Information Science Innovations (ISI’2001), 1079-1085. ICSC Academic Press.

Hertzberg, J.; Jaeger, H.; Zimmer, U.; and Morignot, P. 1998. A framework for plan execution in behavior-based robots. In Proc. of the 1998 IEEE Int. Symp. on Intell. Control (ISIC-98), 8-13.

Jaeger, H., and Christaller, T. 1997. Dual dynamics: Designing behavior systems for autonomous robots. In Fujimura, S., and Sugisaka, M., eds., Proc. Int. Symposium on Artificial Life and Robotics (AROB’97), 76-79.

Koehler, J.; Nebel, B.; Hoffmann, J.; and Dimopoulos, Y. 1997. Extending planning graphs to an ADL subset. In Steel, S., and Alami, R., eds., Recent Advances in AI Planning. ECP-97, 273-285. Springer (LNAI 1348).

Konolige, K.; Myers, K.; Saffiotti, A.; and Ruspini, E. 1997. The Saphira architecture: A design for autonomy. J. Expt. Theor. Artif. Intell. 9:215-235.

Matarić, M. J. 1999. Behavior-based robotics. In Wilson, R. A., and Keil, F. C., eds., MIT Encyclopedia of Cognitive Sciences. The MIT Press. 74-77.

Muscettola, N.; Nayak, P.; Pell, B.; and Williams, B. 1998. Remote Agent: to boldly go where no AI system has gone before. J. Artificial Intelligence 103:5-47.

Pollack, M. 1992. The uses of plans. Artif. Intell. 57(1):43-68.

Remote Agent Project. 2000. Remote agent experiment validation. http://rax.arc.nasa.gov/DS1-Tech-report.pdf.

Saffiotti, A.; Ruspini, E.; and Konolige, K. 1999. Using fuzzy logic for mobile robot control. In Zimmermann, H.-J., ed., Practical Applications of Fuzzy Technologies. Kluwer Academic. Chapter 5, 185-205. Handbook of Fuzzy Sets, vol. 6.

Schönherr, F.; Cistelecan, M.; Hertzberg, J.; and Christaller, T. 2001. Extracting situation facts from activation value histories in behavior-based robots. In Baader, F.; Brewka, G.; and Eiter, T., eds., KI-2001: 24th German / 9th Austrian Conf. on AI, volume 2174 of LNAI. Springer.
