Using Behavior Activation Value Histories for Updating...

From: AAAI Technical Report FS-01-01. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved.
Using BehaviorActivation Value Histories for UpdatingSymbolicFacts*
Frank Sch~inherr
and Mihaela
Cistelecan
t and Joachim Hertzberg
and Thomas Christaller
Fraunhofer-Institute for Autonomous
Intelligent Systems(AIS)
Schloss Birlinghoven, 53754Sankt Augustin, Germany
Frank.Schoenherr@ais.fraunhofer.de
Abstract
Thepaperpresentsa newtechniquefor extractingsymbolic
groundfacts out of the sensor data streamin autonomous
robotsfor use underhybridcontrolarchitectures. Thesensor dataare usedin the formof timeseries curvesof behavior
activation values, yieldingan imageof the environment
as
perceivedthroughthe eyesof usefulbehaviors.
Similarprogressions
in individualbehavioractivationcurves
are aggregated
to well-defined
patterns,like edgesandlevels,
calledqualitativeactivations.Sets of qualitativeactivations
for differentbehaviorsoccurringin the sameinterval of time
are summed
to activationgestalts. Sequences
of activation
gestalts are usedfor definingchronicles,the recognitionof
whichestablishesevidencefor the validity of groundfacts.
Theapproach
in generalis described,andexamples
for a partitular behavior-based
robotcontrolframework
in simulation
are presentedanddiscussed.
1 Background and Overview
There are several good reasons to include a behaviorbased componentin the control of an autonomousmobile
robot. There are equally good reasons to include in addition a deliberative component. Having componentsof
both types results in a hybrid control architecture, intertwining the behavior-basedand the deliberative processes
that go on in parallel. Together, they allow the robot
to react to the dynamicsand unpredictability of its environmentwithout forgetting the high-level goals to accomplish. Arkin (Arkin 1998, Ch. 6) presents a detailed argument and surveys hybrid control architectures; the many
workingautonomous
robots that use hybrid architectures inelude the RemoteAgent Project (Muscettola et al. 1998;
RemoteAgentProject 2000) as their highest-flying example.
While hybrid, layered control architectures for autonomousrobots, such as Saphira (Konoligeet al. 1997)or
*Thisworkis partially supportedbythe German
Fed. Ministry
for Educationand Research(BMBF)
in the joint project AgenTee(521-75091-VFG0005B).
This paper is an updatedversion
(Schrnherr
et al. 2001).
tOnleave from Tech. University of Cluj-Napoca,Romania.
The workwas supportedby a Roman-Herzog
scholarship of the
AlexanderyonHumboldt
foundation.
Copyright~) 2001,American
Associationfor Artificial Intelligence(www.aaai.org).
All rights reserved.
3T(Bonassoet al. 1997)are state of the art, someproblems
remainthat makeit a still complicated
task to build a control
systemfor a concrete robot to workon a concrete task. To
quote Arkin(Arkin 1998, p. 207),
the nature of the boundarybetweendeliberation and
reactive executionis not well understoodat this time,
leading to somewhat
arbitrary architectural decisions.
Oneof the problems is to keep up-to-date the symbolic
world representation for the deliberative component,which
is an instance of the symbol groundingproblem(I-Iarnad
1990). Thereare solutions to importantparts of that problem, such as methodsand algorithms for sensor-based localization to reason about future navigationactions: (Fox,
Burgard, &Thrun 1999) presents one of the manyexamples
for on-line robot pose determinationbasedon laser scans. If
the purpose of deliberation is supposedto be moregeneral
than navigation, such as action planningor reasoningabout
action, then the needarises to sense moregenerally the recent relevant part of the worldstate and updateits symbolic
representation basedon these sensor data. Wecall this representationthe current situation.
Thenaive version of the update problem"Tell meall that
is currently true aboutthe word!"needsnot be solved, luckily, if the goal is to build a concreterobot to workon a concrete task. Onlythose facts needupdatingthat, accordingto
the symbolic domainmodelused for deliberation, are relevant for the robot to workon its task. Then,every robot has
its sensor horizon,i.e., a border in spaceand timelimiting
its sensor range. The term sensor is understoodin a broad
sense: It includes technical sensors like laser scanners, ultra soundtransducers, or cameras;but if, for example,the
arena of a delivery robot includes access to the control of
an elevator, then a status request by wireless Ethemetto determinethe current location of the elevator cabin is a sensor
action, and the elevator status is permanentlywithin the sensor horizon. Weassume:The worldstate informationwithin
the sensor horizonis sufficient to achievesatisfying robot
performance.
Thissaid, the task of keepingthe facts of a situation upto-date remains to continually computefrom recent sensor
data and the previoussituation a newversion of the situation
as far as it lies withinthe sensorhorizon.Thecomputation
is
basedon plain, current sensor values as well as histories of
situations and sensor readings or aggregatesthereof. Prac-
tically, we cannot expect to get accurate situation updates
instantly; all wecan do is makethe situation update as recent, comprehensive,
and accurate as possible.
This paper contributes to this task an approachof using
histories of activation values in behavior-based
robot control
systems (BBSs)(Matari6 1999) as main source ofinformation for situation update. This approachis useful for three
reasons:
¯ Activation values are calculated anywayin most BBSsto
allow for arbitration or mergingbetweenbehaviors; they
can be used at no additional computationcost, provided
that they are sufficiently fine grained.
¯ Anactivation value is groundedin sensor readings and, by
definition, evaluatesthemin a waytailored to its respective behavior;using activation values like aggregatedsensor readings yields automatically an action-centered way
of "looking throughthe sensors".
¯ To have a practical hybrid robot control, the symbolic
world modelmust be in accord with the inventory of behaviors anyway;using activation value histories in situation update only makeseven more explicit the need to
co-design the BBSand deliberative control components.
Ourapproachis not in principle limited to a particular combination of deliberation componentand BBS,as long as the
BBSis expressedas a dynamicalsystemand involves a looping computationof activation values for the behaviors. Some
additionalrequirements
applythat will be clarified in the paper.
All demoexamples are formulated in a concrete BBS
framework, namely, Dual Dynamics (DD, (Jaeger
Christaller 1997)), whichhas also inspired our abstract view
of BBSs.Our view of building hybrid robot controllers
involving a BBSas reactive componentis shaped by our
work in progress on the DD&P
robot control architecture
(Hertzberget al. 1998;Hertzberg&Schtinherr 2001), which
blends DDcontrollers with action plans generatedby a classical propositionalplanner(concretely, IPP (Koehleret al.
1997)) as the central deliberation component.Mind, however: The methodfor fact extraction from activation value
histories of BBSspresentedhere is potentially applicable in
BBSframeworksother than DD,as will be discussed at the
end of the paper.
Therest of this paper is organizedas follows. In Sec. 2,
we present our approach of formulating BBSsas dynamical systems and give a detailed examplein See. 3. To provide somebackgroundconcerning complete robot control
systems, we then (See. 4) sketch howwe assumethe deliberation componentinterferes the BBScontrol component.
See. 5 contains the technical contribution of the paper, describing in general as well as by wayof examplethe technique of extracting facts from BBSactivation value histories. See. 6 discussesthe approachand relates it to the literature. Sec. 7 concludes.
robot actuators, and higher-level behaviors (HLBs),which
are connected to LLBsand/or HLBs.Each LLBimplements
twodistinct functions: a target function and an activation
function. The target function for the behavior b provides
the reference tb for the robot actuators ("what to do") as
follows:
TOtT
tb = fb(8 T, 8 f, LLBI
(1)
wherefb is a nonlinear vector function with one component
for eachactuator variable, sT is the vector of all inputs from
sensors, s~" is the vector of the sensor-filters and t~LTLB
is
the vector of activation valuesof the LLBs.Bysensor-filters
- sometimescalled virtual sensors - we meanmarkovian
and non-markovianfunctions used for processing specific
information from sensors.
The LLBactivation function modulatesthe output of the
target function. It provides a value between1 and 0, meaning that the behaviorfully influences, does not influenceor
influences to somedegree the robot actuators. It describes
whento activate a behavior. For LLBb the activation value
is computed
fromthe followingdifferential equation:
~b,LLB
~- ~b(Olb,LLB,
OnFb, OffFb, OgTb) (2)
Eq. 2 gives the variation of the activation value Olb,LLBof
this LLB.lib is a nonlinearfunction. OCTb
allows the planner to influencethe activation values, see Sec. 4. Thescalar
variables OnFband O f f Fb are computedas follows:
OnFb m Ub(S T,8fT T,OtLLB,0/T HLB)
T
T
T
T
(3)
(4)
= Vb(S ,Sl,aLLB, aHLn)
whereUband Vb are nonlinear functions. Thevariable OnFb
sumsup all conditions whichrecommend
activating the respective behavior (on forces) and OffFbstands for contradictory conditionsto the respectivebehavior(off forces).
The HLBsimplementonly the activation function. They
are allowed to modulate only the LLBsor other HLBson
the sameor lowerlevel. In our ease, the changeof activation values for the I-ILBs Otb,HLB
are computed
in the same
manneras F-xl. 2.
Thereason for updatingbehavioractivation in the formof
Eq. 2 is this. Byreferring to the previousactivation value&b,
it incorporates a memory
of the previous evolution which
can be overwritten in case of suddenand relevant changes
in the environment,but normallyprevents activation values
from exhibiting high-frequencyoscillations or spikes. At
the sametime, this formof the activation function provides
somelow-passfilteringcapabilities, deleting sensor noise or
oscillating sensor readings.
Independentfrom that, it helps to develop stable robot
controllers if behavioractivations havea tendencyof moving towardstheir boundaryvalues, i.e., 0 or 1 in our formulation. Toachieve that, wehaveimplemented
lib in Eq. 2 as
a bistable groundform(like in (Bredenfeldet al. 1999)for
RoboCup
application of a BBSof the same type) providing
somehysteresis effect. Withoutfurther influence, this function pushesactivation values lower/higherthan somethreshold/~ (typically ~ = 0.5) softly to 0/1. Theactivation value
changesas a result of addingthe effects comingfrom the
variables OnF,Of f F, OCTand the bistable groundform.
OffFb
2 BBSs as dynamical
systems
Weassumea BBSconsists of two kinds of behaviors: lowlevel behaviors(LLBs),whichare directly connectedto the
10
Exactformulationsof the gb function are then just technical
and unimportantfor this paper.
The relative smoothnessof activation values achievedby
using differential equations and bistability will be helpful
later in the technicalcontributionof this paper(See. 5), when
it comesto derive facts fromthe time series of activation
values of the behaviors.
In our BBSformulation, behavior arbitration is achieved
using the activation values. As shownin Eqs. 2-4, each
behaviorcan interact with (i.e., encourageor inhibit) every
other behavioron the sameor lowerlevel. Themodelof interaction betweenbehaviorsis defined by the variables OnF
andO f f F.
Theoutput
vector
orreference
vector
r oftheBBSforthe
robot
acmatol~
isgenerated
bysumming
allLLBoutputs
by
a m~er,
asfollows:
r = ~ ~b,LLBtb
lows checkingisolated behaviors or the wholeBBSin designated environmentalsituations (configurations).
Figure 2: The behavior inventory for our examplesas in a
screen shot from the DDDesigner
tool. Big circles denote
I-ILBs; squares denote LLBs;the hollowicons at the bottom
denote robot actuator reference values. Arrowsdenote control flow betweenLLBsand actuators. Theinfluence structure betweenbehavior activations is not shown;the small
circles are of no importancehere.
(5)
b
Togetherwith the formof the activation values, this wayof
blending the outputs of LLBsavoids discontinuities in the
reference values r~ for the single robots actuators, such as
suddenchangesfrom full speed forward to full speed backward.
3 An Example
To illustrate the notation of See. 2 we give a demonstration
problemconsisting of the task of following a waft with a
robot and entering only those doors that are wide enough
to allow the entrance. Figure 1 gives an overview about
the mainpart of our arena. The depicted robot is equipped
with a short distance laser-scanner, 4 infrared side-sensors,
4 front/back bumpersand somedead-reckoningcapabilities.
The control system contains three HLBsand six LLBs,
see Fig. 2. RobotDirectionand RobotVeloeit~/are the referencesfor the tworespective actuators. Wehavethe following I-ILBs(cf. Fig. 2):
CloseToDoor
is activated if there is evidencefor a door;
InCorridoris active while the robot movesinside a corridor;
TimeOut
was implementedin order to avoid getting stuck
in a situation, see See. 5;
The LLBsare the following:
TurnToDoor
is activated if the robot is situated on a level
with a door;
GoThruDooris activated after the behavior TumToDoor
wassuccessful;
FollowFlightWall
is active whena right wall is followed;
FollowLeftWall
is active whena left wall is followed;
AvoidCollis active whenthere is an obstacle in the front of
the robot
Wander
is active whenno other LLBis active.
Most of the implementedbehaviors are commonfor this
kind of tasks. However,
wedecidedto split the task of passing a door in a sequenceof two LLBs.This helps structure,
maintain and independentlyimprovethese two behaviors.
To give a simple exampleof BBSmodeling, here are the
"internals" of Wander(Wdr):
Figure 1: The demoarena and the final robot pose in ExampleI. The robot has started its course at the lowerright
comer,belowto the round obstacle.
On,
FWdr= kl (1 - OtCioseroV.) * (1 -- OtFollownlghtW.
)
OffFwdr
Weused a simulator based on the DDDesignerprototype
tool (Bredenfeld2000;Bredenfeldet al. 1999). Thetool al-
=
¯ (1 - aFollowLe~W.)
* (1 -- OtAvolaColl)
k20~AvoldColl+ k3OtCloseToDoor
+
k4OtFollowRIghtWall
"1- kS~FollowLeftWall
(7)
11
ing/falling edgesof someactivation values. The secondrise
of AvoidColl(after t = 22) is caused by the door frame,
whichpopsinto sight as a close obstacle at the very end of
the turning maneuver.Effectively, the collision avoidance
guides the robot through the door. Finally GoThruDoor
gets slowlydeactivatedallowingother behaviorsto take control of the robot.
The HLBsCloseToDoorand InCorridor describe global
states, therebymodulating
the interaction, activation and sequencing of the LLBs.
4 Fromplans to BBSs:Blendingbehaviors
with operators
Figure 3: Theactivation value histories for Example1. The
numbersare time unit for reference. Top down:AvoidColl,
FollowLeftWall, GoThruDoor, /urnToDoor, Wander,
CloseToDoor,InCorridor, TimeOut.
RobotDwdr =
e~botVwd
r
tWdr
=
=
randomDireetionO
mediumSpeed
RobotVelocitYWdr
[ RobotDirectionwdr
]
(8)
(9)
(10)
(11)
awdr= gw.(aw.,OnFw.,OLfFw.,OOTw.)
where kl... k5 are empirically chosen constants.
randomDirection
0 could be every function that generates
a direction whichresults in a randomlychosentrajectory.
Due to its product form, OnFwander
can only be remarkably greater than zero if all included ab are approximately
zero. OffFwanderconsists of a sumof terms allowing every included behavior to deactivate Wander.Both terms
are simple and can be calculated extremelyfast, whichis
a guideline for most BBSs.The OCTterm will be briefly
explainedin the next section.
Fig. 3 showsthe activation value histories generatedduring a robot run, whichwill be referred to as Example
1. The
robot starts at the right loweredgeof Fig. 1 with Wanderin
control for a very short time, until a wall is perceived.This
effect is explainedby Eq. 6. Whilethe robot starts to follow
the wall, it detects the smallroundobstaclein front. In consequence, two LLBsare active simultaneously: AvoidColl
and FollowLeftWall.
Finally, the robot followsthe wall, ignores the little gap and enters the door. In the examplesfor
this paper, FollowRightWall
is alwaysinactive and therefore
not shownin the activation value curves.
This exemplifiesthe purposeof a slow increase in behavior activation. FollowLeftWallshould only have a strong
influenceto the overall robot behaviorif a wall is perceived
with both side-sensors for sometime, so as to be moresure
that the robot really has sensed a wall. Thesmall dent in
the activation of FollowLeftWall(around the time t = 4)
is explainedby perceiving free space with one side-sensor.
If both side-sensors detect free space this behaviorwould
be deactivated. Theturning to the door is describedby ris-
12
Thetechnical contribution of this paper is an approachto
enhancingthe information flow from the BBSto the deliberation part in hybrid robot control systems. Beforecoming
to that in Section 5, we wantto sketch the control flow in
the oppositedirection, fromthe plannerto the BBS,to make
completethe picture of the entire robot control architecture
that we have in mind. Not in the focus of the present paper, this descriptionjust consists of stating the basic principle, and we refer to workon the DD&P
control architecture
(Hertzberget al. 1998;Hertzberg&SchOnherr2001), which
elaborates on the approach.
The basic idea is this: An action planner continually
maintainsa current action plan, based on the current situation and the current set of user-providedor self-generated
mission goals. Basedon the current plan and the current
situation, an execution componentpicks one of the operators in the plan as the one currently to be executed. Plan
executionis donein a plans-as-advice(Pollack 1992)fashion: Executingan operator meansstimulating moreor less
strongly the behaviorsworkingin favor of the operator, and
muting those workingagainst its purpose. Whichoperator
stimulates or muteswhichbehaviorsis an informationthat
the domainmodelerhas to provide along with the domain
modelfor the deliberative component
and the set of behaviors for the BBS.
Technically,the influence of the current operator is "injected" into the BBSin terms of the Operator-CouplingTerms(OCT)in the activation functions, see Eqs. 2. The
influence of the current groundoperatorop gets inside every
behavior b through the term OCTb,as follows:
OCT,= ~ °~’ °p op
(12)
8b e b (Z; -- O~b)
opEOP
where
87,Z:Pe {0,1}andc7isp a consmt.
87= 1
iff op influences the behaviorb. c~ modelsthe immediacy
or delay of the operator influence on the behavior. Zb°p expresses whetherthe operator influence is of the stimulating
or the mutingsort: If Z~~ = 1, then the respective behavior
is stimulated, and mutedif Z:P = 0. Z maybe a boolean
function, returning 0 or 1 conditionally.
Togive an example,assumethat the domainmodelfor the
deliberation componentincludes an operator GO-IN-RM(~)
modelingthe action of someoffice delivery robot to go
(from whereverit is) to and enter roomx. Let the behavior inventorybe the one specified in See. 3. Hereis a selec-
lion of s, Z, and e variables of these behaviorsand howthey
should be affected by the groundoperator GO-IN-RM(A)
c,O-~.ai(A) set to 1 as the operatordoes influencethe be8GoThmDoo
r
havior;
GO-~-aM(A)
CGoThruOoor
set to somemediumvalue, causing a tendency
to influenceactivation soonafter the operatoris chosenas
being active;
ZGC,~-~M(A)
oThruDoor
set to charFct(CloseTo(A)),i.e., the characteristic function that returns 1 if CloseTo(A)
is currently
in the fact base, and0 else;
C~-~(A)
8Wander
set to 1 as the operator should affect its activation (namely,mutingit);
c~-m~(a)
SAvoldCo, set to 0 as collision avoidanceshould not be
affected by the operator (note the difference betweennot
affecting and actively mutingan activation value)
5 FromBBSsto plans: Extracting facts from
activation values
Wenowturn to the methodhowto extract facts from activation value histories. It is influencedby previousworkon
chronicle recognition, such as (Ghallab1996).
Tostart, take anotherlookat the activationcurvesin Fig. 3
in See. 3. Someirregular activation time series occur due to
the dynamicsof the robot/environmentinteraction, such as
early in the AvoidColl and Wanderbehaviors. However,
certain patterns re-occur for single behaviorswithin intervals of time, such as a value being moreor less constantly
high or low, and values going up from low to high or vice
versa. Theidea to extract symbolicfacts fromactivation values is to consider characteristic groupsor gestalts of such
qualitative activation features occurringin chronicles over
time.
Tomakethis precise, wedefine, first, qualitative activation values (or briefly, qualitative activations) describing
these isolated patterns. In this paper, we consider four of
them, which are sufficient for defining and demonstrating
the principle, namely,rising/falling edge, high and low, symbolized by predicates to, ~e, Hi, and Lo, respectively. In
general, there maybe morequalitative activations of interest, such as a value staying in a mediumrange over some
period of time. For a behaviorb and time interval It1, t2],
they are defineda~s
Hi(b)[tl,
t2] Lo(b)[tl, t2] =
to(b)[h, tg.]
ab [t] > h for all tl _~t ___~’ t’~
ab[t] <_l for all tl _< t _<t2
ab[tl] = l and Ctb[tz] = h
(13)
and ab increases
generallymonotonicallyover[t1, t2]
~e(b) [tl, t2] = ab[tl] = h and ab[t2] = l (14)
and ab decreases
generallymonotonicallyover[t1, t2]
for given threshold values 0 << h < 1 and 0 < l << 1, where
ab[t] denotes the value ofab at time t. Generalmonotonicity requires anothertechnical definition, whichweskip here
13
for brevity. The idea is that somedegree of noise should
be allowedin, e.g., an increasing edge, makingthe increase
locally non-monotonic.
In the rather benignexampleactivation curvesin this paper,regular monotonicity
suffices. Similarly, it is not alwaysreasonableto use the global constants
h, 1 as Hi and Lothresholds, respectively. It is possible to
use different threshold constants or thresholding functions
for different behaviors. Wedo not go into that here. Then,
it makessense to require a minimum
duration for [tl, *a] to
preventuseless mini intervals of Hi and Lo types from being
identified. Finally, the strict equalities in Eq.s 13 and14 are
unrealistic in real robot applications, wheretworeal numbers must be compared,which are seldomstrictly equal.
Equality-l-e is the solutionof choicehere.
Thekey idea to extract facts fromactivation histories is
to considerpatterns of qualitative activations of several behaviors that occur within the sameinterval of time. Wecall
these patterns activation gestalts. Weexpress themformally
by a time-dependentpredicate AGover a set Q of qualitative activations of potentially manybehaviors. For a time
interval [t, t’] the truth of AG(Q)[t,t’] is definedas the conjunction of conditions on the component
qualitative activations q E Qof behaviorsb. Onepossibility of defining them
is in the followingway:
case:
q = Hi(b)
q = Lo(b)
q = l~e(b)
q = #e(b)
then:
Hi(b)[t,t’]
Lo(b)[t,
t’]
i~e(b)[tl, t2]
for someIt1, tz] C_[t, t’],
andLo(b)[t, hi and Hi(b)[t2,
~e(b)[tx, t~]
for some
[tl, tg] C_It, t’],
andHi(b)[t, tl] andLo(b)[t2,
Notethat it is not required that different rising or falling
edges in Q start or end synchronouslyamongeach other or
at the interval borders of It, g]---they only mustall occur
somewhere
within that interval.
For example, AG({~e(GoThruDoor),~e(TurnToDoor),
Hi(CloseToDoor)})
is true over [20,24] in the activation
histories in Fig. 3; it is also true over [16, 23] (and therefore, also overtheir union[16, 24]), but not over[16, 25], as
CloseToDoor
has left its Hi band by time 25, and possibly
the samefor GoThruDoor,
dependingon the concrete value
of the h threshold.
A chronicle over someinterval of time [t0,t] is a set of
activation gestalts oversub-intervalsof [to, t] witha finite
set of n linearly ordered internal interval boundarypoints
to <tl < ... < tn < t. A groundfact is extracted fromthe
activation history of a BBSas true (or rather, as evident, see
the discussionbelow)at time t if its defining chroniclehas
been observedover someinterval of time ending at t. The
defining chronicle must be provided by the domainmodeler,
of course.
Wegive as an examplethe defining chronicle of the fact
InFioomthat the robot is in someroom,such as the one left
of the wall in Fig. 1. Inlqoom[t]is extractedif the following
defningchronicleis true withinthe interval It0, t], wherethe
tl are existentiallyquantifiedt:
AG({~e(GoThruDoor)
} )[t4, t]
A AG({l~’e(GoTrO.),
~e(Turn2D.),
Hi(Close2O.)})[t3,
A AG({Hi(TurnToDoor,
Lo(InCorridor)})[tz,Ls]
A AG({’fi’e(Turn2O.),
#e(Close2D.),
Se(InCor.)})
A AG({Hi(InCorridor)})[to,
A AG({Lo(TimeOut)})[to,
(15)
Assumingreasonable settings of the Hi and ko thresholds
h,l, the following substitutions of the time variables to
time-points yield the mappinginto the activation histories
in Fig. 3: L = 28 (right outside the figure), to = 3, tl
12,tz = 16,t3 = 20,t4 = 24. As a result, we extract
InRoom[24].
This substitution is not unique. For example, postponing to until 5 or having tl earlier at 9 wouldalso
work. This point leads to the process of chronicle recognition: given a workingBBS,permanentlyproducing activation values, howare the given defining chroniclesof facts
checkedagainst that activation value data stream to determinewhethersomefact starts to hold?
The obviousbasis for doing this is to keep track of the
qualitative activations as they emerge.That means,for every
behavior, there is a processloggingpermanentlythe qualitative activations. For those of type Hi and/o, the sufficiently
long time periods of the respective behavioractivation above
and belowthe h, I thresholds, resp., haveto be recordedand,
if adjacentto the current time point, appropriatelyextended.
Thiswouldlead automaticallyto identifying qualitative activationsof types Hi andLowiththeir earliest start point, such
as to = 3 for Hi(InCorridor)in the exampleabove. Qualitative activationsof types#e and~e are loggediff their definitions (eqs. 13 and14, resp.) are fulfilled in the recent history
of activation values. Asthis loggingprocessis local to every
behavior, the complexityis linear O(B)in the numberB of
behaviors.
Qualitative activation logs are then permanentlyanalyzed
whetherany of the existing definingchroniclesare fulfilled,
whichmayrun in parallel to the ongoingprocess of logging
the qualitative activations. Anonline version of this analysis inspired by (Ghallab 1996) wouldattempt to match the
flowof qualitative activations withall definingchroniclesc
by meansof matchingfronts that jumpalong c’s internal interval boundarypoints t~ and try to bind the next time point
ti+l as current matchingfront suchthat the recent qualitative
activationsfit all sub-intervalsof c that endin ti+ t. Notethat
morethan one matchingfront maybe active in every defining chronicle at any time. A matchingfront in c vanishes if
it reachesthe endpoint t (the definingchronicleis true),
else whilestuck at tl is caughtup by anothermatchingfront
at ti, or else an activation gestalt overan interval endingat
ti+z is no longer valid in the current qualitative activation
history.
For complexityconsiderations, assumethat (7 defining
chronicles are defined that involve a maximum
of N - 1
internal interval boundarypoints. Assume
further that A is
tTofit columnspace,somebehaviornamesare abbreviated.
14
the maximalnumberof qualitative activations occurring in
single activation gestalt conjunctsof all definingchronicles.
(A is boundedby the numberB of behaviors.) Then, one
cycle of the online matchingof defining chronicles runs in
O(ACN) time.
Practically, the necessarycomputationmaybe focusedby
specifying for each defining chronicle a trigger condition,
i.e., one of the qualitative activationsin the definition that
is used to start a monitoringprocessof the validity of all
activation gestalts. For example,in the InRoom
definition
above, ~e(GoThruDoor),
as occurringin the Its, t4] interval,
might be used. Note that the trigger condition need not be
part of the earliest activation gestalts in the definition. On
appearanceof sometrigger conditionin the qualitative activation log, wetry to matchthe activation gestalts prior to
the trigger withqualitativeactivationsin the log file, and,if
successful, verify the gestalts after the trigger conditionin
the qualitative activations as they are beinglogged.
Figure4: Robotexample2: final state and activation curves.
See text for explanations.
To give an examplewherethe derivation of the InRoom
fact fails, considerFig. 4. Thescenariois like before, i.e.,
the robot starts at the lowerfight comer,driving upwardand
trying to enter any doorlarge enough.Differentto Fig. 1, no
obstacleis presentat the beginning,andwhilethe robottries
to enter into the detected door, another robot comesfrom
within the roomand blocks it. Fig.4 right showsthe scene
after the robot has failed to enter the door, and left are the
respectiveactivation values.
Like before, while InCorriclor
(to = 3), the
CloseToDoor and TurnToDoor activations rise with
InCorridor falling (tz = 9, tz = 13); then, TurnToDoor
is Hi, while InCorridor
is Lo (ts = 20). But
then, mischief strikes. After the long high period of
TurnToDoor,TimeOutjumps up, terminating its Lo period, before GoThruDoor
has risen. In consequence, t4
cannot be bound, and the fact InRoomnot extracted. If
#e(GoThruDoor)
wasused as a trigger condition in the first
place, then no unnecessarymatchingeffort was wasted.
Somemore general remarks are in place here. Defining chroniclesexclusivelyin terms of activation gestalts is
a special cause that wehaveused in this paperto keepmatters focused. In general, the obviousother elementsmaybe
used for defining them:sensor readings at sometime points
(be they physicalsensors or sensor filters), and the validity
of symbolicfacts at a time point or over sometime interval. Ourintention is to provide fact extraction fromactivation values as a mainsource of information, not the exclusive one. That type of information can be addedto the
logical format of a chronicle definition as in (15). For example, if the exact time point of entering a roomwith the
robot’s front is desired as the starting point of the InRoom
fact, then this might be determinedby the time within the
interval It4, t] (i.e., within the decreaseof the GoThruDoor
activation) wheresomesensor senses open space to the left
and right again. As another example, assumethat the fact
At(DoorA)for the door to someroomA maybe in the fact
base (as derived from a normallocalization process). Then
At(DoorA
)[t4] could be addedto the defining chronicle (15)
aboveto derive not only InRoom[t],but morespecifically
InRoom(A)[t].
The fact extraction technique does not presumeor guarantee anything about the consistency of the facts that get
derived over time. Achievingand maintainingconsistency,
and determining the ramifications of newly emergedfacts
remainissues that go beyondfact extraction. Pragmatically,
we wouldnot recommend
to blindly add a fact as true to the
fact baseas soonas its definingchroniclehas beenobserved.
A consequentof a recognized defining chronicle should be
interpreted as evidencefor the fact or as a fact hypothesis,
which should be addedto the robot’s knowledgebase only
by a more comprehensiveknowledgebase update process,
whichmayeven reject the hypothesisin case of conflicting
information. Apossible solutions wouldbe to add someintegrity constraints to the defining chronicles. However,
this
is not withinthe scopeof this paper.
6 Discussion
A physical agent’s perception categories must to somedegree be in harmonywith its actuator capabilities--at least
in purposivelydesignedtechnical artifacts such as working
autonomousrobots. 2 Our approach of extracting symbolic
facts from behavioractivation merelyexploits this harmony
for intertwiningcontrol on a symbolicand a reactive level of
a hybridrobot control architecture.
Thetechnical basis for the exploitationare time series of
behavioractivation values. Wehave taken themfrom a special type of behavior-basedrobot control systems (BBSs),
namely,those consisting of behaviors expressedby nonlincar dynamicalfunctions of a particular form,as describedin
Sec. 2. Thepoint of havingactivation values in BBSsis not
new;it is also the case, e.g., for the behavior-basedfuzzy
control part underlying Saphira (Konolige et al. 1997),
wherethe activation values are used for context-dependent
blendingof behavioroutputs, whichis similar to their use
in our BBSframework.Activation values also provide the
degree of applicability of the correspondingmotorschemas
in (Arkin 1998, p. 141).
2Wedonot speculateaboutbiologicalagentsin this paper,althoughwewouldconjecturethat natural selection and parsimony
stronglyfavorthis principle.
15
The activation values of a dynamical system-type BBS
are well-suitedfor fact extraction in that their formalbackgroundin dynamicalsystemstheory provides both the motivation and the mathematicalinventory to makethemchange
smoothlyover time----compare,e.g., the curves in Figures3
and4 with the ragged ones in (Saffiotti, Ruspini, &Konolige 1999, Fig. 5.10). This typical smoothnessis handyfor
defining qualitative activations, whichaggregateparticular
patterns in termsof edgesandlevels of the curvesof individual behaviors, whichare recordedas they emergeover time.
Thesethen serve as a stable basis for chronicle recognition
over qualitative activations of several behaviors.Note,however, that this smoothness
is a practical rather than a theoretical issue, and other BBSapproachesmayserve as bases for
fact extractionfromactivation values.
Wewantto emphasizethat the activation values serve two
purposesin our ease: first, their normaloneto providea reliable BBS,and second, to deliver the basis for extracting
persistent facts, basedon their distinctive patterns. Withthe
seconduse, we save the domainmodelera significant part
of the burdenof designinga complicatedsensor interpretation schemeonly for deriving facts. The behavior activation curves, as a by-productcomingfor free of the behaviorbased robot control, focus on the environmentdynamics,be
it inducedby the robot itself or externally. Byconstruction,
these curvesaggregatethe available sensor data in a waythat
is particularly relevant for robot action. Wehavearguedthat
this informationcan be used as a mainsource of information
about the environment;other information, such as coming
from raw sensor data, fromdedicated sensor interpretation
processes, or from available symbolicknowledge,could be
used in addition.
Asactivation values are present in a BBSanyway,it is
possible to "plug-in" the fact extraction machinefor a deliberative component
to an already existing behaviorsystem
like the DDcontrol systemin (Bredenfeldet al. 1999).Yet,
if a newrobot control systemis about to be written for a
newapplication area, things couldbe donebetter, within the
degrees of freedomfor variations in behavior and domain
modeldesign. The ideal case is that the behavior inventory and the fact set is in harmonyin the sense that such
facts get used in the domainmodelwhosemomentary
validity engravesitself in the activation value history, and such
behaviorsget used that produceactivation values producing
evidencefor facts. For example,a single WallFollow
behavior workingfor walls on the right andon the left, maybe satisfactory fromthe viewpointof behaviordesign for a given
robot application; for fact extraction, it maybe moreopportune to split it into FollowLeftWalland FollowRightWall,
whichwouldbe equally feasible for the behavior control,
but allowsmoretargeted facts to be deduceddirectly.
Apart from such design-level interdependencies, which
are non-trivial, but not special for our approach, we are
aimingat a control architecture with a deliberative and a
behavior-basedpart as twoabreast moduleswith no hierarchy, as sketchedin (Hertzberget al. 1998).Thefact extraction schemeleaves the possibility to un-plugthe deliberative
part fromthe robot control, whichwethink is essential for
robustness of the wholerobot system.
Our technique is complementary
to anchoring symbolsto
sensor data as describedin (Coradeschi&Saffiotti 2001).
differs fromthat line of workin twomainrespects. First,
weuse sensor data a,s aggregatedin activation value histories only, not rawsensor data. Second,weaimat extracting
groundfacts rather than establishing a correspondencebetweenpercepts and references to physical objects. Thelimit
of our approachis that it is inherently robot-centeredin the
sense that we can only arrive at informationthat has to do
directly with the robot action. Theadvantageis that, due to
its specificity, it is conceptuallyand algorithmicallysimpler
than symbolanchoringin general.
7 Conclusion
Wehave presented a newapproach for extracting information aboutsymbolicfacts fromactivation curvesin behaviorbased robot control systems. Updatingthe symbolic environmentsituation is a crucial issue in hybrid robot control
architectures in order to bring to bear the reasoningcapabilities of the deliberativecontrol part on the physicalrobot
action as exerted by the reactive part. Unlikestandard approachesto sensing the environmentin robotics, we are using the information hidden in the temporal developmentof
the data, rather than their momentary
values. Therefore,our
methodpromises to yield environmentinformation that is
complementary
to normalsensor interpretation techniques,
whichcan and should be used in addition.
Wehavepresentedthe techniquein principle as well as in
terms of selected demoexamplesin a robot simulator, which
has allowedto judge the approachfeasible and to design the
respective algorithms. The computationalcomplexityof the
recognition process is in O(BCN),where B is the number
of behaviors, C the numberof chronicle definitions, and N
the maximal"length" of a chronicle definition in terms of
intermediatetimepoints internal to the chronicledefinition.
The approachwill be applied in the context of the hybrid
robot control architecture DD&P
(Hertzberg et al. 1998)
to generate an important part of the information that is
used to update the symbolic world modelfrom the sensor
data stream. Weare implementingan Add-On-Toolfor the
DDDesigner
(Bredenfeld et al. 1999) which allows systematic analyze and test of our ideas and experimentswith additional types of qualitative activations.
Acknowledgements
Thanksto HerbertJaeger, the GMD
AiS.BE-Team,
and especially
Hans-Ulrich
Kobialka
for their supporton this work.
References
Arkin, R. 1998. Behavior-BasedRobotics. MITPress.
Bonasso, P.; Firby, I.; Gat, E.; Kortenkamp,D.; Miller,
D.; and Slack, M. 1997. Experienceswith an architecture
for intelligent, reactive agents.J. Expt.Theor.Artif. Intell.
9:237-256.
Bredenfeld, A.; Gthring, W.; Ganter, H.; Jaeger, H.; Kobialka, H.-U.;P16ger,P.-G.; Schtll, P.; Siegberg,A.; Streit,
A.; Verbeek,C.; and Wilberg, L 1999. Behaviorengineering with dual dynamicsmodelsand design tools. In Proc.
3rd lnt. Workshopon RoboCupat IJCAI-99, 57-62.
16
Bredenfeld, A. 2000. Integration and evolution of modelbased tool prototypes. In 11th IEEEInternational Workshop on Rapid SystemPrototyping, 142-147.
Coradesehi,S., and Saffiotti, A. 2001. Perceptual anchoring of symbolsfor action. In Proc.of the 17th IJCAIConf.,
407--412.
Fox, D.; Burgard, W.; and Thrun, S. 1999. Markovlocalization for mobilerobots in dynamicenvironments.J.
Artif. Intell. Research
(JAIR)11.
Ghallab, M. 1996. Onchronicles: Representation, on-line
recognition and learning. In Aiello; Doyle;and Shapiro.,
eds., Proc. Principles of KnowledgeRepresentation and
Reasoning (KR-96), 597-606. Morgan-Kauffman.
I-Iarnad, S. 1990. The symbolgroundingproblem.Physica
D 42:335-346.
Hertzberg, J., and Schtnherr, E 2001. Concurrencyin the
DD~Probot control architecture. In Sebaaly, M. E, ed.,
Proc. of The Int. NAISO
Congr.on InformationScience Innovations (ISI’2001), 1079-1085.ICSCAcademicPress.
Hertzberg, J.; Jaeger, H.; Zimmer,U.; and Morignot,P.
1998. A frameworkfor plan execution in behavior-based
robots. In Proc. of the 1998IEEEInt. Symp.on Intell.
Control (ISIC-98), 8-13.
Jaeger, H., and Christaller, T. 1997. Dual dynamics:Designing behavior systems for autonomousrobots. In Fujimura, S., and Sugisaka, M., eds., Proc. Int. Symposium
on Artificial Life andRobotics(AROB’97),
76--79.
Koehler, J.; Nebel, B.; Hoffmann,I.; and Dimopoulos,Y.
1997. Extending planning graphs to an ADLsubset. In
Steel, S., and Alami,R., eds., RecentAdvancesin AI Planning. ECP-97,273-285. Springer (LNAI1348).
Konolige,K.; Myers, K.; Saffiotti, A.; and Ruspini, E.
1997. The Saphira architecture: A design for autonomy.
J. Expt.Theor.Artif. Intell. 9:215-235.
Matarit, M.J. 1999. Behavior-basedrobotics. In Wilson,
R. A., and Keil, E C., eds., MITEncyclopedia
of Cognitive
Sciences. The MITPress. 74-77.
Muscettola,N.; Nayak,P.; Pell, B.; and Williams,B. 1998.
RemoteAgent: to boldly go where no AI system has gone
before. J. Artificial Intelligence103:5--47.
Pollack, M.1992. Theuses of plans. Artif. Intl. 57(1):4368.
RemoteAgent Project. 2000. Remoteagent experiment
validation, http://rax.arc.nasa.gov/DS1-Tech-report.pdf.
Saffiotti, A.; Ruspini, E.; and Konolige, K. 1999. Using fuzzy logic for mobilerobot control. In Zimmermann,
H.-J., ed., Practical Applications of FuzzyTechnologies.
Kluwer Academic. chapter 5, 185-205. Handbook of
FuzzySets, vol.6.
Sch0nherr, E; Cistelecan, M.; Hertzberg, J.; and
Christaller, T. 2001. Extracting situation facts fromactivation value histories in behavior-basedrobots. In Baader,
F.; Brewka,G.; and Eiter, T., eds., KI-2001:24th German
9th Austrian Conf. on AI, volume2174of LNALSpringer.