
From: AAAI Technical Report FS-01-01. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved.
Grounding of Robots Behaviors
Louis Hugues and Alexis Drogoul
Université Pierre et Marie Curie, LIP6 - Paris - France
louis.hugues@lip6.fr, alexis.drogoul@lip6.fr
Abstract

This paper addresses the problem of learning robot behaviors in real environments. Robot behaviors have not only to be grounded in the physical world but also in the human space where they are supposed to take place. The paper briefly presents a learning model relying on teaching by demonstrations, enabling the user to transmit their intentions during real experiments. Important properties are outlined and the probabilistic learning process is described. Finally, we indicate how grounded behaviors could be interfaced to a symbolic level.
Introduction

It has been shown in numerous works that the behavior paradigm is a good way to specify the actions of robots in real environments. Once clearly identified and named, behaviors can be incorporated into complex action-selection schemes and can be referred to, symbolically, by deliberative processes at upper levels. Behaviors thus provide a convenient human-designer/robot interface. However, behaviors are still difficult to model and formulate in the real world. In realistic applications, robot behaviors have to be adapted both to the Physical space and to the Human space. In the Physical space, behaviors have to deal with sensor uncertainty and imprecision, with the incompleteness of real-world models, and with actions too peculiar to be easily defined and specified. In the Human space, they have to be adapted to our intentions and requirements, our habits, the way we organize our environment (signs, stairways, ...) and the importance we give to various contexts.
Behaviors have to be grounded both in the Physical and in the Human space, and neither explicit programming of behaviors nor optimization processes (genetic algorithms, reinforcement learning, ...) can alone provide a satisfying adaptation to these two spaces. Teaching behaviors by showing them to the robots makes it possible to face this problem. By showing them directly (i.e., using a remote-control device), robot behaviors are anchored from the beginning in the only space that matters: the space of the final user's intentions in their real physical environment. The learning process can thus take benefit of real experiences containing pertinent associations of perception and action.

Teaching by Showing

The works related to this view are Programming by Demonstration (Cypher 93), Imitative Learning (Demiris and Hayes 96) and supervised learning methods such as the one used in the ALVINN system (Pomerleau 93).

Figure 1: Three examples which may be used to shape the behavior "Exit by the door".

We propose a model for behavior learning by demonstration; more details can be found in (Hugues and Drogoul 2001). The approach insists on the following points:

• Shaping a behavior with a few examples: The user must be able to incrementally shape the behavior by showing a few examples. To show a behavior such as "Exit by the door" in figure 1, we would like to demonstrate it by showing only a few significant facets of the behavior.

• Finding a perceptual support for the behavior: A behavior needs a support in the perceptual field, some features to rely on for choosing the right actions. The features to be detected cannot be modeled a priori; they must be inferred from the demonstrations. In a way inspired by Gibson's theory of affordances (Gibson 86), the environment is viewed as directly providing a support for possible behaviors. This leads us to learn actions by relating them to the elements of the perceptual field at the lowest level.

• Context of the behavior: The behavior has to be independent of irrelevant features of the contexts of realization, and it must also be triggered only in valid contexts. The identification of contexts has to be tightly associated with the behavior description.
Behavior Synthesis

A synthetic behavior is represented by a large population of elementary cells. Each cell observes, along each example, a particular point in the perceptual field and records statistically the values of the effectors. Cells are activated by the detection of specific properties of the perceptual field (visual properties mainly) such as the color, density or gradient of small pixel regions. A cell records the arithmetic mean of the robot's effectors taken over its periods of activation; the importance of a particular cell depends on the number of its activations. During the reproduction of the behavior, the current values of the effectors are obtained by a weighted combination of the active cells.
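As a rough illustration of this structure, the following Python sketch shows how such a cell might be represented and how it could maintain the arithmetic mean of the effector values over its activation periods. The names (Cell, region, feature, frame_features) and the activation test are our own assumptions; the paper does not prescribe a particular implementation.

from dataclasses import dataclass, field

@dataclass
class Cell:
    region: tuple         # small pixel region of the perceptual field observed by the cell
    feature: str          # property that activates the cell, e.g. "color", "density" or "gradient"
    threshold: float      # activation threshold on that property (assumed)
    mean_effectors: list = field(default_factory=lambda: [0.0, 0.0])  # e.g. [translation, rotation]
    activations: int = 0  # importance of the cell = number of activations

    def is_active(self, frame_features) -> bool:
        # A cell fires when its property is detected in its region of the current frame.
        # frame_features is assumed to map (region, feature) pairs to measured values.
        return frame_features.get((self.region, self.feature), 0.0) > self.threshold

    def update(self, effectors):
        # Incremental update of the arithmetic mean of the effectors
        # over the periods during which the cell is active.
        self.activations += 1
        n = self.activations
        self.mean_effectors = [m + (e - m) / n for m, e in zip(self.mean_effectors, effectors)]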
The synthetic behavior also contains a context description formed by several histograms recording the distribution of perceptual properties (again color, density, etc.) over all the examples. The behavior synthesis is divided into four main steps: (1) individual synthesis of each example into a population of cells and a context structure; (2) fusion of the separate synthetic behaviors into a unique population; (3) real-time use of the cell structure; (4) eventually, on-line correction by the tutor.
Step 1: Examples capture. Each example is shown using a joystick; the robot's vision and movements are recorded. Cells are activated by the video frames of the example. The mean of the effector values is stored in each cell for the periods of activation of the cell. A context histogram is deduced from the perceptual properties observed along the example.

Step 2: Fusing examples. The final behavior is obtained by fusing the separate cell populations coming from several examples into a unique cell population. Context histograms are merged by multiplication and important features are highlighted.
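A minimal sketch of this fusion step, reusing the hypothetical Cell structure above, could look as follows. Merging cells that observe the same point by an activation-weighted average, and renormalising the multiplied histograms, are our assumptions rather than details given in the paper.

def fuse_cells(populations):
    # populations: one list of cells per demonstrated example.
    merged = {}
    for population in populations:
        for cell in population:
            key = (cell.region, cell.feature)
            if key not in merged:
                merged[key] = cell
            else:
                kept = merged[key]
                total = kept.activations + cell.activations
                # Activation-weighted average of the two means (assumed).
                kept.mean_effectors = [
                    (a * kept.activations + b * cell.activations) / total
                    for a, b in zip(kept.mean_effectors, cell.mean_effectors)
                ]
                kept.activations = total  # importance accumulates across examples
    return list(merged.values())

def merge_context_histograms(histograms):
    # Bin-wise multiplication highlights the features shared by all examples.
    merged = [1.0] * len(histograms[0])
    for histogram in histograms:
        merged = [m * b for m, b in zip(merged, histogram)]
    total = sum(merged) or 1.0
    return [b / total for b in merged]  # renormalisation (assumed)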
Figure 2: Collective reaction of the cells in two different situations of the "Exit by the door" behavior. (Top) Far from the door, the cells propose to go forward. (Bottom) Near the door, many cells vote to turn left.

Step 3: Real-time use. At replay time, the behavior can be activated if the context histogram matches the current video images. The robot is controlled in real time by deducing the effector values from those stored in the active cells (using a majority principle). The set of cells reacts like a population with a dominant opinion (the possible opinions being the possible values of the effectors). The video images only need to partially overlap the recorded data to produce correct control values.
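The sketch below gives one possible reading of this replay step: the active cells vote for effector values and the command is their importance-weighted combination. A strict majority rule over discretised effector values, as also suggested by the text, could be substituted.

def replay_step(cells, frame_features):
    # Cells whose perceptual property is present in the current image take part in the vote.
    active = [c for c in cells if c.is_active(frame_features)]
    if not active:
        return None  # no perceptual support in the current image
    total = sum(c.activations for c in active)
    n_effectors = len(active[0].mean_effectors)
    # Importance-weighted combination of the opinions of the active cells.
    return [
        sum(c.mean_effectors[i] * c.activations for c in active) / total
        for i in range(n_effectors)
    ]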
Step 4: On-line correction. The final behavior can be corrected on line by the tutor. The tutor corrects the robot's movements by using the joystick in contrary motion. Active cells are corrected proportionally to their importance.
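One way to implement this correction, under the assumption that each active cell is pulled towards the tutor's command with a strength proportional to its importance (the gain value is ours), is sketched below.

def correct_active_cells(active_cells, corrected_effectors, gain=0.1):
    # The tutor's contrary joystick motion yields corrected_effectors;
    # the most important (most often activated) cells are corrected the most.
    total = sum(c.activations for c in active_cells) or 1
    for cell in active_cells:
        weight = gain * cell.activations / total
        cell.mean_effectors = [
            m + weight * (e - m)
            for m, e in zip(cell.mean_effectors, corrected_effectors)
        ]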
Behavior Interface with the Symbolic Level

Modules operating at the symbolic level (action selection mainly) can interface with a grounded behavior by referring to its logical definition (what it is supposed to do in logical/symbolic terms) and to its associated context stimulus. The user can re-shape the behavior's real definition independently of the symbolic modules. The value of the stimulus corresponds to the quality of the matching of the context histogram and is used to trigger the behavior. The model thus provides support for programming at the high level and for grounding in real data at the behavior level.
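As an illustration, the stimulus could be computed as a histogram-matching score and compared to a trigger threshold by the action-selection modules. Histogram intersection and the threshold value are assumptions, since the paper does not fix the matching measure.

def context_stimulus(context_histogram, current_histogram):
    # Histogram intersection of two normalised histograms: 1.0 means a perfect match.
    return sum(min(a, b) for a, b in zip(context_histogram, current_histogram))

def maybe_trigger(behavior, current_histogram, threshold=0.6):
    # behavior is assumed to be a dict holding a symbolic name and a context histogram.
    stimulus = context_stimulus(behavior["context_histogram"], current_histogram)
    if stimulus > threshold:
        return behavior["name"], stimulus  # handed over to the action-selection level
    return None, stimulus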
Conclusion

This model is implemented on Pioneer 2DX robots equipped with a color camera. In a preliminary phase we have obtained promising results for simple behaviors such as object approach and door exit. Despite its simplicity, the population of cells encodes a large number of situations and is robust to noise. Few examples (< 10) are needed to synthesize this kind of behavior. We are currently extending the work in three directions: validation of context detection, validation over more complex behaviors (integrating time and perceptual aliasing), and integration of the behaviors into an action-selection scheme.
References

A. Cypher (1993) Watch What I Do: Programming by Demonstration. MIT Press, Cambridge, MA.

J. Demiris and G. M. Hayes (1996) Imitative Learning Mechanisms in Robots and Humans. In Proc. of the 5th European Workshop on Learning Robots, V. Klingspor (ed.).

J. J. Gibson (1986) The Ecological Approach to Visual Perception. LEA Publishers, London.

L. Hugues and A. Drogoul (2001) Robot Behavior Learning by Vision-Based Demonstrations. 4th European Workshop on Advanced Mobile Robots (EUROBOT'01).

D. A. Pomerleau (1993) Knowledge-based Training of Artificial Neural Networks for Autonomous Robot Driving. In J. Connell and S. Mahadevan (Eds.), Robot Learning (pp. 19-43).