Perception-Action Coupling via Imitation and Attention

From: AAAI Technical Report FS-01-01. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved.
Perception-Action Couplingvia Imitation and Attention
GeorgeMaistros, Yuval Marom
and Gillian Hayes
Institute of Perception, Action,and Behaviour,
Division of Informatics, University of Edinburgh,
5 Forrest Hill, Edinburgh, EH12QL,UK
{georgem,yuvalm, gmh}@dai.ed.ac.uk
Abstract
next twosections describe these sub-systems,and also provide their biological and psychologicalinspirations, respectively.
The paper then proceeds with a section describing how
the current workintegrates these sub-systemsinto a single
systemcapable of anchoringaffordances to perceptual information. Then,the followingsections describe the implementationof the systemon twodifferent platforms. Finally,
a concludingsection provides a discussion of current and
future work.
When
weinteract withobjects, wegive themmeaning,
i.e. weshowwhatthey axepotentiallyuseful for. We
believethat physicalentities are anchored
to perceptual
representations,andthroughthemto the actions that
they’afford’. Thispaperbringsan imitationmechanism
andan attentionsystemtogethercomputationally,
with
the aimof havinga systemthat is capableof creating
and maintainingthese anchors.Theintegratedsystem
is implemented
on twodifferent platforms:a simulated
humanoid
robot learning fromanotherhowto drink a
glass of beer, and a simulatedmobilerobot learning
fromanotherhowto followwalls.
Schema Network
NeurophysiologicalModel
Neurophysiologieal experiments in the brain of macaque
monkeys
suggest that neuronsin area F5 (the rostral part of
inferior area 6), namelycanonicalandmirror neurons,have
both visual and motorproperties (Rizzolatti et al. 1988).
PETstudies illustrate the presenceof neuronswith similar
properties in the humanbrain as well. Further, it is believed
that these neurons mayformthe fundamentalbasis for imitation in primates(Rizzolatti, Fogassi,&Gallese2000).
Single neuronstudies by Gallese et al. (1996) and Rizzolatti et al. (1996) exploredfurther the properties of
neurons and exposeda strong relationship betweenperception and motorcontrol. For instance, mirror neurons fire
both whenthe monkeyperforms an action (motor stimulus)
and whenit observes another monkeyor the experimenter
performthat sameaction (perceptual stimulus). Similarly,
canonical neurons discharge both when the monkeyperforms an action, say A, and whenthe monkeymerely observes a 3Dobject that affords action A.
The real attractiveness of these neurons reveals itself
throughtheir functional role. Mostliterature agrees on the
views summarised
by Rizzolatti, Fogassi, &Gallese (2000),
wheremirror neurons (and F5 neurons in general) perform
the necessaryvisuomotortransformationsto offer a ’vocabulary’ of potential actions (or ’ideas’ on howto (re)act).
also think of these potential actions as affordances.
Introduction
Gibson(1966) claimed that our perception of the world
expressedin terms of our interactions with it. He introduced
the conceptof affordancesof objects that represent the actions that one mayapply to those objects. Theseactions are
the ones that the object affords. Amug,for example,affords
varioushandgrasps, as a chair affords sitting.
Wesuggest that perception coupled together with action forms a correspondencebetweenthe perceived world
and the interactions with it. This correspondencecan offer robots the necessary frameworkwith which to ground
perceptual informationto their ownactions, and even use it
to ’understand’the observedinteractions of others with objects.
Our approachis to anchor sensory data of physical entitles throughperceptualclassifications to the actions that
they ’afford’. Thisis achievedthrougha perceptualattention
systemthat matchesand classifies input, and a perceptionaction coupling mechanismthat is responsible for the acquisition of motorskills throughimitation. In other words,
we believe that the integration of these twosystemsoffers
appropriate motorskills onto whichone can groundthe perceptual instances of the world.
There are therefore two main modulesthat makeup the
system described in this paper, resulting from work undertaken separately: (1) schema network that ha ndles
perception-actioncoupling, and (2) an attention system. The
Implementation
Inspired by these views we have developed an imitation
mechanismthat employs Arbib’s SchemaTheory (Arbib
1981) in the form of a schemanetworkthat expresses this
Copyright(~) 2001,American
Associationfor Artificial Intelligence(www.aaai.org).
All rights reserved.
52
.......................................
ii..................
nil
Figure 1: A schema network: several perceptual-motor
schemapairs send candidate motor commandsto the competition modulewhichsend the winnerto the motorsystem
for execution.
very role of the mirror neurons. This role together with the
notion of affordances has been successfully implemented
by
Fagg&Arbib (1998), while the mechanismspresent in this
workare inspired by and resemble the work by Demiris &
Hayes(1999) and Matari6 et aL (1998).
A schemanetworkis formedusing pairs of perceptual and
motorschemas(see Figure 1). Perceptual schemasrepresent
perceptualstructures of an action (e.g. the joint-angles of an
observed demonstrator), and motor schemasrepresent the
motorskills for that action. The motorskills are expressed
in termsof internal targets (e.g joint-angle targets) that are
to be achievedfor that particular skill.
All perceptual modulesreceive perceptual input and compare themwith their stored structures to producea confidence measure. This confidence, if high enough, is used
to activate the motorschemas.Then, active motorschemas
convert their internal targets into candidatemotorcommands
using proprioception feedback and a forward model. Given
a robot’s current state and internal targets a forwardmodel
suggests the motorcommands
that best achieve those targets
(here we are using a simplified version of the forwardmodel
used by Demiris & Hayes 1999). These candidate commandscompetein the competition moduleand the one with
the highest confidence(given from the perceptual schemas)
winsand is sent to the motorsystemfor execution.
As such, whenthe schemanetwork receives perceptual
input of a familiar behaviourit triggers in sequencethose
motorskills that imitate that behaviour.Alternatively, an
unfamiliarbehaviourresults in either no or verylittle motor
skill triggering, thus producingpoor imitation. For a more
detailed description of the schemanetworkand its implementation see Maistros &Hayes(2001).
Figure2 illustrates a problemof this implementation.
Perceptual and motorschemasare given to the networka priori
so that it can imitate familiar behaviours(i.e. the onesrepresented by its schemas).
For example, the schemanetwork of Figure 2 has been
manuallybuilt such that it accommodates
three behaviours:
behaviours1 and 2, whichare quite similar to one another
(drinking a glass of beer), and behaviour3 whichis rather
different (movinga glass over a table). Eachbehaviour
considered as a long sequenceof joint-angles. These sequencesare arbitrarily divided into a numberof parts and
stored into perceptual and motorschemas.Behaviours1 and
53
time
Figure 2: The output of the schemanetwork(see text for
explanation).
2 are divided into a common
approachpart (approachingthe
glass to be grasped)and another four sequential parts each.
Similarly,behaviour3 is dividedinto four sequentialparts.
Eachsuch part is stored in one perceptual and one motor
schemaforming a perceptual-motor schemapair. The perceptual schemaholds a sequenceof joint-angles that are to
be recognised, and the motorsehemaholds the sequenceof
joint-angle internal targets that are to be achieved. Hence
a schemanetworkof a total of 13 pairs is formedwhere
behaviour1 is perceptually and motorically represented by
perceptual and motor schemas1-2-3-4-5 in sequence, behaviour 2 by 1-6-7-8-9, and behaviour 3 by 10-11-12-13.
In the episodeof Figure 2, the perceptual input given is
that of behaviour 2, and hence motor schemas1-6-7-8--9
are expectedto be activated in sequenee.However,one observes that, at times 300-600,schemas2 and 6 are alternate
winners. As it turns out, schemas2 and 6 are quite similar
to each other (grasping the glass) and hencestrongly competitive. Thereis a redundancy
in the representation.
The problemarises from the arbitrary break-downof the
behaviour into perceptual and motor schemas, as deemed
appropriateby the designer.
The workpresentedlater in this paper is intendedto address someof these weaknesses by employing an attention system whichreplaces perceptual schemaswith clusters of perceptual experiences. Theagent builds up perceptual structures fromexperience, rather than receiving them
a priori fromthe designer. This reducesthe chanceof storing duplicate schemasand also ensures that only significant
experiencesare represented in the network.
Attention
System
Psychological Model
In our approach to attention, we are mainly inspired by
the Habituation Hypothesis of Selective Attention (Cowan
1988), whichclaims that rather than attention being a simple filtering mechanism
that selects certain inputs and disposes of others, it is a systemthat continuallyinspects all
its input channels, comparesthemto descriptions stored in
higher
level
processing
Oo°n~
memo~
update
bottom-up
dishabimadon
orientation
Memory
dvmion &
".. "-." : comparisons /
]
r
Stimulus
Figure 3: The attention model. Units in memory
respond to
a stimulus, and this determineswhetherthe stimulus is attendedto, due to noveltyor forgetting (see text), or ignored.
Attendingto a stimulus (the orientation response) can result in further, higher-levelprocessing,and modificationsin
memory.
memory,
habituates to familiar stimuli, and has the ability to
dishabituate if needed.
The role of habituation is to inhibit what is knownas
the Orienting Response, which is a combination of neural, physiological, and behavioural changesthat an organism undergoeswhena novel or significant stimulus is detected (Sokolov 1963; Kahneman
1973). Whatis interesting
is howthe orienting responseis reinstated (dishabituation),
and this can occur due to a numberof factors (Balkenius
2000). Our implementationinvolves two of the main factors: the presentationof a novelstimulus, andthe passageof
time(forgetting).
Accordingto the model,depicted in Figure 3, the activation and comparisonsof structures in memory
occur out,side
of attention, in a passive, automaticmanner.Whenthe orienting responseis reinstated (due to noveltyor forgetting),
the informationgoes throughto the focus of attention, which
can cause the creation of newstructures in memory
(see Figure 3). Further, the informationfromthe focus of attention
can go throughto higher-level processing,such as learning.
ComputationalSystem
Wehave implemented the model discussed above using
a variation of the Self Organising Feature Map(SOFM),
wherestructures growfrom experience as required, rather
than being specified a priori. The SOFM
has been suecessfully used in the past in modellingrobotic sensory input (Nehmzow
1999). It attempts to cover the sensory input
space with a networkof nodes, and edges connectingneighbouring nodes are determinedby somedistance measure,in
our case a Euclideandistance. This has the effect of preserv-
54
ing the topologyinherent in the space.
Wehave adopted and suited to our purposes an algorithm developedby Marsland, Nehmzow,
&Shapiro (2001),
whichincorporates notions of habituation, novelty detection, and forgetting. It reflects howperceptual information
activates stored structures for comparison;howstructures
are added, updated, and deleted; and howstructures habituate to familiarstimuli, but are able to reinstatethe orientation
responseif required.
Themainasset of this algorithmis that it keepsa habituation measurefor each node in the SOFM.
This measuregives
an indicationof familiarity, i.e. the frequencyof that node’s
activation, whichprovides a useful heuristic. Eachtime a
nodeis active, its habituationvalue decreasesexponentially.
The nodes and edges of the SOFM
form the memorypan
of Figure 3, wherethe dotted arrowsdenote the comparisons
betweenthe incomingstimulus and the stored structures,
and the node in bold font denotes the ’activated’, winning
node, i.e. the one that best matchesthe input. Oneof three
possiblecases followsthis activation:
1. If the matchis not goodenoughnovelty is detected. The
orientation responseis initiated, whichcorrespondsto the
creation of a newnode (denoted by a ’dashed’ node in
Figure3).
2. If the matchis goodthe activated nodeand its neighbours
are modifiedby a constantfraction to matchthe input even
better, and they are also habituated.
3. If the winningnodeis fully habituated,it doesnot respond
to any moreinput, until dishabituation is forced through
the passage of a fixed time interval. The informationis
re-attendedto until the nodefully habituatesagain.
To summarise,the system handles attention as follows.
Nodesin the networkrespondand habituate to their respective stimuli. Whenfully habituated, nodes ignore further
stimulation and hencedo not get updated. Theorienting responseis reinstated either due to novelty detection, whena
newnode is created, or due to forgetting, whena node is
dishabituatedw in both situations the stimulus is (re-) attendedto.
For moreinformation on the general algorithm see Marsland, Nehmzow,
&Shapiro (2001), and for our implementation see Marom& Hayes(2001).
The Integrated System
The two systems described in the previous sections, namely
the schemanetworkand the attention system, were developed separately in on-going work. The workpresented in
this paper brings the two sub-systemstogether computationally, and the aimis to havea systemthat is capableof creating and maintainingrepresentations of perceptual experienceswithrespect to the actions that they afford.
In other words,the systemgroundsperceptual instances to
the actions that they afford. Thepart of the systemresponsible for grounding(through attention) is the SOFM,
and the
pan responsiblefor the acquisitionof actions, or motorskills
(through imitation) is the schemanetwork.A schematicdiagramof the systemis shownin Figure 4.
SOFM
Stimulul
¯
Feedback
Figure 4: The integrated system: the SOFM
handles the
stimulus and finds the best matchingnodeat each time; its
hard-wired motor schema calculates the motor commands
before sendingthemto the motorsystemfor execution.
The attention system (SOFM)replaces the perceptual
schemasof Figure1. Thus, instead of arbitrarily categorising the input space, we nowuse attention for this purpose.
The SOFM
takes input directly from the sensors and processes themas described in the previous section. For each
node in the SOFM
there is a single motorschemahard-wired
to iLt whereinternal targets are to be stored.
At the learning phase both the SOFMand the motor
schemasare built in response to the input. Since during
training the nodes in the SOFM
movein the input space,
we create and modifymotor schemasonly whena node is
fully habituated, and therefore settled. Thesemodifications
dependon the habituation value of the winningnode:
1. If the winningnode is not fully habituated, the correspondingmotorschemais left intact.
2. If the winningnodeis fully habituatedand the perceptual
input is part of the samelearningepisode,the current target is appendedto the sequenceof targets (if any) of the
corresponding motor schema.
3. If the winningnodeis fully habituated and the perceptual
input is part of a newlearning episode,all targets of the
correspondingmotor schemaare deleted and overwritten
by the current target.
Dueto forgetting (dishabituation), mentionedin the "Attention System"section, a nodecan re-train itself and its
motor schema, to adapt to changes in the environment, or
to unsuccessful previous trainings. Wehave not yet developed sophisticated methodsfor modifying motor schemas
(for re-training); currently weare using the simpleand crude
methoddescribed above.
To summarise,an episode is recorded in a motorschema
that correspondsto a fully habituated SOFM
node, and this
episodeis overwrittenby future episodes.
The emergentlevel of granularity in the motorschemas
is governedby the sensitivity of the SOFM
nodes, which
in turn is controlled by the thresholdusedfor detecting novelty. Thisis the classical over-generalisation(too fewnodes)
versus over-fitting (too manynodes)issue.
tThepossibility of hard-wiringmoremotorschemasto a perceptualcategorisation
is beingconsidered.
55
Figure 5: The simulation platform: the imitator (righ0 is
imitating the demonstrator(left).
At the recall phase we use the SOFM
and motor schemas
created in the learning phase to recall the appropriate motor skills required at particular perceptual states. At each
step the input activates the SOFM
node that best matches
it, and the motorschemaassociated with that nodeis then
responsible for producing motor commands.Weare interested in testing howwell the systemcan reproducethe
learned behaviour, and morespecifically in the dynamicsof
pereeptual-motoractivations.
Wehavetested the integrated systemon twodifferent platforms,both in simulation,and they are presentedin the followingtwo sections. Weare also currently testing the system on a physical robot (a Real WorldInterface B21robo0.
Experiment1: Postural Imitation
Setup
The experimentspresented in this section simulate the dynamicsof two eleven degrees of freedomsimulated robots
(waist upwards,see Figure 5): a demonstratorand an imitator. Eachrobot has three degrees of freedomat the neck,
three at each shoulder, and one at each elbow.
Theimitator has explicit access to the joint-angles andinstantaneousvelocities of the demonstratoralong with object
information(visual perception), and to its ownangles and
velocities (proprioception),after noise is addedontothem.
At the learning phase the imitator observes the demonstrator performan action, andtries to learn it. At the recall
phase, the imitator observes the demonstratorperformthat
action again, and tries to reproduceit, i.e. imitate it. The
action involves’pickingup’ a glass off a table, ’drinking’its
contents, and’putting’ it backon the table.
Notethat, due to softwarelimitations, the robotsdo not interact withobjects, althoughobjects are represented, recognised, and virtually myet effortlessly -- moved(hence the
visual absenceof the glass in Figure5).
Imitation,here, is usedin the recall phaseto explicitly expose the imitator to perceptual cues that maytrigger appropriate motorschemasto reproducethe learned action. In this
experiment, the motorschemasstore joint-angles as internal targets whichare converted into motorcommands
via a
Proportional-Integral-Derivative(PID)controller (a forward
model).Notice that these motorskills correspondto affordancesonto whichperceptual informationis to be grounded.
0.81
0,75l
0.75[
0.7
m
,-0.4
-0.6
Pdnclpat Component
1
-0.2
0.45!
-i
-0.9
--0.8 .-0.7 -0.6 -0.5
principal Component
1
(a)
(b)
-0.4
-0.3
500
10’00 "~500
Time
20’oo
25oo
(c)
Figure6: Experiment
1: (a) the perceptualdata exposedto at the learning phase, projectedonto the first 2 principal components;
(b) the SOFM
producedby the attention systemfor these data; (c) the SOFM
nodeactivation at the recall phase.
Learning Phase
Further, in part (i), as the handapproachesthe glass, the
’velocity’ tends to ’zero’ (it decelerates), whereasin part
(iii), as the handmovesawayfromthe glass, the ’velocity’
increases from’zero’ (it accelerates).
Note howeverthat a single episode ends whenthe demonstrator puts the glass backonthe table, i.e. parts (i) and(ii)
have been completed. Repeating the demonstrationepisode
first takes the hand off and awayfrom the glass, back towardsthe starting location (i.e. part (iii)) andthen through
parts (i) and(ii). Nevertheless,
part (iii) is clusteredand
sociated with appropriatemotorskills.
Indeed, in Figure 6(b) we observe the attention system
clusters these three parts with enoughnodes, with the only
exceptionof the right handpart of the central loop. This
is due to the small numberof data points in that area. Further workwill considerthe issue of whetherit is desirable
to sampleat a faster rate whenfast movements
(as the one
mentionedabove)are to be imitated.
Figure 6(a) showsthe perceptual data that the imitator observes at the learning phase, while Figure 6(b) showsthe
resulting SOFM
network. Since the dimensionality of the
input space is quite high (28), we have used a dimensionality reduction technique called Principal Component
Analysis (PCA), for display and analysis purposes. PCAfinds
the moststatistically significant dimensions,called Principal Components,
in a multivariate dataset (see Afifi &Clark
(1996) for more information on PCA).
Figure 6(a) is the projection of the perceptual data onto
z Similarly,
the first 2 principal componentsfoundby PCA.
Figure 6(b) is the projection of the SOFM
networkonto the
first 2 principal components.Interpreting the PCAplots is
not straightforwardbecauseit is hardto inspect the principal
components
and trace themback to the original dimensions,
especially whenthe original dimensionalityis quite high.
However,from our knowledgeof what the learner should
be perceiving, we can conjecture whatthe different parts of
the plots mean.
It seems that the first Principal Component
expresses
a generalised velocity (’velocity’) of the right hand jointangles, while the second expresses a generalised height
(’height’) of the right handend point or wrist. Noticethat
the first PCis ’zero’ around-0.7, while the secondis at
’table height’ around0.4.
Wecan distinguish three shapes in Figure 6(a): one fairly
straight curveat the bottomleft of the figure (goingleft to
centre), a central loop in the middle(going clockwise),
anotherfairly straight curveat the bottomright (goingcentre
to right). In fact, this does correspondto the three major
parts of the "drinkinga glass of beer" action: (i) approach
the glass; (ii) pick up, drink, and put down;(iii) move
fromthe glass.
Recall Phase
In the recall phase, while the demonstratorperformsthe action, the SOFM
receives continuousperceptual input, finds
the nodethat best matchesthe input, and activates the corresponding motorschema. That schemarecalls the current
internal target state and calculates the motorcommands
to
achievethat state usingits PIDcontroller.
Figure 6(c) shows the SOFM
activation of a single
episodeat the recall phase,i.e. the sequenceof nodesthat are
activated in responseto the input. Weobservethat the nodes
in the clusters, created at the learningphase(Figure 6(b)),
are activated in sequenceat the recall phase. The sequence
6--3-4-8 controls the imitator fromthe starting posture, to
’grasp’ the glass. Similarly, motorschemas2-1-0 control
the imitator to ’rift’ the glass and ’move’it towar&sthe
mouth, and schemas 0-9-7 from the mouth back towards
the table. As mentionedabove, a single episode ends when
the glass is put backon the table; whichexplainsthe absence
of winning nodes 5-10-11.
Figure6(c) merelydemonstratesthe correct activation dynamics of the SOFM
nodes (the winningnode at each time
step). However,we expect that motor schemashard-wired
eIt’s importantto checkhowmuchof the varianceis actually
accounted
for by the first 2 principalcomponents,
andthis is given
bythe sumof their eigenvalues.
Thisgivesus an indicationof how
representative
a 2-dimensional
plot is of the true complexity
in the
data. In this ease, 2 dimensions
accountfor approximately
78%of
the total variance.
56
1,5-
Mouth
1.
¯
0.5-
ii
i
0~-O.5-
Experiment 2: Imitation by Following
Setup
The experimentspresented in this section were performed
using a Kheperamobile robot simulator with a learner
agent followingbehind a teacher agent (see Figure 8). The
learner robot perceivesinformationthrough6 infra-red sensors aroundits front. It uses its built-in followingbehaviour
to keep behind the demonstrator, as the demonstratorexecutes the task, whichin this case is wall-following.
Weregard this as ’imitation’ in the sense that information
is implicitly shared betweenthe demonstratorand learner
about a specific task to be learned. Theobject-interactions
in this case correspondto howthe robot responds to being
near a wall: the affordancesof the wall and of the no-wall.
In this experimentthe forwardmodelis not as straight forwardas in Experiment1 wherethe PIE) provides an intuitive
forward model. As mentioned earlier, the motor schemas
providemotorskills expressedas a sequenceof internal targets. The forward modelshould suggest to the robot howto
attain a particular target state, given the current state, and
the possibleactions availableto the robot.
Welet the robot build up this forward modelfrom experience before we start the experiment. The robot wanders around the environment on its own, with an added
obstacle-avoidance skill, and collects information about
state-transitions. The transition matrix is then stored and
used later as a forwardmodel.
i I
i
......
,,e
-1.5-
- ..
Depth
" . ....
-......’
0 -1.5
Width
Figure 7: The trajectories of the fight handwrists of the
demonstrator(bold font) and the imitator (normal font)
a single episodein the recall phase.
I
!
I..
@
°
!
Figure 8: The Kheperamobilerobot simulator.
to those nodes will generate appropriate motor commands
to efficiently imitate the demonstratedbehaviour.
Onthe other hand, since there is no easy or standardised
wayto evaluateimitation, ’efficient imitation’ is hardto define (Matari62000). For the purposesof this t~sk, though,
we consider the position of the hand(s) over time relative
to the position of the demonstrator’shand(s)sufficient as
indicationof ’efficient imitation’.
Learning phase
Figure9(a) showsthe perceptualdata that the learner is exposedto as it follows the demonstratoraroundthe environment, and Figure 9(b) showsthe correspondingnetworkproducedby the attention system. As before, since the dimensionality of the sensorspaceis too highto visualise, wehave
used PCAto reduce the numberof dimensionsto two (from
six)?
For a simple task such as wall-following,we do not expect
manyperceptual clusters, and in fact wesee 3 mainclusters:
onecluster is the intersection of the 2 apparentlines in Figure 9(a), whichis the area of low(weak)wall-detection,i.e.
’no wall’, and as we moveawayfrom this intersection in
either direction, we see data correspondingto ’right wall’
and ’left wall’, at different distances fromthe wall, clustered in distinct areas. Theattention systemcaptures these
clusters by placing nodesin appropriate places, as shownin
Figure 9(b). Wetherefore expectthese clusters to afford the
appropriatemotorskills.
Recall phase
After the robot has clustered its perceptual space and ereated correspondingmotorschemasat the learning phase, it
is placed in the sameenvironmenton its ownfor the recall
phase. The default behaviour is ’wandering’, and at each
step it tries to activate one of the nodesin its SOFM.
The
winningnode recalls the targets stored in the corresponding
Figure 7 showsthe trajectories of the fight-handwrists of
both the demonstrator(in bold font) and the imitator (norreal font) in a single episodein the recall phase. Noticethat
the trajectories are close to each other, save for one part of
the action. This reflects the actual behaviourof the imitator
whereall sub-goals of the task are achievedwith a reasonable degree of accuracy. One possible wayto numerically
evaluate this degreeof accuracyis to calculate the area betweenthe curves; this is part of ongoingwork.
SThe2 dimensionsused in Figure9(a) accountfor approximately98%of the total variance.
57
IJ:[
o/
~-o.
[
04 ¯
-0.2
,i
"9-0.2
-0/~12
-0.1
0
0.1
PdnclpalC(~nponent
0.2
0.3
-0’-~I
,?
-0,1
or-0
O.t
P~ncipalComponent
1
(b)
(a)
0.2
0.3
100OO
2OOO0
Time
30000
(c)
Figure9: Experiment
2: (a) the perceptualdata exposedto at the learning phase, projectedonto the first 2 principal components;
(b) the SOFM
produeedby the attention systemfor these data; (e) the SOFM
nodeactivation at the recall phase.
motor schema, and these are passed to the forward model
whichselects the best action likely to achievethem.
Figure 9(e) showsthe SOFM
activation at the recall phase.
Firstly, wesee that the nodes,created at the learningphase,
that formclusters (Figure 9(b)), are also activated together
and intermittently at the recall phase. Secondly,we see an
emergentsequenceof activations as the robot movesaround
trying to recall the information from the motor schemas:
the activations of nodesfor wall perception on either side
are separatedby activations of nodesfor no-wallperception.
This reflects the actual behaviourof the learner: following
a right wall, then no wall, then followinga right wall again,
etc.
Notethat the activation dynamicsare different than in Experiment1 (Figure 6(e)), whichis inherent in what is being
modelled.In Experiment1 the behaviourbeing modelledby
the motorschemashas a true sense of sequencein it (moving
the hand towardsa glass, ’picking’ it up, etc.); each motor
schemastores a part of that sequence.
In this experiment,each motorschema,or rather cluster
of schemas,correspondsto being in a particular perceptual
’state’, and the motorskills are responsible for maintaining
that state (for examplefine-tuning to stay next to a wall on
the left). For this reason we do not expect to have a clear
winningnodesince the robot will keepadjusting itself by activating alternate nodeswithin the samecluster (whereeach
oneaffords different fine-tuning).
Visual inspection of the behaviourconfirmsthat the activation of the schemasdoes correspondto a behaviourclosely
resembling the one exhibited by the demonstrator. Weare
also developing a methodto numerically evaluate the recalled behaviourin terms of the actual task; we do this by
calculatingthe ’energy’that the robot acquiresfromthe wall
at particular configurationsfromit. At the end of the recall
episode we can look at the accumulatedenergy as a measure
of howwell the robot performsthe wall-followingtask.
Discussion & Future Work
Wehave presented on-going work on the integration of an
attention system and an imitation mechanism.The aim of
this integration is threefold: first, to classify our perceptual
58
experiencessuch that wedistinguish familiar fromnovel experiences;second,to be able to recall the right motorskills
at the right timeto achievefeasible (natural) motorcontrol
for robots; and third, to groundour perceptual experiences
to these motorskills.
To summarise,the overall systemneeds to be shownthe
perceptualinput togetherwith the action that it affords such
that it creates appropriateperceptual and motorstructures.
Further, a forwardmodelneedsto be available such that the
internal targets in the motorschemascan be convertedinto
actions.
Currently, perceptual and motorstructures are implicitly
eonneeted by a simple one-to-one relationship. Weare,
however, interested in ways to afford more than one action per perceptualclassification, as weare also considering
more sophisticated waysto update and maintain the motor
schemas.
This overall systemhas been implementedon two different platforms where we observe that the attention component correctly classifies the input space and creates perceptual nodes and motorschemas, and that the imitation componenttakes full advantageof these affordanees(to generate
behaviour).
Asmentioned
earlier, evaluating’efficient imitation’ is as
difficult a task as is challengingand it is part of on-going
workin the field. Evaluatingthis systemis therefore equally
difficult. However,as mentionedthroughoutthe paper, we
are currently workingon ways to measurenumerically the
overall performance
of the systemon a task-specific basis.
Whena eonspeeifie performsan action that weare familiar with, not only are weable to imitate that action, but we
are also able to ’understand’it, i.e. we’know’the perceptual
context that triggered that action together with its immediate consequences.This can be regarded as ’understanding’
in the sense that we can internally imitate an action, and
a higher-level systemthat overlooks its consequencesmay
havethe capacity to infer its goal(s) and/or purpose. Our
workis an attempt to encompassthese ideas in a computational framework.
References
Afifi, A. A., and Clark, V.A. 1996. Computer-Aided
Multivariate Analysis. London: Chapman&Hall. ISBN
041273060x.
Arbib, M.A.1981. Perceptual structures and distributed
motorcontrol. In Brooks,V. B., ed., Handbook
of Physiology - The Nervous System IL MotorControl. Amer.Physiol. Soe. 1449-1480.
Balkenius, C. 2000. Attention, habituation and conditioning: Towarda computational model. Cognitive Science
Quarterly1(2).
Cowan,N. 1988. Evolving conceptions of memorystorage, selective attention, and their mutualconstraints within
the humaninformation processing system. Psychological
Bulletin 104:163-191.
Demiris, J., and Hayes, G.M.1999. Active and passive
routes to imitation. In Proceedinssof the AISBsymposium
on imitation in animalsandartifacts. Edinburgh,Scotland,
UK.
Fagg, A. H., and Arbib, M. A. 1998. Modelingparietalpremotorinteractions in primate control of grasping. Neural Networks11(7/8): 1277-1303.
Gallese, V.; Fadiga, L.; Fogassi, L.; and Rizzolatti, G.
1996. Action recognition in the premotor cortex. Brain
119:593-609.
Gibson, J. J. 1966. The senses considered as perceptual
systems. HoughtonMifflin Company.
Kahneman,D. 1973. Attention and Effort. EnglewoodCliffs, NewJersey: Prentice Hall.
Maistros, G., and Hayes, G. M. 2001. Animitation mechanism for goal-directed actions. In Proceedingsof TIMR
2001 - TowardsIntelligent Mobile Robots. Manchester,
UK.
Marom,Y., and Hayes, G.M. 2001. Attention and social situatedness for skill acquisition. In Proceedingsof
the First International Workshopon Epigenetic Robotics:
ModelingCognitive Developmentin Robotic Systems.
Marsland,S.; Nehmzow,
U.; and Shapiro, J. 2001. Novelty
detection in large environments.In Nehmzow,
U., and Melhuish, C., eds., Proceedingsof TowardsIntelligent Mobile
Robots (TIMR)2001.
Matari6, M.J.; Williamson,M.; Demiris,J.; and Mohan,A.
1998. Behavior-based
primitives for articulated control. In
Pfeifer, R.; Blumberg,B.; Meyer,J.-A.; and Wilson,S. W.,
eds., Proceedings:FromAnimalsto Animats5, Fifth International Conferenceon Simulation of Adaptive Behavior
(SAB-98),165-170.Ziirieh, Switzerland.
Matari6, M. J. 2000. Getting humanoidsto moveand imitate. IEEEIntelligent Systems14:18-24.
Nehmzow,U. 1999. Mobile Robotics: A Practical Introduction. Springer Verlag. ISBN1-85233-173-9.
Rizzolatti, G.; Carmada,R.; Fogassi, L.; Gentilueci, M.;
Luppino, G.; and Matelli, M. 1988. Functional organization of inferior area 6 in macaquemonkey.2. Area F5
59
and the control of distal movements.ExperimentalBrain
Research71 (3):491-507.
Rizzolatti, G.; Fadiga, L.; Gallese, V.; and Fogassi, L.
1996. Premotorcortex and the recognition of motor actions. CognitiveBrain Research3(2): 131-141.
Rizzolatti, G.; Fogassi, L.; and Gallese, V. 2000. Cortical
mechanisms
subserving object grasping and action recognition: Anewviewon the cortical motorfunctions. In Gazzaniga, M., ed., The NewCognitive Neurosciences. Cambridge, MA:MITPress. 539-552.
Sokolov, E. 1963. A neuronal modelof the stimulus and
the orienting reflex. In Sokolov,E., ed., Perceptionandthe
ConditionedReflex. PergamonPress. 282-294.