Perception-Action Coupling via Imitation and Attention

From: AAAI Technical Report FS-01-01. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved. Perception-Action Couplingvia Imitation and Attention GeorgeMaistros, Yuval Marom and Gillian Hayes Institute of Perception, Action,and Behaviour, Division of Informatics, University of Edinburgh, 5 Forrest Hill, Edinburgh, EH12QL,UK {georgem,yuvalm, gmh}@dai.ed.ac.uk Abstract next twosections describe these sub-systems,and also provide their biological and psychologicalinspirations, respectively. The paper then proceeds with a section describing how the current workintegrates these sub-systemsinto a single systemcapable of anchoringaffordances to perceptual information. Then,the followingsections describe the implementationof the systemon twodifferent platforms. Finally, a concludingsection provides a discussion of current and future work. When weinteract withobjects, wegive themmeaning, i.e. weshowwhatthey axepotentiallyuseful for. We believethat physicalentities are anchored to perceptual representations,andthroughthemto the actions that they’afford’. Thispaperbringsan imitationmechanism andan attentionsystemtogethercomputationally, with the aimof havinga systemthat is capableof creating and maintainingthese anchors.Theintegratedsystem is implemented on twodifferent platforms:a simulated humanoid robot learning fromanotherhowto drink a glass of beer, and a simulatedmobilerobot learning fromanotherhowto followwalls. Schema Network NeurophysiologicalModel Neurophysiologieal experiments in the brain of macaque monkeys suggest that neuronsin area F5 (the rostral part of inferior area 6), namelycanonicalandmirror neurons,have both visual and motorproperties (Rizzolatti et al. 1988). PETstudies illustrate the presenceof neuronswith similar properties in the humanbrain as well. Further, it is believed that these neurons mayformthe fundamentalbasis for imitation in primates(Rizzolatti, Fogassi,&Gallese2000). Single neuronstudies by Gallese et al. (1996) and Rizzolatti et al. (1996) exploredfurther the properties of neurons and exposeda strong relationship betweenperception and motorcontrol. For instance, mirror neurons fire both whenthe monkeyperforms an action (motor stimulus) and whenit observes another monkeyor the experimenter performthat sameaction (perceptual stimulus). Similarly, canonical neurons discharge both when the monkeyperforms an action, say A, and whenthe monkeymerely observes a 3Dobject that affords action A. The real attractiveness of these neurons reveals itself throughtheir functional role. Mostliterature agrees on the views summarised by Rizzolatti, Fogassi, &Gallese (2000), wheremirror neurons (and F5 neurons in general) perform the necessaryvisuomotortransformationsto offer a ’vocabulary’ of potential actions (or ’ideas’ on howto (re)act). also think of these potential actions as affordances. Introduction Gibson(1966) claimed that our perception of the world expressedin terms of our interactions with it. He introduced the conceptof affordancesof objects that represent the actions that one mayapply to those objects. Theseactions are the ones that the object affords. Amug,for example,affords varioushandgrasps, as a chair affords sitting. Wesuggest that perception coupled together with action forms a correspondencebetweenthe perceived world and the interactions with it. This correspondencecan offer robots the necessary frameworkwith which to ground perceptual informationto their ownactions, and even use it to ’understand’the observedinteractions of others with objects. Our approachis to anchor sensory data of physical entitles throughperceptualclassifications to the actions that they ’afford’. Thisis achievedthrougha perceptualattention systemthat matchesand classifies input, and a perceptionaction coupling mechanismthat is responsible for the acquisition of motorskills throughimitation. In other words, we believe that the integration of these twosystemsoffers appropriate motorskills onto whichone can groundthe perceptual instances of the world. There are therefore two main modulesthat makeup the system described in this paper, resulting from work undertaken separately: (1) schema network that ha ndles perception-actioncoupling, and (2) an attention system. The Implementation Inspired by these views we have developed an imitation mechanismthat employs Arbib’s SchemaTheory (Arbib 1981) in the form of a schemanetworkthat expresses this Copyright(~) 2001,American Associationfor Artificial Intelligence(www.aaai.org). All rights reserved. 52 ....................................... ii.................. nil Figure 1: A schema network: several perceptual-motor schemapairs send candidate motor commandsto the competition modulewhichsend the winnerto the motorsystem for execution. very role of the mirror neurons. This role together with the notion of affordances has been successfully implemented by Fagg&Arbib (1998), while the mechanismspresent in this workare inspired by and resemble the work by Demiris & Hayes(1999) and Matari6 et aL (1998). A schemanetworkis formedusing pairs of perceptual and motorschemas(see Figure 1). Perceptual schemasrepresent perceptualstructures of an action (e.g. the joint-angles of an observed demonstrator), and motor schemasrepresent the motorskills for that action. The motorskills are expressed in termsof internal targets (e.g joint-angle targets) that are to be achievedfor that particular skill. All perceptual modulesreceive perceptual input and compare themwith their stored structures to producea confidence measure. This confidence, if high enough, is used to activate the motorschemas.Then, active motorschemas convert their internal targets into candidatemotorcommands using proprioception feedback and a forward model. Given a robot’s current state and internal targets a forwardmodel suggests the motorcommands that best achieve those targets (here we are using a simplified version of the forwardmodel used by Demiris & Hayes 1999). These candidate commandscompetein the competition moduleand the one with the highest confidence(given from the perceptual schemas) winsand is sent to the motorsystemfor execution. As such, whenthe schemanetwork receives perceptual input of a familiar behaviourit triggers in sequencethose motorskills that imitate that behaviour.Alternatively, an unfamiliarbehaviourresults in either no or verylittle motor skill triggering, thus producingpoor imitation. For a more detailed description of the schemanetworkand its implementation see Maistros &Hayes(2001). Figure2 illustrates a problemof this implementation. Perceptual and motorschemasare given to the networka priori so that it can imitate familiar behaviours(i.e. the onesrepresented by its schemas). For example, the schemanetwork of Figure 2 has been manuallybuilt such that it accommodates three behaviours: behaviours1 and 2, whichare quite similar to one another (drinking a glass of beer), and behaviour3 whichis rather different (movinga glass over a table). Eachbehaviour considered as a long sequenceof joint-angles. These sequencesare arbitrarily divided into a numberof parts and stored into perceptual and motorschemas.Behaviours1 and 53 time Figure 2: The output of the schemanetwork(see text for explanation). 2 are divided into a common approachpart (approachingthe glass to be grasped)and another four sequential parts each. Similarly,behaviour3 is dividedinto four sequentialparts. Eachsuch part is stored in one perceptual and one motor schemaforming a perceptual-motor schemapair. The perceptual schemaholds a sequenceof joint-angles that are to be recognised, and the motorsehemaholds the sequenceof joint-angle internal targets that are to be achieved. Hence a schemanetworkof a total of 13 pairs is formedwhere behaviour1 is perceptually and motorically represented by perceptual and motor schemas1-2-3-4-5 in sequence, behaviour 2 by 1-6-7-8-9, and behaviour 3 by 10-11-12-13. In the episodeof Figure 2, the perceptual input given is that of behaviour 2, and hence motor schemas1-6-7-8--9 are expectedto be activated in sequenee.However,one observes that, at times 300-600,schemas2 and 6 are alternate winners. As it turns out, schemas2 and 6 are quite similar to each other (grasping the glass) and hencestrongly competitive. Thereis a redundancy in the representation. The problemarises from the arbitrary break-downof the behaviour into perceptual and motor schemas, as deemed appropriateby the designer. The workpresentedlater in this paper is intendedto address someof these weaknesses by employing an attention system whichreplaces perceptual schemaswith clusters of perceptual experiences. Theagent builds up perceptual structures fromexperience, rather than receiving them a priori fromthe designer. This reducesthe chanceof storing duplicate schemasand also ensures that only significant experiencesare represented in the network. Attention System Psychological Model In our approach to attention, we are mainly inspired by the Habituation Hypothesis of Selective Attention (Cowan 1988), whichclaims that rather than attention being a simple filtering mechanism that selects certain inputs and disposes of others, it is a systemthat continuallyinspects all its input channels, comparesthemto descriptions stored in higher level processing Oo°n~ memo~ update bottom-up dishabimadon orientation Memory dvmion & ".. "-." : comparisons / ] r Stimulus Figure 3: The attention model. Units in memory respond to a stimulus, and this determineswhetherthe stimulus is attendedto, due to noveltyor forgetting (see text), or ignored. Attendingto a stimulus (the orientation response) can result in further, higher-levelprocessing,and modificationsin memory. memory, habituates to familiar stimuli, and has the ability to dishabituate if needed. The role of habituation is to inhibit what is knownas the Orienting Response, which is a combination of neural, physiological, and behavioural changesthat an organism undergoeswhena novel or significant stimulus is detected (Sokolov 1963; Kahneman 1973). Whatis interesting is howthe orienting responseis reinstated (dishabituation), and this can occur due to a numberof factors (Balkenius 2000). Our implementationinvolves two of the main factors: the presentationof a novelstimulus, andthe passageof time(forgetting). Accordingto the model,depicted in Figure 3, the activation and comparisonsof structures in memory occur out,side of attention, in a passive, automaticmanner.Whenthe orienting responseis reinstated (due to noveltyor forgetting), the informationgoes throughto the focus of attention, which can cause the creation of newstructures in memory (see Figure 3). Further, the informationfromthe focus of attention can go throughto higher-level processing,such as learning. ComputationalSystem Wehave implemented the model discussed above using a variation of the Self Organising Feature Map(SOFM), wherestructures growfrom experience as required, rather than being specified a priori. The SOFM has been suecessfully used in the past in modellingrobotic sensory input (Nehmzow 1999). It attempts to cover the sensory input space with a networkof nodes, and edges connectingneighbouring nodes are determinedby somedistance measure,in our case a Euclideandistance. This has the effect of preserv- 54 ing the topologyinherent in the space. Wehave adopted and suited to our purposes an algorithm developedby Marsland, Nehmzow, &Shapiro (2001), whichincorporates notions of habituation, novelty detection, and forgetting. It reflects howperceptual information activates stored structures for comparison;howstructures are added, updated, and deleted; and howstructures habituate to familiarstimuli, but are able to reinstatethe orientation responseif required. Themainasset of this algorithmis that it keepsa habituation measurefor each node in the SOFM. This measuregives an indicationof familiarity, i.e. the frequencyof that node’s activation, whichprovides a useful heuristic. Eachtime a nodeis active, its habituationvalue decreasesexponentially. The nodes and edges of the SOFM form the memorypan of Figure 3, wherethe dotted arrowsdenote the comparisons betweenthe incomingstimulus and the stored structures, and the node in bold font denotes the ’activated’, winning node, i.e. the one that best matchesthe input. Oneof three possiblecases followsthis activation: 1. If the matchis not goodenoughnovelty is detected. The orientation responseis initiated, whichcorrespondsto the creation of a newnode (denoted by a ’dashed’ node in Figure3). 2. If the matchis goodthe activated nodeand its neighbours are modifiedby a constantfraction to matchthe input even better, and they are also habituated. 3. If the winningnodeis fully habituated,it doesnot respond to any moreinput, until dishabituation is forced through the passage of a fixed time interval. The informationis re-attendedto until the nodefully habituatesagain. To summarise,the system handles attention as follows. Nodesin the networkrespondand habituate to their respective stimuli. Whenfully habituated, nodes ignore further stimulation and hencedo not get updated. Theorienting responseis reinstated either due to novelty detection, whena newnode is created, or due to forgetting, whena node is dishabituatedw in both situations the stimulus is (re-) attendedto. For moreinformation on the general algorithm see Marsland, Nehmzow, &Shapiro (2001), and for our implementation see Marom& Hayes(2001). The Integrated System The two systems described in the previous sections, namely the schemanetworkand the attention system, were developed separately in on-going work. The workpresented in this paper brings the two sub-systemstogether computationally, and the aimis to havea systemthat is capableof creating and maintainingrepresentations of perceptual experienceswithrespect to the actions that they afford. In other words,the systemgroundsperceptual instances to the actions that they afford. Thepart of the systemresponsible for grounding(through attention) is the SOFM, and the pan responsiblefor the acquisitionof actions, or motorskills (through imitation) is the schemanetwork.A schematicdiagramof the systemis shownin Figure 4. SOFM Stimulul ¯ Feedback Figure 4: The integrated system: the SOFM handles the stimulus and finds the best matchingnodeat each time; its hard-wired motor schema calculates the motor commands before sendingthemto the motorsystemfor execution. The attention system (SOFM)replaces the perceptual schemasof Figure1. Thus, instead of arbitrarily categorising the input space, we nowuse attention for this purpose. The SOFM takes input directly from the sensors and processes themas described in the previous section. For each node in the SOFM there is a single motorschemahard-wired to iLt whereinternal targets are to be stored. At the learning phase both the SOFMand the motor schemasare built in response to the input. Since during training the nodes in the SOFM movein the input space, we create and modifymotor schemasonly whena node is fully habituated, and therefore settled. Thesemodifications dependon the habituation value of the winningnode: 1. If the winningnode is not fully habituated, the correspondingmotorschemais left intact. 2. If the winningnodeis fully habituatedand the perceptual input is part of the samelearningepisode,the current target is appendedto the sequenceof targets (if any) of the corresponding motor schema. 3. If the winningnodeis fully habituated and the perceptual input is part of a newlearning episode,all targets of the correspondingmotor schemaare deleted and overwritten by the current target. Dueto forgetting (dishabituation), mentionedin the "Attention System"section, a nodecan re-train itself and its motor schema, to adapt to changes in the environment, or to unsuccessful previous trainings. Wehave not yet developed sophisticated methodsfor modifying motor schemas (for re-training); currently weare using the simpleand crude methoddescribed above. To summarise,an episode is recorded in a motorschema that correspondsto a fully habituated SOFM node, and this episodeis overwrittenby future episodes. The emergentlevel of granularity in the motorschemas is governedby the sensitivity of the SOFM nodes, which in turn is controlled by the thresholdusedfor detecting novelty. Thisis the classical over-generalisation(too fewnodes) versus over-fitting (too manynodes)issue. tThepossibility of hard-wiringmoremotorschemasto a perceptualcategorisation is beingconsidered. 55 Figure 5: The simulation platform: the imitator (righ0 is imitating the demonstrator(left). At the recall phase we use the SOFM and motor schemas created in the learning phase to recall the appropriate motor skills required at particular perceptual states. At each step the input activates the SOFM node that best matches it, and the motorschemaassociated with that nodeis then responsible for producing motor commands.Weare interested in testing howwell the systemcan reproducethe learned behaviour, and morespecifically in the dynamicsof pereeptual-motoractivations. Wehavetested the integrated systemon twodifferent platforms,both in simulation,and they are presentedin the followingtwo sections. Weare also currently testing the system on a physical robot (a Real WorldInterface B21robo0. Experiment1: Postural Imitation Setup The experimentspresented in this section simulate the dynamicsof two eleven degrees of freedomsimulated robots (waist upwards,see Figure 5): a demonstratorand an imitator. Eachrobot has three degrees of freedomat the neck, three at each shoulder, and one at each elbow. Theimitator has explicit access to the joint-angles andinstantaneousvelocities of the demonstratoralong with object information(visual perception), and to its ownangles and velocities (proprioception),after noise is addedontothem. At the learning phase the imitator observes the demonstrator performan action, andtries to learn it. At the recall phase, the imitator observes the demonstratorperformthat action again, and tries to reproduceit, i.e. imitate it. The action involves’pickingup’ a glass off a table, ’drinking’its contents, and’putting’ it backon the table. Notethat, due to softwarelimitations, the robotsdo not interact withobjects, althoughobjects are represented, recognised, and virtually myet effortlessly -- moved(hence the visual absenceof the glass in Figure5). Imitation,here, is usedin the recall phaseto explicitly expose the imitator to perceptual cues that maytrigger appropriate motorschemasto reproducethe learned action. In this experiment, the motorschemasstore joint-angles as internal targets whichare converted into motorcommands via a Proportional-Integral-Derivative(PID)controller (a forward model).Notice that these motorskills correspondto affordancesonto whichperceptual informationis to be grounded. 0.81 0,75l 0.75[ 0.7 m ,-0.4 -0.6 Pdnclpat Component 1 -0.2 0.45! -i -0.9 --0.8 .-0.7 -0.6 -0.5 principal Component 1 (a) (b) -0.4 -0.3 500 10’00 "~500 Time 20’oo 25oo (c) Figure6: Experiment 1: (a) the perceptualdata exposedto at the learning phase, projectedonto the first 2 principal components; (b) the SOFM producedby the attention systemfor these data; (c) the SOFM nodeactivation at the recall phase. Learning Phase Further, in part (i), as the handapproachesthe glass, the ’velocity’ tends to ’zero’ (it decelerates), whereasin part (iii), as the handmovesawayfromthe glass, the ’velocity’ increases from’zero’ (it accelerates). Note howeverthat a single episode ends whenthe demonstrator puts the glass backonthe table, i.e. parts (i) and(ii) have been completed. Repeating the demonstrationepisode first takes the hand off and awayfrom the glass, back towardsthe starting location (i.e. part (iii)) andthen through parts (i) and(ii). Nevertheless, part (iii) is clusteredand sociated with appropriatemotorskills. Indeed, in Figure 6(b) we observe the attention system clusters these three parts with enoughnodes, with the only exceptionof the right handpart of the central loop. This is due to the small numberof data points in that area. Further workwill considerthe issue of whetherit is desirable to sampleat a faster rate whenfast movements (as the one mentionedabove)are to be imitated. Figure 6(a) showsthe perceptual data that the imitator observes at the learning phase, while Figure 6(b) showsthe resulting SOFM network. Since the dimensionality of the input space is quite high (28), we have used a dimensionality reduction technique called Principal Component Analysis (PCA), for display and analysis purposes. PCAfinds the moststatistically significant dimensions,called Principal Components, in a multivariate dataset (see Afifi &Clark (1996) for more information on PCA). Figure 6(a) is the projection of the perceptual data onto z Similarly, the first 2 principal componentsfoundby PCA. Figure 6(b) is the projection of the SOFM networkonto the first 2 principal components.Interpreting the PCAplots is not straightforwardbecauseit is hardto inspect the principal components and trace themback to the original dimensions, especially whenthe original dimensionalityis quite high. However,from our knowledgeof what the learner should be perceiving, we can conjecture whatthe different parts of the plots mean. It seems that the first Principal Component expresses a generalised velocity (’velocity’) of the right hand jointangles, while the second expresses a generalised height (’height’) of the right handend point or wrist. Noticethat the first PCis ’zero’ around-0.7, while the secondis at ’table height’ around0.4. Wecan distinguish three shapes in Figure 6(a): one fairly straight curveat the bottomleft of the figure (goingleft to centre), a central loop in the middle(going clockwise), anotherfairly straight curveat the bottomright (goingcentre to right). In fact, this does correspondto the three major parts of the "drinkinga glass of beer" action: (i) approach the glass; (ii) pick up, drink, and put down;(iii) move fromthe glass. Recall Phase In the recall phase, while the demonstratorperformsthe action, the SOFM receives continuousperceptual input, finds the nodethat best matchesthe input, and activates the corresponding motorschema. That schemarecalls the current internal target state and calculates the motorcommands to achievethat state usingits PIDcontroller. Figure 6(c) shows the SOFM activation of a single episodeat the recall phase,i.e. the sequenceof nodesthat are activated in responseto the input. Weobservethat the nodes in the clusters, created at the learningphase(Figure 6(b)), are activated in sequenceat the recall phase. The sequence 6--3-4-8 controls the imitator fromthe starting posture, to ’grasp’ the glass. Similarly, motorschemas2-1-0 control the imitator to ’rift’ the glass and ’move’it towar&sthe mouth, and schemas 0-9-7 from the mouth back towards the table. As mentionedabove, a single episode ends when the glass is put backon the table; whichexplainsthe absence of winning nodes 5-10-11. Figure6(c) merelydemonstratesthe correct activation dynamics of the SOFM nodes (the winningnode at each time step). However,we expect that motor schemashard-wired eIt’s importantto checkhowmuchof the varianceis actually accounted for by the first 2 principalcomponents, andthis is given bythe sumof their eigenvalues. Thisgivesus an indicationof how representative a 2-dimensional plot is of the true complexity in the data. In this ease, 2 dimensions accountfor approximately 78%of the total variance. 56 1,5- Mouth 1. ¯ 0.5- ii i 0~-O.5- Experiment 2: Imitation by Following Setup The experimentspresented in this section were performed using a Kheperamobile robot simulator with a learner agent followingbehind a teacher agent (see Figure 8). The learner robot perceivesinformationthrough6 infra-red sensors aroundits front. It uses its built-in followingbehaviour to keep behind the demonstrator, as the demonstratorexecutes the task, whichin this case is wall-following. Weregard this as ’imitation’ in the sense that information is implicitly shared betweenthe demonstratorand learner about a specific task to be learned. Theobject-interactions in this case correspondto howthe robot responds to being near a wall: the affordancesof the wall and of the no-wall. In this experimentthe forwardmodelis not as straight forwardas in Experiment1 wherethe PIE) provides an intuitive forward model. As mentioned earlier, the motor schemas providemotorskills expressedas a sequenceof internal targets. The forward modelshould suggest to the robot howto attain a particular target state, given the current state, and the possibleactions availableto the robot. Welet the robot build up this forward modelfrom experience before we start the experiment. The robot wanders around the environment on its own, with an added obstacle-avoidance skill, and collects information about state-transitions. The transition matrix is then stored and used later as a forwardmodel. i I i ...... ,,e -1.5- - .. Depth " . .... -......’ 0 -1.5 Width Figure 7: The trajectories of the fight handwrists of the demonstrator(bold font) and the imitator (normal font) a single episodein the recall phase. I ! I.. @ ° ! Figure 8: The Kheperamobilerobot simulator. to those nodes will generate appropriate motor commands to efficiently imitate the demonstratedbehaviour. Onthe other hand, since there is no easy or standardised wayto evaluateimitation, ’efficient imitation’ is hardto define (Matari62000). For the purposesof this t~sk, though, we consider the position of the hand(s) over time relative to the position of the demonstrator’shand(s)sufficient as indicationof ’efficient imitation’. Learning phase Figure9(a) showsthe perceptualdata that the learner is exposedto as it follows the demonstratoraroundthe environment, and Figure 9(b) showsthe correspondingnetworkproducedby the attention system. As before, since the dimensionality of the sensorspaceis too highto visualise, wehave used PCAto reduce the numberof dimensionsto two (from six)? For a simple task such as wall-following,we do not expect manyperceptual clusters, and in fact wesee 3 mainclusters: onecluster is the intersection of the 2 apparentlines in Figure 9(a), whichis the area of low(weak)wall-detection,i.e. ’no wall’, and as we moveawayfrom this intersection in either direction, we see data correspondingto ’right wall’ and ’left wall’, at different distances fromthe wall, clustered in distinct areas. Theattention systemcaptures these clusters by placing nodesin appropriate places, as shownin Figure 9(b). Wetherefore expectthese clusters to afford the appropriatemotorskills. Recall phase After the robot has clustered its perceptual space and ereated correspondingmotorschemasat the learning phase, it is placed in the sameenvironmenton its ownfor the recall phase. The default behaviour is ’wandering’, and at each step it tries to activate one of the nodesin its SOFM. The winningnode recalls the targets stored in the corresponding Figure 7 showsthe trajectories of the fight-handwrists of both the demonstrator(in bold font) and the imitator (norreal font) in a single episodein the recall phase. Noticethat the trajectories are close to each other, save for one part of the action. This reflects the actual behaviourof the imitator whereall sub-goals of the task are achievedwith a reasonable degree of accuracy. One possible wayto numerically evaluate this degreeof accuracyis to calculate the area betweenthe curves; this is part of ongoingwork. SThe2 dimensionsused in Figure9(a) accountfor approximately98%of the total variance. 57 IJ:[ o/ ~-o. [ 04 ¯ -0.2 ,i "9-0.2 -0/~12 -0.1 0 0.1 PdnclpalC(~nponent 0.2 0.3 -0’-~I ,? -0,1 or-0 O.t P~ncipalComponent 1 (b) (a) 0.2 0.3 100OO 2OOO0 Time 30000 (c) Figure9: Experiment 2: (a) the perceptualdata exposedto at the learning phase, projectedonto the first 2 principal components; (b) the SOFM produeedby the attention systemfor these data; (e) the SOFM nodeactivation at the recall phase. motor schema, and these are passed to the forward model whichselects the best action likely to achievethem. Figure 9(e) showsthe SOFM activation at the recall phase. Firstly, wesee that the nodes,created at the learningphase, that formclusters (Figure 9(b)), are also activated together and intermittently at the recall phase. Secondly,we see an emergentsequenceof activations as the robot movesaround trying to recall the information from the motor schemas: the activations of nodesfor wall perception on either side are separatedby activations of nodesfor no-wallperception. This reflects the actual behaviourof the learner: following a right wall, then no wall, then followinga right wall again, etc. Notethat the activation dynamicsare different than in Experiment1 (Figure 6(e)), whichis inherent in what is being modelled.In Experiment1 the behaviourbeing modelledby the motorschemashas a true sense of sequencein it (moving the hand towardsa glass, ’picking’ it up, etc.); each motor schemastores a part of that sequence. In this experiment,each motorschema,or rather cluster of schemas,correspondsto being in a particular perceptual ’state’, and the motorskills are responsible for maintaining that state (for examplefine-tuning to stay next to a wall on the left). For this reason we do not expect to have a clear winningnodesince the robot will keepadjusting itself by activating alternate nodeswithin the samecluster (whereeach oneaffords different fine-tuning). Visual inspection of the behaviourconfirmsthat the activation of the schemasdoes correspondto a behaviourclosely resembling the one exhibited by the demonstrator. Weare also developing a methodto numerically evaluate the recalled behaviourin terms of the actual task; we do this by calculatingthe ’energy’that the robot acquiresfromthe wall at particular configurationsfromit. At the end of the recall episode we can look at the accumulatedenergy as a measure of howwell the robot performsthe wall-followingtask. Discussion & Future Work Wehave presented on-going work on the integration of an attention system and an imitation mechanism.The aim of this integration is threefold: first, to classify our perceptual 58 experiencessuch that wedistinguish familiar fromnovel experiences;second,to be able to recall the right motorskills at the right timeto achievefeasible (natural) motorcontrol for robots; and third, to groundour perceptual experiences to these motorskills. To summarise,the overall systemneeds to be shownthe perceptualinput togetherwith the action that it affords such that it creates appropriateperceptual and motorstructures. Further, a forwardmodelneedsto be available such that the internal targets in the motorschemascan be convertedinto actions. Currently, perceptual and motorstructures are implicitly eonneeted by a simple one-to-one relationship. Weare, however, interested in ways to afford more than one action per perceptualclassification, as weare also considering more sophisticated waysto update and maintain the motor schemas. This overall systemhas been implementedon two different platforms where we observe that the attention component correctly classifies the input space and creates perceptual nodes and motorschemas, and that the imitation componenttakes full advantageof these affordanees(to generate behaviour). Asmentioned earlier, evaluating’efficient imitation’ is as difficult a task as is challengingand it is part of on-going workin the field. Evaluatingthis systemis therefore equally difficult. However,as mentionedthroughoutthe paper, we are currently workingon ways to measurenumerically the overall performance of the systemon a task-specific basis. Whena eonspeeifie performsan action that weare familiar with, not only are weable to imitate that action, but we are also able to ’understand’it, i.e. we’know’the perceptual context that triggered that action together with its immediate consequences.This can be regarded as ’understanding’ in the sense that we can internally imitate an action, and a higher-level systemthat overlooks its consequencesmay havethe capacity to infer its goal(s) and/or purpose. Our workis an attempt to encompassthese ideas in a computational framework. References Afifi, A. A., and Clark, V.A. 1996. Computer-Aided Multivariate Analysis. London: Chapman&Hall. ISBN 041273060x. Arbib, M.A.1981. Perceptual structures and distributed motorcontrol. In Brooks,V. B., ed., Handbook of Physiology - The Nervous System IL MotorControl. Amer.Physiol. Soe. 1449-1480. Balkenius, C. 2000. Attention, habituation and conditioning: Towarda computational model. Cognitive Science Quarterly1(2). Cowan,N. 1988. Evolving conceptions of memorystorage, selective attention, and their mutualconstraints within the humaninformation processing system. Psychological Bulletin 104:163-191. Demiris, J., and Hayes, G.M.1999. Active and passive routes to imitation. In Proceedinssof the AISBsymposium on imitation in animalsandartifacts. Edinburgh,Scotland, UK. Fagg, A. H., and Arbib, M. A. 1998. Modelingparietalpremotorinteractions in primate control of grasping. Neural Networks11(7/8): 1277-1303. Gallese, V.; Fadiga, L.; Fogassi, L.; and Rizzolatti, G. 1996. Action recognition in the premotor cortex. Brain 119:593-609. Gibson, J. J. 1966. The senses considered as perceptual systems. HoughtonMifflin Company. Kahneman,D. 1973. Attention and Effort. EnglewoodCliffs, NewJersey: Prentice Hall. Maistros, G., and Hayes, G. M. 2001. Animitation mechanism for goal-directed actions. In Proceedingsof TIMR 2001 - TowardsIntelligent Mobile Robots. Manchester, UK. Marom,Y., and Hayes, G.M. 2001. Attention and social situatedness for skill acquisition. In Proceedingsof the First International Workshopon Epigenetic Robotics: ModelingCognitive Developmentin Robotic Systems. Marsland,S.; Nehmzow, U.; and Shapiro, J. 2001. Novelty detection in large environments.In Nehmzow, U., and Melhuish, C., eds., Proceedingsof TowardsIntelligent Mobile Robots (TIMR)2001. Matari6, M.J.; Williamson,M.; Demiris,J.; and Mohan,A. 1998. Behavior-based primitives for articulated control. In Pfeifer, R.; Blumberg,B.; Meyer,J.-A.; and Wilson,S. W., eds., Proceedings:FromAnimalsto Animats5, Fifth International Conferenceon Simulation of Adaptive Behavior (SAB-98),165-170.Ziirieh, Switzerland. Matari6, M. J. 2000. Getting humanoidsto moveand imitate. IEEEIntelligent Systems14:18-24. Nehmzow,U. 1999. Mobile Robotics: A Practical Introduction. Springer Verlag. ISBN1-85233-173-9. Rizzolatti, G.; Carmada,R.; Fogassi, L.; Gentilueci, M.; Luppino, G.; and Matelli, M. 1988. Functional organization of inferior area 6 in macaquemonkey.2. Area F5 59 and the control of distal movements.ExperimentalBrain Research71 (3):491-507. Rizzolatti, G.; Fadiga, L.; Gallese, V.; and Fogassi, L. 1996. Premotorcortex and the recognition of motor actions. CognitiveBrain Research3(2): 131-141. Rizzolatti, G.; Fogassi, L.; and Gallese, V. 2000. Cortical mechanisms subserving object grasping and action recognition: Anewviewon the cortical motorfunctions. In Gazzaniga, M., ed., The NewCognitive Neurosciences. Cambridge, MA:MITPress. 539-552. Sokolov, E. 1963. A neuronal modelof the stimulus and the orienting reflex. In Sokolov,E., ed., Perceptionandthe ConditionedReflex. PergamonPress. 282-294.

Perception-Action Coupling via Imitation and Attention

Related documents

Products

Support

Perception-Action Coupling via Imitation and Attention

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib