From: AAAI Technical Report FS-01-01. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved. Interpreting Symbols on Conceptual Spaces: the Case of Dynamic Scenarios AntonioChellat, MarcelloFrixione:~ and Salvatore Gagliot J’Dip. di Ingegneria Automaticae Informatica, Univ. of Palermoand CERE-CNR, Palermo,Italy :~Dip. di Scienzedella Comunicazione, Univ. of Salerno, Italy Abstract In (Chella,Frixione,&Gaglio1997;2000)weproposed a framework for the representationof visualknowledge, withparticularattentionto the analysisandthe representation of sceneswith movingobjects and people. Oneof our aimsis a principledintegrationof the models developed withinthe artificial vision community with the propositionalknowledge representationsystemsdevelopedwithinsymbolicAI. In this paperweshowhow the approachweadoptedfits well withthe representational choicesunderlyingoneof the mostpopularsymbolic formalisms usedin cognitiverobotics,namelythe situationcalculus. Ourmodelof visual perception: Anoverall view In (Chella, Frixione, & Gaglio 1997; 2000) we proposed theoretical frameworkfor the representation of knowledge extracted fromvisual data. Oneof our aimsis a principled integration of the approachesdevelopedwithin the artificial vision community,and the propositional systems developed within symbolic knowledgerepresentation in AI. It is our assumptionthat such an integration requires the introduction of a conceptuallevel of representation that is in some wayintermediate betweenthe processing of visual data and declarative, propositional representations. In this paper we argue that the conceptualrepresentation we adoptedfor representing motionis compatiblewith one of the most influential symbolicformalismsused in cognitive robotics, namely the situation calculus(Reiter 2001).In the rest of this section we summarizethe main assumptions underlying our model. Next section is devoted to a synthetic description of our conceptuallevel representation of motion.The third section showshowthe situation calculus can be mappedon the conceptual representation we adopted. The fourth section describes a wayto translate the symbolicrepresentation weadopted in (Chella, Frixione, &Gaglio2000) in the formalismof the situation calculus. Ashort conclusionfollows. On the one hand, the computer vision communityapproached the problem of the representation of dynamic scenes mainlyin terms of the construction of 3Dmodels, Copyright©2001, American Associationfor Artificial Intelligence(www.aaai.org). All rights reserved. 6O and of the recoveryof suitable motionparameters, possibly in the presenceof noise and occlusions. Onthe other hand, the KRcommunitydeveloped rich and expressive systems for representationof time, of actions and, in general, of dynamicsituations. Nevertheless,these twotraditions evolvedseparately and concentrated on different kinds of problems.The computer vision researchersimplicitly assumedthat the problemof visual representation ends with the 3Dreconstruction of moving scenes. The KRtradition in classical, symbolicAI usually underestimatedthe problemof groundingsymbolicrepresentations in the data comingfromsensors. It is our opinion that this state of affairs constitutes a strong limitation for the developmentof artificial autonomous systems(typically, robotic systems).It is certainly true that several aspects of the interaction betweenperception andbehaviourin artificial agentscanbe faced in a rather reactive fashion, with perceptionmoreor less directly coupled to action, without the mediationof complexinternal representations. However, we maintainthat this is not sufficient to modelall the relevant types of behaviour.For many kinds of complextasks, a moreflexible mediationbetween perceptionand action is required. In suchcases the coupling betweenperception and action is likely to be "knowledge driven", and the role of explicit formsof representation is crucial. Our modelcan be intended as a suggestionto provide artificial agents withthe latter of the above-mentioned functionalities. The existing attempts to integrate visual perception with propositional KRare mostly oriented towards natural languageinterpretation, with particular emphasison the aspects of man-machineinteraction. They face only in a marginalwaythe general aspects of representation of knowledge. In addition, the existing proposals often concernspecific domainsof application,such as car traffic (Nagel1994; Neumann1989), biomedical images (Tsotsos et al. 1980; Tsotsos 1985), sport scenarios (Herzog &Wazinski1994; Siskind 1994 5), simple assemblytasks (Kuniyoshi &Inoue 1993), visual surveillance (Buxton&Gong1995). Weassumethat a principled integration of the approaches of artificial vision and of symbolicKRrequires the introduction of a missinglink betweenthese two kinds of representation. In our approach,the role of such a link is playedby the notion of conceptual space (G~irdenfors2000). A con- Figure1: Thethree areas of representation,and the relations amongthem. bolic knowledge initially stored within the linguistic area, and by the information learned by a neural networkcomponent. The knowledgein the linguistic area plays a central role in determiningthe expectationsof the system, that in their turn drive the scanning of the data comingfrom the sensors. Whichhigh level information the systemextracts fromthe sensory data (or, in other worlds, the wayin which the systeminterprets the perceivedscenes) crucially depends on a top downprocesspartially directed by linguistic knowledge. The details of this process are described in (Chella, Frixione, &Gaglio 1997). Thepurpose of this paper is to showthat the conceptual representation we adopted for dynamicscenes (and that is describedin details in (Chella, Frixione, &Gaglio2000)) consistent with the representational choices underlyingone of the most popular symbolicformalismsused in cognitive robotics, namelythe situation calculus (Reiter 2001). this way, our dynamicconceptual spaces could offer a conceptualinterpretation for the situation calculus, whichcould help in anchoringpropositional representations to the perceptualactivities of a robotic system. ceptual space (CS) is a representation whereinformationis characterized in terms of a metric space definedby a number of cognitive dimensions.Thesedimensionsare independent from any specific languageof representation. Accordingto G~denfors,a CS acts as an intermediate representation betweensubconceptualknowledge(i.e., knowledgethat is not yet conceptually categorized), and symbolically organized knowledge. Accordingto this view, our architecture is organisedin three computational areas. Figure 1 schematically shows the relations amongthem. The subconceptualarea is concerned with the lowlevel processingof perceptual data coming from the sensors. The term subconceptualsuggests that here informationis not yet organisedin terms of conceptual structures and categories. In this perspective, our subconceptual area includes a 3Dmodelof the perceived scenes. Indeed, evenif such a kind of representation cannot be considered"lowlevel" fromthe point of viewof artificial vision, in our perspectiveit still remainsbelowthe level of conceptual categorisation. In the linguistic area, representation and processingare based on a propositional KRformalism(a logic-oriented representation language). In the conceptualarea, the data comingfrom the subconceptualarea are organised in conceptual categories, whichare still independentfromany linguistic characterisation. The symbolsin the linguistic area are groundedon sensory data by mappingthem on the representations in the conceptualarea. In a nutshell, the operation of our modelcan be summarisedas follows: it takes in input a set of imagescorrespondingto subsequentphases of the evolution of a certain dynamicscene, and it producesin output a declarative description of the scene, formulatedas a set of assertions written in a symbolic representation language. The generation of such a description is driven both by the "a priori", sym- Conceptual spaces for representing motion Representationsin the conceptualarea are couchedin terms of conceptual spaces (G~denfors2000). Conceptualspaces provide a principled wayfor relating high level, linguistic formalismson the one hand, with low level, unstructured representation of data on the other. A conceptualspace CS is a metric space whosedimensionsare in somewayrelated to the quantities processedin the subconceptualarea. Different cognitive tasks maypresupposedifferent conceptual spaces, and different conceptualspaces can be characterised by different dimensions. Examplesof dimensionsof a CS could be colour, pitch, mass, spatial co-ordinates, and so on. In somecases dimensionsare strictly related to sensory data; in other cases they are moreabstract in nature. Anyway,dimensionsdo not dependon any specific linguistic description. In this sense, conceptualspaces comebefore any symbolic-propositionalcharacterisation of cognitive phenomena. In particular, in this note, we take into accounta conceptualspace devotedto the representationof the motionof geometricshapes. Weuse the term knoxel to denote a point in a conceptual space. A knoxelis an epistemologicallyprimitive element at the considered level of analysis. For example, in (Chella, Frixione, &Gaglio 1997) we assumedthat, the case of static scenes, a knoxelcoincideswith a 3Dprimitive shape, characterised according to someconstructive solid geometry(CSG)schema. In particular, we adopted superquadrics (Pentland 1986; Solina &Bajcsy 1990) as suitable CSGschema.However,we do not assumethat this choice is mandatoryfor our proposal. Our approach could be reformulated by adopting different modelsof 3Drepresentation. Theentities representedin the linguistic area usually do not correspond to single knoxels. Weassumethat complex entities correspondto sets of knoxels. For example,in the case of a static scene, a complexshape correspondsto the set of knoxelsof its simpleconstituents. SENSORY DATA 61 In order to accountfor the perceptionof dynamicscenes, we choose to adopt an intrinsically dynamicconceptual space. It has been hypothesisedthat simple motionsare categorised in their wholeness,and not as sequencesof static frames. Accordingto this hypothesis, we define a dynamic conceptual space in such a way that every knoxel corresponds to a simple motion of a 3D primitive. In other words, we assume that simple motions of geometrically primitive shapes are our perceptual primitives for motion perception. As we choosesuperquadrics as geometricprimitives, a knoxelin our dynamicconceptualspace correspond to a simple motionof a superquadric. Of course, the decision of which kind of motion can be considered"simple"is not straightforward, and is strictly related to the problem of motion segmentation. (Mart Vaina 1982) adopted the term motion segmentto indicate such simple movements.According to their SMS(StateMotion-State) schema,a simple motionis characterised by the interval betweentwosubsequentoverall rest states. Suchrest states maybe instantaneous. Considera person moving an arm up and down. According to Marr and Vaina’s proposal, the upwardtrajectory of the forearm can be considereda simple motion,that is represented in CS by a single knoxel, say ka. Whenthe armreaches its vertical position, an instantaneousrest state occurs. Thesecondpart of the trajectory of the forearmis anothersimplemotionthat correspondsto a secondknoxel, say ka. The sameholds for the upperarm:the first part of its trajectory correspondsto a certain knoxelkb; the secondpart (after the rest state) correspondsto a further knoxelk~. In intrinsically dynamicconceptual spaces, a knoxel k corresponds to a generalised simple motion of a 3Dprimitive shape. By generalised we meanthat the motion can be decomposed in a set of componentsxi, each of themassociated with a degree of freedomof the movingprimitive shape. In other words, wehave: k = [Xl, x2 ..... Xn] Figure 2: A dynamicconceptual space. For our purposes,weare not interested in the representation of any possible motion,but only in a compactdescription of perceptuallyrelevant kinds of motion.In the domain we are facing here, the motionscorrespondingto each degree of freedomofa superquadriccan be viewedas the result of the superimpositionof the first low frequencyharmonics, accordingto the well-known Discrete Fourier Transform (DFT), see (Oppenheim&Shafer 1989) for an in-depth scription of the topic. Accordingto this approach,given a generic parameterof a superquadric,e.g., ax, let be ax(t) the function that returns for each instant t the corresponding value of ax. This function can be viewedas a superimpositionof a discrete numberof trigonometric functions and it can then be representedby a point in a discrete functional space, whosebasis functions are the trigonometric functions. By a suitable compositionof the time functions of all superquadricparameters,the overall motionof the superquadricis representedin its turn by a point in a discrete functional space. Weadopt the resulting functional space as the conceptual space for the representation of dynamic scenes. wheren is the numberof degrees of freedomof the moving superquadric.In this way,changesin shapeand size are also taken into account. Parametersof superquadrics are given with respect to an absolute reference system. As a consequence, also motion properties of knoxels is described in absolute terms. In turn, each motionxi correspondingto the i-th degreeof freedomcan be viewedas the result of the superimposition of a set of elementarymotionsfj: i i __ x,-S, xj:j J In this way,it is possible to define a set of basis functions )’j, in terms of whichany simple motioncan be expressed. Suchfunctions can be associated to the axes of the dynamicconceptual space as its dimensions. In this way, the dynamicCSresults in a functional space. The theory of function approximation offers different possibilities for the choice of basic motions: trigonometric functions, polynomial functions, wavelets, and so on. 62 The choice of DFTfor motion representation seems promisingfromout point of viewalso becauseFourier analysis has been proposedas a psychologicallyplausible model for representation of humanand animal motion (Gallistel 1980; Johansson 1973). See (Chella, Frixione, &Gaglio 2000)for greater details. Figure 2 is an evocative representation of a dynamicconceptual space. In the figure, each group of axes f~ correspondsto the i-th degreeof freedomof a simple shape; each axis j’j in a group fi corresponds to the j-th component pertaining to the i-th degree of freedom. In this way, each knoxel in a dynamicconceptual space represents a simplemotion,i.e. the motionof a simpleshape occurring within the interval betweentwo subsequentrest states. Wecall compositesimple motiona motionof a composite object (i.e. an object approximatedby morethan one superquadric). A compositesimple motionis represented in the CSby the set of knoxelscorrespondingto the motionsof its components. For example,the first part of the trajectory of the wholearm shownin figure 2 is represented as acom- posite motionmadeup by the knoxels ka (the motionof the forearm) and kb (the motion of the upper arm). Note that in compositesimple motionsthe (simple) motionsof their components occur simultaneously. That is to say, a composite simple motioncorrespondsto a single configuration of knoxelsin the conceptualspace. In order to consider the compositionof several (simple or composite) motions arranged according to sometemporal relation (e.g., a sequence), weintroduce the notion structured process. Astructured processcorrespondsto a series of different configurationsof knoxelsin the conceptual space. Weassumethat the configurations of knoxels within a single structured process are separated by instantaneous changes.In the transition betweentwosubsequentdifferent configurations,there is a changeof at least one of the knoxels in the CS. Wecall a "scattering" such a transition from one knoxel to another. This correspondsto a discontinuity in time, and is associated with an instantaneousevent. In the exampleof the movingarm, a scattering occurs whenthe arm has reachedits vertical position, and begins to movedownwards.In a CS representation this amountto say that knoxelka (i.e., the upwardmotionof the forearm) replaced by knoxel ka, and knoxel kb is replaced by k~. In fig. 2 such scattering is graphicallysuggestedby the dotted arrowsconnectingrespectively ka to ka, and kb to k~. Mappingsituation calculus on conceptual spaces In (Chella, Frixione, &Gaglio2000) the formalismadopted for the linguistic area was a KL-ONE like semantic net, equivalent to a subset of the first order predicate language. In this note we proposethe adoption of the situation calculus as the formalismfor the linguistic area. Indeed, the kind of representation it presupposesis in manyrespects homogeneous to the conceptualrepresentation described in the previoussection. In this section we suggesthowa representation in terms of the situation calculus could be mapped on the conceptualrepresentation presented above. The situation calculus is a logic basedapproachto knowledge representation, developedin order to express knowledge about actions and changeusing the languageof predicate logic. It was primarily developedby (McCarty1968); for an up to date and exhaustiveoverviewsee (Reiter 2001). The basic idea behind the situation calculus is that the evolutionof a state of affairs can be modeled in termsof a sequenceof situations. The world changes whensomeaction is performed.So, given a certain situation sl, performinga certain action a will result in a newsituation s2. Actionsare the sole sourcesof changeof the word:if the situation of the wordchangesfrom, say, si to s j, then someaction has been performed. Thesituation calculus is formalisedusing the languageof predicate logic. Situations and actions are denotedby first order terms. The two place function do takes as its argumentsan action and a situation: do(a, s) denotes the new situation obtained by performingthe action a in the situation s. Actions occur in time. Accordingto (Reiter 1996), 63 introduce a one argumentsymbolfunction time. If a is an action, time(a) is a real numbercorrespondingto the occurrence timeof a. Classesof actions can be representedas functions. Forexample, the one argumentfunction symbolpick_up(x) could be assumedto denote the class of the actions consisting in picking up an object x. Givena first order term o denoting a specific object, the term pick_up(o)denotes the specific action consisting in pickingup o. Asa state of affairs evolves, it can happenthat properties and relations changetheir values. In the situation calculus, propertiesandrelations that canchangetheir truth value fromone situation to anotherare called (relational)fluents. Anexampleof fluent could be the propertyof being red: it canhappenthat it is true that a certain object is red in a certain situation, and it becomesfalse in another. Fluents are denotedby predicate symbolsthat take a situation as their last argument.For example,the fluent correspondingto the propertyof being red can be representedas a twoplace relation red(x, s), wherered(o, sl) is true if the object o is red in the situation Sl. Differentsequencesof actions lead to different situations. In other words, it can never be the case that performing someaction starting formdifferent situations can result in the samesituation. If two situations derive from different situations, they are in their turn different, in spite of their similarity. In order to accountfor the fact that twodifferent situations can be indistinguishablefromthe point of viewof the propertiesthat holdin them,the notionof state is introduced.Considertwosituations Sl and s2; if they satisfy the samefluents, then we say that Sl and s2 correspondto the samestate. Thatis to say, the state of a situation is the set of the fluentsthat are true in it. In the ordinary discourse, there are lot of actions that havea temporal duration. For example,the action of walking from certain point in space to another takes sometime. In the situation calculus all actions in the strict sense are assumedto be instantaneous. Actions that have a duration are representedas processes, that are initiated and are terminated by instantaneous actions (see (Pinto 1994) Chap. 7 of (Reiter 2001)). Supposethat we want to represent the action of movingan object x from point y to point z. Wehave to assumethat movingx from y to z is a process, that is initiated by an instantaneousaction, say start_move(x, y, z), and is terminated by another instantaneous action, say end.move(x,y, z). In the formalismof the situation calculus, processescorrespondto relational fluents. For example,the process of movingx fromy to z correspondsto a fluent, say moving(x,y, z, s). A formulalike moving(o,Pl, P2, Sl) meansthat in situation sl the object o is movingfromposition Pl to position P2. This approachis analogousto the representation of actions adoptedin the dynamicconceptualspaces described in the precedingsection. In a nutshell, a scattering in the conceptual space CS correspondsto an (instantaneous) action. A knoxelcorrespondsto a particular process. A configuration of knoxelsin CScorrespondsto a state. Please, note that, accordingto the situation calculusterminology we adopthere, a state (in the technical sense) does not necessarily correspondto a static configurationof entities. Rather,as wesaid in the previousparagraph,a state is definedas the set of fluents that are true in a givensituation, and werepresent processes (i.e., actions that have a duration) as a particular kindof relational fluents. Therefore, state can correspondto a set of processes. Note also that a configuration of knoxels cannot correspondto a situation, because,at least in principle, the same configurationof knoxelscould be reachedstarting fromdifferent configurations, and it can never happenthat the same situation (in the technical sense) can be obtained starting fromdifferent fromdifferent states of affairs. Asan example,consider a scenario in whicha certain object o movesfromposition pl to position P2, and then rests in P2. Whenthe motion of o from Pl towards P2 starts, a scattering occurs in the conceptualspace CS, and a certain knoxel,say kl, becomesactive in it. Sucha scattering correspondsto an (instantaneous)action that couldbe represented by a term like start_move(o, pl, P2). The knoxelkl correspondsin CS to the process consisting in the motionof the object o. Duringall the time in whicho remainsin such a motion state, the CS remains unchanged(provided that nothing else is happeningin the considered scenario), and kl continueto be active in it. In the meanwhile,the fluent moving(o, Pl, P2, s) remains true. Wheno’s motion ends, a further scattering occurs, kl disappears, and a newknoxel k2 becomesactive. This secondscattering correspondsto an instantaneous action end_moving(o,pl, p2). The knoxel k2 correspondsto o’s rest. Duringthe timein whichk2 is active in CS, the fluent staying(o,p2, Sl) is true. In its traditional version, the situation calculusdoes not allow to account for concurrency. Actions are assumedto occursequentially,and it is not possibleto representseveral instantaneousactions occurringat the sametime instant. For our purposes,this limitations is too severe. Whena scattering occurs in a CS it mayhappenthat moreknoxels are affected. This is tantamountto say that several instantaneous actions occur concurrently. For example,according to our terminology(as it has beenestablished in the precedingsection), a compositesimple motionis a motionof a composite object (i.e. an object approximatedby morethan one superquadric). A compositesimple motion is represented in the CS by the set of knoxels correspondingto the motions of its components.For example,the trajectory of a whole arm movingupward(the examplein the previous section) represented as a composite motionmadeup by the knoxels ka (the upwardmotion of the forearm) and kb(the upward motionof the upper arm). Supposethat we want to represent such a motionwithin the situation calculus. Accordingto what stated before, movingan ann is represented as a process, that is started by a certain action, say start_move.arm,and that is terminated by another action, say end.move.arm.(For sake of simplicity, no argumentsof such actions - e.g. the agent, the starting position, the final position - are taken into account here). The process of movingthe arm is represented as a fluent moving_arm(s),that is true if in situation s the arm is moving. The scattering in CS corresponding to both start_move_arm and end_move_arminvolves two 64 knoxels, namelyka and kb, that correspond respectively to the motion of the forearm and to the motionof the upper arm. Consider for example start_move_arm. It is composedby two concurrent actions that could be named start_move_forearm and start_move_upper_arm, both correspondingto the scattering of one knoxelin CS (resp. ka and kb). Extensionsof the situation calculus that allows for a treatment of concurrencyhavebeen proposedin the literature (Reiter 1996; 2001; Pinto 1994). (Pinto 1994) adds the languageof the situation calculus a twoargumentfunction +, that, given twoactions as its arguments,producesan actionas its result. In particular,if al anda2 are twoactions, al + a2 denotes the action of performingal and a2 concurrently. Accordingto this approach,an action is primitive if it is not the result of other actions performedconcurrently. If a is a complexaction such that a = al + a2 + .. ¯ + an, then, for each i such that 1 < i < n, we say that ai E a. In our approach,the scattering of a single knoxelin CS correspondto a primitive action; several knoxelsscattering at the sametime correspondto a complexaction resulting fromconcurrentlyperformingdifferent primitive actions. Therefore,accordingto Pinto’s notation, the representation of the arm motionexamplein the formalismof the (concurrent) situation calculusinvolvesfour primitiveactions: start_move_forearm start_move_upper_arm end_move_forearm end.move_upper_arm and twonon primitive actions: start_move_arm end_move.arm = q= + start_move_forearm + start_move_upper_arm end.move_forearm+ end_move_upper_arm In addition, the three followingfluents are needed: moving_arm (a formula like this is true if in the CS configuration correspondingto the situation s bothka and kb are present); moving_forearm(a formula like this is true if in the CS configuration correspondingto the situation s ka is present); moving.upper.arm(a formula like this is true if in the CS configuration correspondingto the situation s kb is present). From a terminological taxonomy of processes to the situation calculus In (Chella, Frixione, &Gaglio 1997; 2000), we adopted linguistic representation basedon a hybrid formalismin the KL-ONE tradition (on this kind of formalismssee (Brachman& Schmoltze1985; Nebel 1990). These formalisms are ment of knoxels simultaneously occurring in CS. In other words, within a synchronic motion no scattering occurs. A Simple.motion is a synchronic motion which has no parts: it corresponds to a single knoxel in CS. A Composite.simple.motion is a Synchronic.motionwith at least two parts that are simple motionsoccurringsimultaneously. That is to say, all the parts of a compositesimple motion start and end at the sametime. Therefore,if the simplemotions sm1 ¯ .. Stunare the parts of a certain compositesimple motioncsm, and al ... an, a are the instantaneous actions that initiate, respectively, sml ... smnand csm, then we have that a = al + ... + an. In other words: Figure 3: A fragment of the terminological KBdescribing someof the most general concepts. hybridin the sense that they are constitutedby twodifferent components:a terminological componentfor the description of concepts, and an assertional component,that stores information concerning a specific context. In the domain of dynamicscene representation, the terminological component contains the description of significant conceptssuch as various types of motions, of processes and of actions. The assertional component stores the assertions describing specific perceiveddynamicsituations. Fig. 3 shows a fragment of the terminological knowledge base adopted in (Chella, Frixione, & Gaglio 2000), describing the mostgeneral concepts representing motions, processes and actions. Thegraphical notation is similar to that of Brachmanand Schmoltze (Brachman& Sehmoltze 1985). In this section, we showhowa taxonomylike this can be translated in the formalismof the situation calculus. Intuitively, Processesare the mostgeneric motionscorrespondingto a temporalevolution of the knoxels in the CS. The beginning and the end of a Process is determined by Instantaneous_Actions,that have no temporal duration and correspondto a scattering of knoxelsin the CS. Instantaneous_Actionsare analogousto the actions of the situation calculus. Therefore, the conceptInstantaneous_Actioncorrespondsto a oneplace predicate, that is true of the individuals in the domainthat correspondto actions. As we anticipated before, Processesare analogoustofluents. In order to keepa greater correspondence with the semanticnetworkrepresentation, we adopt the choice of reifying processes. That is to say, we assumethat in our domainindividual entities exist, whichcorrespondto specific instances of processes. Therefore,a certain kind of process P1 (i.e., a subconceptof the Processconceptof the fig. 3) is representedas a twoplace relational fluent PI (x, y). atomic formula P1 (P, s) meansthat the individual p is processof kind/’1, andthat it holds in the situation s. Processes maybe arranged according to different temporal relations: they maybe consecutive,mayoverlap, and so on. A Synchronic.motionis a Process involving the motion of one or more objects, that corresponds to one arrange- 65 VXl , x2, yl , y2, s(Composite_simple_motion(xl, s) Apart_of_CMotion(xl, x2)/x start(xl, Yl) Astart(x2, Y2) "+ Y2E Yl) Asimilar constraint holds for the actions terminating a compositesimple motion. A Structured_processis a motion involving a temporal evolution(i.e., a scattering) in CS;a Structured_process has at least twoparts that are Synchronic.motions not occurring at the sametime. In the case of structured processes, the followingcondition holds: VX1, x2, Yl , Y2, s(Structured_Process(xl , s) A /xpart_of_Sprocess(xl, x2) A start(xl, Yl) Astart(x2, Y2) --+ time(yl) < time(y2)) Thatis to say, ifxl is s a structuredprocessthat is initiated by the action Yl, andx2is a part of Xl that is initiated bythe action Y2, then Y2must temporallyfollow or coincide with Yl. A symmetriccondition constraints the end of structured processesandof their parts: ’¢Xl, x2, Yl, Y2, s(Structured_Process(xl,s) A Apart_of_Sprocess(xl, x2) A end(xl, yl) Aend(x2, Y2) ~ time(yl) > time(y2)) Some conclusions In the abovesection we suggesteda possible interpretation of the languageof the situation calculus in terms of conceptual spaces. In this waythe situation calculus could be chosenas the linguistic area formalismfor our model,with the advantageof adopting a powerful, well understoodand widespreadformal tool. Besides this, we maintain that a conceptualinterpretation of the situation calculus wouldbe interesting in itself. Indeed,it could be consideredcomplementarywith respect to traditional, modeltheoretic interpretations for logic orientedrepresentationlanguages. Modeltheoretic semantics (in its different versions: purely Tarskian for extensional languages, possible worlds semantics for modallogic, preferential semantics for non monotonicformalisms, and so on) has been developedwith the aimof accountingfor certain metatheoreticalproperties G~denfors, P. 2000. ConceptualSpaces. Cambridge,MA: MITPress, Bradford Books. Herzog, G., and Wazinski, P. 1994. Visual TRAnslator: Linkingparceptionsand natural languagedescriptions. Artif. lntell. Review8(2-3): 175-187. Johansson, G. 1973. Visual perception of biological motion and a modelfor its analysis. Perception and Psychophysics 14:201-211. Kuniyoshi,Y., and Inoue, H. 1993. Qualitative recognition of ongoinghumanaction sequences. In Proc. 13th IJCAL 1600--1609. Marr, D., and Vaina, L. 1982. Representation and recognition of the movements of shapes. Proc. R. Soc. Lond. B 214:501-524. McCarty,J. 1968. Situations, actions and causal laws. In Minsky,M., ed., Semantic Information Processing. Cambridge, MA:MITPress. 410--417. Nagel, H. 1994. A vision of "vision and language" comprises action: Anexamplefromroad traffic. Artif. lntell. Review8(2-3): 189-214. Nebel, B. 1990. Reasoningand Revision in Hybrid Representation Systems, volume422of Lecture Notes in Artificial Intelligence. Berlin: Springer-Verlag. Neumann, B. 1989. Natural language description of timevarying scenes. In Waltz, D., ed., SemanticStructures. Hillsdale, NJ: LawrenceErlbaum. Oppenheim, A., and Shafer, R. 1989. Discrete-TimeSignal Processing.Englewood Cliffs, N.J.: Prentice Hall, Inc. Pentland, A. 1986. Perceptual organization and the representation of natural form.Artif. Intell. 28:293-331. Pinto, J. 1994. Temporalreasoningin the situation calculus. Technicalreport, Dept. of ComputerScience, Univ. of Toronto, Toronto, CA. Reiter, R. 1996. Natural actions, concurrencyand continuous time in the situation calculus. In Principles of Knowledge Representation and Reasoning: Proceedings of the Fifth International Conference( KR’96). Reiter, R. 2001. Knowledgein Action: Logical Foundations for Describing and ImplementingDynamicalSystems. Cambridge,MA:MITPress, Bradford Books. Siskind, J. 1994-5. Groundinglanguage in perception. Artif. Intell. Review8(5-6):371-391. Solina, F., and Bajcsy, R. 1990. Recoveryof parametric modelsfrom range images: The case for superquadrics with global deformations. IEEETrans. Patt. Anal. Mach. Intell. 12(2): 131-146. Tsotsos, J.; Mylopoulos,J.; Covvey,H.; and Zucker, S. 1980. A frameworkfor visual motionunderstanding. IEEE Trans. Patt. Anal. Mach.lntell. 2(6):563-573. Tsotsos, J. 1985. Knowledge organisation and its role in representationand interpretation for time-varyingdata: the ALVEN system. ComputationalIntelligence 1:16-32. of logical formalisms(such as logical consequence,validity, correctness, completeness,and son on). However,it is of no help in establishing howsymbolicrepresentations are anchoredto their referents. In addition, the modeltheoretic approachto semanticsis "ontologicallyuniform",in the sense that it hides the ontological differences betweenentities denotedby expressions belongingto the samesyntactic type. For example,all the individual terms of a logical languageare mappedonto elementsof the domain,no matter of the deep ontological variety that mayexist betweenthe objects that constitute their intendedinterpretation. Considerthe situation calculus. Accordingto its usual syntax,situations, actions and objectsare all representedas first order individualterms;therefore, they are all mappedon elements of the domain. This does not constitute a problemgiven the abovementionedpurposes of modeltheoretic semantics. However,it becomesa serious drawbackif the aim is that of anchoring symbolsto their referents throughthe sensoryactivities of an agent. In this perspective, it is our opinion that an interpretation of symbolsin terms of conceptual spaces of the form sketchedin the abovepagescould offer: ¯ a kind of interpretation that does not constitute only a metatheoreticdeviceallowingto single out certain properties of the symbolicformalism;rather, it is assumed to offer a further level of representationthat is, in somesense, closer to the data comingfromsensors, and that, for this reason, can help in anchoringthe symbolsto the external world. ¯ A kind of interpretation that accountsfor the ontological differences betweenthe entities denoted by symbolsbelonging to the samesyntactic category. This wouldresult in a richer and finer grained model,that stores information that is not explicitly representedat the symboliclevel, and that therefore can offer a further source of "analog" inferences, offering at the sametime a link betweendeliberative inferential processes, and formsof inference that are closer to the lowerlevels of the cognitivearchitecture (reactive behaviours,and so on). Acknowledgements This work has been partially supported by MURST "Progetto Cofinanziato CERTAMEN". References Brachman,R., and Schmoltze, J. 1985. An overview of the KL-ONE knowledgerepresentation system. Cognitive Science 9(2): 171-216. Buxton, H., and Gong,S. 1995. Visual surveillance in a dynamicand uncertain world. Artif. InteU. 78:371-405. Chella, A.; Frixione, M.; and Gaglio, S. 1997. A cognitive architecturefor artificial vision. Artif. Intell. 89:73-111. Chella, A.; Frixione, M.; and Gaglio, S. 2000. Understanding dynamicscenes. Artif. lntell. 123:89-132. Gallistel, C. 1980. The Organization of Action: A New Syntesis. Hillsdale, NJ: LawrenceErlbaumAssociates. 66