Interpreting Symbols on Conceptual Spaces: the ... Antonio Chellat, Marcello Frixione:~ and Salvatore Gagliot

From: AAAI Technical Report FS-01-01. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved.
Interpreting
Symbols on Conceptual Spaces: the Case of Dynamic Scenarios
AntonioChellat, MarcelloFrixione:~ and Salvatore Gagliot
J’Dip. di Ingegneria Automaticae Informatica, Univ. of Palermoand CERE-CNR,
Palermo,Italy
:~Dip. di Scienzedella Comunicazione,
Univ. of Salerno, Italy
Abstract
In (Chella,Frixione,&Gaglio1997;2000)weproposed
a framework
for the representationof visualknowledge,
withparticularattentionto the analysisandthe representation of sceneswith movingobjects and people.
Oneof our aimsis a principledintegrationof the models
developed
withinthe artificial vision community
with
the propositionalknowledge
representationsystemsdevelopedwithinsymbolicAI. In this paperweshowhow
the approachweadoptedfits well withthe representational choicesunderlyingoneof the mostpopularsymbolic formalisms
usedin cognitiverobotics,namelythe
situationcalculus.
Ourmodelof visual perception:
Anoverall view
In (Chella, Frixione, & Gaglio 1997; 2000) we proposed
theoretical frameworkfor the representation of knowledge
extracted fromvisual data. Oneof our aimsis a principled
integration of the approachesdevelopedwithin the artificial
vision community,and the propositional systems developed
within symbolic knowledgerepresentation in AI. It is our
assumptionthat such an integration requires the introduction of a conceptuallevel of representation that is in some
wayintermediate betweenthe processing of visual data and
declarative, propositional representations. In this paper we
argue that the conceptualrepresentation we adoptedfor representing motionis compatiblewith one of the most influential symbolicformalismsused in cognitive robotics, namely
the situation calculus(Reiter 2001).In the rest of this section we summarizethe main assumptions underlying our
model. Next section is devoted to a synthetic description
of our conceptuallevel representation of motion.The third
section showshowthe situation calculus can be mappedon
the conceptual representation we adopted. The fourth section describes a wayto translate the symbolicrepresentation
weadopted in (Chella, Frixione, &Gaglio2000) in the formalismof the situation calculus. Ashort conclusionfollows.
On the one hand, the computer vision communityapproached the problem of the representation of dynamic
scenes mainlyin terms of the construction of 3Dmodels,
Copyright©2001, American
Associationfor Artificial Intelligence(www.aaai.org).
All rights reserved.
6O
and of the recoveryof suitable motionparameters, possibly
in the presenceof noise and occlusions. Onthe other hand,
the KRcommunitydeveloped rich and expressive systems
for representationof time, of actions and, in general, of dynamicsituations.
Nevertheless,these twotraditions evolvedseparately and
concentrated on different kinds of problems.The computer
vision researchersimplicitly assumedthat the problemof visual representation ends with the 3Dreconstruction of moving scenes. The KRtradition in classical, symbolicAI usually underestimatedthe problemof groundingsymbolicrepresentations in the data comingfromsensors.
It is our opinion that this state of affairs constitutes
a strong limitation for the developmentof artificial autonomous
systems(typically, robotic systems).It is certainly
true that several aspects of the interaction betweenperception andbehaviourin artificial agentscanbe faced in a rather
reactive fashion, with perceptionmoreor less directly coupled to action, without the mediationof complexinternal
representations. However,
we maintainthat this is not sufficient to modelall the relevant types of behaviour.For many
kinds of complextasks, a moreflexible mediationbetween
perceptionand action is required. In suchcases the coupling
betweenperception and action is likely to be "knowledge
driven", and the role of explicit formsof representation is
crucial. Our modelcan be intended as a suggestionto provide artificial agents withthe latter of the above-mentioned
functionalities.
The existing attempts to integrate visual perception with
propositional KRare mostly oriented towards natural languageinterpretation, with particular emphasison the aspects of man-machineinteraction. They face only in a
marginalwaythe general aspects of representation of knowledge. In addition, the existing proposals often concernspecific domainsof application,such as car traffic (Nagel1994;
Neumann1989), biomedical images (Tsotsos et al. 1980;
Tsotsos 1985), sport scenarios (Herzog &Wazinski1994;
Siskind 1994 5), simple assemblytasks (Kuniyoshi &Inoue 1993), visual surveillance (Buxton&Gong1995).
Weassumethat a principled integration of the approaches
of artificial vision and of symbolicKRrequires the introduction of a missinglink betweenthese two kinds of representation. In our approach,the role of such a link is playedby
the notion of conceptual space (G~irdenfors2000). A con-
Figure1: Thethree areas of representation,and the relations
amongthem.
bolic knowledge
initially stored within the linguistic area,
and by the information learned by a neural networkcomponent. The knowledgein the linguistic area plays a central
role in determiningthe expectationsof the system, that in
their turn drive the scanning of the data comingfrom the
sensors. Whichhigh level information the systemextracts
fromthe sensory data (or, in other worlds, the wayin which
the systeminterprets the perceivedscenes) crucially depends
on a top downprocesspartially directed by linguistic knowledge. The details of this process are described in (Chella,
Frixione, &Gaglio 1997).
Thepurpose of this paper is to showthat the conceptual
representation we adopted for dynamicscenes (and that is
describedin details in (Chella, Frixione, &Gaglio2000))
consistent with the representational choices underlyingone
of the most popular symbolicformalismsused in cognitive
robotics, namelythe situation calculus (Reiter 2001).
this way, our dynamicconceptual spaces could offer a conceptualinterpretation for the situation calculus, whichcould
help in anchoringpropositional representations to the perceptualactivities of a robotic system.
ceptual space (CS) is a representation whereinformationis
characterized in terms of a metric space definedby a number
of cognitive dimensions.Thesedimensionsare independent
from any specific languageof representation. Accordingto
G~denfors,a CS acts as an intermediate representation betweensubconceptualknowledge(i.e., knowledgethat is not
yet conceptually categorized), and symbolically organized
knowledge.
Accordingto this view, our architecture is organisedin
three computational areas. Figure 1 schematically shows
the relations amongthem. The subconceptualarea is concerned with the lowlevel processingof perceptual data coming from the sensors. The term subconceptualsuggests that
here informationis not yet organisedin terms of conceptual
structures and categories. In this perspective, our subconceptual area includes a 3Dmodelof the perceived scenes.
Indeed, evenif such a kind of representation cannot be considered"lowlevel" fromthe point of viewof artificial vision,
in our perspectiveit still remainsbelowthe level of conceptual categorisation.
In the linguistic area, representation and processingare
based on a propositional KRformalism(a logic-oriented
representation language). In the conceptualarea, the data
comingfrom the subconceptualarea are organised in conceptual categories, whichare still independentfromany linguistic characterisation. The symbolsin the linguistic area
are groundedon sensory data by mappingthem on the representations in the conceptualarea.
In a nutshell, the operation of our modelcan be summarisedas follows: it takes in input a set of imagescorrespondingto subsequentphases of the evolution of a certain
dynamicscene, and it producesin output a declarative description of the scene, formulatedas a set of assertions written in a symbolic representation language. The generation
of such a description is driven both by the "a priori", sym-
Conceptual spaces for representing motion
Representationsin the conceptualarea are couchedin terms
of conceptual spaces (G~denfors2000). Conceptualspaces
provide a principled wayfor relating high level, linguistic
formalismson the one hand, with low level, unstructured
representation of data on the other. A conceptualspace CS
is a metric space whosedimensionsare in somewayrelated
to the quantities processedin the subconceptualarea. Different cognitive tasks maypresupposedifferent conceptual
spaces, and different conceptualspaces can be characterised
by different dimensions. Examplesof dimensionsof a CS
could be colour, pitch, mass, spatial co-ordinates, and so
on. In somecases dimensionsare strictly related to sensory data; in other cases they are moreabstract in nature.
Anyway,dimensionsdo not dependon any specific linguistic description. In this sense, conceptualspaces comebefore any symbolic-propositionalcharacterisation of cognitive phenomena.
In particular, in this note, we take into accounta conceptualspace devotedto the representationof the
motionof geometricshapes.
Weuse the term knoxel to denote a point in a conceptual space. A knoxelis an epistemologicallyprimitive element at the considered level of analysis. For example,
in (Chella, Frixione, &Gaglio 1997) we assumedthat,
the case of static scenes, a knoxelcoincideswith a 3Dprimitive shape, characterised according to someconstructive
solid geometry(CSG)schema. In particular, we adopted
superquadrics (Pentland 1986; Solina &Bajcsy 1990) as
suitable CSGschema.However,we do not assumethat this
choice is mandatoryfor our proposal. Our approach could
be reformulated by adopting different modelsof 3Drepresentation.
Theentities representedin the linguistic area usually do
not correspond to single knoxels. Weassumethat complex
entities correspondto sets of knoxels. For example,in the
case of a static scene, a complexshape correspondsto the
set of knoxelsof its simpleconstituents.
SENSORY
DATA
61
In order to accountfor the perceptionof dynamicscenes,
we choose to adopt an intrinsically dynamicconceptual
space. It has been hypothesisedthat simple motionsare categorised in their wholeness,and not as sequencesof static
frames. Accordingto this hypothesis, we define a dynamic
conceptual space in such a way that every knoxel corresponds to a simple motion of a 3D primitive. In other
words, we assume that simple motions of geometrically
primitive shapes are our perceptual primitives for motion
perception. As we choosesuperquadrics as geometricprimitives, a knoxelin our dynamicconceptualspace correspond
to a simple motionof a superquadric.
Of course, the decision of which kind of motion can be
considered"simple"is not straightforward, and is strictly
related to the problem of motion segmentation. (Mart
Vaina 1982) adopted the term motion segmentto indicate
such simple movements.According to their SMS(StateMotion-State) schema,a simple motionis characterised by
the interval betweentwosubsequentoverall rest states.
Suchrest states maybe instantaneous. Considera person moving an arm up and down. According to Marr and
Vaina’s proposal, the upwardtrajectory of the forearm can
be considereda simple motion,that is represented in CS by
a single knoxel, say ka. Whenthe armreaches its vertical
position, an instantaneousrest state occurs. Thesecondpart
of the trajectory of the forearmis anothersimplemotionthat
correspondsto a secondknoxel, say ka. The sameholds for
the upperarm:the first part of its trajectory correspondsto
a certain knoxelkb; the secondpart (after the rest state) correspondsto a further knoxelk~.
In intrinsically dynamicconceptual spaces, a knoxel k
corresponds to a generalised simple motion of a 3Dprimitive shape. By generalised we meanthat the motion can
be decomposed
in a set of componentsxi, each of themassociated with a degree of freedomof the movingprimitive
shape. In other words, wehave:
k = [Xl, x2 .....
Xn]
Figure 2: A dynamicconceptual space.
For our purposes,weare not interested in the representation of any possible motion,but only in a compactdescription of perceptuallyrelevant kinds of motion.In the domain
we are facing here, the motionscorrespondingto each degree of freedomofa superquadriccan be viewedas the result
of the superimpositionof the first low frequencyharmonics, accordingto the well-known
Discrete Fourier Transform
(DFT), see (Oppenheim&Shafer 1989) for an in-depth
scription of the topic. Accordingto this approach,given a
generic parameterof a superquadric,e.g., ax, let be ax(t)
the function that returns for each instant t the corresponding value of ax. This function can be viewedas a superimpositionof a discrete numberof trigonometric functions
and it can then be representedby a point in a discrete functional space, whosebasis functions are the trigonometric
functions. By a suitable compositionof the time functions
of all superquadricparameters,the overall motionof the superquadricis representedin its turn by a point in a discrete
functional space. Weadopt the resulting functional space
as the conceptual space for the representation of dynamic
scenes.
wheren is the numberof degrees of freedomof the moving
superquadric.In this way,changesin shapeand size are also
taken into account. Parametersof superquadrics are given
with respect to an absolute reference system. As a consequence, also motion properties of knoxels is described in
absolute terms.
In turn, each motionxi correspondingto the i-th degreeof
freedomcan be viewedas the result of the superimposition
of a set of elementarymotionsfj:
i i
__
x,-S,
xj:j
J
In this way,it is possible to define a set of basis functions )’j, in terms of whichany simple motioncan be expressed. Suchfunctions can be associated to the axes of the
dynamicconceptual space as its dimensions. In this way,
the dynamicCSresults in a functional space. The theory of
function approximation
offers different possibilities for the
choice of basic motions: trigonometric functions, polynomial functions, wavelets, and so on.
62
The choice of DFTfor motion representation seems
promisingfromout point of viewalso becauseFourier analysis has been proposedas a psychologicallyplausible model
for representation of humanand animal motion (Gallistel
1980; Johansson 1973). See (Chella, Frixione, &Gaglio
2000)for greater details.
Figure 2 is an evocative representation of a dynamicconceptual space. In the figure, each group of axes f~ correspondsto the i-th degreeof freedomof a simple shape; each
axis j’j in a group fi corresponds to the j-th component
pertaining to the i-th degree of freedom.
In this way, each knoxel in a dynamicconceptual space
represents a simplemotion,i.e. the motionof a simpleshape
occurring within the interval betweentwo subsequentrest
states. Wecall compositesimple motiona motionof a composite object (i.e. an object approximatedby morethan one
superquadric). A compositesimple motionis represented in
the CSby the set of knoxelscorrespondingto the motionsof
its components.
For example,the first part of the trajectory
of the wholearm shownin figure 2 is represented as acom-
posite motionmadeup by the knoxels ka (the motionof the
forearm) and kb (the motion of the upper arm). Note that
in compositesimple motionsthe (simple) motionsof their
components
occur simultaneously. That is to say, a composite simple motioncorrespondsto a single configuration of
knoxelsin the conceptualspace.
In order to consider the compositionof several (simple
or composite) motions arranged according to sometemporal relation (e.g., a sequence), weintroduce the notion
structured process. Astructured processcorrespondsto a series of different configurationsof knoxelsin the conceptual
space. Weassumethat the configurations of knoxels within
a single structured process are separated by instantaneous
changes.In the transition betweentwosubsequentdifferent
configurations,there is a changeof at least one of the knoxels in the CS. Wecall a "scattering" such a transition from
one knoxel to another. This correspondsto a discontinuity
in time, and is associated with an instantaneousevent.
In the exampleof the movingarm, a scattering occurs
whenthe arm has reachedits vertical position, and begins
to movedownwards.In a CS representation this amountto
say that knoxelka (i.e., the upwardmotionof the forearm)
replaced by knoxel ka, and knoxel kb is replaced by k~. In
fig. 2 such scattering is graphicallysuggestedby the dotted
arrowsconnectingrespectively ka to ka, and kb to k~.
Mappingsituation calculus on conceptual
spaces
In (Chella, Frixione, &Gaglio2000) the formalismadopted
for the linguistic area was a KL-ONE
like semantic net,
equivalent to a subset of the first order predicate language.
In this note we proposethe adoption of the situation calculus as the formalismfor the linguistic area. Indeed, the
kind of representation it presupposesis in manyrespects homogeneous
to the conceptualrepresentation described in the
previoussection. In this section we suggesthowa representation in terms of the situation calculus could be mapped
on
the conceptualrepresentation presented above.
The situation calculus is a logic basedapproachto knowledge representation, developedin order to express knowledge about actions and changeusing the languageof predicate logic. It was primarily developedby (McCarty1968);
for an up to date and exhaustiveoverviewsee (Reiter 2001).
The basic idea behind the situation calculus is that the
evolutionof a state of affairs can be modeled
in termsof a sequenceof situations. The world changes whensomeaction
is performed.So, given a certain situation sl, performinga
certain action a will result in a newsituation s2. Actionsare
the sole sourcesof changeof the word:if the situation of the
wordchangesfrom, say, si to s j, then someaction has been
performed.
Thesituation calculus is formalisedusing the languageof
predicate logic. Situations and actions are denotedby first
order terms. The two place function do takes as its argumentsan action and a situation: do(a, s) denotes the new
situation obtained by performingthe action a in the situation s.
Actions occur in time. Accordingto (Reiter 1996),
63
introduce a one argumentsymbolfunction time. If a is an
action, time(a) is a real numbercorrespondingto the occurrence timeof a.
Classesof actions can be representedas functions. Forexample, the one argumentfunction symbolpick_up(x) could
be assumedto denote the class of the actions consisting in
picking up an object x. Givena first order term o denoting
a specific object, the term pick_up(o)denotes the specific
action consisting in pickingup o.
Asa state of affairs evolves, it can happenthat properties and relations changetheir values. In the situation calculus, propertiesandrelations that canchangetheir truth value
fromone situation to anotherare called (relational)fluents.
Anexampleof fluent could be the propertyof being red: it
canhappenthat it is true that a certain object is red in a certain situation, and it becomesfalse in another. Fluents are
denotedby predicate symbolsthat take a situation as their
last argument.For example,the fluent correspondingto the
propertyof being red can be representedas a twoplace relation red(x, s), wherered(o, sl) is true if the object o is red
in the situation Sl.
Differentsequencesof actions lead to different situations.
In other words, it can never be the case that performing
someaction starting formdifferent situations can result in
the samesituation. If two situations derive from different
situations, they are in their turn different, in spite of their
similarity. In order to accountfor the fact that twodifferent
situations can be indistinguishablefromthe point of viewof
the propertiesthat holdin them,the notionof state is introduced.Considertwosituations Sl and s2; if they satisfy the
samefluents, then we say that Sl and s2 correspondto the
samestate. Thatis to say, the state of a situation is the set of
the fluentsthat are true in it.
In the ordinary discourse, there are lot of actions that
havea temporal duration. For example,the action of walking from certain point in space to another takes sometime.
In the situation calculus all actions in the strict sense are
assumedto be instantaneous. Actions that have a duration are representedas processes, that are initiated and are
terminated by instantaneous actions (see (Pinto 1994)
Chap. 7 of (Reiter 2001)). Supposethat we want to represent the action of movingan object x from point y to
point z. Wehave to assumethat movingx from y to z is
a process, that is initiated by an instantaneousaction, say
start_move(x, y, z), and is terminated by another instantaneous action, say end.move(x,y, z). In the formalismof
the situation calculus, processescorrespondto relational fluents. For example,the process of movingx fromy to z correspondsto a fluent, say moving(x,y, z, s). A formulalike
moving(o,Pl, P2, Sl) meansthat in situation sl the object
o is movingfromposition Pl to position P2.
This approachis analogousto the representation of actions adoptedin the dynamicconceptualspaces described in
the precedingsection. In a nutshell, a scattering in the conceptual space CS correspondsto an (instantaneous) action.
A knoxelcorrespondsto a particular process. A configuration of knoxelsin CScorrespondsto a state.
Please, note that, accordingto the situation calculusterminology
we adopthere, a state (in the technical sense) does
not necessarily correspondto a static configurationof entities. Rather,as wesaid in the previousparagraph,a state is
definedas the set of fluents that are true in a givensituation,
and werepresent processes (i.e., actions that have a duration) as a particular kindof relational fluents. Therefore,
state can correspondto a set of processes.
Note also that a configuration of knoxels cannot correspondto a situation, because,at least in principle, the same
configurationof knoxelscould be reachedstarting fromdifferent configurations, and it can never happenthat the same
situation (in the technical sense) can be obtained starting
fromdifferent fromdifferent states of affairs.
Asan example,consider a scenario in whicha certain object o movesfromposition pl to position P2, and then rests
in P2. Whenthe motion of o from Pl towards P2 starts,
a scattering occurs in the conceptualspace CS, and a certain knoxel,say kl, becomesactive in it. Sucha scattering
correspondsto an (instantaneous)action that couldbe represented by a term like start_move(o, pl, P2). The knoxelkl
correspondsin CS to the process consisting in the motionof
the object o. Duringall the time in whicho remainsin such
a motion state, the CS remains unchanged(provided that
nothing else is happeningin the considered scenario), and
kl continueto be active in it. In the meanwhile,the fluent
moving(o, Pl, P2, s) remains true. Wheno’s motion ends,
a further scattering occurs, kl disappears, and a newknoxel
k2 becomesactive. This secondscattering correspondsto an
instantaneous action end_moving(o,pl, p2). The knoxel k2
correspondsto o’s rest. Duringthe timein whichk2 is active
in CS, the fluent staying(o,p2, Sl) is true.
In its traditional version, the situation calculusdoes not
allow to account for concurrency. Actions are assumedto
occursequentially,and it is not possibleto representseveral
instantaneousactions occurringat the sametime instant. For
our purposes,this limitations is too severe. Whena scattering occurs in a CS it mayhappenthat moreknoxels are affected. This is tantamountto say that several instantaneous
actions occur concurrently. For example,according to our
terminology(as it has beenestablished in the precedingsection), a compositesimple motionis a motionof a composite
object (i.e. an object approximatedby morethan one superquadric). A compositesimple motion is represented in
the CS by the set of knoxels correspondingto the motions
of its components.For example,the trajectory of a whole
arm movingupward(the examplein the previous section)
represented as a composite motionmadeup by the knoxels
ka (the upwardmotion of the forearm) and kb(the upward
motionof the upper arm).
Supposethat we want to represent such a motionwithin
the situation calculus. Accordingto what stated before,
movingan ann is represented as a process, that is started
by a certain action, say start_move.arm,and that is terminated by another action, say end.move.arm.(For sake of
simplicity, no argumentsof such actions - e.g. the agent,
the starting position, the final position - are taken into account here). The process of movingthe arm is represented
as a fluent moving_arm(s),that is true if in situation s
the arm is moving. The scattering in CS corresponding
to both start_move_arm and end_move_arminvolves two
64
knoxels, namelyka and kb, that correspond respectively
to the motion of the forearm and to the motionof the upper arm. Consider for example start_move_arm. It is
composedby two concurrent actions that could be named
start_move_forearm and start_move_upper_arm, both
correspondingto the scattering of one knoxelin CS (resp.
ka and kb).
Extensionsof the situation calculus that allows for a
treatment of concurrencyhavebeen proposedin the literature (Reiter 1996; 2001; Pinto 1994). (Pinto 1994) adds
the languageof the situation calculus a twoargumentfunction +, that, given twoactions as its arguments,producesan
actionas its result. In particular,if al anda2 are twoactions,
al + a2 denotes the action of performingal and a2 concurrently. Accordingto this approach,an action is primitive if
it is not the result of other actions performedconcurrently.
If a is a complexaction such that a = al + a2 + .. ¯ + an,
then, for each i such that 1 < i < n, we say that ai E a.
In our approach,the scattering of a single knoxelin CS
correspondto a primitive action; several knoxelsscattering
at the sametime correspondto a complexaction resulting
fromconcurrentlyperformingdifferent primitive actions.
Therefore,accordingto Pinto’s notation, the representation of the arm motionexamplein the formalismof the (concurrent) situation calculusinvolvesfour primitiveactions:
start_move_forearm
start_move_upper_arm
end_move_forearm
end.move_upper_arm
and twonon primitive actions:
start_move_arm
end_move.arm
=
q=
+
start_move_forearm +
start_move_upper_arm
end.move_forearm+
end_move_upper_arm
In addition, the three followingfluents are needed:
moving_arm
(a formula like this is true if in the CS configuration correspondingto the situation s bothka and kb
are present);
moving_forearm(a formula like this is true if in the
CS configuration correspondingto the situation s ka is
present);
moving.upper.arm(a formula like this is true if in the
CS configuration correspondingto the situation s kb is
present).
From a terminological
taxonomy of processes
to the situation calculus
In (Chella, Frixione, &Gaglio 1997; 2000), we adopted
linguistic representation basedon a hybrid formalismin the
KL-ONE
tradition (on this kind of formalismssee (Brachman& Schmoltze1985; Nebel 1990). These formalisms are
ment of knoxels simultaneously occurring in CS. In other
words, within a synchronic motion no scattering occurs.
A Simple.motion is a synchronic motion which has no
parts: it corresponds to a single knoxel in CS. A Composite.simple.motion is a Synchronic.motionwith at least
two parts that are simple motionsoccurringsimultaneously.
That is to say, all the parts of a compositesimple motion
start and end at the sametime. Therefore,if the simplemotions sm1 ¯ .. Stunare the parts of a certain compositesimple motioncsm, and al ... an, a are the instantaneous actions that initiate, respectively, sml ... smnand csm, then
we have that a = al + ... + an. In other words:
Figure 3: A fragment of the terminological KBdescribing
someof the most general concepts.
hybridin the sense that they are constitutedby twodifferent
components:a terminological componentfor the description of concepts, and an assertional component,that stores
information concerning a specific context. In the domain
of dynamicscene representation, the terminological component contains the description of significant conceptssuch as
various types of motions, of processes and of actions. The
assertional component
stores the assertions describing specific perceiveddynamicsituations.
Fig. 3 shows a fragment of the terminological knowledge base adopted in (Chella, Frixione, & Gaglio 2000),
describing the mostgeneral concepts representing motions,
processes and actions. Thegraphical notation is similar to
that of Brachmanand Schmoltze (Brachman& Sehmoltze
1985). In this section, we showhowa taxonomylike this
can be translated in the formalismof the situation calculus.
Intuitively, Processesare the mostgeneric motionscorrespondingto a temporalevolution of the knoxels in the CS.
The beginning and the end of a Process is determined by
Instantaneous_Actions,that have no temporal duration and
correspondto a scattering of knoxelsin the CS. Instantaneous_Actionsare analogousto the actions of the situation
calculus. Therefore, the conceptInstantaneous_Actioncorrespondsto a oneplace predicate, that is true of the individuals in the domainthat correspondto actions.
As we anticipated before, Processesare analogoustofluents. In order to keepa greater correspondence
with the semanticnetworkrepresentation, we adopt the choice of reifying processes. That is to say, we assumethat in our domainindividual entities exist, whichcorrespondto specific
instances of processes. Therefore,a certain kind of process
P1 (i.e., a subconceptof the Processconceptof the fig. 3)
is representedas a twoplace relational fluent PI (x, y).
atomic formula P1 (P, s) meansthat the individual p is
processof kind/’1, andthat it holds in the situation s.
Processes maybe arranged according to different temporal relations: they maybe consecutive,mayoverlap, and so
on.
A Synchronic.motionis a Process involving the motion
of one or more objects, that corresponds to one arrange-
65
VXl , x2, yl , y2, s(Composite_simple_motion(xl,
s)
Apart_of_CMotion(xl, x2)/x start(xl, Yl)
Astart(x2, Y2) "+ Y2E Yl)
Asimilar constraint holds for the actions terminating a
compositesimple motion.
A Structured_processis a motion involving a temporal
evolution(i.e., a scattering) in CS;a Structured_process
has
at least twoparts that are Synchronic.motions
not occurring
at the sametime. In the case of structured processes, the
followingcondition holds:
VX1,
x2,
Yl
, Y2,
s(Structured_Process(xl , s)
A
/xpart_of_Sprocess(xl, x2) A start(xl, Yl)
Astart(x2, Y2) --+ time(yl) < time(y2))
Thatis to say, ifxl is s a structuredprocessthat is initiated
by the action Yl, andx2is a part of Xl that is initiated bythe
action Y2, then Y2must temporallyfollow or coincide with
Yl.
A symmetriccondition constraints the end of structured
processesandof their parts:
’¢Xl, x2, Yl, Y2, s(Structured_Process(xl,s) A
Apart_of_Sprocess(xl, x2) A end(xl, yl)
Aend(x2, Y2) ~ time(yl) > time(y2))
Some conclusions
In the abovesection we suggesteda possible interpretation
of the languageof the situation calculus in terms of conceptual spaces. In this waythe situation calculus could be
chosenas the linguistic area formalismfor our model,with
the advantageof adopting a powerful, well understoodand
widespreadformal tool. Besides this, we maintain that a
conceptualinterpretation of the situation calculus wouldbe
interesting in itself. Indeed,it could be consideredcomplementarywith respect to traditional, modeltheoretic interpretations for logic orientedrepresentationlanguages.
Modeltheoretic semantics (in its different versions:
purely Tarskian for extensional languages, possible worlds
semantics for modallogic, preferential semantics for non
monotonicformalisms, and so on) has been developedwith
the aimof accountingfor certain metatheoreticalproperties
G~denfors, P. 2000. ConceptualSpaces. Cambridge,MA:
MITPress, Bradford Books.
Herzog, G., and Wazinski, P. 1994. Visual TRAnslator:
Linkingparceptionsand natural languagedescriptions. Artif. lntell. Review8(2-3): 175-187.
Johansson, G. 1973. Visual perception of biological motion and a modelfor its analysis. Perception and Psychophysics 14:201-211.
Kuniyoshi,Y., and Inoue, H. 1993. Qualitative recognition
of ongoinghumanaction sequences. In Proc. 13th IJCAL
1600--1609.
Marr, D., and Vaina, L. 1982. Representation and recognition of the movements
of shapes. Proc. R. Soc. Lond. B
214:501-524.
McCarty,J. 1968. Situations, actions and causal laws. In
Minsky,M., ed., Semantic Information Processing. Cambridge, MA:MITPress. 410--417.
Nagel, H. 1994. A vision of "vision and language" comprises action: Anexamplefromroad traffic. Artif. lntell.
Review8(2-3): 189-214.
Nebel, B. 1990. Reasoningand Revision in Hybrid Representation Systems, volume422of Lecture Notes in Artificial Intelligence. Berlin: Springer-Verlag.
Neumann,
B. 1989. Natural language description of timevarying scenes. In Waltz, D., ed., SemanticStructures.
Hillsdale, NJ: LawrenceErlbaum.
Oppenheim,
A., and Shafer, R. 1989. Discrete-TimeSignal
Processing.Englewood
Cliffs, N.J.: Prentice Hall, Inc.
Pentland, A. 1986. Perceptual organization and the representation of natural form.Artif. Intell. 28:293-331.
Pinto, J. 1994. Temporalreasoningin the situation calculus. Technicalreport, Dept. of ComputerScience, Univ. of
Toronto, Toronto, CA.
Reiter, R. 1996. Natural actions, concurrencyand continuous time in the situation calculus. In Principles of Knowledge Representation and Reasoning: Proceedings of the
Fifth International Conference( KR’96).
Reiter, R. 2001. Knowledgein Action: Logical Foundations for Describing and ImplementingDynamicalSystems. Cambridge,MA:MITPress, Bradford Books.
Siskind, J. 1994-5. Groundinglanguage in perception.
Artif. Intell. Review8(5-6):371-391.
Solina, F., and Bajcsy, R. 1990. Recoveryof parametric modelsfrom range images: The case for superquadrics
with global deformations. IEEETrans. Patt. Anal. Mach.
Intell. 12(2): 131-146.
Tsotsos, J.; Mylopoulos,J.; Covvey,H.; and Zucker, S.
1980. A frameworkfor visual motionunderstanding. IEEE
Trans. Patt. Anal. Mach.lntell. 2(6):563-573.
Tsotsos, J. 1985. Knowledge
organisation and its role in
representationand interpretation for time-varyingdata: the
ALVEN
system. ComputationalIntelligence 1:16-32.
of logical formalisms(such as logical consequence,validity, correctness, completeness,and son on). However,it is
of no help in establishing howsymbolicrepresentations are
anchoredto their referents.
In addition, the modeltheoretic approachto semanticsis
"ontologicallyuniform",in the sense that it hides the ontological differences betweenentities denotedby expressions
belongingto the samesyntactic type. For example,all the
individual terms of a logical languageare mappedonto elementsof the domain,no matter of the deep ontological variety that mayexist betweenthe objects that constitute their
intendedinterpretation. Considerthe situation calculus. Accordingto its usual syntax,situations, actions and objectsare
all representedas first order individualterms;therefore, they
are all mappedon elements of the domain. This does not
constitute a problemgiven the abovementionedpurposes of
modeltheoretic semantics. However,it becomesa serious
drawbackif the aim is that of anchoring symbolsto their
referents throughthe sensoryactivities of an agent.
In this perspective, it is our opinion that an interpretation of symbolsin terms of conceptual spaces of the form
sketchedin the abovepagescould offer:
¯ a kind of interpretation that does not constitute only a
metatheoreticdeviceallowingto single out certain properties of the symbolicformalism;rather, it is assumed
to offer a further level of representationthat is, in somesense,
closer to the data comingfromsensors, and that, for this
reason, can help in anchoringthe symbolsto the external
world.
¯ A kind of interpretation that accountsfor the ontological
differences betweenthe entities denoted by symbolsbelonging to the samesyntactic category. This wouldresult
in a richer and finer grained model,that stores information that is not explicitly representedat the symboliclevel,
and that therefore can offer a further source of "analog"
inferences, offering at the sametime a link betweendeliberative inferential processes, and formsof inference that
are closer to the lowerlevels of the cognitivearchitecture
(reactive behaviours,and so on).
Acknowledgements
This work has been partially supported by MURST
"Progetto Cofinanziato CERTAMEN".
References
Brachman,R., and Schmoltze, J. 1985. An overview of
the KL-ONE
knowledgerepresentation system. Cognitive
Science 9(2): 171-216.
Buxton, H., and Gong,S. 1995. Visual surveillance in a
dynamicand uncertain world. Artif. InteU. 78:371-405.
Chella, A.; Frixione, M.; and Gaglio, S. 1997. A cognitive
architecturefor artificial vision. Artif. Intell. 89:73-111.
Chella, A.; Frixione, M.; and Gaglio, S. 2000. Understanding dynamicscenes. Artif. lntell. 123:89-132.
Gallistel, C. 1980. The Organization of Action: A New
Syntesis. Hillsdale, NJ: LawrenceErlbaumAssociates.
66