Grounded Models as a Basis for Intuitive ... the Origins of Logical Categories Josefina

From: AAAI Technical Report FS-01-01. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved.
GroundedModelsas a Basis for Intuitive Reasoning:
the Origins of Logical Categories
JosefinaSierra-Santibfifiez
Escuela T6cnicaSuperior de Inform6tica
Universidad Aut6nomade Madrid
28049Madrid, Spain
Josefina.Sierra@ii.uam.es
Abstract
egorizer (Steels 1996), whichare used for conceptualizing
perceptual information. Then, we consider the process of
truth evaluation, showhowlogical categories can be constructed by identifying sets of outcomesof the evaluation
process, and explain howconceptscan be generatedby combining logical and perceptually groundedcategories. Next,
we describe howgroundedmodelsare constructed as a side
effect of the agents activity playingdifferent types of languagegames,and present someexperimentsin whicha couple of agents learn a groundedmodelthat can be used for
spatial reasoning. Finally, we introducethe notion of intuitive reasoning and showhowgroundedmodelscan be used
for it.
Grounded
models(Siena 2001b)differ fromaxiomatictheories in establishingexplicit connectionsbetweenlanguage
andreality that are learnedthroughlanguage
games(Wittgenstein 1953).Thispaperdescribeshowgroundedmodelsare
constructedby autonomous
agentsas a side effect of their
activity playingdifferent types of languagegames(Steels
1999),andexplainshowthey canbe usedfor intuitive reasoning.It proposesa particular languagegamewhichcanbe
usedfor simulatingthe generationof logicalcategories(such
as negation,conjunction,disjunction,implicationor equivalence), and describessomeexperimentsin whicha couple
of visually grounded
agentsconstructa grounded
modelthat
canbe usedfor spatial reasoning.
Introduction
Perceptually GroundedCategories
In (Sierra 200 lb), we introducedgroundedmodelsand compared themto axiomatic theories of mathematics.This paper describes howgroundedmodelsare constructed by autonomous
agentsas a side effect of their activity playingdifferent types of languagegames(Steels 1999).It builds up
previous workon the relation betweenlanguageacquisition
and conceptualization in visually groundedrobotic agents
(Steels 2001), by addressingthe issue of the acquisition
logical categories, and proposinga particular languagegame
(called the evaluation game)whichcan be used for simulating the generation of logical categories, such as negation,
conjunction,disjunction, implication or equivalence.
Logicalcategories are very importantfor intellectual development(Piaget 1985), because they allow the generation
of structured units of meaning,and they set the basis for intuitive reasoningat the level of propositionallogic.
Groundedmodelsare not proposedas substitutes for axiomatic theories. Onthe contrary, they provide an explanation for our ability to comeup withsomeof these theories by
intuitive reasoning. Theyexplain, for example,whywe accept certain axiomsas intuitively true without further argumentation. Groundedmodelsshould rather be considered as
precursorsof axiomatictheories, since these theories require
considerablelinguistic competence
for their formation.
The rest of the paper is organizedas follows. First, we
defne the concepts of sensory channel, category and cat-
Weassumea scenario similar to the physical setting of The
TalkingHeadsExperiment
(Steels 1999),i.e., a set of robotic
"talking heads" playing language gameswith each other
about scenes perceived through their camerason a white
board in front of them. Figure 1 showsa typical configuration of the whiteboardwithseveral geometricfigures pasted
onit.
Weconsidernowthe first steps of a languagegame,which
are intendedto conceptualizethe perceptual informationobtained by the agents after lookingat samearea of the white
board.
Copyright©2001,American
Associationfor Artificial Intelligence(www.aaal.org).
All rights reserved.
Sensory Channels
The agents look at one area of the white board by capturing an imageof that area with their cameras. First, they
segmentthe imageinto coherent units in order to identify
the objects that constitute the context of a languagegame.
Next, somesensory channels (implementedby low level visual processes) gather informationabout each segment,such
as its color, horizontal or vertical position. In the experimentsdescribedin this paper, we assumethat there are only
two primitive sensory channels.
¯ H(o)computesthe x-midpositionof object o.
¯ V(o) computesthe y-midpositionof object
The values returned by the sensory channels H and V are
sealed so that its rangeis the interval (0.0 1.0). Consider
101
followed by the upper and lower boundsof the region they
carve out. Thus [TOTALLY-LEFr]
is labeled as [H 0.0
0.25], becauseit is true for a region between0.0 and 0.25
on the H channel. Wealso use first order logic predicate
notation to emphasizethe fact that perceptually grounded
categoriescorrespondto n-ary predicatesof first order logic.
For example,we use the unary predicate [H 0.0 0.25](0)
refer to the category [H 0.0 0.25], and the binary predicate
[VD0.0 1.0](ol,o2) to refer to the category[VD0.0 1.0].
@
Figure 1: Eachobject in the scene is characterized by its
values on two primitive sensory channels: H and V.
the three objects numbered
in figure 1, object 1 has the values H(O1)=0.5,V(O1)=0.5,object 2 the values H(O2)=0.5,
V(O2)=0.2,and object 3 the values H(O3)=0.8,V(O3)=0.2.
In addition to two primitive sensory channels, there are
other sensory channels constructed fromthem.
¯ HD(ol,o2)computesthe difference of the x-midpositions
of objects ol and o2, i.e., HD(ol,o2)=H(ol)-H(o2).
¯ VD(ol,o2)computesthe difference of the y-midpositions
of objects ol and 02.
¯ EQ(ol,o2)is definedas a predicate, i.e., a functionwhich
takes the value 1 (true) if the values of the primitive sensory channels H and V are equal for objects ol and 02,
and 0 (false) otherwise.
For example, the sensory channel VDhas the value 0.3
whenit is applied to the pair of objects (O1,O2), i.e.,
VD(O1,O2)=0.3.
The range of the sensory channels HI3 and
VDis (-1.0 1.0). Therange of the sensory channelEQis the
discrete set of Booleanvalues {0, 1}.
Categories
Thedata returned by sensory channelsare values froma continuous domain(except for sensory channel EQ). To be the
basis of a natural languageconceptualization, these values
must be transformed into a discrete domain. One form of
categorization consists in dividing up each domainof values of a particular sensory channelinto regions and assigning a category to each region (Steels 1996). For example,
the range of the H channelcan be cut into two halves leading to a distinction between[LEFI](0.0 < H(x) < 0.5)
[RIGHT](0.5 < H(x) < 1.0). Object 3 in figure 1 h~s
value H(O3)=0.8and wouldtherefore be characterized
[RIGHT].Similarly, the VDchannel can be cut into two
halves, leading to a distinction between[ABOVE]
(0.0
VD(x)< 1.0) and [BELOW]
(-1.0 < VD(x)<
It is alwayspossible to refine a categoryby dividing its
region. Thus an agent could divide the bottom region of
the H channel (categorized as [LEFT])into two subregions
[TOTALLY-LEFt]
(0.0 < H(x) < 0.25), and [MID-LEFr]
(0.25 < H(x) < 0.5). Thecategorization networksresulting
from these consecutivebinary divisions form discrimination
trees (Steels 1996).
Followingthe notation introduced in (Steels 1999),
label categories using the sensory channelthey operate on,
102
Categorizers
At the sametime the agents build categories to conceptualize
perceptual information,they construct cognitive procedures
that allow themto check whetherthese categories hold or
not for a given tuple of objects. A categorizer is a cognitive procedure capable of determiningwhether a category
applies or not. For example,the behaviorof the eategorizer
for category [V 0.0 0.5](0) can be described by a function
that takes the value 1 if 0.0 < V(o)< 0.5, and 0 otherwise.
Categorizers give groundedmeaningsto symbolsby establishing explicit connectionsbetweeninternal representations (categories) andreality (perceptualinput as processed
by sensory channels) (Steels and Vogt 1997). These conneetions are learned through language games(Wittgenstein
1953), and allow agents to cheekwhethera category is true
or not for a given tuple of segmentedobjects. Mostimportantly, they provideinformationon the perceptualand cognitive processesan agent mustgo throughin order to evaluate
a givencategory.
The behavior of the categorizers associated with the
perceptually groundedcategories used in this paper can
be described by linear constraints. Weuse the notation
c (£) to refer to the categorizer whichis capable
[CAT]
recognizing whethercategory [CAT](£)holds or not for
giventuple of objects. It is importantto realize however
that
eategorizersare cognitive procedures,not linear constraints
or implementedfunctions. Weonly use linear constraints
to modeltheir behavior. Wedo not assume,therefore, that
natural agents do learn such constraints, but that they build
themin their perceptual and cognitive systems.
Thelinear constraints belowdescribe the behaviorof the
eategorizers associated with the categories up, down,above,
and below.
[V0.5 1.0]c(z)
= 0.5<V(z)AV(x)<I.0
W 0.0 0.5]c (z) -- 0.0<V(z)^V(z)<0.5
[VD0.0 1.0]c(z,y)
-- 0.0<VD(z,V) AVD(x,V)<I.0
c
[VD-1.0 0.0] (z, y) -- - 1.0 < VD(z,y) A VD(z,y) <
Logical Categories
Weconsider nowthe process of truth evaluation, and describe howlogical categories can be constructed by identifying sets of outcomesof the process of truth evaluation.
Logical categories are important for a numberof reasons:
(1) they allow the generationof an infinite set of concepts,
whichcorresponds
to the set of free quantifierfirst orderformulas that can be constructed from perceptually grounded
categories; (2) they set the basis for intuitive reasoning
the level of propositionallogic.
Evaluation
Channel
The evaluation channel(denotedby E) is capable of observing the internal state of the agent. It is appliedto a category
tuple and an object tuple, and returns a tuple of Booleanvalues resulting from evaluating each category on the object
tuple. It evaluatescategories by finding their categorizers,
applying themto a given object tuple, and observing their
output.
Let ~ = (cl,...,cn)
be a category tuple and ~ =
(ol,..., o,+) an object tuple, the result of applyingthe evaluation channelE to ~’and5is E(6’, 59 = (Vl,..., v,+), where
eachvi is the result of applyingc~ (the categorizerof c~)
(i.e., v+=cic (o-3).
For example,E(([V 0.0 0.5](z), [H 0.5 1.0](:e)),
(0, 0), becauseO1(object 1 in figure 1) is neither on
lowerpart nor on the right part of the white board. That is,
[V 0.0 0.5]c(O1)=0 and [H 0.5 1.0]c(O1)=0.
Logical Categories and Concepts
Therangeof the evaluationchannelis infinite, since it consists of all the Booleantuples of any arity. In the experimentsdescribed in this paper, we will focus on unary and
binary tuples only1. Therange of this channelfor unarytupies of categories is the set { 1,0}, where1 and 0 correspond
to the logical categories true and false, respectively. The
rangeof the evaluationchannelfor categorytuples of length
twois the set {(0,0),(0,1),(1,0),(1,1)). If weassign
ical category to each elementof this set, we obtain one of
the categories usedin classical logic (conjunction),and several conjunctionsused in natural language, such as neither
(0,0), but (1,0), or although(0,1) 2. If we considerhowever
subsets of the rangeof the evaluationchannel(instead of individual elements)weobtain the meaningsof all the cermetfives used in propositionallogic, i.e., negation,conjunction,
disjunction, implication and equivalence.
For example,wesay that sentencecl Vc2 is true if the result of evaluatingthe pair of categories(cl, c2) is Boolean
pair whichbelongsto the set {(1, 1), (1, 0), (0, 1)).
samecan be said for the sentence Cl +--> c2 and the set of
Booleanpairs {(1,1),(0,0)).
The agents construct logical categories by identifying
subsets of the range of the evaluation channel. The evaluation gamecreates situations in whichthe agents discover
such subsets, and use themto distinguish a subset of objects
fromothers in a given context. Therepresentation of logical
categories, such as conjunction,disjunction or implication,
1Weomitthe parentheses
if the evaluationchannelisappliedto
categorytuplesof lengthone.
2TheassociationbetweenBooleanpairs andnatural language
connectives
is approximate:
neitheris usedwhenthe twosentences
that followit are negative;butis usuallypreceded
byan affirmative
sentenceand followedby a negativeone; and althoughis sometimesprecededbya negativesentenceandfollowedby an affirmative one.
103
as sets of Boolean
tuples is in fact equivalentto the truth tables used in classical logic for describing the semanticsof
propositional connectives.Trivial subsets such as the empty
set, ((1),(0)} or {(I,1),(i,0), (0,I),(0,0)} are not
with categories, because they are not informative. Subsets
which combineunary and binary Boolean tuples, such as
{(1,1),(0)} are not consideredeither.
Thenotation used for describing logical categories consists of the symbolE (for evaluation channel) followed
an schematicrepresentation of the set of Booleantuples for
whicheach category holds. Thus, [13 1](c) corresponds
true(c), becauseit is true if the result of applyingthe evaluation channelto e belongsto the set of Booleantuples {(1)}.
[E 0](e) represents the logical categoryfalse(c), or the
ical connectivenot(e), becauseit is true if the result of applying the evaluation channelto c belongsto the set {(0)}.
Thelogical connectiveor(el,c2) (disjunction) is represented
by the logical category [E ll-10-01](x,y), whichis true
the result of applyingthe evaluation channelto the pair of
categories(c1,c2) belongsto the set {(1,1),(1,0),(0,1)}.
A key aspect of logical categories is that they describe properties of categories, as opposedto perceptually
groundedcategories whichdescribe properties of objects.
It is therefore natural to applylogical categories to perceptually groundedcategories to construct structured units of
meaning, which we call concepts. For example, the concept [E 0]([V 0.0 0.5](x)) can be constructedby applying
logical category[E 0](c) (i.e., not(c)) to the category
0.5](x) (i.e., down(x)).The concept [I~ II-10-01]([[V
1.0](x),[H0.5 1.0](x)) (i.e., or(up(x),right(x)))
strutted similarly by applyingthe logical categoryor(el,c2)
to the categoriesup(x) and right(x).
If we considerperceptually groundedcategories as atomic
concepts, wecan observethat: (1) logical categories cannot
only be applied to perceptually groundedcategories but to
all sorts of concepts;and (2) the result of applyinga logical
categoryto a single conceptor a pair of conceptsis again
a newconcept. Logical categories allow generating then an
infinite set of concepts,whichcorrespondsto the set of free
quantifier first order formulasthat can be constructedfrom
perceptually groundedcategories. The notion of concepta.s
structure of meaningconstructed by an agent can then be
defined inductively as follows: (1) a perceptually grounded
categoryis a concept;(2) if l(£) is an n-ary logical category
and ~’is an n-ary tuple of concepts,then l(c") is a concept.
Logical Categorizers
The categorizers of logical categories are cognitive procedures that allow determiningwhether a logical category
holds or not for a tuple of conceptsand an object tuple. As
we have explainedabove, logical categories can be associated with subsets of the rangeof the evaluationchannel. The
behaviorof their eategorizers can be describedtherefore by
constraintsof the formE(~,0-3 E St, wherel is a logical eategory, St is the subsetof the rangeof the evaluationchannel
for whichI holds, ~’is a tuple of concepts,and 5is an object
tuple. That is, by functionswhichtake the value 1 (true)
the result of evaluating the tuple of conceptsto whichthe
logical categoryis appliedbelongsto the subsetof the range
of the evaluation channelfor whichthe category holds, and
0 (false) otherwise.
Thefollowingconstraints describe the behaviorof the categorizers for the logical connectivesnot, and, or, if (implication) and/ft. Theeategorizer whichis capableof checking
whethercategory [CAT](c-)holds is denotedby [CAT]C(c-’).
The variables c, cl and c2 range over concepts. The variable
~rangesover object tuples.
[E0]c(c)=E(c,o-3e {(0)}
[E ll]C(cl,c2) =_ E((cl,c2),o-) 6 {(1, 1)}
[E l l-10-01]C(cl,c2)= E((cl, c2), 6) E {(1, 1),
i:
~
perception
whiteboardobjects
action
topic == pointed
perception
4 action
SPEAKER
lexicalization
language
expression
interpmtation~ m~--ng~
Figure 2: Thesemiotic square summarizes
the cognitive processes involvedin a languagegame.
(0,1)}
[E 11-01-00]C(cl,c2)- E((cl, c2), 6) {(1, 1), (0 , 1)
(0,0)}
[E ll-00]C(cl, c2) = E((cl, c2), 6) E {(1, 1), (0,
Becausethe categorizers of logical categories use the
evaluation channel, this channel can be naturally extended
to evaluategenericconcepts(i.e., free quantifier first order
formulas)recursively using the categorizers of logical and
perceptually groundedcategories. Atomicconcepts (perceptually groundedcategories) are evaluated by applyingtheir
categorizers to the object tuple given as input to the evaluation channel. Nonatomicconcepts of the form l(~ are
evaluatedby applyingthe eategorizer of the logical category
l to the result of evaluatingthe concepttuple ~’onthe object
tuple given~.s input to the evaluationchannel. Thefollowing
is an inductivedefinition of the evaluationchannelE(~’, o-’)
for generic concepts.
1. If [CAT](£)is a perceptually groundedcategory, then
E([CAT](£), o-’) = [CAT]C(o’).
2. If l(c-) is a concept,wherel is a logical category,E is
tuple of conceptsand St is the subset of the range of the
evaluation channelfor whichI holds, then E(l(c-), 6)
if E(E, 6) E St, and 0 otherwise.
Constructing
Grounded Models
In (Sierra 2001a), we defined grounded mo
del astheset
of concepts and eategorizers constructed by an agent at a
given timein its development
history. This section describes
howgroundedmodelsare constructeda.s a side effect of the
agenLsactivity playing different types of languagegames.
First, the agents play languagegamesof the sort described
in the book The Talking HeadsExperiment(Steels 1999).
Thesegamesallow themto construct perceptually grounded
categories and a shared lexicon for referring to such categories. Oncethe agents have learned a shared lexicon for
perceptually groundedcategories, they start playing evaluation games,whichallow themto construct logical categories
and a shared lexicon for referring to them.
Wedescribe the evaluation gamein this section, and refer the reader to (Steels 1999)for a thoroughdiscussion
the languagegamesused for learning perceptually grounded
categories, and their crucial role on the origins of language
104
(Steels 1997). The evaluation gameis a language game
whichfocuses on the evaluation of concepts and the charaeterization of sets of objects by meansof logical compositions of perceptually groundedcategories. The semiotic
square (Steels 1999) (figure 2) summarizesthe maincognitive processes involvedin a languagegame.
The evaluation gameis played by two agents. One agent
plays the role of speaker, and the other agent plays the role
of hearer.
PerceptionFirst, the speakerlooks at one area of the white
board and directs the attention of the hearer to the same
area3. Both, speaker and hearer, segmenttheir images of
the white board into coherent units, whichroughly correspondto different objects pasted on the white board. Then,
the speaker chooses a context for the languagegame.This
is a set of object tuples of the samearity. In the experiments
wehavecarried out so far, the object tuples are of arity one
or two. The speaker indicates the hearer the set of object
tuples that constitute the context by pointing to them. Different protocolsexist for pointing to unaryand binaryobject
tuples. Then,the speaker picks up a subset of object tuples
fromthe context whichwewill call the topic. In our experiments, the topic contains a variable numberof object tuples
which goes from one to three. The rest of the object tupies in the context constitute the background.Both, speaker
and hearer, use their sensory channelsto gather information
about each object tuple in the context, and store that informationso that they can use it in subsequentstages of the
language game.
For example, in evaluation game1 (described below)
the speaker chooses as context the set of object tuples
{(O1), (O2), (O3)}, where O1,02 and 03 represent
numberedobjects in figure 1. The topic chosen by the
speakeris the subsetof objecttuples{ ( O1), ( O3 ) }, and
backgroundthe subset { (02)
Discrimination The speaker tries to find a pair of categories whichdistinguishes the topic from the background.
3Robotsdirectthe attentionof othersto specificareasor objects
onthe whiteboardbyinformingeachother of the directionwhere
theyare focusing.
That is, a pair of categories (cl, e2) suchthat their evaluation on the topic is different fromtheir evaluation on any
object tuple in the background.The evaluation of a tuple
of conceptson a set of object tuples (e.g., the topic) is defined as follows. Weuse the function symbolE to denote
the evaluationchannel, and give this definition as an extension of previous ones. Let ~’ be a tuple of concepts, and
s a set of object tuples, E(6’, s) = {E(~, ~}~,, where
E(fi’, 6") is as defined in previous sections. The speaker
tries to find then a pair of categories (cl,c2) such that
E((cl, c2), topic) A E((cl, c2), background)
For example, consider the pair of categories ([-V 0.0
0.5](x),[H 0.5 1.0](x)) (i.e., (down(x),right(x))).
of categories discriminates the topic fromthe background
in
evaluationgame1, becausethe evaluation of the pair of eategorieson the topic doesnot intersect with the evaluationof
the pair of categories on the background.That is, E(([V0.0
0.5](x),[H 0.5 1.0](x)),{O1,O3})={(0,0),(1,1)},
0.5](x),[H 0.5 1.0](x)),{O2})={(1,0)}, and {(0,0),(1,1)}
{(1,0)}=~.
If the speaker cannot find a discriminatingpair of categories, the gamefails. Otherwise,the speakertries to find a
logical categorywhichis associated with the set of Boolean
pairs tv resulting fromevaluating the pair of categories on
the topic. If it does not haveany logical categoryassociated
with this set, it creates a newlogical category of the form
[E tv](cl,c2), togetherwithits categorizer,and addsit to its
current groundedmodel.
Continuingwith the exampleof evaluation game1, if the
speaker does not havea logical eategory associated with the
set of Booleanpairs { (1,1),(0,0)} (resulting fromevaluating
the discriminatingpair of categories (IV 0.0 0.5](x),[H
1.0](x)) on the topic {(O1),(O3)}),it can create a
ical categoryof the form[E 11-00](el,c2) and add it to its
grounded model.
Theconeeptconstructed by applyingthis logical category
to the categories in the discriminatingpair characterizesthe
topic as the set of object tuples in the context for whichthe
concept is true. For example, in evaluation game1, the
speaker maycharacterize the topic as the set of object tupies in the context whichare on the lowerpart of the image
if andonly if they are also on the right part of it. Thetopic
{(O1),(O3)}is the set of object tuples in {(O1),(O2),(O3)}
whichsatisfy the formula iff(down(x),right(x)). This formulais internally represented by the concept [E 11-00]([V
0.00.51(x),[H
0.51.0](x)).
Lexlealization The speaker examinesits lexicon in order
to find an expressionassociated with the logical category.
The lexicon of an agent contains associations of the form
(C,W,R),whereC is a category, Wis a linguistic expression,
and R is the rate of the association (i.e., a numberbetween
0 and 1 which correspondsto the confidencethe agent h~s
on the usefulness of that association). If there are several
candidates, the speaker chooses the expressionof the association withhighest rate. If there are no associationsfor the
logical category C, the speaker mayinvent a newexpression
Wwith probability CR, and add an association of the form
105
(C, W,0) to its current lexicon. TheconstantCRis the creativity rate of the agent. It determinesthe probability with
whichthe agent mayenrich the language by creating new
words.In our experimentsCRis set to 0.9 for all agents.
For example,if the speaker has no associations for the
logical category[E 11-00](c1,c2)in its current lexicon,
mayinvent a newexpression(e.g., ifJ) and add a newassociation of the form([E 11-00](x,y),iff(x,y),0)to its lexicon.
In this paper, we assumethat internal representations and
linguistic expressionshaveidentical structures, so that the
agents do not have to worryabout learning grammarrules
(see (Steels 1998) and (Steels 2000) for someexperiments
on the origins of syntax in visually groundedagents). This
simplification allows us to concentrate on the issue of the
origins of logical categories fromthe point of viewof their
semanticalfunction.
Next, the speakerconstructs a sentenceusing the lexicalizations of the logical categoryand the categoriesin the discriminatingpair. At this stage, it is assumed
that the agents
havelearned a shared lexicon for perceptuallygroundedcategories already, so that the learningprocesscan focus on the
constructionof a sharedlexicon for logical categories.
For example, the speaker mayconstruct the sentence
iff(down(x),right(x)) to expressthe concept[E 11-00]([V0.0
0.5](x),[H 0.5 1.0](x)), assumingthe associations ([E
00],iff(x,y),R1), ([V 0.0 0.5](x),down(x),R2)and
1.0](x),right(x),R3)are part of its lexiconalready.
Oncethe speaker has constructed a sentence which expresses the conceptthat characterizes the topic, it communicates that sentenceto the hearer.
Interpretation The hearer interprets the sentence using
the associations between(logical or perceptually grounded)
categories and linguistic expressionsin its lexicon. That is,
it tries to reconstruct the conceptencodedby the sentence.
If the hearer does not have any association for someof the
expressions used in the sentence, the gamefails and a repair process takes place. The speaker points to the topic
so that the hearer can guess the logical categoryit has used
to conceptualizethe topic, and acquire an association betweenthat logical category and the expression used by the
speaker. This happens with probability AS. The constant
ASis the assimilationrate of the agent. It indicatesthe probability with whichan agent adoptsexpressionsused by other
agents. In our experimentsASis set to 0.9 for all agents.
For example,in evaluation game1 the hearer mayinterpret the sentenceiff(down(x),right(x) as concept [E 1100]([V0.0 0.5](x),[H0.5 1.0](x)), if its lexiconcontains
right associations betweencategories and linguistic expressions. Otherwise, it mayacquire the association ([E 1100](x,y),iff(x,y),0)as a result of the repair process.
Identification Thehearer tries to find the referent, i.e., the
set of object tuples in the contextthat satisfy the meaning
of
the sentence.Forexample,the set of object tuples whichsatisfy the meaningof iff(down(x),right(x)) in evaluationgame
1 is {(O1),(O3)}.If the meaningof the sentence is true
all the object tuples in the context or for noneof them, the
gamefalls and a repair proeess similar to that describedin
the previous step takes place. Failures such as these may
happenwhenspeaker and hearer associate the sameexpression withdifferent logical categories.
Coordination
Thehearer points to the referent, i.e., to the
set of object tuples it identified in the previousstep. The
gamesucceedsif the referent is equal to the set of object
tuples the speakerhad in mind(the topic). If the gamesueceeds, speaker and hearer incrementthe rates of the associations they used for lexicalizing and interpreting the logical category by an amountI, and decrementthe rates of
competingassociations whichwere discarded by both agents
duringthe processesof lexicalization and interpretation, respectively. If the gamefalls, only the rates of the associations used by speaker and hearer are decremented,and a
repair processsimilar to that describedfor the previousstep
takes place. In our experiments,the amountI by whichrates
are modifiedis set to 0.1.
Wesummarizebelow the main steps of evaluation game
1, whichhas beenused as examplein our description of the
evaluation game.
Game number : 1
Speaker : A1
Hearer : A2
Topic: { (01), (03)}
Background:{ (02)
Speaker conceptualizes topic as:
[E Ii-00]([V 0 0.5] (x), [H 0.5 i] (x))
Speaker lexicalizes concept as:
iff (down(x), right(x))
Hearer interprets sentence as:
[E ii-00]([V 0 0.5] (x), [H 0.5 i] (x))
Hearer points to: {(01), (03)}
Speaker says: OK
The game is a success.
Experiments
Wehave run someexperiments in order to see whether a
couple of agents can learn a set of perceptually grounded
categories for spatial reasoning, and a set of logical categories whichallow themto construct logical formulas from
perceptually groundedcategories and to evaluate those formulas using the categorizers of perceptually groundedand
logical categories.
In the first experiment, the agents play languagegames
of the sort described in (Steels 1999). Thesegamesallow
themto learn perceptually groundedcategories and a shared
lexicon for referring to those categories. Figure 3 showsthe
evolutionof the lexical coherencein a series of 500 games.
Thelexical coherence(Steels 1999)is a measureof the similarity of the agents’ lexicons. It can be observedthat they
reach total lexical coherenceafter 250games.The spatial
categories and lexicon learned after 500 gamesare shownin
table 1.
In the second experiment, the agents play evaluation
games. These gamesare played after a series of 500 language games,during whichthe agents learn a shared lexi-
106
1.2
1
0.8
0.6
0.4
0.2
0
5 9 13 17 21 25 29 33 37 41 45 49
Figure 3: This graph showsthe evolution of the lexical coherence in a series of 500 languagegames. It can be observedthat the agents reach total lexical coherenceafter 250
games.
con for perceptually groundedcategories that they needed
for playing evaluation games. Figure 4 showsthe evolution of the lexical coherencein a series of 10000evaluation
games.First, the agents learn unarycategories, such as true
and false. Thisis donevery quickly. Then,they learn binary
categories, such as conjunetion,disjunction, implication or
equivalence. This takes longer, about 3000gamesto reach a
lexical coherenceof 0.97. In additionto logical connectives,
suchas not, and, or, if or iff, the agentslearn other categories
whichcorrespondto natural languageconjunctions, such as
but, neither, howeverand others. Thelogical categories and
lexicon learned after 10000gamesare shownin table 2.
Table1. Lexicon:spatial categories.
Expression Category
English
Rate
seba
1
[VP0.0 0.5]
down(x)
bi
[VP0.5 1.0]
up(x)
1
dilebo
[HP0.0 0.5]
left(x)
1
na
[HP0.5 1.0]
right(x)
1
sule
[VD-1.0 0.0] below(x,y)
1
ba
[VD0.0 1.0]
above(x,y)
1
belebubi
[HD-1.0 0.0] leftof(x,y)
1
le
[HD0.0 1.0]
rightof(x,y)
1
Table2. Lexicon: logical categories.
Expression Category
English
Rate
lebasu
[E 0]
not(x)
1
nadebabu
[E 1]
true(x)
1
sosulu
and(x,y)
[E 11]
1
deloda
[E 10]
1
nodasosu
[E 01]
1
1o
1
[E 00]
benusu
[E 11-10]
1
si
1
[E 11-01]
luse
[E 11-00]
iff(x,y)
1
babesu
[E 10-01]
xor(x,y)
1
nidi
[E 10-00]
1
nu
[E 01-00]
1
di
[E 11-10-01] or(x,y)
1
ne
[E 11-10-00]
1
sede
[E 11-01-00] if(x,y)
1
sabeleli
[E 10-01-00]
1
1
0.9
0.8
o.7
0.6
0.5
0.4
0.3
0.2
o.1
o
100020003000 400050006000700080009000 10000
Figure 4: This graphshowsthe evolution of the lexieal coherence in a series of 10000evaluation games.First, the
agents learn unaryconnectives(~), reaching total coherence
very soon. Afterwards, they start learning binary connectives (A, V, --% ++), reaching a lexical coherenceof 0.97
after 3000games.
Intuitive Reasoning
A natural question is to ask what the agents have really
learned by playing language games. In first place, they
haveconstructedcategorizers for perceptuallygroundedcategories (such as up, down,right, left, above, below,rightof
and leftof), andfor logical categories(such as negation,disjunction, conjunction, implication or equivalence). In second place, they have acquired a shared vocabulary for referring to those categories. Andin third place, they have
learnedto evaluategenericconcepts(i.e., free quantifierfirst
order formulas)using the categorizers of logical and perceptually groundedcategories.
Accordingto the result of the evaluationprocess, the concepts that can be constructed by an agent can be classified
into three categories: (1) intuitive truths, whichare true for
everyobject tuple; (2) intuitive falsehoods,whichare false
for every object tuple; and (3) regular concepts, whichare
true for someobject tuples and false for others.
Intuitive reasoningis a process by whichthe agents discoverrelationships that hold among
the categorizers of basic
concepts. For example,an agent maydiscover that the concept above(x,y) ^ above(y,is an intuitive fal sehood, because the categorizers of above(x, y) andabove(y, never
return true at the same time. Similarly, it can discover
that the concept above(x, y) ~ --~above(y, is an int uitive truth, since the eategorizerof above(y,x) returns false
wheneverthe categorizer of above(z, y) returns true.
Intuitive reasoning can then be used for determining
whethera conceptis an intuitive truth, an intuitive falsehood
or a regular conceptof a groundedmodel. It mayworkas a
processof constraint satisfaction in natural agents, by which
they try to discoverwhetherthere is any combinationof values of their sensorychannelsthat satisfies a givenconcept.It
is not clear to us, howthis processof constraint satisfaction
can be implemented
in natural agents. It maybe the result of
a simulation process, by whichthe agents generate possible
combinationsof values for their sensory channelsand check
107
whetherthey satisfy a given concept. Or it maybe grounded
on the physical impossibility of firing simultaneouslysome
categorizers due to the waythey are implemented
by physically connectedneural networksin natural agents.
Toget an idea of the set of intuitive truths that are implied
by the groundedmodelG constructed by a couple of agents
in the previousexperiments,consider the first order theory
Ts. The language of Ts consists of two binary predicate
symbolsA(x, y) (above) and R(z, y) (rightof). Its
nonlogiealaxiomsis as follows.
A(z,y) -~ ~A(y,z)
(1)
A(x, y) A A(y, z) -+ A(z,
(2)
A(z,y) A A(z,z) A’~R(y,z) A-~R(z,y) A
(3)
A(y, z) V A(z,
R(z, y) --+ -,R(y,
(4)
R(z, ^ z) --+
R(x, y) A R(z, z) A "~A(y, z) A -~A(z, y)
(5)
(6)
R(y,v R(z,
It is easy to see that everytheoremof Ts is an intuitive
truth of groundedmodelG, because every axiomof Ts is an
intuitive truth of G, and the intuitive truths of G are closed
underthe inferencerule of resolution.
Whenthe behavior of the categorizers for basic concepts
(i.e., categories)can be describedby linear constraints, intuitive reasoningcan be implemented
as a process of linear
constraint satisfaction. This is so, becausethe behaviorof
the categorizer of every conceptcan be described by a disjunction of linear constraint systemswhichcan be computed
by replacing everycategoryby a linear constraint describing
the behaviorof its categorizer in the concept, and computing
the disjunctive normalformof the constraint systemresulting fromthe substitution. Oncethis transformationhas been
done, showingthat a conceptis an intuitive truth is equivalent to provingthat the disjunction of constraint systems
associated with the conceptis true for every value of the
sensory channels, and this can be done by provingthat the
disjunction of constraint systemsassociated with the negation of the conceptis unsatisfiable.
For example,it can be shownthat axiom2 (whichstates
that the relation above- A(x,y)- is transitive) is an intuitive
truth of groundedmodelG by checking that the following
disjunction of constraint systemsis unsatisfiable for every
valueof z, y and z in the interval (0.0 1.0). Thisformulahas
been obtained by: (1) replacing every category by a linear
constraint describing the behaviorof its eategorizer in the
concept associated with the negation of axiom2 in grounded
modelG; (2) replacing every instance of VPOS(x),VPOS(y)
and VPOS(z)by x, y and z in the expressionresulting from
step 1; and (3) computingthe disjunctive normalformof the
result of step 2.
{0<x-y, x-y<l.0, O<y-z, y-z<l.0, x-z<0}V
{0<x-y, x-y<1.0, O<y-z, y-z<l.0, 1.0<x-z}
It is easy to checkthat this disjunctionof constraint systemsis unsatisfiable. A disjunction of constraint systemsis
unsatisfiable if everydisjunct is unsatisfiable, andeachdisjunct is a linear constraint systemthat can be checkedby
a linear constraint solver, such as the one implemented
in
Sicstus Prolog.
The rest of the axioms of Ts can be shownto be intuitive truths of groundedmodelG by constraint satisfaction
as well. Thatis, the coupleof agentsusedin the previousexperimentscan discoverthat the relations aboveand rightof
are antisymmetric,transitive andtotal orderson objects located in the samehorizontalor vertical positions by intuitive
reasoning. In this sense, one can say that groundedmodel
G provides an explanation for the fact that mostpeople accept these axiomsas intuitively true, and use themfor building logical theories for common
sense reasoning (McCarthy
1959), (McCarthy1990).
It can also be provedthat intuitive reasoningin grounded
modelsin whichthe behavior of the categorizers for basic
conceptscan be describedby linear constraints is closed under resolution. That is, if twoconceptsare intuitive truths
of a groundedmodel,its resolventis an intuitive truth of the
groundedmodelas well. This is a consequenceof the property of soundnessof the inferencerule of resolution, and the
fact that the linear constraints describingthe behaviorof the
categorizers of a groundedmodelconstitute a model(in the
sense of modeltheory semantics(Shoenfield1967)) of every
4.
intuitive truth of the groundedmodel
Everytheoremof Ts is, therefore, an intuitive truth of
groundedmodelG, but not every intuitive truth of G is a
theorem of Ts, because groundedmodel G has categorizers for concepts(such as up(x), down(x),right(x), left(x),
below(x,y) or leftof(x,y)) whichare not even included
the language of Ts. For example, the formulas up(x)
-"down(x) or up(x) A down(y) ~ -,above(y, x) intuitive truths of groundedmodelG, but are not theoremsof
Ts.
Conclusions
Groundedmodels (Sierra 2001b) differ from axiomatic
theories in establishing explicit connectionsbetweenlanguageand reality that are learned through languagegames
(Wittgenstein 1953). Theseconnections, whichwe call eategorizers, give groundedmeaningsto symbolsby linking
themto the portionof reality they refer to.
In this paper, wehaveexplainedhowcategories (i.e., symbolic representations) and categorizers (groundedmeanings)
can be constructed by autonomous
agents connectedto their
environmentthrough sensors and actuators using someconeeptualization mechanismsand languagegamesproposed in
(Steels 1999).
Wehavethen consideredthe process of truth evaluation,
proposed a language gamethat can be used simulating the
generation of logical categories, and showedhowlogical
categories and eategorizers allow the construction and evaluation of generic concepts of the samecomplexityas free
quantifier first order formulas.
Finally, we have described someexperiments in which
a couple of visually groundedagents construct a grounded
4Thelinear constraintsdescribingthe behaviorof the categorizers axeconsidered,
in this case,as predicateswhichare true for
thoseobjecttuplesthat satisfythem.
108
modelthat can be used for spatial reasoning, and we have
explained howgroundedmodelscan be used for intuitive
reasoning.
Acknowledgments
The author wouldlike to thank Luc Steels for manyinteresting conversationson the topics of the origins of language
and groundedrepresentations.
References
McCarthy, J. 1959. Programs with Common
Sense. In
Mechanizationof ThoughtProcesses, Proceedingsof the
Symposium
of the National Physics Laboratory, 77-84.
McCarthy, J. 1990 Formalizing Common
Sense. Papers
by John McCarthy.Edited by Vladimir Lifschitz. Ablex
Publishing Corporation, NewJersey.
Piaget, J. 1985. TheEquilibrationof CognitiveStructures:
the Central Problemof lntellectual Development.University of ChicagoPress, Chicago.
Shoenfield, J R. 1967. Mathematical Logic. AddisonWesleyPublishing Company.
Sierra-Santib(ffiez, J. 2001. GroundedModels.In Working
Notes of the AAAI-2001Spring Symposiumon Learning
GroundedRepresentations, 69-74. Stanford University,
Stanford, California.
Sierra-Santib~ez, J. 2001. GroundedModelsas a Basis
for Intuitive Reasoning.In Proceedingsof the Seventeenth
International Joint Conferenceon Artificial Intelligence.
MenloPark, Calif.: International Joint Conferenceson Artificial Intelligence,Inc.
Steels, L. 1996. Perceptually GroundedMeaningCreation.
In Proceedingsof the International Conferenceon MultiAgent Systems. MenloPark, Calif.: AAAIPress.
Steels, L. 1997. The Synthetic Modelingof LanguageOrigins. Evolution of Communication
1(1): 1-35.
Steels, L. 1998. The Origins of Syntax in Visually
GroundedAgents.Artificial Intelligence, 103:1-24.
Steels, L. 1999. The Talking HeadsExperiment.Volume1.
Wordsand Meanings. Special Pre-edition for LABORATORIUM,Antwerpen.
Steels, L. 2000. The Emergenceof Grammarin Communicating Autonomous
Robotic Agents. In Proceedings of
the EuropeanConferenceon Artificial Intelligence 2000.
Horn, W.(ed.), IOSPublishing, Amsterdam.
Steels, L. 2001. The Role of Language in Learning
Grounded Representations. In Working Notes of the
AAA/-2001Spring Symposiumon Learning Grounded
Representations, 80-85. Stanford University, Stanford,
California.
Steels, L., and Vogt, P. 1997. GroundingAdaptive Language Gamesin Robotic Agents. In Proceedings of the
EuropeanConferenceon Artificial Life 97. The MITPress,
Cambridge Ma.
Wittgenstein, L. 1953. PhilosophicalInvestigations.
Macmillan, NewYork.