From: AAAI Technical Report FS-01-01. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved.

Grounded Models as a Basis for Intuitive Reasoning: the Origins of Logical Categories

Josefina Sierra-Santibáñez
Escuela Técnica Superior de Informática
Universidad Autónoma de Madrid
28049 Madrid, Spain
Josefina.Sierra@ii.uam.es

Abstract

Grounded models (Sierra 2001b) differ from axiomatic theories in establishing explicit connections between language and reality that are learned through language games (Wittgenstein 1953). This paper describes how grounded models are constructed by autonomous agents as a side effect of their activity playing different types of language games (Steels 1999), and explains how they can be used for intuitive reasoning. It proposes a particular language game which can be used for simulating the generation of logical categories (such as negation, conjunction, disjunction, implication or equivalence), and describes some experiments in which a couple of visually grounded agents construct a grounded model that can be used for spatial reasoning.

Introduction

In (Sierra 2001b), we introduced grounded models and compared them to axiomatic theories of mathematics. This paper describes how grounded models are constructed by autonomous agents as a side effect of their activity playing different types of language games (Steels 1999). It builds on previous work on the relation between language acquisition and conceptualization in visually grounded robotic agents (Steels 2001), by addressing the issue of the acquisition of logical categories, and proposing a particular language game (called the evaluation game) which can be used for simulating the generation of logical categories, such as negation, conjunction, disjunction, implication or equivalence.

Logical categories are very important for intellectual development (Piaget 1985), because they allow the generation of structured units of meaning, and they set the basis for intuitive reasoning at the level of propositional logic.

Grounded models are not proposed as substitutes for axiomatic theories. On the contrary, they provide an explanation for our ability to come up with some of these theories by intuitive reasoning. They explain, for example, why we accept certain axioms as intuitively true without further argumentation. Grounded models should rather be considered as precursors of axiomatic theories, since these theories require considerable linguistic competence for their formation.

The rest of the paper is organized as follows. First, we define the concepts of sensory channel, category and categorizer (Steels 1996), which are used for conceptualizing perceptual information. Then, we consider the process of truth evaluation, show how logical categories can be constructed by identifying sets of outcomes of the evaluation process, and explain how concepts can be generated by combining logical and perceptually grounded categories. Next, we describe how grounded models are constructed as a side effect of the agents' activity playing different types of language games, and present some experiments in which a couple of agents learn a grounded model that can be used for spatial reasoning. Finally, we introduce the notion of intuitive reasoning and show how grounded models can be used for it.

Perceptually Grounded Categories

We assume a scenario similar to the physical setting of The Talking Heads Experiment (Steels 1999), i.e., a set of robotic "talking heads" playing language games with each other about scenes perceived through their cameras on a white board in front of them. Figure 1 shows a typical configuration of the white board with several geometric figures pasted on it.
We consider now the first steps of a language game, which are intended to conceptualize the perceptual information obtained by the agents after looking at the same area of the white board.

Sensory Channels

The agents look at one area of the white board by capturing an image of that area with their cameras. First, they segment the image into coherent units in order to identify the objects that constitute the context of a language game. Next, some sensory channels (implemented by low level visual processes) gather information about each segment, such as its color, horizontal or vertical position. In the experiments described in this paper, we assume that there are only two primitive sensory channels.

• H(o) computes the x-midposition of object o.
• V(o) computes the y-midposition of object o.

The values returned by the sensory channels H and V are scaled so that their range is the interval (0.0, 1.0). Consider the three objects numbered in figure 1: object 1 has the values H(O1)=0.5, V(O1)=0.5, object 2 the values H(O2)=0.5, V(O2)=0.2, and object 3 the values H(O3)=0.8, V(O3)=0.2.

Figure 1: Each object in the scene is characterized by its values on two primitive sensory channels: H and V.

In addition to the two primitive sensory channels, there are other sensory channels constructed from them.

• HD(o1,o2) computes the difference of the x-midpositions of objects o1 and o2, i.e., HD(o1,o2)=H(o1)-H(o2).
• VD(o1,o2) computes the difference of the y-midpositions of objects o1 and o2.
• EQ(o1,o2) is defined as a predicate, i.e., a function which takes the value 1 (true) if the values of the primitive sensory channels H and V are equal for objects o1 and o2, and 0 (false) otherwise.

For example, the sensory channel VD has the value 0.3 when it is applied to the pair of objects (O1,O2), i.e., VD(O1,O2)=0.3. The range of the sensory channels HD and VD is (-1.0, 1.0). The range of the sensory channel EQ is the discrete set of Boolean values {0, 1}.

Categories

The data returned by sensory channels are values from a continuous domain (except for sensory channel EQ). To be the basis of a natural language conceptualization, these values must be transformed into a discrete domain. One form of categorization consists in dividing up each domain of values of a particular sensory channel into regions and assigning a category to each region (Steels 1996). For example, the range of the H channel can be cut into two halves, leading to a distinction between [LEFT] (0.0 < H(x) < 0.5) and [RIGHT] (0.5 < H(x) < 1.0). Object 3 in figure 1 has the value H(O3)=0.8 and would therefore be characterized as [RIGHT]. Similarly, the VD channel can be cut into two halves, leading to a distinction between [ABOVE] (0.0 < VD(x,y) < 1.0) and [BELOW] (-1.0 < VD(x,y) < 0.0).

It is always possible to refine a category by dividing its region. Thus an agent could divide the bottom region of the H channel (categorized as [LEFT]) into two subregions, [TOTALLY-LEFT] (0.0 < H(x) < 0.25) and [MID-LEFT] (0.25 < H(x) < 0.5). The categorization networks resulting from these consecutive binary divisions form discrimination trees (Steels 1996).

Following the notation introduced in (Steels 1999), we label categories using the sensory channel they operate on, followed by the lower and upper bounds of the region they carve out. Thus [TOTALLY-LEFT] is labeled as [H 0.0 0.25], because it is true for a region between 0.0 and 0.25 on the H channel. We also use first order logic predicate notation to emphasize the fact that perceptually grounded categories correspond to n-ary predicates of first order logic. For example, we use the unary predicate [H 0.0 0.25](o) to refer to the category [H 0.0 0.25], and the binary predicate [VD 0.0 1.0](o1,o2) to refer to the category [VD 0.0 1.0].

Categorizers

At the same time the agents build categories to conceptualize perceptual information, they construct cognitive procedures that allow them to check whether these categories hold or not for a given tuple of objects. A categorizer is a cognitive procedure capable of determining whether a category applies or not. For example, the behavior of the categorizer for category [V 0.0 0.5](o) can be described by a function that takes the value 1 if 0.0 < V(o) < 0.5, and 0 otherwise.

Categorizers give grounded meanings to symbols by establishing explicit connections between internal representations (categories) and reality (perceptual input as processed by sensory channels) (Steels and Vogt 1997). These connections are learned through language games (Wittgenstein 1953), and allow agents to check whether a category is true or not for a given tuple of segmented objects. Most importantly, they provide information on the perceptual and cognitive processes an agent must go through in order to evaluate a given category.

The behavior of the categorizers associated with the perceptually grounded categories used in this paper can be described by linear constraints. We use the notation [CAT]^c(x̄) to refer to the categorizer which is capable of recognizing whether category [CAT](x̄) holds or not for a given tuple of objects. It is important to realize, however, that categorizers are cognitive procedures, not linear constraints or implemented functions. We only use linear constraints to model their behavior. We do not assume, therefore, that natural agents learn such constraints, but rather that they build the corresponding categorizers in their perceptual and cognitive systems.

The linear constraints below describe the behavior of the categorizers associated with the categories up, down, above, and below.
\begin{align*}
[V\ 0.5\ 1.0]^c(x) &\equiv 0.5 < V(x) \land V(x) < 1.0 \\
[V\ 0.0\ 0.5]^c(x) &\equiv 0.0 < V(x) \land V(x) < 0.5 \\
[VD\ 0.0\ 1.0]^c(x,y) &\equiv 0.0 < VD(x,y) \land VD(x,y) < 1.0 \\
[VD\ {-1.0}\ 0.0]^c(x,y) &\equiv -1.0 < VD(x,y) \land VD(x,y) < 0.0
\end{align*}

Logical Categories

We consider now the process of truth evaluation, and describe how logical categories can be constructed by identifying sets of outcomes of the process of truth evaluation. Logical categories are important for a number of reasons: (1) they allow the generation of an infinite set of concepts, which corresponds to the set of quantifier-free first order formulas that can be constructed from perceptually grounded categories; (2) they set the basis for intuitive reasoning at the level of propositional logic.

Evaluation Channel

The evaluation channel (denoted by E) is capable of observing the internal state of the agent. It is applied to a category tuple and an object tuple, and returns a tuple of Boolean values resulting from evaluating each category on the object tuple. It evaluates categories by finding their categorizers, applying them to a given object tuple, and observing their output.

Let c̄ = (c1,...,cn) be a category tuple and ō = (o1,...,om) an object tuple. The result of applying the evaluation channel E to c̄ and ō is E(c̄, ō) = (v1,...,vn), where each vi is the result of applying ci^c (the categorizer of ci) to ō (i.e., vi = ci^c(ō)).

For example, E(([V 0.0 0.5](x), [H 0.5 1.0](x)), O1) = (0, 0), because O1 (object 1 in figure 1) is neither on the lower part nor on the right part of the white board. That is, [V 0.0 0.5]^c(O1)=0 and [H 0.5 1.0]^c(O1)=0.

Logical Categories and Concepts

The range of the evaluation channel is infinite, since it consists of all the Boolean tuples of any arity. In the experiments described in this paper, we will focus on unary and binary tuples only.¹ The range of this channel for unary tuples of categories is the set {1, 0}, where 1 and 0 correspond to the logical categories true and false, respectively.
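The sensory channels, interval categorizers and evaluation channel described so far can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the `SCENE` dictionary, the function names and the helper `interval_categorizer` are hypothetical and do not come from the paper, and, as noted above, categorizers are cognitive procedures that linear constraints merely model.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical scene: channel values of the three numbered objects in figure 1.
SCENE: Dict[str, Dict[str, float]] = {
    "O1": {"H": 0.5, "V": 0.5},
    "O2": {"H": 0.5, "V": 0.2},
    "O3": {"H": 0.8, "V": 0.2},
}

def H(o: str) -> float:
    """Primitive channel: x-midposition of object o."""
    return SCENE[o]["H"]

def V(o: str) -> float:
    """Primitive channel: y-midposition of object o."""
    return SCENE[o]["V"]

def VD(o1: str, o2: str) -> float:
    """Derived channel: difference of the y-midpositions of o1 and o2."""
    return V(o1) - V(o2)

Categorizer = Callable[..., int]

def interval_categorizer(channel: Callable[..., float],
                         low: float, high: float) -> Categorizer:
    """Model the categorizer of [channel low high] with strict bounds."""
    def categorizer(*objects: str) -> int:
        return int(low < channel(*objects) < high)
    return categorizer

down = interval_categorizer(V, 0.0, 0.5)    # [V 0.0 0.5]
up = interval_categorizer(V, 0.5, 1.0)      # [V 0.5 1.0]
right = interval_categorizer(H, 0.5, 1.0)   # [H 0.5 1.0]
above = interval_categorizer(VD, 0.0, 1.0)  # [VD 0.0 1.0]

def evaluate(categorizers: Sequence[Categorizer],
             objects: Tuple[str, ...]) -> Tuple[int, ...]:
    """Evaluation channel E: apply each categorizer to the object tuple."""
    return tuple(c(*objects) for c in categorizers)

print(evaluate((down, right), ("O1",)))  # (0, 0), as in the example above
print(evaluate((above,), ("O1", "O2")))  # (1,)
```

Because the bounds are strict, an object whose value lies exactly on a boundary (such as O1, with H = V = 0.5) falls in neither half, which is what makes the pair (down, right) evaluate to (0, 0) on O1.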
The range of the evaluation channel for category tuples of length two is the set {(0,0),(0,1),(1,0),(1,1)}. If we assign a logical category to each element of this set, we obtain one of the categories used in classical logic (conjunction), and several conjunctions used in natural language, such as neither (0,0), but (1,0), or although (0,1).² If we consider however subsets of the range of the evaluation channel (instead of individual elements), we obtain the meanings of all the connectives used in propositional logic, i.e., negation, conjunction, disjunction, implication and equivalence.

For example, we say that the sentence c1 ∨ c2 is true if the result of evaluating the pair of categories (c1, c2) is a Boolean pair which belongs to the set {(1,1),(1,0),(0,1)}. The same can be said for the sentence c1 ↔ c2 and the set of Boolean pairs {(1,1),(0,0)}.

The agents construct logical categories by identifying subsets of the range of the evaluation channel. The evaluation game creates situations in which the agents discover such subsets, and use them to distinguish a subset of objects from others in a given context. The representation of logical categories, such as conjunction, disjunction or implication, as sets of Boolean tuples is in fact equivalent to the truth tables used in classical logic for describing the semantics of propositional connectives. Trivial subsets such as the empty set, {(1),(0)} or {(1,1),(1,0),(0,1),(0,0)} are not associated with categories, because they are not informative. Subsets which combine unary and binary Boolean tuples, such as {(1,1),(0)}, are not considered either.

¹ We omit the parentheses if the evaluation channel is applied to category tuples of length one.
² The association between Boolean pairs and natural language connectives is approximate: neither is used when the two sentences that follow it are negative; but is usually preceded by an affirmative sentence and followed by a negative one; and although is sometimes preceded by a negative sentence and followed by an affirmative one.
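To make the counting concrete, the candidate logical categories for a given arity can be enumerated as the non-trivial subsets of the corresponding Boolean tuples. The following Python sketch is illustrative only; the function names are hypothetical, and the labels merely imitate the [E ...] notation used in this paper.

```python
from itertools import combinations, product

def label(subset):
    """Build a label such as '[E 11-00]' from a sequence of Boolean tuples."""
    return "[E " + "-".join("".join(map(str, t)) for t in subset) + "]"

def logical_categories(arity):
    """Map labels to the non-trivial subsets of Boolean tuples of one arity.

    The empty set and the full set are skipped because they are not
    informative; tuples of different arities are never mixed.
    """
    outcomes = sorted(product((0, 1), repeat=arity), reverse=True)
    categories = {}
    for size in range(1, len(outcomes)):
        for subset in combinations(outcomes, size):
            categories[label(subset)] = frozenset(subset)
    return categories

unary = logical_categories(1)
binary = logical_categories(2)
print(len(unary), len(binary))  # 2 14
print(sorted(binary["[E 11-10-01]"]))
```

Under these assumptions there are 2 unary and 14 binary candidates, 16 in total.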
The notation used for describing logical categories consists of the symbol E (for evaluation channel) followed by a schematic representation of the set of Boolean tuples for which each category holds. Thus, [E 1](c) corresponds to true(c), because it is true if the result of applying the evaluation channel to c belongs to the set of Boolean tuples {(1)}. [E 0](c) represents the logical category false(c), or the logical connective not(c), because it is true if the result of applying the evaluation channel to c belongs to the set {(0)}. The logical connective or(c1,c2) (disjunction) is represented by the logical category [E 11-10-01](x,y), which is true if the result of applying the evaluation channel to the pair of categories (c1,c2) belongs to the set {(1,1),(1,0),(0,1)}.

A key aspect of logical categories is that they describe properties of categories, as opposed to perceptually grounded categories, which describe properties of objects. It is therefore natural to apply logical categories to perceptually grounded categories to construct structured units of meaning, which we call concepts. For example, the concept [E 0]([V 0.0 0.5](x)) can be constructed by applying the logical category [E 0](c) (i.e., not(c)) to the category [V 0.0 0.5](x) (i.e., down(x)). The concept [E 11-10-01]([V 0.5 1.0](x),[H 0.5 1.0](x)) (i.e., or(up(x),right(x))) is constructed similarly by applying the logical category or(c1,c2) to the categories up(x) and right(x).

If we consider perceptually grounded categories as atomic concepts, we can observe that: (1) logical categories can be applied not only to perceptually grounded categories but to all sorts of concepts; and (2) the result of applying a logical category to a single concept or a pair of concepts is again a new concept. Logical categories thus allow generating an infinite set of concepts, which corresponds to the set of quantifier-free first order formulas that can be constructed from perceptually grounded categories.
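The way logical categories turn categories into nested concepts can be sketched with a small recursive data structure. This is an illustration under stated assumptions, not the representation used by the agents: the class names `Grounded` and `Logical`, the dictionary encoding of objects, and the function `evaluate` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple, Union

@dataclass(frozen=True)
class Grounded:
    """Atomic concept: a perceptually grounded category with its categorizer."""
    name: str
    categorizer: Callable[..., int]

@dataclass(frozen=True)
class Logical:
    """Concept built by applying a logical category to a tuple of concepts."""
    name: str
    accepted: FrozenSet[Tuple[int, ...]]  # Boolean tuples for which it holds
    args: Tuple["Concept", ...]

Concept = Union[Grounded, Logical]

def evaluate(concept: Concept, objects: tuple) -> int:
    """Evaluate a concept on an object tuple, recursing through its arguments."""
    if isinstance(concept, Grounded):
        return concept.categorizer(*objects)
    outcome = tuple(evaluate(arg, objects) for arg in concept.args)
    return int(outcome in concept.accepted)

# Objects are modelled as dictionaries of channel values (an assumption).
O1 = {"H": 0.5, "V": 0.5}
O3 = {"H": 0.8, "V": 0.2}

down = Grounded("[V 0.0 0.5]", lambda o: int(0.0 < o["V"] < 0.5))
up = Grounded("[V 0.5 1.0]", lambda o: int(0.5 < o["V"] < 1.0))
right = Grounded("[H 0.5 1.0]", lambda o: int(0.5 < o["H"] < 1.0))

not_down = Logical("[E 0]", frozenset({(0,)}), (down,))
up_or_right = Logical("[E 11-10-01]",
                      frozenset({(1, 1), (1, 0), (0, 1)}), (up, right))
# A logical category applied to a non-atomic concept yields a new concept.
not_up_or_right = Logical("[E 0]", frozenset({(0,)}), (up_or_right,))

print(evaluate(not_down, (O1,)), evaluate(up_or_right, (O3,)))
```

Since `Logical` accepts any concept as an argument, concepts of arbitrary depth can be built, which mirrors the observation that logical categories generate an infinite set of concepts.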
The notion of concept as a structure of meaning constructed by an agent can then be defined inductively as follows: (1) a perceptually grounded category is a concept; (2) if l(x̄) is an n-ary logical category and c̄ is an n-ary tuple of concepts, then l(c̄) is a concept.

Logical Categorizers

The categorizers of logical categories are cognitive procedures that allow determining whether a logical category holds or not for a tuple of concepts and an object tuple. As we have explained above, logical categories can be associated with subsets of the range of the evaluation channel. The behavior of their categorizers can therefore be described by constraints of the form E(c̄, ō) ∈ S_l, where l is a logical category, S_l is the subset of the range of the evaluation channel for which l holds, c̄ is a tuple of concepts, and ō is an object tuple. That is, by functions which take the value 1 (true) if the result of evaluating the tuple of concepts to which the logical category is applied belongs to the subset of the range of the evaluation channel for which the category holds, and 0 (false) otherwise.

The following constraints describe the behavior of the categorizers for the logical connectives not, and, or, if (implication) and iff. The categorizer which is capable of checking whether category [CAT](c̄) holds is denoted by [CAT]^c(c̄). The variables c, c1 and c2 range over concepts. The variable ō ranges over object tuples.

\begin{align*}
[E\ 0]^c(c) &\equiv E(c, \bar{o}) \in \{(0)\} \\
[E\ 11]^c(c_1,c_2) &\equiv E((c_1,c_2), \bar{o}) \in \{(1,1)\} \\
[E\ 11\text{-}10\text{-}01]^c(c_1,c_2) &\equiv E((c_1,c_2), \bar{o}) \in \{(1,1),(1,0),(0,1)\} \\
[E\ 11\text{-}01\text{-}00]^c(c_1,c_2) &\equiv E((c_1,c_2), \bar{o}) \in \{(1,1),(0,1),(0,0)\} \\
[E\ 11\text{-}00]^c(c_1,c_2) &\equiv E((c_1,c_2), \bar{o}) \in \{(1,1),(0,0)\}
\end{align*}

Figure 2: The semiotic square summarizes the cognitive processes involved in a language game.

Because the categorizers of logical categories use the evaluation channel, this channel can be naturally extended to evaluate generic concepts (i.e., quantifier-free first order formulas) recursively, using the categorizers of logical and perceptually grounded categories. Atomic concepts (perceptually grounded categories) are evaluated by applying their categorizers to the object tuple given as input to the evaluation channel. Non-atomic concepts of the form l(c̄) are evaluated by applying the categorizer of the logical category l to the result of evaluating the concept tuple c̄ on the object tuple given as input to the evaluation channel. The following is an inductive definition of the evaluation channel E(c̄, ō) for generic concepts.

1. If [CAT](x̄) is a perceptually grounded category, then E([CAT](x̄), ō) = [CAT]^c(ō).
2. If l(c̄) is a concept, where l is a logical category, c̄ is a tuple of concepts and S_l is the subset of the range of the evaluation channel for which l holds, then E(l(c̄), ō) = 1 if E(c̄, ō) ∈ S_l, and 0 otherwise.

Constructing Grounded Models

In (Sierra 2001a), we defined a grounded model as the set of concepts and categorizers constructed by an agent at a given time in its development history. This section describes how grounded models are constructed as a side effect of the agents' activity playing different types of language games.

First, the agents play language games of the sort described in the book The Talking Heads Experiment (Steels 1999). These games allow them to construct perceptually grounded categories and a shared lexicon for referring to such categories. Once the agents have learned a shared lexicon for perceptually grounded categories, they start playing evaluation games, which allow them to construct logical categories and a shared lexicon for referring to them.
We describe the evaluation game in this section, and refer the reader to (Steels 1999) for a thorough discussion of the language games used for learning perceptually grounded categories, and their crucial role in the origins of language (Steels 1997). The evaluation game is a language game which focuses on the evaluation of concepts and the characterization of sets of objects by means of logical compositions of perceptually grounded categories. The semiotic square (Steels 1999) (figure 2) summarizes the main cognitive processes involved in a language game.

The evaluation game is played by two agents. One agent plays the role of speaker, and the other agent plays the role of hearer.

Perception. First, the speaker looks at one area of the white board and directs the attention of the hearer to the same area.³ Both speaker and hearer segment their images of the white board into coherent units, which roughly correspond to different objects pasted on the white board. Then, the speaker chooses a context for the language game. This is a set of object tuples of the same arity. In the experiments we have carried out so far, the object tuples are of arity one or two. The speaker indicates to the hearer the set of object tuples that constitute the context by pointing to them. Different protocols exist for pointing to unary and binary object tuples.

Then, the speaker picks a subset of object tuples from the context, which we will call the topic. In our experiments, the topic contains a variable number of object tuples, from one to three. The rest of the object tuples in the context constitute the background. Both speaker and hearer use their sensory channels to gather information about each object tuple in the context, and store that information so that they can use it in subsequent stages of the language game.

For example, in evaluation game 1 (described below) the speaker chooses as context the set of object tuples {(O1), (O2), (O3)}, where O1, O2 and O3 represent the numbered objects in figure 1.
The topic chosen by the speaker is the subset of object tuples {(O1), (O3)}, and the background is the subset {(O2)}.

³ Robots direct the attention of others to specific areas or objects on the white board by informing each other of the direction where they are focusing.

Discrimination. The speaker tries to find a pair of categories which distinguishes the topic from the background. That is, a pair of categories (c1, c2) such that their evaluation on the topic is different from their evaluation on any object tuple in the background. The evaluation of a tuple of concepts on a set of object tuples (e.g., the topic) is defined as follows. We use the function symbol E to denote the evaluation channel, and give this definition as an extension of previous ones. Let c̄ be a tuple of concepts, and s a set of object tuples; then E(c̄, s) = {E(c̄, ō) : ō ∈ s}, where E(c̄, ō) is as defined in previous sections. The speaker then tries to find a pair of categories (c1,c2) such that E((c1,c2), topic) ∩ E((c1,c2), background) = ∅.

For example, consider the pair of categories ([V 0.0 0.5](x),[H 0.5 1.0](x)) (i.e., (down(x),right(x))). This pair of categories discriminates the topic from the background in evaluation game 1, because the evaluation of the pair of categories on the topic does not intersect with the evaluation of the pair of categories on the background. That is, E(([V 0.0 0.5](x),[H 0.5 1.0](x)),{(O1),(O3)}) = {(0,0),(1,1)}, E(([V 0.0 0.5](x),[H 0.5 1.0](x)),{(O2)}) = {(1,0)}, and {(0,0),(1,1)} ∩ {(1,0)} = ∅.

If the speaker cannot find a discriminating pair of categories, the game fails. Otherwise, the speaker tries to find a logical category which is associated with the set of Boolean pairs tv resulting from evaluating the pair of categories on the topic. If it does not have any logical category associated with this set, it creates a new logical category of the form [E tv](c1,c2), together with its categorizer, and adds it to its current grounded model.
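The discrimination step described above can be sketched as a search over pairs of categories. The Python below is a minimal illustration, assuming a fixed inventory of four categories and objects encoded as dictionaries of channel values; the names `CATEGORIES`, `evaluate_on_set`, `find_discriminating_pair` and `label` are hypothetical, and the paper does not specify the order in which an agent tries candidate pairs.

```python
from itertools import combinations

def interval(channel, low, high):
    """Model the categorizer of [channel low high] for unary object tuples."""
    return lambda obj: int(low < obj[channel] < high)

# Hypothetical category inventory of the speaker.
CATEGORIES = {
    "down": interval("V", 0.0, 0.5),
    "up": interval("V", 0.5, 1.0),
    "left": interval("H", 0.0, 0.5),
    "right": interval("H", 0.5, 1.0),
}

def evaluate_on_set(categorizers, object_tuples):
    """E(c, s): the set of Boolean tuples obtained on each object tuple in s."""
    return {tuple(c(*objs) for c in categorizers) for objs in object_tuples}

def find_discriminating_pair(categories, topic, background):
    """Return the first pair whose evaluations on topic and background are disjoint."""
    for (n1, c1), (n2, c2) in combinations(categories.items(), 2):
        on_topic = evaluate_on_set((c1, c2), topic)
        if not on_topic & evaluate_on_set((c1, c2), background):
            return (n1, n2), on_topic
    return None  # the game fails

def label(boolean_tuples):
    """Name the logical category [E tv] associated with a set of Boolean tuples."""
    ordered = sorted(boolean_tuples, reverse=True)
    return "[E " + "-".join("".join(map(str, t)) for t in ordered) + "]"

# Evaluation game 1: objects of figure 1, topic {(O1), (O3)}, background {(O2)}.
O1, O2, O3 = ({"H": 0.5, "V": 0.5}, {"H": 0.5, "V": 0.2}, {"H": 0.8, "V": 0.2})
topic, background = [(O1,), (O3,)], [(O2,)]

pair, tv = find_discriminating_pair(CATEGORIES, topic, background)
print(pair, label(tv))  # ('down', 'right') [E 11-00]
```

With this inventory and ordering, the pairs (down, up) and (down, left) fail to separate O3 from O2, and (down, right) is the first pair that discriminates the topic.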
Continuing with the example of evaluation game 1, if the speaker does not have a logical category associated with the set of Boolean pairs {(1,1),(0,0)} (resulting from evaluating the discriminating pair of categories ([V 0.0 0.5](x),[H 0.5 1.0](x)) on the topic {(O1),(O3)}), it can create a new logical category of the form [E 11-00](c1,c2) and add it to its grounded model.

The concept constructed by applying this logical category to the categories in the discriminating pair characterizes the topic as the set of object tuples in the context for which the concept is true. For example, in evaluation game 1, the speaker may characterize the topic as the set of object tuples in the context which are on the lower part of the image if and only if they are also on the right part of it. The topic {(O1),(O3)} is the set of object tuples in {(O1),(O2),(O3)} which satisfy the formula iff(down(x),right(x)). This formula is internally represented by the concept [E 11-00]([V 0.0 0.5](x),[H 0.5 1.0](x)).

Lexicalization. The speaker examines its lexicon in order to find an expression associated with the logical category. The lexicon of an agent contains associations of the form (C,W,R), where C is a category, W is a linguistic expression, and R is the rate of the association (i.e., a number between 0 and 1 which corresponds to the confidence the agent has in the usefulness of that association). If there are several candidates, the speaker chooses the expression of the association with the highest rate. If there are no associations for the logical category C, the speaker may invent a new expression W with probability CR, and add an association of the form (C, W, 0) to its current lexicon. The constant CR is the creativity rate of the agent. It determines the probability with which the agent may enrich the language by creating new words. In our experiments CR is set to 0.9 for all agents.
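The lexicon lookup and word invention just described can be sketched as follows. This is a hedged illustration: the class `Lexicon`, its method names, and the syllable-based word generator are assumptions made for the example, since the paper does not say how new expressions are formed.

```python
import random

class Lexicon:
    """Associations (category, word, rate) kept by one agent."""

    def __init__(self, creativity_rate=0.9, rng=None):
        self.creativity_rate = creativity_rate  # CR in the text
        self.rng = rng or random.Random()
        self.associations = []  # each entry is [category, word, rate]

    def invent_word(self):
        # Hypothetical generator: one to four consonant-vowel syllables.
        syllables = self.rng.randint(1, 4)
        return "".join(self.rng.choice("bdlns") + self.rng.choice("aeiou")
                       for _ in range(syllables))

    def lexicalize(self, category):
        """Return the best word for a category, inventing one with probability CR."""
        candidates = [a for a in self.associations if a[0] == category]
        if candidates:
            # Several candidates: choose the association with the highest rate.
            return max(candidates, key=lambda a: a[2])[1]
        if self.rng.random() < self.creativity_rate:
            word = self.invent_word()
            self.associations.append([category, word, 0.0])  # (C, W, 0)
            return word
        return None  # no expression available in this game

speaker = Lexicon(creativity_rate=0.9, rng=random.Random(1))
print(speaker.lexicalize("[E 11-00]"))
```

A word invented in one game is stored with rate 0, so it is reused in later games unless a competing association for the same category acquires a higher rate.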
For example, if the speaker has no associations for the logical category [E 11-00](c1,c2) in its current lexicon, it may invent a new expression (e.g., iff) and add a new association of the form ([E 11-00](x,y),iff(x,y),0) to its lexicon.

In this paper, we assume that internal representations and linguistic expressions have identical structures, so that the agents do not have to worry about learning grammar rules (see (Steels 1998) and (Steels 2000) for some experiments on the origins of syntax in visually grounded agents). This simplification allows us to concentrate on the issue of the origins of logical categories from the point of view of their semantic function.

Next, the speaker constructs a sentence using the lexicalizations of the logical category and the categories in the discriminating pair. At this stage, it is assumed that the agents have already learned a shared lexicon for perceptually grounded categories, so that the learning process can focus on the construction of a shared lexicon for logical categories.

For example, the speaker may construct the sentence iff(down(x),right(x)) to express the concept [E 11-00]([V 0.0 0.5](x),[H 0.5 1.0](x)), assuming the associations ([E 11-00],iff(x,y),R1), ([V 0.0 0.5](x),down(x),R2) and ([H 0.5 1.0](x),right(x),R3) are already part of its lexicon.

Once the speaker has constructed a sentence which expresses the concept that characterizes the topic, it communicates that sentence to the hearer.

Interpretation. The hearer interprets the sentence using the associations between (logical or perceptually grounded) categories and linguistic expressions in its lexicon. That is, it tries to reconstruct the concept encoded by the sentence. If the hearer does not have any association for some of the expressions used in the sentence, the game fails and a repair process takes place.
The speaker points to the topic so that the hearer can guess the logical category it has used to conceptualize the topic, and acquire an association between that logical category and the expression used by the speaker. This happens with probability AS. The constant AS is the assimilation rate of the agent. It indicates the probability with which an agent adopts expressions used by other agents. In our experiments AS is set to 0.9 for all agents.

For example, in evaluation game 1 the hearer may interpret the sentence iff(down(x),right(x)) as the concept [E 11-00]([V 0.0 0.5](x),[H 0.5 1.0](x)), if its lexicon contains the right associations between categories and linguistic expressions. Otherwise, it may acquire the association ([E 11-00](x,y),iff(x,y),0) as a result of the repair process.

Identification. The hearer tries to find the referent, i.e., the set of object tuples in the context that satisfy the meaning of the sentence. For example, the set of object tuples which satisfy the meaning of iff(down(x),right(x)) in evaluation game 1 is {(O1),(O3)}. If the meaning of the sentence is true for all the object tuples in the context or for none of them, the game fails and a repair process similar to that described in the previous step takes place. Failures such as these may happen when speaker and hearer associate the same expression with different logical categories.

Coordination. The hearer points to the referent, i.e., to the set of object tuples it identified in the previous step. The game succeeds if the referent is equal to the set of object tuples the speaker had in mind (the topic). If the game succeeds, speaker and hearer increment the rates of the associations they used for lexicalizing and interpreting the logical category by an amount I, and decrement the rates of competing associations which were discarded by both agents during the processes of lexicalization and interpretation, respectively.
If the game fails, only the rates of the associations used by speaker and hearer are decremented, and a repair process similar to that described for the previous step takes place. In our experiments, the amount I by which rates are modified is set to 0.1.

We summarize below the main steps of evaluation game 1, which has been used as an example in our description of the evaluation game.

    Game number: 1
    Speaker: A1
    Hearer: A2
    Topic: {(O1), (O3)}
    Background: {(O2)}
    Speaker conceptualizes topic as: [E 11-00]([V 0 0.5](x), [H 0.5 1](x))
    Speaker lexicalizes concept as: iff(down(x), right(x))
    Hearer interprets sentence as: [E 11-00]([V 0 0.5](x), [H 0.5 1](x))
    Hearer points to: {(O1), (O3)}
    Speaker says: OK
    The game is a success.

Experiments

We have run some experiments in order to see whether a couple of agents can learn a set of perceptually grounded categories for spatial reasoning, and a set of logical categories which allow them to construct logical formulas from perceptually grounded categories and to evaluate those formulas using the categorizers of perceptually grounded and logical categories.

In the first experiment, the agents play language games of the sort described in (Steels 1999). These games allow them to learn perceptually grounded categories and a shared lexicon for referring to those categories. Figure 3 shows the evolution of the lexical coherence in a series of 500 games. The lexical coherence (Steels 1999) is a measure of the similarity of the agents' lexicons. It can be observed that they reach total lexical coherence after 250 games. The spatial categories and lexicon learned after 500 games are shown in table 1.

Figure 3: This graph shows the evolution of the lexical coherence in a series of 500 language games. It can be observed that the agents reach total lexical coherence after 250 games.

In the second experiment, the agents play evaluation games. These games are played after a series of 500 language games, during which the agents learn the shared lexicon for perceptually grounded categories that they need for playing evaluation games. Figure 4 shows the evolution of the lexical coherence in a series of 10000 evaluation games. First, the agents learn unary categories, such as true and false. This is done very quickly. Then, they learn binary categories, such as conjunction, disjunction, implication or equivalence. This takes longer, about 3000 games to reach a lexical coherence of 0.97. In addition to logical connectives, such as not, and, or, if or iff, the agents learn other categories which correspond to natural language conjunctions, such as but, neither, however and others. The logical categories and lexicon learned after 10000 games are shown in table 2.

Table 1. Lexicon: spatial categories.

    Expression   Category         English        Rate
    seba         [VP 0.0 0.5]     down(x)        1
    bi           [VP 0.5 1.0]     up(x)          1
    dilebo       [HP 0.0 0.5]     left(x)        1
    na           [HP 0.5 1.0]     right(x)       1
    sule         [VD -1.0 0.0]    below(x,y)     1
    ba           [VD 0.0 1.0]     above(x,y)     1
    belebubi     [HD -1.0 0.0]    leftof(x,y)    1
    le           [HD 0.0 1.0]     rightof(x,y)   1

Table 2. Lexicon: logical categories.

    Expression   Category         English        Rate
    lebasu       [E 0]            not(x)         1
    nadebabu     [E 1]            true(x)        1
    sosulu       [E 11]           and(x,y)       1
    deloda       [E 10]                          1
    nodasosu     [E 01]                          1
    lo           [E 00]                          1
    benusu       [E 11-10]                       1
    si           [E 11-01]                       1
    luse         [E 11-00]        iff(x,y)       1
    babesu       [E 10-01]        xor(x,y)       1
    nidi         [E 10-00]                       1
    nu           [E 01-00]                       1
    di           [E 11-10-01]     or(x,y)        1
    ne           [E 11-10-00]                    1
    sede         [E 11-01-00]     if(x,y)        1
    sabeleli     [E 10-01-00]                    1

Figure 4: This graph shows the evolution of the lexical coherence in a series of 10000 evaluation games. First, the agents learn unary connectives (¬), reaching total coherence very soon. Afterwards, they start learning binary connectives (∧, ∨, →, ↔), reaching a lexical coherence of 0.97 after 3000 games.

Intuitive Reasoning

A natural question to ask is what the agents have really learned by playing language games.
In the first place, they have constructed categorizers for perceptually grounded categories (such as up, down, right, left, above, below, rightof and leftof), and for logical categories (such as negation, disjunction, conjunction, implication or equivalence). In the second place, they have acquired a shared vocabulary for referring to those categories. And in the third place, they have learned to evaluate generic concepts (i.e., quantifier-free first order formulas) using the categorizers of logical and perceptually grounded categories.

According to the result of the evaluation process, the concepts that can be constructed by an agent can be classified into three categories: (1) intuitive truths, which are true for every object tuple; (2) intuitive falsehoods, which are false for every object tuple; and (3) regular concepts, which are true for some object tuples and false for others.

Intuitive reasoning is a process by which the agents discover relationships that hold among the categorizers of basic concepts. For example, an agent may discover that the concept above(x,y) ∧ above(y,x) is an intuitive falsehood, because the categorizers of above(x,y) and above(y,x) never return true at the same time. Similarly, it can discover that the concept above(x,y) → ¬above(y,x) is an intuitive truth, since the categorizer of above(y,x) returns false whenever the categorizer of above(x,y) returns true.

Intuitive reasoning can then be used for determining whether a concept is an intuitive truth, an intuitive falsehood or a regular concept of a grounded model. It may work as a process of constraint satisfaction in natural agents, by which they try to discover whether there is any combination of values of their sensory channels that satisfies a given concept. It is not clear to us how this process of constraint satisfaction can be implemented in natural agents. It may be the result of a simulation process, by which the agents generate possible combinations of values for their sensory channels and check whether they satisfy a given concept.
Alternatively, the process may be grounded on the physical impossibility of simultaneously firing some categorizers, due to the way they are implemented by physically connected neural networks in natural agents.

To get an idea of the set of intuitive truths that are implied by the grounded model G constructed by a couple of agents in the previous experiments, consider the first order theory Ts. The language of Ts consists of two binary predicate symbols A(x, y) (above) and R(x, y) (rightof). Its nonlogical axioms are as follows.

$$
\begin{aligned}
A(x, y) &\rightarrow \neg A(y, x) && (1)\\
A(x, y) \wedge A(y, z) &\rightarrow A(x, z) && (2)\\
A(x, y) \wedge A(x, z) \wedge \neg R(y, z) \wedge \neg R(z, y) &\rightarrow A(y, z) \vee A(z, y) && (3)\\
R(x, y) &\rightarrow \neg R(y, x) && (4)\\
R(x, y) \wedge R(y, z) &\rightarrow R(x, z) && (5)\\
R(x, y) \wedge R(x, z) \wedge \neg A(y, z) \wedge \neg A(z, y) &\rightarrow R(y, z) \vee R(z, y) && (6)
\end{aligned}
$$

It is easy to see that every theorem of Ts is an intuitive truth of grounded model G, because every axiom of Ts is an intuitive truth of G, and the intuitive truths of G are closed under the inference rule of resolution.

When the behavior of the categorizers for basic concepts (i.e., categories) can be described by linear constraints, intuitive reasoning can be implemented as a process of linear constraint satisfaction. This is so because the behavior of the categorizer of every concept can be described by a disjunction of linear constraint systems, which can be computed by replacing every category by a linear constraint describing the behavior of its categorizer in the concept, and computing the disjunctive normal form of the constraint system resulting from the substitution. Once this transformation has been done, showing that a concept is an intuitive truth is equivalent to proving that the disjunction of constraint systems associated with the concept is true for every value of the sensory channels, and this can be done by proving that the disjunction of constraint systems associated with the negation of the concept is unsatisfiable.
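The unsatisfiability test described above can be sketched in Python. This is an illustration under stated assumptions, not the solver used in the experiments: it relies on Fourier-Motzkin elimination (a standard textbook procedure) over exact fractions, and it assumes that the categorizer of A(x, y) behaves like the linear constraint 0 < x - y < 1.0 on vertical positions. The names `satisfiable`, `is_intuitive_truth` and `above`, and the tuple encoding of constraints, are hypothetical.

```python
from fractions import Fraction
from itertools import product


def satisfiable(constraints, nvars):
    """Decide a linear constraint system by Fourier-Motzkin elimination.

    Each constraint is (coeffs, bound, strict) and stands for
    sum(coeffs[i] * v[i]) < bound when strict, <= bound otherwise.
    """
    cons = [(tuple(Fraction(c) for c in cs), Fraction(b), s)
            for cs, b, s in constraints]
    for k in range(nvars):
        pos = [c for c in cons if c[0][k] > 0]
        neg = [c for c in cons if c[0][k] < 0]
        cons = [c for c in cons if c[0][k] == 0]
        # Pair every upper bound on variable k with every lower bound.
        for (pc, pb, ps), (nc, nb, ns) in product(pos, neg):
            a, b = 1 / pc[k], -1 / nc[k]  # positive multipliers that cancel v[k]
            coeffs = tuple(a * p + b * n for p, n in zip(pc, nc))
            cons.append((coeffs, a * pb + b * nb, ps or ns))
    # Only variable-free constraints "0 < bound" or "0 <= bound" remain.
    return all(b > 0 if s else b >= 0 for _, b, s in cons)


def is_intuitive_truth(negation_dnf, nvars):
    """A concept is an intuitive truth if no disjunct of its negation is satisfiable."""
    return not any(satisfiable(d, nvars) for d in negation_dnf)


def above(i, j, nvars):
    """Assumed constraints for A(v_i, v_j): 0 < v_i - v_j and v_i - v_j < 1."""
    lower = [0] * nvars
    lower[i], lower[j] = -1, 1   # -(v_i - v_j) < 0
    upper = [0] * nvars
    upper[i], upper[j] = 1, -1   #   v_i - v_j  < 1
    return [(lower, 0, True), (upper, 1, True)]


# The negation of axiom 1, A(x, y) -> not A(y, x), is the single
# conjunction A(x, y) and A(y, x), so its normal form has one disjunct.
negation_of_axiom_1 = [above(0, 1, 2) + above(1, 0, 2)]
print(is_intuitive_truth(negation_of_axiom_1, 2))  # True
print(satisfiable(above(0, 1, 2), 2))              # True: A(x, y) alone can hold
```

Restricting a variable to the interval (0.0, 1.0) only requires adding two more constraints for it, for example `([1, 0], 1, True)` and `([-1, 0], 0, True)` for the first of two variables.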
For example, it can be shown that axiom 2 (which states that the relation above, A(x, y), is transitive) is an intuitive truth of grounded model G by checking that the following disjunction of constraint systems is unsatisfiable for values of x, y and z in the interval (0.0, 1.0). This formula has been obtained by: (1) replacing every category by a linear constraint describing the behavior of its categorizer in the concept associated with the negation of axiom 2 in grounded model G; (2) replacing every instance of VPOS(x), VPOS(y) and VPOS(z) by x, y and z in the expression resulting from step 1; and (3) computing the disjunctive normal form of the result of step 2.

$$
\{0 < x - y,\; x - y < 1.0,\; 0 < y - z,\; y - z < 1.0,\; x - z < 0\} \;\vee\; \{0 < x - y,\; x - y < 1.0,\; 0 < y - z,\; y - z < 1.0,\; 1.0 < x - z\}
$$

It is easy to check that this disjunction of constraint systems is unsatisfiable. A disjunction of constraint systems is unsatisfiable if every disjunct is unsatisfiable, and each disjunct is a linear constraint system that can be checked by a linear constraint solver, such as the one implemented in SICStus Prolog.

The rest of the axioms of Ts can be shown to be intuitive truths of grounded model G by constraint satisfaction as well. That is, the couple of agents used in the previous experiments can discover by intuitive reasoning that the relations above and rightof are antisymmetric, transitive and total orders on objects located in the same horizontal or vertical positions. In this sense, one can say that grounded model G provides an explanation for the fact that most people accept these axioms as intuitively true, and use them for building logical theories for common sense reasoning (McCarthy 1959), (McCarthy 1990).

It can also be proved that intuitive reasoning in grounded models in which the behavior of the categorizers for basic concepts can be described by linear constraints is closed under resolution. That is, if two concepts are intuitive truths of a grounded model, their resolvent is an intuitive truth of the grounded model as well.
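The closure property stated above can be illustrated with a small Python sketch. It is only a sanity check on sampled values, not a proof: it assumes the constraint 0 < a - b < 1 for the categorizer of A(a, b), a hypothetical grid of vertical positions, and a hand-written clause encoding in which the resolved literals already match syntactically, so no unification is needed. Axiom 1 is used with its second variable renamed to z.

```python
from itertools import product

GRID = [i / 10 for i in range(1, 10)]  # hypothetical sampled vertical positions


def above(a, b):
    """Assumed categorizer for A(a, b) on vertical positions in (0, 1)."""
    return 0 < a - b < 1


# A literal is (is_positive, (first_var, second_var)); a clause is a frozenset.
AXIOM_1 = frozenset({(False, ("x", "z")), (False, ("z", "x"))})  # not A(x,z) or not A(z,x)
AXIOM_2 = frozenset({(False, ("x", "y")), (False, ("y", "z")), (True, ("x", "z"))})


def resolvent(c1, c2, atom):
    """Resolve c1 (containing `atom`) with c2 (containing its negation)."""
    assert (True, atom) in c1 and (False, atom) in c2
    return (c1 - {(True, atom)}) | (c2 - {(False, atom)})


def holds_everywhere(clause, variables=("x", "y", "z")):
    """Check the clause on every sampled tuple (evidence, not a proof)."""
    for values in product(GRID, repeat=len(variables)):
        env = dict(zip(variables, values))
        if not any(above(env[a], env[b]) == sign for sign, (a, b) in clause):
            return False
    return True


# Resolving axiom 2 with axiom 1 on A(x, z) yields a "no cycle" clause:
# not A(x, y) or not A(y, z) or not A(z, x).
no_cycle = resolvent(AXIOM_2, AXIOM_1, ("x", "z"))
print(sorted(no_cycle))
print(all(holds_everywhere(c) for c in (AXIOM_1, AXIOM_2, no_cycle)))  # True
```

On the sampled positions, both parent clauses and their resolvent hold for every tuple, as the closure property predicts.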
This closure property is a consequence of the soundness of the inference rule of resolution, and of the fact that the linear constraints describing the behavior of the categorizers of a grounded model constitute a model (in the sense of model theory semantics (Shoenfield 1967)) of every intuitive truth of the grounded model.⁴

⁴ The linear constraints describing the behavior of the categorizers are considered, in this case, as predicates which are true for those object tuples that satisfy them.

Every theorem of Ts is, therefore, an intuitive truth of grounded model G, but not every intuitive truth of G is a theorem of Ts, because grounded model G has categorizers for concepts (such as up(x), down(x), right(x), left(x), below(x, y) or leftof(x, y)) which are not even included in the language of Ts. For example, the formulas $\mathit{up}(x) \rightarrow \neg \mathit{down}(x)$ and $\mathit{up}(x) \wedge \mathit{down}(y) \rightarrow \neg \mathit{above}(y, x)$ are intuitive truths of grounded model G, but are not theorems of Ts.

Conclusions

Grounded models (Sierra 2001b) differ from axiomatic theories in establishing explicit connections between language and reality that are learned through language games (Wittgenstein 1953). These connections, which we call categorizers, give grounded meanings to symbols by linking them to the portion of reality they refer to.

In this paper, we have explained how categories (i.e., symbolic representations) and categorizers (grounded meanings) can be constructed by autonomous agents connected to their environment through sensors and actuators, using some conceptualization mechanisms and language games proposed in (Steels 1999). We have then considered the process of truth evaluation, proposed a language game that can be used for simulating the generation of logical categories, and shown how logical categories and categorizers allow the construction and evaluation of generic concepts of the same complexity as quantifier-free first order formulas. Finally, we have described some experiments in which a couple of visually grounded agents construct a grounded model that can be used for spatial reasoning, and we have explained how grounded models can be used for intuitive reasoning.

Acknowledgments

The author would like to thank Luc Steels for many interesting conversations on the topics of the origins of language and grounded representations.

References

McCarthy, J. 1959. Programs with Common Sense. In Mechanization of Thought Processes, Proceedings of the Symposium of the National Physics Laboratory, 77-84.
McCarthy, J. 1990. Formalizing Common Sense. Papers by John McCarthy. Edited by Vladimir Lifschitz. Ablex Publishing Corporation, New Jersey.
Piaget, J. 1985. The Equilibration of Cognitive Structures: the Central Problem of Intellectual Development. University of Chicago Press, Chicago.
Shoenfield, J. R. 1967. Mathematical Logic. Addison-Wesley Publishing Company.
Sierra-Santibáñez, J. 2001a. Grounded Models. In Working Notes of the AAAI-2001 Spring Symposium on Learning Grounded Representations, 69-74. Stanford University, Stanford, California.
Sierra-Santibáñez, J. 2001b. Grounded Models as a Basis for Intuitive Reasoning. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence, Inc.
Steels, L. 1996. Perceptually Grounded Meaning Creation. In Proceedings of the International Conference on Multi-Agent Systems. Menlo Park, Calif.: AAAI Press.
Steels, L. 1997. The Synthetic Modeling of Language Origins. Evolution of Communication 1(1): 1-35.
Steels, L. 1998. The Origins of Syntax in Visually Grounded Agents. Artificial Intelligence, 103:1-24.
Steels, L. 1999. The Talking Heads Experiment. Volume 1. Words and Meanings. Special Pre-edition for LABORATORIUM, Antwerpen.
Steels, L. 2000. The Emergence of Grammar in Communicating Autonomous Robotic Agents. In Proceedings of the European Conference on Artificial Intelligence 2000. Horn, W. (ed.), IOS Publishing, Amsterdam.
Steels, L. 2001. The Role of Language in Learning Grounded Representations.
In Working Notes of the AAAI-2001 Spring Symposium on Learning Grounded Representations, 80-85. Stanford University, Stanford, California.
Steels, L., and Vogt, P. 1997. Grounding Adaptive Language Games in Robotic Agents. In Proceedings of the European Conference on Artificial Life 97. The MIT Press, Cambridge, MA.
Wittgenstein, L. 1953. Philosophical Investigations. Macmillan, New York.