From: AAAI Technical Report FS-93-04. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.

A classifier system for learning spatial representations, based on a morphological wave propagation algorithm

Michael M. Skolnick
Department of Computer Science, R.P.I.

Abstract

A system is proposed which learns spatial representations of planar feature point sets under supervised learning. A key (preprocessing) aspect of the learning system is the transformation of each "static" point set instance into a "dynamic" set of measures of spatial relationships spread out over time. A morphologically based wave propagation algorithm [1, 2, 3] performs this transformation of spatial structure into temporal structure. The learning system [4] is based upon classifiers using bucket brigade and genetic algorithms [5] to respectively modify strengths and create new classifier rules. Such learning systems are designed to exploit temporal regularities in learning environments and, thus, fit well with the wave propagation preprocessing. An initial test environment is proposed that attempts to re-label arbitrary feature labels into structurally meaningful labels.

1. Introduction

Figure 1. Image of planes on tarmac. The goal of the system is to classify planes according to shape.

In the field of computer vision it is the general practice to decompose visual objects into features and the spatial relations between features. The advantages and justifications for this practice are compelling. In computational terms, there is a natural mapping of features onto graph nodes and of spatial relations onto graph arcs. In terms of natural systems, it is clear that a similar decomposition is performed by the primate visual system (with the locational system being associated with the more primitive pathways through the superior colliculus onto the parietal cortex, and the feature system being associated with the primary and secondary projection areas of the striate cortex). Unfortunately, there are fundamental issues that prevent this decomposition from giving a fully satisfactory account. In artificial vision systems it is clear that there remains a combinatorial explosion in matching object graph models to full-blown scenes. Objects appear in highly contextual scenes, with occlusion, distortion and large amounts of pose variation, resulting in an explosion in the complexity of the graph models and in the complexity of the search through such models. In natural vision systems, there has been no systematic account of how the lower levels of feature analysis and feature location are combined in the distributed representations that must underlie perception. The only theoretical account that comes close is that of the Hebbian cell assembly, and it falls short due to our ignorance of the mechanisms underlying self-reverberating distributed neural networks.

Recent advances in various kinds of learning systems hold out some hope in dealing with these problems. In contrast to model-based machine vision systems, where all the complex non-linear contextual relations must be mapped out a priori, learning algorithms suggest ways of evolving such a knowledge base.
Further, most learning systems are based upon massively parallel systems of primitives that are involved in some "tacit" form of competition and/or cooperation. This provides the kind of search performance (after learning has occurred) that is required. On the other hand, our understanding of what happens within these systems seems to be circumscribed in fundamental ways, e.g., via "hidden nodes" in neural networks and "implicit parallelism" in genetic algorithms. In the context of these general issues, I have been considering the form of a learning system that evolves representations of spatial information. The two major issues in the design of such a system involve the choice of preprocessed spatial information to be given to the learning system and the type of learning system to receive this information. The learning task will be that of supervised pattern classification; the learning system attempts to predict the class of each input and is punished or rewarded according to whether the classification is wrong or right. Section 2 describes the morphological preprocessing which maps the input spatial data into the temporal data exploited by the learning system. Section 3 gives an intuitive description of the proposed learning system ([4] provides a more detailed description).

Figure 2. Thresholded silhouettes of planes superimposed on pseudo-convex hulls (which function as local "puddles" for wave propagation). Locations of features of high curvature are marked in black and will function as the "pebble" sources for the wave propagation.

Figure 3. Illustration of the wave propagation process emanating out of the Pebble_Pond shown in the lower left of Figure 2. The measures on the wave state space in this case involve the cardinality of the intersections of the waves. The black pixels mark detected local maxima over the intersection grey-scale surface. The maxima turn out to be useful [2, 3] in finding scale- and rotation-invariant shape signatures. The initial focus of the learning system will be on another set of measures involving wavefront meetings.

2. Pebble_Pond Wave Propagation

In terms of preprocessing, assume that each image is processed so that the locations of key features are detected. For example, consider the image of airplanes in Figure 1. Now consider Figure 2. After thresholding the grey-scale data, morphological openings and closings are applied to detect silhouette features of high internal and external curvature, respectively. In this case, the input to the learning system preprocessing is the set of all points of "high" curvature detected in the binary image, e.g., on a plane the nose, tail and wing tips (for internal curvature) and points where wing and fuselage meet (for external curvature). In addition, pseudo-convex hulls are computed to provide a preliminary segmentation and, in turn, a local region within which to perform wave propagation.
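To make the opening/closing step above concrete, the following is a minimal sketch in Python using SciPy's binary morphology. The threshold, the disk radius, and the reduction of each high-curvature region to a single labeled point are illustrative assumptions made for this sketch, not parameters taken from the paper.

    # A rough sketch of the opening/closing-based curvature-feature detection
    # described above.  Threshold and disk radius are illustrative assumptions.
    import numpy as np
    from scipy import ndimage

    def curvature_features(grey, threshold=128, radius=5):
        """Return (internal, external) boolean masks of high-curvature regions."""
        silhouette = grey > threshold                      # threshold the grey-scale data
        y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
        disk = x * x + y * y <= radius * radius            # disk structuring element
        opened = ndimage.binary_opening(silhouette, structure=disk)
        closed = ndimage.binary_closing(silhouette, structure=disk)
        internal = silhouette & ~opened                    # protrusions: nose, tail, wing tips
        external = closed & ~silhouette                    # indentations: wing/fuselage junctions
        return internal, external

    def feature_points(mask):
        """Reduce each connected high-curvature region to one labeled point."""
        labels, n = ndimage.label(mask)
        return ndimage.center_of_mass(mask, labels, range(1, n + 1))

Here the opening removes sharp protrusions, so the residue of the silhouette against its opening marks regions of high internal curvature, while the residue of the closing against the silhouette marks regions of high external curvature.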
The morphological preprocessing algorithm, Pebble_Pond [1, 2, 3], sends out wavefronts from each of the feature locations. One visualization of this process is shown in Figure 3. The wavefronts contain information as to the identity of the feature source, where the identity is determined by a line-scan labeling of the features. The basic intuition of Pebble_Pond is that if a general wave propagation process can be sustained morphologically, then the resulting cellular state space should provide a rich set of spatial measures. Morphological filters have been applied to the Pebble_Pond state space to extract specific spatial measures from a diverse group of such measures, e.g., all k nearest neighbors, k-th order Voronoi tessellations, k-th order Gabriel graphs, morphological co-variance and various rotation/scale invariant spatial signatures. All of these measures are based upon the detection of more "primitive" events in the wave state space: wavefront meetings, wavefront crossings and wave state intersections. The reader is referred to previous work [1, 2, 3] for details on the underlying wave state space simulation (especially basic time/space trade-offs) and the suite of morphological filters required to extract the primitive events, from which the diverse class of spatial measures is derived.

This paper will focus on the exploitation of wavefront meetings as the basic input to the classifier learning system. The set of wavefront meetings is used to detect all midpoints of the complete graph that connects the features (see Figure 4). We believe that this restricted class of events in Pebble_Pond state space provides a sufficiently robust starting point for the exploration of learning algorithms for spatial structure.

Figure 4. The midpoints of edges of the complete graph of 7 "pebbles" (derived from detected wavefront meetings), which will be input to the learning system in the order of their detection.

The crucial idea behind Pebble_Pond is that it takes spatial structure and transforms it into temporal structure. That is, at each iteration in the wave propagation, measures on the state space reflect spatial structure at the spatial scale corresponding to the current iteration. In terms of the midpoints of the complete feature graph, Pebble_Pond will output the midpoint locations (along with the point source identities) ordered according to the length of the edge. The task of the learning system is to exploit this temporal translation of the spatial structure by learning to predict over time certain regularities in the "messages" produced by Pebble_Pond.
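Although the paper extracts these events morphologically from the wave state space, the message stream it describes can be imitated directly for illustration: every pair of feature points yields a midpoint message tagged with the two source labels, emitted in order of increasing edge length (the iteration at which the two fronts would meet). The sketch below assumes a simple tuple format for the messages, which is not the formal encoding given in [4].

    # Sketch of the midpoint "message" stream: midpoints of the complete graph
    # over the feature points, ordered by spatial scale.  The tuple format is an
    # illustrative assumption, not the paper's formal message encoding.
    from itertools import combinations
    from math import dist

    def midpoint_messages(points):
        """points: dict mapping arbitrary feature labels to (x, y) locations.
        Returns (time, midpoint, (label_a, label_b)) tuples, small scales first."""
        events = []
        for (a, pa), (b, pb) in combinations(points.items(), 2):
            mid = ((pa[0] + pb[0]) / 2.0, (pa[1] + pb[1]) / 2.0)
            events.append((dist(pa, pb) / 2.0, mid, (a, b)))  # fronts meet at half the edge length
        return sorted(events)

    # Example: five features with arbitrary line-scan labels i..m
    pts = {'i': (0, 0), 'j': (4, 0), 'k': (2, 3), 'l': (6, 4), 'm': (1, 6)}
    for t, mid, sources in midpoint_messages(pts):
        print(f"t={t:5.2f}  midpoint={mid}  sources={sources}")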
3. Classifier Learning System

Classifier systems [5] are especially useful in environments where pattern recognition actions at given time steps need to be linked with related pattern recognition actions occurring at later time steps. Pebble_Pond produces early sets of messages that report on spatial structure at near spatial scales, while later sets of messages report on more distant spatial scales. This separation of spatial scales over time provides robustness in the face of spurious or missing data, i.e., with respect to the partial matching problem. In other words, spurious data has a localized effect at early states in wave space. Further, the wave propagation throughout the space allows all points to interact with all other points and - assuming the learning system can search out such invariant interactions - there is added resistance to spurious and missing data. Classifiers can look to sensory input messages coming out of Pebble_Pond and then place internal messages that function as predictions that certain other messages should arrive at later time steps. Eventually, classifiers should develop which link these internal and sensory input messages together into predictions as to pattern class. The predictions are then subject to reward or punishment, with the learning system using this information to change its internal structure.

The input to the classifier system at each time step is the set of midpoints currently detected and the identifying labels of the meeting wavefronts. The classifier system attempts to predict when and where these same wavefronts will participate in meetings with other wavefronts. One can view the learning system's linked predictions as essentially performing a geometrically meaningful relabeling of the initial arbitrary labels assigned to the feature locations. This brings up a basic problem in computer vision: internal models being matched to data have features identified in some semantically meaningful way (e.g., this feature is part of the arm, etc.) but - because input data is "raw" - all initial labels given to input data will be arbitrary. Thus, we need the labels to keep track of the input features, but such labels must be remapped into something meaningful. The proposed learning system focuses on this problem and provides mechanisms which take the arbitrary labels and attempt to have a meaningful remapping "emerge".

Given the restriction on the size of this paper, we will not summarize the mechanisms of classifier systems (e.g., the bucket-brigade algorithm) nor describe the detailed formal specifications of our particular classifier system and Pebble_Pond; a full explanation can be found in [4]. The remainder will attempt to provide an intuitive feel for the system.

There are two basic issues that must be confronted in such learning systems: 1. how can new classifiers be formed to improve performance, and 2. how can classifiers be made to cooperate with each other so that extended sequences of messages will be passed between "associated" classifiers. The genetic algorithm is typically used in classifier systems to generate better classifiers out of the current "population" of classifiers. We will not consider the genetic algorithm in this paper, since the proposed test system can be designed [4] so as to remove the need for the "discovery" of new classifiers. It is noted simply that each classifier's strength is a measure of the "fitness" of the classifier and that, under the genetic algorithm, the most fit classifiers are allowed to "mate" and "reproduce" in such a way - via operators such as mutation and crossover - that new "generations" of classifiers tend to be improved. Thus, the current discussion will focus on how classifiers might begin to link together into cooperating/competing assemblies.

The classifiers and messages are separated into three disjoint sets: 1. Sensory classifiers will take the information in the sensory messages generated by Pebble_Pond and post internal messages that will function as predictions about future sensory messages. 2. The (sensory, internal) classifiers will attempt to verify these predictions with the current sensory input. When these classifiers fire they will post new internal messages, which now represent predictions about future sensory input based upon some integration of information from previous internal messages and the current sensory input. The internal messages are intended to represent "linkages" between related midpoint events from Pebble_Pond, i.e., they are intended to search out invariant "paths" through the midpoints of the complete graph that are useful in final classifications. 3. The final set of (internal, internal) classifiers is intended to combine internal classifiers into a categorization or classification "action," with the result that the system gets payoff, which is then intended to trickle back via the bucket-brigade algorithm.
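One much simplified way to picture these three sets is as plain data records: sensory messages carry a midpoint event, internal messages carry a prediction linking events into a path, and each classifier carries a strength adjusted by the bucket brigade. The field names and the matching rule below are assumptions made for this sketch; the formal specification is in [4].

    # Illustrative data records for the message/classifier vocabulary described
    # above.  Field names and the matching rule are assumptions of this sketch.
    from dataclasses import dataclass, field

    @dataclass
    class SensoryMessage:          # a midpoint event reported by Pebble_Pond
        time: float                # wave iteration (spatial scale) of the meeting
        midpoint: tuple            # (x, y) location of the wavefront meeting
        sources: tuple             # the two arbitrary feature labels involved

    @dataclass
    class InternalMessage:         # a prediction posted by a classifier
        delay: float               # expected time until the next meeting event
        angle: float               # expected direction to the next midpoint
        path: list                 # sequence of midpoint events linked so far

    @dataclass
    class Classifier:
        kind: str                  # "sensory", "sensory-internal" or "internal-internal"
        condition: dict = field(default_factory=dict)
        strength: float = 1.0      # "fitness"; adjusted by the bucket brigade

        def matches(self, message):
            # Toy matching rule; a real classifier system matches messages
            # against ternary {0, 1, #} condition strings.
            return all(getattr(message, k, None) == v for k, v in self.condition.items())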
Thus, as sensory messages from Pebble_Pond come into the system, they are either treated as "new" events by sensory classifiers or are integrated with previous sensory events via previously posted internal messages. The internal messages are intended to represent stable spatial structures that might be directly relevant to the final categorization of a specific class of inputs or might be shared by different classes (e.g., two different classes might share a rectangular substructure). As internal messages are successively generated from a sequence of previous internal and sensory messages, successively larger spatial structures are constructed. The (internal, internal) messages attempt to combine these spatial structures (hopefully at the right time!) into categorizations.

Consider the configuration of 5 points shown in Figure 5, which represents a prototypical pattern that might be learned. The five point sources are indicated by crosses and labeled i through m. The small diamonds represent some edge midpoint events (messages) generated by Pebble_Pond. Since Pebble_Pond will generate messages in spatial scale order (small to large), the two midpoint messages, m(j,k) and m(i,k), will be generated before other midpoint messages. These messages might be picked up by sensory classifiers and incorporated into a prediction about future midpoint messages. For example, if m(i,k) is matched by a sensory classifier, a prediction might be made that at some future point in time another match will occur involving a midpoint event between either point i or k and some other point; the relationship between the events in the prediction is given by the vector drawn between m(j,k) and m(m,l), where the length of the vector corresponds to the predicted delay in time between m(j,k) and m(m,l), and the angle prediction of θ2 will be part of the internal message. (Note that there is not complete consistency in the origin used to define the angles, so as not to clutter up Figure 5 any more than necessary.) The information on the sequence of vectors - time delays and angles - is passed on from one active classifier to the next. It is in this sense that the arbitrary (line-scan order) labels of the features are associated with, or remapped onto, semantically meaningful labels. In other words, learning will result in sequences of cooperating classifiers which represent structurally meaningful information in terms of the sequence of vectors. As data comes in from Pebble_Pond, the arbitrary feature labels become "bound" to the vector "slots" that have been built into the classifiers via learning.

Figure 5. Two possible paths that might be learned in a prototypical pattern of 5 points.

Two possible paths that might be learned in a prototypical pattern of 5 points are shown in Figure 5. The first path starts at the Pebble_Pond event corresponding to the meeting midpoint of waves from points i and k, while the second starts at the midpoint event from points j and k. The classifier system needs to learn to predict the orientation and time delay (distance) to the next meeting event. Because all events produced by Pebble_Pond are ordered over time according to spatial scale, predictions can only be made to next midpoint meetings that are at a greater time scale than the current prediction point in the path being learned. Note that paths can share meeting midpoints or sub-paths. If either of these paths is relevant to a classification then it can be used by (internal, internal) classifiers to participate in external payoff.

The basic idea is that a sequence of activated classifiers will correspond to a path through the set of edge midpoints. The paths will be constrained to involve midpoints at increasing spatial scales, with multiple paths potentially being relevant to a pattern classification. Presumably, at some point there will be a pair of messages (representing some paths) that captures some structure that is relevant to a classification action and, thus, some (internal, internal) classifier might hazard a guess. If it is correct it gets payoff, which via the bucket brigade should trickle back to increase the strengths of the (sensory, internal) and sensory classifiers that set it up.
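The payoff step just described can be illustrated with a toy bucket-brigade update: when a final classification is correct, the external reward strengthens the acting classifier, and each classifier in the chain pays part of its strength back to the one that set it up, so whole predictive paths are reinforced over time. The bid fraction and update rule below are generic classifier-system conventions assumed for this sketch, not constants from [4].

    # Toy bucket-brigade update: external payoff for a correct classification
    # action trickles back along the chain of classifiers that set it up.
    # The bid fraction is a generic assumption, not a value from [4].
    def bucket_brigade(chain, payoff, bid_fraction=0.1):
        """chain: classifier objects (anything with a .strength attribute, e.g. the
        Classifier records sketched earlier), ordered sensory -> ... -> action."""
        bids = [c.strength * bid_fraction for c in chain]
        for i, c in enumerate(chain):
            c.strength -= bids[i]                  # each classifier pays its bid
            if i > 0:
                chain[i - 1].strength += bids[i]   # the bid goes to its predecessor
        chain[-1].strength += payoff               # external reward for the correct action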
Note that (internal, internal) classifiers are restricted to function only as a way to hook internal messages to actions: they act as binary operators for combining pairs of paths into classification decisions. This leads to a concern about how the system can learn pattern classes that require more general n-ary path relations. In order to simplify the learning system, i.e., to avoid issues concerning the generation of "higher order" messages, the system will have distinct learning and performance modes. In learning mode, any pairwise combination of internal messages that can get a vote out for a given categorization will be allowed to participate in the competition. This way salient pairwise combinations of paths can be strengthened. In order to allow more than pairwise relationships to have an effect, in performance mode the action of the entire system will be the majority vote of all classifiers making predictions. This approach is intended to be a restricted but still robust first cut at the problem.

Thus, sensory messages will be coming into the system constantly, and sensory classifiers will attempt to start predicting their participation in a path composed of related sensory messages arriving at later scales, with (sensory, internal) classifiers attempting to continue the predictive path. Finally, (internal, internal) classifiers will be attempting to combine pairs of paths into classification actions. During the learning phase, correct classification actions will get payoff; during the performance phase, the majority vote will win.

The basic ideas behind a proposed system for learning spatial structure have been presented. For preprocessing, a wave propagation algorithm (Pebble_Pond) is used to transform the static spatial structure in planar point sets into a temporal sequence of measures, which can be thought of as messages emanating from the evolving wave state space. A classifier system might then be used to search for temporal regularities in the sequence of these measures, where the temporal regularities reflect regularities in spatial structure.

References

1. Skolnick, M. M., Kim, S. and O'Bara, R., "Morphological algorithms for computing non-planar point neighborhoods on cellular automata," Second International Conference on ..., 1988, pp. 106-111.
2. Marineau, P. and Skolnick, M., "Pebble_Pond: a morphological algorithm for representing diverse spatial structure," Technical Report 91-3, Dept. of Computer Science, R.P.I., Troy, N.Y., 1991 (to appear in Mathematical Morphology: Theory and Hardware, R. Haralick, Ed., Oxford U. Press, in press).
3. Skolnick, M. and Marineau, P., "Pebble_Pond: a morphological wave propagation algorithm for representing spatial point patterns," SPIE San Diego Conference on Image Algebra and Mathematical Morphology, 1992.
4. Skolnick, M., "A classifier system for learning spatial representations, based on a Pebble_Pond morphological wave propagation algorithm," SPIE Conference on Visual Communications and Image Processing, 1992, SPIE Vol. 1818, pp. 919-930.
5. Holland, J. H., Holyoak, K. J., Nisbett, R. E. and Thagard, P. R., Induction: Processes of Inference, Learning, and Discovery, MIT Press, 1986.