From: AAAI Technical Report SS-96-03. Compilation copyright © 1996, AAAI (www.aaai.org). All rights reserved.

Generalizing from Diagrams

Robert K. Lindsay
University of Michigan
205 Zina Pitcher Place
Ann Arbor, Michigan 48109
lindsay@umich.edu

Abstract

This research is developing a computational model of the use of diagrams in the understanding of mathematical concepts. The model represents diagrams with a combination of pixel arrays and propositional descriptions. It applies a general simulation algorithm to the pixel representation to verify inferences, following the directions of a person attempting to demonstrate a theorem or conjecture of plane geometry. Such reasoning involves both spatial and verbal reasoning, and a major aspect of this research is to describe with computational precision how these modes interact. This paper addresses one important aspect of this problem, namely, how to represent general conclusions that are discovered by examination of a specific diagram. In other words, the question is how to represent quantification over classes of figures while retaining the use of inference by simulation that applies only to particular instances.

Introduction

Several approaches have been taken to understanding the role of diagrams in thinking. One of these is simulation, that is, the use of representations that can be manipulated in some of the ways that their real-world counterparts could be manipulated physically, in order to make inferences about behavior and to discover relationships among features of the representation or the object represented. The rationale for this approach is that, since humans have evolved sophisticated vision and manipulation skills, these skills may subserve cognitive functions as well by permitting mental experiments. Furthermore, by relating cognition to perception, the conceptual components, say mathematical ideas, are grounded in a familiar perceptual world.

I have been developing a programmed model that seeks to specify a set of processes that is sufficient to construct pixel representations of geometric diagrams, manipulate them in ways that maintain spatial relations, and access the modified representations to retrieve, without formal deduction, inferences that follow from the properties of the diagram and the constraints of spatial relations (Lindsay, 1988, 1989, 1992, 1994, 1995). I have been applying this system to "proofs without words" (Nelson 1993), (Loomis 1940) as an attempt to explain what it means to understand a mathematical idea with the aid of diagrams.

Two reasons why diagrams have traditionally been criticized as inadequate for achieving understanding of mathematical ideas are, first, that a diagram may be misleading because something that is true in a specific diagram may not follow from the assumptions that led to the construction of that diagram, and second, that it is not clear how one may generalize a conclusion about one diagram to the correct class of diagrams of which that diagram is an instance. Both of these difficulties follow from the fact that a specific diagram, to be seen and manipulated, must make commitments to such features as size, location, orientation, and other specific geometric properties because there is no purely diagrammatic way to specify general classes. Verbal reasoning, on the other hand, provides syntactic constructs such as variables and quantifiers that permit, in a seemingly natural way, the specification of general classes of objects and of general (quantified) geometric and mathematical statements.

It seems clear that it is necessary to model the understanding of generalizations if we are to model human understanding of geometric and other mathematical concepts. I see no way to do this fully without introducing the methods of verbal reasoning as part of the model, nor do I see any reason not to do so. On the other hand, there is no reason to attempt to model geometric understanding in a purely verbal way. It flies in the face of common experience and much experimental evidence to assume that diagrams play no substantive role in such processes. Thus the issue is how the two forms of representation are related, not which to use to the exclusion of the other.

One approach to the problem of generalization is found in work that attempts to describe diagrams in a formal calculus in ways that permit sound and complete deduction about the diagrams that can be so represented. Excellent recent examples of this approach include (Shin 1995) and (Wang 1995), among others. One key idea in such work is to define classes of diagrams. However, these approaches have not defined these classes in ways that can readily be implemented as computations on actual diagrams; rather they rely on the power of human perception to confirm that a given diagram is a member of a given class. This is similar to the conventional approach of logical analysis, where it is assumed one can determine that a specific formula is an instance of a general formula, and hence a candidate for the application of a deductive rule, etc. However, comparing strings of characters is relatively straightforward, if we make the natural but not trivial assumption that the characters are distinguishable members of a finite, well-defined set of characters. In the case of diagrams, which involves comparing two-dimensional arrays of pixels, the processes of recognition are not well understood and thus have not been reduced to a mechanical procedure of machine vision.

In pursuing my work on the use of simulation, I have begun to address the problem of defining classes of geometric objects in such a way that there is a mechanical way of determining class membership from a pixel representation, and thus of finding instances of a class within a complex diagram, using the same representation system that supports inference by simulation.
Since I believe in the power of illustration of general ideas by appeal to special cases, it is perhaps appropriate that I now describe my approach to this problem by consideration of some specific examples.

Generalization by Simulation

The first example deals with the major triangle congruency theorems, viz., two triangles are congruent if they have two sides and the included angle congruent (SAS), two angles and the included side congruent (ASA), or all three sides congruent (SSS), but are not necessarily congruent if they have three congruent angles (AAA), two sides and a non-included angle congruent (SSA), or other combinations of components in correspondence.

One way to understand these theorems is to construct two triangles that meet the prescriptions of the theorem, and then see if the two can be superimposed by simulated movement. For example, we can construct two copies of an angle of arbitrary size, construct two different arbitrary length segments, and use them as the rays of each angle copy, then connect the resulting endpoints to form two triangles at different places in the diagram. Then it might be possible to show that one triangle can be translated and rotated until it coincides with the other, verifying that instance of the SAS theorem. However, if superposition fails one must not conclude that SAS is not a theorem. It is necessary to flip (in 3-space) one of the instances, or equivalently to construct its mirror image, before attempting superposition. In the case of SAS, superposition will be achievable in one of these ways.

Another way of verifying a congruency theorem is to construct a triangle meeting the theorem's requirements and then see if the triangle can be completed in more than one way.
So to verify the SAS theorem, we could choose an arbitrary but fixed segment length, construct an angle of arbitrary fixed size at one end of the segment, and then mark off a point on the second ray that is an arbitrary but fixed distance from the angle vertex; we then "see" that the other two vertices are now in fixed locations and hence permit only one location for the third side.

One way to see that the two vertex locations are fixed in general and not just in the particular example at hand is the method of loci, which embodies a limited form of generalization through diagrams. In this method, all possible loci for a vertex are represented by a set of points, usually a line or a circle. For the SAS theorem, having picked lengths for two sides and a measure for the included angle, and fixing a starting point at an arbitrary location and an arbitrary orientation for one side, the other two vertices must lie on the rays (half infinite lines) of the angle thus represented. In addition, one of the remaining vertices must lie on the circle with center at the starting point and radius equal to one of the side lengths, and similarly for the other vertex and the other side length. Since a circle and a ray thus constructed intersect in only one point (as can be determined by examining the diagram representation, including the loci representations), each remaining vertex is determined uniquely. Similar procedures can be used for ASA and SSS.

On the other hand, if we attempt to demonstrate AAA in any of these ways, it is easy to construct two appropriately prescribed triangles that are not superimposable, or to find several solutions with the method of loci. The most interesting case is the non-theorem SSA, because in some cases there are two non-congruent triangles with the given specifications. This is not obvious with the superposition constructions, but the method of loci will suffice.
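The method-of-loci argument above can be illustrated with a small numerical sketch. It is not the pixel-based program discussed in this paper: it assumes real-valued coordinates rather than pixel arrays, and the names `ray_circle_intersections`, `sas_candidates`, `ssa_candidates`, and the tolerance `eps` are hypothetical, introduced only for illustration. Each candidate vertex is found as a point where a ray (the locus fixed by an angle) meets a circle (the locus fixed by a side length).

```python
import math

def ray_circle_intersections(origin, angle_deg, center, radius, eps=1e-9):
    """Points where the ray from `origin` at `angle_deg` meets the circle.

    Illustrative helper (an assumption of this sketch, not the paper's code).
    """
    dx = math.cos(math.radians(angle_deg))
    dy = math.sin(math.radians(angle_deg))
    fx, fy = origin[0] - center[0], origin[1] - center[1]
    # Solve |origin + t*d - center|^2 = radius^2 for t, where d is a unit vector:
    # t^2 + b*t + c = 0.
    b = 2.0 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - 4.0 * c
    if disc < -eps:
        return []                      # the ray's line misses the circle
    root = math.sqrt(max(disc, 0.0))
    ts = {(-b - root) / 2.0, (-b + root) / 2.0}
    # Keep only points strictly ahead of the ray's origin.
    return [(origin[0] + t * dx, origin[1] + t * dy) for t in sorted(ts) if t > eps]

def sas_candidates(b, alpha_deg):
    """SAS: C lies on the ray from A at angle alpha and on the circle of radius b about A."""
    return ray_circle_intersections((0.0, 0.0), alpha_deg, (0.0, 0.0), b)

def ssa_candidates(c, a, alpha_deg):
    """SSA: A at the origin, B = (c, 0); C lies on the ray from A and on the circle of radius a about B."""
    return ray_circle_intersections((0.0, 0.0), alpha_deg, (c, 0.0), a)

if __name__ == "__main__":
    print("SAS candidates:", len(sas_candidates(4.0, 40.0)))
    print("SSA candidates:", len(ssa_candidates(5.0, 3.0, 30.0)))
```

In the SAS case the ray starts at the circle's center, so the two loci always meet in a single point. In the SSA case the circle is centered at the far end of the known side, and the number of intersections depends on the particular lengths chosen, which is the ambiguity described above.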
However, each of these variations on the attempted constructions theme runs into the generalization problem, because each, including the method of loci, depends on the pre-selection of arbitrary values for length and angle measures. How are these to be chosen, and what is to assure that they are "representative"? That is why I have emphasized "arbitrary" above and in the following. To begin with, not all selections are guaranteed to result in a triangle: the sides must satisfy the triangle inequality, and every angle must be less than 180 degrees. Although making choices that violate these rules will quickly be seen to preclude a construction, the representation and simulation methods do not "know" these things, that is, this information is not represented in a form that can guide the methods. Thus, either that knowledge is assumed as additional knowledge unrelated to the system's knowledge of space, or the method must "know" that repeated failure to construct even one triangle is not sufficient reason to give up trying other values. I will say that such knowledge is exogenous to the simulation and representation model.

Here is an alternative way to understand the congruency theorems. It, too, requires the assumption of exogenous knowledge and still runs into the generalization problems, but it is often more revealing to a human. Construct an arbitrary triangle. Fix certain measures, such as two sides and the included angle. That is, annotate the representation of the constructed triangle to indicate that the measures must remain fixed. Instruct the program to attempt to alter those features that are not fixed, namely the other side(s) and angle(s), using its simulation algorithm. [This algorithm makes incremental changes, checking for violations of pre-specified constraints, and stopping when a given condition is met. It is described in greater detail in (Lindsay 1995).] If the simulation is unable to alter the triangle, conclude that its shape is fully determined by the specified features, else that it is not.
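The fix-and-perturb procedure just described can be sketched as follows. This is a loose stand-in under stated assumptions, not the simulation algorithm of (Lindsay 1995), whose details are not given here: it uses real-valued coordinates instead of pixels, pins vertices A and B (so side AB is always fixed and rigid motions are excluded), and tries small moves of vertex C only. The names `measures` and `can_alter` and the parameters `step`, `slack`, and `directions` are hypothetical.

```python
import math

def measures(A, B, C):
    """Side lengths and the angle at A (in radians) of triangle ABC."""
    ab, ac, bc = math.dist(A, B), math.dist(A, C), math.dist(B, C)
    cos_a = ((B[0] - A[0]) * (C[0] - A[0]) + (B[1] - A[1]) * (C[1] - A[1])) / (ab * ac)
    cos_a = max(-1.0, min(1.0, cos_a))   # guard against rounding just outside [-1, 1]
    return {"AB": ab, "AC": ac, "BC": bc, "angle_A": math.acos(cos_a)}

def can_alter(A, B, C, fixed, step=1e-3, slack=0.05, directions=360):
    """Return True if some small move of C keeps every measure named in `fixed`.

    A and B stay pinned. A move is accepted when each fixed measure changes by
    no more than slack * step, i.e. by much less than the move itself.
    """
    target = measures(A, B, C)
    for k in range(directions):
        theta = 2.0 * math.pi * k / directions
        moved = (C[0] + step * math.cos(theta), C[1] + step * math.sin(theta))
        now = measures(A, B, moved)
        if all(abs(now[name] - target[name]) <= slack * step for name in fixed):
            return True                  # the triangle can be altered
    return False                         # every tried move violates a constraint

if __name__ == "__main__":
    A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 2.0)
    print("two sides only:", can_alter(A, B, C, fixed=("AC",)))
    print("SAS:", can_alter(A, B, C, fixed=("AC", "angle_A")))
    print("SSS:", can_alter(A, B, C, fixed=("AC", "BC")))
```

Because the test only tries small moves from the starting triangle, it is a local test: it reports whether the figure can be altered incrementally, and it says nothing about alternative triangles that cannot be reached by small steps.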
For me, this method provides an understanding of the congruency theorems that is lacking from a deductive proof because it demonstrates the interactions among sides and angles in terms of perceptual processes.

Again, the problem arises as to what is an arbitrary triangle. The SSA case is particularly instructive here, because if the arbitrary triangle happens to have a right angle as the fixed angle, then the constructed triangle will indeed be unalterable, leading to a false conclusion. Worse yet, if the angle is not a right angle, then although there could be two possible solutions, they cannot be smoothly transformed into one another by the simulation algorithm without passing through a range of values that violate a constraint, and the algorithm does not permit this. One step toward generality is to apply the procedure to several different triangles, say a scalene, an equilateral, and so forth, with the understanding that all must pass the test. However, the knowledge that the set of cases is collectively representative of all triangles is again exogenous knowledge.

Nonetheless, using simulation to achieve more generalized understanding than can be achieved from the observation of a single diagram or a fixed set of diagrams appears to be a promising approach. For example, several diagrammatic demonstrations of the Pythagorean theorem have been successfully carried out by the program. Each of these demonstrations amounts to constructing a right triangle and constructing squares on each of its sides. The square on the hypotenuse is then divided into components that can be rearranged in such a way that they can then be made to cover the other squares exactly. The simulation algorithm can do the necessary decompositions and rearrangements to verify the equal area claim; see (Lindsay 1995). However, in these demonstrations, no explicit use is made of the fact that the triangle is a right triangle.
Generalization by simulation methods can address this limitation by showing how the squares' areas are altered by small changes to the right angle. Thus, the program can readily show that increasing the right angle to an obtuse angle will increase the area of the hypotenuse square with the other squares remaining constant in area (it does so by actually making changes and measuring the results). Conversely, decreasing the right angle has the opposite effect. Furthermore, the simulation can demonstrate that these changes are monotonic. It follows that a right angle, for which the theorem has been demonstrated, is a watershed condition, hence that the relation among the areas is true of, and only of, right triangles. Again, however, the logic of this argument, while it uses inference by simulation, is exogenous to the simulation and representation model, and must be implemented by additional processes or be implicit in the user's understanding. Furthermore, even augmented by the exogenous knowledge, the methods are heuristic and do not constitute proofs. They should be viewed as psychological models, not mathematical machines.

These examples illustrate how simulation can be used to generalize beyond a single case by showing how spatial constraints interact to determine the relationship among diagrammatic features. Simulation can be used to demonstrate other generalizations as well, notably asymptotic behaviors, periodic relations, and some symmetric relations. None of these has yet been implemented within my programmed system, but I plan to attempt such extensions. In spite of this promise, the fact remains that making substantive use of such information requires exogenous knowledge, that is, knowledge that is not explicitly embodied in the representation and simulation system. As noted above, either such knowledge must remain implicit in the use of the system, or it must be represented in ways that the program can manipulate.
To achieve the latter, I see no alternative to a verbal representation of what appears to be inherently verbal knowledge. Thus generalization and understanding must involve verbal representations, although they need not be exclusively verbal.

Generalization by Abstraction

The second approach to generalization that I am exploring is analogous to the methods used in formal descriptions of diagrammatic reasoning. That is, it defines representations of classes of diagrams (or diagram components) so that conclusions can be stated that are applicable to any member of the class. My research, however, seeks to define computer representations that can be derived by explicit computations on the pixel representation of diagrams that are employed in the system. Again, since this is work in progress, it is best illustrated by the example that I have been studying.

[Figure: construction for the proof of the Pythagorean theorem attributed to Euclid. Point labels surviving from the figure: G, H, F, A, K, B, J, E, L, D.]

This example is based on another proof of the Pythagorean theorem, this one attributed to Euclid. It is illustrated in the figure above. Like those described earlier, it involves the partitioning of the hypotenuse square. However, the initial partition is of the irregular pentagon formed by the hypotenuse square combined with the original right triangle. The partition components are triangles. There is a symmetry to the procedure that provides opportunity to exploit the abstraction method.
After construction of the triangle and squares, the next portion of the demonstration is the construction of segment CE, thus forming the triangle ACE, then dropping a perpendicular CL from C to ED, then construction of line segment EJ parallel to AC, thus forming parallelogram ACJE, with diagonal CE. The demonstration then proceeds to show that the two triangles composing the parallelogram are congruent. It does this by the (laborious) simulation method that rotates one of these triangles through 180 degrees around one of the common vertices (say C), and then translates it along its long side until the two triangles are superimposed, establishing that they are of equal area (since they are congruent). Another simulation step establishes that the smaller triangle ACK is of equal area to triangle EJL, from which it follows that the parallelogram ACJE is equal in area to AKLE, the larger of the two rectangles into which the hypotenuse square is partitioned by segment KL. Thus it has been demonstrated that triangle ACE is half the area of that larger rectangle.

Segment FB is now constructed, thus forming triangle FBA. With the construction of FM parallel to AB, a new parallelogram FMBA is formed with diagonal FB. We could demonstrate that the area of this triangle is half the area of the square ACGF by simulation: a rotation and translation of FBA and a translation of FMG to superimpose ABC as before. However, if the system could recognize that this situation is "the same as" the previous demonstration, the simulation steps could be avoided and the conclusion drawn immediately.

To do so, I introduce a type of representation called a signature (following Wang) that characterizes a diagram component with which a conclusion is to be associated.
The representation specifies the elementary figures which compose it (e.g., a parallelogram and two triangles), how they are related (for example the triangles share a side, which is the diagonal of the parallelogram), and any stated constraints on these components or their components that are extant in the generating instance (there are no additional ones in the present case, but the parallelogram requires that opposite sides be parallel and equal in length), the default being that if no constraints are present in the generating figure, none are involved in the signature. The second process needed is one that can find in an arbitrary diagram representation any instance of this signature. Associated with the signature is a list of conclusions (e.g., that the triangles are congruent and of equal area).

As presently implemented, it is necessary for the demonstrator to instruct the system when it should construct a new signature, what its components are (how they are related is determined by the program), and what conclusions are to be recorded. As with other aspects of the system, much of the burden is placed on the demonstrator to guide the system through the set of possibilities. What the system does make explicit, however, is how to represent and manipulate certain kinds of diagrammatic and verbal knowledge, and how they are related. The emphasis has been on representation, including construction, simulation, and perceptual processes needed for demonstration understanding, rather than on the search procedures for their use in demonstration invention.

The next demonstration step is to show that triangle ACE is congruent to triangle AFB; this can be verified by rotation of one of these triangles about the common vertex A until the two are superimposed. This then confirms that the larger piece AKLE of the hypotenuse square is equal in area to the square ACGF on the leg AC.
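The signature mechanism described above (component figures, their relations, and an associated list of conclusions, together with a process that finds instances in a diagram) can be sketched in a few lines. This is a minimal illustration under strong assumptions, not the system's implementation: it matches against a hand-entered propositional description rather than deriving anything from pixels, and the names `canon`, `SIGNATURE`, `EUCLID_FACTS`, and `find_instances` are hypothetical. The facts listed are the two parallelograms and diagonals named in the text, plus segment KL.

```python
from itertools import permutations

def canon(fact):
    """Normalize a fact so that equivalent namings compare equal."""
    kind, *pts = fact
    if kind == "segment":                  # an undirected pair of endpoints
        return (kind, frozenset(pts))
    if kind == "parallelogram":            # a vertex cycle: any start, either direction
        rots = [tuple(pts[i:] + pts[:i]) for i in range(len(pts))]
        rots += [tuple(reversed(r)) for r in rots]
        return (kind, min(rots))
    return (kind, tuple(pts))

# Hypothetical signature: parallelogram PQRS with diagonal QS.
SIGNATURE = {
    "variables": ("P", "Q", "R", "S"),
    "requires": [("parallelogram", "P", "Q", "R", "S"), ("segment", "Q", "S")],
    # Recorded conclusion: these two triangles are congruent (hence equal in area).
    "concludes": [("P", "Q", "S"), ("R", "S", "Q")],
}

# Hand-entered description of part of the figure (an assumption of this sketch).
EUCLID_FACTS = [
    ("parallelogram", "A", "C", "J", "E"), ("segment", "C", "E"),
    ("parallelogram", "F", "M", "B", "A"), ("segment", "F", "B"),
    ("segment", "K", "L"),
]

def find_instances(signature, facts):
    """Return each matched instance as a set of two triangles (vertex sets)."""
    known = {canon(f) for f in facts}
    points = sorted({p for _, *pts in facts for p in pts})
    found = set()
    for combo in permutations(points, len(signature["variables"])):
        bind = dict(zip(signature["variables"], combo))
        needed = [(kind, *[bind[v] for v in vs]) for kind, *vs in signature["requires"]]
        if all(canon(n) in known for n in needed):
            # Store triangles as vertex sets so symmetric bindings collapse to one instance.
            found.add(frozenset(frozenset(bind[v] for v in tri)
                                for tri in signature["concludes"]))
    return found

if __name__ == "__main__":
    for pair in find_instances(SIGNATURE, EUCLID_FACTS):
        print(sorted("".join(sorted(t)) for t in pair))
```

Once the signature has been recorded from parallelogram ACJE, the same search finds parallelogram FMBA with diagonal FB, so the congruence conclusion can be reused there without repeating the rotation and translation steps. Storing triangles as unordered vertex sets drops the vertex correspondence, a simplification chosen only to keep the sketch short.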
At this point, an additional abstraction is often seen by many people, namely that the other "half" of the proof, that the smaller rectangle KLDB is equal in area to the square CBIH on the leg BC, follows by "the same" argument. Indeed it is true that a formal proof of the two halves is identical except for a renaming of the points. (Gelernter 1959) devised a method of "syntactic symmetries" that could detect such cases. However, it should be possible to compute this relation from the diagram, as it were, rather than from the statements of a formal proof. That is the approach I have taken. To do so requires the formation of a second signature based on the component triangles ACE and AFB. This signature would also record information relating their sides, based on constraints on these sides because they are also sides of squares. The second half of the proof is then the sequential use of the two signatures.

Signatures are similar in concept to the "diagram configurations" of the DC model of (Koedinger & Anderson 1990). In that model, configurations were defined so that the system, which attempts to prove theorems, can apply similar methods to similar problems. (McDougal 1993) employed case-based reasoning (generalizing from previously solved cases) in his geometry proof system, POLYA, with similar purpose. In my system, the signatures are to be generated by examination of specific figures, and are used essentially as lemmas to avoid repeating simulation steps, which remain the heart of the inference process.

Summary

Pixel representations are frequently used in models of spatial reasoning. They have the advantages that it is straightforward in principle to produce them from actual scenes, they preserve metric and topological spatial properties, and they can be efficiently manipulated arithmetically. Simulation processes can readily be defined on such representations. These processes can be constructed to preserve spatial constraints as well as other constraints dictated by a particular problem.
Thus simulation can be used to make inferences that follow from the spatial and situation constraints. My hypothesis is that this method of representation and inference is a more plausible psychological model of human cognition than a model based on deduction in a formal system, although there is nothing inherently contradictory in the two conceptions, and certainly a model of mathematically sophisticated people should incorporate both.

The simulation and pixel representation model, however, does not in itself embody an obvious model of generalization. Nonetheless, I have tried to show how "playing with" diagrams (or their mental or computational representations) can reveal more general relations about geometric objects than are apparent from examination of only a fixed, specific instance. Understanding the force of such play, I have noted, depends in substantive ways on the underlying model, but requires additional representational and computational abilities (exogenous to the model) in order to give a full account of how understanding is supported by simulation. In particular, one type of additional representation is some form of class description. I am attempting to provide a computational model of such a class representation that uses pixel representations and simulation as its source, rather than relying on formal definitions supplied externally and lacking computationally defined perceptual and manipulative processes.

Acknowledgments

This material is based on work supported by the United States National Science Foundation under Grant No. IRI9203946.

References

Gelernter, H. (1959). A note on syntactic symmetry and the manipulation of formal systems by machine. Information and Control 2: 80-89.

Koedinger, K. R. & Anderson, J. R. (1990). Abstract planning and perceptual chunks: Elements of expertise in geometry. Cognitive Science 14: 511-550.

Lindsay, R. K. (1988). Images and inference. Cognition, 29, 229-250. (Reprinted in J. I. Glasgow, N. H. Narayanan, and B. Chandrasekaran (Eds.), Diagrammatic reasoning: Computational and cognitive perspectives. Cambridge, MA: MIT Press, 1995.)

Lindsay, R. K. (1989). Qualitative geometric reasoning. Proceedings of the Eleventh Annual Conference of the Cognitive Science Society [Ann Arbor, MI], 418-425. Hillsdale, NJ: Lawrence Erlbaum.

Lindsay, R. K. (1994). Understanding diagrammatic demonstrations. In A. Ram & K. Eiselt (Eds.), Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society [Atlanta, GA], 572-576. Hillsdale, NJ: Lawrence Erlbaum.

Lindsay, R. K. (1995). Using diagrams to understand geometry. Technical Report, Ann Arbor, MI: University of Michigan, Mental Health Research Institute.

Loomis, E. S. (1940). Pythagorean proposition: Its proofs analyzed and classified and bibliography of sources for data of the four kinds of "proofs" (2nd ed.). Ann Arbor, MI: Edwards Brothers.

McDougal, T. F. (1993). Using case-based reasoning and situated activity to write geometry proofs. In Proceedings of the Fifteenth Annual Meeting of the Cognitive Science Society [Boulder, CO], 711-716. Hillsdale, NJ: Lawrence Erlbaum.

Nelson, R. B. (1993). Proofs without Words. Exercises in Visual Thinking. Washington, D.C.: The Mathematical Association of America.

Shin, S.-J. (1995). The logical status of diagrams. Cambridge: Cambridge University Press.

Wang, D. (1995). Studies on the formal semantics of pictures. Ph.D. diss., Institute for Logic, Language, and Computation, University of Amsterdam.