Generalizing from Diagrams

From: AAAI Technical Report SS-96-03. Compilation copyright © 1996, AAAI (www.aaai.org). All rights reserved.
Robert K. Lindsay
University of Michigan
205 Zina Pitcher Place
Ann Arbor, Michigan 48109
lindsay@umich.edu
Abstract
This research is developing a computational model of the use of diagrams in the understanding of mathematical concepts. The model represents diagrams with a combination of pixel arrays and propositional descriptions. It applies a general simulation algorithm to the pixel representation to verify inferences, following the directions of a person attempting to demonstrate a theorem or conjecture of plane geometry. Such reasoning involves both spatial and verbal reasoning, and a major aspect of this research is to describe with computational precision how these modes interact. This paper addresses one important aspect of this problem, namely, how to represent general conclusions that are discovered by examination of a specific diagram. In other words, the question is how to represent quantification over classes of figures while retaining the use of inference by simulation that applies only to particular instances.
Introduction
Several approaches have been taken to understanding the role of diagrams in thinking. One of these is simulation, that is, the use of representations that can be manipulated in some of the ways that their real-world counterparts could be manipulated physically, in order to make inferences about behavior and to discover relationships among features of the representation or the object represented. The rationale for this approach is that, since humans have evolved sophisticated vision and manipulation skills, these skills may subserve cognitive functions as well by permitting mental experiments. Furthermore, by relating cognition to perception, the conceptual components, say mathematical ideas, are grounded in a familiar perceptual world. I have been developing a programmed model that seeks to specify a set of processes that is sufficient to construct pixel representations of geometric diagrams, manipulate them in ways that maintain spatial relations, and access the modified representations to retrieve, without formal deduction, inferences that follow from the properties of the diagram and the constraints of spatial relations (Lindsay, 1988, 1989, 1992, 1994, 1995). I have been applying this system to "proofs without words" (Nelson 1993), (Loomis 1940) as an attempt to explain what it means to understand a mathematical idea with the aid of diagrams.
Two reasons why diagrams have traditionally been criticized as inadequate for achieving understanding of mathematical ideas are, first, that a diagram may be misleading because something that is true in a specific diagram may not follow from the assumptions that led to the construction of that diagram, and second, that it is not clear how one may generalize a conclusion about one diagram to the correct class of diagrams of which that diagram is an instance. Both of these difficulties follow from the fact that a specific diagram, to be seen and manipulated, must make commitments to such features as size, location, orientation, and other specific geometric properties because there is no purely diagrammatic way to specify general classes. Verbal reasoning, on the other hand, provides syntactic constructs such as variables and quantifiers that permit, in a seemingly natural way, the specification of general classes of objects and of general (quantified) geometric and mathematical statements.
It seems clear that it is necessary to model the understanding of generalizations if we are to model human understanding of geometric and other mathematical concepts. I see no way to do this fully without introducing the methods of verbal reasoning as part of the model, nor do I see any reason not to do so. On the other hand, there is no reason to attempt to model geometric understanding in a purely verbal way. It flies in the face of common experience and much experimental evidence to assume that diagrams play no substantive role in such processes. Thus the issue is how the two forms of representation are related, not which to use to the exclusion of the other.
One approach to the problem of generalization is found in work that attempts to describe diagrams in a formal calculus in ways that permit sound and complete deduction about the diagrams that can be so represented. Excellent recent examples of this approach include (Shin 1995) and (Wang 1995), among others. One key idea in such work is to define classes of diagrams. However, these approaches have not defined these classes in ways that can readily be implemented as computations on actual diagrams; rather they rely on the power of human perception to confirm that a given diagram is a member of a given class. This is similar to the conventional approach of logical analysis, where it is assumed one can determine that a specific formula is an instance of a general formula, and hence a candidate for the application of a deductive rule, etc. However, comparing strings of characters is relatively straightforward, if we make the natural but not trivial assumption that the characters are distinguishable members of a finite, well-defined set of characters. In the case of diagrams, which involves comparing two-dimensional arrays of pixels, the processes of recognition are not well understood and thus have not been reduced to a mechanical procedure of machine vision. In pursuing my work on the use of simulation, I have begun to address the problem of defining classes of geometric objects in such a way that there is a mechanical way of determining class membership from a pixel representation, and thus of finding instances of a class within a complex diagram, using the same representation system that supports inference by simulation.
Since I believe in the power of illustration of general ideas by appeal to special cases, it is perhaps appropriate that I now describe my approach to this problem by consideration of some specific examples.
Generalization by Simulation
The first example deals with the major triangle congruency theorems, viz., two triangles are congruent if they have two sides and the included angle congruent (SAS), two angles and the included side congruent (ASA), or all three sides congruent (SSS), but are not necessarily congruent if they have three congruent angles (AAA), two sides and a non-included angle congruent (SSA), or other combinations of components in correspondence. One way to understand these theorems is to construct two triangles that meet the prescriptions of the theorem, and then see if the two can be superimposed by simulated movement. For example, we can construct two copies of an angle of arbitrary size, construct two different arbitrary length segments, and use them as the rays of each angle copy, then connect the resulting endpoints to form two triangles at different places in the diagram. Then it might be possible to show that one triangle can be translated and rotated until it coincides with the other, verifying that instance of the SAS theorem. However, if superposition fails one must not conclude that SAS is not a theorem. It is necessary to flip (in 3-space) one of the instances, or equivalently to construct its mirror image, before attempting superposition. In the case of SAS, superposition will be achievable in one of these ways.
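The superposition test just described can be sketched in a few lines. This is only an illustration under stated assumptions: it works on exact vertex coordinates rather than on the pixel arrays used by the system described here, and the function names (sas_triangle, move_onto, reflect, superimposable) are hypothetical.

```python
import math

def sas_triangle(side1, angle_deg, side2, origin=(0.0, 0.0), heading_deg=0.0, mirrored=False):
    """Build a triangle from two sides and the included angle at `origin`.

    `heading_deg` orients the first ray; `mirrored=True` lays the second
    ray off on the other side, producing a flipped copy.
    """
    sign = -1.0 if mirrored else 1.0
    h = math.radians(heading_deg)
    a = h + sign * math.radians(angle_deg)
    ox, oy = origin
    p1 = (ox + side1 * math.cos(h), oy + side1 * math.sin(h))
    p2 = (ox + side2 * math.cos(a), oy + side2 * math.sin(a))
    return [(ox, oy), p1, p2]

def move_onto(tri, target):
    """Translate and rotate `tri` so its first vertex and first side line up with `target`'s."""
    (ax, ay), (bx, by) = tri[0], tri[1]
    (tx, ty), (ux, uy) = target[0], target[1]
    rot = math.atan2(uy - ty, ux - tx) - math.atan2(by - ay, bx - ax)
    c, s = math.cos(rot), math.sin(rot)
    return [(tx + c * (x - ax) - s * (y - ay), ty + s * (x - ax) + c * (y - ay)) for x, y in tri]

def reflect(tri):
    """Mirror image across the x-axis (the planar equivalent of a flip in 3-space)."""
    return [(x, -y) for x, y in tri]

def coincide(t1, t2, tol=1e-9):
    """True if corresponding vertices lie within `tol` of each other."""
    return all(math.dist(p, q) <= tol for p, q in zip(t1, t2))

def superimposable(tri, target):
    """Try a rigid motion first, then the mirror image; report which one worked."""
    if coincide(move_onto(tri, target), target):
        return "direct"
    if coincide(move_onto(reflect(tri), target), target):
        return "mirror"
    return None

reference = sas_triangle(3.0, 40.0, 5.0)
moved_copy = sas_triangle(3.0, 40.0, 5.0, origin=(7.0, 2.0), heading_deg=110.0)
flipped_copy = sas_triangle(3.0, 40.0, 5.0, origin=(-4.0, 1.0), heading_deg=25.0, mirrored=True)
print(superimposable(moved_copy, reference), superimposable(flipped_copy, reference))
```

A single run of this kind confirms only one instance of SAS for one choice of measures, which is exactly the limitation taken up in the rest of this section.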
Another way of verifying a congruency theorem is to construct a triangle meeting the theorem's requirements and then see if the triangle can be completed in more than one way. So to verify the SAS theorem, we could choose an arbitrary but fixed segment length, construct an angle of arbitrary fixed size at one end of the segment, and then mark off a point on the second ray that is an arbitrary but fixed distance from the angle vertex; we then "see" that the other two vertices are now in fixed locations and hence permit only one location for the third side.
One way to see that the two vertex locations are fixed in general and not just in the particular example at hand is the method of loci, which embodies a limited form of generalization through diagrams. In this method, all possible loci for a vertex are represented by a set of points, usually a line or a circle. For the SAS theorem, having picked lengths for two sides and a measure for the included angle, and fixing a starting point at an arbitrary location and an arbitrary orientation for one side, the other two vertices must lie on the rays (half-infinite lines) of the angle thus represented. In addition, one of the remaining vertices must lie on the circle with center at the starting point and radius equal to one of the side lengths, and similarly for the other vertex and the other side length. Since a circle and a ray thus constructed intersect in only one point (as can be determined by examining the diagram representation, including the loci representations), each remaining vertex is determined uniquely. Similar procedures can be used for ASA and SSS.
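The uniqueness claim for SAS can be checked numerically. The sketch below is an assumption-laden stand-in for inspecting the loci in a pixel representation: it intersects a ray with a circle analytically, and the function name ray_circle_intersections is hypothetical.

```python
import math

def ray_circle_intersections(origin, angle_deg, center, radius, tol=1e-9):
    """Points where the ray leaving `origin` at `angle_deg` meets the circle."""
    dx, dy = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    fx, fy = origin[0] - center[0], origin[1] - center[1]
    # Solve |f + t*d|^2 = r^2 for t, where d is a unit vector.
    b = 2.0 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - 4.0 * c
    if disc < -tol:
        return []
    root = math.sqrt(disc) if disc > tol else 0.0
    ts = {(-b - root) / 2.0, (-b + root) / 2.0}
    # Keep only parameters on the ray itself (t >= 0), not on its backward extension.
    return [(origin[0] + t * dx, origin[1] + t * dy) for t in sorted(ts) if t >= -tol]

# SAS: the circle is centered at the angle vertex, so each ray meets it exactly once.
vertex = (0.0, 0.0)
on_first_ray = ray_circle_intersections(vertex, 0.0, vertex, 3.0)
on_second_ray = ray_circle_intersections(vertex, 40.0, vertex, 5.0)
print(len(on_first_ray), len(on_second_ray))
```

When the circle is not centered at the origin of the ray, the same routine can return zero, one, or two points, which is the situation that arises for SSA.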
On the other hand, if we attempt to demonstrate AAA in any of these ways, it is easy to construct two appropriately prescribed triangles that are not superimposable, or to find several solutions with the method of loci. The most interesting case is the non-theorem SSA, because in some cases there are two non-congruent triangles with the given specifications. This is not obvious with the superposition constructions, but the method of loci will suffice.
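The SSA ambiguity can be made concrete by counting the possible third sides. The sketch below is not the method of loci itself; it assumes the law of cosines as an algebraic equivalent of intersecting the ray with the circle, and the function name ssa_third_sides is hypothetical.

```python
import math

def ssa_third_sides(a, b, angle_a_deg, tol=1e-9):
    """Possible lengths of the third side c, given sides a and b and the angle A opposite a.

    The law of cosines gives a^2 = b^2 + c^2 - 2*b*c*cos(A), a quadratic in c.
    Each positive root corresponds to one point where the circle of radius a
    meets the ray that carries the unknown side.
    """
    cos_a = math.cos(math.radians(angle_a_deg))
    disc = (b * cos_a) ** 2 - (b * b - a * a)
    if disc < -tol:
        return []
    root = math.sqrt(disc) if disc > tol else 0.0
    candidates = {b * cos_a - root, b * cos_a + root}
    return sorted(c for c in candidates if c > tol)

print(ssa_third_sides(5.0, 8.0, 30.0))   # two non-congruent triangles
print(ssa_third_sides(10.0, 6.0, 90.0))  # fixed right angle: a single triangle
print(ssa_third_sides(3.0, 8.0, 30.0))   # no triangle at all
```

The three calls cover the ambiguous case, the right-angle case in which the specification happens to determine the triangle, and a choice of values for which no triangle exists.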
However, each of these variations on the attempted constructions theme runs into the generalization problem, because each, including the method of loci, depends on the pre-selection of arbitrary values for length and angle measures. How are these to be chosen, and what is to assure that they are "representative"? That is why I have emphasized "arbitrary" above and in the following.
To begin with, not all selections are guaranteed to result in a triangle: the sides must satisfy the triangle inequality, and every angle must be less than 180 degrees. Although making choices that violate these rules will quickly be seen to preclude a construction, the representation and simulation methods do not "know" these things, that is, this information is not represented in a form that can guide the methods. Thus, either that knowledge is assumed as additional knowledge unrelated to the system's knowledge of space, or the method must "know" that repeated failure to construct even one triangle is not sufficient reason to give up trying other values. I will say that such knowledge is exogenous to the simulation and representation model.
Here is an alternative way to understand the congruency theorems. It, too, requires the assumption of exogenous knowledge and still runs into the generalization problems, but it is often more revealing to a human. Construct an arbitrary triangle. Fix certain measures, such as two sides and the included angle. That is, annotate the representation of the constructed triangle to indicate that the measures must remain fixed. Instruct the program to attempt to alter those features that are not fixed, namely the other side(s) and angle(s), using its simulation algorithm. [This algorithm makes incremental changes, checking for violations of pre-specified constraints, and stopping when a given condition is met. It is described in greater detail in (Lindsay 1995).] If the simulation is unable to alter the triangle, conclude that its shape is fully determined by the specified features, else that it is not. For me, this method provides an understanding of the congruency theorems that is lacking from a deductive proof because it demonstrates the interactions among sides and angles in terms of perceptual processes.
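A rough numeric stand-in for this procedure is sketched below. It is not the simulation algorithm of (Lindsay 1995): it assumes exact coordinates instead of pixels, holds vertices A and B still (so side AB must be among the fixed measures), and tries small moves of vertex C in a fixed set of directions. The names measures and can_alter, and the step and tolerance values, are hypothetical.

```python
import math

def angle_at(p, q, r):
    """Angle at vertex p between rays p->q and p->r, in radians."""
    a1 = math.atan2(q[1] - p[1], q[0] - p[0])
    a2 = math.atan2(r[1] - p[1], r[0] - p[0])
    d = abs(a1 - a2) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def measures(a, b, c):
    """All six measures of triangle ABC: three side lengths and three angles."""
    return {
        "AB": math.dist(a, b), "BC": math.dist(b, c), "CA": math.dist(c, a),
        "A": angle_at(a, b, c), "B": angle_at(b, c, a), "C": angle_at(c, a, b),
    }

def can_alter(a, b, c, fixed, step=0.01, tol=1e-3, directions=72):
    """Try incremental moves of C; succeed if some move keeps every fixed measure.

    A move is rejected as soon as any measure named in `fixed` changes by more
    than `tol`. Returning False is read as "the fixed measures determine the shape".
    """
    start = measures(a, b, c)
    for k in range(directions):
        theta = 2 * math.pi * k / directions
        moved = (c[0] + step * math.cos(theta), c[1] + step * math.sin(theta))
        now = measures(a, b, moved)
        if all(abs(now[name] - start[name]) <= tol for name in fixed):
            return True
    return False

A, B, C = (0.0, 0.0), (4.0, 0.0), (3.0, 4.0)
print(can_alter(A, B, C, {"AB", "CA", "A"}))  # two sides and the included angle fixed
print(can_alter(A, B, C, {"AB", "CA"}))       # only two sides fixed
```

Because the test is local and tolerance-based, it inherits the weaknesses discussed next: it cannot see a second solution that is not reachable by small constraint-preserving steps.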
Again, the problem arises as to what is an arbitrary triangle. The SSA case is particularly instructive here, because if the arbitrary triangle happens to have a right angle as the fixed angle, then the constructed triangle will indeed be unalterable, leading to a false conclusion. Worse yet, if the angle is not a right angle, then although there could be two possible solutions, they cannot be smoothly transformed into one another by the simulation algorithm without passing through a range of values that violate a constraint, and the algorithm does not permit this. One step toward generality is to apply the procedure to several different triangles, say a scalene, an equilateral, and so forth, with the understanding that all must pass the test. However, the knowledge that the set of cases is collectively representative of all triangles is again exogenous knowledge.
Nonetheless, using simulation to achieve more generalized understanding than can be achieved from the observation of a single diagram or a fixed set of diagrams appears to be a promising approach. For example, several diagrammatic demonstrations of the Pythagorean theorem have been successfully carried out by the program. Each of these demonstrations amounts to constructing a right triangle and constructing squares on each of its sides. The square on the hypotenuse is then divided into components that can be rearranged in such a way that they can then be made to cover the other squares exactly. The simulation algorithm can do the necessary decompositions and rearrangements to verify the equal area claim; see (Lindsay 1995). However, in these demonstrations, no explicit use is made of the fact that the triangle is a right triangle. Generalization by simulation methods can address this limitation by showing how the squares' areas are altered by small changes to the right angle. Thus, the program can readily show that increasing the right angle to an obtuse angle will increase the area of the hypotenuse square with the other squares remaining constant in area (it does so by actually making changes and measuring the results). Conversely, decreasing the right angle has the opposite effect. Furthermore, the simulation can demonstrate that these changes are monotonic. It follows that a right angle, for which the theorem has been demonstrated, is a watershed condition, hence that the relation among the areas is true of, and only of, right triangles. Again, however, the logic of this argument, while it uses inference by simulation, is exogenous to the simulation and representation model, and must be implemented by additional processes or be implicit in the user's understanding. Furthermore, even augmented by the exogenous knowledge, the methods are heuristic and do not constitute proofs. They should be viewed as psychological models, not mathematical machines.
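The watershed argument can be illustrated by varying the angle and measuring, as a minimal sketch. It assumes coordinate measurement in place of the pixel-area measurement performed by the program, and the function name hypotenuse_square_excess is hypothetical.

```python
import math

def hypotenuse_square_excess(a, b, gamma_deg):
    """Area of the square on the side opposite the angle, minus the other two squares.

    The two legs of lengths a and b meet at an angle of gamma_deg; the
    opposite side is measured from constructed coordinates, not from a formula.
    """
    g = math.radians(gamma_deg)
    end_of_b = (b, 0.0)
    end_of_a = (a * math.cos(g), a * math.sin(g))
    c = math.dist(end_of_a, end_of_b)
    return c * c - (a * a + b * b)

angles = list(range(10, 171, 10))
excess = [hypotenuse_square_excess(3.0, 4.0, g) for g in angles]

# The excess grows steadily with the angle and changes sign at 90 degrees.
monotonic = all(x < y for x, y in zip(excess, excess[1:]))
print(monotonic, round(hypotenuse_square_excess(3.0, 4.0, 90.0), 9))
```

As in the text, sampling a finite set of angles is heuristic evidence for monotonicity rather than a proof of it.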
These examples illustrate how simulation can be used to generalize beyond a single case by showing how spatial constraints interact to determine the relationship among diagrammatic features. Simulation can be used to demonstrate other generalizations as well, notably asymptotic behaviors, periodic relations, and some symmetric relations. None of these has yet been implemented within my programmed system, but I plan to attempt such extensions. In spite of this promise, the fact remains that making substantive use of such information requires exogenous knowledge, that is, knowledge that is not explicitly embodied in the representation and simulation system. As noted above, either such knowledge must remain implicit in the use of the system, or it must be represented in ways that the program can manipulate. To achieve the latter, I see no alternative to a verbal representation of what appears to be inherently verbal knowledge. Thus generalization and understanding must involve verbal representations, although they need not be exclusively verbal.
Generalization by Abstraction
The second approach to generalization that I am exploring is analogous to the methods used in formal descriptions of diagrammatic reasoning. That is, it defines representations of classes of diagrams (or diagram components) so that conclusions can be stated that are applicable to any member of the class. My research, however, seeks to define computer representations that can be derived by explicit computations on the pixel representation of diagrams that are employed in the system. Again, since this is work in progress, it is best illustrated by the example that I have been studying.
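[Figure: Euclid's proof of the Pythagorean theorem. Point labels recovered from the figure: G, H, F, A, K, B, J, E, L, D.]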
This example is based on another proof of the Pythagorean theorem, this one attributed to Euclid. It is illustrated in the figure above. Like those described earlier, it involves the partitioning of the hypotenuse square. However, the initial partition is of the irregular pentagon formed by the hypotenuse square combined with the original right triangle. The partition components are triangles. There is a symmetry to the procedure that provides opportunity to exploit the abstraction method.
After construction of the triangle and squares, the next portion of the demonstration is the construction of segment CE, thus forming the triangle ACE, then dropping a perpendicular CL from C to ED, then construction of line segment EJ parallel to AC, thus forming parallelogram ACJE, with diagonal CE. The demonstration then proceeds to show that the two triangles composing the parallelogram are congruent. It does this by the (laborious) simulation method that rotates one of these triangles through 180 degrees around one of the common vertices (say C), and then translates it along its long side until the two triangles are superimposed, establishing that they are of equal area (since they are congruent). Another simulation step establishes that the smaller triangle ACK is of equal area to triangle EJL, from which it follows that the parallelogram ACJE is equal in area to AKLE, the larger of the two rectangles into which the hypotenuse square is partitioned by segment KL. Thus it has been demonstrated that triangle ACE is half the area of that larger rectangle.
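The area relations in this step can be checked on one concrete instance. The sketch below assumes a particular coordinate layout for the labeled points (right angle at C placed at the origin, hypotenuse square ABDE built on the side of AB away from C, K the foot of the perpendicular from C on AB) and measures areas with the shoelace formula instead of by simulated superposition. The function names are hypothetical.

```python
import math

def shoelace_area(points):
    """Area of a simple polygon from its vertices listed in order."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def euclid_construction(leg_ac, leg_bc):
    """Coordinates of the labeled points for a right triangle with legs AC and BC."""
    C, A, B = (0.0, 0.0), (leg_ac, 0.0), (0.0, leg_bc)
    ux, uy = B[0] - A[0], B[1] - A[1]
    nx, ny = uy, -ux                       # perpendicular to AB, same length as AB
    if nx * (C[0] - A[0]) + ny * (C[1] - A[1]) > 0:
        nx, ny = -nx, -ny                  # make it point away from C
    E = (A[0] + nx, A[1] + ny)
    D = (B[0] + nx, B[1] + ny)
    t = ((C[0] - A[0]) * ux + (C[1] - A[1]) * uy) / (ux * ux + uy * uy)
    K = (A[0] + t * ux, A[1] + t * uy)     # foot of the perpendicular from C on AB
    L = (K[0] + nx, K[1] + ny)             # where that perpendicular meets ED
    J = (E[0] + C[0] - A[0], E[1] + C[1] - A[1])  # EJ parallel and equal to AC
    return {"A": A, "B": B, "C": C, "D": D, "E": E, "J": J, "K": K, "L": L}

p = euclid_construction(3.0, 4.0)
triangle_ace = shoelace_area([p["A"], p["C"], p["E"]])
parallelogram_acje = shoelace_area([p["A"], p["C"], p["J"], p["E"]])
rectangle_akle = shoelace_area([p["A"], p["K"], p["L"], p["E"]])
print(triangle_ace, parallelogram_acje, rectangle_akle)
```

For this instance the triangle ACE comes out as half of both the parallelogram ACJE and the rectangle AKLE, which matches the conclusion of the demonstration for one specific triangle only.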
Segment FB is now constructed, thus forming triangle FBA. With the construction of FM parallel to AB, a new parallelogram FMBA is formed with diagonal FB. We could demonstrate that the area of this triangle is half the area of the square ACGF by simulation: a rotation and translation of FBA and a translation of FMG to superimpose ABC as before. However, if the system could recognize that this situation is "the same as" the previous demonstration, the simulation steps could be avoided and the conclusion drawn immediately.
To do so, I introduce a type of representation called a signature (following Wang) that characterizes a diagram component with which a conclusion is to be associated. The representation specifies the elementary figures which compose it (e.g., a parallelogram and two triangles), how they are related (for example the triangles share a side, which is the diagonal of the parallelogram), and any stated constraints on these components or their components that are extant in the generating instance (there are no additional ones in the present case, but the parallelogram requires that opposite sides be parallel and equal in length), the default being that if no constraints are present in the generating figure, none are involved in the signature. The second process needed is one that can find in an arbitrary diagram representation any instance of this signature. Associated with the signature is a list of conclusions (e.g., that the triangles are congruent and of equal area). As presently implemented, it is necessary for the demonstrator to instruct the system when it should construct a new signature, what its components are (how they are related is determined by the program), and what conclusions are to be recorded.
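One way to picture a signature is as a small record plus a membership test. The sketch below is a guess at a data layout, not the representation used in the system: the field names, the Signature class, and the coordinate-based parallelogram test are all assumptions, and the real system is described as working from pixel representations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signature:
    """A class of diagram components together with conclusions recorded for it."""
    name: str
    components: tuple        # elementary figures that compose the configuration
    relations: tuple         # how the components are related
    constraints: tuple = ()  # stated constraints carried over from the generating instance
    conclusions: tuple = ()  # what may be concluded for any instance

def is_parallelogram(p, q, r, s, tol=1e-9):
    """Vertices in order p, q, r, s: side pq must be parallel and equal to side sr."""
    return (abs((q[0] - p[0]) - (r[0] - s[0])) <= tol
            and abs((q[1] - p[1]) - (r[1] - s[1])) <= tol)

def apply_signature(sig, quad):
    """Return the recorded conclusions if `quad` is an instance, else an empty tuple."""
    return sig.conclusions if is_parallelogram(*quad) else ()

split_parallelogram = Signature(
    name="parallelogram split by a diagonal",
    components=("parallelogram", "triangle", "triangle"),
    relations=("the triangles share a side, which is the diagonal of the parallelogram",),
    conclusions=("the triangles are congruent", "the triangles are of equal area"),
)

# A new instance is recognized and the conclusions are reused without re-simulating.
print(apply_signature(split_parallelogram, [(0, 0), (2, 0), (3, 1), (1, 1)]))
print(apply_signature(split_parallelogram, [(0, 0), (2, 0), (3, 1), (0, 1)]))
```

Used this way, the signature plays the role of a lemma: the expensive superposition is done once for the generating instance, and later instances only need the membership test.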
As with other aspects of the system, much of the burden is placed on the demonstrator to guide the system through the set of possibilities. What the system does make explicit, however, is how to represent and manipulate certain kinds of diagrammatic and verbal knowledge, and how they are related. The emphasis has been on representation, including construction, simulation, and perceptual processes needed for demonstration understanding, rather than on the search procedures for their use in demonstration invention.
The next demonstration step is to show that triangle ACE is congruent to triangle AFB; this can be verified by rotation of one of these triangles about the common vertex A until the two are superimposed. This then confirms that the larger piece AKLE of the hypotenuse square is equal in area to the square ACGF on the leg AC.
At this point, an additional abstraction is often seen by many people, namely that the other "half" of the proof (that the smaller rectangle KLDB is equal in area to the square CBIH on the leg BC) follows by "the same" argument. Indeed it is true that a formal proof of the two halves is identical except for a renaming of the points. (Gelernter 1959) devised a method of "syntactic symmetries" that could detect such cases. However, it should be possible to compute this relation from the diagram, as it were, rather than from the statements of a formal proof. That is the approach I have taken. To do so requires the formation of a second signature based on the component triangles ACE and AFB. This signature would also record information relating their sides, based on constraints on these sides because they are also sides of squares. The second half of the proof is then the sequential use of the two signatures.
Signatures are similar in concept to the "diagram configurations" of the DC model of (Koedinger & Anderson 1990). In that model, configurations were defined so that the system, which attempts to prove theorems, can apply similar methods to similar problems. (McDougal 1993) employed case-based reasoning (generalizing from previously solved cases) in his geometry proof system, POLYA, with similar purpose. In my system, the signatures are to be generated by examination of specific figures, and are used essentially as lemmas to avoid repeating simulation steps, which remain the heart of the inference process.
Summary
Pixel representations are frequently used in models of spatial reasoning. They have the advantages that it is straightforward in principle to produce them from actual scenes, they preserve metric and topological spatial properties, and they can be efficiently manipulated arithmetically. Simulation processes can readily be defined on such representations. These processes can be constructed to preserve spatial constraints as well as other constraints dictated by a particular problem. Thus simulation can be used to make inferences that follow from the spatial and situation constraints. My hypothesis is that this method of representation and inference is a more plausible psychological model of human cognition than a model based on deduction in a formal system, although there is nothing inherently contradictory in the two conceptions, and certainly a model of mathematically sophisticated people should incorporate both.
The simulation and pixel representation model, however, does not in itself embody an obvious model of generalization. Nonetheless, I have tried to show how "playing with" diagrams (or their mental or computational representations) can reveal more general relations about geometric objects than are apparent from examination of only a fixed, specific instance. Understanding the force of such play, I have noted, depends in substantive ways on the underlying model, but requires additional representational and computational abilities (exogenous to the model) in order to give a full account of how understanding is supported by simulation. In particular, one type of additional representation is some form of class description. I am attempting to provide a computational model of such a class representation that uses pixel representations and simulation as its source, rather than relying on formal definitions supplied externally and lacking computationally defined perceptual and manipulative processes.
Acknowledgments
This material is based on work supported by the United States National Science Foundation under Grant No. IRI-9203946.
References
Gelernter, H. (1959). A note on syntactic symmetry and the manipulation of formal systems by machine. Information and Control 2: 80-89.
Koedinger, K. R. & Anderson, J. R. (1990). Abstract planning and perceptual chunks: Elements of expertise in geometry. Cognitive Science 14: 511-550.
Lindsay, R. K. (1988). Images and inference. Cognition 29: 229-250. (Reprinted in J. I. Glasgow, N. H. Narayanan, and B. Chandrasekaran (Eds.), Diagrammatic reasoning: Computational and cognitive perspectives. Cambridge, MA: MIT Press, 1995.)
Lindsay, R. K. (1989). Qualitative geometric reasoning. Proceedings of the Eleventh Annual Conference of the Cognitive Science Society [Ann Arbor, MI], 418-425. Hillsdale, NJ: Lawrence Erlbaum.
Lindsay, R. K. (1994). Understanding diagrammatic demonstrations. In A. Ram & K. Eiselt (Eds.), Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society [Atlanta, GA], 572-576. Hillsdale, NJ: Lawrence Erlbaum.
Lindsay, R. K. (1995). Using diagrams to understand geometry. Technical Report, Ann Arbor, MI: University of Michigan, Mental Health Research Institute.
Loomis, E. S. (1940). Pythagorean proposition: Its proofs analyzed and classified and bibliography of sources for data of the four kinds of "proofs" (2nd ed.). Ann Arbor, MI: Edwards Brothers.
McDougal, T. F. (1993). Using case-based reasoning and situated activity to write geometry proofs. In Proceedings of the Fifteenth Annual Meeting of the Cognitive Science Society [Boulder, CO], 711-716. Hillsdale, NJ: Lawrence Erlbaum.
Nelson, R. B. (1993). Proofs without Words. Exercises in Visual Thinking. Washington, D.C.: The Mathematical Association of America.
Shin, S.-J. (1995). The logical status of diagrams. Cambridge: Cambridge University Press.
Wang, D. (1995). Studies on the formal semantics of pictures. Ph.D. diss., Institute for Logic, Language, and Computation, University of Amsterdam.