From: AAAI-80 Proceedings. Copyright © 1980, AAAI (www.aaai.org). All rights reserved.

Sticks, Plates, and Blobs: A Three-Dimensional Object Representation for Scene Analysis

Linda G. Shapiro, Prasanna G. Mulgaonkar, John D. Moriarty, Robert M. Haralick

Virginia Polytechnic Institute and State University
Department of Computer Science

This research was supported by the National Science Foundation under grants MCS-7923827 and MCS-7919741.

ABSTRACT

In this paper, we describe a relational modeling technique which categorizes three-dimensional objects at a gross level. These models may then be used to classify and recognize two-dimensional views of the object in a scene analysis system.

I. Introduction

The recognition of three-dimensional objects from two-dimensional views is an important and still largely unsolved problem in scene analysis. This problem would be difficult even if the two-dimensional data were perfect, but the data can be noisy, distorted, occluded, shadowed, and poorly segmented, making recognition much harder. Since the data is so rough, it seems reasonable that very rough models of three-dimensional objects should be used in the process of trying to classify such data. In this paper we describe a relational model and discuss its use in a scene analysis system.

There have been many approaches to modeling three-dimensional objects. For a comprehensive collection see the proceedings of the Workshop on Representation of Three-Dimensional Objects [13]. Also see Voelcker and Requicha [11] and Brown [4] for mechanical design; York et al. [14] for curved surface modeling using the Coons surface patch [5]; Horn [6] and Waltz [12] for study of light and shadows; Badler et al. [2] for study of human body modeling; and Agin and Binford [1] and Nevatia and Binford [7] for the generalized cylinder approach. The models we suggest are related to the generalized cylinder models, but are rougher descriptions that specify less detail about three-dimensional shape than do generalized cylinders.

II. Sticks, Plates, and Blobs in Relational Descriptions

A relational description of an object consists of a set of parts of the object, the attributes of the parts, and a set of relations that describe how the parts fit together. Our models have three kinds of three-dimensional parts: sticks, plates, and blobs. Sticks are long, thin parts having only one significant dimension. Plates are flattish, wide parts consisting of two nearly flat surfaces connected by a thin edge between them. Plates have two significant dimensions. Blobs are neither thin nor flat; they have three significant dimensions. All three kinds of parts are "near convex"; that is, a stick cannot bend very much, the surfaces of a plate cannot fold too much, and a blob can be bumpy, but cannot have large concavities. Figure 1 shows several examples of sticks, plates, and blobs.

Figure 1 illustrates several examples each of sticks, plates and blobs.

In describing an object, we must list the parts, their types (stick, plate, or blob), and their relative sizes; and we must specify how the parts fit together. For any two primitive parts that connect, we specify the type of connection and up to three angle constraints. The type of connection can be end-end, end-interior, end-center, end-edge, interior-center, or center-center, where "end" refers to an end of a stick, "interior" refers to the interior of a stick or surface of a plate or blob, "edge" refers to the edge of a plate, and "center" refers to the center of mass of any part.

For each type of pairwise connection, there are one, two, or three angles that, when specified as single values, completely describe the connection. For example, for a stick and a plate in the end-edge type connection, two angles are required: the angle between the stick and its projection on the plane of the plate, and the angle between that projection and the line from the connection point to the center of mass of the plate.

Requiring exact angles is not in the spirit of our rough models. Instead we will specify permissible ranges for each required angle. In our relational model, binary connections are described in the CONNECTS/SUPPORTS relation, which contains 10-tuples of the form (Part1, Part2, SUPPORTS, HOW, VL1, VH1, VL2, VH2, VL3, VH3), where Part1 connects to Part2, SUPPORTS is true if Part1 supports Part2, HOW gives the connection type, VLi gives the low value in the permissible range of angle i, and VHi gives the high value in the permissible range of angle i, i = 1, 2, 3.

The CONNECTS/SUPPORTS relation is not sufficient to describe a three-dimensional object. One shortcoming is its failure to place any global constraints on the resulting object. We can make the model more powerful merely by considering triples of parts (s1, s2, s3) where s1 and s3 both touch s2 and describing the spatial relationship between s1 and s3 with respect to s2. Such a description appears in the TRIPLE CONSTRAINT relation and has two components: 1) a boolean which is true if s1 and s3 meet s2 on the same end (or surface), and 2) a constraint on the angle subtended by the centers of mass of s1 and s3 at the center of mass of s2. The angle constraint is also in the form of a range.

Our current relational description for an object consists of ten relations. The A/V relation, or attribute-value table, contains global properties of the object. Our A/V relations currently contain the following attributes: 1) number of base supports, 2) type of topmost part, 3) number of sticks, 4) number of plates, 5) number of blobs, 6) number of upright parts, 7) number of horizontal parts, 8) number of slanted parts. The A/V relation is a simple numeric vector, including none of the structural information in the other relations. It will be used as a screening relation in matching; if two objects have very different A/V relations, there is no point in comparing the structure-describing relations. We are also using the A/V relations as feature vectors to input to a clustering algorithm. The resulting clusters represent groups of objects which are similar. Matching can then be performed on cluster centroids instead of on the entire database of models. Other relations include SIMPLE PARTS, PARALLEL PAIRS, PERPENDICULAR PAIRS, LENGTH CONSTRAINT, BINARY ANGLE CONSTRAINT, AREA CONSTRAINT, VOLUME CONSTRAINT, TRIPLE CONSTRAINT, and CONNECTS/SUPPORTS.

III. Matching

Relational matching of two-dimensional objects to two-dimensional models is a well-defined operation. See Barrow, Ambler, and Burstall [3] for a discussion of exact relational matching, Shapiro [8] for relational shape matching, and Shapiro and Haralick [10] for inexact matching. Our problem in scene analysis is to match two-dimensional perspective projections of objects (as found in an image) to the three-dimensional models stored in the database. Our approach to this problem is to analyze a single two-dimensional view of an object, produce a two-dimensional structural shape description, use the two-dimensional description to infer as much as possible about the corresponding three-dimensional description, and then use inexact matching techniques in trying to match incomplete and possibly erroneous three-dimensional object descriptions to our stored three-dimensional relational models.

We decompose a two-dimensional view into simple parts by a graph-theoretic clustering scheme as described in [9]. To match a two-dimensional object description to a three-dimensional model is to find a mapping from the two-dimensional simple parts of the object to the sticks, plates, and blobs of the model so that the relationships among the two-dimensional parts are not inconsistent with the relationships among the three-dimensional parts. For example, a binary CONNECTS relation can be constructed for the two-dimensional parts. For a pair (p1, p2) of three-dimensional model parts where (p1, p2, *, *, *, *, *, *, *, *) is an element of the CONNECTS/SUPPORTS relation and a mapping h from three-dimensional model parts to two-dimensional object parts, if (h(p1), h(p2)) is not an element of the two-dimensional CONNECTS relation, then an error has occurred. If a mapping accumulates too many errors from various n-tuples of various relations not being satisfied, that mapping cannot be considered a match.

As an example, suppose the three-dimensional model of a simple chair contains two plates (the back B and seat S) and four sticks (legs L1, L2, L3, L4). The relation obtained from just the first two columns of the CONNECTS/SUPPORTS relation is

{(S,B), (B,S), (L1,S), (S,L1), (L2,S), (S,L2), (L3,S), (S,L3), (L4,S), (S,L4)}.

Now consider the two-dimensional decomposition of Figure 2. We can construct the hypothetical connection relation

C = {(s1,s2), (s2,s1), (s3,s2), (s2,s3), (s4,s2), (s2,s4), (s4,s1), (s3,s1), (s1,s3), (s1,s4), (s5,s2), (s2,s5)}.

Then the mapping f defined by {(S,s2), (B,s1), (L1,s3), (L2,s4), (L3,s5), (L4,s4)} accumulates no error, while the mapping g defined by {(S,s1), (B,s2), (L1,s3), (L2,s4), (L3,s5), (L4,s4)} accumulates error, since (L3,S) is in the model but (g(L3), g(S)) = (s5,s1) is not in C.

Figure 2 illustrates the decomposition of a two-dimensional chair by graph-theoretic clustering.

Not all of the three-dimensional relations can be directly constructed from two-dimensional data. (If they could, the entire scene analysis problem would be much easier.) For example, only an estimate of whether one part supports another can be computed. Relations like PARALLEL PAIRS and LENGTH CONSTRAINT can also be estimated. Relations involving angles are probably the most difficult, since a perspective projection will change the angles between parts. Such information should be left out of initial matching attempts and used later to try to validate a given match or to choose between several possible matches. The precise definition of an inexact match from a two-dimensional description to a three-dimensional description is the subject of our current research.

IV. Summary of Current and Future Research

We have described a relational model for three-dimensional objects, in which the parts of an object are very grossly described as sticks, plates, or blobs. We are building a database of three-dimensional object models. The objects in the database are being clustered into groups, using a graph-theoretic clustering algorithm. Instead of comparing a two-dimensional view to every object in the database, it will be compared initially only to the centroid object of each group. Only in those groups where the unknown object is most highly related to the centroid will any full relational matching take place. Relational matching will be a form of the inexact matching we described in [10]. The general method will be to obtain estimates of the three-dimensional relations from the two-dimensional shape and match these estimates against the three-dimensional models. Deriving the algorithms and heuristics for the matching is one of our most challenging tasks.

References

1. Agin, G.J. and T.O. Binford, "Computer Descriptions of Curved Objects", IEEE Transactions on Computers, Vol. 25, April 1976.
2. Badler, N.I., J. O'Rourke, and H. Toltzis, "A Spherical Representation of a Human Body for Visualizing Movement", Proceedings of the IEEE, Oct. 1979.
3. Barrow, H.G., A.P. Ambler, and R.M. Burstall, "Some Techniques for Recognizing Structure in Pictures", Frontiers of Pattern Recognition, S. Watanabe (ed.), Academic Press, New York, 1972, pp. 1-29.
4. Brown, C.M., A.A.G. Requicha and H.B. Voelcker, "Geometric Modeling Systems for Mechanical Design and Manufacturing", Proc. ACM 1978, Washington D.C., Dec. 4-6, 1978.
5. Coons, S.A., "Surfaces for Computer-Aided Design of Space Forms", M.I.T. Project MAC, MAC-TR-41, June 1967 (AD 663504).
6. Horn, B., "Obtaining Shape From Shading Information", in The Psychology of Computer Vision, (P.H. Winston, ed.), McGraw-Hill, New York, 1975, pp. 115-155.
7. Nevatia, R.K. and T.O. Binford, "Structured Descriptions of Complex Objects", Proc. Third International Joint Conference on Artificial Intelligence, Stanford, 1973.
8. Shapiro, L.G., "A Structural Model of Shape", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 2, No. 2, 1980.
9. Shapiro, L.G. and R.M. Haralick, "Decomposition of Two-Dimensional Shapes by Graph-Theoretic Clustering", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-[...], pp. 10-[...].
10. Shapiro, L.G. and R.M. Haralick, "Structural Descriptions and Inexact Matching", Technical Report CS79011-R, Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, Nov. 1979.
11. Voelcker, H.B. and A.A.G. Requicha, "Geometric Modeling of Mechanical Parts and Processes", Computer, Vol. 10, No. 12, Dec. 1977, pp. 48-57.
12. Waltz, D., "Understanding Line Drawings of Scenes with Shadows", in The Psychology of Computer Vision, (P.H. Winston, ed.), McGraw-Hill, New York, 1975.
13. Workshop on Representation of Three-Dimensional Objects, (R. Bajcsy, Director), University of Pennsylvania, May 1-2, 1979.
14. York, B., A. Hanson, and E.M. Riseman, "Representation of Three-Dimensional Objects with Coons Patches and Cubic B-Splines", Department of Computer and Information Science, University of Massachusetts, Amherst, MA 01003.
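The CONNECTS/SUPPORTS 10-tuple of Section II can be sketched as a small record type with a range check on measured angles. This is only an illustration: the names `Connection`, `as_10_tuple`, and `angles_permitted`, the use of `None` for angle slots a connection type does not need, and the numeric ranges are all assumptions, since the paper does not say how unused angles are encoded.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

# (low, high) in degrees; None marks an angle slot this connection type
# does not use (an assumed encoding, not taken from the paper).
AngleRange = Optional[Tuple[float, float]]


class Connection(NamedTuple):
    part1: str
    part2: str
    supports: bool      # True if part1 supports part2
    how: str            # connection type, e.g. "end-interior"
    angle1: AngleRange
    angle2: AngleRange
    angle3: AngleRange

    def as_10_tuple(self):
        """Flatten to (Part1, Part2, SUPPORTS, HOW, VL1, VH1, VL2, VH2, VL3, VH3)."""
        flat = []
        for rng in (self.angle1, self.angle2, self.angle3):
            flat.extend(rng if rng is not None else (None, None))
        return (self.part1, self.part2, self.supports, self.how, *flat)


def angles_permitted(conn: Connection, measured: Sequence[Optional[float]]) -> bool:
    """True if every angle the connection constrains lies within its range."""
    for rng, value in zip((conn.angle1, conn.angle2, conn.angle3), measured):
        if rng is None:
            continue                      # unconstrained slot
        if value is None:
            return False                  # a required angle was not measured
        low, high = rng
        if not (low <= value <= high):
            return False
    return True


# Hypothetical chair leg meeting the seat roughly at a right angle.
leg_seat = Connection("L1", "S", True, "end-interior", (80.0, 100.0), None, None)
```

Storing ranges rather than single values mirrors the paper's point that exact angles are not in the spirit of rough models; a measured angle of 90 degrees passes this hypothetical range, while 45 degrees does not.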
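The use of the A/V relation as a screening step (Section II) might look like the sketch below. The paper does not specify how two A/V vectors are compared, so the distance used here (sum of absolute differences, plus 1 for a mismatched topmost-part type), the threshold, and the attribute values for the two objects are all assumptions made for illustration.

```python
# Attribute order follows the eight attributes listed in Section II.
AV_ATTRIBUTES = (
    "base_supports", "topmost_type", "sticks", "plates",
    "blobs", "upright", "horizontal", "slanted",
)


def av_distance(a, b):
    """Assumed dissimilarity between two A/V vectors (not from the paper)."""
    total = 0
    for name, x, y in zip(AV_ATTRIBUTES, a, b):
        if name == "topmost_type":
            total += 0 if x == y else 1   # categorical attribute
        else:
            total += abs(x - y)           # count-valued attribute
    return total


def passes_screen(a, b, threshold=2):
    """Only pairs that pass go on to full relational matching."""
    return av_distance(a, b) <= threshold


# Hypothetical A/V vectors.
chair = (4, "plate", 4, 2, 0, 5, 1, 0)
lamp = (1, "blob", 1, 1, 1, 1, 1, 0)
```

With these made-up vectors the chair and lamp differ by 13 under the assumed distance, so the structural relations would not be compared at all.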
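The chair example of Section III can be checked mechanically. The sketch below encodes the model pairs and the hypothetical relation C exactly as given in the text and lists the model pairs whose images are missing from C. The function names and the `max_errors` threshold are assumptions; the paper says only that "too many errors" rules out a match.

```python
# First two columns of the chair's CONNECTS/SUPPORTS relation.
MODEL_PAIRS = {
    ("S", "B"), ("B", "S"),
    ("L1", "S"), ("S", "L1"), ("L2", "S"), ("S", "L2"),
    ("L3", "S"), ("S", "L3"), ("L4", "S"), ("S", "L4"),
}

# Hypothetical two-dimensional connection relation C from Figure 2.
C = {
    ("s1", "s2"), ("s2", "s1"), ("s3", "s2"), ("s2", "s3"),
    ("s4", "s2"), ("s2", "s4"), ("s4", "s1"), ("s3", "s1"),
    ("s1", "s3"), ("s1", "s4"), ("s5", "s2"), ("s2", "s5"),
}

f = {"S": "s2", "B": "s1", "L1": "s3", "L2": "s4", "L3": "s5", "L4": "s4"}
g = {"S": "s1", "B": "s2", "L1": "s3", "L2": "s4", "L3": "s5", "L4": "s4"}


def violated_pairs(mapping, model_pairs, image_pairs):
    """Model pairs (p1, p2) whose image (h(p1), h(p2)) is not in image_pairs."""
    return sorted(
        (p1, p2)
        for p1, p2 in model_pairs
        if (mapping[p1], mapping[p2]) not in image_pairs
    )


def is_match(mapping, model_pairs, image_pairs, max_errors=1):
    """Accept a mapping only if it accumulates at most max_errors errors."""
    return len(violated_pairs(mapping, model_pairs, image_pairs)) <= max_errors
```

Under f every model pair maps into C. Under g the pairs (L3,S) and (S,L3) map to (s5,s1) and (s1,s5), neither of which is in C, which is the error the text attributes to g.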