COMPUTER GRAPHICS AND IMAGE PROCESSING (1974) 3, (23-33)

Representation of Digitized Contours in Terms of Conic Arcs and Straight-Line Segments

ANTONIO ALBANO*

New York University, Bronx, New York 10453

Communicated by A. Rosenfeld

Received August 5, 1973

One of the most interesting tasks of scene analysis is the reconstruction problem, which is the problem of finding a three-dimensional description of a scene from two or more projections. When the scene is assumed to be composed of man-made objects bounded by quadric surfaces, the reconstruction process is simplified if the border lines in each projection are given by means of straight-line segments and conic arcs. Since each projection is a digitized picture, a preprocessing stage produces a description of the boundaries in terms of point sequences. This report discusses an algorithm for converting the point-sequence description to an analytic description using conic arcs and straight-line segments.

1. INTRODUCTION

One of the most interesting tasks in the analysis of scenes composed of man-made objects bounded by quadric surfaces is that of obtaining a description of the three-dimensional scene, given two or more digitized pictures taken from different and known vantage points. The first step for solving this problem consists of extracting from each picture a two-dimensional line structure representing the border lines. Next, a certain number of features, relevant to the reconstruction stage, must be determined. These features consist mainly of the type of junctions present in the line structures.¹ To analyze and classify these junctions, a suitable description of the border lines is needed.

The problem of describing border lines belongs to the broader task of representing lines in digitized pictures. In recent years, a number of researchers have investigated this problem and have followed the general approach of approximating lines by polygons satisfying some specified property [1-3].
Their procedures have, however, all resulted in descriptions that are scarcely useful for locating the junctions in images representing the projections of scenes that are composed of man-made objects bounded by quadric surfaces. In order to break a boundary line at an expected junction, a procedure describing the lines in terms of conic arcs and straight-line segments is needed [4].

We shall assume that our data are digitized pictures of the scene and proceed as follows: At the beginning, the digitized image is explored to extract local information such as the locations of points which belong to the border lines and the values of the tangent to the border line through each point. In this way we achieve a discrete, elementary version of the high-level line structure we are seeking [5]. Next, the points in this elementary version are joined into a line structure by means of a relatively small number of long straight-line segments and conic arcs. It is this last-named problem with which we shall be concerned in this report.

The problem to be solved has two aspects: fitting and segmenting. The fitting aspect is concerned with the problem of finding the conic arc or the straight-line segment that best fits a set of points. For what follows, let us call $e_i$ the error in the $i$th arc approximation. The segmenting aspect is concerned with the problem of finding the smallest number $p$ of arcs into which a set of points can be divided, such that the total error $\sum_{i=1}^{p} e_i$ is less than a maximum tolerable value. In the next section we shall introduce a least-squares error function for fitting conic arcs and then show how to achieve segmentation using this fitting procedure.

* On leave from University of Pisa, Department of Computer Science, Corso Italia 40, 56100 Pisa, Italy.

¹ A junction is the intersection of two or more curves in the plane of projection.

Copyright © 1974 by Academic Press, Inc. All rights of reproduction in any form reserved.

2. FITTING OF CONIC ARCS

Let

$$Q(x,y) = Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$$

be the general conic equation, and let $d_i$ be the distance from a point $(x_i, y_i)$ to the conic $Q$. If the points are of different significance because of variations in the accuracy by which they were determined, we can associate with each a weighting coefficient $w_i$. The conic which best fits a given set of $N$ points can be determined using the least-squares error criterion, which minimizes the following quantity:

$$S = \sum_{i=1}^{N} w_i d_i^2 .$$

Unfortunately, the minimum value of $S$ with the exact expression for $d_i$ is difficult to evaluate. The task becomes simpler if we consider, instead of the true distances $d_i$, the values assumed by $|Q(x,y)|$ at the given points. The quantity $|Q(x_i,y_i)|$ vanishes for points which belong to the conic and takes progressively larger values as the points' distance from the conic increases. The function to be minimized now becomes

$$S = \sum_{i=1}^{N} w_i Q^2(x_i, y_i).$$

To obtain the minimum value of $S$, which is a function of the six variables $A, B, \ldots, F$, we set the value of $F$ to 1 and the following five first partial derivatives to 0:

$$\frac{\partial S}{\partial A} = \sum_{i=1}^{N} 2 w_i x_i^2 (A x_i^2 + B x_i y_i + C y_i^2 + D x_i + E y_i + 1) = 0,$$

$$\frac{\partial S}{\partial B} = \sum_{i=1}^{N} 2 w_i x_i y_i (A x_i^2 + B x_i y_i + C y_i^2 + D x_i + E y_i + 1) = 0,$$

$$\vdots$$

$$\frac{\partial S}{\partial E} = \sum_{i=1}^{N} 2 w_i y_i (A x_i^2 + B x_i y_i + C y_i^2 + D x_i + E y_i + 1) = 0.$$

We then obtain the five simultaneous linear equations:

$$\begin{aligned}
A\sum w_i x_i^4 + B\sum w_i x_i^3 y_i + C\sum w_i x_i^2 y_i^2 + D\sum w_i x_i^3 + E\sum w_i x_i^2 y_i + \sum w_i x_i^2 &= 0,\\
A\sum w_i x_i^3 y_i + B\sum w_i x_i^2 y_i^2 + C\sum w_i x_i y_i^3 + D\sum w_i x_i^2 y_i + E\sum w_i x_i y_i^2 + \sum w_i x_i y_i &= 0,\\
A\sum w_i x_i^2 y_i^2 + B\sum w_i x_i y_i^3 + C\sum w_i y_i^4 + D\sum w_i x_i y_i^2 + E\sum w_i y_i^3 + \sum w_i y_i^2 &= 0,\\
A\sum w_i x_i^3 + B\sum w_i x_i^2 y_i + C\sum w_i x_i y_i^2 + D\sum w_i x_i^2 + E\sum w_i x_i y_i + \sum w_i x_i &= 0,\\
A\sum w_i x_i^2 y_i + B\sum w_i x_i y_i^2 + C\sum w_i y_i^3 + D\sum w_i x_i y_i + E\sum w_i y_i^2 + \sum w_i y_i &= 0,
\end{aligned} \qquad (1)$$

where the symbol $\sum$ implies summation for $i$ from 1 to $N$. Once the above system is solved, the conic equation is completely defined.
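As a minimal sketch of this fitting step, system (1) can be assembled and solved with NumPy as shown below. The function name `fit_conic` and its argument layout are illustrative assumptions, not part of the original FORTRAN program. The sketch leaves out the coordinate standardization and equation normalization, and, because it fixes F = 1, it cannot represent a conic that passes through the origin.

```python
import numpy as np

def fit_conic(x, y, w=None):
    """Return (A, B, C, D, E) minimizing sum_i w_i * Q(x_i, y_i)**2 with F = 1.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)
    # One row per point: the monomials that multiply A, B, C, D, E in Q.
    m = np.column_stack([x * x, x * y, y * y, x, y])
    # System (1) in matrix form: (M^T W M) p = -(M^T W 1).
    lhs = m.T @ (w[:, None] * m)
    rhs = -(m.T @ w)
    return np.linalg.solve(lhs, rhs)
```

For points sampled from the circle (x - 1)^2 + (y - 2)^2 = 4, whose expanded form x^2 + y^2 - 2x - 4y + 1 = 0 already has F = 1, the returned coefficients should be close to (1, 0, 1, -2, -4). At least five points in general position are needed for the 5 x 5 system to be nonsingular.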
In order to reduce the influence of numerical errors, it has been found convenient to standardize the coordinates of the points (see Appendix 1) before determining the coefficients of the equations and to normalize each equation before solving the system (1).

Linear Constraints

Since the described fitting procedure has to be used also in the segmentation stage, it is necessary to modify the procedure in order to introduce some strong and/or weak constraints. By strong constraint we mean that the conic has to pass through some specified point or through a specified point with a specified tangent. Such constraints, fewer than five in number, can be expressed as linear equations in the variables $A, B, \ldots, E$. By weak constraint we mean that a strong constraint need not be satisfied exactly but only approximately.

To handle the weak constraints, it is enough to add the corresponding equations to the system (1) and then to solve the overdetermined system $Ax = B$ by finding the solution vector that minimizes the sum of the squares of the components of the residual vector $R = Ax - B$ (least-squares method). In Appendix 2 it is shown how to find the solution in the more general case in which each equation has a weighting factor. The weighting factor can be useful when we wish to assign different importance to the weak constraints. When $k < 5$ strong constraints are present, the overdetermined system of $m$ equations is solved in such a way that the $k$ constraint equations are satisfied exactly while the remaining $m - k$ are solved according to the least-squares method.

Every time a new conic equation is determined to obtain a more accurate fitting, four equations will be added to the system (1). They correspond to the weak constraints that the conic has to pass through the two end points with the expected tangents.
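The weak constraints can be illustrated with the following sketch, which writes the "passes through a point" and "has a given tangent at a point" conditions as linear equations in A, ..., E (with F = 1) and solves a weighted overdetermined system. All names (`point_row`, `tangent_row`, `solve_weighted`) are hypothetical. As a substitution for the explicit normal equations of Appendix 2, the sketch calls `np.linalg.lstsq` on rows scaled by the square roots of the weights, which minimizes the same weighted sum of squared residuals. Strong (exactly satisfied) constraints are not handled here.

```python
import numpy as np

def point_row(x0, y0):
    """Equation 'the conic passes through (x0, y0)' with F = 1: row . p = -1."""
    return [x0 * x0, x0 * y0, y0 * y0, x0, y0], -1.0

def tangent_row(x0, y0, tx, ty):
    """Equation 'the tangent at (x0, y0) has direction (tx, ty)': grad Q . t = 0."""
    return [2 * x0 * tx, y0 * tx + x0 * ty, 2 * y0 * ty, tx, ty], 0.0

def solve_weighted(rows, rhs, weights):
    """Minimize sum_k weights[k] * (rows[k] . p - rhs[k])**2 over p = (A, ..., E)."""
    s = np.sqrt(np.asarray(weights, dtype=float))
    a = np.asarray(rows, dtype=float) * s[:, None]   # scale each equation
    b = np.asarray(rhs, dtype=float) * s
    p, *_ = np.linalg.lstsq(a, b, rcond=None)
    return p
```

In the scheme described above, the first five rows would be the equations of system (1) and the last four would come from `point_row` and `tangent_row` evaluated at the two end points; a larger weight on a constraint row pulls the fitted conic closer to satisfying it.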
Fitting Evaluation

In the least-squares approximation, the coefficients of the approximating conic are chosen such that the sum of the squares of errors is minimum. Although this is an efficient method for finding an approximation, the sum of the errors squared is not a good measure for discriminating between good and bad fits. We have found it convenient to use the quantity $E = \max_i (w_i H_i)$, where $H_i$ is an approximate value of the distance of the point $(x_i, y_i)$ with weight $w_i$ from the fitted conic. The distance $H_i$ is computed as shown in Fig. 1. $E$ is equal to zero for a perfect fit and assumes higher values when the fit becomes poorer. To decide whether or not a fit is acceptable, a threshold $T$ has to be specified whose value depends on the accuracy available in the data. The more exact the data are, the lower the threshold will be.

FIG. 1. Approximate distance from a conic arc.

3. SEGMENTATION

Let the ordered list $L$ be the set of cartesian coordinate pairs $(x_1,y_1), (x_2,y_2), \ldots, (x_N,y_N)$ associated with the points to be described by a set of conic arcs. We can describe a segmentation in $p$ parts by means of an index set $\{K_1, K_2, \ldots, K_{p+1}\}$ such that each adjacent pair of indices $(K_i, K_{i+1})$, for $i = 1, \ldots, p$, gives the position in the list $L$ of the coordinates associated with the first and last points belonging to the $i$th segment. According to this definition, adjacent segments share one point.

Assume that the number of segments $p$ is known and a function is available that associates a cost $C_i$ with each conic arc. Then the optimal index set, that is, the one minimizing the global cost $C = \sum_{i=1}^{p} C_i$, could be found by a discrete search [6-8]. Unfortunately, there are two reasons why the above solution is unfeasible: the number of possible index sets grows enormously with the number of points $N$ and, in general, the number $p$ of segments is not available.

FIG. 2. Overall segmentation algorithm. (Flow chart. Recoverable box labels: points and thresholds; possible straight-line fit on the first points?; possible straight-line fit on the last points?; fit straight line; is the fitting acceptable?; reduce current set of points; consider points left; try one conic arc; try to extend current fitted arc; are all points approximated?)

The algorithm described here makes the problem manageable by utilizing a two-stage solution. In the first stage, an approximate index set is found. It is approximate in the sense that the number of its elements is the expected one, though their values $K_i$ might not be. Then, in the second stage, the values $K_i$ are improved by a discrete search guided by the values previously found.

The algorithm to find the approximate index set is shown by the flow chart in Fig. 2. The section SL of the flow chart is to extract from the given set of points those subsets which can be approximated by straight lines. This section of the algorithm only considers the points at the beginning and end of the list $L$. Let us see how it works when it is applied at the beginning of $L$.

The decision to fit a straight line is made on the basis of the tangent information available at every point as follows. The current list $L$ is examined to find out in how many successive points the tangents are in the range $t_1 \pm \Delta t$, where $t_1$ is the tangent at the first point and $\Delta t$ is the maximum variation allowed to the tangents in order to consider the points candidates for a straight-line fit. If the number of points with the above property is greater than a prespecified minimum $n_{\min}$, then a straight line through the first and the last point is evaluated together with the point $M$ with the maximum distance $d_M$ from the line.
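The candidate test just described can be sketched in Python as follows. The function names are hypothetical, tangents are taken as orientations in degrees, and treating them as wrapping around at 180° is an assumption suggested by the quantization of tangents between 0° and 180°; the first and last points are assumed to be distinct.

```python
import math

def tangent_run_length(tangents_deg, dt_deg):
    """Count successive points, from the start, whose tangent lies within t_1 +/- dt."""
    t1 = tangents_deg[0]
    n = 0
    for t in tangents_deg:
        diff = abs(t - t1) % 180.0
        diff = min(diff, 180.0 - diff)   # assumed: orientations wrap at 180 degrees
        if diff > dt_deg:
            break
        n += 1
    return n

def max_chord_distance(points):
    """Return (M, d_M): index and distance of the point farthest from the
    straight line through the first and last points of the list."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)          # assumed nonzero
    best_m, best_d = 0, 0.0
    for m, (x, y) in enumerate(points):
        d = abs(dx * (y - y0) - dy * (x - x0)) / length
        if d > best_d:
            best_m, best_d = m, d
    return best_m, best_d
```

A caller would first compare `tangent_run_length(...)` with the minimum run length and only then evaluate `max_chord_distance(...)` on the candidate points.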
If $d_M$ is less than the threshold $T$, the straight line is assumed as the fitted one.² Otherwise the set of points is reduced according to the index set $\{K_1, K_M\}$ and the preceding steps are repeated. The final fitting will be considered meaningful only if the number of fitted points is greater than $n_{\min}$. If it is not, this straight-line fitting stage will be abandoned. When a straight line has been accepted, the remaining points are considered and the procedure is repeated as long as some straight-line fit can be found. The same steps are repeated starting with the last points on the current list $L$.

Section C of the flow chart in Fig. 2 finds the longest conic arc fit to a set of points. Let us assume for simplicity that no straight lines have been fitted with the section SL. The algorithm starts by evaluating the conic arc that best fits all the $N$ given points. If this fitting gives an error of less than a prespecified threshold, then the index set $\{K_1, K_2\}$, where $K_1 = 1$ and $K_2 = N$, is taken as the optimal one and the second stage is skipped. Otherwise the set of points is divided into two parts according to the index set $\{K_1, K_2, K_3\}$, where $K_1 = 1$, $K_3 = N$, and $K_2 = [(K_1 + K_3)/2]$ (binary search). The first set of points to be fitted will be the one identified by the index pair $(K_1, K_2)$. If the fitting is not acceptable, the set of points will be further subdivided until a good fitting is found.

Let us assume that the fitting to the set of points identified by the indices $(K_1, K_h)$ is not acceptable whereas the one identified by $(K_1, K_k)$ is, where $K_k = [(K_1 + K_h)/2]$. In order to decide which are the first two elements $(K_1, K_2)$ of the index set we are looking for, the procedure tries to extend the fitting by looking for the highest value of $K_2$, where $K_k \le K_2 < K_h$, which still gives an acceptable fitting. This search is again accomplished by means of a binary search between $K_k$ and $K_h$, applying the fitting procedure at each step.
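The halving and extension steps of Section C can be sketched as below. This is an illustration under stated assumptions rather than the original program: `acceptable(first, last)` is a hypothetical callback that fits a conic to the points with indices `first..last` and reports whether the error is within the threshold, indices are 1-based as in the text, and the bisection implicitly assumes that shortening an acceptable arc keeps it acceptable.

```python
def longest_acceptable_arc(first, last, acceptable):
    """Return K2, the largest end index located by bisection such that the
    arc over points first..K2 passes the (caller-supplied) fitting test."""
    if acceptable(first, last):
        return last                      # a single arc covers all points
    hi = last                            # (first, hi) is known to be unacceptable
    lo = (first + hi) // 2
    # Halving: subdivide until an acceptable arc (first, lo) is found.
    while lo > first and not acceptable(first, lo):
        hi = lo
        lo = (first + hi) // 2
    # Extension: binary search for the highest acceptable end in [lo, hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if acceptable(first, mid):
            lo = mid
        else:
            hi = mid
    return lo
```

Each call to `acceptable` corresponds to one run of the fitting procedure, so the number of fits grows roughly with the logarithm of the number of points rather than linearly.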
Once the conic arc fitted to the points identified by the index set $(K_1, K_2)$ has been found, both the procedures for a straight-line and conic arc fit are again applied to the set $\{K_2, K_3\}$ until all the $K_i$ values are found. From now on, however, the only points considered for straight-line fitting will be the ones at the beginning of the current list.

² In general, this line is not the best fit to the current set of points. This problem will be resolved in the second stage, where the position of every line fit will be adjusted. However, the applicability of this simple algorithm is generally restricted to data with moderate noise.

Solution Improvement

Once the approximate index set has been found, it will be used for simplifying the determination of the final one. Let us assume that the computed index set has the expected number of elements $p$ and that the value of each $K_i$ element can vary by $\Delta_i$. To find the final index set, it is no longer necessary to consider the best solution among all the possible segmentations of $N$ points into $p$ parts; it is enough to consider just the possible sets of indices generated by varying each $K_i$, where $i = 2, \ldots, p$, within the range $K_i \pm \Delta_i$.

4. EXPERIMENTAL RESULTS

The fitting and the approximate segmentation parts of the algorithm have been implemented in FORTRAN V on a UNIVAC 1108. No particular effort was made to optimize the speed of the program because we were more interested in testing the effectiveness of the algorithm than in speed. The information about the problem is given to the program by specifying for each point the coordinates, the weight, and the value of the expected tangent. The tangents are quantized into 32 values between 0° and 180°.

The data used for the test problems were obtained in the preprocessing stage from the two stroke images [5] shown in Figs. 3 and 4. These data were calculated from the original scanned pictures and were displayed on an ADAGE AGT/30 graphics terminal. The stroke images show the points where an edge element exists by giving a unit straight-line segment centered in each point and oriented with the expected tangent orientation of the edge at that point.

FIG. 3. Stroke image.

FIG. 4. Stroke image.

FIG. 5. Segmentation examples.

In Fig. 5 two sets of points from Fig. 3 are shown. The first point of the list $L$ is circled. The first marked set has been segmented as indicated in the figure and has been approximated by a straight-line segment and an elliptic arc. The computer execution time for this calculation was 0.28 sec. The second marked set required an execution time of 0.62 sec and was approximated by a hyperbolic arc, a straight-line segment, an elliptic arc, and a straight-line segment. The first conic arc is not reliable because it is only evaluated at five points. The algorithm is organized in such a way that if the current set of points to be approximated has five or fewer points, it will be indicated as a noisy segment and will not be further considered in the segmentation process.

FIG. 6. Segmentation examples.

Other segmentation examples are shown in Fig. 6, which is obtained from Fig. 4. The computer execution time for the five marked sets was, respectively, 0.56, 0.17, 0.16, 0.23, and 0.41 sec.

In this version of the algorithm, the improvement of the approximate solution has not as yet been implemented. This has been postponed, both because the results were already satisfactory and because the algorithm was designed to be a part of a larger program for reconstructing three-dimensional line structures. We believe, in fact, that the position of the junctions can be adjusted more easily during the reconstruction stage when more global information is available.

5.
SUMMARY AND CONCLUSION

In this report an algorithm has been described for the representation of an ordered set of points in terms of a small number of conic arcs and straight-line segments. The number and locations of the segments found were in most of the cases a good approximation of what one would have expected. The algorithm, however, cannot be assumed to be generally valid for finding the correct segmentation with more complex and noisy data. Failures are likely to occur when there is a smooth passage from one conic to another. But from our experience with this type of problem, we feel that any more reliable algorithm not guided in the search of the optimal solution by more global consideration is expected to face the same drawbacks. However, the simplicity and the low execution time of the algorithm make it useful when used as part of a general program for reconstructing three-dimensional line structures.

ACKNOWLEDGMENTS

The author thanks Professor H. Freeman for having brought to his attention the problem described in this report. The stroke images shown in Figs. 3 and 4 were made available through the courtesy of Mr. U. Ramer. The author also thanks Ruth Shapira for her valuable suggestions concerning the solution implementation. This work was supported by the Directorate of Mathematical and Information Sciences, Air Force Office of Scientific Research, under Grant No. AF-AFOSR-70-1854.

APPENDIX 1. COORDINATE STANDARDIZATION

Let

$$x_B = \frac{\sum w_i x_i}{\sum w_i}, \qquad y_B = \frac{\sum w_i y_i}{\sum w_i}$$

be the centroid coordinates of the original points. The coordinate standardization is accomplished with the following steps:

Step 1. The original coordinates are referred to $x_B, y_B$ with the conversion

$$\bar{x}_i = x_i - x_B, \qquad \bar{y}_i = y_i - y_B.$$

Step 2. Let $M_{20} = \sum w_i \bar{x}_i^2$, $M_{02} = \sum w_i \bar{y}_i^2$, and $R = (M_{02} + M_{20})^{1/2}$; then the standardized coordinates are given by the conversion

$$x_{si} = \frac{\bar{x}_i}{R}, \qquad y_{si} = \frac{\bar{y}_i}{R}.$$

Once the conic equation

$$A x_s^2 + B x_s y_s + C y_s^2 + D x_s + E y_s + F = 0$$

has been determined, it can be expressed in the original frame of reference by the equation:

$$\frac{[(y - y_0) - t_a (x - x_0)]^2}{(t_a^2 + 1)\,a^2} + \frac{[(y - y_0) - t_b (x - x_0)]^2}{(t_b^2 + 1)\,b^2} = 1, \qquad (1)$$

where $x_0, y_0$ denote the coordinates of the center,

$$x_0 = \left(\frac{2CD - BE}{B^2 - 4AC}\right) R + x_B, \qquad y_0 = \left(\frac{2AE - BD}{B^2 - 4AC}\right) R + y_B;$$

$t_a, t_b$ are the slopes of the conic axes,

$$t_a = \frac{C - A + ((C - A)^2 + B^2)^{1/2}}{B}, \qquad t_b = \frac{C - A - ((C - A)^2 + B^2)^{1/2}}{B};$$

and $a^2, b^2$ are, respectively, the square of half the major and minor axes,

$$F_s = \frac{BDE - AE^2 - CD^2}{B^2 - 4AC},$$

$$a^2 = \left(\frac{2(F - F_s)}{B^2 - 4AC}\right)\left(C + A + ((C - A)^2 + B^2)^{1/2}\right) R^2,$$

$$b^2 = \left(\frac{2(F - F_s)}{B^2 - 4AC}\right)\left(C + A - ((C - A)^2 + B^2)^{1/2}\right) R^2.$$

If Eq. (1) represents an ellipse, both $a^2$ and $b^2$ will be positive. If the given equation represents a hyperbola, then either $a^2$ or $b^2$ will be negative. The exceptional cases of a parabola or a straight line are associated with the condition $B^2 - 4AC = 0$. These cases rarely occur in practice.

APPENDIX 2. LEAST-SQUARES SOLUTION OF AN OVERDETERMINED SYSTEM OF WEIGHTED LINEAR EQUATIONS

Consider the system ($m > n$)

$$\begin{aligned}
a_{11} x_1 + \cdots + a_{1n} x_n &= b_1\\
a_{21} x_1 + \cdots + a_{2n} x_n &= b_2\\
&\;\;\vdots\\
a_{m1} x_1 + \cdots + a_{mn} x_n &= b_m.
\end{aligned}$$

This can be rewritten as

$$\begin{aligned}
a_{11} x_1 + \cdots + a_{1n} x_n - b_1 &= r_1\\
a_{21} x_1 + \cdots + a_{2n} x_n - b_2 &= r_2\\
&\;\;\vdots\\
a_{m1} x_1 + \cdots + a_{mn} x_n - b_m &= r_m.
\end{aligned}$$

If each equation has a weight $w_i$, we look for the numbers $x_1, x_2, \ldots, x_n$ which minimize the quantity

$$E = w_1 r_1^2 + w_2 r_2^2 + \cdots + w_m r_m^2.$$

The result of setting the derivatives of $E$ relative to $x_1, x_2, \ldots,$ and $x_n$ equal to zero can be expressed by the following system of normal equations, where $\sum$ implies summation for $i$ from 1 to $m$:

$$\begin{aligned}
\sum w_i a_{i1} a_{i1} x_1 + \sum w_i a_{i1} a_{i2} x_2 + \cdots + \sum w_i a_{i1} a_{in} x_n - \sum w_i a_{i1} b_i &= 0\\
\sum w_i a_{i2} a_{i1} x_1 + \sum w_i a_{i2} a_{i2} x_2 + \cdots + \sum w_i a_{i2} a_{in} x_n - \sum w_i a_{i2} b_i &= 0\\
&\;\;\vdots\\
\sum w_i a_{in} a_{i1} x_1 + \sum w_i a_{in} a_{i2} x_2 + \cdots + \sum w_i a_{in} a_{in} x_n - \sum w_i a_{in} b_i &= 0.
\end{aligned}$$

REFERENCES

1. U. MONTANARI, A note on minimal length polygonal approximation to a digitized contour, Commun. ACM 13, 1970, 41-47.
2. C. L. JARVIS, A method for fitting polygons to figure boundary data, Austral. Comput. J. 3, 1971, 50-54.
3. U. RAMER, An iterative procedure for the polygonal approximation of plane curves, Computer Graphics and Image Processing 1, 1972, 244-256.
4. H. FREEMAN, Computer processing of line drawings, Technical Report 403-30, Department of Electrical Engineering and Computer Science, New York University, May 1973.
5. U. RAMER, The extraction of edges from photographs of quadric bodies Part I, Technical Report 403-29, Department of Electrical Engineering and Computer Science, April 1973.
6. H. STONE, Approximation of curves by line segments, Math. Comp. 15, 1961, 40-47.
7. R. BELLMAN, On the approximation of curves by line segments using dynamic programming, Commun. ACM 4, 1961, 284.
8. B. GLUSS, Further remarks on line segment curve-fitting using dynamic programming, Commun. ACM 5, 1962, 441-443.