January, 1987  LIDS-P-1638

ESTIMATION ALGORITHMS FOR RECONSTRUCTING A CONVEX SET GIVEN NOISY MEASUREMENTS OF ITS SUPPORT LINES

Jerry L. Prince and Alan S. Willsky

ABSTRACT: In many applications, measurements of the support lines of a two-dimensional set are available. From such measurements, the convex hull of the set may be reconstructed. If, however, the measurements are noisy, then the set of measurements, taken together, may not correspond to any set in the plane -- they are inconsistent. This paper describes the consistency conditions for support line measurements when the angles of the lines are precisely known but the lateral displacements are degraded by noise. We propose three simple algorithms for obtaining consistent support line estimates from inconsistent measurements and show examples of the performance of these algorithms.

This research was conducted at the Massachusetts Institute of Technology Laboratory for Information and Decision Systems and was supported by the National Science Foundation grant ECS-8312921 and the U.S. Army Research Office grant DAAG29-84-K-0005. In addition, the work of the first author was partially supported by a U.S. Army Research Office Fellowship.

Laboratory for Information and Decision Systems, Room 35-233, M.I.T., 77 Massachusetts Avenue, Cambridge, MA, 02139, U.S.A.

1.0 INTRODUCTION

There are several applications that involve the reconstruction of a two-dimensional set using information about the support lines of the set. The simplest example of such an application is tactile sensing in robotics [Schneiter, 1986]. Suppose the typical parallel plate robot jaw shown in Figure 1 were to close down on a "thick 2-D object" from a variety of different angles. Each time the jaw closes down on the object, two support lines are identified, one for each plate. Given all the support lines over the angular interval $[0, 2\pi)$, one can then reconstruct a convex set -- the convex hull of the object -- in which the object must be contained.

[Figure 1. A parallel plate robot jaw that rotates in the plane measures support lines of the set S.]

This paper considers the problem in which there are a finite number of support line measurements for which the angles are known precisely but the lateral displacements are noisy. Considering again the example above, one can imagine obtaining four support line measurements, containing the object in a rectangle (as shown in Figure 2), and then acquiring a fifth (noisy) measurement for which the measured support line completely misses the rectangle. This set constitutes an inconsistent set of support line measurements (see Section 3); it is this situation that this paper addresses. How does one estimate the set (or convex hull of the set) when faced with inconsistent support line measurements of this type?

[Figure 2. If L_1--L_4 are known to be true support lines, then L_5, a noisy measurement, cannot possibly be a support line of any set supported by L_1--L_4.]

Another area of application, which actually motivated this study, is computed tomography (see [Herman, 1980], for example). In computed tomography, the measurements are (possibly noisy) line integrals in the plane. For example, consider the parallel-ray projection geometry shown in Figure 3. For a given angle $\theta$, the line integrals
$$g(t,\theta) = \int_{L(t,\theta)} f(x,y)\, dl$$
are obtained for many discrete values of $t$, forming the projection of $f(x,y)$ at angle $\theta$, as depicted in the figure.
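To make this concrete, consider the following hedged numerical sketch (our own illustration, not from the paper): for a disk-shaped object, the parallel projection at angle $\theta$ is non-zero exactly where the line $L(t,\theta)$ meets the disk, so the extreme lateral positions at which the projection is non-zero estimate two support values, anticipating the discussion that follows. The names `disk_projection` and `estimate_support_pair` are ours.

```python
import numpy as np

# Hedged sketch: sample the parallel projection of a disk of radius R
# centered at c, then read off where it becomes non-zero; those lateral
# positions estimate the support values at theta and theta + pi.

def disk_projection(t, theta, center=(0.3, -0.1), R=0.5):
    # chord length of the disk along the line {x : x.w = t}, w = (cos, sin)
    w = np.array([np.cos(theta), np.sin(theta)])
    d = t - np.asarray(center) @ w          # signed distance from disk center
    return 2.0 * np.sqrt(np.maximum(R**2 - d**2, 0.0))

def estimate_support_pair(theta, ts, eps=1e-12):
    g = disk_projection(ts, theta)
    nz = np.where(g > eps)[0]
    # support value at theta, and at theta + pi (note the sign flip)
    return ts[nz[-1]], -ts[nz[0]]

ts = np.linspace(-2, 2, 2001)
print(estimate_support_pair(0.0, ts))       # ~ (0.8, 0.2) for this disk
```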
Given a projection of $f(x,y)$, one may estimate two support lines (corresponding to the dotted lines in Figure 3) of the set S in which $f(x,y)$ is non-zero. In a fashion similar to that described above, a set that contains S may be built up using the support line estimates from many angular projections. However, when the observed projections are noisy, the lateral positions of the support line estimates will be inaccurate; and, as in the robotics example, the observed support lines may be inconsistent. This paper provides several methods for obtaining a consistent set of support lines from such measurements and, hence, an estimate of the set S. In Section 6 we discuss further aspects of the computed tomography problem.

[Figure 3. The projection g(t,θ) is formed by integrating f(x,y) along lines L(t,θ). The dotted lines are support lines for the set S in which f(x,y) is non-zero.]

There are many other applications where the algorithms described herein may be of use. One can view this problem in terms of the generic concept of set reconstruction in computational geometry [Preparata and Shamos, 1985][Greschak, 1985]. Then, in addition to the areas of tactile sensing and computed tomography, one finds that set reconstruction is also of great interest in robot vision [Horn, 1986], chemical component analysis [Humel, 1986], and silhouette imaging [Van Hove and Verly, 1985], for example. The contribution of this paper to the topic of set reconstruction is twofold: 1) we expose the fundamental constraint imposed on finite sets of support lines, and 2) we address the problem of set reconstruction given noisy support line measurements that may violate the fundamental constraint. While the particular problems and algorithms developed are of potential value, we feel that a more important contribution of this paper is the initiation of the systematic treatment of geometric reconstruction in the presence of noise and a priori information concerning the set to be reconstructed.

The paper is organized as follows. In Section 2, we present the basic geometry and some fundamental definitions and properties. In Section 3, the definitions and properties related to discrete support functions are developed. Section 4 presents three possible algorithms that take advantage of the constraints developed in Section 3, and Section 5 presents some simple experiments and simulations showing the performance of the algorithms. Finally, in Section 6, we discuss these results and present some possible directions for further research.

2.0 BACKGROUND

In this section we present the fundamental concepts of object geometry, followed by the definitions and properties of continuous support functions and support lines.

2.1 Fundamental Object Geometry

We refer to the two-dimensional sets under discussion as objects in the plane. Although it is not strictly needed, we think of the objects as closed and bounded convex sets in the plane with no internal holes. This allows the convex region formed by the intersection of a finite number of support half spaces (see Section 3.1) to be a reasonably good approximation to the true set.

In what follows, points in the plane are denoted by two-dimensional column vectors. The dot (inner) product of two points u and v will be denoted as $u^T v$ (where $u^T$ denotes the transpose of vector u). Since the unit vector pointing in the direction $\theta$ is often required, it will be given the symbol $\omega$. Hence, $\omega = [\cos\theta \ \sin\theta]^T$ throughout this paper.
A line is a set of points that corresponds to a shifted one-dimensional linear subspace. In the plane, a line may be parametrized by two variables, t and θ, as
$$L(t,\theta) = \{\, x \in \mathbb{R}^2 \mid x^T\omega = t \,\}$$
where, as specified above, $\omega = [\cos\theta \ \sin\theta]^T$.

2.2 The Support Function and Support Lines

Consider the closed and bounded set S as shown in Figure 4. In the figure, two lines are drawn at angle θ so as to just touch opposite sides of S. Technically both lines support the set S, but we shall identify only the solid line, not the dashed line, as the support line at angle θ of set S. We make this identification rigorous in the following two definitions (refer to Figure 5).

[Figure 4. The support line at angle θ of the set S is L_1, not L_2.]

Definition: Support function, support value. The support function h(θ) for the set $S \subset \mathbb{R}^2$ is given by
$$h(\theta) = \sup_{x \in S} x^T\omega$$
(where $\omega = [\cos\theta \ \sin\theta]^T$). The value of h(θ) for some θ is called the support value or support distance at angle θ.

Definition: Support line. The support line at angle θ for the set S is given by
$$L_S(\theta) = \{\, x \in \mathbb{R}^2 \mid x^T\omega = h(\theta) \,\} .$$
Note that given the general definition for the line L(t,θ) as $L(t,\theta) = \{ x \in \mathbb{R}^2 \mid x^T\omega = t \}$, an equivalent definition is $L_S(\theta) = L(h(\theta), \theta)$.

It is important to point out that these two definitions succeed in choosing the solid line in Figure 4 as the support line of set S at angle θ. The dashed line may be identified as the support line for the angle θ+π. In this paper, the measurement of support lines corresponds to obtaining a measurement of t, a single real number, for a known angle θ. Hence, the "noise" of a support line measurement contributes only to a lateral displacement of the line, not an angular rotation.

[Figure 5. The geometry of support lines. The support line L_S(θ) is positioned as far in the ω direction as possible so as to just "graze" the set S.]

Several simple properties concerning sets and their support lines and support functions are now stated without proof [Santalo, 1976][Kelly and Weiss, 1979][Spivak, 1979].

Property 1. The support function h(θ) is periodic with period 2π.

Property 2. For any point $x \in S$ it must be true that $x^T\omega \le h(\theta)$. This property merely says that S is guaranteed to be on a particular side of its own support line (see Figure 5).

Property 3. If S is closed and bounded then the intersection of any support line of S with S itself cannot be empty. Unless otherwise noted, we assume throughout this paper that S is closed and bounded.

Property 4. The complete family of support lines (i.e., $L_S(\theta)$ with θ known over any 2π range) completely specifies the convex hull of S, denoted hul(S). If S is known to be convex then S = hul(S), and, hence, the family of support lines $L_S(\theta)$ (or equivalently, the support function h(θ)) completely specifies S.

Property 5. If h(θ) is twice differentiable then S is convex and the boundary of S is continuous and smooth (i.e., has a continuously turning normal vector), and is given by the parametric (vector) equation
$$e(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} h(\theta) \\ h'(\theta) \end{bmatrix}$$
where h'(θ) is the first derivative of h(θ) with respect to θ.

Property 6. Suppose the function q(θ) is periodic with period 2π and is twice differentiable. Then q(θ) is a support function of some set if and only if
$$q''(\theta) + q(\theta) \ge 0 .$$
This proposition is stated and proved in [Santalo, 1976], for example. The reasoning goes roughly as follows (refer to Figure 6a).
First, we notice that a support function h(θ) defines a family of lines which must be the envelope of a convex set. Using the expression in Property 5 it is possible to show that the radius of curvature of the envelope (boundary) is given by $h''(\theta) + h(\theta)$. The stated property then follows because the radius of curvature of the boundary of a convex set must always be non-negative (see [Spivak, 1979], for example). In Figure 6b we show a curve whose radius of curvature is first positive, then negative; such a curve cannot possibly be generated from any support function h(θ).

[Figure 6. (a) If h''(θ) exists then the boundary of S (= hul(S)) has a continuously turning normal; the radius of curvature at the (unique) boundary point e(θ) = L_S(θ) ∩ S is given by h''(θ) + h(θ), which is always non-negative. (b) For this "S"-shaped curve, the radius of curvature changes sign. Such a curve cannot be the envelope e(θ) generated by a support function h(θ).]

The following section essentially restates Properties 4 and 5 for the discrete support function. As with the continuous case, in the discrete case we are concerned with 1) the representation of the object geometry given a discrete support function, and 2) the nature of the constraints for a discrete function to be a discrete support function.

3.0 SUPPORT LINE CONSTRAINTS AND GEOMETRY

This section contains the bulk of the analytic and geometric background necessary to understand and appreciate the algorithms developed in Section 4. We begin by discussing finite sets of support lines, defining the support vector, and then stating and outlining the proof of the Support Theorem. Next, we discuss properties related to the geometry of the objects in the plane and to the geometry of the support vector constraints.

3.1 Discrete Support Function

We shall need some further notation. Let $\omega_i = [\cos\theta_i \ \sin\theta_i]^T$ be the unit vector pointing in the $\theta_i$ direction and let $L_i = L(h_i, \theta_i)$ be the line implied by the value $h_i$ at the angle $\theta_i$. When referring to the support angle or support line we have numerous occasions to index the angle, unit normal, or support line with expressions like i+1, i-1, etc. Unless otherwise stated, the index is assumed to be modulo M; for example, the index M+1 is to be interpreted as simply 1.

3.1.1 Finite sets of support lines. In most applications data is taken at discrete positions in space. For example, in computed tomography the data is collected from a finite set of angles and a finite set of positions at each angle. Suppose one were able to estimate the support values of some set S from observations of its projections or silhouettes at a finite number of angles. If the measurements were noisy, it may be true that the set of estimated support values does not correspond to any object at all; hence the set is inconsistent.

To see how a set of support values (with their corresponding support angles) could be inconsistent, consider Figure 7. Here two support lines corresponding to the normals $\omega_1$ and $\omega_3$ are assumed to be known precisely. We examine the possible location of the third support line, $L_2$, corresponding to the normal $\omega_2$. It is obvious that positioning $L_2$ to the right of the point $P_2$ would yield an inconsistent set -- if $L_1$ and $L_3$ are indeed support lines, then $L_2$, so positioned, could not possibly intersect any set S with support lines $L_1$ and $L_3$, thus violating Property 3 of Section 2.
The consistency of the three lines in Figure 7 is guaranteed if $L_2$ is positioned to the left of $P_2$. Then it is possible to find some set S which has $L_1$, $L_2$, and $L_3$ as support lines. In the next section we restrict the support angles $\theta_i$ to be evenly spaced over 2π and specify a test to determine the consistency of the corresponding set of support values.

[Figure 7. Assuming L_1 and L_3 are the true support lines, L_2 can also be a support line (of some set) if it is positioned to the left of P_2.]

3.1.2 Consistent support values: the Support Theorem. Suppose we are interested in specifying the support values for arbitrary sets in $\mathbb{R}^2$ at angles given by $\theta_i = 2\pi(i-1)/M$, $i = 1,\ldots,M$, where $M \ge 5$. We refer to these angles as support angles. The question we wish to answer is: given some vector $h = [h_1 \ \cdots \ h_M]^T$, under what conditions may the elements of h be considered to be the support values at the specified support angles for some set S? We will need the following definition in order to state the main theorem more compactly.

Definition: Support vector. A vector $h = [h_1 \ \cdots \ h_M]^T$ is a support vector if the lines
$$L_i = \{\, u \in \mathbb{R}^2 \mid u^T\omega_i = h_i \,\}$$
where $\omega_i = [\cos\theta_i \ \sin\theta_i]^T$ and $\theta_i = 2\pi(i-1)/M$, for $i = 1,\ldots,M$, are support lines for some set $S \subset \mathbb{R}^2$.

Now we are prepared to state the basic theorem of this paper.

Theorem 1: The Support Theorem. A vector $h \in \mathbb{R}^M$ ($M \ge 5$) is a support vector if and only if
$$h^T C \le [0 \ \cdots \ 0] \qquad (3.1)$$
where C is the M×M matrix given by
$$C = \begin{bmatrix} 1 & -k & 0 & \cdots & 0 & -k \\ -k & 1 & -k & \cdots & 0 & 0 \\ 0 & -k & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ -k & 0 & 0 & \cdots & -k & 1 \end{bmatrix}$$
and $k = 1/(2\cos(2\pi/M))$. (Here a vector inequality such as $x \le y$, where $x, y \in \mathbb{R}^n$, means that $x_i \le y_i$ for $i = 1,\ldots,n$, where $x_i$ and $y_i$ are the i-th elements of the vectors x and y respectively.)

At this point, it is important to point out the parallel development of the discrete support vector defined above with the continuous support function described in Section 2. One immediately sees the similarity between the continuous support function constraint, $h''(\theta) + h(\theta) \ge 0$, and the discrete support vector constraint, $h^T C \le 0$. In fact, it can be shown that in the limit as M goes to infinity the expression $h^T C \le 0$ goes to $h''(\theta) + h(\theta) \ge 0$ (see Appendix B). One is led to view the vector $-h^T C$ as a type of discrete radius of curvature vector (which is always non-negative), in analogy to the continuous case where the radius of curvature was found to be $h''(\theta) + h(\theta)$. This notion is made more concrete in Section 3.2, where we show that $-h^T C$ has a direct geometric interpretation in the discrete case which resembles the continuous radius of curvature. The constraint $h^T C \le 0$ is more fundamental than $h''(\theta) + h(\theta) \ge 0$, however, since the latter requires that the second derivative of h exist, which in turn implies that the set is convex and has continuously turning normals on its boundary. The discrete constraint, instead, applies for any bounded set in the plane.

Sketch of Proof: (See Appendix A for the complete proof.) It is relatively straightforward to show the necessity of condition (3.1). By hypothesis, h is a support vector of some set S. Now consider the set $D_i$ defined by the two support lines $L_{i-1}$ and $L_{i+1}$ as shown in Figure 8. Note that by hypothesis $M \ge 5$, which implies that $\theta_{i+1} - \theta_{i-1} < \pi$. This in turn implies that the two lines $L_{i-1}$ and $L_{i+1}$ have a finite intersection point $p_i$, and that $\omega_i$ may be written as a positive combination of $\omega_{i-1}$ and $\omega_{i+1}$. These two facts are necessary in order to conclude that the support value at angle $\theta_i$ for the set $D_i$ is $p_i^T\omega_i$.
Then, since $S \subset D_i$, we must have that $h_i \le p_i^T\omega_i$. With a bit of algebraic manipulation (see Appendix A), this inequality may be shown to be equivalent to the condition given by the i-th column of (3.1). This argument applies to each column independently, which shows the necessity of (3.1).

[Figure 8. D_i is the region defined by lines L_{i-1} and L_{i+1} and their normal vectors. The support line with normal vector ω_i must lie to the left of the dotted line for consistency.]

The proof of the sufficiency of (3.1) is more complicated. Here we assume we have a vector h which satisfies the conditions (3.1); we must show that h is a support vector. To do this we construct a set S for which h is a support vector. Consider the following two sets constructed from the vector h (see Figure 9):
$$S_B = \{\, u \in \mathbb{R}^2 \mid u^T[\omega_1 \ \omega_2 \ \cdots \ \omega_M] \le [h_1 \ h_2 \ \cdots \ h_M] \,\} \qquad (3.2)$$
and
$$S_V = \mathrm{hul}(v_1, v_2, \ldots, v_M) \qquad (3.3)$$
where the $v_i$'s are points defined as the intersection of two adjacent lines,
$$v_i = L_i \cap L_{i+1} \qquad (3.4)$$
with
$$L_i = \{\, u \in \mathbb{R}^2 \mid u^T\omega_i = h_i \,\} \qquad (3.5)$$
and hul(·) denotes the convex hull (of a set of points in this case). Now suppose that $S_B = S_V$. Then it must be true that h is the support vector of the set $S = S_B = S_V$. One method of proof is the following. (See Appendix A for an alternative proof of this fact.) First note from the definition of $S_B$ in (3.2) that we must have
$$\sup_{x \in S_B} x^T\omega_i \le h_i .$$
On the other hand, $v_i \in S_V = S_B$ and $v_i^T\omega_i = h_i$. Consequently, $h_i$ is the support value at this angle. What remains to be shown, then, is that condition (3.1) implies that $S_B = S_V$. The lengthy details of this part of the proof may be found in Appendix A. □

The immediate use of the Support Theorem is as a test of consistency. Given a test vector h we may determine whether h specifies a consistent set of support lines by evaluating $h^T C$ and seeing whether the elements of the resultant row vector are all non-positive. From an estimation viewpoint, we see that if we are trying to estimate a support vector h from a set of noisy measurements, then we must make sure that our estimate $\hat{h}$ satisfies $\hat{h}^T C \le 0$. In the following section we examine these constraints in more detail.

[Figure 9. The set S_B (shaded region) is the intersection of the half spaces determined by lines L_1--L_5 and their normal vectors. The set S_V (whose boundary is given by the thickened line segments) is the convex hull of the vertex points v_1--v_5.]

3.2 Geometry of the Support Cone and Implied Objects

The set $\Psi = \{\, h \in \mathbb{R}^M \mid h^T C \le [0 \ \cdots \ 0] \,\}$ is a convex polytope cone which defines the consistent set of M-dimensional support vectors. We shall call Ψ the support cone. It is a cone because it obeys the usual property of cones: if h is in Ψ then αh (α > 0) is also in Ψ. It is a polytope because it is the intersection of a finite number of closed half spaces in $\mathbb{R}^M$. The algorithms of Section 4 are inherently constrained optimization algorithms because of this constraint cone; they are, however, of the simplest type because the constraints are linear. In what follows, we study the structure of the support cone and what this structure reveals about the objects supported by vectors in the cone.

3.2.1 The Basic Object and the Vertex Points. Although a support vector does not define a unique set S in the plane, it is useful for us to think of a particular object (set) implied by h. In this paper we use the largest convex set consistent with the support vector h as the implied object -- we call this the basic object.
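Before formalizing the basic object, note that the consistency test just described is a one-line computation once C is in hand. The following hedged Python sketch (our own illustration, not the paper's code) samples the support function $h(\theta_i) = \max_k v_k^T\omega_i$ of a convex polygon -- exact, since the supremum in the definition is attained at a vertex -- and then checks $h^T C \le 0$; a noisy copy of the same vector typically fails the test.

```python
import numpy as np

# Hedged illustration (ours): build the circulant matrix C of Theorem 1,
# sample exact support values of a convex polygon, and test consistency:
# h is a support vector iff every entry of h^T C is non-positive.

def support_cone_matrix(M):
    assert M >= 5, "Theorem 1 requires M >= 5"
    k = 1.0 / (2.0 * np.cos(2.0 * np.pi / M))
    C = np.eye(M)
    i = np.arange(M)
    C[i, (i + 1) % M] = -k      # cyclic superdiagonal
    C[i, (i - 1) % M] = -k      # cyclic subdiagonal
    return C

def polygon_support(vertices, thetas):
    W = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)   # rows are w_i
    return (vertices @ W.T).max(axis=0)                      # h_i = max_k v_k.w_i

def is_consistent(h, tol=1e-12):
    return bool(np.all(h @ support_cone_matrix(len(h)) <= tol))

M = 10
thetas = 2 * np.pi * np.arange(M) / M
square = np.array([[0.5, 0.5], [-0.5, 0.5], [-0.5, -0.5], [0.5, -0.5]])
h = polygon_support(square, thetas)
print(is_consistent(h))                                  # True: exact values
rng = np.random.default_rng(0)
print(is_consistent(h + rng.uniform(-0.2, 0.2, M)))      # usually False
```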
The basic object is a convex polytope with vertices that may be found directly from the support vector h. We now give more precise definitions for these ideas.

The Basic Object. Let $h = [h_1 \ h_2 \ \cdots \ h_M]^T$ be a support vector. We define the basic object as the set
$$S_B = \{\, u \in \mathbb{R}^2 \mid u^T[\omega_1 \ \cdots \ \omega_M] \le [h_1 \ \cdots \ h_M] \,\} . \qquad (3.6)$$
An example for M = 5 is shown in Figure 10.

[Figure 10. The basic object, S_B, is the largest convex set consistent with support vector h. For lines L_i defined by a support vector h, S_B = S_V. Note that vertex points are not necessarily distinct. This occurs when 3 or more of the L_i have a common intersection point, in this case L_1, L_2, and L_5. Note that this corner of S_B is "sharper" than the others.]

It should be clear that any set which has h as its support vector is contained in $S_B$. Hence, $S_B$ is the largest set whose support vector is h. If the true object, S, is known to be convex, then as M is increased the set $S_B$ becomes a better approximation to S. If S is not known to be convex, then $S_B$ may never be a good approximation to S, but will closely approximate the convex hull of S, denoted hul(S), as M increases (see Figure 11).

[Figure 11. As M increases, S_B becomes a better approximation to hul(S) (whose boundary is indicated by dotted lines) but will never be a good approximation to S.]

Vertex points. In the proof of the Support Theorem we used the set $S_V$ defined as
$$S_V = \mathrm{hul}(v_1, v_2, \ldots, v_M) \qquad (3.7)$$
where hul(·) denotes the convex hull (of a set of points) and the points $v_1, \ldots, v_M$ are given by
$$v_i = L_i \cap L_{i+1} , \qquad i = 1,\ldots,M . \qquad (3.8)$$
We proved that when h is a support vector it must be true that $S_B = S_V$. Therefore, the points $v_1, \ldots, v_M$ must be extreme points of the basic object (see Figure 10). Ordinarily, one would call such points vertices, but since the points $v_i$ are not necessarily distinct (as demonstrated in Figure 10), we shall call them vertex points.

The transformation of the support vector h to the set of vertex points is the most convenient way to describe the basic object, since any point in the basic object is simply a convex combination of the vertex points. Let us examine this transformation in some detail. Since, by definition, $v_i = L_i \cap L_{i+1}$, we see that $v_i$ must simultaneously satisfy the two linear equations
$$v_i^T\omega_i = h_i \qquad \text{and} \qquad v_i^T\omega_{i+1} = h_{i+1}$$
which is solved as
$$v_i^T = [h_i \ h_{i+1}]\,[\omega_i \ \omega_{i+1}]^{-1} . \qquad (3.9)$$
The matrix inverse is found as follows:
$$[\omega_i \ \omega_{i+1}]^{-1} = \begin{bmatrix} \cos\theta_i & \cos\theta_{i+1} \\ \sin\theta_i & \sin\theta_{i+1} \end{bmatrix}^{-1} = \frac{1}{\sin\theta_0}\begin{bmatrix} \sin\theta_{i+1} & -\cos\theta_{i+1} \\ -\sin\theta_i & \cos\theta_i \end{bmatrix}$$
where $\theta_0 = \theta_{i+1} - \theta_i = 2\pi/M$. Hence,
$$v_i^T = \frac{1}{\sin\theta_0}\,[h_i \ h_{i+1}]\begin{bmatrix} \sin\theta_{i+1} & -\cos\theta_{i+1} \\ -\sin\theta_i & \cos\theta_i \end{bmatrix} . \qquad (3.10)$$

Radius of Curvature. In this section we interpret the geometric meaning of the vector $b = h^T C$. We show that the negative of each of the entries of the vector b is proportional to a discrete notion of "radius of curvature" for the boundary of the basic object corresponding to the support vector h. The radius of curvature vector $\rho = -b^T$ is used in Section 4.2 as a "smoothness" criterion for the estimated support vector. We now give the motivation for such usage.

For two-dimensional plane curves, the curvature κ is defined as the rate of change, with respect to arc length along the curve, of the angle φ that the unit normal of the curve makes with the x-axis:
$$\kappa = \frac{d\varphi}{ds} .$$
The radius of curvature r is then the reciprocal of the curvature (see [Thomas, 1972], for example):
$$r = \frac{1}{\kappa} .$$
The analogy to the basic (polygonal) objects of this paper is straightforward. Referring to Figure 12a, we see that the unit normal angle changes by $\theta_0$ over an arc length of $f_i$, the length of the i-th face.

[Figure 12. (a) The i-th discrete radius of curvature, r_i, has an intuitive meaning as the reciprocal of the change in angle over arc length. (b) ρ_i is proportional to r_i through the geometry.]
Referring to Figure 12a we see that the unit normal angle changes by 0 0 over an arclength of f., - 16 - th the length of the i face. & hi- ..-- fi eo (a) Pi p Twi-hi (b) Figure 12. (a) The ith discrete radius of curvature, ri, has an intuitive meaning as the reciprocal of the change in angle over arclength. (b) Pi is proportional to ri through the geometry. - 16a- One may then write, by analogy with the continuous case, the i "radius of curvature" as r. i As AO = fi 00 = - But fi may be written in terms of familiar quantities. reminds us of the geometric meaning of the quantity used in Section 3.1 to derive the Support Theorem. always non-negative since the vector p = -h C p = [P1 P2 ... where h is a support vector. Figure 12b Pi = PiT i - hi' We note that Pi is pM]T PM]T satisfies By simple trigonometry we have that 2P i tan0O fi so that the ith radius of curvature, ri, becomes 2 Pi r i Otan8O = Hence, we have that T 1 2 hC rtan0 2 ... rM] (3.11) establishing the proportionality of p to the discrete radii of curvature. To relate the vector p to "smoothness" of the boundary of the basic object, note that when an element of p, say Pi, is zero, then the ith face does not exist. Thus the boundary normal changes by 20o at the vertex instead of the usual 00 --- the boundary is sharper. corresponds precisely to the case in which several of the vi as illustrated in Figure 10. is larger. This coincide, A smoother object would be one in which Pi We see that the smoothest objects are the ones for which the faces all have equal length, i.e., p = a[l 1 ... positive constant. - 17 - T 1] where a is some 3.2.2 Eigen-decomposition of C. A very important property of the matrix C is that it is circulant. Because of this property, C is diagonalizable by the Fourier transform matrix and its eigenvalues are given by the discrete Fourier transform of the first row of C [Bellman, 1970]. Therefore, the eigenvalues of C are, M -j2-r(k-1)(n-1)/M = Xk Cln e k=l,...,M (3.12) n=l where c.. denotes the element of C in the ith row and jth column. xj The eigenvectors (the columns of the Fourier matrix) are ek where = WM = exp( M2). = 2 WM WM WM(k-l)[1 ... WM(M-1)]T It is straightforward to show that 1 - . cos(27r(k-l)/M)k=l,...,M cos(27r/M) (3.13) We immediately see that two eigenvalues (and only two) are identically zero, X2 = AM = 0; hence, C is singular. A basis for the null-space (or left null-space since C is symmetric) is given by the two eigenvectors e2 = eM = [l WM W2 ... WM(M)]T WM(M-1).[1 WM WM2 ... WM(M-1)]T WM Since the support vectors h are real we are most concerned with the null-space represented with a real basis. With some manipulation it may be seen that an equivalent real basis for the null-space of C is given by n1 = [1 cosO 0 cos200 ... cos(M-1)00]T n2 = [O sinO 0 sin2O0 ... sin(M-)0]T (3.14a) and . (3.14b) We return to properties related to the eigenvalues and eigenvectors in later sections. singular. For now we examine the consequences of C being 3.3.3 The proper support cone f p In terms of the geometry of the . support cone, one consequence of C being singular is that the support C, cannot be a proper cone -- cone, there is a linear subspace (of dimension 2) contained entirely in I. This implies that the support cone is composed of the Cartesian product of a proper cone and a linear subspace. This linear subspace is obviously given by X = span(nl,n 2 ) where n 1 and n2 are given in Equation 3.14. 
Defining the matrix
$$N = [n_1 \ \vdots \ n_2] \qquad (3.15)$$
we may then identify the proper cone as
$$\Psi_P = \{\, h \in \mathbb{R}^M \mid h^T[C \ \vdots \ N \ \vdots \ -N] \le [0 \ \cdots \ 0 \ \vdots \ 0 \ 0 \ \vdots \ 0 \ 0] \,\} . \qquad (3.16)$$
This then assures that support vectors may always be decomposed into two orthogonal components as [Dantzig, 1963]
$$h = h_p + h_n \qquad (3.17)$$
where $h_p \in \Psi_P$ and $h_n \in \mathcal{N}$. We see in the following section that the nullspace component of a support vector h has a very appealing geometric interpretation.

3.2.4 Properties of the Nullspace of C. In this section we show that the component of h which lies in the nullspace of C corresponds to a shift of the implied object in the plane. A support vector h which has no component in $\mathcal{N}$ corresponds to an object which is centered on the origin, in a sense we will define.

Shift Property. The shift property of support vectors states that adding a vector from the nullspace $\mathcal{N}$ to a support vector h produces a new support vector whose basic object is a shifted version of that of h. To see this, we first use (3.14a,b) and (3.15) to rewrite (3.6) as
$$S_B = \{\, u \in \mathbb{R}^2 \mid u^T N^T \le h^T \,\} .$$
Now define the set $\tilde{S}$ as a shifted version of $S_B$,
$$\tilde{S} = S_B + v$$
where v is an arbitrary vector in $\mathbb{R}^2$. Suppose w is an element of $\tilde{S}$; then the difference vector $w - v$ must be in $S_B$, hence
$$(w - v)^T N^T \le h^T .$$
Then w clearly satisfies
$$w^T N^T \le h^T + v^T N^T = (h + Nv)^T .$$
But since Nv is in $\mathcal{N}$, the nullspace of the support cone, h + Nv is also a support vector and, in fact, must be the support vector for the set $\tilde{S}$. The reverse logic holds as well -- adding a null vector $h_n = Nv$ to a support vector h shifts the basic object by v.

The "Centered" Object. From the above arguments we see that the nullspace component of a support vector relates to the position of the basic object in the plane. It turns out that a useful definition of the position of a basic object is the average position of its vertex points, v̄. We will show that the support vector is related to v̄ by
$$\bar{v} = \frac{1}{M}\sum_{i=1}^{M} v_i = \frac{2}{M}\, N^T h . \qquad (3.18)$$
To see this, we use the expression for the vertex points obtained in (3.10):
$$v_i^T = \frac{1}{\sin\theta_0}\,[h_i \ h_{i+1}]\begin{bmatrix} \sin\theta_{i+1} & -\cos\theta_{i+1} \\ -\sin\theta_i & \cos\theta_i \end{bmatrix} . \qquad (3.19)$$
Expanding (3.18) for the x component using (3.19) gives
$$\bar{v}_x = \frac{1}{M\sin\theta_0}\sum_{i=1}^{M}\big(h_i\sin\theta_{i+1} - h_{i+1}\sin\theta_i\big) = \frac{1}{M\sin\theta_0}\, h^T(e_1 - e_2) \qquad (3.20)$$
where $e_1$ has elements $\sin\theta_{i+1}$ and $e_2$ has elements $\sin\theta_{i-1}$ (re-indexing the second sum, using the fact that the indices are modulo M). The y component expands in a similar way:
$$\bar{v}_y = \frac{1}{M\sin\theta_0}\sum_{i=1}^{M}\big(h_{i+1}\cos\theta_i - h_i\cos\theta_{i+1}\big) = \frac{1}{M\sin\theta_0}\, h^T(e_3 - e_4) \qquad (3.21)$$
where $e_3$ has elements $\cos\theta_{i-1}$ and $e_4$ has elements $\cos\theta_{i+1}$. Now we simplify $e_1 - e_2$ and $e_3 - e_4$ as follows:
$$e_1 - e_2 = \big[\sin(\theta_i + \theta_0) - \sin(\theta_i - \theta_0)\big]_{i=1}^{M} = 2\sin\theta_0\,[\cos\theta_i]_{i=1}^{M} = 2\sin\theta_0\, n_1 \qquad (3.22)$$
and
$$e_3 - e_4 = \big[\cos(\theta_i - \theta_0) - \cos(\theta_i + \theta_0)\big]_{i=1}^{M} = 2\sin\theta_0\,[\sin\theta_i]_{i=1}^{M} = 2\sin\theta_0\, n_2 . \qquad (3.23)$$
Combining (3.20)--(3.23) and using the definition of N from (3.15) yields (3.18).

We shall see in Section 4 that (3.18) can be used as a constraint on estimated support vectors if the position of the true object is known a priori. Note, in particular, that when h has no nullspace component, i.e., h is in $\Psi_P$, then $N^T h = 0$ and, therefore, $\bar{v} = 0$ --- the basic object is centered on the origin.

4.0 ALGORITHMS

We present three signal processing algorithms based on the ideas developed in Section 3. The basic idea is as follows. Suppose we obtain noisy measurements of M support values at the angles $\theta_i = 2\pi(i-1)/M$ for $i = 1,\ldots,M$.
It is likely in this case that the support measurement vector is not a feasible support vector. Therefore, a first objective of the following algorithms is to obtain a feasible support vector estimate from the measurements. The second objective of these algorithms is to use a priori information to guide the estimate toward "preferable" or "optimal" estimates.

4.1 The Closest Algorithm

The reason this algorithm is called the "Closest algorithm" is simple: we obtain as our support estimate that (unique) support vector which is closest to the measurement. Not only is this an intuitively simple thing to do, it is also the "optimal" constrained maximum likelihood estimate given the data. In this section we develop these ideas and describe the algorithm.

Suppose the observed support values are given by
$$y_i = h_i + n_i , \qquad i = 1,\ldots,M \qquad (4.1)$$
where the $h_i$ are the ideal support values which we are trying to estimate and the $n_i$ are samples of independent white Gaussian noise with zero mean and variance $\sigma^2$. In the absence of any prior probabilistic knowledge concerning $h = [h_1 \ \cdots \ h_M]^T$, we desire the maximum likelihood estimate of h given the data $y = [y_1 \ \cdots \ y_M]^T$ and subject to $h \in \Psi$, where Ψ is the support cone defined in Section 3. The log likelihood function is given by
$$L(h) = -\frac{1}{2\sigma^2}(y-h)^T(y-h) - \frac{M}{2}\ln(2\pi\sigma^2) \qquad (4.2)$$
(see [Van Trees, 1968], for example). The maximum likelihood estimate of h given y is
$$\hat{h}_{ML} = \arg\max_{h^T C \le 0}\left(-\frac{1}{2\sigma^2}(y-h)^T(y-h)\right) \qquad (4.3)$$
which may be found by solving the quadratic programming (QP) problem
$$\text{minimize} \quad \tfrac{1}{2}h^T h - h^T y \qquad (4.4a)$$
$$\text{subject to} \quad h^T C \le 0 . \qquad (4.4b)$$
We see that $\hat{h}_{ML}$ is the support vector in Ψ which is closest (in the Euclidean metric) to the observation y. The estimate is independent of the variance of n, and if y is in Ψ then $\hat{h}_{ML} = y$. Many efficient general purpose QP algorithms and codes exist (see, for example, [Land and Powell, 1973]) that solve (4.4). The algorithm we use is described in Goldfarb and Idnani (1983); it is an active set "dual" method, leading to speed and efficiency. The FORTRAN code is taken from M.J.D. Powell (1983). The experiments and results of the Closest algorithm are described in Section 5.

One should note that the algorithm needs only a minor modification if the noise vector n has an arbitrary covariance, $K = E[nn^T]$, rather than the covariance $\sigma^2 I$ as was assumed above. In this case, the optimal constrained maximum likelihood estimate is the solution of the quadratic program
$$\text{minimize} \quad \tfrac{1}{2}h^T K^{-1} h - h^T K^{-1} y \qquad (4.5a)$$
$$\text{subject to} \quad h^T C \le 0 \qquad (4.5b)$$
which may be solved using the same code referenced above. One case where one would model the noise with this more general covariance might be in tactile sensing, where the robot manipulator is "preloaded" at certain angles due to gravity, making the effects of backlash angle dependent. Another case is in computed tomography, where detection accuracy of the support lines may depend upon the eccentricity and orientation of the object.

4.2 The Mini-Max Algorithm

In this section we describe a second algorithm used to estimate the support vector from noisy observations. In this case, however, we assume the noise is known to be bounded. The optimality criterion we choose is to maximize the minimum "radius of curvature" of the basic object implied by the support vector. This leads to a linear programming (LP) problem, which we solve using a primal simplex method.
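Before developing the Mini-Max formulation, we note that the Closest problem (4.4) of the preceding subsection is a small dense QP that is easy to prototype with a general-purpose solver. The following is a hedged sketch (our own illustration; the paper itself uses the Goldfarb--Idnani dual active-set FORTRAN code), using SciPy's SLSQP, which expects inequality constraints in the form $g(h) \ge 0$:

```python
import numpy as np
from scipy.optimize import minimize

# Hedged sketch of the Closest QP (4.4):
#   minimize 0.5 h'h - h'y   subject to   h^T C <= 0,
# passed to SLSQP as g(h) = -C^T h >= 0.

def closest_estimate(y, C):
    obj = lambda h: 0.5 * h @ h - h @ y
    grad = lambda h: h - y
    cons = {"type": "ineq", "fun": lambda h: -C.T @ h, "jac": lambda h: -C.T}
    res = minimize(obj, y, jac=grad, constraints=[cons], method="SLSQP")
    return res.x

M = 10
k = 1.0 / (2.0 * np.cos(2.0 * np.pi / M))
idx = np.arange(M)
C = np.eye(M)
C[idx, (idx + 1) % M] = C[idx, (idx - 1) % M] = -k

rng = np.random.default_rng(0)
y = 0.5 + rng.uniform(-0.2, 0.2, M)       # noisy circle measurements
h_ml = closest_estimate(y, C)
print(np.all(h_ml @ C <= 1e-8))           # estimate lies in the support cone
```

If y already satisfies the cone constraint, the solver returns y itself, matching the remark above that $\hat{h}_{ML} = y$ whenever y is feasible.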
Suppose, as before, we have measurements given by
$$y_i = h_i + n_i , \qquad i = 1,\ldots,M \qquad (4.6)$$
where the $n_i$ are independent, identically distributed noise samples, known to be uniform over the interval $[-\gamma, \gamma]$. For example, this type of error might be expected when backlash from the gears of a manipulator arm is the dominant noise source. Given the vector of measurements, $y = [y_1 \ \cdots \ y_M]^T$, we see that, ignoring the support cone, the true support vector h has equal probability of being anywhere in the hypercube (referred to later as the measurement box)
$$\Pi = \{\, x \in \mathbb{R}^M \mid [-\gamma \ \cdots \ -\gamma]^T \le x - y \le [\gamma \ \cdots \ \gamma]^T \,\} \qquad (4.7)$$
and zero probability of being outside this hypercube. Of course, since we are looking for only feasible h's, we insist that the estimate we produce is also in the support cone; hence, the estimate we seek will have the following property:
$$\hat{h}_{MM} \in \Psi \cap \Pi . \qquad (4.8)$$
From the development, we can see that $\Psi \cap \Pi$ cannot be empty, but that, in general, (4.8) by itself does not guarantee a unique solution.

As a start towards specifying a unique solution we introduce a new criterion related to our expectations concerning the shape of the source object. Suppose we expect the object to have a "smooth" boundary; i.e., we expect to have no sharp corners on the boundary. One may then choose to drive the estimate toward a support vector whose basic object has a boundary that maximizes the minimum radius of curvature, subject to the condition (4.8), of course. Thus, we seek the h in $\Psi \cap \Pi$ that maximizes the minimum radius of curvature. Accordingly, the Mini-Max support estimate is defined as
$$\hat{h}_{MM} = \arg\max_{h \in \Psi \cap \Pi} \ \min\,(-h^T C) \qquad (4.9)$$
using the definition of radius of curvature given in Section 3 (Equation 3.11).

We now show that the solution to (4.9) may be found by solving the dual linear program
$$\text{maximize} \quad u^T b \qquad (4.10a)$$
$$\text{subject to} \quad u^T A \le c^T \qquad (4.10b)$$
where u, b, A, and c are defined below. Since all feasible solutions must have $h^T C \le 0$, (4.9) may be rewritten as
$$\hat{h}_{MM} = \arg\min_{h \in \Psi \cap \Pi} \ \max\,( h^T c_1, \ h^T c_2, \ \ldots, \ h^T c_M ) \qquad (4.11)$$
where $c_1, \ldots, c_M$ are the columns of C. The objective function that is minimized may be identified as
$$m(h) = \max\,( h^T c_1, \ h^T c_2, \ \ldots, \ h^T c_M ) \qquad (4.12)$$
so that m(h) must be a number such that
$$m(h) \ge h^T c_i , \qquad i = 1,\ldots,M . \qquad (4.13)$$
We seek the smallest such number m over all h in $\Psi \cap \Pi$. Consider the two augmented vectors
$$u = \begin{bmatrix} h \\ m \end{bmatrix} \qquad \text{and} \qquad b = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ -1 \end{bmatrix} . \qquad (4.14)$$
Clearly the problem is to maximize $u^T b$; this is the linear objective function of the dual linear program of Equation (4.10a). Now consider the constraints of Equation (4.9) using the augmented vector u. We see that it is possible to write the necessary constraints as $u^T A \le c^T$ (Equation 4.10b), where
$$A = \begin{bmatrix} C & C & I & -I \\ -1 \cdots -1 & 0 \cdots 0 & 0 \cdots 0 & 0 \cdots 0 \end{bmatrix} \qquad (4.15)$$
$$c^T = [\,0 \ \cdots \ 0 \ \vdots \ 0 \ \cdots \ 0 \ \vdots \ h_b^T \ \vdots \ -h_a^T\,] \qquad (4.16)$$
where
$$h_a = y - [\gamma \ \gamma \ \cdots \ \gamma]^T \qquad (4.17a)$$
$$h_b = y + [\gamma \ \gamma \ \cdots \ \gamma]^T . \qquad (4.17b)$$
The first partition of A and c defines the objective function constraints given in (4.13); the second partition specifies the support cone constraint; the third and fourth partitions restrict h to lie within the measurement box Π. If any feasible dual solution to (4.9) were known, then the dual simplex method could be implemented directly to find the optimum solution (see [Luenberger, 1984], for example).
Unfortunately, no feasible dual solution is known a priori; therefore, we chose to implement the Mini-Max algorithm by solving the primal linear program (the dual to Equation 4.9)
$$\text{minimize} \quad c^T x \qquad \text{subject to} \quad Ax = b , \ x \ge 0 \qquad (4.18)$$
which we solve using the usual Phase 1/Phase 2 (with Bland's anti-cycling rule) approach of linear programming [Luenberger, 1984]. Once the optimum x of the primal problem (4.18) is found, the dual optimum is found as follows. Let K be the set of column indices of A,
$$K = \{\, 1, 2, \ldots, N_A \,\} \qquad (4.19)$$
where $N_A$ is the number of columns of A, and let Z be the subset of K denoting the elements of x which are not zero:
$$Z = \{\, j \in K \mid x_j \ne 0 \,\} . \qquad (4.20)$$
If the cardinality of Z is M+1 (the number of rows of A) then, since x must correspond to a basic feasible solution, we have that the optimum dual solution is
$$u^T = [c_{z_1} \ c_{z_2} \ \cdots \ c_{z_{M+1}}]\,[a_{z_1} \ a_{z_2} \ \cdots \ a_{z_{M+1}}]^{-1} \qquad (4.21)$$
where the $z_i$'s are the elements of Z, the $c_{z_i}$'s are the corresponding elements of the cost vector c, and the $a_{z_i}$'s are the corresponding columns of A. The optimum support vector is given by the first M elements of u, according to the definition of u in Equation 4.14. If the cardinality of Z is less than M+1 then Z must be augmented so that it has a total of M+1 elements. The augmenting indices are arbitrary elements of K chosen so that $B = [a_{z_1} \ \cdots \ a_{z_{M+1}}]$ has linearly independent columns. Then (4.21) is used, as before, to find $\hat{h}$. However, since the columns of A used to augment B are not unique, there may be several different solutions. Thus, in general, the Mini-Max procedure does not yield a unique solution. We will see in Section 5 that one way in which this non-uniqueness presents itself is as an arbitrary shift (within a finite range and in a certain direction) of the basic objects corresponding to the estimates. In other words, the estimate that the algorithm produces may be changed by adding a component from the nullspace of Ψ --- and the new support vector remains feasible and has exactly the same cost.

Sparse Data. The Mini-Max algorithm is applicable with minor modifications when there are fewer than M support line observations. One might think of this as obtaining a feasible support vector estimate which is simultaneously interpolated and optimized for the smoothness criterion. For example, given 10 support measurements at angles corresponding to a 10-dimensional support vector, this method could obtain an estimate of a 20-, 30-, or 40-dimensional (etc.) support vector that maximizes the minimum radius of curvature subject to the constraint that the elements corresponding to the 10 measurements must lie within the uniform noise bound around the measurements.

Consider for simplicity the case in which we wish to reconstruct M support lines given only N observations, where M/N is an integer and the observations are evenly distributed in angle. For example, if M/N = 3 then the augmented observation vector y would look like
$$y = [y_1 \ * \ * \ y_4 \ * \ * \ \cdots \ y_{M-2} \ * \ *] \qquad (4.22)$$
where the asterisks represent no observations, hence no explicit information about the support line at the corresponding angle. The Mini-Max estimate is still given by Equation 4.9; however, the measurement box must be redefined. Clearly, only those elements of $\hat{h}_{MM}$ that correspond to true observations must be confined by the measurement box; the other elements of the estimate are only restricted by the support cone constraint.
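The Mini-Max problem, dense or sparse, can likewise be prototyped with a modern LP solver rather than the hand-rolled Phase 1/Phase 2 simplex. In the hedged sketch below (our own illustration, not the paper's implementation), we pose (4.9) directly over the augmented variable $(h, m)$, and an `observed` mask confines only the measured angles to the box, exactly as just described:

```python
import numpy as np
from scipy.optimize import linprog

# Hedged sketch of (4.9) as an LP over x = [h_1..h_M, m]: minimize m with
#   C^T h - m <= 0   (objective constraints, eq. 4.13)
#   C^T h     <= 0   (support cone)
#   y - gamma <= h_i <= y + gamma, but only at observed angles.

def minimax_estimate(y, observed, gamma, C):
    M = len(y)
    cost = np.zeros(M + 1)
    cost[-1] = 1.0                            # minimize m
    A_ub = np.vstack([
        np.hstack([C.T, -np.ones((M, 1))]),   # h'c_i <= m
        np.hstack([C.T, np.zeros((M, 1))]),   # h'c_i <= 0
    ])
    b_ub = np.zeros(2 * M)
    bounds = [((y[i] - gamma, y[i] + gamma) if observed[i] else (None, None))
              for i in range(M)] + [(None, None)]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:M]

M = 10
k = 1.0 / (2.0 * np.cos(2.0 * np.pi / M))
idx = np.arange(M)
C = np.eye(M)
C[idx, (idx + 1) % M] = C[idx, (idx - 1) % M] = -k

rng = np.random.default_rng(1)
y = 0.5 + rng.uniform(-0.2, 0.2, M)
h_mm = minimax_estimate(y, observed=np.ones(M, bool), gamma=0.2, C=C)
print(np.all(h_mm @ C <= 1e-8))   # True: the estimate is consistent
```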
The modifications to A and In the last two partitions of A and c (equations 4.15 and 4.16), remove the columns (and thus the constraints) corresponding to the asterisks (the unobserved angles) of y in Equation 4.22. The new matrices A and c are shown in Appendix C. The solution is found precisely as before using the primal algorithm first, then solving for the dual optimum, hMM. 4.3 The Close-Min Algorithm In section 5 we shall see that the Closest algorithm and the Mini-Max algorithm produce two rather extreme estimates. The Closest algorithm produces estimates that strongly resemble the data, and the Mini-Max algorithm produces estimates that strongly resemble our prior expectation (circles). In a way similar to maximum a posteriori (MAP) estimation [Van Trees, 1968], the Close-Min algorithm is designed to combine the two criteria, thus producing estimates that will reside somewhere between these two extremes. The concept is simple: we define a new cost function which is a convex combination of the Closest and Mini-Max objective functions. Relating this to MAP estimation, we see that the Closest objective function plays the role of the measurement density and the Mini-Max objective funtion plays the role of the a priori density. We do not have an optimally defined trade-off between the two objective functions as is usually the case in MAP estimation; instead we shall use the convexity parameter, a, which may be varied between 0 and 1. - 30 - The function we maximize in the Closest algorithm is fC(h) -~- = (y-h)T(y-h) (4.23) and in the Mini-Max algorithm we maximize = fM(h) min(-hTC) (4.24) We therefore define the Close-Min estimate as hCM where 0 < a < 1. = afc(h) + (1-a)fM(h) argmax h (4.25) Since fc(h) is quadratic and fM(h) is linear, we may solve (4.25) using the QP algorithm. The statement of the QP is not as simple, however, because of the necessity to augment h as in the Mini-Max problem. Using u and b as in (4.14) we find that (4.25) may be restated as the QP minimize a( subject to u A uT < [ ] u- u ) + (-a) uTb (4.26a) c (4.26b) here I is the MxM identity matrix and y are the observations. The quantities A and c may be as given in (4.15) and (4.16) although provided 4.4 a X 0 the observation box constraints may be removed. Shift Corrected Algorithms As we suggested at the end of Section 3, each of these algorithms may have a priori positional information included in the estimation process. That is, suppose one knows that the true object is centered on v, in the sense that the average position of its vertex points is v. Then the estimate should also be centered on v. But from Equation (3.18) we see that this may be assured provided that we enforce the following linear constraint hTN = v . Since this is just another linear constraint, (4.27) is easily - 31 - (4.27) incorporated into the three algorithms with minor modifications to the matrix A and vector c (see Appendix C). With the addition of the shift correction, we now have 3 distinct types of linear constraints: (1) the support cone constraint which is always imposed, (2) the measurement box constraint which is imposed for the Mini-Max and Close-Min algorithms, and (3) the shift constraint which is imposed when the a priori true object position is known or if a good estimate of the object's position is available. 
Combining these constraints with the three basic objective functions allows us to produce a variety of estimation algorithms that favor some object characteristics over others, as we shall see in the following section. - 32 - 5.0 EXPERIENTAL RESULTS The results contained in this section are examples that illustrate the behavior of the algorithms. In the first section, Section 5.1, we discuss the generation of the data and the methods for producing the figures used to display the results. results for the three algorithms: Section 5.2 contains comparative Closest, Mini-Max, and Close-Min. Section 5.3 shows the behavior of the sparse data and shift-corrected modified algorithms. 5.1 The Data and Plots The advantage gained by using the proposed algorithms can be demonstrated nicely using a small (low dimensioned) problem. All the examples shown herein, have a 10 dimensional observation vector; and in all cases except one, the restored support vector has 10 elements. The exception is in Section 5.3.2, where the "sparse" mini-max algorithm is used to reconstruct a support vector of dimension 30. To show graphically the basic object to which a feasible support vector corresponds, it is sufficient to simply determine the vertex points {v1 v2 ... VM) and connect them sequentially. (This is not a trivial observation; a proof that the vertex points are leftward turning allows one to conclude this.) However, for an infeasible measurement, such as each of the observation vectors we shall use, one does not have the condition that S B = SV. Ordinarily, one might display S , since, by default, this must be the set if the measurements are assumed to be accurate. Instead, however, we shall display the infeasible measurements in precisely the same way as the feasible support - 33 - functions: connect the vertex points sequentially. a vertex plot. We call such a plot It can be shown that for an infeasible vector, the resulting curve will not be simple, i.e., the curve will cross itself. Therefore, for the small examples we show here, it is easy to spot the infeasible vectors simply by observing the crossing of the vertex plots. One should also note that it is easy to visually construct both sets SB and S from the vertex plot (refer to Figure 9, for example). To show the behavior of the three algorithms, we use noise-corrupted measurements of the support vectors of two objects. The first of these objects is a circle centered at the origin, and in this case the noise-free support vector is h [r r... r]T = c where r represents the radius of the circle. The second object is an ellipse, centered on the origin, with principle axes corresponding to the x and y axes. he = ei = el In this case the noise-free support vector is e ... eM] where cos2 ([i-1]00 ) + b2sin2([i-l]00) and a and b are the ellipse radii corresponding to the x and y axis, respectively. The two true figures we use have the same area. radius 1/2, thus area vr/4; The circle has the ellipse has a=3/4 and b=1/3. The experiments shown here involve also two different noise levels. One set of noise values, denoted by n1 , has elements drawn independently from a probability density which is uniform over the interval [-r - = 0.2. a], where The second, n2 , has elements drawn independently from a probability density which is uniform over the interval [-r - 34 - a], a = 0.4. 
We therefore use four support observation vectors:
$$y_{c1} = h_c + n_1 \qquad y_{c2} = h_c + n_2$$
$$y_{e1} = h_e + n_1 \qquad y_{e2} = h_e + n_2 .$$
The vertex plots for $y_{c1}$, $y_{c2}$, $y_{e1}$, and $y_{e2}$ are shown using solid lines in Figures 13(a), 14(a), 15(a), and 16(a), respectively. Appearing in those same figures, using dotted lines, are the corresponding true support figures $h_c$ and $h_e$. The scales are different on each figure to accommodate the different sized noise vector contributions.

5.2 Comparison of the Algorithms

Figures 13(b), 14(b), 15(b), and 16(b) show the basic figures corresponding to the estimates produced by the Closest algorithm (dashed lines), Mini-Max algorithm (solid lines), and Close-Min algorithm with α = 0.5 (dotted lines) for each of the observations indicated in the (a) figures. The same scale is used on the (a) and (b) parts of each figure so that comparisons can be made. We now proceed to discuss each of the experiments in order. Unfortunately, it is too confusing to superpose all four vertex plots onto the same figure; instead we must make careful observations using the fact that the scales are the same in each figure's (a) and (b) parts. In order to indicate points on each of the figures, remember that the first support value corresponds to the right-hand vertical face; the indices increase in a counterclockwise fashion.

[Figure 13. (a) The true object (circle) and the measured support vector (γ=0.2). (b) The estimates produced by the three basic algorithms.]

[Figure 14. (a) The true object (circle) and the measured support vector (γ=0.4). (b) The estimates produced by the three basic algorithms.]

[Figure 15. (a) The true object (ellipse) and the measured support vector (γ=0.2). (b) The estimates produced by the three basic algorithms.]

[Figure 16. (a) The true object (ellipse) and the measured support vector (γ=0.4). (b) The estimates produced by the three basic algorithms.]

In Figure 13, the observation vector is almost feasible; we can see only one spot (involving support lines 8, 9, and 10) where the vertex plot crosses itself. The Closest support vector is one that just corrects the point of inconsistency, as can be seen in 13(b). At first glance, the Mini-Max estimate appears to be quite good; however, if one examines the size and position of the basic object, one finds that the figure is too big and shifted upward. The fact that it is too large simply reflects the fact that large basic objects tend to have larger radii of curvature --- that agrees with our chosen criterion. The fact that it is shifted up will be shown in Section 5.3 to be at least partly due to a non-uniqueness in this solution. We see that, as expected, the Close-Min algorithm produces an estimate that "blends" the two extreme solutions of the former algorithms. Note in particular that in the Closest estimate, support line 2 corresponds to a degenerate extreme point of the basic object; the Close-Min algorithm smooths out this sharp corner.

Figure 14 shows a more severe degradation of the true support vector; there are three points where the vertex plot of the support measurement crosses itself. As before, the Closest algorithm produces an estimate that seems to simply remove the points of inconsistency.
In this case, one sees that the Closest estimate has only 6 distinct vertices; therefore 4 vertex points are degenerate. In fact, the upper vertex of the figure corresponds to 3 vertex points --- this is a very sharp turn in the boundary. The Mini-Max algorithm produces the desired smoothness, but as before creates a figure that is too large and is shifted (down in this case). The Close-Min algorithm shows the correct "blend" once again. It is important to point out that the set $S_B$ (crosshatched region in Figure 14(a)) constructed from the raw measurements is a bad estimate of the true set, thanks to the extremely large noise values. It is too small and is shifted down. The fact that it is small comes from the fact that the construction of $S_B$ essentially ignores the support lines that are farthest out. Each of the algorithms proposed here uses all of the measurements to "pull" the inner support lines out, if necessary.

There is not much new information about the estimation algorithms to be derived from Figure 15, except that the Mini-Max estimate tends to be too circular. This was also to be expected, since a circle has the smoothest overall boundary.

Figure 16 again shows that the Closest algorithm has the effect of shifting neighboring support lines to fix the points of inconsistency. Also, the Mini-Max algorithm produces an estimate which is more like an ellipse, but is still too large. It appears that the Close-Min estimate is, as usual, the best estimate of the true support vector.

Finally, to clarify some of the behavior of the Mini-Max estimates, it is useful to examine the estimates together with the measurement box bounds $h_a$ and $h_b$ (see Equation 4.17). Figure 17 shows the vertex plots for the Mini-Max estimate (solid line), $h_a$ (dotted line), and $h_b$ (dashed line) for the example shown in Figure 13 --- that is, the noise-free figure is a circle and the noise range is [-0.2, 0.2].

[Figure 17. The Measurement Box. Each element of the Mini-Max estimate is shown to lie within the corresponding elements of h_a and h_b, as prescribed by the measurement box constraint.]

It should be clear from the figure that the estimated basic object can be shifted vertically down and still remain in the measurement box. The 9th support line is the first to become infeasible if the figure is shifted down too far. Shifting the estimated basic object keeps the estimate feasible, of course, since this merely corresponds to the addition of a
Figure 18 shows three vertex plots corresponding to the true support vector (solid line), the Mini-Max estimate from Figure 13 (dotted line), and the Mini-Max shift-corrected (for v = O) estimate (dashed line). We see immediately that the shift correction does not simply shift the original MNni-Nax solution down. To understand this we recall Figure 17. We saw that due to non-uniqueness of the solution that we could shift the solution down and up over a finite range. But, evidently, none of these shifted positions causes the sum of the vertex points to be exactly zero. To allow this to occur, the (shift-corrected) Mini-Max algorithm was forced to shrink the estimate as well. We see from this result that prior information about the position of the object may have a very strong influence on the performance of the algorithms. - 38 - True Object .... Mini-Max --1 -- -I -1 Shift-Corrected Mini-Max , . I I 1 0 Figure 18. Shift-Corrected Mini-Max estimate. - 38a - 5.3.2 Sparse Data. Now we implement the sparse-data Mini-Max algorithm for the example of Figure 13. In Figure 19 we show the original Mini-Max estimate for M=10 (dashed line), and the sparse-data Mini-Max estimate for M=30 (dotted line) as well as the measurement box bounds, ha and hb. With more degrees of freedom in the estimate one would expect to see more circular objects in the sparse-data case, and, indeed, that is so. Since the 10 measured support lines are still constrained, the estimates cannot get too much larger than the original, but within the bounds of ha and hb , the new and old lines are arranged in a more circular fashion. We also see that the non-uniqueness related to the shift of the estimates is still prominent, as the two trials happened to produce estimates at opposite ends of the feasible shift range. - 39 - 1 --1 - ha (inside) and hb (outside) --- Mni-Max (M=10 .... Mini-Max (M=30) -1 Figure 19. 0 1 Sparse-data Mini-Max estimates. - 39a - 6.0 DISCUSSION We have seen that knowledge of the basic geometrical constraint, h C < 0, can be used to advantage when recontructing sets from noisy observations of their support lines. We have described and compared three algorithms which utilize this support constraint as well as other The Closest algorithm gives the constrained criteria and constraints. maximum likelihood estimate assuming the noise is Gaussian. It requires the minimum amount of prior knowledge about the set to be reconstructed, and is implemented in a straightforward manner using quadratic programming techniques. The Mini-Max algorithm assumes bounded independent uniform noise; it must be guided towards a unique solution by prior knowledge about the object; and it uses linear programming techniques. Finally, the Close-Min algorithm blends the former two objective functions (similar to MAP estimation) to produce estimates guided by prior information but not as strongly as in Mini-Max. The examples studied in Section 5 show that: 1) As an estimate of the basic object, SB, in not very good. It always produces an estimate that is too small, because it, in effect, ignores the support observations that are farthest out, thus placing all the emphasis on innermost observations. 2) For the Closest algorithm, small amounts of noise result in estimates that closely resemble the observations; the algorithm merely corrects the points of inconsistency. 3) The Mini-Max algorithm is highly biased toward the prior expectation --- a large circle, in this case. 
6.0 DISCUSSION

We have seen that knowledge of the basic geometrical constraint, $h^T C \le 0$, can be used to advantage when reconstructing sets from noisy observations of their support lines. We have described and compared three algorithms which utilize this support constraint as well as other criteria and constraints. The Closest algorithm gives the constrained maximum likelihood estimate assuming the noise is Gaussian. It requires the minimum amount of prior knowledge about the set to be reconstructed, and is implemented in a straightforward manner using quadratic programming techniques. The Mini-Max algorithm assumes bounded independent uniform noise; it must be guided toward a unique solution by prior knowledge about the object; and it uses linear programming techniques. Finally, the Close-Min algorithm blends the former two objective functions (similar to MAP estimation) to produce estimates guided by prior information, but not as strongly as in Mini-Max.

The examples studied in Section 5 show that:

1) As an estimate of the true object, the basic object $S_B$ constructed directly from the raw measurements is not very good. It is always too small, because its construction, in effect, ignores the support observations that are farthest out, thus placing all the emphasis on the innermost observations.

2) For the Closest algorithm, small amounts of noise result in estimates that closely resemble the observations; the algorithm merely corrects the points of inconsistency (a sketch of this computation follows the list).

3) The Mini-Max algorithm is highly biased toward the prior expectation --- a large circle, in this case. Also, an observation which is just slightly outside of the support cone may cause the estimate to differ greatly from the observation.

4) The Closest algorithm always yields a unique solution; the Mini-Max algorithm does not, in general.

5) Prior shift information can substantially improve the estimates.

6) The Close-Min algorithm permits one to blend the prior expectation of boundary smoothness with the criterion that the estimate be close to the observations, thus implementing a kind of MAP criterion.
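As a concrete illustration of item 2, the Closest algorithm can be posed as the quadratic program: minimize $\|h - y\|^2$ over h subject to $h^T C \le 0$, where y is the vector of noisy support measurements. The sketch below solves this with a general-purpose solver (SciPy's SLSQP); the implementation reported in this paper used dedicated quadratic programming codes [Goldfarb and Idnani, 1983][Powell, 1983], so this is a stand-in, not the original algorithm.

    import numpy as np
    from scipy.optimize import minimize

    def support_matrix(M):
        """The circulant M x M constraint matrix C of Theorem 1."""
        k = 1.0 / (2.0 * np.cos(2 * np.pi / M))
        C = np.eye(M)
        for i in range(M):
            C[(i - 1) % M, i] = -k   # coefficient of h_{i-1} in column i
            C[(i + 1) % M, i] = -k   # coefficient of h_{i+1} in column i
        return C

    def closest_estimate(y):
        """Constrained ML (Gaussian noise): min ||h - y||^2  s.t.  h^T C <= 0."""
        C = support_matrix(len(y))
        res = minimize(lambda h: np.sum((h - y) ** 2), x0=y,
                       constraints={'type': 'ineq', 'fun': lambda h: -(h @ C)},
                       method='SLSQP')
        return res.x

    # Noisy support measurements of a unit circle (true h_i = 1):
    y = 1.0 + 0.2 * np.random.default_rng(0).standard_normal(10)
    h_hat = closest_estimate(y)
    print(np.all(h_hat @ support_matrix(10) <= 1e-8))   # feasible estimate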
The primary contribution of this paper is in the formulation of the problem as a constrained optimization problem which includes the fundamental support vector constraint, a priori information, and uncertainty in the measurements. Many extensions are possible, both in the inclusion of additional constraints imposed by prior knowledge and in the development of more elaborate objective functions. Among the possible extensions is the inclusion into the estimation process of a known object area or bounds on the area. The area of a basic object is a quadratic function of h, however, so that, as a constraint, the inclusion of this type of information makes the algorithms more complicated. A simpler extension involving the constraints may arise if one has only partial information regarding the position of the object in the plane. If the position were known to be bounded, then instead of having two linear equality constraints as in the shift-corrected algorithms, one would have four linear inequality constraints. Other extensions may include improvements to the objective function. For example, if the position were known only approximately, but could not be considered to be bounded, then a quadratic penalty could easily be incorporated into the objective function to try to keep the estimate near the expected position.

As promised in the introduction, we wish to discuss further how we believe this research can be of benefit in the discipline of computed tomography (CT). We now know how, given estimates of a set of support lines, we may reconstruct an estimate of the convex hull of an object. Figure 3 showed how one might seek to estimate such support lines in the CT setting. But given an estimate of the support of a function f(x,y), does this aid in reconstructing f(x,y) from its projections? There is some literature that indicates how known convex support constraints on f(x,y) can be used to aid in reconstructing f(x,y). Among these methods are the method of "projection onto convex sets" (POCS) [Sezan and Stark, 1984][Trussell and Civanlar, 1984][Youla and Webb, 1982], dual methods of POCS [Leahy and Goutis, 1985], and constrained simulated annealing methods [Prince, 1986]. The difference here is that we begin with an estimate of the support of f(x,y). Thus, a two-step hierarchical reconstruction method is envisioned: (1) estimate the region of support, and (2) estimate f(x,y) using the above estimate with a constrained reconstruction algorithm. There is much precedent for using estimates of certain parameters to aid in reconstruction. For example, the "dc" bias is quite commonly estimated from the data and used to provide the "dc" bias of the final estimate (see [Herman, 1980]). But there are many unanswered questions here. For example, how does one estimate the support values from the projections? And should the estimated support be imposed as a constraint or merely as a penalty on the subsequent reconstruction process?

It is interesting to consider the above reconstruction process if one were to know a priori that the function f(x,y) has the value 1 on S and vanishes elsewhere, and that S is convex. Then step (1) completes the reconstruction process; i.e., the problem reduces to a straight set reconstruction problem, as discussed in the introduction. Thus, this work may be considered to extend the model-based approaches to reconstruction begun by Rossi and Willsky [Rossi and Willsky, 1984] and Bresler and Macovski [Bresler and Macovski, 1984a,b,c], by considering a slightly more general object (convex and binary with unknown boundary) and using fundamental constraints related to the consistency of the Radon transform [Louis, 1980] to form optimal estimates of the modeled object.

APPENDIX A

The Support Theorem (Theorem 1) is proven in this appendix. We begin with two lemmas which are needed in the proof.

Lemma A.1: Let the set

    $S_B = \{ u \in \mathbb{R}^2 \mid u^T\Omega \le h^T \}$

where $\Omega = [\omega_1 \; \omega_2 \; \cdots \; \omega_M]$ for $\omega_i = [\cos\theta_i \;\; \sin\theta_i]^T$, $\theta_i = (i-1)2\pi/M$, and $M \ge 5$. Then $S_B$ is bounded and convex.

Proof: We show that $S_B$ is convex first. For any $u, v \in S_B$ and $0 \le a \le 1$ we have

    $[au + (1-a)v]^T\Omega = a\,u^T\Omega + (1-a)\,v^T\Omega \le a\,h^T + (1-a)\,h^T = h^T.$

Therefore, all the points on the line segment joining u and v are in $S_B$. Hence, $S_B$ is convex. Now suppose $S_B$ is unbounded. Then there must exist vectors $u \in S_B$ and $v \in \mathbb{R}^2$, $v \ne 0$, such that $u + av$ is in $S_B$ for all $a > 0$. This implies $(u + av)^T\Omega \le h^T$ and, hence, that $v^T\Omega \le [0 \; \cdots \; 0]$. (Points v are called ray points of $S_B$; we must prove that these cannot exist.) Writing v in polar coordinates as $v = [r\cos\varphi \;\; r\sin\varphi]^T$, we see that the inner product of v with the columns of $\Omega$ is given by the row vector

    $p = [r\cos(\theta_1 - \varphi) \;\; r\cos(\theta_2 - \varphi) \; \cdots \; r\cos(\theta_M - \varphi)].$

Since the angles $\theta_i$ are evenly distributed samples over $2\pi$ and $M \ge 5$, it is impossible to sample the cosine function so that all sample values are non-positive; therefore, p cannot have all non-positive elements. Hence, $S_B$ cannot be unbounded. □

Lemma A.2: Let S be a set in $\mathbb{R}^2$ and let L be the line

    $L = \{ u \in \mathbb{R}^2 \mid u^T\omega = t \}$

where $\omega$ is a unit vector. Suppose (i) $x^T\omega \le t$ for all $x \in S$, and (ii) $L \cap S$ is not empty. Then L is a support line of S.

Proof: Since $\omega$ is a unit vector in $\mathbb{R}^2$, it may be written as $\omega = [\cos\theta \;\; \sin\theta]^T$. Then, to show that L is a support line of S, we must show that the support value at angle $\theta$ of the set S, given by $h(\theta) = \sup_{x \in S} x^T\omega$, is equal to t. From (i) we see that $h(\theta) \le t$. But from (ii) we have that there exists a point $y \in S$ such that $y^T\omega = t$. Therefore, $h(\theta) = t$, which proves the lemma. □

For convenience we repeat the Support Theorem and then commence the complete proof.

Theorem 1 (The Support Theorem): A vector $h \in \mathbb{R}^M$ ($M \ge 5$) is a support vector if and only if

    $h^T C \le [0 \; \cdots \; 0]$    (A.1)

where C is the M x M circulant matrix given by

    $C = \begin{bmatrix} 1 & -k & 0 & \cdots & 0 & -k \\ -k & 1 & -k & & & 0 \\ 0 & -k & 1 & \ddots & & \vdots \\ \vdots & & \ddots & \ddots & -k & 0 \\ 0 & & & -k & 1 & -k \\ -k & 0 & \cdots & 0 & -k & 1 \end{bmatrix}$

and $k = 1/(2\cos(2\pi/M))$.
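As a quick sanity check of the theorem (our own example, not from the original text), consider the support vector of a circle of radius R centered at the origin, $h_i = R$ for all i. The i-th constraint of (A.1) reads

    $h_i - k\,h_{i-1} - k\,h_{i+1} = R\left(1 - \frac{1}{\cos(2\pi/M)}\right) \le 0,$

which holds for every $M \ge 5$ since $0 < \cos(2\pi/M) < 1$; the sampled circle is, as expected, a consistent support vector.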
Proof of Theorem 1 (Support Theorem): First we prove that (A.1) is necessary. We are given that the vector h is a support vector. Therefore, by definition, the lines

    $L_i = \{ u \in \mathbb{R}^2 \mid u^T\omega_i = h_i \}$, i = 1, ..., M,

are support lines for some (non-empty) set $S \subset \mathbb{R}^2$. Then it must be true that S is contained within the polytope $D_i$ (see Figure 4 in the main text)

    $D_i = \{ x \in \mathbb{R}^2 \mid x^T[\omega_{i-1} \; \omega_{i+1}] \le [h_{i-1} \; h_{i+1}] \}$    (A.2)

for any i, where the index arithmetic is taken to be modulo M so that the indices stay in the range $1 \le i \le M$. Note that by hypothesis $M \ge 5$, which implies that $\theta_{i+1} - \theta_{i-1} < \pi$. This in turn implies that $L_{i-1}$ and $L_{i+1}$ have a finite point of intersection $p_i$, and that $\omega_i$ may be written as a positive combination of $\omega_{i-1}$ and $\omega_{i+1}$. These two facts are necessary in order to conclude that the support value at angle $\theta_i$ for the set $D_i$ is $p_i^T\omega_i$. Then, since $S \subset D_i$, we must have that

    $h_i \le p_i^T\omega_i$.    (A.3)

We now show that the condition (A.3) is equivalent to the i-th constraint of $h^T C \le 0$. Since $p_i = L_{i-1} \cap L_{i+1}$, we have that

    $p_i^T[\omega_{i-1} \; \omega_{i+1}] = [h_{i-1} \; h_{i+1}]$;

therefore, solving for $p_i^T$ gives

    $p_i^T = [h_{i-1} \; h_{i+1}][\omega_{i-1} \; \omega_{i+1}]^{-1} = [h_{i-1} \; h_{i+1}]\begin{bmatrix} \cos\theta_{i-1} & \cos\theta_{i+1} \\ \sin\theta_{i-1} & \sin\theta_{i+1} \end{bmatrix}^{-1} = \frac{1}{\sin 2\theta_0}\,[h_{i-1} \; h_{i+1}]\begin{bmatrix} \sin\theta_{i+1} & -\cos\theta_{i+1} \\ -\sin\theta_{i-1} & \cos\theta_{i-1} \end{bmatrix}$

where $\theta_0 = 2\pi/M$. The inner product $p_i^T\omega_i$ is then found as

    $p_i^T\omega_i = \frac{1}{\sin 2\theta_0}\,[h_{i-1} \; h_{i+1}]\begin{bmatrix} \sin(\theta_{i+1} - \theta_i) \\ \sin(\theta_i - \theta_{i-1}) \end{bmatrix} = \frac{\sin\theta_0}{\sin 2\theta_0}\,(h_{i-1} + h_{i+1}).$

Using the double angle formula $\sin 2\theta_0 = 2\cos\theta_0\sin\theta_0$ and the inequality (A.3), we see that

    $h_i \le \frac{h_{i-1} + h_{i+1}}{2\cos\theta_0},$

which is precisely the condition specified by the i-th column of C in (A.1). Since this applies to each i independently, this proves the necessity of (A.1).
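The key identity above, $p_i^T\omega_i = (h_{i-1} + h_{i+1})/(2\cos\theta_0)$, is easy to confirm numerically; the following few lines (an illustrative check of ours, not part of the paper) intersect $L_{i-1}$ and $L_{i+1}$ directly and compare.

    import numpy as np

    M, i = 10, 3                          # any M >= 5 and any index i will do
    theta0 = 2 * np.pi / M
    th = lambda j: (j % M) * theta0       # theta_j = (j-1)*theta0; zero-based here
    w = lambda j: np.array([np.cos(th(j)), np.sin(th(j))])
    h = 1.0 + 0.1 * np.random.default_rng(1).standard_normal(M)

    # p_i = intersection of L_{i-1} and L_{i+1}
    W = np.column_stack([w(i - 1), w(i + 1)])
    p = np.linalg.solve(W.T, [h[(i - 1) % M], h[(i + 1) % M]])

    lhs = p @ w(i)
    rhs = (h[(i - 1) % M] + h[(i + 1) % M]) / (2 * np.cos(theta0))
    print(np.isclose(lhs, rhs))          # True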
Next we show that $h^T C \le 0$ is a sufficient condition for h to be the support vector of some set $S \subset \mathbb{R}^2$. We do this by construction. Consider the two sets

    $S_B = \{ u \in \mathbb{R}^2 \mid u^T[\omega_1 \; \omega_2 \; \cdots \; \omega_M] \le [h_1 \; h_2 \; \cdots \; h_M] \}$    (A.4)

and

    $S_v = \mathrm{hul}(v_1, v_2, \ldots, v_M)$    (A.5)

where $\mathrm{hul}(\cdot)$ indicates the convex hull of the set of points $v_i$, i = 1, ..., M, given by

    $v_i = L_i \cap L_{i+1}$    (A.6)

where

    $L_i = \{ u \in \mathbb{R}^2 \mid u^T\omega_i = h_i \}$ for i = 1, ..., M.    (A.7)

Now consider what would happen if the two sets $S_B$ and $S_v$ were equal. Then, letting $S = S_B = S_v$, we see from the definition of $S_B$ that each point x in S has the property $x^T\omega_i \le h_i$ for i = 1, ..., M. Furthermore, from the definitions of $S_v$, $v_i$, and $L_i$, each line $L_i$ intersects S in at least one point $v_i$, yielding the fact that $S \cap L_i$ is not empty. Hence, from Lemma A.2 we may conclude that each line $L_i$, i = 1, ..., M, is a support line of S. Then it follows immediately that h is a support vector for $S = S_B = S_v$, which proves the sufficiency of (A.1). To complete the theorem, then, we must show that (A.1) implies $S_B = S_v$. This is done in two stages: first we show that $S_B \subset S_v$; then that $S_v \subset S_B$.

Since $S_B$ is a bounded (convex) polytope (by Lemma A.1), it may be written as

    $S_B = \mathrm{hul}(e_1, e_2, \ldots, e_P)$    (A.8)

where the $e_j$ are the extreme points of $S_B$ (which are guaranteed to be finite in number since $S_B$ is formed as the intersection of a finite number of closed halfspaces). Consider one particular extreme point of $S_B$, $e_j$. It must satisfy with equality at least two inequalities in Equation A.4, the defining expression for $S_B$. Let one of those inequalities be indexed by k. Then we have

    $e_j^T\omega_k = h_k$,    (A.9)

i.e., $e_j$ lies on the line $L_k$. Two of the $v_i$'s also lie on $L_k$: $v_{k-1}$ and $v_k$. Now suppose $e_j$ could be written as a convex combination of $v_{k-1}$ and $v_k$. Then any extreme point of $S_B$ could be written as a convex combination of 2 points in $S_v$. And since both $S_B$ and $S_v$ are convex, we must then have $S_B \subset S_v$, proving this stage of the theorem.

We now show that $e_j$ can, indeed, be written as a convex combination of $v_{k-1}$ and $v_k$. Here, there are two possibilities: $v_{k-1} = v_k$ and $v_{k-1} \ne v_k$. Each of these cases requires some development.

In the case where $v_{k-1} = v_k$, we show that $e_j = v_{k-1} = v_k$; $e_j$ is then clearly a convex combination of $v_{k-1}$ and $v_k$. First, since $e_j$ and $v_k$ are on the line $L_k$ perpendicular to $\omega_k$, we may write $e_j$ as

    $e_j = v_k + \rho\,\bar\omega_k$

where $\bar\omega_k$ is the unit vector perpendicular to $\omega_k$. Taking inner products of both sides of the above expression with $\omega_{k-1}$, and using the fact that $e_j$ is in $S_B$ (and that $v_k = v_{k-1}$ lies on $L_{k-1}$), we may write

    $\omega_{k-1}^T e_j = h_{k-1} + \rho\,\omega_{k-1}^T\bar\omega_k \le h_{k-1}$

and, similarly, for $\omega_{k+1}$,

    $\omega_{k+1}^T e_j = h_{k+1} + \rho\,\omega_{k+1}^T\bar\omega_k \le h_{k+1}$.

Hence,

    $\rho\,\omega_{k-1}^T\bar\omega_k \le 0$ and $\rho\,\omega_{k+1}^T\bar\omega_k \le 0$.

After simplifying the above expressions using the definitions of $\omega_{k-1}$, $\omega_{k+1}$, and $\bar\omega_k$, we are led to the contradictory equations

    $\rho(-\sin\theta_0) \le 0$ and $\rho(\sin\theta_0) \le 0$;

hence, $\rho$ must be zero, and therefore $e_j = v_k = v_{k-1}$, as required.

In the case where $v_{k-1} \ne v_k$, we first need an auxiliary result relating the unit vectors $\omega_{k-1}$, $\omega_k$, and $\omega_{k+1}$. From the geometry it is easy to verify that

    $\omega_{k-1} = 2\cos\theta_0\,\omega_k - \omega_{k+1}$, where $\theta_0 = 2\pi/M$.    (A.10)

Next, since $e_j$, $v_{k-1}$, and $v_k$ all lie on the same line $L_k$, and $v_{k-1}$ and $v_k$ are distinct points, we may express $e_j$ as a linear combination of $v_{k-1}$ and $v_k$ using the single parameter a:

    $e_j = a\,v_{k-1} + (1-a)\,v_k$.    (A.11)

Taking the inner product of both sides of (A.11) with $\omega_{k-1}$, we have

    $e_j^T\omega_{k-1} = a\,v_{k-1}^T\omega_{k-1} + (1-a)\,v_k^T\omega_{k-1} = a\,h_{k-1} + (1-a)\,v_k^T\omega_{k-1} \le h_{k-1}$.    (A.12)

The last inequality results from the fact that $e_j$ is, by definition, in $S_B$. Now we eliminate $\omega_{k-1}$ from (A.12) using (A.10), yielding

    $a\,h_{k-1} + (1-a)\,v_k^T(2\cos\theta_0\,\omega_k - \omega_{k+1}) \le h_{k-1}$,

which may be further reduced to

    $(1-a)(2\cos\theta_0\,h_k - h_{k-1} - h_{k+1}) \le 0$.    (A.13)

Since from (A.1) the quantity $2\cos\theta_0\,h_k - h_{k-1} - h_{k+1}$ must be non-positive, we immediately recognize that $a \le 1$. Taking the inner product of both sides of (A.11) with $\omega_{k+1}$ and using a similar sequence of steps leading to (A.13), one may show that

    $a(2\cos\theta_0\,h_k - h_{k-1} - h_{k+1}) \le 0$,    (A.14)

from which we conclude that $a \ge 0$. Hence, we have that $0 \le a \le 1$ and $e_j$ is, in fact, a convex combination of $v_{k-1}$ and $v_k$. This completes the proof that $S_B \subset S_v$.

Now we commence the proof that $S_v \subset S_B$. We intend to show that $v_i \in S_B$ for each i = 1, ..., M. Since $S_B$ is convex, this is sufficient to prove that $S_v$ is contained in $S_B$. In what follows, we show that

    $v_i^T[\omega_1 \; \cdots \; \omega_M] \le [h_1 \; \cdots \; h_M]$    (A.15)

for all i = 1, ..., M. Expanding (A.15) using $\omega_j = [\cos\theta_j \;\; \sin\theta_j]^T$ and

    $v_i^T = [h_i \; h_{i+1}][\omega_i \; \omega_{i+1}]^{-1}$

and simplifying yields

    $\frac{1}{\sin\theta_0}\,[q_{i1} \; q_{i2} \; \cdots \; q_{iM}] \le [h_1 \; h_2 \; \cdots \; h_M]$    (A.16)

where

    $q_{ij} = h_i\sin(\theta_{i+1} - \theta_j) - h_{i+1}\sin(\theta_i - \theta_j)$.    (A.17)

Our task is to show that (A.16) is true given $h^T C \le 0$. Equation A.16 is true if each term is separately true. Hence, we must show

    $\frac{1}{\sin\theta_0}\left( h_i\sin(\theta_{i+1} - \theta_j) - h_{i+1}\sin(\theta_i - \theta_j) \right) \le h_j$    (A.18)

for i = 1, ..., M (each $v_i$) and j = 1, ..., M (each term in A.16). Because of the rotational symmetry of the problem we may, without loss of generality, choose j = 1 and prove that (A.18) is true for i = 1, ..., M. Since $\theta_i = (i-1)2\pi/M = (i-1)\theta_0$, we may simplify (A.18) to

    $\frac{1}{\sin\theta_0}\left( h_i\sin(i\theta_0) - h_{i+1}\sin((i-1)\theta_0) \right) \le h_1$.    (A.19)

Denoting the left-hand side of (A.19) by $E_i$, we have for i = 1 that

    $E_1 = \frac{1}{\sin\theta_0}\left( h_1\sin\theta_0 - h_2\sin 0 \right) = h_1$,

which satisfies (A.19) trivially. The general expression $E_i$ for i = 2, ..., M may be related to $E_{i-1}$ using the relation $h^T C \le 0$ as follows. From (A.19) we have

    $E_i = \frac{1}{\sin\theta_0}\left( h_i\sin(i\theta_0) - h_{i+1}\sin((i-1)\theta_0) \right)$.

Using the identity $\sin(i\alpha) = 2\sin((i-1)\alpha)\cos\alpha - \sin((i-2)\alpha)$, this becomes

    $E_i = \frac{1}{\sin\theta_0}\left[ h_i\left( 2\sin((i-1)\theta_0)\cos\theta_0 - \sin((i-2)\theta_0) \right) - h_{i+1}\sin((i-1)\theta_0) \right] = \frac{1}{\sin\theta_0}\left[ (2h_i\cos\theta_0 - h_{i+1})\sin((i-1)\theta_0) - h_i\sin((i-2)\theta_0) \right]$.    (A.20)

Now we notice that the i-th constraint in $h^T C \le 0$ may be written as

    $2\cos\theta_0\,h_i - h_{i+1} \le h_{i-1}$.

Using this inequality in (A.20) yields

    $E_i \le \frac{1}{\sin\theta_0}\left[ h_{i-1}\sin((i-1)\theta_0) - h_i\sin((i-2)\theta_0) \right]$,

which may be reduced to $E_i \le E_{i-1}$. This is the result that we sought. Now we may conclude that

    $E_M \le E_{M-1} \le \cdots \le E_2 \le E_1 = h_1$,    (A.21)

which concludes the proof of sufficiency and, hence, the theorem. □
APPENDIX B

We show here that in the limit as $M \to \infty$, the discrete support vector constraint $h^T C \le 0$ goes to the constraint $h''(\theta) + h(\theta) \ge 0$ for the continuous support function. Referring to Theorem 1, we have for three consecutive support values the relation

    $-h_{i-1} + 2\cos(2\pi/M)\,h_i - h_{i+1} \le 0$    (B.1)

which simply rewrites the constraint implied by the i-th column of C in the expression $h^T C \le 0$. Now assume that the i-th support value $h_i$ has associated with it the angle $\theta$, so that $L_i = L(h_i, \theta)$ is the corresponding support line. Then $L_{i-1} = L(h_{i-1}, \theta - \theta_0)$ and $L_{i+1} = L(h_{i+1}, \theta + \theta_0)$ are the support lines for support values $h_{i-1}$ and $h_{i+1}$, respectively. Then any support function which has $L_{i-1}$, $L_i$, and $L_{i+1}$ as support lines must satisfy

    $h(\theta - \theta_0) - 2\cos(\theta_0)\,h(\theta) + h(\theta + \theta_0) \ge 0$    (B.2)

where we have negated both sides of (B.1) and substituted $\theta_0 = 2\pi/M$. To show that (B.2) approaches $h''(\theta) + h(\theta) \ge 0$ as $\theta_0 \to 0$, first we substitute the 2nd-order Taylor series approximation to $\cos\theta_0$,

    $\cos\theta_0 \approx 1 - \theta_0^2/2$,    (B.3)

into (B.2) to obtain

    $h(\theta - \theta_0) - 2h(\theta) + \theta_0^2\,h(\theta) + h(\theta + \theta_0) \ge 0$.    (B.4)

Dividing both sides of (B.4) by $\theta_0^2$ (which cannot change the sign of the inequality since $\theta_0^2$ is always positive) and rearranging, we obtain

    $\frac{1}{\theta_0}\left[ \frac{h(\theta + \theta_0) - h(\theta)}{\theta_0} - \frac{h(\theta) - h(\theta - \theta_0)}{\theta_0} \right] + h(\theta) \ge 0$.    (B.5)

Taking the limit of (B.5) as $\theta_0$ goes to zero gives the desired result.
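This convergence is easy to observe numerically. In the sketch below (our own illustration), we use the support function of a unit circle whose center is offset from the origin, $h(\theta) = 1 + c_1\cos\theta + c_2\sin\theta$, for which $h''(\theta) + h(\theta) = 1$ exactly; the left-hand side of (B.2), divided by $\theta_0^2$, should therefore approach 1 as M grows.

    import numpy as np

    def h(theta, c=(0.3, -0.2)):
        # Support function of a unit circle centered at (c1, c2):
        # h(theta) = 1 + c1*cos(theta) + c2*sin(theta), so h'' + h = 1.
        return 1.0 + c[0] * np.cos(theta) + c[1] * np.sin(theta)

    theta = 1.0                       # evaluate the constraint at any angle
    for M in (10, 100, 1000):
        t0 = 2 * np.pi / M
        lhs = (h(theta - t0) - 2 * np.cos(t0) * h(theta) + h(theta + t0)) / t0**2
        print(M, lhs)                 # approaches 1.0 as M grows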
APPENDIX C

We show here the modifications to the matrices A and c in Equation (4.10b) necessary to implement the sparse-data and shift-corrected modified algorithms.

Sparse Data (see end of Section 4.2). We show here the method used to obtain A and c for M/N = 3; the reasoning for any other integer scaling is similar. Recall that we seek an estimate $h \in \mathbb{R}^M$ from N = M/3 observations. Since h is in $\mathbb{R}^M$, we begin by setting up the problem as if there were M observations. Denoting A using four submatrices (refer to Equation 4.15),

    $A = [A_1 : A_2 : A_3 : A_4]$,    (C.1)

we see that the submatrices $A_1$ and $A_2$ will remain unchanged, since these constraints do not involve the observations. The matrices $A_3$ and $A_4$ must change, since they do involve the observations. We easily see that since certain elements of h (i.e., 2, 3 and 5, 6, etc.) are not observed, there should be no constraints imposed on them. Hence, the modification to $A_3$ and $A_4$ is to decimate (remove) certain columns of each matrix. Since N = M/3, we will remove all columns except columns 1, 4, 7, ..., 3(N-1)+1. This applies to both matrices $A_3$ and $A_4$. Denoting the decimated matrices as $A_3'$ and $A_4'$, we have the new sparse-data A matrix given by

    $A_s = [A_1 : A_2 : A_3' : A_4']$.    (C.2)

Since $A_3'$ and $A_4'$ each have N columns corresponding to the constraints on the measured data, it turns out that the definition of the vector c (Equations 4.16 and 4.17) remains unchanged.

Shift Correction (see Section 4.4). The shift-corrected algorithms require the imposition of the linear equality constraint $h^T N = v$. Since this is an equality constraint, and the constraints of each algorithm discussed have been inequalities, we must split the expression into two inequalities,

    $h^T N \le v$    (C.3)

and

    $h^T N \ge v$.    (C.4)

Then, after multiplying (C.4) by -1, we have the shift constraint in the required form. The new shift-corrected A matrix requires a fifth partition,

    $A_{sc} = [A_0 : A_1 : A_2 : A_3 : A_4]$,    (C.5)

where $A_1$ through $A_4$ are as before, and $A_0$ is given by

    $A_0 = [N : -N]$.    (C.6)

In this case, the vector c must also be augmented as

    $c_{sc} = [c_0 : c_1 : c_2 : c_3 : c_4]$,    (C.7)

where $c_1$ through $c_4$ are the former 4 partitions of c (see Equation 4.16) and $c_0$ is given by

    $c_0 = [v : -v]$.    (C.8)
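The column decimation in (C.2) amounts to keeping every third column of $A_3$ and $A_4$. A minimal sketch follows; the submatrices here are placeholders (Equation 4.15's actual definitions appear in the main text), so only the assembly pattern is meant to be illustrative.

    import numpy as np

    def sparse_data_A(A1, A2, A3, A4, factor=3):
        """Assemble the sparse-data constraint matrix A_s of (C.2): keep only
        the columns of A3 and A4 that correspond to observed support values
        (columns 1, 4, 7, ..., i.e., indices 0, 3, 6, ... zero-based)."""
        keep = np.arange(0, A3.shape[1], factor)
        return np.hstack([A1, A2, A3[:, keep], A4[:, keep]])

    # Example with placeholder submatrices for M = 9, N = 3:
    M, N = 9, 3
    A1, A2 = np.ones((M, M)), np.ones((M, M))    # stand-ins, not Eq. 4.15
    A3, A4 = np.eye(M), -np.eye(M)               # stand-ins, not Eq. 4.15
    print(sparse_data_A(A1, A2, A3, A4).shape)   # (9, 2*M + 2*N) = (9, 24)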
REFERENCES

[Bellman, 1970] Bellman, R., Introduction to Matrix Analysis, New York: McGraw-Hill, 1970.

[Bresler and Macovski, 1984a] Bresler, Y. and Macovski, A., "3-D Reconstruction From Projections Based on Dynamic Object Models," Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, San Diego, California, March 1984.

[Bresler and Macovski, 1984b] Bresler, Y. and Macovski, A., "Estimation of 3-D Shape of Blood Vessels from X-Ray Images," Proc. IEEE Comp. Soc. Int. Symp. on Medical Images and Icons, Arlington, Virginia, July 1984.

[Bresler and Macovski, 1984c] Bresler, Y. and Macovski, A., "A Hierarchical Bayesian Approach to Reconstruction From Projections of a Multiple Object 3-D Scene," Proc. 7th International Conference on Pattern Recognition, Montreal, Canada, August 1984.

[Dantzig, 1963] Dantzig, G.B., Linear Programming and Extensions, Princeton, N.J.: Princeton University Press, 1963.

[Goldfarb and Idnani, 1983] Goldfarb, D. and Idnani, A., "A numerically stable dual method for solving strictly convex quadratic programs," Mathematical Programming, v.27, pp.1-33, 1983.

[Greschak, 1985] Greschak, J.P., "Reconstructing Convex Sets," Ph.D. Dissertation in Electrical Engineering, Massachusetts Institute of Technology, February 1985.

[Herman, 1980] Herman, G.T., Image Reconstruction From Projections, New York: Academic Press, 1980.

[Horn, 1986] Horn, B.K.P., Robot Vision, Cambridge, MA: MIT Press, 1986.

[Humel, 1986] Humel, J.M., "Resolving bilinear data arrays," S.M. Thesis, M.I.T., Cambridge, MA, 1986.

[Kelly and Weiss, 1979] Kelly, P.J. and Weiss, M., Geometry and Convexity -- A Study in Mathematical Methods, New York: John Wiley and Sons, 1979.

[Land and Powell, 1973] Land, A.H. and Powell, S., Fortran Codes for Mathematical Programming, London: Wiley-Interscience, 1973.

[Leahy and Goutis, 1985] Leahy, R.M. and Goutis, C.E., "An optimal technique for constraint-based image restoration and reconstruction," preprint, submitted to IEEE Trans. ASSP, July 1, 1985.

[Louis, 1980] Louis, A.K., "Picture Reconstruction from Projections in Restricted Range," Math. Meth. in the Appl. Sci., v.2, pp.209-220, 1980.

[Luenberger, 1984] Luenberger, D.G., Linear and Nonlinear Programming, 2nd edition, Reading, Massachusetts: Addison-Wesley, 1984.

[Powell, 1983] Powell, M.J.D., "ZQPCVX: A Fortran subroutine for convex quadratic programming," Report DAMTP/1983/NA17, University of Cambridge, 1983.

[Preparata and Shamos, 1985] Preparata, F.P. and Shamos, M.I., Computational Geometry, New York: Springer-Verlag, 1985.

[Prince, 1986] Prince, J.L., "Geometric model-based Bayesian estimation from projections," Proposal for Ph.D. research in Electrical Engineering, Massachusetts Institute of Technology, April 1986.

[Rossi and Willsky, 1984] Rossi, D.J. and Willsky, A.S., "Reconstruction from projections based on detection and estimation of objects -- Parts I and II: Performance analysis and robustness analysis," IEEE Trans. ASSP, v.ASSP-32, n.4, pp.886-906, August 1984.

[Santalo, 1976] Santalo, L.A., Integral Geometry and Geometric Probability, Encyclopedia of Mathematics and Its Applications I, Reading, MA: Addison-Wesley, 1976.

[Schneiter, 1986] Schneiter, J., "Automated Tactile Sensing for Object Recognition and Localization," Ph.D. Dissertation in Mechanical Engineering, Massachusetts Institute of Technology, June 1986.

[Sezan and Stark, 1984] Sezan, M.I. and Stark, H., "Tomographic image reconstruction from incomplete view data by convex projections and direct Fourier inversion," IEEE Trans. Med. Imag., v.MI-3, n.2, pp.91-98, June 1984.

[Spivak, 1979] Spivak, M., A Comprehensive Introduction to Differential Geometry, Volume II, Berkeley: Publish or Perish, Inc., 1979.

[Thomas, 1972] Thomas, G.B., Calculus and Analytic Geometry, Reading, Massachusetts: Addison-Wesley, 1972.

[Trussell and Civanlar, 1984] Trussell, H.J. and Civanlar, M.R., "The feasible solution in signal restoration," IEEE Trans. Acoust., Speech, and Sig. Proc., v.ASSP-32, n.2, pp.201-212, April 1984.

[Van Hove and Verly, 1985] Van Hove, P.L. and Verly, J.G., "A silhouette-slice theorem for opaque 3-D objects," Proc. ICASSP, March 26-29, 1985, pp.933-936.

[Van Trees, 1968] Van Trees, H.L., Detection, Estimation, and Modulation Theory, Part I, New York: John Wiley and Sons, 1968.

[Youla and Webb, 1982] Youla, D.C. and Webb, H., "Image restoration by the method of convex projections: Part 1 -- Theory," IEEE Trans. Med. Imaging, v.MI-1, pp.81-94, October 1982.