LIDS-P-2205

Reconstructing Parameterized Objects From Projections: A Statistical View*

Peyman Milanfar (Corresponding Author)
Alphatech Inc., 50 Mall Road, Burlington, MA 01803
Phone: (617) 273-3388 extension 271
E-mail: milanfar@athena.mit.edu

W. Clem Karl, Alan S. Willsky
Laboratory for Information and Decision Systems
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Cambridge, Massachusetts, 02139

October 17, 1993

*This work was supported by the National Science Foundation under Grant 9015281-MIP, the Office of Naval Research under Grant N00014-91-J-1004, the US Army Research Office under Contract DAAL03-92-G-0115, and the Clement Vaturi Fellowship in Biomedical Imaging Sciences at MIT.

Abstract

In many applications of tomography, the fundamental quantities of interest in an image are geometric ones. In these instances, pixel-based signal processing and reconstruction is at best inefficient, and at worst nonrobust, in its use of the available tomographic data. Classical reconstruction techniques such as Filtered Back-Projection tend to produce spurious features when data is sparse and noisy; these "ghosts" further complicate the process of extracting what is often a limited number of rather simple geometric features. In addition, even if our interest is not primarily geometric, such a perspective can provide a rational framework for focusing information in those cases where the quality or quantity of the available data will not support the generation of a dense pixel-based field. In this paper we present a framework that, in its most general form, is a statistically optimal technique for the extraction of specific geometric features or objects directly from the noisy projection data. We present an approach that is applicable to the reconstruction of any finite parameterization of an object, but we focus specifically on the tomographic reconstruction of binary polygonal objects from sparse and noisy data. In our setting, the tomographic reconstruction problem is formulated as a (finite-dimensional) parameter estimation problem, where the parameters to be estimated correspond to features of the underlying object. In particular, the vertices of binary polygons are used as their defining parameters. Under the assumption that the projection data are corrupted by Gaussian white noise, we use the Maximum Likelihood (ML) criterion when the number of parameters is assumed known, and the Minimum Description Length (MDL) criterion when the number of parameters is not known. The resulting optimization problems are nonlinear and thus plagued by numerous extraneous local extrema, making their solution far from trivial. In particular, proper initialization of any iterative technique is essential for good performance. To this end, we provide a method to construct a reliable yet simple initial guess for the solution. This procedure is based on the estimated moments of the object, which may be conveniently obtained directly from the noisy projection data.

List of Figures

1. The Radon transform
2. A projection of a binary, polygonal object
3. Illustration of Result 1
4. Triangle example. SNR = 0 dB, 50 views, 20 samples/view: True(-), Reconstruction(- -). %Error = 7.2
5. Hexagon example. SNR = 0 dB, 50 views, 20 samples/view: True(-), Reconstruction(- -). %Error = 9.6
6. From Top to Bottom: Sinograms with 50 projections and 20 samples per projection of I) Noiseless Hexagon, II) Noisy data at 0 dB, III) Reconstructed Hexagon. In each of these images, the horizontal axis is θ, the vertical axis is t, and the intensity values are the values of the corresponding projections mapped to the grayscale range [0, 255]
7. Sample reconstruction of a hexagon at 0 dB SNR using FBP: 64 views, 64 samples per view
8. Sample reconstruction of a hexagon at 0 dB SNR, 64 views, 64 samples/view: True(-), Reconstruction(- -)
9. Mean performance curves for ML reconstructions of a triangle and a hexagon
10. Performance as a function of Number of Views
11. Performance as a Function of Number of Samples per View
12. Cost vs number of sides for the hexagon in Figure 5
13. Minimum MDL costs and Sample Reconstructions for an Ellipse
14. True object (-), reconstruction (-.), initial guess (o) picked by eye
15. True object (-), reconstruction (-.), initial guess (o) picked using Initial Guess Algorithm
16. FBP Reconstruction of non-polygonal, non-convex Object: 3rd order Butterworth filter with 0.15 normalized cutoff frequency, SNR = 4.35 dB
17. A sample path of the reconstruction error at SNR = 0 dB
18. Percent error for Hexagon vs SNR after outlier removal
19. A typical reconstruction at a local minimum with SNR = 0 dB
20. A typical reconstruction at a local minimum with SNR = 4.35 dB
21. Relative error in matching second order moments using the Initial Guess Algorithm

1 Introduction

In many applications of tomography, the aim is to extract a rather small set of geometrically based features from a given set of projection data [1, 2, 3]. In these instances, a full pixel-by-pixel reconstruction of the object is a rather inefficient and non-robust approach. In addition, in many situations of practical interest, a full set of data with high signal-to-noise ratio (SNR) is often difficult, if not impossible, to obtain. Such situations arise in oceanography, nuclear medicine, surveillance, and non-destructive evaluation when, due to the geometry of the object or the imaging apparatus, only a few noisy projections are available [4, 5]. In these cases, classical reconstruction techniques such as Filtered Back-Projection (FBP) [5] and Algebraic Reconstruction Techniques (ART) fail to produce acceptable reconstructions.

The shortcomings of these classical techniques in such situations can be attributed to two main sources. First, these techniques are invariably aimed at reconstructing every pixel value of the underlying object with little regard to the quality and quantity of the available data. To put it differently, there is no explicit or implicit mechanism to control greed and focus information, and thus nothing prevents one from attempting to extract more information from the data than it actually contains.
The second type of shortcoming results from the fact that if we assume that the projection data are corrupted by Gaussian white noise, the process of reconstruction will have the net effect of "coloring" this noise. This effect manifests itself in the object domain in the form of spurious features which complicate the detection of geometric features. This observation points out the importance of working directly with the projection data when the final goal is the extraction of geometric information.

In our effort to address these two issues, we have proposed the use of simple geometric priors in the form of finitely parameterized objects. The assumption that the object to be reconstructed is finitely parameterized allows the tomographic reconstruction problem to be posed as a finite (relatively low-dimensional) parameter estimation problem. If we further assume, as we have done in the latter part of this paper, that the number of such parameters is also unknown, we can formulate the reconstruction problem as a Minimum Description Length estimation problem, which provides an automatic (data-driven) method for computing the optimal parameterized object with the "best" number of parameters, given the data. This is, in essence, an information-theoretic criterion which gives us a direct way to estimate as many parameters as the information content of the data allows, and thus to control the greed factor.

Other efforts in the parametric/geometric study of tomographic reconstruction have been carried out in the past. The work of Rossi and Willsky [6] and Prince and Willsky [7] has served as the starting point for this research effort. In the work of Rossi, the object was represented by a known profile with a small number of geometric parameters, namely size, location, eccentricity, and orientation. These parameters were then estimated from projection data using the Maximum Likelihood (ML) formulation. In that approach, the number of unknown parameters was fixed, and the main focus of the work was on performance analysis. Prince, on the other hand, used a priori information such as prior probabilities on sinograms and consistency conditions to compute Maximum A Posteriori (MAP) estimates of the sinogram, and then used FBP to obtain a reconstruction. He made use of prior assumptions about shape, such as convexity, to reconstruct convex objects from support samples which were extracted from noisy projections through optimal filtering techniques. The approach of Prince provided an explicit method for integrating geometric information into the reconstruction process but was in essence still a pixel-by-pixel reconstruction. Extending these ideas, Lele, Kulkarni, and Willsky [8] made use of only support information to produce polygonal reconstructions. Hanson [9] studied the reconstruction of axially symmetric objects from a single projection. Karl [10] has also studied the reconstruction of 3-D objects from two-dimensional silhouette projections. The geometric modeling approach of Rossi and Willsky was expanded upon to include a more general set of objects by Bresler and Macovski [11] and Fessler and Macovski [12]. The former work chose sequences of 3-D cylinders with unknown radius, position, and orientation to model blood vessels being tomographically imaged in 3-D. The latter work used ellipsoids with unknown parameters to reconstruct 3-D arterial trees from a few magnetic resonance angiograms.
Recently, Thirion [2] has introduced a technique to extract boundaries of objects from raw tomographic data through edge detection in the sinogram. Other work in geometric reconstruction, by Chang [13] and, more recently, by Kuba, Volcic, Gardner, and Fishburn [14, 15, 16, 17], has been concerned with the reconstruction of binary objects from only two noise-free projections.

Our approach provides a statistically optimal ML formulation for the direct recovery of objects from the projection data in the presence of noise. We also provide an automatic mechanism for identifying the statistically optimal number of object parameters, and thus information, in a given data set. The statistically optimal ML formulation leads to an optimization problem that is nonlinear and filled with local extrema. An appropriate initial guess is thus essential for its iterative solution. We therefore provide a simple procedure to generate an appropriate initial guess based on moment estimates of the object formed from the original data.

The organization of this paper is as follows. In Section 2 we introduce the basic definitions and assumptions and pose the general problem which we intend to solve. We also discuss the particular statistical formulations of the reconstruction problem which we use. In particular, in Section 2.3 we discuss our novel technique for computing a good initial guess for the nonlinear optimization problems that result from our formulations. Section 3 contains basic performance results and robustness studies for various scenarios. Section 4 contains our conclusions.

2 The Reconstruction Problem

The Radon transform [5, 18] of a function f(x, y) defined over a compact domain Ω of the plane is given by

g(t, \theta) = \int_{\Omega} f(x, y) \, \delta(t - \omega^T [x, y]^T) \, dx \, dy.    (1)

For every fixed t and θ, g(t, θ) is the line integral of f over Ω in the direction ω = [cos(θ), sin(θ)]^T, where δ(t − [cos(θ), sin(θ)] · [x, y]^T) is a delta function on a line at angle θ + π/2 and distance t from the origin. See Figure 1.

[Figure 1: The Radon transform]

Here we assume the existence of a set of parameters that uniquely specify the function f; the estimation of these parameters is the concern of this paper. Let us stack the set of parameters that uniquely define f in a vector V. In what follows, we shall then consider the function f and its set of defining parameters as interchangeable. We will assume throughout that the data available to us are discrete samples of g which are corrupted by Gaussian white noise of known intensity. In particular, our observations will be given by

Y_{ij} = g(t_i, \theta_j, V^*) + w(t_i, \theta_j),    (2)

for 1 ≤ i ≤ m, 1 ≤ j ≤ n, where V* is the true object we wish to reconstruct. The variables w(t_i, θ_j) are assumed to be independent, identically distributed (i.i.d.) Gaussian random variables with variance σ². We will denote by Y the vector of all such observations.

In the classical framework, the parameter vector V consists of the values of all pixels in the image to be reconstructed. For example, the parameter vector V would contain over 16,000 elements for a 128 × 128 image, and the FBP technique would estimate all of these parameters. For many applications, however, the creation of this dense visual field is not really necessary. For instance, in ocean acoustic tomography [4, 19], the size, location, and shape of a cold core ring (i.e.
a continuous body of cool water contained in the warm waters of the Gulf Stream) are important quantities to be estimated. Since a cold core ring can be considered a binary object, these parameters can be extracted from the tomographic data directly, without reconstruction of the entire pixel field. The classical approach, however, is to perform the full reconstruction of the image and then try to locate and measure objects, such as cold core rings, in the resulting image. If we only desire knowledge of the size and location of a localized occlusion, that is to say essentially three numbers, FBP is then estimating roughly 15,997 parameters too many!

Clearly, the estimation of all the pixel values is not only an inefficient use of the data, but in high noise and sparse data situations it can also produce spurious features known as "ghosts" that are bound to complicate the subsequent processing. In particular, such spurious features reflect the fact that we have colored the observation noise. This coloring of the noise, as reflected in reconstruction artifacts, may not be severe when we have high SNR and full-coverage data, yet can be quite limiting when these conditions are not met. In general, there may arise situations where the quality or quantity of the available data will simply not support the reliable estimation of all the parameters represented by the pixel values. In such cases, even though our interest may not be fundamentally geometric, our approach provides a rational and statistically precise way of reducing the degrees of freedom of the problem and hence of focusing the available information.

2.1 Maximum Likelihood Approach

In our approach, the original data in (2) is used to directly estimate the parameters V of a geometric parameterization in a statistically optimal way. The dimension of the parameter vector V is determined by the level of detail that one can extract from the sparse and noisy data. For clarity, we first consider the case where a fixed and known number of parameters is assumed. In this case, the Maximum Likelihood (ML) [20] estimate V_ml of the parameter vector V is given by that value of V which makes the observed data most likely. In particular, using the monotonicity of the logarithm:

V_{ml} = \arg\max_V \log P(Y|V),    (3)

where P(Y|V) denotes the conditional probability density of the observed data set Y given the parameter vector V. It is well known that under the assumption that the data is corrupted by i.i.d. Gaussian noise, the solution to the above ML estimation problem is precisely equivalent to the following Nonlinear Least Squares Error (NLSE) formulation:

V_{ml} = \arg\min_V \sum_{i,j} \| Y_{ij} - g(t_i, \theta_j, V) \|^2.    (4)

The formulation (4) shows that, in contrast to the linear formulation of classical reconstruction algorithms, the ML tomographic reconstruction approach, while yielding an optimal reconstruction framework, generally results in a highly nonlinear minimization problem. It is the nature of the dependence of g on the parameter vector V that makes the problem nonlinear. Being nonlinear, the problem is plagued by numerous extraneous local extrema, making the issue of computing a good initial guess for nonlinear optimization routines an important one. As we will discuss in Section 2.3, the geometric information carried in the projections can be conveniently extracted in the form of moments. These moments can in turn be used to easily compute a good initial guess for the optimization algorithms.
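To make the NLSE formulation (4) concrete, the following sketch solves it for the simplest finitely parameterized object mentioned above: a disc described by three numbers (center and radius), whose projections have the closed form 2√(r² − s²) with s the signed distance of the line from the center. This is our illustration of the technique, not the authors' code; the names disc_projection and ml_estimate are assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

def disc_projection(params, t, theta):
    """Radon transform of the indicator of a disc: the chord length of
    the line <x, omega> = t through a disc with center (cx, cy), radius r."""
    cx, cy, r = params
    T, TH = np.meshgrid(t, theta, indexing="ij")
    s = T - (cx * np.cos(TH) + cy * np.sin(TH))   # line-to-center distance
    return 2.0 * np.sqrt(np.maximum(r**2 - s**2, 0.0))

def ml_estimate(Y, t, theta, x0):
    """Solve the NLSE problem (4) for the disc parameters (cx, cy, r)."""
    def residuals(p):
        return (Y - disc_projection(p, t, theta)).ravel()
    # A local solver: convergence hinges on the initial guess x0,
    # exactly as discussed in the text.
    return least_squares(residuals, x0).x

# Synthetic experiment in the spirit of (2): noisy samples of a true disc.
t = np.linspace(-3, 3, 20)                          # 20 samples per view
theta = np.linspace(0, np.pi, 50, endpoint=False)   # 50 views
g_true = disc_projection([0.5, -0.5, 1.0], t, theta)
Y = g_true + 0.5 * np.random.randn(*g_true.shape)   # white noise, per (2)
print(ml_estimate(Y, t, theta, x0=[0.0, 0.0, 0.8]))
```

The same pattern applies verbatim to any finite parameterization: only the forward model changes.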
Finally, note that if additional explicit geometric information is available in terms of a prior probabilistic description of the object vector V, then a Maximum A Posteriori (MAP) estimate of V may be computed as follows:

V_{map} = \arg\max_V \log P(V|Y).    (5)

In this work we concentrate on the ML problem given in (3) and its extensions, though application of our results to the MAP formulation is straightforward.

2.2 Minimum Description Length

In the previous ML discussion we assumed we had prior knowledge of the number of parameters describing the underlying object. Without this knowledge, we can consider the Minimum Description Length (MDL) principle [21]. In this approach, the cost function is formulated such that the global minimum of the cost corresponds to a model of least order that best explains the data. The MDL approach in essence extends the Maximum Likelihood principle by including a term in the optimization criterion that measures the model complexity. In the present context, the model complexity refers to the number of parameters used to capture the object in question. Whereas the ML approach maximizes the log likelihood function given in (3), the MDL criterion maximizes a modified log likelihood function, as follows:

V_{MDL} = \arg\max_{V, N} \left\{ \log P(Y|V) - \frac{N}{2} \log(d) \right\},    (6)

where d = mn is the number of samples of g(t, θ) and N refers to the number of parameters defining the reconstruction. Roughly speaking, the MDL cost is proportional to the number of bits required to model the observed data set with a model of order N, hence the term Minimum Description Length. Under our assumed observation model (2), the MDL criterion (6) yields the following nonlinear optimization problem for the optimal parameter vector V_MDL:

V_{MDL} = \arg\min_N \min_V \left\{ \sigma^{-2} \sum_{i,j} \| Y_{ij} - g(t_i, \theta_j, V) \|^2 + N \log(d) \right\},    (7)

where the optimization is now performed over both V and the number of parameters N. Note that the solution of the inner minimization in (7) essentially requires solution of the original ML problem (3) or (4) for a sequence of values of N. Thus the optimization problem (7), being an extension of (4), is also highly nonlinear, and routines to solve it will similarly require proper initialization to avoid being stuck in local minima.

In summary, our proposed approach, in its most general form, is a statistically optimal technique for the extraction of specific geometric features or objects directly from the projection data and the rational focusing of sparse and noisy information. Furthermore, via the MDL criterion it is equipped to compute the optimal number of features (parameters) that may be extracted from a given noisy and possibly sparse projection data set. While being a statistically optimal approach, the resulting optimization problems are highly nonlinear and require appropriate initialization for their solution. To this end we will present a simple, robust method to generate such an initial estimate.

For the sake of focus in what follows, we will concentrate our attention on binary, polygonal objects. We define these as objects taking the value 1 inside a simple polygonal region and zero elsewhere. Unless otherwise stated, we assume from here on that the matrix V contains the vertices of an N-sided binary, polygonal region as follows:

V = [V_1 \,|\, V_2 \,|\, \cdots \,|\, V_N],    (8)

where V_i = [x_i, y_i]^T denotes the Cartesian coordinates of the i-th vertex of the polygonal region, arranged in counter-clockwise order (see Figure 2).

[Figure 2: A projection of a binary, polygonal object]
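For a binary polygon as in (8), the forward model g(t_i, θ_j, V) needed in (4) and (7) is the chord length of each measurement line inside the polygon (Figure 2). One way to evaluate it, sketched below, intersects each line with the polygon edges and applies the even-odd rule; this is our construction, and the name project_polygon is an assumption, not the authors' code.

```python
import numpy as np

def project_polygon(V, t, theta):
    """Projections g(t_i, theta_j, V) of a binary polygon, per (1)-(2).

    V : (2, N) vertex matrix as in (8), counter-clockwise.
    Returns an array of shape (len(t), len(theta)).
    """
    G = np.zeros((len(t), len(theta)))
    for j, th in enumerate(theta):
        # vertex coordinates across (p) and along (q) the measurement line
        p = V[0] * np.cos(th) + V[1] * np.sin(th)
        q = -V[0] * np.sin(th) + V[1] * np.cos(th)
        p2, q2 = np.roll(p, -1), np.roll(q, -1)       # edge endpoints
        for i, ti in enumerate(t):
            hit = (p < ti) != (p2 < ti)               # edges straddling the line
            lam = (ti - p[hit]) / (p2[hit] - p[hit])  # interpolation fractions
            crossings = np.sort(q[hit] + lam * (q2[hit] - q[hit]))
            # even-odd rule: consecutive crossing pairs bound interior segments
            G[i, j] = np.sum(crossings[1::2] - crossings[0::2])
    return G
```

Because the chord length depends on the vertices through these intersections, the residuals in (4) are highly nonlinear in V, which is the source of the local extrema discussed above.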
Note that we use a matrix of parameters rather than a vector for notational ease in the algorithms to follow, though this is not essential. The tomographic reconstruction problem for these objects can now be stated as the problem of optimal estimation of the vertices V given noisy samples Y_ij of the projections, as in (2).

2.3 Algorithmic Aspects: Computing A Good Initial Guess

Given the highly nonlinear nature of the dependence of the cost function in (4) on the parameters in V, it appears evident that, given a poor initial condition, typical numerical optimization algorithms may converge to local minima of the cost function. Indeed, this issue is a major obstacle to the use of a statistically optimal though nonlinear approach such as given in (3) or (6). In this section we describe a method for using the projection data to directly compute an initial guess that is sufficiently close to the true global minimum as to result in convergence to it in nearly all cases. We do this by estimating the moments of the object directly from the projection data and then using (some of) these moments to compute an initial guess.

In considering the use of moments as the basis for an initialization algorithm, one is faced with two important issues. The first is that although estimating the moments of a function from its projections is a relatively easy task, as we have shown in [22], the reconstruction of a function from a finite number of moments is in general a highly ill-posed problem even when these moments are exactly known [23]. Furthermore, in our framework the moments are estimated from noisy data, and hence are themselves noisy. In fact, as higher and higher order moments are estimated, the error in the estimates of these moments becomes larger. Our approach avoids these moment-related difficulties by using the moments only to guide an initial coarse estimate of the object parameters for subsequent use in the solution of the nonlinear ML or MDL problems. This initial estimate, in turn, itself mitigates the difficulties associated with the nonlinearities of the optimal statistical approaches. In particular, the amount of computation involved in arriving at an initial guess using our moment-based method is far smaller than the amount of computation (number of iterations) required to converge to an answer given a poor initial guess, especially since a poor initial guess may converge to a local minimum and never reach the basin of the global minimum. Further, the parameterization of the objects serves to regularize and robustify the moment inversion process [23, 24, 25, 26].

Our method of using moments to generate an initial guess is based on the following set of observations. First, let μ_pq, 0 ≤ p, q, denote the moment of f(x, y) of order p + q, as given by:

\mu_{pq} = \int\!\!\int x^p y^q f(x, y) \, dx \, dy.    (9)

In particular, note that the moments up to order 2 have the following physical relationships. The zeroth order moment μ_00 is the area of the object; the first order moments μ_10 and μ_01 are the coordinates of the center of mass of the object scaled by the area; and the second order moments μ_20, μ_11, μ_02 are used to form the entries of the inertia matrix of the object. Thus these moments contain basic geometric information about object size, location, elongation, and orientation that, if available, could be used to guide our initialization of the nonlinear optimization problems (4) or (7). Our first aim then is to estimate them directly from the noisy projection data.
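For later reference (e.g., to check how well an initial polygon matches a set of estimated moments), the moments (9) of a binary polygon up to order 2 have closed forms in terms of its vertices, via standard Green's theorem identities. The sketch below uses those standard formulas with our own naming; it is not code from the paper.

```python
import numpy as np

def polygon_moments(V):
    """Moments (9) up to order 2 of the indicator of a simple polygon.

    V : (2, N) counter-clockwise vertex matrix, as in (8).
    Returns (m00, m10, m01, m20, m11, m02).
    """
    x, y = V
    x2, y2 = np.roll(x, -1), np.roll(y, -1)
    c = x * y2 - x2 * y                        # per-edge cross products
    m00 = np.sum(c) / 2.0                      # area
    m10 = np.sum((x + x2) * c) / 6.0           # area-scaled centroid, x
    m01 = np.sum((y + y2) * c) / 6.0           # area-scaled centroid, y
    m20 = np.sum((x**2 + x * x2 + x2**2) * c) / 12.0
    m11 = np.sum((x * y2 + 2 * x * y + 2 * x2 * y2 + x2 * y) * c) / 24.0
    m02 = np.sum((y**2 + y * y2 + y2**2) * c) / 12.0
    return m00, m10, m01, m20, m11, m02
```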
Returning to the estimation of the object moments from the projections, it is easy to show that [18]:

\int_{-\infty}^{\infty} g(t, \theta) \, t^k \, dt = \int\!\!\int f(x, y) \, [x \cos(\theta) + y \sin(\theta)]^k \, dx \, dy.    (10)

By expanding the integrand on the right hand side of (10), it becomes apparent that the moments of the projections are linearly related to the moments μ_pq of the object. In particular, specializing (10) to k = 0, 1, 2, and noting that f(x, y) is an indicator function when the objects in question are binary, we arrive at the following relationships between μ_pq, 0 ≤ p + q ≤ 2, and the projection g(t, θ) of f(x, y) at each angle θ:

\mu_{00} = \int g(t, \theta) \, dt \equiv H^{(0)}(\theta)    (11)

[\cos(\theta) \;\; \sin(\theta)] \begin{bmatrix} \mu_{10} \\ \mu_{01} \end{bmatrix} = \int g(t, \theta) \, t \, dt \equiv H^{(1)}(\theta)    (12)

[\cos^2(\theta) \;\; 2\sin(\theta)\cos(\theta) \;\; \sin^2(\theta)] \begin{bmatrix} \mu_{20} \\ \mu_{11} \\ \mu_{02} \end{bmatrix} = \int g(t, \theta) \, t^2 \, dt \equiv H^{(2)}(\theta)    (13)

Thus if we have projections at three or more distinct, known angles, we can estimate the moments up to order 2 of the object we wish to reconstruct. The computation of these moments is a linear calculation, making their estimation from projections straightforward (see [22]). Since, in general, many more than three projections are available, the estimation of these moments determining the area, center, and inertia axes of the object is overdetermined. The result is a robustness to noise and data sparsity through a reduction in the noise variance of their estimated values. In particular, we can stack the moments H^{(k)}(θ_j) we obtain from the projections at each angle θ_j to obtain the following overall equations for the μ_pq up to order 2:

\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \mu_{00} = \begin{bmatrix} H^{(0)}(\theta_1) \\ \vdots \\ H^{(0)}(\theta_n) \end{bmatrix}    (14)

\begin{bmatrix} \cos(\theta_1) & \sin(\theta_1) \\ \vdots & \vdots \\ \cos(\theta_n) & \sin(\theta_n) \end{bmatrix} \begin{bmatrix} \mu_{10} \\ \mu_{01} \end{bmatrix} = \begin{bmatrix} H^{(1)}(\theta_1) \\ \vdots \\ H^{(1)}(\theta_n) \end{bmatrix}    (15)

\begin{bmatrix} \cos^2(\theta_1) & 2\sin(\theta_1)\cos(\theta_1) & \sin^2(\theta_1) \\ \vdots & \vdots & \vdots \\ \cos^2(\theta_n) & 2\sin(\theta_n)\cos(\theta_n) & \sin^2(\theta_n) \end{bmatrix} \begin{bmatrix} \mu_{20} \\ \mu_{11} \\ \mu_{02} \end{bmatrix} = \begin{bmatrix} H^{(2)}(\theta_1) \\ \vdots \\ H^{(2)}(\theta_n) \end{bmatrix}    (16)

Using these equations we can easily calculate the maximum likelihood estimates of the moments μ_pq of the object, for 0 ≤ p + q ≤ 2, given noisy observations of the moments of the projections. In particular, this is done by gathering the above sets of equations into a single linear equation of the form h = Aμ + e, where h is the vector of noisy computed projection moments H^{(k)}(θ_j) appearing above, μ = [μ_00, μ_10, μ_01, μ_20, μ_11, μ_02]^T, and e denotes a zero-mean vector of Gaussian noise with corresponding covariance matrix R which captures the noise in our observations of H^{(k)}(θ_j). This noise model, of course, follows directly from that of (2). The optimal Maximum Likelihood (ML) estimate of the vector μ is then given by

\hat{\mu} = (A^T R^{-1} A)^{-1} A^T R^{-1} h.    (17)

(The general framework for the optimal estimation of the moments of any order of a function f(x, y) from noisy measurements of its Radon transform is developed in [27].) Let us denote these ML moment estimates by \hat{\mu}_{pq}.

Now that we have estimates of the moments up to order 2 of the object, and thus estimates of its basic geometric structure, we need to convert this information into a suitable object for use in initializing the nonlinear optimization problem (4) or (7). The initial guess algorithm outlined next uses these ML estimates of the low-order moments \hat{\mu}_{pq}, obtained from the noisy projection data, to produce a polygon whose moments up to order 2 are close to (or in some cases equal to) those estimated from the projection data. The resulting polygon, which will be used as our initialization, should thus have the same basic geometric structure as the underlying object. Recall that in this process of generating an initial object from the moment data we want to avoid the difficulties usually associated with obtaining an object from a set of its moments [24, 26, 25, 23].
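As a concrete illustration of (14)-(17): if the projection moments H^{(k)}(θ_j) are formed from the samples Y_ij by simple quadrature (an assumption about the discretization, not spelled out in the text), their noise covariance R is diagonal, and (17) reduces to a small weighted least squares problem. A hedged sketch with our own naming:

```python
import numpy as np

def estimate_moments(Y, t, theta, sigma):
    """ML estimate (17) of [mu00, mu10, mu01, mu20, mu11, mu02].

    Y     : (len(t), len(theta)) noisy projection samples, per (2)
    sigma : standard deviation of the i.i.d. observation noise
    """
    dt = t[1] - t[0]
    c, s = np.cos(theta), np.sin(theta)
    h, rows, var = [], [], []
    for k, basis in [(0, [np.ones_like(c)]),
                     (1, [c, s]),
                     (2, [c**2, 2 * s * c, s**2])]:
        h.append(dt * (t**k) @ Y)              # quadrature H^(k)(theta_j)
        block = np.zeros((len(theta), 6))      # one row of A per angle
        cols = {0: [0], 1: [1, 2], 2: [3, 4, 5]}[k]
        for b, col in zip(basis, cols):
            block[:, col] = b
        rows.append(block)
        # variance of the quadrature sum: sigma^2 dt^2 sum_i t_i^(2k)
        var.append(np.full(len(theta), sigma**2 * dt**2 * np.sum(t**(2 * k))))
    h, A = np.concatenate(h), np.vstack(rows)
    w = 1.0 / np.sqrt(np.concatenate(var))     # whitening for diagonal R
    mu, *_ = np.linalg.lstsq(w[:, None] * A, w * h, rcond=None)
    return mu
```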
To sidestep these difficulties, the initial polygon we will use is obtained simply as an affine transformation of a reference object V_ref(N), which is a centered regular N-gon of unit area. For a given choice of the number of sides N, the reference object we use is given by:

V_{ref}(N) = \frac{1}{\sqrt{\frac{N}{2}\sin(\frac{2\pi}{N})}} \begin{bmatrix} \cos(0) & \cos(\frac{2\pi}{N}) & \cdots & \cos(\frac{2\pi(N-1)}{N}) \\ \sin(0) & \sin(\frac{2\pi}{N}) & \cdots & \sin(\frac{2\pi(N-1)}{N}) \end{bmatrix}    (18)

The affine transformation of this reference object, which will be generated from the estimated moment set, consists of a uniform scaling, a stretching along the coordinate axes, a rotation, and finally a translation, and simply corresponds to the following transformation of the underlying spatial coordinates of the reference object:

\begin{bmatrix} x' \\ y' \end{bmatrix} = L \begin{bmatrix} x \\ y \end{bmatrix} + C.    (19)

In particular, given the form of V_ref(N) in (18), this yields the following equation for the family of possible initial objects V_init:

V_{init} = L \, V_{ref}(N) + [C \; \cdots \; C].    (20)

The set of all such affine transformations of V_ref(N) we term the affinely regular N-gons [28]. In the absence of noise, the initial guess algorithm we detail will exactly match the given estimated moments if the underlying object itself happens to be affinely regular. If the underlying object is not affinely regular, the algorithm will not necessarily produce an N-gon exactly matching its moments, even in the absence of noise, though as we will show, it will usually be close. Of course, in the presence of noise the estimated moments themselves are not exact, and thus, while we would hope to get close, our resulting initial N-gon will never exactly match the true moments of the underlying object anyway.

Given that we will restrict ourselves to initial objects of the form (20), let us consider how we might choose the parameters of the transformation L and C to match a given estimated moment set \hat{\mu}_{pq}, 0 ≤ p + q ≤ 2. Using (20) and (18) to calculate the moments of V_init up to order 2, we obtain the following relationships:

\mu_{00}(V_{init}) = |\det(L)|    (21)

\begin{bmatrix} \mu_{10}(V_{init}) \\ \mu_{01}(V_{init}) \end{bmatrix} = |\det(L)| \, C    (22)

\begin{bmatrix} \mu_{20}(V_{init}) & \mu_{11}(V_{init}) \\ \mu_{11}(V_{init}) & \mu_{02}(V_{init}) \end{bmatrix} = |\det(L)| \, (k_N L L^T + C C^T)    (23)

where μ_pq(V_init) is the pq-th moment of V_init and k_N = 1/(4N tan(π/N)). Thus, to match μ_00(V_init), μ_10(V_init), and μ_01(V_init) with their estimated counterparts, the first two conditions require:

|\det(L)| = \hat{\mu}_{00}    (24)

C = \frac{1}{\hat{\mu}_{00}} \begin{bmatrix} \hat{\mu}_{10} \\ \hat{\mu}_{01} \end{bmatrix}    (25)

The first condition simply corresponds to a scaling of V_ref(N) so that its area matches the estimated one. The second condition shows that the affine term C in the transformation (19) should correspond to a translation of V_ref(N) to the estimated center of mass of the object. These two conditions assure that we match the estimated area and center of mass location. Now, after some manipulation, (23) implies that to match the estimated second order moments \hat{\mu}_{pq}, p + q = 2, we must have:

L L^T = \frac{1}{k_N \hat{\mu}_{00}} \hat{\Sigma}    (26)

where \hat{\Sigma} is the matrix of estimated central moments defined by:

\hat{\Sigma} = \begin{bmatrix} \hat{\mu}_{20} & \hat{\mu}_{11} \\ \hat{\mu}_{11} & \hat{\mu}_{02} \end{bmatrix} - \frac{1}{\hat{\mu}_{00}} \begin{bmatrix} \hat{\mu}_{10} \\ \hat{\mu}_{01} \end{bmatrix} [\hat{\mu}_{10} \;\; \hat{\mu}_{01}]    (27)

In particular, this condition implies another constraint on det(L), independent of (24), which we will not, in general, be able to satisfy. Specifically, a necessary condition for finding an L satisfying both (26) and (24) is that:

\det(\hat{\Sigma}) = k_N^2 \, \hat{\mu}_{00}^4    (28)

where the expression on the left is the determinant of the matrix of estimated central second moments. Actually, the condition (28) is also sufficient, as shown in Appendix A. Clearly, this condition will not, in general, be satisfied, and we will be unable to exactly match the estimated second moments.
In fact, the objects that do meet this constraint, and thus whose moments we can exactly match, are precisely the elements of the set of affinely regular N-gons. Geometrically, this situation reflects the limitation of our restricted object class (20), i.e. the set of affinely regular N-gons. Within this class, for a given object area we are constrained as to the "size" of the corresponding inertia matrix we may have, where inertia size is measured as the product of the principal central inertia moments (eigenvalues of the central inertia matrix). For example, while the objects of our class will always be convex polygons, for a given area we can always obtain nonconvex objects of greater inertia by "moving area outward," as in a dumbbell.

The condition (26) can also be viewed as implying a different scaling on L than (24), needed to obtain a perfect match to the inertia condition. In general, we thus have a choice of picking this scaling of L to satisfy either the inertia condition (26) or the area condition (24). Since the area is both more robustly estimated and a more fundamental geometric quantity, we choose to enforce the area condition (24) in the algorithm to follow. We then choose L so that the resulting central inertia matrix of V_init has the same principal axes directions, and its principal inertias in the same ratio, as those estimated from the data as found in \hat{\Sigma}. We accomplish these goals by using a square root of the matrix \hat{\Sigma}, normalized to have unit determinant, for the form of L, and then scaling the result to match the determinant condition (24). Thus we sacrifice overall scaling of the inertia matrix in favor of matching the estimated area. Collecting the above steps and reasoning, the overall algorithm is given by the following:

Algorithm 1 (Initial Guess)

1. Compute the optimal (ML) estimates of the moments up to order 2 (\hat{\mu}_{00}, \hat{\mu}_{10}, \hat{\mu}_{01}, \hat{\mu}_{20}, \hat{\mu}_{11}, \hat{\mu}_{02}) from the raw projection data using (14)-(17).

2. Construct an N-sided regular polygon centered at the origin with vertices chosen as the scaled roots of unity in counter-clockwise order, so that they lie on the circle of radius 1/\sqrt{\frac{N}{2}\sin(\frac{2\pi}{N})}. This polygon has unit area and is defined in Equation (18).

3. Compute the translation C, obtained as the estimated object center of mass:

C = \frac{1}{\hat{\mu}_{00}} \begin{bmatrix} \hat{\mu}_{10} \\ \hat{\mu}_{01} \end{bmatrix}    (29)

4. Form the estimated central inertia matrix \hat{\Sigma} from the estimated moments according to (27).

5. If \hat{\Sigma} is not positive definite, set L = \sqrt{\hat{\mu}_{00}} \, I_2 and go to step 8; otherwise proceed to step 6. (I_2 is the 2 × 2 identity matrix.)

6. Perform an eigendecomposition of the normalized matrix \hat{\Sigma} as follows:

U^T \frac{\hat{\Sigma}}{\sqrt{\det(\hat{\Sigma})}} U = \begin{bmatrix} \lambda & 0 \\ 0 & 1/\lambda \end{bmatrix}    (30)

where we have assumed that the eigenvalues are arranged in descending order and that the eigenvectors are normalized to unit length so that det(U) = ±1. Note that the eigenvalues are reciprocals of each other since we have scaled the left hand side so that its determinant is 1.

7. Form the linear transformation L as a scaled square root of \hat{\Sigma} as follows:

L = \sqrt{\hat{\mu}_{00}} \, U \begin{bmatrix} \sqrt{\lambda} & 0 \\ 0 & 1/\sqrt{\lambda} \end{bmatrix}    (31)

Note that |det(L)| = \hat{\mu}_{00}, as desired. Depending on whether U is a pure rotation or a rotation followed by a reflection, it will have determinant +1 or −1, respectively.

8. The initial guess V_init is now obtained by applying the scaling, stretching, and rotation transformation L and the translation C to the reference object V_ref(N) via the coordinate transformation [x', y']^T = L [x, y]^T + C. Because of the form of V_ref(N), this operation yields:

V_{init} = L \, V_{ref}(N) + [C \; \cdots \; C].    (32)
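The steps above translate directly into a short routine. The following sketch is our illustration of Algorithm 1 (names assumed, not the authors' code), taking as input the moment estimates from (17):

```python
import numpy as np

def initial_guess(mu, N):
    """Sketch of Algorithm 1: build V_init from the ML moment estimates.

    mu : (mu00, mu10, mu01, mu20, mu11, mu02), as returned by (17)
    N  : desired number of sides
    """
    m00, m10, m01, m20, m11, m02 = mu

    # Step 2: unit-area regular N-gon, vertices counter-clockwise (18)
    phi = 2 * np.pi * np.arange(N) / N
    rho = 1.0 / np.sqrt(0.5 * N * np.sin(2 * np.pi / N))
    Vref = rho * np.vstack([np.cos(phi), np.sin(phi)])

    # Step 3: translation to the estimated center of mass (29)
    C = np.array([m10, m01]) / m00

    # Step 4: estimated central inertia matrix (27)
    S = np.array([[m20, m11], [m11, m02]]) - m00 * np.outer(C, C)

    # Steps 5-7: L from a unit-determinant square root of S, scaled so
    # that |det(L)| matches the estimated area, per (24) and (31)
    if np.linalg.det(S) <= 0 or S[0, 0] <= 0:     # not positive definite
        L = np.sqrt(m00) * np.eye(2)              # step 5 fallback
    else:
        lam, U = np.linalg.eigh(S / np.sqrt(np.linalg.det(S)))
        lam, U = lam[::-1], U[:, ::-1]            # descending eigenvalues
        L = np.sqrt(m00) * U @ np.diag(np.sqrt(lam))

    # Step 8: affine transformation of the reference object (32)
    return L @ Vref + C[:, None]
```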
Note that the eigenvalue λ of the unit-determinant matrix computed in step 6 gives the eccentricity of the underlying object, while the corresponding eigenvectors give its orientation. Also note that in the presence of noise the estimated central inertia matrix \hat{\Sigma} may not be strictly positive definite, and hence may not correspond to the inertia matrix of any object at all. In such instances, the algorithm refrains from the use of these moments of order 2 and computes an initial guess based only on the estimated area and center of mass.

If the matrix L computed above is replaced by L' = LT for any orthogonal T, the resulting quantity satisfies

L L^T = L' L'^T.    (33)

Hence, although the initial guess generated by the above algorithm is unambiguous and unique, in the sense that the square root of \hat{\Sigma} obtained by the algorithm is unique, an infinity of other initial guesses having the same moments up to order 2 may be generated by replacing L by LT and allowing T to range over the set of all 2 × 2 orthogonal transformations. A precise characterization of this set of all affinely regular N-gons with the same moments up to order 2 is, in fact, given in the following result, which we prove in Appendix A.

Result 1. Consider the set of all N-gons with moments up to order 2 given by μ_00, μ_10, μ_01, μ_20, μ_11, μ_02, such that the resulting central inertia matrix

\Sigma = \begin{bmatrix} \mu_{20} & \mu_{11} \\ \mu_{11} & \mu_{02} \end{bmatrix} - \frac{1}{\mu_{00}} \begin{bmatrix} \mu_{10} \\ \mu_{01} \end{bmatrix} [\mu_{10} \;\; \mu_{01}]    (34)

satisfies det(Σ) = k_N^2 μ_{00}^4. This set coincides with the set of N-gons with vertices on an ellipse \mathcal{E}_O and sides tangent, at their midpoints, to a homofocal ellipse \mathcal{E}_I, where these ellipses are given by

\mathcal{E}_O = \{ z \mid (z - C)^T E_O (z - C) = 1 \}    (35)

\mathcal{E}_I = \{ z \mid (z - C)^T E_I (z - C) = 1 \}    (36)

with

C = \frac{1}{\mu_{00}} \begin{bmatrix} \mu_{10} \\ \mu_{01} \end{bmatrix}    (37)

E_O = \frac{\mu_{00} \cos^2(\pi/N)}{4} \, \Sigma^{-1}    (38)

E_I = \frac{\mu_{00}}{4} \, \Sigma^{-1}    (39)

[Figure 3: Illustration of Result 1]

This result states that the class of all affinely regular N-gons with a given set of moments up to order 2 is given by the class of N-gons whose vertices lie on a fixed ellipse and whose sides are tangent to a second ellipse which is homofocal with the first. (See Figure 3 for an example.) The ellipses are uniquely determined by the values of the given moments. In order to simplify the Initial Guess Algorithm, we do not search further over this family. We simply use the output of the Initial Guess Algorithm described above as the starting guess for our nonlinear optimization routines.

3 Experimental Results

In this section we present some performance studies of our proposed algorithm with simple polygonal objects as prototypical reconstructions. One may expect that our algorithms work best when the underlying object (that from which the data is generated) is itself a simple binary polygonal shape. While this is true, we will also show that our algorithms perform consistently well even when the underlying objects are complex, non-convex, and non-polygonal shapes.

First we demonstrate reconstructions based on the ML criterion. In these reconstructions we use the parameters of the true polygon as the initial guess to ensure that the solution produced by the algorithm is actually the ML estimate, i.e., at the global minimum of the ML cost function. Typical reconstructions are shown along with average performance studies for a variety of noise and data sampling scenarios. In particular, we show that the performance of our algorithms is quite robust to noise and sparsity of the data, significantly more so than classical reconstruction techniques.
To demonstrate this point, reconstructions using our techniques and the classical FBP are provided. Next we demonstrate how the MDL criterion may be used to optimally estimate the number of parameters (sides) N directly from the data. We solve these MDL problems by solving the ML problem for a sequence of values of N. To initialize each of these ML problems, a regular polygon of the desired number of sides with the same area and centroid is used. This initialization tries to ensure that the actual ML solution corresponding to the number of sides in question is most likely being found. The robustness of the MDL approach, and its ability to capture the shape information in noisy data when the underlying object is not polygonal, is also shown through polygonal reconstruction of more complicated shapes. Finally, studies are reported in which the Initial Guess algorithm is used to produce a starting guess for the optimization routines. In these studies, we show that although the performance of the overall algorithm does degrade somewhat, this degradation is not significant.

In order to quantify the performance of our proposed reconstruction algorithms, we first need to define an appropriate notion of signal-to-noise ratio (SNR). We define the SNR per sample as

\text{SNR} = 10 \log_{10} \left( \frac{\frac{1}{d} \sum_{i,j} g^2(t_i, \theta_j)}{\sigma^2} \right)    (40)

where d = m × n is the total number of observations and σ² is the variance of the i.i.d. noise w in the projection observations (2).

In all our simulations the reconstruction error is measured in terms of the percent Hausdorff distance [28] between the estimate and the true polygon or shape. The Hausdorff metric is a proper notion of "distance" between two nonempty compact sets, and it is defined as follows. Let d(p*, S) denote the minimum distance between the point p* and the compact set S:

d(p^*, S) = \inf \{ \| p^* - p \| \mid p \in S \}.    (41)

Define the ε-neighborhood of the set S as

S^{(\epsilon)} = \{ p \mid d(p, S) \le \epsilon \}.    (42)

Now given two non-empty compact sets, S_1 and S_2, the Hausdorff distance between them is defined as:

\mathcal{H}(S_1, S_2) = \inf \{ \epsilon \mid S_1 \subseteq S_2^{(\epsilon)} \text{ and } S_2 \subseteq S_1^{(\epsilon)} \}.    (43)

In essence, the Hausdorff metric is a measure of the largest distance by which the sets S_1 and S_2 differ. The percent Hausdorff distance between the true object S and the reconstruction \hat{S} is now defined as

\text{Percent Error} = 100\% \times \frac{\mathcal{H}(\hat{S}, S)}{\mathcal{H}(O, S)}    (44)

where O denotes the set composed of the single point at the origin, so that if S contains the origin, \mathcal{H}(O, S) is the maximal distance of a point in the set to the origin, and thus a measure of the set's size.
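For reference, the percent Hausdorff error (44) between two binary polygons can be approximated by sampling the filled sets on a dense grid and applying the directed Hausdorff distance in each direction, per (43). This is our own sketch (the grid extent and resolution are assumptions, and all names are ours):

```python
import numpy as np
from matplotlib.path import Path
from scipy.spatial.distance import directed_hausdorff

def percent_error(V_true, V_est, span=3.0, res=200):
    """Approximate percent Hausdorff distance (44) between the filled
    polygons with vertex matrices V_true and V_est (each (2, N))."""
    xs = np.linspace(-span, span, res)
    grid = np.array(np.meshgrid(xs, xs)).reshape(2, -1).T
    S = grid[Path(V_true.T).contains_points(grid)]    # samples of true set
    Shat = grid[Path(V_est.T).contains_points(grid)]  # samples of estimate
    h = max(directed_hausdorff(S, Shat)[0],
            directed_hausdorff(Shat, S)[0])           # symmetric, per (43)
    size = np.max(np.hypot(S[:, 0], S[:, 1]))         # H(O, S), per (44)
    return 100.0 * h / size
```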
The typical behavior of the optimal ML based reconstructions in the projection space can be seen in Figure 6, which corresponds to the hexagon of Figure 5. The top image of this figure shows the underlying projection function g(ti, 9 j, V*) of (2) for the hexagon, while the middle image shows the noisy observed data Yj,j. The object is difficult to distinguish due to the noise in the image. The bottom image 24 shows the reconstructed projections corresponding to the optimal estimate g(ti, Oj, V), which are virtually indistinguishable from those corresponding to the true object. Figure 7 shows the best FBP reconstruction of the hexagon used in Figure 5 based on 4096 projection samples of the same SNR (0) (64 angles with 64 samples per angle). For comparison, the reconstruction from this data using our algorithm is shown in Figure 8. (Note here that what constitutes the "best" FBP is somewhat subjective. We tried many different filters and visually, the best reconstruction was obtained with a Butterworth filter of order 3 with 0.15 normalized cutoff frequency.) Note that the number of samples per projection used in this reconstruction is actually more than the number used to produce the ML-based reconstruction in Figure 5. The increase in sampling was necessary because CBP produces severe artifacts if the number of views exceeds the number of samples per view [5]. The ML approach has no such difficulties, as we will see in the next section where we examine performance. In contrast to the ML-based reconstruction, the details of the hexagon are corrupted in the FBP reconstruction. In addition, there are spurious features in the FBP reconstructions and perhaps most importantly, to extract a binary object from the FBP reconstruction, we would need to threshold the image or perform edge detection on it. Neither of these postprocessing steps are easily interpretable in an optimal estimation framework and, of course, they incur even more computational costs. 3.1.2 Effect of Noise on Performance The average performance of the ML based reconstructions is presented through several Monte-Carlo studies. Again, for these experiments an initial condition equal to the true object was used in each case to ensure us of obtaining the actual ML estimates. The first study establishes average reconstruction errors at various SNR's for a fixed number of data samples. The purpose of these simulations is to demonstrate that the ML-based reconstructions are robust to the quality of the data used for a wide range of SNR's. The same two polygons as in Figures 4 and 5 were chosen as the underlying objects. Again, in each case, 1000 samples of the projections of these objects were collected in the form of 50 equally spaced projections in the interval (0, 7r] (m=50), and 20 samples per projection (n=20) while the field of view (extent of measurements in the 25 3 1 / : -2 -3 -3 -2 -1 0 1 2 3 Figure 4: Triangle example. SNR= 0 dB, 50 views, 20 samples/view: True(-), Reconstruction(- -). %Error =7.2 3 2 O0 -1 -2 -3I -3 Figure 5: Hexagon example. %Error=9.6 -2 I -1 0 1 2 SNR=- 0 dB 50 views, 20 samples/view: 26 3 True(-), Reconstruction(- -). *·:· '*';';, . *c*:.i:i8i:::2:·'::: . '. .. '...*. *.;.." ;.. .. *.*.*.*.*.*.*.*.*.... ... ::::.. . ....... .~:~ Figure 6: From Top to Bottom: Sinograms with 50 projections and 20 samples per projection of I) Noiseless Hexagon, II) Noisy data at 0 dB III) Reconstructed Hexagon. 
The samples g(t_i, θ_j, V*) were then corrupted by Gaussian white noise w(t_i, θ_j) of different intensities to yield data sets at several SNR's. At each SNR, 100 reconstructions were done using independent sample paths of the corrupting white noise. The average reconstruction error was then computed and is displayed against the SNR in Figure 9. The error bars denote the 95% confidence intervals for the computed mean values.

[Figure 9: Mean performance curves for ML reconstructions of a triangle and a hexagon]

The percent error in these reconstructions increases with decreasing SNR, as one would expect. In fact, the graph shows that, at least in the examined SNR range of −4.35 to +4.35 dB, the relation between percent error and SNR is roughly linear in the cases of the triangle and the hexagon. This suggests that the performance of our algorithm does not degrade very fast with decaying SNR, demonstrating the robustness to noise of such object-based optimal ML estimates.
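Monte-Carlo studies such as these (and those in the next subsection) require synthesizing noise at a prescribed per-sample SNR; inverting (40) for σ gives the required noise level. A minimal sketch, with our own naming:

```python
import numpy as np

def noisy_data(G, snr_db, rng=np.random.default_rng()):
    """Corrupt noiseless projections G = g(t_i, theta_j, V*) with white
    Gaussian noise at a prescribed per-sample SNR, by inverting (40)."""
    sigma = np.sqrt(np.mean(G**2) * 10.0 ** (-snr_db / 10.0))
    return G + sigma * rng.standard_normal(G.shape), sigma

# e.g., 100 Monte-Carlo trials at 0 dB:
# for trial in range(100):
#     Y, sigma = noisy_data(G, snr_db=0.0)
#     ... run the ML reconstruction and record the percent error ...
```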
3.1.3 Effect of Sampling on Performance

Here the performance of our ML-based estimates with respect to both the number of available data samples and their distribution is studied. One would naturally expect that as the number of available data points decreases, the reconstruction error should increase. The main aim of these simulations is to demonstrate that the ML-based reconstructions are robust to both the quantity and the distribution of data over a wide range of SNR's. In particular, reasonable estimates are produced even with a drastic reduction of data and, unlike the behavior seen in FBP reconstructions, the ML estimates display no catastrophic degradation as the samples per angular view are reduced. The relative sensitivity of the ML estimates to the density of samples of g(t, θ, V*) in t and θ is also discussed, providing information of use for the design of sampling strategies.

The true hexagon used in Figure 5 was again used as the underlying object. As before, an initial condition equal to the true object was used for each of the experiments, to assure us of obtaining the actual ML estimates. A series of Monte-Carlo simulations (50 runs for each sampling configuration) were then performed at various SNR's to observe the effect of sparse projections and sparse sampling in each projection. In Figure 10, the percent Hausdorff reconstruction error is plotted versus the number of angular views for SNR's of 0, 4.35, and 8.7 dB, while the number of samples per view was fixed at 50. With a modest 50 samples per view, all three curves fall below 10% reconstruction error when the number of views is greater than about 10. This is only 500 total observations, many of which do not contain the object at all (since the field of view is twice as large as the object). Furthermore, as the number of angular views is decreased from 100 to 10, only a marginal increase in the reconstruction error is observed. These observations testify to the robustness of optimal ML estimates with respect to the number of views.

[Figure 10: Performance as a function of Number of Views]

In Figure 11, the dual case is presented. In this figure the percent Hausdorff reconstruction error is plotted versus the number of samples per view for SNR's of 0, 4.35, and 8.7 dB, while the number of angular views was fixed at 50. With 50 angular views, all curves fall below 10% reconstruction error when the number of samples per view is greater than only 10. Also, as the number of samples per view is decreased from 100 to 10, again only a marginal increase in the reconstruction error is observed. This behavior shows that the optimal ML estimates are robust with respect to the number of samples per view.

[Figure 11: Performance as a Function of Number of Samples per View]

Note that for a fixed sampling strategy, the reconstruction error increases only slightly as the SNR is decreased over a wide range. For instance, in Figure 10, with 40 angular views and 50 samples per view, the percent error is reduced by only about 5 percentage points as the SNR goes from 0 to 8.7 dB. Finally, it is noteworthy that the reconstruction error enjoys a dramatic improvement for all SNR's (0, 4.35, and 8.7 dB) when the number of samples per view is increased from 5 to 10. This improvement is more significant than that observed in Figure 10 when the number of views is increased from 5 to 10. This behavior indicates that in a scenario where only a small (fixed) number of sample points can be collected, it is more beneficial to have more samples per view rather than more views.

3.2 MDL Reconstructions

Here we will examine reconstruction under the MDL criterion of (7), where we now assume that the number of sides of the reconstructed polygon is unknown. In particular, the reconstruction experiments for the hexagon in Figure 5 were repeated at SNR = 0 dB assuming no knowledge of the number of sides. The MDL criterion was employed to estimate the optimal number of sides. As in the ML algorithm, it is important to find a good initial guess for the MDL algorithm as well. The problem is twofold. First, a reasonable guess must be made as to the appropriate range of the number of sides. We picked a fairly small range for the number of sides of the reconstruction; typically, 3 to 10 sides. Next, for each number of sides, the Initial Guess algorithm was used to produce an initial guess for the optimization routine. The method for selecting the range of the number of sides is ad hoc, but was shown to be reliable in the sense that, for our simulations, the MDL cost never showed local or global minima for numbers of sides larger than 10. A sketch of this outer loop over N appears below.
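The sketch chains the helpers from the earlier sketches (estimate_moments, initial_guess, project_polygon, and a polygon variant of ml_estimate following the same least-squares pattern); all of these names are our assumptions. Note that (7) is stated in terms of the number of parameters; for an N-gon we take that to be the 2N vertex coordinates, which is our reading rather than something the text spells out.

```python
import numpy as np

def mdl_reconstruct(Y, t, theta, sigma, n_range=range(3, 11)):
    """Sketch of the MDL outer loop (7): solve the ML problem for each
    candidate number of sides N and keep the minimum-cost model."""
    d = Y.size                        # total number of samples, d = mn
    mu = estimate_moments(Y, t, theta, sigma)
    best_cost, best_V = np.inf, None
    for N in n_range:
        V0 = initial_guess(mu, N)     # Algorithm 1 initialization
        V = ml_estimate(Y, t, theta, V0)
        resid = np.sum((Y - project_polygon(V, t, theta)) ** 2)
        # (7), with the parameter count taken as 2N coordinates
        cost = resid / sigma**2 + 2 * N * np.log(d)
        if cost < best_cost:
            best_cost, best_V = cost, V
    return best_V
```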
[Figure 12: Cost vs number of sides for the hexagon in Figure 5]

Figure 12 shows a plot of the MDL cost corresponding to the expression in (7) versus the number of sides for a sample reconstruction of the hexagon in Figure 5. It can be seen that the minimum occurs at N = 6, demonstrating that the optimal MDL reconstruction consists of 6 sides. Indeed this number coincides with the true number of sides of the underlying object. The optimal MDL estimate is thus exactly the optimal ML estimate for this data set presented before.

3.3 Polygonal Reconstruction of Non-polygonal Objects

In this section we wish to study the robustness of MDL-based estimates when the underlying, true object is non-polygonal. First we examine the case of an elliptical object. We use the MDL formulation presented in the previous section and study the behavior of the optimal reconstructions at two different SNR's. To this end, let the true object (that which generated the data) be a binary ellipse whose boundary is given by:

(x - \tfrac{1}{2})^2 + \frac{(y + \tfrac{1}{2})^2}{9/4} = 1    (45)

The above relation defines an ellipse centered at the point (1/2, −1/2) whose minor and major axes are aligned with the coordinate axes, with semi-axis lengths 1 and 3/2, respectively. One thousand (1000) noisy samples of the Radon transform of this ellipse were generated (m=50 equally spaced angular views in (0, π], and n=20 samples per view) at SNR's of 0 and 2.17 dB, respectively, for 50 different sample paths of the corrupting noise. For each set of data, reconstructions were performed using the ML algorithm with 3, 4, 5, 6, 7, and 8 sides, together with the initial guess algorithm. The MDL cost in (7) was then computed for each of these reconstructions. The ensemble mean of this cost over the 50 runs, for each value of N, is presented in the top part of Figure 13. The error bars denote the 95% confidence intervals for the computed mean values.

[Figure 13: Minimum MDL costs and Sample Reconstructions for an Ellipse]

The top left curve, corresponding to the SNR = 2.17 dB case, displays its minimum at N = 5. This behavior indicates that the average optimal MDL reconstruction uses 5 sides at this noise level. A corresponding typical five-sided reconstruction of the ellipse is displayed in the lower left plot of Figure 13, together with the true ellipse. The upper right curve, corresponding to the SNR = 0 dB case, displays its minimum at N = 6, which indicates that the average optimal MDL reconstruction for this case uses 6 sides. The MDL cost curve for this lower SNR case has become quite flat, however, showing that the reconstructions with N from 4 to 6 are all about equally explanatory of the data. Although the curves for both cases demonstrate the ability of the MDL procedure to capture the shape's complexity through its choice of N, this behavior suggests that with increasing noise intensity, an MDL-based estimate becomes less sensitive to the precise level of complexity of the reconstruction, as we would expect. Apparently, in high noise situations the differences between these reconstructions that would be apparent in high SNR scenarios are masked. As the noise level increases, these fine distinctions are unimportant or not supported by the data. A typical 6-sided reconstruction is also displayed in the lower right plot of Figure 13, along with the true ellipse.

As another example of the robustness of the ML-based estimates when the underlying object is non-polygonal, we present a sample reconstruction of a non-polygonal object that is also non-convex. In Figure 14 a typical reconstruction of this object is presented, based on 20 equally spaced projections with 50 samples per projection, at a signal-to-noise ratio of 10. To ensure that the reconstruction was not a local minimum, the vertices of the initial guess were chosen by eye at points where the underlying object had curvature extrema.
Furthermore, the number of sides was picked, somewhat arbitrarily, according to the number of apparent curvature extrema of the true object. Figure 15 shows a reconstruction of the same kidney-shaped object at an SNR of 4.35 dB, with the initial guess chosen by the Initial Guess Algorithm. Figure 16 contains the reconstruction produced by FBP using the same data set. As in our other examples, the underlying object has been captured more accurately, and without spurious features, through the use of our algorithm.

[Figure 14: True object (-), reconstruction (-.), initial guess (o) picked by eye]

[Figure 15: True object (-), reconstruction (-.), initial guess (o) picked using Initial Guess Algorithm]

[Figure 16: FBP Reconstruction of non-polygonal, non-convex Object: 3rd order Butterworth filter with 0.15 normalized cutoff frequency, SNR = 4.35 dB]

3.4 Initial Guess Algorithm

In this section we present some sample reconstructions and performance plots for which we use the Initial Guess algorithm to generate a starting point for the nonlinear optimization (3). To study the average performance of the ML algorithm using the output of the Initial Guess Algorithm, a Monte-Carlo simulation was done for the reconstruction of the hexagon shown in Figure 5. One hundred reconstructions were carried out for different realizations of the noise at various SNR's, each with 1000 projection samples as before (50 projections and 20 samples per projection). For each SNR, on average fewer than 5 percent of the reconstructions (i.e., 5 out of 100 sample reconstructions) had very large reconstruction errors (we call these instances outliers).

[Figure 17: A sample path of the reconstruction error at SNR = 0 dB]

Figure 17 shows the reconstruction errors for 100 realizations of the noise at 0 dB. The outliers are clearly visible. Figure 17 indicates that in a few instances, the reconstructions were essentially at local minima very far from the global minimum of the cost. In our experience, these outliers occur most frequently when poor estimates of the moments of order 2 are obtained from the noisy data. Note that the second order moments are used in the Initial Guess Algorithm only if the corresponding inertia matrix obtained from them is strictly positive definite. The Initial Guess Algorithm decides whether to use the second order moments solely on the basis of this positive definiteness, regardless of how close the inertia matrix may be to negative- or indefiniteness. Hence, the outliers occur in those rare instances when the estimated inertia matrix happens to be a very poor estimate, yet is positive definite (and hence used in the Initial Guess Algorithm). This phenomenon, in turn, seems to occur when relatively few samples per projection are available. Figure 18 shows the mean percent error in the Monte-Carlo runs after the removal of the outliers.
The resulting "mean" performance is plotted in Figure 18 to show the average performance without the effect of the outliers.

Figure 18: Percent error for hexagon vs SNR (dB per sample) after outlier removal

It can be seen, upon comparing Figure 18 with the corresponding performance curve in Figure 9, that the performance of the ML algorithm using the output of the Initial Guess Algorithm still suffers even after discounting the obvious outliers. This means that instances of convergence to local minima still occur; but note that, at least from a visual standpoint, the average performance after the removal of outliers is not significantly different from the average performance with the actual polygon as the initial guess. In particular, the degradation in performance here is roughly 7 percentage points in the Hausdorff norm over the given SNR range. This corresponds to a small visual error, as can be seen in Figure 4. From this observation, we conclude that even though the initial guess algorithm does not always lead to convergence to the global minimum, it almost always leads to, at least, a local minimum that is fairly close to the global minimum of the ML cost function. Typical reconstructions at local minima which are close to the global minimum of the cost are shown in Figure 19 for SNR = 0 dB and in Figure 20 for SNR = 4.35 dB.

Figure 19: A typical reconstruction at a local minimum with SNR = 0 dB: True(-), Reconstruction(-.), Initial Guess(+)

Figure 20: A typical reconstruction at a local minimum with SNR = 4.35 dB: True(-), Reconstruction(-.), Initial Guess(+)

4 Conclusions

4.1 Conclusions from Experiments

Several conclusions may be drawn from the experimental results presented in this paper. The optimal estimates based on the ML criterion produce reconstructions that are highly robust to noise and to the sparsity and distribution of the data. This behavior is essentially a direct consequence of the fact that ML-based techniques focus all of the available information on the task at hand, as represented in the parameters of the underlying object. In contrast, classical approaches such as the FBP algorithm must spread this information over all the pixels in an image. Extending the ML approach through the MDL principle allows for the automatic determination of the optimum number of parameters needed to describe the given data set. The experiments verify the utility of this method. The ML-based approaches are also able to produce reconstructions for a wide variety of objects.

The drawback of such statistically optimal ML and MDL approaches is that, in contrast to the linear formulations of classical reconstruction algorithms, they lead to highly non-linear optimization problems. This fact makes the issue of computing a good initial guess for the nonlinear optimization routines an important one.
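To make the nonlinearity concrete, the sketch below spells out this kind of ML cost for a unit-density polygon: the Radon transform of the polygon's indicator function at a given ray is simply the chord length the ray cuts, so under white Gaussian noise the ML criterion is a least-squares fit of chord lengths to the measured projections. The use of the `shapely` package for the chords and of a derivative-free Nelder-Mead search are illustrative choices on our part, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize
from shapely.geometry import LineString, Polygon

def polygon_projections(verts, thetas, ts, half_len=10.0):
    # Radon transform of a unit-density polygon: the projection at (t, theta)
    # is the length of the chord cut by the line x cos(theta) + y sin(theta) = t.
    # Assumes a simple (non-self-intersecting) vertex ordering and an object
    # contained in a disk of radius half_len.
    poly = Polygon(verts)
    g = np.zeros((len(thetas), len(ts)))
    for i, th in enumerate(thetas):
        n = np.array([np.cos(th), np.sin(th)])    # unit normal of the ray family
        d = np.array([-np.sin(th), np.cos(th)])   # direction along each ray
        for j, t in enumerate(ts):
            chord = LineString([t * n - half_len * d, t * n + half_len * d])
            g[i, j] = chord.intersection(poly).length
    return g

def ml_cost(x, g_meas, thetas, ts):
    # Least-squares ML cost under white Gaussian noise; x stacks the 2N
    # vertex coordinates. Highly non-convex in x.
    return np.sum((polygon_projections(x.reshape(-1, 2), thetas, ts) - g_meas) ** 2)

# res = minimize(ml_cost, v_init.ravel(), args=(g_meas, thetas, ts),
#                method="Nelder-Mead")   # the starting point v_init is critical
```

Even for a hexagon this cost is searched over twelve coupled vertex coordinates, which is precisely why the choice of starting point matters.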
We provided a way to circumvent this difficulty through a simple initial guess algorithm based on the estimated moments of the object. In particular, the coarse geometric information carried in the projections can easily be extracted in the form of moments, which are then used to generate the initial guess. The efficacy of this algorithm over a range of SNR's was demonstrated.

4.2 Overall Conclusions

In this paper, we studied statistical techniques for the reconstruction of finitely parameterized geometric objects. In particular, we focused on the reconstruction of binary polygonal objects. The reconstruction of such objects was posed as a parameter estimation problem, for which the Maximum Likelihood technique was proposed as a solution. In contrast to the classical techniques, such as FBP, the ML-based reconstructions showed great robustness to noise and to the loss and distribution of data. The drawback of such ML-based formulations is that the resulting optimization problems are highly non-linear, and thus a good initial guess is necessary to ensure convergence of optimization routines to the true ML estimate. To this end, an algorithm was presented for computing such a reasonable initial guess using moments of the object, which are estimated directly from the projection data. While estimation of a function from its moments is, in general, a difficult and ill-posed problem, we avoid these problems by using the noisy estimated moments only to guide a coarse object estimate. This estimate, in turn, mitigates the difficulties associated with the non-linearities of the optimal ML statistical approach. The efficacy of this moment-based initial guess algorithm was demonstrated over a range of SNR's.

If the number of parameters describing the underlying object is not known, a Minimum Description Length criterion can be employed that simply generalizes the ML framework to penalize the use of an excessively large number of parameters for the reconstruction. The MDL approach was shown to work successfully in estimating the number of sides and the underlying object itself for low signal-to-noise ratio situations and for a variety of sampling scenarios. It was further demonstrated that if the underlying object is not polygonal, but still binary, the proposed ML and MDL algorithms are still capable of producing polygonal reconstructions which reasonably capture the object shape in the presence of high noise intensity and sparsely sampled data.

In this work we have focused on the reconstruction of binary polygonal objects parameterized by their vertices. The ML and MDL-based techniques used here may also be applied to more general object parameterizations. In particular, while we used the (estimated) moments of the object only as the basis for generating an initial guess, it is, in some cases, possible to actually parameterize the object entirely through its moments. For instance, Davis [29] has shown that a triangle in the plane is uniquely determined by its moments up to order 3, while in [27] we have generalized this result to show that the vertices of any simply connected nondegenerate N-gon are uniquely determined by its moments up to order 2N - 3. More generally, a square integrable function defined over a compact region of the plane is completely determined by the entire set of its moments [26, 24, 23]. In reality we will only have access to a finite set of these moments, and these numbers, coming from estimates, will themselves be inexact and noisy. While estimation of the moments of a function from its projections is a convenient linear problem (as sketched below), inversion of the resulting finite set of moments to obtain the underlying function estimate is a difficult and ill-posed problem.
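As an illustration of this linear step, the sketch below estimates the moments up to order 2 from sampled projections, using the fact (see [22]) that the k-th moment of the projection at angle θ equals the order-k moment of f taken along direction θ; each order then reduces to a small linear least-squares problem across the views. The rectangular-rule quadrature and uniform t-sampling are our simplifications.

```python
import numpy as np

def moments_from_projections(g, thetas, ts):
    # g[i, j] ~ g(ts[j], thetas[i]): noisy Radon-transform samples.
    # Uses  int g(t, th) t^k dt = iint f(x, y) (x cos th + y sin th)^k dx dy.
    dt = ts[1] - ts[0]                       # assumes uniform sampling in t
    c, s = np.cos(thetas), np.sin(thetas)
    H0 = g.sum(axis=1) * dt                  # = mu00 at every view
    H1 = (g * ts).sum(axis=1) * dt           # = mu10 c + mu01 s
    H2 = (g * ts ** 2).sum(axis=1) * dt      # = mu20 c^2 + 2 mu11 c s + mu02 s^2
    mu00 = H0.mean()
    mu10, mu01 = np.linalg.lstsq(np.column_stack([c, s]), H1, rcond=None)[0]
    A2 = np.column_stack([c ** 2, 2 * c * s, s ** 2])
    mu20, mu11, mu02 = np.linalg.lstsq(A2, H2, rcond=None)[0]
    return mu00, (mu10, mu01), np.array([[mu20, mu11], [mu11, mu02]])
```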
These observations suggest a spectrum of ways in which to use moments in our reconstruction problems. At one extreme, only a few moments are used in a sub-optimal way to generate a simple initialization for the solution of a hard, non-linear estimation problem. At the other extreme, the moments are themselves used in an optimal reconstruction scheme. In [27] we have studied regularized variational formulations for the reconstruction of a square integrable function from noisy estimates of a finite number of its moments. We have also studied array-processing based algorithms for the reconstruction of binary polygonal objects from a finite number of their moments.

A Theoretical Results on the Initial Guess Algorithm

In this section we present some theoretical justification for the initial guess algorithm. To start, we state some elementary properties of the unit area polygons V_ref(N) whose vertices are the scaled Nth roots of unity (in counter-clockwise direction) as defined by (18). From [27], it is a matter of some algebraic manipulation to show that the regular polygon V_ref(N) has moments of up to order 2 given by

\mu_{00}(V_{ref}(N)) = 1 \qquad (46)
\mu_{10}(V_{ref}(N)) = \mu_{01}(V_{ref}(N)) = 0 \qquad (47)
\mu_{20}(V_{ref}(N)) = \mu_{02}(V_{ref}(N)) = \frac{1}{4 N \tan(\pi/N)} \triangleq k_N \qquad (48)
\mu_{11}(V_{ref}(N)) = 0 \qquad (49)

Now let V_init be an affine transformation of V_ref as

V_{init} = L V_{ref}(N) + [\, C \mid C \mid \cdots \mid C \,] \qquad (50)

for some linear transformation L and some 2 x 1 vector C. Let O(V_init) denote the closed, binary polygonal region enclosed by the N-gon V_init. Now, by considering the change of variables z = Lu, and dropping the explicit dependences on N, we have

\mu_{00}(V_{init}) = \int_{O(V_{init})} dz \qquad (51)
 = \int_{O(V_{ref})} |\det(L)| \, du \qquad (52)
 = \mu_{00}(V_{ref}) |\det(L)| = |\det(L)| \qquad (53)

Similarly, we get

[\mu_{10}(V_{init}) \;\; \mu_{01}(V_{init})]^T = \left( L [\mu_{10}(V_{ref}) \;\; \mu_{01}(V_{ref})]^T + C \right) |\det(L)| = |\det(L)| \, C \qquad (54)

and

\mathcal{I}(V_{init}) = \left( L \mathcal{I}(V_{ref}) L^T + C C^T \right) |\det(L)| = \left( k_N L L^T + C C^T \right) |\det(L)| \qquad (55)

where for any N-gon V we write

\mathcal{I}(V) \triangleq \begin{pmatrix} \mu_{20}(V) & \mu_{11}(V) \\ \mu_{11}(V) & \mu_{02}(V) \end{pmatrix} \qquad (56)

This proves relations (21), (22), and (23).

We next establish an explicit description of the set of all affinely regular N-gons with a fixed set of moments up to order 2. In order to do this, we first need to prove a lemma.

Lemma 1: For every N-gon V with moments \mu_{00}, \mu_{10} = 0, \mu_{01} = 0, \mu_{20}, \mu_{11}, \mu_{02}, such that the inertia matrix \mathcal{I} satisfies \det(\mathcal{I}) = k_N^2 \mu_{00}^4, there exists a matrix L, unique up to some orthogonal transformation, such that V = L V_{ref}.

Proof: The assumptions that \mu_{10} = 0 and \mu_{01} = 0 are made without loss of generality and to facilitate the presentation of the proof. Having said this, we define L as the scaled (unique) square root of \mathcal{I} as follows. First, write the eigendecomposition

\mathcal{I} / \sqrt{\det(\mathcal{I})} = U S^2 U^T, \qquad (57)

where U is orthogonal and S has unit determinant. Define L as

L = \sqrt{\mu_{00}} \, U S. \qquad (58)

The moments of V = L V_ref are then given by

\mu_{00}(V) = \mu_{00} \qquad (59)
\mu_{10}(V) = \mu_{01}(V) = 0 \qquad (60)
\mathcal{I}(V) = k_N L L^T |\det(L)| \qquad (61)

Note that

\det(\mathcal{I}(V)) = k_N^2 \mu_{00}^4 \qquad (62)

as required. If L is replaced by LT, where T is any 2 x 2 orthogonal transformation, the same moments are obtained. Hence the lemma is established. □
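A minimal numerical sketch of this construction, which is the computation at the heart of the Initial Guess Algorithm; it assumes the estimated inertia matrix is strictly positive definite, and `numpy.linalg.eigh` supplies the eigendecomposition of (57):

```python
import numpy as np

def shape_matrix_from_moments(mu00, inertia, N):
    # Build L = sqrt(mu00) U S from the normalized eigendecomposition
    # I / sqrt(det I) = U S^2 U^T of (57)-(58). Assumes inertia is
    # symmetric and strictly positive definite.
    kN = 1.0 / (4.0 * N * np.tan(np.pi / N))       # k_N as in (48)
    I_norm = inertia / np.sqrt(np.linalg.det(inertia))
    evals, U = np.linalg.eigh(I_norm)              # det(I_norm) = 1, so evals multiply to 1
    S = np.diag(np.sqrt(evals))                    # unit-determinant square root
    return np.sqrt(mu00) * U @ S, kN

def reference_polygon(N):
    # Unit-area regular N-gon V_ref(N): scaled N-th roots of unity,
    # counter-clockwise, returned as a 2 x N array of vertices.
    ang = 2.0 * np.pi * np.arange(N) / N
    r = np.sqrt(2.0 / (N * np.sin(2.0 * np.pi / N)))  # circumradius for unit area
    return r * np.vstack([np.cos(ang), np.sin(ang)])

# The initial guess is then V_init = L @ reference_polygon(N) + center,
# with center = [mu10, mu01]^T / mu00 broadcast over the columns, per (50)-(54).
```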
Given this lemma, we obtain an interesting geometric representation of all affinely regular N-gons that have a prespecified set of moments of up to order 2. This characterization is given by Result 1, on page 21, which we prove next.

Proof of Result 1: For the sake of simplicity, and without loss of generality, we carry out the proof for the case where all polygons are centered at the origin. Let S_1 denote the set of all N-gons whose moments up to order 2 are \mu_{00}, \mu_{10} = \mu_{01} = 0, \mu_{20}, \mu_{11}, \mu_{02}. Let S_2 denote the set of all N-gons with vertices on the ellipse z^T \Sigma_0^{-1} z = 1, and sides tangent (at their mid-points) to the ellipse z^T \Sigma_1^{-1} z = 1. We show that S_1 = S_2.

First consider an N-gon V \in S_1. V has moments \mu_{00}, 0, 0, \mathcal{I}, and therefore, by Lemma 1, there exists an L given by (57) and (58), unique up to some orthogonal matrix T_1, such that we can write

V = (L T_1) V_{ref}(N) \qquad (63)

Let us denote the N-gons V and V_ref(N) explicitly in terms of their columns as

V = [\, v_1 \mid v_2 \mid \cdots \mid v_N \,] \qquad (64)
V_{ref}(N) = [\, w_1 \mid w_2 \mid \cdots \mid w_N \,] \qquad (65)

so that

v_j = L T_1 w_j. \qquad (66)

It is easy to show from the definition of V_ref(N) that

w_j^T w_j = \frac{2}{N \sin(2\pi/N)} \triangleq \alpha_N \qquad (67)
\frac{(w_{j+1} + w_j)^T (w_{j+1} + w_j)}{4} = \frac{1}{N \tan(\pi/N)} = 4 k_N \qquad (68)

Now, to show that V \in S_2, we prove that

v_j^T \Sigma_0^{-1} v_j = 1 \qquad (69)
\frac{(v_{j+1} + v_j)^T}{2} \Sigma_1^{-1} \frac{(v_{j+1} + v_j)}{2} = 1 \qquad (70)

for j = 1, 2, ..., N, where by convention v_{N+1} = v_1. Using (57), (58), and the definitions of \Sigma_0 and \Sigma_1 in Result 1, we can write

v_j^T \Sigma_0^{-1} v_j = \frac{\mu_{00} k_N}{\alpha_N} \, v_j^T \mathcal{I}^{-1} v_j \qquad (71)
 = \frac{\mu_{00} k_N}{\alpha_N} \cdot \frac{1}{k_N \mu_{00}^2} \cdot \mu_{00} \, w_j^T T_1^T S U^T U S^{-2} U^T U S T_1 w_j \qquad (72)
 = \frac{1}{\alpha_N} \, w_j^T w_j \qquad (73)
 = \frac{\alpha_N}{\alpha_N} = 1. \qquad (74)

Similarly,

\frac{(v_{j+1} + v_j)^T}{2} \Sigma_1^{-1} \frac{(v_{j+1} + v_j)}{2} = \frac{1}{4 k_N \mu_{00}} \cdot \mu_{00} \cdot \frac{(w_{j+1} + w_j)^T}{2} T_1^T S U^T U S^{-2} U^T U S T_1 \frac{(w_{j+1} + w_j)}{2} \qquad (75)
 = \frac{1}{4 k_N} \cdot \frac{(w_{j+1} + w_j)^T (w_{j+1} + w_j)}{4} = \frac{4 k_N}{4 k_N} = 1. \qquad (76)

Hence, V \in S_2. Now assume that V \in S_2. Then, by assumption,

v_j^T \Sigma_0^{-1} v_j = 1 \qquad (77)
\frac{(v_{j+1} + v_j)^T}{2} \Sigma_1^{-1} \frac{(v_{j+1} + v_j)}{2} = 1 \qquad (78)

Define the vertices of the related N-gon Z as

z_j = \frac{1}{\sqrt{\mu_{00} \alpha_N}} \, S^{-1} U^T v_j, \qquad (79)

where S and U are given by the normalized eigendecomposition of \mathcal{I} in (57). Writing (77) and (78) in terms of z_j, after some algebraic manipulation, we get

z_j^T z_j = 1 \qquad (80)
\frac{(z_{j+1} + z_j)^T (z_{j+1} + z_j)}{4} = \cos^2(\pi/N) \qquad (81)

From these identities, again with some algebraic manipulation, it easily follows that Z is equilateral. Specifically,

\| z_{j+1} - z_j \| = 2 \sin(\pi/N) \qquad (82)

Since the above identities show that the N-gon Z is a regular N-gon inscribed in the unit circle, it must be related to V_ref(N) through a scaling and some orthogonal transformation T_2. In particular,

Z = \sqrt{\frac{N \sin(2\pi/N)}{2}} \; T_2 V_{ref}(N) = \alpha_N^{-1/2} T_2 V_{ref}(N) \qquad (83)

This in turn shows that

\frac{1}{\sqrt{\mu_{00} \alpha_N}} \, S^{-1} U^T V = \alpha_N^{-1/2} T_2 V_{ref}(N) \qquad (84)

or, after solving for V and simplifying,

V = \sqrt{\mu_{00}} \, U S \, T_2 V_{ref}(N). \qquad (85)

Letting L = \sqrt{\mu_{00}} \, U S, we obtain

V = (L T_2) V_{ref}(N). \qquad (86)

This last identity implies that V has moments \mu_{00}(V) = \mu_{00}, \mu_{10}(V) = \mu_{01}(V) = 0, and

\mathcal{I}(V) = k_N L L^T |\det(L)| \qquad (87)

with \det(\mathcal{I}(V)) = k_N^2 \mu_{00}^4. Hence V \in S_1, and the result is established. □

If \mathcal{I} is not the inertia matrix of an affinely regular N-gon, then the L constructed in the Initial Guess Algorithm will not have the prescribed inertia matrix. We are, however, able to explicitly compute the approximation error in the following way.

Result 2: Suppose that the moments \mu_{00}, \mu_{10} = \mu_{01} = 0, and

\mathcal{I} = \begin{pmatrix} \mu_{20} & \mu_{11} \\ \mu_{11} & \mu_{02} \end{pmatrix} \qquad (88)

are given such that \det(\mathcal{I}) = k_N^2 \mu_{00}^4 + \epsilon > 0. Define

L = \sqrt{\mu_{00}} \, U S \qquad (89)

where

\mathcal{I} / \sqrt{\det(\mathcal{I})} = U S^2 U^T \qquad (90)

is the normalized eigendecomposition of \mathcal{I}. Then the normalized Frobenius-norm error is given by

\frac{\| \mu_{00} k_N L L^T - \mathcal{I} \|_F}{\| \mathcal{I} \|_F} = \frac{\left| k_N \mu_{00}^2 - \sqrt{k_N^2 \mu_{00}^4 + \epsilon} \right|}{\sqrt{k_N^2 \mu_{00}^4 + \epsilon}} \qquad (91)

Proof: Letting A = \mu_{00} k_N L L^T and B = \mathcal{I}, we can write

A = a \, U S^2 U^T, \qquad (92)
B = b \, U S^2 U^T, \qquad (93)

where

S^2 = \begin{pmatrix} \lambda & 0 \\ 0 & 1/\lambda \end{pmatrix}, \qquad (94)
a = k_N \mu_{00}^2, \qquad (95)
b = \sqrt{k_N^2 \mu_{00}^4 + \epsilon}. \qquad (96)

Hence we have

\| A - B \|_F = \| (a - b) S^2 \|_F = |a - b| \sqrt{\lambda^2 + \lambda^{-2}}, \qquad (97)
\| B \|_F = |b| \sqrt{\lambda^2 + \lambda^{-2}}. \qquad (98)

Hence,

\frac{\| A - B \|_F}{\| B \|_F} = \frac{|a - b|}{|b|} = \frac{\left| k_N \mu_{00}^2 - \sqrt{k_N^2 \mu_{00}^4 + \epsilon} \right|}{\sqrt{k_N^2 \mu_{00}^4 + \epsilon}}, \qquad (99)

which establishes the result. □
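A short numerical check of (91) and (99); the grid over ε is our illustrative choice, and with μ00 = 1 the two curves reproduce the behavior reported in Figure 21.

```python
import numpy as np

def relative_moment_error(eps, N, mu00=1.0):
    # Normalized Frobenius error of Result 2, eq. (99):
    # |a - b| / b with a = kN mu00^2 and b = sqrt(kN^2 mu00^4 + eps).
    kN = 1.0 / (4.0 * N * np.tan(np.pi / N))
    a = kN * mu00 ** 2                       # eq. (95)
    b = np.sqrt(kN ** 2 * mu00 ** 4 + eps)   # eq. (96)
    return np.abs(a - b) / b

eps = np.linspace(0.0, 1.0, 101)
for N in (3, 1000):
    err = relative_moment_error(eps, N)
    assert err.max() < 1.0    # bounded by 1, as noted in the text
```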
Figure 21: Relative error in matching second order moments using the Initial Guess Algorithm (curves for N = 3 and N = 1000, plotted against ε)

We have plotted the expression for the relative error in Figure 21 for N = 3 and N = 1000, assuming that μ00 = 1. This figure shows that although the relative error grows quite fast as ε is increased, it never exceeds the maximum of 1 (i.e., 100 percent) for a fixed μ00. Also, the relative errors for different numbers of sides are seen to be very close.

References

[1] M. Bergstrom, J. Litton, L. Ericksson, C. Bohm, and G. Blomqvist, "Determination of object contour from projections for attenuation correction in cranial positron emission tomography," J. Comput. Assist. Tomography, vol. 6, pp. 365-372, 1982.

[2] J.-P. Thirion, "Segmentation of tomographic data without image reconstruction," IEEE Transactions on Medical Imaging, vol. 11, pp. 102-110, March 1992.

[3] N. Srinivasa, K. Ramakrishnan, and K. Rajgopal, "Detection of edges from projections," IEEE Transactions on Medical Imaging, vol. 11, pp. 76-80, March 1992.

[4] W. Munk and C. Wunsch, "Ocean acoustic tomography: a scheme for large scale monitoring," Deep-Sea Research, vol. 26A, pp. 123-161, 1979.

[5] G. T. Herman, Image Reconstruction From Projections. New York: Academic Press, 1980.

[6] D. Rossi and A. Willsky, "Reconstruction from projections based on detection and estimation of objects, Parts I and II: Performance analysis and robustness analysis," IEEE Trans. on Acoustics, Speech and Signal Processing, vol. 32, pp. 886-906, August 1984.

[7] J. L. Prince, Geometric Model-based Estimation from Projections. PhD thesis, MIT, Dept. of EECS, 1988.

[8] A. Lele, "Convex set estimation from support line measurements," Master's thesis, MIT, Dept. of EECS, 1990.

[9] K. Hanson, "Tomographic reconstruction of axially symmetric objects from a single radiograph," in High Speed Photography (Strasbourg), 1984. Vol. 491.

[10] W. C. Karl, Reconstructing Objects from Projections. PhD thesis, MIT, Dept. of EECS, 1991.

[11] Y. Bresler and A. Macovski, "Estimation of 3-D shape of blood vessels from X-ray images," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, March 1984.

[12] J. A. Fessler and A. Macovski, "Object-based 3-D reconstruction of arterial trees from magnetic resonance angiograms," IEEE Trans. on Medical Imaging, vol. 10, pp. 25-39, March 1991.

[13] S. Chang, "The reconstruction of binary patterns from their projections," Communications of the ACM, vol. 14, no. 1, 1971.

[14] A. Kuba, "Reconstruction of measurable plane sets from their two projections taken in arbitrary directions," Inverse Problems, vol. 7, pp. 101-107, 1991.

[15] A. Volcic, "A three point solution to Hammer's X-ray problem," Journal of the London Mathematical Society, vol. 2, no. 34, pp. 349-359, 1986.

[16] P. C. Fishburn, J. Lagarias, J. Reeds, and L. A. Shepp, "Sets uniquely determined by projections on axes I. Continuous case," SIAM J. Appl. Math., vol. 50, no. 1, pp. 288-306, 1990.

[17] R. J. Gardner, "Symmetrals and X-rays of planar convex bodies," Arch. Math., vol. 41, pp. 183-189, 1983.

[18] S. Helgason, The Radon Transform. Boston: Birkhauser, 1980.

[19] R. A. Knox, "Ocean acoustic tomography: a primer," in Ocean Circulation Models: Combining Data and Dynamics, vol. 284 of NATO ASI Series, pp. 141-188. Kluwer Academic Publishers, 1989.
[20] H. L. Van Trees, Detection, Estimation, and Modulation Theory: Part I. New York: John Wiley and Sons, 1968.

[21] J. Rissanen, Stochastic Complexity in Statistical Inquiry, vol. 15 of Series in Computer Science. World Scientific, 1989.

[22] P. Milanfar, W. C. Karl, and A. S. Willsky, "Recovering the moments of a function from its Radon-transform projections: Necessary and sufficient conditions," LIDS Technical Report LIDS-P-2113, MIT, Laboratory for Information and Decision Systems, June 1992.

[23] G. Talenti, "Recovering a function from a finite number of moments," Inverse Problems, vol. 3, pp. 501-517, 1987.

[24] J. Shohat and J. Tamarkin, The Problem of Moments. New York: American Mathematical Society, 1943.

[25] M. Pawlak, "On the reconstruction aspects of moments descriptors," IEEE Trans. Info. Theory, vol. 38, pp. 1698-1708, November 1992.

[26] N. Akhiezer, The Classical Moment Problem and Some Related Questions in Analysis. New York: Hafner, 1965.

[27] P. Milanfar, Geometric Estimation and Reconstruction from Tomographic Data. PhD thesis, MIT, Department of Electrical Engineering, June 1993.

[28] M. Berger, Geometry I and II. Springer-Verlag, 1987.

[29] P. J. Davis, "Plane regions determined by complex moments," Journal of Approximation Theory, vol. 19, pp. 148-153, 1977.