LIDS-P-2205

Reconstructing Parameterized Objects From Projections: A Statistical View*
Peyman Milanfar
(Corresponding Author)
Alphatech Inc.,
50 Mall Road, Burlington, MA 01803
Phone: (617) 273-3388 extension 271
E-mail: milanfar@athena.mit.edu
W. Clem Karl,
Alan S. Willsky
Laboratory for Information and Decision Systems
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Cambridge, Massachusetts, 02139
October 17, 1993
*This work was supported by the National Science Foundation under Grant 9015281-MIP, the Office of Naval Research
under Grant N00014-91-J-1004, the US Army Research Office under Contract DAAL03-92-G-0115, and the Clement Vaturi
Fellowship in Biomedical Imaging Sciences at MIT.
Abstract
In many applications of tomography, the fundamental quantities of interest in an image are geometric ones. In these instances, pixel-based signal processing and reconstruction is at best inefficient and at worst non-robust in its use of the available tomographic data. Classical reconstruction techniques such as Filtered Back-Projection tend to produce spurious features when data are sparse and noisy, and these "ghosts" further complicate the process of extracting what is often a limited number of rather simple geometric features. In addition, even if our interest is not primarily geometric, such a perspective can provide a rational framework for focusing information in those cases where the quality or quantity of the available data will not support the generation of a dense pixel-based field. In this paper we present a framework that, in its most general form, is a statistically optimal technique for the extraction of specific geometric features or objects directly from the noisy projection data. The approach is applicable to the reconstruction of any finite parameterization of an object, but we focus in particular on the tomographic reconstruction of binary polygonal objects from sparse and noisy data. In our setting, the tomographic reconstruction problem is essentially formulated as a (finite-dimensional) parameter estimation problem. The parameters to be estimated correspond to features of the underlying object; in particular, the vertices of binary polygons are used as their defining parameters. Under the assumption that the projection data are corrupted by Gaussian white noise, we use the Maximum Likelihood (ML) criterion when the number of parameters is assumed known, and the Minimum Description Length (MDL) criterion for reconstruction when the number of parameters is not known. The resulting optimization problems are nonlinear and thus are plagued by numerous extraneous local extrema, making their solution far from trivial. In particular, proper initialization of any iterative technique is essential for good performance. To this end, we provide a method to construct a reliable yet simple initial guess for the solution. This procedure is based on the estimated moments of the object, which may be conveniently obtained directly from the noisy projection data.
List of Symbols

f(x, y)         object (image) function defined over the plane
g(t, θ)         Radon transform (projection) of f
θ               projection angle
t               projection offset from the origin
δ               Dirac delta function
ω               unit vector [cos(θ), sin(θ)]^T
w               additive white Gaussian measurement noise
σ²              variance of the measurement noise
i, j            sample and view indices
m, n            number of views and number of samples per view
d               total number of observations, d = mn
Y               vector of all observations Y_ij
V, V*           object parameter matrix and its true value
V_ml, V_map, V_MDL   ML, MAP, and MDL estimates of V
N               number of parameters (sides of the polygon)
μ_pq            moment of f of order p + q
H^(k)(θ)        k-th order moment of the projection at angle θ
h, A, e, R      quantities in the linear moment model h = Aμ + e, with noise covariance R
V_ref(N)        centered regular N-gon of unit area
L, C            linear transformation and translation defining the initial guess
V_init          initial guess polygon
k_N             1/(4N tan(π/N))
Σ̂               estimated central inertia (second moment) matrix
U, λ            eigenvector matrix and eigenvalue in the decomposition of the normalized Σ̂
ℰ₀, ℰ_I         ellipses characterizing affinely regular N-gons (Result 1)
H(S₁, S₂)       Hausdorff distance between the sets S₁ and S₂
List of Figures

1   The Radon transform
2   A projection of a binary, polygonal object
3   Illustration of Result 1
4   Triangle example. SNR = 0 dB, 50 views, 20 samples/view: True (-), Reconstruction (- -). %Error = 7.2
5   Hexagon example. SNR = 0 dB, 50 views, 20 samples/view: True (-), Reconstruction (- -). %Error = 9.6
6   From top to bottom: sinograms with 50 projections and 20 samples per projection of I) noiseless hexagon, II) noisy data at 0 dB, III) reconstructed hexagon. In each of these images, the horizontal axis is θ, the vertical axis is t, and the intensity values are the values of the corresponding projections mapped to the grayscale range of [0, 255]
7   Sample reconstruction of a hexagon at 0 dB SNR using FBP: 64 views, 64 samples per view
8   Sample reconstruction of a hexagon at 0 dB SNR, 64 views, 64 samples/view: True (-), Reconstruction (- -)
9   Mean performance curves for ML reconstructions of a triangle and a hexagon
10  Performance as a function of Number of Views
11  Performance as a Function of Number of Samples per View
12  Cost vs. number of sides for the hexagon in Figure 5
13  Minimum MDL costs and Sample Reconstructions for an Ellipse
14  True object (-), reconstruction (-.), initial guess (o) picked by eye
15  True object (-), reconstruction (-.), initial guess (o) picked using Initial Guess Algorithm
16  FBP Reconstruction of non-polygonal, non-convex Object: 3rd order Butterworth filter with 0.15 normalized cutoff frequency, SNR = 4.35 dB
17  A sample path of the reconstruction error at SNR = 0 dB
18  Percent error for Hexagon vs SNR after outlier removal
19  A typical reconstruction at a local minimum with SNR = 0 dB
20  A typical reconstruction at a local minimum with SNR = 4.35 dB
21  Relative error in matching second order moments using the Initial Guess Algorithm
1 Introduction
In many applications of tomography, the aim is to extract a rather small set of geometrically based features
from a given set of projection data [1, 2, 3]. In these instances, a full pixel-by-pixel reconstruction of the
object is a rather inefficient and non-robust approach. In addition, in many situations of practical interest,
a full set of data with high signal-to-noise ratio (SNR) is often difficult, if not impossible, to obtain. Such
situations arise in oceanography, nuclear medicine, surveillance, and non-destructive evaluation when due to
the geometry of the object or the imaging apparatus, only a few noisy projections are available [4, 5]. In
these cases, the classical reconstruction techniques such as Filtered Back Projection (FBP) [5] and Algebraic
Reconstruction Techniques (ART) fail to produce acceptable reconstructions.
The shortcomings of these
classical techniques in such situations can be attributed to two main sources. First, these techniques are
invariably aimed at reconstructing every pixel value of the underlying object with little regard to the quality
and quantity of the available data. To put it differently, there is no explicit or implicit mechanism to control greed and focus information; consequently, nothing prevents one from attempting to extract more information from the data than it actually contains. The second type of shortcoming results from the fact that if we assume
that the projection data are corrupted by Gaussian white noise, the process of reconstruction will have
the net effect of "coloring" this noise. This effect manifests itself in the object domain in the form of
spurious features which will complicate the detection of geometric features. This observation points out the
importance of working directly with the projection data when the final goal is the extraction of geometric
information. In our effort to address these two issues, we have proposed the use of simple geometric priors
in the form of finitely parameterized objects. The assumption that the object to be reconstructed is finitely
parameterized allows the tomographic reconstruction problem to be posed as a finite (relatively low-dimensional) parameter estimation problem. If we further assume, as we have done in the latter part of this
paper, that the number of such parameters is also an unknown, we can formulate the reconstruction problem
as a Minimum Description Length estimation problem which provides for an automatic (data-driven) method
for computing the optimal parameterized objects with the "best" number of parameters, given the data. This
is, in essence, an information-theoretic criterion which gives us a direct way to estimate as many parameters
as the information content of the data allows us to, and thus control the greed factor.
Other efforts in the parametric/geometric study of tomographic reconstruction have been carried out in
the past. The work of Rossi and Willsky [6] and Prince and Willsky [7] has served as the starting point
for this research effort.
In the work of Rossi, the object was represented by a known profile with only a few geometric parameters, namely size, location, eccentricity, and orientation. These parameters were then
estimated from projection data using the Maximum Likelihood (ML) formulation. In their approach, the
number of unknown parameters was fixed and the main focus of their work was on performance analysis.
Prince, on the other hand, used a priori information such as prior probabilities on sinograms and consistency
conditions to compute Maximum A Posteriori (MAP) estimates of the sinogram and then used FBP to
obtain a reconstruction. He made use of prior assumptions about shape, such as convexity, to reconstruct
convex objects from support samples which were extracted from noisy projections through optimal filtering
techniques. The approach of Prince provided an explicit method for integrating geometric information into
the reconstruction process but was in essence still a pixel-by-pixel reconstruction. Extending these ideas,
Lele, Kulkarni, and Willsky [8] made use of only support information to produce polygonal reconstructions.
Hanson [9] studied the reconstruction of axially symmetric objects from a single projection. Karl [10] also
has studied the reconstruction of 3-D objects from two-dimensional silhouette projections.
The geometric modeling approach of Rossi and Willsky was expanded upon to include a more general set
of objects by Bresler and Macovski [11] and Fessler and Macovski [12]. The former work chose sequences of
3-D cylinders with unknown radius, position, and orientation to model blood vessels being tomographically
imaged in 3-D. The latter work used ellipsoids with unknown parameters to reconstruct 3-D arterial trees from
a few magnetic resonance angiograms. Recently, Thirion [2] has introduced a technique to extract boundaries
of objects from raw tomographic data through edge detection in the sinogram. Other work in geometric
reconstruction by Chang [13] and more recently Kuba, Volcic, Gardner and Fishburn, [14, 15, 16, 17] has
been concerned with the reconstruction of binary objects from only two noise-free projections.
Our approach provides a statistically optimal ML formulation for the direct recovery of objects from
the projection data in the presence of noise. We also provide an automatic mechanism for identifying the
statistically optimal number of object parameters, and thus information, in a given data set. The statistically
optimal ML formulation leads to an optimization problem that is nonlinearand filled with local extrema.
An appropriate initial guess is thus essential for its iterative solution. We thus provide a simple procedure to
generate an appropriate initial guess based moment estimates of the object formed from the original data.
The organization of this paper is as follows. In Section 2 we introduce the basic definitions and assumptions and pose the general problem which we intend to solve. We also discuss the particular statistical
formulations of the reconstruction problem which we use. In particular, in Section 2.3 we discuss our novel
technique for computing a good initial guess for the nonlinear optimization problems that result from our
formulations.
Section 3 contains basic performance results and robustness studies for various scenarios.
Section 4 contains our conclusions.
2 The Reconstruction Problem
The Radon transform [5, 18] of a function f(x, y) defined over a compact domain Ω of the plane is given by

    g(t, \theta) = \iint_{\Omega} f(x, y) \, \delta\!\left(t - \omega \cdot [x, y]^T\right) dx \, dy.    (1)

For every fixed t and θ, g(t, θ) is the line integral of f over Ω in the direction ω = [cos(θ), sin(θ)]^T, where δ(t − [cos(θ), sin(θ)] · [x, y]^T) is a delta function on a line at angle θ + π/2 and distance t from the origin. See Figure 1.
Here we assume the existence of a set of parameters that uniquely specify the function f; the estimation of these parameters is the concern of this paper. Let us stack the set of
parameters that uniquely define f in a vector V. In what follows, we shall then consider the function f and
its set of defining parameters as interchangeable. We will assume throughout that the data available to us
are discrete samples of g which are corrupted by Gaussian white noise of known intensity. In particular, our
observations will be given by
    Y_{ij} = g(t_i, \theta_j, V^*) + w(t_i, \theta_j),    (2)
Figure 1: The Radon transform
for 1 ≤ i ≤ n, 1 ≤ j ≤ m, where V* is the true object we wish to reconstruct. The variables w(t_i, θ_j) are assumed to be independent, identically distributed (i.i.d.) Gaussian random variables with variance σ². We will denote by Y the vector of all such observations.
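To make the observation model (2) concrete, the following sketch (ours, not from the paper; the disk object, function names, and parameter values are illustrative assumptions) generates noisy projection samples for an object whose Radon transform is known in closed form, a centered disk of radius r, whose projection at every angle is the chord length 2√(r² − t²):

```python
# Illustrative sketch of the observation model (2) for a centered disk,
# whose Radon transform is g(t, theta) = 2*sqrt(r^2 - t^2) for |t| < r,
# independent of the angle theta.
import numpy as np

def noisy_projections(r=1.0, n_samples=20, n_views=50, sigma=0.5, seed=0):
    """Return t, theta, and noisy samples Y[i, j] = g(t_i, theta_j) + w."""
    rng = np.random.default_rng(seed)
    t = np.linspace(-2 * r, 2 * r, n_samples)   # field of view: 2x object width
    theta = np.linspace(0.0, np.pi, n_views, endpoint=False)
    g = 2.0 * np.sqrt(np.maximum(r**2 - t**2, 0.0))   # chord length (0 outside)
    G = np.tile(g[:, None], (1, n_views))             # same for every angle
    Y = G + sigma * rng.standard_normal(G.shape)      # i.i.d. Gaussian noise
    return t, theta, Y
```

A binary polygon can be substituted by replacing the chord-length formula with an exact polygon forward model, such as the one sketched after Equation (8) below.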
In the classical framework, the parameter vector V consists of the pixel values of all pixels in the image
to be reconstructed. For example, the parameter vector V here would contain over 16,000 elements for a
128 x 128 image. The FBP technique would estimate all of these parameters. For many applications, however,
the creation of this dense visual field is not really necessary. For instance, in ocean acoustic tomography
[4, 19], the size, location, and shape of a cold core ring (i.e., a continuous body of cool water contained in the warm waters of a gulf stream) are important quantities to be estimated. Since a
cold core ring can be considered a binary object, these parameters can be easily extracted from tomographic
data directly without reconstruction of the entire pixel field. The classical approach, however, is to perform
the full reconstruction of the image and then try to locate and measure objects, such as cold core rings, in
the resulting image. If we only desire knowledge of the size and location of a localized occlusion, that is to
say essentially three numbers, FBP is then estimating roughly 15,997 parameters too many!
Clearly, the estimation of all the pixel values is not only an inefficient use of the data, but in high noise and
sparse data situations it can also produce spurious features known as "ghosts" that are bound to complicate
the subsequent processing. In particular, such spurious features reflect the fact that we have colored the
observation noise. This coloring of the noise, as reflected in reconstruction artifacts, may not be severe when
we have high SNR and full-coverage data, yet can be quite limiting when these conditions are not met. In
general, there may arise situations where the quality or quantity of the available data will simply not support
the reliable estimation of all the parameters represented by the pixel values. In such cases, even though our
interest may not be fundamentally geometric, our approach provides a rational and statistically precise way
of reducing the degrees of freedom of the problem and hence of focusing the available information.
2.1 Maximum Likelihood Approach
In our approach, the original data in (2) is used to directly estimate the parameters V of a geometric
parameterization in a statistically optimal way. The dimension of the parameter vector V is determined by
the level of detail that one can extract from the sparse and noisy data. For clarity, we first consider the case
where a fixed and known number of parameters is assumed. In this case, the Maximum Likelihood (ML)
[20] estimate, Vml, of the parameter vector V is given by that value of V which makes the observed data
most likely. In particular, using the monotonicity of the logarithm:
    V_{ml} = \arg\max_{V} \, \log P(Y \mid V),    (3)

where P(Y | V) denotes the conditional probability density of the observed data set Y given the parameter vector V. It is well known that, under the assumption that the data are corrupted by i.i.d. Gaussian random noise, the solution to the above ML estimation problem is precisely equivalent to the following Nonlinear Least Squares Error (NLSE) formulation:

    V_{ml} = \arg\min_{V} \sum_{i,j} \left| Y_{ij} - g(t_i, \theta_j, V) \right|^2.    (4)
The formulation (4) shows that, in contrast to the linear formulation of classical reconstruction algorithms,
the ML tomographic reconstruction approach, while yielding an optimal reconstruction framework, generally
results in a highly nonlinear minimization problem. It is the nature of the dependence of g on the parameter
vector V that makes the problem nonlinear. Being nonlinear, the problem is plagued by numerous extraneous
local extrema, making the issue of computing a good initial guess for nonlinear optimization routines an
important one. As we will discuss in Section 2.3, the geometric information carried in the projections can
be conveniently extracted in the form of moments. These moments can in turn be used to easily compute a
good initial guess for the optimization algorithms.
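To illustrate how (4) might be solved in practice, here is a minimal sketch under our own assumptions: `forward(t, theta, V)` stands for any implementation of the projection g(t, θ, V) (such as the polygon forward model sketched later), and a generic nonlinear least-squares routine performs the minimization. As discussed above, the quality of the returned estimate depends critically on the initial guess V_init.

```python
# Hedged sketch of the NLSE formulation (4); `forward` is an assumed
# user-supplied forward model g(t, theta, V) for the parameterization.
import numpy as np
from scipy.optimize import least_squares

def ml_estimate(Y, t, theta, V_init, forward):
    """Minimize sum_{i,j} |Y_ij - g(t_i, theta_j, V)|^2 over V."""
    shape = V_init.shape

    def residuals(v_flat):
        V = v_flat.reshape(shape)
        G = np.array([[forward(ti, thj, V) for thj in theta] for ti in t])
        return (Y - G).ravel()

    sol = least_squares(residuals, V_init.ravel())
    return sol.x.reshape(shape)
```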
Finally, note that if additional explicit geometric information is available in terms of a prior probabilistic
description of the object vector V, then a Maximum-A-Posteriori estimate of V may be computed as follows:
    V_{map} = \arg\max_{V} \, \log P(V \mid Y).    (5)
In this work we concentrate on the ML problem given in (3) and its extensions, though application of
our results to the MAP formulation is straightforward.
2.2 Minimum Description Length
In the previous ML discussion we assumed we had prior knowledge of the number of parameters describing
the underlying object. Without this knowledge, we can consider the Minimum Description Length (MDL)
principle [21]. In this approach, the cost function is formulated such that the global minimum of the cost corresponds to a model of least order that best explains the data. The MDL approach in essence extends the Maximum Likelihood principle by including a term in the optimization criterion that measures the model
complexity. In the present context, the model complexity refers to the number of parameters used to capture
the object in question. Whereas the ML approach maximizes the log likelihood function given in (3), the
MDL criterion maximizes a modified log likelihood function, as follows:
    V_{MDL} = \arg\max_{V, N} \left\{ \log P(Y \mid V) - \frac{N}{2} \log(d) \right\},    (6)

where d = mn is the number of samples of g(t, θ) and N refers to the number of parameters defining the
reconstruction. Roughly speaking, the MDL cost is proportional to the number of bits required to model
the observed data set with a model of order N, hence the term Minimum Description Length. Under our
assumed observation model (2) the MDL criterion (6) yields the following nonlinear optimization problem
for the optimal parameter vector V_MDL:

    V_{MDL} = \arg\min_{N} \, \min_{V} \left\{ \sigma^{-2} \sum_{i,j} \left\| Y_{ij} - g(t_i, \theta_j, V) \right\|^2 + N \log(d) \right\},    (7)
where the optimization is now performed over both V and the number of parameters N. Note that the
solution of the inner minimization in (7) essentially requires solution of the original ML problem (3) or (4)
for a sequence of values of N. Thus the optimization problem (7), being an extension of (4), is also highly
nonlinear and routines to solve it will similarly require proper initializations to avoid being stuck in local
minima.
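A sketch of the resulting model-order search is given below (our construction; `ml_estimate`, `forward`, and the initializer `init_for` are assumptions carried over from the surrounding sketches). It solves the inner ML problem for each candidate number of sides and keeps the minimizer of the MDL cost in (7), counting 2N scalar parameters for an N-sided polygon:

```python
# Hedged sketch of the MDL model-order search (7).
import numpy as np

def mdl_estimate(Y, t, theta, sigma, init_for, forward, N_range=range(3, 11)):
    d = Y.size                            # total number of samples, d = m*n
    best = None
    for N in N_range:
        V_hat = ml_estimate(Y, t, theta, init_for(N), forward)
        G = np.array([[forward(ti, thj, V_hat) for thj in theta] for ti in t])
        # An N-sided polygon carries 2N scalar parameters (x and y per vertex).
        cost = np.sum((Y - G) ** 2) / sigma**2 + (2 * N) * np.log(d)
        if best is None or cost < best[0]:
            best = (cost, N, V_hat)
    return best                           # (MDL cost, chosen N, estimate)
```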
In summary, our proposed approach, in its most general form, is a statistically optimal technique for
the extraction of specific geometric features or objects directly from the projection data and the rational
focusing of sparse and noisy information. Furthermore, via the MDL criterion it is equipped to compute
the optimal number of features (parameters) that may be extracted from a given noisy and possibly sparse projection data set. While being a statistically optimal approach, the resulting optimization problems are highly nonlinear and require appropriate initialization for their solution. To this end we will present a simple, robust method to generate such an initial estimate.

Figure 2: A projection of a binary, polygonal object
For the sake of focus in what follows, we will concentrate our attention on binary, polygonal objects.
We define these as objects taking the value 1 inside a simple polygonal region and zero elsewhere. Unless
otherwise stated, we assume from here on that the matrix V contains the vertices of an N-sided binary,
polygonal region as follows:
    V = [\, V_1 \;|\; V_2 \;|\; \cdots \;|\; V_N \,],    (8)

where V_i = [x_i, y_i]^T denotes the Cartesian coordinates of the i-th vertex of the polygonal region, arranged in the counter-clockwise direction (see Figure 2). Note that we use a matrix of parameters rather than a vector in what follows, for notational ease in the algorithms to follow, though this is not essential. The tomographic reconstruction problem for these objects can now be stated as the problem of optimally estimating the vertices V given noisy samples of the projections, denoted by Y_ij as in (2).
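The following sketch (ours, not code from the paper) shows one way to evaluate the forward model for such binary polygonal objects: the projection g(t, θ, V) is the total length of the intersection of the integration line with the polygon, computed from the crossings of the line with the polygon's edges:

```python
# Sketch of the forward model for a binary polygon: the line with normal
# w = (cos(theta), sin(theta)) and offset t is parameterized along w_perp,
# and the projection is the total length inside the polygon, found by an
# even-odd pairing of the sorted edge crossings.
import numpy as np

def polygon_projection(t, theta, V):
    """V is a 2 x N matrix of vertices in counter-clockwise order."""
    w = np.array([np.cos(theta), np.sin(theta)])
    w_perp = np.array([-np.sin(theta), np.cos(theta)])
    u = V.T @ w - t          # signed distance of each vertex from the line
    s = V.T @ w_perp         # coordinate of each vertex along the line
    N = V.shape[1]
    crossings = []
    for k in range(N):
        u0, u1 = u[k], u[(k + 1) % N]
        if (u0 > 0) != (u1 > 0):                    # edge crosses the line
            lam = u0 / (u0 - u1)
            crossings.append(s[k] + lam * (s[(k + 1) % N] - s[k]))
    crossings.sort()
    # Interior intervals lie between consecutive pairs of crossings.
    return sum(b - a for a, b in zip(crossings[::2], crossings[1::2]))
```

For a convex polygon the line crosses the boundary at most twice, so the sum reduces to a single chord length; the even-odd pairing also handles simple non-convex polygons.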
2.3 Algorithmic Aspects - Computing A Good Initial Guess
Given the highly nonlinear nature of the dependence of the cost function in (4) on the parameters in V, it
appears evident that given a poor initial condition, typical numerical optimization algorithms may converge
to local minima of the cost function. Indeed, this issue is a major obstacle to the use of a statistically optimal
though nonlinear approach such as given in (3) or (6). In this section we describe a method for using the
projection data to directly compute an initial guess that is sufficiently close to the true global minimum as
to result in convergence to it in nearly all cases. We do this by estimating the moments of the object directly
from the projection data and then using (some of) these moments to compute an initial guess.
In considering the use of moments as the basis for an initialization algorithm, one is faced with two
important issues. The first is that although estimating the moments of a function from its projections is a
relatively easy task, as we have shown in [22], the reconstruction of a function from a finite number of moments
is in general a highly ill-posed problem even when these moments are exactly known [23]. Furthermore, in
our framework the moments are estimated from noisy data, and hence are themselves noisy. In fact, as
higher and higher order moments are estimated, the error in the estimates of these moments becomes larger.
Our approach avoids these moment related difficulties by using the moments only to guide an initial coarse
estimate of the object parameters for subsequent use in solution of the nonlinear ML or MDL problems.
This initial estimate, in turn, itself mitigates the difficulties associated with the nonlinearities of the optimal
statistical approaches. In particular, the amount of computation involved in arriving at an initial guess using
our moment-based method is far smaller than the amount of computation (number of iterations) required to
converge to an answer given a poor initial guess, especially since a poor initial guess may converge to a local
minimum and never reach the basin of the global minimum. Further, the parameterization of the objects
serves to regularize and robustify the moment inversion process [23, 24, 25, 26].
Our method of using moments to generate an initial guess is based on the following set of observations.
First, let μ_pq, 0 ≤ p, q, denote the moment of f(x, y) of order p + q, as given by:

    \mu_{pq} = \iint x^p y^q f(x, y) \, dx \, dy.    (9)
In particular, note that the moments up to order 2 have the following physical interpretations. The zeroth-order moment μ₀₀ is the area of the object, the first-order moments μ₁₀ and μ₀₁ are the coordinates of the center of mass of the object scaled by the area, and the second-order moments μ₂₀, μ₁₁, μ₀₂ are used to form the entries of the inertia matrix of the object. Thus these moments contain basic geometric information about object size, location, elongation, and orientation that, if available, could be used to guide our initialization of the nonlinear optimization problems (4) or (7). Our first aim then is to estimate them directly from the noisy projection data. To that end, it is easy to show that [18]:

    \int g(t, \theta) \, t^k \, dt = \iint f(x, y) \, [x \cos(\theta) + y \sin(\theta)]^k \, dx \, dy.    (10)
By expanding the integrand on the right hand side of (10), it becomes apparent that the moments of
the projections are linearly related to the moments μ_pq of the object. In particular, specializing (10) to k = 0, 1, 2, and noting that f(x, y) is an indicator function when the objects in question are binary, we arrive at the following relationships between μ_pq, 0 ≤ p + q ≤ 2, and the projection g(t, θ) of f(x, y) at each angle θ:

    H^{(0)}(\theta) = \int g(t, \theta) \, dt = \mu_{00}    (11)

    H^{(1)}(\theta) = \int g(t, \theta) \, t \, dt = \left[\, \cos(\theta) \;\; \sin(\theta) \,\right] \begin{bmatrix} \mu_{10} \\ \mu_{01} \end{bmatrix}    (12)

    H^{(2)}(\theta) = \int g(t, \theta) \, t^2 \, dt = \left[\, \cos^2(\theta) \;\; 2\sin(\theta)\cos(\theta) \;\; \sin^2(\theta) \,\right] \begin{bmatrix} \mu_{20} \\ \mu_{11} \\ \mu_{02} \end{bmatrix}    (13)
Thus if we have projections at three or more distinct, known angles we can estimate the moments of up to order 2 of the object we wish to reconstruct. The computation of these moments is a linear calculation, making their estimation from projections straightforward (see [22]). Since, in general, many more than three projections are available, the estimation of these moments determining the area, center, and inertia axes of the object is overdetermined. The result is robustness to noise and data sparsity through a reduction in the noise variance of the estimated values. In particular, we can stack the moments H^(k)(θ_j) we obtain from the projections at each angle θ_j to obtain the following overall equations for the μ_pq up to order 2:
    \begin{bmatrix} H^{(0)}(\theta_1) \\ \vdots \\ H^{(0)}(\theta_m) \end{bmatrix} = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \mu_{00}    (14)

    \begin{bmatrix} H^{(1)}(\theta_1) \\ \vdots \\ H^{(1)}(\theta_m) \end{bmatrix} = \begin{bmatrix} \cos(\theta_1) & \sin(\theta_1) \\ \vdots & \vdots \\ \cos(\theta_m) & \sin(\theta_m) \end{bmatrix} \begin{bmatrix} \mu_{10} \\ \mu_{01} \end{bmatrix}    (15)

    \begin{bmatrix} H^{(2)}(\theta_1) \\ \vdots \\ H^{(2)}(\theta_m) \end{bmatrix} = \begin{bmatrix} \cos^2(\theta_1) & 2\sin(\theta_1)\cos(\theta_1) & \sin^2(\theta_1) \\ \vdots & \vdots & \vdots \\ \cos^2(\theta_m) & 2\sin(\theta_m)\cos(\theta_m) & \sin^2(\theta_m) \end{bmatrix} \begin{bmatrix} \mu_{20} \\ \mu_{11} \\ \mu_{02} \end{bmatrix}    (16)
Using these equations we can easily calculate the maximum likelihood estimates of the moments of the object μ_pq for 0 ≤ p + q ≤ 2 given noisy observations of the moments of the projections. In particular, this is done by gathering the above sets of equations into a single linear equation of the form h = Aμ + e, where h is the vector of noisy computed moments of the projections H^(k)(θ_j) appearing above, μ = [μ₀₀, μ₁₀, μ₀₁, μ₂₀, μ₁₁, μ₀₂]^T, and e denotes a zero-mean vector of Gaussian noise with corresponding covariance matrix R which captures the noise in our observations of H^(k)(θ_j). This noise model, of course, follows directly from that of (2). The optimal Maximum Likelihood (ML) estimate of the vector μ is then given by

    \hat{\mu} = (A^T R^{-1} A)^{-1} A^T R^{-1} h.    (17)

(The general framework for the optimal estimation of the moments of any order of a function f(x, y) from noisy measurements of its Radon transform is developed in [27].) Let us denote these ML moment estimates by μ̂_pq.
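As a minimal illustration (ours) of (14)-(17), the sketch below forms the projection moments H^(k)(θ_j) by Riemann sums over each view and solves the stacked linear system by weighted least squares. For simplicity it assumes a diagonal R; in fact moments of different orders computed from the same view share noise samples and are correlated, so this is only an approximation to the full ML estimate (17).

```python
# Hedged sketch of the moment-estimation step (14)-(17) with a diagonal
# approximation to the noise covariance R.
import numpy as np

def estimate_moments(t, theta, Y, sigma):
    """Estimate (mu00, mu10, mu01, mu20, mu11, mu02) from noisy projections."""
    dt = t[1] - t[0]
    rows, h, r = [], [], []
    for j, th in enumerate(theta):
        c, s = np.cos(th), np.sin(th)
        for k, coeff in enumerate([[1, 0, 0, 0, 0, 0],
                                   [0, c, s, 0, 0, 0],
                                   [0, 0, 0, c * c, 2 * s * c, s * s]]):
            h.append(np.sum(Y[:, j] * t**k) * dt)     # noisy H^(k)(theta_j)
            rows.append(coeff)
            # Variance of the Riemann-sum moment under the model (2).
            r.append(sigma**2 * np.sum((t**k * dt) ** 2))
    A, h, w = np.array(rows), np.array(h), 1.0 / np.array(r)
    mu, *_ = np.linalg.lstsq(A * np.sqrt(w)[:, None], h * np.sqrt(w), rcond=None)
    return mu
```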
Now that we have estimates of the moments of up to order 2 of the object, and thus estimates of its
basic geometric structure, we need to convert this information into a suitable object for use in initializing the
nonlinear optimization problem (4) or (7). The initial guess algorithm outlined next uses these ML estimates
of the low-order moments μ̂_pq, obtained from the noisy projection data, to produce a polygon which has
moments up to order 2 which are close to (or in some cases equal to) those which were estimated from the
projection data. The resulting polygon, which will be used as our initialization, should thus have the same
basic geometric structure as the underlying object.
Recall that in this process of generating an initial object from the moment data we want to avoid the
difficulties usually associated with obtaining an object from a set of its moments [24, 26, 25, 23]. For this
reason, the initial polygon we will use is simply obtained as the affine transformation of a reference object
V_ref(N), which is a centered regular N-gon of unit area. For a given choice of the number of sides N, the
reference object we use is given by:
    V_{ref}(N) = \sqrt{\frac{2}{N \sin(2\pi/N)}} \begin{bmatrix} \cos(0) & \cos(\frac{2\pi}{N}) & \cdots & \cos(\frac{2\pi(N-1)}{N}) \\ \sin(0) & \sin(\frac{2\pi}{N}) & \cdots & \sin(\frac{2\pi(N-1)}{N}) \end{bmatrix}    (18)

The affine transformation of this reference object, which will be generated from the estimated moment set, consists of a uniform scaling, a stretching along the coordinate axes, a rotation, and finally a translation, and simply corresponds to the following transformation of the underlying spatial coordinates of the reference object:

    \begin{bmatrix} x' \\ y' \end{bmatrix} = L \begin{bmatrix} x \\ y \end{bmatrix} + C.    (19)
In particular, given the form of Vref (N) in (18), this yields the following equation for the family of possible
initial objects V_init:

    V_{init} = L \, V_{ref}(N) + [\, C \; \cdots \; C \,].    (20)
The set of all such affine transformations of V_ref(N) we term the affinely regular N-gons [28]. In the
absence of noise, the initial guess algorithm we detail will exactly match the given estimated moments if the
underlying object itself happens to be affinely regular. If the underlying object is not affinely regular itself,
the algorithm will not necessarily produce an N-gon exactly matching its moments, even in the absence of
noise, though as we will show, it will usually be close. Of course, in the presence of noise the estimated
moments themselves are not exact and thus, while we would hope to get close, our resulting initial N-gon
will never exactly match the true moments of the underlying object anyway.
Given that we will restrict ourselves to initial objects of the form (20), let us consider how we might choose the parameters of the transformation, L and C, to match a given estimated moment set μ̂_pq, 0 ≤ p + q ≤ 2.
Using (20) and (18) to calculate the moments of V_init up to order 2, we obtain the following relationships:

    \mu_{00}(V_{init}) = |\det(L)|    (21)

    \begin{bmatrix} \mu_{10}(V_{init}) \\ \mu_{01}(V_{init}) \end{bmatrix} = |\det(L)| \, C    (22)

    \begin{bmatrix} \mu_{20}(V_{init}) & \mu_{11}(V_{init}) \\ \mu_{11}(V_{init}) & \mu_{02}(V_{init}) \end{bmatrix} = |\det(L)| \left( k_N L L^T + C C^T \right)    (23)

where μ_pq(V_init) is the pq-th moment of V_init and k_N = 1/(4N tan(π/N)). Thus to match μ₀₀(V_init), μ₁₀(V_init), μ₀₁(V_init) with their estimated counterparts, the first two conditions require:
    |\det(L)| = \hat{\mu}_{00}    (24)

    C = \frac{1}{\hat{\mu}_{00}} \begin{bmatrix} \hat{\mu}_{10} \\ \hat{\mu}_{01} \end{bmatrix}    (25)
The first condition simply corresponds to a scaling of V_ref(N) so that its area matches the estimated one. The second condition shows that the affine term C in the transformation (19) should correspond to a translation of V_ref(N) to the estimated center of mass of the object. These two conditions assure that we
match the estimated area and center of mass location.
Now, after some manipulation, (23) implies that to match the estimated second-order moments μ̂_pq, p + q = 2, we must have:

    L L^T = \frac{\hat{\Sigma}}{k_N \hat{\mu}_{00}}    (26)

where Σ̂ is the matrix of estimated central moments defined by:

    \hat{\Sigma} = \begin{bmatrix} \hat{\mu}_{20} & \hat{\mu}_{11} \\ \hat{\mu}_{11} & \hat{\mu}_{02} \end{bmatrix} - \frac{1}{\hat{\mu}_{00}} \begin{bmatrix} \hat{\mu}_{10} \\ \hat{\mu}_{01} \end{bmatrix} \begin{bmatrix} \hat{\mu}_{10} & \hat{\mu}_{01} \end{bmatrix}    (27)
In particular, this condition implies another constraint on det(L) independent of (24), which we will not,
in general, be able to satisfy. Specifically, a necessary condition for finding an L satisfying both (26) and
(24) is that:
    \det(\hat{\Sigma}) = k_N^2 \, \hat{\mu}_{00}^4,    (28)
where the expression on the left is the determinant of the estimated central second moments. Actually the condition (28) is also sufficient, as shown in Appendix A. Clearly, this condition will not, in general, be satisfied, and we will be unable to exactly match the estimated second moments. In fact, the objects that do meet this constraint, and thus whose moments we can exactly match, are precisely the elements of the set of affinely regular N-gons. Geometrically, this situation reflects the limitation of our restricted object class (20), i.e. the set of affinely regular N-gons. Within this class, for a given object area we are constrained as to the "size" of the corresponding inertia matrix we may have, where inertia size is measured as the product of the principal central inertia moments (eigenvalues of the central inertia matrix). For example, while the objects of our class will always be convex polygons, for a given area we can always obtain nonconvex objects of greater inertia by "moving area outward," as in a dumbbell.

The condition (26) can also be viewed as implying a different scaling on L needed to obtain a perfect match to the inertia condition. In general, we thus have a choice of picking this scaling of L to satisfy either the inertia condition (26) or the area condition (24). Since the area is both a more robustly estimated and a more fundamental geometric quantity, we choose to enforce the area condition (24) in the algorithm to follow. We then choose L so that the resulting central inertia matrix of V_init has the same principal axis directions and has its principal inertias in the same ratio as those estimated from the data as found in Σ̂. We accomplish these goals by using a square root of the matrix Σ̂ normalized to have unit determinant for the form of L, then scaling the result to match the determinant condition (24). Thus we sacrifice the overall scaling of the inertia matrix in favor of matching the estimated area. Collecting the above steps and reasoning, the overall algorithm is given by the following:
Algorithm 1 (Initial Guess)

1. Compute the optimal (ML) estimates of the moments up to order 2 (μ̂₀₀, μ̂₁₀, μ̂₀₁, μ̂₂₀, μ̂₁₁, and μ̂₀₂) from the raw projection data using (14)-(17).

2. Construct an N-sided regular polygon centered at the origin with vertices chosen as the scaled roots of unity in counter-clockwise order so that they lie on the circle of radius \sqrt{2/(N \sin(2\pi/N))}. This polygon has unit area and is defined in Equation (18).

3. Compute the translation C, obtained as the estimated object center of mass:

    C = \frac{1}{\hat{\mu}_{00}} \begin{bmatrix} \hat{\mu}_{10} \\ \hat{\mu}_{01} \end{bmatrix}    (29)

4. Form the estimated central inertia matrix Σ̂ from the estimated moments according to (27).

5. If Σ̂ is not positive definite, set L = \sqrt{\hat{\mu}_{00}} \, I_2 and go to step 8; otherwise proceed to step 6. (I_2 is the 2 x 2 identity matrix.)

6. Perform an eigendecomposition of the normalized matrix Σ̂ as follows:

    \frac{\hat{\Sigma}}{\sqrt{\det(\hat{\Sigma})}} = U \begin{bmatrix} \lambda & 0 \\ 0 & 1/\lambda \end{bmatrix} U^T    (30)

where we have assumed that the eigenvalues are arranged in descending order and that the eigenvectors are normalized to unit length so that det(U) = ±1. Note that the eigenvalues are reciprocals of each other since we have scaled the left-hand side so that its determinant is 1.

7. Form the linear transformation L as a scaled square root of Σ̂ as follows:

    L = \sqrt{\hat{\mu}_{00}} \; U \begin{bmatrix} \sqrt{\lambda} & 0 \\ 0 & 1/\sqrt{\lambda} \end{bmatrix}    (31)

Note that |det(L)| = μ̂₀₀ as desired. Depending on whether U is a pure rotation or a rotation followed by a reflection, it will have determinant +1 or -1, respectively.

8. The initial guess V_init is now obtained by applying the scaling, stretching, and rotation transformation L and the translation C to the reference object V_ref(N) via the coordinate transformation [x', y']^T = L [x, y]^T + C. Because of the form of V_ref(N), this operation yields:

    V_{init} = L \, V_{ref}(N) + [\, C \; \cdots \; C \,].    (32)
Note that the eigenvalue λ of the unit-determinant matrix calculated in step 6 gives the eccentricity of the underlying object, while the corresponding eigenvectors give its orientation. Also note that in the presence of noise the estimated central inertia matrix Σ̂ of the object may not be strictly positive definite and hence may not correspond to the inertia matrix of any object at all. In such instances, the algorithm refrains from the use of these moments of order 2 and computes an initial guess based only on the estimated area and center of mass.
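A compact transcription of Algorithm 1 in code (ours; names and the positive-definiteness test are implementation choices) might look as follows:

```python
# Sketch of Algorithm 1: build the initial polygon V_init from the
# estimated moments mu = (mu00, mu10, mu01, mu20, mu11, mu02).
import numpy as np

def initial_guess(mu, N):
    mu00, mu10, mu01, mu20, mu11, mu02 = mu
    # Step 2: regular N-gon of unit area centered at the origin (Eq. 18).
    radius = np.sqrt(2.0 / (N * np.sin(2 * np.pi / N)))
    ang = 2 * np.pi * np.arange(N) / N
    V_ref = radius * np.vstack([np.cos(ang), np.sin(ang)])
    # Step 3: translation to the estimated center of mass (Eq. 29).
    C = np.array([mu10, mu01]) / mu00
    # Step 4: estimated central inertia matrix (Eq. 27).
    Sigma = (np.array([[mu20, mu11], [mu11, mu02]])
             - np.outer([mu10, mu01], [mu10, mu01]) / mu00)
    # Step 5: fall back to pure scaling if Sigma is not positive definite.
    if np.linalg.det(Sigma) <= 0 or Sigma[0, 0] <= 0:
        L = np.sqrt(mu00) * np.eye(2)
    else:
        # Steps 6-7: eigendecomposition of Sigma normalized to unit
        # determinant, then a scaled square root (Eqs. 30-31).
        lam, U = np.linalg.eigh(Sigma / np.sqrt(np.linalg.det(Sigma)))
        lam, U = lam[::-1], U[:, ::-1]        # descending eigenvalues
        L = np.sqrt(mu00) * U @ np.diag(np.sqrt(lam))
    # Step 8: affine image of the reference polygon (Eq. 32).
    return L @ V_ref + C[:, None]
```

As in step 5 of the algorithm, the second-order information is discarded whenever the estimated Σ̂ fails to be positive definite, which is exactly the fallback behavior discussed above.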
If the matrix L computed above is replaced by L' = LT for any orthogonal T, the resulting quantity satisfies

    L L^T = L' L'^T.    (33)

Hence, although the initial guess generated by the above algorithm is unambiguous and unique, in the sense that the square root of Σ̂ obtained by the algorithm is unique, an infinity of other initial guesses having the same moments up to order 2 may be generated by replacing L by LT and allowing T to range over the
set of all 2 x 2 orthogonal transformations. A precise characterization of this set of all affinely regular N-gons
with the same moments up to order 2 is, in fact, given in the following result, which we prove in Appendix
A.
Result 1  Consider the set of all N-gons with moments up to order 2: μ₀₀, μ₁₀, μ₀₁, μ₂₀, μ₁₁, μ₀₂, such that the resulting central inertia matrix

    \Sigma = \begin{bmatrix} \mu_{20} & \mu_{11} \\ \mu_{11} & \mu_{02} \end{bmatrix} - \frac{1}{\mu_{00}} \begin{bmatrix} \mu_{10} \\ \mu_{01} \end{bmatrix} \begin{bmatrix} \mu_{10} & \mu_{01} \end{bmatrix}    (34)

satisfies det(Σ) = k_N² μ₀₀⁴. This set coincides with the set of N-gons with vertices on an ellipse ℰ₀ and sides tangent, at their midpoints, to a homofocal ellipse ℰ_I, where these ellipses are given by

    ℰ₀ = { z | (z - C)^T E_0 (z - C) = 1 }    (35)

    ℰ_I = { z | (z - C)^T E_I (z - C) = 1 }    (36)

with

    C = \frac{1}{\mu_{00}} \begin{bmatrix} \mu_{10} \\ \mu_{01} \end{bmatrix}    (37)

    E_0 = \frac{\mu_{00} \cos^2(\pi/N)}{4} \, \Sigma^{-1}    (38)

    E_I = \frac{\mu_{00}}{4} \, \Sigma^{-1}    (39)
Figure 3: Illustration of Result 1
This result states that the class of all affinely regular N-gons with a given set of moments up to order 2 is given by the class of N-gons whose vertices lie on a fixed ellipse and whose sides are tangent to a second ellipse which is homofocal with the first. (See Figure 3 for an example.) The ellipses are uniquely determined by the values of the given moments. In order to simplify the Initial Guess Algorithm, we do not search further over this family; we simply use the output of the Initial Guess Algorithm described above as the starting guess for our nonlinear optimization routines.
3 Experimental Results
In this section we present some performance studies of our proposed algorithm with simple polygonal objects
as prototypical reconstructions. One may expect that our algorithms work best when the underlying object
(that from which the data is generated) is itself a simple binary polygonal shape. While this is true, we
will also show that our algorithms perform consistently well even when the underlying objects are complex,
non-convex, and non-polygonal shapes.
First we demonstrate reconstructions based on the ML criterion. In these reconstructions we use the
parameters of the true polygon as the initial guess to ensure that the solution produced by the algorithm
is actually the ML estimate, i.e., at the global minimum of the ML cost function. Typical reconstructions
are shown along with average performance studies for a variety of noise and data sampling scenarios. In
particular, we show that the performance of our algorithms is quite robust to noise and sparsity of the data,
significantly more so than classical reconstruction techniques. To demonstrate this point, reconstructions
using our techniques and the classical FBP are provided.
Next we demonstrate how the MDL criterion may be used to optimally estimate the number of parameters
(sides) N directly from the data. We solve these MDL problems by solving the ML problem for a sequence of
values of N. To initialize each of these ML problems, a regular polygon of the desired number of sides with the
same area and centroid is used. This initialization tries to ensure that the actual ML solution corresponding to the number of sides in question is indeed found. The robustness of the MDL approach and its
ability to capture the shape information in noisy data when the underlying object is not polygonal is also
shown through polygonal reconstruction of more complicated shapes. Finally, studies are reported in which
the Initial Guess algorithm is used to produce a starting guess for the optimization routines. In these studies,
we show that although the performance of the overall algorithm does degrade somewhat, this degradation
in performance is not significant.
In order to quantify some measure of performance of our proposed reconstruction algorithms, we first
need to define an appropriate notion of signal-to-noise ratio (SNR). We define the SNR per sample as
    \mathrm{SNR} = 10 \log_{10} \left( \frac{\frac{1}{d} \sum_{i,j} g^2(t_i, \theta_j, V^*)}{\sigma^2} \right),    (40)

where d = m × n is the total number of observations, and σ² is the variance of the i.i.d. noise w in the projection observations (2).
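In simulation one typically fixes a target SNR and derives the noise standard deviation from (40); a small helper of our own construction:

```python
# Choose sigma so that the data realize a target per-sample SNR per (40),
# given the noiseless projection samples G.
import numpy as np

def sigma_for_snr(G, snr_db):
    mean_power = np.mean(G ** 2)    # (1/d) * sum_ij g^2(t_i, theta_j, V*)
    return np.sqrt(mean_power / 10.0 ** (snr_db / 10.0))
```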
In all our simulations the reconstruction error is measured in terms of the percent Hausdorff distance [28]
between the estimate and the true polygon or shape. The Hausdorff metric is a proper notion of "distance"
between two nonempty compact sets and it is defined as follows. Let d(p*, S) denote the minimum distance
between a point p* and the compact set S:

    d(p^*, S) = \inf \{ \| p^* - p \| \;:\; p \in S \}.    (41)

Define the ε-neighborhood of the set S as

    S^{(\epsilon)} = \{ p \;:\; d(p, S) \le \epsilon \}.    (42)

Now given two non-empty compact sets, S₁ and S₂, the Hausdorff distance between them is defined as:

    \mathcal{H}(S_1, S_2) = \inf \{ \epsilon \;:\; S_1 \subseteq S_2^{(\epsilon)} \text{ and } S_2 \subseteq S_1^{(\epsilon)} \}.    (43)

In essence, the Hausdorff metric is a measure of the largest distance by which the sets S₁ and S₂ differ. The percent Hausdorff distance between the true object S and the reconstruction Ŝ is now defined as

    \text{Percent Error} = 100\% \times \frac{\mathcal{H}(\hat{S}, S)}{\mathcal{H}(O, S)},    (44)

where O denotes the set composed of the single point at the origin, so that if S contains the origin, H(O, S) is the maximal distance of a point in the set to the origin and thus a measure of the set's size.
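A direct numerical approximation (ours) of (41)-(44) can be obtained by densely sampling the polygon boundaries and computing the discrete two-sided Hausdorff distance between the resulting point sets; the sampling density is an accuracy parameter of the approximation, not part of the definition:

```python
# Discrete approximation of the percent Hausdorff error (41)-(44).
import numpy as np

def boundary_points(V, samples_per_edge=100):
    """Densely sample the boundary of the polygon with vertex matrix V."""
    pts = []
    N = V.shape[1]
    for k in range(N):
        a, b = V[:, k], V[:, (k + 1) % N]
        lam = np.linspace(0.0, 1.0, samples_per_edge, endpoint=False)
        pts.append(a[None, :] + lam[:, None] * (b - a)[None, :])
    return np.vstack(pts)

def hausdorff(P, Q):
    """Two-sided Hausdorff distance between point sets (one point per row)."""
    D = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def percent_error(V_true, V_hat):
    S, S_hat = boundary_points(V_true), boundary_points(V_hat)
    origin = np.zeros((1, 2))
    return 100.0 * hausdorff(S_hat, S) / hausdorff(origin, S)
```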
3.1 ML Based Reconstruction
Here we present examples and performance analyses of the ML based reconstruction method (4). For these
experiments an initial condition equal to the true object was used to ensure that we obtain the actual ML estimate.
3.1.1 Sample Reconstructions
In Figures 4 and 5 we show optimal reconstructions of a triangle and a hexagon, respectively, based on
the ML criterion. The true polygon in each case is depicted in solid lines while the estimate is shown in dashed lines. For both objects, 1000 noisy projection samples were collected in the form of 50 equally spaced projections in the interval (0, π] (m=50), and 20 samples per projection (n=20). The field of view (extent of measurements in the variable t) was chosen as twice the maximum width of the true object in each case. For each of these data sets the variance of the noise in (2) was set so that the SNR given by (40) was equal to 0 dB. The typical behavior of the optimal ML-based reconstructions in the projection space can
be seen in Figure 6, which corresponds to the hexagon of Figure 5. The top image of this figure shows the underlying projection function g(t_i, θ_j, V*) of (2) for the hexagon, while the middle image shows the noisy observed data Y_ij. The object is difficult to distinguish due to the noise in the image. The bottom image shows the reconstructed projections corresponding to the optimal estimate g(t_i, θ_j, V̂), which are virtually indistinguishable from those corresponding to the true object. Figure 7 shows the best FBP reconstruction of the hexagon used in Figure 5 based on 4096 projection samples at the same SNR (0 dB) (64 angles with 64 samples per angle). For comparison, the reconstruction from this data using our algorithm is shown in Figure
8. (Note here that what constitutes the "best" FBP is somewhat subjective. We tried many different filters
and visually, the best reconstruction was obtained with a Butterworth filter of order 3 with 0.15 normalized
cutoff frequency.)
Note that the number of samples per projection used in this reconstruction is actually more than the
number used to produce the ML-based reconstruction in Figure 5. The increase in sampling was necessary
because FBP produces severe artifacts if the number of views exceeds the number of samples per view [5].
The ML approach has no such difficulties, as we will see in the next section where we examine performance.
In contrast to the ML-based reconstruction, the details of the hexagon are corrupted in the FBP reconstruction. In addition, there are spurious features in the FBP reconstruction and, perhaps most importantly, to extract a binary object from the FBP reconstruction we would need to threshold the image or perform edge detection on it. Neither of these postprocessing steps is easily interpretable in an optimal estimation framework and, of course, they incur additional computational cost.
3.1.2 Effect of Noise on Performance
The average performance of the ML based reconstructions is presented through several Monte-Carlo studies.
Again, for these experiments an initial condition equal to the true object was used in each case to ensure that we obtain the actual ML estimates. The first study establishes average reconstruction errors at various SNR's for a fixed number of data samples. The purpose of these simulations is to demonstrate that the ML-based reconstructions are robust to the quality of the data over a wide range of SNR's. The same two polygons as in Figures 4 and 5 were chosen as the underlying objects. Again, in each case, 1000 samples of the projections of these objects were collected in the form of 50 equally spaced projections in the interval (0, π] (m=50), and 20 samples per projection (n=20), while the field of view (extent of measurements in the
variable t) was chosen as twice the maximum width of the object in each case.

Figure 4: Triangle example. SNR = 0 dB, 50 views, 20 samples/view: True (-), Reconstruction (- -). %Error = 7.2

Figure 5: Hexagon example. SNR = 0 dB, 50 views, 20 samples/view: True (-), Reconstruction (- -). %Error = 9.6

Figure 6: From top to bottom: sinograms with 50 projections and 20 samples per projection of I) noiseless hexagon, II) noisy data at 0 dB, III) reconstructed hexagon. In each of these images, the horizontal axis is θ, the vertical axis is t, and the intensity values are the values of the corresponding projections mapped to the grayscale range of [0, 255].

Figure 7: Sample reconstruction of a hexagon at 0 dB SNR using FBP: 64 views, 64 samples per view.

Figure 8: Sample reconstruction of a hexagon at 0 dB SNR, 64 views, 64 samples/view: True (-), Reconstruction (- -).

Figure 9: Mean performance curves for ML reconstructions of a triangle and a hexagon (percent error vs. SNR (dB) per sample).

The samples g(t_i, θ_j, V*) were
then corrupted by Gaussian white noise w(t_i, θ_j) of different intensities to yield data sets at several SNR's.
At each SNR, 100 reconstructions were done using independent sample paths of the corrupting white noise.
The average reconstruction error was then computed and is displayed against the SNR in Figure 9. The
error bars denote the 95% confidence intervals for the computed mean values.
The percent error in these reconstructions increases with decreasing SNR, as one would expect. In fact,
the graph shows that, at least in the examined SNR range of -4.35 to +4.35 dB, the relation between
percent error and SNR is roughly linear in the cases of the triangle and the hexagon. This suggests that the
performance of our algorithm does not degrade very fast with decaying SNR, demonstrating the robustness
to noise of such object-based optimal ML estimates.
3.1.3 Effect of Sampling on Performance
Here the performance of our ML-based estimates with respect to both the number of available data samples
and their distribution is studied. One would naturally expect that as the number of available data points decreases, the reconstruction error should increase. The main aim of these simulations is to demonstrate that the ML-based reconstructions are robust to both the quantity and the distribution of data over a wide range of SNR's. In particular, reasonable estimates are produced even with a drastic reduction of data and, unlike the behavior seen in FBP reconstructions, the ML estimates display no catastrophic degradation as the samples per angular view are reduced. The relative sensitivity of the ML estimates to the density of samples of g(t, θ, V*) in t and θ is also discussed, providing information of use for the design of sampling strategies. The true hexagon used in Figure 5 was again used as the underlying object. As before, an initial condition equal to the true object was used for each of these experiments to ensure that we obtain the actual ML estimates.
A series of Monte-Carlo simulations (50 runs for each sampling configuration) were then performed at various
SNR's to observe the effect of sparse projections and sparse sampling in each projection. In Figure 10, the
percent Hausdorff reconstruction error is plotted versus the number of angular views for SNR's of 0, 4.35,
and 8.7 dB, while the number of samples per view was fixed at 50. With a modest 50 samples per view, all
three curves fall below 10% reconstruction error when the number of views is greater than about 10. This is
only 500 total observations, many of which do not contain the object at all (since the field of view is twice
as large as the object). Furthermore, as the number of angular views is decreased from 100 to 10, only a
marginal increase in the reconstruction error is observed. These observations testify to the robustness of
optimal ML estimates with respect to the number of views.
In Figure 11, the dual case is presented. In this figure the percent Hausdorff reconstruction error is
plotted versus the number of samples per view for SNR's of 0, 4.35, and 8.7 dB, while the number of angular
views was fixed at 50. With 50 angular views, all curves fall below 10% reconstruction error when the number
of samples per view is greater than only 10. Also, as the number of samples per view is decreased from 100
to 10, again only a marginal increase in the reconstruction error is observed. This behavior shows that the
optimal ML estimates are robust with respect to the number of samples per view. Note that for a fixed
sampling strategy, the reconstruction error increases only slightly as the SNR is decreased over a wide range.
For instance, in Figure 10, with 40 angular views and 50 samples per view, the percent error is reduced by only about 5 percentage points as the SNR increases from 0 to 8.7 dB.
Figure 10: Performance as a function of Number of Views (50 samples/view; curves for SNR = 0, 4.35, and 8.7 dB).

Figure 11: Performance as a Function of Number of Samples per View (50 views; curves for SNR = 0, 4.35, and 8.7 dB).
Finally, it is noteworthy that the reconstruction error enjoys a dramatic improvement for all SNR's (0,
4.35, and 8.7 dB) when the number of samples per view is increased from 5 to 10. This improvement is
more significant than that observed in Figure 10 when the number of views is increased from 5 to 10. This
behavior indicates that in a scenario where only a small (fixed) number of sample points can be collected, it
is more beneficial to have more samples per view rather than more views.
3.2 MDL Reconstructions
Here we will examine reconstruction under the MDL criterion of (7) where we now assume that the number of
sides of the reconstructed polygon is unknown. In particular, the reconstruction experiments for the hexagon
in Figure 5 were repeated at SNR = 0 dB assuming no knowledge of the number of sides. The MDL criterion
was employed to estimate the optimal number of sides. As in the ML algorithm, it is important to find a
good initial guess for the MDL algorithm as well. The problem is twofold. First, a reasonable guess must
be made as to the appropriate range of the number of sides. We picked a fairly small range for the number
of sides of the reconstruction; typically, 3 to 10 sides. Next, for each number of sides, the Initial Guess
algorithm was used to produce an initial guess to the optimization routine. The method for selecting the
range of the number of sides is ad hoc, but was shown to be reliable in the sense that for our simulations, the
MDL cost never showed local or global minima for number of sides larger than 10. Figure 12 shows a plot of
the MDL cost corresponding to the expression in (7) versus the number of sides for a sample reconstruction
of the hexagon in Figure 5. It can be seen that the minimum occurs at N = 6, demonstrating that the
optimal MDL reconstruction will consist of 6-sides. Indeed this number coincides with the true number of
sides of the underlying object. The optimal MDL estimate is thus exactly the optimal ML estimate for this
data set presented before.
Figure 12: Cost vs. number of sides for the hexagon in Figure 5.

3.3 Polygonal Reconstruction of Non-polygonal Objects

In this section we wish to study the robustness of MDL-based estimates when the underlying, true object is non-polygonal. First we examine the case of an elliptical object. We use the MDL formulation presented in
the previous section and study the behavior of the optimal reconstructions at two different SNR's. To this
end, let the true object (that which generated the data) be a binary ellipse whose boundary is given by:
    \left\{ (x, y) \;\Big|\; \frac{(x - \frac{1}{2})^2}{1} + \frac{(y + \frac{1}{2})^2}{9/4} = 1 \right\}    (45)

The above relation defines an ellipse centered at the point (1/2, -1/2) whose minor and major axes are aligned with the coordinate axes, with semi-axis lengths 1 and 3/2, respectively.
One thousand (1000) noisy samples of the Radon transform of this ellipse were generated (m=50 equally
spaced angular views in (0, π], and n=20 samples per view) at SNR's of 0 and 2.17 dB, respectively, for 50
different sample paths of the corrupting noise. For each set of data, reconstructions were performed using
the ML algorithm with 3, 4, 5, 6, 7, and 8 sides together with the initial guess algorithm. The MDL cost in
(7) was then computed for each of these reconstructions. The ensemble mean of this cost over the 50 runs,
for each value of N, is presented in the top part of Figure 13. The error bars denote the 95% confidence
intervals for the computed mean values. The top left curve corresponding to the SNR= 2.17 dB case displays
its minimum at N = 5. This behavior indicates that the average optimal MDL reconstruction uses 5 sides at
this noise level. A corresponding typical such five-sided reconstruction of the ellipse is displayed on the lower
left plot of Figure 13 together with the true ellipse.

Figure 13: Minimum MDL costs and Sample Reconstructions for an Ellipse. Top: mean MDL cost vs. number of sides N at SNR = 2.17 dB (left) and SNR = 0 dB (right); bottom: a 5-sided reconstruction at SNR = 2.17 dB (left) and a 6-sided reconstruction at SNR = 0 dB (right).

The upper right curve corresponding to the SNR = 0 dB
case displays its minimum at N = 6 which indicates that the average optimal MDL reconstruction for this
case uses 6 sides. The MDL cost curve for this lower SNR case has now become quite flat, however, showing that the reconstructions with N from 4 to 6 are all about equally explanatory of the data. Although the curves
for both cases demonstrate the ability of the MDL procedure to capture the shape's complexity through its
choice of N, this behavior suggests that with increasing noise intensity, an MDL-based estimate becomes less
sensitive to the precise level of complexity of the reconstruction, as we would expect. Apparently, in high
noise situations the differences between these reconstructions that would be apparent in high SNR scenarios
are masked. As the noise level increases, these fine distinctions are unimportant or not supported by the
data. A typical 6-sided reconstruction is also displayed in the lower right plot of Figure 13 along with the
true ellipse.
As another example of the robustness of the ML-based estimates when the underlying object is non-polygonal, we present a sample reconstruction of a non-polygonal object that is also non-convex. In Figure 14
a typical reconstruction of this object is presented based on 20 equally spaced projections with 50 samples
per projection, at a signal-to-noise ratio of 10.

Figure 14: True object (-), reconstruction (-.), initial guess (o) picked by eye.

To ensure that the reconstruction was not a local minimum,
the vertices of the initial guess were chosen by eye at points where the underlying object had curvature
extrema. Furthermore, the number of sides was picked by hand according to the number of apparent
curvature extrema of the true object. Figure 15 shows a reconstruction of the same kidney-shaped object
at SNR of 4.35 dB with the initial guess chosen by the Initial Guess Algorithm. Figure 16 contains the
reconstruction produced by FBP using the same data set. As in our other examples, the underlying object
has been captured more accurately and without spurious features through the use of our algorithm.
3.4 Initial Guess Algorithm
In this section we present some sample reconstructions and performance plots for which we use the Initial
Guess algorithm for generating a starting point to the nonlinear optimization (3). To study the average
performance of the ML algorithm using the output of the Initial Guess Algorithm, a Monte-Carlo simulation
was done for the reconstruction of the hexagon shown in Figure 5. 100 reconstructions were carried out for
different realizations of the noise at various SNR's, each with 1000 projection samples as before (50 projections
Figure 15: True object (-), reconstruction (-.), initial guess (o) picked using Initial Guess Algorithm
Figure 16: FBP reconstruction of the non-polygonal, non-convex object: 3rd-order Butterworth filter with 0.15 normalized cutoff frequency, SNR = 4.35 dB
Figure 17: A sample path of the reconstruction error at SNR = 0 dB (percent error versus realization number for 100 realizations; the outliers are marked)
For each SNR, on average fewer than 5 percent of the reconstructions (i.e., fewer than 5 of the 100 sample reconstructions) had very large reconstruction errors; we call these instances outliers. Figure 17 shows the reconstruction errors for 100 realizations of the noise at 0 dB. The outliers are clearly visible.
Figure 17 indicates that in a few instances, the reconstructions were essentially at local minima very far from the global minimum of the cost. In our experience, these outliers occur most frequently when poor estimates of the moments of order 2 are obtained from the noisy data. Note that the second-order moments are used in the Initial Guess Algorithm only if the corresponding inertia matrix obtained from them is strictly positive definite. The Initial Guess Algorithm decides whether or not to use the second-order moments solely on the basis of this positive definiteness, regardless of how close the inertia matrix may be to being negative definite or indefinite. Hence, the outliers occur in those rare instances when the estimated inertia matrix happens to be a very poor estimate, yet still positive definite (and hence used in the Initial Guess Algorithm). This phenomenon, in turn, seems to occur when relatively few samples per projection are available.
Figure 18 shows the mean percent error in the Monte-Carlo runs after the removal of the outliers. The outliers were removed from the ensemble and the results of the remaining realizations were averaged to yield the values in Figure 18. That is, if 3 out of 100 realizations led to outliers, then only the 97 results which seemed "reasonable" were used in computing the ensemble average. Whether the result of a run was deemed reasonable was decided by comparing its percent error to the ensemble median reconstruction error for all 100 runs. In particular, if the percent error for a run was more than one standard deviation above the median, the run was declared an outlier. In the case of Figure 17, using the computed median value of 17.2 and standard deviation of 33.2, a threshold level of 50.4 was obtained, above which outliers were declared.
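In code, this screening rule is a one-liner; a minimal sketch (the variable names are ours):

    import numpy as np

    def screen_outliers(percent_errors):
        # Flag a run as an outlier if its percent error exceeds the ensemble
        # median by more than one ensemble standard deviation.  For the
        # ensemble of Figure 17 (median 17.2, std 33.2) the threshold is 50.4.
        errs = np.asarray(percent_errors, dtype=float)
        threshold = np.median(errs) + np.std(errs)
        return errs[errs <= threshold], errs[errs > threshold]

    # The curve of Figure 18 is the mean of the inliers at each SNR:
    # inliers, outliers = screen_outliers(errors); mean_error = inliers.mean()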
The resulting "mean" performance is plotted here to show the average performance without the effect of the outliers. Upon comparing Figure 18 with the corresponding performance curve in Figure 9, it can be seen that the performance of the ML algorithm using the output of the Initial Guess Algorithm still suffers even after discounting the obvious outliers. This means that instances of convergence to local minima still occur; note, however, that, at least from a visual standpoint, the average performance after the removal of outliers is not significantly different from the average performance with the actual polygon as the initial guess. In particular, the degradation in performance here is roughly 7 percentage points in the Hausdorff norm over the given SNR range. This corresponds to a small visual error, as can be seen in Figure 4. From this observation, we conclude that even though the Initial Guess Algorithm does not always lead to convergence to the global minimum, it almost always leads, at least, to a local minimum that is fairly close to the global minimum of the ML cost function. Typical reconstructions at local minima which are close to the global minimum of the cost are shown in Figure 19 for SNR = 0 dB and in Figure 20 for SNR = 4.35 dB.
4 Conclusions

4.1 Conclusions from Experiments
Several conclusions may be drawn from the experimental results presented in this paper. The optimal estimates based on the ML criterion produce reconstructions that are highly robust to noise and to the sparsity and distribution of the data.
Figure 18: Percent error for the hexagon vs. SNR (dB) per sample, after outlier removal
Figure 19: A typical reconstruction at a local minimum with SNR = 0 dB. True (-), reconstruction (-.), initial guess (+)
Figure 20: A typical reconstruction at a local minimum with SNR = 4.35 dB (percent error 12.04). True (-), reconstruction (-.), initial guess (+)
This behavior is essentially a direct consequence of the fact that ML-based techniques focus all of the available information on the task at hand, as represented in the parameters of the underlying object. In contrast, classical approaches such as the FBP algorithm must spread this information over all the pixels in an image. Extending the ML approach through the MDL principle allows for the automatic determination of the optimum number of parameters needed to describe the given data set. The experiments verify the utility of this method. The ML-based approaches are also able to produce reconstructions for a wide variety of objects. The drawback of such statistically optimal ML and MDL approaches is that, in contrast to the linear formulations of classical reconstruction algorithms, they lead to highly non-linear optimization problems. This fact makes the issue of computing a good initial guess for the nonlinear optimization routines an important one. We provided a way to circumvent this difficulty through a simple initial guess algorithm based on the estimated moments of the object. In particular, the coarse geometric information carried in the projections can be easily extracted in the form of moments, which are then used to generate the initial guess. The efficacy of this algorithm over a range of SNR's was demonstrated.
4.2 Overall Conclusions
In this paper, we studied statistical techniques for the reconstruction of finitely parameterized geometric objects. In particular, we focused on the reconstruction of binary polygonal objects. The reconstruction of such objects was posed as a parameter estimation problem for which the Maximum Likelihood technique was proposed as a solution. In contrast to classical techniques such as FBP, the ML-based reconstructions showed great robustness to noise and to the loss and distribution of data. The drawback of such ML-based formulations is that the resulting optimization problems are highly non-linear, and thus a good initial guess is necessary to ensure convergence of optimization routines to the true ML estimate. To this end, an algorithm was presented for computing such a reasonable initial guess using moments of the object which are estimated directly from the projection data. While estimation of a function from its moments is, in general, a difficult and ill-posed problem, we avoid these problems by using the noisy estimated moments only to guide a coarse object estimate. This estimate, in turn, mitigates the difficulties associated with the non-linearities of the optimal ML statistical approach. The efficacy of this moment-based initial guess algorithm was demonstrated over a range of SNR's.
If the number of parameters describing the underlying object is not known, a Minimum Description Length criterion can be employed that simply generalizes the ML framework to penalize the use of an excessively large number of parameters for the reconstruction. The MDL approach was shown to work successfully in estimating the number of sides and the underlying object itself in low signal-to-noise ratio situations and for a variety of sampling scenarios. It was further demonstrated that if the underlying object is not polygonal, but still binary, the proposed ML and MDL algorithms are still capable of producing polygonal reconstructions which reasonably capture the object shape in the presence of high noise intensity and sparsely sampled data.
In this work we have focused on the reconstruction of binary polygonal objects parameterized by their vertices. The ML and MDL-based techniques used here may also be applied to more general object parameterizations. In particular, while we used the (estimated) moments of the object only as the basis for generating an initial guess, it is, in some cases, possible to parameterize the object entirely through its moments. For instance, Davis [29] has shown that a triangle in the plane is uniquely determined by its moments up to order 3, while in [27] we have generalized this result to show that the vertices of any simply connected nondegenerate N-gon are uniquely determined by its moments up to order 2N - 3.
More generally, a square integrable function defined over a compact region of the plane is completely determined by the entire set of its moments [26, 24, 23]. In reality we will only have access to a finite set of these moments, and these numbers, coming from estimates, will themselves be inexact and noisy. While estimation of the moments of a function from its projections is a convenient linear problem, inversion of the resulting finite set of moments to obtain the underlying function estimate is a difficult and ill-posed problem.
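Concretely, the linearity comes from the identity

    ∫ t^k g(t, θ) dt = Σ_{j=0}^{k} (k choose j) cos^{k-j}(θ) sin^j(θ) μ_{k-j, j},

so each projection angle contributes one linear equation per moment order, and the moments follow by least squares. A minimal Python sketch of this linear step (the function and variable names are ours); g[i, j] holds the sample g(t[j], thetas[i]):

    import numpy as np
    from math import comb

    def estimate_moments(g, t, thetas, max_order=2):
        # Least-squares estimate of the moments mu_{pq}, p + q <= max_order,
        # from sampled projections, using the linear identity above.
        dt = t[1] - t[0]
        labels = [(p, q) for p in range(max_order + 1)
                  for q in range(max_order + 1 - p)]
        rows, rhs = [], []
        for i, th in enumerate(thetas):
            c, s = np.cos(th), np.sin(th)
            for k in range(max_order + 1):
                rhs.append(np.sum(t**k * g[i]) * dt)   # k-th projection moment
                row = np.zeros(len(labels))
                for j in range(k + 1):
                    row[labels.index((k - j, j))] = comb(k, j) * c**(k - j) * s**j
                rows.append(row)
        sol, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
        return dict(zip(labels, sol))

With many projection angles the system is overdetermined, and the least-squares solution supplies the μ00, μ10, μ01 and inertia-matrix entries used by the Initial Guess Algorithm.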
These observations suggest a spectrum of ways in which to use moments in our reconstruction problems.
At one extreme, only a few moments are used in a sub-optimal way to generate a simple initialization for
solution of a hard, non-linear estimation problem. At the other extreme, the moments are themselves used
in an optimal reconstruction scheme. In [27] we have studied regularized variational formulations for the
reconstruction of a square integrable function from noisy estimates of a finite number of its moments. We
have also studied array-processing based algorithms for the reconstruction of binary polygonal objects from
a finite number of their moments.
A Theoretical Results on the Initial Guess Algorithm
In this section we present some theoretical justification for the Initial Guess Algorithm. To start, we state some elementary properties of the unit-area polygons Vref(N) whose vertices are the scaled Nth roots of unity (in counter-clockwise order), as defined by (18). From [27], it is a matter of some algebraic manipulation to show that the regular polygon Vref(N) has moments up to order 2 given by
    μ00(Vref(N)) = 1    (46)

    μ10(Vref(N)) = μ01(Vref(N)) = 0    (47)

    μ20(Vref(N)) = μ02(Vref(N)) = (2 + cos(2π/N)) / (6N sin(2π/N)) ≜ kN    (48)

    μ11(Vref(N)) = 0    (49)
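These closed forms, which follow from the unit-area normalization of Vref(N), are easy to check numerically; a sketch (Python; the helper name is ours) that computes the moments of Vref(N) with the standard Green's-theorem formulas for polygon moments:

    import numpy as np

    def polygon_moments(V):
        # Moments about the origin of the polygon whose counter-clockwise
        # vertices are the columns of V, via Green's theorem.
        x, y = V[0], V[1]
        xn, yn = np.roll(x, -1), np.roll(y, -1)
        c = x * yn - xn * y                       # edge cross terms
        mu00 = 0.5 * np.sum(c)
        mu20 = np.sum((x**2 + x * xn + xn**2) * c) / 12.0
        mu02 = np.sum((y**2 + y * yn + yn**2) * c) / 12.0
        mu11 = np.sum((x * yn + 2 * x * y + 2 * xn * yn + xn * y) * c) / 24.0
        return mu00, mu20, mu02, mu11

    N = 7
    r = np.sqrt(2.0 / (N * np.sin(2 * np.pi / N)))    # circumradius, unit area
    ang = 2 * np.pi * np.arange(N) / N
    Vref = r * np.vstack([np.cos(ang), np.sin(ang)])
    kN = (2 + np.cos(2 * np.pi / N)) / (6 * N * np.sin(2 * np.pi / N))
    print(polygon_moments(Vref))                      # -> (1.0, kN, kN, 0.0)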
Now let Vinit be an affine transformation of Vref given by

    Vinit = L Vref(N) + [C | C | ... | C]    (50)

for some linear transformation L and some 2 x 1 vector C. Let O(Vinit) denote the closed, binary polygonal region enclosed by the N-gon Vinit. Now, by considering the change of variables z = Lu, and dropping the explicit dependence on N, we have
    μ00(Vinit) = ∫∫_{O(Vinit)} dz    (51)
              = ∫∫_{O(Vref)} |det(L)| du    (52)
              = μ00(Vref) |det(L)| = |det(L)|    (53)
Similarly, we get

    [μ10(Vinit)  μ01(Vinit)]^T = (L [μ10(Vref)  μ01(Vref)]^T + C) |det(L)| = |det(L)| C    (54)

and

    I(Vinit) = (L I(Vref) L^T + C C^T) |det(L)| = (kN L L^T + C C^T) |det(L)|    (55)
where for any N-gon V we write

    I(V) ≜ [ μ20(V)  μ11(V)
             μ11(V)  μ02(V) ]    (56)

This proves relations (21), (22), and (23). We next establish an explicit description of the set of all affinely regular N-gons with a fixed set of moments up to order 2. In order to do this, we first need to prove a lemma.
Lemma 1 For every N-gon V with moments μ00, μ10 = 0, μ01 = 0, μ20, μ11, μ02 such that the inertia matrix I satisfies det(I) = kN^2 μ00^4, there exists a matrix L, unique up to an orthogonal transformation, such that V = L Vref.
Proof: The assumptions that μ10 = 0 and μ01 = 0 are made without loss of generality and to facilitate the presentation of the proof. Having said this, we define L as the scaled (unique) square root of I as follows. First, write the eigendecomposition

    I / √det(I) = U S^2 U^T,    (57)

where U is orthogonal and S has unit determinant. Define L as

    L = √μ00 U S.    (58)
The moments of V = L Vref are then given by

    μ00(V) = μ00    (59)

    μ10(V) = μ01(V) = 0    (60)

    I(V) = kN L L^T |det(L)|    (61)

Note that

    det(I(V)) = kN^2 μ00^4    (62)

as required. If L is replaced by L T, where T is any 2 x 2 orthogonal transformation, the same moments are obtained. Hence the lemma is established. □
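The construction in this proof is precisely what the Initial Guess Algorithm computes from the estimated moments; a minimal Python sketch of (57)-(58) (the function name is ours), including the positive-definiteness screen discussed in Section 3.4:

    import numpy as np

    def lemma_L(mu00, I):
        # Construct L = sqrt(mu00) U S from the normalized eigendecomposition
        # I / sqrt(det I) = U S^2 U^T of (57)-(58).
        if np.any(np.linalg.eigvalsh(I) <= 0):
            raise ValueError("inertia matrix not positive definite; "
                             "fall back to lower-order moments")
        In = I / np.sqrt(np.linalg.det(I))    # unit-determinant normalization
        evals, U = np.linalg.eigh(In)         # In = U diag(evals) U^T
        S = np.diag(np.sqrt(evals))           # S^2 = diag(evals), det S = 1
        return np.sqrt(mu00) * U @ S

    # Vinit = lemma_L(mu00, I) @ Vref + C then has the prescribed moments
    # whenever det(I) = kN**2 * mu00**4, cf. (59)-(62).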
Given this lemma, we obtain an interesting geometric representation of all affinely regular N-gons that have a prespecified set of moments of up to order 2. This characterization is given by Result 1, which we prove next.

Proof of Result 1: For the sake of simplicity, and without loss of generality, we carry out the proof for the case where all polygons are centered at the origin.

Let S1 denote the set of all N-gons whose moments up to order 2 are μ00, μ10 = μ01 = 0, μ20, μ11, μ02. Let S2 denote the set of all N-gons with vertices on the ellipse z^T Σ0^{-1} z = 1 and sides tangent (at their mid-points) to the ellipse z^T Σ1^{-1} z = 1. We show that S1 = S2.
First consider an N-gon V ∈ S1. V has moments μ00, 0, 0, I and therefore, by Lemma 1, there exists an L given by (57) and (58), unique up to an orthogonal matrix T1, such that we can write

    V = (L T1) Vref(N)    (63)

Let us denote the N-gons V and Vref(N) explicitly in terms of their vertex columns as

    V = [v1 | v2 | ... | vN]    (64)

    Vref(N) = [w1 | w2 | ... | wN]    (65)

so that

    vj = L T1 wj.    (66)
It is easy to show from the definition of Vref(N) that

    wj^T wj = αN ≜ 2 / (N sin(2π/N))    (67)

    (w_{j+1} + wj)^T (w_{j+1} + wj) = βN ≜ 4 / (N tan(π/N))    (68)
Now to show that V ∈ S2, we prove that

    vj^T Σ0^{-1} vj = 1    (69)

    (1/4)(v_{j+1} + vj)^T Σ1^{-1} (v_{j+1} + vj) = 1    (70)

for j = 1, 2, ..., N, where by convention v_{N+1} = v1. Using (57) and (58), we can write

    vj^T Σ0^{-1} vj = (μ00 kN / αN) vj^T I^{-1} vj    (71)
                    = (1/αN) wj^T T1^T S U^T U S^{-2} U^T U S T1 wj    (72)
                    = (1/αN) wj^T wj    (73)
                    = 1    (74)

where (72) uses vj = √μ00 U S T1 wj and I^{-1} = (1/(kN μ00^2)) U S^{-2} U^T.
Similarly,

    (1/4)(v_{j+1} + vj)^T Σ1^{-1} (v_{j+1} + vj) = (μ00 kN / βN)(v_{j+1} + vj)^T I^{-1} (v_{j+1} + vj)    (75)
                                                 = (1/βN)(w_{j+1} + wj)^T (w_{j+1} + wj) = 1    (76)

Hence, V ∈ S2.
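These two ellipse identities can be sanity-checked numerically for a random affine image of Vref(N), taking Σ0 = (αN/(μ00 kN)) I and Σ1 = (βN/(4 μ00 kN)) I as implied by (71) and (75) (our reconstruction of the scalings); a Python sketch:

    import numpy as np

    N = 5
    r = np.sqrt(2.0 / (N * np.sin(2 * np.pi / N)))
    ang = 2 * np.pi * np.arange(N) / N
    W = r * np.vstack([np.cos(ang), np.sin(ang)])     # Vref(N), cf. (65)
    kN = (2 + np.cos(2 * np.pi / N)) / (6 * N * np.sin(2 * np.pi / N))
    aN = 2.0 / (N * np.sin(2 * np.pi / N))            # alpha_N, eq. (67)
    bN = 4.0 / (N * np.tan(np.pi / N))                # beta_N,  eq. (68)

    L = np.random.default_rng(0).normal(size=(2, 2))  # a random linear map
    V = L @ W                                         # affinely regular N-gon
    mu00 = abs(np.linalg.det(L))                      # eq. (53)
    I = kN * mu00 * (L @ L.T)                         # eq. (55), with C = 0

    S0 = (aN / (mu00 * kN)) * I                       # ellipse of the vertices
    S1 = (bN / (4 * mu00 * kN)) * I                   # ellipse of the mid-points
    for j in range(N):
        v, vn = V[:, j], V[:, (j + 1) % N]
        print(v @ np.linalg.solve(S0, v),                     # -> 1.0, eq. (69)
              0.25 * (v + vn) @ np.linalg.solve(S1, v + vn))  # -> 1.0, eq. (70)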
Now assume that V ∈ S2. Then, by assumption,

    vj^T Σ0^{-1} vj = 1    (77)

    (1/4)(v_{j+1} + vj)^T Σ1^{-1} (v_{j+1} + vj) = 1    (78)
Define the vertices of the related N-gon Z as

    zj = (μ00 αN)^{-1/2} S^{-1} U^T vj    (79)

where S and U are given by the normalized eigendecomposition of I in (57). Writing (77) and (78) in terms of the zj, after some algebraic manipulation, we get

    zj^T zj = 1    (80)

    (1/4)(z_{j+1} + zj)^T (z_{j+1} + zj) = cos^2(π/N)    (81)
From these identities, again with some algebraic manipulation, it easily follows that Z is equilateral. Specifically,

    ||z_{j+1} - zj|| = 2 sin(π/N)    (82)
Since the above identities show that the N-gon Z is a regular N-gon inscribed in the unit circle, it must be related to Vref(N) through a scaling and some orthogonal transformation T2. In particular,

    Z = αN^{-1/2} T2 Vref(N) = √(N sin(2π/N) / 2) T2 Vref(N)    (83)

This in turn shows that

    (μ00 αN)^{-1/2} S^{-1} U^T V = αN^{-1/2} T2 Vref(N)    (84)

or, after solving for V and simplifying,

    V = √μ00 U S T2 Vref(N).    (85)

Letting L = √μ00 U S, we obtain

    V = (L T2) Vref(N).    (86)
This last identity implies that V has moments μ00(V) = μ00, μ10(V) = μ01(V) = 0, and

    I(V) = kN L L^T |det(L)|    (87)

with det(I(V)) = kN^2 μ00^4. Hence V ∈ S1, and the result is established. □
If I is not the inertia matrix of an affinely regular N-gon, then the L constructed in the Initial Guess Algorithm will not have the prescribed inertia matrix. We are, however, able to explicitly compute the approximation error in the following way.
Result 2 Suppose that the moments μ00, μ10 = μ01 = 0, and

    I = [ μ20  μ11
          μ11  μ02 ]    (88)

are given such that det(I) = kN^2 μ00^4 + ε > 0. Define

    L = √μ00 U S    (89)

where

    I / √det(I) = U S^2 U^T    (90)

is the normalized eigendecomposition of I. Then the normalized Frobenius-norm error is given by

    ||μ00 kN L L^T - I||_F / ||I||_F = |1 - kN μ00^2 / √(kN^2 μ00^4 + ε)|    (91)
Proof: Letting A = μ00 kN L L^T and B = I, we can write

    A = a U S^2 U^T,    (92)

    B = b U S^2 U^T,    (93)

where

    S^2 = [ λ  0
            0  1/λ ]    (94)

    a = kN μ00^2,    (95)

    b = √(kN^2 μ00^4 + ε).    (96)

Hence we have

    ||A - B||_F = ||(a - b) S^2||_F = |a - b| √(λ^2 + 1/λ^2)    (97)

    ||B||_F = |b| √(λ^2 + 1/λ^2)    (98)
Figure 21: Relative error in matching second-order moments using the Initial Guess Algorithm, as a function of ε, for N = 3 and N = 1000
Hence,

    ||A - B||_F / ||B||_F = |a - b| / |b| = |√(kN^2 μ00^4 + ε) - kN μ00^2| / √(kN^2 μ00^4 + ε)    (99)

which establishes the result. □
We have plotted this expression for the relative error in Figure 21 for N = 3 and N = 1000, assuming that μ00 = 1. The figure shows that although the relative error grows quite fast as ε is increased, it never exceeds the maximum of 1 (i.e., 100 percent) for a fixed μ00. Also, the relative errors for different numbers of sides are seen to be very close.
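The closed form (91) is equally easy to tabulate directly; a small Python sketch assuming μ00 = 1 and the expression for kN from (48):

    import numpy as np

    def relative_error(eps, N, mu00=1.0):
        # Normalized Frobenius-norm error of eq. (91).
        kN = (2 + np.cos(2 * np.pi / N)) / (6 * N * np.sin(2 * np.pi / N))
        return 1.0 - kN * mu00**2 / np.sqrt((kN * mu00**2) ** 2 + eps)

    eps = np.linspace(0.0, 1.0, 6)
    for N in (3, 1000):
        # Rises quickly with eps but never exceeds 1, nearly independent of N.
        print(N, np.round(relative_error(eps, N), 3))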
References

[1] M. Bergstrom, J. Litton, L. Ericksson, C. Bohm, and G. Blomqvist, "Determination of object contour from projections for attenuation correction in cranial positron emission tomography," J. Comput. Assist. Tomography, vol. 6, pp. 365-372, 1982.

[2] J.-P. Thirion, "Segmentation of tomographic data without image reconstruction," IEEE Transactions on Medical Imaging, vol. 11, pp. 102-110, March 1992.

[3] N. Srinivasa, K. Ramakrishnan, and K. Rajgopal, "Detection of edges from projections," IEEE Transactions on Medical Imaging, vol. 11, pp. 76-80, March 1992.

[4] W. Munk and C. Wunsch, "Ocean acoustic tomography: a scheme for large scale monitoring," Deep-Sea Research, vol. 26A, pp. 123-161, 1979.

[5] G. T. Herman, Image Reconstruction From Projections. New York: Academic Press, 1980.

[6] D. Rossi and A. Willsky, "Reconstruction from projections based on detection and estimation of objects, parts I and II: Performance analysis and robustness analysis," IEEE Trans. on Acoustics, Speech and Signal Proc., vol. 32, pp. 886-906, August 1984.

[7] J. L. Prince, Geometric Model-based Estimation from Projections. PhD thesis, MIT, Dept. of EECS, 1988.

[8] A. Lele, "Convex set estimation from support line measurements," Master's thesis, MIT, Dept. of EECS, 1990.

[9] K. Hanson, "Tomographic reconstruction of axially symmetric objects from a single radiograph," in High Speed Photography, (Strasbourg), 1984. Vol. 491.

[10] W. C. Karl, Reconstructing Objects from Projections. PhD thesis, MIT, Dept. of EECS, 1991.

[11] Y. Bresler and A. Macovski, "Estimation of 3-d shape of blood vessels from X-ray images," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, March 1984.

[12] J. A. Fessler and A. Macovski, "Object-based 3-d reconstruction of arterial trees from magnetic resonance angiograms," IEEE Trans. on Medical Imag., vol. 10, pp. 25-39, March 1991.

[13] S. Chang, "The reconstruction of binary patterns from their projections," Comm. of the ACM, vol. 14, no. 1, 1971.

[14] A. Kuba, "Reconstruction of measurable plane sets from their two projections taken in arbitrary directions," Inverse Problems, vol. 7, pp. 101-107, 1991.

[15] A. Volcic, "A three point solution to Hammer's X-ray problem," Journal of the London Mathematical Society, vol. 2, no. 34, pp. 349-359, 1986.

[16] P. C. Fishburn, J. Lagarias, J. Reeds, and L. A. Shepp, "Sets uniquely determined by projections on axes I. Continuous case," SIAM J. Appl. Math., vol. 50, no. 1, pp. 288-306, 1990.

[17] R. J. Gardner, "Symmetrals and X-rays of planar convex bodies," Arch. Math., vol. 41, pp. 183-189, 1983.

[18] S. Helgason, The Radon Transform. Boston: Birkhauser, 1980.

[19] R. A. Knox, Ocean Circulation Models: Combining Data and Dynamics, vol. 284 of NATO ASI Series, ch. Ocean Acoustic Tomography: A Primer, pp. 141-188. Kluwer Academic Publishers, 1989.

[20] H. Van Trees, Detection, Estimation, and Modulation Theory: Part I. John Wiley and Sons, 1968.

[21] J. Rissanen, Stochastic Complexity in Statistical Inquiry, vol. 15 of Series in Computer Science. World Scientific, 1989.

[22] P. Milanfar, W. C. Karl, and A. S. Willsky, "Recovering the moments of a function from its Radon-transform projections: Necessary and sufficient conditions," LIDS Technical Report LIDS-P-2113, MIT, Laboratory for Information and Decision Systems, June 1992.

[23] G. Talenti, "Recovering a function from a finite number of moments," Inverse Problems, vol. 3, pp. 501-517, 1987.

[24] J. Shohat and J. Tamarkin, The Problem of Moments. New York: American Mathematical Society, 1943.

[25] M. Pawlak, "On the reconstruction aspects of moment descriptors," IEEE Trans. Info. Theory, vol. 38, pp. 1698-1708, November 1992.

[26] N. Akhiezer, The Classical Moment Problem and Some Related Questions in Analysis. New York: Hafner, 1965.

[27] P. Milanfar, Geometric Estimation and Reconstruction from Tomographic Data. PhD thesis, MIT, Department of Electrical Engineering, June 1993.

[28] M. Berger, Geometry I and II. Springer-Verlag, 1987.

[29] P. J. Davis, "Plane regions determined by complex moments," Journal of Approximation Theory, vol. 19, pp. 148-153, 1977.