Gaussian profile estimation in two dimensions

Nathan Hagen(1,*) and Eustace L. Dereniak(2)
(1) Fitzpatrick Institute for Photonics, Duke University, Durham, North Carolina 27708, USA
(2) College of Optical Sciences, University of Arizona, Tucson, Arizona 85721, USA
*Corresponding author: nhagen@optics.arizona.edu
Received 31 January 2008; revised 7 June 2008; accepted 20 October 2008; posted 5 November 2008 (Doc. ID 92236); published 15 December 2008
We extend recent results for estimating the parameters of a one-dimensional Gaussian profile to two-dimensional profiles, deriving the exact covariance matrix of the estimated parameters. While the exact form is easy to compute, we provide a set of close approximations that allow the covariance to take on a simple analytic form. This not only provides new insight into the behavior of the estimation parameters, but also lays a foundation for clarifying previously published work. We also show how to calculate the parameter variances for the case of truncated sampling, where the profile lies near the edge of the array detector. Finally, we calculate expressions for the bias in the classical formulation of the problem and provide an approach for its removal. This allows us to show how the bias affects the problem of choosing an optimal pixel size for minimizing parameter variances. © 2008 Optical Society of America
OCIS codes: 000.5490, 100.2960, 300.3700.
1. Introduction
In a previous paper [1], we outlined an approach for using maximum-likelihood estimation (MLE) of one-dimensional (1D) Gaussian profile parameters from data corrupted by noise. Here we extend the method to the two-dimensional (2D) case—estimating a 2D Gaussian profile from an image—to incorporate added parameters while retaining a fast algorithm. This estimation procedure can be useful in fields such as Gaussian beam characterization [2], astrometry [3], wavefront sensing [4], bioimaging [5,6], and calibrations of computational sensors, such as computed tomography instruments [7].
The initial approach we use here makes no assumptions on the sampling—uniform or nonuniform—of the profile. Bad samples, such as from inoperative pixels, are easily taken care of by simply deleting them from the data set. The Gaussian function being measured need not even have most of its volume in the sampled region (i.e., its peak may be located off the edge of the detector array), though the nonlinear optimization procedure may have difficulty in locating the global maximum. A good initial guess is required in such a case.
The estimation procedure we use is the same as
that for the 1D problem, namely:
1. Assuming Gaussian- or Poisson-distributed additive noise, we first form the log-likelihood function $\ell = \ln\,\mathrm{pr}(\mathbf{g}|\theta)$ and then calculate its gradient (the "score") $\nabla\ell$ and its Hessian matrix $H$.
2. These three functions are used in a nonlinear optimization routine (such as Newton iteration) to solve for the parameter set $\theta$ that maximizes the likelihood, via
\[
\theta^{(k+1)} = \theta^{(k)} - \bigl(H^{(k)}\bigr)^{-1} \nabla\ell^{(k)},
\]
where a superscript $(k)$ indicates the iteration index and $H^{(k)}$ is the Hessian matrix of $\ell$ evaluated at $\theta^{(k)}$. (A minimal code sketch of this update is given at the end of this section.)
3. To determine the accuracy of the estimates, we use the Cramér–Rao bound to calculate the covariance matrix $K$ of the parameter estimators, obtained from the exact Fisher information matrix $F$ by $K = F^{-1}$.
4. Alternatively, an analytic approximation to K
can be obtained by first approximating F and then
performing an analytic inverse on the resulting
simplified matrix.
5. Since each model is actually biased due to the
conventional approximation of the pixel response
functions as δ functions, we also provide an analytic
expression for the bias under the rect-sampling model. Bias correction can then be performed by using
the ML estimate to calculate the bias, which is then
subtracted from the raw data for a second iteration of
the estimation algorithm.
For each of the 2D Gaussian models discussed in
this paper—the symmetric Gaussian (circular cross
section), the asymmetric separable Gaussian (elliptical cross section), and the general case (elliptical
cross section unaligned to coordinate axes)—we provide the expressions for the above steps.
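As an illustration of the Newton update in step 2, the following minimal Python sketch iterates $\theta \leftarrow \theta - H^{-1}\nabla\ell$ given user-supplied score and Hessian functions. The function names and tolerances are assumptions of this sketch, not part of the paper's algorithm.

import numpy as np

def newton_mle(theta0, score, hessian, tol=1e-8, max_iter=50):
    # Newton iteration for the MLE: theta <- theta - H^{-1} grad(l).
    # `score(theta)` and `hessian(theta)` return the gradient and Hessian
    # of the log-likelihood for the chosen Gaussian model (assumed inputs).
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hessian(theta), score(theta))
        theta = theta - step
        if np.max(np.abs(step)) < tol:
            break
    return theta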
2. Symmetric 2D Gaussian Profile Model
The simplest type of 2D Gaussian profile is a
rotationally symmetric object function, which we
model as
\[
f(\mathbf{r}) = A\, e^{-(\mathbf{r}-\bar{\mathbf{r}})^2/2w^2},
\]
where $A$ is the peak amplitude, $\bar{\mathbf{r}} = (\bar{x}, \bar{y})$ is the position of the peak (the "center"), and $w$ is the Gaussian width. The object is a continuous function, sampled by the detection process to produce a discrete data vector $\bar{\mathbf{g}}$, where
\[
\bar{g}_m = \delta_x \delta_y Q_m \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f(x,y)\,\delta(x - x_m)\,\delta(y - y_m)\,dx\,dy
= \delta_x \delta_y Q_m A\, e^{-(\mathbf{r}_m - \bar{\mathbf{r}})^2/2w^2}. \tag{1}
\]
The 2D image data are thus mapped into a 1D vector, where $\delta(\cdot)$ is the Dirac delta function, $m$ is the pixel index ($1 \le m \le M$), $\mathbf{r}_m$ gives the abscissa position vector for the center of pixel $m$, and $\delta_x$, $\delta_y$ give the $x$ and $y$ dimensions of the pixels. An example $\bar{\mathbf{g}}$ is illustrated in Fig. 1. In Eq. (1), the quantity $Q_m$ is the gain for pixel $m$, giving the number of digital counts per detected photoelectron. (Note that detector gain is often quoted as photoelectrons per digital count—the inverse of $Q_m$ here.) In general, $Q_m$ can vary significantly with position on the detector array.
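A small sketch of the delta-sampling model of Eq. (1) in Python follows; the numerical values (amplitude, center, width, pixel pitch, gain, grid size) are illustrative assumptions, not values from the paper.

import numpy as np

A, xbar, ybar, w = 100.0, 0.3, -0.2, 2.0       # amplitude, center, width (pixel units)
dx = dy = 1.0                                   # pixel dimensions delta_x, delta_y
Q = 1.0                                         # gain, counts per photoelectron
n = 16                                          # 16 x 16 pixel grid
xm, ym = np.meshgrid(np.arange(n) - n/2 + 0.5, np.arange(n) - n/2 + 0.5)
# noiseless data vector gbar of Eq. (1), flattened to 1D with pixel index m
gbar = (dx*dy*Q*A*np.exp(-((xm - xbar)**2 + (ym - ybar)**2)/(2*w**2))).ravel()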
We model the measurement result $\mathbf{g}$ as the sum of a noiseless discrete data vector and a zero-mean noise vector, $\mathbf{g} = \bar{\mathbf{g}} + \mathbf{n}$, where the noise vector $\mathbf{n}$ is a Gaussian-distributed random variable such that
\[
\mathrm{pr}(g_m) = \frac{1}{\sqrt{2\pi\sigma_m^2}}\, e^{-(g_m - \bar{g}_m)^2/2\sigma_m^2}, \tag{2}
\]
where $\sigma_m^2$ is the variance of the noise at pixel $m$, given in units of digital counts.
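The two noise models can be simulated as in the short sketch below, which takes the noiseless vector of Eq. (1) as input. The function name, seed, and noise level are assumptions of this sketch.

import numpy as np

def add_noise(gbar, Q=1.0, sigma=2.0, seed=0):
    # gbar: noiseless data vector of Eq. (1), in digital counts.
    rng = np.random.default_rng(seed)
    g_flat = gbar + rng.normal(0.0, sigma, size=gbar.shape)   # Eq. (2), flat sigma
    # Poisson alternative: the photoelectron counts gbar/Q are Poisson
    # distributed, then scaled back to counts by the gain Q, so the
    # variance in counts is gbar*Q (see note [10]).
    g_poisson = Q*rng.poisson(gbar/Q)
    return g_flat, g_poisson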
The likelihood of a given parameter set $\theta = (A, \bar{x}, \bar{y}, w)$ producing a measured image $\mathbf{g}$ is defined as
\[
\mathcal{L}(\theta|\mathbf{g}) = \mathrm{pr}(\mathbf{g}|\theta).
\]
Taking the logarithm, $\ell = \log \mathcal{L}$, it is easy to derive (see Eq. 6 in [1]) that the gradient has the form
\[
\frac{\partial \ell}{\partial \theta_i} = \sum_m \frac{1}{\sigma_m^2}\,(g_m - \bar{g}_m)\,\frac{\partial \bar{g}_m}{\partial \theta_i}. \tag{3}
\]
So far, we have assumed a Gaussian-distributed noise model. If we rederive the likelihood from a Poisson-noise model, we find that we again obtain Eq. (3), but with $\sigma_m^2$ replaced by $\bar{g}_m/Q_m$. Thus, although we continue to work with a likelihood function derived from a Gaussian-noise model below, it is left understood that the same results can be achieved under Poisson noise by setting $\sigma_m^2 = \bar{g}_m/Q_m$.
For the symmetric Gaussian profile, the noise model gives
\[
\frac{\partial \ell}{\partial A} = \sum_m \gamma_m, \qquad
\frac{\partial \ell}{\partial \bar{x}} = \frac{A}{w^2} \sum_m \gamma_m \xi_m, \qquad
\frac{\partial \ell}{\partial \bar{y}} = \frac{A}{w^2} \sum_m \gamma_m \eta_m, \qquad
\frac{\partial \ell}{\partial w} = \frac{A}{w^3} \sum_m \gamma_m \rho_m^2,
\]
where we define
\[
\xi_m = (x_m - \bar{x}), \qquad \eta_m = (y_m - \bar{y}), \qquad \rho_m = \sqrt{\xi_m^2 + \eta_m^2},
\]
\[
\gamma_m = \frac{\delta_x \delta_y}{\sigma_m^2}\,(g_m - \bar{g}_m)\, Q_m E_m, \qquad
E_m = e^{-(\xi_m^2 + \eta_m^2)/2w^2}.
\]
Fig. 1. Example noiseless symmetric 2D Gaussian profile, sampled on a pixel grid.
All of the sums are taken over the pixel index, $1 \le m \le M$. The Hessian matrix is
\[
H = \begin{pmatrix}
\frac{\partial^2 \ell}{\partial A^2} & \frac{\partial^2 \ell}{\partial A\,\partial \bar{x}} & \frac{\partial^2 \ell}{\partial A\,\partial \bar{y}} & \frac{\partial^2 \ell}{\partial A\,\partial w} \\
\frac{\partial^2 \ell}{\partial \bar{x}\,\partial A} & \frac{\partial^2 \ell}{\partial \bar{x}^2} & \frac{\partial^2 \ell}{\partial \bar{x}\,\partial \bar{y}} & \frac{\partial^2 \ell}{\partial \bar{x}\,\partial w} \\
\frac{\partial^2 \ell}{\partial \bar{y}\,\partial A} & \frac{\partial^2 \ell}{\partial \bar{y}\,\partial \bar{x}} & \frac{\partial^2 \ell}{\partial \bar{y}^2} & \frac{\partial^2 \ell}{\partial \bar{y}\,\partial w} \\
\frac{\partial^2 \ell}{\partial w\,\partial A} & \frac{\partial^2 \ell}{\partial w\,\partial \bar{x}} & \frac{\partial^2 \ell}{\partial w\,\partial \bar{y}} & \frac{\partial^2 \ell}{\partial w^2}
\end{pmatrix}.
\]
$H$ is symmetric, so it is only necessary to calculate the upper triangular portion. Performing the partial derivative calculations gives, for the matrix elements,
\[
\begin{aligned}
H_{11} &= -\delta_x^2 \delta_y^2 \sum_m \frac{1}{\sigma_m^2}\, Q_m^2 E_m^2, &
H_{12} &= \frac{1}{w^2} \sum_m \Gamma_m \xi_m, \\
H_{13} &= \frac{1}{w^2} \sum_m \Gamma_m \eta_m, &
H_{14} &= \frac{1}{w^3} \sum_m \Gamma_m \rho_m^2, \\
H_{22} &= \frac{A}{w^4} \sum_m \bigl(\Gamma_m \xi_m^2 - \gamma_m w^2\bigr), &
H_{23} &= \frac{A}{w^4} \sum_m \Gamma_m \xi_m \eta_m, \\
H_{24} &= \frac{A}{w^5} \sum_m \bigl(\Gamma_m \xi_m \rho_m^2 - 2\gamma_m \xi_m w^2\bigr), &
H_{33} &= \frac{A}{w^4} \sum_m \bigl(\Gamma_m \eta_m^2 - \gamma_m w^2\bigr), \\
H_{34} &= \frac{A}{w^5} \sum_m \bigl(\Gamma_m \eta_m \rho_m^2 - 2\gamma_m \eta_m w^2\bigr), &
H_{44} &= \frac{A}{w^6} \sum_m \bigl(\Gamma_m \rho_m^4 - 3\gamma_m \rho_m^2 w^2\bigr),
\end{aligned}
\]
where $\Gamma_m = \frac{\delta_x \delta_y}{\sigma_m^2}(g_m - 2\bar{g}_m)\, Q_m E_m$ and $\gamma_m$ is as defined above.

The Fisher information matrix is obtained as the negative of the elementwise expectation of the Hessian matrix [8], with the expectation taken over the data:
\[
F_{ij} = -E\{H_{ij}\} = -\int \frac{\partial^2 \ell}{\partial \theta_i\,\partial \theta_j}\, \mathrm{pr}(\mathbf{g}|\theta)\, d^M g, \tag{4}
\]
producing
\[
\begin{aligned}
F_{11} &= \sum_m R_m, &
F_{12} &= \frac{A}{w^2} \sum_m R_m \xi_m, &
F_{13} &= \frac{A}{w^2} \sum_m R_m \eta_m, \\
F_{14} &= \frac{A}{w^3} \sum_m R_m \rho_m^2, &
F_{22} &= \frac{A^2}{w^4} \sum_m R_m \xi_m^2, &
F_{23} &= \frac{A^2}{w^4} \sum_m R_m \xi_m \eta_m, \\
F_{24} &= \frac{A^2}{w^5} \sum_m R_m \rho_m^2 \xi_m, &
F_{33} &= \frac{A^2}{w^4} \sum_m R_m \eta_m^2, &
F_{34} &= \frac{A^2}{w^5} \sum_m R_m \rho_m^2 \eta_m, \\
F_{44} &= \frac{A^2}{w^6} \sum_m R_m \rho_m^4 &&&&
\end{aligned} \tag{5}
\]
for $R_m = \delta_x^2 \delta_y^2 Q_m^2 E_m^2 / \sigma_m^2$. Note that the notation $d^M g$ expresses the differential for an $M$-dimensional integral over all data elements $g_m$.

When the Cramér–Rao bound is met (asymptotically for a large number of measurements $M$), the covariance matrix $K$ is given by the inverse of the above matrix, which can be obtained either analytically by Cramer's rule or numerically. The analytic expressions for $K$ are quite involved and so provide limited insight into the algorithm. To gain a better understanding of the relationship between the estimator variances and the object parameters, we introduce approximations that simplify the elements of $F$ prior to taking its inverse. These approximations are:

1. Flat noise: $\sigma_m = \sigma$, i.e., all pixels share the same noise variance [9].
2. Uniform gain: $Q_m = Q$.
3. Uniform sampling: the spacing $(\delta_x, \delta_y)$ between sample locations $(x_m, y_m)$ is constant.
4. Complete sampling: the data region $x_{\min} \le x \le x_{\max}$, $y_{\min} \le y \le y_{\max}$ is sufficient such that the Gaussian $f(\mathbf{r})$ is approximately zero outside of the sampled region.
5. The profile is well sampled: since each of the $F_{ij}$ are sums of sampled Gaussian functions, we can approximate the sums with integrals and then replace the integrals with known analytic results.

With these approximations, and noting that $\delta_x = \delta\xi$, $\delta_y = \delta\eta$, we can write
\[
\sum_m E_m^2 = \frac{1}{\delta_x \delta_y} \sum_m e^{-(\xi_m^2 + \eta_m^2)/w^2}\, \delta\xi\, \delta\eta
\approx \frac{1}{\delta_x \delta_y} \int_{-\infty}^{\infty} e^{-\xi^2/w^2}\, d\xi \int_{-\infty}^{\infty} e^{-\eta^2/w^2}\, d\eta = \frac{\pi w^2}{\delta_x \delta_y}
\]
and, similarly,
\[
\sum_m E_m^2 \xi_m \approx 0, \quad
\sum_m E_m^2 \xi_m \eta_m \approx 0, \quad
\sum_m E_m^2 \xi_m^2 \approx \frac{\pi w^4}{2\delta_x \delta_y}, \quad
\sum_m E_m^2 \rho_m^2 \approx \frac{\pi w^4}{\delta_x \delta_y}, \quad
\sum_m E_m^2 \rho_m^4 \approx \frac{2\pi w^6}{\delta_x \delta_y}. \tag{6}
\]
Constructing an approximate form of $F$ from these expressions, we can analytically calculate its inverse to give the parameter covariance matrix $K$:
\[
K_{\mathrm{Flat}} \approx \frac{\sigma^2}{\pi \delta_x \delta_y Q^2}
\begin{pmatrix}
\frac{2}{w^2} & 0 & 0 & \frac{-1}{Aw} \\
0 & \frac{2}{A^2} & 0 & 0 \\
0 & 0 & \frac{2}{A^2} & 0 \\
\frac{-1}{Aw} & 0 & 0 & \frac{1}{A^2}
\end{pmatrix}. \tag{7}
\]
Alternatively, if we assume that the noise variance is not the same at all pixels (i.e., not flat) but rather is determined from the Poisson distribution of the mean signal $\bar{g}_m$ at each pixel, then we can substitute $\sigma_m^2 = \bar{g}_m Q_m = \delta_x \delta_y A Q^2 E_m$ and, recalculating $F$, obtain a modified covariance matrix [10]:
\[
K_{\mathrm{Poisson}} \approx \frac{1}{2\pi}
\begin{pmatrix}
\frac{2A}{w^2} & 0 & 0 & \frac{-1}{2w} \\
0 & \frac{1}{A} & 0 & 0 \\
0 & 0 & \frac{1}{A} & 0 \\
\frac{-1}{2w} & 0 & 0 & \frac{1}{4A}
\end{pmatrix}. \tag{8}
\]
The parameter variances can thus be written out explicitly as
\[
\mathrm{var}(\hat{A}) \approx \begin{cases} 2\beta/w^2 & \text{(flat)} \\ A/(2\pi w^2) & \text{(Poisson)} \end{cases}, \qquad
\mathrm{var}(\hat{x}) = \mathrm{var}(\hat{y}) \approx \begin{cases} 2\beta/A^2 & \text{(flat)} \\ 1/(2\pi A) & \text{(Poisson)} \end{cases}, \qquad
\mathrm{var}(\hat{w}) \approx \begin{cases} \beta/A^2 & \text{(flat)} \\ 1/(8\pi A) & \text{(Poisson)} \end{cases}, \tag{9}
\]
where $\beta = \sigma^2/(\pi \delta_x \delta_y Q^2)$.
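As a numerical check on these expressions, the sketch below builds the exact Fisher matrix for the symmetric model under flat Gaussian noise from the standard relation $F_{ij} = \sum_m (1/\sigma^2)(\partial\bar{g}_m/\partial\theta_i)(\partial\bar{g}_m/\partial\theta_j)$ and compares its inverse with the approximate variances of Eq. (9). All numerical values are illustrative assumptions, not values from the paper.

import numpy as np

A, xbar, ybar, w = 100.0, 0.0, 0.0, 2.0
dx = dy = 1.0
Q, sigma = 1.0, 2.0
n = 32
xm, ym = [c.ravel() for c in np.meshgrid(np.arange(n) - n/2 + 0.5,
                                          np.arange(n) - n/2 + 0.5)]
xi, eta = xm - xbar, ym - ybar
rho2 = xi**2 + eta**2
E = np.exp(-rho2/(2*w**2))

# partial derivatives of gbar_m = dx*dy*Q*A*E_m with respect to (A, xbar, ybar, w)
dg = np.column_stack([dx*dy*Q*E,
                      dx*dy*Q*A*E*xi/w**2,
                      dx*dy*Q*A*E*eta/w**2,
                      dx*dy*Q*A*E*rho2/w**3])
F = dg.T @ dg / sigma**2          # exact Fisher information matrix (flat noise)
K = np.linalg.inv(F)              # Cramer-Rao covariance bound

beta = sigma**2/(np.pi*dx*dy*Q**2)
print(np.diag(K))                                        # exact variances
print(2*beta/w**2, 2*beta/A**2, 2*beta/A**2, beta/A**2)   # Eq. (9), flat-noise case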
To calculate the volume $U$ under the Gaussian spot, we first form its estimator. Since
\[
U = A \int d\xi \int d\eta\; e^{-(\xi^2 + \eta^2)/2w^2} = 2\pi A w^2,
\]
the estimator is $\hat{U} = 2\pi \hat{A}\hat{w}^2$. Using the elements of the covariance matrix to estimate the variance of $\hat{U}$ under the flat-noise and Poisson-noise approximations ([8], pp. 45–46),
\[
\mathrm{var}(\hat{U}) \approx \left.\frac{\partial U}{\partial \theta}\right|_{\hat\theta}^{T} K\, \left.\frac{\partial U}{\partial \theta}\right|_{\hat\theta}
\approx \begin{cases} 8\pi^2 \beta w^2 & \text{(flat)} \\ 2\pi A w^2 & \text{(Poisson)} \end{cases}.
\]
Note, again, that these simple expressions for the parameter variances hold only when the assumptions of completely, uniformly, and well-sampled data are met.

3. Separable 2D Gaussian Profile Model

For a separable Gaussian model function with nonequal widths $w_x$ and $w_y$ (Fig. 2), we have, using the same nomenclature as Section 2 above,
\[
f(\mathbf{r}) = A\, e^{-(x - \bar{x})^2/2w_x^2}\, e^{-(y - \bar{y})^2/2w_y^2},
\]
$\bar{g}_m = \delta_x \delta_y Q_m f(x_m, y_m)$, and $g_m = \bar{g}_m + n(x_m, y_m)$, where the vector of parameters is now $\theta = (A, \bar{x}, \bar{y}, w_x, w_y)$.

Fig. 2. Example noiseless asymmetric 2D Gaussian profile, sampled on a pixel grid.

Forming the log-likelihood function $\ell$, its gradient $\nabla\ell$, and the Hessian matrix $H$ for this model under additive Gaussian noise, we obtain
\[
\begin{aligned}
[\nabla\ell]_1 &= \frac{\partial \ell}{\partial A} = \sum_m \gamma_m, &
[\nabla\ell]_2 &= \frac{\partial \ell}{\partial \bar{x}} = \frac{A}{w_x^2} \sum_m \gamma_m \xi_m, &
[\nabla\ell]_3 &= \frac{\partial \ell}{\partial \bar{y}} = \frac{A}{w_y^2} \sum_m \gamma_m \eta_m, \\
[\nabla\ell]_4 &= \frac{\partial \ell}{\partial w_x} = \frac{A}{w_x^3} \sum_m \gamma_m \xi_m^2, &
[\nabla\ell]_5 &= \frac{\partial \ell}{\partial w_y} = \frac{A}{w_y^3} \sum_m \gamma_m \eta_m^2, &&
\end{aligned}
\]
where $E_m = e^{-\xi_m^2/2w_x^2}\, e^{-\eta_m^2/2w_y^2}$, $\xi_m = (x_m - \bar{x})$, $\eta_m = (y_m - \bar{y})$, and $\gamma_m = \frac{\delta_x \delta_y}{\sigma_m^2}(g_m - \bar{g}_m)\, Q_m E_m$. The Hessian matrix elements are
\[
\begin{aligned}
H_{11} &= -\delta_x^2 \delta_y^2 \sum_m \frac{Q_m^2 E_m^2}{\sigma_m^2}, &
H_{12} &= \frac{1}{w_x^2} \sum_m \Gamma_m \xi_m, &
H_{13} &= \frac{1}{w_y^2} \sum_m \Gamma_m \eta_m, \\
H_{14} &= \frac{1}{w_x^3} \sum_m \Gamma_m \xi_m^2, &
H_{15} &= \frac{1}{w_y^3} \sum_m \Gamma_m \eta_m^2, &
H_{22} &= \frac{A}{w_x^4} \sum_m \bigl(\Gamma_m \xi_m^2 - \gamma_m w_x^2\bigr), \\
H_{23} &= \frac{A}{w_x^2 w_y^2} \sum_m \Gamma_m \xi_m \eta_m, &
H_{24} &= \frac{A}{w_x^5} \sum_m \bigl(\Gamma_m \xi_m^3 - 2\gamma_m \xi_m w_x^2\bigr), &
H_{25} &= \frac{A}{w_x^2 w_y^3} \sum_m \Gamma_m \xi_m \eta_m^2, \\
H_{33} &= \frac{A}{w_y^4} \sum_m \bigl(\Gamma_m \eta_m^2 - \gamma_m w_y^2\bigr), &
H_{34} &= \frac{A}{w_x^3 w_y^2} \sum_m \Gamma_m \xi_m^2 \eta_m, &
H_{35} &= \frac{A}{w_y^5} \sum_m \bigl(\Gamma_m \eta_m^3 - 2\gamma_m \eta_m w_y^2\bigr), \\
H_{44} &= \frac{A}{w_x^6} \sum_m \bigl(\Gamma_m \xi_m^4 - 3\gamma_m \xi_m^2 w_x^2\bigr), &
H_{45} &= \frac{A}{w_x^3 w_y^3} \sum_m \Gamma_m \xi_m^2 \eta_m^2, &
H_{55} &= \frac{A}{w_y^6} \sum_m \bigl(\Gamma_m \eta_m^4 - 3\gamma_m \eta_m^2 w_y^2\bigr),
\end{aligned}
\]
where $\Gamma_m = \frac{\delta_x \delta_y}{\sigma_m^2}(g_m - 2\bar{g}_m)\, Q_m E_m$. Since $E\{\Gamma_m\} = -\frac{\delta_x^2 \delta_y^2 Q_m^2}{\sigma_m^2} A E_m^2$ and $E\{\gamma_m\} = 0$, the Fisher information matrix $F$ here is
\[
\begin{aligned}
F_{11} &= \sum_m R_m, &
F_{12} &= \frac{A}{w_x^2} \sum_m R_m \xi_m, &
F_{13} &= \frac{A}{w_y^2} \sum_m R_m \eta_m, \\
F_{14} &= \frac{A}{w_x^3} \sum_m R_m \xi_m^2, &
F_{15} &= \frac{A}{w_y^3} \sum_m R_m \eta_m^2, &
F_{22} &= \frac{A^2}{w_x^4} \sum_m R_m \xi_m^2, \\
F_{23} &= \frac{A^2}{w_x^2 w_y^2} \sum_m R_m \xi_m \eta_m, &
F_{24} &= \frac{A^2}{w_x^5} \sum_m R_m \xi_m^3, &
F_{25} &= \frac{A^2}{w_x^2 w_y^3} \sum_m R_m \xi_m \eta_m^2, \\
F_{33} &= \frac{A^2}{w_y^4} \sum_m R_m \eta_m^2, &
F_{34} &= \frac{A^2}{w_x^3 w_y^2} \sum_m R_m \xi_m^2 \eta_m, &
F_{35} &= \frac{A^2}{w_y^5} \sum_m R_m \eta_m^3, \\
F_{44} &= \frac{A^2}{w_x^6} \sum_m R_m \xi_m^4, &
F_{45} &= \frac{A^2}{w_x^3 w_y^3} \sum_m R_m \xi_m^2 \eta_m^2, &
F_{55} &= \frac{A^2}{w_y^6} \sum_m R_m \eta_m^4,
\end{aligned}
\]
with $R_m = \delta_x^2 \delta_y^2 Q_m^2 E_m^2 / \sigma_m^2$. This is the exact form of $F$. Introducing the same approximations as used in Section 2, we obtain the following approximations of the above sums:
\[
\sum_m E_m^2 \approx \frac{\pi w_x w_y}{\delta_x \delta_y}, \quad
\sum_m E_m^2 \xi_m^2 \approx \frac{\pi w_x^3 w_y}{2\delta_x \delta_y}, \quad
\sum_m E_m^2 \eta_m^2 \approx \frac{\pi w_x w_y^3}{2\delta_x \delta_y}, \quad
\sum_m E_m^2 \xi_m^2 \eta_m^2 \approx \frac{\pi w_x^3 w_y^3}{4\delta_x \delta_y}, \quad
\sum_m E_m^2 \xi_m^4 \approx \frac{3\pi w_x^5 w_y}{4\delta_x \delta_y}, \quad
\sum_m E_m^2 \eta_m^4 \approx \frac{3\pi w_x w_y^5}{4\delta_x \delta_y}. \tag{10}
\]
(All sums containing odd powers of $\xi_m$ or $\eta_m$ are zero.) From these we readily obtain the approximate form of the covariance matrix $K$ for both flat and for Poisson noise:
\[
K_{\mathrm{Flat}} \approx \frac{\sigma^2}{\pi \delta_x \delta_y Q^2}
\begin{pmatrix}
\frac{2}{w_x w_y} & 0 & 0 & \frac{-1}{A w_y} & \frac{-1}{A w_x} \\
0 & \frac{2 w_x}{A^2 w_y} & 0 & 0 & 0 \\
0 & 0 & \frac{2 w_y}{A^2 w_x} & 0 & 0 \\
\frac{-1}{A w_y} & 0 & 0 & \frac{2 w_x}{A^2 w_y} & 0 \\
\frac{-1}{A w_x} & 0 & 0 & 0 & \frac{2 w_y}{A^2 w_x}
\end{pmatrix},
\]
\[
K_{\mathrm{Poisson}} \approx \frac{1}{2\pi}
\begin{pmatrix}
\frac{3A}{w_x w_y} & 0 & 0 & \frac{-1}{w_y} & \frac{-1}{w_x} \\
0 & \frac{w_x}{A w_y} & 0 & 0 & 0 \\
0 & 0 & \frac{w_y}{A w_x} & 0 & 0 \\
\frac{-1}{w_y} & 0 & 0 & \frac{2 w_x}{3 A w_y} & \frac{1}{3A} \\
\frac{-1}{w_x} & 0 & 0 & \frac{1}{3A} & \frac{2 w_y}{3 A w_x}
\end{pmatrix}.
\]
The parameter variances for $\hat\theta = (\hat{A}, \hat{x}, \hat{y}, \hat{w}_x, \hat{w}_y)$ are found on the diagonal of the covariance matrix $K$. The estimator for the energy of the Gaussian is $\hat{U} = 2\pi \hat{A} \hat{w}_x \hat{w}_y$ and its variance is
\[
\mathrm{var}(\hat{U}) \approx \begin{cases} \dfrac{8\pi \sigma^2 w_x w_y}{\delta_x \delta_y Q^2} & \text{(flat)} \\[2mm] 2\pi A w_x w_y & \text{(Poisson)} \end{cases}.
\]
Note that the separable (i.e., five-parameter) Gaussian model can also be used for a completely general 2D Gaussian profile (Section 4) when the azimuth angle of the Gaussian's elliptical cross section is known a priori. The data $\mathbf{g}$ used by the estimation algorithm is unmodified in this case, and the $x$ and $y$ abscissa inputs need only be rotated by the appropriate angle prior to estimation.
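The coordinate rotation described in the preceding paragraph can be written as a one-line transformation; the following sketch assumes the pixel-center abscissa arrays xm, ym and the known azimuth angle alpha as inputs.

import numpy as np

def rotate_abscissas(xm, ym, alpha):
    # Rotate the pixel-center coordinates by the known azimuth angle alpha so
    # that the five-parameter separable model can be fit in the rotated frame.
    # The data vector g itself is left unchanged.
    c, s = np.cos(alpha), np.sin(alpha)
    return c*xm + s*ym, -s*xm + c*ym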
4. General 2D Gaussian Model

A general 2D Gaussian model function (Fig. 3) is
\[
\bar{g}_m = A \delta_x \delta_y Q_m \exp\!\left[-\tfrac{1}{2}(\mathbf{r}_m - \bar{\mathbf{r}})^T C^{-1} (\mathbf{r}_m - \bar{\mathbf{r}})\right]
\equiv A \delta_x \delta_y Q_m E_m,
\]
where
\[
C^{-1} = \begin{pmatrix} c_1 & c_3 \\ c_3 & c_2 \end{pmatrix}
\]
and $\theta = (A, \bar{x}, \bar{y}, c_1, c_2, c_3)$. The matrix elements $c_1$ and $c_2$ are equivalent to the $1/w_y^2$, $1/w_x^2$ from the separable 2D profile, and $c_3$ represents the "covariance term." The tilt angle $\alpha$ of the Gaussian profile relative to the coordinate axes is then given by $\alpha = \frac{1}{2}\arctan\!\bigl[-2 c_3 \sqrt{c_1 c_2}/(c_1 - c_2)\bigr]$ [11].

Fig. 3. Example noiseless general 2D Gaussian profile, sampled on a pixel grid. (Note that the axis of the Gaussian is not aligned to the grid.)

The score vector $\nabla\ell$, Hessian matrix $H$, and Fisher information matrix $F$ for this model are
\[
\begin{aligned}
[\nabla\ell]_1 &= \frac{\partial \ell}{\partial A} = \sum_m \gamma_m, &
[\nabla\ell]_2 &= \frac{\partial \ell}{\partial \bar{x}} = A \sum_m \gamma_m (c_1 \xi_m + c_3 \eta_m), &
[\nabla\ell]_3 &= \frac{\partial \ell}{\partial \bar{y}} = A \sum_m \gamma_m (c_2 \eta_m + c_3 \xi_m), \\
[\nabla\ell]_4 &= \frac{\partial \ell}{\partial c_1} = -\frac{A}{2} \sum_m \gamma_m \xi_m^2, &
[\nabla\ell]_5 &= \frac{\partial \ell}{\partial c_2} = -\frac{A}{2} \sum_m \gamma_m \eta_m^2, &
[\nabla\ell]_6 &= \frac{\partial \ell}{\partial c_3} = -A \sum_m \gamma_m \xi_m \eta_m,
\end{aligned}
\]
for $\gamma_m = \frac{\delta_x \delta_y}{\sigma_m^2}(g_m - \bar{g}_m)\, Q_m E_m$, and
\[
\begin{aligned}
H_{11} &= -\delta_x^2 \delta_y^2 \sum_m \frac{Q_m^2 E_m^2}{\sigma_m^2}, &
H_{12} &= \sum_m \Gamma_m (c_1 \xi_m + c_3 \eta_m), &
H_{13} &= \sum_m \Gamma_m (c_2 \eta_m + c_3 \xi_m), \\
H_{14} &= -\frac{1}{2} \sum_m \Gamma_m \xi_m^2, &
H_{15} &= -\frac{1}{2} \sum_m \Gamma_m \eta_m^2, &
H_{16} &= -\sum_m \Gamma_m \xi_m \eta_m, \\
H_{22} &= A \sum_m \bigl[\Gamma_m (c_1 \xi_m + c_3 \eta_m)^2 - c_1 \gamma_m\bigr], &
H_{23} &= A \sum_m \bigl[\Gamma_m (c_1 \xi_m + c_3 \eta_m)(c_2 \eta_m + c_3 \xi_m) - c_3 \gamma_m\bigr], &
H_{24} &= -\frac{A}{2} \sum_m \bigl[\Gamma_m (c_1 \xi_m + c_3 \eta_m)\xi_m^2 - 2\gamma_m \xi_m\bigr], \\
H_{25} &= -\frac{A}{2} \sum_m \Gamma_m (c_1 \xi_m + c_3 \eta_m)\eta_m^2, &
H_{26} &= -A \sum_m \bigl[\Gamma_m (c_1 \xi_m + c_3 \eta_m)\xi_m \eta_m - \gamma_m \eta_m\bigr], &
H_{33} &= A \sum_m \bigl[\Gamma_m (c_2 \eta_m + c_3 \xi_m)^2 - c_2 \gamma_m\bigr], \\
H_{34} &= -\frac{A}{2} \sum_m \Gamma_m (c_2 \eta_m + c_3 \xi_m)\xi_m^2, &
H_{35} &= -\frac{A}{2} \sum_m \bigl[\Gamma_m (c_2 \eta_m + c_3 \xi_m)\eta_m^2 - 2\gamma_m \eta_m\bigr], &
H_{36} &= -A \sum_m \bigl[\Gamma_m (c_2 \eta_m + c_3 \xi_m)\xi_m \eta_m - \gamma_m \xi_m\bigr], \\
H_{44} &= \frac{A}{4} \sum_m \Gamma_m \xi_m^4, &
H_{45} &= \frac{A}{4} \sum_m \Gamma_m \xi_m^2 \eta_m^2, &
H_{46} &= \frac{A}{2} \sum_m \Gamma_m \xi_m^3 \eta_m, \\
H_{55} &= \frac{A}{4} \sum_m \Gamma_m \eta_m^4, &
H_{56} &= \frac{A}{2} \sum_m \Gamma_m \xi_m \eta_m^3, &
H_{66} &= A \sum_m \Gamma_m \xi_m^2 \eta_m^2,
\end{aligned}
\]
where $\Gamma_m = \frac{\delta_x \delta_y}{\sigma_m^2}(g_m - 2\bar{g}_m)\, Q_m E_m$. Once again, $E\{\Gamma_m\} = -\frac{\delta_x^2 \delta_y^2}{\sigma_m^2} A Q_m^2 E_m^2$ and $E\{\gamma_m\} = 0$, so that
\[
\begin{aligned}
F_{11} &= \sum_m R_m, &
F_{12} &= A \sum_m R_m (c_1 \xi_m + c_3 \eta_m), &
F_{13} &= A \sum_m R_m (c_2 \eta_m + c_3 \xi_m), \\
F_{14} &= -\frac{A}{2} \sum_m R_m \xi_m^2, &
F_{15} &= -\frac{A}{2} \sum_m R_m \eta_m^2, &
F_{16} &= -A \sum_m R_m \xi_m \eta_m, \\
F_{22} &= A^2 \sum_m R_m (c_1 \xi_m + c_3 \eta_m)^2, &
F_{23} &= A^2 \sum_m R_m (c_1 \xi_m + c_3 \eta_m)(c_2 \eta_m + c_3 \xi_m), &
F_{24} &= -\frac{A^2}{2} \sum_m R_m (c_1 \xi_m + c_3 \eta_m)\xi_m^2, \\
F_{25} &= -\frac{A^2}{2} \sum_m R_m (c_1 \xi_m + c_3 \eta_m)\eta_m^2, &
F_{26} &= -A^2 \sum_m R_m (c_1 \xi_m + c_3 \eta_m)\xi_m \eta_m, &
F_{33} &= A^2 \sum_m R_m (c_2 \eta_m + c_3 \xi_m)^2, \\
F_{34} &= -\frac{A^2}{2} \sum_m R_m (c_2 \eta_m + c_3 \xi_m)\xi_m^2, &
F_{35} &= -\frac{A^2}{2} \sum_m R_m (c_2 \eta_m + c_3 \xi_m)\eta_m^2, &
F_{36} &= -A^2 \sum_m R_m (c_2 \eta_m + c_3 \xi_m)\xi_m \eta_m, \\
F_{44} &= \frac{A^2}{4} \sum_m R_m \xi_m^4, &
F_{45} &= \frac{A^2}{4} \sum_m R_m \xi_m^2 \eta_m^2, &
F_{46} &= \frac{A^2}{2} \sum_m R_m \xi_m^3 \eta_m, \\
F_{55} &= \frac{A^2}{4} \sum_m R_m \eta_m^4, &
F_{56} &= \frac{A^2}{2} \sum_m R_m \xi_m \eta_m^3, &
F_{66} &= A^2 \sum_m R_m \xi_m^2 \eta_m^2,
\end{aligned}
\]
where, again, $R_m = \delta_x^2 \delta_y^2 Q_m^2 E_m^2 / \sigma_m^2$.

We can again use integral expressions to approximate $F$ to give analytic expressions for $K = F^{-1}$ under flat or Poisson noise:
\[
K_{\mathrm{F}} \approx \frac{2\sigma^2}{\pi \delta_x \delta_y Q^2}
\begin{pmatrix}
\square & 0 & 0 & \frac{\square}{A} & \frac{\square}{A} & \frac{\square}{A} \\
0 & \frac{c_2}{A^2 \mu} & \frac{c_3}{A^2 \mu} & 0 & 0 & 0 \\
0 & \frac{c_3}{A^2 \mu} & \frac{c_1}{A^2 \mu} & 0 & 0 & 0 \\
\frac{\square}{A} & 0 & 0 & \frac{\square}{A^2} & \frac{\square}{A^2} & \frac{\square}{A^2} \\
\frac{\square}{A} & 0 & 0 & \frac{\square}{A^2} & \frac{\square}{A^2} & \frac{\square}{A^2} \\
\frac{\square}{A} & 0 & 0 & \frac{\square}{A^2} & \frac{\square}{A^2} & \frac{\square}{A^2}
\end{pmatrix},
\]
\[
K_{\mathrm{P}} \approx \frac{\mu}{2\pi}
\begin{pmatrix}
2A & 0 & 0 & c_1 & c_2 & c_3 \\
0 & \frac{c_2}{A \mu^2} & \frac{c_3}{A \mu^2} & 0 & 0 & 0 \\
0 & \frac{c_3}{A \mu^2} & \frac{c_1}{A \mu^2} & 0 & 0 & 0 \\
c_1 & 0 & 0 & \frac{2 c_1^2}{A} & \frac{2 c_3^2}{A} & \frac{2 c_1 c_3}{A} \\
c_2 & 0 & 0 & \frac{2 c_3^2}{A} & \frac{2 c_2^2}{A} & \frac{2 c_2 c_3}{A} \\
c_3 & 0 & 0 & \frac{2 c_1 c_3}{A} & \frac{2 c_2 c_3}{A} & \frac{c_1 c_2 + c_3^2}{A}
\end{pmatrix},
\]
where $\mu = \sqrt{c_1 c_2 - c_3^2}$. Each instance of the symbol "$\square$" in the equation above represents a term that has no compact analytic expression but that depends solely on the matrix elements $c_1$, $c_2$, and $c_3$. Calculating the variance of $\hat{U}$ (the estimate of volume under the Gaussian profile), using the same procedure as in Sections 2 and 3, gives
\[
\mathrm{var}(\hat{U}) \approx \begin{cases} \square\, \dfrac{\sigma^2}{\delta_x \delta_y Q^2} & \text{(flat)} \\[2mm] \square\, A & \text{(Poisson)} \end{cases}.
\]
Thus, we again find $\mathrm{var}(\hat{U})$ to be independent of the input parameters $\bar{x}$ and $\bar{y}$, and for the case of flat noise, the variance is also independent of the profile height $A$.
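For the general model, the exact Fisher matrix (including the terms that have no compact analytic form) is easily evaluated numerically. The sketch below does so by finite-differencing the model $\bar{g}(\theta)$ under flat Gaussian noise; the function name, step size, and parameter values are assumptions for illustration only.

import numpy as np

def gbar(theta, xm, ym, dxdy=1.0, Q=1.0):
    # general six-parameter model: A*dx*dy*Q*exp(-(c1*xi^2 + 2*c3*xi*eta + c2*eta^2)/2)
    A, xb, yb, c1, c2, c3 = theta
    xi, eta = xm - xb, ym - yb
    return dxdy*Q*A*np.exp(-0.5*(c1*xi**2 + 2*c3*xi*eta + c2*eta**2))

n = 32
xm, ym = [c.ravel() for c in np.meshgrid(np.arange(n) - n/2 + 0.5,
                                          np.arange(n) - n/2 + 0.5)]
theta = np.array([100.0, 0.0, 0.0, 0.25, 0.11, 0.05])   # (A, xbar, ybar, c1, c2, c3)
sigma = 2.0                                             # flat noise level, counts

eps = 1e-6
dg = np.empty((xm.size, theta.size))
for i in range(theta.size):
    tp, tm = theta.copy(), theta.copy()
    tp[i] += eps
    tm[i] -= eps
    dg[:, i] = (gbar(tp, xm, ym) - gbar(tm, xm, ym)) / (2*eps)

F = dg.T @ dg / sigma**2       # exact Fisher information, flat Gaussian noise
K = np.linalg.inv(F)           # Cramer-Rao covariance matrix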
5. Truncated Sampling

Occasionally, it is necessary to estimate a Gaussian profile that has been truncated by lying at the edge of the sampled region (see Fig. 4), and we can expect that the estimation accuracy will fall when the profile has been asymmetrically sampled in this way. The numerical evaluation of the exact Fisher information matrix terms allows us to obtain the Cramér–Rao bound for this specific situation as well.

Fig. 4. Truncated sampling of a Gaussian profile, showing a case in which 60% of the volume under the profile lies outside of the region falling on the array detector.

A simulation illustrating the behavior of the estimation parameters as the Gaussian profile sampling is increasingly truncated is shown in Fig. 5. The simulation uses a symmetric 2D profile whose center $(\bar{x}, \bar{y})$ is at first well inside the sampling region (such that the "truncation parameter" $t = 0$), then approaches and goes beyond the vertical edge of the sampling region ($t = 0.6$, i.e., 60% of the volume under the profile is unsampled). For the example shown, the profile is sampled at a rate of 1/7 the Gaussian width (that is, an 8 × 8 pixel region contains ∼68% of the volume under the profile).

Fig. 5. The change in variance of estimation parameters as the sampling of the profile is increasingly truncated. The truncation parameter t is defined simply as the fraction of volume under the Gaussian profile which lies outside of the sampling region. In the figures, lines indicate the parameter variances as calculated numerically from the exact Fisher information matrix. (The analytic approximations are not valid for the case of a truncated profile.) Dots indicate variance estimates from a Monte Carlo simulation. The variances have been normalized to the completely sampled data's parameter variances. In both figures, the value of $\bar{x}$ is allowed to vary while all other model parameters are held constant.

The results show that the variance of the profile peak estimate is little affected until the truncation becomes quite large. And, along the truncation direction, the profile center estimate suffers much more than does the accuracy orthogonal to the truncation direction.

6. Bias in the Standard Model

As mentioned in the previous paper [1], the model used here is biased, in the sense that use of an overly simple model for the measurement $g_m$ produces a biased estimator of the true Gaussian:
\[
\hat{f}_{\mathrm{unbiased}}(x,y) = \hat{f}_{\mathrm{biased}}(x,y) + b(x,y).
\]
If we model the pixel response with rectangle functions over which the signal is integrated, then
\[
\bar{g}_m = Q_m \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f(x,y)\,
\mathrm{rect}\!\left(\frac{x - x_m}{\delta_x}\right) \mathrm{rect}\!\left(\frac{y - y_m}{\delta_y}\right) dx\, dy, \tag{11}
\]
where $(\delta_x, \delta_y)$ is the physical dimension of the pixel and the rectangle function is defined as
\[
\mathrm{rect}(x/L) = \begin{cases} 1 & |x| < L/2 \\ 0 & |x| > L/2 \end{cases}.
\]
The bias results from using a $\delta$-sampling model, i.e., $\bar{g}_m = f(x_m, y_m)$, whereas the detection process is better approximated as an integral over the region of the pixel's response, i.e., Eq. (11). If we take a Taylor expansion of $f(x,y)$ about the pixel's center $(x_m, y_m)$, we find that the two models coincide up to the linear term of the Taylor series. Thus, we can approximate the bias by integrating the second-order term across the pixel's response.

The Taylor expansion of a scalar-valued function of two variables has the form
\[
f(x,y) \approx f(x_m, y_m)
+ (x - x_m)\left.\frac{\partial f}{\partial x}\right|_{\substack{x = x_m \\ y = y_m}}
+ (y - y_m)\left.\frac{\partial f}{\partial y}\right|_{\substack{x = x_m \\ y = y_m}}
+ \frac{1}{2}\Biggl[ (x - x_m)^2 \left.\frac{\partial^2 f}{\partial x^2}\right|_{\substack{x = x_m \\ y = y_m}}
+ 2(x - x_m)(y - y_m) \left.\frac{\partial^2 f}{\partial x\,\partial y}\right|_{\substack{x = x_m \\ y = y_m}}
+ (y - y_m)^2 \left.\frac{\partial^2 f}{\partial y^2}\right|_{\substack{x = x_m \\ y = y_m}} \Biggr]. \tag{12}
\]
Therefore, the bias is given by
\[
b(\mathbf{r}_m) \approx \int_{x_m - \frac{1}{2}\delta_x}^{x_m + \frac{1}{2}\delta_x} \int_{y_m - \frac{1}{2}\delta_y}^{y_m + \frac{1}{2}\delta_y} [\text{second-order term}]\; dx\, dy. \tag{13}
\]
The integral over the $(x - x_m)^2$ term gives $\frac{1}{12}\delta_x^2$; likewise, the integral over $(y - y_m)^2$ gives $\frac{1}{12}\delta_y^2$, while the cross term integrates to zero. Substituting these into Eq. (13) gives the equation of the bias as
\[
b(\mathbf{r}_m) \approx \frac{A Q_m E(\mathbf{r}_m)}{24 w^2} \left[ \left(\frac{(x_m - \bar{x})^2}{w^2} - 1\right)\delta_x^2 + \left(\frac{(y_m - \bar{y})^2}{w^2} - 1\right)\delta_y^2 \right].
\]
Following the same procedure for the five-parameter Gaussian model (the separable 2D profile) gives an expression for the bias of
\[
b(\mathbf{r}_m) \approx \frac{A Q_m E(\mathbf{r}_m)}{24} \left[ \left(\frac{(x_m - \bar{x})^2}{w_x^2} - 1\right)\frac{\delta_x^2}{w_x^2} + \left(\frac{(y_m - \bar{y})^2}{w_y^2} - 1\right)\frac{\delta_y^2}{w_y^2} \right].
\]
And, for the six-parameter model,
\[
b(\mathbf{r}_m) \approx \frac{A Q_m E(\mathbf{r}_m)}{24} \Bigl\{ \bigl([c_1 (x - \bar{x}) + c_3 (y - \bar{y})]^2 - c_1\bigr)\delta_x^2
+ \bigl([c_2 (y - \bar{y}) + c_3 (x - \bar{x})]^2 - c_2\bigr)\delta_y^2 \Bigr\}.
\]
Knowing the value of the bias provides a means of further improving the estimation, by applying a correction to the input data and iterating the estimation until the bias becomes negligible. Note that the bias here is expressed in units of digital counts, not photoelectrons.
7. Optimal Pixel Size

As in the previous paper, for the 2D case, we can compare previous research with the results obtained here to evaluate whether there is an optimal magnification for the Gaussian profile or, equivalently, an optimal detector pixel size. Reference [12] reports an optimal choice of $\delta_x = \delta_y = 1.5w \sim 2.5w$ for minimizing the variance of the position estimator, $\hat{x}$. In this situation, where we allow the detector pixel size to vary while the system magnification is kept constant, we need to be careful to enforce a constant energy $U$, via the relations
\[
U = \begin{cases} 2\pi A w^2 & \text{(4 params)} \\ 2\pi A w_x w_y & \text{(5 params)} \\ 2\pi A/\mu & \text{(6 params)} \end{cases},
\]
where, again, $\mu = \sqrt{c_1 c_2 - c_3^2}$. (The volume $U$ is equal to the number of photoelectrons collected on the detector array.) The expressions for the position estimate variances become
\[
\begin{aligned}
\text{(4 params)}: \quad \mathrm{var}(\hat{x}) &\approx \begin{cases} \dfrac{8\pi \sigma^2 w^4}{\delta_x \delta_y Q^2 U^2} & \text{(flat)} \\[2mm] \dfrac{4 w^2}{U} & \text{(Poisson)} \end{cases}, \\
\text{(5 params)}: \quad \mathrm{var}(\hat{x}) &\approx \begin{cases} \dfrac{8\pi \sigma^2 w_x^3 w_y}{\delta_x \delta_y Q^2 U^2} & \text{(flat)} \\[2mm] \dfrac{4 w_x^2}{U} & \text{(Poisson)} \end{cases}, \\
\text{(6 params)}: \quad \mathrm{var}(\hat{x}) &\approx \begin{cases} \dfrac{2\pi \sigma^2 c_2}{\delta_x \delta_y \mu^3 Q^2 U^2} & \text{(flat)} \\[2mm] \dfrac{2}{c_1 U} & \text{(Poisson)} \end{cases}.
\end{aligned} \tag{14}
\]
(The variances for $\hat{y}$ can easily be obtained from the above by symmetry arguments.)

These results indicate that, for Poisson noise, $\mathrm{var}(\hat{x})$ is independent of detector pixel size, whereas the flat-noise model shows that the variance decreases with larger pixel sizes. However, if the bias has not been corrected for within the algorithm, then we need to account for its effect on the measurement as well. Figure 6 shows a comparison of the squared error in $\hat{x}$ due to bias with $\mathrm{var}(\hat{x})$. (Note that the bias is a function of $\bar{x}$ relative to the sample locations $x_m$; if, for example, $\bar{x}$ coincides with a sample location, then the bias has no effect on the error in estimating $\bar{x}$.)

Fig. 6. A comparison of $\mathrm{var}(\hat{x})$ and $\mathrm{bias}^2(\hat{x})$, given in units of (pixels)², for an example symmetric Gaussian profile as a function of the profile width w. Keeping a constant value for U, we increase the profile width (and thus reduce the profile height A) while maintaining the same sampling rate. The approximate variance is calculated by Eq. (9) and the exact variance by taking the inverse of Eq. (5). The two curves shown for the variance are for two different values of U.

For $w \lesssim 0.7\delta_x$, Fig. 6 indicates that the estimation error is dominated by bias effects. For $w \ge 0.7\delta_x$, the error is predominantly due to estimator variance. This behavior would suggest that an optimal pixel size should be $\delta_x \le 1.43 w$. Note that although the exact location of the optimal tradeoff will vary with each parameter set, the bias to the position estimator $\hat{x}$ is relatively insensitive to the parameters $A$ and $\bar{y}$.
8. Discussion

Real-life detector array data has noise properties that are neither purely Gaussian nor Poisson but are well approximated by a combined function in which the electronic noise is given by a Gaussian distribution and the photoelectron noise by a Poisson distribution (see [13], Eq. 3.16). Under this combined distribution, the Fisher information matrix is approximately (Eq. 3.17 of [13])
\[
F_{ij} \approx \sum_{m=1}^{M} \frac{1}{\sigma_G^2 + \bar{g}_m Q_m}\, \frac{\partial \bar{g}_m}{\partial \theta_i}\, \frac{\partial \bar{g}_m}{\partial \theta_j},
\]
where $\sigma_G^2$ describes the variance of the Gaussian-distributed readout noise (with all pixels sharing the same variance) and $\bar{g}_m Q_m$ gives the variance of the Poisson-distributed shot noise. To make use of this Gaussian–Poisson mixed model in our system, we need only insert $(\sigma_G^2 + \bar{g}_m Q_m)$ for $\sigma_m^2$ in the exact expressions for $F$ [i.e., Eq. (5) and the corresponding equations for the five- and six-parameter models]. Unfortunately, there does not appear to be a way of deriving the corresponding approximate analytic expressions for this noise model.
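The substitution described above is a one-line change in a numerical Fisher-matrix computation; the sketch below assumes the derivative matrix dg (partials of gbar with respect to the parameters) and the noiseless vector gbar from the earlier sketches as inputs, and the readout-noise level is an illustrative value.

import numpy as np

def fisher_mixed(dg, gbar, Q=1.0, sigma_G=2.0):
    # dg: (M x P) matrix of partial derivatives dgbar_m/dtheta_j;
    # gbar: noiseless data vector in digital counts.
    var_m = sigma_G**2 + gbar*Q            # Gaussian readout + Poisson shot noise
    F = (dg / var_m[:, None]).T @ dg       # F_ij = sum_m dg_i dg_j / var_m
    return F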
One difficulty that can thwart accurate estimation is the failure of the optimization routine to locate a global maximum of the likelihood function. For well-sampled Gaussian profile models, the likelihood function is typically smooth and well behaved [an example is shown in Fig. 7(a)]. For difficult models, in which the width parameter w is less than a pixel width, multiple local maxima develop in the likelihood function [see Fig. 7(b)]. For such models, local optimization techniques, such as Newton iteration, become unreliable and global optimization methods, though slow, become important.
Finally, Gaussian profile estimation is often used in systems where other optical effects are present, such as a slowly varying background signal (due, for example, to stray light, scatter, or background sources). Ignoring these effects would skew the estimation, and yet they are not of interest to the observer and thus are termed nuisance parameters. Ideally, the estimation algorithm should incorporate the presence of these parameters by marginalizing the likelihood over the conditional probability density of the set of nuisance parameters. If this conditional density is unavailable, then an alternative approach is to estimate the nuisance parameters and subtract their effect from the data prior to use in the Gaussian profile estimation algorithm. This second approach would allow use of the algorithm presented in this paper.

Fig. 7. Contour maps showing slices through the 4D likelihood function for (a) a well-behaved model and (b) a poorly behaved one. The parameters are $(A, \bar{x}, \bar{y}, w) = (100, 0.25, 0, 3)$ and $(10, 0.25, 0, 0.5)$, respectively, where $\bar{x}$, $\bar{y}$, and $w$ are given in units of pixel widths.
The authors have posted a set of programs, written in the IDL [14] language, that perform the estimation algorithm discussed in this paper. These are located at the URL http://www.ittvis.com/codebank/index.asp. (The file is gauss_mle2.pro under the heading "Statistics".) Interested readers are encouraged to download, distribute, and use the code freely.
This work was supported in part by Department of
Defense (DOD) contract DAAE07-02-C-L011. We
would like to thank an anonymous reviewer for a
very careful reading and valuable comments.
References and Notes
1. N. Hagen, M. Kupinski, and E. L. Dereniak, "Gaussian profile estimation in one dimension," Appl. Opt. 46, 5374–5383 (2007).
2. L. G. Kazovsky, "Beam position estimation by means of detector arrays," Opt. Quantum Electron. 13, 201–208 (1981).
3. L. H. Auer and W. F. van Altena, "Digital image centering II," Astron. J. 83, 531–537 (1978).
4. R. Irwan and R. G. Lane, "Analysis of optimal centroid estimation applied to Shack–Hartmann sensing," Appl. Opt. 38, 6737–6743 (1999).
5. M. K. Cheezum, W. F. Walker, and W. H. Guilford, "Quantitative comparison of algorithms for tracking single fluorescent particles," Biophys. J. 81, 2378–2388 (2001).
6. R. E. Thompson, D. R. Larson, and W. W. Webb, "Precise nanometer localization analysis for individual fluorescent probes," Biophys. J. 82, 2775–2783 (2002).
7. Y.-C. Chen, L. R. Furenlid, D. W. Wilson, and H. H. Barrett, "Calibration of scintillation cameras and pinhole SPECT imaging systems," in Small-Animal SPECT Imaging, M. A. Kupinski and H. H. Barrett, eds. (Springer, 2005), Chap. 12, pp. 195–202.
8. S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory (Prentice Hall, 1993), p. 34.
9. What we call "flat noise" here was labeled "uniform noise" in [1]. This has been changed in recognition that this can easily be mistaken as referring to noise obtained from the uniform probability distribution.
10. It may seem that $\sigma_m^2 = \bar{g}_m/Q_m$ is the appropriate substitution here, but since the raw measurements $g_m$ are scaled versions of the Poisson-distributed photoelectrons, the variance increases as the square of the scale parameter, $Q_m$.
11. S. Brandt, Data Analysis, 3rd ed. (Springer, 1999), pp. 113–114.
12. K. A. Winick, "Cramér–Rao lower bounds on the performance of charge-coupled-device optical position estimators," J. Opt. Soc. Am. A 3, 1809–1815 (1986).
13. H. H. Barrett, C. Dainty, and D. Lara, "Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions," J. Opt. Soc. Am. A 24, 391–414 (2007).
14. The Interactive Data Language software is developed by ITT Visual Information Systems, http://www.ittvis.com/index.asp.