Parameter Estimation of Superimposed Signals Using the EM Algorithm

MEIR FEDER AND EHUD WEINSTEIN
Abstract-We develop a computationally efficient algorithm for parameter estimation of superimposed signals based on the EM algorithm. The idea is to decompose the observed data into its signal components and then to estimate the parameters of each signal component separately. The algorithm iterates back and forth, using the current parameter estimates to decompose the observed data better and thus increase the likelihood of the next parameter estimates. The application of the algorithm to the multipath time-delay and to the multiple source location estimation problems is considered.
I. INTRODUCTION

The general problem of interest here may be characterized using the following model:

    y(t) = \sum_{k=1}^{K} s_k(t; \theta_k) + n(t)

where θ_k are the vector unknown parameters associated with the kth signal component and n(t) stands for the additive noise. This model covers a wide range of problems involving superimposed signals. The specific problem we have in mind is multiple source location estimation; in that case, s_k(t; θ_k) are the array signals observed from the kth source, and θ_k are the unknown source location parameters.
In this paper, we develop a computationally efficient scheme for the joint estimation of θ_1, ..., θ_K based on the Estimate-Maximize (EM) algorithm. The idea is to decompose the observed data y(t) into its signal components and then estimate the parameters of each signal component separately. The algorithm iterates back and forth, using the current parameter estimates to decompose the observed data better and thus improve the next parameter estimates. Under suitable regularity conditions, the algorithm converges to a stationary point of the likelihood function, where each iteration cycle increases the likelihood of the estimated parameters.
Manuscript received February 20, 1986; revised September 28, 1987. This work was supported in part by the MIT-WHOI Joint Program, in part by the Advanced Research Projects Agency monitored by ONR under Contract N00014-81-K-0742 and the National Science Foundation under Grant ECS-8407285 at MIT, and in part by the ONR under Contract N00014-85-K-0272 at WHOI. It is WHOI Contribution 6584.

M. Feder is with the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, and the Department of Ocean Engineering, Woods Hole Oceanographic Institution, Woods Hole, MA 02543.

E. Weinstein is with the Department of Electronic Systems, Faculty of Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel, and the Department of Ocean Engineering, Woods Hole Oceanographic Institution, Woods Hole, MA 02543.

IEEE Log Number 8719013.
The results developed in this paper can be viewed as a generalization of the results presented by the authors in [1] and [2]. We note that the idea of iteratively decomposing the observed signal and estimating the parameters of each signal component separately has been previously proposed in several applications (e.g., [3]-[5]). However, in most cases, the approach is ad hoc, and there is no proof of convergence of the algorithm. As will be shown, the EM method suggests a very specific way of decomposing the observed signals, leading to a numerically as well as statistically stable algorithm.
The paper is organized as follows. In Section II the EM method is presented following the derivation in [6]. In Section III the EM method is applied to the parameter estimation of superimposed signals, and the basic algorithm is developed for deterministic (known) signals and for stationary Gaussian signals. In Section IV the algorithm is applied to multipath time-delay estimation, in Section V the algorithm is applied to multiple source angle-of-arrival estimation, and in Section VI we summarize the results.
II. MAXIMUM LIKELIHOOD ESTIMATION VIA THE EM ALGORITHM
The EM algorithm, developed in [6], is a general method for solving maximum likelihood (ML) estimation problems given incomplete data. The considerations leading to the EM algorithm are given below.

Let Y denote the observed (incomplete) data, possessing the probability density f_Y(y; θ) indexed by the parameter vector θ ∈ Θ ⊂ R^K, and let X denote the "complete" data, related to Y by

    H(X) = Y    (1)

where H(·) is a noninvertible (many-to-one) transformation. Expressing the densities,

    f_X(x; \theta) = f_{X|Y=y}(x; \theta) \cdot f_Y(y; \theta), \qquad \forall x : H(x) = y    (2)

where f_X(x; θ) is the probability density associated with X and f_{X|Y=y}(x; θ) is the conditional probability density of X given Y = y. Taking the logarithm on both sides of (2),

    \log f_Y(y; \theta) = \log f_X(x; \theta) - \log f_{X|Y=y}(x; \theta).    (3)
Taking the conditional expectation given Y = y at a parameter value θ′,

    \log f_Y(y; \theta) = E\{\log f_X(x; \theta) \mid Y = y; \theta'\} - E\{\log f_{X|Y=y}(x; \theta) \mid Y = y; \theta'\}.    (4)

Define, for convenience,

    L(\theta) = \log f_Y(y; \theta)    (5)

    U(\theta, \theta') = E\{\log f_X(x; \theta) \mid Y = y; \theta'\}    (6)

and

    V(\theta, \theta') = E\{\log f_{X|Y=y}(x; \theta) \mid Y = y; \theta'\}.    (7)

With these definitions, (4) reads

    L(\theta) = U(\theta, \theta') - V(\theta, \theta').    (8)
We identify L(θ), the log-likelihood of the observed data, as the function we want to maximize. Invoking Jensen's inequality,

    V(\theta, \theta') \le V(\theta', \theta').    (9)
Hence, if

    U(\theta, \theta') > U(\theta', \theta')

then

    L(\theta) > L(\theta').    (10)

The relation in (10) forms the basis for the EM algorithm. The algorithm starts with an arbitrary initial guess θ̂^(0); denote by θ̂^(n) the current estimate of θ after n iterations of the algorithm. Then, the next iteration cycle can be described in two steps, as follows:

E step: Compute

    U(\theta, \hat{\theta}^{(n)}).    (11)

M step:

    \max_{\theta} U(\theta, \hat{\theta}^{(n)}) \;\rightarrow\; \hat{\theta}^{(n+1)}.    (12)

If U(θ, θ′) is continuous in both θ and θ′, the algorithm converges to a stationary point of the log-likelihood function [7], where the maximization in (12) ensures that each iteration cycle increases the likelihood. Of course, as in the case of all "hill climbing" algorithms, the convergence point may not be the global maximum of the likelihood function, and thus several starting points may be needed. The rate of convergence of the algorithm is exponential, depending on the fraction of the covariance of the "complete" data that can be predicted using the observed (incomplete) data [6], [8]. If that fraction is small, the rate of convergence tends to be slow, in which case one could use standard numerical methods to accelerate the algorithm.

In the Appendix we derive a closed-form analytical expression for U(θ, θ′) for the case where X and Y are jointly Gaussian, related by a linear transformation. We note that in general a closed-form analytical expression cannot be found, and thus the computation of U(θ, θ′) at each iteration cycle generally requires multiple integration. The Appendix has importance of its own since it covers a wide range of problems.

We note that the EM algorithm is not uniquely specified. The transformation H(·) relating X to Y can be any noninvertible transformation. Obviously, there are many possible "complete" data specifications X that will generate the observed data Y. However, the choice of "complete" data may critically affect the complexity and the rate of convergence of the algorithm, and an unfortunate choice of "complete" data may yield a completely useless algorithm.

In the next section, it will be shown that for the class of problems involving superimposed signals, there is a natural choice of "complete" data, leading to a surprisingly simple algorithm to extract the ML parameter estimates.

III. PARAMETER ESTIMATION OF SUPERIMPOSED SIGNALS IN NOISE

The general problem of interest here may be characterized using the following model:

    y(t) = \sum_{k=1}^{K} s_k(t; \theta_k) + n(t)    (13)

where θ_k are the vector unknown parameters associated with the kth signal component and n(t) denotes the additive noise. This model covers a wide range of problems arising in array and signal processing, including multitone frequency estimation, multipath time-delay estimation, and multiple source location estimation.

Given observations of y(t), we want to find the ML estimate of θ_1, θ_2, ..., θ_K jointly. We shall now consider the cases of deterministic (known) signals and stochastic Gaussian signals separately.
A. Deterministic Signals

Consider the model of (13) under the following assumptions.

1) The s_k(t; θ_k), k = 1, 2, ..., K, are completely known up to the vector parameters θ_k.

2) n(t) is a vector zero-mean white Gaussian process whose covariance matrix is E{n(t) n*(σ)} = Q δ(t − σ).

Under these assumptions, the log-likelihood function is given by (e.g., see [9])

    L(\theta) = c - \frac{1}{\lambda} \int_T \Big[ y(t) - \sum_{k=1}^{K} s_k(t; \theta_k) \Big]^* Q^{-1} \Big[ y(t) - \sum_{k=1}^{K} s_k(t; \theta_k) \Big] \, dt    (14)

where λ = 1 if n(t) is real-valued, λ = 2 if n(t) is complex-valued, and c is a normalizing constant. In the case of discrete-time observations, L(θ) is still given by (14) where the integral is replaced by the sum over all discrete
points t ∈ T. Thus, to obtain the ML estimate of the various θ_k's, one must solve

    \min_{\theta_1, \ldots, \theta_K} \int_T \Big[ y(t) - \sum_{k=1}^{K} s_k(t; \theta_k) \Big]^* Q^{-1} \Big[ y(t) - \sum_{k=1}^{K} s_k(t; \theta_k) \Big] \, dt.    (15)

This is a complicated multiparameter optimization problem. Of course, brute force can always be used to solve the problem, evaluating the objective function on a coarse grid to locate roughly the global minimum and then applying the Gauss method, the Newton-Raphson method, or some other gradient-search iterative algorithm. However, when applied to the problem at hand, these methods tend to be computationally complex and time consuming.

Having the EM algorithm in mind, we want to simplify the computations involved. To apply the algorithm to the problem at hand, the first step is to specify the "complete" data. A natural choice for the "complete" data x(t) is obtained by decomposing the observed data y(t) into its signal components; that is,

    x(t) = \operatorname{col}[x_1(t), x_2(t), \ldots, x_K(t)]    (16)

where

    x_k(t) = s_k(t; \theta_k) + n_k(t)    (17)

and the n_k(t) are obtained by arbitrarily decomposing the total noise n(t) into K components, so that

    \sum_{k=1}^{K} n_k(t) = n(t).    (18)

From (13), (17), and (18), the relation between the "complete" data x(t) and the incomplete data y(t) is given by

    y(t) = \sum_{k=1}^{K} x_k(t) = H x(t)    (19)

where

    H = [\,I \;\; I \;\cdots\; I\,] \qquad (K \text{ terms}).    (20)

We shall find it most convenient to let the n_k(t) be statistically independent, zero-mean, and Gaussian with the covariance matrices E{n_k(t) n_k*(σ)} = Q_k δ(t − σ), where Q_k = β_k Q and the β_k's are arbitrary real-valued scalars satisfying

    \sum_{k=1}^{K} \beta_k = 1.    (21)

In that case, the log-likelihood of the complete data x(t) is

    \log f_X(x; \theta) = d - \frac{1}{\lambda} \int_T [x(t) - s(t; \theta)]^* \Lambda^{-1} [x(t) - s(t; \theta)] \, dt    (22)

where d contains all the terms that are independent of θ, and s(t; θ) and Λ are the mean and the covariance matrix of x(t), given respectively by

    s(t; \theta) = \operatorname{col}[s_1(t; \theta_1), s_2(t; \theta_2), \ldots, s_K(t; \theta_K)]    (23)

    \Lambda = \operatorname{diag}[Q_1, Q_2, \ldots, Q_K]    (24)

where the notation in (24) indicates that Λ is a block-diagonal matrix. Taking the conditional expectation of (22) given y(t) at a parameter value θ′,

    U(\theta, \theta') = e - \frac{1}{\lambda} \int_T [\hat{x}(t) - s(t; \theta)]^* \Lambda^{-1} [\hat{x}(t) - s(t; \theta)] \, dt    (25)

where

    \hat{x}(t) = E\{x(t) \mid y(t); \theta'\}    (26)

and e is a constant independent of θ. Since x(t) and y(t) are jointly Gaussian, related by the linear transformation y(t) = Hx(t), then using (A7) of the Appendix,

    \hat{x}(t) = s(t; \theta') + \Lambda H^* [H \Lambda H^*]^{-1} [y(t) - H s(t; \theta')].    (27)
Substituting (23) and (24) into (25) and following straightforward matrix manipulations, we obtain

    U(\theta, \theta') = e - \frac{1}{\lambda} \sum_{k=1}^{K} \int_T [\hat{x}_k(t) - s_k(t; \theta_k)]^* Q_k^{-1} [\hat{x}_k(t) - s_k(t; \theta_k)] \, dt    (28)

where the x̂_k(t) are the components of x̂(t), given by

    \hat{x}_k(t) = s_k(t; \theta'_k) + \beta_k \Big[ y(t) - \sum_{l=1}^{K} s_l(t; \theta'_l) \Big].    (29)

Observing that the maximization of U(θ, θ′) with respect to θ is equivalent to the minimization of each of the terms in the k sum of (28) separately, the EM algorithm assumes the following form.

E step: For k = 1, 2, ..., K, compute

    \hat{x}_k^{(n)}(t) = s_k(t; \hat{\theta}_k^{(n)}) + \beta_k \Big[ y(t) - \sum_{l=1}^{K} s_l(t; \hat{\theta}_l^{(n)}) \Big].    (30)

M step: For k = 1, 2, ..., K,

    \min_{\theta_k} \int_T [\hat{x}_k^{(n)}(t) - s_k(t; \theta_k)]^* Q^{-1} [\hat{x}_k^{(n)}(t) - s_k(t; \theta_k)] \, dt \;\rightarrow\; \hat{\theta}_k^{(n+1)}.    (31)

Fig. 1. The EM algorithm for deterministic (known) signals.

We observe that θ̂_k^(n+1) is, in fact, the ML estimate of θ_k based on x̂_k^(n)(t). The algorithm is illustrated in Fig. 1. We note that in the case of discrete observations, the integral in (31) is substituted by the sum over all points t ∈ T.

The most striking feature of the algorithm is that it decouples the complicated multiparameter optimization into K separate ML optimizations. Hence, the complexity of the algorithm is essentially unaffected by the assumed number of signal components. As K increases, we have to increase the number of ML processors in parallel; however, each ML processor is maximized separately. Since the algorithm is based on the EM method, each iteration cycle increases the likelihood until convergence is accomplished.

We note that the β_k's must satisfy the constraint stated in (21), but otherwise they are arbitrary free variables in the algorithm. The β_k's can be used to control the rate of convergence of the algorithm and possibly to avoid convergence to an unwanted stationary point of the algorithm. These considerations are currently under investigation.
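To make the decomposition concrete, here is a minimal Python sketch of (30) and (31) under simplifying assumptions that are ours, not the paper's: scalar observations, white noise with Q = 1, equal weights β_k = 1/K, a common signal model s(t; θ) with a scalar parameter for every component, and a grid-search M step. All function and variable names below are ours.

```python
import numpy as np

def em_superimposed(y, t, s, theta0, n_iter=30, grid=None):
    """EM for y(t) = sum_k s(t; theta_k) + n(t) with a known signal shape.

    y      : observed samples on the time grid t
    s      : callable s(t, theta) -> signal samples (vectorized in t)
    theta0 : length-K array of initial scalar parameter guesses
    grid   : candidate theta values searched in the M step
    """
    theta = np.array(theta0, dtype=float)
    K = theta.size
    beta = np.full(K, 1.0 / K)       # arbitrary weights summing to 1, cf. (21)
    if grid is None:
        grid = np.linspace(t.min(), t.max(), 400)
    for _ in range(n_iter):
        comps = np.array([s(t, th) for th in theta])  # signals at theta^(n)
        resid = y - comps.sum(axis=0)                 # y - sum_l s_l(t; theta_l^(n))
        for k in range(K):
            xk = comps[k] + beta[k] * resid           # E step, cf. (30)
            # M step, cf. (31): per-component least squares over the grid
            costs = [np.sum(np.abs(xk - s(t, g)) ** 2) for g in grid]
            theta[k] = grid[int(np.argmin(costs))]
    return theta
```

For instance, with `s = lambda t, th: np.sinc(t - th)` the routine estimates the positions of K superimposed pulses; each inner minimization is a one-dimensional fit, exactly the decoupling described above.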
B. Gaussian Signals

Consider the model of (13) under the following assumptions.

1) n(t) and s_k(t; θ_k), k = 1, 2, ..., K, are mutually uncorrelated, wide-sense stationary (WSS), zero-mean Gaussian vector stochastic processes whose spectral density matrices are N(ω) and S_k(ω; θ_k), k = 1, 2, ..., K, respectively.

2) The observation interval T is long compared to the correlation time (inverse bandwidth) of the signals and the noise, i.e., WT/2π >> 1.

Fourier analyzing y(t), we obtain

    Y(\omega_l) = \frac{1}{\sqrt{T}} \int_T y(t) \, e^{-j\omega_l t} \, dt.    (32)

Under the above assumptions, y(t) is WSS, zero-mean, and Gaussian. Thus, for WT/2π >> 1, the Y(ω_l)'s are asymptotically uncorrelated, zero-mean, and Gaussian with covariance matrix P(ω_l; θ), where P(ω; θ) is the spectral density matrix of y(t) given by

    P(\omega; \theta) = \sum_{k=1}^{K} S_k(\omega; \theta_k) + N(\omega).    (33)

The log-likelihood function given the Y(ω_l)'s is therefore given by

    L(\theta) = -\sum_l \big[ \log\det \pi P(\omega_l; \theta) + Y^*(\omega_l) P^{-1}(\omega_l; \theta) Y(\omega_l) \big]    (34)

where the summation in (34) is carried over all ω_l in the signal frequency band. In the case of discrete observations y_i = y(iΔt), the log-likelihood is still given by (34) where the Y(ω_l)'s are the discrete Fourier transform (DFT) of the y_i's. To obtain the ML estimate of the various θ_k's, we therefore must solve

    \min_{\theta_1, \theta_2, \ldots, \theta_K} \sum_l \big[ \log\det P(\omega_l; \theta) + Y^*(\omega_l) P^{-1}(\omega_l; \theta) Y(\omega_l) \big].    (35)

We want to use the EM method to bypass this complicated multiparameter optimization. Following the same considerations as in the deterministic signal case, let the "complete" data x(t) be given by

    x(t) = \operatorname{col}[x_1(t), x_2(t), \ldots, x_K(t)]    (36)

where

    x_k(t) = s_k(t; \theta_k) + n_k(t)    (37)

and the n_k(t) are chosen to be mutually uncorrelated, zero-mean, and Gaussian with respective spectral density matrices N_k(ω) = β_k N(ω), where the β_k's are arbitrary real-valued constants subject to (21).

Following the same considerations leading to (34), the log-likelihood of the "complete" data is given by

    \log f_X(x; \theta) = -\sum_l \big[ \log\det \pi \Lambda(\omega_l; \theta) + X^*(\omega_l) \Lambda^{-1}(\omega_l; \theta) X(\omega_l) \big]
                        = -\sum_l \big[ \log\det \pi \Lambda(\omega_l; \theta) + \operatorname{tr}\big( \Lambda^{-1}(\omega_l; \theta) X(\omega_l) X^*(\omega_l) \big) \big]    (38)

where

    X(\omega_l) = \operatorname{col}[X_1(\omega_l), X_2(\omega_l), \ldots, X_K(\omega_l)]    (39)

    X_k(\omega_l) = \frac{1}{\sqrt{T}} \int_T x_k(t) \, e^{-j\omega_l t} \, dt    (40)

and Λ(ω; θ) is the (block-diagonal) spectral density matrix of x(t),

    \Lambda(\omega; \theta) = \operatorname{diag}\big[ \Lambda_1(\omega; \theta_1), \Lambda_2(\omega; \theta_2), \ldots, \Lambda_K(\omega; \theta_K) \big]    (41)

    \Lambda_k(\omega; \theta_k) = S_k(\omega; \theta_k) + \beta_k N(\omega).    (42)

Exploiting the block-diagonal form of Λ(ω; θ), (38) becomes

    \log f_X(x; \theta) = -\sum_{k=1}^{K} \sum_l \big[ \log\det \pi \Lambda_k(\omega_l; \theta_k) + \operatorname{tr}\big( \Lambda_k^{-1}(\omega_l; \theta_k) X_k(\omega_l) X_k^*(\omega_l) \big) \big].    (43)

Taking the conditional expectation of (43) given the Y(ω_l)'s at a parameter value θ′,

    U(\theta, \theta') = -\sum_l \big[ \log\det \pi \Lambda(\omega_l; \theta) + \operatorname{tr}\big( \Lambda^{-1}(\omega_l; \theta) \widehat{XX^*}(\omega_l) \big) \big]    (44)

where

    \widehat{XX^*}(\omega_l) = E\{ X(\omega_l) X^*(\omega_l) \mid Y(\omega_l); \theta' \}.    (45)

Exploiting the block-diagonal form once more, (44) becomes

    U(\theta, \theta') = -\sum_{k=1}^{K} \sum_l \big[ \log\det \pi \Lambda_k(\omega_l; \theta_k) + \operatorname{tr}\big( \Lambda_k^{-1}(\omega_l; \theta_k) \widehat{X_k X_k^*}(\omega_l) \big) \big]    (46)

where, using (A8) of the Appendix,

    \widehat{X_k X_k^*}(\omega_l) = \Lambda_k(\omega_l; \theta'_k) - \Lambda_k(\omega_l; \theta'_k) P^{-1}(\omega_l; \theta') \Lambda_k(\omega_l; \theta'_k) + \Lambda_k(\omega_l; \theta'_k) P^{-1}(\omega_l; \theta') Y(\omega_l) Y^*(\omega_l) P^{-1}(\omega_l; \theta') \Lambda_k(\omega_l; \theta'_k)    (47)

and P(ω_l; θ) is defined in (33). Observing that the maximization of U(θ, θ′) with respect to θ is equivalent to the minimization of each of the terms in the k sum of (46) separately, the EM algorithm assumes the following form.

E step: For k = 1, 2, ..., K, compute

    \widehat{X_k X_k^*}(\omega_l)^{(n)} = \text{the right-hand side of (47) evaluated at } \theta' = \hat{\theta}^{(n)}.    (48)

M step: For k = 1, 2, ..., K,

    \min_{\theta_k} \sum_l \big[ \log\det \Lambda_k(\omega_l; \theta_k) + \operatorname{tr}\big( \Lambda_k^{-1}(\omega_l; \theta_k) \widehat{X_k X_k^*}(\omega_l)^{(n)} \big) \big] \;\rightarrow\; \hat{\theta}_k^{(n+1)}.    (49)

We observe that θ̂_k^(n+1) is the ML estimate of θ_k where the sufficient statistic X_k(ω_l) X_k^*(ω_l) is substituted by its current estimate \widehat{X_k X_k^*}(ω_l)^{(n)}. The algorithm is illustrated in Fig. 2. The most attractive feature of the algorithm is that it decouples the full multidimensional optimization (35) into optimizations in smaller-dimensional parameter subspaces. Thus, as in the deterministic signal case, the complexity of the algorithm is essentially unaffected by the assumed number of signal components. As K increases, we have to increase the number of ML processors in parallel; however, each ML processor is maximized separately. Since the algorithm is based on the EM method, each iteration cycle increases the likelihood until convergence is accomplished.

Fig. 2. The EM algorithm for stochastic Gaussian signals.
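The E step (48) is a per-frequency-bin matrix computation. The Python sketch below evaluates (47) at the current estimate for a single bin; it exploits the identity P = Σ_k Λ_k, which holds because the β_k's sum to one, cf. (21). Variable names are ours, and the routine is illustrative rather than a reference implementation.

```python
import numpy as np

def e_step_bin(Lam, Y):
    """One-bin E step for the Gaussian-signal algorithm, cf. (47)-(48).

    Lam : list of K (M x M) arrays, Lam[k] = Lambda_k(w_l; theta_k^(n))
    Y   : length-M observed Fourier coefficient vector at this bin
    Returns the K conditional second-moment estimates of X_k X_k^*.
    """
    P = sum(Lam)                    # P = sum_k Lambda_k since sum_k beta_k = 1
    Pinv = np.linalg.inv(P)
    YY = np.outer(Y, Y.conj())      # observed outer product Y Y^*
    stats = []
    for Lk in Lam:
        G = Lk @ Pinv               # "gain" Lambda_k P^{-1}
        stats.append(Lk - G @ Lk + G @ YY @ G.conj().T)
    return stats
```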
IV. APPLICATION TO MULTIPATH TIME-DELAY ESTIMATION

Let the observed signal y(t) be modeled by

    y(t) = \sum_{k=1}^{K} \alpha_k s(t - \tau_k) + n(t).    (50)

This model characterizes multipath effects where the transmitted signal s(t) is observed at the receiver through more than one path. We shall concentrate here on the case where s(t) is a deterministic known signal. The extension to the case where s(t) is deterministic but unknown, and to the case where s(t) is a sample function from a Gaussian random process, can be found in [11] and [1], respectively. We suppose that the additive noise n(t) is spectrally flat over the receiver frequency band. The problem is to estimate the pairs (α_k, τ_k), k = 1, 2, ..., K.

The direct ML approach requires the solution to [see (15)]

    \min_{\{\alpha_k\}, \{\tau_k\}} \int_T \Big| y(t) - \sum_{k=1}^{K} \alpha_k s(t - \tau_k) \Big|^2 \, dt.    (51)

This optimization problem is addressed in [10], where it is shown that, at the optimum, the α_k's can be expressed in terms of the τ_k's. Thus, the 2K-dimensional search can be reduced to a K-dimensional search. However, as pointed out in [10], for K ≥ 3 the required computations become too involved. Consequently, ad hoc approaches and suboptimal solutions have been proposed. The most common solution consists of correlating y(t) with a replica of s(t) and searching for the K highest peaks of the correlation function. If the various paths are resolvable, i.e., the difference between τ_k and τ_l is long compared to the temporal correlation of the signal for all combinations of k and l, this approach yields near-optimal estimates. However, in situations where the signal paths are unresolvable, this approach is distinctly suboptimal.

We identify the model in (50) as a special case of (13). Therefore, in correspondence with (30) and (31), we obtain the following algorithm.

E step: For k = 1, 2, ..., K, compute

    \hat{x}_k^{(n)}(t) = \hat{\alpha}_k^{(n)} s(t - \hat{\tau}_k^{(n)}) + \beta_k \Big[ y(t) - \sum_{l=1}^{K} \hat{\alpha}_l^{(n)} s(t - \hat{\tau}_l^{(n)}) \Big].    (52)

M step: For k = 1, 2, ..., K,

    \min_{\alpha_k, \tau_k} \int_T \big| \hat{x}_k^{(n)}(t) - \alpha_k s(t - \tau_k) \big|^2 \, dt \;\rightarrow\; \big( \hat{\alpha}_k^{(n+1)}, \hat{\tau}_k^{(n+1)} \big).    (53)

Assuming that the observation interval T is long compared to the duration of the signal and the maximum expected delay, the two-parameter maximization required in (53) can be carried out in two steps as follows:

    \hat{\tau}_k^{(n+1)} = \arg\max_{\tau} \big| g_k^{(n)}(\tau) \big|    (54)

    \hat{\alpha}_k^{(n+1)} = \frac{1}{E} \, g_k^{(n)}\big( \hat{\tau}_k^{(n+1)} \big)    (55)

where E = ∫_T |s(t)|² dt is the signal energy and

    g_k^{(n)}(\tau) = \int_T \hat{x}_k^{(n)}(t) \, s^*(t - \tau) \, dt.    (56)

We note that g_k^{(n)}(τ) can be generated by passing x̂_k^{(n)}(t) through a filter matched to s(t). The algorithm is illustrated in Fig. 3. This computationally attractive algorithm iteratively decreases the objective function in (51) without ever going through the indicated multiparameter optimization. The complexity of the algorithm is essentially unaffected by the assumed number of signal paths. As K increases, we have to increase the number of matched filters in parallel; however, each matched filter output is maximized separately.
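A sketch of the resulting multipath iteration in Python, following (52)-(56): decompose the residual, pass each component through the matched filter (a correlation), pick the peak as in (54), and rescale as in (55). It works on uniformly sampled records with integer-sample delays and circular shifts; these simplifications, and all names, are ours rather than the paper's.

```python
import numpy as np

def em_multipath(y, s, alpha0, tau0, n_iter=15):
    """EM for y(t) = sum_k alpha_k s(t - tau_k) + n(t), cf. (52)-(56).

    y, s         : observed record and known pulse (uniform sampling)
    alpha0, tau0 : initial amplitudes and integer-sample delays (length K)
    """
    alpha = np.array(alpha0, dtype=float)
    tau = np.array(tau0, dtype=int)
    K = alpha.size
    beta = np.full(K, 1.0 / K)                # free weights, sum to 1, cf. (21)
    sp = np.pad(s, (0, len(y) - len(s)))      # pulse zero-padded to record length
    E = np.sum(sp ** 2)                       # pulse energy, cf. (55)
    for _ in range(n_iter):
        comps = np.array([a * np.roll(sp, d) for a, d in zip(alpha, tau)])
        resid = y - comps.sum(axis=0)         # circular shifts, for simplicity
        for k in range(K):
            xk = comps[k] + beta[k] * resid   # E step, cf. (52)
            # matched filter: g[d] = sum_t x_k(t) s(t - d), cf. (56)
            g = np.correlate(xk, sp, mode="full")[len(y) - 1:]
            tau[k] = int(np.argmax(np.abs(g)))  # peak pick, cf. (54)
            alpha[k] = g[tau[k]] / E            # amplitude update, cf. (55)
    return alpha, tau
```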
If the number K of signal paths is unknown, several criteria for its determination are developed in [12]. These criteria are based on the ML parameter estimates and can therefore be easily incorporated into the algorithm.

To demonstrate the performance of the algorithm, we have considered the following example. The observed signal y(t) consists of three signal paths in additive noise:

    y(t) = \sum_{k=1}^{3} \alpha_k s(t - \tau_k) + n(t)

where s(t) is a trapezoidal pulse, rising linearly from 0 to 1 over 0 ≤ t < 5, equal to 1 for 5 ≤ t ≤ 15, and falling linearly back to 0 over 15 < t ≤ 20. The actual delays are

    \tau_1 = 0 \qquad \tau_2 = 5 \qquad \tau_3 = 10,

and the amplitude scales are

    \alpha_k = 1, \qquad k = 1, 2, 3.

The additive noise is spectrally flat with a spectral level of N = 0.025 (so that the postintegration signal-to-noise ratio (SNR) per pulse is approximately 16 dB). The observed data consist of 100 time samples, as illustrated in Fig. 4.
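For concreteness, the example's observed record can be synthesized along the following lines. This is a sketch: the piecewise slopes of the pulse and the time axis are our reading of the garbled original and of the Fig. 4 axis, and the discrete-time noise variance is taken equal to the spectral level N = 0.025, an assumption about the sampling convention.

```python
import numpy as np

rng = np.random.default_rng(0)

def pulse(t):
    """Trapezoid: rises on [0, 5), flat at 1 on [5, 15], falls on (15, 20]."""
    return np.where(t < 0, 0.0,
           np.where(t < 5, t / 5.0,
           np.where(t <= 15, 1.0,
           np.where(t <= 20, (20.0 - t) / 5.0, 0.0))))

t = np.linspace(-40, 60, 100)              # 100 time samples, cf. Fig. 4
delays = [0.0, 5.0, 10.0]                  # tau_1, tau_2, tau_3
y = sum(pulse(t - tau) for tau in delays)  # alpha_k = 1 for all three paths
y = y + np.sqrt(0.025) * rng.standard_normal(t.size)  # noise, level N = 0.025
```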
Fig. 3. The EM algorithm for multipath time-delay estimation.

Fig. 4. The observed data.
In Fig. 5 we have plotted the matched filter output as a function of delay. As we can see, the conventional method cannot resolve the various signal paths and estimate their parameters.

First, we have computed the ML estimates by direct minimization of the objective function in (51) using exhaustive search. We obtain
    \hat{\tau}_1 = 0.0117 \qquad \hat{\tau}_2 = 5.0031 \qquad \hat{\tau}_3 = 9.9884
    \hat{\alpha}_1 = 1.1511 \qquad \hat{\alpha}_2 = 0.7799 \qquad \hat{\alpha}_3 = 0.9471,

and the value of the objective function at the minimum (corresponding to the value of the log-likelihood function at the maximum) is

    J = 0.45879.

Fig. 5. The conventional matched filter response.
We have also computed lower bounds on the root mean square (rms) error of each parameter estimate using the Cramer-Rao inequality. We obtain

    \sigma(\hat{\tau}_1) = 0.028 \qquad \sigma(\hat{\tau}_2) = 0.030 \qquad \sigma(\hat{\tau}_3) = 0.028
    \sigma(\hat{\alpha}_1) = 0.076 \qquad \sigma(\hat{\alpha}_2) = 0.079 \qquad \sigma(\hat{\alpha}_3) = 0.076

where σ(τ̂_k) denotes the minimum attainable rms error in the τ_k estimate, and σ(α̂_k) denotes the minimum attainable rms error in the α_k estimate.
We now apply our algorithm. In Fig. 6 we have plotted the matched filter response to the various signal paths as they evolve with the iterations. In Fig. 7 we have tabulated the results using several arbitrarily selected starting points, as indicated by the first line of each table. We see that after 10-15 iterations, the algorithm converges within the Cramer-Rao lower bound to the ML estimates of all the unknown parameters, regardless of the initial guess. We note that the small differences in the final estimates result from the finite grid used for the optimization.

Using the asymptotic efficiency and lack of bias of the ML estimates, we can claim with some confidence that the rms error performance of the algorithm is the minimum attainable, characterized by the Cramer-Rao lower bound.
Fig. 6. The matched filter response to each signal path.
V. PASSIVE MULTIPLE SOURCE LOCATION ESTIMATION

The basic system of interest here consists of K spatially distributed sources radiating noise-like signals toward an array of M spatially distributed sensors. Assuming perfect propagation conditions in the medium, and ignoring amplitude attenuations of the signal wavefronts across the array, the actual waveform observed at the mth sensor output is

    y_m(t) = \sum_{k=1}^{K} s_k(t - \tau_{km}) + n_m(t), \qquad m = 1, 2, \ldots, M    (57)
Fig. 7. Tables of results for the multipath time-delay estimation.
where s_k(t) is the kth source signal, n_m(t) is the additive noise at the mth sensor output, and τ_km is the travel time of the signal wavefront from the kth source to the mth sensor.

Information concerning the various source location parameters can be extracted by measuring the various τ_km. In the passive case, one can only measure the travel time differences, obtainable by selecting one sensor as a reference and comparing its output to that of every other sensor. If we let sensor M be the reference and set τ_kM = 0, then τ_km measures the travel time difference of the kth signal wavefront to the (m, M) sensor pair.

To simplify the exposition, suppose that the various signal sources are relatively far-field so that the observed signal wavefronts are essentially planar across the array. If we further suppose that the array sensors are colinear, then

    \tau_{km} = \frac{d_m}{c} \sin \theta_k    (58)

where d_m is the spacing between sensors m and M, c is the velocity of propagation in the medium, and θ_k is the angle of arrival of the kth signal wavefront relative to the boresight.

Substituting (58) into (57) and concatenating the various equations, we obtain

    y(t) = \sum_{k=1}^{K} s_k(t; \theta_k) + n(t)    (59)

where y(t) = col[y_1(t), ..., y_M(t)], n(t) = col[n_1(t), ..., n_M(t)], and

    s_k(t; \theta_k) = \operatorname{col}\big[ s_k(t - \gamma_1 \sin\theta_k), \ldots, s_k(t - \gamma_M \sin\theta_k) \big]    (60)

where γ_m = d_m/c.
Suppose that the various s_k(t) and the various n_m(t) are mutually independent, WSS, zero-mean Gaussian random processes with spectral densities S_k(ω) and N_m(ω), respectively (the cases where the s_k(t) are deterministic known/unknown signals are presented in [2] and [11], respectively). We want to find the ML estimates of θ_1, θ_2, ..., θ_K given observations of y(t).

Assuming that the observation interval T is long compared to the correlation time (inverse bandwidth) of the signals and the noises, the direct ML approach requires the solution to [see (35)]

    \min_{\theta_1, \theta_2, \ldots, \theta_K} \sum_l \big[ \log\det P(\omega_l; \theta) + Y^*(\omega_l) P^{-1}(\omega_l; \theta) Y(\omega_l) \big]    (61)

where the Y(ω_l) are the Fourier coefficients [or the DFT, in the discrete case] of y(t) and

    P(\omega; \theta) = \sum_{k=1}^{K} S_k(\omega) V(\omega; \theta_k) V^*(\omega; \theta_k) + N(\omega)    (62)

where

    V(\omega; \theta_k) = \operatorname{col}\big[ e^{-j\omega\gamma_1 \sin\theta_k}, \ldots, e^{-j\omega\gamma_M \sin\theta_k} \big]    (63)

and

    N(\omega) = \operatorname{diag}\big[ N_1(\omega), \ldots, N_M(\omega) \big].    (64)
This is a complicated multiparameter optimization problem in several unknowns. Consequently, numerous ad hoc solutions and suboptimal approaches have been proposed in the recent literature (e.g., [13]-[18]). Still, the most common approach consists of beamforming and searching for the K highest peaks. If the various signal sources are widely separated, this approach is nearly optimal. However, if the source signals are closely spaced, we are likely to obtain poor estimates of the various θ_k's.

Identifying the model in (59) as a special case of (13), the algorithm specified by (48) and (49) is directly applicable, where

    \Lambda_k(\omega; \theta_k) = S_k(\omega) V(\omega; \theta_k) V^*(\omega; \theta_k) + \beta_k N(\omega).    (65)

Now,

    \det \Lambda_k(\omega; \theta_k) = \Big[ 1 + \frac{1}{\beta_k} S_k(\omega) V^*(\omega; \theta_k) N^{-1}(\omega) V(\omega; \theta_k) \Big] \det\big[ \beta_k N(\omega) \big]    (66)

and, by the matrix inversion lemma,

    \Lambda_k^{-1}(\omega; \theta_k) = \frac{1}{\beta_k} N^{-1}(\omega) - \frac{S_k(\omega)}{\beta_k^2} \cdot \frac{ N^{-1}(\omega) V(\omega; \theta_k) V^*(\omega; \theta_k) N^{-1}(\omega) }{ 1 + \frac{1}{\beta_k} S_k(\omega) V^*(\omega; \theta_k) N^{-1}(\omega) V(\omega; \theta_k) }.    (67)
Substituting (66) and (67) into (49) and ignoring the terms that are independent of θ_k, the M step can be further simplified. The resulting algorithm is as follows.

E step: For k = 1, 2, ..., K, compute

    \widehat{X_k X_k^*}(\omega_l)^{(n)} = \Lambda_k(\omega_l; \hat{\theta}_k^{(n)}) - \Lambda_k(\omega_l; \hat{\theta}_k^{(n)}) P^{-1}(\omega_l; \hat{\theta}^{(n)}) \Lambda_k(\omega_l; \hat{\theta}_k^{(n)}) + \Lambda_k(\omega_l; \hat{\theta}_k^{(n)}) P^{-1}(\omega_l; \hat{\theta}^{(n)}) Y(\omega_l) Y^*(\omega_l) P^{-1}(\omega_l; \hat{\theta}^{(n)}) \Lambda_k(\omega_l; \hat{\theta}_k^{(n)}).    (68)

M step: For k = 1, 2, ..., K,

    \max_{\theta_k} \sum_l V^*(\omega_l; \theta_k) N^{-1}(\omega_l) \, \widehat{X_k X_k^*}(\omega_l)^{(n)} \, N^{-1}(\omega_l) V(\omega_l; \theta_k) \;\rightarrow\; \hat{\theta}_k^{(n+1)}.    (69)

We note that the objective function in (69) is the array beamformer, where the product X_k(ω_l) X_k^*(ω_l) is substituted by its current estimate \widehat{X_k X_k^*}(ω_l)^{(n)}. The algorithm is illustrated in Fig. 8. This computationally attractive algorithm decreases iteratively the objective function in (61) without ever going through the indicated multiparameter optimization. The complexity of the algorithm is essentially unaffected by the assumed number of signal sources. As K increases, we have to increase the number of beamformers in parallel; however, each beamformer output is maximized separately.
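A sketch of the M step (69) for a single source: scan a grid of candidate bearings, form the steering vector (63) at each frequency, and accumulate the beamformer power against the current estimate of X_k X_k^* from (68). It assumes spectrally flat noise, so that N^{-1}(ω) enters only as a constant scale; all names are ours.

```python
import numpy as np

def bearing_m_step(stats_k, omegas, gamma, grid_deg):
    """Beamformer scan for one source, cf. (63) and (69).

    stats_k  : per-bin (M x M) estimates of X_k X_k^* from the E step (68)
    omegas   : frequencies w_l of those bins
    gamma    : length-M sensor delays gamma_m = d_m / c
    grid_deg : candidate bearings in degrees
    """
    best_th, best_val = None, -np.inf
    for th in grid_deg:
        u = np.sin(np.deg2rad(th))
        val = 0.0
        for w, S in zip(omegas, stats_k):
            v = np.exp(-1j * w * gamma * u)   # steering vector V(w; theta), cf. (63)
            val += np.real(v.conj() @ S @ v)  # accumulated beamformer power
        if val > best_val:
            best_th, best_val = th, val
    return best_th
```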
To demonstrate the performance of the algorithm, we have considered the following example. The array consists of five colinear and evenly spaced sensors. There are two far-field signal sources at bearings

    \theta_1 = 0^\circ \qquad \theta_2 = 10^\circ

relative to the boresight. The array-source geometry is shown in Fig. 9. The signals and the noises are spectrally flat with S_k(ω) = S and N_m(ω) = N over the frequency band [−W/2, W/2]. We assume that S/N = 1 and that WT/2π = 20 (so that the postintegration SNR per channel is approximately 23 dB). The array length is taken to be L = 6λ, where λ is the wavelength associated with the highest frequency in the signal band.

In Fig. 10 we have plotted the array beamformer response as a function of the bearing angle. As we can see, the conventional beamformer cannot resolve between the signal sources and estimate their bearings.

The ML estimates, obtained by direct minimization of the objective function in (61), are

    \hat{\theta}_1 = -0.0563 \qquad \hat{\theta}_2 = 10.4556,

and the value of the objective function at the minimum (corresponding to the value of the log-likelihood function at the maximum) is

    J = 159.0137.

Fig. 9. Array-source geometry.

Fig. 10. The conventional beamformer.
We have also computed the Cramer-Rao lower bound on the rms error of each parameter estimate. We obtain

    \sigma(\hat{\theta}_1) = 0.2680 \qquad \sigma(\hat{\theta}_2) = 0.2722.

We now apply our algorithm. In Fig. 11 we have plotted the beamformer response to the various signal sources as they evolve with the iterations. In Fig. 12 we have tabulated the results using several arbitrarily selected initial guesses. We see that in all cases, after 5-10 iterations, the algorithm essentially converges, within the Cramer-Rao lower bound, to the ML estimates of all the unknown bearing parameters simultaneously, and the various signal sources are correctly resolved.
Fig. 11. The beamformer response to each signal source.

Fig. 12. Tables of results for the multiple source angle-of-arrival estimation.

VI. DISCUSSION

We have presented a computationally efficient algorithm for ML parameter estimation of superimposed signals based on the EM method. The algorithm is developed
for the case of deterministic (known) signals and for the case of stationary Gaussian signals. The most striking feature of the algorithm is that it decouples the full multidimensional search associated with the direct ML approach into searches in smaller-dimensional parameter subspaces, leading to a considerable simplification in the computations involved. The algorithm is applied to the multipath time-delay and multiple source location estimation problems; in both cases, we demonstrate the performance of the algorithm in a given example and show that the algorithm converges iteratively to the exact ML estimates of all the unknown parameters simultaneously, where each iteration increases the likelihood.

We finally note that the derivation of the EM algorithm for the linear-Gaussian case (the Appendix) has importance of its own since it covers a wide range of applications.
APPENDIX
DEVELOPMENT OF THE EM ALGORITHM FOR THE LINEAR-GAUSSIAN CASE

Suppose the "complete" data X and the observed (incomplete) data Y are related by the linear transformation

    Y = HX    (A1)

where H is a noninvertible matrix and X possesses the following multivariate Gaussian probability density:

    f_X(x; \theta) = c_1 \big[ \det \Lambda(\theta) \big]^{-1/\lambda} \exp\Big\{ -\frac{1}{\lambda} [x - m(\theta)]^* \Lambda^{-1}(\theta) [x - m(\theta)] \Big\}    (A2)

where m(θ) and Λ(θ) are the mean vector and covariance matrix of X, c_1 is a normalization constant, and λ = 1 if X is real-valued and λ = 2 if X is complex-valued. Taking the logarithm of (A2),

    \log f_X(x; \theta) = c - \frac{1}{\lambda} \big[ \log\det \Lambda(\theta) + m^*(\theta) \Lambda^{-1}(\theta) m(\theta) - x^* \Lambda^{-1}(\theta) m(\theta) - m^*(\theta) \Lambda^{-1}(\theta) x + \operatorname{tr}\big( \Lambda^{-1}(\theta) x x^* \big) \big]    (A3)

where c is a constant independent of θ. Taking the conditional expectation of (A3) given Y = y at a parameter value θ′,

    U(\theta, \theta') = c - \frac{1}{\lambda} \big[ \log\det \Lambda(\theta) + m^*(\theta) \Lambda^{-1}(\theta) m(\theta) - \hat{x}^* \Lambda^{-1}(\theta) m(\theta) - m^*(\theta) \Lambda^{-1}(\theta) \hat{x} + \operatorname{tr}\big( \Lambda^{-1}(\theta) \widehat{xx^*} \big) \big]    (A4)

where

    \hat{x} = E\{X \mid Y = y; \theta'\}    (A5)

and

    \widehat{xx^*} = E\{XX^* \mid Y = y; \theta'\}.    (A6)

Since X and Y are related by a linear transformation, they are jointly Gaussian, and the conditional expectations required in (A5) and (A6) can be computed by a straightforward modification of existing results. We obtain

    \hat{x} = m(\theta') + \Lambda(\theta') H^* \big[ H \Lambda(\theta') H^* \big]^{-1} \big[ y - H m(\theta') \big]    (A7)

    \widehat{xx^*} = \Lambda(\theta') - \Lambda(\theta') H^* \big[ H \Lambda(\theta') H^* \big]^{-1} H \Lambda(\theta') + \hat{x}\hat{x}^*.    (A8)

Note that if we set θ′ = θ, (A7) and (A8) are the well-known formulas for the conditional expectations in the Gaussian case (e.g., [19]).

Substituting (A7) and (A8) into (A4), the function U(θ, θ′) required for the EM algorithm is given in closed form. We observe that U(θ, θ′) and log f_X(x; θ) have the same dependence on θ. Maximizing U(θ, θ′) with respect to θ is therefore the same as maximizing log f_X(x; θ) with respect to θ. Hence, the EM algorithm essentially requires the ML solution in the X model, which might be significantly simpler than the direct ML solution in the Y model.

In correspondence with (11) and (12), the EM algorithm for the linear-Gaussian case is given by the following.

E step:

    \hat{x}^{(n)} = m(\hat{\theta}^{(n)}) + \Lambda(\hat{\theta}^{(n)}) H^* \big[ H \Lambda(\hat{\theta}^{(n)}) H^* \big]^{-1} \big[ y - H m(\hat{\theta}^{(n)}) \big]    (A9)

    \widehat{xx^*}^{(n)} = \Lambda(\hat{\theta}^{(n)}) - \Lambda(\hat{\theta}^{(n)}) H^* \big[ H \Lambda(\hat{\theta}^{(n)}) H^* \big]^{-1} H \Lambda(\hat{\theta}^{(n)}) + \hat{x}^{(n)} \hat{x}^{(n)*}.    (A10)

M step:

    \min_{\theta} \big\{ \log\det \Lambda(\theta) + m^*(\theta) \Lambda^{-1}(\theta) m(\theta) - \hat{x}^{(n)*} \Lambda^{-1}(\theta) m(\theta) - m^*(\theta) \Lambda^{-1}(\theta) \hat{x}^{(n)} + \operatorname{tr}\big( \Lambda^{-1}(\theta) \widehat{xx^*}^{(n)} \big) \big\} \;\rightarrow\; \hat{\theta}^{(n+1)}.    (A11)
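The conditional moments (A9) and (A10) amount to a few lines of linear algebra. A sketch in Python, with `m` and `Lam` denoting the complete-data mean and covariance evaluated at the current estimate (our names, not the paper's):

```python
import numpy as np

def linear_gaussian_e_step(y, H, m, Lam):
    """Conditional first and second moments, cf. (A9)-(A10).

    y   : observed vector with y = Hx, H a (wide) noninvertible matrix
    m   : mean of the complete data X at the current estimate
    Lam : covariance of X at the current estimate
    """
    G = Lam @ H.conj().T @ np.linalg.inv(H @ Lam @ H.conj().T)   # gain matrix
    x_hat = m + G @ (y - H @ m)                                  # (A9)
    xx_hat = Lam - G @ H @ Lam + np.outer(x_hat, x_hat.conj())   # (A10)
    return x_hat, xx_hat
```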
ACKNOWLEDGMENT
The authors wish to thank D. Kelter and Y. Eyal for
their help in the simulations, and the referees for many
helpful comments.
REFERENCES

[1] M. Feder and E. Weinstein, "Optimal multiple source location estimation via the EM algorithm," in Proc. 1985 Int. Conf. Acoust., Speech, Signal Processing (ICASSP '85), Tampa, FL, Mar. 1985, pp. 1762-1765.
[2] M. Feder and E. Weinstein, "Multipath and multiple source array processing via the EM algorithm," in Proc. 1986 Int. Conf. Acoust., Speech, Signal Processing (ICASSP '86), Tokyo, Japan, Apr. 1986.
[3] J. R. Grindon, "Estimation theory for a class of nonlinear random vector channels," Ph.D. dissertation, Washington Univ., St. Louis, MO, 1970.
[4] J. S. Lim and A. V. Oppenheim, "All-pole modeling of degraded speech," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-26, pp. 197-210, 1978.
[5] M. Segal, "Resolution and estimation techniques with application to vectorial miss-distance indicator," M.Sc. thesis, Technion-Israel Inst. Technol., Haifa, Israel, Aug. 1981.
[6] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Roy. Stat. Soc., Ser. B, vol. 39, pp. 1-38, Dec. 1977.
[7] C. F. J. Wu, "On the convergence properties of the EM algorithm," Ann. Stat., vol. 11, pp. 95-103, 1983.
[8] I. Meilijson, "A fast improvement to the EM algorithm on its own terms," submitted for publication.
[9] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I. New York: Wiley, 1968.
[10] J. E. Ehrenberg, T. E. Ewart, and R. D. Morris, "Signal processing techniques for resolving individual pulses in a multipath signal," J. Acoust. Soc. Amer., vol. 63, no. 6, pp. 1861-1865, 1978.
[11] M. Feder and E. Weinstein, "Optimal multiple source location estimation via the EM algorithm," Woods Hole Oceanographic Institution, Woods Hole, MA, Tech. Rep. WHOI-85-25, July 1985.
[12] M. Wax and T. Kailath, "Detection of signals by information theoretic criteria," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 387-392, 1985.
[13] S. S. Reddi, "Multiple source location-A digital approach," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, pp. 95-105, 1979.
[14] G. V. Borgiotti and L. J. Kaplan, "Superresolution of uncorrelated interference sources by using adaptive array techniques," IEEE Trans. Antennas Propagat., vol. AP-27, pp. 842-845, 1979.
[15] R. O. Schmidt, "A signal subspace approach to multiple emitter location and spectral estimation," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1981.
[16] G. Su and M. Morf, "The signal subspace approach for multiple wide-band emitter location," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1503-1522, 1983.
[17] B. Porat and B. Friedlander, "Estimation of spatial and spectral parameters of multiple sources," IEEE Trans. Inform. Theory, vol. IT-29, pp. 412-425, 1983.
[18] A. Nehorai, G. Su, and M. Morf, "Estimation of time difference of arrival for multiple ARMA sources by pole decomposition," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1478-1491, 1983.
[19] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Englewood Cliffs, NJ: Prentice-Hall, 1979.
Meir Feder received the B.Sc. and M.Sc. degrees (summa cum laude) from Tel-Aviv University, Tel-Aviv, Israel, and the Sc.D. degree from the Massachusetts Institute of Technology, Cambridge, and the Woods Hole Oceanographic Institution, Woods Hole, MA, all in electrical engineering, in 1980, 1984, and 1987, respectively.

He is currently a Research Associate in the Department of Electrical Engineering and Computer Science at M.I.T. His research interests lie in the broad area of signal processing, stochastic estimation, and information theory. He is particularly interested in investigating new statistical estimation techniques and using them in signal processing applications, e.g., speech, image, and array processing problems.
Ehud Weinstein received the B.Sc. degree from the Technion-Israel Institute of Technology, Haifa, Israel, and the Ph.D. degree from Yale University, New Haven, CT, both in electrical engineering, in 1975 and 1978, respectively.

He is currently an Associate Professor in the Department of Electronic Systems, Faculty of Engineering, Tel-Aviv University, Tel-Aviv, Israel, and an Adjunct Scientist with the Department of Ocean Engineering, Woods Hole Oceanographic Institution, Woods Hole, MA. His research interests are in the general area of statistical estimation theory and its application to signal and array processing.

Dr. Weinstein is the co-recipient of the IEEE Acoustics, Speech, and Signal Processing Society's 1983 Senior Award.