Wiener Filtering

advertisement
EE264: Lecture 12
Wiener Filtering
In this lecture we will take a different view of filtering. Previously, we have depended on frequency-domain
specifications to make some sort of LP/ BP/ HP/ BS filter, which would extract the desired information from
an input signal.
Now, we wish to filter a signal x[n] to modify it such that it approximates some other signal d[n] in some
statistical sense. That is, the output of the filter y[n] is a good estimate of d[n]. The output error e[n]
represents the mismatch between y[n] and d[n]. This can be considered a time-domain specification of the
filter.
x[n]
y[n]
FILTER
−
e[n]
+
+
d[n]
Why would we want to do this? Well, consider the problem of predicting the stock market. Assume that
x[n] represents the price of a particular stock on day n and d[n] is the price of that stock one day in the
future. The goal is to find a filter that will predict d[n] given x[n]. Moreover, we would like to find the best
such linear predictor.
Q: Is there an obvious solution for this case?
A: Sure, pick H(z) = z, but this is worthless for this real-time example!
In order to talk about an “optimal” filter which estimates d[n] from x[n], we must have a method of measuring how good a job the filter does. A “cost function” is used to judge the performance, and could take on
many different forms.
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A.
131
Flores
132
Wiener Filtering
Lect. 12
For example,
ξ=
E[u[|e[n]| − k]]
Mean
Threshold
Error
error
Mean
Absolute
Error
ξ = E[|e[n]|]
Mean
Square
Error
ξ=
E[e2 [n]]
error
error
We most commonly use the mean square error (MSE) as our cost function.
ξ = E[e2 [n]]
where E[·] represents statistical expectation, and
2
E[e [n]] =
Z
∞
−∞
x2 pe (x) dx
where pe (x) is the probability density function of the error. The filter that is optimum in the MSE sense is
called a Wiener filter.
In all our analyses here we will assume:
1. x[n] is wide-sense stationary, i.e. it has constant (and finite) mean and variance, and a correlation
function which is a function of time shift only:
E[x[m1 ] · x[m1 + n]] = E[x[m2 ] · x[m2 + n]]
∀ m 1 , m2
Incidentally, a stronger condition would be to assume that x[n] is strong-sense stationary, i.e. all of its
moments are constant and finite. An even stronger condition would be to assume that x[n] is ergodic,
i.e. all of its sample averages converge to the true moments. Fortunately, these latter two conditions
are not necessary for the present development.
2. All of the signals are zero-mean.
3. We use MSE as our error criterion.
Correlation
The definition of correlation between two random processes x[n] and y[n] (when x and y are wide sense
stationary) is:
φxy [k] = E[x[n]y[n + k]]
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
Lect. 12
Wiener Filtering
133
It is a measure of the similarity between two random variables, and is related to the amount of information
that one random variable contains about another.
Example:
φxx
φxx
Highly
Correlated
Uncorrelated
(white)
Two-Sided (Unconstrained) Wiener Filters
ξ =
=
=
=
E[e2 [n]]
E[(d[n] − y[n])2 ]
E[d2 [n]] − 2E[y[n]d[n]] + E[y 2 [n]]
φdd [0] − 2φyd [0] + φyy [0]
So, the cost function is itself a function of the correlation and cross correlation of the output of the filter and
the desired output. We know that:
y[n] =
∞
X
m=−∞
h[m]x[n − m]
Observe that, in general, h[m] represents a non-causal filter with infinite impulse response (IIR). (Later, we
consider the special case where h[m] is either a causal IIR filter or a causal FIR filter.) Substituting the
definition for y[n] into the previous result, we get:
φyd [0] = E[y[n]d[n]]
= E
=
"
∞
X
h[m]x[n − m]d[n]
m=−∞
∞
X
#
h[m]φxd [m]
m=−∞
φyy [0] = E[y[n]y[n]]

= E
=
∞
X
h[m]x[n − m]
m=−∞
∞
∞
X X
m=−∞ l=−∞
ξ = φdd [0] − 2
"
∞
X
l=−∞

h[l]x[n − l]
h[m]h[l]φxx [l − m]
∞
X
m=−∞
#

h[m]φxd [m] + 
∞
X
∞
X
m=−∞ l=−∞

h[m]h[l]φxx [l − m]
Our goal is to find h[n] which minimizes ξ. The solution is to take the derivative
dξ
d h[ni ]
and set it to zero:
∞
X
dξ
= 0 − 2φxd [ni ] + 2
hopt [m]φxx [ni − m] = 0
d h[ni ]
m=−∞
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
134
Wiener Filtering
∞
X
m=−∞
Lect. 12
hopt [m]φxx [ni − m] = φxd [ni ]
which must be satisfied for all ni .
This equation specifies the impulse response of the optimal filter. Observe that it depends on the cross
correlation between x[n] and d[n] and the autocorrelation function of x[n].
The optimal filter is known as the Wiener solution. Observing that the left hand side of the equation is equal
to the convolution of hopt [n] and φxx [n],
Hopt (z)Φxx (z) = Φxd (z)
Hopt (z) = Φxd (z)Φ−1
xx (z)
Note:
• The filter obtained may be non-causal!
• ξmin = φdd [0] −
∞
X
n=−∞
hopt [n]φxd [n] 6= 0
in general.
Example: Noise Canceling:
x[n]= s[n] + v[n]
y[n]
Noise
Canceler
e[n]
−
+
+
d[n] = s[n]
Assuming that s[n] and v[n] are uncorrelated, zero mean:
E[x[n]x[n + k]] =
=
Φxx (z) =
E[x[n]d[n + k]] =
=
Φxd (z) =
Thus,
E[(s[n] + v[n])(s[n + k] + v[n + k])]
φss [k] + φvv [k]
Φss (z) + Φvv (z)
E[(s[n] + v[k])s[n + k]]
φss [k]
Φss (z)
Φss (z)
Hopt (z) =
Φss (z) + Φvv (z)
We notice two properties from this solution:
1. At “frequencies” where Φvv (z) → 0 (i.e. the noise is zero) there is no need to filter the signal and
Hopt (z) → 1.
2. At “frequencies” where Φvv (z) → ∞ (i.e. where the noise dominates) the filter is “turned off” and
Hopt (z) → 0.
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
Lect. 12
Wiener Filtering
135
Numerical Example: Assume that the autocorrelation function of the input signal is:
|k|
1
2
10
φss [k] =
27
and that the autocorrelation of the (white) noise source is:
2
φvv [k] = δ[k]
3
The signal and noise are assumed to be uncorrelated with each other.
φss
φvv
To find the Wiener filter, we need to obtain Φ xd (z) and Φxx (z). We can first find φxd [k] and φxx [k], and
then z-transform them. Since the signal and noise are uncorrelated with each other,
φxx [k] = φss [k] + φvv [k]
10 1 |k| 2
=
+ δ[k],
27 2
3
Φxx (z) = Φss (z) + Φvv (z).
The z-transform of the noise autocorrelation is
Φvv (z) =
2
.
3
The z-transform of the signal autocorrelation (after some math) is
Φss (z) =
5/18
(1 −
1 −1
2 z )(1
− 12 z)
.
Combining these:
Φxx (z) =
5/18
1 −1
2 z )(1
− 12 z)
− 6z −1
+
(1 −
20 − 6z
=
.
18(1 − 12 z −1 )(1 − 12 z)
2
3
We saw earlier,
φxd [k] = φss [k],
Φxd (z) = Φss (z).
Combining these results:
Hopt (z) =
5/18
(1 − 13 z −1 )(1 − 31 z)
5
hopt [n] =
16
|n|
1
3
,
∀n
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
136
Wiener Filtering
Lect. 12
Q: How did we pick this hopt [n] without knowing the ROC of Hopt (z)?
A: We need a stable noise canceller.
Clearly this resulting filter is non-causal. We have two recourses:
1. Limit the filter to be a causal IIR filter.
2. Compute the optimal FIR filter instead.
Causal (Shannon-Bode) Wiener Filters
Our interest now focuses on the realization of causal Wiener filters, whose impulse responses are constrained
to be zero for negative time. The optimal causal impulse response has zero response for negative time and
has zero derivatives of ξ with respect to impulse response for all times equal to and greater than zero. The
causal Wiener equation becomes:
∞
X
m=0
hopt causal [m]φxx [n − m] = φxd [n],
hopt causal [n] = 0,
n≥0
n < 0.
This is not a simple convolution like the unconstrained Wiener equation, and special methods will be needed
to find useful solutions. The approach developed by Shannon and Bode will be used. We begin with a simple
case. Let the filter input be white with zero-mean and unit variance, so that
φxx [n] = δ[n].
For this input, the causal Wiener equations become
∞
X
m=0
hopt causal [m]φxx [n − m] = hopt causal [n] = φxd [n],
hopt causal [n] = 0,
n≥0
n<0
With the same white input, but without the causality constraint, the Wiener equation would be
∞
X
m=−∞
hopt [m]φxx [n − m] = hopt [n] = φxd [n],
∀n.
From this we may conclude that when the input to the Wiener filter is white, the optimal solution with a
causality constraint is the same as the optimal solution without the constraint, except that with the causality
constraint the impulse response is set to zero for negative time. With a white input, the causal solution is
easy to obtain, and it is key to Shannon-Bode. You find the unconstrained two-sided Wiener solution and
remove the non-causal part in the time domain. You don’t do this if the input is not white.
Usually, the input to the Wiener filter is not white. Therefore the first step in the Shannon-Bode realization
is to whiten the input signal. A whitening filter can always be designed to do this, using a priori knowledge
of the input autocorrelation function or its z-transform.
Assume that the z-transform of the autocorrelation function is a rational function of z that can be written as
the ratio of a numerator polynomial in z to a denominator polynomial in z. Factor both the numerator and
denominator polynomials. The autocorrelation function is symmetric, and its z-transform has the following
symmetry:
φxx [n] = φxx [−n],
Φxx (z) = Φxx (z −1 ).
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
Lect. 12
Wiener Filtering
137
Accordingly, there must be symmetry in the numerator and denominator factors:
Φxx (z) = A
(1 − az −1 )(1 − az)(1 − bz −1 )(1 − bz) . . .
.
(1 − αz −1 )(1 − αz)(1 − βz −1 )(1 − βz) . . .
With no loss in generality,1 assume that all the parameters a, b, c, . . . , α, β, . . . have magnitudes less
than one. We can then factor Φxx (z) as
−
Φxx (z) = Φ+
xx (z) · Φxx (z),
where
√ (1 − az −1 )(1 − bz −1 ) . . .
A
(1 − αz −1 )(1 − βz −1 ) . . .
√ (1 − az)(1 − bz) . . .
Φ−
A
.
xx (z) =
(1 − αz)(1 − βz) . . .
Φ+
xx (z) =
−
All poles and zeros of Φ+
xx (z) will be inside the unit circle in the z-plane. All poles and zeros of Φ xx (z)
will be outside the unit circle in the z-plane. Furthermore,
−
−1
+
−1
−
Φ+
xx (z) = Φxx (z ), Φxx (z ) = Φxx (z).
The whitening filter can now be designed. Let it have the transfer function
Hwhitening (z) =
1
.
Φ+
xx (z)
To verify its whitening properties, let its input be x[n] with autocorrelation function φ xx [n], and let its output
be x̃[n] with autocorrelation function φ x̃x̃ [n]. The transform of the output autocorrelation function is
Φx̃x̃ (z) = Hwhitening (z −1 ) · Hwhitening (z) · Φxx (z)
1
1
· Φxx (z)
= + −1 · +
Φxx (z ) Φxx (z)
1
1
= −
· +
· Φxx (z)
Φxx (z) Φxx (z)
= 1.
The autocorrelation function of the output is
φx̃x̃ [n] = δ[n].
Therefore the output is white with unit mean square.
It is important to note that the whitening filter is both causal and stable, since the zeros of Φ +
xx (z) are all
inside the unit circle in the z-plane. Furthermore, using this whitening filter does nothing irreversible to the
signal. The inverse of the whitening filter has a transfer function equal to Φ +
xx (z), which is also stable and
causal because Φ+
(z)
has
all
of
its
poles
inside
the
unit
circle.
The
whitening
filter can be readily utilized
xx
since it is causal, stable, and invertible with a causal, stable filter.
1
The only case not included involves zeros of Φxx (z) exactly on the unit circle. This is a special case and it requires special
treatment. These zeros are assumed to be somewhere off the unit circle, and they are placed back on it by a limiting process.
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
138
Wiener Filtering
x̃[n]
1
x[n]
Φ+
xx (z)
Yopt (z)
Lect. 12
y[n]
+
e[n]
+
d[n]
The subsequent filter is easily designed, since its input is white. We represent its transfer function by
Yopt causal (z) and can design it by first obtaining Y opt (z) disregarding causality, and then removing the
non-causal part of its impulse response.
We wish to find Yopt (z) disregarding causality in terms of the autocorrelation of x[n] and crosscorelation
with d[n]. To do so, we could work out expressions for Φ x̃x̃ (z) and Φx̃d (z) in terms of Φxx (z) and Φxd (z)
and then use the formula for the uncostrained Wiener filter. However, there is an easier way; notice that the
Φxd (z)
cascading of Φ+1(z) and Yopt (z) should be equal to the unconstrained Wiener filter given by Φ
, hence
xx (z)
xx
Φxd (z)
· Φ+
xx (z)
Φxx (z)
Φxd (z)
= −
Φxx (z)
Yopt (z) =
To obtain Yopt causal (z), we first inverse transform Yopt (z) into the time domain, remove the non-causal
part, and then z-transform the causal part obtained. This operation cannot be done with z-transforms alone.
A special notation has been devised to represent taking the causal part:
"
Φxd (z)
Yopt causal (z) = [Yopt (z)]+ =
Φ−
xx (z)
#
.
+
The Shannon-Bode realization of the causal Wiener filter can now be formulated:
"
Φxd (z)
1
Hopt causal (z) = +
Φxx (z) Φ−
xx (z)
#
.
+
Example:
An example helps to illustrate how this formula is used. We will rework the above noise filtering example,
only in this case the Wiener filter will be designed to be causal.
The first step is to factor Φxx :
Φxx (z) =
(1 − 13 z)(1 − 31 z −1 )
.
(1 − 12 z)(1 − 21 z −1 )
Therefore,
Φ+
xx =
(1 − 13 z −1 )
(1 − 31 z)
−
,
Φ
=
.
xx
(1 − 12 z −1 )
(1 − 12 z)
Φxd (z) was found before. Thus:
Φxd (z) =
5/18
(1 −
1 −1
2 z )(1
− 21 z)
.
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
Lect. 12
Wiener Filtering
139
Φxd (z)
5/18
=
−
1 −1
Φxx (z)
(1 − 2 z )(1 − 31 z)
Yopt (z) =
=
"
1
1 −1
6z
− 12 z −1
+
1
3
1 − 13 z
#
.
The filter Yopt (z) is generally two-sided in the time domain. In order for it to be stable, its two terms must
correspond to time domain components as follows:
1
1 −1
6z
− 12 z −1
1
3
1 − 13 z
1
6
→ (right-handed time function).
→ (left-handed time function).
1
1
12
1
24
1
3
1
3
1
27
1 − 13 z
1
9
"
1
3
1
27
1
6
1
9
1 −1
z
6
− 12 z −1
1
12
"
1
12
1
1 −1
z
6
− 12 z −1
+
1
3
1 − 13 z
#
= Yopt (z)
1
24
1
3
1
6
1
1 −1
z
6
− 12 z −1
+
1
3
1 − 13 z
#
= Yopt causal (z)
+
1
24
Yopt causal (z) =
=
1
1 1 −1
+ z + z −2 + . . .
3 6
!12
1−
1
3
1 −1
2z
.
Including the whitening filter, the transfer function of the causal Wiener filter is
Hopt causal (z) =
=
1 − 12 z −1
1 − 13 z −1
1
3
1 −1 .
z
3
n
1−
1 1
hopt causal [n] =
3 3
,
!
1−
1
3
1 −1
2z
!
n ≥ 0.
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
140
Wiener Filtering
1
3
1
9
1
1
27
1 −1
z
3
− 13 z −1
Lect. 12
= Hopt causal (z)
1
81
It is expected that the causal Wiener filter would not perform as well as the unconstrained non-causal Wiener
filter. This is indeed the case, and can be verified with a little math.
ξmin
causal
= φdd [0] −
∞
X
hopt causal [n]φxd [n].
n=0
Optimal FIR Wiener Filters
For an FIR realization, we make the following changes in the derivation:
y[n] =
N
−1
X
k=0
h[k]x[n − k]
This time, we will solve the system using matrix notation:
Filter
H = [h0 , h1 , . . . , hN −1 ]T
Input
X = [xn , xn−1 , . . . , xn−N +1 ]T
Output y[n] = H T X = X T H
(scalar)
Error
e[n] = d[n] − y[n] = d[n] − H T X
ξ = E[e2 [n]]
= E[d2 [n]] − 2E[H T Xd[n]] + E[H T XX T H]
= φdd [0] −2H T E[Xd[n]] +H T E[XX T ] H
| {z }
|
σd2
{z
P
}
|
{z
R
}
Note: ξ is a quadratic function of the filter H. This implies (among other things) that there is a unique global
optimum.
P = E[Xd[n]]

E[x[n]d[n]]
 ..
=  .

E[x[n − N + 1]d[n]

p[0]

 ..
=  .

p[N − 1]



If X and d[n] are uncorrelated, then P = 0.
R = E[X(X)T ]

E[x[n]x[n]]
 ..
=  .
···
..
.
E[x[n]x[n − N + 1] · · ·
E[x[n − N + 1]x[n]]
..
.
E[x[n − N + 1]x[n − N + 1]]



c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
Lect. 12
Wiener Filtering

141
r[0]
···
 ..
..
=  .
.
r[N − 1] · · ·

r[N − 1]

..

.
r[0]
If x[n] is a white noise process with unit variance, then R = I, the identity matrix.
Solution: The minimum error location will be where:
T
∂ξ
∂ξ
where
∇ξ(H) =
,...,
∇ξ(Hopt ) = 0
∂ h[0]
∂ h[N − 1]
As a result,
−2P + 2RHopt = 0
RHopt = P
A correlation matrix R is always positive semi-definite, so unless R = 0, its inverse will exist:
Hopt = R−1 P
The minimum error can be found too:
ξmin =
=
=
=
ξ(Hopt )
σd2 − 2P T R−T P + P T R−T RR−1 P
σd2 − P T R−1 P
σd2 − P T Hopt
R −T =
R−1 since
R is a cov.
matrix
and since R is positive semi-definite,
ξmin ≤ σd2
Without filtering, the error is σd2 , so we do better!
c
c
c
Copyright 1995–1996
by G. Plett. Copyright 1998–2004
by M. Kamenetsky. Copyright 2005
by A. Flores
Download