Effect of Measurement Noise

http://ccrma.stanford.edu/~jos/filters/Effect_Measurement_Noise.html
In practice, measurements are never perfect. Let $\hat{y} = y + e$ denote the measured
output signal, where $e$ is a vector of ``measurement noise'' samples. Then we
have

\[ \hat{y} = y + e = \mathbf{X}h + e. \]

By the orthogonality principle [37], the least-squares estimate of $h$ is obtained by
orthogonally projecting $\hat{y}$ onto the space spanned by the columns of $\mathbf{X}$.
Geometrically speaking, choosing $\hat{h}$ to minimize the Euclidean distance between
$\hat{y}$ and $\mathbf{X}\hat{h}$ is the same thing as choosing it to minimize the sum of
squared estimated measurement errors $\|\hat{e}\|^2$. The distance from $\mathbf{X}\hat{h}$
to $\hat{y}$ is minimized when the projection error $\hat{e} = \hat{y} - \mathbf{X}\hat{h}$ is
orthogonal to every column of $\mathbf{X}$, which is true if and only if
$\mathbf{X}^T\hat{e} = 0$ [82]. Thus, we have, applying the orthogonality principle,

\[ 0 = \mathbf{X}^T\hat{e} = \mathbf{X}^T(\hat{y} - \mathbf{X}\hat{h}) = \mathbf{X}^T\hat{y} - \mathbf{X}^T\mathbf{X}\hat{h}. \]

Solving for $\hat{h}$ yields Eq. (5.16) as before, but this time we have derived it as the
least squares estimate of $h$ in the presence of output measurement error.
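The projection argument can be checked numerically. The following is a minimal sketch, not the author's code: it assumes the matrix multiplying the filter coefficients is the convolution matrix of a known input signal, and the helper name `convolution_matrix`, the three-tap filter, and the noise level are all illustrative choices.

```python
import numpy as np

def convolution_matrix(x, n_taps):
    """Build the N x n_taps matrix X so that X @ h is the first N samples of x convolved with h."""
    N = len(x)
    X = np.zeros((N, n_taps))
    for k in range(n_taps):
        X[k:, k] = x[:N - k]          # column k is the input delayed by k samples
    return X

rng = np.random.default_rng(0)
x = rng.standard_normal(200)                  # known input signal (assumed)
h_true = np.array([1.0, -0.5, 0.25])          # hypothetical filter to identify
X = convolution_matrix(x, len(h_true))

# Measured output = ideal output + measurement noise
y_meas = X @ h_true + 0.01 * rng.standard_normal(len(x))

# Solve the normal equations X^T X h = X^T y for the least-squares estimate
h_hat = np.linalg.solve(X.T @ X, X.T @ y_meas)

# The projection error should be orthogonal to every column of X
e_hat = y_meas - X @ h_hat
print(h_hat)
print(X.T @ e_hat)
```

The second printed vector should be zero up to rounding error, which is the orthogonality condition stated above.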
It is also straightforward to introduce a weighting function in the least-squares
estimate for $\hat{h}$ by replacing $\mathbf{X}^T$ in the derivations above by
$\mathbf{X}^T\mathbf{R}$, where $\mathbf{R}$ is any positive definite matrix (often taken
to be diagonal with positive entries). In the present (time-domain) formulation, it is
difficult to choose a weighting function that corresponds well to audio perception.
Therefore, in audio applications, frequency-domain formulations are generally more
powerful for linear time-invariant system identification. A practical example is the
frequency-domain equation-error method described in §G.4.3 [74].
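As a worked version of the weighted case, the orthogonality condition can be repeated with the weighting matrix inserted. The notation here is assumed: $\hat{y}$ is the measured output, $\mathbf{X}$ the matrix multiplying the filter coefficients, $\hat{e} = \hat{y} - \mathbf{X}\hat{h}$ the estimated error, and $\mathbf{R}$ the positive definite weighting matrix. The step also assumes $\mathbf{X}$ has full column rank, so that $\mathbf{X}^T\mathbf{R}\mathbf{X}$ is invertible.

```latex
0 = \mathbf{X}^T \mathbf{R}\,\hat{e}
  = \mathbf{X}^T \mathbf{R}\,(\hat{y} - \mathbf{X}\hat{h})
\quad\Longrightarrow\quad
\hat{h} = (\mathbf{X}^T \mathbf{R}\,\mathbf{X})^{-1}\,\mathbf{X}^T \mathbf{R}\,\hat{y}
```

With $\mathbf{R}$ equal to the identity matrix this reduces to the unweighted least-squares estimate.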
Noise
1. Noise Power
The power produced by a resistor at a temperature $T$ as a result of random thermal
motions of electrons is given by

\[ P = kT\,\Delta\nu \qquad (1) \]

where $k$ is Boltzmann's constant and $\Delta\nu$ is a given frequency interval. The
equivalent noise power of a receiving system is then defined as

\[ P_N = kT_{sys}\,\Delta\nu \qquad (2) \]

where $T_{sys}$ is the system temperature

\[ T_{sys} = T_R + T_A \qquad (3) \]

$T_R$ is the receiver temperature (which represents internal noise from the receiving
amplifier), and $T_A$ is the "antenna temperature", which represents the unwanted noise
from the antenna produced by ground radiation, atmospheric attenuation, and other
sources. (Note that there is actually a second nonequivalent definition of antenna
temperature.)
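A short numerical illustration of the noise power and system temperature relations. The function names and the example temperatures and bandwidth are hypothetical values chosen only for the arithmetic; the system temperature is assumed to be the sum of the receiver and antenna temperatures.

```python
BOLTZMANN_K = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)

def noise_power(temperature_k, bandwidth_hz):
    """Thermal noise power in watts: P = k * T * (frequency interval)."""
    return BOLTZMANN_K * temperature_k * bandwidth_hz

def system_temperature(t_receiver_k, t_antenna_k):
    """System temperature as receiver temperature plus antenna temperature."""
    return t_receiver_k + t_antenna_k

# Hypothetical receiver: 50 K receiver, 30 K antenna, 1 MHz bandwidth
t_sys = system_temperature(50.0, 30.0)
p_n = noise_power(t_sys, 1.0e6)
print(t_sys, p_n)
```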
2. Power Spectral Density estimation
www.le.ac.uk/eg/fss1/psde(1).doc
by Fernando S. Schlindwein

\[ S(f) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-j2\pi f\tau}\, d\tau \]

(Wiener-Khinchin theorem)
or, in the discrete form,

\[ S(f) = \sum_{k=-\infty}^{\infty} R_{xx}(k)\, e^{-j2\pi fk} \qquad (2.23) \]

where $R_{xx}(\tau)$ is the autocorrelation of $x(t)$ and is defined by

\[ R_{xx}(\tau) = E[\,x(t+\tau)\, x(t)\,] \]

Problems with that:
- Autocorrelation function is not known
- Limited (finite) amount of data available
First the autocorrelation function should be estimated. This can be done
either directly or with the DFT or FFT. If a window is then applied to the
autocorrelation estimates followed by the DFT, the result is a consistent
spectral estimate.
The biased estimate of the autocorrelation function

\[ \hat{R}_{xx}(k) = \frac{1}{N} \sum_{n=0}^{N-k-1} x(n+k)\, x(n) \qquad (2.18) \]

is generally preferred to the unbiased estimator

\[ \hat{R}^{U}_{xx}(k) = \frac{1}{N-k} \sum_{n=0}^{N-k-1} x(n+k)\, x(n) \qquad (2.20) \]

because it tends to have a smaller mean square error than the unbiased
estimate.
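The two autocorrelation estimators differ only in the divisor, which a direct implementation makes visible. This is a minimal sketch for real-valued data; the function names are illustrative, not from the source.

```python
import numpy as np

def autocorr_biased(x, max_lag):
    """Biased estimate: sum of x(n+k) x(n) over n = 0..N-k-1, divided by N."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    return np.array([np.dot(x[k:], x[:N - k]) / N for k in range(max_lag + 1)])

def autocorr_unbiased(x, max_lag):
    """Unbiased estimate: the same sum divided by N - k."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    return np.array([np.dot(x[k:], x[:N - k]) / (N - k) for k in range(max_lag + 1)])

data = [1.0, 2.0, 3.0, 4.0]
print(autocorr_biased(data, 2))
print(autocorr_unbiased(data, 2))
```

At large lags the unbiased version divides a sum of very few products by a small number, which is one way to see why its variance grows there.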
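The resulting spectral estimate is

\[ \hat{S}_{BT}(f) = T \sum_{k=-M}^{M} \hat{R}_{xx}(k)\, e^{-j2\pi fkT} \]

or, with an antileakage window $W(k)$ applied to the autocorrelation estimates:

\[ \hat{S}_{BT}(f) = T \sum_{k=-M}^{M} \hat{R}_{xx}(k)\, W(k)\, e^{-j2\pi fkT} \qquad (2.25) \]

where $-\frac{1}{2T} \le f \le \frac{1}{2T}$, and $\hat{R}_{xx}(k)$ are the estimates of the
autocorrelation function.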
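The windowed-autocorrelation estimate can be sketched as follows (this is the correlogram approach usually attributed to Blackman and Tukey). The choices here are assumptions, not the author's: the biased autocorrelation estimate is used, the window applied to the lags is triangular (Bartlett), and the function name and frequency grid are illustrative.

```python
import numpy as np

def blackman_tukey_psd(x, max_lag, T=1.0, n_freqs=256):
    """PSD estimate: T * sum over k = -M..M of R(k) W(k) exp(-j 2 pi f k T)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Biased autocorrelation estimates for lags 0..M
    r_pos = np.array([np.dot(x[k:], x[:N - k]) / N for k in range(max_lag + 1)])
    # Real data: R(-k) = R(k), so mirror to obtain lags -M..M
    r = np.concatenate((r_pos[:0:-1], r_pos))
    lags = np.arange(-max_lag, max_lag + 1)
    window = np.bartlett(2 * max_lag + 1)            # triangular lag window (assumed)
    freqs = np.linspace(-0.5 / T, 0.5 / T, n_freqs)  # -1/(2T) <= f <= 1/(2T)
    kernel = np.exp(-2j * np.pi * np.outer(freqs, lags) * T)
    psd = T * (kernel @ (r * window))
    return freqs, psd.real                           # imaginary part is rounding noise

n = np.arange(512)
freqs, psd = blackman_tukey_psd(np.cos(2 * np.pi * 0.1 * n), max_lag=64)
print(abs(freqs[np.argmax(psd)]))   # location of the spectral peak
```

The maximum lag M trades resolution against variance: a smaller M gives a smoother but blurrier estimate.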
The original periodogram
In 1898, in a paper investigating periodic variations hidden behind irregular
fluctuations of observations where both the period and the amplitude of the
periodic components were unknown, Arthur Schuster coined the word
periodogram. The original description of the periodogram is transcribed below:
"THE PERIODOGRAM. It is convenient to have a word for some
representation of a variable quantity which shall correspond to the
"spectrum" of a luminous radiation. I propose the word periodogram, and
define it more particularly in the following way
Let

\[ \tfrac{1}{2} T a = \int_{t_1}^{t_1+T} f(t) \cos(kt)\, dt, \qquad \tfrac{1}{2} T b = \int_{t_1}^{t_1+T} f(t) \sin(kt)\, dt \qquad (2.26) \]

where T may for convenience be chosen to be equal to some integer multiple
of $2\pi/k$, and plot a curve with $2\pi/k$ as abscissæ and $r = \sqrt{a^2 + b^2}$ as
ordinates; this curve, or, better, the space between this curve and the axis of
abscissæ, represents the periodogram of f(t)."
The direct method of PSDE with the Fourier technique is the discrete
version of Schuster's periodogram.
THE FFT-PERIODOGRAM APPROACH
A frequently used approach to the estimation of the power spectrum of
random signals is to compute the squared magnitude of the DFT (or FFT) of
a segment of the signal. The resulting estimate, called the periodogram, is
asymptotically unbiased, but its variance does not decrease to zero as the
length of the segment increases, and consequently the periodogram is not a
consistent estimate.
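What should be done:

\[ S(f) = \lim_{T\to\infty} E\left\{ \frac{1}{T} \left| \int_{-T/2}^{T/2} x(t)\, e^{-j2\pi ft}\, dt \right|^2 \right\} \]

What is actually done:

\[ \hat{S}_{PER}(f) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi nk/N} \right|^2 \qquad (2.27) \]

\[ \hat{S}_{PER}(f) = \frac{1}{N} \left| \mathrm{DFT}\{x(n)\} \right|^2 \qquad (2.28) \]
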
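A minimal sketch of the FFT periodogram, the squared magnitude of the DFT divided by the number of samples. The function name and the test sinusoid are illustrative; no window or sampling-interval scaling is applied here.

```python
import numpy as np

def periodogram(x):
    """Periodogram: |DFT{x(n)}|^2 / N, one value per DFT bin k = 0..N-1."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    return np.abs(np.fft.fft(x)) ** 2 / N

# A sinusoid with exactly 8 cycles in 64 samples falls on DFT bin 8 (and its mirror, bin 56)
n = np.arange(64)
x = np.cos(2 * np.pi * 8 * n / 64)
P = periodogram(x)
print(np.argmax(P[:32]), P[8])
```

With this scaling the periodogram values sum to the signal energy (Parseval's relation), which is a convenient sanity check.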
AVERAGE (OR MODIFIED) PERIODOGRAM ESTIMATE
By dividing the available signal sequence into a set of shorter segments and
averaging the associated periodograms, a consistent estimate is obtained.
The sequence of operations is:
1) Divide the available length of the signal into a number of shorter
segments,
2) Multiply each segment by an antileakage window,
3) Compute the associated periodograms for each segment,
4) Average the periodograms of all segments.
The technique, known as the average periodogram estimate, or modified
periodogram, was suggested by P.D. Welch in 1967 and results in a
consistent estimate of the power spectrum, but the spectral resolution is
decreased, because it is now set by the length of the shorter segments
actually transformed rather than by the full record.
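The four steps above can be sketched directly with NumPy. This is an illustrative implementation under stated assumptions, not a reference one: the window is a Hann window, each periodogram is normalised by the window energy so that white noise of unit variance gives a level near 1, and the function name and `overlap` parameter are hypothetical.

```python
import numpy as np

def averaged_periodogram(x, segment_length, overlap=0):
    """Average the windowed periodograms of (optionally overlapping) segments."""
    x = np.asarray(x, dtype=float)
    step = segment_length - overlap                  # assumes overlap < segment_length
    window = np.hanning(segment_length)              # antileakage window (assumed Hann)
    scale = np.sum(window ** 2)                      # window energy normalisation
    starts = range(0, len(x) - segment_length + 1, step)
    spectra = [np.abs(np.fft.fft(window * x[s:s + segment_length])) ** 2 / scale
               for s in starts]
    return np.mean(spectra, axis=0)

rng = np.random.default_rng(1)
noise = rng.standard_normal(8192)                    # unit-variance white noise
estimate = averaged_periodogram(noise, segment_length=256)
print(estimate.mean(), estimate.std() / estimate.mean())
```

Averaging 32 segments should bring the relative spread across bins well below the value of about 1 seen for a single periodogram, at the cost of only 256 frequency bins.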
Some problems and how to avoid them:
aliasing - effect caused by sampling a signal at too low a sampling
frequency (fsam). Aliasing is avoided if the signal is band-limited (no energy
above a certain frequency fc) and we are careful enough to sample it at a
frequency above 2 fc.
Anti-aliasing filter – Analogue filter used to remove or largely attenuate
frequency components outside the band of interest. In general this is a
low-pass filter with cut-off at or just below the frequency of interest fc as defined
above.
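To make the sampling condition concrete, the apparent frequency of an undersampled sinusoid can be computed by folding it into the band from 0 to half the sampling frequency. The function name and the example frequencies are illustrative.

```python
def alias_frequency(f_signal, f_sampling):
    """Apparent frequency, between 0 and f_sampling / 2, of a sampled sinusoid."""
    return abs(f_signal - f_sampling * round(f_signal / f_sampling))

# A 7 Hz sinusoid sampled at 10 Hz (below 2 x 7 Hz) is indistinguishable from 3 Hz
print(alias_frequency(7.0, 10.0))
# A 3 Hz sinusoid sampled at 10 Hz is represented correctly
print(alias_frequency(3.0, 10.0))
```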
statistical instability - variance of the estimates of the PSDE when using
the standard (non-averaged) DFT for the estimation. Essentially the standard
deviation of the estimate is as large as its mean (the variance is approximately
the square of the mean), however long the data segment.
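The instability can be observed by simulation. The sketch below, with illustrative names and parameters, computes the non-averaged periodogram of many independent white Gaussian noise records and looks at the spread of a single frequency bin; the ratio of standard deviation to mean stays near 1 whether the record has 64 or 4096 samples.

```python
import numpy as np

def periodogram_bin_spread(n_samples, n_trials=400, seed=0):
    """Return std/mean of the periodogram at one bin over many white-noise records."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_trials, n_samples))
    spectra = np.abs(np.fft.fft(noise, axis=1)) ** 2 / n_samples
    values = spectra[:, n_samples // 4]   # a bin away from zero frequency and Nyquist
    return values.std() / values.mean()

ratio_short = periodogram_bin_spread(64)
ratio_long = periodogram_bin_spread(4096)
print(ratio_short, ratio_long)
```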
smoothing function - coefficients of the 'low-pass' filter applied to the
spectral data after the DFT. The results of the DFT are convolved with a
moving average window, with weights, say 1/4, 1/2, 1/4. This approach of
low-pass filtering the spectral coefficients (after the transform) is statistically
better than the approach of using an anti-leakage window on the time
domain signal (before the transform).
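A minimal sketch of the three-point spectral smoothing with weights 1/4, 1/2, 1/4. It assumes the input is a full-length DFT spectrum, which is periodic in the bin index, so the moving average wraps around at the ends; the function name is illustrative.

```python
import numpy as np

def smooth_spectrum(spectrum, weights=(0.25, 0.5, 0.25)):
    """Circular three-point moving average of spectral values."""
    spectrum = np.asarray(spectrum, dtype=float)
    left, centre, right = weights
    return (left * np.roll(spectrum, 1)      # contribution from bin k-1
            + centre * spectrum              # contribution from bin k
            + right * np.roll(spectrum, -1)) # contribution from bin k+1

print(smooth_spectrum([4.0, 0.0, 0.0, 0.0]))
```

Because the weights sum to 1, the total of the spectral values is preserved while neighbouring bins are averaged.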
Anti-leakage window – function w(n) used to multiply the (time-domain)
signal over the data frame to be transformed in an attempt to reduce leakage
in spectral estimation. Antileakage windows are always symmetrical and
tapered towards zero at the ends to reduce the discontinuity caused by the
implicit assumption of periodicity of the time-domain data imposed by the
Fourier approach. Instead of analysing the original signal x(n) the modified
signal x(n)w(n) is analysed.
pre-whitening or de-trending - procedure used to remove or largely
attenuate unwanted DC components and linear trends in the time
signal before spectral estimation. The subtraction of the best straight line
over each data frame is a popular approach to de-trending. Alternatively, the
DC and slowly varying components can be attenuated with a simple digital filter, say
y(nT) = x(nT) - 0.99 x(nT-T) or
y(n) = x(n) - 0.99 x(n-1) in simplified notation,
or with a more sophisticated digital high-pass filter.
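Both approaches mentioned here can be sketched in a few lines. The function names are illustrative, the straight line is fitted by least squares with `np.polyfit`, and the filter assumes the sample before the start of the frame is zero.

```python
import numpy as np

def detrend_linear(frame):
    """Subtract the least-squares straight line fitted to one data frame."""
    frame = np.asarray(frame, dtype=float)
    n = np.arange(len(frame))
    slope, intercept = np.polyfit(n, frame, 1)
    return frame - (slope * n + intercept)

def highpass_first_order(x, coeff=0.99):
    """y(n) = x(n) - coeff * x(n-1), taking x(-1) = 0."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    y[1:] -= coeff * x[:-1]
    return y

ramp = 3.0 + 2.0 * np.arange(10)           # pure DC plus linear trend
print(detrend_linear(ramp))                # approximately all zeros
print(highpass_first_order(np.ones(5)))    # DC level reduced to 0.01 after the first sample
```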
Spectral estimation techniques can be seen as attempts to fit the measured
data to an assumed model. Conventional FFT spectral analysis is based on a
Fourier series model of the data, i.e., the process is assumed to behave as a
sum of sinusoids (or exponential functions) and the coefficients of these
sinusoids describe the process according to the model.
Autoregressive (AR) and autoregressive moving-average (ARMA)
models
The autoregressive (AR) model assumes that the current value of the
process, $x_n$, can be described by a finite linear aggregate of the previous
values of the process and the current value of a white noise driving source
$n_n$. An autoregressive process of order p is defined as

\[ x_n = n_n - a_1 x_{n-1} - a_2 x_{n-2} - a_3 x_{n-3} - \dots - a_p x_{n-p} \]

Autoregressive model used for prediction
The AR model can be used for prediction (this is why it is sometimes
referred to as linear predictive coding, or LPC):

\[ x(n) = -\sum_{k=1}^{p} a_k\, x_{n-k} + \varepsilon \]

where $\varepsilon$, the error, is white noise with zero mean and variance $\sigma^2$.
The AR estimation of the next value is

\[ \hat{x}(n) = -\sum_{k=1}^{p} a_k\, x_{n-k} \qquad (3.1) \]

Autoregressive model used for power spectral density estimation
The autoregressive moving-average (ARMA) model assumes that the
current value of the process, $x_n$, can be described by a finite linear aggregate
of the previous values of the process and the current plus previous values of
a white noise driving source $n_n$. The ARMA model as a digital filter is

[Block diagram: n(t) → h(t) → x(t)]

or, in the z domain representation,

[Block diagram: N(z) → H(z) → X(z)]

The current value of the output at a given time is given by

\[ x_i = \sum_{r=0}^{q} b_r\, n_{i-r} - \sum_{k=1}^{p} a_k\, x_{i-k} \]

or

\[ x_i + \sum_{k=1}^{p} a_k\, x_{i-k} = \sum_{r=0}^{q} b_r\, n_{i-r} \]

in the z domain we can write this last equation as

\[ X(z)\left[ 1 + \sum_{k=1}^{p} a_k z^{-k} \right] = N(z)\left[ \sum_{r=0}^{q} b_r z^{-r} \right] \]

therefore the transfer function in z is

\[ H(z) = \frac{X(z)}{N(z)} = \frac{\sum_{r=0}^{q} b_r z^{-r}}{1 + \sum_{k=1}^{p} a_k z^{-k}} \]

Remembering the Power Gain Theorem, which states that the power at the
output of a digital filter is the power at the input times the squared magnitude
of the transfer function,

\[ P_x = P_n\, |H(z)|^2 \]

we can write

\[ P_x = P_n\, \frac{\left| \sum_{r=0}^{q} b_r z^{-r} \right|^2}{\left| 1 + \sum_{k=1}^{p} a_k z^{-k} \right|^2} \]

Replacing $z$ by $e^{sT}$ with $s = j\omega$ (around the unit circle on the z-plane, that is,
making the real part of $s$ equal to zero), that is, making $z = e^{j\omega T}$, and knowing
that the power spectral density of white noise is its variance $\sigma^2$, we can write
the above equation as

\[ P_{ARMA}(f) = \sigma^2 T\, \frac{\left| \sum_{r=0}^{q} b_r\, e^{-j2\pi frT} \right|^2}{\left| 1 + \sum_{k=1}^{p} a_k\, e^{-j2\pi fkT} \right|^2} \]

Autoregressive model
In the particular case where all $b_r = 0$ except $b_0 = 1$, i.e. the forward path to the
adder has a single feed line from the input, with gain 1 and no
contributions from delayed input samples, we have an autoregressive
model, and the autoregressive power spectral density function is given by

\[ P_{AR}(f) = \frac{\sigma^2 T}{\left| 1 + \sum_{k=1}^{p} a_k\, e^{-j2\pi fkT} \right|^2} \]
