Effect of Measurement Noise
ccrma.stanford.edu/~jos/filters/Effect_Measurement_Noise.html

In practice, measurements are never perfect. Let \hat{y} = y + e denote the measured output signal, where e is a vector of "measurement noise" samples. Then we have \hat{y} = A h + e. By the orthogonality principle [37], the least-squares estimate of h is obtained by orthogonally projecting \hat{y} onto the space spanned by the columns of A. Geometrically speaking, choosing \hat{h} to minimize the Euclidean distance between \hat{y} and A\hat{h} is the same thing as choosing it to minimize the sum of squared estimated measurement errors \|\hat{e}\|^2. The distance from \hat{y} to A\hat{h} is minimized when the projection error \hat{e} = \hat{y} - A\hat{h} is orthogonal to every column of A, which is true if and only if A^T \hat{e} = 0 [82]. Thus, applying the orthogonality principle, we have

A^T (\hat{y} - A\hat{h}) = 0.

Solving for \hat{h} yields Eq. (5.16) as before,

\hat{h} = (A^T A)^{-1} A^T \hat{y},

but this time we have derived it as the least-squares estimate of h in the presence of output measurement error.

It is also straightforward to introduce a weighting function in the least-squares estimate of h by replacing the squared-error term \hat{e}^T \hat{e} in the derivations above by \hat{e}^T W \hat{e}, where W is any positive definite matrix (often taken to be diagonal and positive). In the present context (time-domain formulations), it is difficult to choose a weighting function that corresponds well to audio perception. Therefore, in audio applications, frequency-domain formulations are generally more powerful for linear time-invariant system identification. A practical example is the frequency-domain equation-error method described in §G.4.3 [74].

Noise

1. Noise Power

The power produced by a resistor at a temperature T as a result of random thermal motions of electrons is given by

P = k T \Delta\nu,    (1)

where k is Boltzmann's constant and \Delta\nu is a given frequency interval. The equivalent noise power of a receiving system is then defined as

P_N = k T_{sys} \Delta\nu,    (2)

where T_{sys} is the system temperature,

T_{sys} = T_R + T_A,    (3)

T_R is the receiver temperature (which represents internal noise from the receiving amplifier), and T_A is the "antenna temperature", which represents the unwanted noise from the antenna produced by ground radiation, atmospheric attenuation, and other sources. (Note that there is actually a second, nonequivalent definition of antenna temperature.)

2. Power Spectral Density Estimation
www.le.ac.uk/eg/fss1/psde(1).doc by Fernando S. Schlindwein

S(f) = \int_{-\infty}^{\infty} R_{xx}(\tau) e^{-j 2\pi f \tau} d\tau    (Wiener-Khinchin theorem)

or, in the discrete form,

S(f) = \sum_{k} R_{xx}(k) e^{-j 2\pi f k},    (2.23)

where R_{xx}(\tau) is the autocorrelation of x(t) and is defined by R_{xx}(\tau) = E[x(t) x(t+\tau)].

Problems with that:
- The autocorrelation function is not known.
- Only a limited (finite) amount of data is available.

First the autocorrelation function should be estimated. This can be done either directly or with the DFT or FFT. If a window is then applied to the autocorrelation estimates, followed by the DFT, the result is a consistent spectral estimate. The biased estimate of the autocorrelation function,

\hat{R}_{xx}(k) = \frac{1}{N} \sum_{n=0}^{N-k-1} x(n+k) x(n),    (2.18)

is generally preferred to the unbiased estimator,

\hat{R}_{Uxx}(k) = \frac{1}{N-k} \sum_{n=0}^{N-k-1} x(n+k) x(n),    (2.20)

because it tends to have a smaller mean square error than the unbiased estimate. The Blackman-Tukey estimate is

\hat{S}_{BT}(f) = T \sum_{k=-M}^{M} \hat{R}_{xx}(k) e^{-j 2\pi f k T}

or, with an anti-leakage lag window W(k),

\hat{S}_{BT}(f) = T \sum_{k=-M}^{M} \hat{R}_{xx}(k) W(k) e^{-j 2\pi f k T},    (2.25)

where -\frac{1}{2T} \le f \le \frac{1}{2T} and the \hat{R}_{xx}(k) are the estimates of the autocorrelation function.
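To make the least-squares identification step concrete, here is a minimal NumPy sketch (not part of the original text) of the estimate \hat{h} = (A^T A)^{-1} A^T \hat{y} and its weighted variant; building the data matrix A from delayed copies of the input, and all function and variable names, are illustrative assumptions.

```python
import numpy as np

def ls_estimate(A, y_meas, W=None):
    # Ordinary least squares: solve A^T (y_meas - A h) = 0, i.e. h = (A^T A)^{-1} A^T y_meas.
    # With a positive-definite weighting W this becomes h = (A^T W A)^{-1} A^T W y_meas.
    if W is None:
        return np.linalg.lstsq(A, y_meas, rcond=None)[0]
    AtW = A.T @ W
    return np.linalg.solve(AtW @ A, AtW @ y_meas)

# Hypothetical usage: identify a 3-tap FIR filter from a noisy output measurement.
rng = np.random.default_rng(0)
h_true = np.array([1.0, -0.5, 0.25])
u = rng.standard_normal(500)                                  # known input signal
y_meas = np.convolve(u, h_true)[:500] + 0.05 * rng.standard_normal(500)
A = np.column_stack([np.concatenate([np.zeros(k), u[:500 - k]]) for k in range(3)])
h_hat = ls_estimate(A, y_meas)                                # close to h_true
```

And, for the spectral-estimation part, a sketch of the autocorrelation estimators (2.18)/(2.20) and the windowed lag-domain estimate (2.25); the Hann lag window and the default sampling interval T = 1 are assumptions made only for illustration.

```python
import numpy as np

def acorr_biased(x, M):
    # Biased estimate (2.18): divide by N at every lag; smaller mean square error in practice.
    N = len(x)
    return np.array([x[k:] @ x[:N - k] / N for k in range(M + 1)])

def acorr_unbiased(x, M):
    # Unbiased estimate (2.20): divide by N - k; noisier at large lags.
    N = len(x)
    return np.array([x[k:] @ x[:N - k] / (N - k) for k in range(M + 1)])

def psd_lag_window(x, M, freqs, T=1.0):
    # Eq. (2.25): S(f) = T * sum_{k=-M..M} R(k) W(k) exp(-j 2 pi f k T).
    # Using R(-k) = R(k) and a symmetric lag window, the sum collapses to a cosine series.
    r = acorr_biased(x, M)
    w = np.hanning(2 * M + 1)[M:]                 # right half of a Hann lag window (assumed choice)
    k = np.arange(1, M + 1)
    cosines = np.cos(2.0 * np.pi * np.outer(freqs, k) * T)
    return T * (r[0] * w[0] + 2.0 * cosines @ (r[1:] * w[1:]))
```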
The original periodogram

In 1898, in a paper investigating periodic variations hidden behind irregular fluctuations of observations, where both the period and the amplitude of the periodic components were unknown, Arthur Schuster coined the word periodogram. The original description of the periodogram is transcribed below:

"THE PERIODOGRAM. It is convenient to have a word for some representation of a variable quantity which shall correspond to the 'spectrum' of a luminous radiation. I propose the word periodogram, and define it more particularly in the following way. Let

a = \frac{2}{T} \int_{t_1}^{t_1+T} f(t) \cos(\kappa t) dt,
b = \frac{2}{T} \int_{t_1}^{t_1+T} f(t) \sin(\kappa t) dt,    (2.26)

where T may for convenience be chosen to be equal to some integer multiple of 2\pi/\kappa, and plot a curve with 2\pi/\kappa as abscissæ and r = \sqrt{a^2 + b^2} as ordinates; this curve, or, better, the space between this curve and the axis of abscissæ, represents the periodogram of f(t)."

The direct method of PSDE with the Fourier technique is the discrete version of Schuster's periodogram.

THE FFT-PERIODOGRAM APPROACH

A frequently used approach to the estimation of the power spectrum of random signals is to compute the squared magnitude of the DFT (or FFT) of a segment of the signal. The resulting estimate, called the periodogram, is asymptotically unbiased, but its variance does not decrease to zero as the length of the segment increases, and consequently the periodogram is not a consistent estimate.

What should be done:

S(f) = \lim_{T \to \infty} E\left\{ \frac{1}{T} \left| \int_{-T/2}^{T/2} x(t) e^{-j 2\pi f t} dt \right|^2 \right\}

What is actually done:

\hat{S}_{PER}(f_k) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x(n) e^{-j 2\pi n k / N} \right|^2    (2.27)

\hat{S}_{PER}(f) = \left| DFT\{x(n)\} \right|^2    (2.28)

AVERAGE (OR MODIFIED) PERIODOGRAM ESTIMATE

By dividing the available signal sequence into a set of shorter segments and averaging the associated periodograms, a consistent estimate is obtained. The sequence of operations is:
1) Divide the available length of the signal into a number of shorter segments,
2) Multiply each segment by an anti-leakage window,
3) Compute the periodogram of each segment,
4) Average the periodograms of all segments.
The technique, known as the average periodogram estimate, or modified periodogram, was suggested by P.D. Welch in 1967 and results in a consistent estimate of the power spectrum, but the spectral resolution is decreased (it is now set by the size of the segments actually used). A sketch of this procedure in code is given after the definitions below.

Some problems and how to avoid them:

aliasing - effect caused by sampling a signal with an inappropriate sampling frequency (f_sam). Aliasing is avoided if the signal is band-limited (no energy above a certain frequency f_c) and we are careful enough to sample it at a frequency above 2 f_c.

Anti-aliasing filter - analogue filter used to remove or largely attenuate frequency components outside the band of interest. In general this is a low-pass filter with cut-off at or just below the frequency of interest f_c as defined above.

statistical instability - variance of the estimates of the PSDE when using the standard (non-averaged) DFT for the estimation. Essentially the variance of the estimate is the same as the mean.

smoothing function - coefficients of the 'low-pass' filter applied to the spectral data after the DFT. The results of the DFT are convolved with a moving-average window with weights, say, 1/4, 1/2, 1/4. This approach of low-pass filtering the spectral coefficients (after the transform) is statistically better than the approach of using an anti-leakage window on the time-domain signal (before the transform).
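The sketch referred to above follows: a raw periodogram per (2.27) and a Welch-style averaged (modified) periodogram. The segment length, overlap, Hann window, and normalisation are illustrative choices of mine, not values fixed by the text.

```python
import numpy as np

def periodogram(x, fs=1.0):
    # Raw periodogram (2.27): (1/N) |DFT(x)|^2; asymptotically unbiased but not consistent.
    N = len(x)
    P = np.abs(np.fft.rfft(x)) ** 2 / (N * fs)
    return np.fft.rfftfreq(N, d=1.0 / fs), P

def averaged_periodogram(x, fs=1.0, seg_len=256, overlap=128):
    # Welch-style modified periodogram: window each segment, average the segment periodograms.
    w = np.hanning(seg_len)
    step = seg_len - overlap
    segs = [x[i:i + seg_len] for i in range(0, len(x) - seg_len + 1, step)]
    P = np.mean([np.abs(np.fft.rfft(s * w)) ** 2 for s in segs], axis=0)
    P /= (np.sum(w ** 2) * fs)                    # compensate for the window's power loss
    return np.fft.rfftfreq(seg_len, d=1.0 / fs), P
```

The trade-off named in the text is visible directly here: averaging K segments shrinks the variance roughly by a factor of K, while the frequency resolution coarsens from fs/len(x) to fs/seg_len.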
Anti-leakage window - function w(n) used to multiply the (time-domain) signal over the data frame to be transformed, in an attempt to reduce leakage in spectral estimation. Anti-leakage windows are always symmetrical and tapered towards zero at the ends to reduce the discontinuity caused by the implicit assumption of periodicity of the time-domain data imposed by the Fourier approach. Instead of analysing the original signal x(n), the modified signal x(n)w(n) is analysed.

pre-whitening or de-trending - procedure used to remove or largely attenuate unwanted DC components and linear components in the time signal before spectral estimation. The subtraction of the best straight line over each data frame is a popular approach to de-trending. It can also be performed with a simple digital filter, say y(nT) = x(nT) - 0.99 x(nT-T), or y(n) = x(n) - 0.99 x(n-1) in simplified notation, or with a more sophisticated digital high-pass filter.

Spectral estimation techniques can be seen as attempts to fit the measured data to an assumed model. Conventional FFT spectral analysis is based on a Fourier series model of the data, i.e., the process is assumed to behave as a sum of sinusoids (or exponential functions) and the coefficients of these sinusoids describe the process according to the model.

Autoregressive (AR) and autoregressive moving-average (ARMA) models

The autoregressive (AR) model assumes that the current value of the process, x_n, can be described by a finite linear aggregate of the previous values of the process and the current value of a white noise driving source n_n. An autoregressive process of order p is defined as

x_n = n_n - a_1 x_{n-1} - a_2 x_{n-2} - a_3 x_{n-3} - ... - a_p x_{n-p}

Autoregressive model used for prediction

The AR model can be used for prediction (this is why it is sometimes referred to as Linear Prediction Coding, or LPC):

x(n) = -\sum_{k=1}^{p} a_k x_{n-k} + e(n),

where e(n), the error, is white noise with zero mean and variance \sigma^2. The AR estimate of the next value is

\hat{x}(n) = -\sum_{k=1}^{p} a_k x_{n-k}.    (3.1)

Autoregressive model used for power spectral density estimation

The autoregressive moving-average (ARMA) model assumes that the current value of the process, x_n, can be described by a finite linear aggregate of the previous values of the process and the current plus previous values of a white noise driving source n_n. The ARMA model viewed as a digital filter is

n(t) -> h(t) -> x(t)

or, in the z-domain representation,

N(z) -> H(z) -> X(z).

The current value of the output at a given time is given by

x_i = \sum_{r=0}^{q} b_r n_{i-r} - \sum_{k=1}^{p} a_k x_{i-k}   or   x_i + \sum_{k=1}^{p} a_k x_{i-k} = \sum_{r=0}^{q} b_r n_{i-r}.

In the z domain we can write this last equation as

X(z) \left[ 1 + \sum_{k=1}^{p} a_k z^{-k} \right] = N(z) \sum_{r=0}^{q} b_r z^{-r},

therefore the transfer function in z is

H(z) = \frac{X(z)}{N(z)} = \frac{\sum_{r=0}^{q} b_r z^{-r}}{1 + \sum_{k=1}^{p} a_k z^{-k}}.

Remembering the Power Gain Theorem, which states that the power at the output of a digital filter is the power at the input times the squared magnitude of the transfer function,

P_x = P_n |H(z)|^2,

we can write

P_x = P_n \frac{\left| \sum_{r=0}^{q} b_r z^{-r} \right|^2}{\left| 1 + \sum_{k=1}^{p} a_k z^{-k} \right|^2}.

Replacing z = e^{sT} for s = j\omega (around the unit circle on the z-plane, that is, making the real part of s equal to zero), i.e. making z = e^{j\omega T}, and knowing that the power spectral density of white noise is its variance \sigma^2, we can write the above equation as

P_{ARMA}(f) = \sigma^2 T \frac{\left| \sum_{r=0}^{q} b_r e^{-j 2\pi f r T} \right|^2}{\left| 1 + \sum_{k=1}^{p} a_k e^{-j 2\pi f k T} \right|^2}.
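A small sketch of evaluating P_ARMA(f) from given coefficients, following the sign convention A(z) = 1 + sum_k a_k z^{-k} used above; the example coefficients, the frequency grid, and the function name are invented here purely for illustration.

```python
import numpy as np

def arma_psd(b, a, sigma2, freqs, T=1.0):
    # P_ARMA(f) = sigma^2 T |sum_r b_r e^{-j2pi f r T}|^2 / |1 + sum_k a_k e^{-j2pi f k T}|^2
    b, a = np.asarray(b, float), np.asarray(a, float)
    r = np.arange(len(b))                          # MA lags 0..q
    k = np.arange(1, len(a) + 1)                   # AR lags 1..p
    num = np.abs(np.exp(-2j * np.pi * np.outer(freqs, r) * T) @ b) ** 2
    den = np.abs(1.0 + np.exp(-2j * np.pi * np.outer(freqs, k) * T) @ a) ** 2
    return sigma2 * T * num / den

# Illustrative coefficients (assumed, not from the text): an ARMA(1,1) shaping filter.
T = 1.0
freqs = np.linspace(0.0, 0.5 / T, 512)
P = arma_psd(b=[1.0, 0.5], a=[-0.9], sigma2=1.0, freqs=freqs, T=T)
```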
Autoregressive model

In the particular case where all b_r = 0, except b_0 = 1, i.e. the forward path to the adder has a single feed line from the input, with gain 1 and no contributions from delayed input samples, we have an autoregressive model, and the autoregressive power spectral density function is given by

P_{AR}(f) = \frac{\sigma^2 T}{\left| 1 + \sum_{k=1}^{p} a_k e^{-j 2\pi f k T} \right|^2}.
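To close the loop, here is a sketch of AR spectral estimation from data: the a_k are estimated from the biased autocorrelation (2.18) via the Yule-Walker normal equations, consistent with the sign convention x_n = n_n - a_1 x_{n-1} - ... - a_p x_{n-p} above, and then plugged into P_AR(f). The plain linear solver and the order p = 4 in the usage comment are illustrative choices, not prescribed by the text.

```python
import numpy as np

def yule_walker(x, p):
    # Estimate a_1..a_p and sigma^2 from the biased autocorrelation estimate (2.18).
    # With x_n + a_1 x_{n-1} + ... + a_p x_{n-p} = n_n, the normal equations read R a = -r.
    N = len(x)
    r = np.array([x[k:] @ x[:N - k] / N for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])   # Toeplitz matrix
    a = np.linalg.solve(R, -r[1:])
    sigma2 = r[0] + a @ r[1:]                     # driving-noise variance
    return a, sigma2

def ar_psd(a, sigma2, freqs, T=1.0):
    # P_AR(f) = sigma^2 T / |1 + sum_k a_k e^{-j 2 pi f k T}|^2
    k = np.arange(1, len(a) + 1)
    den = 1.0 + np.exp(-2j * np.pi * np.outer(freqs, k) * T) @ a
    return sigma2 * T / np.abs(den) ** 2

# Hypothetical usage, assuming x is a recorded data frame:
# a, sigma2 = yule_walker(x, p=4)
# P = ar_psd(a, sigma2, freqs=np.linspace(0.0, 0.5, 256), T=1.0)
```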