Environmental Data Analysis with MatLab

Lecture 14: Applications of Filters

SYLLABUS
Lecture 01  Using MatLab
Lecture 02  Looking At Data
Lecture 03  Probability and Measurement Error
Lecture 04  Multivariate Distributions
Lecture 05  Linear Models
Lecture 06  The Principle of Least Squares
Lecture 07  Prior Information
Lecture 08  Solving Generalized Least Squares Problems
Lecture 09  Fourier Series
Lecture 10  Complex Fourier Series
Lecture 11  Lessons Learned from the Fourier Transform
Lecture 12  Power Spectral Density
Lecture 13  Filter Theory
Lecture 14  Applications of Filters
Lecture 15  Factor Analysis
Lecture 16  Orthogonal functions
Lecture 17  Covariance and Autocorrelation
Lecture 18  Cross-correlation
Lecture 19  Smoothing, Correlation and Spectra
Lecture 20  Coherence; Tapering and Spectral Analysis
Lecture 21  Interpolation
Lecture 22  Hypothesis testing
Lecture 23  Hypothesis Testing continued; F-Tests
Lecture 24  Confidence Limits of Spectra, Bootstraps

purpose of the lecture
further develop the idea of the Linear Filter and its applications

from last lecture
present output ∝ past and present values of input
$\theta = g * h$
where $\theta$ is the output, $g$ is the filter and $h$ is the input
the $*$ denotes “convolution”, not multiplication

Part 1: Predicting the Present
[equations not preserved]
the prediction equation is very close to a convolution: $p * d = 0$
where $d$ is the input, $0$ is the output and $p$ is the “prediction error” filter

strategy for predicting the future
1. take all the data, $d$, that you have up to today
2. use it to estimate the prediction error filter, $p$ (use generalized least-squares to solve $p * d = 0$)
3. use the filter, $p$, and all the data, $d$, to predict $d_{\text{tomorrow}}$

application to the Neuse River hydrograph
[Figure: Neuse River hydrograph, discharge (cfs) versus time (days); a second panel has units of (cfs)^2 per cycle/day]

here’s the best fit filter, $p$
[Figure: prediction error filter $p(t)$ versus time $t$ (days)]
in this case, only the first few coefficients are large
(annotation on the figure: “what’s that?”)

importance of the prediction error
since one is using least squares, the equation $0 = p * d$ is not solved exactly
the prediction error, $e = p * d$, tells you what aspects of the data cannot be predicted on the basis of past behavior

[Figure: A) discharge $d(t)$ versus time $t$ (days); B) prediction error $e(t)$ versus time $t$ (days)]
the error is small
the error is spiky
many spikes are at times when discharge increases

Part 2: Inverse Filters

Can a convolution be undone?
if $\theta = g * h$, is there another filter $g^{inv}$ for which $h = g^{inv} * \theta$ ?

convolution $c = a * b$ by hand: step 1
for simplicity, suppose $a$ and $b$ are of length 3
write $a$ backward in time
write $b$ forward in time
overlap the ends by one, and multiply. That gives $c_1$

convolution $c = a * b$ by hand: step 2
slide $a$ right one place
multiply and add. That gives $c_2$

convolution $c = a * b$ by hand: step 3
slide $a$ right another place
multiply and add. That gives $c_3$

convolution $c = a * b$ by hand: keep going
Multiply. That gives $c_5$

an important observation
this is the same pattern that we obtain when we multiply polynomials

z-transform
turn a filter into a polynomial
$g = [g_1, g_2, g_3, \dots, g_N]^T \;\rightarrow\; g(z) = g_1 + g_2 z + g_3 z^2 + \dots + g_N z^{N-1}$

inverse z-transform
turn a polynomial into a filter
$g(z) = g_1 + g_2 z + g_3 z^2 + \dots + g_N z^{N-1} \;\rightarrow\; g = [g_1, g_2, g_3, \dots, g_N]^T$

why would we want to do this?
because we know a lot about polynomials

the fundamental theorem of algebra
a polynomial of n-th order (largest power, $z^n$) has exactly n roots (solutions to $g(z) = 0$)
and thus can be factored into the product of n factors
$g(z) \propto (z - r_1)(z - r_2) \cdots (z - r_n)$

in the case of a polynomial constructed from a length-N filter, $g$
$g(z) = g_N (z - r_1)(z - r_2) \cdots (z - r_{N-1})$
where $r_1, r_2, \dots, r_{N-1}$ are the roots

so, the filter $g$ is equivalent to a “cascade” of N-1 length-2 filters
$g = g_N \, [-r_1, 1]^T * [-r_2, 1]^T * \cdots * [-r_{N-1}, 1]^T$

now let’s try to find the inverse of a length-2 filter
the filter that undoes convolution by $[-r_i, 1]^T$ is … ?
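As a minimal sketch of the two ideas above, the MatLab lines below first carry out a convolution by hand and compare it with conv(), then factor a filter into a cascade of length-2 filters. The filters a, b and g are hypothetical values chosen for illustration; they do not come from the lecture.

```matlab
% Hypothetical length-3 filters (illustration only)
a = [1; 2; 3];
b = [4; 5; 6];

% Convolution "by hand": c(k) collects every product a(i)*b(j) with i+j-1 = k
c = zeros(length(a) + length(b) - 1, 1);
for i = 1:length(a)
    for j = 1:length(b)
        c(i+j-1) = c(i+j-1) + a(i)*b(j);
    end
end

% conv() performs the same operation (it also multiplies polynomials)
errc = max(abs(c - conv(a, b)));

% Hypothetical filter whose z-transform is g(z) = 6 - 5z + z^2
g = [6; -5; 1];

% roots() expects the coefficient of the highest power first, hence flipud
r = roots(flipud(g));

% Rebuild g as g_N times a cascade of length-2 filters [-r_i, 1]'
gc = g(end);
for i = 1:length(r)
    gc = conv(gc, [-r(i); 1]);
end

% Largest difference between the original filter and the cascade
err = max(abs(g(:) - gc(:)));
```

Both errc and err should be zero to within rounding error, which is the numerical counterpart of saying that convolving filters and multiplying their z-transforms are the same operation.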
z-transform
the function that undoes multiplication by $(z - r_i)$ is $1/(z - r_i)$

problem: $1/(z - r_i)$ is not a polynomial
solution: compute its Taylor series

Taylor series
$\frac{1}{z - r_i} = -r_i^{-1} \left( 1 + r_i^{-1} z + r_i^{-2} z^2 + \dots \right)$
the Taylor series contains all powers of $z$

so the filter that undoes convolution by $[-r_i, 1]^T$ is
$-r_i^{-1} \, [1, r_i^{-1}, r_i^{-2}, \dots]^T$
an indefinitely long filter

this filter will only be useful if its coefficients fall off
the powers $r_i^{-k}$ must decrease with $k$
this happens when $|r_i|^{-1} < 1$, or $|r_i| > 1$
the root, $r_i$, must lie outside the “unit circle”

the inverse filter for a length-N filter $g$
step 1: find roots of $g(z)$
step 2: check that roots are outside the unit circle
step 3: construct inverse filter associated with each root
step 4: convolve them all together

example
construct inverse filter of:
[Figure: filter $g_j$ versus element $j$]

only hard part of the process is finding the roots of a polynomial

    % find roots of g
    r = roots(flipud(g));

fortunately, MatLab does this easily

[Figure: three panels versus element $j$: the filter $g_j$ (a short time series), the inverse filter $g^{inv}_j$ (a long time series), and $[g^{inv} * g]_j$ (a spike)]

Part 3: Recursive Filters
a way to approximate a long filter with two short ones

in the standard filtering formula
$\theta_k = \sum_{j=1}^{k} g_j h_{k-j+1}$
we compute the output $\theta_1, \theta_2, \theta_3, \dots$ in sequence
but without using our knowledge of $\theta_1$ when we compute $\theta_2$, or $\theta_2$ when we compute $\theta_3$, etc.
that’s wasted information

suppose we tried to put the information to work, as follows
$\theta_k = \sum_{j=1}^{k} u_j h_{k-j+1} - \sum_{j=2}^{k} v_j \theta_{k-j+1}$
here we’ve introduced two new filters, $u$ and $v$
$g$: the conventional filter
$u$: a new but conventional filter
$v$: a filter that acts on already computed values of $\theta$

now define $v_1 = 1$, so
$v * \theta = u * h$, or $\theta = v^{inv} * u * h$
if we can find short filters $u$ and $v$ such that $v^{inv} * u \approx g$
then we can speed up the convolution process

an example
[equation not preserved]
$g * h$ is the weighted average of recent values of $h$
if $g$ is truncated to N≈10 elements, then each time step takes about 10 multiplications and 10 additions

try
[equation not preserved]
this works, since the inverse of a length-2 filter is
[equation not preserved]
the convolution then becomes
[equation not preserved]
which requires only one addition and one multiplication per time step
a savings of a factor of about ten

[Figure: A) $h_1(t)$ and $q_1(t)$ versus time $t$; B) $h_2(t)$ and $q_2(t)$ versus time $t$]
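The recursive scheme can be sketched as follows. The specific numbers are assumptions made for illustration and are not taken from the slides: a long filter with elements g(k) = (1/2)^k truncated to N = 10 terms, the short filters u = 1/2 and v = [1, -1/2]', and a made-up sinusoidal input h. With these choices the inverse of v is [1, 1/2, 1/4, ...]', so that convolving it with u reproduces g.

```matlab
% Hypothetical input time series (not from the lecture)
M = 100;
h = sin(2*pi*(1:M)'/25);

% Assumed long filter g(k) = (1/2)^k, truncated to N elements
N = 10;
g = (0.5).^(1:N)';

% Conventional filtering: theta = g * h, keeping the first M values
theta1 = conv(g, h);
theta1 = theta1(1:M);

% Recursive filtering with short filters u and v, where v(1) = 1:
%   theta(k) = u(1)*h(k) - v(2)*theta(k-1)
u = 0.5;
v = [1; -0.5];
theta2 = zeros(M, 1);
theta2(1) = u(1)*h(1);
for k = 2:M
    theta2(k) = u(1)*h(k) - v(2)*theta2(k-1);
end

% The built-in filter() evaluates the same recursion
theta3 = filter(u, v, h);

% Conventional versus recursive: they differ only by the truncation of g
err12 = max(abs(theta1 - theta2));
% Hand-written recursion versus filter(): agreement to rounding error
err23 = max(abs(theta2 - theta3));
```

The loop costs one multiplication and one addition per time step (two multiplications as literally written, since u(1) and v(2) are applied separately), against N of each for the truncated convolution. The small residual err12 reflects the terms of g beyond the N-th that the truncation discards.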