Applications of Filters Running Average of length M f= -1 M [1, 1, 1, … T 1] f M-1 0 t0 tM-1 time, t Note that average is “delayed” … … since no future values of x are available 1 x 0 t0 tM-1 Could be ‘recentered’ during plotting’ M-1 f*x 0 t0 tM-1 center time, t MatLab Example M = 30; % length of filter f = (1/M) * ones(M,1); y = conv(f,T); Note MatLab has an built-in convolution function If x has length N and y has length M then x*y has length N+M-1 Laguardia Airport Mean Temperature 2 years of data Blue: daily mean temperature data Red: 14 day running average Laguardia Airport Mean Temperature 2 years of data Blue: daily mean temperature data Red: 30 day running average Laguardia Airport Mean Temperature 20 years of data Blue: daily mean temperature data Red: 365 day running average Note “edge effect” Criticism of Running Average: it has sharp edges sharp edge f M-1 0 t0 tM-1 time, t The filter can produce rough results, because, as the filter advances in time, an outlier suddenly goes from being inside to outside Gaussian running average f M-1 0 t0 tM-1 time, t f 0 67% of area between t0 and tM-1 t0 tM-1 time, t Need to discard a bit of future here MatLab Example width = 14; % 67 percent width sigma = 14/2; % standard deviation M = 6*round(sigma/dt); % truncate filter at +/- 3 sigma delay = 3*sigma; % center of filter f = exp( -(t(1:M)-delay).^2 / (2*sigma*sigma) ); amp = sum(f); % normalize to unit amplitude f = f/amp; 14 day Gaussian filter 14 days Laguardia Airport Mean Temperature 2 years of data Blue: daily mean temperature data Red: 14 day Gaussian running average Laguardia Airport Mean Temperature “Box car” “Gaussian” Blue: daily mean temperature data Red: 14 day Gaussian running average Exponential was easy to implement f recursively M-1 0 t0 tM-1 f M-1 0 time, t 67% of area between t0 and tM-1 t0 tM-1 time, t Some calculations If f(t) = c-1 exp( -t/c ) What value of c puts A1 fraction of area between 0 and t1? t1 A = 0 c-1 exp( -t/c ) dt = exp(-t/c)| Note A(t=)=1 So A1 = 1- exp(-t1/c) (1-A1) = exp(-t1/c) ln(1-A1) = (-t1/c) c = -t1 / ln(1-A1) and t1 = -c / ln(1-A1) t1 0 = 1- exp(-t1/c) MatLab Code width = 14; % 67% width c = -width/log(1.0-0.67); % constant in exponential M = 6*round(width/dt); % truncate filter at 3 widths delay = -width/log(1.0-0.50); % delay at 50% of area f = exp( -t(1:M)/c ); amp = sum(f); % normalize to sum to unity f = f/amp; Laguardia Airport Mean Temperature “Gaussian” “exponential” Blue: daily mean temperature data Red: 14 day Gaussian running average first derivative filter f = Dt [1, -1]T But note delayed by ½Dt M=2; f = dt*[1,-1]'; delay = 0.5*dt; second derivative filter f= 2 (Dt) [1, -2, T 1] But note delayed by Dt First-derivative note more-variable in winter second-derivative note more-variable in winter prediction error filter x = [x1, x2, x3, x4, … xN-1]T f = [-1, f1, f2, f3, f4, … fN-1]T Choose f such that f*x 0 f5, f4, f3, f2, f1, -1 … xM-2, xM-2, xM-1, xM, xM+1, xM+2, xM+3, … xM = f1xM-1 + f2xM-2 + f3xM-3 … xM predicted from past values MatLab Code M=10; G = zeros(N+1,M); d = zeros(N+1,1); % filter of length M, data of length N % solve by least-squares % implement condition f0=1 % as if its prior information for p = [1:M] G(p:N,p) = T(1:N-p+1); d(p)=0; end % usual G matrix for filter G(N+1,1)=1e6; d(N+1)=1e6; % prior info, with epsilon=1e6 f = inv(G'*G)*G'*d; % least-squares solution y = conv(f,T); % try out fileter, y is prediction error % d vector is all zero filter length M = 10 days filter, f filter length M = 10 days x f*x error = 6.4105 filter length M = 100 days filter, f filter length M = 10 days x f*x error = 6.2105 f*x is the unpredictable part of x Let’s try it with the Neuse River Hydrograph Dataset What’s that? Filter length M=100 x f*x Close up of first year of data Note that the prediction error, f*x, is spikier than the hydrograph data, x. I think that this means that some of the dynamics of the river flow is being captured by the filter, f, and that the unpredictable part is mostly the forcing, that is, precipitation x f*x