Environmental Data Analysis with MatLab
2nd Edition

Lecture 21: Interpolation

SYLLABUS
Lecture 01  Using MatLab
Lecture 02  Looking At Data
Lecture 03  Probability and Measurement Error
Lecture 04  Multivariate Distributions
Lecture 05  Linear Models
Lecture 06  The Principle of Least Squares
Lecture 07  Prior Information
Lecture 08  Solving Generalized Least Squares Problems
Lecture 09  Fourier Series
Lecture 10  Complex Fourier Series
Lecture 11  Lessons Learned from the Fourier Transform
Lecture 12  Power Spectra
Lecture 13  Filter Theory
Lecture 14  Applications of Filters
Lecture 15  Factor Analysis
Lecture 16  Orthogonal functions
Lecture 17  Covariance and Autocorrelation
Lecture 18  Cross-correlation
Lecture 19  Smoothing, Correlation and Spectra
Lecture 20  Coherence; Tapering and Spectral Analysis
Lecture 21  Interpolation
Lecture 22  Linear Approximations and Non Linear Least Squares
Lecture 23  Adaptable Approximations with Neural Networks
Lecture 24  Hypothesis testing
Lecture 25  Hypothesis Testing continued; F-Tests
Lecture 26  Confidence Limits of Spectra, Bootstraps

Goals of the lecture
to introduce Interpolation, the process of filling in missing data points

Scenario 1: data are collected at irregular time intervals, but you want to compute power spectral density, which requires evenly sampled data.
[Figure: A(t) versus time, irregularly sampled; power spectral density (psd) versus frequency, marked "?"]

Scenario 2: two datasets are collected with different sampling intervals, but you want to combine them into a scatter plot.
[Figure: A(t) and B(t) versus time, sampled at different intervals; scatter plot of B versus A, marked "?"]
In both scenarios the times at which the data are collected are inconvenient.

We encountered a problem similar to this one back in Lecture 8, where we used prior information to fill in data gaps.
[Figure: $d^{obs}(t)$, observed data with missing points, versus time; $d^{est}(t)$, estimated data with the missing points filled in, versus time]

find $d_i^{est}$ so that
$d_i^{est} \approx d_i^{obs}$ at the observation points
and
roughness of $d_i^{est} \approx 0$ everywhere

the solution is inexact
$d_i^{est} \ne d_i^{obs}$ everywhere
and
roughness of $d_i^{est} \ne 0$ everywhere
but the inexactness isn't a problem, because both observations and prior information have error

Now we examine an alternative approach: traditional interpolation
similar, but subtly different

find $d(t)$, the "interpolant", so that
$d(t_i) = d_i^{obs}$ at the observation points (exact)
and
roughness of $d(t) = 0$ in between the observation points (exact)

advantage
the interpolant $d(t)$ is an analytic function that is known everywhere
can evaluate $d(t)$ at any time, $t$
can differentiate $d(t)$, integrate it, etc.

disadvantage
the observation points are singled out as special
$d(t)$ behaves differently at the observation points than between them

The interpolation problem
find an interpolant $d(t)$ that goes through all the data points and "does something sensible" or "satisfies some prior information" between them

Some obvious ideas don't work at all
an (N-1) order polynomial can easily be constructed so that it passes through N points
so use a polynomial for $d(t)$

example
[Figure: polynomial interpolant $d(t)$ versus time, $t$, with large swings between the data points, annotated "what happened here?" and "and here?"]
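The wild swings of a single high-order polynomial can be reproduced with a short sketch. It is an illustration under stated assumptions, not the data behind the lecture's figures: the observations are synthetic samples of the smooth function 1/(1+25t^2) at N = 11 evenly spaced times on the interval -1 to 1 (a classic test case), and the variable names are illustrative. The built-in functions polyfit and polyval construct and evaluate the (N-1) order polynomial.

```matlab
% Sketch: one (N-1) order polynomial through N points (synthetic data, assumed)
N = 11;
t = linspace(-1, 1, N)';            % observation times (evenly spaced)
d = 1 ./ (1 + 25*t.^2);             % smooth synthetic observations
c = polyfit(t, d, N-1);             % coefficients of the order N-1 polynomial
tp = linspace(-1, 1, 1001)';        % times of interpolation
dp = polyval(c, tp);                % polynomial evaluated between the data
dtrue = 1 ./ (1 + 25*tp.^2);        % the smooth function the data came from

misfit = max(abs(polyval(c, t) - d));   % near zero: passes through the data
swing = max(abs(dp - dtrue));           % large: oscillates between the data
fprintf('misfit at data: %.2e, largest error between data: %.2f\n', misfit, swing);
```

The polynomial honors every observation, yet between the observations near the ends of the interval it departs from the smooth underlying function by an amount comparable to the size of the data themselves.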
Solution
a low-order polynomial has less potential for wild swings
so use many low-order polynomials, each valid in a small time interval
such a function is called a "spline"

Simplest case
set of linear polynomials, each valid between two data points
"connect the data points with straight lines"
[Figure: $d(t)$ as a straight line segment between $t_i$ and $t_{i+1}$]

advantages
conceptually very simple
always get what you expect
zero roughness between observations

disadvantage
$d(t)$ has kinks at observation points

example
[Figure: linear interpolant $d(t)$, ranging from -5 to 5, versus time, $t$, from 0 to 1, with a kink marked at an observation point]

In MatLab
[code slide; labels: observations, times of interpolation, interpolated observations]

Getting rid of the kinks
use cubic polynomials
$$S_i(t) = c_0 + c_1 t + c_2 t^2 + c_3 t^3$$
each valid between two data points

a cubic polynomial has 4 coefficients
two constrained by the need to pass through two data
two to implement prior information: no kinks in $d(t)$ or its first derivative

The trick
the second derivative of a cubic is linear
so use the linear interpolation formula for the second derivative
[Figure: second derivative versus $t$, with values $y_{i-1}$, $y_i$, $y_{i+1}$ at $t_{i-1}$, $t_i$, $t_{i+1}$ connected by straight lines]

The second derivatives at the observation points, denoted $y_i$, become unknowns in the problem.

The second derivative is now integrated twice to give the spline function. In one standard form, with $h_i = t_{i+1} - t_i$,
$$S_i(t) = \frac{y_i (t_{i+1} - t)^3}{6 h_i} + \frac{y_{i+1} (t - t_i)^3}{6 h_i} + a_i (t_{i+1} - t) + b_i (t - t_i)$$
here $a_i$ and $b_i$ are two more unknowns that arise from the integration constants

Finally, one finds the y's, a's and b's so that the spline
1. goes through the observations, and
2. has a first derivative that is continuous across the observation points
the solution involves solving a matrix equation for the unknowns (see text for details)

In MatLab
[code slide; labels: observations, times of interpolation, interpolated observations]

example
[Figure: cubic spline interpolant $d(t)$ versus time, $t$, annotated "no kinks"]

Interpolation involves prior information of smoothness

In generalized least-squares, the prior information of smoothness is quantified by a roughness matrix, $H$, with the roughness given by $Hm$.
Then we minimize the overall roughness, which is to say the overall error in the prior information, $(Hm)^T (Hm)$.

note that
$$(Hm)^T (Hm) = m^T (H^T H) m$$
but in generalized least squares the prior error also has the form $m^T C_m^{-1} m$, where $C_m$ is the prior covariance matrix
so in this case
$$C_m = (H^T H)^{-1}$$

So the prior information that the data are smooth is equivalent to the requirement that they have a specific covariance matrix, which for stationary time series is equivalent to saying that they have a specific autocorrelation function.

So an alternative, more flexible way of interpolating data is by specifying the autocorrelation function that we want the results to have.
This is called Kriging (after Danie G Krige, its inventor).

Kriging
estimate data at an arbitrary time $t_0$ as a weighted average of the observations
$$d_0^{est} = \sum_i w_i d_i^{obs}$$
determine the weights $w_i$ by minimizing the variance of $(d_0^{est} - d_0^{true})$ with respect to $w_i$
we'll find that we don't need to know $d_0^{true}$, only its autocorrelation

assuming $d_0^{est}$ and $d_0^{true}$ have the same mean, so that the means approximately cancel, and writing $\langle \cdot \rangle$ for an average over realizations
$$\sigma^2 = \langle (d_0^{est} - d_0^{true})^2 \rangle$$
expand square
$$\sigma^2 = \langle (d_0^{est})^2 \rangle - 2 \langle d_0^{est} d_0^{true} \rangle + \langle (d_0^{true})^2 \rangle$$
insert weighted average formula
$$\sigma^2 = \sum_i \sum_j w_i w_j \langle d_i^{obs} d_j^{obs} \rangle - 2 \sum_i w_i \langle d_i^{obs} d_0^{true} \rangle + \langle (d_0^{true})^2 \rangle$$
identify terms proportional to the autocorrelation, $a$
$$\sigma^2 \propto \sum_i \sum_j w_i w_j \, a(t_i - t_j) - 2 \sum_i w_i \, a(t_i - t_0) + a(0)$$

now differentiate with respect to the weight, $w_k$, and set the result to zero
$$\sum_j a(t_k - t_j) \, w_j = a(t_k - t_0)$$
which yields the matrix equation
$$Mw = v \quad \text{with} \quad M_{kj} = a(t_k - t_j), \quad v_k = a(t_k - t_0)$$
note that the autocorrelation appears on both sides of the equation, so that its overall normalization cancels out

All we need now do is specify an autocorrelation function.
for example, we could use the Normal function
$$a(t_i - t_j) = \exp\left\{ -\frac{(t_i - t_j)^2}{2 L^2} \right\}$$
the variance, $L^2$, controls the width of the autocorrelation and hence the smoothness of the interpolation

In MatLab
[code slide; labels: observations tobs, dobs; interpolated values test, dest; Normal autocorrelation function with variance L2]

Example
[Figure: A) Kriging and B) Generalized Least Squares; each panel shows $d(t)$, ranging from -2 to 2, versus time, $t$, from 0 to 100]

Interpolation in two-dimensions
construct an interpolant $d(x,y)$ that goes through the observations and does something sensible in between

the notion of bracketing observations is more complicated
[Figure: in 1 dimension, the time $t_0$ is bracketed by $t_i$ and $t_{i+1}$, a segment of the t-axis; in 2 dimensions, the point $(x_0, y_0)$ is enclosed by a triangular tile]

Delaunay triangles
set of most equilateral triangles connecting data points
[Figure: A) Observations; B) Delaunay triangles, with the triangle enclosing a point of interest marked; C) Linear Splines; D) Cubic Splines; all panels have x and y axes from 0 to 40]

In MatLab
[code slides: linear splines, cubic splines]
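The "In MatLab" steps of this lecture can be sketched together in one script. This is a minimal sketch under stated assumptions, not the lecture's own code: all data are synthetic, the names dlin, dcub, xobs, yobs, dobs2, dlin2 and dcub2 are illustrative, and the built-in functions interp1, spline, delaunay and griddata are assumed to stand in for the linear, cubic and two-dimensional interpolation steps. In the Kriging part, removing the sample mean and adding a small damping of 1e-6 to the matrix are added assumptions, made so that the means approximately cancel and the solve is numerically stable.

```matlab
% --- 1-D: synthetic, irregularly sampled observations (hypothetical data) ---
tobs = [0.05 0.12 0.30 0.41 0.55 0.62 0.80 0.93]';   % observation times
dobs = sin(2*pi*tobs);                               % observations
test = linspace(min(tobs), max(tobs), 200)';         % times of interpolation
N = length(tobs);
K = length(test);

% linear splines: straight lines between data, with kinks at the data
dlin = interp1(tobs, dobs, test, 'linear');

% cubic splines: no kinks in d(t) or its first derivative
dcub = spline(tobs, dobs, test);

% Kriging with a Normal autocorrelation function of variance L2
L2 = 0.1^2;
dbar = mean(dobs);                                   % assumption: remove sample mean
A = exp(-(tobs*ones(1,N) - ones(N,1)*tobs').^2 / (2*L2));  % M(i,j) = a(ti - tj)
V = exp(-(tobs*ones(1,K) - ones(N,1)*test').^2 / (2*L2));  % column k holds v for t0 = test(k)
W = (A + 1e-6*eye(N)) \ V;                           % solve M w = v (damping is an assumption)
dest = ((dobs - dbar)' * W)' + dbar;                 % weighted averages of the observations

% --- 2-D: scattered synthetic observations on a 40 by 40 region ---
k = (1:30)';
xobs = 40*mod(0.618034*k, 1);                        % quasi-random x positions (illustrative)
yobs = 40*mod(0.414214*k, 1);                        % quasi-random y positions (illustrative)
dobs2 = sin(xobs/8) .* cos(yobs/8);                  % observations
tri = delaunay(xobs, yobs);                          % Delaunay triangles (rows index the data)
[X, Y] = meshgrid(0:40, 0:40);                       % grid of interpolation points
dlin2 = griddata(xobs, yobs, dobs2, X, Y, 'linear'); % linear within each triangle
dcub2 = griddata(xobs, yobs, dobs2, X, Y, 'cubic');  % smooth cubic version
% grid points outside the triangulated region are returned as NaN
```

Plotting dlin, dcub and dest against test, with the observations overlaid, shows the kinks of the linear interpolant and the smoothness of the other two. Increasing L2 widens the autocorrelation and should give a smoother Kriging result.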