Environmental Data Analysis with MatLab
Lecture 18: Cross-correlation

SYLLABUS
Lecture 01  Using MatLab
Lecture 02  Looking At Data
Lecture 03  Probability and Measurement Error
Lecture 04  Multivariate Distributions
Lecture 05  Linear Models
Lecture 06  The Principle of Least Squares
Lecture 07  Prior Information
Lecture 08  Solving Generalized Least Squares Problems
Lecture 09  Fourier Series
Lecture 10  Complex Fourier Series
Lecture 11  Lessons Learned from the Fourier Transform
Lecture 12  Power Spectral Density
Lecture 13  Filter Theory
Lecture 14  Applications of Filters
Lecture 15  Factor Analysis
Lecture 16  Orthogonal functions
Lecture 17  Covariance and Autocorrelation
Lecture 18  Cross-correlation
Lecture 19  Smoothing, Correlation and Spectra
Lecture 20  Coherence; Tapering and Spectral Analysis
Lecture 21  Interpolation
Lecture 22  Hypothesis testing
Lecture 23  Hypothesis Testing continued; F-Tests
Lecture 24  Confidence Limits of Spectra, Bootstraps

purpose of the lecture
generalize the idea of autocorrelation to multiple time series

Review of last lecture

autocorrelation
correlations between samples within a time series

high degree of short-term correlation
whatever the river was doing yesterday, it's probably doing today, too
because water takes time to drain away
[Figure: Neuse River Hydrograph. A) time series, discharge d(t) in cfs vs. time t in days; power spectral density in (cfs)2 per cycle/day vs. frequency in cycles per day]

low degree of intermediate-term correlation
whatever the river was doing last month, today it could be doing something completely different
because storms are so unpredictable
[Figure: Neuse River Hydrograph. A) time series, discharge d(t) in cfs vs. time t in days; power spectral density in (cfs)2 per cycle/day vs. frequency in cycles per day]
moderate degree of long-term correlation
whatever the river was doing this time last year, it's probably doing today, too
because seasons repeat
[Figure: Neuse River Hydrograph. A) time series, discharge d(t) in cfs vs. time t in days; power spectral density in (cfs)2 per cycle/day vs. frequency in cycles per day]

[Figure: scatter plots of discharge against discharge lagged by 1 day, 3 days, and 30 days]

Autocorrelation Function
[Figure: autocorrelation vs. lag in days, over lags of -30 to 30 days and of -3000 to 3000 days, with lags of 1, 3, and 30 days marked]

formula for covariance

formula for autocorrelation
autocorrelation at lag (k-1)Δt

autocorrelation similar to convolution
note difference in sign

autocorrelation in MatLab

Important Relationship #1
autocorrelation is the convolution of a time series with its time-reversed self

Important Relationship #2
the Fourier Transform of an autocorrelation is proportional to the Power Spectral Density of the time series

End of Review

Part 1
correlations between time-series

scenario
discharge correlated with rain
but discharge is delayed behind rain
because rain takes time to drain from the land
[Figure: rain, mm/day, and discharge, m3/s, vs. time in days]

rain ahead of discharge
[Figure: rain, mm/day, and discharge, m3/s, vs. time in days]

shape not exactly the same, either
[Figure: rain, mm/day, and discharge, m3/s, vs. time in days]

treat two time series u and v probabilistically
p.d.f.
p(u_i, v_{i+k-1}) with elements lagged by time (k-1)Δt
and compute its covariance

this defines the cross-correlation

just a generalization of the autocorrelation
cross-correlation: different times in different time series
autocorrelation: different times in the same time series

like autocorrelation, similar to convolution

As with autocorrelation, two important properties
#1: relationship to convolution
#2: relationship to Fourier Transform

cross-spectral density

cross-correlation in MatLab

Part 2
aligning time-series
a simple application of cross-correlation

central idea
two time series are best aligned at the lag at which they are most correlated, which is the lag at which their cross-correlation is maximum

two similar time-series, with a time shift
(this is a simple “test” or “synthetic” dataset)
[Figure: u(t) and v(t) vs. time]

cross-correlate
[Figure: cross-correlation vs. time lag]

find maximum
[Figure: cross-correlation vs. time lag, with the maximum marked]

In MatLab
compute cross-correlation
find maximum
compute time lag

align time series with measured lag
[Figure: u(t) and v(t+tlag) vs. time]

solar insolation and ground level ozone
(this is a real dataset from West Point NY)
[Figure: A) solar insolation, W/m2, and B) ozone, ppb, vs. time in days]

solar insolation and ground level ozone
note time lag
[Figure: A) solar insolation, W/m2, and B) ozone, ppb, vs. time in days]

[Figure: C) cross-correlation vs. time lag in hours; maximum at a time lag of 3 hours]

[Figure: A) solar radiation, W/m2, and B) ozone, ppb, original and delagged with a 3.00 hour lag, vs. time in days]
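
The alignment steps in Part 2 (compute the cross-correlation, find its maximum, compute the time lag, align the series) can be sketched as follows. The lecture's MatLab code is not reproduced in this text, so this is only a minimal sketch in plain Python under stated assumptions. The lag convention c(lag) = sum over i of u_i v_{i+lag}, the Gaussian test pulses, and the names cross_correlation, best_lag, dt and true_shift are illustrative choices, not taken from the lecture.

```python
import math

def cross_correlation(u, v, max_lag):
    """Return (lags, c) with c[j] = sum_i u[i] * v[i + lags[j]].

    Assumed convention: a positive lag means v is delayed behind u.
    Terms whose index falls outside v are skipped.
    """
    lags = list(range(-max_lag, max_lag + 1))
    c = []
    for lag in lags:
        total = 0.0
        for i in range(len(u)):
            j = i + lag
            if 0 <= j < len(v):
                total += u[i] * v[j]
        c.append(total)
    return lags, c

def best_lag(u, v, max_lag):
    """Return the lag (in samples) at which the cross-correlation is maximum."""
    lags, c = cross_correlation(u, v, max_lag)
    k = max(range(len(c)), key=lambda idx: c[idx])
    return lags[k]

# Synthetic test data (hypothetical): two identical pulses, v delayed behind u.
dt = 1.0          # sampling interval
n = 100           # number of samples
true_shift = 7    # delay of v behind u, in samples

def pulse(t, t0):
    """Gaussian pulse centered at time t0."""
    return math.exp(-((t - t0) / 5.0) ** 2)

u = [pulse(i * dt, 40.0) for i in range(n)]
v = [pulse(i * dt, 40.0 + true_shift * dt) for i in range(n)]

# Steps 1 and 2: cross-correlate and find the maximum.
lag = best_lag(u, v, max_lag=20)

# Step 3: convert the lag in samples to a time lag.
tlag = lag * dt

# Step 4: align by evaluating v at t + tlag (zero where no sample exists).
v_aligned = [v[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]

print("lag in samples:", lag, " time lag:", tlag)
```

With this convention the aligned series corresponds to v(t+tlag) in the slides. A different sign convention for the lag would flip the direction of the shift, so the convention is worth checking against whatever cross-correlation routine is actually used.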