Lecture 18: MatLab 2 Edition

Environmental Data Analysis with MatLab
2nd Edition
Lecture 18:
Cross-correlation
SYLLABUS
Lecture 01  Using MatLab
Lecture 02  Looking At Data
Lecture 03  Probability and Measurement Error
Lecture 04  Multivariate Distributions
Lecture 05  Linear Models
Lecture 06  The Principle of Least Squares
Lecture 07  Prior Information
Lecture 08  Solving Generalized Least Squares Problems
Lecture 09  Fourier Series
Lecture 10  Complex Fourier Series
Lecture 11  Lessons Learned from the Fourier Transform
Lecture 12  Power Spectra
Lecture 13  Filter Theory
Lecture 14  Applications of Filters
Lecture 15  Factor Analysis
Lecture 16  Orthogonal functions
Lecture 17  Covariance and Autocorrelation
Lecture 18  Cross-correlation
Lecture 19  Smoothing, Correlation and Spectra
Lecture 20  Coherence; Tapering and Spectral Analysis
Lecture 21  Interpolation
Lecture 22  Linear Approximations and Non Linear Least Squares
Lecture 23  Adaptable Approximations with Neural Networks
Lecture 24  Hypothesis testing
Lecture 25  Hypothesis Testing continued; F-Tests
Lecture 26  Confidence Limits of Spectra, Bootstraps
Goals of the lecture
generalize the idea of autocorrelation
to multiple time series
Review of last lecture
autocorrelation
correlations between samples within a
time series
high degree of short-term correlation
whatever the river was doing yesterday, it's probably
doing today, too
because water takes time to drain away
Neuse River Hydrograph
[Figure: A) time series d(t), discharge (cfs, x 10^4) vs. time t (days, 0 to 4000); power spectral density ((cfs)^2 per cycle/day, x 10^9) vs. frequency (cycles per day, 0 to 0.05)]
low degree of intermediate-term correlation
whatever the river was doing last month, today it could
be doing something completely different
because storms are so unpredictable
Neuse River Hydrograph
[Figure: time series d(t) of discharge (cfs) vs. time (days), and its power spectral density vs. frequency (cycles per day)]
moderate degree of long-term correlation
whatever the river was doing this time last year, it's
probably doing today, too
because seasons repeat
Neuse River Hydrograph
[Figure: time series d(t) of discharge (cfs) vs. time (days), and its power spectral density vs. frequency (cycles per day)]
[Figure: three scatter plots of discharge (x 10^4 cfs) against discharge lagged by 1 day, 3 days, and 30 days]
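The lagged scatter plots can be summarized by one number per lag, the correlation coefficient between the series and a lagged copy of itself. The sketch below is an illustration only: it uses synthetic data with short-term memory (not the Neuse River record), and the filter coefficient 0.8, the seed, and the variable names are all assumptions.

```matlab
% Synthetic series with short-term memory (NOT the Neuse River data):
% white noise passed through a recursive filter, d(i) = 0.8*d(i-1) + noise(i)
rng(0);                                   % fix the random seed for repeatability
N = 4000;
d = filter(1, [1, -0.8], randn(N,1));

lagsToTry = [1, 3, 30];                   % lags in samples (days, if Dt = 1 day)
r = zeros(size(lagsToTry));
for n = 1:length(lagsToTry)
    L = lagsToTry(n);
    R = corrcoef(d(1:N-L), d(1+L:N));     % 2x2 correlation matrix
    r(n) = R(1,2);                        % correlation of d(t) with d(t+L)
end
disp([lagsToTry; r]);                     % first row: lags, second row: correlations
```

For a series like this one, the correlation should fall off as the lag grows, which is the pattern a tight scatter plot at short lag and a diffuse one at longer lag would show.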
Autocorrelation Function
[Figure: autocorrelation (x 10^6) vs. lag (days), shown for lags of -30 to 30 days and for lags of -3000 to 3000 days, with lags of 1, 3, and 30 days marked]
formula for covariance
formula for autocorrelation
autocorrelation
at lag (k-1)Δt
autocorrelation similar to convolution
note difference in sign
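For reference, the formulas named above can be written out under one common convention. This is a sketch, not necessarily the exact form shown in the lecture: it assumes a time series d_i with N samples, sampling interval Δt, the mean already removed, and an unnormalized autocorrelation a_k; the symbols g and h for the two sequences being convolved are introduced here for illustration.

```latex
% autocorrelation at lag (k-1)\Delta t (unnormalized)
\[
  a_k = \sum_{i=1}^{N-k+1} d_i \, d_{i+k-1}
\]
% for a zero-mean, stationary series the covariance of samples separated by
% that lag is proportional to a_k
\[
  \operatorname{cov}\left(d_i, d_{i+k-1}\right) \propto a_k
\]
% convolution of two sequences g and h, for comparison
\[
  (g * h)_k = \sum_{i=1}^{k} g_i \, h_{k-i+1}
\]
```

The difference in sign is in the second index: i+k-1 in the autocorrelation, k-i+1 in the convolution.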
autocorrelation in MatLab
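A minimal sketch of the computation, written as an explicit sum so that the indexing is visible. The data vector is made up for illustration, and the unnormalized definition a_k = sum over i of d_i d_(i+k-1) is assumed.

```matlab
% Autocorrelation of a short, made-up time series d (column vector)
d = [1; 2; 0; -1; 3];      % hypothetical data, for illustration only
N = length(d);

% direct sum a_k = sum_i d_i * d_(i+k-1), for non-negative lags k-1 = 0..N-1
a_pos = zeros(N,1);
for k = 1:N
    a_pos(k) = sum( d(1:N-k+1) .* d(k:N) );
end

% two-sided autocorrelation for lags -(N-1)..(N-1); it is symmetric in lag
a    = [flipud(a_pos(2:end)); a_pos];
lags = (-(N-1):(N-1))';    % lag in samples; multiply by Dt for lag in time

disp([lags, a]);
```

If the Signal Processing Toolbox is available, its xcorr function called with the single argument d is expected to give the same two-sided result; the loop above avoids that dependency.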
Important Relationship #1
autocorrelation is the convolution of a
time series with its time-reversed self
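This relationship can be checked numerically with the built-in conv and flipud functions. The sketch below uses a made-up data vector; the variable names are assumptions.

```matlab
% Autocorrelation as the convolution of a time series with its time-reversed self
d = [1; 2; 0; -1; 3];          % hypothetical data, for illustration only
N = length(d);

a = conv(d, flipud(d));        % length 2N-1; element N is the zero-lag value

zeroLag   = a(N);              % equals the sum of squares of d
symmetric = isequal(a, flipud(a));   % autocorrelation is symmetric in lag

disp(a');
```

The zero-lag element is the sum of squares of the series, and the result is symmetric about it.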
Important Relationship #2
Fourier Transform of an autocorrelation
is proportional to the
Power Spectral Density of time series
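A numerical check of this relationship, as a sketch under stated assumptions: the data are made up, the series is zero-padded to length 2N-1 so that the discrete transform does not wrap around, and the power spectral density is left unnormalized (hence "proportional to").

```matlab
% Fourier transform of the autocorrelation vs. squared magnitude of the
% Fourier transform of the time series
d = [1; 2; 0; -1; 3];                 % hypothetical data, for illustration only
N = length(d);
M = 2*N - 1;                          % zero-padded length, avoids wrap-around

a  = conv(d, flipud(d));              % autocorrelation, zero lag at index N
aw = [a(N:end); a(1:N-1)];            % reorder so that zero lag comes first

A   = fft(aw);                        % Fourier transform of the autocorrelation
psd = abs(fft(d, M)).^2;              % unnormalized power spectral density

maxdiff = max(abs(A - psd));          % zero up to rounding error
disp(maxdiff);
```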
End of Review
Part 1
correlations between time-series
scenario
discharge correlated with rain
but discharge is delayed behind rain
because rain takes time to drain
from the land
[Figure: rain (mm/day) and discharge (m^3/s) vs. time (days). Annotations: rain is ahead of discharge; the shapes are not exactly the same, either]
treat two time series u and v probabilistically
p.d.f. p(u_i, v_{i+k-1})
with elements lagged by time (k-1)Δt
and compute its covariance
this defines the cross-correlation
just a generalization of the auto-correlation
different times in
different time series
different times in
the same time series
like autocorrelation, similar to convolution
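Written out, and assuming the unnormalized convention suggested by the indices of the p.d.f. above (the exact limits and normalization are assumptions), the cross-correlation at lag (k-1)Δt and the convolution differ only in the sign of the index on the second series:

```latex
% cross-correlation of u and v at lag (k-1)\Delta t
\[
  c_k = \sum_{i} u_i \, v_{i+k-1}
\]
% convolution of u and v, for comparison
\[
  (u * v)_k = \sum_{i} u_i \, v_{k-i+1}
\]
```

Setting u = v = d recovers the autocorrelation, which is the sense in which cross-correlation generalizes it.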
As with auto-correlation
two important properties
#1: relationship to convolution
#2: relationship to Fourier Transform
cross-spectral density
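Both properties can be checked numerically. The sketch below assumes the convention c_k = sum over i of u_i v_(i+k-1), so that the cross-correlation is the convolution of time-reversed u with v, and its Fourier transform is the conjugate of the transform of u times the transform of v. The test vectors and variable names are made up for illustration.

```matlab
% Cross-correlation: relationship to convolution and to the Fourier transform
u = [0; 1; 2; 1; 0; 0];               % hypothetical time series
v = [0; 0; 0; 1; 2; 1];               % hypothetical time series (u delayed by 2 samples)
N = length(u);
M = 2*N - 1;                          % zero-padded length, avoids wrap-around

% property #1: convolution of time-reversed u with v; zero lag at index N
c  = conv(flipud(u), v);
cw = [c(N:end); c(1:N-1)];            % reorder so that zero lag comes first

% property #2: the Fourier transform of c is the (unnormalized) cross-spectral density
csd = conj(fft(u, M)) .* fft(v, M);

maxdiff = max(abs(fft(cw) - csd));    % zero up to rounding error
disp(maxdiff);
```

Unlike the power spectral density, the cross-spectral density is complex in general; its phase carries the information about the lag between the two series.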
cross-correlation in MatLab
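A minimal sketch of the computation as an explicit loop over lags, assuming the convention c(lag m) = sum over i of u_i v_(i+m). The data are synthetic and the variable names are assumptions. The Signal Processing Toolbox function xcorr performs a similar calculation, but its argument order and lag sign convention should be checked against the definition used here before relying on the sign of a measured lag.

```matlab
% Cross-correlation of two short, made-up time series by direct summation
u = [0; 1; 2; 1; 0; 0];               % hypothetical time series
v = [0; 0; 0; 1; 2; 1];               % same pulse, delayed by 2 samples
N = length(u);

lags = (-(N-1):(N-1))';               % lag in samples; multiply by Dt for time
c = zeros(2*N-1, 1);
for n = 1:length(lags)
    m = lags(n);
    i = max(1, 1-m):min(N, N-m);      % indices where both u(i) and v(i+m) exist
    c(n) = sum( u(i) .* v(i+m) );
end

disp([lags, c]);                      % largest value occurs at lag +2
```

With this convention a positive lag at the peak means that v is delayed relative to u.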
Part 2
aligning time-series
a simple application of cross-correlation
central idea
two time series are best aligned
at the lag at which they are most correlated,
which is
the lag at which their cross-correlation is maximum
two similar time-series, with a time shift
(this is a simple “test” or “synthetic” dataset)
[Figure: u(t) and v(t) vs. time (10 to 100), amplitudes between -1 and 1]
cross-correlate
[Figure: cross-correlation vs. time lag (-20 to 20)]
find maximum
[Figure: cross-correlation vs. time lag (-20 to 20), with the maximum and its time lag marked]
In MatLab
compute cross-correlation
find maximum
compute time lag
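The three steps can be sketched as follows. This is an illustration under stated assumptions, not the lecture's own listing: the two Gaussian pulses are synthetic, the sampling interval Dt and the 5-sample delay are made up, and the cross-correlation is computed with conv using the convention in which a positive lag means v is delayed relative to u.

```matlab
% Estimate the time lag between two similar time series
Dt = 1.0;                              % sampling interval (assumed)
N  = 100;
t  = Dt*(0:N-1)';

% synthetic test data: v is u delayed by 5 samples
t0    = 40;                            % pulse centre in u (made up)
shift = 5;                             % true delay in samples (made up)
u = exp(-((t - t0)/6).^2);
v = exp(-((t - t0 - shift*Dt)/6).^2);

% step 1: compute cross-correlation; index N corresponds to zero lag
c    = conv(flipud(u), v);
lags = (-(N-1):(N-1))';

% step 2: find maximum
[cmax, icmax] = max(c);

% step 3: compute time lag (positive: v is delayed relative to u)
tlag = Dt*lags(icmax);
disp(tlag);
```

In this synthetic case the recovered lag matches the delay that was built in, which is the point of testing on data where the answer is known.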
align time series with measured lag
[Figure: u(t) and v(t+tlag) vs. time (10 to 100), after alignment]
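Once the lag is known, alignment is a shift by a whole number of samples. The sketch below assumes the lag has already been measured (here nlag = 2 samples, a made-up value), that positive nlag means v is delayed, and that samples shifted in from outside the record are set to zero. Other choices for the ends, such as truncating both series, are equally reasonable.

```matlab
% Align v with u by shifting it nlag samples: valigned(i) = v(i + nlag)
v    = [0; 0; 0; 1; 2; 1];             % hypothetical delayed time series
nlag = 2;                              % measured lag in samples (assumed)
N    = length(v);

valigned = zeros(N,1);                 % samples with no data stay zero
if nlag >= 0
    valigned(1:N-nlag) = v(1+nlag:N);  % shift v earlier in time
else
    valigned(1-nlag:N) = v(1:N+nlag);  % shift v later in time
end

disp(valigned');
```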
solar insolation and ground level ozone
(this is a real dataset from West Point NY)
[Figure: A) solar insolation (W/m^2) vs. time (days, 0 to 14); B) ozone (ppb) vs. time (days, 0 to 14)]
note time lag
[Figure: C) cross-correlation of solar insolation and ozone (x 10^6) vs. time lag (hours, -10 to 10); the maximum occurs at a time lag of 3 hours]
[Figure: A) solar radiation (W/m^2) vs. time (days, 0 to 5); B) ozone (ppb) vs. time (days, 0 to 5), original and delagged, with a 3.00 hour lag]