Environmental Data Analysis with MatLab Lecture 18: Cross-correlation
Oct 21, 2015
SYLLABUS
Lecture 01 Using MatLab
Lecture 02 Looking At Data
Lecture 03 Probability and Measurement Error
Lecture 04 Multivariate Distributions
Lecture 05 Linear Models
Lecture 06 The Principle of Least Squares
Lecture 07 Prior Information
Lecture 08 Solving Generalized Least Squares Problems
Lecture 09 Fourier Series
Lecture 10 Complex Fourier Series
Lecture 11 Lessons Learned from the Fourier Transform
Lecture 12 Power Spectral Density
Lecture 13 Filter Theory
Lecture 14 Applications of Filters
Lecture 15 Factor Analysis
Lecture 16 Orthogonal functions
Lecture 17 Covariance and Autocorrelation
Lecture 18 Cross-correlation
Lecture 19 Smoothing, Correlation and Spectra
Lecture 20 Coherence; Tapering and Spectral Analysis
Lecture 21 Interpolation
Lecture 22 Hypothesis testing
Lecture 23 Hypothesis Testing continued; F-Tests
Lecture 24 Confidence Limits of Spectra, Bootstraps
purpose of the lecture
generalize the idea of autocorrelation
to multiple time series
Review of last lecture
autocorrelation
correlations between samples within a time series
high degree of short-term correlation
whatever the river was doing yesterday, it's probably doing today, too
because water takes time to drain away
[Figure: Neuse River Hydrograph. A) time series d(t): discharge (cfs) versus time t (days). Second panel: PSD ((cfs)^2 per cycle/day) versus frequency (cycles per day).]
low degree of intermediate-term correlation
whatever the river was doing last month, today it could be doing something completely different
because storms are so unpredictable
moderate degree of long-term correlation
whatever the river was doing this time last year, it's probably doing today, too
because seasons repeat
[Figure: scatter plots of discharge versus discharge lagged by 1 day, 3 days, and 30 days.]
[Figure: Autocorrelation Function. Autocorrelation versus lag (days), shown for lags from -30 to 30 days and from -3000 to 3000 days.]
formula for covariance
formula for autocorrelation
autocorrelation at lag $(k-1)\Delta t$
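A sketch of the two formulas in standard notation, assuming a time series $d_i$ of length $N$ with sample mean $\bar{d}$ and sampling interval $\Delta t$. The symbols $N$, $\bar{d}$ and $a_k$ are assumptions made for illustration, and the normalization of the covariance estimate varies between texts.

```latex
% sample covariance between samples separated by the lag (k-1)\Delta t
\operatorname{cov}(d_i, d_{i+k-1}) \approx \frac{1}{N-k+1} \sum_{i=1}^{N-k+1} (d_i - \bar{d})\,(d_{i+k-1} - \bar{d})

% unnormalized autocorrelation at lag (k-1)\Delta t
a_k = \sum_{i=1}^{N-k+1} d_i \, d_{i+k-1}
```

For a zero-mean time series the two expressions differ only by the normalization factor, which is why the autocorrelation can stand in for the covariance at each lag.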
autocorrelation similar to convolution
note difference in sign
autocorrelation in MatLab
Important Relation #1: autocorrelation is the convolution of a time series with its time-reversed self
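Relation #1 can be checked numerically. The following is a minimal sketch, not the lecture's own script: the short test series `d` and the variable names `a` and `adirect` are hypothetical, and only the base functions `conv` and `flipud` are used. The Signal Processing Toolbox function `xcorr(d)` should return the same sequence, but that is stated here as an assumption to be checked against its documentation.

```matlab
% hypothetical zero-mean test series, stored as a column vector
d = [1; -2; 3; 0; -1; 2; -3; 0];
N = length(d);

% Relation #1: convolve the time-reversed series with the series itself.
% The result has length 2N-1, with zero lag at element N.
a = conv(flipud(d), d);

% direct evaluation of the sum over d(i)*d(i+k-1) for lags (k-1), k = 1..N
adirect = zeros(N,1);
for k = 1:N
    adirect(k) = sum(d(1:N-k+1) .* d(k:N));
end

% the non-negative-lag half of a should match the direct sums
maxdiff = max(abs(a(N:end) - adirect));
```

The autocorrelation is symmetric about zero lag, so `a` should also equal `flipud(a)`.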
Important Relationship #2: Fourier Transform of an autocorrelation
is proportional to the Power Spectral Density of the time series
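Relationship #2 can be illustrated with the discrete Fourier transform. This sketch assumes the same hypothetical test series as a stand-in for real data, and it compares against the squared magnitude of the zero-padded transform. The constant that turns this into a power spectral density depends on the normalization convention in use, so only proportionality is claimed.

```matlab
% hypothetical zero-mean test series, stored as a column vector
d = [1; -2; 3; 0; -1; 2; -3; 0];
N = length(d);

% autocorrelation of length 2N-1, zero lag at element N
a = conv(flipud(d), d);

% reorder so that zero lag comes first, followed by positive lags,
% then negative lags (the wrap-around order that fft expects)
ashift = [a(N:end); a(1:N-1)];

% Fourier transform of the autocorrelation
A = fft(ashift);

% squared magnitude of the transform of d, zero-padded to length 2N-1
D = fft(d, 2*N-1);
P = abs(D).^2;

% the two should agree to rounding error
err = max(abs(A - P));
```

Zero-padding to length 2N-1 matters: without it the FFT implies a circular autocorrelation, which wraps the ends of the series onto each other.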
End of Review
Part 1
correlations between time-series
scenario
discharge correlated with rain
but discharge is delayed behind rain
because rain takes time to drain from the land
[Figure: rain (mm/day) and discharge (m^3/s), each plotted versus time (days).]
rain ahead of discharge
shape not exactly the same, either
treat two time series u and v probabilistically
p.d.f. $p(u_i, v_{i+k-1})$ with elements lagged by time $(k-1)\Delta t$, and compute its covariance
this defines the cross-correlation
just a generalization of the auto-correlation
autocorrelation: different times in the same time series
cross-correlation: different times in different time series
like autocorrelation, similar to convolution
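Written out, the comparison with convolution looks like the sketch below. It assumes time series $u_i$ and $v_i$ of equal length, with each sum running over every $i$ for which both indices are in range. The symbol $c_k$ for the cross-correlation is an assumption made for illustration.

```latex
% cross-correlation at lag (k-1)\Delta t
c_k = \sum_{i} u_i \, v_{i+k-1}

% convolution, for comparison: the index i enters the second factor with the opposite sign
(u * v)_k = \sum_{i} u_i \, v_{k-i+1}
```

Setting $v = u$ in the first expression recovers the autocorrelation.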
As with auto-correlation, two important properties
#1: relationship to convolution
#2: relationship to Fourier Transform
cross-spectral density
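The two properties can be sketched as follows, assuming real time series $u(t)$ and $v(t)$, with a tilde denoting the Fourier transform and $*$ denoting convolution. The notation is an assumption, and constant factors depend on the transform convention in use.

```latex
% property #1: cross-correlation is the convolution of time-reversed u with v
c(t) = u(-t) * v(t)

% property #2: its Fourier transform is the cross-spectral density
\tilde{c}(\omega) = \tilde{u}^{*}(\omega)\, \tilde{v}(\omega)
```

With $v = u$ the second expression reduces to $|\tilde{u}(\omega)|^2$, which is proportional to the power spectral density, consistent with the autocorrelation result.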
cross-correlation in MatLab
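A minimal sketch of the computation, using only base functions so that it does not depend on a toolbox. The test series `u` and `v` and all variable names are hypothetical. The Signal Processing Toolbox function `xcorr` computes the same quantity, but its argument order and lag convention should be checked against its documentation before relying on it.

```matlab
% hypothetical test data: v is u delayed by 2 samples
u = [0; 1; 2; 1; 0; 0; 0; 0];
v = [0; 0; 0; 1; 2; 1; 0; 0];
N = length(u);

% cross-correlation as the convolution of time-reversed u with v;
% length 2N-1, with zero lag at element N
c = conv(flipud(u), v);

% direct evaluation of the sum over u(i)*v(i+k-1) for lags (k-1), k = 1..N
cdirect = zeros(N,1);
for k = 1:N
    cdirect(k) = sum(u(1:N-k+1) .* v(k:N));
end
maxdiff = max(abs(c(N:end) - cdirect));

% cross-spectral density check: reorder c so zero lag comes first,
% then compare its transform with conj(U).*V for zero-padded transforms
cshift = [c(N:end); c(1:N-1)];
C = fft(cshift);
U = fft(u, 2*N-1);
V = fft(v, 2*N-1);
csderr = max(abs(C - conj(U).*V));
```

Unlike the autocorrelation, `c` is not symmetric about zero lag, which is what makes it useful for measuring delays.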
Part 2
aligning time-series
a simple application of cross-correlation
central idea
two time series are best aligned at the lag at which they are most correlated,
which is
the lag at which their cross-correlation is maximum
[Figure: u(t) and v(t), each plotted versus time.]
two similar time-series, with a time shift
(this is a simple “test” or “synthetic” dataset)
[Figure: cross-correlation versus time.]
cross-correlate
[Figure: cross-correlation versus time, with the maximum and its time lag marked.]
find maximum
In MatLab
compute cross-correlation
find maximum
compute time lag
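The three steps can be sketched as follows. This is an illustration under stated assumptions, not the lecture's own script: the synthetic Gaussian pulses, the 7-sample delay, and the names `Dt`, `nlag`, `tlag` and `valigned` are all hypothetical. With this convention a positive `tlag` means that `v` lags behind `u`.

```matlab
% synthetic test data (hypothetical): v is u delayed by 7 samples
Dt = 1.0;                      % sampling interval
N = 100;                       % number of samples
t = Dt*(0:N-1)';               % time as a column vector
shift = 7;                     % true delay, in samples
u = exp(-((t-40).^2)/(2*5^2));
v = exp(-((t-40-shift*Dt).^2)/(2*5^2));

% step 1: compute cross-correlation (zero lag is element N)
c = conv(flipud(u), v);

% step 2: find maximum
[cmax, icmax] = max(c);

% step 3: compute time lag from the position of the maximum
nlag = icmax - N;              % lag in samples
tlag = Dt*nlag;                % lag in time units

% align: build v(t + tlag) by shifting v, padding with zeros
valigned = zeros(N,1);
if nlag >= 0
    valigned(1:N-nlag) = v(1+nlag:N);
else
    valigned(1-nlag:N) = v(1:N+nlag);
end
```

The lag is resolved only to the nearest sample here. Recovering a fraction of the sampling interval would require interpolating the cross-correlation near its peak, which this sketch does not attempt.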
[Figure: u(t) and v(t+tlag), each plotted versus time.]
align time series with measured lag
[Figure: A) solar, W/m^2, versus time (days); B) ozone, ppb, versus time (days).]
solar insolation and ground level ozone
(this is a real dataset from West Point NY)
note time lag
[Figure: C) cross-correlation versus time (hours), with the maximum marked.]
time lag: 3 hours
[Figure: A) solar radiation, W/m^2, versus time (days); B) ozone, ppb, versus time (days), shown as original and as delagged by the 3.00 hour lag.]