Environmental Data Analysis with MatLab Lecture 20: Coherence; Tapering and Spectral Analysis
Mar 28, 2015
SYLLABUS
Lecture 01 Using MatLab
Lecture 02 Looking At Data
Lecture 03 Probability and Measurement Error
Lecture 04 Multivariate Distributions
Lecture 05 Linear Models
Lecture 06 The Principle of Least Squares
Lecture 07 Prior Information
Lecture 08 Solving Generalized Least Squares Problems
Lecture 09 Fourier Series
Lecture 10 Complex Fourier Series
Lecture 11 Lessons Learned from the Fourier Transform
Lecture 12 Power Spectral Density
Lecture 13 Filter Theory
Lecture 14 Applications of Filters
Lecture 15 Factor Analysis
Lecture 16 Orthogonal functions
Lecture 17 Covariance and Autocorrelation
Lecture 18 Cross-correlation
Lecture 19 Smoothing, Correlation and Spectra
Lecture 20 Coherence; Tapering and Spectral Analysis
Lecture 21 Interpolation
Lecture 22 Hypothesis testing
Lecture 23 Hypothesis Testing continued; F-Tests
Lecture 24 Confidence Limits of Spectra, Bootstraps
purpose of the lecture
Part 1
Finish up the discussion of
correlations between time series
Part 2
Examine how the finite observation time affects estimates of the power spectral density of time series
Part 1
“Coherence”
frequency-dependent correlations between time series
Scenario A
in a hypothetical region
windiness and temperature correlate at periods of a year, because of large scale climate patterns
but they do not correlate at periods of a few days
[Figure: wind speed and temperature versus time, years 1 to 3]
summers hot and windy
winters cool and calm
heat wave not especially windy
cold snap not especially calm
in this case, the time series are correlated at long periods
but not at short periods
Scenario B
in a hypothetical region
plankton growth rate and precipitation correlate at periods of a few weeks
but they do not correlate seasonally
[Figure: growth rate and precipitation versus time, years 1 to 3]
summer drier than winter
growth rate has no seasonal signal
growth rate high at times of peak precipitation
in this case, the time series are correlated at short periods
but not at long periods
Coherence
a way to quantify frequency-dependent correlation
strategy
band pass filter the two time series, u(t) and v(t), around frequency ω0
compute their zero-lag cross correlation (large when time series are similar in shape)
repeat for many ω0’s to create a function c(ω0)
[Figure: power spectral density |f(ω)|² of the band-pass filter f(t), with bands of width 2Δω centered on −ω0 and +ω0]
evaluate at zero lag, t=0, and at many ω0’s
Short Cut
Fact 1: a function evaluated at time t=0 is equal to the integral of its Fourier Transform over frequency (up to the constant factor set by the transform convention)
Fact 2: the Fourier Transform of a convolution is the product of the transforms
integral over frequency
assume an ideal band-pass filter that is either 0 or 1, so that the integral splits into a band of negative frequencies and a band of positive frequencies
c is real, so the real part is symmetric and adds, while the imaginary part is antisymmetric and cancels
interpret the integral as an average over the frequency band
(indicated with the overbar)
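The equations for these steps did not survive in this transcript. The block below is a reconstruction of the argument under stated assumptions, not necessarily the lecture's own notation: the Fourier transform convention (written in the comments), real time series u(t) and v(t), and an ideal filter whose power spectral density is 1 inside two bands of width 2Δω and 0 elsewhere.

```latex
% Conventions assumed here (not taken from the slides):
%   \tilde{u}(\omega) = \int u(t)\, e^{-i\omega t}\, dt, with 1/(2\pi) on the inverse transform
%   u(t), v(t) real; |\tilde{f}(\omega)|^2 = 1 inside the pass bands, 0 outside
\begin{align*}
c(\omega_0)
 &= \frac{1}{2\pi}\int_{-\infty}^{+\infty}
    |\tilde{f}(\omega)|^2\, \tilde{u}^*(\omega)\, \tilde{v}(\omega)\, d\omega \\
 &= \frac{1}{2\pi}\int_{-\omega_0-\Delta\omega}^{-\omega_0+\Delta\omega}
    \tilde{u}^*\tilde{v}\, d\omega
  + \frac{1}{2\pi}\int_{\omega_0-\Delta\omega}^{\omega_0+\Delta\omega}
    \tilde{u}^*\tilde{v}\, d\omega \\
 &= \frac{1}{\pi}\,\mathrm{Re}\int_{\omega_0-\Delta\omega}^{\omega_0+\Delta\omega}
    \tilde{u}^*(\omega)\,\tilde{v}(\omega)\, d\omega
  = \frac{2\Delta\omega}{\pi}\,\mathrm{Re}\,\overline{\tilde{u}^*\tilde{v}},
\qquad
\overline{g} \equiv \frac{1}{2\Delta\omega}
    \int_{\omega_0-\Delta\omega}^{\omega_0+\Delta\omega} g(\omega)\, d\omega
\end{align*}
```

The first line combines Fact 1 and Fact 2. The step from the second line to the third uses the symmetry of the transform of a real series: the integrand at −ω is the complex conjugate of the integrand at +ω, so the two bands sum to twice the real part of one of them.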
Two final steps
1. Omit taking of real part in formula (simplifying approximation)
2. Normalize by the amplitude of the two time series and square, so that result varies between 0 and 1
the final result is called Coherence
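With an overbar denoting an average over a narrow frequency band, the two steps give a coherence of the form |avg(ũ* ṽ)|² / (avg|ũ|² · avg|ṽ|²). The sketch below illustrates that recipe in plain Python rather than with the lecture's MatLab scripts. The function names, the band half-width in bins and the synthetic test series are all assumptions made for this example.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform; adequate for short series."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def coherence(u, v, half_width):
    """Coherence at frequency bins 0..N/2, averaging over the band of bins
    k-half_width..k+half_width (clipped at the ends of the spectrum)."""
    U, V = dft(u), dft(v)
    n_pos = len(u) // 2 + 1
    cross = [U[k].conjugate() * V[k] for k in range(n_pos)]
    pu = [abs(U[k]) ** 2 for k in range(n_pos)]
    pv = [abs(V[k]) ** 2 for k in range(n_pos)]
    coh = []
    for k in range(n_pos):
        lo = max(0, k - half_width)
        hi = min(n_pos, k + half_width + 1)
        c_bar = sum(cross[lo:hi]) / (hi - lo)      # band-averaged cross spectrum
        pu_bar = sum(pu[lo:hi]) / (hi - lo)        # band-averaged power of u
        pv_bar = sum(pv[lo:hi]) / (hi - lo)        # band-averaged power of v
        coh.append(abs(c_bar) ** 2 / (pu_bar * pv_bar))
    return coh


# Synthetic example (assumed data): a shared oscillation at bin K0 plus
# independent noise in each series.
random.seed(0)
N, K0 = 128, 8
u = [math.sin(2 * math.pi * K0 * t / N) + random.gauss(0.0, 0.2)
     for t in range(N)]
v = [0.5 * math.sin(2 * math.pi * K0 * t / N + 0.3) + random.gauss(0.0, 0.2)
     for t in range(N)]
coh = coherence(u, v, half_width=2)
print("coherence at the shared frequency:", round(coh[K0], 3))
print("mean coherence at high frequencies:", round(sum(coh[30:60]) / 30, 3))
```

The band average is essential: without it (half_width=0) every frequency would have a coherence of exactly 1, because a single complex number is always perfectly correlated with another single complex number.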
[Figure: six panels, A) to F), showing precipitation, air temperature (T-air), water temperature (T-water), salinity, turbidity and chlorophyll versus time in days, 0 to 1600]
new dataset:
Water Quality
Reynolds Channel, Coastal Long Island, New York
precipitation
air temperature
water temperature
salinity
turbidity
chlorophyll
[Figure: band-pass filtered precipitation, air temperature, water temperature, salinity, turbidity and chlorophyll versus time in days; A) periods near 1 year, B) periods near 5 days]
Fig. 9.18. Band-pass filtered water quality measurements from Reynolds Channel (New York) for several years starting January 1, 2006. A) Periods near one year; and B) periods near 5 days. MatLab script eda09_16.
[Figure: coherence versus frequency, 0 to 0.2 cycles per day, for A) air-temp and water-temp, B) precipitation and salinity, C) water-temp and chlorophyll; periods of one year and one week are marked]
high coherence at periods of 1 year
moderate coherence at periods of about a month
very low coherence at periods of months to a few days
Part 2
windowing time series before computing power-spectral density
[Figure: time series d(t), window function W(t) and windowed series W(t)d(t) versus time t (s), with the amplitude spectral density (ASD) of each versus frequency f (Hz)]
scenario: you are studying an indefinitely long phenomenon …
but you only observe a short portion of it …
how does the power spectral density of the short piece
differ from the p.s.d. of the indefinitely long phenomenon?
(assuming a stationary time series)
We might suspect that the difference will be increasingly significant as the window of observation becomes so short that it includes just a few oscillations of the period of interest.
starting point
short piece is
the indefinitely long time series multiplied by a
window function, W(t)
by the convolution theorem
Fourier Transform of short piece is
Fourier Transform of indefinitely long time series convolved with
Fourier Transform of window function
so Fourier Transform of short piece
is exactly
Fourier Transform of indefinitely long time series
when
Fourier Transform of window function
is a spike
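The statement above can be checked numerically in its discrete form: the DFT of a product of two series equals the circular convolution of their DFTs divided by N. The following is a minimal sketch in plain Python, assuming a 32-sample cosine as the "long" series and a boxcar that keeps only the first half. All names and values are illustrative and are not taken from the lecture's scripts.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform; adequate for short series."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def circular_convolve(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]


N = 32
d = [math.cos(2 * math.pi * 4 * t / N) for t in range(N)]     # "long" series
w = [1.0 if t < N // 2 else 0.0 for t in range(N)]            # boxcar window
short = [w[t] * d[t] for t in range(N)]                       # observed piece

D, W = dft(d), dft(w)
lhs = dft(short)                                    # transform of the product
rhs = [c / N for c in circular_convolve(D, W)]      # convolution of transforms
max_err = max(abs(a - b) for a, b in zip(lhs, rhs))

# Count frequency bins that carry non-negligible amplitude.
bins_long = sum(1 for c in D if abs(c) > 1e-6)
bins_short = sum(1 for c in lhs if abs(c) > 1e-6)
print("largest mismatch:", max_err)
print("non-zero bins, long series:", bins_long, " short piece:", bins_short)
```

The long series occupies only two frequency bins (positive and negative frequency), while the windowed piece spreads over more bins, which is the convolution with the window's transform at work.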
boxcar window function
its Fourier Transform: a sinc() function, sort of spiky, but has side lobes
narrow spectral peak (in the ASD of d)
wide central spike (in the ASD of W)
wide spectral peak (in the ASD of W d)
Effect 1: broadening of spectral peaks
only one spectral peak (in the ASD of d)
side lobes (in the ASD of W)
spurious spectral peaks (in the ASD of W d)
Effect 2: spurious side lobes
Q: Can the situation be improved?
A: Yes, by choosing a smoother window function
more like a Normal Function (which has no side lobes)
but still zero outside of interval of observation
[Figure: d(t), W(t) and W(t)d(t) versus time t (s), and their amplitude spectral densities versus frequency f (Hz), for the boxcar window function]
[Figure: the same quantities for the Hamming window function]
side lobes far smaller than with boxcar (not visible at this plot scale), but central peak wider than with boxcar
Hamming Window Function
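The trade-off between side lobes and peak width can be put in numbers. The sketch below, in plain Python, compares a 32-sample boxcar with a Hamming window built from the commonly quoted formula 0.54 − 0.46 cos(2πn/(M−1)). The window length, the frequency grid and the helper names are assumptions for this illustration.

```python
import cmath
import math


def amplitude_spectrum(w, pad=16):
    """|Fourier transform| of window w on a fine grid, 0 to 0.5 cycles/sample."""
    m = len(w)
    n_grid = m * pad
    freqs = [j / n_grid for j in range(n_grid // 2 + 1)]
    amps = [abs(sum(w[t] * cmath.exp(-2j * math.pi * f * t) for t in range(m)))
            for f in freqs]
    return freqs, amps


def lobe_summary(w):
    """Return (half-width of central peak, largest side lobe / central peak)."""
    freqs, amps = amplitude_spectrum(w)
    j = 1
    # Walk down the central peak until its first local minimum.
    while j < len(amps) - 1 and not (amps[j] < amps[j - 1]
                                     and amps[j] <= amps[j + 1]):
        j += 1
    return freqs[j], max(amps[j:]) / amps[0]


M = 32
boxcar = [1.0] * M
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / (M - 1)) for n in range(M)]

box_width, box_side = lobe_summary(boxcar)
ham_width, ham_side = lobe_summary(hamming)
print("boxcar : half-width", box_width, " side lobe ratio", round(box_side, 4))
print("hamming: half-width", round(ham_width, 4),
      " side lobe ratio", round(ham_side, 4))
```

For these settings the Hamming side lobes come out well over an order of magnitude smaller than the boxcar's, at the cost of a central peak roughly twice as wide.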
Q: Is there a “best” window function?
A: Only if you carefully specify what you mean by “best”
(notion of best based on prior information)
“optimal” window function
maximize
ratio of power in central peak
(assumed to lie in range ±ω0) to overall power
The parameter, ω0, allows you to choose how much spectral broadening you can tolerate
Once ω0 is specified, the problem can be solved by using standard optimization techniques
One finds that there are actually several window functions, with radically different shapes, that are
“optimal”
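One standard way to pose this optimization for a sampled window w is as an eigenvalue problem: the ratio of in-band power to total power equals (wᵀMw)/(wᵀw), with M[i][j] = sin(ω0(i−j))/(π(i−j)) and ω0/π on the diagonal, so the eigenvectors of M are the candidate windows. The sketch below finds only the single most concentrated one, by power iteration in plain Python. It is an illustration under these assumptions (unit sampling interval, ω0 in radians per sample, hypothetical function names), not the lecture's own procedure.

```python
import math


def concentration_matrix(n, omega0):
    """M[i][j] = sin(omega0*(i-j)) / (pi*(i-j)), with omega0/pi on the diagonal.
    For a window w, (w'Mw)/(w'w) is the fraction of its power that lies in
    the band -omega0..+omega0 (omega0 in radians per sample)."""
    return [[omega0 / math.pi if i == j
             else math.sin(omega0 * (i - j)) / (math.pi * (i - j))
             for j in range(n)] for i in range(n)]


def most_concentrated_window(n, omega0, n_iter=200):
    """Power iteration for the eigenvector of M with the largest eigenvalue."""
    m = concentration_matrix(n, omega0)
    w = [1.0] * n
    for _ in range(n_iter):
        w = [sum(row[j] * w[j] for j in range(n)) for row in m]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    mw = [sum(row[j] * w[j] for j in range(n)) for row in m]
    ratio = sum(a * b for a, b in zip(w, mw))   # Rayleigh quotient, |w| = 1
    return w, ratio


N = 32
OMEGA0 = 2 * math.pi / N     # tolerate broadening of about one frequency bin
window, ratio = most_concentrated_window(N, OMEGA0)
print("fraction of power inside +/- omega0:", round(ratio, 4))
```

The eigenvectors with the next-largest eigenvalues are orthogonal to this one and change sign within the window, which is one way to understand how several windows of very different shape can each be well concentrated.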
[Figure: window functions W1(t), W2(t) and W3(t) versus time t (s)]
Family of three “optimal” window functions
a common strategy is to compute the power spectral density
with each of these window functions separately and then average the result
technique called Multi-taper Spectral Analysis
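The averaging step can be sketched in a few lines of plain Python. To keep the example short, the “optimal” window functions are replaced here by sine tapers, sqrt(2/(N+1)) sin(πk(t+1)/(N+1)), a simpler orthonormal family that is also used for multi-taper estimates. This substitution, the test series and the function names are assumptions of the example.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform; adequate for short series."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def sine_tapers(n, n_tapers):
    """Orthonormal sine tapers (a stand-in for the optimal window functions)."""
    return [[math.sqrt(2.0 / (n + 1)) * math.sin(math.pi * k * (t + 1) / (n + 1))
             for t in range(n)] for k in range(1, n_tapers + 1)]


def multitaper_psd(d, tapers):
    """Average of the power spectra of the series tapered by each window."""
    n = len(d)
    psd = [0.0] * n
    for w in tapers:
        spectrum = dft([w[t] * d[t] for t in range(n)])
        for k in range(n):
            psd[k] += abs(spectrum[k]) ** 2 / len(tapers)
    return psd


N, K0 = 64, 13
d = [math.cos(2 * math.pi * K0 * t / N) for t in range(N)]   # test series
tapers = sine_tapers(N, 3)
psd = multitaper_psd(d, tapers)
peak_bin = max(range(N // 2 + 1), key=lambda k: psd[k])
print("spectral peak at bin", peak_bin, "for an oscillation at bin", K0)
```

Each taper weights a different part of the record, so the individual spectra are nearly independent estimates and their average is less noisy than any one of them, at the price of a peak that is a few bins wide.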
[Figure: time-domain panels d(t), B(t)d(t), W1(t)d(t), W2(t)d(t) and W3(t)d(t) versus time t (s); frequency-domain panels |d(f)|, |d0(f)|, |d1(f)|, |d2(f)|, |d3(f)| and the average |d(f)|avg versus frequency f (Hz)]
box car tapering
tapering with three “optimal” window functions
p.s.d. produced by averaging
Summary
always taper a time series before computing the p.s.d.
try a Hamming taper first; it’s simple
use multi-taper analysis when higher resolution is needed
e.g. when the time series is very short