Time Series Analysis

Henrik Madsen

[email protected]

Informatics and Mathematical Modelling
Technical University of Denmark
DK-2800 Kgs. Lyngby

H. Madsen, Time Series Analysis, Chapman & Hall
Source: henrikmadsen.org/wordpress/wp-content/uploads/2014/02/lect01.pdf

Jun 01, 2020


Outline of the lecture

Practical information

Introductory examples (see also Chapter 1)

A brief outline of the course

Chapter 2: multivariate random variables, the multivariate normal distribution, linear projections

Example

Introductory example – shares (COLO B 18m)

From www.cse.dk

Consumption of District Heating (VEKS) – data

[Figure: two time series, Nov 1995 to Apr 1996. Panels: Heat Consumption (GJ/h); Air Temperature (°C).]

Consumption of DH – simple model

[Figure: Heat Consumption (GJ/h) plotted against Air Temperature (°C).]

Consumption of DH – model error

[Figure: two panels of Model Error (GJ/h), Nov 1995 to Apr 1996. Panel titles: "Model Error"; "Model Error as it should be if the model were OK".]

A brief outline of the course

General aspects of multivariate random variables

Prediction using the general linear model

Time series models

Some theory on linear systems

Time series models with external input

Some goals:

Characterization of time series / signals; correlation functions, covariance functions, spectral distributions, stationarity, ergodicity, linearity, ...

Signal processing; filtering, sampling, smoothing

Modelling; with or without external input

Prediction / Control

Multivariate random variables

Joint and marginal densities

Conditional distributions

Expectations and moments

Moments of multivariate random variables

Conditional expectation

The multivariate normal distribution

Distributions derived from the normal distribution

Linear projections

Multivariate random variables

Definition (n-dimensional random variable; random vector):

$$X = (X_1, X_2, \ldots, X_n)^T$$

Joint distribution function:

$$F(x_1, \ldots, x_n) = P\{X_1 \le x_1, \ldots, X_n \le x_n\}$$

Multivariate random variables

Probability density function (continuous case):

$$f(x_1, \ldots, x_n) = \frac{\partial^n F(x_1, \ldots, x_n)}{\partial x_1 \cdots \partial x_n}$$

$$F(x_1, \ldots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f(t_1, \ldots, t_n)\, dt_1 \cdots dt_n$$

Probability density function (discrete case):

$$f(x_1, \ldots, x_n) = P\{X_1 = x_1, \ldots, X_n = x_n\}$$

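The Multivariate Normal Distribution

The joint p.d.f.:

$$f_X(x) = \frac{1}{(2\pi)^{n/2}\sqrt{\det\Sigma}} \exp\left[-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right]$$

Σ must be positive semidefinite (and positive definite, so that $\Sigma^{-1}$ exists, for the density above to be defined)

Notation: $X \sim N(\mu, \Sigma)$

Standardized multivariate normal: $X \sim N(0, I)$

$N(\mu, \Sigma) = \mu + T\,N(0, I)$, where $\Sigma = T T^T$

If $X \sim N(\mu, \Sigma)$ and $Y = a + BX$, then $Y \sim N(a + B\mu,\; B\Sigma B^T)$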

More relations between distributions in Sec. 2.7
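
The relation N(µ, Σ) = µ + T N(0, I) suggests one way to simulate from a multivariate normal. The following is a minimal sketch in Python with NumPy, assuming T is taken as the Cholesky factor of Σ (one valid choice among several). The values of `mu`, `Sigma`, `a` and `B` are made up for illustration and do not come from the lecture.

```python
import numpy as np

# Illustrative values (assumptions, not from the lecture)
mu = np.array([1.0, -2.0])
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

# One possible T with Sigma = T T^T: the lower-triangular Cholesky factor
T = np.linalg.cholesky(Sigma)

rng = np.random.default_rng(0)
Z = rng.standard_normal((2, 100_000))   # columns are draws from N(0, I)
X = mu[:, None] + T @ Z                 # columns are draws from N(mu, Sigma)

# Affine transformation Y = a + B X, so Y ~ N(a + B mu, B Sigma B^T)
a = np.array([0.5])
B = np.array([[1.0, 2.0]])
Y = a[:, None] + B @ X

mean_Y_theory = a + B @ mu              # theoretical mean of Y
var_Y_theory = B @ Sigma @ B.T          # theoretical variance of Y
```

Comparing `Y.mean()` and `Y.var()` with `mean_Y_theory` and `var_Y_theory` gives a quick empirical check of the transformation rule.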

Marginal density function

Sub-vector: $(X_1, \ldots, X_k)^T$ $(k < n)$

Marginal density function:

$$f_S(x_1, \ldots, x_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_n)\, dx_{k+1} \cdots dx_n$$

[Figure: joint density over (x1, x2), and a marginal histogram of x1 based on 100000 samples (vertical axis: Percent of Total).]

Conditional distributions

The conditional density of Y given X = x is defined as (for $f_X(x) > 0$):

$$f_{Y|X=x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$

(the joint density of (X, Y) divided by the marginal density of X evaluated at x)

[Figure: joint density over (x, y).]

Independence

If knowledge of X does not give information about Y we get $f_{Y|X=x}(y) = f_Y(y)$

This leads to the following definition of independence:

$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$$

Expectation

Let X be a univariate random variable with density $f_X(x)$. The expectation of X is then defined as:

$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\, dx \quad \text{(continuous case)}$$

$$E[X] = \sum_{\text{all } x} x\, P(X = x) \quad \text{(discrete case)}$$

Calculation rule:

$$E[a + bX_1 + cX_2] = a + b\,E[X_1] + c\,E[X_2]$$

Moments and variance

n'th moment:

$$E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\, dx$$

n'th central moment:

$$E[(X - E[X])^n] = \int_{-\infty}^{\infty} (x - E[X])^n f_X(x)\, dx$$

The second central moment is called the variance:

$$V[X] = E[(X - E[X])^2] = E[X^2] - (E[X])^2$$
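
The second equality in the variance formula follows from expanding the square and using the linearity of the expectation, treating E[X] as a constant. A short worked derivation:

```latex
\begin{aligned}
V[X] &= E\big[(X - E[X])^2\big] \\
     &= E\big[X^2 - 2X\,E[X] + (E[X])^2\big] \\
     &= E[X^2] - 2\,E[X]\,E[X] + (E[X])^2 \\
     &= E[X^2] - (E[X])^2
\end{aligned}
```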

Covariance

Covariance:

$$\mathrm{Cov}[X_1, X_2] = E[(X_1 - E[X_1])(X_2 - E[X_2])] = E[X_1 X_2] - E[X_1]E[X_2]$$

Variance and covariance:

$$V[X] = \mathrm{Cov}[X, X]$$

Calculation rule:

$$\mathrm{Cov}[aX_1 + bX_2,\; cX_3 + dX_4] = ac\,\mathrm{Cov}[X_1, X_3] + ad\,\mathrm{Cov}[X_1, X_4] + bc\,\mathrm{Cov}[X_2, X_3] + bd\,\mathrm{Cov}[X_2, X_4]$$

The calculation rule can be used for the variance also
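
As an illustration of using the calculation rule for a variance, take both arguments of the covariance equal to aX1 + bX2 and use the symmetry Cov[X1, X2] = Cov[X2, X1]:

```latex
\begin{aligned}
V[aX_1 + bX_2] &= \mathrm{Cov}[aX_1 + bX_2,\; aX_1 + bX_2] \\
  &= a^2\,\mathrm{Cov}[X_1, X_1] + ab\,\mathrm{Cov}[X_1, X_2]
     + ba\,\mathrm{Cov}[X_2, X_1] + b^2\,\mathrm{Cov}[X_2, X_2] \\
  &= a^2\,V[X_1] + 2ab\,\mathrm{Cov}[X_1, X_2] + b^2\,V[X_2]
\end{aligned}
```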

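Expectation and Variance for Random Vectors

Expectation: $E[X] = [E[X_1]\; E[X_2]\; \ldots\; E[X_n]]^T$

Variance-covariance (matrix):

$$\Sigma_X = V[X] = E[(X - \mu)(X - \mu)^T] =
\begin{pmatrix}
V[X_1] & \mathrm{Cov}[X_1, X_2] & \cdots & \mathrm{Cov}[X_1, X_n] \\
\mathrm{Cov}[X_2, X_1] & V[X_2] & \cdots & \mathrm{Cov}[X_2, X_n] \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}[X_n, X_1] & \mathrm{Cov}[X_n, X_2] & \cdots & V[X_n]
\end{pmatrix}$$

Correlation:

$$\rho_{ij} = \frac{\mathrm{Cov}[X_i, X_j]}{\sqrt{V[X_i]\,V[X_j]}} = \frac{\sigma_{ij}}{\sigma_i \sigma_j}$$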
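
The correlation formula can be applied to a whole covariance matrix at once. A minimal sketch in Python with NumPy; the function name `correlation_from_covariance` and the numbers in `Sigma` are hypothetical and chosen only for illustration.

```python
import numpy as np

def correlation_from_covariance(Sigma):
    """Return R with R[i, j] = Sigma[i, j] / (sigma_i * sigma_j)."""
    Sigma = np.asarray(Sigma, dtype=float)
    sigma = np.sqrt(np.diag(Sigma))        # standard deviations sigma_i
    return Sigma / np.outer(sigma, sigma)  # divide entry (i, j) by sigma_i * sigma_j

# Illustrative covariance matrix (assumed values)
Sigma = np.array([[4.0, 3.0],
                  [3.0, 9.0]])
R = correlation_from_covariance(Sigma)
```

Here the standard deviations are 2 and 3, so the off-diagonal correlation is 3 / (2 · 3) = 0.5 and the diagonal of `R` is 1.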

Expectation and Variance for Random Vectors

The correlation matrix $R = \rho$ is an arrangement of the $\rho_{ij}$ in a matrix

Covariance matrix between X (dim. p) and Y (dim. q):

$$\Sigma_{XY} = C[X, Y] = E[(X - \mu)(Y - \nu)^T] =
\begin{pmatrix}
\mathrm{Cov}[X_1, Y_1] & \cdots & \mathrm{Cov}[X_1, Y_q] \\
\vdots & & \vdots \\
\mathrm{Cov}[X_p, Y_1] & \cdots & \mathrm{Cov}[X_p, Y_q]
\end{pmatrix}$$

Calculation rules – see the book.

The special case of the variance $C[X, X] = V[X]$ results in

$$V[AX] = A\,V[X]\,A^T$$

Conditional expectation

$$E[Y|X = x] = \int_{-\infty}^{\infty} y\, f_{Y|X=x}(y)\, dy$$

$E[Y|X] = E[Y]$ if X and Y are independent

$E[Y] = E[E[Y|X]]$

$E[g(X)Y|X] = g(X)\,E[Y|X]$

$E[g(X)Y] = E[g(X)\,E[Y|X]]$

$E[a|X] = a$

$E[g(X)|X] = g(X)$

$E[cX + dZ|Y] = c\,E[X|Y] + d\,E[Z|Y]$

Variance separation

Definition of conditional variance and covariance:

$$V[Y|X] = E[(Y - E[Y|X])(Y - E[Y|X])^T \mid X]$$

$$C[Y, Z|X] = E[(Y - E[Y|X])(Z - E[Z|X])^T \mid X]$$

The variance separation theorem:

$$V[Y] = E[V[Y|X]] + V[E[Y|X]]$$

$$C[Y, Z] = E[C[Y, Z|X]] + C[E[Y|X], E[Z|X]]$$
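
The rule E[Y] = E[E[Y|X]] and the variance separation theorem can be checked numerically on a small discrete example. The sketch below uses a hypothetical joint probability table `p` for scalar X and Y; the numbers are assumptions made up for illustration.

```python
import numpy as np

# Hypothetical joint pmf P(X = x_i, Y = y_j); rows index x, columns index y
y_vals = np.array([0.0, 1.0, 2.0])
p = np.array([[0.10, 0.20, 0.10],
              [0.30, 0.10, 0.20]])

p_x = p.sum(axis=1)                    # marginal pmf of X
p_y_given_x = p / p_x[:, None]         # conditional pmf of Y given X = x_i

cond_mean = p_y_given_x @ y_vals                    # E[Y | X = x_i]
cond_var = p_y_given_x @ y_vals**2 - cond_mean**2   # V[Y | X = x_i]

p_y = p.sum(axis=0)                    # marginal pmf of Y
mean_y = p_y @ y_vals                  # E[Y]
var_y = p_y @ y_vals**2 - mean_y**2    # V[Y]

# Outer expectations and variance are taken over the distribution of X
mean_of_cond_mean = p_x @ cond_mean                            # E[E[Y|X]]
var_of_cond_mean = p_x @ cond_mean**2 - mean_of_cond_mean**2   # V[E[Y|X]]
mean_of_cond_var = p_x @ cond_var                              # E[V[Y|X]]
```

With these definitions, `mean_y` agrees with `mean_of_cond_mean`, and `var_y` agrees with `mean_of_cond_var + var_of_cond_mean`.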

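Linear Projections

Consider two random vectors Y and X, then

$$E\left[\begin{pmatrix} Y \\ X \end{pmatrix}\right] = \begin{pmatrix} \mu_Y \\ \mu_X \end{pmatrix}
\quad \text{and} \quad
V\left[\begin{pmatrix} Y \\ X \end{pmatrix}\right] = \begin{pmatrix} \Sigma_{YY} & \Sigma_{YX} \\ \Sigma_{XY} & \Sigma_{XX} \end{pmatrix}$$

Consider the linear projection: $E[Y|X] = a + BX$

Then:

$$E[Y|X] = \mu_Y + \Sigma_{YX}\Sigma_{XX}^{-1}(X - \mu_X)$$

$$V[Y - E[Y|X]] = \Sigma_{YY} - \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{YX}^T$$

$$C[Y - E[Y|X],\, X] = 0$$

The linear projection above has minimal variance among all linear projections.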
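
A short sketch of where the expressions for a and B come from, assuming that Σ_XX is invertible: require the projection error Y − a − BX to have zero mean and to be uncorrelated with X.

```latex
\begin{aligned}
E[Y - a - BX] = 0
  &\;\Rightarrow\; a = \mu_Y - B\mu_X \\
C[Y - a - BX,\; X] = \Sigma_{YX} - B\,\Sigma_{XX} = 0
  &\;\Rightarrow\; B = \Sigma_{YX}\Sigma_{XX}^{-1} \\
a + BX &= \mu_Y + \Sigma_{YX}\Sigma_{XX}^{-1}(X - \mu_X)
\end{aligned}
```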

Air pollution in cities

Carstensen (1990) has used time series analysis to set up models for NO and NO2 at Jagtvej in Copenhagen

Measurements of NO and NO2 are available every third hour (00, 03, 06, 09, 12, ...)

We have $\mu_{NO_2} = 48\ \mu g/m^3$ and $\mu_{NO} = 79\ \mu g/m^3$

In the model, $X_{1,t} = NO_{2,t} - \mu_{NO_2}$ and $X_{2,t} = NO_t - \mu_{NO}$ are used

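Air pollution in cities – model and forecast

$$\begin{pmatrix} X_{1,t} \\ X_{2,t} \end{pmatrix} =
\begin{pmatrix} 0.9 & -0.1 \\ 0.4 & 0.8 \end{pmatrix}
\begin{pmatrix} X_{1,t-1} \\ X_{2,t-1} \end{pmatrix} +
\begin{pmatrix} \xi_{1,t} \\ \xi_{2,t} \end{pmatrix}$$

$$X_t = \Phi X_{t-1} + \xi_t$$

$$V[\xi_t] = \Sigma =
\begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix} =
\begin{pmatrix} 30 & 21 \\ 21 & 23 \end{pmatrix} (\mu g/m^3)^2$$

Assume that t corresponds to 09:00 today and we have measurements 64 µg/m³ NO2 and 93 µg/m³ NO

Forecast the concentrations at 12:00 (t + 1)

What is the variance-covariance of this forecast?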
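
One way to work through the forecast questions numerically is sketched below in Python with NumPy. It assumes, as is usual for this type of model but not stated on the slide, that ξ has zero mean and is uncorrelated with past values of X, so that the one-step forecast is ΦX_t and the forecast error is ξ at time t + 1. The variable names are illustrative.

```python
import numpy as np

mu = np.array([48.0, 79.0])          # (mu_NO2, mu_NO) in ug/m3
Phi = np.array([[0.9, -0.1],
                [0.4,  0.8]])
Sigma = np.array([[30.0, 21.0],
                  [21.0, 23.0]])     # V[xi_t] in (ug/m3)^2

obs_t = np.array([64.0, 93.0])       # measured (NO2, NO) at 09:00
x_t = obs_t - mu                     # deviations from the mean levels

# Assumption: xi_{t+1} has zero mean and is uncorrelated with X_t,
# so the one-step forecast is E[X_{t+1} | X_t] = Phi X_t
x_pred = Phi @ x_t
forecast = mu + x_pred               # back to concentrations in ug/m3

# The forecast error is X_{t+1} - Phi X_t = xi_{t+1}, whose variance is Sigma
forecast_error_cov = Sigma
```

Under these assumptions the deviations at 09:00 are (16, 14), the predicted deviations are (13.0, 17.6), and the forecast concentrations at 12:00 are 61.0 µg/m³ NO2 and 96.6 µg/m³ NO, with forecast error variance-covariance Σ.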

Air pollution in cities – linear projection

At 12:00 (t + 1) we now assume that NO2 is measured with 67 µg/m³ as the result, but NO cannot be measured due to some trouble with the equipment.

Estimate the missing NO measurement.

What is the variance of the error of the estimation?
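
A possible numerical treatment of these two questions, sketched in Python with NumPy. It applies the linear projection formulas to the two components of the one-step forecast error, again assuming that ξ has zero mean and is uncorrelated with X_t. The variable names are illustrative.

```python
import numpy as np

mu_no2, mu_no = 48.0, 79.0           # mean levels in ug/m3
Phi = np.array([[0.9, -0.1],
                [0.4,  0.8]])
Sigma = np.array([[30.0, 21.0],
                  [21.0, 23.0]])     # V[xi_t] in (ug/m3)^2

x_t = np.array([64.0 - mu_no2, 93.0 - mu_no])   # deviations at 09:00
x_pred = Phi @ x_t                   # predicted deviations (X1, X2) at 12:00

x1_obs = 67.0 - mu_no2               # observed NO2 deviation at 12:00
innovation = x1_obs - x_pred[0]      # realised xi_{1,t+1}

# Linear projection of X2 on X1 (both conditional on X_t):
# E[X2 | X1] = E[X2] + Sigma_21 Sigma_11^{-1} (X1 - E[X1])
gain = Sigma[1, 0] / Sigma[0, 0]
x2_est = x_pred[1] + gain * innovation
no_estimate = mu_no + x2_est         # estimated NO concentration in ug/m3

# V[Y - E[Y|X]] = Sigma_YY - Sigma_YX Sigma_XX^{-1} Sigma_YX^T
error_variance = Sigma[1, 1] - Sigma[1, 0] ** 2 / Sigma[0, 0]
```

With these numbers the NO2 innovation is 6, the gain is 21/30 = 0.7, the estimated NO concentration is 100.8 µg/m³, and the variance of the estimation error is 23 − 21²/30 = 8.3 (µg/m³)², which is smaller than the 23 (µg/m³)² of the forecast without the NO2 measurement.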