Financial Econometric Models Vincent JEANNIN – ESGF 5IFM Q1 2012 1 [email protected] ESGF 5IFM Q1 2012
Jan 15, 2015
Financial Econometric Models
Vincent JEANNIN – ESGF 5IFM Q1 2012
vinzjeannin@hotmail.com
Summary of the session (Est. 3h)
• Reminder of Last Session
• Time Series Analysis Principles
• Auto Regressive Process
• Moving Average Process
• ARMA
• Conclusion
Reminder of Last Session
Be logical!
$Y_{Diff} = \ln(Y)$
Differentiation possible
Time can be an explanatory variable in a regression
Differencing can add value
Check ACF/PACF for autocorrelation
Time Series Analysis Principles
Reminder of the 3 steps
Identify
Fit
Forecast
Reminder of the 3 components
Trend
Seasonality
Residual
Lag
$B x_t = x_{t-1}$
Difference
$\Delta x_t = x_t - x_{t-1}$
Seasonal difference (here with a period of 30)
$\Delta_{30} x_t = x_t - x_{t-30}$
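The three operators can be tried on a toy vector. This is a minimal sketch: the values of `x` and the period of 2 are made up for illustration (the slide uses a period of 30), and only base R functions are used.

```r
# Toy series (hypothetical values, not the course data)
x <- c(100, 102, 101, 105, 107, 106)

# Lag operator B: shift the series back by one step (the first value is unknown)
lag1 <- c(NA, head(x, -1))

# First difference: x_t - x_{t-1}
d1 <- diff(x)

# Seasonal difference with period s: x_t - x_{t-s} (s = 2 here, 30 on the slide)
d2 <- diff(x, lag = 2)

print(d1)
print(d2)
```

Each difference shortens the series: `d1` loses one observation and `d2` loses two.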
Difference the series to obtain a stationary series
Time series analysis and forecasting are simpler with a stationary series
Different models apply depending on whether the series is stationary or heteroscedastic
Properties of stationary series
$(Y_1, Y_2, Y_3, \dots, Y_n)$ and $(Y_2, Y_3, Y_4, \dots, Y_{n+1})$ have the same joint distribution
The distribution does not depend on time
This strict form of stationarity rarely occurs in practice
Stationarity is accepted (weak stationarity) if
$E(Y_t) = \mu$ is constant over time
$Cov(Y_t, Y_{t-n})$ depends only on the lag $n$
Acceptable Shortcut
A series can be treated as stationary if its mean and variance are stable over time
Which one is more likely to be stationary?
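A quick way to apply this shortcut is to compare the mean and variance over the two halves of a series. The sketch below uses simulated data (a random walk and its first difference), and `half_stats` is a hypothetical helper written for this illustration. It is an informal check, not a formal stationarity test.

```r
set.seed(10)
rw <- cumsum(rnorm(400))   # random walk: not stationary
d  <- diff(rw)             # its first difference: white noise, stationary

# Compare mean and variance over the first and second halves of a series
half_stats <- function(x) {
  h <- length(x) %/% 2
  a <- x[1:h]
  b <- x[(h + 1):length(x)]
  c(mean_1 = mean(a), mean_2 = mean(b), var_1 = var(a), var_2 = var(b))
}

half_stats(rw)  # the two halves typically differ a lot
half_stats(d)   # the two halves should look similar
```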
About the residuals…
They should be white noise!
Normality check
Get a first idea with
Skewness
Kurtosis
Proper tests: Kolmogorov–Smirnov (normality), Durbin–Watson and portmanteau (autocorrelation), …
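These checks can be run in a few lines of base R. The sketch below is illustrative only: `eps` is simulated white noise standing in for real residuals, the skewness, kurtosis and Durbin–Watson statistics are computed by hand rather than taken from a package, and a KS test with estimated mean and standard deviation is only approximate.

```r
set.seed(1)                      # reproducible toy example
eps <- rnorm(200)                # stand-in for model residuals (assumed white noise)

z <- (eps - mean(eps)) / sd(eps) # standardised residuals
skew <- mean(z^3)                # near 0 for a symmetric distribution
kurt <- mean(z^4)                # near 3 for a normal distribution

# Kolmogorov-Smirnov test against a normal with the sample mean and sd
ks <- ks.test(eps, "pnorm", mean = mean(eps), sd = sd(eps))

# Portmanteau (Ljung-Box) test for autocorrelation up to lag 10
lb <- Box.test(eps, lag = 10, type = "Ljung-Box")

# Durbin-Watson statistic by hand; values near 2 suggest no lag-1 autocorrelation
dw <- sum(diff(eps)^2) / sum(eps^2)

c(skewness = skew, kurtosis = kurt, ks_p = ks$p.value, lb_p = lb$p.value, dw = dw)
```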
Auto Regressive Process
The current value is correlated with previous values
Main principle
$X_t = c + \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \dots + \varphi_n X_{t-n} + \varepsilon_t$
$\varphi_1, \dots, \varphi_n$: parameters of the model
$\varepsilon_t$: white noise
Once the parameters are identified, prediction is easy
AR(n)
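To see why prediction is easy once the parameters are known, here is a minimal AR(1) sketch. The values of `c0` and `phi` are illustrative assumptions, not estimates from the course data.

```r
set.seed(42)
n   <- 200
c0  <- 1.0            # constant c (illustrative value)
phi <- 0.6            # AR(1) coefficient (illustrative value, |phi| < 1)
eps <- rnorm(n)       # white noise

# Build the series with the recursion X_t = c + phi * X_{t-1} + eps_t
x <- numeric(n)
x[1] <- c0 / (1 - phi) + eps[1]   # start near the long-run mean c / (1 - phi)
for (t in 2:n) x[t] <- c0 + phi * x[t - 1] + eps[t]

# With known parameters, the one-step-ahead forecast is immediate
forecast_next <- c0 + phi * x[n]
long_run_mean <- c0 / (1 - phi)
```

The forecast replaces the unknown future noise by its expected value of zero, and repeated forecasts converge towards `long_run_mean`.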
Let’s load some data
# Read the series from a CSV file (adjust the path to your machine)
DATA<-read.csv(file="C:/Users/vin/Desktop/Series1.csv",header=TRUE)
plot(DATA$Val, type="l")
Is this white noise?
hist(DATA$Val, breaks=20)
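Probably not…
Portmanteau test
Tests the autocorrelation of a series
If there is autocorrelation, the data aren’t independently distributed
Let’s use the Ljung–Box statistic
$Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{n-k}$
$\hat{\rho}_k$: sample autocorrelation at lag $k$, $n$: sample size, $h$: number of lags tested
H0: Data are independently distributed
H1: Data aren’t independently distributed
Reject H0 if $Q > \chi^2_{1-\alpha,h}$
where $\alpha$ is the significance level and $\chi^2_{1-\alpha,h}$ is the $1-\alpha$ quantile of a chi-square distribution with $h$ degrees of freedom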
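The statistic can be computed by hand and compared with the built-in test. A minimal sketch, assuming a simulated random walk as the series and an arbitrary choice of 5 lags:

```r
set.seed(123)
x <- cumsum(rnorm(150))          # toy autocorrelated series (a random walk)
n <- length(x)
h <- 5                           # number of lags tested (illustrative choice)

# Sample autocorrelations at lags 1..h (drop lag 0, which is always 1)
rho <- acf(x, lag.max = h, plot = FALSE)$acf[-1]

# Ljung-Box statistic and its p-value under a chi-square with h degrees of freedom
Q <- n * (n + 2) * sum(rho^2 / (n - 1:h))
p_value <- pchisq(Q, df = h, lower.tail = FALSE)

# Cross-check against the built-in test
bt <- Box.test(x, lag = h, type = "Ljung-Box")
```

Note that `Box.test` computes the Box–Pierce statistic unless `type = "Ljung-Box"` is passed.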
> Box.test(DATA$Val)
Box-Pierce test
data: DATA$Val
X-squared = 188.3263, df = 1, p-value < 2.2e-16
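H0 is rejected: the data aren’t independently distributed
Note: by default Box.test computes the Box–Pierce statistic; add type="Ljung-Box" to get the Ljung–Box version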
Let’s try a regression and analyse residuals
TReg<-lm(DATA$Val~DATA$t)
plot(DATA$Val, type="l")
abline(TReg, col="blue")
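eps<-resid(TReg)
# Compare the residuals with a normal of the same mean and sd (plain "pnorm" would test against N(0,1))
ks.test(eps, "pnorm", mean=mean(eps), sd=sd(eps))
# Standard regression diagnostic plots in a 2x2 grid
layout(matrix(1:4,2,2))
plot(TReg)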
> Box.test(eps)
Box-Pierce test
data: eps
X-squared = 187.6299, df = 1, p-value < 2.2e-16
Residuals aren’t white noise
Regression rejected
Not a surprise: did the series look stationary?
What next then?
lag.plot(DATA$Val, 9, do.lines=FALSE)
Differencing looks promising
Does differencing produce a stationary series?
plot(diff(DATA$Val), type="l")
ACF & PACF
par(mfrow=c(2,1))
acf(diff(DATA$Val),20)
pacf(diff(DATA$Val),20)
ACF decays gradually
PACF cuts off after lag 1
Decaying ACF
PACF cuts off after lag 1
Typically an Autoregressive Process
AR(1) ?
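The signature of an AR(1) can be checked against its theoretical ACF and PACF, which `ARMAacf` from the stats package returns directly. The coefficient 0.6 below is an illustrative assumption, not an estimate from the series.

```r
phi <- 0.6   # illustrative AR(1) coefficient, not estimated from the course data

# Theoretical ACF at lags 0..5: geometric decay phi^k
acf_th  <- ARMAacf(ar = phi, lag.max = 5)

# Theoretical PACF at lags 1..5: phi at lag 1, then zero
pacf_th <- ARMAacf(ar = phi, lag.max = 5, pacf = TRUE)

round(acf_th, 4)
round(pacf_th, 4)
```

A sample ACF and PACF will only match this pattern approximately, up to sampling noise.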
Modl<-ar(diff(DATA$Val),order.max=20)
plot(Modl$aic)
Let’s try to fit an autoregressive model and look at the AIC by order
Order 1 brings a significant improvement in the fit
> ar(diff(DATA$Val),order.max=20)
Call:
ar(x = diff(DATA$Val), order.max = 20)
Coefficients:
1 2 3
0.5925 -0.1669 0.1385
Order selected 3 sigma^2 estimated as 0.8514
> ARDif<-diff(DATA$Val)
> ARDif[1]
[1] 0.3757723
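We have our coefficients and the innovation variance (sigma^2 is a variance, not a standard deviation)
ar() selects order 3 by AIC; keeping only the first coefficient gives a rough AR(1) approximation (refit with order.max=1, aic=FALSE for the proper AR(1) estimate)
We know the first term of our differenced series, 0.3757723: it is a starting value, not the constant of the model
$y_t - \mu = 0.5925 (y_{t-1} - \mu) + \varepsilon_t$, with $\mu$ the sample mean of the differenced series (Modl$x.mean)
Here is our model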
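The link between the mean of the series and the constant of the recursion is $c = \mu (1 - \varphi)$. The sketch below illustrates it on simulated data (illustrative parameters, not the course series). It relies on the fact that the coefficient labelled `intercept` by `stats::arima` is the estimated mean of the series.

```r
set.seed(7)
# Simulated AR(1) around a mean of 2 (illustrative values, not the course data)
y <- 2 + arima.sim(model = list(ar = 0.6), n = 500)

fit <- arima(y, order = c(1, 0, 0))
phi_hat <- unname(coef(fit)["ar1"])
mu_hat  <- unname(coef(fit)["intercept"])  # in stats::arima this is the series mean

# Constant of the recursion y_t = c + phi * y_{t-1} + eps_t
c_hat <- mu_hat * (1 - phi_hat)

# One-step-ahead forecast from the recursion, compared with predict()
manual <- c_hat + phi_hat * y[length(y)]
auto   <- as.numeric(predict(fit, n.ahead = 1)$pred)
```

Under these assumptions `manual` and `auto` should agree up to numerical precision.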
Need to test the residuals
Box.test(Modl$resid)
Box-Pierce test
data: Modl$resid
X-squared = 7e-04, df = 1, p-value = 0.9789
H0 is not rejected: the residuals are consistent with independently distributed white noise
The differenced series can be modelled as an AR(1)
> predict(arima(diff(DATA$Val), order = c(1,0,0)), n.ahead = 7)
$pred
Time Series:
Start = 193
End = 199
Frequency = 1
[1] -0.81359048 -0.43300609 -0.22850452 -0.11861853 -0.05957287 -0.02784553 -0.01079729
$se
Time Series:
Start = 193
End = 199
Frequency = 1
[1] 0.923352 1.048210 1.081582 1.091027 1.093739 1.094521 1.094747
[Figure: line chart; x-axis index 1 to 196, y-axis 80 to 120]
Another typical example?
Your turn to comment!
DATA<-read.csv(file="C:/Users/vin/Desktop/Series2.csv",header=TRUE)
plot(DATA$Ser2, type="l")
hist(DATA$Ser2, breaks=20)
> Box.test(DATA$Ser2)
Box-Pierce test
data: DATA$Ser2
X-squared = 149.9227, df = 1, p-value < 2.2e-16
TReg<-lm(DATA$Ser2~DATA$t)
plot(DATA$Ser2, type="l")
abline(TReg, col="blue")
> eps<-resid(TReg)
> Box.test(eps)
Box-Pierce test
data: eps
X-squared = 148.5669, df = 1, p-value < 2.2e-16
> layout(matrix(1:4,2,2))
> plot(TReg)
> lag.plot(DATA$Ser2, 9, do.lines=FALSE)
Much less obvious, but there are signs of autoregression
par(mfrow=c(2,1))
plot(diff(DATA$Ser2), type="l")
plot(diff(DATA$Ser2, lag=2), type="l")
par(mfrow=c(2,1))
acf(diff(DATA$Ser2),20)
pacf(diff(DATA$Ser2),20)
ACF decays in steps of 2 lags
PACF cuts off after lag 2
First-order differencing, strong signs of an AR(2)
par(mfrow=c(1,1))
Modl<-ar(diff(DATA$Ser2),order.max=20)
plot(Modl$aic)
Parameter estimation
> ar(diff(DATA$Ser2),order.max=20)
Call:
ar(x = diff(DATA$Ser2), order.max = 20)
Coefficients:
1 2 3
0.5919 -0.8326 0.1086
Order selected 3 sigma^2 estimated as 0.877
> ARDif<-diff(DATA$Ser2)
> ARDif[1]
[1] 0.3757723
> predict(arima(diff(DATA$Ser2), order = c(2,0,0)), n.ahead = 7)
$pred
Time Series:
Start = 193
End = 199
Frequency = 1
[1] 0.4505213 2.0075741 0.6639701 -1.2321156 -1.1409989 0.3866745 1.0879588
$se
Time Series:
Start = 193
End = 199
Frequency = 1
[1] 0.9220713 1.0332515 1.1413067 1.2938326 1.2957576 1.3932158 1.4080266
[Figure: line chart; x-axis index 1 to 196, y-axis 80 to 115]
The more parameters there are, the harder the prediction is
> Box.test(Modl$resid)
Box-Pierce test
data: Modl$resid
X-squared = 0.0023, df = 1, p-value = 0.9619
Model accepted
The more parameters there are, the more stationary the series needs to be for a good prediction
Moving Average Process
Stationary series with autocorrelated errors
Main principle
$X_t = \mu + Z_t + \varphi_1 Z_{t-1} + \varphi_2 Z_{t-2} + \dots + \varphi_n Z_{t-n}$
$\varphi_1, \dots, \varphi_n$: parameters of the model
$Z_t$: white noise
More difficult to estimate than an AR(n)
MA(n)
plot(Data, type="l")
hist(Data, breaks=20)
acf(Data,20)
pacf(Data,20)
ACF & PACF suggest MA(1)
ACF cuts off after lag 1
PACF decays to 0
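This pattern mirrors the theoretical ACF and PACF of an MA(1), which `ARMAacf` returns directly. The coefficient 0.5 is an illustrative assumption, not the fitted value.

```r
theta <- 0.5   # illustrative MA(1) coefficient (written phi_1 on the MA slide)

# Theoretical ACF at lags 0..4: only lag 1 is non-zero
acf_th <- ARMAacf(ma = theta, lag.max = 4)

# Closed form for lag 1: theta / (1 + theta^2)
rho1 <- theta / (1 + theta^2)

# Theoretical PACF at lags 1..4: shrinks towards 0 instead of cutting off
pacf_th <- ARMAacf(ma = theta, lag.max = 4, pacf = TRUE)

round(acf_th, 4)
round(pacf_th, 4)
```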
> Rslt<-arima(Data, order = c(0, 0, 1),include.mean = FALSE)
> Rslt
Call:
arima(x = Data, order = c(0, 0, 1), include.mean = FALSE)
Coefficients:
ma1
-0.4621
s.e. 0.0903
sigma^2 estimated as 0.937: log likelihood = -138.76, aic = 281.52
> Box.test(Rslt$residuals)
Box-Pierce test
data: Rslt$residuals
X-squared = 0, df = 1, p-value = 0.9967
It works: MA(1), zero mean, parameter -0.4621
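# Forecast 5 steps ahead from the fitted model
Fore<-predict(Rslt, n.ahead=5)
# Approximate 95% bounds: forecast plus or minus 2 standard errors
U = Fore$pred + 2*Fore$se
L = Fore$pred - 2*Fore$se
# Vertical range covering both the data and the bounds
minx=min(Data,L)
maxx=max(Data,U)
# Plot the series (black), the forecast (red) and the bounds (dashed blue)
ts.plot(Data,Fore$pred,col=1:2,
ylim=c(minx,maxx))
lines(U, col="blue", lty="dashed")
lines(L, col="blue", lty="dashed")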
Another typical example?
Your turn to comment!
plot(Data, type="l")
hist(Data, breaks=20)
> Rslt<-arima(Data, order = c(0, 0, 2),include.mean = FALSE)
> Rslt
Call:
arima(x = Data, order = c(0, 0, 2), include.mean = FALSE)
Coefficients:
ma1 ma2
-0.5365 0.6489
s.e. 0.0701 0.1044
sigma^2 estimated as 1.005: log likelihood = -142.74, aic = 291.48
> Box.test(Rslt$residuals)
Box-Pierce test
data: Rslt$residuals
X-squared = 0.0283, df = 1, p-value = 0.8664
MA(2)
Fore<-predict(Rslt, n.ahead=5)
U = Fore$pred + 2*Fore$se
L = Fore$pred - 2*Fore$se
minx=min(Data,L)
maxx=max(Data,U)
ts.plot(Data,Fore$pred,col=1:2,
ylim=c(minx,maxx))
lines(U, col="blue", lty="dashed")
lines(L, col="blue", lty="dashed")
ARMA
The series is a function of past values plus current and past values of the noise
Main principle
ARMA(p,q)
Combines AR(p) & MA(q)
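Written out, the combination could look as follows. This is a sketch in standard notation rather than the formula from the slide: the symbols $\theta_j$ for the moving average coefficients are an assumption introduced here to keep them distinct from the autoregressive $\varphi_i$ (the MA slide calls its coefficients $\varphi$), and $Z_t$ is white noise.

```latex
X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + Z_t + \sum_{j=1}^{q} \theta_j Z_{t-j}
```

Setting $q = 0$ gives back the AR(p) model, and setting $p = 0$ gives the MA(q) model with $c = \mu$.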
plot(Data, type="l")
hist(Data, breaks=20)
Both the ACF and the PACF decay exponentially after lag 1
> Rslt<-arima(Data, order = c(1, 0, 1),include.mean = FALSE)
> Rslt
Call:
arima(x = Data, order = c(1, 0, 1), include.mean = FALSE)
Coefficients:
ar1 ma1
0.7214 0.7563
s.e. 0.0716 0.0721
sigma^2 estimated as 0.961: log likelihood = -141.13, aic = 288.27
> Box.test(Rslt$residuals)
Box-Pierce test
data: Rslt$residuals
X-squared = 0.0098, df = 1, p-value = 0.9213
ARMA(1,1) fits
> par(mfrow=c(1,1))
> Fore<-predict(Rslt, n.ahead=5)
> U = Fore$pred + 2*Fore$se
> L = Fore$pred - 2*Fore$se
> minx=min(Data,L)
> maxx=max(Data,U)
Identification can get tricky at this stage
What do you think?
> Rslt<-arima(Data, order = c(4, 0, 3),include.mean = FALSE)
> Rslt
Call:
arima(x = Data, order = c(4, 0, 3), include.mean = FALSE)
Coefficients:
ar1 ar2 ar3 ar4 ma1 ma2 ma3
0.2722 -0.5276 0.0202 -0.2663 0.8765 -0.4672 -0.5248
s.e. 0.2018 0.2308 0.1968 0.1546 0.1992 0.1690 0.1882
sigma^2 estimated as 1.140: log likelihood = -151.19, aic = 318.38
> Box.test(Rslt$residuals)
Box-Pierce test
data: Rslt$residuals
X-squared = 0.2953, df = 1, p-value = 0.5869
The data had been simulated with these parameters:
Data<-arima.sim(model=list(ar=c(0.5,-0.5,0.3,-0.3),ma=c(0.75,-0.5,-0.5)),n=100)
The fit was supposed to recover them pretty well…
Identification can be difficult
The easiest model to identify is AR
Imagine when the series is not stationary…
Step-by-step approach: exploration, trial and error, …
Sometimes you find a satisfactory model
Sometimes you don’t!
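One way to organise the trial and error is to fit a small grid of candidate orders and compare their AIC. This is a sketch under assumptions, not the procedure used in the session: the series is simulated with illustrative parameters, the grid is limited to orders 0 to 2, and the lowest AIC is only a starting point that still needs a residual check.

```r
set.seed(2012)
# Simulated ARMA(1,1) series (illustrative parameters, not the course data)
x <- arima.sim(model = list(ar = 0.7, ma = 0.4), n = 300)

# Try every (p, q) with p, q in 0..2 and record the AIC of each fit
grid <- expand.grid(p = 0:2, q = 0:2)
grid$aic <- mapply(function(p, q) {
  fit <- tryCatch(arima(x, order = c(p, 0, q), include.mean = FALSE),
                  error = function(e) NULL)   # some orders may fail to fit
  if (is.null(fit)) NA_real_ else fit$aic
}, grid$p, grid$q)

# Candidate with the lowest AIC, then check its residuals
best <- grid[which.min(grid$aic), ]
fit_best <- arima(x, order = c(best$p, 0, best$q), include.mean = FALSE)
lb <- Box.test(residuals(fit_best), lag = 10, type = "Ljung-Box",
               fitdf = best$p + best$q)
print(grid)
print(lb)
```

The `fitdf` argument removes one degree of freedom per estimated ARMA coefficient, which is the usual adjustment when the test is applied to model residuals.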
Conclusion
AR
MA
ARMA
Time series