Time Series Analysis by MATLAB AMS 586 Time Series Analysis Professor Wei Zhu Group Project by Jisun Son, Jesse Colton, Xin Li, Chen Li, Yasheng Xu, and Lizhen Peng April 14, 2015
Dec 21, 2015
Outline
• Introduction to MATLAB
• ARIMA
• ARCH-GARCH
• VAR
• Regression model with ARIMA errors
MATLAB
Introduction and the Basics
By Jisun
Contents
1. Introduction
2. Basic Coding
3. Graphics and Functions
4. Basic rules
Introduction of MATLAB-1
History
1. Name: MATLAB stands for "matrix laboratory"
2. First version: 1980, by Cleve Moler
MathWorks
1. Founded in 1984
2. Develops MATLAB
What is MATLAB?
1. A numerical computing environment and fourth-generation programming language
2. Current popular releases: MATLAB R2009b, R2010b
Introduction of MATLAB-2
Advantages
1. Friendly interface (help ***, type ***)
2. Syntax similar to C++
3. Highly respected algorithms, ready to use
4. Powerful toolboxes
5. Excellent graphics facilities
6. MATLAB programs can be integrated with other languages, e.g. C, C++, Fortran, Java, VBA (Excel)
Disadvantages
1. Slow
2. Not well suited to large-scale problems
3. And more from your experience?
Introduction of MATLAB-3
• Main interface
Introduction of MATLAB-4
• Function browser (fx) and Help interface
Basic coding-1
Some useful commands
1. clc: clear the Command Window
2. clear: clear all variables in the workspace (clear all, or clear <variable name> for a single variable)
3. help <name>: display help information for a function or command
4. save <name>: save workspace variables into name.mat
5. load <name>: load all variables from name.mat into the workspace
>> save
Saving to: MATLAB.mat
>> load
Loading from: MATLAB.mat
6. ↑/↓: scroll through previously entered commands
7. what: list MATLAB-specific files in the current directory
8. whos: list all the variables in the current workspace, together with their size, bytes, class, etc.
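The save and load commands above can also be called in functional form, which is convenient inside scripts. A minimal sketch, assuming a temporary file name generated by tempname (the variable values are purely illustrative):

```matlab
r1 = 42;
r2 = [3, 5, 9];

fname = [tempname '.mat'];     % hypothetical temporary file name
save(fname, 'r1', 'r2');       % functional form of: save <name> r1 r2
clear r1 r2                    % remove both variables from the workspace
load(fname);                   % restores r1 and r2 from the file
delete(fname);                 % clean up the temporary file
```

After the load call, r1 and r2 are back in the workspace with their original values.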
Basic coding-2
1. MATLAB is case sensitive
>> r3 = pi;
>> R3
??? Undefined function or variable 'R3'.
2. Use three dots (...) to continue a command on the next line
>> r4 = datestr(datenum('Aug-17-2010')+...
14)
r4 =
31-Aug-2010
3. Use a semicolon (;) at the end of a line to stop commands from echoing to the screen
>> r2 = [3, 5, 9];
Without the semicolon, the result is printed:
>> r2 = [3, 5, 9]
r2 =
3 5 9
4. Use who and whos to check the user-defined variables
>> who
Your variables are:
r1 r2 r3 r4
>> whos
Name  Size   Bytes  Class   Attributes
r1    1x1    8      double
r2    1x3    24     double
r3    1x1    8      double
r4    1x11   22     char
Basic coding-3
• Variables can be viewed and edited in the Workspace browser
Basic coding-4
• The lookfor command finds all functions that are related to a given keyword
>> lookfor black-sholes
black-sholes not found.
• Use which to locate functions and files
>> which blsprice
C:\Program Files\MATLAB\R2010b\toolbox\finance\finance\blsprice.m
Basic coding-5
• Basic Math Operations
+  addition
-  subtraction
*  multiplication
/  division
^  power
• Frequently used built-in functions: sqrt, exp, log, log2, log10, sin, cos, tan, asin, acos, atan, ...
Basic coding-6
Variable Naming Rules
Legal names consist of any combination of letters, digits, and underscores, starting with a letter
• Example
Allowable names: Time2maturity, time_to_maturity, x1, optionprice
Non-allowable names: net-cost, 2pay, %a, _price
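These naming rules can be checked programmatically. A minimal sketch, assuming the built-in isvarname function is used as the test; the candidate names are the ones listed above:

```matlab
% Candidate names taken from the examples above
names = {'Time2maturity', 'time_to_maturity', 'x1', 'optionprice', ...
         'net-cost', '2pay', '%a', '_price'};

% isvarname returns true only for names MATLAB accepts as variable names
valid = cellfun(@isvarname, names);

for k = 1:numel(names)
    fprintf('%-18s valid = %d\n', names{k}, valid(k));
end
```

The first four names should be reported as valid and the last four as invalid.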
Basic coding-7
• Vectors
Row vector: elements separated by commas (,) or spaces
>> v5 = [1 3,sqrt(16)]
v5 =
1 3 4
Column vector: elements separated by semicolons (;) or new lines
>> v6 = [1;3
sqrt(25)]
v6 =
1
3
5
Basic coding-8
• Matrix
A vector is a special case of a matrix
To enter a matrix, just type it in row by row using the same syntax as for vectors
Size of a matrix (e.g. Q is an m-by-n matrix)
• size(Q)
ans = m n
• length(Q)
ans = max(m,n)
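A short sketch of size and length on a concrete matrix; the matrix Q here is an illustrative example, not taken from the slides:

```matlab
Q = [1 2 3; 4 5 6];   % 2-by-3 matrix, entered row by row
[m, n] = size(Q);     % m = number of rows, n = number of columns
len = length(Q);      % longest dimension, i.e. max(m, n)
v = Q(1, :);          % the first row is itself a 1-by-3 row vector
```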
Basic coding-9
• Addition and subtraction
Matrix Q1 + (-) Matrix Q2 → Q1 and Q2 must be the same size
Matrix Q + (-) scalar x = each element of Q + (-) x
• Product and division
Matrix Q * (/) scalar x = each element of Q * (/) x
Matrix Q1 * Matrix Q2 → Q1 is m-by-n, Q2 is n-by-k
Matrix Q1 / Matrix Q2 = Q1 * inv(Q2)
Matrix Q1 \ Matrix Q2 = inv(Q1) * Q2
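The operator rules above can be tried on small matrices. A minimal sketch with illustrative values (Q1, Q2, and b are assumptions chosen only so the results are easy to check by hand):

```matlab
Q1 = [4 1; 2 3];
Q2 = [1 2; 3 5];
b  = [1; 2];

S  = Q1 + Q2;        % element-wise sum; sizes must match
P  = Q1 * Q2;        % matrix product of a 2-by-2 with a 2-by-2
E  = Q1 .* Q2;       % element-wise product, shown for comparison
x1 = Q1 \ b;         % solves Q1*x = b
x2 = inv(Q1) * b;    % same solution via the explicit inverse
R1 = Q1 / Q2;        % equals Q1 * inv(Q2)
```

In practice the backslash form is usually preferred over calling inv, since it avoids forming the inverse explicitly.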
Graphic&function-1
Ex (diff, int)
MATLAB code:
>> syms a x;
>> f = sin(a*x);
>> dfx = diff(f,x);
>> dfa = diff(f,a);
>> f1 = x * log(1+x);
>> int1 = int(f1,x);
>> int2 = int(f1,x,0,1);
>> dfxx = diff(f,x,2);
Results:
dfx = a*cos(a*x)
dfa = x*cos(a*x)
int1 = x/2 - log(x + 1)/2 + x^2*(log(x + 1)/2 - 1/4)
int2 = 1/4
dfxx = -a^2*sin(a*x)
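The symbolic results can be cross-checked numerically without the Symbolic Math Toolbox. A minimal sketch, assuming the built-in integral function is available; it compares numerical quadrature of x*log(1+x) on [0,1] with the closed-form antiderivative reported as int1:

```matlab
f1 = @(x) x .* log(1 + x);          % integrand from the example
num_int2 = integral(f1, 0, 1);      % numerical value of int(f1,x,0,1)

% Closed-form antiderivative reported as int1
F = @(x) x/2 - log(x + 1)/2 + x.^2 .* (log(x + 1)/2 - 1/4);
sym_int2 = F(1) - F(0);             % definite integral from the antiderivative
```

Both values should agree with the symbolic result 1/4 up to numerical tolerance.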
Graphic&function-2
• mesh -> 3-D mesh surface
MATLAB programming
Basic Rules
- Use clear and clc at the beginning of the main program, not in subprograms
- Use a semicolon (;) to suppress printing to the screen
- Set “Current Folder” appropriately
References
• http://www.mathworks.com/help/matlab/entering-commands.html
• http://www.mathworks.com/help/matlab/matrices-and-arrays.html
• http://www.mathworks.com/help/matlab/operators-and-elementary-operations.html
ARIMA
By Jesse
Retrieving Stock Prices
• To retrieve the daily closing price for Starbucks (SBUX) we can use the following code:
c = yahoo;
sb = fetch(c,'SBUX','CLOSE','01/01/2013','12/31/2014');
format long g            % display in standard (non-scientific) notation
SB = flipud(sb);         % order data from oldest to newest
SB = SB(1:length(SB),2)  % keep only the closing price, drop the dates
lSB = log(SB)            % convert price to log price
• Then to plot the data we use:
plot(lSB,'r')            % plot in red, then label the graph
axis([1 505 3.95 4.45])
xlabel('Date','Fontname','Times New Roman','FontSize',15);
ylabel('Log Closing Price ($)','Fontname','Times New Roman','FontSize',15);
title('Log of Closing Price of Starbucks(SBUX)','Fontname', ...
    'Times New Roman','FontSize',16)
Retrieve Without Coding
• MATLAB has an easy GUI for everything! We will retrieve the daily closing prices of Dunkin Donuts (DNKN) in a different way:
– Start by downloading data from Yahoo Finance and saving it as a .csv or .xls file
– Click Import Data in MATLAB
– Choose the file
– Click Import Selection
– Choose your variable
– Choose your plot
– Customize the plot inside the plot window
• Close=log(Close) % converts to log price
Checking for Stationarity
• Augmented Dickey-Fuller test
adftest(lSB)
ans = 0
adftest(lDD)
ans = 0
• In MATLAB, hypothesis tests return either 0 (fail to reject the null hypothesis) or 1 (reject the null in favor of the alternative).
• Both ADF tests failed to reject $H_0$ (a unit root is present), so we treat both series as having a unit root, and both require at least first-order differencing.
First Order Difference
• To make each time series stationary, we take the first difference:
dSB = diff(lSB);   % diff() takes the first-order difference of the time series
dDD = diff(lDD);
figure
plot(dSB, 'r')
xlabel('Date','Fontname','Times New Roman','FontSize',15);
ylabel('Daily Returns ($)','Fontname','Times New Roman','FontSize',15);
title('Daily Log Returns of Starbucks(SBUX)','Fontname', ...
    'Times New Roman','FontSize',16);
figure             % a new figure, so that we produce 2 separate graphs
plot(dDD, 'b')     % the 'b' makes the plot blue
xlabel('Date','Fontname','Times New Roman','FontSize',15);
ylabel('Daily Returns ($)','Fontname','Times New Roman','FontSize',15);
title('Daily Log Returns of Dunkin Donuts(DNKN)','Fontname', ...
    'Times New Roman','FontSize',16);
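Differencing the log price is the same as taking the log of the price ratio, which is why the differenced series is read as a daily log return. A minimal sketch with hypothetical closing prices (the values in P are made up for illustration):

```matlab
P  = [50.0; 51.0; 50.5; 52.0];      % hypothetical closing prices
lP = log(P);                        % log prices
r  = diff(lP);                      % first difference: r(t) = lP(t+1) - lP(t)
r2 = log(P(2:end) ./ P(1:end-1));   % the same quantity as a log price ratio
```

Note that the differenced series has one fewer observation than the original.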
• The daily returns of Starbucks(SBUX) on the left and Dunkin Donuts(DNKN) on the right
• They both appear to be stationary, but I will perform an ADF test on each to be sure.
adftest(dSB)
ans = 1
adftest(dDD)
ans = 1
• Since both ADF tests returned 1, we reject $H_0$ and conclude that the differenced time series are now stationary.
• Next, we examine the ACF and PACF of Starbucks using autocorr() and parcorr():
figure
subplot(2,1,1)
autocorr(dSB)
title('Sample ACF of Starbucks(SBUX) Daily Log Returns','Fontname', ...
    'Times New Roman','FontSize',16);
subplot(2,1,2)
parcorr(dSB)
title('Sample PACF of Starbucks(SBUX) Daily Log Returns','Fontname', ...
    'Times New Roman','FontSize',16);
• We also check for seasonality:
periodogram(dSB)
Model Fitting
• There do not appear to be any dominant spikes in the periodogram, so we conclude that our model should not include any seasonality.
• None of the lags appear to have significant autocorrelation or partial autocorrelation, except perhaps lag 2. This may warrant further investigation.
• For now we will try to fit ARIMA(p,1,q) models only up to p=1 and q=1
References
• http://www.mathworks.com/help/datafeed/yahoo.fetch.html
• http://www.mathworks.com/help/MATLAB/ref/plot.html
• http://www.mathworks.com/help/econ/adftest.html
• http://www.mathworks.com/help/econ/autocorrelation-and-partial-autocorrelation.html
ARIMA Diagnostics
By Xin
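Model Diagnostics
Moving average process
A process $\{X_t\}$ is said to be a moving average process of order q (or MA(q) process) if
$X_t = \beta_0 Z_t + \beta_1 Z_{t-1} + \dots + \beta_q Z_{t-q}$,
where $\{\beta_i\}$ are constants and $\{Z_t\}$ is a purely random process with mean 0 and variance $\sigma_Z^2$.
Model Diagnostics
Autoregressive process
A process $\{X_t\}$ is said to be an autoregressive process of order p (or AR(p)) if
$X_t = \alpha_1 X_{t-1} + \dots + \alpha_p X_{t-p} + Z_t$,
where $\{Z_t\}$ is a purely random process with mean 0 and variance $\sigma_Z^2$.
Model Diagnostics
Autoregressive/moving-average process
A mixed autoregressive/moving-average process containing p AR terms and q MA terms is said to be an ARMA process of order (p, q). It is given by
$X_t = \alpha_1 X_{t-1} + \dots + \alpha_p X_{t-p} + Z_t + \beta_1 Z_{t-1} + \dots + \beta_q Z_{t-q}$.
Model Diagnostics
Autoregressive Integrated Moving Average
Let $W_t = \nabla^d X_t = (1 - B)^d X_t$; the general ARIMA process is of the form
$W_t = \alpha_1 W_{t-1} + \dots + \alpha_p W_{t-p} + Z_t + \beta_1 Z_{t-1} + \dots + \beta_q Z_{t-q}$.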
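To get a feel for these processes, they can be simulated with the base MATLAB filter function. A minimal sketch, assuming an AR coefficient alpha and an MA coefficient beta chosen purely for illustration, with z a purely random N(0,1) process:

```matlab
n     = 200;
z     = randn(n, 1);     % purely random process, mean 0, variance 1
alpha = 0.6;             % hypothetical AR coefficient
beta  = 0.4;             % hypothetical MA coefficient

x_ar   = filter(1, [1 -alpha], z);          % AR(1):     x_t = alpha*x_{t-1} + z_t
x_ma   = filter([1 beta], 1, z);            % MA(1):     x_t = z_t + beta*z_{t-1}
x_arma = filter([1 beta], [1 -alpha], z);   % ARMA(1,1): both terms combined
```

Plotting the three series, or passing them to autocorr and parcorr, shows the characteristic ACF and PACF patterns of each model.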
Model Diagnostics
% Fit AR(1)
Mdl1 = arima(1,0,0);
EstMdl1 = estimate(Mdl1,r.data);
% Diagnostic plots
[res1,v1,logL(1)] = infer(EstMdl1,r.data);
stdres1 = res1./sqrt(v1);            % standardized residuals
[~,pValue1] = lbqtest(stdres1,'lags',1:10);
figure
subplot(3,1,1) % residual plot
plot(stdres1)
axis tight
title('Standardized Residuals for AR(1)')
subplot(3,1,2) % QQ-plot
qqplot(stdres1);
title('QQ-Plot for Residuals of AR(1)')
subplot(3,1,3) % Ljung-Box plot
scatter(1:10,pValue1)
title('Ljung-Box Plot for Residuals of AR(1)')
xlabel('Lag')
ylabel('p-value')
Model Diagnostics
% Fit MA(1)
Mdl2 = arima(0,0,1);
EstMdl2 = estimate(Mdl2,r.data);
% Diagnostic plots
[res2,v2,logL(2)] = infer(EstMdl2,r.data);
stdres2 = res2./sqrt(v2);            % standardized residuals
[~,pValue2] = lbqtest(stdres2,'lags',1:10);
figure
subplot(3,1,1) % residual plot
plot(stdres2)
axis tight
title('Standardized Residuals for MA(1)')
subplot(3,1,2) % QQ-plot
qqplot(stdres2);
title('QQ-Plot for Residuals of MA(1)')
subplot(3,1,3) % Ljung-Box plot
scatter(1:10,pValue2)
title('Ljung-Box Plot for Residuals of MA(1)')
xlabel('Lag')
ylabel('p-value')
Model Diagnostics
% Fit ARMA(1,1)
Mdl3 = arima(1,0,1);
EstMdl3 = estimate(Mdl3,r.data);
% Diagnostic plots
[res3,v3,logL(3)] = infer(EstMdl3,r.data);
stdres3 = res3./sqrt(v3);            % standardized residuals
[~,pValue3] = lbqtest(stdres3,'lags',1:10);
figure
subplot(3,1,1) % residual plot
plot(stdres3)
axis tight
title('Standardized Residuals for ARMA(1,1)')
subplot(3,1,2) % QQ-plot
qqplot(stdres3);
title('QQ-Plot for Residuals of ARMA(1,1)')
subplot(3,1,3) % Ljung-Box plot
scatter(1:10,pValue3)
title('Ljung-Box Plot for Residuals of ARMA(1,1)')
xlabel('Lag')
ylabel('p-value')
Model Diagnostics
Read Results
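% Model selection
numObs = length(r.data);
[aic,bic] = aicbic(logL,[3;3;4],numObs);   % 3, 3, and 4 estimated parameters
% table() needs logL as a 3-by-1 column; use logL = logL(:) first if it is a row
modselect = table(logL,aic,bic,'RowNames',{'AR(1)','MA(1)','ARMA(1,1)'});
To see the result stored in a variable, type the variable name (here, modselect) at the command prompt.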
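The criteria returned by aicbic follow the standard definitions AIC = -2 logL + 2k and BIC = -2 logL + k log(N), where k is the number of estimated parameters and N the sample size. A minimal sketch that computes them by hand; the log-likelihood values and sample size below are hypothetical, not the project's results:

```matlab
logL     = [1300.2; 1300.1; 1300.9];  % hypothetical log-likelihoods
numParam = [3; 3; 4];                 % parameters per model, as above
numObs   = 503;                       % hypothetical sample size

aic = -2*logL + 2*numParam;             % Akaike information criterion
bic = -2*logL + numParam*log(numObs);   % Bayesian information criterion

[~, bestAIC] = min(aic);   % index of the model with the smallest AIC
[~, bestBIC] = min(bic);   % index of the model with the smallest BIC
```

The model with the smallest criterion value is preferred; BIC penalizes additional parameters more heavily than AIC whenever log(N) exceeds 2.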
Forecasting
% Get 10 extra data points
fprice = getYahooDailyData(symbol,'01/01/2013','15/01/2015','dd/mm/yyyy');
fp = timeseries(table2array(fprice.SBUX(:,5)),...
    datestr(table2array(fprice.SBUX(:,1))),...
    'Name','Starbucks Daily Close Price');
fp.TimeInfo.Format = 'mm/dd/yy';
fr = timeseries(diff(log(table2array(fprice.SBUX(:,5)))),...
    datestr(table2array(fprice.SBUX(2:end,1))),...
    'Name','Starbucks Daily Logged Return');
fr.TimeInfo.Format = 'mm/dd/yy';
T = getabstime(fr);
Forecasting
% 10-step-ahead forecasting (logged return)
[yF,yMSE] = forecast(EstMdl1,10,'Y0',r.data);
upper = yF + 1.96*sqrt(yMSE);   % upper bound of the 95% forecast interval
lower = yF - 1.96*sqrt(yMSE);   % lower bound of the 95% forecast interval
figure
plot(fr.Time(end-50:end),fr.Data(end-50:end),'Color',[.75,.75,.75])
hold on
h1 = plot(fr.Time(end-9:end),yF,'r','LineWidth',2);
h2 = plot(fr.Time(end-9:end),upper,'k--','LineWidth',1.5);
plot(fr.Time(end-9:end),lower,'k--','LineWidth',1.5)
title('10 Step ahead Forecast and 95% Forecast Interval for Logged Return')
legend([h1,h2],'Forecast','95% Interval','Location','NorthWest')
set(gca,'xtick',fr.Time(end-50:10:end),'xticklabel',T(end-50:10:end))
hold off
Forecasting
% 10-step-ahead forecasting (price)
logp = cumsum([log(ts.data(end)) yF']);   % cumulate forecast log returns from the last log price
p = exp(logp(2:end));                     % convert back to price
pu = p'.*exp(upper);
pl = p'.*exp(lower);
figure
plot(fp.Time(end-50:end),fp.Data(end-50:end),'Color',[.75,.75,.75])
hold on
h1 = plot(fp.Time(end-9:end),p,'r','LineWidth',2);
h2 = plot(fp.Time(end-9:end),pu,'k--','LineWidth',1.5);
plot(fp.Time(end-9:end),pl,'k--','LineWidth',1.5)
title('10 Step ahead Forecast and 95% Forecast Interval for Closing Price')
legend([h1,h2],'Forecast','95% Interval','Location','NorthWest')
set(gca,'xtick',fp.Time(end-50:10:end),'xticklabel',T(end-50:10:end))
hold off
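The price forecast above relies on the fact that log returns add up: the log price h steps ahead is the last log price plus the sum of the next h log returns. A minimal sketch of that step in isolation, with a hypothetical last price P0 and hypothetical forecast returns:

```matlab
P0 = 80;                          % hypothetical last observed price
yF = [0.010; -0.005; 0.002];      % hypothetical forecast log returns

logp = cumsum([log(P0); yF]);     % cumulative log price path
p    = exp(logp(2:end));          % forecast prices for steps 1 to 3

% Equivalent direct form: P0 times the running product of gross returns
p_check = P0 * cumprod(exp(yF));
```

Both forms give the same price path; the cumsum version mirrors the code used in the slides.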
Edit plot
ARCH-GARCH
By Chen
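• Let $Z_t$ be N(0,1). The process $(X_t)$ is an ARCH(q) process if it is strictly stationary and if it satisfies, for all $t$ and some strictly positive-valued process $(\sigma_t)$, the equations
$X_t = \sigma_t Z_t$, $\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i X_{t-i}^2$,
• where $\alpha_0 > 0$ and $\alpha_i \ge 0$, $i = 1, \dots, q$.
• ARCH models assume the variance of the current error term or innovation to be a function of the previous time periods' error terms.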
ARCH-GARCH Model
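• ARCH(q) has some useful properties. For simplicity, we show them for ARCH(1).
• Without loss of generality, let an ARCH(1) process be represented by $X_t = \sigma_t Z_t$, $\sigma_t^2 = \alpha_0 + \alpha_1 X_{t-1}^2$.
• Conditional mean: $E(X_t \mid \mathcal{F}_{t-1}) = \sigma_t E(Z_t) = 0$
• Unconditional mean: $E(X_t) = E[E(X_t \mid \mathcal{F}_{t-1})] = 0$
• So the $X_t$ have mean zero.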
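The zero-mean property can be illustrated by simulation. A minimal sketch of an ARCH(1) process in base MATLAB, assuming the standard form X_t = sigma_t Z_t with sigma_t^2 = alpha0 + alpha1 X_{t-1}^2; the parameter values are hypothetical:

```matlab
n      = 1000;
alpha0 = 0.2;                 % hypothetical parameter, alpha0 > 0
alpha1 = 0.5;                 % hypothetical parameter, 0 <= alpha1 < 1
z      = randn(n, 1);         % N(0,1) innovations

x      = zeros(n, 1);
sigma2 = zeros(n, 1);
sigma2(1) = alpha0 / (1 - alpha1);   % start at the unconditional variance
x(1)      = sqrt(sigma2(1)) * z(1);
for t = 2:n
    sigma2(t) = alpha0 + alpha1 * x(t-1)^2;   % conditional variance
    x(t)      = sqrt(sigma2(t)) * z(t);       % X_t = sigma_t * Z_t
end

sample_mean = mean(x);        % should be close to zero for large n
```

Plotting x shows volatility clustering: large values of |x| tend to be followed by large values, even though the series itself has mean zero.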
ARCH-GARCH Model
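ARCH-GARCH Model
• For ARCH(1), we can write it as
$X_t^2 = \alpha_0 + \alpha_1 X_{t-1}^2 + V_t$,
where $V_t = \sigma_t^2 (Z_t^2 - 1)$.
• If $E(X_t^4)$ is finite, then it is an AR(1) process for $X_t^2$.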
ARCH-GARCH Model
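• We can write the GARCH(1,1) as
$X_t^2 = \alpha_0 + (\alpha_1 + \beta_1) X_{t-1}^2 - \beta_1 V_{t-1} + V_t$,
where $V_t = \sigma_t^2 (Z_t^2 - 1)$ and $\sigma_t^2 = \alpha_0 + \alpha_1 X_{t-1}^2 + \beta_1 \sigma_{t-1}^2$.
• If $E(X_t^4)$ is finite, then it is an ARMA(1,1) process for $X_t^2$.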
ARCH-GARCH Model
In the real world, the variances of the error terms in the data may not be equal, and the error terms may reasonably be expected to be larger for some points or ranges of the data than for others. So we combine the ARMA model and the GARCH model, using ARMA to fit the mean and GARCH to fit the variance.
For example,
ARMA(1,1)-GARCH(1,1)
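One common way to write such a combined model is sketched below. The symbols $\mu$, $\phi$, $\theta$, $\varepsilon_t$, $\alpha_0$, $\alpha_1$, and $\beta_1$ are notation assumed here for illustration and may differ from the notation on the original slide:

```latex
\begin{aligned}
X_t &= \mu + \phi X_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}, \\
\varepsilon_t &= \sigma_t Z_t, \qquad Z_t \sim N(0,1), \\
\sigma_t^2 &= \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 .
\end{aligned}
```

The first line is the ARMA(1,1) equation for the conditional mean, and the last line is the GARCH(1,1) equation for the conditional variance of the innovation $\varepsilon_t$.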
ARCH-GARCH Model
Coding
%% Fit ARMA-GARCH model
%% AR(1)-GARCH(1,1) model
Mdl4 = arima('ARLags',1,'Variance',garch(1,1));
%% Here arima returns a model with additional options specified by one or more Name,Value pair arguments.
EstMdl4 = estimate(Mdl4,r.data);
%% EstMdl = estimate(Mdl,y) uses maximum likelihood to estimate the parameters of the ARIMA(p,d,q) model Mdl given the observed univariate time series y. EstMdl is an arima model that stores the result.
Coding
• You can get all you want from EstMdl4:
Coding
% Diagnostic plots
[res4,v4,logL4] = infer(EstMdl4,fr.data);
%% [E,V,logL] = infer(Mdl,Y,Name,Value) infers the ARIMA model residuals E and conditional variances V, and returns the loglikelihood objective function value logL, with additional options specified by Name,Value pair arguments.
%% Here we just get the residuals, the conditional variances, and the loglikelihood objective function value.
Coding
figure
stdres4 = res4./sqrt(v4);            % standardized residuals
[~,pValue4] = lbqtest(stdres4,'lags',1:10);
subplot(3,1,1) % residual plot
plot(stdres4)
axis tight
title('Standardized Residuals for AR(1)-GARCH(1,1) of SBUX')
subplot(3,1,2) % QQ-plot
qqplot(stdres4);
title('QQ-Plot for Residuals of AR(1)-GARCH(1,1) of SBUX')
subplot(3,1,3) % Ljung-Box plot
scatter(1:10,pValue4)
title('Ljung-Box Plot for Residuals of AR(1)-GARCH(1,1) of SBUX')
xlabel('Lag')
ylabel('p-value')
Coding
Coding
For SBUX
ARIMA(1,0,0) Model:
--------------------
Conditional Probability Distribution: Gaussian

Parameter    Value          Standard Error   t Statistic
-----------  -------------  ---------------  -----------
Constant      0.00082736    0.000526726       1.57076
AR{1}        -0.00527129    0.0535863        -0.0983702

GARCH(1,1) Conditional Variance Model:
----------------------------------------
Conditional Probability Distribution: Gaussian

Parameter    Value          Standard Error   t Statistic
-----------  -------------  ---------------  -----------
Constant      3.80613e-05   2.3244e-05        1.63746
GARCH{1}      0.63547       0.197561          3.21657
ARCH{1}       0.0896846     0.030469          2.94347
We can also read the result directly from the command window
Coding
For DNKN
ARIMA(1,0,0) Model:
--------------------
Conditional Probability Distribution: Gaussian

Parameter    Value          Standard Error   t Statistic
-----------  -------------  ---------------  -----------
Constant      0.000504465   0.000543407       0.928339
AR{1}        -0.0349977     0.0529296        -0.661211

GARCH(1,1) Conditional Variance Model:
----------------------------------------
Conditional Probability Distribution: Gaussian

Parameter    Value          Standard Error   t Statistic
-----------  -------------  ---------------  -----------
Constant      6.74193e-05   2.89125e-05       2.33184
GARCH{1}      0.494306      0.178745          2.76542
ARCH{1}       0.114903      0.0342274         3.35703
Forecasting
[yF2,yMSE2] = forecast(EstMdl4,10,'Y0',r.data);
upper2 = yF2 + 1.96*sqrt(yMSE2);   % upper bound of the 95% forecast interval
lower2 = yF2 - 1.96*sqrt(yMSE2);   % lower bound of the 95% forecast interval

figure
plot(fr.Time(end-50:end),fr.Data(end-50:end),'Color',[.75,.75,.75])
hold on
h1 = plot(fr.Time(end-9:end),yF2,'r','LineWidth',2);
h2 = plot(fr.Time(end-9:end),upper2,'k--','LineWidth',1.5);
plot(fr.Time(end-9:end),lower2,'k--','LineWidth',1.5)
title('10 Step ahead Forecast and 95% Forecast Interval for Logged Return of SBUX')
legend([h1,h2],'Forecast','95% Interval','Location','NorthWest')
set(gca,'xtick',fr.Time(end-50:10:end),'xticklabel',T(end-50:10:end))
hold off
Forecasting
Compared to the AR(1) Model
• From the figures, the forecast interval of AR(1)-GARCH(1,1) appears to be closer to the real data.
%% Compare with the AR(1) model using MSE:
MseMdl1 = immse(yF,fr.data(end-9:end));
MseMdl4 = immse(yF2,fr.data(end-9:end));
%% We use immse(x,y) to calculate the mean squared error between x and y. Here x is the 10-step forecast and y is the real data. yF is the forecast from AR(1), and yF2 is the forecast from AR(1)-GARCH(1,1).
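If immse is not available (it ships with the Image Processing Toolbox), the same quantity can be computed directly from its definition. A minimal sketch with hypothetical forecasts and realized values:

```matlab
yhat = [0.001; -0.002; 0.000; 0.003];    % hypothetical forecasts
y    = [0.004; -0.001; -0.002; 0.001];   % hypothetical realized returns

mse = mean((yhat - y).^2);   % mean squared forecast error
```

A smaller value indicates forecasts that are closer to the realized data on average.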
Compared to the AR(1) Model
• For SBUX
MseMdl1 = 2.7403e-04
MseMdl4 = 2.7431e-04
• For DNKN
MseMdl1 = 3.1458e-04
MseMdl4 = 3.1449e-04
• The mean squared error of the AR(1) forecast for SBUX is smaller than that of the AR(1)-GARCH(1,1) forecast, but the result for DNKN is the opposite.
• In fact, if we do a 20-step forecast, which is from 01/01/2013 to 30/01/2013, we find that the AR(1)-GARCH(1,1) model gives better results for both SBUX and DNKN.
• So we may say that the AR(1)-GARCH(1,1) model is better.
References
• [1] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.
• [2] Engle, R. F. "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation." Econometrica, Vol. 50, No. 4, 1982, pp. 987-1007.
• [3] Pankratz, A. Forecasting with Dynamic Regression Models. John Wiley & Sons, Inc., 1991.
• [4] Tsay, R. S. Analysis of Financial Time Series. 2nd ed. Hoboken, NJ: John Wiley & Sons, Inc., 2005.
• [5] http://www.mathworks.com/help/econ/regarima-class.html
• [6] http://www.mathworks.com/help/econ/regarima.forecast.html
VAR
By Yasheng
How to use MATLAB for Multivariate Time Series and Risk Management
Multivariate time series
• Granger Causality Test
• Cross-correlation
• VAR
• Multivariate GARCH Models
– Diagonal Vectorization (VEC) Model: Bollerslev, Engle, and Wooldridge (1988)
– Baba-Engle-Kraft-Kroner (BEKK) Model: Engle and Kroner (1995)
– Constant Conditional Correlation (CCC) GARCH Models: Bollerslev (1990)
– Dynamic Conditional Correlation (DCC) GARCH Models: Engle (2002)
– Rotated Dynamic Conditional Correlation (RDCC) Model: Diaa Noureldin, Neil Shephard, and Kevin Sheppard (2014). Introduces a new class of multivariate volatility models that is easy to estimate using covariance targeting, even with rich dynamics.
Multivariate time series
• Copula
– Fundamental: copulas of important dependence structure
– Implicit copulas: extracted from well-known distribution functions, but no closed-form expression exists
– Explicit copulas: have explicit closed-form expressions
• Principal Component Analysis and Factor Models
• MATLAB Toolbox:
– Statistics and Machine Learning Toolbox
– Signal Processing Toolbox
– Financial Toolbox
– Econometrics Toolbox
– Oxford Econometrics Toolbox (most professional toolbox for econometrics)
DNKN and SBUX
Cross-correlation
• The lag-$\ell$ cross-covariance matrix of $r_t$ is defined as $\Gamma_\ell = E[(r_t - \mu)(r_{t-\ell} - \mu)']$, where $\mu = E(r_t)$.
Cross-correlation
• Correlation coefficient = 0.5179, from corrcoef(logR1,logR2)
• Cross-correlation function: crosscorr(logR1,logR2,40,3)
• Here 40 is the number of lags, and 3 specifies the number of standard deviations of the sample XCF estimation error.
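The sample version of the lag-l cross-covariance matrix can be computed in base MATLAB without any toolbox. A minimal sketch, assuming the estimator that divides by the full sample size T and a small hypothetical T-by-2 matrix of returns R (the numbers are made up):

```matlab
R = [ 0.010  0.004;
     -0.020 -0.010;
      0.015  0.012;
      0.005 -0.003;
     -0.010 -0.006];               % hypothetical T-by-2 matrix of returns
T   = size(R, 1);
ell = 1;                           % lag
D   = R - repmat(mean(R), T, 1);   % demeaned series

% Gamma_ell(i,j) = (1/T) * sum over t of D(t,i) * D(t-ell,j)
Gamma0   = (D' * D) / T;
GammaEll = (D(ell+1:end, :)' * D(1:end-ell, :)) / T;

% Cross-correlation matrices: scale by the sample standard deviations
s      = sqrt(diag(Gamma0));
Rho0   = Gamma0   ./ (s * s');
RhoEll = GammaEll ./ (s * s');
```

The off-diagonal element Rho0(1,2) matches the value returned by corrcoef for the two columns, and RhoEll(i,j) measures how series i at time t relates to series j at time t minus ell.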
VAR - Official
• vgxset sets or modifies parameter values in a multivariate time series specification structure.
• vgxvarx estimates parameters of VAR and VARX models using maximum likelihood estimation.
– Syntax:
– Spec = vgxset('Name1',Value1,'Name2',Value2,...)
– Spec = vgxset('n',2,'nAR',1);
– [EstSpec,EstStdErrors,LLF,W] = vgxvarx(Spec,y);
• Drawback: in my opinion, it is hard to specify the right model with these functions.
VAR - Official
• Specification
– vartovec: vector autoregression (VAR) to vector error-correction model (VEC)
– vectovar: vector error-correction (VEC) to vector autoregression (VAR)
– vgxget: get VARMAX model specification parameters
– vgxset: set VARMAX model specification parameters
• Estimation
– egcitest: Engle-Granger cointegration test
– jcitest: Johansen cointegration test
– jcontest: Johansen constraint test
– vartovec: vector autoregression (VAR) to vector error-correction model (VEC)
– vectovar: vector error-correction (VEC) to vector autoregression (VAR)
– vgxar: convert VARMA model to VAR model
– vgxma: convert VARMA model to VMA model
– vgxvarx: estimate VARX model parameters
– vgxdisp: display VARMAX model parameters and statistics
– vgxqual: test VARMAX model for stability/invertibility
– vgxplot: plot VARMAX model responses
– vgxinfer: infer VARMAX model innovations
VAR - Professional
• vectorar estimates Pth order (regular and irregular) vector autoregressions. Its options include the ability to include or exclude a constant, choose the lag order, and specify which assumptions should be made when computing the covariance matrix of the estimated parameters. The parameter covariance matrix can be estimated under 4 sets of assumptions on the errors:
– Uncorrelated and homoskedastic
– Correlated and homoskedastic
– Uncorrelated and heteroskedastic
– Correlated and heteroskedastic
VAR - Professional
• Syntax:
– [parameters,stderr,tstat,pval,const,conststd,r2,errors,s2,paramvec,vcv] = vectorar(y,constant,lags,het,uncorr);
– het: scalar value of either 1 (assume heteroskedasticity) or 0 (assume homoskedasticity). The default value of this optional parameter is 1.
– uncorr: scalar value of either 0 (assume the errors are correlated) or 1 (assume no error correlation). The default value of this optional parameter is 0.
– [parameters,stderr,tstat,pval,const,conststd,r2,errors,s2,paramvec,vcv] = vectorar(y,1,1);
– parameters{1}
-0.0322 0.0175
-0.0867 0.0265
VAR - Professional
• To estimate a VAR(1) assuming homoskedastic and correlated errors:
parameters = vectorar(y,1,1,0);
parameters = vectorar(y,1,1,0,0);
• To estimate a VAR(1) assuming homoskedastic but uncorrelated errors:
parameters = vectorar(y,1,1,0,1);
• To estimate a VAR(1) assuming heteroskedastic but uncorrelated errors:
parameters = vectorar(y,1,1,[],1);
parameters = vectorar(y,1,1,1,1);
VAR - Order Selection
data = VAR_data(:,2:4);            % three series, so K = 3 and K^2 = 9
s2 = cell(13,1);
AIC = zeros(13,1);
HQIC = zeros(13,1);
BIC = zeros(13,1);
LR = zeros(13,1);
pval = zeros(13,1);
for lags = 0:12
    p = lags;
    disp('Data length after adjustment')
    T = length(data(12-lags+1:end,:)) - lags;
    disp(T)
    [~,~,~,~,~,~,~,~,s2{lags+1}] = vectorar(data(12-lags+1:end,:),1,1:p);
    s2_det = det(s2{lags+1});      % determinant of the residual covariance matrix
    AIC(lags+1) = log(s2_det) + 3 + 2 * 3^2 * p / T;
    HQIC(lags+1) = log(s2_det) + 3 + 2 * log(log(T)) * 3^2 * p / T; % Hannan-Quinn Information Criterion
    BIC(lags+1) = log(s2_det) + log(T) * 3^2 * p / T;
    if lags >= 1
        % likelihood ratio test of lag p-1 against lag p (9 restrictions)
        LR(lags+1) = (T - p * 3^2) * (log(det(s2{lags})) - log(det(s2{lags+1})));
        pval(lags+1) = 1 - chi2cdf(LR(lags+1),9);
    end
end
Granger Causality Test
• grangercause performs Granger causality testing in a VAR. Most of its options are identical to those of vectorar, and familiarity with vectorar is recommended.
• Syntax:
– [stat,pval] = grangercause(y,constant,lags,het,uncorr,inference)
– [stat,pval] = grangercause(y,1,1:2);
– GC statistic
0.3440 6.3615
5.6416 15.9695
– GC statistic pval
0.8420 0.0416
0.0596 0.0003
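The p-value matrix can be turned into test decisions with a single comparison. A minimal sketch, assuming a 5% significance level (the level is an assumption; the p-values are the ones reported above):

```matlab
pval = [0.8420 0.0416;
        0.0596 0.0003];      % p-values reported above
alpha  = 0.05;               % assumed significance level
reject = pval < alpha;       % true where the no-causality null is rejected
```

Entries equal to true mark the pairs for which the null of no Granger causality is rejected at the chosen level; which variable causes which depends on the row and column convention of grangercause, which should be checked in its help text.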
Granger Causality Test – Empirical Analysis
• Monica Billio, Mila Getmansky, Andrew W. Lo, and Loriana Pelizzon. "Econometric measures of connectedness and systemic risk in the finance and insurance sectors." Journal of Financial Economics, 2012.
• For the main analysis, we use monthly returns data for hedge funds, broker/dealers, banks and insurers,
Granger Causality Test – Empirical Analysis
• Granger-causality relationships are drawn as straight lines connecting two institutions, color-coded by the type of institution that is ‘‘causing’’ the relationship, i.e., the institution at date-t which Granger-causes the returns of another institution at date t+1. Green indicates a broker, red indicates a hedge fund, black indicates an insurer, and blue indicates a bank. Only those relationships significant at the 5% level are depicted.
Granger Causality Test – Empirical Analysis
• Empirical results suggest that the banking and insurance sectors may be even more important sources of connectedness than other parts, which is consistent with the anecdotal evidence from the recent financial crisis. The illiquidity of bank and insurance assets, coupled with the fact that banks and insurers are not designed to withstand rapid and large losses (unlike hedge funds), make these sectors a natural repository for systemic risk.
DCC GARCH Models
• Syntax:
– [parameters,ll,ht,vcv,scores,diagnostics] = dcc(data,dataasym,m,l,n,p,o,q,gjrtype,method,composite,startingvals,options);
– data: a T by K matrix of zero-mean residuals, or a K by K by T array of covariance estimators (e.g. realized covariance)
– m: order of symmetric innovations in the DCC model
– l: order of asymmetric innovations in the ADCC model
– n: order of lagged correlation in the DCC model
• Usage:
ave = mean(y);
data = bsxfun(@minus, y, ave);   % subtract the column means to get zero-mean residuals
[params,ll,ht,vcv,scores,diagnostics] = dcc(data,[],1,0,1);
DCC GARCH Models - Iteration
DCC GARCH Models
[T,K] = size(y);
h = zeros(K,T);
Rt = zeros(K,K,T);
rho = zeros(T,1);
for t=1:T
    h(:,t) = sqrt(diag(ht(:,:,t)));             % conditional standard deviations
end
for t=1:T
    Rt(:,:,t) = ht(:,:,t)./(h(:,t)*h(:,t)');    % conditional correlation matrix
end
for t=1:T
    rho(t,1) = Rt(1,2,t);                       % correlation between series 1 and 2
end
name = {'rho'};
dccts = fints(Date1,rho,name,1);
clearvars name;
plot(dccts,'r');
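The loops above convert each conditional covariance matrix into a correlation matrix by dividing by the outer product of the conditional standard deviations. A minimal self-contained sketch of that step, using a small hypothetical covariance array in place of the ht returned by dcc:

```matlab
K = 2;  T = 3;
ht = zeros(K, K, T);               % hypothetical conditional covariances
ht(:,:,1) = [4.0 2.0; 2.0 9.0];
ht(:,:,2) = [1.0 0.5; 0.5 1.0];
ht(:,:,3) = [0.04 -0.03; -0.03 0.09];

rho = zeros(T, 1);
for t = 1:T
    s      = sqrt(diag(ht(:,:,t)));   % conditional standard deviations
    Rt     = ht(:,:,t) ./ (s * s');   % conditional correlation matrix
    rho(t) = Rt(1, 2);                % correlation between series 1 and 2
end
```

For these inputs the three correlations are 1/3, 0.5, and -0.5, which can be verified by hand from the covariance entries.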
DCC GARCH Models
How to use MATLAB for Multivariate Time Series and Risk Management
Multivariate time series
• Granger Causality Test• Cross-correlation• VAR• Multivariate GARCH Models
– Diagonal Vectorization (VEC) Model - Bollerslev, Engle, and Wooldridge (1988)
– Baba-Engle-Kraft-Kroner (BEKK) Model - Engle and Kroner (1995)
– Constant Conditional Correlation (CCC) GARCH Models - Bollerslev (1990)
– Dynamic Conditional Correlation (DCC) GARCH Models - Engle (2002)
– Rotated Dynamic Conditional Correlation (RDCC) Model - Diaa Noureldin, Neil Shephard, Kevin Sheppard(2014) - introduces a new class of multivariate volatility models which is easy to estimate using covariance targeting, even with rich dynamics.
Multivariate time series
• Copula– Fundamental - copulas of important dependence structure
– Implicit copulas - extracted from well known distribution functions but no closed form expression exists
– Explicit copulas - have explicit closed form expressions
• Principal Component Analysis and Factor Models• MATLAB Toolbox:
– Statistics and Machine Learning Toolbox– Signal Processing Toolbox– Financial Toolbox– Econometrics Toolbox– Oxford Econometrics Toolbox (most professional toolbox for
Econometrics)
DNKN and SBUX
Cross-correlation
• The lag- cross-covariance matrix of is defined as
Cross-correlation
• Correlation coefficient = 0.5179, corrcoef(logR1,logR2)• Cross-correlation Function: crosscorr(logR1,logR2,40,3)• 3 specifies the number of standard deviations of the sample XCF estimation error.
VAR - Official
• vgxset sets or modifies parameter values in a multivariate time series specification structure.
• vgxvarx estimates parameters of VAR and VARX models using maximum likelihood estimation.– Syntax:– Spec = vgxset('Name1',Value1,'Name2',Value2,...)– Spec = vgxset('n',2,'nAR',1);– [EstSpec,EstStdErrors,LLF,W] = vgxvarx(Spec,y);
• Drawback: In my opinion, it is hard for people to define a right model when they use this function.
VAR - Official
• Specification– vartovec Vector autoregression (VAR) to
vector error-correction model (VEC)– vectovar Vector error-correction (VEC) to
vector autoregression (VAR)– vgxget Get VARMAX model specification
parameters– vgxset Set VARMAX model specification
parameters
• Estimation– egcitest Engle-Granger cointegration test– jcitest Johansen cointegration test– jcontest Johansen constraint test– vartovec Vector autoregression (VAR) to
vector error-correction model (VEC)
– vectovar Vector error-correction (VEC) to vector autoregression (VAR)
– vgxar Convert VARMA model to VAR model
– vgxma Convert VARMA model to VMA model
– vgxvarx Estimate VARX model parameters– vgxdisp Display VARMAX model
parameters and statistics– vgxqual Test VARMAX model for
stability/invertibility– vgxplot Plot VARMAX model responses– vgxinfer Infer VARMAX model innovations
VAR - Professional
• Estimates Pth order (regular and irregular) vector autoregressions. The options for vectorar include the ability to include or exclude a constant, choose the lag order, and to specify which assumptions should be made for computing the covariance matrix of the estimated parameters. The parameter covariance matrix can be estimated under 4 sets of assumptions on the errors:– Uncorrelated and Homoskedastic– Correlated and Homoskedastic– Uncorrelated and Heteroskedastic– Correlated and Heteroskedastic
VAR - Professional
• Syntax:– [parameters,stderr,tstat,pval,const,conststd,r2,errors,s2,pa
ramvec,vcv] = vectorar(y,constant,lags,het,uncorr);– het: Scalar value of either 1 (assume heteroskedasticity) or 0 (assume
homoskedasticity). The default value for this optional parameters is 1.– uncorr: Scalar value of either 0 (assume the errors are correlated) or 1
(assume no error correlation). The default value for this optional parameters is 0.
– [parameters,stderr,tstat,pval,const,conststd,r2,errors,s2,paramvec,vcv] = vectorar(y,1,1);
– parameters{1} • -0.0322 0.0175• -0.0867 0.0265
VAR - Professional
• To estimate a VAR(1) assuming homoskedastic and correlated errors• parameters = vectorar(y,1,1,0);• parameters = vectorar(y,1,1,0,0);• To estimate a VAR(1) assuming homoskedastic but uncorrelated errors• parameters = vectorar(y,1,1,0,1);• To estimate a VAR(1) assuming heteroskedastic but uncorrelated errors• parameters = vectorar(y,1,1,[],1);• parameters = vectorar(y,1,1,1,1);
VAR - Order Selection
• data = VAR_data(:,2:4);• s2 = cell(13,1);• AIC = zeros(13,1);• HQIC = zeros(13,1);• BIC = zeros(13,1);• LR = zeros(13,1);• pval = zeros(13,1);• for lags=0:12• p = lags;• disp('Data length after adjustment')• T = length(data(12-lags+1:end,:)) - lags;• disp(T)• [~,~,~,~,~,~,~,~,s2{lags+1}] = vectorar(data(12-
lags+1:end,:),1,1:p);• s2_det = det(s2{lags+1});• AIC(lags+1) = log(det(s2_det)) + 3 + 2 * 3^2 * p / T;• HQIC(lags+1) = log(det(s2_det)) + 3 + 2 *
log(log(T)) * 3^2 * p / T; % Hannan-Quinn Information Criterion
• BIC(lags+1) = log(det(s2_det)) + log(T) * 3^2 * p / T;
• if lags>=1• LR(lags + 1) = (T - p * 3^2) * (log(det(s2{lags})) -
log(det(s2{lags+1})));• pval(lags+1) = 1 - chi2cdf(LR(lags+1),9);• end• end
Granger Causality Test
• Granger causality testing in a VAR. Most of the choices in grangercause are identical to those in vectorar, and knowledge of the features of vectorar is recommended.
• Syntax:
– [stat,pval] = grangercause(y,constant,lags,het,uncorr,inference)
• Example:
– [stat,pval] = grangercause(y,1,1:2);
– GC statistic
   0.3440 6.3615
   5.6416 15.9695
– GC statistic pval
   0.8420 0.0416
   0.0596 0.0003
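As a small reading aid, the p-values printed above can be compared with a significance level to flag which entries reject the null of no Granger causality. The sketch below only applies the 5% threshold to the numbers shown; it makes no assumption about which index of the matrix denotes the causing variable, so check the grangercause help text for that convention.

```matlab
% p-values copied from the output above
pval = [0.8420 0.0416; 0.0596 0.0003];
alpha = 0.05;                 % significance level
reject = pval < alpha;        % true where the no-causality null is rejected
disp(reject)
```

With these numbers, both entries in the second column are significant at the 5% level and both entries in the first column are not.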
Granger Causality Test – Empirical Analysis
• Monica Billio, Mila Getmansky, Andrew W. Lo, and Loriana Pelizzon. "Econometric measures of connectedness and systemic risk in the finance and insurance sectors." Journal of Financial Economics, 2012.
• For the main analysis, we use monthly returns data for hedge funds, broker/dealers, banks, and insurers.
Granger Causality Test – Empirical Analysis
• Granger-causality relationships are drawn as straight lines connecting two institutions, color-coded by the type of institution that is "causing" the relationship, i.e., the institution at date t which Granger-causes the returns of another institution at date t+1. Green indicates a broker, red indicates a hedge fund, black indicates an insurer, and blue indicates a bank. Only those relationships significant at the 5% level are depicted.
Granger Causality Test – Empirical Analysis
• Empirical results suggest that the banking and insurance sectors may be even more important sources of connectedness than other sectors, which is consistent with the anecdotal evidence from the recent financial crisis. The illiquidity of bank and insurance assets, coupled with the fact that banks and insurers are not designed to withstand rapid and large losses (unlike hedge funds), makes these sectors a natural repository for systemic risk.
DCC GARCH Models
• Syntax:
– [parameters,ll,ht,vcv,scores,diagnostics] = dcc(data,dataasym,m,l,n,p,o,q,gjrtype,method,composite,startingvals,options);
– data - A T by K matrix of zero mean residuals -or- K by K by T array of covariance estimators (e.g. realized covariance)
– m - Order of symmetric innovations in DCC model
– l - Order of asymmetric innovations in ADCC model
– n - Order of lagged correlation in DCC model
• Example:
– ave = mean(y);
– data = bsxfun(@minus, y, ave);
– [params,ll,ht,vcv,scores,diagnostics] = dcc(data,[],1,0,1);
DCC GARCH Models - Iteration
DCC GARCH Models
[T,K] = size(y);
h = zeros(K,T);       % conditional standard deviations
Rt = zeros(K,K,T);    % conditional correlation matrices
rho = zeros(T,1);     % conditional correlation between series 1 and 2
for t = 1:T
    h(:,t) = sqrt(diag(ht(:,:,t)));
    Rt(:,:,t) = ht(:,:,t)./(h(:,t)*h(:,t)');
    rho(t,1) = Rt(1,2,t);
end
name = {'rho'};
dccts = fints(Date1,rho,name,1);
clearvars name;
plot(dccts,'r');
DCC and RDCC GARCH Models
• Apply DCC and RDCC GARCH models to 50 stocks in the Chinese financial market and the CSI 300 index
Regression with ARIMA Errors
By Lizhen
Regression Model with ARIMA Errors
• Definition: A model that explains the behavior of a response using a linear regression model with predictor data, though the errors have autocorrelation indicative of an ARIMA process.
• By default, the time series errors (also called unconditional disturbances) are independent, identically distributed, mean 0 Gaussian random variables.
• If the errors have an autocorrelation structure, then you can specify models for them. The models include:
I. moving average (MA)
II. autoregressive (AR)
III. mixed autoregressive and moving average (ARMA)
IV. integrated (ARIMA)
V. multiplicative seasonal (SARIMA)
The Model Formula
\[ y_t = c + X_t \beta + u_t \]
\[ a(L)\,A(L)\,(1 - L)^{D}\,(1 - L^{s})\,u_t = b(L)\,B(L)\,\varepsilon_t \]
A Simple Case (AR error)
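\[ y_t = \beta_0 + \beta_1 x_t + \varepsilon_t \]
with
\[ \varepsilon_t = \phi_1 \varepsilon_{t-1} + \phi_2 \varepsilon_{t-2} + w_t \quad \text{and} \quad w_t \sim \text{iid } N(0, \sigma^2) \]
Let
\[ \Phi(B) = 1 - \phi_1 B - \phi_2 B^2, \]
then we can write the AR model for errors as
\[ \Phi(B)\,\varepsilon_t = w_t \]
So the model can be written as
\[ y_t = \beta_0 + \beta_1 x_t + w_t / \Phi(B) \]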
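The AR-error case can be simulated in base MATLAB with filter, which applies the inverse of the AR polynomial to white noise. This is a sketch under assumed values: the regression coefficients beta0 and beta1 and the AR coefficients phi are hypothetical, and the error symbols follow the usual AR(2) notation rather than anything fixed by the slide.

```matlab
rng(3);
n = 200;
beta0 = 1; beta1 = 2;             % hypothetical regression coefficients
phi = [0.5 -0.3];                 % hypothetical stationary AR(2) coefficients
x = randn(n,1);                   % predictor
w = randn(n,1);                   % white noise w_t ~ N(0,1)

% Errors satisfy Phi(B) e_t = w_t with Phi(B) = 1 - phi1*B - phi2*B^2
e = filter(1, [1 -phi], w);
y = beta0 + beta1*x + e;          % response with AR(2) errors

% Check: applying Phi(B) to the errors recovers the white noise
wBack = filter([1 -phi], 1, e);
fprintf('max |wBack - w| = %.2e\n', max(abs(wBack - w)));
```

The two filter calls are inverses of each other, which is the code analogue of dividing by and multiplying by the AR polynomial.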
When to fit model with ARIMA errors?
Examining Whether This Model May be Necessary
• 1. Start by doing an OLS regression. Store the residuals.
• 2. Analyze the time series structure of the residuals to determine if they have an ARIMA structure, and which ARIMA structure is the most suitable one.
• 3. If the residuals from the ordinary regression appear to have an ARIMA structure, estimate this model and diagnose whether the model is appropriate.
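The first two steps can be sketched in base MATLAB as follows. The data are simulated with AR(1) errors (a hypothetical coefficient of 0.7) so the check has something to find, and the lag-1 sample autocorrelation with an approximate 95% band is a simple stand-in for a full ACF/PACF analysis.

```matlab
rng(4);
n = 300;
x = randn(n,1);
e = filter(1, [1 -0.7], randn(n,1));   % AR(1) errors, hypothetical phi = 0.7
y = 1 + 2*x + e;                       % hypothetical regression coefficients

% Step 1: OLS regression, store the residuals
Xmat = [ones(n,1), x];
b = Xmat \ y;
res = y - Xmat*b;

% Step 2: lag-1 sample autocorrelation of the residuals
rc = res - mean(res);
r1 = (rc(2:end)'*rc(1:end-1)) / (rc'*rc);
bound = 1.96/sqrt(n);                  % approximate 95% band under white noise
fprintf('r1 = %.3f, band = +/-%.3f\n', r1, bound);
if abs(r1) > bound
    disp('Residuals look autocorrelated: consider ARIMA errors.');
end
```

In practice the full residual ACF and PACF would be inspected to choose the error model's orders before moving to step 3.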
MATLAB Coding
Specify error models containing known coefficients to:
• Simulate responses using simulate.
• Estimate unknown coefficients with data using estimate.
• Forecast future observations using forecast.
Regression model with ARIMA errors
• Regression model with ARIMA(2,1,3) errors:
\[ y_t = u_t \]
\[ (1 - \phi_1 L - \phi_2 L^2)(1 - L)\,u_t = (1 + \theta_1 L + \theta_2 L^2 + \theta_3 L^3)\,\varepsilon_t \]
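A model of this shape with unknown coefficients can be created with the shorthand regARIMA(p,D,q) syntax. The sketch assumes the Econometrics Toolbox is installed; the coefficients are left as NaN placeholders to be filled in by estimate.

```matlab
% Requires Econometrics Toolbox
% Shorthand syntax: AR degree 2, integration degree 1, MA degree 3
Mdl = regARIMA(2,1,3);
disp(Mdl)       % AR and MA coefficients appear as NaN until estimated
```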
Modify a Regression Model with ARIMA Errors
• Regression model with ARIMA errors:
\[ y_t = 2 + X_t \begin{bmatrix} 1.5 \\ 0.2 \end{bmatrix} + u_t \]
\[ (1 - 0.2L - 0.3L^2)(1 - L)\,u_t = (1 + 0.1L)\,\varepsilon_t \]
where \varepsilon_t is Gaussian with variance 0.5.
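A fully specified model of this kind can be written with name-value arguments. The sketch below assumes the Econometrics Toolbox and reads the slide's numbers as an intercept of 2, regression coefficients 1.5 and 0.2, AR coefficients 0.2 and 0.3, an MA coefficient 0.1, one degree of integration, and innovation variance 0.5; the signs of the lag-polynomial terms follow the usual regARIMA convention and are an assumption here.

```matlab
% Requires Econometrics Toolbox
% AR coefficients {0.2, 0.3} correspond to (1 - 0.2L - 0.3L^2);
% MA coefficient 0.1 corresponds to (1 + 0.1L)
Mdl = regARIMA('Intercept',2,'Beta',[1.5 0.2], ...
    'AR',{0.2, 0.3},'D',1,'MA',0.1,'Variance',0.5);
disp(Mdl)

% Properties can be modified after creation with dot notation
Mdl.Variance = 0.25;   % hypothetical change, for illustration only
```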
Specify a Regression Model with SARIMA Errors
• Specify the following model:
\[ y_t = 1 + 6 X_t + u_t \]
\[ (1 - 0.2L)(1 - L)(1 - 0.5L^4 - 0.2L^8)(1 - L^4)\,u_t = (1 + 0.1L)(1 + 0.05L^4 + 0.01L^8)\,\varepsilon_t \]
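Seasonal terms are added through the SAR, SMA, and Seasonality arguments. This sketch assumes the Econometrics Toolbox and reads the slide as a quarterly model (seasonal lags 4 and 8) with intercept 1 and regression coefficient 6; the innovation variance of 1 is not shown on the slide and is an assumption.

```matlab
% Requires Econometrics Toolbox
Mdl = regARIMA('Intercept',1,'Beta',6, ...
    'AR',0.2,'MA',0.1,'D',1, ...                  % nonseasonal part
    'SAR',{0.5, 0.2},'SARLags',[4 8], ...         % seasonal AR at lags 4 and 8
    'SMA',{0.05, 0.01},'SMALags',[4 8], ...       % seasonal MA at lags 4 and 8
    'Seasonality',4, ...                          % seasonal differencing (1 - L^4)
    'Variance',1);                                % assumed innovation variance
disp(Mdl)
```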
Simulate responses
• Monte Carlo simulation of regression model with ARIMA errors
Syntax
• [Y,E] = simulate(Mdl,numObs)
Simulates one sample path of observations (Y) and innovations (E) from the regression model with ARIMA time series errors, Mdl. The software simulates numObs observations and innovations per sample path.
• [Y,E,U] = simulate(Mdl,numObs)
Additionally simulates unconditional disturbances, U.
• [Y,E,U] = simulate(Mdl,numObs,Name,Value)
Simulates sample paths with additional options specified by one or more Name,Value pair arguments.
Simulate with SARIMA
• Simulate paths of responses, innovations, and unconditional disturbances from a regression model with SARIMA(2,1,1) errors.
• Specify the model:
\[ y_t = X_t \begin{bmatrix} 1.5 \\ -2 \end{bmatrix} + u_t \]
\[ (1 - 0.2L - 0.1L^2)(1 - L)(1 - 0.01L^{12})(1 - L^{12})\,u_t = (1 + 0.5L)(1 + 0.02L^{12})\,\varepsilon_t \]
• Simulate and plot 500 paths with 25 observations each.
• Plot the 2.5th, 50th (median), and 97.5th percentiles of the simulated response paths.
• Plot a histogram of the simulated paths at time 20.
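The three steps above can be sketched as follows, assuming the Econometrics Toolbox (prctile may additionally require the Statistics and Machine Learning Toolbox). The predictor matrix X is random placeholder data, and the zero intercept, the innovation variance of 0.1, and the sign of the second regression coefficient are assumptions, since the slide does not state them unambiguously.

```matlab
% Requires Econometrics Toolbox
Mdl = regARIMA('AR',{0.2, 0.1},'MA',0.5,'D',1, ...
    'SAR',0.01,'SARLags',12,'SMA',0.02,'SMALags',12, ...
    'Seasonality',12,'Beta',[1.5; -2], ...
    'Intercept',0,'Variance',0.1);     % intercept and variance are assumed

rng(1);                                % reproducible simulation
numObs = 25; numPaths = 500;
X = randn(numObs,2);                   % hypothetical predictor data
[Y,E,U] = simulate(Mdl,numObs,'NumPaths',numPaths,'X',X);

% Plot all simulated response paths
figure; plot(Y); title('Simulated response paths');

% Plot the 2.5th, 50th, and 97.5th percentiles across paths at each time
pct = prctile(Y,[2.5 50 97.5],2);
figure; plot(pct);
legend('2.5%','Median','97.5%','Location','best');

% Histogram of the simulated responses at time 20
figure; histogram(Y(20,:),10); title('Response distribution at time 20');
```

Each column of Y, E, and U is one simulated path, so row 20 collects the 500 simulated values at time 20.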
Forecasting
• Forecast the Stationary Process Using Monte Carlo Simulations
• Regress the stationary, quarterly log GDP onto the CPI using a regression model with ARMA(1,1) errors, and forecast log GDP using Monte Carlo simulation.
• 1. Load the US Macroeconomic data set and preprocess the data.
• 2. Fit a regression model with ARMA(1,1) errors.
• 3. Infer unconditional disturbances.
• 4. Simulate 1000 paths with 15 observations each. Use the inferred unconditional disturbances as presample data.
• 5. Plot the simulation mean forecast and approximate 95% forecast intervals.
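The five steps might look like the sketch below. It assumes the Econometrics Toolbox and its bundled Data_USEconModel data set, with a table named DataTable containing GDP and CPIAUCSL (older releases may store the data under a different variable name). Differencing log GDP and the CPI for stationarity and holding out the last 15 quarters of CPI as future predictor values are assumptions about the preprocessing.

```matlab
% Requires Econometrics Toolbox
% 1. Load and preprocess: difference log GDP and CPI
load Data_USEconModel
dlogGDP = diff(log(DataTable.GDP));
dCPI = diff(DataTable.CPIAUCSL);
h = 15;                                  % forecast horizon
gdp = dlogGDP(1:end-h);                  % estimation sample
cpi = dCPI(1:end-h);
hoCPI = dCPI(end-h+1:end);               % predictor values over the horizon

% 2. Fit a regression model with ARMA(1,1) errors
Mdl = regARIMA('ARLags',1,'MALags',1);
EstMdl = estimate(Mdl,gdp,'X',cpi,'Display','off');

% 3. Infer unconditional disturbances
[~,u0] = infer(EstMdl,gdp,'X',cpi);

% 4. Simulate 1000 paths of 15 observations, using u0 as presample data
rng(1);
paths = simulate(EstMdl,h,'NumPaths',1000,'U0',u0,'X',hoCPI);

% 5. Plot the simulation mean and approximate 95% forecast intervals
fMean = mean(paths,2);
fInt = prctile(paths,[2.5 97.5],2);
figure; plot(1:h,fMean,'k',1:h,fInt,'r--');
legend('Mean forecast','95% interval bounds','Location','best');
```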
Summary
• Introduction to MATLAB
• ARIMA: MATLAB has many readily available functions for analyzing and visualizing data in order to determine the most accurate model.
• ARCH-GARCH
• VAR
• regARIMA: This part covered when and how to fit a regression with ARIMA/SARIMA errors. MATLAB is very capable at simulating ARIMA errors, fitting regressions with ARIMA errors, estimating coefficients, and forecasting.
Thank You!