Business Forecasting ECON2209 Slides 02 Lecturer: Minxian Yang BF-02 1 my, School of Economics, UNSW
Page 1: Econ2209 Week 2

Business Forecasting ECON2209

Slides 02

Lecturer: Minxian Yang

BF-02 1 my, School of Economics, UNSW

Page 2: Econ2209 Week 2

Ch.2 Brief Review

• Lecture Plan
  – A review of probability theory
  – A review of linear regressions
    • Simple linear regression and scatter plot
    • Multiple linear regression
    • Ordinary least squares
    • Statistical inference in linear regressions
    • Model selection
  – EViews
    • Quick start (more will be introduced gradually)
    • Understand EViews output

Page 3: Econ2209 Week 2

Ch.2 Brief Review

• A review of probability theory
  – Random variable (RV) X: a numerical description of the outcomes of a random experiment,
    eg. tomorrow's BHP share price; the outcome of the next election.
  – Probability distribution
    • cumulative distribution function (cdf): F(x) = Prob(X ≤ x)
    • probability density function (pdf): f(x) = derivative of F(x)
  – Expectation operation E[g(X)]: the average of g(X) weighted by the pdf/pmf,

$$
E[g(X)] =
\begin{cases}
\int_{-\infty}^{\infty} g(x)\, f(x)\, dx & \text{for continuous RV} \\
\sum_{\text{all } x} g(x)\, P(X = x) & \text{for discrete RV.}
\end{cases}
$$

[Figure: "Cumulative Distribution"; x-axis: x (from -4 to 4), y-axis: P(X <= x) (from 0 to 1)]
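The two cases of the expectation formula can be sketched numerically. This is a minimal illustration, not part of the original slides: the fair die, the standard normal density, the truncation to [-10, 10] and the function names are all assumptions chosen for the example.

```python
import math

def expect_discrete(g, pmf):
    """E[g(X)] for a discrete RV: sum of g(x) * P(X = x) over all x."""
    return sum(g(x) * p for x, p in pmf.items())

def expect_continuous(g, pdf, lo, hi, n=20_000):
    """E[g(X)] for a continuous RV: midpoint-rule approximation of the
    integral of g(x) * f(x) dx over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += g(x) * pdf(x)
    return total * h

# Hypothetical example 1: a fair six-sided die.
die = {k: 1 / 6 for k in range(1, 7)}
mean_die = expect_discrete(lambda x: x, die)                    # E(X)
var_die = expect_discrete(lambda x: (x - mean_die) ** 2, die)   # E[(X - mu)^2]

# Hypothetical example 2: standard normal density, truncated to [-10, 10].
def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

second_moment = expect_continuous(lambda x: x * x, std_normal_pdf, -10.0, 10.0)

print(mean_die, var_die, second_moment)
```

Choosing g(x) = x gives the mean and g(x) = (x - μ)² gives the variance, so the same operation covers both summary measures.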

Page 4: Econ2209 Week 2

Ch.2 Brief Review

• A review of probability theory
  – Mean and variance of a RV
    $\mu = E(X)$: measure of central location (a point prediction);
    $\sigma^2 = E[(X-\mu)^2]$: measure of dispersion.
  – Joint probability distribution of 2 RVs
    joint cdf: $F_{XY}(x,y) = P(X \le x, Y \le y)$
    joint pdf: $f_{XY}(x,y) = \partial^2 F_{XY}(x,y) / \partial x\, \partial y$ (the mixed partial derivative of the joint cdf).
  – Marginal probability distributions
    obtained by letting y (or x) go to infinity in $F_{XY}(x,y)$:
    $F_X(x) = F_{XY}(x,\infty)$ or $F_Y(y) = F_{XY}(\infty,y)$.

Page 5: Econ2209 Week 2

Ch.2 Brief Review

• A review of probability theory
  – Independence
    Two RVs X and Y are independent if $F_{XY}(x,y) = F_X(x)F_Y(y)$ or, equivalently, $f_{XY}(x,y) = f_X(x)f_Y(y)$.
    When independent, $E(XY) = E(X)E(Y)$.
  – Correlation between 2 RVs
    covariance: $\mathrm{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)]$
    correlation: $\rho = \mathrm{Cov}(X,Y)/(\sigma_X\sigma_Y)$
  – Conditional distribution of Y given X = x
    The conditional pdf is $f_{Y|X}(y|x) = f_{XY}(x,y)/f_X(x)$.
    When independent, conditional pdf = marginal pdf.
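The definitions of independence, covariance and correlation can be checked on a small discrete example. The joint pmf below is hypothetical and chosen only for illustration; the helper names are likewise assumptions.

```python
import math

# Hypothetical joint pmf of two discrete RVs: P(X = x, Y = y).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def marginal(joint, axis):
    """Marginal pmf of X (axis=0) or Y (axis=1), obtained by summing out the other RV."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out

def expect(joint, g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

px, py = marginal(joint, 0), marginal(joint, 1)
mu_x = expect(joint, lambda x, y: x)
mu_y = expect(joint, lambda x, y: y)
cov = expect(joint, lambda x, y: (x - mu_x) * (y - mu_y))
sd_x = math.sqrt(expect(joint, lambda x, y: (x - mu_x) ** 2))
sd_y = math.sqrt(expect(joint, lambda x, y: (y - mu_y) ** 2))
rho = cov / (sd_x * sd_y)

# Independence requires the joint pmf to equal the product of the marginals everywhere.
independent = all(abs(p - px[x] * py[y]) < 1e-12 for (x, y), p in joint.items())

print(cov, rho, independent)
```

In this example the joint pmf puts extra weight on the outcomes where X and Y agree, so the product rule fails and the correlation is positive.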

Page 6: Econ2209 Week 2

Ch.2 Brief Review

• A review of probability theory
  – Conditional expectation
    Given X, the conditional expectation of Y, E(Y|X), is the average of Y weighted by the conditional pdf.
  – A forecasting model is built on historical data or an observed information set.
  – The distribution of future Y conditional on the data, which utilises the available information, is the object we aim to understand.

X = history up to now (ie, data); Y = future value of interest. Then the conditional distribution pdf(Y|X) is all we need for forecasting Y.
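A small discrete sketch shows how conditioning on observed information changes the point forecast. The joint pmf, the interpretation of X as "today's state", and the function names are hypothetical, introduced only for this illustration.

```python
# Hypothetical joint pmf: X = today's state (0 or 1), Y = tomorrow's value.
joint = {(0, -1.0): 0.3, (0, 2.0): 0.2, (1, -1.0): 0.1, (1, 2.0): 0.4}

def conditional_pmf(joint, x):
    """pmf of Y given X = x: joint probability divided by the marginal P(X = x)."""
    fx = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / fx for (xi, y), p in joint.items() if xi == x}

def conditional_mean(joint, x):
    """E(Y | X = x): average of Y weighted by the conditional pmf."""
    return sum(y * p for y, p in conditional_pmf(joint, x).items())

unconditional_mean = sum(y * p for (_, y), p in joint.items())
forecast_given_0 = conditional_mean(joint, 0)
forecast_given_1 = conditional_mean(joint, 1)

print(unconditional_mean, forecast_given_0, forecast_given_1)
```

The forecast that uses X differs from the unconditional mean in both states, which is the sense in which the conditional distribution carries the information relevant for forecasting.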

Page 7: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – Simple linear regression model
    A dependent variable $y_t$ is linearly explained by an independent variable $x_t$ (regressor), subject to a random disturbance $\varepsilon_t$:

$$
y_t = \beta_0 + \beta_1 x_t + \varepsilon_t .
$$

  – Scatter plot (y vs x), eg. “xyz.dat”

X         Y         Z
-1.3044    6.9105   -1.7184
 0.0268    8.9035    1.6342
 0.9392   13.0322   -0.7165
 0.9108   12.1195    1.1775
 0.6078    9.1661    0.1522
-1.4191    9.3989   -0.9960
...

[Figure: scatter plot "Y vs X"; x-axis: X (from -3 to 4), y-axis: Y (from 5 to 14)]

Page 8: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – Assumptions about the disturbance (error term) $\varepsilon_t$ in $y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$:
    a) $\varepsilon_t \sim \text{iid}(0, \sigma^2)$: independently and identically distributed (iid) with mean 0 and variance $\sigma^2$;
    b) $\varepsilon_t$ is independent of the regressor $x_t$.
  – These imply
    • $E(\varepsilon_t | x_t) = 0$;
    • $\varepsilon_t$ and $x_t$ are uncorrelated, ie, $\mathrm{Cov}(\varepsilon_t, x_t) = 0$;
    • $E(y_t | x_t) = \beta_0 + \beta_1 x_t$ (population regression function).
  – If the parameters $(\beta_0, \beta_1, \sigma^2)$ are known, y may be predicted by the regression function for any given x.

Page 9: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – Ordinary least squares (OLS) estimation

$$
(\hat\beta_0, \hat\beta_1) \text{ minimises } \sum_{t=1}^{T} (y_t - \hat\beta_0 - \hat\beta_1 x_t)^2 ,
\qquad
\hat\sigma^2 = \frac{1}{T-2} \sum_{t=1}^{T} (y_t - \hat\beta_0 - \hat\beta_1 x_t)^2 = \frac{1}{T-2} \sum_{t=1}^{T} e_t^2 .
$$

    The minimised sum is the sum of squared residuals (SSR).
    $\hat\sigma^2$ is the sample variance of residuals, an unbiased estimator of the variance of the error term; its square root $\hat\sigma$ is the standard error of estimate.
  – Fitted value and residual

$$
\hat y_t = \hat\beta_0 + \hat\beta_1 x_t , \qquad e_t = y_t - \hat y_t = y_t - \hat\beta_0 - \hat\beta_1 x_t .
$$

    fitted value = sample regression function = in-sample “forecast”
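
For the simple regression, the minimiser of the SSR has a well-known closed form (slope = sample covariance of x and y divided by sample variance of x). The sketch below uses that closed form, which the slides do not state explicitly. The function name is an assumption, and the data are only the six rounded rows of xyz.dat listed on the earlier slide, so the estimates will differ from those based on the full sample.

```python
def ols_simple(x, y):
    """OLS for y_t = b0 + b1 * x_t + e_t using the closed-form solution."""
    T = len(x)
    xbar = sum(x) / T
    ybar = sum(y) / T
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                       # slope estimate
    b0 = ybar - b1 * xbar                # intercept estimate
    fitted = [b0 + b1 * xi for xi in x]  # in-sample "forecast"
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    ssr = sum(e * e for e in resid)      # sum of squared residuals
    sigma2 = ssr / (T - 2)               # sample variance of residuals
    return b0, b1, resid, ssr, sigma2

# First six (rounded) observations of X and Y from the xyz.dat excerpt.
x = [-1.3044, 0.0268, 0.9392, 0.9108, 0.6078, -1.4191]
y = [6.9105, 8.9035, 13.0322, 12.1195, 9.1661, 9.3989]

b0, b1, resid, ssr, sigma2 = ols_simple(x, y)
print(b0, b1, ssr, sigma2 ** 0.5)
```

With an intercept in the model, the residuals sum to zero by construction, which is a quick sanity check on any implementation.
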
Page 10: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  eg. “xyz.dat”

$$
\hat y_t = \hat\beta_0 + \hat\beta_1 x_t , \qquad \hat\beta_0 = 9.959319 , \quad \hat\beta_1 = 0.949749 .
$$

X               Y           Z
-1.3044055855    6.910518   -1.71836069
 0.0268445360    8.903510    1.63420482
 0.9391732559   13.032210   -0.71649462
 0.9108356314   12.119516    1.17746037
 0.6078185429    9.166077    0.15218715
-1.4190856454    9.398871   -0.99595505
 1.6450399496   10.861943    1.95295568
-1.5705456746    6.900604   -0.29073338
 1.1725971823    8.557908   -1.35916970
-0.0001469589   10.384578   -0.36981642
-1.9102214403    8.951113   -0.53368660
 0.4106547298   10.502956    0.01555975
-0.2633816925    9.012794    0.39533785
 2.9898141231   13.172481    0.62545839
-1.1719854886    7.294597    1.05345240
 1.8628263016   13.475142   -2.20815358
 1.1237612050   11.766178   -0.54770754
-1.8801405887    6.552416    1.50752962
 0.3094047477   11.078862   -1.85750582
-1.9164071601   11.977582   -2.60940240
 0.0443377940    9.175949    0.07977844
 2.1231233412   10.706387    0.70568532
-0.1525597566    8.633693    1.29592881
-0.0302621863    8.111663   -0.25520209
-0.5267709581    7.630387   -1.46954656
 1.1087466899   10.874706    0.56564230
-0.7870372302   11.471748   -0.29023287
 0.1217747790   10.913112   -1.09615388
-0.6272752055    8.379696   -0.47673444
 3.0619892478   11.959233    1.22468138
 1.0277405158   10.103590    1.03023708
 2.2274783060   11.681388   -0.12631043
-0.0633318382    9.716037   -1.02449478
-0.9195725833    8.469919    1.50979638
-0.4661417205    9.167781    0.19944663
 1.3575193910   10.512578    1.80739691
 1.1639896889   10.094872    0.56168127
 0.4570019947   12.046379   -0.69538078
-1.6182578476    9.805910   -1.20063045
-2.8131818585    5.811298    0.41839468
-0.1329263293   11.314932   -1.72367774
-0.5127844249   10.729959   -1.09610913
 1.3569336398   13.846940   -1.07509952
-0.8396330238   10.350224    0.23852197
 0.6548109587   10.837194    1.16375597
-0.2817164164    7.928579    0.50112855
 0.7701143610   10.755102    0.62021869
-0.0454431404   12.906690   -1.10814255

Residual plot:

[Figure: series Residual, Actual and Fitted plotted against observation number (5 to 45)]

Page 11: Econ2209 Week 2

Ch.2 Brief Review

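• A review of linear regression models
  – Multiple linear regression: y is explained by more than one x variable.

$$
y_t = \beta_0 + \beta_1 x_{1t} + \cdots + \beta_K x_{Kt} + \varepsilon_t , \qquad \varepsilon_t \sim \text{iid}(0, \sigma^2).
$$

  – Conditional mean (population regression function)

$$
E(y_t | x_{1t}, \ldots, x_{Kt}) = \beta_0 + \beta_1 x_{1t} + \cdots + \beta_K x_{Kt}
$$

  – OLS estimation

$$
(\hat\beta_0, \ldots, \hat\beta_K) \text{ minimises } \sum_{t=1}^{T} (y_t - \hat\beta_0 - \hat\beta_1 x_{1t} - \cdots - \hat\beta_K x_{Kt})^2 ,
\qquad
\hat\sigma^2 = \frac{1}{T-K-1} \sum_{t=1}^{T} e_t^2 \quad \text{(sample residual variance)}.
$$
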
Page 12: Econ2209 Week 2

Ch.2 Brief Review

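• A review of linear regression models
  – Sample regression function and residuals

$$
\hat y_t = \hat\beta_0 + \hat\beta_1 x_{1t} + \cdots + \hat\beta_K x_{Kt} , \qquad e_t = y_t - \hat y_t
$$

    eg. “xyz.dat”

$$
\hat y_t = \hat\beta_0 + \hat\beta_1 x_t + \hat\beta_2 z_t , \qquad \hat\beta_0 = 9.88 , \quad \hat\beta_1 = 1.07 , \quad \hat\beta_2 = -0.64 .
$$

When we say "regress y on [1, x1, x2]", we mean the OLS estimation of $y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \varepsilon_t$.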
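
"Regress y on [1, x1, ..., xK]" can be sketched in plain Python by solving the normal equations (X'X)b = X'y. This is one standard way to compute OLS and is an assumption of this sketch, not something the slides specify. The helper names and the small artificial data set are hypothetical.

```python
def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):                  # back substitution
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w

def ols(y, regressors):
    """Regress y on [1, x1, ..., xK]; `regressors` is a list of K series."""
    T = len(y)
    X = [[1.0] + [series[t] for series in regressors] for t in range(T)]
    k = len(X[0])                                   # k = K + 1 coefficients
    XtX = [[sum(X[t][i] * X[t][j] for t in range(T)) for j in range(k)] for i in range(k)]
    Xty = [sum(X[t][i] * y[t] for t in range(T)) for i in range(k)]
    beta = solve(XtX, Xty)
    resid = [y[t] - sum(b * v for b, v in zip(beta, X[t])) for t in range(T)]
    ssr = sum(e * e for e in resid)
    sigma2 = ssr / (T - k)                          # SSR / (T - K - 1)
    return beta, resid, ssr, sigma2

# Hypothetical data generated exactly from y = 2 + 3x - z (no disturbance).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
z = [1.0, 0.0, 1.0, 0.0, 2.0]
y = [2 + 3 * xi - zi for xi, zi in zip(x, z)]

beta, resid, ssr, sigma2 = ols(y, [x, z])
print(beta, ssr)
```

Because the artificial y contains no disturbance, the estimates should recover the coefficients used to generate it, up to floating-point error.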

Page 13: Econ2209 Week 2

Ch.2 Brief Review

• Quick start with EViews
  – Preparation to use EViews
    1) Create a folder on your USB drive, say, F:\BF
    2) Download xyz.dat and save it in F:\BF
    3) Launch EViews, select Options, General Options, File Locations, set Current Data Path as F:\BF, OK
    4) In EViews, click File, Open, Foreign data as Workfile, xyz.dat (in File name), Open, Finish
    5) A workfile is created and you will see the Workfile: XYZ window.
    6) Read: Help, User Guide I, Part I, Chapter 2

Note: Download bfData13.zip and unzip it into F:\BF

Page 14: Econ2209 Week 2

Ch.2 Brief Review

• Quick start with EViews (menu-driven approach)

1) Create a Workfile (clicks/keys)
   File, New, Workfile, Dated-regular frequency, integer date (in Frequency), 1 (in Start date), 48 (in End date), OK
2) Read data file (in Workfile window)
   Proc, Import, Read…, xyz.dat (in File name), Open, x y z (separated by a space in Names for series), OK
3) Find summary statistics (type hist y)
   Quick, Series Statistics, Histogram and Stats, y (in Series name), OK
4) Generate new series (type genr w=x+y+z)
   Quick, Generate Series, w = x+y+z (in Enter equation), OK
5) Graph data (type scat x y)
   Quick, Graph, x y (separated by a space in List of series), OK, Scatter (in specific), Regression Line (in Fit lines), OK
6) OLS estimation (type ls y c x z)
   Quick, Estimate Equation, y c x z (in Equation specification), OK

Steps 1) and 2) are an alternative to Step 4) of page 13. They are necessary for seasonality.

Page 15: Econ2209 Week 2

Ch.2 Brief Review


Residual plot and many test statistics may be viewed in the “View” menu. Press “Stats” to see this table again.

This top panel can be used to carry out commands directly. e.g. Type plot x y z and press Enter.

Page 16: Econ2209 Week 2

Ch.2 Brief Review

[Figure: scatter plot "Y vs. X"; x-axis: X (from -3 to 4), y-axis: Y (from 5 to 14)]

[Figure: scatter plot "Y vs. Z"; x-axis: Z (from -3 to 2), y-axis: Y (from 5 to 14)]

[Figure: line plot of the series X, Y and Z against observation number (5 to 45); vertical axis from -4 to 16]

Page 17: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – Distribution of the OLS estimator $\hat\beta$ is approximately normal, with mean being the true parameter and standard deviation (std error) being reported:

$$
(\hat\beta - \beta)/\mathrm{sd}(\hat\beta) \sim t(T-K-1) \approx N(0,1)
$$

    eg. The std error for $\hat\beta_1$ is estimated as $\mathrm{se}(\hat\beta_1) = 0.150341$.
    Test the null hypothesis $\beta_1 = 1$: $(\hat\beta_1 - 1)/\mathrm{se}(\hat\beta_1) = 0.486$. Since 0.486 is well below 2 in absolute value, the null cannot be rejected.
  – Approximate 95% confidence interval for $\beta$:

$$
\mathrm{CI} = \hat\beta \pm 2 \cdot \mathrm{se}(\hat\beta)
$$

    eg. CI ≈ 1.07 ± 2(0.15) = [0.77, 1.37].
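
The test ratio and the approximate interval are simple arithmetic, sketched below with the figures quoted on the slide. The function names and the "reject if the ratio exceeds 2 in absolute value" rule of thumb are assumptions of this sketch, matching the ±2·se interval.

```python
def t_ratio(beta_hat, beta_null, se):
    """(estimate - hypothesised value) / standard error."""
    return (beta_hat - beta_null) / se

def approx_ci95(beta_hat, se):
    """Approximate 95% confidence interval: estimate +/- 2 standard errors."""
    return beta_hat - 2 * se, beta_hat + 2 * se

# Rounded figures from the slide: estimate 1.07, se 0.150341 (0.15 for the CI).
t = t_ratio(1.07, 1.0, 0.150341)
lo, hi = approx_ci95(1.07, 0.15)
reject = abs(t) > 2        # rule of thumb consistent with the +/- 2 se interval

print(round(t, 2), (round(lo, 2), round(hi, 2)), reject)
```

With the rounded estimate 1.07 the ratio comes out at roughly 0.47 rather than the 0.486 on the slide, which presumably uses the unrounded estimate; the conclusion is the same. The hypothesised value 1 lies inside the interval, which is the same decision expressed differently.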

Page 18: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – t-statistic and probability value (p-value)

$$
\hat t = \hat\beta / \mathrm{se}(\hat\beta) , \qquad p\text{-value} = \mathrm{Prob}(|t| > |\hat t|) , \quad t \sim \text{Student's } t .
$$

    The p-value is the smallest significance level, ie, probability of making a Type-I error (rejecting a true H0), at which “H0: β = 0” is rejected.
    eg. t-stat for $\hat\beta_2$: -3.699; p-value for $\hat\beta_2$: 0.0006
  – Sum of squared residuals (useful to test restrictions)

$$
\mathrm{SSR} = \sum_{t=1}^{T} e_t^2 , \qquad e_t = y_t - \hat\beta_0 - \hat\beta_1 x_t - \hat\beta_2 z_t .
$$

    OLS minimizes SSR to estimate parameters. eg. SSR = 76.56223
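
A two-sided p-value can be approximated with the standard library by using the N(0,1) approximation to the t distribution mentioned earlier in these slides. This is only a sketch under that approximation: the reported p-value of 0.0006 is based on Student's t, whose tails are heavier, so the normal figure will be somewhat smaller. The function name is an assumption.

```python
from statistics import NormalDist

def two_sided_p_normal(t_hat):
    """Prob(|Z| > |t_hat|) for Z ~ N(0, 1), as an approximation to the t-based p-value."""
    return 2 * (1 - NormalDist().cdf(abs(t_hat)))

p_beta2 = two_sided_p_normal(-3.699)   # t-stat for beta_2 quoted on the slide
reject_at_5pct = p_beta2 < 0.05

print(p_beta2, reject_at_5pct)
```

Either way the p-value is far below conventional significance levels, so "H0: β2 = 0" is rejected.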

Page 19: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – Mean of the dependent variable (eg. 10.08241): sample mean of y; central location measure

$$
\bar y = \frac{1}{T} \sum_{t=1}^{T} y_t
$$

  – S.D. of the dependent variable (eg. 1.908842): sample SD of y; dispersion measure

$$
\mathrm{SD} = \sqrt{ \frac{1}{T-1} \sum_{t=1}^{T} (y_t - \bar y)^2 }
$$

Page 20: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – S.E. of regression (eg. 1.304371): measures the dispersion of the error term ε.

$$
\hat\sigma = \sqrt{\hat\sigma^2} = \sqrt{ \frac{1}{T-K-1} \sum_{t=1}^{T} e_t^2 }
$$

  – R-squared and Adjusted R-squared (eg. 0.553, 0.533)

$$
R^2 = 1 - \frac{\sum_{t=1}^{T} e_t^2}{\sum_{t=1}^{T} (y_t - \bar y)^2} ,
\qquad
\bar R^2 = 1 - \frac{\sum_{t=1}^{T} e_t^2 / (T-K-1)}{\sum_{t=1}^{T} (y_t - \bar y)^2 / (T-1)}
$$

    $R^2$: the proportion of variation in y explained by the model.
    $\bar R^2$: the model size (K) is penalised in measuring goodness-of-fit.
    K = # of regressors = model size
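
These three measures can be recomputed from numbers quoted elsewhere in the slides: SSR = 76.56223, S.D. of the dependent variable = 1.908842, T = 48 observations and K = 2 regressors (x and z). The sketch below assumes those figures belong to the same regression; the function name is hypothetical.

```python
import math

def fit_summary(ssr, sd_y, T, K):
    """S.E. of regression, R-squared and adjusted R-squared from summary figures."""
    sst = (T - 1) * sd_y ** 2                 # sum of (y_t - ybar)^2, recovered from the sample SD
    se_reg = math.sqrt(ssr / (T - K - 1))     # S.E. of regression
    r2 = 1 - ssr / sst
    adj_r2 = 1 - (ssr / (T - K - 1)) / (sst / (T - 1))
    return se_reg, r2, adj_r2

se_reg, r2, adj_r2 = fit_summary(ssr=76.56223, sd_y=1.908842, T=48, K=2)
print(round(se_reg, 4), round(r2, 3), round(adj_r2, 3))
```

After rounding, the results agree with the reported S.E. of regression (1.304371), R-squared (0.553) and adjusted R-squared (0.533).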

Page 21: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – Log likelihood (eg. -79.31472)
    likelihood = joint pdf of data (as a function of parameters)
    reported value = log likelihood evaluated at the OLS estimates, assuming normality for ε
  – Durbin-Watson stat (eg. 1.506278)
    for testing if serial correlation (autocorrelation, AC) exists in the disturbance $\varepsilon_t$.

$$
DW = \sum_{t=2}^{T} (e_t - e_{t-1})^2 \Big/ \sum_{t=1}^{T} e_t^2
$$

    Roughly, if there is no autocorrelation, DW ≈ 2. AC is significant if DW is too different from 2. Diebold recommended 1.5 as a cutoff.
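
Both statistics are short computations. The sketch below implements the DW formula directly, and evaluates the Gaussian log likelihood with the standard concentrated form -T/2 · (1 + ln 2π + ln(SSR/T)). That form is an assumption of this sketch rather than something stated on the slide, although it reproduces the reported value from SSR = 76.56223 and T = 48. The residual series are hypothetical.

```python
import math

def durbin_watson(resid):
    """DW = sum_{t=2}^{T} (e_t - e_{t-1})^2 / sum_{t=1}^{T} e_t^2."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

def gaussian_loglik(ssr, T):
    """Log likelihood at the OLS estimates under normal errors (variance set to SSR / T)."""
    return -0.5 * T * (1 + math.log(2 * math.pi) + math.log(ssr / T))

# Hypothetical residual patterns: sign-alternating vs. persistent.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])   # negative autocorrelation, DW above 2
dw_persistent = durbin_watson([1.0, 1.0, 1.0, 1.0])      # extreme positive autocorrelation, DW = 0

loglik = gaussian_loglik(ssr=76.56223, T=48)
print(dw_alternating, dw_persistent, round(loglik, 4))
```

The two artificial series bracket the benchmark of 2: residuals that keep the same sign push DW towards 0, while residuals that flip sign push it above 2.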

Page 22: Econ2209 Week 2

Ch.2 Brief Review

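• A review of linear regression models
  – Akaike info criterion (eg. 3.429780): select the model with the smallest AIC

$$
\mathrm{AIC} = \ln\!\left(\frac{\mathrm{SSR}}{T}\right) + \frac{2(K+1)}{T}
$$

  – Schwarz criterion (eg. 3.546730): select the model with the smallest SIC

$$
\mathrm{SIC} = \ln\!\left(\frac{\mathrm{SSR}}{T}\right) + \frac{\ln(T)(K+1)}{T}
$$

    The values reported by EViews equal these expressions plus the constant 1 + ln(2π), which is the same for every model and so does not affect the ranking.
  – Trade-off in model selection: a more complex model has a smaller SSR but more parameters.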
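
The two criteria and the selection rule can be sketched as follows. The formulas are the ones on this slide. The SSR values for K = 1 and K = 3 are hypothetical, invented only to illustrate the trade-off; only SSR = 76.56223 with K = 2 and T = 48 comes from the slides. The offset 1 + ln(2π) between the slide formulas and the EViews-reported values is an observation from the reported numbers.

```python
import math

def aic(ssr, T, K):
    """AIC = ln(SSR/T) + 2(K+1)/T."""
    return math.log(ssr / T) + 2 * (K + 1) / T

def sic(ssr, T, K):
    """SIC = ln(SSR/T) + ln(T)(K+1)/T."""
    return math.log(ssr / T) + math.log(T) * (K + 1) / T

T = 48
OFFSET = 1 + math.log(2 * math.pi)   # constant separating the slide formulas from the reported values

aic_reported_scale = aic(76.56223, T, 2) + OFFSET
sic_reported_scale = sic(76.56223, T, 2) + OFFSET

# Hypothetical candidate models: K -> SSR (only the K = 2 entry is from the slides).
candidates = {1: 95.0, 2: 76.56223, 3: 76.0}
best_k = min(candidates, key=lambda k: sic(candidates[k], T, k))

print(round(aic_reported_scale, 5), round(sic_reported_scale, 5), best_k)
```

In the hypothetical comparison, the third regressor lowers the SSR only slightly, so its penalty outweighs the gain in fit and the smaller model is preferred.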

Page 23: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – Selecting a model: K*+1 minimises SIC

[Figure: ln(SSR/T), the penalty ln(T)(K+1)/T, and their sum SIC = ln(SSR/T) + ln(T)(K+1)/T plotted against K+1; the minimum of SIC is marked at K*+1]

Page 24: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – F-statistic (eg. 27.82752)
    for testing the joint H0: β1 = β2 = ... = βK = 0 (H0: independent variables do not explain/predict y)

$$
F\text{-stat} = \frac{(\mathrm{SSR}_r - \mathrm{SSR})/K}{\mathrm{SSR}/(T-K-1)} \sim F(K, T-K-1) \text{ distr. under } H_0 ,
$$

    where $\mathrm{SSR}_r$ is the SSR under H0.
  – Prob(F-statistic): Prob(F-RV > F-stat)
    the probability of Type-I error (rejecting true H0) when you reject H0.
    eg. 0.000000 = P(F-RV > 27.82752)
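
The F-statistic can be recomputed from figures quoted in these slides. Under H0 only the intercept remains, so the restricted SSR equals the total sum of squares of y around its mean, (T-1)·SD². The sketch assumes SSR = 76.56223, SD = 1.908842, T = 48 and K = 2 all refer to the same regression; the function name is hypothetical.

```python
def f_stat(ssr_r, ssr, T, K):
    """F = [(SSR_r - SSR) / K] / [SSR / (T - K - 1)]."""
    return ((ssr_r - ssr) / K) / (ssr / (T - K - 1))

T, K = 48, 2
ssr = 76.56223
ssr_r = (T - 1) * 1.908842 ** 2      # SSR of the intercept-only model, from the sample SD of y

F = f_stat(ssr_r, ssr, T, K)
print(round(F, 3))
```

The result agrees with the reported F-statistic up to the rounding of the inputs.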

Page 25: Econ2209 Week 2

Ch.2 Brief Review

• A review of linear regression models
  – Residual-based specification tests
    • A regression model is based on assumptions about $\varepsilon_t$:
      a) independent, identical distribution, $\varepsilon_t \sim \text{iid}(0, \sigma^2)$;
      b) independent of $x_t$.
    • If the model is correct, the residuals $e_t = y_t - \hat y_t$ approximate $\varepsilon_t$.
    • For the correct model, the residuals should be approximately iid $(0, \sigma^2)$. If the residuals are not iid, the model must be mis-specified.
    • Check the model by checking if the residuals are iid.
      eg. Check if there is serial correlation in the residuals. Check if there is heteroskedasticity in the residuals. Check if normality is supported by data.

Page 26: Econ2209 Week 2

Ch.2 Brief Review

• Summary
  – What is a linear regression?
  – What is the key assumption about linear regressions?
  – What is the method of estimating a linear regression model?
  – How do you use EViews (read, plot, summarise data)?
  – How do you use EViews to estimate linear regression models?
  – Do you understand EViews output?
  – How do you do inference with EViews output?
  – Why do we use SIC (or AIC) to choose models?