Andrea Beccarini (CQE) Empirical Methods Summer 2013
Linear Regression and Testing Pag. 1
Linear Regression and Testing
1. Assumptions of the Classical Linear Regression Model.
2. CLRM assumptions and time series analysis.
3. Usual estimation procedure.
4. Example: estimating the Euro-area Phillips curve.
5. Summary statistics: coefficient results, standard errors, R-squared, adjusted R-squared, sum of squared residuals, mean and S.D. of the dependent variable.
6. Verifying the basic assumptions: overview.
7. Verifying the basic assumptions: OLS tests: t-statistics, p-value, F-statistic, DW statistic, RESET test.
1. Assumptions of the Classical Linear Regression Model
Using matrix notation, the standard regression model may be
written as:

y = Xβ + ε,   (1)

where y is a T-dimensional vector containing observations on the
dependent variable, X is a T x k matrix of independent variables,
β is a k-vector of coefficients, and ε is a T-vector of disturbances.
Alternatively:

y_t = x_t'β + ε_t,   t = 1,…,T.   (2)
The following assumptions permit consideration of the Ordinary Least
Squares (OLS) estimate b for the vector β:

A1. Linearity: The model specifies a linear relationship between y and the columns of X.

A2. Full rank: There is no exact linear relationship among any of the independent variables in the model. This is necessary for estimating the parameters of the model.

A3. Exogeneity of the independent variables: E[ε|X] = 0. This states that the expected value of the disturbance at each observation in the sample is not a function of the independent variables observed at any observation.
A4. Homoscedasticity and non-autocorrelation: Each disturbance ε_t has the same finite variance σ² and is uncorrelated with every other disturbance ε_s, s ≠ t.

A5. Exogenously generated data: The data in X may be any mixture of a constant and random variables. The process generating the data is independent of the process that generates ε. Analysis is done conditionally on the observed X.

A6. Normal distribution: The disturbances are normally distributed, ε|X ~ N(0, σ²I). This assumption is made for convenience.
Under these assumptions, the Gauss–Markov Theorem holds: in the classical linear regression model, the least squares estimator b is the minimum variance linear unbiased estimator of β, whether X is stochastic or non-stochastic, so long as the other assumptions of the model continue to hold. Here b solves the normal equations X'X b = X'y, or equivalently:

b = (X'X)^(-1) X'y.   (3)

And:

Var[b|X] = σ²(X'X)^(-1).   (4)
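The estimator and covariance formulas above can be sketched numerically; the data-generating parameters below are illustrative assumptions, not from the slides.

```python
import numpy as np

# Minimal sketch of eqs. (3) and (4) on simulated data.
rng = np.random.default_rng(0)
T, k = 200, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])  # constant + 2 regressors
beta = np.array([0.5, 1.0, -2.0])                           # assumed true coefficients
y = X @ beta + rng.normal(scale=0.3, size=T)

b = np.linalg.solve(X.T @ X, X.T @ y)    # eq. (3): b = (X'X)^(-1) X'y
e = y - X @ b                            # residual vector
s2 = e @ e / (T - k)                     # unbiased estimator of sigma^2
cov_b = s2 * np.linalg.inv(X.T @ X)      # eq. (4) with sigma^2 replaced by s^2
se_b = np.sqrt(np.diag(cov_b))           # coefficient standard errors
```

Solving the normal equations with `np.linalg.solve` is numerically preferable to forming the inverse explicitly, though the inverse is still needed for the covariance matrix.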
The following finite sample properties hold:

E[b|X] = β (unbiasedness);   (5)

Var[b|X] = σ²(X'X)^(-1).   (6)

Gauss–Markov theorem: b is MVLUE (minimum variance linear unbiased estimator).   (7)
Results that follow from Assumption A6, normally distributed disturbances: b and the residual vector e are uncorrelated and, under normality, statistically independent; it follows that b and s² are
statistically independent. The exact distribution of b|X is N(β, σ²(X'X)^(-1)). The ratio (T-k)s²/σ² is chi-squared distributed with T-k
degrees of freedom:

(T-k)s²/σ² ~ χ²(T-k).
2. CLRM assumptions and time series analysis

Consider the estimation of the parameters of a pth-order
autoregression, AR(p), by OLS:

y_t = c + φ₁ y_(t-1) + … + φ_p y_(t-p) + ε_t,   (8)

with the roots of

1 - φ₁ z - … - φ_p z^p = 0

outside the unit circle and with ε_t an i.i.d. sequence with mean zero, variance σ², and finite fourth moment.
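OLS estimation of such an autoregression can be sketched as below; the AR(2) parameter values are illustrative assumptions (the roots of 1 - 0.5z - 0.2z² = 0 lie outside the unit circle).

```python
import numpy as np

# Illustrative sketch: estimate an AR(2) by OLS, regressing y_t on its lags.
rng = np.random.default_rng(1)
T, p = 2000, 2
phi = np.array([0.5, 0.2])            # assumed stationary AR(2) coefficients
y = np.zeros(T + p)
for t in range(p, T + p):
    y[t] = phi[0] * y[t - 1] + phi[1] * y[t - 2] + rng.normal()

Y = y[p:]                                        # observations y_1 ... y_T
X = np.column_stack([y[p - 1:-1], y[p - 2:-2]])  # lag matrix: y_{t-1}, y_{t-2}
b = np.linalg.solve(X.T @ X, X.T @ Y)            # biased in finite samples, but consistent
```

With a large T, the OLS estimates sit close to the true coefficients, illustrating the consistency argument made below even though the small-sample unbiasedness of the CLRM fails here.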
An autoregression has the form of the standard regression model
with

x_t = (1, y_(t-1), …, y_(t-p))'.

However, an autoregression cannot satisfy the usual condition that
ε_t is independent of x_s for all t and s; see A3.
In fact, although ε_t and x_t are independent, this is not the case for
ε_t and x_(t+1).
Without this independence, none of the small-sample results valid
for the classical linear regression model applies.
Even if ε_t is Gaussian, the OLS coefficient b gives a biased estimate of
the autoregressive parameters, and the standard t and F statistics can only
be justified asymptotically. However, one may rely on consistency:

B1. Stationarity: given a stochastic process generating y_t, t = 1,…,T, if
neither its mean nor its autocovariances depend on the date
t, then the process for y_t is said to be covariance-stationary or
weakly stationary.

B2. Ergodicity: a covariance-stationary process is said to be ergodic
for the mean if (1/T) Σ_t y_t converges in probability to E[y_t]
as T goes to infinity.

B1 and B2 imply for the OLS estimator that it is consistent: b converges
in probability to the true parameter vector.
In order to find the distribution of √T(b - β), suppose the sample consists
of T+p observations on y: (y_(-p+1), …, y_0, y_1, …, y_T);
OLS estimation will thus use observations 1 through T. Then

√T(b - β) = [(1/T) Σ_t x_t x_t']^(-1) [(1/√T) Σ_t x_t ε_t].   (9)

One may assume that:

(1/T) Σ_t x_t x_t' → Q in probability,   (10)

with Q a non-singular and non-stochastic matrix.
{x_t ε_t} is assumed to be a martingale difference sequence,
thus one can show:

(1/√T) Σ_t x_t ε_t → N(0, σ²Q) in distribution.   (11)

Substituting (10) and (11) into (9),

√T(b - β) → N(0, σ²Q^(-1)) in distribution,   (12)

from which the asymptotic application of the t and F statistics
follows.
3. Usual estimation procedure
As a first step of the estimation procedure, one should find b,
the estimate s² of σ², and other basic descriptive statistics.
As a second step, one should verify the above
assumptions A1-A6, plus B1-B2 if applicable.
If all of these assumptions are verified, one can treat the point
and interval estimates as reliable, and test potential restrictions
suggested by the theory.
Otherwise one should find some remedy provided in the
literature.
4. Example: estimating the Euro-area Phillips curve
Dependent Variable: HICPEA
Sample (adjusted): 1996:3 2008:1
Included observations: 47 after adjusting endpoints

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C               0.602236     0.088140     6.832720    0.0000
HICPEA(-1)     -0.195737     0.151978    -1.287924    0.2045
OGEAP           0.101738     0.059188     1.718913    0.0927

R-squared            0.079167    Mean dependent var     0.503383
Adjusted R-squared   0.037311    S.D. dependent var     0.322358
S.E. of regression   0.316287    Akaike info criterion  0.597366
Sum squared resid    4.401639    Schwarz criterion      0.715461
Log likelihood     -11.03810     F-statistic            1.891416
Durbin-Watson stat   1.968068    Prob(F-statistic)      0.162921

HICPEA is the inflation rate and OGEAP is the output gap (% changes).
5. Summary Statistics
Purely descriptively, one may generally consider the following
statistics.

Coefficient Results

Regression coefficients are point estimates. The least squares
regression coefficients b are computed by the standard OLS
formula:

b = (X'X)^(-1) X'y.
- The coefficient measures the marginal contribution of the
independent variable to the dependent variable, holding all other
variables fixed.
- In the above example, a one-percentage-point increase of OGEAP implies
an expected contemporaneous increase of HICPEA of about 0.10.
- If present, the constant or intercept in the regression is the base
level of the prediction when all of the other independent
variables are zero.
Standard Errors
- The standard errors measure the statistical reliability of the
coefficient estimates—the larger the standard errors, the more
statistical noise in the estimates.
- They permit interval estimation. If the errors are
normally distributed (as assumed), there is roughly a two-thirds
probability that the true regression coefficient lies within 1 standard
error of the reported coefficient, and about a 95% probability that it
lies within 2 standard errors.
The standard errors of the estimated coefficients are the square
roots of the diagonal elements of the estimated coefficient
covariance matrix.
The estimated covariance matrix of the estimated coefficients is
computed as (see eq. (4)):

s²(X'X)^(-1),   s² = e'e/(T-k),

where e = y - Xb is the residual. In the above example it is:

               C           HICPEA(-1)   OGEAP
C             0.007769    -0.011412     0.001127
HICPEA(-1)   -0.011412     0.023097    -0.002110
OGEAP         0.001127    -0.002110     0.003503
R-squared
The R-squared statistic (R²) measures the success of the
regression in predicting the values of the dependent variable
within the sample.
In standard settings, R² may be interpreted as the fraction of the
variance of the dependent variable explained by the independent
variables:

R² = 1 - e'e / Σ_t (y_t - ȳ)²,   (13)

where ȳ is the mean of the dependent variable.
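Eq. (13) can be sketched on made-up data (the simulated regression below is an assumption for illustration only):

```python
import numpy as np

# Compute R^2 = 1 - e'e / sum_t (y_t - ybar)^2 from an OLS fit.
rng = np.random.default_rng(2)
T = 100
x = rng.normal(size=T)
y = 1.0 + 2.0 * x + rng.normal(size=T)    # assumed linear model with noise
X = np.column_stack([np.ones(T), x])
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b                             # residuals
r2 = 1.0 - (e @ e) / np.sum((y - y.mean()) ** 2)
```

Because the regression includes an intercept, r2 here necessarily lies between zero and one.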
The statistic will equal one if the regression fits perfectly, and zero
if it fits no better than the simple mean of the dependent variable.
It can even be negative if
- the regression does not have an intercept,
- the regression contains coefficient restrictions,
- the estimation method is two-stage least squares or ARCH.
In the example above, R² ≈ 0.079, which is a small number.
However, there is no particular criterion for evaluating it beyond
simply recognizing that it is a small number.
Adjusted R-squared

- One problem with using R² as a measure of goodness of fit is
that R² will never decrease as you add more regressors.
- In the extreme case, you can always obtain R² = 1 if you include
as many regressors as there are sample observations.
- The adjusted R², commonly denoted R̄², penalizes R² for
the addition of regressors which do not contribute to the
explanatory power of the model.
- The adjusted R² is computed as

R̄² = 1 - (1 - R²)(T - 1)/(T - k).

- It is never larger than R², it can decrease as you add
regressors, and for poorly fitting models it may be negative.
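The adjusted R² formula can be checked against the Phillips-curve output above, where R² = 0.079167, T = 47, and k = 3:

```python
# Recompute adjusted R^2 = 1 - (1 - R^2)(T - 1)/(T - k) from the table's values.
r2, T, k = 0.079167, 47, 3
adj_r2 = 1.0 - (1.0 - r2) * (T - 1) / (T - k)
# adj_r2 reproduces the reported Adjusted R-squared of 0.037311
```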
Standard Error of the Regression

It is a measure based on the estimated variance of the residuals:

s = sqrt(e'e/(T - k)).

Sum-of-Squared Residuals

e'e = Σ_t (y_t - x_t'b)².

Mean and Standard Deviation (S.D.) of the Dependent Variable

ȳ = (1/T) Σ_t y_t;   S.D. = sqrt(Σ_t (y_t - ȳ)²/(T - 1)).
6. Verifying the basic assumptions: overview
Some conditions may be easily verified; some must be tested.
Conditions A2, A3 and A5 are easily verified (to some extent).
A1, A3, A4 and A5 may be in part verified by tests based on
OLS; verifying them fully requires Maximum Likelihood principles.
A6 may be verified through the Jarque-Bera test.
B1 may be verified through stationarity tests (Dickey-Fuller, …).
B2 is assumed after B1.
7. Verifying the basic assumptions: OLS tests
Condition A1: this implies two sub-conditions:
I. Linearity (and, in general terms, the correct functional form).
II. The inclusion of the correct regressors in the model.
Apply the t-statistics, the F-statistics and the RESET test.
Condition A3: apart from general considerations (time series,
simultaneous equations), apply the RESET test.
Condition A4: it may be verified by the Durbin-Watson test and
the RESET test.
Condition A5 may be verified by the RESET test.
t-Statistics
Since b|X is N(β, σ²(X'X)^(-1)) and (T - k)s²/σ² ~ χ²(T - k),
it follows:

(b_k - β_k) / se(b_k) ~ t(T - k),

where t(T - k) is the Student's t distribution with T - k degrees of
freedom and se(b_k) is the standard error of the kth coefficient.
- Through the t-statistic, one can test the particular null
hypothesis H₀: β_k = 0, k = 1, 2, …, i.e. the hypothesis that the kth
coefficient is equal to zero.
- In this case, the t-statistic is computed as the ratio of an
estimated coefficient to its standard error.
- The probability used to evaluate the t-test (the p-value) is
described below.
- There are cases where normality can only hold asymptotically; in
such cases one talks about a z-statistic instead of a t-statistic.
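The ratio can be checked against the C row of the Phillips-curve table above (coefficient 0.602236, standard error 0.088140):

```python
# t-statistic as estimated coefficient over its standard error, under H0: beta = 0.
coef, se = 0.602236, 0.088140
t_stat = coef / se
# t_stat agrees with the reported 6.832720 up to rounding of the inputs
```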
Probability (p-value)
- This p-value is also known as the marginal significance level.
- The p-values are computed from a t-distribution with T-k degrees
of freedom.
- Given a p-value, one can say whether one rejects or accepts the
hypothesis that the true coefficient is zero against a two-sided
alternative that it differs from zero.
- For example, at the 5% significance level, a p-value lower than
0.05 is taken as evidence to reject the null hypothesis of a zero
coefficient; by this criterion HICPEA(-1) (p = 0.20)
and OGEAP (p = 0.09) are not significant.
F-Statistic

The F-statistic permits the consideration of J linear restrictions
(contemporaneously) stated in the null hypothesis:

H₀: Rβ = q,

against the alternative hypothesis:

H₁: Rβ ≠ q.

The F-statistic is defined as:

F = [(e*'e* - e'e)/J] / [e'e/(T - k)],   (14)

where e*'e* is the restricted and e'e the unrestricted sum of squared residuals.
This statistic also allows one to test restrictions suggested by
economic theory.
The F-statistic associated with the regression output is a test of the
hypothesis that all of the slope coefficients (excluding the constant,
or intercept) in a regression are zero.
For ordinary least squares models, the F-statistic is computed as

F = [R²/(k - 1)] / [(1 - R²)/(T - k)].   (15)

Under the null hypothesis with normally distributed errors, this
statistic has an F-distribution with k - 1 numerator degrees of
freedom and T - k denominator degrees of freedom.
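Eq. (15) can be evaluated with the Phillips-curve output above (R² = 0.079167, T = 47, k = 3):

```python
# Regression F-statistic from R^2: F = [R^2/(k-1)] / [(1-R^2)/(T-k)].
r2, T, k = 0.079167, 47, 3
F = (r2 / (k - 1)) / ((1.0 - r2) / (T - k))
# F agrees with the reported F-statistic of 1.891416 up to rounding of R^2
```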
The p-value given along with the F-statistic, denoted Prob(F-
statistic), is the marginal significance level of the F-test.
If the p-value is less than the significance level (say 0.05), one
rejects the null hypothesis that all slope coefficients are equal
to zero.
Note that the F-test is a joint test, and its response does not
necessarily coincide with the response of the t-statistics.
In table 1, one obtains F-statistic = 1.891416 with Prob(F-statistic) = 0.162921:
the null that all slope coefficients are zero cannot be rejected,
which casts doubt on the estimates as specified above.
IS THE PHILLIPS CURVE ABSENT IN THE EU-DATA?
Ramsey's RESET Test
RESET stands for Regression Specification Error Test and was proposed by Ramsey (1969). The RESET test is a general test which covers any departure from the assumptions of the CLRM:

y = Xβ + ε,   ε|X ~ N(0, σ²I).

Serial correlation, heteroskedasticity, or non-normality of ε all violate the assumption that the disturbances are distributed as N(0, σ²I). See A3, A4, A5, A6.
RESET is a general test for the following types of specification errors:
- Omitted variables: X does not include all relevant variables (A1).
- Incorrect functional form: some or all of the variables in y and X
should be transformed to logs, powers, reciprocals, or in some other way (A1).
- Correlation between X and ε, which may be caused, among
other things, by measurement errors, simultaneity, or the presence of lagged y values and serially correlated disturbances (A3, A4, A5).
Under such specification errors, OLS estimators will be biased and inconsistent, and conventional inference procedures will be invalidated. Ramsey (1969) showed that any or all of these specification errors produce a non-zero mean vector for ε. The null and alternative hypotheses of the RESET test are:

H₀: ε ~ N(0, σ²I)   vs.   H₁: ε ~ N(μ, σ²I), μ ≠ 0.
The test is based on an augmented regression y = Xβ + Zγ + ε. The test of specification error evaluates the restriction γ = 0. The crucial question in constructing the test is to determine which variables should enter the Z matrix. Note that the Z matrix may, for example, contain variables that are not in the original specification, so that the test of γ = 0 is simply the omitted variables test.
In testing for incorrect functional form, the nonlinear part of the regression model may be some function of the regressors included in X. For example, the linear relation

y = β₀ + β₁x + ε

may be specified instead of the true relation:

y = β₀ + β₁x + β₂x² + ε.   (16)

A more general example might be a very nonlinear relationship between y and the regressors.
A Taylor series approximation of the nonlinear relation would yield an expression involving powers and cross-products of the explanatory variables. Ramsey's suggestion is to include powers of the predicted values of the dependent variable (which are, of course, linear combinations of powers and cross-product terms of the explanatory variables) in Z:

Z = [ŷ², ŷ³, ŷ⁴, …],

where ŷ is the vector of fitted values from the regression of y on X. The first power is not included since it is perfectly collinear with the X matrix.
The RESET test has the form of an F-test; the null hypothesis is that the coefficients on the powers of fitted values are all zero. The test can detect that something is wrong in the model but does not provide any indication of what is wrong. Regarding the estimate of table 1, the F-statistic is 0.55 with p-value 0.69; this result implies that the null hypothesis cannot be rejected, which supports the hypothesis that the model is correctly specified.
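A RESET-style computation can be sketched on simulated data (the data-generating process and the choice of two fitted-value powers are illustrative assumptions): fit the original model, add powers of the fitted values, and form the restricted-versus-unrestricted F-statistic of eq. (14).

```python
import numpy as np

# Sketch of a RESET F-statistic; the true model here really is linear,
# so the statistic should be small (no evidence of misspecification).
rng = np.random.default_rng(3)
T = 200
x = rng.normal(size=T)
y = 1.0 + 0.8 * x + rng.normal(size=T)

def ssr_and_fit(X, y):
    """Return the sum of squared residuals and fitted values of an OLS fit."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ b
    e = y - fitted
    return e @ e, fitted

X = np.column_stack([np.ones(T), x])
ssr_r, yhat = ssr_and_fit(X, y)                 # restricted (original) model
Z = np.column_stack([yhat**2, yhat**3])         # powers of fitted values
ssr_u, _ = ssr_and_fit(np.column_stack([X, Z]), y)  # augmented model
J = Z.shape[1]                                  # number of restrictions tested
k = X.shape[1] + J
F = ((ssr_r - ssr_u) / J) / (ssr_u / (T - k))   # compare with F(J, T - k)
```

The restricted model is nested in the augmented one, so ssr_r ≥ ssr_u and F is non-negative by construction; a p-value would come from the F(J, T - k) distribution.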
Durbin-Watson Statistic
The Durbin-Watson statistic measures the serial correlation in the
residuals (assumption A4). The statistic is computed as

DW = Σ_(t=2)^T (e_t - e_(t-1))² / Σ_(t=1)^T e_t².

As a rule of thumb, if the DW is less than 2, there is evidence of
positive serial correlation.
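The statistic is straightforward to compute; the i.i.d. residuals below are an illustrative assumption, chosen so that DW should land near 2.

```python
import numpy as np

# Durbin-Watson statistic: DW = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2.
rng = np.random.default_rng(4)
e = rng.normal(size=500)                         # serially uncorrelated residuals
dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)    # near 2 when no AR(1) correlation
```

For positively autocorrelated residuals the numerator shrinks and DW falls below 2, matching the rule of thumb above.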
This statistic must be used to test only for AR(1) errors and is not applicable when there are lagged dependent variables among the regressors. In table 1, DW = 1.97, which indicates no serial correlation, although
it is not reliable due to the presence of the lagged variable among
the regressors.
The Q-statistic and the Breusch-Godfrey LM test both
provide a more general testing framework than the Durbin-Watson
test (see later on).