Source: Autocorrelation and Hetroskedasticity, University of … (general.utpb.edu/fac/carson_s/Econometrics/Econometri…, 2001-07-06)
Autocorrelation and Heteroskedasticity
Having considered the multivariate CLRM, we now want to consider cases where our assumptions are broken. In such cases, OLS estimates are no longer BLUE, and we must look to alternative models to correct for the lost efficiency. However, before proceeding, it is helpful to understand two types of data sets. First, we can collect data that is generated at the same time from different sources. Examples include stock returns, labor supply curves and baseball player salaries. Such data is called cross sectional: it takes a cross section of a data set at a point in time, similar to how the balance sheet accounts for the value of assets at a point in time. The second type of data is called time series data. This type of data takes a given data generating process and observes it through time. Examples include GDP, unemployment and perhaps school attendance. Time series data considers data over time, much as the income statement accounts for profit over time. These two types of data sets are prominent in econometrics and call for alternative estimation techniques to deal with the problems they create. Both types of data can lead to inefficient estimators, which casts doubt on our estimated variances and, consequently, our t-statistics. In sum, these problems cause serious difficulties.
Heteroskedasticity
You may recall from your introductory statistics course that random variables are often assumed to be independently and identically distributed. When our error terms are identically distributed, they have the same variance for all observations. This is known as homoskedasticity. If they are not, our estimates face serious problems that must be corrected if we are to obtain reliable results. Heteroskedasticity is a deviation from the identically distributed assumption: the variances are not the same for each observation.
The consequences of heteroskedasticity are serious. While parameter estimates remain unbiased, they are no longer efficient, i.e., no longer BLUE. Since the estimated variance-covariance matrix of the errors is wrong, the t-statistics are invalid. Fortunately, we can correct for heteroskedasticity by selecting an alternative estimator that correctly weights the errors and retains
the BLUE properties. This estimator is known as the generalized least squares (GLS) estimator.
The GLS estimator transforms the errors such that the errors become homoskedastic. However,
transformation of the errors only makes the variance-covariance matrix efficient. It does not
change the meaning of the coefficients.
In the CLRM, the following conditions held:

    E[ε_t] = 0,  Var(ε_t) = σ²,  Cov(ε_t, ε_s) = 0 for t ≠ s

That is, the assumptions implied

    E[εεᵀ] = σ²I

whereas under heteroskedasticity the diagonal entries differ:

    E[εεᵀ] = σ² diag(σ₁², σ₂², σ₃², σ₄², σ₅²)    (illustrated for five observations)

Our objective is to transform our model such that it has constant variances. For example, write the model as

    y_t = α + β₁x_{1t} + … + β_K x_{Kt} + ε_t,  with Var(ε_t) = σ²σ_t²

We transform the model by pre-multiplying by 1/σ_t:

    y_t/σ_t = α(1/σ_t) + β₁(x_{1t}/σ_t) + … + β_K(x_{Kt}/σ_t) + ε_t/σ_t

Consider the variance of the errors:

    Var(ε_t/σ_t) = (1/σ_t²)Var(ε_t) = σ²σ_t²/σ_t² = σ²

which is the same for every t.
Hence, the simple transformation produces constant variance, i.e., homoskedasticity.
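The scalar transformation can be sketched numerically. Below is a minimal illustration, not from the original notes: the data, the coefficient values and the assumption that the per-observation standard deviations σ_t are known are all hypothetical (in practice σ_t must be estimated).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 20.0, n)
sigma_t = 0.5 * x                                # error spread grows with x: heteroskedastic
y = 2.0 + 3.0 * x + rng.normal(0.0, sigma_t)

# Divide every term of y_t = alpha + beta*x_t + eps_t by sigma_t:
#   y_t/sigma_t = alpha*(1/sigma_t) + beta*(x_t/sigma_t) + eps_t/sigma_t
Z = np.column_stack([1.0 / sigma_t, x / sigma_t])
b = np.linalg.lstsq(Z, y / sigma_t, rcond=None)[0]

# The transformed errors now have constant variance, so OLS on the
# transformed data is efficient; b recovers (alpha, beta)
print(b)
```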
A convenient matrix demonstration of heteroskedasticity and autocorrelation follows.
Recall that for the least squares estimator, the variance of the coefficients is:

    V(β̂) = E[(β̂ − β)(β̂ − β)ᵀ] = (XᵀX)⁻¹Xᵀ E[εεᵀ] X(XᵀX)⁻¹ = σ²(XᵀX)⁻¹XᵀΩX(XᵀX)⁻¹

Note: this is a large variance, i.e., inefficient. When OLS is BLUE, E[εεᵀ] = σ²I and the expression collapses to σ²(XᵀX)⁻¹. We know that a consequence of heteroskedasticity and autocorrelation is that OLS is no longer BLUE. We model the error variance-covariance matrix as

    E[εεᵀ] = σ²Ω

Ω weights the errors to correct for heteroskedasticity and autocorrelation. GLS produces the BLUE if we transform the original data so that the transformed errors satisfy E[ε*ε*ᵀ] = σ²I. To do so, we rely on some regularity conditions and a basic matrix algebra theorem that states there exists a matrix T such that

    TΩTᵀ = I  ⇒  Ω = T⁻¹(Tᵀ)⁻¹  and  Ω⁻¹ = TᵀT
Now, we can transform our OLS model by pre-multiplying by T. Let

    y* = Ty,  X* = TX,  ε* = Tε

Transformed model:

    y* = X*β + ε*

Generalized least squares:

    β̂_GLS = (X*ᵀX*)⁻¹X*ᵀy* = (XᵀTᵀTX)⁻¹XᵀTᵀTy = (XᵀΩ⁻¹X)⁻¹XᵀΩ⁻¹y

We can now see how the generalized least squares coefficient estimates are efficient:

    V(β̂_GLS) = E[(β̂_GLS − β)(β̂_GLS − β)ᵀ]
              = (X*ᵀX*)⁻¹X*ᵀ E[ε*ε*ᵀ] X*(X*ᵀX*)⁻¹
              = σ²(XᵀΩ⁻¹X)⁻¹XᵀΩ⁻¹ΩΩ⁻¹X(XᵀΩ⁻¹X)⁻¹
              = σ²(XᵀΩ⁻¹X)⁻¹

This transformed GLS variance is BLUE if assumptions A2, A4 and A5 are satisfied. A proposed matrix T is

    T = diag(1/σ₁, 1/σ₂, 1/σ₃, 1/σ₄, 1/σ₅)

i.e., a diagonal matrix with 1/σ_t on the diagonal and zeros elsewhere (illustrated for five observations).
Of course, this is the same transformation as our scalar counterpart. And like our scalar
counterpart, the matrix T will transform OLS into GLS.
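The matrix recipe above can be sketched as code. A minimal sketch, assuming Ω is known and diagonal (the data and parameter values are invented for illustration); it verifies that GLS computed from the transformed data Ty, TX equals the closed form (XᵀΩ⁻¹X)⁻¹XᵀΩ⁻¹y.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.uniform(1.0, 10.0, n)])
sig = 0.3 * X[:, 1]                        # per-observation error std. deviations
Omega = np.diag(sig ** 2)                  # E[ee'] proportional to Omega
y = X @ np.array([1.0, 2.0]) + rng.normal(0.0, sig)

# T satisfies T Omega T' = I; for diagonal Omega, T = diag(1/sigma_t)
T = np.diag(1.0 / sig)
ystar, Xstar = T @ y, T @ X

b_transformed = np.linalg.solve(Xstar.T @ Xstar, Xstar.T @ ystar)
Oinv = np.linalg.inv(Omega)
b_closed = np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ y)
print(b_transformed, b_closed)             # the same estimator, two ways
```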
Detecting Heteroskedasticity
There are multiple tests to determine if a data set is heteroskedastic. Since they typically lead to the same conclusion, we'll primarily focus on one, the White test. The procedure for the White test follows:
1. Regress y on x1, x2 . . . xn.
2. Save and square the residuals et.
3. Regress et2 on all independent variables, their squares and their cross products.
4. Calculate the test statistic NR2, which is distributed χ2 under the null.
5. If NR2 is greater than the χ2 critical value, reject the null hypothesis of equal variances across observations. If NR2 is less than the χ2 critical value, we fail to reject the null hypothesis, i.e., the errors are homoskedastic.
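The five steps can be sketched in a general-purpose language (a generic illustration, not Shazam; the function and variable names are invented). With a single regressor, step 3 reduces to regressing e² on x and x².

```python
import numpy as np

def white_test(y, x):
    """White test for a one-regressor model; returns (NR^2, df)."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]       # step 1: OLS
    e2 = (y - X @ beta) ** 2                          # step 2: squared residuals
    Z = np.column_stack([np.ones(n), x, x ** 2])      # step 3: auxiliary regression
    g = np.linalg.lstsq(Z, e2, rcond=None)[0]
    r2 = 1.0 - np.sum((e2 - Z @ g) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return n * r2, Z.shape[1] - 1                     # step 4: NR^2 and chi-square df

# Heteroskedastic data: the error spread grows with x, so the test should reject
rng = np.random.default_rng(0)
x = np.linspace(1.0, 20.0, 200)
y = 1.0 + 2.0 * x + rng.normal(0.0, x)
stat, df = white_test(y, x)
print(stat, df)
```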
Example
Recall the simple linear regression model; however, I've tweaked the data a little to accentuate heteroskedasticity. We will run four regressions, each using a different transformation of the dependent and/or independent variables. The first model regresses earnings on work experience. We then perform a White test to determine if we have heteroskedasticity.
Model 1:
TYPE COMMAND:_read (4) earn we
...SAMPLE RANGE IS NOW SET TO: 1 20
TYPE COMMAND:_OLS earn we/resid=e
REQUIRED MEMORY IS PAR= 2 CURRENT PAR= 500
OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = EARN
...NOTE..SAMPLE RANGE SET TO: 1, 20
R-SQUARE = 0.4904    R-SQUARE ADJUSTED = 0.4621
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.59787E+08
STANDARD ERROR OF THE ESTIMATE-SIGMA = 7732.2
SUM OF SQUARED ERRORS-SSE = 0.10762E+10
MEAN OF DEPENDENT VARIABLE = 25584.
LOG OF THE LIKELIHOOD FUNCTION = -206.388
We now plot the errors on work experience to see if we have heteroskedasticity. Since the model is y = α + βx + ε, the errors are ε = y − α − βx.
Notice how the errors increase as work experience increases. This suggests heteroskedasticity may be a problem, so a White test is performed to determine if remedial measures are necessary. The White test requires that we square our residuals (resid=e) and regress this new variable on the independent and squared independent variables.
[Figure: OLS errors plotted against work experience]

TYPE COMMAND:_g we2=we**2
TYPE COMMAND:_g e2=e**2
TYPE COMMAND:_ols e2 we we2
REQUIRED MEMORY IS PAR= 3 CURRENT PAR= 500
OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = E2
...NOTE..SAMPLE RANGE SET TO: 1, 20
R-SQUARE = 0.2526    R-SQUARE ADJUSTED = 0.1647
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.79272E+16
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.89035E+08
SUM OF SQUARED ERRORS-SSE = 0.13476E+18
MEAN OF DEPENDENT VARIABLE = 0.53809E+08
LOG OF THE LIKELIHOOD FUNCTION = -392.844
We take the R2 from this auxiliary equation and multiply it by 20. This is the test statistic NR2. Our null hypothesis is that the error variances are constant across all observations of work experience, i.e., σ₁² = σ₂² = … = σ₂₀² (homoskedastic), versus the alternative hypothesis that they are not all equal.
The test statistic is NR2 = 20(.2526) = 5.052. This is compared to the χ2 critical value with one degree of freedom, 3.84. Hence, our earnings model is heteroskedastic. Theory suggests the variance-covariance matrix is not efficient. The work experience coefficient is also not BLUE, and our test statistics are not valid. However, the estimates are still unbiased. To improve our t-statistics, we must use an alternative estimation technique. There is an alternative variance-covariance matrix, called the White variance-covariance matrix, that corrects the standard errors for heteroskedasticity. However, this estimation procedure does nothing to our coefficients. This is done by including the hetcov option in Shazam.
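What an option like hetcov computes can be sketched directly: White's heteroskedasticity-consistent covariance replaces σ²(XᵀX)⁻¹ with a sandwich built from the squared OLS residuals, leaving the coefficients untouched. A minimal sketch of the HC0 variant (the data and names are invented):

```python
import numpy as np

def white_cov(X, y):
    """HC0 sandwich: (X'X)^-1 X' diag(e^2) X (X'X)^-1, plus the OLS beta."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta                                   # OLS residuals
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * (e ** 2)[:, None])               # X' diag(e^2) X
    return bread @ meat @ bread, beta

rng = np.random.default_rng(0)
n = 50
x = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = 3.0 + 1.5 * x + rng.normal(0.0, 0.2 * x)           # heteroskedastic errors
cov, beta = white_cov(X, y)
robust_se = np.sqrt(np.diag(cov))                      # corrected standard errors
print(beta, robust_se)
```

The coefficients here are exactly the OLS coefficients; only the standard errors change, which is why the White correction repairs the t-statistics without removing the heteroskedasticity itself.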
Model 2:
TYPE COMMAND:_ols earn we/hetcov resid=e1
REQUIRED MEMORY IS PAR= 3 CURRENT PAR= 500
OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = EARN
...NOTE..SAMPLE RANGE SET TO: 1, 20
USING HETEROSKEDASTICITY-CONSISTENT COVARIANCE MATRIX
R-SQUARE = 0.4904    R-SQUARE ADJUSTED = 0.4621
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.59787E+08
STANDARD ERROR OF THE ESTIMATE-SIGMA = 7732.2
SUM OF SQUARED ERRORS-SSE = 0.10762E+10
MEAN OF DEPENDENT VARIABLE = 25584.
LOG OF THE LIKELIHOOD FUNCTION = -206.388
Notice how the t-statistics have changed. These are correct statistics, so we can conclude that work experience is significant in explaining earnings. Let's look at the errors plotted on work experience to determine if the White variance-covariance matrix corrected for heteroskedasticity.
While White's variance-covariance matrix produces reliable t-statistics, it does not correct for heteroskedasticity. This is demonstrated by a White test.
TYPE COMMAND:_g e12=e1**2
TYPE COMMAND:_ols e12 we we2
REQUIRED MEMORY IS PAR= 3 CURRENT PAR= 500

[Figure: "White errors do not correct heteroskedasticity" - errors from the hetcov model plotted against work experience]

OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = E12
...NOTE..SAMPLE RANGE SET TO: 1, 20
R-SQUARE = 0.2526    R-SQUARE ADJUSTED = 0.1647
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.79272E+16
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.89035E+08
SUM OF SQUARED ERRORS-SSE = 0.13476E+18
MEAN OF DEPENDENT VARIABLE = 0.53809E+08
LOG OF THE LIKELIHOOD FUNCTION = -392.844
Given the same R2 and coefficients, it is clear that the White variance-covariance matrix fails to correct for heteroskedasticity. At this point, we return to our earlier transformation discussion. If we transform the dependent variable by some monotonic (one-directional) transformation, we may be able to correct for heteroskedasticity. We will consider three transformations. In the first model, we divide earnings by work experience. In the second model, we transform earnings by taking the natural log of earnings. The third model transforms both the dependent and independent variables by taking natural logs.
Model 3
The transformation proceeds as follows:
1) Transform the model: divide through by work experience.
2) Regress the transformed dependent variable on the transformed independent variable.

    E = α + βWE + ε  ⇒  E/WE = α(1/WE) + β + ε/WE

Let E′ = E/WE, α′ = β, β′ = α, WE′ = 1/WE, ε′ = ε/WE, and regress

    E′ = α′ + β′WE′ + ε′

TYPE COMMAND:_
REQUIRED MEMORY IS PAR= 2 CURRENT PAR= 500
OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = EP
...NOTE..SAMPLE RANGE SET TO: 1, 20
USING HETEROSKEDASTICITY-CONSISTENT COVARIANCE MATRIX
R-SQUARE = 0.8606    R-SQUARE ADJUSTED = 0.8529
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.41717E+07
STANDARD ERROR OF THE ESTIMATE-SIGMA = 2042.5
SUM OF SQUARED ERRORS-SSE = 0.75090E+08
MEAN OF DEPENDENT VARIABLE = 8189.9
LOG OF THE LIKELIHOOD FUNCTION = -179.763
Transforming the model to its original form implies
Note the change in the coefficients from the original OLS model. If this model reduces or eliminates heteroskedasticity, we believe the transformed model gives better estimates. To determine whether the transformation reduced or eliminated heteroskedasticity, we run a White test.
TYPE COMMAND:_
REQUIRED MEMORY IS PAR= 3 CURRENT PAR= 500
OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = E2
...NOTE..SAMPLE RANGE SET TO: 1, 20
R-SQUARE = 0.2185    R-SQUARE ADJUSTED = 0.1266
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.27546E+14
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.52485E+07
SUM OF SQUARED ERRORS-SSE = 0.46829E+15
MEAN OF DEPENDENT VARIABLE = 0.37545E+07
LOG OF THE LIKELIHOOD FUNCTION = -336.223
The White test statistic is 20(.2185) = 4.37. Therefore, we reject the hypothesis that the errors are identically distributed. However, we have eliminated some heteroskedasticity. This is seen by plotting the transformed model's errors on work experience.
Model 4
Since our first transformation didn’t work, let’s consider a second transformation. This
second model uses the natural log transformation of earnings to deal with hetroskedasticity. This
is a monotonic transformation on earnings to change the range while the domain of work
experience remains unchanged.
TYPE COMMAND:_g le=log(earn)
TYPE COMMAND:_ols le we/resid=e2
REQUIRED MEMORY IS PAR= 3 CURRENT PAR= 500
OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = LE
...NOTE..SAMPLE RANGE SET TO: 1, 20
[Figure: transformed model's errors plotted against work experience]
R-SQUARE = 0.3750    R-SQUARE ADJUSTED = 0.3403
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.10566
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.32505
SUM OF SQUARED ERRORS-SSE = 1.9019
MEAN OF DEPENDENT VARIABLE = 10.073
LOG OF THE LIKELIHOOD FUNCTION = -4.84989
According to the White test, heteroskedasticity is no longer a problem. Alternative heteroskedasticity tests are accessed using the diagnos/het command after the OLS regression. Let's look at the White test:
TYPE COMMAND:_g e22=e2**2
TYPE COMMAND:_ols e22 we we2
REQUIRED MEMORY IS PAR= 3 CURRENT PAR= 500
OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = E22
...NOTE..SAMPLE RANGE SET TO: 1, 20
R-SQUARE = 0.2017    R-SQUARE ADJUSTED = 0.1078
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.20554E-01
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.14337
SUM OF SQUARED ERRORS-SSE = 0.34942
MEAN OF DEPENDENT VARIABLE = 0.95094E-01
LOG OF THE LIKELIHOOD FUNCTION = 12.0933
Our final transformation involves transforming both the dependent and independent
variables with natural logs. Page 52 demonstrates that this transformation means the coefficient
of work experience is an elasticity.
REQUIRED MEMORY IS PAR= 2 CURRENT PAR= 500
OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = LE
...NOTE..SAMPLE RANGE SET TO: 1, 20
USING HETEROSKEDASTICITY-CONSISTENT COVARIANCE MATRIX
R-SQUARE = 0.3690    R-SQUARE ADJUSTED = 0.3340
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.10667
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.32660
SUM OF SQUARED ERRORS-SSE = 1.9200
MEAN OF DEPENDENT VARIABLE = 10.073
LOG OF THE LIKELIHOOD FUNCTION = -4.94492
Notice the work experience coefficient of .2735. This is an elasticity: a one percent change in work experience leads to a .273 percent change in annual earnings. We once again run a White test to determine if this final transformation has dealt with heteroskedasticity.
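The elasticity reading of the log-log coefficient follows directly from differentiating the specification:

```latex
\ln E = \alpha + \beta \ln WE + \varepsilon
\quad\Rightarrow\quad
\beta = \frac{d\ln E}{d\ln WE} = \frac{dE/E}{dWE/WE}
```

so β is the percentage change in earnings associated with a one percent change in work experience.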
TYPE COMMAND:_
REQUIRED MEMORY IS PAR= 3 CURRENT PAR= 500
OLS ESTIMATION
20 OBSERVATIONS DEPENDENT VARIABLE = E2
...NOTE..SAMPLE RANGE SET TO: 1, 20
R-SQUARE = 0.1886    R-SQUARE ADJUSTED = 0.0931
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.21374E-01
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.14620
SUM OF SQUARED ERRORS-SSE = 0.36336
MEAN OF DEPENDENT VARIABLE = 0.96002E-01
LOG OF THE LIKELIHOOD FUNCTION = 11.7021
Detecting Autocorrelation
The Durbin-Watson (DW) test is nearly the exclusive test for autocorrelation. The intuition behind the DW is to test whether successive errors are close to each other, which would suggest autocorrelation is present. The DW lies between 0 and 4, with DW = 2 indicating autocorrelation is not present. Hence, our hypotheses concerning autocorrelation are
H0: Corr(ε_t, ε_{t-1}) = 0
H1: Corr(ε_t, ε_{t-1}) ≠ 0
The formal definition of DW is

    DW = Σ_{t=2}^{T} (ε̂_t − ε̂_{t-1})² / Σ_{t=1}^{T} ε̂_t² ≈ 2(1 − ρ̂)

Therefore, when ρ = 0 (no autocorrelation), DW = 2. The critical values with which we compare the DW were tabulated by Durbin and Watson and are presented in PR on page 610. Notice in the table that two limits are given: dl for the DW's lower critical limit and du for its upper critical limit. The following table is helpful when testing for autocorrelation with the DW:

DW Value            Test Conclusion
0 < DW < dl         Reject null; positive autocorrelation
dl < DW < du        Test inconclusive
du < DW < 2         Fail to reject: no autocorrelation
2 < DW < 4-du       Fail to reject: no autocorrelation
4-du < DW < 4-dl    Test inconclusive
4-dl < DW < 4       Reject null; negative autocorrelation

It's useful to graph the distribution with rejection, inconclusive and non-rejection areas.
A final note regarding the DW is that it deteriorates in the presence of lagged dependent variables. Durbin accounted for this with the h test.
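The DW formula translates directly into code. A minimal sketch with simulated residual series (the data are invented): independent errors give DW near 2, while a strongly persistent series drives DW toward 0.

```python
import numpy as np

def durbin_watson(e):
    """DW = sum_{t=2}^T (e_t - e_{t-1})^2 / sum_{t=1}^T e_t^2, roughly 2(1 - rho)."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(0)
white_noise = rng.normal(size=1000)        # no autocorrelation
random_walk = np.cumsum(white_noise)       # heavy positive autocorrelation

print(durbin_watson(white_noise))          # near 2
print(durbin_watson(random_walk))          # near 0
```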
Examples
We now use the tools presented in this section to assess time series data. The series we use is perhaps the best-known time series, United States GDP. We use annual data from 1959-1994; annual GDP estimates are regressed on time.
TYPE COMMAND:_...SAMPLE RANGE IS NOW SET TO: 1 36
TYPE COMMAND:_OLS GDP year/ resid=e exactdw
REQUIRED MEMORY IS PAR= 13 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = GDP
...NOTE..SAMPLE RANGE SET TO: 1, 36
DURBIN-WATSON STATISTIC = 0.60702
DURBIN-WATSON P-VALUE = 0.000000
R-SQUARE = 0.9922    R-SQUARE ADJUSTED = 0.9919
VARIANCE OF THE ESTIMATE-SIGMA**2 = 14294.
STANDARD ERROR OF THE ESTIMATE-SIGMA = 119.56
SUM OF SQUARED ERRORS-SSE = 0.48599E+06
MEAN OF DEPENDENT VARIABLE = 4269.5
LOG OF THE LIKELIHOOD FUNCTION = -222.269

R-SQUARE = 0.9958    R-SQUARE ADJUSTED = 0.9957
VARIANCE OF THE ESTIMATE-SIGMA**2 = 7575.5
STANDARD ERROR OF THE ESTIMATE-SIGMA = 87.038
SUM OF SQUARED ERRORS-SSE = 0.25757E+06
MEAN OF DEPENDENT VARIABLE = 4269.5
LOG OF THE LIKELIHOOD FUNCTION = -211.166

MODEL SELECTION TESTS - SEE JUDGE ET.AL. (1985, P.242)
 AKAIKE (1969) FINAL PREDICTION ERROR - FPE = 7996.4
 (FPE ALSO KNOWN AS AMEMIYA PREDICTION CRITERION - PC)
 AKAIKE (1973) INFORMATION CRITERION - LOG AIC = 8.9866
 SCHWARZ (1978) CRITERION - LOG SC = 9.0746
MODEL SELECTION TESTS - SEE RAMANATHAN (1992, P.167)
 CRAVEN-WAHBA (1979) GENERALIZED CROSS VALIDATION - GCV = 8021.2
 HANNAN AND QUINN (1979) CRITERION - HQ = 8244.8
 RICE (1984) CRITERION - RICE = 8049.0
 SHIBATA (1981) CRITERION - SHIBATA = 7949.6
 SCHWARTZ (1978) CRITERION - SC = 8730.8
 AKAIKE (1974) INFORMATION CRITERION - AIC = 7995.5

ANALYSIS OF VARIANCE - FROM MEAN
            SS           DF   MS
REGRESSION  0.61688E+08  1.   0.61688E+08
ERROR       0.25757E+06  34.  7575.5
TOTAL       0.61946E+08  35.  0.17699E+07
ANALYSIS OF VARIANCE - FROM ZERO
            SS           DF   MS
REGRESSION  0.71793E+09  2.   0.35896E+09
ERROR       0.25757E+06  34.  7575.5
TOTAL       0.71819E+09  36.  0.19950E+08

VARIABLE  ESTIMATED     STANDARD  T-RATIO        PARTIAL  STANDARDIZED  ELASTICITY
 NAME     COEFFICIENT   ERROR     34 DF P-VALUE  CORR.    COEFFICIENT   AT MEANS
YEAR      125.69        3.794     33.13  0.000   0.985    0.9954        58.1860
CONSTANT  -0.24414E+06  7500.    -32.55  0.000  -0.984    0.0000       -57.1824

VARIANCE-COVARIANCE MATRIX OF COEFFICIENTS
YEAR      14.397
CONSTANT  -28456.       0.56245E+08
          YEAR          CONSTANT

CORRELATION MATRIX OF COEFFICIENTS
YEAR      1.0000
CONSTANT  -0.99998      1.0000
          YEAR          CONSTANT

DURBIN-WATSON = 1.4443   VON NEUMANN RATIO = 1.4855   RHO = 0.23705
RESIDUAL SUM = -87.763   RESIDUAL VARIANCE = 7802.1
SUM OF ABSOLUTE ERRORS = 2351.1
R-SQUARE BETWEEN OBSERVED AND PREDICTED = 0.9957
RUNS TEST: 15 RUNS, 22 POSITIVE, 14 NEGATIVE, NORMAL STATISTIC = -1.1085
DURBIN H STATISTIC (ASYMPTOTIC NORMAL) = 2.0586 MODIFIED FOR AUTO ORDER=1
COEFFICIENT OF SKEWNESS = -1.0163 WITH STANDARD DEVIATION OF 0.3925
COEFFICIENT OF EXCESS KURTOSIS = 1.2358 WITH STANDARD DEVIATION OF 0.7681

GOODNESS OF FIT TEST FOR NORMALITY OF RESIDUALS - 6 GROUPS
OBSERVED  2.0  4.0  8.0   17.0  5.0  0.0
EXPECTED  0.8  4.9  12.3  12.3  4.9  0.8
CHI-SQUARE = 5.9837 WITH 2 DEGREES OF FREEDOM
As demonstrated by the DW statistic, our test result is inconclusive. Let's see how the auto AR(1) command dealt with the model's errors.
It appears that the auto command has done a good job reducing the autocorrelated errors. However, our DW test with the auto command suggests that the test is inconclusive. To improve on these estimates, we want to see how the errors are correlated. A useful tool for observing the correlation between errors in different periods is the autocorrelation function (ACF), plotted with a correlogram. The idea behind the ACF is to plot the correlations between the current period and each lagged period and see which lags have significant correlation. Those lags with significant correlations must be accounted for. Let's look at the OLS autocorrelation function.
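The ACF itself is simple to compute. A generic sketch, not Shazam output (the residual series is invented); the ±2/√T band is the usual approximate 95% significance band:

```python
import numpy as np

def acf(e, max_lag=10):
    """Sample autocorrelations r_k = sum_t e_t e_{t-k} / sum_t e_t^2."""
    e = np.asarray(e, dtype=float) - np.mean(e)
    denom = np.sum(e ** 2)
    return np.array([np.sum(e[k:] * e[:-k]) / denom for k in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
e = rng.normal(size=500)                   # white-noise "residuals"
r = acf(e)
band = 2.0 / np.sqrt(len(e))               # lags outside this band need modeling
print(r)
```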
OLS ACF and PACF

AUTOCORRELATION FUNCTION OF THE SERIES (1-B) (1-B ) E
COMPLEX ROOTS - AUTOREGRESSIVE PROCESS DISPLAYS PSEUDO PERIODIC BEHAVIOUR WITH DAMPED SINE WAVE
R-SQUARE = 0.9964    R-SQUARE ADJUSTED = 0.9963
VARIANCE OF THE ESTIMATE-SIGMA**2 = 6541.0
STANDARD ERROR OF THE ESTIMATE-SIGMA = 80.876
SUM OF SQUARED ERRORS-SSE = 0.22239E+06
MEAN OF DEPENDENT VARIABLE = 4269.5
LOG OF THE LIKELIHOOD FUNCTION = -208.679

MODEL SELECTION TESTS - SEE JUDGE ET.AL. (1985, P.242)
 AKAIKE (1969) FINAL PREDICTION ERROR - FPE = 6904.3
 (FPE ALSO KNOWN AS AMEMIYA PREDICTION CRITERION - PC)
 AKAIKE (1973) INFORMATION CRITERION - LOG AIC = 8.8398
 SCHWARZ (1978) CRITERION - LOG SC = 8.9278
MODEL SELECTION TESTS - SEE RAMANATHAN (1992, P.167)
 CRAVEN-WAHBA (1979) GENERALIZED CROSS VALIDATION - GCV = 6925.7
 HANNAN AND QUINN (1979) CRITERION - HQ = 7118.8
 RICE (1984) CRITERION - RICE = 6949.8
 SHIBATA (1981) CRITERION - SHIBATA = 6864.0
 SCHWARTZ (1978) CRITERION - SC = 7538.4
 AKAIKE (1974) INFORMATION CRITERION - AIC = 6903.6

ANALYSIS OF VARIANCE - FROM MEAN
            SS           DF   MS
REGRESSION  0.61724E+08  1.   0.61724E+08
ERROR       0.22239E+06  34.  6541.0
TOTAL       0.61946E+08  35.  0.17699E+07

ANALYSIS OF VARIANCE - FROM ZERO
            SS           DF   MS
REGRESSION  0.12807E+09  2.   0.64037E+08
ERROR       0.22239E+06  34.  6541.0
TOTAL       0.12830E+09  36.  0.35638E+07

VARIABLE  ESTIMATED     STANDARD  T-RATIO        PARTIAL  STANDARDIZED  ELASTICITY
 NAME     COEFFICIENT   ERROR     34 DF P-VALUE  CORR.    COEFFICIENT   AT MEANS
YEAR      125.83        2.998     41.98  0.000   0.990    0.9965        58.2499
CONSTANT  -0.24442E+06  5925.    -41.25  0.000  -0.990    0.0000       -57.2480

VARIANCE-COVARIANCE MATRIX OF COEFFICIENTS
YEAR      8.9853
CONSTANT  -17759.       0.35102E+08
          YEAR          CONSTANT

CORRELATION MATRIX OF COEFFICIENTS
YEAR      1.0000
CONSTANT  -0.99999      1.0000

DURBIN-WATSON = 1.8599   VON NEUMANN RATIO = 1.9131   RHO = 0.00792
RESIDUAL SUM = -22.386   RESIDUAL VARIANCE = 6852.2
SUM OF ABSOLUTE ERRORS = 2101.6
R-SQUARE BETWEEN OBSERVED AND PREDICTED = 0.9962
RUNS TEST: 19 RUNS, 21 POSITIVE, 15 NEGATIVE, NORMAL STATISTIC = 0.1741
COEFFICIENT OF SKEWNESS = -1.1796 WITH STANDARD DEVIATION OF 0.3925
COEFFICIENT OF EXCESS KURTOSIS = 2.3433 WITH STANDARD DEVIATION OF 0.7681

GOODNESS OF FIT TEST FOR NORMALITY OF RESIDUALS - 6 GROUPS
OBSERVED  1.0  5.0  9.0   16.0  5.0  0.0
EXPECTED  0.8  4.9  12.3  12.3  4.9  0.8
CHI-SQUARE = 2.8661 WITH 2 DEGREES OF FREEDOM
The revised AUTO's DW (1.8599) suggests we have eliminated the autocorrelation, and the revised errors are close to random.
This plot of errors on year is close to a random pattern. Hence, the GLS estimation that accounts for second-order autocorrelation produces good estimates. We can also plot the ACF and PACF for the revised AUTO model to determine whether it accounted for the second-order autocorrelation.
AR(2) AUTO

AUTOCORRELATION FUNCTION OF THE SERIES (1-B) (1-B ) E1