EVALUATION OF THE BIAS OF YULE-WALKER ESTIMATES IN AUTOREGRESSIVE TIME SERIES PROCESSES

A Writing Project Presented to The Faculty of the Department of Mathematics, San José State University

In Partial Fulfillment of the Requirements for the Degree Master of Arts

by Saeid M. Assadi

May 2006
Figure 4.3: Accuracy of the Asymptotic Expected Bias
Since the bias increases as the absolute value of the coefficient α1 increases, we
would like to investigate the relative error, or relative accuracy, of the asymptotic expected
bias. We define the relative accuracy of the asymptotic expected bias as follows:
\[
\text{Relative Accuracy of AE Bias}
= \frac{\text{Observed Bias} - \text{AE Bias}}{\text{True value of coefficient}}
= \frac{\hat{\alpha}_1 - AE(\hat{\alpha}_1)}{\alpha_1}. \tag{4.4}
\]
We plotted the relative accuracy of the bias in figure 4.4. Clearly, the relative
bias increases as we approach the ends of the horizontal axis. Note also that when
α1 = 0, the relative accuracy of the bias function is undefined because of division by
zero.
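The metric in equation (4.4) can be estimated by Monte Carlo. The sketch below is a minimal illustration, assuming the thesis's sign convention y_t + α1·y_{t−1} = ε_t with standard normal innovations; since the AE bias formula from Chapter 3 is not reproduced in this section, it is passed in as a plain number, and the function names are ours.

```python
import numpy as np

def simulate_ar1(alpha1, t_obs, rng):
    """Simulate the zero-mean AR(1) process y_t + alpha1 * y_{t-1} = eps_t."""
    y = np.zeros(t_obs)
    eps = rng.standard_normal(t_obs)
    for t in range(1, t_obs):
        y[t] = -alpha1 * y[t - 1] + eps[t]
    return y

def yule_walker_ar1(y):
    """Yule-Walker estimate alpha1_hat = -r(1)/r(0), using the biased
    autocovariance estimates (sums divided by T_obs)."""
    t_obs = len(y)
    r0 = np.sum(y * y) / t_obs
    r1 = np.sum(y[1:] * y[:-1]) / t_obs
    return -r1 / r0

def relative_accuracy_of_ae_bias(alpha1, ae_bias, t_obs, n_sims, rng):
    """Equation (4.4): (observed bias - AE bias) / alpha1, where the observed
    bias is the mean of (alpha1_hat - alpha1) over n_sims simulations."""
    estimates = [yule_walker_ar1(simulate_ar1(alpha1, t_obs, rng))
                 for _ in range(n_sims)]
    observed_bias = np.mean(estimates) - alpha1
    return (observed_bias - ae_bias) / alpha1
```

As in the text, the metric is undefined at α1 = 0, where the denominator vanishes.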
Figure 4.4: Relative Accuracy of the Expected Bias
Figures 4.5 through 4.7 show how the coefficient α1, the bias of α1, and the bias
accuracy of α1 respectively, vary with sample size for the AR(1) process.
It can be seen that as the sample size increases we obtain estimates for α1
which are closer to the true values. In other words, the coefficient estimates from
the Tobs = 100 simulation are closer to the no-bias line (α̂ = α) than those from the
Tobs = 10 simulation (see figure 4.5). We also see a higher accuracy of the asymptotic
expected bias for the larger sample size. This is expected, since our asymptotic
bias approaches the true expected bias as our observed sample size Tobs approaches
infinity:
\[
\lim_{T_{obs} \to \infty} \text{AE Bias}
= \lim_{T_{obs},\, T \to \infty} \frac{T\, E(\hat{\alpha} - \alpha)}{T_{obs}}
= T\, E(\hat{\alpha} - \alpha)
= \text{Expected Bias}.
\]
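This limiting behavior can be checked numerically: if the expected bias is O(1/Tobs), then Tobs times the mean simulated bias should settle near a constant as Tobs grows rather than drift. A rough Monte Carlo sketch, under the same sign convention and Yule-Walker estimator as above; the sample sizes and simulation counts are illustrative.

```python
import numpy as np

def scaled_mean_bias(alpha1, t_obs, n_sims, rng):
    """Monte Carlo estimate of T_obs * E(alpha1_hat - alpha1) for the
    Yule-Walker estimator of y_t + alpha1 * y_{t-1} = eps_t."""
    biases = np.empty(n_sims)
    for i in range(n_sims):
        y = np.zeros(t_obs)
        eps = rng.standard_normal(t_obs)
        for t in range(1, t_obs):
            y[t] = -alpha1 * y[t - 1] + eps[t]
        r0 = np.sum(y * y) / t_obs
        r1 = np.sum(y[1:] * y[:-1]) / t_obs
        biases[i] = (-r1 / r0) - alpha1
    return t_obs * np.mean(biases)

# If the bias is O(1/T_obs), these values should stay of comparable size
# rather than growing with T_obs.
rng = np.random.default_rng(1)
for t_obs in (25, 100, 400):
    print(t_obs, scaled_mean_bias(-0.5, t_obs, 1000, rng))
```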
Figure 4.5: Sample size comparison for coefficient α1
Figure 4.6: Sample size comparison for bias
Figure 4.7: Sample size comparison for AE bias accuracy
Recall the term B from equation (3.2). From Chapter 3, B is used in the derivation
of the bias expression presented by Shaman and Stine (1988). The assumptions made
around B in deriving the bias expression were:
1. For AR(2) or higher order processes, the matrix B is assumed to have
eigenvalues less than 1 in absolute value in order to use a Taylor series
sum. If the process is AR(1), then this assumption can be restated as:
the value |B| is assumed to be less than 1.
2. The Taylor series expansion of B is truncated by taking all the terms of
higher order than B^2 and grouping them into the term O(T^{-1/2}), which
is then omitted from the expression.
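The role of the eigenvalue condition can be illustrated with a generic matrix: the series (I − B)^{-1} = I + B + B^2 + ⋯ converges only when every eigenvalue of B is less than 1 in absolute value, and the error from truncating after B^2 grows as the eigenvalues approach 1. The matrices below are illustrative examples, not the estimator B from Chapter 3.

```python
import numpy as np

def truncation_error(B, order=2):
    """Max-abs difference between (I - B)^{-1} and the truncated series
    I + B + ... + B^order; the series is valid only when every
    eigenvalue of B has absolute value below 1."""
    eye = np.eye(B.shape[0])
    exact = np.linalg.inv(eye - B)
    approx = sum(np.linalg.matrix_power(B, k) for k in range(order + 1))
    return np.max(np.abs(exact - approx))

# Eigenvalues well inside the unit disk: truncation after B^2 is accurate.
small = truncation_error(np.array([[0.20, 0.0], [0.10, 0.10]]))
# Eigenvalues close to 1: the truncated series is badly off.
large = truncation_error(np.array([[0.95, 0.0], [0.10, 0.90]]))
```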
We have simulated results indicating what values B took on for various values
of α1 (see figure 4.8).
Figure 4.8: Minimum, maximum, mean, and median of |B| vs. α1
For these results, a group of 1000 simulations was run for each given value of
α1 ranging from −1 to 1. The values of α1 were chosen such that the leftmost data point
of each graph represents α1 = −0.999, with a resolution of 0.001 between neighboring
points. Likewise, the rightmost data point of each graph represents α1 = 0.999. The
minimum graph shows that for every α1 there is at least one simulation in the group
which yields a |B| close to zero, with the exception of the edges of the horizontal axis.
The maximum graph shows |B| exceeding 1 for almost all values of α1, indicating
that the assumptions made around B do not hold for at least those instances. Both the
maximum and the minimum graphs show values of |B| that are nearly the same at the
left and right edges of the horizontal axis, indicating hardly any variation between
outputs of each simulation for these values of α1. The mean and median graphs show
a consistent rise of |B| as α1 approaches ±1. This indicates that for α1 near ±1, |B| is
expected to be near 1, which violates the truncation assumption of the infinite
series representation of B.
Figure 4.9: Sample size comparison for mean |B| vs. α1
We can also observe the impact of sample size on the mean of |B| (see figure
4.9). Each curve represents a simulation run for the labeled value of the sample
size Tobs. As the sample size increases, the mean of the 1000 values of |B| moves
closer to zero for all values of α1 except toward the edges of the horizontal axis. No
matter what sample size we used in the simulation, the mean of |B| always
approaches 1 as α1 approaches ±1.
In this section, we have discussed the AR(1) process, focusing on simulated
output. We notice that the coefficients produced from simulation are biased, as expected.
However, when the observed bias is compared to the expected bias given by our
equations in Chapter 3, we notice differences. The differences are mostly apparent
in situations where the coefficient α1 is close to ±1. Additionally, the bias equations
indicate a linear relationship between the true coefficient α and the estimated coeffi-
cient α̂, whereas the simulated output points toward a nonlinear relationship. Finally,
looking into what causes the difference, we notice that the assumptions made around B
are violated in several cases, mainly when the AR coefficient is close to ±1.
4.3 Simulation of the AR(2) Process: Real Roots
Once again we look at the AR(2) process with zero mean:
\[
y_t + \alpha_1 y_{t-1} + \alpha_2 y_{t-2} = \varepsilon_t.
\]
Also, recall from Chapter 3 the characteristic equation where A(z−1) = 0:
\[
z^2 + \alpha_1 z + \alpha_2 = 0.
\]
Our simulation design treats the solutions of the characteristic equation in sep-
arate cases: one simulation for the case where there are real roots, and one for
complex roots.
Figure 4.10: Bias of Alpha 1 and Alpha 2
We will use the labels z1 and z2 to represent each root. Recall from the beginning
of this chapter that we start the simulation by choosing either coefficients or roots
which in turn imply a set of coefficients. In figure 4.10 we can see that the bias
increases as either or both roots increase in absolute value. The contour lines in both
graphs show an exponentially increasing pattern as we approach (z1 = z2 = ±1).
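When the simulation starts from roots rather than coefficients, the implied coefficients follow from factoring the characteristic polynomial: z^2 + α1·z + α2 = (z − z1)(z − z2) gives α1 = −(z1 + z2) and α2 = z1·z2. A small sketch of this mapping (function names are ours):

```python
import numpy as np

def ar2_coefficients_from_roots(z1, z2):
    """Coefficients of z^2 + alpha1*z + alpha2 = (z - z1)(z - z2).
    For a real-valued process the roots are either both real or a
    conjugate pair, so the coefficients come out real."""
    return float(np.real(-(z1 + z2))), float(np.real(z1 * z2))

def root_case(alpha1, alpha2):
    """Classify the roots of z^2 + alpha1*z + alpha2 by the discriminant."""
    return "real" if alpha1 ** 2 - 4 * alpha2 >= 0 else "complex"
```

For example, the conjugate pair 0.3 ± 0.4i yields α1 = −0.6 and α2 = 0.25, which `root_case` classifies as the complex scenario of section 4.4.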
4.4 Simulation of the AR(2) Process: Complex Roots
The solution to the characteristic equation in the AR(2) process may yield
complex roots. Since the roots can be plotted in the complex plane, we will use the
real and imaginary axes to plot all of our observations. Since complex roots come in
conjugate pairs, we will make all of our observations only with respect to the root
with the positive imaginary part. Plotting the conjugate with the negative imaginary
part would yield a mirror image. Figures 4.11 and 4.12 show the bias (α̂ − α) of the
coefficients averaged over 100 simulations.
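The averaging behind these figures can be sketched as follows, assuming standard normal innovations and the thesis's sign convention; for a conjugate pair z, z̄, the identity (x − z)(x − z̄) = x^2 − 2 Re(z) x + |z|^2 gives α1 = −2 Re(z) and α2 = |z|^2. The chosen root and counts are illustrative.

```python
import numpy as np

def simulate_ar2(alpha1, alpha2, t_obs, rng):
    """Simulate the zero-mean AR(2) process
    y_t + alpha1*y_{t-1} + alpha2*y_{t-2} = eps_t."""
    y = np.zeros(t_obs)
    eps = rng.standard_normal(t_obs)
    for t in range(2, t_obs):
        y[t] = -alpha1 * y[t - 1] - alpha2 * y[t - 2] + eps[t]
    return y

def yule_walker_ar2(y):
    """Order-2 Yule-Walker estimates; with this sign convention the
    system is R @ [alpha1, alpha2] = -[r1, r2]."""
    t_obs = len(y)
    r = [np.sum(y[k:] * y[:t_obs - k]) / t_obs for k in range(3)]
    R = np.array([[r[0], r[1]], [r[1], r[0]]])
    return np.linalg.solve(R, -np.array([r[1], r[2]]))

# Bias of (alpha1_hat, alpha2_hat) for one conjugate root pair,
# averaged over 100 simulations of length T_obs = 100.
rng = np.random.default_rng(2)
z = 0.5 + 0.5j                        # root with positive imaginary part
alpha1, alpha2 = -2 * z.real, abs(z) ** 2
estimates = np.array([yule_walker_ar2(simulate_ar2(alpha1, alpha2, 100, rng))
                      for _ in range(100)])
bias = estimates.mean(axis=0) - np.array([alpha1, alpha2])
```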
Figure 4.11: Bias of Alpha 1
The contour lines in figures 4.11 and 4.12 indicate that the bias of both α1 and
α2 increases in absolute value as the roots approach (±1 + 0i). As a comparison to
the real roots scenario, the real (horizontal) axis of these plots is equivalent to the
line with slope equal to 1 in the plots of figure 4.10. In both α1 and α2, the presence
of bias is expected. Shaman and Stine (1988) have produced equations for the bias
which we can compare to the bias observed from simulation (see Chapter 3). Using
Figure 4.12: Bias of Alpha 2
the Accuracy of the Asymptotic Expected Bias metric defined earlier, we can plot
how close the observed bias is to the expected bias. Figures 4.13 and 4.14 show the
difference between the observed and the expected bias using our expression defined
in the previous section (see equation 4.3).
The inaccuracy of the expected bias is greatest at the corners of the
plot (±1 + 0i). We will now consider the possibility that the magnitude of the roots
has an impact on the accuracy of the estimate. Figures 4.15 and 4.16 show that
dividing the bias by the magnitude of the coefficient still results in an increasing value
as the roots approach (±1 + 0i).
Now that we see there is a difference between the expected and observed bias,
we are interested in what role the sample size plays in the accuracy
of the bias formulas. In figures 4.11 through 4.16, our sample size was Tobs = 100;
however, in figures 4.17 and 4.18, we can see the effect of both a larger and a smaller
sample size.
Figure 4.13: Accuracy of Asymptotic Expected Bias of Alpha 1
Figure 4.14: Accuracy of Asymptotic Expected Bias of Alpha 2
For the AR(2) process, the matrix B is 2×2 and yields two eigenvalues. In
the case of complex conjugate roots, we are interested in several different instances of
B, one for every root pair we study. In our program design, we have run simulations
Figure 4.15: Relative Accuracy of Asymptotic Expected Bias for Alpha 1
Figure 4.16: Relative Accuracy of Asymptotic Expected Bias for Alpha 2
for 5055 unique data points in the top half of the unit circle in the complex plane. For
each point in the unit circle we have several simulations, depending on the sample
size. Therefore, at the end of the simulation process, we have a set
Figure 4.17: Accuracy of Asymptotic Expected Bias for Tobs = 10
Figure 4.18: Accuracy of Asymptotic Expected Bias for Tobs = 1000
of many eigenvalues. For each simulated data set, we take the larger of the two
eigenvalues of the computed matrix in absolute value. Then, for every point in the
unit circle, we organize these values in an array and look at the largest value, mean,
and median per point on the unit circle.
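This bookkeeping reduces to a few lines; the B matrices themselves come from the Chapter 3 estimator, which is not reproduced here, so the sketch takes them as given.

```python
import numpy as np

def largest_abs_eigenvalue(B):
    """Larger of the eigenvalues of B in absolute value."""
    return float(np.max(np.abs(np.linalg.eigvals(B))))

def summarize_grid_point(B_matrices):
    """For one point in the unit circle: max, mean, and median of the
    largest |eigenvalue| over all simulated data sets at that point."""
    vals = np.array([largest_abs_eigenvalue(B) for B in B_matrices])
    return {"max": vals.max(), "mean": vals.mean(), "median": np.median(vals)}
```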
Figure 4.19 shows that for a sample size such as Tobs = 100, the largest eigen-
value of the B matrix exceeds 1 at points near the corners (±1 + 0i) of the unit
circle. However, even for a larger sample size, Tobs = 10,000, the largest
eigenvalues still exceed 1 well within the edges of the unit circle (see figure 4.20).
Figure 4.19: Largest Eigenvalue of B for Tobs = 100
Figure 4.20: Largest Eigenvalue of B for Tobs = 10000
Figures 4.21 and 4.22 show the mean eigenvalues for the sample sizes Tobs = 100
and Tobs = 10,000, respectively.
Figure 4.21: Mean of the largest of the pair of Eigenvalues of B for Tobs = 100
Figure 4.22: Mean of the largest of the pair of Eigenvalues of B for Tobs = 10000
Figures 4.23 and 4.24 show the median eigenvalues for the sample sizes Tobs = 100
and Tobs = 10,000, respectively.
Figure 4.23: Median of the largest of the pair of Eigenvalues of B for Tobs = 100
Figure 4.24: Median of the largest of the pair of Eigenvalues of B for Tobs = 10000
For the mean plots, we can see from visual inspection that there is not much
difference in the magnitude of the eigenvalues. However, the larger sample size displays
more defined contour lines. The same can be said of the median plots. Eigenvalues
of B close to 1 or over 1 are certainly not a rare occurrence, as most of the
figures show.
For the AR(2) process, we have examined the real and complex roots cases.
In both scenarios, we have shown how the bias, the accuracy of the expected bias,
and the eigenvalues of the matrix B all increase in magnitude as the roots approach
(Root 1 = 1, Root 2 = 1). Just as with the AR(1) process, this implies that the
Yule-Walker estimates for α in the AR(2) process increase in error in these areas.
Looking toward the AR(3) process and beyond, we can apply a similar methodology
in examining the behavior: find the roots of the characteristic equation, break out
scenarios for each kind of root set (combinations of real and complex roots), and
then examine the bias of each coefficient with respect to one or more of the roots.
4.5 Higher Order AR(p) Processes
We have investigated in detail the AR(p) process for p = 1 and p = 2. In
representing the graphs for p = 3, we need to consider more cases than before. The
characteristic equation is now a third order polynomial with three roots. First, we
can separate cases by the types of roots: one case for all real roots, and
one case for only one real root and two complex roots. Then, for each case, we
need to set up a plotting environment. For the case with all real roots, an example
of the environment could be a three-dimensional coordinate system with each root
represented by an axis. If we were to plot the bias of one of the three coefficients
on this system, we could use contoured or colored surfaces to represent
a change in the bias. This would be quite difficult to observe with ordinary tools.
A simpler plotting environment for this case could be the same two-dimensional one
which was used for p = 2. In this environment, we could only use two of the roots
to be represented by the axes, then the third root would have to be a fixed value
for any given plot. Then, we could look at parameters such as the bias of one of
the coefficients and plot it using contour lines in the same fashion as in the case for
p = 2. This plot would have to be repeated for several different values of the third
root since it is fixed for any given plot. This produces a set of plots which would then
have to be repeated for the other two coefficients to study their biases. Any other
parameters, such as the bias accuracy or maximum eigenvalue, could be plotted
similarly. Lastly, this entire set of plots could be produced once again for the case with
one real and two complex roots. Using this style of studying parameters leads to
several plots for the AR(3) process. Studying higher order processes this way will lead
to an exponentially growing set of plots to examine.
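The scheme above, fixing the third root and scanning the other two, can be organized around Vieta's formulas for the cubic characteristic polynomial z^3 + α1·z^2 + α2·z + α3 = (z − r1)(z − r2)(z − r3). A sketch of the bookkeeping, with illustrative grid values; the inner simulation and estimation step is elided.

```python
import numpy as np

def ar3_coefficients_from_roots(r1, r2, r3):
    """Coefficients of z^3 + alpha1*z^2 + alpha2*z + alpha3
    = (z - r1)(z - r2)(z - r3), by Vieta's formulas."""
    alpha1 = -(r1 + r2 + r3)
    alpha2 = r1 * r2 + r1 * r3 + r2 * r3
    alpha3 = -(r1 * r2 * r3)
    return alpha1, alpha2, alpha3

# One two-dimensional contour plot per fixed value of the third root:
# scan an (r1, r2) grid, derive the coefficients, and record whatever
# parameter (bias, bias accuracy, maximum eigenvalue) is being plotted.
grid = np.linspace(-0.9, 0.9, 7)
for r3 in (-0.8, 0.0, 0.8):          # fixed third root, one plot each
    for r1 in grid:
        for r2 in grid:
            a1, a2, a3 = ar3_coefficients_from_roots(r1, r2, r3)
            # ... simulate, estimate, and record the bias at (r1, r2) ...
```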
CHAPTER 5
DISCUSSION
The Yule-Walker equations are a well-known means of estimating parameters in
stationary autoregressive time series processes. The bias of the Yule-Walker estimates
is also known and documented; however, the documented expression requires an
infinite sample size (see Shaman and Stine 1988). We have approximated the expected
bias using the asymptotic expected bias, which uses a reasonable sample size. We
have seen the results from
Chapter 4 indicating a significant difference between the asymptotic expected bias
produced by the equations and the observed bias from simulation output at
a reasonable sample size. The greatest differences occur when the roots
of the characteristic equation of the process are close to the edges of the unit circle,
where z = ±1 + 0i. As we look further into the derivation of the bias equations, we
can see theoretical problems with the assumptions made around the estimator B. For
reasonable sample sizes, and for roots near z = ±1 + 0i, the estimator B may not have
eigenvalues less than one in absolute value. In these cases, we can see that there are
large differences between the asymptotic expected bias and the true bias of simulated
data sets. For the complete derivation of the bias formulas, see Shaman and Stine
(1988). Additionally, there is other research done on making Yule-Walker estimates
better by tapering (see Crunk 1999).
For further research, we could investigate the behavior of the Yule-Walker
estimates and expected bias calculations for higher order AR(p) processes. For these
processes, the general form of the expected O(1/T) bias may contain many terms and
be irreducible; however, it could still be calculated numerically with ease. Additionally,
we could investigate revising the expected bias formula. Our bias formula was of order
O(1/T); however, we might obtain better results from an O(1/T^2), O(1/T^3), or higher
order expansion in T.
BIBLIOGRAPHY
[BD91] P. J. Brockwell and R. A. Davis, Time series: Theory and methods, 2nd ed., Springer, 1991.
[Bha81] R. J. Bhansali, Effects of not knowing the order of an autoregressive process on the mean squared error of prediction - 1, Journal of the American Statistical Association 76 (1981), no. 375, 588–597.
[Cru99] S. Crunk, Dissertation on tapering to improve Yule-Walker estimation in autoregressive processes, 1999.
[SS88] P. Shaman and R. A. Stine, The bias of autoregressive coefficient estimators, Journal of the American Statistical Association 83 (1988), 842–848.
[SS89] R. Stine and P. Shaman, A fixed point characterization for bias of autoregressive estimators, The Annals of Statistics 17 (1989), no. 3, 1275–1284.
[Zha90] H. C. Zhang, Reduction of the asymptotic bias of autoregressive and spectral estimators by tapering, Journal of Time Series Analysis 13 (1990), no. 5, 451–469.