chap_13_2

Nov 17, 2014


api-19746504

CHAPTER 13 Time Series Forecasting

Figure 13.10 Lagged time series plot of U.S. retail sales of general merchandise stores, for Exercise 13.19. [Plot not reproduced; axes: sales (vertical) against sales lagged one time period (horizontal).]

13.2 Time Series Models

The previous section applied regression methods from Chapters 10 and 11 to time series data. A time period variable was used as the explanatory variable, and the time series was the response variable. Models like these may not satisfy our usual regression assumptions, particularly the independence assumption. Specific models have been developed to aid in the analysis of time series data when usual regression methods are not appropriate.

TIME SERIES MODELS

Time series models use past values of the time series to predict future values of the time series.

In the language of regression, a future time period’s value is our “response” while one or more past time periods’ values (of the same variable) are the explanatory variable(s). This approach is different from using the values of one variable $x$ to predict the values of a second variable $y$. A time series model makes forecasts based on past values of the time series itself. For example, we might forecast next month’s DVD player sales to be some multiple of this month’s DVD player sales. Instead, we might use average DVD player sales for the last three months to predict DVD player sales this month. A wide variety of time series models exist to implement forecasting strategies like these as well as more elaborate forecasting strategies. This section will introduce a few of the most common time series models and illustrate how to use them for forecasting future values of a time series.

Autoregressive models

Can yesterday’s stock price help predict today’s stock price? Can last quarter’s sales be used to predict this quarter’s sales? Sometimes the best explanatory variables are simply past values of the response variable.
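The idea of pairing each value with the value one period earlier, as in a lagged time series plot, can be sketched in a few lines. This is a minimal illustration only: the function names `lagged_pairs` and `correlation` and the `sales` values are hypothetical and do not come from the text or its data files.

```python
import math


def lagged_pairs(series, lag=1):
    """Return (earlier value, current value) pairs: (y[t - lag], y[t])."""
    return [(series[t - lag], series[t]) for t in range(lag, len(series))]


def correlation(pairs):
    """Pearson correlation r between the two coordinates of the pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustrative series, not data from the text.
sales = [100, 120, 115, 140, 150, 165, 160, 180]
pairs = lagged_pairs(sales)   # points for a lagged time series plot
r = correlation(pairs)        # strength of the lag-one linear relationship
print(pairs[:3], round(r, 3))
```

A value of `r` near 1 or near -1 suggests that last period's value carries useful linear information about this period's value.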

Autoregressive time series models take advantage of the linear relationship between successive values of a time series to predict future values of the series.

FIRST-ORDER AUTOREGRESSION MODEL

The first-order autoregression model specifies a linear relationship between successive values of the time series. The shorthand for this model is AR(1), and the equation is

$$y_t = \beta_0 + \beta_1 y_{t-1} + \epsilon_t$$

In Figure 13.9 (page 13-18), we observed a positive, linear relationship between successive residuals from the exponential trend model fit to the DVD player sales data. Autocorrelation in the residuals from a regression model involving time-related variables indicates an AR(1) model might fit the data well.

Figure 13.11 Lagged time series plot of the DVD player sales time series ($r = 0.80$). [Plot not reproduced; axes: units sold (vertical) against units sold lagged one time period (horizontal).]

We have two options for specifying an AR(1) model for the DVD player sales data. We can take the number of units sold per month to be the time series, or since the positive correlation in Figure 13.9 is for the residuals from the exponential trend model, we may want to take the logarithm of the number of units sold per month as the time series. In general, fitting an exponential model $y = ae^{bx}$ is the same as fitting a simple linear regression model with $\log(y)$ as the response variable instead of just $y$. Taking logarithms of both sides reveals a linear model based on $\log(y)$.

$$\log(y) = \log(ae^{bx}) = \log(a) + \log(e^{bx}) = \log(a) + bx$$

Figure 13.11 plots the number of units sold against the number of units sold lagged one period. Figure 13.12 plots the same points after taking logarithms. Figure 13.12 ($r = 0.95$) shows a stronger linear pattern than Figure 13.11 ($r = 0.80$), suggesting an AR(1) model will fit the DVD player sales data better after logarithms have been taken. We will proceed using the transformed time series in an AR(1) model.

Figure 13.12 Lagged time series plot of the log of the DVD player sales time series ($r = 0.95$). [Plot not reproduced; axes: log of units sold (vertical) against log of units sold lagged one time period (horizontal).]

EXAMPLE 13.9 DVD player sales (CASE 13.1)

The AR(1) model for the time series $\log(y_t)$, where $y_t$ is the number of DVD players sold in time period $t$, is

$$\log(y_t) = \beta_0 + \beta_1 \log(y_{t-1}) + \epsilon_t$$

Statistical software calculates the fitted model to be

$$\log(\hat{y}_t) = 0.43717 + 0.96486 \log(y_{t-1})$$

The model for DVD player sales in Example 13.9 looks like the simple linear regression models found in Chapter 10 except that the explanatory variable is the same as the response variable but lagged one time period. Despite the similarity in the form of the models, the details that made the least-squares line the best line in the setting of Chapter 10 are no longer valid with the autoregression model in Example 13.9. Most statistical software will estimate $\beta_0$ and $\beta_1$ in an autoregressive model using a method other than least-squares. The most common alternative to least-squares estimation is called maximum likelihood estimation or simply MLE. The details of maximum likelihood estimation need not concern us. The general principle of MLE is to use the parameter estimates that are most in agreement with our data. The agreement between parameter estimates and our data is measured in terms of “likelihood.” Maximum likelihood estimates are the answer to the question “For what values of $\beta_0$ and $\beta_1$ are my data most likely to appear in a random sample?”
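For contrast with maximum likelihood, the plain least-squares fit of each value on the previous value can be sketched as follows. This is explicitly not the MLE procedure that time series software uses; it is only the simple linear regression on the lagged series. The function name `fit_lagged_least_squares` and the noise-free test series are hypothetical choices made for illustration.

```python
def fit_lagged_least_squares(series):
    """Least-squares intercept and slope for y[t] regressed on y[t-1].

    This is ordinary simple linear regression on lagged pairs,
    not maximum likelihood estimation.
    """
    x = series[:-1]   # lagged values y[t-1]
    y = series[1:]    # current values y[t]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical noise-free series built from y[t] = 2 + 0.5 * y[t-1],
# so the fit should recover an intercept of 2 and a slope of 0.5.
series = [10.0]
for _ in range(8):
    series.append(2 + 0.5 * series[-1])

b0, b1 = fit_lagged_least_squares(series)
print(b0, b1)
```

With real, noisy data the least-squares and maximum likelihood estimates generally differ, which is why dedicated time series software is the safer choice for fitting.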

For the AR(1) model of Example 13.9, the MLE for $\beta_1$ is not too different from the least-squares estimate we would obtain by fitting a simple linear regression model to the time series. The least-squares estimate of $\beta_1$ is 0.94963 compared to the MLE of 0.96486. However, the intercept estimates differ more. The least-squares intercept estimate is 0.69 compared to the 0.43717 estimated via maximum likelihood. A difference of this magnitude will affect predictions you make with your model, so be sure to use software that was designed to estimate time series models correctly. Using simple linear regression to fit an AR(1) model is acceptable only if the time series has a mean near zero. The mean of the logarithm of the DVD player sales data is 12.5, so we need software that fits the AR(1) model using maximum likelihood estimation.

EXAMPLE 13.10 Forecasting July DVD player sales (CASE 13.1)

Our DVD player sales time series ends with June 2002 sales of 1,617,098 units. June 2002 is the 63rd month of our time series, so our notation is $y_{63} = 1{,}617{,}098$. The AR(1) model of Example 13.9 relates DVD player sales for July 2002 to DVD player sales in June 2002, so we can forecast July’s sales from the reported number of units sold in June.

The model fitted in Example 13.9 is for the time series $\log(y_t)$. This creates one extra step in our task of forecasting July 2002 DVD player sales. Our model will forecast $\log(y_{64})$ and we will need to calculate the forecast of $y_{64}$ as the final step.

First, we use the model to forecast the logarithm of July 2002 DVD player sales.

$$\begin{aligned}
\log(\hat{y}_{64}) &= 0.43717 + 0.96486 \log(y_{63}) \\
&= 0.43717 + 0.96486 \log(1{,}617{,}098) \\
&= 0.43717 + (0.96486)(14.2961437) \\
&= 14.23095
\end{aligned}$$

Second, use your calculator’s $e^x$ button (or use software) to calculate the forecast of July 2002 DVD player sales.

$$\begin{aligned}
\log(\hat{y}_{64}) &= 14.23095 \\
\hat{y}_{64} &= e^{14.23095} = 1{,}515{,}036
\end{aligned}$$

Our AR(1) model for DVD player sales predicts sales of over 1.5 million units for July 2002.

In Case 13.1 (page 13-7), the exponential trend model predicted sales of 2.4 million units, and in Example 13.5 (page 13-14), the exponential trend-and-season model predicted sales of 2.1 million units. We noted that July sales are always below June sales (except in 1998), but that both these forecasts are greater than the June 2002 sales of 1.6 million. Our AR(1) model’s forecast is an improvement in this respect because the forecast of 1.5 million units is below June 2002 sales of 1.6 million units.
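The two steps of Example 13.10 (forecast on the log scale, then undo the logarithm) can be sketched as a small function. The coefficients and the June 2002 value are those quoted in the text; the function name `ar1_log_forecast` is a hypothetical label, and natural logarithms are assumed because the text inverts the log with the $e^x$ button.

```python
import math

B0 = 0.43717   # fitted intercept from Example 13.9
B1 = 0.96486   # fitted slope from Example 13.9


def ar1_log_forecast(previous_value, b0=B0, b1=B1):
    """One-step forecast from an AR(1) model fitted to log(y)."""
    log_forecast = b0 + b1 * math.log(previous_value)  # step 1: log scale
    return math.exp(log_forecast)                      # step 2: back to units


june_2002 = 1_617_098
july_2002_forecast = ar1_log_forecast(june_2002)
print(round(july_2002_forecast))  # close to the 1,515,036 in Example 13.10
```

The printed value differs from the text's figure by a few units only because the text rounds the log-scale forecast to 14.23095 before exponentiating.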

Our forecast of July 2002 DVD player sales was based on an observed data value: June 2002 sales is a known value in our time series. If we wish to forecast August 2002 DVD player sales, we will have to base our forecast on an estimated value because July 2002 sales is not a known value in our time series.

EXAMPLE 13.11 Forecasting August DVD player sales (CASE 13.1)

Following Example 13.10, we use the model to forecast the logarithm of August 2002 DVD player sales. Because our time series ends with $y_{63}$, the value of $y_{64}$ is not known. In its place, we will use the value of $\hat{y}_{64}$ calculated in Example 13.10.

$$\begin{aligned}
\log(\hat{y}_{65}) &= 0.43717 + 0.96486 \log(\hat{y}_{64}) \\
&= 0.43717 + 0.96486 \log(1{,}515{,}036) \\
&= 0.43717 + (0.96486)(14.23095) \\
&= 14.16804
\end{aligned}$$

Finally, use your calculator’s $e^x$ button or software to calculate the forecast of August 2002 DVD player sales.

$$\begin{aligned}
\log(\hat{y}_{65}) &= 14.16804 \\
\hat{y}_{65} &= e^{14.16804} = 1{,}422{,}662
\end{aligned}$$

We can continue in this manner to forecast DVD player sales for other future months.

In Chapters 10 and 11, we calculated prediction intervals for the response in our model. Associated with time series forecasts, your software may provide prediction intervals for future time periods. These intervals can be used and interpreted like the prediction intervals we saw in Chapters 10 and 11 although we will not present the details of their calculation.

The widths of time series prediction intervals generally increase as the time period of the prediction moves further beyond the end of the known time series. Predictions far into the future are subject to more uncertainty than predictions close to the end of the time series. Examples 13.10 and 13.11 demonstrate why this is true. Our forecast for July 2002 DVD player sales ($\hat{y}_{64}$) depends on our estimated values of $\beta_0$ and $\beta_1$ and the known value of $y_{63}$. However, our forecast of $y_{65}$ depends on estimated values of $\beta_0$, $\beta_1$, and $y_{64}$. The additional uncertainty involved in estimating $y_{64}$ will make our prediction interval for $y_{65}$ wider than our prediction interval for $y_{64}$. Statistical software provides the prediction intervals in the following table. The widths have been calculated. Notice how the widths increase as we predict further into the future.
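The recursion used in Examples 13.10 and 13.11, where each forecast is fed back in as the "previous value" for the next one, can be sketched as follows. The coefficients and the June 2002 value come from the text; the function name `ar1_log_forecast_path` is hypothetical, and the sketch produces point forecasts only, not the prediction intervals that software reports.

```python
import math


def ar1_log_forecast_path(last_observed, steps, b0=0.43717, b1=0.96486):
    """Point forecasts for the next `steps` periods from an AR(1) on log(y).

    Each forecast after the first is based on the previous forecast,
    because the true previous value is not yet known.
    """
    forecasts = []
    previous = last_observed
    for _ in range(steps):
        previous = math.exp(b0 + b1 * math.log(previous))
        forecasts.append(previous)
    return forecasts


# June 2002 sales (y_63) from the text; forecast July and August 2002.
path = ar1_log_forecast_path(1_617_098, steps=2)
print([round(value) for value in path])
```

The two values land within a few units of the 1,515,036 and 1,422,662 worked by hand; the small gaps come from rounding intermediate results in the text.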

Time Period        Lower 95% PI    Upper 95% PI    Width
July 2002          678,324         3,383,804       2,705,480
August 2002        465,754         4,345,553       3,879,799
September 2002     349,161         5,133,999       4,784,838
October 2002       274,628         5,805,884       5,531,256
November 2002      223,067         6,384,087       6,161,020
December 2002      185,553         6,881,936       6,696,383
January 2003       157,263         7,309,011       7,151,748
February 2003      135,342         7,673,072       7,537,730
March 2003         117,982         7,980,825       7,862,843
April 2003         103,989         8,238,248       8,134,259
May 2003           92,539          8,450,757       8,358,218
June 2003          83,050          8,623,283       8,540,233

Autoregression models are useful when future values of a time series depend linearly on past values of the same time series. With software handling the calculation details, the concepts introduced in Chapters 10 and 11 for regression models apply equally well for autoregression models.

APPLY YOUR KNOWLEDGE

13.20 Existing home sales. The National Association of Realtors tracks monthly sales of existing homes in the United States. The file ex13_020.dat on the CD that accompanies this book has the existing home sales time series beginning in January 1968 and ending with July 2001. Use statistical software to analyze this time series.

(a) Make a time plot of the existing home sales time series.

(b) Describe any important features of the time series. Is there a strong, clear trend? If so, describe it. What about seasonal variation?

(c) Make a lagged time series plot (see Exercise 13.19 for an example) of the existing home sales time series. Does this plot suggest that an AR(1) model is appropriate for this time series? Why or why not?

13.21 A model for existing home sales. For this exercise, let $y_t$ denote the number of existing homes sold (in thousands of units) during time period $t$. Use statistical software and the ex13_020.dat data file.

(a) Fit a simple linear regression model using $y_t$ as the response variable and $y_{t-1}$ as the explanatory variable. Record the estimated regression equation. Use this model to forecast August 2001 existing home sales.

(b) Fit an AR(1) model to the existing home sales time series. Record the estimated autoregression equation. Use this model to forecast August 2001 existing home sales.

(c) Compare the two estimated equations as well as the August 2001 forecasts from parts (a) and (b). State which of the two estimated models is preferred and briefly explain why.

Moving average models

The autoregression model AR(1) looks back one time period and uses that value of the time series in the forecast for the current time period. This works well when consecutive values of the time series are linearly related, but not all time series fit this pattern. With some time series, our forecasts can be improved by using the average of many past time periods. Moving average models use the average of the last several values of the time series to forecast the next value.

MOVING AVERAGE FORECAST MODEL

The moving average forecast model uses the average of the last $k$ values of the time series as the forecast for time period $t$. The equation is

$$\hat{y}_t = \frac{y_{t-1} + y_{t-2} + \cdots + y_{t-k}}{k}$$

The number of preceding values included in the moving average is called the span of the moving average.

Some care should be taken in choosing the span for a moving average forecast model. As a general rule, larger spans “smooth” the time series more than smaller spans by averaging many ups and downs in each calculation. Smaller spans tend to follow the ups and downs of the time series. With seasonal data, the length of the season is often used for the value of $k$.

Figure 13.13 Moving averages based on a span of 4 (purple) overlaying the JCPenney quarterly sales data (red), for Example 13.12. [Plot not reproduced; axes: JCPenney sales in millions of dollars (vertical) against year-quarter, 1996 first quarter through 2002 first quarter (horizontal).]

EXAMPLE 13.12 Moving averages for JCPenney sales

Exercise 13.1 (page 13-5) asked you to plot and comment on quarterly JCPenney sales data. Figure 13.13 displays the JCPenney sales time series with the moving average forecast model based on a span of 4 overlaid. Note that the seasonal pattern in the time series is not present in the moving averages. Our moving averages are a smoothed version of our original time series. The moving averages follow the general movements of the time series but not every up and down.

The JCPenney sales time series begins with the first quarter of 1996 and ends with the fourth quarter of 2001 for a total of 24 quarters. Figure 13.13 includes the forecasted value for the first quarter of 2002. The forecast is calculated as

$$\hat{y}_{25} = \frac{y_{24} + y_{23} + y_{22} + y_{21}}{4} = \frac{32{,}004}{4} = 8001$$

When a strong seasonal pattern is present, moving average models with the span set equal to the length of the season are similar to the trend-only models presented in Section 13.1. The moving averages will follow the long-run trend of the time series, but the ups and downs of the seasons are ignored. We would not expect the moving average model of Example 13.12 to forecast fourth-quarter JCPenney sales well.

Figure 13.14 Chicago Cubs annual attendance per game time series (red) with 5-year (purple) and 20-year (black) moving averages overlaid, for Example 13.13. [Plot not reproduced; axes: attendance per game (vertical) against year, 1920 through 2000 (horizontal).]

EXAMPLE 13.13 Chicago Cubs attendance per game

Major League Baseball’s Chicago Cubs have been playing their home games at Wrigley Field since 1916. The file eg13_013.dat on the CD that accompanies this book has the “attendance per home game” time series beginning in 1916 and ending in 2001.

Figure 13.14 displays the time series (red), moving averages based on a span of 5 years (purple), and moving averages based on a span of 20 years (black). The moving averages based on a span of 20 years highlight the general upward trend in attendance per home game at Wrigley Field without following the ups and downs in the data. The 5-year moving averages are not as smooth as the 20-year moving averages. The 5-year moving averages follow the larger ups and downs while smoothing the smaller changes in the time series.

The attendance time series has 86 observations ending with attendance per home game in 2001. Our interest is in forecasting attendance per home game for 2002. We will do this using both the 5-year moving averages and the 20-year moving averages.

$$\hat{y}_{87} = \frac{y_{86} + y_{85} + \cdots + y_{82}}{5} = \frac{162{,}522}{5} = 32{,}504$$

$$\hat{y}_{87} = \frac{y_{86} + y_{85} + \cdots + y_{67}}{20} = \frac{553{,}681}{20} = 27{,}684$$

The forecast based on the 20-year moving average is less than the forecast based on the 5-year moving average because the 20-year average includes the lower attendance figures of the 1980s. We might guess from looking at Figure 13.14 that the 5-year moving average forecast will be more accurate than the 20-year moving average forecast. In this case, actual 2002 attendance figures for Wrigley Field are now available. Total attendance was 2,693,096 for 81 home games, so the average attendance per home game was 2,693,096/81 or 33,248. The 5-year moving average forecast was more accurate.

Once a new value becomes available (like 2002 attendance figures in Example 13.13), we update our time series to include the new observation. We can then calculate a forecast for the next time period (2003 attendance per home game in Example 13.13).

APPLY YOUR KNOWLEDGE

13.22 Winter wheat prices. The United States Department of Agriculture (USDA) tracks prices received by Montana farmers for winter wheat crops. The prices are tracked monthly in dollars per bushel. The file ex13_022.dat on the CD that accompanies this book has the wheat prices time series beginning in July 1929 and ending with October 2002 (880 months). Use statistical software to analyze this time series.

(a) Make a time plot of the wheat prices time series.

(b) Describe any important features of the time series. Be sure to comment on trend, seasonal patterns, and significant shifts in the series.

(c) Calculate 12-month moving averages and plot them on your time series plot from part (a).

(d) Calculate 120-month moving averages and plot them together with the time series from part (a) and the 12-month moving averages from part (c).

(e) Compare the 12-month and 120-month moving averages. Which features of the wheat prices time series does each capture? Which features does each smooth?

13.23 Forecasting winter wheat prices.

(a) Using the winter wheat prices from the previous exercise, calculate and compare the 12-month and 120-month moving average forecasts for November 2002.

(b) Record the actual winter wheat price received by Montana farmers for November 2002 from the Web page www.nass.usda.gov/mt/economic/prices/wwheatpr.htm. Which moving average forecast model provided the most accurate forecast for November 2002?
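The moving average forecasts worked in Examples 13.12 and 13.13 amount to averaging the last $k$ observations. A minimal sketch follows; the function name `moving_average_forecast` is hypothetical, and the `quarterly_sales` values are made up for illustration (they are not the JCPenney data), chosen only so that the last four sum to the 32,004 total quoted in Example 13.12.

```python
def moving_average_forecast(series, k):
    """Forecast the next value as the mean of the last k observations."""
    if k < 1 or k > len(series):
        raise ValueError("span k must be between 1 and len(series)")
    return sum(series[-k:]) / k


# Hypothetical quarterly values; the last four sum to 32,004 to mirror
# the arithmetic of Example 13.12. These are NOT the actual sales figures.
quarterly_sales = [7000, 7500, 7200, 9800, 7300, 7600, 7400, 9704]

next_quarter = moving_average_forecast(quarterly_sales, k=4)
print(next_quarter)

# When a new observation arrives, append it and forecast again.
quarterly_sales.append(7450)  # hypothetical new value
following_quarter = moving_average_forecast(quarterly_sales, k=4)
```

Note that updating the forecast requires keeping the last `k` observations on hand, which matters when comparing this method with alternatives that need less history.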

Exponential smoothing models

Moving average forecast models appeal to our intuition. Using the average of several of the most recent data values to forecast the next value of the time series is easy to understand conceptually. However, two criticisms can be made against moving average models. First, our forecast for the next time period ignores all but the last $k$ observations in our data set. If you have 100 observations and use a span of $k = 5$, your forecast will not use 95% of your data! Second, the data values used in our forecast are all weighted equally. In many settings, the current value of a time series depends more on the most recent value and less on past values. We may improve our forecasts if we give the most recent values greater “weight” in our forecast calculation. Exponential smoothing models address both these criticisms.

There are several variations on the basic exponential smoothing model. We will look at the details of the simple exponential smoothing model, which we will refer to as, simply, the exponential smoothing model. More complex variations exist to handle time series with specific features, but the details of these models are beyond the scope of this chapter. We will only mention the scenarios for which these more complex models are appropriate.

EXPONENTIAL SMOOTHING MODEL

The exponential smoothing model uses a weighted average of the observed value $y_{t-1}$ and the forecasted value $\hat{y}_{t-1}$ as the forecast for time period $t$. The forecasting equation is

$$\hat{y}_t = w y_{t-1} + (1 - w)\hat{y}_{t-1}$$

The weight $w$ is called the smoothing constant for the exponential smoothing model. The smoothing constant is a value between 0 and 1. Choosing $w$ close to 1 puts more weight on the most recent value.

Choosing the smoothing constant $w$ in the exponential smoothing model is similar to choosing the span $k$ in the moving average model: both relate directly to the smoothness of the model. Smaller values of $w$ correspond to greater smoothing of the ups and downs in the time series. Larger values of $w$ put most of the weight on the most recent observed value, so the forecasts tend to follow the ups and downs of the series more closely.

EXAMPLE 13.14 Philip Morris returns

Table 1.10 (page 49) gives the monthly returns on Philip Morris stock for the period from June 1990 to July 2001 (134 months). Figure 13.15 displays the returns (red), the exponential smoothing model with $w = 0.5$ (purple), and the exponential smoothing model with $w = 0.1$ (black).

Using a smoothing constant of 0.5 puts more weight on the most recent values of the time series than does using a smoothing constant of 0.1. As a result, the purple curve tends to follow the ups and downs of the time series more closely than the smoother black curve. With a smoothing constant close to 0, the model will follow only the major changes in the time series.

Figure 13.15 Philip Morris monthly stock returns (red) with two overlays, exponential smoothing model with $w = 0.5$ (purple) and exponential smoothing model with $w = 0.1$ (black), for Example 13.14. [Plot not reproduced; axes: monthly percent return (vertical) against year.month, 1990.06 through 2001.06 (horizontal).]

A little algebra is needed to see that exponential smoothing models address the criticisms of moving average models. We will start with the forecasting equation for the exponential smoothing model and imagine forecasting the value of the time series for the time period $n + 1$ where $n$ is the number of observed values in the time series. A specific example would be forecasting the August 2001 Philip Morris return using the observed returns for the preceding 134 months.

$$\begin{aligned}
\hat{y}_{n+1} &= w y_n + (1 - w)\hat{y}_n \\
&= w y_n + (1 - w)[w y_{n-1} + (1 - w)\hat{y}_{n-1}] \\
&= w y_n + (1 - w) w y_{n-1} + (1 - w)^2 \hat{y}_{n-1} \\
&= w y_n + (1 - w) w y_{n-1} + (1 - w)^2 [w y_{n-2} + (1 - w)\hat{y}_{n-2}] \\
&= w y_n + (1 - w) w y_{n-1} + (1 - w)^2 w y_{n-2} + (1 - w)^3 \hat{y}_{n-2} \\
&\;\;\vdots \\
&= w y_n + (1 - w) w y_{n-1} + (1 - w)^2 w y_{n-2} + \cdots + (1 - w)^{n-2} w y_2 + (1 - w)^{n-1} y_1
\end{aligned}$$

Careful substitution and multiplication reveal a version of the forecast equation showing exactly how our forecast depends on the values of the time series. First, notice that the calculation of the forecast for $y_{n+1}$ uses all available values of the time series $y_1, y_2, \ldots, y_n$, not just the most recent $k$ values as a moving average model would. Second, the values are not equally weighted. Using a value of $w$ close to 1 puts greater weight on the most recent observation. In this version of the forecast equation, we also see how the model gets its name. The coefficients in the forecast model decrease exponentially in value as you read the equation from left to right (with the exception of the last coefficient, $(1 - w)^{n-1}$). Exercise 13.25 has you explore what happens to the coefficients as you change the value of the smoothing constant $w$.

While the second version of our forecasting equation reveals some important properties of the exponential smoothing model, it is easier to use the first version of the equation for calculating forecasts.

EXAMPLE 13.15 Forecasting Philip Morris returns

Consider forecasting the Philip Morris August 2001 return using an exponential smoothing model with $w = 0.5$.

$$\hat{y}_{135} = 0.5 y_{134} + (1 - 0.5)\hat{y}_{134}$$

We need the forecasted value $\hat{y}_{134}$ to finish our calculation. However, to calculate $\hat{y}_{134}$ we will need the forecasted value of $\hat{y}_{133}$! In fact, this pattern continues, and we need to calculate all past forecasts before we can calculate $\hat{y}_{135}$. We will calculate the first few forecasts here and leave the remaining calculations for software. The forecast for the first time period is always taken to be the actual value of the time series in the first time period.

$$\begin{aligned}
\hat{y}_1 &= y_1 \\
\hat{y}_2 &= 0.5 y_1 + (1 - 0.5)\hat{y}_1 = (0.5)(3) + (0.5)(3) = 3 \\
\hat{y}_3 &= 0.5 y_2 + (1 - 0.5)\hat{y}_2 = (0.5)(-5.7) + (0.5)(3) = -1.35 \\
\hat{y}_4 &= 0.5 y_3 + (1 - 0.5)\hat{y}_3 = (0.5)(1.2) + (0.5)(-1.35) = -0.075
\end{aligned}$$

Software continues our calculations to arrive at a forecast for $y_{134}$ of $-3.812$. We use this value to complete our forecast calculation for $y_{135}$.

$$\begin{aligned}
\hat{y}_{135} &= 0.5 y_{134} + (1 - 0.5)\hat{y}_{134} \\
&= (0.5)(4.2) + (0.5)(-3.812) \\
&= 0.194
\end{aligned}$$

Our model forecasts a 0.194% return for Philip Morris stock in August 2001.

With the forecasted value for August 2001 from Example 13.15, calculating the forecast for September 2001 requires only one calculation.

$$\hat{y}_{136} = (0.5)(y_{135}) + (0.5)(0.194)$$
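The recursive forecasting equation and its expanded, weighted form can be checked against each other in a short sketch. The three returns (3, -5.7, 1.2) and the values 4.2 and -3.812 are those appearing in the hand calculations of Example 13.15; the function names `exponential_smoothing_forecasts` and `explicit_weights` are hypothetical.

```python
def exponential_smoothing_forecasts(series, w):
    """Return [yhat_1, ..., yhat_{n+1}], taking yhat_1 = y_1."""
    forecasts = [series[0]]
    for y in series:
        # yhat_{t+1} = w * y_t + (1 - w) * yhat_t
        forecasts.append(w * y + (1 - w) * forecasts[-1])
    return forecasts


def explicit_weights(n, w):
    """Coefficients of y_n, y_{n-1}, ..., y_2, y_1 in the expanded forecast."""
    weights = [w * (1 - w) ** j for j in range(n - 1)]
    weights.append((1 - w) ** (n - 1))  # last coefficient, on y_1
    return weights


returns = [3.0, -5.7, 1.2]   # y_1, y_2, y_3 from Example 13.15
w = 0.5
forecasts = exponential_smoothing_forecasts(returns, w)  # yhat_1..yhat_4

# Same forecast for period 4 from the expanded, weighted form.
weights = explicit_weights(len(returns), w)
expanded = sum(c * y for c, y in zip(weights, reversed(returns)))

# Final step of Example 13.15, using the software-supplied yhat_134.
august_2001 = w * 4.2 + (1 - w) * (-3.812)
print(forecasts, expanded, round(august_2001, 3))
```

Both routes give the same forecast, but the recursive version only needs last period's observation and last period's forecast, which is why it is the practical choice.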


Page 13: chap_13_2

Time Series Forecasting13-34 CHAPTER 13 �

APPLY YOURKNOWLEDGE

n

13.24 Philip Morris returns.

13.25 It’s exponential.

. . .

. . .

. . .

ˆ ˆ

135

1 4

2 2

Once we observe the actual Philip Morris return for August 2001, we can enter that value into the forecast equation above. Updating

forecasts from exponential smoothing models requires only that we keeptrack of last period’s forecast and last period’s observed value. In contrast,moving average models require that we keep track of the last observedvalues of the time series. For this reason, exponential smoothing modelsare often preferred over moving average models especially if data storage isan issue.

The exponential smoothing model is best suited for forecasting time se-ries with no strong trend or seasonal variation. Variations on the exponentialsmoothing model have been developed to handle time series with a trend(double exponential smoothing and Holt’s exponential smoothing), with sea-sonality (seasonal exponential smoothing), and with both trend and season-ality (Winters’ exponential smoothing). Your software may offer one ormore of these smoothing models.

y , , yw .

w.

w .

w .

w, w w, w w, , w w

n

w .

w .

w .

, , , ,

w

y

k

(a) Using the Philip Morris returns in Table 1.10 (page 49), calculate thefirst 4 forecasts using an exponential smoothing model with

0 1. Do not use statistical software for these calculations.

(b) Now, use software to fit an exponential smoothing model with0 1. Use the forecasts provided by the software to verify your handcalculations in part (a). Are your forecasts the same as those providedby your software?

(c) Provide a forecast for the August 2001 Philip Morris return based onyour exponential smoothing model with 0 1 and compare this tothe forecast calculated in Example 13.15.

(d) Write down the forecast equation for the September 2001 Philip Morrisreturn based on the exponential smoothing model with 0 1.

Exponential smoothing models are so named because thecoefficients

(1 ) (1 ) (1 )

decrease in value exponentially. For this exercise, take 11. Use softwareto do the calculations.

(a) Calculate the coefficients for a smoothing constant of $w = 0.1$.

(b) Calculate the coefficients for a smoothing constant of $w = 0.5$.

(c) Calculate the coefficients for a smoothing constant of $w = 0.9$.

(d) Plot each set of coefficients from parts (a), (b), and (c). The coefficient values should be measured on the vertical axis while the horizontal axis can simply be numbered 1, 2, ..., 9, 10 for the 10 coefficients from each part. Be sure to use a different plotting symbol and/or color to distinguish the three sets of coefficients and connect the points for each set. Also, label the plot so that it is clear which curve corresponds to each value of $w$ used.

(e) Describe each curve in part (d). Which curve puts more weight on the most recent value of the time series when you are calculating a forecast?

(f) The coefficient of $y_1$ in the exponential smoothing model is $(1-w)^{t-1}$. Calculate the coefficient of $y_1$ for each of the values of $w$ in parts (a), (b), and (c). How do these values compare to the first 10 coefficients you calculated for each value of $w$? Which value of $w$ puts the greatest weight on $y_1$ when you are calculating a forecast?

BEYOND THE BASICS: SPLINE FITS

Modern computing capabilities have made possible many statistical tools that would otherwise be impossible or impractical. One such tool is the general purpose smoothing spline. A spline curve can be fit to any $(x, y)$ data set. For a time series, we take $x$ to be the time variable and $y$ to be the time series values. A spline curve is a single curve consisting of a large number of polynomial curves pieced together in an optimal manner. The more polynomials used, the more flexible and less smooth the spline curve is.

The smoothness of a spline fit is determined by the choice of a positive-valued smoothing constant, similar to the choice of $w$ in the exponential smoothing model. The spline smoothing constant is often denoted by the lowercase Greek letter lambda ($\lambda$). Choosing a value of $\lambda$ close to zero results in a very flexible and not very smooth spline curve. As greater values of $\lambda$ are used, the spline curve becomes less flexible and more smooth.

Figure 13.16 Spline curve models for three values of the smoothing constant $\lambda$: 0.000001 (red), 10,000 (purple), and 1,000,000 (black). Axes: Value versus Time.

Figure 13.16 displays three spline curves. The red spline curve offers the least amount of smoothing for the data ($\lambda = 0.000001$). With $\lambda$ nearly 0, the spline curve passes through every point. The purple spline curve is based on $\lambda = 10{,}000$. This spline curve follows the ups and downs of the time series without trying to pass through each point. The black spline curve offers the highest degree of smoothing with $\lambda = 1{,}000{,}000$. The black spline curve reveals the overall pattern in the time series without following the smaller ups and downs.

Spline fits can help you quickly identify overall trends and seasonal variation as well as less regular patterns in your time series data. They are a powerful, modern exploratory data analysis tool.
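The trade-off controlled by $\lambda$ can be illustrated without a spline library. The sketch below is not a smoothing spline. It swaps in a simpler discrete penalized smoother (a Whittaker-type smoother) that chooses fitted values $z_i$ to minimize $\sum (y_i - z_i)^2 + \lambda \sum (z_{i-1} - 2z_i + z_{i+1})^2$, so that $\lambda$ plays the same role: near zero the fit reproduces the data, and large values force a nearly straight fit. The data, the function names, and the choice of penalty are all assumptions made for illustration, and the numeric values of $\lambda$ are not comparable to those in Figure 13.16.

```python
def second_difference_penalty(n):
    """Return D^T D as a dense n x n matrix, where D takes second differences."""
    penalty = [[0.0] * n for _ in range(n)]
    for row in range(n - 2):
        stencil = {row: 1.0, row + 1: -2.0, row + 2: 1.0}
        for i, a in stencil.items():
            for j, b in stencil.items():
                penalty[i][j] += a * b
    return penalty


def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def penalized_smooth(y, lam):
    """Fitted values solving (I + lam * D^T D) z = y."""
    n = len(y)
    penalty = second_difference_penalty(n)
    system = [[(1.0 if i == j else 0.0) + lam * penalty[i][j] for j in range(n)]
              for i in range(n)]
    return solve(system, y)


def roughness(z):
    """Sum of squared second differences; smaller means smoother."""
    return sum((z[i - 1] - 2 * z[i] + z[i + 1]) ** 2 for i in range(1, len(z) - 1))


# Hypothetical time series values, for illustration only.
series = [3.1, 2.4, 3.8, 2.9, 4.2, 3.3, 4.9, 3.7, 5.1, 4.4]

roughness_by_lambda = {}
for lam in (0.000001, 10.0, 1_000_000.0):
    fitted = penalized_smooth(series, lam)
    roughness_by_lambda[lam] = roughness(fitted)
    print(f"lambda = {lam:g}: roughness = {roughness_by_lambda[lam]:.6g}")
```

The printed roughness shrinks as $\lambda$ grows, which mirrors the progression from the red curve to the black curve described above.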

SECTION 13.2 SUMMARY

Regression models relating a time series to a time variable are not always appropriate. Specifically, the residuals from such a model may exhibit high correlation. Time series models use past values of the time series to forecast future values of the time series.

The first-order autoregression model, AR(1), is appropriate when successive values of a time series are linearly related. The parameters of the model are often estimated using maximum likelihood estimation rather than least-squares estimation.

Moving average forecast models use the average of the last $k$ observed values to forecast next period’s value. $k$ is called the span of the moving average. Larger values of $k$ result in a smoother model.

The forecast equation for the exponential smoothing model is a weighted average of last period’s observed value and last period’s forecasted value. The degree of smoothing is determined by the choice of a smoothing constant $w$ between 0 and 1. Values of $w$ close to 0 result in a smoother model.
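The moving average forecast described above is simple enough to sketch directly. The following Python is a minimal illustration, assuming equally weighted averages of the last $k$ observed values; the function names and the sales figures are hypothetical.

```python
def moving_average_forecast(y, k):
    """Forecast the next period as the mean of the last k observed values."""
    if not 1 <= k <= len(y):
        raise ValueError("the span k must be between 1 and the series length")
    return sum(y[-k:]) / k


def moving_averages(y, k):
    """Forecasts for periods k+1, ..., n+1; the first k periods have no forecast."""
    if not 1 <= k <= len(y):
        raise ValueError("the span k must be between 1 and the series length")
    return [sum(y[t - k:t]) / k for t in range(k, len(y) + 1)]


# Hypothetical monthly sales, for illustration only.
sales = [20.0, 22.0, 19.0, 24.0, 25.0, 23.0]

for span in (1, 3, 6):
    print(f"span {span}: next-period forecast = {moving_average_forecast(sales, span):.3f}")
```

With a span of 1 the forecast is just the last observed value, and with a span equal to the series length it is the overall mean, which is why larger spans give smoother models.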

SECTION 13.2 EXERCISES

13.26 A closer look at oranges. Example 1.7 (page 20) looked at the trend and seasonal variation in the average monthly price of oranges. Figure 1.7 (page 21) is a time series plot of the data. The data is found in the file fg01_007.dat on the CD that accompanies this book.

(a) Make a lagged time series plot (see Exercise 13.19 for an example).

(b) Does the plot in part (a) suggest that an AR(1) model might be appropriate for the orange prices time series?

(c) Calculate the correlation between successive values of orange prices. Does the value of this correlation support your conclusion in part (b)? Why or why not?

13.27 Least-squares or maximum likelihood? Continue analyzing the orange prices data from the previous exercise. Let $y_t$ denote the average price of oranges in time period $t$.

(a) Fit a simple linear regression model using $y_t$ as the response variable and $y_{t-1}$ as the explanatory variable. Record the estimated regression equation, the value of $R^2$, and the regression standard error $s$.

(b) Fit an AR(1) model to the orange prices time series. Record the estimated autoregression equation. Your software should also report a model $R^2$ and a model standard deviation (or model standard error); record these values also.

(c) Compare the two estimated slopes and intercepts.

(d) Compare the model $R^2$-values and the model standard errors $s$. Do these values give any clear indication as to which fitting method (least-squares or maximum likelihood) is preferable?

13.28 Moving averages for orange prices.

(a) Calculate and plot (on a single time series plot) moving averages for spans of 5, 10, and 20.

(b) Comment on the smoothness of each moving average model in part (a). Which model would be best for forecasting monthly ups and downs in orange prices?

(c) Calculate and compare forecasts for January 2001 orange prices for each of the models in part (a). Which model provided the most accurate forecast? (The actual value of the orange prices time series for January 2001 is 224.2.)

13.29 Exponential smoothing for orange prices.

(a) Calculate and plot (on a single time series plot) exponential smoothing models using smoothing constants of 0.1, 0.5, and 0.9.

(b) Comment on the smoothness of each exponential smoothing model in part (a). Which model would be best for forecasting monthly ups and downs in orange prices?

(c) Calculate and compare forecasts for January 2001 orange prices for each of the models in part (a). Which model provided the most accurate forecast? (The actual value of the orange prices time series for January 2001 is 224.2.)

(d) Update your data by appending the January 2001 observed value of 224.2. Now forecast the February 2001 orange price with each of the models from part (a). Which model provided the most accurate forecast? (The actual value of the orange prices time series for February 2001 is 229.6.)
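Two calculations recur in the orange price exercises: the correlation between successive values of a series, and a least-squares fit of $y_t$ on $y_{t-1}$. The Python below is a minimal sketch of both, assuming an ordinary Pearson correlation of the lagged pairs and simple linear regression. It is not the maximum likelihood fit that an AR(1) routine in statistical software may use, so its estimates can differ slightly from software output. The function names and the series are hypothetical.

```python
def lagged_pairs(y):
    """Pair each value y_t (from the second period on) with the previous value."""
    return y[:-1], y[1:]


def mean(values):
    return sum(values) / len(values)


def lag_one_correlation(y):
    """Pearson correlation between y_{t-1} and y_t."""
    previous, current = lagged_pairs(y)
    mp, mc = mean(previous), mean(current)
    sxy = sum((p - mp) * (c - mc) for p, c in zip(previous, current))
    sxx = sum((p - mp) ** 2 for p in previous)
    syy = sum((c - mc) ** 2 for c in current)
    return sxy / (sxx * syy) ** 0.5


def least_squares_ar1(y):
    """Least-squares intercept and slope for regressing y_t on y_{t-1}."""
    previous, current = lagged_pairs(y)
    mp, mc = mean(previous), mean(current)
    sxy = sum((p - mp) * (c - mc) for p, c in zip(previous, current))
    sxx = sum((p - mp) ** 2 for p in previous)
    slope = sxy / sxx
    intercept = mc - slope * mp
    return intercept, slope


# Hypothetical series built so that y_t = 2 + 0.5 * y_{t-1} exactly.
prices = [10.0, 7.0, 5.5, 4.75, 4.375]

intercept, slope = least_squares_ar1(prices)
correlation = lag_one_correlation(prices)
next_forecast = intercept + slope * prices[-1]
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}, r = {correlation:.4f}")
print(f"forecast for the next period = {next_forecast:.4f}")
```

Plotting the two lists returned by `lagged_pairs` against each other gives the lagged time series plot; a roughly linear pattern there is what makes an AR(1) model plausible.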

13.30 A special AR(1) model. Random walk models for various financial time series are often mentioned in business literature. A simple random walk model specifies that one-period differences in the time series can be modeled as a constant term plus a random-error term. The equation for this random walk model is

$$y_t - y_{t-1} = \beta_0 + \epsilon_t$$

If we rewrite this equation solving for $y_t$, we get

$$y_t = \beta_0 + y_{t-1} + \epsilon_t$$

which is our AR(1) model with $\beta_1 = 1$. If we fit an AR(1) model and find that the estimate $\hat{\beta}_1$ is close to 1, then a simple random walk model for the time series is another modeling option.

The CD that accompanies this book contains a data file named ex13_030.dat. Daily “USD to Euro” exchange rates beginning July 24, 2001 and ending July 23, 2002 are contained in the data file. The first exchange rate is 1.1509. This means that on July 24, 2001 a single U.S. dollar (USD) was worth 1.1509 euro (EUR).

(a) Fit an AR(1) model to the exchange rate time series. Record the estimated forecast equation, the $R^2$-value, and the model standard error estimate. What evidence do you have that a simple random walk model for these exchange rates is appropriate?

(b) Use the AR(1) forecast equation to predict the exchange rate for July 24, 2002.

(c) For a simple random walk model, the estimate of $\beta_0$ is simply the average of all one period differences $y_t - y_{t-1}$ (call this average $\bar{y}_{\text{diff}}$) and the forecast equation is $\hat{y}_t = y_{t-1} + \bar{y}_{\text{diff}}$. Calculate the one-period differences and their average and use these to provide a “random walk forecast” for the exchange rate on July 24, 2002. Compare the random walk forecast to the AR(1) forecast.

(d) You can get the actual exchange rate for July 24, 2002 at the Web site www.oanda.com/convert/fxhistory. Compare both forecasts to the actual exchange rate. Which model was more accurate?
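The random walk forecast in part (c) amounts to adding the average one-period difference to the last observed value. A minimal Python sketch follows. Only the first rate, 1.1509, comes from the exercise; the remaining rates and the function name are hypothetical.

```python
def random_walk_forecast(y):
    """Return (average one-period difference, forecast for the next period)."""
    if len(y) < 2:
        raise ValueError("at least two observations are needed to form a difference")
    differences = [current - previous for previous, current in zip(y, y[1:])]
    mean_difference = sum(differences) / len(differences)
    # Forecast: last observed value plus the average one-period difference.
    return mean_difference, y[-1] + mean_difference


# First value from the exercise; the rest are hypothetical.
rates = [1.1509, 1.1480, 1.1455, 1.1470, 1.1430]

mean_difference, forecast = random_walk_forecast(rates)
print(f"average difference = {mean_difference:.6f}")
print(f"random walk forecast = {forecast:.6f}")
```

Because the differences telescope, their average equals the last value minus the first, divided by the number of differences, which gives a quick check on the calculation.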

13.31 Moving averages for exchange rates. Use statistical software to fit moving average models to the exchange rate data in the file ex13_030.dat on the CD that accompanies this book.

(a) What would the moving average forecast equation be if we used a span of $k = 1$? (Your response to this part is not specific to the exchange rate data.)

(b) Choose a value for $k$ that is larger than 1 and less than 6 and calculate the corresponding moving averages. Be sure to report what value of $k$ you chose.

(c) Choose a value of $k$ that is larger than 35 and less than 365 and calculate the corresponding moving averages. Be sure to report what value of $k$ you chose.

(d) Plot both sets of moving averages on a time series plot of the exchange rates. Be sure to label the plot clearly. Comment on the smoothness of both sets of moving averages.

13.32 Exchange rate moving averages continued.

(a) Using a span of $k = 1$, forecast the USD to euro exchange rate on July 24, 2002.

(b) Use your model from part (b) of the previous exercise to forecast the USD to euro exchange rate on July 24, 2002.

(c) Use your model from part (c) of the previous exercise to forecast the USD to euro exchange rate on July 24, 2002.

(d) What would our moving average forecast equation be if we used a span of $k = 365$? What would this model forecast for July 24, 2002?

(e) You can get the actual exchange rate for July 24, 2002 at the Web site www.oanda.com/convert/fxhistory. Compare these four forecasts to the actual exchange rate. Which model was closest to the actual exchange rate? Which model performed the worst on this forecast?

13.33 Exponential smoothing of exchange rates. Use statistical software to fit exponential smoothing models to the exchange rate data in the file ex13_030.dat on the CD that accompanies this book.

(a) What would the exponential smoothing forecast equation be if we used a smoothing constant of $w = 1$? What about $w = 0$? (Your response to this part is not specific to the exchange rate data.)

(b) Choose a value for $w$ such that $0 < w < 0.3$ and fit the corresponding exponential smoothing model. Be sure to report what value of $w$ you chose.

(c) Choose a value of $w$ such that $0.7 < w < 1$ and fit the corresponding exponential smoothing model. Be sure to report what value of $w$ you chose.

(d) Plot both exponential smoothing models on a time series plot of the exchange rates. Be sure to label the plot clearly. Comment on the smoothness of each model.

(e) Using the models from parts (b) and (c), forecast the July 24, 2002 exchange rate and compare both these forecasts to the actual exchange rate on that day. (You can get the actual exchange rate for July 24, 2002 at the Web site www.oanda.com/convert/fxhistory.)

STATISTICS IN SUMMARY

Standard regression methods like those described in Chapters 10 and 11 can be used with time series data with time as the explanatory variable and the time series as the response variable. These methods can help model both trend and seasonality in a time series although the models will generally not satisfy the regression assumptions needed for inference. Special time series models relate the current period’s value to past values of the time series. These models are often referred to as “smoothing models.” We rely on statistical software for fitting time series models and forecasting future values using these models. Here are the specific skills you should develop from studying this chapter.

A. TRENDS

1. Identify a long-run trend in a time series plot.
2. Use software to fit a regression model to the long-run trend.
3. Use a regression model to forecast future values of a time series based on a trend-only model.

B. SEASONS

1. Identify seasonal variation in a time series plot.
2. Model seasonal variation using indicator variables in a regression model.
3. Model seasonal variation using seasonality factors.
4. Forecast future values of a time series taking into account trend and seasonality.
5. Recognize seasonally adjusted data in a time series plot.

C. TIME SERIES MODELS

1. Identify autocorrelation in a lagged residual plot or a lagged time series plot.
2. Use software to calculate the autocorrelation of a time series.