8/2/2019 Predictors of Stock Market Values
PREDICTORS OF STOCK MARKET VALUES
QMB 6305
UNIVERSITY OF WEST FLORIDA
Submitted by
Aaron Hall
April 13, 2010
Instructor
DR. GAYLE BAUGH
TABLE OF CONTENTS
Introduction
    The Data
    Prediction And Moving Averages
    Data Manipulation
    Data Exploration
    Linear Prediction
    Checking The Model
Conclusions
    The Model
    Confidence Intervals
    Error Comparison
    Summary
References
Appendix A: Acknowledgements
    GNU/Linux/Ubuntu
    R
    OpenOffice.org
    Other Tools
Appendix B: R Code
INTRODUCTION
The goal of this project is to analyze available econometric data and find
predictors of the valuation of the United States stock market. Availability of
data is an important constraint for this analysis. As the stock market has its
value calculated every second, using predictors with monthly frequency
would have been preferable; however, macroeconomic data is usually given
in quarterly and annual forms. Thus, all data used here has been annualized.
The proxy for stock values, the model's dependent variable, is the
Ibbotson Large Company total return values. Since these values are given in
percentage returns from year 1925, the figures used for the model are
transformed as $1,000 invested in 1925 (Harrington, 2008).
Independent variables include projected GDP, interest rates, inflation, and
the money supply.
Projected GDP is measured by the projections the Fed reports in the
Greenbook, but since it is only released with a five year delay, the data
must be analyzed up to that date. Further, since the Fed's methodology is a
secret, the results for this figure, if they are significant, cannot be accurately
reproduced. GDP is an estimate even after the period for which it is
measured, and is usually revised several times (St. Louis Fed, 2010).
Projected GDP is important because it represents the productivity of the
United States economy. It would make sense that the more productive the US
economy, the greater chance for profitability for any US company, though
certainly not a guarantee. If an investor believes the economy will be more
productive in the future, perhaps this will increase the price the investor will
pay for the investment.
Since stock prices are based on the present values of future cash flows,
changes in interest rates are likely to affect the valuation of stocks. Lower
rates would increase the present value of the future cash flows. Interest rates
also affect the cost of capital for the firm. Lower interest rates would
decrease the borrowing costs of firms and increase their profitability.
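The present-value argument above can be illustrated with a minimal Python sketch. The cash flows and rates below are hypothetical and are not drawn from the paper's data; the function name is likewise an illustrative assumption.

```python
def present_value(cash_flows, rate):
    """Discount year-end cash flows (years 1, 2, ...) at a flat annual rate."""
    return sum(cf / (1.0 + rate) ** t
               for t, cf in enumerate(cash_flows, start=1))

# Hypothetical stream of three equal year-end cash flows.
flows = [100.0, 100.0, 100.0]

pv_at_8_percent = present_value(flows, 0.08)
pv_at_4_percent = present_value(flows, 0.04)

# The lower discount rate yields the larger present value.
print(round(pv_at_8_percent, 2), round(pv_at_4_percent, 2))
```

The same cash flows are worth more today when discounted at the lower rate, which is the mechanism the paragraph describes.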
The original proposal sought to examine bond prices as a predictor
variable. Since bond prices move directly in response to interest rate
changes, this relationship precludes the use of bonds as a predictor variable.
Inflation may also be a predictor of stock valuations. Higher inflation
means that investors in bonds, if they are spending the bond income, are
losing capital to inflation. As a result, they may seek higher returns in the
stock market. Further, the values of hard assets owned by the companies
may also be increasing in value relative to the weakening dollar. Similarly to
the Large Company stock values, the figures used for inflation are indexed to
$1,000 in 1925.
The money supply is the final predictor to examine. The money supply
represents the number of dollars in circulation. It is related to inflation as the
more dollars in circulation, the less each individual dollar may be worth.
Since it may be highly correlated to inflation, it is unlikely both will remain in
the final model. This idea is not unrelated to Keynes' idea that the
components of money demand include a speculative demand in addition to
classical notions of precautionary demand and transactional demand.
Since some predictors are only available on a quarterly or annual basis,
important data is only available on a yearly basis.
A concern for this analysis is whether or not the stock market of today is
affected by the same causes that affected it even a decade ago.
The Data
The data is expected upon graphical examination to reveal trends and
cyclicality. It is expected that since the stock market values are
representative of growth, perhaps a log transformation of the data is the best
approach. However, the other values also follow a growth form, and
transforming both the predictor and dependent variables in an identical
fashion will not yield any additional predictive power. The Box-Cox operation
may reveal the optimal transformation.
Stock market values will be represented by Ibbotson Large Company
Returns. Since returns are given in terms of percentage gains or losses, the
data is transformed into values based on $1000 invested in 1925, and
represent the value gained or lost by the end of the year (Harrington, 2008).
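As a rough illustration of this transformation, the following Python sketch compounds a series of annual percentage returns into the end-of-year value of an initial $1,000. The returns shown are hypothetical placeholders, not Ibbotson figures, and the function name is an assumption made for the example.

```python
def index_from_returns(pct_returns, start_value=1000.0):
    """Compound annual percentage returns into end-of-year values."""
    values = []
    value = start_value
    for pct in pct_returns:
        value *= 1.0 + pct / 100.0  # apply one year's gain or loss
        values.append(value)
    return values

# Hypothetical annual returns in percent (not actual Ibbotson data).
example_returns = [10.0, -5.0, 20.0]
print(index_from_returns(example_returns))
```

Each entry is the value at the end of the corresponding year, which matches the end-of-period convention described for the Large Company and Inflation series.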
Projected economic production is measured by Projected Gross National
Product (in billions) for up to year 1992, and Projected Gross Domestic
Product (in billions) for year 1992 on, with one year of overlap in projections
for year 1992. These projections are given by the Greenbook, which is
released along with the Federal Open Market Committee meeting transcripts
after a five year lag (St. Louis Fed, 2010).
Inflation is given by Ibbotson Inflation Return data. The Fed Funds Rate
will proxy for interest rates. Money supply may be represented variously by
Institutional Money Funds (series IMFNS from the St. Louis Fed) and M2, and
the two series added together. (M3 was to be our time series for money
supply as it is the most encompassing definition of money, but it has been
discontinued by the Fed on the grounds that the costs of gathering the data
are not overcome by the value of the series.)
It should be noted that both the Large Company and Inflation return data
are given in terms of annual percentage growth, and have been transformed
to indicate the growth in the value of $1000 in 1925; therefore, these figures
indicate the value by the end of the period. For prediction, the predictor
variables (other than projected GDP and GNP) will be lagged.
Occam's razor states that if two possible explanations are equally likely,
one should accept the least complicated explanation. When making
predictions, one should accept more complicated explanations only if the
more complicated explanation provides significantly greater prediction value.
Thus, this paper will seek to find the simplest model with the best
prediction value.
Prediction And Moving Averages
There are various ways of attempting to predict a variable based on its
past values. The simplest method is to use the last measured value. This
method may put too much emphasis on a single term's value. Various forms
of moving averages can provide a more nuanced approach to prediction of
the next period's value.
Table 1: Raw Data
Year  LCStock     GNP      GDP      Inflation  FEDFUNDS  IMFNS   M2NS    M2IMFNS
1978  89597.47    8445.3   NA       3777.37    NA        NA      NA      NA
1979  106119.24   9385.1   NA       4280.14    13.78     9.5     1479.0  1488.5
1980  140523.09   10219.1  NA       4810.87    18.90     15.2    1604.8  1620
1981  133623.41   11342.9  NA       5240.97    12.37     38.0    1760.3  1798.3
1982  162232.18   12479.4  NA       5443.79    8.95      50.0    1917.2  1967.2
1983  198750.65   12979.4  NA       5650.66    9.47      42.5    2136.2  2178.7
1984  211212.31   14604.2  NA       5873.86    8.38      65.9    2320.9  2386.8
1985  279138.19   15628.9  NA       6095.30    8.27      68.2    2506.6  2574.8
1986  330695.02   16478.2  NA       6164.18    6.91      88.5    2744.1  2832.6
1987  347990.37   17707.1  NA       6436.02    6.77      95.0    2842.7  2937.7
1988  406487.55   18951.2  NA       6720.49    8.76      94.9    3006.3  3101.2
1989  534490.48   20890.4  NA       7032.99    8.45      112.5   3171.4  3283.9
1990  517547.13   22115.7  NA       7462.71    7.31      141.5   3290.2  3431.7
1991  675657.77   22918.9  NA       7691.07    4.43      191.2   3391.1  3582.3
1992  727480.73   23731.8  23646.8  7914.11    2.92      216.0   3446.7  3662.7
1993  800156.05   NA       25106.9  8131.75    2.96      221.3   3501.2  3722.5
1994  810638.09   NA       26916.3  8348.86    5.45      216.1   3517.7  3733.8
1995  1114059.93  NA       28464.6  8560.92    5.60      270.3   3663.9  3934.2
1996  1371073.56  NA       29644.5  8845.15    5.29      332.4   3839.7  4172.1
1997  1828463.70  NA       31702.1  8995.51    5.50      409.7   4053.6  4463.3
1998  2351038.63  NA       33760.6  9140.34    4.68      565.6   4398.2  4963.8
1999  2845697.15  NA       35286.5  9385.30    5.30      674.2   4661.9  5336.1
2000  2586454.14  NA       39068.3  9703.47    6.40      833.4   4948.7  5782.1
2001  2279183.39  NA       41928.5  9853.87    1.82      1248.0  5469.0  6717
2002  1775483.86  NA       41733.8  10088.39   1.24      1300.8  5816.2  7117
2003  2285047.73  NA       43518.7  10278.05   0.98      1154.7  6101.4  7256.1
2004  2533432.42  NA       46639.7  10613.12   2.16      1103.0  6443.7  7546.7
2005  2657823.95  NA       NA       10976.09   4.16      1172.1  6703.1  7875.2
2006  3077760.13  NA       NA       11254.88   5.24      1378.4  7102.3  8480.7
2007  3246729.16  NA       NA       11714.08   4.24      1934.8  7530.2  9465
2008  2045439.37  NA       NA       11724.62   0.16      2430.9  8251.3  10682.2
Simple Moving Averages (SMAs) weight all periods evenly, and the only
parameter is the number of periods used for prediction.
Exponential Moving Averages (EMAs) reduce the parameters involved in
weighted moving averages to the number of periods used and a parameter,
alpha, which defines the amount of weighting each period receives. If alpha is
restricted to $2/(n+1)$, one may reduce the number of parameters to one
(Colby, 2003).
Weighted Moving Averages (WMAs) may also be considered. There are
innumerable variations in the choice of weightings. Restricting the weighting
to $n - t + x$, where $n$ is the number of periods, $t$ is the most recent
period, and $x$ is the period number, can provide a general rule structure
with equally declining rates of weighting. SMA and the Last method are
actually restricted cases of WMA: the SMA with equal weighting, and the
Last method with $n = 1$.
These are simple methods for forecasting time series data. They can
provide a baseline for deciding if other more complicated methods are
worthwhile. Measurement of the degree to which they fail to predict can
provide a way of eliminating the less effective methods of prediction. The
methods are demonstrated and compared on the following pages.
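The four methods described above can be sketched in Python as follows. This is an illustration only, not the paper's own code: the EMA seeding (starting from the first observation) and the linear WMA weights are assumptions, chosen because they reproduce the 1981 predictions reported in Tables 2 through 5 from the 1978 to 1980 LCStock values in Table 1.

```python
def last_value(history):
    """Last method: the forecast is the most recent observation."""
    return history[-1]

def sma(history, n):
    """Simple moving average of the n most recent observations."""
    return sum(history[-n:]) / n

def ema(history, alpha):
    """Recursive EMA, seeded with the first observation (an assumption)."""
    estimate = history[0]
    for value in history[1:]:
        estimate = alpha * value + (1.0 - alpha) * estimate
    return estimate

def wma(history, n):
    """Linearly weighted average: oldest weight 1, most recent weight n."""
    window = history[-n:]
    weights = range(1, n + 1)
    return sum(w * v for w, v in zip(weights, window)) / sum(weights)

def abs_pct_error(actual, forecast):
    """Absolute error as a percentage of the actual value."""
    return abs(actual - forecast) / actual * 100.0

# LCStock values for 1978-1980 and the actual 1981 value, from Table 1.
history = [89597.47, 106119.24, 140523.09]
actual_1981 = 133623.41

n = 3
alpha = 2.0 / (n + 1)  # equals 0.5 for n = 3

forecasts = {
    "Last": last_value(history),
    "SMA": sma(history, n),
    "EMA": ema(history, alpha),
    "WMA": wma(history, n),
}
for name, forecast in forecasts.items():
    print(name, round(forecast, 2),
          round(abs_pct_error(actual_1981, forecast), 2))
```

Averaging the absolute errors, squared errors, and absolute percentage errors over all forecast years gives the MAD, MSE, and MAPE used to compare the methods.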
Table 2: Prediction Method of using the Last Period's Value
Year LCStock Last Absolute Error Squared Error Abs. % Error
1978 89597.47
1979 106119.24 89597.47 16521.77 272968969.30 15.57%
1980 140523.09 106119.24 34403.86 1183625368.92 24.48%
1981 133623.41 140523.09 6899.68 47605638.59 5.16%
1982 162232.18 133623.41 28608.77 818461848.89 17.63%
1983 198750.65 162232.18 36518.46 1333598241.05 18.37%
1984 211212.31 198750.65 12461.67 155293109.25 5.90%
1985 279138.19 211212.31 67925.88 4613925152.16 24.33%
1986 330695.02 279138.19 51556.82 2658106122.24 15.59%
1987 347990.37 330695.02 17295.35 299129110.46 4.97%
1988 406487.55 347990.37 58497.18 3421920136.68 14.39%
1989 534490.48 406487.55 128002.93 16384749714.32 23.95%
1990 517547.13 534490.48 16943.35 287077043.93 3.27%
1991 675657.77 517547.13 158110.65 24998976830.29 23.40%
1992 727480.73 675657.77 51822.95 2685618284.69 7.12%
1993 800156.05 727480.73 72675.32 5281702797.86 9.08%
1994 810638.09 800156.05 10482.04 109873251.96 1.29%
1995 1114059.93 810638.09 303421.84 92064812356.11 27.24%
1996 1371073.56 1114059.93 257013.63 66056004341.90 18.75%
1997 1828463.70 1371073.56 457390.14 209205740036.57 25.01%
1998 2351038.63 1828463.70 522574.93 273084552890.20 22.23%
1999 2845697.15 2351038.63 494658.53 244687058285.70 17.38%
2000 2586454.14 2845697.15 259243.01 67206938571.71 10.02%
2001 2279183.39 2586454.14 307270.75 94415315113.52 13.48%
2002 1775483.86 2279183.39 503699.53 253713215787.74 28.37%
2003 2285047.73 1775483.86 509563.87 259655335708.01 22.30%
2004 2533432.42 2285047.73 248384.69 61694953315.94 9.80%
2005 2657823.95 2533432.42 124391.53 15473253157.22 4.68%
2006 3077760.13 2657823.95 419936.18 176346398595.85 13.64%
2007 3246729.16 3077760.13 168969.03 28550533539.91 5.20%
2008 2045439.37 3246729.16 1201289.79 1443097161504.6 58.73%
Mean 225742.56 (MAD) 115510479476.74 (MSE) 16.94% (MAPE)
Table 3: Simple Moving Average, n=3
Year LCStock 3SMA Absolute Error Squared Error Abs. % Error
1978 89597.47
1979 106119.24
1980 140523.09
1981 133623.41 112079.93 21543.48 464121451.77 16.12%
1982 162232.18 126755.25 35476.94 1258612933.63 21.87%
1983 198750.65 145459.56 53291.08 2839939693.60 26.81%
1984 211212.31 164868.75 46343.57 2147726102.60 21.94%
1985 279138.19 190731.71 88406.48 7815705416.34 31.67%
1986 330695.02 229700.38 100994.63 10199915820.03 30.54%
1987 347990.37 273681.84 74308.53 5521756957.95 21.35%
1988 406487.55 319274.53 87213.02 7606111133.42 21.46%
1989 534490.48 361724.31 172766.17 29848147904.41 32.32%
1990 517547.13 429656.13 87891.00 7724827496.83 16.98%
1991 675657.77 486175.05 189482.72 35903703032.64 28.04%
1992 727480.73 575898.46 151582.27 22977183646.41 20.84%
1993 800156.05 640228.54 159927.51 25576807786.21 19.99%
1994 810638.09 734431.52 76206.58 5807442490.69 9.40%
1995 1114059.93 779424.96 334634.98 111980567596.75 30.04%
1996 1371073.56 908284.69 462788.87 214173535872.04 33.75%
1997 1828463.70 1098590.53 729873.17 532714845282.45 39.92%
1998 2351038.63 1437865.73 913172.89 833884735153.89 38.84%
1999 2845697.15 1850191.96 995505.19 991030584614.88 34.98%
2000 2586454.14 2341733.16 244720.98 59888359287.38 9.46%
2001 2279183.39 2594396.64 315213.25 99359393130.41 13.83%
2002 1775483.86 2570444.90 794961.03 631963045960.47 44.77%
2003 2285047.73 2213707.13 71340.60 5089480910.29 3.12%
2004 2533432.42 2113238.33 420194.09 176563073690.98 16.59%
2005 2657823.95 2197988.00 459835.95 211449097709.30 17.30%
2006 3077760.13 2492101.37 585658.77 342996192310.68 19.03%
2007 3246729.16 2756338.83 490390.33 240482676908.24 15.10%
2008 2045439.37 2994104.42 948665.04 899965361827.70 46.38%
Mean 337495.89 (MAD) 204341961189.70 (MSE) 25.28% (MAPE)
Table 4: Exponential Moving Average, n=3, alpha=0.5
Year LCStock 3EMA.5 Absolute Error Squared Error Abs. % Error
1978 89597.47
1979 106119.24 89597.47
1980 140523.09 97858.35
1981 133623.41 119190.72 14432.69 208302472.58 10.80%
1982 162232.18 126407.07 35825.12 1283438940.56 22.08%
1983 198750.65 144319.62 54431.02 2962736201.05 27.39%
1984 211212.31 171535.14 39677.18 1574278358.49 18.79%
1985 279138.19 191373.72 87764.47 7702601885.25 31.44%
1986 330695.02 235255.96 95439.06 9108613854.10 28.86%
1987 347990.37 282975.49 65014.88 4226934433.03 18.68%
1988 406487.55 315482.93 91004.62 8281840836.41 22.39%
1989 534490.48 360985.24 173505.24 30104067776.38 32.46%
1990 517547.13 447737.86 69809.27 4873334340.09 13.49%
1991 675657.77 482642.49 193015.28 37254899475.16 28.57%
1992 727480.73 579150.13 148330.59 22001964771.08 20.39%
1993 800156.05 653315.43 146840.62 21562167965.08 18.35%
1994 810638.09 726735.74 83902.35 7039605132.02 10.35%
1995 1114059.93 768686.92 345373.02 119282520409.14 31.00%
1996 1371073.56 941373.43 429700.13 184642205957.35 31.34%
1997 1828463.70 1156223.49 672240.21 451906896336.44 36.77%
1998 2351038.63 1492343.60 858695.03 737357153315.08 36.52%
1999 2845697.15 1921691.11 924006.04 853787164899.99 32.47%
2000 2586454.14 2383694.13 202760.01 41111621713.91 7.84%
2001 2279183.39 2485074.14 205890.75 42390999723.26 9.03%
2002 1775483.86 2382128.76 606644.90 368018038091.87 34.17%
2003 2285047.73 2078806.31 206241.42 42535521976.81 9.03%
2004 2533432.42 2181927.02 351505.40 123556043793.01 13.87%
2005 2657823.95 2357679.72 300144.23 90086558779.20 11.29%
2006 3077760.13 2507751.83 570008.30 324909460857.22 18.52%
2007 3246729.16 2792755.98 453973.18 206091648861.04 13.98%
2008 2045439.37 3019742.57 974303.20 949266726355.81 47.63%
Mean 311128.82 (MAD) 173819531389.31 (MSE) 23.61% (MAPE)
Table 5: Weighted Moving Average, 3 periods
Year LCStock 3WMA Absolute Error Squared Error Abs. % Error
1978 89597.47
1979 106119.24
1980 140523.09
1981 133623.41 120568 13055.87 170455826.59 9.77%
1982 162232.18 131339 30892.91 954371666.51 19.04%
1983 198750.65 149078 49672.90 2467397310.21 24.99%
1984 211212.31 175723 35489.03 1259471001.03 16.80%
1985 279138.19 198895 80243.12 6438958847.56 28.75%
1986 330695.02 243098 87596.71 7673183321.03 26.49%
1987 347990.37 293596 54394.74 2958787899.04 15.63%
1988 406487.55 330750 75737.66 5736193038.66 18.63%
1989 534490.48 374356 160134.08 25642922636.86 29.96%
1990 517547.13 460739 56807.65 3227108677.42 10.98%
1991 675657.77 504685 170972.79 29231696566.83 25.30%
1992 727480.73 599426 128054.38 16397925184.80 17.60%
1993 800156.05 675217 124938.57 15609647468.82 15.61%
1994 810638.09 755181 55456.87 3075463885.92 6.84%
1995 1114059.93 793285 320775.42 102896866984.15 28.79%
1996 1371073.56 960602 410471.55 168486896330.42 29.94%
1997 1828463.70 1191996 636467.26 405090572707.41 34.81%
1998 2351038.63 1556933 794105.60 630603703969.31 33.78%
1999 2845697.15 2013519 832177.68 692519690655.54 29.24%
2000 2586454.14 2511272 75182.07 5652344215.05 2.91%
2001 2279183.39 2633633 354449.17 125634213850.63 15.55%
2002 1775483.86 2476026 700542.07 490759197131.8 39.46%
2003 2285047.73 2078545 206502.31 42643204645.54 9.04%
2004 2533432.42 2114216 419216.70 175742642136.79 16.55%
2005 2657823.95 2324313 333511.19 111229711943.21 12.55%
2006 3077760.13 2554231 523529.40 274083030393.65 17.01%
2007 3246729.16 2847060 399669.05 159735345716.28 12.31%
2008 2045439.37 3092255 1046815.91 1095823551868.69 51.18%
Mean 302846.77 (MAD) 170434983551.10 (MSE) 22.20% (MAPE)
Table 6: Moving Average Summary
Method MAD MSE MAPE
Last 225742.56 115510479476.74 16.94%
SMA 337495.89 204341961189.70 25.28%
EMA 311128.82 173819531389.31 23.61%
WMA 302846.77 170434983551.10 22.20%
It is clear that the simplest prediction method, where the last period is
used to predict the next period, outperforms the moving averages: it has the
least Mean Absolute Deviation (MAD), the least Mean Squared Error (MSE),
and the least Mean Absolute Percentage Error (MAPE). Since the data follows
a trend with occasional retracement, it makes sense that this prediction
method would provide the best performance.
This is not to say that selecting the Last Period is a satisfactory approach
to prediction. The variable is generally in an upward trend. If the troughs can
be predicted, an investor could gain returns while avoiding or even
profiting from market losses.
Data Manipulation
The data are to be used to predict future period Large Company Stock
returns using current data, aside from projected GDP. Therefore, the
Inflation, Interest Rates, and Money Supply figures will be lagged.
Also, Projected GNP ends where Projected GDP begins, with one year of
overlap. These series are spliced together with the average taken for the
year of overlap (since there is minimal difference in the figures, less than half
a percent). This series will be referred to as SPLICEGROSS in the data, and as
GDP in the text from this point on.
The lagged money supply data will consist of the combined M2 and
Institutional Money Funds, since M3 (the most expansive definition of the
money supply) was discontinued. In the data, it is referred to under the
MRM2IMFNS label. In the text, it will be referred to as simply the money
supply. The most recent (thus lagged) Fed Funds and Inflation data are also
used.
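The splicing and lagging steps can be sketched in Python as below. This is an illustrative assumption about the mechanics rather than the paper's own R code; the helper names are hypothetical, and the figures are a short excerpt from Table 1.

```python
def splice(first, second):
    """Join two year-indexed series, averaging any overlapping years."""
    merged = dict(first)
    for year, value in second.items():
        if year in merged:
            merged[year] = (merged[year] + value) / 2.0
        else:
            merged[year] = value
    return merged

def lag(series, periods=1):
    """Shift a series forward so year t carries the value from t - periods."""
    return {year + periods: value for year, value in series.items()}

# Excerpts from Table 1: projected GNP through 1992, projected GDP from 1992.
gnp = {1990: 22115.7, 1991: 22918.9, 1992: 23731.8}
gdp = {1992: 23646.8, 1993: 25106.9, 1994: 26916.3}

splice_gross = splice(gnp, gdp)
overlap_gap_pct = abs(gnp[1992] - gdp[1992]) / splice_gross[1992] * 100.0

# Combined M2 and Institutional Money Funds, lagged by one year.
m2imfns = {1991: 3582.3, 1992: 3662.7}
lagged_money = lag(m2imfns)

print(round(splice_gross[1992], 1), round(overlap_gap_pct, 2), lagged_money)
```

The 1992 overlap differs by well under half a percent, consistent with the justification given for averaging the two projections.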
Data Exploration
The data are highly correlated. Pairwise correlation is measured where
data is available for both terms. The term of primary interest is the
dependent variable, Large Cap Stock. Correlations with the dependent
variable are: Year, 0.926; GDP, 0.934; lagged Inflation, 0.912; lagged
FEDFUNDS, -0.64; lagged M2 and Institutional Money Funds, 0.878.
There are other pairwise correlations of note. FEDFUNDS is negatively
correlated with all other variables. Its strongest relationships are with the
GDP (-0.798, -0.781 lagged) and Inflation (-0.808) variables, both near -0.8.
The measure of money supply is highly correlated with Inflation (0.962)
as well as GDP (0.981). This relationship may indicate multicollinearity in the
data, and it makes the money supply variable an early potential candidate for
removal. The removal of the money supply (or any other variable, for that
matter) from the model does not mean that it is unimportant; it merely
means that it is not needed to predict the response variable.
Multicollinearity is the problem of having one predictive variable being a
near linear transformation of another predictive variable. When the variables
are highly correlated, the model may still be reliable so long as the
relationship between the independent variables is stable. If the relationship
between variables changes, the model may cease to be reliable (Faraway,
2005).
Illustration 1: Plots of the Key Variables
Multicollinearity also means that it is difficult to explain the individual
importance of each variable. Small changes in the predicted variable can
create large changes in the beta coefficients.
The indication of multicollinearity is not only shown in the correlation
matrix. It may also show itself in the model. (Variance inflation factors are
another approach to examining collinearity, but are beyond the scope of this
paper.)
There are other potential problems as well, including heteroskedasticity,
non-constant variance of the errors.
Linear Prediction
The first model,
$\text{LCS} = \text{Year} + \text{Projected GDP} + \text{Inflation} + \text{Fed Funds Rate} + \text{Money Supply}$, is fit. GDP and
the money supply are shown to be significant at the 5% level. Adjusted R-
squared is 0.9014, and the p-value for the model is highly significant.
(Excluded observations are those before 1980 and after 2004; all
observations between and including 1980 and 2004 were included in the
regression.)
Using the original model yields a very high F statistic and R-squared.
However, only two of the independent variables are significant at the 5% level.
On visual inspection, the errors appear to fall within a narrow band to the
left and a much wider band to the right.
Illustration 2: Residual Variance is Non-Constant
Variance appears to be non-constant, a condition called
heteroskedasticity. A Q-Q Plot of the errors also indicates non-normality, but
is similar to log-normal residuals (which is more evidence for a log
transformation of the dependent variable). The Shapiro-Wilk normality test
gives a p-value of 0.03778. Since the Shapiro-Wilk null hypothesis is that the
residuals are normal, this test provides formal evidence for the rejection of an
assumption of normality. (R documentation indicates a rejection threshold of
less than .1 is adequate, citing a remark in Applied Statistics by Patrick
Royston in 1995 (R Development Core Team, 2009).)
A further problem is autocorrelation. Visual inspection of the trending
data is indicative of autocorrelation. The Durbin-Watson test is conclusive
(Zeileis & Hothorn, 2002). It reports a p-value of 0.0002858, rejecting the
hypothesis of non-correlated errors. An approach to deal with autocorrelation
is to add the lagged response variable to the predictor variables. This
approach is akin to using the Last method for prediction.
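For intuition about what the Durbin-Watson test measures, the statistic itself can be computed directly from a residual series. The Python sketch below is illustrative only: the residuals are hypothetical, and it computes the statistic alone, not the p-value reported by the R test used in the paper.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Hypothetical residuals that drift slowly, as trending data tends to produce.
persistent = [0.30, 0.25, 0.20, 0.10, -0.05, -0.15, -0.25, -0.30]

# Hypothetical residuals that flip sign every period.
alternating = [0.2, -0.2, 0.2, -0.2, 0.2, -0.2]

print(round(durbin_watson(persistent), 3))
print(round(durbin_watson(alternating), 3))
```

Values near 2 suggest uncorrelated errors, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation.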
The proper transformation of the data can be easily estimated with the
Box-Cox method (Venables & Ripley, 2002). The Box-Cox method transforms
the response variable by raising it to the power of lambda, subtracting one,
and dividing by lambda, $(y^{\lambda} - 1)/\lambda$, except when lambda
equals zero, in which case it takes the natural log of the response variable.
The 95% confidence interval for lambda falls between approximately -0.28
and 0.05, confirming earlier suspicions of the appropriateness of taking the
natural log of the response.
Illustration 3: Box-Cox Operation Indicates Natural Log-Transformation
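The transformation itself is simple to write down. The Python sketch below shows the standard Box-Cox form, $(y^{\lambda} - 1)/\lambda$ with the natural log at $\lambda = 0$; it does not perform the likelihood search for lambda that the R procedure carries out, and the function name is an assumption for illustration.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a positive value y for a given lambda."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)  # limiting case as lambda approaches zero
    return (y ** lam - 1.0) / lam

# LCStock for 1980 from Table 1, transformed under a few lambdas.
value = 140523.09
for lam in (1.0, 0.5, 0.0, -0.28):
    print(lam, round(box_cox(value, lam), 4))
```

As lambda approaches zero the transform converges to the natural log, which is why an interval for lambda that contains zero supports a log transformation of the response.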
Based on the suggestions of the analysis thus far, the model will be
changed and transformed before dropping insignificant variables in an
attempt to improve the model. The new model is
$\ln(\text{LCS}) = \text{Year} + \text{Pro. GDP} + \text{Infl.} + \text{Fed Funds Rate} + \text{Money Supply} + \ln(\text{Lagged LCS})$.
The transformation of the dependent variable indicates a successful
improvement on the model. The R-squared has increased, and the p-value is
still significant. The money supply is still significant, but the untransformed
projected GDP is no longer significant.
The Shapiro-Wilk normality test now indicates normally distributed errors.
The Durbin-Watson test still indicates autocorrelation with a p-value of .0847.
The Box-Cox test now indicates a wide range of possible transformations,
including -2 and 1, within the 95% confidence interval. Perhaps the Box-Cox
result is a problem with the predictors not having the correct transformation.
The Money Supply, Inflation, and Gross Domestic Product are all functions
of growth over time. The next model will take their logs as well and fit.
This iteration does not improve for any of the variables except the lag
predictor which corrects for autocorrelation. Up to this point, inflation appears
to be an unimportant variable. Remove inflation for the next regression.
Removing inflation improves this model. Removing variables usually
decreases R-squared, but in this case, there was no change. Adjusted R-
squared actually improved.
Since the lagged dependent variable is in the model as a predictor, it
does a better job of predicting the next year's performance than the Year
variable does.
Since the year is the least significant predictor now, it is the next variable to
remove from the model.
Removing the Year variable lowers R-squared an insignificant amount,
while still improving adjusted R-squared. In addition, the GDP variable
becomes significant again. The variable for the Federal Funds Rate is the lone
remaining insignificant term. Next, remove the Federal Funds Rate from the
model.
Removing the Fed Funds Rate creates very small decreases to R-squared
(at 0.9842) and adjusted R-squared (at 0.982). The model is now
$\ln(\text{LCS}) = \ln(\text{Pro. GDP}) + \ln(\text{Money Supply}) + \ln(\text{Lagged LCS})$, which is the best
iteration so far. Each term is significant at the 5% level, and both projected
GDP and the money supply were significant in the first iteration's model
(before any transformations). (See
Appendix B, page 33 for the regression ANOVA with the object code "malt5".)
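The relationship between R-squared and adjusted R-squared explains why dropping a weak predictor can leave R-squared nearly unchanged while adjusted R-squared improves. The Python sketch below applies the usual adjustment formula; the count of 25 observations is an assumption inferred from the 21 residual degrees of freedom reported later for this three-predictor model.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Reported R-squared of 0.9842, with an assumed 25 observations and 3 predictors.
adj = adjusted_r_squared(0.9842, 25, 3)
print(round(adj, 4))

# The same R-squared with two extra (hypothetical) predictors scores lower.
adj_with_extras = adjusted_r_squared(0.9842, 25, 5)
print(round(adj_with_extras, 4))
```

Under these assumptions the result rounds to the reported adjusted R-squared of 0.982.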
The least significant term is the money supply term. Even though it is
significant at the 5% level, given the high R-squared, the model may be over-
fit. The problem with over-fitting a model is that it is overly sensitive to newly
sampled data. Training the model on a subset of the data and testing its
ability to predict based on data outside the subset is one way of testing for
fit, and this method shall be demonstrated at the end of this paper.
Removing the money supply data reduces both R-squared and adjusted
R-squared slightly. It also reduces the significance of the projected gross
economic production term. It may be that the money
supply adds meaning that is required for projected GDP to mean anything. An
important thing to note is that this regression includes the observations from
year 1979. A look at the data indicates nothing strange that should arise from
that year being included. (Also, there is no indication of multiplicative
effects; see Appendix B.)
Checking The Model
Since the optimal model has been found, rerunning the previous diagnostics
on the model, $\ln(\text{LCS}) = \ln(\text{Projected GDP}) + \ln(\text{Money Supply}) + \ln(\text{Lagged LCS})$, will
test the strength of the model. The Shapiro test gives a p-value of .2885,
which is little evidence to reject the assumption of normality in the errors.
The Durbin-Watson test gives a p-value of .1872, which is evidence for not
rejecting the assumption of independent errors (other forms of data-
exploration confirm this conclusion). And the Box-Cox transformation test
indicates that the data are correctly transformed with a maximal value for
lambda of close to one, although the 95% confidence interval ranges from
approximately -0.75 to 2.66. Thus we can assume the model is linear in the
parameters.
The one remaining problem with the data is multicollinearity. Projected
GDP has a correlation of 0.98 with the lagged money supply. Although this
correlation suggests removing one of the predictors from the model, removal
at this point is not workable. Upon removal of projected GDP, the money
supply variable's p-value increases to 0.3530. Removal of the money supply
variable causes projected GDP's p-value to rise to 0.267.
Illustration 4: Box-Cox Operation Maximum Likelihood: Linear Model
CONCLUSIONS
The Model
This study indicates that together, the money supply and projected GDP
provide information that indicates the direction of the stock market. To
implement this model in making predictions about the stock market, first
predict the next year's GDP to the same level of accuracy as the Fed (no
small feat). Then, using the current year's end of year money supply and
stock market values, combined with the GDP projection, predict the stock
market's valuations with the following formula.
$$\mathrm{LCS} = e^{-4.9809 + 1.9336 \ln(\mathrm{ProjGDP}) - 1.0676 \ln(\mathrm{MoneySupply}) + 0.5758 \ln(\mathrm{LaggedLCS})}$$
The log transformation is justified both by the Box-Cox procedure and by
an improvement in R-squared of over 0.05.
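The formula can be applied directly, as in the following sketch. The coefficients are those reported for the final model in Appendix B; the function name predict_lcs and its argument names are hypothetical and are not part of the original analysis. The 2005 inputs are taken from Table 8.

```r
# Coefficients of the final model (malt5 in Appendix B), rounded as printed.
b <- c(intercept = -4.9809, gdp = 1.9336, m2 = -1.0676, lag = 0.5758)

# Hypothetical helper: back-transform the log-scale prediction.
predict_lcs <- function(proj_gdp, money_supply, lagged_lcs) {
  log_pred <- b[["intercept"]] +
    b[["gdp"]] * log(proj_gdp) +
    b[["m2"]]  * log(money_supply) +
    b[["lag"]] * log(lagged_lcs)
  exp(log_pred)
}

# 2005 row of Table 8: GDP, money supply, and the prior year's stock value.
pred_2005 <- predict_lcs(50553.5, 7546.7, 2533432.42)
print(pred_2005)
```

Because the printed coefficients are rounded, the result should agree with the table's Predicted column only approximately.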
Confidence Intervals
The minimum and maximum residuals are -0.22263 and 0.23279,
respectively. Because the model is fitted on the log scale, a residual
translates into a multiplicative error on the original scale. When the
exponent of e is greater than expected by 0.25, the result is 28.4% greater
than expected. Similarly, when the exponent is less than expected by 0.25,
the result is 22.1% less than expected.
The residual standard error is 0.1368 with 21 degrees of freedom. Based
on the two-tailed t-distribution with an alpha of 5% (95% confidence), the
critical range is plus or minus 2.08 standard errors. Thus the approximate
95% interval for the regression is from 24.8% less than expected to 33.0%
greater than expected. This calculation indicates that this regression is no
gold mine, and that even with some expectation of a future value, there can
be very large variance.
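The arithmetic behind these percentages can be reproduced as a short sketch. It uses only the residual standard error and degrees of freedom reported above, and it ignores the uncertainty in the estimated coefficients, so the interval is approximate; the variable names are illustrative.

```r
rse <- 0.1368   # residual standard error of the final model
df  <- 21       # residual degrees of freedom

# Two-tailed critical value for 95% confidence.
t_crit <- qt(0.975, df)

# Half-width of the interval on the log scale.
half_width <- t_crit * rse

# Back-transform to percentage deviations on the original scale.
lower_pct <- (exp(-half_width) - 1) * 100
upper_pct <- (exp(half_width) - 1) * 100

# Effect of a log-scale deviation of 0.25 in either direction.
up_025   <- (exp(0.25) - 1) * 100
down_025 <- (exp(-0.25) - 1) * 100

print(c(t_crit = t_crit, lower_pct = lower_pct, upper_pct = upper_pct))
print(c(up_025 = up_025, down_025 = down_025))
```

The asymmetry between the lower and upper bounds follows from exponentiating a symmetric interval.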
Error Comparison
Based on the sample the regression was calculated from, and assuming
accurate projection of GDP, the next four years would have produced the
results shown in Table 7.
This prediction uses actual cumulative quarterly GDP instead of projected
GDP (which, as noted earlier, is only released by the Fed after a five-year
lag). The Mean Absolute Percent Error (MAPE) is the best measure of error
here: this out-of-sample prediction follows significant growth in the stock
market and the other variables, so the absolute and squared errors are
naturally much larger than in earlier years. Looking at the individual
percentage errors, the first three years show over-predictions ranging from
7.75% to 13.46%. The four-year MAPE of 26.73% is skewed high by the 2008
observation.
For the entire sample plus the next four years, the MAPE is 13.12%. By
this measure, the regression is far superior to even the best of the moving
average prediction methods.
Table 7: Four Year Forecast and Error
Year  LCSTOCK    GDP      M2IMF  AUTOCOR     Predicted   Abs Error     Squared Error  Abs % Error
2005  2657824.0  50553.5  7547   2533432.42  3015493.21  357669.25767  127927297880   13.46%
2006  3077760.1  53595.7  7875   2657823.95  3316359.46  238599.33319  56929641800.5  7.75%
2007  3246729.2  56290    8481   3077760.13  3665965.51  419236.34554  175759113421   12.91%
2008  2045439.4  57765.7  9465   3246729.16  3534858.84  1489419.4698  2218370357114  72.82%
Sums:                                                    2504924.4062  2578986410215
Means:                                                   626231.10156  644746602554   26.73%
                                                         (MAE)         (MSE)          (MAPE)
Beta (intercept, GDP, M2IMF, AUTOCOR): -4.9809, 1.9336, -1.068, 0.5758
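The three error measures in Table 7 can be recomputed from its LCSTOCK and Predicted columns, as in this sketch. The variable names are illustrative, and small differences from the table are expected because the table's actual values are rounded.

```r
# Actual and predicted values for 2005-2008, copied from Table 7.
actual    <- c(2657824.0, 3077760.1, 3246729.2, 2045439.4)
predicted <- c(3015493.21, 3316359.46, 3665965.51, 3534858.84)

err <- predicted - actual

mae  <- mean(abs(err))                 # Mean Absolute Error
mse  <- mean(err^2)                    # Mean Squared Error
mape <- mean(abs(err) / actual) * 100  # Mean Absolute Percent Error

print(c(MAE = mae, MSE = mse, MAPE = mape))

# Share of the total percentage error contributed by 2008.
share_2008 <- (abs(err[4]) / actual[4]) / sum(abs(err) / actual)
print(share_2008)
```

The last figure shows how heavily the 2008 observation weighs on the four-year MAPE.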
It may be considered unfair to compare MAPE over the whole set of years,
since the regression was fitted to most of those years and was designed to
minimize mean squared error. However, there is little else to compare.
Indeed, this regression may be the best available approximation for
predicting the next year's stock market levels. Optimization of prediction
notwithstanding, the variance may be far too high to create profitable
trading rules based on the data.
Table 8: Final Model's Error
Year  LCSTOCK     GDP      RM2IMFN  AUTOCOR     Predicted   Abs Error  Squared Error  Abs % Error
1980  140523.09   10219.1  1488.5   106119.24   124752.29   15770.80   248718251.35   11.22%
1981  133623.41   11342.9  1620     140523.09   163920.03   30296.62   917885351.98   22.67%
1982  162232.18   12479.4  1798.3   133623.41   171322.78   9090.599   82638994.598   5.60%
1983  198750.65   12979.4  1967.2   162232.18   187800.16   10950.49   119913272.36   5.51%
1984  211212.31   14604.2  2178.7   198750.65   237773.43   26561.12   705493083.59   12.58%
1985  279138.19   15628.9  2386.8   211212.31   254694.48   24443.71   597495000.97   8.76%
1986  330695.02   16478.2  2574.8   279138.19   305514.84   25180.18   634041355.26   7.61%
1987  347990.37   17707.1  2832.6   330695.02   349602.05   1611.683   2597523.5158   0.46%
1988  406487.55   18951.2  2937.7   347990.37   394866.81   11620.74   135041639.52   2.86%
1989  534490.48   20890.4  3101.2   406487.55   492043.93   42446.55   1801709516     7.94%
1990  517547.13   22115.7  3283.9   534490.48   605042.42   87495.29   7655425684     16.91%
1991  675657.77   22918.9  3431.7   517547.13   607121.90   68535.87   4697165498     10.14%
1992  727480.73   23689.3  3582.3   675657.77   720759.17   6721.556   45179310.683   0.92%
1993  800156.05   25106.9  3662.7   727480.73   821835.76   21679.71   470009941.49   2.71%
1994  810638.09   26916.3  3722.5   800156.05   976169.36   165531.3   27400600909    20.42%
1995  1114059.93  28464.6  3733.8   810638.09   1092297.79  21762.14   473590650.94   1.95%
1996  1371073.56  29644.5  3934.2   1114059.93  1341884.08  29189.48   852025708.45   2.13%
1997  1828463.7   31702.1  4172.1   1371073.56  1617168.71  211295.0   44645574834    11.56%
1998  2351038.63  33760.6  4463.3   1828463.7   2005824.72  345213.9   119172641764   14.68%
1999  2845697.15  35286.5  4963.8   2351038.63  2254232.08  591465.1   349830931632   20.78%
2000  2586454.14  39068.3  5336.1   2845697.15  2836037.94  249583.8   62292075342    9.65%
2001  2279183.39  41928.5  5782.1   2586454.14  2824486.32  545302.9   297355288996   23.93%
2002  1775483.86  41733.8  6717     2279183.39  2217762.22  442278.4   195610143382   24.91%
2003  2285047.73  43518.7  7117     1775483.86  1957991.01  327056.7   106966098805   14.31%
2004  2533432.42  46639.7  7256.1   2285047.73  2535676.41  2243.989   5035487.8639   0.09%
2005  2657823.95  50553.5  7546.7   2533432.42  3015493.21  357669.3   127927297880   13.46%
2006  3077760.13  53595.7  7875.2   2657823.95  3316359.46  238599.3   56929641800    7.75%
2007  3246729.16  56290    8480.7   3077760.13  3665965.51  419236.3   175759113421   12.91%
2008  2045439.37  57765.7  9465     3246729.16  3534858.84  1489419    2.21837E+012   72.82%
Sums:                                                       5818252    3.80170E+012
Means:                                                      207794.7   135775133291   13.12%
                                                            (MAE)      (MSE)          (MAPE)
Beta (intercept, GDP, RM2IMFN, AUTOCOR): -4.9809, 1.9336, -1.0676, 0.5758
Summary
Since this model was arrived at through an iterative process that
eliminated one variable at a time, it may be argued that the findings are
spurious and the result of random chance. It is unlikely, however, that the
model is the result of pure chance.
In general, this model states that the stock market goes up when the
economy is expected to grow and when the money supply is decreasing. The
effect of expected economic growth is about twice as large as the effect of
the money supply.
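Because both the response and the predictors are in logs, the coefficients can be read as elasticities. The following sketch works through a 1% change in each predictor, holding the others fixed, using the final model's coefficients from Appendix B; the variable names are illustrative.

```r
b_gdp <- 1.9336    # coefficient on log projected GDP
b_m2  <- -1.0676   # coefficient on log money supply

# Percent change in the predicted stock value for a 1% rise in each
# predictor, holding the other predictors fixed.
gdp_effect <- (1.01^b_gdp - 1) * 100   # a little under +2%
m2_effect  <- (1.01^b_m2 - 1) * 100    # a little over -1%

# Relative size of the two effects.
ratio <- abs(b_gdp / b_m2)

print(c(gdp_effect = gdp_effect, m2_effect = m2_effect, ratio = ratio))
```

The ratio is somewhat below two, which is the sense in which the economy's effect is "about twice" that of the money supply.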
Expectations of economic growth fuel speculation in stocks. When people
expect the economy to grow more, stock prices increase. When there is less
of an expectation for economic growth, stock prices do not increase as much.
The Fed acts to contract the money supply when the economy is growing
too fast. The stock market is known to be a leading indicator of economic
growth. It would make sense that the Fed would be tightening the money
supply as the stock market is increasing.
Sometimes time series data runs the risk of reaching a change point
where the effects being used for prediction cease to work (Chatfield, 2000). It
is unlikely that the effects found here will cease to predict, however. These
effects are the result of actions of or predictions by a United States
government chartered organization that has powerful control over
fundamental aspects of the economy.
The high correlation between the two factors is an element of concern. It
would make sense that if the Fed sees the economy growing above average
the next year, it would act today to reduce the money supply. This
reasoning would explain the high level of correlation. This interaction is
troubling, but each variable needs the other for its significance level in
the model. Without the two variables, the model is left with nothing but an
autocorrelation correction variable based on the previous year's market and
about a third higher residual standard error.
Low standard errors with many variables relative to the number of
observations may indicate a model that is over-fit, but the two variables
(plus the autocorrelation variable) do not seem to be too many relative to
the size of the data available. In retrospect, this model also has the
lowest standard errors of the models with significant predictors, and since
all of these models have a very high R-squared, optimizing for standard
errors while keeping the number of predictors small would seem to be the
best remaining approach.
REFERENCES
Chatfield, C. (2000). Time-Series Forecasting. Boca Raton: Chapman &
Hall/CRC.
Colby, R. W. (2003). The Encyclopedia of Technical Market Indicators. New
York: McGraw-Hill.
Faraway, J. J. (2005). Linear Models with R. Boca Raton: Chapman & Hall/CRC.
Harrington, J. P. (Ed.). (2008). Ibbotson SBBI 2009 Classic Yearbook: Market
Results for Stocks, Bonds, Bills, and Inflation 1926-2008. Chicago:
Morningstar.
R Development Core Team. (2009). R: A Language and Environment for
Statistical Computing. Vienna, Austria: R Foundation for Statistical
Computing. Retrieved from http://www.R-project.org
St. Louis Fed. (2010). St. Louis Fed: Download Data for Series: M2NS, M2
Money Stock. St. Louis Fed. Retrieved April 6, 2010, from
http://research.stlouisfed.org/fred2/series/M2NS/downloaddata?cid=48
Venables, W. N., & Ripley, B. D. (2002). Modern Applied Statistics with S (4th
ed.). New York: Springer. Retrieved from
http://www.stats.ox.ac.uk/pub/MASS4
Zeileis, A., & Hothorn, T. (2002). Diagnostic Checking in Regression
Relationships. R News, 2(3), 7-10.
APPENDIX A: ACKNOWLEDGEMENTS
GNU/Linux/Ubuntu
The GNU Community has developed or enabled the functioning of all of
the tools used to create this document. All of these tools are open-source
software packages that are free to use and free to modify. The Linux kernel
powered the computing. Ubuntu is a popular distribution of Linux, and the
source for software repositories that provided the operating system,
supporting software, and core tools (except Zotero).
R
This study was done in R, a powerful command-line statistical
programming package (R Development Core Team, 2009). The advantages of
a command-line interface are that one may maintain an exactly reproducible
copy of one's work (e.g., see Appendix B) while having complete access to
many powerful functions. The disadvantage is that the learning curve takes
longer to climb than that of graphical user interfaces.
OpenOffice.org
This paper was written in OpenOffice.org, an open-source version of
Sun's StarOffice. Writer was used for word processing and document
assembly. Calc was used for data manipulation, spreadsheet functions, and
table creation.
Other Tools
SciTE with R syntax highlighting was also used to manipulate the code.
Zotero Firefox and Writer plug-ins were used to manage citations.
APPENDIX B: R CODE
This is the console input/output. It requires the files to be in the location
provided, and the lmtest and MASS libraries. The command prompt is the
">" symbol, and the "#" symbol indicates a non-executing comment.
R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

REvolution R enhancements not installed. For improved
performance and other extensions: apt-get install revolution-r

> comb <- [assignment lost in source]
> cor(comb, use= "pairwise.complete.obs")
                   Year     LCSTOCK         GNP         GDP SPLICEGROSS   Inflation
Year          1.0000000   0.9261381   0.9971895   0.9939544   0.9887894   0.9975192
LCSTOCK       0.9261381   1.0000000   0.9715892   0.8080071   0.9340474   0.9154274
GNP           0.9971895   0.9715892   1.0000000          NA   0.9999981   0.9855178
GDP           0.9939544   0.8080071          NA   1.0000000   0.9999990   0.9943664
SPLICEGROSS   0.9887894   0.9340474   0.9999981   0.9999990   1.0000000   0.9782972
Inflation     0.9975192   0.9154274   0.9855178   0.9943664   0.9782972   1.0000000
FEDFUNDS     -0.8233347  -0.6218049  -0.8141758  -0.4857107  -0.7987852  -0.8081528
IMFNS         0.8859370   0.8180842   0.9549985   0.9533118   0.9287711   0.8753418
M2NS          0.9683431   0.8936886   0.9902620   0.9829273   0.9898840   0.9679235
M2IMFNS       0.9542132   0.8822151   0.9931752   0.9830307   0.9871683   0.9527529
AUTOCOR       0.9261381   0.9508532   0.9668297   0.8545207   0.9352194   0.9307222
AUTOCOR2      0.9261381   0.9060283   0.9720780   0.8874491   0.9320751   0.9238932
AUTOCOR3      0.9369234   0.8664926   0.9575144   0.9434643   0.9391325   0.9144456
MRINFL        0.9975192   0.9119503   0.9808353   0.9940487   0.9774272   0.9986158
MRFUNDS      -0.8080723  -0.6402895  -0.7588302  -0.3646230  -0.7814267  -0.7788310
MRIMFNS       0.8782463   0.8099785   0.9478959   0.9329031   0.9090652   0.8900329
MRM2NS        0.9696137   0.8894880   0.9940004   0.9675501   0.9867425   0.9741104
MRM2IMFNS     0.9547696   0.8778663   0.9947253   0.9609944   0.9814615   0.9621024
               FEDFUNDS       IMFNS        M2NS     M2IMFNS     AUTOCOR    AUTOCOR2
Year         -0.8233347   0.8859370   0.9683431   0.9542132   0.9261381   0.9261381
LCSTOCK      -0.6218049   0.8180842   0.8936886   0.8822151   0.9508532   0.9060283
GNP          -0.8141758   0.9549985   0.9902620   0.9931752   0.9668297   0.9720780
GDP          -0.4857107   0.9533118   0.9829273   0.9830307   0.8545207   0.8874491
SPLICEGROSS  -0.7987852   0.9287711   0.9898840   0.9871683   0.9352194   0.9320751
Inflation    -0.8081528   0.8753418   0.9679235   0.9527529   0.9307222   0.9238932
FEDFUNDS      1.0000000  -0.6737792  -0.7703007  -0.7510230  -0.6632477  -0.7031356
IMFNS        -0.6737792   1.0000000   0.9612459   0.9785086   0.8846286   0.9476079
M2NS         -0.7703007   0.9612459   1.0000000   0.9974368   0.9093549   0.9487988
M2IMFNS      -0.7510230   0.9785086   0.9974368   1.0000000   0.9097530   0.9551136
AUTOCOR      -0.6632477   0.8846286   0.9093549   0.9097530   1.0000000   0.9508532
AUTOCOR2     -0.7031356   0.9476079   0.9487988   0.9551136   0.9508532   1.0000000
AUTOCOR3     -0.7960527   0.9413438   0.9476510   0.9520965   0.9060283   0.9508532
MRINFL       -0.8420432   0.8805821   0.9640316   0.9495986   0.9154274   0.9307222
MRFUNDS       0.8351315  -0.5872710  -0.7318301  -0.6988825  -0.6218049  -0.6558817
MRIMFNS      -0.6824676   0.9750173   0.9567361   0.9682330   0.8180842   0.9239063
MRM2NS       -0.7584503   0.9539833   0.9985131   0.9937647   0.8936886   0.9404726
MRM2IMFNS    -0.7456812   0.9678341   0.9966462   0.9960230   0.8822151   0.9445584
               AUTOCOR3      MRINFL     MRFUNDS     MRIMFNS      MRM2NS   MRM2IMFNS
Year          0.9369234   0.9975192  -0.8080723   0.8782463   0.9696137   0.9547696
LCSTOCK       0.8664926   0.9119503  -0.6402895   0.8099785   0.8894880   0.8778663
GNP           0.9575144   0.9808353  -0.7588302   0.9478959   0.9940004   0.9947253
GDP           0.9434643   0.9940487  -0.3646230   0.9329031   0.9675501   0.9609944
SPLICEGROSS   0.9391325   0.9774272  -0.7814267   0.9090652   0.9867425   0.9814615
Inflation     0.9144456   0.9986158  -0.7788310   0.8900329   0.9741104   0.9621024
FEDFUNDS     -0.7960527  -0.8420432   0.8351315  -0.6824676  -0.7584503  -0.7456812
IMFNS         0.9413438   0.8805821  -0.5872710   0.9750173   0.9539833   0.9678341
M2NS          0.9476510   0.9640316  -0.7318301   0.9567361   0.9985131   0.9966462
M2IMFNS       0.9520965   0.9495986  -0.6988825   0.9682330   0.9937647   0.9960230
AUTOCOR       0.9060283   0.9154274  -0.6218049   0.8180842   0.8936886   0.8822151
AUTOCOR2      0.9508532   0.9307222  -0.6558817   0.9239063   0.9404726   0.9445584
AUTOCOR3      1.0000000   0.9238932  -0.6733649   0.9417894   0.9414619   0.9493800
MRINFL        0.9238932   1.0000000  -0.8081528   0.8753418   0.9679235   0.9527529
MRFUNDS      -0.6733649  -0.8081528   1.0000000  -0.6418000  -0.7506362  -0.7293701
MRIMFNS       0.9417894   0.8753418  -0.6418000   1.0000000   0.9538724   0.9741593
MRM2NS        0.9414619   0.9679235  -0.7506362   0.9538724   1.0000000   0.9970302
MRM2IMFNS     0.9493800   0.9527529  -0.7293701   0.9741593   0.9970302   1.0000000
> m <- lm(LCSTOCK ~ Year + SPLICEGROSS + MRINFL + MRFUNDS + MRM2IMFNS, data = comb, na.action = na.exclude)
> summary(m)

Call:
lm(formula = LCSTOCK ~ Year + SPLICEGROSS + MRINFL + MRFUNDS +
    MRM2IMFNS, data = comb, na.action = na.exclude)

Residuals:
    Min      1Q  Median      3Q     Max
-364914 -203016  -41612  106983  820959

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.897e+08  3.763e+08  -1.036   0.3134
Year         1.982e+05  1.913e+05   1.036   0.3134
SPLICEGROSS  1.757e+02  8.159e+01   2.153   0.0444 *
MRINFL      -8.347e+02  5.898e+02  -1.415   0.1732
MRFUNDS      1.792e+04  3.871e+04   0.463   0.6486
MRM2IMFNS   -6.165e+02  2.555e+02  -2.413   0.0261 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 290400 on 19 degrees of freedom
  (8 observations deleted due to missingness)
Multiple R-squared: 0.9219, Adjusted R-squared: 0.9014
F-statistic: 44.87 on 5 and 19 DF, p-value: 7.147e-10

> plot(fitted(m), residuals(m), xlab="Fitted",ylab="Residuals")
> qqnorm(resid(m))
> shapiro.test(residuals(m))

        Shapiro-Wilk normality test

data:  residuals(m)
W = 0.9142, p-value = 0.03778

> library(lmtest)
Loading required package: zoo

Attaching package: 'zoo'

        The following object(s) are masked from package:base :

         as.Date.numeric

> dwtest(m)

        Durbin-Watson test

data:  m
DW = 1.0697, p-value = 0.0002858
alternative hypothesis: true autocorrelation is greater than 0

> library(MASS)
> boxcox(m,plotit=T)
> boxcox(m,plotit=T,lambda=seq(-0.5,0.5,by=0.1))
> malt <- lm(log(LCSTOCK) ~ Year + SPLICEGROSS + MRINFL + MRFUNDS + MRM2IMFNS + log(AUTOCOR), data = comb, na.action = na.exclude)
> summary(malt)

Call:
lm(formula = log(LCSTOCK) ~ Year + SPLICEGROSS + MRINFL + MRFUNDS +
    MRM2IMFNS + log(AUTOCOR), data = comb, na.action = na.exclude)

Residuals:
      Min        1Q    Median        3Q       Max
-0.224728 -0.071272 -0.002158  0.078216  0.189551

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)  -4.631e+02  1.879e+02  -2.464   0.0240 *
Year          2.387e-01  9.603e-02   2.485   0.0230 *
SPLICEGROSS  -2.748e-06  3.466e-05  -0.079   0.9377
MRINFL       -3.679e-04  2.572e-04  -1.430   0.1697
MRFUNDS      -9.905e-04  1.666e-02  -0.059   0.9533
MRM2IMFNS    -2.957e-04  1.246e-04  -2.373   0.0290 *
log(AUTOCOR)  3.814e-01  1.829e-01   2.085   0.0516 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.123 on 18 degrees of freedom
  (8 observations deleted due to missingness)
Multiple R-squared: 0.9891, Adjusted R-squared: 0.9854
F-statistic: 271.3 on 6 and 18 DF, p-value: < 2.2e-16

> shapiro.test(residuals(malt))

        Shapiro-Wilk normality test

data:  residuals(malt)
W = 0.9747, p-value = 0.7653

> dwtest(malt)

        Durbin-Watson test

data:  malt
DW = 1.8845, p-value = 0.0847
alternative hypothesis: true autocorrelation is greater than 0

> boxcox(malt,plotit=T)
> malt2 <- lm(log(LCSTOCK) ~ Year + log(SPLICEGROSS) + log(MRINFL) + MRFUNDS + log(MRM2IMFNS) + log(AUTOCOR), data = comb, na.action = na.exclude)
> summary(malt2)

Call:
lm(formula = log(LCSTOCK) ~ Year + log(SPLICEGROSS) + log(MRINFL) +
    MRFUNDS + log(MRM2IMFNS) + log(AUTOCOR), data = comb, na.action =
    na.exclude)

Residuals:
      Min        1Q    Median        3Q       Max
-0.242438 -0.088883 -0.004927  0.094161  0.211644

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)      -72.479133  89.941534  -0.806  0.43085
Year               0.037791   0.048280   0.783  0.44395
log(SPLICEGROSS)   1.322121   1.589509   0.832  0.41643
log(MRINFL)       -0.004836   1.307447  -0.004  0.99709
MRFUNDS           -0.019492   0.018379  -1.061  0.30293
log(MRM2IMFNS)    -1.318564   0.621873  -2.120  0.04813 *
log(AUTOCOR)       0.620016   0.196808   3.150  0.00553 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1397 on 18 degrees of freedom
  (8 observations deleted due to missingness)
Multiple R-squared: 0.9859, Adjusted R-squared: 0.9812
F-statistic: 209.8 on 6 and 18 DF, p-value: 1.178e-15

> malt3 <- lm(log(LCSTOCK) ~ Year + log(SPLICEGROSS) + MRFUNDS + log(MRM2IMFNS) + log(AUTOCOR), data = comb, na.action = na.exclude)
> summary(malt3)

Call:
lm(formula = log(LCSTOCK) ~ Year + log(SPLICEGROSS) + MRFUNDS +
    log(MRM2IMFNS) + log(AUTOCOR), data = comb, na.action = na.exclude)

Residuals:
      Min        1Q    Median        3Q       Max
-0.242398 -0.088944 -0.005036  0.094195  0.211602

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)      -72.58974   82.56247  -0.879  0.39027
Year               0.03784    0.04502   0.841  0.41102
log(SPLICEGROSS)   1.31767    1.00937   1.305  0.20733
MRFUNDS           -0.01945    0.01461  -1.331  0.19881
log(MRM2IMFNS)    -1.31744    0.52746  -2.498  0.02185 *
log(AUTOCOR)       0.62009    0.19068   3.252  0.00419 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.136 on 19 degrees of freedom
  (8 observations deleted due to missingness)
Multiple R-squared: 0.9859, Adjusted R-squared: 0.9822
F-statistic: 265.8 on 5 and 19 DF, p-value: < 2.2e-16

> malt4 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + MRFUNDS + log(MRM2IMFNS) + log(AUTOCOR), data = comb, na.action = na.exclude)
> summary(malt4)

Call:
lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + MRFUNDS + log(MRM2IMFNS) +
    log(AUTOCOR), data = comb, na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-0.24297 -0.08841  0.01259  0.11157  0.22009

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)      -3.22647    2.76583  -1.167   0.2571
log(SPLICEGROSS)  1.83905    0.79046   2.327   0.0306 *
MRFUNDS          -0.01805    0.01441  -1.253   0.2246
log(MRM2IMFNS)   -1.24958    0.51741  -2.415   0.0254 *
log(AUTOCOR)      0.63590    0.18835   3.376   0.0030 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.135 on 20 degrees of freedom
  (8 observations deleted due to missingness)
Multiple R-squared: 0.9854, Adjusted R-squared: 0.9825
F-statistic: 337 on 4 and 20 DF, p-value: < 2.2e-16

> #Final model, malt5
> malt5 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) + log(AUTOCOR), data = comb, na.action = na.exclude)
> summary(malt5)

Call:
lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
    log(AUTOCOR), data = comb, na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-0.22263 -0.09233  0.01955  0.09150  0.23279

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       -4.9809     2.4174  -2.060  0.05196 .
log(SPLICEGROSS)   1.9336     0.7975   2.425  0.02443 *
log(MRM2IMFNS)    -1.0676     0.5033  -2.121  0.04597 *
log(AUTOCOR)       0.5758     0.1846   3.119  0.00519 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1368 on 21 degrees of freedom
  (8 observations deleted due to missingness)
Multiple R-squared: 0.9842, Adjusted R-squared: 0.982
F-statistic: 436.9 on 3 and 21 DF, p-value: < 2.2e-16

> malt6 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + log(AUTOCOR), data = comb, na.action = na.exclude)
> summary(malt6)

Call:
lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(AUTOCOR),
    data = comb, na.action = na.exclude)

Residuals:
       Min         1Q     Median         3Q        Max
-3.358e-01 -1.018e-01  4.623e-05  1.066e-01  2.075e-01

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       -1.2297     1.6680  -0.737    0.468
log(SPLICEGROSS)   0.4299     0.3777   1.138    0.267
log(AUTOCOR)       0.7774     0.1647   4.721 9.34e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.144 on 23 degrees of freedom
  (7 observations deleted due to missingness)
Multiple R-squared: 0.9832, Adjusted R-squared: 0.9817
F-statistic: 672.3 on 2 and 23 DF, p-value: < 2.2e-16

> malt7 <- [assignment lost in source]
> summary(malt5)

Call:
lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
    log(AUTOCOR), data = comb, na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-0.22263 -0.09233  0.01955  0.09150  0.23279

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       -4.9809     2.4174  -2.060  0.05196 .
log(SPLICEGROSS)   1.9336     0.7975   2.425  0.02443 *
log(MRM2IMFNS)    -1.0676     0.5033  -2.121  0.04597 *
log(AUTOCOR)       0.5758     0.1846   3.119  0.00519 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1368 on 21 degrees of freedom
  (8 observations deleted due to missingness)
Multiple R-squared: 0.9842, Adjusted R-squared: 0.982
F-statistic: 436.9 on 3 and 21 DF, p-value: < 2.2e-16

> malt8 <- lm(log(LCSTOCK) ~ log(AUTOCOR), data = comb, na.action = na.exclude)
> summary(malt8)

Call:
lm(formula = log(LCSTOCK) ~ log(AUTOCOR), data = comb, na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-0.48053 -0.09394  0.01333  0.12556  0.22081

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.86816    0.35789   2.426   0.0220 *
log(AUTOCOR)  0.94333    0.02646  35.657
[remainder of this output is missing from the source]

> comb2 <- [assignment lost in source]
> cor(comb2, use= "pairwise.complete.obs")
                   Year     LCSTOCK SPLICEGROSS   MRM2IMFNS     AUTOCOR    AUTOCOR2
Year          1.0000000   0.9261381   0.9887894   0.9547696   0.9261381   0.9261381
LCSTOCK       0.9261381   1.0000000   0.9340474   0.8778663   0.9508532   0.9060283
SPLICEGROSS   0.9887894   0.9340474   1.0000000   0.9814615   0.9352194   0.9320751
MRM2IMFNS     0.9547696   0.8778663   0.9814615   1.0000000   0.8822151   0.9445584
AUTOCOR       0.9261381   0.9508532   0.9352194   0.8822151   1.0000000   0.9508532
AUTOCOR2      0.9261381   0.9060283   0.9320751   0.9445584   0.9508532   1.0000000
AUTOCOR3      0.9261381   0.8664926   0.9391325   0.9493800   0.9060283   0.9508532
               AUTOCOR3
Year          0.9261381
LCSTOCK       0.8664926
SPLICEGROSS   0.9391325
MRM2IMFNS     0.9493800
AUTOCOR       0.9060283
AUTOCOR2      0.9508532
AUTOCOR3      1.0000000
> m2 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) + log(AUTOCOR), data = comb2, na.action = na.exclude)
> summary(m2)

Call:
lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
    log(AUTOCOR), data = comb2, na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-0.22263 -0.09233  0.01955  0.09150  0.23279

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       -4.9809     2.4174  -2.060  0.05196 .
log(SPLICEGROSS)   1.9336     0.7975   2.425  0.02443 *
log(MRM2IMFNS)    -1.0676     0.5033  -2.121  0.04597 *
log(AUTOCOR)       0.5758     0.1846   3.119  0.00519 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1368 on 21 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared: 0.9842, Adjusted R-squared: 0.982
F-statistic: 436.9 on 3 and 21 DF, p-value: < 2.2e-16

> plot(fitted(m2), residuals(m2), xlab="Fitted",ylab="Residuals")
> qqnorm(resid(m2))
> shapiro.test(residuals(m2))

        Shapiro-Wilk normality test

data:  residuals(m2)
W = 0.9527, p-value = 0.2885

> library(lmtest)
> dwtest(m2)

        Durbin-Watson test

data:  m2
DW = 1.8683, p-value = 0.1872
alternative hypothesis: true autocorrelation is greater than 0

> library(MASS)
> boxcox(m2,plotit=T)
> boxcox(m2,plotit=T,lambda=seq(-1,3,by=0.1))
> m3 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) + log(SPLICEGROSS) * log(MRM2IMFNS) + log(AUTOCOR), data = comb2, na.action = na.exclude)
> summary(m3)

Call:
lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
    log(SPLICEGROSS) * log(MRM2IMFNS) + log(AUTOCOR), data = comb2,
    na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-0.20975 -0.08643  0.01519  0.08435  0.22880

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)
(Intercept)                     -11.86766   11.49803  -1.032  0.31432
log(SPLICEGROSS)                  2.53960    1.27772   1.988  0.06072 .
log(MRM2IMFNS)                   -0.10262    1.65485  -0.062  0.95117
log(AUTOCOR)                      0.58928    0.18869   3.123  0.00536 **
log(SPLICEGROSS):log(MRM2IMFNS)  -0.08825    0.14395  -0.613  0.54673
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1389 on 20 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared: 0.9845, Adjusted R-squared: 0.9814
F-statistic: 318.1 on 4 and 20 DF, p-value: < 2.2e-16

> # This creates the variables plot
> comb3 <- [assignment lost in source]
> plot(comb3)