Momentum in an ICAPM World
December 30, 2019
ABSTRACT
I empirically test the ability of the ICAPM to explain the momentum anomaly in equity markets.
Using the Epstein and Zin formulation of Campbell and Vuolteenaho (2004) and Campbell et
al (2017) as inspiration, this paper derives an ICAPM representation of the Stochastic Discount
Factor (SDF) which implies that the momentum risk premium has three main components: cash
flow, discount rate and volatility risk. Winner stocks are those that have higher cash flow betas and
lower discount rate and volatility betas, whereas loser stocks are those for which the converse is
true. Thus the momentum strategy can be seen as a bet against a stock's cash flow, discount rate
and volatility betas. This no-arbitrage model of momentum fits the data reasonably well, though
the central prediction that winner stocks have higher cash flow betas and lower discount rate and
volatility betas is not empirically supported.
1. Introduction
The momentum effect, the tendency of short run returns to predict future returns, has mystified
the asset pricing anomalies literature for over three decades. Traditional factor models such as the
CAPM or the Fama-French 3 factor model (FF) cannot explain the statistically significant alphas
attached to the momentum strategy (Fama and French, 1996). Introducing ad-hoc macroeconomic
variables into this multifactor framework adds little predictive power for these factor models as
momentum alphas are still largely unexplained (Chen, Roll, and Ross 1986; Li, Vassalou, and Xing
2006; Liu and Zhang 2008).
Whilst it is tempting to do so, one should not interpret the failure of traditional factor models
to price momentum as evidence against the efficient markets hypothesis. Rather, the failure can
largely be explained by a misspecification in the factor structure. Traditional reduced form factor
models such as the CAPM and FF assume that equity pricing is symmetric, that is, asset prices
respond identically to positive and negative shocks of the same magnitude. Yet recent evidence
suggests that the time varying equity risk premium is largely asymmetric, with downside risk priced
more heavily by investors than upside risk. In fact, Lettau, Maggiori and Weber (2014) find that the
Downside Risk CAPM (DR CAPM), which allows time variation in market betas, has far greater
predictive power than the CAPM or FF for the cross-section of equity returns, especially in bearish
markets. This asymmetric element to equity pricing is especially important for momentum, given
the documented crash risk in momentum strategy returns across bad states of nature (Daniel and
Moskowitz, 2016).
This paper seeks to address this asymmetric pricing issue by considering an ICAPM SDF model
based on Epstein-Zin (EZ) recursive preferences which assumes that the utility function is non-
separable across states of nature. This allows news about future bad states to influence investors’
utility today and consequently asset prices. Using EZ preferences, our stylised no-arbitrage model
is able to identify three main determinants of the SDF: cash flow, discount rate and volatility risk.
The SDF positively covaries with cash flow risk whereas discount rate and volatility risk attract a
negative risk premium.
This implies that momentum is a bet on the cash flow, discount rate and volatility betas
of a stock. ‘Winner’ stocks are those with higher cash flow betas and lower discount rate and
volatility betas, whereas ‘loser’ stocks are those for which the converse is true. I follow Campbell
and Vuolteenaho (2004) in making the assumption that cash flow and volatility risk attract greater
risk prices than discount rate risk. I make this assumption because discount rate risk is transitory:
higher discount rates today may reduce contemporaneous wealth but will improve investment
opportunities in the future. On the other hand, shocks to cash flows and volatility have permanent
effects on investor wealth, meaning that risk averse investors demand greater compensation for these
components of the long run risk equation (Campbell and Vuolteenaho, 2004).
Intuitively, the ‘winner’ portfolio loads heavily on cash flow risk because it is largely comprised
of small and value stocks which are highly sensitive to news about future cash flows (Campbell
and Vuolteenaho, 2004). This is largely because small and value stocks generate cash flows in the
immediate future and are thus less sensitive to discount rates, leaving cash flow risk as the main
source of variation. In contrast, the ‘loser’ portfolio loads heavily on discount rate risk because
large and growth stocks generate more distant cash flows that are more sensitive to market-wide
discount rates (Campbell and Vuolteenaho, 2004). Thus momentum is long cash flow risk but short
discount rate risk.
With regard to volatility risk, the momentum strategy can be seen as a strategy that is short
volatility because of the option-like behaviour of the losers. In bear markets, the loser portfolio
becomes highly levered and, in accordance with Merton (1974), behaves like a deep out-of-the-money
call option on the market portfolio. When markets correct, the steep convexity of the loser portfolio
results in the losers significantly outperforming the winners, initiating a ‘momentum crash’ (Daniel
and Moskowitz, 2016). Thus, momentum is a bet against volatility with the volatility risk price
being negative in our framework.
I find that this long run risk formulation of the momentum anomaly has large explanatory power for
the returns to ten momentum sorted portfolios. The adjusted $R^2$ ranges between 60% and 90%, and the
cash flow, discount rate and volatility risk factors are highly significant at the 1% level,
with signs that conform to the basic predictions of our model: cash flow risk attracts a positive risk
premium and discount rate and volatility risk attract a negative risk premium. However, I find that
the magnitudes of the coefficients are not consistent with our model: the loser portfolio attracts
higher cash flow, discount rate and volatility betas relative to the winner portfolio. One possible
explanation for the failure of our model predictions is that higher moments
of cash flow, discount rate and volatility risk are important in pricing momentum returns. Kalev,
Saxena and Zolotoy (2017) find that this is certainly the case for the size and the value anomalies,
indicating that an avenue for future research would be to extend this framework to include these
higher moments. Given recent evidence that higher moments of the return distribution do matter
for momentum (Barroso and Santa-Clara, 2014), there is great potential for such an avenue of
research to yield critical results for the pricing of the momentum anomaly.
The remainder of this paper is organised as follows. Sections 2 and 3 introduce my theoretical
model and discuss the key implications for the momentum risk premium. Section 4 discusses my
empirical methodology and estimation results. Section 5 concludes this paper.
2. Model
My theoretical model assumes a complete markets framework where a unique stochastic discount
factor (SDF) $m_{t+1}$ exists that prices all assets in the universe. Specifically, I use the SDF framework
of Campbell et al (2017), which assumes that the representative agent has Epstein and Zin (EZ)
preferences that take the following form:
$$m_{t+1} = \theta \ln \delta - \frac{\theta}{\psi}\Delta c_{t+1} + (\theta - 1) r_{t+1} \qquad (1)$$
$\Delta c_{t+1}$ refers to the log value of reinvested wealth, $r_{t+1}$ refers to the return on the aggregate wealth
portfolio $W_t$, which is the market value of the consumption stream owned by the representative
agent. $\delta$ refers to the subjective time preference parameter. $\psi$ refers to the elasticity of
intertemporal substitution (EIS) and $\theta = \frac{1-\gamma}{1-\frac{1}{\psi}}$, where $\gamma$ is the relative risk aversion parameter.
By introducing separate parameters for relative risk aversion and EIS, long run consumption risk is
allowed to influence investors’ contemporaneous utility as per EZ preferences through the
parameter $r_{t+1}$. So long as $\gamma < \frac{1}{\psi}$, the investor is concerned about both short run and long run consumption
growth risk.
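As a rough illustration of equation 1, the sketch below computes $\theta$ from $\gamma$ and $\psi$ and then evaluates the log SDF. The function names and all numerical inputs are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
import math

def ez_theta(gamma: float, psi: float) -> float:
    """theta = (1 - gamma) / (1 - 1/psi), as defined under equation 1."""
    if psi == 1.0:
        raise ValueError("theta is undefined when psi equals 1")
    return (1.0 - gamma) / (1.0 - 1.0 / psi)

def ez_log_sdf(delta: float, gamma: float, psi: float,
               dc_next: float, r_next: float) -> float:
    """Equation 1: m = theta*ln(delta) - (theta/psi)*dc + (theta - 1)*r."""
    theta = ez_theta(gamma, psi)
    return theta * math.log(delta) - (theta / psi) * dc_next + (theta - 1.0) * r_next

# Made-up parameter values, for illustration only.
theta_demo = ez_theta(gamma=5.0, psi=1.5)
m_demo = ez_log_sdf(delta=0.99, gamma=5.0, psi=1.5, dc_next=0.002, r_next=0.01)

# Special case gamma = 1/psi: theta equals 1, so the wealth return drops out
# of the SDF and only the delta and dc terms remain.
theta_power = ez_theta(gamma=2.0, psi=0.5)
m_power = ez_log_sdf(delta=0.99, gamma=2.0, psi=0.5, dc_next=0.01, r_next=0.05)
```

The special case at the end shows why separating $\gamma$ from $\psi$ matters: only when $\theta \neq 1$ does $r_{t+1}$ enter the SDF.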
To avoid empirical issues associated with consumption data, Campbell (1993) demonstrates that
one can apply a log-linear transformation of equation 1 to substitute out consumption and redefine
the SDF in terms of news to future cash flows $N_{c,t+1}$, news to future discount rates $N_{d,t+1}$ and news
to future volatility $N_{v,t+1}$. Campbell and Vuolteenaho (2004) and Campbell et al (2017) use this
transformation alongside the Campbell-Shiller return decomposition to derive a process for shocks
to $m_{t+1}$ as follows:
$$m_{t+1} - E_t m_{t+1} = -\gamma N_{c,t+1} + N_{d,t+1} + \phi N_{v,t+1} \qquad (2)$$
where:
$$\begin{aligned}
N_{c,t+1} &= (E_{t+1} - E_t)\sum_{s=1}^{\infty}\rho^s c_{t+1+s} \\
N_{d,t+1} &= (E_{t+1} - E_t)\sum_{s=1}^{\infty}\rho^s r_{t+1+s} \\
N_{v,t+1} &= (E_{t+1} - E_t)\sum_{s=1}^{\infty}\rho^s \mathrm{var}_{t+s}(m_{t+1+s} r_{t+1+s})
\end{aligned} \qquad (3)$$
In this stylised framework, the risk price for $N_{c,t+1}$ is $\gamma$ times greater than the risk price for $N_{d,t+1}$.
This follows from the fact that cash flow shocks permanently reduce wealth whereas positive
discount rate shocks are transitory and are partially offset by higher future returns (Campbell and
Vuolteenaho, 2004). In a similar vein, the risk price for $N_{v,t+1}$ is $\phi$ times greater than the risk price
for $N_{d,t+1}$. The parameter $\rho$ is a discount coefficient.
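To make the structure of the news terms and the SDF shock concrete, the sketch below discounts a finite list of expectation revisions and then combines three news terms with the risk prices $-\gamma$, $1$ and $\phi$. Truncating the infinite sum at a finite horizon is an assumption made here for illustration, and the helper names and numbers are hypothetical.

```python
import numpy as np

def discounted_news(revisions, rho):
    """Return sum_{s=1}^{S} rho**s * revisions[s-1].

    revisions[s-1] stands for the revision (E_{t+1} - E_t) of the variable
    dated t+1+s. The infinite sum is truncated at S = len(revisions).
    """
    revisions = np.asarray(revisions, dtype=float)
    s = np.arange(1, revisions.size + 1)
    return float(np.sum(rho ** s * revisions))

def sdf_shock(n_c, n_d, n_v, gamma, phi):
    """Shock to the log SDF: -gamma*N_c + N_d + phi*N_v."""
    return -gamma * n_c + n_d + phi * n_v

# Made-up inputs, for illustration only.
n_d_demo = discounted_news([0.01, 0.01], rho=0.5)
shock_demo = sdf_shock(n_c=0.02, n_d=0.01, n_v=0.005, gamma=5.0, phi=2.0)
```

With $\gamma > 1$, a given cash flow shock moves the SDF by more than a discount rate shock of the same size, which is the sense in which cash flow risk carries the larger risk price.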
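Substituting equation 2 into equation 1 yields a stochastic process for the expected log SDF $E_t m_{t+1}$:
$$E_t m_{t+1} = \kappa + E_t\left[-\gamma\sum_{s=0}^{\infty}\rho^s\Delta c_{t+1+s} + \sum_{s=0}^{\infty}\rho^s r_{t+1+s} + \phi\sum_{s=1}^{\infty}\rho^s \mathrm{var}_{t+s}(m_{t+1+s} r_{t+1+s})\right] \qquad (4)$$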
Finally I arrive at my specification for $m_{t+1}$ by substituting equation 4 into equation 2, which yields
a stochastic process that depends on both expectations and shocks to cash flows, discount rates
Again, the risk price for $N_{d,t+1}$ is normalised to 1. Using equation 17 as our final moment condition,
I use GMM to obtain our beta estimates $\gamma$, $\phi$ and $\tau$. I follow Campbell et al (2017) in using the
CRSP value weighted index as the reference asset $R_{r,t+1}$. The 10 momentum sorted portfolios are
used as the test assets in the GMM estimation.
4. News Estimation
In order to estimate the betas using the approaches outlined in the previous section, the news terms
$N_{c,t+1}$, $N_{d,t+1}$, $N_{v,t+1}$, $E_{c,t}$, $E_{d,t}$ and $E_{v,t}$ must be estimated. This is done via a VAR system that
I detail below.
4.1. VAR Methodology
I follow Campbell et al (2014) in using the following VAR system to estimate the news terms:
$$z_{t+1} = \alpha + \Gamma(z_t - \bar{z}) + \sigma_t u_{t+1} \qquad (18)$$
$z_{t+1}$ is an $m \times 1$ vector of state variables with excess market returns $r_{t+1}$ as the first element. $\alpha$ and
$u_{t+1}$ are $m \times 1$ vectors of constant parameters and residuals respectively. $\bar{z}$ and $\Gamma$ are, respectively, an $m \times 1$
vector and an $m \times m$ matrix of constant parameter estimates. Finally, $\sigma_t$ is the conditional variance of market returns.
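A first-order VAR of this kind can be fitted equation by equation with least squares. The sketch below is a simplified illustration that ignores the $\sigma_t$ scaling of the shocks and estimates a single intercept vector, which absorbs both $\alpha$ and $\Gamma\bar{z}$. The function name `fit_var1` and the simulated data are hypothetical and are not the estimation code behind the paper's results.

```python
import numpy as np

def fit_var1(z):
    """OLS fit of z_{t+1} = c + Gamma z_t + e_{t+1} on a (T, m) array z.

    Returns the intercept c (m,), Gamma (m, m) and the residuals (T-1, m).
    """
    z = np.asarray(z, dtype=float)
    x = np.column_stack([np.ones(len(z) - 1), z[:-1]])  # constant and z_t
    y = z[1:]                                           # z_{t+1}
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)        # shape (m + 1, m)
    intercept = coef[0]
    gamma_mat = coef[1:].T   # row i holds the coefficients of equation i
    resid = y - x @ coef
    return intercept, gamma_mat, resid

# Simulated two-variable example with a known coefficient matrix.
rng = np.random.default_rng(0)
gamma_true = np.array([[0.5, 0.1], [0.0, 0.8]])
c_true = np.array([0.1, -0.2])
z_sim = np.zeros((5000, 2))
for t in range(len(z_sim) - 1):
    z_sim[t + 1] = c_true + gamma_true @ z_sim[t] + rng.standard_normal(2)

c_hat, gamma_hat, resid_hat = fit_var1(z_sim)
```

On the simulated sample, `gamma_hat` should lie close to `gamma_true`, and the residuals play the role of the shock vector in the mapping that follows.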
After estimating the parameters of the VAR system, I linearly map $N_{c,t+1}$, $N_{d,t+1}$ and $N_{v,t+1}$ to
the shock vector $u_{t+1}$. Our linear mapping process is captured by the expressions below:
$$\begin{aligned}
N_{c,t+1} &= (e_1^T + e_1^T\lambda\sigma_t)u_{t+1} \\
N_{d,t+1} &= e_1^T\lambda\sigma_t u_{t+1} \\
N_{v,t+1} &= e_2^T\lambda\sigma_t u_{t+1}
\end{aligned} \qquad (19)$$
$e_1$ is an $m \times 1$ vector with 1 as the first element and 0 for all other elements. $e_2$ is an $m \times 1$ vector
with 1 in the position of the volatility state variable and 0 for all other elements. The term $e_1^T + e_1^T\lambda$
represents the contribution that shocks to the state variables make to $N_{c,t+1}$. $e_1^T\lambda$ measures
the contribution of state variable shocks to $N_{d,t+1}$. Finally, $e_2^T\lambda$ measures the contribution of state
variable shocks to $N_{v,t+1}$. We multiply each of the news terms by the scalar term $\sigma_t$ to account for
different error variances across the state variables in the VAR system. I outline how $\sigma_t$ is estimated
later in this section.
$\lambda$ is defined as:
$$\lambda = \rho\Gamma(I - \rho\Gamma)^{-1} \qquad (20)$$
where $\rho = 0.95^{1/12}$ and $I$ is the identity matrix. Following Campbell and Vuolteenaho (2004), I use
this default value for $\rho$ on the assumption that the annual consumption-wealth ratio is 5.2% for
the representative investor.
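The mapping from VAR shocks to news terms can be sketched in a few lines. The code below follows the expressions for the three news terms and for $\lambda$ as they are printed above, with $\sigma_t$ multiplying the $\lambda$ term. The function name, the argument layout and the position of the volatility variable (`vol_index=1`) are assumptions made for illustration, and the inputs are made up.

```python
import numpy as np

def news_from_var(gamma_mat, shocks, sigma, rho, vol_index=1):
    """Map VAR shocks u_{t+1} into news terms.

    gamma_mat : (m, m) VAR coefficient matrix Gamma
    shocks    : (T, m) array whose rows are the shock vectors u_{t+1}
    sigma     : scalar or (T,) array holding sigma_t
    rho       : discount coefficient
    vol_index : assumed position of the volatility variable in the state vector
    """
    gamma_mat = np.asarray(gamma_mat, dtype=float)
    shocks = np.asarray(shocks, dtype=float)
    m = gamma_mat.shape[0]
    # lambda = rho * Gamma * (I - rho * Gamma)^{-1}
    lam = rho * gamma_mat @ np.linalg.inv(np.eye(m) - rho * gamma_mat)
    lam_u = shocks @ lam.T                # row t holds lambda @ u_{t+1}
    n_d = sigma * lam_u[:, 0]             # e1' lambda sigma_t u_{t+1}
    n_c = shocks[:, 0] + n_d              # (e1' + e1' lambda sigma_t) u_{t+1}
    n_v = sigma * lam_u[:, vol_index]     # e2' lambda sigma_t u_{t+1}
    return lam, n_c, n_d, n_v

# Made-up diagonal example so the result can be checked by hand:
# lambda = diag(0.25/0.75, 0.4/0.6) = diag(1/3, 2/3).
gamma_demo = np.diag([0.5, 0.8])
shocks_demo = np.array([[1.0, 2.0]])
lam_demo, n_c_demo, n_d_demo, n_v_demo = news_from_var(
    gamma_demo, shocks_demo, sigma=2.0, rho=0.5
)
```

In this formulation the difference between cash flow news and discount rate news equals the first element of the shock vector, that is, the unexpected market return.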
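I adopt a similar linear mapping process to estimate $E_{c,t}$, $E_{d,t}$ and $E_{v,t}$, mapping them to the lagged
state variables:
$$\begin{aligned}
E_{c,t} &= (e_1^T + e_1^T\lambda\sigma_t)\Gamma z_t \\
E_{d,t} &= (e_1^T\lambda\sigma_t)\Gamma z_t \\
E_{v,t} &= (e_2^T + e_2^T\lambda\sigma_t)\Gamma z_t
\end{aligned} \qquad (21)$$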
4.2. VAR State Variables
To operationalise our VAR approach we must choose specific state variables to enter our state
vector. We follow Campbell et al (2017) in choosing the following six state variables: excess
market returns ($R^e_{M,t+1}$), equity volatility ($\sigma_t$), term spread ($TS_{t+1}$), risk free rate ($RF_{t+1}$), price to
earnings ($PE_{t+1}$) and the small stock value spread ($VS_{t+1}$). I define these variables in accordance
with Campbell et al (2017). Sampling information about these variables is detailed below in Table
1:
Table 1: State Variable Sample Information
State Variable | Source | Sample Period | Interval
--- | --- | --- | ---
$R^e_{M,t+1}$ | CRSP Database | January 1929 - December 2010 | Monthly
$\sigma_t$ | Kenneth French Data Library | January 1929 - December 2010 | Monthly
$TS_{t+1}$ | CRSP Database | January 1929 - December 2010 | Monthly
$RF_{t+1}$ | Kenneth French Data Library | January 1929 - December 2010 | Monthly
$PE_{t+1}$ | Robert Shiller Website | January 1929 - December 2010 | Monthly
$VS_{t+1}$ | Kenneth French Data Library | January 1929 - December 2010 | Monthly
In terms of the first state variable, $R^e_{M,t+1}$ is defined as the difference between the log return on the
CRSP value weighted index and the log risk free rate. The second variable is the predicted market
variance conditional on all state variables at time $t$. The third variable is the US term spread, which
is defined as the difference between the 10 year US government bond yield and the 3 month US Treasury
bill rate. The fourth variable is the 3 month US Treasury bill rate. The fifth variable is the
logarithm of the smoothed S&P 500 price-earnings ratio available via Professor Robert Shiller’s website.
Finally, the small stock value spread is the return differential between small value and small growth
stocks. This spread is computed from the small value and small growth
portfolios available on the Kenneth French Data Library.
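The definitions above translate into simple transformations of the raw series. The sketch below is a hypothetical helper that builds the five directly observed state variables from arrays of monthly data; the volatility variable is constructed separately. All argument names are assumptions, returns are taken to be simple returns in decimal form, and the sign convention for the value spread (small value minus small growth) is one reading of the description above.

```python
import numpy as np

def build_observed_states(mkt_ret, rf_ret, y10, tbill3m, pe_smoothed,
                          small_value_ret, small_growth_ret):
    """Return a dict with the five directly observed state variables."""
    mkt_ret = np.asarray(mkt_ret, dtype=float)
    rf_ret = np.asarray(rf_ret, dtype=float)
    return {
        # Log market return minus log risk free rate.
        "excess_return": np.log1p(mkt_ret) - np.log1p(rf_ret),
        # 10 year government yield minus 3 month Treasury bill rate.
        "term_spread": np.asarray(y10, dtype=float) - np.asarray(tbill3m, dtype=float),
        # 3 month Treasury bill rate.
        "risk_free": np.asarray(tbill3m, dtype=float),
        # Logarithm of the smoothed price-earnings ratio.
        "log_pe": np.log(np.asarray(pe_smoothed, dtype=float)),
        # Small value return minus small growth return (assumed sign convention).
        "value_spread": np.asarray(small_value_ret, dtype=float)
                        - np.asarray(small_growth_ret, dtype=float),
    }

# Made-up single-month inputs, for illustration only.
states_demo = build_observed_states(
    mkt_ret=[0.02], rf_ret=[0.005], y10=[0.04], tbill3m=[0.03],
    pe_smoothed=[20.0], small_value_ret=[0.015], small_growth_ret=[0.010],
)
```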
Our choice of these state variables is motivated by the need for variables that predict future cash
flows, discount rates and volatility. Firstly, financial variables such as the price-earnings ratio, small
stock value spread and market excess returns drive the cross-section of equity returns, indicating
that they hold important information for predicting future discount rates. Secondly,
macroeconomic variables such as the term spread and the risk free rate capture broad shifts in the economy
and hence are likely to be good predictors for future cash flows. Finally, historical volatility is
needed to predict future patterns in expected volatility.
4.3. Predicting Conditional Market Variance
In order to calculate the second state variable, the conditional market variance $\sigma_t$, I follow the approach
of Campbell et al (2017). First I construct realized monthly market variance $RVAR_{t+1}$ from daily
market return data using the standard definition of sample variance:
$$RVAR_{t+1} = \frac{1}{N-1}\sum_{i=1}^{N}(r_i - \bar{r})^2 \qquad (22)$$
Define $r_i$ as the log daily return on the CRSP value weighted index on day $i$ of month $t+1$, $N$ as the
number of daily observations in that month and $\bar{r}$ as the average of these returns over month $t+1$.
I then run the following predictive regression:
$$RVAR_{t+1} = \alpha + \phi_1 RVAR_t + \phi_2 MR_t + \phi_3 TS_t + \phi_4 RF_t + \phi_5 PE_t + \phi_6 VS_t \qquad (23)$$
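The realized variance definition and the predictive regression above can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the weighted regression uses weights equal to the inverse of lagged realized variance, which is the weighting scheme this section adopts, and the data are simulated rather than taken from CRSP.

```python
import numpy as np

def realized_variance(daily_log_returns):
    """Sample variance (N - 1 denominator) of the daily log returns in one month."""
    r = np.asarray(daily_log_returns, dtype=float)
    return float(np.sum((r - r.mean()) ** 2) / (r.size - 1))

def wls(y, x, weights):
    """Weighted least squares: minimise sum_i w_i * (y_i - x_i'b)^2.

    x must already contain a constant column if an intercept is wanted.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    sw = np.sqrt(np.asarray(weights, dtype=float))
    # Scaling rows by sqrt(w_i) turns the weighted problem into plain OLS.
    coef, *_ = np.linalg.lstsq(x * sw[:, None], y * sw, rcond=None)
    return coef

def fit_variance_forecast(rvar, predictors):
    """Regress RVAR_{t+1} on a constant, RVAR_t and time-t predictors,
    weighting each observation by 1 / RVAR_t. Returns (coef, fitted)."""
    rvar = np.asarray(rvar, dtype=float)
    predictors = np.asarray(predictors, dtype=float)
    x = np.column_stack([np.ones(rvar.size - 1), rvar[:-1], predictors[:-1]])
    coef = wls(rvar[1:], x, 1.0 / rvar[:-1])
    return coef, x @ coef

# Simulated example with one extra predictor and no noise, so the
# regression should recover the coefficients 0.001, 0.5 and 0.1.
rng = np.random.default_rng(1)
pred_sim = rng.uniform(0.0, 0.01, size=200)
rvar_sim = np.empty(200)
rvar_sim[0] = 0.002
for t in range(199):
    rvar_sim[t + 1] = 0.001 + 0.5 * rvar_sim[t] + 0.1 * pred_sim[t]

coef_sim, fitted_sim = fit_variance_forecast(rvar_sim, pred_sim)
```

The fitted values from such a regression would serve as the conditional variance series that enters the state vector.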
To control for the heteroskedasticity of innovations to our state variables, Equation 23 is estimated
as a Weighted Least Squares (WLS) regression where each observation is weighted by the inverse
of the realized variance, $RVAR_t^{-1}$. The predictive regression results are presented below:
Table 2: Variance Predictive Regression
Variance Prediction Parameters
Constant | $RVAR_t$ | $MR_t$ | $TS_t$ | $RF_t$ | $PE_t$ | $VS_t$ | $R^2$