Board of Governors of the Federal Reserve System
International Finance Discussion Papers
Number 1070
December 2012
Firm Characteristics and Empirical Factor Models: A Data-Mining Experiment
Leonid Kogan and Mary Tian
NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at www.federalreserve.gov/pubs/ifdp/. This paper can be downloaded without charge from the Social Science Research Network electronic library at www.ssrn.com.
Firm Characteristics and Empirical Factor Models: A Data-Mining Experiment∗
Leonid Kogan †
Mary Tian ‡
December 2012
Abstract
“A three-factor model using the standardized-unexpected-earnings and cashflow-to-price factors explains 15 well-known asset pricing anomalies.” Our data-mining experiment provides a backdrop against which such claims can be evaluated. We construct three-factor linear pricing models that match return spreads associated with as many as 15 out of 27 commonly used firm characteristics over the 1971-2011 sample. We form target assets by sorting firms into ten portfolios on each of the chosen characteristics and form candidate pricing factors as long-short positions in the extreme decile portfolios. Our analysis exhausts all possible 351 three-factor models, consisting of two characteristic-based factors in addition to the market portfolio. 65% of the examined factor models match a larger fraction of the target return cross-sections than the CAPM or the Fama-French three-factor model. We find that the relative performance of the complete set of three-factor models is highly sensitive to the sample choice and the factor construction methodology. Our results highlight the challenges of evaluating empirical factor models.
∗ We thank seminar participants at the Finance Forum workshop at the Federal Reserve Board of Governors. The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System.
‡Division of International Finance, Federal Reserve Board of Governors, [email protected].
1 Introduction
The empirical asset pricing literature has documented many examples of firm characteristics
being able to predict future stock returns. When not accounted for by standard asset pric-
ing models, such patterns are often interpreted as anomalous. It is challenging to develop
meaningful theoretical explanations of the observed patterns in returns.1 In contrast, the
long-short portfolios constructed by sorting firms on various characteristics – the “c-factors”,
often named after the sorting variable – provide readily available inputs into empirical fac-
tor models. By searching through the firm characteristics known to be associated with large
spreads in stock returns, it is relatively easy to construct seemingly successful empirical factor
pricing models.
When we hear of a new c-factor model with N factors that “explains” M of the well-
known anomalies, how should we evaluate such a result? Is there a quantitative threshold
for the M-to-N ratio above which such a result strongly points to an economically important
source of systematic risk, even without a solid theoretical foundation? The ease of construc-
tion of c-factor models and virtually unlimited freedom in selecting test assets provide fertile
ground for data mining.2 In this paper we quantify just how easy it is to generate seem-
ingly successful empirical c-factor models. Our findings imply that it is extremely difficult
to evaluate factor pricing models based solely on their pricing performance, and one must
emphasize the theoretical and empirical foundation for their economic mechanism.
We systematically mine the 1971-2011 historical sample under a specific set of rules
designed to be representative of the commonly used empirical procedures. We consider 27
firm characteristics proposed in the literature as predictive variables for stock returns (see
section 2 and Appendix A for the list of the characteristics, with references to the relevant
1 “Meaningful” is an important qualifier here: it is not hard to come up with an ad hoc ex-post rationalization of why a particular firm characteristic may proxy for exposure to a risk factor. A compelling theoretical explanation should identify the economic mechanism giving rise to such a factor, provide alternative testable implications of this mechanism, as well as a rationale for why other firm characteristics are correlated with firms’ exposures to the proposed risk factor.
2 Many studies in the literature warn of the dangers of data mining biases, particularly in the context of return predictability, e.g., Black (1993), Lo and MacKinlay (1990), Ferson (1996), Lewellen, Nagel, and Shanken (2006), Novy-Marx (2012).
literature). Some of these characteristics have been proposed as candidate empirical proxies
for systematic risk exposures, others as likely proxies for mispricing – we do not discriminate
based on the merits of the original motivation. To qualify as a contender for our data-mining
exercise, a firm characteristic simply needs to be the subject of an academic publication.
We rank firms into ten portfolios based on each of the 27 characteristics and define the
associated return factors as return differences between the tenth and the first decile portfolios.
We then tabulate the pricing performance of all possible three- and four-factor models, each
consisting of the market portfolio and two or three factors respectively, chosen out of the set
of 27. We thus consider a total of 351 alternative three-factor models, and 2,925 four-factor
models.
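The model counts quoted above follow from choosing two or three factors out of 27. A minimal sketch of the enumeration, using the abbreviations from the table notes; the label `MKT` for the market portfolio and the helper `n_targets` are illustrative names, not the authors':

```python
from itertools import combinations

# The 27 characteristic abbreviations listed in the paper's table notes.
CHARACTERISTICS = [
    "SIZE", "BM", "DP", "EP", "CP", "IA", "AG", "AC", "AI", "NOA", "IK", "IG",
    "MOM", "LTR", "ROA", "SUE", "ROE", "SG", "OS", "LEV", "NSI", "CI", "OK",
    "LIQ", "TO", "VOL", "BETA",
]

# Each model is the market portfolio ("MKT", an assumed label) plus 2 or 3 c-factors.
three_factor_models = [("MKT",) + pair for pair in combinations(CHARACTERISTICS, 2)]
four_factor_models = [("MKT",) + triple for triple in combinations(CHARACTERISTICS, 3)]


def n_targets(model):
    """Number of target cross-sections left after excluding the model's own c-factors."""
    return len(CHARACTERISTICS) - (len(model) - 1)


print(len(three_factor_models), len(four_factor_models))
```

Excluding a model's own factors leaves 25 target cross-sections for a three-factor model and 24 for a four-factor model, which is the denominator used when reporting "15 out of 25" or "16 out of 24".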
If a pricing model is not rejected by testing it against a cross-section of portfolios sorted
on a particular firm characteristic, we say that this model matches such a cross-section. We
find that it is relatively easy to construct a three-factor model that matches more than half
of the 25 target cross-sections of returns over the full sample (we exclude the cross-sections
used to form the model factors from the set of target cross-sections).
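The matching criterion can be illustrated with a short sketch of the Gibbons, Ross, and Shanken (1989) F-test, using the 10% p-value threshold stated in the notes to Table 5. This is not the authors' code: the function names, the use of NumPy and SciPy, and the maximum-likelihood (divide by T) covariance estimates are assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def grs_test(excess_returns, factors):
    """GRS test that all time-series intercepts are jointly zero.

    excess_returns: (T, N) excess returns on the N test portfolios.
    factors:        (T, L) factor returns.
    Returns (F statistic, p-value) with (N, T - N - L) degrees of freedom.
    """
    R = np.asarray(excess_returns, dtype=float)
    F = np.asarray(factors, dtype=float)
    T, N = R.shape
    L = F.shape[1]

    # Time-series regressions of each portfolio on a constant and the factors.
    X = np.column_stack([np.ones(T), F])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)
    alpha = coef[0]
    resid = R - X @ coef

    sigma = resid.T @ resid / T                                # residual covariance (MLE)
    mu = F.mean(axis=0)                                        # factor means
    omega = np.cov(F, rowvar=False, bias=True).reshape(L, L)   # factor covariance (MLE)

    quad_alpha = alpha @ np.linalg.solve(sigma, alpha)
    quad_mu = mu @ np.linalg.solve(omega, mu)
    stat = (T - N - L) / N * quad_alpha / (1.0 + quad_mu)
    p_value = stats.f.sf(stat, N, T - N - L)
    return stat, p_value


def matches(excess_returns, factors, threshold=0.10):
    """A model 'matches' a cross-section when the GRS test does not reject it."""
    return grs_test(excess_returns, factors)[1] > threshold
```

In the paper's setting, `excess_returns` would hold the ten decile portfolios for one characteristic and `factors` the market portfolio plus two c-factors, so N = 10 and L = 3.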
The best-performing model over the entire sample, by the total number of matched cross-
sections, includes the factors based on unexpected earnings and the cash flow-to-price ratio.
It matches 15 out of 25 return cross-sections. Each of the top-twenty models reported in
Table 5 matches return cross-sections based on each of 12 or more different characteristics.3
Four-factor models achieve slightly better coverage, with the top model matching 16 out
of 24 cross-sections, and the worst of the top-twenty models matching 14. For comparison,
the CAPM and the Fama and French (1993) three-factor model both match eight out of
27 return cross-sections (we do not exclude any test assets when evaluating these reference
models).
As expected in a data mining exercise, performance of the c-factor models tends to be
fragile. It is highly sensitive to the sample period choice and the details of the factor construc-
tion. In particular, there is virtually no correlation between the relative model performance
3 We summarize performance of all 351 models in an on-line document, http://tinyurl.com/d43mf3h.
The definitions and construction of the characteristics are contained in Appendix A.
After dropping all firms in the financial sector (SIC 6000-6999), we sort remaining firms
into ten portfolios with respect to each characteristic, thus performing 27 independent one-
way sorts. We sort firms every year in June with respect to the underlying characteristic
and then compute value-weighted returns of each portfolio from July to June of the next
year.5 We take the difference in value-weighted returns of the high and low portfolios (decile
10 minus decile 1) to form 27 characteristic return factors.6 Alternatively, we also construct
factors by doing a sequential double-sort on size and then the characteristic: firms are
separated into either big or small firms, and subsequently within each group, sorted into ten
portfolios with respect to the characteristic. Then, we construct each factor as the equal-weighted
average of the high-minus-low portfolios within the big and small size groups. Our
base set of results uses factors constructed from the one-way sort; we compare results using
the alternative double-sort factor construction in Section 3.3.
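The one-way sort described above can be sketched for a single holding period as follows. The record layout (`sic`, `char`, `cap`, `ret`), the function names, and the equal-count decile breakpoints are assumptions for illustration; the excerpt does not state how breakpoints are chosen.

```python
def drop_financials(firms):
    """Remove financial-sector firms (SIC codes 6000-6999)."""
    return [f for f in firms if not 6000 <= f["sic"] <= 6999]


def assign_deciles(values):
    """Assign deciles 1..10 by ascending rank, with equal-count breakpoints (an assumption)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles


def value_weighted_return(firms):
    """Return of a portfolio weighted by market capitalization."""
    total_cap = sum(f["cap"] for f in firms)
    return sum(f["cap"] * f["ret"] for f in firms) / total_cap


def high_minus_low(firms):
    """Decile-10 minus decile-1 value-weighted return; needs at least ten non-financial firms."""
    firms = drop_financials(firms)
    deciles = assign_deciles([f["char"] for f in firms])
    portfolios = {d: [] for d in range(1, 11)}
    for f, d in zip(firms, deciles):
        portfolios[d].append(f)
    returns = {d: value_weighted_return(p) for d, p in portfolios.items()}
    return returns[10] - returns[1], returns


# Toy data: 20 industrial firms plus one bank that the SIC filter removes.
firms = [{"sic": 2000, "char": float(i), "cap": 1.0, "ret": 0.01 * i} for i in range(20)]
firms.append({"sic": 6020, "char": 100.0, "cap": 5.0, "ret": 0.50})
factor, decile_returns = high_minus_low(firms)
print(round(factor, 4))
```

Repeating this for every month of the July-to-June holding window, with deciles re-formed each June, would give the time series of one characteristic return factor.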
4 Strictly speaking, market beta is a measure of risk, and is not what is typically taken as a firm characteristic. We include market beta as one of the sorting variables because of the recent resurgence of interest in the failure of CAPM to price the market-beta sorted portfolios (e.g., Black, Jensen, and Scholes, 1972; Frazzini and Pedersen, 2011; Baker, Bradley, and Wurgler, 2011). Similarly, idiosyncratic return volatility is a return statistic rather than a firm characteristic observable at a point in time. We include idiosyncratic volatility because of its striking ability to forecast future stock returns, e.g., Ang, Hodrick, Xing, and Zhang (2006).
5 We perform a monthly sort for idiosyncratic volatility, following Ang et al. (2006).
6 In particular, to be consistent, we construct the size and book-to-market factors in this manner, which we call SIZE and BM, instead of using the standard Fama-French factors SMB and HML.
We create three-factor models by taking the market portfolio and choosing two factors
among our 27 return factors. Overall, this generates a universe of 351 linear three-factor
models. In addition to the complete list of all possible three-factor empirical models, we
also consider the CAPM; the Fama-French three-factor model; and a model consisting of
the market portfolio and the first two principal component vectors from the span of the 27
factor returns. While CAPM is perhaps the most commonly used theoretical benchmark,
the other two models are empirical factor models.
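The principal-component benchmark can be sketched as below. This is only an illustration: whether the components are extracted from the covariance or the correlation matrix is not stated in this excerpt, so the covariance matrix is assumed, and the function names are hypothetical.

```python
import numpy as np


def principal_components(factor_returns, n_components=2):
    """Extract leading principal components from a (T, K) panel of factor returns.

    Returns (scores, loadings, cumulative share of variance explained).
    Uses the sample covariance matrix, which is an assumption.
    """
    X = np.asarray(factor_returns, dtype=float)
    Xc = X - X.mean(axis=0)                      # demean each factor
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components]
    scores = Xc @ loadings
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return scores, loadings, cumulative


def benchmark_factors(market_excess_return, factor_returns):
    """Stack the market factor with the first two principal components: shape (T, 3)."""
    scores, _, _ = principal_components(factor_returns, n_components=2)
    return np.column_stack([np.asarray(market_excess_return, dtype=float), scores])
```

The `cumulative` vector corresponds to the kind of quantity reported in Table 2, the proportion of variation captured by the first n components.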
We test each factor model’s ability to match the average return differences across port-
folios sorted on each characteristic using a standard time-series regression framework. In
particular, following Gibbons, Ross, and Shanken (1989), for each characteristic we regress
excess returns on the ten characteristic-sorted portfolios on the returns of the three factors:
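\[
\cdots \;=\; 1 - \frac{\displaystyle\sum_{\substack{j=1,\;k=2 \\ j<k,\; j \ne n,\; k \ne n}}^{27} \mathbf{1}\!\left[p^{F}_{n,j,k} > 0.1\right]}{325},
\]
where the denominator is $\#\{j, k : 1 \le j \le 26,\ 2 \le k \le 27,\ j < k,\ j \ne n,\ k \ne n\} = 325$.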
In the first method, the fraction of matched return cross-sections is simply the number
of return cross-sections the model can match divided by the total number of target cross-
sections.
The second weighting scheme places higher weight on the “harder-to-explain” cross-
sections – the cross-sections that are matched by fewer c-factor models. Our motivation for
this is two-fold. First, this construction is supposed to alleviate the effect of double-counting
caused by the fact that some of the return factors we consider are constructed using closely
related firm characteristics, and thus may not be viewed as truly distinct. Placing a higher
weight on the harder-to-match cross-sections reduces the relative performance ranking of the
models that include c-factors closely related to several other characteristics. Second, c-factor
models that match a number of return cross-sections that are viewed as challenging, i.e.,
are rarely matched by the models proposed thus far, are likely to receive more attention in
the literature. Our second weighted measure places a higher premium on the mechanically
constructed models with such attention-grabbing potential.7
Unless otherwise specified, our results utilize the first weighting method.
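The two weighting schemes can be sketched as follows. The weight of a characteristic is one minus the share of eligible models that match it, as in the table notes; normalizing the weighted score by the total weight of a model's target cross-sections is an assumption, since the excerpt does not spell out the normalization. The callback `is_matched` is a hypothetical stand-in for the GRS-based matching test.

```python
from itertools import combinations


def model_scores(characteristics, is_matched):
    """Score every two-characteristic model under both weighting schemes.

    is_matched(model, c) -> bool says whether `model` (a pair of characteristics)
    matches the cross-section sorted on characteristic `c`, with `c` not in `model`.
    Returns (weights, equal_weighted, frequency_weighted).
    """
    models = list(combinations(characteristics, 2))

    # Weight for characteristic n: 1 minus the share of eligible models matching it.
    weights = {}
    for n in characteristics:
        eligible = [m for m in models if n not in m]
        share = sum(is_matched(m, n) for m in eligible) / len(eligible)
        weights[n] = 1.0 - share

    equal_weighted, frequency_weighted = {}, {}
    for m in models:
        targets = [c for c in characteristics if c not in m]
        matched = [c for c in targets if is_matched(m, c)]
        equal_weighted[m] = len(matched) / len(targets)
        total_weight = sum(weights[c] for c in targets)  # assumed normalization
        frequency_weighted[m] = (
            sum(weights[c] for c in matched) / total_weight if total_weight else 0.0
        )
    return weights, equal_weighted, frequency_weighted
```

With 27 characteristics, each characteristic has 325 eligible models, which is the denominator appearing in the weight formula above.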
7 If a particular pattern in returns is firmly viewed as a true anomaly that is not supposed to be explained by systematic risk, matching such a cross-section may be seen as evidence against a proposed factor model being risk-based. We abstract from this consideration in our definition of our second performance measure.
3 Properties of Empirical Factor Models
In this section we present the summary statistics of the characteristic-based factor portfolios,
examine the ability of linear factor models to capture average returns on these factors, and
show which of the factors are the hardest to reconcile with empirical factor models.
3.1 Characteristic-Sorted Portfolios
We present summary statistics of 27 characteristic-based factor portfolios in Table 1. For
each firm characteristic c_n, n = 1, ..., 27, we first form decile portfolios sorted in the order of
increasing characteristic value. All portfolios are value-weighted. We then form the empirical
c_n-factor, which is long the top-decile portfolio, and short the bottom-decile portfolio.
For each c-factor, we present the estimates of average returns (Panel A), CAPM alphas
(Panel B), and Fama-French alphas (Panel C), together with corresponding t-statistics. All
numbers are estimated with monthly data. The table contains the full sample and subsample
results.
The first set of results (moving vertically down the table) covers return factors related to
firm valuation. This includes the following firm characteristics: firm market capitalization
(SIZE), book-to-market ratio (BM), dividend-to-price ratio (DP), earnings-to-price ratio
(EP), and cash flow-to-price ratio (CP). Return factors based on BM, EP, and CP generate
a statistically significant spread in average returns, which is not captured by the CAPM
model.
The second set of characteristics is related to firms’ investment and physical assets. This
set includes return factors based on investment-to-assets ratios (IA), asset growth (AG), ac-
cruals (AC), abnormal investment (AI), net operating assets (NOA), investment over capital
(IK), and investment growth (IG). Several of the investment-related characteristics forecast
future stock returns. Qualitatively, firms with high investment relative to assets
tend to have lower future returns. Factors based on IA, AG, and AC show the strongest
effects, which are captured neither by the CAPM nor by the Fama-French model. These
effects persist over both subsamples, although they are somewhat stronger in the first half
of the sample. The factors based on IK and IG have lower statistical significance. The IK
factor violates the CAPM over the entire sample and each of the subsamples, while the IG
factor is less robust – its return premium is captured by the CAPM in the first half of the
sample. The Fama-French model fits the average returns on both of these factors reasonably
well.
The next set includes factors related to prior returns: return momentum (MOM) and
long-term reversal (LTR). Returns on the MOM factor are large on average, robust across
the subsamples, and not captured by the CAPM and the Fama-French model. Returns on
the LTR factor are smaller on average, but violate the CAPM and Fama-French model in
different subsample periods.
The next set of factors is related to firms’ earnings. This covers return on assets (ROA),
standardized unexpected earnings (SUE), return on equity (ROE), and sales growth (SG).
Firms with high ROA or high SUE tend to have higher average returns, which is not fully
captured by the CAPM and the Fama-French model. For ROA, the patterns are robust
across the subsamples, while the patterns for SUE have higher statistical significance in the
first subsample. ROE produces weaker patterns of the same sign. Sales growth predicts
stock returns with the opposite sign to the other earnings-based characteristics. SG returns
violate the CAPM over the entire sample, but are captured by the Fama-French model.
The next set of factors is related to financial distress, sorting firms on their Ohlson score
(OS) and market leverage (LEV). OS predicts returns with a negative sign. The magnitude
of the average returns of this factor is large, with statistically significant CAPM and Fama-French
alphas of -1% per month over the entire sample and the subsample periods. LEV predicts
returns with a positive sign and a weakly significant CAPM alpha of 0.5% per month. The
Fama-French model captures the returns on the LEV factor.
The next two factors are related to external financing: net stock issues (NSI) and com-
posite issuance (CI). Both characteristics predict returns negatively, and the resulting factor
returns violate both the CAPM and the Fama-French model in both subsamples and over
the entire sample.
The last group contains several firm characteristics that are not immediately related to
each other or to the characteristics covered above. These include organizational capital
Table 2 presents results from a principal component analysis on the 27 characteristic-based return factors. Factors are the high minus low portfolio from sorting firms into ten portfolios with respect to the underlying firm characteristic. The table shows the proportion of cumulative variation that the first n principal components can capture. Results are presented over the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
Table 3 presents factor loadings for the first three principal components extracted from the set of 27 factor returns. Loadings are shown for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
Table 4: Factor Regression on the Principal-Component Model
          1971-2011              1971-1991              1992-2011
factor    alpha  t stat  R2      alpha  t stat  R2      alpha  t stat  R2
Table 4 presents results from regressing the characteristic-based return factors on the benchmark three-factor model, consisting of the market portfolio and the first two principal component vectors of the return factors. Factors are the high minus low portfolio from sorting firms into ten portfolios with respect to the underlying firm characteristic. The alpha coefficient, t-statistic, and R2 from the regression are shown in the table for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
Rank  1971-2011         1971-1991         1992-2011
      C1    C2    Prop. C1    C2    Prop. C1    C2    Prop.
1     SUE   CP    0.60  MOM   CP    0.80  AG    EP    0.84
2     MOM   CP    0.56  MOM   IA    0.72  AG    CP    0.84
3     AG    CP    0.56  MOM   IK    0.72  MOM   NSI   0.80
4     AI    CP    0.56  IA    SUE   0.72  MOM   CI    0.80
5     CP    LIQ   0.56  IA    EP    0.72  ROA   AG    0.80
6     SIZE  VOL   0.52  OS    AG    0.72  AG    SUE   0.80
7     BM    MOM   0.52  IA    OS    0.68  AG    CI    0.80
8     BM    SUE   0.52  AC    CP    0.68  AG    VOL   0.80
9     BM    CP    0.52  AI    CP    0.68  SUE   CI    0.80
10    EP    IG    0.52  NOA   CP    0.68  CI    LIQ   0.80
11    ROE   CP    0.52  NOA   IK    0.68  CP    IG    0.80
12    NOA   CP    0.52  CP    IG    0.68  SIZE  VOL   0.76
13    CP    IG    0.52  CP    LIQ   0.68  MOM   SG    0.76
14    MOM   EP    0.48  MOM   AG    0.64  NSI   SUE   0.76
15    LTR   CP    0.48  IA    ROA   0.64  EP    IG    0.76
16    ROA   CP    0.48  ROA   CP    0.64  IG    VOL   0.76
17    OS    AG    0.48  DP    CP    0.64  BM    MOM   0.72
18    OS    CP    0.48  AG    SUE   0.64  BM    SUE   0.72
19    NSI   LIQ   0.48  AG    EP    0.64  MOM   DP    0.72
20    AG    EP    0.48  AC    IK    0.64  MOM   LEV   0.72
Table 5 lists the characteristic-based factors that constitute the top twenty linear factor models, in terms of the proportion of remaining characteristics they can capture, via the equal-weighted method. We say that a factor model M captures, or spans, a characteristic C, if the p-value from the Gibbons et al. (1989) F-test of joint significance of abnormal average return with respect to M across the ten sorted portfolios on C is above 10%. Top factor models are shown for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors (C1, C2) from our list of 27. Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
Rank  1971-2011         1971-1991         1992-2011
      C1    C2    Prop. C1    C2    Prop. C1    C2    Prop.
1     ROE   SG    0.20  LEV   BETA  0.28  IA    LIQ   0.40
2     IK    VOL   0.20  CI    VOL   0.28  IA    TO    0.40
3     VOL   SG    0.20  EP    VOL   0.28  IA    SG    0.40
4     SIZE  BM    0.16  AC    OK    0.28  IA    BETA  0.40
5     SIZE  MOM   0.16  AI    VOL   0.28  LTR   DP    0.40
6     SIZE  IA    0.16  IK    VOL   0.28  DP    BETA  0.40
7     SIZE  LTR   0.16  VOL   SG    0.28  AC    AI    0.40
8     SIZE  AI    0.16  BM    CI    0.24  SMB   HML   0.37
9     SIZE  LIQ   0.16  LTR   AC    0.24  SIZE  LTR   0.36
10    IA    LTR   0.16  ROA   VOL   0.24  SIZE  DP    0.36
11    IA    AC    0.16  CI    OK    0.24  SIZE  AC    0.36
12    ROA   VOL   0.16  AC    VOL   0.24  SIZE  OK    0.36
13    NSI   VOL   0.16  OK    VOL   0.24  SIZE  LIQ   0.36
14    DP    VOL   0.16  LIQ   VOL   0.24  SIZE  BETA  0.36
15    CI    VOL   0.16  VOL   BETA  0.24  IA    AI    0.36
16    AC    OK    0.16  SIZE  ROA   0.20  LTR   IK    0.36
17    SIZE  LEV   0.12  SIZE  SUE   0.20  LTR   BETA  0.36
18    SIZE  AC    0.12  SIZE  ROE   0.20  SIZE  IA    0.32
19    IA    SG    0.12  DP    VOL   0.20  IA    DP    0.32
20    SIZE  SUE   0.08  VOL   TO    0.20  SIZE  AI    0.28
Table 6 lists the characteristic-based factors that constitute the bottom twenty linear factor models, in terms of the proportion of remaining characteristics they can capture, via the equal-weighted method. We say that a factor model M captures, or spans, a characteristic C, if the p-value from the Gibbons et al. (1989) F-test of joint significance of abnormal average return with respect to M across the ten sorted portfolios on C is above 10%. Bottom factor models are shown for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors (C1, C2) from our list of 27. Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
Table 7: Model Performance Correlation: First versus Second Half of the Sample
Table 7 shows the rank correlation and correlation of factor model performance for the first subsample period (1971-1991) versus the second subsample period (1992-2011). The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors from our list of 27. The rank correlation is Spearman’s rank correlation coefficient from the ranking of factor models, based on the percentage of characteristics matched. The correlation is the correlation coefficient of factor models’ percentage of characteristics matched.
Correlations are shown for two characteristic weighting methods: equal-weighted method and characteristic matching frequency method. The “equal-weighted” method gives an equal weight to each characteristic matched. The “characteristic matching frequency” method gives each characteristic a weight of 1 minus the proportion of factor models that can match the cross-section of returns based on the characteristic under consideration.
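The two correlation measures in these notes can be sketched in plain Python. The inputs would be each model's percentage of characteristics matched in the two subsamples; assigning average ranks to tied scores is an assumption, since the notes do not say how ties are handled.

```python
import math


def average_ranks(values):
    """Rank values from 1..n in ascending order, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Ordinary correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rank correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because many models share the same score (scores move in steps of 1/25), ties are frequent, and the treatment of ties can noticeably affect the rank correlation.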
Table 8: Model Performance Correlation: Characteristic Weighting Methods
Table 8 shows the rank correlation and correlation of factor model performance across the two characteristic weighting methods used to compute the proportion of characteristics explained. The “equal-weighted” method gives an equal weight to each characteristic matched. The “characteristic matching frequency” method gives each characteristic a weight of 1 minus the proportion of factor models that can match the cross-section of returns based on the characteristic under consideration.
The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors from our list of 27. The rank correlation is Spearman’s rank correlation coefficient from the ranking of factor models, based on the percentage of characteristics matched. Results are shown for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
Table 9: Model Performance Correlation: Factor Construction
Table 9 shows the rank correlation and correlation of factor model performance across the two different methods to construct characteristic-based return factors. The default method is to construct the factor as the high minus low portfolio of a one-way sort. The second method is to construct the factor as the equal-weighted average of the high minus low portfolio within the big and small size group, from a double-sort first on size and then the characteristic.
The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors from our list of 27. The rank correlation is Spearman’s rank correlation coefficient from the ranking of factor models, based on the percentage of characteristics matched. The correlation is the correlation coefficient of factor models’ percentage of characteristics matched.
Correlations are shown for two characteristic weighting methods, equal-weighted method and characteristic matching frequency method, as well as for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011. The “equal-weighted” method gives an equal weight to each characteristic matched. The “characteristic matching frequency” method gives each characteristic a weight of 1 minus the proportion of factor models that can match the cross-section of returns based on the characteristic under consideration.
Table 10: Top 20 Performing Factor Models - Double Sort
Rank  1971-2011         1971-1991         1992-2011
      C1    C2    Prop. C1    C2    Prop. C1    C2    Prop.
1     BM    VOL   0.48  LEV   CP    0.80  MOM   SUE   0.80
2     MOM   EP    0.48  CP    IG    0.80  DP    SUE   0.76
3     BM    LEV   0.44  BM    CI    0.76  EP    AC    0.76
4     OS    CP    0.44  BM    CP    0.76  ROA   AC    0.72
5     OS    TO    0.44  LTR   CP    0.76  AC    BETA  0.72
6     LEV   LIQ   0.44  NSI   TO    0.76  SUE   AI    0.68
7     AC    CP    0.44  AG    ROE   0.76  BM    MOM   0.64
8     AC    TO    0.44  ROE   CP    0.76  MOM   OS    0.64
9     AC    BETA  0.44  CP    SG    0.76  ROA   ROE   0.64
10    CP    VOL   0.44  BM    ROE   0.72  CI    AC    0.64
11    CP    BETA  0.44  MOM   CP    0.72  AC    ROE   0.64
12    LIQ   SG    0.44  IA    CP    0.72  AC    TO    0.64
13    BM    MOM   0.40  LTR   EP    0.72  BM    SUE   0.60
14    BM    EP    0.40  OS    IK    0.72  BM    AI    0.60
15    BM    LIQ   0.40  NSI   DP    0.72  MOM   EP    0.60
16    BM    TO    0.40  NSI   AG    0.72  MOM   BETA  0.60
17    MOM   CP    0.40  NSI   IK    0.72  ROA   SUE   0.60
18    MOM   VOL   0.40  NSI   VOL   0.72  ROA   AI    0.60
19    MOM   TO    0.40  AG    CP    0.72  OS    CP    0.60
20    MOM   BETA  0.40  ROE   IK    0.72  SUE   CP    0.60
Table 10 lists the characteristic-based factors that constitute the top twenty linear factor models, in terms of the proportion of remaining characteristics they can capture, via the equal-weighted method. We say that a factor model M captures, or spans, a characteristic C, if the p-value from the Gibbons et al. (1989) F-test of joint significance of abnormal average return with respect to M across the ten sorted portfolios on C is above 10%. Factors are constructed as the equal-weighted average of the high minus low portfolio within the big and small size group, from a double-sort first on size and then the characteristic. Top factor models are shown for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors (C1, C2) from our list of 27. Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
Rank  1971-2011         1971-1991         1992-2011
      C1    C2    Prop. C1    C2    Prop. C1    C2    Prop.
1     IA    VOL   0.08  OS    SUE   0.20  SIZE  SUE   0.20
2     IA    SG    0.08  SUE   AC    0.20  SIZE  SG    0.20
3     ROA   TO    0.08  SUE   IG    0.20  MOM   IA    0.20
4     NSI   LEV   0.08  SUE   LIQ   0.20  IA    DP    0.20
5     NSI   AI    0.08  NOA   OK    0.20  IA    AI    0.20
6     NSI   LIQ   0.08  SIZE  MOM   0.16  IA    IG    0.20
7     NSI   TO    0.08  SIZE  LTR   0.16  SIZE  DP    0.16
8     AG    CI    0.08  ROA   SUE   0.16  SIZE  AG    0.16
9     SUE   ROE   0.08  ROA   AC    0.16  SIZE  AC    0.16
10    CI    OK    0.08  SUE   EP    0.16  SIZE  AI    0.16
11    AC    LIQ   0.08  SUE   ROE   0.16  SIZE  IG    0.16
12    SIZE  MOM   0.04  SUE   OK    0.16  SIZE  LIQ   0.16
13    SIZE  SUE   0.04  SUE   VOL   0.16  IA    AG    0.16
14    BM    IA    0.04  SUE   BETA  0.16  IA    CI    0.16
15    IA    DP    0.04  SIZE  ROE   0.12  IA    LIQ   0.16
16    ROA   IK    0.04  DP    SUE   0.12  AG    AC    0.16
17    NSI   SUE   0.04  SUE   CI    0.12  AG    LIQ   0.16
18    SUE   IK    0.04  SUE   NOA   0.12  SIZE  MOM   0.12
19    CI    AC    0.04  SIZE  ROA   0.08  SIZE  IA    0.12
20    ROA   NSI   0     SIZE  SUE   0.04  IA    AC    0.12
Table 11 lists the characteristic-based factors that constitute the bottom twenty linear factor models, in terms of the proportionof remaining characteristics they can capture, via the equal-weighted method. We say that a factor model M captures,or spans, a characteristic C, if the p-value from the Gibbons et al. (1989) F-test of joint significance of abnormal averagereturn with respect to M across the ten sorted portfolios on C is above 10%. Factors are constructed as the equal-weighedaverage of the high minus low portfolio within the big and small size group, from a double-sort first on size and then thecharacteristic. Bottom factor models are shown for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors (C1, C2) from our list of 27. Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
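The capture criterion in the Table 11 note (GRS p-value above 10%) can be illustrated with a minimal sketch. It assumes the textbook form of the Gibbons et al. (1989) statistic with maximum-likelihood covariance estimates; the function names `grs_test` and `captures` and the simulated inputs are hypothetical and are not taken from the paper.

```python
import numpy as np
from scipy import stats

def grs_test(test_excess, factors):
    """GRS F-statistic and p-value.

    test_excess: (T, N) excess returns of the N sorted portfolios.
    factors:     (T, L) factor returns.
    """
    T, N = test_excess.shape
    L = factors.shape[1]
    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, test_excess, rcond=None)
    alpha = coef[0]                      # intercepts, one per portfolio
    resid = test_excess - X @ coef
    sigma = resid.T @ resid / T          # ML residual covariance
    mu = factors.mean(axis=0)
    demeaned = factors - mu
    omega = demeaned.T @ demeaned / T    # ML factor covariance
    quad_alpha = alpha @ np.linalg.solve(sigma, alpha)
    quad_mu = mu @ np.linalg.solve(omega, mu)
    stat = (T - N - L) / N * quad_alpha / (1.0 + quad_mu)
    pval = stats.f.sf(stat, N, T - N - L)
    return stat, pval

def captures(test_excess, factors, level=0.10):
    """A model 'captures' a characteristic if the GRS p-value exceeds `level`."""
    return bool(grs_test(test_excess, factors)[1] > level)

# Simulated illustration only (not the paper's data).
rng = np.random.default_rng(0)
T, N, L = 480, 10, 3
factors = 0.005 + 0.04 * rng.standard_normal((T, L))
betas = rng.standard_normal((L, N))
noise = 0.02 * rng.standard_normal((T, N))
priced = factors @ betas + noise         # zero true alphas
mispriced = priced + 0.01                # 1% monthly alpha on every portfolio
stat_alt, p_alt = grs_test(mispriced, factors)
```

With a large common alpha added, the test rejects and `captures` returns `False`; the outcome for the zero-alpha panel depends on the random draw.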
Figure 1: Factor Correlation
Figure 1 shows a heatmap representation of the correlation matrix for the 27 characteristic-based factors, the market portfolio, and the first three principal components extracted from the return factors. The magnitude of correlations is represented in the figure, with darker areas representing higher correlation.
Factors are the high minus low portfolio from sorting firms into ten portfolios with respect to the underlying firm characteristic. Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
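The matrix behind Figure 1 could be assembled roughly as follows. This is a sketch under stated assumptions: principal components are taken from the sample covariance matrix of the factor returns (the text does not say whether covariance or correlation is used), and the function name `correlation_with_pcs` and the random inputs are hypothetical.

```python
import numpy as np

def correlation_with_pcs(factor_returns, market, n_pcs=3):
    """Absolute correlation matrix of factors, market, and leading PCs.

    factor_returns: (T, K) characteristic-based factor returns.
    market:         (T,) market portfolio returns.
    """
    centered = factor_returns - factor_returns.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_pcs]       # largest first
    pcs = centered @ eigvecs[:, order]              # (T, n_pcs) PC series
    stacked = np.column_stack([factor_returns, market, pcs])
    # The figure shades by magnitude, so take absolute values.
    return np.abs(np.corrcoef(stacked, rowvar=False))

# Random placeholder data with the paper's dimensions (27 factors).
rng = np.random.default_rng(1)
demo_factors = rng.standard_normal((240, 27))
demo_market = rng.standard_normal(240)
corr = correlation_with_pcs(demo_factors, demo_market)
```

The result is a 31 by 31 matrix ordered as the 27 factors, the market, then PC1 to PC3.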
[Figure 1 heatmaps not reproduced. Panels: (a) 1971-2011, (b) 1971-1991, (c) 1992-2011. Both axes list the 27 characteristic-based factors followed by mkt, PC1, PC2, and PC3; the shading scale runs from 0.1 to 1.]
Figure 2: Factor Model Performance
[Figure 2 plots not reproduced. Panels: (a) 1971-2011: Equal-weighted; (b) 1971-2011: Characteristic Freq; (c) 1971-1991: Equal-weighted; (d) 1971-1991: Characteristic Freq; (e) 1992-2011: Equal-weighted; (f) 1992-2011: Characteristic Freq. Horizontal axis: percentile of factor models (0 to 100). Vertical axis: percentage of characteristics matched (equal-weighted panels) or weighted fraction (characteristic-frequency panels). The CAPM and FF models are marked in each panel.]
Figure 2 displays the distribution of factor model performance, as measured by the percentage of characteristics matched, over the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011. The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors from our list of 27. The percentage of characteristics matched is computed using two characteristic weighting methods: equal-weighted method and characteristic matching frequency method. The “equal-weighted” method gives an equal weight to each characteristic matched. The “characteristic matching frequency” method gives each characteristic a weight of 1 minus the proportion of factor models that can match the cross-section of returns based on the characteristic under consideration. For comparison, the figures also show the rankings of the CAPM and the Fama-French three-factor model.
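The two weighting methods can be sketched on a toy match matrix. Two points are assumptions rather than statements from the text: the weighted score is normalized by the total weight of the characteristics a model is eligible to match, and a model's own factor characteristics are marked `None` and excluded. All names below are hypothetical.

```python
def equal_weighted_score(row):
    """Fraction of eligible characteristics matched (None = own factor)."""
    eligible = [m for m in row if m is not None]
    return sum(eligible) / len(eligible)

def matching_frequency_weights(match):
    """Weight per characteristic: 1 minus the share of models matching it."""
    weights = []
    for c in range(len(match[0])):
        col = [row[c] for row in match if row[c] is not None]
        weights.append(1.0 - sum(col) / len(col))
    return weights

def frequency_weighted_score(row, weights):
    """Weighted fraction matched; normalization by eligible weight is assumed."""
    num = sum(w for m, w in zip(row, weights) if m)
    den = sum(w for m, w in zip(row, weights) if m is not None)
    return num / den if den > 0 else 0.0

# Toy example: 3 models (rows) by 4 characteristics (columns).
match = [
    [None, True, True, False],
    [True, None, True, False],
    [True, True, None, True],
]
weights = matching_frequency_weights(match)
scores_equal = [equal_weighted_score(r) for r in match]
scores_freq = [frequency_weighted_score(r, weights) for r in match]
```

In the toy matrix only the last characteristic is hard to match, so it carries all the weight: the first two models score 2/3 under equal weighting but 0 under frequency weighting, while the third scores 1 under both.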
Figure 3: Factor Model Performance
Figure 3 shows a heatmap matrix representation of overall factor model performance. The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors from our list of 27. Factors are the high minus low portfolio from sorting firms into ten portfolios with respect to the underlying firm characteristic. Factor models are ordered along the x-axis in increasing proportion of characteristics matched; characteristics are ordered along the y-axis in decreasing frequency matched (listed in parentheses). Cell (i, j) is shaded black if factor model i is able to match characteristic j, shaded gray if factor model i is unable to match characteristic j, and shaded white if factor model i includes a factor constructed from characteristic j. We present figures for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
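The ordering and shading rules in the Figure 3 note can be sketched as follows. The tie-breaking (stable sort in input order) and the function name `heatmap_cells` are assumptions for illustration only.

```python
def heatmap_cells(match):
    """Order models and characteristics and assign a shade to each cell.

    match[i][j] is True (model i matches characteristic j), False (does not),
    or None (model i includes a factor built from characteristic j).
    """
    def model_score(row):
        eligible = [m for m in row if m is not None]
        return sum(eligible) / len(eligible)

    def char_freq(j):
        col = [row[j] for row in match if row[j] is not None]
        return sum(col) / len(col)

    # x-axis: models by increasing proportion of characteristics matched.
    model_order = sorted(range(len(match)), key=lambda i: model_score(match[i]))
    # y-axis: characteristics by decreasing frequency matched.
    char_order = sorted(range(len(match[0])), key=char_freq, reverse=True)
    shade = {True: "black", False: "gray", None: "white"}
    cells = [[shade[match[i][j]] for i in model_order] for j in char_order]
    return model_order, char_order, cells

toy = [
    [None, True, True, False],
    [True, None, True, False],
    [True, True, None, True],
]
model_order, char_order, cells = heatmap_cells(toy)
```

Each row of `cells` is one characteristic (y-axis) and each column one model (x-axis), matching the layout described in the note.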
Figure 4 shows a heatmap matrix representation of overall factor model performance. The universe of factor models is all three-factor models consisting of the market portfolio and two characteristic return factors from our list of 27. Factors are constructed as the equal-weighted average of the high minus low portfolio within the big and small size group, from a double-sort first on size and then the characteristic. Factor models are ordered along the x-axis in increasing proportion of characteristics matched; characteristics are ordered along the y-axis in decreasing frequency matched (listed in parentheses). Cell (i, j) is shaded black if factor model i is able to match characteristic j, shaded gray if factor model i is unable to match characteristic j, and shaded white if factor model i includes a factor constructed from characteristic j. We present the figure for the whole sample period 1971-2011.
Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
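The double-sort factor construction used for Figure 4 can be illustrated for a single month. Several details are assumptions not stated in this note: the size breakpoint is the cross-sectional median, extreme portfolios are equal-count deciles within each size group, and portfolio returns are equal-weighted. The record keys `size`, `char`, and `ret` are hypothetical.

```python
from statistics import mean, median

def high_minus_low(firms, n_groups=10):
    """Return of the top group minus the bottom group, sorted on `char`."""
    ranked = sorted(firms, key=lambda f: f["char"])
    k = max(1, len(ranked) // n_groups)
    low, high = ranked[:k], ranked[-k:]
    return mean(f["ret"] for f in high) - mean(f["ret"] for f in low)

def double_sort_factor(firms, n_groups=10):
    """Average of the high-minus-low spreads within small and big firms."""
    cut = median(f["size"] for f in firms)          # assumed breakpoint
    small = [f for f in firms if f["size"] <= cut]
    big = [f for f in firms if f["size"] > cut]
    return 0.5 * (high_minus_low(small, n_groups) + high_minus_low(big, n_groups))

# Toy cross-section: 40 firms, returns rise with the characteristic.
firms = [
    {"size": i, "char": i % 20, "ret": 0.01 * (i % 20) + (0.001 if i >= 20 else 0.0)}
    for i in range(40)
]
factor_return = double_sort_factor(firms)
```

In the toy data the spread is 0.18 in both size groups, so the factor return is 0.18; the level shift in the returns of big firms cancels within each spread.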
We provide details on the definitions and construction of 27 firm characteristics.
A.1 Valuation
Size (SIZE)
Stocks with low market capitalization have abnormally high average returns (Banz (1981), Fama and French (1992)). Size is defined to be the log of market capitalization.
Book-to-Market (BM)
Stocks with high book-to-market have abnormally high average returns (Rosenberg, Reid, and Lanstein (1985), Chan, Hamao, and Lakonishok (1991), Fama and French (1992)). The effect remains after controlling for many other variables and is strongest among smaller stocks (Fama and French (1993), Fama and French (2008)).
Dividend-to-Price (DP)
There is a positive association between stock returns and dividend yield (Litzenberger and Ramaswamy (1982), Miller and Scholes (1982)). However, more recently, it has been shown that dividend yield has little predictive power for future returns (Lewellen (2011)).
Earnings-to-Price (EP)
Stocks with high earnings-to-price have abnormally high average returns (Basu (1977), Basu (1983)). The effect seems to be subsumed by size and book-to-market (Fama and French (1992), Fama and French (1996)). The earnings measure is total earnings before extraordinary items.
Cash Flow-to-Price (CP)
Stocks with high cash flow-to-price ratios have abnormally high average returns. Cash flow is total earnings before extraordinary items, plus equity’s share of depreciation, plus deferred taxes if available.
A.2 Investment
Investment-to-Assets (IA)
Stocks with low investment-to-assets ratios have abnormally high average returns (Lyandres, Sun, and Zhang (2008), Chen, Novy-Marx, and Zhang (2010)). Following Chen et al. (2010), we define investment-to-assets as the annual change in property, plant, and equipment (Compustat item PPEGT) plus annual change in total inventories (Compustat item INVT) divided by lagged total assets (Compustat item AT).
Asset Growth (AG)
Stocks with low asset growth have abnormally high average returns (Cooper, Gulen, and Schill (2008)). The effect is not very robust to sorting within different size groups and is absent for large stocks (Fama and French (2008)). Asset growth is the percentage change in total assets (Compustat item AT).
Accruals (AC)
Stocks with low accruals have abnormally high average returns (Sloan (1996)). Accruals is the change in current assets (Compustat item ACT) minus the change in cash and short-term investments (Compustat item CASH) minus the change in current total liabilities (Compustat item LCT) plus the change in debt in current liabilities (Compustat item DLC) plus the change in income taxes payable (Compustat item TXP) minus depreciation and amortization (Compustat item DP). All of this is divided by the average of total assets (Compustat item AT) over fiscal year t − 1 and t − 2.
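The accruals definition amounts to simple arithmetic on two consecutive fiscal years of data, sketched below. The dictionary layout, the function name `accruals`, and the sample numbers are hypothetical; changes are assumed to be measured between fiscal years t − 2 (`prev`) and t − 1 (`cur`).

```python
def accruals(cur, prev):
    """Accruals scaled by average total assets, following the stated formula.

    cur, prev: dicts keyed by the item names used in the text
    (ACT, CASH, LCT, DLC, TXP, DP, AT) for fiscal years t-1 and t-2.
    """
    def delta(item):
        return cur[item] - prev[item]

    numerator = (
        delta("ACT")        # change in current assets
        - delta("CASH")     # minus change in cash and short-term investments
        - delta("LCT")      # minus change in current liabilities
        + delta("DLC")      # plus change in debt in current liabilities
        + delta("TXP")      # plus change in income taxes payable
        - cur["DP"]         # minus depreciation and amortization
    )
    average_assets = 0.5 * (cur["AT"] + prev["AT"])
    return numerator / average_assets

# Made-up numbers for illustration.
prev = {"ACT": 100.0, "CASH": 25.0, "LCT": 60.0, "DLC": 10.0, "TXP": 5.0, "DP": 9.0, "AT": 480.0}
cur = {"ACT": 120.0, "CASH": 30.0, "LCT": 70.0, "DLC": 15.0, "TXP": 6.0, "DP": 10.0, "AT": 520.0}
ac = accruals(cur, prev)
```

Here the numerator is 20 − 5 − 10 + 5 + 1 − 10 = 1 and average assets are 500, giving accruals of 0.002.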
Abnormal Investment (AI)
Stocks with low abnormal investment have abnormally high average returns (Fairfield, Whisenant, and Yohn (2003), Titman, Wei, and Xie (2004)). Abnormal investment is the deviation of current investment from its moving average over the past three years. Investment is defined to be the ratio of capital expenditure (Compustat item CAPX) to net sales (Compustat item SALE).
Net Operating Assets (NOA)
Stocks with low net operating assets have abnormally high average returns (Hirshleifer, Hou, Teoh, and Zhang (2004)). Net operating assets is defined as follows:
where AT is total assets, CHE is cash and short-term investments, DLC is debt in current liabilities, DLTT is long term debt, MIB is non-controlling interest, PSTK is preferred capital stock, and CEQ is common equity.
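The display equation itself is not reproduced above. A reconstruction consistent with the variable list would be operating assets minus operating liabilities, as in the standard definition of Hirshleifer et al. (2004); the scaling by lagged total assets is an assumption taken from that standard definition and is not confirmed by the text here:

```latex
\mathrm{NOA}_t
  = \frac{\left(AT_t - CHE_t\right)
          - \left(AT_t - DLC_t - DLTT_t - MIB_t - PSTK_t - CEQ_t\right)}
         {AT_{t-1}}
```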
Investment-to-Capital (IK)
Stocks with low investment-to-capital ratios have abnormally high average returns (Xing (2008)). Investment to capital is the ratio of capital expenditure (Compustat item CAPX) over property, plant, and equipment (Compustat item PPENT).
Investment Growth (IG)
Stocks with low investment growth rates have abnormally high average returns (Xing (2008)). Investment growth is the percentage change in capital expenditure (Compustat item CAPX).
A.3 Prior Returns
Momentum (MOM)
Stocks with high returns over the last year have abnormally high average returns for the next few months (Jegadeesh and Titman (1993), Chan, Jegadeesh, and Lakonishok (1996)). The effect is robust to sorting within different size groups (Fama and French (2008)). Momentum in month t is defined as the cumulated continuously compounded stock return from month t − 12 to month t − 2.
Long-term Reversal (LTR)
Stocks with low returns over the past 3-5 years have abnormally high average returns (DeBondt and Thaler (1985)). The effect is not present after accounting for the Fama French factors (Fama and French (1996)). Long-term reversal in month t is defined as the cumulated continuously compounded stock return from month t − 60 to month t − 13.
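Both prior-return signals are sums of monthly log returns over a lagged window, which can be sketched as below. The function names and the assumption that `returns` is a list of simple monthly returns indexed by month are illustrative only.

```python
import math

def cumulative_log_return(returns, t, start_lag, end_lag):
    """Sum of log(1 + r) over months t - start_lag through t - end_lag inclusive."""
    first, last = t - start_lag, t - end_lag
    if first < 0:
        raise ValueError("not enough return history")
    return sum(math.log1p(r) for r in returns[first:last + 1])

def momentum(returns, t):
    """Months t-12 to t-2 (11 months, skipping the most recent month)."""
    return cumulative_log_return(returns, t, 12, 2)

def long_term_reversal(returns, t):
    """Months t-60 to t-13 (48 months)."""
    return cumulative_log_return(returns, t, 60, 13)

# Constant 1% monthly return for 61 months; signals evaluated at t = 60.
returns = [0.01] * 61
mom = momentum(returns, 60)
ltr = long_term_reversal(returns, 60)
```

With a constant 1% monthly return, momentum equals 11·log(1.01) and long-term reversal equals 48·log(1.01), which confirms the window lengths.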
A.4 Earnings
Return on Assets (ROA)
Stocks with high return on assets have abnormally high average returns (Chen et al. (2010)). Return on assets is defined to be the ratio of income before extraordinary items (Compustat item IBQ) over total assets (Compustat item ATQ).
Standardized unexpected earnings (SUE)
Post-earnings announcement drift is the tendency for a stock’s returns to drift in the direction of an earnings surprise for several weeks after an earnings announcement. Stocks with high SUE have abnormally high average returns (Ball and Brown (1968), Bernard and Thomas (1989)). SUE is defined to be the change in the most recently announced quarterly earnings per share (Compustat item EPSPIQ) from its announced value four quarters ago divided by the standard deviation of the change in quarterly earnings over the prior eight quarters.
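The SUE definition can be sketched as follows. Two choices are assumptions: the eight-quarter window excludes the current quarter, and the sample (n − 1) standard deviation is used. The function name `sue` and the toy series are hypothetical.

```python
import math
import statistics

def sue(eps):
    """Standardized unexpected earnings from a list of quarterly EPS values.

    eps must be ordered oldest to newest and contain at least 13 quarters
    (9 year-over-year changes: the latest one plus 8 for the window).
    """
    changes = [eps[q] - eps[q - 4] for q in range(4, len(eps))]
    if len(changes) < 9:
        raise ValueError("need at least 13 quarters of EPS")
    latest, window = changes[-1], changes[-9:-1]
    return latest / statistics.stdev(window)

# Build a toy EPS series whose year-over-year changes are known.
eps = [0.0] * 4
for change in [1, -1, 1, -1, 1, -1, 1, -1, 2]:
    eps.append(eps[-4] + change)
sue_value = sue(eps)
```

The eight window changes alternate between +1 and −1, so their sample standard deviation is sqrt(8/7), and the latest change of 2 gives SUE = 2 / sqrt(8/7).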
Return on Equity (ROE)
More profitable firms have abnormally high average returns (Haugen and Baker (1996), Cohen, Gompers, and Vuolteenaho (2002), Piotroski (2000), Fama and French (2006)). The effect is not as robust as there is little evidence that unprofitable firms have unusually low returns (Fama and French (2008)). Return on equity is defined to be the ratio of equity income over book value of equity. Equity income is income before extraordinary items (Compustat item IB) minus preferred dividends (Compustat item DVP) plus deferred income taxes (Compustat item TXDI), if available.
Sales Growth (SG)
Stocks with low past sales growth have abnormally high average returns (Lakonishok, Shleifer, and Vishny (1994)). Sales growth is the percentage change in net sales (Compustat item SALE).
A.5 Financial Distress
Ohlson Score (OS)
Stocks with lower Ohlson score (lower probability of default) have abnormally high average returns. OS is computed using Model One of Table 4 of Ohlson (1980).
Market Leverage (LEV)
Stocks with higher market leverage have abnormally high average returns (Bhandari (1988)). The predictive power of leverage is subsumed by the book-to-market effect in returns (Fama and French (1992)). Market leverage is the ratio of total assets (Compustat item AT) over the market value of equity.
A.6 External Financing
Net Stock Issues (NSI)
Stocks with low net stock issues have abnormally high average returns (Fama and French (2008), Pontiff and Woodgate (2008)), where returns after stock repurchases are high (Ikenberry, Lakonishok, and Vermaelen (1995)) and returns after stock issues are low (Loughran and Ritter (1995)). Net stock issues is the log of the ratio of split-adjusted shares outstanding at fiscal year end t − 1 and t − 2. Split-adjusted shares outstanding is the product of common shares outstanding (Compustat item CSHO) and the cumulative adjustment factor (Compustat item ADJEXC).
Composite Issuance (CI)
Stocks with low composite issuance have abnormally high average returns (Daniel and Titman (2006)). The five year composite issuance measure is defined as:
\[
\iota(t-\tau) = \log\left(\frac{ME_t}{ME_{t-\tau}}\right) - r(t-\tau, t)
\]
where r(t − τ, t) is the cumulative log return on the stock from the last trading day of calendar year t − 6 to the last trading day of calendar year t − 1, and ME_t (ME_{t−τ}) is total market equity on the last trading day of calendar year t (t − 6).
A.7 Other
Organization Capital (OK)
Eisfeldt and Papanikolaou (2012) find that firms with more organization capital relative to industry peers outperform firms with less organization capital. The stock of organization capital equals (1 − depreciation rate) times the previous period's stock of organization capital, plus the deflated value of selling, general, and administrative expenses (Compustat item XSGA). Following the original paper, we sort on the ratio of organization capital to physical capital.
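The accumulation rule for organization capital is a perpetual-inventory recursion, sketched below. The default depreciation rate, the zero initial stock, and the use of a price index as the deflator are placeholder assumptions; this appendix does not state the values used.

```python
def organization_capital(sga, price_index, depreciation=0.15, initial_stock=0.0):
    """Perpetual-inventory stock: OC_t = (1 - d) * OC_{t-1} + SGA_t / P_t.

    sga:          SG&A expenses per period (Compustat item XSGA).
    price_index:  deflator per period (assumed; same length as sga).
    depreciation: per-period depreciation rate d (placeholder value).
    """
    stock = initial_stock
    path = []
    for expense, deflator in zip(sga, price_index):
        stock = (1.0 - depreciation) * stock + expense / deflator
        path.append(stock)
    return path

# Two periods of SG&A of 100 with a flat price index and 50% depreciation.
path = organization_capital([100.0, 100.0], [1.0, 1.0], depreciation=0.5)
```

With 50% depreciation the stock is 100 after the first period and 0.5 × 100 + 100 = 150 after the second.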
Liquidity Risk (LIQ)
Firms with high liquidity betas have higher returns than firms with low liquidity betas (Pastor and Stambaugh (2003)). Liquidity beta is measured as the loading on innovations in aggregate liquidity, in a regression of excess returns on the Fama French three factors and aggregate liquidity innovation.
Turnover (TO)
Average turnover over the past 3-12 months is negatively related to subsequent returns (Lee and Swaminathan (2000)). Turnover is defined to be the ratio of shares traded over shares outstanding.
Idiosyncratic Return Volatility (VOL)
Ang et al. (2006) find that firms with high idiosyncratic return volatility have abnormally low returns. Idiosyncratic volatility is measured as the standard deviation of residuals from a regression of daily excess returns on the Fama French three factor model.
Market Beta (BETA)
Frazzini and Pedersen (2011) find that a portfolio long on assets with high market betas and short on assets with low market betas exhibits significantly negative risk-adjusted returns. Market beta is estimated as the sum of the coefficients from regressing an asset’s daily excess returns on current and lagged excess returns of the market portfolio, with lags up to 5 trading days.
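The market beta estimate described above, a sum of slope coefficients on current and lagged market returns, can be sketched with ordinary least squares. The function name `sum_beta` and the simulated series are hypothetical; the estimation window is not specified here.

```python
import numpy as np

def sum_beta(asset_excess, market_excess, n_lags=5):
    """Sum of OLS slopes on the current and n_lags lagged market returns."""
    y = np.asarray(asset_excess, dtype=float)[n_lags:]
    m = np.asarray(market_excess, dtype=float)
    T = len(m)
    # Column k holds the market return lagged k days, aligned with y.
    lagged = [m[n_lags - k:T - k] for k in range(n_lags + 1)]
    X = np.column_stack([np.ones(len(y))] + lagged)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:].sum()               # exclude the intercept

# Noise-free toy asset: loads 0.8 on today's and 0.3 on yesterday's market return.
rng = np.random.default_rng(2)
market = 0.01 * rng.standard_normal(300)
asset = 0.8 * market
asset[1:] += 0.3 * market[:-1]
beta = sum_beta(asset, market)
```

Because the toy asset is an exact linear function of current and one-day-lagged market returns, the summed coefficient recovers 0.8 + 0.3 = 1.1.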
      1971-2011                 1971-1991                 1992-2011
Rank  C1    C2    C3    Frac.   C1    C2    C3    Frac.   C1    C2    C3    Frac.
1     MOM   OS    AG    0.67    MOM   IA    NOA   0.83    MOM   ROA   AG    0.92
2     SIZE  MOM   VOL   0.63    MOM   LTR   NOA   0.83    MOM   NSI   AG    0.92
3     SIZE  LIQ   VOL   0.63    MOM   NOA   CP    0.83    MOM   NSI   LIQ   0.92
4     BM    MOM   CP    0.63    MOM   CP    LIQ   0.83    MOM   CI    LIQ   0.92
5     MOM   IA    EP    0.63    MOM   CP    TO    0.83    ROA   AG    NOA   0.92
6     MOM   NSI   LIQ   0.63    SIZE  MOM   NOA   0.79    ROA   AG    IK    0.92
7     MOM   AG    CP    0.63    MOM   IA    CP    0.79    ROA   AG    TO    0.92
8     MOM   CI    LIQ   0.63    MOM   LTR   OS    0.79    AG    SUE   CP    0.92
9     SUE   AI    CP    0.63    MOM   AG    NOA   0.79    MOM   AG    SUE   0.88
10    SUE   CP    IG    0.63    MOM   AC    CP    0.79    MOM   AG    CP    0.88
11    SUE   CP    LIQ   0.63    MOM   AI    CP    0.79    MOM   SUE   CI    0.88
12    CP    IG    LIQ   0.63    MOM   CP    IK    0.79    ROA   AG    AI    0.88
13    SIZE  VOL   BETA  0.58    MOM   CP    BETA  0.79    ROA   AG    LIQ   0.88
14    BM    SUE   CP    0.58    IA    OS    CP    0.79    ROA   AG    SG    0.88
15    MOM   IA    CP    0.58    IA    NSI   EP    0.79    AG    SUE   CI    0.88
16    MOM   AG    EP    0.58    IA    SUE   NOA   0.79    AG    SUE   EP    0.88
17    MOM   AI    CP    0.58    IA    SUE   CP    0.79    AG    SUE   ROE   0.88
18    MOM   CP    IG    0.58    IA    EP    IG    0.79    AG    SUE   VOL   0.88
19    MOM   CP    LIQ   0.58    IA    EP    LIQ   0.79    AG    LIQ   VOL   0.88
20    MOM   CP    SG    0.58    BM    MOM   NOA   0.75    AG    VOL   TO    0.88
Table B.1 lists the characteristic-based factors that constitute the top twenty linear four-factor models, in terms of the proportion of remaining characteristics they can capture, via the equal-weighted method. We say that a factor model M captures, or spans, a characteristic C, if the p-value from the Gibbons et al. (1989) F-test of joint significance of abnormal average return with respect to M across the ten sorted portfolios on C is above 10%. Top factor models are shown for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
The universe of factor models is all four-factor models consisting of the market portfolio and three characteristic return factors (C1, C2, C3) from our list of 27. Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.
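The size of each model universe follows from counting combinations of the 27 characteristics, as the short sketch below illustrates. The `"MKT"` label and variable names are hypothetical. The count of 351 three-factor models matches the abstract; 2,925 is the corresponding count for four-factor models.

```python
from itertools import combinations

CHARACTERISTICS = [
    "SIZE", "BM", "DP", "EP", "CP", "IA", "AG", "AC", "AI", "NOA", "IK", "IG",
    "MOM", "LTR", "ROA", "SUE", "ROE", "SG", "OS", "LEV", "NSI", "CI", "OK",
    "LIQ", "TO", "VOL", "BETA",
]

# Every model contains the market plus a set of characteristic-based factors.
three_factor = [("MKT",) + pair for pair in combinations(CHARACTERISTICS, 2)]
four_factor = [("MKT",) + triple for triple in combinations(CHARACTERISTICS, 3)]

# Characteristics left over as test assets for each model.
remaining_three = len(CHARACTERISTICS) - 2   # 25
remaining_four = len(CHARACTERISTICS) - 3    # 24
```

The remaining-characteristic counts are consistent with the tabulated fractions: values in Table 11 are multiples of 1/25 (for example 0.08 = 2/25), and values in Table B.1 are multiples of 1/24 (for example 0.67 ≈ 16/24).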
Figure B.1: Four-Factor Model Performance
[Figure B.1 plots not reproduced. Panels: (a) 1971-2011: Equal-weighted; (b) 1971-2011: Characteristic Freq; (c) 1971-1991: Equal-weighted; (d) 1971-1991: Characteristic Freq; (e) 1992-2011: Equal-weighted; (f) 1992-2011: Characteristic Freq. Horizontal axis: percentile of factor models (0 to 100). Vertical axis: percentage of characteristics matched (equal-weighted panels) or weighted fraction (characteristic-frequency panels). The CAPM and FF models are marked in each panel.]
Figure B.1 displays the distribution of four-factor model performance, as measured by the percentage of characteristics matched, over the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011. The universe of factor models is all four-factor models consisting of the market portfolio and three characteristic return factors from our list of 27. The percentage of characteristics matched is computed using two characteristic weighting methods: equal-weighted method and characteristic matching frequency method. The “equal-weighted” method gives an equal weight to each characteristic matched. The “characteristic matching frequency” method gives each characteristic a weight of 1 minus the proportion of factor models that can match the cross-section of returns based on the characteristic under consideration. For comparison, the figures also show the rankings of the CAPM and the Fama-French-Carhart four-factor model (consisting of the market, SMB, HML, and MOM).
Figure B.2: Four-Factor Model Performance
Figure B.2 shows a heatmap matrix representation of overall factor model performance. The universe of factor models is all four-factor models consisting of the market portfolio and three characteristic return factors from our list of 27. Factors are the high minus low portfolio from sorting firms into ten portfolios with respect to the underlying firm characteristic. Factor models are ordered along the x-axis in increasing proportion of characteristics matched; characteristics are ordered along the y-axis in decreasing frequency matched (listed in parentheses). Cell (i, j) is shaded black if factor model i is able to match characteristic j, shaded gray if factor model i is unable to match characteristic j, and shaded white if factor model i includes a factor constructed from characteristic j. We present figures for the whole sample period 1971-2011 and subsamples 1971-1991 and 1992-2011.
Characteristic abbreviations are as follows: size (SIZE), book-to-market (BM), dividend-to-price (DP), earnings-to-price (EP), cash flow-to-price (CP), investment-to-assets (IA), asset growth (AG), accruals (AC), abnormal investment (AI), net operating assets (NOA), investment-to-capital (IK), investment growth (IG), momentum (MOM), long-term reversal (LTR), return on assets (ROA), standardized unexpected earnings (SUE), return on equity (ROE), sales growth (SG), Ohlson score (OS), market leverage (LEV), net stock issues (NSI), composite issuance (CI), organization capital (OK), liquidity risk (LIQ), turnover (TO), idiosyncratic return volatility (VOL), and market beta (BETA). Details on characteristic definitions and construction are in Appendix A.