Appendix
A Large-Scale Approach
for Evaluating Asset Pricing Models
Laurent Barras∗
This version: March 24, 2018
∗Desautels Faculty of Management, McGill University, Montreal. E-mail: [email protected].
I The Cross-Section of Micro Portfolios
A Portfolio Formation Procedure
We describe the procedure for forming the set of micro portfolios in each size group
(tiny-, small-, and big-cap). First, we sort the stocks in each formation year $y$
($y = 1, \dots, Y$) according to their estimated average returns. To compute this variable,
denoted by $\hat{\mu}_{j,y}$ ($j = 1, \dots, J_y$), we use a linear combination of firm characteristics.1
For each month $t$ prior to the formation date, we run a cross-sectional regression of
the monthly stock excess returns on the most recently observed characteristics:
$r_{j,t} = c_t' z_{j,t} + \varepsilon_{j,t}$, where $z_{j,t}$ denotes the vector of characteristics including a constant. Then,
we estimate the characteristic-based average return as
$$\hat{\mu}_{j,y} = \bar{c}_y' z_{j,y}, \tag{A1}$$
where $z_{j,y}$ is the vector of characteristics observed in the formation year and $\bar{c}_y$ is
the time-series average of the monthly vector of coefficients. To facilitate the chaining of
portfolio returns over consecutive years, we work with the standardized average return
computed as
$$\tilde{\mu}_{j,y} = \frac{\hat{\mu}_{j,y} - \frac{1}{J_y}\sum_{j=1}^{J_y} \hat{\mu}_{j,y}}{\left(\frac{1}{J_y}\sum_{j=1}^{J_y} \hat{\mu}_{j,y}^2 - \left(\frac{1}{J_y}\sum_{j=1}^{J_y} \hat{\mu}_{j,y}\right)^2\right)^{1/2}}. \tag{A2}$$
Second, we construct, for each stock $j$, a micro portfolio by equally weighting the stock
itself and $m-1$ additional stocks with the nearest values to $\tilde{\mu}_{j,y}$. This technique is called
local averaging and borrows from Efron (2010, ch. 9). Third, we chain the portfolio
returns over time to obtain stable average returns. For each pair $(j, j')$ of micro portfolios
in years $y$ and $y+1$, we compute the distance between them as $|\tilde{\mu}_{j,y} - \tilde{\mu}_{j',y+1}|$.2 Then,
we match the portfolios with the lowest distance (each year-$y$ portfolio can only be
paired with one year-$(y+1)$ portfolio). To minimize changes in portfolio composition, we
match the pair $(j, j)$ first if $|\tilde{\mu}_{j,y} - \tilde{\mu}_{j,y+1}|$ is in the bottom 1% of all measured distances.
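The first two formation steps (standardization and local averaging) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the tie-breaking rule, and the six hypothetical average returns are ours, and the chaining of portfolios across years is not covered.

```python
from statistics import fmean, pstdev

def standardize(mu_hat):
    """Cross-sectionally standardize estimated average returns (mean 0, std 1)."""
    mean = fmean(mu_hat)
    sd = pstdev(mu_hat)  # population standard deviation (divides by J)
    return [(x - mean) / sd for x in mu_hat]

def micro_portfolios(mu_tilde, m):
    """For each stock j, return the sorted indices of the m stocks (j included)
    whose standardized average returns are closest to that of stock j."""
    portfolios = []
    for j, target in enumerate(mu_tilde):
        # Rank stocks by distance to stock j. Ties are broken in favor of
        # stock j itself and then by index (an assumption of this sketch).
        ranked = sorted(
            range(len(mu_tilde)),
            key=lambda s: (abs(mu_tilde[s] - target), s != j, s),
        )
        portfolios.append(sorted(ranked[:m]))
    return portfolios

# Hypothetical estimated average returns for six stocks (% per month).
mu_hat = [0.2, 1.0, 0.4, 0.9, 0.5, 1.4]
mu_tilde = standardize(mu_hat)
portfolios = micro_portfolios(mu_tilde, m=3)
```

Because one portfolio is built around each stock, the number of portfolios equals the number of stocks and neighboring portfolios overlap heavily.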
1The characteristic-average return relation used here should be distinguished from previous studies
that impose a linear relation between characteristics and pricing errors (Avramov and Chordia (2006),
Brennan, Chordia, and Subrahmanyam (1998)). In our case, two stocks can have similar characteristics
and yet different pricing errors because they are not exposed to the same risk factors.
2Alternatively, we can compute the estimated average return of the newly-created portfolio as
$\frac{1}{m}\left(\tilde{\mu}_{j,y} + \sum_{s=1}^{m-1} \tilde{\mu}_{j_s,y}\right)$, where $j_s$ ($s = 1, \dots, m-1$) denotes the identity of the $m-1$ additional stocks
included in the portfolio. Using this approach leaves the results unchanged.
In Figure A1, we illustrate the portfolio formation procedure in a population of 50
stocks ($J_1 = J_2 = 50$) over a 2-year sample period ($Y = 2$). Each dot denotes the
ordered value $\tilde{\mu}_{j,y}$ at the start of each year. We see that the portfolio composition
changes each year to account for the time-variation in characteristics. For instance, the
portfolio associated with the median average return $\tilde{\mu}_{25}$ includes stocks S10, S42, ...,
S3 in year 1, and stocks S18, S6, ..., S46 in year 2. The formation procedure yields a total
number of micro portfolios, $n$, equal to the number of stocks ($n = J$).
In practice, the formation procedure is more complicated because the number of
stocks changes over time. Suppose that the number of stocks in year 2 is equal to 60
(instead of 50). Applying the matching procedure described above, we can pair 50 year-1
and 50 year-2 portfolios, which leaves 10 year-2 portfolios unmatched. In this example,
the cross-section includes 60 micro portfolios ($n = 60$) with unequal time-series lengths:
(i) 50 portfolios created in year 1 with complete return history (24 monthly returns),
(ii) 10 unmatched portfolios created in year 2 with 12 monthly return observations.
Conversely, suppose that we have 60 portfolios in year 1 (instead of 50). In this case, we
can only pair 50 year-1 portfolios, which leaves 10 year-1 portfolios unmatched ($n = 60$).
In general, the total number of micro portfolios is therefore equal to $n = \max_y(J_y)$.
Please insert Figure A1 here
B Definition of the Characteristics
To compute the firm’s average return, we follow Fama and French (2008) and use a linear
combination of three characteristics: book-to-market, profitability, and investment. We
use the definitions of Fama and French (2008, 2015) to measure these characteristics at
the end of June of each year $y$ and winsorize the data at the 1% and 99% levels to
limit the influence of outliers. The book-to-market ratio is equal to the ratio of the book value of
equity to the market value of equity. The book value for year $y$ is defined as total
assets minus liabilities, plus balance sheet deferred taxes and investment tax credit (if
available), minus the liquidating value of preferred stock (if available) or its carrying value
(if available) in the fiscal year ending in the calendar year $y-1$. The market value for
year $y$ equals the price times shares outstanding at the end of December of year $y-1$.
Investment for year $y$ is computed as the relative change in total assets between the
fiscal years ending in calendar years $y-2$ and $y-1$. Finally, profitability for year $y$ is
defined as revenues minus cost of goods sold, minus selling, general, and administrative
expenses, minus interest expense, all divided by the book value of equity. Each of these
variables is computed using data in the fiscal year ending in the calendar year $y-1$.
We also use the definitions in Hou, Xue, and Zhang (2015) to construct alternative
proxies for the three characteristics. First, we replace the book-to-market ratio with
the earnings-to-price and cash flow-to-price ratios. Earnings for year $y$ are defined as
income before extraordinary items. Cash flows for year $y$ are computed as
income before extraordinary items, plus equity’s share of depreciation, plus deferred
taxes (if available). Each of these variables is computed using data in the fiscal year
ending in the calendar year $y-1$. Price for year $y$ equals the market value measured
at the end of December of year $y-1$.
Second, we measure investment using infrastructure growth and inventory growth.
Infrastructure growth for year $y$ is defined as the change in gross property, plant, and
equipment, plus changes in inventory in the fiscal year ending in the calendar year $y-1$,
scaled by total assets in the fiscal year ending in the calendar year $y-2$. Inventory
growth is defined as the relative change in capital expenditure between the fiscal years
ending in calendar years $y-2$ and $y-1$.
Third, we measure profitability using the Return On Equity (ROE) and the Return
On Assets (ROA). ROE for year $y$ is defined as income before extraordinary items in
the fiscal year ending in the calendar year $y-1$ divided by book equity in the fiscal
year ending in the calendar year $y-2$. ROA for year $y$ is defined as income before
extraordinary items in the fiscal year ending in the calendar year $y-1$ divided by total
assets in the fiscal year ending in the calendar year $y-2$.
II Estimation Procedure
A Extended Two-Pass Regression
We provide a general description of the econometric framework for estimating the pricing
errors of the micro portfolios. Under model $k$ ($k = 1, \dots, K$), the excess return of each
portfolio $i$ ($i = 1, \dots, n$) can be written as
$$r_{i,t} = a_{i,k} + b_{i,m} f_{m,t} + b_{i,k}' f_{k,t} + b_{i,o}' f_{o,t} + e_{i,t}, \tag{A3}$$
where $f_{m,t}$ is the market excess return, $f_{k,t}$ is the $K_k$-vector of risk factors specific to model
$k$, $f_{o,t}$ is the $K_o$-vector of factors included in the other models and orthogonal to $f_{m,t}$ and
$f_{k,t}$, and $e_{i,t}$ denotes the residual term. The intercept $a_{i,k}$ is equal to
$$a_{i,k} = \alpha_{i,k} - b_{i,k}' \phi_k, \tag{A4}$$
where $\alpha_{i,k}$ is the portfolio pricing error and $\phi_k$ is the $K_k$-vector of forward prices of the
risk factors.3 Using Equation (A4), we can write the pricing error as
$$\alpha_{i,k} = c_k' \nu_{i,k}, \tag{A5}$$
where the $(K_k + 1)$-vectors $c_k$ and $\nu_{i,k}$ are defined as $c_k = [1, \phi_k']'$ and $\nu_{i,k} = [a_{i,k}, b_{i,k}']'$. To
estimate $\alpha_{i,k}$, we build on recent work by Gagliardini, Ossola, and Scaillet (2016; GOS
hereafter) who extend the traditional two-pass regression to a large and unbalanced
panel of test assets, two important features exhibited by micro portfolios.
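In the first step, we run a time-series regression of $r_{i,t}$ on the $(K_k + K_o + 2)$-vector
$x_t = [1, f_{m,t}, f_{k,t}', f_{o,t}']'$ for each portfolio $i$. The OLS estimator of the $(K_k + K_o + 2)$-vector
of coefficients $\beta_i = [a_{i,k}, b_{i,m}, b_{i,k}', b_{i,o}']'$ is given by
$$\hat{\beta}_i = \left(\sum_{t=1}^{T} 1_{i,t}\, x_t x_t'\right)^{-1} \sum_{t=1}^{T} 1_{i,t}\, x_t r_{i,t}, \tag{A6}$$
where $T$ is the total number of observations, and $1_{i,t}$ equals one if $r_{i,t}$ is non-missing. The
matrix inversion in Equation (A6) is numerically unstable if only a few return observations
are available. To address this issue, GOS introduce the following trimming device:
$$1_i^{\chi} = 1\left\{\tau_i \le \chi_1,\ CN(\hat{Q}_{x,i}) \le \chi_2\right\}, \tag{A7}$$
where $\tau_i = T/T_i$, $T_i = \sum_{t=1}^{T} 1_{i,t}$, and $CN(\hat{Q}_{x,i}) = \left(eig_{\max}(\hat{Q}_{x,i}) / eig_{\min}(\hat{Q}_{x,i})\right)^{1/2}$ denotes
the condition number of the matrix $\hat{Q}_{x,i} = \frac{1}{T_i}\sum_{t=1}^{T} 1_{i,t}\, x_t x_t'$. Following GOS, we set
$\chi_1 = 606/60$ (a minimum of 60 monthly observations) and $\chi_2 = 15$.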
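In the second step, we estimate the $K_k$-vector of forward prices $\phi_k$ using a cross-
sectional regression of the estimated intercept $\hat{a}_{i,k}$ on the $K_k$-vector of estimated betas
$\hat{b}_{i,k}$, keeping the non-trimmed portfolios only:
$$\hat{\phi}_k^{(1)} = -\left(\sum_{i=1}^{n} 1_i^{\chi}\, \hat{b}_{i,k} \hat{b}_{i,k}'\right)^{-1} \sum_{i=1}^{n} 1_i^{\chi}\, \hat{b}_{i,k}\, \hat{a}_{i,k}. \tag{A8}$$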
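The two passes can be sketched with NumPy as follows. This is an illustrative sketch under several assumptions of ours: the function names are hypothetical, returns are simulated without noise, factors are scaled to unit variance so that the condition-number bound is not binding, missing returns are coded as NaN, and the bias adjustment that follows in the text is omitted.

```python
import numpy as np

def first_pass(r, X, chi1=606 / 60, chi2=15.0):
    """Time-series OLS for one portfolio with missing returns coded as NaN.

    r : (T,) excess returns; X : (T, p) regressors with a leading constant.
    Returns (beta_hat, keep), where keep is the trimming indicator.
    """
    obs = ~np.isnan(r)
    T, T_i = len(r), int(obs.sum())
    if T_i <= X.shape[1]:          # too few observations to run the regression
        return None, False
    X_i, r_i = X[obs], r[obs]
    Q = X_i.T @ X_i / T_i          # second-moment matrix of the regressors
    eig = np.linalg.eigvalsh(Q)    # eigenvalues in ascending order
    cond = np.sqrt(eig[-1] / eig[0])
    keep = bool(T / T_i <= chi1 and cond <= chi2)
    beta = np.linalg.solve(X_i.T @ X_i, X_i.T @ r_i)
    return beta, keep

def second_pass(a_hat, b_hat):
    """Cross-sectional regression of intercepts on betas (no constant),
    with a minus sign, giving a first-step estimate of the forward prices."""
    return -np.linalg.solve(b_hat.T @ b_hat, b_hat.T @ a_hat)

# Hypothetical example: one market factor and one model-specific factor.
rng = np.random.default_rng(0)
T, n, phi_true = 120, 50, 0.3
X = np.column_stack([np.ones(T), rng.standard_normal((T, 2))])  # [1, f_m, f_k]
b = rng.uniform(0.5, 1.5, size=(n, 2))
a = -b[:, 1] * phi_true                 # correctly priced portfolios
R = a[:, None] + b @ X[:, 1:].T         # (n, T) noise-free returns
R[0, :110] = np.nan                     # portfolio 0 keeps only 10 observations

results = [first_pass(R[i], X) for i in range(n)]
kept = [beta for beta, keep in results if keep]
a_hat = np.array([beta[0] for beta in kept])
b_hat = np.array([[beta[2]] for beta in kept])  # betas on the model factor
phi_hat = second_pass(a_hat, b_hat)
```

In this setup, portfolio 0 is trimmed because its ratio of total to available observations (120/10) exceeds 606/60, and the second pass recovers the forward price used to simulate the data.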
We adjust (1)
for the bias component Ψ= −1
³1
P=1
01
´, where =
[0 ] 1 = [0×1 ]0 is the × identity matrix, = 02
−1
−1 2
3The forward price of the market factor does not appear in Equation (A4) because $f_{m,t}$ is an excess
return which, by definition, has a forward price equal to zero.
= [0] = [2
0] and 2 is a ( + + 2)× ( + 1) matrix whose th row
( = 1 3 4 +1) has one for the th element and zeros everywhere else The final
estimate of is equal to
= (1) +1
Ψ
(A9)
where Ψis computed as −1
³1
P=1 1
01
(1)
´ =
P=1 1
=
1
P=1 1
0
, = 02
−1
−12 and (1) = [1 0
(1)]0 Following GOS, we es-
timate using the White estimator (1980): =1
P=1 1
2
0 where =
− 0 Plugging the estimated quantities in Equation (A5), we obtain
= 0
(A10)
B Estimation of the Portfolio $t$-Statistics
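We now prove Proposition 1, which provides an analytical expression for the $t$-statistic
of the portfolio pricing error and its asymptotic distribution.
Proof of Proposition 1. We consider the misspecified model $k$ and suppose that
the residual terms $e_{i,t}$ ($i = 1, \dots, n$) are weakly correlated. When the number of port-
folios and return observations grow large ($n, T \to \infty$), Proposition 7 of GOS shows
that $\hat{c}_k$ converges towards $c_k$ at a rate equal to $\sqrt{nT}$. In addition, standard results in
regression analysis reveal that the vector of estimated coefficients $\hat{\nu}_{i,k}$ is asymptotically
distributed as
$$\sqrt{T}\,(\hat{\nu}_{i,k} - \nu_{i,k}) \to N(0, V_i), \tag{A11}$$
where $V_i = E_2' Q_x^{-1} S_{ii} Q_x^{-1} E_2$. With $n$ in the thousands and $T$ in the hundreds, the
asymptotic sampling variation in $\hat{\alpha}_{i,k}$ is therefore only driven by that of $\hat{\nu}_{i,k}$, i.e.,
$$\sqrt{T}\,(\hat{\alpha}_{i,k} - \alpha_{i,k}) \to N\left(0, c_k' V_i c_k\right). \tag{A12}$$
Using this result, we compute the portfolio $t$-statistic as $t_{i,k} = \hat{\alpha}_{i,k} / \hat{\sigma}_{\alpha_{i,k}}$, where
$$\hat{\sigma}^2_{\alpha_{i,k}} = \frac{1}{T}\, \hat{c}_k' \hat{V}_i\, \hat{c}_k = \hat{c}_k' \hat{\Sigma}_{\nu_{i,k}} \hat{c}_k. \tag{A13}$$
The variance term $\hat{\Sigma}_{\nu_{i,k}}$ is equal to $\frac{1}{T}\hat{V}_i$, where $\hat{V}_i$ is a consistent estimator of the co-
variance matrix $V_i$. In addition, Equation (A12) implies that the $t$-statistic follows a
normal distribution, $t_{i,k} \to N\left(\alpha_{i,k} / \sigma_{\alpha_{i,k}}, 1\right)$, where $\alpha_{i,k} = c_k' \nu_{i,k}$ and $\sigma^2_{\alpha_{i,k}} = c_k' \Sigma_{\nu_{i,k}} c_k$.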
III Statistical Inference
A Proportion of Mispriced Portfolios
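We compute the proportion of portfolios that are mispriced by model $k$ as
$$\hat{\pi}_k = 1 - \frac{\hat{P}_k(I)}{P_0(I)} = 1 - \frac{\frac{1}{n}\sum_{i=1}^{n} 1_I(t_{i,k})}{\Phi_0(I)}, \tag{A14}$$
where $\hat{P}_k(I)$ is the estimated probability that the $t$-statistic falls in the interval $I = [-a, a]$,
$1_I(t_{i,k})$ is an indicator function equal to one if $t_{i,k}$ falls in $I$, and $P_0(I)$ is computed
from the standard normal cdf: $\Phi_0(I) = \Phi_0(a) - \Phi_0(-a) = 2\Phi_0(a) - 1$.
Proof of Proposition 2. We consider the misspecified model $k$ and suppose that
the residual terms $e_{i,t}$ ($i = 1, \dots, n$) are weakly correlated. We further assume that the
$t$-statistics are spatially ordered such that nearby $t$-statistics exhibit higher correlation.
When the number of portfolios grows large ($n \to \infty$), Lemma 2 of Farcomeni (2006)
shows that
$$\sqrt{n}\,(\hat{P}_k(I) - P_k(I)) \to N\left(0, \sigma^2_{I,k}\right), \tag{A15}$$
where $\sigma^2_{I,k} = \operatorname{var}(1_I(t_{1,k})) + 2\sum_{i=2}^{\infty} \operatorname{cov}(1_I(t_{1,k}), 1_I(t_{i,k}))$ and $t_{i,k}$ ($i = 1, \dots, \infty$) are the ordered
$t$-statistics. Because the variance of the estimated proportion only depends on that
of $\hat{P}_k(I)$ (Equation (A14)), the asymptotic distribution of the vector of estimated pro-
portions for two misspecified models $k$ and $l$ is given by
$$\sqrt{n}\begin{bmatrix} \hat{\pi}_k - \pi_k^* \\ \hat{\pi}_l - \pi_l^* \end{bmatrix} \to N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \dfrac{\sigma^2_{I,k}}{\Phi_0(I)^2} & \dfrac{\sigma_{I,kl}}{\Phi_0(I)^2} \\ \dfrac{\sigma_{I,kl}}{\Phi_0(I)^2} & \dfrac{\sigma^2_{I,l}}{\Phi_0(I)^2} \end{bmatrix}\right), \tag{A16}$$
where $\pi_k^* = E(\hat{\pi}_k)$ and $\pi_l^* = E(\hat{\pi}_l)$. The variance terms are given by
$$\sigma^2_{I,k} = \operatorname{var}(1_I(t_{1,k})) + 2\sum_{i=2}^{\infty} \operatorname{cov}(1_I(t_{1,k}), 1_I(t_{i,k})),$$
$$\sigma^2_{I,l} = \operatorname{var}(1_I(t_{1,l})) + 2\sum_{i=2}^{\infty} \operatorname{cov}(1_I(t_{1,l}), 1_I(t_{i,l})),$$
$$\sigma_{I,kl} = \operatorname{cov}(1_I(t_{1,k}), 1_I(t_{1,l})) + \sum_{i=2}^{\infty} \left[\operatorname{cov}(1_I(t_{1,k}), 1_I(t_{i,l})) + \operatorname{cov}(1_I(t_{1,l}), 1_I(t_{i,k}))\right], \tag{A17}$$
where $t_{i,k}$ and $t_{i,l}$ ($i = 1, \dots, \infty$) are the ordered $t$-statistics for models $k$ and $l$.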
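The estimator in Equation (A14) can be sketched in a few lines. The function name and the sample $t$-statistics below are hypothetical; the default bound of 0.4 is the baseline interval reported later in the appendix.

```python
from statistics import NormalDist

def mispricing_proportion(t_stats, a=0.4):
    """Estimated proportion of mispriced portfolios for the interval [-a, a].

    The share of t-statistics inside the interval is divided by the
    probability that a standard normal variable falls in the same interval.
    """
    phi0_interval = 2 * NormalDist().cdf(a) - 1
    share_inside = sum(1 for t in t_stats if -a <= t <= a) / len(t_stats)
    return 1 - share_inside / phi0_interval

# Hypothetical t-statistics: one of four falls inside [-0.4, 0.4].
pi_hat = mispricing_proportion([-2.1, 0.1, 1.8, 3.0])
```

In small samples the estimate can be negative when more $t$-statistics fall inside the interval than the standard normal distribution predicts; this sketch does not truncate it.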
Using Proposition 2, we can test the null hypothesis that model is correctly speci-
fied. Under the null hypothesis 0 : ∗ = 0 Genovese and Wasserman (2004) show that
the estimated mispricing proportion is asymptotically distributed as
√
→ 1
20 +
1
2+
µ0
2Φ0()2
¶ (A18)
where 0 is a point-mass at zero and + is a positive-truncated normal distribution To
test this hypothesis at the size level we determine whether is sufficiently far away
from zero using the following threshold:
1√
Φ0() (A19)
where is the consistent estimator of and is the quantile of the standard normal
distribution at (1-) To compute , we use the following estimator proposed by Newey-
West (1987):
2 =
⎡⎣ 1
X=1
1 ( )
⎤⎦− ()2 + 2
X1=1
⎡⎣ 1
−
−1X=1
1 ( )1 (1+
)
⎤⎦− ()2 (A20)
where is the number of cross-sectional lags.4
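One way to read the estimator in Equation (A20) is as the sample variance of the indicator function plus twice the sum of its first sample autocovariances across neighboring ordered portfolios. The sketch below follows that reading and rests on assumptions of ours: the autocovariances are equally weighted (no kernel weights), the $t$-statistics are already sorted by the portfolio ordering, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def indicator_variance(t_stats, a=0.4, n_lags=1):
    """Long-run variance of the indicator 1{-a <= t <= a} across ordered portfolios."""
    ind = [1.0 if -a <= t <= a else 0.0 for t in t_stats]
    n = len(ind)
    p_hat = sum(ind) / n
    variance = p_hat - p_hat ** 2          # an indicator equals its own square
    for lag in range(1, n_lags + 1):
        cross = sum(ind[i] * ind[i + lag] for i in range(n - lag)) / (n - lag)
        variance += 2 * (cross - p_hat ** 2)
    return variance

def proportion_std_error(t_stats, a=0.4, n_lags=1):
    """Standard error of the estimated mispricing proportion."""
    phi0_interval = 2 * NormalDist().cdf(a) - 1
    # Without kernel weights the variance estimate can be negative; floor it at zero.
    variance = max(indicator_variance(t_stats, a, n_lags), 0.0)
    return sqrt(variance) / (sqrt(len(t_stats)) * phi0_interval)

# Hypothetical ordered t-statistics.
var_hat = indicator_variance([0.0, 0.1, 2.0, 3.0], a=0.4, n_lags=1)
```

The flooring at zero in `proportion_std_error` is a safeguard of this sketch rather than something stated in the text.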
Proposition 2 also allows us to test the null hypothesis of equal performance between
two misspecified, possibly non-nested models. Under the null hypothesis 0 : ∆∗ =
∗ − ∗ = 0 the estimated difference ∆ = − is asymptotically distributed as
√(∆ −∆∗) →
µ02 + 2 − 2
Φ0()2
¶ (A21)
To implement this testing procedure, we compute the covariance term using the
following consistent estimator:
=
⎡⎣ 1
X=1
1 (1)1 (1)
⎤⎦− ()()
+
X1=1
⎡⎣ 1
−
−1X=1
1 (1)1 (1+
) + 1 (1)1 (1+
)
⎤⎦− ()() (A22)
4 In the baseline specification, we set equal to 1% of the total number of portfolios ( = 40 in the
entire population) to account for potential weak dependencies between portfolios. Setting = yields
similar volatility estimates.
B Sign of the Pricing Errors
We can extend the large-scale approach to conduct inference on the estimated propor-
tions of portfolios with negative and positive pricing errors. To compute both proportions,
denoted by $\hat{\pi}^-$ and $\hat{\pi}^+$, we use the procedure of Barras, Scaillet, and Wermers (2010).
First, we determine the proportions of portfolios with low or high estimated pricing
errors, $\hat{P}(I^-)$ and $\hat{P}(I^+)$, where $I^- = [-\infty, -a]$, $I^+ = [a, +\infty]$, and $-a$ and $a$ denote
the lower and upper bounds of the interval $I$. Second, we deduct the proportions of
"false discoveries", $(1 - \hat{\pi})\Phi_0(I^-)$ and $(1 - \hat{\pi})\Phi_0(I^+)$; both expressions measure the
proportions of correctly-priced portfolios which, by chance, have $t$-statistics falling in
the intervals $I^-$ and $I^+$. This two-step approach yields the following expressions for $\hat{\pi}^-$
and $\hat{\pi}^+$ and their variances:
$$\hat{\pi}^- = \hat{P}(I^-) - (1 - \hat{\pi})\Phi_0(I^-), \qquad \hat{\pi}^+ = \hat{P}(I^+) - (1 - \hat{\pi})\Phi_0(I^+), \tag{A23}$$
2−
= 2(−) +Φ0(−)22 − 2Φ0(−)(−)
2+
= 2(+) +Φ0(+)22 − 2Φ0(+)(+) (A24)
where 2(−)
= 12−
2(+)
= 12+ 2 =
1
2
Φ0()2 (−) =
1Φ0()
−
and (+) =1
Φ0()2+
The different components are given by
2− = (1−(1)) + 2∞X=2
(1−(1) 1−( ))
2+ = (1+(1)) + 2
∞X=2
(1+(1) 1+( ))
− = (1−(1) 1 (1)) +
∞X=2
(1−(1) 1 ( )) + (1 (1) 1
−( ))
+ = (1+(1) 1 (1)) +
∞X=2
(1+(1) 1 ( )) + (1 (1) 1
+( )) (A25)
where 1−( ) (1+( )) is an indicator function equal to one if
falls in − (+). After
replacing the above expressions with the consistent estimators proposed by Newey-West
(1987) we can conduct inference on the two proportions − and +
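The two-step calculation of the signed proportions can be sketched as follows. The function name and the sample $t$-statistics are hypothetical, and the tails are taken as strict inequalities so that no $t$-statistic is counted twice (an assumption of this sketch).

```python
from statistics import NormalDist

def signed_proportions(t_stats, a=0.4):
    """Return (pi_minus, pi_plus, pi_hat) for the interval [-a, a]."""
    n = len(t_stats)
    cdf = NormalDist().cdf
    share_inside = sum(1 for t in t_stats if -a <= t <= a) / n
    share_low = sum(1 for t in t_stats if t < -a) / n
    share_high = sum(1 for t in t_stats if t > a) / n
    pi_hat = 1 - share_inside / (2 * cdf(a) - 1)
    # Share of correctly priced portfolios expected in each tail by chance;
    # both tails have the same standard normal probability cdf(-a).
    false_discoveries = (1 - pi_hat) * cdf(-a)
    return share_low - false_discoveries, share_high - false_discoveries, pi_hat

# Hypothetical t-statistics: two in the lower tail, three in the upper tail.
pi_minus, pi_plus, pi_hat = signed_proportions(
    [-3.0, -2.0, -0.1, 0.0, 0.2, 1.5, 2.5, 3.0]
)
```

By construction the two signed proportions add up to the overall estimated mispricing proportion.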
C Testing for Useless Factors
We now explain how to test whether the sth factor included in model is useless.
For each portfolio ( = 1 ) the beta on factor obtained from the first-pass
regression in Equation (A6) is asymptotically distributed as
√ ( − )
→ ¡0 0+1+1
¢ (A26)
where +1 is a ( + 1)-vector whose ( + 1)th element is one and the others are zero.
Using this result, we compute the associated -statistic as
where the estimated
variance of is given by
=1
0+1+1 (A27)
Then, we compute the proportion of portfolios with non-zero betas on factor using
the same expression as in Equation (A14):
() = 1− ()
0()= 1−
1
P=1 1 (
)
Φ0() (A28)
where () is the estimated probability that the beta -statistic falls in the interval
and 1 () is an indicator function equal to one if falls in .
If factor is useless, the true betas are all equal to zero and we obtain 0 :
(()) = ∗() = 0 Similar to we can write the distribution of the estimated
proportion () under 0 as
√()
→ 1
20 +
1
2+
µ0
2()
Φ0()2
¶ (A29)
where 0 is a point-mass at zero, + is a positive-truncated normal distribution, and
2() follows the same expression as in Equation (A17) except that we use the ordered
beta -statistics To test this hypothesis at the size level we determine whether
() is sufficiently far away from zero using the following threshold:
() 1√
()
Φ0() (A30)
where () is the consistent estimator of (
) and is the quantile of the standard
normal distribution at (1-)
IV Monte Carlo Analysis
A Setting
We conduct a Monte Carlo analysis to evaluate the finite-sample properties of the pro-
portion estimators for two misspecified models and . We extend the illustrative
example presented in the paper on several important dimensions to closely replicate the
salient features of the data. First, we match the total number of micro portfolios across
the three size groups (before imposing any filters on the data). Specifically, we construct
a set of 2,349 tiny-cap portfolios, 938 small-cap portfolios, and 1,302 big-cap portfolios
based on the empirical characteristics of the individual stocks in each size group.
Second, we account for the unbalanced nature of the panel of portfolio returns. To
guarantee the same unbalanced structure as in the data, we apply the empirical ×
matrix of indicators 1 ( = 1 and = 1 ) to each simulated panel of portfolio
returns, where denotes the total sample size equal to 606 monthly observations and
is equal to 4,589 micro portfolios.
Third, we jointly match the average proportion of mispriced portfolios across the
proposed models examined in the empirical section by adding a size premium to the
average excess return of each individual stock ( = 1 with =):
= + + + (A31)
where is the premium of the market return and denote the premia of the
two additional risk factors and
Model includes the market and factor which implies that the vector of ex-
planatory variables is defined as = [1 ]0 The term is the estimated
component of the omitted factor that is orthogonal to 0 = [1 ]0 i.e.,
= − 00 where is the vector of estimated coefficients from a time-series
regression of on 0 over the entire sample period. Model includes the market
and factor which implies that = [1 ]0 where = − 0
0,
0 = [1 ]0 and is the vector of estimated coefficients from a time-series re-
gression of the omitted factor on 0 over the entire sample period. We assume that
and the residual term are all independent and normally distributed as
( 2), (
2) (
2), and (0 2) respectively. We further assume that
are randomly drawn from the normal distribution (() ()),
(() ()), (() ()), (() ()).
To calibrate the model, we use monthly data on individual stocks and the Fama-
French three factors (market, size, value) over the entire sample period. The calibration
of the distribution parameters for the betas and the residual term is done separately for
each size group. To attribute each individual stock to a specific size group, we form,
each year, the three size groups by taking as breakpoints the 20th and 50th percentiles
of the market capitalization for the NYSE stocks (similar to Fama and French (2008)).
We then classify each stock based on the frequencies at which it falls in the three groups.
For each size group, we set () and () equal to the median and variance of the
estimated size betas, and () and () equal to the median and variance of the
estimated market betas. We further set () () and () () equal to
the median and variance of the estimated value betas. Finally, is set equal to the
cross-sectional average of the estimated residual volatility.
We set the size premium equal to 0.5% per month so as to approximate the median value for the
proportions of mispriced portfolios (around 45%). We set the mean and volatility of the market
factor equal to the average return and volatility of the CRSP value-weighted index (0.5% and 4.4% per
month). For the volatilities of the two additional risk factors, we split
the volatility of the value factor in two (2.8% per month). To determine the values for
their premia, we choose two scenarios to capture the minimum and maximum proportion
differences observed in the data. Under the first scenario, the two premia are set equal to
1.0% and 0.0% per month so as to produce a large proportion difference between the
two models. Under the second scenario, the two premia are both equal to 0.5% per month,
which implies that both models yield the same moderate performance.
B Simulation Procedure
For each scenario, we compute the estimated proportions of mispriced portfolios over
1,000 iterations and five sets of values for the stock betas ( = 5 000). For each iteration
( = 1 ) we first construct a -vector of monthly return observations for each
stock ( = 1 with =) :
() = + () + () + () + () (A32)
where () () () and () are drawn from their respective distributions.
Second, we form the cross-section of micro portfolios using the average stock return as
the sorting variable and apply the portfolio formation described above.5 The resulting
cross-section consists of micro portfolios, each containing 10 stocks ( = 10)–stock
5We assume that the book equity of each firm is proportional to its future expected cash flows. In
this case, the average return can be directly inferred from the observable book-to-market (bm) of each
firm, i.e., is proportional to (see Berk (2000))
and nine additional stocks with the nearest average return to stock We keep track
of the identity of the stocks included in each micro portfolio via a × matrix
whose th row has zeros everywhere except for the stocks included in the portfolio.
Third, we construct the monthly return of each micro portfolio from the -vector of
stock returns () = [1 ]0 :
() = 11
(()) (A33)
where 1 takes the value of one if the return is observed in the data (and zero otherwise).
Fourth, we compute the vector of -statistics for all portfolios using the extended two-
pass regression described above and estimate the proportions of mispriced portfolios for
the two models and its difference,
() = 1− ()()
Φ0()
() = 1− ()()
Φ0()
∆() = ()− () (A34)
as well as the estimated variances of these estimators using Equations (A20) and (A22),
2() =2()
Φ0()2
2() =2()
Φ0()2
2∆() =2() + 2()− 2()
Φ0()2 (A35)
Repeating these four steps for every iteration, we can then compute the average values of the
estimated proportions and their difference as
() = ∗ =1
X=1
()
() = ∗ =1
X=1
()
(∆) = ∆∗ = ∗ − ∗ (A36)
We also compare the true variance of the estimators with the average estimated values:
2 =1
X=1
2()− (∗)2 versus (2) =1
X=1
2()
2 =1
X=1
2()− (∗)2 versus (2) =1
X=1
2()
2∆ =1
X=1
∆2()− (∆∗)2 versus (2∆) =1
X=1
2∆() (A37)
To further measure the accuracy of the variance estimators, we compute the coverage
ratio of the confidence intervals at equal to 90% and 95% as
() =1
X=1
1{(()− ∗) ()}
() =1
X=1
1{(()− ∗) ()}
(∆) =1
X=1
1{(∆()−∆∗) 2∆()} (A38)
where $1\{\cdot\}$ equals one if the condition inside the braces is satisfied (and zero otherwise), and equals the quantile of the standard normal distribution at (1-
2)
C Main Results
In Panel A of Table AI, we examine the properties of the different estimators under the
first scenario where the two models $k$ and $l$ achieve a large difference in performance
(34.5% in the entire population). The true volatilities of the different estimators range
between 3.6% and 9.0% and are typically higher for the two largest size groups, which
contain fewer portfolios. Turning to the properties of the variance estimators, we find
that the average estimated volatility for each model in the entire population is slightly below the true value
(0.5% for model $k$ and 0.3% for model $l$). In contrast, the volatility estimator for the
difference yields an average value that closely matches the true volatility (5.1% versus
5.0%). This last property is maintained across all three size groups. Finally, the coverage
ratios of the two confidence intervals at 90% and 95% are, in most cases, remarkably
accurate. For instance, the coverage ratios for the proportion difference in the entire
population are equal to 90.1% and 95.3%, respectively.
In Panel B, we repeat the analysis for the second scenario where the two models
yield the same moderate performance. Similar to the previous scenario, the volatility
estimators precisely capture the variability of the estimated mispricing proportions for
the entire population (they are identical to the true values for both models). We also
find that the coverage ratios stay close to their theoretical values (88.0% and 93.4%
for the intervals at 90% and 95%, respectively). While the results are similar for the
big-cap group, they are less accurate in the two smallest size groups (tiny- and small-
cap). In both groups, the volatility estimators underestimate the true volatilities by 12%
on average (in relative terms), which implies that the coverage ratios of the confidence
intervals are slightly lower than their theoretical values.
Please insert Table AI here
V Overlapping versus Non-overlapping Portfolios
In this section, we show that the mispricing proportion $\pi$ is estimated more precisely
with overlapping portfolios. For simplicity, we consider a population of $J$ stocks whose
residual terms $e_{j,t}$ ($j = 1, \dots, J$) are homoscedastic and uncorrelated both across stocks
and over time. We also assume that both the number of portfolios and return obser-
vations grow large. The number of overlapping portfolios is equal to $n = J$ and the
number of non-overlapping portfolios $n_{no}$ satisfies $\frac{n}{m} \le n_{no} \le \frac{n}{m} + 1$, where $m$ is the
number of stocks included in each portfolio. We denote by $\sigma^2_{\hat{\pi}}(o)$ the variance of $\hat{\pi}$
obtained with overlapping portfolios and by $\sigma^2_{\hat{\pi}}(no)$ the variance of $\hat{\pi}$ obtained with
non-overlapping portfolios. The two asymptotic variances can be written as
$$n\,\sigma^2_{\hat{\pi}}(o) = \frac{1}{\Phi_0(I)^2}\left(1 + 2\sum_{s=1}^{m-1} \rho^I_s\right)\sigma^2_I, \qquad n\,\sigma^2_{\hat{\pi}}(no) = \frac{1}{\Phi_0(I)^2}\, m\, \sigma^2_I, \tag{A39}$$
where $\sigma^2_I$ denotes the variance of the indicator function $1_I(t_i)$, and $\rho^I_s$ is the correla-
tion between the indicator functions associated with the ordered $t$-statistics $1_I(t_i)$ and
$1_I(t_{i+s})$. From Equation (A39), we infer that the overlapping scheme provides efficiency
gains if
$$\sigma^2_{\hat{\pi}}(o) < \sigma^2_{\hat{\pi}}(no) \iff \left(1 + 2\sum_{s=1}^{m-1} \rho^I_s\right) < m. \tag{A40}$$
To show that the above inequality holds, we proceed in three steps. First, we compare
the variances of the averages of the $t$-statistics for both overlapping and non-overlapping
portfolios. Second, we infer from this comparison that a sufficient condition for Equation
(A40) to hold is that the correlation between the pair $(t_i, t_{i+s})$ is higher than the
correlation between the pair $(1_I(t_i), 1_I(t_{i+s}))$. Third, we verify that this is the case.
We write the $t$-statistic averages for overlapping and non-overlapping portfolios as
$$\bar{t}(o) = \frac{1}{n}\sum_{i=1}^{n} t_i, \qquad \bar{t}(no) = \frac{1}{n_{no}}\sum_{i=1}^{n_{no}} t_i, \tag{A41}$$
and their asymptotic variances as
$$n\,\sigma^2_{\bar{t}}(o) = 1 + 2\sum_{s=1}^{m-1} \rho^t_s, \qquad n\,\sigma^2_{\bar{t}}(no) = m, \tag{A42}$$
where $\rho^t_s$ is the correlation between the pair $(t_i, t_{i+s})$. To determine $\rho^t_s$, we note that
the correlation between the portfolio residuals $e_{i,t}$ and $e_{i+s,t}$ is equal to
$$\rho^e_s = \max\left(1 - \frac{s}{m},\ 0\right), \tag{A43}$$
which means that the correlation progressively drops from one towards zero as the portfolio
distance $s$ approaches $m - 1$. Then, we repeat the analysis for the estimated pricing
errors. Building on Equation (A10) and Proposition 1, we can write
√ ( − ) =
( − )0 +1√
P=1
002−1 =
1√
P=1 where is a scalar equal to
002−1 Therefore, the asymptotic correlation between
√ and
√+ is
=(√
√+)³
(√ )(
√+)
´ 1
2
=2+
2+= = max(1−
0) (A44)
Finally, we use Theorem 8.5 in Efron (2010, ch. 8) to show that asymptotically
= = max(1−
0) (A45)
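Plugging the above expression in Equation (A42), we find that $n\,\sigma^2_{\bar{t}}(o)$ and $n\,\sigma^2_{\bar{t}}(no)$
are identical because
$$1 + 2\sum_{s=1}^{m-1} \rho^t_s = 1 + 2\left(\frac{m-1}{2}\right) = m. \tag{A46}$$
Therefore, a sufficient condition for the inequality in Equation (A40) to hold is that
$$\forall s \in [1, m-1]:\ \rho^I_s \le \rho^t_s \quad \text{and} \quad \exists s \in [1, m-1] \text{ s.t. } \rho^I_s < \rho^t_s. \tag{A47}$$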
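The identity in Equation (A46) can be checked exactly with rational arithmetic. The sketch below assumes the correlation pattern of Equation (A43), under which the correlation between two portfolios that are $s$ positions apart is $\max(1 - s/m, 0)$; the function names are ours.

```python
from fractions import Fraction

def overlap_correlation(s, m):
    """Correlation between two portfolios of m stocks that are s positions apart."""
    return max(Fraction(m - s, m), Fraction(0))

def scaled_variance_overlapping(m):
    """One plus twice the sum of the correlations at distances 1 to m - 1."""
    return 1 + 2 * sum(overlap_correlation(s, m) for s in range(1, m))

# The scaled variance equals m for every portfolio size considered here.
sizes = (1, 5, 10, 15)
values = [scaled_variance_overlapping(m) for m in sizes]
```

The sum of the correlations equals $(m-1)/2$, so averaging the $t$-statistics over overlapping portfolios is exactly as precise as averaging over non-overlapping ones; any efficiency gain for the proportion estimator must therefore come from the indicator correlations being lower.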
To examine the relationship between these two correlations, we denote the bivariate
normal distribution for the -statistics ( +) by
³
+;
´to obtain
=
R
R¡ ;
¢
2 (A48)
where $\mu$ is the $t$-statistic mean.6 The double integral in the numerator of Equation
(A48) does not have a closed-form expression but can be easily solved numerically. The
results show that if $\rho^t_s \in (0, 1)$, we have $\rho^I_s < \rho^t_s$ for all the intervals $I = [-a, a]$ with
$a \in [0.15, 0.65]$ and all the $t$-statistic means $\mu$ for which $P(I) = prob(t_i \in I) > 0$. This implies
that the condition in Equation (A47) holds and that $\sigma^2_{\hat{\pi}}(o) < \sigma^2_{\hat{\pi}}(no)$. To illustrate,
Figure A2 shows $\rho^I_s$ as a function of $\rho^t_s$ for different values of $\mu$ ($\mu$ = 0, 0.5, 1, 1.5) and
$I = [-0.4, 0.4]$. In all cases, we see that (i) $\rho^I_s = \rho^t_s$ when $\rho^t_s$ is equal to zero or one; (ii) the
function is convex. Therefore, $\rho^I_s$ is strictly lower than $\rho^t_s$ when $\rho^t_s \in (0, 1)$.
Please insert Figure A2 here
VI Additional Results
A Changes in the Estimation Procedure
A.1 Different Values for the Interval
In the baseline specification, we set the interval $I$ for estimating the mispricing pro-
portion equal to [-0.4, 0.4]. To examine if our results are sensitive to this choice, we
re-compute the mispricing proportions for each interval in the set $\{I = [-a, a];\ a \in [0.15, 0.20, \dots, 0.65]\}$.
Table AII shows the results for the entire population (Panel A)
6For simplicity, we set the mean of $t_i$ and $t_{i+s}$ equal to the same value. This assumption is motivated
by the fact that the $t$-statistics are spatially ordered and thus likely to have similar means. Allowing for
different means does not change the results.
and the three size groups (Panels B to D). We find that the estimated proportions remain
largely unchanged–for instance, the averages in the entire population range between
53.4% and 56.7%. This stability is consistent with the observations made by Barras,
Scaillet, and Wermers (2010) and Storey (2002).
Please insert Table AII here
A.2 Bootstrap Analysis
In the baseline specification, we rely on asymptotic theory to estimate the mispricing
proportion in Equation (A14). That is, we assume that the $t$-statistics of correctly-
priced portfolios follow a standard normal distribution $N(0, 1)$ in order to replace $P_0(I)$
with $\Phi_0(I)$. We now relax this assumption using the bootstrap approach of Efron (2010,
ch. 2) in which the $t$-statistic of each portfolio is transformed into a statistic called the $z$-
value. This transformation guarantees that the $z$-value of a correctly-priced portfolio is
distributed as a normal $N(0, 1)$. Therefore, we can still use Equation (A14) to compute $\hat{\pi}$
provided that we use $z$-values instead of $t$-statistics.
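To compute the $z$-value of each portfolio $i$ ($i = 1, \dots, n$), we use the following
procedure. First, we draw, for each bootstrap iteration $b$ ($b = 1, \dots, 1{,}000$), random
observations from the original sample of risk factors and residuals to reconstruct the
portfolio returns:
$$r_{i,t}(b) = a^0_{i,k} + \hat{b}_{i,m} f_{m,t}(b) + \hat{b}_{i,k}' f_{k,t}(b) + \hat{b}_{i,o}' f_{o,t}(b) + \hat{e}_{i,t}(b), \tag{A49}$$
where we impose that the portfolio is correctly priced ($\alpha_{i,k} = 0$) by setting
$$a^0_{i,k} = -\hat{b}_{i,k}' \hat{\phi}_k. \tag{A50}$$
Second, we re-estimate the portfolio $t$-statistic by regressing the bootstrapped returns
on the bootstrapped factors, i.e.,
$$t_{i,k}(b) = \frac{\hat{c}_k' \hat{\nu}_{i,k}(b)}{\left(\hat{c}_k' \hat{\Sigma}_{\nu_{i,k}}(b)\, \hat{c}_k\right)^{1/2}}, \tag{A51}$$
where $\hat{\nu}_{i,k}(b) = (\hat{a}_{i,k}(b), \hat{b}_{i,k}(b)')'$ and $\hat{\Sigma}_{\nu_{i,k}}(b)$ denote the bootstrapped coefficient vector and its
covariance matrix. Third, we repeat the first two steps 1,000 times and compute
the bootstrapped cumulative distribution function (cdf) associated with the original
$t$-statistic as
$$\hat{F}_0(t_{i,k}) = \frac{1}{1{,}000}\sum_{b=1}^{1{,}000} 1\{t_{i,k}(b) \le t_{i,k}\}. \tag{A52}$$
Finally, we obtain the $z$-value by inverting the quantile $\hat{F}_0(t_{i,k})$ using the standard
normal cdf $\Phi_0^{-1}$, i.e.,
$$z_{i,k} = \Phi_0^{-1}\left(\hat{F}_0(t_{i,k})\right). \tag{A53}$$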
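The last two steps (bootstrapped cdf and inversion through the standard normal quantile function) can be sketched as follows. The function name is hypothetical, the bootstrapped $t$-statistics are a made-up grid rather than actual draws, and the clipping of the empirical cdf away from zero and one is an assumption of this sketch that keeps the $z$-value finite.

```python
from statistics import NormalDist

def z_value(t_stat, boot_t_stats):
    """Map a t-statistic to a z-value using its bootstrapped null distribution."""
    n_boot = len(boot_t_stats)
    # Empirical cdf of the bootstrapped t-statistics evaluated at the original one.
    cdf = sum(1 for tb in boot_t_stats if tb <= t_stat) / n_boot
    # Clip so that the inverse normal cdf is defined (assumption of this sketch).
    cdf = min(max(cdf, 0.5 / n_boot), 1 - 0.5 / n_boot)
    return NormalDist().inv_cdf(cdf)

# Hypothetical null distribution: 1,000 bootstrapped t-statistics on a grid.
boot = [i / 100 for i in range(-500, 500)]
z_center = z_value(0.0, boot)
z_upper = z_value(10.0, boot)
z_lower = z_value(-10.0, boot)
```

With 1,000 bootstrap iterations the clipping bounds the $z$-values at roughly plus or minus 3.29, so more iterations would be needed to resolve more extreme $t$-statistics.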
The empirical results obtained with the bootstrap procedure are reported in Table
AIII. The estimated proportions of mispriced portfolios remain largely unchanged. This
result implies that the sample size is sufficiently large for the normal distribution to be
a good approximation of the true -statistic distribution.
Please insert Table AIII here
B Changes in the Portfolio Formation Procedure
B.1 Different Portfolio Sizes
In this section, we examine the sensitivity of the results to changes in the portfolio for-
mation procedure. To begin, we decrease the number of stocks in each micro portfolio
from 10 to 5 stocks ($m = 5$). The results in Panel A of Table AIV are qualitatively
similar except that the mispricing proportions are generally lower. With only 5 stocks
in each portfolio, the benefits of diversification are not fully exploited and the detection
of the mispriced portfolios becomes more difficult.
Next, Panel B reports the mispricing proportions for micro portfolios formed with
15 stocks ($m = 15$). Overall, the results remain similar to those documented in Table III.
We also observe that the volatilities of the estimators are slightly higher because micro
portfolios have a higher degree of overlap.
Please insert Table AIV here
B.2 Identical Stock Representation
Next, we tackle the issue of stock representation. While the vast majority of stocks are
selected n times in each formation year, some of them are included more or less often.
Therefore, the baseline portfolio formation procedure could potentially overweight the
importance of some stocks and underweight the importance of others.
To address this issue, we modify the formation procedure to guarantee that each
stock is selected exactly n times. For each formation year y (y = 1, ..., Y), we create
the set of micro portfolios following the procedure described above and count the number
of times n_i that a given stock i (i = 1, ..., N) is included in different portfolios. If n_i < n,
we include stock i in n − n_i additional portfolios with the nearest values to the average
return of stock i. If n_i > n, we exclude stock i from n_i − n randomly selected portfolios.
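The counting step can be sketched in Python as follows. The sketch forms each micro portfolio by local averaging (the stock itself plus its nearest neighbors in estimated average return), counts how often each stock is included, and reports how many inclusions must be added or removed to reach the target. The function names, the tie-breaking rule (lower stock index first), and the toy average returns are assumptions made for illustration only.

```python
from collections import Counter


def form_micro_portfolios(avg_returns, n):
    """For each stock, group it with the n - 1 stocks nearest in average return."""
    portfolios = []
    for i, mu_i in enumerate(avg_returns):
        others = [j for j in range(len(avg_returns)) if j != i]
        # Assumption: ties in distance are broken by the lower stock index.
        others.sort(key=lambda j: (abs(avg_returns[j] - mu_i), j))
        portfolios.append([i] + others[: n - 1])
    return portfolios


def representation_gaps(portfolios, n_stocks, n):
    """Return n minus the inclusion count of each stock.

    A positive gap means the stock must join that many additional portfolios;
    a negative gap means it must be dropped from that many portfolios.
    """
    counts = Counter(stock for members in portfolios for stock in members)
    return [n - counts[i] for i in range(n_stocks)]


# Toy example with five stocks and portfolios of two stocks each.
toy_returns = [0.0, 1.0, 2.0, 3.0, 10.0]
toy_portfolios = form_micro_portfolios(toy_returns, n=2)
toy_gaps = representation_gaps(toy_portfolios, n_stocks=5, n=2)
```

In this toy case the stock at index 1 is over-represented by one inclusion and the stock at index 4 is under-represented by one, so the gaps sum to zero.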
Table AV shows that the estimated mispricing proportions under this alternative
portfolio formation remain largely unchanged. The performance differences documented
in the paper are therefore not driven by variations in representation across stocks.
Please insert Table AV here
B.3 Alternative Set of Characteristics
We re-build the cross-section of micro portfolios using nine different sets of character-
istics for estimating average returns in Equation (A1). The first three specifications
simply use each characteristic in isolation (book-to-market, investment, profitability).
Specifications 4 and 5 keep our initial measures of investment and profitability but re-
place the book-to-market ratio with the earnings-to-price and cash flow-to-price ratios.
Specifications 6 and 7 keep our initial definitions of book-to-market and profitability
but measure investment using infrastructure growth and inventory growth (instead of
growth in total assets). Specifications 8 and 9 keep our initial definitions of book-to-
market and investment but measure profitability using Return On Equity (ROE) and
the Return On Assets (ROA) (instead of operating profitability).
To begin, we examine whether these alternative micro portfolios still produce gains
in power and a reduction in beta correlation. In Panel A of Table AVI, we confirm that
the interquartile spread in average returns, the median return volatility, and the median
number of observations are similar to those reported in Table I. In Panel B, we also
measure the independent variation in betas. For each factor included in the CAPM-
based models, we compute the beta residuals (the components orthogonal to the other
betas) and report the length of the 90%-interval spanned by these residuals. The results
show that the dispersion in betas is similar to that of Table II. The overall evidence
suggests that the alternative cross-sections of micro portfolios contain sufficient pricing
information to discriminate between models.
Next, we estimate the mispricing proportions for the different sets of characteristics.
The results in Table AVII reveal strong similarities with those reported in Table III.
First, the average mispricing proportions (across the nine specifications) obtained with
the standard CAPM remain high: they reach 72.2%, 57.5%, and 42.7% in the three
size groups (74.1%, 60.5%, and 46.4% in the baseline case). Second, the human capital
CAPM maintains its solid performance in all three size groups, e.g., in the tiny- and
big-cap groups, the average mispricing proportions are equal to 31.6% and 12.5% (37.7%
and 14.9% in the baseline case). Third, the conditional CAPM still performs well in
the two largest size groups–the average mispricing proportions are equal to 30.3% and
23.4% (27.3% and 17.9% in the baseline case). Fourth, the liquidity CAPM continues to
price tiny-cap portfolios well as the average mispricing proportion equals 33.7% (35.3%
in the baseline case). Finally, the three characteristic-based models generally dominate
the CAPM-based models and perform equally well except in the tiny-cap group.
Please insert Tables AVI and AVII here
C Traditional Performance Measure
C.1 Testing Procedure
In the paper, we advocate for the use of the mispricing proportion to evaluate models.
An alternative approach is to use a version of the traditional performance measures
proposed by Gagliardini, Ossola, and Scaillet (2016; GOS hereafter) which can be applied
in large cross-sections. This measure is defined as the sum of squared pricing errors,
=1
P=1(
)2, where is the true pricing error of portfolio .
The asset pricing test is based on the statistic c =
where = √(− 1
),
=1
P=1(
)2, is the estimated portfolio alpha, and 2
is the variance of
defined as
2 = 2 lim→∞
⎡⎣ 1
X=1
X=1
2 2
2
³002
−1
−1 2
´2⎤⎦ (A54)
where = 1
P=1 1 = 1
P=1 1 = 1
P=1 11,
= [1 0 ]0 = [
0] = [
0] and 2 is a ( + + 2) × ( + 1) matrix whose th
row ( = 1 3 4 + 1) has one for the th element and zeros everywhere else.
When the numbers of portfolios and return observations grow large ( → ∞), Proposition 6 of GOS shows that, if the model is correctly specified, we have
c =
→ (0 1) (A55)
To estimate the variance term 2 we use the following consistent estimator
2 = 21
X=1
X=1
2 2
2
³002
−1
−1 −1
2
´2
(A56)
where =1
P=1 1
0, =
1
P=1 1
0
= [1 ]0, and is given by
Equation (A9). We also need to impose a sparsity condition on the terms ( =
1 ) in the double sum (see assumption A.4 in GOS). To this end, we use the
estimator = 1(¯¯
¯¯≥ ) where =
1
P=1 11
0 and is the
threshold parameter set equal to 0.067·(log()) 12 .
C.2 Empirical Results
Table AVIII reports the statistic c for the entire portfolio population and the three size
groups. The results closely mirror those reported in Table III. First, the null hypothesis
of correct specification is rejected in all but one case–when the five-factor model is
tested on big-cap portfolios (the p-value is equal to 0.22). Second, the ranking of the
CAPM-based models is the same in each size group. Third, the three characteristics-
based models generally produce lower pricing errors than the CAPM-based models.
Because c is consistently lower for the five-factor model, it is tempting to say that
it dominates the three- and q-factor models. However, we cannot make such claims
without comparison tests, which have not yet been developed for large cross-sections.
Please insert Table AVIII here
D Comparison with Individual Stocks
D.1 Construction of the Sample
In this section, we evaluate the different models using individual stocks as test assets.
Similar to micro portfolios, we classify stocks in three size groups (tiny-, small-, and big-
cap). At the end of June each year, we partition all existing stocks using as breakpoints
the 20th and 50th percentiles of the market capitalization for NYSE stocks. Then, we
classify each stock in one of the three size groups based on the highest frequency of
observations. We also require that each individual stock has a minimum of 60 monthly
return observations to compute its t-statistic. The resulting cross-section includes a
total of 6,651 individual stocks (3,548 tiny-cap, 1,379 small-cap, 1,724 big-cap).
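The classification rule can be sketched in Python as follows. The sketch reads "highest frequency of observations" as the size group in which the stock is most often placed across the June sorts, which is an interpretation rather than a detail confirmed by the text. The function names, the treatment of stocks exactly at a breakpoint, the tie-breaking rule, and the numerical breakpoints are hypothetical.

```python
from collections import Counter


def size_group(market_cap, nyse_p20, nyse_p50):
    """Assign a size group from NYSE market-capitalization breakpoints."""
    # Assumption: a stock exactly at a breakpoint goes to the larger group.
    if market_cap < nyse_p20:
        return "tiny"
    if market_cap < nyse_p50:
        return "small"
    return "big"


def classify_stock(june_groups, n_monthly_returns, min_obs=60):
    """Return the most frequent size group, or None if the history is too short."""
    if n_monthly_returns < min_obs:
        return None
    # Assumption: ties go to the group observed first.
    return Counter(june_groups).most_common(1)[0][0]


# Hypothetical stock observed over four Junes with 72 monthly returns.
groups = [size_group(cap, nyse_p20=100.0, nyse_p50=500.0)
          for cap in (80.0, 120.0, 150.0, 600.0)]
label = classify_stock(groups, n_monthly_returns=72)
```

Here the hypothetical stock is placed in the small-cap group twice and in each of the other groups once, so it is classified as small-cap; with fewer than 60 monthly returns it would be dropped from the sample.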
D.2 Mispricing Proportion
We begin our analysis by examining the proportion of mispriced stocks. Panel A of Table
AIX shows that the estimated proportions are significantly lower than those obtained
with micro portfolios, e.g., the averages in the tiny- and small-cap groups are equal
to 7.5% and 8.9% (versus 55.3% and 49.4% for micro portfolios). Coupled with high
estimation uncertainty, these low estimated proportions mean that we cannot reject
the null hypothesis of correct specification for any of the models. Because the
estimated pricing errors of individual stocks are too volatile, we are unable to
detect mispricing in the data.
D.3 Traditional Performance Measure
Alternatively, we can use the measure proposed by GOS. Because this measure
aggregates pricing errors, it allows us to sidestep the challenge of detecting mispricing
at the individual stock level. As shown in Panel B, the tests based on the measure
indicate that the models are all misspecified in the entire stock population (the p-values
are all equal to zero). This performance analysis is consistent with (i) the previous
results of GOS, which show that commonly used models are strongly rejected at the
individual stock level, and (ii) the performance evaluation obtained with micro portfolios
(see Tables III and AIV).
There is no available testing procedure for comparing models based on the measure
in large cross-sections. However, a casual observation of the estimated values reveals
no striking performance differences across the models. A likely culprit for this result is
the time-variation in individual stock betas as firms go through different business cycles
and stages of development. The empirical evidence shows that this variation is
significantly stronger for stocks than for portfolios (e.g., Andersen et al. (2006), Fama and
French (1997)). The time-variation in betas introduces an additional source of
misspecification (e.g., Jagannathan and Wang (1996)), which affects all time-invariant models. It
can therefore smooth out the performance differences observed with micro portfolios.7
Please insert Table AIX here
7 We could explicitly specify the dynamics of the individual stock betas. However, time-varying models
are difficult to estimate because of the large number of parameters. In addition, Ghysels (1998) shows
that a wrong specification of time-varying betas may result in large pricing errors, possibly greater
than those produced by a constant-beta model. The evidence in GOS reveals that the time-varying
specifications of the tested models are also strongly rejected in the data.
References
[1] Andersen T. G., T. Bollerslev, F. X. Diebold, and G. Wu, 2006, Realized Beta: Per-
sistence and Predictability, Advances in Econometrics: Econometric Analysis of
Economic and Financial Time Series, Elsevier.
[2] Avramov D., and T. Chordia, 2006, Asset Pricing Models and Financial Market
Anomalies, Review of Financial Studies 19, 1001-1038.
[3] Barras L., O. Scaillet, and R. Wermers, 2010, False Discoveries in Mutual Fund
Performance: Measuring Luck in Estimated Alphas, Journal of Finance 65,
179-216.
[4] Berk J., 2000, Sorting Out Sorts, Journal of Finance 55, 407-427.
[5] Brennan M. J., T. Chordia, and A. Subrahmanyam, 1998, Alternative Factor Spec-
ifications, Security Characteristics and the Cross-Section of Expected Stock Re-
turns, Journal of Financial Economics 49, 345-373.
[6] Efron B., 2010, Large-Scale Inference, Cambridge University Press.
[7] Fama E. F., and K. R. French, 1997, Industry Cost of Equity, Journal of Financial
Economics 43, 153-193.
[8] Fama E. F., and K. R. French, 2006, Profitability, Investment, and Average Returns,
Journal of Financial Economics 82, 491-518.
[9] Fama E. F., and K. R. French, 2008, Dissecting Anomalies, Journal of Finance 63,
1653-1678.
[10] Fama E. F., and K. R. French, 2015, A Five-Factor Asset Pricing Model, Journal
of Financial Economics 116, 1-22.
[11] Farcomeni A., 2006, Some Results on the Control of the False Discovery Rate under
Dependence, The Scandinavian Journal of Statistics 34, 275-297.
[12] Gagliardini P., E. Ossola, and O. Scaillet, 2016, Time-varying Risk Premium in
Large Cross-sectional Equity Datasets, Econometrica 84, 985-1046.
[13] Genovese C., and L. Wasserman, 2004, A Stochastic Process Approach to False
Discovery Control, Annals of Statistics 32, 1035-1061.
[14] Ghysels E., 1998, On Stable Factor Structures in the Pricing of Risk: Do Time-Varying
Betas Help or Hurt?, Journal of Finance 53, 549-573.
[15] Hou K., C. Xue, and L. Zhang, 2015, Digesting Anomalies: An Investment Approach,
Review of Financial Studies 28, 650-705.
[16] Jagannathan R., and Z. Wang, 1996, The Conditional CAPM and the Cross-Section
of Expected Returns, Journal of Finance 51, 3-53.
[17] Newey W. K., and K. D. West, 1987, A Simple, Positive Semi-Definite, Het-
eroscedasticity and Autocorrelation Consistent Covariance Matrix, Economet-
rica 55, 703-708.
[18] Storey J. D., 2002, A Direct Approach to False Discovery Rates, Journal of the
Royal Statistical Society 64, 479-498.
[19] White H., 1980, A Heteroskedasticity-Consistent Covariance Matrix Estimator and
a Direct Test for Heteroskedasticity, Econometrica 48, 817-838.
Table AI
Monte Carlo Analysis
Panel A reports the properties of the proportion estimators under the first scenario where there
is a large performance difference between the two misspecified models a and b. For the entire
population and each size group (tiny-, small-, and big-cap), the first column shows the average
values of the estimated proportions of mispriced portfolios for both models and their difference.
The second and third columns compare the true volatilities of the estimated proportions and their
difference with the estimated volatilities. The fourth and fifth columns show the coverage ratios
of the confidence intervals at 90% and 95% for the estimated proportions and their difference.
In Panel B, we repeat the analysis under the second scenario where the two models produce the
same moderate performance. The total number of iterations is equal to 5,000.
Panel A: Large Performance Difference
Volatility Confidence Interval
Mean True Estimated 90%-coverage 95%-coverage
All Portfolios
Model a 30.6 3.6 3.1 91.3 94.5
Model b 65.1 4.3 4.0 92.5 95.8
Difference -34.5 5.0 5.1 90.1 95.3
Tiny-Cap Portfolios
Model a 32.4 4.8 4.6 93.2 96.1
Model b 62.5 6.7 5.7 89.8 93.5
Difference -30.1 8.0 7.7 87.9 93.7
Small-Cap Portfolios
Model a 32.4 7.4 6.6 92.1 95.3
Model b 63.1 5.6 5.9 94.0 96.3
Difference -30.6 9.0 9.1 90.3 94.8
Big-Cap Portfolios
Model a 15.4 6.2 5.6 92.9 95.8
Model b 68.3 4.6 5.3 95.4 97.7
Difference -52.9 7.3 7.6 91.0 95.5
Panel B: No Performance Difference
Volatility Confidence Interval
Mean True Estimated 90%-coverage 95%-coverage
All Portfolios
Model a 40.0 2.9 2.9 94.1 96.9
Model b 40.0 2.9 2.9 94.1 96.7
Difference 0.0 4.0 3.8 88.0 93.4
Tiny-Cap Portfolios
Model a 35.9 4.4 3.9 92.0 93.5
Model b 35.7 4.4 3.8 90.1 95.3
Difference 0.2 5.9 5.3 85.2 91.3
Small-Cap Portfolios
Model a 33.3 6.5 5.8 90.0 93.3
Model b 33.0 6.3 5.8 92.7 95.3
Difference 0.3 9.5 8.2 84.4 90.3
Big-Cap Portfolios
Model a 34.9 5.3 5.2 93.1 96.1
Model b 35.0 5.1 5.3 94.7 97.2
Difference -0.1 7.4 7.2 88.7 93.6
Table AII
Performance Analysis with Different Intervals
Panel A reports, for the entire population, the estimated proportions of micro portfolios that
are mispriced by the standard CAPM, the CAPM-based models (conditional, human capital,
intertemporal, and liquidity CAPMs), and the characteristic-based models (three-factor, q-factor,
and five-factor models) across a set of intervals that are symmetric around zero and whose
bound takes the values 0.15, 0.20, ..., 0.65. In Panels B to D, we repeat the analysis for the
three size groups (tiny-, small- and big-cap).
Panel A: All Portfolios
Interval Bound
0.15 0.20 0.25 0.3 0.35 0.40 0.45 0.50 0.55 0.60 0.65
CAPM 64.5 63.8 62.5 63.5 63.6 64.3 63.6 63.3 62.7 62.2 61.5
Conditional 41.2 42.5 42.1 42.5 42.9 43.4 43.7 42.5 41.0 41.0 40.8
Human Capital 36.8 36.1 37.3 36.9 36.2 35.1 35.6 35.8 35.6 35.5 34.4
Intertemporal 61.8 60.6 60.9 60.0 59.8 60.2 59.2 58.7 58.6 58.3 57.7
Liquidity 36.8 37.3 35.0 34.7 34.0 34.4 34.0 33.4 33.8 33.6 32.9
Average 48.2 48.1 47.6 47.5 47.2 47.4 47.2 46.7 46.4 46.1 45.4
3-factor Model 34.8 36.8 38.0 38.6 37.5 36.4 36.2 36.4 35.9 35.3 35.0
q-factor Model 48.9 45.0 45.1 45.8 45.0 45.3 43.3 41.8 41.8 41.8 41.4
5-factor Model 32.9 30.7 29.6 30.8 30.4 30.3 29.1 29.2 29.0 29.7 29.2
Average 38.8 37.5 37.5 38.3 37.6 37.3 36.2 35.8 35.5 35.5 35.2
Panel B: Tiny-Cap Portfolios
Interval Bound
0.15 0.20 0.25 0.3 0.35 0.40 0.45 0.50 0.55 0.60 0.65
CAPM 76.1 74.9 73.9 74.2 73.6 74.1 74.1 73.7 73.4 73.1 72.1
Conditional 58.7 63.2 63.2 61.3 62.6 61.5 60.9 60.5 60.1 58.9 58.3
Human Capital 37.3 37.8 38.0 38.8 39.2 37.7 39.4 39.0 38.3 38.5 36.4
Intertemporal 72.4 70.5 70.2 70.1 69.0 68.0 67.6 67.7 67.0 65.6 64.9
Liquidity 37.2 37.2 38.2 37.7 36.0 35.3 35.2 36.1 30.0 35.4 35.4
Average 56.3 56.7 56.7 56.4 56.1 55.3 55.5 55.4 55.0 54.3 53.4
3-factor Model 55.8 58.0 58.0 55.8 54.5 53.5 51.7 50.9 49.9 48.5 47.9
q-factor Model 72.8 71.1 69.8 69.4 68.6 68.7 67.4 66.1 65.8 66.0 66.1
5-factor Model 47.1 46.5 45.7 47.0 45.3 43.5 43.1 44.0 43.2 44.0 43.0
Average 58.5 58.5 57.8 57.4 56.1 55.3 54.0 53.6 53.0 52.8 52.3
Panel C: Small-Cap Portfolios
Interval Bound
0.15 0.20 0.25 0.3 0.35 0.40 0.45 0.50 0.55 0.60 0.65
CAPM 63.3 61.0 60.5 60.2 59.4 60.5 59.1 58.8 57.1 57.7 58.6
Conditional CAPM 32.0 30.7 31.4 30.8 29.1 27.3 26.0 27.3 27.3 28.7 28.0
Human Capital CAPM 57.0 51.5 53.0 51.1 47.0 45.1 47.1 45.9 44.3 44.5 45.6
Intertemporal CAPM 62.4 67.7 70.3 68.8 68.0 67.0 65.9 64.6 63.2 62.7 63.0
Liquidity CAPM 57.0 54.8 52.9 49.3 46.5 47.1 46.2 44.5 44.0 45.6 43.5
Average 54.3 53.1 53.6 52.0 50.0 49.4 48.8 48.2 47.1 47.8 47.7
3-factor Model 23.0 19.9 18.4 19.9 19.7 18.3 20.1 22.0 21.3 23.2 23.5
q-factor Model 30.2 22.7 23.9 24.1 21.3 22.8 22.0 20.9 20.6 19.2 19.0
5-factor Model 12.3 15.8 14.6 18.5 22.0 23.4 21.6 19.5 19.8 20.6 20.0
Average 24.8 19.5 18.9 20.8 21.0 21.5 21.3 20.8 20.6 20.0 20.8
Panel D: Big-Cap Portfolios
Interval Bound
0.15 0.20 0.25 0.3 0.35 0.40 0.45 0.50 0.55 0.60 0.65
CAPM 40.7 42.4 39.6 43.5 45.6 46.4 44.8 45.0 44.7 42.7 41.4
Conditional CAPM 12.5 16.6 20.3 18.3 16.6 17.9 18.5 17.6 17.7 18.1 17.5
Human Capital CAPM 2.3 7.2 9.0 11.6 13.2 14.9 11.8 13.0 12.4 12.4 10.8
Intertemporal CAPM 62.5 63.6 60.9 57.0 56.4 57.2 55.5 55.0 53.4 51.5 51.2
Liquidity CAPM 29.7 28.9 27.8 31.2 30.9 30.5 28.9 28.2 27.1 26.8 26.9
Average 29.1 31.7 31.5 32.3 32.5 33.3 31.9 31.8 31.1 30.3 29.6
3-factor Model 8.1 6.0 12.3 17.9 16.6 15.2 16.9 17.8 18.6 17.3 17.3
q-factor Model 5.5 8.4 10.4 13.9 14.9 14.6 10.2 7.6 8.4 9.3 7.9
5-factor Model 14.5 9.5 8.0 6.4 5.8 7.7 5.6 5.9 6.6 7.0 7.3
Average 8.7 8.0 10.2 12.7 12.4 12.5 10.9 10.4 11.2 11.1 10.8
Table AIII
Performance Analysis with the Bootstrap
This table reports the estimated proportions of micro portfolios that are mispriced by the stan-
dard CAPM, the CAPM-based models (conditional, human capital, intertemporal, and liquidity
CAPMs), and the characteristic-based models (three-factor, q-factor, and five-factor models) for
the entire portfolio population (All) and the three size groups (tiny-, small-, and big-cap) using
a bootstrap approach. Figures in parentheses denote the estimated volatilities of the proportion
estimates.
Size Groups
All Tiny-cap Small-cap Big-cap
CAPM 62.9 (2.3) 72.8 (5.7) 57.7 (5.0) 43.3 (4.4)
Conditional CAPM 41.3 (1.9) 60.0 (4.8) 26.2 (4.8) 14.3 (5.7)
Human Capital CAPM 34.8 (3.2) 36.2 (5.7) 45.1 (5.9) 8.9 (5.6)
Intertemporal CAPM 58.4 (2.0) 67.8 (4.2) 66.3 (5.1) 54.2 (4.2)
Liquidity CAPM 32.2 (2.4) 33.6 (6.0) 45.7 (5.4) 28.4 (5.4)
Average 45.9 54.1 48.2 29.8
3-factor Model 35.4 (2.8) 52.1 (5.7) 16.6 (6.6) 14.0 (5.4)
q-factor Model 42.6 (3.0) 67.4 (6.1) 22.4 (4.5) 11.3 (5.2)
5-factor Model 27.4 (2.4) 43.7 (5.5) 19.3 (5.5) 4.7 (5.3)
Average 35.1 54.4 19.4 10.0
Table AIV
Performance Analysis with Different Portfolio Sizes
Panel A reports the estimated proportions of micro portfolios that are mispriced by the stan-
dard CAPM, the CAPM-based models (conditional, human capital, intertemporal, and liquidity
CAPMs), and the characteristic-based models (three-factor, q-factor, and five-factor models)
for the entire portfolio population (All) and the three size groups (tiny-, small-, and big-cap)
using five stocks (n=5). Figures in parentheses denote the estimated volatilities of the propor-
tion estimates. In Panel B, we repeat the analysis using micro portfolios made up of 15 stocks
(n=15).
Panel A: Five Stocks
Size Groups
All Tiny-cap Small-cap Big-cap
CAPM 48.3 (2.2) 58.8 (5.6) 41.6 (5.2) 31.7 (4.7)
Conditional CAPM 37.2 (2.1) 51.4 (5.3) 26.2 (5.2) 19.1 (5.6)
Human Capital CAPM 26.0 (3.1) 28.5 (4.6) 31.6 (6.1) 15.2 (5.2)
Intertemporal CAPM 46.6 (2.4) 54.2 (4.4) 50.5 (4.6) 39.4 (4.3)
Liquidity CAPM 36.1 (2.3) 42.8 (4.2) 39.6 (4.9) 19.7 (5.0)
Average 38.9 47.1 37.9 25.0
3-factor Model 28.3 (2.5) 34.9 (4.2) 26.8 (4.8) 15.5 (5.4)
q-factor Model 32.4 (2.4) 51.3 (5.2) 13.2 (4.8) 8.3 (5.0)
5-factor Model 24.3 (2.2) 36.7 (3.8) 24.8 (4.8) 0.0 (4.9)
Average 28.4 41.0 21.6 8.0
Panel B: Fifteen Stocks
Size Groups
All Tiny-cap Small-cap Big-cap
CAPM 71.2 (2.5) 78.3 (5.7) 70.8 (4.6) 56.3 (5.1)
Conditional CAPM 47.2 (2.5) 69.7 (5.2) 34.8 (5.4) 18.9 (5.5)
Human Capital CAPM 37.1 (3.8) 41.8 (5.1) 43.3 (6.1) 23.0 (6.1)
Intertemporal CAPM 67.1 (2.0) 75.9 (3.6) 75.9 (4.6) 59.5 (4.7)
Liquidity CAPM 34.0 (2.7) 32.0 (5.5) 57.7 (5.7) 22.7 (6.2)
Average 51.3 59.6 56.5 36.1
3-factor Model 44.0 (2.9) 63.6 (6.0) 24.8 (7.2) 18.2 (6.0)
q-factor Model 51.3 (3.0) 73.2 (6.1) 23.9 (5.0) 10.7 (4.5)
5-factor Model 34.4 (2.2) 55.5 (6.7) 17.5 (7.1) 3.8 (5.0)
Average 41.8 64.1 22.1 10.9
Table AV
Performance Analysis with Equal Stock Representation
This table reports the estimated proportions of micro portfolios that are mispriced by the stan-
dard CAPM, the CAPM-based models (conditional, human capital, intertemporal, and liquidity
CAPMs), and the characteristic-based models (three-factor, q-factor, and five-factor models) for
the entire portfolio population (All) and the three size groups (tiny-, small-, and big-cap) using
a portfolio formation procedure in which all stocks are selected the same number of times in
each formation year. Figures in parentheses denote the estimated volatilities of the proportion
estimates.
Size Groups
All Tiny-cap Small-cap Big-cap
CAPM 62.6 (2.2) 71.6 (5.5) 60.5 (4.5) 44.8 (5.4)
Conditional CAPM 46.2 (1.7) 65.2 (4.7) 33.1 (4.7) 13.7 (5.9)
Human Capital CAPM 34.7 (3.5) 35.8 (5.8) 49.2 (5.4) 13.1 (5.4)
Intertemporal CAPM 57.9 (2.3) 65.1 (5.0) 62.3 (5.4) 55.1 (4.2)
Liquidity CAPM 35.4 (2.5) 41.5 (5.7) 48.8 (5.3) 29.6 (5.0)
Average 47.5 55.8 50.9 31.2
3-factor Model 35.2 (2.5) 45.7 (6.5) 26.5 (6.6) 20.0 (5.7)
q-factor Model 47.4 (2.7) 68.6 (5.7) 24.5 (4.7) 14.0 (4.8)
5-factor Model 27.6 (2.9) 43.9 (6.9) 14.5 (5.6) 3.8 (4.4)
Average 36.7 52.7 21.8 12.6
Table AVI
Properties of Micro Portfolios with Different Sets of Characteristics
Panel A reports the interquartile spread in average returns (annualized), the median return
volatility (annualized), and the median number of return observations across micro portfolios
for the following sets of characteristics: (i) three cases where book-to-market (bm), investment,
and profitability are used in isolation (bm, inv, prof); (ii) two cases where bm is replaced with
the earnings/price and cash flows/price ratios (bm1, bm2); (iii) two cases where investment is
measured with infrastructure and inventory growth (inv1, inv2); (iv) two cases where profitability
is measured with the return on equity and the return on assets (prof1, prof2); (v) the baseline
case used in the paper (Base). Panel B measures the dispersion in the portfolio betas on each
risk factor specific to the CAPM-based models (conditional, human capital, intertemporal, and
liquidity CAPMs) for the different sets of characteristics. Each column measures the length of
the 90%-interval of the beta residuals (the components orthogonal to the other factor betas).
Panel A: Portfolio Returns
Set of Characteristics
bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base
Median Obs. 384 390 384 366 378 384 402 372 372 381 390
Return Spread 7.5 7.2 6.3 6.5 6.6 7.4 6.6 7.1 7.4 7.0 7.3
Median Vol. 29.7 29.7 29.2 26.5 27.0 29.6 29.7 27.9 27.9 28.6 30.2
Panel B: Independent Variation in Betas
Set of Characteristics
bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base
Conditional 1.91 1.90 1.80 1.35 1.40 1.80 1.60 2.28 2.21 1.81 1.75
Human Capital 1.35 1.44 1.49 1.10 1.17 1.47 1.28 1.66 1.67 1.40 1.41
Intertemporal 2.25 2.56 2.29 1.27 1.32 2.31 1.67 3.23 3.15 2.22 2.56
Liquidity 1.25 1.25 1.32 1.09 1.1 1.24 1.14 1.40 1.39 1.24 1.26
Average 1.70 1.79 1.73 1.21 1.25 1.71 1.43 2.1 2.1 1.67 1.75
Table AVII
Performance Analysis with Different Sets of Characteristics
Panel A reports, for the entire population, the estimated proportions of micro portfolios that
are mispriced by the standard CAPM, the CAPM-based models (conditional, human capital, in-
tertemporal, and liquidity CAPMs), and the characteristic-based models (three-factor, q-factor,
and five-factor models) for different sets of characteristics. The specifications are the following:
(i) three cases where book-to-market (bm), investment, and profitability are used in isolation
(bm, inv, prof); (ii) two cases where bm is replaced with the earnings/price and cash flows/price
ratios (bm1, bm2); (iii) two cases where investment is measured with infrastructure and inven-
tory growth (inv1, inv2); (iv) two cases where profitability is measured with the return on equity
and the return on assets (prof1, prof2); (v) the baseline case used in the paper (Base). In Panels
B to D, we repeat the analysis for the three size groups (tiny-, small- and big-cap).
Panel A: All Portfolios
Set of Characteristics
bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base
CAPM 52.9 62.3 52.8 63.6 66.6 62.4 59.8 66.5 66.0 61.4 64.3
Conditional 31.9 47.3 31.5 37.7 40.2 50.3 39.1 60.8 60.1 44.3 43.4
Human Capital 29.7 27.0 15.4 38.1 37.5 39.7 28.6 43.7 33.1 31.4 35.1
Intertemporal 38.6 57.9 35.0 46.9 47.7 55.5 53.6 56.8 56.1 49.8 60.2
Liquidity 21.2 36.0 22.6 29.9 18.9 32.4 24.1 48.6 52.4 31.8 34.4
Average 34.8 46.1 31.5 43.2 42.2 48.0 41.1 55.3 53.5 44.0 47.5
3-factor Model 26.2 35.7 15.1 38.2 33.8 36.6 33.9 34.2 39.8 32.6 36.4
q-factor Model 38.9 43.7 44.9 36.3 34.4 44.3 39.3 40.1 43.5 40.6 45.3
5-factor Model 27.1 33.7 18.8 26.5 26.4 31.7 31.3 27.2 29.8 28.1 30.3
Average 30.8 37.7 26.3 33.7 31.6 37.6 34.9 33.9 37.8 33.8 37.3
Panel B: Tiny-Cap Portfolios
Set of Characteristics
bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base
CAPM 63.9 73.5 69.5 74.7 77.6 73.7 73.7 70.9 72.2 72.2 74.1
Conditional 42.4 70.0 47.1 46.8 39.2 64.5 46.9 71.6 70.8 55.5 61.5
Human Capital 45.2 26.6 14.2 48.2 45.3 36.6 22.4 40.7 28.6 31.6 37.7
Intertemporal 44.0 63.1 33.1 37.4 32.7 54.5 53.4 57.8 58.7 48.3 68.0
Liquidity 24.1 35.8 29.7 30.3 16.9 34.6 19.8 57.0 55.1 33.7 35.3
Average 43.9 53.8 38.7 47.5 42.3 52.8 43.2 59.7 57.1 48.8 55.3
3-factor Model 38.9 52.4 20.5 52.6 50.4 49.1 40.5 48.6 49.2 44.7 53.5
q-factor Model 63.5 67.8 69.6 61.4 66.8 73.5 64.3 67.5 65.8 66.7 68.7
5-factor Model 39.5 49.1 28.3 43.1 41.7 47.1 37.4 43.5 44.8 41.6 43.5
Average 47.3 56.5 39.5 52.4 53.0 56.6 47.4 53.2 53.3 51.0 55.2
Panel C: Small-Cap Portfolios
Set of Characteristics
bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base
CAPM 50.9 50.0 40.6 62.2 62.0 64.3 49.2 69.8 68.1 57.5 60.5
Conditional 21.8 23.2 13.6 27.0 32.8 48.4 33.2 40.2 32.8 30.3 27.3
Human Capital 22.1 27.6 40.0 18.1 40.8 59.8 52.1 68.1 54.1 42.5 45.1
Intertemporal 50.6 55.1 50.0 47.4 54.8 61.9 54.2 69.4 66.7 56.6 67.0
Liquidity 32.8 42.7 19.4 49.9 36.4 54.9 37.3 55.9 61.1 43.4 47.1
Average 35.6 39.7 32.7 40.9 45.4 57.8 45.2 60.7 56.5 46.1 49.4
3-factor Model 17.0 18.0 7.8 32.1 14.4 27.3 32.8 25.8 30.8 22.9 18.3
q-factor Model 9.8 22.8 13.6 20.6 16.9 9.3 13.4 17.2 21.1 14.3 22.8
5-factor Model 18.0 22.1 5.3 20.6 12.9 17.2 26.2 14.1 17.5 17.1 23.4
Average 15.0 21.0 8.9 24.5 9.7 17.9 24.1 19.1 23.1 18.1 21.5
Panel D: Big-Cap Portfolios
Set of Characteristics
bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base
CAPM 30.8 49.1 27.2 44.7 50.3 36.3 38.5 55.0 52.1 42.7 46.4
Conditional 10.0 20.4 8.4 29.0 33.4 20.5 21.2 32.0 35.7 23.4 17.9
Human Capital 5.0 10.8 10.0 14.0 10.2 16.9 17.8 18.0 10.3 12.5 14.9
Intertemporal 32.2 61.3 27.3 44.7 53.3 47.5 48.2 57.7 59.4 48.0 57.2
Liquidity 28.3 33.9 11.1 41.6 28.8 23.9 19.0 27.0 45.7 28.9 30.5
Average 21.3 35.1 16.8 34.8 35.2 29.0 29.0 38.0 40.7 31.1 33.3
3-factor Model 15.7 15.3 10.0 17.5 19.5 18.1 20.1 13.0 29.3 17.6 15.3
q-factor Model 10.9 10.2 18.6 4.2 2.6 11.8 7.0 5.6 18.8 9.9 14.6
5-factor Model 7.9 11.1 9.9 13.9 9.5 11.2 22.4 6.2 11.1 10.0 7.7
Average 11.5 12.2 12.8 7.7 10.5 13.7 16.5 8.3 19.8 12.6 12.5
Table AVIII
Performance Analysis with the Traditional Measure
This table reports the traditional performance measure proposed by Gagliardini, Ossola, and
Scaillet (2016) under the standard CAPM, the CAPM-based models (conditional, human cap-
ital, intertemporal, and liquidity CAPMs), and the characteristic-based models (three-factor,
q-factor, and five-factor models) for the entire portfolio population (All) and the three size
groups (tiny-, small-, and big-cap). The performance measure is defined as the sum of squared
pricing errors and provides a valid inference in large cross-sections. Figures in parentheses denote
the p-values under the null hypothesis that the model is correctly specified.
Size Groups
All Tiny-cap Small-cap Big-cap
CAPM 146.0 (0.0) 131.8 (0.0) 62.6 (0.0) 40.3 (0.0)
Conditional CAPM 85.8 (0.0) 84.6 (0.0) 30.7 (0.0) 19.8 (0.0)
Human Capital CAPM 71.1 (0.0) 47.4 (0.0) 54.4 (0.0) 14.7 (0.0)
Intertemporal CAPM 133.5 (0.0) 100.7 (0.0) 68.7 (0.0) 43.3 (0.0)
Liquidity CAPM 76.3 (0.0) 59.8 (0.0) 51.4 (0.0) 25.7 (0.0)
Average 102.6 84.9 53.6 28.8
3-factor Model 56.2 (0.0) 48.1 (0.0) 29.8 (0.0) 14.0 (0.0)
q-factor Model 72.7 (0.0) 88.6 (0.0) 9.2 (0.0) 5.4 (0.0)
5-factor Model 5.5 (0.0) 6.6 (0.0) 3.5 (0.0) 0.7 (0.23)
Average 44.8 47.8 14.1 6.7
Table AIX
Performance Analysis with Individual Stocks
Panel A reports the estimated proportions of individual stocks that are mispriced by the stan-
dard CAPM, the CAPM-based models (conditional, human capital, intertemporal, and liquidity
CAPMs), and the characteristic-based models (three-factor, q-factor, and five-factor models)
for the entire stock population (All) and the three size groups (tiny-, small-, and big-cap).
Figures in parentheses denote the estimated volatilities of the proportion estimates. Panel B
conducts the same analysis using the traditional performance measure proposed by Gagliardini,
Ossola, and Scaillet (2016). Figures in parentheses denote the p-values under the null hypothesis
that the model is correctly specified.
Panel A: Proportion of Mispriced Stocks
Size Groups
All Tiny-cap Small-cap Big-cap
CAPM 14.7 (15.5) 8.5 (21.7) 8.3 (17.8) 32.8 (14.3)
Conditional CAPM 15.5 (15.4) 9.1 (21.4) 9.0 (17.5) 32.3 (14.3)
Human Capital CAPM 12.8 (15.7) 5.2 (21.9) 10.3 (17.6) 29.6 (14.6)
Intertemporal CAPM 14.1 (15.6) 7.6 (21.8) 7.8 (17.8) 32.0 (14.4)
Liquidity CAPM 14.4 (15.6) 7.2 (21.8) 9.3 (17.8) 34.0 (14.2)
Average 14.3 7.5 9.0 32.2
3-factor Model 5.7 (16.0) 3.2 (21.9) 0.0 (18.4) 20.7 (14.2)
q-factor Model 14.3 (16.0) 3.8 (22.0) 0.0 (18.1) 13.1 (15.4)
5-factor Model 7.2 (16.0) 3.7 (21.9) 2.8 (18.1) 18.0 (15.1)
Average 6.1 3.6 1.0 17.2
Panel B: Traditional Performance Measure
Size Groups
All Tiny-cap Small-cap Big-cap
CAPM 15.1 (0.0) 3.7 (0.0) 4.5 (0.0) 20.3 (0.0)
Conditional CAPM 15.0 (0.0) 3.8 (0.0) 4.3 (0.0) 20.4 (0.0)
Human Capital CAPM 14.4 (0.0) 3.5 (0.0) 4.0 (0.0) 18.0 (0.0)
Intertemporal CAPM 15.4 (0.0) 3.7 (0.0) 4.8 (0.0) 21.1 (0.0)
Liquidity CAPM 15.2 (0.0) 3.6 (0.0) 4.8 (0.0) 21.2 (0.0)
Average 15.0 3.7 4.5 20.2
3-factor Model 12.0 (0.0) 3.9 (0.00) 0.5 (0.30) 17.4 (0.0)
q-factor Model 9.2 (0.0) 3.7 (0.0) 0.4 (0.34) 12.5 (0.0)
5-factor Model 6.0 (0.0) 3.6 (0.00) 1.4 (0.08) 8.2 (0.23)
Average 9.1 3.8 0.8 12.7
Figure A1
Forming the Cross-Section of Micro Portfolios
This figure illustrates the procedure for forming the set of micro portfolios sorted based on
average returns using a hypothetical population of 50 individual stocks and a two-year sample
period. Each dot represents the estimated average return taken by each stock (S1, S2,...). For
each stock, the procedure consists of forming an equally-weighted portfolio that includes the
stock itself and 9 additional stocks with the nearest estimated average returns. Then, the
portfolio returns are chained across years 1 and 2 to maintain a stable average return over time. This
procedure yields a cross-section of 50 micro portfolios ranging from P(Return 1) to P(Return 50).
[Figure A1: individual stocks (S1, S2, ...) ordered from low to high estimated average return in year 1 and in year 2 (Return 1 to Return 50); portfolio returns are chained across the two years to form P(Return 1), ..., P(Return 25), ..., P(Return 50).]
Figure A2
Overlapping versus Non-overlapping Portfolios
The four panels plot the pairwise correlation between the indicator functions 1(tkj ) and 1(tkj+d)
and the pairwise correlation between the t-statistics tkj and tkj+d for different mean values for the
average t-statistic (μts=0, 0.5, 1, 1.5). The interval I is set equal to [-0.4,0.4]. If the correlation
between indicator functions is lower than the correlation between t-statistics, using overlapping
instead of non-overlapping portfolios brings efficiency gains (i.e., it reduces the variance of the
estimated mispricing proportion).
[Figure A2: four panels, (a) t-statistic mean of 0, (b) t-statistic mean of 0.5, (c) t-statistic mean of 1, (d) t-statistic mean of 1.5. Horizontal axis: correlation between t-statistics (0 to 1). Vertical axis: correlation between indicators (0 to 1).]