
Appendix

A Large-Scale Approach

for Evaluating Asset Pricing Models

Laurent Barras∗

This version: March 24, 2018

∗Desautels Faculty of Management, McGill University, Montreal. E-mail: [email protected].

I The Cross-Section of Micro Portfolios

A Portfolio Formation Procedure

We describe the procedure for forming the set of micro portfolios in each size group (tiny-, small-, and big-cap). First, we sort the stocks in each formation year $y$ ($y = 1, \dots, Y$) according to their estimated average returns. To compute this variable, denoted by $\hat{\mu}_{i,y}$ ($i = 1, \dots, N_y$), we use a linear combination of firm characteristics.$^{1}$ For each month $t$ prior to the formation date, we run a cross-sectional regression of the monthly stock excess returns on the most recently observed characteristics: $r_{i,t} = \gamma_t' z_{i,t} + \varepsilon_{i,t}$, where $z_{i,t}$ denotes the vector of characteristics including a constant. Then, we estimate the characteristic-based average return as

$$\hat{\mu}_{i,y} = \bar{\gamma}_y' z_{i,y}, \tag{A1}$$

where $z_{i,y}$ is the vector of characteristics observed in the formation year and $\bar{\gamma}_y$ is the time-series average of the monthly vector of coefficients. To facilitate the chaining of portfolio returns over consecutive years, we work with the standardized average return computed as

$$\tilde{\mu}_{i,y} = \frac{\hat{\mu}_{i,y} - \frac{1}{N_y}\sum_{j=1}^{N_y}\hat{\mu}_{j,y}}{\left(\frac{1}{N_y}\sum_{j=1}^{N_y}\hat{\mu}_{j,y}^2 - \left(\frac{1}{N_y}\sum_{j=1}^{N_y}\hat{\mu}_{j,y}\right)^2\right)^{1/2}}. \tag{A2}$$

Second, we construct, for each stock $i$, a micro portfolio by equally weighting the stock itself and $m - 1$ additional stocks with the nearest values to $\tilde{\mu}_{i,y}$. This technique is called local averaging and borrows from Efron (2010, ch. 9). Third, we chain the portfolio returns over time to obtain stable average returns. For each pair $(p, q)$ of micro portfolios in years $y$ and $y+1$, we compute the distance between them as $|\tilde{\mu}_{p,y} - \tilde{\mu}_{q,y+1}|^2$. Then, we match the portfolios with the lowest distance (each year-$y$ portfolio can only be paired with one year-$(y+1)$ portfolio). To minimize changes in portfolio composition, we match the pair $(p, q)$ first if $|\tilde{\mu}_{p,y} - \tilde{\mu}_{q,y+1}|$ is in the bottom 1% of all measured distances.

$^{1}$ The characteristic-average return relation used here should be distinguished from previous studies that impose a linear relation between characteristics and pricing errors (Avramov and Chordia (2006), Brennan, Chordia, and Subrahmanyam (1998)). In our case, two stocks can have similar characteristics and yet different pricing errors because they are not exposed to the same risk factors.

$^{2}$ Alternatively, we can compute the estimated average return of the newly-created portfolio as $\frac{1}{m}\left(\tilde{\mu}_{i,y} + \sum_{l=1}^{m-1}\tilde{\mu}_{j_l,y}\right)$, where $j_l$ ($l = 1, \dots, m-1$) denotes the identity of the additional stocks included in the portfolio. Using this approach leaves the results unchanged.


In Figure A1, we illustrate the portfolio formation procedure in a population of 50 stocks ($N_1 = N_2 = 50$) over a 2-year sample period ($Y = 2$). Each dot denotes the ordered value $\tilde{\mu}_{i,y}$ at the start of each year. We see that the portfolio composition changes each year to account for the time-variation in characteristics. For instance, the portfolio associated with the median average return $\tilde{\mu}_{25}$ includes stocks S10, S42, ..., S3 in year 1, and stocks S18, S6, ..., S46 in year 2. The formation procedure yields a total number $n$ of micro portfolios equal to the number of stocks ($n = N$).
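The local-averaging step can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names `standardize` and `form_micro_portfolios`, the tie-breaking rule, and the toy inputs are assumptions made here, and `m` stands for the number of stocks per portfolio.

```python
def standardize(mu_hat):
    """Cross-sectional standardization in the spirit of Equation (A2)."""
    n = len(mu_hat)
    mean = sum(mu_hat) / n
    var = sum(x * x for x in mu_hat) / n - mean ** 2
    return [(x - mean) / var ** 0.5 for x in mu_hat]


def form_micro_portfolios(mu, m):
    """For each stock, return the indices of the m stocks (itself included)
    whose standardized average returns are nearest to its own."""
    n = len(mu)
    if not 1 <= m <= n:
        raise ValueError("m must lie between 1 and the number of stocks")
    portfolios = []
    for i in range(n):
        # Stock i is forced first; the others are ranked by absolute
        # distance to stock i, with ties broken by index (an assumption).
        order = sorted(range(n), key=lambda j: (j != i, abs(mu[j] - mu[i]), j))
        portfolios.append(sorted(order[:m]))
    return portfolios


# Toy example: five stocks, three stocks per micro portfolio.
mu_tilde = standardize([0.2, 1.0, 0.4, 0.9, 0.1])
portfolios = form_micro_portfolios(mu_tilde, m=3)
```

Each stock anchors exactly one portfolio, so the number of portfolios equals the number of stocks, and neighbouring portfolios overlap by construction.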

In practice, the formation procedure is more complicated because the number of stocks changes over time. Suppose that the number of stocks in year 2 is equal to 60 (instead of 50). Applying the matching procedure described above, we can pair 50 year-1 and 50 year-2 portfolios, which leaves 10 year-2 portfolios unmatched. In this example, the cross-section includes 60 micro portfolios ($n = 60$) with unequal time-series lengths: (i) 50 portfolios created in year 1 with complete return history (24 monthly returns), (ii) 10 unmatched portfolios created in year 2 with 12 monthly return observations. Conversely, suppose that we have 60 portfolios in year 1 (instead of 50). In this case, we can only pair 50 year-1 portfolios, which leaves 10 year-1 portfolios unmatched ($n = 60$). In general, the total number of micro portfolios is therefore equal to $n = \max_y(N_y)$.
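The chaining step with unequal numbers of stocks can be illustrated with a simple greedy matching. The sketch below is an assumption-laden simplification: it pairs portfolios in increasing order of squared distance and does not implement the "bottom 1%" priority rule; `match_portfolios` is a hypothetical helper name.

```python
def match_portfolios(mu_y, mu_next):
    """Greedily pair year-y and year-(y+1) portfolios by squared distance.

    Each portfolio is paired at most once. Returns the list of pairs and
    the indices left unmatched in each year.
    """
    candidates = sorted(
        ((a - b) ** 2, p, q)
        for p, a in enumerate(mu_y)
        for q, b in enumerate(mu_next)
    )
    used_p, used_q, pairs = set(), set(), []
    for _, p, q in candidates:
        if p not in used_p and q not in used_q:
            pairs.append((p, q))
            used_p.add(p)
            used_q.add(q)
    unmatched_y = [p for p in range(len(mu_y)) if p not in used_p]
    unmatched_next = [q for q in range(len(mu_next)) if q not in used_q]
    return pairs, unmatched_y, unmatched_next


# Two portfolios in year 1, three in year 2: one year-2 portfolio stays unmatched.
pairs, left_y, left_next = match_portfolios([0.0, 1.0], [0.9, 0.1, 2.0])
n_total = len(pairs) + len(left_y) + len(left_next)
```

In this two-year toy case the total number of chained portfolios equals the larger of the two yearly counts, as in the 50-versus-60 example above.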

Please insert Figure A1 here

B Definition of the Characteristics

To compute the firm's average return, we follow Fama and French (2008) and use a linear combination of three characteristics: book-to-market, profitability, and investment. We use the definitions of Fama and French (2008, 2015) to measure these characteristics at the end of June of each year $y$ and winsorize the data at the 1% and 99% levels to limit the influence of outliers. The book-to-market ratio is equal to the ratio of the book value of equity to the market value of equity. The book value for year $y$ is defined as total assets minus liabilities, plus balance sheet deferred taxes and investment tax credit (if available), minus preferred stock liquidating value (if available), or carrying value (if available) in the fiscal year ending in the calendar year $y - 1$. The market value for year $y$ equals the price times shares outstanding at the end of December of year $y - 1$. Investment for year $y$ is computed as the relative change in total assets between the fiscal years ending in calendar years $y - 2$ and $y - 1$. Finally, profitability for year $y$ is defined as revenues minus cost of goods sold, minus selling, general, and administrative expenses, minus interest expense, all divided by the book value of equity. Each of these variables is computed using data in the fiscal year ending in the calendar year $y - 1$.
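As an illustration of the winsorization step, the following sketch clips a characteristic at its 1st and 99th percentiles. The percentile convention (linear interpolation) and the toy data are assumptions; the source does not specify them.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def winsorize(values, lower=1.0, upper=99.0):
    """Clip values below the lower percentile and above the upper percentile."""
    ordered = sorted(values)
    lo, hi = percentile(ordered, lower), percentile(ordered, upper)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical book-to-market ratios with two extreme observations.
book_to_market = [0.5 + 0.01 * i for i in range(99)] + [25.0, -3.0]
clean = winsorize(book_to_market)
```

The extreme observations are pulled back to the percentile bounds rather than dropped, so the cross-section keeps its size.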


We also use the definitions in Hou, Xue, and Zhang (2015) to construct alternative proxies for the three characteristics. First, we replace the book-to-market ratio with the earnings-to-price and cash flow-to-price ratios. Earnings for year $y$ are defined as income before extraordinary items. Cash flows for year $y$ are computed as income before extraordinary items, plus equity's share of depreciation, plus deferred taxes (if available). Each of these variables is computed using data in the fiscal year ending in the calendar year $y - 1$. Price for year $y$ equals the market value measured at the end of December of year $y - 1$.

Second, we measure investment using infrastructure growth and inventory growth. Infrastructure growth for year $y$ is defined as the change in gross property, plant, and equipment, plus changes in inventory in the fiscal year ending in the calendar year $y - 1$, scaled by total assets in the fiscal year ending in the calendar year $y - 2$. Inventory growth is defined as the relative change in capital expenditure between the fiscal years ending in calendar years $y - 2$ and $y - 1$.

Third, we measure profitability using the Return On Equity (ROE) and the Return On Assets (ROA). ROE for year $y$ is defined as income before extraordinary items in the fiscal year ending in the calendar year $y - 1$ divided by book equity in the fiscal year ending in the calendar year $y - 2$. ROA for year $y$ is defined as income before extraordinary items in the fiscal year ending in the calendar year $y - 1$ divided by total assets in the fiscal year ending in the calendar year $y - 2$.

II Estimation Procedure

A Extended Two-Pass Regression

We provide a general description of the econometric framework for estimating the pricing errors of the micro portfolios. Under model $k$ ($k = 1, \dots, K$), the excess return of each portfolio $p$ ($p = 1, \dots, n$) can be written as

$$r_{p,t} = a_{pk} + b_{pm} r_{m,t} + b_{pk}' f_{k,t} + b_{po}' f_{o,t} + e_{p,t}, \tag{A3}$$

where $r_{m,t}$ is the market excess return, $f_{k,t}$ is the $J_k$-vector of risk factors specific to model $k$, $f_{o,t}$ is the $J_o$-vector of factors included in the other models and orthogonal to $r_{m,t}$ and $f_{k,t}$, and $e_{p,t}$ denotes the residual term. The intercept $a_{pk}$ is equal to

$$a_{pk} = \alpha_{pk} - b_{pk}' \phi_k, \tag{A4}$$

where $\alpha_{pk}$ is the portfolio pricing error and $\phi_k$ is the $J_k$-vector of forward prices of the risk factors.$^{3}$ Using Equation (A4), we can write the pricing error as

$$\alpha_{pk} = c_k' \theta_{pk}, \tag{A5}$$

where the $(J_k + 1)$-vectors $c_k$ and $\theta_{pk}$ are defined as $c_k = [1, \phi_k']'$ and $\theta_{pk} = [a_{pk}, b_{pk}']'$. To estimate $\alpha_{pk}$, we build on recent work by Gagliardini, Ossola, and Scaillet (2016; GOS hereafter), who extend the traditional two-pass regression to a large and unbalanced panel of test assets, two important features exhibited by micro portfolios.
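The step from Equation (A4) to Equation (A5) is a stacking argument. As a worked restatement, write $a$ for the intercept, $b$ for the loadings on the model-specific factors, and $\phi$ for the vector of forward prices (symbols chosen here for illustration, since the extracted text does not preserve the original notation):

```latex
\alpha \;=\; a + b'\phi
\;=\; \begin{bmatrix} 1 & \phi' \end{bmatrix}
      \begin{bmatrix} a \\ b \end{bmatrix}
\;=\; c'\theta,
\qquad c = [1,\ \phi']', \quad \theta = [a,\ b']'.
```

The pricing error is therefore linear in the first-pass coefficients once the forward prices are known, which is what allows its sampling variation to be derived from that of the estimated coefficients.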

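In the first step, we run a time-series regression of $r_{p,t}$ on the $(J_k + J_o + 2)$-vector $x_t = [1, r_{m,t}, f_{k,t}', f_{o,t}']'$ for each portfolio $p$. The OLS estimator of the $(J_k + J_o + 2)$-vector of coefficients $\beta_p = [a_{pk}, b_{pm}, b_{pk}', b_{po}']'$ is given by

$$\hat{\beta}_p = \left(\sum_{t=1}^{T} I_{p,t}\, x_t x_t'\right)^{-1} \sum_{t=1}^{T} I_{p,t}\, x_t r_{p,t}, \tag{A6}$$

where $T$ is the total number of observations, and $I_{p,t}$ equals one if $r_{p,t}$ is non-missing. The matrix inversion in Equation (A6) is numerically unstable if only a few return observations are available. To address this issue, GOS introduce the following trimming device:

$$1_p^{\chi} = 1\left\{\tau_p \le \chi_1,\; CN(\hat{Q}_{x,p}) \le \chi_2\right\}, \tag{A7}$$

where $\tau_p = T / T_p$, $T_p = \sum_{t=1}^{T} I_{p,t}$, and $CN(\hat{Q}_{x,p}) = \left(eig_{\max}(\hat{Q}_{x,p}) / eig_{\min}(\hat{Q}_{x,p})\right)^{1/2}$ denotes the condition number of the matrix $\hat{Q}_{x,p} = \frac{1}{T_p}\sum_{t=1}^{T} I_{p,t}\, x_t x_t'$. Following GOS, we set $\chi_1 = 606/60$ (a minimum of 60 monthly observations) and $\chi_2 = 15$.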
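The first-pass regression with the trimming device can be sketched with NumPy as follows. This is a minimal illustration under assumptions: missing returns are encoded as `np.nan`, the function name `first_pass` and the simulated data are hypothetical, and the thresholds default to the values quoted above.

```python
import numpy as np


def first_pass(returns, factors, chi1=606 / 60, chi2=15.0):
    """Time-series OLS for one portfolio on an unbalanced sample.

    returns : (T,) array with np.nan where the return is missing
    factors : (T, J) array of factor realizations (no constant column)
    Returns (beta_hat, keep), where keep is the trimming indicator.
    """
    T = returns.shape[0]
    observed = ~np.isnan(returns)            # plays the role of the indicator
    T_p = int(observed.sum())
    x = np.column_stack([np.ones(T), factors])[observed]
    r = returns[observed]
    if T_p <= x.shape[1]:
        return None, False                   # not enough observations
    Q = x.T @ x / T_p
    eig = np.linalg.eigvalsh(Q)              # eigenvalues in ascending order
    if eig[0] <= 0:
        return None, False
    cond = np.sqrt(eig[-1] / eig[0])         # condition number of Q
    keep = (T / T_p <= chi1) and (cond <= chi2)
    beta_hat = np.linalg.solve(x.T @ x, x.T @ r)
    return beta_hat, bool(keep)


# Simulated example: 606 months, two factors, known coefficients.
rng = np.random.default_rng(0)
T = 606
f = rng.normal(size=(T, 2))
r_full = 0.1 + f @ np.array([1.0, 0.5]) + 0.01 * rng.normal(size=T)

r_a = r_full.copy()
r_a[:100] = np.nan                           # 506 observations: kept
beta_a, keep_a = first_pass(r_a, f)

r_b = r_full.copy()
r_b[30:] = np.nan                            # 30 observations: trimmed
beta_b, keep_b = first_pass(r_b, f)
```

The second portfolio is trimmed because its ratio of total to available observations exceeds the first threshold, even though the regression itself can still be computed.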

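In the second step, we estimate the $J_k$-vector of forward prices $\phi_k$ using a cross-sectional regression of the estimated intercept $\hat{a}_{pk}$ on the $J_k$-vector of estimated betas $\hat{b}_{pk}$, keeping the non-trimmed portfolios only:

$$\hat{\phi}_k^{(1)} = -\left(\sum_{p=1}^{n} 1_p^{\chi}\, \hat{b}_{pk} \hat{b}_{pk}'\right)^{-1} \sum_{p=1}^{n} 1_p^{\chi}\, \hat{b}_{pk} \hat{a}_{pk}. \tag{A8}$$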
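A minimal NumPy sketch of the second pass follows. It covers only the first-stage cross-sectional estimate and the implied pricing errors; the bias adjustment is deliberately omitted, and the function names and simulated inputs are assumptions made for illustration.

```python
import numpy as np


def second_pass(a_hat, b_hat, keep):
    """Cross-sectional regression of intercepts on betas (no constant),
    with the sign flip of Equation (A8), using non-trimmed portfolios."""
    a = np.asarray(a_hat, dtype=float)[keep]
    b = np.asarray(b_hat, dtype=float)[keep]   # shape (n_kept, J_k)
    return -np.linalg.solve(b.T @ b, b.T @ a)


def pricing_errors(a_hat, b_hat, phi):
    """Pricing error as intercept plus betas times forward prices."""
    return np.asarray(a_hat, dtype=float) + np.asarray(b_hat, dtype=float) @ phi


# Simulated cross-section of correctly priced portfolios (alpha = 0).
rng = np.random.default_rng(1)
phi_true = np.array([0.3, -0.2])
b_sim = rng.normal(size=(200, 2))
a_sim = -b_sim @ phi_true                      # intercept implied by zero alpha
keep = np.ones(200, dtype=bool)
keep[:10] = False                              # pretend ten portfolios are trimmed

phi_hat = second_pass(a_sim, b_sim, keep)
alpha_hat = pricing_errors(a_sim, b_sim, phi_hat)
```

With zero pricing errors and no estimation noise, the regression recovers the forward prices exactly, which is a convenient sanity check before adding noise or the bias correction.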

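We adjust $\hat{\phi}_k^{(1)}$ for the bias component $\Psi = Q_b^{-1}\left(\frac{1}{n}\sum_{p=1}^{n} E_1' \Sigma_{\theta,p}\, c_k\right)$, where $Q_b = E[b_{pk} b_{pk}']$, $E_1 = [0_{J_k \times 1}, I_{J_k}]'$, $I_{J_k}$ is the $J_k \times J_k$ identity matrix, $\Sigma_{\theta,p} = E_2' Q_x^{-1} S_{pp} Q_x^{-1} E_2$, $Q_x = E[x_t x_t']$, $S_{pp} = E[e_{p,t}^2 x_t x_t']$, and $E_2$ is a $(J_k + J_o + 2) \times (J_k + 1)$ matrix whose $j$th row ($j = 1, 3, 4, \dots, J_k + 2$) has a single element equal to one and zeros everywhere else. The final estimate of $\phi_k$ is equal to

$$\hat{\phi}_k = \hat{\phi}_k^{(1)} + \frac{1}{T}\hat{\Psi}, \tag{A9}$$

where $\hat{\Psi}$ is computed as $\hat{Q}_b^{-1}\left(\frac{1}{n^{\chi}}\sum_{p=1}^{n} 1_p^{\chi} E_1' \hat{\Sigma}_{\theta,p}\, \hat{c}_k^{(1)}\right)$, $n^{\chi} = \sum_{p=1}^{n} 1_p^{\chi}$, $\hat{Q}_b = \frac{1}{n^{\chi}}\sum_{p=1}^{n} 1_p^{\chi}\, \hat{b}_{pk}\hat{b}_{pk}'$, $\hat{\Sigma}_{\theta,p} = E_2' \hat{Q}_{x,p}^{-1} \hat{S}_{pp} \hat{Q}_{x,p}^{-1} E_2$, and $\hat{c}_k^{(1)} = [1, \hat{\phi}_k^{(1)\prime}]'$. Following GOS, we estimate $S_{pp}$ using the White (1980) estimator: $\hat{S}_{pp} = \frac{1}{T_p}\sum_{t=1}^{T} I_{p,t}\, \hat{e}_{p,t}^2\, x_t x_t'$, where $\hat{e}_{p,t} = r_{p,t} - \hat{\beta}_p' x_t$. Plugging the estimated quantities in Equation (A5), we obtain

$$\hat{\alpha}_{pk} = \hat{c}_k' \hat{\theta}_{pk}. \tag{A10}$$

$^{3}$ The forward price of the market factor does not appear in Equation (A4) because $r_{m,t}$ is an excess return which, by definition, has a forward price equal to zero.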

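B Estimation of the Portfolio $t$-Statistics

We now prove Proposition 1, which provides an analytical expression for the $t$-statistic of the portfolio pricing error and its asymptotic distribution.

Proof of Proposition 1. We consider the misspecified model $k$ and suppose that the residual terms $e_{p,t}$ ($p = 1, \dots, n$) are weakly correlated. When the number of portfolios and return observations grow large ($n, T \to \infty$), Proposition 7 of GOS shows that $\hat{\phi}_k$ converges towards $\phi_k$ at a rate equal to $\sqrt{n}$. In addition, standard results in regression analysis reveal that the vector of estimated coefficients $\hat{\theta}_{pk}$ is asymptotically distributed as

$$\sqrt{T}\,(\hat{\theta}_{pk} - \theta_{pk}) \to N(0, \Sigma_{\theta,p}), \tag{A11}$$

where $\Sigma_{\theta,p} = E_2' Q_x^{-1} S_{pp} Q_x^{-1} E_2$. With $n$ in the thousands and $T$ in the hundreds, the asymptotic sampling variation in $\hat{\alpha}_{pk}$ is therefore only driven by that of $\hat{\theta}_{pk}$, i.e.,

$$\sqrt{T}\,(\hat{\alpha}_{pk} - \alpha_{pk}) \to N\left(0,\; c_k' \Sigma_{\theta,p}\, c_k\right). \tag{A12}$$

Using this result, we compute the portfolio $t$-statistic as $\hat{t}_{pk} = \hat{\alpha}_{pk} / \hat{\sigma}_{\alpha_{pk}}$, where

$$\hat{\sigma}^2_{\alpha_{pk}} = \frac{1}{T}\,\hat{c}_k' \hat{\Sigma}_{\theta,p}\, \hat{c}_k = \hat{c}_k' \hat{V}_{\theta,p}\, \hat{c}_k. \tag{A13}$$

The variance term $\hat{V}_{\theta,p}$ is equal to $\frac{1}{T}\hat{\Sigma}_{\theta,p}$, where $\hat{\Sigma}_{\theta,p}$ is a consistent estimator of the covariance matrix $\Sigma_{\theta,p}$. In addition, Equation (A12) implies that the $t$-statistic follows a normal distribution, $\hat{t}_{pk} \to N\left(\frac{\alpha_{pk}}{\sigma_{\alpha_{pk}}}, 1\right)$, where $\alpha_{pk} = c_k' \theta_{pk}$ and $\sigma^2_{\alpha_{pk}} = c_k' V_{\theta,p}\, c_k$.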


III Statistical Inference

A Proportion of Mispriced Portfolios

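We compute the proportion of portfolios that are mispriced by model $k$ as

$$\hat{\pi}_k = 1 - \frac{\hat{G}_k(I)}{G_0(I)} = 1 - \frac{\frac{1}{n}\sum_{p=1}^{n} 1_I(\hat{t}_{pk})}{\Phi_0(I)}, \tag{A14}$$

where $\hat{G}_k(I)$ is the estimated probability that the $t$-statistic falls in the interval $I = [-\lambda, \lambda]$, $1_I(\hat{t}_{pk})$ is an indicator function equal to one if $\hat{t}_{pk}$ falls in $I$, and $G_0(I)$ is computed from the standard normal cdf: $\Phi_0(I) = \Phi_0(\lambda) - \Phi_0(-\lambda) = 2\Phi_0(\lambda) - 1$.

Proof of Proposition 2. We consider the misspecified model $k$ and suppose that the residual terms $e_{p,t}$ ($p = 1, \dots, n$) are weakly correlated. We further assume that the $t$-statistics are spatially ordered such that nearby $t$-statistics exhibit higher correlation. When the number of portfolios grows large ($n \to \infty$), Lemma 2 of Farcomeni (2006) shows that

$$\sqrt{n}\,\left(\hat{G}_k(I) - G_k(I)\right) \to N\left(0, \sigma_k^2\right), \tag{A15}$$

where $\sigma_k^2 = var(1_I(\hat{t}_{1k})) + 2\sum_{j=2}^{\infty} cov(1_I(\hat{t}_{1k}), 1_I(\hat{t}_{jk}))$ and $\hat{t}_{jk}$ ($j = 1, \dots, \infty$) are the ordered $t$-statistics. Because the variance of the estimated proportion $\hat{\pi}_k$ only depends on that of $\hat{G}_k(I)$ (Equation (A14)), the asymptotic distribution of the vector of estimated proportions for two misspecified models $k$ and $l$ is given by

$$\sqrt{n}\begin{bmatrix} \hat{\pi}_k - \pi_k^* \\ \hat{\pi}_l - \pi_l^* \end{bmatrix} \to N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix},\; \begin{bmatrix} \dfrac{\sigma_k^2}{\Phi_0(I)^2} & \dfrac{\sigma_{kl}}{\Phi_0(I)^2} \\ \dfrac{\sigma_{kl}}{\Phi_0(I)^2} & \dfrac{\sigma_l^2}{\Phi_0(I)^2} \end{bmatrix}\right), \tag{A16}$$

where $\pi_k^* = E(\hat{\pi}_k)$ and $\pi_l^* = E(\hat{\pi}_l)$. The variance terms are given by

$$\begin{aligned}
\sigma_k^2 &= var(1_I(\hat{t}_{1k})) + 2\sum_{j=2}^{\infty} cov(1_I(\hat{t}_{1k}), 1_I(\hat{t}_{jk})), \\
\sigma_l^2 &= var(1_I(\hat{t}_{1l})) + 2\sum_{j=2}^{\infty} cov(1_I(\hat{t}_{1l}), 1_I(\hat{t}_{jl})), \\
\sigma_{kl} &= cov(1_I(\hat{t}_{1k}), 1_I(\hat{t}_{1l})) + \sum_{j=2}^{\infty} \left[cov(1_I(\hat{t}_{1k}), 1_I(\hat{t}_{jl})) + cov(1_I(\hat{t}_{1l}), 1_I(\hat{t}_{jk}))\right],
\end{aligned} \tag{A17}$$

where $\hat{t}_{jk}$ and $\hat{t}_{jl}$ ($j = 1, \dots, \infty$) are the ordered $t$-statistics for models $k$ and $l$.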


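Using Proposition 2, we can test the null hypothesis that model $k$ is correctly specified. Under the null hypothesis $H_0: \pi_k^* = 0$, Genovese and Wasserman (2004) show that the estimated mispricing proportion is asymptotically distributed as

$$\sqrt{n}\,\hat{\pi}_k \to \frac{1}{2}\delta_0 + \frac{1}{2}N^{+}\left(0, \frac{\sigma_k^2}{\Phi_0(I)^2}\right), \tag{A18}$$

where $\delta_0$ is a point-mass at zero and $N^{+}$ is a positive-truncated normal distribution. To test this hypothesis at the size level $\gamma$, we determine whether $\hat{\pi}_k$ is sufficiently far away from zero using the following threshold:

$$\frac{1}{\sqrt{n}}\,\frac{\hat{\sigma}_k}{\Phi_0(I)}\, z_{1-\gamma}, \tag{A19}$$

where $\hat{\sigma}_k$ is the consistent estimator of $\sigma_k$ and $z_{1-\gamma}$ is the quantile of the standard normal distribution at $(1 - \gamma)$. To compute $\hat{\sigma}_k$, we use the following estimator proposed by Newey and West (1987):

$$\hat{\sigma}_k^2 = \left[\frac{1}{n}\sum_{p=1}^{n} 1_I(\hat{t}_{pk})\right] - \hat{G}_k(I)^2 + 2\sum_{j=1}^{L}\left(\left[\frac{1}{n-j}\sum_{p=1}^{n-j} 1_I(\hat{t}_{pk})\, 1_I(\hat{t}_{p+j,k})\right] - \hat{G}_k(I)^2\right), \tag{A20}$$

where $L$ is the number of cross-sectional lags.$^{4}$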

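Proposition 2 also allows us to test the null hypothesis of equal performance between two misspecified, possibly non-nested models. Under the null hypothesis $H_0: \Delta\pi^* = \pi_k^* - \pi_l^* = 0$, the estimated difference $\Delta\hat{\pi} = \hat{\pi}_k - \hat{\pi}_l$ is asymptotically distributed as

$$\sqrt{n}\,(\Delta\hat{\pi} - \Delta\pi^*) \to N\left(0, \frac{\sigma_k^2 + \sigma_l^2 - 2\sigma_{kl}}{\Phi_0(I)^2}\right). \tag{A21}$$

To implement this testing procedure, we compute the covariance term $\sigma_{kl}$ using the following consistent estimator:

$$\hat{\sigma}_{kl} = \left[\frac{1}{n}\sum_{p=1}^{n} 1_I(\hat{t}_{pk})\, 1_I(\hat{t}_{pl})\right] - \hat{G}_k(I)\hat{G}_l(I) + \sum_{j=1}^{L}\left(\left[\frac{1}{n-j}\sum_{p=1}^{n-j} \left(1_I(\hat{t}_{pk})\, 1_I(\hat{t}_{p+j,l}) + 1_I(\hat{t}_{pl})\, 1_I(\hat{t}_{p+j,k})\right)\right] - 2\,\hat{G}_k(I)\hat{G}_l(I)\right). \tag{A22}$$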

4 In the baseline specification, we set equal to 1% of the total number of portfolios ( = 40 in the

entire population) to account for potential weak dependencies between portfolios. Setting = yields

similar volatility estimates.


B Sign of the Pricing Errors

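We can extend the large-scale approach to conduct inference on the estimated proportions of portfolios with negative and positive pricing errors. To compute both proportions, denoted by $\hat{\pi}_k^{-}$ and $\hat{\pi}_k^{+}$, we use the procedure of Barras, Scaillet, and Wermers (2010). First, we determine the proportions of portfolios with low or high estimated pricing errors, $\hat{G}_k(I^{-})$ and $\hat{G}_k(I^{+})$, where $I^{-} = [-\infty, -\lambda]$, $I^{+} = [\lambda, +\infty]$, and $-\lambda$ and $\lambda$ denote the lower and upper bounds of the interval $I$. Second, we deduct the proportions of "false discoveries", $(1 - \hat{\pi}_k)\Phi_0(I^{-})$ and $(1 - \hat{\pi}_k)\Phi_0(I^{+})$; both expressions measure the proportions of correctly-priced portfolios which, by chance, have $t$-statistics falling in the intervals $I^{-}$ and $I^{+}$. This two-step approach yields the following expressions for $\hat{\pi}_k^{-}$ and $\hat{\pi}_k^{+}$ and their variances:

$$\begin{aligned}
\hat{\pi}_k^{-} &= \hat{G}_k(I^{-}) - (1 - \hat{\pi}_k)\Phi_0(I^{-}), \\
\hat{\pi}_k^{+} &= \hat{G}_k(I^{+}) - (1 - \hat{\pi}_k)\Phi_0(I^{+}),
\end{aligned} \tag{A23}$$

$$\begin{aligned}
\sigma^2_{\pi^{-}} &= \sigma^2_{G(I^{-})} + \Phi_0(I^{-})^2 \sigma^2_{\pi} - 2\Phi_0(I^{-})\,\sigma_{G(I^{-}),\pi}, \\
\sigma^2_{\pi^{+}} &= \sigma^2_{G(I^{+})} + \Phi_0(I^{+})^2 \sigma^2_{\pi} - 2\Phi_0(I^{+})\,\sigma_{G(I^{+}),\pi},
\end{aligned} \tag{A24}$$

where $\sigma^2_{G(I^{-})} = \frac{1}{n}\sigma^2_{I^{-}}$, $\sigma^2_{G(I^{+})} = \frac{1}{n}\sigma^2_{I^{+}}$, $\sigma^2_{\pi} = \frac{1}{n}\frac{\sigma^2_{I}}{\Phi_0(I)^2}$, $\sigma_{G(I^{-}),\pi} = \frac{1}{n}\frac{1}{\Phi_0(I)}\sigma_{I^{-},I}$, and $\sigma_{G(I^{+}),\pi} = \frac{1}{n}\frac{1}{\Phi_0(I)}\sigma_{I^{+},I}$. The different components are given by

$$\begin{aligned}
\sigma^2_{I^{-}} &= var(1_{I^{-}}(\hat{t}_1)) + 2\sum_{j=2}^{\infty} cov(1_{I^{-}}(\hat{t}_1), 1_{I^{-}}(\hat{t}_j)), \\
\sigma^2_{I^{+}} &= var(1_{I^{+}}(\hat{t}_1)) + 2\sum_{j=2}^{\infty} cov(1_{I^{+}}(\hat{t}_1), 1_{I^{+}}(\hat{t}_j)), \\
\sigma_{I^{-},I} &= cov(1_{I^{-}}(\hat{t}_1), 1_I(\hat{t}_1)) + \sum_{j=2}^{\infty}\left[cov(1_{I^{-}}(\hat{t}_1), 1_I(\hat{t}_j)) + cov(1_I(\hat{t}_1), 1_{I^{-}}(\hat{t}_j))\right], \\
\sigma_{I^{+},I} &= cov(1_{I^{+}}(\hat{t}_1), 1_I(\hat{t}_1)) + \sum_{j=2}^{\infty}\left[cov(1_{I^{+}}(\hat{t}_1), 1_I(\hat{t}_j)) + cov(1_I(\hat{t}_1), 1_{I^{+}}(\hat{t}_j))\right],
\end{aligned} \tag{A25}$$

where $1_{I^{-}}(\hat{t}_j)$ ($1_{I^{+}}(\hat{t}_j)$) is an indicator function equal to one if $\hat{t}_j$ falls in $I^{-}$ ($I^{+}$). After replacing the above expressions with the consistent estimators proposed by Newey and West (1987), we can conduct inference on the two proportions $\hat{\pi}_k^{-}$ and $\hat{\pi}_k^{+}$.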
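The point estimates in Equation (A23) can be sketched as follows, again with the standard library only. The function name, the interval half-width of 0.4, and the synthetic cross-section (60% correctly priced portfolios, 40% with large positive pricing errors) are assumptions for illustration; the variance formulas are not implemented here.

```python
from statistics import NormalDist


def signed_proportions(t_stats, lam=0.4):
    """Return (pi_hat, pi_neg, pi_pos) for the tails below -lam and above lam."""
    n = len(t_stats)
    tail = NormalDist().cdf(-lam)                 # null probability of each tail
    phi0 = 1.0 - 2.0 * tail                       # null probability of [-lam, lam]
    g_in = sum(-lam <= t <= lam for t in t_stats) / n
    g_neg = sum(t < -lam for t in t_stats) / n
    g_pos = sum(t > lam for t in t_stats) / n
    pi_hat = 1.0 - g_in / phi0
    # Remove the expected share of correctly priced portfolios in each tail.
    pi_neg = g_neg - (1.0 - pi_hat) * tail
    pi_pos = g_pos - (1.0 - pi_hat) * tail
    return pi_hat, pi_neg, pi_pos


def normal_grid(size):
    """Deterministic stand-in for a standard normal sample."""
    return [NormalDist().inv_cdf((i + 0.5) / size) for i in range(size)]


t_stats = normal_grid(600) + [3.0 + z for z in normal_grid(400)]
pi_hat, pi_neg, pi_pos = signed_proportions(t_stats)
```

By construction the two signed proportions add up to the overall mispricing proportion, which provides a simple consistency check.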


C Testing for Useless Factors

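We now explain how to test whether the $s$th factor included in model $k$ is useless. For each portfolio $p$ ($p = 1, \dots, n$), the beta on factor $s$ obtained from the first-pass regression in Equation (A6) is asymptotically distributed as

$$\sqrt{T}\,(\hat{b}_{ps} - b_{ps}) \to N\left(0,\; e_{s+1}' \Sigma_{\theta,p}\, e_{s+1}\right), \tag{A26}$$

where $e_{s+1}$ is a $(J_k + 1)$-vector whose $(s+1)$th element is one and the others are zero. Using this result, we compute the associated $t$-statistic as $\hat{t}_{ps} = \hat{b}_{ps} / \hat{\sigma}_{b_{ps}}$, where the estimated variance of $\hat{b}_{ps}$ is given by

$$\hat{\sigma}^2_{b_{ps}} = \frac{1}{T}\, e_{s+1}' \hat{\Sigma}_{\theta,p}\, e_{s+1}. \tag{A27}$$

Then, we compute the proportion of portfolios with non-zero betas on factor $s$ using the same expression as in Equation (A14):

$$\hat{\pi}_k(s) = 1 - \frac{\hat{G}_s(I)}{G_0(I)} = 1 - \frac{\frac{1}{n}\sum_{p=1}^{n} 1_I(\hat{t}_{ps})}{\Phi_0(I)}, \tag{A28}$$

where $\hat{G}_s(I)$ is the estimated probability that the beta $t$-statistic falls in the interval $I$ and $1_I(\hat{t}_{ps})$ is an indicator function equal to one if $\hat{t}_{ps}$ falls in $I$.

If factor $s$ is useless, the true betas are all equal to zero and we obtain $H_0: E(\hat{\pi}_k(s)) = \pi_k^*(s) = 0$. Similar to $\hat{\pi}_k$, we can write the distribution of the estimated proportion $\hat{\pi}_k(s)$ under $H_0$ as

$$\sqrt{n}\,\hat{\pi}_k(s) \to \frac{1}{2}\delta_0 + \frac{1}{2}N^{+}\left(0, \frac{\sigma^2(s)}{\Phi_0(I)^2}\right), \tag{A29}$$

where $\delta_0$ is a point-mass at zero, $N^{+}$ is a positive-truncated normal distribution, and $\sigma^2(s)$ follows the same expression as in Equation (A17) except that we use the ordered beta $t$-statistics. To test this hypothesis at the size level $\gamma$, we determine whether $\hat{\pi}_k(s)$ is sufficiently far away from zero using the following threshold:

$$\hat{\pi}_k(s) > \frac{1}{\sqrt{n}}\,\frac{\hat{\sigma}(s)}{\Phi_0(I)}\, z_{1-\gamma}, \tag{A30}$$

where $\hat{\sigma}(s)$ is the consistent estimator of $\sigma(s)$ and $z_{1-\gamma}$ is the quantile of the standard normal distribution at $(1 - \gamma)$.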


IV Monte Carlo Analysis

A Setting

We conduct a Monte Carlo analysis to evaluate the finite-sample properties of the proportion estimators for two misspecified models $k$ and $l$. We extend the illustrative example presented in the paper on several important dimensions to closely replicate the salient features of the data. First, we match the total number of micro portfolios across the three size groups (before imposing any filters on the data). Specifically, we construct a set of 2,349 tiny-cap portfolios, 938 small-cap portfolios, and 1,302 big-cap portfolios based on the empirical characteristics of the individual stocks in each size group.

Second, we account for the unbalanced nature of the panel of portfolio returns. To guarantee the same unbalanced structure as in the data, we apply the empirical $n \times T$ matrix of indicators $I_{p,t}$ ($p = 1, \dots, n$ and $t = 1, \dots, T$) to each simulated panel of portfolio returns, where $T$ denotes the total sample size equal to 606 monthly observations and $n$ is equal to 4,589 micro portfolios.

Third, we jointly match the average proportion of mispriced portfolios across the proposed models examined in the empirical section by adding a size premium to the average excess return of each individual stock $i$ ($i = 1, \dots, N$ with $N = n$):

$$E(r_i) = b_{i,smb}\,\mu_{smb} + b_{im}\,\mu_m + b_{ik}\,\mu_k + b_{il}\,\mu_l, \tag{A31}$$

where $\mu_{smb}$ is the size premium, $\mu_m$ is the premium of the market return, and $\mu_k$, $\mu_l$ denote the premia of the two additional risk factors $f_k$ and $f_l$.

Model $k$ includes the market and factor $f_k$, which implies that the vector of explanatory variables is defined as $x_{k,t} = [1, r_{m,t}, f_{k,t}, f_{ok,t}]'$. The term $f_{ok,t}$ is the estimated component of the omitted factor $f_l$ that is orthogonal to $x_{0k,t} = [1, r_{m,t}, f_{k,t}]'$, i.e., $f_{ok,t} = f_{l,t} - \hat{g}_k' x_{0k,t}$, where $\hat{g}_k$ is the vector of estimated coefficients from a time-series regression of $f_{l,t}$ on $x_{0k,t}$ over the entire sample period. Model $l$ includes the market and factor $f_l$, which implies that $x_{l,t} = [1, r_{m,t}, f_{l,t}, f_{ol,t}]'$, where $f_{ol,t} = f_{k,t} - \hat{g}_l' x_{0l,t}$, $x_{0l,t} = [1, r_{m,t}, f_{l,t}]'$, and $\hat{g}_l$ is the vector of estimated coefficients from a time-series regression of the omitted factor $f_{k,t}$ on $x_{0l,t}$ over the entire sample period. We assume that $r_{m,t}$, $f_{k,t}$, $f_{l,t}$, and the residual term $e_{i,t}$ are all independent and normally distributed as $N(\mu_m, \sigma_m^2)$, $N(\mu_k, \sigma_k^2)$, $N(\mu_l, \sigma_l^2)$, and $N(0, \sigma_e^2)$, respectively. We further assume that $b_{i,smb}$, $b_{im}$, $b_{ik}$, $b_{il}$ are randomly drawn from the normal distributions $N(E(b_{smb}), V(b_{smb}))$, $N(E(b_m), V(b_m))$, $N(E(b_k), V(b_k))$, $N(E(b_l), V(b_l))$.

To calibrate the model, we use monthly data on individual stocks and the Fama-French three factors (market, size, value) over the entire sample period. The calibration of the distribution parameters for the betas and the residual term is done separately for each size group. To attribute each individual stock to a specific size group, we form, each year, the three size groups by taking as breakpoints the 20th and 50th percentiles of the market capitalization for the NYSE stocks (similar to Fama and French (2008)). We then classify each stock based on the frequencies at which it falls in the three groups. For each size group, we set $E(b_{smb})$ and $V(b_{smb})$ equal to the median and variance of the estimated size betas, and $E(b_m)$ and $V(b_m)$ equal to the median and variance of the estimated market betas. We further set $E(b_k)$, $V(b_k)$ and $E(b_l)$, $V(b_l)$ equal to the median and variance of the estimated value betas. Finally, $\sigma_e$ is set equal to the cross-sectional average of the estimated residual volatility.

We set $\mu_{smb}$ equal to 0.5% per month so as to approximate the median value for the proportions of mispriced portfolios (around 45%). We set $\mu_m$ and $\sigma_m$ equal to the average return and volatility of the CRSP value-weighted index (0.5% and 4.4% per month). For the volatilities $\sigma_k$ and $\sigma_l$ of the additional risk factors $f_k$ and $f_l$, we split the volatility of the value factor in two (2.8% per month). To determine the values for $\mu_k$ and $\mu_l$, we choose two scenarios to capture the minimum and maximum proportion differences observed in the data. Under the first scenario, $\mu_k$ and $\mu_l$ are set equal to 1.0% and 0.0% per month so as to produce a large proportion difference between the two models. Under the second scenario, $\mu_k$ and $\mu_l$ are both equal to 0.5% per month, which implies that both models yield the same moderate performance.

B Simulation Procedure

For each scenario, we compute the estimated proportions of mispriced portfolios over 1,000 iterations and five sets of values for the stock betas ($S = 5{,}000$). For each iteration $s$ ($s = 1, \dots, S$), we first construct a $T$-vector of monthly return observations for each stock $i$ ($i = 1, \dots, N$ with $N = n$):

$$r_{i,t}(s) = b_{i,smb}\,\mu_{smb} + b_{im}\, r_{m,t}(s) + b_{ik}\, f_{k,t}(s) + b_{il}\, f_{l,t}(s) + e_{i,t}(s), \tag{A32}$$

where $r_{m,t}(s)$, $f_{k,t}(s)$, $f_{l,t}(s)$, and $e_{i,t}(s)$ are drawn from their respective distributions. Second, we form the cross-section of micro portfolios using the average stock return $E(r_i)$ as the sorting variable and apply the portfolio formation described above.$^{5}$ The resulting cross-section consists of $n$ micro portfolios, each containing 10 stocks ($m = 10$): stock $i$ and nine additional stocks with the nearest average return to stock $i$. We keep track of the identity of the stocks included in each micro portfolio via an $n \times N$ matrix $W$ whose $p$th row $w_p$ has zeros everywhere except for the stocks included in the portfolio. Third, we construct the monthly return of each micro portfolio from the $N$-vector of stock returns $r_t(s) = [r_{1,t}(s), \dots, r_{N,t}(s)]'$:

$$r_{p,t}(s) = I_{p,t}\,\frac{1}{m}\left(w_p\, r_t(s)\right), \tag{A33}$$

where $I_{p,t}$ takes the value of one if the return is observed in the data (and zero otherwise). Fourth, we compute the vector of $t$-statistics for all portfolios using the extended two-pass regression described above and estimate the proportions of mispriced portfolios for the two models and their difference,

$$\begin{aligned}
\hat{\pi}_k(s) &= 1 - \frac{\hat{G}_k(I)(s)}{\Phi_0(I)}, \\
\hat{\pi}_l(s) &= 1 - \frac{\hat{G}_l(I)(s)}{\Phi_0(I)}, \\
\Delta\hat{\pi}(s) &= \hat{\pi}_k(s) - \hat{\pi}_l(s),
\end{aligned} \tag{A34}$$

as well as the estimated variances of these estimators using Equations (A20) and (A22),

$$\begin{aligned}
\hat{\sigma}^2_{\pi_k}(s) &= \frac{\hat{\sigma}_k^2(s)}{n\,\Phi_0(I)^2}, \\
\hat{\sigma}^2_{\pi_l}(s) &= \frac{\hat{\sigma}_l^2(s)}{n\,\Phi_0(I)^2}, \\
\hat{\sigma}^2_{\Delta\pi}(s) &= \frac{\hat{\sigma}_k^2(s) + \hat{\sigma}_l^2(s) - 2\hat{\sigma}_{kl}(s)}{n\,\Phi_0(I)^2}.
\end{aligned} \tag{A35}$$

$^{5}$ We assume that the book equity of each firm is proportional to its future expected cash flows. In this case, the average return can be directly inferred from the observable book-to-market ($bm_i$) of each firm, i.e., $E(r_i)$ is proportional to $bm_i$ (see Berk (2000)).

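Repeating these four steps $S$ times, we can then compute the average values of the estimated proportions and their difference as

$$\begin{aligned}
E(\hat{\pi}_k) &= \pi_k^* = \frac{1}{S}\sum_{s=1}^{S}\hat{\pi}_k(s), \\
E(\hat{\pi}_l) &= \pi_l^* = \frac{1}{S}\sum_{s=1}^{S}\hat{\pi}_l(s), \\
E(\Delta\hat{\pi}) &= \Delta\pi^* = \pi_k^* - \pi_l^*.
\end{aligned} \tag{A36}$$

We also compare the true variance of the estimators with the average estimated values:

$$\begin{aligned}
\sigma^2_{\pi_k} &= \frac{1}{S}\sum_{s=1}^{S}\hat{\pi}_k^2(s) - (\pi_k^*)^2 \quad\text{versus}\quad E(\hat{\sigma}^2_{\pi_k}) = \frac{1}{S}\sum_{s=1}^{S}\hat{\sigma}^2_{\pi_k}(s), \\
\sigma^2_{\pi_l} &= \frac{1}{S}\sum_{s=1}^{S}\hat{\pi}_l^2(s) - (\pi_l^*)^2 \quad\text{versus}\quad E(\hat{\sigma}^2_{\pi_l}) = \frac{1}{S}\sum_{s=1}^{S}\hat{\sigma}^2_{\pi_l}(s), \\
\sigma^2_{\Delta\pi} &= \frac{1}{S}\sum_{s=1}^{S}\Delta\hat{\pi}^2(s) - (\Delta\pi^*)^2 \quad\text{versus}\quad E(\hat{\sigma}^2_{\Delta\pi}) = \frac{1}{S}\sum_{s=1}^{S}\hat{\sigma}^2_{\Delta\pi}(s).
\end{aligned} \tag{A37}$$

To further measure the accuracy of the variance estimators, we compute the coverage ratio of the confidence intervals at $1 - \gamma$ equal to 90% and 95% as

$$\begin{aligned}
CR(\hat{\pi}_k) &= \frac{1}{S}\sum_{s=1}^{S} 1\left\{|\hat{\pi}_k(s) - \pi_k^*| < z_{1-\gamma/2}\,\hat{\sigma}_{\pi_k}(s)\right\}, \\
CR(\hat{\pi}_l) &= \frac{1}{S}\sum_{s=1}^{S} 1\left\{|\hat{\pi}_l(s) - \pi_l^*| < z_{1-\gamma/2}\,\hat{\sigma}_{\pi_l}(s)\right\}, \\
CR(\Delta\hat{\pi}) &= \frac{1}{S}\sum_{s=1}^{S} 1\left\{|\Delta\hat{\pi}(s) - \Delta\pi^*| < z_{1-\gamma/2}\,\hat{\sigma}_{\Delta\pi}(s)\right\},
\end{aligned} \tag{A38}$$

where $1\{\cdot\}$ equals one if the condition inside the braces is satisfied (and zero otherwise), and $z_{1-\gamma/2}$ equals the quantile of the standard normal distribution at $(1 - \gamma/2)$.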
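The summary statistics in Equations (A36) to (A38) are straightforward to compute once the simulated estimates and their estimated volatilities are stored. The sketch below uses the standard library; the function name and the synthetic draws (which stand in for the output of the simulation loop) are assumptions.

```python
import random
from statistics import NormalDist, fmean


def monte_carlo_summary(estimates, std_errors, level=0.90):
    """Return the mean estimate, the true variance across iterations, the
    average estimated variance, and the coverage ratio at the given level."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    center = fmean(estimates)
    true_var = fmean([e * e for e in estimates]) - center ** 2
    avg_est_var = fmean([s * s for s in std_errors])
    coverage = fmean(
        [1.0 if abs(e - center) < z * s else 0.0
         for e, s in zip(estimates, std_errors)]
    )
    return center, true_var, avg_est_var, coverage


# Synthetic stand-in for 5,000 simulated proportions with a correctly
# estimated volatility of 5% in every iteration.
rng = random.Random(1)
draws = [rng.gauss(0.45, 0.05) for _ in range(5000)]
center, true_var, avg_est_var, coverage = monte_carlo_summary(draws, [0.05] * 5000)
```

When the volatility estimator is accurate, the coverage ratio should be close to the nominal level, which is the benchmark against which the reported coverage ratios are read.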

C Main Results

In Panel A of Table AI, we examine the properties of the different estimators under the first scenario where the two models $k$ and $l$ achieve a large difference in performance (34.5% in the entire population). The true volatilities of the different estimators range between 3.6% and 9.0% and are typically higher for the two largest size groups, which contain fewer portfolios. Turning to the properties of the variance estimators, we find that the average estimated volatility for each model in the entire population is slightly below the true volatility (by 0.5% for model $k$ and 0.3% for model $l$). In contrast, the volatility estimator for the difference yields an average value that closely matches the true volatility (5.1% versus 5.0%). This last property is maintained across all three size groups. Finally, the coverage ratios of the two confidence intervals at 90% and 95% are, in most cases, remarkably accurate. For instance, the coverage ratios for the proportion difference in the entire population are equal to 90.1% and 95.3%, respectively.


In Panel B, we repeat the analysis for the second scenario where the two models yield the same moderate performance. Similar to the previous scenario, the volatility estimators precisely capture the variability of the estimated mispricing proportions for the entire population (they are identical to the true values for both models). We also find that the coverage ratios stay close to their theoretical values (88.0% and 93.4% for the intervals at 90% and 95%, respectively). While the results are similar for the big-cap group, they are less accurate in the two smallest size groups (tiny- and small-cap). In both groups, the volatility estimators underestimate the true volatilities by 12% on average (in relative terms), which implies that the coverage ratios of the confidence intervals are slightly lower than their theoretical values.

Please insert Table AI here

V Overlapping versus Non-overlapping Portfolios

In this section, we show that the mispricing proportion is estimated more precisely with overlapping portfolios. For simplicity, we consider a population of $N$ stocks whose residual terms $e_{i,t}$ ($i = 1, \dots, N$) are homoscedastic and uncorrelated both across stocks and over time. We also assume that both the number of portfolios and return observations grow large. The number of overlapping portfolios $n_o$ is equal to $N$ and the number of non-overlapping portfolios $n_{no}$ satisfies $\frac{N}{m} \le n_{no} \le \frac{N}{m} + 1$, where $m$ is the number of stocks included in each portfolio. We denote by $\sigma^2_{\pi}(o)$ the variance of $\hat{\pi}$ obtained with overlapping portfolios and by $\sigma^2_{\pi}(no)$ the variance of $\hat{\pi}$ obtained with non-overlapping portfolios. The two asymptotic variances can be written as

$$\begin{aligned}
N\,\sigma^2_{\pi}(o) &= \frac{1}{\Phi_0(I)^2}\left(1 + 2\sum_{j=1}^{m-1}\rho_j^{I}\right)\sigma_I^2, \\
N\,\sigma^2_{\pi}(no) &= \frac{1}{\Phi_0(I)^2}\, m\, \sigma_I^2,
\end{aligned} \tag{A39}$$

where $\sigma_I^2$ denotes the variance of the indicator function $1_I(\hat{t}_p)$, and $\rho_j^{I}$ is the correlation between the indicator functions associated with the ordered $t$-statistics, $1_I(\hat{t}_p)$ and $1_I(\hat{t}_{p+j})$. From Equation (A39), we infer that the overlapping scheme provides efficiency gains if

$$\sigma^2_{\pi}(o) < \sigma^2_{\pi}(no) \iff \left(1 + 2\sum_{j=1}^{m-1}\rho_j^{I}\right) < m. \tag{A40}$$


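To show that the above inequality holds, we proceed in three steps. First, we compare the variances of the averages of the $t$-statistics for both overlapping and non-overlapping portfolios. Second, we infer from this comparison that a sufficient condition for Equation (A40) to hold is that the correlation between the pair $(\hat{t}_p, \hat{t}_{p+j})$ is higher than the correlation between the pair $(1_I(\hat{t}_p), 1_I(\hat{t}_{p+j}))$. Third, we verify that this is the case.

We write the $t$-statistic averages for overlapping and non-overlapping portfolios as

$$\bar{t}(o) = \frac{1}{n_o}\sum_{p=1}^{n_o}\hat{t}_p, \qquad \bar{t}(no) = \frac{1}{n_{no}}\sum_{p=1}^{n_{no}}\hat{t}_p, \tag{A41}$$

and their asymptotic variances as

$$N\,\sigma^2_{\bar{t}}(o) = 1 + 2\sum_{j=1}^{m-1}\rho_j^{t}, \qquad N\,\sigma^2_{\bar{t}}(no) = m, \tag{A42}$$

where $\rho_j^{t}$ is the correlation between the pair $(\hat{t}_p, \hat{t}_{p+j})$. To determine $\rho_j^{t}$, we note that the correlation between the portfolio residuals $e_{p,t}$ and $e_{p+j,t}$ is equal to

$$\rho_j^{e} = \max\left(1 - \frac{j}{m},\, 0\right), \tag{A43}$$

which means that the correlation progressively drops from one to zero as the portfolio distance $j$ approaches $m - 1$. Then, we repeat the analysis for the estimated pricing errors $\hat{\alpha}_p$. Building on Equation (A10) and Proposition 1, we can write $\sqrt{T}\,(\hat{\alpha}_p - \alpha_p) = \sqrt{T}\,(\hat{c}_k - c_k)'\hat{\theta}_p + \frac{1}{\sqrt{T}}\sum_{t=1}^{T} c_k' E_2' Q_x^{-1} x_t\, e_{p,t} \simeq \frac{1}{\sqrt{T}}\sum_{t=1}^{T} u_t\, e_{p,t}$, where $u_t$ is a scalar equal to $c_k' E_2' Q_x^{-1} x_t$. Therefore, the asymptotic correlation between $\sqrt{T}\hat{\alpha}_p$ and $\sqrt{T}\hat{\alpha}_{p+j}$ is

$$\rho_j^{\alpha} = \frac{cov(\sqrt{T}\hat{\alpha}_p, \sqrt{T}\hat{\alpha}_{p+j})}{\left(var(\sqrt{T}\hat{\alpha}_p)\, var(\sqrt{T}\hat{\alpha}_{p+j})\right)^{1/2}} = \frac{\sigma_u^2\, \sigma_{e_p, e_{p+j}}}{\sigma_u^2\, \sigma_{e_p}\sigma_{e_{p+j}}} = \rho_j^{e} = \max\left(1 - \frac{j}{m},\, 0\right). \tag{A44}$$

Finally, we use Theorem 8.5 in Efron (2010, ch. 8) to show that asymptotically

$$\rho_j^{t} = \rho_j^{\alpha} = \max\left(1 - \frac{j}{m},\, 0\right). \tag{A45}$$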


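Plugging the above expression in Equation (A42), we find that $N\,\sigma^2_{\bar{t}}(o)$ and $N\,\sigma^2_{\bar{t}}(no)$ are identical because

$$1 + 2\sum_{j=1}^{m-1}\rho_j^{t} = 1 + 2\left(\frac{m-1}{2}\right) = m. \tag{A46}$$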
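The sum in Equation (A46) can be checked directly. Writing $m$ for the number of stocks per portfolio (a symbol assumed here) and using the linear correlation decay of Equation (A45):

```latex
1 + 2\sum_{j=1}^{m-1}\left(1 - \frac{j}{m}\right)
= 1 + 2\left[(m-1) - \frac{1}{m}\cdot\frac{(m-1)\,m}{2}\right]
= 1 + 2\cdot\frac{m-1}{2}
= m .
```

For instance, with $m = 10$ the correlations are $0.9, 0.8, \dots, 0.1$, which sum to $4.5$, so the left-hand side equals $1 + 2 \times 4.5 = 10$.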

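Therefore, a sufficient condition for the inequality in Equation (A40) to hold is that

$$\forall j \in [1, m-1],\; \rho_j^{I} \le \rho_j^{t} \quad\text{and}\quad \exists j \in [1, m-1] \text{ s.t. } \rho_j^{I} < \rho_j^{t}. \tag{A47}$$

To examine the relationship between these two correlations, we denote the bivariate normal distribution for the $t$-statistics $(\hat{t}_p, \hat{t}_{p+j})$ by $f(t_p, t_{p+j}; \mu_t, \rho_j^{t})$ to obtain

$$\rho_j^{I} = \frac{\int\!\!\int \left(1_I(t_p) - G(I)\right)\left(1_I(t_{p+j}) - G(I)\right) f(t_p, t_{p+j}; \mu_t, \rho_j^{t})\, dt_p\, dt_{p+j}}{\sigma_I^2}, \tag{A48}$$

where $\mu_t$ is the $t$-statistic mean.$^{6}$ The double integral in the numerator of Equation (A48) does not have a closed-form expression but can be easily solved numerically. The results show that if $\rho_j^{t} \in (0, 1)$, we have $\rho_j^{I} < \rho_j^{t}$ for all the intervals $I$ and $t$-statistic means $\mu_t$ that belong to the sets $\mathcal{I}$ and $\mathcal{M}$ defined as $\mathcal{I} = \{I = [-\lambda, \lambda];\ \lambda \in [0.15, 0.65]\}$ and $\mathcal{M} = \{\mu_t;\ G(I) = P(\hat{t} \in I) > 0\}$. This implies that the condition in Equation (A47) holds and that $\sigma^2_{\pi}(o) < \sigma^2_{\pi}(no)$. To illustrate, Figure A2 shows the function $\rho^{I} = g(\rho^{t})$ for different values of $\mu_t$ ($\mu_t$ = 0, 0.5, 1, 1.5) and $I = [-0.4, 0.4]$. In all cases, we see that (i) $\rho^{I} = \rho^{t}$ when $\rho^{t}$ is equal to zero or one; (ii) the function $g(\rho^{t})$ is convex. Therefore, $\rho^{I}$ is strictly lower than $\rho^{t}$ when $\rho^{t} \in (0, 1)$.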
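The numerical evaluation of the indicator correlation can be sketched with the standard library alone. The version below reduces the double integral to a one-dimensional integral by conditioning on the first $t$-statistic and applies a composite Simpson rule; the function names, the unit-variance and common-mean assumptions, and the default interval $[-0.4, 0.4]$ are choices made here for illustration.

```python
from math import erf, exp, pi, sqrt


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def indicator_correlation(rho, mean=0.0, lam=0.4, steps=2000):
    """Correlation between 1{t1 in I} and 1{t2 in I}, I = [-lam, lam], when
    (t1, t2) is bivariate normal with common mean, unit variances, and
    correlation rho in [0, 1)."""
    g = norm_cdf(lam - mean) - norm_cdf(-lam - mean)      # P(t in I)
    s = sqrt(1.0 - rho * rho)

    def integrand(x):
        # Density of t1 at x times P(t2 in I | t1 = x).
        cond_mean = mean + rho * (x - mean)
        inside = norm_cdf((lam - cond_mean) / s) - norm_cdf((-lam - cond_mean) / s)
        return norm_pdf(x - mean) * inside

    # Composite Simpson rule on [-lam, lam] (steps must be even).
    h = 2.0 * lam / steps
    total = integrand(-lam) + integrand(lam)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(-lam + i * h)
    joint = total * h / 3.0                               # P(t1 in I, t2 in I)
    return (joint - g * g) / (g * (1.0 - g))


corr_zero = indicator_correlation(0.0)
corr_table = {rho: indicator_correlation(rho) for rho in (0.25, 0.5, 0.75, 0.95)}
corr_shifted = indicator_correlation(0.5, mean=1.0)
```

In each case the indicator correlation lies strictly between zero and the correlation of the underlying $t$-statistics, in line with the inequality discussed above.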

Please insert Figure A2 here

VI Additional Results

A Changes in the Estimation Procedure

A.1 Different Values for the Interval $I$

In the baseline specification, we set the interval $I$ for estimating the mispricing proportion equal to $[-0.4, 0.4]$. To examine if our results are sensitive to this choice, we re-compute the mispricing proportions for each interval $I$ in the set $\mathcal{I} = \{I = [-\lambda, \lambda];\ \lambda \in [0.15, 0.2, \dots, 0.65]\}$. Table AII shows the results for the entire population (Panel A)

$^{6}$ For simplicity, we set the mean of $\hat{t}_p$ and $\hat{t}_{p+j}$ equal to the same value. This assumption is motivated by the fact that the $t$-statistics are spatially ordered and thus likely to have similar means. Allowing for different means does not change the results.


and the three size groups (Panels B to D). We find that the estimated proportions remain

largely unchanged–for instance, the averages in the entire population range between

53.4% and 56.7%. This stability is consistent with the observations made by Barras,

Scaillet, and Wermers (2010) and Storey (2002).

Please insert Table AII here

A.2 Bootstrap Analysis

In the baseline specification, we rely on asymptotic theory to estimate the mispricing proportion in Equation (A14); that is, we assume that the $t$-statistics of correctly-priced portfolios follow a standard normal distribution $N(0, 1)$ in order to replace $G_0(I)$ with $\Phi_0(I)$. We now relax this assumption using the bootstrap approach of Efron (2010, ch. 2), in which the $t$-statistic of each portfolio is transformed into a statistic called the $z$-value. This transformation guarantees that the $z$-value of a correctly-priced portfolio is distributed as a normal $N(0, 1)$. Therefore, we can still use Equation (A14) to compute $\hat{\pi}_k$ provided that we use $z$-values instead of $t$-statistics.

To compute the $z$-value $\hat{z}_p$ of each portfolio $p$ ($p = 1, \dots, n$), we use the following procedure. First, we draw, for each bootstrap iteration $j$ ($j = 1, \dots, 1{,}000$), $T$ random observations from the original sample of risk factors and residuals to reconstruct the portfolio returns:

$$r_{p,t}(j) = a^{0}_{pk} + \hat{b}_{pm}\, r_{m,t}(j) + \hat{b}_{pk}' f_{k,t}(j) + \hat{b}_{po}' f_{o,t}(j) + \hat{e}_{p,t}(j), \tag{A49}$$

where we impose that the portfolio is correctly priced ($\alpha_{pk} = 0$) by setting

$$a^{0}_{pk} = -\hat{b}_{pk}' \hat{\phi}_k. \tag{A50}$$

Second, we re-estimate the portfolio $t$-statistic by regressing the bootstrapped returns on the bootstrapped factors, i.e.,

$$\hat{t}_p(j) = \frac{\hat{c}_k' \hat{\theta}_p(j)}{\left(\hat{c}_k' \hat{V}_{\theta,p}(j)\, \hat{c}_k\right)^{1/2}}, \tag{A51}$$

where $\hat{\theta}_p(j) = (\hat{a}_p(j), \hat{b}_p(j)')'$ and $\hat{V}_{\theta,p}(j)$ denote the bootstrapped coefficient vector and its covariance matrix. Third, we repeat the first two steps 1,000 times and compute the bootstrapped cumulative distribution function (cdf) associated with the original $t$-statistic $\hat{t}_p$ as

$$F_0(\hat{t}_p) = \frac{1}{1{,}000}\sum_{j=1}^{1{,}000} 1\{\hat{t}_p(j) \le \hat{t}_p\}. \tag{A52}$$

Finally, we obtain the $z$-value by inverting the quantile $F_0(\hat{t}_p)$ using the standard normal cdf $\Phi_0^{-1}$, i.e.,

$$\hat{z}_p = \Phi_0^{-1}\left(F_0(\hat{t}_p)\right). \tag{A53}$$
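The last two steps of the bootstrap procedure, Equations (A52) and (A53), can be sketched as follows. The helper name `z_value` is hypothetical, the bootstrapped $t$-statistics are replaced by a deterministic normal grid, and the clipping of the empirical cdf away from zero and one is an assumption added here so that the inverse normal cdf stays finite.

```python
from statistics import NormalDist


def z_value(t_original, t_bootstrap):
    """Map a t-statistic to a z-value through its bootstrapped null cdf."""
    n_boot = len(t_bootstrap)
    f0 = sum(t <= t_original for t in t_bootstrap) / n_boot
    # Keep the cdf strictly inside (0, 1); the source does not discuss
    # this edge case, so the bounds below are an assumption.
    f0 = min(max(f0, 0.5 / n_boot), 1.0 - 0.5 / n_boot)
    return NormalDist().inv_cdf(f0)


# Stand-in for 1,000 bootstrapped t-statistics under correct pricing.
boot = [NormalDist().inv_cdf((j + 0.5) / 1000) for j in range(1000)]
z_mid = z_value(1.0, boot)      # close to the original t-statistic
z_far = z_value(10.0, boot)     # beyond all bootstrap draws: clipped
```

When the bootstrapped null distribution is itself close to standard normal, the $z$-value nearly coincides with the original $t$-statistic, which is the situation the empirical results point to.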

The empirical results obtained with the bootstrap procedure are reported in Table AIII. The estimated proportions of mispriced portfolios remain largely unchanged. This result implies that the sample size is sufficiently large for the normal distribution to be a good approximation of the true $t$-statistic distribution.

Please insert Table AIII here

B Changes in the Portfolio Formation Procedure

B.1 Different Portfolio Sizes

In this section, we examine the sensitivity of the results to changes in the portfolio formation procedure. To begin, we decrease the number of stocks in each micro portfolio from 10 to 5 ($n = 5$). The results in Panel A of Table AIV are qualitatively similar, except that the mispricing proportions are generally lower. With only 5 stocks in each portfolio, the benefits of diversification are not fully exploited and the detection of mispriced portfolios becomes more difficult.

Next, Panel B reports the mispricing proportions for micro portfolios formed with 15 stocks ($n = 15$). Overall, the results remain similar to those documented in Table III. We also observe that the volatilities of the estimators are slightly higher because micro portfolios have a higher degree of overlap.

Please insert Table AIV here

B.2 Identical Stock Representation

Next, we tackle the issue of stock representation. While the vast majority of stocks are selected $n$ times in each formation year, some of them are included more or less often. Therefore, the baseline portfolio formation procedure could potentially overweight the importance of some stocks and underweight the importance of others.

To address this issue, we modify the formation procedure to guarantee that each stock is selected exactly $n$ times. For each formation year, we create the set of micro portfolios following the procedure described above and count the number of times $c_i$ that a given stock $i$ is included in different portfolios. If $c_i < n$, we include stock $i$ in $n - c_i$ additional portfolios with the nearest values to the average return of stock $i$. If $c_i > n$, we exclude stock $i$ from $c_i - n$ randomly selected portfolios.
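The counting step can be sketched as follows. This is a minimal illustration under simplifying assumptions: the helper names `form_micro_portfolios` and `representation_gaps` are hypothetical, ties in distance are broken by stock index, and the example returns are made up. A positive gap means the stock would have to be added to that many additional portfolios, and a negative gap means it would have to be removed from that many portfolios.

```python
from collections import Counter


def form_micro_portfolios(avg_returns, n):
    """For each stock, group it with the n - 1 stocks nearest in average return."""
    portfolios = []
    for i, mu_i in enumerate(avg_returns):
        others = [j for j in range(len(avg_returns)) if j != i]
        # Sort the other stocks by distance to stock i (ties broken by index).
        others.sort(key=lambda j: (abs(avg_returns[j] - mu_i), j))
        portfolios.append([i] + others[: n - 1])
    return portfolios


def representation_gaps(portfolios, n):
    """Return n minus the number of portfolios in which each stock appears."""
    counts = Counter(stock for members in portfolios for stock in members)
    return {stock: n - counts[stock] for stock in sorted(counts)}


avg_returns = [0.0, 1.0, 3.0, 6.0, 10.0]  # hypothetical estimated average returns
portfolios = form_micro_portfolios(avg_returns, n=2)
gaps = representation_gaps(portfolios, n=2)
```

Because each of the portfolios holds exactly n stocks, the gaps always sum to zero: every stock that is over-represented is offset by stocks that are under-represented.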

Table AV shows that the estimated mispricing proportions under this alternative

portfolio formation remain largely unchanged. The performance differences documented

in the paper are therefore not driven by variations in representation across stocks.

Please insert Table AV here

B.3 Alternative Set of Characteristics

We rebuild the cross-section of micro portfolios using nine different sets of characteristics for estimating average returns in Equation (A1). The first three specifications simply use each characteristic in isolation (book-to-market, investment, profitability). Specifications 4 and 5 keep our initial measures of investment and profitability but replace the book-to-market ratio with the earnings-to-price and cash flow-to-price ratios. Specifications 6 and 7 keep our initial definitions of book-to-market and profitability but measure investment using infrastructure growth and inventory growth (instead of growth in total assets). Specifications 8 and 9 keep our initial definitions of book-to-market and investment but measure profitability using return on equity (ROE) and return on assets (ROA) (instead of operating profitability).

To begin, we examine whether these alternative micro portfolios still produce gains

in power and a reduction in beta correlation. In Panel A of Table AVI, we confirm that

the interquartile spread in average returns, the median return volatility, and the median

number of observations are similar to those reported in Table I. In Panel B, we also

measure the independent variation in betas. For each factor included in the CAPM-

based models, we compute the beta residuals (the components orthogonal to the other

betas) and report the length of the 90%-interval spanned by these residuals. The results

show that the dispersion in betas is similar to that of Table II. The overall evidence

suggests that the alternative cross-sections of micro portfolios contain sufficient pricing

information to discriminate between models.
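The beta-residual measure described above can be sketched in plain Python. The sketch assumes that the residuals are obtained by projecting one beta on a constant and the other betas, and that the 90%-interval runs from the 5th to the 95th percentile; both choices, along with the function names, are assumptions made for illustration.

```python
from statistics import quantiles


def residualize(target, regressors):
    """Residuals of target after projecting on a constant and the regressors."""
    basis = []
    for column in [[1.0] * len(target)] + [list(r) for r in regressors]:
        # Modified Gram-Schmidt: remove the part spanned by earlier columns.
        for b in basis:
            coef = sum(x * y for x, y in zip(column, b)) / sum(y * y for y in b)
            column = [x - coef * y for x, y in zip(column, b)]
        if sum(x * x for x in column) > 1e-12:  # skip collinear columns
            basis.append(column)
    resid = list(target)
    for b in basis:
        coef = sum(x * y for x, y in zip(resid, b)) / sum(y * y for y in b)
        resid = [x - coef * y for x, y in zip(resid, b)]
    return resid


def interval_length_90(values):
    """Distance between the 5th and 95th percentiles (assumed definition)."""
    cuts = quantiles(values, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    return cuts[-1] - cuts[0]
```

A beta that is an exact linear function of the other betas has residuals of zero and therefore an interval length of zero, which is the case of no independent variation.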

Next, we estimate the mispricing proportions for the different sets of characteristics. The results in Table AVII reveal strong similarities with those reported in Table III. First, the average mispricing proportions (across the nine specifications) obtained with the standard CAPM remain high: they reach 72.2%, 57.5%, and 42.7% in the three size groups (74.1%, 60.5%, and 46.4% in the baseline case). Second, the human capital CAPM maintains its solid performance in all three size groups, e.g., in the tiny- and big-cap groups, the average mispricing proportions are equal to 31.6% and 12.5% (37.7% and 14.9% in the baseline case). Third, the conditional CAPM still performs well in the two largest size groups, where the average mispricing proportions are equal to 30.3% and 23.4% (27.3% and 17.9% in the baseline case). Fourth, the liquidity CAPM continues to price tiny-cap portfolios well, as the average mispricing proportion equals 33.7% (35.3% in the baseline case). Finally, the three characteristic-based models generally dominate the CAPM-based models and perform equally well except in the tiny-cap group.
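As a quick check of the averages quoted for the standard CAPM, the snippet below recomputes them from the nine specification columns of Table AVII. The dictionary layout and variable names are choices made for this example only.

```python
# Mispricing proportions (%) under the standard CAPM for the nine alternative
# specifications, copied from Table AVII (columns bm to prof2).
capm = {
    "tiny": [63.9, 73.5, 69.5, 74.7, 77.6, 73.7, 73.7, 70.9, 72.2],
    "small": [50.9, 50.0, 40.6, 62.2, 62.0, 64.3, 49.2, 69.8, 68.1],
    "big": [30.8, 49.1, 27.2, 44.7, 50.3, 36.3, 38.5, 55.0, 52.1],
}

# Equal-weighted average across specifications, rounded to one decimal.
averages = {group: round(sum(v) / len(v), 1) for group, v in capm.items()}
print(averages)
```

The rounded averages are 72.2, 57.5, and 42.7, matching the figures reported in the text and the Avg column of the table.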

Please insert Tables AVI and AVII here

C Traditional Performance Measure

C.1 Testing Procedure

In the paper, we advocate for the use of the mispricing proportion to evaluate models.

An alternative approach is to use a version of the traditional performance measures

proposed by Gagliardini, Ossola, and Scaillet (2016; GOS hereafter) which can be applied

in large cross-sections. This measure is defined as the sum of squared pricing errors,

=1

P=1(

)2, where is the true pricing error of portfolio .

The asset pricing test is based on the statistic c =

where = √(− 1

),

=1

P=1(

)2, is the estimated portfolio alpha, and 2

is the variance of

defined as

2 = 2 lim→∞

⎡⎣ 1

X=1

X=1

2 2

2

³002

−1

−1 2

´2⎤⎦ (A54)

where = 1

P=1 1 = 1

P=1 1 = 1

P=1 11,

= [1 0 ]0 = [

0] = [

0] and 2 is a ( + + 2) × ( + 1) matrix whose th

row ( = 1 3 4 + 1) has one for the th element and zeros everywhere else.

When the number of portfolios and the number of return observations grow large, Proposition 6 of GOS shows that, if the model is correctly specified, we have

c =

→ (0 1) (A55)


To estimate the variance term 2 we use the following consistent estimator

2 = 21

X=1

X=1

2 2

2

³002

−1

−1 −1

2

´2

(A56)

where =1

P=1 1

0, =

1

P=1 1

0

= [1 ]0, and is given by

Equation (A9). We also need to impose a sparsity condition on the terms ( =

1 ) in the double sum (see assumption A.4 in GOS). To this end, we use the

estimator = 1(¯¯

¯¯≥ ) where =

1

P=1 11

0 and is the

threshold parameter set equal to 0.067·(log()) 12 .

C.2 Empirical Results

Table AVIII reports the test statistic for the entire portfolio population and the three size groups. The results closely mirror those reported in Table III. First, the null hypothesis of correct specification is rejected in all but one case: when the five-factor model is tested on big-cap portfolios (the $p$-value is equal to 0.22). Second, the ranking of the CAPM-based models is the same in each size group. Third, the three characteristic-based models generally produce lower pricing errors than the CAPM-based models. Because the statistic is consistently lower for the five-factor model, it is tempting to say that it dominates the three- and $q$-factor models. However, we cannot make such claims without comparison tests, which have not yet been developed for large cross-sections.

Please insert Table AVIII here

D Comparison with Individual Stocks

D.1 Construction of the Sample

In this section, we evaluate the different models using individual stocks as test assets. As with micro portfolios, we classify stocks in three size groups (tiny-, small-, and big-cap). At the end of June each year, we partition all existing stocks using as breakpoints the 20th and 50th percentiles of the market capitalization for NYSE stocks. Then, we assign each stock to the size group in which it is most frequently observed. We also require that each individual stock has a minimum of 60 monthly return observations to compute its $t$-statistic. The resulting cross-section includes a total of 6,651 individual stocks (3,548 tiny-cap, 1,379 small-cap, 1,724 big-cap).
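The classification rule can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, frequency is counted over end-of-June assignments, a stock exactly at a breakpoint falls in the larger group, and ties in frequency are resolved by first occurrence. None of these details is specified in the text.

```python
from collections import Counter


def size_group(market_cap, nyse_p20, nyse_p50):
    """Assign a size group from NYSE breakpoints (boundary rule assumed)."""
    if market_cap < nyse_p20:
        return "tiny"
    if market_cap < nyse_p50:
        return "small"
    return "big"


def classify_stock(june_caps, june_breakpoints, n_monthly_returns, min_obs=60):
    """Return the most frequent size group, or None if the stock has too few returns."""
    if n_monthly_returns < min_obs:
        return None
    labels = [
        size_group(cap, p20, p50)
        for cap, (p20, p50) in zip(june_caps, june_breakpoints)
    ]
    # Most frequent group across the yearly assignments.
    return Counter(labels).most_common(1)[0][0]
```

For example, a stock with 72 monthly returns that is tiny-cap in one June and small-cap in the next two is assigned to the small-cap group, while the same stock with only 59 monthly returns is dropped.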


D.2 Mispricing Proportion

We begin our analysis by examining the proportion of mispriced stocks. Panel A of Table

AIX shows that the estimated proportions are significantly lower than those obtained

with micro portfolios, e.g., the averages in the tiny- and small-cap groups are equal

to 7.5% and 8.9% (versus 55.3% and 49.4% for micro portfolios). Coupled with high

estimation uncertainty, these low estimated proportions lead us to conclude that none

of the models are misspecified, i.e., we cannot reject the null hypothesis of correct

specification. Because the estimated pricing errors of individual stocks are too volatile,

we are unable to detect mispricing in the data.

D.3 Traditional Performance Measure

Alternatively, we can use the measure proposed by GOS. Because this measure aggregates pricing errors, it allows us to sidestep the challenge of detecting mispricing at the individual stock level. As shown in Panel B, the tests based on this measure indicate that the models are all misspecified in the entire stock population (the $p$-values are all equal to zero). This performance analysis is consistent with (i) the previous results of GOS, which show that commonly used models are strongly rejected at the individual stock level, and (ii) the performance evaluation obtained with micro portfolios (see Tables III and AIV).

There is no available testing procedure for comparing models based on this measure in large cross-sections. However, a casual observation of the estimated values reveals no striking performance differences across the models. A likely culprit for this result is the time-variation in individual stock betas as firms go through different business cycles and stages of development: the empirical evidence shows that this variation is significantly stronger for stocks than for portfolios (e.g., Andersen et al. (2006), Fama and French (1997)). The time-variation in betas introduces an additional source of misspecification (e.g., Jagannathan and Wang (1996)), which affects all time-invariant models. It can therefore smooth out the performance differences observed with micro portfolios.7

Please insert Table AIX here

7 We could explicitly specify the dynamics of the individual stock betas. However, time-varying models

are difficult to estimate because of the large number of parameters. In addition, Ghysels (1998) shows

that a wrong specification of time-varying betas may result in large pricing errors, possibly greater

than those produced by a constant-beta model. The evidence in GOS reveals that the time-varying

specifications of the tested models are also strongly rejected in the data.


References

[1] Andersen T. G., T. Bollerslev, F. X. Diebold, and G. Wu, 2006, Realized Beta: Persistence and Predictability, Advances in Econometrics: Econometric Analysis of Economic and Financial Time Series, Elsevier.

[2] Avramov D., and T. Chordia, 2006, Asset Pricing Models and Financial Market Anomalies, Review of Financial Studies 19, 1001-1038.

[3] Barras L., O. Scaillet, and R. Wermers, 2010, False Discoveries in Mutual Fund Performance: Measuring Luck in Estimated Alphas, Journal of Finance 65, 179-216.

[4] Berk J., 2000, Sorting Out Sorts, Journal of Finance 55, 407-427.

[5] Brennan M. J., T. Chordia, and A. Subrahmanyam, 1998, Alternative Factor Specifications, Security Characteristics and the Cross-Section of Expected Stock Returns, Journal of Financial Economics 49, 345-373.

[6] Efron B., 2010, Large-Scale Inference, Cambridge University Press.

[7] Fama E. F., and K. R. French, 1997, Industry Cost of Equity, Journal of Financial Economics 43, 153-193.

[8] Fama E. F., and K. R. French, 2006, Profitability, Investment, and Average Returns, Journal of Financial Economics 82, 491-518.

[9] Fama E. F., and K. R. French, 2008, Dissecting Anomalies, Journal of Finance 63, 1653-1678.

[10] Fama E. F., and K. R. French, 2015, A Five-Factor Asset Pricing Model, Journal of Financial Economics 116, 1-22.

[11] Farcomeni A., 2006, Some Results on the Control of the False Discovery Rate under Dependence, The Scandinavian Journal of Statistics 34, 275-297.

[12] Gagliardini P., E. Ossola, and O. Scaillet, 2016, Time-varying Risk Premium in Large Cross-sectional Equity Datasets, Econometrica 84, 985-1046.

[13] Genovese C., and L. Wasserman, 2004, A Stochastic Process Approach to False Discovery Control, Annals of Statistics 32, 1035-1061.

[14] Ghysels E., 1998, On Stable Factor Structures in the Pricing of Risk: Do Time-Varying Betas Help or Hurt?, Journal of Finance 53, 549-573.

[15] Hou K., C. Xue, and L. Zhang, 2015, Digesting Anomalies: An Investment Approach, Review of Financial Studies 28, 650-705.

[16] Jagannathan R., and Z. Wang, 1996, The Conditional CAPM and the Cross-Section of Expected Returns, Journal of Finance 51, 3-53.

[17] Newey W. K., and K. D. West, 1987, A Simple, Positive Semi-Definite, Heteroscedasticity and Autocorrelation Consistent Covariance Matrix, Econometrica 55, 703-708.

[18] Storey J. D., 2002, A Direct Approach to False Discovery Rates, Journal of the Royal Statistical Society 64, 479-498.

[19] White H., 1980, A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity, Econometrica 48, 817-838.


Table AI

Monte Carlo Analysis

Panel A reports the properties of the proportion estimators under the first scenario where there

is a large performance difference between the two misspecified models a and b. For the entire

population and each size group (tiny-, small-, and big-cap), the first column shows the average

values of the estimated proportions of mispriced portfolios for both models and their difference.

The second and third columns compare the true volatilities of the estimated proportions and their

difference with the estimated volatilities. The fourth and fifth columns show the coverage ratios

of the confidence intervals at 90% and 95% for the estimated proportions and their difference.

In Panel B, we repeat the analysis under the second scenario where the two models produce the

same moderate performance. The total number of iterations is equal to 5,000.

Panel A: Large Performance Difference

Volatility Confidence Interval

Mean True Estimated 90%-coverage 95%-coverage

All Portfolios

Model a 30.6 3.6 3.1 91.3 94.5

Model b 65.1 4.3 4.0 92.5 95.8

Difference -34.5 5.0 5.1 90.1 95.3

Tiny-Cap Portfolios

Model a 32.4 4.8 4.6 93.2 96.1

Model b 62.5 6.7 5.7 89.8 93.5

Difference -30.1 8.0 7.7 87.9 93.7

Small-Cap Portfolios

Model a 32.4 7.4 6.6 92.1 95.3

Model b 63.1 5.6 5.9 94.0 96.3

Difference -30.6 9.0 9.1 90.3 94.8

Big-Cap Portfolios

Model a 15.4 6.2 5.6 92.9 95.8

Model b 68.3 4.6 5.3 95.4 97.7

Difference -52.9 7.3 7.6 91.0 95.5

Panel B: No Performance Difference

Volatility Confidence Interval

Mean True Estimated 90%-coverage 95%-coverage

All Portfolios

Model a 40.0 2.9 2.9 94.1 96.9

Model b 40.0 2.9 2.9 94.1 96.7

Difference 0.0 4.0 3.8 88.0 93.4

Tiny-Cap Portfolios

Model a 35.9 4.4 3.9 92.0 93.5

Model b 35.7 4.4 3.8 90.1 95.3

Difference 0.2 5.9 5.3 85.2 91.3

Small-Cap Portfolios

Model a 33.3 6.5 5.8 90.0 93.3

Model b 33.0 6.3 5.8 92.7 95.3

Difference 0.3 9.5 8.2 84.4 90.3

Big-Cap Portfolios

Model a 34.9 5.3 5.2 93.1 96.1

Model b 35.0 5.1 5.3 94.7 97.2

Difference -0.1 7.4 7.2 88.7 93.6


Table AII

Performance Analysis with Different Intervals

Panel A reports, for the entire population, the estimated proportions of micro portfolios that are mispriced by the standard CAPM, the CAPM-based models (conditional, human capital, intertemporal, and liquidity CAPMs), and the characteristic-based models (three-factor, q-factor, and five-factor models) across the set of intervals $I = [-b, b]$ with bound $b \in \{0.15, 0.20, \dots, 0.65\}$. In Panels B to D, we repeat the analysis for the three size groups (tiny-, small-, and big-cap).

Panel A: All Portfolios

Interval Bound

0.15 0.20 0.25 0.3 0.35 0.40 0.45 0.50 0.55 0.60 0.65

CAPM 64.5 63.8 62.5 63.5 63.6 64.3 63.6 63.3 62.7 62.2 61.5

Conditional 41.2 42.5 42.1 42.5 42.9 43.4 43.7 42.5 41.0 41.0 40.8

Human Capital 36.8 36.1 37.3 36.9 36.2 35.1 35.6 35.8 35.6 35.5 34.4

Intertemporal 61.8 60.6 60.9 60.0 59.8 60.2 59.2 58.7 58.6 58.3 57.7

Liquidity 36.8 37.3 35.0 34.7 34.0 34.4 34.0 33.4 33.8 33.6 32.9

Average 48.2 48.1 47.6 47.5 47.2 47.4 47.2 46.7 46.4 46.1 45.4

3-factor Model 34.8 36.8 38.0 38.6 37.5 36.4 36.2 36.4 35.9 35.3 35.0

q-factor Model 48.9 45.0 45.1 45.8 45.0 45.3 43.3 41.8 41.8 41.8 41.4

5-factor Model 32.9 30.7 29.6 30.8 30.4 30.3 29.1 29.2 29.0 29.7 29.2

Average 38.8 37.5 37.5 38.3 37.6 37.3 36.2 35.8 35.5 35.5 35.2

Panel B: Tiny-Cap Portfolios

Interval Bound

0.15 0.20 0.25 0.3 0.35 0.40 0.45 0.50 0.55 0.60 0.65

CAPM 76.1 74.9 73.9 74.2 73.6 74.1 74.1 73.7 73.4 73.1 72.1

Conditional 58.7 63.2 63.2 61.3 62.6 61.5 60.9 60.5 60.1 58.9 58.3

Human Capital 37.3 37.8 38.0 38.8 39.2 37.7 39.4 39.0 38.3 38.5 36.4

Intertemporal 72.4 70.5 70.2 70.1 69.0 68.0 67.6 67.7 67.0 65.6 64.9

Liquidity 37.2 37.2 38.2 37.7 36.0 35.3 35.2 36.1 30.0 35.4 35.4

Average 56.3 56.7 56.7 56.4 56.1 55.3 55.5 55.4 55.0 54.3 53.4

3-factor Model 55.8 58.0 58.0 55.8 54.5 53.5 51.7 50.9 49.9 48.5 47.9

q-factor Model 72.8 71.1 69.8 69.4 68.6 68.7 67.4 66.1 65.8 66.0 66.1

5-factor Model 47.1 46.5 45.7 47.0 45.3 43.5 43.1 44.0 43.2 44.0 43.0

Average 58.5 58.5 57.8 57.4 56.1 55.3 54.0 53.6 53.0 52.8 52.3

Panel C: Small-Cap Portfolios

Interval Bound

0.15 0.20 0.25 0.3 0.35 0.40 0.45 0.50 0.55 0.60 0.65

CAPM 63.3 61.0 60.5 60.2 59.4 60.5 59.1 58.8 57.1 57.7 58.6

Conditional CAPM 32.0 30.7 31.4 30.8 29.1 27.3 26.0 27.3 27.3 28.7 28.0

Human Capital CAPM 57.0 51.5 53.0 51.1 47.0 45.1 47.1 45.9 44.3 44.5 45.6

Intertemporal CAPM 62.4 67.7 70.3 68.8 68.0 67.0 65.9 64.6 63.2 62.7 63.0

Liquidity CAPM 57.0 54.8 52.9 49.3 46.5 47.1 46.2 44.5 44.0 45.6 43.5

Average 54.3 53.1 53.6 52.0 50.0 49.4 48.8 48.2 47.1 47.8 47.7

3-factor Model 23.0 19.9 18.4 19.9 19.7 18.3 20.1 22.0 21.3 23.2 23.5

q-factor Model 30.2 22.7 23.9 24.1 21.3 22.8 22.0 20.9 20.6 19.2 19.0

5-factor Model 12.3 15.8 14.6 18.5 22.0 23.4 21.6 19.5 19.8 20.6 20.0

Average 24.8 19.5 18.9 20.8 21.0 21.5 21.3 20.8 20.6 20.0 20.8

Panel D: Big-Cap Portfolios

Interval Bound

0.15 0.20 0.25 0.3 0.35 0.40 0.45 0.50 0.55 0.60 0.65

CAPM 40.7 42.4 39.6 43.5 45.6 46.4 44.8 45.0 44.7 42.7 41.4

Conditional CAPM 12.5 16.6 20.3 18.3 16.6 17.9 18.5 17.6 17.7 18.1 17.5

Human Capital CAPM 2.3 7.2 9.0 11.6 13.2 14.9 11.8 13.0 12.4 12.4 10.8

Intertemporal CAPM 62.5 63.6 60.9 57.0 56.4 57.2 55.5 55.0 53.4 51.5 51.2

Liquidity CAPM 29.7 28.9 27.8 31.2 30.9 30.5 28.9 28.2 27.1 26.8 26.9

Average 29.1 31.7 31.5 32.3 32.5 33.3 31.9 31.8 31.1 30.3 29.6

3-factor Model 8.1 6.0 12.3 17.9 16.6 15.2 16.9 17.8 18.6 17.3 17.3

q-factor Model 5.5 8.4 10.4 13.9 14.9 14.6 10.2 7.6 8.4 9.3 7.9

5-factor Model 14.5 9.5 8.0 6.4 5.8 7.7 5.6 5.9 6.6 7.0 7.3

Average 8.7 8.0 10.2 12.7 12.4 12.5 10.9 10.4 11.2 11.1 10.8


Table AIII

Performance Analysis with the Bootstrap

This table reports the estimated proportions of micro portfolios that are mispriced by the stan-

dard CAPM, the CAPM-based models (conditional, human capital, intertemporal, and liquidity

CAPMs), and the characteristic-based models (three-factor, q-factor, and five-factor models) for

the entire portfolio population (All) and the three size groups (tiny-, small-, and big-cap) using

a bootstrap approach. Figures in parentheses denote the estimated volatilities of the proportion

estimates.

Size Groups

All Tiny-cap Small-cap Big-cap

CAPM 62.9 (2.3) 72.8 (5.7) 57.7 (5.0) 43.3 (4.4)

Conditional CAPM 41.3 (1.9) 60.0 (4.8) 26.2 (4.8) 14.3 (5.7)

Human Capital CAPM 34.8 (3.2) 36.2 (5.7) 45.1 (5.9) 8.9 (5.6)

Intertemporal CAPM 58.4 (2.0) 67.8 (4.2) 66.3 (5.1) 54.2 (4.2)

Liquidity CAPM 32.2 (2.4) 33.6 (6.0) 45.7 (5.4) 28.4 (5.4)

Average 45.9 54.1 48.2 29.8

3-factor Model 35.4 (2.8) 52.1 (5.7) 16.6 (6.6) 14.0 (5.4)

q-factor Model 42.6 (3.0) 67.4 (6.1) 22.4 (4.5) 11.3 (5.2)

5-factor Model 27.4 (2.4) 43.7 (5.5) 19.3 (5.5) 4.7 (5.3)

Average 35.1 54.4 19.4 10.0


Table AIV

Performance Analysis with Different Portfolio Sizes

Panel A reports the estimated proportions of micro portfolios that are mispriced by the standard CAPM, the CAPM-based models (conditional, human capital, intertemporal, and liquidity CAPMs), and the characteristic-based models (three-factor, q-factor, and five-factor models) for the entire portfolio population (All) and the three size groups (tiny-, small-, and big-cap) using five stocks (n=5). Figures in parentheses denote the estimated volatilities of the proportion estimates. In Panel B, we repeat the analysis using micro portfolios made up of 15 stocks (n=15).

Panel A: Five Stocks

Size Groups

All Tiny-cap Small-cap Big-cap

CAPM 48.3 (2.2) 58.8 (5.6) 41.6 (5.2) 31.7 (4.7)




Liquidity CAPM 36.1 (2.3) 42.8 (4.2) 39.6 (4.9) 19.7 (5.0)

Average 38.9 47.1 37.9 25.0

3-factor Model 28.3 (2.5) 34.9 (4.2) 26.8 (4.8) 15.5 (5.4)

q-factor Model 32.4 (2.4) 51.3 (5.2) 13.2 (4.8) 8.3 (5.0)

5-factor Model 24.3 (2.2) 36.7 (3.8) 24.8 (4.8) 0.0 (4.9)

Average 28.4 41.0 21.6 8.0

Panel B: Fifteen Stocks

Size Groups

All Tiny-cap Small-cap Big-cap

CAPM 71.2 (2.5) 78.3 (5.7) 70.8 (4.6) 56.3 (5.1)




Liquidity CAPM 34.0 (2.7) 32.0 (5.5) 57.7 (5.7) 22.7 (6.2)

Average 51.3 59.6 56.5 36.1

3-factor Model 44.0 (2.9) 63.6 (6.0) 24.8 (7.2) 18.2 (6.0)

q-factor Model 51.3 (3.0) 73.2 (6.1) 23.9 (5.0) 10.7 (4.5)

5-factor Model 34.4 (2.2) 55.5 (6.7) 17.5 (7.1) 3.8 (5.0)

Average 41.8 64.1 22.1 10.9


Table AV

Performance Analysis with Equal Stock Representation

This table reports the estimated proportions of micro portfolios that are mispriced by the standard CAPM, the CAPM-based models (conditional, human capital, intertemporal, and liquidity CAPMs), and the characteristic-based models (three-factor, q-factor, and five-factor models) for the entire portfolio population (All) and the three size groups (tiny-, small-, and big-cap) using a portfolio formation procedure in which all stocks are selected the same number of times in each formation year. Figures in parentheses denote the estimated volatilities of the proportion estimates.

Size Groups

All Tiny-cap Small-cap Big-cap

CAPM 62.6 (2.2) 71.6 (5.5) 60.5 (4.5) 44.8 (5.4)




Liquidity CAPM 35.4 (2.5) 41.5 (5.7) 48.8 (5.3) 29.6 (5.0)

Average 47.5 55.8 50.9 31.2

3-factor Model 35.2 (2.5) 45.7 (6.5) 26.5 (6.6) 20.0 (5.7)

q-factor Model 47.4 (2.7) 68.6 (5.7) 24.5 (4.7) 14.0 (4.8)

5-factor Model 27.6 (2.9) 43.9 (6.9) 14.5 (5.6) 3.8 (4.4)

Average 36.7 52.7 21.8 12.6


Table AVI

Properties of Micro Portfolio with Different Sets of Characteristics

Panel A reports the interquartile spread in average returns (annualized), the median return

volatility (annualized), and the median number of return observations across micro portfolios

for the following sets of characteristics: (i) three cases where book-to-market (bm), investment,

and profitability are used in isolation (bm, inv, prof); (ii) two cases where bm is replaced with

the earnings/price and cash flows/price ratios (bm1, bm2); (iii) two cases where investment is

measured with infrastructure and inventory growth (inv1, inv2); (iv) two cases where profitability

is measured with the return on equity and the return on assets (prof1, prof2); (v) the baseline

case used in the paper (Base). Panel B measures the dispersion in the portfolio betas on each

risk factor specific to the CAPM-based models (conditional, human capital, intertemporal, and

liquidity CAPMs) for the different sets of characteristics. Each column measures the length of

the 90%-interval of the beta residuals (the components orthogonal to the other factor betas).

Panel A: Portfolio Returns

Set of Characteristics

bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base

Median Obs. 384 390 384 366 378 384 402 372 372 381 390

Return Spread 7.5 7.2 6.3 6.5 6.6 7.4 6.6 7.1 7.4 7.0 7.3

Median Vol. 29.7 29.7 29.2 26.5 27.0 29.6 29.7 27.9 27.9 28.6 30.2

Panel B: Independent Variation in Betas

Set of Characteristics

bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base

Conditional 1.91 1.90 1.80 1.35 1.40 1.80 1.60 2.28 2.21 1.81 1.75

Human Capital 1.35 1.44 1.49 1.10 1.17 1.47 1.28 1.66 1.67 1.40 1.41

Intertemporal 2.25 2.56 2.29 1.27 1.32 2.31 1.67 3.23 3.15 2.22 2.56

Liquidity 1.25 1.25 1.32 1.09 1.1 1.24 1.14 1.40 1.39 1.24 1.26

Average 1.70 1.79 1.73 1.21 1.25 1.71 1.43 2.1 2.1 1.67 1.75


Table AVII

Performance Analysis with Different Sets of Characteristics

Panel A reports, for the entire population, the estimated proportions of micro portfolios that

are mispriced by the standard CAPM, the CAPM-based models (conditional, human capital, in-

tertemporal, and liquidity CAPMs), and the characteristic-based models (three-factor, q-factor,

and five-factor models) for different sets of characteristics. The specifications are the following:

(i) three cases where book-to-market (bm), investment, and profitability are used in isolation

(bm, inv, prof); (ii) two cases where bm is replaced with the earnings/price and cash flows/price

ratios (bm1, bm2); (iii) two cases where investment is measured with infrastructure and inven-

tory growth (inv1, inv2); (iv) two cases where profitability is measured with the return on equity

and the return on assets (prof1, prof2); (v) the baseline case used in the paper (Base). In Panels

B to D, we repeat the analysis for the three size groups (tiny-, small- and big-cap).

Panel A: All Portfolios

Set of Characteristics

bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base

CAPM 52.9 62.3 52.8 63.6 66.6 62.4 59.8 66.5 66.0 61.4 64.3

Conditional 31.9 47.3 31.5 37.7 40.2 50.3 39.1 60.8 60.1 44.3 43.4

Human Capital 29.7 27.0 15.4 38.1 37.5 39.7 28.6 43.7 33.1 31.4 35.1

Intertemporal 38.6 57.9 35.0 46.9 47.7 55.5 53.6 56.8 56.1 49.8 60.2

Liquidity 21.2 36.0 22.6 29.9 18.9 32.4 24.1 48.6 52.4 31.8 34.4

Average 34.8 46.1 31.5 43.2 42.2 48.0 41.1 55.3 53.5 44.0 47.5

3-factor Model 26.2 35.7 15.1 38.2 33.8 36.6 33.9 34.2 39.8 32.6 36.4

q-factor Model 38.9 43.7 44.9 36.3 34.4 44.3 39.3 40.1 43.5 40.6 45.3

5-factor Model 27.1 33.7 18.8 26.5 26.4 31.7 31.3 27.2 29.8 28.1 30.3

Average 30.8 37.7 26.3 33.7 31.6 37.6 34.9 33.9 37.8 33.8 37.3

Panel B: Tiny-Cap Portfolios

Set of Characteristics

bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base

CAPM 63.9 73.5 69.5 74.7 77.6 73.7 73.7 70.9 72.2 72.2 74.1

Conditional 42.4 70.0 47.1 46.8 39.2 64.5 46.9 71.6 70.8 55.5 61.5

Human Capital 45.2 26.6 14.2 48.2 45.3 36.6 22.4 40.7 28.6 31.6 37.7

Intertemporal 44.0 63.1 33.1 37.4 32.7 54.5 53.4 57.8 58.7 48.3 68.0

Liquidity 24.1 35.8 29.7 30.3 16.9 34.6 19.8 57.0 55.1 33.7 35.3

Average 43.9 53.8 38.7 47.5 42.3 52.8 43.2 59.7 57.1 48.8 55.3

3-factor Model 38.9 52.4 20.5 52.6 50.4 49.1 40.5 48.6 49.2 44.7 53.5

q-factor Model 63.5 67.8 69.6 61.4 66.8 73.5 64.3 67.5 65.8 66.7 68.7

5-factor Model 39.5 49.1 28.3 43.1 41.7 47.1 37.4 43.5 44.8 41.6 43.5

Average 47.3 56.5 39.5 52.4 53.0 56.6 47.4 53.2 53.3 51.0 55.2

Panel C: Small-Cap Portfolios

Set of Characteristics

bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base

CAPM 50.9 50.0 40.6 62.2 62.0 64.3 49.2 69.8 68.1 57.5 60.5

Conditional 21.8 23.2 13.6 27.0 32.8 48.4 33.2 40.2 32.8 30.3 27.3

Human Capital 22.1 27.6 40.0 18.1 40.8 59.8 52.1 68.1 54.1 42.5 45.1

Intertemporal 50.6 55.1 50.0 47.4 54.8 61.9 54.2 69.4 66.7 56.6 67.0

Liquidity 32.8 42.7 19.4 49.9 36.4 54.9 37.3 55.9 61.1 43.4 47.1

Average 35.6 39.7 32.7 40.9 45.4 57.8 45.2 60.7 56.5 46.1 49.4

3-factor Model 17.0 18.0 7.8 32.1 14.4 27.3 32.8 25.8 30.8 22.9 18.3

q-factor Model 9.8 22.8 13.6 20.6 16.9 9.3 13.4 17.2 21.1 14.3 22.8

5-factor Model 18.0 22.1 5.3 20.6 12.9 17.2 26.2 14.1 17.5 17.1 23.4

Average 15.0 21.0 8.9 24.5 9.7 17.9 24.1 19.1 23.1 18.1 21.5

Panel D: Big-Cap Portfolios

Set of Characteristics

bm inv prof bm1 bm2 inv1 inv2 prof1 prof2 Avg Base

CAPM 30.8 49.1 27.2 44.7 50.3 36.3 38.5 55.0 52.1 42.7 46.4

Conditional 10.0 20.4 8.4 29.0 33.4 20.5 21.2 32.0 35.7 23.4 17.9

Human Capital 5.0 10.8 10.0 14.0 10.2 16.9 17.8 18.0 10.3 12.5 14.9

Intertemporal 32.2 61.3 27.3 44.7 53.3 47.5 48.2 57.7 59.4 48.0 57.2

Liquidity 28.3 33.9 11.1 41.6 28.8 23.9 19.0 27.0 45.7 28.9 30.5

Average 21.3 35.1 16.8 34.8 35.2 29.0 29.0 38.0 40.7 31.1 33.3

3-factor Model 15.7 15.3 10.0 17.5 19.5 18.1 20.1 13.0 29.3 17.6 15.3

q-factor Model 10.9 10.2 18.6 4.2 2.6 11.8 7.0 5.6 18.8 9.9 14.6

5-factor Model 7.9 11.1 9.9 13.9 9.5 11.2 22.4 6.2 11.1 10.0 7.7

Average 11.5 12.2 12.8 7.7 10.5 13.7 16.5 8.3 19.8 12.6 12.5


Table AVIII

Performance Analysis with the Traditional Measure

This table reports the traditional performance measure proposed by Gagliardini, Ossola, and

Scaillet (2016) under the standard CAPM, the CAPM-based models (conditional, human cap-

ital, intertemporal, and liquidity CAPMs), and the characteristic-based models (three-factor,

q-factor, and five-factor models) for the entire portfolio population (All) and the three size

groups (tiny-, small-, and big-cap). The performance measure is defined as the sum of squared

pricing errors and provides a valid inference in large cross-sections. Figures in parentheses denote

the p-values under the null hypothesis that the model is correctly specified.

Size Groups

All Tiny-cap Small-cap Big-cap

CAPM 146.0 (0.0) 131.8 (0.0) 62.6 (0.0) 40.3 (0.0)




Liquidity CAPM 76.3 (0.0) 59.8 (0.0) 51.4 (0.0) 25.7 (0.0)

Average 102.6 84.9 53.6 28.8

3-factor Model 56.2 (0.0) 48.1 (0.0) 29.8 (0.0) 14.0 (0.0)

q-factor Model 72.7 (0.0) 88.6 (0.0) 9.2 (0.0) 5.4 (0.0)

5-factor Model 5.5 (0.0) 6.6 (0.0) 3.5 (0.0) 0.7 (0.23)

Average 44.8 47.8 14.1 6.7


Table AIX

Performance Analysis with Individual Stocks

Panel A reports the estimated proportions of individual stocks that are mispriced by the standard CAPM, the CAPM-based models (conditional, human capital, intertemporal, and liquidity CAPMs), and the characteristic-based models (three-factor, q-factor, and five-factor models) for the entire portfolio population (All) and the three size groups (tiny-, small-, and big-cap). Figures in parentheses denote the estimated volatilities of the proportion estimates. Panel B conducts the same analysis using the traditional performance measure proposed by Gagliardini, Ossola, and Scaillet (2016). Figures in parentheses denote the p-values under the null hypothesis that the model is correctly specified.

Panel A: Proportion of Mispriced Stocks

Size Groups

All Tiny-cap Small-cap Big-cap

CAPM 14.7 (15.5) 8.5 (21.7) 8.3 (17.8) 32.8 (14.3)




Liquidity CAPM 14.4 (15.6) 7.2 (21.8) 9.3 (17.8) 34.0 (14.2)

Average 14.3 7.5 9.0 32.2

3-factor Model 5.7 (16.0) 3.2 (21.9) 0.0 (18.4) 20.7 (14.2)

q-factor Model 14.3 (16.0) 3.8 (22.0) 0.0 (18.1) 13.1 (15.4)

5-factor Model 7.2 (16.0) 3.7 (21.9) 2.8 (18.1) 18.0 (15.1)

Average 6.1 3.6 1.0 17.2

Panel B: Traditional Performance Measure

Size Groups

All Tiny-cap Small-cap Big-cap

CAPM 15.1 (0.0) 3.7 (0.0) 4.5 (0.0) 20.3 (0.0)




Liquidity CAPM 15.2 (0.0) 3.6 (0.0) 4.8 (0.0) 21.2 (0.0)

Average 15.0 3.7 4.5 20.2

3-factor Model 12.0 (0.0) 3.9 (0.00) 0.5 (0.30) 17.4 (0.0)

q-factor Model 9.2 (0.0) 3.7 (0.0) 0.4 (0.34) 12.5 (0.0)

5-factor Model 6.0 (0.0) 3.6 (0.00) 1.4 (0.08) 8.2 (0.23)

Average 9.1 3.8 0.8 12.7


Figure A1

Forming the Cross-Section of Micro Portfolios

This figure illustrates the procedure for forming the set of micro portfolios sorted based on average returns using a hypothetical population of 50 individual stocks and a two-year sample period. Each dot represents the estimated average return taken by each stock (S1, S2,...). For each stock, the procedure consists of forming an equally-weighted portfolio that includes the stock itself and 9 additional stocks with the nearest estimated average returns. Then, the portfolio returns are chained across years 1 and 2 to maintain a stable average return over time. This procedure yields a cross-section of 50 micro portfolios ranging from P(Return 1) to P(Return 50).

[Figure: individual stocks (e.g., S10, S42, S3, S18, S6, S46) ordered from low to high estimated average return in year 1 (Return 1, ..., Return 25, ..., Return 50) and in year 2 (Return 1, ..., Return 25, ..., Return 50). Returns are chained across the two years to form the micro portfolios P(Return 1), ..., P(Return 25), ..., P(Return 50).]


Figure A2

Overlapping versus Non-overlapping Portfolios

The four panels plot the pairwise correlation between the indicator functions $1\{t_j^k \in I\}$ and $1\{t_{j+d}^k \in I\}$ and the pairwise correlation between the t-statistics $t_j^k$ and $t_{j+d}^k$ for different mean values for the average t-statistic ($\mu_{ts}$ = 0, 0.5, 1, 1.5). The interval $I$ is set equal to [-0.4, 0.4]. If the correlation between indicator functions is lower than the correlation between t-statistics, using overlapping instead of non-overlapping portfolios brings efficiency gains (i.e., it reduces the variance of the estimated mispricing proportion).

[Figure: four panels, each plotting the correlation between indicators (vertical axis, 0 to 1) against the correlation between t-statistics (horizontal axis, 0 to 1). Panels: (a) t-statistic mean of 0; (b) t-statistic mean of 0.5; (c) t-statistic mean of 1; (d) t-statistic mean of 1.5.]
