
Cahier de recherche 2018-03

Estimating panel data fixed and random effects with application to the new Fama-French model using GMM robust instruments

François-Éric Racicot

Telfer School of Management, University of Ottawa, Ottawa, ON K1N 6N5, Canada. Affiliate Research Fellow, IPAG Business School, Paris, France

Chaire d’information financière et organisationnelle, ESG-UQAM

E-mail: [email protected]

William F. Rentz

Telfer School of Management, University of Ottawa, Ottawa, ON K1N 6N5, Canada

Raymond Théoret

École des sciences de la gestion, Université du Québec à Montréal

Chaire d’information financière et organisationnelle, ESG-UQAM

May 9, 2018


Abstract

We investigate the five-factor Fama-French (2015) model using a GMM robust instrumental variables technique

comparing panel data fixed and random effects approaches. We rely on an improved Hausman artificial regression

to test for measurement errors. We also study a six-factor model that adds the Pástor-Stambaugh (2003) illiquidity

risk factor. While the fixed effects model is the most used in practice, we find that the random effects model is the

most appropriate for our data sample. Our fixed and random effects panel data approaches using

robust instrumental variables strongly suggest that the only consistently significant factor is the market risk factor.

Keywords: fixed and random effects; GMM; higher moment instruments; illiquidity; Fama-French five-factor model

JEL classification: C10; G11

Robust GMM estimation of fixed and random effects in panel data:

The case of the new Fama-French model

Summary

We investigate the new five-factor model of Fama and French (FF, 2015, 2016), augmented with a well-known liquidity measure (Pástor and Stambaugh, 2003), using a modification of GMM that relies on robust instruments, within a panel data analysis. When we use the OLS estimator, our Fama-French model appears to have explanatory power for the returns of a twelve-sector portfolio. However, our panel study suggests that the only significant factor is the market risk premium, which leads us to reject the augmented Fama-French model. Depending on the technique used, we find that measurement errors may be the source of this result, which would tend to support the augmented Fama-French model.

Keywords: fixed and random effects; GMM; robust instruments; illiquidity; Fama-French five-factor model.

JEL classification: C10; G11.


1. Introduction

Sharpe (1964), Lintner (1965), and Mossin (1966) developed what is known as the capital asset

pricing model (CAPM). Jensen (1968) is credited with the development of alpha ($\alpha$), which he

applied to investigate the performance of mutual funds via the CAPM. Black (1972) extended

the theory of the CAPM to what is known as the zero-beta CAPM. Collectively, these ideas form

the basis of modern portfolio management and equity valuation. They have been widely

implemented over the last 50 years by academics and practitioners. There were many attempts to

extend the CAPM to a dynamic framework, such as the intertemporal CAPM (Merton, 1973) and

the consumption CAPM due to Hansen and Singleton (1982, 1984). Later Mehra and Prescott

(1985) studied the consumption CAPM to further investigate what is known as the equity

premium puzzle1. Nevertheless, it appears that the most appealing extension of the CAPM is the

static Fama and French (FF, 1992, 1993) three-factor asset pricing model that is akin to the

arbitrage pricing theory (APT) of Ross (1976). In addition to the excess market return factor, the

FF three-factor model includes size and value factors.

According to Cochrane (2011, p.1061), there is a “zoo of new variables”. In our study,

we will restrict the factors to the original FF three factors, the profitability and investment

factors recently introduced by FF (2015), and the Pástor-Stambaugh (PS, 2003) illiquidity factor.

These risk factors appear to be the most widely recognized factors explaining the cost of equity2.

All of these factors may be replicated by portfolios. If these portfolio risk factors do not span the

whole space of the unknown state factors, then specification errors could occur. Furthermore, as

noted by FF (2015, p. 2), the book/market ratio “is a noisy proxy for expected return”, which

implies potential measurement errors.

In addition to potential specification errors, some of the explanatory variables may be

highly interrelated. Cochrane (1991, 2011) used a modified version of Tobin’s (1969) Q theory

to show a link between asset prices and investment. Cochrane’s link can be modified to express a

relation between expected returns and investment3. Since Cochrane’s Q is approximated by the

market/book ratio, the FF value and investment factors are likely to be highly related.

1 For a summary of these developments, see Campbell et al. (1997) or Cochrane (2005, 2008). Note that Hansen and Singleton (1984) also previously found the equity premium puzzle.

2 See Pinto et al. (2015, ch. 2).

3 See Hou, Xue, and Zhang (2015).

The original illiquidity factor of Pástor-Stambaugh (2003) is an example of what is

considered a generated variable because it is a parameter obtained from a regression, in this case

relating stock return to its trading volume. Note that there is a portfolio version of this variable

which is the one we use. This portfolio is long in illiquid stocks and short in liquid stocks.

However, this portfolio factor is statistically indistinguishable from its original version. Although

the OLS estimator remains unbiased, generated variables can increase the variance of the OLS

estimator according to Pagan (1984, 1986)4 and Shanken (1992)5. Thus, the resulting inference

may be biased. Furthermore, Adrian et al. (2017) argue that traditional illiquidity measures are

endogenous variables, which therefore results in biased coefficients using OLS.

A powerful solution to the problems of specification and measurement errors is the

generalized method of moments (GMM) developed by Hansen (1982). However, the usefulness

of this method is questionable in the presence of weak instruments. Nelson and Startz (1990a,b);

Bound, Jaeger, and Baker (1995); and Hahn and Hausman (2003) show that the two-stage least

squares (2SLS) estimator is inconsistent when instrumental variables are weak.

Dagenais and Dagenais (1994, 1997) develop a method that creates instruments with

greater robustness. These robust instruments are generated through a Bayesian averaging

approach originally developed by Theil and Goldberger (1961). This approach employs

generalized versions of the Durbin (1954) and Pal (1980) higher moment estimators. The two

principal features of this approach are i) it is parsimonious in the sense that it requires minimal

computational power and ii) it essentially minimizes a distance (d) measure. Based on this

distance notion, we refer to this approach as GMMd.

This article develops an empirical extension of Racicot (2015) that generalizes the GMMd

approach to a fixed and random effects panel data framework. In addition, we allow not only for

the Jensen performance measure to vary across individuals (sectors) but also the systematic

risk measure to vary6. This generalization enables us to i) evaluate the robustness of the new

five-factor FF (2015) model and ii) compare this model to a six-factor model that incorporates

4 Pagan and Ullah (1988), however, find that when estimating a regression using a generated variance regressor (e.g., from GARCH), the resulting estimator will be biased.

5 In the two-pass regression approach, the second step uses estimated betas. These betas may be considered as generated variables. Shanken (1992) showed that the standard error from this two-step approach should be corrected. This result appears analogous to Pagan (1984, 1986).

6 However, note that a seemingly unrelated regression (SUR) procedure would have been more appropriate here since, in our applications, we do not have enough cross-sectional units relative to the time range.

the PS (2003) illiquidity risk factor. This empirical framework allows us to provide some new

insights on the effects of unobserved heterogeneity in panel data models that may compound

measurement errors if not tackled properly. One approach to removing unobserved heterogeneity

is to rely on first-differencing. In fact, this may actually worsen the situation. Arellano (2003)7

showed that it is only by chance that the method of first-differencing in a panel data framework

will diminish measurement errors.

Fama and MacBeth (1973) introduced a process for estimating cross-sectional regressions

and standard errors correcting for cross-sectional correlation in a panel data framework.

Cochrane (2005, p.245) showed that when the right-hand side variables are invariant through

time, the Fama-MacBeth results are equivalent to (i) the pooled regression, (ii) cross-section OLS

with standard errors corrected for cross-sectional correlations, and (iii) single cross-sectional

regression on time series averages with standard errors corrected for cross-sectional correlations.

Shanken (1992) proposed a way to correct the bias in the estimation process for the standard

errors caused by the two-pass regression approach. However, as Cochrane (2005) points out, one

way to tackle all of these problems is to use the more powerful GMM approach. One of the

virtues of our proposed generalized GMMd panel data framework is a systematic treatment of the

previous specification errors including the problem of measurement errors. To the best of our

knowledge, we are the first to use panel data for both fixed and random effects models for

estimating factor risk premiums using our new GMMd approach.

In this paper, we find, using the Jarque-Bera (1980) statistic, that the return data for the

FF 12 sectors, the FF 5 portfolio risk factors and for the PS portfolio illiquidity factor all depart

significantly from normality. In general, our results show that using OLS in panel data for fixed

or random effects models, most of the new FF portfolio risk factors are significant although the

PS portfolio illiquidity is not. However, when using the GMMd approach, we obtain a different

picture, viz., the only strongly significant risk factor is the market factor and the illiquidity factor

is weakly significant for the pooled GMMd (fixed effects). We also find significant measurement

errors for the new FF investment factor and for the PS illiquidity factor relying on our modified

artificial regression Hausman (1978) test (Hausd).

7 Dagenais (1994) showed that when pseudo-differencing is used to correct for autocorrelation, as in the iterative method of Cochrane and Orcutt (1949), the problem of measurement error is exacerbated.

The remainder of this article is organized as follows. Section 2 introduces an extension of

the basic fixed and random effects panel data framework in the context of errors in variables in

the new Fama-French (FF, 2015) five-factor model and the six-factor model that includes Pástor-

Stambaugh (PS, 2003) illiquidity. Section 3 incorporates the GMMd approach into the panel data

framework. Section 4 discusses our Hausd test for measurement errors. Section 5 lays out the

testing procedures for random versus fixed effects models. Section 6 interprets some descriptive

statistics of the data used in this paper. Section 7 presents our empirical results. Section 8

discusses our conclusions and suggestions for further research.

2. The Fixed and Random Effects Fama-French Models

2.1 The five- and six- risk factor models8

In two influential papers, Fama and French (FF, 1992, 1993) introduced their three-factor asset

pricing model. Their idea was to improve on the explanatory power of observed equity returns.

The capital asset pricing model (CAPM) developed by Sharpe (1964), Lintner (1965), and Mossin

(1966) is known to have only modest explanatory power for individual equity returns9. Fama and

French improved the cost of equity calculation by adding the size $SMB_t$ and value $HML_t$ factors

to the CAPM excess market return factor $R_{Mt} - R_{Ft}$ to create the following three-factor model.

$$R_{it} - R_{Ft} = a_i + b_i (R_{Mt} - R_{Ft}) + s_i\,SMB_t + h_i\,HML_t + e_{it} \qquad (1)$$

$SMB_t$ is the difference in returns in period t of a diversified small cap portfolio and a diversified

large (i.e. big) cap portfolio. Note that this differential return also may be proxied by computing

the difference in return of the Russell 2000 and the S&P 500 index. $HML_t$ is the difference in

returns in period t of a high book-to-market portfolio and a low book-to-market portfolio.

To further refine their model, FF (2015) introduced two additional factors, the

profitability factor $RMW_t$ and the investment factor $CMA_t$, to create the following five-factor

model10.

$$R_{it} - R_{Ft} = a_i + b_i (R_{Mt} - R_{Ft}) + s_i\,SMB_t + h_i\,HML_t + r_i\,RMW_t + c_i\,CMA_t + e_{it} \qquad (2)$$

8 Note that some authors have recently considered other factors instead of illiquidity, like the momentum factor (see Barillas and Shanken, 2015). This factor is, however, not new and is well documented in the literature (see Carhart, 1997).

9 Several authors (e.g. Benninga, 2014) show that the explanatory power of the CAPM substantially improves when applied to a portfolio of equities.

10 The data for the five FF factors and sector returns are available from http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

The profitability factor $RMW_t$ is the difference in returns in period t of diversified portfolios of

stocks with robust and weak profitability. The investment factor $CMA_t$ is the difference in returns

in period t of diversified portfolios of conservative and aggressive firms with respect to

investment behavior.

As a starting point for justifying these new factors, FF (2015) examined the market value

per share $m_t$, which is the discounted value at time t of the expected dividends per share $E(d_{t+\tau})$,

where $\tau = 1, \ldots, \infty$ and r is the cost of equity.

$$m_t = \sum_{\tau=1}^{\infty} \frac{E(d_{t+\tau})}{(1+r)^{\tau}} \qquad (3)$$

As FF explain, (3) can be manipulated to extract the relation between expected return and

expected profitability and between expected return and investment.

FF’s approach follows Miller and Modigliani’s (MM, 1961) approach, albeit with slightly

different notation. MM used the following expression to value the firm $V(0)$ at time 0,

$$V(0) = \sum_{t=0}^{\infty} \frac{1}{(1+\rho)^{t+1}} \left[ X(t) - I(t) \right] \qquad (4)$$

where I(t) is the investment at time t, X(t) is the total net profitability at time t, and $\rho$ is the

discount rate. Note that in (4), investment and profitability are explicitly considered.

Essentially FF generalized (4) to be at any time t, explicitly considered the expectation

operator $E(\cdot)$, and divided both sides of (4) by the book value $B_t$ of the firm at time t to obtain

(5), which illustrates why book-to-market or B/M ratios are related to the rate of return of a

financial asset.

$$\frac{M_t}{B_t} = \frac{\sum_{\tau=1}^{\infty} E(NI_{t+\tau} - \Delta B_{t+\tau}) / (1+r)^{\tau}}{B_t} \qquad (5)$$

$NI_{t+\tau}$ is the net income for period $t+\tau$, $\Delta B_{t+\tau} = B_{t+\tau} - B_{t+\tau-1}$ is the change in total book value

of equity, and r is the return on the financial asset. The change in book value of equity for any

period is the investment (disinvestment if negative) when a firm is all equity financed.

Macroeconomists would call this the change in capital, viz. $K_t - K_{t-1} = B_t - B_{t-1} = I(t)$ for the

period t. Note that (5) is also a proxy for Tobin’s (1969) Q, which is the market value of installed

capital divided by its replacement cost.

A firm should invest more when its marginal Q is high11 in order to maximize

shareholder wealth. As the firm invests more, it will move down its investment opportunity

schedule until marginal benefit equals marginal cost. Hence, higher investment will drive down

the firm’s rate of return. Hou et al. (2015) used this argument to develop the following equation

for stock i at time t,

$$E_t[r_{it}] = \frac{E_t[\Pi_{it+1}]}{1 + a\,(I_{it}/A_{it})} \qquad (6)$$

where $E_t[r_{it}]$ and $E_t[\Pi_{it+1}]$ are the conditional expected return and profitability, respectively; a

is a parameter for adjustment costs; $I_{it}$ is the investment; and $A_{it}$ are the firm’s productive assets.

This model is based on Lin and Zhang’s (2013) stochastic general equilibrium model in a two-period

setting, where the rate of return on investment is equated to the firm’s discount rate or

cost of capital. Equation (6) provides a rationale for the factors $RMW_t$ and $CMA_t$ in (2).

Using Bellman’s (1957) equation of dynamic programming, Abel (1983)12 related

investment It to Tobin’s Qt13, the interest rate r, and the elasticity of investment as shown in

(7).

1

1

1t

t

QI with

r

(7)

To be more specific, Abel proposed a simple model of investment where a firm undertakes to

accumulate (reduce) its capital stock in order to maximize its discounted net revenues subject to

the constraint of a Cobb-Douglas (1938) production function and to an uncertain future price for

its product or services14. Note that (7) is consistent with (6) in the sense that investment increases

with Q and is inversely related to r.

11 The marginal Q is the NPV of future cash flows generated from an additional unit of assets. Note that (6) is derived by equating the marginal benefit to marginal cost.

12 See also Chow (1997), who also discussed this model.

13 Tobin’s Q is the expected marginal revenue product of capital.

14 Instead of using the cumbersome approach of dynamic programming, Chow (1997) showed how this problem can be transformed into a simple Lagrange optimization problem.

Pástor and Stambaugh (2003) introduced a liquidity factor $LIQ_t$ to the original Fama and

French (1992) three-factor model. The Pástor-Stambaugh liquidity factor may be viewed as a

generated variable. $LIQ_t$ is an average of the stock-level coefficients $\gamma_{it}$ obtained from regression (8).

$$r_{i,d+1,t} - r_{m,d+1,t} = \theta_{it} + \phi_{it}\, r_{idt} + \gamma_{it}\,\mathrm{sign}(r_{idt} - r_{mdt})\, v_{idt} + \varepsilon_{i,d+1,t} \qquad (8)$$

where $r_{idt}$ is the return of stock i on day d in month t, $r_{mdt}$ is the corresponding market return, and $v_{idt}$ is the dollar trading volume of stock

i on day d in month t. Pagan (1984, 1986)15 shows that generated variables may increase the

variance of the OLS estimator but the estimator remains unbiased. In this paper, we compare (2)

with an augmented version of this equation that includes liquidity as a sixth factor16.

2.2 Fixed Effects Model17

We extend the model in (2) to a fixed effects panel data framework including the LIQ factor in

(9) below, written in a stacked vector format for the 12 FF sectors.

$$Y = R - R_F = \sum_{i=1}^{12} \alpha_i D_i + \sum_{i=1}^{12} \beta_i D_i (R_M - R_F) + s\,SMB + h\,HML + r\,RMW + c\,CMA + l\,LIQ + e \qquad (9)$$

$Y' = (R_{1,1} - R_{F1}, \ldots, R_{1,T} - R_{FT}, \ldots, R_{12,1} - R_{F1}, \ldots, R_{12,T} - R_{FT})$ represents the transpose of the

stacked vector Y of excess returns for each sector. $D_i' = (0, \ldots, 0, 1, \ldots, 1, 0, \ldots, 0)$ is the transpose

of the stacked dummy variable, which is 0 everywhere except for the T observations for sector i.

$\alpha_i$ is the Jensen (1968) performance measure for sector i.

$(R_M - R_F)' = (R_{M1} - R_{F1}, \ldots, R_{MT} - R_{FT}, \ldots, R_{M1} - R_{F1}, \ldots, R_{MT} - R_{FT})$ is the transpose of the

stacked vector of excess market returns. That is, the excess market returns are stacked 12 times, once for

each sector. $\beta_i$ is the sector i CAPM systematic risk beta. The other explanatory variables are

similarly defined. The coefficients of these other variables are 12-sector pooled coefficients. e is

the stacked vector of error terms.

For the fixed effects (FE) model, we will need the estimate of the variance-covariance

matrix. One way to proceed is by transforming the model into its deviation from the time means.

Consider first the basic LSDV model:

$$Y = D\alpha + X\beta + e \qquad (10)$$

15 See also Pagan and Ullah (1988) and Shanken (1992) for more information on related matters.

16 The LIQ factor is available from Pástor’s website, http://faculty.chicagobooth.edu/lubos.pastor/research/. We use the tradable LIQ factor and multiply it by 100 to put it in percentage form.

17 See Heij et al. (2004) for a parsimonious introduction to the panel data framework with EViews applications.

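Following Wooldridge (2002), we can transform (10) into its deviation form by subtracting the

time mean from both sides (i.e., $\ddot{y}_{it} = y_{it} - \bar{y}_i$, $\ddot{x}_{it} = x_{it} - \bar{x}_i$ and $\ddot{e}_{it} = e_{it} - \bar{e}_i$) to obtain:

$$\ddot{Y} = \ddot{X}\beta + \ddot{e} \qquad (11)$$

The fixed effects estimator is obtained by applying OLS to (11):

$$\hat{\beta}_{FE} = (\ddot{X}'\ddot{X})^{-1}\ddot{X}'\ddot{Y} \qquad (12)$$

The variance-covariance matrix of this estimator is therefore identical to the OLS one except for

the fact that it is in deviation form,

$$\hat{V}_{FE}(\hat{\beta}_{FE} \mid X) = \hat{\sigma}_e^2 (\ddot{X}'\ddot{X})^{-1} \qquad (13)$$

where $\hat{\sigma}_e^2 = \hat{\ddot{e}}'\hat{\ddot{e}}/(NT - N - k)$ and $\hat{\ddot{e}}_{it} = \ddot{y}_{it} - \ddot{x}_{it}'\hat{\beta}_{FE}$.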
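As an illustration of (11) to (13), the within transformation can be sketched in a few lines of Python with NumPy. The function names (`within_estimator`, `simulate_panel`) and the simulated data are illustrative assumptions and are not part of the original study; the sketch also assumes integer unit labels and no constant among the regressors.

```python
import numpy as np


def within_estimator(y, X, groups):
    """Fixed effects (within) estimator in the spirit of (11)-(13).

    y: (n,) responses; X: (n, k) regressors without a constant;
    groups: (n,) integer unit labels 0, ..., N-1 (hypothetical layout).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    n_groups = int(groups.max()) + 1
    counts = np.bincount(groups, minlength=n_groups)

    # Unit-specific time means of y and of each column of X.
    y_bar = np.bincount(groups, weights=y, minlength=n_groups) / counts
    X_bar = np.column_stack([
        np.bincount(groups, weights=X[:, j], minlength=n_groups) / counts
        for j in range(X.shape[1])
    ])

    # Deviations from the time means (the "double-dot" variables).
    y_dd = y - y_bar[groups]
    X_dd = X - X_bar[groups]

    xtx_inv = np.linalg.inv(X_dd.T @ X_dd)
    beta_fe = xtx_inv @ (X_dd.T @ y_dd)

    resid = y_dd - X_dd @ beta_fe
    dof = y.size - n_groups - X.shape[1]  # NT - N - k
    sigma2 = resid @ resid / dof
    return beta_fe, sigma2 * xtx_inv, sigma2


def simulate_panel(n_groups=12, n_periods=200, seed=0):
    """Simulated panel whose regressors are correlated with the unit effects."""
    rng = np.random.default_rng(seed)
    groups = np.repeat(np.arange(n_groups), n_periods)
    alpha = rng.normal(size=n_groups)
    X = rng.normal(size=(groups.size, 2)) + alpha[groups][:, None]
    beta = np.array([1.0, -0.5])
    y = alpha[groups] + X @ beta + rng.normal(scale=0.5, size=groups.size)
    return y, X, groups, beta


y, X, groups, beta_true = simulate_panel()
beta_fe, cov_fe, sigma2_hat = within_estimator(y, X, groups)
print(beta_fe, np.sqrt(np.diag(cov_fe)))
```

On such simulated data, the slope estimates should coincide with those of the LSDV regression (10) on the unit dummies, which is the Frisch-Waugh result underlying the transformation.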

2.3 Random Effects Model18

To introduce the standard random effects model, we begin with the classic model where only the

constant term is allowed to vary randomly. We then progress to the general random parameters

model where all the parameters are allowed to vary randomly.

In the case of the standard random effects model, a generalized reformulation of (9) for

sector i at time t yields

$$y_{it} = x_{it}'\beta + (\alpha + u_i) + \varepsilon_{it} \qquad (14)$$

where $x_{it}'\beta$ is the product of the explanatory variables and the vector of coefficients in (9) and

$u_i = z_i'\alpha - E(z_i'\alpha)$ is the random heterogeneity of the ith sector added to $\alpha$, the constant term. $u_i$

may be viewed as a set of factors for the ith sector, $z_i'\alpha$, that are not in the regression and are

specific to that sector. Note that one way to remove this heterogeneity is by transforming the

model to deviations from the group mean (Baltagi, 2001; Arellano, 2003; Greene, 2012), that is

$$y_{it} - \bar{y}_i = (u_i - u_i) + (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i) = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i) \qquad (15)$$

where $\bar{y}_i = (\alpha + u_i) + \bar{x}_i'\beta + \bar{\varepsilon}_i$, i=1,…, N. In this setting, the LSDV estimator is a consistent

estimator of $\beta$. This approach has the virtue of being robust to specification errors19. However,

the approach, while instructive, is, like the OLS estimator, not efficient. An efficient GLS estimator exists

and this is the preferred method to estimate (14). The GLS estimator (Greene, 2012) is given by

$$\hat{\beta} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y = \left(\sum_{i=1}^{N} X_i'\Sigma^{-1}X_i\right)^{-1}\left(\sum_{i=1}^{N} X_i'\Sigma^{-1}y_i\right) \qquad (16)$$

18 We follow the presentation of Greene (2012, 2015).

19 That is, if we wrongly choose the random or the fixed effects model, the LSDV estimator remains consistent.

where $\Omega = (I_N \otimes \Sigma)$, $\Sigma^{-1/2} = \frac{1}{\sigma_\varepsilon}\left(I - \frac{\theta}{T}\, i_T i_T'\right)$, and $\theta = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_u^2}}$. The transformation of the

dependent and explanatory variables used for the GLS is obtained by multiplying these variables by

$\Sigma^{-1/2}$. The GLS estimator can be shown to be, like the pooled OLS, a weighted average (matrix)

of the within and between-units estimators (Greene, 2012):

$$\hat{\beta} = \hat{F}^{within}\, b^{within} + \hat{F}^{between}\, b^{between} \qquad (17)$$

where $\hat{F}^{between} = I - \hat{F}^{within}$, $\hat{F}^{within} = (S_{xx}^{within} + \lambda S_{xx}^{between})^{-1} S_{xx}^{within}$, and $\lambda = (1 - \theta)^2 = \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\sigma_u^2}$.

From this, it can be seen that the inefficiency of ordinary least squares, that is when

$\lambda = 1$, comes from the fact that it puts too much weight on the between-unit variation compared to

the GLS estimator. In practice, with $\Sigma$ rarely known, one remedy is to rely on an estimated

version of (10) to remove heterogeneity and obtain an estimator of $\sigma_\varepsilon^2$ (Greene, 2012)

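$$\hat{\sigma}_\varepsilon^2 = \hat{\sigma}_{LSDV}^2 = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T} (\hat{\varepsilon}_{it} - \bar{\hat{\varepsilon}}_i)^2}{NT - N - K} \qquad (18)$$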

where $\hat{\varepsilon}_{it}$ and $\bar{\hat{\varepsilon}}_i$ are, respectively, the estimated residuals and the mean over the time period of

the estimated residuals, both obtained from (14). An estimate of $\sigma_u^2$ can be obtained by first

applying OLS to the pooled model (that is, $y = X\beta + \varepsilon$, where all the data are stacked) to obtain

$\hat{\sigma}_{Pooled}^2$ and then computing $\hat{\sigma}_u^2 = \hat{\sigma}_{Pooled}^2 - \hat{\sigma}_{LSDV}^2$ 20. The estimator $\hat{\beta}$ in (16) is unbiased and

consistent, provided matrix X is uncorrelated with the errors. Otherwise, as in the case of the fixed

effects, there is no guarantee that the errors in the explanatory variables and the unobserved

heterogeneity would cancel out. For that matter, Arellano (2003) uses a simple cross-sectional

setting to show how this attenuation of the bias21 may happen by chance.

2.4 Generalized random effects model

20 See Greene (2012, pp. 375-376) for potential problems in implementing this approach.

21 Not to be confused with the well-known bias due to errors in variables, called attenuation, which results in an estimator that tends to 0.

Note that we are using a generalized version of the fixed effects model in (9)22. Thus, to be

comparable, we will use a generalized version of the random effects model where all the

parameters are allowed to vary randomly. This version of the model can be written as follows:

$$y_i = X_i \beta_i + e_i \qquad (19)$$

with $X_i$ a matrix of observations of dimension T x k, $\beta_i = \beta + v_i$ where $v_i$ is a vector of random

effects of dimension k x 1 with $E(v_i \mid X_i) = 0$, $E(v_i v_i' \mid X_i) = \Gamma$, and $e_i$ is a vector of random errors

of dimension T x 1. Thus, we can see that the $\beta_i$ for an FF sector is the result of a random

process with mean vector $\beta$ and covariance matrix $\Gamma$.

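We can rewrite (2) in the format of the general random effects model as follows

$$R_i - R_F = \begin{bmatrix} \mathbf{1} & R_M - R_F & SMB & HML & RMW & CMA \end{bmatrix} \begin{bmatrix} a + v_{1i} \\ b + v_{2i} \\ s + v_{3i} \\ h + v_{4i} \\ r + v_{5i} \\ c + v_{6i} \end{bmatrix} + e_i \qquad (20)$$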

where $R_i$, $R_F$, $R_M$, SMB, HML, RMW, CMA, and $e_i$ are vectors of dimension T x 1 and $v_{ki}$

(k=1,…,6) is a random variable. Note that we are assuming that all of the parameters are random,

not just the intercept term and the coefficient for the excess market return factor. The random

effects model simplifies considerably if we assume that there is no

autocorrelation or cross-section correlation in $e_i$. Following Swamy (1970) and Greene (2012),

we apply GLS to (19)23 to obtain

$$\hat{\beta} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y = \sum_{i=1}^{12} W_i^* b_i \qquad (21)$$

which is a simple weighted average of the OLS $b_i$. Note that in this equation, $\Omega$ has to be

estimated. An empirical estimate of $W_i^*$ in (21) is needed to implement this model. $W_i^*$ in its

theoretical form is given by

$$W_i^* = \left[\sum_{i=1}^{N} \left(\Gamma + \sigma_{e_i}^2 (X_i'X_i)^{-1}\right)^{-1}\right]^{-1} \left(\Gamma + \sigma_{e_i}^2 (X_i'X_i)^{-1}\right)^{-1} \qquad (22)$$

22 In this paper we are not considering the more modern version of the random effects model that is referred to as the hierarchical model. See Greene (2012) for a discussion.

23 This is equivalent to (20) using the variables in our model.

where N = 12 is the number of FF sectors. To estimate $W_i^*$, Swamy (1970) estimated $\Gamma$ using

the empirical variance of a set of N least squares estimates for the vector $b_i$ minus the average

value of $s_i^2 (X_i'X_i)^{-1}$, viz.

$$\hat{\Gamma} = \frac{1}{N-1}\left[\sum_{i=1}^{N} b_i b_i' - N\,\bar{b}\,\bar{b}'\right] - \frac{1}{N}\sum_{i=1}^{N} V_i$$

where $\bar{b} = \frac{1}{N}\sum_{i=1}^{N} b_i$ and $V_i = s_i^2 (X_i'X_i)^{-1}$.

In summary, we obtain an estimate of the vector $\beta$ given by the weighted

average vector $\hat{\beta}$ with an estimator of the covariance matrix $\Gamma$ given by $\hat{\Gamma}$.

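We can write the empirical version of (21) for the random effects (RE) model by

substituting $\hat{\Gamma}$ for $\Gamma$ and $s_i^2 (X_i'X_i)^{-1}$ for $\sigma_{e_i}^2 (X_i'X_i)^{-1}$ (i.e., $\hat{\Omega}$ for $\Omega$, which implies substituting

$\hat{W}_i^*$ for $W_i^*$) to obtain the feasible generalized least squares (FGLS) estimator

$$\hat{\beta}_{RE} = (X'\hat{\Omega}^{-1}X)^{-1}X'\hat{\Omega}^{-1}y = \sum_{i=1}^{12} \hat{W}_i^* b_i \qquad (23)$$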

The asymptotic RE variance-covariance matrix of (23) is given by the standard GLS one

(Wooldridge, 2002):

$$\hat{V}_{RE}(\hat{\beta}) = (X'\hat{\Omega}^{-1}X)^{-1} \qquad (24)$$

which translates empirically to (Swamy, 1970):

$$\hat{V}_{RE}(\hat{\beta}) = \left[\sum_{i=1}^{N} \left(\hat{\Gamma} + s_i^2 (X_i'X_i)^{-1}\right)^{-1}\right]^{-1} \qquad (25)$$

As can be seen, the estimated variance-covariance matrix of the random effects model is

simply given by the first part of (22).
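The steps leading to (23) and (25) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `swamy_random_coefficients` and the simulated data are hypothetical, and no correction is applied if $\hat{\Gamma}$ fails to be positive definite in a small sample (one of the implementation problems noted in the literature).

```python
import numpy as np


def swamy_random_coefficients(y_list, X_list):
    """Random coefficients estimator in the spirit of Swamy (1970).

    y_list[i]: (T,) responses and X_list[i]: (T, k) regressors for unit i.
    Returns (beta_re, cov_re, weights, gamma_hat).
    """
    N = len(y_list)
    k = X_list[0].shape[1]
    b = np.empty((N, k))          # unit-by-unit OLS estimates b_i
    V = np.empty((N, k, k))       # V_i = s_i^2 (X_i'X_i)^{-1}
    for i, (y_i, X_i) in enumerate(zip(y_list, X_list)):
        xtx_inv = np.linalg.inv(X_i.T @ X_i)
        b[i] = xtx_inv @ (X_i.T @ y_i)
        e_i = y_i - X_i @ b[i]
        s2_i = e_i @ e_i / (X_i.shape[0] - k)
        V[i] = s2_i * xtx_inv

    # Gamma_hat: empirical variance of the b_i minus the average V_i.
    dev = b - b.mean(axis=0)
    gamma_hat = dev.T @ dev / (N - 1) - V.mean(axis=0)

    # Weights as in (22) with Gamma_hat and s_i^2 substituted; covariance as in (25).
    A = np.array([np.linalg.inv(gamma_hat + V[i]) for i in range(N)])
    cov_re = np.linalg.inv(A.sum(axis=0))
    weights = np.array([cov_re @ A[i] for i in range(N)])

    # Matrix-weighted average of the b_i, as in (23).
    beta_re = np.einsum("ijk,ik->j", weights, b)
    return beta_re, cov_re, weights, gamma_hat


rng = np.random.default_rng(2)
N, T = 12, 200
beta_true = np.array([0.1, 1.0])      # mean intercept and mean slope
y_list, X_list = [], []
for _ in range(N):
    X_i = np.column_stack([np.ones(T), rng.normal(size=T)])
    beta_i = beta_true + rng.normal(scale=0.5, size=2)   # beta_i = beta + v_i
    y_list.append(X_i @ beta_i + rng.normal(size=T))
    X_list.append(X_i)

beta_re, cov_re, weights, gamma_hat = swamy_random_coefficients(y_list, X_list)
print(beta_re, np.sqrt(np.diag(cov_re)))
```

By construction the weight matrices sum to the identity, which is a convenient sanity check on any implementation of (22).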

Our purpose here is to propose a parsimonious approach to tackle measurement errors or

the endogeneity of the explanatory variables X based on the generalized method of moments. This method

has the virtue of freeing the analyst from having to choose between one instrument and another.

As the literature has well established (e.g., Anderson and Rubin, 1949, 1950; Dufour, 2003;

Nelson and Startz, 1990a,b; Hahn and Hausman, 2002, 2003; Stock and Yogo, 2005; Hausman,

Stock and Yogo, 2005; Stock and Watson, 2011; Greene, 2012; Olea and Pflueger, 2013), weak

instruments present a perverse problem. Choosing the wrong instruments may result in

increasing the problem one hoped to confront in the first place. That is, it may transform the

estimator into a biased and inconsistent one. For example, it may bias the two-stage least squares

estimator toward the OLS24. Also, it will render the basic framework for statistical inference

inappropriate (Nelson and Startz, 1990a,b; Hahn and Hausman, 2003).

3. The GMMd approach in the panel data framework

3.1 Standard instrumental variables approach – differencing required

Before discussing our proposed methodology to tackle the endogeneity introduced by

measurement errors, we first show the standard instrumental variables approach (Hausman and

Taylor, 1981; Arellano and Bond, 1991). In this framework, some sort of differencing is required

either from their group mean or first differencing.

Assume that we have the following equation to estimate using panel data

$$y_{it} = x_{it}'\beta + \varepsilon_{it} \qquad (26)$$

where no fixed or random effects are shown explicitly and no transformation has been applied.

We also assume there are errors in the explanatory variables, which take the following form:

$$x_{it} = \tilde{x}_{it} + v_{it} \qquad (27)$$

where $\tilde{x}_{it}$ is the vector of unobserved explanatory variables, measured with errors $v_{it}$. Applying the first

difference approach to (26) yields25:

$$y_{it} - y_{it-1} = (x_{it} - x_{it-1})'\beta + \varepsilon_{it} - \varepsilon_{it-1} = (x_{it} - x_{it-1})'\beta + \Delta\varepsilon_{it} \qquad (28)$$

The first step in this framework is to apply 2SLS to (28) assuming a matrix of instrumental

variables Z, resulting in (Greene, 2012, 2015)26:

$$\hat{\beta}_{2SLS} = \left[\left(\sum_{i=1}^{N} X_i'Z_i\right)\left(\sum_{i=1}^{N} Z_i'Z_i\right)^{-1}\left(\sum_{i=1}^{N} Z_i'X_i\right)\right]^{-1}\left[\left(\sum_{i=1}^{N} X_i'Z_i\right)\left(\sum_{i=1}^{N} Z_i'Z_i\right)^{-1}\left(\sum_{i=1}^{N} Z_i'y_i\right)\right] \qquad (29)$$

where $X_i$ is a T x k matrix as in (15) and $Z_i$ is a matrix of instruments.

24 The OLS estimator is biased and inconsistent in that context.

25 There are generally three common approaches to deal with heterogeneity: first differencing, group demeaning, and for the fixed effects model, the least squares dummy variable approach.

26 See also Arellano (2003) for a presentation of the GMM in a panel data context.

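The second step consists of forming the weighting matrix ($\hat{w}$) for the GMM estimator based on

the estimated residuals of (28):

$$\hat{w} = \frac{1}{N^2}\sum_{i=1}^{N} Z_i'\hat{\varepsilon}_i\hat{\varepsilon}_i'Z_i \qquad (30)$$

Substituting (30) in the criterion for GMM estimation gives

$$q = \left(\frac{1}{N}\sum_{i=1}^{N} \hat{\varepsilon}_i'Z_i\right)\hat{w}^{-1}\left(\frac{1}{N}\sum_{i=1}^{N} Z_i'\hat{\varepsilon}_i\right) \qquad (31)$$

Finally, minimizing (31) with respect to the parameters $\beta$ yields

$$\hat{\theta}_{GMM} = \left[\left(\sum_{i=1}^{N} X_i'Z_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} Z_i'X_i\right)\right]^{-1}\left[\left(\sum_{i=1}^{N} X_i'Z_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} Z_i'y_i\right)\right] \qquad (32)$$

where the asymptotic variance-covariance matrix of the estimator $\hat{\theta}_{GMM}$ is given by:

$$\text{Est.Asy.}V[\hat{\theta}_{GMM}] = \left[\left(\sum_{i=1}^{N} X_i'Z_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} Z_i'X_i\right)\right]^{-1} \qquad (33)$$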
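The two-step procedure in (29) to (33) can be sketched in Python as follows. The sketch is illustrative only: the function name `two_step_gmm` and the simulated errors-in-variables data are assumptions, the inputs are taken to be already transformed (for example first-differenced), and the covariance is computed with the moment sums divided by N so that its scale is consistent with the $1/N^2$ factor in the weighting matrix, a normalization that leaves the point estimate unchanged.

```python
import numpy as np


def two_step_gmm(y_list, X_list, Z_list):
    """Two-step panel GMM: 2SLS first step, then an optimally weighted step.

    y_list[i]: (T,), X_list[i]: (T, k), Z_list[i]: (T, L) for unit i.
    Returns (b_2sls, b_gmm, cov_gmm).
    """
    N = len(y_list)
    Sxz = sum(X.T @ Z for X, Z in zip(X_list, Z_list))    # k x L
    Szz = sum(Z.T @ Z for Z in Z_list)                    # L x L
    Szy = sum(Z.T @ y for Z, y in zip(Z_list, y_list))    # L

    # Step 1: 2SLS, as in (29).
    Szz_inv = np.linalg.inv(Szz)
    b_2sls = np.linalg.solve(Sxz @ Szz_inv @ Sxz.T, Sxz @ Szz_inv @ Szy)

    # Step 2: weighting matrix from first-step residuals, as in (30).
    w = np.zeros_like(Szz)
    for y, X, Z in zip(y_list, X_list, Z_list):
        g = Z.T @ (y - X @ b_2sls)     # Z_i' e_i
        w += np.outer(g, g)            # Z_i' e_i e_i' Z_i
    w /= N ** 2
    w_inv = np.linalg.inv(w)

    # GMM estimator, as in (32), written with moment means (sums / N).
    Sxz_bar, Szy_bar = Sxz / N, Szy / N
    A = Sxz_bar @ w_inv @ Sxz_bar.T
    b_gmm = np.linalg.solve(A, Sxz_bar @ w_inv @ Szy_bar)
    return b_2sls, b_gmm, np.linalg.inv(A)


rng = np.random.default_rng(3)
N, T = 200, 20
beta_true = 1.0
y_list, X_list, Z_list = [], [], []
for _ in range(N):
    Z_i = rng.normal(size=(T, 2))                          # instruments
    x_true = Z_i @ np.array([1.0, 1.0]) + rng.normal(size=T)
    x_obs = x_true + rng.normal(scale=1.5, size=T)         # measurement error
    y_list.append(beta_true * x_true + rng.normal(size=T))
    X_list.append(x_obs[:, None])
    Z_list.append(Z_i)

b_2sls, b_gmm, cov_gmm = two_step_gmm(y_list, X_list, Z_list)
x_all, y_all = np.vstack(X_list)[:, 0], np.concatenate(y_list)
b_ols = (x_all @ y_all) / (x_all @ x_all)                  # attenuated benchmark
print(b_ols, b_2sls, b_gmm, np.sqrt(np.diag(cov_gmm)))
```

Because $\hat{w}$ is a sum of N rank-one matrices, it can only be inverted when the number of cross-sectional units is at least as large as the number of instruments, which is a practical constraint when N is small.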

A parsimonious instrumental variables approach – no differencing required

Turning to our parsimonious approach, it has the virtue of not requiring any type of differencing in

the context of our panel data approach to the five-factor FF (2015) model. In this context, our estimator

(Racicot, 2015) is obtained by replacing the Z instruments with our new robust instruments, which

can be qualified as strong instruments. We describe below our GMMd approach in the context of

panel data fixed effects and generalized random effects models.

3.2 GMMd fixed effects model

The GMM estimator $\hat{\theta}_{GMM_d}$ for estimating the fixed effects panel data regression models is

given by

$$\hat{\theta}_{GMM_d} = \left[\left(\sum_{i=1}^{N} X_i'd_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} d_i'X_i\right)\right]^{-1}\left[\left(\sum_{i=1}^{N} X_i'd_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} d_i'Y_i\right)\right] \qquad (34)$$

where $d_i = x_i - \hat{x}_i$ is a vector of robust “distance” instruments. These new instruments (the d

“distance” instruments) can be computed using a matrix-weighted average by applying GLS to

a combination of two robust estimators, namely the Durbin (1954) and Pal (1980) estimators.
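To make the role of the higher moments concrete, the following Python sketch computes estimators of the Durbin and Pal type, whose multivariate forms are numbered (35) and (36), on simulated data with skewed latent regressors and symmetric measurement errors. It is a hedged illustration rather than the procedure of Racicot (2015): the function name `higher_moment_estimators` is hypothetical, and the third-moment instrument is assumed to take the form $z_2 = x^3 - 3x \odot \mathrm{diag}(x'x/N)$ applied column by column, in the manner of Dagenais and Dagenais (1997).

```python
import numpy as np


def higher_moment_estimators(y, X):
    """Durbin-type and Pal-type higher-moment IV estimators.

    y: (n,) responses; X: (n, k) observed (error-ridden) regressors.
    Both are demeaned internally. Returns (beta_durbin, beta_pal).
    """
    x = X - X.mean(axis=0)
    y_c = y - y.mean()
    n = x.shape[0]

    z1 = x ** 2                                   # second powers
    z3 = x ** 3                                   # third powers
    second_moments = np.diag(x.T @ x / n)         # (k,) own second moments
    z2 = z3 - 3.0 * x * second_moments            # assumed cumulant correction

    beta_durbin = np.linalg.solve(z1.T @ x, z1.T @ y_c)
    beta_pal = np.linalg.solve(z2.T @ x, z2.T @ y_c)
    return beta_durbin, beta_pal


rng = np.random.default_rng(4)
n = 200_000
beta_true = np.array([1.0, -0.5])
x_true = rng.exponential(size=(n, 2))                   # skewed latent regressors
x_obs = x_true + rng.normal(scale=0.5, size=(n, 2))     # symmetric measurement error
y = x_true @ beta_true + rng.normal(size=n)

beta_d, beta_p = higher_moment_estimators(y, x_obs)
x_c = x_obs - x_obs.mean(axis=0)
beta_ols = np.linalg.lstsq(x_c, y - y.mean(), rcond=None)[0]   # attenuated
print(beta_ols, beta_d, beta_p)
```

The sketch relies on the latent regressors being non-normal: with symmetric regressors the matrix $z_1'x$ is close to singular, and with zero excess kurtosis the same holds for $z_2'x$, which is one reason for combining the two estimators.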

16

These estimators are respectively defined, in there multivariate representation, by (Racicot,

2015):

-1

' '1 1D z x z y (Durbin) (35)

-1

' '2 2P z x z y (Pal) (36)

where 𝑧1 = [𝑥𝑖𝑗2 ], 𝑧2 = 𝑧3 − 3𝐷𝑖𝑎𝑔(𝑥′𝑥/𝑁)𝑥′, 𝑧3 = [𝑥𝑖𝑗

3 ], and Diag(x’x/N) = x’x/N Ik are

stacked vectors with i representing the sectors (i = 1, …, N), k the number of explanatory

variables (either 5 or 6), and t the time subscript (t = 1,…,T). The notation is the Hadamard

product. The second and third power (moments) of the de-meaned variables (x) are then

computed. This is analogous to computing the second and third moments of the explanatory

variables. In short, the instruments are obtained by taking the matrix of explanatory variables (X)

in deviation from its mean (x). Next, we obtain the weighted estimator (𝛽𝐻) by an application of

the GLS to the following combination (Racicot, 2015):

DH

P

W

(37)

where $W = (C'S^{-1}C)^{-1}C'S^{-1}$ is the GLS weighting matrix, S is the covariance matrix of $(\beta_D', \beta_P')'$ under the null hypothesis (i.e., no measurement errors), and $C = (I_k, I_k)'$ is a matrix of two stacked identity matrices of dimension k. Note that this weighting approach, which relies on GLS as the weighting matrix, is optimal in the Aitken (1935) sense27. However, we opt for the GMM method to weight the Durbin and Pal estimators. We consider this to be a more efficient procedure than the one used by Dagenais and Dagenais (1997) in that we rely on the asymptotic properties of the GMM estimator with respect to the correction of heteroskedasticity and autocorrelation to weight the instruments obtained with GLS. Note that when using GMM, we give up some efficiency gain in order to avoid completely specifying the nature of the autocorrelation or heteroskedasticity of the innovation and the data generating process of the measurement errors (Hansen, 1982). Again, we consider this a great advantage over the GLS estimator.

27 Note that we use W as a weighting matrix in the GLS estimator in (37). As is well known, this matrix can be replaced by the White (1980) or the Newey-West (1987) HAC asymptotically consistent variance-covariance matrix. For the problem of cross-sectional correlation (or spatial correlation), see Driscoll and Kraay (1998).
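As a minimal sketch of the GLS combination in (37), assuming NumPy, the weighting matrix W can be formed from two stacked identity matrices and a covariance matrix S supplied by the user. The function name `gls_combine` and the example numbers are hypothetical; S is taken as given and assumed symmetric and positive definite.

```python
import numpy as np


def gls_combine(beta_D, beta_P, S):
    """Combine two k-vectors of estimates with W = (C'S^{-1}C)^{-1} C'S^{-1}.

    S is the (2k x 2k) covariance matrix of the stacked vector (beta_D, beta_P),
    assumed symmetric and positive definite.
    """
    k = beta_D.shape[0]
    C = np.vstack([np.eye(k), np.eye(k)])            # two stacked identity matrices
    S_inv_C = np.linalg.solve(S, C)                  # S^{-1} C, shape (2k x k)
    W = np.linalg.solve(C.T @ S_inv_C, S_inv_C.T)    # shape (k x 2k)
    beta_H = W @ np.concatenate([beta_D, beta_P])
    return beta_H, W
```

With S equal to the identity matrix, the combination reduces to the simple average of the two estimators; otherwise the more precisely estimated vector receives the larger weight.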

3.3 Implementing the panel data fixed effects GMMd approach

To implement the GMMd approach in a fixed effects panel data framework, first create

the dummy variables for each sector. Next compute the robust instruments using the above

algorithm described in (37). Then calculate the GMM estimators in (34) using a HAC matrix

with the newly computed robust instruments and the sector dummy instruments.
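The first two implementation steps can be sketched as follows, assuming NumPy. The helper names `sector_dummies` and `distance_instruments` are hypothetical, and two readings are assumptions of this sketch: the Pal-type instrument is computed element by element as the cubed deviation minus three times the column variance times the deviation, and the fitted values are obtained by a plain least squares projection of the de-meaned regressors on the higher-moment instruments rather than through the weighted combination in (37).

```python
import numpy as np


def sector_dummies(N, T):
    """(N*T x N) dummy matrix for data stacked sector by sector."""
    return np.kron(np.eye(N), np.ones((T, 1)))


def distance_instruments(X):
    """Return d = x - x_hat and the instrument matrix Z for an (n x k) array X."""
    n, _ = X.shape
    x = X - X.mean(axis=0)                       # deviations from the mean
    z1 = x ** 2                                  # Durbin-type: squared deviations
    z3 = x ** 3
    z2 = z3 - 3.0 * x * (x ** 2).mean(axis=0)    # Pal-type (element-wise reading, assumed)
    Z = np.column_stack([np.ones(n), z1, z2])
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)  # regress each column of x on Z
    x_hat = Z @ coef
    return x - x_hat, Z
```

By construction, each column of d is orthogonal to the higher-moment instruments in Z.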

3.4 Implementing the panel data random effects GMMd approach

To implement the GMMd estimator in the context of the generalized random effects model, simply substitute $\hat{\beta}^{d}_{GMM}$ given by (34) for $b_i$ in (23). Also, the least squares variance-covariance estimator $s_i^2 (X_i'X_i)^{-1}$ should be replaced by $s_{dGMM,i}^2 (X_i'X_i)^{-1}$.

4. Hausd test for measurement errors

To test whether there are measurement errors, we rely on a modified Hausman (1978) artificial

regression which we refer to as Hausd. Each variable in the original five-factor and six-factor

models has a companion variable in Hausd with its own t statistic that indicates whether the

original variable contains measurement error.

To implement the Hausd artificial regression, start by estimating the following equation using OLS:

$$Y = X\beta + \hat{\omega}\varphi + e \qquad (38)$$

It is a two-stage least squares (2SLS) estimator because $\hat{\omega}$ is also obtained by OLS and (38) can be rewritten as

$$Y = X\hat{\beta}_{2SLS} + \hat{\omega}\varphi + e^{*} \qquad (39)$$

where $\varphi$ measures the under/over estimation of the OLS benchmark estimator.

Following Pindyck and Rubinfeld (1998, pp. 195-197), $\hat{\omega}$ can be obtained using the following procedure. Assume a regression model of the form $Y = \beta X + \varepsilon$, where X is an unobservable variable that is related to the observable variable X*, with X* = X + v, and v is a matrix of measurement errors that are assumed to be normally distributed. The OLS regression $Y = \beta X^{*} + \varepsilon^{*}$ is related to the original regression by noting that $\varepsilon^{*} = \varepsilon - \beta v$. We can write $X^{*} = \hat{X}^{*} + \hat{\omega}$, where $\hat{\omega}$ are the regression residuals from applying OLS on $x_{it} = \hat{\gamma}_0 + z\hat{\theta} + \hat{\omega}_{it}$. Note this expression is just another representation of $P_z X = Z(Z'Z)^{-1}Z'X = Z\hat{\theta} = \hat{X}$, the projection of X. Substituting for X* in the equation for Y yields $Y = \beta\hat{X}^{*} + \beta\hat{\omega} + \varepsilon^{*}$. Let $\delta$ represent the coefficient of the variable $\hat{\omega}$. Substituting $\hat{X}^{*} = X^{*} - \hat{\omega}$ yields $Y = \beta X^{*} + (\delta - \beta)\hat{\omega} + \varepsilon^{*}$, which is analogous to (39) with $\varphi = \delta - \beta$. The resulting t statistics can be analyzed in the usual fashion. That is, if a significant t statistic is obtained for any $\hat{\omega}$ variable, there are significant specification/measurement errors in the model28.

(39) is a Hausman (1978) artificial regression that can also be obtained using 2SLS with the

same set of instruments (Spencer and Berk, 1981). To be precise, the estimated ‘slope’

coefficients of the GMMd regression should be the same as the corresponding ‘slope’

coefficients in the Hausman artificial regression.

In (39), $\hat{\omega}$ is a matrix of residuals of the regression of each explanatory variable on the instrument set. The notation $\hat{\omega}$ is commonly used in Hausman artificial regressions. It is equivalent to the $d_i = x_i - \hat{x}_i$ residual that emphasizes the idea of a ‘distance’ variable.
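The artificial regression can be sketched numerically as follows, assuming NumPy. The function name `hausd_regression` is hypothetical, and the sketch uses plain OLS standard errors for the t statistics; a HAC covariance matrix could be substituted, which is not shown here.

```python
import numpy as np


def hausd_regression(y, X, omega_hat):
    """OLS of y on a constant, X (n x k) and the first-stage residuals (n x k).

    Returns the coefficient vector (constant, k slopes, k residual
    coefficients) and the corresponding OLS t statistics.
    """
    n = len(y)
    R = np.column_stack([np.ones(n), X, omega_hat])
    coef, *_ = np.linalg.lstsq(R, y, rcond=None)
    resid = y - R @ coef
    s2 = resid @ resid / (n - R.shape[1])          # residual variance
    cov = s2 * np.linalg.inv(R.T @ R)              # classical OLS covariance
    t_stats = coef / np.sqrt(np.diag(cov))
    return coef, t_stats
```

In this regression the slope on each original variable coincides numerically with the 2SLS slope obtained from the same instrument set, and a significant t statistic on a residual column signals measurement error in the corresponding variable.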

5. Testing for random effects versus fixed effects models

The standard approach to test whether the fixed effects model should be retained over the random effects model is usually performed via a Hausman (1978) specification test. The test simply checks whether the quadratic distance between the fixed effects and random effects estimators is significantly different from zero. The test can be written as follows (Greene, 2012; Wooldridge, 2002)

$$H = \left(b_{FE} - \hat{\beta}_{RE}\right)'\left[V_{FE} - V_{RE}\right]^{-1}\left(b_{FE} - \hat{\beta}_{RE}\right) \overset{a}{\sim} \chi^2_M \qquad (40)$$

which is asymptotically distributed as chi-squared with M = k degrees of freedom. To be precise, we have k coefficients (i.e., k explanatory variables excluding the constant term) in the estimator vectors $b_{FE}$ and $\hat{\beta}_{RE}$ for the fixed effects and random effects models, where $V_{FE}$ and $V_{RE}$ can be estimated using (13) and (25), respectively. Note that if there were only one parameter in these vectors, the square root of statistic (40) would asymptotically follow a t distribution and, under certain assumptions, would in fact be asymptotically normal (Wooldridge, 2002).

28 An F test can be done to see whether, collectively, none of the coefficients of the variables in the artificial regression is significantly different from zero. This turns out to be unnecessary, since at least one coefficient in every regression is significantly different from zero using t tests on the individual coefficients.
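A minimal sketch of the quadratic form in (40), assuming NumPy, is given below. The function name `hausman_statistic` and the inputs are hypothetical; the coefficient vectors and covariance matrices are assumed to exclude the constant term and to be supplied by the user.

```python
import numpy as np


def hausman_statistic(b_fe, b_re, V_fe, V_re):
    """Quadratic distance between fixed and random effects slope estimates."""
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    # diff' [V_fe - V_re]^{-1} diff, solved without forming the inverse explicitly
    return float(diff @ np.linalg.solve(V_fe - V_re, diff))
```

The resulting value is compared with the chi-squared critical value for the relevant number of slope coefficients; values below the critical value do not reject the random effects model.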

That being said, in this paper, we would prefer to rely on an auxiliary regression version

of this test. This is consistent with the approach used in this paper. The implementation of the

test is, in our view, more practical and parsimonious. This approach can be implemented via the

following regression (Mundlak, 1978; Schmidheiny, 2015)

$$y_{it} = \alpha_i + x_{it}'\beta + z_{it}'\gamma + \bar{x}_i'\lambda + \delta_t + u_{it} \qquad (41)$$

In this generalized version of the test, $\bar{x}_i = (1/T)\sum_t x_{it}$ is the time average of the explanatory variables. In our case, this implies that we have to compute the time averages of each of our risk factors. Note also that in our case $\gamma = \delta_t = 0$, since we are using the constrained version of the test, as we did not consider the time varying effects model or other sources of heterogeneity. The test is generally implemented by testing the null hypothesis that the vector of coefficients $\lambda = 0$ (H0: $\lambda = 0$) using a Wald test corrected for clustered errors that may be heteroskedastic or autocorrelated29. Essentially, the test amounts to running an LSDV regression, where the term $\bar{x}_i'$ is added. More precisely, one needs to stack the time data for each sector for the returns Y into a vector and for the risk factors X into a matrix. The term $\bar{x}_i$ is a stacked vector of the average factor value for each factor. To further provide implementation details about (41), rewrite it into its matrix format

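$$Y = D\alpha + X\beta + \bar{X}\lambda + u \qquad (42)$$

where Y are the stacked returns for each sector, D is the matrix of sector dummy variables, X are the stacked FF factors including the liquidity factor, $\beta$ its associated vector of coefficients that remain constant for each sector, and $\bar{X}$ is defined as follows

$$\bar{X} = \begin{bmatrix}
\bar{x}_{11} & \bar{x}_{12} & \cdots & \bar{x}_{16} \\
\vdots & \vdots & & \vdots \\
\bar{x}_{11} & \bar{x}_{12} & \cdots & \bar{x}_{16} \\
\bar{x}_{21} & \bar{x}_{22} & \cdots & \bar{x}_{26} \\
\vdots & \vdots & & \vdots \\
\bar{x}_{21} & \bar{x}_{22} & \cdots & \bar{x}_{26} \\
\vdots & \vdots & & \vdots \\
\bar{x}_{12,1} & \bar{x}_{12,2} & \cdots & \bar{x}_{12,6} \\
\vdots & \vdots & & \vdots \\
\bar{x}_{12,1} & \bar{x}_{12,2} & \cdots & \bar{x}_{12,6}
\end{bmatrix}_{NT \times k}$$

$\bar{X}$ is a matrix of dimension NT × k, where NT = 12 sectors × the number of monthly observations, k = 6 factors in the augmented version of the FF (2015) model, and λ is a vector of dimension k × 1.

29 Note that an expeditious way of testing the joint significance of the parameters in λ is to use an F test and look at its associated p-value, which should be under 5% to reject the null hypothesis at the 5% significance level.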
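The stacked matrix of time averages can be built as in the following sketch, assuming NumPy and assuming the data are stacked sector by sector (all months of sector 1 first, then sector 2, and so on). The function name `mundlak_regressors` is hypothetical.

```python
import numpy as np


def mundlak_regressors(X, N, T):
    """Build the sector dummies D and the matrix of repeated time averages.

    X is (N*T x k), stacked sector by sector. Each sector's row of time
    averages is repeated T times, giving an (N*T x k) matrix.
    """
    k = X.shape[1]
    xbar = X.reshape(N, T, k).mean(axis=1)       # (N x k) time averages per sector
    X_bar = np.repeat(xbar, T, axis=0)           # repeat each sector's row T times
    D = np.kron(np.eye(N), np.ones((T, 1)))      # (N*T x N) sector dummies
    return D, X_bar
```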

6. Data

6.1 Data

Our sample is composed of monthly returns of 12 indices classified by FF industrial sectors. The

observation period runs from January 1968 through December 2014, for a total of 564 monthly

observations. The panel data framework yields 12 sectors × 564 monthly observations = 6,768

total observations. The FF risk factors are drawn from French’s website30. The PS liquidity factor

is from Pástor’s website31.

30 French’s website is http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.

31 Pástor’s website is http://faculty.chicagobooth.edu/lubos.pastor/research/liq_data_1962_2012.txt


6.2 Descriptive statistics

Tables 1 and 2 present the descriptive statistics of the dependent and independent variables,

respectively32.

Insert Table 1 here

For all sectors, note that the JB statistic is greater than 5.99, which is the critical value of

the chi-square distribution at the 5% level for 2 degrees of freedom. Thus, we reject the null

hypothesis of normality for all sector returns. Mandelbrot (1963, 1972) and Fama (1963, 1965)

reached similar conclusions. This empirical behavior was discovered even earlier by Mitchell

(1915), who may have been the first to notice both time-varying volatility and high-peaked, fat-tailed

distributions in commodity prices. Pareto also found a fat-tailed distribution of income in the late 1800s

and developed a theory for this (Haug, 2007). Note that nowadays authors consider modeling

fund returns using the tempered class of distributions (Bianchi, 2015).
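The JB statistic defined in footnote 32 can be computed as in the sketch below, assuming NumPy. The function name `jarque_bera` is hypothetical, and the sketch uses population (biased) moment estimators, which is an assumption about how skewness and kurtosis are measured.

```python
import numpy as np


def jarque_bera(x, k=0):
    """JB = (n - k) * [skew^2 / 6 + (kurt - 3)^2 / 24]; k = 0 for raw data."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dev = x - x.mean()
    s2 = np.mean(dev ** 2)                       # second central moment
    skew = np.mean(dev ** 3) / s2 ** 1.5         # zero for a normal distribution
    kurt = np.mean(dev ** 4) / s2 ** 2           # three for a normal distribution
    return (n - k) * (skew ** 2 / 6.0 + (kurt - 3.0) ** 2 / 24.0)
```

A value above 5.99 rejects normality at the 5% level with 2 degrees of freedom.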

Sector 6 Business Equipment has the highest standard deviation of 6.68. On a standalone

basis in the Markowitz (1959)33 mean-variance framework, this would indicate that Business

Equipment is the riskiest sector. However, in the higher-moments framework of Rubinstein

(1973) and of Jurczenko and Maillet (2006), this sector has the second lowest kurtosis. This

suggests that perhaps Business Equipment is not the most risky.

Nine of the 12 sectors show negative skewness, which is an indicator of downside risk.

Only Sector 2 Durables, Sector 4 Energy, and Sector 10 Health have the desirable positive

skewness, which is an indicator of strong upside potential.

Insert Table 2 here

32 The Jarque-Bera (1980) statistic is calculated by

$$JB = (n - k)\left[\frac{skew^2}{6} + \frac{(kurt - 3)^2}{24}\right] \overset{a}{\sim} \chi^2(2)$$

where n is the number of observations, k is the number of regressors, which is zero when using the raw data, skew is the skewness of the data, which is zero for a normal distribution, and kurt is the kurtosis, which is three for the normal distribution. However, note that the kurtosis and the skewness measures are not independent. Wilkins (1944) shows that $kurt \geq skew^2 + 1$, and Schopflocher and Sullivan (2005) go further and write that $kurt = a + b \cdot skew^2$. These measures must therefore be handled with care.

33 Markowitz (2012) noted that the mean-variance model still works well in the presence of moderate amounts of skewness and kurtosis.


In Table 2, the JB statistics are even more indicative of non-normality. The new FF risk

factor RMW has an extremely high JB statistic, and the new FF risk factor CMA has the lowest

JB statistic. Nevertheless, at 46.41, the CMA JB statistic is still well above the 5.99 chi-square

5% cutoff value. The values for all of the risk factors indicate that extreme events occur far more

frequently than with the normal distribution. This is a reflection of the kurtosis measuring well

over the normal distribution value of 3 for each of these 6 risk factors. The highest kurtosis value

is for the RMW risk factor at 14.17, being over 4 times the normal distribution value. Only the

kurtosis and JB statistics for RMW fall outside the range of the kurtosis and JB values from Table

1 for the sector returns.

All these results support the logic of our proposed methodology, which uses higher

moments (cumulants) as instruments for the GMM estimation process. Using OLS when such

strong non-normality is present in both the dependent and explanatory variables may lead to

wrong inferences.

7. Empirical Results

7.1 The Fixed Effects estimation of the FF new five-factor model

Table 3a presents our estimation results for the new FF five-factor model using the LSDV and

GMMd approaches for the fixed effects model.

Insert Table 3a here

For the FF five-factor OLS pooled model for the FF twelve-sector portfolio, the

coefficients of all five factors are significant at 1% except for SMB which is significant at 10%.

This suggests strong support for the FF five factors. However, using GMMd pooled, only the

coefficient of the market factor is significant at 1% with the RMW coefficient significant at 5%.

From an investment performance perspective, the Jensen performance measure is

negative but not significant for both the OLS and GMMd pooled approaches for the FF twelve-

sector portfolio. For OLS, the twelve-sector portfolio appears to be weighted towards firms that

are small cap (SMB, 0.0216), value (HML, 0.1091), robust profitability (RMW, 0.1671), and

conservative investment policy (CMA, 0.0748). For GMMd, the conclusions are the same except

that growth rather than value seems to slightly predominate (HML, - 0.0016). The Hausd test suggests significant measurement errors in the RMW factor ($\hat{\omega}_{RMW}$ = - 0.2805, t = - 2.36). Adding the coefficient of RMW and its corresponding ω coefficient yields 0.3976 – 0.2805 = 0.1171. This Hausd result approximates the OLS estimate of 0.1671; recall that the Hausd relationship holds only asymptotically.

According to LSDV, we have 8 sectors (including Durables, 0.0053; Manufacturing, 0.0054; Business Equipment, 0.0097; Money, 0.0041; and Other, 0.0023) that generate significant positive excess returns, while there are 3 sectors (Non-durables, - 0.0025; Telecom, - 0.0038; Utilities, - 0.0110) that significantly underperform and 1 sector (Energy, 0.0000) with neutral performance. The relative (to the market) systematic measure of risk for all 12 sectors is significantly different from 0 and for all 12 sectors is within 0.25 of the market beta of 1.

For the fixed effects GMMd, only one sector (Energy, 0.0224) has a Jensen performance

measure that is significantly different from 0! Given that the Jensen performance measure for

Energy is only significant at the 10% level, these sector results are strongly supportive of

efficient markets. Again, the relative systematic measure of risk is significantly different from 0

for all 12 sectors with 10 sectors within 0.25 of 1. Only the betas for Energy at 1.4261 and

Telecom at 0.5862 fall outside the range of 0.25 of 1.

7.2 The Fixed Effects estimation of the FF new augmented six-factor model

Table 3b presents our estimation results for the new augmented (LIQ) FF six-factor model using

the LSDV and GMMd approaches for the fixed effects model.

Insert Table 3b here

For the FF six-factor OLS pooled model for the FF twelve-sector portfolio, all of the

coefficients of the five FF risk factors retain their previous significance level when the LIQ risk

factor is added. The LIQ risk factor, however, is insignificant, which suggests that LIQ on

average does not have a risk premium. This again suggests strong support for the FF five factors.

However, using GMMd pooled, only the coefficient of the market factor is significant at 1% and

illiquidity may matter (20% level of significance). When testing for measurement errors using

the Hausd test, the LIQ risk factor is significant at the 5% level and seems to be measured with

significant errors ($\hat{\omega}_{LIQ}$ = - 0.1105, t = - 2.30). Furthermore, the coefficient of the CMA risk

factor becomes significant once again at the 10% level (0.2242, t = 1.86).

From an investment performance perspective, the Jensen performance measure is

negative but not significant for the OLS pooled approach for the FF twelve-sector portfolio.


However, for the GMMd pooled approach, the Jensen measure is negative and significant at the

5% level (- 0.1653, t = - 1.99). For OLS, the twelve-sector portfolio appears to be weighted

towards firms that are small cap (SMB, 0.0218), value (HML, 0.1090), robust profitability (RMW,

0.1669), conservative investment policy (CMA, 0.0748), and slightly illiquid (LIQ, 0.0098). For

GMMd, the twelve-sector portfolio appears to be weighted towards firms that are large cap

(SMB, - 0.0501), value (HML, 0.1697), robust profitability (RMW, 0.1345), conservative

investment policy (CMA, 0.2242), and illiquid (LIQ, 0.1152).

According to LSDV, of course, the previous alpha and beta remain exactly the same for

the individual sectors, since the market risk factor is the only risk factor used in LSDV.

7.3 Sector Analysis of the random effects models: the new FF five-factor model

Table 4a presents the OLS and GMMd results for the 12 FF sectors of the new five-factor FF

model, since these results are needed in the estimation of the random effects model.

Insert Table 4a here

Note that the coefficient for the market factor RM – Rf is significant at 1% for all 12 FF

sectors using OLS and for 11 of the 12 FF sectors using GMMd. For the Utilities sector, the

coefficient using GMMd is insignificant at the standard 1%, 5%, and 10% levels but is significant

at the 20% level (t = 1.36 > 1.28).

For SMB, its OLS estimated coefficient was significant at 1% for 9 sectors with 5 of these

coefficients being positive and 4 negative. The OLS estimated SMB coefficient was significant

at the 5% level for 1 sector and insignificant for the other 2. Turning to the GMMd estimated

coefficients for SMB, no coefficients are significant!

For the HML factor, Fama and French (2015) themselves felt that HML could be

redundant34 with the addition of the RMW and CMA factors. In other words, there could be

multicollinearity with its attendant problems. For OLS, the HML coefficient is significant at the

1% level for 8 sectors with 2 having a negative sign. 1 sector is significant at the 10% level and 3

are insignificant. As with SMB, the GMMd estimated coefficients of the HML factor are all

insignificant at the standard levels of significance! However, 4 sectors are significant at the 20%

level with 1 sector having a negative sign.

34 See Fama and French (2015), p. 2.


Turning to the first new factor RMW, we note that the OLS estimated coefficients are

significant at the 1% level for 8 sectors with 2 having a negative sign. 2 sectors are significant at

the 5% level, and 2 are insignificant. For the GMMd estimated RMW coefficients, only the Non-

Durables sector has a significant coefficient at a standard level (10%).

For the CMA factor, the OLS estimated coefficients are significant at the 1% level for 4

sectors with one of these sectors having a negative sign. 1 sector is significant at the 5% level

with a negative sign, and 7 sectors are insignificant. The GMMd estimated CMA coefficients are

significant at the 1% level for 3 sectors. 2 of these sectors, Chemicals and Health, have a positive

sign and overlap with the OLS results. The third sector, Telecom has a negative sign and is

insignificant with OLS. 2 sectors are significant at the 5% level with GMMd with 1 being

positive and 1 being negative. Manufacturing is the one with the positive sign, but the sign is

insignificant with OLS. Business Equipment has a negative sign for both GMMd and OLS with

the sign being even more significant at 1% for OLS. 1 sector is significant at the 10% level with

GMMd, and 6 sectors are insignificant. Thus, the GMMd estimation results suggest that only the

new FF factor CMA seems to have some explanatory power.

7.4 An Investment Perspective for the random effects FF five factor model

Now, turning to an investment perspective, we will focus on the 5 sectors (Energy, Business

Equipment, Telecom, Utilities, and Health) that generate positive risk-adjusted abnormal returns

or positive Jensen (1968) alpha based on OLS (see Table 4a). Note, though, that only 2 of these

sectors (Business Equipment and Telecom) have positive alphas (0.4964 and 0.6862,

respectively)35 using GMMd.

For Business Equipment using OLS, it seems that this sector is dominated by small firms

(SMB coefficient 0.0854) with low book to market ratios (HML, - 0.4199), weak profitability

(RMW, - 0.4180), and aggressive investment behavior (CMA, - 0.4395) with the coefficients of

these variables all being significant at the 1% level except for SMB which is significant at the 5%

level. For GMMd, Business Equipment is dominated by large firms (SMB, - 0.0379) with high

book to market ratios (HML, 0.0425) weak profitability (RMW, - 0.7035), and aggressive

investment behavior (CMA, - 0.9481). Only RMW and CMA are significant, albeit at the 20% and

5% levels, respectively.

35 These numbers are in percent return per month. On a nominal annual basis, these abnormal returns are 5.96% and

8.23%, respectively.


Using OLS, Telecom is dominated by large firms (SMB coefficient - 0.2658) with a high

book to market ratio (HML, 0.1101), weak profitability (RMW, - 0.3185), and conservative

investment behavior (CMA, 0.0503) with the coefficients being significant at the 1%, 10%, and

1% levels for SMB, HML, and RMW, respectively, and insignificant for CMA. We obtain

identical results for GMMd with respect to the signs except that CMA now has a negative sign.

The levels of significance for the factors have changed. The CMA coefficient is now significant

at the 1% level, whereas it was the only insignificant one using OLS. Now SMB, HML, and

RMW are all insignificant.

Of these 5 sectors with positive alpha, only Business Equipment has a beta greater than

one for the market factor RM – Rf. To the investor, this suggests that one can earn abnormal

returns in 4 sectors while taking on less relative risk than the market portfolio. Note, however,

under GMMd only Telecom has positive alpha and beta less than one.

7.5 Sector analysis of the random effects model: the augmented new FF six-factor model

Table 4b presents our estimation results for the new augmented FF six-factor model using the

OLS and GMMd approaches for the random effects model.

Insert Table 4b here

Again, Business Equipment and Telecom have positive alphas using OLS with values of

0.3762 and 0.1978, respectively, and significance levels of 1% and 20%. Using GMMd, both

coefficients are positive at 0.3000 and 0.7119, respectively, with Business Equipment

insignificant and Telecom significant at 5%. Using OLS, the coefficients and t values for the new

FF five factors in the six-factor model are essentially the same. This is not surprising given that

the liquidity coefficient is not significantly different from 0 for both sectors. Turning to GMMd

for Telecom, the coefficients and t values for the new FF five factors did change somewhat with

the SMB, HML, and RMW coefficients remaining insignificant and the significance level of the CMA

coefficient weakening from 1% to 5%. The LIQ coefficient is insignificant for both OLS and

GMMd.

LIQ is really a measure of illiquidity because we are using the portfolio version of the

LIQ factor (tradable LIQ that is long on an illiquid portfolio and short on a liquid one).

Coefficients should be positive to generate a risk premium. For example, the Durables sector has

a positive sign and is significant at the 5% level for OLS and at the 20% level for GMMd. This is


consistent with the idea that durables are difficult to sell during periods of illiquidity. Only 3 of

the 12 FF sectors (Health, Money, and Other) have negative coefficients for both OLS and

GMMd. Although these LIQ coefficients are significant at the 5%, 1%, and 1% level,

respectively, for Health, Money, and Other for OLS, they are all insignificant for GMMd.

7.6 The Random Effects estimation of the FF new five-factor model

The t statistics in Table 4a for the coefficients of the random effects model are calculated by 2

methods. First, weighted averages of the t statistics for the 12 sectors for each coefficient are

calculated using (22). Then the t statistics are calculated using the Swamy (1970) variance-

covariance matrix given by (25).

The estimation of the Jensen (1968) alpha (constant term) for the random effects model is

slightly negative but insignificant for both FGLS and GMMd. The insignificance of alpha is an

indicator of market efficiency. The beta coefficient for the market factor RM – Rf is close to 1 for

both FGLS and GMMd. Thus, the 12-sector portfolio has essentially the same relative market

risk as the market itself and has no abnormal or superior return. This suggests that the market

portfolio should be the preferred investment vehicle, as it can be cost effectively obtained from

either index mutual funds or exchange traded funds (ETFs).

For SMB, the t values are insignificant for both FGLS and GMMd and for both methods

of calculating t. For HML, all results are also insignificant except the weighted average t for

FGLS is significant at the 10% level.

Using FGLS, the new FF RMW factor is positive and significant at the 1% level using the

weighted average t and at the 10% level using the Swamy variance-covariance matrix. For

GMMd, RMW is insignificant using the weighted average t and significant at only the 20% level

using the Swamy variance-covariance matrix. These coefficients are 0.1674 for FGLS and

0.2704 for GMMd. These values are much bigger than the insignificant SMB and HML values

that ranged from 0.0217 to 0.1093. Therefore, robust profitability firms (RMW) do seem to have

some explanatory power for the 12-sector portfolio returns.

Meanwhile, conservative firms (CMA) do not seem to explain much of the 12-sector

portfolio returns with an FGLS coefficient of 0.0741 and a t that is almost significant at the 20%

level for the weighted average method. However, the t value is insignificant for the Swamy

method, and GMMd yields insignificant results.


7.7 The Random Effects estimation of the FF new six-factor model

For the six-factor model using FGLS, the coefficients of the FF five factors in the twelve-sector

FF equally weighted portfolio are imperceptibly different from their values in the five-factor

model (see Tables 4a and 4b). The t values have the same levels of significance, except for the

HML coefficient which has improved from the 10% level to the 5% level using the weighted

average method for calculating t.

Looking at the investment performance of the FF twelve-sector portfolio, the Jensen

(1968) performance measure is negative but insignificant even at the 20% level36. Using GMMd,

it appears that the portfolio is weighted towards stocks that are large cap (SMB, - 0.0504), high

book to market (HML, 0.1720), robust profitability (RMW, 0.1338), conservative investment

(CMA, 0.2227), and illiquid (LIQ, 0.1170). These results seem consistent with our previous

Tobin Q and investment perspective. Normally, one expects large cap stocks to be liquid and

hence, the LIQ coefficient should not be significantly different from 0 or possibly significantly

negative. Here, we find that it is barely significant at the 20% level using GMMd and the Swamy

variance-covariance matrix. Perhaps this is an effect of the 2007-2009 financial crisis when even

large cap stocks were somewhat illiquid.

7.8 F test for the fixed effects versus the pooled models

When testing the fixed effects model over the pooled one, the F test rejects the pooled regression

approach. The F test is given by (e.g., Greene, 2012)

$$F(N-1,\; NT-N-k) = \frac{\left(R^2_{LSDV} - R^2_{Pooled}\right)/(N-1)}{\left(1 - R^2_{LSDV}\right)/(NT-N-k)} \qquad (43)$$

where $R^2_{LSDV}$ is the coefficient of determination for the least squares dummy variables regression,

$R^2_{Pooled}$ is the coefficient of determination for the pooled regression, N is the number of sectors,

T is the number of months, and k is the number of regressors. Table 5 provides the F values for

the five and six factor models using OLS and GMMd estimation methods.
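As a worked illustration of (43), the following sketch computes the F statistic and its degrees of freedom. The function name `f_stat_fixed_vs_pooled` is hypothetical, and the R-squared values in the usage lines are made-up inputs chosen only to show the arithmetic; they are not results from Table 5.

```python
def f_stat_fixed_vs_pooled(r2_lsdv, r2_pooled, N, T, k):
    """F statistic of (43) with (N - 1, N*T - N - k) degrees of freedom."""
    df1 = N - 1
    df2 = N * T - N - k
    F = ((r2_lsdv - r2_pooled) / df1) / ((1.0 - r2_lsdv) / df2)
    return F, df1, df2


# Hypothetical R-squared values, with the panel dimensions used in the text
# (12 sectors, 564 months, five factors).
F, df1, df2 = f_stat_fixed_vs_pooled(0.60, 0.58, N=12, T=564, k=5)
print(round(F, 2), df1, df2)
```

The pooled model is rejected when the computed F exceeds the critical value for these degrees of freedom.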

36 The t value is positive for the weighted average approach using FGLS even though the alpha is negative because

in this particular case the weighted summation of the sectors with positive t values outweighs the magnitude of the

weighted summation of the sectors with negative t values.


Insert Table 5 here

Note that all the F tests are significant at the 1% level, which means for the 5 and 6 risk factor

models, the pooled model is rejected in favor of the fixed effects model using either OLS or

GMMd estimation methods37.

7.9 Hausman H test for the fixed effects versus the random effects models38

While the F tests let us draw conclusions about the fixed effects model, they cannot help discriminate between the fixed and random effects models. The Hausman (1978) test is particularly well adapted to discriminating between competing models, in our case the fixed effects versus the random effects models. The Hausman test statistic, given by (40), is chi-squared distributed with one degree of freedom per slope coefficient (k − 1 when k counts the constant term). Intuitively, the H test is a quadratic distance weighted by its variance, the distance being between the fixed and random effects estimations. Turning to our results, Table 5 shows that the Hausman test cannot reject the random effects model using either OLS or GMMd and for the 5 and 6 risk factor models39. Thus, the random effects model is retained in preference to the fixed effects model.

8. Conclusions

We find that using OLS or LSDV estimation, the new Fama and French (2015) five factors are

highly significant. However, adding to this model the illiquidity factor of Pástor and Stambaugh

(2003) does not provide more explanatory power to the new FF model.

When applying the GMM approach proposed in this paper to either the FF five-factor or

augmented six-factor models, a different picture emerges. In the five-factor model, only the

market risk factor at the 1% level and the profitability RMW factor at the 5% level are significant

for the fixed effects model, with the Hausman auxiliary regression showing significant

measurement errors for RMW. Turning to the random effects model, the market factor is again

significant at the 1% level, whereas the RMW factor falls to the non-standard 20% level.

Adding the PS illiquidity factor to the FF 5 factors changes the conclusions in the GMMd

universe. Except for the market risk factor, none of the factors is significant at the standard level.

37 Note that the critical value for F test in either model is 1.54. 38 Note that we did not perform the auxiliary regression version of the test. This is because we have repeated

obsevations of the regerssors, therefore rendering the test difficult to apply for this financial application. We

therefore rely on the Hausman test. 39The critical value for the chi-squared distribution of the H test is in our case 11.07 or12.59, respectively, for the

five and six risk factor models.

30

This result is consistent with MacKinlay (1995). The illiquidity factor, however, could be

considered significant if we lower the bar to the 20% level. Note, though, that the illiquidity

factor is measured with significant errors according to the Hausman auxiliary regression test. Also note

that when using this test, the investment policy CMA factor becomes significant at 10%.
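To make the significance thresholds used in this discussion concrete, the following minimal sketch maps a t-statistic to the tightest level at which it is significant. It assumes two-sided tests against the standard normal distribution, which is an assumption for illustration and not necessarily the exact reference distribution used in the paper; the function names are hypothetical. The t-statistics in the loop are taken from Tables 3a, 3b, 4a and 4b.

```python
from statistics import NormalDist


def two_sided_p_value(t_stat):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


def significance_label(t_stat):
    """Tightest of the 1%, 5%, 10% and non-standard 20% levels that is met."""
    p = two_sided_p_value(t_stat)
    for level, label in ((0.01, "1%"), (0.05, "5%"), (0.10, "10%"), (0.20, "20%")):
        if p < level:
            return label
    return "not significant"


# t-statistics taken from the tables:
#   1.33  RMW, random effects GMMd, Swamy t-stat (Table 4a)
#   1.31  LIQ, random effects GMMd, Swamy t-stat (Table 4b)
#  -2.36  RMW measurement-error term, Hausd (Table 3a)
#   1.86  CMA, Hausd (Table 3b)
for t in (1.33, 1.31, -2.36, 1.86):
    print(t, significance_label(t))
```

Under this approximation the RMW and LIQ Swamy t-statistics clear only the 20% threshold, which is the sense in which the text speaks of lowering the bar.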

For the fixed effects model, we find that the Jensen performance measure alpha is

negative and significant for the FF twelve-sector pooled augmented six-factor model using our

GMMd approach. Furthermore, although alpha is not significant for either the pooled OLS or

GMMd five-factor model or for the pooled OLS augmented six-factor model, the coefficient

nevertheless has a negative sign. While markets may be ex-ante efficient and not ex-post, this

result shows ex-post that the twelve-sector portfolio is somewhat inefficient. Therefore, investors

would be better off holding the market portfolio, rather than this one. Turning to the random

effects model, the alpha is also negative but insignificant for both the FGLS and GMMd

approaches and for both the five-factor and augmented six-factor models.

Because the FF model is theoretically firmly grounded, we believe that it is still capable

of explaining returns sufficiently well, even in the light of our mixed results. Further

research, for example on hedge fund returns that are likely to have even higher levels of

skewness and kurtosis than equities, might well show the value of using the FF factors in our

higher moments GMMd approach.


References

Abel, A.B., 1983. Optimal investment under uncertainty, American Economic Review, 73, 228-

233.

Adrian, T., Fleming, M., Shachar, O., Vogt, E., 2017. Market liquidity after the financial crisis,

Annual Review of Financial Economics, 9, 43-83.

Aitken, A.C., 1935. On least squares and linear combinations of observations, Proceedings of the

Royal Society of Edinburgh, 55, 42-48.

Anderson, T., Rubin, H., 1949. Estimation of the parameters of a single equation in a complete

system of stochastic equations, Annals of Mathematical Statistics, 20, 46-63.

Anderson, T., Rubin, H., 1950. The asymptotic properties of the parameters of a single equation

in a complete system of stochastic equations, Annals of Mathematical Statistics, 21, 570-582.

Arellano, M., 2003. Panel Data Econometrics, Oxford University Press, New York.

Arellano, M., Bond, S., 1991. Some tests of specification for panel data: Monte Carlo evidence

and an application to employment equations, Review of Economic Studies, 58, 277-297.

Baltagi, B., 2001. Econometric Analysis of Panel Data, 2nd ed., John Wiley & Sons, New York.

Bellman, R., 1957. Dynamic Programming, Princeton University Press, Princeton, NJ.

Benninga, S., 2014. Financial Modeling, 4th ed., MIT Press, Cambridge, MA.

Barillas, F., Shanken, J., 2015. Comparing asset pricing models, NBER Working Paper 21771,

Cambridge, MA.

Black, F., 1972. Capital market equilibrium with restricted borrowing, Journal of Business, 45,

444-455.

Bianchi, M.L., 2015. Are the log-returns of Italian open-end mutual funds normally distributed?

A risk assessment perspective, Journal of Asset Management, 16, 437–449.

Bound, J., Jaeger, D., Baker, R., 1995. Problems with instrumental variables estimation when the

correlation between the instruments and the endogenous explanatory variables is weak, Journal

of the American Statistical Association, 90, 443-450.

Campbell, J., Lo, A., MacKinlay, A., 1997. The Econometrics of Financial Markets, Princeton

University Press, Princeton, NJ.

Carhart, M., 1997. On the persistence in mutual fund performance, Journal of Finance, 52, 57-

82.

Chow, G.C., 1997. Dynamic Economics: Optimization by the Lagrange Method, Oxford

University Press, New York.

Cobb, C.W., Douglas, P.H., 1928. A theory of production, American Economic Review

Supplement, 18, 139-165.


Cochrane, D., Orcutt, G., 1949. Application of the least squares regression to relationships

containing autocorrelated error terms, Journal of the American Statistical Association, 44, 32-61.

Cochrane, J.H., 1991. Production-based asset pricing and the link between stock returns and

economic fluctuations, Journal of Finance, 46, 209-237.

Cochrane, J.H., 2005. Asset Pricing, revised ed., Princeton University Press, Princeton, NJ.

Cochrane, J.H., 2008. The dog that did not bark: A defense of return predictability, Review of

Financial Studies, 21, 1533-1575.

Cochrane, J.H., 2011. Presidential address: Discount rates, Journal of Finance, 66, 1047-1108.

Dagenais, M.G., Dagenais, D.L., 1994. GMM estimators for linear regression models with errors

in the variables, Centre de recherche et développement en économique (CRDE), Working Paper

0594, University of Montreal, Montreal, QC.

Dagenais, M.G., Dagenais, D.L., 1997. Higher moments estimators for linear regression models

with errors in the variables, Journal of Econometrics, 76, 193-221.

Driscoll, J.C., Kraay, A.C., 1998. Consistent covariance matrix estimation with spatially

dependent panel data, Review of Economics and Statistics, 80, 546-560.

Dufour, J.M., 2003. Identification, weak instruments and statistical inference in econometrics,

Working Paper No. 2003s-49, CIRANO, University of Montreal, Montreal, QC.

Durbin, J., 1954. Errors in variables, International Statistical Review, 22(1/3), 23-32.

Fama, E.F., 1963. Mandelbrot and the stable Paretian hypothesis, Journal of Business, 36, 420-

429.

Fama, E.F., 1965. Portfolio analysis in a stable Paretian market, Management Science, 11, 404-

419.

Fama, E.F., French, K.R., 1992. The cross-section of expected stock returns, Journal of Finance,

47, 427-465.

Fama, E.F., French, K.R., 1993. Common risk factors in the returns on stocks and bonds, Journal

of Financial Economics, 33, 3-56.

Fama, E.F., French, K.R., 2015. A five-factor asset pricing model, Journal of Financial

Economics, 116, 1-22.

Fama, E.F., MacBeth, J.D., 1973. Risk, return, and equilibrium: Empirical tests, Journal of

Political Economy, 81, 607-636.

Greene, W.H., 2012. Econometric Analysis, 7th ed., Pearson Education, Inc., Boston, MA.


Greene, W.H., 2015. Class notes on The Econometric Analysis of Panel Data, Class 9, Stern

School of Business, New York University, New York, Winter 2015. Available at

http://people.stern.nyu.edu/wgreene/Econometrics/PanelDataNotes-9.ppt

Hahn, J., Hausman, J., 2002. A new specification test for the validity of instrumental variables,

Econometrica, 70, 163-189.

Hahn, J., Hausman, J., 2003. Weak instruments: Diagnosis and cures in empirical economics,

American Economic Review, 93, 118-125.

Hansen, L., 1982. Large sample properties of the generalized method of moments estimators,

Econometrica, 50, 1029-1054.

Hansen, L., Singleton, K., 1982. Generalized instrumental variables estimation of nonlinear rational

expectations models, Econometrica, 50, 1269-1286.

Hansen, L., Singleton, K., 1984. Generalized instrumental variables estimation of nonlinear rational

expectations models: Errata, Econometrica, 52, 267-268.

Haug, E.S., 2007. Derivatives Models on Models, Wiley, Chichester, England.

Hausman, J., 1978. Specification tests in econometrics, Econometrica, 46(6), 1251-1271.

Hausman, J., Taylor, W., 1981. Panel data and unobservable individual effects, Econometrica,

49, 1377-1398.

Hausman, J., Stock, J., Yogo, M., 2005. Asymptotic properties of the Hahn-Hausman test for

weak instruments, Economics Letters, 89, 333-342.

Heij, C., de Boer, P., Franses, P.H., Kloek, T., van Dijk, H.K., 2004. Econometric Methods with

Applications in Business and Economics, Oxford University Press, Oxford, England.

Hou, K., Xue, C., Zhang, L., 2015. Digesting anomalies: An investment approach, Review of

Financial Studies, 28, 650-705.

Jarque, C.M., Bera, A.K., 1980. Efficient tests for normality, homoscedasticity and serial

independence of regression residuals, Economics Letters, 6, 255-259.

Jensen, M.C., 1968. The performance of mutual funds in the period 1945-64, Journal of Finance,

23, 389-416.

Jurczenko, E., Maillet, B., 2006. The four-moment capital asset pricing model: Between asset

pricing and asset allocation. In Multi-Moment Asset Allocation and Pricing Models (eds.

Jurczenko, E., Maillet, B.), John Wiley & Sons, Chichester, England, Ch. 6.

Lin, X., Zhang, L., 2013. The investment manifesto, Journal of Monetary Economics, 60, 351-

366.

Lintner, J., 1965. The valuation of risk assets and the selection of risky investments in stock

portfolios and capital budgets, Review of Economics and Statistics, 47, 13-37.


MacKinlay, A.C., 1995. Multifactor models do not explain deviations from the CAPM, Journal

of Financial Economics, 38, 3-28.

Mandelbrot, B., 1963. The variation of certain speculative prices, Journal of Business, 36, 394-

419.

Mandelbrot, B., 1972. Correction of an error in “The variation of certain speculative prices”,

Journal of Business, 45, 542-543.

Markowitz, H., 1959. Portfolio Selection: Efficient Diversification of Investments, Cowles

Foundation Monograph 16, John Wiley & Sons, New York.

Markowitz, H., 2012. The “Great Confusion” concerning MPT, Aestimatio, The IEB

International Journal of Finance, 4, 8-27.

Mehra, R., Prescott, E., 1985. The equity premium: A puzzle, Journal of Monetary

Economics, 15, 145-161.

Merton, R.C., 1973. An intertemporal capital asset pricing model, Econometrica, 41, 867-887.

Miller, M.H., Modigliani, F., 1961. Dividend policy, growth, and the valuation of shares, Journal

of Business, 34, 411-433.

Mitchell, W.C., 1915. The making and using of index numbers. In Introduction to Index

Numbers and Wholesale Prices in the United States and Foreign Countries, Bulletin No. 173,

U.S. Bureau of Labor Statistics.

Mossin, J., 1966. Equilibrium in a capital asset market, Econometrica, 34, 768-783.

Mundlak, Y., 1978. On the pooling of time series and cross section data, Econometrica, 46, 69-

85.

Nelson, C., Startz, R., 1990a. Some further results on the exact small sample properties of the

instrumental variables estimator, Econometrica, 58, 967-976.

Nelson, C., Startz, R., 1990b. The distribution of the instrumental variables estimator and its

t-ratio when the instrument is a poor one, Journal of Business, 63, S125-S140.

Newey, W.K., West, K.D., 1987. A simple, positive semi-definite, heteroskedasticity and

autocorrelation consistent covariance matrix, Econometrica, 55, 703-708.

Olea, J.L.M., Pflueger, C., 2013. A robust test of weak instruments, Journal of Business and

Economic Statistics, 31, 358–369.

Pagan, A.R., 1984. Econometric issues in the analysis of regressions with generated regressors,

International Economic Review, 25, 221-247.

Pagan, A.R., 1986. Two stage and related estimators and their applications, Review of Economic

Studies, 53, 517-538.

Pagan, A.R., Ullah A., 1988. The econometric analysis of models with risk terms, Journal of

Applied Econometrics, 3, 87-105.


Pal, M., 1980. Consistent moment estimators of regression coefficients in the presence of errors

in variables, Journal of Econometrics, 14, 349-364.

Pástor, L., Stambaugh, R.F., 2003. Liquidity risk and expected stock returns, Journal of Political

Economy, 111, 642-685.

Pinto, J.E., Henry, E., Robinson, T.R., Stowe, J.D., 2015. Equity Valuation, 3rd ed., John Wiley

& Sons, New York.

Pindyck, R., Rubinfeld, D., 1998. Econometric Models and Economic Forecasts, 4th ed., Irwin

McGraw-Hill, Boston, MA.

Racicot, F.E., 2015. Engineering robust instruments for panel data regression models with errors

in variables: A note, Applied Economics, 47, 981-989.

Ross, S.A., 1976. The arbitrage theory of capital asset pricing, Journal of Economic Theory, 13,

341-360.

Rubinstein, M., 1973. The fundamental theorem of parameter-preference security valuation.

Journal of Financial and Quantitative Analysis, 8, 61-69.

Schmidheiny, K., 2015. Panel data: Fixed and random effects, Short Guides to

Microeconometrics, Universität Basel, Basel, Switzerland. Available at: http://www.schmidheiny.name/teaching/panel2up.pdf

Schopflocher, T.P., Sullivan, P.J., 2005. The relationship between skewness and kurtosis of a

diffusing scalar, Boundary-Layer Meteorology, 115, 341–358.

Shanken, J., 1992. On the estimation of beta-pricing models, Journal of Finance, 47, 1-34.

Sharpe, W.F., 1964. Capital asset prices: A theory of market equilibrium under conditions of

risk, Journal of Finance, 19, 425-442.

Spencer, D., Berk, K., 1981. A limited information specification test, Econometrica, 49, 1079-

1085.

Stock, J., Yogo, M., 2005. Testing for weak instruments in linear IV regression. In Identification

and Inference in Econometrics: Festschrift in Honor of Thomas Rothenberg (eds. Stock, J.,

Andrews, D.), Cambridge University Press, Cambridge, England, 80-108.

Stock, J., Watson, M., 2011. Introduction to Econometrics, 3rd ed., Pearson Education, Inc.,

Boston, MA.

Swamy, P., 1970. Efficient inference in a random coefficient regression model, Econometrica,

38, 311-323.

Theil, H., Goldberger, A.S., 1961. On pure and mixed estimation in economics, International

Economic Review, 2, 65-78.

Tobin, J., 1969. A general equilibrium approach to monetary theory, Journal of Money, Credit

and Banking, 1, 15-29.


White, H., 1980. A heteroscedasticity-consistent covariance matrix estimator and a direct test for

heteroscedasticity, Econometrica, 48, 817-838.

Wilkins, J.E., 1944. A note on skewness and kurtosis, The Annals of Mathematical Statistics, 15,

333-335.

Wooldridge, J.M., 2002. Econometric Analysis of Cross Section and Panel Data, MIT Press,

Cambridge, MA.


Table 1

Fama and French 12 sectors 1968m01 – 2014m12

               Mean  Median    Max     Min  Std. Dev.  Skewness  Kurtosis  Jarque-Bera
 1 Nodur       1.10    1.13  18.88  -21.03       4.40     -0.28      4.99       100.20
 2 Durbl       0.87    0.84  42.62  -32.63       6.45      0.12      7.81       544.55
 3 Manuf       0.97    1.27  21.08  -28.58       5.41     -0.51      5.64       188.14
 4 Enrgy       1.06    1.03  24.56  -18.33       5.57      0.02      4.25        36.81
 5 Chems       0.96    1.10  20.22  -24.59       4.72     -0.25      5.20       119.11
 6 Buseq       0.89    0.87  20.75  -26.07       6.68     -0.20      4.20        37.51
 7 Telcm       0.96    1.18  21.34  -16.22       4.75     -0.25      4.23        41.21
 8 Utils       0.91    0.96  18.84  -12.65       4.13     -0.12      4.01        25.37
 9 Shops       1.05    1.11  25.85  -28.25       5.34     -0.27      5.32       133.58
10 Hlth        1.07    1.11  29.52  -20.46       4.96      0.07      5.51       148.03
11 Money       1.02    1.33  21.10  -22.10       5.60     -0.42      4.64        79.21
12 Other       0.78    1.13  19.35  -29.24       5.54     -0.49      5.14       130.16
Panel data     0.97    1.10  42.62  -32.63       5.34     -0.21      5.68      2075.57

Notes: For each sector, there are 47 years x 12 months = 564 monthly observations for a total of 12 sectors x 564 = 6,768 for the panel data set.

Table 2

Fama-French (2015) and Pástor-Stambaugh risk factors 1968m01 - 2014m12

               Mean  Median    Max     Min  Std. Dev.  Skewness  Kurtosis  Jarque-Bera
Rm-Rf          0.49    0.86  16.10  -23.24       4.58     -0.52      4.78       100.05
SMB            0.20    0.04  19.05  -15.26       3.09      0.43      6.80       356.73
HML            0.38    0.34  13.89  -12.61       2.95      0.01      5.38       133.54
RMW            0.27    0.19  12.24  -17.60       2.20     -0.43     14.17      2951.29
CMA            0.37    0.25   8.93   -6.76       2.00      0.31      4.26        46.41
LIQ            0.43    0.20  21.46  -10.80       3.52      0.44      5.64       181.62

Notes: For each sector, there are 564 observations for a total of 6,768 for the panel data set. Here the panel data contain the 564 monthly

observations, repeated for each of the 12 sectors.
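As a reading aid for the last column of Tables 1 and 2, the sketch below recomputes the Jarque-Bera statistic from the reported skewness and kurtosis using the standard formula JB = n/6 · (S² + (K − 3)²/4). It assumes the Kurtosis column reports raw (not excess) kurtosis, which is consistent with the reported values; the function name is hypothetical. Small differences from the tables arise because the inputs are rounded to two decimals.

```python
def jarque_bera(n_obs, skewness, kurtosis):
    """Standard Jarque-Bera statistic from sample skewness and raw kurtosis."""
    return n_obs / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# 47 years x 12 months = 564 observations per sector; 12 sectors in the panel.
N_MONTHS = 47 * 12
N_PANEL = 12 * N_MONTHS

print(N_MONTHS, N_PANEL)                               # 564 6768
print(round(jarque_bera(N_MONTHS, -0.28, 4.99), 2))    # Nodur, reported 100.20
print(round(jarque_bera(N_MONTHS, -0.12, 4.01), 2))    # Utils, reported 25.37
print(round(jarque_bera(N_MONTHS, -0.52, 4.78), 2))    # Rm-Rf, reported 100.05
print(round(jarque_bera(N_PANEL, -0.21, 5.68), 2))     # Panel data, reported 2075.57
```

Under the null of normality the statistic is asymptotically chi-squared with two degrees of freedom, so every series in both tables departs strongly from normality, which is what motivates the higher-moment instruments.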


Table 3a

Fixed Effects Model, LSDV vs. GMMd estimation methods for the FF five-factor model by FF 12 sectors

Notes: LSDV is the least squares dummy variable method for estimating the alpha and beta for the FF 12 sectors. GMMd is the generalized

method of moments method using our robust distance instruments given in (34) with the Newey-West (1987) HAC variance-covariance

estimator. *** indicates significance at 1%; **, 5%; and *, 10%. R̄² is the adjusted coefficient of determination, and DW is the Durbin-Watson

statistic for autocorrelation of order 1. Hausd is the Hausman (1978) artificial regression test for measurement errors using our robust distance

instruments. Ɨ The pooled OLS and GMMd are equivalent to an average of the LSDV/GMMd estimations.

c Rm-Rf SMB HML RMW CMA R̄² DW

Sector Fama-French (2015)

1 NoDur LSDV -0.0025 0.8626

t-stat -1.90* 44.49***

GMMd -0.0045 0.9071

t-stat -0.50 6.15***

2 Durbl LSDV 0.0053 1.1039

t-stat 2.96*** 41.28***

GMMd -0.0122 0.8310

t-stat -0.95 3.77***

3 Manuf LSDV 0.0054 1.0702

t-stat 5.48*** 72.66***

GMMd 0.0060 1.0805

t-stat 1.32 17.61***

4 Enrgy LSDV 0.0000 0.9399

t-stat 0.00 29.11***

GMMd 0.0224 1.4261

t-stat 1.69* 7.42***

5 Chems LSDV 0.0011 0.9613

t-stat 0.86 50.37***

GMMd -0.0105 0.8892

t-stat -0.74 3.94***

6 BusEq LSDV 0.0097 1.0921

t-stat 5.97*** 44.84***

GMMd -0.0024 0.7924

t-stat -0.25 4.93***

7 Telcm LSDV -0.0038 0.8616

t-stat -2.27** 34.02***

GMMd -0.0099 0.5862

t-stat -0.97 3.29***

8 Utils LSDV -0.0110 0.7529

t-stat -6.26*** 28.62***

GMMd -0.0116 0.7717

t-stat -1.14 5.45***

9 Shops LSDV 0.0019 0.9510

t-stat 1.32 44.42***

GMMd 0.0007 0.9365

t-stat 0.09 8.12***

10 Hlth LSDV 0.0011 0.8947

t-stat 0.69 36.51***

GMMd -0.0032 0.9662

t-stat -0.24 4.22***

11 Money LSDV 0.0041 1.0494

t-stat 2.97*** 50.68***

GMMd 0.0069 1.0363

t-stat 0.96 8.75***

12 Other LSDV 0.0023 1.0382

t-stat 2.19** 66.93***

GMMd 0.0046 1.1466

t-stat 0.81 13.73***

Pooled Model

OLS -0.0492Ɨ 0.9928Ɨ 0.0216 0.1091 0.1671 0.0748 0.69 1.94

t-stat 1.02 35.82*** 1.67* 6.09*** 8.97*** 2.69***

GMMd -0.0614Ɨ 0.9374Ɨ 0.0906 -0.0016 0.3976 0.0901 0.67 1.93

t-stat -1.03 7.67*** 0.97 -0.01 2.04** 0.47

Hausd -0.0238Ɨ 0.9374Ɨ 0.0906 -0.0016 0.3976 0.0901 0.0542 -0.0670 0.0927 -0.2805 -0.0292 0.69 1.94

t-stat -0.99 11.66*** 1.53 -0.02 3.40*** 0.84 1.24 -1.10 1.03 -2.36** -0.26


Table 3b

Fixed Effects Model, LSDV vs. GMMd estimation methods for the augmented (LIQ) FF six-factor model by FF 12 sectors

Notes: LSDV is the least squares dummy variable method for estimating the alpha and beta for the FF 12 sectors. GMMd is the generalized

method of moments method using our robust distance instruments given in (34) with the Newey-West (1987) HAC variance-covariance

estimator. *** indicates significance at 1%; **, 5%; and *, 10%. R̄² is the adjusted coefficient of determination and DW is the Durbin-Watson

statistic for autocorrelation of order 1. Hausd is the Hausman (1978) artificial regression test for measurement errors using our robust distance

instruments. Ɨ The pooled OLS and GMMd are equivalent to an average of the LSDV/GMMd estimations.

c Rm-Rf SMB HML RMW CMA LIQ R̄² DW

Sector Fama-French (2015) and Pastor-Stambaugh (2003)

1 NoDur LSDV -0.0025 0.8626

t-stat -1.90* 44.49***

GMMd -0.0045 0.9071

t-stat -0.50 6.15***

2 Durbl LSDV 0.0053 1.1039

t-stat 2.96*** 41.28***

GMMd -0.0122 0.8310

t-stat -0.95 3.77***

3 Manuf LSDV 0.0054 1.0702

t-stat 5.48*** 72.66***

GMMd 0.0060 1.0805

t-stat 1.32 17.61***

4 Enrgy LSDV 0.0000 0.9399

t-stat 0.00 29.11***

GMMd 0.0224 1.4261

t-stat 1.69 7.42***

5 Chems LSDV 0.0011 0.9613

t-stat 0.86 50.37***

GMMd -0.0105 0.8892

t-stat -0.74 3.94***

6 BusEq LSDV 0.0097 1.0921

t-stat 5.97*** 44.84***

GMMd -0.0024 0.7924

t-stat -0.25 4.93***

7 Telcm LSDV -0.0038 0.8616

t-stat -2.27** 34.02***

GMMd -0.0099 0.5862

t-stat -0.97 3.29***

8 Utils LSDV -0.0110 0.7529

t-stat -6.26*** 28.62***

GMMd -0.0116 0.7717

t-stat -1.14 5.45***

9 Shops LSDV 0.0019 0.9510

t-stat 1.32 44.42***

GMMd 0.0007 0.9365

t-stat 0.09 8.12***

10 Hlth LSDV 0.0011 0.8947

t-stat 0.69 36.51***

GMMd -0.0032 0.9662

t-stat -0.24 4.22***

11 Money LSDV 0.0041 1.0494

t-stat 2.97*** 50.68***

GMMd 0.0069 1.0363

t-stat 0.96 8.75***

12 Other LSDV 0.0023 1.0382

t-stat 2.19** 66.93***

GMMd 0.0046 1.1466

t-stat 0.81 13.73***

Pooled Model

OLS -0.0535Ɨ 0.9932Ɨ 0.0218 0.1090 0.1669 0.0748 0.0098 0.69 1.94

t-stat -1.03 35.83*** 1.68* 6.08*** 8.96*** 2.69*** 0.95

GMMd -0.1653Ɨ 1.0148Ɨ -0.0501 0.1697 0.1345 0.2242 0.1152 0.65 1.89

t-stat -1.99** 7.77*** -0.39 0.79 0.52 1.20 1.34

Hausd -0.0256Ɨ 1.0148Ɨ -0.0501 0.1697 0.1345 0.2242 0.1152 -0.0232 0.0740 -0.0790 -0.0179 -0.1632 -0.1105 0.69 1.94

t-stat -0.99 11.76*** -0.61 1.52 0.85 1.86* 2.46** -1.16 0.89 -0.69 -0.11 -1.32 -2.30**


Table 4a

Random Effects Model, OLS vs. GMMd estimation methods for the FF five-factor model by FF 12 sectors

Notes: FGLS is calculated using (23) for the random coefficient model. t-stat is calculated first as a Swamy (1970) weighted

average of the OLS sector t-stats using (22) and then using the estimated Swamy variance-covariance matrix given by (25). GMMd

is the generalized method of moments using our robust distance instruments given in (34) with the Newey-West (1987)

HAC variance-covariance estimator for the random coefficient model. *** indicates significance at 1%; **, 5%; and *, 10%. R̄²

is the adjusted coefficient of determination and DW is the Durbin-Watson statistic for autocorrelation of order 1.

c Rm-Rf SMB HML RMW CMA R̄² DW

Sector Fama-French (2015)

1 NoDur OLS -0.0990 0.9100 0.0981 -0.0311 0.6417 0.4210 0.78 1.93

t-stat -1.08 41.74*** 3.15*** -0.72 14.36*** 6.32***

GMMd -0.1668 0.8790 0.2595 -0.0114 0.7132 0.4862 0.77 1.92

t-stat -1.33 8.32*** 1.27 -0.03 1.64* 1.68*

2 Durbl OLS -0.4367 1.2217 0.2181 0.5399 0.1692 0.0049 0.71 2.09

t-stat -2.79*** 32.95*** 4.12*** 7.39*** 2.23** 0.04

GMMd -0.4989 1.5280 -0.4635 1.5563 -0.9739 -0.0720 0.37 1.89

t-stat -1.60 6.43*** -0.90 1.45 -0.76 -0.13

3 Manuf OLS -0.1732 1.1424 0.1486 0.1724 0.2519 0.0255 0.89 2.03

t-stat -2.14** 59.48*** 5.42*** 4.56*** 6.40*** 0.43

GMMd -0.4176 1.3581 -0.1281 0.2938 0.0129 0.5937 0.80 1.94

t-stat -3.08*** 12.56*** -0.52 0.69 0.02 2.03**

4 Enrgy OLS 0.0744 0.9163 -0.1986 0.2310 0.0376 0.1657 0.46 1.90

t-stat 0.41 21.17*** -3.21*** 2.71*** 0.42 1.25

GMMd -0.0494 0.8585 0.1868 -1.2231 1.4670 0.8104 -0.12 1.70

t-stat -0.12 2.66*** 0.32 -1.45 1.28 1.13

5 Chems OLS -0.1995 1.0032 -0.0332 0.0266 0.4751 0.3349 0.79 2.04

t-stat -2.07** 43.83*** -1.02 0.59 10.13*** 4.79***

GMMd -0.3843 1.0222 0.0105 -0.1575 0.5345 0.9243 0.76 1.93

t-stat -2.33** 6.51*** 0.05 -0.52 1.29 2.62***

6 BusEq OLS 0.3866 1.0386 0.0854 -0.4199 -0.4180 -0.4395 0.83 2.01

t-stat 3.16*** 35.83*** 2.06** -7.35*** -7.04*** -4.97***

GMMd 0.4964 1.0529 -0.0379 0.0425 -0.7035 -0.9481 0.81 2.04

t-stat 2.65*** 7.21*** -0.14 0.09 -1.32 -2.27**

7 Telcm OLS 0.2114 0.8328 -0.2658 0.1101 -0.3185 0.0503 0.61 1.99

t-stat 1.59 26.40*** -5.90*** 1.77* -4.93*** 0.52

GMMd 0.6862 0.5372 -0.1448 0.5532 -0.3034 -1.3553 0.42 1.92

t-stat 3.34*** 3.84*** -0.49 1.25 -0.55 -2.73***

8 Utils OLS 0.0163 0.6564 -0.1648 0.4404 -0.0257 0.0717 0.45 1.95

t-stat 0.12 20.33*** -3.57*** 6.92*** -0.39 0.73

GMMd -0.0456 0.3437 0.3370 -0.2393 0.8909 0.4089 0.03 1.81

t-stat -0.16 1.36 0.75 -0.37 1.09 0.61

9 Shops OLS -0.1053 1.0279 0.2837 0.0045 0.5588 0.0749 0.80 1.85

t-stat -0.98 40.33*** 7.80*** 0.09 10.70*** 0.96

GMMd -0.1076 1.0142 0.3085 0.3837 0.4526 -0.2201 0.77 1.76

t-stat -0.49 5.49*** 1.23 0.76 0.78 -0.61

10 Hlth OLS 0.2070 0.8685 -0.1874 -0.4582 0.3768 0.3656 0.65 2.12

t-stat 1.58 28.02*** -4.24*** -7.50*** 5.93*** 3.86***

GMMd -0.1778 0.8985 0.0447 -0.8786 0.8722 1.3019 0.56 2.04

t-stat -0.97 5.89*** 0.14 -1.33 1.20 2.94***

11 Money OLS -0.1489 1.1757 -0.0308 0.5757 0.1178 -0.1812 0.84 1.89

t-stat -1.48 49.24*** -0.90 12.24*** 2.41** -2.48**

GMMd -0.0493 1.1164 -0.1781 0.6553 0.0337 -0.3112 0.83 1.78

t-stat -0.33 8.14*** -0.81 1.49 0.07 -1.06

12 Other OLS -0.3233 1.1206 0.3063 0.1181 0.1386 0.0033 0.91 1.97

t-stat -4.30*** 62.90*** 12.04*** 3.36*** 3.80*** 0.06

GMMd -0.4614 1.2232 0.1731 0.1156 0.2502 0.2327 0.89 1.92

t-stat -4.45*** 14.07*** 1.04 0.47 0.81 0.90

Random Effects Model: Swamy's weighted average

FGLS -0.0516 0.9928 0.0217 0.1093 0.1674 0.0741 0.73 1.98

t-stat (weighted avg) 0.62 38.56*** 1.34 1.89* 3.60*** 1.27

t-stat (Swamy) -0.75 21*** 0.38 1.16 1.79* 1.08

GMMd -0.0956 0.9859 0.0312 0.0909 0.2704 0.1544 0.58 1.89

t-stat (weighted avg) -0.62 6.87*** 0.17 0.21 0.47 0.42

t-stat (Swamy) -0.91 10.51*** 0.46 0.44 1.33 0.69


Table 4b

Random Effects Model, OLS vs GMMd estimation methods for the augmented (LIQ) FF six-factor model by FF 12 sectors

Notes: FGLS is calculated using (23) for the random coefficient model. t-stat is calculated first as a Swamy (1970) weighted

average of the OLS sector t-stats using (22) and then using the estimated Swamy variance-covariance matrix given by (25).

GMMd is the generalized method of moments using our robust distance instruments given in (34) with the Newey-West (1987)

HAC variance-covariance estimator for the random coefficient model. *** indicates significance at 1%; **, 5%; and *, 10%. R̄² is the adjusted coefficient of determination, and DW is the Durbin-Watson statistic for autocorrelation of order 1.

c Rm-Rf SMB HML RMW CMA LIQ R̄² DW

Sector Fama-French (2015) and Pastor-Stambaugh (2003)

1 NoDur OLS -0.1115 0.9111 0.0985 -0.0315 0.6411 0.4212 0.0281 0.78 1.92

t-stat -1.20 41.76*** 3.16*** -0.73 14.35*** 6.33*** 1.14

GMMd -0.3596 0.9800 0.0522 0.1747 0.3783 0.7256 0.2648 0.67 1.82

t-stat -1.48 5.26*** 0.16 0.42 0.65 1.57 1.55

2 Durbl OLS -0.4743 1.2249 0.2192 0.5387 0.1675 0.0053 0.0845 0.72 2.02

t-stat -3.02*** 33.10*** 4.15*** 7.40*** 2.21** 0.05 2.02**

GMMd -1.0225 1.7739 -1.0884 2.1724 -2.0290 0.4932 0.8474 -0.74 1.78

t-stat -1.32 3.18*** -0.96 1.24 -0.89 0.40 1.38

3 Manuf OLS -0.1877 1.1436 0.1490 0.1719 0.2513 0.0256 0.0328 0.89 2.03

t-stat -2.30** 59.56*** 5.44*** 4.55*** 6.39*** 0.44 1.51

GMMd -0.5326 1.3985 -0.3152 0.5242 -0.3415 0.6645 0.2637 0.64 1.85

t-stat -1.72* 6.27*** -0.72 0.77 -0.39 1.26 1.10

4 Enrgy OLS 0.0467 0.9187 -0.1978 0.2301 0.0363 0.1660 0.0624 0.47 1.87

t-stat 0.25 21.23*** -3.20*** 2.70*** 0.41 1.26 1.27

GMMd 0.1995 0.7029 0.4206 -1.4060 1.8204 0.4497 -0.2548 0.00 1.68

t-stat 0.30 1.44 0.42 -1.10 1.00 0.42 -0.56

5 Chems OLS -0.2052 1.0036 -0.0331 0.0264 0.4748 0.3349 0.0128 0.79 2.02

t-stat -2.11** 43.78*** -1.01 0.59 10.12*** 4.79*** 0.49

GMMd -0.3527 0.9971 0.0431 -0.1892 0.5884 0.8752 -0.0228 0.76 1.96

t-stat -1.57 5.10*** 0.14 -0.50 1.03 2.15** -0.15

6 BusEq OLS 0.3762 1.0395 0.0857 -0.4202 -0.4185 -0.4394 0.0234 0.83 1.99

t-stat 3.05*** 35.82*** 2.07** -7.36*** -7.04*** -4.96*** 0.71

GMMd 0.3000 1.1705 -0.2669 0.2753 -1.0946 -0.7021 0.2530 0.71 1.98

t-stat 0.93 5.70*** -0.59 0.42 -1.27 -1.35 1.09

7 Telcm OLS 0.1978 0.8339 -0.2654 0.1096 -0.3192 0.0504 0.0305 0.61 1.96

t-stat 1.48 26.41*** -5.89*** 1.76* -4.94*** 0.52 0.85

GMMd 0.7119 0.5282 -0.0605 0.4142 -0.1145 -1.3413 -0.0960 0.37 1.87

t-stat 2.43** 2.75*** -0.14 0.70 -0.14 -2.21** -0.48

8 Utils OLS -0.0146 0.6590 -0.1639 0.4394 -0.0272 0.0721 0.0695 0.48 1.94

t-stat -0.11 20.44*** -3.56*** 6.92*** -0.41 0.73 1.90*

GMMd -0.1939 0.4174 0.1457 -0.0325 0.5525 0.5663 0.2407 0.24 1.86

t-stat -0.64 1.59 0.30 -0.05 0.61 0.93 1.04

9 Shops OLS -0.1227 1.0294 0.2842 0.0040 0.5579 0.0751 0.0391 0.80 1.90

t-stat -1.13 40.38*** 7.82*** 0.08 10.69*** 0.97 1.35

GMMd -0.3114 1.1150 0.0924 0.5706 0.1093 0.0275 0.2922 0.66 1.81

t-stat -0.88 4.03*** 0.22 1.03 0.14 0.05 1.44

10 Hlth OLS 0.2469 0.8651 -0.1886 -0.4569 0.3787 0.3651 -0.0899 0.67 2.09

t-stat 1.89* 28.03*** -4.28*** -7.52*** 5.99*** 3.88*** -2.57**

GMMd -0.0743 0.8456 0.1457 -0.9557 1.0233 1.1712 -0.1390 0.54 2.03

t-stat -0.24 4.09*** 0.29 -1.22 1.01 2.34** -0.49

11 Money OLS -0.1076 1.1723 -0.0320 0.5770 0.1198 -0.1817 -0.0930 0.84 1.93

t-stat -1.07 49.53*** -0.95 12.39*** 2.47** -2.52** -3.47***

GMMd 0.0687 1.0551 0.0276 0.3810 0.4393 -0.3959 -0.2368 0.79 1.91

t-stat 0.25 4.54*** 0.07 0.72 0.60 -0.87 -0.92

12 Other OLS -0.2866 1.1175 0.3052 0.1193 0.1404 0.0029 -0.0828 0.91 2.02

t-stat -3.84*** 63.58*** 12.17*** 3.45*** 3.90*** 0.05 -4.16***

GMMd -0.4166 1.1934 0.2026 0.1076 0.2820 0.1567 -0.0303 0.90 1.92

t-stat -3.20*** 11.17*** 0.97 0.40 0.75 0.59 -0.32

Random Effects Model: Swamy's weighted average

FGLS -0.0544 0.9934 0.0217 0.1091 0.1671 0.0750 0.0093 0.73 1.97

t-stat (weighted avg) 1.11 38.38*** 1.30 2.19** 3.35*** 0.81 0.65

t-stat (Swamy) -0.79 21.08*** 0.38 1.16 1.78* 1.10 0.52

GMMd -0.1595 1.0140 -0.0504 0.1720 0.1338 0.2227 0.1170 0.52 1.87

t-stat (weighted avg) -0.45 4.58*** 0.01 0.25 0.25 0.44 0.40

t-stat (Swamy) -1.24 9.47*** -0.46 0.68 0.47 1.07 1.31


Table 5

Testing fixed effects versus random effects models

                 5 factors                 6 factors
          Pool/FE     GMMd/FE       Pool/FE     GMMd/FE
F test    29.88       18.56         29.64       16.79
          OLS RE/FE   GMMd RE/FE    OLS RE/FE   GMMd RE/FE
H test    0.00074     -2.04         0.0031      -0.0053

Notes: F test is a Fisher F test for testing the pooled versus the fixed effects models.

Pool/FE designates the pooled OLS versus LSDV fixed effects models. GMMd/FE

designates the pooled GMMd estimation method versus the fixed effects model

estimated via GMMd. H test is the Hausman test for testing fixed versus random

effects models. OLS RE/FE designates the FGLS for the random effects versus

the LSDV models. GMMd RE/FE designates the GMMd estimation method for the

random effects versus the fixed effects models.
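For readers who want to see how an F test of pooled versus fixed effects models can be set up, here is a minimal sketch using the textbook restricted-versus-unrestricted form, with pooled OLS as the restricted model and LSDV as the unrestricted one. It is an assumption that this matches the exact computation behind Table 5. The function names and the simulated data are hypothetical; the critical value 1.54 is the one quoted in footnote 37.

```python
import numpy as np

F_CRITICAL = 1.54  # critical value quoted in footnote 37


def ssr(y, X):
    """Sum of squared OLS residuals of y regressed on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def pooled_vs_fixed_effects_F(y, X, groups):
    """F statistic for H0: all group intercepts are equal (pooled model).

    y: (NT,) returns; X: (NT, K) factors without a constant;
    groups: (NT,) integer labels (e.g. sector index).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n_obs, k = X.shape
    n_groups = labels.size
    # Restricted model: one common intercept.
    X_pooled = np.column_stack([np.ones(n_obs), X])
    # Unrestricted model (LSDV): one dummy per group, no common intercept.
    dummies = (groups[:, None] == labels[None, :]).astype(float)
    X_lsdv = np.column_stack([dummies, X])
    ssr_pooled, ssr_lsdv = ssr(y, X_pooled), ssr(y, X_lsdv)
    df_num, df_den = n_groups - 1, n_obs - n_groups - k
    return ((ssr_pooled - ssr_lsdv) / df_num) / (ssr_lsdv / df_den)


# Simulated illustration (not the paper's data): 12 groups, 50 periods,
# clearly different intercepts across groups.
rng = np.random.default_rng(0)
n_groups_demo, n_periods_demo = 12, 50
groups_demo = np.repeat(np.arange(n_groups_demo), n_periods_demo)
x_demo = rng.normal(size=(n_groups_demo * n_periods_demo, 1))
alpha_demo = np.linspace(-2.0, 2.0, n_groups_demo)[groups_demo]
y_demo = alpha_demo + 0.9 * x_demo[:, 0] + rng.normal(size=groups_demo.size)
F_demo = pooled_vs_fixed_effects_F(y_demo, x_demo, groups_demo)
print(F_demo > F_CRITICAL)
```

All four F statistics in Table 5 (29.88, 18.56, 29.64 and 16.79) exceed the quoted critical value, so the pooled model is rejected against fixed effects in every case.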
