The Limitations of Industry Concentration Measures Constructed
with Compustat Data: Implications for Finance Research

Ashiq Ali
School of Management, University of Texas at Dallas

Sandy Klasa
Department of Finance, University of Arizona

Eric Yeung
J. M. Tull School of Accounting, University of Georgia

Industry concentration measures calculated with Compustat data,
which cover only the public firms in an industry, are poor proxies
for actual industry concentration. These measures have correlations
of only 13% with the corresponding U.S. Census measures, which are
based on all public and private firms in an industry. Also, only
when U.S. Census measures are used is there evidence consistent with
theoretical predictions that more-concentrated industries, which
should be more oligopolistic, are populated by larger and fewer
firms with higher price-cost margins. Further, the significant
relations of Compustat-based industry concentration measures with
the dependent variables of several important prior studies are not
obtained when U.S. Census measures are used. One of the reasons for
this occurrence is that Compustat-based measures proxy for industry
decline. Overall, our results indicate that product markets research
that uses Compustat-based industry concentration measures may lead
to incorrect conclusions. (JEL G10, G30, L10)

A growing number of studies that consider the effects of product
markets on financial economics-related phenomena use industry
concentration measures calculated with Compustat data, which cover
only the public firms in an industry. These studies examine issues
related to asset pricing (Hou and Robinson 2006), informed trading
(Tookes 2008), idiosyncratic stock return volatility (Gaspar and
Massa 2006), mergers and acquisitions (Song and Walkling 2000; Fee
and Thomas 2004; Shahrur 2005), corporate governance (DeFond and
Park 1999; Engel, Hayes, and Wang 2003; Rennie 2006; Karuna 2007),
capital structure (Lang and Stulz 1992; Kale and Shahrur 2007),
corporate disclosure policy
We appreciate helpful comments from two anonymous referees,
Michelle Sovinsky Goeree, Eric Kelley, Chris Lamoureux, Bill
Maxwell, Hernan Ortiz-Molina, David Robinson, Janet Smith, Matthew
Spiegel, Mark Trombley, and Harold Zhang. We also thank Ed Altman
for providing us with the Altman-NYU Salomon Center Bankruptcy list.
Send correspondence to Sandy Klasa, Department of Finance, Eller
College of Management, University of Arizona, Tucson, AZ
85721-0108; telephone: (520) 621-8761; fax: (520) 621-1261.
E-mail: [email protected].
© The Author 2008. Published by Oxford University Press on
behalf of The Society for Financial Studies. All rights reserved.
For Permissions, please e-mail: [email protected].
doi:10.1093/rfs/hhn103
Advance Access publication December 23, 2008
Downloaded from http://rfs.oxfordjournals.org/ at Univ of Southern California on April 17, 2014
The Review of Financial Studies / v 22 n 10 2009
(Harris 1998; Botosan and Harris 2000; Botosan and Stanford
2005; Rogers and Stocken 2005; Verrecchia and Weber 2006),
income-increasing accounting choices (Zmijewski and Hagerman 1981),
and the determinants of corporate earnings (Cheng 2005).

We consider the empirical implications of using industry
concentration measures that are based on only a firm's publicly
traded rivals. To do so, we compare Compustat-based industry
concentration measures with industry concentration measures
collected from 1963–2002 Census of Manufactures publications
provided by the U.S. Census Bureau, which are based on all public
and private firms in an industry. The Census of Manufactures data
have also been used to examine the effect of product market factors
on a wide spectrum of finance issues: corporate takeover decisions
(Eckbo 1985, 1992; Maksimovic and Phillips 2001), capital
structure decisions (Phillips 1995; Kovenock and Phillips 1997;
MacKay and Phillips 2005; Campello 2006), corporate investment
patterns (Akdogu and MacKay 2008), chief executive officer (CEO)
compensation contracts (Aggarwal and Samwick 1999), and risk
management decisions (Haushalter, Klasa, and Maxwell 2007). These
studies argue that it is preferable to use concentration measures
calculated by the U.S. Census because the measures based on
Compustat data are subject to measurement error due to the
exclusion of private firms, which often account for a nonnegligible
percentage of industry sales. MacKay and Phillips (2005, p. 1439)
point out that because industry concentration measures calculated
by the U.S. Census are used by regulatory agencies such as the
Department of Justice, these measures are likely to be the most
appropriate to study product market issues.

Our empirical evidence indicates that Compustat-based industry
concentration measures are poor proxies for actual industry
concentration. The correlation between the Compustat and U.S.
Census-based Herfindahl indexes is only 13%. Moreover, U.S.
Census-based concentration measures are positively related to
industry price-cost margins and to firm size measures such as net
sales, total assets, and market capitalization. However, these
relations are not obtained using Compustat-based industry
concentration measures. Further, we show that the total number of
private and public firms in an industry markedly drops between the
lowest and highest quintiles of U.S. Census-based industry
concentration measures. In contrast, this number changes very
little if Compustat-based industry concentration measures are used
instead. Thus, only when U.S. Census-based industry concentration
measures are used are the results consistent with theoretical
predictions that more-concentrated industries, which should be more
oligopolistic, are populated by fewer and larger firms that enjoy
higher price-cost margins due to their greater market power.

Next, we use the U.S. Census data to reexamine several important
results obtained in prior studies that use Compustat-based industry
concentration measures. First, we consider the Hou and Robinson
(2006) finding that firms in more-concentrated industries earn lower
future stock returns. They argue that
their results indicate that barriers to entry in highly
concentrated industries insulate firms from undiversifiable distress
risk, which is priced in equity returns. Hou and Robinson (2006)
also report that firms in less-concentrated industries spend more on
research and development. They contend that this result supports the
Schumpeter (1912) proposition that innovation, which is a form of
creative destruction, is more likely to occur in competitive
industries.¹ Also, they posit that higher innovation risk in
less-concentrated industries contributes to the higher cost of
equity capital in such industries. In contrast, we document that
industry concentration measures calculated by the U.S. Census are
not related to future stock returns. Further, we find that the U.S.
Census measures are positively rather than negatively associated
with research and development expenses. These differences in
results are not driven by our sample being confined to the
manufacturing sector, because using Compustat-based industry
concentration measures and limiting our analysis to this sector, we
are able to replicate the Hou and Robinson (2006) findings.

Second, we reexamine the Lang and Stulz (1992) result that the
effect of bankruptcy announcements on the equity values of
competitors is more positive in more-concentrated industries and
that this effect is amplified in industries with low leverage. Lang
and Stulz (1992) argue that in industries that are more concentrated
and have lower leverage, competitors are more likely to benefit from
the difficulties faced by a bankrupt firm. We obtain the same
results as in Lang and Stulz (1992) with our sample of manufacturing
firms when we use Compustat-based industry concentration measures.
However, using U.S. Census-based industry concentration measures, we
are unable to replicate the Lang and Stulz (1992) findings.

Third, we reexamine the Harris (1998) result that firms are less
likely to provide segment disclosures for operations in
more-concentrated industries, measured using Compustat data. She
argues that to protect their abnormal profits and market share,
firms in less-competitive industries are less likely to disclose
commercially valuable information to competitors. We obtain the same
results as in Harris (1998) with our sample of manufacturing firms
when we use Compustat-based industry concentration measures.
However, we find that the decision to provide segment disclosures
for operations in a particular industry is not associated with the
U.S. Census-based industry concentration measures of that
industry.

Finally, we consider the DeFond and Park (1999) result that CEO
turnover is negatively associated with Compustat-based industry
concentration measures. They argue that in more-competitive
industries in which there is greater homogeneity across firms and
in which CEOs are likely to have more peers, it is easier to
identify and replace poorly performing CEOs. We obtain results
1 In contrast to his earlier prediction, Schumpeter (1942)
claims that there is more innovation in less-competitive industries
because firms in such industries can enjoy economic profits
resulting from their innovation, instead of having these profits
competed away.
similar to those in DeFond and Park (1999) with our sample of
manufacturing firms when we use Compustat-based industry
concentration measures. However, we find an insignificant
relationship between CEO turnover and U.S. Census-based industry
concentration measures.

Our finding that, using U.S. Census-based industry concentration
measures, we are unable to replicate the Hou and Robinson (2006),
Lang and Stulz (1992), Harris (1998), and DeFond and Park (1999)
results suggests that Compustat-based industry concentration
measures capture other industry characteristics that happen to be
correlated with the dependent variables of these studies. To provide
evidence on this issue, we examine what drives the Hou and Robinson
(2006) result that Compustat-based industry concentration
measures are negatively related to future stock returns and the
Harris (1998) finding that firms are less likely to provide segment
disclosures for operations in industries with higher values for
Compustat-based measures.

We find that Compustat-based industry concentration measures are
significantly negatively related to the change in industry
shipments reported by the Census of Manufactures during the prior
five years. However, U.S. Census-based industry concentration
measures are not related to past shipment growth. Thus, for some
reason other than the actual concentration of an industry,
industries with high Compustat-based measures experience poor
growth in the recent past. An explanation for these findings is that
a declining industry is left with only a few large, public firms
relative to private firms. Consequently, there are only a few
companies in the Compustat database for the industry, and this
results in high Compustat-based industry concentration values.
Consistent with this explanation, we find a significant negative
relationship between the Compustat-based industry concentration
measures and the change over the prior five years in the number of
firms in an industry included in both the Center for Research in
Security Prices (CRSP) and Compustat databases.²

Our finding that industries with high Compustat-based industry
concentration measures tend to be declining industries explains
why these industries spend less on research and development, as
reported in Hou and Robinson (2006) and confirmed in our study.
Given that prior work suggests that firms that spend more on
research and development have higher future stock returns (e.g.,
Chan, Lakonishok, and Sougiannis 2001; Chambers, Jennings, and
Thompson 2002; Eberhart, Maxwell, and Siddique 2004), we examine
whether the association between Compustat-based industry
concentration measures and future stock returns is sensitive to
controlling for research and development expenses. After
controlling for current research and development expenses, which we
find are positively related to future stock returns, the negative
association between Compustat-based industry concentration measures
and future stock
2 To construct Compustat-based industry concentration measures,
studies such as that by Hou and Robinson (2006) often require that
firms are included on both CRSP and Compustat.
returns becomes insignificant. This finding suggests that the
negative relationship between research and development expenses
and Compustat-based industry concentration measures drives the
negative association between these measures and future stock
returns.

Next, we show that after controlling for prior growth in
industry shipments, the negative relationship documented by Harris
(1998) between a firm's decision to provide segment disclosures
of its operations in an industry and the Compustat-based
concentration measures of that industry becomes insignificant.
Further, we find that a firm's decision to provide segment
disclosures for its operations in an industry is positively related
to that industry's prior shipment growth. The latter result is
consistent with the Miller (2002) and Kothari, Shu, and Wysocki
(forthcoming) evidence that firms with weak (strong) prior
operating performance provide less (more) informative disclosures.
Overall, these findings suggest that the Harris (1998) result is
driven by Compustat-based industry concentration measures proxying
for the prior performance of a firm in one of its segments, which in
turn determines the firm's decision to provide a separate disclosure
for that segment.

Our study makes the following contributions. First, we document
that Compustat-based industry concentration measures, which exclude
data on private firms, are poor measures of actual industry
concentration. Second, we show that researchers who use Compustat
data to construct industry concentration measures can arrive at
results that lead to incorrect conclusions. Finally, our findings
suggest that the significant results obtained in prior studies
that use Compustat-based industry concentration measures could be
due to these measures proxying for other industry characteristics
that are correlated with the dependent variables of these
studies.

The remainder of this article is organized as follows. Section 1
describes the Compustat- and U.S. Census-based industry
concentration measures used in the study. Section 2 provides
evidence that indicates that Compustat-based industry
concentration measures are poor proxies for actual industry
concentration. Section 3 reexamines results of four prior studies
that use Compustat-based industry concentration measures. Section 4
concludes.

1. Description of Compustat- and U.S. Census-based industry
concentration measures

1.1 Compustat-based industry concentration measures

Compustat-based industry concentration measures are calculated using
the sales data of firms included in the Compustat database. Because
Compustat excludes private firms, Compustat-based industry
concentration measures can potentially provide an inaccurate picture
of the actual degree of concentration in an industry. In particular,
in industries in which private firms account for a nonnegligible
percentage of industry sales, it is problematic to rely on data
that exclude these firms (Hay and Morris 1991, p. 210). However,
there are two advantages of using Compustat data to construct
industry concentration measures. First, such measures can easily be
calculated by extracting from Compustat total sales for each firm in
a particular industry. Therefore, they can provide a long and
continuous time series of concentration measures. Second, using the
Compustat database to calculate industry concentration measures
allows researchers to construct these measures for a wide
spectrum of industries.

The Compustat-based Herfindahl index is calculated by adding the
squares of the sales market shares of all the firms in an industry
that have sales data on Compustat. Similarly, the Compustat-based
four-firm ratio is calculated by adding the sales market shares of
the four largest firms in an industry in terms of market share. We
refer to the Compustat-based Herfindahl index and four-firm ratio as
HI-Compustat and FFR-Compustat.
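
As a minimal sketch of these two definitions (not the authors' code), the following Python functions compute a Herfindahl index and a four-firm ratio from a list of firm sales for one industry-year. The function names and the sales figures are hypothetical and are used only for illustration.

```python
def market_shares(sales):
    """Return each firm's share of total industry sales."""
    total = sum(sales)
    if total <= 0:
        raise ValueError("total industry sales must be positive")
    return [s / total for s in sales]


def herfindahl_index(sales):
    """Sum of the squared sales market shares of all firms in the industry."""
    return sum(share ** 2 for share in market_shares(sales))


def four_firm_ratio(sales):
    """Sum of the sales market shares of the four largest firms."""
    top_four = sorted(market_shares(sales), reverse=True)[:4]
    return sum(top_four)


# Hypothetical sales for the firms of one industry-year (not from the paper).
example_sales = [50.0, 30.0, 10.0, 5.0, 5.0]
hi = herfindahl_index(example_sales)   # 0.25 + 0.09 + 0.01 + 0.0025 + 0.0025
ffr = four_firm_ratio(example_sales)   # 0.50 + 0.30 + 0.10 + 0.05
```

When the list holds four or fewer firms, `four_firm_ratio` returns 1 (up to floating-point rounding) by construction, which is how an industry with few firms on Compustat reaches a four-firm ratio of 1.000.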

For the univariate results presented in this and the second
section of the article, HI-Compustat and FFR-Compustat are
calculated in a manner similar to that in Hou and Robinson (2006).
HI-Compustat and FFR-Compustat are calculated using the sales market
shares of all the firms in an industry with sales data on Compustat
and are averaged over the past three years. Industry is defined
using historical CRSP Standard Industrial Classification (SIC)
codes.³,⁴ For the reexamination of the Hou and Robinson (2006), Lang
and Stulz (1992), Harris (1998), and DeFond and Park (1999) results,
we calculate Compustat-based industry concentration measures using
the methodology employed by the specific study.

1.2 U.S. Census-based industry concentration measures

The Census of Manufactures publications provided by the U.S. Census
Bureau report concentration ratios for hundreds of industries in the
manufacturing sector. We hand-collect data on the U.S. Census-based
Herfindahl index and four-firm concentration ratio from Census of
Manufactures publications for the years 1963, 1966, 1967, 1970,
1972, 1977, 1982, 1987, 1992, 1997, and 2002. The data are for
four-digit SIC industries (SIC codes between 2000 and 3999) for the
years 1963–1992 and for six-digit North American Industry
Classification System (NAICS) industries (NAICS codes between 311111
and 339999) for the years 1997 and 2002.⁵
3 Kahle and Walkling (1996) report that over long sample periods
there are advantages to using historical CRSP SIC codes instead of
Compustat SIC codes. Further, because over the past fifty years the
U.S. Census Bureau has revised the SIC system a number of times, it
is advantageous to use historical CRSP SIC codes when constructing
Compustat-based industry concentration measures.
4 Most work that uses Compustat- or U.S. Census-based industry
concentration measures assumes that a firm competes in only the
industry represented by the industry classification code assigned
to the firm. Because the aim of this article is to compare results
obtained with these two types of industry concentration measures,
we make the same assumption.
5 For nonmanufacturing industries, the alternative to using
Compustat data and consequently excluding data on private firms to
construct industry concentration measures is to use concentration
measures collected from industry-specific publications. Work that
uses such measures examines issues such as the interaction
between

Unlike Compustat-based industry concentration measures, U.S.
Census-based measures are constructed using data from all public
and private firms in an industry and hence should better capture
actual industry concentration. The use of U.S. Census-based measures
by government regulatory agencies suggests that these measures
should be quite reliable. For instance, these measures are often
used by the Federal Trade Commission when it decides whether to
challenge mergers on antitrust grounds. Another factor that
suggests the U.S. Census-based industry concentration measures
should be reliable is that all firms in the United States are
required by federal law to respond to U.S. Census surveys (under
Title 13 of the U.S. Code). Further, Sections 213 and 224 of Title
13 of the U.S. Code state that employees of the U.S. Census who
collect data on its behalf and who knowingly furnish false
information are subject to imprisonment and that agents of companies
who willfully provide false answers to questions about their company
are subject to hefty fines.

The Census of Manufactures calculates the Herfindahl index of an
industry as the sum of the squares of the individual company market
shares of all the companies in an industry or the fifty largest
companies in the industry, whichever is fewer. The four-firm ratio
of an industry is the sum of the market shares of the four largest
firms in the industry in terms of market share. We refer to these
measures as HI-Census and FFR-Census, respectively.

The Census of Manufactures is published only during years when a
U.S. Census takes place. We use the U.S. Census data for a given
year as a proxy for industry concentration not only for that year
but also for the one or two years immediately before and after it.
This approach is similar to that used in several prior studies
(e.g., Aggarwal and Samwick 1999; MacKay and Phillips 2005; Campello
2006; Haushalter, Klasa, and Maxwell 2007). Table 1 provides
information on the time periods to which we apply a given year's
U.S. Census data. For example, we use the 1992 Census of Manufactures
data as a proxy for industry concentration for the period 1990–1994.
Data for FFR-Census are available in all Census of Manufactures
publications, resulting in a sample period of 1963–2005. However,
data for HI-Census are available only from Census of Manufactures
publications from 1982 on, resulting in a sample period of
1980–2005.
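
The assignment of sample years to Census years can be written as a simple lookup. The sketch below transcribes the periods listed in Table 1; the constant and function names are hypothetical, and the code is an illustration rather than the authors' procedure.

```python
# (census_year, first_sample_year, last_sample_year), transcribed from Table 1.
CENSUS_PERIODS = [
    (1963, 1963, 1964),
    (1966, 1965, 1966),
    (1967, 1967, 1968),
    (1970, 1969, 1970),
    (1972, 1971, 1974),
    (1977, 1975, 1979),
    (1982, 1980, 1984),
    (1987, 1985, 1989),
    (1992, 1990, 1994),
    (1997, 1995, 1999),
    (2002, 2000, 2005),
]


def census_year_for(sample_year):
    """Return the Census of Manufactures year applied to a sample year."""
    for census_year, first, last in CENSUS_PERIODS:
        if first <= sample_year <= last:
            return census_year
    raise ValueError(f"{sample_year} is outside the 1963-2005 sample period")


# The 1992 Census data proxy for concentration in each year of 1990-1994.
years_using_1992 = [y for y in range(1963, 2006) if census_year_for(y) == 1992]
```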

1.3 Descriptive statistics of U.S. Census- and Compustat-based
industry concentration measures

Table 2 provides descriptive statistics of the industry
concentration measures. Panel A compares HI-Census and HI-Compustat
for four-digit SIC industries
industry concentration (collected from the annual publication
Supermarket News Distribution Study of Grocery Store Sales) and
capital structure decisions in the supermarket industry (Chevalier
1995a, 1995b; Chevalier and Scharfstein 1996), the relationship
between industry concentration (collected from the American
Trucking Association) and firm survival after deregulation of the
trucking industry (Zingales 1998), and how industry concentration
(collected from Discount Merchandiser) interacts with ownership
structure, capital structure, and corporate focus in the discount
department industries (Khanna and Tice 2000).

Table 1
Sample periods to which a particular year's Census of Manufactures
data on industry concentration are applied

Years for which the Census of    Sample periods to which Census of Manufactures
Manufactures reports data        data on industry concentration are applied
1963                             1963–1964
1966                             1965–1966
1967                             1967–1968
1970                             1969–1970
1972                             1971–1974
1977                             1975–1979
1982                             1980–1984
1987                             1985–1989
1992                             1990–1994
1997                             1995–1999
2002                             2000–2005

The Census of Manufactures is published by the U.S. Census Bureau.

Table 2
Descriptive statistics of industry concentration measures

                 Mean    Median  STD     20%     40%     60%     80%

Panel A: 1980–2005 sample period
Industry at four-digit SIC level
HI-Census        0.064   0.043   0.062   0.015   0.032   0.058   0.104
HI-Compustat     0.696   0.714   0.278   0.410   0.596   0.857   1.000

Panel B: 1963–2005 sample period
Industry at four-digit SIC level
FFR-Census       0.382   0.350   0.208   0.195   0.290   0.411   0.560
FFR-Compustat    0.969   1.000   0.079   0.970   1.000   1.000   1.000

Panel C: 1995–2005 sample period
Industry at four-digit SIC level
HI-Census        0.061   0.040   0.063   0.012   0.027   0.053   0.103
FFR-Census       0.368   0.327   0.217   0.159   0.272   0.402   0.559

HI-Census is the Herfindahl index for four-digit SIC industries
as reported by the Census of Manufactures. The 1997 and 2002 U.S.
Censuses define industry at the six-digit NAICS level. Over the
1995–2005 period, we use HI-Census values for six-digit NAICS
industries to calculate HI-Census values for four-digit SIC
industries by weighting the HI-Census values of component six-digit
NAICS industries by the square of their share of the broader
four-digit SIC industry. To determine the component six-digit NAICS
industries of a broader four-digit SIC industry, we use NAICS
correspondence tables provided by the U.S. Census. HI-Compustat is
the sum of the squares of the sales market shares of all firms in a
CRSP four-digit SIC industry that have sales data on Compustat. For
each year t, HI-Compustat is averaged over a three-year period, from
year t - 2 to year t. FFR-Census is the sum of the market shares of
the four largest firms in terms of market share in a four-digit
SIC industry as defined by the Census of Manufactures. Over the
1995–2005 period, FFR-Census values for six-digit NAICS industries
are used to approximate FFR-Census for broader four-digit SIC
industries by first determining which component six-digit NAICS
industry of a broader four-digit SIC industry has the largest sales
as measured by the sales of its top four firms. Next, we divide the
sales of the top four firms of this six-digit NAICS industry by the
total sales of the firms in all the component six-digit NAICS
industries within the broader four-digit SIC industry.
FFR-Compustat is the sum of the market shares of the four largest
firms in terms of market share in a CRSP four-digit SIC industry. A
firm's market share is measured as sales divided by total sales of
all CRSP firms in that industry that have sales data on Compustat.
For each year t, FFR-Compustat is averaged over a three-year period
from year t - 2 to year t. Descriptive statistics for industry
concentration measures are calculated by pooling Compustat- and U.S.
Census-based industry-year observations. Observations used to
calculate HI-Census or HI-Compustat are taken from all years within
specific sample periods.
over the 1980–2005 period. The difference between these two
measures is striking. The mean values of HI-Census and
HI-Compustat are 0.064 and 0.696, respectively. Panel B compares
FFR-Census and FFR-Compustat for four-digit SIC industries over the
1963–2005 period. The mean values of FFR-Census and FFR-Compustat
are 0.382 and 0.969, respectively. The large differences between the
values of the U.S. Census- and Compustat-based concentration
measures indicate that on average private firms account for a
significant percentage of industry sales and that it is therefore
problematic to exclude data on these firms when calculating industry
concentration measures. The fortieth percentile value for
FFR-Compustat is 1.000, showing that the majority of four-digit SIC
industries have four or fewer firms with sales data available on
Compustat.

As shown in Table 1, for the period 1995–2005, we use
concentration ratios from the 1997 and 2002 Census of Manufactures
publications in which industry is defined using six-digit NAICS
codes. HI-Census for six-digit NAICS industries and the total
shipments for these industries reported in the Census of
Manufactures can be used to calculate HI-Census for broader
four-digit SIC industries. We do this by weighting HI-Census of the
component six-digit NAICS industries by the square of their share of
the shipments of the broader four-digit SIC industry.⁶,⁷ To
calculate FFR-Census for four-digit SIC industries using
FFR-Census of component six-digit NAICS industries, we use an
approximation method. We first determine the component six-digit
NAICS industry of a broader four-digit SIC industry that has the
largest value for the sales of its top four firms. Next, we divide
the sales of the top four firms of this six-digit NAICS industry by
the total sales of all the component six-digit NAICS industries
within the broader four-digit SIC industry.⁸

Panel C of Table 2 provides descriptive statistics for the
1995–2005 period for HI-Census and for FFR-Census at the four-digit
SIC level. The statistics for these measures are very similar to
those reported in panels A and B. These results suggest that our
methods for converting the six-digit NAICS level concentration
measures to four-digit SIC level measures are reasonable.
6 For instance, if there are J six-digit NAICS industries that
belong to one four-digit SIC industry, the six-digit NAICS
Herfindahl index values are weighted by the square of the shares of
each component six-digit NAICS industry. Thus, if the broader
four-digit SIC industry is called p, then the Herfindahl index of
industry p is

$$ HHI_p = \sum_{j=1}^{J} w_j^2 \, HHI_j, \quad \text{where} \quad w_j = \frac{\text{total shipments of component 6-digit NAICS industry } j}{\text{total shipments of broader 4-digit SIC industry } p}. $$
7 To determine the component six-digit NAICS industries of a broader
four-digit SIC industry, we use NAICS correspondence tables provided
by the U.S. Census.
8 We refer to this method as an approximation method because it
is possible that a component six-digit NAICS industry that does not
have the largest sales as measured by the sales of its top four
firms has a firm whose sales are greater than those of at least one
of the top four firms of the component six-digit NAICS industry with
the largest value for the sales of its top four firms.

2. Evidence that Compustat-based industry concentration measures
are poor proxies for actual industry concentration

Table 3 reports firm and industry characteristics for quintiles
sorted by HI-Census and HI-Compustat. The panel A quintiles are
based on HI-Census. The median value of HI-Census for quintile 1 is
0.009 and for quintile 5 is 0.153, about fifteen times larger.
However, the corresponding values of HI-Compustat are quite similar,
0.659 and 0.891, respectively. The panel B quintiles are based on
HI-Compustat. The median value of HI-Compustat for quintile 1 is
0.311 and for quintile 5 is 1.000, about three times larger.
However, the corresponding values of HI-Census are quite similar:
0.041 and 0.059, respectively. These results suggest that the
exclusion of data on private firms not only leads to large
differences between Compustat- and U.S. Census-based concentration
measures but also leads to a low correlation between the two types
of measures.

Next, we determine the relation of industry markups with
HI-Census and HI-Compustat. Industry markups represent average
price-cost margins in an industry. Industrial organization theory
predicts that in more-concentrated industries there is less
intense competition and price is consequently set further away from
marginal cost. Thus, a positive relation is expected between
industry concentration and price-cost margins. We follow Allayannis
and Ihrig (2001) and calculate industry markups using aggregate
industry-level data from Annual Survey of Manufactures
publications. We also use their definition for industry markups,
which is as follows:

$$ \text{Industry Markup} = \frac{\text{Value of Sales} + \text{Inventories} - \text{Payroll} - \text{Cost of Materials}}{\text{Value of Sales} + \text{Inventories}}. $$

We collect annual industry data at the four-digit SIC level from
1993, 1994, 1995, and 1996 Annual Survey of Manufactures
publications and calculate industry markups for the period from 1993
to 1996. For this period, we form quintiles based on HI-Census and
HI-Compustat and calculate median industry markups for each of the
quintiles.
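
A minimal sketch of this calculation is given below. It assumes that quintiles are equal-count groups formed after sorting industries by the concentration measure; the text does not state how breakpoints or ties are handled, so that choice, the function names, and the data are all hypothetical. The markup function takes the inventories term exactly as it appears in the definition above.

```python
from statistics import median


def industry_markup(sales, inventories, payroll, materials):
    """Price-cost margin: (sales + inventories - payroll - materials)
    divided by (sales + inventories)."""
    denominator = sales + inventories
    return (denominator - payroll - materials) / denominator


def median_markup_by_quintile(pairs):
    """Median markup within each concentration quintile.

    pairs: list of (concentration, markup) tuples, one per industry.
    Industries are sorted by concentration and split into five
    equal-count groups (an assumption of this sketch).
    """
    if len(pairs) < 5:
        raise ValueError("need at least five industries to form quintiles")
    ordered = sorted(pairs, key=lambda pair: pair[0])
    n = len(ordered)
    medians = []
    for q in range(5):
        group = ordered[q * n // 5:(q + 1) * n // 5]
        medians.append(median(markup for _, markup in group))
    return medians


# Hypothetical data: ten industries whose markup rises with concentration.
example = [(0.01 * i, 0.30 + 0.01 * i) for i in range(1, 11)]
quintile_medians = median_markup_by_quintile(example)
example_markup = industry_markup(100.0, 10.0, 30.0, 40.0)  # (110 - 70) / 110
```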

The results in panel A of Table 3 show that industry markups are
higher in industries with higher values of HI-Census. In industries
that are in the highest quintile of HI-Census, industry markups are
almost 25% larger than they are in industries in the lowest quintile
of HI-Census. In contrast, panel B shows that industry markups are
not systematically related to HI-Compustat. These results suggest
that U.S. Census-based measures are better proxies for actual
industry concentration than are Compustat-based measures.
Next, we compute for each quintile the median number of public
and private firms per industry, based on U.S. Census data,
Nfirms-Census. Panel A reports that for HI-Census-based quintiles 1
and 5, the median Nfirms-Census values are 1385 and 88,
respectively. Panel B shows that for HI-Compustat-based
Table 3
Firm characteristics of portfolios sorted by industry concentration measures

                                                                              Nfirms-Compustat
                                   Industry                                   as a percentage of             Market
Quintile  HI-Census  HI-Compustat  Markup    Nfirms-Census  Nfirms-Compustat  Nfirms-Census       Net Sales  Capitalization  Book Assets

Panel A: Quintiles based on HI-Census
1         0.009      0.659         0.299     1385           2                 0.19%               217        94              152
2         0.025      0.680         0.314     518            3                 0.56%               304        141             264
3         0.045      0.590         0.315     356            4                 0.82%               298        181             253
4         0.079      0.744         0.321     164            2                 1.46%               379        188             318
5         0.153      0.891         0.369     88             2                 2.20%               911        651             691

Panel B: Quintiles based on HI-Compustat
1         0.041      0.311         0.319     535            9                 1.79%               388        200             328
2         0.042      0.516         0.313     423            4                 1.09%               274        138             206
3         0.044      0.706         0.305     413            3                 0.78%               306        178             272
4         0.046      0.929         0.351     287            2                 0.69%               470        225             387
5         0.059      1.000         0.328     211            1                 0.55%               304        187             219

The sample period is 1980-2005. Median values are reported.
HI-Census is the Herfindahl index for four-digit SIC industries as
reported by the Census of Manufactures. The 1997 and 2002 U.S.
Censuses define industry at the six-digit NAICS level. Over the
1995-2005 period, we use HI-Census values for six-digit NAICS
industries to calculate HI-Census values for four-digit SIC
industries by weighting the HI-Census values of component six-digit
NAICS industries by the square of their share of the broader
four-digit SIC industry. To determine what are the component
six-digit NAICS industries of a broader four-digit SIC industry, we
use NAICS correspondence tables provided by the U.S. Census.
HI-Compustat is the sum of the squares of the sales market shares of
all firms in a CRSP four-digit SIC industry that have sales data on
Compustat. For each year t, HI-Compustat is averaged over a
three-year period from year t - 2 to year t. Industry Markup is
calculated using data collected from the 1993, 1994, 1995, and 1996
Annual Survey of Manufactures publications and is calculated as
(Value of Sales + Inventories - Payroll - Cost of Materials) /
(Value of Sales + Inventories). For the analysis of Industry Markup,
we form HI-Census- and HI-Compustat-based quintiles over the
1993-1996 period. Nfirms-Census is the number of firms per
four-digit SIC industry as reported by the Census of Manufactures.
Over the 1995-2005 period, we use Nfirms-Census values for six-digit
NAICS industries to calculate Nfirms-Census values for four-digit
SIC industries by summing the Nfirms-Census values of component
six-digit NAICS industries in a broader four-digit SIC industry.
Nfirms-Compustat is the total number of CRSP firms in a four-digit
SIC industry that are included on Compustat. Net Sales is defined
as net sales in millions in year t. Market Capitalization is defined
as market value of equity in millions at the end of year t. Book
Assets is the book value of total assets at the end of year t. Net
Sales, Market Capitalization, and Book Assets are inflation
adjusted. Descriptive statistics are calculated by pooling
firm-year observations.
quintiles 1 and 5, the median Nfirms-Census values are 535 and
211, respectively. Thus, the difference in Nfirms-Census between
quintiles 1 and 5 is far greater when the quintiles are based on
HI-Census than when they are based on HI-Compustat. Given that
more-concentrated industries, which are presumably less
competitive, should be populated with a smaller number of firms, the
above results further suggest that HI-Census is a better indicator
of true industry concentration than is HI-Compustat.
Table 3 also reports for each of the quintiles the median value
of Nfirms-Compustat, defined as the number of firms per industry
that have data available on CRSP and Compustat. This number
represents the number of firms per industry used to calculate
Compustat-based industry concentration measures. In all cases,
median Nfirms-Compustat is markedly lower than is median
Nfirms-Census. For example, for quintile 1 of panel A, median
Nfirms-Census is 1385, but median Nfirms-Compustat is only 2. This
finding suggests that an important contributing factor to why
HI-Compustat could be a poor indicator of industry concentration is
that this measure is based on partial data. Furthermore, median
Nfirms-Compustat does not vary systematically across the HI-Census
quintiles, suggesting that there is no relationship between actual
industry concentration and Nfirms-Compustat. However, panel B shows
that median Nfirms-Compustat markedly decreases from
HI-Compustat-based quintiles 1 to 5. This result indicates that
Compustat-based industry concentration measures are more likely to
proxy for the number of firms in an industry that are covered by
CRSP and Compustat than proxy for true industry concentration.
Additionally, Table 3 presents for each of the quintiles the
median percentage of firms in an industry reported by the U.S.
Census that are included on CRSP and Compustat (Nfirms-Compustat as
a percentage of Nfirms-Census). Panel A shows that this percentage
increases substantially from HI-Census-based quintiles 1 to 5. This
is consistent with the expectation that in more-concentrated
industries there should be a greater percentage of large, public
firms, the type that are likely to be included on both CRSP and
Compustat. Panel B, in contrast, shows that the percentage of U.S.
Census firms that are included on CRSP and Compustat does not
increase from HI-Compustat-based quintiles 1 to 5. Instead, this
percentage decreases over these quintiles. This finding provides
further support to the notion that Compustat-based industry
concentration measures are poor indicators of actual industry
concentration.
Next, we examine how firm size varies across quintiles sorted by
HI-Census and by HI-Compustat. We consider three measures of size:
net sales, market capitalization, and book assets. Each of these
measures is inflation adjusted. Data for these variables are
obtained from Compustat. Thus, the median values of each of the firm
size variables reported in Table 3 are not the medians across all
U.S. Census firms but are the medians for the subset of firms
covered by the CRSP and Compustat databases. Consequently,
inferences based on these variables should be viewed with caution.
Panel A shows that median net sales (market capitalization, book
assets) for firms in the HI-Census-based quintiles
Table 4
Correlations between industry concentration measures and industry markup

                 HI-Census      HI-Compustat   Industry Markup
HI-Census                       0.129 (0.000)  0.220 (0.000)
HI-Compustat     0.111 (0.000)                 0.016 (0.586)
Industry Markup  0.183 (0.000)  0.019 (0.519)

The sample period is 1980-2005. This table presents Pearson
(above the diagonal) and Spearman (below the diagonal) correlations
among selected variables. Numbers in parentheses are significance
levels. HI-Census is the Herfindahl index for four-digit SIC
industries as reported by the Census of Manufactures. The 1997 and
2002 U.S. Censuses define industry at the six-digit NAICS level.
Over the 1995-2005 period, we use HI-Census values for six-digit
NAICS industries to calculate HI-Census values for four-digit SIC
industries by weighting the HI-Census values of component
six-digit NAICS industries by the square of their share of the
broader four-digit SIC industry. To determine what are the
component six-digit NAICS industries of a broader four-digit SIC
industry, we use NAICS correspondence tables provided by the U.S.
Census. HI-Compustat is the sum of the squares of the sales
market shares of all firms in a CRSP four-digit SIC industry that
have sales data on Compustat. For each year t, HI-Compustat is
averaged over a three-year period from year t - 2 to year t.
Industry Markup is calculated using data collected from the 1993,
1994, 1995, and 1996 Annual Survey of Manufactures publications and
is calculated as (Value of Sales + Inventories - Payroll - Cost of
Materials) / (Value of Sales + Inventories). Correlations involving
Industry Markup are calculated over the 1993-1996 period.
1 and 5 are $217m ($94m, $152m) and $911m ($651m, $691m),
respectively. These results indicate that, consistent with
theoretical expectations, in less-concentrated industries, which
are likely to be more competitive, firm size is substantially
smaller than it is in highly concentrated, less-competitive
industries. Panel B shows that median net sales (market
capitalization, book assets) for firms in HI-Compustat-based
quintiles 1 and 5 are $388m ($200m, $328m) and $304m ($187m, $219m),
respectively. These findings show that firm size is actually smaller
in the highest HI-Compustat quintile than it is in the lowest
quintile, further suggesting that the Compustat-based industry
concentration measures are poor proxies for actual industry
concentration.
Table 4 reports correlations between HI-Census, HI-Compustat,
and Industry Markup. The table first confirms the result in Table 3
that the correlation between HI-Census and HI-Compustat is low.
Specifically, the Spearman and Pearson correlations between the two
variables are only 0.111 and 0.129.⁹ The table also confirms the
results in Table 3 that the correlation between HI-Census and
Industry Markup is positive and that HI-Compustat and Industry
Markup are not systematically related. These results are
further evidence that
9 We find that the Spearman and Pearson correlations between
HI-Census and HI-Compustat for the period from 1980 to 1994 are
0.126 and 0.135, respectively. These results rule out the
possibility that the low correlation between HI-Census and
HI-Compustat reported in Table 4 is due to the fact that for the
period from 1995 to 2005, we convert HI-Census values for six-digit
NAICS industries into values for four-digit SIC industries.
Table 5
Industries sorted by U.S. Census-based Herfindahl index values

Two-digit SIC code  Industry name                     HI-Census  HI-Compustat
24                  Lumber & Wood Products            0.027      0.691
23                  Apparel                           0.029      0.731
34                  Fabricated Metal Products         0.037      0.724
25                  Furniture and Fixtures            0.039      0.697
39                  Miscellaneous Manufacturing       0.048      0.753
27                  Printing and Publishing           0.053      0.617
35                  Machinery and Computer Equipment  0.054      0.648
30                  Rubber and Plastics               0.054      0.596
29                  Petroleum Refining                0.058      0.636
38                  Measuring Instruments             0.059      0.590
26                  Paper                             0.060      0.700
33                  Primary Metal                     0.063      0.659
22                  Textile Product Mills             0.063      0.787
36                  Electronic Equipment              0.078      0.709
20                  Food and Kindred Products         0.078      0.763
32                  Stone, Clay, Glass, and Concrete  0.087      0.793
28                  Chemicals                         0.091      0.630
31                  Leather                           0.092      0.738
37                  Transportation Equipment          0.096      0.669
21                  Tobacco                           0.153      0.837

This table reports mean values of HI-Census and HI-Compustat for
four-digit SIC industries within a particular two-digit SIC
industry. The industries are listed in ascending order of
HI-Census. HI-Census is the Herfindahl index for four-digit SIC
industries as reported by the Census of Manufactures. The 1997 and
2002 U.S. Censuses define industry at the six-digit NAICS level, and
consequently for these two years we use HI-Census values for
six-digit NAICS industries to calculate HI-Census values for
four-digit SIC industries by weighting the HI-Census values of
component six-digit NAICS industries by the square of their share
of the broader four-digit SIC industry. To determine what are the
component six-digit NAICS industries of a broader four-digit SIC
industry, we use NAICS correspondence tables provided by the U.S.
Census. HI-Compustat is the sum of the squares of the sales market
shares of all firms in a CRSP four-digit SIC industry that have
sales data on Compustat. For each year t, HI-Compustat is averaged
over a three-year period from year t - 2 to year t. The observations
used to calculate mean HI-Census or HI-Compustat for two-digit SIC
industries are taken only from years when a U.S. Census takes place
(1982, 1987, 1992, 1997, and 2002).
Compustat-based industry concentration measures are not good
proxies for actual industry concentration.¹⁰
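For readers who want to reproduce the kind of comparison reported in Table 4, the following standard-library Python sketch computes a Pearson correlation and a Spearman correlation (the Pearson correlation of average ranks). The helper names and the five-point demonstration series are hypothetical; the sketch does not compute significance levels.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical series: monotonic but nonlinear relation.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [1.0, 4.0, 9.0, 16.0, 25.0]
print(pearson(x_demo, y_demo), spearman(x_demo, y_demo))
```

Reporting both statistics, as Table 4 does, guards against a low linear correlation that merely reflects a nonlinear but monotonic relation between the two measures.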
In Table 5, we report average values of HI-Census and
HI-Compustat for four-digit SIC industries within particular
two-digit SIC industry groups. These data allow us to provide
information on what the typical Herfindahl index value is for a
four-digit SIC industry within a broader two-digit SIC industry.
Also, this way of reporting our findings makes it easier to
comprehend the data given the large number of four-digit SIC
industries within the manufacturing sector. The industries are
listed in ascending order of HI-Census. There is a large difference
between HI-Census and HI-Compustat for every industry. Further, our
results show that although HI-Census increases significantly as one
moves
10 We also examine the correlation between industry markups and
the four-firm ratios. We find that the Pearson and Spearman
correlations between Industry Markup and FFR-Census are positive
and significant and equal 0.198 and 0.179, respectively. However,
the Spearman and Pearson correlations between Industry Markup and
FFR-Compustat are negative and significant and equal -0.082 and
-0.123, respectively.
from less- to more-concentrated industries based on this
measure, there is no systematic variation in HI-Compustat along
these industries, reflecting the low correlation between the two
measures.
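The table notes describe converting HI-Census values for component industries into a value for a broader industry by weighting each component's index by the square of its share of the broader industry. A minimal sketch of that weighting is given below. It assumes that "share" means share of sales and that no firm operates in more than one component; the function name and the figures are hypothetical.

```python
def aggregate_herfindahl(components):
    """Combine component-industry Herfindahl indexes into a broader index.

    components: list of (component_sales, component_hi) pairs.
    Each component HI is weighted by the square of the component's share
    of total sales in the broader industry.
    """
    total = sum(sales for sales, _ in components)
    if total <= 0:
        raise ValueError("total sales must be positive")
    return sum((sales / total) ** 2 * hi for sales, hi in components)


# Hypothetical broader industry made of two component industries.
broad_hi = aggregate_herfindahl([(300.0, 0.05), (100.0, 0.20)])
print(f"Aggregated HI: {broad_hi:.6f}")
```

Under the stated assumptions the weighting is exact rather than approximate: a firm's share of the broader industry equals its share of its component times the component's share, so squaring and summing over firms yields the squared-share-weighted sum of the component indexes.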
3. Reexamination of results obtained in prior studies that use
Compustat-based industry concentration measures
Results that we have presented so far suggest that U.S.
Census-based industry concentration measures are superior to
Compustat-based industry concentration measures in measuring
actual industry concentration and that there is a very low
correlation between the two measures. It is therefore important to
examine whether the results of prior empirical studies that use
Compustat-based industry concentration measures are robust to the
use of U.S. Census-based industry concentration measures. We
reexamine the results in Hou and Robinson (2006); Lang and Stulz
(1992); Harris (1998); and DeFond and Park (1999). We reexamine
these four papers for the following reasons. First, we are able to
collect the necessary data to replicate these studies. Second, the
issues addressed in these studies are from four different
finance-related areas: asset pricing, capital structure, corporate
disclosure policy, and corporate governance.
3.1 Hou and Robinson (2006)
Hou and Robinson (2006) find that firms in more-concentrated
industries earn lower future stock returns. They argue that barriers
to entry in highly concentrated industries insulate firms from
undiversifiable distress risk, which is priced in equity returns. We
examine the sensitivity of their results to using U.S. Census-based
industry concentration measures in place of the Compustat
measures.
Following Hou and Robinson (2006), we estimate firm-level
Fama-MacBeth cross-sectional regressions of the model in their panel
B of Table 4. Specifically, we regress future monthly stock returns
from July of year t + 1 to June of year t + 2 on HI-Compustat and
other firm characteristics. As did Hou and Robinson (2006), we
calculate HI-Compustat at the three-digit SIC level and measure
it as the mean value of HI-Compustat over the three prior years,
t - 2, t - 1, and t. Hou and Robinson (2006) use the sample period
from 1963 to 2001. Because data on HI-Census from the Census of
Manufactures are available starting in 1982 and, as shown in Table
1, the earliest year to which we can apply these data is 1980, we
study the period from 1980 to 2001. Also, data on HI-Census are
available only for manufacturing firms. So that we can make better
comparisons, we examine only manufacturing firms when estimating
results with the HI-Compustat measure.
The regression results in the first column of Table 6 document a
significant negative relationship between HI-Compustat and future
stock returns. This result is similar to that reported in Hou and
Robinson (2006) and is the basis of their conclusion that firms in
more-concentrated industries earn lower returns
Table 6
The relationship between stock returns and industry concentration

                                1               2               3               4
Intercept                       1.408 (3.70)    1.518 (4.45)    2.170 (6.67)    1.840 (6.92)
HI-Compustat                    -0.342 (-1.93)
HI-Census                                       0.115 (0.24)
FFR-Compustat                                                   -0.342 (-2.00)
FFR-Census                                                                      0.054 (0.47)
Ln(Market Capitalization)       0.014 (0.24)    0.016 (0.29)    0.113 (2.65)    0.111 (2.61)
Ln(B/M)                         0.258 (2.57)    0.299 (3.22)    0.220 (2.99)    0.218 (2.95)
Momentum                        0.660 (3.26)    0.644 (3.59)    0.573 (3.28)    0.570 (3.27)
Beta                            0.028 (0.16)    0.098 (0.61)    0.128 (0.97)    0.124 (0.94)
Leverage                        0.025 (0.09)    0.059 (0.23)    0.072 (0.35)    0.053 (0.26)
Average adjusted R-square       0.032           0.032           0.049           0.049
Average number of observations  991             991             936             936

This table presents results from Fama-MacBeth cross-sectional
regressions explaining monthly stock returns, estimated monthly
between July 1980 and June 2001 for models 1 and 2 and between July
1963 and June 2001 for models 3 and 4. Monthly returns are the
one-year-ahead twelve monthly returns from July of year t + 1 to
June of year t + 2. Industry concentration ratios and accounting
data are measured at the end of year t. HI-Census in this table
represents the Herfindahl index for three-digit SIC industries
calculated using data collected from Census of Manufactures
publications. Prior to 1997, the U.S. Census defines industry using
four-digit SIC codes, while from 1997 onward industry is defined
using six-digit NAICS codes. Over the 1980-1994 period, we use
HI-Census values for four-digit SIC industries to calculate
HI-Census values for three-digit SIC industries by weighting the
HI-Census values of component four-digit SIC industries by the
square of their share of the broader three-digit SIC industry. Over
the 1995-2001 period, we use HI-Census values for six-digit NAICS
industries to calculate HI-Census values for three-digit SIC
industries by weighting the HI-Census values of component six-digit
NAICS industries by the square of their share of the broader
three-digit SIC industry. To determine what are the component
six-digit NAICS industries of a broader three-digit SIC industry, we
use NAICS correspondence tables provided by the U.S. Census.
HI-Compustat is the sum of the squares of the sales market shares
of all firms in a CRSP three-digit SIC industry that have sales data
on Compustat. For each year t, HI-Compustat is averaged over a
three-year period from year t - 2 to year t. FFR-Census in this
table represents the four-firm ratio for four-digit SIC industries
calculated using data collected from Census of Manufactures
publications. Over the 1995-2001 period, FFR-Census values for
six-digit NAICS industries are used to approximate FFR-Census for
broader four-digit SIC industries by first determining which
component six-digit NAICS industry of a broader four-digit SIC
industry has the largest sales as measured by the sales of its top
four firms. Next, we divide the sales of the top four firms of this
six-digit NAICS industry by the total sales of the firms in all the
component six-digit NAICS industries within the broader four-digit
SIC industry. FFR-Compustat is the sum of the market shares of the
four largest firms in terms of market share in a CRSP four-digit
SIC industry. A firm's market share is measured as sales divided by
total sales of all CRSP firms in that industry that have sales data
on Compustat. For each year t, FFR-Compustat is averaged over a
three-year period from year t - 2 to year t. Ln(Market
Capitalization) is defined as the natural logarithm of the market
value of equity in millions at the end of year t. Ln(B/M) is defined
as the natural logarithm of book value of equity divided by market
value of equity at the end of year t. Momentum for each month is
prior one-year stock returns. Beta is market model beta estimated
using the prior thirty-six monthly equally weighted CRSP index
returns. Leverage is defined as total debt divided by market value
of total assets (i.e., market value of equity plus debt) at the end
of year t and is trimmed at the 1% level. Time-series average values
of the monthly regression coefficients are reported with time-series
t-statistics in parentheses. ***, **, and * represent significance
at the 1%, 5%, and 10% levels, respectively, for a two-tailed test.
because these industries are less risky. The second column of
the table presents the regression results of the same model as in
the first column, except that the three-digit SIC level
HI-Compustat is replaced with the three-digit SIC level HI-Census.
We find that future stock returns are not associated with
HI-Census. Thus, this result does not support the Hou and Robinson
(2006) conclusion that industry concentration is related to future
stock returns.
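The Fama-MacBeth procedure behind Table 6 estimates one cross-sectional regression per period and then reports the time-series average of the coefficients with a time-series t-statistic. The sketch below is a deliberately simplified, single-regressor version using only the standard library; the regressions in Table 6 include several firm characteristics. The function names and the three demonstration periods are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def fama_macbeth(periods):
    """periods: list of (x, y) cross-sections, one per period.

    Returns the time-series mean of the period slopes and its t-statistic,
    computed as mean / (standard deviation / sqrt(number of periods)).
    """
    slopes = [ols_slope(x, y) for x, y in periods]
    avg = mean(slopes)
    se = stdev(slopes) / sqrt(len(slopes))
    return avg, avg / se


# Three hypothetical periods: returns (y) regressed on concentration (x).
demo_periods = [
    ([0.0, 1.0, 2.0], [0.0, -1.0, -2.0]),
    ([0.0, 1.0, 2.0], [0.0, -2.0, -4.0]),
    ([0.0, 1.0, 2.0], [0.0, -3.0, -6.0]),
]
avg_slope, t_stat = fama_macbeth(demo_periods)
print(avg_slope, t_stat)
```

Because the standard error comes from the dispersion of the period-by-period estimates, the t-statistic reflects how stable the cross-sectional relation is over time rather than the fit within any single period.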
The third and fourth columns of Table 6 provide results of
regression models that are similar to the models in the first two
columns of this table, except that the four-firm ratio is used as
the measure of industry concentration rather than the Herfindahl
index. This analysis provides evidence on the sensitivity of our
conclusions to the use of an alternative definition of industry
concentration. Another benefit of this analysis is that we have data
on FFR-Census for the 1963-2001 period, which is the sample period
used in Hou and Robinson (2006). We use the four-firm ratio at the
four-digit SIC level because FFR-Census is reported at the
four-digit SIC level and converting this measure to the three-digit
SIC level involves approximation.¹¹ The results show that
FFR-Compustat is significantly negatively associated with future
stock returns. In contrast, there is no association between
FFR-Census and future stock returns, further suggesting that the
Hou and Robinson (2006) conclusion that industry concentration is
related to future stock returns may not be valid.¹²,¹³
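The four-firm ratio used in columns 3 and 4 is the combined market share of the four largest firms. The sketch below, with a hypothetical industry and an illustrative function name, shows how the ratio behaves when only a few firms are observed.

```python
def four_firm_ratio(sales):
    """Combined sales market share of the four largest firms."""
    total = sum(sales)
    if total <= 0:
        raise ValueError("total sales must be positive")
    return sum(sorted(sales, reverse=True)[:4]) / total


# Hypothetical industry: 2 public firms and 48 small private firms.
public_sales = [60.0, 40.0]
private_sales = [10.0] * 48

ffr_all_firms = four_firm_ratio(public_sales + private_sales)
ffr_public_only = four_firm_ratio(public_sales)
print(f"FFR, all firms:   {ffr_all_firms:.3f}")
print(f"FFR, public only: {ffr_public_only:.3f}")
```

Whenever four or fewer firms enter the calculation, the ratio is mechanically equal to one, whatever the true structure of the industry. This matters for a Compustat-based ratio because the median number of Compustat firms per industry in Table 3 ranges from one to nine.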
Hou and Robinson (2006) also document that firms in
less-concentrated industries have higher research and development
expenses. They posit that this result is consistent with the
Schumpeter (1912) proposition that innovation as a form of creative
destruction is more likely to occur in competitive industries. Also,
they argue that if innovation risk is priced into stock returns and
this risk is higher in more-competitive industries, it contributes
to the higher cost of equity capital in such industries.
We examine whether the relationship between industry
concentration and research and development expenses is sensitive to
using Compustat or U.S. Census data to measure industry
concentration. The model we estimate is from panel B of Table 2 of
Hou and Robinson (2006), which relates industry concentration to
certain industry characteristics. The dependent variables for the
models in columns 1 and 2 of Table 7 are HI-Compustat and HI-Census,
and the sample period is 1980-2001.
11 Our results remain qualitatively the same when we use
FFR-Compustat for three-digit industries and approximate FFR-Census
for three-digit SIC industries, using a methodology similar to that
discussed in Section 1.3 of the article. Further, if HI-Compustat
and HI-Census are defined at the four-digit SIC level instead of
the three-digit SIC level, as in the first two columns of Table 6,
the results remain qualitatively the same. We also examine whether
our conclusions are sensitive to defining industry at the six-digit
NAICS level. Since the U.S. Census measures for six-digit NAICS
industries are available for 1997 and 2002 and we can apply these
data for the period from 1995 to 2001, we use this sample period in
our analysis. We continue to find that future stock returns are
significantly negatively associated with HI-Compustat and
FFR-Compustat, but are not significantly associated with HI-Census
and FFR-Census.
12 We also examine the sensitivity of the results to applying a
given year's U.S. Census industry concentration data to the
surrounding one or two years. We estimate the models in columns 2
and 4 of Table 6 only for those years to which the U.S. Census data
belong and find that the coefficients on HI-Census and FFR-Census
remain insignificant.
13 In footnote 6 of their paper, Hou and Robinson (2006) report
that for a smaller sample of observations their univariate analysis
results suggest that U.S. Census-based industry concentration
measures are negatively related to future stock returns. However, as
we have shown in this article, our finding that these measures are
not significantly related to future stock returns is quite
robust.
Table 7
The relationship between Compustat- and U.S. Census-based industry
concentration measures with industry average research and
development expenses and industry averages of other firm
characteristics

                                1                2                3                4
Dependent variable              HI-Compustat     HI-Census        FFR-Compustat    FFR-Census
Intercept                       0.782 (12.03)    0.024 (4.23)     0.980 (48.54)    0.087 (6.25)
R&D expense/book assets         -1.574 (-4.17)   0.269 (4.98)     -0.232 (-6.28)   0.734 (8.16)
Ln(Market Capitalization)       -0.021 (-7.03)   0.016 (15.37)    -0.004 (-10.36)  0.060 (27.13)
Earnings/book assets            0.607 (2.13)     0.081 (2.69)     0.065 (2.99)     0.124 (1.44)
Dividends/book equity           0.544 (1.80)     0.069 (1.09)     0.170 (3.80)     0.355 (3.20)
Ln(B/M)                         0.037 (2.23)     0.010 (2.82)     0.004 (1.61)     0.036 (6.62)
Beta                            0.072 (3.16)     0.011 (3.04)     0.010 (6.79)     0.013 (1.99)
Leverage                        0.089 (1.75)     0.013 (1.16)     0.019 (3.10)     0.073 (4.27)
Average adjusted R-square       0.057            0.134            0.005            0.137
Average number of observations  118              118              269              269

Models 1 and 2 present results from Fama-MacBeth cross-sectional
regressions estimated using annual data from 1980-2001. Models 3 and
4 present results from Fama-MacBeth cross-sectional regressions
estimated using annual data from 1963 to 2001. Industry
concentration ratios and firm characteristics are measured at the
end of year t. HI-Census in this table represents the Herfindahl
index for three-digit SIC industries calculated using data collected
from Census of Manufactures publications. Prior to 1997, the U.S.
Census defines industry using four-digit SIC codes, while from 1997
onward industry is defined using six-digit NAICS codes. Over the
1980-1994 period, we use HI-Census values for four-digit SIC
industries to calculate HI-Census values for three-digit SIC
industries by weighting the HI-Census values of component
four-digit SIC industries by the square of their share of the
broader three-digit SIC industry. Over the 1995-2001 period, we use
HI-Census values for six-digit NAICS industries to calculate
HI-Census values for three-digit SIC industries by weighting the
HI-Census values of component six-digit NAICS industries by the
square of their share of the broader three-digit SIC industry. To
determine what are the component six-digit NAICS industries of a
broader three-digit SIC industry, we use NAICS correspondence tables
provided by the U.S. Census. HI-Compustat is the sum of the squares
of the sales market shares of all firms in a CRSP three-digit SIC
industry that have sales data on Compustat. For each year t,
HI-Compustat is averaged over a three-year period from year t - 2 to
year t. FFR-Census in this table represents the four-firm ratio for
four-digit SIC industries calculated using data collected from
Census of Manufactures publications. Over the 1995-2001 period,
FFR-Census values for six-digit NAICS industries are used to
approximate FFR-Census for broader four-digit SIC industries by
first determining which component six-digit NAICS industry of a
broader four-digit SIC industry has the largest sales as measured
by the sales of its top four firms. Next, we divide the sales of the
top four firms of this six-digit NAICS industry by the total sales
of the firms in all the component six-digit NAICS industries within
the broader four-digit SIC industry. FFR-Compustat is the sum of the
market shares of the four largest firms in terms of market share in
a CRSP four-digit SIC industry. A firm's market share is measured as
sales divided by total sales of all CRSP firms in that industry that
have sales data on Compustat. For each year t, FFR-Compustat is
averaged over a three-year period from year t - 2 to year t. All
independent variables are industry averages of firm-level
characteristics. Ln(Market Capitalization) is defined as the natural
logarithm of the market value of equity in millions at the end of
year t. Ln(B/M) is defined as the natural logarithm of book value
of equity divided by market value of equity at the end of year t.
Beta is market model beta estimated using the prior thirty-six
monthly equally weighted CRSP index returns. Leverage is defined as
total debt divided by market value of total assets (i.e., market
value of equity plus debt) at the end of year t. R&D expense/book
assets, Earnings/book assets, and Dividends/book equity are as of
year t and are trimmed at the 1% level. Leverage is also trimmed at
the 1% level. Time-series average values of the monthly regression
coefficients are reported with time-series t-statistics in
parentheses. ***, **, and * represent significance at the 1%, 5%,
and 10% levels, respectively, for a two-tailed test.
The first column of Table 7 shows the replication of the Hou and
Robinson (2006) finding that HI-Compustat is significantly
negatively associated with research and development expenses. The
second column presents estimates of the model that replaces the
dependent variable HI-Compustat with HI-Census. We find a
significant positive association between HI-Census and research
and
development expenses. If U.S. Census-based industry
concentration measures are more appropriate measures of actual
industry concentration, then this result indicates that innovation
risk is actually higher in more-concentrated industries. This
finding is consistent with the claim made in Schumpeter (1942) that
there is more innovation in less-competitive industries because
firms in such industries can enjoy economic profits resulting from
their innovation, instead of having these profits competed away.
However, this finding is not supportive of the Hou and Robinson
(2006) claim that higher innovation risk in less-concentrated
industries raises the overall cost of capital in these industries.
There is another notable difference in the results in columns 1
and 2 in Table 7. Similar to Hou and Robinson (2006), we find a
significant negative association between HI-Compustat and firm size.
However, HI-Census is significantly positively associated with firm
size. Given that more-concentrated industries are expected to have
on average larger firms, these results provide further support to
the arguments made in section 2 of the article that U.S.
Census-based measures of industry concentration are more meaningful
than are Compustat-based measures.
Columns 3 and 4 of Table 7 present results of the models that
use FFR-Compustat and FFR-Census, respectively, as the dependent
variables. Also, these results are for the longer sample period from
1963 to 2001, the sample period used in Hou and Robinson (2006). The
relations of FFR-Compustat and FFR-Census with research and
development expenses and firm size are qualitatively the same as
those reported in columns 1 and 2 of Table 7.
3.2 Lang and Stulz (1992)
Lang and Stulz (1992) examine the intra-industry effects of
bankruptcy announcements. They show that in industries with high
leverage that are less concentrated, there is a significant and
important negative price reaction to a bankruptcy announcement in
the industry. They argue that this price reaction reflects the loss
experienced by other firms in the industry, because the
announcement conveys information about lower future cash flows for
these firms. They refer to this effect as a contagion effect. They
also find that industries with low leverage that are more
concentrated exhibit significantly positive price reactions to a
bankruptcy announcement in the industry. They contend that this
price reaction reflects the benefits to competitors that result
from the difficulties faced by the bankrupt firm and refer to it as
a competitive effect.
We investigate whether the Lang and Stulz (1992) results,
reported in their Tables 3 and 4, are robust to the use of U.S.
Census-based industry concentration measures. They examine
fifty-nine bankruptcies from 1970 to 1989. Our sample of bankrupt
firms comes from the current Altman-NYU Salomon Center Bankruptcy
list, which is an updated version of the data used by Lang and Stulz
(1992). We examine eighty-six bankruptcies that took place in the
manufacturing sector from 1980 to 2004. As in Lang and Stulz (1992),
we define industry using four-digit SIC codes and calculate
announcement returns from days −5 to +5 relative to bankruptcy
announcements.

Table 8
Abnormal returns for subsamples of industry portfolios for the
eleven days around a bankruptcy announcement

Columns: (a) industry portfolio characteristics; (b) number of
industry portfolios with industry characteristics below/above the
sample median; (c) average abnormal returns for the subsample of
industry portfolios with the value of the industry portfolio
characteristics below and above the sample median.

Industry portfolio characteristics                  No. below/above   Below            Above            Difference
Panel A: Compustat-based industry concentration measures
Leverage                                            43/43             0.140 (0.101)    2.319 (2.587)    [2.278]
HI-Compustat                                        43/43             2.891 (1.920)    0.432 (0.646)    [1.869]
HI-Compustat (subsample of industry portfolios
  with below-median leverage)                       19/24             2.700 (0.931)    1.887 (2.038)    [2.736]
HI-Compustat (subsample of industry portfolios
  with above-median leverage)                       24/19             3.042 (2.149)    1.406 (1.466)    [1.457]
Panel B: U.S. Census-based industry concentration measures
HI-Census                                           43/43             1.328 (1.385)    1.131 (0.844)    [0.154]
HI-Census (subsample of industry portfolios
  with below-median leverage)                       18/25             0.670 (0.428)    0.242 (0.116)    [0.513]
HI-Census (subsample of industry portfolios
  with above-median leverage)                       25/18             1.802 (1.498)    3.038 (2.267)    [0.403]

Cumulative abnormal returns (market model errors) are calculated
from day −5 to +5 relative to bankruptcy announcements. Bankruptcy
announcements are from the Altman-NYU Salomon Center Bankruptcy
list and include bankruptcies between January 1980 and December 2004
with liabilities greater than $100 million for firms in the
manufacturing sector (SIC codes between 2000 and 3999) with a
primary SIC code that is available from Compustat (eighty-six
bankruptcies). An industry portfolio is a value-weighted portfolio
of firms with the same primary four-digit SIC code as the bankrupt
firm for which announcement returns are available from the CRSP
files. Except for HI-Census, industry characteristics are obtained
from Compustat for the fiscal year preceding the announcement.
HI-Census is the Herfindahl index for four-digit SIC industries as
reported by the Census of Manufactures. The 1997 and 2002 U.S.
Censuses define industry at the six-digit NAICS level. Over the
1995–2004 period, we use HI-Census values for six-digit NAICS
industries to calculate HI-Census values for four-digit SIC
industries by weighting the HI-Census values of component six-digit
NAICS industries by the square of their share of the broader
four-digit SIC industry. To determine the component six-digit NAICS
industries of a broader four-digit SIC industry, we use NAICS
correspondence tables provided by the U.S. Census. HI-Compustat is
defined as the sum of the squares of the sales market shares of all
firms in a Compustat four-digit SIC industry. Leverage is the
debt-to-total assets ratio. The numbers in parentheses are
z-statistics and the numbers in square brackets are z-statistics
for differences between subsamples. ***, **, and * represent
significance at the 1%, 5%, and 10% levels, respectively, for a
two-tailed test.
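The two concentration definitions used with Table 8 can be illustrated
with a minimal sketch. It assumes that each firm's market share is its
sales divided by total industry sales, and that a NAICS industry's
"share" of the broader SIC industry is a simple fraction that sums to
one across components; the function names herfindahl and aggregate_hi
are hypothetical and do not come from the article.

```python
def herfindahl(sales):
    """Sum of squared sales market shares for the firms in one industry."""
    total = sum(sales)
    if total <= 0:
        raise ValueError("industry sales must be positive")
    return sum((s / total) ** 2 for s in sales)


def aggregate_hi(components):
    """Combine component-industry HI values into one broader-industry HI.

    components: list of (hi_component, share_of_broader_industry) pairs.
    Each component HI is weighted by the square of its share, as in the
    NAICS-to-SIC weighting described in the table notes. The shares are
    assumed (not stated in the source) to sum to one.
    """
    return sum(hi * share ** 2 for hi, share in components)


# Hypothetical example: two component industries of equal size.
direct = herfindahl([60, 40, 100])
combined = aggregate_hi([(herfindahl([60, 40]), 0.5),
                         (herfindahl([100]), 0.5)])
```

When the component industries contain disjoint sets of firms, the
square-of-share weighting reproduces the index computed directly on
the pooled firms, which is why direct and combined agree in this
example.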
Panel A of Table 8 reports our replication of the Lang and Stulz
(1992) univariate results. First, the announcement returns for
portfolios of industry peers are significantly more negative in
industries with leverage above the sample median. Second, the
announcement returns are significantly more positive
for industries with HI-Compustat values above the sample median.
Third, for industries with high HI-Compustat and low leverage, the
announcement returns are significantly positive (1.887%). Finally,
for industries with high leverage and low HI-Compustat values, the
announcement returns are significantly negative (−3.042%). These
results are similar to those reported in Lang and Stulz (1992).
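The announcement returns discussed above are market model errors
cumulated over days −5 to +5. The following is a minimal sketch of
that calculation, not the procedure used in the article: the
estimation window length (est_len), the gap between the estimation
and event windows (gap), and the function name market_model_car are
all assumptions, since the source does not state them.

```python
def market_model_car(port_ret, mkt_ret, event_idx,
                     est_len=200, gap=10, half_window=5):
    """Cumulative market-model abnormal return over [-5, +5] trading days.

    port_ret, mkt_ret: equal-length lists of daily returns.
    event_idx: position of the announcement day in those lists.
    est_len and gap are illustrative assumptions.
    """
    est_end = event_idx - half_window - gap   # exclusive end of estimation window
    est_start = est_end - est_len
    if est_start < 0 or event_idx + half_window >= len(port_ret):
        raise ValueError("not enough observations around the event")

    # OLS fit of r_portfolio = alpha + beta * r_market on the estimation window.
    x = mkt_ret[est_start:est_end]
    y = port_ret[est_start:est_end]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x

    # Sum the market-model errors over the eleven-day event window.
    car = 0.0
    for t in range(event_idx - half_window, event_idx + half_window + 1):
        car += port_ret[t] - (alpha + beta * mkt_ret[t])
    return car
```

With the defaults, the event window contains 2 × 5 + 1 = 11 trading
days, matching the eleven days referred to in the Table 8 title.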
Panel B presents the results with HI-Compustat replaced with
HI-Census. None of the significant relationships involving industry
concentration that are observed in panel A are found to be
significant in panel B. Thus, results based on the U.S. Census-based
industry concentration measures are not consistent with those
reported in Lang and Stulz (1992) and do not support their
conclusions.
Lang and Stulz (1992) use multivariate regression models to
control for other factors that might be related to the announcement
returns. We reexamine their regression results and report our
findings in Table 9. The first four columns in this table present
the results from estimating the four multivariate models of Table 4
of Lang and Stulz (1992). In the first three models, the explanatory
variables of interest are the three dummy variables representing
high debt/high HI-Compustat, low debt/high HI-Compustat, and low
debt/low HI-Compustat. Consistent with the Lang and Stulz (1992)
results, in all three models the coefficients on the dummy variable
representing low debt/high HI-Compustat are positive and
significant. Lang and Stulz (1992) conclude that this result
reflects the competitive effect. The coefficients on the other
variables in these models are also similar to those in Lang and
Stulz (1992). In column 4, the dummy variables are replaced by
HI-Compustat and leverage. Once again, consistent with the Lang and
Stulz (1992) results, the coefficient on HI-Compustat is positive
and significant, and the coefficient on leverage is insignificant.
The coefficients on the remaining variables are also consistent with
Lang and Stulz (1992).
Columns 5–8 of Table 9 present results from estimating the
announcement return models after replacing HI-Compustat with
HI-Census. In columns 5–7, the coefficient on the dummy variable
representing low debt/high HI-Census is not significant. Also in the
last column, the coefficient on HI-Census is not significant. In
sum, the Lang and Stulz (1992) results and conclusions are not
robust to using U.S. Census-based industry concentration measures
instead of Compustat-based measures.
Table 9
Weighted least squares regressions of industry portfolio market
model cumulative residuals at bankruptcy announcements on industry
characteristics

Models: 1 2 3 4 5 6 7 8

Intercept: 0.034 (2.19), 0.157 (4.79), 0.156 (4.70), 0.100 (2.68), 0.023 (1.43), 0.137 (4.29), 0.139 (4.30), 0.120 (3.14)
1 if high debt/high HI-Compustat; 0 otherwise: 0.016 (0.68), 0.028 (1.34), 0.029 (1.37)
1 if low debt/high HI-Compustat; 0 otherwise: 0.053 (2.41), 0.061 (3.01), 0.060 (2.95)
1 if low debt/low HI-Compustat; 0 otherwise: 0.006 (0.24), 0.017 (0.79), 0.016 (0.77)
1 if high debt/high HI-Census; 0 otherwise: 0.009 (0.37), 0.019 (0.85), 0.020 (0.89)
1 if low debt/high HI-Census; 0 otherwise: 0.024 (1.08), 0.025 (1.26), 0.024 (1.18)
1 if low debt/low HI-Census; 0 otherwise: 0.016 (0.66), 0.013 (0.58), 0.012 (0.55)
HI-Compustat: 0.053 (2.02)
HI-Census: 0.058 (0.36)
Returns correlation: 0.016 (0.27), 0.023 (0.40)
Leverage: 0.004 (0.05), 0.022 (0.27)
Log of average price: 0.043 (4.25), 0.044 (4.27), 0.042 (3.92), 0.043 (4.14), 0.043 (4.16), 0.040 (3.65)
Distress cumulative return: 0.020 (0.62), 0.016 (0.53), 0.007 (0.21), 0.009 (0.28), 0.005 (0.15), 0.002 (0.07)
Predistress cumulated return: 0.024 (1.03), 0.024 (0.97)
Adjusted R-square: 0.031, 0.199, 0.197, 0.157, 0.022, 0.149, 0.150, 0.116
Number of observations: 86, 86, 86, 86, 86, 86, 86, 86

Cumulative abnormal returns (market model errors) are calculated
from day −5 to +5 relative to bankruptcy announcements. Bankruptcy
announcements are from the Altman-NYU Salomon Center Bankruptcy list
and include bankruptcies between January 1980 and December 2004
with liabilities greater than $100 million for firms in the
manufacturing sector (SIC codes between 2000 and 3999) with a
primary SIC code that is available from Compustat (86
bankruptcies). An industry portfolio is a value-weighted portfolio
of firms with the same primary four-digit SIC code as the bankrupt
firm for which announcement returns are available from the CRSP
files. The industry characteristics are obtained from Compustat for
the fiscal year preceding the announcement except for HI-Census and
the returns correlation variable. HI-Census is the Herfindahl index
for four-digit SIC industries as reported by the Census of
Manufactures. The 1997 and 2002 U.S. Censuses define industry at
the six-digit NAICS level. Over the 1995–2004 period we use
HI-Census values for six-digit NAICS industries to calculate
HI-Census values for four-digit SIC industries by weighting the
HI-Census values of component six-digit NAICS industries by the
square of their share of the broader four-digit SIC industry. To
determine the component six-digit NAICS industries of a broader
four-digit SIC industry, we use NAICS correspondence tables
provided by the U.S. Census. HI-Compustat is defined as the sum of
the squares of the sales market shares of all firms in a Compustat
four-digit SIC industry. High/low debt, HI-Compustat, and HI-Census
are as defined in Table 8. Returns correlation is the correlation
between the industry portfolio and the bankrupt firm returns for
the year preceding the announcement. Leverage is the average
debt-to-total assets ratio in a firm's industry. Log of average
price is the natural logarithm of average stock price in a firm's
industry. Distress cumulative return is the industry portfolio
cumulative return in excess of the market return from five days
before the first distress announcement to five days before the
bankruptcy announcement. Predistress cumulated return is the
industry portfolio cumulative return in excess of the market return
from 800 to 50 days before the first distress announcement. As in
Lang and Stulz (1992), we identify first distress announcements
using the methodology employed by Gilson, John, and Lang (1990).
t-statistics are in parentheses. ***, **, and * represent
significance at the 1%, 5%, and 10% levels, respectively, for a
two-tailed test.

3.3 Harris (1998)
Harris (1998) shows that firms are less likely to disclose
separate segment information for operations in more-concentrated
industries. She argues that firms behave in this manner to protect
the abnormal profits and market shares related to their operations.
She uses Compustat-based industry concentration measures in her
empirical analyses. We reexamine the sensitivity of her results to
using U.S. Census-based industry concentration measures.
Specifically, we reestimate the multivariate logit model in her
Table 3 on a sample of manufacturing firms. To create the dependent
variable in these models, we follow her approach and use the
Compustat Multiple SICs Tape, which reports all the SIC codes for a
firm in a given year. As in Harris (1998), we compare these SICs to
those appearing in the Compustat Industry Segment File, which
reports the segments actually disclosed in a firm's annual report.
If a three-digit SIC in which a firm has operations is reported as a
primary or secondary SIC for one of the firm's business segments,
the dependent variable takes a value of 1; otherwise, the dependent
variable equals 0. This definition allows for multiple
firm-industry observations in a given year. The Compustat Multiple
SICs Tape was discontinued in 1998. However, we were able to locate
the 1997 Compustat Multiple SICs Tape and we used it in our
analysis. This tape allows us to study the period from 1995 to 1997
rather than the period from 1987 to 1991 studied by Harris (1998).
We do not study the years examined in Harris (1998) because we have
data for the SIC codes in which a firm has operations only for 1997.
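The construction of the dependent variable described above can be
sketched as follows. This is an illustration under stated
assumptions rather than the code used in the article: the two input
mappings stand in for the Compustat Multiple SICs Tape and the
Industry Segment File, SIC codes are assumed to be four-digit
manufacturing codes (so truncating to three characters yields the
three-digit SIC), and the name disclosure_indicator is hypothetical.

```python
def disclosure_indicator(operating_sics, segment_sics):
    """Build one (firm, three-digit SIC, disclosed) row per firm-industry pair.

    operating_sics: firm -> iterable of SIC codes in which the firm operates.
    segment_sics:   firm -> iterable of primary or secondary SIC codes
                    reported for the firm's disclosed business segments.
    The indicator is 1 when the three-digit SIC appears among the
    reported segment SICs and 0 otherwise.
    """
    rows = []
    for firm, sics in operating_sics.items():
        reported = {str(s)[:3] for s in segment_sics.get(firm, ())}
        for sic3 in sorted({str(s)[:3] for s in sics}):
            rows.append((firm, sic3, 1 if sic3 in reported else 0))
    return rows


# Hypothetical firm operating in three industries, disclosing two of them.
rows = disclosure_indicator({"A": [2834, 2911, 3571]},
                            {"A": [2836, 3571]})
```

Because every firm contributes one row per three-digit SIC in which
it operates, a single firm can appear several times in a given year,
which is the multiple firm-industry structure noted in the text.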
Table 10 reports the results of our logit regressions. The model
in the first three columns of the table uses three-digit SIC level
HI-Compustat as the measure of industry concentration, although
Harris (1998) measures industry concentration as FFR-Compustat at
the three-digit SIC level. Our reason for using HI-Compustat as an
explanatory variable instead of FFR-Compustat in the regression
models is that the 1997 Census of Manufactures reports concentration
measures at the six-digit NAICS level, and it is possible to use
these data to calculate HI-Census precisely at the three-digit SIC
level, whereas we can compute only approximate FFR-Census values for
three-digit SIC industries.14 The other explanatory variables in the
model are the same as in Harris (1998).
We follow Harris (1998) and report results for each of the
sample years separately. The coefficients on HI-Compustat are
negative and significant for each of the individual sample period
years examined, consistent with the Harris (1998) results. However,
in models 4–6 in Table 10, in which we replace HI-Compustat with
HI-Census, the coefficients on HI-Census are insignificant for each
of the sample years. Thus, the Harris (1998) findings are sensitive
to the use of the U.S. Census-based industry concentration measures
in place of the Compustat-based measures.
3.4 Defond and Park (1999)
Finally, we reexamine the Defond and Park (1999) result that CEO
turnover is negatively associated with industry concentration. They
argue that in more competitive industries there is greater
homogeneity across firms and CEOs are likely to have more peers,
making it easier to identify and replace poorly performing CEOs.
14 We reestimate all the models in Table 10 using FFR-Compustat
and FFR-Census calculated for three-digit SIC industries and obtain
results similar to those reported in this table.

Table 10
Logit analysis of managers' business segment reporting decisions

                                            1               2               3               4               5               6
Sample period                               1995            1996            1997            1995            1996            1997
Intercept                                   1.778 (0.000)   1.900 (0.000)   1.959 (0.000)   1.678 (0.000)   1.827 (0.000)   1.917 (0.000)
HI-Compustat                                0.536 (0.043)   0.494 (0.056)   0.486 (0.064)
HI-Census                                                                                   1.215 (0.458)   0.074 (0.964)   1.551 (0.338)
ROA persistence                             0.152 (0.015)   0.106 (0.087)   0.143 (0.021)   0.164 (0.009)   0.115 (0.064)   0.144 (0.020)
Earnings persistence across the SICs
  in which a firm operates                  1.112 (0.000)   1.115 (0.000)   1.067 (0.000)   1.194 (0.000)   1.182 (0.000)   1.135 (0.000)
Median industry sales scaled by firm sales  0.841 (0.000)   0.839 (0.000)   0.838 (0.000)   0.836 (0.000)   0.841 (0.000)   0.841 (0.000)
Number of SICs in which the firm operates   0.718 (0.000)   0.735 (0.000)   0.769 (0.000)   0.720 (0.000)   0.736 (0.000)   0.770 (0.000)
Likelihood ratio                            1.896 (0.000)   2.034 (0.000)   2.121 (0.000)   1.893 (0.000)   2.030 (0.000)   2.119 (0.000)
Number of observations                      5109            5384            5468            5109            5384            5468

The sample period is 1995–1997 and includes firms in the
manufacturing sector (SIC codes between 2000 and 3999) covered by
Compustat. The dependent variable takes a value of 1 if during the
current year firm i decides to provide a segmental disclosure of
its operations for a three-digit SIC industry j in which it
operates, and equals 0 otherwise. This allows for multiple
firm-industry observations in a given year. HI-Compustat is
defined as the sum of the squares of the sales market shares of all
firms in a Compustat three-digit SIC industry. HI-Census is the
Herfindahl index for three-digit SIC industries as reported by the
Census of Manufactures. The 1997 U.S. Census defines industry at
the six-digit NAICS level. Over the 1995–1997 period we use
HI-Census values for six-digit NAICS industries to calculate
HI-Census values for three-digit SIC industries by weighting the
HI-Census values of component six-digit NAICS industries by the
square of their share of the broader three-digit SIC industry. To
determine the component six-digit NAICS industries of a broader
three-digit SIC industry we use NAICS correspondence tables
provided by the U.S. Census. ROA persistence represents the speed
of adjustment for firm-level positive abnormal return on assets in
industry j and is measured as in Harris (1998) as the slope
coefficient $B_{2j}$ from the following regression model,
$X_{ijt} = B_{0j} + B_{1j}(D_n X_{ij,t-1}) + B_{2j}(D_p X_{ij,t-1}) + e_{ijt}$,
where $X_{ijt}$ = the difference between firm i's ROA and mean ROA
for its industry j, in year t, $D_n$ = 1 if $X_{ij,t-1}$ is less
than or equal to zero, zero otherwise, and $D_p$ = 1 if
$X_{ij,t-1}$ is greater than zero, zero otherwise. Earnings
persistence across the SICs in which a firm operates is measured as
the maximum value minus the minimum value of ROA persistence for a
firm in the three-digit SICs in which the firm has operations
during the current year. Median industry sales scaled by firm sales
is measured as median sales for single-segment firms in a
three-digit SIC industry j divided by firm i's sales. Number of
SICs in which the firm operates is measured as the number of
three-digit SIC industries in which the firm operates during the
current year. Significance levels for Wald chi-square test
statistics are in parentheses.

Defond and Park (1999) use the Lexis/Nexis news database for the
period from 1988 to 1992 to construct their CEO turnover sample.
Their control sample consists of firms in the Compact Disclosure
database without any CEO turnover during their sample period. Their
final sample has 301 firm-year observations with CEO turnovers and a
control sample of 2429 firm-year observations with no CEO turnovers.
For our sample, we consider manufacturing firms in the ExecuComp
database over the period from 1994 to 2000 and treat firm-years for
which the CEO for year t differs from the CEO for year t − 1 as a
CEO turnover observation. Like Defond and Park (1999), we consider
all instances of CEO turnover regardless of the reason for the
turnover. Our control sample consists of manufacturing firms
included on the ExecuComp database that do not experience a CEO
turnover during the period from 1994 to 2000. Our final sample
consists of 203 firm-year observations with CEO turnovers and a
control sample of 2267 firm-year observations with no CEO turnovers.
We point out that we do not study a longer sample period so as to
not impose the condition that control firms have no CEO turnover
over this longer period. Also, we study the period from 1994 to 2000
rather than a more recent period so that our sample period can be
closer in time to the Defond and Park (1999) sample period.15
The first three models in Table 11 report the results of
replications of the three logit models of panel A of Table 4 of
Defond and Park (1999). The dependent variable takes a value of 1
for firm-year observations with a CEO turnover, and a value of 0 for
firm-years belonging to the control sample. The main independent
variable of interest is the square root of HI-Compustat calculated
at the two-digit SIC level. All the models have the same set of
control variables, except that the first model does not include
analysts' earnings forecast errors, the second model does not
include industry-relative earnings, and the third model includes
both of these variables. The results in the first three columns of
Table 11 show that there is a significantly negative association
between CEO turnover and the square root of HI-Compustat, consistent
with the Defond and Park (1999) results. The results for the control
variables are also consistent with those in Defond and Park (1999),
except that our coefficients on analysts' earnings forecast errors
are not statistically significant.
The models in the fourth to the sixth columns of Table 11
replace the square root of HI-Compustat, as an explanatory variable,
with the square root of HI-Census calculated at the two-digit SIC
level. We do not find significant associations between CEO turnover
and the square root of HI-Census in any of the three models.
Consequently, the Defond and Park (1999) findings and their
conclusions are not robust to using U.S. Census-based industry
concentration measures instead of Compustat-based measures.
15 Given that comprehensive data in the ExecuComp database is
available from 1993 onward and the requirement that we have
information on the identity of the CEO in year t − 1, 1994 is the
earliest year in which our sample period can begin.
Table 11
Logit analysis predicting CEO turnover

                                                1               2               3               4               5               6
Intercept                                       6.695 (7.08)    6.758 (7.12)    6.703 (7.09)    8.155 (10.56)   8.010 (10.29)   8.174 (10.60)
Square root of HI-Compustat                     6.013 (2.82)    5.023 (2.43)    6.033 (2.82)
Square root of HI-Census                                                                        2.490 (0.94)    2.168 (0.81)    2.432 (0.92)
Industry-relative earnings                      2.930 (3.77)                    2.883 (3.73)    2.696 (3.43)                    2.653 (3.39)
Analysts' earnings forecast errors                              1.835 (1.00)    1.654 (0.90)                    1.754 (0.96)    1.571 (0.86)
Market-adjusted stock returns                   0.403 (1.84)    0.579 (2.63)    0.415 (1.88)    0.436 (1.99)    0.593 (2.70)    0.445 (2.03)
Age of CEO                                      0.097 (8.47)    0.093 (8.24)    0.097 (8.47)    0.097 (8.51)    0.094 (8.32)    0.097 (8.51)
Dummy variable for if CEO age = 63, 64, or 65   0.597 (2.93)    0.573 (2.81)