Artur-eea-presentation

Understanding rankings of financial analysts

Artur Aiguzhinov¹,², Ana Paula Serra¹, Carlos Soares²

¹ CEFUP & Department of Economics, University of Porto, Portugal

² LIAAD-INESC Porto LA & Department of Economics, University of Porto, Portugal

February 25th, 2011

Agent-based computational economics: Computational Finance

Eastern Economics Association Conference, New York

Motivation (1): the value of the recommendations

• Efficient Market Hypothesis (Fama, 1970);

• Information gathering is costly ⇒ providing possibilities for abnormal returns (Grossman and Stiglitz, 1980; Fama, 1991);

• On average, recommendations bring value to investors and financial analysts' accuracy in forecasts is valuable (Womack, 1996; Barber et al., 2001);

Motivation (2): rankings of analysts

StarMine® issues annual rankings of financial analysts:

• Ranks the analysts based on
  - recommendation performance
      - For each analyst a portfolio is constructed. For each "Buy"/"Sell" recommendation the portfolio is one (two) unit(s) long/short and simultaneously one (two) unit(s) short/long the benchmark. For "Hold" recommendations, the portfolio invests one unit in the benchmark
  - EPS forecast accuracy
      - Single-stock Estimating Score (SES): relative accuracy of each analyst's earnings forecast when compared with their peers

Issue: Prediction of rankings of analysts

• Foreknowledge of analyst forecast accuracy is valuable (Brown and Mohammad, 2003)

• Is it possible to predict these rankings? (StarMine rankings are ex-post)
  - If yes, can we turn those predictions into profitable strategies?

• Why not predict stock prices instead?
  - Analysts' relative performance (rankings) is more predictable than stock prices

Main goal: Accurately predict the rankings of financial analysts

Research contributions

• Interdisciplinary approach (label ranking algorithm)

• First paper to identify the variables that discriminate the rankings of analysts
  - analysis of the financial analysts based on state variables concerning market conditions and stock characteristics (instead of analyst characteristics)

• First paper to predict the rankings of analysts

Research design: an overview

1. Create (ex-post) rankings of the analysts (target rankings):
   - to establish the rankings we follow the models of Clement (1999); Brown (2001); Creamer and Stolfo (2009);

2. Define state variables

3. Identify the most discriminative state variables

4. Predict rankings of the analysts (naive Bayes for label ranking) and evaluate the ranking accuracy

Database to use

• ThomsonOne I/B/E/S Detailed History:
  - Quarterly EPS forecasts (1989Q1-2009Q4);

• ThomsonOne DataStream:
  - Accounting data;
  - Market data;

• Filtered out stocks with
  - fewer than 3 analysts
  - fewer than 8 quarters of forecast history

Description of the data

Table: Summary of the data

Sector        # stocks   # brokers   # forecasts      # forecasts
                                     stocks/broker    brokers/stock
Energy           170        205          3.09             2.75
Industrials      298        320          2.42             2.23
IT               442        413          2.71             2.41
Materials         94        229          2.56             2.43
Total           1004       1167          2.70             2.46

Average forecasts

[Figure: Average forecasts per broker. x-axis: Quarters (1989 Q1 to 2009 Q4); y-axis: Mu; one series per sector: Energy, IT, Industrials, Materials]

Additional data analysis

Figure: Average distribution of brokers based on issued forecasts for all stocks

[x-axis: Forecasts per quarter (1 to 5); y-axis: % of analysts; one series per sector: Energy, Industrials, IT, Materials]

topN-lastN analysis

Table: Average topN-lastN changes of analysts per quarter, N=3

                     top_t                       last_t
Sector        top_{t-1}   last_{t-1}     top_{t-1}   last_{t-1}
Energy          3.01         2.35           2.34        2.75
Industrials     3.27         2.44           2.43        2.69
IT              3.61         2.64           2.57        3.22
Materials       2.53         2.04           1.90        2.13

Creating target rankings: indexing of the analysts

Based on previous research, we use the EPS mean adjusted forecast error (MAFE)¹ as a measure of analysts' predictive accuracy:

\[ FE_{q,a,s} = \left| ActEPS_{q,s} - EPS_{q,a,s} \right| \tag{1} \]

\[ \overline{FE}_{q,s} = \frac{1}{n} \sum_{a=1}^{n} FE_{q,a,s} \tag{2} \]

\[ MAFE_{q,a,s} = \frac{FE_{q,a,s}}{\overline{FE}_{q,s}} \tag{3} \]

¹ We scaled the error measure by adding 1 so that the most accurate analyst will receive a higher rank.
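The forecast-error index in equations (1) to (3) can be sketched for a single stock and quarter as follows. This is a minimal illustration, not the authors' code: the analyst names and EPS numbers are made up, the "+1" scaling from the footnote is not applied, and ranking the smallest MAFE first is an assumption about the ordering convention.

```python
def mafe(actual_eps, forecasts):
    """Mean adjusted forecast error for one stock and one quarter.

    forecasts maps a (hypothetical) analyst id to that analyst's EPS forecast.
    """
    # Eq. (1): absolute forecast error of each analyst
    fe = {a: abs(actual_eps - f) for a, f in forecasts.items()}
    # Eq. (2): mean forecast error across the n analysts covering the stock
    mean_fe = sum(fe.values()) / len(fe)
    if mean_fe == 0:
        raise ValueError("all forecasts are exact; MAFE is undefined")
    # Eq. (3): each analyst's error relative to the mean error
    return {a: e / mean_fe for a, e in fe.items()}


def rank_analysts(scores):
    """Assumed convention: the analyst with the smallest MAFE gets rank 1."""
    ordered = sorted(scores, key=scores.get)
    return {a: r for r, a in enumerate(ordered, start=1)}


# Illustrative numbers only
scores = mafe(1.00, {"A": 1.10, "B": 0.95, "C": 1.30})
ranks = rank_analysts(scores)
print(ranks)
```

Because each error is divided by the cross-analyst mean, the MAFE values of the n analysts always sum to n, which makes the index comparable across stocks with different EPS levels.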

Define state variables

Variables capture market conditions and stock characteristics (Jegadeesh et al., 2004; Creamer and Stolfo, 2009):

• Market Conditions: Market volatility (MKT)

• Consensus Analyst Variables
  - Lagged (consensus) forecasting error (FELAG)
  - Change in consensus (consensus)

• Earnings Momentum: Standardized Unexpected Earnings (SUE)

• Growth Indicators: Sales growth (SG)

• Accounting Fundamentals: Total accruals to total assets ratio (TA)

• Valuation multiples: Earnings-to-price ratio (EP)

Define state variables

The naive Bayes algorithm requires that continuous variables be converted into discrete ones

• discretization using the size bins method (Dougherty et al., 1995)

• 4 bins
  - High
  - Medium High
  - Medium Low
  - Low
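A four-bin discretization of one state variable might look like the sketch below. The slides do not say which binning variant was used, so equal-frequency bins (quartile cut points) are an assumption here, and the sample values are invented.

```python
from bisect import bisect_right
from statistics import quantiles

LABELS = ["Low", "Medium Low", "Medium High", "High"]


def discretize(values):
    """Map continuous values to four labels using quartile cut points.

    Assumption: equal-frequency binning; the original deck only names
    a "size bins" method.
    """
    cuts = quantiles(values, n=4)  # three cut points
    # bisect_right counts how many cut points lie at or below the value
    return [LABELS[bisect_right(cuts, v)] for v in values]


# Hypothetical values of one state variable over eight quarters
sample = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
print(discretize(sample))
```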

Discriminative Value

First Step: Calculate the rankings similarity matrix

            Ind. Variables      Rankings     Similarity matrix
Quarters    X1  X2  X3  X4      a  b  c           1     2     3     4     5
1           c1  d2  d3  c4      1  2  3      1   1.00  0.25  1.00  0.00  0.00
2           d1  c2  a3  b4      2  3  1      2   0.25  1.00  0.25  0.75  0.75
3           a1  b2  d3  a4      1  2  3      3   1.00  0.25  1.00  0.00  0.00
4           b1  d2  d3  c4      3  2  1      4   0.00  0.75  0.00  1.00  1.00
5           c1  a2  d3  d4      3  2  1      5   0.00  0.75  0.00  1.00  1.00
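The similarity matrix for the five example quarters can be reproduced with a short sketch. The rescaling of Spearman's coefficient to the interval [0, 1] via (rho + 1) / 2 is not stated on the slide; it is inferred here because it matches the tabulated values.

```python
def spearman(p, q):
    """Spearman rank correlation between two rankings of the same k labels."""
    k = len(p)
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return 1 - 6 * d2 / (k ** 3 - k)


def similarity(p, q):
    """Assumed rescaling of Spearman's rho from [-1, 1] to [0, 1]."""
    return (spearman(p, q) + 1) / 2


# Rankings of analysts a, b, c in quarters 1..5, taken from the example table
rankings = [(1, 2, 3), (2, 3, 1), (1, 2, 3), (3, 2, 1), (3, 2, 1)]
matrix = [[similarity(p, q) for q in rankings] for p in rankings]

for row in matrix:
    print(" ".join(f"{v:.2f}" for v in row))
```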

Discriminative Value

Bins          Average similarity    Weights    Weighted average
a1 vs. b1           0.00               1            0.00
a1 vs. c1           0.50               2            1.00
a1 vs. d1           0.25               3            0.75
b1 vs. c1           0.50               1            0.50
b1 vs. d1           0.75               2            1.50
c1 vs. d1           0.50               1            0.50

Mean over the six bin pairs: 4.25 / 6 = 0.708

Discriminative power: 1 - 0.708 = 0.292. The higher the discriminative power, the more the rankings differ between one state of the world and another.
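The worked example for variable X1 can be checked with the sketch below. It takes the similarity matrix from the example as given, weights each pair of bins by the distance between the bins, and averages over the number of bin pairs. That last step is inferred from the slide's arithmetic (4.25 / 6 = 0.708) rather than stated, and the function name is hypothetical.

```python
from itertools import combinations

# Similarity matrix between the rankings of quarters 1..5 (from the example)
SIM = [
    [1.00, 0.25, 1.00, 0.00, 0.00],
    [0.25, 1.00, 0.25, 0.75, 0.75],
    [1.00, 0.25, 1.00, 0.00, 0.00],
    [0.00, 0.75, 0.00, 1.00, 1.00],
    [0.00, 0.75, 0.00, 1.00, 1.00],
]
X1 = ["c", "d", "a", "b", "c"]  # bin of X1 in quarters 1..5
BINS = ["a", "b", "c", "d"]     # ordered bins


def discriminative_power(values, sim, bins):
    weighted = []
    for lo, hi in combinations(range(len(bins)), 2):
        rows = [i for i, v in enumerate(values) if v == bins[lo]]
        cols = [j for j, v in enumerate(values) if v == bins[hi]]
        if not rows or not cols:
            continue  # assumption: skip pairs where a bin never occurs
        # Average similarity between rankings observed in the two bins
        avg = sum(sim[i][j] for i in rows for j in cols) / (len(rows) * len(cols))
        # Weight by how far apart the two bins are
        weighted.append(avg * (hi - lo))
    return 1 - sum(weighted) / len(weighted)


print(round(discriminative_power(X1, SIM, BINS), 3))
```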

Results: Discriminative power

Table: Discriminative power of independent variables

Sectors        FELAG    SUE     consensus   EP      SG      TA      MKT
Energy         0.163    0.194   0.224       0.218   0.183   0.186   0.222
IT             0.207    0.214   0.220       0.196   0.183   0.197   0.194
Industrials    0.185    0.179   0.192       0.210   0.200   0.224   0.190
Materials      0.199    0.189   0.258       0.218   0.203   0.220   0.183

Predict with Label Ranking Algorithm

• Naive Bayes algorithm for label ranking (Aiguzhinov et al., 2010):
  - non-parametric technique that relies on the similarities of the rankings
  - predicts rankings conditional on the values of the state variables

• Alternative baseline ranking methods:
  - default (ranking based on the average rank of each label)
  - naive (previous quarter ranking)

• Accuracy of the methods: Spearman's rank correlation

Figure: Time line of the predicted π̂ and the target π rankings
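The two baselines can be sketched in a few lines. The function names are hypothetical, ties in the average rank are broken by label order (an assumption), and the history uses the first four quarters of the Alex/Brown/Craig example table from the appendix.

```python
def default_ranking(history):
    """Rank labels by their average rank over all past quarters."""
    k = len(history[0])
    avg = [sum(r[j] for r in history) / len(history) for j in range(k)]
    # Stable sort: labels with equal average rank keep their original order
    order = sorted(range(k), key=lambda j: avg[j])
    ranking = [0] * k
    for rank, j in enumerate(order, start=1):
        ranking[j] = rank
    return tuple(ranking)


def naive_ranking(history):
    """Predict that the previous quarter's ranking simply repeats."""
    return history[-1]


# Ranks of (Alex, Brown, Craig) in quarters 1..4
history = [(1, 2, 3), (2, 3, 1), (1, 2, 3), (1, 3, 2)]
print(default_ranking(history), naive_ranking(history))
```

Both baselines ignore the state variables entirely, which is what makes them a useful reference point for the naive Bayes predictions.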

Results: Label rankings

Table: Ranking accuracy of the naive Bayes ranking method and the two baselines.

                   NBr               default           naive ranking
Sector         mean    std.dev    mean    std.dev    mean    std.dev
Energy         0.037   0.060      0.060   0.087      0.064   0.060
IT            -0.005   0.045      0.088   0.075      0.043   0.036
Industrials    0.010   0.050      0.078   0.074      0.033   0.046
Materials      0.012   0.067      0.076   0.073      0.041   0.056

Differences (1)

Figure: Differences in ranking accuracy of naive Bayes and the default rankings

[Four panels: Energy, Industrials, IT, Materials. x-axis: Stocks; y-axis: Difference between mean accuracy of NBr and default ranking]

Differences (2)

Figure: Differences in ranking accuracy of naive Bayes and the naive rankings

[Four panels: Energy, Industrials, IT, Materials. x-axis: Stocks; y-axis: Difference between mean accuracy of NBr and naive ranking]

Conclusions

• Discriminative power analysis identifies Consensus as the most discriminative variable in most of the sectors

• There is room for improving the label ranking algorithm, in particular by refining the predictor state variables

References (1)

Aiguzhinov, Artur, Carlos Soares, and Ana Serra (2010), "A similarity-based adaptation of naive bayes for label ranking: Application to the metalearning problem of algorithm recommendation." In Discovery Science (Bernhard Pfahringer, Geoff Holmes, and Achim Hoffmann, eds.), volume 6332 of Lecture Notes in Computer Science, 16–26, Springer Berlin, Heidelberg.

Barber, B., R. Lehavy, M. McNichols, and B. Trueman (2001), "Can Investors Profit from the Prophets? Security Analyst Recommendations and Stock Returns." The Journal of Finance, 56, 531–563.

Brown, L. (2001), "How Important is Past Analyst Earnings Forecast Accuracy?" Financial Analysts Journal, 57, 44–49.

Brown, L.D. and E. Mohammad (2003), "The Predictive Value of Analyst Characteristics." Journal of Accounting, Auditing and Finance, 18.

References (2)

Clement, M.B. (1999), "Analyst forecast accuracy: Do ability, resources, and portfolio complexity matter?" Journal of Accounting and Economics, 27, 285–303.

Creamer, G. and S. Stolfo (2009), "A link mining algorithm for earnings forecast and trading." Data Mining and Knowledge Discovery, 18, 419–445.

Dougherty, J., R. Kohavi, and M. Sahami (1995), "Supervised and unsupervised discretization of continuous features." In MACHINE LEARNING-INTERNATIONAL WORKSHOP, 194–202, MORGAN KAUFMANN PUBLISHERS, INC.

Fama, E.F. (1970), "Efficient Capital Markets: A Review of Empirical Work." The Journal of Finance, 25, 383–417.

Fama, E.F. (1991), "Efficient Capital Markets: II." The Journal of Finance, 46, 1575–1617.

References (3)

Grossman, S.J. and J.E. Stiglitz (1980), "On the Impossibility of Informationally Efficient Prices." American Economic Review, 70, 393–408.

Jegadeesh, N., J. Kim, S.D. Krische, and C.M.C. Lee (2004), "Analyzing the Analysts: When Do Recommendations Add Value?" The Journal of Finance, 59, 1083–1124.

Vogt, M., JW Godden, and J. Bajorath (2007), "Bayesian interpretation of a distance function for navigating high-dimensional descriptor spaces." Journal of chemical information and modeling, 47, 39–46.

Womack, K.L. (1996), "Do Brokerage Analysts' Recommendations Have Investment Value?" The Journal of Finance, 51, 137–168.

Similarity-based Naive Bayes for Label Ranking: Prior probability of label ranking

Table: Demonstration of the prior probability for label ranking

                                                    Ranks
Quarters   x1       x2       x3     x4        Alex   Brown   Craig
1          High     Low      High   Medium     1      2       3
2          High     High     High   Low        2      3       1
3          Medium   Medium   High   Low        1      2       3
4          Low      Low      Low    High       1      3       2
...        ...      ...      ...    ...       ...    ...     ...
14         Medium   High     High   Medium     1      2       3
15         High     Medium   High   Low        3      1       2

Maximizing the likelihood is equivalent to minimizing the distance (i.e., maximizing the similarity) in a Euclidean space (Vogt et al., 2007)

Label ranking: formalization

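• Instance: \( X \subseteq \{V_1, \ldots, V_m\} \)
• Labels: \( L = \{\lambda_1, \ldots, \lambda_k\} \)
• Output: \( Y = \Pi_L \)
• Training set: \( T = \{x_i, y_i\}_{i \in \{1, \ldots, n\}} \subseteq X \times Y \)

Learn a mapping \( h : X \to Y \) such that a loss function \( \ell \) is minimized:

\[ \ell = \frac{\sum_{i=1}^{n} \rho(\pi_i, \hat{\pi}_i)}{n} \tag{4} \]

with \( \rho \) being the Spearman correlation coefficient:

\[ \rho(\pi, \hat{\pi}) = 1 - \frac{6 \sum_{j=1}^{k} (\pi_j - \hat{\pi}_j)^2}{k^3 - k} \tag{5} \]

where \( \pi \) and \( \hat{\pi} \) are, respectively, the target and predicted rankings for a given instance.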
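As a quick check of equation (5), take a target ranking (1, 2, 3) and a predicted ranking (2, 3, 1) over k = 3 labels. The numbers are the rankings of quarters 1 and 2 in the Discriminative Value example.

```latex
\[
\rho = 1 - \frac{6\left[(1-2)^2 + (2-3)^2 + (3-1)^2\right]}{3^3 - 3}
     = 1 - \frac{36}{24}
     = -0.5
\]
```

Rescaling this value to the interval [0, 1] as (ρ + 1) / 2 gives 0.25, which is the entry reported for that pair of quarters in the example similarity matrix. The rescaling is inferred from those numbers and is not stated explicitly in the slides.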

Posterior probability of label ranking

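Prior probability of label ranking:

\[ P_{LR}(\pi) = \frac{\sum_{i=1}^{n} \rho(\pi, \pi_i)}{n} \tag{6} \]

Conditional probability of label ranking:

\[ P_{LR}(v_{a,i} \mid \pi) = \frac{\sum_{i : x_{i,a} = v_{a,i}} \rho(\pi, \pi_i)}{\left| \{ i : x_{i,a} = v_{a,i} \} \right|} \tag{7} \]

Estimated ranking:

\[ \hat{\pi} = \arg\max_{\pi \in \Pi_L} P_{LR}(\pi) \prod_{a=1}^{m} P_{LR}(x_{i,a} \mid \pi) \tag{8} \]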
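Equations (6) to (8) can be sketched as a brute-force search over all permutations of the labels. This is an illustrative reading, not the authors' implementation: the similarity is Spearman's coefficient rescaled to [0, 1] (an assumption that keeps the product non-negative), attribute values never seen in training are skipped (also an assumption), and the function names are hypothetical. The training data are the five example quarters from the Discriminative Value slide.

```python
from itertools import permutations


def similarity(p, q):
    """Spearman's rho rescaled to [0, 1] (assumed, so products stay non-negative)."""
    k = len(p)
    rho = 1 - 6 * sum((a - b) ** 2 for a, b in zip(p, q)) / (k ** 3 - k)
    return (rho + 1) / 2


def mean_similarity(pi, rankings):
    return sum(similarity(pi, r) for r in rankings) / len(rankings)


def predict(train, x):
    """train: list of (features, ranking) pairs; x: features of a new instance."""
    k = len(train[0][1])
    all_rankings = [r for _, r in train]
    best, best_score = None, -1.0
    for pi in permutations(range(1, k + 1)):
        score = mean_similarity(pi, all_rankings)        # eq. (6): prior
        for a, value in enumerate(x):
            matching = [r for f, r in train if f[a] == value]
            if matching:                                  # skip unseen values
                score *= mean_similarity(pi, matching)    # eq. (7): conditional
        if score > best_score:                            # eq. (8): arg max
            best, best_score = pi, score
    return best


train = [
    (("c1", "d2", "d3", "c4"), (1, 2, 3)),
    (("d1", "c2", "a3", "b4"), (2, 3, 1)),
    (("a1", "b2", "d3", "a4"), (1, 2, 3)),
    (("b1", "d2", "d3", "c4"), (3, 2, 1)),
    (("c1", "a2", "d3", "d4"), (3, 2, 1)),
]
print(predict(train, ("c1", "d2", "d3", "c4")))
```

Enumerating every permutation is only practical for a handful of labels, since the number of candidate rankings grows as k factorial.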
