Page 1: download

The Kelly criterion and its variants: theory and practice in sports, lottery, futures & options trading

The symmetric downside Sharpe ratio and the evaluation of great investors & speculators and their use of the Kelly criterion

William T Ziemba

Alumni Professor of Financial Modeling and Stochastic Optimization, Emeritus, Sauder School of Business, UBC, Vancouver, BC, Canada V6T 1Z2

email: wtzimi@mac.com

Mathematical Finance Seminar, University of Chicago

April 6, 2007

Page 2: download

2

References

MacLean, L C and W T Ziemba (2006) The Kelly criterion: theory and practice, in S A Zenios and W T Ziemba, eds., Handbook of Asset and Liability Management, Volume A: Theory and Methodology, North Holland.

Thorp, E O (2006) The Kelly criterion in blackjack, sports betting and the stock market, in S A Zenios and W T Ziemba, eds., Handbook of Asset and Liability Management, Volume A: Theory and Methodology, North Holland.

MacLean, Sanegre, Zhao, Ziemba (2004) Capital growth with security, Journal of Economic Dynamics and Control. How to calculate the optimal Kelly fraction subject to being above a wealth path with high probability.

Ziemba, W T (2005) The symmetric downside Sharpe ratio and the evaluation of great investors and speculators, Journal of Portfolio Management (Fall).

Chapter 6 of W T Ziemba (2003) The Stochastic Programming Approach to Asset Liability and Wealth Management, AIMR, updated into various chapters in Ziemba and Ziemba (2007), Scenarios for Risk Management and Global Investment Strategies, Wiley, July, which is Wilmott columns merged into a book.

Samuelson, P A and W T Ziemba (2007) Understanding the finite properties of Kelly log betting: a tale of five investors, Tech Report, UBC.

Page 3: download

3

Abstract

• The Kelly or capital growth criterion maximizes the expected logarithm of wealth, period by period, using log as its utility function.

• It has many desirable properties, such as being myopic in that today’s optimal decision does not depend upon yesterday’s or tomorrow’s data;

• it asymptotically maximizes long run wealth almost surely, and it attains arbitrarily large wealth goals faster than any other strategy.

• Also, in an economy with one log bettor and all other bettors using essentially different strategies, the log bettor will eventually get all the economy’s wealth.

• The drawback of log, with its essentially zero Arrow-Pratt absolute risk aversion, is that in the short run it is the most risky utility function one would ever consider.

• Since there is essentially no risk aversion, the wagers it suggests are very large and typically undiversified.

• Simulations show that log bettors have much more final wealth most of the time than those using other strategies but can essentially go bankrupt a small percentage of the time, even facing very favorable investment choices.

• One way to modify the growth-security profile is to use either ad hoc or scientifically computed fractional Kelly strategies that blend the log optimal portfolio with cash, either to keep one above the highest possible wealth path with high probability or to risk adjust the wealth with convex penalties for being below the path.

Page 4: download

4

Abstract (cont’d)

• For lognormally distributed assets this simply means using a negative power utility function whose risk aversion coefficient is in one-to-one correspondence with the Kelly fraction.

• For other asset returns this is an approximate solution.

• Thus one moves the risk aversion away from zero to a higher level.

• This results in a smoother wealth path but usually has less growth.

• This talk is a review of the good and bad properties of the Kelly and fractional Kelly strategies and a discussion of their use in practice by great investors and speculators, most of whom have become centi-millionaires or billionaires by isolating profitable anomalies and betting on them well with these strategies.

• The latter include Bill Benter, the Hong Kong racing guru, and Ed Thorp, the inventor of blackjack card counting, who compiled one of the finest hedge fund records.

• Both of these gamblers had very smooth, low variance wealth paths.

• Additionally, legendary investors such as John Maynard Keynes (0.8 Kelly) running the King’s College Cambridge endowment, George Soros (? Kelly) running the Quantum funds and Warren Buffett (full Kelly) running Berkshire Hathaway had similarly good results but had much more variable wealth paths.

• The difference seems to be in the choice of fraction and other risk control measures that relate to true diversification and position size relative to liquid assets under management.

Page 5: download

5

Success in investments has two key pillars:

• devising a strategy with positive expectation and

• betting the right amount to balance growth of one’s fortune against the risk of losses.

This talk discusses the Kelly or capital growth log utility criterion for investing.

The strategy has wonderful asymptotic long run properties:

• the log bettor will dominate other strategies with probability one and

• accumulate an unboundedly larger amount of wealth.

But in the short run the strategy can be very risky since it has very low Arrow-Pratt risk aversion.

Page 6: download

6

• Examples from blackjack, horseracing, lotteries and futures trading illustrate the theory and its use in practice.

• I have been fortunate to work/consult with seven individuals who turned a humble beginning with essentially zero wealth into hundreds of millions (at least five are billionaires) using security market imperfections and anomalies in racing, futures trading and options mispricings.

• Once they reach 200-300 million, their utility often shifts from log to linear: they bet on anything with a “positive expectation” as long as they diversify, and they move their wealth into the best hedge and alternative investment funds.

• All of them used Kelly or fractional Kelly betting strategies.

Fractional Kelly strategies provide more security but with less growth.

Page 7: download

7

Some points to learn from this research

• Means are by far the most important aspect of any portfolio problem.

• You must have the mean right to have good performance.

• If you have the mean right and do not overbet you should do well.

• In levered bets, it’s the left tail that can lead to trouble so you must not overbet or you can have a large disaster occurring without warning.

• Behavioral and other anomalies can yield strategies that have positive means.

• These biases suggest profitable positive mean strategies in racing, sports betting and options markets.

• The capital growth or Kelly criterion strategy yields the most wealth in the long run and dominates all other essentially different strategies.

• But in the short run, the expected log criterion with its essentially zero Arrow-Pratt risk aversion index is very risky and can have substantial losses.

• The most you should ever bet is the log optimal amount; betting more is suboptimal and betting double yields a zero growth rate.

• Negative power utility, which blends cash with the expected log maximizing portfolio, provides more security but has less long run growth.

• These fractional Kelly strategies are attractive for many investment situations; determination of what fraction to use depends on constrained optimization models.

Page 8: download

8

Growth versus Security: Tradeoffs in Dynamic Investment Analysis

• One is faced with a sequence of investments in periods 1, …, n, some favorable, some unfavorable

• Given an initial fortune, how should one invest over time to have long-run growth of their fortune while at the same time maintaining its security?

• Develop computational schemes so that the investor can have a desired growth and security tradeoff.

• Find simple operational policies that achieve these tradeoffs

• Transactions costs are crucial in practice so stochastic programming is needed

• Use results to analyze favorable investment situations

Page 9: download

9

POLAR APPROACHES

, Ziemba (2003)

Markowitz (1976), Hausch, Ziemba and Rubinstein (1981, 1985), Luenberger (1993) and others below

Page 10: download

10

Laplace (17xy) and others including Bühlmann

Page 11: download

11

If you bet on a horse, that’s gambling. If you bet you can make three spades, that’s entertainment. If you bet cotton will go up three points, that’s business. See the difference? Blackie Sherrod

Page 12: download

12

Games: favorable or unfavorable

• Blend growth versus security to your risk tolerance and the situation at hand

Page 13: download

13

Effect of data input errors on portfolio performance

Page 14: download

14

It’s the means that are the most important for investment success

Page 15: download

15

Mean Percentage Cash Equivalent Loss Due to Errors in Inputs

Pension funds 60-40 mix, RA=4 (Kallberg-Ziemba, 1983, Management Science)

Conclusion: spend your money getting good mean estimates and use historical variances and covariances

Reference: Chopra and Ziemba (1993), Journal of Portfolio Management, reprinted in Ziemba-Mulvey (1998) Worldwide asset and liability management, Cambridge University Press

Results similar in period 1 of multiple period models and the sensitivity is especially high in continuous time models. See examples in AIMR, 2003.

$t = \dfrac{100}{\tfrac{1}{2} R_A}$

Page 16: download

16

The results here apply to essentially all models. You must get the means right to win!

Optimal asset weights at stage 1 for varying levels of US equity means in a multiperiod stochastic programming pension fund model for Siemens Austria: see Geyer and Ziemba (2007, Operations Research)

• ’s euro equities (0.9 US)

• US bonds: mean 7.2, σ = 11.3

• Euro bonds: mean 6.8, σ = 3.7

Page 17: download

17

Assuming that the mean return for US stocks equals the long run mean of 12% estimated by Dimson et al. (2002, 2006), the model yields an optimal weight for equities of 100%.

With a mean return for US stocks of 9%, the optimal weight for equities is less than 30%.

This is in a five period, ten year stochastic programming model.

The sensitivity to the mean is much less in periods 2, …, T

Page 18: download

18

Asset proportions: not practical: bonds vs stocks vs T-bill futures

Page 19: download

19

The Symmetric Downside-Risk Sharpe Ratio

• The Sharpe ratio is a very useful measure of investment performance.

• However, it is based on mean-variance theory and thus is basically valid only for quadratic preferences or normal distributions.

• Hence skewed investment returns can lead to misleading conclusions.

• This is especially true for superior investors such as Warren Buffett and others with a large number of high returns.

• Many of these superior investors use capital growth wagering ideas to implement their strategies which leads to higher growth rates but also higher variability of wealth.

• A simple modification of the Sharpe ratio, which assumes that the upside deviation is identical to the downside risk, gives more realistic results.

Page 20: download

20

Using the Sharpe ratio

Page 21: download

21

Using a modified Sharpe ratio that does not penalize gains

Summary over funds of negative observations and arithmetic and geometric means

Page 22: download

22

The symmetric downside Sharpe ratio performance measure

• We want to determine, in some fair way, whether Warren Buffett really is a better investor than the rather good but lesser funds mentioned here, especially the Ford Foundation and the Harvard endowment.

• The idea is presented in a figure below, where the Berkshire Hathaway and Ford Foundation monthly returns are plotted as a histogram, with the losing months and the winning months shown as a smooth curve. We want to penalize Warren for losing but not for winning. So define the downside risk as

$$\sigma_{x-}^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})_-^2}{n-1},$$

where the benchmark $\bar{x}$ is zero and only the losing observations $x_i < \bar{x}$ enter the sum.

• This is the downside variance measured from zero, not the mean, so it is more precisely the downside risk.

• To get the total variance we use twice the downside variance, so the measure is $\mathrm{DSSR} = (\bar{R} - R_F)/(\sqrt{2}\,\sigma_{x-})$.
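A minimal sketch of this measure in Python may make the steps concrete. It assumes the downside variance divides the squared losses by n - 1 (n being all observations), that the benchmark is zero, and that the ratio is left per period rather than annualized; the function names and the sample returns are hypothetical.

```python
import math

def downside_variance(returns, benchmark=0.0):
    """Sum of squared shortfalls below the benchmark, divided by n - 1.

    Assumption: n counts all observations, not only the losing ones.
    """
    n = len(returns)
    losses = [(x - benchmark) ** 2 for x in returns if x < benchmark]
    return sum(losses) / (n - 1)

def symmetric_downside_sharpe(returns, risk_free=0.0):
    """(mean return - risk free) / sqrt(2 * downside variance), per period."""
    mean = sum(returns) / len(returns)
    total_variance = 2.0 * downside_variance(returns)  # mirror the downside
    return (mean - risk_free) / math.sqrt(total_variance)

if __name__ == "__main__":
    sample = [0.02, -0.01, 0.03, -0.02, 0.01]  # made-up monthly returns
    print(round(symmetric_downside_sharpe(sample), 4))
```

Gains enter only through the mean, so a fund with many large winning months is not penalized for them in the denominator.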

Page 23: download

23

The wealth levels from December 1985 to April 2000 for the Windsor Fund of John Neff, the Ford Foundation, the Tiger Fund of Julian Robertson, the Quantum Fund of George Soros and Berkshire Hathaway, the fund run by Warren Buffett, as well as the S&P500 total return index.

Page 24: download

24

Ford Foundation and Harvard Investment Corporation Returns, quarterly data, June 1977 to March 2000

Page 25: download

25

Comparison of ordinary and symmetric downside Sharpe yearly performance measures

Only Buffett improves, but he still does not beat the Ford Foundation. With the quarterly data, Harvard is also better than Buffett, but not better than Ford.

Why? Tails still too fat

Thorp (2006) shows that Buffett is essentially a full Kelly bettor.

Page 26: download

26

Berkshire Hathaway versus Ford Foundation, monthly returns distribution, January 1977 to April 2000

Page 27: download

27

Return distributions of all the funds, quarterly returns distribution, December 1985 to March 2000

Page 28: download

28

The Chest Fund, 1927-1945 (Keynes)

$u(w) = -w^{-0.25}$ (80% Kelly, 20% cash), see Ziemba (2003)

Page 29: download

29

Gamblers like smooth wealth paths using fractional Kelly strategies

Page 30: download

30

Princeton Newport Partners, LP, cumulative results, Nov 1968-Dec 1998 (Thorp)

DSSR=13.8

Page 31: download

31

PNP: 15.1% net vs 10.2% for the S&P500

Page 32: download

32

Log Utility

• In the theory of optimal investment over time, it is not quadratic (the utility function behind the Sharpe ratio) but log that yields the most long term growth.

• But the elegant results on the Kelly (1956) criterion, as it is known in the gambling literature, or capital growth theory, as it is known in the investments literature, are long run asymptotic results. They were proved rigorously by Breiman (1961) and generalized by Algoet and Cover (1988); see the surveys by Hakansson and Ziemba (1995) and MacLean and Ziemba (2006).

• However, the Arrow-Pratt absolute risk aversion index of the log utility criterion, $R_A(w) = -u''(w)/u'(w) = 1/w$, is essentially zero, where u is the utility function of wealth w and primes denote differentiation.

• Hence, in the short run, log can be an exceedingly risky utility function with wide swings in wealth values.

Page 33: download

33

Long run exponential growth is equivalent to maximizing the expected log of one period’s returns
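The equivalence can be sketched for the simplest case. This assumes repeated independent even-money bets, each won with probability p and lost with probability q = 1 - p, with a fixed fraction f of current wealth wagered each time. The symbols $W_0$, $W_N$, M (the number of wins in N bets) and X (the outcome ±1 of a single bet) are introduced here for illustration only.

```latex
\begin{align*}
W_N &= W_0\,(1+f)^{M}(1-f)^{N-M},\\
\frac{1}{N}\log\frac{W_N}{W_0}
  &= \frac{M}{N}\log(1+f) + \frac{N-M}{N}\log(1-f)\\
  &\xrightarrow[N\to\infty]{\text{a.s.}}\; p\log(1+f) + q\log(1-f)
   \;=\; E\bigl[\log(1+fX)\bigr].
\end{align*}
```

The limit follows from the strong law of large numbers, since M/N converges to p almost surely. The long run exponential growth rate on the left is therefore the one period expected log return on the right.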

Page 34: download

34

• Thus the criterion of maximizing the long run exponential rate of asset growth is equivalent to maximizing the one period expected logarithm of wealth. So an optimal policy is myopic.

• Max G(f) = p log(1+f) + q log(1-f). Setting G'(f) = p/(1+f) - q/(1-f) = 0 gives f* = p - q.

• The optimal fraction to bet is the edge p-q
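The growth function and its maximizer can be checked numerically. The following is a small sketch for the even-money case only, with function names chosen here for illustration; it compares the closed form f* = p - q with a brute-force grid search.

```python
import math

def growth_rate(f, p):
    """Expected log growth per even-money bet when wagering fraction f."""
    q = 1.0 - p
    return p * math.log(1.0 + f) + q * math.log(1.0 - f)

def kelly_fraction(p):
    """Closed-form maximizer for an even-money bet: f* = p - q."""
    return p - (1.0 - p)

def numeric_argmax(p, steps=100_000):
    """Brute-force check of the closed form on a grid over [0, 1)."""
    grid = (i / steps for i in range(steps))
    return max(grid, key=lambda f: growth_rate(f, p))

if __name__ == "__main__":
    p = 0.51  # blackjack-style 2% edge
    f_star = kelly_fraction(p)
    print("closed form f*:", f_star)
    print("grid search f*:", numeric_argmax(p))
    print("growth at f*:  ", growth_rate(f_star, p))
    print("growth at 2f*: ", growth_rate(2 * f_star, p))
```

With p = 0.51 the growth rate at twice the Kelly fraction comes out very close to zero, which matches the overbetting remark made elsewhere in the talk.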

Page 35: download

35

Slew O’ Gold, 1984 Breeders Cup Classic

f*=64% for place/show; suggests fractional Kelly.

Page 37: download

37

Classic Breiman Results

Page 39: download

39

Kelly and half Kelly medium time simulations: Ziemba-Hausch (1986)

These were independent

Page 40: download

40

The good, the bad and the ugly

700 bets, all independent, with a 14% edge:

With full Kelly the final wealth is more than 100 times initial wealth 166 times, but only once with half Kelly.

But the probability of being ahead is higher with half Kelly: 95.4% vs 87%.

Minimum final wealth is 18 with full Kelly and 145 with half Kelly.

So even with a 14% edge on every bet, you still lose over 98% of your fortune in the bad scenarios.

With half Kelly, you lose half of your wealth only 1% of the time, but 8.40% of the time with full Kelly.
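A simulation of this kind can be sketched as follows. This is not the Ziemba-Hausch (1986) experiment itself: it assumes a single even-money bet with win probability 0.57 (a 14% edge), 1000 trials, an initial wealth of 1000 and a fixed seed, all of which are illustrative choices, so the numbers it prints will differ from those on the slide.

```python
import random
import statistics

def simulate_final_wealth(fraction, p=0.57, n_bets=700, n_trials=1000,
                          w0=1000.0, seed=0):
    """Final wealth of each trial when betting a fixed fraction of wealth."""
    rng = random.Random(seed)  # fixed seed: both strategies see the same luck
    finals = []
    for _ in range(n_trials):
        w = w0
        for _ in range(n_bets):
            stake = fraction * w
            w += stake if rng.random() < p else -stake
        finals.append(w)
    return finals

def summarize(finals, w0=1000.0):
    """Summary statistics comparable to those reported for such experiments."""
    n = len(finals)
    return {
        "min": min(finals),
        "median": statistics.median(finals),
        "share_ahead": sum(w > w0 for w in finals) / n,
        "share_100x": sum(w > 100 * w0 for w in finals) / n,
        "share_below_half": sum(w < 0.5 * w0 for w in finals) / n,
    }

if __name__ == "__main__":
    full_kelly = 2 * 0.57 - 1  # f* = p - q = 0.14 for an even-money bet
    for label, frac in [("full Kelly", full_kelly),
                        ("half Kelly", full_kelly / 2)]:
        print(label, summarize(simulate_final_wealth(frac)))
```

The qualitative pattern to look for is the one described in the text: full Kelly has the higher typical final wealth and the fatter right tail, while half Kelly finishes ahead more often and has a much milder worst case.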

Page 41: download

41

Kentucky Derby 1934-1998

• Use the inefficient market system in Hausch, Ziemba, Rubinstein (1981) and the Ziemba-Hausch books

• Place/show wagers made when prices are off sufficiently and EX ≥ 1.10

w0 = $2500; 63 years; 72 wagers with 45 (62.5%) successful

Page 42: download

42

Typical wealth level histories with one scenario (the actual results) from place and show betting (Dr Z system) on the Kentucky Derby, 1934-1994 with Kelly, half Kelly and betting on the favorite strategies

Page 43: download

43

Overbetting

Probability of doubling and quadrupling before halving and relative growth rates versus fraction of wealth wagered for blackjack (2% advantage, p=0.51 and q=0.49)

Should you ever bet above 0.02, which corresponds to positive power utility? I think it's dominated!

Betting more than the Kelly bet is non-optimal as risk increases and growth decreases; betting double the Kelly leads to a growth rate of zero plus the risk-free rate.

LTCM was at this level or more, see AIMR, 2003.
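The tradeoff between growth and the chance of doubling before halving can be sketched with closed-form expressions. These come from a continuous-time (diffusion) approximation in which log wealth is a Brownian motion with drift, not from the discrete blackjack bets themselves, so they approximate rather than reproduce any table in the talk. The parameter `kelly_multiple` (the bet as a multiple c of the full Kelly fraction) and the function names are introduced here for illustration.

```python
import math

def relative_growth_rate(kelly_multiple):
    """Growth rate relative to full Kelly: c * (2 - c).

    Equals 1 at full Kelly (c = 1) and 0 at double Kelly (c = 2).
    """
    c = kelly_multiple
    return c * (2.0 - c)

def prob_reach_before(kelly_multiple, upper=2.0, lower=0.5):
    """P(wealth hits upper * W0 before lower * W0) when betting c * f*."""
    c = kelly_multiple
    if c <= 0:
        raise ValueError("kelly_multiple must be positive")
    k = 1.0 - 2.0 / c  # exponent from the scale function of the diffusion
    if abs(k) < 1e-12:  # c == 2: log wealth has zero drift
        return math.log(1.0 / lower) / math.log(upper / lower)
    return (1.0 - lower ** k) / (upper ** k - lower ** k)

if __name__ == "__main__":
    for c in (0.25, 0.5, 1.0, 1.5, 2.0):
        print(f"c={c:4.2f}  relative growth={relative_growth_rate(c):5.3f}  "
              f"P(double before half)={prob_reach_before(c):5.3f}")
```

Under this approximation, moving from full Kelly to half Kelly keeps three quarters of the growth rate while raising the chance of doubling before halving, and anything beyond full Kelly lowers both.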

Page 44: download

44

Growth Rates Versus Probability of Doubling Before Halving for Blackjack

Page 45: download

45

Fractional Kelly and negative power utility

$u(w) = -w^{\alpha}$, $\alpha < 0$

$\alpha \to 0$: $u \to \log$

$f = 1/(1-\alpha)$ = fraction (Kelly) in log optimal portfolio, rest in cash

$\alpha = 0$: $f = 1$, full Kelly

$\alpha = -1$: $f = 1/2$, 1/2 Kelly

$\alpha = -3$: $f = 1/4$, 1/4 Kelly (futures trading down here)

This is exact with lognormality and approximate otherwise.
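The mapping between the power utility exponent and the Kelly fraction is simple enough to put in code. This sketch writes the utility as u(w) = -w^α with α < 0 (the symbol α is a notational assumption here) and uses f = 1/(1 - α); the function names are hypothetical.

```python
def kelly_fraction_from_alpha(alpha):
    """Fraction of the log optimal portfolio held: f = 1 / (1 - alpha)."""
    if alpha > 0:
        raise ValueError("alpha must be <= 0 (alpha = 0 is the log limit)")
    return 1.0 / (1.0 - alpha)

def alpha_from_kelly_fraction(f):
    """Inverse mapping: alpha = 1 - 1/f for a fraction 0 < f <= 1."""
    if not 0.0 < f <= 1.0:
        raise ValueError("f must be in (0, 1]")
    return 1.0 - 1.0 / f

if __name__ == "__main__":
    for alpha in (0.0, -0.25, -1.0, -3.0):
        print(alpha, kelly_fraction_from_alpha(alpha))
```

For example, α = -0.25 gives f = 0.8, the 80% Kelly, 20% cash blend attributed to Keynes earlier in the talk.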

Page 46: download

46

Samuelson’s critique of Kelly betting

Page 47: download

47

Commodity Trading: Turn of Year Effect

Average excess returns of smallest minus largest decile of US stocks, 1926-93, Source: Ibbotson Associates

Small cap stocks have outperformed large cap stocks in January on a regular basis since 1926

Page 48: download

48

[Figure: Value Line minus S&P 500 spread, cash (VL-S&P) and futures (VL-S&P), 1993/1994 turn of the year, December 1 to January 31, with buy and sell points marked]

Futures play with anticipation, mid December to mid January; this is a typical year in the mid 90s, Value Line versus S&P, 1992-3

Page 49: download

49

Turn of the year effect

Relative growth rate and probability of doubling, tripling or tenfolding before halving for various Kelly strategies

Probability of reaching $10 million before ruin for Kelly, half Kelly and quarter Kelly strategies

Page 50: download

50

Turn of the year effect, recent developments

Futures markets - much more violent

Russell 2000 - has more volume than Value Line

Effect moved into December

Textbooks and finance experts say effect is not there

Graphs in Hensel-Ziemba paper in Keim-Ziemba (2000) Worldwide security market imperfections, Cambridge University Press.

Doing this trade is like driving a dynamite truck smoking a cigar. You do it carefully.

Rendon-Ziemba (2005) update to 2005 turn of the year Value Line/S&P500 and Russell 2000/S&P500 spread trades

Page 51: download

51

Unpopular numbers in the Canadian 6/49, 1984, 1986, and 1996 Lotto

Page 52: download

52

Lotto games, experimental data

Page 53: download

53

Probability of doubling, quadrupling and tenfolding before halving, Lotto 6/49

Case A Case B

Page 54: download

54

Probability of reaching the goal of $10 million before falling to $25,000 with various initial wealth levels for Kelly, 1/2 Kelly and 1/4 Kelly wagering strategies

The downside of the analysis is that the expected time to win a lot is in the millions of years.

Page 55: download

55

The Investors

• Tom, I believe, is overbetting and dominated and will go bankrupt

• Harriet has a limited degree of risk tolerance, which fits well with lots of empirical Wall St equity premium data

Page 56: download

56

Some tests

Page 68: download

68

Horseracing

• market in miniature

• fundamental and technical systems

• returns and odds are determined by
1) participants -- like stock market, unlike roulette
2) transaction costs -- track take (17%), breakage; rebates now plus Betfair (long short)

• bet to
1) win -- must be 1st
2) place -- must be 1st or 2nd
3) show -- must be 1st, 2nd or 3rd

Page 69: download

69

Place market in horseracing

Bettors do not like place and show bets.

Inefficiencies are possible since:

1) more complex wager

2) prob(horse places) > prob(horse wins) ==> favorites may be good bets

To investigate place bets we need:

1) the place payoffs

2) their likelihood

3) the expected place payoffs

4) a betting strategy, if expected payoffs are positive

Page 70: download

70

The Idea

1. Use data in a simple market (win) to generate probabilities of outcomes

2. Then use those in a complex market (place and show) to find positive expectation bets

3. Then bet on them following the capital growth theory to maximize long run wealth
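Step 1 to step 2 can be sketched as follows. The sketch assumes the Harville formulas for turning win probabilities into probabilities of finishing second and third; that choice, the function names and the sample probabilities are assumptions made here, and in practice the win probabilities would first be adjusted for known biases.

```python
def place_probability(q, i):
    """P(horse i finishes 1st or 2nd), Harville assumption.

    q is the list of win probabilities (summing to 1) taken from the win pool.
    """
    first = q[i]
    second = sum(q[j] * q[i] / (1.0 - q[j])
                 for j in range(len(q)) if j != i)
    return first + second

def show_probability(q, i):
    """P(horse i finishes 1st, 2nd or 3rd), Harville assumption."""
    n = len(q)
    third = sum(q[j] * (q[k] / (1.0 - q[j])) * (q[i] / (1.0 - q[j] - q[k]))
                for j in range(n) if j != i
                for k in range(n) if k != i and k != j)
    return place_probability(q, i) + third

if __name__ == "__main__":
    win_probs = [0.5, 0.3, 0.2]  # made-up three-horse field
    for i in range(len(win_probs)):
        print(i, place_probability(win_probs, i), show_probability(win_probs, i))
```

A quick sanity check is that the place probabilities sum to 2 over the field, since exactly two horses place in every race.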

Page 71: download

71

Effect of transactions costs, calculation of optimal place and show Kelly bets

Non concave program but it seems to converge.

In practice, adjust q’s to replicate biases.

Page 73: download

73

Use in a calculator

What we do in the system is to reduce the non-convex log optimization problem down to four numbers: W_i, W and S_i, S (or P_i, P), the amounts bet on horse i and in total in the win pool and in the show (or place) pool.

Using thousands of race results, we regress the expected value and the optimal Kelly bet as functions of these four variables.

Hence, you just find horses where the relative amount bet to place or show is below the relative amount bet in the win pool.

The calculator tells you when the expected value is, say, 1.10 or better and calculates the optimal Kelly bet.

So this can be done in, say, 15 seconds.

Page 74: download

74

Exhibition Park, 1978, typical returns.

Page 75: download

75

Aqueduct, 1981-82

Page 76: download

76

Expected value approximation equations

$$\text{Ex Place}_i = 0.319 + 0.559\left(\frac{w_i/w}{p_i/p}\right)$$

$$\text{Ex Show}_i = 0.543 + 0.369\left(\frac{w_i/w}{s_i/s}\right)$$

• Expected value (and optimal wager) are functions of only four numbers - the totals and the horse in question.

• These equations approximate the full optimized optimal growth model.

• Solving the complex NLP: too much work and too much data for most people.

• This is used in the calculators; see Hausch-Ziemba (1985, Management Science) for differing track takes, etc.
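The two regression equations on this slide, Ex Place = 0.319 + 0.559 (w_i/w)/(p_i/p) and Ex Show = 0.543 + 0.369 (w_i/w)/(s_i/s), translate directly into code. In this sketch the coefficients are taken from the slide, the 1.10 cutoff is the threshold mentioned elsewhere in the talk, and the function names and pool figures are hypothetical.

```python
def expected_place(w_i, w, p_i, p):
    """Approximate expected value per dollar bet to place on horse i.

    w_i, w: amount bet on horse i and in total in the win pool.
    p_i, p: amount bet on horse i and in total in the place pool.
    """
    return 0.319 + 0.559 * (w_i / w) / (p_i / p)

def expected_show(w_i, w, s_i, s):
    """Approximate expected value per dollar bet to show on horse i."""
    return 0.543 + 0.369 * (w_i / w) / (s_i / s)

def is_candidate(expected_value, cutoff=1.10):
    """Flag a wager whose approximate expected value clears the cutoff."""
    return expected_value >= cutoff

if __name__ == "__main__":
    # Made-up pools: horse has 40% of the win pool but 20% of the place pool.
    ev = expected_place(w_i=400, w=1000, p_i=200, p=1000)
    print(ev, is_candidate(ev))
```

The expected value rises as the horse's share of the place or show pool falls relative to its share of the win pool, which is exactly the screen described for the calculator.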

Page 77: download

77

1983 Kentucky Derby

Page 78: download

78

1991 Breeders’ Cup Race 5

Page 79: download

79

Simulations in 2004-5

Real results April 2005-March 2006

Up ~ 36,000 ~ 2% on bets ~ 1.5 M,

System -7%, rebate ~ 9%, edge ~ +2%

Page 80: download

80

Calculating the optimal Kelly fraction

Kelly fractions and path achievement

To stay above a wealth path using a Kelly strategy is very difficult

• the more attractive the investment opportunity,

• the larger the bet size and

• hence the larger is the chance of falling below the path.

MSZZ (MacLean, Sanegre, Zhao and Ziemba, 2004), using a continuous time model with lognormally distributed assets, calculate the Kelly fraction needed to stay above a wealth path at various points in time with a high, exogenously specified value at risk probability.

Convex case like Geyer-Ziemba (2007) Siemens Vienna pension model - can do on a computer; will develop the math
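A heavily simplified sketch of the idea follows. It is not the MSZZ model: it assumes one risky asset following geometric Brownian motion with drift `mu` and volatility `sigma`, cash earning `r`, continuous rebalancing to a fixed fraction f, and a single check at the horizon instead of a whole wealth path. All names and parameter values are hypothetical. It searches for the largest fraction, no greater than full Kelly, whose terminal wealth stays above a floor with probability at least 1 - alpha.

```python
import math
from statistics import NormalDist

def prob_above_floor(f, mu, r, sigma, horizon, floor):
    """P(W_T >= floor * W_0) when a fixed fraction f is held in the risky asset."""
    if f == 0.0:  # all cash: terminal wealth is deterministic
        return 1.0 if r * horizon >= math.log(floor) else 0.0
    mean = (r + f * (mu - r) - 0.5 * f * f * sigma * sigma) * horizon
    sd = f * sigma * math.sqrt(horizon)
    return 1.0 - NormalDist(mean, sd).cdf(math.log(floor))

def secured_kelly_fraction(mu, r, sigma, horizon, floor, alpha, steps=2000):
    """Largest grid fraction in [0, f*] meeting the security constraint.

    Returns None when even the all-cash position misses the floor.
    """
    f_star = (mu - r) / sigma ** 2  # full Kelly fraction under these assumptions
    best = None
    for i in range(steps + 1):
        f = f_star * i / steps
        if prob_above_floor(f, mu, r, sigma, horizon, floor) >= 1.0 - alpha:
            best = f
    return best

if __name__ == "__main__":
    f = secured_kelly_fraction(mu=0.10, r=0.03, sigma=0.20,
                               horizon=1.0, floor=0.90, alpha=0.05)
    print("full Kelly:", (0.10 - 0.03) / 0.20 ** 2, "secured fraction:", f)
```

In this example the secured fraction is far below full Kelly, which illustrates the point on the slide: the more attractive the opportunity, the larger the Kelly bet and the harder it is to stay above the path.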

Page 81: download

81

The planning horizon is T=3, with 64 scenarios each with probability 1/64

Page 82: download

82

With initial wealth W(1)=1, the value at risk is a. The optimal investment decisions and optimal growth rate for a, the secured average annual growth rate and 1-a, the security level are shown in the table.

Page 87: download

87

Guide to Capital Growth Theory and Kelly Criterion Literature

1956 Kelly heuristic paper, original idea (Latane, 1957, also)

1961 Breiman, original correct proofs

1969 Thorp original application to sports betting

1981 Hausch-Ziemba-Rubinstein, application to place and show system, books later –1984, 86, 87

1988 Algoet and Cover most general proofs

1994 Hausch-Lo-Ziemba reprints many key articles

1995 Hakansson & Ziemba survey, finance viewpoint, in Finance Handbook; reviews Hakansson’s work

1998 Janacek MSc Thesis, Charles Univ. creative student

1998 Thorp brilliant math analysis

1999 MacLean-Ziemba, fractional Kelly examples (series of papers 1986+)

2002* MacLean-Ziemba theory of targets rather than time, Time to Wealth

2003 Ziemba AIMR, more simply written, many references

2004 MacLean-Ziemba et al how to calculate the “optimal” fractional Kelly; controversial constrained optimization, JEDC

Page 88: download

88

Some properties of the Capital Growth Theory

Page 89: download

89

Some properties of the Capital Growth Theory (cont’d)

Page 90: download

90

Some properties of the Capital Growth Theory (cont’d)

Page 91: download

91

Some properties of the Capital Growth Theory (cont’d)

Page 92: download

92

Some properties of the Capital Growth Theory (cont’d)

Page 93: download

93

Some properties of the Capital Growth Theory (cont’d)

Page 94: download

94

References

Essentially all of the material in this talk is in the following books plus the papers handed out

Ziemba, The Stochastic Programming Approach to Asset Liability Management, AIMR, 2003

Ziemba-Hausch, Dr Z’s Beat the Racetrack, William Morrow, 1987 (has UK betting system)

Hausch-Lo-Ziemba, Efficiency of Racetrack Betting Systems, Academic Press, 1994. Classic new and reprinted articles, bible for Hong Kong professional betting teams. Originals sell for huge prices as high as $12,000 I am told, I sold one for $1400 last week.

Ziemba-Vickson, Stochastic Optimization Models in Finance, Academic Press, 1975. Classic articles, new articles, huge collection of portfolio theory, problems. Reprinted by World Scientific, Singapore, 2006.

Ziemba et al, 6/49 Lotto Guidebook, 1986

Ziemba-Hausch, Betting at the Racetrack, 1986, exotic bet pricing

Books all available; email wtzimi@mac.com for information.