10 WAYS BACKTESTS LIE TUCKER BALCH, PH.D. PROFESSOR, GEORGIA TECH CO-FOUNDER AND CTO, LUCENA RESEARCH
Transcript
Page 1: 10 Ways Backtests Lie by Tucker Balch

10 WAYS BACKTESTS LIE

TUCKER BALCH, PH.D.
PROFESSOR, GEORGIA TECH
CO-FOUNDER AND CTO, LUCENA RESEARCH

Page 2: 10 Ways Backtests Lie by Tucker Balch

OR… Mistakes quant developers make that cause backtests to be inaccurate.

Page 3: 10 Ways Backtests Lie by Tucker Balch

WHAT I’LL COVER
Introductions
What is a backtest?
How backtests lie:
1.  In sample backtesting
2.  Survivor bias
3.  Assume you can observe the close and trade at the close
4.  Ignoring market impact
5.  Assume you can buy $10M of a $1M company
6.  Data mining fallacy
7.  Stateful strategy luck
8.  Buy at the open
9.  Trust complex models
10.  Don’t forward test

Page 4: 10 Ways Backtests Lie by Tucker Balch

ABOUT THE SPEAKER
•  Professor of Interactive Computing at Georgia Institute of Technology.
•  Teach courses in Artificial Intelligence and Finance.
•  Teach MOOCs on Machine Learning for Trading.
•  Published over 120 research publications related to Robotics and Machine Learning.
•  Co-founder of Lucena Research.

Page 5: 10 Ways Backtests Lie by Tucker Balch

ABOUT THE SPEAKER

New book ->

Page 6: 10 Ways Backtests Lie by Tucker Balch

ABOUT LUCENA RESEARCH
•  We are a fin-tech company that employs experts in Computational Finance, Quantitative Analysis, and Software Development.
•  We deliver investment decision support technology at a fraction of the cost of an in-house quant shop.
•  Python-based infrastructure.
•  Erez Katz, CEO
•  Eric Davidson, VP
http://lucenaresearch.com

Page 7: 10 Ways Backtests Lie by Tucker Balch

BUILDING A MODEL FROM DATA

Page 15: 10 Ways Backtests Lie by Tucker Balch

BACKTESTING TO VALIDATE THE MODEL

Page 21: 10 Ways Backtests Lie by Tucker Balch

BACKTESTING TO VALIDATE THE MODEL

Roll forward cross validation
Out of sample validation
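
Roll forward cross validation can be sketched as a sequence of training windows, each followed immediately by a test window that lies later in time. The function name, window sizes, and the choice to advance by one test window per step are assumptions made for this illustration, not details from the slides.

```python
from typing import Iterator, List, Tuple


def roll_forward_splits(
    n_periods: int, train_size: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs that move forward in time.

    Each test window starts right after its training window ends, so the
    model is never evaluated on periods it was trained on.
    """
    start = 0
    while start + train_size + test_size <= n_periods:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size  # slide both windows forward by one test window


# Hypothetical example: 10 periods, train on 4, test on the next 2.
splits = list(roll_forward_splits(n_periods=10, train_size=4, test_size=2))
for train, test in splits:
    print("train", train, "test", test)
```

Every test index is strictly later than every index in its matching training window, which is the property that makes each fold out of sample.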

Page 22: 10 Ways Backtests Lie by Tucker Balch

10 WAYS BACKTESTS LIE

Page 23: 10 Ways Backtests Lie by Tucker Balch

1. IN SAMPLE BACKTESTING
Description: Backtesting over the same data you used to train your model.

Page 24: 10 Ways Backtests Lie by Tucker Balch

1. IN SAMPLE BACKTESTING
This method is doomed to succeed spectacularly!

Page 25: 10 Ways Backtests Lie by Tucker Balch

1. IN SAMPLE BACKTESTING

Training

Testing

Page 26: 10 Ways Backtests Lie by Tucker Balch

1. IN SAMPLE BACKTESTING
How to avoid?

Training

Testing

Page 27: 10 Ways Backtests Lie by Tucker Balch

1. IN SAMPLE BACKTESTING
How to avoid? More generally, build safeguards and procedures to prevent testing over the same data you train over.
E.g., train over 2007, test over 2008 forward.
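
One possible safeguard is a date-based split plus a check that refuses to run when the two periods overlap. This is a minimal sketch; the row layout, the function names, and the sample values are all made up for illustration.

```python
from datetime import date
from typing import List, Tuple

Row = Tuple[date, float]  # (observation date, value); this layout is an assumption


def split_by_date(rows: List[Row], test_start: date) -> Tuple[List[Row], List[Row]]:
    """Put everything before test_start in training, the rest in testing."""
    train = [r for r in rows if r[0] < test_start]
    test = [r for r in rows if r[0] >= test_start]
    return train, test


def assert_no_overlap(train: List[Row], test: List[Row]) -> None:
    """Safeguard: refuse to backtest if any test date is not after training."""
    if train and test and max(r[0] for r in train) >= min(r[0] for r in test):
        raise ValueError("test period overlaps the training period")


# Made-up observations: train over 2007, test over 2008 forward.
rows = [
    (date(2007, 3, 1), 100.0),
    (date(2007, 9, 3), 104.0),
    (date(2008, 1, 2), 98.0),
    (date(2008, 6, 2), 91.0),
]
train, test = split_by_date(rows, test_start=date(2008, 1, 1))
assert_no_overlap(train, test)  # passes; would raise if the periods overlapped
```

Calling the check at the start of every backtest run makes the mistake fail loudly instead of producing a flattering result.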

Page 28: 10 Ways Backtests Lie by Tucker Balch

2. SURVIVOR BIAS
Description: Selective use of data in a statistical study that emphasizes examples that are “alive” at the end of the study. The significance of the bias depends on how important survival is to the quantity being measured.

Page 29: 10 Ways Backtests Lie by Tucker Balch

2. SURVIVOR BIAS
Example: Company claims: “Our drug reduces the blood pressure of those who take the drug over time.”

5 year study:
•  Randomly select 500 cardiac patients
•  Administer drug to them
•  Measure their blood pressure monthly

Results:
•  160/110 average first month
•  135/80 average at end of study

Do you believe this is a good drug?

Page 30: 10 Ways Backtests Lie by Tucker Balch

2. SURVIVOR BIAS
Problem: 58 of the patients they started with have died since the start of the study.

Note: 58 of the members of the S&P 500 in 2008 (11.6%) are now delisted. Not just out of the S&P 500, but gone as companies.

Page 34: 10 Ways Backtests Lie by Tucker Balch

2. SURVIVOR BIAS

Green: Current S&P 500, Purple: Point in time S&P 500

--Lucena Research

Page 35: 10 Ways Backtests Lie by Tucker Balch

2. SURVIVOR BIAS
How to prevent?
•  Use historic index membership.
•  Pair with survivor-bias-free (SBF-free) data.
•  Use these indices as your universe for testing.
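
A point-in-time universe can be sketched as a lookup that returns only the symbols that were index members on the simulated date. The membership table, tickers, and dates below are hypothetical; a real backtest would load them from a historic constituents data source.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple

# Hypothetical membership records: symbol -> (date added, date removed or None).
MEMBERSHIP: Dict[str, Tuple[date, Optional[date]]] = {
    "AAA": (date(2000, 1, 3), None),
    "BBB": (date(2003, 6, 2), date(2009, 3, 2)),  # later delisted
    "CCC": (date(2012, 5, 1), None),              # joined after 2008
}


def universe_on(day: date) -> List[str]:
    """Return the symbols that were index members on the given day."""
    members = []
    for symbol, (added, removed) in MEMBERSHIP.items():
        if added <= day and (removed is None or day < removed):
            members.append(symbol)
    return sorted(members)


print(universe_on(date(2008, 1, 2)))  # includes BBB, which later disappears
print(universe_on(date(2020, 1, 2)))  # current-style list: BBB is gone, CCC is in
```

Backtesting 2008 with the 2020 list would silently drop the company that failed and add one that was not yet a member, which is the bias the slide describes.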

Page 36: 10 Ways Backtests Lie by Tucker Balch

3. OBSERVING THE CLOSE
Description: You assume you can observe information recorded at market close, and trade on it.

Examples:
•  Closing price/volume
•  Technicals based on price/volume

Page 37: 10 Ways Backtests Lie by Tucker Balch

3. OBSERVING THE CLOSE
This is a specific case of “look ahead bias.” Other examples:
•  Earnings reports
•  News feeds

Page 38: 10 Ways Backtests Lie by Tucker Balch

3. OBSERVING THE CLOSE
How to prevent? Ensure that information with timestamp X cannot be acted on until X+1.

Example: Data marked January 15 cannot be traded until the open on January 16.
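
The X to X+1 rule can be enforced by lagging every signal by one period before the backtester is allowed to see it. This is a minimal sketch with made-up signal values; the function name is hypothetical.

```python
from typing import List, Optional


def lag_one_period(signals: List[float]) -> List[Optional[float]]:
    """Shift signals so the value stamped on day X is first usable on day X+1."""
    if not signals:
        return []
    # Day 0 has no prior information, so nothing can be traded on it.
    return [None] + signals[:-1]


# Hypothetical daily signals computed from each day's closing data.
close_signals = [0.2, -0.1, 0.4, 0.0]
tradable = lag_one_period(close_signals)
# tradable[1] is the signal from day 0, to be acted on at the day-1 open.
print(tradable)
```

Applying the lag once, at the point where data enters the backtester, is easier to audit than trusting each strategy to avoid peeking.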

Page 39: 10 Ways Backtests Lie by Tucker Balch

4. IGNORING MARKET IMPACT
Description: The act of trading affects price. Historical data does not include your trades and is therefore not an accurate representation of the price you would get.

Page 43: 10 Ways Backtests Lie by Tucker Balch

4. IGNORING MARKET IMPACT

Swetha Shivakumar, Georgia Tech

Page 44: 10 Ways Backtests Lie by Tucker Balch

4. IGNORING MARKET IMPACT
How to prevent? Include a “slippage” or “market impact” model in your backtests.
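
As a rough illustration, a slippage model can adjust each fill price against the trader by a fixed cost plus a term that grows with the order's share of daily volume. The linear form and both parameter values are assumptions chosen for this sketch, not a calibrated model and not the one used in the talk.

```python
def fill_price(
    price: float,
    shares: int,
    daily_volume: int,
    fixed_bps: float = 5.0,
    impact_bps_per_pct: float = 1.0,
) -> float:
    """Estimate the price paid (shares > 0) or received (shares < 0).

    Cost in basis points = fixed_bps + impact_bps_per_pct * (order size as
    a percentage of daily volume). Both parameters are illustrative.
    """
    if daily_volume <= 0:
        raise ValueError("daily_volume must be positive")
    if shares == 0:
        return price  # no trade, no slippage
    participation_pct = 100.0 * abs(shares) / daily_volume
    cost_bps = fixed_bps + impact_bps_per_pct * participation_pct
    direction = 1 if shares > 0 else -1  # buys fill higher, sells fill lower
    return price * (1 + direction * cost_bps / 10_000)


# Made-up order: 10,000 shares in a stock trading 1,000,000 shares a day.
buy = fill_price(price=100.0, shares=10_000, daily_volume=1_000_000)
sell = fill_price(price=100.0, shares=-10_000, daily_volume=1_000_000)
print(buy, sell)
```

Even a crude model like this moves every fill in the unfavorable direction, so strategies that trade often or in size lose some of their backtested edge.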

Page 45: 10 Ways Backtests Lie by Tucker Balch

5. BUY $10M OF A $1M COMPANY
Description: Backtest allows a strategy to buy (or short) as much of a symbol as it wants.

Page 46: 10 Ways Backtests Lie by Tucker Balch

5. BUY $10M OF A $1M COMPANY
There often is real alpha in thinly traded stocks.

Page 47: 10 Ways Backtests Lie by Tucker Balch

5. BUY $10M OF A $1M COMPANY
This is a specific example of the more general issue of capacity limitations.

Page 48: 10 Ways Backtests Lie by Tucker Balch

5. BUY $10M OF A $1M COMPANY
How to avoid? Ensure the backtester prohibits trading more dollar volume than actually was available on that day.

Add slippage/market impact models to penalize buying too much.
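
A volume cap can be sketched as a function that clips each order to a fraction of that day's traded dollar volume. The 5% participation limit and the dollar figures are hypothetical values picked for the example.

```python
def cap_order_dollars(
    desired_dollars: float,
    dollar_volume: float,
    max_participation_pct: float = 5.0,
) -> float:
    """Clip an order to a percentage of the day's traded dollar volume.

    The sign is preserved, so short orders are capped the same way as buys.
    """
    if dollar_volume < 0:
        raise ValueError("dollar_volume cannot be negative")
    limit = dollar_volume * max_participation_pct / 100.0
    if desired_dollars >= 0:
        return min(desired_dollars, limit)
    return max(desired_dollars, -limit)


# Made-up numbers: the strategy wants $10M of a stock that traded $1M that day.
filled = cap_order_dollars(desired_dollars=10_000_000, dollar_volume=1_000_000)
print(filled)
```

With the cap in place the backtest fills only a small part of the order, which exposes the capacity limit instead of hiding it.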

Page 49: 10 Ways Backtests Lie by Tucker Balch

6. DATA MINING FALLACY
Description: If you generate and test enough strategies you’ll eventually find one that “works” in a backtest. The quality of the strategy cannot be distinguished from random luck.

Page 50: 10 Ways Backtests Lie by Tucker Balch

6. DATA MINING FALLACY
Example: Look for a “skilled” coin flipper among 10,000 candidates.
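
The coin flipper example is easy to simulate. The sketch below gives every candidate a fair coin and reports the best score found. The 10,000 candidates come from the slide, while the 20 flips per candidate and the fixed seed are assumptions made here.

```python
import random


def best_flipper(n_candidates: int, n_flips: int, seed: int = 0) -> int:
    """Return the highest number of heads any candidate achieves by chance."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    best = 0
    for _ in range(n_candidates):
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        best = max(best, heads)
    return best


# No candidate has any skill: every coin is fair.
top = best_flipper(n_candidates=10_000, n_flips=20)
print(f"best candidate: {top} heads out of 20")
```

The winner's score is typically far above the expected 10 heads, even though no flipper is skilled. A strategy picked as the best of many backtests deserves the same suspicion.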

Page 51: 10 Ways Backtests Lie by Tucker Balch

6. DATA MINING FALLACY
How to avoid? You can’t! However, you can and should forward test before committing significant capital.

Page 52: 10 Ways Backtests Lie by Tucker Balch

7 THROUGH 10
7. Stateful strategy luck
8. Buy at the open
9. Trust complex models
10. Don’t forward test

Page 53: 10 Ways Backtests Lie by Tucker Balch

THANK YOU!
www.lucenaresearch.com