1
Sports Analytics…
… game on
Daniel Conway
Director of the Center for Business Analytics
Loras College
2
Who cares about sports?
3
Where does analytics fit in sports?
• Improve/predict player performance
• HR / Personnel
• Entertainment
2006: 300 attend MIT’s Sports Analytics Conference
2011 & beyond: capped at 2000 in person & thousands more online
4
2014 SABR – Society for American Baseball Research
5
Agenda
• Moneyball
• Baseball, Football, and Basketball
• HR
• Entertainment
6
America’s Pastime
Peanuts, Cracker Jack, and Numbers
7
The Pythagorean Theorem
Estimated % of games won:
$$\frac{(\text{Runs scored})^2}{(\text{Runs scored})^2 + (\text{Runs allowed})^2}$$
1.96 is the optimal exponent for baseball; 2.37 for football; 15.4 for basketball.
Quadrant chart (axes: Runs Scored, Runs Allowed):
Magic Quadrant: Boston, Detroit, St. Louis…
Tough Year: Houston, Minnesota, Philadelphia
Offensive: LA Angels, Toronto, Colorado
Duels: Atlanta, Pittsburgh, LA Dodgers
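The Pythagorean estimate on this slide can be sketched in a few lines of Python. The function name and the sample team (800 runs scored, 700 allowed) are hypothetical, chosen only to illustrate the formula; the exponent defaults to 2 and can be swapped for the sport-specific values quoted above.

```python
def pythagorean_win_pct(scored: float, allowed: float, exponent: float = 2.0) -> float:
    """Estimated fraction of games won from runs (or points) scored and allowed."""
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


# Hypothetical team, not taken from the slides: 800 runs scored, 700 allowed.
classic = pythagorean_win_pct(800, 700)                   # exponent 2
tuned = pythagorean_win_pct(800, 700, exponent=1.96)      # exponent quoted for baseball
print(f"exponent 2.00: {classic:.3f}")
print(f"exponent 1.96: {tuned:.3f}")
```

A team that scores exactly as much as it allows comes out at 0.5 for any exponent, which is a quick sanity check on the formula.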
8
Some Definitions
Bill James “Runs Created” formula:
RC = (hits + BB + HBP) x (TB) / (AB + BB + HBP)
Blue, (hits + BB + HBP): # baserunners
Green, (TB) / (AB + BB + HBP): rate at which runners are advanced
Familiar? (Newton: Mass * Velocity)
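A minimal sketch of the Runs Created formula, split into the same two pieces the slide highlights: a count of baserunners and a rate of advancement. The function name and the sample stat line are hypothetical and serve only to show the arithmetic.

```python
def runs_created(hits: int, walks: int, hbp: int, total_bases: int, at_bats: int) -> float:
    """Bill James' basic Runs Created: baserunners times the rate they are advanced."""
    baserunners = hits + walks + hbp                            # how many reach base
    advancement_rate = total_bases / (at_bats + walks + hbp)    # bases per plate appearance
    return baserunners * advancement_rate


# Hypothetical season line, not taken from the slides.
rc = runs_created(hits=150, walks=60, hbp=5, total_bases=250, at_bats=550)
print(f"Runs created: {rc:.1f}")
```

The product form is what invites the mass-times-velocity comparison: doubling either the number of runners or the rate of advancement doubles the estimate.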
9
Decision Making in Sports
Blitz or drop back
Swing, take pitch
Pass, drive, shoot
8 iron or 9 iron
Pass, run, punt
Foul or play
Pitch out
Attempt base steal
Call Timeout
Change Pitchers
10
Trade? Why would you trade Albert Pujols?
11
State Transitions of Baseball
State (outs, 1st 2nd 3rd)   Avg Runs   # Plate Appearances
0 000                       .54        46,180
1 000                       .29        32,921
2 000                       .11        26,009
0 001                       1.46       512
1 001                       .98        2,069
2 001                       .38        3,129
0 010                       1.17       3,590
1 010                       .71        6,168
2 010                       .34        7,709
0 011                       2.14       688
1 011                       1.47       1,770
2 011                       .63        1,902
0 100                       .93        11,644
1 100                       .55        13,483
2 100                       .25        13,588
0 101                       1.86       1,053
1 101                       1.24       2,283
2 101                       .54        3,117
0 110                       1.49       2,786
1 110                       .97        4,978
2 110                       .46        6,545
0 111                       2.27       805
1 111                       1.6        1,926
2 111                       .82        2,280
12
State Transitions of Baseball
No outs. Should I try to advance a runner from 1st to 3rd on a single? Let P be the probability of success.
Yes, if P * 1.86 + (1-P) * .55 >= 1.49, or P >= .72. Last year, only 3% (0.03) were thrown out…
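The break-even probability follows from solving the inequality for P. The sketch below uses the three run expectancies from the table (1.86 for runners on 1st and 3rd with no outs, .55 for a runner on 1st with one out, 1.49 for runners on 1st and 2nd with no outs); the function names are hypothetical.

```python
def break_even_probability(success: float, failure: float, hold: float) -> float:
    """Smallest P for which P*success + (1-P)*failure >= hold."""
    return (hold - failure) / (success - failure)


def expected_runs(p: float, success: float, failure: float) -> float:
    """Expected runs for the rest of the inning when the runner goes."""
    return p * success + (1 - p) * failure


p_star = break_even_probability(success=1.86, failure=0.55, hold=1.49)
print(f"Break-even P: {p_star:.2f}")

# With 3% thrown out, the success rate is 0.97.
gain = expected_runs(0.97, 1.86, 0.55) - 1.49
print(f"Expected gain from sending the runner at P = 0.97: {gain:.2f} runs")
```

A success rate far above the break-even point suggests that runners are being held more often than the run expectancies would justify.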
13
State Transitions of Baseball
They say to never make the 1st or 3rd out at 3rd base…
14
Fielding…
• A fatally flawed metric:
Fielding percentage = (PO + A) / (PO + A + E)
PO = Putouts
A = Assists
E = Errors
The Range Factor vs. Baseball Info Solutions
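A short sketch of why fielding percentage can mislead. It only counts balls a fielder reaches, so a fielder with more range can look worse. The slide does not define Range Factor; the per-game form used here, (PO + A) / games, is one common definition and is an assumption, as are both fielders' stat lines.

```python
def fielding_percentage(putouts: int, assists: int, errors: int) -> float:
    """(PO + A) / (PO + A + E): share of handled chances converted cleanly."""
    return (putouts + assists) / (putouts + assists + errors)


def range_factor_per_game(putouts: int, assists: int, games: int) -> float:
    """Assumed definition: plays made per game, (PO + A) / games."""
    return (putouts + assists) / games


# Two hypothetical shortstops over 150 games each (made-up numbers).
steady = {"putouts": 200, "assists": 400, "errors": 10}
rangy = {"putouts": 250, "assists": 500, "errors": 20}

for name, s in (("steady", steady), ("rangy", rangy)):
    fp = fielding_percentage(s["putouts"], s["assists"], s["errors"])
    rf = range_factor_per_game(s["putouts"], s["assists"], games=150)
    print(f"{name}: fielding % = {fp:.3f}, range factor = {rf:.1f}")
```

In this made-up comparison the second fielder has the lower fielding percentage yet makes one more play per game, which fielding percentage alone cannot show.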
15
Fielding in NY
Yankees fielders cost the team 139 hits over the course of a season, making them 11.2 wins worse than an average team.
Derek Jeter's defense costs the Yankees 3.8 games per season.
Ozzie Smith (Padres and Cardinals, 1978-1996) was worth over 3.5 wins on defense – highest in history.
16
Pitch Count
2003 AL Championship Series, Red Sox vs. Yankees: Red Sox lead 5-2 in the 8th (90% chance of winning)
Pedro Martinez on the mound.
100 pitches
OBP = 0.256 OBP = 0.364
Grady Little goes to the mound but leaves in Martinez.
Jeter doubles, Yankees tie it, win in 11.
Grady Little fired the following week.
[Chart: Pitcher Injuries]
17
How much is a 20 yard gain worth?
18
State & Expected Points 1st and 10
Yard Line   Cabot, Sagarin, Winston   FootballOutsiders.com
5           -1.33                     -1.2
15          -0.58                     -0.6
25          0.13                      0.1
35          0.84                      0.9
45          1.53                      1.2
55 (45)     2.24                      1.9
65 (35)     3.02                      2.2
75 (25)     3.88                      3.0
85 (15)     4.84                      3.8
95 (5)      5.84                      4.6
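One way to read the table is to value a gain as the difference in expected points between two 1st-and-10 states. The sketch below uses the Cabot, Sagarin, Winston column, keyed by yards from a team's own goal line (so 55 is the opponent's 45). The function name is hypothetical, and it assumes the gain produces a new 1st and 10 at a yard line listed in the table.

```python
# Expected points for 1st and 10, Cabot/Sagarin/Winston column from the table.
EXPECTED_POINTS = {
    5: -1.33, 15: -0.58, 25: 0.13, 35: 0.84, 45: 1.53,
    55: 2.24, 65: 3.02, 75: 3.88, 85: 4.84, 95: 5.84,
}


def gain_value(start_yard_line: int, yards_gained: int) -> float:
    """Change in expected points from a gain that ends in a new 1st and 10."""
    end_yard_line = start_yard_line + yards_gained
    if start_yard_line not in EXPECTED_POINTS or end_yard_line not in EXPECTED_POINTS:
        raise ValueError("only yard lines listed in the table are supported")
    return EXPECTED_POINTS[end_yard_line] - EXPECTED_POINTS[start_yard_line]


for start in (5, 25, 45, 75):
    print(f"20-yard gain from the {start}: {gain_value(start, 20):+.2f} expected points")
```

The answer depends on where the gain starts: by these figures the same 20 yards is worth more near the opponent's goal line than at midfield.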
19
How has football changed between 2003-2007 and 2013?
Category   2007     2013
PY/A       61.67    91.7
DPY/A      -67.5    -42.2
RY/A       26.44    -37.4
DRY/A      -67.5    -31.5
Pen        -0.06    -1.05
[Chart: Rushing Yards / Attempt (RY/A) in 2013 for the 32 teams ranked by point differential]
20
Who is the “best” quarterback? Simple – use the quarterback rating:
1. First, take the quarterback’s completion percentage (as a fraction), subtract 0.3, and divide by 0.2.
2. Then take yards per attempt, subtract 3, and divide by 4.
3. After that, divide touchdowns per attempt by 0.05.
4. For interceptions per attempt, start with 0.095, subtract interceptions per attempt, and divide the result by 0.04.
To get the quarterback rating, add the values from the four steps, multiply this sum by 100, and divide the result by 6.
The value from each of the four steps cannot exceed 2.375 or be less than zero.
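The steps above can be sketched as a small function. It clamps each of the four components to the range 0 to 2.375, the cap used in the NFL formula, which is what yields the familiar maximum rating of 158.3. The function names and the sample stat lines are hypothetical.

```python
def clamp(value: float, low: float = 0.0, high: float = 2.375) -> float:
    """Keep a component within the allowed range."""
    return max(low, min(high, value))


def passer_rating(completions: int, attempts: int, yards: int,
                  touchdowns: int, interceptions: int) -> float:
    """NFL-style quarterback rating built from four clamped components."""
    a = clamp((completions / attempts - 0.3) / 0.2)        # step 1: completion rate
    b = clamp((yards / attempts - 3) / 4)                  # step 2: yards per attempt
    c = clamp((touchdowns / attempts) / 0.05)              # step 3: touchdown rate
    d = clamp((0.095 - interceptions / attempts) / 0.04)   # step 4: interception rate
    return (a + b + c + d) * 100 / 6


# Hypothetical game line: 20 of 30 for 250 yards, 2 touchdowns, 1 interception.
print(f"Rating: {passer_rating(20, 30, 250, 2, 1):.1f}")
# A line that maxes out every component.
print(f"Maximum: {passer_rating(30, 30, 500, 5, 0):.1f}")
```

Seeing the arithmetic spelled out makes the slide's point: the rating is anything but simple, and the weights are fixed by convention rather than fitted to wins.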
21
22
Basketball – a collaborative pastime
23
Da Four Factors
• EFG – OEFG (effective field goals)
• DTPP – TPP (turnovers)
• ORP – DRP (rebounds)
• FTR – OFTR (free throws)
Then games won =
41.06
+ 351.88 (EFG-OEFG) (explains 71%) .01↑ => 3.5 more wins
+ 333.06 (DTPP-TPP) (explains 16%) .01↑ => 3.3 more wins
+ 130.61 (ORP-DRP) (explains 6%) .01↑ => 1.3 more wins
+ 44.43 (FTR-OFTR) (explains 0%) .01↑ => 0.44 more wins
(R² = 0.91)
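The regression can be evaluated directly. The sketch below takes each of the four differentials as an input and applies the intercept and coefficients from the slide. One assumption to flag: every differential is oriented so that a positive value favors the team, which matters most for turnovers. The function and argument names are hypothetical.

```python
INTERCEPT = 41.06
COEFFICIENTS = {
    "shooting": 351.88,      # effective field goal differential
    "turnovers": 333.06,     # turnover differential (positive = team's advantage, assumed)
    "rebounding": 130.61,    # rebounding differential
    "free_throws": 44.43,    # free throw rate differential
}


def predicted_wins(shooting: float = 0.0, turnovers: float = 0.0,
                   rebounding: float = 0.0, free_throws: float = 0.0) -> float:
    """Predicted games won in a season from the four factor differentials."""
    return (INTERCEPT
            + COEFFICIENTS["shooting"] * shooting
            + COEFFICIENTS["turnovers"] * turnovers
            + COEFFICIENTS["rebounding"] * rebounding
            + COEFFICIENTS["free_throws"] * free_throws)


baseline = predicted_wins()
for factor in COEFFICIENTS:
    bump = predicted_wins(**{factor: 0.01}) - baseline
    print(f"+0.01 in {factor}: {bump:.2f} more wins")
```

A team that matches its opponents on all four factors is predicted to win about 41 games, half of an 82-game season, which is a useful check on the intercept.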
24
A Simulation of Basketball…
What if the 1992 Dream Team could play the 2012 Dream Team?
25
NBA Shooting
26
One second of NBA basketball
27
One Game’s Ball movement
28
Sir Tim Duncan
29
Expected Value of a Possession
(this slide intentionally blank)
30
CIO Magazine: 8 ways Big Data and Analytics will change sports
1. PITCHf/x – balls & strikes
2. Slice & Dice data for fans
3. Data from wearable technologies
4. Field collection systems
5. Predictive insights into Fan Preferences
   1. In-seat concession ordering
   2. Restroom congestion
6. Hiring more numbers whizzes
7. Influence Coaching Decisions
8. Build Arguments for Contract Negotiations
31
MIT Sports Analytics Conference
• Tanking & the draft: incentives
• Hot hand theory – players believe it & thus…
• Still difficult to sell it to coaches
32
Why do we love the NCAA Basketball Tourney?
• Engaged via Brackets
33
Why do we love NFL Football? - Derivatives
• How big is the market? In terms of actual expenditures, one estimate is that 32 million Americans spend $467 per person, or about $15 billion in total, playing fantasy football.
• These figures don’t count ad revenue for fantasy hosting sites. The NFL’s annual revenue falls just under $10 billion currently.
• So the “derivative” market has grown larger than the foundational market.
34
Derivatives & the future of Sports Analytics
Forces driving the derivatives (future of SA)
• Virtual Reality and Location Independence
• Big Data & Big Money
• Northwestern ruling and who owns what
• Watson
35
Predictive Analytics
36
37
• Addin – bullfighter