Analyzing Major League Baseball Using XMT Architecture April 22, 2014 Vince Gennaro Society for American Baseball Research

Dec 15, 2015

Transcript
Page 1

Analyzing Major League Baseball Using XMT Architecture

April 22, 2014

Vince Gennaro

Society for American Baseball Research

Page 2

Agenda

• The Changing World of Baseball Information and Data

• Big Data Application

– Using XMT architecture to predict the outcome of the batter-pitcher matchup

Page 3

A New Era of Baseball Analytics

• Proliferation of baseball data

• Revolutionary processing technology

• Massive, inexpensive storage capability

Page 4

Our World Has Changed

Box Score

Play-by-Play

Pitchf/x

Source: MLB.com and Baseball-Reference

Page 5

Our World Has Changed

Page 6

Growth in Baseball Data

[Chart: data volume in MB per season (axis from 0 to 2,000) by year (1900 to 2000), with 1988 and 2008 (Pitchf/x) marked]

Source: Sportvision

Page 7

Moneyball—a Breakthrough in 2003

Page 8

The Demand Side

• The stakes have grown dramatically

• $50 to $100 million decisions are commonplace

• Winning (Efficiently) Drives Profitability

• Better player personnel decisions promote winning

Page 9

Big Data Era of Baseball Analytics

Page 10

How Should a Batter-Pitcher Perform?

Page 11

How Should a Batter-Pitcher Perform?

• Starting Lineups

• Batting Order

• Pinch Hitters

• Relief Pitchers

Page 12

The Problem We’re Solving

• The Prevailing Approach—One-Pitcher vs. One-Batter Career Data

– Small sample sizes

– Timeframe is too long (full career)

– No Experience = No Help

– Data includes only outcomes
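The small-sample problem can be made concrete with a confidence interval. The sketch below is an illustration rather than anything taken from the talk: it uses the standard Wilson score interval, and the function name `wilson_interval` and the 150-for-500 comparison sample are hypothetical.

```python
import math

def wilson_interval(hits, at_bats, z=1.96):
    """Approximate 95% Wilson score interval for a success rate."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    p = hits / at_bats
    denom = 1 + z * z / at_bats
    center = (p + z * z / (2 * at_bats)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / at_bats + z * z / (4 * at_bats ** 2)
    ) / denom
    return center - half_width, center + half_width

# A typical one-batter vs. one-pitcher career line: very few at-bats.
low, high = wilson_interval(3, 6)
print(f"3-for-6: plausible true rate from {low:.3f} to {high:.3f}")

# Hypothetical season-sized sample for comparison.
low, high = wilson_interval(150, 500)
print(f"150-for-500: plausible true rate from {low:.3f} to {high:.3f}")
```

With only six at-bats the interval spans most of the realistic range of batting outcomes, so a one-on-one career line of that size says very little about the next plate appearance.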

Page 13

Framework: Batter vs. Pitcher (5 Factors)

• Pitching Style

• Pitcher Quality

• Hitting Style

• Hitter Quality

• Ballpark

Page 14

New Data + New Technology

• New Data

– Pitch f/x

– Hit f/x

+

• New Technology

– Graph Analytics

– .

Evaluating Batter/Pitcher Match Ups

Page 15

Framework: Batter vs. Pitcher (5 Factors)

• Pitching Style

• Pitcher Quality

• Hitting Style

• Hitter Quality

• Ballpark

Pages 16-18

Ballpark (© Greg Rybarczyk)

Pages 19-20

Ballpark

• 61% = Single

• 25% = Double

• 14% = Out

1.11 Total Bases
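The 1.11 figure is the probability-weighted average of the bases each outcome is worth. A minimal sketch of that arithmetic follows; the function name `expected_total_bases` and the dictionary layout are illustrative choices, not part of the talk.

```python
# Bases credited for each outcome under standard scoring.
BASES = {"out": 0, "single": 1, "double": 2, "triple": 3, "home_run": 4}

def expected_total_bases(outcome_probs):
    """Probability-weighted average of bases gained over all outcomes."""
    total_prob = sum(outcome_probs.values())
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(BASES[outcome] * p for outcome, p in outcome_probs.items())

# Outcome mix shown on the slide for one batted ball in one ballpark.
slide_example = {"single": 0.61, "double": 0.25, "out": 0.14}

# 0.61 * 1 + 0.25 * 2 + 0.14 * 0 = 1.11
print(round(expected_total_bases(slide_example), 2))  # 1.11
```

The same batted ball would get a different outcome mix, and so a different expected value, in a ballpark with different dimensions.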

Page 21

Expected Total Bases on Batted Balls

[Chart, Turner Field (Atlanta): x-axis is batted ball velocity (initial speed off bat), y-axis is vertical launch angle; legend: Out, Single, Double, Triple, Home Run]

Pages 22-24

Ballpark (© Greg Rybarczyk)

Page 25

Expected Total Bases on Batted Balls

[Chart, Turner Field (Atlanta): x-axis is batted ball velocity (initial speed off bat), y-axis is vertical launch angle; legend: Out, Single, Double, Triple, Home Run]

Page 26

Expected Total Bases on Batted Balls

[Chart, Yankee Stadium (New York): x-axis is batted ball velocity (initial speed off bat), y-axis is vertical launch angle; legend: Out, Single, Double, Triple, Home Run]

Page 27

Framework: Batter vs. Pitcher (5 Factors)

• Pitching Style

• Pitcher Quality

• Hitting Style

• Hitter Quality

• Ballpark

Page 28

Clustering Pitchers

Objective:

• Identify pitcher similarities to form clusters of “like” pitchers

• Predict hitter performance by pitcher cluster rather than by individual batter/pitcher matchups

Page 29

Clustering Pitchers: Hitters’ Questions and Model Data

What does he throw?

• Top 2 Pitches

• Pitch Repertoire/Variety

• Horizontal Pitch Location

• Vertical Pitch Location

How hard does he throw?

• Fastball Velocity

What kind of movement?

• Horizontal Movement

• Vertical Movement

Where do his pitches come from?

• Release Point

How does he like to pitch?

• Swinging Strike %

• Zone % and Edge %

• Top 2-pitch Sequence
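One way to turn features like these into clusters of "like" pitchers is to link any two pitchers whose standardized features are close, then read off the connected groups of the resulting similarity graph. The sketch below assumes that approach; the talk does not specify its algorithm. The pitcher labels, the three features used, their values, and the distance threshold are all hypothetical.

```python
import math

# Hypothetical pitchers and feature values, for illustration only.
# Features: (fastball velocity in mph, horizontal movement in inches, zone %).
pitchers = {
    "A": (95.5, -4.0, 48.0),
    "B": (96.0, -3.5, 50.0),
    "C": (88.0, -8.0, 44.0),
    "D": (87.5, -8.5, 45.0),
    "E": (95.0, -4.5, 49.0),
}

def standardize(rows):
    """Convert each feature column to z-scores so no single unit dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(r, means, stds)) for r in rows]

def cluster_by_similarity(features, threshold):
    """Link pitchers closer than `threshold`, then return connected components."""
    names = list(features)
    z = dict(zip(names, standardize([features[n] for n in names])))
    neighbors = {n: set() for n in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(z[a], z[b]) < threshold:
                neighbors[a].add(b)
                neighbors[b].add(a)
    clusters, seen = [], set()
    for start in names:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over the similarity graph
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

print(cluster_by_similarity(pitchers, threshold=1.0))
```

With these made-up values the hard throwers (A, B, E) land in one group and the softer throwers (C, D) in another. A real model would use the full feature list above and a threshold tuned against outcomes.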

Pages 30-31

RH Pitcher vs. LH Batter Clusters

Pages 32-33

Yankees RF vs. Colorado Rockies?

Facing Right-Handed Pitcher Juan Nicasio

• Ichiro Suzuki

• Brennan Boesch

Both are 0-0 vs. Nicasio

Page 34

Yankees Hitters vs. Rockies Pitchers

Pitchers: Jorge De La Rosa, Juan Nicasio, Jeff Francis, Tyler Chatwood

Ichiro Suzuki: 3-6, 4-6, 1-3

Brennan Boesch: 1-9, 2-3

Page 35

RHP vs. LHB Clusters

Page 36

RHP vs. LHB Cluster “4”

• High Velocity FB

• Low Pitch Variety

• Upper Half of Zone

Pages 37-38

RHP vs. LHB Cluster “4”

Ichiro Suzuki: 30th percentile

0-6, 5-26, 2-5, 2-11, 1-3, 2-3, 0-6

Pages 39-40

RHP vs. LHB Cluster “4”

Brennan Boesch: 60th percentile

6-11, 1-6, 6-23, 0-11, 3-13, 2-3, 2-7
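Pooling a hitter's results against every pitcher in a cluster is what gives the cluster approach its larger sample. The sketch below adds up the Cluster "4" records for both hitters. It assumes each pair on the slides is hits and at-bats against one pitcher in the cluster, and that fused entries such as "0 - 65 - 26" are two records (0-6 and 5-26) printed side by side. The pooled averages are not the percentile figures on the slides, which come from the author's model.

```python
# (hits, at-bats) pairs transcribed from the Cluster "4" slides.
# Reading each pair as hits and at-bats is an assumption.
ichiro = [(0, 6), (5, 26), (2, 5), (2, 11), (1, 3), (2, 3), (0, 6)]
boesch = [(6, 11), (1, 6), (6, 23), (0, 11), (3, 13), (2, 3), (2, 7)]

def pooled_record(records):
    """Sum hits and at-bats across pitchers and return the combined average."""
    hits = sum(h for h, _ in records)
    at_bats = sum(ab for _, ab in records)
    return hits, at_bats, hits / at_bats

for name, records in (("Ichiro Suzuki", ichiro), ("Brennan Boesch", boesch)):
    hits, at_bats, avg = pooled_record(records)
    print(f"{name}: {hits}-{at_bats} ({avg:.3f})")
```

Under those assumptions the pooled lines are 12-60 for Suzuki and 20-74 for Boesch, against 0-0 for each hitter when only at-bats versus Nicasio are counted.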

Page 41

Yankees Hitters vs. Rockies Pitchers

Pitcher             Ichiro Suzuki   Brennan Boesch
Jorge De La Rosa    33              53
Juan Nicasio        30              60
Jeff Francis        78              73
Tyler Chatwood      70              72
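Scores like these can drive the lineup choice directly: for a given opposing pitcher, start the hitter with the higher score. The sketch below assumes the four numbers in each row line up with the four pitchers in the order listed, and that they are percentile scores as the "30th %" and "60th %" labels on the earlier slides suggest. The function name `preferred_batter` is hypothetical.

```python
# Scores from the slide, keyed by batter and then by opposing pitcher.
# Column order (De La Rosa, Nicasio, Francis, Chatwood) is assumed.
scores = {
    "Ichiro Suzuki": {
        "Jorge De La Rosa": 33, "Juan Nicasio": 30,
        "Jeff Francis": 78, "Tyler Chatwood": 70,
    },
    "Brennan Boesch": {
        "Jorge De La Rosa": 53, "Juan Nicasio": 60,
        "Jeff Francis": 73, "Tyler Chatwood": 72,
    },
}

def preferred_batter(scores, pitcher):
    """Return the batter with the highest score against `pitcher`."""
    return max(scores, key=lambda batter: scores[batter][pitcher])

for pitcher in scores["Ichiro Suzuki"]:
    print(f"vs. {pitcher}: start {preferred_batter(scores, pitcher)}")
```

Against Nicasio this picks Boesch (60 vs. 30), a choice the one-on-one record of 0-0 for both hitters could not support either way.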

Page 42

Framework: Batter vs. Pitcher (5 Factors)

• Pitching Style

• Pitcher Quality

• Hitting Style

• Hitter Quality

• Ballpark

Page 43

Hitting Style

Page 44

Batter-Pitcher Matchup Data Issues

Issue                 Old Process                 New Process
Too Literal           One-on-one                  Multiple “like” pitchers
Sample Sizes          Often too small             More adequate
No prior experience   No data                     Data vs. other pitchers in cluster
Timeframe             Could span 15+ yrs          Limited to more recent PAs
Performance metric    Outcomes (hit, out, etc.)   Includes batted ball diagnostics

Pages 45-46

The ROI of Favorable Match Ups

Use of Information / Decisions Impacted     Runs Created or Saved
Optimizing Starting Lineup                  19 Runs
Most Favorable Pinch-Hitting Match Ups      9 Runs
Most Favorable Relief Pitcher Match Ups     5 Runs
Total                                       33 Runs

33 Runs = 3 wins

$ value of a win: $5 million*

Potential Value: $15 million in Revenue

* For a “contending” team
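The ROI estimate is a short chain of arithmetic: add the runs from the three decision types, convert runs to wins, and multiply by the dollar value of a win. The sketch below reproduces it with the slide's own figures. The variable names are illustrative, and the runs-per-win rate is only what the slide's "33 Runs = 3 wins" implies.

```python
# Figures taken from the slide.
runs_by_decision = {
    "Optimizing Starting Lineup": 19,
    "Most Favorable Pinch-Hitting Match Ups": 9,
    "Most Favorable Relief Pitcher Match Ups": 5,
}
WINS_FROM_RUNS = 3          # the slide equates 33 runs with 3 wins
VALUE_PER_WIN = 5_000_000   # dollars, for a "contending" team

total_runs = sum(runs_by_decision.values())
implied_runs_per_win = total_runs / WINS_FROM_RUNS
potential_value = WINS_FROM_RUNS * VALUE_PER_WIN

print(total_runs)            # 33
print(implied_runs_per_win)  # 11.0
print(potential_value)       # 15000000
```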

Page 47

Framework: Batter vs. Pitcher (5 Factors)

• Pitching Style

• Pitcher Quality

• Hitting Style

• Hitter Quality

• Ballpark

Page 48

Framework: Batter vs. Pitcher

• Refine a predictive model of batter/pitcher outcomes: the optimal combination of the 5 factors

• Validate the model against actual outcomes

• Compare predictive accuracy to historical “one-to-one” expectations

• Continue to fine-tune the model, incorporating new data daily
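A simple way to picture "optimal combination of 5 factors" is a weighted average of five factor scores, with the weights tuned until predictions track actual outcomes. The sketch below assumes that form, which the talk does not confirm. The weights, the factor scores, the 0-100 scale, and the choice of mean absolute error as the validation measure are all hypothetical.

```python
FACTORS = ("pitching_style", "pitcher_quality", "hitting_style",
           "hitter_quality", "ballpark")

def matchup_score(factor_scores, weights):
    """Weighted average of the five factor scores (weights need not sum to 1)."""
    total_weight = sum(weights[f] for f in FACTORS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[f] * factor_scores[f] for f in FACTORS) / total_weight

def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and observed outcomes."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical inputs: equal weights and made-up factor scores on a 0-100 scale.
equal_weights = {f: 20 for f in FACTORS}
example_scores = {"pitching_style": 60, "pitcher_quality": 40,
                  "hitting_style": 50, "hitter_quality": 70, "ballpark": 30}

print(matchup_score(example_scores, equal_weights))  # 50.0
print(mean_absolute_error([50, 60], [40, 70]))       # 10.0
```

Comparing the error of this kind of model with the error of the one-to-one career record on the same plate appearances is the validation step the bullets describe.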

Pages 49-50

Fine-Tuning Model Input Weights

[Bar chart: input weights for Pitching Style, Pitcher Quality, Hitter Quality, Hitting Style, and Ballpark; value axis from 0 to 35]

Page 51

END