March Data Crunch Madness. Team Coach K: Yi Chun Chien, Xiayu Zeng, Feifei Chen, Xiaoshan Jin. March 2015

March madness sports analysis

Jan 17, 2017

Transcript
Page 1: March madness sports analysis

March Data Crunch Madness

Team Coach K: Yi Chun Chien, Xiayu Zeng, Feifei Chen, Xiaoshan Jin

March 2015

Page 2: March madness sports analysis

Introduction

❖ Background: The NCAA Men’s Basketball Tournament is a single-elimination tournament, currently featuring 68 college teams.

❖ Objective: Develop a predictive model that examines the factors contributing to a team’s performance, based on data from 2001-2014.

❖ Result: The model indicates that box score statistics have a large effect on a team’s result in 2015, which helps predict:
➢ Win/Lose
➢ Winning Probability
➢ Sweet Sixteen

Page 3: March madness sports analysis

Independent & Dependent Variables

Independent Variables:
➢ Box Score: Assist, Steal, Block Shot, % 2/3 Point Field Goals, % Free Throws, Tempo
➢ Seed: Seed #, whether the team is a Top 5 seed, whether the team is a 15/16 seed
➢ Location: Latitude, Longitude, Distance Difference

Dependent Variable: Win/Lose
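
The slides do not say how "Distance Difference" was derived from latitude and longitude. One plausible reading, sketched here purely as an assumption, is the difference between the two teams' great-circle distances to the game venue. The function names are illustrative and do not come from the slides.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_difference_km(team, opponent, venue):
    """Assumed feature: team's distance to the venue minus the opponent's.

    Each argument is a (latitude, longitude) tuple in degrees.
    A negative value means the team is closer to the venue than its opponent.
    """
    return haversine_km(*team, *venue) - haversine_km(*opponent, *venue)
```

With made-up coordinates, `distance_difference_km((36.0, -79.0), (47.0, -117.0), (40.0, -86.0))` is negative, because the first team is much closer to the hypothetical venue.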

Page 4: March madness sports analysis

Analysis Process

❖ Define Data Group: Training (80%), Validation (20%), Testing (2015)

❖ Build 5 Models: Decision Tree, Bootstrap Forest, Boosted Tree, Neural Network, Nominal Logistic

❖ Performance Validation: Accuracy, ROC Curve, AUC, RMSE

❖ Prediction: Probability, Win/Lose, Top 16
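
The split and validation steps of this process can be sketched in plain Python. This is only an illustration: it does not fit any of the five models, and the random 80/20 split, the seed, the 0.5 threshold, and all function names are assumptions, since the slides do not say how the split was drawn or how each metric was computed.

```python
import math
import random


def split_train_validation(rows, train_fraction=0.8, seed=0):
    """Shuffle the 2001-2014 rows and split them into training and validation sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, p_win, threshold=0.5):
    """Share of games where the thresholded win probability matches the outcome (1 = win)."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_win))
    return hits / len(y_true)


def rmse(y_true, p_win):
    """Root mean squared error between predicted win probability and the 0/1 outcome."""
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(y_true, p_win)) / len(y_true))


def auc(y_true, p_win):
    """Area under the ROC curve, computed as the chance that a random win
    is scored above a random loss (ties count as one half)."""
    wins = [p for y, p in zip(y_true, p_win) if y == 1]
    losses = [p for y, p in zip(y_true, p_win) if y == 0]
    pairs = sum(1.0 if w > l else 0.5 if w == l else 0.0 for w in wins for l in losses)
    return pairs / (len(wins) * len(losses))
```

Each fitted model would be scored with these functions on the validation set, and the chosen model would then be applied to the held-out 2015 games.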

Page 5: March madness sports analysis

Distribution and Correlation

● Distribution Review: Most variables are normally distributed
● Scatter Matrix: Few variables have a linear correlation

Page 6: March madness sports analysis

5 Models Performance

[Figure: Performance Validation of the five models on the Training and Validation sets; ROC Curve for Validation]

● Nominal Logistic Regression Accuracy: 72%
● Nominal Logistic Regression has the best performance

Page 7: March madness sports analysis

Prediction

● 2015 Forecast Top 16 teams
● 2015 Forecast Result: 73% accuracy

Result   Lose   Win
Lose     6      6
Win      5      24
Total    11     30

Forecast Result Correct: 73%
Forecast Result Incorrect: 27%
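
The 73% figure can be reproduced from the result table, assuming the Lose/Lose and Win/Win cells are the correct forecasts. The slide does not label which axis is the forecast and which is the actual outcome, but the accuracy is the same either way.

```python
# Cell counts from the result table; keys are (row label, column label).
# Assumption: cells where the two labels agree are correct forecasts.
table = {
    ("Lose", "Lose"): 6, ("Lose", "Win"): 6,
    ("Win", "Lose"): 5, ("Win", "Win"): 24,
}


def accuracy_from_table(cells):
    """Correct forecasts (diagonal cells) divided by all forecasts."""
    total = sum(cells.values())
    correct = sum(n for (row, col), n in cells.items() if row == col)
    return correct / total


acc = accuracy_from_table(table)
print(f"{acc:.1%} correct, {1 - acc:.1%} incorrect")  # 73.2% correct, 26.8% incorrect
```

That is 30 correct forecasts out of 41, which rounds to the 73% and 27% shown on the slide.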

Page 8: March madness sports analysis

Model Explanation

Defensive efficiency, offensive efficiency, opponent’s blocked shots and assists are the most important attributes, based on their individual p-values.

According to our analysis, good offensive efficiency contributes more to a team’s success than defensive efficiency.

The closer a team plays to the stadium, the better it performs.

Page 9: March madness sports analysis

Interesting Analysis

● The average score difference is narrowing
● The score pattern for Top 5 seeds is less volatile than the one for the bottom 2 seeds
● 9 out of 16 are predicted correctly
● Only Georgetown shows a declining pattern of winning probability

Page 10: March madness sports analysis

Result and Conclusion

❖ A team’s chance of winning is positively related to four primary factors:
➢ offensive efficiency
➢ defensive efficiency
➢ blocked shots
➢ assists

❖ Accuracy: Our model is 72.19% accurate in predicting a team’s result for 2015.