Data Science and Predictive SPC

Jan 13, 2017

Data & Analytics

Alex Gilgur
Transcript
Page 1: Data Science and Predictive SPC
Page 2: Data Science and Predictive SPC

What does a data scientist do?

Alex Gilgur. Data Science & Predictive SPC

Page 3: Data Science and Predictive SPC

The Maslow Pyramid

Page 4: Data Science and Predictive SPC

The Maslow Pyramid of Data Science

IT Infrastructure

Software Engineering

Quantitative Analytics

Domain

Data

Page 5: Data Science and Predictive SPC

Data Science = Nuclear Energy

Blow up in our face

Page 6: Data Science and Predictive SPC

Data Science = Nuclear Energy

Blow up in our face

…or…

Page 7: Data Science and Predictive SPC

Data Science = Nuclear Energy

Blow up in our face

…or…

Give us Power

Page 8: Data Science and Predictive SPC

What’s the Team We’re Rooting For?

DATA

INFORMATION

Page 9: Data Science and Predictive SPC

What’s the Team We’re Rooting For?

DATA

INFORMATION

INFORMATION

INFORMATION

Page 10: Data Science and Predictive SPC

What’s the Team We’re Rooting For?

INFORMATION

INFORMATION

INFORMATION

Page 11: Data Science and Predictive SPC
Page 12: Data Science and Predictive SPC

Page 13: Data Science and Predictive SPC
Page 14: Data Science and Predictive SPC

Page 15: Data Science and Predictive SPC

Servers = argmax (Revenue |Budget)

Revenue = f[Throughput (Servers, SW, Budget)]

Servers = argmin (Budget | Revenue)

• Throughput = t (UX)

• Revenue = r (Throughput)

• Budget = f(SW, Servers)

Constraints:
• Domain
• Budget ≤ B
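
As a rough illustration of the constrained choice above, the sketch below picks the server count that maximizes revenue subject to Budget ≤ B by brute-force search. The functions `throughput`, `revenue`, and `budget`, their shapes, and every numeric constant are hypothetical stand-ins and are not taken from the slides.

```python
def throughput(servers: int) -> float:
    """Hypothetical throughput curve with diminishing returns (requests/s)."""
    return 1000.0 * servers / (1.0 + 0.05 * (servers - 1))


def revenue(servers: int) -> float:
    """Hypothetical revenue model: proportional to throughput."""
    return 0.02 * throughput(servers)


def budget(servers: int, sw_cost: float = 50.0) -> float:
    """Hypothetical cost model: fixed software cost plus a per-server cost."""
    return sw_cost + 10.0 * servers


def best_server_count(budget_cap: float, max_servers: int = 1000) -> int:
    """argmax of revenue over server counts whose cost stays within budget_cap."""
    feasible = [n for n in range(1, max_servers + 1) if budget(n) <= budget_cap]
    if not feasible:
        raise ValueError("no server count fits within the budget")
    return max(feasible, key=revenue)


if __name__ == "__main__":
    print(best_server_count(budget_cap=250.0))
```

The mirror-image problem, minimizing budget for a required revenue, could be sketched the same way by filtering on `revenue(n) >= target` and taking the minimum of `budget`.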

Page 16: Data Science and Predictive SPC

From X to Y to X

Page 17: Data Science and Predictive SPC

Closing the Loop

Page 18: Data Science and Predictive SPC
Page 19: Data Science and Predictive SPC
Page 20: Data Science and Predictive SPC

Page 21: Data Science and Predictive SPC
Page 22: Data Science and Predictive SPC

The distribution of the arithmetic mean of random samples taken from any distribution with finite variance converges to a normal distribution as the sample size tends to infinity.

CENTRAL LIMIT THEOREM
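
The theorem can be checked with a small simulation. The sketch below draws samples from an exponential distribution (chosen here only because it is strongly skewed; the slides do not name a source distribution) and shows the sample means becoming more symmetric and more tightly clustered as the sample size grows. The function names and the seed are illustrative.

```python
import random
import statistics


def sample_means(sample_size: int, n_samples: int, seed: int = 0) -> list[float]:
    """Means of n_samples independent samples, each of sample_size draws
    from an exponential distribution with mean 1."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def skewness(values: list[float]) -> float:
    """Sample skewness; close to 0 for a symmetric distribution such as the normal."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mu) / sd) ** 3 for v in values)


if __name__ == "__main__":
    for size in (1, 10, 100):
        means = sample_means(size, n_samples=5000)
        print(size,
              round(statistics.fmean(means), 3),
              round(statistics.stdev(means), 3),
              round(skewness(means), 3))
```

For this source distribution, the standard deviation of the means should shrink roughly as 1/sqrt(sample size) and the skewness should drift toward zero.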

Page 23: Data Science and Predictive SPC

Page 24: Data Science and Predictive SPC

Page 25: Data Science and Predictive SPC

Page 26: Data Science and Predictive SPC

Page 27: Data Science and Predictive SPC

Page 28: Data Science and Predictive SPC
Page 29: Data Science and Predictive SPC

Page 30: Data Science and Predictive SPC

Page 31: Data Science and Predictive SPC

http://www.isixsigma.com/

Page 32: Data Science and Predictive SPC

Page 33: Data Science and Predictive SPC

Page 34: Data Science and Predictive SPC

Page 35: Data Science and Predictive SPC

[Chart: Key Performance Metric (KPM), with LSL and USL limits and a 72 hrs interval marked]

How did HAL Know?
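
One plausible reading of this slide is that a trend fitted to recent KPM readings can be extrapolated to estimate when the metric will cross a specification limit. The sketch below does that with an ordinary least-squares line. The function name, the linear-trend assumption, and the sample readings are all hypothetical and are not taken from the slides.

```python
def hours_until_limit(times, values, usl, lsl):
    """Estimate the hours from the last observation until a fitted linear
    trend crosses the upper (usl) or lower (lsl) specification limit.

    Returns None when the fitted trend is flat, and 0.0 when the trend
    line has already crossed the limit.
    """
    n = len(times)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("observation times must not all be equal")
    # Ordinary least-squares slope and intercept of value against time.
    slope = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values)) / sxx
    intercept = v_mean - slope * t_mean
    if slope == 0:
        return None
    # A rising trend threatens the upper limit, a falling trend the lower one.
    limit = usl if slope > 0 else lsl
    t_cross = (limit - intercept) / slope
    return max(0.0, t_cross - times[-1])


if __name__ == "__main__":
    hours = [0, 24, 48, 72]            # hypothetical observation times, in hours
    kpm = [10.0, 12.0, 14.0, 16.0]     # hypothetical KPM readings
    print(hours_until_limit(hours, kpm, usl=22.0, lsl=0.0))
```

A straight line is the simplest possible forecast; any of the forecasting methods discussed later in the deck could take its place.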

Page 36: Data Science and Predictive SPC
Page 37: Data Science and Predictive SPC

Page 38: Data Science and Predictive SPC

A Few Words About Forecasting

Methods:
● EWMA
● ARIMA
● Regression

EWMA models are very specific and computationally fast, but they have to be told trend (linear or exponential) and seasonality (additive or multiplicative).
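
To make "told the trend" concrete, here is a minimal sketch of exponential smoothing with an explicitly linear (additive) trend, commonly known as Holt's linear method. It is one member of the EWMA family and not necessarily the variant the author used; the smoothing constants `alpha` and `beta` are arbitrary illustrative defaults, and seasonality is left out.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Forecast `horizon` steps ahead with Holt's linear-trend smoothing.

    alpha smooths the level, beta smooths the trend; both lie in (0, 1].
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Initialize the level from the first point and the trend from the first difference.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # The forecast extends the last level along the last trend.
    return [level + (h + 1) * trend for h in range(horizon)]


if __name__ == "__main__":
    print(holt_forecast([1.0, 2.0, 3.0, 4.0, 5.0], horizon=3))
```

An exponential trend or a seasonal component would need a different update rule, which is the sense in which the model has to be told what to expect.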

An ARIMA model implicitly accounts for trends, seasonality, and stationarity of the data. The autocorrelation of the ARIMA residuals reveals any periodicities that the model has missed.

For stationary data, use ARIMA.
For non-stationary data, use EWMA.
EWMA and ARIMA overlap.

When to use Regression:
● Data are monotonic.
● Seasonality is NOT statistically significant.
● EWMA and ARIMA fail.

When to use Quantile Regression:
● Upper and lower bounds behave differently.
● Outliers are possible.
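
The two quantile-regression criteria above can be seen in a small sketch that fits a straight line by minimizing the quantile (pinball) loss. It uses brute force over pairs of points, which is only practical for small data sets; production work would normally use a linear-programming solver or a statistics library. The function names and the sample data are illustrative.

```python
from itertools import combinations


def pinball_loss(residual: float, q: float) -> float:
    """Quantile loss: a positive residual (under-prediction) costs q per unit,
    a negative residual (over-prediction) costs 1 - q per unit."""
    return q * residual if residual >= 0 else (q - 1) * residual


def fit_quantile_line(xs, ys, q):
    """Return (intercept, slope) minimizing the total pinball loss at quantile q.

    An optimal line can always be found among the lines through two data
    points, so every pair with distinct x values is tried. O(n^3) overall.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(pinball_loss(y - (intercept + slope * x), q)
                   for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5]
    ys = [1, 3, 5, 7, 9, 100]   # y = 2x + 1 with one outlier at the end
    print(fit_quantile_line(xs, ys, 0.5))   # median line
    print(fit_quantile_line(xs, ys, 0.9))   # upper-quantile line
```

On this data the median line ignores the outlier, while the 0.9-quantile line is pulled toward it, so the central and upper fits behave differently.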

For each data set, we can run a model competition: score each forecast model by a weighted sum of model goodness of fit, model suitability for forecasting, data stationarity, and data variability, then select the model that scores best for that data set.
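
The weighted-sum competition could be sketched as follows. The four criteria names come from the text above, but the weights, the 0-to-1 scoring scale, and the example scores are hypothetical; the slides do not say how each criterion is measured.

```python
# Hypothetical weights for the four criteria named in the text; they sum to 1.
WEIGHTS = {
    "goodness_of_fit": 0.4,
    "forecast_suitability": 0.3,
    "stationarity": 0.2,
    "variability": 0.1,
}


def model_quality(scores, weights=WEIGHTS):
    """Weighted sum of per-criterion scores (each assumed to lie in [0, 1])."""
    return sum(weights[name] * scores[name] for name in weights)


def pick_model(candidates, weights=WEIGHTS):
    """Name of the candidate model with the highest weighted quality score."""
    if not candidates:
        raise ValueError("no candidate models supplied")
    return max(candidates, key=lambda name: model_quality(candidates[name], weights))


if __name__ == "__main__":
    # Made-up scores for a single data set, for illustration only.
    candidates = {
        "EWMA": {"goodness_of_fit": 0.8, "forecast_suitability": 0.7,
                 "stationarity": 0.5, "variability": 0.6},
        "ARIMA": {"goodness_of_fit": 0.9, "forecast_suitability": 0.6,
                  "stationarity": 0.9, "variability": 0.5},
        "Quantile Regression": {"goodness_of_fit": 0.6, "forecast_suitability": 0.8,
                                "stationarity": 0.4, "variability": 0.9},
    }
    print(pick_model(candidates))
```

Running the competition per data set means calling `pick_model` once for each series, with scores computed from that series alone.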

[Figure panels: EWMA, ARIMA, Quantile Regression]

Page 39: Data Science and Predictive SPC

Page 40: Data Science and Predictive SPC

Page 41: Data Science and Predictive SPC

Page 42: Data Science and Predictive SPC

Page 43: Data Science and Predictive SPC

Page 44: Data Science and Predictive SPC

Page 45: Data Science and Predictive SPC

Page 46: Data Science and Predictive SPC

Page 47: Data Science and Predictive SPC

Page 48: Data Science and Predictive SPC

[Figure labels: p50R …, p50 R …, Target, control limits (LCL…UCL), specification limits (LSL…USL)]

Page 49: Data Science and Predictive SPC

Page 50: Data Science and Predictive SPC

Page 51: Data Science and Predictive SPC
Page 52: Data Science and Predictive SPC

Page 53: Data Science and Predictive SPC

Page 54: Data Science and Predictive SPC

www.isixsigma.com

www.amstat.org

www.cmg.org

www.linkedin.com

http://alexonsimanddata.blogspot.com/

http://josepferrandiz.blogspot.com/

“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.”

- H.G.Wells (1866-1946)

Page 55: Data Science and Predictive SPC

THANK YOU

Page 56: Data Science and Predictive SPC
Page 57: Data Science and Predictive SPC

Page 58: Data Science and Predictive SPC
Page 60: Data Science and Predictive SPC
Page 61: Data Science and Predictive SPC

Page 62: Data Science and Predictive SPC

Page 63: Data Science and Predictive SPC

Universal Scalability Law
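
The slide gives only the title, so the sketch below supplies the commonly published form of Gunther's Universal Scalability Law as background: X(N) = λN / (1 + σ(N − 1) + κN(N − 1)), where λ is the throughput of a single unit, σ the contention coefficient, and κ the coherency coefficient. The parameter values are made up for illustration. Note that "USL" elsewhere in this deck means the upper specification limit, not this law.

```python
import math


def usl_throughput(n: float, lam: float, sigma: float, kappa: float) -> float:
    """Throughput at load or unit count n under the Universal Scalability Law."""
    return lam * n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak_load(sigma: float, kappa: float) -> float:
    """Value of n where throughput peaks: sqrt((1 - sigma) / kappa), for kappa > 0."""
    if kappa <= 0:
        raise ValueError("the throughput curve has a peak only when kappa > 0")
    return math.sqrt((1.0 - sigma) / kappa)


if __name__ == "__main__":
    lam, sigma, kappa = 100.0, 0.05, 0.001   # hypothetical parameters
    print(usl_peak_load(sigma, kappa))
    for n in (1, 10, 31, 100):
        print(n, round(usl_throughput(n, lam, sigma, kappa), 1))
```

Beyond the peak, adding units lowers throughput, which is why a law of this kind is relevant to the server-sizing trade-off earlier in the deck.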