Simple math to get signal out of your data noise - Anomaly Detection - Toufic Boubez - Metafor Software - Velocity Santa Clara 2014-06-25

Aug 11, 2014

tboubez

This is the presentation I gave at Velocity Santa Clara on June 25, 2014:

You’ve instrumented your system and application to the hilt. You can now “measure all the things”. Your team has set up thousands of metrics collecting millions of data points a day. Now what?

Analyzing this mountain of data and extracting signal from the noise is not easy, so most IT ops teams only keep an eye on a small fraction of the metrics they collect, or they run some simplistic analytics that don’t generate any useful information. The choice of what analytic method to use ranges from simple statistical analysis to sophisticated machine learning techniques. And one algorithm doesn’t fit all data.

While the more advanced algorithms require math nerds with PhDs to develop, there are some basic statistical methods that anyone can implement, and they can provide surprisingly valuable insights. In this talk, Toufic will show you how to determine the distribution of your data and explain why this is important. We will then explore a few statistical techniques that are appropriate for the most common distributions and discuss their pros and cons. Yes, there will be math in this talk but you will be able to follow along. The goal is for you to walk away with at least one technique that you can apply right away in your monitoring environment to improve the signal-to-noise ratio and get information out of your data.
Transcript
Page 1: Simple math to get signal out of your data noise -  Anomaly Detection - Toufic Boubez - Metafor Software - Velocity Santa Clara 2014-06-25

1

Some Simple Math to Get Signal out of your data noise

#VelocityConf 25.06.2014

Toufic Boubez, Ph.D.
Co-Founder, CTO
Metafor Software

[email protected]

2

Preamble

• I lied: There is no “simple” math for Anomaly Detection!
• I usually beat up on parametric, Gaussian, supervised techniques
  – This talk is to show some alternatives
  – Only enough time to cover a couple (four, really) of relatively simple but very useful techniques
  – Oh, and I will actually still start up by beating up on the usual suspects, but don’t despair, there’s good stuff towards the end
• Note: all real data
• Note: no y-axis labels on charts – on purpose!!
• Note to self: remember to SLOW DOWN!
• Note to self: mention the cats!! Everybody loves cats!!

3

Toufic intro – who I am

• Co-Founder/CTO Metafor Software
• Co-Founder/CTO Layer 7 Technologies
  – Acquired by Computer Associates in 2013
  – I escaped
• Co-Founder/CTO Saffron Technology
• IBM Chief Architect for SOA/Web Services
• Co-Author, Co-Editor: WS-Trust, WS-SecureConversation, WS-Federation, WS-Policy
• Building large scale software systems for >20 years (I’m older than I look, I know!)

4

Wall of Charts™

5

The WoC side-effects: alert fatigue

“Alert fatigue is the single biggest problem we have right now … We need to be more intelligent about our alerts or we’ll all go insane.”

- John Vincent (@lusis) (#monitoringsucks)

We have forensic tools for analytics after the fact BUT we need to KNOW that something has happened!

We need alerts!

6

Watching screens cannot scale

7

Time to turn things over to the machines!

8

Attempt #1: static thresholds …

• Roots in manufacturing process QC

9

… are based on Gaussian distributions

• Make assumptions about probability distributions and process behaviour
  – Data is normally distributed with a useful and usable mean and standard deviation
  – Data is probabilistically “stationary”

10

Three-Sigma Rule

• Three-sigma rule
  – ~68% of the values lie within 1 std deviation of the mean
  – ~95% of the values lie within 2 std deviations
  – 99.73% of the values lie within 3 std deviations: anything else is an outlier
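As a rough illustration of the rule above, static thresholds can be sketched in a few lines of Python. This is a minimal sketch, not the method used to produce the charts in this talk: the function names and the sample readings are made up for illustration, and it assumes the population standard deviation over the whole data set.

```python
import statistics

def three_sigma_limits(values):
    """Return (lower, upper) static thresholds at mean +/- 3 standard deviations."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # assumption: population std over ALL the data
    return mean - 3 * sigma, mean + 3 * sigma

def three_sigma_outliers(values):
    """Return (index, value) pairs that fall outside the static thresholds."""
    lower, upper = three_sigma_limits(values)
    return [(i, x) for i, x in enumerate(values) if x < lower or x > upper]

# Hypothetical, made-up readings with one obvious spike at the end.
readings = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 9.0, 11.0, 10.0, 10.0,
            11.0, 9.0, 10.0, 10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 60.0]
print(three_sigma_limits(readings))
print(three_sigma_outliers(readings))
```

Note that the spike itself inflates both the mean and the standard deviation, which is one reason this rule only behaves well when the data really is close to Gaussian and stationary.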

11

Aaahhhh

• The mysterious red lines explained

[Chart: time series with the mean and the ±3σ lines marked]

12

Stationary Gaussian distributions are powerful

• Because far far in the future, in a galaxy far far away:
  – I can make the same predictions because the statistical properties of the data haven’t changed
  – I can easily compare different metrics since they have similar statistical properties
• Let’s do this!!
• BUT…
• Cue in DRAMATIC MUSIC

13

Then THIS happens

14

3-sigma rule alerts

15

Or worse, THIS happens!

16

3-sigma rule alerts

17

WTF!? So what gives!?

• Remember this?

18

Histogram – probability distribution

19

Histogram – probability distribution

20

Attempts #2, #3, etc: mo’ better thresholds

• Static thresholds ineffective on dynamic data
  – Thresholds use the (static) mean as predictor and alert if data falls more than 3 sigma away
• Need “moving” or “adaptive” thresholds:
  – Value of mean changes with time to accommodate new data values, trends, periodicity

21

Moving Averages “big idea”

• At any point in time in a well-behaved time series, your next value should not significantly deviate from the general trend of your data

• Mean as a predictor is too static, relies on too much past data (ALL of the data!)

• Instead of overall mean use a finite window of past values, predict most likely next value

• Alert if actual value “significantly” (3 sigmas?) deviates from predicted value

22

Moving Averages typical method

• Generate a “smoothed” version of the time series
  – Average over a sliding (moving) window

• Compute the squared error between raw series and its smoothed version

• Compute a new effective standard deviation (sigma’) by smoothing the squared error and taking its square root

• Generate a moving threshold:
  – Outliers are 3-sigma’ outside the new, smoothed data!

• Ta-da!
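The steps above can be sketched roughly as follows. This is one possible reading of the method, not a definitive implementation: the function name, the window size and the sample series are hypothetical, and the sketch assumes that both the prediction and sigma’ are computed only from values seen before the current point, so that a spike does not widen its own threshold.

```python
from collections import deque
import math

def moving_average_outliers(series, window=5, n_sigmas=3.0):
    """Flag points that deviate more than n_sigmas * sigma' from a moving-average prediction."""
    values = deque(maxlen=window)      # last `window` raw values
    sq_errors = deque(maxlen=window)   # last `window` squared prediction errors
    flagged = []
    for t, x in enumerate(series):
        if len(values) == window:
            predicted = sum(values) / window          # smoothed value used as predictor
            error = x - predicted
            if len(sq_errors) == window:
                # sigma' = square root of the smoothed squared error
                sigma = math.sqrt(sum(sq_errors) / window)
                if abs(error) > n_sigmas * sigma:
                    flagged.append((t, x))
            sq_errors.append(error * error)
        values.append(x)
    return flagged

# Hypothetical series: a small regular wobble with one spike at index 12.
series = [10, 11, 10, 9, 10, 11, 10, 9, 10, 11, 10, 9, 30, 10, 11]
print(moving_average_outliers(series, window=4))
```

Once the spike enters the window it inflates both the prediction and sigma’ for the next few points, so anything that follows it closely is harder to detect.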

23

Simple and Weighted Moving Averages

• Simple Moving Average
  – Average of last N values in your time series
    • S[t] <- sum(X[t-(N-1):t])/N
  – Each value in the window contributes equally to prediction
  – …INCLUDING spikes and outliers
• Weighted Moving Average
  – Similar to SMA but assigns linearly (arithmetically) decreasing weights to every value in the window
  – Older values contribute less to the prediction
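To make the difference between the two averages concrete, here is a small sketch in Python. The function names and sample data are illustrative only, and the weighting scheme (newest value gets weight N, oldest gets weight 1) is one common convention that is assumed here rather than taken from the slides.

```python
def simple_moving_average(x, n):
    """S[t] = sum(X[t-(N-1) .. t]) / N, for every t that has a full window."""
    return [sum(x[t - n + 1 : t + 1]) / n for t in range(n - 1, len(x))]

def weighted_moving_average(x, n):
    """Linearly decreasing weights: newest value weighs N, oldest weighs 1 (assumed convention)."""
    weights = list(range(1, n + 1))    # oldest -> newest
    total = sum(weights)
    return [
        sum(w * v for w, v in zip(weights, x[t - n + 1 : t + 1])) / total
        for t in range(n - 1, len(x))
    ]

# Hypothetical data: a slow ramp followed by a spike.
data = [1.0, 2.0, 3.0, 4.0, 10.0]
print(simple_moving_average(data, 3))
print(weighted_moving_average(data, 3))
```

On the last window the weighted average moves further towards the spike than the simple average does, because the newest value carries the largest weight.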

24

Exponential Smoothing techniques

• Exponential Smoothing
  – Similar to weighted average, but with weights that decay exponentially over the whole set of historic samples
    • S[t] = αX[t-1] + (1-α)S[t-1]
  – Does not deal with trends in data
• Double Exponential Smoothing (DES)
  – In addition to data smoothing factor (α), introduces a trend smoothing factor (β)
  – Better at dealing with trending
  – Does not deal with seasonality in data
• Triple Exponential Smoothing (TES), Holt-Winters
  – Introduces additional seasonality factor
  – … and so on
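A minimal sketch of the first two techniques, to show why the trend factor matters. The single-smoothing function follows the recurrence on the slide; the double-smoothing function uses the usual Holt linear-trend form, which is an assumption since the slides do not spell it out. The initial values (S[0] = X[0], initial trend = X[1] - X[0]) and all names are also assumptions made for illustration.

```python
def exponential_smoothing(x, alpha):
    """S[t] = alpha * X[t-1] + (1 - alpha) * S[t-1], with S[0] = X[0] (assumed start)."""
    s = [x[0]]
    for t in range(1, len(x)):
        s.append(alpha * x[t - 1] + (1 - alpha) * s[t - 1])
    return s

def double_exponential_smoothing(x, alpha, beta):
    """One-step-ahead forecasts with a level and a trend term (Holt's linear form, assumed)."""
    level, trend = x[0], x[1] - x[0]   # assumed initialisation
    pred = [x[0]]
    for t in range(1, len(x)):
        pred.append(level + trend)     # forecast of x[t] from data up to t-1
        new_level = alpha * x[t] + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return pred

# Hypothetical steadily trending series.
ramp = [0.0, 2.0, 4.0, 6.0, 8.0]
print(exponential_smoothing(ramp, 0.5))               # lags behind the trend
print(double_exponential_smoothing(ramp, 0.5, 0.5))   # follows the trend
```

On a steadily rising series the single-smoothed prediction keeps lagging behind the data, while the version with a trend term tracks it. Neither handles seasonality, which is what the Holt-Winters seasonal factor is for.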

25

Example 1

26

Holt-Winters predictions

27

Example 2

28

Exponential smoothing predictions

29

Hmmmm, so are we doomed?

• No!
• ALL smoothing predictive methods work best with normally distributed data!
• But there are lots of other non-Gaussian based techniques
  – We can only scratch the surface in this talk

30

Trick #1: Histogram!

31

THIS is normal

32

This isn’t

33

Neither is this
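Looking at a histogram does not need a plotting library. The sketch below bins the data and prints a text histogram; the function names are illustrative, and the two sample data sets are synthetic (generated with a fixed seed) purely to show the contrast between a bell-shaped and a long-tailed distribution. They are not the data behind the charts in this talk.

```python
import random

def histogram(values, bins=10):
    """Return (counts, edges) for equal-width bins between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0    # avoid zero width when all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)   # the maximum goes in the last bin
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges

def print_histogram(values, bins=10, max_bar=40):
    """Print one row per bin: left edge, a bar scaled to the tallest bin, and the count."""
    counts, edges = histogram(values, bins)
    peak = max(counts)
    for count, left in zip(counts, edges):
        bar = "#" * round(max_bar * count / peak)
        print(f"{left:10.2f} | {bar} {count}")

rng = random.Random(42)                # fixed seed, synthetic data only
bell_shaped = [rng.gauss(50, 5) for _ in range(1000)]
long_tailed = [rng.lognormvariate(3, 1) for _ in range(1000)]
print_histogram(bell_shaped)
print()
print_histogram(long_tailed)
```

If the second shape looks more like your metric than the first, mean and standard deviation are unlikely to describe it well.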

34

Trick #2: Kolmogorov-Smirnov test

• Non-parametric test
  – Compare two probability distributions
  – Makes no assumptions (e.g. Gaussian) about the distributions of the samples
  – Measures maximum distance between cumulative distributions
  – Can be used to compare periodic/seasonal metric periods (e.g. day-to-day or week-to-week)

http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
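The “maximum distance between cumulative distributions” can be computed directly from two samples. The following is a small stdlib-only sketch; the function name is made up for illustration, and it returns only the KS statistic D, not a p-value.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for v in a + b:
        cdf_a = bisect_right(a, v) / len(a)   # fraction of sample a that is <= v
        cdf_b = bisect_right(b, v) / len(b)   # fraction of sample b that is <= v
        d = max(d, abs(cdf_a - cdf_b))
    return d

print(ks_statistic([1, 2, 3, 4], [1, 2, 3, 4]))   # identical samples
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))   # partially overlapping samples
print(ks_statistic([1, 2, 3], [4, 5, 6]))         # completely separated samples
```

If SciPy is available, `scipy.stats.ks_2samp` computes the same statistic and also reports a p-value.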

35

KS with windowing
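One way to apply the KS test with windowing is to compare each window of the series with the window just before it. The sketch below is an assumption about how such a scheme could look, not the implementation behind the charts: the names, the window length and the sample series are hypothetical. The threshold uses the standard large-sample critical value for the two-sample test, which assumes independent samples; time-series windows usually violate that, so treat it as a rough starting point.

```python
import math
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )

def ks_window_alerts(series, window, alpha=0.05):
    """Return the start index of every window whose distribution differs from the previous one."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    # sqrt((n + m) / (n * m)) with n = m = window
    critical = c_alpha * math.sqrt(2 / window)
    alerts = []
    for start in range(window, len(series) - window + 1, window):
        previous = series[start - window : start]
        current = series[start : start + window]
        if ks_statistic(previous, current) > critical:
            alerts.append(start)
    return alerts

# Hypothetical metric: three identical "days" of 24 samples, then a day shifted upwards.
day = list(range(24))
series = day * 3 + [v + 50 for v in day]
print(ks_window_alerts(series, window=24))
```

Choosing the window to match the period of the metric (a day, a week) is what lets this comparison cope with seasonality.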

36

37

38

39

40

41

KS Test on difficult data

42

Trick #3: Box Plots / Tukey

• Again, need non-parametric method:
  – Does not rely on mean and standard deviation
• When you can’t count on good old Gaussian:
  – Median is always a great alternative to the mean
  – Quartiles are an alternative to standard deviation
    • Q1 = 25% Quartile (25% of the data lies below it)
    • Q2 = 50% Quartile == Median (50% of the data lies below it)
    • Q3 = 75% Quartile (75% of the data lies below it)
    • Interquartile Range (IQR) = Q3 – Q1
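A short sketch of these quantities using Python’s standard library. The function name and the sample latencies are made up for illustration; `method="inclusive"` is one of several quartile conventions and is an assumption here, so other tools may give slightly different Q1 and Q3 values for small samples.

```python
import statistics

def quartile_summary(values):
    """Return Q1, median (Q2), Q3 and the interquartile range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3, "iqr": q3 - q1}

# Hypothetical latencies with one extreme value.
latencies = [1, 2, 3, 4, 5, 6, 7, 8, 9, 500]
print(quartile_summary(latencies))
print(statistics.fmean(latencies), statistics.pstdev(latencies))
```

The single extreme value drags the mean and standard deviation far away from the bulk of the data, while the median and IQR barely notice it.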

43

Example: box plots and fences for a Gaussian

http://en.wikipedia.org/wiki/Interquartile_range

44

IQR method for streaming time series

• IQR method works well for some non-normal distributions
  – Generates continuously adaptive fences at (Q1 - 1.5 x IQR) and (Q3 + 1.5 x IQR)
  – Adjusted box plot uses fences at (Q1 - 1.5 x IQR) and (Q3 + 1.5 x IQR)
• Method:
  – As time series is streaming, for every window:
    • Re-compute quartiles
    • Re-compute IQR, fences
    • Determine if any outliers
    • Repeat
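The streaming loop above might look roughly like this. It is a sketch under stated assumptions rather than the exact procedure used for the examples: the function name, the window size and the sample stream are hypothetical, the fences are computed from the window preceding each new point, and the quartiles use the "inclusive" convention.

```python
from collections import deque
import statistics

def iqr_stream_outliers(stream, window=20, k=1.5):
    """Yield (index, value) for points outside the fences of the preceding window."""
    recent = deque(maxlen=window)
    for t, x in enumerate(stream):
        if len(recent) == window:
            # Re-compute quartiles, IQR and fences for the current window
            q1, _, q3 = statistics.quantiles(recent, n=4, method="inclusive")
            iqr = q3 - q1
            if x < q1 - k * iqr or x > q3 + k * iqr:
                yield t, x
        recent.append(x)

# Hypothetical stream: a repeating pattern, one spike, then back to normal.
stream = [10, 11, 9, 10, 12, 10, 9, 11] * 3 + [40, 10, 11]
print(list(iqr_stream_outliers(stream, window=8)))
```

Because quartiles are robust, the spike hardly moves the fences once it enters the window, so the points right after it are still judged against sensible limits.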

45

Example 1

46

Example 1 – Good

47

Example 2

48

Example 2 – Bad

49

Trick #4: Diffing/Derivatives

• Often, even when the data itself is not stationary, its derivatives tend to be!

• Most frequently, first difference is sufficient:
  dS(t) <- S(t+1) – S(t)

• Can then perform some analytics on first difference
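The first difference is a one-liner. In this sketch the function name and the sample series are illustrative only.

```python
def first_difference(series):
    """dS(t) = S(t+1) - S(t); the result is one element shorter than the input."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical trending metric with one sudden jump.
trending = [100, 102, 104, 107, 109, 111, 150, 152]
print(first_difference(trending))
```

The raw values keep climbing, so no fixed threshold suits them, but the differences hover around a constant level and the jump stands out clearly. Any of the earlier techniques can then be applied to the differenced series.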

50

CPU time series

51

Its first difference – possible random walk?

52

Trick #5: Neural Networks

• Really?
• No time – To Be Continued!

53

We’re not doomed, but: Know your data!!

• You need to understand the statistical properties of your data, and where it comes from, in order to determine what kind of analytics to use.
  – Your data is very important!
  – You spend time collecting it so spend time analyzing it!
• A large amount of data center data is non-Gaussian
  – Gaussian statistics won’t work
  – Use appropriate techniques

54

More?

• Only scratched the surface
• I want to talk more about algorithms, analytics, current issues, etc, in more depth, but time’s up!!
  – Come talk to me or email me if interested.
• Office Hour: Tomorrow at 11:30 Booth #801
• Thank you!

[email protected]
@tboubez

55

Oh yeah, and we’re hiring!

In Vancouver, Canada