Applying PCA for Traffic Anomaly Detection: Problems and Solutions. Daniela Brauckhoff (ETH Zurich, CH), Kave Salamatian (Lancaster University, FR), Martin May (Thomson, CH). IEEE INFOCOM (April, 2009).

Jan 14, 2016

Transcript
Page 1: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Applying PCA for Traffic Anomaly Detection: Problems and Solutions

Daniela Brauckhoff (ETH Zurich, CH)
Kave Salamatian (Lancaster University, FR)
Martin May (Thomson, CH)

IEEE INFOCOM (April, 2009)

2010/3/2

Page 2: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Agenda

• Before Introduction
• Objective
• A Signal Processing View on PCA
• Extension of PCA to Stochastic Processes
• Validation
• Conclusion

Page 3: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

What is PCA?

• PCA
– Principal Component Analysis

• PCA's usage
– reduce the feature dimension of the data
– e.g., a picture of size 1024 × 768
• its feature dimension is its width × height
• i.e., 786,432 feature values
• use PCA to reduce this feature dimension
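As a rough illustration of dimension reduction with PCA, the sketch below projects synthetic data onto its leading principal directions and reconstructs it. The data, its sizes, and the helper name `pca_reduce` are assumptions made for this example; they do not come from the slides.

```python
import numpy as np

def pca_reduce(data, n_components):
    """Project the rows of `data` onto the top principal components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    return scores, components, mean

rng = np.random.default_rng(0)
# Hypothetical data: 200 samples with 50 features driven by 3 latent factors.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

scores, components, mean = pca_reduce(data, n_components=3)
reconstructed = scores @ components + mean
rel_error = np.linalg.norm(data - reconstructed) / np.linalg.norm(data)
print(scores.shape, rel_error)
```

Here 50 feature values per sample are replaced by 3 scores, and the small relative reconstruction error indicates that little information is lost for data of this kind.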

Page 4: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

What is PCA? (cont.1)

Ref. site: http://blog.finalevil.com/2008/07/pca.html

Page 5: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

What is PCA? (cont.2)


Page 7: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Problems and Solutions

• Take the temporal correlation of the data into account

• Extend PCA
– replace it with the Karhunen-Loeve Transform


Page 9: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Two different interpretations

1. As an efficient representation that transforms the data to a new coordinate system
• The projection on the first coordinate contains the greatest variance

2. As a modeling technique
• using a finite number of terms of an orthogonal series expansion of the signal with uncorrelated coefficients

Page 10: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Background

• Suppose that we have a column vector of correlated random variables: $X = (X_1, \ldots, X_K)^T \in \mathbb{R}^K$
– Matrix => $X$

– Each random variable has its own observation vector through N dependent realization vectors: $x^i = (x_1^i, \ldots, x_K^i)^T$

– Note:
• "Random variables" means the data you collected from the network

Page 11: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Background (cont.1)

• We want to find the characteristics of the above data collected from the network, $X = (X_1, \ldots, X_K)^T \in \mathbb{R}^K$
– i.e., the most suitable basis: $(\phi_1, \ldots, \phi_K)$,
• where $\phi_i$ is an eigenvector of the covariance matrix $\Sigma$, defined as $\Sigma = E\{(X - \mu)(X - \mu)^T\}$ and estimated by $\hat{\Sigma} = \frac{1}{N} x x^T$
• where $\mu$ is a column vector containing the means of $X_i$

Page 12: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Background (cont.2)

• The most suitable basis: $(\phi_1, \ldots, \phi_K)$
• How to find each $\phi_i$?
– i.e., solve the following linear equation: $\Sigma \phi_i = \lambda_i \phi_i$, where $\Sigma = E\{(X - \mu)(X - \mu)^T\}$

– Method: SVD (Singular Value Decomposition)
• Note: basis change matrix $U = [\phi_1, \ldots, \phi_K]$

Page 13: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Background (cont.3)

• But $U = [\phi_1, \ldots, \phi_K]$ is a basis change matrix only when $X$ is zero mean

• Otherwise, $X$ must be replaced by $\tilde{X}$
– i.e., $\tilde{X} = X - \mu$, so that $\tilde{y} = U \tilde{x}$
– not taking care of this could lead to large errors when using PCA

• Rewrite the initial vector of random variables as $X = \sum_{i=1}^{K} Y_i \phi_i$
– $Y_i$ is the essential property!
– i.e., suitable for PCA representation
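The background steps (center the data, estimate the covariance, solve the eigenvalue problem, change basis) can be sketched numerically as follows. This is only an illustration on synthetic data: the sizes, the mean values, and all variable names are assumptions. The eigenvectors are stored as the columns of `U`, so the projection is written as `U.T @ x_tilde` in this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
K, N = 4, 5000
# Hypothetical correlated data with a large non-zero mean (K variables, N realizations).
A = rng.normal(size=(K, K))
mu_true = np.array([[50.0], [-30.0], [20.0], [40.0]])
x = A @ rng.normal(size=(K, N)) + mu_true          # K x N observation matrix

mu = x.mean(axis=1, keepdims=True)                 # estimated mean vector
x_tilde = x - mu                                   # centered data
sigma_hat = (x_tilde @ x_tilde.T) / N              # covariance estimate (1/N) x x^T

lam, U = np.linalg.eigh(sigma_hat)                 # columns of U are eigenvectors
order = np.argsort(lam)[::-1]                      # sort by decreasing eigenvalue
lam, U = lam[order], U[:, order]

y = U.T @ x_tilde                                  # coefficients in the new basis
cov_y = (y @ y.T) / N                              # diagonal: coefficients are uncorrelated

# Without centering, the leading direction mostly points along the mean vector.
raw = (x @ x.T) / N
_, U_raw = np.linalg.eigh(raw)
lead_raw = U_raw[:, -1]
cos_with_mean = abs(lead_raw @ mu[:, 0]) / np.linalg.norm(mu)
print(np.round(cov_y, 6))
print(cos_with_mean)
```

The last two lines of the computation show why the zero-mean condition matters: when centering is skipped, the first "principal" direction is dominated by the mean rather than by the correlation structure of the data.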


Page 15: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Stochastic Process

• The extension of PCA to stochastic processes that have temporal as well as spatial correlations

• Assume we have a K-vector of zero-mean stationary stochastic processes $X(t) = (X_1(t), \ldots, X_K(t))^T$

– with a covariance function $\Sigma_{i,j}(\tau) = E\{X_i(t) X_j(t + \tau)\}$

Page 16: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Stochastic Process (cont.1)

• The multi-dimensional Karhunen-Loeve theorem states that one can rewrite this vector as a series expansion (named the KL expansion): $X_l(t) = \sum_{i=1}^{K} \sum_{j=1}^{\infty} Y_{i,j}^{l} \phi_{i,j}(t)$

– Compare with PCA: $X = \sum_{i=1}^{K} Y_i \phi_i$

Page 17: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

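Stochastic Process (cont.2)

• How to get the basis functions $\phi_{i,j}(t)$?
– Solve the linear integral equations: $\sum_{i=1}^{K} \int_a^b \Sigma_{l,i}(s - t)\, \phi_{i,j}(s)\, ds = \lambda_{l,j}\, \phi_{l,j}(t)$

– Compare with PCA: $\Sigma \phi_i = \lambda_i \phi_i$

• Then $Y_{i,j}^{l}$ can be obtained by $Y_{i,j}^{l} = \int_a^b X_l(s)\, \phi_{i,j}(s)\, ds$
– these are the coefficients of the KL expansion $X_l(t) = \sum_{i=1}^{K} \sum_{j=1}^{\infty} Y_{i,j}^{l} \phi_{i,j}(t)$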

Page 18: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Stochastic Process (cont.3)

• The Galerkin method transforms the above integral equations into a matrix problem that can be solved by applying the SVD technique

• It is possible to derive the KL expansion using only a finite number of samples
– Time-sampled version => $\phi_{i,j}[k] = \phi_{i,j}(kT)$
– Finally, we obtain a discrete version of the KL expansion as: $X_l[k] = \sum_{i=1}^{K} \sum_{j=1}^{N} Y_{i,j}^{l} \phi_{i,j}[k]$

Page 19: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Stochastic Process (cont.4)

• Construct a KN × (n − N) observation matrix

• With KN eigenvectors
– $X_l[k] = \sum_{i=1}^{K} \sum_{j=1}^{N} Y_{i,j}^{l} \phi_{i,j}[k]$

Page 20: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Stochastic Process (cont.5)

• Use $\hat{\Sigma} = \frac{1}{n - N} x x^T$ to estimate all the needed spatio-temporal covariances
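One way to picture the KN × (n − N) observation matrix is to stack N consecutive time samples of all K metrics into each column and then estimate the covariance from it. The sketch below does this on a synthetic AR(1) process. The helper name `lagged_observation_matrix`, the column ordering, and the test data are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def lagged_observation_matrix(x, N):
    """Stack N consecutive time samples of all K metrics into each column.

    x has shape (K, n); the result has shape (K*N, n - N).
    """
    K, n = x.shape
    # Column k holds x[:, k], x[:, k+1], ..., x[:, k+N-1] stacked on top of each other.
    cols = [x[:, k:k + N].reshape(K * N, order="F") for k in range(n - N)]
    return np.stack(cols, axis=1)

rng = np.random.default_rng(2)
K, n, N = 3, 500, 2
# Hypothetical temporally correlated metrics: one AR(1) process per metric.
noise = rng.normal(size=(K, n))
x = np.zeros((K, n))
for t in range(1, n):
    x[:, t] = 0.8 * x[:, t - 1] + noise[:, t]
x -= x.mean(axis=1, keepdims=True)                 # enforce zero mean

obs = lagged_observation_matrix(x, N)              # (K*N) x (n - N)
sigma_hat = (obs @ obs.T) / (n - N)                # spatio-temporal covariance estimate
lam, U = np.linalg.eigh(sigma_hat)                 # K*N eigenvalues and eigenvectors
print(obs.shape, lam.shape)
```

With N = 1 each column is a single time sample, so the construction falls back to the ordinary PCA covariance estimate; larger N adds the cross-time blocks that capture temporal correlation.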


Page 22: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Data Set and Metrics

• Collected three weeks of NetFlow data
– from one of the peering links of a medium-sized ISP (SWITCH, AS559)
• Recorded in August 2007
– comprises a variety of traffic anomalies
– happening in daily operation, such as network scans, denial-of-service attacks, alpha flows, etc.

Page 23: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Data Set and Metrics (cont.1)

• Computing the detection metrics:
– distinguish between incoming and outgoing traffic, as well as UDP and TCP flows
– for each of these four categories, compute seven commonly used traffic features:
• Byte count
• Packet count
• Flow count
• Source and destination IP address entropy
• Source and destination IP address counts
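The seven features for one traffic category in one interval could be computed roughly as below. The record layout (`src`, `dst`, `bytes`, `packets`) is a made-up illustration rather than the NetFlow schema, and computing the entropy over flows (not packets or bytes) is an assumption of this sketch.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def traffic_features(flows):
    """Seven features for one category (e.g. incoming TCP) in one interval."""
    src = [f["src"] for f in flows]
    dst = [f["dst"] for f in flows]
    return {
        "bytes": sum(f["bytes"] for f in flows),
        "packets": sum(f["packets"] for f in flows),
        "flows": len(flows),
        "src_entropy": entropy(src),
        "dst_entropy": entropy(dst),
        "src_count": len(set(src)),
        "dst_count": len(set(dst)),
    }

# Hypothetical flow records: four sources contacting a single destination.
flows = [
    {"src": "10.0.0.1", "dst": "10.0.1.1", "bytes": 1200, "packets": 3},
    {"src": "10.0.0.2", "dst": "10.0.1.1", "bytes": 400, "packets": 1},
    {"src": "10.0.0.3", "dst": "10.0.1.1", "bytes": 800, "packets": 2},
    {"src": "10.0.0.4", "dst": "10.0.1.1", "bytes": 600, "packets": 2},
]
features = traffic_features(flows)
```

In this toy input the source entropy is 2 bits (four equally likely sources) while the destination entropy is zero, the kind of concentration on one target that makes entropy a useful anomaly feature.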

Page 24: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Data Set and Metrics (cont.2)

• All metrics are obtained by aggregating the traffic in 15-minute intervals, resulting in a 28 × 96 matrix per measurement day

• Anomalies are identified by visual inspection

• This resulted in 28 detected anomalous events in UDP traffic and 73 in TCP traffic
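The 28 × 96 shape follows from the counts given in the slides: 2 directions × 2 protocols × 7 features give 28 rows, and a day split into 15-minute intervals gives 96 columns. A small sketch of that arithmetic, with illustrative names such as `interval_index` that are assumptions of this example:

```python
SECONDS_PER_DAY = 24 * 60 * 60
INTERVAL_SECONDS = 15 * 60

directions = ["in", "out"]
protocols = ["tcp", "udp"]
feature_names = ["bytes", "packets", "flows",
                 "src_entropy", "dst_entropy", "src_count", "dst_count"]

rows = len(directions) * len(protocols) * len(feature_names)   # 2 * 2 * 7
cols = SECONDS_PER_DAY // INTERVAL_SECONDS                     # intervals per day

def interval_index(seconds_since_midnight):
    """Column of the daily matrix that a flow timestamp falls into."""
    return seconds_since_midnight // INTERVAL_SECONDS

print(rows, cols)
```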

Page 25: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Data Set and Metrics (cont.3)

• Use the metrics from the first two days to build the model

• Derive a spatio-temporal correlation matrix with the temporal correlation range set to N = 1, ..., 5
– note that setting N = 1 gives the standard PCA approach
– apply SVD to the data, resulting in a basis change matrix

Page 26: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

ROC curves

• A Receiver Operating Characteristic (ROC) curve captures the essential trade-off between two parameters in a single plot
– the false positive rate and the true positive rate
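To make the trade-off concrete, the sketch below sweeps a detection threshold over a handful of made-up anomaly scores and records the false positive and true positive rates at each cut-off. The scores, the labels, and the helper `roc_points` are assumptions for illustration; the sketch also assumes that all scores are distinct.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) for every cut-off."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(-scores, kind="stable")      # highest score first
    hits = labels[order]
    tp = np.concatenate(([0], np.cumsum(hits)))     # true positives per cut-off
    fp = np.concatenate(([0], np.cumsum(~hits)))    # false positives per cut-off
    return fp / (~labels).sum(), tp / labels.sum()

# Hypothetical anomaly scores and ground truth (True = anomalous interval).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [True, True, False, True, False, False, False, False]
fpr, tpr = roc_points(scores, labels)
# Area under the curve by the trapezoidal rule.
auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
print(auc)
```

For this toy input the area under the curve is 14/15: of the 15 possible (anomalous, normal) pairs, 14 are ranked in the right order.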

Page 27: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

ROC curves (cont.1)

Page 28: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

ROC curves (cont.2)

Page 29: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

ROC curves (cont.3)

• The comparison of ROC curves shows a considerable improvement in anomaly detection performance when the KL expansion is used with N = 2, 3, consistently for UDP and TCP traffic, and a decrease thereafter for N ≥ 4

Page 30: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Effect of non-stationarity

• Stationarity issue:
– for N ≥ 4 the performance decreases
– as N increases, the model contains more parameters and becomes more sensitive to the stationarity of the traffic metrics


Page 32: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Conclusion

• Direct application of the PCA method results in poor performance in terms of ROC curves

• The correct framework is not classical PCA but rather the Karhunen-Loeve expansion

• A Galerkin method is provided for developing a predictive model, and an important improvement is attained when temporal correlation is considered

Page 33: Applying PCA for Traffic  Anomaly Detection: Problems and Solutions

Q & A

Thank you!
