Khushboo Shah, Edmond Jonckheere, Stephan Bohacek University of Southern California
Post on 14-Jan-2016
Detecting Network Attacks Through Traffic Modeling
Theme
Objective:
Detect flooding attacks.
Hypothesis:
The time series of non-TCP attack traffic has a statistical signature distinct from that of the time series of normal TCP traffic.
Statistical signatures being considered are:
• Mutual information
• Dynamic modeling prediction error
Outline
• Canonical Correlation Analysis (CCA) and Mutual Information (MI)
• Attack detection on different topologies using Mutual Information:
- Dumbbell topology
- Parking lot topology
- Random topology
- Transit-Stub 100 node topology
• Models derived from CCA and prediction error
- State Space model using Kalman filter
- Nonlinear Auto Regressive (AR) model
• Dynamic attack detection on different topologies using mean square prediction error.
Canonical Correlation Analysis
Observed signal: $\{Y(k) : -\infty < k < \infty\}$, a zero-mean random process.
Past: $Y_-(k) = (y(k-L+1), \ldots, y(k))^T$
Future: $Y_+(k) = (y(k+1), \ldots, y(k+L))^T$, where $L$ is the lag.
One way to understand the correlation between the past and the future is to examine $C_{-+} = E(Y_- Y_+^T)$.
Another way is to perform canonical correlation analysis:
$\max E\big( (\sum_i \alpha_i Y(k-i)) (\sum_j \beta_j Y(k+j)) \big)$ subject to $\sum_i \alpha_i^2 = 1$, $\sum_j \beta_j^2 = 1$
Example: the past difference $Y(k) - Y(k-1) > 0$ if $Y$ increases, and the future difference $Y(k+1) - Y(k) > 0$ if $Y$ increases, so $E\big( (Y(k) - Y(k-1)) (Y(k+1) - Y(k))^T \big) > 0$.
$\sigma_1 := \max E\big( (\sum_i \alpha_i Y(k-i)) (\sum_j \beta_j Y(k+j)) \big)$ subject to $\sum_i \alpha_i^2 = 1$, $\sum_j \beta_j^2 = 1$
$\sigma_2 := \max E\big( (\sum_i \alpha_i Y(k-i)) (\sum_j \beta_j Y(k+j)) \big)$ subject to $\sum_i \alpha_i^2 = 1$, $\sum_j \beta_j^2 = 1$, with $\alpha$ and $\beta$ orthogonal to the weight vectors that attain $\sigma_1$
Linear Canonical Correlation Analysis (LCCA)
Auto-correlation of the past: $C_{--} = E(Y_- Y_-^T)$
Auto-correlation of the future: $C_{++} = E(Y_+ Y_+^T)$
Cross-correlation between the past and the future: $C_{-+} = E(Y_- Y_+^T)$
Canonical correlation matrix between past and future: $CC = C_{--}^{-0.5} C_{-+} C_{++}^{-0.5}$
Singular Value Decomposition of CC: $CC = U \Sigma V'$
where $U$ and $V$ are orthogonal matrices, $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_L)$, and $1 \ge \sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_L \ge 0$.
The $\sigma_i$'s are called canonical correlation coefficients (CCCs).
Regardless of how large the lag $L$ is, the $\sigma_i$'s have a breakpoint:
$1 \ge \sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_D \gg \sigma_{D+1} \ge \ldots \ge \sigma_L \ge 0$, with $0.1 \ge \sigma_{D+1} \ge \ldots \ge \sigma_L \ge 0$
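The LCCA steps (covariances of the past and future vectors, the canonical correlation matrix, and its singular values) can be sketched in Python with NumPy. This is a minimal illustration, not the authors' code: the function names, the use of sample covariances, and the eigenvalue floor `eps` are assumptions made here.

```python
import numpy as np

def past_future_matrices(y, lag):
    """Stack the past vectors Y-(k) and future vectors Y+(k) as rows."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()  # the analysis assumes a zero-mean process
    ks = range(lag - 1, len(y) - lag)  # every k where both windows fit
    past = np.array([y[k - lag + 1:k + 1] for k in ks])
    future = np.array([y[k + 1:k + lag + 1] for k in ks])
    return past, future

def inv_sqrt(c, eps=1e-10):
    """Symmetric inverse square root C^(-0.5) via eigendecomposition."""
    w, v = np.linalg.eigh(c)
    w = np.maximum(w, eps)  # assumed floor to avoid dividing by ~0
    return (v / np.sqrt(w)) @ v.T

def canonical_correlations(y, lag):
    """Return the canonical correlation coefficients, largest first."""
    past, future = past_future_matrices(y, lag)
    n = past.shape[0]
    c_mm = past.T @ past / n      # C--
    c_pp = future.T @ future / n  # C++
    c_mp = past.T @ future / n    # C-+
    cc = inv_sqrt(c_mm) @ c_mp @ inv_sqrt(c_pp)
    sigma = np.linalg.svd(cc, compute_uv=False)
    return np.clip(sigma, 0.0, 1.0)  # remove round-off outside [0, 1]
```

A nearly periodic series should give a leading coefficient close to one, while white noise should give coefficients close to zero.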
Nonlinear Canonical Correlation Analysis (NLCCA)
Observed signal: $\{Y(k) : -\infty < k < \infty\}$, a zero-mean random process.
Linear CCA: $\sigma_1 := \max E\big( (\sum_i \alpha_i Y(k-i)) (\sum_j \beta_j Y(k+j)) \big)$ subject to $\sum_i \alpha_i^2 = 1$, $\sum_j \beta_j^2 = 1$
Let $\phi(Y_-(k))$ be a nonlinear function of the past.
We restrict $\phi$ to be a simple polynomial:
$\phi(Y_-(k)) = \sum_{i=0}^{L} \big[ a_i Y(k-i) + b_i Y(k-i) Y(k-i) + c_i Y(k-i) Y(k-i-1) \big]$
Semi-NLCCA: restrict $\psi$ to be a linear function of $Y_+(k)$:
$\psi(Y_+(k)) = \sum_{i=0}^{L} d_i Y(k+i)$
Full-NLCCA: restrict $\psi$ to be a simple polynomial of $Y_+(k)$:
$\psi(Y_+(k)) = \sum_{i=0}^{L} \big[ d_i Y(k+i) + e_i Y(k+i) Y(k+i) + g_i Y(k+i) Y(k+i+1) \big]$
Nonlinear CCA: $\sigma_1 := \max E\big( \phi(Y_-) \psi(Y_+) \big)$ subject to $\|\phi\| = 1$, $\|\psi\| = 1$
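To make the polynomial restriction concrete, the sketch below builds the terms of a simple polynomial of the past for one time index. It assumes the polynomial consists of the terms Y(k-i), Y(k-i)^2 and Y(k-i)Y(k-i-1) for i = 0..L; the function name is hypothetical, and the coefficients that NLCCA would attach to each term are not computed here.

```python
def polynomial_past_terms(y, k, lag):
    """Return [Y(k-i), Y(k-i)^2, Y(k-i)*Y(k-i-1)] for i = 0..lag, flattened."""
    if k - lag - 1 < 0 or k >= len(y):
        raise IndexError("k and lag must leave room for Y(k-lag-1) .. Y(k)")
    terms = []
    for i in range(lag + 1):
        current = y[k - i]
        previous = y[k - i - 1]
        terms.extend([current, current * current, current * previous])
    return terms
```

Applying linear CCA to these terms instead of the raw past samples is one way to realize the semi-nonlinear variant; doing the same for the future gives the full nonlinear variant.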
Mutual Information (MI)
The (Akaike) Mutual Information (MI) between the past and future vectors $Y_-(k)$ and $Y_+(k)$ is given by
$MI = -0.5 \sum_i \ln(1 - \sigma_i^2)$,
where the $\sigma_i$'s are the CCCs.
MI is a measure of the predictability of the future signal given the past.
If the CCCs are all equal to zero, then MI is zero and the given time series is uncorrelated (or independent if the process is normally distributed).
In contrast, if the CCCs are all one, then MI is infinite and the series is completely predictable given knowledge of the past.
Mutual information can be used to detect an attack
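The MI formula, MI = -0.5 Σ ln(1 - σ_i²), is simple enough to check by hand. A minimal sketch follows; the function name is an assumption, and a coefficient equal to one is treated as giving infinite MI, as the text above describes.

```python
import math

def mutual_information(cccs):
    """Akaike mutual information from canonical correlation coefficients."""
    total = 0.0
    for sigma in cccs:
        if sigma >= 1.0:
            return math.inf  # completely predictable series
        total += math.log(1.0 - sigma * sigma)
    return -0.5 * total
```

Larger coefficients give larger MI, which is why a more predictable (for example, constant bit rate) traffic component raises the statistic.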
What are the observations?
• Link utilization – The number of bytes that traverse the link in one sample period divided by the hardware bandwidth.
• Packet arrivals – The number of packets that arrive at the router in one sample period.
[Figure: simplex link between node 0 and node 1, with packet/byte arrivals and packet/byte departures marked.]
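The two observations can be computed from a packet trace by binning on the sample period. The sketch below is illustrative only: the trace format (arrival time in seconds, size in bytes) and the function name are assumptions, and utilization is normalized by the number of bytes the link can carry in one sample period so that it is dimensionless.

```python
def bin_packets(packets, sample_period, bandwidth_bps, duration):
    """Return (packet counts, link utilization) per sample period.

    packets: iterable of (arrival_time_s, size_bytes) pairs.
    """
    n_bins = int(duration / sample_period)
    counts = [0] * n_bins
    byte_totals = [0] * n_bins
    for arrival_time, size in packets:
        index = int(arrival_time / sample_period)
        if 0 <= index < n_bins:
            counts[index] += 1
            byte_totals[index] += size
    # Bytes the hardware can carry in one sample period (assumed normalization).
    capacity_bytes = bandwidth_bps * sample_period / 8.0
    utilization = [total / capacity_bytes for total in byte_totals]
    return counts, utilization
```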
Dumbbell Topology
Background traffic: FTP and HTTP traffic from the sources to the destinations.
Attack traffic: CBR packets of 200 bytes are sent from the sources (2, 3, 4) to the destination (9), one every 0.005 seconds (320 kbps/source).
Delay: 20 ms
Simulation run time: 30000 sec
Link bandwidth: 10 Mbps; bottleneck link bandwidth: 1.5 Mbps
[Figure: dumbbell topology. Labels: Sources, Destinations, Node under Attack, 1.5 Mbps bottleneck link.]
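The quoted per-source attack rate follows from the packet size and the sending interval. The short check below also computes the combined attack load relative to the bottleneck, under the assumption (not stated on the slide) that all three attack flows cross the 1.5 Mbps link.

```python
PACKET_BYTES = 200
PACKETS_PER_SECOND = 200       # one packet every 0.005 s
ATTACK_SOURCES = 3             # nodes 2, 3 and 4
BOTTLENECK_BPS = 1_500_000

per_source_bps = PACKET_BYTES * 8 * PACKETS_PER_SECOND
total_attack_bps = ATTACK_SOURCES * per_source_bps
# Assumed: every attack flow traverses the bottleneck link.
bottleneck_share = total_attack_bps / BOTTLENECK_BPS

print(per_source_bps, total_attack_bps, round(bottleneck_share, 2))
```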
MI on Link Utilization for Dumbbell Topology
$MI = -0.5 \sum_i \ln(1 - \sigma_i^2)$
Link utilization: number of bytes arriving per sampling period / hardware bandwidth.
[Figure: link utilization vs. time. Panels: Attack Data, Normal Data.]
[Figure: mutual information vs. sampling period. Curves: Attack, Non-Attack.]
The amplitudes of the non-attack and attack signals are similar.
MI is higher for the attack data than for the non-attack data.
MI on Packet Arrivals for Dumbbell Topology
$MI = -0.5 \sum_i \ln(1 - \sigma_i^2)$
Packet arrivals: number of packets arriving per sampling period.
[Figure: packet arrivals vs. time. Panels: Normal Data, Attack Data.]
[Figure: mutual information vs. sampling period. Curves: Non-Attack, Attack.]
MI for the attack data is higher than that for the non-attack data, and hence reveals the abnormality in the traffic.
Parking Lot Topology
[Figure: parking lot topology with nodes 0 to 13. Labels: Normal Traffic, UDP Flooding Attack, Node under Attack, Observed link.]
Background traffic: FTP and HTTP traffic to downstream destinations.
Attack traffic: CBR packets of 200 bytes are sent to node 4, one every 0.005 seconds (320 kbps/source).
No. of UDP connections: 15
Link bandwidth: 10 Mbps
Delay: 20 ms
Link under investigation: 3 – 4
MI on Link Utilization for Parking Lot Topology
$MI = -0.5 \sum_i \ln(1 - \sigma_i^2)$
Link utilization: number of bytes arriving per sampling period / hardware bandwidth.
[Figure: link utilization vs. time. Panels: Attack Data, Normal Data.]
[Figure: mutual information vs. sampling period. Curves: Non-Attack, Attack.]
Both time series appear similar, but there is a clear difference in MI.
MI is higher in the case of attack data.
MI on Packet Arrivals for Parking Lot Topology
$MI = -0.5 \sum_i \ln(1 - \sigma_i^2)$
Packet arrivals: number of packets arriving per sampling period.
[Figure: packet arrivals vs. time. Panels: Normal Data, Attack Data.]
[Figure: mutual information vs. sampling period. Curves: Attack, Non-Attack.]
Both time series look similar, but the statistics reveal the anomaly in the traffic.
What are the observations?
Mutual information increases under attack.
Link utilization tends to work better than packet arrivals.
[Figure: mutual information vs. sampling period for attack and non-attack traffic; four panels covering the Dumbbell Topology and the Parking Lot Topology.]
The effect of intensity of normal traffic
The normal traffic is HTTP. HTTP traffic is characterized by several parameters:
– Number of pages per session (constant), inter-session time (exponential).
– Number of objects per page (constant), inter-page time (exponential).
– Object size (Pareto).
These parameters are varied to change the intensity of the HTTP traffic.
For these experiments, the CBR attack traffic is held constant.
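The HTTP traffic parameters listed above (constant pages per session and objects per page, exponential inter-page time, Pareto object size) can be illustrated with a small generator for a single session. This is not the simulator configuration used in the experiments: the function name and every numeric value in the usage are hypothetical, and inter-session times are left out.

```python
import random

def http_session(rng, pages_per_session, objects_per_page,
                 mean_inter_page_s, pareto_shape, min_object_bytes):
    """Return a list of (request_time_s, object_bytes) for one session."""
    events = []
    t = 0.0
    for _ in range(pages_per_session):
        for _ in range(objects_per_page):
            # Pareto-distributed object size, at least min_object_bytes.
            size = min_object_bytes * rng.paretovariate(pareto_shape)
            events.append((t, size))
        # Exponential think time before the next page.
        t += rng.expovariate(1.0 / mean_inter_page_s)
    return events
```

Raising `pages_per_session` or `objects_per_page`, or shortening `mean_inter_page_s`, increases the offered HTTP load, which is the kind of intensity change described in the experiments.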
Parking Lot Topology
[Figure: parking lot topology with nodes 0 to 13. Labels: Normal Traffic, UDP Flooding Attack, Node under Attack, Observed link.]
Background traffic: HTTP traffic to downstream destinations.
Attack traffic: CBR packets of 200 bytes are sent to node 4, one every 0.005 seconds (320 kbps/source).
No. of UDP connections: 15
Simulation run time for each trial: 30000 sec
Link bandwidth: 10 Mbps
Delay: 20 ms
Link under investigation: 3 – 4
Observed Link 3-4: Varying HTTP and Constant CBR
[Figure: linear mutual information vs. sampling period, trials 1 to 14. Panels: Normal Data, Attack Data.]
The intensity of HTTP traffic decreases from trial 1 to trial 14, so the relative share of CBR traffic increases from trial 1 to trial 14.
Hence the traffic becomes more predictable and the mutual information increases.
The effect of varying normal and attack traffic intensity
The normal traffic is HTTP. HTTP traffic is characterized by several parameters:
– Number of pages per session (constant), inter-session time (exponential).
– Number of objects per page (constant), inter-page time (exponential).
– Object size (Pareto).
These parameters are varied to modify the intensity of the HTTP traffic.
The attack traffic is CBR. The rate and packet size are varied to vary the intensity of CBR traffic.
Observed Link 3-4: Varying HTTP and Varying CBR
[Figure: linear mutual information vs. sampling period, trials 1 to 14. Panels: Normal Data, Attack Data.]
Trial 1: lowest-intensity CBR traffic and highest-intensity HTTP traffic.
Trial 14: highest-intensity CBR traffic and lowest-intensity HTTP traffic.
For trial 1, MI is nearly the same as for normal traffic.
MI is higher than when the CBR traffic was held constant, because the intensity of the CBR traffic increases.
The relationship between the relative intensity of CBR traffic and the mutual information is not trivial.
Can an attack be detected at a different link?
• The normal traffic is HTTP.
• The attack traffic is CBR.
• The traffic parameters are varied to change the intensity of the HTTP and CBR traffic.
Parking Lot Topology
[Figure: parking lot topology with nodes 0 to 13. Labels: Normal Traffic, UDP Flooding Attack, Node under Attack, Observed link.]
Background traffic: HTTP traffic to downstream destinations.
Attack traffic: CBR packets are sent to node 4.
No. of UDP connections: 15
Simulation run time for each trial: 30000 sec
Link bandwidth: 10 Mbps
Delay: 20 ms
Link under investigation: 4 – 5
Observations on one link can detect attacks on links elsewhere in the network.
Observed Link 4-5: Varying HTTP and Constant CBR
[Figure: linear mutual information vs. sampling period, trials 1 to 14. Panels: Attack Data, Normal Data.]
Observed Link 4-5: Varying HTTP and Varying CBR
[Figure: linear mutual information vs. sampling period, trials 1 to 14. Panels: Attack Data, Normal Data.]
Linear vs. Nonlinear CCA
• LCCA: Finds CCCs between a linear function of the past and a linear function of the future.
• Semi-NLCCA: Finds CCCs between a nonlinear function of the past and a linear function of the future.
• Full-NLCCA: Finds CCCs between a nonlinear function of the past and a nonlinear function of the future.
• Experiment set-up:
- Normal traffic is HTTP.
- Attack traffic is CBR.
- The intensity of both traffic types is varied.
Parking Lot Topology
[Figure: parking lot topology with nodes 0 to 13. Labels: Normal Traffic, UDP Flooding Attack, Node under Attack, Observed link.]
Background traffic: HTTP traffic to downstream destinations.
Attack traffic: CBR packets are sent to node 4.
No. of UDP connections: 15
Simulation run time for each trial: 30000 sec
Link bandwidth: 10 Mbps
Delay: 20 ms
Link under investigation: 3 – 4
Mutual Information for CBR Attack
[Figure: mutual information vs. sampling period, trials 1 to 14, attack data. Panels: Linear, Semi-Nonlinear, Full-Nonlinear.]
[Figure: mutual information vs. sampling period, trials 1 to 14, normal data. Panels: Linear, Semi-Nonlinear, Full-Nonlinear.]
MI for LCCA < MI for Semi-NLCCA < MI for Full-NLCCA
Random Topology - 50 nodes (GT-ITM Topology Generator)
[Figure: random 50-node topology. Labels: Link Monitored, Attack Destination, HTTP Sources.]
Background traffic: HTTP traffic from random sources to random destinations.
Attack traffic: CBR packets are sent from random attack sources to node 14.
Simulation run time for each trial: 30000 sec.
Link bandwidth: 1.5 Mbps; delay: 20 to 120 ms.
Link monitored: 14 – 30 (10 Mbps).
No. of clients = 20
No. of servers = 20
No. of attack sources = min. 10
Attack destination = 14
Mutual Information for CBR Attack
[Figure: mutual information vs. sampling period per trial, attack data. Panels: Linear, Semi-Nonlinear, Full-Nonlinear.]
[Figure: mutual information vs. sampling period per trial, normal data. Panels: Linear, Semi-Nonlinear, Full-Nonlinear.]
MI for LCCA < MI for Semi-NLCCA < MI for Full-NLCCA
Infrastructure Attacks
• CERT has noted that DoS attacks on links and routers are increasing.
• A coordinated attack can use many end hosts that all send packets that eventually traverse the same link, thereby consuming all of the link bandwidth.
Transit-Stub 100 Node Topology
[Figure: transit-stub topology with 100 nodes, generated using the GT-ITM topology generator. Labels: HTTP Server, HTTP Clients, Attack Sources, Attack Destinations, Link Under Attack.]
Time Series at Monitored Links
[Figure: time series at the monitored links. Labels: HTTP Server, HTTP Client, Link Under Attack.]
Linear Mutual Information
[Figure: mutual information vs. sampling period, trials 1 to 15, normal and attack data. Panels: HTTP Server, Link under Attack.]
The attack can be detected!
Full Nonlinear Mutual Information
[Figure: full nonlinear mutual information vs. sampling period, trials 1 to 15, normal and attack data. Panels: HTTP Server, Link under Attack.]
MI for LCCA < MI for Full-NLCCA
State Space Modeling
Suppose we want to construct a state space model of the form
$x(k+1) = A x(k) + w(k)$
$y(k) = C x(k) + v(k)$
where A and C are system matrices, and w(k) and v(k) are zero-mean uncorrelated sequences.
x(k) = the part of the past necessary to predict the future (the state).
x(k), A, C and the correlation matrices are derived from CCA.
Use a Kalman filter to find the state recursively.
The one-step ahead prediction can be obtained by
$\hat{y}(k+1) = C \hat{x}(k+1)$
The n-step ahead prediction can be obtained by
$\hat{y}(k+n) = C A^{n-1} \hat{x}(k+1)$
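The n-step ahead prediction, ŷ(k+n) = C A^(n-1) x̂(k+1), can be evaluated directly once A, C and the predicted state are available. A minimal NumPy sketch, with a hypothetical function name and made-up matrices in the usage:

```python
import numpy as np

def predict_n_steps(A, C, x_hat_next, n):
    """Return y_hat(k+n) = C A^(n-1) x_hat(k+1) for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return C @ np.linalg.matrix_power(A, n - 1) @ x_hat_next

# Illustrative values only.
A = np.array([[0.5, 0.0], [0.0, 0.1]])
C = np.array([[1.0, 1.0]])
x_hat_next = np.array([2.0, 10.0])
one_step = predict_n_steps(A, C, x_hat_next, 1)
three_step = predict_n_steps(A, C, x_hat_next, 3)
```

With a stable A, the prediction decays toward the mean as n grows, which is why the prediction error approaches the signal variance for long horizons.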
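Nonlinear AR Model
Let the nonlinear function $\phi(\cdot)$ of $Y_-(k)$ be a simple polynomial:
$\phi(Y_-(k)) = \sum_{i=0}^{L} \big[ a_i Y(k-i) + b_i Y(k-i) Y(k-i) + c_i Y(k-i) Y(k-i-1) \big]$
Construct the AR model.
The one-step ahead prediction can be obtained by
$y(k+1) = \sum_{i=0}^{L-1} \phi_i\big(y(k-i)\big) + w(k)$
The n-step ahead prediction can be obtained by
$y(k+n) = \sum_{i=0}^{L-1} \phi_i^{n}\big(y(k-i)\big) + w(k)$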
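One way to obtain such a model is ordinary least squares on polynomial terms of the past. The sketch below is an assumption-laden illustration, not the authors' fitting procedure: it uses the terms y(k-i), y(k-i)^2 and y(k-i)y(k-i-1) for i = 0..lag-1, fits a separate coefficient vector for each horizon n, and all function names are hypothetical.

```python
import numpy as np

def design_matrix(y, ks, lag):
    """Polynomial regressors of the past for each time index in ks."""
    columns = []
    for i in range(lag):
        current = y[ks - i]
        previous = y[ks - i - 1]
        columns += [current, current * current, current * previous]
    return np.column_stack(columns)

def fit_nonlinear_ar(y, lag, n=1):
    """Least-squares coefficients for predicting y(k+n) from the past."""
    y = np.asarray(y, dtype=float)
    ks = np.arange(lag, len(y) - n)  # indices with enough past and future
    X = design_matrix(y, ks, lag)
    coef, *_ = np.linalg.lstsq(X, y[ks + n], rcond=None)
    return coef

def predict(y, k, lag, coef):
    """Predict y(k+n) with coefficients fitted for horizon n."""
    y = np.asarray(y, dtype=float)
    X = design_matrix(y, np.array([k]), lag)
    return float((X @ coef)[0])
```

A model fitted on normal traffic can then be applied to new data, and its prediction error compared with the error seen on normal traffic.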
Prediction Metric
Mean Square Error is:
$MSE = \frac{1}{L-1} \sum_{i=1}^{L} (y_i - y_{est})^2$
Normalized Mean Square Error:
$NMSE = MSE / Var(Y)$, where $Var(Y) = \frac{1}{L-1} \sum_{i=1}^{L} (y_i - y_{mean})^2$
Variance is the mean square error if the mean is used as the prediction, i.e., $y_{est} = y_{mean} \Rightarrow MSE = Var(Y) \Rightarrow NMSE = 1$.
Hence, one expects $NMSE \le 1$.
Change in NMSE can be used to detect an attack.
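The prediction metric can be written in a few lines. This sketch follows the definitions NMSE = MSE / Var(Y) with the 1/(L-1) normalization; the function name is an assumption.

```python
def nmse(actual, predicted):
    """Normalized mean square error: MSE divided by the sample variance."""
    n = len(actual)
    mean = sum(actual) / n
    variance = sum((y - mean) ** 2 for y in actual) / (n - 1)
    mse = sum((y - p) ** 2 for y, p in zip(actual, predicted)) / (n - 1)
    return mse / variance
```

Predicting the mean everywhere gives an NMSE of one, and a perfect prediction gives zero, so a model trained on normal traffic that suddenly scores near or above one is signaling a change in the traffic.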
Parking Lot Topology
[Figure: parking lot topology with nodes 0 to 13. Labels: Normal Traffic, UDP Flooding Attack, Node under Attack, Observed link.]
Background traffic: HTTP traffic to downstream destinations.
Attack traffic: CBR packets are sent to node 4.
No. of UDP connections: 15
Link bandwidth: 10 Mbps
Delay: 20 ms
Link under investigation: 3 – 4
Prediction with Linear State Space and Nonlinear AR Models
[Figure: normalized mean square error vs. step-ahead prediction. Panels: Linear State Space Model, Nonlinear AR Model. Curves: Normal Model on Normal Data, Normal Model on Attack Data.]
The nonlinear AR model works better for prediction and detection than the linear state space model.
Random Topology - 50 nodes (GT-ITM Topology Generator)
[Figure: random 50-node topology. Labels: Link Monitored, Attack Destination, HTTP Sources.]
Background traffic: HTTP traffic from random sources to random destinations.
Attack traffic: CBR packets are sent from random attack sources to node 14.
Simulation run time for each trial: 30000 sec.
Link bandwidth: 1.5 Mbps; delay: 20 to 120 ms.
Link monitored: 14 – 30 (10 Mbps).
No. of clients = 20
No. of servers = 20
No. of attack sources = min. 10
Attack destination = 14
Prediction with Linear State Space and Nonlinear AR Models
[Figure: normalized mean square error vs. step-ahead prediction. Panels: Nonlinear AR Model, Linear State Space Model. Curves: Normal Model on Normal Data, Normal Model on Attack Data.]
The nonlinear AR model works better for prediction, and equally well for detection, compared with the linear state space model.
Transit-Stub 100 Node Topology
[Figure: transit-stub topology with 100 nodes, generated using the GT-ITM topology generator. Labels: HTTP Server, HTTP Clients, Attack Sources, Attack Destinations, Link Under Attack.]
Prediction with Linear State Space and Nonlinear AR Models
[Figure: normalized mean square error vs. step-ahead prediction. Panels: Linear State Space Model, Nonlinear AR Model. Curves: Normal Model on Normal Data, Normal Model on Attack Data.]
The nonlinear AR model works better for prediction and detection than the linear state space model.
Conclusion
• The time series of TCP traffic has a distinct signature compared with the time series of non-TCP attack traffic.
• Mutual information is a useful tool for detecting flooding attacks.
• The increase in mutual information under flooding attacks is independent of topology.
• Mean square error is also a useful tool for detecting flooding attacks.
Future Work
• Understand the relationship between CBR rate and mutual information. (A slightly modified version of mutual information may be required.)
• Utilize packet type information for detecting flooding attacks.
• Investigate different types of attacks such as SYN and PING attacks using both detection tools.
Linear versus Nonlinear, Normal versus Attack, for Parking Lot Topology
[Figure: mutual information at different sampling periods and at different lags. Curves: NLCCA Attack, NLCCA Non-Attack, LCCA Attack, LCCA Non-Attack.]
Linear versus Nonlinear, Normal versus Attack, for Dumbbell Topology
[Figure: mutual information at different lags and at different sampling periods. Curves: NLCCA Attack, NLCCA Non-Attack, LCCA Attack, LCCA Non-Attack.]
State Space Modeling
Suppose we want to construct a state space model of the form
$x(k+1) = A x(k) + w(k)$, $y(k) = C x(k) + v(k)$
where A and C are system matrices and w(k) and v(k) are zero-mean uncorrelated sequences.
$\hat{Y}_+(k) = C_{+-} C_{--}^{-1} Y_-(k)$
$= C_{++}^{0.5} C_{++}^{-0.5} C_{+-} C_{--}^{-0.5} C_{--}^{-0.5} Y_-(k)$
$= C_{++}^{0.5} U \Sigma V^T C_{--}^{-0.5} Y_-(k)$
Here $U \Sigma V^T$ is the singular value decomposition of the future-past matrix $C_{++}^{-0.5} C_{+-} C_{--}^{-0.5}$, with $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_L)$.
Keep only the significant coefficients, $1 \ge \sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_d > 0$:
$\hat{Y}_+(k) \approx C_{++}^{0.5} U \,\mathrm{diag}(\sigma_1, \ldots, \sigma_d, 0, \ldots, 0)\, V^T C_{--}^{-0.5} Y_-(k)$
The state is the factor
$X(k) = \mathrm{diag}(\sigma_1, \ldots, \sigma_d, 0, \ldots, 0)\, V^T C_{--}^{-0.5} Y_-(k) = \begin{pmatrix} x(k) \\ 0 \end{pmatrix}$
where $x(k)$ holds the first $d$ entries of $X(k)$.
$\hat{Y}_+(k) = C_{++}^{0.5} U \begin{pmatrix} x(k) \\ 0 \end{pmatrix} = C_{++}^{0.5} U_1 x(k)$, where $U_1$ is the first $d$ columns of $U$
$\hat{y}(k) = C x(k)$, where $C$ is taken from the corresponding rows of $C_{++}^{0.5} U_1$
Another way to compute the state is as follows:
By weak stationarity, $C_{--} = E(Y_-(k) Y_-(k)^T)$ and $C_{++} = E(Y_+(k) Y_+(k)^T)$ are equal.
Decompose the $C_{--}$ matrix into Cholesky factors: $C_{--} = L L^T$
So, $C_{--}^{-0.5} = L^{-1}$ and
$X(k) = \mathrm{diag}(\sigma_1, \ldots, \sigma_d, 0, \ldots, 0)\, V^T L^{-1} Y_-(k)$
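State Space Modeling
$w(k) = x_{k+1} - E(x_{k+1} \mid x_k)$
$v(k) = y_k - E(y_k \mid x_k)$
$R = \mathrm{cov}(v(k) v(k)^T) = C_{yy} - C_{yy} V (V^T C_{yy} V)^{-1} V^T C_{yy}$, where $C_{yy} = E(Y(k) Y(k)^T)$
$Q = \mathrm{cov}(w(k) w(k)^T) = V^T C_{k+1,k+1} V - V^T C_{k+1,k} V (V^T C_{k,k} V)^{-1} V^T C_{k,k+1} V$, where $C_{k,k} = \mathrm{cov}(X(k))$
$N = E(w(k) v(k)^T) = M - A\, E(x_k x_k^T)\, C^T$, where $M = E(x_{k+1} y_k^T)$
$\hat{A} = E(x_{k+1} x_k^T)\, E(x_k x_k^T)^{-1} = V^T L^{-1} \hat{R} L^{-T} V$, where $\hat{R} = E(Y_-(k+1) Y_-(k)^T)$
$C = E(y_k x_k^T)\, E(x_k x_k^T)^{-1} = C_{yy} V (V^T C_{yy} V)^{-1}$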
State Estimation and Prediction using Kalman Filter
The state predictor is derived using
$\hat{x}(k+1) = A \hat{x}(k) + L(k) \big( y(k) - C \hat{x}(k) \big)$, $\hat{x}(0) = 0$
$L(k) = \big( A P(k) C' + N \big) \big( C P(k) C' + R \big)^{-1}$
where L is the Kalman filter gain matrix. It is determined such that the state error covariance, P(k+1), is minimized:
$P(k+1) = E\big( e(k+1) e(k+1)' \big)$, $e(k+1) = x(k+1) - \hat{x}(k+1)$
P(k) is computed recursively as follows:
$P(k+1) = A P(k) A' + Q - \big( A P(k) C' + N \big) \big( C P(k) C' + R \big)^{-1} \big( C P(k) A' + N' \big)$ and $P(0) = P_0$
where $P_0$ is the error covariance of x at time 0 and can be assumed to be zero.
Kalman Filter
$\hat{y}(k)$ can be obtained by the following recursive equations:
$\hat{x}(k+1) = A \hat{x}(k) + L \big( y(k) - C \hat{x}(k) \big)$
The one-step ahead prediction can be obtained by
$\hat{y}(k+1) = C \hat{x}(k+1)$
The n-step ahead prediction can be obtained by
$\hat{y}(k+n) = C A^{n-1} \hat{x}(k+1)$
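The predictor recursion can be sketched in NumPy as below. It assumes the standard one-step predictor form with gain L(k) = (A P C' + N)(C P C' + R)^(-1) and the matching Riccati update; the function name is hypothetical, and A, C, Q, R, N are taken as given rather than derived from CCA here.

```python
import numpy as np

def kalman_predictor(y, A, C, Q, R, N, P0=None):
    """Return one-step ahead output predictions y_hat(k+1) for each y(k)."""
    n_states = A.shape[0]
    x_hat = np.zeros(n_states)  # x_hat(0) = 0
    P = np.zeros((n_states, n_states)) if P0 is None else P0
    predictions = []
    for y_k in y:
        S = C @ P @ C.T + R       # innovation covariance
        G = A @ P @ C.T + N
        L = G @ np.linalg.inv(S)  # Kalman gain L(k)
        x_hat = A @ x_hat + L @ (np.atleast_1d(y_k) - C @ x_hat)
        P = A @ P @ A.T + Q - L @ G.T  # Riccati update for P(k+1)
        predictions.append(C @ x_hat)  # y_hat(k+1) = C x_hat(k+1)
    return np.array(predictions)
```

Comparing these predictions with the observations that follow gives the mean square prediction error used for detection.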
An increase in MI indicates an increase in the CCCs. At a higher sampling period, the signal is averaged more; hence, the signal is more predictable.
Hence, the prediction error decreases and the inverse of the mean square error increases.
MI ~ 1/Mean Square Error
[Figure: inverse mean square error vs. sampling period, and mutual information vs. sampling period.]