Network Anomaly Detection: Based on Statistical Approach and Time Series Analysis. Huang Kai, Qi Zhengwei, Liu Bo. Shanghai Jiao Tong University.
Apr 22, 2015
Network Anomaly Detection: Based on Statistical Approach and Time Series Analysis
  Huang Kai, Qi Zhengwei, Liu Bo
  Shanghai Jiao Tong University
5/18/2009 FINA'09
Outline
  Problem description
  Data flow statistical characteristic
  Statistical Analysis
  Time Series Analysis
  Conclusion
Problem description
  Why a statistical approach?
    Signature-based network anomaly detection (DPI): privacy problem.
    Machine learning based approach: hard to run in real time.
Problem description
  Why our approach?
    Users define network anomalies differently.
    Adaptability to a developing network.
Data flow statistical characteristic
  Complicated statistical characteristics!
    Poisson process: Telnet connections, FTP control connections
    Exponential process: Telnet packets
    Self-similar process: WAN arrival process
    Heavy-tail process: FTP data connections, FTP data transfer
Statistical Analysis
  Gaussian or not?
    [Figure: normal Q-Q plot. x-axis: Normal Distribution; y-axis: residuals]
    No!
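The plot itself did not survive, but the check it illustrates can be sketched. The following is a minimal sketch, assuming the figure was a normal Q-Q plot of residuals: it pairs sorted sample values with standard normal quantiles and measures how straight the resulting line is. The function names and the synthetic "bursty" data are hypothetical, not taken from the slides.

```python
import random
from statistics import NormalDist, mean

def qq_points(sample):
    """Return (theoretical, observed) quantiles for a normal Q-Q plot."""
    observed = sorted(sample)
    n = len(observed)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n avoid the infinite quantiles at 0 and 1.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, observed

def qq_correlation(sample):
    """Pearson correlation of the Q-Q points; close to 1 for Gaussian data."""
    theo, obs = qq_points(sample)
    mt, mo = mean(theo), mean(obs)
    cov = sum((t - mt) * (o - mo) for t, o in zip(theo, obs))
    var_t = sum((t - mt) ** 2 for t in theo)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / (var_t * var_o) ** 0.5

rng = random.Random(0)
gaussian = [rng.gauss(0.0, 5.0) for _ in range(2000)]
# Hypothetical stand-in for traffic residuals: small noise plus rare heavy-tailed bursts.
bursty = [
    rng.gauss(0.0, 2.0) + (rng.paretovariate(1.5) if rng.random() < 0.1 else 0.0)
    for _ in range(2000)
]
r_gauss = qq_correlation(gaussian)
r_bursty = qq_correlation(bursty)
```

A correlation visibly below that of a Gaussian sample of the same size suggests the points bend away from the straight line, which is the "No" on the slide.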
Statistical Analysis
  Gaussian mixture model
  EM algorithm
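The mixture formula on this slide did not survive extraction. For reference, a univariate Gaussian mixture is usually written as below; the symbols $K$, $\pi_k$, $\mu_k$ and $\sigma_k^2$ are notation assumed here, not taken from the slides.

```latex
\[
p(x) \;=\; \sum_{k=1}^{K} \pi_k \,\mathcal{N}\!\left(x \mid \mu_k, \sigma_k^2\right),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1,
\]
\[
\mathcal{N}\!\left(x \mid \mu, \sigma^2\right)
  \;=\; \frac{1}{\sqrt{2\pi\sigma^2}}
        \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).
\]
```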
Statistical Analysis
  EM algorithm
    E-step
    M-step
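The E-step and M-step formulas are missing from the extracted slide. The following is a minimal sketch of textbook EM for a one-dimensional Gaussian mixture, not necessarily the authors' implementation. The name `em_gmm_1d`, the quantile initialisation, the variance floor and the synthetic data are all assumptions made for illustration.

```python
import math
import random

def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture; returns (weights, means, variances)."""
    n = len(data)
    ordered = sorted(data)
    grand_mean = sum(data) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in data) / n, 1e-6)
    weights = [1.0 / k] * k
    # Assumption: initialise the means at evenly spaced sample quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances from the responsibilities.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps a component from collapsing
    return weights, means, variances

# Synthetic data: two well-separated clusters (hypothetical, for demonstration only).
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(300)] + [rng.gauss(10.0, 1.0) for _ in range(300)]
weights, means, variances = em_gmm_1d(data, k=2)
```

Each iteration touches every sample once per component, so the cost per iteration grows linearly with the number of Gaussians.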
Statistical Analysis
Statistical Analysis
  Number of Gaussians in the model?
    [Figure panels: 25 Gaussians, 10 Gaussians, 5 Gaussians]
Statistical Analysis
  Time cost depends on the number of Gaussians in the model
  More Gaussians are not necessarily better
Time Series Analysis
  Upper bound / lower bound approach (for comparison)
  Cross indicator approach with K line and D line
  Moving Average Convergence and Divergence (MACD)
Time Series Analysis
  Upper bound / lower bound approach (for comparison)
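The slide does not say how the bounds are computed. One common baseline, sketched here as an assumption, flags a sample when it leaves a band of the mean plus or minus `k` standard deviations of the preceding window. The name `bound_alarms`, the window length and `k` are hypothetical.

```python
from statistics import mean, pstdev

def bound_alarms(series, window=20, k=3.0):
    """Indices whose value lies outside mean +/- k*std of the preceding window."""
    alarms = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        upper, lower = mu + k * sigma, mu - k * sigma
        if series[i] > upper or series[i] < lower:
            alarms.append(i)
    return alarms

# Synthetic traffic counts with one injected spike (hypothetical data).
traffic = [100 + (i % 5) for i in range(60)]
traffic[40] = 400
alarms = bound_alarms(traffic)
```

In this sketch the spike enters the history of later windows and widens the band, so the samples right after an anomaly are judged against looser bounds.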
Time Series Analysis
  Cross indicator approach with K line and D line
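The slide gives no formula for the K and D lines. The following is a minimal sketch, assuming they follow the stochastic (KD) oscillator from technical analysis: a raw value locates the latest sample within the recent range, K smooths it, D smooths K, and a crossing of the two lines is treated as a signal. The window of 9, the 1/3 smoothing factor and the starting value of 50 are conventional defaults assumed here, not values from the slides.

```python
def kd_lines(series, window=9):
    """Return (K, D) lists; entry 0 corresponds to series index window - 1."""
    k_line, d_line = [], []
    k_prev, d_prev = 50.0, 50.0  # conventional neutral starting point
    for i in range(window - 1, len(series)):
        recent = series[i - window + 1:i + 1]
        lo, hi = min(recent), max(recent)
        # Raw stochastic value: position of the latest sample within the recent range.
        rsv = 50.0 if hi == lo else 100.0 * (series[i] - lo) / (hi - lo)
        k_prev = (2.0 * k_prev + rsv) / 3.0     # K: smoothed raw value
        d_prev = (2.0 * d_prev + k_prev) / 3.0  # D: smoothed K
        k_line.append(k_prev)
        d_line.append(d_prev)
    return k_line, d_line

def crossings(k_line, d_line):
    """Positions where K crosses D, with the direction of the crossing."""
    found = []
    for i in range(1, len(k_line)):
        before = k_line[i - 1] - d_line[i - 1]
        after = k_line[i] - d_line[i]
        if before <= 0 < after:
            found.append((i, "up"))
        elif before >= 0 > after:
            found.append((i, "down"))
    return found

# Flat synthetic traffic followed by a sudden jump (hypothetical data).
traffic = [100] * 20 + [200] * 5
k_line, d_line = kd_lines(traffic)
signals = crossings(k_line, d_line)
```

Because the raw value is rescaled to the recent minimum and maximum, even a small move inside a narrow range can swing K across D.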
Time Series Analysis
  Moving Average Convergence and Divergence (MACD)
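The slide's MACD details are not in the extracted text. The following is a minimal sketch of the usual MACD construction from technical analysis: the difference between a fast and a slow exponential moving average, compared against its own smoothed signal line. The spans 12, 26 and 9 are the conventional defaults, and the alarm threshold is a hypothetical value; neither is taken from the slides.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1.0 - alpha) * out[-1])
    return out

def macd(series, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram), each as long as the series."""
    fast_ema = ema(series, fast)
    slow_ema = ema(series, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram

# Flat synthetic traffic followed by a sustained surge (hypothetical data).
traffic = [100.0] * 40 + [300.0] * 10
macd_line, signal_line, histogram = macd(traffic)

threshold = 5.0  # hypothetical alarm threshold on the histogram
alarms = [i for i, h in enumerate(histogram) if abs(h) > threshold]
```

The extra moving averages cost more per sample than the K and D lines, which fits the tradeoff stated in the conclusion.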
Time Series Analysis
  Experiment result comparison
Conclusion
  The Gaussian mixture model matches the distribution of network traffic.
  A Gaussian mixture model with 10 Gaussians is a good tradeoff between performance and time cost.
  The K line and D line approach has a low time cost but is too sensitive to fluctuation.
  The Moving Average Convergence and Divergence approach has the best performance but costs more than the K line and D line approach.
Future Work
  Analyze the relation between the results and different kinds of attacks and anomalies
    Distinguish the anomaly type
  An auto-adaptable approach that needs no configuration of the model parameters
  A model applicable to wireless networks
    To handle hybrid, unstable and wireless networks with changing topology
Thanks for Your Attention