Journal of Babylon University/Pure and Applied Sciences/ No.(4)/ Vol.(23): 2015
4141
Using DBSCAN Clustering Algorithm in Detecting DDoS Attack
Safaa O. Al-Mamory
Assistant Professor, College of Information Technology, University of Babylon [email protected]
Zahraa Mohammed Ali
Department of Computer Science, University of Kufa [email protected]
Abstract
Distributed Denial of Service (DDoS) attacks have become one of the major threats to the Internet. Such an attack forces a victim to deny normal service: a large number of agents generate huge volumes of useless packets, which can easily exhaust the victim's computing and communication resources. In this paper we develop a method to detect DDoS attacks accurately and proactively. The entropy concept is used to measure abnormal changes in traffic according to the phases of the attack, and the traffic is then clustered using the DBSCAN algorithm. Patterns for DDoS traffic are created from the centroid points extracted from each cluster and are used in the testing phase for distance-based classification. The system is characterized by processing and analyzing high-speed network traffic (based on the entropy approach), discovering and accurately identifying new types of DDoS attack to reduce false alarms (FA), detecting the attack in real time, and making use of the patterns from the training stage to increase the detection ratio.
Keywords: DDoS, Proactive detection, Clustering, DBSCAN
1.Introduction
Distributed denial of service (DDoS) attacks were first seen in early 1998 (CERT, 1998). In February 2000, a number of the world's largest e-commerce sites, including Yahoo.com, Amazon.com, Excite, E*Trade, eBay, CNN.com, Buy.com, and ZDNet, were brought offline for days by this kind of attack, even though they were designed to offer high availability. The outages caused huge economic losses to both the victim sites and their users (Wan, 2001).
The overarching aim of this paper is to develop a method to detect DDoS attacks accurately and proactively. The entropy concept is used to measure abnormal changes in traffic according to the phases of the attack; this traffic is then clustered using the DBSCAN algorithm, and the pattern for DDoS traffic is created from the resulting set of clusters. The system is characterized by processing and analyzing high-speed network traffic (based on the entropy approach), discovering and accurately identifying new types of DDoS attack to reduce false alarms (FA), detecting the intrusion in real time, and making use of the patterns from the training stage to increase the detection ratio.
Section 2 describes related work. Section 3 explains the proposed system. The experimental results are discussed in Section 4. Conclusions are given in Section 5.
2.Related Works
A great deal of research has been done on DDoS attacks. To detect this attack proactively, Feinstein et al. (2003) presented statistical approaches that identify DDoS attacks by computing entropy and frequency-sorted distributions of selected packet attributes. DDoS attacks show anomalies in the characteristics of the selected packet attributes. The detection accuracy and performance were analyzed using live traffic traces from a variety of network environments, ranging from points in the core of the Internet to those inside an edge network. The results indicate that these methods can be effective against current attacks and suggest directions for improving the detection of more stealthy attacks.
Jin et al. (2004) proposed a covariance analysis model for detecting SYN flooding attacks. The correlations among the features may provide additional essential information: in terms of correlation, normal patterns differ from abnormal patterns. In this sense, detecting changes in the correlation among different features can reveal the occurrence of anomalies. Their paper presents a two-variable covariance model as a possible approach to detecting DDoS attacks.
Gavrili et al. (2005) proposed a Radial-Basis-Function neural network (RBF-NN) to distinguish DDoS attacks from normal traffic. The RBF-NN detector is a two-layer neural network. It uses nine packet parameters, whose frequencies are estimated. Based on these frequencies, the RBF-NN classifies traffic into the attack class or the normal class.
Lee et al. (2008) proposed a method for proactive detection of DDoS attacks that exploits the attack architecture, which consists of the selection of handlers and agents, communication and compromise, and the attack itself. The features are selected based on the procedures of a DDoS attack, and cluster analysis is then performed for proactive detection. The method was evaluated with the 2000 DARPA Intrusion Detection Scenario Specific Data Set. The results show that each phase of the attack scenario is partitioned well and that the method can detect precursors of a DDoS attack as well as the attack itself.
Rahmani et al. (2009) presented entropy-based anomaly detection using joint entropy analysis of multiple traffic distributions. They observed that the time series of the number of IP flows and of the aggregate traffic size are strongly statistically dependent. The occurrence of an attack affects this dependence and causes a rupture in the time series of joint entropy values.
Xia et al. (2010) presented a method that can identify the occurrence of a DDoS flood attack and determine its intensity using fuzzy logic. The process consists of two stages: (i) statistical analysis of the network traffic time series using the discrete wavelet transform (DWT) and the Schwarz information criterion (SIC) to find the change point of the Hurst parameter resulting from a DDoS flood attack, and (ii) adaptive determination of the intensity of the DDoS flood attack, using fuzzy logic to analyze the Hurst parameter and its rate of change.
Zhong et al. (2010) presented a DDoS attack detection model based on data mining algorithms. The FCM clustering algorithm and the Apriori association algorithm are used to extract a network traffic model and a network packet protocol status model. The Apriori association algorithm is used to mine packet protocol status: protocol statuses that appear frequently in the network can be combined into one association record. Data collected continuously over a period is used to calculate the packet protocol status threshold through the FCM clustering algorithm.
Liu et al. (2013) proposed an anomaly detection method for DDoS attacks based on the Gini coefficient. First, the Gini coefficient is introduced to measure the inequalities of packet attribute (IP addresses and ports) distributions during attacks. Then, an improved TCM-KNN (Transductive Confidence Machines for K-Nearest Neighbors) algorithm is applied to identify attacks by classifying the Gini coefficient samples extracted from real-time network traffic. Experiments were carried out on the DDoS attack dataset (LLDoS 2.0.2) from MIT Lincoln Laboratory.
Chen et al. (2013) proposed a detection model based on conditional random fields (CRF). The CRF-based model incorporates signature-based and anomaly-based detection methods into a hybrid system. The selected features include source IP entropy, destination IP entropy, source port entropy, destination port entropy, protocol number, and others. The CRF-based model combines these IP flow entropies and other fingerprints into a normalized entropy that serves as the feature vector describing the state of the monitored traffic. The detection model is trained using the L-BFGS algorithm.
3.Problem Formulation and Methodology
To achieve early detection of DDoS attacks, we employ the entropy concept and cluster analysis. The idea of this research is to separate each phase of a DDoS attack: the DBSCAN clustering algorithm is used in the training phase, and the corresponding cluster centroids (the average of each cluster) are then used as patterns for efficient distance-based detection in the testing phase. Figure 1 illustrates the flow chart of the proposed system.
Figure (1) : The general architecture of proposed system.
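The testing-phase idea of matching a traffic sample against the stored centroids can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance (the text only says "distance-based"), and the function name `nearest_centroid`, the phase labels, and the two-dimensional centroid values are all hypothetical.

```python
import math

def nearest_centroid(sample, centroids):
    """Return (label, distance) of the centroid closest to `sample`."""
    best_label, best_dist = None, math.inf
    for label, centroid in centroids.items():
        dist = math.dist(sample, centroid)  # Euclidean distance (assumed metric)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist

# Hypothetical two-feature centroids, for illustration only:
# (source IP entropy, destination IP entropy), both scaled to [0, 1].
centroids = {
    "normal": (0.9, 0.8),
    "ipsweep": (0.1, 0.9),
    "attack": (0.95, 0.05),
}

label, dist = nearest_centroid((0.9, 0.1), centroids)
print(label, dist)
```

In a real deployment the feature vectors would have nine components, and a distance threshold would probably be needed so that samples far from every centroid are not forced into a class.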
3.1 Extraction of the detection features
According to the DDoS architecture, a DDoS attack is performed in the following steps (Douligeris C. et al., 2004):
1. Selection of handlers and agents
2. Compromise
3. Communication
4. Attack
From this procedure we can identify the traffic parameters that change abnormally in each step. Lee et al. (2008) presented nine features based on an analysis of the characteristics of DDoS attacks, and we use these features in our method.
"In the first step, real attacker sends ICMP Echo Request packets to find handlers and agents that help attack, which is called IPsweep" (Lee et al., 2008). A lot of ICMP traffic is generated, so the occurrence rate of ICMP packets may be abnormally high compared to normal traffic. In this period, the destination IP addresses in the network flow are also distributed randomly.
[Figure 1 flow chart. Boxes: Start; data set; feature extraction for each sample of consecutive packets; generate a database from the extracted features; Training: clustering by DBSCAN, extract the set of centroid points (mean of each cluster) as pattern; Testing: distance-based classification; system validation; End.]
In the second and third steps, specific traffic types such as ICMP, UDP and TCP SYN packets can be used for message exchange. Hence, the occurrence rates of these packet types can indicate preparation for launching a DDoS attack (Zi et al., 2010).
Under a DDoS attack, the agents randomly generate the source IP addresses of attack packets to hide their real addresses. They also randomize the destination and source port numbers depending on the attack type, so this randomization can provide useful information for detecting a DDoS attack. To measure the degree of divergence, Lee et al. (2008) suggest using the concept of entropy.
Let an information source have n independent symbols, each with probability of choice $P_i$. Then the entropy H is defined as follows (Shannon, 1948):
$$H = -\sum_{i=1}^{n} P_i \log_2 P_i \qquad (1)$$
Entropy is computed on a sample of consecutive packets. Comparing the entropy of one sample with that of another provides a mechanism for detecting changes in randomness (Lee et al., 2008).
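As a small worked illustration of the Shannon entropy $H = -\sum_i P_i \log_2 P_i$, consider a hypothetical sample of four consecutive packets and look only at the source IP address. The sample size and the base-2 logarithm are assumptions made for this example.

```latex
% Case 1: four distinct source addresses, so P_i = 1/4 for i = 1..4
H = -\sum_{i=1}^{4} \tfrac{1}{4}\log_2 \tfrac{1}{4} = 2 \text{ bits}

% Case 2: all four packets share one source address, so P_1 = 1
H = -1 \cdot \log_2 1 = 0 \text{ bits}

% Case 3: three packets from one address and one from another
H = -\left(\tfrac{3}{4}\log_2 \tfrac{3}{4} + \tfrac{1}{4}\log_2 \tfrac{1}{4}\right) \approx 0.811 \text{ bits}
```

The more evenly the values are spread, the higher the entropy; a single dominant value drives it toward zero.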
In the IPsweep phase, the entropy of the source IP address becomes small and that of the destination IP address increases. In the attack phase, attack packets have diverse source IP addresses and a single target destination IP address, so the entropy of the source IP address increases and that of the destination IP address converges to a very small value. Similarly, the entropy values of the source and destination port numbers can be used for DDoS detection, since some types of DDoS attack use random port numbers. In addition, because a DDoS attack may use one specific type of packet, the entropy of the packet type may be useful: if it is very small, it is possible that some kind of DDoS attack is being launched.
In our experiments, we use the same nine features presented by Lee et al. (2008). The features are:
Entropy of source IP address and port number.
Entropy of destination IP address and port number.
Entropy of packet type.
Number of packets.
Occurrence rate of packet type (ICMP, UDP, TCP SYN).
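The nine features listed above could be computed for one sample of consecutive packets roughly as follows. This is a sketch under stated assumptions: the `Packet` record, its field names, the type strings `"ICMP"`, `"UDP"` and `"TCP_SYN"`, the base-2 logarithm, and the ordering of the feature vector are all hypothetical choices, not taken from the paper.

```python
import math
from collections import Counter, namedtuple

# Hypothetical packet record; the field names are assumptions for this sketch.
Packet = namedtuple("Packet", "src_ip dst_ip src_port dst_port ptype")

def entropy(values):
    """Shannon entropy (base 2) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    # p * log2(1/p) is the same quantity as -p * log2(p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def extract_features(window):
    """Map one sample of consecutive packets to a nine-element feature vector."""
    if not window:
        raise ValueError("window must contain at least one packet")
    n = len(window)
    types = Counter(p.ptype for p in window)
    return [
        entropy(p.src_ip for p in window),    # entropy of source IP address
        entropy(p.src_port for p in window),  # entropy of source port number
        entropy(p.dst_ip for p in window),    # entropy of destination IP address
        entropy(p.dst_port for p in window),  # entropy of destination port number
        entropy(p.ptype for p in window),     # entropy of packet type
        n,                                    # number of packets in the sample
        types["ICMP"] / n,                    # occurrence rate of ICMP packets
        types["UDP"] / n,                     # occurrence rate of UDP packets
        types["TCP_SYN"] / n,                 # occurrence rate of TCP SYN packets
    ]

# Attack-like toy sample: diverse sources, a single destination.
window = [
    Packet("10.0.0.1", "192.168.1.5", 1024, 80, "TCP_SYN"),
    Packet("10.0.0.2", "192.168.1.5", 1025, 80, "TCP_SYN"),
    Packet("10.0.0.3", "192.168.1.5", 1026, 80, "TCP_SYN"),
    Packet("10.0.0.4", "192.168.1.5", 1027, 80, "UDP"),
]
features = extract_features(window)
print(features)
```

In this toy sample the source IP entropy is high and the destination IP entropy is zero, which matches the attack-phase behaviour described earlier. The packet count is only informative if samples are taken per time interval rather than per fixed number of packets.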
3.2 Clustering analysis by DBSCAN (Training phase)
Clustering is a method by which large sets of data are grouped into clusters of similar data. Using cluster analysis, we can separate normal traffic and each phase of a DDoS attack into distinct groups, provided the variables used to form the clusters differ sufficiently among them. Hence, in this paper, we apply cluster analysis to separate each phase of the DDoS attack. We first employ a clustering algorithm to partition a training data set into clusters that represent normal traffic and each phase of the DDoS attack, and then extract patterns from these clusters for use in online detection.
We adopt the DBSCAN algorithm for clustering. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a density-based clustering algorithm. It forms clusters by growing high-density regions, and it can find clusters of arbitrary shape. The basic idea of using DBSCAN in DDoS attack detection is that most of the data is normal traffic, while attack data is scarce and differs from normal data.
In training mode, we modify the DBSCAN algorithm by adding a new step that computes the centroid µ of each cluster as the mean of its points:
$$\mu_k = \frac{1}{|C_k|} \sum_{x \in C_k} x$$
where $C_k$ denotes the set of points assigned to cluster k. These centroids represent the patterns used to detect the DDoS attack phases in online mode.
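A compact sketch of DBSCAN followed by the added centroid step is given here for orientation. It is an illustrative pure-Python version under assumptions, not the authors' code: Euclidean distance, the `NOISE = -1` convention, cluster ids starting at 0, and the function names `region_query` and `dbscan_with_centroids` are all choices made for this example.

```python
import math

NOISE = -1  # assumed label for noise points

def region_query(data, i, eps):
    """Indices of all points within distance eps of data[i] (including i)."""
    return [j for j, q in enumerate(data) if math.dist(data[i], q) <= eps]

def dbscan_with_centroids(data, eps, min_pts):
    """Cluster `data` with DBSCAN, then return (labels, centroids)."""
    labels = [None] * len(data)  # None means "not yet visited"
    cluster_id = -1
    for i in range(len(data)):
        if labels[i] is not None:
            continue
        neighbours = region_query(data, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster_id += 1  # data[i] is a core point: start a new cluster
        labels[i] = cluster_id
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point joins the cluster
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster_id
            j_neighbours = region_query(data, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep growing
    # Added step: centroid of each cluster = per-dimension mean of its members.
    centroids = {}
    for k in range(cluster_id + 1):
        members = [p for p, lab in zip(data, labels) if lab == k]
        centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Toy data: two dense groups and one isolated point.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
        (50.0, 50.0)]
labels, centroids = dbscan_with_centroids(data, eps=1.5, min_pts=3)
print(labels)
print(centroids)
```

The neighbourhood search here is a linear scan, so the whole run is quadratic in the number of samples; a spatial index would be the usual remedy for larger training sets.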
The steps of the modified DBSCAN algorithm are shown below:
Algorithm 1 DBSCAN (D, ε, MinPts)
Input: training data set D, neighbourhood radius ε, density threshold MinPts
Output: a cluster id (or NOISE) label for each data point, set of centroid points µk