Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing
Zhichun Li, Yan Chen and Aaron Beach
Lab for Internet & Security Technology (LIST)
http://list.cs.northwestern.edu
Northwestern University
The Spread of CodeRed
Distributed IDSes
• Distributed Intrusion Detection Systems (IDSes)
  – Crucial to identify large-scale attacks early
  – Robust to various scan techniques
  – Locate the attackers/zombies even when addresses are spoofed
  – E.g., Symantec has 20,000 sensors in 180 countries
• General architecture
  – IDS nodes
    » Generate the alarms
    » Heterogeneous: host- or network-based
  – Sensor fusion centers (SFCs)
    » Fuse the alarms
    » A subset of IDSes or dedicated hosts
Desired Features of DIDS Infrastructure
• Scalability
  – 15 million daily intrusion alerts reported to DShield
• Route only related alarms to the same SFC
  – Over 18,000 vulnerabilities found [CERT]
  – 17,500 Win32 threats and their variants [Symantec]
  – Hierarchical fusion cannot scale with diverse alerts
• Distributed queries over multiple SFCs
• Good load balancing
• Attack resiliency
Outline
• Motivation
• CDDHT Design
• Features of CDDHT
• Evaluation
• Related Work
• Conclusion
Cyber Disease Distributed Hash Tables (CDDHT)
• General intrusion alert fusion framework; any alert generation or alert fusion algorithm can be plugged in
• Part of the Router-based Anomaly/Intrusion Detection and Mitigation (RAIDM) system in LIST
  – High-speed network measurement with reversible sketches [IMC 2004, INFOCOM 2006]
  – Online flow-level anomaly/intrusion detection [IEEE ICDCS 2006] [IEEE CG&A, Security Visualization 06]
  – Router-based polymorphic worm signature generation [IEEE Symposium on Security and Privacy 2006]
CDDHT Design
• Leverage DHT systems
  – O(log(n)) hop distance, where n is the number of nodes
  – O(log(n)) maintenance overhead for routing
  – Guaranteed success for deterministic routing
  – Fault-tolerant, robust, and DoS attack resilient
  – Becoming increasingly popular for serious use
    » E.g., the eMule P2P system uses Kademlia
• Primitives of CDDHT
  – Put(disease key, symptom report)
  – Summary report = Get(disease key)
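The two primitives can be sketched with a toy in-memory model. This is only an illustration under stated assumptions, not the authors' implementation: the class name `ToyCDDHT`, the hash-mod placement that stands in for real DHT routing, the report fields, and the count-based summary are all hypothetical.

```python
import hashlib
from collections import defaultdict


class ToyCDDHT:
    """Toy stand-in for CDDHT: maps each disease key to one SFC deterministically."""

    def __init__(self, sfc_names):
        self.sfcs = sorted(sfc_names)
        # Per-SFC storage: disease key -> list of symptom reports.
        self.store = {name: defaultdict(list) for name in self.sfcs}

    def responsible_sfc(self, disease_key):
        # Assumption: hash-mod placement replaces real DHT routing here.
        digest = hashlib.sha256(disease_key.encode()).digest()
        return self.sfcs[int.from_bytes(digest[:8], "big") % len(self.sfcs)]

    def put(self, disease_key, symptom_report):
        """Put(disease key, symptom report): deliver the report to the key's SFC."""
        sfc = self.responsible_sfc(disease_key)
        self.store[sfc][disease_key].append(symptom_report)

    def get(self, disease_key):
        """Summary report = Get(disease key): here, an alert count and reporter list."""
        reports = self.store[self.responsible_sfc(disease_key)][disease_key]
        return {
            "alerts": len(reports),
            "reporters": sorted({r["ids"] for r in reports}),
        }


dht = ToyCDDHT(["sfc-a", "sfc-b", "sfc-c"])
dht.put("scan:port-445", {"ids": "ids-1", "count": 12})
dht.put("scan:port-445", {"ids": "ids-2", "count": 7})
summary = dht.get("scan:port-445")
```

Because placement depends only on the disease key, reports from different IDSes about the same attack meet at the same SFC without any coordination between the senders.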
Architecture of CDDHT
[Figure: IDS nodes and IDS + SFC nodes spread across the Internet, showing the DIDS coverage area and two injected attacks]
• IDS: Node ID = “0” + sha2(IP of the IDS)
• IDS + SFC: Node ID = “1” + sha2(IP of the IDS)
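The node ID rule from the architecture figure can be sketched as follows. Two assumptions are made that the slide does not settle: "sha2" is taken to mean SHA-256, and the "0"/"1" prefix is modelled as a leading character on a hex string rather than a single bit. The function name `node_id` is hypothetical.

```python
import hashlib


def node_id(ip: str, is_sfc: bool) -> str:
    """Return a role-prefixed node ID for an IDS identified by its IP address.

    Assumptions: SHA-256 stands in for "sha2", and the role flag is a
    leading character in front of the hex digest.
    """
    prefix = "1" if is_sfc else "0"
    return prefix + hashlib.sha256(ip.encode("ascii")).hexdigest()


ids_only = node_id("192.0.2.10", is_sfc=False)
ids_and_sfc = node_id("192.0.2.10", is_sfc=True)
```

One plausible reading of the prefix is that it separates plain IDS nodes from SFC-capable nodes in the identifier space, so that keys can be directed only at nodes able to fuse alarms.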
Disease Key Design
• Challenge: fuse the vast, diverse symptoms from heterogeneous IDSes with different views
  – Key generation in a decentralized and deterministic manner
• Key idea: generate disease keys that capture the uniqueness of certain attacks
• Focus on popular types of attacks
• Improve with features
  – Load balancing
  – Attack resilience
The Disease Key
• Currently, model four types of attacks
• Extensible design
Intrusion     | ID  | Characterization Field(s)                                                                      | Length
DoS Attack    | 000 | Victim IP (subnet)                                                                             | 35 bits
Scans         | 001 | 0 (for vertical & block scan), Source IP                                                       | 36 bits
Scans         | 001 | 1 (for horizontal & coordinated scan), Dest port #, Src IP (horizontal scan) or 0 (coordinated scan) | 52 bits
Viruses/Worms | 010 | 0 (for known), Worm ID (32 bits)                                                               | 36 bits
Viruses/Worms | 010 | 1 (for unknown), Dst port #                                                                    | 20 bits
Botnets       | 011 | 00 (for DDNS entry), Botnet ID (32 bits)                                                       | 37 bits
Botnets       | 011 | 01 (for URL entry), Botnet ID (32 bits)                                                        | 37 bits
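The DoS and scan rows of the table can be packed into bit strings as in the sketch below. The field order (intrusion ID, then sub-type flag, then port, then IP) and the widths of 32 bits per IPv4 address and 16 bits per port are inferred from the stated key lengths and are assumptions, as are all function names. How a victim subnet (rather than a single IP) is encoded is not specified on the slide, so the sketch handles single addresses only.

```python
import ipaddress

ID_DOS, ID_SCAN = "000", "001"


def ip_bits(ip: str) -> str:
    """32-bit binary string for an IPv4 address."""
    return format(int(ipaddress.IPv4Address(ip)), "032b")


def port_bits(port: int) -> str:
    """16-bit binary string for a port number."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range")
    return format(port, "016b")


def dos_key(victim_ip: str) -> str:
    return ID_DOS + ip_bits(victim_ip)                    # 3 + 32 = 35 bits


def vertical_or_block_scan_key(source_ip: str) -> str:
    return ID_SCAN + "0" + ip_bits(source_ip)             # 3 + 1 + 32 = 36 bits


def horizontal_scan_key(dest_port: int, source_ip: str) -> str:
    # 3 + 1 + 16 + 32 = 52 bits
    return ID_SCAN + "1" + port_bits(dest_port) + ip_bits(source_ip)


def coordinated_scan_key(dest_port: int) -> str:
    # The source-IP field is zeroed, so all reports for one port share a key.
    return ID_SCAN + "1" + port_bits(dest_port) + "0" * 32  # 52 bits


key = horizontal_scan_key(445, "198.51.100.7")
```

Every IDS that observes the same horizontal scan derives the same 52-bit key on its own, which is what lets key generation stay decentralized and deterministic.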
Port Scan Disease Key Design
• Vertical scan and block scan
  – Source IP
• Horizontal scan and coordinated scan
  – Scan port
  – Horizontal: + Source IP
Viruses/Worms and Botnets Disease Key Design
• Viruses/Worms
  – Known worms: hash of the worm name
  – Unknown worms: worm scan port #
• Botnets
  – Assume botnets use centralized C&C
  – IRC-based bots: dynamic DNS
  – Web-based bots: URL
  – Botnet ID = hash of the DDNS or URL
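The worm and botnet keys can be sketched the same way. The slides only say the 32-bit IDs are hashes of the worm name, DDNS name, or URL; the choice of truncated SHA-256, the lowercasing of worm names, and the function names below are assumptions made for illustration.

```python
import hashlib

ID_WORM, ID_BOTNET = "010", "011"


def hash32_bits(text: str) -> str:
    """32-bit ID from a string (assumption: first 4 bytes of SHA-256)."""
    value = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    return format(value, "032b")


def known_worm_key(worm_name: str) -> str:
    # Assumption: names are lowercased so spelling variants in case agree.
    return ID_WORM + "0" + hash32_bits(worm_name.lower())   # 3 + 1 + 32 = 36 bits


def unknown_worm_key(scan_port: int) -> str:
    if not 0 <= scan_port <= 0xFFFF:
        raise ValueError("port out of range")
    return ID_WORM + "1" + format(scan_port, "016b")        # 3 + 1 + 16 = 20 bits


def botnet_ddns_key(ddns_name: str) -> str:
    return ID_BOTNET + "00" + hash32_bits(ddns_name)        # 3 + 2 + 32 = 37 bits


def botnet_url_key(url: str) -> str:
    return ID_BOTNET + "01" + hash32_bits(url)              # 3 + 2 + 32 = 37 bits


worm_key = known_worm_key("CodeRed")
```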
Load Balancing
• Challenges to load balancing
  – Large key space in DHT
  – Highly skewed alert distribution
[Figures: number of ports picked; number of subnets picked]
Load Balancing II
• Proactive balancing with stable hot spots
  – Reduce key space of port # to 7 bits
  – 64 buckets for the 64 most popular port #s
  – Remaining 64 buckets randomly assigned to other port #s
• Balancing load of the key space
  – Node migration
  – Virtual node
  – Load-aware bootstrap
• Balancing load of single hot key
  – IDS alarm rate limiting
• Aggregation tree for large-scale attacks
  – Alarms received by the final SFC bounded by O(log(n))
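The 7-bit port encoding can be sketched as below. The hot-port list is hypothetical, and the slides do not say how the "random" assignment of the remaining ports is made consistent across nodes; hashing the port number is an assumption used here so that every IDS computes the same bucket.

```python
import hashlib


def build_port_encoder(popular_ports):
    """Return a function mapping a 16-bit port to a 7-bit bucket (0..127)."""
    # Keep at most 64 distinct hot ports, in the order given.
    hot = list(dict.fromkeys(popular_ports))[:64]
    hot_bucket = {port: i for i, port in enumerate(hot)}     # buckets 0..63

    def encode(port: int) -> int:
        if port in hot_bucket:
            return hot_bucket[port]
        # Assumption: a hash makes the pseudo-random choice reproducible.
        digest = hashlib.sha256(port.to_bytes(2, "big")).digest()
        return 64 + digest[0] % 64                            # buckets 64..127

    return encode


# Hypothetical hot-port list; the evaluation trains it from observed alerts.
encode = build_port_encoder([445, 135, 1434, 80, 139])
bucket = encode(445)
```

Giving each hot port a bucket of its own keeps two heavily scanned ports from landing on the same key, while the long tail of rarely scanned ports shares the remaining buckets.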
Attack Resilience
• DoS resilience comparison with hierarchical model
  – Proved the average number of alerts that cannot reach their corresponding SFCs given one node loss
    » Hierarchical DIDS: O(log(n))
    » CDDHT: O(1)
• More in the paper
  – Authenticity of alarms
  – Dealing with compromised nodes
Methodology
• Implementation
  – Preliminary CDDHT system based on Chord simulator
  – Event-driven simulation
    » Each alarm is an event with a timestamp from certain IDSes
• Datasets
  – DShield firewall logs (Jan. 2004)
  – Results from each day's data are similar
  – Use January 2nd, 2004 as illustration
    » 25 million scan logs from 1,417 providers
    » Randomly choose 10% to be SFCs
Scan type  | Vertical | Horizontal | Block | Coordinated
# of scans | 3364     | 8486       | 22    | 25711
Evaluation Metrics
• Fusion effectiveness
  – 100% due to deterministic routing of CDDHT
• Load balancing
  – Consider number of alerts received at each SFC
  – Maximum vs. mean ratio (MMR)
  – Coefficient of variation (CV)
$CV(x) = \frac{\text{standard deviation}(x)}{\text{mean}(x)}$
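Both load-balancing metrics are simple to compute from the per-SFC alert counts. The sketch below uses made-up counts, and it assumes the population standard deviation since the slides do not say which variant is meant.

```python
from statistics import mean, pstdev


def mmr(loads):
    """Maximum vs. mean ratio of the per-SFC alert counts."""
    return max(loads) / mean(loads)


def cv(loads):
    """Coefficient of variation: standard deviation divided by mean."""
    # Assumption: population standard deviation (pstdev).
    return pstdev(loads) / mean(loads)


# Hypothetical alert counts received at eight SFCs.
loads = [2, 4, 4, 4, 5, 5, 7, 9]
load_mmr = mmr(loads)   # 9 / 5 = 1.8
load_cv = cv(loads)     # 2 / 5 = 0.4
```

A perfectly balanced system would have MMR = 1 and CV = 0; larger values mean some SFCs carry a disproportionate share of the alerts.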
Proactive Balancing with Stable Hot Ports
Proactive load balancing can reduce CV by 60% and reduce MMR by 40%
[Figure: coefficient of variation (CV) and max vs. mean ratio (MMR) plotted against the number of bits used for the port field (16, 7*, 7**, 6*, 6**, 5*, 5**)]
* trained with popular port distribution from the day before
** trained with popular port distribution from the same day
The Load Variation Comparison Between Hierarchical Scheme and CDDHT
• Median, 10th and 90th percentiles of 10 runs
• CDDHT with proactive balancing (PB) and virtual nodes (VN)
• Compared with hierarchical schemes, CDDHT reduces the MMR by a factor of 5.5 and CV by a factor of 5.2
[Figures: two panels comparing Hierarchical, CDDHT, CDDHT w/ PB, and CDDHT w/ PB+VN]
Related Works
                          | CDDHT | Centralized/Hierarchical Model | Publish/Subscribe Model | P2P Querying
Failure/attack resilience | High  | Low                            | High                    | High
Fusion overhead           | Low   | Low                            | High                    | Low
Query overhead            | Low   | Low                            | Low                     | High
• WormShield uses DHT specifically to find popular content fingerprints as worm signatures, but does not work for polymorphic worms
Conclusion
• The large number and diversity of alerts from many distributed IDSes call for efficient fusion of these alerts
• CDDHT: Cyber Disease DHT
  – Efficiently routes alarms of different intrusions to different SFCs
  – Highly scalable and robust
  – Good load balancing
  – High attack resilience
• Future work
  – Disease keys for more types of attacks and querying of CDDHT
Backup Slides
Introduction to DHT
• DHT (Distributed Hash Table): An infrastructure that enables the distribution of an ordinary hash table onto a set of cooperating nodes
Key    | Object
0x2535 | “Apple”
0x2353 | “Banana”
0x3978 | “Peach”
0x9123 | “Strawberry”
0x7234 | “Grape”
0x5942 | “Watermelon”
[Figure: the table entries distributed across node A, node B, node C, and node D]
Each node only stores part of the hash table
• Basic operations
  – Put(Key, Object): use the Key to find the corresponding node via DHT routing and store the Object on that node
  – Object = Get(Key): use the Key to find the corresponding node via DHT routing and retrieve the Object from that node
Introduction to DHT II
• Different DHT systems
  – Chord
  – CAN
  – Pastry
  – Tapestry
  – Kademlia
    » Kademlia has been used in the eMule P2P software
[Figure: Chord DHT routing on an identifier ring]
• DHT routing
  – Distributed and deterministic routing
  – The maximum number of hops to find the node corresponding to a key is bounded by O(log(n))
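The logarithmic hop bound comes from finger tables: each node knows the nodes roughly 1, 2, 4, ... positions ahead of it, so every forwarding step can skip a large part of the remaining distance. The sketch below is a simplified, static Chord-style lookup for illustration; the 16-slot ring, the node IDs, and all names are assumptions, and joins, failures, and stabilization are left out.

```python
from bisect import bisect_left

M = 4  # identifier bits: a 16-slot ring (assumed size for illustration)


def in_open(x, a, b, size):
    """True if x lies strictly inside the clockwise ring interval (a, b)."""
    if a == b:
        return x != a  # the interval wraps the whole ring except a itself
    return 0 < (x - a) % size < (b - a) % size


def in_half_open(x, a, b, size):
    """True if x lies in the clockwise ring interval (a, b]."""
    if a == b:
        return True
    return 0 < (x - a) % size <= (b - a) % size


class ChordRing:
    def __init__(self, node_ids, m=M):
        self.size = 2 ** m
        self.nodes = sorted(node_ids)
        # finger[i] of node n is the first node at or after n + 2^i on the ring.
        self.fingers = {
            n: [self.successor((n + 2 ** i) % self.size) for i in range(m)]
            for n in self.nodes
        }

    def successor(self, key):
        """Global view of the node responsible for key (used to build fingers)."""
        return self.nodes[bisect_left(self.nodes, key) % len(self.nodes)]

    def lookup(self, start, key):
        """Route from `start` using finger tables only; return (owner, hops)."""
        node, hops = start, 0
        while True:
            succ = self.fingers[node][0]
            if in_half_open(key, node, succ, self.size):
                return succ, hops + 1
            # Forward to the farthest finger that still precedes the key.
            nxt = next(
                (f for f in reversed(self.fingers[node])
                 if in_open(f, node, key, self.size)),
                succ,
            )
            node, hops = nxt, hops + 1


ring = ChordRing([1, 4, 9, 11, 14])
owner, hops = ring.lookup(start=1, key=10)  # owner is node 11, reached in 2 hops
```

In this example node 1 forwards to node 9, its farthest finger that precedes key 10, and node 9 then hands the request to its successor, node 11, which owns the key.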
DoS Attack Disease Key Design
• Most DoS attacks target specific IP addresses (the server) or the subnet (bandwidth-consuming attacks)
• But the victim IP (subnet) can be the destination or the source (in backscatter)
• All other parts can vary
Related Works
• Centralized/Hierarchical Model
• Publish/Subscribe Model
  – O(n^2) communication vs. O(n)
• P2P Query
  – Scalability with frequent fusion
Attack Resilience
• DoS resilience comparison with hierarchical model
  – Proved the average number of disconnected nodes given one node loss
    » in a k-way hierarchical DIDS is O(log(n))
    » but in the DHT-based design is O(1)
• Authenticity of alarms
  – Validate the source subnets of IDSes using Whois and BGP tables
  – Use PKI to verify the messages sent by IDSes/SFCs
Attack Resilience II
• Dealing with compromised nodes
  – IDS nodes
    » Vote on the importance of results by number of IDSes and IP coverage
    » Probability-based verification for alarm aggregation
  – SFC nodes
    » The “trust but verify” principle
    » Envision a centralized authority that randomly checks the fusion results of the SFCs
Proactive Balancing with Stable Hot Ports
Using 7-bit encoding can reduce MMR by 60% and reduce CV by 40%
Dynamics of Load Variation over Time
• MMR for CDDHT is much smaller and smoother
• CV also improves