ACTIONABLE KNOWLEDGE DISCOVERY FOR THREATS INTELLIGENCE SUPPORT ~ A MULTI-DIMENSIONAL DATA MINING METHODOLOGY
2nd Int. Workshop on Domain Driven Data Mining, Pisa, Dec 15th, 2008
Olivier Thonnard, Royal Military Academy, Polytechnic Faculty, Belgium, [email protected]
Marc Dacier, Symantec Research Labs, Sophia Antipolis, France, [email protected]
1. Introduction
2. A multi-dimensional & domain-driven approach for mining network traffic (e.g. malicious)
3. Experimental environment
4. A real-world example
5. Conclusions
Introduction
According to the security community, today's cybercrime:
  Is increasingly organized
  Involves the commoditization of various activities:
    Selling 0-days and new (undetected) malware
    Selling or renting compromised hosts or entire botnets
  Seems to be specialized in certain countries
  Coordination patterns …
Threats intelligence
  What is the prevalence of emerging coordinated malicious activities?
  Which countries / IP blocks seem to be more affected?
  Can we observe various “communities” of machines coordinating their efforts?
How to discover knowledge about:
  1. The modus operandi of attack phenomena
  2. The underlying root causes of attacks
How to analyze Internet threats from a global strategic level?
  Can we enable some sort of Internet threat “situational awareness”?
Our « multi-dimensional KDD » approach to analyze network threats
Collect real-world attack traces from a number of (worldwide) distributed sensors
  Network of honeypots = “Honeynet”
Threats analysis (semi-automated):
  Collect “attack events” from each sensor
  Multi-dimensional KDD:
    1) Extract relevant nuggets of knowledge (DDDM, with expert-defined features)
       – Using clique algorithms (clique-based clustering): extraction of maximal weighted cliques
    2) Synthesize those pieces of knowledge to create “concepts” describing the attack phenomena (DDDM)
       – Using clique combinations
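The clique extraction of step 1 can be illustrated with a minimal sketch. The slides do not say which clique algorithm is used, so everything here is an assumption: attack events are nodes, an edge links two events whose similarity (for one dimension) reaches a threshold, maximal cliques are enumerated with Bron-Kerbosch, and a clique's weight is taken as the sum of its pairwise similarities. The event names, similarity values and the threshold are made up.

```python
from itertools import combinations


def build_graph(similarity, threshold):
    """Adjacency sets keeping only edges with similarity >= threshold."""
    graph = {}
    for (a, b), weight in similarity.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if weight >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate all maximal cliques (Bron-Kerbosch with pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


def clique_weight(clique, similarity):
    """Assumed weight: sum of pairwise similarities inside the clique."""
    total = 0.0
    for a, b in combinations(sorted(clique), 2):
        total += similarity.get((a, b), similarity.get((b, a), 0.0))
    return total


# Hypothetical pairwise similarities between five attack events.
similarity = {
    ("e1", "e2"): 0.90,
    ("e1", "e3"): 0.80,
    ("e2", "e3"): 0.85,
    ("e3", "e4"): 0.30,
    ("e4", "e5"): 0.95,
}

graph = build_graph(similarity, threshold=0.7)
cliques = maximal_cliques(graph)
heaviest = max(cliques, key=lambda c: clique_weight(c, similarity))
print(sorted(heaviest), clique_weight(heaviest, similarity))
```

Running this once per dimension (geolocation, netblocks, time series, ...) would give one set of cliques per viewpoint; how those are then combined into “concepts” is not detailed on this slide.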
+/- 40 sensors, 30 countries, 5 continents
Leurré.com Project
Leurre.com / SGNET Honeynet
Global distributed honeynet (http://www.leurrecom.org)
  50+ sensors distributed in more than 30 countries worldwide
  Ongoing effort of EURECOM since 2003
Same configuration for all sensors:
  (V1.0): low-interaction honeypots based on honeyd
  (V2.0): high-interaction honeypots based on ScriptGen
Data enrichment: Dataset enriched with contextual information:
All partners have full access (for free) to the whole DB
Research context: WOMBAT
Worldwide Observatory of Malicious Behaviors And Threats
  EU-FP7 project (http://www.wombat-project.eu)
  Joint effort in collecting, sharing and analyzing data on global Internet
In our honeynet:
  A source = an IP address that targets a honeypot platform on a given day, with a certain port sequence.
  All sources are clustered into “attack (profiles)” based on certain network characteristics (*): targeted port sequence, #packets, attack duration, packet payload, …
(*) F. Pouget, M. Dacier, Honeypot-Based Forensics. AusCERT Asia Pacific Information Technology Security Conference, 2004.
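The notion of a source can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Packet` record, the field names and the sample traffic are hypothetical, the port sequence is taken as the ordered list of distinct ports targeted, and sources are grouped by port sequence alone, whereas the clustering described above also uses #packets, attack duration, packet payload, and more.

```python
from collections import defaultdict, namedtuple

# Hypothetical packet record; field names are assumptions, not the honeynet schema.
Packet = namedtuple("Packet", "src_ip platform day dst_port")


def build_sources(packets):
    """Map (IP, platform, day) to the ordered sequence of distinct targeted ports."""
    sequences = defaultdict(list)
    for pkt in packets:
        seq = sequences[(pkt.src_ip, pkt.platform, pkt.day)]
        if pkt.dst_port not in seq:  # record a port the first time it is targeted
            seq.append(pkt.dst_port)
    return {key: tuple(seq) for key, seq in sequences.items()}


def group_by_port_sequence(sources):
    """Naive grouping: sources sharing the same port sequence land together."""
    groups = defaultdict(list)
    for key, seq in sources.items():
        groups[seq].append(key)
    return dict(groups)


# Made-up traffic using documentation IP ranges.
packets = [
    Packet("192.0.2.10", "sensor-A", "2008-12-01", 445),
    Packet("192.0.2.10", "sensor-A", "2008-12-01", 139),
    Packet("192.0.2.10", "sensor-A", "2008-12-01", 445),
    Packet("198.51.100.7", "sensor-B", "2008-12-01", 445),
    Packet("198.51.100.7", "sensor-B", "2008-12-01", 139),
    Packet("192.0.2.10", "sensor-A", "2008-12-02", 22),
]

sources = build_sources(packets)
profiles = group_by_port_sequence(sources)
for seq, members in profiles.items():
    print(seq, len(members))
```

Note that the same IP address seen on two different days counts as two sources, which follows from the definition above.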
[Figure labels: Attack tool; Fingerprint(s)]
Definition 2: Attack event on sensor ‘x’
[Figure labels: Event 1, Event 2, Event 3]
Dimensions used to create “attack cliques”
We need to identify salient features for the creation of meaningful cliques (“viewpoints”)
  Expert-defined characteristics for each dimension
Geolocation
  Botnets located in specific regions
  So-called “safe harbors” for the hackers
IP netblocks / ISPs of origin
  Bias in worm propagation (e.g. malware coding strategies)
  “Uncleanliness” of certain networks (e.g. clusters of zombie machines)
Many others
Time series
  Synchronized activities targeting different sensors
Targeted sensors
Remark: distances used to compare distributions: Kullback-Leibler, Chi-2, and Kolmogorov-Smirnov
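As a rough illustration of these three distances, the sketch below compares the country distributions of two attack events. Several choices are assumptions not stated in the slides: the symmetric form of the Chi-2 distance, the clamping of zero probabilities in Kullback-Leibler, and the application of Kolmogorov-Smirnov to cumulative sums over an arbitrary (alphabetical) category order. The country counts are made up.

```python
import math


def normalize(counts, keys):
    """Turn raw counts into a probability vector over a fixed key order."""
    total = sum(counts.get(k, 0) for k in keys)
    if total == 0:
        raise ValueError("empty distribution")
    return [counts.get(k, 0) / total for k in keys]


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q); q is clamped at eps (assumption)."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def chi2_distance(p, q):
    """Symmetric Chi-2 distance (one of several common forms)."""
    return sum((pi - qi) ** 2 / (pi + qi) for pi, qi in zip(p, q) if pi + qi > 0)


def ks_statistic(p, q):
    """Largest gap between the two cumulative distributions."""
    cum_p = cum_q = gap = 0.0
    for pi, qi in zip(p, q):
        cum_p += pi
        cum_q += qi
        gap = max(gap, abs(cum_p - cum_q))
    return gap


# Hypothetical numbers of sources per country for two attack events.
event_a = {"CN": 50, "US": 30, "KR": 20}
event_b = {"CN": 45, "US": 35, "KR": 20}

keys = sorted(set(event_a) | set(event_b))
p = normalize(event_a, keys)
q = normalize(event_b, keys)

print("KL   :", kl_divergence(p, q))
print("Chi-2:", chi2_distance(p, q))
print("KS   :", ks_statistic(p, q))
```

Kullback-Leibler is not symmetric, so a pairwise similarity built on it would need a symmetrized variant; Kolmogorov-Smirnov is most meaningful when the categories have a natural order.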