BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection. Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee, College of Computing, Georgia Institute of Technology. USENIX Security '08. Presented by Lei Wu, April 13th, 2009.

Dec 25, 2015


Page 1: Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee College of Computing, Georgia Institute of Technology USENIX Security '08 Presented by Lei Wu.

Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee

College of Computing, Georgia Institute of Technology

USENIX Security '08

Presented by Lei Wu

April 13th, 2009

BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection

Page 2

Motivation and Background

System description

Experimental analysis

Conclusion

Outline

Page 3

Motivation and Background

System description

Experimental analysis

Conclusion

Outline

Page 4

This paper proposes BotMiner, a general detection framework that is independent of the botnet Command and Control (C&C) protocol and structure and requires no a priori knowledge of botnets

Motivation and Background

Page 5

Bot: a malware instance that runs autonomously and automatically on a compromised computer (zombie) without the owner’s consent

Botnet: a network of bots controlled by criminals

Definition: “A coordinated group of malware instances that are controlled by a botmaster via some C&C channel”

25% of Internet PCs are part of a botnet!

Motivation and Background

Page 6

Why BotMiner?

Traditional methods are not enough.

Botnets can change their C&C content (encryption, etc.), protocols (IRC, HTTP, etc.), structures (P2P, etc.), C&C servers, infection models …

Motivation and Background

Page 7

Cluster similar communication traffic and similar malicious traffic, then perform cross-cluster correlation to identify the hosts that share both similar communication patterns and similar malicious activity patterns

Basic idea

Page 8

Revisit the definition of a botnet: “A coordinated group of malware instances that are controlled by a botmaster via some C&C channel”

We need to monitor two planes

C-plane (C&C communication plane): “who is talking to whom”

A-plane (malicious activity plane): “who is doing what”

Horizontal correlation

Bots are for long-term use

Botnet: communication and activities are coordinated/similar

How does it work?

Page 9

Motivation and Background

System description

Experimental analysis

Conclusion

Outline

Page 10

Architecture overview

Page 11

Simplified Architecture

[Architecture diagram: Network Traffic -> A-Plane Monitor + Clustering and C-Plane Monitor + Clustering -> Cross-Plane Correlation -> Report]

Page 12

A-Plane

[Architecture diagram: Network Traffic -> A-Plane Monitor + Clustering and C-Plane Monitor + Clustering -> Cross-Plane Correlation -> Report]

Page 13

Log information on who is doing what

Monitor four types of malicious activities: scanning, spamming, binary downloading, exploit attempts

Based on Snort; adapts some existing intrusion detection techniques (e.g., BotHunter, PEHunter)

A-Plane Monitor

Page 14

Two-layer clustering on activity logs

A-Plane Clustering
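
The slide only states that the activity logs are clustered in two layers. A minimal sketch of one way to read that, assuming the first layer groups hosts by activity type and the second layer splits each group by one type-specific feature. The ActivityEvent record and its detail field (for example a scanned port) are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class ActivityEvent(NamedTuple):
    host: str    # internal host that performed the activity
    kind: str    # "scan", "spam", "binary_download", or "exploit"
    detail: str  # hypothetical second-layer feature, e.g. the scanned port


def cluster_activities(events):
    """Layer 1: group hosts by activity type. Layer 2: split by detail."""
    by_type = defaultdict(lambda: defaultdict(set))
    for ev in events:
        by_type[ev.kind][ev.detail].add(ev.host)
    # Convert the nested defaultdicts into plain dicts for the caller.
    return {kind: dict(sub) for kind, sub in by_type.items()}


events = [
    ActivityEvent("10.0.0.1", "scan", "445"),
    ActivityEvent("10.0.0.2", "scan", "445"),
    ActivityEvent("10.0.0.3", "scan", "22"),
    ActivityEvent("10.0.0.1", "spam", "smtp"),
]
a_clusters = cluster_activities(events)
print(a_clusters["scan"]["445"])
```

Each innermost set of hosts plays the role of one A-cluster in the later cross-plane step.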

Page 15

C-Plane

[Architecture diagram: Network Traffic -> A-Plane Monitor + Clustering and C-Plane Monitor + Clustering -> Cross-Plane Correlation -> Report]

Page 16

Captures network flows and records information on who is talking to whom

Adapts an efficient network flow capture tool named fcapture, which is based on the Judy library

Each flow record contains the following information: time, duration, source IP, source port, destination IP, destination port, and the number of packets and bytes transferred in both directions

C-Plane Monitor
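
The flow record fields listed on this slide can be pictured as a small data structure. This is only an illustrative sketch: the class name, field names, and units are assumptions, and it does not reflect fcapture's actual record layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """One C-plane flow record with the fields listed on the slide."""
    time: float        # flow start time (seconds; the unit is an assumption)
    duration: float    # flow duration in seconds
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    packets_out: int   # packets sent source -> destination
    bytes_out: int     # bytes sent source -> destination
    packets_in: int    # packets sent destination -> source
    bytes_in: int      # bytes sent destination -> source

    @property
    def total_packets(self) -> int:
        return self.packets_out + self.packets_in

    @property
    def total_bytes(self) -> int:
        return self.bytes_out + self.bytes_in


record = FlowRecord(0.0, 2.5, "10.0.0.5", 40312, "192.0.2.10", 6667,
                    12, 900, 10, 1500)
print(record.total_packets, record.total_bytes)  # prints: 22 2400
```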

Page 17

Architecture of the C-plane clustering

The first two steps are not critical; however, they reduce the traffic workload and make the actual clustering process more efficient

In the third step, given an epoch E (typically one day), all TCP/UDP flows that share the same protocol, source IP, destination IP, and destination port are aggregated into the same C-flow

C-Plane Clustering
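
The aggregation step described on this slide amounts to grouping flows by a four-part key. A minimal sketch, assuming all flows passed in already belong to one epoch E; the Flow tuple and the build_c_flows name are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    protocol: str   # "tcp" or "udp"
    src_ip: str
    dst_ip: str
    dst_port: int
    packets: int
    n_bytes: int


def build_c_flows(flows):
    """Group the flows of one epoch by (protocol, src IP, dst IP, dst port)."""
    c_flows = defaultdict(list)
    for f in flows:
        key = (f.protocol, f.src_ip, f.dst_ip, f.dst_port)
        c_flows[key].append(f)
    return dict(c_flows)


flows = [
    Flow("tcp", "10.0.0.5", "192.0.2.10", 6667, 12, 900),
    Flow("tcp", "10.0.0.5", "192.0.2.10", 6667, 8, 640),
    Flow("udp", "10.0.0.5", "192.0.2.10", 53, 2, 120),
]
c_flows = build_c_flows(flows)
print(len(c_flows))  # prints: 2
```

Note that the source port is deliberately left out of the key, so repeated connections from the same host to the same service fall into one C-flow.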

Page 18

Extract a number of statistical features from each C-flow and translate them into d-dimensional pattern vectors

Compute the discrete sample distribution of (currently) four random variables:

the number of flows per hour (fph)

the number of packets per flow (ppf)

the average number of bytes per packet (bpp)

the average number of bytes per second (bps)

Feature Extraction

Temporal-related statistical distribution information: FPH and BPS

Spatial-related statistical distribution information: BPP and PPF
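
A rough sketch of how the samples of the four random variables might be collected for one C-flow. The Flow tuple, the 24-hour epoch, and the choice to count hours with no flows as zero in fph are assumptions made for illustration, not details given on the slide.

```python
from collections import Counter
from typing import NamedTuple


class Flow(NamedTuple):
    start_time: float  # seconds since the start of epoch E
    duration: float    # seconds
    packets: int
    n_bytes: int


def c_flow_samples(flows, epoch_hours=24):
    """Return the raw samples of fph, ppf, bpp, and bps for one C-flow."""
    per_hour = Counter(int(f.start_time // 3600) for f in flows)
    return {
        # flows per hour: one sample per hour of the epoch
        "fph": [per_hour.get(h, 0) for h in range(epoch_hours)],
        # packets per flow: one sample per flow
        "ppf": [f.packets for f in flows],
        # average bytes per packet: one sample per non-empty flow
        "bpp": [f.n_bytes / f.packets for f in flows if f.packets > 0],
        # average bytes per second: one sample per flow with a duration
        "bps": [f.n_bytes / f.duration for f in flows if f.duration > 0],
    }


flows = [Flow(10, 2.0, 4, 400), Flow(20, 1.0, 2, 100), Flow(3700, 4.0, 10, 1000)]
samples = c_flow_samples(flows)
print(samples["fph"][:2], samples["ppf"], samples["bpp"], samples["bps"])
```

FPH and BPS depend on when and how fast traffic flows (temporal), while PPF and BPP describe the size and shape of each flow (spatial).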

Page 19

Compute the overall discrete sample distribution of the random variable considering all the C-flows in the traffic for an epoch E, then describe that random variable's (approximate) distribution as a vector of 13 elements.

Applying the same algorithm to all four random variables maps each C-flow to a pattern vector of d = 4 × 13 = 52 elements

Feature Extraction Algorithm
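
The slide does not say how the 13 elements are chosen. The sketch below assumes one plausible scheme: take 12 quantiles of the overall distribution as cut points, which yields 13 bins, and describe each C-flow by the fraction of its samples falling in each bin. The specific quantile levels, the normalization to fractions, and all function names are assumptions.

```python
from bisect import bisect_left

# Assumed cut points: 12 quantile levels (in percent) give 13 bins.
QUANTILE_PERCENTS = (5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90)
VARIABLES = ("fph", "ppf", "bpp", "bps")


def quantile_edges(all_samples, percents=QUANTILE_PERCENTS):
    """Nearest-rank quantiles of the overall (all C-flows) distribution.

    all_samples must be non-empty.
    """
    ordered = sorted(all_samples)
    n = len(ordered)
    # -(-p * n // 100) is ceil(p * n / 100) in integer arithmetic.
    return [ordered[max(0, -(-p * n // 100) - 1)] for p in percents]


def bin_vector(samples, edges):
    """Fraction of one C-flow's samples in each of len(edges) + 1 bins."""
    counts = [0] * (len(edges) + 1)
    for x in samples:
        # Bin i holds values in (edges[i-1], edges[i]]; the last bin is open.
        counts[bisect_left(edges, x)] += 1
    total = len(samples)
    return [c / total if total else 0.0 for c in counts]


def pattern_vector(samples, edges):
    """Concatenate the four 13-element vectors into one 52-element vector."""
    vec = []
    for name in VARIABLES:
        vec.extend(bin_vector(samples[name], edges[name]))
    return vec


edges = quantile_edges(range(1, 101))
print(bin_vector([1, 5, 6, 95, 100], edges))
```

Because the cut points come from the overall traffic, every C-flow is described on the same scale, which makes the resulting vectors comparable during clustering.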

Page 20
Page 21

Why multi-step?

Coarse-grained clustering

Using reduced feature space: mean and variance of the distribution of FPH, PPF, BPP, BPS for each C-flow (2*4=8)

Efficient clustering algorithm: X-means

Fine-grained clustering

Using full feature space (13*4=52)

Two-step Clustering of C-flows
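
The reduced feature space for the coarse-grained step can be sketched as follows. This only builds the 8-element vector (mean and variance of each of the four variables); the clustering itself (X-means on the slide) is not shown, and the use of population variance and the function name are assumptions.

```python
from statistics import fmean, pvariance

VARIABLES = ("fph", "ppf", "bpp", "bps")


def reduced_vector(samples):
    """Map one C-flow's samples to [mean, variance] per variable (2*4 = 8)."""
    vec = []
    for name in VARIABLES:
        values = samples[name]  # must contain at least one sample
        vec.append(fmean(values))
        vec.append(pvariance(values))
    return vec


samples = {
    "fph": [1, 3],
    "ppf": [2, 2],
    "bpp": [100.0, 50.0],
    "bps": [10.0],
}
coarse = reduced_vector(samples)
print(len(coarse))  # prints: 8
```

Clustering first on 8 dimensions and only then on 52 dimensions within each coarse cluster keeps the expensive step confined to small groups of C-flows.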

Page 22

Cross-Plane Correlation

[Architecture diagram: Network Traffic -> A-Plane Monitor + Clustering and C-Plane Monitor + Clustering -> Cross-Plane Correlation -> Report]

Page 23

Botnet score s(h) for every host h

h will receive a high score if it has performed multiple types of suspicious activities, and if other hosts that were clustered with h also show the same multiple types of activities

Similarity score between hosts hi and hj

Two hosts in the same A-clusters and in at least one common C-cluster are clustered together

Cross-Plane Correlation
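
The transcript does not preserve the score formulas, so the following is only a sketch of the idea described above, not the paper's definition. It assumes set overlap is measured with the Jaccard index, that all activity types carry equal weight, and that similarity counts shared A-clusters plus one if any C-cluster is shared. All function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two host sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def botnet_score(host, a_clusters, c_clusters):
    """a_clusters: list of (activity_type, hosts); c_clusters: list of hosts."""
    mine_a = [(t, m) for t, m in a_clusters if host in m]
    mine_c = [m for m in c_clusters if host in m]
    score = 0.0
    # Reward membership in A-clusters of different activity types that overlap.
    for (t1, m1), (t2, m2) in combinations(mine_a, 2):
        if t1 != t2:
            score += jaccard(m1, m2)
    # Reward A-clusters that overlap with the host's C-clusters.
    for _, m_a in mine_a:
        for m_c in mine_c:
            score += jaccard(m_a, m_c)
    return score


def similarity(h1, h2, a_clusters, c_clusters):
    """Shared A-clusters, plus 1 if the hosts share at least one C-cluster."""
    shared_a = sum(1 for _, m in a_clusters if h1 in m and h2 in m)
    shares_c = any(h1 in m and h2 in m for m in c_clusters)
    return shared_a + (1 if shares_c else 0)


a_clusters = [("scan", {"h1", "h2", "h3"}), ("spam", {"h1", "h2"})]
c_clusters = [{"h1", "h2"}, {"h4", "h5"}]
print(botnet_score("h1", a_clusters, c_clusters))
print(similarity("h1", "h2", a_clusters, c_clusters))
```

In this toy input, h1 and h2 both scan and spam and also share a communication cluster, so they score higher than h3, which only scans.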

Page 24

Use the Davies-Bouldin (DB) validation index to find the best dendrogram cut, i.e., the cut that produces the most compact and well-separated clusters

Hierarchical clustering
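
To make the cut selection concrete, here is a small sketch of the Davies-Bouldin index and of choosing the candidate partition with the lowest value (lower means more compact, better separated clusters). Building the dendrogram and enumerating its cuts is omitted; the candidate partitions are simply given as lists of point clusters, and the function names are hypothetical.

```python
import math


def centroid(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(clusters):
    """DB index of a partition with at least two clusters and distinct centroids."""
    cents = [centroid(c) for c in clusters]
    # Scatter: mean distance of each cluster's points to its centroid.
    scatter = [sum(math.dist(p, cen) for p in c) / len(c)
               for c, cen in zip(clusters, cents)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case ratio of combined scatter to centroid separation.
        total += max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                     for j in range(k) if j != i)
    return total / k


def best_cut(candidate_partitions):
    """Pick the partition (dendrogram cut) with the lowest DB index."""
    return min(candidate_partitions, key=davies_bouldin)


good = [[(0, 0), (0, 1)], [(10, 10), (10, 11)]]
bad = [[(0, 0), (10, 10)], [(0, 1), (10, 11)]]
print(davies_bouldin(good) < davies_bouldin(bad))  # prints: True
```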

Page 25

Motivation and Background

System description

Experimental analysis

Conclusion

Outline

Page 26

Data collected

Page 27

Results

Page 28

Motivation and Background

System description

Experimental analysis

Conclusion

Outline

Page 29

Evading C-plane monitoring and clustering: misuse the whitelist; manipulate communication patterns

Evading A-plane monitoring and clustering: very stealthy activity; individualize bots’ communication/activity

Evading cross-plane analysis: extremely delayed tasks

Limitation and Discussion

Page 30

Related Work

Page 31

Propose a detection framework which is independent of botnet C&C protocol and structure, and requires no a priori knowledge of specific botnets

Build a prototype system based on the general detection framework, and evaluate it with multiple real-world network traces including normal traffic and several real-world botnet traces

Contribution

Page 32

Offline system: long-duration data collection and analysis; no incremental analysis capability

The experiment is not convincing enough: it only shows the system performance on day 2 (what about the other days?), and it is not a real “real-world experiment”

Weakness

Page 33

Fast detection and online analysis

More efficient clustering, more robust features

More experiments in different, real network environments

Improvement

Page 34


Page 35

Questions?