Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Page 1: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Dartmouth College

Thayer School of Engineering

Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Daniel J. Burroughs

Institute for Security Technology Studies

Thayer School of Engineering

Dartmouth College

May 1, 2002

Page 2: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Outline

• Institute for Security Technology Studies

• Needs and goals

• System overview

• Sensor Modeling

• Attacker Modeling

• Hypothesis Management

• Testing and Evaluation

• Summary and Future Work

Page 3: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Institute for Security Technology Studies

• Security and counter-terrorism research center

• Funded by the NIJ

• Main focus is on computer security

• Investigative Research for Infrastructure Assurance (IRIA)

• Joint effort with Thayer School of Engineering

Page 5: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

The Internet and Security in a Nutshell

[Figure labels: IDS (2), ALERT! (2)]

Page 6: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

What is the Need?

• Distributed and/or coordinated attacks
  – Increasing rate and sophistication

• Infrastructure protection
  – Coordinated attack against infrastructure
  – Attacks against multiple infrastructure components

• Overwhelming amounts of data
  – Huge effort required to analyze
  – Lots of uninteresting events

Page 8: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

What is the System?

• Reorganization of existing data

• Data fusion

• Building situational knowledge

• Not an intrusion detection system

[Figure labels: Snort, SHADOW, RealSecure, Security Database, Tracking System]

Page 9: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Network Centered View

• Network viewed in isolation

• Limited view of attacker’s activity

• Defensive posture

Page 10: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Distributed Attack

Denial of Service

Page 11: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Attacker Centered View

• More complete picture

• Information gathering

• Requires cooperation and data fusion

Page 12: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Radar Tracking

• Multiple sensors

• Multiple targets

• Real-time tracking

• Incomplete data

• Inaccurate data

• Heterogeneous sensors

[Figure labels: Snort, SHADOW, RealSecure]

Page 13: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Gather and Correlate

• Collecting data
  – Time correlation, communications, common formatting, etc.
  – These issues are addressed by numerous projects
    • IDEF, IDMEF, CIDF, D-Shield, Incidents.org, etc.

• Correlating data
  – How can we tell what events are related?
  – Attacker’s goals determine behavior
  – Multiple hypothesis tracking

Page 14: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Multiple Hypothesis Tracking

• Events analyzed on arrival
  – Stream 1: PortScan, BufferOverflow

• Scenario created
  – Attack 1: PortScan, BufferOverflow

OR

• Alternate hypothesis
  – Attack 1: PortScan
  – Attack 2: BufferOverflow
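
The branching on this slide can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a hypothesis is a list of tracks, each track is a list of event names, and the helper name `extend` is hypothetical rather than taken from the presented system.

```python
from typing import List

Track = List[str]
Hypothesis = List[Track]  # one hypothesis = one way of grouping events into attacks


def extend(hypotheses: List[Hypothesis], event: str) -> List[Hypothesis]:
    """Branch every hypothesis: the new event either joins an
    existing track or starts a new track of its own."""
    branched: List[Hypothesis] = []
    for hyp in hypotheses:
        # Option 1: append the event to each existing track in turn.
        for i in range(len(hyp)):
            branched.append(
                [t + [event] if j == i else list(t) for j, t in enumerate(hyp)]
            )
        # Option 2: the event begins a new, separate attack.
        branched.append([list(t) for t in hyp] + [[event]])
    return branched


hypotheses: List[Hypothesis] = [[]]  # start with one empty hypothesis
for event in ["PortScan", "BufferOverflow"]:
    hypotheses = extend(hypotheses, event)

for hyp in hypotheses:
    print(hyp)
```

With the two events from the slide, this yields exactly the two alternatives shown: one attack containing both events, or two separate attacks.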

Page 15: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Hypothesis Evaluation

• Hypotheses are evaluated based on the behaviors of the sensor and target

• What real-world event caused the given sensor output?

• How likely is it that the target moved to this position?

$$p(t_k, s_k) = \frac{1}{C}\, L_k(y_k \mid s_k)\, p^{-}(t_k, s_k)$$

Page 17: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

IDS Overview

• Two methods of intrusion detection
  – Signature detection (pattern matching)
    • Low false positive / Detects only known attacks
  – Statistical anomaly detection
    • High false positive / Detects wider range of attacks

• Two domains to be observed
  – Network
  – Host

Page 18: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Signature Detection vs. Anomaly Detection

• Modeling signature detection is easy
  – If a known attack occurred in an observable area, then p(detection) = 1, else p(detection) = 0

• Modeling anomaly detection is more difficult
  – Noisy and/or unusual attacks are more likely to be seen
    • Denial of Service, port scans, unused services, etc.
  – Other types of attacks may be missed
    • Malformed web requests, some buffer overflows, etc.
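
A rough sketch of the two sensor models follows. The signature model is the 0/1 rule stated on the slide; the anomaly-detection rates are invented placeholder numbers chosen only to reflect "noisy attacks are seen more often", and all function and table names are hypothetical.

```python
# Placeholder detection rates for an anomaly detector.
# These numbers are illustrative assumptions, not measured values.
ANOMALY_DETECTION_RATE = {
    "dos": 0.95,
    "portscan": 0.90,
    "unused_service": 0.80,
    "buffer_overflow": 0.30,
    "malformed_web_request": 0.20,
}


def p_detect_signature(attack: str, known_signatures: set, observable: bool) -> float:
    """Signature sensor: certain detection for a known attack in view, else none."""
    return 1.0 if observable and attack in known_signatures else 0.0


def p_detect_anomaly(attack: str, observable: bool) -> float:
    """Anomaly sensor: probabilistic detection; unknown attack types get an
    assumed default rate of 0.5."""
    if not observable:
        return 0.0
    return ANOMALY_DETECTION_RATE.get(attack, 0.5)


signatures = {"portscan", "buffer_overflow"}
print(p_detect_signature("portscan", signatures, observable=True))   # known and visible
print(p_detect_signature("dos", signatures, observable=True))        # no signature
print(p_detect_anomaly("malformed_web_request", observable=True))    # quiet attack
```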

Page 19: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Event Measurements

• Minimal feature set is extracted from reports
  – Source IP, destination IP
  – Source port, destination port
  – Type of attack
  – Time

• These are then used to describe a hyperspace through which the attack moves
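
One way to picture the feature set as a point in a hyperspace is the sketch below. The `Event` class, the `ATTACK_TYPES` list, and the choice to map IP addresses to integers are all assumptions made for illustration; the slides do not specify an encoding.

```python
from dataclasses import dataclass
import ipaddress

# Hypothetical list of attack types; an event's type is encoded as its index.
ATTACK_TYPES = ["portscan", "buffer_overflow", "dos"]


@dataclass(frozen=True)
class Event:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    attack_type: str
    time: float  # seconds since the start of the capture

    def as_point(self) -> tuple:
        """Map the six features to numeric coordinates."""
        return (
            int(ipaddress.ip_address(self.src_ip)),
            int(ipaddress.ip_address(self.dst_ip)),
            self.src_port,
            self.dst_port,
            ATTACK_TYPES.index(self.attack_type),
            self.time,
        )


event = Event("10.0.0.1", "10.0.0.2", 40000, 80, "portscan", 12.5)
point = event.as_point()
print(point)
```

Representing addresses as integers makes a "distance" between source IPs easy to compute, which is the kind of relationship the later attacker-motion slides rely on.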

Page 20: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Bayesian Inference

• Forward response of sensor is well known
  – Given real-world event x, what is H(x)?

• We need to reason backwards
  – Given sensor output H(x), what is x?

• Forward response and prior distribution of x
  – Probability of H(x) given x
  – Probability of a particular x existing

$$p(x \mid H(x)) = \frac{L(H(x) \mid x)\, p(x)}{\int L(H(x) \mid x)\, p(x)\, dx}$$
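
For a discrete set of possible real-world events, the integral in Bayes' rule becomes a sum. The sketch below works through that case; the prior and likelihood values are made-up numbers for illustration, and the function name `posterior` is hypothetical.

```python
def posterior(prior: dict, likelihood: dict) -> dict:
    """Discrete Bayes' rule: p(x | output) is proportional to
    L(output | x) * p(x), normalized over all candidate x."""
    unnormalized = {x: likelihood[x] * prior[x] for x in prior}
    total = sum(unnormalized.values())
    return {x: value / total for x, value in unnormalized.items()}


# Assumed prior: attacks are rare.
prior = {"attack": 0.01, "benign": 0.99}
# Assumed forward response: probability the sensor raises this alert given x.
likelihood = {"attack": 0.90, "benign": 0.05}

post = posterior(prior, likelihood)
print(post)
```

Even with a sensitive sensor, the posterior probability of an attack stays well below one here because the assumed prior is small, which is the reason reasoning backwards from sensor output needs the prior as well as the forward response.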

Page 22: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Attacker Model

• Attackers are not as easy to observe
  – Often we are only able to observe them through the sensors (IDS)

• State of the attack is difficult to describe

• We have three sources of attack data
  – Simulation
  – Dartmouth / Thayer network
  – Def Con

Page 23: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Simulation

• Purely generated data
  – Models for generating attack sequences and noise
  – Highly controllable – good for development

• Generated attacks with ‘background noise’
  – Use Thayer IDS for background noise
  – More interesting for testing

Page 24: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Dartmouth / Thayer Network

[Figure: network diagram; labels: Switch (4), SHADOW (2), Snort (3), ISTS, SignalQuest]

Page 25: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Def-Con Capture-The-Flag

• Hacker game

• Unrealistic data in some aspects
  – Lack of stealth, lack of firewall, etc.

• Many attacks, many scenarios
  – 16,250 events in 2.5 hours
  – 89 individual scenarios

• Classified by Oliver Dain at Lincoln Labs

Page 26: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

State Problem

• Desire to describe state as Markovian process
  – Reduces computational complexity and space

• Easy for an aircraft, difficult for an attack
  – Non-linear, non-contiguous space

[Figure labels: X, Y, Z; Yaw, Pitch, Roll; Position & Velocity?]

Page 27: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

State Problem

• No simple method for describing state

• Use a history of events in the track
  – Increases computational complexity
  – Increases memory requirements

• Use a weighted window of past events
  – Calculate various relationships between past and current events

Page 28: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Windowed History

• Minimum history needed to differentiate state

• Weighting of events to lend more value to recent events

• Relationships calculated between pairs and sequences of events

Xt-6 Xt-5 Xt-4 Xt-3 Xt-2 Xt-1 Xt
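
A weighted window could look like the following sketch. The exponential decay, the window length of seven (matching the X(t-6) to X(t) row above), and the example relationship (same source IP) are assumptions for illustration; the slides do not state the actual weighting or relationships.

```python
from collections import deque


def window_weights(n: int, decay: float = 0.5) -> list:
    """Normalized weights, most recent first; each older event counts
    `decay` times as much as the one after it."""
    raw = [decay ** age for age in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_score(history, current, relation, decay: float = 0.5) -> float:
    """Weighted average of relation(past, current) over the window."""
    recent_first = list(history)[::-1]
    if not recent_first:
        return 0.0
    weights = window_weights(len(recent_first), decay)
    return sum(w * relation(past, current) for w, past in zip(weights, recent_first))


# Window of the last 7 source IPs seen in a track (oldest first).
history = deque(["a", "b", "a"], maxlen=7)
same_source = lambda past, cur: 1.0 if past == cur else 0.0

score = weighted_score(history, "a", same_source)
print(round(score, 4))
```

Using a bounded `deque` keeps memory fixed per track, which addresses the memory concern raised for full event histories.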

Page 29: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Common History

• Don’t care which path was taken

• Just need to distinguish current state

[Figure labels: State1 (1a, 1b, 1c); State2 (2a, 2b, 2c)]

Page 30: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Predictive Model

• To determine likelihood of event belonging in series, predictive models are needed

• Based on current state, what is the probability distribution for the target motion?

• Different types of attacks have different distributions

Page 31: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Attacker Motion Probability Distributions

[Figure panels: motion update for scanning; motion update for DoS (Denial of Service)]

Events are readily distinguishable based on arrival time and source IP distance

Page 32: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Feature Extraction

• Historical data sets used to determine good differentiating feature sets

• These are used in combination to measure the fitness of new events to scenarios

• Use neural net to discover complex patterns

Page 33: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Neural Net

• Empirically derived probability distributions work well for simple attacks
  – But they are difficult to compute for more complex ones

• Machine learning is applied to solve this
  – Neural net feeds from event feature set values
  – Fitness function is calculated from this

Page 34: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Neural Net

• Fitness functions created for various feature subsets
  – e.g., rate of events vs. IP source velocity

• These values feed a neural net

• NN then determines overall fitness value
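
The idea of feeding per-subset fitness values into a network that outputs one overall fitness can be sketched as a tiny feed-forward pass. The architecture (two inputs, two hidden units, sigmoid activations) and the hand-picked weights are purely hypothetical; the network described in the slides was trained on data and its structure is not given.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def overall_fitness(features, hidden_w, hidden_b, out_w, out_b) -> float:
    """One hidden layer, sigmoid activations, single output in (0, 1)."""
    hidden = [
        sigmoid(sum(w * f for w, f in zip(ws, features)) + b)
        for ws, b in zip(hidden_w, hidden_b)
    ]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)


# Hand-set weights for illustration only (not trained values).
HIDDEN_W = [[2.0, 1.0], [1.0, 2.0]]
HIDDEN_B = [-1.5, -1.5]
OUT_W = [2.0, 2.0]
OUT_B = -2.0

# Inputs: fitness from two feature subsets, each already scaled to [0, 1].
good = overall_fitness([0.9, 0.8], HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
poor = overall_fitness([0.1, 0.2], HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
print(good > poor)
```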

Page 36: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Hypothesis Management

• In the brute-force approach, each new event doubles the number of hypotheses

• Without pruning, complexity grows exponentially

Page 37: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Branch and Prune

• Calculate all possible hypotheses

• Prune back unlikely or completed ones
  – Must be very aggressive in pruning
  – Many hypotheses are not kept long

• Inefficient method of controlling growth
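
A minimal sketch of the pruning step, assuming each hypothesis carries a single score: drop anything below a floor, then keep only the best few. The parameter names `keep` and `min_score` and their values are assumptions, not settings from the presented system.

```python
def prune(hypotheses, keep: int = 3, min_score: float = 0.05):
    """hypotheses: list of (name, score) pairs.
    Returns at most `keep` survivors, best first."""
    survivors = [h for h in hypotheses if h[1] >= min_score]
    survivors.sort(key=lambda h: h[1], reverse=True)
    return survivors[:keep]


candidates = [("h1", 0.50), ("h2", 0.01), ("h3", 0.30), ("h4", 0.15), ("h5", 0.04)]
kept = prune(candidates, keep=2, min_score=0.05)
print(kept)
```

Note that every candidate still had to be generated and scored before being discarded, which is the inefficiency the slide points out.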

Page 38: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Selective Branching

• Often there is a clear winner
  – Why bother creating hypotheses for the others?

• Measure difference between fitness of top choice and fitness of second choice

• If it is greater than a predetermined threshold, no branching is needed

• Number of branches can be determined with threshold
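
Selective branching might be sketched as follows: rank the candidate placements of a new event by fitness and branch only on those within a threshold of the best. The function name, the dictionary layout, and the threshold value are illustrative assumptions.

```python
def select_branches(fitness: dict, threshold: float) -> list:
    """fitness maps each candidate placement (an existing track or a new
    track) to its fitness. Returns the placements worth branching on."""
    ranked = sorted(fitness.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return []
    best = ranked[0][1]
    # Keep every candidate whose fitness is within `threshold` of the best.
    return [name for name, value in ranked if best - value <= threshold]


clear_winner = select_branches({"track1": 0.9, "track2": 0.4, "new": 0.1}, 0.2)
close_call = select_branches({"track1": 0.9, "track2": 0.8, "new": 0.1}, 0.2)
print(clear_winner)
print(close_call)
```

When the runner-up trails by more than the threshold, only one placement survives and no branch is created; a larger threshold admits more branches.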

Page 39: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Preprocessing and Multi-pass

• Some sequences of events are simply related

• Port scans
  – Noisy
    • Many events
    • Require many evaluations
  – Easily grouped

• Preprocessing groups these into single larger events
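
The grouping step could be sketched like this: consecutive port-scan reports from the same source that arrive close together are merged into one larger event. The tuple layout, the `max_gap` value, and the type name "portscan" are assumptions for illustration.

```python
def group_scans(events, max_gap: float = 2.0):
    """events: list of (time, src_ip, attack_type), sorted by time.
    Port-scan reports from one source within `max_gap` seconds of the
    previous one are merged into a single grouped event."""
    grouped = []
    open_scan = {}  # src_ip -> index of that source's latest scan group
    for time, src, kind in events:
        idx = open_scan.get(src) if kind == "portscan" else None
        if idx is not None and time - grouped[idx]["end"] <= max_gap:
            grouped[idx]["end"] = time
            grouped[idx]["count"] += 1
            continue
        if kind == "portscan":
            open_scan[src] = len(grouped)
        grouped.append(
            {"type": kind, "src": src, "start": time, "end": time, "count": 1}
        )
    return grouped


reports = [
    (0.0, "1.1.1.1", "portscan"),
    (0.5, "1.1.1.1", "portscan"),
    (1.0, "1.1.1.1", "portscan"),
    (1.2, "2.2.2.2", "buffer_overflow"),
    (10.0, "1.1.1.1", "portscan"),
]
merged = group_scans(reports)
for item in merged:
    print(item)
```

Five raw reports become three events here, so the tracker has fewer placements to evaluate.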

Page 40: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Multi-Pass Approach

• Develop small attack sequences initially

• Chain sequences together in later passes
  – Small sequences become atomic events

• May aid ‘missing data’ problem

[Figure: events a b c d f g h k l m are first grouped into the sequences a-b-c-d, f-g-h, and k-l-m, which are then chained into a-b-c-d-f-g-h-k-l-m]
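
A later pass that treats short sequences as atomic events and chains them might look like the sketch below. The `related` test used here (alphabetical gap of at most three letters between the end of one sequence and the start of the next) is a stand-in for a real fitness test and is purely an assumption for this toy example.

```python
def chain(sequences, related):
    """Second pass: each sequence is treated as one atomic event and
    appended to the previous chain when `related` says they belong together."""
    chains = []
    for seq in sequences:
        if chains and related(chains[-1], seq):
            chains[-1] = chains[-1] + list(seq)
        else:
            chains.append(list(seq))
    return chains


# Output of a first pass over the events a b c d f g h k l m.
first_pass = [["a", "b", "c", "d"], ["f", "g", "h"], ["k", "l", "m"]]

# Toy relation: tolerate a small gap (missing events e, i, j).
close_enough = lambda prev, nxt: ord(nxt[0]) - ord(prev[-1]) <= 3

second_pass = chain(first_pass, close_enough)
print("-".join(second_pass[0]))
```

Because the relation is evaluated between whole sequences rather than single events, a missing event between two sequences does not prevent them from being joined.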

Page 42: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Testing and Evaluation

• Testing has been performed with data collected from the Thayer network and DefCon data sets
  – Thayer testing used earlier probability distribution method
  – DefCon testing used machine learning approach

• Arranging for a live run at DefCon

Page 43: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Thayer Testing and Evaluation

• Testing performed on Thayer data
  – Roughly 1500 events
  – 20 Scenarios
  – Roughly half of data were single events

Page 44: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Thayer Testing and Evaluation

• Accuracy measured by number of correctly placed scenario events

• Best hypothesis had ~20% of the single events included in tracks

• Most confident hypothesis not always most accurate

Page 45: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

DefCon Testing and Evaluation

• Testing performed on DefCon data
  – 2.5 Hour time slice
  – Roughly 16,000 events
  – 89 Scenarios
  – Hand classified by Oliver Dain at Lincoln Labs

• Neural net approach used
  – Trained with random time slice of data

Page 47: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

DefCon Testing and Evaluation

From Dain & Cunningham (October, 2001)

Page 48: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

DefCon Testing and Evaluation

• Accuracy measured by number of correctly placed scenario events

• Achieved higher accuracy, but less stable with fewer hypotheses

Page 50: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Summary

• Reorganize data already being collected

• Provide ‘Higher level’ view of situation

• Reduce the work of the security analyst

• Radar tracking analogy

• Multisensor data fusion

• Multiple hypothesis tracking

Page 51: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Future Work

• Incorporate wider variety of sensors
  – Host-based IDS
  – System logs
  – Other network devices (firewall, router, etc.)

• Larger scale implementation
  – Scaling, timing, communications

• Integration with network analysis tools

Page 52: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Final Summary

[Figure labels: BAD, GOOD, Tracking System]

Questions?

Page 53: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Acknowledgements

• Linda Wilson and George Cybenko

• Oliver Dain & Richard Cunningham

• Robert Gray

• Robert Morris

• Daniel Bilar

• Guofei Jiang

• Chris Brenton

• Bill Stearns

Page 54: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Objectives

• Gather and correlate intrusion reports

• Develop attack sequences

• Reorganize existing data

• Using techniques from radar tracking applications

Page 55: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Multiple Target Tracking

Scan: 1 2 3 4 5 6

Each scan is a sweep of the radar or sampling of the IDS reports

Targets at each sweep are clear, but the paths are not

Page 56: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Multiple Target Tracking

• Hypotheses are generated and evaluated as new data arrives

• Belief in a hypothesis is recalculated with additional data

Page 57: Correlating Network Attacks Using Bayesian Multiple Hypothesis Tracking

Dartmouth / Thayer Network

[Figure: network diagram; labels: 100 Mb Switch, 10 Mb Sw (2), 100 Mb Sw, IDS (3), Outside]