RelSamp: Preserving Application Structure in Sampled Flow Measurements Myungjin Lee, Mohammad Hajjat, Ramana Rao Kompella, Sanjay Rao

Feb 24, 2016

Page 1: RelSamp : Preserving Application Structure in Sampled Flow Measurements

RelSamp: Preserving Application Structure in Sampled Flow Measurements

Myungjin Lee, Mohammad Hajjat, Ramana Rao Kompella, Sanjay Rao

Page 2: RelSamp : Preserving Application Structure in Sampled Flow Measurements

A plethora of Internet applications

1) Emergence of new applications
2) Measure/Monitor
3) Characterization

Objectives
  Re-provision networks
  Detect undesirable behaviors of applications
  Prepare network better against major application trends

Page 3: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Monitoring applications at an edge

Goal: Monitoring application behavior
  Identify number of flows
  Identify number of packets
Current Solution: Sampled NetFlow
  Supported by most modern routers
Key limitation: Application session structure gets distorted
  Small # of flows per application session
  Small # of packets per application session

[Figure: enterprise network connected to the Internet through an edge router running Sampled NetFlow]

Page 4: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Preserving application structure in flow measurements

Benefit 1: Enables continuous monitoring of applications
  Better understanding about communication patterns
  Better understanding of characteristics (# of flows, packets)
Benefit 2: Application classification becomes easier
  Statistical machine learning techniques: SVM, C4.5, etc.
  Social behavior-based classifier: BLINC
Benefit 3: Detecting undesirable traffic patterns of an application

Page 5: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Contributions

Introduce the notion of related sampling
  Flows belonging to the same application session are sampled with higher probability
Propose RelSamp architecture for realizing related sampling
  Uses three stages of sampling to preserve application structure
Show efficacy in preserving application structure
  Captures more flows per application session
  Significant increase in application classification accuracy

Page 6: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Related sampling

[Figure: flows of App1, App2, and App3 shown under three views: original application structure, Sampled NetFlow, and related sampling]

Key idea: Sample more flows from fewer application sessions

Page 7: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Realizing related sampling

Question 1: How to sample an application session?

Question 2: How to sample packets within an application session?

Page 8: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Defining application session

A sequence of packets from an application on a given host with inter-arrival time ≤ τ seconds
  Packets may belong to different flows to different destinations
Example 1: BitTorrent connections to several destinations within a short span of time constitute an application session
Example 2: Web connections from a browser several seconds apart constitute different application sessions
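The definition above can be illustrated with a minimal sketch that groups packet arrival times into sessions using the inter-arrival threshold τ. The function name `split_sessions`, the sample arrival times, and the value τ = 2.0 are assumptions made for illustration; they do not come from the slides, and the input is assumed to hold packets of one application on one host only.

```python
from typing import List


def split_sessions(timestamps: List[float], tau: float) -> List[List[float]]:
    """Group packet arrival times (in seconds) into application sessions.

    A packet joins the current session if it arrives within tau seconds
    of the previous packet; otherwise it starts a new session.
    """
    sessions: List[List[float]] = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= tau:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


# Hypothetical arrival times for one (host, application) pair.
arrivals = [0.0, 0.4, 0.9, 7.0, 7.2, 20.0]
print(split_sessions(arrivals, tau=2.0))
# -> [[0.0, 0.4, 0.9], [7.0, 7.2], [20.0]]
```

Note that this offline grouping needs to know which packets belong to which application, which is exactly the information a router does not have online.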

Page 9: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Sampling an application session

One possible approach: Similar to Sampled NetFlow
  Sample packets with some probability
  Create an application session record if no record exists
  Update the application session record
Problem: Hard to do in an online fashion
  No application session identifier (like flow key)
  Need to know all flows that constitute an application session
  DPI-based techniques are both difficult and incomplete

Page 10: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Our approach: sampling hosts

Observation: A host is a superset of an application session
  Sample more flows from the same host
Flows originating at the same host close together in time typically belong to a few application sessions
  About 80% of hosts run fewer than 2 applications in our study
  More details in the paper

Page 11: RelSamp : Preserving Application Structure in Sampled Flow Measurements

RelSamp design

Three-stage sampling process consisting of host, flow, and packet selection stages
Host stage: hash-based sampling
  No state maintained on a per-application basis
  Many application sessions for a given host are possibly sampled
  Change hash function periodically to track different hosts
Flow and packet stages: random packet sampling
  Controls fraction of flows sampled in an application session and packets sampled in a flow
Post processing: Can separate flow records into application sessions using port-based/statistical classifiers
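The host stage could be sketched as follows. This is only an illustration of hash-based selection, not the authors' implementation: the choice of SHA-256, the 32-bit hash space, and the `salt` parameter (standing in for "changing the hash function periodically") are all assumptions.

```python
import hashlib

HASH_SPACE = 2 ** 32  # assumed size of the hash space


def host_hash(src_ip: str, salt: int) -> int:
    """Map a source IP to a point in the hash space; a new salt acts as a new hash function."""
    digest = hashlib.sha256(f"{salt}:{src_ip}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


def host_selected(src_ip: str, p_h: float, salt: int) -> bool:
    """Select the host if its hash falls inside the selection range (a p_h fraction of the space)."""
    selection_range = int(p_h * HASH_SPACE)
    return host_hash(src_ip, salt) < selection_range


# Hypothetical host population: the decision is the same for every packet of a
# host, so no per-host or per-application state is needed.
hosts = [f"10.0.{i // 256}.{i % 256}" for i in range(10000)]
selected_now = {h for h in hosts if host_selected(h, p_h=0.1, salt=1)}
selected_later = {h for h in hosts if host_selected(h, p_h=0.1, salt=2)}
print(len(selected_now), len(selected_later), len(selected_now & selected_later))
```

Because the decision depends only on the source IP and the current salt, all flows of a selected host reach the later stages, and rotating the salt lets a different set of hosts be tracked over time.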

Page 12: RelSamp : Preserving Application Structure in Sampled Flow Measurements

RelSamp architecture

[Figure: a packet passes through three bias stages, each with a tunable parameter (Ph, Pf, Pp), before reaching flow memory]

Host-level bias stage (Ph): H(SrcIP) is compared against the selection range of the hash space; Ph = selection range / hash space
Flow-level bias stage (Pf): if ( random no. ≤ Pf && no flow record) create a flow record
Pkt-level bias stage (Pp): if ( random no. ≤ Pp && flow record) update the flow record
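A simplified reading of the three stages can be sketched in code. This is an assumption-laden sketch, not the architecture from the paper: it treats the stages as sequential filters, uses CRC32 as the host hash, counts the packet that creates a record, and uses a strict comparison so that a probability of 0 never fires. The class name `RelSampSketch` and its fields are hypothetical.

```python
import random
import zlib
from dataclasses import dataclass
from typing import Dict, Tuple

# Flow key: (src IP, dst IP, src port, dst port, protocol)
FlowKey = Tuple[str, str, int, int, int]
HASH_SPACE = 2 ** 32  # CRC32 yields an unsigned 32-bit value


@dataclass
class FlowRecord:
    packets: int = 0
    byte_count: int = 0


class RelSampSketch:
    """Host, flow, and packet bias stages applied in sequence (simplified)."""

    def __init__(self, p_h: float, p_f: float, p_p: float, seed: int = 0) -> None:
        self.p_h, self.p_f, self.p_p = p_h, p_f, p_p
        self.rng = random.Random(seed)
        self.flow_memory: Dict[FlowKey, FlowRecord] = {}

    def host_selected(self, src_ip: str) -> bool:
        # Host stage: the hash of the source IP must fall in the selection range.
        return zlib.crc32(src_ip.encode()) < self.p_h * HASH_SPACE

    def process(self, key: FlowKey, size: int) -> None:
        if not self.host_selected(key[0]):
            return
        record = self.flow_memory.get(key)
        if record is None:
            # Flow stage: create a record with probability p_f.
            if self.rng.random() < self.p_f:
                self.flow_memory[key] = FlowRecord(packets=1, byte_count=size)
        elif self.rng.random() < self.p_p:
            # Packet stage: update the existing record with probability p_p.
            record.packets += 1
            record.byte_count += size


# Hypothetical traffic: 200 flows of 20 packets each from 50 hosts.
sampler = RelSampSketch(p_h=0.2, p_f=0.8, p_p=0.1)
for flow_id in range(200):
    key = (f"10.0.0.{flow_id % 50}", "192.0.2.7", 40000 + flow_id, 6881, 6)
    for _ in range(20):
        sampler.process(key, size=100)
print(len(sampler.flow_memory), "flow records created")
```

With a low Ph and a high Pf, the records concentrate on a few hosts but cover most of their flows, which is the bias related sampling aims for.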

Page 13: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Exploring parametric space

Router sampling budget Pe = f(Ph, Pf, Pp)
Trade-off between accuracy of flow statistics and # flows/application session
Parameters can be tuned depending on
  Objective
  Network environment
Examples of tuning parameters by objective
  Application classification: low Ph, high Pf, low Pp
  Application characterization: lower Ph, high Pf, high Pp
  Flow statistics of all flows: Ph = Pf = Pp = Pe

Page 14: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Evaluation goals

Application characterization
  Question 1: Is RelSamp effective at sampling more flows in an application session?
  Question 2: Can RelSamp estimate statistics of an application session?
Application classification
  Question 3: Is sampling more flows in an application session beneficial for application classification?

Page 15: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Experimental setup

Evaluation of effectiveness for capturing more flows
  Trace 1: 1-hour packet trace collected at an edge
  RelSamp configuration (other settings in paper): Capture more flows of app session from many hosts [parameter values lost in extraction]
Evaluation of application classification accuracy
  Trace 2: 13-hour full-payload trace captured at a dorm network
  RelSamp setting: Similar setting, but [parameter lost in extraction] varies from 0.1 to 1.0
  Classifiers: BLINC [SIGCOMM ’05], SVM, and C4.5
  Ground truth is obtained using DPI-based classifier (tstat)

Page 16: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Flows per application session

[Figure: CDF of # captured flows / # total flows in an app session; annotation: more # of flows per app session]
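The metric on the x-axis of this plot, the fraction of an application session's flows that were captured, and its empirical CDF could be computed along these lines. The function names, the session identifiers, and the flow counts below are hypothetical and only illustrate the calculation.

```python
from typing import Dict, List, Tuple


def capture_ratios(total_flows: Dict[str, int], captured_flows: Dict[str, int]) -> List[float]:
    """Per session: # captured flows / # total flows (sessions never sampled count as 0)."""
    return [captured_flows.get(s, 0) / n for s, n in total_flows.items() if n > 0]


def empirical_cdf(values: List[float]) -> List[Tuple[float, float]]:
    """Return (value, fraction of samples <= value) pairs in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Hypothetical flow counts for three application sessions.
total = {"s1": 10, "s2": 4, "s3": 5}
captured = {"s1": 5, "s2": 1}
ratios = capture_ratios(total, captured)
print(ratios)  # -> [0.5, 0.25, 0.0]
for value, fraction in empirical_cdf(ratios):
    print(value, fraction)
```

A curve shifted to the right means that more sessions have a large share of their flows captured.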

Page 17: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Accuracy of BLINC classifier

[Figure: accuracy (%) vs. sampling rate; annotation: ~50% increase]

Note: classification results on flows using non-standard port

Page 18: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Related work

Flow Sampling [ToN ’06]
  Samples flows once flow record is created
Flow Slices [IMC ’05]
  Focuses on controlling router resources (CPU and memory)
cSamp [NSDI ’08]
  Supports sampling of all traffic by coordinating various vantage points in a network
FlexSample [IMC ’08]
  Supports monitoring of traffic subpopulations, but needs to maintain extra state for approximate checking of predicates

Page 19: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Summary

Introduced the notion of related sampling
  Samples related flows in the same application session with higher probability
Proposed RelSamp architecture
  Preserves application structure in sampled flow records
Effective at preserving application session structure
  5-10x more flows per application session compared to Sampled NetFlow
  Up to 50% higher classification accuracy than Sampled NetFlow

Page 20: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Thank you! Questions?

Page 21: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Evaluation method of classification techniques

[Figure: a packet trace is fed to RelSamp, Sampled NetFlow, and Flow Sampling, producing Flow Record 1, 2, and 3; a classification algorithm (e.g., BLINC, SVM, C4.5) labels the flow records and produces a report; ground truth comes from a DPI-based classifier (Tstat) run on the packet trace]
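The comparison of a classifier's output against the DPI-based ground truth could be scored as in the sketch below. The accuracy definition used here (share of classified flows whose label matches the ground truth) is an assumption, as are the function name, the flow identifiers, and the labels; the slides do not spell out the exact metric.

```python
from typing import Dict, Hashable


def classification_accuracy(predicted: Dict[Hashable, str],
                            ground_truth: Dict[Hashable, str]) -> float:
    """Percentage of classified flows whose label matches the ground truth.

    Flows without a ground-truth label are ignored.
    """
    common = [flow for flow in predicted if flow in ground_truth]
    if not common:
        return 0.0
    correct = sum(predicted[flow] == ground_truth[flow] for flow in common)
    return 100.0 * correct / len(common)


# Hypothetical labels for four sampled flows.
truth = {"f1": "p2p", "f2": "web", "f3": "p2p", "f4": "dns"}
labels = {"f1": "p2p", "f2": "web", "f3": "web", "f4": "dns"}
print(classification_accuracy(labels, truth))  # -> 75.0
```

Running the same scoring on the flow records produced by each sampler gives one accuracy figure per sampling scheme.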

Page 22: RelSamp : Preserving Application Structure in Sampled Flow Measurements

Comparison with other solutions using BLINC

[Figure: # of accurately classified flows vs. sampling rate]

Note: classification results on flows using non-standard port