Revisiting the Case for a Minimalist Approach for Network Flow Monitoring
Vyas Sekar, Michael K. Reiter, Hui Zhang
Many Monitoring Applications
Traffic Engineering
Analyze new user apps
Anomaly Detection
Network Forensics
Worm Detection
Accounting
Botnet analysis
…
Need to estimate different metrics
Traffic Engineering
Analyze new user apps
Anomaly Detection
Network Forensics
Worm Detection
Accounting
Botnet analysis
…
“Heavy-hitters”
“Degree histogram”, “Entropy”, “Changes”
“SuperSpreaders”
“Flow size distribution”
How are these metrics estimated?
Traffic
Packet Processing
Counter Data Structures
Application-Level Metrics
Monitoring (on router)
Computation (off router)
Today’s solution: Packet Sampling
Traffic
Packet Processing
Counter Data Structures
Monitoring (on router)
Computation (off router)
Sample packets uniformly
FlowId → Pkt/Byte Counts
Compute metrics on sampled flows
Estimation is inaccurate for fine-grained analysis
Extensive literature on limitations for many tasks!
Application-Level Metrics
Flow = Packets with same Src/Dst Addr and Ports
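The per-packet decision on this slide can be sketched in Python; the `PacketSampler` class and its interface are illustrative assumptions, not NetFlow's actual implementation:

```python
import random

class PacketSampler:
    """Uniform packet sampling: keep each packet independently with
    probability q, then aggregate the sampled packets into per-flow
    packet/byte counters."""

    def __init__(self, q, seed=None):
        self.q = q
        self.rng = random.Random(seed)
        self.table = {}  # flowid -> (pkt_count, byte_count)

    def process(self, flowid, nbytes):
        # The sampling decision is per-PACKET, so large flows are
        # over-represented and small flows are often missed entirely --
        # the source of the fine-grained inaccuracy noted above.
        if self.rng.random() < self.q:
            pkts, byts = self.table.get(flowid, (0, 0))
            self.table[flowid] = (pkts + 1, byts + nbytes)
```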
Trend: Shift to Application-Specific
Traffic
One pipeline per metric: Packet Processing → Counter Data Structures → Application-Level Metric
(Flow Size Distribution, Entropy, Superspreader, …)
Complexity: Need per-metric implementation
Early commitment: Applications are a moving target
What do we ideally want?
Traffic
Packet Processing
Counter Data Structures
Application-Specific Metrics
Monitoring (on router)
Computation (off router)
Simple
High accuracy
Support many applications
Outline
• Motivation
• A Minimalist Alternative
• Evaluation
• Summary and discussion
Requirements
Applications: Anomaly detection, Worm detection, Accounting, Botnet analysis, …
1. Simple router implementation
2. General across applications
3. Enable drill-down capabilities
4. Network-wide views
How do we meet these requirements?
1. Simple router implementation
2. General across applications
3. Enable drill-down capabilities
4. Network-wide views
Delay binding to specific applications
What does it mean to delay binding?
Traffic
Packet Processing
Counter Data Structures
Application-Level Metrics
Monitoring (on router)
Computation (off router)
Instead of splitting resources, aggregate into generic primitives
Keep this stage as “generic” as possible
What Generic Primitives?
Two broad classes of monitoring tasks:
1. Communication structure, e.g., who talked to whom?
2. Volume structure, e.g., how much traffic?
Flow sampling [Hohn, Veitch IMC ’03]
Sample and Hold [Estan, Varghese SIGCOMM ’02]
Flow Sampling
Traffic
Packet Processing
Counter Data Structures
Hash(5-tuple); if hash < r, update
FlowId → Pkt/Byte Counts
Flow = Packets with same Src/Dst Addr and Ports
Pick flows at random; not biased by flow size
Good for “communication” patterns
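The hash-based rule above can be sketched as follows; the 64-bit hash normalization and the `FlowSampler` class are illustrative assumptions, not the authors' implementation:

```python
import hashlib

def flow_hash(five_tuple):
    # Map a flow 5-tuple to a pseudo-random value in [0, 1).
    key = "|".join(str(f) for f in five_tuple).encode()
    return int.from_bytes(hashlib.sha1(key).digest()[:8], "big") / 2**64

class FlowSampler:
    """Flow sampling: a flow is selected iff Hash(5-tuple) < r, so the
    decision is per-FLOW; every packet of a selected flow is counted.
    Selection is independent of flow size ("communication" patterns)."""

    def __init__(self, r):
        self.r = r
        self.table = {}  # flowid -> (pkt_count, byte_count)

    def process(self, five_tuple, nbytes):
        if flow_hash(five_tuple) < self.r:
            pkts, byts = self.table.get(five_tuple, (0, 0))
            self.table[five_tuple] = (pkts + 1, byts + nbytes)
```

Because the same hash decides every packet of a flow, a sampled flow's packet and byte counts are exact, unlike packet sampling.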
Sample and Hold
Traffic
Packet Processing
Counter Data Structures
FlowId → Pkt/Byte Counts
Flow = Packets with same Src/Dst Addr and Ports
If flow in table, update; if new, create entry with prob p
Accurate counts of large flows
Good for “volume” queries
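The update rule above can be sketched as follows; the `SampleAndHold` class and its interface are illustrative, not the original implementation from Estan and Varghese:

```python
import random

class SampleAndHold:
    """Sample-and-hold: each packet of a flow NOT yet in the table is
    sampled with probability p; once a flow is held, all of its later
    packets are counted.  Large flows are caught early, so their
    counters are near-exact ("volume" queries)."""

    def __init__(self, p, seed=None):
        self.p = p
        self.rng = random.Random(seed)
        self.table = {}  # flowid -> (pkt_count, byte_count)

    def process(self, flowid, nbytes):
        if flowid in self.table:
            # Held flow: count every subsequent packet exactly.
            pkts, byts = self.table[flowid]
            self.table[flowid] = (pkts + 1, byts + nbytes)
        elif self.rng.random() < self.p:
            # New flow: admit it with probability p.
            self.table[flowid] = (1, nbytes)
```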
How do we meet these requirements?
1. Simple router implementation
2. General across applications
3. Enable drill-down capabilities
4. Network-wide views
Delay binding to specific applications
Generic primitives = FS, SH
Retain NetFlow’s operational model
Retain NetFlow operational model
Application-Specific: summary statistics only (FSD, Degree Histogram, Entropy)
Difficult to do further analysis, e.g., why is X high?
Minimalist: flow reports from FS+SH → FSD, Degree Histogram, Entropy, …
Can estimate new metrics!
How do we meet these requirements?
1. Simple router implementation
2. General across applications
3. Enable drill-down capabilities
4. Network-wide views
Retain NetFlow’s operational model; keep flow reports
Network-wide resource management
Delay binding to specific applications
Generic primitives = FS, SH
Network-Wide Sample-and-Hold
Repeating Sample-and-Hold at every router on a path wastes resources
Do it once per path (at the ingress)
Network-Wide Flow Sampling
Use cSamp [NSDI ’08] to configure flow sampling capabilities:
Hash-based coordination → routers log non-overlapping sets of flows
Network-wide optimization → operator goals, e.g., per-path guarantees
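A toy sketch of the hash-range assignment idea: real cSamp solves a network-wide optimization per OD-pair, so the equal split and the `assign_ranges` name here are illustrative assumptions only:

```python
def assign_ranges(path_routers, frac):
    """Split the fraction `frac` of flow-hash space to be sampled on a
    path into equal, non-overlapping sub-ranges, one per router, so
    each flow on the path is logged by exactly one router."""
    width = frac / len(path_routers)
    return {router: (i * width, (i + 1) * width)
            for i, router in enumerate(path_routers)}
```

A router then applies flow sampling only to flows whose hash falls in its assigned sub-range, which eliminates duplicate logging without any per-packet communication between routers.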
Putting the pieces together: “Minimalist” Proposal
Traffic
Flow Sampling
FlowId → Pkt/Byte Counts
h = Hash(flowid); if h in FS_Range(path): create/update entry
Sample & Hold
If Ingress(path): if flow in table, update; if new, create entry with prob SH_p(path)
FS_Range(path) and SH_p(path) are configuration parameters, e.g., set via network-wide optimization using cSamp+
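The two per-router rules can be combined into one sketch; the `MinimalistMonitor` class, the SHA-1 based hash, and the `fs_range`/`sh_p` parameter names are illustrative stand-ins for FS_Range(path) and SH_p(path), which the slide says come from network-wide optimization:

```python
import hashlib
import random

def flow_hash(flowid):
    # Map a flow id to a pseudo-random value in [0, 1).
    digest = hashlib.sha1(repr(flowid).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

class MinimalistMonitor:
    """Per-router logic of the minimalist proposal: flow sampling on an
    assigned non-overlapping hash range, plus sample-and-hold run only
    at the path's ingress router."""

    def __init__(self, fs_range, sh_p, is_ingress, seed=None):
        self.fs_range = fs_range      # (lo, hi) hash range for this path
        self.sh_p = sh_p              # sample-and-hold probability
        self.is_ingress = is_ingress  # SH runs once per path, at ingress
        self.rng = random.Random(seed)
        self.fs_table = {}
        self.sh_table = {}

    def _bump(self, table, flowid, nbytes):
        pkts, byts = table.get(flowid, (0, 0))
        table[flowid] = (pkts + 1, byts + nbytes)

    def process(self, flowid, nbytes):
        # Flow sampling: log the flow iff its hash falls in this
        # router's assigned range.
        lo, hi = self.fs_range
        if lo <= flow_hash(flowid) < hi:
            self._bump(self.fs_table, flowid, nbytes)
        # Sample-and-hold: ingress only; held flows get exact counts.
        if self.is_ingress:
            if flowid in self.sh_table:
                self._bump(self.sh_table, flowid, nbytes)
            elif self.rng.random() < self.sh_p:
                self.sh_table[flowid] = (1, nbytes)
```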
What do we ideally want?
Traffic
Packet Processing
Counter Data Structures
Application-Specific Metrics
Monitoring (on router)
Computation (off router)
Simple
High accuracy
Support many applications
✔ ✔ ?
Outline
• Motivation
• A Minimalist Alternative
• Evaluation – Compare FS+SH vs. application-specific
• Summary and discussion
Assumptions in resource normalization
• Hardware requirements are similar
  – Both need per-packet array/key-value updates
  – More than pkt sampling, but within router capabilities
• Processing costs
  – Online cost lower for minimalist (don’t need per-app instances)
  – Offline cost is higher for minimalist (but can be reduced, if necessary)
• Reporting bandwidth
  – Higher for minimalist, but < 1% of network capacity
• Memory for counters
  – Bottleneck is SRAM (flow headers can be offloaded to DRAM)
  – We conservatively assume 4X more per-counter cost
Head-to-Head Comparison
Application-Specific: run each algorithm (Flow Size Distribution + Outdegree Histogram + …); sum their memory and normalize the SRAM budget
Minimalist: give FS+SH that aggregate budget; estimate FSD, Entropy, Degree, … from its flow reports
Relative accuracy difference = (Accuracy(Minimalist) − Accuracy(AppSpecific)) / Accuracy(AppSpecific)
Application portfolio: the set of metrics estimated together
Resource split between FS and SH
We pick an 80-20 split as a good operating point
Relative difference is positive for most applications! (+ good, − bad)
Setup: run application-specific algorithms with recommended parameters (details in paper); measure memory use; run FS+SH with the aggregate, but normalized (1/4X), memory
Packet trace from CAIDA; consistent over other traces
Varying the application portfolio
Minimalist vs. Application-specific under the same resources (+ good, − bad)
With more tasks, or some resource-intensive ones: better across the entire portfolio!
“Sharing” effect across estimation tasks
(y-axis: relative accuracy difference; x-axis: application portfolio)
Packet trace from CAIDA; consistent over other traces
Network-Wide View
Estimation (error metric)        App-Specific    Uncoordinated FS+SH   Coordinated FS+SH
FSD (WMRD)                       0.16            0.19                  0.02
Heavy Hitter (miss rate)         0.02            0.3                   0.04
Entropy (relative error)         not available   0.03                  0.02
SuperSpreader (miss rate)        0.02            0.04                  0.01
Deg. Histogram (JS divergence)   0.15            0.03                  0.02
(lower is better)
App-Specific: configured per-ingress, can’t get a network-wide view
Uncoordinated FS+SH: introduces some biases due to duplicates
1. App-Specific: difficult to generate different views, e.g., per-OD-pair
2. Coordination: better performance & operational simplicity
Flow-level traces from Internet2. Configure Application-Specific per PoP; measure resource consumption, normalize, and give to network-wide FS+SH
Conclusions and discussion
Even a simple “minimalist” approach might work
Key: Focus on the application portfolio rather than individual tasks
Proposal: FS + SH (complementary primitives); cSamp-like network-wide management
• Implications for device vendors and operators
  – Late binding, lower complexity
• Quest for feasibility, not optimality
  – Better primitives, combinations, estimation? Is this sufficient?