Top Banner
Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1 , Murali Kodialam 1 , T. V. Lakshman 1 , Hui Zhang 2 , 1 Bell Labs, Lucent Technologies 2 University of Southern California
20

Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

Dec 16, 2015

Download

Documents

Janelle Gross
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

Fast, Memory-Efficient Traffic Estimation by Coincidence Counting

Fang Hao1, Murali Kodialam1, T. V. Lakshman1, Hui

Zhang2,

1Bell Labs, Lucent Technologies

2University of Southern California

Page 2: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

2

Traffic flow measurement

Related work:

•Sample and hold [Estan et al. 2002], Smart Sampling [Duffield et al. 2004],

RATE [Kodialam et al. 2004], ACCEL-RATE[Hao et al. 2004], etc.

Flow rate pf : the proportion of packets belonging to flow f during a

certain time period.

•Arrival rate rf = pf : - total arrival rate (packets / second)

router

packets

flows

Flow rate statistics

network link

Page 3: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

3

Problem definition• What’s the traffic flow composition through a network

link during a certain time period, given an estimation

accuracy requirement (, )?

- confidence percentile;

- estimation error.

• For any given flow f with its rate be pf, determine an

estimate p’f for pf such that

p’f (pf - /2, pf +/2)

with probability greater than .

e.g., = 0.9975, = 0.0001.

Page 4: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

4

One solution – naive counting• Directly counting the number of packets for each flow.

It’s simple;

It estimates the rate of the flows rapidly;

But it requires frequent access of a large amount of high-speed memory.

•There can be millions of active flows on backbone links [Duffield et al. 2001]

Page 5: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

5

Arrival model1. Flow rates are stationary during the estimation

period.

An arriving packet belongs to flow f with probability

pf.

2. Packet arrivals are independent.

An arriving packet belongs to flow f independent of

other arrivals.

Page 6: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

6

Performance metrics

• Sample size (estimation time): the number of

arrivals needed to achieve the specified estimation

accuracy.

• Memory size: the number of flows that are tracked

during the estimation.

Page 7: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

7

Can we …

• Can we design a scheme which runs as fast as naïve

counting but only catches “interesting” flows?

• By counting the exact number of arrivals for each flow,

naïve accounting requires minimally arrivals to

meet the accuracy requirement (, ).

This will capture at least one packet in expectation for any flow f with

pf (Z=3, = 0.0001, ).

2

2

Z

2

2

Z

?2

2

Z

?2

2

Z

!2

2

Z

Page 8: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

8

Problem re-definition• What’s the traffic flow composition through a network

link during a certain time period given an estimation

accuracy requirement (, , , )?

- confidence percentile; - estimation error;

- threshold rate; - error relaxation factor.

• For any given flow f with its rate be pf, determine an

estimate p’f for pf such that

p’f (pf - /2, pf +/2) if pf

p’f (pf - /2, pf + /2) if pf

with probability greater than .

e.g., = 0.9975 (Z=3), = 0.0001, = 0.01, = 10.

Page 9: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

9

Our solution – CATE

• Coincidence bAsed Traffic Estimation

It’s simple;

It estimates the size of the flows rapidly;

It requires a small amount of memory.

• A generalization of RATE scheme [Kodialam et al. 2004]

Page 10: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

10

CATE – scheme description (I)

k-lengthk-length

Page 11: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

11

CATE – scheme description (II)• Estimation procedure

1. Specify the estimation accuracy requirement (, , , );

2. Calculate sample size N, memory size M, and k (the length of PT);

3. Run and count the coincidences CC(f) for each flow f;

4. At the end of the estimation, output the estimated flow rates as

a) For each flow f in CCT, p’(f) = ;

b) For the rest of the flows, report 0 as the estimated rates.

KN

fcc

)(

Page 12: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

12

Intuition behind CATE

• Counting coincidence dramatically amplifies (squares) the

ratio of catching probability between a large flow and a

small flow.

Good news: CATE sample size is still for the estimation

accuracy requirement (, , , ).

• Multiple (i.e., k) comparisons increase the number of

membership testing to kN with N arrivals.

Good news: the reduction on estimation variance due to increase in testing

number is no less than the increase on estimation variance due to comparison

correlation.

)(2

2

Z

Page 13: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

13

• Given the accuracy requirement (, ) and k-

length predecessor table for CATE,

The minimal sample size

CATE – sample size

12

2

kwhenZ

N

Specifically, for a flow f with pf , the minimal sample size12

1

k

2

23

k

ZN f

Page 14: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

14

• Given an estimation accuracy requirement (, , , ), if

then setting

gives

the sample size =

Theorem 1 – CATE sample size

)12

(3

)11

( 2

)11

(2

1

k

2

23

k

Z

= 99.9 % = 0.0001 = 0.01 = 10 k = 50 = 0.001 = 20 k = 500…

Page 15: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

15

• Given the estimation setting in Theorem 1,

the maximum expected memory

Theorem 2 – CATE memory size

Z11.1

Totally 100,000 flows, rate range [10-6, 1]Z = 3= 0.002CATE will catch no more than 1650 flows.Naïve counting will record all flows w.h.p.

Page 16: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

16

CATE – experiment 1

• Real IP traces from NLANR;

• A size-1000 buffer between new arrival and PT.

(The memory size = 547) (The memory size = 1464)

Page 17: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

17

CATE – experiment 2 (I)

• methodology

Totally 1 million flows, synthetic traces;

5 “large” flows, rates uniformly distributed between 0.1 and 0.2 of the

entire traffic;

1000 “medium-sized” flows, rates uniformly distributed between

0.0001 and 0.0002 of the entire traffic;

All the rest are “small” flows, each with rate roughly 10-7;

Deliberately chose a short sample time (1 million packet time) to

illustrate the impact of k, the predecessor table length.

Page 18: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

18

CATE – experiment 2 (II)

Page 19: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

19

CATE – experiment 2 (III)

Page 20: Fast, Memory-Efficient Traffic Estimation by Coincidence Counting Fang Hao 1, Murali Kodialam 1, T. V. Lakshman 1, Hui Zhang 2, 1 Bell Labs, Lucent Technologies.

20

Conclusion & future work

• CATE, a memory efficient traffic estimation scheme

as fast as naïve counting.

• Future work:

Extending CATE for byte rate estimation.

Extending CATE to minimize the impact of arrival dependence

without (excessive) additional overhead.