1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720-1776

Jan 18, 2018

Dina Walsh

Page 1

CS 268: Lecture 14

Internet Measurements

Scott Shenker and Ion Stoica

Computer Science Division

Department of Electrical Engineering and Computer Sciences

University of California, Berkeley

Berkeley, CA 94720-1776

Page 2

Project Presentations

Five slides

Five minutes

Five questions

You’re outta there!

Page 3

1st slide: Title

2nd slide: motivations and problem formulation

- Why is the problem important?

- What is challenging/hard about your problem?

3rd slide: main idea of your solution

4th slide: status

5th slide: future plans and schedule

Page 4

Poll

How many people know:

- Lamport clocks

- Timestamp vectors

- Byzantine agreement

- Epidemic/gossip dissemination algorithms

- Consistency

- Bayou

Page 5

Internet Measurements

This field went from casual sideline to serious science with Vern Paxson’s thesis in 1997

Now there is a massive literature in Internet measurements

Two devoted conferences (PAM and IMC) in addition to standard networking conferences

Page 6

Today’s Lecture

Provide a quick overview of the kinds of techniques used and questions addressed

Tom Anderson (UW) will lead us in a design exercise

Page 7

Philosophical Question

Why do we take Internet measurements?

Page 8

Measurement Techniques

Passive:

- Traces

- NetFlow data

- Network telescopes

- BGP looking glass sites

- ……

Active:

- Ping

- Traceroutes

- ……

Page 9

Measurement Infrastructure

NIMI

PlanetLab

Telescopes (CAIDA)

Traceroute servers

…..

Page 10

Varieties of Measurement Studies

Basic characterization from direct measurements

- Raw statistics

- Model fitting

- Just because it is “direct” doesn’t mean it is easy!

• Look at the care taken by Paxson in the two readings

Deep inference

- Estimating something that can’t be directly measured

Page 11

Examples of Direct Studies

Packet sizes and protocol type

Flow sizes

Route stability

DNS usage

Stationarity

…….

Page 12

A Few Results

Most flows are small, but most bytes are in large flows

Most flows are slow, but most bytes are in fast flows

- Rate/size highly correlated (not due to slow start)

Losses are reasonably modeled as a Poisson arrival of loss events, each followed by a geometrically distributed loss train

Internet routing does not yield good paths….

In 2003:

- Bytes: TCP 83%, UDP 16%

- Packets: TCP 75%, UDP 22%

- Flows: TCP 56%, UDP 33%
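The loss model mentioned on this slide (Poisson arrival of loss events, each followed by a geometric loss train) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the names `geometric_train` and `simulate_losses`, the event rate, and the continuation probability `q` are hypothetical and not taken from the lecture or the readings.

```python
import random


def geometric_train(rng, q):
    """Length of one loss train: at least 1 packet, continuing with probability q."""
    length = 1
    while rng.random() < q:
        length += 1
    return length


def simulate_losses(rate, q, duration, rng):
    """Return (time, train_length) pairs for loss events in [0, duration).

    Loss events form a Poisson process, so gaps between events are
    exponential with the given rate (events per second, an assumed value).
    """
    events = []
    t = rng.expovariate(rate)
    while t < duration:
        events.append((t, geometric_train(rng, q)))
        t += rng.expovariate(rate)
    return events


# Illustrative parameters only: 10 loss events/s, continuation probability 0.5.
rng = random.Random(0)
events = simulate_losses(rate=10.0, q=0.5, duration=1000.0, rng=rng)
mean_train = sum(length for _, length in events) / len(events)
print(f"{len(events)} loss events, mean train length {mean_train:.2f}")
```

Under these assumptions the mean train length should be close to 1/(1 - q), since each train is geometric with success probability 1 - q.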

Page 13

Traffic Variances

A Bellcore team discovered that Ethernet traffic had strange behavior

No matter what time scale they averaged it over, the variances were still large!

How could that be?

Page 14

Self Similarity, LRD, and All That….

Correlation: $r(t) = \langle x(s+t)\,x(s) \rangle - \langle x(s) \rangle^2$

Short-range correlation: $\sum_t r(t)$ finite

Long-range correlation (LRD): $\sum_t r(t)$ infinite

Example of LRD: $r(t) \sim t^{-0.5}$
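To see why the example above counts as long-range dependence, it may help to compare its partial sums with those of a geometrically decaying correlation. The geometric case, with a hypothetical parameter $\rho$, is an assumption added here for contrast and does not appear on the slide.

```latex
% LRD example: partial sums grow without bound
\sum_{t=1}^{T} t^{-1/2} \;\ge\; \int_{1}^{T+1} t^{-1/2}\,dt
  \;=\; 2\left(\sqrt{T+1} - 1\right) \;\longrightarrow\; \infty
  \quad (T \to \infty)

% Short-range contrast (assumed form r(t) = \rho^{t}, 0 < \rho < 1)
\sum_{t=1}^{\infty} \rho^{t} \;=\; \frac{\rho}{1-\rho} \;<\; \infty
```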

Page 15

Aggregates and Variance

Consider the average of M iid random variables:

- Variance of the average ~ 1/M

Consider a process with short-range correlations:

- On large-enough time scales, intervals are iid

- Variance should decay as 1/M

Only with LRD can the variance decay more slowly

- Often as a power law $M^{-p}$ with p < 1
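One way to check the decay rate described on this slide is a variance-time computation: average the series over blocks of M samples and see how the variance of the block averages falls as M grows. The sketch below assumes a plain list of per-interval counts as input; the function names and the iid uniform test series are illustrative choices, not part of the lecture.

```python
import math
import random
import statistics


def block_means(series, m):
    """Average the series over non-overlapping blocks of length m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance_time(series, block_sizes):
    """Map each block size M to the variance of the M-sample block averages."""
    return {m: statistics.pvariance(block_means(series, m)) for m in block_sizes}


def decay_exponent(vt):
    """Least-squares slope of log(variance) versus log(M)."""
    xs = [math.log(m) for m in vt]
    ys = [math.log(v) for v in vt.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Illustrative input: iid uniform samples standing in for traffic counts.
rng = random.Random(0)
iid_series = [rng.random() for _ in range(50_000)]
slope = decay_exponent(variance_time(iid_series, [1, 10, 100]))
print(f"fitted decay exponent for iid input: {slope:.2f}")
```

For iid input the fitted slope should sit near -1, matching the 1/M decay. A slope noticeably shallower than -1 on a real trace would be consistent with LRD, although a careful analysis needs more than this fit.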

Page 16

LRD Comes from Many Sources?

Poisson arrivals with long-tailed sizes is one mechanism

- Most files small

- Most bytes in large files

Pareto distributions are very common!
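The "most files small, most bytes in large files" pattern can be reproduced by sampling sizes from a Pareto distribution. This is a minimal sketch: the density exponent 1.8, the minimum size of 1, and the helper names are assumptions chosen for illustration, and sizes are drawn by inverse-transform sampling rather than from measured data.

```python
import random
import statistics


def pareto_sizes(n, density_exponent, rng):
    """Draw n sizes >= 1 whose density falls off as size**(-density_exponent).

    The tail P(X >= x) then decays as x**(-(density_exponent - 1)),
    which is inverted directly from a uniform sample.
    """
    tail_index = density_exponent - 1.0
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [(1.0 - rng.random()) ** (-1.0 / tail_index) for _ in range(n)]


def top_share(sizes, fraction):
    """Fraction of all bytes held by the largest `fraction` of files."""
    ordered = sorted(sizes, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


rng = random.Random(1)
sizes = pareto_sizes(20_000, density_exponent=1.8, rng=rng)
median_size = statistics.median(sizes)
share = top_share(sizes, 0.10)
print(f"median size {median_size:.2f}, top 10% of files hold {share:.0%} of bytes")
```

With an exponent in this range the typical file stays close to the minimum size while a small number of very large files carry most of the bytes.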

Page 17

Zipf and Pareto

Zipf: $F(\mathrm{rank}) \sim \mathrm{rank}^{-b}$

- F is frequency, size, etc.

- 1 < b

Pareto: $P(\mathrm{size}) \sim \mathrm{size}^{-a}$

- 1 < a < 2

- Log(size) has geometric distribution!

- Pareto distributed interarrival times means logs are uncorrelated

$a = 1 + 1/b$ and $b = 1/(a-1)$

The tail of the distribution can’t be ignored!
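The relation between the Zipf exponent b and the Pareto exponent a on this slide follows from a short argument. The sketch below assumes that P(size) denotes a probability density and that the item of rank r has size x(r); both readings are assumptions, since the slide does not spell them out.

```latex
% Zipf: the item of rank r has size x(r) \sim r^{-b}.
% The number of items of size at least x is the rank r, so inverting gives
\Pr[X \ge x] \;\propto\; r(x) \;\sim\; x^{-1/b}

% Differentiating the tail gives the density
p(x) \;=\; -\frac{d}{dx}\Pr[X \ge x] \;\sim\; x^{-(1 + 1/b)} \;=\; x^{-a},
\qquad a = 1 + \frac{1}{b}, \quad b = \frac{1}{a-1}
```

The condition 1 < b then corresponds to 1 < a < 2, which matches the ranges given above.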

Page 18

Examples of Zipf/Pareto

Income distribution

Asteroid sizes

Letter frequencies

File sizes

Telnet packet interarrivals

Web access frequencies (impact on caching)

AS connectivity

Page 19

Examples of Inference Tools

Bandwidth estimation

Critical path analysis

Network tomography

Flow rate causes

…..

Page 20

Bandwidth Estimation (I)

Send pair of packets back-to-back

If no intermediate packets intervene, their spacing is a function of the bandwidth of the bottleneck link

Send a series of packet-pairs, measure the minimal delays

Doesn’t work well: often overestimates capacity!

Can do better with many packets back-to-back
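The packet-pair idea on this slide can be written as a single calculation: the bottleneck spreads two back-to-back packets apart by the time it takes to transmit one of them, so capacity is roughly packet size divided by that spacing. The sketch below is an assumption-laden illustration; the function name, the units, and the sample gaps are hypothetical, and it takes the minimum gap over the series of pairs.

```python
import math


def packet_pair_capacity(packet_size_bytes, arrival_gaps_s):
    """Estimate bottleneck capacity in bits/s from packet-pair arrival gaps.

    Uses the minimum positive gap across all pairs, on the assumption that
    cross traffic can only widen the spacing set by the bottleneck.
    """
    gaps = [g for g in arrival_gaps_s if g > 0]
    if not gaps:
        raise ValueError("need at least one positive arrival gap")
    return packet_size_bytes * 8 / min(gaps)


# 1500-byte packets through a 10 Mb/s bottleneck are spaced 1.2 ms apart.
clean = packet_pair_capacity(1500, [0.0012, 0.0015, 0.0013])

# If a later hop compresses one pair to 0.6 ms, the estimate doubles.
compressed = packet_pair_capacity(1500, [0.0012, 0.0006, 0.0013])

print(f"clean estimate: {clean / 1e6:.1f} Mb/s")
print(f"with one compressed pair: {compressed / 1e6:.1f} Mb/s")
```

The second call shows one way the overestimate noted on the slide can arise: the "only widen" assumption fails when packets queue behind each other after the bottleneck and arrive closer together than they left it.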

Page 21

Bandwidth Estimation (II)

Send variable-sized packets k hops (using TTL).

Delay at link i: $D_i = A_i + P/C_i$

- $A_i$ is size-independent delay

- P is packet size

- $C_i$ is capacity of link i

Model delay of k hops: $\sum_{i=1}^{k} D_i$
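Under the delay model above, the delay through k hops is linear in packet size, with slope equal to the sum of 1/C_i over the first k links. Fitting a line to the minimum observed delay at each hop count and differencing successive slopes therefore gives per-link capacities. The sketch below assumes minimum delays have already been collected per packet size and that they are free of queueing; the function names and the synthetic numbers are illustrative, not taken from any tool.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def link_capacities(min_delay_by_hop):
    """Estimate each link's capacity in bytes/s.

    min_delay_by_hop[k - 1] maps packet size (bytes) to the minimum delay
    (seconds) observed through the first k hops.
    """
    capacities = []
    prev_slope = 0.0
    for samples in min_delay_by_hop:
        sizes = sorted(samples)
        slope, _ = fit_line(sizes, [samples[p] for p in sizes])
        # The slope through k hops is sum(1/C_i); the increment is 1/C_k.
        capacities.append(1.0 / (slope - prev_slope))
        prev_slope = slope
    return capacities


# Synthetic two-hop path: capacities in bytes/s, fixed delays in seconds.
true_c = [1_000_000.0, 500_000.0]
true_a = [0.001, 0.002]
sizes = [100, 500, 1000, 1500]
delays = [
    {p: sum(a + p / c for a, c in zip(true_a[:k], true_c[:k])) for p in sizes}
    for k in (1, 2)
]
estimated = link_capacities(delays)
print(estimated)
```

On real paths the delays are noisy and usually measured as round-trip times, so the differences between slopes can be small and error-prone; the noise-free synthetic data here sidesteps that.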

Page 22

Possible Causes of Flow Rates

Application: doesn’t generate data fast enough

Opportunity: never leaves slow-start

Receiver: limited by receive window

Sender: limited by sending buffer

Bandwidth: using full bottleneck bandwidth

Congestion: flow is responding to packet loss

Transport: none of the above…

Page 23

Differentiating Causes

Take trace of flow

Estimate RTT

Look at window epochs

Identify where TCP is in cycle