Tomography-based Overlay Network Monitoring Hugo Angelmar Slides courtesy of (Yan Chen, David Bindel, and Randy H. Katz)

Dec 21, 2015

Page 1

Tomography-based Overlay Network Monitoring

Hugo Angelmar

Slides courtesy of (Yan Chen, David Bindel, and Randy H. Katz)

Page 2

Motivation

• Infrastructure ossification has led to a surge of overlay and P2P applications
• Such applications are flexible in their choice of paths and targets, and thus can benefit from E2E distance monitoring
  – Overlay routing/location
  – VPN management/provisioning
  – Service redirection/placement …
• Requirements for an E2E monitoring system
  – Scalable & efficient: small amount of probing traffic
  – Accurate: captures congestion/failures
  – Incrementally deployable
  – Easy to use

Page 3

Existing Work

• General metrics: RON (n^2 measurements)
• Latency estimation
  – Clustering-based: IDMaps, Internet Isobar, etc.
  – Coordinate-based: GNP, ICS, Virtual Landmarks
• Network tomography
  – Focuses on inferring the characteristics of physical links rather than E2E paths
  – Limited measurements -> under-constrained system, unidentifiable links

Page 4

Problem Formulation

Given an overlay of n end hosts and O(n^2) paths, how can we select a minimal subset of paths to monitor so that the loss rates/latencies of all other paths can be inferred?

Assumptions:
• Topology is measurable
• Only E2E paths can be measured, not individual links

Page 5

Our Approach

Select a basis set of k paths that fully describes all O(n^2) paths (k ≪ O(n^2))

• Monitor the loss rates of the k paths, and infer the loss rates of all other paths
• Applicable to any additive metric, like latency

[Figure: end hosts and the Overlay Network Operation Center, linked by arrows labeled "topology" and "measurements"]

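Page 6

Modeling of Path Space

Path loss rate p, link loss rate l:

$$p_1 = 1 - (1 - l_1)(1 - l_2)$$

$$\log(1 - p_1) = \log(1 - l_1) + \log(1 - l_2) = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} \log(1 - l_1) \\ \log(1 - l_2) \\ \log(1 - l_3) \end{bmatrix}$$

With $x_j = \log(1 - l_j)$ and $b_1 = \log(1 - p_1)$:

$$\begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = b_1$$

[Figure: topology with nodes A, B, C, D and links 1, 2, 3; path p1 traverses links 1 and 2]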
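As a minimal sketch of the model on this slide, assuming links drop packets independently (which is what the product form implies), the Python below computes a path loss rate from link loss rates and checks that the log transform turns the product into a sum. The function names and the numeric loss rates are illustrative assumptions, not values from the slides.

```python
import math

def path_loss(link_losses):
    """Loss rate of a path, assuming each link drops packets independently."""
    survive = 1.0
    for l in link_losses:
        survive *= 1.0 - l          # probability the packet survives this link
    return 1.0 - survive

def log_survival(rate):
    """The transformed quantity log(1 - rate), used as x_j (links) or b_i (paths)."""
    return math.log(1.0 - rate)

# Hypothetical loss rates for links 1, 2, 3.
l1, l2, l3 = 0.02, 0.05, 0.10
p1 = path_loss([l1, l2])            # path p1 traverses links 1 and 2

x = [log_survival(l1), log_survival(l2), log_survival(l3)]
row = [1, 1, 0]                     # p1 uses links 1 and 2, not link 3
b1 = sum(g * xj for g, xj in zip(row, x))

# The linear form reproduces log(1 - p1) up to rounding.
assert abs(b1 - log_survival(p1)) < 1e-12
```

The same transform applies unchanged to metrics that are already additive, such as latency, where no logarithm is needed.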

Page 7

Putting All Paths Together

In total r = O(n^2) paths, s links, s ≪ r

$$G x = b, \quad \text{where path matrix } G \in \{0,1\}^{r \times s},\ \text{link loss rate vector } x \in \mathbb{R}^{s \times 1},\ \text{path loss rate vector } b \in \mathbb{R}^{r \times 1}$$

[Figure: topology with nodes A, B, C, D, links 1, 2, 3, and path p1, next to a schematic of the system G x = b]
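To make the matrix form concrete, here is a small sketch that builds a 0/1 path matrix from a list of paths and evaluates b = G x. The helper names, the 0-based link indices, and the numeric link values are assumptions made for illustration only.

```python
def path_matrix(paths, num_links):
    """Build the r x s 0/1 matrix G: G[i][j] = 1 iff path i traverses link j."""
    G = []
    for links in paths:
        row = [0] * num_links
        for j in links:
            row[j] = 1
        G.append(row)
    return G

def path_values(G, x):
    """Compute b = G x for a link vector x (x_j = log(1 - l_j))."""
    return [sum(g * xj for g, xj in zip(row, x)) for row in G]

# Hypothetical overlay with 3 links (0-indexed) and 3 paths.
paths = [[0, 1], [2], [0, 1, 2]]
G = path_matrix(paths, num_links=3)

x = [-0.03, -0.01, -0.02]           # made-up log-survival values per link
b = path_values(G, x)               # one entry per path
```

In a real deployment x is unknown and only some entries of b are measured; the sketch runs the model forward purely to show how rows of G correspond to paths.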

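Page 8

Sample Path Matrix

$$G = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \qquad G \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$

• x1 - x2 unknown => cannot compute x1, x2
• The set of vectors $\alpha \begin{bmatrix} 1 & -1 & 0 \end{bmatrix}^T$ forms the null space
• To separate identifiable vs. unidentifiable components: x = xG + xN

$$x_G = \frac{x_1 + x_2}{2} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} b_1/2 \\ b_1/2 \\ b_2 \end{bmatrix}, \qquad x_N = \frac{x_1 - x_2}{2} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}$$

[Figure: topology with nodes A, B, C, D, links 1, 2, 3, and paths b1, b2, b3; a plot over axes x1, x2, x3 with the vector (1,1,0) labeled "path/row space (measured)" and the vector (1,-1,0) labeled "null space (unmeasured)"]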
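The split x = xG + xN for this 3-link example can be checked numerically. The sketch below projects x onto the single null-space direction (1, -1, 0); it is only valid as written when the null space is one-dimensional, and the link values are hypothetical.

```python
def split_components(x, null_vec):
    """Split x into x_G (orthogonal to null_vec) and x_N (along null_vec).

    Valid as written only when the null space of G is one-dimensional,
    as in the 3-link example.
    """
    scale = sum(a * n for a, n in zip(x, null_vec)) / sum(n * n for n in null_vec)
    x_n = [scale * n for n in null_vec]
    x_g = [a - c for a, c in zip(x, x_n)]
    return x_g, x_n

def matvec(G, v):
    """Plain matrix-vector product."""
    return [sum(g * vj for g, vj in zip(row, v)) for row in G]

G = [[1, 1, 0], [0, 0, 1], [1, 1, 1]]
x = [-0.03, -0.01, -0.02]           # hypothetical link values x_j = log(1 - l_j)
x_g, x_n = split_components(x, null_vec=[1, -1, 0])
b = matvec(G, x)

# x_N is invisible to every path, so x_G alone explains all measurements.
assert all(abs(v) < 1e-12 for v in matvec(G, x_n))
assert all(abs(u - v) < 1e-12 for u, v in zip(matvec(G, x_g), b))
```

Here x_g comes out as (b1/2, b1/2, b2), which depends only on measurable path values, while x_n carries the unidentifiable difference between links 1 and 2.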

Page 9

Intuition through Topology Virtualization

Virtual links:
• Minimal path segments whose loss rates can be uniquely identified
• Can fully describe all paths
• xG is composed of virtual links

$$x_G = \frac{x_1 + x_2}{2} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} b_1/2 \\ b_1/2 \\ b_2 \end{bmatrix}$$

All E2E paths are in path space, i.e., G xN = 0:

$$b = G x = G x_G + G x_N = G x_G$$

[Figure: the topology (nodes A, B, C, D; links 1, 2, 3; paths b1, b2, b3) and its virtualization into virtual links labeled 1 and 2; a plot over axes x1, x2, x3 with (1,1,0) labeled "path/row space (measured)" and (1,-1,0) labeled "null space (unmeasured)"]

Page 10

More Examples

Real links (solid) and all of the overlay paths (dotted) traversing them

[Figure: two virtualization examples, each showing real links, the overlay paths traversing them, the resulting virtual links (1', 2', …), and the path matrix G. The first example has Rank(G) = 2 and the second has Rank(G) = 3. The matrix entries are not recoverable from the extraction.]

Page 11

Algorithms

• Select k = rank(G) linearly independent paths to monitor
  – Use QR decomposition
  – Leverage sparse matrix: time O(rk^2) and memory O(k^2)
    • E.g., 10 minutes for n = 350 (r = 61075) and k = 2958
• Compute the loss rates of other paths
  – Time O(k^2) and memory O(k^2)

[Figure: schematic of the reduced system for the k selected paths, $\bar{G} x_G = \bar{b}$]
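The two steps on this slide can be sketched in a few lines. Note the swap: the slides use a sparse QR decomposition, whereas this illustration uses exact Gaussian elimination over rationals because it is short and dependency-free; it is not the authors' implementation and does not achieve the stated complexity. All function names and the link loss rates are hypothetical.

```python
import math
from fractions import Fraction

def select_basis(G):
    """Scan rows of G; keep each row that is independent of those kept so far.

    Returns (basis, coeffs): basis lists the kept row indices, and coeffs[i]
    maps basis row index j -> weight w such that G[i] = sum(w * G[j]).
    """
    reduced = []            # (pivot column, eliminated row, its expansion over basis rows)
    basis, coeffs = [], []
    for i, row in enumerate(G):
        res = [Fraction(v) for v in row]
        exp = {}            # invariant: G[i] == res + sum(exp[j] * G[j])
        for pivot, rrow, rexp in reduced:
            if res[pivot] != 0:
                f = res[pivot] / rrow[pivot]
                res = [a - f * c for a, c in zip(res, rrow)]
                for j, w in rexp.items():
                    exp[j] = exp.get(j, 0) + f * w
        pivot = next((c for c, v in enumerate(res) if v != 0), None)
        if pivot is None:
            coeffs.append(exp)                  # dependent path: inferable
        else:
            basis.append(i)                     # independent path: must be monitored
            rexp = {j: -w for j, w in exp.items()}
            rexp[i] = Fraction(1)
            reduced.append((pivot, res, rexp))
            coeffs.append({i: Fraction(1)})
    return basis, coeffs

def infer_all(coeffs, measured):
    """measured maps basis row index -> b value; returns b for every path."""
    return [sum(float(w) * measured[j] for j, w in c.items()) for c in coeffs]

G = [[1, 1, 0], [0, 0, 1], [1, 1, 1]]
basis, coeffs = select_basis(G)                 # only paths 0 and 1 need probing

link_loss = [0.02, 0.05, 0.10]                  # hidden ground truth, made up
x = [math.log(1 - l) for l in link_loss]
measured = {i: sum(g * xj for g, xj in zip(G[i], x)) for i in basis}

b = infer_all(coeffs, measured)
loss = [1 - math.exp(v) for v in b]             # back from log space to loss rates
```

In this example the third path is the sum of the first two rows, so its loss rate is recovered from two measurements without ever probing it.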

Page 12

How many measurements saved?

k ≪ O(n^2)? For a power-law Internet topology:

• When the majority of end hosts are on the overlay: k = O(n) (with proof)
• When a small portion of end hosts are on the overlay
  – If the Internet were a pure hierarchical structure (tree): k = O(n)
  – If the Internet had no hierarchy at all (worst case, clique): k = O(n^2)
  – The Internet has a moderate hierarchical structure [TGJ+02]

For reasonably large n (e.g., 100), k = O(n log n) (extensive linear regression tests on both synthetic and real topologies)

Page 13

Practical Issues

• Tolerance of topology measurement errors
  – We care about path loss rates rather than any interior links
  – Poor router alias resolution => assign similar loss rates to the same links
  – Unidentifiable routers => add virtual links to bypass
• Measurement load balancing on end hosts
  – Randomly order the paths for scan and selection of $\bar{G}$ (the matrix of selected basis paths)
• Topology changes
  – Efficient algorithms for incremental update of $\bar{G}$ for adding/removing end hosts & routing changes

Page 14

Topology Changes

• Basic building block: add/remove one path
  – Incremental changes: O(k^2) time (O(n^2 k^2) for re-scan)
  – Add path: check linear dependency with the old basis set, $\bar{G}$
  – Delete path p: hard when …
    The essential info described by p: …
• Add/remove end hosts, routing changes
• Topology relatively stable in order of a day => incremental detection

[Formulas not recoverable from the extraction. Surviving fragments: "a vector y such that …, but …" and "if …, add any path … to …, where … and …"; the symbols involved are y, x, p, and G.]

Page 15

Evaluation

• Simulation
  – Topology
    • BRITE: Barabasi-Albert, Waxman, hierarchical: 1K – 20K nodes
    • Real topology from Mercator: 284K nodes
  – Fraction of end hosts on the overlay: 1 - 10%
  – Loss rate distribution (90% of links are good)
    • Good link: 0-1% loss rate; bad link: 5-10% loss rate
    • Good link: 0-1% loss rate; bad link: 1-100% loss rate
  – Loss model:
    • Bernoulli: independent packet drops
    • Gilbert: bursty packet drops
  – Path loss rate simulated via transmission of 10K pkts
• Experiments on PlanetLab
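The two loss models named in the simulation setup can be sketched as follows. This is an illustration under stated assumptions, not the simulator used for the slides: the Gilbert model here is the simplified two-state variant (every packet is dropped in the bad state), and the transition probabilities, seed, and function names are hypothetical.

```python
import random

def bernoulli_loss(num_pkts, p, rng):
    """Each packet is dropped independently with probability p."""
    lost = sum(1 for _ in range(num_pkts) if rng.random() < p)
    return lost / num_pkts

def gilbert_loss(num_pkts, p_good_to_bad, p_bad_to_good, rng):
    """Simplified two-state Gilbert model: packets are delivered in the good
    state and dropped in the bad state, so losses arrive in bursts."""
    bad = False
    lost = 0
    for _ in range(num_pkts):
        if bad:
            lost += 1
            if rng.random() < p_bad_to_good:
                bad = False
        elif rng.random() < p_good_to_bad:
            bad = True
    return lost / num_pkts

rng = random.Random(0)                            # fixed seed for repeatability
rate_b = bernoulli_loss(10_000, 0.05, rng)
# Stationary fraction of time in the bad state: 0.01 / (0.01 + 0.19) = 0.05
rate_g = gilbert_loss(10_000, 0.01, 0.19, rng)
```

Both calls simulate 10K packets, matching the slide, and target the same long-run loss rate; the difference is that the Gilbert losses cluster in runs.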

Page 16

Evaluation

• Extensive simulations
• Experiments on PlanetLab
  – 51 hosts, each from a different organization
  – 51 × 50 = 2,550 paths
  – On average k = 872
• Results highlights
  – Avg real loss rate: 0.023
  – Absolute error: mean 0.0027, 90% < 0.014
  – Relative error: mean 1.1, 90% < 2.0
  – On average 248 out of 2550 paths have no or incomplete routing information
  – No router aliases resolved

Areas and Domains                 # of hosts
US (40)              .edu         33
                     .org         3
                     .net         2
                     .gov         1
                     .us          1
International (11)
  Europe (6)         France       1
                     Sweden       1
                     Denmark      1
                     Germany      1
                     UK           2
  Asia (2)           Taiwan       1
                     Hong Kong    1
  Canada                          2
  Australia                       1

Page 17

Experiments on PlanetLab

[Table: areas and domains of the 51 hosts, identical to the table on Page 16]

• 51 hosts, each from a different organization
  – 51 × 50 = 2,550 paths
• Simultaneous loss rate measurement
  – 300 trials, 300 msec each
  – In each trial, send a 40-byte UDP pkt to every other host
• Simultaneous topology measurement
  – Traceroute
• Experiments: 6/24 – 6/27
  – 100 experiments in peak hours

Page 18

Sensitivity Test of Sending Frequency

• Big jump for # of lossy paths when the sending rate is over 12.8 Mbps

Page 19

PlanetLab Experiment Results

• Loss rate distribution

  loss rate   [0, 0.05)   lossy path [0.05, 1.0] (4.1%)
                          [0.05, 0.1)   [0.1, 0.3)   [0.3, 0.5)   [0.5, 1.0)   1.0
  %           95.9%       15.2%         31.0%        23.9%        4.3%         25.6%

• Metrics
  – Absolute error |p – p'|: average 0.0027 for all paths, 0.0058 for lossy paths
  – Relative error [BDPT02]:

$$F_\varepsilon(p, p') = \max\left( \frac{p(\varepsilon)}{p'(\varepsilon)}, \frac{p'(\varepsilon)}{p(\varepsilon)} \right), \quad \text{where } p(\varepsilon) = \max(\varepsilon, p),\ p'(\varepsilon) = \max(\varepsilon, p')$$

  – Lossy path inference: coverage and false positive ratio
• On average k = 872 out of 2550
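The relative error metric is easy to misread, so here is a short sketch of it. The floor eps keeps the ratio finite when either loss rate is zero; the default value 0.001 and the function name are assumptions for illustration, since the slide does not state the value used.

```python
def error_factor(p, p_est, eps=0.001):
    """Relative error factor between a real loss rate p and an inferred p_est.

    Both rates are floored at eps so the ratio stays finite; the result is
    always >= 1, and 1 means the two (floored) rates agree exactly.
    """
    p_eps = max(eps, p)
    p_est_eps = max(eps, p_est)
    return max(p_eps / p_est_eps, p_est_eps / p_eps)

# A path with real loss 2% inferred as 1% is off by a factor of about 2,
# and so is the reverse case: the metric is symmetric.
f1 = error_factor(0.02, 0.01)
f2 = error_factor(0.01, 0.02)
```

Because of the floor, two loss-free paths compare as a perfect match rather than a division by zero.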

Page 20

Accuracy Results for One Experiment

• 95% of absolute error < 0.0014
• 95% of relative error < 2.1

Page 21

Accuracy Results for All Experiments

• For each experiment, get its 95% absolute & relative errors
• Most have absolute error < 0.0135 and relative error < 2.0

Page 22

Lossy Path Inference Accuracy

• 90 out of 100 runs have coverage over 85% and false positive less than 10%

• Many caused by the 5% threshold boundary effects
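A sketch of how coverage and false positive ratio might be computed is given below. The slides do not define the two ratios, so the definitions here are assumptions: coverage is the fraction of truly lossy paths that are also inferred lossy, and the false positive ratio is the fraction of inferred-lossy paths that are not truly lossy. The 5% threshold comes from the slides; the function name and sample values are hypothetical.

```python
def lossy_inference_accuracy(real, inferred, threshold=0.05):
    """Return (coverage, false_positive_ratio) for lossy-path classification.

    A path counts as lossy when its loss rate is >= threshold.
    """
    real_lossy = {i for i, p in enumerate(real) if p >= threshold}
    inferred_lossy = {i for i, p in enumerate(inferred) if p >= threshold}
    hits = real_lossy & inferred_lossy
    coverage = len(hits) / len(real_lossy) if real_lossy else 1.0
    false_pos = (len(inferred_lossy - real_lossy) / len(inferred_lossy)
                 if inferred_lossy else 0.0)
    return coverage, false_pos

# Made-up loss rates for four paths. Path 1 (0.049 real vs 0.051 inferred)
# is a boundary case: a tiny absolute error flips its classification.
real = [0.06, 0.049, 0.10, 0.01]
inferred = [0.07, 0.051, 0.04, 0.01]
coverage, false_pos = lossy_inference_accuracy(real, inferred)
```

Path 1 illustrates the threshold boundary effect mentioned on the slide: an error of 0.002 is enough to count as a false positive.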

Page 23

Topology/Dynamics Issues

• Out of 13 sets of pair-wise traceroute …
• On average 248 out of 2550 paths have no or incomplete routing information
• No router aliases resolved

Conclusion: robust against topology measurement errors

• Simulations on adding/removing end hosts and routing changes also give good results

Page 24

Performance Improvement with Overlay

• With single-node relay
• Loss rate improvement
  – Among 10,980 lossy paths:
  – 5,705 paths (52.0%) have loss rate reduced by 0.05 or more
  – 3,084 paths (28.1%) change from lossy to non-lossy
• Throughput improvement
  – Estimated with [formula lost in extraction; surviving terms: throughput, rtt, lossrate, and the constant 1.5]
  – 60,320 paths (24%) with non-zero loss rate, throughput computable
  – Among them, 32,939 (54.6%) paths have throughput improved, 13,734 (22.8%) paths have throughput doubled or more
• Implications: use overlay path to bypass congestion or failures
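One way to picture single-node relay is to compare the direct path with every two-hop alternative. The sketch below is an assumption-laden illustration, not the evaluation procedure behind the numbers on this slide: it treats the two hops as independent, so the relayed loss is 1 - (1 - loss[src][r]) * (1 - loss[r][dst]), and the function name and loss matrix are hypothetical.

```python
def best_relay(loss, src, dst):
    """Pick the relay that minimizes end-to-end loss from src to dst.

    loss[a][b] is the loss rate of the overlay path a -> b. Returns
    (relay, loss_rate); relay is None when the direct path is already best.
    """
    best = (None, loss[src][dst])
    for r in range(len(loss)):
        if r in (src, dst):
            continue
        # Assume the two hops lose packets independently.
        via = 1 - (1 - loss[src][r]) * (1 - loss[r][dst])
        if via < best[1]:
            best = (r, via)
    return best

# Made-up 3-host loss matrix: the direct path 0 -> 2 is congested.
loss = [
    [0.00, 0.01, 0.20],
    [0.01, 0.00, 0.02],
    [0.20, 0.02, 0.00],
]
relay, relayed_loss = best_relay(loss, src=0, dst=2)
```

With these numbers the direct path loses 20% of packets while the detour through host 1 loses about 3%, which is the kind of lossy-to-non-lossy change the slide counts.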

Page 25

Adaptive Overlay Streaming Media

[Figure: server, client, overlay relay node, and Overlay Network Operation Center, at sites labeled UC Berkeley, UC San Diego, Stanford, and HP Labs. Numbered steps in the figure:
1. Set up connection
2. Register trigger
3. Network congestion/failure (marked X)
4. Detect congestion/failure
5. Alert + new overlay path
6. Set up new path
7. Skip-free streaming media recovery]

• Implemented with Winamp client and SHOUTcast server
• Congestion introduced with a Packet Shaper
• Skip-free playback: server buffering and rewinding
• Total adaptation time < 4 seconds

Page 26

Adaptive Streaming Media Architecture

[Figure: architecture diagram with the following labeled components, connected over the Internet:
• Media source and server: SHOUTcast server; buffering layer (buffer fed from the SHOUTcast server, byte counter, calculated concatenation point, Clients 1 to 4); overlay layer; path management; TCP/IP layer
• Overlay relay node: relay; overlay layer; path management; TCP/IP layer
• Client: Winamp client; Winamp video/audio filter; byte counter; overlay layer; path management; TCP/IP layer
• Overlay Network Operation Center: triggering / alert + new path]

Page 27

Conclusions

• A tomography-based overlay network monitoring system
  – Given n end hosts, characterize O(n^2) paths with a basis set of O(n log n) paths
  – Selectively monitor O(n log n) paths to compute the loss rates of the basis set, then infer the loss rates of all other paths
• Both simulation and real Internet experiments are promising
• Built an adaptive overlay streaming media system on top of the monitoring services
  – Bypass congestion/failures for smooth playback within seconds