Yao Zhao 1, Yan Chen 1, David Bindel 2. Towards Unbiased End-to-End Diagnosis. 1. Lab for Internet & Security Tech, Northwestern Univ. 2. EECS department, UC Berkeley

Towards Unbiased End-to-End Diagnosis

Jan 01, 2016

Transcript
Page 1: Towards Unbiased End-to-End Diagnosis

Yao Zhao1, Yan Chen1, David Bindel2

Towards Unbiased End-to-End Diagnosis

1. Lab for Internet & Security Tech, Northwestern Univ

2. EECS department, UC Berkeley

Page 2: Towards Unbiased End-to-End Diagnosis

2

Outline

• Background and Motivation

• MILS in Undirected Graph

• MILS in Directed Graph

• Evaluation

• Conclusions

Page 3: Towards Unbiased End-to-End Diagnosis

3

End-to-End Network Diagnosis

93 hours

?

Page 4: Towards Unbiased End-to-End Diagnosis

4

Linear Algebraic Model

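Path loss rate $p_i$, link loss rate $l_j$:

$$1 - p_1 = (1 - l_1)(1 - l_2)$$

$$\log(1 - p_1) = \log(1 - l_1) + \log(1 - l_2)$$

With $x_j = \log(1 - l_j)$ and $b_i = \log(1 - p_i)$:

$$\begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} \log(1 - l_1) \\ \log(1 - l_2) \\ \log(1 - l_3) \end{bmatrix} = \log(1 - p_1) \quad\Longrightarrow\quad \begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = b_1$$

$$\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$

[Figure: nodes A, B, C, D joined by links 1, 2, 3, with measured paths p1 and p2]

Usually an underconstrained system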
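The log transform above can be checked numerically. The following is a minimal sketch, assuming made-up link loss rates (the values in `link_loss` are illustrative, not measurements from the talk) and the two-path matrix G shown on this slide.

```python
import math

# Hypothetical link loss rates for links 1..3 (illustrative values only).
link_loss = [0.1, 0.05, 0.2]

# Path matrix G from the slide: p1 uses links 1 and 2, p2 uses links 1, 2 and 3.
G = [[1, 1, 0],
     [1, 1, 1]]

# x_j = log(1 - l_j) turns the product of link success rates into a sum.
x = [math.log(1 - l) for l in link_loss]

# b_i = sum_j G[i][j] * x_j, which equals log(1 - p_i).
b = [sum(g * xj for g, xj in zip(row, x)) for row in G]

# Recover the path loss rates p_i = 1 - exp(b_i).
path_loss = [1 - math.exp(bi) for bi in b]
print(path_loss)
```

The recovered `path_loss[0]` agrees with 1 - (1 - 0.1)(1 - 0.05) up to rounding, which is the multiplicative form at the top of the slide.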

Page 5: Towards Unbiased End-to-End Diagnosis

5

Unidentifiable Links

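• Vectors That Are Linear Combinations of Row Vectors of G Are Identifiable
– The property of a link (or link sequence) can be computed from the linear system if and only if the corresponding vector is identifiable

• Otherwise, Unidentifiable

$$G = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$$

(1) $x_1 + x_2 = b_1$

(2) $x_1 + x_2 + x_3 = b_2$

(2) − (1): $x_3 = b_2 - b_1$

[ 0 0 1 ]: identifiable, $x_3 = b_2 - b_1$

[ 1 0 0 ]: unidentifiable, $x_1 = ?$

[Figure: nodes A, B, C, D joined by links 1, 2, 3, with measured paths p1 and p2]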
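The identifiability condition can be tested mechanically: a vector is a linear combination of the rows of G exactly when appending it to G leaves the rank unchanged. The sketch below assumes the two-path matrix from this slide; the helper names `rank` and `is_identifiable` are hypothetical and chosen for illustration, and the exact rational elimination is one convenient way to compute rank, not necessarily what LEND does.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a pivot at or below row r in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_identifiable(G, v):
    """v lies in the row space of G iff adding it as a row keeps the rank."""
    return rank(G + [v]) == rank(G)

G = [[1, 1, 0],
     [1, 1, 1]]
print(is_identifiable(G, [0, 0, 1]))  # link 3 alone
print(is_identifiable(G, [1, 0, 0]))  # link 1 alone
print(is_identifiable(G, [1, 1, 0]))  # links 1 and 2 as one sequence
```

Links 1 and 2 are only identifiable together, which is the motivation for treating them as a single virtual link.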

Page 6: Towards Unbiased End-to-End Diagnosis

6

Motivation

• Biased statistical assumptions are introduced to infer unidentifiable links

[Figure: a virtual link annotated with loss rates 0.1, 0.1 and 0, and the question "Loss rate?"]

Loss rate = 0.1 if linear optimization

Loss rate = 0 if unicast tomography & RED

Page 7: Towards Unbiased End-to-End Diagnosis

7

Least-biased End-to-end Network Diagnosis (LEND)

• Basic Assumptions
– End-to-end measurement can infer the end-to-end properties accurately
– Link level properties are independent

• Problem Formulation
– Given end-to-end measurements, what is the finest granularity of link properties that we can achieve under the basic assumptions?

[Figure: spectrum from basic assumptions to more and stronger statistical assumptions, annotated "Virtual link", "Diagnosis granularity?" and "Better accuracy"]

Page 8: Towards Unbiased End-to-End Diagnosis

8

Least-biased End-to-end Network Diagnosis (LEND)

• Contributions
– Define the minimal identifiable unit under basic assumptions (MILS)
– Prove that only E2E paths are MILSes in a directed graph topology (e.g., the Internet)
– Propose the good path algorithm (incorporating measurement path properties) for finer MILSes

[Figure: spectrum from basic assumptions to more and stronger statistical assumptions, annotated "Virtual link", "Diagnosis granularity?" and "Better accuracy"]

Page 9: Towards Unbiased End-to-End Diagnosis

9

Outline

• Background and Motivation

• MILS in Undirected Graph

• MILS in Directed Graph

• Evaluation

• Conclusions

Page 10: Towards Unbiased End-to-End Diagnosis

10

Minimal Identifiable Link Sequence

• Definition of MILS
– The smallest path segments with loss rates that can be uniquely identified through end-to-end path measurements
– Related to the sparse basis problem
• NP-hard problem

• Properties of MILS
– A MILS is a consecutive sequence of links
– A MILS cannot be split into MILSes (minimal)
– MILSes may be linearly dependent, or some MILSes may contain other MILSes

Page 11: Towards Unbiased End-to-End Diagnosis

11

Examples of MILSes in Undirected Graph

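Real links (solid) and all of the overlay paths (dotted) traversing them

[Figure: overlay paths 1' to 4' over real links 1 to 5]

Rows $v_1, \dots, v_4$ correspond to paths 1' to 4', columns to links 1 to 5:

$$G = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}$$

MILSes: a, b, c, d, e

3' + 2' − 1' − 4' → link 3

$$v_2 + v_3 - v_1 - v_4 = \begin{bmatrix} 0 & 0 & 2 & 0 & 0 \end{bmatrix}$$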
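The combination 3' + 2' − 1' − 4' can be verified directly. The sketch below assumes the 4×5 matrix G of this slide is read with rows as paths 1' to 4' and columns as links 1 to 5, and uses hypothetical link loss rates (`link_loss`) to show that link 3 is recovered from path measurements alone.

```python
import math

# Path matrix as read from the slide (rows: paths 1'..4', columns: links 1..5).
G = [[1, 1, 0, 0, 0],
     [0, 1, 1, 0, 1],
     [1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1]]
v1, v2, v3, v4 = G

# 3' + 2' - 1' - 4' leaves twice the unit vector of link 3.
combo = [c + b - a - d for a, b, c, d in zip(v1, v2, v3, v4)]
print(combo)  # [0, 0, 2, 0, 0]

# Hypothetical link loss rates, used only to generate path measurements b.
link_loss = [0.02, 0.0, 0.1, 0.05, 0.01]
x = [math.log(1 - l) for l in link_loss]
b = [sum(g * xj for g, xj in zip(row, x)) for row in G]

# Apply the same combination to the measurements and halve it.
x3 = (b[2] + b[1] - b[0] - b[3]) / 2
l3 = 1 - math.exp(x3)
print(l3)
```

The value of `l3` matches the third entry of `link_loss` up to rounding, even though no path measures link 3 in isolation.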

Page 12: Towards Unbiased End-to-End Diagnosis

12

Outline

• Background and Motivation

• MILS in Undirected Graph

• MILS in Directed Graph

• Evaluation

• Conclusions

Page 13: Towards Unbiased End-to-End Diagnosis

13

Identify MILSes in Undirected Graphs

• Preparation
– Active or passive end-to-end path measurement
– Optimization
• Measure O(n log n) paths and infer the n(n−1) end-to-end paths [SIGCOMM04]

Page 14: Towards Unbiased End-to-End Diagnosis

14

Identify MILSes in Undirected Graphs

• Preparation

• Identify MILSes
– Enumerate each link sequence to see if it is identifiable
– Computational complexity: O(r × k × l²)
• r: the number of paths (O(n²))
• k: the rank of G (O(n log n))
• l: the length of the paths
– Only takes 4.2 seconds for the network with 135 PlanetLab hosts and 18,090 Internet paths
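The enumeration step can be illustrated with a brute-force sketch. It assumes each path is given as an ordered list of 0-based link indices, tests every consecutive link sequence with a rank check, and treats a sequence as minimal when it contains no shorter identifiable consecutive sequence. That minimality rule is a simplification for illustration, the helper names are hypothetical, and the rank test here is more expensive than the O(k) per-sequence test implied by the complexity bound.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def find_mils(paths, n_links):
    """paths: list of link-index lists, each in traversal order (0-based)."""
    G = [[1 if j in p else 0 for j in range(n_links)] for p in paths]
    k = rank(G)

    def identifiable(segment):
        v = [1 if j in segment else 0 for j in range(n_links)]
        return rank(G + [v]) == k

    # Every consecutive link sequence of every path is a candidate.
    candidates = {tuple(p[i:j]) for p in paths
                  for i in range(len(p)) for j in range(i + 1, len(p) + 1)}
    ident = {s for s in candidates if identifiable(s)}

    def strictly_contains(big, small):
        n, m = len(big), len(small)
        return m < n and any(big[s:s + m] == small for s in range(n - m + 1))

    # Keep identifiable sequences that contain no shorter identifiable one.
    return sorted(s for s in ident
                  if not any(strictly_contains(s, t) for t in ident))

# Two-path example: p1 = links 1,2 and p2 = links 1,2,3 (0-based below).
print(find_mils([[0, 1], [0, 1, 2]], 3))  # [(0, 1), (2,)]
```

On the two-path example this returns the virtual link made of links 1 and 2, plus link 3 on its own.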

Page 15: Towards Unbiased End-to-End Diagnosis

15

What about Directed Graphs?

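• Directed Graphs Are Essentially Different from Undirected Graphs

Theorem: In a directed graph, no end-to-end path contains an identifiable subpath if only topology information is considered

[Figure: end hosts A, B, C attached to node N, shown twice; incoming links 1, 2, 3 and outgoing links 4, 5, 6]

Columns are links 1 to 6:

$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \end{bmatrix}$$

Every row of G sums to 1 over the incoming links and to 1 over the outgoing links, so in any linear combination of rows the two sums are equal. The single-link vector [1 0 0 0 0 0] has incoming sum 1 and outgoing sum 0, so it is not identifiable.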
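The sum argument can be checked numerically. This sketch assumes the 6×6 matrix is read with columns ordered as links 1 to 6, links 1 to 3 incoming and links 4 to 6 outgoing (an interpretation of the slide's layout); the helper names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_identifiable(G, v):
    """v lies in the row space of G iff adding it as a row keeps the rank."""
    return rank(G + [v]) == rank(G)

# Assumed reading of the slide: each path uses one incoming and one outgoing link.
G = [[1, 0, 0, 0, 1, 0],
     [1, 0, 0, 0, 0, 1],
     [0, 1, 0, 1, 0, 0],
     [0, 1, 0, 0, 0, 1],
     [0, 0, 1, 1, 0, 0],
     [0, 0, 1, 0, 1, 0]]

# Incoming-link sum minus outgoing-link sum is zero for every path.
balances = [sum(row[:3]) - sum(row[3:]) for row in G]
print(balances)

# No single link is identifiable from topology alone.
single_links = [[1 if j == i else 0 for j in range(6)] for i in range(6)]
print([is_identifiable(G, v) for v in single_links])
print(rank(G))
```

Six paths over six links still leave G rank-deficient, because the balance constraint removes one degree of freedom.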

Page 16: Towards Unbiased End-to-End Diagnosis

16

Good Path Algorithm

• Consider Only Topology
– Works for undirected graph

• Incorporate Measurement Path Property
– Most paths have no loss
• PlanetLab experiments show 50% of paths in the Internet have no loss
– All the links in a path of no loss are good links (Good Path Algorithm)

Page 17: Towards Unbiased End-to-End Diagnosis

17

Good Path Algorithm

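[Figure: end hosts A, B, C attached to node N, shown twice; incoming links 1, 2, 3 and outgoing links 4, 5, 6]

Columns are links 1 to 6:

$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \end{bmatrix}$$

• Symmetric property is broken when using the good path algorithm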
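A minimal sketch of how removing good links breaks the symmetry, assuming the 6×6 matrix is read with columns as links 1 to 6 and assuming, purely for illustration, that the first path (links 1 and 5) is measured loss-free. The function name `apply_good_paths` and the `lossless` flags are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def apply_good_paths(G, lossless):
    """Drop loss-free paths and every link they traverse (those links are good)."""
    good_paths = {i for i, ok in enumerate(lossless) if ok}
    good_links = {j for i in good_paths for j, g in enumerate(G[i]) if g}
    kept_links = [j for j in range(len(G[0])) if j not in good_links]
    reduced = [[G[i][j] for j in kept_links]
               for i in range(len(G)) if i not in good_paths]
    return reduced, kept_links

G = [[1, 0, 0, 0, 1, 0],
     [1, 0, 0, 0, 0, 1],
     [0, 1, 0, 1, 0, 0],
     [0, 1, 0, 0, 0, 1],
     [0, 0, 1, 1, 0, 0],
     [0, 0, 1, 0, 1, 0]]

# Assumption for illustration: only the first path shows no loss.
lossless = [True, False, False, False, False, False]
reduced, kept = apply_good_paths(G, lossless)
print(kept)                        # 0-based indices of the remaining links
print(rank(reduced), len(kept))    # full column rank means every kept link is identifiable
```

In this example a single loss-free path is enough to make all four remaining links individually identifiable, whereas none was identifiable from topology alone.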

Page 18: Towards Unbiased End-to-End Diagnosis

18

Other Features of LEND

• Dynamic Update for Topology and Link Property Changes
– End hosts join or leave, routing changes or path property changes
– Incremental update algorithms are very efficient

• Combine with Statistical Diagnosis
– Inference with MILSes is equivalent to inference with the whole end-to-end paths
– Reduce computational complexity because MILSes are shorter than paths
• Example: applying statistical tomography methods in [Infocom03] on MILSes is 5x faster than on paths

Page 19: Towards Unbiased End-to-End Diagnosis

19

Outline

• Motivation

• MILS in Undirected Graph

• MILS in Directed Graph

• Evaluation

• Conclusions

Page 20: Towards Unbiased End-to-End Diagnosis

20

Evaluation Metrics

• Diagnosis Granularity
– Average length of all the lossy MILSes in lossy paths

• Accuracy
– Simulations
• Absolute error and relative error
– Internet experiments
• Cross validation
• IP spoof based consistency check

• Speed
– Running time for finding all MILSes and loss rate inference

Page 21: Towards Unbiased End-to-End Diagnosis

21

Methodology

• PlanetLab Testbed
– 135 end hosts, each from a different institute
– 18,090 end-to-end paths

• Topology Measured by Traceroute
– Avg path length is 15.2

• Path Loss Rate by Active UDP Probing with Small Overhead

Areas and Domains        # of hosts
US (77)
  .edu                   50
  .org                   14
  .net                   2
  .com                   10
  .us                    1
International (58)
  Europe                 25
  Asia                   25
  Canada                 3
  South America          3
  Australia              2

Page 22: Towards Unbiased End-to-End Diagnosis

22

Diagnosis Granularity

# of End-to-end Paths        18,090
Avg Path Length              15.2
# of MILSes                  1009
Avg length of MILSes         2.3 virtual links (3.9 physical links)
Avg diagnosis granularity    2.3 virtual links (3.8 physical links)

Loss rate                    % of all paths
[0, 0.05)                    84.2
[0.05, 1.0] (lossy path)     15.8

Loss rate of lossy paths     % of lossy paths
[0.05, 0.1)                  17.2
[0.1, 0.3)                   15.6
[0.3, 0.5)                   24.9
[0.5, 1.0)                   15.8
1.0                          26.5

Page 23: Towards Unbiased End-to-End Diagnosis

23

Distribution of Length of MILSes

• Most MILSes are pretty short
• Some MILSes are longer than 10 hops
– Some paths do not overlap with any other paths

[Figure: distribution of MILS length, annotated "Most MILSes are short" and "A few MILSes are very long"]

Page 24: Towards Unbiased End-to-End Diagnosis

24

Other Results

• MILS to AS Mapping
– 33.6% of lossy MILSes comprise only one physical link
• 81.8% of them connect two ASes

• Accuracy
– Cross validation (99.0%)
– IP spoof based consistency check (93.5%)

• Speed
– 4.2 seconds for MILS computations
– 109.3 seconds for setup of scalable active monitoring [SIGCOMM04]

Page 25: Towards Unbiased End-to-End Diagnosis

25

Conclusion

• Link-level property inference in directed graphs is completely different from that in undirected graphs

• With the least biased assumptions, LEND uses the good path algorithm to infer link level loss rates, achieving
– Good inference accuracy
– Acceptable diagnosis granularity in practice
– Online monitoring and diagnosis

• Continuous monitoring and diagnosis services on PlanetLab under construction

Page 26: Towards Unbiased End-to-End Diagnosis

26

Thank You!

For more info:

http://list.cs.northwestern.edu/lend/

Questions?

Page 27: Towards Unbiased End-to-End Diagnosis

27

Page 28: Towards Unbiased End-to-End Diagnosis

28

Motivation

• End-to-End Network Diagnosis

• Under-constrained Linear System
– Unidentifiable links exist

To simplify presentation, assume undirected graph model

[Figure: end hosts A and B connected through router R]

Page 29: Towards Unbiased End-to-End Diagnosis

29

Linear Algebraic Model (2)

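$$Gx = b, \quad \text{where } G \in \{0|1\}^{r \times s} \text{ is the path matrix},\ x \in \mathbb{R}^{s \times 1} \text{ is the link loss rate vector},\ b \in \mathbb{R}^{r \times 1} \text{ is the path loss rate vector}$$

$$k = \operatorname{rank}(G) \le s$$

Usually an under-constrained system!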

Page 30: Towards Unbiased End-to-End Diagnosis

30

Identifiable and Unidentifiable

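• Vectors That Are Linear Combinations of Row Vectors of G Are Identifiable

• Otherwise, Unidentifiable

$$G = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$$

(1) $x_1 + x_2 = b_1$

(2) $x_1 + x_2 + x_3 = b_2$

(2) − (1): $x_3 = b_2 - b_1$

[Figure: nodes A, B, C, D joined by links 1, 2, 3, with measured paths p1 and p2]

[Figure: in (x1, x2, x3) space, the row (path) space spanned by (1,1,0) and (1,1,1) is identifiable; it contains (0,0,1)]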

Page 31: Towards Unbiased End-to-End Diagnosis

31

Examples of MILSes in Undirected Graph

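Real links (solid) and all of the overlay paths (dotted) traversing them

Example 1: one overlay path 1' over links 1 and 2, MILS a, Rank(G) = 1

$$G = \begin{bmatrix} 1 & 1 \end{bmatrix}$$

Example 2: overlay paths 1', 2', 3' over links 1, 2, 3, MILSes a, b, c, Rank(G) = 3

$$G = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$$

Example 3: overlay paths 1' to 4' over links 1 to 5, MILSes a, b, c, d, e, Rank(G) = 4

$$G = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}$$

3' + 2' − 1' − 4' → link 3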

Page 32: Towards Unbiased End-to-End Diagnosis

32

Identify MILSes in Undirected Graphs

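• Preparation

• Identify MILSes
– Compute Q as an orthonormal basis of R(G^T) (saved by the preparation step)
– For a vector v in R(G^T), ‖v‖ = ‖Q^T v‖; for a vector outside R(G^T) the projection is strictly shorter, which gives the identifiability test

[Figure: in (x1, x2, x3) space, v1 lies in the row space; v2 does not, and its projection ṽ2 onto the row space satisfies ‖ṽ2‖ < ‖v2‖]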
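The norm test can be sketched as follows. The slide does not say how Q is computed, so this sketch assumes a plain Gram-Schmidt pass over the rows of G; the function names and tolerances are hypothetical choices for illustration.

```python
import math

def orthonormal_row_basis(G, tol=1e-12):
    """Gram-Schmidt over the rows of G: orthonormal vectors spanning R(G^T)."""
    basis = []
    for row in G:
        w = [float(a) for a in row]
        for q in basis:
            d = sum(a * b for a, b in zip(w, q))
            w = [a - d * b for a, b in zip(w, q)]
        n = math.sqrt(sum(a * a for a in w))
        if n > tol:  # skip rows that depend on earlier ones
            basis.append([a / n for a in w])
    return basis

def in_row_space(Q, v, tol=1e-9):
    """True iff ||v||^2 equals ||Q^T v||^2, i.e. projecting v loses nothing."""
    proj_sq = sum(sum(a * b for a, b in zip(q, v)) ** 2 for q in Q)
    v_sq = sum(a * a for a in v)
    return abs(v_sq - proj_sq) <= tol * max(1.0, v_sq)

G = [[1, 1, 0],
     [1, 1, 1]]
Q = orthonormal_row_basis(G)  # computed once, reused for every candidate
print(in_row_space(Q, [0, 0, 1]))  # link 3
print(in_row_space(Q, [1, 0, 0]))  # link 1
```

Because Q is computed once in the preparation step, each candidate link sequence costs only k inner products, where k is the number of basis vectors.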

Page 33: Towards Unbiased End-to-End Diagnosis

33

Flowchart of LEND System

• Step 1
– Monitors O(n·log n) paths that can fully describe all the O(n²) paths (SIGCOMM04)
– Or passive monitoring

• Step 2
– Apply good path algorithm before identifying MILSes as in undirected graph

[Flowchart: Measure topology to get G → Active or passive monitoring → Good path algorithm on G → Iteratively check all possible MILSes → Compute loss rates of MILSes]

Stage 1: set up scalable monitoring system for diagnosis

Stage 2: online update the measurements and diagnosis

Page 34: Towards Unbiased End-to-End Diagnosis

34

Evaluation with Simulation

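• Metrics
– Diagnosis granularity
• Average length of all the lossy MILSes in lossy path (in the unit of link or virtual link)
– Accuracy
• Absolute error |p − p'|
• Relative error

$$F_\varepsilon(p, p') = \max\left\{ \frac{p(\varepsilon)}{p'(\varepsilon)},\ \frac{p'(\varepsilon)}{p(\varepsilon)} \right\}, \quad \text{where } p(\varepsilon) = \max(\varepsilon, p),\ p'(\varepsilon) = \max(\varepsilon, p')$$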
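A small sketch of the two accuracy metrics, assuming p is the true loss rate and p' the inferred one. The default value of `eps` is an assumption chosen for illustration; the slide does not state which ε was used.

```python
def absolute_error(p, p_est):
    """Absolute error |p - p'| between true and inferred loss rates."""
    return abs(p - p_est)

def relative_error_factor(p, p_est, eps=1e-3):
    """Relative error factor with both rates floored at eps.

    eps is a hypothetical default; flooring keeps the ratio finite
    when either loss rate is zero or very small.
    """
    pe = max(eps, p)
    pe_est = max(eps, p_est)
    return max(pe / pe_est, pe_est / pe)

print(absolute_error(0.10, 0.08))
print(relative_error_factor(0.10, 0.08))
print(relative_error_factor(0.0, 0.0))  # both floored at eps, so the factor is 1.0
```

A factor of 1 means perfect agreement; the factor is symmetric in p and p', so overestimates and underestimates are penalized equally.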

Page 35: Towards Unbiased End-to-End Diagnosis

35

Simulation Methodology

• Topology type
– Three types of BRITE router-level topologies
– Mercator topology

• Topology size
– 1000 ~ 20000 or 284k nodes

• Number of end hosts on the overlay network
– 50 ~ 300

• Link loss rate distribution
– LLRD1 and LLRD2 models

• Loss model
– Bernoulli and Gilbert

Page 36: Towards Unbiased End-to-End Diagnosis

36

Sample of Simulation Results

• Mercator (284k nodes) with Gilbert loss model and LLRD1 loss distribution

# of end hosts on OL   # of paths   Avg PL   # of links   # of LP   # of links in LP   Avg MILS length   Avg diagnosis granularity
50                     2450         8.86     3798         1042      903                2.23 (3.03)       2.24 (3.07)
100                    9900         8.80     9802         3551      1993               1.71 (2.27)       2.05 (2.95)
200                    39800        8.80     22352        14706     4335               1.49 (1.92)       1.77 (2.38)

OL: overlay network; PL: path length; LP: lossy paths

Page 37: Towards Unbiased End-to-End Diagnosis

37

Related Works

• Pure End-to-End Approaches
– Internet Tomography
• Multicast or unicast with loss correlation
– Uncorrelated end-to-end schemes

• Router Response Based Approach
– Tulip and Cing

[Figure: panels (a) and (b), each showing nodes A, B, C and N annotated with loss rates 0, 0.1 and 0.19]

Page 38: Towards Unbiased End-to-End Diagnosis

38

MILS to AS Mapping

• IP-to-AS mapping constructed from BGP routing tables

• Consider the short MILSes with length 1 or 2
– They make up about 44% of all lossy MILSes
– Most lossy links connect two different ASes

                          1 AS    2 ASes   3 ASes   >3 ASes
Len 1 MILSes (33.6%)      6.1%    27.5%    0        0
Len 2 MILSes (9.8%)       2.6%    5.8%     1.3%     0
Len > 2 MILSes (56.6%)    6.8%    17.8%    21.8%    10.2%

Page 39: Towards Unbiased End-to-End Diagnosis

39

Accuracy Validation

• Cross Validation (99.0% consistent)

• IP Spoof based Consistency Checking
– UDP: Src: A, Dst: C, TTL=255
– UDP: Src: A, Dst: B, TTL=255
– UDP: Src: C, Dst: B, TTL=2
– ICMP: Src: R3, Dst: C, TTL=255

[Figure: end hosts A, B, C and routers R1, R2, R3]

IP Spoof based Consistency: 93.5%