Darryl Veitch, The University of Melbourne. Active Measurement in Networks. MetroGrid Workshop, GridNets 2007, 19 October 2007.

DARRYL VEITCH, The University of Melbourne

ACTIVE MEASUREMENT IN NETWORKS

MetroGrid Workshop

GridNets 2007

19 October 2007

A RENAISSANCE IN NETWORK MEASUREMENT

• Data centric view of networking

• Must arise from real problems, or observations on data

• Abstractions based on data

• Results validated by data

• In fact: the scientific method in networking

• Not just getting numbers: discovery

• Knowing, understanding, improving network performance

• Not just monitoring

• Traffic patterns: link, path, node, applications, management

• Quality of Service: delay, loss, reliability

• Protocol dynamics: TCP, VoIP, ...

• Network infrastructure: routing, security, DNS, bottlenecks, latency, ...

THE RISE OF MEASUREMENT

• Now have dedicated conferences:

• Passive and Active Measurement Conference (PAM) 2000 …

• ACM Internet Measurement Conference (IMC) 2001 ...

• Papers between 1966-87 ( P.F. Pawlita, ITC-12, Italy )

• Queueing theory: several thousand

• Traffic measurement: around 50

ACTIVE VERSUS PASSIVE MEASUREMENT

• Typical Active Aims

• “End-to-End” or “Network edge”

• End-to-End Loss, Delay, Connectivity, “Discovery” …..

• Long/short term monitoring: Network health; Route state

• Internet view: Application performance and robustness

• Typical Passive Aims

• “At-a-point” or “Network Core”

• Link utilisation, Link traffic patterns, Server workloads

• Long term monitoring:

Dimensioning, Capacity Planning, Source modelling

• Engineering view: Network performance

ACTIVE PROBING VERSUS NETWORK TOMOGRAPHY

• Network Tomography

• Typically “network wide”: multiple destinations and/or sources

• Simple black box node/link models, strong assumptions

• Classical inference with twists

• Typical metrics: per link/path loss/delay/throughput

• Active Probing

• Typically over a single path

• Use tandem FIFO queue model

• Exploit discrete packet effects in semi-heuristic queueing analysis

• Typical metrics: link capacities, available bandwidth

TWO ENTREES

• Loss Tomography

Removing the temporal independence assumption

Arya, Duffield, Veitch, 2007

• Active Probing

Towards (rigorous) optimal probing

Baccelli, Machiraju, Veitch, Bolot, 2007

TOMO - GRAPHY

Tomos: section

Graphia: writing

EXAMPLES OF TOMOGRAPHY

• Atom probe tomography (APT)

• Computed tomography (CT) (formerly CAT)

• Cryo-electron tomography (Cryo-ET)

• Electrical impedance tomography (EIT)

• Magnetic resonance tomography (MRT)

• Optical coherence tomography (OCT)

• Positron emission tomography (PET)

• Quantum tomography

• Single photon emission computed tomography (SPECT)

• Seismic tomography

• X-ray tomography

COMPUTED TOMOGRAPHY

NETWORK TOMOGRAPHY

The Metrics

• Link Traffic (volume, variance) (Traffic Matrix estimation)

• Link Loss (average, temporal)

• Link Delay (variance, distribution)

• Link Topology

• Path Properties (network `kriging’)

• Joint problems (use loss or delay to infer topology)

Began with Vardi [1996]

“Network Tomography: estimating source-destination traffic intensities from link data”

Classes of Inversion Problems

• End-to-end measurements → internal metrics

• Internal measurements → path metrics

THE EARLY LITERATURE (INCOMPLETE!)

• Loss/Delay/Topology Tomography

• AT&T ( Duffield, Horowitz, Lo Presti, Towsley et al. )

• Rice ( Coates, Nowak et al. )

• Evolution:

• loss, delay → topology

• Exact MLE → EM MLE → Heuristics

• Multicast → Unicast (striping)

• Traffic Matrix Tomography

• AT&T ( Zhang, Roughan, Donoho et al. )

• Sprint ATL ( Nucci, Taft et al. )

LOSS TOMOGRAPHY USING MULTICAST PROBING

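[Figure: multicast probes sent down a logical tree; each receiver records a 0/1 outcome per probe (e.g. 1 0 1 1 ...), and these outcomes feed a loss estimator.]

Loss Estimator

Loss rates in logical tree: θ1, …, θ7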

Stochastic loss process on link acts deterministically on probes arriving to

Node and Link Processes

• loss process on link

• probe `bookkeeping’ process for node

THE LOSS MODEL

[Figure: example 0/1 sequences showing the link loss process acting on the bookkeeping process of probes arriving at a node.]

Independent Probes

• Spatial: loss processes on different links independent

• Temporal: losses within each link independent

ADDING PROBABILITY: LOSS DEPENDENCIES

Model reduces to a single parameter per link: the passage (or transmission) probability

FROM LINK PASSAGE TO PATH PASSAGE PROBABILITIES

Path probabilities: only ancestors matter

Sufficient to estimate path probabilities
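The relationship between link and path passage probabilities can be sketched as follows. This is only an illustration under the spatial and temporal independence assumptions above. The tree shape, the numerical values, and the names `parent`, `alpha`, `path_passage` and `link_passage` are hypothetical and do not come from the talk.

```python
from math import prod

# Hypothetical 7-link binary tree: parent[k] is the parent of node k; node 0 is the source.
parent = {1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
# Illustrative passage probability of the link that terminates at node k.
alpha = {1: 0.99, 2: 0.95, 3: 0.98, 4: 0.90, 5: 0.97, 6: 0.96, 7: 0.93}

def ancestors(k):
    """Nodes on the path from the source down to k (k included, source excluded)."""
    path = []
    while k != 0:
        path.append(k)
        k = parent[k]
    return path

def path_passage(k):
    """Path passage probability: product of link passage probabilities over ancestors."""
    return prod(alpha[j] for j in ancestors(k))

def link_passage(k, A):
    """Recover a link probability from path probabilities: A[k] / A[parent(k)]."""
    p = parent[k]
    return A[k] / (A[p] if p != 0 else 1.0)

A = {k: path_passage(k) for k in parent}
```

Because each link probability is a ratio of two path probabilities, estimating the path probabilities `A` is enough to recover every link parameter.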

ESTIMATION: JOINT PATH PASSAGE PROBABILITY, SINGLE PROBE

Aim: estimate

[Figure: table of 0/1 receiver outcomes, one row per probe, used to form the empirical estimate.]

10 / 19

Original MINC loss estimator for binary tree

[Cáceres, Duffield, Horowitz, Towsley 1999]
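For the simplest case, a two-receiver binary tree with independent Bernoulli link losses, the idea behind this estimator can be sketched in a few lines. If the shared path passes probes with probability A and the two leaf links with probabilities a1 and a2, the receiver marginals are A·a1 and A·a2 and the joint reception probability is A·a1·a2, so A equals the product of the marginals divided by the joint. The simulation, function names and parameter values below are assumptions for illustration, not the authors' code.

```python
import random

def simulate_two_receiver_tree(n, a_shared, a1, a2, seed=0):
    """Send n multicast probes; return a list of (y1, y2) 0/1 receiver outcomes."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n):
        shared = rng.random() < a_shared           # probe survives the shared path
        y1 = int(shared and rng.random() < a1)     # then survives leaf link 1
        y2 = int(shared and rng.random() < a2)     # then survives leaf link 2
        outcomes.append((y1, y2))
    return outcomes

def estimate_shared_passage(outcomes):
    """Estimate shared-path passage probability as g1 * g2 / g12."""
    n = len(outcomes)
    g1 = sum(y1 for y1, _ in outcomes) / n          # fraction received at receiver 1
    g2 = sum(y2 for _, y2 in outcomes) / n          # fraction received at receiver 2
    g12 = sum(y1 & y2 for y1, y2 in outcomes) / n   # fraction received at both
    return g1 * g2 / g12

outcomes = simulate_two_receiver_tree(200_000, a_shared=0.9, a1=0.8, a2=0.7)
shared_estimate = estimate_shared_passage(outcomes)
```

Only end-to-end receiver outcomes are used, yet the estimate targets an internal path, which is the point of the tomographic approach.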

ACCESSING INTERNAL PATHS

Obtain a quadratic in

ESTIMATION: JOINT PATH PASSAGE PROBABILITY, SINGLE PROBE

10 / 19

OBTAIN PATHS RECURSIVELY

Before

• Spatial: link loss processes independent

• Temporal: link loss processes Bernoulli

• Parameters: link passage probabilities

TEMPORAL INDEPENDENCE: HOW FAR TO RELAX ?

After

• Spatial: link loss processes independent

• Temporal: link loss processes stationary, ergodic

• Parameters: joint link passage probabilities over index sets

Full characterisation/identification possible!

• Importance

• Impacts delay sensitive applications like VoIP (FEC tuning)

• Characterizes bottleneck links

[Figure: loss/pass bursts in a probe sequence, and the loss-run length distribution (Pr versus loss-run length 1, 2, 3, …).]

• Loss run-length distribution ( density, mean )

TARGET LOSS CHARACTERISTIC

• Sufficiency of joint link passage probabilities

• Mean Loss-run length

• Loss-run distribution

e.g.
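As one concrete instance of this sufficiency, the mean loss-run length of a stationary, ergodic link process can be written in terms of two joint passage probabilities: it is the probability of a loss divided by the probability that a loss run starts, that is, a pass followed by a loss. The sketch below assumes a 0/1 passage indicator Z with p1 = Pr[Z_i = 1] and p11 = Pr[Z_i = 1, Z_{i+1} = 1]; the function name and symbols are chosen here for illustration.

```python
def mean_loss_run_length(p1, p11):
    """Mean loss-run length from single and consecutive-pair passage probabilities.

    p1  = Pr[Z_i = 1]                (probe passes)
    p11 = Pr[Z_i = 1, Z_{i+1} = 1]   (two consecutive probes pass)
    """
    p10 = p1 - p11   # Pr[pass then loss]: the rate at which loss runs begin
    if p10 <= 0:
        raise ValueError("no loss runs: need p11 < p1")
    return (1.0 - p1) / p10

# Sanity check against the Bernoulli case: with passage probability 0.9,
# loss runs are geometric with mean 1 / 0.9.
bernoulli_mean = mean_loss_run_length(0.9, 0.9 * 0.9)
```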

ACCESSING TEMPORAL PARAMETERS: GENERAL PROPERTIES


JOINT PASSAGE PROBABILITIES

• Joint link passage probability

• Joint path passage probability

ESTIMATION: JOINT PATH PASSAGE PROBABILITY

computed recursively over index sets

No. of equations equal to degree of node k

= 0 for

• Estimation of in general trees

• Requires solving polynomials with degree equal to the degree of node k

Numerical computations for trees with large degree

• Recursion over smaller index sets

• Simpler variants

• Subtree-partitioning

Requires solutions to only linear or quadratic equations

No loss of samples

Also simplifies existing MINC estimators

• Avoid recursion over index sets by considering only

subsets of receiver events which imply

ESTIMATION: JOINT PATH PASSAGE PROBABILITY

... ...

SIMULATION EXPERIMENTS

• Setup

• Loss process

• Discrete-time Markov chains

• On-off processes: pass-runs geometric, loss-runs truncated Zipf

• Estimation

• Passage probability

• Joint passage probability for a pair of consecutive probes

• Mean loss-run length:

• Relative error:
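A minimal sketch of the on-off loss process in this setup, with geometric pass-runs and truncated Zipf loss-runs, together with a relative error computation. For simplicity the mean loss-run length is estimated here from a directly observed link sequence rather than inferred from receiver data as in the talk. The Zipf exponent, truncation point, run counts and function names are assumptions.

```python
import random

def truncated_zipf_pmf(max_len, s=1.5):
    """Pr[loss-run = k] proportional to k**(-s) for k = 1..max_len."""
    weights = [k ** -s for k in range(1, max_len + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def on_off_passage_sequence(n_runs, pass_continue, zipf_pmf, seed=0):
    """Alternate geometric pass-runs (1s) with truncated-Zipf loss-runs (0s)."""
    rng = random.Random(seed)
    lengths = list(range(1, len(zipf_pmf) + 1))
    z = []
    for _ in range(n_runs):
        run = 1                                   # pass-run has at least one probe
        while rng.random() < pass_continue:
            run += 1
        z.extend([1] * run)
        loss_run = rng.choices(lengths, weights=zipf_pmf)[0]
        z.extend([0] * loss_run)
    return z

def estimate_mean_loss_run(z):
    """Pr[loss] / Pr[pass then loss], both taken as empirical frequencies."""
    n = len(z) - 1
    p_loss = sum(1 for a in z[:-1] if a == 0) / n
    p_pass_then_loss = sum(1 for a, b in zip(z, z[1:]) if a == 1 and b == 0) / n
    return p_loss / p_pass_then_loss

def relative_error(estimate, truth):
    return abs(estimate - truth) / truth

pmf = truncated_zipf_pmf(20)
true_mean = sum(k * p for k, p in zip(range(1, 21), pmf))
z = on_off_passage_sequence(50_000, pass_continue=0.9, zipf_pmf=pmf)
err = relative_error(estimate_mean_loss_run(z), true_mean)
```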

EXPERIMENTS

• Estimation for shared path in case of two-receiver binary trees

Markov chain

On-off process

EXPERIMENTS

Markov chain

On-off process

• Estimation for shared path in case of two-receiver binary trees

EXPERIMENTS

• Estimation of for larger trees

Link 1

Trees taken from router-level map of AT&T network produced by Rocketfuel (2253 links, 731 nodes)

Random shortest path multicast trees with 32 receivers. Degree of internal nodes from 2 to 6, maximum height 6

VARIANCE

• Estimation for shared path in case of ternary tree

Standard errors shown for various temporal estimators

CONCLUSIONS

• Estimators for temporal loss parameters, in addition to loss rates

• Estimation of any joint probability possible for a pattern of probes

• Class of estimators to reduce computational burden

• Subtree-partition: simplifies existing MINC estimators

• Future work

• Asymptotic variance

• MLE for special cases (Markov chains)

• Hypothesis tests

• Experiments with real traffic

FRANÇOIS BACCELLI, SRIDHAR MACHIRAJU, DARRYL VEITCH, JEAN BOLOT

OPTIMAL PROBING IN CONVEX NETWORKS

The Case Against Poisson Probing

• Zero-sampling bias not unique to Poisson (in non-intrusive case)

• PASTA talks about bias, is silent on variance

• PASTA is not optimal for variance or MSE

• PASTA is about sampling only, is silent on inversion

PREVIOUS WORK

Probe Pattern Separation Rule: select inter-probe (or probe pattern) separations as i.i.d. positive random variables, bounded above zero.

Example: Uniform (i.i.d.) separations on [ 0.9μ, 1.1μ ]

• Aims for variance reduction

• Avoids phase locking (leads to sample path 'bias')

• Allows freedom of probe stream design

• Results derived in context of delay only

• No optimality result (expected to be highly system and traffic dependent)
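The phase-locking problem that the Separation Rule avoids can be illustrated with a toy experiment. The sketch below samples a hypothetical periodic ground-truth process (a square wave with time average 0.3) using strictly periodic probes and using i.i.d. uniform separations on [0.9μ, 1.1μ]. The square wave, the parameter values and the function names are assumptions chosen only to make the effect visible.

```python
import random

def periodic_times(n, mu, phase=0.0):
    """Strictly periodic probe times with period mu."""
    return [phase + k * mu for k in range(n)]

def separation_rule_times(n, mu, spread=0.1, seed=0):
    """Probe times with i.i.d. separations uniform on [(1-spread)*mu, (1+spread)*mu]."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n):
        t += rng.uniform((1 - spread) * mu, (1 + spread) * mu)
        times.append(t)
    return times

def square_wave(t, period=1.0, duty=0.3):
    """Hypothetical process: 1 during the first `duty` fraction of each period, else 0."""
    return 1.0 if (t % period) < duty * period else 0.0

def sample_mean(times, process):
    return sum(process(t) for t in times) / len(times)

# Periodic probes locked to the process period always see the same phase.
locked = sample_mean(periodic_times(1000, mu=1.0, phase=0.1), square_wave)
# Separations bounded away from zero but random: the phase drifts and mixes.
mixed = sample_mean(separation_rule_times(200_000, mu=1.0), square_wave)
```

In this toy case the periodic stream reports a mean of 1.0 regardless of sample size, while the randomised stream approaches the true time average of 0.3.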

LIMITATIONS

• Extended all results to loss case (loss and delay in uniform framework)

• For convex networks, have universal optimum for variance

• But: sample-path bias

• Give family of probing strategies with

• Zero bias and strong consistency

• Variance as close to optimal as desired (tunable)

• Fast simulation

• Consistent with spirit of Separation Rule

NEW WORK

But are networks convex?

• Some systems where the answer is known to be Yes

• virtual work of M/G/1

• loss process of M/M/1/K

• Insight into when No

• Real data when answer seems to be Yes

Sampling

• For end-to-end delay of a probe of size x

• Only have probe samples

TWO PROBLEMS: SAMPLING AND INVERSION

Inversion

From the measured delay data of perturbed network, may want:

• Unperturbed delays

• Link capacities

• Available bandwidth

• Cross traffic parameters at hop 3

• TCP fairness metric at hop 5 ….

Here we focus on sampling only, do so using non-intrusive probing

Non-intrusive Delay

• Delay process using zero sized probes still meaningful

• Each probe carries a delay: samples available

THE QUESTION OF GROUND TRUTH

Non-intrusive Loss

• Losses with x=0 are hard to find.. meaningful but useless

• Lost probes are not available (where? How?)

• Probes which are not lost tell us what about loss?

Cannot base non-intrusive probing on virtual (x=0) probes

Approach

• Define end-to-end process directly in continuous time

• Must be a function of system in equilibrium - no probes, no perturbation!

• Non-intrusive probing defined directly as a sampling of this process:

• Interpretation: what probe would have seen if sent in at time .

GENERAL GROUND TRUTH

Note

• Process may be very general, not just isolated probes but probe patterns!

• Applies equally to loss or delay

• This is the only way, even virtual probes can be intrusive!

LOSS GROUND TRUTH EXAMPLES

1-hop

• FIFO, buffer size K bytes, droptail

• If packet based instead, becomes independent of x !

2-hop

• Depends on x

Other examples:

• Packet pair jitter

• Indicator of packet loss in a train

• Largest jitter in a chain, or number lost
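For the 1-hop example above, one plausible reading of the loss ground truth is an indicator of whether a probe of size x would overflow the droptail buffer given the backlog at that instant. The sketch below encodes that reading as an assumption; the function names and arguments are hypothetical, and the talk's exact definition is not reproduced here.

```python
def probe_lost_byte_buffer(backlog_bytes, x, K):
    """Byte-based droptail FIFO (assumed rule): lost if backlog plus probe exceeds K bytes."""
    return backlog_bytes + x > K

def probe_lost_packet_buffer(backlog_packets, K_packets):
    """Packet-based droptail FIFO (assumed rule): lost if the buffer has no free slot.

    The probe size does not appear, matching the remark that the
    packet-based case is independent of x.
    """
    return backlog_packets + 1 > K_packets

# The same backlog gives different answers for different probe sizes.
small_probe_lost = probe_lost_byte_buffer(backlog_bytes=9_000, x=40, K=10_000)
large_probe_lost = probe_lost_byte_buffer(backlog_bytes=9_000, x=1_500, K=10_000)
```

Evaluating such an indicator on the unperturbed system at an arbitrary time is what makes the loss ground truth a continuous-time process that probing then samples.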

NIPSTA: NON-INTRUSIVE PROBING SEES TIME AVERAGES

Strong Consistency: Empirical averages seen by probe samples converge to true expectation of continuous time process, assuming:

• Stationarity and ergodicity of CT

• Stationarity and ergodicity of PT (easy)

• Independence of CT and PT (easy)

• Joint ergodicity between CT and PT

Result follows just as in delay work, using marked point processes, Palm Calculus and Ergodic Theory.

Not Restricted to Means:

• Temporal quantities also, e.g. jitter

• Probe-train sampling strategies

OPTIMAL VARIANCE OF SAMPLE MEAN

Covariance function

Sample Mean Estimator

Periodic Probing

PERIODIC PROBING IS OPTIMAL !

Compare integral terms: if R is convex, then can use Jensen's inequality.

Problem:

• Periodic is not mixing, so joint ergodicity not assured (phase lock)

• If phase locking occurs, get 'sample path bias' (estimator not strongly consistent)

No amount of probe train design can beat periodic!
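The convexity argument can be sketched as follows, under assumptions stated here rather than taken from the slides: probes form a stationary renewal stream, independent of the observed process, with mean separation \mu; \tau_k denotes the sum of k consecutive separations; R is the covariance function of the observed stationary process; and \hat{m}_N is the sample mean over N probes.

```latex
\begin{align*}
\operatorname{Var}(\hat{m}_N)
  &= \frac{1}{N^2}\Bigl[\, N R(0)
     + 2 \sum_{k=1}^{N-1} (N-k)\, \mathbb{E}\bigl[R(\tau_k)\bigr] \Bigr],
  \qquad \mathbb{E}[\tau_k] = k\mu, \\
\mathbb{E}\bigl[R(\tau_k)\bigr]
  &\ge R\bigl(\mathbb{E}[\tau_k]\bigr) = R(k\mu)
  \qquad \text{(Jensen, $R$ convex)}.
\end{align*}
```

The lower bound R(k\mu) is exactly the lag-k term obtained with periodic probing of period \mu, so under these assumptions every term of the sum is minimised by the periodic stream.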

GAMMA RENEWAL PROBING

Gamma Law

Gamma Renewal Family:

• As the shape parameter increases, variance drops at constant mean

• Poisson is beaten once the shape parameter exceeds 1

• Process tends to periodic as the shape parameter tends to infinity

• Sensitivity to periodicities increases however

Proof:

• Again uses Jensen's inequality

• Uses a conditioning trick and a technical result on Gamma densities
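The trend across the Gamma renewal family can be checked numerically for one convenient convex covariance function. The sketch below assumes R(τ) = exp(−rate·τ), normalised so that R(0) = 1, and probe separations that are Gamma distributed with shape `beta` and mean `mu`. In that case the sum of k separations is Gamma with shape k·beta, and its expected covariance has the closed form (1 + rate·mu/beta)^(−k·beta). The exponential covariance, the symbol `beta` and all function names are assumptions for illustration.

```python
import math
import random

def gamma_separations(n, mu, beta, seed=0):
    """n i.i.d. inter-probe times: Gamma with shape beta and scale mu/beta (mean mu)."""
    rng = random.Random(seed)
    return [rng.gammavariate(beta, mu / beta) for _ in range(n)]

def _var_from_lags(n, lag):
    """Variance of the n-probe sample mean given expected lag-k covariances, R(0) = 1."""
    return (n + 2 * sum((n - k) * lag(k) for k in range(1, n))) / n ** 2

def var_sample_mean_gamma(n, mu, beta, rate=1.0):
    """Gamma renewal probing with assumed covariance R(tau) = exp(-rate * tau)."""
    return _var_from_lags(n, lambda k: (1.0 + rate * mu / beta) ** (-k * beta))

def var_sample_mean_periodic(n, mu, rate=1.0):
    """Periodic probing with period mu and the same assumed covariance."""
    return _var_from_lags(n, lambda k: math.exp(-rate * k * mu))

v_poisson = var_sample_mean_gamma(100, mu=1.0, beta=1.0)    # beta = 1 is Poisson probing
v_gamma10 = var_sample_mean_gamma(100, mu=1.0, beta=10.0)
v_periodic = var_sample_mean_periodic(100, mu=1.0)
```

For this covariance the three values are ordered Poisson > Gamma with shape 10 > periodic, consistent with the bullets above. The calculation says nothing about phase locking, which is the reason for stopping short of periodic probing.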

TEST WITH REAL DATA FOR MEAN DELAY

EXAMPLE WITH REAL DATA

EXAMPLE WITH REAL DATA

CONCLUSION TO A MORNING

• Active measurement continues to develop!

• Grid applications will no doubt lead to interesting new problems

Thank you for your attention
