DARRYL VEITCH
The University of Melbourne
ACTIVE MEASUREMENT IN NETWORKS
MetroGrid Workshop
GridNets 2007
19 October 2007
A RENAISSANCE IN NETWORK MEASUREMENT
• Data centric view of networking
• Must arise from real problems, or observations on data
• Abstractions based on data
• Results validated by data
• In fact: the scientific method in networking
• Not just getting numbers, Discovery
• Knowing, understanding, improving network performance
• Not just monitoring
• Traffic patterns: link, path, node, applications, management
• Quality of Service: delay, loss, reliability
• Protocol dynamics: TCP, VoIP, ..
• Network infrastructure: routing, security, DNS, bottlenecks, latency..
THE RISE OF MEASUREMENT
• Now have dedicated conferences:
• Passive and Active Measurement Conference (PAM) 2000 …
• ACM Internet Measurement Conference (IMC) 2001 ...
• Papers between 1966-87 ( P.F. Pawlita, ITC-12, Italy )
• Queueing theory: several thousand
• Traffic measurement: around 50
ACTIVE VERSUS PASSIVE MEASUREMENT
• Typical Active Aims
• “End-to-End” or “Network edge”
• End-to-End Loss, Delay, Connectivity, “Discovery” …..
• Long/short term monitoring: Network health; Route state
• Internet view: Application performance and robustness
• Typical Passive Aims
• “At-a-point” or “Network Core”
• Link utilisation, Link traffic patterns, Server workloads
• Long term monitoring:
Dimensioning, Capacity Planning, Source modelling
• Engineering view: Network performance
ACTIVE PROBING VERSUS NETWORK TOMOGRAPHY
• Network Tomography
• Typically “network wide”: multiple destinations and/or sources
• Simple black box node/link models, strong assumptions
• Classical inference with twists
• Typical metrics: per link/path loss/delay/throughput
• Active Probing
• Typically over a single path
• Use tandem FIFO queue model
• Exploit discrete packet effects in semi-heuristic queueing analysis
• Typical metrics: link capacities, available bandwidth
TWO ENTREES
• Loss Tomography
Removing the temporal independence assumption
Arya, Duffield, Veitch, 2007
• Active Probing
Towards (rigorous) optimal probing
Baccelli, Machiraju, Veitch, Bolot, 2007
TOMO - GRAPHY
Tomos: section
Graphia: writing
EXAMPLES OF TOMOGRAPHY
• Atom probe tomography (APT)
• Computed tomography (CT) (formerly CAT)
• Cryo-electron tomography (Cryo-ET)
• Electrical impedance tomography (EIT)
• Magnetic resonance tomography (MRT)
• Optical coherence tomography (OCT)
• Positron emission tomography (PET)
• Quantum tomography
• Single photon emission computed tomography (SPECT)
• Seismic tomography
• X-ray tomography
COMPUTED TOMOGRAPHY
NETWORK TOMOGRAPHY
The Metrics
• Link Traffic (volume, variance) (Traffic Matrix estimation)
• Link Loss (average, temporal)
• Link Delay (variance, distribution)
• Link Topology
• Path Properties (network 'kriging')
• Joint problems (use loss or delay to infer topology)
Began with Vardi [1996]
“Network Tomography: estimating source-destination traffic intensities from link data”
Classes of Inversion Problems
• End-to-end measurements → internal metrics
• Internal measurements → path metrics
THE EARLY LITERATURE (INCOMPLETE!)
• Loss/Delay/Topology Tomography
• AT&T ( Duffield, Horowitz, Lo Presti, Towsley et al. )
• Rice ( Coates, Nowak et al. )
• Evolution:
• loss, delay → topology
• Exact MLE → EM MLE → Heuristics
• Multicast → Unicast (striping)
• Traffic Matrix Tomography
• AT&T ( Zhang, Roughan, Donoho et al. )
• Sprint ATL ( Nucci, Taft et al. )
LOSS TOMOGRAPHY USING MULTICAST PROBING
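[Figure: multicast probes sent from the source down a logical tree; the binary receiver outcomes (1 = received, 0 = lost) feed a loss estimator, which outputs the loss rates θ1, …, θ7 in the logical tree]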
Stochastic loss process on link acts deterministically on probes arriving to
Node and Link Processes
• loss process on link
• probe 'bookkeeping' process for node
THE LOSS MODEL
[Figure: example binary sequences for the link loss process and the resulting node bookkeeping process]
Independent Probes
• Spatial: loss processes on different links independent
• Temporal: losses within each link independent
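The independence assumptions above can be sketched as a small simulation. This is a minimal illustration, not the authors' code: the tree encoding (`parent`), the per-link passage probabilities (`pass_prob`) and the function name are all assumptions made for the example. Each link passes or drops a probe independently across links (spatial) and across probes (temporal), and a node records a probe only if its parent saw it and the link into it passed it, which is the bookkeeping idea.

```python
import random

def simulate_multicast(parent, pass_prob, n_probes, seed=0):
    """Simulate multicast probes on a logical tree with independent Bernoulli links.

    parent:    dict node -> parent node (the root maps to None)   [hypothetical encoding]
    pass_prob: dict node -> probability that the link INTO that node passes a probe
    Returns a list with one dict per probe: node -> 1 (probe seen) or 0 (not seen).
    """
    rng = random.Random(seed)

    def depth(k):
        d = 0
        while parent[k] is not None:
            k = parent[k]
            d += 1
        return d

    order = sorted(parent, key=depth)  # parents are processed before children
    outcomes = []
    for _ in range(n_probes):
        seen = {}
        for k in order:
            if parent[k] is None:
                seen[k] = 1  # the source always holds the probe
            else:
                link_passes = 1 if rng.random() < pass_prob[k] else 0
                # bookkeeping: a node sees the probe only if its parent did
                seen[k] = seen[parent[k]] * link_passes
        outcomes.append(seen)
    return outcomes
```

Only the receiver (leaf) entries of each outcome would be observable in a real experiment; the internal entries are what tomography tries to infer.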
ADDING PROBABILITY: LOSS DEPENDENCIES
Model reduces to a single parameter per link: the passage (or transmission) probability
FROM LINK PASSAGE TO PATH PASSAGE PROBABILITIES
Path probabilities: only ancestors matter
Sufficient to estimate path probabilities
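A minimal sketch of why path probabilities suffice, assuming independent links: the path passage probability to a node is the product of the link passage probabilities over its ancestors, and each link probability is recovered as a ratio of two path probabilities. The tree encoding and function names are hypothetical.

```python
def path_passage(parent, link_pass):
    """Path passage probability from the root to every node.

    parent:    dict node -> parent (root maps to None)   [hypothetical encoding]
    link_pass: dict node -> passage probability of the link into that node
    """
    path = {}

    def compute(k):
        if k not in path:
            if parent[k] is None:
                path[k] = 1.0  # empty path at the root
            else:
                # only the ancestors of k contribute to the product
                path[k] = compute(parent[k]) * link_pass[k]
        return path[k]

    for k in parent:
        compute(k)
    return path


def link_passage(parent, path):
    """Invert: link probability = path probability / parent's path probability."""
    return {k: path[k] / path[parent[k]] for k in parent if parent[k] is not None}
```

For example, with links 0→1, 1→2, 1→3 passing with probabilities 0.9, 0.8 and 0.5, the path to node 2 passes with probability 0.9 × 0.8, and dividing by the path probability to node 1 returns 0.8.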
ESTIMATION: JOINT PATH PASSAGE PROBABILITY, SINGLE PROBE
Aim: estimate
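[Figure: table of binary receiver outcomes per probe (1 = received, 0 = lost) with a derived per-probe indicator column; empirical fraction shown: 10 / 19]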
Original MINC loss estimator for binary tree
[Cáceres,Duffield, Horowitz, Towsley 1999]
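For the simplest case, a tree with one shared path feeding two receivers, the estimator has a closed form that can be sketched directly. Assuming independent links, P(receiver 1 gets the probe) = a0·a1, P(receiver 2 gets it) = a0·a2 and P(both get it) = a0·a1·a2, so a0 = P1·P2 / P12. The function name and the trace format (lists of 0/1 per probe) are assumptions for illustration; this is not the general-tree estimator.

```python
def minc_two_receiver(x1, x2):
    """Estimate passage probabilities in a two-receiver tree.

    x1, x2: equal-length lists of 0/1 outcomes, one entry per multicast probe.
    Returns (shared_path, link_to_receiver_1, link_to_receiver_2).
    """
    n = len(x1)
    g1 = sum(x1) / n                                # fraction received by receiver 1
    g2 = sum(x2) / n                                # fraction received by receiver 2
    g12 = sum(a * b for a, b in zip(x1, x2)) / n    # fraction received by both
    if g12 == 0:
        raise ValueError("no probe reached both receivers; estimator undefined")
    shared = g1 * g2 / g12
    return shared, g1 / shared, g2 / shared
```

With finite samples the estimates can fall slightly outside [0, 1]; a practical implementation would need to decide how to handle that.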
ACCESSING INTERNAL PATHS
Obtain a quadratic in
ESTIMATION: JOINT PATH PASSAGE PROBABILITY, SINGLE PROBE
10 / 19
OBTAIN PATHS RECURSIVELY
Before
• Spatial: link loss processes independent
• Temporal: link loss processes Bernoulli
• Parameters: link passage probabilities
TEMPORAL INDEPENDENCE: HOW FAR TO RELAX ?
After
• Spatial: link loss processes independent
• Temporal: link loss processes stationary, ergodic
• Parameters: joint link passage probabilities over index sets
Full characterisation/identification possible!
• Importance
• Impacts delay sensitive applications like VoIP (FEC tuning)
• Characterizes bottleneck links
[Figure: a probe sequence showing loss/pass bursts, and the probability distribution (Pr) of the loss-run length over 1, 2, 3, …]
• Loss run-length distribution ( density, mean )
TARGET LOSS CHARACTERISTIC
• Sufficiency of joint link passage probabilities
• Mean Loss-run length
• Loss-run distribution
e.g.
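As one illustration of sufficiency, the mean loss-run length of a stationary, ergodic 0/1 passage process follows from two joint passage probabilities: p1 = P(pass) and p11 = P(two consecutive passes). Loss runs start at rate P(pass then loss) = p1 − p11, and the fraction of lost probes is 1 − p1, so the mean run length is their ratio. The names below are hypothetical and the empirical helper is a plain frequency count, not the tomographic estimator.

```python
def mean_loss_run_length(p1, p11):
    """Mean length of a run of consecutive losses.

    p1:  P(a probe passes)
    p11: P(two consecutive probes both pass)
    """
    p10 = p1 - p11  # P(pass followed by loss) = rate at which loss runs begin
    if p10 <= 0:
        raise ValueError("no loss runs: p1 - p11 must be positive")
    return (1.0 - p1) / p10


def empirical_passage(z):
    """Empirical (p1, p11) from a directly observed 0/1 passage sequence z."""
    n = len(z)
    p1 = sum(z) / n
    p11 = sum(a * b for a, b in zip(z, z[1:])) / (n - 1)
    return p1, p11
```

For a Bernoulli process with loss probability q this gives (q) / ((1 − q) q) = 1 / (1 − q), the mean of a geometric run, as expected.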
ACCESSING TEMPORAL PARAMETERS: GENERAL PROPERTIES
JOINT PASSAGE PROBABILITIES
• Joint link passage probability
• Joint path passage probability
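A hedged sketch of how a joint passage probability might be read off receiver data in the two-receiver tree. It assumes only spatial independence of the link loss processes: the probability that a pair of consecutive probes both reach receiver 1 is a0[11]·a1[11], similarly for receiver 2, and for both receivers it is a0[11]·a1[11]·a2[11], so the shared-path joint probability is the same ratio as in the single-probe case. This illustrates the idea and is not necessarily the estimator used in the talk; all names are hypothetical.

```python
def _freq(x):
    """Fraction of probes with outcome 1."""
    return sum(x) / len(x)


def _pair_freq(x):
    """Fraction of consecutive probe pairs where both outcomes are 1."""
    return sum(a * b for a, b in zip(x, x[1:])) / (len(x) - 1)


def shared_path_joint_passage(x1, x2):
    """Shared-path passage estimates in a two-receiver tree.

    x1, x2: equal-length 0/1 receiver traces, one entry per multicast probe.
    Returns (a, a11): single-probe passage probability and the joint passage
    probability for a pair of consecutive probes, both for the shared path.
    """
    both = [a * b for a, b in zip(x1, x2)]  # probe reached both receivers
    a = _freq(x1) * _freq(x2) / _freq(both)
    a11 = _pair_freq(x1) * _pair_freq(x2) / _pair_freq(both)
    return a, a11
```

If the two child links are lossless, both receivers see exactly the shared-path process and the estimates reduce to its plain empirical frequencies.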
ESTIMATION: JOINT PATH PASSAGE PROBABILITY
computed recursively over index sets
No. of equations equal to degree of node k
= 0 for
• Estimation of in general trees
• Requires solving polynomials with degree equal to the degree of node k
Numerical computations for trees with large degree
• Recursion over smaller index sets
• Simpler variants
• Subtree-partitioning
Requires solutions to only linear or quadratic equations
No loss of samples
Also simplifies existing MINC estimators
• Avoid recursion over index sets by considering only
subsets of receiver events which imply
ESTIMATION: JOINT PATH PASSAGE PROBABILITY
... ...
SIMULATION EXPERIMENTS
• Setup
• Loss process
• Discrete-time Markov chains
• On-off processes: pass-runs geometric, loss-runs truncated Zipf
• Estimation
• Passage probability
• Joint passage probability for a pair of consecutive probes
• Mean loss-run length:
• Relative error:
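A minimal sketch of the on-off loss process described in the setup, with geometric pass-runs and truncated Zipf loss-runs. The parameter names and default values (`q`, `s`, `max_run`) are assumptions, not the values used in the experiments, and the relative error is taken here as (estimate − truth) / truth, which is also an assumption since the slide's formula is not reproduced.

```python
import math
import random

def on_off_losses(n, q=0.1, s=1.5, max_run=50, seed=0):
    """Generate n passage indicators (1 = pass, 0 = loss) from an on-off process.

    Pass-runs: geometric on {1, 2, ...} with success parameter q (0 < q < 1).
    Loss-runs: Zipf with exponent s, truncated to {1, ..., max_run}.
    """
    rng = random.Random(seed)
    lengths = list(range(1, max_run + 1))
    weights = [l ** -s for l in lengths]  # unnormalised truncated Zipf weights
    z = []
    while len(z) < n:
        u = rng.random()
        # inverse-transform sample of a geometric run length (>= 1)
        pass_run = int(math.log(1.0 - u) / math.log(1.0 - q)) + 1
        z.extend([1] * pass_run)
        loss_run = rng.choices(lengths, weights=weights)[0]
        z.extend([0] * loss_run)
    return z[:n]


def relative_error(estimate, truth):
    """Signed relative error of an estimate (assumed definition)."""
    return (estimate - truth) / truth
```

A sequence generated this way could drive the loss on one link of a simulated tree, with the estimators then compared against the known parameters.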
EXPERIMENTS
• Estimation for shared path in case of two-receiver binary trees
Markov chain
On-off process
EXPERIMENTS
Markov chain
On-off process
• Estimation for shared path in case of two-receiver binary trees
EXPERIMENTS
• Estimation of for larger trees
Link 1
Trees taken from router-level map of AT&T network produced by Rocketfuel (2253 links, 731 nodes)
Random shortest path multicast trees with 32 receivers. Degree of internal nodes from 2 to 6, maximum height 6
VARIANCE
• Estimation for shared path in case of ternary tree
Standard errors shown for various temporal estimators
CONCLUSIONS
• Estimators for temporal loss parameters, in addition to loss rates
• Estimation of any joint probability possible for a pattern of probes
• Class of estimators to reduce computational burden
• Subtree-partition: simplifies existing MINC estimators
• Future work
• Asymptotic variance
• MLE for special cases (Markov chains)
• Hypothesis tests
• Experiments with real traffic
FRANÇOIS BACCELLI, SRIDHAR MACHIRAJU, DARRYL VEITCH, JEAN BOLOT
OPTIMAL PROBING IN CONVEX NETWORKS
The Case Against Poisson Probing
• Zero-sampling bias not unique to Poisson (in non-intrusive case)
• PASTA talks about bias, is silent on variance
• PASTA is not optimal for variance or MSE
• PASTA is about sampling only, is silent on inversion
PREVIOUS WORK
Probe Pattern Separation Rule: select inter-probe (or probe pattern) separations as i.i.d. positive random variables, bounded above zero.
Example: Uniform (i.i.d.) separations on [ 0.9μ, 1.1μ ]
• Aims for variance reduction
• Avoids phase locking (leads to sample path 'bias')
• Allows freedom of probe stream design
• Results derived in context of delay only
• No optimality result (expected to be highly system and traffic dependent)
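The Separation Rule example (uniform separations on [0.9μ, 1.1μ]) and the phase-locking problem it avoids can be sketched with a toy experiment. Everything here is an illustrative assumption: the observed process is a deterministic periodic 0/1 signal with time average 0.3, chosen only because strictly periodic probes of the same period lock onto a single phase.

```python
import random

def separated_probe_times(n, mu, spread=0.1, seed=0):
    """Probe times with i.i.d. separations uniform on [(1-spread)*mu, (1+spread)*mu]."""
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n):
        t += rng.uniform((1.0 - spread) * mu, (1.0 + spread) * mu)
        times.append(t)
    return times


def periodic_probe_times(n, mu):
    """Strictly periodic probe times 0, mu, 2*mu, ..."""
    return [k * mu for k in range(n)]


def toy_process(t, period=1.0, duty=0.3):
    """Toy observed process: 1 during the first `duty` fraction of each period."""
    return 1 if (t % period) < duty * period else 0


def sample_mean(times):
    """Empirical average of the toy process seen at the given probe times."""
    return sum(toy_process(t) for t in times) / len(times)
```

With μ equal to the signal period, the periodic stream samples the same phase every time and reports 1 instead of 0.3, while the separated stream drifts across all phases and its average approaches the time average.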
LIMITATIONS
• Extended all results to loss case (loss and delay in uniform framework)
• For convex networks, have universal optimum for variance
• But: sample-path bias
• Give family of probing strategies with
• Zero bias and strong consistency
• Variance as close to optimal as desired (tunable)
• Fast simulation
• Consistent with spirit of Separation Rule
NEW WORK
But are networks convex?
• Some systems where the answer is known to be Yes
• virtual work of M/G/1
• loss process of M/M/1/K
• Insight into when No
• Real data when answer seems to be Yes
Sampling
• For end-to-end delay of a probe of size x
• Only have probe samples
TWO PROBLEMS: SAMPLING AND INVERSION
Inversion
From the measured delay data of perturbed network, may want:
• Unperturbed delays
• Link capacities
• Available bandwidth
• Cross traffic parameters at hop 3
• TCP fairness metric at hop 5 ….
Here we focus on sampling only, do so using non-intrusive probing
Non-intrusive Delay
• Delay process using zero sized probes still meaningful
• Each probe carries a delay: samples available
THE QUESTION OF GROUND TRUTH
Non-intrusive Loss
• Losses with x=0 are hard to find.. meaningful but useless
• Lost probes are not available (where? How?)
• Probes which are not lost tell us ? about loss?
Cannot base non-intrusive probing on virtual (x=0) probes
Approach
• Define end-to-end process directly in continuous time
• Must be a function of system in equilibrium - no probes, no perturbation!
• Non-intrusive probing defined directly as a sampling of this process:
• Interpretation: what a probe would have seen if sent in at time t.
GENERAL GROUND TRUTH
Note
• Process may be very general, not just isolated probes but probe patterns!
• Applies equally to loss or delay
• This is the only way, even virtual probes can be intrusive!
LOSS GROUND TRUTH EXAMPLES
1-hop
• FIFO, buffer size K bytes, droptail
• If packet based instead, becomes independent of x !
2-hop
• Depends on x
Other examples:
• Packet pair jitter
• Indicator of packet loss in a train
• Largest jitter in a chain, or number lost
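The 1-hop example can be sketched as a pair of indicator functions. This is an assumed reading of the slide: for a byte-based droptail FIFO buffer of K bytes, a probe of size x sent at time t would be lost if the backlog at t plus x exceeds K; for a packet-based buffer, loss depends only on whether the queue is full. Whether the packet in service counts toward the backlog is left open, and the function names are hypothetical.

```python
def byte_droptail_loss(backlog_bytes, x, buffer_bytes):
    """Loss indicator for a probe of size x at a byte-based droptail buffer.

    backlog_bytes: bytes queued when the probe would arrive (system without probes)
    Returns 1 if the probe would be dropped, else 0.
    """
    return 1 if backlog_bytes + x > buffer_bytes else 0


def packet_droptail_loss(queue_packets, buffer_packets):
    """Loss indicator at a packet-based droptail buffer: independent of probe size."""
    return 1 if queue_packets >= buffer_packets else 0
```

Evaluating the indicator along the backlog trajectory of the unperturbed system gives a continuous-time loss process that can be sampled without sending any probe.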
NIPSTA: NON-INTRUSIVE PROBING SEES TIME AVERAGES
Strong Consistency: Empirical averages seen by probe samples converge to the true expectation of the continuous time process, assuming:
• Stationarity and ergodicity of CT
• Stationarity and ergodicity of PT (easy)
• Independence of CT and PT (easy)
• Joint ergodicity between CT and PT
Result follows just as in delay work, using marked point processes, Palm Calculus and Ergodic Theory.
Not Restricted to Means :
• Temporal quantities also, e.g. jitter
• Probe-train sampling strategies
OPTIMAL VARIANCE OF SAMPLE MEAN
Covariance function
Sample Mean Estimator
Periodic Probing
PERIODIC PROBING IS OPTIMAL !
Compare integral terms:
If R is convex, then can use Jensen’s inequality:
Problem:
• Periodic is not mixing, so joint ergodicity not assured (phase lock)
• If it occurs, get 'sample path bias' (estimator not strongly consistent)
No amount of probe train design can beat periodic!
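The Jensen step can be sketched as follows, under notation assumed here rather than taken from the slides: N probes at times T_1 < … < T_N from a stationary probe stream that is independent of the observed process, mean separation μ, S_k = T_{i+k} − T_i, and R the covariance function of the observed process. The slides phrase the comparison through integral terms; the form below uses expectations of the same quantities.

```latex
\operatorname{Var}(\hat{p}_N)
  = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} \mathbb{E}\,R\bigl(|T_i - T_j|\bigr)
  = \frac{1}{N}\Bigl[\,R(0) + 2\sum_{k=1}^{N-1}\Bigl(1-\frac{k}{N}\Bigr)\,\mathbb{E}\,R(S_k)\Bigr],
\qquad
\mathbb{E}\,R(S_k) \;\ge\; R\bigl(\mathbb{E}\,S_k\bigr) = R(k\mu)
\quad\text{(Jensen, $R$ convex)}.
```

Since R(kμ) is exactly the k-th term for periodic probing with period μ, every term of the sum is minimised by the periodic stream among streams with the same mean separation.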
GAMMA RENEWAL PROBING
Gamma Law
Gamma Renewal Family:
• As the shape parameter increases, variance drops at constant mean
• Poisson is beaten once the shape parameter exceeds 1
• Process tends to periodic as the shape parameter tends to infinity
• Sensitivity to periodicities increases however
Proof:
• Again uses Jensen’s inequality
• Uses a conditioning trick and a technical result on Gamma densities
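A small numerical sketch of the Gamma renewal family, under assumptions made for illustration: the covariance function is the toy convex choice R(τ) = exp(−aτ), separations are Gamma with shape `beta` and mean `mu` (so shape 1 is Poisson probing), and the probe stream is independent of the observed process. The sum of k such separations is Gamma with shape k·beta, which gives E[R(S_k)] = (1 + a·mu/beta)^(−k·beta) in closed form; the periodic case uses R(k·mu) directly.

```python
import math

def sample_mean_variance(n, mu, a, beta=None):
    """Variance of the sample mean of n probes for covariance R(tau) = exp(-a*tau).

    mu:   mean inter-probe separation
    beta: Gamma shape of the separations (1.0 = Poisson); None means periodic probing
    """
    total = 1.0  # R(0)
    for k in range(1, n):
        if beta is None:
            expected_r = math.exp(-a * k * mu)                       # R(k*mu)
        else:
            expected_r = (1.0 + a * mu / beta) ** (-k * beta)        # E[R(S_k)]
        total += 2.0 * (1.0 - k / n) * expected_r
    return total / n
```

Evaluating this for increasing `beta` at fixed `mu` shows the variance decreasing toward the periodic value, consistent with the bullets above for this particular convex R.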
TEST WITH REAL DATA FOR MEAN DELAY
EXAMPLE WITH REAL DATA
EXAMPLE WITH REAL DATA
CONCLUSION TO A MORNING
• Active measurement continues to develop!
• Grid applications will no doubt lead to interesting new problems
Merci de votre attention