Fault and Performance Management for Next Generation IP Communication
Alan Clark, Telchemy


Dec 16, 2015


Jasper Fleming
Transcript
Page 1

Fault and Performance Management for Next Generation IP Communication

Alan Clark, Telchemy

Page 2

Outline

• Problems affecting VoIP performance
• Tools for Measuring and Diagnosing Problems
• Protocols for Reporting QoS
• Performance Management Architecture
• What to ask for/integrate?

Page 3

Enterprise VoIP Deployment

[Diagram labels: Branch Office, Teleworker, IP Phones, IP VPN, Gateway]

Page 4

VoIP Deployment - Issues

[Diagram: IP phones, IP VPN and gateway, annotated with problem sources]

• Echo
• Access link congestion
• LAN congestion, duplex mismatch, long cables….
• Route flapping, link fail
• Codec distortion

Page 5

Call Quality Problems

• Packet Loss
• Jitter (Packet Delay Variation)
• Codecs and PLC
• Delay (Latency)
• Echo
• Signal Level
• Noise Level

Page 6

Packet Loss and Jitter

[Diagram labels: Codec, IP Network, Jitter Buffer. Packets lost in network and packets discarded due to jitter both result in distorted speech.]

Page 7

Routers, Loss and Jitter

[Diagram of a router: arriving packets -> input queue (queuing delay) -> prioritize/route (processing delay) -> output queue (queuing delay, serialization delay)]

• Voice packet delayed by one or more data packets
• Packet loss due to buffer overflow or RED

Page 8

Queuing Delays

[Chart: max delay (mS, 0 to 200) versus transmission speed (kbits/s, 0 to 2000), with curves for 1 x, 2 x and 3 x 1500 byte MTU]

Added delay due to wait for data packets to be sent = Jitter
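The curves on this chart follow from simple serialization arithmetic. The sketch below is only an illustration of that arithmetic: the function name and the sample link speeds are assumptions, not values taken from the slide.

```python
def max_queuing_delay_ms(link_kbps, packets_ahead=1, mtu_bytes=1500):
    """Time in ms to serialize `packets_ahead` full-size data packets
    onto a link of `link_kbps` kbit/s before a voice packet can be sent."""
    bits = packets_ahead * mtu_bytes * 8
    # bits divided by kbit/s gives milliseconds directly
    return bits / link_kbps


# Illustrative link speeds (assumed): 128 kbit/s, 512 kbit/s, 1544 kbit/s
for kbps in (128, 512, 1544):
    delays = [round(max_queuing_delay_ms(kbps, n), 1) for n in (1, 2, 3)]
    print(kbps, "kbit/s ->", delays, "ms for 1, 2, 3 packets ahead")
```

On a slow access link a single 1500 byte data packet already adds tens of milliseconds, which is why the delay falls so steeply as transmission speed rises.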

Page 9

Jitter

[Chart: delay (mS, 50 to 150) versus time (0 to 2 seconds)]

Average jitter level (PPDV) = 4.5mS
Peak jitter level = 60mS
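A minimal sketch of how an average and a peak jitter figure could be derived from per-packet delays. It assumes PPDV means the mean absolute packet-to-packet delay difference and takes "peak" as the spread between the largest and smallest delay; the slide does not define either, and the sample delays are invented.

```python
def jitter_stats(delays_ms):
    """Return (mean packet-to-packet delay variation, peak delay spread) in ms."""
    if len(delays_ms) < 2:
        raise ValueError("need at least two delay samples")
    # Absolute delay change between each pair of consecutive packets
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    mean_ppdv = sum(diffs) / len(diffs)
    # Assumed definition of peak: largest delay minus smallest delay
    peak = max(delays_ms) - min(delays_ms)
    return mean_ppdv, peak


# Invented sample: mostly steady delay with one large spike
print(jitter_stats([50, 52, 51, 110, 60]))
```

The two numbers can differ widely, as on the slide: a short spike barely moves the average but dominates the peak.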

Page 10

WiFi can also cause jitter

[Chart: delay (mS) and RSSI versus time, vertical axis 0 to 300. Series: RSSI, Delay]

Page 11

Effects of Jitter

• Low levels of jitter absorbed by jitter buffer
• High levels of jitter
o lead to packets being discarded
o cause adaptive jitter buffer to grow - increasing delay but reducing discards
• If packets are discarded by the jitter buffer as they arrive too late they are regarded as “discarded”
• If packets arrive extremely late they are regarded as “lost” hence sometimes “lost” packets actually did arrive
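The played/discarded/lost distinction above can be sketched for a fixed jitter buffer as follows. The thresholds `buffer_ms` and `lost_after_ms` are hypothetical values chosen for illustration; real endpoints pick their own limits and adaptive buffers change them during a call.

```python
def classify_packet(arrival_delay_ms, buffer_ms=60, lost_after_ms=500):
    """Classify one packet by how late it arrived relative to its expected time.

    arrival_delay_ms is None for a packet that never arrived.
    buffer_ms and lost_after_ms are hypothetical thresholds.
    """
    if arrival_delay_ms is None or arrival_delay_ms > lost_after_ms:
        # Never arrived, or so late the receiver has already given up on it
        return "lost"
    if arrival_delay_ms > buffer_ms:
        # Arrived, but after its playout time had passed
        return "discarded"
    return "played"


for delay in (10, 75, 900, None):
    print(delay, "->", classify_packet(delay))
```

The 900 ms case shows the point of the last bullet: the packet did arrive, yet it is counted as lost.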

Page 12

Packet Loss

[Chart: 500mS average packet loss rate (0 to 50) versus time (30 to 70 seconds)]

Average packet loss rate = 2.1%
Peak packet loss = 30%

Page 13

Packet Loss is bursty

• Packet loss (and packet discard) tends to occur in sparse bursts - say 20-30% in density and one second or so in length
• Terminology
o Consecutive burst
o Sparse burst
o Burst of Loss vs Loss/Discard
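A simplified sketch of how sparse bursts might be located in a loss/discard trace. It groups lost or discarded packets that are separated by fewer than `gmin` good packets, in the spirit of the burst definition in RFC 3611 (which recommends Gmin = 16). This is not the RFC's exact algorithm, and the sample trace is invented.

```python
def find_bursts(lost, gmin=16):
    """lost: list of booleans, True = packet lost or discarded.

    Returns (start, end) index pairs, inclusive. A burst starts and ends
    with a lost packet and never contains gmin or more consecutive good
    packets. Isolated single losses are not reported as bursts.
    """
    bursts = []
    start = last = None
    for i, is_lost in enumerate(lost):
        if not is_lost:
            continue
        if start is None:
            start = last = i
        elif i - last - 1 < gmin:
            last = i  # still inside the same burst
        else:
            if last > start:
                bursts.append((start, last))
            start = last = i
    if start is not None and last > start:
        bursts.append((start, last))
    return bursts


def burst_density(lost, burst):
    """Fraction of packets lost or discarded within one burst."""
    start, end = burst
    return sum(lost[start:end + 1]) / (end - start + 1)


# Invented trace: one sparse burst, then one isolated loss
trace = ([False] * 20
         + [True, False, False, False, True, False, False, True]
         + [False] * 20 + [True] + [False] * 20)
for b in find_bursts(trace):
    print(b, burst_density(trace, b))
```

A consecutive burst is the special case where the density is 100 percent.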

Page 14

Example Packet Loss Distribution

[Chart: burst weight (packets, 0 to 200) versus burst length (packets, 0 to 500). Annotations: consecutive loss; 20 percent burst density (sparse burst)]

Page 15

Loss and Discard

• Loss is often associated with periods of high congestion

• Jitter is due to congestion (usually) and leads to packet discard

• Hence Loss and Discard often coincide

• Other factors can apply - e.g. duplex mismatch, link failures etc.

Page 16

Example Loss/Discard Distribution

[Chart: burst weight (packets, 0 to 200) versus burst length (packets, 0 to 500)]

Page 17

Leads To Time Varying Call Quality

[Two charts against time (0 to 18): MOS (1 to 5), and bandwidth (kbit/s, 0 to 500) for voice and data. Annotation: high jitter/loss/discard]

Page 18

Packet Loss Concealment

• Mitigates impact of packet loss/discard by replacing lost speech segments
• Very effective for isolated lost packets, less effective for bursty loss/discard
• But isn’t loss/discard bursty?
• Need to be able to deal with 10-20-30% loss!!!

[Figure label: Estimated by PLC]

Page 19

Effectiveness of PLC

[Chart: ACR MOS (1 to 5) versus packet loss/discard rate (0 to 20). Series: G.711 no PLC, G.711 PLC, G.729A. Annotations: codec distortion; impact of loss/discard and PLC]

Page 20

Call Quality Problems

• Packet Loss
• Jitter (Packet Delay Variation)
• Codecs and PLC
• Delay (Latency)
• Echo
• Signal Level
• Noise Level

Page 21

Effect of Delay on Conversational Quality

[Chart: MOS score (1 to 5) versus round trip delay (0 to 600 milliseconds), with curves for 55dB Echo Return Loss and 35dB Echo Return Loss]

Page 22

Causes of Delay

[Diagram: two endpoints, each with CODEC, Echo Control, RTP, UDP/TCP and IP layers]

• External delay
• Accumulate and encode
• Network delay
• Jitter buffer, decode and playout

Page 23

Cause of Echo

[Diagram labels: IP, Echo Canceller, Gateway, Line Echo, Acoustic Echo]

Round trip delay - typically 50mS+

Additional delay introduced by VoIP makes existing echo problems more obvious

Also - “convergence” echo

Page 24

Echo problems

• Echo with very low delay sounds like “sidetone”
• Echo with some delay makes the line sound hollow
• Echo with over 50mS delay sounds like…. Echo
• Echo Return Loss
o 55dB or above is good
o 25dB or below is bad

Page 25

Call Quality Problems

• Packet Loss
• Jitter (Packet Delay Variation)
• Codecs and PLC
• Delay (Latency)
• Echo
• Signal Level
• Noise Level

Page 26

Signal Level Problems

Temporal Clipping occurs with VAD or Echo Suppressors -- gaps in speech, start/end of words missing

Amplitude Clipping occurs -- speech sounds loud and “buzzy”

0 dBm0

-36 dBm0

Page 27

Noise

• Noise can be due to
o Low signal level
o Equipment/encoding (e.g. quantization noise)
o External local loops
o Environmental (room) noise
• From a service provider perspective - how to distinguish between
o room noise (not my problem)
o Network/equipment/circuit noise (is my problem)

Page 28

Measuring VoIP performance

                                   VoIP Specific        Analog signal based
Active Test - Measure test calls   VQmon, ITU G.107     ITU P.862 (PESQ)
Passive Test - Measure live calls  VQmon, ITU P.VTQ     ITU P.563

Page 29

“Gold Standard” - ACR Test

• Speech material
o Phonetically balanced speech samples 8-10 seconds in length
o Test designed to eliminate bias (e.g. presentation order different for each listener)
o Known files included as anchors (e.g. MNRU)
• Listening conditions
o Panel of listeners
o Controlled conditions (quiet environment with known level of background noise)

Page 30

Example ACR test results

• Extract from an ITU subjective test
• Mean Opinion Score (MOS) was 2.4
• 1=Unacceptable
• 2=Poor
• 3=Fair
• 4=Good
• 5=Excellent

[Chart: number of votes (0 to 50) for each opinion score (1 to 5)]
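A MOS is simply the vote-weighted mean of the opinion scores. The sketch below shows the arithmetic; the vote counts are invented for illustration and are not the data from the ITU test on the slide.

```python
def mean_opinion_score(votes):
    """votes: dict mapping opinion score (1 to 5) to number of listeners."""
    total = sum(votes.values())
    if total == 0:
        raise ValueError("no votes")
    return sum(score * count for score, count in votes.items()) / total


# Invented vote counts, not the slide's data
example_votes = {1: 10, 2: 45, 3: 25, 4: 15, 5: 5}
print(mean_opinion_score(example_votes))
```

Two very different vote distributions can share the same mean, which is one reason the histogram is shown alongside the score.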

Page 31

Packet based approaches

[Diagram: two VoIP test systems measure a test call across the IP network; passive test functions measure a live call between two VoIP end systems across the IP network. Algorithms: VQmon, G.107, P.VTQ]

Page 32

Packet based approaches

• ITU G.107: R = Ro - Is - Ie - Id + A
o Really a network planning tool
o Missing many essential monitoring features
• VQmon
o ITU G.107 + ETSI TS 101 329-5 Annex E +…….
o Proprietary but widely used (Superset of G.107 & P.VTQ)
• ITU P.VTQ
o Available late 2005, very limited functionality
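For reference, a sketch of the G.107 rating sum and the standard G.107 conversion from an R factor to an estimated MOS. The function names and the example inputs are assumptions for illustration; the slide does not give numeric impairment values.

```python
def r_factor(ro, i_s, i_e, i_d, a=0.0):
    """E-model rating as written on the slide: R = Ro - Is - Ie - Id + A."""
    return ro - i_s - i_e - i_d + a


def r_to_mos(r):
    """ITU-T G.107 mapping from R factor to estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


# Illustrative R values only
for r in (50, 70, 80, 93.2):
    print(r, round(r_to_mos(r), 2))
```

The mapping is S-shaped, so a fixed drop in R costs more MOS in the middle of the scale than near either end.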

Page 33

Extended E Model - VQmon

[Diagram: arriving packets pass through the jitter buffer (some discarded) to the CODEC. Loss/discard events feed a 4 State Markov Model, which gathers detailed packet loss info in real time. Metrics calculation also takes signal level, noise level and echo level, and outputs call quality scores and diagnostic data]

Page 34

Modeling transient effects

[Chart: measured call quality and user reported call quality versus time (10 to 35 seconds). Labels: Ie(gap), Ie(burst), Ie(VQmon)]

Page 35

VQmon - computational model

[Block diagram labels: burst loss rate, gap loss rate; Ie mapping; perceptual model; recency model; signal level, noise level; calculate Ro, Is; echo, delay; calculate Id; calculate R-LQ, MOS-LQ; calculate R-CQ, MOS-CQ. Standards referenced: ETSI TS 101 329-5, ITU-T G.107]

Page 36

Accuracy: Non-bursty conditions

[Chart: Comparison of VQmon vs ACR MOS - iLBC 15.2k. MOS score (1 to 5) versus packet loss rate (0 to 20%). Series: ACR MOS, VQmon MOS-LQ]

[Chart: Comparison of VQmon vs PESQ - iLBC 15.2k. PESQ score (1 to 4) versus packet loss rate (0 to 30%). Series: PESQ, VQmon MOS-PQ]

Page 37

Accuracy: Bursty conditions

• G.107
o Well established model for network planning
o No way to represent jitter
o Few codec models
o Inaccurate for bursty loss
o Conversational Quality only
• VQmon
o Extended G.107
o Transient impairment model
o Wide range of codec models
o Narrow & Wideband
o Jitter Buffer Emulator
o Listening and Conversational Quality

[Chart: Comparison of VQmon and E Model for severely time varying conditions. Estimated MOS (1.5 to 4) versus ACR MOS (1.5 to 4). Series: VQmon, E Model]

Page 38

Signal based approaches

[Diagram: a P.862 tester measures a test call between two VoIP end systems across the IP network; a P.563 tester monitors a call between two VoIP end systems across the IP network]

P.862 is an Active Test Approach

P.563 is a Passive Test Approach

Page 39

ITU P.862 - Active testing

[Diagram: PESQ applied to the tested segment of connection across the IP network. Audio files -> time align -> FFT… on each signal -> compare -> PESQ score]

Page 40

ITU P.862 - Active testing

• Send speech file

• Compare received file with original using FFT

• Takes typically 50-100 MIPS per call

• MOS-like score in the range -0.5 to 4.5

• Widely used within the industry

[Chart: PESQ scores (1 to 4) versus packet loss rate (0 to 40). Results for G.729A codec for a set of speech files (i.e. for each packet loss rate the only thing changed is the speech source file)]

Page 41

ITU P.563 - Passive monitoring

• Analyses received speech file (single ended)

• Produces a MOS score

• Correlates well with MOS when averaged over many calls

• Requires 100MIPS per call

[Chart: ACR MOS (1 to 5) versus P.563 score (1 to 5). Comparison of P.563 estimated MOS scores with actual ACR test scores. Each point is average per file ACR MOS with 16 listeners compared to P.563 score]

Page 42

Performance Monitoring - Passive Test

[Diagram labels: Embedded Monitoring Function, RTCP XR, SIP QoS Report]

Page 43

SLA Monitoring - Active Test

Active Test Functions

Test call

Page 44

Active or Passive Testing?

• Active testing
o works for pre-deployment testing and on-demand troubleshooting
• But!!!!
o IP problems are transient
• Passive monitoring
o Monitors every call made - but needs a call to monitor
o Captures information on transient problems
o Provides data for post-analysis
• Therefore - you need both

Page 45

VoIP Performance Management Framework

[Diagram labels: VoIP Endpoint and VoIP Gateway, each with Embedded Monitoring (VQ); Network Probe, Analyzer or Router (VQ); RTP stream (possibly encrypted); Media Path Reporting (RTCP XR); Signaling Based QoS Reporting; Call Server and CDR database; SNMP Reporting; Network Management System]

Page 46

VoIP Performance Management Framework

• Embedded monitoring function in IP phones, residential gateways….
o Close to the user
o Least cost + widest coverage
• Protocol support developed
o RTCP XR (RFC3611), SIP, MGCP, H.323, Megaco
o Draft SNMP MIB
• Works in encrypted environments
• Already being deployed by equipment vendors

Page 47

The role of RTCP XR

RTCP XR (RFC3611)

1. Provides a useful set of metrics for VoIP performance monitoring and diagnosis

2. Supports both real time monitoring and post-analysis

3. Extracts signal level, noise level and echo level from DSP software in the endpoint

4. Exchanges info on endpoint delay and echo to allow remote endpoint to assess echo impact

5. Provides midstream probes/ analyzers access to analog metrics if secure RTP is used

6. Goes through firewalls………

Page 48

RFC3611 - RTCP XR

Loss Rate Discard Rate Burst Density Gap Density

Burst Duration (mS) Gap Duration (mS)

Round Trip Delay (mS) End System Delay (mS)

Signal level RERL Noise Level Gmin

R Factor Ext R MOS-LQ MOS-CQ

Rx Config - Jitter Buffer Nominal

Jitter Buffer Max Jitter Buffer Abs Max

Page 49

SIP Service Quality Reporting Event

PUBLISH sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP pc22.example.com;branch=z9hG4bK3343d7
………
Content-Type: application/rtcpxr
Content-Length: ...

VQSessionReport
LocalMetrics:
TimeStamps=START:10012004.18.23.43 STOP:10012004.18.26.02
SessionDesc=PT:0 PD:G.711 SR:8000 FD:20 FPP:2 PLC:3 SSUP:on
[email protected] ………
Signal=SL:2 NL:10 RERL:14
QualityEst=RLQ:90 RCQ:85 EXTR:90 MOSLQ:3.4 MOSCQ:3.3 QoEEstAlg:VQMonv2.1
DialogID:38419823470834;to-tag=8472761;from-tag=9123dh311
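The metric lines in this report follow a `Name=KEY:value KEY:value` pattern, which a collector could split as sketched below. The parser handles only that pattern as it appears on the slide; it is not an implementation of any published report syntax, and the function name is hypothetical.

```python
def parse_metric_line(line):
    """Split 'Name=K1:v1 K2:v2' into (name, {key: value}).

    Values are kept as strings; the caller decides how to convert them.
    """
    name, sep, rest = line.strip().partition("=")
    if not sep:
        raise ValueError("expected a 'Name=...' metric line")
    fields = {}
    for token in rest.split():
        # Split on the first colon only, so values may contain dots
        key, _, value = token.partition(":")
        fields[key] = value
    return name, fields


name, fields = parse_metric_line(
    "QualityEst=RLQ:90 RCQ:85 EXTR:90 MOSLQ:3.4 MOSCQ:3.3")
print(name, float(fields["MOSLQ"]), int(fields["RLQ"]))
```

Keeping values as strings avoids guessing at types for fields such as the timestamps, which are not plain numbers.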

Page 50

RTCP XR MIB

Session table

Basic parameters

Call quality metrics

History table

Alerting

Page 51

Passive Monitoring Framework

[Diagram: Branch Office, Teleworker, IP Phones, IP VPN and Gateway, with VQ monitoring points throughout and an NMS. Reporting paths: RTCP XR, SIP QoS Report, SNMP]

Page 52

What to Implement/ Ask For

• Embedded monitoring functionality in IP Phones and Gateways (e.g. VQmon)

• RTCP XR for mid-call data exchange between endpoints

• SIP Service Quality Events for reporting end of call quality

• RTCP XR MIB for SNMP support

Page 53

Summary

• Problems affecting VoIP performance
• Tools for Measuring and Diagnosing Problems
• Protocols for Reporting QoS
• Performance Management Architecture
• What to ask for/integrate?