Fault and Performance Management for Next Generation IP Communication
Alan Clark, Telchemy
Dec 16, 2015
Outline
• Problems affecting VoIP performance
• Tools for Measuring and Diagnosing Problems
• Protocols for Reporting QoS
• Performance Management Architecture
• What to ask for/integrate?
Enterprise VoIP Deployment
[Diagram: branch office IP phone, teleworker IP phone, IP phones and a gateway, interconnected by an IP VPN]
VoIP Deployment - Issues
[Diagram: the same deployment (IP phones, gateway, IP VPN) annotated with problem sources]
• Echo (gateway)
• Access link congestion
• LAN congestion, duplex mismatch, long cables...
• Route flapping, link failure
• Codec distortion
Call Quality Problems
• Packet Loss
• Jitter (Packet Delay Variation)
• Codecs and PLC
• Delay (Latency)
• Echo
• Signal Level
• Noise Level
Packet Loss and Jitter
[Diagram: IP network, jitter buffer, codec. Packets lost in the network and packets discarded due to jitter both result in distorted speech.]
Routers, Loss and Jitter
[Diagram: arriving packets pass through an input queue (queuing delay), a prioritize/route stage (processing delay) and an output queue (queuing delay, serialization delay)]
• Voice packet delayed by one or more data packets
• Packet loss due to buffer overflow or RED
Queuing Delays
[Chart: maximum delay (ms, 0-200) versus transmission speed (0-2000 kbit/s), with curves for 1, 2 and 3 x 1500 byte MTU]
Added delay due to waiting for data packets to be sent = jitter
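The curves on this slide follow from simple serialization arithmetic: a voice packet that arrives behind N full-size data packets waits for all of them to be clocked onto the link. The sketch below is a minimal illustration of that calculation; the function names and the example link speeds are assumptions chosen for illustration, not values from the slides.

```python
def serialization_delay_ms(packet_bytes: int, link_kbps: float) -> float:
    """Time to clock one packet onto a link, in milliseconds.

    bits / (kbit/s) gives milliseconds directly.
    """
    return packet_bytes * 8 / link_kbps


def max_added_delay_ms(queued_packets: int, link_kbps: float,
                       mtu_bytes: int = 1500) -> float:
    """Worst-case wait behind `queued_packets` full-size data packets."""
    return queued_packets * serialization_delay_ms(mtu_bytes, link_kbps)


if __name__ == "__main__":
    # Example link speeds (kbit/s) are illustrative assumptions.
    for kbps in (128, 512, 1544):
        delays = [round(max_added_delay_ms(n, kbps), 1) for n in (1, 2, 3)]
        print(f"{kbps} kbit/s: {delays} ms behind 1, 2, 3 MTU-size packets")
```

On slow access links the added delay is tens of milliseconds per queued data packet, which is why access link congestion shows up as jitter.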
Jitter
[Chart: delay (ms, 50-150) versus time (0-2 seconds)]
Average jitter level (PPDV) = 4.5 ms; peak jitter level = 60 ms
WiFi can also cause jitter
[Chart: delay (ms) and RSSI versus time]
Effects of Jitter
• Low levels of jitter are absorbed by the jitter buffer
• High levels of jitter
o lead to packets being discarded
o cause an adaptive jitter buffer to grow, increasing delay but reducing discards
• If packets are discarded by the jitter buffer as they arrive too late they are regarded as “discarded”
• If packets arrive extremely late they are regarded as “lost” hence sometimes “lost” packets actually did arrive
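The distinction between "discarded" and "lost" packets can be sketched as a simple classification by how late each packet is relative to the fastest one. This is a minimal illustration only: the fixed buffer depth, the lateness threshold for "lost", and the function name are all assumptions, and a real adaptive jitter buffer is considerably more involved.

```python
def classify_packets(delays_ms, buffer_ms=40.0, lost_after_ms=500.0):
    """Classify packets as played, discarded or lost.

    delays_ms: one-way delay per packet in ms, or None if it never arrived.
    buffer_ms: assumed fixed jitter buffer depth (hypothetical value).
    lost_after_ms: assumed lateness beyond which a receiver gives up and
        counts the packet as lost even though it did arrive.
    """
    counts = {"played": 0, "discarded": 0, "lost": 0}
    arrived = [d for d in delays_ms if d is not None]
    if not arrived:
        counts["lost"] = len(delays_ms)
        return counts
    base = min(arrived)  # delay of the fastest packet
    for d in delays_ms:
        if d is None or d - base > lost_after_ms:
            counts["lost"] += 1       # never arrived, or extremely late
        elif d - base > buffer_ms:
            counts["discarded"] += 1  # arrived, but too late to play out
        else:
            counts["played"] += 1
    return counts


if __name__ == "__main__":
    print(classify_packets([50, 55, 120, None, 700, 60]))
```

Note that the packet with 700 ms delay is counted as lost although it did arrive, which is the effect described above.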
Packet Loss
[Chart: 500 ms average packet loss rate (0-50) versus time (30-70 seconds)]
Average packet loss rate = 2.1%; peak packet loss = 30%
Packet Loss is bursty
• Packet loss (and packet discard) tends to occur in sparse bursts - say 20-30% in density and one second or so in length
• Terminology
o Consecutive burst
o Sparse burst
o Burst of Loss vs Loss/Discard
[Chart: Example Packet Loss Distribution. Burst weight (packets, 0-200) versus burst length (packets, 0-500), with the consecutive loss line and the 20 percent burst density (sparse burst) line marked]
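Burst length and burst density can be computed from a per-packet loss record. The sketch below is a simplified illustration in the spirit of the burst/gap definition used by RFC 3611 (a burst starts and ends with a lost or discarded packet and contains fewer than Gmin consecutive received packets); it is not the reference algorithm, and the function name and output layout are assumptions.

```python
def find_bursts(lost, gmin=16):
    """Find sparse bursts in a sequence of per-packet loss flags.

    lost: sequence of booleans, True if the packet was lost or discarded.
    gmin: losses separated by fewer than `gmin` received packets are
        treated as part of the same burst (16 is the RFC 3611 default).
    Isolated single losses are treated as gap losses and not reported.
    """
    groups = []
    for i, is_lost in enumerate(lost):
        if not is_lost:
            continue
        if groups and i - groups[-1][-1] - 1 < gmin:
            groups[-1].append(i)   # close enough: same burst
        else:
            groups.append([i])     # start a new candidate burst
    bursts = []
    for g in groups:
        if len(g) < 2:
            continue               # isolated loss, belongs to the gap
        length = g[-1] - g[0] + 1  # burst length in packets
        bursts.append({"start": g[0], "length": length,
                       "weight": len(g), "density": len(g) / length})
    return bursts


if __name__ == "__main__":
    pattern = ([False] * 20
               + [True, False, False, False, True,
                  False, False, False, False, True]
               + [False] * 20 + [True] + [False] * 20)
    print(find_bursts(pattern))
```

In the usage example three losses spread over ten packets form one sparse burst of 30% density, while the later single loss is treated as an isolated gap loss.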
Loss and Discard
• Loss is often associated with periods of high congestion
• Jitter is due to congestion (usually) and leads to packet discard
• Hence Loss and Discard often coincide
• Other factors can apply - e.g. duplex mismatch, link failures etc.
Example Loss/Discard Distribution
[Chart: burst weight (packets, 0-200) versus burst length (packets, 0-500)]
Leads To Time Varying Call Quality
[Chart: MOS (1-5) versus time, shown above voice and data bandwidth (0-500 kbit/s) over the same period, with periods of high jitter/loss/discard marked]
Packet Loss Concealment
• Mitigates impact of packet loss/ discard by replacing lost speech segments
• Very effective for isolated lost packets, less effective for bursty loss/discard
• But isn’t loss/discard bursty?
• Need to be able to deal with 10-20-30% loss!
[Diagram: speech waveform with the lost segment estimated by PLC]
Effectiveness of PLC
[Chart: ACR MOS (1-5) versus packet loss/discard rate (0-20) for G.711 without PLC, G.711 with PLC and G.729A. Annotations: codec distortion; impact of loss/discard and PLC]
Effect of Delay on Conversational Quality
[Chart: MOS score (1-5) versus round trip delay (0-600 milliseconds), with curves for 55 dB and 35 dB Echo Return Loss]
Causes of Delay
[Diagram: two endpoints, each with codec and echo control above RTP, UDP/TCP and IP. Delay components along the path: external delay; accumulate and encode; network delay; jitter buffer, decode and playout]
Cause of Echo
[Diagram: IP network, gateway with echo canceller, line echo and acoustic echo. Round trip delay is typically 50 ms or more]
Additional delay introduced by VoIP makes existing echo problems more obvious
Also: “convergence” echo
Echo problems
• Echo with very low delay sounds like “sidetone”
• Echo with some delay makes the line sound hollow
• Echo with over 50 ms delay sounds like... echo
• Echo Return Loss
o 55 dB or above is good
o 25 dB or below is bad
Signal Level Problems
Temporal Clipping occurs with VAD or Echo Suppressors -- gaps in speech, start/end of words missing
Amplitude Clipping occurs -- speech sounds loud and “buzzy”
[Diagram: signal level scale from 0 dBm0 to -36 dBm0]
Noise
• Noise can be due to
o Low signal level
o Equipment/encoding (e.g. quantization noise)
o External local loops
o Environmental (room) noise
• From a service provider perspective: how to distinguish between
o Room noise (not my problem)
o Network/equipment/circuit noise (is my problem)
Measuring VoIP performance
                                   VoIP specific       Analog signal based
Active Test (measure test calls)   VQmon, ITU G.107    ITU P.862 (PESQ)
Passive Test (measure live calls)  VQmon, ITU P.VTQ    ITU P.563
“Gold Standard” - ACR Test
• Speech material
o Phonetically balanced speech samples 8-10 seconds in length
o Test designed to eliminate bias (e.g. presentation order different for each listener)
o Known files included as anchors (e.g. MNRU)
• Listening conditions
o Panel of listeners
o Controlled conditions (quiet environment with known level of background noise)
Example ACR test results
• Extract from an ITU subjective test
• Mean Opinion Score (MOS) was 2.4
• 1=Unacceptable
• 2=Poor
• 3=Fair
• 4=Good
• 5=Excellent
[Chart: number of votes (0-50) for each opinion score (1-5)]
Packet based approaches
[Diagram: active test: two VoIP test systems exchange a test call across the IP network. Passive test: passive test functions measure a live call between two VoIP end systems]
Algorithms: VQmon, G.107, P.VTQ
Packet based approaches
• ITU G.107: R = Ro - Is - Ie - Id + A
o Really a network planning tool
o Missing many essential monitoring features
• VQmon
o ITU G.107 + ETSI TS 101 329-5 Annex E +…….
o Proprietary but widely used (superset of G.107 & P.VTQ)
• ITU P.VTQ
o Available late 2005, very limited functionality
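As a rough illustration of the G.107 relationship above, the sketch below combines the impairment terms into an R factor and maps it to an estimated MOS using the standard E-model conversion curve. The function names are hypothetical and the impairment values in the usage example are illustrative assumptions, not figures from the slides.

```python
def r_factor(ro: float, i_s: float, i_e: float, i_d: float,
             a: float = 0.0) -> float:
    """R = Ro - Is - Ie - Id + A, as on the slide."""
    return ro - i_s - i_e - i_d + a


def r_to_mos(r: float) -> float:
    """Map an R factor to an estimated MOS (E-model conversion curve)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


if __name__ == "__main__":
    # Illustrative impairment values only.
    r = r_factor(ro=93.0, i_s=1.0, i_e=11.0, i_d=5.0)
    print(f"R = {r:.1f}, estimated MOS = {r_to_mos(r):.2f}")
```

The conversion is non-linear: a few points of R lost near the top of the scale change the MOS very little, while the same loss in the middle of the scale is clearly visible.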
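Extended E Model - VQmon
[Diagram: arriving packets pass through the jitter buffer (late packets discarded) to the codec. Loss/discard events feed a 4 state Markov model that gathers detailed packet loss info in real time. Together with signal level, noise level and echo level, this drives the metrics calculation, which outputs call quality scores and diagnostic data]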
Modeling transient effects
[Chart: measured call quality and user reported call quality versus time (10-35 seconds), showing Ie(gap), Ie(burst) and Ie(VQmon)]
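VQmon - computational model
[Diagram: burst loss rate and gap loss rate feed the Ie mapping and perceptual model, which calculate R-LQ and MOS-LQ. Signal level and noise level are used to calculate Ro and Is; echo and delay are used to calculate Id. With the recency model these calculate R-CQ and MOS-CQ. The diagram labels parts of the model as ETSI TS 101 329-5 and ITU-T G.107]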
Accuracy: Non-bursty conditions
[Chart: Comparison of VQmon vs ACR MOS - iLBC 15.2k. MOS score (1-5) versus packet loss rate (0-20%); series: ACR MOS, VQmon MOS-LQ]
[Chart: Comparison of VQmon vs PESQ - iLBC 15.2k. PESQ score (1-4) versus packet loss rate (0-30%); series: PESQ, VQmon MOS-PQ]
[Chart: estimated MOS (1.5-4) versus ACR MOS (1.5-4)]
Accuracy: Bursty conditions
• G.107
o Well established model for network planning
o No way to represent jitter
o Few codec models
o Inaccurate for bursty loss
o Conversational Quality only
• VQmon
o Extended G.107
o Transient impairment model
o Wide range of codec models
o Narrow & Wideband
o Jitter Buffer Emulator
o Listening and Conversational Quality
[Chart: Comparison of VQmon and E Model for severely time varying conditions]
Signal based approaches
[Diagram: a P.862 tester runs a test call between VoIP end systems across the IP network; a P.563 tester monitors a call between VoIP end systems across the IP network]
P.862 is an Active Test Approach
P.563 is a Passive Test Approach
ITU P.862 - Active testing
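[Diagram: PESQ. Audio files are sent across the tested segment of the IP connection; the original and received signals are time aligned, transformed (FFT) and compared to produce a PESQ score]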
ITU P.862 - Active testing
• Send speech file
• Compare received file with original using FFT
• Takes typically 50-100 MIPS per call
• MOS-like score in the range -0.5 to 4.5
• Widely used within the industry
[Chart: PESQ scores (1-4) versus packet loss rate (0-40)]
Results for the G.729A codec for a set of speech files (i.e. for each packet loss rate the only thing changed is the speech source file)
ITU P.563 - Passive monitoring
• Analyses received speech file (single ended)
• Produces a MOS score
• Correlates well with MOS when averaged over many calls
• Requires 100MIPS per call
[Chart: ACR MOS (1-5) versus P.563 score (1-5)]
Comparison of P.563 estimated MOS scores with actual ACR test scores. Each point is the average per-file ACR MOS with 16 listeners compared to the P.563 score.
Performance Monitoring - Passive Test
[Diagram: embedded monitoring functions reporting via RTCP XR and SIP QoS Report]
SLA Monitoring - Active Test
[Diagram: active test functions exchanging a test call]
Active or Passive Testing?
• Active testing
o Works for pre-deployment testing and on-demand troubleshooting
• But!
o IP problems are transient
• Passive monitoring
o Monitors every call made, but needs a call to monitor
o Captures information on transient problems
o Provides data for post-analysis
• Therefore - you need both
VoIP Performance Management Framework
[Diagram: a VoIP endpoint and a VoIP gateway, each with embedded monitoring (VQ), exchange an RTP stream (possibly encrypted) with media path reporting (RTCP XR). A network probe, analyzer or router (VQ) sits mid-stream. Signaling based QoS reporting goes to the call server and CDR database; SNMP reporting goes to the network management system]
VoIP Performance Management Framework
• Embedded monitoring function in IP phones, residential gateways...
o Close to the user
o Least cost + widest coverage
• Protocol support developed
o RTCP XR (RFC3611), SIP, MGCP, H.323, Megaco
o Draft SNMP MIB
• Works in encrypted environments
• Already being deployed by equipment vendors
The role of RTCP XR
RTCP XR (RFC3611)
1. Provides a useful set of metrics for VoIP performance monitoring and diagnosis
2. Supports both real time monitoring and post-analysis
3. Extracts signal level, noise level and echo level from DSP software in the endpoint
4. Exchanges info on endpoint delay and echo to allow remote endpoint to assess echo impact
5. Provides midstream probes/ analyzers access to analog metrics if secure RTP is used
6. Goes through firewalls………
RFC3611 - RTCP XR
Loss Rate Discard Rate Burst Density Gap Density
Burst Duration (ms) Gap Duration (ms)
Round Trip Delay (ms) End System Delay (ms)
Signal level RERL Noise Level Gmin
R Factor Ext R MOS-LQ MOS-CQ
Rx Config - Jitter Buffer Nominal
Jitter Buffer Max Jitter Buffer Abs Max
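To show how these fields might be consumed by a management system, the sketch below unpacks an RTCP XR VoIP Metrics report block with Python's `struct` module. It assumes the block layout of RFC 3611 section 4.7 (block type 7, a 4-byte header plus 32 bytes of metrics, network byte order); the field order and scaling reflect one reading of that section and should be checked against the RFC before use. The names `VoipMetrics`, `parse_voip_metrics` and `summarize` are hypothetical.

```python
import struct
from collections import namedtuple

# Assumed layout: header, SSRC, then the metrics in the order of RFC 3611 4.7.
VOIP_METRICS_FMT = "!BBHI4B4HbbBB4BBB3H"

VoipMetrics = namedtuple("VoipMetrics", [
    "block_type", "reserved", "block_length", "ssrc",
    "loss_rate", "discard_rate", "burst_density", "gap_density",
    "burst_duration", "gap_duration", "round_trip_delay", "end_system_delay",
    "signal_level", "noise_level", "rerl", "gmin",
    "r_factor", "ext_r_factor", "mos_lq", "mos_cq",
    "rx_config", "reserved2", "jb_nominal", "jb_maximum", "jb_abs_max",
])


def parse_voip_metrics(block: bytes) -> VoipMetrics:
    """Unpack one VoIP Metrics report block from raw bytes."""
    if len(block) < struct.calcsize(VOIP_METRICS_FMT):
        raise ValueError("block too short")
    m = VoipMetrics._make(struct.unpack_from(VOIP_METRICS_FMT, block))
    if m.block_type != 7:
        raise ValueError("not a VoIP Metrics block")
    return m


def summarize(m: VoipMetrics) -> dict:
    """Convert raw field values to human-friendly units."""
    return {
        "loss_pct": m.loss_rate / 256 * 100,        # 8-bit binary fraction
        "discard_pct": m.discard_rate / 256 * 100,
        "round_trip_delay_ms": m.round_trip_delay,
        "signal_level_dbm0": m.signal_level,        # signed value
        # MOS is carried as MOS x 10; 127 means "unavailable".
        "mos_lq": None if m.mos_lq == 127 else m.mos_lq / 10,
        "mos_cq": None if m.mos_cq == 127 else m.mos_cq / 10,
    }


if __name__ == "__main__":
    raw = struct.pack(VOIP_METRICS_FMT, 7, 0, 8, 0x1234, 5, 2, 64, 1,
                      1000, 20000, 80, 40, -18, -60, 55, 16,
                      85, 127, 41, 39, 0, 0, 60, 100, 200)
    print(summarize(parse_voip_metrics(raw)))
```

The sample values packed in the usage example are made up for illustration.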
SIP Service Quality Reporting Event
PUBLISH sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP pc22.example.com;branch=z9hG4bK3343d7
………
Content-Type: application/rtcpxr
Content-Length: ...

VQSessionReport
LocalMetrics:
TimeStamps=START:10012004.18.23.43 STOP:10012004.18.26.02
SessionDesc=PT:0 PD:G.711 SR:8000 FD:20 FPP:2 PLC:3 SSUP:on
[email protected]
………
Signal=SL:2 NL:10 RERL:14
QualityEst=RLQ:90 RCQ:85 EXTR:90 MOSLQ:3.4 MOSCQ:3.3
QoEEstAlg:VQMonv2.1
DialogID:38419823470834;to-tag=8472761;from-tag=9123dh311
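A collector receiving such a report needs to split each metrics line into its named values. The sketch below is a minimal, hypothetical parser for lines of the form shown in the example (`Name=KEY:value KEY:value ...`); it assumes that shape only, does not implement the full report syntax, and leaves all values as strings.

```python
def parse_metric_line(line: str):
    """Split 'Name=K1:v1 K2:v2' into ('Name', {'K1': 'v1', 'K2': 'v2'})."""
    name, sep, rest = line.strip().partition("=")
    if not sep:
        raise ValueError("expected 'Name=KEY:value ...'")
    values = {}
    for token in rest.split():
        key, colon, value = token.partition(":")
        if not colon:
            raise ValueError(f"expected KEY:value, got {token!r}")
        values[key] = value
    return name, values


if __name__ == "__main__":
    name, values = parse_metric_line(
        "QualityEst=RLQ:90 RCQ:85 EXTR:90 MOSLQ:3.4 MOSCQ:3.3")
    print(name, values)
    print("Listening quality MOS:", float(values["MOSLQ"]))
```

Keeping values as strings lets the caller decide which fields are numeric, since fields such as `PD:G.711` or `SSUP:on` are not.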
RTCP XR MIB
Session table
Basic parameters
Call quality metrics
History table
Alerting
Passive Monitoring Framework
[Diagram: the enterprise deployment (branch office, teleworker, IP phones, gateway, IP VPN) with an embedded VQ monitoring function in each device, reporting to the NMS via RTCP XR, SIP QoS Report and SNMP]
What to Implement/ Ask For
• Embedded monitoring functionality in IP Phones and Gateways (e.g. VQmon)
• RTCP XR for mid-call data exchange between endpoints
• SIP Service Quality Events for reporting end of call quality
• RTCP XR MIB for SNMP support
Summary
• Problems affecting VoIP performance
• Tools for Measuring and Diagnosing Problems
• Protocols for Reporting QoS
• Performance Management Architecture
• What to ask for/integrate?