Demystifying SONET/OTN Service SLAs … and how to guarantee them over MPLS
Christian Schmutzer, Principal Engineer
Feb 2019, NANOG 75, San Francisco
• Initially DS1 or DS3 SONET/PDH circuits (aka leased lines)
• Over time this became predominantly FE or 1GE point-to-point connections mapped over PDH/SONET
• Reasons for using Ethernet over SONET
  • Geographic reach, committed bandwidth, transparency to L2 protocols
• Lately it means a 10GE point-to-point connection delivered via
  • 10G DWDM transponder or 100G DWDM muxponder
  • ODU2 circuit across an OTN switched network
• Reasons for using OTN are similar to the ones for using SONET
Defining “Private Line Services”
• Low cost per bit
• Assuring transport service SLAs
• Predictable path, guaranteed bandwidth & 50 msec protection
• Low latency
• Fault detection and notification
• Performance monitoring
• Transparency
• Simple operations
Common Private Line Service Requirements
Note: MEF 6.1 already defines most of this for Ethernet services
• Legacy TDM SONET chipsets do not scale in bandwidth or power efficiency.
• Investment in TDM chipsets has been declining and has shifted toward OTN.
• Adopting OTN limits switching granularity to 1 Gbps (ODU0), which is inefficient for lower-rate services.
• Pure packet chipsets enable superior scale and the lowest power per bit.
• Circuit emulation enables highly scalable and distributed TDM switching over a modernized packet network.
Technology Scale Evolution
[Figure: chipset bandwidth in Gbps (y-axis, 0 to 6,000) for the years 2013, 2015, 2017 and 2019 (x-axis), comparing SONET, OTN and packet chipsets]
Packet Transport providing lower Transport Cost
[Figure: three platform architectures, each showing linecards connected through a fabric]
• OTN platform: 400G linecards built from BRCM Jericho chips plus OTN framers
• Packet only platform: 600G linecards built from BRCM Jericho chips only. Fewer chips provide 50% more capacity!
• New “datacenter style” platforms: 3.6T linecards built from many BRCM Jericho chips. Super dense platforms providing 6x capacity
MPLS – Pseudowires mapped to MPLS LSPs
[Figure: Site 1, Site 2 and Site 3 connected across an MPLS network by MPLS LSPs, with DS1 and OC3 ports at the sites]
• A circuit becomes a pseudowire: a DS1 is carried as a SAToP pseudowire, an STS3c as a CEP pseudowire
• The MPLS LSP size is adjusted by the edge nodes as TDM services get mapped into it
• MPLS midpoints are not aware of TDM services; all they know is how to forward MPLS LSP traffic
LSP … Label Switched Path
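To make "LSP size is adjusted as TDM services get mapped into it" concrete, the following is a minimal sketch of how much link bandwidth one DS1 SAToP pseudowire might consume. The payload size and every overhead figure are assumptions for a typical Ethernet/MPLS core (they are not taken from the slides), and `pw_bandwidth_bps` is a hypothetical helper.

```python
# Sketch: on-the-wire bandwidth of a SAToP pseudowire carrying one DS1.
# Payload size and overhead values are ASSUMPTIONS, not from the slides.

DS1_RATE_BPS = 1_544_000   # DS1 line rate
PAYLOAD_BYTES = 192        # assumed SAToP payload bytes per packet

# Assumed per-packet overhead in bytes
OVERHEAD_BYTES = {
    "control_word": 4,
    "pw_label": 4,
    "lsp_label": 4,
    "ethernet_header": 14,
    "ethernet_fcs": 4,
    "preamble_and_ifg": 20,
}


def pw_bandwidth_bps(tdm_rate_bps: int, payload_bytes: int, overhead_bytes: int) -> float:
    """Bandwidth the pseudowire consumes on a link, including packet overhead."""
    packets_per_second = tdm_rate_bps / (payload_bytes * 8)
    return packets_per_second * (payload_bytes + overhead_bytes) * 8


overhead = sum(OVERHEAD_BYTES.values())
bw = pw_bandwidth_bps(DS1_RATE_BPS, PAYLOAD_BYTES, overhead)
print(f"overhead per packet: {overhead} bytes")
print(f"DS1 SAToP PW bandwidth: {bw / 1e6:.3f} Mbps")
```

Under these assumptions the pseudowire needs roughly a quarter more bandwidth than the DS1 itself; a smaller payload lowers packetization delay but raises the overhead share.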
• Circuit emulation is used
  • across Ethernet access
  • between metro COs
  • MPLS end-to-end
• Common network infrastructure also providing E-Line services
• Light Reading webinar
Verizon Use Case
[Figure: three access/metro combinations (TDM access with TDM metro, 3rd party Ethernet access with MPLS metro, MPLS access with MPLS metro), showing SONET circuits and SONET/PDH pseudowires terminating at the CO]
MPLS LSP Bandwidth and Path Management
[Figure: a service request from A to Z across nodes A, B and Z, handled by network elements or by an NMS and/or SDN controller (network control)]
• Distributed, network element centric: each node checks its own links. “Do I have enough bandwidth on the link to B?” “Do I have enough bandwidth on the link to Z?”
• Centralized, controller centric: the NMS and/or SDN controller receives the service request from A-Z and checks “Do I have enough bandwidth on all links on the path between A-Z?”
• In both models: “Do I have enough bandwidth for all failure cases?”
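The controller-centric question ("enough bandwidth on all links on the path between A-Z?") can be sketched as an all-or-nothing reservation. This is an illustrative model only: the `admit` function, the link dictionaries and the Mbps figures are hypothetical and do not represent any particular controller's API.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # directed link (from_node, to_node)


def admit(path: List[str], demand_mbps: float,
          capacity: Dict[Link, float], reserved: Dict[Link, float]) -> bool:
    """Reserve demand_mbps on every link of the path, or reserve nothing."""
    links = list(zip(path, path[1:]))
    # First pass: check every link before touching any reservation.
    if any(reserved.get(link, 0.0) + demand_mbps > capacity[link] for link in links):
        return False
    # Second pass: commit the reservation on all links.
    for link in links:
        reserved[link] = reserved.get(link, 0.0) + demand_mbps
    return True


# Hypothetical topology A -> B -> Z with 10G links (values in Mbps)
capacity = {("A", "B"): 10_000.0, ("B", "Z"): 10_000.0}
reserved = {("A", "B"): 9_000.0, ("B", "Z"): 2_000.0}

first = admit(["A", "B", "Z"], 622.0, capacity, reserved)   # fits on both links
second = admit(["A", "B", "Z"], 622.0, capacity, reserved)  # link A-B is now too full
print(first, second, reserved)
```

The two-pass structure matters: a request that fails on the last hop must not leave partial reservations behind on earlier hops. The same check would be repeated per failure scenario to answer the "all failure cases" question.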
• Typically two approaches
  • Preferred path (pinning a PW to a single LSP)
  • Autoroute announce (let routing choose the appropriate LSP)
• “Preferred path” provides a strict 1:1 relationship between PWs and LSPs
• “PW CAC” helps manage the required bandwidth
  • The user configures the bandwidth of the PW (accounting for overhead)
  • The router’s L2VPN process keeps track of the PWs mapped onto each LSP and holds down a PW if there is not enough free bandwidth in the LSP
  • Note: the same could also be done by the NMS or SDN controller
Mapping of Pseudowires onto LSPs
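The PW CAC behaviour described on this slide can be sketched as simple bookkeeping per LSP. This is a toy model under stated assumptions: the `Lsp` class, the state names and the kbps values are hypothetical and are not the router's actual L2VPN implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Lsp:
    """Toy model of an LSP with pseudowire call admission control (CAC)."""
    name: str
    bandwidth_kbps: int
    pws: Dict[str, int] = field(default_factory=dict)  # PW name -> configured kbps

    def free_kbps(self) -> int:
        return self.bandwidth_kbps - sum(self.pws.values())

    def add_pw(self, pw_name: str, pw_kbps: int) -> str:
        """Admit the PW if it fits into the LSP, otherwise hold it down."""
        if pw_kbps > self.free_kbps():
            return "held-down"
        self.pws[pw_name] = pw_kbps
        return "up"


# Hypothetical 5 Mbps LSP; each PW is configured with its bandwidth incl. overhead
lsp = Lsp("A-to-Z", bandwidth_kbps=5_000)
states = [lsp.add_pw(f"ds1-{i}", 1_950) for i in range(3)]
print(states, lsp.free_kbps())
```

The third pseudowire is held down because only 1,100 kbps remain free, which is the behaviour the slide attributes to the L2VPN process (or, alternatively, to the NMS or SDN controller).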
• When using autoroute announce, the L2VPN process inside the router can no longer do bandwidth accounting
• A newly added PW will increase the LSP load
  • In case of circuit emulated TDM, immediately (always on)
  • In case of an Ethernet PW, depending on customer traffic load
• The LSP may get rerouted if needed
• The maximum reservable bandwidth of each link in the network should be set to a value <100% to give the LSP control plane enough time to react and to avoid temporary congestion
Autoroute & Auto-Bandwidth
Classic TE-FRR
[Figure: LSP from A to Z, shown before a failure, at the moment of failure (X) and after re-convergence]
1. Bandwidth is reserved on each link along the path
2. Pre-programmed FRR LSP (zero bandwidth)
3. SRLG diverse FRR LSP routing
• On failure: immediate local protection via the FRR LSP
• Afterwards: global re-convergence and activation of a new LSP via make-before-break
Topology Independent Loop-free Alternate (TI-LFA)
[Figure: path from A to Z, shown before a failure, at the moment of failure (X) and after re-convergence]
1. Pre-programmed TI-LFA
2. SRLG diverse TI-LFA routing
• On failure: immediate local protection via TI-LFA
• Afterwards: global re-convergence
• Use cases
  • “What if” analyses
  • Growth planning
  • Network augmentation
• Capabilities
  • Complete network topology model
  • LSP path computation
    • Bandwidth, affinities, SRLG
  • Failure analysis
    • Local protection
    • Path protection
  • Traffic class aware (QoS violations)
Network Planning and Optimization
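A “what if” failure analysis can be sketched as: fail each link in turn, re-route every demand over what survives, and flag overloaded links. The sketch below is deliberately simplified and rests on assumptions: routing is plain hop-count shortest path (not a real TE path computation), and the topology, capacities and demand are invented for illustration.

```python
from collections import deque
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

Link = FrozenSet[str]              # undirected link, e.g. frozenset({"A", "Z"})
Demand = Tuple[str, str, float]    # (source, destination, Mbps)


def shortest_path(links: Iterable[Link], src: str, dst: str) -> Optional[List[str]]:
    """Hop-count shortest path via breadth-first search; None if unreachable."""
    neighbors: Dict[str, List[str]] = {}
    for link in links:
        a, b = sorted(link)
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    previous: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in sorted(neighbors.get(node, [])):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


def what_if(capacity: Dict[Link, float], demands: List[Demand]) -> Dict[str, List[str]]:
    """For every single-link failure, list unroutable demands and overloaded links."""
    report: Dict[str, List[str]] = {}
    for failed in capacity:
        surviving = [link for link in capacity if link != failed]
        load = {link: 0.0 for link in surviving}
        problems: List[str] = []
        for src, dst, mbps in demands:
            path = shortest_path(surviving, src, dst)
            if path is None:
                problems.append(f"{src}-{dst} unroutable")
                continue
            for hop in zip(path, path[1:]):
                load[frozenset(hop)] += mbps
        problems += [f"{'-'.join(sorted(link))} overloaded"
                     for link in surviving if load[link] > capacity[link]]
        report["-".join(sorted(failed))] = problems
    return report


# Hypothetical triangle A-B-Z with one thin link (capacities in Mbps)
capacity = {frozenset({"A", "Z"}): 10_000.0,
            frozenset({"A", "B"}): 10_000.0,
            frozenset({"B", "Z"}): 1_000.0}
report = what_if(capacity, [("A", "Z", 2_500.0)])
print(report)
```

In this example only the failure of link A-Z causes a problem, because the detour runs over the thin B-Z link. A planning tool would additionally model SRLGs, affinities and traffic classes.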
Path protected, co-routed, bi-directional LSPs
[Figure: working and protect LSPs between A and Z, with numbered callouts]
1. A protect LSP is pre-signaled and pre-programmed for handling failures immediately
2. Bandwidth is reserved on each link along the path
3. Forward and reverse LSP are routed along the same path
4. SRLG diverse protect path routing
5. BFD continuity messages validate that the end-to-end datapath is programmed properly
6. Headend based switching between working & protect path
7. Tailend accepting traffic from both working and protect path
Adjusting the MPLS Control Plane for TDM
[Figure: primary and backup paths between A and Z in four failure scenarios]
• MPLS-TE is naturally re-optimizing and revertive: after a failure (X), a new primary path is set up alongside the backup path. SONET or OTN networks don’t behave like that!
• Introducing “persistent” MPLS-TE paths: when the primary path has failed, stay on the protect (backup) path!
• Introducing non-revertive path protection: when the primary path is operational again on the same path, it stays inactive and the backup path is still active. Return to the primary path only upon explicit request
• Introducing 1:1+R to MPLS-TE to handle double failures: when both the primary path and the backup path have failed, an on-demand restore path is used
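The non-revertive and 1:1+R behaviour can be sketched as a small state machine. This is an illustrative model, not the actual MPLS-TE implementation: the `ProtectedLsp` class and its method names are hypothetical, and it assumes a restore path can always be computed when both pre-provisioned paths are down.

```python
from dataclasses import dataclass


@dataclass
class ProtectedLsp:
    """Toy model of non-revertive 1:1+R path protection."""
    working_up: bool = True
    protect_up: bool = True
    active: str = "working"   # "working", "protect" or "restore"

    def _select(self) -> None:
        # Non-revertive: never leave the active path while it is healthy.
        if self.active == "working" and self.working_up:
            return
        if self.active == "protect" and self.protect_up:
            return
        if self.protect_up:
            self.active = "protect"
        elif self.working_up:
            self.active = "working"
        else:
            # Double failure: fall back to an on-demand restore path
            # (ASSUMED to be computable in this sketch).
            self.active = "restore"

    def set_path_state(self, path: str, up: bool) -> str:
        if path == "working":
            self.working_up = up
        elif path == "protect":
            self.protect_up = up
        else:
            raise ValueError(f"unknown path: {path}")
        self._select()
        return self.active

    def manual_revert(self) -> str:
        """Return to the working path only upon explicit request."""
        if self.working_up:
            self.active = "working"
        return self.active


lsp = ProtectedLsp()
after_failure = lsp.set_path_state("working", False)   # switch to protect
after_repair = lsp.set_path_state("working", True)     # stay on protect
after_request = lsp.manual_revert()                    # explicit return to working
lsp.set_path_state("working", False)
after_double = lsp.set_path_state("protect", False)    # both paths down
print(after_failure, after_repair, after_request, after_double)
```

One design choice here is an assumption rather than something the slides state: once a pre-provisioned path recovers, the model leaves the restore path automatically.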
• Packet networks are no longer “slow” or high latency thanks to hardware based packet forwarding
• The only remaining cause of increased latency is congestion (packets have to be stored in a buffer until a link is ready to send them)
• Implementing strict bandwidth accounting (RSVP-TE or central PCE) makes it possible to design a packet network with a utilization <100% on every link, which avoids packets being buffered
• This keeps the overall transfer delay of a packet node in the 10-30 µsec range (similar to OTN switches!)
Achieving Low Latency
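Why buffering dominates latency can be shown with simple arithmetic: the time a packet waits equals the bytes queued ahead of it divided by the link rate. The sketch below uses hypothetical values (a 10 Gbps link, a 1,500 byte frame and a 1 MB backlog); `drain_time_us` is an illustrative helper.

```python
def drain_time_us(queued_bytes: int, link_gbps: float) -> float:
    """Microseconds needed to transmit queued_bytes on a link of link_gbps."""
    bits = queued_bytes * 8
    bits_per_microsecond = link_gbps * 1e3   # 1 Gbps = 1,000 bits per microsecond
    return bits / bits_per_microsecond


# Hypothetical 10 Gbps link
single_frame = drain_time_us(1_500, 10.0)        # one full-size frame, empty queue
congested = drain_time_us(1_000_000, 10.0)       # waiting behind a 1 MB backlog
print(f"frame: {single_frame:.1f} us, behind backlog: {congested:.1f} us")
```

With an empty queue the serialization time is far below the 10-30 µsec node delay quoted above, whereas a congested queue adds hundreds of microseconds. This is why strict bandwidth accounting matters.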
Fault Detection and Notification
Ethernet
  • Fault detection: local fault (LF)
  • Downstream notification: VPWS status, Y.1731 ETH-AIS
  • Remote notification: remote fault (RF), Y.1731 ETH-RDI
SONET
  • Fault detection: LOS
  • Downstream notification: STS level AIS via C2 = FF (hex); VT level AIS via V5 bits 1-3 = 111 (binary)
  • Remote notification: STS level RDI via G1; VT level RDI via Z7 (for both, bits 1-3 = 010, 100, 101, 110 or 111 (binary))
[Figure: fault propagation between A and Z for a SONET pseudowire and an Ethernet pseudowire]
• SONET pseudowire: LOS (x) is signaled via (1) CW with L bit set and (2) PW status down; AIS is sent downstream; RDI is returned via CW with R bit set
• Ethernet pseudowire: LF (x) is signaled via (1) CCM loss of continuity and (2) PW status = down; the far end reacts with ETH-AIS, RF or remote port shutdown; ETH-RDI is returned
AIS … Alarm Indication Signal
RDI … Remote Defect Indication
LF … Local Fault
RF … Remote Fault
SONET/OTN Performance Monitoring
• Errored block (EB) = block with at least one bit error
• Errored second (ES) = period of 1 second with at least one errored block
• Severely errored second (SES) = >15% errored blocks during 1 second (or LOS, LOF, AIS, RDI, …)
• Severely errored period (SEP) = 3 to 9 consecutive SES
• Background errored block (BBE) = errored block not part of a SES
• Unavailable Seconds (UAS) = number of seconds of unavailable time
[Figure: BIP is inserted at the transmitter; the receiver checks the parity against the actual bitstream]
References: ITU-T error performance explained, also see G.828, G.8201
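The parity check and the per-second counters can be sketched as follows. This is a simplified illustration: it assumes an 8-bit BIP (BIP-8, even parity per bit position), counts only ES, SES and BBE using the 15% threshold quoted above, and ignores defects such as LOS or AIS. The function names and sample values are hypothetical.

```python
from functools import reduce
from typing import Dict, List


def bip8(block: bytes) -> int:
    """BIP-8: even parity per bit position, i.e. the XOR of all bytes."""
    return reduce(lambda acc, byte: acc ^ byte, block, 0)


def pm_counters(errored_blocks_per_second: List[int], blocks_per_second: int,
                ses_ratio: float = 0.15) -> Dict[str, int]:
    """Derive ES, SES and BBE counts from errored-block counts per second."""
    counters = {"ES": 0, "SES": 0, "BBE": 0}
    for errored in errored_blocks_per_second:
        if errored == 0:
            continue
        counters["ES"] += 1                      # at least one errored block
        if errored / blocks_per_second > ses_ratio:
            counters["SES"] += 1                 # severely errored second
        else:
            counters["BBE"] += errored           # errored blocks outside SES
    return counters


# Transmitter inserts the BIP, receiver recomputes it over what it received
sent = bytes([0b10110000, 0b01100001])
received = bytes([0b10110001, 0b01100001])       # one bit flipped in transit
parity_violations = bin(bip8(sent) ^ bip8(received)).count("1")

# Hypothetical measurement: 5 seconds at 8,000 blocks per second
counters = pm_counters([0, 3, 2000, 0, 1], blocks_per_second=8000)
print(parity_violations, counters)
```

A mismatch between the inserted and the recomputed parity marks the block as errored; the per-second totals then feed the ES, SES and BBE counters.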
Ethernet Performance Monitoring
• ETH-LM packets are exchanged between the maintenance endpoints (MEPs) every X ms (default = 100 ms)
• Time interval with FLR < threshold (e.g. 50%)
• Severely errored second (SES_ETH) = FLR > threshold (e.g. 50%)
• Consecutive SES_ETH = two or more seconds with FLR > threshold
References: Y.1731, Y.1563, MEF35.1, MEF10.3
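The frame loss ratio (FLR) logic can be sketched from transmit and receive frame counters. The sketch assumes the ETH-LM counters have already been aggregated into one (tx, rx) pair per second and uses the 50% example threshold; the function names and counter values are hypothetical.

```python
from typing import List, Tuple


def frame_loss_ratio(tx_frames: int, rx_frames: int) -> float:
    """FLR = lost frames / transmitted frames (0.0 when nothing was sent)."""
    if tx_frames == 0:
        return 0.0
    return (tx_frames - rx_frames) / tx_frames


def ses_eth_flags(per_second: List[Tuple[int, int]], threshold: float = 0.5) -> List[bool]:
    """True for every second whose FLR exceeds the threshold."""
    return [frame_loss_ratio(tx, rx) > threshold for tx, rx in per_second]


def count_consecutive_runs(flags: List[bool], min_run: int = 2) -> int:
    """Number of runs of at least min_run consecutive SES_ETH seconds."""
    runs, current = 0, 0
    for flag in flags + [False]:     # sentinel closes a run at the end
        if flag:
            current += 1
        else:
            if current >= min_run:
                runs += 1
            current = 0
    return runs


# Hypothetical (tx, rx) frame counts for five consecutive seconds
counts = [(1000, 1000), (1000, 400), (1000, 300), (1000, 990), (1000, 100)]
flags = ses_eth_flags(counts)
runs = count_consecutive_runs(flags)
print(flags, runs)
```

Seconds two and three form one consecutive SES_ETH run; the isolated last second is an SES_ETH but not part of a run.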
• Identified by EtherType and destination MAC address ranges (MEF45, 6.1)
  • 01-80-C2-00-00-00 … 01-80-C2-00-00-0F
  • 01-80-C2-00-00-20 … 01-80-C2-00-00-2F
• Depending on the MEF UNI configuration, protocol packets may pass or get dropped
• EPL Option 2 L2CP processing as defined in MEF45 is aligned with private lines
Transparency – Layer 2 Control Protocols
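Matching a frame against the L2CP destination address ranges can be sketched as a numeric range check. The sketch assumes the two ranges sit in the IEEE reserved 01-80-C2-00-00-xx block; `parse_mac` and `is_l2cp_address` are hypothetical helpers, and a real classifier would also look at the EtherType.

```python
def parse_mac(mac: str) -> int:
    """Convert 'aa-bb-cc-dd-ee-ff' or 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace("-", "").replace(":", ""), 16)


# L2CP destination address ranges (ASSUMED to be in the 01-80-C2 block)
L2CP_RANGES = [
    (parse_mac("01-80-C2-00-00-00"), parse_mac("01-80-C2-00-00-0F")),
    (parse_mac("01-80-C2-00-00-20"), parse_mac("01-80-C2-00-00-2F")),
]


def is_l2cp_address(mac: str) -> bool:
    """True if the destination MAC falls into one of the L2CP ranges."""
    value = parse_mac(mac)
    return any(low <= value <= high for low, high in L2CP_RANGES)


for mac in ("01-80-C2-00-00-00", "01:80:c2:00:00:0e",
            "01-80-C2-00-00-10", "ff-ff-ff-ff-ff-ff"):
    print(mac, is_l2cp_address(mac))
```

Whether a matching frame is then passed or dropped depends on the UNI configuration, as noted above.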
• Identified by EtherType 0x88E5 (MACsec) and 0x888E (MKA)
• EAPoL destination MAC address 01-80-C2-00-00-03 (802.1Q-2010) only SHOULD be passed per MEF45 9.1.1
  • End customers can change to the broadcast MAC address on their CPE routers to ensure packets pass
• A clear 802.1Q tag is optional (allows for VLAN and QoS aware services)
Transparency - MACsec
[Figure: MACsec frame layout showing which fields are sent in the clear and which are encrypted]
https://www.cisco.com/c/dam/en/us/td/docs/solutions/Enterprise/Security/MACsec/WP-High-Speed-WAN-Encrypt-MACsec.pdf
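How a node might recognise MACsec and MKA frames, with or without a clear 802.1Q tag in front, can be sketched by reading the EtherType from the Ethernet header. This is an illustrative parser only: the `classify` function, its labels and the sample frames are hypothetical, and it handles a single 0x8100 tag.

```python
import struct

ETHERTYPE_DOT1Q = 0x8100    # clear 802.1Q VLAN tag
ETHERTYPE_MACSEC = 0x88E5   # MACsec
ETHERTYPE_EAPOL = 0x888E    # EAPoL, which carries MKA


def classify(frame: bytes) -> str:
    """Classify a raw Ethernet frame by EtherType, skipping one clear VLAN tag."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    tagged = ethertype == ETHERTYPE_DOT1Q
    if tagged:
        if len(frame) < 18:
            raise ValueError("truncated 802.1Q tag")
        (ethertype,) = struct.unpack_from("!H", frame, 16)
    prefix = "tagged " if tagged else ""
    if ethertype == ETHERTYPE_MACSEC:
        return prefix + "macsec"
    if ethertype == ETHERTYPE_EAPOL:
        return prefix + "eapol/mka"
    return prefix + "other"


# Hypothetical frames: destination MAC, source MAC, then EtherType(s)
macs = bytes.fromhex("ffffffffffff") + bytes.fromhex("020000000001")
untagged = macs + struct.pack("!H", ETHERTYPE_MACSEC) + bytes(16)
tagged = macs + struct.pack("!HH", ETHERTYPE_DOT1Q, 100) + struct.pack("!H", ETHERTYPE_MACSEC)
mka = macs + struct.pack("!H", ETHERTYPE_EAPOL) + bytes(16)
print(classify(untagged), "|", classify(tagged), "|", classify(mka))
```

Leaving the 802.1Q tag in the clear is what lets the transport network apply VLAN and QoS aware handling without decrypting the payload.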
Operational Simplicity – has this really been easy?
[Figure: Site 1, Site 2 and Site 3 with DS1 and OC3 ports, attached via ADMs to three OC12 rings that are interconnected by 3/1 DCS nodes; legend: existing STS3c, newly provisioned STS3c, existing VT15, newly provisioned VT15]
ADM … Add Drop Multiplexer
3/1 DCS … Low Order Cross Connect (aka DCS)
Operational Simplicity – True A-Z via MPLS !
[Figure: Site 1, Site 2 and Site 3 with DS1 and OC3 ports, connected from A to Z across a single MPLS network by MPLS LSPs]
• A circuit becomes a pseudowire: a DS1 is carried as a SAToP pseudowire, an STS3c as a CEP pseudowire
• The MPLS LSP size is adjusted by the edge nodes as TDM services get mapped into it
• MPLS midpoints are not aware of TDM services; all they know is how to forward MPLS LSP traffic
LSP … Label Switched Path
• Recent advances in ASIC/silicon and router architectures make MPLS the most cost-effective network transport
• Great savings by deploying a single MPLS network for all services
• Advances in traffic engineering (RSVP-TE and SR-TE) make it possible to run an MPLS network in a tightly controlled manner to guarantee bandwidth and low latency
• MPLS protection switching, bandwidth engineering and OAM functions guarantee transport SLAs
Conclusion