© 2013 Cisco and/or its affiliates. All rights reserved. BRKRST-2333 Cisco Public
Arkadiy Shapiro
Technical Marketing Engineer
NX-OS and Nexus 7000
BRKRST-2333
Ver 1.8
Network Failure Detection
Why am I here?
[Figure: network diagram. Labels: Campus Core, Access, DC Access, Routing Core, SP Edge; platforms: Catalyst 6500, Nexus 2000 / 3000 / 3500 / 5000, ASR 9000, CRS-3; link speeds: 40G, 100G]
Session Goals
At the end of the session, participants should:
Understand where failure detection fits in achieving network fast convergence
Be able to identify which failure detection technologies are needed to meet business needs and required SLAs
Understand future advances in network failure detection technologies
Session Non-goals
This session does not include:
Discussion on other aspects of fast convergence
Details on software or hardware architectures of related Cisco products
Detailed roadmap discussion for related Cisco products
Detailed discussion on service / end-to-end failure technologies
Discussion on user-driven failure detection methods (ping, traceroute, etc.) and using scripts / EEM to automate them
Agenda
Overview
Layer 1 Failure Detection
Layer 2 Failure Detection
Layer 3 Failure Detection
Summary
Overview
Routing Convergence in Action
[Figure: routers A, B, C, and D reacting to a link failure along a timeline t0 to t4]
“Ooops.. Problem”
“Folks: my link to B is down” / “Folks: my link to C is down”
“Ok, fine, will use path via D”
“I don’t care, nothing changes for me”
Loss of Connectivity = t4 – t0
Routing Convergence Components
1. Failure Detection
2. Failure Propagation (flooding, etc.)
3. Topology/Routing Recalculation
4. Update of the routing and forwarding table (RIB & FIB)
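The four components above add up to the loss-of-connectivity window (t4 – t0). A minimal Python sketch of that budget follows; the class name and all durations are hypothetical and chosen only for illustration, not measured values.

```python
# Illustrative only: the durations below are hypothetical, not measurements.
from dataclasses import dataclass


@dataclass
class ConvergenceBudget:
    detection_ms: float      # 1. failure detection
    propagation_ms: float    # 2. failure propagation (flooding, etc.)
    recalculation_ms: float  # 3. topology/routing recalculation
    table_update_ms: float   # 4. RIB & FIB update

    def loss_of_connectivity_ms(self) -> float:
        """Total outage: the sum of all four phases (t4 - t0)."""
        return (self.detection_ms + self.propagation_ms
                + self.recalculation_ms + self.table_update_ms)


budget = ConvergenceBudget(detection_ms=150, propagation_ms=20,
                           recalculation_ms=50, table_update_ms=100)
total = budget.loss_of_connectivity_ms()
print(f"loss of connectivity: {total} ms")
```

With numbers like these, detection dominates the budget, which is why the rest of the session concentrates on that first component.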
[Figure: timeline t0 to t4 divided into phases 1 to 4; caption: IGP and BGP Reaction]
Failure Detection Overview
Detecting the failure is critical, but it is also the most challenging part of network convergence
Failure Detection can occur on different levels / layers:
Physical Layer (1)
Data link Layer (2)
Network Layer (3)
Service / Application (not covered here)
Do you really need to touch all the layers?
Interconnection Options
[Figure: layer stack: IP/MPLS; Ethernet/FR/ATM …; SONET/SDH, OTN; DWDM]
A. Layer 3 p2p
B. Layer 3 with a Layer 1 (DWDM) “bump” in wire
C. Layer 3 with a Layer 2 (Ethernet / Frame Relay / ATM switch) “bump” in wire
D. Layer 3 with a Layer 3 (Firewall / router) “bump” in wire
[Figure: options A to D drawn against layers L1, L2, and L3]
Failure Detection Tools: Layered Approach
Service / Application: 802.1ag CFM; Y.1731 PM; BFD for VCCV, GRE; FabricPath/TRILL OAM
Layer 3: BFD for BGP, OSPF, IS-IS, EIGRP, FHRPs and static
MPLS: BFD for MPLS LSPs / TE-FRR
Layer 2: UDLD; LACP; 802.1ag CFM / Y.1731 FM; 802.3ah Link OAM
Layer 1: Bit transmission; Signaling: Auto-negotiation / FEFI / Remote Fault Indication; Other: Carrier Delay / Debounce
Engineering Complexity vs. Gain K.I.S.S
[Figure: Cost and Complexity plotted against Loss (Impairments/Time), with regions labeled Re-engineering Required, Potential Over-Engineering, and Viable Engineering]
Number of possible approaches, or combinations of approaches.
Range of viable engineering options may vary by type of application
Layer 1 Failure Detection
Layer 1 – IPoDWDM Proactive Protection
IP / optical integration makes it possible to identify a degraded link using optical data (pre-FEC BER) and start protection (i.e. by signaling to the IGP/FRR) before traffic starts failing, achieving hitless failover in many cases
[Figure: reactive vs. proactive protection between a router optical / WDM port and a transponder. Reactive protection: BER on the working path rises with optical impairments past the FEC limit, LOF occurs, and data is lost during switchover to the protected path. Proactive protection: a protection trigger below the FEC limit moves traffic from the working path to the protect path for a near-hitless switch]
HW Support: CRS, ASR 9000, XR 12000, 7600 (check specific interface types!)
Layer 1 Failure Detection – Ethernet
Ethernet mechanisms like auto-negotiation (1 GigE), FEFI (100FX) or link fault signalling (802.3ae/ba) can signal local failures to the remote end
The challenge is getting this signal across an Eth-over-SDH/OTN cloud, because relaying the fault information to the other end is not always possible
Link Fault Signaling
[Figure: R1 and R2 connected back to back (tx/rx pairs) with a failure marked X; R1 and R2 connected through Optical Transport (MUX-A, MUX-B), a “bump” in the Layer 1 link, with a failure marked X]
Link Down Detection
Link-down / interface-down event detection is hardware-dependent
Catalyst 6500 and Cisco 7600 OSM, SIP, 6708-10GE and more recent I/O modules use interrupt-driven notification, offering <10 ms detection
6704 offers <30 ms with optimized polling
All other older I/O modules are polled in order, 20 ms per port: worst case 48 * 20 ms = 960 ms to detect a failure!
Enhancement with CSCsr21196 (SXI, SRD2, SRC3) for fiber ports: 60 msec
Nexus switches / CRS / ASR 9000 – interrupt-driven notification
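The polled worst case quoted above is simple arithmetic: if ports are polled one after another, a failure on the port that was just polled is not seen until the whole cycle completes. A small Python sketch, where the function name is made up for illustration:

```python
def worst_case_polling_detection_ms(ports: int, poll_ms_per_port: int) -> int:
    """Worst case: the failed port has to wait for a full polling cycle."""
    return ports * poll_ms_per_port


worst = worst_case_polling_detection_ms(ports=48, poll_ms_per_port=20)
print(f"worst-case detection on a 48-port polled module: {worst} ms")
```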
How Fast?
Carrier Delay
Running timer in software
Filters link up and down events, notifies protocols
By default, most IOS versions set the timer to 2 seconds to suppress short flaps
This behaviour is not desirable for Fast Convergence
Setting carrier-delay to 0 on an SVI is not recommended
Standard routing platform feature
interface …
carrier-delay msec 0
Asymmetric Carrier Delay
When connecting to an Ethernet Layer 2 cloud, it may be desirable to briefly delay link-up without changing the link-down carrier delay
Otherwise, the initial ARP request could get dropped in the L2 cloud, which can create a short black hole (due to incomplete adjacency)
Some device drivers have a built-in up-delay
POS: Generally 10 seconds
7600 ES20/40 WAN ports: 4 seconds
interface …
carrier-delay up 20
interface …
carrier-delay up msec 20
SW Support
IOS
• 12.0(32)SY2
• 12.2SRD
IOS XR
• XR 3.4.0
Debounce Timer
Delay link down notification only
Runs in firmware
100 msec default in NX-OS
300 msec default on IOS on copper, 10 msec on fiber
In most cases, keeping the default is recommended
Standard switching platform feature
switch(config)# interface …
switch(config-if)# link debounce time ?
<0-5000> Timer value (in milliseconds)
NX-OS
Carrier Delay vs Debounce timer
Carrier Delay / Asymmetric Carrier Delay:
Runs in software
Not applicable to:
• Switches, except WAN interfaces (e.g. ES+ or SIP/SPA on Catalyst 6500)
• Ethernet LAN switching interfaces on routers (e.g. Cisco 7600 with WS-X6708 card)
Filters link down and up events
Debounce timer:
Runs in firmware
Not applicable to:
• Routers, except Ethernet LAN switching interfaces (e.g. Cisco 7600 with WS-X6708 card)
• WAN interfaces on switches (e.g. ES+ or SIP/SPA on Catalyst 6500)
• SVIs
Filters link down events only
Make sure to test before implementing!
Link Isolation - IP Event Dampening
[Figure: logical diagram of the accumulated penalty over time against the maximum penalty, suppress threshold, and reuse threshold, comparing the actual interface state with the interface state seen by routing protocols]
SW Support
IOS
IOS XE
IOS XR
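The diagram labels (accumulated penalty, suppress threshold, reuse threshold, maximum penalty) describe a penalty that grows with each flap and decays over time. The Python sketch below is a simplified illustration of that idea, not the IOS implementation; the class name, the exponential half-life decay, and every numeric value are assumptions.

```python
# Sketch of exponential-decay dampening; all numeric values are hypothetical.
class InterfaceDampening:
    def __init__(self, penalty_per_flap=1000.0, suppress=2000.0, reuse=1000.0,
                 max_penalty=16000.0, half_life_s=5.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress = suppress
        self.reuse = reuse
        self.max_penalty = max_penalty
        self.half_life_s = half_life_s
        self.penalty = 0.0
        self.suppressed = False  # True: routing protocols see the interface as down

    def advance(self, seconds: float) -> None:
        """Decay the accumulated penalty; release once it drops below reuse."""
        self.penalty *= 0.5 ** (seconds / self.half_life_s)
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False

    def flap(self) -> None:
        """Charge one flap; suppress once the suppress threshold is reached."""
        self.penalty = min(self.penalty + self.penalty_per_flap, self.max_penalty)
        if self.penalty >= self.suppress:
            self.suppressed = True


d = InterfaceDampening()
d.flap()
first = d.suppressed   # one flap: below the suppress threshold
d.flap()
second = d.suppressed  # second flap reaches the suppress threshold
d.advance(10)          # two half-lives of quiet time
third = d.suppressed   # penalty has decayed below the reuse threshold
print(first, second, third, d.penalty)
```

The gap between the suppress and reuse thresholds provides hysteresis, so a link that has just stopped flapping is not handed back to the routing protocols immediately.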
Layer 2 Failure Detection
Technology Analysis
What layer?
Keepalive message interval and timeout?
Types of failures detected?
Reaction to failures?
Methods to support ISSU?
Scale?
Protocol offload?
Standardization?
Types of interfaces supported?
Layer 2 and Layer 3 Failure Detection
Network Scenarios
Classical Ethernet Layer 2
Single p2p link
Bundle
FabricPath / TRILL
Single p2p link
Bundle
Layer 3
Single p2p link
Bundle
SVI on top of Classical Ethernet
SVI on top of FabricPath / TRILL
Layer 2 – Data Link Layer
Generally only applicable to L2 transports using some form of keepalive mechanism
PPP or HDLC keepalives
Frame-Relay LMI
ATM OAM
Ethernet OAM, LACP (bundles), UDLD
Sub-second failure detection at scale is typically not a goal of the features mentioned above
‒ Ethernet OAM / CFM is getting there…
‒ Fast UDLD
Tuning keepalives down to the minimum is NOT recommended; it can lead to false positives because keepalive processing may not be optimized
Unidirectional Link Detection (UDLD)
Light-weight Layer 2 failure detection protocol
Designed for detecting:
One-way connections due to physical failure
One-way connections due to soft failure
Mis-wiring detection (loopback or triangle)
Cisco proprietary, but listed in informational RFC 5171
Runs on any single Ethernet link, even inside bundle
Typically a centralized implementation (hellos sent from
supervisor, not from LC)
Message interval: 7-90 sec (default: 15 seconds)
Detection: 2.5 x interval + timeout value (4 sec): ~21 sec at the 7-sec minimum interval, ~41 sec at the 15-sec default
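The detection formula can be evaluated for the minimum, default, and maximum message intervals listed on the slide. A short Python sketch; the function name is invented for illustration and the formula is taken as stated on the slide:

```python
def udld_detection_time_s(message_interval_s: float, timeout_s: float = 4.0) -> float:
    """Approximate detection time: 2.5 x message interval + timeout value."""
    return 2.5 * message_interval_s + timeout_s


for interval in (7, 15, 90):  # minimum, default, maximum message interval
    print(f"{interval:>2} s interval -> ~{udld_detection_time_s(interval)} s")
```

Even at the minimum interval, detection takes tens of seconds, which is why standard UDLD is a safety net rather than a fast-convergence tool.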
UDLD Basics of Operation
With ECHO messages, each device learns:
What it is connected to and the peer’s message interval
What its neighbors think they are connected to!
This information can then be used to detect
faults
FLUSH message is sent when UDLD is
disabled
Aging mechanism with PROBE messages
Information from neighbors that is not periodically
refreshed is eventually timed out
This can also be used for fault detection
Peer Discovery and Relationship
UDLD Scenario 1
Echo Packet from A to B has “My Switch-ID A, My Port-ID e x/y”
When B sends the echo-reply back, it is expected to have “My Switch-ID
B, My Port-ID e w/z” AND “Your Switch-ID A, Your Port-ID e x/y”.
Transmit path failure from A to B
When B sends the echo-reply back, the echo-reply packet has only “My Switch-ID B, My Port-ID e w/z”. B timed out!
Empty-Echo condition or age out
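The echo check in this scenario can be sketched in a few lines of Python. This is a simplified illustration, not the actual UDLD state machine: the `PortId` and `EchoReply` types, the `classify` function, and the returned labels are all hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PortId:
    switch_id: str
    port_id: str


@dataclass
class EchoReply:
    sender: PortId                              # "My Switch-ID / My Port-ID"
    echoed: list = field(default_factory=list)  # "Your Switch-ID / Your Port-ID" entries


def classify(local: PortId, reply: EchoReply) -> str:
    """Decide what a received echo-reply says about the link."""
    if not reply.echoed:
        return "empty-echo"         # neighbor hears nobody: our transmit path is suspect
    if local in reply.echoed:
        return "bidirectional"      # neighbor echoes our own identity back
    return "neighbor-mismatch"      # neighbor hears someone else: possible miswiring


a = PortId("A", "e x/y")
b = PortId("B", "e w/z")
c = PortId("C", "e s/t")
healthy = classify(a, EchoReply(sender=b, echoed=[a]))
tx_failed = classify(a, EchoReply(sender=b, echoed=[]))
miswired = classify(a, EchoReply(sender=b, echoed=[c]))
print(healthy, tx_failed, miswired)
```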
[Figure: Switch A (e x/y) connected to Switch B (e w/z), each running UDLD and PktMgr; the failed path is marked X]
UDLD Scenario 2
Miswiring Detection
Caused by packets flowing in only one (uni) direction
Key differentiating factor of UDLD!
With SFP-type fiber connections, this error is less common
[Figure: Switch A (e x/y), Switch B (e w/z), and Switch C (e s/t)]
Fast UDLD
Reduces the UDLD message interval to achieve sub-second detection
New Fast Hello TLV for backward compatibility
Message interval: 200 msec – 1 sec
Similar considerations as Layer 3 timer tuning:
CPU usage (false positives) and scale (not designed for this)
SSO / ISSU support
SW Support
IOS
• 12.2.33 SXI4
• 12.2(54)SG
switch(config)#interface GigabitEthernet1/1
switch(config-if)#udld fast-hello ?
<200-1000> Time in milliseconds between sending of messages in steady state
switch#show udld fast-hello
Total ports with fast hello configured: 10
Total ports with fast hello operational: 5
Total ports with fast hello non-operational: 5
Fast hello configuration setting (millisecond):
Interface Gi1/1 200 operational
Interface Gi1/6 500 configured
IOS
UDLD Failure Reaction
Normal vs. Aggressive mode
Normal:
Sets port to err-disable state in case of a unidirectional condition: Empty Echo packet, Uni-direction, TX/RX loop, and Neighbor Mismatch
Does NOT err-disable the port in case of sudden cessation of UDLD packets
Aggressive:
Sets port to err-disable state in case of a unidirectional condition: Empty Echo packet, Uni-direction, TX/RX loop, and Neighbor Mismatch
Sets port to err-disable state in case of sudden cessation of UDLD packets: the port is put in err-disable mode if no UDLD packets are received for 3 x hello-time + 5 sec (= 50 sec by default)
Spanning Tree Bridge Assurance
Turns STP into a bidirectional protocol
Ensures spanning tree fails “closed” rather than “open”
All ports with “network” port type send BPDUs regardless of state
If a network port stops receiving BPDUs, the port is placed in BA-Inconsistent state (blocked)
Caveats:
Not recommended on VPC ports
ISSU on Nexus 5000 not supported with STP BA (VPC peer-link is exception)
NX-OS
%STP-2-BRIDGE_ASSURANCE_BLOCK: Bridge Assurance blocking port Ethernet2/48 VLAN0700.
switch# sh spanning vl 700 | in -i bkn
Eth2/48 Desg BKN*4 128.304 Network P2p *BA_Inc
SW Support
IOS
• 12.2.33 SXI
• 12.2.50SY
NX-OS
• 4.0(1)
With Bridge Assurance
[Figure: spanning tree topology with a Root, Network and Edge ports, and one Blocked port exchanging BPDUs; next to a malfunctioning switch, the ports that stopped receiving BPDUs become BA Inconsistent]
%STP-2-BRIDGE_ASSURANCE_BLOCK: Bridge Assurance blocking port Ethernet2/48 VLAN0700.
switch# show spanning vl 700 | in -i bkn
Eth2/48 Altn BKN*4 128.304 Network P2p *BA_Inc
UDLD “Original” Deployment Scenarios
Assist unidirectional Layer 2 protocols
[Figure 1: Spanning Tree Loop Prevention. Root switch A, switches B and C, links 1, 2, and 3, alternate port blocking]
[Figure 2: Spanning Tree Fast Convergence. Root switch A, switches B and C, links 1 and 2, alternate port blocking]
[Figure 3: Ether-channel Convergence. Channel group 1 mode on]
Figure labels: RSTP 802.1w, STP Bridge Assurance, LACP
UDLD Best Practices
How much do you really need UDLD?
Physical uni-directional failures are communicated by Layer 1 mechanisms
STP Bridge Assurance to account for soft failures in either direction
LACP to account for failures on bundle members
Chance of mis-wiring may be rare
Are you on a Layer 3 / FabricPath p2p link that already runs a bidirectional protocol?
If UDLD is needed:
Use normal mode
Use default timers
Choose only a few interfaces for Fast UDLD
OAM
Link OAM - Any point-to-point 802.3 link
CFM / Y.1731 - End-to-End UNI to UNI
E-LMI - User to Network Interface (UNI)
MPLS OAM - within MPLS cloud
Current Protocol Positioning
[Figure: protocol positioning across Customer (Business, Residential), Access (Provider Bridges, Backbone Bridges), and Core (IP/MPLS, MSE/BNG) domains with UNI and NNI boundaries, showing the reach of Ethernet Link OAM, E-LMI, MPLS OAM, Connectivity Fault Management, and Y.1731 Performance Management]
Ethernet OAM
IEEE 802.3ah (clause 57)
Ethernet Link OAM
Also referred to as 802.3 OAM or Link OAM
IEEE 802.1ag
Connectivity Fault Management (CFM)
Also referred to as Service OAM
ITU-T Y.1731
OAM functions and mechanisms for Ethernet-based networks
MEF E-LMI
Ethernet Local Management Interface
Building Blocks
Link OAM
Provides mechanisms for “monitoring link operation”
Runs on any single point-to-point Ethernet link
Uses “Slow Protocol” (1) frames called OAMPDUs
OAMPDU interval: 100 msec – 1 sec (1-10 pps)
Minimum Timeout: 200 msec (IOS XR), 2 sec (IOS)
Extensible and flexible protocol
Supported mainly on Carrier Ethernet platforms:
Cisco 7600, ASR 9000, ASR 901, ASR 903, ME switches
IEEE 802.3ah, Clause 57 (IEEE 802.3-2008)
(1) No more than 10 frames transmitted in any one-second period
IEEE 802.3ah
OAM Discovery
Discover OAM support, peer identity and capabilities per device
Link Monitoring
Basic error definitions for Ethernet so entities can detect degraded links and
isolate them
Remote Failure Indication
Mechanisms for one entity to signal another that it has detected an error
Remote Loopback
Used to troubleshoot networks, allows one station to put the other station into a
state whereby all inbound traffic is immediately reflected back onto the link
Remote MIB Variable Retrieval
Ability to read one or more MIB variables from the remote DTE
Key Functions
Link OAM Discovery
switch#show ethernet oam discovery interface fas 1/1
FastEthernet1/1
Local client
------------
  Administrative configurations:
    Mode: active
    Unidirection: not supported
    Link monitor: supported (on)
    Remote loopback: not supported
    MIB retrieval: not supported
    Mtu size: 1500
  Operational status:
    Port status: operational
    Loopback status: no loopback
    PDU revision: 0
Remote client
-------------
  MAC address: 0011.9321.1640
  Vendor(oui): 00000C(cisco)
  Administrative configurations:
    PDU revision: 1
    Mode: active
    Unidirection: not supported
    Link monitor: supported
    Remote loopback: not supported
    MIB retrieval: not supported
    Mtu size: 1500
First phase of Ethernet OAM
Discovery has a simple state machine:
Send Information OAMPDU in a periodic
fashion
Discover peer device and its OAM configuration
and capabilities
Decide whether OAM clients can be fully
operational on the link
Detect timeout based on lack of OAMPDUs
from peer
No message interval exchange or
negotiation!
Link OAM scale and ISSU
• Scale
Slow protocol but 100 msec interval for all ports on a
linecard is not slow!
Protocol offload to I/O module CPU helps
Protocol offload to FPGA (ME 3400) helps even more!
• ISSU (the “zero service disruption one”)
Need graceful protocol mechanisms to support SSO /
ISSU – standard does not specify
Not possible to inflate timers since timers are not
negotiated!
Link OAM Basic Configuration
IOS and IOS XR
IOS XR (TenGigE 0/1/0/0):
interface TenGigE 0/1/0/0
 ethernet oam
  hello-interval 100ms      ! value in msec or sec
  connection timeout 2      ! local hello multiplier
IOS (TenGigEthernet4/1):
interface TenGigEthernet4/1
 ethernet oam
 ethernet oam max-rate 10   ! value in pps
 ethernet oam timeout 2     ! value in seconds
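The two configuration styles express the same idea in different units. The Python sketch below converts them to milliseconds, assuming (as the slide annotations suggest) that the IOS XR connection timeout is a multiplier of the hello interval and that an IOS rate in pps corresponds to an interval of 1000 / rate msec. Both function names are invented for illustration.

```python
def xr_connection_timeout_ms(hello_interval_ms: int, multiplier: int) -> int:
    """IOS XR style: timeout = hello interval x local hello multiplier."""
    return hello_interval_ms * multiplier


def ios_min_oampdu_interval_ms(max_rate_pps: int) -> float:
    """IOS style: a rate in packets per second expressed as an interval."""
    return 1000 / max_rate_pps


xr_timeout = xr_connection_timeout_ms(hello_interval_ms=100, multiplier=2)
ios_interval = ios_min_oampdu_interval_ms(max_rate_pps=10)
print(f"IOS XR timeout: {xr_timeout} ms; IOS OAMPDU interval: {ios_interval} ms")
```

Under these assumptions the IOS XR example yields the 200 msec minimum timeout quoted earlier, while the IOS example keeps a 2-second timeout regardless of the OAMPDU rate.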
Link OAM - Link Monitoring
Monitor link quality every 1 sec (min)
Conditions monitored:
Errored Symbol Period
Errored Frame
Errored Frame Period
Errored Frame Seconds
Receive CRC (Cisco defined – IOS only)
Transmit CRC (Cisco defined – IOS only)
Configure error condition thresholds to:
Signal peer with “Event Notification” OAMPDU
Syslog / SNMP trap
Isolate the link
Link OAM – Link Monitoring
Problem
Ensure CRC errors injected by devices don’t propagate through the network
Need to operate with or without neighbor discovery
Solution
IEEE 802.3ah for link monitoring and error-disable
Example: CRC Detection and Link Isolation (IOS)
interface GigabitEthernet1/1
 ethernet oam
 ethernet oam link-monitor receive-crc window 1
 ethernet oam link-monitor receive-crc threshold high 10
 ethernet oam link-monitor high-threshold action error-disable-interface
……
Nov 10 09:56:08.643: EOAM LM(Gi1/1): sending an EventTLV!
Nov 10 09:56:09.643: %ETHERNET_OAM-5-LINK_MONITOR: 94 rx CRC errors detected over the last 1 seconds on interface Gi1/1.
Nov 10 09:56:09.643: EOAM LM(Gi1/1): sending an EventTLV!
Nov 10 09:56:09.647: %PM-SP-4-ERR_DISABLE: link-monitor-failure error detected on Gi1/1, putting Gi1/1 in err-disable state
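The behaviour in this example amounts to counting CRC errors per monitoring window and acting when the count crosses the high threshold. A minimal Python sketch of that decision follows; the function name is hypothetical, and whether the real comparison is "greater than" or "greater than or equal" is an assumption.

```python
def link_monitor_action(crc_errors_in_window: int, high_threshold: int) -> str:
    """Return the action for one monitoring window (assumes >= comparison)."""
    if crc_errors_in_window >= high_threshold:
        return "error-disable-interface"
    return "none"


# 94 errors in a 1-second window against a high threshold of 10, as in the log
bad_window = link_monitor_action(crc_errors_in_window=94, high_threshold=10)
clean_window = link_monitor_action(crc_errors_in_window=3, high_threshold=10)
print(bad_window, clean_window)
```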
Link OAM Miswiring Detection (IOS XR only)
Mechanism to detect miswiring of Ethernet
ports
Similar to UDLD, but using standard protocol
with Cisco vendor extension
Uses existing 4-byte field in periodic OAMPDU (Information OAMPDU Vendor TLV ‘Vendor Information’ field)
Vendor Information is copied back by the
peer, allowing for MWD
Interoperates with other 802.3ah-compliant
vendors
Closing the gap with UDLD
SW Support
IOS XR
• 3.9
[Figure: ports X, Y, and Z exchanging messages: “I am X”; “I am Y, I know X”; “I am Z, I know Y”]
interface TenGigE 0/1/0/0
 ethernet oam
  action wiring-conflict error-disable-interface
Link OAM Failure Reaction
No standard defines this!
Depending on implementation, available options for
failure reaction / path isolation:
Syslog / SNMP trap
Signal peer using specific OAMPDU
Error-disable
Error-block
Error-disable operates at Layer 1; useful when manual intervention must be forced after an error (like mis-wiring)
Today, only IOS XR can isolate path based on peer
timeout or received notification OAMPDU!
Path Isolation
Link OAM Failure Reaction
Mechanism for OAM protocol to bring down interface “line protocol” state
when a problem is detected
Interface / sub-interface / bundle is “down” to routing / switching protocols
(MSTP, ARP, IGPs, BGP) – will trigger reconvergence
E-OAM protocols continue to operate
Automatic recovery when fault is resolved
IOS XR only, IOS supports error-block
Benefits:
Reduced interface up/down churn
Deterministic recovery
Path Isolation with Ethernet Failure Detection (EFD)
interface TenGigE 0/1/0/0
ethernet oam
…
action link-fault error-disable-interface
action link-fault efd
action discovery-timeout error-disable-interface
action discovery-timeout efd
Ethernet Failure Detection (EFD)
[Figure: logical diagram of EFD. Labels: Interface, MAC layer, Packet I/O, Link OAM, EFD, CDM, and upper-layer protocols L2VPN, IPv4, IPv6, MPLS; on “Failure detected” the state changes from UP to DOWN]
SW Support
IOS XR
• 3.9
Link OAM vs UDLD
Link OAM adoption is growing, could be adopted in
enterprises / DC in future
Stick with UDLD (at least for now):
Link OAM mis-wiring detection only on IOS XR as
proprietary extension
Link OAM path isolation based on timeout only in IOS XR
Consider Link OAM today if you:
Must adhere to standard protocols
Need Link Monitoring capabilities
Who Wins?
Link Aggregation Control Protocol (LACP)
Protocol used to:
‒ Ensure configuration consistency across bundle members on both ends
‒ Ensure wiring consistency (bundle members
between 2 chassis)
‒ Detect unidirectional links
‒ Bundle member keepalive
Peers negotiate requested send rate among
other things through LACPDUs
Loss of heartbeat typically triggers port
suspend
IEEE 802.1AX (formerly 802.3ad)
LACP Slow, Fast and Super Fast Hellos
Traditional LACP heartbeat intervals
Long interval: 30 sec → 90 sec failure detection
Short interval: 1 sec → 3 sec failure detection
IOS / IOS-XE / IOS XR / NX-OS
IOS / NX-OS:
interface Ethernet1/7
 lacp rate fast
IOS XR:
interface gig 0/1/2/3
 bundle id <n> mode active
 lacp period short
Heartbeats typically sent from supervisor, so SSO / ISSU will not work with aggressive timers
Very fast LACP hellos sent from ASR 9K / CRS linecard
Proprietary Cisco extension on IOS-XR allows for:
Signalling at 100 msec with 300 msec failure detection
ISSU support with fast timers (from IOS XR 4.1)
Use only if you can’t do per-link BFD or Fast UDLD and need sub-second detection!
IOS XR:
interface gig 0/1/2/3
 bundle id <n> mode active
 lacp period 100
interface Bundle-Ether 1
 lacp cisco enable
SW Support: IOS XR 3.9
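All three LACP hello rates on this slide follow the same rule: failure detection takes three heartbeat intervals. A short Python sketch, with a made-up function name and a table keyed by informal labels:

```python
def lacp_detection_time_ms(heartbeat_interval_ms: int, missed_heartbeats: int = 3) -> int:
    """Failure detection time = heartbeat interval x number of missed heartbeats."""
    return heartbeat_interval_ms * missed_heartbeats


intervals_ms = {"long": 30_000, "short": 1_000, "super fast (IOS XR)": 100}
detection_ms = {name: lacp_detection_time_ms(ms) for name, ms in intervals_ms.items()}
for name, ms in detection_ms.items():
    print(f"{name}: {ms} ms")
```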
Layer 3 Failure Detection
Failure Detection at Layer 3
In some cases, failure detection relies on checks at Layer 3
How quickly can I detect a failure (neighbor down event)?
[Figure: scenarios that rely on Layer 3 detection, each with a failure marked X: L2 bridged network; DWDM/X without LoS propagation; Tunnels (GRE, IPsec, etc.). Callouts: “Something just happened!” and “Something happened a while ago!”]
Is Layer 3 Failure Detection Tuning Necessary?
Needed when:
Intermediate L2 hop over L3 link
Concerns over any protocol software failures
Concerns over unidirectional failures on point-to-point physical L3 links
May not be needed when:
Point-to-point physical L3 links with no concerns over unidirectional failures
Enough software redundancy to account for protocol software failures
FHRPs are running in active-active mode (VPC/VPC+ in Nexus 5000 / 7000)
FHRPs with vPC / vPC+ in NX-OS
HSRP, VRRP and GLBP in vPC / vPC+
environment operate in Active/Active mode
No additional configuration required
General best practices still apply, except:
Since running in active/active mode,
aggressive timers can be relaxed
No need to manipulate priorities / preemption
on different devices to achieve load-balancing
Active/Active Mode
[Figure: vPC pair at the L3 / L2 boundary. HSRP/VRRP “Active”: active for shared L3 MAC; HSRP/VRRP “Standby”: active for shared L3 MAC]
Layer 3 Failure Detection
All Layer 3 protocols (FHRPs, BGP, EIGRP, OSPF etc) use HELLOs to:
Maintain adjacencies (pass protocol specific info)
Check neighbour reachability and detect failure
Hello/Keepalive and Dead/Hold timers can be tuned down, however it is
not recommended:
Each interface may have 2-3+ protocols establishing adjacency (e.g. HSRP, PIM,
OSPF on SVI)
Increased supervisor CPU utilization → false positives
Configuration complexity and waste of link bandwidth
Challenges supporting ISSU / SSO
Challenges achieving sub-second detection
Having said this: works reasonably well in small & controlled environments
Protocol Timers
Bidirectional Forwarding Detection (BFD)
Lightweight hello protocol designed to run over
multiple transport protocols:
‒IPv4, IPv6, MPLS, TRILL
Designed for sub-second Layer 3 failure
detection
Any interested client (OSPF, BGP, HSRP etc.)
registers with BFD and is notified as soon as BFD
detects a neighbor loss
All registered clients benefit from uniform failure
detection
Runs on physical, virtual and bundle interfaces
Uses UDP port 3784 / 3785 (for echo)
RFC 5880 / 5881
Layer 3 Failure Detection with BFD
Bidirectional Forwarding Detection (BFD) – recommended Layer 3 failure
detection mechanism over lowered protocol timers
BFD general advantages:
Reduced control plane load and link bandwidth usage
Sub-second failure detection
In-flight timer negotiation
BFD platform-specific advantages:
Stateful restart, SSO and ISSU support
Protocol off-load / distributed implementation – I/O module transmits / receives
BFD packets
Per-link implementations with bundles
BFD Peer Establishment
• No discovery – peer IP provided by client!
• Neighbors continuously negotiate their desired transmit and receive rates
in terms of microseconds.
• The system reporting the slower rate determines the transmission rate.
Timer Negotiation
[Figure: Green and Orange negotiate rates]
Green: Desired Receive rate = 50 ms, Desired Transmit rate = 100 ms
Orange: Desired Receive rate = 60 ms, Desired Transmit rate = 40 ms
Result: Green transmits at 100 ms, Orange transmits at 50 ms
interface <name>
bfd interval <msec> min_rx <msec> multiplier <n>
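The negotiation rule ("the system reporting the slower rate determines the transmission rate") can be checked against the Green / Orange example. In this Python sketch each side transmits at the larger of its own desired transmit interval and the peer's desired receive interval; the function and variable names are made up for illustration.

```python
def negotiated_tx_interval_ms(local_desired_tx_ms: int, peer_desired_rx_ms: int) -> int:
    """The slower (larger) of the two intervals wins."""
    return max(local_desired_tx_ms, peer_desired_rx_ms)


green = {"tx": 100, "rx": 50}   # values from the Green / Orange example
orange = {"tx": 40, "rx": 60}

green_tx = negotiated_tx_interval_ms(green["tx"], orange["rx"])
orange_tx = negotiated_tx_interval_ms(orange["tx"], green["rx"])
print(f"Green transmits every {green_tx} ms, Orange every {orange_tx} ms")
```

Each direction is negotiated independently, which is why the two peers can end up transmitting at different rates.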
BFD Operation Modes
Session established using
asynchronous control packets
Asynchronous mode (no echo):
Control packets sent at negotiated rate
Independent session
Neighbour declared dead if no packet is
received for <interval * multiplier> period
Additionally, if echo is negotiated:
Control packets sent at slow rate
Self-directed echo packets sent at fast
negotiated rate (min Rx interval), used
for failure detection
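The "interval * multiplier" rule translates directly into a detection time. A minimal Python sketch, with a hypothetical function name and illustrative input values:

```python
def bfd_detection_time_ms(interval_ms: int, multiplier: int) -> int:
    """Neighbor is declared dead after <interval * multiplier> without a packet."""
    return interval_ms * multiplier


# Hypothetical example values: a 50 ms negotiated interval and a multiplier of 3
detect_ms = bfd_detection_time_ms(interval_ms=50, multiplier=3)
print(f"detection time: {detect_ms} ms")
```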
[Figure: Async Mode and Async Mode + Echo between Green and Orange; packets carry “green is alive” / “orange is alive”]
BFD – OSPF Interaction Example
[Figure: R1 and R2 with an OSPF peering and a BFD session between them]
OSPF registers with BFD (on R1 and R2)
X - Forwarding plane failure between R1 and R2
X - BFD detects failure between R1 and R2
BFD notifies OSPF (on R1 and R2)
X - OSPF adjacency reset between R1 and R2
BFD Off-load / Distributed Processing
Helps achieve higher BFD scale
SUP-BFD - BFD process running on
Supervisor Engine
Interfaces with LC-BFD processes
Interfaces with BFD clients
LC-BFD – BFD process running on CPU
of each I/O module
Communicates with SUP-BFD process
Generates BFD hellos (echo and async)
Receives BFD hellos from peer (async)
Support for stateful restart, SSO and
ISSU
Nexus 7000 Architecture Example
[Figure: Supervisor Engine running BFD clients (OSPF, IS-IS, HSRP, PIM, BGP, etc.) and SUP-BFD; three I/O modules, each with LC-BFD above the hardware; EOBC and Module Inband paths]
Similar Architectures:
CRS-1
ASR 9000
12000 / XR12000
ASR 1K (from IOS XE 3.6)
7600 with ES+ I/O modules
Layer 3 Fast Failure Detection and Link Bundles
• Scenarios:
1. Layer 2 bundle between 2 SVIs
2. Layer 3 bundle
• Each node uses a hash algorithm to distribute the load across bundle members
• Chances are high that control plane packets are only carried on a single link:
‒ Can’t reliably test all links
‒ Single bundle member malfunction can cause black holes which remain undetected
‒ Rely on Layer 1 or Layer 2 (LACP/PAgP/UDLD/OAM) detection
• Parallel Layer 3 links can be used instead; their load-sharing properties are often similar
• Two approaches for BFD:
1. Single session
2. Per-link sessions
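The reason a single control-plane session cannot test every bundle member can be shown with a toy hash function. The sketch below assumes a CRC32 hash over the flow tuple, which is only a stand-in for whatever hash a real platform uses; `pick_member` and the interface names are hypothetical.

```python
import zlib
from typing import List


def pick_member(members: List[str], src_ip: str, dst_ip: str,
                src_port: int, dst_port: int) -> str:
    """Pick a bundle member from a hash of the flow tuple (illustrative hash only)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return members[zlib.crc32(key) % len(members)]


members = ["Eth1/1", "Eth1/2", "Eth1/3", "Eth1/4"]

# One BFD session keeps the same addresses and ports for every packet,
# so every packet hashes onto the same member link.
links_used = {
    pick_member(members, "10.0.0.1", "10.0.0.2", 49152, 3784)
    for _ in range(1000)
}
print(len(links_used))  # 1
```

However many packets the session sends, the other three members are never exercised, which is the black-hole risk described in the bullets above.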
Challenges
BFD over Bundle Members (BOB)
IPv4 BFD session per bundle member
IPv6 relies on IPv4 session state
Verify the forwarding state of every member link by
establishing a BFD session before it is added to the bundle
Master session on RP consolidates member states
and communicates with clients
Async + echo
Ethernet and POS bundles
IOS XR proprietary, close to proposed standard
CRS / ASR 9000 / XR 12000
interface bundle-ether 1
 bfd
  address-family ipv4
   fast-detect
   minimum-interval 15
   multiplier 3
   destination 10.11.12.13
SW Support
IOS XR 4.0.1 for CRS / ASR 9000
IOS XR 4.1 for XR 12000
BFD Per-link Mode
BFD session per port-channel member
Master session on SUP consolidates member states and communicates
with clients
LACP is required for port-channels
Async only, no echo
Layer 3 port-channel / sub-interface only
NX-OS proprietary
Minimum interval: 50 msec x 3
Nexus 7000
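How a master session might consolidate per-member session states can be sketched as follows. The consolidation rule used here (the bundle is reported up while at least `min_links` member sessions are up) is an assumption made for illustration; the slide does not specify the actual rule, and `consolidate` is a hypothetical function.

```python
from typing import Dict, List, Tuple


def consolidate(member_states: Dict[str, bool], min_links: int = 1) -> Tuple[bool, List[str]]:
    """Return (bundle_up, usable_members) from per-member BFD session states.

    Assumed rule: the bundle is up while at least min_links members are up.
    """
    usable = sorted(name for name, is_up in member_states.items() if is_up)
    return len(usable) >= min_links, usable


bundle_up, usable = consolidate({"Eth1/1": True, "Eth1/2": False, "Eth2/1": True})
print(bundle_up, usable)  # True ['Eth1/1', 'Eth2/1']
```

Clients such as routing protocols would only see the consolidated state, while the failed member is taken out of the forwarding set.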
SW Support
NX-OS 5.0(2a)
BFD Logical Bundles (BLB)
• Single BFD session per L3 destination address
• Internal algorithm to decide which I/O module hosts BFD session
• BFD packet distribution - Tx and Rx packets are polarized on one
bundle link per session
• IPv4 and IPv6 sessions
• Async only
• Replaces BVLAN mode but remains backward compatible
• Verified interoperability with IOS and NX-OS single session modes
• Minimum interval: 50 msec x 3 (depends on linecard)
CRS / ASR 9000
SW Support
IOS XR 4.2.3 (CRS)
IOS XR 4.3 (ASR9K)
BFD Logical Mode
• Single BFD session per L3 destination address
• Internal algorithm to determine which I/O module hosts BFD session
• BFD packet distribution:
‒ Prior to NX-OS 5.2(1) – Tx packets are polarized on one bundle link per session
‒ From NX-OS 5.2(1) – Tx packets are round-robin load-balanced on all bundle links
‒ Rx packets are always polarized on one bundle link per session
• Async + echo
• Verified interoperability with IOS XR BLB mode
• Minimum interval is 250 msec x 3
Nexus 3000 / 7000
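The difference between polarized and round-robin Tx distribution can be illustrated with a short sketch. It only models the order in which member links are chosen for successive BFD packets; the function names and interface names are hypothetical, and which link a real platform polarizes onto is not specified here.

```python
from itertools import cycle, islice
from typing import List


def polarized_tx(members: List[str], packets: int) -> List[str]:
    # Every packet of the session leaves on the same member link
    # (the first one is picked arbitrarily for this illustration).
    return [members[0]] * packets


def round_robin_tx(members: List[str], packets: int) -> List[str]:
    # Successive packets rotate over all member links.
    return list(islice(cycle(members), packets))


members = ["Eth1/1", "Eth1/2", "Eth2/1"]
print(polarized_tx(members, 6))
# ['Eth1/1', 'Eth1/1', 'Eth1/1', 'Eth1/1', 'Eth1/1', 'Eth1/1']
print(round_robin_tx(members, 6))
# ['Eth1/1', 'Eth1/2', 'Eth2/1', 'Eth1/1', 'Eth1/2', 'Eth2/1']
```

Round-robin transmission exercises every member in the Tx direction, but since Rx stays polarized, a single logical session still does not give per-link coverage in both directions.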
SW Support
NX-OS 5.0(2a), 5.0(3)U2(2)
BFD Interoperability with Bundles
Current standards do not address this!
Single session
‒ Easiest to achieve with current standards and
implementations
‒ Verified interoperability between IOS XR BLB
mode, IOS and NX-OS single session mode
Per-link sessions
‒ Most recommended, but solutions are platform
proprietary
‒ IETF draft-mmm-bfd-on-lags-03 will address
interoperability!
BFD and FabricPath / TRILL
Use-case: peer switch path failure detection
Not supported for TRILL / FabricPath yet
Proposed standard:
draft-ietf-trill-rbridge-bfd-07
‒ Does not cover bundle per-link
‒ IS-IS notifies BFD of RBridge IDs
Link OAM could be adopted in future
FP / TRILL OAM in the works for service /
end-to-end failure detection
Scenario 1 – FabricPath as BFD client
BFD and FabricPath / TRILL
TRILL specifies support for shared Ethernet segments with several peers
FabricPath can only peer on point-to-point links
BFD may be needed more for TRILL than for FabricPath, except…
Point-to-Point vs. Shared Ethernet segment
FabricPath Design Perspective for Failure Detection
DCI may require BFD for FabricPath
Point-to-Point Leaf-Spine vs Data Center Interconnect
[Diagram labels: Fabric Path Active DC1, Fabric Path Active DC2, Fat Spine]
BFD and FabricPath / TRILL
• Routing protocol / FHRP peering over FabricPath network
Scenario 2 – BFD client using FabricPath / TRILL as transport
BFD for Static Routes
Next-hop liveness detection
Fail-close solution (remove static route and not reinstate until BFD is up)
Must be configured on both ends
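The fail-close behaviour can be sketched as a filter over the configured static routes: a route is installed only while the BFD session to its tracked next hop is up. This is a conceptual model, not NX-OS code; `installed_routes` and the data layout are assumptions, and the prefix and next hop reuse the addresses from this slide.

```python
from typing import Dict, List, Tuple

StaticRoute = Tuple[str, str]  # (prefix, BFD-tracked next hop)


def installed_routes(routes: List[StaticRoute], bfd_up: Dict[str, bool]) -> List[str]:
    """Fail-close: install a route only if its next hop's BFD session is up.

    A next hop with no session state yet is treated as down.
    """
    return [prefix for prefix, next_hop in routes if bfd_up.get(next_hop, False)]


routes = [("30.0.0.0/24", "10.0.0.1")]

before_session = installed_routes(routes, {})                     # no session yet
session_up = installed_routes(routes, {"10.0.0.1": True})
session_down = installed_routes(routes, {"10.0.0.1": False})

print(before_session)  # []
print(session_up)      # ['30.0.0.0/24']
print(session_down)    # []
```

Treating an unknown session as down is what makes the model fail-close: the route is not reinstated until BFD explicitly reports the next hop as up.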
ip route 30.0.0.0/24 Vlan20 10.0.0.1
ip route static bfd Vlan20 10.0.0.1
ip route 0.0.0.0/0 Vlan10 20.0.0.1
ip route static bfd Vlan10 20.0.0.1
SVI 20
20.0.0.1
SVI 10
10.0.0.1
switch# sh ip route
0.0.0.0/0, ubest/mbest: 1/0
*via 20.0.0.1, Vlan 10, [1/0], static
switch# sh ip route
30.0.0.0/24, ubest/mbest: 1/0
*via 10.0.0.1, Vlan 20, [1/0], static
30.0.0.2
Internet A B
BFD Multi-hop
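• Single-hop BFD control packets are sent with TTL=255, and the receiver discards packets that arrive with any other TTL (RFC 5881)
• If packets traverse a device that decrements TTL, multi-hop BFD is needed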
• Use-case 1: static route or PBR through routed firewalls / NAT
• Use-case 2: eBGP multi-hop
RFC 5883
ip route 30.0.0.0/24 Vlan20 12.0.0.1
ip route static bfd Vlan20 10.0.0.1
ip route 0.0.0.0/0 Vlan10 11.0.0.1
ip route static bfd Vlan10 20.0.0.1
switch# sh ip route
0.0.0.0/0, ubest/mbest: 1/0
*via 20.0.0.1, Vlan 10, [1/0], static
switch# sh ip route
30.0.0.0/24, ubest/mbest: 1/0
*via 10.0.0.1, Vlan 20, [1/0], static
30.0.0.2
SW Support
IOS
IOS XR
11.0.0.1 12.0.0.1 SVI 20
20.0.0.1
SVI 10
10.0.0.1
Internet A B
BFD and Security
Support for SHA-1 (NX-OS / IOS) and MD5 (IOS) authentication
Disable platform hardware security mechanisms for BFD echo to function:
‒ uRPF (per interface): no [ip|ipv6] verify unicast source reachable-via [any|rx]
‒ IDS checks (global): no hardware ip verify address identical
‒ IP redirects (per interface): no ip redirects
Open rules to allow echo packets through the firewall, or enable a loopback as the
source IP (default on IOS XR): bfd echo-interface <a_loop_back_interface>
BFD Best Practices and Recommendations
1. If Layer 3 fast failure detection is needed, use BFD for all protocols
2. If you can't use BFD, check platform-specific support for aggressive protocol
timers
3. Always plan your BFD scale and check it against platform capabilities
(centralized vs distributed architecture, interface and client support locally
and on the peer)
4. Use BFD echo (default on many platforms) whenever possible, check security
5. On Layer 3 port-channels, use per-link mode and prefer that over echo
6. BFD single-hop for BGP – make sure neighbor update source is a directly
connected interface
7. Make sure BFD packets are prioritized appropriately (Marked with IP precedence 6 /
DSCP CS6 / CoS 6, can also be classified by udp 3784+3785)
8. Make sure neighbours support same BFD version (ver 0 / 1)
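Recommendation 7 can be sketched as a simple classifier that recognizes BFD by its UDP destination ports and assigns DSCP CS6 (decimal 48). This is only a model of the classification logic, not a QoS configuration; the function names are hypothetical, and treating all other traffic as DSCP 0 is an assumption made to keep the example small.

```python
BFD_CONTROL_PORT = 3784  # single-hop BFD control packets
BFD_ECHO_PORT = 3785     # BFD echo packets
DSCP_CS6 = 48            # class selector 6, equivalent to IP precedence 6
DSCP_DEFAULT = 0         # assumption: everything else is best effort


def is_bfd(protocol: str, dst_port: int) -> bool:
    return protocol == "udp" and dst_port in (BFD_CONTROL_PORT, BFD_ECHO_PORT)


def dscp_for(protocol: str, dst_port: int) -> int:
    return DSCP_CS6 if is_bfd(protocol, dst_port) else DSCP_DEFAULT


print(dscp_for("udp", 3784))  # 48
print(dscp_for("udp", 3785))  # 48
print(dscp_for("tcp", 3784))  # 0
print(dscp_for("udp", 53))    # 0
```

Matching on the UDP ports is useful on devices that cannot trust the marking applied by the sender.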
Agenda
Overview
Layer 1 Failure Detection
Layer 2 Failure Detection
Layer 3 Failure Detection
Summary
Protocol Comparison
Key Decision Criteria
Comparison of BFD, UDLD and Link OAM:
OSI Layer
‒ BFD: L3
‒ UDLD: L2
‒ Link OAM: L2
Standard
‒ BFD: IETF RFC 5880 / 5881 (with some Cisco enhancements)
‒ UDLD: Cisco proprietary
‒ Link OAM: IEEE 802.3ah (with some Cisco enhancements)
Failures Detected
‒ BFD: Uni-directional soft failures; Bidirectional soft failures
‒ UDLD: Uni-directional soft failures; Bidirectional soft failures; Mis-wiring Detection
‒ Link OAM: Uni-directional soft failures; Bidirectional soft failures; Mis-wiring Detection (IOS XR); Link Degradation
Failure Reaction
‒ BFD: Notify peer and clients; Remove link from bundle (IOS XR, IETF standard in future); BFD dampening (IOS XR)
‒ UDLD: Error-disable (depending on mode)
‒ Link OAM: Notify peer; Error-disable (depending on error type and platform); Error-block; Ethernet Failure Detection (IOS XR)
Bundles and Virtual Interfaces
‒ BFD: Bundle logical, bundle per-link, SVI, sub-interface
‒ UDLD: Single L2 links
‒ Link OAM: Single L2 links
Message Interval and Timeout
‒ BFD: Configurable, exchanged and negotiated; Timeout generally in msec
‒ UDLD: Configurable and exchanged; Timeout generally in 20+ seconds
‒ Link OAM: Configurable, not exchanged; Timeout generally in 2+ seconds
ISSU
‒ BFD: Timer inflation
‒ UDLD: Flush message sent (IOS XR)
‒ Link OAM: No (can be extended in future)
For Your Reference
Summary of Network Scenarios and Recommendations
Classical Ethernet Layer 2
Single p2p link
Bundle
FabricPath / TRILL
Single p2p link
Bundle
Layer 3
Single p2p link
Bundle
SVI on top of Classical Ethernet
SVI on top of FabricPath / TRILL
Summary
Fast Failure Detection is Key to Fast Convergence
Business requirements and SLAs to drive technology and protocol choice
One protocol may be enough – keep it simple!
Evolving field with IETF / IEEE / MEF and Cisco innovations
Design your network to take advantage of best practices
Related Cisco Live London 2013 events
Session-ID Session Name
BRKIPM-2265 Deploying BGP Fast Convergence / BGP PIC
BRKCRS-2041 Highly Available Wide Area Network Design
Related Past Cisco Live events
Session-ID Session Name
TECRST-3190 IP Routing Fast Convergence
BRKNMS-2202 Ethernet OAM – Technical Overview and Deployment Scenarios
BRKRST-2032 Highly Available Wide Area Network Design
Call to Action
• Visit the Cisco Campus at the World of Solutions to experience Cisco innovations in action
• Get hands-on experience attending one of the Walk-in Labs
• Schedule a face-to-face meeting with one of Cisco’s engineers
at the Meet the Engineer center
• Discuss your project’s challenges at the Technical Solutions Clinics