Transcript
FABRICPATH
Why layer 2 in DC?
Typical DC Design
End to End L2
Limitation of Traditional L2
Cisco FabricPath Goal
Why FabricPath?
Control Plane
Key FabricPath control plane elements:
•Routing table – FabricPath IS-IS learns switch IDs (SIDs) and builds routing table
•Multidestination trees – FabricPath IS-IS elects roots and builds multidestination forwarding trees
•Mroute table – IGMP snooping learns group membership at the edge, FabricPath IS-IS floods group-membership LSPs (GM-LSPs) into the fabric
MAC-Based Routing? • NO!
•Routing information consists of Switch IDs
•Forwarding in fabric based on Switch IDs, not MAC addresses
FabricPath Routing Table
•Contains shortest path(s) to each SID, based on link metrics / path cost
•Equal-cost multipath (ECMP) supported on up to 16 next-hop interfaces
ECMP Load-Sharing
•ECMP path chosen based on hash function
•Hash uses SIP/DIP + L4 + VLAN by default
•Use show fabricpath load-balance unicast to determine ECMP path for a given packet
Multidestination Trees
MDT Root Selection
•FabricPath network elects a primary root switch for the first multidestination tree in the topology
•Switch with highest priority value becomes root for the tree
–Tie break: root priority → highest system ID → highest SID
•Primary root determines roots of additional trees and announces them in the Router Capability TLV
–Roots are spread among available switches to balance load
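If a specific switch should win the primary-root election, its priority can be raised. A hedged sketch, assuming the fabricpath domain default configuration mode on the Nexus 7000 (the priority value is illustrative):

```
! On the switch intended to be primary root for tree 1
fabricpath domain default
  root-priority 255
```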
Root? Tree? Is it STP?
•NO! – More like IP multicast routing
•Trees do NOT dictate forwarding path of unicast frames, only multidestination frames
•Multiple trees allow load-sharing for any multidestination frames
•Control plane state further constrains IP multicast forwarding (based on mrouter and receiver activity)
Data Plane
Key FabricPath data plane elements:
•MAC table – Hardware performs MAC lookups at CE/FabricPath edge only
•Switch table – Hardware performs destination SID lookups to forward unicast frames to other switches
•Multidestination table – Hash function selects tree*, multidestination table identifies on which interfaces to flood based on selected tree
FP MAC Table
•Edge switches perform MAC table lookups on ingress frames
•Lookup result identifies output interface or destination FabricPath switch
Encapsulation
SwitchID
•Every FabricPath switch is automatically assigned a Switch ID
–Optionally, network administrator can manually configure SIDs
•FabricPath network automatically detects conflicting SIDs and prevents data path initialization on the violating switch
•Encoded in “Outer MAC addresses” of FabricPath MAC-in-MAC frames
•Enables deterministic numbering schemes, e.g.:
–Spine switches assigned two-digit SIDs
–Leaf switches assigned three-digit SIDs
–VPC+ virtual SIDs assigned four-digit SIDs
–etc.
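A numbering scheme like this is applied with the global switch-ID command; a minimal sketch (the IDs are illustrative, and changing a switch ID is typically disruptive on a live switch):

```
! On a spine switch: two-digit SID
fabricpath switch-id 11

! On a leaf switch: three-digit SID
fabricpath switch-id 101
```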
More about SID
FTAG (Forwarding Tag)
•Unique 10-bit number encoded in the FabricPath header
•Overloaded field that identifies a FabricPath topology or a multidestination tree
•For unicast packets, identifies which FabricPath IS-IS topology to use
•For multidestination packets (broadcast, multicast, unknown unicast), identifies which multidestination tree to use; the FTAG carries the ID of the tree chosen at the FabricPath ingress switch
•DRAP is responsible for keeping FTAGs unique and consistent
Terminology
•Classical Ethernet (CE) – Regular Ethernet with regular flooding, regular STP, etc.
•Leaf Switch – Connects CE domain to FP domain
•Spine Switch – FP backbone switch with all ports in the FP domain only
•FP Core Ports – Links on Leaf up to Spine, or Spine to Spine, i.e. the switchport mode fabricpath links
•CE Edge Ports – Links on Leaf connecting to regular Classical Ethernet domain, i.e. not the switchport mode fabricpath links
Fabricpath Support
Configuration
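A baseline FabricPath configuration on NX-OS looks roughly as follows; a sketch only (interface, VLAN, and switch ID are illustrative):

```
install feature-set fabricpath   ! one-time, in the default VDC on N7K
feature-set fabricpath
fabricpath switch-id 12          ! optional; otherwise auto-assigned by DRAP
vlan 10
  mode fabricpath                ! carry VLAN 10 across the fabric
interface ethernet 1/1
  switchport mode fabricpath    ! FP core port
```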
More in Encapsulation
Outer SA
More in Encapsulation
Outer DA
Conflict Resolution
Fabricpath Tree
Forwarding Tree + VLAN
Root Election /Tree Construction
Other Encapsulation
Reverse Path Forwarding Check
Topologies
•Routing table & Trees (FTAGs) are per topology
•Switch ID is shared across all topologies
•FP interface may belong to several topologies
•N7K: up to 8 topologies support starting in 6.2
•N5K/N6K: 2 topologies supported since 5.2.1; main use is to permit separate L2 pods to use same local vlan set
Config:---------------------------------
FabricPath Software Architecture & Hardware Tables
On the Supervisor Engine:
•FabricPath IS-IS – routing protocol process that forms the core of the FabricPath control plane
•DRAP – Dynamic Resource Allocation Protocol; ensures network-wide unique and consistent Switch IDs and FTAGs
–Resolves switch ID conflicts
•IGMP – provides IGMP snooping support for building the multicast forwarding database
•M2RIB – Multicast Layer 2 RIB; contains the multicast Layer 2 routing information
•U2RIB – Unicast Layer 2 RIB; contains the “best” unicast Layer 2 routing information
•L2FM – Layer 2 forwarding manager; controls the MAC address table
•MFDM – Multicast forwarding distribution manager; connects platform-independent control-plane processes and platform-specific processes on I/O modules
On the Linecards:
•U2FIB – Unicast Layer 2 FIB; manages the hardware unicast routing table
•MTM – MAC Table Manager; manages the hardware MAC address table
•M2FIB – Multicast Layer 2 FIB; manages the hardware multicast routing table
FabricPath: Forwarding Tables
FabricPath uses 3 tables to forward frames:
•MAC address table – VLAN, MAC address, port (local or remote), FTAG (for non-unicast)
•Switch-ID table – remote switch ID, local next-hop interfaces (up to 16)
•Multidestination tree table – per tree: remote switch ID, local next-hop/RPF interface
–Tree #1 (broadcast, unknown unicast, IP multicast)
–Tree #2 (IP multicast)
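These tables can be inspected from the CLI; a sketch of commonly used verification commands (output omitted):

```
show fabricpath switch-id        ! switch IDs known in the fabric
show fabricpath route            ! switch-ID (unicast) routing table
show mac address-table dynamic   ! MAC table, including remote entries
```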
Forwarding: unicast CE->FP
Unicast: Known Destination MAC
Forwarding: broadcast/multicast CE->FP
Multidestination (broadcast, multicast, unicast flood)
Forwarding: FP->FP or FP->CE
• Multicast lookups are done using VLAN, FTAG, and ODA (each multicast MAC appears twice)
• SubSwitchID lookups are omitted here
• Remember about special LIDs (Sup, Flood, …)
• FF frames are forwarded out of CE ports only when DA is locally learned
Load-balancing
•Symmetric: the idea is to make A->B and B->A flows take the same path by sorting addresses before feeding them to the hash
•Rotate: polarization avoidance; the hash result is rotated by a specified number of bytes, derived from the unique system MAC
Reducing impact of forwarding loops
•Transient loops might occur during convergence (as with L3 routing)
•To contain the impact of these loops, FabricPath uses a TTL. Starting in 6.2(2), the initial TTL can be set via fabricpath [multicast | unicast] ttl
•For Multidestination Trees Reverse Path Forwarding check performed on source switch ID
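The TTL knob above can be sketched as follows (values illustrative; requires NX-OS 6.2(2) or later):

```
fabricpath unicast ttl 16
fabricpath multicast ttl 16
```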
MAC Address Learning
•Learning MAC addresses is not required in the FabricPath core, as switching is based on Switch ID
•FP Edge switches learn local MAC addresses (behind edge ports) conventionally
•FP Edge devices learn remote addresses (behind core-facing ports) using conversational learning
–For packets arriving from FP, the source MAC (not the outer SA!) is learned only when the destination MAC of the frame is already known on any edge port of this switch
•No learning from broadcasts (though existing entries will be updated)
•Normal Learning from multicasts (example: HSRP address)
Conversational MAC Address Learning
FabricPath Multicast Control Plane
•IGMP/IGMP snooping tracks connected hosts’ and routers’ interest in receiving multicast
•IS-IS distributes information from IGMP snooping to other FP nodes using GM-LSPs. Intermediate nodes flood GM-LSPs
•A pruned subtree is created for each group (+ flood, OMF) per VLAN per FTAG
STP & FabricPath
• No STP inside FP network
• BPDUs do not traverse FP network (dropped at FP edge, with the exception of TCNs)
• FP network pretends to be one switch from STP point of view: all FP edge switches send BPDUs with the same Bridge ID c84c.75fa.60xx (xx is domain ID in hex, default 00)
• Before FP ports are up, switch will use its own Bridge ID (as STP without FP would do)
• Ports inside FP cannot be blocked; FP edge switches always want the STP designated role, and if a superior BPDU is received, that port will be blocked as L2GW inconsistent
STP, FabricPath & TCNs
• When CE STP domains are connected to multiple FP switches, STP TCN handling might be needed to maintain accuracy of MAC address tables inside CE
• Example: if link CE1-CE2 goes down, link CE2-CE3 will become forwarding. Now, to reach MAC B, switches inside FP need to send traffic to S5 instead of S4
• To achieve this, FP switches receiving a TCN from CE propagate it to all FP switches in the network (via IS-IS)
• Each FP switch flushes all remote MAC addresses learned from switches in the same STP domain as the domain originating the TCN
• In addition, if an FP switch is also part of the same STP domain, it propagates the TCN to the CE domain
• TCNs are not propagated to CE in domain 0 (the default domain)
Control Plane Protection
•N7K, N6K, and N5K all recognize and protect FP IS-IS traffic at the CoPP level
•COPP needs to be updated when deploying FabricPath; standard profiles are FP-aware as of 5.2(1)
•In case of complex CE-side STP topologies (with blocking ports), usual STP safeguards are recommended (Bridge Assurance & Dispute / UDLD)
•On N7K-F1 cards: rate-limiters allow up to 4500 PPS worth of control plane FabricPath packets
VPC+: Why, What and How
•Goal: provide redundant, active-active L2 links to separate FP switches with active-active HSRP
•Challenge: depending on the path the A->B packet takes, switch S3 will learn MAC A behind S1 or S2 (or the MAC will keep moving)
•Solution: introduce Emulated Switch S100 to represent devices behind VPCs: MAC A will appear behind S100 in S3’s MAC address table. The HSRP MAC is advertised with the emulated switch as a source, taking advantage of VPC+ multipathing
VPC VPC+
•To enable VPC+, an Emulated Switch ID must be configured in the VPC domain on both peers (must be the same on both peers and globally unique). The ES represents ALL VPC+ channels of the domain
•Peer-link and VPC+ ports must be FabricPath-capable
•Peer-link is an FP interface (no STP, only FP VLANs are carried, and the VPC check no longer applies). VPC+ channels are CE
•VPC+ domain must be the root for CE STP, otherwise VPC+ channels will be blocked as L2GW inconsistent
•FP switches use the same STP bridge ID, so peer-switch is implicit
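A minimal VPC+ sketch, assuming the standard vpc domain sub-command (domain number and emulated switch ID 100 are illustrative; the ES ID must match on both peers and be globally unique):

```
vpc domain 10
  fabricpath switch-id 100       ! emulated switch ID; enables VPC+
interface port-channel 1
  vpc peer-link
  switchport mode fabricpath    ! peer-link is an FP interface
```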
VPC+: Prevention of Duplicate Packets
•How is packet received from VPC+ and flooded on S1 prevented from being flooded on S2 to same VPC+ again?
•N7K-F1 linecards: each VPC+ will have its own sub-switch ID. MAC addresses will be learned behind <es_id>.<subsw_id>.<lid>, for example 100.11.65535 (emulated switch 100, sub-switch 11, LID 65535). S2 will recognize the ES + sub-switch tuple as its own port and will not flood the frame back to the VPC
•N7K-F2, N7K-F3 linecards & N5K, N6K: by default, same as above; with ‘fabricpath multicast load-balance’, each VPC+ peer forwards only for one FTAG, and traffic coming from the other peer carries a different FTAG. For example, a flooded packet coming from S1 will have FTAG1, but S2 will only flood FTAG2 packets out of the VPC
VPC+ Failover
•VPC+ member link goes down
–Traffic diverted over Peer-Link
•Peer-Link goes down (but Peer-Keepalive up)
–Primary: No action
–Secondary: Bring down VPC+ channels
–Stop advertising reachability to Emulated Switch
•Dual active is much less likely than with normal VPC: if Peer-Link and Peer-Keepalive go down, but peer is reachable via FP – secondary will not become primary
FabricPath: What command comes from where
MAC
Questions?