NSX-T Deep Dive: Layer 3 Routing
Amit Aneja
Senior Technical Product Manager, VMware
#vmworld #CNET1069BU
SESSION ID
VMworld 2019 Content: Not for publication or distribution
©2019 VMware, Inc.
Disclaimer
This presentation may contain product features or functionality that are currently under development.
This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new features/functionality/technology discussed or presented, have not been determined.
The information in this presentation is for informational purposes only and may not be incorporated into any contract. There is no commitment or obligation to deliver any items presented herein.
Agenda
NSX-T Data Center Vision & Architecture
NSX-T Logical Routing
Terminology
Packet Flows and N/S Connectivity Options
Multi-Tier Routing
Routing Features
High Availability/Resiliency
Deployment Topologies
Summary and Q&A
NSX Evolution
[Diagram labels: ESX, vSphere, Branch, DC, Edge/IoT, Public Cloud, Private Cloud.]
Virtual Cloud Network
Tied Together, Everywhere.
[Diagram labels: Branch, DC, Edge/IoT, Telco/NFV, Public Cloud, Private Cloud; Containers | Virtual Machines | Bare Metal; vSphere.]
vRNI: CLEAR VISIBILITY
NSX Intelligence: DEEP INSIGHT
The NSX-T Platform
Networking | Security | Visibility | Automation
• Any hypervisor: ESX, KVM
• Heterogeneous endpoints: Container, VM, Bare-Metal
• Multiple Clouds: On-Premise, Hybrid, Public (AWS, Azure, IBM Cloud, VMC on AWS, VMware Cloud Partner Destinations)
NSX-T Networking and Security Services
• Switching
• Routing
• Firewalling
• VPN
• Load Balancing
• NAT
• DHCP
• MetaData Proxy
• DNS Forwarder
• Connectivity to physical
Sessions
• NSX-T Deep Dive: Logical Switching [CNET1511BU]
• NSX-T Deep Dive: Load Balancing [CNET1356BU]
• NSX-T Deep Dive: Connecting Clouds and Data Centers via NSX-T VPN [CNET2841BU]
• What's New with NSX-T Micro-Segmentation [SAI2565BU]
Architecture
NSX-T Architecture
Management Plane: NSX Manager
• UI/API entry point, Store desired configuration
• Interact with other management components: Cloud Service Manager, NSX Container Plugin, vCenter(s)
• CMP, Automation (Ansible, Terraform, Python, Go etc.)
Control Plane: NSX Controller
• Maintain and propagate dynamic state within the system
• Disseminates topology information reported by the data plane elements
NSX Manager Appliance
• Cluster of 3 VMs (scale out + redundancy)
Data Plane: Transport Nodes
• Host workloads (VMs, containers) and services
• Switch data plane traffic
[Diagram labels: Private Cloud with ESXi host (N-VDS), KVM host (N-VDS), NSX Edge, Bare Metal Server (NSX); Public Cloud with Linux VM (NSX), Windows VM (NSX), NSX Cloud GW, NAT; VMware Cloud on AWS; VMs, Containers.]
NSX-T Logical Routing
Introducing the Segment
The Overlay Model
• Segments are instantiated on the hypervisors
• They are extended between hypervisors by IP tunnels using Geneve encapsulation
• NSX maintains a table locating the position of the virtual elements in the physical network
[Diagram: six hosts (host1 to host6), each with an N-VDS, in two racks (rack1, rack2); TEPs A.1 to A.3 in Subnet A and TEPs B.1 to B.3 in Subnet B. Logical View: VM1 (IP N.1) and VM2 (IP N.2) attached to one Segment.]
VM (vnic)    Location
Mac VM1      TEP A.1
Mac VM2      TEP A.3
TEP: Tunnel End Point
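The location table on this slide can be read as a simple lookup from a vNIC's MAC address to the TEP behind which it lives. The following is a minimal sketch of that idea only; the dictionary, the function names and the handling of unknown MACs are assumptions made for illustration, not NSX-T internals.

```python
from typing import Optional

# Hypothetical in-memory copy of the slide's "VM (vnic) -> Location" table.
MAC_TO_TEP = {
    "Mac VM1": "TEP A.1",
    "Mac VM2": "TEP A.3",
}


def locate(dst_mac: str) -> Optional[str]:
    """Return the TEP behind which the destination vNIC lives, if known."""
    return MAC_TO_TEP.get(dst_mac)


def forward(src_tep: str, dst_mac: str) -> str:
    """Describe how a frame leaving src_tep would be handled in this sketch."""
    dst_tep = locate(dst_mac)
    if dst_tep is None:
        # How unknown destinations are resolved is outside this sketch.
        return "unknown destination"
    if dst_tep == src_tep:
        return "deliver locally"
    return f"Geneve tunnel {src_tep} -> {dst_tep}"


print(forward("TEP A.1", "Mac VM2"))
```

A frame from VM1 to VM2 is tunnelled between the two TEPs, while a frame whose destination sits behind the same TEP never leaves the host.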
Introducing the Segment
The Overlay Model
Only requirements on the physical infrastructure:
• IP connectivity
• 1700 bytes MTU (minimum, jumbo frame recommended)
[Diagram: six hosts (host1 to host6), each with an N-VDS, in two racks (rack1, rack2); TEPs A.1 to A.3 in Subnet A and TEPs B.1 to B.3 in Subnet B. Logical View: VM1 (IP N.1) and VM2 (IP N.2) attached to one Segment.]
VM (vnic)    Location
Mac VM1      TEP A.1
Mac VM2      TEP B.3
TEP: Tunnel End Point
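The MTU requirement follows from encapsulation overhead. As a rough worked example, the sketch below adds the standard header sizes for an IPv4 underlay to a VM packet; the function names are hypothetical, an untagged inner frame and an outer IPv4 header without IP options are assumed, and the number of Geneve option bytes NSX-T actually uses is not stated in the slides.

```python
# Header sizes in bytes, as fixed by the respective protocol specifications.
INNER_ETHERNET = 14  # inner Ethernet header carried inside the tunnel (untagged)
GENEVE_BASE = 8      # fixed Geneve header; variable-length options come on top
UDP = 8
OUTER_IPV4 = 20      # outer IPv4 header without IP options


def underlay_mtu(vm_mtu: int, geneve_options: int = 0) -> int:
    """IP MTU the physical network needs to carry one encapsulated packet."""
    return vm_mtu + INNER_ETHERNET + GENEVE_BASE + geneve_options + UDP + OUTER_IPV4


def option_headroom(physical_mtu: int, vm_mtu: int) -> int:
    """Bytes left over for Geneve options at a given physical MTU."""
    return physical_mtu - underlay_mtu(vm_mtu)


print(underlay_mtu(1500))           # overhead with no Geneve options
print(option_headroom(1700, 1500))  # headroom at the slide's 1700-byte minimum
```

With a 1500-byte VM MTU the encapsulated packet needs 1550 bytes before any Geneve options, so the 1700-byte minimum from the slide leaves 150 bytes of headroom under these assumptions.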
Terminology: Gateway aka Logical Router
Gateway/Logical Router
• Provides E-W routing between different L2 segments or logical switches (Overlay or VLAN).
• Peers with the physical infrastructure for N-S routing.
• Can provide network services like Network Address Translation (NAT), Load Balancing, Perimeter Firewall, VPN etc.
[Diagram: a Gateway with an External or Uplink Interface toward the Physical Router; Linked Segments or Downlinks to two overlay segments with subnets 10.1.1.0/24 and 10.2.2.0/24 (gateway addresses 10.1.1.1/24 and 10.2.2.1/24); a Service Interface 10.3.3.1/24 on a VLAN Segment 10.3.3.0/24 with a Baremetal server.]
Terminology: Gateway Components
Distributed Router (DR) and Services Router (SR)
Distributed Router (DR)
• Runs locally in the transport nodes participating in the NSX fabric.
• Typically runs as a kernel module in the hypervisor.
• Provides distributed E-W routing.
• Traffic between different subnets on the same hypervisor doesn't leave the hypervisor.
Services Router (SR)
• Responsible for providing on/off ramp gateway services including N/S routing.
• Provides centralized services like NAT, BGP, LB, Edge Firewall, Connectivity to the physical.
• The SR is instantiated as a service on an appliance called the Edge Node.
Logical Routing
Distributed Routing with DR
[Diagram: a Tier-0 Gateway DR instantiated on both ESXi-1 and ESXi-2, with gateway interfaces 10.1.1.1 and 10.2.2.1 on each host; segments AA-Web-Seg and AA-App-Seg; VMs 10.1.1.10, 10.1.1.20, 10.2.2.10 and 10.2.2.20.]
Logical Routing
Centralized Routing with Services Router (SR)
• Whenever a service which cannot be distributed is enabled on a Gateway, an SR or Services Component is instantiated.
• SR is instantiated on an appliance called the Edge Node.
• SR is instantiated for the following services: Connectivity to physical, Routing, Firewalling, VPN, Load Balancing, NAT, DHCP, MetaData Proxy, DNS Forwarder
NSX-T Terminology: Edge Node
Introducing the "Edge Node"
Edge nodes are appliances with pools of capacity for hosting any services which are not distributed.
• Available in two form factors: Baremetal Edge and VM form factor
• Sizing choice for VM form factor: 3 sizes available (Small, Medium, Large)
• Leverages DPDK technology for fast packet processing.
• Edge Node is also configured as an NSX-T Transport Node.
Design Considerations
Baremetal Edge vs VM form factor Edge

Baremetal Edge                                                  VM form factor Edge
Higher Throughput                                               Throughput (~9G)
Low convergence (Sub-second)                                    Convergence (3+ seconds)
Specific NICs required (Intel 82599, Intel X540, Intel XL710)   No NIC requirements (VMXNET3 driver required)
Rack & stack                                                    Flexible deployment
Higher Services Scale                                           Services scale varies by the form factor
Logical Routing Topology
Spine WAN
Compute Hypervisors (vSphere / KVM)
Infrastructure Clusters: Edge Nodes, Management Nodes
Leaf
Edge Node hosting SR
DR on every hypervisor (in kernel)
Topology View: Putting it all together
DR/SR Interaction
NSX user configuration
• Create External interface
What happened behind the scenes?
• NSX Management plane auto-plumbs this link (internal VNI) and routing between DR and SR
[Diagram: Web1 (10.1.1.10/24) on the Web Segment of an ESXi host, whose DR holds the gateway address 10.1.1.1. Across the Transport Network, the Edge Node hosts the Tier-0 Gateway DR and SR, joined by the Transit Segment (169.254.0.1 on the DR, 169.254.0.2 on the SR). The External Interface connects the Tier-0 Gateway to the Physical Router.]
Logical Routing
Packet Walk (South-North traffic)
[Diagram: Web1 (10.1.1.10, MAC1) on the WEB Segment of an ESXi host (TEP 10.10.10.10) with a local DR. Across the Transport Network, the Edge Node (TEP 30.30.30.30) hosts the DR and the SR (169.254.0.2) on the Transit Segment. The External Segment carries 192.168.240.1 and 192.168.240.3 toward the Physical Router; destination network 192.168.100.0/24.]

DR Routing table
Network         Gateway
0.0.0.0/0       169.254.0.2
10.1.1.0/24     0.0.0.0

DR ARP Table
Network         Mac
169.254.0.2     02:50:56:56:53:00

Logical Switch MAC Table
Inner MAC            Outer IP
02:50:56:56:53:00    30.30.30.30

SR Routing table
Type   Network        Next hop
t0s    0.0.0.0/0      192.168.240.1
t0c    10.1.1.0/24    0.0.0.0

Packet on the Transport Network:
GENEVE    Src=10.10.10.10  Dst=30.30.30.30
Payload   Src=10.1.1.10    Dst=192.168.100.1
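The south-north walk chains three lookups on the ESXi host: a longest-prefix match in the DR routing table, an ARP lookup for the next hop, and a MAC table lookup that yields the remote TEP for Geneve encapsulation. The sketch below replays that chain with the values from the slide; the function and variable names are hypothetical, and the code illustrates the lookup order only, not the NSX-T datapath.

```python
import ipaddress

# Tables copied from the slide (ESXi host side).
DR_ROUTES = {"0.0.0.0/0": "169.254.0.2", "10.1.1.0/24": "0.0.0.0"}
DR_ARP = {"169.254.0.2": "02:50:56:56:53:00"}
SEGMENT_MACS = {"02:50:56:56:53:00": "30.30.30.30"}  # inner MAC -> remote TEP
LOCAL_TEP = "10.10.10.10"


def longest_prefix_match(routes, dst):
    """Return the gateway of the most specific route that contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = [ipaddress.ip_network(p) for p in routes
               if addr in ipaddress.ip_network(p)]
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[str(best)]


def walk_south_north(dst_ip):
    """Resolve next hop, inner destination MAC and Geneve endpoints."""
    gateway = longest_prefix_match(DR_ROUTES, dst_ip)
    if gateway == "0.0.0.0":
        return None  # directly connected segment; not modelled in this sketch
    next_hop_mac = DR_ARP[gateway]
    remote_tep = SEGMENT_MACS[next_hop_mac]
    return {
        "next_hop": gateway,
        "inner_dst_mac": next_hop_mac,
        "geneve_src": LOCAL_TEP,
        "geneve_dst": remote_tep,
    }


print(walk_south_north("192.168.100.1"))
```

For the slide's destination 192.168.100.1 only the default route matches, so the packet is sent to the SR at 169.254.0.2 and tunnelled from TEP 10.10.10.10 to TEP 30.30.30.30.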
Logical Routing
Packet Walk (North-South traffic)
[Diagram: Web1 (10.1.1.10, MAC1) on the WEB Segment of an ESXi host (TEP 10.10.10.10) with a local DR. Across the Transport Network, the Edge Node (TEP 30.30.30.30) hosts the DR and the SR on the Transit Segment. The External Segment connects to the Physical Router (192.168.240.1); source network 192.168.100.0/24.]

SR Routing table
Type   Network        Next hop
t0s    0.0.0.0/0      192.168.240.1
t0c    10.1.1.0/24    0.0.0.0

ARP Table
IP  : 10.1.1.10
MAC : 00:50:56:b7:2c:79

MAC Table of Web-Segment
MAC   : 00:50:56:b7:2c:79
LOCAL : 30.30.30.30
REMOTE: 10.10.10.10
ENCAP : GENEVE

Packet on the Transport Network:
GENEVE    Src=30.30.30.30    Dst=10.10.10.10
Payload   Src=192.168.100.1  Dst=10.1.1.10
Logical Routing ECMP
[Diagram, physical and logical topology: a Tier-0 Gateway on Edge Node 1 (EN1) with two external interfaces, External 1 and External 2, toward the Physical Router, using BGP/Static with BFD and ECMP. The DR (169.254.0.1/25) on the Compute Hypervisor and on EN1 reaches the SR (169.254.0.2/25) over the Transit Segment.]
Logical Routing ECMP
[Diagram, physical and logical topology: the Tier-0 Gateway spans Edge Node 1 (EN1) and Edge Node 2 (EN2). The SR on EN1 uses External 1 toward Physical Router 1; the SR on EN2 uses External 2 toward Physical Router 2. SR addresses on the Transit Segment are 169.254.0.2/25 and 169.254.0.3/25. The DR (169.254.0.1/25) is present on the Compute Hypervisor and on both Edge Nodes.]
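With two SRs on the transit segment, the DR has two equal-cost next hops. A common way to implement ECMP is to hash flow fields and use the result to pick one next hop, so that packets of a single flow stay on one path. The sketch below shows that general technique; the choice of hash inputs and of SHA-256 is an assumption for illustration, since the slides do not say which hash NSX-T uses.

```python
import hashlib

# SR addresses on the transit segment, as shown on the slide.
SR_NEXT_HOPS = ["169.254.0.2", "169.254.0.3"]


def pick_next_hop(src_ip, dst_ip, next_hops=SR_NEXT_HOPS):
    """Map a flow to one SR; the same flow always maps to the same SR."""
    if not next_hops:
        raise ValueError("no active SR available")
    # Stand-in hash: any deterministic function of the flow fields would do.
    digest = hashlib.sha256(f"{src_ip}->{dst_ip}".encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


print(pick_next_hop("10.1.1.10", "192.168.100.1"))
```

If one SR loses its active external interfaces, passing only the remaining address as `next_hops` sends every flow to the surviving SR.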
Logical Routing Inter-SR Routing
[Diagram, physical and logical topology: the Tier-0 Gateway spans Edge Node 1 (EN1) and Edge Node 2 (EN2), with one SR per Edge Node (169.254.0.2/25 and 169.254.0.3/25 on the Transit Segment) and the DR (169.254.0.1/25) on the Compute Hypervisor and both Edge Nodes. The two SRs peer with each other over IBGP. Physical Router 1 and Physical Router 2 provide reachability to WAN (0.0.0.0/0) and Corporate (192.168.100.0/24).]
Multi-Tier Routing
Logical Routing- Multi Tier Topology
Tier-0 Gateway
• Role- Connects to physical infra
• Manual Management
Tier-1 Gateway
• Role- Per tenant first hop router
• Cloud Management Platform (CMP) driven Management
Benefit
• Tenant Isolation
• Separate control for Infra and Tenant admin
• Eliminates dependency on physical infrastructure when a new tenant is provisioned
[Diagram: Physical Routers, a Tier-0 Gateway, and one Tier-1 Gateway each for Tenant-1 and Tenant-2.]
Logical Routing- Multi Tier Topology
Route Advertisement and Route Redistribution- Configuration
[Diagram: Physical Router, Tier-0 Gateway, Tier-1 Gateway, with Web Segment and App Segment (172.16.10.0/24 and 172.16.20.0/24).]
Logical Routing- Multi Tier Topology
Route Advertisement and Route Redistribution- Auto Plumbing
• Tier-1 connected routes (t1c) 172.16.10.0/24 & 172.16.20.0/24 are advertised to Tier-0.
• A default route with next hop IP 100.64.224.0/31 is auto-installed on Tier-1 as soon as it is connected to Tier-0.
• 172.16.10.0/24 & 172.16.20.0/24 are seen as Tier-1 Connected (t1c) routes.
• Tier-0 gateway redistributes Tier-0 Connected & Tier-1 Connected in BGP to provide N-S connectivity to all Tier-0 and Tier-1 subnets.
[Diagram: Physical Router, Tier-0 Gateway, Tier-1 Gateway, with Web Segment and App Segment (172.16.10.0/24 and 172.16.20.0/24); router link addresses 100.64.224.0/31 and 100.64.224.1/31.]
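The auto-plumbing described on this slide can be pictured as three small tables: the Tier-1 table with its connected segments and default route, the Tier-0 table where those prefixes show up as t1c routes, and the set of prefixes Tier-0 redistributes into BGP. The sketch below builds them from the slide's addresses. The data structures and function name are hypothetical, and using 100.64.224.1 as the Tier-0 next hop for t1c routes is an inference from the router link addressing, not something the slide states.

```python
import ipaddress

# Router link between Tier-0 and Tier-1, as shown on the slide.
router_link = ipaddress.ip_network("100.64.224.0/31")
tier0_ip, tier1_ip = [str(addr) for addr in router_link]  # .0 and .1

tier1_connected = ["172.16.10.0/24", "172.16.20.0/24"]

# Tier-1: connected segments plus the auto-installed default route to Tier-0.
tier1_table = {prefix: "connected" for prefix in tier1_connected}
tier1_table["0.0.0.0/0"] = tier0_ip

# Tier-0: advertised prefixes appear as t1c routes (next hop is an assumption).
tier0_table = {prefix: ("t1c", tier1_ip) for prefix in tier1_connected}


def bgp_advertisements(table, redistribute=("t0c", "t1c")):
    """Prefixes Tier-0 would announce for the redistributed route types."""
    return sorted(p for p, (rtype, _) in table.items() if rtype in redistribute)


print(tier1_table["0.0.0.0/0"])
print(bgp_advertisements(tier0_table))
```

Dropping `t1c` from `redistribute` leaves the tenant subnets unannounced, which is why the slide calls out redistribution of both Tier-0 Connected and Tier-1 Connected routes.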
Logical Routing- Multi Tier Topology
Multi Tier Distributed Routing
• Tier-0 and Tier-1 routers are also instantiated on the hypervisors in order to prevent hair-pinning
• Fully distributed architecture: as much routing as possible is performed upfront at the source
[Diagram: ESXi-1 and ESXi-2 each run a Tier-0 DR plus Tier-1 DRs for Tenant 1 and Tenant 2. Router link addresses on each host: 100.64.224.0/31 with 100.64.224.1/31, and 100.64.224.2/31 with 100.64.224.3/31.]
Routing Features
Logical Routing
BGP Feature Set
• 4-byte ASN (asdot, asdot+, asplain)
• EBGP/IBGP with BFD
• EBGP Multi-Hop Support
• Route aggregation & redistribution
• Support for IPv4 and IPv6 AF
• Route-map match using prefix list or BGP community list.
• BGP Graceful restart (Full & helper)
• BGP Community support (Standard, Extended and Large)
• Outbound and Inbound route influencing using Weight, Local-pref, MED, AS path Prepend etc.
• Multipath ASN ECMP
• BGP Allow AS in
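The three 4-byte ASN notations in the first bullet differ only in how the same 32-bit number is written: asplain is the plain decimal value, asdot+ always writes high-16-bits.low-16-bits, and asdot uses the dotted form only for values above 65535. A small conversion sketch follows; the function names are hypothetical.

```python
MAX_ASN = 0xFFFFFFFF  # largest 4-byte AS number


def to_asdot_plus(asn: int) -> str:
    """asdot+: always written as <high 16 bits>.<low 16 bits>."""
    if not 0 <= asn <= MAX_ASN:
        raise ValueError("ASN out of 4-byte range")
    return f"{asn >> 16}.{asn & 0xFFFF}"


def to_asdot(asn: int) -> str:
    """asdot: plain decimal for 2-byte ASNs, dotted form otherwise."""
    if not 0 <= asn <= MAX_ASN:
        raise ValueError("ASN out of 4-byte range")
    return str(asn) if asn <= 0xFFFF else to_asdot_plus(asn)


def to_asplain(text: str) -> int:
    """Parse asplain, asdot or asdot+ text into the integer AS number."""
    if "." in text:
        high, low = text.split(".")
        return (int(high) << 16) | int(low)
    return int(text)


print(to_asdot(65546), to_asdot_plus(65000), to_asplain("1.10"))
```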
[Diagram: Tier-0 Gateway on EN1 and EN2 peering with Physical Router-1 and Physical Router-2. Labels: BGP AS 65000, BGP AS 65002, Remote CE in BGP AS 65003, ECMP across Different ASN, BGP Allow AS in.]
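BGP normally rejects a route whose AS path already contains the receiver's own AS number, as loop prevention. Allow AS in relaxes that check, which matters when two sites share an AS number and learn each other's prefixes through a provider. The sketch below shows the general rule; the function name, the occurrence-count parameter and the example path are assumptions for illustration, not NSX-T configuration semantics.

```python
def accept_route(as_path, local_as, allowas_in=0):
    """Accept unless local_as appears in as_path more than allowas_in times.

    allowas_in=0 models the default loop check; a positive value models
    a relaxed check (hypothetical parameter for this sketch).
    """
    return as_path.count(local_as) <= allowas_in


# Illustrative path: a prefix originated by another site that reuses AS 65000.
path = [65002, 65003, 65000]
print(accept_route(path, local_as=65000))                # default loop check
print(accept_route(path, local_as=65000, allowas_in=1))  # relaxed check
```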
Logical Routing
IPv6 Routing Feature Set
Dual stack support on all interfaces
Tier-0 Gateway supports:
• Static routes with IPv6 Next-hop
• MP-eBGP or iBGP with IPv4 and IPv6 AF towards physical
• ECMP supported using static routes and eBGP
• Multi-hop eBGP
• Route Aggregation & Redistribution
• IPv6 Prefix List and route map
Tier-1 Gateway supports:
• Static routes with IPv6 Next-hop
[Diagram: Physical Routers, Tier-0 Gateway, and Tier-1 Gateways for Tenant-1 and Tenant-2. Labels: ECMP, MP-eBGP w/ IPv4 and IPv6 AF; Auto plumbed Routing like IPv4; Dual Stack (IPv4 & IPv6) (Static, DHCPv6 Relay, SLAAC).]
High Availability
High Availability
Active/Active and Active/Standby
Edge Cluster
• NSX-T Edge nodes are pooled in an edge cluster to provide scale out and High Availability for services.
Gateway in Active/Active HA mode
• Scale out HA
• ECMP
• Stateless Services (Reflexive NAT)
Gateway in Active/Standby HA mode
Stateful Services
• SNAT/DNAT
• Load Balancer
• Edge Firewall
• DHCP server
• VPN
• Bridging
[Diagram: an Edge Cluster of Edge Node1 and Edge Node2 hosting Tier-0 and Tier-1 gateways.]
High Availability
Active/Active and Active/Standby
Active/Active HA mode
• Supported on Tier-0 Gateway only
• Different IPs both northbound and southbound
• DR does ECMP to all SRs with Active external interfaces
[Diagram: an Edge Cluster with an Active Tier-0 SR on Edge Node 1 (EN1) and an Active Tier-0 SR on Edge Node 2 (EN2). SR addresses on the Transit Segment are 169.254.0.2/25 and 169.254.0.3/25; the DR (169.254.0.1/25) is present on the Compute Hypervisor and both Edge Nodes. Northbound addresses shown: 192.168.240.1 and 192.168.250.1.]
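The HA slides so far imply a simple decision rule: Active/Active is available only on Tier-0, and the services listed under Active/Standby mode require Active/Standby. The sketch below encodes just that rule. The function name and service strings are hypothetical labels taken from the slide text, and a Tier-1 gateway without centralized services (which has no SR) is not modelled.

```python
# Services the slide lists under "Gateway in Active/Standby HA mode".
ACTIVE_STANDBY_SERVICES = {
    "SNAT/DNAT", "Load Balancer", "Edge Firewall",
    "DHCP server", "VPN", "Bridging",
}


def ha_mode(tier: int, services=()) -> str:
    """Return the HA mode implied by the slides for a gateway with an SR."""
    if tier not in (0, 1):
        raise ValueError("tier must be 0 or 1")
    if tier == 1:
        return "Active/Standby"  # Active/Active is Tier-0 only per the slide
    if ACTIVE_STANDBY_SERVICES & set(services):
        return "Active/Standby"
    return "Active/Active"


print(ha_mode(0, ["ECMP", "Reflexive NAT"]))
print(ha_mode(0, ["Load Balancer"]))
```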
High Availability
Active/Standby
Active/Standby HA mode
• Supported on both Tier-0 & Tier-1 Gateway
• Different IPs are used northbound, but the same IP is used southbound (169.254.0.2)
• BGP is established on both SRs, but the standby SR does AS path prepending by default.
• Standby SR is NOT a forwarder
• DR sends traffic to Active SR only.
[Diagram: an Edge Cluster with the Active Tier-0 SR on Edge Node 1 (EN1) and the Standby Tier-0 SR on Edge Node 2 (EN2), both using 169.254.0.2/25 on the Transit Segment; the DR (169.254.0.1/25) on the Compute Hypervisor with Web1 on 10.1.1.0/24. Both SRs run BGP northbound; labels show BGP AS 65002, addresses 192.168.240.1 and 192.168.250.1, and network 192.168.100.0/24.]
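AS path prepending on the standby SR works because BGP best-path selection prefers a shorter AS path when earlier attributes tie, so the physical routers send return traffic toward the active SR. The sketch below models only that one tiebreaker; the AS number, the prepend count and the dictionary layout are assumptions for illustration.

```python
LOCAL_AS = 65000  # hypothetical Tier-0 AS number for this sketch


def best_path(paths):
    """Pick the advertisement with the shortest AS path (only step modelled)."""
    return min(paths, key=lambda p: len(p["as_path"]))


# Same prefix advertised by both SRs; the standby prepends its own AS.
active = {"via": "EN1", "prefix": "10.1.1.0/24", "as_path": [LOCAL_AS]}
standby = {"via": "EN2", "prefix": "10.1.1.0/24", "as_path": [LOCAL_AS] * 3}

print(best_path([standby, active])["via"])
```

After a failover the roles swap: the new active SR advertises without prepending, and the same comparison moves traffic to it.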
High Availability
Failure Triggers
• BFD sessions on Management and Tunnel interfaces of Edge node are down.
• All GENEVE Tunnels down
• Northbound routing state is Down (Applicable to Tier-0 SR only)
[Diagram: an Edge Cluster with the Active Tier-0 SR on EN1 and the Standby Tier-0 SR on EN2. Each Edge Node has eBGP northbound, a TEP on the Tunnel Network, and a connection to the Mgmt Network.]
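The failure triggers can be summarised as a predicate over an Edge Node's health state. The sketch below is one possible reading of the three bullets: the first trigger is interpreted as the BFD sessions on both the management and tunnel interfaces being down, and each trigger on its own causes failover. The class, field and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EdgeHealth:
    mgmt_bfd_up: bool             # BFD session over the management interface
    tunnel_bfd_up: bool           # BFD session over the tunnel interface
    geneve_tunnels_up: int        # number of Geneve tunnels currently up
    northbound_routing_up: bool   # northbound routing state


def should_fail_over(health: EdgeHealth, is_tier0_sr: bool) -> bool:
    """Return True if any failure trigger from the slide is met."""
    if not health.mgmt_bfd_up and not health.tunnel_bfd_up:
        return True
    if health.geneve_tunnels_up == 0:
        return True
    if is_tier0_sr and not health.northbound_routing_up:
        return True
    return False


healthy = EdgeHealth(True, True, 4, True)
print(should_fail_over(healthy, is_tier0_sr=True))
```

The northbound routing check is skipped for a Tier-1 SR, matching the slide's note that this trigger applies to the Tier-0 SR only.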
Topologies
Topologies
High Throughput Enterprise Topology
[Diagram: Segment 1, Segment 2, Segment 3 ... Segment N attached to a Tier-0 Gateway; 8-Way ECMP with BGP & BFD toward the Physical Network; L2 Bridging to Bare Metal.]
Topologies
Multi-tenant topology
[Diagram: Tenant-1, Tenant-2 ... Tenant-n, each with a Tier-1 Gateway and segments (Segment 1, Segment 2, Segment 3 ... Segment N), provisioned programmatically by Cloud management APIs; the Tier-1 Gateways connect to a Tier-0 Gateway with 8-Way ECMP with BGP & BFD toward the Physical Network.]
Topologies
Multi-tenant topology with overlapping IP addresses between Tenants
[Diagram: Tenant-1, Tenant-2 ... Tenant-n, each with a Tier-1 Gateway and segments (Segment 1, Segment 2, Segment 3 ... Segment N), provisioned programmatically by Cloud management APIs; the Tier-1 Gateways connect to a Tier-0 Gateway with 8-Way ECMP with BGP & BFD toward the Physical Network.]
Logical Routing
Key Takeaways
• Distributed Routing (DR) optimizes traffic flows for East-West traffic.
• DPDK Enabled Edge nodes provide capacity to host North-South connectivity to physical and centralized services (SRs).
• Rich Routing feature set
• High Availability per Gateway: Active/Active and Active/Standby models available.
• One Networking/Security construct for VMs (multi-hypervisor), Containers & Cloud
Resources
How to get started
LEARN: Design Guides, Demos
TRY: Take a Hands-on Lab
CONNECT: Join VMUG, VMware Communities (VMTN)
nsx.techzone.vmware.com
@VMwareNSX #runNSX