Layer 3 Tunnel Support for Open vSwitch (OVS)
Simon Horman
© 2016 NETRONOME SYSTEMS, INC.
Motivation
Would like to:
▶ Allow rx and tx of packets over tunnels whose payload packet does not have an Ethernet header
▶ Add these features to upstream OVS, then offload them
Importance of Offloading
[Figure: bar chart, "OVS L2/L3 Forwarding to 8 VMs with 64K Flows". Y-axis: packets per second (millions), ticks from 5 to 30. Series: 100, 1000, 10000 and 64000 wildcard rules. Configurations: OVS Kernel Datapath with Netdev to VMs; OVS User-Space Datapath with Netdev to VMs; OVS Offload to iNIC with PMD to VMs (1 CPU core); OVS Offload to iNIC with Netdev to VMs (1 CPU core). The first two configurations carry the labels "12 CPU Cores" and "8 CPU Cores". Plotted values are not recoverable.]
Scope
Datapaths:
▶ Linux Kernel
▶ User-Space with and without DPDK
Encapsulation Protocol:
▶ GRE (non-TEB) (rfc2784):
  • IP protocols over GRE
  • MPLS in GRE (rfc4023)
Background: Tunnel vPorts
Encapsulation and decapsulation are handled by output to, and input from, tunnel vports
Not currently exposed in OpenFlow
Kernel Datapath Tunnel vPorts
Kernel Datapath:
▶ On rx, the tunnel vport decapsulates the packet, passing the result and metadata to the datapath
▶ On tx, the tunnel vport encapsulates the packet based on metadata
User-Space Tunnel vPorts
Native Tunneling:
▶ Tunnel ingress and egress on separate OVS bridge
▶ Internal rules match ingress and egress packets for tunnel vPorts and apply push and pop tunnel actions accordingly
▶ As in the Kernel Datapath, tunnel metadata is:
  • Available in flow key after decapsulation
  • Used as parameters for encapsulation
Layer 3 Tunneling: Basic Concepts
Layer 2 and 3 vPorts
push_eth and pop_eth datapath actions
Datapath Attributes and packet type
Layer 2 and 3 vPorts
Layer 2 or 3 is a mode of vports
Default is layer 2: behavior of all vports until now
pop_eth and push_eth Actions
Add or remove an Ethernet header to/from start of packet
Packets with a VLAN not currently permitted
MPLS is treated as L2.5 and left alone
Not currently exposed to OpenFlow:
▶ Automatically included in actions of datapath flow
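The effect of the two actions on packet bytes can be sketched in Python. This is an illustration only, not the OVS datapath code: the function names simply mirror the action names, and the MAC addresses and EtherType passed to push_eth are parameters invented for the example.

```python
import struct
from typing import Tuple

ETH_HLEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)
VLAN_TPIDS = (0x8100, 0x88A8)


def push_eth(payload: bytes, dst: bytes, src: bytes, eth_type: int) -> bytes:
    """Prepend an Ethernet header to the start of a layer 3 packet."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 bytes long")
    return dst + src + struct.pack("!H", eth_type) + payload


def pop_eth(frame: bytes) -> Tuple[int, bytes]:
    """Remove the Ethernet header; return (eth_type, remaining packet)."""
    if len(frame) < ETH_HLEN:
        raise ValueError("frame is shorter than an Ethernet header")
    (eth_type,) = struct.unpack("!H", frame[12:14])
    if eth_type in VLAN_TPIDS:
        # Mirrors the slide: packets with a VLAN are not currently permitted.
        raise ValueError("VLAN-tagged frames are not supported")
    # Anything after the header (IP, or an MPLS label stack) is left alone.
    return eth_type, frame[ETH_HLEN:]
```

Applying pop_eth to the result of push_eth returns the original packet together with the EtherType that was pushed.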
Datapath Attributes and Packet Type
Presence of ETHERTYPE and ETHERNET attributes indicates L2 packet
Presence of ETHERTYPE but not ETHERNET attribute indicates L3 packet
ETHERTYPE corresponds to Protocol Type in GRE header
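The attribute rule above can be written as a small classifier. This is a sketch under stated assumptions: a flow key is modelled as a set of attribute names, with "ETHERNET" and "ETHERTYPE" standing in for the datapath key attributes, and packet_layer is a hypothetical helper, not an OVS function.

```python
from typing import Iterable


def packet_layer(key_attrs: Iterable[str]) -> int:
    """Return 2 or 3 depending on which attributes the flow key carries."""
    attrs = set(key_attrs)
    if "ETHERTYPE" not in attrs:
        # The slides only describe keys that carry ETHERTYPE.
        raise ValueError("flow key has no ETHERTYPE attribute")
    # ETHERNET present: L2 packet. ETHERNET absent: L3 packet.
    return 2 if "ETHERNET" in attrs else 3
```

For example, packet_layer({"ETHERNET", "ETHERTYPE", "IPV4"}) classifies the key as layer 2, while the same key without "ETHERNET" is classified as layer 3.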
GRE Header
Packet layout: Delivery Header, GRE Header, Payload Packet
GRE header, first 32-bit word: C, Reserved0, Ver, Protocol Type
GRE header, second 32-bit word: Checksum (optional), Reserved (optional)
C: Checksum Present
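A minimal parser for the header fields shown on this slide might look as follows. It is a sketch, not the OVS or kernel implementation: parse_gre is a hypothetical name, only the fields on the slide are handled, and a header with any Reserved0 bit set (which would include key or sequence number extensions) is rejected rather than interpreted.

```python
import struct
from typing import Optional, Tuple

GRE_C_BIT = 0x8000           # C: Checksum Present
GRE_RESERVED0_MASK = 0x7FF8  # bits between C and Ver
GRE_VER_MASK = 0x0007


def parse_gre(data: bytes) -> Tuple[int, Optional[int], bytes]:
    """Return (protocol_type, checksum or None, payload packet)."""
    if len(data) < 4:
        raise ValueError("truncated GRE header")
    flags_ver, protocol_type = struct.unpack("!HH", data[:4])
    if flags_ver & GRE_VER_MASK:
        raise ValueError("unsupported GRE version")
    if flags_ver & GRE_RESERVED0_MASK:
        # Assumption of this sketch: extensions are out of scope.
        raise ValueError("Reserved0 bits set; not handled by this sketch")
    checksum = None
    offset = 4
    if flags_ver & GRE_C_BIT:
        if len(data) < 8:
            raise ValueError("truncated GRE checksum field")
        # Second word: Checksum (optional), Reserved (optional).
        checksum, _reserved = struct.unpack("!HH", data[4:8])
        offset = 8
    return protocol_type, checksum, data[offset:]
```

The returned protocol_type is the value that the ETHERTYPE attribute corresponds to for a layer 3 payload, for example 0x0800 for IPv4 or 0x8847 for MPLS unicast.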
Operation
OVS User-Space (ovs-vswitchd) is aware of which vports are Layer 2 and which are Layer 3
It is also aware of the input port for each flow
Thus, when translating from OpenFlow to datapath flows, it can add push_eth and pop_eth actions before output actions as necessary
Packet Flow
Layer 3 vPort to Layer 2 vPort:
  Key: eth_type, ... Actions: push_eth, output
Layer 2 vPort to Layer 3 vPort:
  Key: eth, eth_type, ... Actions: pop_eth, output
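The translation step described in the Operation and Packet Flow slides can be sketched as a function of the input and output vport layers. The function name, the integer layer values and the "output:<port>" string format are assumptions made for this illustration; they are not the ovs-vswitchd translation code.

```python
from typing import List


def datapath_actions(in_layer: int, out_layer: int, out_port: str) -> List[str]:
    """Build the datapath action list for one output, given vport layers."""
    for layer in (in_layer, out_layer):
        if layer not in (2, 3):
            raise ValueError("vport layer must be 2 or 3")
    actions = []
    if in_layer == 3 and out_layer == 2:
        # Packet arrived without an Ethernet header but the output needs one.
        actions.append("push_eth")
    elif in_layer == 2 and out_layer == 3:
        # Packet has an Ethernet header but the output must not carry one.
        actions.append("pop_eth")
    actions.append("output:" + out_port)
    return actions
```

When both vports are in the same layer mode, the sketch emits only the output action, since no header needs to be added or removed.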
vPort Implementations
User-Space (non-Datapath)
Kernel Datapath
User-Space Datapath
User-Space (non-Datapath) vPorts
vPorts have a new layer3 flag to distinguish layer mode
vPorts of the same type (e.g. GRE) but different layer mode share the same datapath vport
Kernel Datapath vPorts
Switch to using ipgre rather than gretap netdev in kernel
ipgre (and ipvxlan) vports have recently been enhanced to allow rx/tx of TEB as well as non-TEB packets
Thus facilitating a single datapath vport for use with both layer 2 and 3 user-space vports
This design was motivated by a desire to avoid vport type explosion
User-Space Datapath vPorts
New NEXT_BASE_LAYER flow key attribute, used only by the user-space datapath
Used to distinguish flows with layer 2 and 3 payload packets
Configuration Example
ovs-vsctl add-port br0 tun1 -- \
    set Interface tun1 type=gre \
    options:remote_ip=10.0.0.2 \
    options:key=flow \
    options:layer3=true
Future Work
Encapsulation Protocols:
▶ MPLS in IP (rfc4023)
▶ MPLS in UDP (rfc7510)
▶ NSH (draft-ietf-sfc-nsh-05)
▶ VXLAN-GPE (draft-ietf-nvo3-vxlan-gpe-02)
▶ LISP (rfc6830)
Credits
Many, including:
▶ Lorand Jakab, Thomas Morin: Original implementation
▶ Jiri Benc: Kernel Tunnel Enhancements
Availability
Open vSwitch (User-Space):
▶ https://github.com/horms/openvswitch l3-vpn
Kernel (Datapath):
▶ https://github.com/horms/linux l3-vpn
Working towards upstream merge!
Simon Horman
Thank You