© 2016 NETRONOME SYSTEMS, INC.
Simon Horman
Layer 3 Tunnel Support for Open vSwitch
Motivation
Would like to:
● Allow rx and tx of packets over tunnels whose payload packet does not have an Ethernet header
● Add these features to upstream OvS then offload them
Importance of Offloading
[Figure: bar chart, "OVS L2/L3 Forwarding to 8 VMs with 64K Flows". Y axis: packets per second (millions), 5 to 30. Series: 100, 1000, 10000 and 64000 Wildcard Rules. Configurations: OVS Kernel Datapath with Netdev to VMs; OVS User-Space Datapath with Netdev to VMs; OVS Offload to iNIC with PMD to VMs (1 CPU Core); OVS Offload to iNIC with Netdev to VMs (1 CPU Core). Further CPU labels in the figure: 12 CPU Cores, 8 CPU Cores.]
5X Throughput Improvement + 50% CPU Savings
Scope
Datapaths:
● Linux Kernel
● User-Space with and without DPDK
Encapsulation Protocol:
● GRE (non-TEB) (rfc2784):
  ▶ IP protocols over GRE
  ▶ MPLS in GRE (rfc4023)
Background: Tunnel vPorts
● Encapsulation and decapsulation are handled by output to and input from tunnel vports
● Not currently exposed in OpenFlow
Kernel Datapath Tunnel vPorts
Kernel Datapath:
● On rx, the tunnel vport decapsulates the packet, passing the result and metadata to the datapath
● On tx, the tunnel vport encapsulates the packet based on metadata
User-Space Tunnel vPorts
Native Tunnelling:
● Tunnel ingress and egress on separate OvS bridge
● Internal rules match ingress and egress packets for tunnel vPorts and apply push and pop tunnel actions accordingly
● Like the Kernel Datapath, tunnel metadata is:
  ▶ Available in flow key after decapsulation
  ▶ Used as parameters for encapsulation
Layer 3 Tunneling: Basic Concepts
● Layer 2 and 3 vPorts
● push_eth and pop_eth datapath actions
● Datapath Attributes and packet type
Layer 2 and 3 vPorts
● Mode of vport
● Default is layer 2: behaviour of all vports until now
pop_eth and push_eth Actions
● Add or remove an Ethernet header to/from start of packet
● Packets with a VLAN not currently permitted
● MPLS is treated as L2.5 and left alone
● Not currently exposed to OpenFlow:
  ▶ Automatically included in actions of datapath flow
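As an illustration of what the two actions do to the packet bytes, here is a minimal Python sketch. It is not the OvS datapath code: the function names simply mirror the action names, the MAC addresses and EtherType are passed in by the caller as an assumption of this sketch, and the VLAN check reflects the restriction listed above.

```python
import struct
from typing import Tuple

ETH_HLEN = 14          # destination MAC (6) + source MAC (6) + EtherType (2)
ETH_P_IP = 0x0800      # IPv4
ETH_P_8021Q = 0x8100   # 802.1Q VLAN tag


def push_eth(l3_packet: bytes, dst: bytes, src: bytes, ethertype: int) -> bytes:
    """Prepend an Ethernet header to a layer 3 packet."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 bytes long")
    return dst + src + struct.pack("!H", ethertype) + l3_packet


def pop_eth(l2_packet: bytes) -> Tuple[int, bytes]:
    """Strip the Ethernet header and return (ethertype, remaining packet)."""
    if len(l2_packet) < ETH_HLEN:
        raise ValueError("packet is shorter than an Ethernet header")
    (ethertype,) = struct.unpack_from("!H", l2_packet, 12)
    if ethertype == ETH_P_8021Q:
        # Packets with a VLAN are not currently permitted.
        raise ValueError("VLAN-tagged packets are not supported")
    # Anything after the Ethernet header (including MPLS) is left alone.
    return ethertype, l2_packet[ETH_HLEN:]
```

Note that pop_eth returns the EtherType alongside the payload, since that value is still needed to identify the layer 3 packet once the Ethernet header is gone.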
Datapath Attributes and Packet Type
● Presence of ETHERTYPE and ETHER attribute indicates L2 packet
● Presence of ETHERTYPE but not ETHER attribute indicates L3 packet
● ETHERTYPE is used to communicate type of layer 3 packet
● Corresponds to Protocol Type in GRE header
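The attribute rule above can be sketched as a small classifier. This is an illustrative Python sketch only: the flow key is modelled as a set of attribute names, using the short names from this slide (ETHER, ETHERTYPE) rather than the datapath's actual identifiers, and the function name is hypothetical.

```python
def packet_layer(key_attrs):
    """Classify a flow key as "L2" or "L3" from the attributes it carries.

    key_attrs is a set of attribute names, e.g. {"ETHER", "ETHERTYPE", "IPV4"}.
    """
    has_ethertype = "ETHERTYPE" in key_attrs
    has_ether = "ETHER" in key_attrs
    if has_ethertype and has_ether:
        return "L2"   # Ethernet header present
    if has_ethertype:
        return "L3"   # no Ethernet header; ETHERTYPE names the L3 protocol
    raise ValueError("flow key carries no ETHERTYPE attribute")
```

For example, a key with both attributes classifies as "L2", while the same key without ETHER classifies as "L3".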
GRE Header
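Packet layout:
Delivery Header | GRE Header | Payload Packet

GRE header layout:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|C|       Reserved0       | Ver |         Protocol Type         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Checksum (optional)      |      Reserved (optional)      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
C: Checksum Present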
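To make the header layout concrete, the following Python sketch builds and parses just the fields shown on this slide. It makes several assumptions: only the C flag is handled (key and sequence-number extensions are ignored), the checksum is carried as an opaque value rather than computed, and the function names and returned dictionary are hypothetical. The TEB protocol type 0x6558 marks an Ethernet (layer 2) payload; any other protocol type is treated here as a layer 3 payload.

```python
import struct

GRE_FLAG_CSUM = 0x8000     # C bit: Checksum Present
GRE_VERSION_MASK = 0x0007  # Ver field, must be 0

ETH_P_IP = 0x0800    # IPv4 payload (layer 3)
ETH_P_TEB = 0x6558   # Transparent Ethernet Bridging payload (layer 2)


def build_gre_header(protocol_type, checksum=None):
    """Return a 4-byte GRE header, or 8 bytes when a checksum is supplied."""
    flags = GRE_FLAG_CSUM if checksum is not None else 0
    header = struct.pack("!HH", flags, protocol_type)
    if checksum is not None:
        # Checksum followed by the 16-bit reserved field (sent as zero).
        header += struct.pack("!HH", checksum, 0)
    return header


def parse_gre_header(data):
    """Parse the GRE header at the start of data and split off the payload."""
    if len(data) < 4:
        raise ValueError("truncated GRE header")
    flags, protocol_type = struct.unpack_from("!HH", data, 0)
    if (flags & GRE_VERSION_MASK) != 0:
        raise ValueError("unsupported GRE version")
    offset = 4
    checksum = None
    if flags & GRE_FLAG_CSUM:
        if len(data) < 8:
            raise ValueError("truncated GRE checksum field")
        checksum, _reserved = struct.unpack_from("!HH", data, 4)
        offset = 8
    return {
        "protocol_type": protocol_type,
        "checksum": checksum,
        "is_layer3": protocol_type != ETH_P_TEB,
        "payload": data[offset:],
    }
```

On receive, the parsed protocol_type is the value that would be carried in the ETHERTYPE attribute of a layer 3 flow key.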
Operation
● OvS User-Space (ovs-vswitchd) is aware of which vports are Layer 2 and which are Layer 3
● It is aware of the input port for each flow
● Thus, when translating from OpenFlow to datapath flows, it can add push_eth and pop_eth actions before output actions as necessary
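The translation step described above can be sketched roughly as follows. This is a simplified Python illustration, not ovs-vswitchd's actual translation code: ports are plain names, layer3_ports is a hypothetical set of the ports configured as Layer 3, and actions are represented as strings.

```python
def datapath_actions(in_port, out_ports, layer3_ports):
    """Build datapath actions for a flow received on in_port.

    A push_eth or pop_eth action is inserted before an output action
    whenever the packet's current layer differs from the output port's.
    """
    packet_is_l3 = in_port in layer3_ports
    actions = []
    for port in out_ports:
        port_is_l3 = port in layer3_ports
        if packet_is_l3 and not port_is_l3:
            actions.append("push_eth")   # L3 packet leaving via an L2 port
        elif not packet_is_l3 and port_is_l3:
            actions.append("pop_eth")    # L2 packet leaving via an L3 port
        packet_is_l3 = port_is_l3        # the packet now matches the port
        actions.append("output:" + port)
    return actions
```

The sketch tracks the packet's layer across successive outputs, so a flow that outputs to both an L3 and an L2 port gets the header removed and then restored.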
Packet Flow
Layer 3 vPort → Layer 2 vPort
  Key: pkt_eth=0xXXXX,...
  Actions: push_eth, output

Layer 2 vPort → Layer 3 vPort
  Key: eth_type=0xXXXX,...
  Actions: pop_eth, output
vPort Implementation
● User-Space (non-Datapath)
● Kernel Datapath
● User-Space Datapath
User-Space (non-Datapath) vPorts
● vPorts have new layer3 flag to distinguish layer mode
● vPorts of the same type (e.g. GRE) but different layer mode share the same datapath vport
Kernel Datapath vPorts
● Switch to using gretap rather than ipgre vport in kernel
● ipgre (and ipvxlan) vports have recently been enhanced to allow rx/tx of TEB as well as non-TEB packets
● Thus facilitating a single datapath vport for use with both Layer 2 and 3 user-space vports
● This design was motivated by a desire to avoid vport type explosion
User-Space Datapath vPorts
● New user-space datapath only NEXT_BASE_LAYER flow key attribute
● Used to distinguish flows with layer 2 and 3 payload packets
Configuration Example
ovs-vsctl add-port br0 tun1 -- \
    set Interface tun1 type=gre \
    options:remote_ip=10.0.0.2 \
    options:key=flow \
    options:layer3=true
Future Work
Encapsulation Protocols:
● MPLS in IP (rfc4023)
● MPLS in UDP (rfc7510)
● NSH (draft-ietf-sfc-nsh-05)
● VXLAN-GPE (draft-ietf-nvo3-vxlan-gpe-02)
● LISP (rfc6830)
Credits
Many, including:
● Lorand Jakub, Thomas Morin: Original implementation
● Jiri Benc: Kernel Tunnel Enhancements
Availability
Open vSwitch (User-Space):
https://github.com/horms/openvswitch (branch: me/l3-vpn)
Kernel (Datapath):
https://github.com/horms/linux (branch: me/l3-vpn)
Working towards upstream merge!
Questions