Optimizing TCP Workloads in an OvS-based NFV Deployment
Mark Kavanagh / Tarek Radi, Intel Corporation
Legal Disclaimer
General Disclaimer:
© Copyright 2016 Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Inside, the Intel Inside logo, and Intel Experience What’s Inside are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.
Technology Disclaimer:
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at [intel.com].
Performance Disclaimers:
Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. Results have been estimated or simulated using internal Intel analysis or architecture simulation or modelling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.
Problem Domain
[Diagram: a proprietary VNF (speed test server) must be migrated onto the NFVI while retaining optimal TCP performance. Test scenarios: speed test client VM(s) on Compute Node 1 -> speed test server VM on Compute Node 2; client and server VMs on the same compute node; and speed test client on an external network -> speed test server VM.]
Deployment Scenario (Simplified)
[Diagram: an external machine connects via a ToR switch over 10GbE to Compute 1 and Compute 2. Each compute node runs OvS with br-vlan and br-int bridges on a VLAN network. Compute 1 hosts VM1…VMn (the customer-defined test cases use VM1…VM5); Compute 2 hosts the Speed Test Server VM.]
VNF Deployment Scenario (Full)
[Diagram: a Controller + Neutron node and two compute nodes interconnect through a ToR switch. Compute 1 hosts VM1…VMn (test cases use VM1…VM5); Compute 2 hosts the Speed Test Server VM. Each node runs OvS with br-int, br-vlan, br-vxlan, and br-ext bridges. Networks: 10GbE VLAN network, 10GbE VxLAN network, 1GbE management network, and an external network reaching an external machine and the Internet via a switch.]
Anatomy of a VNF Compute Node
Hardware:
• Intel® Xeon® E5-2680 v2 @ 2.8GHz
• Intel® 82599ES 10 Gigabit Ethernet Controller
• Intel® Ethernet Controller I350 BT2
Host Software Stack:
• Fedora 21 – 4.1.13-100.fc21.x86_64
• KVM 2.3.0.5fc21 / QEMU 2.5.0
• DPDK 16.04 / Open vSwitch 2.5.90
• OpenStack Kilo 2015.1.1
Guest Software Stack:
• CentOS 7 – 4.5.4
• iPerf3
• Virtio-net
VNF – Virtualized Broadband Speed Test Server
Optimizations: Baseline
• Enable hugepages – reduce the impact of Translation Lookaside Buffer (TLB) misses
• Affinitize DPDK PMDs and QEMU’s virtual CPU threads – maximize CPU occupancy, minimize cache thrashing
• Enable NUMA support for OvS-DPDK – eliminate QPI-traversal performance penalties
Additional details available here:
https://github.com/openvswitch/ovs/blob/master/INSTALL.DPDK-ADVANCED.md
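The baseline steps above can be sketched as follows. This is a minimal illustration, not the deck's exact configuration: hugepage counts, core masks, and thread IDs are placeholder assumptions for a dual-socket host, and option syntax follows the OvS-DPDK install guide linked above.

```shell
# Reserve 1 GB hugepages at boot via the kernel command line, e.g.:
#   default_hugepagesz=1G hugepagesz=1G hugepages=8

# Mount hugetlbfs so OvS-DPDK and QEMU can back memory with hugepages
mkdir -p /dev/hugepages
mount -t hugetlbfs hugetlbfs /dev/hugepages

# Pin DPDK PMD threads to dedicated cores (example mask 0xC = cores 2,3)
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xC

# OvS 2.5 takes DPDK EAL args on the vswitchd command line; reserving
# memory on both NUMA nodes avoids cross-QPI accesses, e.g.:
#   ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,1024 -- ...

# Pin a QEMU vCPU thread (TID found via 'ps -eLo pid,tid,comm') to core 4
taskset -pc 4 <vcpu-tid>
```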
Optimizations: TCP Segmentation Offload (TSO) Overview
No TSO: the guest TCP stack splits application data into n TCP segments, each wrapped in its own IP packet and Ethernet frame (1, 2, …, n) before transmission.
TSO enabled: the TCP stack hands down the application data as a single super-sized skb – one TCP segment, one IP packet, one over-sized Ethernet frame; segmentation into n wire-sized Ethernet frames is deferred to the offload engine.
Enable TSO in the guest to reduce vCPU load & boost throughput for OvS-DPDK.
Optimizations: TCP Segmentation Offload
[Diagram: GUEST – the app sends via eth0 (ethtool -K eth0 tso on) as a single over-sized Ethernet frame into vhost-user port vhu0. HOST – OvS-DPDK carries the frame as an mbuf chain with offload metadata set:
  mbuf->ol_flags |= PKT_TX_TCP_SEG | PKT_TX_IP_CKSUM
  mbuf->l2_len / mbuf->l3_len / mbuf->l4_len
  mbuf->tso_segsz = MSS
On transmit via dpdk0, the NIC segments the chain into n Ethernet frames.]
Benefits: reduced vCPU load, improved PCI bus usage, higher throughput.
RFC Patch: https://mail.openvswitch.org/pipermail/ovs-dev/2016-June/235223.html
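The guest-side step above can be applied and verified with ethtool; eth0 is the deck's example interface name.

```shell
# Enable TCP segmentation offload on the guest virtio-net interface
ethtool -K eth0 tso on

# Confirm TSO is active; also check scatter-gather and tx checksumming,
# which TSO depends on
ethtool -k eth0 | grep -E 'tcp-segmentation-offload|scatter-gather|tx-checksumming'
```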
TCP Optimizations: Multi Q (Overview)
[Diagram, single queue: in the GUEST, the VNIC exposes one queue pair (Q0); on the HOST, a single OvS-DPDK PMD thread services Rx/Tx for both vhu0 Q0 and dpdk0 Q0, down to the NIC in HARDWARE. With multiqueue: the VNIC and NIC each expose Q0…Qn, and each queue pair is serviced by its own PMD thread, spreading Rx/Tx work across cores. The guest runs iperf3 -s.]
https://software.intel.com/en-us/articles/configure-vhost-user-multiqueue-for-ovs-with-dpdk
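A minimal multiqueue setup in the spirit of the Intel article linked above; queue counts, core masks, and interface names are illustrative assumptions, and option names follow recent OvS-DPDK releases.

```shell
# Host: enough PMD cores to service the extra queues (mask 0xF = cores 0-3)
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xF
# Give the DPDK physical port 4 Rx queues
ovs-vsctl set Interface dpdk0 options:n_rxq=4

# QEMU: multiqueue virtio-net over vhost-user (vectors = 2*queues + 2):
#   -chardev socket,id=char0,path=/usr/local/var/run/openvswitch/vhu0 \
#   -netdev type=vhost-user,id=net0,chardev=char0,queues=4 \
#   -device virtio-net-pci,netdev=net0,mq=on,vectors=10

# Guest: bring the extra queue pairs online
ethtool -L eth0 combined 4
```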
TCP Optimizations: Multi Q (Problem)
[Diagram: five VMs (VM0–VM4) on the compute node each run an iperf client through OvS-DPDK (vhu0 -> dpdk0) and the ToR switch toward an external machine, which bridges traffic through a kernel OvS bridge to a single iperf3 -s instance. The NIC's RSS hash decides the Rx queue per flow: with every client targeting the same server port, flows are not guaranteed to spread across queues, so a single queue (and the thread servicing it) can become the bottleneck.]
TCP Optimizations: Multi Q (Solution)
[Diagram: each VM's iperf client targets a distinct server port (VM0: iperf3 -c -p 10000 … VM4: iperf3 -c -p 10004), and the external machine runs one iperf3 -s instance per port (iperf3 -s -p 10000 … iperf3 -s -p 10004). With a distinct port in each flow's tuple, the NIC's RSS hash distributes the five flows across separate Rx queues.]
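The per-port fan-out above can be scripted as follows; the server address and test duration are placeholder assumptions.

```shell
# External machine: one iperf3 server per port, daemonized
for port in 10000 10001 10002 10003 10004; do
    iperf3 -s -p "$port" -D
done

# Inside VM0..VM4: each client targets its own port so the RSS hash
# steers each flow to a different Rx queue, e.g. on VM3:
iperf3 -c 10.0.0.100 -p 10003 -t 60
```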
Performance Results – Test Case #1
[Bar chart: average speed test server bandwidth (Gbps, 0–10 scale) per client (Client 1–Client 5), Baseline vs. With TCP Optimizations, for 5 external VMs -> single speed test server.]
*System configuration detailed in backup
Performance Results – Test Case #2
Speed test server -> VM, separate compute nodes: average speed test server bandwidth 4.96 Gbps (Baseline) vs. 9.34 Gbps (With TCP Optimizations).
*System configuration detailed in backup
Performance Results – Test Case #3
VM -> VM, same compute node: average speed test server bandwidth 10.5 Gbps (Baseline) vs. 45.1 Gbps (With TCP Optimizations).
*System configuration detailed in backup
Optimization Summary
Baseline Optimizations:
• Enable hugepages
• Per-port/RxQ PMD
• Affinitize workloads
• Incorporate NUMA support
Avail of Offloads:
• TSO = reduced vCPU load
• TSO = efficient PCI bandwidth consumption
Utilize Multi Q for Guests:
• Saturate line
• Push bottleneck back to the network
Next Steps
• Release non-RFC TSO Support Patch
• Add support for TSO + Tunnels
References
• https://www.measurementlab.net/publications/understanding-broadband-speed-measurements.pdf
• https://software.intel.com/en-us/articles/configure-vhost-user-multiqueue-for-ovs-with-dpdk
• http://openvswitch.org/pipermail/dev/2016-June/072871.html
Q&A
Backup
System Configuration: Hardware
Hardware Platform Specification
Compute 1:
• Processor: Intel® Xeon® E5-2680 v2 at 2.80 GHz, 40 logical cores
• Hard drive: 1 TB
• Memory: DDR3 1600 MHz
• NIC: Intel Ethernet Controller I350 BT2 (management and public networks); Intel® 82599ES 10 Gigabit Ethernet Controller (VxLAN and VLAN networks)
Compute 2:
• Processor: Intel® Xeon® E5-2680 v2 at 2.80 GHz, 40 logical cores
• Hard drive: 1 TB
• Memory: DDR3 1600 MHz
• NIC: Intel Ethernet Controller I350 BT2 (management and public networks); Intel® 82599ES 10 Gigabit Ethernet Controller (VxLAN and VLAN networks)
System Configuration: Software
Software Ingredients
1. Operating System: Fedora* 21, Kernel 4.1.13-100.fc21.x86_64
2. Hypervisor: Compute nodes: QEMU-KVM, QEMU 2.5.0
3. Virtual Switch: Compute nodes: Open vSwitch 2.5.90 + TSO RFC patch
4. Packet Processing Acceleration: DPDK v16.04
5. Virtualized Infrastructure Manager: OpenStack* Kilo 2015.1.0
System Configuration: BIOS Settings 1/2
System Configuration: BIOS Settings 2/2