How NICs Work Today
IETF 105, Montreal, Tuesday July 23, 2019
Tom Herbert, Intel; Simon Horman, Netronome; Andy Gospodarek, Broadcom
Page 1: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

How NICs Work Today

IETF 105, Montreal, Tuesday July 23, 2019

Tom Herbert, Intel
Simon Horman, Netronome
Andy Gospodarek, Broadcom

Page 2: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Fundamentals


Page 3: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Terminology

● Network Interface Card (NIC): Host's interface to the physical network
● Host Stack: Software stack that performs host-side processing of L2, L3, or L4 protocols
● Kernel Stack: Host stack implemented in an OS kernel
● Offload: Do something in NIC HW that could be done in the host SW stack
● Acceleration: Offload for performance gains

Page 4: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Network Interface Cards


Page 5: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Evolution of Network Interface Cards

● Fundamental support (1990s)
  ○ Transmit and receive packets
  ○ Basic offloads (Ethernet Checksum Offload!)
● Data plane acceleration (early to mid 2000s)
  ○ Optimization for multi-core CPUs
  ○ Hardware data plane offload, mostly fixed-function devices
  ○ Tunneling, IPsec, QoS offloads
● Programmability (2010 onwards)
  ○ FPGAs and NPUs with programmable data plane
  ○ General purpose processor with programmable data and control planes

Page 6: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Offload: Motivation

● Free up CPU cycles for application
● Specialized processing can be more efficient
● Save host resources
● Scaling performance (low latency/high throughput)
● Power savings for some use cases

=> Reduced TCO (marketing slant!)

Page 7: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Less is More Principle

● Protocol agnostic is better than protocol specific
  ○ Avoid protocol ossification
  ○ New protocol support without needing completely new solutions
● Common open APIs are better than proprietary ones
  ○ Avoid vendor lock-in
  ○ Differentiation by features, performance, implementation
● Programmability is (generally) good
  ○ Be adaptable, don't dictate to the user what they are allowed to do
  ○ Aspiration: "write once, run anywhere" model across devices

Page 8: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Basic offloads


Page 9: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Offload Considerations

● TX and RX
● Protocol agnostic versus protocol specific
● Stateful versus stateless
● Encapsulation
● "Always on" versus "opportunistic"
● IPv6 and IPv4
● How to build protocols to be NIC offload friendly

Page 10: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Basic Offloads

● Checksum offload
● Segmentation offload
● Multi-queue and packet steering

Page 11: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Checksum Offload

● TCP, UDP, GRE, etc.
● NIC offloads the checksum calculation over the data
● Checksum offload is ubiquitous
● Encapsulation allows multiple checksums in the same packet

Page 12: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

TX Checksum Offload

● Device parses transports and sets checksum
  ○ Device parses packets and sets the TCP or UDP checksum
● Instruct device where to start and write checksum
  ○ Init csum field, indicate start offset and offset to write csum
  ○ Generic method
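The generic TX method can be sketched in a few lines. This is a minimal illustration, not any driver's actual code: it assumes the host seeds the checksum field with the uncomplemented pseudo-header sum, then hands the device a start offset and a write offset. The names `device_tx_checksum`, `csum_start`, and `csum_offset` are chosen for illustration, and the 20-byte IP header is a zero-filled placeholder.

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    """Fold the 16-bit big-endian words of data into a 16-bit one's complement sum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def device_tx_checksum(packet: bytearray, csum_start: int, csum_offset: int) -> None:
    """Hypothetical device step: sum from csum_start to the end, write the complement."""
    total = ones_complement_sum(bytes(packet[csum_start:]))
    struct.pack_into("!H", packet, csum_start + csum_offset, ~total & 0xFFFF)

# Host side: build a UDP datagram behind a 20-byte placeholder IP header.
src, dst = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
payload = b"hello"
udp_len = 8 + len(payload)
pseudo_header = src + dst + struct.pack("!BBH", 0, 17, udp_len)

# Seed the checksum field with the pseudo-header sum (not complemented).
seed = ones_complement_sum(pseudo_header)
udp = struct.pack("!HHHH", 1234, 5678, udp_len, seed) + payload
packet = bytearray(bytes(20) + udp)

# The device needs no protocol knowledge, only the two offsets.
device_tx_checksum(packet, csum_start=20, csum_offset=6)
```

Because the device only sums a byte range and writes a 16-bit result, the same path works for any protocol that uses the Internet checksum, which is why the slide calls this method generic.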

Page 13: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

RX Checksum Offload

● Checksum unnecessary
  ○ Device parses packet and verifies UDP or TCP checksum
● Checksum complete
  ○ Device returns the 1's complement sum across words in the packet
  ○ Generic method
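A rough sketch of how a host could use a "checksum complete" value: the device reports one sum over the whole packet, and the host subtracts the parts it does not want (here the IP header) and adds the pseudo-header before checking the result. All function names are illustrative, and the IP header is a simplified placeholder rather than a fully valid header.

```python
import struct

def csum(data: bytes) -> int:
    """16-bit one's complement sum over big-endian words, folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def csum_add(a: int, b: int) -> int:
    total = a + b
    return (total & 0xFFFF) + (total >> 16)

def csum_sub(a: int, b: int) -> int:
    # One's complement subtraction is addition of the complement.
    return csum_add(a, ~b & 0xFFFF)

def host_verify(device_sum: int, ip_header: bytes, pseudo_header: bytes) -> bool:
    """Host step: strip the IP header's contribution, add the pseudo-header, check."""
    transport_sum = csum_sub(device_sum, csum(ip_header))
    return csum_add(transport_sum, csum(pseudo_header)) == 0xFFFF

# Build a UDP datagram with a valid checksum.
src, dst = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
payload = b"hello"
udp_len = 8 + len(payload)
pseudo_header = src + dst + struct.pack("!BBH", 0, 17, udp_len)
udp_zero = struct.pack("!HHHH", 1234, 5678, udp_len, 0) + payload
check = ~csum(pseudo_header + udp_zero) & 0xFFFF
udp = struct.pack("!HHHH", 1234, 5678, udp_len, check) + payload

ip_header = b"\x45" + bytes(11) + src + dst  # 20-byte placeholder header
packet = ip_header + udp

# What the device would report: one sum over the entire packet.
device_sum = csum(packet)
ok = host_verify(device_sum, ip_header, pseudo_header)
```

The device never parses the transport header in this scheme, so the host can verify checksums for protocols the NIC has never heard of, including inner checksums of encapsulated packets.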

Page 14: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Segmentation Offload

● Stack operates more efficiently on large packets
● Combines with checksum offload to minimize header processing and per packet overhead

Page 15: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Transmit Segmentation Offload

● Split a big packet into smaller ones low in the stack
● Generic Segmentation Offload, GSO: SW variant
● LSO: HW variant
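As a minimal sketch of the splitting step, assuming a TCP-like byte stream where each segment carries the sequence number of its first byte: one large send is cut into MSS-sized pieces and the sequence number advances by the bytes already emitted. `Segment` and `segment_payload` are illustrative names; real GSO/LSO also rewrites lengths, IP IDs, flags, and checksums per segment.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Segment:
    seq: int        # sequence number of the first payload byte
    payload: bytes

def segment_payload(seq: int, payload: bytes, mss: int) -> List[Segment]:
    """Split one large send into segments of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [
        Segment((seq + offset) % 2**32, payload[offset:offset + mss])
        for offset in range(0, len(payload), mss)
    ]

segments = segment_payload(seq=1000, payload=bytes(3000), mss=1448)
```

The upper layers build and process a single 3000-byte packet; only the last step, in software or in the NIC, pays the per-segment cost.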

Page 16: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Receive Segmentation Offload

● Coalesce small packets into bigger ones low in the stack
● Generic Receive Offload, GRO: SW variant
● Large Receive Offload, LRO: HW variant
● Difficult to make protocol agnostic!
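The coalescing idea can be illustrated with a deliberately simplified sketch. It merges a packet into the previous one only when both belong to the same flow and the new sequence number directly follows the previous payload. The tuple layout `(flow, seq, payload)` and the function name are assumptions for illustration; real GRO/LRO tracks several flows at once and checks many more header fields.

```python
from typing import List, Tuple

Packet = Tuple[str, int, bytes]  # (flow id, sequence number, payload)

def coalesce(packets: List[Packet]) -> List[Packet]:
    """Merge back-to-back in-order packets of the same flow into larger ones."""
    merged: List[Packet] = []
    for flow, seq, payload in packets:
        if merged:
            last_flow, last_seq, last_payload = merged[-1]
            # Only merge when this packet continues the previous one exactly.
            if flow == last_flow and seq == last_seq + len(last_payload):
                merged[-1] = (last_flow, last_seq, last_payload + payload)
                continue
        merged.append((flow, seq, payload))
    return merged

received = [("A", 0, b"aa"), ("A", 2, b"bb"), ("B", 0, b"x"), ("A", 4, b"cc")]
result = coalesce(received)
```

The merge test depends on flow identity and sequence numbers, which is one reason receive coalescing is hard to make protocol agnostic: the rule for "these two packets are contiguous" is specific to each protocol.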

Page 17: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Multi-Queue

● Multiple queues exposed by NIC
● Queues processed by CPUs
● Queues can be accessed and processed in parallel, technique for load balancing
● Queues can also have different properties, e.g. priority
● Avoid out-of-order (OOO) packets, maintain flow to queue affinity

Page 18: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Transmit Queue Selection

● XPS, Transmit Packet Steering
  ○ Send packets on queue associated with CPU or thread
● Driver selects queue
  ○ Device driver operation
  ○ ndo_select_queue in Linux
  ○ Arbitrary properties (e.g. priority)

Page 19: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Receive Packet Steering

● Receive Packet Steering
  ○ Steer to queue based on hash
  ○ RPS is SW variant
  ○ RSS, Receive Side Scaling, is HW variant
● Receive Flow Steering
  ○ Flow to queue association
  ○ RFS is SW variant
  ○ aRFS, accelerated RFS, is HW variant
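Both steering styles can be sketched together. In this illustration a hash of the 5-tuple indexes an indirection table (the hash-based style), and an optional flow table overrides the hash result for specific flows (the flow-steering style). CRC32 is used purely as a stand-in hash; RSS hardware typically uses a keyed Toeplitz hash, which is not reproduced here. All names are hypothetical.

```python
import zlib
from typing import Dict, List, Tuple

Flow = Tuple[str, str, int, int, int]  # src ip, dst ip, src port, dst port, protocol

def flow_hash(flow: Flow) -> int:
    """Stand-in hash over the 5-tuple (CRC32, not the Toeplitz hash used by RSS)."""
    return zlib.crc32("|".join(str(field) for field in flow).encode())

def hash_steer(flow: Flow, indirection_table: List[int]) -> int:
    """Hash-based steering: the hash selects an indirection table slot."""
    return indirection_table[flow_hash(flow) % len(indirection_table)]

def flow_steer(flow: Flow, flow_table: Dict[Flow, int],
               indirection_table: List[int]) -> int:
    """Flow steering: an explicit flow-to-queue entry wins, else fall back to the hash."""
    if flow in flow_table:
        return flow_table[flow]
    return hash_steer(flow, indirection_table)

table = [slot % 4 for slot in range(128)]   # 128 slots spread over 4 queues
flow = ("192.0.2.1", "192.0.2.2", 40000, 443, 6)
queue_by_hash = hash_steer(flow, table)
queue_by_flow = flow_steer(flow, {flow: 3}, table)
```

Every packet of a flow hashes to the same slot, so the flow stays on one queue and packets are not reordered; the flow table additionally lets the stack place a flow on the queue served by the CPU where the consuming application runs.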

Page 20: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Data Plane in Hardware


Page 21: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Data Plane in Hardware

● Fixed or minimally configurable pipeline
  ○ ASIC with TCAM tables used for configuring pipeline
● Programmable pipeline
  ○ Network Processing Unit/Network Flow Processor
    ■ Multi-threaded execution environment for data plane programs
  ○ FPGA
    ■ Gate-level programmable
  ○ General Purpose Processor
    ■ CPU complex separate from host

[Figure: Offload NIC with ports PHY0 and PHY1 and a PCIe link; labels: Offloaded Data Plane; ASIC, FPGA, NPU, or CPU]

Page 22: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Data Plane Acceleration

● Control plane stays in host software stack
● Offload data plane
● Hardware fallback/assist datapath in host software stack
● Host software stack implements features of offload data plane

[Figure: Host (User Space: Applications; Kernel: Linux) connected over PCIe to an Offload NIC (PHY0, PHY1); labels: Control/Dataplane, Software Datapath, Offloaded Data Plane]

Page 23: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Data Plane Acceleration

● Match/Action
● Forwarding
● QoS
● TLS and IPsec

Page 24: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Match/Action

● Match packet based on headers and metadata
  ○ e.g. input device + 5-tuple
● Execute actions based on match
  ○ Forward / Mirror
  ○ Drop
  ○ Packet/metadata modification
● Stateful actions
  ○ Policing
  ○ Connection tracking

[Figure: Host Linux Software Datapath (Match Tables, Actions) and Offload NIC Offload Data Plane (Match Tables, Actions)]
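A minimal sketch of the match/action split between NIC and host, assuming exact-match tables keyed on input device plus 5-tuple. The class and function names are hypothetical, and the default of dropping on a miss in both tables is an assumption made only to keep the example closed.

```python
from typing import Dict, Optional, Tuple

# Key: input device, src ip, dst ip, protocol, src port, dst port
FlowKey = Tuple[str, str, str, int, int, int]

class MatchActionTable:
    """Exact-match table mapping a flow key to an action string."""
    def __init__(self) -> None:
        self.rules: Dict[FlowKey, str] = {}

    def add(self, key: FlowKey, action: str) -> None:
        self.rules[key] = action

    def lookup(self, key: FlowKey) -> Optional[str]:
        return self.rules.get(key)

def process(key: FlowKey, hw: MatchActionTable, sw: MatchActionTable) -> Tuple[str, str]:
    """Try the offloaded table first; on a miss, fall back to the host datapath."""
    action = hw.lookup(key)
    if action is not None:
        return ("hw", action)
    action = sw.lookup(key)
    # Assumption for this sketch: unmatched traffic is dropped.
    return ("sw", action if action is not None else "drop")

hw_table, sw_table = MatchActionTable(), MatchActionTable()
web = ("eth0", "192.0.2.1", "192.0.2.2", 6, 40000, 443)
dns = ("eth0", "192.0.2.1", "192.0.2.53", 17, 40001, 53)
sw_table.add(web, "forward:eth1")
sw_table.add(dns, "forward:eth2")
hw_table.add(web, "forward:eth1")   # only this flow has been offloaded
```

The host table holds the full rule set and the NIC table holds a subset of it, so a hardware miss costs performance but not correctness.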

Page 25: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Forwarding

● L2 -> Ln
● Between physical and logical devices
● HW datapath misses can fall back to host
● Optional tunnel encap/decap
  ○ VXLAN, GRE, Geneve, …
● And tagging: VLAN, MPLS

[Figure: Host Linux Software Datapath (Match Tables, Actions, Tunnel) and Offload NIC Offload Data Plane (Match Tables, Actions, Tunnel)]
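To make tunnel encap/decap concrete, here is a small sketch of the VXLAN case, limited to the 8-byte VXLAN header defined in RFC 7348 (flags byte with the I bit set, 24 reserved bits, 24-bit VNI, 8 reserved bits). The outer Ethernet, IP, and UDP headers that a real datapath also adds are omitted, and the function names are illustrative.

```python
import struct
from typing import Tuple

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348

def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the 8-byte VXLAN header to an inner Ethernet frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame

def vxlan_decap(packet: bytes) -> Tuple[int, bytes]:
    """Strip the VXLAN header and return (VNI, inner frame)."""
    if len(packet) < 8:
        raise ValueError("packet shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", packet[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, packet[8:]

encapsulated = vxlan_encap(42, b"frame")
vni, inner = vxlan_decap(encapsulated)
```

Header insertion and removal of this kind is fixed-format byte manipulation, which is why it is a common candidate for offload to the NIC data plane.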

Page 26: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

QoS

● Ingress
  ○ No queue
  ○ Police/Meter/Filter
● Egress
  ○ Classifier selects priority
  ○ Scheduler
    ■ Priority scheduler, e.g. 802.1p
    ■ Deficit round robin
    ■ TSN
    ■ Shaping: DCB, …

[Figure: Egress QoS. Host Linux Software Datapath (Classifier, Scheduler) and Offload NIC Offload Data Plane (Classifier, Scheduler)]
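Of the egress schedulers listed, deficit round robin is compact enough to sketch. In this illustration each queue earns a fixed quantum of bytes per round and may send packets while its accumulated deficit covers the head packet. Packets are represented only by their sizes, `drr_schedule` is a hypothetical name, and the function consumes the queues it is given.

```python
from collections import deque
from typing import Deque, List, Tuple

def drr_schedule(queues: List[Deque[int]], quantum: int) -> List[Tuple[int, int]]:
    """Return the transmit order as (queue index, packet size) pairs."""
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    deficits = [0] * len(queues)
    order: List[Tuple[int, int]] = []
    while any(queues):                 # an empty deque is falsy
        for index, queue in enumerate(queues):
            if not queue:
                deficits[index] = 0    # idle queues do not bank credit
                continue
            deficits[index] += quantum
            while queue and queue[0] <= deficits[index]:
                size = queue.popleft()
                deficits[index] -= size
                order.append((index, size))
            if not queue:
                deficits[index] = 0
    return order

queues = [deque([300, 300]), deque([100, 100, 100])]
order = drr_schedule(queues, quantum=200)
```

With a quantum of 200 bytes, the queue of 300-byte packets must wait a round to accumulate enough credit, while the queue of 100-byte packets sends two per round; over time both get the same share of bytes rather than the same share of packets.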

Page 27: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

MQ + RED Offload

● Per-device RED in HW
● May ECN mark or drop packets

[Figure: User Space (three Applications) above Kernel (Host Linux, Fallback Path) above Offload SmartNIC (Device Offload, Fast Path: MQ with three RED queues)]
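The RED decision can be sketched as follows. This is a simplified form of Random Early Detection: the mark/drop probability rises linearly from 0 at a minimum threshold to `max_p` at a maximum threshold of the averaged queue length, and above the maximum every packet is acted on. The count-based probability adjustment of classic RED is left out, and all names and parameter values are illustrative.

```python
def ewma(avg: float, sample: float, weight: float) -> float:
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg + weight * sample

def red_probability(avg: float, min_th: float, max_th: float, max_p: float) -> float:
    """Probability of marking or dropping at the given average queue length."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)

def red_action(avg: float, min_th: float, max_th: float, max_p: float,
               ecn_capable: bool, draw: float) -> str:
    """Decide per packet; draw is a uniform random number in [0, 1)."""
    if draw >= red_probability(avg, min_th, max_th, max_p):
        return "enqueue"
    # ECN-capable traffic is marked instead of dropped.
    return "mark" if ecn_capable else "drop"

avg = 0.0
for queue_len in [40, 40, 40, 40]:
    avg = ewma(avg, queue_len, weight=0.5)
```

Passing the random draw in as an argument keeps the decision function deterministic and easy to test; a caller would normally supply `random.random()`.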

Page 28: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

TLS Acceleration

● Established TLS connection can be passed to kTLS

TX Path:

● NIC driver marks packets for crypto offload based on packet socket
● NIC performs encrypt and TX

RX Path:

● NIC performs decrypt and auth
● Notifies kTLS of queued data
● kTLS skips decrypt of plaintext
● Handle out-of-order

[Figure: Host (User Space: Application; Kernel: Linux, kTLS Module, Framing) and Offload NIC (Symmetric Crypto Offload, Encrypt/Decrypt). Record layouts shown: TLS Header, Record Payload (Ciphertext), Auth Hash; and TLS Header, Record Payload (Plaintext), 0000 0000, Auth Hash]

Page 29: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

IPsec Acceleration

Crypto Offload

● HW: Encrypt/Decrypt/Integrity/LSO/Checksum
● Kernel: Padding/Anti-replay/Counters/Security Policy DB
● User-Space: IKE

Full Offload

● HW: Replay/Encap/Decap/SPD/LSO/Checksum/LRO
● Kernel: IP fragmentation/Counters/Configuration
● User-Space: IKE

Page 30: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Programmability


Page 31: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Programmability

● Facilitates rapid protocol development
● Quickly fix bugs and security problems
● Two main types used today:
  ○ FPGA/NPU
  ○ General Purpose Processors
● Emerging trend: What is niche today can be broad tomorrow
  ○ IETF 104 "Forwarding Plane Realities"

Page 32: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Programmability with FPGA or NPU

● Control plane stays in host
● Flexible offload data plane controlled through kernel or user space
● Data plane could be expressed by P4, eBPF, NPL, or other native instruction set
● Dynamically programmed

[Figure: Host (User Space: Applications; Kernel: Linux) connected over PCIe to a User Programmable NIC (PHY0, PHY1); labels: Control/Data Plane, Software Datapath?, FPGA/NPU Data Plane]

Page 33: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

General Purpose Processor

● Move host software stack down to the NIC
● Dataplane offload to general purpose processor on NIC
● Control plane offload
  ○ Useful in bare metal or multi-tenant deployments
  ○ Network admin can control server networking
● No host resources consumed forwarding network traffic

Page 34: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Programmable NIC with General Purpose Processor

● Capable of running complete operating system
● Forwarding functionality moved completely away from server cores down to NIC

[Figure: Host (User Space: Applications; Kernel: Linux) and Programmable NIC (General Purpose Processor, PHY0, PHY1); labels: Software Datapath, Control/Data Plane]

Page 35: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Programmable NIC with General Purpose Processor

● Programmable NICs also have offload-capable devices

[Figure: Host (User Space: Applications; Kernel: Linux) and Programmable NIC (General Purpose Processors; FPGA/NPU/ASIC; PHY0, PHY1); labels: Offloaded Datapath, Software Datapath, Control/Data Plane]

Page 36: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Conclusion and Futures

● Networking trends
  ○ Insatiable need for more bandwidth and lower latency
  ○ Deployment of forward-looking IETF protocols
● NICs work with hosts to make this happen
  ○ Offloads will be relevant for the foreseeable future
  ○ Programmability and flexibility spur innovation

Page 37: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...

Thank You!
