Melbourne Tutorial – November 4-5, 2010 1
NetFPGA Workshop Day 1
Presented by: Glen Gibb (Stanford University)
Hosted by: Gavin Buskes at Melbourne University
September 15 - 16, 2010
http://NetFPGA.org
Tutorial Outline
• Background
  – Introduction
  – The NetFPGA Platform
• The Stanford Base Reference Router
  – Motivation: Basic IP review
  – Demo 1: Reference Router running on the NetFPGA
• The Enhanced Reference Router
  – Motivation: Understanding buffer size requirements in a router
  – Demo 2: Observing and controlling the queue size
• How does the NetFPGA work
  – Utilities
  – Reference Designs
  – Inside the NetFPGA Hardware
• The Life of a Packet Through the NetFPGA
  – Hardware Datapath
  – Interface to software: Exceptions and Host I/O
• Exercise: Drop Nth Packet
• Concluding Remarks
  – Using NetFPGA for research and teaching
Section I: Motivation
What is the NetFPGA?
A line-rate, flexible, open networking platform for teaching and research
NetFPGA consists of…
Four elements:
• NetFPGA board
• Tools + reference designs
• Contributed projects
• Community
NetFPGA Board
[Figure: a PC (CPU and memory) connected over PCI to the NetFPGA board (FPGA, memory, four 1GE ports)]
• PC with NetFPGA: networking software running on a standard PC
• NetFPGA board: a hardware accelerator built with a Field Programmable Gate Array driving Gigabit network links
Project              Contributor
OpenFlow switch      Stanford University
Packet generator     Stanford University
NetFlow Probe        Brno University
NetThreads           University of Toronto
zFilter (Sp)router   Ericsson
Traffic Monitor      University of Catania
DFA                  UMass Lowell
More projects: http://netfpga.org/foswiki/NetFPGA/OneGig/ProjectTable
Community
Wiki
• Documentation (slowly growing)
• Encourage users to contribute
Forums
• Support by users for users
• Active community – 10s to 100s of posts per week
NetFPGA’s Defining Characteristics
• Line-Rate
  – Processes back-to-back packets
    • Without dropping packets
    • At full rate of Gigabit Ethernet Links
  – Operating on packet headers
    • For switching, routing, and firewall rules
  – And packet payloads
    • For content processing and intrusion prevention
• Open-source Hardware
  – Similar to open-source software
    • Full source code available
    • BSD-Style License
  – But harder, because
    • Hardware modules must meet timing
    • Verilog & VHDL components have more complex interfaces
    • Hardware designers need high confidence in specification of modules
Test-Driven Design
• Regression tests
  – Have repeatable results
  – Define the supported features
  – Provide clear expectation on functionality
• Example: Internet Router
  – Drops packets with bad IP checksum
  – Performs Longest Prefix Matching on destination address
  – Forwards IPv4 packets of length 64-1500 bytes
  – Generates ICMP message for packets with TTL <= 1
  – Defines how packets with IP options or non IPv4
  … and dozens more …
Every feature is defined by a regression test
Who, How, Why
Who uses the NetFPGA?
– Teachers
– Students
– Researchers
How do they use the NetFPGA?
– To run the Router Kit
– To build modular reference designs
  • IPv4 router
  • 4-port NIC
  • Ethernet switch, …
Why do they use the NetFPGA?
– To measure performance of Internet systems
– To prototype new networking systems
What you will learn
• Overall picture of NetFPGA
• How reference designs work
• How you can work on a project
  – NetFPGA Design Flow
  – Directory Structure, library modules and projects
  – How to utilize contributed projects
    • Interface/Registers
  – How to verify a design (Simulation and Regression Tests)
  – Things to do when you get stuck
AND… You can start your own projects!
Section II: Demo Basic Use
Basic Uses of NetFPGA
• Recap Internet Protocol and Routing
• Demonstrate
  – How you can use the NetFPGA as a router
  – See routing in action
What is IP?
• IP (Internet Protocol)
  – Protocol used for communicating data across packet-switched networks
  – Divides data into a number of packets (IP packets)
• IP Packet
  – Header (IP Header) including:
    • Source IP address
    • Destination IP address
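IP Header
[Figure: a stream of packets, each made of a header (Hdr) followed by data; the header of one packet is expanded below]
IP header layout, one 32-bit row per line (bit positions 1, 4, 16 and 32 are marked in the figure; the fixed part is 20 bytes):
Ver | HLen | T.Service | Total Packet Length
Fragment ID | Flags | Fragment Offset
TTL | Protocol | Header Checksum
Source Address
Destination Address
Options (if any)
Data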
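The header fields listed above can be unpacked in a few lines of Python. This is a minimal sketch using the standard `struct` module; the function name `parse_ipv4_header` and the sample bytes are illustrative assumptions, not part of the tutorial materials.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the fixed 20-byte IPv4 header (any options are ignored)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes")
    (ver_hlen, tos, total_len, frag_id, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_hlen >> 4,             # Ver: upper 4 bits of first byte
        "hlen_bytes": (ver_hlen & 0x0F) * 4,  # HLen is counted in 32-bit words
        "tos": tos,                           # T.Service
        "total_length": total_len,            # Total Packet Length
        "fragment_id": frag_id,
        "flags": flags_frag >> 13,            # upper 3 bits
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,
        "header_checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hypothetical sample header: TTL 64, protocol 6 (TCP), 192.168.4.1 -> 192.168.1.2.
# The checksum field is left as zero here; it is not verified by this sketch.
sample = bytes.fromhex("45000054abcd40004006" "0000" "c0a80401" "c0a80102")
header = parse_ipv4_header(sample)
```

The `!` in the format string selects network (big-endian) byte order, which is how the fields appear on the wire.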
IP Address
• Used to uniquely identify a device (such as a computer) from all other devices on a network
  – Two parts
    • Identifier of a particular network on the Internet
    • Identifier of a particular device within a network
All packets, except those destined for the sender's own network, first go to their gateway (router) and are then forwarded to the destination via routers.
Basic Operation of an IP Router
[Figure: hosts A, B, C, D, E and F connected through routers R1 to R5; a packet addressed to D is shown in transit]
Forwarding table shown in the figure:
Destination   Next Hop
D             R3
E             R3
F             R5
What does a router do?
[Figure: the same topology (hosts A to F, routers R1 to R5) and forwarding table (D via R3, E via R3, F via R5), shown together with the IP header of the packet addressed to D; a second slide repeats the topology without the table]
Basic Components of an IP Router
[Figure: router functions divided between software and hardware]
• Software, Control Plane: Routing Protocols, Routing Table, Management & CLI
• Hardware, Datapath (per-packet processing): Switching, Forwarding Table
Per-packet processing in an IP Router
1. Accept packet arriving on an incoming link.
2. Lookup packet destination address in the forwarding table to identify outgoing port(s).
3. Manipulate IP header: e.g., decrement TTL, update header checksum.
4. Buffer packet in the output queue.
5. Transmit packet onto outgoing link.
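The header manipulation step (decrement TTL, update header checksum) can be sketched in Python as follows. This is an illustrative software model under stated assumptions, not the NetFPGA Verilog: the function names `ipv4_checksum` and `forward_header` are hypothetical, and packets that a real router would hand to software (for example to generate an ICMP message) are simply dropped here.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the header, taken as 16-bit words.

    To compute a new checksum, zero the checksum field first. Run over a
    header that already carries a valid checksum, the result is 0.
    """
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward_header(header: bytes):
    """Return the updated header, or None if the packet is not forwarded."""
    hdr = bytearray(header)
    if ipv4_checksum(bytes(hdr)) != 0:
        return None                         # bad IP checksum: drop
    if hdr[8] <= 1:
        return None                         # TTL <= 1: not forwarded in this sketch
    hdr[8] -= 1                             # decrement TTL (byte 8 of the header)
    hdr[10:12] = b"\x00\x00"                # zero the checksum field (bytes 10-11)
    hdr[10:12] = struct.pack("!H", ipv4_checksum(bytes(hdr)))
    return bytes(hdr)
```

Hardware implementations typically update the checksum incrementally rather than recomputing it over the whole header; the full recomputation here is chosen only for clarity.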
Generic Datapath Architecture
[Figure: packets (Hdr + Data) pass through Header Processing (Lookup IP Address, Update Header), which consults the Forwarding Table (IP Address, Next Hop); Queue Packet then places the packet in Buffer Memory]
CIDR and Longest Prefix Matches
The IP address space is broken into line segments. Each line segment is described by a prefix. A prefix is of the form x/y, where x indicates the prefix of all addresses in the line segment and y indicates the length of the prefix in bits, so the segment contains 2^(32−y) addresses.
e.g. The prefix 128.9/16 represents the line segment containing addresses in the range: 128.9.0.0 … 128.9.255.255.
[Figure: number line from 0 to 2^32−1 showing the segments 65/8, 128.9/16 (starting at 128.9.0.0, length 2^16) and 142.12/19, with the address 128.9.16.14 marked]
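The 128.9/16 example can be checked with Python's standard `ipaddress` module. This is a small sketch; the helper name `prefix_range` is an assumption, and note that `ipaddress` requires the fully written form `128.9.0.0/16` rather than the slide's shorthand `128.9/16`.

```python
import ipaddress

def prefix_range(prefix: str):
    """Return (first address, last address, number of addresses) of a prefix."""
    net = ipaddress.ip_network(prefix)
    return str(net[0]), str(net[-1]), net.num_addresses

first, last, size = prefix_range("128.9.0.0/16")

# The address marked on the slide falls inside this segment.
inside = ipaddress.ip_address("128.9.16.14") in ipaddress.ip_network("128.9.0.0/16")
```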
Classless Interdomain Routing (CIDR)
[Figure: number line from 0 to 2^32−1 with the nested prefixes 128.9/16, 128.9.16/20, 128.9.176/20, 128.9.19/24 and 128.9.25/24; the address 128.9.16.14 is marked]
Most specific route = “longest matching prefix”
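To illustrate "longest matching prefix", the sketch below scans a small table built from the prefixes on this slide and keeps the most specific match. The next-hop names (`"R1"` and so on) and the function name `longest_prefix_match` are made up for the example.

```python
import ipaddress

# Prefixes from the slide; the next hops are hypothetical placeholders.
TABLE = {
    "128.9.0.0/16": "R1",
    "128.9.16.0/20": "R2",
    "128.9.176.0/20": "R3",
    "128.9.19.0/24": "R4",
    "128.9.25.0/24": "R5",
}

def longest_prefix_match(table, address):
    """Linear scan: return (prefix, next_hop) of the longest matching prefix."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best

match = longest_prefix_match(TABLE, "128.9.16.14")
```

For 128.9.16.14 both 128.9/16 and 128.9.16/20 match, and the /20 wins because it is more specific.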
Techniques for LPM in hardware
• Linear search
  – Slow
• Direct lookup
  – Currently requires too much memory
  – Updating a prefix leads to many changes
• Tries
  – Deterministic lookup time
  – Easily pipelined but require multiple memories/references
• TCAM (Ternary CAM)
  – Simple and widely used but have lower density than RAM and need more power
  – Gradually being replaced by algorithmic methods
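Of the techniques listed, the trie is the easiest to show in software. Below is a minimal one-bit-per-level (binary) trie for IPv4, written as a sketch: the class names `TrieNode` and `PrefixTrie` and the next-hop labels are assumptions, and this is not how the NetFPGA reference router performs its lookup.

```python
import ipaddress

class TrieNode:
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]   # child for bit 0 and child for bit 1
        self.next_hop = None           # set only where a prefix ends

class PrefixTrie:
    """Binary trie over IPv4 address bits, most significant bit first."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix: str, next_hop: str) -> None:
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, address: str):
        """Walk at most 32 levels, remembering the last prefix seen."""
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node.next_hop
        for i in range(32):
            node = node.children[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop
        return best

trie = PrefixTrie()
trie.insert("128.9.0.0/16", "R1")      # hypothetical next hops
trie.insert("128.9.16.0/20", "R2")
trie.insert("128.9.19.0/24", "R4")
```

The lookup visits at most 32 nodes regardless of table size, which is the "deterministic lookup time" noted above; each level is one memory reference, which is why the slide mentions multiple memories/references.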
An IP Router on NetFPGA
[Figure: the router components mapped onto the platform, with Exception Processing also shown]
• Software, Linux user-level processes: Routing Protocols, Routing Table, Management & CLI
• Hardware, Verilog on NetFPGA PCI board: Switching, Forwarding Table
NetFPGA Router
Function
– 4 Gigabit Ethernet ports
Fully programmable
– FPGA hardware
Low cost
Open-source FPGA hardware
– Verilog base design
Open-source Software
– Drivers in C and C++
Demo 1
Reference Router running on the NetFPGA
Hardware Setup for Demo #1
[Figure: PCs (CPU x2, GE NIC on PCI-e) hosting NetFPGA boards on PCI, each board running the Internet Router Hardware with four GE ports; labels include Video Server, Video Display and CAD Tools]
Server delivers streaming HD video through a chain of NetFPGA Routers
Topology
[Figure: ring of NetFPGA routers with interface addresses labelled .1.1 through .30.2; the shortest path between the Video Server and the Video Client is highlighted]
Working IP Router
• Objectives
  – Become familiar with Stanford Reference Router
  – Observe PW-OSPF re-routing traffic around a failure
Step 1 – Observe the Routing Tables
The router is already configured and running on your machines
The routing table has converged to the routing decisions with minimum number of hops
Next, break a link …
Step 2 - Dynamic Re-routing
Any PC can stream traffic through multiple NetFPGA routers in the ring topology
Example: To stream video from server 4.1, type: ./play 192.168.4.1
[Figure: ring of numbered NetFPGA routers, each with a host PC attached via eth1 (192.168.X.Y). Key: NetFPGA Router #]
Step 3 - Dynamic Re-routing
Break the link between video server and video client
Routers re-route traffic around the broken link and video continues playing
[Figure: the ring topology with interface addresses labelled .1.1 through .30.2]
Section III: Demo Advanced Use
Advanced Uses of NetFPGA
• Introduction on TCP and Buffer Sizes
• Demonstrate
  – NetFPGA used for real-time measurement
  – See TCP sawtooth in real time
Buffer Requirements in a Router
Buffer size matters:
– Small queues reduce delay
– Large buffers are expensive
Theoretical tools predict requirements
– Queuing theory
– Large deviation theory
– Mean field theory
Yet, there is no direct answer
– Flows have a closed-loop nature
– Question arises on whether focus should be on equilibrium state or transient state
Rule-of-thumb
• Universally applied rule-of-thumb:
  – A router needs a buffer size: 2T × C
  – 2T is the two-way propagation delay (or just 250ms)
  – C is capacity of bottleneck link
• Context
  – Mandated in backbone and edge routers
  – Appears in RFPs and IETF architectural guidelines
  – Already known by inventors of TCP [Van Jacobson, 1988]
  – Has major consequences for router design
[Figure: Source, Router and Destination in a line; the bottleneck link has capacity C and the two-way propagation delay is 2T]
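As a worked example of the rule of thumb (buffer = 2T × C, the "2TxC" the slides refer to), the sketch below evaluates it for the 250 ms figure quoted above and a 10 Gb/s link. The function name `rule_of_thumb_buffer_bits` is hypothetical, and the 10 Gb/s capacity is an assumption chosen for illustration.

```python
def rule_of_thumb_buffer_bits(two_way_delay_s: float, capacity_bps: float) -> float:
    """Buffer size in bits under the 2T x C rule of thumb."""
    return two_way_delay_s * capacity_bps

# 2T = 250 ms, C = 10 Gb/s (assumed link speed for this example)
buffer_bits = rule_of_thumb_buffer_bits(0.250, 10e9)
buffer_megabytes = buffer_bits / 8 / 1e6
```

Here `buffer_bits` is 2.5 × 10^9 bits, or 312.5 megabytes of packet buffer for a single 10 Gb/s port, which is why the rule has major consequences for router design.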
The Story So Far
[Figure: buffer sizes in # packets at 10Gb/s: 1,000,000; 10,000; 20]
(1) Assume: Large number of desynchronized flows; 100% utilization
(2) Assume: Large number of desynchronized flows; <100% utilization
Exploring Buffer Sizes
• Need to reduce buffer size and measure occupancy
• Not possible in commercial routers
• So, we will use the NetFPGA instead
Objective:
– Use the NetFPGA to understand how large a buffer we need for a single TCP flow.
Why 2TxC for a single TCP Flow?
Rule for adjusting W
– If an ACK is received: W ← W+1/W
– If a packet is lost: W ← W/2
Only W packets may be outstanding
http://guido.appenzeller.net/anims/
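The window rule above produces the TCP sawtooth. The following sketch applies the rule one ACK at a time; the loss model (a packet is lost whenever W reaches a fixed threshold `loss_window`) is a deliberate simplification assumed for the example, standing in for a full bottleneck queue.

```python
def simulate_aimd(initial_window: float, loss_window: float, num_acks: int):
    """Return the trace of the window W over num_acks update steps.

    Hypothetical loss model: a loss occurs whenever W has reached
    loss_window; otherwise the step is treated as a received ACK.
    """
    w = float(initial_window)
    trace = [w]
    for _ in range(num_acks):
        if w >= loss_window:
            w = w / 2            # packet lost: W <- W/2
        else:
            w = w + 1.0 / w      # ACK received: W <- W + 1/W
        trace.append(w)
    return trace

trace = simulate_aimd(initial_window=10, loss_window=20, num_acks=2000)
halvings = sum(1 for a, b in zip(trace, trace[1:]) if b < a)
```

Plotting `trace` shows W climbing slowly from about 10 to 20 and then halving, over and over, which is the same shape the queue occupancy charts show later in the demo.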
Time Evolution of a Single TCP Flow
[Figure: Time evolution of a single TCP flow through a router. Buffer is < 2T*C]
[Figure: Time evolution of a single TCP flow through a router. Buffer is 2T*C]
Demo 2
Buffer Sizing Experiments using the NetFPGA Router
Hardware Setup for Demo #2
[Figure: Video Server and Video Client PCs (CPU x2, GE NICs on PCI-e) and a NetFPGA board on PCI running the Internet Router Hardware with GE ports]
Server delivers streaming HD video to adjacent client
Topology
• eth1 connects your host to your NetFPGA Router
• nf2c2 routes to nf2c1 (your adjacent server)
• eth2 serves web and video traffic to your neighbor
• nf2c0 & nf2c3 (the network ring) are unused
[Figure: pairs of hosts and routers with interface addresses labelled .1.1 through .29.2]
This configuration allows you to modify and test your router without affecting others
Enhanced Router
Objectives
– Observe router with new modules
– New modules: rate limiting, event capture
Execution
– Run event capture router
– Look at routing tables
– Explore details pane
– Start TCP transfer, look at queue occupancy
– Change rate, look at queue occupancy
Step 1 - Run Pre-made Enhanced Router
Start terminal and cd to “netfpga/projects/tutorial_router/sw/”
Type “./tut_adv_router_gui.pl”
A familiar GUI should start
Step 2 - Explore Enhanced Router
Click on the Details tab
A pipeline similar to the one seen previously is shown, with some additions
Enhanced Router Pipeline
Two modules added
1. Event Capture to capture output queue events (writes, reads, drops)
2. Rate Limiter to create a bottleneck
[Figure: pipeline with four MAC RxQ / CPU RxQ pairs, Input Arbiter, Output Port Lookup, Output Queues and four MAC TxQ / CPU TxQ pairs, plus the added Rate Limiter and Event Capture modules]
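The three output queue events that Event Capture records (writes, reads, drops) can be illustrated with a small drop-tail queue model. This is a software sketch only: the class name `DropTailQueue`, the event tuple format and the drop-tail policy are assumptions for the example, not a description of the hardware module.

```python
from collections import deque

class DropTailQueue:
    """Fixed-capacity FIFO that logs write, read and drop events."""

    def __init__(self, capacity_packets: int):
        self.capacity = capacity_packets
        self.packets = deque()
        self.events = []               # (event name, occupancy after the event)

    def write(self, packet) -> bool:
        """Enqueue a packet, or drop it if the queue is already full."""
        if len(self.packets) >= self.capacity:
            self.events.append(("drop", len(self.packets)))
            return False
        self.packets.append(packet)
        self.events.append(("write", len(self.packets)))
        return True

    def read(self):
        """Dequeue the oldest packet, or return None if the queue is empty."""
        if not self.packets:
            return None
        packet = self.packets.popleft()
        self.events.append(("read", len(self.packets)))
        return packet

queue = DropTailQueue(capacity_packets=2)
results = [queue.write("a"), queue.write("b"), queue.write("c")]
first_out = queue.read()
```

Plotting the occupancy values in `events` against time gives the kind of queue occupancy chart the demo displays.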
Step 3 - Decrease the Link Rate
To create a bottleneck and show the TCP “sawtooth,” the link rate is decreased.
In the Details tab, click the “Rate Limit” module
Check Enabled
Set link rate to 1.953Mbps
Step 4 – Decrease Queue Size
Go back to the Details panel and click on “Output Queues”
Select the “Output Queue 2” tab
Move the “output queue size in packets” slider to 16
Step 5 - Start Event Capture
Click on the Event Capture module under the Details tab
This should start the configuration page
Step 6 - Configure Event Capture
Check “Send to local host” to receive events on the local host
Check “Monitor Queue 2” to monitor the output queue of MAC port 1
Check “Enable Capture” to start event capture
Step 7 - Start TCP Transfer
We will use iperf to run a large TCP transfer and look at queue evolution
Start a terminal and cd to “netfpga/projects/tutorial_router/sw”
Type “./iperf.sh”
Step 8 - Look at Event Capture Results
Click on the Event Capture module under the Details tab.
The sawtooth pattern should now be visible.
Queue Occupancy Charts
Leave the control windows open
Observe the TCP/IP sawtooth
Section IV: How does the NetFPGA Work
Integrated Circuit Technology
Full-custom Design
– Complementary Metal Oxide Semiconductor (CMOS)
Semi-custom ASIC Design
– Gate array
– Standard cell

• PCs assembled from parts
  – Stanford University
  – Cambridge University
• Pre-built systems available
  – Accent Technology Inc.
• Details are in the Guide http://netfpga.org/static/guide.html
Rackmount NetFPGA Servers
NetFPGA inserts in PCI or PCI-X slot
2U Server (Dell 2950)
Thanks: Brian Cashman for providing machine
1U Server (Accent Technology Inc.)
Stanford NetFPGA Cluster
Statistics
• Rack of 40
  – 1U PCs with NetFPGAs
• Managed
  – Power
  – Console
  – LANs
• Provides 4*40=160 Gbps of full line-rate processing bandwidth
Acknowledgments
NetFPGA Team at University of Cambridge (Past and Present):
Andrew Moore, David Miller, Martin Zadnik
NetFPGA Team at Stanford University (Past and Present):
Nick McKeown, Glen Gibb, Jad Naous, David Erickson, G. Adam Covington, John W. Lockwood, Jianying Luo, Brandon Heller, Paul Hartke, Neda Beheshti, Sara Bolouki, James Zeng, Jonathan Ellithorpe, Sachidanandan Sambandan
All community members (including but not limited to):
Paul Rodman (Google), Kumar Sanghvi, Wojciech A. Koszek (Xilinx/FreeBSD), Yashar Ganjali (University of Toronto), Martin Labrecque (University of Toronto), Jeff Shafer (Rice University), Eric Keller (Princeton), Tatsuya Yabe (NEC/Stanford), Billal Anwer (Georgia Tech)
Special thanks to our Partners:
Patrick Lysaght, Veena Kumar, Paul Hartke, Anna Acevedo
Xilinx University Program (XUP)
Past NetFPGA Tutorial Presented At:
SIGMETRICS
See: http://NetFPGA.org/tutorials/
Thanks to our Sponsors:
• Support for the NetFPGA project has been provided by the following companies and institutions
Disclaimer: Any opinions, findings, conclusions, or recommendations expressed in these materials do not necessarily reflect the views of the National Science Foundation or of any other sponsors supporting this project.