1 of 20
Smart-NICs: Power Proxying for Reduced Power Consumption in Network Edge Devices
Karthikeyan Sabhanatarajan, Ann Gordon-Ross+, Mark Oden, Mukund Navada, Alan D. George+
High Performance Computing and Simulation Research Laboratory
Department of Electrical and Computer Engineering
University of Florida, Gainesville
This work was supported by the U.S. National Science Foundation
+ Also affiliated with NSF Center for High-Performance Reconfigurable Computing
2 of 20
Introduction
INTERNET
3 of 20
Introduction
• Connected edge devices account for 2% of the total power consumed in the US [EPA-06]
– 130 TWh/year
    • This is $13 billion at $0.10 per kWh
    • One single-unit nuclear power plant outputs 8 TWh/year
    • Translates to 16 single-unit nuclear power plants!
• Why so much power?
– PCs can consume up to 200 W
– 1 billion PCs worldwide by 2010 [Kanellos-04]
• What can we do?
– PCs are idle 75% of the time [Purushothaman-06]
– But only 10% of PCs are allowed to sleep during that time [EPA-06]
– Sleeping reduces power consumption by 80% or more
– If PCs were allowed to sleep, only 3 single-unit nuclear power plants would be required
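The headline figures on this slide can be checked with a few lines of arithmetic. The sketch below only uses the numbers quoted on the slide (130 TWh/year, $0.10 per kWh, 8 TWh/year per plant); the function and constant names are made up for illustration. At these inputs the yearly cost works out to $13 billion and the plant count to roughly 16.

```python
# Back-of-the-envelope check of the energy figures quoted on this slide.
ANNUAL_ENERGY_TWH = 130      # yearly consumption of connected edge devices [EPA-06]
PRICE_USD_PER_KWH = 0.10     # electricity price used on the slide
PLANT_OUTPUT_TWH = 8         # yearly output of one single-unit nuclear plant

KWH_PER_TWH = 1_000_000_000  # 1 TWh = 10^9 kWh


def annual_cost_usd(energy_twh, price_per_kwh):
    """Yearly electricity cost in US dollars."""
    return energy_twh * KWH_PER_TWH * price_per_kwh


def plants_needed(energy_twh, plant_output_twh):
    """Number of plants whose combined output equals the given energy."""
    return energy_twh / plant_output_twh


cost = annual_cost_usd(ANNUAL_ENERGY_TWH, PRICE_USD_PER_KWH)
plants = plants_needed(ANNUAL_ENERGY_TWH, PLANT_OUTPUT_TWH)
print(f"Annual cost: ${cost / 1e9:.1f} billion")
print(f"Equivalent single-unit plants: {plants:.2f}")
```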
Question: Why aren’t these PCs asleep?!?!
4 of 20
Maintaining Network Connectivity
[Figure: Alice sends a FILE QUERY PACKET across the Internet to Bob, whose idle PC is asleep ("Zzzz"); a FILE RESPONSE PACKET is the expected reply. Labels: INTERNET, IDLE, GNUTELLA FILE SHARING APPLICATION, Alice, Bob.]
Alice checks to see if Bob has a file needed for P2P file sharing
Problem: PC must be awake to maintain network connectivity
5 of 20
A Solution – Power Proxying
• Primary challenge is to maintain network connectivity while the PC is powered down to standby mode (sleeping)
• Some packets do not require a complex response
– Automated responses are sufficient
– Network Interface Card (NIC) can act as a proxy for the PC
– Allow the PC to sleep while the NIC services packets with automated responses
– A technique known as power proxying
– We call such a NIC a “Smart-NIC” (SNIC)
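6 of 20
Power Proxying
[Figure: Alice sends a FILE QUERY PACKET across the Internet to Bob, whose idle PC is asleep ("Zzzz"); a FILE RESPONSE PACKET is returned. Labels: INTERNET, IDLE, GNUTELLA FILE SHARING APPLICATION, Alice, Bob.]
PC delegates power to the SNIC to handle the network traffic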
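7 of 20
Power Proxying
[Figure: three kinds of packets arrive from the Internet at the SNIC of Bob's idle, sleeping PC ("Zzzz"): a Proxiable Packet, a Chatter Packet, and a Non-Proxiable/Wake-up Packet. Responses are labeled as coming from the SNIC and from Bob.]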
8 of 20
What to Proxy? - Proxiable Protocols
• Proxiable protocols - network protocols amenable to proxying
– Responses may be automated
– Keep-alive packets, IP conflict avoidance, etc.
Four categories of proxiable packets:
• ARP (Address Resolution Protocol)
• ICMP (Internet Control Message Protocol)
• TCP (Transmission Control Protocol)
• UDP (User Datagram Protocol)
[Figure: an idle, sleeping PC ("Zzzz") with example proxiable exchanges: ARP query / ARP response, ping / ping response, P2P file query / P2P response, mail notification.]
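As an illustration of what an "automated response" can look like, the sketch below builds an ICMP echo reply (ping response) from an echo request, which is one of the example exchanges named on this slide. It is not taken from the SNIC implementation: the function names are hypothetical, it handles only the ICMP message (a real proxy would also rewrite the IP and Ethernet headers), and it does not validate the checksum of the incoming request.

```python
import struct

ICMP_ECHO_REQUEST = 8
ICMP_ECHO_REPLY = 0


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by ICMP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def proxy_ping(icmp_request: bytes):
    """Return the echo-reply bytes for an echo request, or None if not a ping."""
    if len(icmp_request) < 8:
        return None
    icmp_type, code, _old_checksum, ident, seq = struct.unpack(
        "!BBHHH", icmp_request[:8])
    if icmp_type != ICMP_ECHO_REQUEST or code != 0:
        return None
    payload = icmp_request[8:]               # the reply echoes the payload back
    header = struct.pack("!BBHHH", ICMP_ECHO_REPLY, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REPLY, 0, checksum, ident, seq) + payload


# Usage: a request with identifier 0x1234, sequence 1, and a 5-byte payload.
request = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, 0x1234, 1) + b"hello"
reply = proxy_ping(request)
```

The reply needs no information beyond the request itself, which is what makes ping a natural candidate for proxying.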
9 of 20
Power Proxying Operation
1. PC decides to sleep
2. PC offloads power proxy rules to the SNIC
3. PC sleeps and SNIC proxy is activated
4. Packet arrives
5. Header inspection
6. Rule checking: header fields (source addr, source port, dest port) are compared against the rules
7. Match?
   7(a) No (not chatter): wake up PC
   7(b) No (chatter): discard
   7(c) Yes: invoke app handler (App ID)
8. Determine response
9. Proxied response
[Figure: the SNIC contains a Packet Classifier, which holds the rules, and an Application Handler; each packet is shown as header plus payload. Other labels: IDLE, "zzz", Response, SW, HW or SW?]
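Steps 5 through 7 of the operation above can be sketched as a small decision function. This is a minimal sketch, not the SNIC's implementation: the `Rule` and `Header` types, the use of `None` as a wildcard, and the `is_chatter` predicate are all assumptions made for illustration, since the slides do not say how chatter is recognized or how rules are encoded. The port numbers in the usage example are likewise illustrative.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    WAKE_PC = "wake up PC"          # step 7(a)
    DISCARD = "discard"             # step 7(b)
    PROXY = "invoke app handler"    # step 7(c)


@dataclass(frozen=True)
class Header:
    source_addr: str
    source_port: int
    dest_port: int


@dataclass(frozen=True)
class Rule:
    source_addr: Optional[str]      # None acts as a wildcard (assumption)
    source_port: Optional[int]
    dest_port: Optional[int]
    app_id: int                     # identifies the application handler

    def matches(self, header: Header) -> bool:
        pairs = (
            (self.source_addr, header.source_addr),
            (self.source_port, header.source_port),
            (self.dest_port, header.dest_port),
        )
        return all(want is None or want == got for want, got in pairs)


def classify(header, rules, is_chatter):
    """Return (action, app_id) for one inspected header."""
    for rule in rules:
        if rule.matches(header):
            return Action.PROXY, rule.app_id
    if is_chatter(header):          # hypothetical predicate supplied by caller
        return Action.DISCARD, None
    return Action.WAKE_PC, None


# Usage: one rule for a file-sharing application listening on port 6346.
rules = [Rule(source_addr=None, source_port=None, dest_port=6346, app_id=1)]
chatter = lambda h: h.dest_port in {137, 138}   # illustrative chatter ports

print(classify(Header("10.0.0.5", 40000, 6346), rules, chatter))
print(classify(Header("10.0.0.5", 40000, 137), rules, chatter))
print(classify(Header("10.0.0.5", 40000, 22), rules, chatter))
```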
10 of 20
Packet Classifier Requirements
PC-BASED CLASSIFIER | ROUTER-BASED CLASSIFIER
1) Must sustain link rates of 10/100/1000/10000 Mbps | 1) Must sustain link rates of 10/100/1000/10000 Mbps
2) No packet loss allowed | 2) No packet loss allowed
3) Operates only during system inactivity | 3) Continual operation
4) Processes only packets addressed to a particular destination, plus broadcast/multicast packets | 4) Processes packets to any destination
5) Limited processing resources - processors clocked in the MHz range | 5) Processors clocked in the GHz range
6) Limited number of rules, directly dependent on the number of proxiable applications running | 6) Larger number of rules with a wide complexity range
7) Packets match only one rule - rules are disjoint | 7) Packets can match multiple rules
11 of 20
Packet Classifier - SW vs. HW
SOFTWARE CLASSIFIER | HARDWARE CLASSIFIER
1) Limited operating frequency, between 66 MHz and 400 MHz | 1) Custom hardware can be designed for the required frequency
2) Cannot meet the network throughput demands, even with the fastest packet classification algorithms | 2) Can easily meet the network throughput demands
3) High power even during idle periods | 3) Comparatively lower power
12 of 20
Custom HW Packet Classification
[Figure: an incoming packet (from the MAC core) enters the Header Processor, which extracts the Source IP, Source Port, and Dest Port fields. Each field is looked up in its own CAM holding the rules: Source IP Address CAM, Source Port CAM, and Dest Port CAM. CAM outputs are labeled Match, Match Address, and MultiMatch. Other labels: OR, Match ID, Packet Class, Application ID. A match invokes the application handler.]
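The behaviour of a CAM-based lookup can be modelled in software as follows. This is a toy model under stated assumptions, not the hardware design: the slide does not show how the per-field CAM outputs are combined, so the sketch assumes rule i is stored at address i in each of the three CAMs and that a packet matches rule i only when all three CAMs hit at that address. The class and function names are hypothetical.

```python
class Cam:
    """Toy content-addressable memory: search by value, get back addresses."""

    def __init__(self, depth):
        self.entries = [None] * depth        # one stored value per address

    def write(self, address, value):
        self.entries[address] = value

    def match_vector(self, key):
        """One flag per address, True where the stored value equals the key."""
        return [stored is not None and stored == key for stored in self.entries]


def load_rules(rules):
    """rules: list of (source_ip, source_port, dest_port); rule i -> address i."""
    cams = [Cam(len(rules)) for _ in range(3)]   # source IP, source port, dest port
    for address, fields in enumerate(rules):
        for cam, value in zip(cams, fields):
            cam.write(address, value)
    return cams


def cam_classify(cams, header_fields):
    """Return (match, match_id, multi_match) for one packet header."""
    vectors = [cam.match_vector(f) for cam, f in zip(cams, header_fields)]
    # Assumed combination: a rule matches only if every field CAM hits it.
    hits = [addr for addr, flags in enumerate(zip(*vectors)) if all(flags)]
    return bool(hits), (hits[0] if hits else None), len(hits) > 1


# Usage: two rules that share a source port but differ in the other fields.
cams = load_rules([("10.0.0.5", 40000, 6346), ("10.0.0.9", 40000, 25)])
print(cam_classify(cams, ("10.0.0.9", 40000, 25)))
print(cam_classify(cams, ("10.0.0.9", 40000, 6346)))
```

In hardware all entries of a CAM are compared in parallel in a single lookup, whereas this model scans them in a loop; only the input/output behaviour is meant to correspond.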
13 of 20
Packet Classifier Placement
[Figure: block diagram of the NIC datapath. Labels: From PHY, MAC Core, Rx FIFO, Tx FIFO, Packet Classifier, Packet Descriptor FIFO, uP, Application Handler, Response.]
No change to critical path
14 of 20
Experimental Setup
• Software packet classifier
– Implemented on the RiceNIC platform using a PowerPC 405
    • RiceNIC is a programmable NIC
– PowerPC clocked at 300 MHz and 100 MHz
• Hardware packet classifier
– Xilinx IP cores to generate CAMs as block memory
– Prototyped in Verilog HDL
– System implemented and simulated using Xilinx ISE 9.1 and ModelSim
– Clocked at 1.25 MHz, 12.5 MHz, and 125 MHz, corresponding to 10 Mbps, 100 Mbps, and 1000 Mbps
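The pairing of clock frequencies and link rates in the last bullet follows a fixed ratio, which the short check below makes explicit. Reading that ratio as a datapath that consumes 8 bits per clock cycle is an inference from the numbers, not something the slides state.

```python
# Link rates and the classifier clock used for each, as listed on the slide.
LINK_RATES_MBPS = [10, 100, 1000]
CLOCKS_MHZ = [1.25, 12.5, 125]

# Mbps divided by MHz gives the bits that must be handled per clock cycle.
bits_per_cycle = [rate / clock for rate, clock in zip(LINK_RATES_MBPS, CLOCKS_MHZ)]

for rate, clock, bits in zip(LINK_RATES_MBPS, CLOCKS_MHZ, bits_per_cycle):
    print(f"{rate:>4} Mbps at {clock:>6} MHz -> {bits:.0f} bits per cycle")
```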