What’s needed to transmit? A look at the minimum steps required for programming our 82573L nic to send packets


Jan 19, 2016


Ajay Surana

Transcript
Page 1: What’s needed to transmit?

What’s needed to transmit?

A look at the minimum steps required for programming our 82573L nic to send packets

Page 2: What’s needed to transmit?

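Typical NIC hardware

[Figure: block diagram. The CPU and main memory (which holds the packet buffer) connect over the bus to the nic; the nic contains a TX FIFO, an RX FIFO, and a transceiver attached to the LAN cable.]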

Page 3: What’s needed to transmit?

Quotation

Many companies do an excellent job of providing information to help customers use their products... but in the end there's no substitute for real-life experiments: putting together the hardware, writing the program code, and watching what happens when the code executes. Then when the result isn't as expected -- as it often isn't -- it means trying something else or searching the documentation for clues.

-- Jan Axelson, author, Lakeview Research (1998)

Page 4: What’s needed to transmit?

Thanks, Intel!☻

• Intel Corporation has kindly posted details online for programming its family of gigabit Ethernet controllers – includes our 82573L

Page 5: What’s needed to transmit?

Our ‘nictx.c’ module

• We’ve created an LKM which has minimal functionality – enough to be sure we know how to ‘transmit’ a raw Ethernet packet – but we do this in a forward-looking way so that our source-code can later be turned into a Linux character-mode device-driver (once we’ve also seen how to write code which allows our nic to ‘receive’ packets)

Page 6: What’s needed to transmit?

Access to PRO1000 registers

• Device registers are hardware mapped to a range of addresses in physical memory

• We obtain the location (and the length) of this memory-range from a BAR register in the nic device’s PCI Configuration Space

• Then we request the Linux kernel to set up an I/O ‘remapping’ of this memory-range to ‘virtual’ addresses within kernel-space
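The value read from a memory BAR is not a bare address: its low-order bits are flag bits, and the length of the range is discovered by writing all ones to the BAR and reading it back. The sketch below shows that decoding in ordinary user-space C. The function names are hypothetical and the example values are made up; inside a kernel module one would normally let helpers such as pci_resource_start(), pci_resource_len() and ioremap() do this work.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 of a BAR is 0 for a memory-space range, 1 for an I/O-space range. */
static inline int bar_is_memory(uint32_t bar)
{
    return (bar & 0x1u) == 0;
}

/* For a 32-bit memory BAR, bits 3..0 are flags, so mask them off. */
static inline uint32_t bar_mem_base(uint32_t bar)
{
    assert(bar_is_memory(bar));
    return bar & 0xFFFFFFF0u;
}

/*
 * 'readback' is the value the BAR returns after 0xFFFFFFFF was written
 * to it. The address bits that stay zero give the size of the range.
 */
static inline uint32_t bar_mem_size(uint32_t readback)
{
    return ~(readback & 0xFFFFFFF0u) + 1u;
}
```

For instance, a read-back value of 0xFFFE0000 corresponds to a 128 KB register range.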

Page 7: What’s needed to transmit?

Tx-Desc Ring-Buffer

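[Figure: the Tx-descriptor ring, a circular buffer (128-bytes minimum) of 16-byte descriptors drawn at offsets 0x00 through 0x80. TDBA holds the ring's base-address and TDLEN its length (in bytes). TDH (head) and TDT (tail) separate the descriptors owned by hardware (nic) from those owned by software (cpu).]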
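The ring's index arithmetic can be sketched in a few lines. This is only an illustration with hypothetical function names; it assumes TDLEN is a nonzero multiple of 128 bytes (consistent with the 128-byte minimum shown above) and that the descriptors from the head up to, but not including, the tail are the ones currently owned by the nic.

```c
#include <assert.h>
#include <stdint.h>

enum { TX_DESC_SIZE = 16 };   /* each descriptor occupies 16 bytes */

/* Number of descriptors in a ring that is 'tdlen' bytes long. */
static inline uint32_t ring_count(uint32_t tdlen)
{
    assert(tdlen >= 128 && tdlen % 128 == 0);   /* assumed constraint */
    return tdlen / TX_DESC_SIZE;
}

/* Index that follows 'i', wrapping back to 0 at the end of the ring. */
static inline uint32_t ring_next(uint32_t i, uint32_t count)
{
    return (i + 1) % count;
}

/* How many descriptors lie between head and tail (owned by the nic). */
static inline uint32_t ring_owned_by_nic(uint32_t head, uint32_t tail,
                                         uint32_t count)
{
    return (tail + count - head) % count;
}
```

With the minimum 128-byte ring there are 8 descriptors, so a tail of 7 advances to 0, and head == tail means the nic owns nothing.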

Page 8: What’s needed to transmit?

How ‘transmit’ works

[Figure: Random Access Memory holds a List of Buffer-Descriptors (descriptor0 through descriptor3), each one pointing to its own packet buffer (Buffer0 through Buffer3).]

We set up each data-packet that we want to be transmitted in a ‘Buffer’ area in ram

We also create a list of buffer-descriptors and inform the NIC of its location and size

Then, when ready, we tell the NIC to ‘Go!’ (i.e., start transmitting), but to let us know when these transmissions are ‘Done’
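The hand-off between software and the nic can be modelled entirely in software. The sketch below is a toy model, not driver code: the structure layout is simplified, all names are hypothetical, and model_nic_run() merely stands in for what the hardware would do on its own. It shows the two halves of the protocol: software fills a descriptor and advances the tail, then the (modelled) nic consumes descriptors from the head and marks each one ‘Done’.

```c
#include <assert.h>
#include <stdint.h>

enum { N_DESC = 8, CMD_EOP = 1 << 0, CMD_RS = 1 << 3, STA_DD = 1 << 0 };

struct model_desc {            /* simplified stand-in for a Tx-descriptor */
    const void *buffer;        /* where the packet data lives             */
    uint16_t    length;        /* number of bytes to send                 */
    uint8_t     cmd;           /* command bits, set by software           */
    uint8_t     status;        /* status bits, written back by the nic    */
};

struct model_ring {
    struct model_desc desc[N_DESC];
    uint32_t head;             /* next descriptor the nic will process    */
    uint32_t tail;             /* next descriptor software will fill      */
    uint32_t bytes_sent;       /* the model's substitute for the wire     */
};

/* Software side: describe one packet, then hand it over via the tail. */
static inline void model_queue(struct model_ring *r, const void *buf,
                               uint16_t len)
{
    struct model_desc *d = &r->desc[r->tail];

    assert((r->tail + 1) % N_DESC != r->head);  /* ring must not be full */
    d->buffer = buf;
    d->length = len;
    d->cmd    = CMD_EOP | CMD_RS;   /* whole packet, report when done */
    d->status = 0;
    r->tail   = (r->tail + 1) % N_DESC;         /* the 'Go!' step */
}

/* Modelled hardware side: process everything from head up to tail. */
static inline void model_nic_run(struct model_ring *r)
{
    while (r->head != r->tail) {
        struct model_desc *d = &r->desc[r->head];

        r->bytes_sent += d->length;
        if (d->cmd & CMD_RS)
            d->status |= STA_DD;                /* the 'Done' report */
        r->head = (r->head + 1) % N_DESC;
    }
}
```

After model_queue() the tail is one ahead of the head; after model_nic_run() the two are equal again and the descriptor's status has its ‘Done’ bit set.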

Page 9: What’s needed to transmit?

Allocating kernel-memory

• Our 82573L device-driver will need to use a segment of contiguous physical memory which is cache-aligned and non-pageable

• Such a memory-block can be allocated by using the kernel’s ‘kzalloc()’ function (and it can later be deallocated using ‘kfree()’)

• You should use the ‘GFP_KERNEL’ flag (and we also used the ‘GFP_DMA’ flag)
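Before calling the allocator it helps to decide how the single memory-block will be divided up. The sketch below computes one possible layout, with the descriptor ring at the start of the block and the packet buffers following it. The descriptor count, the buffer size, and the function names are all assumptions chosen for illustration; the resulting total is the size one would pass to ‘kzalloc()’.

```c
#include <assert.h>
#include <stddef.h>

enum {
    N_TX_DESC  = 8,      /* hypothetical: 8 x 16 bytes = a 128-byte ring */
    DESC_BYTES = 16,     /* size of one Tx-descriptor                    */
    BUF_BYTES  = 2048,   /* hypothetical size of one packet buffer       */
};

/* Total bytes needed for the ring plus one buffer per descriptor. */
static inline size_t kmem_bytes_needed(void)
{
    return (size_t)N_TX_DESC * DESC_BYTES + (size_t)N_TX_DESC * BUF_BYTES;
}

/* Offset of descriptor 'i' from the start of the block. */
static inline size_t desc_offset(unsigned i)
{
    assert(i < N_TX_DESC);
    return (size_t)i * DESC_BYTES;
}

/* Offset of packet buffer 'i'; the buffers follow the ring. */
static inline size_t buf_offset(unsigned i)
{
    assert(i < N_TX_DESC);
    return (size_t)N_TX_DESC * DESC_BYTES + (size_t)i * BUF_BYTES;
}
```

Placing the ring first keeps it aligned to whatever alignment the allocator gives the block itself.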

Page 10: What’s needed to transmit?

NIC registers (for transmit)

enum {
    E1000_CTRL   = 0x0000, // Device Control
    E1000_STATUS = 0x0008, // Device Status
    E1000_TCTL   = 0x0400, // Transmit Control
    E1000_TDBAL  = 0x3800, // Tx-Descriptor Base-Address Low
    E1000_TDBAH  = 0x3804, // Tx-Descriptor Base-Address High
    E1000_TDLEN  = 0x3808, // Tx-Descriptor queue Length
    E1000_TDH    = 0x3810, // Tx-Descriptor Head
    E1000_TDT    = 0x3818, // Tx-Descriptor Tail
    E1000_TXDCTL = 0x3828, // Tx-Descriptor Control
    E1000_RA     = 0x5400, // Receive-address Array
};
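These enum values are byte offsets from the start of the remapped register range, and each register is 32 bits wide. The sketch below shows the offset arithmetic using an ordinary array as a stand-in for the device. The helper names and the REG_SPACE_BYTES constant are hypothetical; in a kernel module the base pointer would come from the I/O remapping, and the kernel's own accessor functions would normally be preferred over a plain pointer dereference.

```c
#include <assert.h>
#include <stdint.h>

enum { E1000_TCTL = 0x0400, E1000_TDH = 0x3810, E1000_TDT = 0x3818 };
enum { REG_SPACE_BYTES = 0x6000 };   /* hypothetical; covers 0x5400 */

/* Store a 32-bit value at byte offset 'offset' in the register space. */
static inline void reg_write(volatile uint32_t *regs, uint32_t offset,
                             uint32_t value)
{
    assert(offset % 4 == 0 && offset < REG_SPACE_BYTES);
    regs[offset / 4] = value;
}

/* Fetch the 32-bit value at byte offset 'offset'. */
static inline uint32_t reg_read(volatile uint32_t *regs, uint32_t offset)
{
    assert(offset % 4 == 0 && offset < REG_SPACE_BYTES);
    return regs[offset / 4];
}
```

The ‘volatile’ qualifier keeps the compiler from caching or eliminating the accesses, which matters once the pointer refers to real device registers.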

Page 11: What’s needed to transmit?

Device Control (0x0000)

Bit:    31      30   29   28    27    26   25-21  20        19   18           17-16
Field:  PHYRST  VME  R=0  TFCE  RFCE  RST  R=0    ADVD3WUC  R=0  D/UD status  R=0

Bit:    15-13  12       11      10   9-8    7    6    5-4  3    2      1    0
Field:  R=0    FRCDPLX  FRCSPD  R=0  SPEED  R=0  SLU  R=0  R=1  GIOMD  R=0  FD

FD = Full-Duplex
GIOMD = GIO Master Disable
SLU = Set Link Up
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
FRCSPD = Force Speed
FRCDPLX = Force Duplex
D/UD = Dock/Undock status
ADVD3WUC = Advertise Cold Wake Up Capability
RST = Device Reset
RFCE = Rx Flow-Control Enable
TFCE = Tx Flow-Control Enable
VME = VLAN Mode Enable
PHYRST = Phy Reset

82573L
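A few of these fields can be expressed as bit-masks and combined into a register value. The sketch below is illustrative only: the constant and function names are hypothetical, and the bit positions are the ones read off the layout on this slide (with RST at bit 26, the bit the programming steps later toggle for a reset).

```c
#include <assert.h>
#include <stdint.h>

enum {
    CTRL_FD          = 1 << 0,    /* Full-Duplex             */
    CTRL_SLU         = 1 << 6,    /* Set Link Up             */
    CTRL_FRCSPD      = 1 << 11,   /* Force Speed             */
    CTRL_FRCDPLX     = 1 << 12,   /* Force Duplex            */
    CTRL_RST         = 1 << 26,   /* Device Reset            */
    CTRL_SPEED_SHIFT = 8,         /* SPEED occupies bits 9-8 */
};

/* Replace the 2-bit SPEED field: 0=10Mbps, 1=100Mbps, 2=1000Mbps. */
static inline uint32_t ctrl_set_speed(uint32_t ctrl, unsigned code)
{
    assert(code <= 2);                        /* code 3 is reserved */
    ctrl &= ~(3u << CTRL_SPEED_SHIFT);
    return ctrl | ((uint32_t)code << CTRL_SPEED_SHIFT);
}

/* Value to write when requesting a device reset. */
static inline uint32_t ctrl_request_reset(uint32_t ctrl)
{
    return ctrl | (uint32_t)CTRL_RST;
}
```

For example, full-duplex with link-up at 1000Mbps works out to 0x00000241 under these assumptions.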

Page 12: What’s needed to transmit?

Device Status (0x0008)

Bit:    31  30-20  19             18-16
Field:  ?   0      GIO Master EN  0

Bit:    15-11  10     9-8   7-6    5  4      3-2          1   0
Field:  0      PHYRA  ASDV  SPEED  0  TXOFF  Function ID  LU  FD

FD = Full-Duplex
LU = Link Up
TXOFF = Transmission Paused
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
ASDV = Auto-negotiation Speed Detection Value
PHYRA = PHY Reset Asserted

82573L

some undocumented functionality?
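Decoding a value read from this register is a matter of shifting and masking. The helpers below are a sketch with hypothetical names, using the bit positions from the layout on this slide (FD at bit 0, LU at bit 1, SPEED at bits 7-6).

```c
#include <stdint.h>

/* Nonzero when the FD (Full-Duplex) bit is set. */
static inline int status_full_duplex(uint32_t status)
{
    return (int)(status & 1u);
}

/* Nonzero when the LU (Link Up) bit is set. */
static inline int status_link_up(uint32_t status)
{
    return (int)((status >> 1) & 1u);
}

/* Link speed in Mbps, or -1 for the reserved encoding. */
static inline int status_speed_mbps(uint32_t status)
{
    switch ((status >> 6) & 3u) {
    case 0:  return 10;
    case 1:  return 100;
    case 2:  return 1000;
    default: return -1;
    }
}
```

A status value of 0x83, for instance, would decode as link up, full-duplex, 1000Mbps.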

Page 13: What’s needed to transmit?

Transmit Control (0x0400)

Bit:    31-29  28    27-26    25      24    23   22      21-16
Field:  R=0    MULR  TXCSCMT  UNORTX  RTLC  R=0  SWXOFF  COLD (upper 6-bits) (COLLISION DISTANCE)

Bit:    15-12                                     11-4                      3    2    1   0
Field:  COLD (lower 4-bits) (COLLISION DISTANCE)  CT (COLLISION THRESHOLD)  PSP  R=0  EN  R=0

EN = Transmit Enable
PSP = Pad Short Packets
CT = Collision Threshold (=0xF)
COLD = Collision Distance (=0x3F)
SWXOFF = Software XOFF Transmission
RTLC = Retransmit on Late Collision
UNORTX = Underrun No Re-Transmit
TXCSCMT = TxDescriptor Minimum Threshold
MULR = Multiple Request Support

82573L
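Using the values suggested in the legend (CT = 0xF and COLD = 0x3F), a Transmit Control value can be assembled as shown in this sketch. The names are hypothetical and the bit positions are those of the layout above; the ‘enable’ parameter lets the same helper produce the value both before and after the Transmit Engine is switched on.

```c
#include <assert.h>
#include <stdint.h>

enum {
    TCTL_EN         = 1 << 1,   /* Transmit Enable               */
    TCTL_PSP        = 1 << 3,   /* Pad Short Packets             */
    TCTL_CT_SHIFT   = 4,        /* CT occupies bits 11-4         */
    TCTL_COLD_SHIFT = 12,       /* COLD occupies bits 21-12      */
};

/* Build a TCTL value with short-packet padding always requested. */
static inline uint32_t tctl_value(uint32_t ct, uint32_t cold, int enable)
{
    uint32_t v = TCTL_PSP;

    assert(ct <= 0xFFu && cold <= 0x3FFu);   /* 8-bit and 10-bit fields */
    v |= ct << TCTL_CT_SHIFT;
    v |= cold << TCTL_COLD_SHIFT;
    if (enable)
        v |= TCTL_EN;
    return v;
}
```

With CT = 0xF and COLD = 0x3F this gives 0x0003F0F8 while disabled and 0x0003F0FA once EN is added.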

Page 14: What’s needed to transmit?

Tx-Descriptor Control (0x3828)

Bit:    31-25  24    23-22  21-16
Field:  0      GRAN  0      WTHRESH (Writeback Threshold)

Bit:    15-14  13-8                      7-6  5-0
Field:  0      HTHRESH (Host Threshold)  0    PTHRESH (Prefetch Threshold)

Recommended for 82573: 0x01010000 (GRAN=1, WTHRESH=1)

“This register controls the fetching and write back of transmit descriptors. The three threshold values are used to determine when descriptors are read from, and written to, host memory. Their values can be in units of cache lines or of descriptors (each descriptor is 16 bytes), based on the value of the GRAN bit (0=cache lines, 1=descriptors). When GRAN = 1, all descriptors are written back (even if not requested).” --Intel manual
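The recommended value can be checked by assembling it from its fields. The helper below is a sketch with a hypothetical name; it assumes GRAN at bit 24 and three 6-bit threshold fields starting at bits 16, 8 and 0, which is what the recommended value 0x01010000 (GRAN=1, WTHRESH=1) implies.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the granularity bit and the three thresholds into TXDCTL. */
static inline uint32_t txdctl_value(unsigned gran, unsigned wthresh,
                                    unsigned hthresh, unsigned pthresh)
{
    assert(gran <= 1);
    assert(wthresh <= 0x3F && hthresh <= 0x3F && pthresh <= 0x3F);
    return ((uint32_t)gran    << 24) |   /* GRAN: 1 = descriptors */
           ((uint32_t)wthresh << 16) |   /* Writeback Threshold   */
           ((uint32_t)hthresh <<  8) |   /* Host Threshold        */
            (uint32_t)pthresh;           /* Prefetch Threshold    */
}
```

Calling txdctl_value(1, 1, 0, 0) reproduces the recommended 0x01010000.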

Page 15: What’s needed to transmit?

An observation

• We notice that the 82573L device retains the values in many of its internal registers

• This fact reduces the programming steps that will be required to operate our nic on the anchor cluster machines, since Intel’s own Linux device driver (‘e1000e.ko’) has already initialized many nic registers

• But we MAY need to bring ‘eth1’ down!

Page 16: What’s needed to transmit?

Using ‘/sbin/ifconfig’

• You can use the ‘/sbin/ifconfig’ command to find out whether the ‘eth1’ interface has been brought ‘down’:

$ /sbin/ifconfig eth1

• If it is still operating, you can turn it off with the (privileged) command:

$ sudo /sbin/ifconfig eth1 down

Page 17: What’s needed to transmit?

Programming steps

1) Detect the presence of the 82573L network controller (VENDOR_ID, DEVICE_ID)
2) Obtain the physical address-range where the nic’s device-registers are mapped
3) Ask the kernel to map this address range into the kernel’s virtual address-space
4) Copy the network controller’s MAC-address into a 6-byte array for future access
5) Allocate a block of kernel memory large enough for our descriptors and buffers
6) Ensure that the network controller’s ‘Bus Master’ capability has been enabled
7) Select our desired configuration-options for the DEVICE CONTROL register
8) Perform a nic ‘reset’ operation (by toggling bit 26), then delay until reset completes
9) Select our desired configuration-options for the TRANSMIT CONTROL register
10) Initialize our array of Transmit Descriptors with the physical addresses of buffers
11) Initialize the Transmit Engine’s registers (for Tx-Descriptor Queue and Control)
12) Setup the buffer-contents for an Ethernet packet we want to be transmitted
13) Enable the Transmit Engine
14) Give ‘ownership’ of a Tx-Descriptor to the network controller
15) Install our ‘/proc/nictx’ pseudo-file (for user-diagnostic purposes)

Page 18: What’s needed to transmit?

Legacy Tx-Descriptor Layout

Offset  Contents (bit 31 on the left, bit 0 on the right)
0x0     Buffer-Address low (bits 31..0)
0x4     Buffer-Address high (bits 63..32)
0x8     CMD | CSO | Packet Length (in bytes)
0xC     special | CSS | reserved=0 | status

Buffer-Address = the packet-buffer’s 64-bit address in physical memory
Packet-Length = number of bytes in the data-packet to be transmitted
CMD = Command-field
CSO/CSS = Checksum Offset/Start (in bytes)
STA = Status-field

Page 19: What’s needed to transmit?

Suggested C syntax

typedef struct {
    unsigned long long base_address;
    unsigned short packet_length;
    unsigned char cksum_offset;
    unsigned char desc_command;
    unsigned char desc_status;
    unsigned char cksum_origin;
    unsigned short special_info;
} TX_DESCRIPTOR;
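Because the nic reads this structure directly from memory, its size and field offsets have to match the 16-byte layout exactly. One way to have the compiler verify that is sketched below, using fixed-width types and compile-time assertions. The type name TX_DESC_FIXED and the tx_desc_fill() helper are hypothetical additions for illustration, not part of ‘nictx.c’.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t base_address;    /* physical address of the packet buffer */
    uint16_t packet_length;   /* bytes to transmit                     */
    uint8_t  cksum_offset;    /* CSO                                   */
    uint8_t  desc_command;    /* CMD                                   */
    uint8_t  desc_status;     /* status (low 4 bits), reserved above   */
    uint8_t  cksum_origin;    /* CSS                                   */
    uint16_t special_info;    /* special                               */
} TX_DESC_FIXED;

/* Fail the build if the compiler's layout differs from the hardware's. */
_Static_assert(sizeof(TX_DESC_FIXED) == 16, "descriptor must be 16 bytes");
_Static_assert(offsetof(TX_DESC_FIXED, packet_length) == 8, "length at 0x8");
_Static_assert(offsetof(TX_DESC_FIXED, desc_command) == 11, "CMD at 0xB");
_Static_assert(offsetof(TX_DESC_FIXED, desc_status) == 12, "status at 0xC");

/* Fill in one descriptor; unused fields are cleared. */
static inline void tx_desc_fill(TX_DESC_FIXED *d, uint64_t phys_addr,
                                uint16_t length, uint8_t command)
{
    assert(d != NULL);
    d->base_address  = phys_addr;
    d->packet_length = length;
    d->cksum_offset  = 0;
    d->desc_command  = command;
    d->desc_status   = 0;
    d->cksum_origin  = 0;
    d->special_info  = 0;
}
```

The offsets assume a little-endian machine, where the low-order byte of each 32-bit word comes first in memory.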

Page 20: What’s needed to transmit?

TxDesc Command-field

IDE VLE DEXT reserved=0 RS IC IFCS EOP

7 6 5 4 3 2 1 0

EOP = End Of Packet (1=yes, 0=no)
IFCS = Insert Frame CheckSum (1=yes, 0=no) – provided EOP is set
IC = Insert CheckSum (1=yes, 0=no) as indicated by CSO/CSS fields
RS = Report Status (1=yes, 0=no)
DEXT = Descriptor Extension (1=yes, 0=no) use ‘0’ for Legacy-Mode
VLE = VLAN-Packet Enable (1=yes, 0=no) – provided EOP is set
IDE = Interrupt-Delay Enable (1=yes, 0=no)

Page 21: What’s needed to transmit?

TxDesc Status field

reserved=0 LC EC DD

3 2 1 0

DD = Descriptor Done: this bit is written back after the NIC processes the descriptor, provided the descriptor’s RS-bit was set (i.e., Report Status)

EC = Excess Collisions: indicates that the packet has experienced more than the maximum number of collisions (as defined by the TCTL.CT field) and therefore was not transmitted. (This bit is meaningful only in HALF-DUPLEX mode.)

LC = Late Collision: indicates that a Late Collision has occurred while operating in HALF-DUPLEX mode. Note that the collision window size is dependent on the SPEED: 64-bytes for 10/100-Mbps, or 512-bytes for 1000-Mbps.

Page 22: What’s needed to transmit?

Bit-mask definitions

enum {
    DD   = (1<<0), // Descriptor Done
    EC   = (1<<1), // Excess Collisions
    LC   = (1<<2), // Late Collision

    EOP  = (1<<0), // End Of Packet
    IFCS = (1<<1), // Insert Frame CheckSum
    IC   = (1<<2), // Insert CheckSum as per CSO/CSS
    RS   = (1<<3), // Report Status
    DEXT = (1<<5), // Descriptor Extension
    VLE  = (1<<6), // VLAN packet
    IDE  = (1<<7)  // Interrupt-Delay Enable
};
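These masks are typically OR-ed together when a descriptor is prepared and AND-ed against the status byte when it is checked afterwards. The sketch below shows one plausible combination for a packet that fits in a single buffer; the function names are hypothetical, and the choice of EOP, IFCS and RS is an assumption about what a minimal transmit would want rather than a statement of what ‘nictx.c’ does.

```c
#include <assert.h>
#include <stdint.h>

enum {
    DD = 1 << 0, EC = 1 << 1, LC = 1 << 2,          /* status bits  */
    EOP = 1 << 0, IFCS = 1 << 1, RS = 1 << 3,       /* command bits */
    DEXT = 1 << 5,
};

/* Command byte for a packet held entirely in one buffer. */
static inline uint8_t legacy_cmd(int report_status)
{
    uint8_t cmd = EOP | IFCS;       /* last buffer; nic appends the CRC */

    if (report_status)
        cmd |= RS;                  /* ask for DD to be written back */
    assert((cmd & DEXT) == 0);      /* legacy layout requires DEXT=0 */
    return cmd;
}

/* Nonzero once the nic has written back the Descriptor Done bit. */
static inline int descriptor_done(uint8_t status)
{
    return (status & DD) != 0;
}

/* Nonzero if either collision error was reported (half-duplex only). */
static inline int collision_error(uint8_t status)
{
    return (status & (EC | LC)) != 0;
}
```

With status reporting requested, the command byte comes out as 0x0B.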

Page 23: What’s needed to transmit?

Ethernet packet layout

• Total size normally can vary from 64 bytes up to 1518 bytes (unless ‘jumbo’ packets and/or ‘undersized’ packets are enabled)

• The NIC expects a 14-byte packet ‘header’ and it appends a 4-byte CRC check-sum

Offset 0:   destination MAC address (6-bytes)
Offset 6:   source MAC address (6-bytes)
Offset 12:  Type/length (2-bytes)
Offset 14:  the packet’s data ‘payload’ goes here (usually varies from 46 to 1500 bytes)
(end):      Cyclic Redundancy Checksum (4-bytes)
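Filling a buffer according to this layout can be sketched as follows. The function and constant names are hypothetical. The helper writes only the 14-byte header and the payload, on the assumption that the nic appends the CRC (and, when TCTL.PSP is set, pads short packets), so the value it returns is the length that would go into the descriptor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MAC_BYTES = 6, HEADER_BYTES = 14, MAX_PAYLOAD = 1500 };

/*
 * Write header and payload into 'frame' and return the byte count,
 * which excludes the 4-byte CRC that the nic is expected to append.
 */
static inline size_t build_frame(uint8_t *frame, size_t capacity,
                                 const uint8_t dst[MAC_BYTES],
                                 const uint8_t src[MAC_BYTES],
                                 uint16_t type_or_length,
                                 const void *payload, size_t payload_len)
{
    assert(payload_len <= MAX_PAYLOAD);
    assert(capacity >= HEADER_BYTES + payload_len);

    memcpy(frame + 0, dst, MAC_BYTES);             /* offset 0  */
    memcpy(frame + 6, src, MAC_BYTES);             /* offset 6  */
    frame[12] = (uint8_t)(type_or_length >> 8);    /* high byte first */
    frame[13] = (uint8_t)(type_or_length & 0xFF);
    memcpy(frame + HEADER_BYTES, payload, payload_len);
    return HEADER_BYTES + payload_len;
}
```

The Type/length field is stored high byte first because Ethernet headers use network byte order.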

Page 24: What’s needed to transmit?

In-class exercises

• Modify the code in our ‘nictx.c’ module so that it will transmit more than just one raw packet when you install it into the kernel

• Can you also modify the ‘module_exit()’ function so that it will transmit a packet before it disables the ‘Transmit Engine’?