Receiver ‘packet-splitting’

A look at how a driver can cause the 82573L NIC to separate a packet’s headers from its data

NIC can do packet-parsing
• Intel’s newest gigabit Ethernet controllers offer an enhancement to the ‘extended’ Receive Descriptor, called the ‘packet-split’ format, which empowers the hardware to recognize the packet ‘headers’ used with the most common network protocols and to automatically separate those headers from their accompanying packet ‘data’

‘Extended’ RX-Descriptors
CPU writes this, NIC reads it:
    Base-address (64-bits)
    reserved (=0)
NIC writes this, CPU reads it:
    MRQ (multiple receive queues), IP identification, Packet checksum
    Extended status, Extended errors, Packet length, VLAN tag
• The device-driver initializes the ‘base-address’ field with the physical address of a packet-buffer, and it initializes the ‘reserved’ field with a zero-value; the network hardware will later modify both fields
• The network controller will ‘write-back’ the values for these fields when it has transferred a received packet’s data into the packet-buffer

‘Packet-Split’ RX-Descriptors
CPU writes this, NIC reads it:
    Base-address 0 (64-bits)
    Base-address 1 (64-bits)
    Base-address 2 (64-bits)
    Base-address 3 (64-bits)
NIC writes this, CPU reads it (fields listed in the order used by the ‘store’ typedef):
    MRQ (32 bits), IP identification (16 bits), Packet checksum (16 bits)
    Extended status (20 bits), Extended errors (12 bits), Packet Length 0 (16 bits), VLAN tag (16 bits)
    SP flag with Header Length (16 bits), Packet Length 1, Packet Length 2, Packet Length 3 (16 bits each)
    reserved (64 bits)
• The device-driver initializes four ‘base-address’ fields (‘even-numbered’ addresses)
• The network controller will ‘write-back’ values to these fields when it has transferred a received packet’s data into those packet-buffers

Same ‘Extended’ Status/Errors
Extended status (20 bits):
    bits 19-16   0
    bit 15       ACK
    bits 14-11   0
    bit 10       UDPV
    bit 9        IPIV
    bit 8        0
    bit 7        PIF
    bit 6        IPCS
    bit 5        TCPCS
    bit 4        UDPCS
    bit 3        VP
    bit 2        IXSM
    bit 1        EOP
    bit 0        DD
Extended errors (12 bits):
    bit 11       RXE
    bit 10       IPE
    bit 9        TCPE
    bits 8-7     0
    bit 6        SEQ
    bit 5        SE
    bit 4        CE
    bits 3-0     0

Syntax modifications for ‘fetch’
    typedef struct {
        unsigned long long  base_addr0;
        unsigned long long  base_addr1;
        unsigned long long  base_addr2;
        unsigned long long  base_addr3;
    } RX_DESC_FETCH;

Syntax modifications for ‘store’
    typedef struct {
        unsigned int        mrq;
        unsigned short      ip_identification;
        unsigned short      packet_chksum;
        unsigned int        desc_status:20;
        unsigned int        desc_errors:12;
        unsigned short      packet_length0;
        unsigned short      vlan_tag;
        unsigned short      header_length;
        unsigned short      packet_length1;
        unsigned short      packet_length2;
        unsigned short      packet_length3;
        unsigned long long  reserved;
    } RX_DESC_STORE;

Same syntax for the ‘union’
    typedef union {
        RX_DESC_FETCH  rxf;
        RX_DESC_STORE  rxs;
    } RX_DESCRIPTOR;

NIC Registers involved
    RCTL    Receive Control register                 DTYP field (bits 11-10)
    RFCTL   Receive Filter Control register          EXTEN (bit 15); remaining bits shown as Reserved (=0)
    PSRCTL  Packet-Split Receive Control register    LEN3 (bits 31-24, 1KB units), LEN2 (bits 23-16, 1KB units), LEN1 (bits 15-8, 1KB units), LEN0 (bits 7-0, 128B units)

Each descriptor has four buffers
Packet-Split Rx-descriptor:
    base_addr0  points to  buffer0
    base_addr1  points to  buffer1
    base_addr2  points to  buffer2
    base_addr3  points to  buffer3
Four buffers are allocated for receiving one packet

Refresh for ‘reuse’
• As with the ‘extended’ receive-descriptors, it is necessary for a device-driver to set up each ‘packet-split’ receive-descriptor again any time it is going to be ‘reused’, since the prior buffer-addresses get overwritten by the network controller during a packet-reception
• So the driver needs a formula for recalculating the buffer-addresses, or else a ‘shadow’ array in which they are preserved

Kernel-memory layout
    Sixteen Rx-descriptors (32 bytes each)          512 bytes
    Sixty-four receive-buffers (1024 bytes each)    65536 bytes
    KMEM_SIZE (= 66048 bytes), beginning at ‘kmem’

Caveats
• Short packets are not always ‘split’
• Unrecognized packet-headers may not be separated from accompanying packet-data
• Demonstrating the packet-split capability will require us to devise a way to transmit packets which have the TCP/UDP and IP packet-headers that the NIC recognizes

Our ‘pktsplit.c’ demo
• We created a ‘minimal’ kernel-module for demonstrating the NIC’s ‘packet-splitting’ capabilities

TIMEOUT for an in-class demonstration

In-class exercise
• Can you enhance our ‘pktsplit.c’ module so that its Receive-Descriptor Queue will function automatically as a ring-buffer (as happens in our ‘extended.c’ example)?
• Your best option for this is to install an ISR which will reinitialize some Rx-Descriptors (and advance the RDT index) each time an RXDMT0 interrupt gets triggered
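
A possible ‘refresh’ formula (sketch only)
• The ‘formula for recalculating buffer-addresses’ can be derived from the kernel-memory layout: sixteen 32-byte descriptors occupy the first 512 bytes, and the sixty-four 1024-byte buffers follow them
• The code below is a user-space sketch, not the actual ‘pktsplit.c’ source. It assumes that descriptor i owns buffers 4*i through 4*i+3 (the slides do not say which buffers belong to which descriptor), and the names rx_buf_addr(), refresh_rx_desc(), refresh_range() and kmem_phys are hypothetical
• An ISR for the RXDMT0 interrupt might call refresh_range() and then write the returned index to the NIC’s RDT register; that register access is deliberately left out here

```c
#include <assert.h>
#include <stdint.h>

/* Sizes taken from the 'Kernel-memory layout' slide. */
#define N_RX_DESC      16     /* sixteen Rx-descriptors             */
#define BUFS_PER_DESC  4      /* four buffers per descriptor        */
#define RX_BUFSIZ      1024   /* each receive-buffer is 1024 bytes  */

typedef struct {
    unsigned long long  base_addr0;
    unsigned long long  base_addr1;
    unsigned long long  base_addr2;
    unsigned long long  base_addr3;
} RX_DESC_FETCH;

typedef struct {
    unsigned int        mrq;
    unsigned short      ip_identification;
    unsigned short      packet_chksum;
    unsigned int        desc_status:20;
    unsigned int        desc_errors:12;
    unsigned short      packet_length0;
    unsigned short      vlan_tag;
    unsigned short      header_length;
    unsigned short      packet_length1;
    unsigned short      packet_length2;
    unsigned short      packet_length3;
    unsigned long long  reserved;
} RX_DESC_STORE;

typedef union {
    RX_DESC_FETCH  rxf;
    RX_DESC_STORE  rxs;
} RX_DESCRIPTOR;

/* Both views of a descriptor must occupy exactly 32 bytes. */
static_assert(sizeof(RX_DESCRIPTOR) == 32, "descriptor must be 32 bytes");

/* Bytes used by the descriptor queue at the start of kmem (= 512). */
#define RXQ_BYTES  (N_RX_DESC * sizeof(RX_DESCRIPTOR))

/* ASSUMPTION: descriptor i owns buffers 4*i .. 4*i+3, stored right
 * after the descriptor queue.  kmem_phys is the (hypothetical)
 * physical address at which the kmem region begins.                */
unsigned long long rx_buf_addr(unsigned long long kmem_phys,
                               unsigned i, unsigned j)
{
    unsigned long long n = (unsigned long long)BUFS_PER_DESC * i + j;
    return kmem_phys + RXQ_BYTES + n * RX_BUFSIZ;
}

/* Rewrite the four base-address fields that the NIC's write-back
 * overwrote, so that descriptor i can be reused.                   */
void refresh_rx_desc(RX_DESCRIPTOR *ring,
                     unsigned long long kmem_phys, unsigned i)
{
    ring[i].rxf.base_addr0 = rx_buf_addr(kmem_phys, i, 0);
    ring[i].rxf.base_addr1 = rx_buf_addr(kmem_phys, i, 1);
    ring[i].rxf.base_addr2 = rx_buf_addr(kmem_phys, i, 2);
    ring[i].rxf.base_addr3 = rx_buf_addr(kmem_phys, i, 3);
}

/* Refresh 'count' descriptors starting at index 'tail', wrapping
 * around the ring; returns the new tail index.                     */
unsigned refresh_range(RX_DESCRIPTOR *ring,
                       unsigned long long kmem_phys,
                       unsigned tail, unsigned count)
{
    while (count--) {
        refresh_rx_desc(ring, kmem_phys, tail);
        tail = (tail + 1) % N_RX_DESC;
    }
    return tail;
}
```

• With this layout the last buffer (descriptor 15, buffer 3) ends exactly at offset 66048, which matches KMEM_SIZE
• A ‘shadow’ array of saved addresses would serve equally well; the formula simply avoids the extra storage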