What is a packet checksum?

Here we investigate the NIC’s capabilities for computing checksums and using them to detect errors

Gigabit Ethernet frame-format

    Preamble | Start-Of-Frame Delimiter | Destination Address | Source Address | Type/Length | Data | Frame Check Sequence | Carrier Extension

The frame is extended (by the Carrier Extension) to occupy a slot-time of 512 bytes

Some lowest-level details
• The frame’s Preamble consists of 7 bytes of an alternating bit-pattern of 1’s and 0’s
• The Start-of-Frame Delimiter is a one-byte bit-pattern which continues the alternation of 1’s and 0’s until the final bit is reached (which ‘breaks’ the pattern-of-alternation)

The 64-bit Preamble and SFD

    10101010 10101010 10101010 10101010 10101010 10101010 10101010 10101011

In hexadecimal notation:

    0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAB

The final byte (10101011 = 0xAB) is the Start-of-Frame Delimiter

The FCS field
• The Frame Check Sequence is a four-byte integer, usually computed by the hardware according to a sophisticated mathematical error-detection scheme known as Cyclic Redundancy Check (CRC)
• The CRC is calculated by performing a modulo 2 division of the data by a generator polynomial and recording the remainder after division:

    CRC-32 = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1

The CRCERRS register
• Our Intel Pro1000 Ethernet controller has a statistical counter register (the first one, at offset 0x4000 in the memory-mapped I/O space) which counts any received frames that arrived with a CRC-error indicated
• Our ‘nic2.c’ was programmed to ‘strip’ the 4-byte FCS field from all received frames, by setting the SECRC-bit in register RCTL

Another (simpler) checksum

The Pro1000’s ‘legacy-format’ Receive-Descriptor (16 bytes)

    Base-address (64-bits)
    Packet length | Packet checksum | status | errors | VLAN tag

The device-driver initializes this ‘base-address’ field with the physical address of a packet-buffer

The network controller will ‘write-back’ the values for the remaining fields when it has transferred a received packet’s data into the
packet-buffer

What is this ‘packet checksum’?
• According to the Intel documentation, the packet checksum is “the unadjusted 16-bit ones complement of the packet” (page 24)
• Now what exactly does that mean?
• And why did Intel’s hardware designers believe that device-driver software would need this value to be available for every Ethernet packet that the NIC receives?

The idea of ‘complements’
• Whenever a whole object gets divided into two parts, those pieces are often referred to as complements
• Example: complementary angles in a right-triangle ABC whose right angle is at C:

    ∠ABC + ∠BAC = 90°

Set Theory
• Venn Diagram (sets A and B inside a box): within the ‘universe’ represented here by the box, the orange set B is the complement of the green set A

Arithmetic
• Among the digits in our ‘base ten’ system:
  – the values 1 and 9 are complements
  – the values 2 and 8 are complements
  – the values 3 and 7 are complements
  – the values 4 and 6 are complements
• Complements are useful when performing subtractions in modular arithmetic:

    4 - 6 (mod 10) = 4 + 4 (mod 10)

Two’s complement
• Digital computers use modular arithmetic in the ‘base two’ number system
• Two 8-bit numbers are ‘complements’ if their sum equals 2^8 (= 256):

    00000001 is the twos-complement of 11111111
    00000010 is the twos-complement of 11111110
    00000011 is the twos-complement of 11111101

• As a consequence, subtractions can be done with the same circuits as additions

One’s complement
• But for some purposes there is a different kind of ‘complement’ that results in better arithmetical properties; it’s known as the ‘diminished radix’ complement
• For the case of radix ten, it’s called ‘nine’s complement’, and for the case of radix two it’s called ‘one’s complement’

Nine’s complements
• A pair of 3-digit numbers x and y in the radix ten number system would be called nine’s complements if x + y = 10^3 - 1 = 999
  – Thus 321 and 678 are nines-complements
• A pair of 3-bit numbers x and y in the radix two number system would be called one’s complements if x + y = 2^3 - 1 =
111
  – Thus 110 and 001 are ones-complements

‘end-around-carry’
• When you want to add two binary numbers using the one’s complement system, you can do it by first performing ordinary binary addition, and then adding in the carry-bit:

      10101010      (8-bit number)
    + 11110000      (8-bit number)
    ----------
    1 10011010      (9-bits in the normal total)
    +        1      (apply ‘end-around carry’)
    ----------
      10011011      (8-bits in the ones-complement total)

NIC uses ‘one’s complement’
• For network programming nowadays it is common practice for ‘one’s complement’ to be used when computing checksums
• It is also common practice for multi-byte integers to be ‘sent out over the wire’ in so-called ‘big-endian’ byte-order (i.e., the most significant byte goes out ahead of the bytes having lesser significance)

Intel’s cpu uses ‘little endian’
• Whenever our x86 processor ‘stores’ a multi-byte value into a series of cells in memory, it puts the least significant byte first (i.e., at the earliest memory address), followed by the remaining bytes having greater significance

    AX = 0x3456
    mov %ax, buf
    buf: 0x56 0x34

Checksum using C

    {
        unsigned char   *cp = phys_to_virt( rxring[ rxhead ].base_address );
        unsigned int    nbytes = rxring[ rxhead ].packet_length;
        unsigned int    nwords = ( nbytes / 2 );
        unsigned short  *wp = (unsigned short*)cp;
        unsigned int    i, checksum = 0;

        if ( nbytes & 1 ) { cp[ nbytes ] = 0; ++nwords; }       // pad odd length packet
        for ( i = 0; i < nwords; i++ ) checksum += wp[ i ];     // two's complement sum
        while ( checksum >> 16 )                                // do 'end-around carries'
            checksum = (checksum & 0xFFFF) + (checksum >> 16);  // (repeat until no carry remains)
        checksum = htons( checksum );   // -- adjustment #1: swap the bytes
        checksum = ~checksum;           // -- adjustment #2: and flip the bits
        checksum &= 0xFFFF;             // mask to lowest 16-bits

        // Let's compare our checksum-calculation with the one done by the PRO1000
        printk( " cpu-computed checksum=%04X ", checksum );
        printk( " nic's rx packet-checksum=%04X ", rxring[ rxhead ].packet_chksum );
        printk( "\n" );
    }

In-class demonstration #1
TIMEOUT
We will
insert into our ‘nic2.c’ device-driver’s ‘read()’ function our C code that computes and displays the “unadjusted 16-bit ones complement sum” for each received packet, and compare our calculation with the NIC’s ‘packet_checksum’

Checksum ‘offloading’
• Our Intel 82573L network controller has the capability of performing several useful checksum calculations on normal network packets, if this is desired by a device-driver

Receive CheckSum control register: RXCSUM (0x5000)

    bits 31..10   reserved (=0)
    bit  9        TUOFLD
    bit  8        IPOFLD
    bits 7..0     PCSS

Legend:
    PCSS = Packet Checksum Start (default=0x00)
    IPOFLD = IP-checksum Off-load Enable (1=yes, 0=no)
    TUOFLD = TCP/UDP checksum Off-load Enable (1=yes, 0=no)

Using ‘nicecho.c’
• To compensate for the modifications made to the DA and SA fields by our ‘echo.c’, we can omit the first six words (12 bytes) from the checksum-calculations done both by our read() code and by the nic hardware

    // we start our addition-loop at i=6 instead of i=0
    for ( i = 6; i < nwords; i++ ) checksum += wp[ i ];

AND

    // we initialize the RXCSUM register with PCSS=12
    iowrite32( 0x0000000C, io + E1000_RXCSUM );

In-class demonstration #2
TIMEOUT
We will modify the nic’s RXCSUM register (as well as our own previous checksum computation) and observe the resulting effects

The ‘Legacy’ Transmit-Descriptor (16 bytes)

    Base-address (64-bits)
    Packet length | CSO | cmd | status | CSS | special

    CSO = CheckSum Offset
    CSS = CheckSum Start

Command-Byte Format

    bit:   7     6     5    4    3    2    1      0
           IDE   VLE   0    0    WB   IC   IFCS   EOP

Our driver’s packet-layout

    destn-address | source-address | TYPE/LENGTH | count | -- data -- data -- data -- ...

Let’s make room for a new 16-bit field at offset 0x0010, by starting our packet’s data-payload at offset 0x0012 instead of offset 0x0010

Further driver modifications…
• We can demonstrate ‘Checksum Insertion’ performed by the NIC with these changes:

    #define HDR_LEN (14+4)              // two more bytes precede packet's data

    enum { E1000_RXCSUM = 0x5000, };    // define symbolic register-offset

    ssize_t
    my_write( struct file *file, const char *buf, size_t len, loff_t *pos )
    {
        // add these assignments in our driver's 'write()' function
        txring[ txtail ].cksum_offset = 16;         // where to insert the checksum
        txring[ txtail ].cksum_origin = 12;         // where to start its calculation
        txring[ txtail ].desc_command |= (1<<2);    // IC-bit (Insert Checksum)
    }

In-class demonstration #3
TIMEOUT
We will modify the packet-layout used in our device-driver’s ‘write()’ and ‘read()’ functions, then program our TX descriptors to utilize the IC command-option and the CSO and CSS descriptor fields, and then observe the resulting effects
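
Trying out ‘end-around-carry’ in user space

The ‘end-around-carry’ rule from the earlier slide can be tried outside the driver. The following is a minimal sketch, not the driver's code: the helper names ones_add16() and ones_sum16() are hypothetical, and the buffer is assumed to be read as 16-bit words in big-endian (wire) order, with an odd final byte padded by a zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Add two 16-bit values in one's complement arithmetic:
   ordinary binary addition, then add the carry-bit back in. */
static uint16_t ones_add16( uint16_t a, uint16_t b )
{
    uint32_t sum = (uint32_t)a + b;         /* up to 17 bits in the normal total */
    sum = (sum & 0xFFFF) + (sum >> 16);     /* apply 'end-around carry' */
    return (uint16_t)sum;
}

/* One's complement sum of a buffer, read as big-endian 16-bit words
   (hypothetical helper; an odd final byte is padded with a zero byte). */
static uint16_t ones_sum16( const unsigned char *buf, size_t nbytes )
{
    uint16_t total = 0;
    size_t i;

    for ( i = 0; i + 1 < nbytes; i += 2 )
        total = ones_add16( total, (uint16_t)((buf[ i ] << 8) | buf[ i+1 ]) );
    if ( nbytes & 1 )
        total = ones_add16( total, (uint16_t)(buf[ nbytes-1 ] << 8) );
    return total;
}
```

With the 16-bit analogue of the slide's example, ones_add16( 0xAAAA, 0xF0F0 ) gives 0x9B9B, and adding any value to its bitwise inverse gives 0xFFFF, which is what makes the two values ‘one’s complements’.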
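
Why a single byte-swap is enough

The ‘Checksum using C’ slide sums the packet as little-endian words and swaps the bytes only once, at the end. That works because of a known property of one's complement addition: the sum of byte-swapped words equals the byte-swapped sum. The sketch below illustrates this under stated assumptions: the function names are hypothetical, the buffer length is assumed even, and the buffer is assumed short enough that a 32-bit accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator down to 16 bits with end-around carries. */
static uint16_t fold16( uint32_t sum )
{
    while ( sum >> 16 )
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

/* Exchange the two bytes of a 16-bit value. */
static uint16_t swap16( uint16_t x )
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Sum an even-length buffer as big-endian words (the 'wire' order). */
static uint16_t sum_big_endian( const unsigned char *buf, size_t nbytes )
{
    uint32_t sum = 0;
    size_t i;

    for ( i = 0; i + 1 < nbytes; i += 2 )
        sum += ((uint32_t)buf[ i ] << 8) | buf[ i+1 ];
    return fold16( sum );
}

/* Sum the same buffer as little-endian words (the x86 view of memory). */
static uint16_t sum_little_endian( const unsigned char *buf, size_t nbytes )
{
    uint32_t sum = 0;
    size_t i;

    for ( i = 0; i + 1 < nbytes; i += 2 )
        sum += ((uint32_t)buf[ i+1 ] << 8) | buf[ i ];
    return fold16( sum );
}
```

For any even-length buffer, swap16( sum_little_endian( buf, n ) ) should equal sum_big_endian( buf, n ), so the per-word swaps can be skipped inside the loop.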
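
Composing a value for RXCSUM

The value 0x0000000C written to RXCSUM in demonstration #2 can also be built from the three fields named in the register's legend. This is only a sketch based on the bit positions shown on the RXCSUM slide; the macro names and the rxcsum_value() helper are hypothetical and are not taken from Intel's headers or from our driver.

```c
#include <stdint.h>

/* Field positions as shown on the RXCSUM slide (names are hypothetical) */
#define RXCSUM_PCSS_MASK   0x000000FFu     /* bits 7..0: Packet Checksum Start */
#define RXCSUM_IPOFLD      (1u << 8)       /* bit 8: IP-checksum Off-load Enable */
#define RXCSUM_TUOFLD      (1u << 9)       /* bit 9: TCP/UDP checksum Off-load Enable */

/* Build a 32-bit RXCSUM value; bits 31..10 are left as zero (reserved). */
static uint32_t rxcsum_value( unsigned int pcss, int ip_offload, int tcpudp_offload )
{
    uint32_t value = pcss & RXCSUM_PCSS_MASK;

    if ( ip_offload ) value |= RXCSUM_IPOFLD;
    if ( tcpudp_offload ) value |= RXCSUM_TUOFLD;
    return value;
}
```

Calling rxcsum_value( 12, 0, 0 ) reproduces the constant 0x0000000C (PCSS=12, both off-load options disabled), which skips the 12 bytes of destination and source address.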