A network driver ‘framework’

We construct a ‘skeleton’ module showing just the essential pieces of a Linux network device driver

Overview

[Diagram: in user space, an application program and the standard runtime libraries; in kernel space, the Linux operating system with its kernel networking subsystem and the device driver module; beneath them, the hardware]

Source-code layout

netframe.c

    #include <linux/module.h>
    #include <linux/etherdevice.h>
    …

    typedef struct { /* driver's private data */ } MY_DRIVERDATA;
    char modname[] = "netframe";
    struct net_device *netdev;

The network driver’s “payload” functions: my_open(), my_stop(), my_hard_start_xmit(), my_isr(), my_get_info()

The mandatory module-administration functions: my_init, my_exit

module_init()

• This function will execute when the driver is installed in the kernel (‘/sbin/insmod’)
• Its role is to allocate and partially initialize a ‘struct net_device’ object for our network interface controller (i.e., hardware device), then “register” that object with the kernel
• For ethernet NICs the kernel provides a helper-function, ‘alloc_etherdev()’, that drivers can use for this allocation

The ‘key’ statements…

    typedef struct { /* the driver's private data */ } MY_DRIVERDATA;
    struct net_device *netdev;

    static int __init my_init( void )
    {
        netdev = alloc_etherdev( sizeof( MY_DRIVERDATA ) );
        if ( !netdev ) return -ENOMEM;

        netdev->open            = my_open;
        netdev->stop            = my_stop;
        netdev->hard_start_xmit = my_hard_start_xmit;

        return register_netdev( netdev );
    }

module_exit()

• This function will execute when the driver is removed from the kernel (‘/sbin/rmmod’)
• Its role is to “unregister” the ‘net_device’ structure and free the memory that was allocated during the module’s initialization

The ‘key’ statements…

    struct net_device *netdev;

    static void __exit my_exit( void )
    {
        unregister_netdev( netdev );
        free_netdev( netdev );
    }

open()

• The kernel will call this function when the system administrator “configures” the NIC (e.g., with the ‘/sbin/ifconfig’ command) to assign an IP-address to the interface and bring it UP
• Thus the role of ‘open()’
would be to reset the hardware to a known working state and initiate packet-queueing by the kernel

The ‘key’ statements…

    int my_open( struct net_device *dev )
    {
        /* initialize any remaining 'private' data */
        /* prepare the hardware for operation */
        /* install an Interrupt Service Routine */
        /* enable the NIC to generate interrupts */

        netif_start_queue( dev );
        return 0; //SUCCESS
    }

stop()

• The kernel will call this function when the NIC is brought DOWN (i.e., to turn off its transmission and reception of packets)
• This could occur because of a command (such as ‘/sbin/ifconfig’) executed by the System Administrator, or because a user is removing the driver-module from the kernel (with the ‘/sbin/rmmod’ command)

The ‘key’ statements…

    int my_stop( struct net_device *dev )
    {
        netif_stop_queue( dev );

        /* kill any previously scheduled 'tasklets' (or other deferred work) */
        /* turn off the NIC's transmit and receive engines */
        /* disable the NIC's ability to generate interrupts */
        /* delete the NIC's Interrupt Service Routine */

        return 0; //SUCCESS
    }

hard_start_xmit()

• The kernel will call this function whenever it has data that it wants the NIC to transmit
• The kernel will supply the address of a socket-buffer (‘struct sk_buff’) that holds the packet-data that is to be transmitted
• So this function’s duties are: to initiate transmission, update relevant statistics, and then release that ‘sk_buff’ structure

The ‘key’ statements…

    int my_hard_start_xmit( struct sk_buff *skb, struct net_device *dev )
    {
        /* code goes here to initiate transmission by the hardware */

        dev->trans_start = jiffies;
        dev->stats.tx_packets += 1;
        dev->stats.tx_bytes += skb->len;
        dev_kfree_skb( skb );
        return 0; //SUCCESS
    }

What about reception?
• The NIC hardware receives data-packets asynchronously (not at a time of the driver’s choosing) and we don’t want our system to be ‘stalled’ doing ‘busy-waiting’
• Thus an interrupt handler is normally used to detect received packets and arrange for them to be validated and dispatched to upper layers in the kernel’s network subsystem

Simulating an interrupt

• Our network device-driver ‘framework’ was only designed for demonstration purposes; it does not work with any actual hardware
• But we can use a ‘software interrupt’ that will trigger the execution of our ISR
• To implement this scheme, we’ll need to employ an otherwise unused IRQ-number, along with its associated ‘Interrupt-ID’

Advanced Programmable Interrupt Controller

[Diagram: a multi-core CPU in which CPU 0 and CPU 1 each have a LOCAL APIC; both are connected to an I/O APIC whose input-lines are IRQ0, IRQ1, IRQ2, IRQ3, … IRQ23]

The I/O APIC component is programmable: its 24 inputs can be assigned to interrupt ID-numbers in the range 0x20..0xFF (lower numbers are reserved by Intel for the CPU’s exception-vectors)

The I/O-APIC’s 24 Redirection Table registers determine these assignments

Two-dozen IRQs

• The I/O APIC in our classroom machines supports 24 Interrupt-Request input-lines

Redirection-table

• Its 24 programmable registers determine how interrupt-signals get routed to CPUs

Redirection Table Entry

    bits 63..56   destination
    bits 55..48   extended destination
    bits 47..17   reserved
    bit  16       MASK
    bit  15       Trigger-Mode (1=Level-triggered, 0=Edge-triggered)
    bit  14       Remote IRR (for Level-Triggered only)
                    0 = Reset when EOI received from Local-APIC
                    1 = Set when Local-APICs accept Level-Interrupt sent by IO-APIC
    bit  13       Interrupt Input-pin Polarity (0=Active-High, 1=Active-Low)
    bit  12       Delivery-Status (1=Pending, 0=Idle)
    bit  11       Destination-Mode (1=Logical, 0=Physical)
    bits 10..8    delivery mode
                    000 = Fixed
                    001 = Lowest Priority
                    010 = SMI
                    011 = (reserved)
                    100 = NMI
                    101 = INIT
                    110 = (reserved)
                    111 = ExtINT
    bits 7..0     interrupt vector ID

Our ‘ioapic.c’ module

• Last semester we created a module that will
show us which IRQ-numbers are not currently being used by our system, and the Interrupt-IDs those IRQ-signals were assigned to by Linux during ‘startup’

Timeout for an in-class demonstration

my_isr()

• We created a “dummy” Interrupt Service Routine for our ‘netframe.c’ demo-module

    #define IRQ    4     // temporarily unused (normally for serial-UART)
    #define intID  0x49  // our I/O-APIC has assigned this ID to IRQ 4

    irqreturn_t my_isr( int irq, void *my_netdev_addr )
    {
        struct net_device *dev = (struct net_device *)my_netdev_addr;
        MY_DRIVERDATA *priv = dev->priv;

        // we do processing of the received packet in our "bottom half"
        tasklet_schedule( &priv->my_rxtasklet );
        return IRQ_HANDLED;
    }

Installing and removing an ISR

This statement would go in the driver’s ‘open()’ function…

    if ( request_irq( IRQ, my_isr, IRQF_SHARED, dev->name, dev ) < 0 )
        return -EBUSY;

The arguments, in order:

    IRQ           IRQ’s signal-number
    my_isr        entry-point for interrupt-handler
    IRQF_SHARED   option-flags
    dev->name     name for display
    dev           ISR data-argument

…and this statement would go in the driver’s ‘stop()’ function

    free_irq( IRQ, dev );

Here ‘dev’ is the address of the interface’s ‘struct net_device’ object

Processing a received packet

• When the NIC notifies our driver that it has received a new ethernet-packet, our driver must allocate a socket-buffer structure for the received data, initialize the ‘sk_buff’ with that data and supporting parameters, then pass that socket-buffer upward to the kernel’s network subsystem for delivery to the appropriate application-program that is listening for it

The ‘key’ statements…

    void my_rxhandler( unsigned long data )
    {
        struct net_device *dev = (struct net_device *)data;
        struct sk_buff *skb;
        int rxbytes = 60; // just an artificial value here

        skb = dev_alloc_skb( rxbytes + 2 );
        if ( !skb ) return; // allocation failed: drop this packet

        /* a real driver would copy the received data into the buffer here */
        skb->dev = dev;
        skb->protocol = eth_type_trans( skb, dev );
        skb->ip_summed = CHECKSUM_NONE;
        dev->stats.rx_packets += 1;
        dev->stats.rx_bytes += rxbytes;
        netif_rx( skb );
    }

Triggering the interrupt…

• We allow a user to trigger execution of our interrupt-handler
(for testing purposes), by reading from a pseudo-file that our driver creates during module-initialization, whose ‘get_info()’ function includes execution of a software-interrupt instruction: ‘int $0x49’
• This inline assembly language instruction is produced via the GNU ‘asm’ construct

Using the ‘asm-construct’

    #define intID 0x49

    asm(" int %0 " : : "i" (intID) );

The parts of this statement:

    asm       statement keyword
    int       assembly language opcode
    %0        parameter indicator
    "i"       parameter type ("i" = immediate data)
    (intID)   parameter-value (symbolic)

This example shows how the value of a symbolic constant, defined in C with a ‘#define’ preprocessor directive, can be referenced by an “inline” assembly language statement within a C code-module

Testing our ‘framework’

• You can download, compile, and install our ‘netframe.c’ network driver module
• It doesn’t do anything with real hardware, but it does illustrate essential interactions of a network device driver with the Linux operating system’s networking subsystem

In-class exercise #1

• Use the ‘/sbin/ifconfig’ command to assign an IP-address to the ‘struct net_device’ object that our framework-module creates
• You can discover the interface’s name by using our earlier ‘netdevs.c’ module
• You should use a ‘private’ IP-address
• EXAMPLE (for station ‘hrn23501’):

    $ sudo /sbin/ifconfig eth1 192.168.86.1 up

In-class exercise #2

• Use ‘ifconfig’ to confirm the IP-address, the IRQ, and the interface’s status:

    $ /sbin/ifconfig eth1

• Use ‘ifconfig’ to examine the interface’s statistics (packets transmitted/received)

In-class exercise #3

• Use the ‘cat’ command to read the pseudo-file that our driver creates, which simulates an interrupt from your device’s interface
• Verify that your interrupt-handler did get executed, by looking at the statistics, and by displaying the output of a pseudo-file Linux creates (named ‘/proc/interrupts’)

    $ cat /proc/interrupts

In-class exercise #4

• Try removing the ‘netframe.ko’ module (with the ‘/sbin/rmmod’ command), then use the ‘dmesg’ command to see your system’s log-file messages
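
Decoding a Redirection Table Entry (a sketch)

The bit-layout of the I/O APIC’s Redirection Table Entry described earlier can be illustrated with a small user-space decoder. This is only a minimal sketch, not part of ‘netframe.c’ or ‘ioapic.c’: the names ‘RTE_FIELDS’, ‘rte_bits()’, ‘decode_rte()’ and ‘delivery_mode_name()’ are hypothetical, and the 64-bit entry is assumed to have been obtained already by some other means. The polarity and trigger-mode fields are returned as raw bits, without interpretation.

```c
#include <stdint.h>

/* Hypothetical helper type: one member per field of a 64-bit entry */
typedef struct {
    unsigned vector;        /* bits 7..0    interrupt vector ID        */
    unsigned delivery_mode; /* bits 10..8   3-bit delivery mode code   */
    unsigned dest_mode;     /* bit  11      1=Logical, 0=Physical      */
    unsigned status;        /* bit  12      1=Pending, 0=Idle          */
    unsigned polarity;      /* bit  13      raw polarity bit           */
    unsigned remote_irr;    /* bit  14      Remote IRR                 */
    unsigned trigger;       /* bit  15      raw trigger-mode bit       */
    unsigned mask;          /* bit  16      MASK                       */
    unsigned ext_dest;      /* bits 55..48  extended destination       */
    unsigned destination;   /* bits 63..56  destination                */
} RTE_FIELDS;

/* Extract 'width' bits of 'entry', starting at bit-position 'lo' */
static unsigned rte_bits( uint64_t entry, unsigned lo, unsigned width )
{
    return (unsigned)( ( entry >> lo ) & ( ( 1ULL << width ) - 1 ) );
}

/* Split a 64-bit Redirection Table Entry into its separate fields */
RTE_FIELDS decode_rte( uint64_t entry )
{
    RTE_FIELDS f;

    f.vector        = rte_bits( entry,  0, 8 );
    f.delivery_mode = rte_bits( entry,  8, 3 );
    f.dest_mode     = rte_bits( entry, 11, 1 );
    f.status        = rte_bits( entry, 12, 1 );
    f.polarity      = rte_bits( entry, 13, 1 );
    f.remote_irr    = rte_bits( entry, 14, 1 );
    f.trigger       = rte_bits( entry, 15, 1 );
    f.mask          = rte_bits( entry, 16, 1 );
    f.ext_dest      = rte_bits( entry, 48, 8 );
    f.destination   = rte_bits( entry, 56, 8 );
    return f;
}

/* Map a 3-bit delivery mode code to the name used in the table */
const char *delivery_mode_name( unsigned mode )
{
    static const char *names[ 8 ] = {
        "Fixed", "Lowest Priority", "SMI", "(reserved)",
        "NMI", "INIT", "(reserved)", "ExtINT"
    };
    return names[ mode & 7 ];
}
```

For instance, calling decode_rte( 0x0100000000010949ULL ) yields interrupt vector ID 0x49, delivery mode 001 (Lowest Priority), Logical destination-mode, the MASK bit set, and destination 0x01.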