I/O Devices and Drivers
Vivek Pai / Kai Li
Princeton University

Gaining Flexibility
  Question: how do you make a file descriptor refer to non-files?
  Answer: treat it as an object
    System calls have a shared part of code
    Actual work done by calls to function ptrs
    Each type of object exports a structure of func ptrs that handle all file-related syscalls

Where Have We Seen This?
  Internals of read( ) system call descending down to fop_read method
  Other places where this might be good?
    Filesystems: want to support local, network, CD-ROM, legacy
    Other I/O Devices

Using “Virtual Nodes”
    struct vnode {
        u_long v_flag;                    /* vnode flags (see below) */
        int v_usecount;                   /* reference count of users */
        int v_writecount;                 /* reference count of writers */
        int v_holdcnt;                    /* page & buffer references */
        u_long v_id;                      /* capability identifier */
        struct mount *v_mount;            /* ptr to vfs we are in */
        vop_t **v_op;                     /* vnode operations vector */
        TAILQ_ENTRY(vnode) v_freelist;    /* vnode freelist */
        TAILQ_ENTRY(vnode) v_nmntvnodes;  /* vnodes for mount point */
        […]
        enum vtype v_type;                /* vnode type */
        […]
        struct vm_object *v_object;       /* Place to store VM object */
        […]
        enum vtagtype v_tag;              /* type of underlying data */
        void *v_data;                     /* private data for fs */
        […]
    }

More Vnode Info
    enum vtype {
        VNON, VREG, VDIR, VBLK, VCHR, VLNK, VSOCK, VFIFO, VBAD
    };

    enum vtagtype {
        VT_NON, VT_UFS, VT_NFS, VT_MFS, VT_PC, VT_LFS, VT_LOFS, VT_FDESC,
        VT_PORTAL, VT_NULL, VT_UMAP, VT_KERNFS, VT_PROCFS, VT_AFS, VT_ISOFS,
        VT_UNION, VT_MSDOSFS, VT_TFS, VT_VFS, VT_CODA, VT_NTFS, VT_HPFS,
        VT_NWFS, VT_SMBFS
    };

Definitions & General Method
  Overhead
    CPU time to initiate operation (cannot be overlapped)
  Latency
    Time to perform 1-byte I/O operation
  Bandwidth
    Rate of I/O transfer, once initiated
  General method
    Abstraction of byte transfers
    Batch transfers into block I/O for efficiency, to prorate overhead and latency over a large unit

Programmed I/O: “Slow” Input Device
  Device
    Data registers
    Status register (ready, busy, interrupt,
    … )
  A simple mouse design
    Put (X, Y) in data registers on a move
    Interrupt
  Perform an input
    On an interrupt
      reads values in X, Y registers
      sets ready bit
      wakes up a process/thread or executes a piece of code
  [Diagram labels: CPU, L2 Cache, Memory, I/O Bus, Interface (X, Y registers)]

Programmed I/O: Output Device
  Device
    Data registers
    Status registers (ready, busy, … )
  Perform an output
    Polls the busy bit
    Writes the data to data register(s)
    Sets ready bit
    Controller sets busy bit and transfers data
    Controller clears the ready bit and busy bit

Direct Memory Access (DMA)
  Perform DMA from host CPU
    Device driver call (kernel mode)
    Wait until DMA device is free
    Initiate a DMA transaction (command, memory address, size)
    Block
  DMA interface
    DMA data to device (size--; address++)
    Interrupt on completion (size == 0)
  Interrupt handler (on completion)
    Wake up the blocked process
  [Diagram labels: CPU (free to move data during DMA), L2 Cache, Memory, I/O Bus, DMA Interface]

Device Drivers
  [Diagram labels: Device, Device controller, Device driver (repeated for each device)]
  [Diagram labels, continued: Device, Device controller, Device driver, Rest of the operating system, I/O System]

Device Driver Design Issues
  Operating system and driver communication
    Commands and data between OS and device drivers
  Driver and hardware communication
    Commands and data between driver and hardware
  Driver operations
    Initialize devices
    Interpret commands from OS
    Schedule multiple outstanding requests
    Manage data transfers
    Accept and process interrupts
    Maintain the integrity of driver and kernel data structures

Device Driver Interface
  Open( deviceNumber )
    Initialize and allocate resources (buffers)
  Close( deviceNumber )
    Clean up, deallocate, and possibly turn off
  Device driver types
    Block: fixed-size block data transfer
    Character: variable-size data transfer
    Terminal: character driver with terminal control
    Network: streams for networking

Block Device Interface
  read( deviceNumber, deviceAddr, bufferAddr )
    transfer a block of data from “deviceAddr” to “bufferAddr”
  write( deviceNumber, deviceAddr, bufferAddr )
    transfer a block of data from “bufferAddr” to “deviceAddr”
  seek( deviceNumber, deviceAddr )
    move the head to the correct position
    usually not necessary

Character Device Interface
  read( deviceNumber, bufferAddr, size )
    read “size” bytes from a byte stream device into “bufferAddr”
  write( deviceNumber, bufferAddr, size )
    write “size” bytes from “bufferAddr” to a byte stream device

Unix Device Driver Interface Entry Points
  init( ): initialize hardware
  start( ): boot-time initialization (requires system services)
  open(dev, flag, id): initialization for read or write
  close(dev, flag, id): release resources after read and write
  halt( ): called before the system is shut down
  intr(vector): called by the kernel on a hardware interrupt
  read/write calls: data transfer
  poll(pri): called by the kernel 25 to 100 times a second
  ioctl(dev, cmd, arg, mode): special request processing

What Was That Last One?
  The system call “ioctl”

  SYNOPSIS
    #include <sys/ioctl.h>
    int ioctl(int d, unsigned long request, ...);

  DESCRIPTION
    The ioctl() function manipulates the underlying device parameters of
    special files. In particular, many operating characteristics of
    character special files (e.g. terminals) may be controlled with
    ioctl() requests. The argument d must be an open file descriptor.

Any Counterparts?
  “fcntl”: operations on files
    Duplicating file descriptors
    Get/set/clear “close-on-exec” flag
    Get/set/clear flags: nonblocking, appending, direct (no-cache), async signal notification, locking & unlocking
  Also available as dup( ), dup2( ), lockf( ), flock( ) and others

Any Other Non-Orthogonality?
  Sending data, credentials, file descriptors over sockets

  SYNOPSIS
    ssize_t send(int s, const void *msg, size_t len, int flags);
    ssize_t sendto(int s, const void *msg, size_t len, int flags,
                   const struct sockaddr *to, socklen_t tolen);
    ssize_t sendmsg(int s, const struct msghdr *msg, int flags);

  DESCRIPTION
    Send(), sendto(), and sendmsg() are used to transmit a message to
    another socket. Send() may be used only when the socket is in a
    connected state, while sendto() and sendmsg() may be used at any time.

Why Buffering
  Speed mismatch between producer and consumer
    Character device and block device, for example
  Adapt different data transfer sizes
    Packets vs. streams
  Support copy semantics
  Deal with address translation
    I/O devices see physical memory, but programs use virtual memory
  Spooling
    Avoid deadlock problems
  Caching
    Avoid I/O operations

Asynchronous I/O
  Why do we want asynchronous I/O?
    Life is simple if all I/O is synchronous
  How to implement asynchronous I/O?
    On a read
      copy data from a system buffer if the data is there
      otherwise, initiate I/O
      How does the process find out about completion?
    On a write
      copy to a system buffer, initiate the write and return

Other Design Issues
  Build device drivers
    statically
    dynamically
  How to download device driver dynamically?
    load drivers into kernel memory
    install entry points and maintain related data structures
    initialize the device drivers

Dynamic Binding with an Indirect Table
  [Diagram labels: Open( 1, … ) goes through the driver-kernel interface to the indirect table; the table points to the driver for device 0 and the driver for device 1; interrupt handlers and other kernel services sit alongside]
    Driver for device 0
      open(…) { }
      …
      read(…) { }
    Driver for device 1
      open(…) { }
      …
      read(…) { }

Dynamic Binding
  Download drivers by users (may require a reboot)
    Allocate a piece of kernel memory
    Put device driver into the memory
    Bind device driver with the device
  Pros: flexible and supports ISVs and IHVs
  Cons: security holes
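
Sketch: Dynamic Binding in C
  The indirect table from the last two slides can be sketched in a few lines of user-space C. This is only an illustration under stated assumptions, not code from any real kernel: the names driver_ops, driver_table, bind_driver, dev_open, dev_read, null_driver and the table size MAX_DEVICES are all hypothetical. Each driver exports a structure of function pointers, binding installs that structure in the slot for a device number, and the kernel-side Open( 1, … ) becomes an indexed call through the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVICES 4   /* hypothetical table size */

/* Entry points a driver exports (hypothetical subset of the slide's list). */
struct driver_ops {
    int (*open)(int dev);
    int (*read)(int dev, char *buf, size_t size);
};

/* The indirect table: one slot per device number, NULL = no driver bound. */
static const struct driver_ops *driver_table[MAX_DEVICES];

/* Bind a driver to a device number. Returns 0 on success, or -1 if the
 * device number is out of range or a driver is already bound there. */
int bind_driver(int dev, const struct driver_ops *ops)
{
    if (dev < 0 || dev >= MAX_DEVICES || ops == NULL)
        return -1;
    /* In this sketch a driver must export every entry point. */
    assert(ops->open != NULL && ops->read != NULL);
    if (driver_table[dev] != NULL)
        return -1;
    driver_table[dev] = ops;
    return 0;
}

/* Kernel-side Open(dev): look up the driver and call through its pointer. */
int dev_open(int dev)
{
    if (dev < 0 || dev >= MAX_DEVICES || driver_table[dev] == NULL)
        return -1;
    return driver_table[dev]->open(dev);
}

/* Kernel-side read(dev, buf, size): same indirection for data transfer. */
int dev_read(int dev, char *buf, size_t size)
{
    if (dev < 0 || dev >= MAX_DEVICES || driver_table[dev] == NULL)
        return -1;
    return driver_table[dev]->read(dev, buf, size);
}

/* A toy driver: open always succeeds, read fills the buffer with zeros. */
static int null_open(int dev)
{
    (void)dev;
    return 0;
}

static int null_read(int dev, char *buf, size_t size)
{
    (void)dev;
    memset(buf, 0, size);
    return (int)size;
}

const struct driver_ops null_driver = { .open = null_open, .read = null_read };
```

  After bind_driver(1, &null_driver) succeeds, dev_open(1) and dev_read(1, buf, n) reach the toy driver, while the same calls on an unbound device number return -1. Because the rest of the system depends only on the table, a driver can be bound after boot without relinking the kernel; the cost, as the slide notes, is that whatever code is installed in the table runs with kernel privileges.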