W4118 Operating Systems
Interrupt and System Call
Instructor: Junfeng Yang

Logistics
  Room change: 633 Mudd
  Homework 1 out, due Thu 2/5 at 4:09pm EST

Last lecture
  What is OS? Stuff between users + hardware
  What functionality in OS?
    • App view: hw abstraction layer
    • Sys view: resource mgr
  Concepts
    • Batching: work on a group of jobs, so can optimize
    • Spooling: overlap I/O with compute
        OS: buffering, DMA, interrupt
    • Multiprogramming: keep N jobs in mem, OS chooses which to run
        OS: job scheduling, mem mgmt
    • Timesharing: fast switching gives each user the view of a dedicated machine
        OS: more complex scheduling, mem mgmt, concurrency control, synchronization
  [Figure: layers User / App / OS / HW]

Today
  OS: event-driven
    • Device events: interrupt
    • Application events: system call
  Interrupt
    • Background
    • How interrupt works
    • Some tricky points
  System call
  [Figure: layers User / App / OS / HW]

Computer Organization
  Computer-system operation
    • One or more CPUs and device controllers connect through a common bus providing access to shared memory

CPU
  CPU runs instructions:
      while (fetch next instruction) {
          run instruction
      }
  Needs work space: registers
    • E.g. x86 has 8 general purpose registers: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
    • Very fast, very few
  Needs more work space: memory
    • CPU has address line, data line
    • Sends out address on address line
    • Data comes back on data line, or data is written to data line
  Instructions in memory too!
    • IP: instruction pointer (or Program Counter)
    • Incremented after running each instruction

CPU’s ‘fetch-execute’ cycle
  [Figure: a user program (ld, add, st, mul, ld, sub, bne, add, jmp, …) with IP pointing at the current instruction, next to the cycle below]
  1. Fetch instruction at IP
  2. Decode the fetched instruction
  3. Execute the decoded instruction
  4. Advance IP to next instruction, then repeat
  How to bootstrap this cycle? Where to find the first instruction?

The boot process
  1. When CPU powers up, IP always points to a fixed memory address (0xfffffff0)
  2. This memory address maps to ROM with BIOS
  3. BIOS executes, initializes your computer
  4. BIOS loads boot loader from first sector of disk (MBR) into RAM, then jumps to it
  5. Boot loader loads your kernel
  Can have two-level boot loader, or no boot loader

How do devices gain CPU’s attention?
  I/O devices and the CPU can execute concurrently
  Device controller: in charge of a device type
  Each device controller has a local buffer
  CPU moves data b/w main memory and controller buffers (spooling)
  I/O: b/w controller buffer and device
  Device controller informs CPU that it has finished its operation by causing an interrupt
  Controller buffer small, want better interactivity
    • Want CPU to respond fast

CPU’s ‘fetch-execute’ cycle with interrupt
  [Figure: the fetch-execute cycle extended with an interrupt check]
  1. Fetch instruction at IP
  2. Decode the fetched instruction
  3. Execute the decoded instruction
  4. Advance IP to next instruction
  5. IRQ? If no, go back to step 1. If yes: save context, get INTR #, look up ISR, execute ISR, then IRET back to the user program

Interrupt Hardware (legacy systems)
  [Figure: devices (Ethernet, SCSI Disk, Real-Time Clock, Keyboard Controller, Programmable Interval-Timer) raise IRQs into cascaded slave and master PICs (8259); the master PIC drives the INTR line of the x86 CPU and supplies the intr #]
  I/O devices have (unique or shared) Interrupt Request Lines (IRQs)
  IRQs are mapped by special hardware to interrupt numbers, and passed to the CPU
  This hardware is called a Programmable Interrupt Controller (PIC)

The ‘Interrupt Controller’
  Responsible for telling the CPU when a specific external device wishes to ‘interrupt’
    • Needs to tell the CPU which one among several devices is the one needing service
  PIC translates IRQ to interrupt number
    • Raises interrupt to CPU
    • Interrupt # available in register
    • Waits for ack from CPU
  Interrupts can have varying priorities
    • PIC also needs to prioritize multiple requests
  Possible to “mask” (disable) interrupts at PIC or CPU
  Early systems cascaded two 8-input chips (8259A)

Example: Interrupts on 80386
  80386 core has one interrupt line, one interrupt acknowledge line
  Interrupt sequence:
    • Interrupt controller raises INT line
    • 80386 core pulses INTA line low, allowing INT to go low
    • 80386 core pulses INTA line low again, signaling controller to put interrupt number on data bus
  [Figure: timing diagram of the INT line, the INTA line, and the data bus carrying the interrupt #]

Interrupt Descriptor Table
  The ‘entry-point’ to the interrupt handler is located via the Interrupt Descriptor Table (IDT)
  Interrupt Service Routine = IDT[Interrupt number]
    • Also called interrupt handler
  IDT is in memory, initialized by OS at boot
  How to locate base of IDT?
    • CPU has a register, idtr, pointing to IDT, initialized by OS via the LIDT instruction at boot

Putting It All Together
  [Figure: IRQs enter the PIC; the PIC raises INTR and passes the intr # to the CPU; the CPU’s idtr register points to the IDT in memory (entries 0 to 255); IDT[intr #] gives the ISR; mask points are marked along the path]

Some tricky points
  Must be able to resume user program after interrupt, so need to save instruction pointer and other registers
    • Some done in hardware, some in handler
  To preserve IRQ order on the same line, must disable incoming interrupts (all or same line)
    • Done in handler
  Thus, handler must run for a very short time
    • Preempts what CPU was doing, which may be important
    • Don’t want to interrupt user program for long
    • Don’t want to disable interrupts for long (otherwise, may lose interrupts)
  Leads to some complex implementations. We’ll see in Linux

Interrupt vs. Polling
  Instead of the device interrupting the CPU, the CPU can poll the status of the device
    • Intr: “I want to see a movie.”
    • Poll: while (1) { “Do you want to see a movie?” }
  Good or bad?
    • For mostly-idle device?
    • For busy device?
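The lookup from the Interrupt Descriptor Table slide, ISR = IDT[interrupt number], can be sketched in user-space C as a table of function pointers. This is only an illustration under simplifying assumptions: a real x86 IDT holds gate descriptors rather than plain pointers, and the names idt_install, dispatch, idt_init, and idt_last_vector are hypothetical helpers invented for this sketch. Vectors 0 and 0x80 are the divide fault and the Linux system call vector mentioned in this lecture.

```c
#include <assert.h>
#include <stddef.h>

#define IDT_SIZE 256                 /* vectors 0..255, as in the figure */

typedef void (*isr_t)(void);         /* simplified: an ISR takes no arguments */

static isr_t idt[IDT_SIZE];          /* hypothetical stand-in for the in-memory IDT */
static int last_vector = -1;         /* records which ISR ran (demonstration only) */

static void divide_error_isr(void) { last_vector = 0; }
static void syscall_isr(void)      { last_vector = 0x80; }

/* What the OS does at boot: fill in one table entry. */
void idt_install(int vector, isr_t handler)
{
    assert(vector >= 0 && vector < IDT_SIZE);
    idt[vector] = handler;
}

/* What the CPU does on an interrupt: ISR = IDT[interrupt number].
 * Returns 0 if a handler ran, -1 if the vector is invalid or empty. */
int dispatch(int vector)
{
    if (vector < 0 || vector >= IDT_SIZE || idt[vector] == NULL)
        return -1;
    idt[vector]();
    return 0;
}

/* Install the two example handlers used in this sketch. */
void idt_init(void)
{
    idt_install(0, divide_error_isr);
    idt_install(0x80, syscall_isr);
}

int idt_last_vector(void) { return last_vector; }
```

Calling idt_init() and then dispatch(0x80) runs syscall_isr, while dispatch(42) returns -1 because nothing was installed at that vector. The same table shape reappears later for system calls, where the index is a system call number instead of an interrupt number.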
Same interrupt mechanism used for other unusual control transfers
  We’ve seen Interrupts: raised externally by device
  Traps (or Exceptions): raised internally by CPU
    • 0: divide-overflow fault
    • 3: breakpoint
    • 6: Undefined Opcode
    • 13: General Protection Exception
  System call can be implemented this way too
    • Linux system call: INT 0x80

System calls
  Programming interface to OS services
  Next:
    • Protection
    • System calls and application programming interface (API)
    • How to implement system calls

The need for protection
  For reliability: buggy or malicious user program
    • Exceptions (division by 0, buffer overrun, dereference NULL, …)
    • Resource hogs (infinite loops, exhaust memory, …)
    • Despite these, OS cannot crash, must serve other processes and users
  For security: malicious user program
    • Read/write OS or other process’s data without permission
    • OS must check, and the checking code cannot be tampered with
  Must distinguish trusted (OS) and untrusted (user program)

Dual-mode operation
  Allows OS to protect itself and other system components
    • User mode and kernel mode
  Mode bit provided by hardware
    • Provides ability to distinguish when system is running user code or kernel code
    • Some instructions designated as privileged, only executable in kernel mode
    • If executed in user mode, exception
    • x86 actually has 4 privilege levels, but only 2 are used
  To perform privileged operations, must transition into the OS through well-defined interfaces
    • System calls
    • Interrupt handlers too

Example transition: system call
  [Figure: user program and OS, with the two mode switches below]
  Call: int 0x80 changes mode to kernel
  OS validates system call
  Return: iret resets mode to user

Example transition: timer interrupt
  Timer to prevent infinite loop / process hogging resources
    • Set interrupt after specific period
    • Set up before scheduling process to regain control or terminate program that exceeds allotted time
  When interrupt occurs, switch from current process to another (slightly simplified)

System Calls and API
  Mostly accessed by programs via a high-level Application Program Interface (API) rather than direct system call use
    • Example API: Win32 API, POSIX API
  Why use APIs rather than system calls?
    • Give OS some flexibility
    • Can have non-standard or not-so-easy-to-use system call interface. Fix things up in libs (usually easier than fixing in kernel)
  (Note that the system-call names used throughout this text are generic)

System Call Implementation
  Typically, a number is associated with each system call
    • System-call interface maintains a table indexed according to these numbers
    • Similar to interrupt, but dispatched in software
  The system call interface invokes the intended system call in the OS kernel and returns the status of the system call and any return values
  Apps only need to know the API, not the system call interface

API – System Call – OS Relationship
  [Figure, read top to bottom:]
  User mode, application:  { printf("hello world!\n"); }
  User mode, libc:         %eax = sys_write; int 0x80
  Kernel mode, IDT:        entry 0x80 points to system_call()
  Kernel mode:             system_call() { fn = syscalls[%eax] }
  Kernel mode, syscalls table entry:  sys_write(…) { // do real work }

System Call Parameter Passing
  Often, more information is required than simply the identity of the desired system call
    • Exact type and amount of information vary according to OS and call
  Three general methods used to pass parameters to the OS
    • Simplest: pass the parameters in registers
        In some cases, may be more parameters than registers
    • Parameters stored in a block, or table, in memory, and address of block passed as a parameter in a register
        This approach taken by Linux and Solaris
    • Parameters placed, or pushed, onto the stack by the program and popped off the stack by the operating system
  Block and stack methods do not limit the number or length of parameters being passed

Parameter Passing via Table
  [Figure: parameters stored in a table in memory; the table’s address is passed to the OS in a register]

Types of System Calls
  Process control
  File management
  Device management
  Information maintenance
  Communications

Case study: the shell (simplified)
  Shell: interactive command line interface
  User types command, shell executes
  Thus, need to create process to run command
      while (1) {
          write(1, "$ ", 2);
          parse_cmd(command, args);      // parse user input
          switch (pid = fork()) {
          case -1:
              perror("fork");
              break;
          case 0:                        // child
              execv(command, args);
              perror("execv");           // reached only if execv fails
              _exit(1);                  // child must not fall back into the shell loop
          default:                       // parent
              wait(0);                   // wait for child to terminate
              break;
          }
      }

Case study: the shell (cont.)
  System calls for files: open, read, write, close
  Identify opened file with file descriptors, numbered starting from 0
    • Avoid repeated path resolution
    • OS knows when file is closed, so can reclaim resource
    • Avoid race: same name may map to different file
  Naming conventions
    • fd 0: input (e.g. keyboard)
    • fd 1: output (e.g. screen)
    • fd 2: error output (e.g. screen)
  Examples
    • parse_cmd: read(0, buf, bufsize)
    • write(1, "hello\n", strlen("hello\n"));
    • Often use libc functions (printf, fprintf, etc.)
  On fork, child copies fd
  On exec, retains fd (except those specifically marked as close-on-exec: fcntl(fd, F_SETFD, FD_CLOEXEC))

Case study: the shell (cont.)
  I/O redirection: “ls > tmp1”
  How does the shell implement I/O redirection?
      fd = open("tmp1", …);   // error checking omitted
      if (fd != 1) {
          dup2(fd, 1);        // 1 is a copy of fd, thus points to file tmp1
          close(fd);
      }

System calls and API
  Q: what system calls and APIs to provide?
  Q: how to implement?
  Next: OS design and implementation

OS design and implementation
  Design and implementation of an OS is not “solvable”, but some approaches have proven successful
  Internal structure of different operating systems can vary widely
  Start by defining goals and specifications
  Affected by choice of hardware, type of system
  User goals and System goals
    • User goals: operating system should be convenient to use, easy to learn, reliable, safe, and fast
    • System goals: operating system should be easy to design, implement, and maintain, as well as flexible, reliable, error-free, and efficient

Operating System Design and Implementation (Cont.)
  Important principle to separate
    • Policy: What will be done?
    • Mechanism: How to do it?
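One way to picture the policy/mechanism split is a job picker where the scanning code (mechanism) is fixed and the ordering rule (policy) is passed in as a function pointer. This is a hedged sketch, not how any particular kernel does it: struct job, pick_next, fifo_policy, and sjf_policy are hypothetical names invented here, and the two policies (first-come-first-served and shortest-job-first) are chosen only as familiar examples.

```c
#include <assert.h>
#include <stddef.h>

struct job {
    int id;
    int arrival;   /* tick at which the job arrived */
    int length;    /* estimated run time in ticks */
};

/* Policy ("what"): should job a run before job b? */
typedef int (*policy_fn)(const struct job *a, const struct job *b);

/* Policy 1: first come, first served. */
int fifo_policy(const struct job *a, const struct job *b)
{
    return a->arrival < b->arrival;
}

/* Policy 2: shortest job first. */
int sjf_policy(const struct job *a, const struct job *b)
{
    return a->length < b->length;
}

/* Mechanism ("how"): scan the ready list and return the index of the
 * job the given policy prefers, or -1 if the list is empty.
 * This code does not change when the policy does. */
int pick_next(const struct job *ready, size_t n, policy_fn prefers)
{
    if (n == 0)
        return -1;
    assert(prefers != NULL);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (prefers(&ready[i], &ready[best]))
            best = i;
    }
    return (int)best;
}
```

With a ready list of jobs {arrival 0, length 9}, {arrival 1, length 2}, {arrival 2, length 5}, pick_next returns index 0 under fifo_policy and index 1 under sjf_policy. Swapping the policy requires no change to pick_next.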
  Mechanisms determine how to do something; policies decide what will be done
  The separation of policy from mechanism is a very important principle: it allows maximum flexibility if policy decisions are to be changed later

Simple Structure
  MS-DOS: written to provide the most functionality in the least space
    • Not divided into modules
    • Although MS-DOS has some structure, its interfaces and levels of functionality are not well separated
    • No protection: a user program can crash the entire machine
    • Hardware reason: 8088 has no protection

MS-DOS Layer Structure
  [Figure: MS-DOS layer structure]

Unix
  UNIX: limited by hardware functionality, the original UNIX operating system had limited structuring
  The UNIX OS consists of two separable parts
    • Systems programs
    • The kernel
        Consists of everything below the system-call interface and above the physical hardware
        Provides the file system, CPU scheduling, memory management, and other operating-system functions; a large number of functions for one level

UNIX System Structure
  [Figure: UNIX system structure]

Layered Approach
  The operating system is divided into a number of layers (levels), each built on top of lower layers
  The bottom layer (layer 0) is the hardware; the highest (layer N) is the user interface
  With modularity, layers are selected such that each uses functions (operations) and services of only lower-level layers

Layered Operating System
  [Figure: layered operating system]

Microkernel System Structure
  Moves as much as possible from the kernel into “user” space
  Communication takes place between user modules using message passing
  Claimed benefits:
    • Easier to extend a microkernel
    • Easier to port the operating system to new architectures
    • More reliable (less code is running in kernel mode)
    • More secure
  Detriments:
    • Performance overhead of user space to kernel space communication

Mac OS X Structure
  [Figure: Mac OS X structure]

Modules
  Most modern operating systems implement kernel modules
    • Uses object-oriented approach
        Function pointers in Linux: OOP with C
    • Each core component is separate
    • Each talks to the others over known interfaces
    • Each is loadable as needed within the kernel
  Overall, similar to layers but more flexible

Solaris Modular Approach
  [Figure: Solaris modular approach]

Virtual Machines
  A virtual machine takes the layered approach to its logical conclusion. It treats hardware and the operating system kernel as though they were all hardware
  A virtual machine provides an interface identical to the underlying bare hardware
  The operating system creates the illusion of multiple processes, each executing on its own processor with its own (virtual) memory

Virtual Machines (Cont.)
  The resources of the physical computer are shared to create the virtual machines
    • CPU scheduling can create the appearance that users have their own processor
    • Spooling and a file system can provide virtual card readers and virtual line printers
    • A normal user time-sharing terminal serves as the virtual machine operator’s console

Virtual Machines (Cont.)
  [Figure: side-by-side comparison of a non-virtual machine and a virtual machine]

Virtual Machines (Cont.)
  The virtual-machine concept provides complete protection of system resources since each virtual machine is isolated from all other virtual machines. This isolation, however, permits no direct sharing of resources.
  A virtual-machine system is a perfect vehicle for operating-systems research and development. System development is done on the virtual machine instead of on a physical machine, and so does not disrupt normal system operation.
  The virtual machine concept is difficult to implement due to the effort required to provide an exact duplicate of the underlying machine

VMware Architecture
  [Figure: VMware architecture]

Next lecture
  Interrupts and system calls in Linux