CS 519: Operating System Theory
Instructor: Liviu Iftode (iftode@cs)
TA: Nader Boushehrinejadmoradi (naderb@cs)
Fall 2011

Logistics
- Location and schedule: CoRE A, Thursdays from noon to 3 pm
- Instructor: Liviu Iftode, Office: CoRE 311, Office Hours: Thursdays 10-11 am
- TA: Nader Boushehrinejadmoradi, Office: Hill 353, Office Hours: Wednesdays 2-4 pm
- More information: http://www.cs.rutgers.edu/~iftode/cs519-2011.html and the http://sakai.rutgers.edu page for the course

Course Overview
Goals:
- Understand how an operating system works
- Learn how OS concepts are implemented in a real operating system
- Introduction to systems programming
- Learn about performance evaluation
- Learn about current trends in OS research

Course Structure
For each major area:
- Review basic material
- Discuss the implementation in xv6
- Read, present, and discuss interesting papers
- Programming assignments and project

Suggested Studying Approach
- Read the assigned chapter and/or code before the lecture and try to understand it
- Start homework and the project right away; systems programming is not easy and it takes a lot of time!
- Ask questions during the lecture
- Use the mailing list for discussion; do not be afraid to answer a question posted by a colleague even if you are not sure. This is a way to validate your understanding of the material.

Course Topics
- Processes, threads, and synchronization
- Memory management and virtual memory
- CPU scheduling
- File systems and I/O management
- Distributed systems
- New trends in OS research

Textbooks - Required
- Stallings, Operating Systems: Internals and Design Principles, Prentice-Hall. Any recent edition will do.
- Papers available on the Web

Textbooks - Recommended
- Andrew Tanenbaum, Distributed Operating Systems, Prentice-Hall.

MIT xv6 OS
- A teaching OS developed by Russ Cox, Frans Kaashoek, and Robert Morris at MIT
- UNIX V6 ported to Intel x86 machines
- Download the source code and lecture materials from the xv6 home page at MIT

Course Requirements
Prerequisites:
- Undergraduate OS and computer architecture courses
- Good programming skills in C and UNIX
What to expect:
- Several programming assignments with write-ups
- A challenging project (a distributed shared memory protocol)
- Midterm and final exams
- Substantial reading
- Read, understand, and extend/modify xv6 code
- Paper presentations

Homework
Goals:
- Learn to design, implement, and evaluate a significant OS-level software system
- Improve systems programming skills: virtual memory, threads, synchronization, sockets, etc.
Structure:
- All assignments are individual
- Software must execute correctly
- Performance evaluation
- Written report for each assignment

Grading
- Midterm 25%
- Final 25%
- Programming assignments and in-class presentations 25%
- Project 25%

Today
- What is an Operating System? (Stallings 2.1-2.4)
- Architecture refresher ...

What is an operating system?
[Figure: application (user) on top of the operating system on top of the hardware]
- A software layer between the hardware and the application programs/users, which provides a virtual machine interface: easy to use (hides complexity) and safe (prevents and handles errors)
- Acts as a resource manager that allows programs/users to share the hardware resources in a protected way: fair and efficient

How does an OS work?
[Figure: applications issue system calls to the OS and receive upcalls; the hardware-independent OS layer sits on a hardware-dependent layer, which issues commands to the hardware and handles its interrupts]
- Receives requests from the application: system calls
- Satisfies the requests: may issue commands to the hardware
- Handles hardware interrupts: may upcall the application
- OS complexity: synchronous calls + asynchronous events
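As a concrete illustration of the request path sketched above, the short C program below asks the OS to perform I/O through the write system call (here via the POSIX wrapper). The wrapper traps into the kernel, the kernel drives the device, and control eventually returns to the program. This is a minimal sketch for a UNIX-like system, not course-provided code.

    #include <unistd.h>     /* write(): thin wrapper around the write system call */
    #include <string.h>

    int main(void)
    {
        const char *msg = "hello from user space\n";

        /* The wrapper places the system-call number and arguments where the
           kernel expects them and executes a trap instruction.  The kernel
           validates the arguments, issues commands to the device, and returns
           the number of bytes written (or -1) when it switches back to user mode. */
        ssize_t n = write(1 /* stdout */, msg, strlen(msg));

        return (n == (ssize_t)strlen(msg)) ? 0 : 1;
    }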
Mechanism and Policy
[Figure: application (user) / operating system: mechanism + policy / hardware]
- Mechanisms: data structures and operations that implement an abstraction (e.g., the buffer cache)
- Policies: the procedures that guide the selection of a certain course of action from among alternatives (e.g., the replacement policy for the buffer cache)
- A traditional OS is rigid: mechanism comes bundled with policy

Mechanism-Policy Split
- A single policy is often not the best for all cases
- Separate mechanisms from policies: the OS provides the mechanism plus some policy
- Applications may contribute to the policy
- Flexibility + efficiency require new OS structures and/or new OS interfaces

OS Mechanisms and Policies

System Abstraction: Processes
- A process is a system abstraction: the illusion of being the only job in the system
[Figure: the application sees a process; the OS creates and kills processes, provides inter-process communication, and multiplexes resources over the hardware (the computer)]

Processes: Mechanism and Policy
Mechanism:
- Creation, destruction, suspension, context switch, signaling, IPC, etc.
Policy:
- Minor policy questions: Who can create/destroy/suspend processes? How many active processes can each user have?
- Major policy question that we will concentrate on: How to share system resources between multiple processes? Typically broken into a number of orthogonal policies for individual resources such as CPU, memory, and disk.

Processor Abstraction: Threads
- A thread is a processor abstraction: the illusion of having one processor per execution context
[Figure: the application sees an execution context; the OS creates, kills, and synchronizes threads and context-switches them onto the processor]
- Process vs. thread: a process is the unit of resource ownership, while a thread is the unit of instruction execution

Threads: Mechanism and Policy
Mechanism:
- Creation, destruction, suspension, context switch, signaling, synchronization, etc.
Policy:
- How to share the CPU between threads from different processes?
- How to share the CPU between threads from the same process?

Threads
- Traditional approach: the OS uses a single policy (or at most a fixed set of policies) to schedule all threads in the system. Assume two classes of jobs: interactive and batch.
- More sophisticated approaches: application-controlled scheduling, reservation-based scheduling, etc.

Memory Abstraction: Virtual Memory
- Virtual memory is a memory abstraction: the illusion of large contiguous memory, often more memory than is physically available
[Figure: the application sees an address space and issues virtual addresses; the OS implements virtual memory and issues physical addresses to the hardware's physical memory]

Virtual Memory: Mechanism
- Mechanism: virtual-to-physical memory mapping, page faults, etc.
[Figure: per-process virtual address spaces (p1, p2) with virtual-to-physical memory mappings onto physical memory]

Virtual Memory: Policy
- How to multiplex a virtual memory that is larger than the physical memory onto what is available?
- How to share physical memory between multiple processes?
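The virtual memory abstraction is visible from user code. The sketch below (mine, not from the course materials) asks for a large anonymous mapping with POSIX mmap; on most systems physical frames are allocated lazily, on the first touch of each page, via a page fault handled by the virtual memory system. MAP_ANONYMOUS is a Linux/BSD-style extension (some systems spell it MAP_ANON).

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 1UL << 30;   /* ask for 1 GB of virtual address space */

        /* Anonymous, private mapping: the OS reserves virtual addresses now;
           physical memory is typically assigned page by page, on first write. */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        p[0] = 1;               /* touching a page triggers a page fault + allocation */
        p[len - 1] = 2;         /* only the touched pages get physical frames */

        munmap(p, len);
        return 0;
    }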
Virtual Memory
- Traditional approach: the OS provides a sufficiently large virtual address space for each running application, does memory allocation and replacement, and may ensure protection
- More sophisticated approaches: external memory management, huge (64-bit) address spaces, global virtual address spaces

Storage Abstraction: File System
- A file system is a storage abstraction: the illusion of structured storage space
[Figure: the application/user runs "copy file1 file2"; the OS provides files and directories (naming, protection, operations on files) and performs operations on disk blocks on the disk]

File System
Mechanism:
- File creation, deletion, read, write, file-block-to-disk-block mapping, file buffer cache, etc.
Policy:
- Sharing vs. protection?
- Which block to allocate for new data?
- File buffer cache management?

File System
- Traditional approach: the OS does disk block allocation and caching (the buffer cache), disk operation scheduling, and management of the buffer cache
- More sophisticated approaches: application-controlled buffer cache replacement, log-based allocation (makes writes fast)

Communication Abstraction: Messaging
- Message passing is a communication abstraction: the illusion of reliable (and sometimes ordered) message transport
[Figure: the application uses sockets (naming, messages); the OS implements the TCP/IP protocols and sends network packets through the network interface]

Message Passing
Mechanism:
- Send, receive, buffering, retransmission, etc.
Policy:
- Congestion control and routing
- Multiplexing multiple connections onto a single NIC

Message Passing
- Traditional approach: the OS provides naming schemes, reliable transport of messages, and packet routing to the destination
- More sophisticated approaches: user-level protocols, zero-copy protocols, active messages, memory-mapped communication

Character & Block Devices
- The device interface gives the illusion that devices support the same API: character-stream and block access
[Figure: the application/user reads a character from a device; the OS provides the character & block API (naming, protection, read, write) and talks to hardware such as the keyboard and mouse through hardware-specific PIO, interrupt handling, or DMA]

Devices
Mechanisms:
- Open, close, read, write, ioctl, etc.
- Buffering
Policies:
- Protection
- Sharing?
- Scheduling?

UNIX
[Figure: UNIX system structure. Source: Silberschatz, Galvin, and Gagne 2005]

Major Issues in OS Design
- Programming API: what should the VM look like?
- Resource management: how should the hardware resources be multiplexed among multiple users?
- Sharing: how should resources be shared among multiple users?
- Protection: how to protect users from each other? How to protect programs from each other? How to protect the OS from applications and users?
- Communication: how can applications exchange information?
- Structure: how to organize the OS?
- Concurrency: how do we deal with the concurrency that is inherent in OSes?

Major Issues in OS Design
- Performance: how to make it all run fast?
- Reliability: how do we keep the OS from crashing?
- Persistence: how can we make data last beyond program execution?
- Accounting: how do we keep track of resource usage?
- Distribution: how do we make it easier to use multiple computers in conjunction?
- Scaling: how do we keep the OS efficient and reliable as the offered load increases (more users, more processes, more processors)?
Architecture Refresher

von Neumann Machine
- The first computers (late 1940s) were calculators
- The advance was the idea of storing the instructions (coded as numbers) along with the data in the same memory

Conceptual Model
[Figure: a CPU with arithmetic units (+, *, /) connected to memory, a "big byte array" of numbered cells (addresses 0, 1, 2, ...) holding the memory contents]

Operating System Perspective
- A computer is a piece of hardware that runs the fetch-decode-execute loop
- Next slides: walk through a very simple computer to illustrate
  - Machine organization: what the pieces are and how they fit together
  - The basic fetch-decode-execute loop
  - How higher-level constructs are translated into machine instructions
- At its core, the OS builds what looks like a more sophisticated machine on top of this basic hardware

Fetch-Decode-Execute
- Think of the computer as a large, general-purpose calculator that we want to program for multiple functions
- All von Neumann computers follow the same loop:
  - Fetch the next instruction from memory
  - Decode the instruction to figure out what to do
  - Execute the instruction and store the result
- Instructions are simple. Examples:
  - Increment the value of a memory cell by 1
  - Add the contents of memory cells X and Y and store the result in Z
  - Multiply the contents of memory cells A and B and store the result in B

Instruction Encoding
- How to represent instructions as numbers?
- Format: 8 bits of operator (+: 1, -: 2, *: 3, /: 4), two 8-bit operands, and an 8-bit destination

Example Encoding
- Add cell 28 to cell 63 and place the result in cell 100
  - Operator: + (1); source operands: cell 28, cell 63; destination: cell 100
- The instruction as a number:
  - Decimal: 1:28:63:100
  - Binary: 00000001:00011100:00111111:01100100
  - Hexadecimal: 01:1C:3F:64

The Program Counter
- Where is the "next instruction"? A special memory cell in the CPU, the "program counter" (the PC), points to it
- Special-purpose memory in the CPU and devices is called a register
- Naive fetch cycle: increment the PC by the instruction length (4) after each execute
- This assumes all instructions are the same length

Conceptual Model
[Figure: the CPU now holds the arithmetic units and a program counter (containing 4); memory cells 0-3 hold instruction 0 (operator, operand 1, operand 2, destination) at address 0, and instruction 1 starts at memory address 4]

Memory Indirection
- How do we access array elements efficiently if all we can do is name a cell?
- Modify the operand to allow fetching an operand "through" a memory location
- E.g., LOAD [5], 2 means: fetch the contents of the cell whose address is in cell 5 and put it into cell 2
- So, if cell 5 held the number 100, we would place the contents of cell 100 into cell 2
- This is called indirection: fetch the contents of the cell "pointed to" by the cell named in the opcode
- Use an operand bit to signify whether indirection is desired

Conditionals and Looping
- Primitive "computers" only followed linear instruction sequences
- The breakthrough in early computing was the addition of conditionals and branching: instructions that modify the program counter
- Conditional instructions: if the content of this cell is [positive, not zero, etc.], execute the instruction or not
- Branch instructions: if the content of this cell is [zero, non-zero, etc.], set the PC to this location
- jump is an unconditional branch
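To tie together the encoding, the program counter, and branching before the assembly example that follows, here is a minimal emulator sketch in C. The 4-byte encoding, the opcode numbers 1-4, and the increment-PC-by-4 fetch cycle come from the slides above; the BZ/JUMP opcode numbers and the halt opcode are illustrative assumptions of the sketch.

    #include <stdint.h>
    #include <stdio.h>

    /* Opcodes 1-4 follow the encoding on the slides; 5 and 6 (BZ, JUMP)
       and the halt opcode 0 are illustrative additions. */
    enum { OP_ADD = 1, OP_SUB = 2, OP_MUL = 3, OP_DIV = 4, OP_BZ = 5, OP_JUMP = 6 };

    static uint8_t mem[256];          /* the "big byte array" */

    int main(void)
    {
        /* Program: mem[100] = mem[28] + mem[63], encoded as 01:1C:3F:64,
           followed by a halt instruction. */
        uint8_t prog[] = { OP_ADD, 28, 63, 100,  0, 0, 0, 0 };
        for (unsigned i = 0; i < sizeof prog; i++) mem[i] = prog[i];
        mem[28] = 5; mem[63] = 7;

        unsigned pc = 0;                      /* the program counter */
        for (;;) {
            /* Fetch: all instructions are 4 bytes long. */
            uint8_t op = mem[pc], a = mem[pc + 1], b = mem[pc + 2], d = mem[pc + 3];
            pc += 4;

            /* Decode + execute. */
            switch (op) {
            case OP_ADD:  mem[d] = mem[a] + mem[b]; break;
            case OP_SUB:  mem[d] = mem[a] - mem[b]; break;
            case OP_MUL:  mem[d] = mem[a] * mem[b]; break;
            case OP_DIV:  mem[d] = mem[a] / mem[b]; break;
            case OP_BZ:   if (mem[a] == 0) pc = d; break;  /* branch if cell a is 0 */
            case OP_JUMP: pc = d; break;                   /* unconditional branch */
            default:      printf("mem[100] = %d\n", mem[100]); return 0;  /* halt */
            }
        }
    }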
Example: While Loop
- C code:

    while (counter > 0) {
        sum = sum + Y[counter];
        counter--;
    }

- Variables map to memory cells: counter is cell 1, sum is cell 2, index is cell 3, Y[0] is cell 4, Y[1] is cell 5, ...
- Assembly:

    Address  Label   Mnemonic      English
    100      LOOP:   BZ 1,END      Branch to the address of END if cell 1 is 0
    104              ADD 2,[3],2   Add cell 2 and the value of the cell pointed to by cell 3, then place the result in cell 2
    108              DEC 3         Decrement cell 3 by 1
    112              DEC 1         Decrement cell 1 by 1
    116              JUMP LOOP     Start executing from the address of LOOP
    120      END:    <next code block>

Registers
- Architecture rule: large memories are slow, small ones are fast. But everyone wants more memory!
- Solution: put a small amount of memory in the CPU for faster operation
- Most programs work on only small chunks of memory in a given time period. This is called locality.
- So, if we cache the contents of a small number of memory cells in the CPU memory, we might be able to execute many instructions before having to access memory
- The small memory in the CPU is named separately in the instructions from the "main memory"
- Small memory in the CPU = registers; large memory = main memory

Register Machine Model
[Figure: the CPU now contains arithmetic units, logic units (<, >, !=), the program counter (8), and registers (register 0 = 24, register 1 = 100, register 2 = 18), connected to main memory cells 0-9]

Registers (cont)
- Most CPUs have 16-32 "general-purpose" registers; all look the "same": any combination of operators, operands, and destinations is possible
- Operands and destination can be in:
  - Registers only (SPARC, PowerPC, MIPS, Alpha)
  - Registers and one memory operand (Intel x86 and clones)
  - Any combination of registers and memory (VAX)
- The only memory operations possible in "register-only" machines are load from and store to memory
- Operations are 100-1000 times faster when operands are in registers than when they are in memory
- Registers save instruction space too: only 16-32 registers need to be addressed, not GBs of memory

Typical Instructions
- Add the contents of register 2 and register 3 and place the result in register 5: ADD r2,r3,r5
- Add 100 to the PC if register 2 is not zero (a relative branch): BNZ r2,100
- Load the contents of the memory location whose address is in register 5 into register 6: LDI r5,r6

Memory Hierarchy
[Figure: CPU <- word transfer -> cache <- block transfer -> main memory <- page transfer -> disks. Moving down the hierarchy: cost per bit decreases, frequency of access decreases, capacity increases, access time increases, and the size of the transfer unit increases]

Memory Access Costs (processors introduced in 2003)

    Intel Pentium IV Extreme Edition (3.2 GHz, 32 bits)
      Level   Size    Assoc    Block Size   Access Time
      L1      8KB     4-way    64B          2 cycles
      L2      512KB   8-way    64B          19 cycles
      L3      2MB     8-way    64B          43 cycles
      Mem                                   206 cycles

    AMD Athlon 64 FX-53 (2.4 GHz, 64 bits, on-chip memory controller)
      Level   Size    Assoc    Block Size   Access Time
      L1      128KB   2-way    64B          3 cycles
      L2      1MB     16-way   64B          13 cycles
      Mem                                   125 cycles

Memory Access Costs (processors introduced in 2008)

    Intel Core 2 Quad Q9450 (2.66 GHz, 64 bits)
      Level       Size    Assoc    Block Size   Access Time
      L1          128KB   8-way    64B          3 cycles
      shared L2   6MB     24-way   64B          15 cycles
      Mem                                       229 cycles

    Quad-core AMD Opteron 2360 (2.5 GHz, 64 bits)
      Level       Size    Assoc    Block Size   Access Time
      L1          128KB   2-way    64B          3 cycles
      L2          512KB   16-way   64B          7 cycles
      shared L3   2MB     32-way   64B          19 cycles
      Mem                                       356 cycles
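The cost gap in the tables above is visible from ordinary user code. The sketch below is an illustrative microbenchmark (mine, not from the course materials): it sums the same 2D array in row-major and then column-major order. The row-major walk uses every byte of each cache block it fetches (spatial locality), while the column-major walk touches a different block on almost every access and typically runs several times slower.

    #include <stdio.h>
    #include <time.h>

    #define N 4096                        /* 4096 x 4096 ints = 64 MB */

    int main(void)
    {
        static int a[N][N];               /* zero-initialized, lives in BSS */
        long sum = 0;
        clock_t t;

        /* Row-major walk: consecutive addresses, so each cache block
           fetched from memory is fully used before moving on. */
        t = clock();
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += a[i][j];
        printf("row-major:    %.2f s\n", (double)(clock() - t) / CLOCKS_PER_SEC);

        /* Column-major walk: each access lands on a different cache block,
           so most references miss in the caches and go to main memory. */
        t = clock();
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += a[i][j];
        printf("column-major: %.2f s\n", (double)(clock() - t) / CLOCKS_PER_SEC);

        return (int)(sum & 1);            /* keep the compiler from dropping the loops */
    }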
Hardware Caches
- Motivated by the mismatch between processor and memory speed
- Closer to the processor than the main memory; smaller and faster than the main memory
- Act as "attraction memory": they contain the values of main memory locations that were recently accessed (temporal locality)
- Transfer between caches and main memory is performed in units called cache blocks/lines
- Caches also contain the values of memory locations that are close to locations that were recently accessed (spatial locality)

Cache Architecture
[Figure: CPU with L1 and L2 caches in front of memory; an example cache with 2 ways and 6 sets, showing a cache line and its associativity. Cache lines are typically ~32-128 bytes and associativity ~2-32. Miss types: capacity misses, conflict misses, cold misses]

Cache Design Issues
[Figure: CPU <- word transfer -> cache <- block transfer -> main memory]
- Cache size and cache block size
- Mapping: physical/virtual caches, associativity
- Replacement algorithm: direct or LRU
- Write policy: write-through or write-back

Abstracting the Machine
- The bare hardware provides a computation device. How do we share this expensive piece of equipment between multiple users?
- Sign up during certain hours? Give the program to an operator who runs it and gives you the results?
- Better: software that gives each user the illusion of having the machine all to themselves while actually sharing it with others!
- This software is the Operating System
- Hardware support is needed to "virtualize" the machine

Architecture Features for the OS
- Next, we will look at the mechanisms hardware designers add to allow OS designers to abstract the basic machine in software:
  - Processor modes
  - Exceptions
  - Traps
  - Interrupts
- These require modifications to the basic fetch-decode-execute cycle in hardware

Processor Modes
- OS code is stored in memory ... von Neumann model, remember? What if a user program modifies OS code or data?
- Introduce modes of operation: instructions can be executed in user mode or system mode
- A special register holds which mode the CPU is in
- Certain instructions can only be executed in system mode; likewise, certain memory locations can only be written in system mode
- Only OS code is executed in system mode, and only the OS can modify its memory
- The mode register can only be modified in system mode

Simple Protection Scheme
- Addresses < 100 are reserved for OS use
- A mode register is provided: zero = SYS = the CPU is executing the OS (in system mode); one = USR = the CPU is executing in user mode
- The hardware does this check:
  - On every fetch, if the mode bit is USR and the address is less than 100, do not execute the instruction
  - When accessing operands, if the mode bit is USR and the operand address is less than 100, do not execute the instruction
- The mode register can only be set if the mode is SYS

Simple Protection Model
[Figure: the CPU (arithmetic units, logic units, program counter, registers 0-31, and a mode register set to 0) next to memory: addresses 0-99 hold the OS, addresses 100 and up hold user code and data]

Fetch-decode-execute Revised

    Fetch:
        if ((PC < 100) && (mode register == USR)) then
            Error! User tried to access the OS
        else
            fetch the instruction at the PC
    Decode:
        if ((destination register == mode) && (mode register == USR)) then
            Error! User tried to set the mode register
        < more decoding >
    Execute:
        if ((an operand < 100) && (mode register == USR)) then
            Error! User tried to access the OS
        else
            execute the instruction
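Continuing the illustrative emulator from earlier (my sketch, not course code), the protection scheme above amounts to a few extra comparisons in the loop. The constants come from the slides: the OS lives below address 100, SYS = 0, USR = 1.

    #include <stdint.h>

    enum { SYS = 0, USR = 1 };
    enum { OS_LIMIT = 100 };          /* addresses below 100 belong to the OS */

    static uint8_t mem[256];
    static unsigned pc;
    static unsigned mode = USR;       /* the mode register */

    /* Returns nonzero if a USR-mode access to addr must be rejected.
       The same check applies to operand addresses in the execute phase. */
    static int violates_protection(unsigned addr)
    {
        return mode == USR && addr < OS_LIMIT;
    }

    /* One protected fetch step: the hardware refuses to fetch OS memory
       while running in user mode. */
    static int protected_fetch(uint8_t instr[4])
    {
        if (violates_protection(pc))
            return -1;                /* error: user tried to access the OS */
        for (int i = 0; i < 4; i++)
            instr[i] = mem[pc + i];
        pc += 4;
        return 0;
    }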
Exceptions
- What happens when a user program tries to access memory holding operating system code or data? Answer: exceptions
- An exception occurs when the CPU encounters an instruction that cannot be executed
- Modify the fetch-decode-execute loop to jump to a known location in the OS when an exception happens
- Different errors jump to different places in the OS (they are "vectored", in OS speak)

Fetch-decode-execute with Exceptions
- 60 is the well-known entry point for a memory violation; 64 is the well-known entry point for a mode register violation

    Fetch:
        if ((PC < 100) && (mode register == USR)) then
            set the PC = 60
            set the mode = SYS
        fetch the instruction at the PC
    Decode:
        if ((destination register == mode) && (mode register == USR)) then
            set the PC = 64
            set the mode = SYS
            goto fetch
        < more decoding >
    Execute:
        < check the operands for a violation >

Access Violations
- Notice that both instruction fetches from memory and data accesses must be checked
- The execute phase must check both operands
- The execute phase must check again when performing an indirect load
- This is a very primitive memory protection scheme. We'll cover more complex virtual memory mechanisms and policies later in the course.

Recovering from Exceptions
- The OS can figure out what caused the exception from the entry point
- But how can it figure out where in the user program the problem was?
- Solution: add another register, the PC'. When an exception occurs, save the current PC to PC' before loading the PC with a new value.
- The OS can examine the PC' and perform some recovery action:
  - Stop the user program and print an error message: error at address PC'
  - Run a debugger

Fetch-decode-execute with Exceptions & Recovery

    Fetch:
        if ((PC < 100) && (mode register == USR)) then
            set the PC' = PC
            set the PC = 60
            set the mode = SYS
    Decode:
        if ((destination register == mode) && (mode register == USR)) then
            set the PC' = PC
            set the PC = 64
            set the mode = SYS
            goto fetch
        < more decoding >
    Execute:
        ...

Traps
- Now we know what happens when a user program illegally tries to access OS code or data. How does a user program legitimately access OS services?
- Solution: the trap instruction
- A trap is a special instruction that forces the PC to a known address and sets the mode to system mode
- Unlike exceptions, traps carry some arguments to the OS
- The foundation of the system call

Fetch-decode-execute with Traps

    Fetch:
        if ((PC < 100) && (mode register == USR)) then
            < memory exception >
    Decode:
        if (instruction is a trap) then
            set the PC' = PC
            set the PC = 68
            set the mode = SYS
            goto fetch
        if ((destination register == mode) && (mode register == USR)) then
            < mode exception >
    Execute:
        ...

Traps
- How does the OS know which service the user program wants to invoke on a trap?
- The user program passes the OS a number that encodes which OS service is desired
- This example machine could include the trap ID in the instruction itself: trap opcode + trap service ID
- Most real CPUs have a convention for passing the trap ID in a set of registers, e.g., the user program sets register 0 to the trap ID and then executes the trap instruction

Returning from a Trap
- How do we "get back" to user mode and the user's code after a trap?
- Set the mode register = USR, then set the PC? But after the mode bit is set to user: exception!
- Set the PC, then set the mode bit? That jumps to "user-land" while still in kernel mode
- Most machines have a "return from exception" instruction: a single hardware instruction that sets the PC to PC' and sets the mode bit to user mode
- Traps and exceptions use the same mechanism (RTE)
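On the OS side, the trap ID typically selects a service routine. The sketch below is a hypothetical dispatcher in C for the example machine above; the names, the saved-register layout, and the handler table are my assumptions, not the course's. Hardware has already saved the user PC into PC' and entered at the trap entry point; software reads the ID from register 0, calls the matching handler, and then returns to user mode with RTE.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical snapshot of the user state saved on entry to the OS. */
    struct trapframe {
        uint32_t regs[32];        /* regs[0] holds the trap (service) ID */
        uint32_t pc_prime;        /* PC' = user PC at the time of the trap */
        uint32_t mode;
    };

    typedef int (*trap_handler_t)(struct trapframe *tf);

    static int sys_read(struct trapframe *tf)  { (void)tf; /* ... */ return 0; }
    static int sys_write(struct trapframe *tf) { (void)tf; /* ... */ return 0; }

    /* Trap IDs index into a table of service routines. */
    static trap_handler_t const trap_table[] = {
        [0] = sys_read,
        [1] = sys_write,
    };

    /* Called from the trap entry point (address 68 on the example machine). */
    int trap_dispatch(struct trapframe *tf)
    {
        uint32_t id = tf->regs[0];
        if (id >= sizeof trap_table / sizeof trap_table[0] || !trap_table[id])
            return -1;                       /* unknown service: report an error */
        int ret = trap_table[id](tf);
        tf->regs[0] = (uint32_t)ret;         /* return value goes back in register 0 */
        return ret;                          /* the caller then executes RTE */
    }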
Fetch-decode-execute with Traps (RTE)

    Fetch:
        if ((PC < 100) && (mode register == USR)) then
            < memory exception >
    Decode:
        if (instruction is RTE) then
            set the PC = PC'
            set the mode = USR
            goto fetch
        if ((destination register == mode) && (mode register == USR)) then
            < mode exception >
    Execute:
        ...

Interrupts
- How can we force the CPU back into system mode if the user program is off computing something?
- Solution: interrupts
- An interrupt is an external event that causes the CPU to jump to a known address
- Link an interrupt to a periodic clock
- Modify the fetch-decode-execute loop to check an external line set periodically by the clock

Simple Interrupt Model
[Figure: the CPU (arithmetic units, logic units, program counter, registers 0-31, PC', mode register) with an interrupt line and a reset line connected to a clock, next to memory holding the OS (low addresses) and user code]

The Clock
- The clock starts counting to 10 milliseconds
- When 10 milliseconds elapse, the clock sets the interrupt line "high" (e.g., sets it to logic 1)
- When the CPU toggles the reset line, the clock sets the interrupt line low and starts counting to 10 milliseconds again

Fetch-decode-execute with Interrupts

    Fetch:
        if (clock interrupt line == 1) then
            set the PC' = PC
            set the PC = 72
            set the mode = SYS
            goto fetch
        if ((PC < 100) && (mode register == USR)) then
            < memory exception >
        fetch the next instruction
    Decode:
        if (instruction is a trap) then
            < trap exception >
        if ((destination register == mode) && (mode register == USR)) then
            < mode exception >
        < more decoding >
    Execute:
        ...

Entry Points
- What are the "entry points" for our little example machine?
  - 60: memory access violation
  - 64: mode register violation
  - 68: user-initiated trap
  - 72: clock interrupt
- Each entry point is typically a jump to some code block in the OS
- All real OSes have a set of entry points for exceptions, traps, and interrupts; sometimes they are combined and software has to figure out what happened

Saving and Restoring Context
- Recall the processor state: PC, PC', R0-R31, mode register
- When an entry to the OS happens, we want to start executing the correct routine and then return to the user program so that it can continue executing normally
- We can't just start using the registers in the OS!
- Solution: save/restore the user context
  - Use OS memory to save all the CPU state
  - Before returning to the user, reload all the registers and then execute a return from exception instruction

Input and Output
- How can humans get at the data? How do we load programs? What happens if I turn the machine off? Can I send the data to another machine?
- Solution: add devices to perform these tasks: keyboards, mice, graphics; disk drives; network cards

A Simple I/O Device
- A network card has 2 registers:
  - A store into the "transmit" register sends the byte over the wire; transmit is often written as TX (e.g., the TX register)
  - A load from the "receive" register reads the last byte that was read from the wire; receive is often written as RX
- How does the CPU access these registers?
- Solution: map them into the memory space
  - An instruction that accesses memory cell 98 really accesses the transmit register instead of memory
  - An instruction that accesses memory cell 99 really accesses the receive register
- These registers are said to be memory-mapped
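In C, OS code typically reaches memory-mapped device registers through volatile pointers to fixed addresses, so the compiler performs every access exactly as written. Below is a minimal sketch for the example card above; the addresses 98 and 99 come from the slides, while treating each register as a one-byte location is an assumption of the sketch (this only makes sense on the example machine or with an appropriate mapping).

    #include <stdint.h>

    /* The example machine maps the network card at cells 98 and 99.
       volatile tells the compiler each access has a side effect and must
       not be cached in a register or optimized away. */
    #define TX_REG  ((volatile uint8_t *)98)   /* store here to transmit a byte */
    #define RX_REG  ((volatile uint8_t *)99)   /* load from here to read a byte */

    static void net_send_byte(uint8_t b)
    {
        *TX_REG = b;          /* a plain store: the hardware puts b on the wire */
    }

    static uint8_t net_recv_byte(void)
    {
        return *RX_REG;       /* a plain load: the last byte received from the wire */
    }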
Basic Network I/O
[Figure: the CPU and memory as before, with the network card's transmit and receive registers mapped at memory cells 98 and 99, plus the clock with its interrupt and reset lines]

Why Memory-Mapped Registers?
- "Stealing" memory space for device registers serves two functions:
  - It allows protected access: only the OS can access the device. User programs must trap into the OS to access I/O devices because of the normal protection mechanisms in the processor. (Why do we want to prevent direct access to devices by user programs?)
  - The OS can control devices and move data to/from devices using regular load and store instructions; no changes to the instruction set are required
- This is called programmed I/O

Status Registers
- How does the OS know when a new byte has arrived? How does the OS know when the last byte has been transmitted (so it can send another one)?
- Solution: status registers. A status register holds the state of the last I/O operation.
- Our network card has one status register:
  - To transmit, the OS writes a byte into the TX register and sets bit 0 of the status register to 1. When the card has successfully transmitted the byte, it sets bit 0 of the status register back to 0.
  - When the card receives a byte, it puts the byte in the RX register and sets bit 1 of the status register to 1. After the OS reads the data, it sets bit 1 of the status register back to 0.

Polled I/O
- To transmit:

    while (status register bit 0 == 1) ;    // wait for the card to be ready
    TX register = data;
    status reg = status reg | 0x1;          // tell the card to TX (set bit 0 to 1)

- Naive receive:

    while (status register bit 1 != 1) ;    // wait for data to arrive
    data = RX register;
    status reg = status reg & ~0x2;         // tell the card we got the data (clear bit 1)

- We can't stall the OS waiting to receive! Solution: poll after the clock ticks:

    if (status register bit 1 == 1) {
        data = RX register;
        status reg = status reg & ~0x2;     // clear bit 1
    }

Interrupt-driven I/O
- Polling can waste many CPU cycles:
  - On transmit, the CPU slows to the speed of the device
  - We can't block on receive, so we tie polling to the clock, but that is wasted work if no RX data has arrived
- Solution: use interrupts
  - When the network card has data to receive, it signals an interrupt
  - When data is done transmitting, it signals an interrupt

Polling vs. Interrupts
- Why poll at all? Interrupts have high overhead:
  - Stop the processor
  - Figure out what caused the interrupt
  - Save user state
  - Process the request
- The key factor is the frequency of I/O vs. the interrupt overhead

Direct Memory Access (DMA)
- Problem with programmed I/O: the CPU must load/store all the data into device registers, and the data is probably in memory anyway!
- Solution: more hardware that allows the device to read and write memory just like the CPU
  - Base + bound or base + count registers in the device
  - Set the base and count registers, then set the start-transmit register
  - The I/O device reads memory starting from base and interrupts when done

PIO vs. DMA
- Overhead is lower for PIO than for DMA: PIO is a check against the status register, then a send or receive; DMA must set up the base and count, check status, and take an interrupt
- DMA is more efficient at moving data: PIO ties up the CPU for the entire length of the transfer
- The size of the transfer becomes the key factor in choosing PIO vs. DMA
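A sketch of how the OS might program a DMA transmit on a device with the base/count/start registers described above. The register addresses, layout, and bit meanings here are hypothetical, invented purely for illustration; a real driver would follow the device's datasheet.

    #include <stdint.h>

    /* Hypothetical memory-mapped DMA registers for the example card. */
    #define DMA_BASE   ((volatile uint32_t *)0x200)  /* physical address of the buffer */
    #define DMA_COUNT  ((volatile uint32_t *)0x204)  /* number of bytes to transfer */
    #define DMA_CTRL   ((volatile uint32_t *)0x208)  /* bit 0: start transmit */
    #define DMA_STATUS ((volatile uint32_t *)0x20C)  /* bit 0: transfer in progress */

    /* Start a DMA transmit: the device fetches the bytes from memory on its
       own and raises an interrupt when the transfer completes. */
    static void dma_transmit(uint32_t buf_phys_addr, uint32_t len)
    {
        while (*DMA_STATUS & 0x1)      /* wait for any previous transfer to finish */
            ;
        *DMA_BASE  = buf_phys_addr;    /* tell the device where the data lives */
        *DMA_COUNT = len;              /* and how much to move */
        *DMA_CTRL  = 0x1;              /* go: the CPU is now free to do other work */
    }

    /* The interrupt handler registered at the device's entry point would
       acknowledge the device and hand the buffer back to the OS. */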
Typical I/O Devices
- Disk drives: present the CPU with a linear array of fixed-size blocks that are persistent across power cycles
- Network cards: allow the CPU to send and receive discrete units of data (packets) across a wire, fiber, or radio; packet sizes of 64 bytes to 8 KB are typical
- Graphics adapters: present the CPU with a memory that is turned into pixels on a screen

Recap: The I/O Design Space
- Polling vs. interrupts: how does the device notify the processor that an event happened?
  - Polling: the device is passive; the CPU must read/write a register
  - Interrupt: the device signals the CPU via an interrupt
- Programmed I/O vs. DMA: how does the device send and receive data?
  - Programmed I/O: the CPU must use loads/stores into the device
  - DMA: the device reads and writes memory

Practical: How to Boot?
- How does a machine start running the operating system in the first place?
- The process of starting the OS is called booting
- A sequence of hardware + software events forms the boot protocol
- The boot protocol in modern machines is a 3-stage process:
  - The CPU starts executing from a fixed address
  - Firmware loads the boot loader
  - The boot loader loads the OS

Boot Protocol
(1) The CPU is hard-wired to start executing from a known address in memory. This memory address is typically mapped to solid-state persistent memory (e.g., ROM, EPROM, Flash).
(2) The persistent memory contains the "boot" code. This kind of software is called firmware. On x86, the starting address corresponds to the BIOS (basic input-output system) boot entry point. This code reads 1 block from the disk drive; the block is loaded and then executed. This program is the boot loader.

Boot Protocol (cont)
(3) The boot loader can then load the rest of the operating system from disk. Note that at this point the OS still is not running.
- The boot loader can know about multiple operating systems
- The boot loader can know about multiple versions of the OS

Why Have a Boot Protocol?
- Why not just store the OS in persistent memory? To separate the OS from the hardware:
  - Multiple OSes or different versions of the OS
  - Booting from different devices, e.g., security via a network boot
  - The OS is pretty big (tens of MBs); we would rather not have it as firmware

Basic Computer Architecture
[Figure: a single-CPU-chip computer (single-threaded, multithreaded, or multi/many-core with cores 1..n) and memory on a memory bus, with the disk and network interface on an I/O bus]

Caching Inside a 4-Core CPU
[Figure: four cores on one CPU chip, each with private L1 caches (which must be kept coherent), sharing an L2 cache]
Multi-CPU-Chip Multiprocessors
[Figure: several CPUs, each with a cache (the last level of hardware caching), sharing a memory bus with memory; the disk and network interface sit on an I/O bus]
- Simple scheme (SMP): more than one CPU on the same bus
- Memory is shared among the CPUs, with cache coherence between the last-level caches
- Bus contention increases, so this does not scale
- Alternative (non-bus) system interconnects are complex and expensive
- SMPs naturally support single-image operating systems

Cache-Coherent Shared Memory: UMA
[Figure: several cores with snooping caches in front of a single shared memory]

CC-NUMA Multiprocessors
[Figure: several nodes, each with a CPU + cache, a local memory with its memory controller, and an I/O bus with a disk, connected by an interconnection network]
- Non-uniform access to the different memories
- Hardware allows remote memory accesses and maintains cache coherence
- A scalable interconnect: more scalable than bus-based UMA systems
- Also naturally supports single-image operating systems
- Requires complex hardware coherence protocols

Multicomputers
[Figure: complete computers (CPU + cache, memory, disk, network interface), each with its own memory and I/O buses, connected only by a network]
- A network of computers: "share nothing" -- cheap
- Distributed resources: difficult to program
- Message passing; distributed file systems
- Challenge: build an efficient global abstraction in software

OS Issues in Different Architectures

UMA Multiprocessors
[Figure: SMP as above: CPUs with caches on a shared memory bus, plus the disk and network interface on the I/O bus]

UMA Multiprocessors: OS Issues
- Processes: how to divide the processors among multiple processes? Time sharing vs. space sharing
- Threads: synchronization mechanisms based on shared memory; how to schedule the threads of a single process on its allocated processors? Affinity scheduling?

CC-NUMA Multiprocessors
[Figure: the same CC-NUMA organization as above]
- Hardware allows remote memory accesses and maintains cache coherence through a protocol

CC-NUMA Multiprocessors: OS Issues
- Memory locality!! A remote memory access can be up to an order of magnitude more expensive than a local access
- Thread migration vs. page migration
- Page replication
- Affinity scheduling

Multicomputers
[Figure: the same share-nothing organization as above]

Multicomputers: OS Issues
- Scheduling: node allocation? (CPU and memory are allocated together) Process migration?
- Software distributed shared memory (Soft DSM)
- Distributed file systems
- Low-latency reliable communication
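The shared-memory synchronization mentioned under the UMA OS issues relies on atomic operations that the cache-coherence protocol makes visible to all cores. Here is a minimal spinlock sketch using C11 atomics; it is illustrative only, not how any particular OS implements its locks.

    #include <stdatomic.h>

    /* A test-and-set spinlock: atomic_flag_test_and_set is a single atomic
       read-modify-write, so only one core at a time can observe the flag
       as clear and claim the lock. */
    typedef struct {
        atomic_flag locked;
    } spinlock_t;

    #define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

    static void spin_lock(spinlock_t *l)
    {
        while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
            ;   /* spin: each failed attempt bounces the cache line between cores */
    }

    static void spin_unlock(spinlock_t *l)
    {
        atomic_flag_clear_explicit(&l->locked, memory_order_release);
    }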
OS Structure

Traditional OS Structure
- Monolithic/layered systems: one layer (or N layers), all executed in "kernel mode"
- Good performance, but rigid
[Figure: user processes make system calls into a single OS kernel containing the file system and memory system, which runs on the hardware]

Micro-kernel OS
[Figure: client processes, a file server, and a memory server all run in user mode and communicate via IPC through a micro-kernel running on the hardware]
- Client-server model, with IPC between clients and servers
- The micro-kernel provides protected communication
- OS functions are implemented as user-level servers
- Flexible, but efficiency is the problem
- Easy to extend to distributed systems

Extensible OS
[Figure: user processes can load customized services (e.g., my memory service) into an extensible kernel alongside the default services (e.g., the default memory service)]
- User processes can load customized OS services into the kernel
- Good performance and flexibility, but protection and scalability become problems

Exokernels
[Figure: applications link against OS libraries at user level and ask the exokernel to allocate resources; the exokernel runs directly on the hardware]
- The kernel provides only a very low-level interface to the hardware
- The idea is to allow an application to manage its own resources (the kernel ensures that a resource is free and that the application has the right to access it)
- OS functionality is implemented as user-level libraries to simplify programming

Some History ...

Brief OS History
- In the beginning, there really wasn't an OS
  - Program binaries were loaded using switches
  - The interface included blinking lights (cool!)
- Then came batch systems
  - An OS was implemented to transfer control from one job to the next
  - The OS was always resident in memory: the resident monitor
  - The operator provided the machine/OS with a stream of programs separated by delimiters
  - Typically the input device was a card reader, so the delimiters were known as control cards

Spooling
- CPUs were much faster than card readers and printers
- Disks were invented, and disks were much faster than card readers and printers
- So, what do we do? Pipelining ... what else?
  - Read job 1 from cards to disk. Run job 1 while reading job 2 from cards to disk; save the output of job 1 to disk. Print the output of job 1 while running job 2 while reading job 3 from cards to disk. And so on ...
- This is known as spooling: Simultaneous Peripheral Operation On-Line
- Can use multiple card readers and printers to keep up with the CPU if needed
- Improves both system throughput and response time

Multiprogramming
- CPUs were still idle whenever the executing program needed to interact with a peripheral device, e.g., reading more data from tape
- Multiprogrammed batch systems were invented
  - Load multiple programs onto disk at the same time (later, into memory)
  - Switch from one job to another when the first job performs an I/O operation
  - Overlap the I/O of one job with the computation of another job
- Peripherals have to be asynchronous; we have to know when an I/O operation is done: interrupt vs. polling
- Increases system throughput, possibly at the expense of response time

Time-Sharing
- As you can imagine, batching was a big pain: you submit a job, you twiddle your thumbs for a while, you get the output, see a bug, try to figure out what went wrong, resubmit the job, etc.
- Even running production programs was difficult in this environment
- Technology got better: we can now have terminals and support interactive interfaces
- How do we share a machine (remember, machines were expensive back then) between multiple people and still maintain an interactive user interface?
- Time-sharing:
  - Connect multiple terminals to a single machine
  - Multiplex the machine between multiple users
  - The machine has to be fast enough to give the illusion that each user has their own machine
- Multics was the first large time-sharing system (mid-1960s)
Parallel OS
- Some applications comprise tasks that can be executed simultaneously: weather prediction, scientific simulations, recalculation of a spreadsheet
- Execution can be sped up by running these tasks in parallel on many processors
- Need OS, compiler, and/or language support for dividing programs into multiple parallel activities
- Need OS support for fast communication and synchronization
- Many different parallel architectures
- The main goal is performance

Real-Time OS
- Some applications have time deadlines by which they must complete certain tasks
- Hard real-time systems
  - Medical imaging systems, industrial control systems, etc.
  - Catastrophic failure if the system misses a deadline. What happens if the collision avoidance software on an oil tanker does not detect another ship before the "turning or braking" distance of the tanker?
  - The challenge lies in how to meet deadlines with minimal resource waste
- Soft real-time systems
  - Multimedia applications
  - Missing a few deadlines may be annoying but is not catastrophic
  - Challenge 1: how to meet most deadlines with minimal resource waste
  - Challenge 2: how to shed load if the system becomes overloaded

Distributed OS
- Clustering
  - Use multiple small machines to handle large service demands
  - Cheaper than using one large machine
  - Better potential for reliability, incremental scalability, and absolute scalability
- Wide-area distributed systems
  - Allow use of geographically distributed resources, e.g., using a local PC to access web services; we don't have to carry the needed information with us
- Need OS support for communication and sharing of distributed resources, e.g., network file systems
- Want performance (although speedup is not the metric of interest here), high reliability, and use of diverse resources

Embedded OS
- Pervasive computing: right now, cell phones and PDAs; in the future, computational elements everywhere
- Characteristics: constrained resources -- slow CPU, small memories, no disk, etc.
- What's new about this? Isn't this just like the old computers?
  - Well, no, because we want to execute more powerful programs than before
  - How can we execute more powerful programs if our hardware is similar to the old hardware? Use many, many of them, and augment them with services running on powerful machines
- OS support for power management, mobility, resource discovery, etc.
Virtual Machines and Hypervisors
- Popular in the 60s and 70s, vanished in the 80s and 90s
- Idea: partition a physical machine into a number of virtual machines
  - Each virtual machine behaves as a separate computer
  - Can support heterogeneous operating systems (called guest OSes)
  - Provides performance isolation and fault isolation
  - Facilitates virtual machine migration
  - Facilitates server consolidation
- Hypervisor or Virtual Machine Monitor: the underlying software that manages the multiple virtual machines

Virtual Machines and Hypervisors
[Figure: virtual machine organization. Source: Silberschatz, Galvin, and Gagne 2005]

Virtual Machines: Another Architecture
[Figure: an alternative virtual machine architecture. Source: Silberschatz, Galvin, and Gagne 2005]

Backup Slides

The UNIX Time-Sharing System
Features:
- Time-sharing system
- Hierarchical file system
- System command language (shell)
- File-based, device-independent I/O
Versions 1 & 2:
- No multiprogramming
- Ran on the PDP-7, 9, and 11

More History
Version 4:
- Ran on the PDP-11 (hardware costing < $40k)
- Took less than 2 man-years to code
- ~50KB code size (kernel)
- Written in C

File System
- Ordinary files (uninterpreted)
- Directories: files of files, organized as a rooted tree
  - Pathnames (relative and absolute)
  - Each directory contains links to its parent and to itself
- Multiple links to a file can exist
  - A link can be hard (a different name for the same file; modifications are seen under both names, but erasing one does not affect the other) or symbolic (a pointer to a file; erasing the file leaves pointers hanging)

File System (contd)
- Special files: each I/O device is associated with a special file
  - Provides a uniform naming and protection model
  - Uniform I/O

Removable File Systems
- Tree-structured file hierarchies
- Mounted onto existing space using mount
- No links between different file systems

Protection
- A user id (uid) is marked on files
- Ten protection bits: nine are rwx permissions for user, group & other; the setuid bit is used to change the user id
- The super-user has a special uid that is exempt from constraints on access

Uniform I/O Model
- Basic system calls: open, close, creat, read, write, seek
- Streams of bytes, no records
- No locks visible to the user

File System Implementation
- I-node: contains a short description of one file, with direct, single-indirect, and double-indirect pointers to disk blocks
- I-list: a table of i-nodes, indexed by i-number; pathname scanning determines the i-number
- Allows a simple and efficient fsck
- Buffered data, with different disk write policies for data and metadata

Processes
- Process: memory image, register values, status of open files, etc.
  - The memory image consists of text, data, and stack segments
- To create new processes: pid = fork(); the process splits into two independently executing processes (parent and child)
- Pipes are used for communication between related processes
- exec(file, arg1, ..., argn) is used to start another application

The Shell
- Command-line interpreter: cmd arg1 arg2 ... argn
- I/O redirection: <, >
- Filters & pipes: ls | more
- Job control: cmd &
- simplified shell =>
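The process and shell primitives above (fork, exec, pipes, I/O redirection) combine as in the sketch below, which runs the equivalent of "ls | wc -l" roughly the way a simplified shell might. This is an illustrative example, not the course's shell assignment.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        int fd[2];
        if (pipe(fd) < 0) { perror("pipe"); exit(1); }  /* fd[0]: read end, fd[1]: write end */

        pid_t left = fork();
        if (left == 0) {                 /* child 1: "ls", stdout redirected into the pipe */
            dup2(fd[1], STDOUT_FILENO);
            close(fd[0]); close(fd[1]);
            execlp("ls", "ls", (char *)NULL);
            perror("exec ls"); _exit(127);
        }

        pid_t right = fork();
        if (right == 0) {                /* child 2: "wc -l", stdin redirected from the pipe */
            dup2(fd[0], STDIN_FILENO);
            close(fd[0]); close(fd[1]);
            execlp("wc", "wc", "-l", (char *)NULL);
            perror("exec wc"); _exit(127);
        }

        /* The parent (the "shell") closes its copies of the pipe ends and waits. */
        close(fd[0]); close(fd[1]);
        waitpid(left, NULL, 0);
        waitpid(right, NULL, 0);
        return 0;
    }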