SEM / YEAR: VI / III    CS2028 - UNIX INTERNALS
UNIT V - PROCESS SCHEDULING AND MEMORY MANAGEMENT POLICIES

PART A

1. Define swapping. MAY JUNE 2012
UNIX systems transferred entire processes between primary memory and the swap device, but did not transfer parts of a process independently, except for shared text. Such a memory management policy is called swapping.

2. What are the types of devices in the UNIX system?
Block devices
Raw or character devices

3. What are the three stages at which device configuration can be specified?
i. Administrators can hard-code configuration data into files that are compiled and linked when building the kernel.
ii. Administrators can supply configuration information after the system is already running.
iii. Self-identifying devices permit the kernel to recognize which devices are installed.

4. What operations are performed for device close?
a. Search the file table and make sure that no other processes still have the device open.
b. For a character device, the kernel invokes the device close procedure and returns to user mode.
c. For a block device, the kernel searches the mount table to make sure that the device does not contain a mounted file system.

5. What are the functions of a line discipline?
To parse input strings into lines
To process erase characters
To process a kill character that invalidates all characters typed so far on the current line
To echo (write) received characters to the terminal
To expand output such as tab characters
To generate signals to processes

6. What are the data structures for terminal drivers associated with clists?
i. A clist to store data for output to the terminal.
ii. A clist to store "raw" input data provided by the terminal interrupt handler as the user types it in.
iii. A clist to store "cooked" input data.

7. Define scheduling.
When more than one process is runnable, the operating system must decide which one to run first. The part of the operating system concerned with this decision is called the scheduler, and the algorithm it uses is called the scheduling algorithm.

8. Define demand paging.
Demand paging is an application of virtual memory. In a system that uses demand paging, the operating system copies a disk page into physical memory only if an attempt is made to access it.

9. Explain real-time processing.
In real-time processing there is a continual input, processing and output of data. Data has to be processed within a small stipulated time period (real time), otherwise it will create problems for the system. Examples: assembly-line robots and radar systems.

10. What is the use of the clock handler?
The clock handler runs with the processor execution level set high, preventing other events from happening while the handler is active.

11. What are the functions of the clock interrupt handler?
Restart the clock
Gather system and process accounting statistics
Keep track of time
Control process scheduling
Send signals to processes on request

12. What are the bit fields used to support demand paging?
Valid
Reference
Modify
Copy on write
Age

13. What are the types of page fault?
i. Validity fault
ii. Protection fault

14. What are the possible states of the page that causes a page fault? MAY JUNE 2013
i. The page is on a swap device and not in memory.
ii. The page is on the free page list in memory.
iii. The page is in an executable file.
iv. The page is marked "demand zero".
v. The page is marked "demand fill".

15. What are the types of devices?
Block devices
Character devices

16. Write the system call for terminal polling.
select(nfds, rfds, wfds, efds, timeout);
where nfds gives the number of file descriptors being selected, and rfds, wfds and efds point to bit masks of open file descriptors.

17. Write the syntax of the ptrace system call.
ptrace(cmd, pid, addr, data);
where cmd specifies various commands, pid is the process ID of the traced process, addr is a virtual address in the traced process, and data is an integer value to be written.

18. Write the syntax of the msgsnd system call.
msgsnd(msgqid, msg, count, flag);
where msgqid is the message queue descriptor, msg points to the message structure, count gives the size of the data array, and flag specifies the action to take.

19. What are the inputs for the msgsnd algorithm?
Message queue descriptor
Address of the message structure
Size of the message
Flags

20. Write the syntax of the shmget system call.
shmid = shmget(key, size, flag);
where size is the number of bytes in the region.

21. Write short notes on the page stealer process. MAY JUNE 2007
The page stealer is the kernel process that makes room for incoming pages by swapping out memory pages that are not part of the working set of a process. The kernel creates the page stealer at system initialization and invokes it throughout the lifetime of the system. The kernel locks a region when a process faults on a page in the region, so that the page stealer cannot steal the page being faulted in.

22. Discuss briefly the parameters related to process scheduling.
Linux scheduling is based on the time-sharing technique already introduced in the section "CPU's Time Sharing" in Chapter 5, Timing Measurements: several processes are allowed to run "concurrently", which means that the CPU time is roughly divided into "slices", one for each runnable process. Of course, a single processor can run only one process at any given instant. If a currently running process is not terminated when its time slice or quantum expires, a process switch may take place. Time-sharing relies on timer interrupts and is thus transparent to processes; no additional code needs to be inserted in the programs to ensure CPU time-sharing.

23. Give a brief note on ioctl. MAY JUNE 2012
The ioctl system call is a generalization of the terminal-specific stty (set terminal settings) and gtty (get terminal settings) system calls available in earlier versions of the UNIX system.
o It allows a process to set hardware options associated with a device and software options associated with the driver.
o Syntax: ioctl(fd, command, arg);

24. What is a map?
A map is an in-core table the kernel uses to keep track of free space on a resource such as the swap device. It is an array in which each entry consists of the address of an allocatable resource and the number of resource units available there; initially, a map contains one entry giving the address and the total number of resources.

PART B

1. Explain in detail about process scheduling. MAY JUNE 2011
- The kernel allocates the CPU to a process for a time quantum, preempts a process that exceeds its time quantum, and feeds it back into one of several priority queues.
- A process may need many iterations through the "feedback loop" before it finishes.
- When the kernel does a context switch and restores the context of a process, the process resumes execution from the point where it was suspended.
Algorithm:
- The kernel executes the algorithm to schedule a process, selecting the highest priority process from those in the states "ready to run and loaded in memory" and "preempted".
- If several processes tie for the highest priority, the kernel picks the one that has been "ready to run" for the longest time, following a round robin scheduling policy.
- If there are no processes eligible for execution, the processor idles until the next interrupt, which will happen in at most one clock tick; after handling the interrupt, the kernel again attempts to schedule a process to run.

Scheduling parameters:
- Each process table entry contains a priority field for process scheduling.
- The priority of a process in user mode is a function of its recent CPU usage, with processes getting a lower priority if they have recently used the CPU.
- The range of process priorities can be partitioned into two classes: user priorities and kernel priorities.
- Each class contains several priority values, and each priority has a queue of processes logically associated with it.
- Processes with user-level priorities were preempted on their return from the kernel to user mode, and processes with kernel-level priorities achieved them in the sleep algorithm.
- User-level priorities are below a threshold value, and kernel-level priorities are above the threshold value.
Kernel-level priorities are subdivided into two groups:
1. Processes with low kernel priority wake up on receipt of a signal.
2. Processes with high kernel priority continue to sleep.
(Figure: range of process priorities.)
The kernel calculates the priority of a process in specific process states:
- It assigns a priority to a process about to go to sleep, correlating a fixed priority value with the reason for sleeping.
- The priority does not depend on the run-time characteristics of the process but instead is a constant value.
- Processes that sleep in lower-level algorithms tend to cause more system bottlenecks the longer they are inactive.
For instance, a process sleeping and waiting for the completion of disk I/O has a higher priority than a process sleeping and waiting for a free buffer, for several reasons:
1. The process waiting for completion of disk I/O already has a buffer.
2. When it wakes up, there is a chance that it will do enough processing to release the buffer and possibly other resources.
3. The more resources it frees, the better for other processes waiting for them.
4. A process waiting for a free buffer may be waiting for the very buffer held by the process waiting for completion of I/O.
5. When the I/O completes, both processes wake up because they sleep on the same address.
The kernel adjusts the priority of a process that returns from kernel mode to user mode. The process may have previously entered the sleep state, changing its priority to a kernel-level priority that must be lowered to a user-level priority when returning to user mode.
Clock:
- The clock handler adjusts the priorities of all processes in user mode at one-second intervals and causes the kernel to go through the scheduling algorithm to prevent a process from monopolizing the CPU.
- The clock may interrupt a process several times during its time quantum; at every clock interrupt, the clock handler increments a field in the process table that records the recent CPU usage of the process.
- Once a second, the clock handler also adjusts the recent CPU usage of each process according to a decay function; on System V,
decay(CPU) = CPU/2
When it recomputes recent CPU usage, the clock handler also recalculates the priority of every process according to the formula
priority = ("recent CPU usage"/2) + (base level user priority)
where "base level user priority" is the threshold priority between kernel and user mode.
- A numerically low value implies a high scheduling priority.
- The slower the decay rate for recent CPU usage, the longer it will take for the priority of a process to reach its base level, and processes in the "ready to run" state will tend to occupy more priority levels.
In a real system, all processes with user-level priorities in the figure would change priority queues, but only one has been depicted. The kernel does not change the priority of processes in kernel mode, nor does it allow processes with user-level priority to cross the threshold and attain kernel-level priority unless they make a system call and go to sleep.
- The kernel attempts to recompute the priority of all active processes once a second, but the interval can vary slightly.
- If the clock interrupt came while the kernel was executing a critical region of code, the kernel does not recompute priorities, since that would keep the kernel in the critical region for too long.
- Instead, the kernel remembers that it should have recomputed process priorities and does so at a succeeding clock interrupt, when the "previous" processor execution level is sufficiently low.
- Periodic recalculation of process priority assures a round robin scheduling policy for processes executing in user mode.
- The kernel responds naturally to interactive requests such as those from text editors: such processes have a high idle-time to CPU-usage ratio, and consequently their priority value naturally rises when they are ready for execution.

Example of process scheduling:
Consider three processes A, B and C under the following assumptions: they are created simultaneously with initial priority 60, the highest user-level priority is 60, the clock interrupts the system 60 times a second, the processes make no system calls, and no other processes are ready to run.
The kernel calculates the decay of CPU usage as
CPU = decay(CPU) = CPU/2
and the process priority as
priority = (CPU/2) + 60
- Assuming process A is the first to run and that it starts running at the beginning of a time quantum, it runs for 1 second.
- During that time the clock interrupts the system 60 times, and the interrupt handler increments the CPU usage field of process A 60 times.
- The kernel forces a context switch at the 1-second mark and, after recalculating the priorities of all processes, schedules process B for execution.
- The clock handler increments the CPU usage field of process B 60 times during the next second, then recalculates the CPU usage and priority of all processes and forces a context switch.
- The pattern repeats, with the processes taking turns to execute.
Now consider the same processes and assume that other processes are also in the system.
- The kernel may preempt process A, leaving it in the state "ready to run", after it had received several time quanta in succession on the CPU, and its user-level priority may therefore be low.
- As time progresses, process B may enter the "ready to run" state, and its user-level priority may be higher than that of process A at that instant.
- If the kernel does not schedule either process for a while, both processes could eventually be at the same user priority level, although process B would probably enter that level first, since its starting level was originally closer.
- Nevertheless, the kernel would choose to schedule process A ahead of process B, because it has been in the state "ready to run" for a longer time: this is the tie-breaker rule for processes with equal priority.
The kernel schedules a process at the conclusion of a context switch: a process must do a context switch when it goes to sleep or exits, and it has the opportunity to do a context switch when returning from kernel mode to user mode. The kernel preempts a process about to return to user mode if a process with higher priority is ready to run. Such a process exists if the kernel awakened a process with higher priority than the currently running process. In the first case, the current process should not run in user mode given that a higher priority kernel mode process is available. In the second case, the clock handler decides that the process has used up its time quantum, and since many processes had their priorities changed, the kernel does a context switch to reschedule.

Controlling process priorities:
Processes can exercise crude control over their scheduling priority by using the nice system call:
nice(value);
where value is added into the calculation of process priority:
priority = ("recent CPU usage"/constant) + (base priority) + (nice value)
- The nice system call increments or decrements the nice field in the process table by the value of the parameter, although only the superuser can supply nice values that increase the process priority.
- Similarly, only the superuser can supply a nice value below a particular threshold.
- Users who invoke the nice system call to lower their process priority when executing computation-intensive jobs are "nice" to other users on the system.
- Processes inherit the nice value of their parent during the fork system call.
- The nice system call works for the running process only; a process cannot reset the nice value of another process.
- Consequently, if a system administrator wishes to lower the priority of a process, there is no way to do so other than killing it outright.

Fair Share Scheduler (discuss the fair share scheduler): NOV DEC 2010
- The scheduler algorithm described above does not differentiate between classes of users; for example, it is impossible to allocate half of the CPU time to a particular set of processes, if desired.
- The principle of the fair share scheduler is to divide the user community into a set of fair share groups, such that the members of each group are subject to the constraints of the regular process scheduler relative to the other processes in the group.
- However, the system allocates its CPU time proportionally to each group, regardless of how many processes are in the group.
- For example, suppose there are four groups in the system, each with an allocated CPU share of 25%, and that the groups contain 1, 2, 3 and 4 CPU-bound processes, respectively, that never willingly give up the processor.
- If there were no other processes on the system, each of the ten processes would get 10% of the CPU time using the regular scheduling algorithm, because there is no way to distinguish them from each other.
- But using the fair share scheduler, the process in group 1 will receive twice as much CPU time as each process in group 2, three times as much CPU time as each process in group 3, and four times as much CPU time as each process in group 4.
- In this example, the CPU times of all processes in a group should be equal over time, because they are all in an infinite loop.

Fair share group priority:
- Each process has a new field in its u area that points to a fair share group CPU usage field, shared by all processes in the fair share group.
- The clock interrupt handler increments the fair share group CPU usage field for the running process, just as it increments the CPU usage field of the running process, and decays the values of all fair share group CPU usage fields once a second.
- When calculating process priorities, a new component of the calculation is the group CPU usage, normalized according to the amount of CPU time allocated to the fair share group.
- The more CPU time processes in a group received recently, the higher the numerical value of the group CPU usage field and, therefore, the lower the priority of all processes in the fair share group.
For example, consider the three processes depicted in Fig 8.6 and suppose that process A is in one group and processes B and C are in another.
- Assuming the kernel schedules process A first, it will increment the CPU and group usage fields for process A over the next second.
- On recomputation of process priorities at the 1-second mark, processes B and C have the highest priority; assume the kernel schedules process B.
- During the next second, the CPU usage field of process B goes up to 60, as does the group usage field for processes B and C.
- Hence, on recomputation of process priorities at the 2-second mark, process C will have priority 75 (compare to Fig 8.9), and the kernel will schedule process A, with priority 74.
- The figure shows how the pattern repeats: the kernel schedules the processes in the order A, B, A, C, A, B, and so on.

1.6 Real-time processing:
Real-time processing implies the capability to provide immediate response to specific external events and, hence, to schedule particular processes to run within a specified time limit after the occurrence of an event. Another impediment to the support of real-time processing is that the kernel is non-preemptive; the kernel cannot schedule a real-time process in user mode if it is currently executing another process in kernel mode, unless major changes are made.

2. Explain in detail about memory management policies and swapping. APRIL MAY 2004
OR
How much memory?
Unlike traditional PC operating systems, UNIX-related systems use very sophisticated memory management algorithms to make efficient use of memory resources. This makes the questions "How much memory do I have?" and "How much memory is being used?" rather complicated to answer. First you must realize that there are three different kinds of memory, three different ways they can be used by the operating system, and three different ways they can be used by processes.
The memory management subsystem decides which processes should reside in main memory, and manages the parts of the virtual address space of a process that are not core resident. It monitors the amount of available primary memory and may periodically write processes to a secondary memory device, called the swap device, to provide more space in primary memory.
At a later time, the kernel reads the data from the swap device back into main memory.

Kinds of memory:
Main - The physical random access memory located on the CPU motherboard that most people think of when they talk about RAM. Also called real memory. This does not include processor caches, video memory, or other peripheral memory.
File system - Disk memory accessible via pathnames. This does not include raw devices, tape drives, swap space, or other storage not addressable via normal pathnames. It does include all network file systems.
Swap space - Disk memory used to hold data that is not in real or file system memory. Swap space is most efficient when it is on a separate disk or partition, but sometimes it is just a large file in the file system.

OS memory uses:
Kernel - The operating system's own (semi-)private memory space. This is always in main memory.
Cache - Main memory that is used to hold elements of the file system and other I/O operations. Not to be confused with the CPU cache or disk drive cache, which are not part of main memory.
Virtual - The total addressable memory space of all processes running on the given machine. The physical location of such data may be spread among any of the three kinds of memory.

Process memory uses:
Data - Memory allocated and used by the program (usually via malloc, new, or similar runtime calls).
Stack - The program's execution stack (managed by the OS).
Mapped - File contents addressable within the process memory space.
The amount of memory available for processes is at least the size of Swap minus Kernel. On more modern systems (since around 1994) it is at least Main plus Swap minus Kernel, and may also include any files via mapping.

Swapping:
3. Explain about swapping in detail. NOV DEC 2010, NOV DEC 2011
The process of moving some pages out of main memory and moving others in is called swapping. There are three parts to the description of the swapping algorithm:
Managing space on the swap device
Swapping processes out of main memory
Swapping processes into main memory

2.1 Allocation of Swap Space:
The swap device is a block device in a configurable section of a disk. Whereas the kernel allocates space for files one block at a time, it allocates space on the swap device in groups of contiguous blocks. A process that resides on the swap device will eventually migrate back to main memory, freeing the space it had occupied on the swap device. Since speed is critical and the system can do I/O faster in one multiblock operation than in several single-block operations, the kernel allocates contiguous space on the swap device without regard for fragmentation.
The kernel maintains the free space for the file system in a linked list of free blocks, accessible from the file system super block, but it maintains the free space for the swap device in an in-core table called a map. A map is an array where each entry consists of an address of an allocatable resource and the number of resource units available there. Initially, a map contains one entry that indicates the address and the total number of resources. For instance, the kernel treats each unit of the swap map as a group of disk blocks, and it treats the address as a block offset from the beginning of the swap area. Fig 1 illustrates an initial swap map that consists of 10,000 blocks starting at address 1.
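As an illustration only (not kernel source), the map just described and the first-fit allocation the kernel performs on it can be sketched in C roughly as follows; the structure and function names are assumptions made for this sketch, and the three allocations reproduce the 100-, 50- and 100-unit sequence discussed next.

#include <stdio.h>

/* One map entry: starting address of a free extent and the number of
 * free units (swap blocks) available there. A zero unit count marks
 * the end of the map. */
struct map {
    unsigned int addr;
    unsigned int units;
};

/* First-fit allocation from a map, in the spirit of the kernel's
 * malloc-for-maps algorithm: find the first entry large enough,
 * return its address, and shrink (or consume) the entry. */
int map_malloc(struct map *mp, unsigned int request)
{
    struct map *bp;

    for (bp = mp; bp->units != 0; bp++) {
        if (bp->units >= request) {
            unsigned int addr = bp->addr;
            bp->addr += request;
            bp->units -= request;
            if (bp->units == 0) {
                /* Entry exhausted: slide the rest of the map down. */
                do {
                    bp[0] = bp[1];
                    bp++;
                } while (bp[-1].units != 0);
            }
            return (int)addr;
        }
    }
    return -1;                      /* no extent large enough */
}

int main(void)
{
    /* Initial map: 10,000 free blocks starting at address 1 (Fig 1). */
    struct map swapmap[16] = { { 1, 10000 }, { 0, 0 } };

    printf("%d\n", map_malloc(swapmap, 100));   /* prints 1   */
    printf("%d\n", map_malloc(swapmap, 50));    /* prints 101 */
    printf("%d\n", map_malloc(swapmap, 100));   /* prints 151 */
    return 0;
}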
Fig 2 gives the algorithm malloc for allocating space from maps. The kernel searches the map for the first entry that contains enough space to accommodate the request. Fig 3 shows the sequence of swap map configurations after allocating 100 units, 50 units, then 100 units again.
When freeing resources, the kernel finds their proper position in the map by address. Three cases are possible:
1. The freed resources completely fill a hole in the map: they are contiguous to the entries whose addresses would immediately precede and follow them in the map.
2. The freed resources partially fill a hole in the map: the freed resources are contiguous with the map entry that would immediately precede them or with the entry that would immediately follow them.
3. The freed resources partially fill a hole but are not contiguous to any resources in the map.
If the kernel frees 50 units of the swap resource starting at address 101, the swap map contains a new entry for the freed resources, since the returned resources are not contiguous to existing entries in the map. If the kernel then frees 100 units of the swap resource starting at address 1, it adjusts the first entry of the swap map, since the freed resources are contiguous to those in the first entry. Fig 4 shows the sequence of swap map configurations.
Suppose the kernel now requests 200 units of swap space. Because the first entry in the swap map contains only 150 units, the kernel satisfies the request from the second entry (Fig 5).

2.2 Swapping Processes Out:
The kernel swaps a process out if it needs space in memory, which may result from any of the following:
1. The fork system call must allocate space for a child process.
2. The brk system call increases the size of a process.
3. A process becomes larger as a result of natural growth of its stack.
4. The kernel wants to free space in memory for processes it had previously swapped out and should now swap in.
When the kernel decides that a process is eligible for swapping from main memory, it decrements the reference count of each region in the process and swaps a region out if its reference count drops to 0. The kernel swaps as much data as possible per I/O operation directly between the swap device and the user address space, bypassing the buffer cache. If the hardware cannot transfer multiple pages in one operation, the kernel software must iteratively transfer one page of memory at a time.
Fig 6 gives an example of mapping the in-core image of a process onto a swap device. The process contains three regions of text, data, and stack: the text region ends at virtual address 2K, and the data region starts at virtual address 64K, leaving a gap of 62K bytes in the virtual address space. When the kernel swaps the process out, it swaps the pages for virtual addresses 0, 1K, 64K, 65K, 66K, and 128K. It does not allocate swap space for the empty 62K bytes between the text and data regions or the empty 61K bytes between the data and stack regions but fills the swap space contiguously. Figs 6 and 7 show that the physical addresses occupied by the process before and after the swap are not the same.

2.2.1 Fork swap:
If there is not enough memory to create the child process during fork, the kernel swaps the child out. When the swap is complete, the child process exists on the swap device. The parent process places the child in the "ready-to-run" state and returns to user mode. Since the child is in the "ready-to-run" state, the swapper will eventually swap it into memory, where the kernel will schedule it. The child will then complete its part of the fork system call and return to user mode.
2.2.2 Expansion swap:
If a process requires more physical memory than is currently allocated to it, either as a result of user stack growth or invocation of the brk system call, and if it needs more memory than is currently available, the kernel does an expansion swap of the process. It reserves enough space on the swap device to contain the memory space of the process, including the newly requested space. Then it adjusts the address translation mapping of the process to account for the new virtual memory but does not assign physical memory. Finally, it swaps the process out in a normal swapping operation, zeroing out the newly allocated space on the swap device (Fig 8).

2.3 Swapping Processes In:
Process 0, the swapper, is the only process that swaps processes into memory from swap devices. At the conclusion of system initialization, the swapper goes into an infinite loop, where its only task is to do process swapping. It attempts to swap processes in from the swap device, and it swaps processes out if it needs space in main memory. The swapper sleeps if there is no work for it to do or if it is unable to do any work.
When the swapper wakes up to swap processes in, it examines all processes that are in the state "ready to run but swapped out" and selects the one that has been swapped out the longest. If there is enough free memory available, the swapper swaps the process in, reversing the operation done for swapping out: it allocates physical memory, reads the process from the swap device, and frees the swap space.
If the swapper successfully swaps in a process, it searches the set of "ready to run but swapped out" processes for others to swap in and repeats the above procedure. One of the following situations eventually arises:
No "ready to run" processes exist on the swap device: the swapper goes to sleep until a process on the swap device wakes up or until the kernel swaps out a process that is "ready to run".
The swapper finds an eligible process to swap in but the system does not contain enough memory: the swapper attempts to swap another process out and, if successful, restarts the swapping algorithm, searching for a process to swap in.
The kernel swaps out sleeping processes rather than those "ready to run", because "ready to run" processes have a greater chance of being scheduled soon. The choice of which sleeping process to swap out is a function of the process priority and the time the process has been in memory. If there are no sleeping processes in memory, the choice of which "ready to run" process to swap out is a function of the process nice value and the time the process has been in memory.
A "ready to run" process must be core resident for at least 2 seconds before being swapped out, and a process to be swapped in must have been swapped out for at least 2 seconds.
If the swapper cannot find any processes to swap out, or if neither the process to be swapped in nor the process to be swapped out has accumulated more than 2 seconds of residence time in its environment, then the swapper sleeps on the event that it wants to swap a process into memory but cannot find room for it. The clock will awaken the swapper once a second in that state.
Fig 10 depicts five processes and the time they spend in memory or on the swap device as they go through a sequence of swapping operations. For simplicity, assume that all processes are CPU intensive and that they do not make any system calls.
Hence, a context switch happens only as a result of clock interrupts at 1-second intervals. The swapper runs at a high scheduling priority, so it always runs briefly at 1-second intervals if it has work to do. Further, assume that the processes are the same size and that the system can contain at most two processes simultaneously in main memory.
Initially, processes A and B are in main memory and the other processes are swapped out. The swapper cannot swap any processes during the first 2 seconds, because none have been in memory or on the swap device for 2 seconds, but at the 2-second mark it swaps out processes A and B and swaps in C and D. At the 3-second mark, process E is eligible for swapping, but the swapper cannot swap processes out of main memory because their residency time is under 2 seconds. At the 4-second mark, the swapper swaps out processes C and D and swaps in E and A.
The algorithm for choosing a process to swap out to make room in memory has more serious flaws.
First, the swapper swaps out a process based on its priority, memory-residence time, and nice value. Although it swaps out a process only to make room for a process being swapped in, it may swap out a process that does not provide enough memory for the incoming process.
Second, if the swapper sleeps because it could not find enough memory to swap in a process, it searches again for a process to swap in, although it had previously chosen one.
Third, if the swapper chooses a "ready to run" process to swap out, it is possible that the process had never executed since it was previously swapped in. Fig 11 depicts such a case, where the kernel swaps in process D at the 2-second mark, schedules process C, and then swaps out process D at the 3-second mark in favor of process E, even though process D had never run. Such thrashing is clearly undesirable.
Finally, if the swapper attempts to swap out a process but cannot find space on the swap device, a system deadlock could arise if the following conditions are met:
All processes in main memory are asleep
All "ready to run" processes are swapped out
There is no room on the swap device for new processes
There is no room in main memory for incoming processes

4. Explain demand paging in detail. NOV DEC 2010, NOV DEC 2011, MAY JUNE 2011
Demand paging:
Machines whose memory architecture is based on pages and whose CPU has restartable instructions can support a kernel that implements a demand paging algorithm. Demand paging frees processes from the size limitations otherwise imposed by the amount of physical memory available on a machine. Since a process may not fit into physical memory, the kernel must load its relevant portions into memory dynamically and execute it even though other parts are not loaded. Demand paging is transparent to user programs except for the virtual size permissible to the process.
Processes tend to execute instructions in small portions of their text space, such as program loops and frequently called subroutines, and their data references tend to cluster in a small subset of the total data space of the process. This is known as the principle of "locality". Denning formalized the notion of the working set of a process, which is the set of pages that the process has referenced in its last n memory references; the number n is called the window of the working set.
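As a small illustration of the working-set idea (this sketch is not from the text; the reference string and window size are made up for the example), the following C program computes the working set after each reference for a window of the last n references.

#include <stdio.h>

#define NPAGES 8   /* illustrative: virtual pages numbered 0..7 */

/* Print the working set after each reference: the set of distinct
 * pages touched in the last 'window' references. */
static void print_working_set(const int *refs, int nrefs, int window)
{
    for (int i = 0; i < nrefs; i++) {
        int inset[NPAGES] = { 0 };
        int start = (i + 1 >= window) ? i + 1 - window : 0;

        for (int j = start; j <= i; j++)
            inset[refs[j]] = 1;

        printf("after ref %d (page %d): working set = {", i, refs[i]);
        for (int p = 0; p < NPAGES; p++)
            if (inset[p])
                printf(" %d", p);
        printf(" }\n");
    }
}

int main(void)
{
    /* A reference string with the locality typical of program loops. */
    int refs[] = { 0, 1, 2, 0, 1, 2, 5, 5, 5, 0, 1 };
    print_working_set(refs, sizeof(refs) / sizeof(refs[0]), 4);
    return 0;
}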
When a process addresses a page that is not in its working set, it incurs a validity page fault; in handling the fault, the kernel updates the working set, reading in pages from a secondary device if necessary. The kernel suspends execution of the process until it reads the page into memory and makes it accessible to the process. When the page is loaded in memory, the process restarts the instruction it was executing when it incurred the fault. Thus, the implementation of a paging subsystem has two parts: swapping rarely used pages to a swapping device, and handling page faults.

4.1 Data structures for demand paging:
The kernel contains four major data structures to support low-level memory management functions and demand paging: page table entries, disk block descriptors, the page frame data table (called pfdata for short), and the swap-use table. The kernel allocates space for the pfdata table once for the lifetime of the system but allocates memory pages for the other structures dynamically.
Page table entries contain the following bit fields to support demand paging:
Valid
Reference
Modify
Copy on write
Age
The kernel turns on the valid bit to indicate that the contents of a page are legal, but a page reference is not necessarily illegal if the valid bit is off, as will be seen. The reference bit indicates whether a process recently referenced a page, and the modify bit indicates whether a process recently modified the contents of a page. The copy-on-write bit, used in the fork system call, indicates that the kernel must create a new copy of the page when a process modifies its contents. Finally, the kernel manipulates the age bits to indicate how long a page has been a member of the working set of a process.
The pfdata table describes each page of physical memory and is indexed by page number. The fields of an entry are:
The page state, indicating that the page is on a swap device or in an executable file, that DMA is currently under way for the page, or that the page can be reassigned.
The number of processes that reference the page. The reference count equals the number of valid page table entries that reference the page. It may differ from the number of processes that share regions containing the page, as will be described below when reconsidering the algorithm for fork.
The logical device and block number that contains a copy of the page.
Pointers to other pfdata table entries on a list of free pages and on a hash queue of pages.

4.1.1 Fork in a Paging System:
The kernel duplicates every region of the parent process during the fork system call and attaches it to the child process. Traditionally, the kernel of a swapping system makes a physical copy of the parent's address space, usually a wasteful operation, because processes often call exec soon after the fork call and immediately free the memory just copied.
On the System V paging system, the kernel avoids copying the page by manipulating the region tables, page table entries, and pfdata table entries: it simply increments the region reference count of shared regions. The page can now be referenced through both regions, which share the page until a process writes to it.
The kernel then copies the page so that each region has a private version. To do this, the kernel turns on the "copy on write" bit for every page table entry in the private regions of the parent and child processes during fork. Fig 9.15 shows the data structures when a process forks. The processes share access to the page table of the shared text region T, so the region reference count is 2 and the pfdata reference count for pages in the text region is 1.
The kernel allocates a new child data region, C1, a copy of region P1 in the parent process. The page table entries of the two regions are identical, as illustrated by the entry for virtual address 97K. The page table entries point to pfdata table entry 613, whose reference count is 2, indicating that two regions reference the page.
The implementation of the fork system call in the BSD system makes a physical copy of the pages of the parent process. Recognizing the performance improvement gained by not having to do the copy, however, the BSD system also contains the vfork system call, which assumes that a child process will immediately invoke exec on return from the vfork call. Vfork does not copy page tables, so it is faster than the System V fork implementation.
Fig 9.16 Vfork and corruption of process memory
For example, consider the program in Fig 9.16. After the vfork call, the child process does not exec, but resets the variables global and local and exits. The system guarantees that the parent process is suspended until the child process calls exec or exits; because the child runs in the parent's address space, its changes to the variables are visible to the parent when it resumes, corrupting the parent's data.

4.1.2 Exec in a Paging System:
When a process invokes the exec system call, the kernel reads the executable file into memory from the file system. On a demand paged system, however, the executable file may be too large to fit in the available main memory. The kernel, therefore, does not preassign memory to the executable but "faults" it in, assigning memory as needed. It first assigns the page tables and disk block descriptors for the executable file, marking the page table entries "demand fill" or "demand zero".
Following a variant of the read algorithm for reading the file into memory, the process incurs a validity fault as it reads each page. However, use of the standard algorithms to access a file would make it expensive to demand page from indirect blocks, because of the multiple buffer cache accesses necessary to read a block. Furthermore, consistency problems could arise because bmap is not reentrant. The kernel sets various I/O parameters in the u area during the read system call; if the process incurred a page fault during a read system call when attempting to copy data to user space, it would overwrite these fields in the u area to read the page from the file system.

4.2 Write short notes on page faults. MAY JUNE 2012
The Page-Stealer Process:
The page stealer is a kernel process that swaps out memory pages that are no longer part of the working set of a process. The kernel creates the page stealer during system initialization and invokes it throughout the lifetime of the system, whenever it is low on free pages. There are two paging states for a page in memory: the page is aging and is not yet eligible for swapping, or the page is eligible for swapping and is available for reassignment to other virtual pages.
The first state indicates that a process recently accessed the page, and the page is therefore in its working set. Some machines set a reference bit when they reference a page, but software methods can be substituted if the hardware does not have this feature.
The page starts out in main memory, and the figure shows the number of examinations by the page stealer between memory references. A process referenced the page after the second examination, dropping its age to 0. Similarly, a process referenced the page again after one more examination, again dropping its age to 0. Finally, the page stealer examined the page three times without an intervening reference and swapped the page out.
If two or more processes share a region, they update the reference bits of the same set of page table entries.
The kernel wakes up the page stealer when the available free memory in the system is below a low-water mark, and the page stealer swaps out pages until the available free memory in the system exceeds a high-water mark. By swapping out pages until the number of free pages exceeds the high-water mark, it takes longer until the number of free pages drops below the low-water mark again, so the page stealer does not run as often. Administrators can configure the values of the high- and low-water marks for best performance.
Fig 9.19 Example of aging a page
When the page stealer decides to swap a page out, it considers whether a copy of the page is already on a swap device. Three cases are possible:
1. If no copy of the page is on a swap device, the kernel "schedules" the page for swapping: the page stealer places the page on a list of pages to be swapped out and continues; the swap is logically complete. When the list of pages to be swapped reaches a limit (dependent on the capabilities of the disk controller), the kernel writes the pages to the swap device.
2. If a copy of the page is already on a swap device and no process had modified its in-core contents (the page table entry modify bit is clear), the kernel clears the page table entry valid bit, decrements the reference count in the pfdata table entry, and puts the entry on the free list for future allocation.
3. If a copy of the page is on a swap device but a process had modified its contents in memory, the kernel schedules the page for swapping, as above, and frees the space the page currently occupies on the swap device.
The page stealer copies the page to a swap device if case 1 or case 3 is true.
To illustrate the difference between the last two cases, suppose a page is on a swap device and is swapped into main memory after a process incurs a validity fault, and assume the kernel does not automatically remove the disk copy. Eventually, the page stealer decides to swap the page out again. If no process has written the page since it was swapped in, the memory copy is identical to the disk copy and there is no need to write the page to the swap device. If a process has written the page, however, the memory copy differs from the disk copy, so the kernel must write the page to the swap device, after freeing the space on the swap device previously occupied by the page. It does not reuse the space on the swap device immediately, so that it can keep swap space contiguous for better performance.
For example, suppose the page stealer swaps out 30, 40, 50 and 20 pages from processes A, B, C and D, respectively, and that it writes 64 pages to the swap device in one disk write operation.
The figure shows the sequence of page swapping operations that would occur if the page stealer examines the pages of the processes in the order A, B, C and D. The page stealer allocates space for 64 pages on the swap device and swaps out the 30 pages of process A and 34 pages of process B. It then allocates more space on the swap device for another 64 pages and swaps out the remaining 6 pages of process B, the 50 pages of process C, and 8 pages of process D. The two areas of the swap device used for the two write operations need not be contiguous.

4.3 Write short notes on page faults. MAY JUNE 2012, MAY JUNE 2013
Page Faults:
The system can incur two types of page faults: validity faults and protection faults. Because the fault handlers may have to read a page from disk to memory and sleep during the I/O operation, fault handlers are an exception to the general rule that interrupt handlers cannot sleep. However, because the fault handler sleeps in the context of the process that caused the memory fault, the fault relates to the running process; hence, no arbitrary processes are put to sleep.

4.3.1 Validity Fault Handler:
If a process attempts to access a page whose valid bit is not set, it incurs a validity fault and the kernel invokes the validity fault handler. The valid bit is not set for pages outside the virtual address space of a process, nor is it set for pages that are part of the virtual address space but do not currently have a physical page assigned to them. The hardware supplies the kernel with the virtual address that was accessed to cause the memory fault, and the kernel finds the page table entry and disk block descriptor for the page.
Fig 9.21 Algorithm for validity fault handler
The page that caused the fault is in one of five states:
1. On a swap device and not in memory,
2. On the free page list in memory,
3. In an executable file,
4. Marked "demand zero",
5. Marked "demand fill".
For example, consider the page table entry for virtual address 66K. If a process incurs a validity fault when accessing the page, the fault handler examines the disk block descriptor and sees that the page is contained in block 847 of the swap device; hence, the virtual address is legal. The fault handler then searches the page cache but fails to find an entry for disk block 847. Therefore, there is no copy of the virtual page in memory, and the fault handler must read it from the swap device.
The kernel does not always have to do an I/O operation when it incurs a validity fault, even though the disk block descriptor indicates that the page is swapped. It is possible that the kernel had never reassigned the physical page after swapping it out, or that another process had faulted the virtual page into another physical page. Similarly, the fault handler does not have to read the page into memory if another process had faulted on the same page but had not completely read it in yet; in that case, the fault handler finds the region containing the page table entry locked by another instance of the fault handler.
If a process incurs a page fault for a page marked "demand fill" or "demand zero", the kernel allocates a free page in memory and updates the appropriate page table entry. For "demand zero", it also clears the page to zero. The validity fault handler concludes by setting the valid bit of the page and clearing the modify bit. Finally, if returning to user mode, it checks for receipt of any signals that occurred while handling the page fault.
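To make the page-table bookkeeping above concrete, here is a hedged C sketch of a software page table entry carrying the bit fields described in this section; the field widths, names and the touch() helper are illustrative assumptions, not the real System V structures.

#include <stdio.h>

/* Illustrative page table entry carrying the bit fields described
 * above plus a page frame number. Real kernels pack these bits in
 * hardware-defined formats. */
struct pte {
    unsigned int valid   : 1;   /* contents of the page are legal          */
    unsigned int ref     : 1;   /* page recently referenced                */
    unsigned int modify  : 1;   /* page recently written                   */
    unsigned int cow     : 1;   /* copy the page on the next write (fork)  */
    unsigned int age     : 3;   /* examinations since the last reference   */
    unsigned int pfn     : 20;  /* physical page frame number              */
};

/* Sketch of the check a validity fault handler would start from. */
static void touch(struct pte *p, unsigned long vaddr)
{
    if (!p->valid) {
        /* Validity fault: the page is on swap, on the free list, in an
         * executable file, marked demand zero, or marked demand fill. */
        printf("validity fault at 0x%lx\n", vaddr);
        return;
    }
    p->ref = 1;                 /* hardware or software reference bit */
}

int main(void)
{
    struct pte entry = { 0 };   /* valid bit clear: first access faults */
    touch(&entry, 0x10800UL);   /* 66K, the address used in the example */
    return 0;
}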
5. The I/O Subsystem
o The I/O subsystem allows a process to communicate with peripheral devices such as disks, tape drives, terminals, printers and networks; the kernel modules that control devices are known as device drivers. A system may contain one disk driver to control all disk drives, one terminal driver to control all terminals, and one tape driver to control all tape drives.
o This section examines the interface between processes and the I/O subsystem and between the machine and the device drivers, with disk drivers and terminal drivers as detailed examples of the general interface.
o It concludes with a description of a new method for implementing device drivers, called streams.

6. Discuss in detail about driver interfaces. NOV DEC 2010, NOV DEC 2011
Driver interfaces:
o The UNIX system contains two types of devices:
(i) block devices, which look like random-access storage devices to the rest of the system (disks, tapes);
(ii) raw or character devices, which include all other devices (terminals, network media).
o Block devices may also have a character device interface.
o Every device has a name that looks like a file name and is accessed like a file.
o A device file is distinguished from other files by the file type stored in its inode: either "block special" or "character special".
o If a device has both a block and a character interface, it is represented by two device files:
(i) a block device special file;
(ii) a character device special file.
o The ioctl system call provides an interface that allows processes to control character devices, but it is not applicable to regular files.

6.1 System configuration:
o System configuration is the procedure by which administrators specify parameters that are installation dependent.
o The parameters specify:
(i) the sizes of kernel tables (the process table, inode table and file table, and the number of buffers allocated to the buffer pool);
(ii) the device configuration;
(iii) which devices are included and their addresses.
o There are three stages at which the device configuration can be specified:
(i) Administrators can hard-code configuration data into files that are compiled and linked when building the kernel. The configuration data is written in a simple format and converted into a file suitable for compilation.
(ii) Administrators can supply configuration information after the system is already running; the kernel updates internal configuration tables dynamically.
(iii) Self-identifying devices permit the kernel to recognize which devices are installed.
o The kernel-to-device interface is described by the block device switch table and the character device switch table.
o Each device type has entries in the table that direct the kernel to the appropriate driver interface for the system calls.
o The open and close system calls on a device file funnel through the two device switch tables, based on the file type.
o The mount and umount system calls invoke the device open and close procedures for block devices.
o Read and write system calls on block devices invoke the buffer cache algorithms, which invoke the device strategy procedure.
o When a device interrupt occurs, the system identifies the interrupting device and calls the appropriate interrupt handler.
o Software devices, such as the kernel profiler driver, are handled by a "software interrupt handler".
o The mknod command creates device files, supplying the file type and the major and minor numbers.
mknod /dev/tty13 c 2 13
where
"/dev/tty13" is the file name of the device,
c means a character special file (b would mean a block special file),
2 is the major number, indicating the device type, which corresponds to the appropriate entry in the character device switch table,
13 is the minor number, indicating the unit of the device.
o If a process opens "/dev/dsk1", whose major number is 0, the kernel calls the routine gdopen.
o If a process reads the character special file "/dev/mem", whose major number is 3, the kernel calls the routine mmread.
o nulldev is an "empty" routine.

6.2 System calls and the driver interface:

entry   open      close      strategy
  0     gdopen    gdclose    gdstrategy
  1     gdopen    gdclose    gdstrategy
(fig: block device switch table)

entry   open      close      read      write     ioctl
  0     conopen   conclose   conread   conwrite  conioctl
  1     dzopen    dzclose    dzread    dzwrite   dzioctl
  2     syopen    syclose    syread    sywrite   syioctl
  3     nulldev   nulldev    mmread    mmwrite   nodev
  4     gdopen    gdclose    gdread    gdwrite   nodev
  5     gtopen    gtclose    gtread    gtwrite   nodev
(fig: character device switch table)

o For system calls that use file descriptors, the kernel follows pointers from the user file descriptor to the kernel file table entry and inode, where it examines the file type and accesses the block or character device switch table, as appropriate.
o It extracts the major and minor numbers from the inode, using the major number as an index into the appropriate switch table and passing the minor number as a parameter to the driver routine.
o If the inode were left locked while the driver runs, other processes that access the inode would sleep indefinitely, because the driver procedure may sleep.

6.2.1 Open:
o The kernel follows the same procedure for opening a device as it does for opening regular files, allocating an in-core inode, incrementing its reference count, and assigning a file table entry and user file descriptor.
o The kernel returns the user file descriptor to the calling process, so that opening a device looks like opening a regular file.
o If a device is both a block and a character device, the kernel invokes the appropriate open procedure depending on the particular device file the user opened.
o The device-specific open procedure establishes a connection between the calling process and the opened device and initializes private driver data structures.
o A process may specify various options to qualify the device open. The most common option is "no delay", meaning that the process will not sleep during the open procedure if the device is not ready.
o The device driver can count how many times a device was opened and fail the open call if the count is inappropriate.

6.2.2 Close:
o A process severs its connection to an open device by closing it.
o The kernel invokes the device close procedure only when no other processes have the device open, because the close procedure may terminate hardware connections; closing must therefore wait until no processes are accessing the device.
o The device could be rendered useless if the combination of open and close calls left it in an unrecognized state.
1) The kernel searches the file table to make sure that no other processes still have the device open. If processes had opened the device via two different device files, they would access different inodes but the same device, so the check cannot rely on the inode alone.
2) For a character device, the kernel invokes the device close procedure and returns to user mode. For a block device, the kernel searches the mount table to make sure that the device does not contain a mounted file system.
3) The kernel releases the inode of the device file.

6.2.3 Read and Write:
o If the process is reading or writing a character device, the kernel invokes the device driver read or write procedure.
o Although the kernel can transmit data directly between the user address space and the device, device drivers may buffer data internally.
o If a process generates data faster than the device can output it, the write procedure puts the process to sleep until the device can accept more data.
o Some machines provide memory-mapped I/O: certain addresses in the kernel address space are not physical memory locations but are special registers that control particular devices.
o Other machines have programmed I/O: the machine contains instructions to control devices.
o High-speed devices can transfer data directly between the device and the user's address space, without intervention of a kernel buffer.

6.2.4 Strategy interface:
o The kernel uses the strategy interface to transmit data between the buffer cache and a device, although the read and write procedures of character devices sometimes use their (block counterpart) strategy procedure to transfer data directly between the device and the user address space.
o The kernel passes a buffer header address to the driver strategy procedure; the header contains a list of addresses and sizes for transmission of data to or from the device.
o If data is being copied to or from the user's address space, the driver must lock the process in memory until the I/O transfer is complete.

6.2.5 Ioctl:
o The ioctl system call is a generalization of the terminal-specific stty (set terminal settings) and gtty (get terminal settings) system calls available in earlier versions of the UNIX system.
o It allows a process to set hardware options associated with a device and software options associated with the driver.
o Syntax: ioctl(fd, command, arg);
fd is the file descriptor returned by a prior open system call,
command is a request to the driver to do a particular action,
arg is a parameter for the command.

6.2.6 Other File-System-Related Calls:
The lseek system call works for devices: when the process subsequently reads or writes, the kernel moves the file table offset to the u area, as is done for regular files, and the device physically seeks to the correct offset indicated in the u area.

6.3 Interrupt Handlers:
o The kernel invokes the device-specific interrupt handler, passing it the device number or other parameters to identify the specific unit that caused the interrupt.
o The device number used by the interrupt handler identifies a hardware unit, and the minor number in the device file identifies a device for the kernel.

7. Discuss about disk drivers. MAY JUNE 2011
A disk driver translates a file system address, consisting of a logical device number and a block number, to the particular sector of the disk.
E.g.: /dev/dsk0 of a DEC RP07 disk means accessing section 0 of the DEC RP07 disk.
Accessing block 940 of /dev/dsk3: section 3 starts at block 336000, so the driver accesses block 336000 + 940 = 336940.
Accessing disk data:
Using the raw or block interface (e.g., lseek, ls).
Using commands like mkfs and fsck, which access the file system directly.
lseek sets the file offset for the open file.
mkfs formats a disk section for a UNIX file system and creates the super block, inode list, linked list of free disk blocks, and root directory.
fsck checks the consistency of the file system and corrects the errors it finds.
Block and raw interface:
ls -l /dev/dsk15 /dev/rdsk15
br----- 2 root 0,21 Feb 12 15:00 /dev/dsk15
0 = major no., 21 = minor no.,
2 = physical drive, 1 = section no.
Block interface: read, write. Raw interface: lseek.
The following program reads the same disk section through the block interface (/dev/dsk5) and the raw interface (/dev/rdsk5) and compares the data:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char buf1[4096], buf2[4096];
    int fd1, fd2, i;

    if (((fd1 = open("/dev/dsk5", O_RDONLY)) == -1) ||
        ((fd2 = open("/dev/rdsk5", O_RDONLY)) == -1)) {
        printf("failure on open\n");
        exit(1);
    }

    lseek(fd1, 8192L, 0);
    lseek(fd2, 8192L, 0);

    if ((read(fd1, buf1, sizeof(buf1)) == -1) ||
        (read(fd2, buf2, sizeof(buf2)) == -1)) {
        printf("failure on read\n");
        exit(1);
    }

    for (i = 0; i < (int)sizeof(buf1); i++)
        if (buf1[i] != buf2[i]) {
            printf("different at offset %d\n", i);
            exit(1);
        }

    printf("reads match\n");
    return 0;
}

From the program we infer:
Using read and write on the disk directly is dangerous and jeopardizes system security, so administrators need to set appropriate permissions on the disk device files. For example, /dev/dsk15 should be owned by root but should not allow other users to read or write it.
The read or write routine converts the byte offset into a block offset.
Use of the block interface needs an extra copy of data between the user address space and kernel buffers; use of the raw interface, on the other hand, is faster.

8. Discuss the terminal drivers in detail. MAY JUNE 2011, 2012, 2013
Terminal drivers:
Terminal drivers control the transmission of data to and from terminals. Unlike disks, terminals are also the user's interface to the system.

Line disciplines:
The line discipline is the set of manipulations performed for input editing. For example, to move one character backward, some terminals use BACKSPACE and others use DEL.

Modes of a terminal:
Raw mode
Canonical mode
Raw mode sends the entire typed sequence to the reading process, including the erase characters. In canonical mode, the line discipline buffers the data into lines and processes erase characters before sending the revised sequence to the reading process; canonical mode thus handles editing and deletion.
Example: the user types "Hello data DEL DEL DEL DEL world"
Raw mode output: Hello data DEL DEL DEL DEL world
Canonical mode output: Hello world

Functions of the line discipline:
To parse the input string into lines.
To process erase characters.
To process the "kill" character that invalidates all characters typed so far on the current line.
To echo (write) received characters to the terminal.
To expand output such as tab characters into a sequence of blank spaces.
To generate signals to processes on terminal hangups, line breaks, or in response to the user hitting the delete key.
To allow a raw mode that does not interpret special characters such as erase, kill or carriage return.
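The notes above describe the classic System V line discipline; as a present-day illustration (an addition, not part of the original material), the same canonical-versus-raw distinction can be demonstrated on a modern UNIX-like system with the POSIX termios interface:

#include <stdio.h>
#include <unistd.h>
#include <termios.h>

/* Put the controlling terminal into a raw-like mode: the line discipline
 * no longer buffers input into lines or processes erase/kill characters,
 * and characters are not echoed. */
static struct termios saved;

static int set_raw(int fd)
{
    struct termios t;

    if (tcgetattr(fd, &saved) == -1)
        return -1;
    t = saved;
    t.c_lflag &= ~(ICANON | ECHO);  /* disable canonical processing and echo */
    t.c_cc[VMIN] = 1;               /* read returns after 1 byte             */
    t.c_cc[VTIME] = 0;              /* no inter-byte timeout                 */
    return tcsetattr(fd, TCSANOW, &t);
}

static int restore(int fd)
{
    return tcsetattr(fd, TCSANOW, &saved);   /* back to canonical mode */
}

int main(void)
{
    char c;

    if (set_raw(STDIN_FILENO) == -1) {
        perror("tcsetattr");
        return 1;
    }
    printf("press any key (raw mode, no line buffering): ");
    fflush(stdout);
    if (read(STDIN_FILENO, &c, 1) == 1)
        printf("\ngot 0x%02x\n", (unsigned char)c);
    restore(STDIN_FILENO);
    return 0;
}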