CS 470 Operating Systems - Lecture 25 (Wednesday, March 16)

Reminder: Homework 5 is due today. Homework 6 is posted. Questions?

Outline
- Summary of page replacement algorithms
- Other replacement algorithms
- Frame allocation
- Thrashing
- Working set model
- Other considerations: program structure

Summary: Page Replacement Algorithms
FIFO ("First In, First Out") - the victim page is the oldest page; not a particularly good algorithm; suffers from Belady's anomaly.
OPT ("Optimal") - the victim page is the one that will be used furthest into the future; provably optimal for a finite reference string, but not implementable in a real system.
LRU ("Least Recently Used") - the victim page is the one that was used furthest in the past; the most commonly used algorithm.

Other Replacement Algorithms
Counting algorithms base replacement on frequency of use: a page's counter is incremented each time the page is referenced.
LFU ("Least Frequently Used") - the victim page is the one with the lowest counter.
MFU ("Most Frequently Used") - the victim page is the one with the highest counter.
Why should we expect these might work? They are not too common, since they do not approximate OPT very well.

LRU approximation - many architectures do not provide sufficient hardware for true LRU, often providing only a single reference bit that is initially clear (value 0) and is set (value 1) when a page is referenced. When a victim is needed, we can tell which pages have and have not been used, but not the exact order.
This extends to multiple reference bits: periodically shift each page's bits right, with a 0 added at the left end; a reference sets the leftmost bit. For example, with 8 reference bits we get a history of usage over the last 8 time periods:
11111111 - referenced in each of the last 8 time periods
00000011 - referenced only in the far past
11000000 - referenced only in the recent past
Viewed as an integer, pages with recent references have larger reference numbers, so the page with the smallest reference number is the LRU pick. Note that this number is not necessarily unique.

Second chance algorithm - a single reference bit can be used in the following way to make FIFO a better algorithm. Keep track of a current "head" of the page list. When a victim needs to be selected, check the head page: if its reference bit is 0, replace it and advance the head pointer; if the reference bit is 1 ("recently used"), clear the bit to 0 and skip the page. Repeat until a page with a 0 bit is found. (A short C sketch of this procedure appears below.)
In the worst case, all reference bits are set and the algorithm degenerates to FIFO after each page has been given a second chance. If a page is used often enough, its reference bit will likely be 1 and it stays in memory. We may also want to take the dirty bit into consideration, since dirty pages are more expensive to replace.

Frame Allocation
To speed up page replacement, some systems reserve a few free frames at all times (called a buffer pool), so that there is always a free frame when a page fault occurs. Selection of the victim page and swapping it out are done concurrently with servicing the page fault.
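To make the second-chance procedure above concrete, here is a minimal C sketch. It assumes a fixed circular table of frames, each carrying the hardware-set reference bit; the names (NUM_FRAMES, select_victim, the frame struct) are illustrative, not from any particular OS.

    #include <stdbool.h>

    #define NUM_FRAMES 64

    /* Illustrative frame-table entry: one resident page plus its reference bit. */
    struct frame {
        int  page;        /* page number currently held in this frame */
        bool referenced;  /* set by hardware on a reference, cleared here */
    };

    static struct frame frames[NUM_FRAMES];
    static int head = 0;  /* current "head" of the circular page list */

    /* Second chance: sweep the circular list; a page whose reference bit is 1
     * gets a second chance (bit cleared, page skipped); the first page found
     * with a clear bit is the victim.  In the worst case one full sweep clears
     * every bit and the selection degenerates to FIFO. */
    int select_victim(void)
    {
        for (;;) {
            struct frame *f = &frames[head];
            int current = head;
            head = (head + 1) % NUM_FRAMES;   /* advance the head pointer */
            if (!f->referenced)
                return current;               /* bit is 0: replace this page */
            f->referenced = false;            /* bit was 1: clear and skip */
        }
    }

A fuller version would also consult the dirty bit, preferring clean pages among those whose reference bits are 0, since dirty pages must be written back before their frame can be reused.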
Frame Allocation
How is the fixed amount of free memory allocated among the various processes? Consider the simplest case, a single-user system. For example, take 128 KB of memory divided into 128 1 KB pages. The OS may need as many as 35 frames, leaving a minimum of 93 frames for the user process. What if the OS does not use 35 frames at all times? Should the OS compete with the user process? If not, what would be a fair division of the frames?

In general, what would be a fair division of frames among multiple processes? Some things to note:
- The maximum possible allocation is the total number of frames in the system.
- Performance is better with more frames. In particular, for any given architecture we can determine the minimum number of frames needed to avoid really bad performance. E.g., a PDP-11 move instruction could straddle two pages and have two indirect address operands, thus requiring 6 frames for one instruction.

What are some ways to divide m frames among n processes? How do we make the division "fair"?

Equal allocation gives equal-sized shares to each process, i.e., m/n frames per process. Any leftover frames can be used as the buffer pool. Continuing the example, 93 frames divided among 5 processes gives 18 frames per process and a 3-frame buffer pool. Of course, not all programs are the same size. Suppose there are 2 processes, one of 10 KB (like an editor) and another of 127 KB (like a database). Conceptually, equal allocation "wastes" 36 (93/2 - 10) frames.

Proportional allocation instead bases each allocation on the process' VM size, as follows (a short C sketch appears after these slides):
Let the VM size of process Pi be si. Define S = sum of all si. Allocate ai frames to Pi, where ai = si / S x m, or the minimum number of frames, whichever is larger.

Continuing the example, this allocates 10/137 x 93 ≈ 7 frames to the 10 KB process and 127/137 x 93 ≈ 86 frames to the 127 KB process. The actual allocation depends on the degree of multiprogramming: additional processes reduce the allocation of existing processes, and vice versa. We could also allocate proportionally on the basis of priority or other factors rather than size.

Global vs. Local Replacement
In addition to allocation, we need to decide how to handle page replacement when there are multiple processes.
Global replacement - choose the victim from the set of all frames, even ones currently allocated to another process.
Local replacement - choose the victim from the set of frames currently allocated to the faulting process.
Local replacement seems "fairer", but if some frames go unused it results in lower utilization. Global replacement causes the page fault rate of a process to be affected by other processes, and it can cause a process to lose "too many" pages and fall below its minimum, especially in priority-based schemes. But generally, global replacement increases throughput, so it is more commonly used (and is what is to be simulated in the project).
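Returning to proportional allocation: the rule ai = si / S x m, with a floor at the architectural minimum, can be sketched in C as below. The function name, the MIN_FRAMES value, and the use of plain integer division are illustrative assumptions; a real system would also rebalance allocations as processes enter and leave.

    #define MIN_FRAMES 6   /* architecture-dependent minimum, e.g. the PDP-11 case */

    /* Proportional allocation sketch: give process i roughly s_i / S of the
     * m available frames, but never less than the architectural minimum. */
    void allocate_frames(const int vm_size[], int alloc[], int n, int m)
    {
        long total = 0;                       /* S = sum of all s_i */
        for (int i = 0; i < n; i++)
            total += vm_size[i];

        for (int i = 0; i < n; i++) {
            /* a_i = s_i / S * m; integer division truncates the fraction */
            int share = (int)((long)vm_size[i] * m / total);
            alloc[i] = (share > MIN_FRAMES) ? share : MIN_FRAMES;
        }
    }

With the lecture's numbers (sizes 10 and 127, m = 93), truncation gives 6 and 86 frames, while rounding to the nearest integer, as on the slide, gives 7 and 86; either way, any leftover frames can feed the buffer pool.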
Thrashing
When a process loses too many pages under global replacement, the system should suspend the process by swapping it out and releasing its remaining frames. Later the process can be swapped back in and resumed. The intermediate (medium-term) CPU scheduler handles this. A process has "not enough" frames when it has more actively used pages than allocated frames; then each new reference causes a page fault that throws out a page that will be needed again soon.

This high level of paging activity is called thrashing: the process/system spends more time paging than executing, and CPU utilization is very low. The cause of thrashing is too much multiprogramming - processes keep stealing frames from each other, or the fixed number of frames is not enough for the current activity. Even worse, some systems may interpret the low CPU utilization as a need to increase multiprogramming.

The distinguishing symptom is very high disk activity when the process/system is thrashing. (It causes a "thrashing" noise as the disk seeks from frame to frame on the backing store.)

When a local replacement strategy is used, thrashing may be limited to one process. However, this may still affect overall paging performance, since the backing store is shared by all processes.

Working Set Model
How does a system determine whether it is thrashing, or could start thrashing if it admits another process? The locality model says that as a program executes, it moves from one locality to another, where a locality is a set of pages that are used together. (We have always assumed this.) Localities are defined by program and data structures. If we do not allocate enough frames for the current locality, the process thrashes.

We model locality using the working set model. Define a parameter Δ called the working set window. The working set is then the set of pages referenced in the last Δ references. The idea is that if a page is currently active, it will be in the working set; after its last use, a page drops out of the working set after Δ more references.

For example, suppose Δ = 10 and we have the following reference string:

... 2 6 1 5 7 7 7 7 5 1 ... 3 4 4 4 3 4 3 4 4 4 ...
    |<---- Δ = 10 ---->|    |<---- Δ = 10 ---->|
             t1                      t2

WS(t1) = {1,2,5,6,7}
WS(t2) = {3,4}

Of course, Δ must be the correct size. Too small, and we will still get thrashing; too large, and we will have low utilization as unused pages continue to take up frames.

The system needs to keep track of the total demand for frames. Define WSSi as the working set size for process Pi; then the total demand D = sum of all WSSi. If D > m, thrashing will occur and the system needs to suspend a process. If D < m by enough frames, the system can initiate a new process or resume a suspended one. (A short C sketch of computing a working set appears after these slides.)

Page-Fault Frequency
A more direct way of preventing thrashing is to just keep track of the page-fault rate of each process. When it is excessively high, the process needs more frames: either allocate more frames or suspend the process. When it is very low, the process has too many frames, so free some of them.
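To make the working-set bookkeeping concrete, here is a minimal C sketch that counts the distinct pages in the last Δ references of a recorded reference string. The names (DELTA, working_set_size) and the idea of scanning an explicit reference string are purely illustrative; a real kernel approximates the working set with reference bits sampled on a periodic timer interrupt rather than recording every reference.

    #include <stdio.h>

    #define DELTA      10    /* working set window, as in the lecture example */
    #define MAX_PAGES  256   /* assumed bound on page numbers, for the sketch */

    /* Return WSS at time t: the number of distinct pages among the last
     * DELTA references, refs[t-DELTA+1 .. t]. */
    int working_set_size(const int refs[], int t)
    {
        int seen[MAX_PAGES] = {0};
        int wss = 0;
        int start = (t - DELTA + 1 > 0) ? (t - DELTA + 1) : 0;
        for (int i = start; i <= t; i++) {
            if (!seen[refs[i]]) {
                seen[refs[i]] = 1;   /* first appearance in the window */
                wss++;
            }
        }
        return wss;
    }

    int main(void)
    {
        /* The portion of the lecture's reference string around t1. */
        int refs[] = {2, 6, 1, 5, 7, 7, 7, 7, 5, 1};
        printf("WSS(t1) = %d\n", working_set_size(refs, 9));  /* prints 5 */
        return 0;
    }

Summing WSSi over all processes gives the total demand D; comparing D against the m available frames is what tells the system whether to suspend a process or admit another.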
Pre-Paging
Pure demand paging incurs a large number of page faults at the beginning of process execution. Pre-paging (or pre-fetching) tries to bring in more than one page at a time to prevent this. Some OS's pre-page small files. A system can also pre-page the working set of a resumed process. The main question is whether the cost of pre-paging is less than the cost of the page faults it avoids, since not all of the pre-loaded pages may be used.

Program Structure
Usually programmers do not know or care about memory management, but sometimes how a program is written can have a great effect. Suppose we declare a 2D array:

    int array[1024][1024];

In most languages (FORTRAN being the notable exception), this array is laid out in row-major order. Suppose ints are 4 bytes and pages are 4 KB. Each row then takes up exactly one page:

    array[0][0]
    array[0][1]
      :
    array[0][1023]
    ----------------------- page boundary
    array[1][0]
    array[1][1]
      :
    array[1][1023]
    ----------------------- page boundary
    array[2][0]
      :

Consider initializing this array with 0's. We could write:

    for (int i = 0; i < 1024; i++)
        for (int j = 0; j < 1024; j++)
            array[i][j] = 0;

How many page faults will this cause?

We also could write:

    for (int j = 0; j < 1024; j++)
        for (int i = 0; i < 1024; i++)
            array[i][j] = 0;

How many page faults will this cause? (A worked comparison appears at the end of these notes.)

Careful selection of data and programming structures can increase locality and hence lower the page-fault rate and the size of the working set. Consider whether the following are "good" or "bad" with regard to page faults:
- Stack
- Hashed symbol table
- Sequential search
- Binary search
- Pure code
- Vector operations
- Indirection (pointers)
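Returning to the two initialization loops above, a rough worked count (assuming the array pages start out on the backing store and the process is allocated far fewer than 1,024 frames): the first loop, with i in the outer position, finishes an entire 4 KB row, and hence an entire page, before moving on, so it faults roughly once per row, about 1,024 page faults in total. The second loop, with j in the outer position, touches a different row, and hence a different page, on every assignment, so each of the 1024 x 1024 = 1,048,576 assignments can fault, roughly a thousand times worse.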