Lecture 21
CS 470 Operating Systems
Monday, February 27

Exam 1 and the Process Management Project are back. Questions?

Outline
- Demand paging
  - Review: address translation
  - Effective memory access time
  - Storage management
  - Page fault processing
  - Page replacement

Dynamic, Partial, Non-Contiguous Organization

Recall: virtual memory storage organization provides support for these design choices (the capitalized option in each pair):
- single vs. MULTIPLE processes
- complete vs. PARTIAL allocation
- FIXED-SIZE vs. variable-size allocation
- contiguous vs. FRAGMENTED allocation
- static vs. DYNAMIC allocation of partitions

We implement this using a demand paging system, in which a logical page is loaded into memory only when it is accessed.

Review: Address Translation

Address translation must now handle the case where the logical page is not in memory:
- Page number p is obtained from the logical address.
- If TLB hit, access memory.
- If TLB miss, access the page table (PT).
  - If the PT entry is valid, access memory.
  - If the PT entry is invalid, handle a page fault.

Page fault processing consists of:
- Issuing a disk transfer request to the backing store
- Loading the requested page into a free frame in physical memory
- Updating the page table entry with the valid bit and frame number
The OS then issues an event (I/O) completion interrupt, and the process goes to the ready queue to wait for the CPU. When it runs again, it retries the same access that caused the page fault.

[Figure: address translation with a TLB and demand paging. The CPU emits logical address (p, d). On a TLB hit, frame number f is used directly to form physical address (f, d). On a TLB miss, the page table is consulted: a valid entry yields f and the TLB is updated; an invalid entry traps to the OS, which transfers the page from the backing store on disk into a main-memory frame, updates the page table, and raises an I/O completion interrupt.]

Effective Memory Access Time

What is the effect of partial allocation on performance? It could be very bad. As seen before, the emat of complete-allocation paging is 20-220 ns (i.e., the effective access time without page faults). Use 200 ns and call this ma. A first-cut estimate depends on the probability of a page fault, p (the page fault rate). The emat becomes:

emat = (1 - p) x ma + p x page fault time

How much time does it take to service a page fault?
- Trap to the OS, context switch, determine that the trap is a page fault: 1-100 µs
- Find the page on disk, issue a disk read (which may have to wait), take the I/O completion interrupt: 8 ms
- Update the PT, wait for the CPU (assume we get it immediately), context switch, resume the interrupted access: 1-100 µs
The disk read dominates, so use 8 ms as the page fault time.

This gives us:

emat = (1 - p) x 200 ns + p x 8 ms
     = (1 - p) x 200 ns + p x 8,000,000 ns
     = (200 + 7,999,800 x p) ns

so emat is directly proportional to the page fault rate. Try p = 0.001 (1 page fault per 1,000 accesses):

emat = (200 + 7,999,800 x 0.001) ns = 8,199.8 ns ≈ 8.2 µs

a slowdown factor of about 40!

If we want degradation of less than 10%, p must be very close to 0. We can compute the bound:

220 > 200 + 7,999,800 x p
 20 > 7,999,800 x p
  p < 0.0000025

This is less than 1 page fault per 399,990 accesses. It turns out this is not unreasonable, thanks to pre-fetching and locality.
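The calculations above can be checked with a short script (a sketch using the example values from this lecture: ma = 200 ns, page fault time = 8 ms; the function name is illustrative):

```python
# Effective memory access time under demand paging.
MA_NS = 200           # memory access time with no fault (assumed 200 ns)
FAULT_NS = 8_000_000  # page fault service time, 8 ms (disk read dominates)

def emat_ns(p):
    """emat = (1 - p) * ma + p * page_fault_time, in nanoseconds."""
    return (1 - p) * MA_NS + p * FAULT_NS

# One fault per 1,000 accesses: about 8199.8 ns, a ~40x slowdown over 200 ns.
print(emat_ns(0.001))

# Largest fault rate that keeps degradation under 10% (emat < 220 ns):
p_max = (220 - MA_NS) / (FAULT_NS - MA_NS)
print(p_max)  # roughly 2.5e-06, i.e. less than 1 fault per ~400,000 accesses
```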
Pre-Fetching

To start a process, we could just load the address of the first instruction of the main program into the program counter and fault in pages as new logical addresses are encountered. This is called pure demand paging.

Generally, programs exhibit locality: memory accesses tend to be near each other. E.g., code instructions are often sequential, data structures often fit within a page, etc.

This is especially true when a process first starts running, so it often makes sense to pre-fetch (i.e., pre-load) the first few pages of the program code to reduce the number of page faults at startup. Pre-fetching can also be useful while a process is running: as we will see, when a process moves to a new logical page, it often accesses the pages around the new one as well.

It is also possible for a single instruction to access more than one page. E.g., ADD A, B, C could translate to:
- Fetch and decode the ADD instruction
- Fetch A into R1
- Fetch B into R2
- Add R1 + R2 into R3
- Store R3 into C
This could require up to 4 page faults (instruction, A, B, and C) without pre-fetching, but the instruction eventually will succeed.

Copy-on-Write

While pre-fetching is used to eliminate future page faults, sometimes we don't want to be so aggressive. E.g., consider what happens after a fork() system call. As noted, the child process is an exact copy of the parent. The two processes can share code pages, but not data pages. Yet often the first thing the child does is call exec(), making a copy of the parent's data pages unnecessary.

Instead, we can use a copy-on-write technique. Initially, parent and child share the same data pages, but the pages are marked as copy-on-write pages.
When either process tries to modify such a shared page, a copy of the page is created and that process's PT is updated to reflect this. Several systems, such as Linux, implement fork() using copy-on-write pages.

Storage Management

In real memory systems (static, complete allocation), we mostly had to worry about maintaining a free list of available memory and deciding when there is enough memory to load a process. With virtual memory systems (dynamic, partial allocation), storage management becomes more complex.

As noted last time, one advantage of VM is that more processes can run at any given time (i.e., it increases multiprogramming). E.g., suppose each process has 10 logical pages and there are 40 physical frames. With real (i.e., static and complete) memory allocation, only 4 processes can be loaded. If each process uses an average of 5 logical pages, then 8 processes can be loaded.

However, there is a chance that all 8 processes will try to access all 10 of their pages at the same time. That would require 80 physical frames, but only 40 exist. What should happen when there is a page fault and no free frames?

Page Replacement

When no frame is free, a resident page must be replaced. Add to the page fault handler:
- Select a victim frame
- Write the victim page back to swap space
- Invalidate its PT (and TLB, if any) entry
Note that there are now two disk transfers when there are no free frames. Really, a page only needs to be written back if it has been modified. Add a dirty bit to each PT entry to indicate whether the page has been changed, and write back only modified frames.

The page fault algorithm becomes:
1. Find the location of page p on the swap disk.
2. Find a free frame:
   a. If there is a free frame, use it.
   b. If there is no free frame, use a page replacement algorithm to select a victim frame.
   c. If the victim frame is dirty, write it to disk.
   d. Change the PT of the victim's process accordingly.
3. Read page p from the swap disk into the newly freed frame; change the PT of this process accordingly.
4. Restart the user process.

Effective Memory Access Time

There are now more categories of memory access:
- TLB hit: TLB access + memory access
- TLB miss: TLB access + PT access in memory, then
  - If the PT entry is valid: memory access
  - If the PT entry is invalid: page fault handling + TLB access (for the restart) + memory access, where fault handling needs
    - one disk access if the replaced page is clean
    - two disk accesses if the replaced page is dirty
emat = the sum over all categories of (time for the category x probability of the category)

Replacement Algorithms

There are many types of replacement algorithms. Criteria for choosing one include:
- a low page fault rate
- efficiency in choosing the victim frame
As with CPU scheduling, we evaluate by simulating on scenario data and comparing. For VM page replacement, the data is a string of logical page references (a reference string). We also need to know the number of physical frames available.

Generally, we expect more physical frames to lead to fewer page faults. E.g., with only one physical frame, nearly every memory reference causes a fault; conversely, with as many physical frames as logical pages, only the first use of each page causes a fault. Most of the examples will use 3 physical frames.

FIFO Replacement

As usual, the simplest algorithm is FIFO (first in, first out). Associate a time with each frame; the victim frame is the oldest one. This can be implemented using a basic queue.
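The queue-based FIFO scheme can be sketched as follows (a minimal simulation; the function name is illustrative):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults for FIFO replacement on a reference string."""
    frames = set()   # pages currently resident in physical frames
    queue = deque()  # arrival order; leftmost entry is the oldest page
    faults = 0
    for page in refs:
        if page in frames:
            continue  # hit: FIFO does not reorder on a reference
        faults += 1
        if len(frames) == nframes:
            victim = queue.popleft()  # evict the oldest resident page
            frames.remove(victim)
        frames.add(page)
        queue.append(page)
    return faults

# The lecture's example reference string with 3 frames:
print(fifo_faults([7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1], 3))  # 15
```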
Reference string 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 with 3 frames (frame contents shown at each fault, marked *):

ref:   7  0  1  2  0  3  0  4  2  3  0  3  2  1  2  0  1  7  0  1
       7  7  7  2     2  2  4  4  4  0        0  0        7  7  7
          0  0  0     3  3  3  2  2  2        1  1        1  0  0
             1  1     1  0  0  0  3  3        3  2        2  2  1
fault: *  *  *  *     *  *  *  *  *  *        *  *        *  *  *

FIFO is easy to understand, but it doesn't always give good performance: this string incurs 15 faults.

Belady's Anomaly

Consider the reference string 1 2 3 4 1 2 5 1 2 3 4 5. With 3 physical frames, FIFO results in 9 faults:

ref:   1  2  3  4  1  2  5  1  2  3  4  5
       1  1  1  4  4  4  5        5  5
          2  2  2  1  1  1        3  3
             3  3  3  2  2        2  4
fault: *  *  *  *  *  *  *        *  *

Do it again with 4 frames; FIFO results in 10 faults!

ref:   1  2  3  4  1  2  5  1  2  3  4  5
       1  1  1  1        5  5  5  5  4  4
          2  2  2        2  1  1  1  1  5
             3  3        3  3  2  2  2  2
                4        4  4  4  3  3  3
fault: *  *  *  *        *  *  *  *  *  *

Adding frames made FIFO worse; this is Belady's anomaly.

OPT Replacement

OPT is the optimal algorithm: replace the page that will not be used for the longest time into the future.

ref:   7  0  1  2  0  3  0  4  2  3  0  3  2  1  2  0  1  7  0  1
       7  7  7  2     2     2        2        2           7
          0  0  0     0     4        0        0           0
             1  1     3     3        3        1           1
fault: *  *  *  *     *     *        *        *           *

OPT incurs only 9 faults on this string, versus 15 for FIFO.

OPT is provably optimal for a finite reference string. Of course, we generally do not know the entire reference string, but it is good to have the theoretical minimum so we can say things like "at worst within 12% of optimal" or "4.7% of optimal on average". OPT is unimplementable in a real system, so it must be approximated. FIFO is not a good approximation.

LRU Replacement

LRU chooses the "least recently used" page as the victim. The idea is to use recent past usage as a predictor of near-future usage: replace the page that has not been used for the longest time.

ref:   7  0  1  2  0  3  0  4  2  3  0  3  2  1  2  0  1  7  0  1
       7  7  7  2     2     4  4  4  0        1     1     1
          0  0  0     0     0  0  3  3        3     0     0
             1  1     3     3  2  2  2        2     2     7
fault: *  *  *  *     *     *  *  *  *        *     *     *

LRU incurs 12 faults on this string: better than FIFO's 15, though not as good as OPT's 9.

The main problem with LRU is how to implement it. There are a couple of feasible implementations:
- Counters: use a CPU time counter and store its value into the PT entry on every reference. The victim is the entry with the lowest counter value. This requires searching the PT, and must deal with context switches and counter rollover.
- "Stack": keep a "stack" of page references. When a page is referenced, move it to the top of the stack.
The victim is the page at the bottom of the stack. Since entries must be removed from the middle of the stack, it is usually implemented as a doubly-linked list.
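The stack scheme can be sketched with Python's collections.OrderedDict, which is backed by a doubly-linked list: move_to_end plays the role of "move to top of stack" and popitem(last=False) removes the victim from the bottom (a sketch; the function name is illustrative):

```python
from collections import OrderedDict

def lru_faults(refs, nframes):
    """Count page faults for LRU replacement on a reference string."""
    stack = OrderedDict()  # most recently used page is at the end ("top")
    faults = 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)  # referenced: move to top of stack
        else:
            faults += 1
            if len(stack) == nframes:
                stack.popitem(last=False)  # victim: bottom of stack (LRU)
            stack[page] = None
    return faults

# The lecture's example reference string with 3 frames:
print(lru_faults([7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1], 3))  # 12
```

In a real kernel the same structure is maintained in hardware or by the OS on every reference, which is exactly what makes true LRU expensive.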