CS571 – Fall 2010
Virtual Memory Systems
GMU – CS 571

Virtual Memory
Basic Concepts
Demand Paging
Page Replacement Algorithms
Thrashing
Working-Set Model
Observing VM Activity

Virtual Memory
Separation of user logical memory from physical memory
• Only part of the program needs to be in physical memory for execution.
• The logical address space can be much larger than the physical address space.
• Extends physical memory (anyone remember VAX systems?)
Virtual memory can be implemented via:
• Demand paging
• Demand segmentation

Virtual Memory and Physical Memory
(figure)

Swapping
(figure)

Demand Paging
Bring a page into memory only when it is needed
• Less I/O needed
• Less physical memory needed
• Faster response? (Well, that depends…)
• Support more processes/users
Page is needed ⇒ reference it
• If not in memory ⇒ must bring it in from the disk
But remember…

Disk is S…L…O…W…
(figure; surviving labels: MB, GB, TB, TB, PB, PB)

Latency Matters!
                                                  Relative time
Level                 Latency     Seconds         Seconds    Hours   Days
NH L3 Cache           52 clocks   1.62 x 10^-8
Local Memory          150 ns      1.5 x 10^-7     9.23
32 Core NUMA 3 Hops               5.19 x 10^-7    31
SSD                   125 us      1.25 x 10^-4    7,692      2.14
Gb Ethernet           1 ms        1 x 10^-3       61,538     17      0.71
Disk                  3.6 ms      3.6 x 10^-3     221,538    62      2.56

Valid-Invalid Bit
With each page table entry a valid–invalid bit is associated (1 ⇒ in-memory, 0 ⇒ not-in-memory)
(figure: page table with columns "Frame #" and "valid-invalid bit"; the first entries have bit 1, the remaining entries have bit 0)
During address translation, if the valid–invalid bit in the page table entry is 0 ⇒ page fault.

Use of the Page Table
(figure)

Page Fault
A reference to a page that is not in memory results in a trap ⇒ page fault.
Typically, the page table entry will contain the address of the page on disk.
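As a rough illustration of the valid–invalid check during address translation, the sketch below models a page table in software. The names `PageTableEntry`, `PageFault`, `translate`, the `disk_addr` field, and the 4096-byte page size are assumptions made for this example; real hardware performs this check in the MMU.

```python
from dataclasses import dataclass
from typing import List, Optional

PAGE_SIZE = 4096  # assumed page size in bytes (illustrative only)


class PageFault(Exception):
    """Models the trap raised when the valid-invalid bit is 0."""


@dataclass
class PageTableEntry:
    frame: Optional[int]             # frame number if the page is resident
    valid: bool                      # True = in memory, False = not in memory
    disk_addr: Optional[int] = None  # hypothetical location of the page on disk


def translate(page_table: List[PageTableEntry], virtual_addr: int) -> int:
    """Translate a virtual address, trapping if the page is not resident."""
    page, offset = divmod(virtual_addr, PAGE_SIZE)
    entry = page_table[page]
    if not entry.valid:
        # Valid-invalid bit is 0: the OS would now handle the page fault.
        raise PageFault(f"page {page} is not in memory")
    return entry.frame * PAGE_SIZE + offset
```

A reference to a resident page returns a physical address; a reference to a non-resident page raises `PageFault`, which stands in for the trap to the operating system.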
Major steps
• Locate an empty frame
• Initiate disk I/O
• Move the page (from disk) to the empty frame
• Update the page table; set the valid bit to 1.
The faulting process does no work while this is happening!

Steps in Handling a Page Fault
(figure)

Impact of Page Faults
Each page fault affects the system performance negatively
• The process experiencing the page fault will not be able to continue until the missing page is brought into main memory
• The process will be blocked (moved to the waiting state)
• Dealing with the page fault involves disk I/O
  - Increased demand on the disk drive
  - Increased waiting time for the process experiencing the page fault
• How can we minimize the impact of page faults?

Performance of Demand Paging
Page Fault Rate p (0 ≤ p ≤ 1.0)
• if p = 0, no page faults
• if p = 1, every reference is a page fault
Effective Access Time with Demand Paging
= (1 – p) * (effective memory access time) + p * (page fault overhead)
Example
• Effective memory access time = 100 nanoseconds
• Page fault overhead = 25 milliseconds
• p = 0.001
• Effective Access Time with Demand Paging = 0.999 * 100 ns + 0.001 * 25,000,000 ns ≈ 25,100 ns ≈ 25 microseconds

Page Replacement
As we increase the degree of multiprogramming, over-allocation of memory becomes a problem.
What if we are unable to find a free frame at the time of the page fault?
One solution: Swap out a process, free all its frames and reduce the level of multiprogramming
• May be good under some conditions
Another option: free a memory frame already in use.

Basic Page Replacement
1. Find the location of the desired page on disk.
2. Locate a free frame:
   - If there is no free frame, use a page replacement algorithm to select a victim frame.
   - Write the victim page to the disk; update the page and frame tables accordingly.
   But how do we select a victim?
3. Read the desired page into the free frame. Update the page and frame tables.
4. Put the process (that experienced the page fault) back in the ready queue.

Page Replacement
(figure)

Page Replacement
Observe: If there are no free frames, two page transfers are needed at each page fault!
We can use a modify (dirty) bit to reduce the overhead of page transfers: only modified pages are written back to disk.
Page replacement completes the separation between logical memory and physical memory: a very large virtual memory can be provided on a smaller physical memory.

Page Replacement Algorithms
When page replacement is required, we must select the frames that are to be replaced.
Primary Objective:
• Use the algorithm with the lowest page-fault rate.
• Efficiency (how fast can you swap pages)
• Cost (what is the effect on running programs)
Evaluate an algorithm by running it on a particular string of memory references (reference string) and computing the number of page faults on that string.
We can generate reference strings artificially or we can use specific traces.

Page Faults Versus The Number of Frames
(figure)
Usually, for a given reference string, the number of page faults decreases as we increase the number of frames.

First-In-First-Out (FIFO) Algorithm
Simplest page replacement algorithm.
The FIFO replacement algorithm chooses the "oldest" page in memory.
Implementation: a FIFO queue holds the identifiers of all the pages in memory.
• We replace the page at the head of the queue.
• When a page is brought into memory, it is inserted at the tail of the queue.

FIFO Page Replacement
Easy to understand and implement.
But the "oldest" page may contain a heavily used variable.
We will need to bring that page back in the near future!
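The queue-based FIFO implementation described above can be sketched as a small simulator that counts page faults for a reference string. This is a minimal sketch, not a kernel implementation; the function name `fifo_page_faults` and its parameters are chosen here for illustration.

```python
from collections import deque
from typing import Iterable


def fifo_page_faults(reference_string: Iterable[int], num_frames: int) -> int:
    """Count page faults under FIFO replacement with `num_frames` frames."""
    queue = deque()   # head (left) = oldest resident page
    resident = set()  # fast membership test for resident pages
    faults = 0
    for page in reference_string:
        if page in resident:
            continue  # hit: FIFO does not reorder the queue on a reference
        faults += 1
        if len(queue) == num_frames:
            victim = queue.popleft()  # replace the page at the head
            resident.remove(victim)
        queue.append(page)            # new page goes to the tail
        resident.add(page)
    return faults
```

Running the function on the same reference string with different values of `num_frames` gives a quick way to compare fault counts across memory sizes.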
FIFO Page Replacement
Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
• 3-frame case results in 9 page faults
• 4-frame case results in 10 page faults
The program potentially runs slower with more memory!
Belady's Anomaly
• more frames ⇒ more page faults for some reference strings!
• But most of the time, more memory (frames) results in fewer page faults

Optimal Page Replacement
Is there an algorithm that yields the minimal page fault rate for any reference string and a fixed number of frames?
Replace the page that will not be used for the longest period of time: Algorithm OPT (MIN)
But…
Can be used as a yardstick for the performance of other algorithms

Least-Recently-Used (LRU) Algorithm
Idea: Use the recent past as an approximation of the near future.
Replace the page that has not been used for the longest period of time: Least-Recently-Used (LRU) algorithm

LRU Approximation Algorithms
Sophisticated hardware support for exact LRU may involve high overhead/cost.
Some limited HW support is common: Reference bit
• With each page associate a bit, initially set to 0.
• When the page is referenced, the bit is set to 1.
• By examining the reference bits, we can determine which pages have been used
• We do not know the order of use, however.
• Forms the basis of many page replacement algorithms that approximate LRU
  - Additional-Reference-Bits Algorithm
  - Second-Chance Algorithm (Clock Algorithm)
  - Enhanced Second-Chance Algorithm

Additional-Reference-Bits Algorithm
Idea: Record the reference bits at regular intervals, to get additional ordering information.
For example, we can keep an extra byte for each page/frame in a table in memory.
At regular intervals, the OS shifts the reference bit for each page into the high-order bit of its byte, shifting the existing bits right by one and clearing the reference bit in the meantime:
• Reference bit: 1   Additional byte: 00000000 → 10000000
• Reference bit: 0   Additional byte: 10000000 → 01000000
• Reference bit: 1   Additional byte: 01000000 → 10100000
The page with the lowest number is replaced.
The number of bits in the history depends on the available hardware.

Second-Chance Algorithm
This is basically the FIFO algorithm with the reference bit.
When a page is selected for replacement, we inspect its reference bit.
• If the reference bit = 0, we directly replace it.
• If the reference bit = 1, we give that page a second chance and move on to select the next FIFO page. However, we set its reference bit to 0.
One way to implement the second-chance algorithm is to use a circular queue.

Second-Chance Algorithm
(figure)

Enhanced Second-Chance Algorithm
Idea: Use the second-chance algorithm by considering both the reference bit and the modify bit together
• (0,0) neither recently used nor modified: best page to replace.
• (0,1) not recently used but modified: not quite as good, because the page will need to be written out before replacement.
• (1,0) recently used but clean: probably will be used again soon.
• (1,1) recently used and modified: probably will be used again soon, and we will need to write it out to disk.
We replace the first page encountered in the lowest non-empty class.

Phases of the Enhanced Second-Chance Algorithm (Cont.)
1. Beginning at the current position of the pointer, scan the pages. The first page with the reference bit = 0 and modify bit = 0 is replaced. No changes are made to the reference bit in this phase.
2. If Phase 1 fails, scan again, looking for a page with the reference bit = 0 and modify bit = 1. During this scan, set the reference bit to 0 on each page that is bypassed.
3. If Phase 2 fails, the pointer should be in its original position and all the pages will have the reference bit 0. Repeat Phase 1, and if necessary, Phase 2.

Counting-Based Page Replacement
Keep a counter of the number of references that have been made to each page
Least-Frequently-Used (LFU) Algorithm: replaces the page with the smallest count

Page-Buffering Algorithm
In addition to a specific page-replacement algorithm, other procedures are also used.
Systems commonly keep a pool of free frames.
When a page fault occurs, the desired page is read into a free frame first, allowing the process to restart as soon as possible. The victim page is written out to the disk later, and its frame is added to the free-frame pool.
Other enhancements are also possible.

Global vs. Local Allocation
Page replacement algorithms can be implemented broadly in two ways.
Global replacement – a process selects a replacement frame from the set of all frames; one process can take a frame from another.
• Under global allocation algorithms, the page-fault rate of a given process depends also on the paging behavior of other processes.
Local replacement – each process selects from only its own set of allocated frames.
• Less-used pages of memory are not made available to a process that may need them.

Memory Allocation for Local Replacement
Equal allocation – If we have n processes and m frames, give each process m/n frames.
Proportional allocation – Allocate according to the size of the process.
Priority of processes?

CPU Utilization versus the Degree of Multiprogramming
(figure)

Thrashing
High paging activity: The system is spending more time paging than executing.
How can this happen?
• The OS observes low CPU utilization and increases the degree of multiprogramming.
• A global page-replacement algorithm is used; it takes away frames belonging to other processes.
• But these processes need those pages, so they also cause page faults.
• Many processes join the waiting queue for the paging device; CPU utilization decreases further.
• The OS introduces new processes, further increasing the paging activity.

Thrashing
To avoid thrashing, we must provide every process in memory as many frames as it needs to run without an excessive number of page faults.
Programs do not reference their address spaces uniformly/randomly
A locality is a set of pages that are actively used together.
According to the locality model, as a process executes, it moves from locality to locality.
A program is generally composed of several different localities, which may overlap.

Locality in a Memory-Reference Pattern
(figure)

Working-Set Model
Introduced by Peter Denning
A model based on the locality principle
• see The Locality Principle, P. Denning, CACM July 2005
The parameter Δ defines the working-set window
The set of pages in the most recent Δ page references of process Pi constitutes the working set

Working-Set Model
The accuracy of the working set depends on the selection of Δ:
• if Δ is too small, it will not encompass the entire locality.
• if Δ is too large, it will encompass several localities.
• if Δ = ∞ ⇒ it will encompass the entire program.
D = Σ WSSi ≡ total demand for frames (WSSi is the working-set size of process Pi)
if D > the number of frames in memory ⇒ Thrashing
The OS will monitor the working set of each process and perform the frame allocation accordingly. It may suspend processes if needed.
The difficulty is how to keep track of the moving working-set window.

Working Sets and Page Fault Rates
(figure)

Controlling Page-Fault Rate
Maintain an "acceptable" page-fault rate.
• If the actual rate is too low, the process loses a frame.
• If the actual rate is too high, the process gains a frame.

Copy-on-Write
Copy-on-Write (COW) allows both parent and child processes to initially share the same pages in memory. Only when either process modifies a shared page is that page copied.
COW allows more efficient process creation as only modified pages are copied (Windows 2000/XP, Linux, Solaris).

Virtual Memory in Windows XP
Uses demand paging with clustering. Clustering brings in pages surrounding the faulting page.
Processes are assigned a working set minimum and a working set maximum.
The working set minimum is the minimum number of pages the process is guaranteed to have in memory.
A process may be assigned up to as many pages as its working set maximum.
When the amount of free memory in the system falls below a threshold, automatic working set trimming is performed to restore the amount of free memory.
Working set trimming removes pages from processes that have pages in excess of their working set minimum (uses a variation of the clock algorithm)

Virtual Memory in Solaris
When a thread experiences a page fault, the kernel assigns a page to the faulting thread from the list of free frames.
To reduce the overhead, the kernel tries to keep a sufficient amount of free memory available at all times.
If the number of free pages falls below a certain threshold, a system process known as pageout is initiated.
Three important parameters (related to paging)
• lotsfree: if total free memory is less than lotsfree, then some frames are freed by the pageout process, which uses a version of the "second-chance algorithm"
• desfree: if total free memory is less than desfree, then the kernel begins swapping out processes.
• minfree: if the system is unable to maintain the amount of free memory even at minfree, the pageout process is called for every request for a new page.
• lotsfree ≥ desfree ≥ minfree

Observing VM Activity
Main tool is vmstat