TDDB68 Concurrent programming and operating systems

Overview: Virtual Memory
- Background
- Demand Paging
- Page Replacement
- Allocation of Frames
- Thrashing and Data Access Locality
- Process Creation: Copy-on-Write
- I/O Interlock
- Memory-Mapped Files
[SGG7/8/9] Chapter 9

Copyright Notice: The lecture notes are mainly based on Silberschatz's, Galvin's and Gagne's book ("Operating System Concepts", 7th ed., Wiley, 2005). No part of the lecture notes may be reproduced in any form, due to the copyrights reserved by Wiley. These lecture notes should only be used for internal teaching purposes at Linköping University.
Christoph Kessler, IDA, Linköpings universitet.
Slides: Silberschatz, Galvin and Gagne ©2005.

Virtual Memory
Previously achieved: separation of user logical memory from physical memory.
- Sparse address space: "holes" (e.g. between heap and stack in the virtual address space of a process) should be allocated pages only when the stack or heap grows.
- Added benefits: protection, reuse.
- Still, the whole program and its data must be loaded into memory.
Virtual memory idea: only part of the program needs to be in memory for execution; pages currently not used can be thrown out to secondary memory.
- The logical address space can therefore be much larger than the physical address space (virtual memory that is larger than physical memory).
- Allows address spaces to be shared by several processes.
- Allows for more efficient process creation.
- Not all functionality provided in the program text is executed in every run of the program.
Virtual memory can be implemented via demand paging or demand segmentation.

Demand Paging [Kilburn et al. 1961]
Demand paging ("lazy paging", "lazy swapping"): bring a page into memory only when it is needed. A page is needed when it is referenced (load/store, data or instructions).
- Less I/O needed
- Less memory needed
- Faster response
- More users
On a reference:
- invalid reference: abort;
- valid but not in memory: bring the page into memory.
Rather than swapping entire processes (cf. swapping), we transfer individual pages between memory and contiguous disk space, and only when they are first referenced.

Valid-Invalid Bit
With each page-table entry, a valid-invalid bit is associated (1 = page in memory, 0 = not in memory). Initially, the valid-invalid bit is 0 on all entries. In a page-table snapshot, entries for resident pages hold a frame number and bit 1; the remaining entries hold bit 0.
During address translation, if the valid-invalid bit in the page-table entry is 0, a page fault occurs (a trap into the operating system).

Steps in Handling a Page Fault (case: a free frame exists)
"Busy waiting" for the slow disk? Better to allocate the CPU to some other ready process in the meanwhile.

What happens if there is no free frame?
- Find some page in memory that is not really in use and swap it out.
- Write-back is only necessary if the victim page was modified.
- The same page may be brought into memory several times.
- Various algorithms exist; we want an algorithm that minimizes the number of page faults.

Performance of Demand Paging
Page-fault rate p, 0 <= p <= 1:
- if p = 0: no page faults;
- if p = 1: every reference is a fault.
Memory access time t.
Effective Access Time (EAT)
EAT = (1 - p) * t
    + p * (page-fault overhead
           + time to swap the page out, if modified
           + time to swap the new page in
           + restart overhead + t)

Demand Paging Example
Write-back rate w, 0 <= w <= 1: the fraction of page faults where page replacement is needed and the victim page has been modified, so it must be swapped out first.
Example:
- Memory access time t = 1 microsecond (µs)
- Time for swapping a page = 10 ms = 10,000 µs
- Write-back rate w = 0.5 = 50%
Expected swap time per page fault = (1 + w) * 10,000 µs = 15,000 µs
EAT = (1 - p) * 1 µs + p * 15,000 µs = 1 µs + p * 14,999 µs

Example: Need for Page Replacement
A page fault occurs for page 3, but all frames (kernel and user pages 0, 1, 2, ...) are currently in use.

Page Replacement
Prevent over-allocation of memory by modifying the page-fault service routine to include page replacement.
Use the modify (dirty) bit to reduce the overhead of page transfers: only modified pages are written back to disk.
Page replacement completes the separation between logical memory and physical memory: a large virtual memory can be provided on top of a smaller physical memory.

Basic Page Replacement
Extended page-fault service routine:
1. Find the location of the desired page on disk.
2. Find a free frame:
   - if there is a free frame, use it;
   - if there is no free frame, use a page-replacement algorithm to select a victim frame and, if dirty, write its page to disk.
3. Read the desired page into the (newly) free frame.
4. Update the page table.
5. Restart the process.

How can we compare algorithms for page replacement?
Comparing Page-Replacement Algorithms
Goal: find the algorithm with the lowest page-fault rate.
Method: simulation.
- Assume an initially empty page table.
- Run the algorithm on a particular string of memory references (a reference string: page numbers only).
- Count the number of page faults on that string.
In all our examples, the reference string is 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.

First-In-First-Out (FIFO) Algorithm
The victim is the "oldest" page: use a time stamp or a queue.
Assume 3 frames per process (3 pages can be in memory at a time per process):

Reference:  1  2  3  4  1  2  5  1  2  3  4  5
Frame 1:    1  1  1  4  4  4  5  5  5  5  5  5
Frame 2:       2  2  2  1  1  1  1  1  3  3  3
Frame 3:          3  3  3  2  2  2  2  2  4  4
Fault?      *  *  *  *  *  *  *        *  *

A total of 9 page faults. After page 3 is loaded, page 1 is the oldest of the three. Note that re-using (hitting) a resident page does not alter who has been in memory the longest.

FIFO: more frames = better?
Same algorithm with 4 frames per process (4 pages can be in memory at a time per process):

Reference:  1  2  3  4  1  2  5  1  2  3  4  5
Frame 1:    1  1  1  1  1  1  5  5  5  5  4  4
Frame 2:       2  2  2  2  2  2  1  1  1  1  5
Frame 3:          3  3  3  3  3  3  2  2  2  2
Frame 4:             4  4  4  4  4  4  3  3  3
Fault?      *  *  *  *        *  *  *  *  *  *

A total of 10 page faults.

FIFO Replacement: Belady's Anomaly
Generally we expect: more frames => fewer page faults (the expected graph of page faults versus number of frames is decreasing). With FIFO, however, more frames with more page faults is possible, as the example above shows.

An Optimal Algorithm [Belady 1966]
"Optimal" means: it has the lowest possible page-fault rate (NB: still ignoring dirtiness) and does not suffer from Belady's anomaly.
Remark: Belady's algorithm is only optimal if there are no dirty write-backs. Otherwise it is just a heuristic algorithm.
Belady's Algorithm: Farthest-First, MIN, OPT
Replace the page that will not be used for the longest period of time. How do you know this? It is a theoretical algorithm, used for measuring how well your algorithm performs, e.g. "is it within 12% of optimal?"
Example (4 frames, same reference string):

Reference:  1  2  3  4  1  2  5  1  2  3  4  5
Frame 1:    1  1  1  1  1  1  1  1  1  1  4  4
Frame 2:       2  2  2  2  2  2  2  2  2  2  2
Frame 3:          3  3  3  3  3  3  3  3  3  3
Frame 4:             4  4  4  5  5  5  5  5  5
Fault?      *  *  *  *        *           *

A total of 6 page faults. At the reference to 5, we will need pages 1, 2 and 3 before we need page 4 again, so we throw page 4 out.
The optimal algorithm is not feasible online, since it needs the future. So try using recent history as an approximation of the future!

Least Recently Used (LRU) Algorithm
Replace the page that has not been used for the longest period of time.
Example (4 frames, same reference string):

Reference:  1  2  3  4  1  2  5  1  2  3  4  5
Frame 1:    1  1  1  1  1  1  1  1  1  1  1  5
Frame 2:       2  2  2  2  2  2  2  2  2  2  2
Frame 3:          3  3  3  3  5  5  5  5  4  4
Frame 4:             4  4  4  4  4  4  3  3  3
Fault?      *  *  *  *        *        *  *  *

A total of 8 page faults. At the reference to 5: out of pages 1, 2, 3 and 4, page 3 is the one not used for the longest time, so it is replaced.

LRU Algorithm Implementations
Timestamp implementation:
- the process maintains a logical clock (a counter of the number of memory accesses made);
- every page-table entry has a timestamp field;
- every time a page is referenced, the logical clock is copied into its timestamp field;
- when a page needs to be replaced, search for the oldest timestamp. Overhead: linear search!
Stack implementation:
- keep a stack of page numbers in doubly-linked form;
- when a page is referenced, move it to the top; this requires 6 pointers to be changed;
- no search for replacement.
LRU Approximation Algorithms
Reference bit for each page:
- initially 0 (set by the OS);
- when the page is referenced, the bit is set to 1 (by hardware);
- replace a page whose bit is 0, if one exists. We do not know the access order, however.
- Precision may be improved by using several bits.
Second-chance algorithm (clockwise replacement):
- if the page to be replaced (in clock order) has reference bit = 1, then: set the reference bit to 0, leave the page in memory, and consider the next page (in clock order), subject to the same rules.

Counting Algorithms
Keep a counter of the number of references that have been made to each page.
- LFU algorithm: replaces the page with the smallest count.
- MFU algorithm: based on the argument that the page with the smallest count was probably just brought in and has yet to be used.
Counting algorithms are not common in today's operating systems.

Allocation of Frames
Each process needs a minimum number of pages.
Example: IBM 370 needs 6 pages to handle an SS MOVE instruction:
- the instruction is 6 bytes and might span 2 pages;
- 2 pages to handle the from operand, 2 pages to handle the to operand.
Two major allocation schemes: fixed allocation and priority allocation.

Fixed Allocation
Equal allocation: for example, if there are 100 frames and 5 processes, give each process 20 frames.
Proportional allocation: allocate according to the size of the process. With
- s_i = size of process p_i,
- S = sum of all s_i,
- m = total number of frames,
the allocation for p_i is a_i = s_i / S * m.
Example: m = 64, s_1 = 10, s_2 = 127 (so S = 137):
- a_1 = 10/137 * 64 ≈ 5 frames,
- a_2 = 127/137 * 64 ≈ 59 frames.

Priority Allocation
Use a proportional allocation scheme based on priorities rather than size.
If process P_i generates a page fault, select for replacement one of its own frames, or a frame from a process with lower priority.
Global vs. Local Allocation
Global replacement: a process selects a replacement frame from the set of all frames; one process can take a frame from another.
Local replacement: each process selects only from its own set of allocated frames.

Thrashing
If a process does not have "enough" pages, its page-fault rate is very high. This leads to:
- low CPU utilization;
- the operating system thinks that it needs to increase the degree of multiprogramming;
- another process is added to the system.
Thrashing: a process is busy swapping pages in and out.

Demand Paging and Thrashing
Why does demand paging work? The locality model:
- a locality is a set of pages that are actively used together;
- a process migrates from one locality to another; localities may overlap.
Why does thrashing occur? The sum of the sizes of the localities exceeds the total memory size.

Working-Set Model [Denning 1968]
- Δ = working-set window: a fixed number of page references.
- WS_i(t) = working set of process P_i at current time t = the set of pages referenced in the Δ most recent accesses.
  - If Δ is too small, the working set will not encompass an entire locality.
  - If Δ is too large, it will encompass several localities.
  - If Δ = infinity, it will encompass the entire program.
- The working set can be derived e.g. by simulation, by sampling, or by tracing (with hardware support).
- WSS_i = |WS_i(t)| = working-set size of process P_i = the total number of pages referenced in the Δ most recent accesses.
- D = sum over all i of WSS_i = total demand for frames.
- If D > m (the total number of frames): thrashing.
- Policy: if D > m, suspend one of the processes.

Program Structure Matters!
Example: matrix initialization.
    int data[128][128];
Assume page size = 128 ints, fewer than 128 frames available, and LRU replacement. Each row is stored in one page.

Program 1 traverses the matrix column-wise:
    for (j = 0; j < 128; j++)
        for (i = 0; i < 128; i++)
            data[i][j] = 0;
-> 128 x 128 = 16,384 page faults.

Program 2 traverses the matrix row-wise:
    for (i = 0; i < 128; i++)
        for (j = 0; j < 128; j++)
            data[i][j] = 0;
-> 128 page faults.

Data access locality is an important issue for programmers and for optimizing compilers.

Process Creation: Copy-on-Write
fork(): the child process gets a copy of the parent process's address space.
Copy-on-Write (COW) allows parent and child processes to initially share the same pages in memory (lazy copying):
- only if either process modifies a shared page is that page copied (e.g., if process 2 writes to page C, a copy of C is made for it);
- free pages are allocated from a pool of zeroed-out pages;
- this makes process creation more efficient, as only modified pages are copied;
- there is no improvement if the child process immediately does exec(), which overwrites the whole address space.
Alternative: vfork() in Unix, Solaris and Linux, a light-weight fork variant with a sleeping parent and an immediate exec() by the child process.

I/O Interlock
Pages must sometimes be locked in memory. Consider I/O: pages that are used for copying a file from a device must be locked from being selected for eviction by a page-replacement algorithm.
I/O Interlock (cont.)
DMA / interrupt-driven I/O uses physical memory addresses. The (physical) address of the buffer becomes stale if any of its pages is evicted: risk of overwriting (unrelated) data. This is why such pages must stay locked while the I/O is in progress.

Memory-Mapped Files
Memory-mapped file I/O allows file I/O to be treated as routine memory access, by mapping a disk block to a page in memory.
- The file is initially read using demand paging: a page-sized portion of the file is read from the file system into a physical page.
- Subsequent reads and writes to/from the file are treated as ordinary memory accesses.
- This simplifies file access by treating file I/O through memory rather than through read() / write() system calls.
- It also allows several processes to map the same file, allowing the pages in memory to be shared.

Data Used by Page Replacement Algorithms
All algorithms need extra data in the page table, e.g. one or several of the following:
- a reference bit, to mark whether the page has been used;
- a modify (dirty) bit, to mark whether the page has been written to (changed) since it was last fetched from secondary memory;
- additional mark bits;
- counters or time stamps (used by the theoretical algorithms; these must often be converted into uses of mark bits);
- a queue or stack of page numbers.

Additional Slides on Virtual Memory
Page-Fault Frequency Scheme
Establish an "acceptable" page-fault rate:
- if the actual rate is too low, the process loses a frame;
- if the actual rate is too high, the process gains a frame.

Other Issues: Prepaging
To reduce the large number of page faults (and the kernel-entry overhead) that occurs at process startup, prepage all or some of the pages a process will need, before they are referenced.
But if prepaged pages are unused, I/O and memory were wasted.
Assume s pages are prepaged and a fraction α of them is actually used. Is the cost of the s * α saved page faults greater or smaller than the cost of prepaging the s * (1 - α) unnecessary pages? For α near zero, prepaging loses.

Other Issues: Page Size
Page size selection must take into consideration: fragmentation, page-table size, I/O overhead, locality.

Other Issues: TLB Reach
TLB reach = the amount of memory accessible from the TLB:
TLB reach = (TLB size) * (page size)
Ideally, the working set of each process is stored in the TLB; otherwise address translation suffers a high degree of TLB misses. Two ways to increase TLB reach:
- Increase the page size. This may lead to an increase in fragmentation, as not all applications require a large page size.
- Provide multiple page sizes. This allows applications that require larger page sizes to use them without an increase in fragmentation.

References
Belady, L.: A Study of Replacement Algorithms for a Virtual Storage Computer. IBM Systems Journal 5 (1966), 78-101.
Peter J. Denning: The working set model for program behavior. Communications of the ACM 11(5):323-333, May 1968.
Peter J. Denning: The Locality Principle. Communications of the ACM 48(7):19-24, July 2005.