W4118 Operating Systems
Instructor: Junfeng Yang

Logistics
- Homework 4 deadline extended to 3:09pm, 3/31

Last Lecture: Paging
- Disadvantages of contiguous allocation
  - External fragmentation
  - Wasteful allocation of unused memory
  - No sharing
- Paging: divide memory into fixed-size pages
  - Each address is split into a page number and a page offset
  - Page table maps virtual page number to physical page number

Last Lecture: Paging Advantages
- No external fragmentation
- No need to allocate unused memory
- Fine-grained sharing
  - Two page table entries from two processes can point to the same physical page
- Easy to swap out to disk (later this lecture)
- Efficient allocation and free
  - Allocation: since pages are fixed size, no search is necessary
  - Free: insert page into free list

Last Lecture: Paging Disadvantages
- Internal fragmentation
- Page tables can be large; techniques to reduce memory overhead:
  - Multi-level page tables
  - Hashed page tables
  - Inverted page tables
- Inefficiency: two memory accesses for each CPU memory access
  - Translation lookaside buffer (TLB): exploit temporal and spatial locality to reduce the number of memory accesses
- Page size? Too small? Too large?

Today
- Segmentation
- Virtual memory

Segmentation
- Divide the address space into logical segments
  - Each logical segment can be a separate part of physical memory
  - Separate base and limit for each segment (+ protection bits)
- How to specify a segment?
  - Use part of the logical address (similar to how a page is selected)
    - Top bits specify the segment
    - Low bits specify the offset within the segment
  - Implicitly by type of memory reference
    - Code vs. data segment
  - Special registers

User's View of a Program
[figure]

Logical View of Segmentation
[figure: segments 1-4 in user space mapped to regions of physical memory]

Segmentation Architecture
- A logical address is a two-tuple: <segment-number, offset>
- Segment table maps the two-dimensional logical address to a one-dimensional physical address; each table entry has:
  - base: the starting physical address where the segment resides in memory
  - limit: the length of the segment

Segmentation Architecture (Cont.)
- Protection: with each entry in the segment table, associate:
  - validation bit = 0 means illegal segment
  - read/write/execute privileges
- Protection bits are associated with segments; code sharing occurs at segment level
- Since segments vary in length, memory allocation is a dynamic storage-allocation problem
- A segmentation example is shown in the following diagram

Segmentation Hardware
[figure]

Example of Segmentation
[figure]

Segmentation Advantages and Disadvantages
- Advantages
  - Sharing of segments
  - Easier to relocate a segment than an entire program
  - Avoids allocating unused memory
  - Flexible protection
  - Efficient translation
    - Segment table is small, fits in the MMU
- Disadvantages
  - Segments have variable lengths, so allocation is a dynamic storage-allocation problem (best fit? first fit?)
  - External fragmentation: wasted memory
  - Segments can be large

Combine Paging and Segmentation
- Structure
  - Segments: logical units in a program, such as code, data, and stack
    - Size varies; can be large
  - Each segment contains one or more pages
    - Pages have fixed size
- Two levels of mapping to reduce page table size
  - Page table for each segment
  - Base and limit for each page table
  - Similar to a multi-level page table
- Logical address divided into three portions: seg # | page # | offset

Example: 80x86
- Supports both segmentation and segmentation with paging
- CPU generates a logical address
  - Given to the segmentation unit, which produces a linear address
  - Linear address given to the paging unit, which generates the physical address in main memory
  - Segmentation and paging units form the equivalent of the MMU

80x86 Segment Selector
- Logical address: segment selector + offset
- Segment selectors are stored in segment registers (16-bit)
  - cs: code segment selector
  - ss: stack segment selector
  - ds: data segment selector
  - es, fs, gs
- The segment register can be implicit or explicit
  - mov $8049780, %eax    // implicitly uses ds; logical address ds:$8049780
  - mov %ss:$8049780, %eax  // explicitly uses ss; logical address ss:$8049780

80x86 Segmentation Unit
- The segment selector (e.g., in ds) indexes the descriptor table; the fetched segment descriptor supplies the base used to locate the operand in memory
- Two memory references for one load! How to optimize?
Translating Logical to Linear Address
[figure]

80x86 Paging Unit
- 4MB pages, starting with the Pentium

Today
- Segmentation
- Virtual memory

Motivation
- Previous approach to memory management: must completely load a user process into memory
  - One process with a large address space, or many processes with a large combined address space, can run out of memory
- Observation: locality of reference
  - Temporal locality: access memory locations accessed just now
  - Spatial locality: access memory locations adjacent to locations accessed just now
  - Programs spend the majority of their time in a small piece of code
    - 90% of time in 10% of code (Knuth's estimate)
- Thus, a process needs only a small amount of its address space at any moment

Virtual Memory Idea
- OS and hardware produce the illusion of a disk as fast as main memory
- A process runs even when not all of its pages are loaded in memory
  - Keep referenced pages in main memory
  - Keep unreferenced pages on a slower, cheaper backing store (disk)

Memory Hierarchy
- Levels of memory in a computer system, ordered by increasing size and decreasing speed/cost: registers (< 1 cycle), cache (a few cycles), memory (< 100 ns), disk (a few ms)

Virtual Address Space
- A virtual address maps to one of three locations:
  - Physical memory: small, fast, expensive
  - Disk: large, slow, cheap
  - Nothing

Virtual Memory Operation
- What happens when we reference a page on the backing store?
  - Recognize the location of the page
  - Choose a free page
  - Bring the page from disk into memory
- The above steps need hardware and software cooperation
- How to detect whether a page is in memory?
  - Extend page table entries with a present bit
  - Page fault: if the bit is cleared, referencing the page results in a trap into the OS

Handling a Page Fault
- OS selects a free page
- OS brings the faulting page from disk into memory
- Page table is updated; the present bit is set
- Process continues execution

Steps in Handling a Page Fault
[figure]

Continuing the Process
- Continuing the process is tricky
  - The page fault may have occurred in the middle of an instruction
  - Want page faults to be transparent to user processes
- Options
  - Skip the faulting instruction?
  - Restart the instruction from the beginning?
    - What about an instruction like: mov ++(sp), R2
- Requires hardware support to restart instructions

OS Decisions
- Page selection: when to bring pages from disk to memory?
- Page replacement: when no free pages are available, must select a victim page in memory and write it out to disk

Page Selection Algorithms
- Demand paging: load a page on page fault
  - Start the process with no pages loaded
  - Wait until a page absolutely must be in memory
- Request paging: user specifies which pages are needed
  - Users do not always know best
- Prepaging: load a page before it is referenced
  - When one page is referenced, bring in the next one
  - Does not work well for all workloads: difficult to predict the future

Page Replacement Algorithms
- Optimal: throw out the page that won't be used for the longest time in the future
  - Best algorithm, if we could predict the future
  - Good for comparison, but not practical
- Random: throw out a random page
  - Easy to implement
  - Works surprisingly well
- FIFO: throw out the page that was loaded in first
  - Fair: all pages receive equal residency
- LRU: throw out the page that hasn't been used for the longest time
  - Past predicts future
  - With locality: approximates Optimal

Page Replacement Algorithms (Cont.)
- Want the lowest page-fault rate
- Evaluate an algorithm by running it on a particular string of memory references (reference string) and computing the number of page faults on that string
- In all our examples, the reference string is 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5

Optimal Algorithm
- Replace the page that will not be used for the longest period of time
- 4-frames example on 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5: 6 page faults
- How do you know this?
- Used for measuring how well your algorithm performs

First-In-First-Out (FIFO) Algorithm
- Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
- 3 frames (3 pages can be in memory at a time per process): 9 page faults
- 4 frames: 10 page faults
- Belady's Anomaly: more frames can mean more page faults

Graph of Page Faults Versus the Number of Frames
[figure]

FIFO Illustrating Belady's Anomaly
[figure: reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]

Least Recently Used (LRU) Algorithm
- Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 (4 frames: 8 page faults)
- Counter implementation
  - Every page entry has a counter; every time the page is referenced through this entry, copy the clock time into the counter
  - When a page needs to be replaced, look at the counters to determine which page to evict
  - Problem: have to search all pages/counters!

Implementing LRU: Stack
- Stack implementation: keep a stack of page numbers in doubly-linked form
- When a page is referenced:
  - move it to the top
  - requires 6 pointers to be changed
- No search for replacement
  - the bottom entry is by definition the least recently used

Use of a Stack to Record the Most Recent Page References
[figure]

LRU: Concept vs. Reality
- LRU is considered a reasonably good algorithm
- The problem is in implementing it
  - Counter implementation: one counter per page, copied on every memory reference; must search all pages on replacement to find the oldest
  - Stack implementation: no search, but pointer swaps on each memory reference
- Hence the efforts to design efficient implementations that approximate LRU

LRU Approximation Algorithms
- Reference bit
  - With each page associate a bit, initially 0
  - When the page is referenced, the bit is set to 1
  - Replace a page whose bit is 0 (if one exists)
    - We do not know the order, however
- Second chance
  - Needs the reference bit
  - Clock replacement
  - If the page to be replaced (in clock order) has reference bit = 1, then:
    - set the reference bit to 0
    - leave the page in memory
    - consider the next page (in clock order), subject to the same rules

Second-Chance (Clock) Page-Replacement Algorithm
[figure]

Paging in 64-bit Linux

  Platform  Page Size  Address Bits Used  Paging Levels  Address Splitting
  Alpha     8 KB       43                 3              10+10+10+13
  IA64      4 KB       39                 3              9+9+9+12
  PPC64     4 KB       41                 3              10+10+9+12
  sh64      4 KB       41                 3              10+10+9+12
  x86_64    4 KB       48                 4              9+9+9+9+12