Lecture 19

Reminder: Homework 4 due today. Thread Synch project due on Monday. Homework 5 posted; due Wednesday after spring break.
Questions?

Wednesday, February 22   CS 470 Operating Systems - Lecture 19

Outline

  Paging
  Page tables
  Translation look-aside buffer (TLB)
  Effective memory access time
  Very large address spaces

Static, Complete, Non-Contiguous Organization

Recall: The issues in storage organization include providing support for:
  single vs. multiple processes
  complete vs. partial allocation
  fixed-size vs. variable-size allocation
  contiguous vs. fragmented allocation
  static vs. dynamic allocation of partitions
Looked at schemes created by the bolded choices. What happens if we allow non-contiguous allocation?

Paging

In non-contiguous allocation schemes, the logical address space is still contiguous, but it is divided into multiple partitions that are mapped separately into (possibly) non-contiguous physical space.
Simplest is paging, which uses fixed-size partitions.
Logical memory is divided into fixed-size partitions called pages.
Physical memory is divided into partitions of the same size called frames. Backing store is also divided this way.

An admitted program is allocated memory by finding enough physical frames to map the logical pages. Since all the partitions are the same size (logical page, backing store partition, physical frame), any frame can accept any page.
The MMU for this is more complex. It needs a page table (basically an array) that is indexed by page number and whose elements are frame numbers.

Address Translation
[Figure: the CPU issues a logical address (p, d); the page table maps page number p to frame number f; the physical address (f, d) is sent to main memory.]

Page size is determined by hardware. Size is usually a power of 2 between 512 bytes (9 bits of displacement) and 8192 bytes (13 bits of displacement).
A power of 2 makes address translation easy: a 2^m byte logical address space, divided by 2^n bytes per page, results in 2^(m-n) logical pages, so (m-n) bits of page number and n bits of displacement.

Very Small, Concrete Example

16 bytes of logical address space (2^4 bytes)
4 bytes per page (2^2 bytes)
=> 4 logical pages (2^2 pages)
=> 2 bits page number, 2 bits displacement

  page#   contents   displacement (byte)
   00   |  abcd  |   00 (a), 01 (b), 10 (c), 11 (d)
   01   |  efgh  |   00 (e), 01 (f), 10 (g), 11 (h)
   10   |  ijkl  |   00 (i), 01 (j), 10 (k), 11 (l)
   11   |  mnop  |   00 (m), 01 (n), 10 (o), 11 (p)

32 bytes of physical memory (2^5 bytes)
4 bytes per frame (2^2 bytes)
=> 8 physical frames (2^3 frames)
=> 3 bits frame number, 2 bits displacement

  page table        physical memory
   p | f             f | contents      f | contents
   0 | 5             0 |               4 |
   1 | 6             1 | ijkl          5 | abcd
   2 | 1             2 | mnop          6 | efgh
   3 | 2             3 |               7 |

Larger Example

8192 bytes logical address space
1024 bytes per logical page / physical frame
32,768 bytes physical memory
  How many logical pages?
  How many bits in a logical address?
  How many physical frames?
  How many bits in a physical address?

Paging

Paging is a form of dynamic relocation. Every page has its own base "register".
Advantages are the same as for all fixed-size partition schemes: no external fragmentation; all frames are alike, so any free frame can be used, and allocation/deallocation is efficient.
Likewise disadvantages: some internal fragmentation. E.g., a request that is 1 byte more than the page size needs two pages, and nearly all of the second page is wasted.
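The translation and fragmentation arithmetic above can be sketched in a few lines of Python. This is only an illustration, not how an MMU is built: the page table values are taken from the very small example, while the names `translate` and `internal_fragmentation` and the 1025-byte request are assumptions made here for demonstration.

```python
# Sketch of single-level paging using the 16-byte example from the slides.
# All names here are hypothetical; real translation is done by MMU hardware.

PAGE_SIZE = 4                  # bytes per page/frame (2^2)
OFFSET_BITS = 2                # n bits of displacement
PAGE_TABLE = [5, 6, 1, 2]      # index = page number, value = frame number


def translate(logical_addr):
    """Map a logical address to a physical address via the page table."""
    page = logical_addr >> OFFSET_BITS             # upper (m-n) bits
    displacement = logical_addr & (PAGE_SIZE - 1)  # lower n bits
    frame = PAGE_TABLE[page]
    return (frame << OFFSET_BITS) | displacement


def internal_fragmentation(request_bytes, page_size):
    """Return (pages needed, bytes wasted in the last page)."""
    pages = -(-request_bytes // page_size)         # ceiling division
    return pages, pages * page_size - request_bytes


if __name__ == "__main__":
    # Logical 0110 is page 01, displacement 10 (the byte 'g' in efgh).
    print(bin(translate(0b0110)))                  # 0b11010: frame 110, displacement 10
    # A request 1 byte larger than an assumed 1024-byte page.
    print(internal_fragmentation(1025, 1024))      # (2, 1023)
```

Because the page size is a power of 2, splitting the address needs only a shift and a mask, which is why hardware can do it cheaply.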
Expect about half a page of internal fragmentation per process on average.

Page size affects system performance.
Smaller pages are better for utilization (less internal fragmentation), but slower, e.g., for disk transfers from backing store.
Pages are getting larger as memory and disks get faster and cheaper. Current systems usually use 2KB or 4KB pages.

Page Tables

With complete allocation, a new process enters only when all of its memory requirements can be granted.
The PCB holds the page table (PT), which is loaded on a context switch.
Implementation of the PT is an issue. If it is stored in memory, two physical memory accesses are needed per logical access (one for the PT entry, one for the data). This halves the effective memory speed.

Translation Look-aside Buffer (TLB)

The standard solution for implementing PTs is to use a very fast, small, special hardware cache called a translation look-aside buffer (TLB).
A TLB is a set of associative registers containing (key, value) pairs, meaning that they are wired together to receive a key, compare it to all of the stored keys simultaneously, and output the value of any matching key in one step.

[Figure: a key is presented to the TLB, compared in parallel against the entries (key1, value1) through (key4, value4), and the matching value is output.]

TLB hardware is fairly expensive, so TLBs tend to be small. Some have as few as 8 entries, but some have as many as 2K entries.
To use a TLB, it is put between the CPU and the PT. The basic idea is to look for an entry for page p in the TLB and only look in the PT if there is no match in the TLB.

Address Translation with TLB
[Figure: the CPU issues a logical address (p, d); p is looked up in the TLB, which holds (p#, f#) pairs. On a TLB hit, f comes directly from the TLB; on a TLB miss, f is fetched from the page table. The physical address (f, d) is sent to main memory.]

When there is a TLB hit, the frame number is obtained nearly instantaneously.
When there is a TLB miss, the page table must be consulted, resulting in an extra memory access. The frame number is then loaded into the TLB, possibly replacing an existing entry.
The TLB must be flushed on a context switch.

Effective Memory Access Time

The effect of a TLB is calculated from the hit ratio of memory accesses, i.e., the percentage of times the desired page number is in the TLB.
For example, assume 100ns to access memory, 20ns to search the TLB, and an 80% hit ratio.
Mapped memory access (TLB hit) takes 120ns (TLB search + memory access).
Unmapped memory access (TLB miss) takes 220ns (TLB search + PT access + memory access).

Effective memory access time (emat) is computed by weighting each case by its probability:

  emat = TLB hit % * TLB hit time + TLB miss % * TLB miss time
       = .80 * 120ns + .20 * 220ns
       = 140ns

A 40% slowdown over a direct 100ns memory access, but better than the 100% slowdown without a TLB.

What is the emat if the hit ratio is increased to 98%?
Hit ratio depends on the size of the TLB. Studies have shown that 16-512 entries can get 80-98%.
Motorola 68030 has 22 entries. Intel 486 has 32 entries and claimed a 98% hit ratio.

Paging Issues

Sharing is easy. Can set up the PTs of multiple processes to have the same entries for shared parts of the logical address space. Especially good for code segments, as long as they are reentrant (i.e., the program does not modify itself).
Most modern OSs support very large address spaces that require very large PTs. 32-bit addresses are common; 64-bit addresses are becoming so.

Very Large Address Spaces

For example, a 32-bit address space with 4KB pages results in 20 bits of page number and 12 bits of displacement. If each PT entry is 32 bits (4 bytes), the PT alone needs 2^20 * 4 bytes = 4MB!
Must divide the PT into smaller pieces using paging. Then have a PT to find the actual PT. For 32-bit addressing, use two levels of paging.

Logical address: | pp | pd | d |
The page number part is divided into the page table page number (pp) and the page table displacement (pd).
pp is used to index the PT's PT to find the location of the part of the PT containing the needed page number.
pd is used to index the obtained part of the PT to get the frame number.

For a 64-bit address space, need four levels of PTs. Under the same assumptions as before:
Unmapped memory access (TLB miss) is 520ns (20ns for TLB search + 400ns for 4 PT accesses + 100ns for memory access).
For a 98% hit ratio:

  emat = .98 * 120ns + .02 * 520ns = 128ns

Only slightly slower than the 1-level case (122ns at the same hit ratio)! Shows the importance of TLB hardware.
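The emat calculations in this lecture all follow one pattern, which the following Python sketch generalizes to any number of page table levels. It is illustrative only: the function name `emat` and its parameters are assumptions introduced here, and the default timings (20ns TLB search, 100ns memory access) are the lecture's example values rather than measurements.

```python
# Sketch of the effective memory access time (emat) calculation.
# The function name and parameters are hypothetical; the default timings
# are the example values used in the lecture.

def emat(hit_ratio, tlb_ns=20.0, mem_ns=100.0, pt_levels=1):
    """Weight the TLB-hit and TLB-miss access times by their probabilities."""
    hit_time = tlb_ns + mem_ns                        # TLB search + memory access
    miss_time = tlb_ns + pt_levels * mem_ns + mem_ns  # plus one access per PT level
    return hit_ratio * hit_time + (1.0 - hit_ratio) * miss_time


if __name__ == "__main__":
    print(round(emat(0.80), 1))               # one-level PT, 80% hits: about 140 ns
    print(round(emat(0.98), 1))               # one-level PT, 98% hits: about 122 ns
    print(round(emat(0.98, pt_levels=4), 1))  # four-level PT, 98% hits: about 128 ns
```

Varying `hit_ratio` while holding `pt_levels` at 4 shows how quickly the cost of a deep page table grows once the TLB stops absorbing most lookups.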