Lecture 21

Case study guidelines posted
Case study assignments
Third project (virtual memory simulation) will go out after spring break.
Extra credit fourth project (shell program) will go out after that.
Questions?

Monday, February 28, CS 470 Operating Systems - Lecture 21

Outline

Effective memory access time
Very large address spaces
Segmentation
Combining paging and segmentation

Address Translation with TLB

[Figure: address translation with a TLB. The CPU issues logical address <p, d>. The page number p is looked up in the TLB (page# / frame# pairs). On a TLB hit the frame number f comes from the TLB; on a TLB miss it comes from the page table. The physical address <f, d> is then used to access main memory.]

When there is a TLB hit, the frame number is obtained nearly instantaneously.
When there is a TLB miss, the page table must be consulted, resulting in an extra memory access. The frame number is then loaded into the TLB, possibly replacing an existing entry.
The TLB must be flushed on a context switch.

Effective Memory Access Time

The effect of a TLB is calculated based on the hit ratio, i.e., the percentage of memory accesses for which the desired page number is in the TLB.
For example, assume 100ns to access memory, 20ns to search the TLB, and an 80% hit ratio.
Mapped memory access (TLB hit) takes 120ns (TLB search + memory access).
Unmapped memory access (TLB miss) takes 220ns (TLB search + PT access + memory access).

Effective memory access time (emat) is computed by weighting each case by its probability:

emat = TLB hit % * TLB hit time + TLB miss % * TLB miss time
     = 0.80 * 120ns + 0.20 * 220ns
     = 140ns

This is a 40% slowdown compared with a direct 100ns memory access, but better than the 100% slowdown (200ns) of paging without a TLB.

What is the emat if the hit ratio is increased to 98%?
Hit ratio depends on the size of the TLB.
Studies have shown that 16-512 entries can get 80-98%.
Motorola 68030 has 22 entries.
Intel 486 has 32 entries and claimed a 98% hit ratio.

Paging Issues

Sharing is easy. The PTs of multiple processes can be set up to have the same entries for shared parts of the logical address space.
Especially good for code segments, as long as they are reentrant (i.e., the program does not modify itself).
Most modern OSes support very large address spaces that require very large PTs. 32-bit addresses are common. 64-bit is becoming so.

Very Large Address Spaces

For example, a 32-bit address space with 4KB pages results in 20 bits of page number and 12 bits of displacement.
If each PT entry is 32 bits (4 bytes), need 4MB for the PT alone (2^20 entries * 4 bytes)!
Must divide the PT into smaller pieces using paging. Then have a PT to find the actual PT. For 32-bit addressing, need two levels of paging.

Logical address: | pp | pd | d |

The page number part is divided into the page table page number (pp) and the page table displacement (pd).
pp is used to index the PT's PT to find the location of the part of the PT containing the needed page number.
pd is used to index the obtained part of the PT to get the frame number.

For a 64-bit address space, need four levels of PTs. Under the same assumptions as before:
Unmapped memory access (TLB miss) is 520ns (20ns for TLB search + 400ns for 4 PT accesses + 100ns for memory access).
For a 98% hit ratio:

emat = 0.98 * 120ns + 0.02 * 520ns
     = 128ns

Only slightly slower than the 1 level case! Shows the importance of TLB hardware.
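The emat calculations above can be checked with a small sketch. The function name emat and its parameters are hypothetical, chosen for this illustration. It assumes, as in the lecture's examples, that every PT level costs one full memory access on a TLB miss.

```python
def emat(hit_ratio, mem_ns=100, tlb_ns=20, pt_levels=1):
    """Effective memory access time in ns.

    Assumes each page-table level costs one memory access on a TLB miss,
    matching the 100ns / 20ns figures used in the lecture.
    """
    hit_time = tlb_ns + mem_ns                         # TLB search + memory access
    miss_time = tlb_ns + pt_levels * mem_ns + mem_ns   # TLB search + PT accesses + memory access
    # Weight each case by its probability.
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time


if __name__ == "__main__":
    print(round(emat(0.80), 1))               # 140.0 (one PT level, 80% hits)
    print(round(emat(0.98), 1))               # 122.0 (one PT level, 98% hits)
    print(round(emat(0.98, pt_levels=4), 1))  # 128.0 (four PT levels, 98% hits)
```

With a 98% hit ratio, going from one PT level to four adds only 6ns to the emat, which is the point made above about the importance of TLB hardware.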
Segmentation

Paging makes a very clear distinction between the user's logical view of memory and the actual physical memory.
But generally, users tend to think of memory in segments related to the purpose or part of a program rather than as a linear array of bytes. E.g., objects, functions, global data, etc.
Users do not care where in memory the segments are in relation to each other, and the segments are variable in length.

This idea can be used as a storage organization and management scheme called segmentation.
Logical address space is a collection of segments. Each segment has a "name" and a length.
Logical addresses are of the form <s, d>, where s is the segment's name (usually a segment number) and d is the displacement.
The MMU consists of a segment table (ST) of <base address, limit> pairs.

Address Translation

[Figure: segmentation address translation. The CPU issues logical address <s, d>. s indexes the segment table to get the segment's base b and limit l. If d < l, the physical address b + d is used to access main memory; otherwise, trap: invalid address.]

Segmentation Example

Segment Table

Segment  Description   Base Addr  Limit
0        Func          1400       1000
1        sqrt          6300        400
2        main          4300        400
3        stack         3200       1100
4        symbol table  4700       1000

Translate the following addresses:
< 2, 53 >
< 3, 852 >
< 0, 1222 >

Segmentation

Implementation of the segment table is similar to a page table.
If small, stored directly in hardware.
If large, stored in memory. Use a TLB to cache mappings. Also need a segment table length register to check if a segment number is valid.

Combining Paging and Segmentation

Often systems combine paging and segmentation. The most common way is to divide (fairly large) segments into pages.
Example: Multics - 36-bit word, 34-bit addresses divided into an 18-bit segment number and a 16-bit displacement.
Maximum size segment is 64K words, rather large. Divide the logical address displacement into pages:

| s | dp | dd |

The pages are 1K words, so dp is 6 bits, and dd is 10 bits. The ST entries contain a pointer to the segment's PT rather than a base address.
18 bits of segment number is large, too, so the ST is paged as well.

| sp | sd | dp | dd |

Address translation is now:
sp indexes the ST page table to get the appropriate page of the ST
sd indexes the ST page to get the PT of segment s
dp indexes the PT of segment s to get the appropriate page of physical memory
dd is the displacement into the page

Example: Intel Pentium - supports pure segmentation and segmentation with paging
The CPU generates 48-bit <s, d> logical addresses, where s is 16 bits and d is 32 bits, that are translated into 32-bit linear addresses.
s is used to index the ST to get the base address of a segment, then the 32-bit displacement is added to create a 32-bit linear address.
These are then paged in either 4KB or 4MB pages.

Example: Linux on Pentium - Linux is designed to be multi-platform, so it cannot depend on any particular hardware support.
On a Pentium, Linux had 6 segments:
Kernel code
Kernel data
User code - shared by all user processes
User data - shared by all user processes
Task-state segment (TSS) - stores contexts for switches
Default local descriptor table (LDT) segment - generally unused

Effectively, each segment is the physical memory being managed for its category.
Linux is designed for 3 levels of paging, since it can run on 64-bit architectures. The number of bits for each level depends on the architecture.
Since a Pentium is a 32-bit architecture, it only needs two levels, so the middle level has 0 bits, effectively bypassing it.
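The idea of a paging level with 0 bits can be illustrated with a short sketch. The function split_address is hypothetical and is not Linux kernel code. The widths used in the example (10 bits, 0 bits, 10 bits, plus a 12-bit displacement) are an assumption based on 4KB pages on a 32-bit Pentium; the lecture itself states only that the middle level has 0 bits.

```python
def split_address(addr, level_bits, offset_bits):
    """Split a logical address into per-level PT indices and a displacement.

    level_bits lists the index width of each paging level, outermost first.
    A width of 0 always yields index 0, which models a bypassed level.
    """
    offset = addr & ((1 << offset_bits) - 1)   # low bits: displacement into the page
    page = addr >> offset_bits                 # remaining bits: page number
    indices = []
    # Peel off index fields from the innermost level outward.
    for bits in reversed(level_bits):
        indices.append(page & ((1 << bits) - 1))
        page >>= bits
    indices.reverse()                          # report outermost level first
    return indices, offset


if __name__ == "__main__":
    # Assumed 32-bit layout: 10-bit outer index, 0-bit middle, 10-bit inner, 12-bit displacement.
    indices, offset = split_address(0x12345678, [10, 0, 10], 12)
    print(indices, hex(offset))  # [72, 0, 837] 0x678
```

Because the middle index is always 0, code written for three levels can walk the tables unchanged while the hardware effectively sees only two.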