CS 1104 Help Session II
Virtual Memory
Colin Tan, ctank@comp.nus.edu.sg
S15-04-15

Motivation
• Drive space is very, very cheap.
  – Typically about 2 cents per megabyte.
  – It would be ideal if we could set aside a portion of drive space to be used as memory.
  – Unfortunately, disk drives are very slow.
    • Fastest access time is about 10ms, on the order of a million times slower than SRAM and over a hundred thousand times slower than DRAM (60-80ns).
• Idea: Use drive space as memory, and main memory to cache the drive space!
  – This is the idea behind virtual memory.

Will it work?
• Virtual memory accesses come from the programs executing in the CPU, just like main memory accesses previously.
• Hence virtual memory accesses will still display temporal and spatial locality!
• AMAT is now:
    AMAT = Tcache + miss_rate x (Tmemory + page_fault_rate x drive_access_time)
• With locality, miss_rate is small (2% or 3%) and page_fault_rate is far smaller still, so memory access time is still almost that of the cache! (Because a drive access costs about 10ms, even a 2% page_fault_rate would dominate AMAT.)

Main Idea
[Figure: Virtual Memory is cached by Main Memory, which is cached by the System Cache]
• Virtual memory (residing on disk) is cached by main memory.
• Main memory is cached by the system cache.
• All memory transfers are only between consecutive levels (e.g. VM to main memory, main memory to cache).

Cache vs. VM
• The concept behind VM is almost identical to the concept behind cache.
• But different terminology!
  – Cache: Block         VM: Page
  – Cache: Cache Miss    VM: Page Fault
• Caches are implemented completely in hardware. VM is implemented in software, with hardware support from the CPU.
• Cache speeds up main memory access, while main memory speeds up VM access.

Technical Issues of VM
• Relatively cheap to remedy cache misses.
  – Miss penalty is essentially the time taken to access the main memory (around 60-80ns).
  – Pipeline freezes for about 60-80 cycles (assuming a clock of about 1GHz).
• Page faults are EXPENSIVE!
  – Page fault penalty is the time taken to access the disk.
  – May take 50ms or more, depending on the speed of the disk and I/O bus.
  – Wastes millions of processor cycles!
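
A quick numeric check of the AMAT formula from the "Will it work?" slide may help show how sensitive it is to the page fault rate. This is only a sketch: the 2ns cache hit time and the two page fault rates tried are assumptions chosen for illustration, while the 70ns memory time and 10ms drive time come from the ranges quoted in the slides.

```python
def amat(t_cache, miss_rate, t_memory, page_fault_rate, drive_access_time):
    """AMAT = Tcache + miss_rate x (Tmemory + page_fault_rate x drive_access_time)."""
    return t_cache + miss_rate * (t_memory + page_fault_rate * drive_access_time)


# All times in nanoseconds.
T_CACHE = 2.0         # assumed cache hit time (not given in the slides)
MISS_RATE = 0.02      # 2% cache miss rate, as in the slides
T_MEMORY = 70.0       # within the 60-80ns range quoted for main memory
DRIVE_TIME = 10e6     # 10ms drive access time

# Two assumed page fault rates: 2% and one in a million.
for page_fault_rate in (0.02, 1e-6):
    result = amat(T_CACHE, MISS_RATE, T_MEMORY, page_fault_rate, DRIVE_TIME)
    print(f"page_fault_rate={page_fault_rate:g}: AMAT = {result:.1f} ns")
```

Under these assumptions, a 2% page fault rate gives an AMAT of roughly 4,000ns, while a one-in-a-million page fault rate keeps it within a few nanoseconds of the cache hit time.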

Virtual Memory Design
• Because page-fault penalties are so heavy, it is not practical to implement direct-mapped or set-associative architectures.
  – These have poorer hit rates.
• Main memory caching of VM is always fully associative.
  – 1% or 2% improvement in hit rate over direct-mapped or set-associative designs.
  – But with heavy page-fault penalties, a 1% improvement is A LOT!
• Also relatively cheap to implement full associativity in software.

Virtual Memory Design
• Main memory and virtual memory are both divided into fixed-size pages.
  – Page size is typically about 16KB to 32KB.
  – Large page sizes are needed as these can be more efficiently transferred between main memory and virtual memory.
  – Size of a physical page is ALWAYS equal to the size of a virtual page.
• Pages in main memory are given physical page numbers, while pages in virtual memory are given virtual page numbers.
  – I.e. the first 32KB of main memory is physical page 0, the 2nd 32KB is physical page 1, etc.
  – The first 32KB of virtual memory is virtual page 0, etc.

Virtual Memory Design
• In a cache, we can search through all the blocks until we find the data for the address we want.
  – This is because the number of blocks is small.
• This is extremely impractical for virtual memory!
  – The number of VM pages is in the tens of thousands!

Solution
• Use a look-up table.
• An address generated by the CPU is called a virtual address.
• The virtual address is divided into a virtual page number and a page offset:
    | Virtual Page Number | Page Offset |
• The virtual page number indicates which page of virtual memory holds the data that the CPU needs.

Solution
• The data must also be in physical memory before it can be used by the CPU!
• Need a way to translate from the virtual page number, where the data is in VM, to the page number of the physical page where the data is in physical memory.
• To do this, use a Virtual Page Table.
  – The page table resides in main memory.
  – One entry per virtual page.
    • Can get VERY large, as the number of virtual pages can be in the tens of thousands.

Virtual Page Table
• Gives the physical page number of a virtual page, if that page is in memory.
  – One entry per virtual page.
• Gives the location on disk if the virtual page is not yet in main memory.
[Figure: virtual page table mapping VPN0-VPN5 in VM (on disk space) to PPN0-PPN3 in physical memory]

Page Table Contents
• The page table also contains a Valid Bit (V) to indicate if the virtual page is in main memory (V=1) or still on disk (V=0).

    VPN    V   PPN or disk location
    VPN0   1   2
    VPN1   0   (2,1,7)
    VPN2   1   0
    VPN3   1   1
    VPN4   0   (7,2,9)
    VPN5   1   3

• If a page is in physical memory (V=1), then the page table gives the physical page number.
• Otherwise it gives the location of the page on disk, in the form (side#, track#, block#).

Accessing Data
• To retrieve data:
  1. Extract the Virtual Page Number from the Virtual Address.
     [Figure: virtual address split into Virtual Page Number (e.g. 02) and Page Offset]

Accessing Data
  2. Use the VPN to look up the page table. If V=1, get the PPN from the page table. For VPN = 2:

    VPN    V   PPN or disk location
    VPN0   1   2
    VPN1   0   (2,1,7)
    VPN2   1   0          <- PPN=0
    VPN3   1   1
    VPN4   0   (7,2,9)
    VPN5   1   3

     Here virtual page number 2 is mapped to physical page number 0.

Accessing Data
  3. Combine the PPN found with the page offset to form the physical memory address:
     [Figure: Physical Page Number 0 joined with the Page Offset to form the Physical Address]

Accessing Data
  4. Access main memory using the physical address.
     – A page consists of many bytes (e.g. 32KB).
     – The page offset tells us exactly which byte of these 32KB we are accessing.
       • Similar to the idea of block offset and byte offset in caches.

Page Fault
• What if the page we want is not in main memory yet?
  1. In this case V=0, and the page table contains the disk address of the page (e.g. VPN1 in the previous example is still at side 2, track 1, block 7 of the disk, written (2,1,7)).
  2. Find a free physical page, or if none is available, apply a replacement policy (e.g. LRU) to find one.
  3. Load the virtual page into the physical page. Set the V flag, and update the page table to show which physical page the virtual page has gone to.

Writing to VM
• Writes to virtual memory are always done on a write-back basis.
  – It is much too expensive to update both main memory and virtual memory on every write, so write-through schemes are not practical.
• To support write-back, the page table must be augmented with a dirty bit (D).
• This bit is set if the page is updated in physical memory.

Writing to VM

    VPN    D   V   PPN or disk location
    VPN0   0   1   2
    VPN1   0   0   (2,1,7)
    VPN2   1   1   0
    VPN3   0   1   1
    VPN4   0   0   (7,2,9)
    VPN5   0   1   3

• Here virtual page number 2 was updated in physical page number 0.
• If PPN0 is ever replaced, its contents must be written back to disk to update VPN2.
• Similar in concept to a write-back cache.

Translation Look-aside Buffer
• An access to virtual memory requires 2 main memory accesses at best.
  – One access to read the page table, another to read the data.
• Remember from the Cache section that main memory is s-l-o-w.
• Fortunately, page table accesses themselves tend to display both temporal and spatial locality!
  – Temporal Locality: Accesses to different words in the same virtual page will cause accesses to the same entry in the page table!
  – Spatial Locality: Sequential access of data from one virtual page into the next will cause consecutive accesses to page table entries.
    • Initially I am at VPN0, and I access the page table entry for VPN0. As I move into VPN1, I will access the page table entry for VPN1, which is next to the page table entry for VPN0!

Translation Look-aside Buffer
• Solution:
  – Implement a cache for the page table! This cache is called the translation look-aside buffer, or TLB.
  – The TLB is separate from the caches we were looking at earlier.
    • Those caches cached data from main memory.
    • The TLB caches page table entries! Different!
  – The TLB is small (about 8 to 10 blocks), and is implemented as a fully associative cache.
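
The translation steps, page-fault handling and dirty bit described in the preceding slides can be sketched in a few lines of Python. This is a minimal illustration, not how any real OS does it: the `Entry` class, the `translate` and `handle_page_fault` names, the caller-supplied victim page (standing in for a replacement policy such as LRU) and the disk addresses of the pages that are already resident are all assumptions made for the example. The V, D and PPN values follow the page table shown in the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PAGE_SIZE = 32 * 1024   # 32KB pages, as in the slides
OFFSET_BITS = 15        # log2(32KB)


@dataclass
class Entry:
    valid: bool                  # V bit: page is in main memory
    dirty: bool                  # D bit: page was written while in memory
    ppn: Optional[int]           # physical page number when valid
    disk: Tuple[int, int, int]   # (side, track, block) of the page on disk


# Page table from the slides; disk addresses of resident pages are hypothetical.
page_table = [
    Entry(True, False, 2, (0, 0, 1)),      # VPN0
    Entry(False, False, None, (2, 1, 7)),  # VPN1, still on disk
    Entry(True, True, 0, (0, 0, 2)),       # VPN2, updated in PPN0
    Entry(True, False, 1, (0, 0, 3)),      # VPN3
    Entry(False, False, None, (7, 2, 9)),  # VPN4, still on disk
    Entry(True, False, 3, (0, 0, 4)),      # VPN5
]
write_backs = []  # log of (vpn, disk address) pairs written back to disk


def handle_page_fault(vpn, victim_vpn):
    """Give the victim's physical page to vpn, writing the victim back if dirty."""
    victim = page_table[victim_vpn]
    if victim.dirty:
        write_backs.append((victim_vpn, victim.disk))
    entry = page_table[vpn]
    entry.ppn, entry.valid, entry.dirty = victim.ppn, True, False
    victim.ppn, victim.valid, victim.dirty = None, False, False


def translate(vaddr, is_write=False, victim_vpn=None):
    """Return the physical address for a virtual address."""
    vpn = vaddr >> OFFSET_BITS          # step 1: extract the VPN
    offset = vaddr & (PAGE_SIZE - 1)
    entry = page_table[vpn]             # step 2: look up the page table
    if not entry.valid:
        if victim_vpn is None:
            raise LookupError(f"page fault: VPN{vpn} is on disk at {entry.disk}")
        handle_page_fault(vpn, victim_vpn)
    if is_write:
        entry.dirty = True              # page must be written back before replacement
    return (entry.ppn << OFFSET_BITS) | offset   # step 3: PPN joined with offset
```

For example, `translate(2 * PAGE_SIZE + 5)` falls in VPN2 and yields physical address 5 (PPN0, offset 5), while `translate(PAGE_SIZE + 8, victim_vpn=2)` faults on VPN1, writes the dirty VPN2 back to disk, and then reuses PPN0.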

Translation Look-aside Buffer
• Fully Associative
  – New page table entries go into the next free TLB block, or a block is replaced if there are none.
• Note that only page table entries with V=1 are written to the TLB!
• The page table entries already in the TLB are not usually updated, so there is no need to consider write-through or write-back.
  – Exceptional cases: VPN aliasing, where more than 1 VPN can refer to the same physical page.

Translation Look-aside Buffer
• The tag used in the TLB is the virtual page number of a virtual address.
• All TLB blocks are searched for the VPN. If found, we have a TLB hit and the physical page number is read from the TLB. This is joined with the page offset to form the physical address.
• If not found, we have a TLB miss. Then we must go to the page table in main memory to get the page table entry there, and write this entry to the TLB.

Translation Look-aside Buffer
• Complication
  – If we have a TLB miss and go to main memory to get the page table entry, it is possible that this entry has a V of 0: a page fault.
  – In this case we must remedy the page fault first, update the page table entry in main memory, and then copy the page table entry into the TLB. The tag portion of the TLB block is set to the VPN of the virtual address.
• Note that the TLB must also have a valid bit V to indicate if the TLB entry is valid (see the cache section for more details on the V bit).

Integration: Cache, Main Memory and Virtual Memory
• Suppose a virtual address V is generated by the CPU (either from the PC for instructions, or from the ALU for lw and sw instructions).
  1. Perform address translation from virtual address to physical address.
     (a) Look up the TLB or page table (see previous slides). Remedy the page fault if necessary (again, see previous slides).
  2. Use the physical address to access the cache (see cache notes).
  3. If cache hit, read the data (or instruction) from the cache.
  4. If cache miss, read the data from main memory.
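
The TLB lookup used in step 1 can be sketched as a small fully associative cache tagged by VPN. The sketch below rests on several assumptions: the names `tlb`, `lookup` and `stats` are made up for illustration, the TLB holds 8 entries, least-recently-used replacement is used (the slides do not name a TLB replacement policy), and a page fault is simply reported rather than remedied. The VPN to PPN mappings are the V=1 entries from the example page table.

```python
from collections import OrderedDict

OFFSET_BITS = 15   # 32KB pages
TLB_SIZE = 8       # assumed TLB capacity

# VPN -> PPN for pages with V=1 only; VPN1 and VPN4 are still on disk.
page_table = {0: 2, 2: 0, 3: 1, 5: 3}

tlb = OrderedDict()           # tag (VPN) -> PPN, ordered oldest to newest use
stats = {"hit": 0, "miss": 0}


def lookup(vaddr):
    """Translate a virtual address, consulting the TLB before the page table."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if vpn in tlb:
        stats["hit"] += 1
        tlb.move_to_end(vpn)              # mark as most recently used
    else:
        stats["miss"] += 1
        if vpn not in page_table:
            # V=0: the OS would remedy the page fault before the TLB is filled.
            raise LookupError(f"page fault on VPN{vpn}")
        if len(tlb) >= TLB_SIZE:
            tlb.popitem(last=False)       # evict the least recently used entry
        tlb[vpn] = page_table[vpn]        # only V=1 entries enter the TLB
    return (tlb[vpn] << OFFSET_BITS) | offset
```

Two accesses to different words of the same virtual page show the temporal locality argument: the first call to `lookup` for VPN2 misses and reads the page table, the second hits in the TLB and needs no page table access.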

Integration: Cache, Main Memory and Virtual Memory
• Note that a page fault in VM will necessarily cause a cache miss later on (since the data wasn't in physical memory, it cannot possibly be in cache!).
• Can optimize the algorithm in the event of a page fault:
  1. Remedy the page fault.
  2. Copy the data being accessed directly to cache.
  3. Restart the previous algorithm at step 3.
• This optimization eliminates 1 unnecessary cache access that would definitely miss.

Page Table Size
• A virtual memory system was implemented for a MIPS workstation with 128MB of main memory. The virtual memory size is 1GB, and each page is 32KB. Calculate the size of the page table.

Page Table Size
• The previous calculation shows that page tables are huge!
• These are sitting in precious main memory space.
• Solutions:
  – Use inverted page tables.
    • Instead of indexing virtual pages, index physical pages.
    • The page table will provide virtual page numbers instead.
    • Search the page table for the VPN of virtual address V. If the VPN is found in entry 25, then the data can be found in physical page 25.
  – Have portions of the page table in virtual memory.
    • Slow, complex.

Finer Points of VM
• VM is a collaboration between hardware and OS.
  – Hardware:
    • TLB
    • Page Table Register
      – Indicates where the page table is in main memory.
    • Memory Protection
      – Certain virtual pages are allocated to processes running in memory.
      – If one process tries to access the virtual page of another process without permission, the hardware will generate an exception.
      – This gives the famous “General Protection Fault” of windoze and the “Segmentation Fault” of Unix.

Finer Points of VM
  – Hardware
    • Does address translations etc.
  – Operating System
    • Actually implements the virtual memory system.
      – Does reads and writes to/from disk.
      – Creates the page table in memory, sets the Page Table Register to point to the start of the page table.
      – Remedies page faults, updates the page table.
      – Remedies VM violations
        » Windows: Pops up blue screen of death, dies messily. Sometimes thrashes your hard disk.
        » Unix: Gives “Segmentation Fault”. Kills the offending process and continues working.

Finer Points of VM
• Where is the virtual memory located on disk?
  – Virtual memory is normally implemented as a very large file, created by the OS. E.g. in Windows NT, the virtual memory file is called pagefile.sys.
    • Insecure. Sometimes sensitive info gets written to pagefile.sys, and you can later retrieve the sensitive info.
    • In Unix, implemented as a partition on the disk that cannot be read except by the OS. Unix good. Windows bad.
  – Whenever virtual memory is read or written to, the OS actually reads or writes from/to this file.
• Virtual memory is NOT the other files on your disk (e.g. your JAVA assignment).

Finer Points of VM
• The VM shown here is not implemented in the real world:
  – The implicit assumption is that process data, instructions etc. are created and stored in VM on disk.
  – We would access process data and instructions from VM as and when we need them.
  – EXPENSIVE, SLOW => Pretty idiotic system.
• In a real VM, the virtual memory on disk is never used until the main memory runs out.

Finer Points of VM
• See a good Operating Systems book for more details on VM implementation.
  – Look up the web for Windows white papers.
  – Try hacking the Linux kernel to understand VM implementation.

Summary
• Main memory is to VM what cache is to main memory.
• Due to heavy page-fault penalties, main memory always caches VM in a fully associative way.
• Data in VM must be copied to physical memory before the CPU can read it.
• Page tables are used to find the data we want in physical memory.

Summary
• Page tables mean that we must access main memory twice.
  – Once to read the page table, once to read the data.
• We can speed things up by caching portions of the page table in a special cache called the TLB.
  – Page table accesses show temporal and spatial locality too!
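
Returning to the page table size exercise posed earlier (128MB of main memory, 1GB of virtual memory, 32KB pages), one possible way to work it out is sketched below. The slides do not give the answer, so the entry layout is an assumption: each entry is taken to hold only a physical page number plus the V and D bits, ignoring any space for disk locations or protection bits. The function name `page_table_size` is likewise made up for illustration.

```python
KB, MB, GB = 2**10, 2**20, 2**30


def page_table_size(vm_bytes, mem_bytes, page_bytes, flag_bits=2):
    """Return (virtual pages, PPN bits, bits per entry, table size in bytes).

    Assumes one entry per virtual page, each holding a PPN plus flag_bits
    flag bits (V and D by default).
    """
    num_virtual_pages = vm_bytes // page_bytes
    num_physical_pages = mem_bytes // page_bytes
    ppn_bits = (num_physical_pages - 1).bit_length()   # bits to number all PPNs
    entry_bits = ppn_bits + flag_bits
    total_bytes = num_virtual_pages * entry_bits // 8
    return num_virtual_pages, ppn_bits, entry_bits, total_bytes


pages, ppn_bits, entry_bits, size = page_table_size(1 * GB, 128 * MB, 32 * KB)
print(f"{pages} entries x {entry_bits} bits = {size} bytes ({size // KB}KB)")
# 32768 entries x 14 bits = 57344 bytes (56KB)
```

Under these assumptions there are 1GB / 32KB = 32768 virtual pages and 128MB / 32KB = 4096 physical pages, so each entry needs a 12-bit PPN plus 2 flag bits. Rounding each entry up to 2 bytes, or adding room for a disk location, would make the table larger.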

Recommended Reading
• Patterson and Hennessy, pp 603 to 618
  – Provides a common framework to understand both cache and VM.
• Also good to read the historical perspectives to understand why and how cache and VM came about.