Virtual Memory: Concepts Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron 1 Physical Addressing CPU Physical address (PA) 4 Main memory 0: 1: 2: 3: 4: 5: 6: 7: 8: ... M-1: Data word Used by microcontrollers like those in Arduino, cars etc. Simple but fragile to program for: Buffer overrun in buflab corrupts Firefox memory! 2 All problems in CS can be solved by another level of indirection Butler Lampson, co-inventor of PC 3 A System Using Virtual Addressing CPU Chip CPU Virtual address (VA) 4100 MMU Physical address (PA) 4 (Memory Management Unit) Main memory 0: 1: 2: 3: 4: 5: 6: 7: 8: ... M-1: Data word Used in all modern servers, desktops, and laptops 4 Why Virtual Memory (VM)? Simplified memory management Each process gets the same uniform linear address space Process Isolation Different processes have different virtual address spaces One process can’t interfere with another’s memory Uses main memory efficiently Use DRAM as a cache for the parts of a virtual address space 5 Today Address spaces VM as a tool for Caching memory management protection Address translation 6 VM as a Tool for Caching Virtual memory is an array of N contiguous bytes stored on disk. The contents of the array on disk are cached in physical memory (DRAM cache) These cache blocks are called pages (size is P = 2p bytes) Virtual memory VP 0 Unallocated VP 1 Cached VP 2n-p-1 Uncached Unallocated Cached Uncached Cached Uncached Physical memory 0 0 Empty PP 0 PP 1 Empty Empty M-1 PP 2m-p-1 N-1 Virtual pages (VPs) stored on disk Physical pages (PPs) cached in DRAM 7 DRAM Cache Organization DRAM cache organization driven by the enormous miss penalty DRAM is about 10x slower than slowest SRAM Disk is about 10,000x slower than DRAM Consequences Large page (block) size: typically 4-8 KB, sometimes 4 MB Fully associative Any VP can be placed in any PP Requires a “large” mapping function – different from CPU caches Highly sophisticated, expensive replacement algorithms Too complicated and open-ended to be implemented in hardware Write-back rather than write-through 8 Page Table A page table is an array of page table entries (PTEs) that maps virtual pages to physical pages. Per-process kernel data structure in DRAM Physical page number or Valid disk address PTE 0 0 null 1 1 0 1 0 0 PTE 7 1 null Physical memory (DRAM) VP 1 VP 2 VP 7 VP 4 PP 0 PP 3 Virtual memory (disk) VP 1 Memory resident page table (DRAM) VP 2 VP 3 VP 4 VP 6 VP 7 9 Page Hit Page hit: VM data accessed is found in physical memory Virtual address Physical page number or Valid disk address PTE 0 0 null 1 1 0 1 0 0 PTE 7 1 null Physical memory (DRAM) VP 1 VP 2 VP 7 VP 4 PP 0 PP 3 Virtual memory (disk) VP 1 Memory resident page table (DRAM) VP 2 VP 3 VP 4 VP 6 VP 7 10 Page Fault Page fault: VM data accessed is not found in physical memory Virtual address Physical page number or Valid disk address PTE 0 0 null 1 1 0 1 0 0 PTE 7 1 null Physical memory (DRAM) VP 1 VP 2 VP 7 VP 4 PP 0 PP 3 Virtual memory (disk) VP 1 Memory resident page table (DRAM) VP 2 VP 3 VP 4 VP 6 VP 7 11 Handling Page Fault Page fault handler selects a victim to be evicted Virtual address Physical page number or Valid disk address PTE 0 0 null 1 1 0 1 0 0 PTE 7 1 null Physical memory (DRAM) VP 1 VP 2 VP 7 VP 4 PP 0 PP 3 Virtual memory (disk) VP 1 Memory resident page table (DRAM) VP 2 VP 3 VP 4 VP 6 VP 7 12 Handling Page Fault Page fault handler selects a victim to be evicted Offending instruction is restarted: page hit! Virtual address Physical page number or Valid disk address PTE 0 0 null 1 1 1 0 0 0 PTE 7 1 null Physical memory (DRAM) VP 1 VP 2 VP 7 VP 3 PP 0 PP 3 Virtual memory (disk) VP 1 Memory resident page table (DRAM) VP 2 VP 3 VP 4 VP 6 VP 7 13 Locality to the Rescue Again! Virtual memory works because of locality At any point in time, programs tend to access a set of active virtual pages called the working set Programs with better temporal locality will have smaller working sets If (working set size < main memory size) Good performance for one process after compulsory misses If ( SUM(working set sizes) > main memory size ) Thrashing: Performance meltdown where pages are swapped (copied) in and out continuously 14 Memory Management Each process has its own virtual address (VA) space It can view memory as a simple linear array One process cannot change another processes memory VA space for Process 1 0 VP 1 VP 2 Address translation 0 Physical Address Space (DRAM) PP 2 ... N-1 PP 6 VA space for Process 2 (read-only library code) 0 PP 8 VP 1 VP 2 ... ... N-1 M-1 15 Memory Management Sharing among processes Map different virtual pages to the same physical page VA space for Process 1 0 VP 1 VP 2 Address translation 0 Physical Address Space (DRAM) PP 2 ... N-1 PP 6 VA space for Process 2 (read-only library code) 0 PP 8 VP 1 VP 2 ... ... N-1 M-1 16 Simplifying Linking and Loading Kernel virtual memory Linking 0xc0000000 Each program has similar virtual User stack (created at runtime) address space Code, stack, and shared libraries always start at the same address Memory invisible to user code %esp (stack pointer) Memory-mapped region for shared libraries 0x40000000 Loading execve() causes kernel to Run-time heap (created by malloc) allocate virtual pages Kernel copies .text and .data sections, page by page, from disk to memory Read/write segment (.data, .bss) Read-only segment (.init, .text, .rodata) 0x08048000 0 Loaded from the executable file Unused 17 Memory Protection Extend PTEs with permission bits Page fault handler checks these before remapping If violated, send process SIGSEGV (segmentation fault) SUPER READ WRITE Process i: Address VP 0: No Yes No PP 6 VP 1: No Yes Yes PP 4 VP 2: Yes Yes Yes • • • PP 2 Physical Address Space PP 2 PP 4 PP 6 SUPER READ WRITE Process j: Address VP 0: No Yes No PP 9 VP 1: Yes Yes Yes PP 6 VP 2: No Yes Yes PP 11 PP 8 PP 9 PP 11 18 Today Address spaces VM as a tool for caching memory management memory protection Address translation 19 Address translation Direct 1 to 1 mapping from Virtual Address to Physical Addresses VA=0x08 VA PA … .. 0x08 0x98 .. .. .. .. .. invalid PA=0x98 20 Summary of Address Translation Symbols Basic Parameters N = 2n : Number of addresses in virtual address space M = 2m : Number of addresses in physical address space P = 2p : Page size (bytes) Components of the virtual address (VA) PDBR: Page directory base registers TLB: Translation lookaside buffer TLBI: TLB index TLBT: TLB tag VPO: Virtual page offset VPN: Virtual page number Components of the physical address (PA) PPO: Physical page offset (same as VPO) PPN: Physical page number CO: Byte offset within cache line CI: Cache index CT: Cache tag 21 Address Translation with a Page Table How kernel tells hardware where to find the page table. Page table Page table base register Valid Page table address for process Physical page number (PPN) PTE (page table entry) m-1 Physical page number (PPN) p p-1 0 Physical page offset (PPO) Physical address Virtual address Virtual page number (VPN) n-1 Virtual page offset (VPO) p p-1 0 22 Simple Memory System Example Addressing 14-bit virtual addresses 12-bit physical address Page size = 64 bytes 11 13 12 11 10 9 8 7 6 5 4 3 2 1 PPN PPO Physical Page Number Physical Page Offset 10 9 8 7 6 5 4 3 2 1 VPN VPO Virtual Page Number Virtual Page Offset 0 0 23 Memory System Page Table Example VPN PPN Valid VPN PPN Valid 00 28 1 08 13 1 01 – 0 09 17 1 02 33 1 0A 09 1 03 02 1 0B – 0 04 – 0 0C – 0 05 16 1 0D 2D 1 06 – 0 0E 11 1 07 – 0 0F 0D 1 24 Address Translation Example Virtual Address: 0x0354 0 3 5 4 13 12 11 10 9 8 7 6 5 4 3 2 1 0 0 0 0 0 1 1 0 1 0 1 0 1 0 0 VPN VPO VPN PPN Valid VPN PPN Valid 00 28 1 08 13 1 01 – 0 09 17 1 02 33 1 0A 09 1 03 02 1 0B – 0 04 – 0 0C – 0 05 16 1 0D 2D 1 06 – 0 0E 11 1 07 – 0 0F 0D 1 What’s the corresponding PPN? Physical address? 25 Address Translation: Page Hit 2 PTE Address CPU Chip CPU 1 VA PTE MMU 3 PA Cache/ Memory 4 Data 5 1) Processor sends virtual address to MMU 2-3) MMU fetches PTE from page table in memory 4) MMU sends physical address to cache/memory 5) Cache/memory sends data word to processor 26 Speeding up translation with a Translation Lookaside Buffer (TLB) Cache a small number of page table entries CPU Chip CPU TLB 2 PTE VPN 3 1 VA MMU PA 4 Cache/ Memory Data 5 A TLB hit eliminates a memory access 27 TLB Miss CPU Chip TLB 2 4 PTE VPN CPU 1 VA MMU 3 PTE Address PA Cache/ Memory 5 Data 6 A TLB miss incurs an additional memory access (the PTE) Fortunately, TLB misses are rare. Why? 28 Multi-Level Page Tables Suppose: 4KB (212) page size, 48-bit address space, 8-byte (23) PTE Problem: Would need a 512 GB page table! Level 2 Tables Level 1 Table 248-12 * 23 = 239 bytes = 512GB ... Solution: Example: 2-level page table ... Level 1 table: each PTE points to a page table Level 2 table: each PTE points to a page 29 A Two-Level Page Table Hierarchy Level 1 page table Level 2 page tables Virtual memory VP 0 PTE 0 PTE 0 ... PTE 1 ... VP 1023 PTE 2 (null) PTE 1023 VP 1024 0 2K allocated VM pages for code and data ... PTE 3 (null) VP 2047 PTE 4 (null) PTE 0 PTE 5 (null) ... PTE 6 (null) PTE 1023 Gap PTE 7 (null) 6K unallocated VM pages PTE 8 (1K - 9) null PTEs 1023 null PTEs PTE 1023 ... 32 bit addresses, 4KB pages, 4-byte PTEs 1023 unallocated pages VP 9215 1023 unallocated pages 1 allocated VM page for the stack 30 Summary Programmer’s view of virtual memory Each process has its own private linear address space Cannot be corrupted by other processes System view of virtual memory Uses memory efficiently by caching virtual memory pages Efficient only because of locality Simplifies memory management and programming Simplifies protection by providing a convenient interpositioning point to check permissions 31