Interactions between Processor Design and Memory System Design
David E. Culler
CS61CL
Nov 25, 2009
Lecture 12
11/4/25 UCB CS61CL F09 Lec 12

A Processor Centric View
[Figure: Memory, Datapath, Control]

Fundamental Memory Design Concepts
• Caches
• Virtual memory
• Without these, processing as we know it would not be possible

A more balanced view
[Figure: Memory, Processor]
• “Princeton Architecture” – common instruction and data memory

A more balanced view
[Figure: Instruction Memory, Data Memory, Processor]
• “Harvard Architecture” – separate instruction and data memory

Or really
[Figure: Memory, Processor]
• Memory systems are extremely sophisticated
• Parallelism, caching, controllers, protocols, …

Pipeline Design: I-miss handling
[Figure: pipeline datapath (PC, imem, IR, A, B, Dmem; stage registers IR_ex, IR_mem, IR_wb)]
• Insert a no-op “bubble” until i-fetch completes

Pipeline Design: D-miss
[Figure: pipeline datapath (PC, imem, IR, A, B, Dmem; stage registers IR_ex, IR_mem, IR_wb)]
• Stall the entire pipeline behind the mem stage for the data miss penalty
• Bubble the remainder (WB)

Performance “Iron Triangle”
• Execution Time = Seconds / Program
  = (Seconds / Cycle) x (Cycles / Instruction) x (Instructions / Program)
  = CycleTime x CPI x Inst.Count
• What primarily determines…
  – Cycle Time?
  – Instruction Count?
  – CPI?
[Figure: triangle with vertices CPI, Cycle Time, Inst. Count]

Bringing Cache into the Picture
• Recall MAT = Time_hit + P_miss * Penalty_miss
• Time_hit < Cycle Time
• Penalty_miss = Pipeline Stalls/Bubbles during miss
• Ideal CPI is CPI with a perfect memory system
• CPI = Ideal_CPI + P_miss * Penalty_miss

Example
• Instruction Mix:
  – 50% arith, 30% load/store, 20% jumps/branches
• Pipeline hazards
  – Ideal CPI = 1.2
• Cache behavior
  – 0.2% instruction miss rate (99.8% hit rate)
  – 3% data miss rate (97% hit rate)
  – 100 cycle miss penalty
• Without Cache: CPI = 1.2 + 1 x 100 + 0.30 x 100 = 131.2
  – processor pipeline is less than 1% utilized !!!!
• Cache: CPI = 1.2 + 1 x 0.002 x 100 + 0.30 x 0.03 x 100
  = 1.2 + 0.2 + 0.9 = 2.3
  – on average, ~half the time is spent waiting for memory

Administration
• Midterm II results
  – Max: 99 Mean: 75.2 (without bonus)
  – Max: 105.5 Mean: 77
• HW 8 due 12/7 midnight
• Project 4 due 12/9 midnight
• Review Week
  – review in Tu/W lab + optional threads lab
  – review in lecture
• Final Exam: Dec 15 12:30-3:30

Virtual Memory
• Each Program runs in its own Virtual Address Space (VAS)
• Distinct from the Physical Address Space (PAS) of the machine
• Hardware transparently maps the Virtual Address Spaces onto physical resources
• Only a small fraction of the VASs is in physical memory at any time!

Timesharing, MultiProcessing, Multitasking

Multiple Process Address Spaces in Mem
[Figure: process address spaces placed in Physical Memory; address labels 00000000, 00FD0000, FFFFFFFF]

With Virtual Memory
[Figure: virtual address space 00000000 to FFFFFFFF mapped through page numbers 00000 to FFFFF onto Physical Memory; address label 00FD0000]

A Processor Supporting Virtual Memory
• Is able to access a Page Table to translate Virtual Page Number => Physical Frame
• on EVERY memory reference
• Page Table lives in memory
• How many memory accesses per instruction?
  – Instruction Fetch VA Translation
    » PF = Mem[ PTbase + PC_page ]
  – Fetch the Actual Instruction
    » IR = Mem[ PF + PC_offset ]
  – Load/Store VA Translation
    » PF = Mem[ PTbase + (R[rs]+Sx)_page ]
  – Load/Store the actual location
    » R[rt] = Mem[ PF + (R[rs]+Sx)_offset ]
• How many cache accesses?

TLB ????
• The Translation Lookaside Buffer is a specialized cache for the page table
• It was invented (by Sir Maurice Wilkes) to make virtual memory possible
• He then realized it could be used to make all memory accesses faster.
• Should TLBs and caches be different?

What must happen in the processor on a Page Fault?
• It could happen in instruction fetch, LW or SW
• The translation fails
• The actual page is out on disk
  – 10 ms @ 3 GHz => 30 Million cycles to access it!
• We need to run a special program (The Operating System) to go and get it
  – allocate a frame in memory
  – read the page from disk
    » seek
    » transfer, …
  – update the page table
• But we are in the middle of an instruction…

Page Fault
[Figure: pipeline datapath (PC, imem, IR, A, B, Dmem; stage registers IR_ex, IR_mem, IR_wb)]
• Cannot just stall the pipeline
• Must “trap” the current instruction
• Put it aside and start executing other (OS) instructions

More Key Concepts
• Exception: unprogrammed transfer of control
• Interrupt
  – asynchronous
  – occurs between instructions
  – used for efficient I/O
• Fault
  – synchronous
  – occurs within an instruction
• Preserve state associated with the trap in special registers
  – EPC + BadVAddr + Cause in MIPS
• Modify the PC register to point to the exception handler
  – PC := trapHandlerAddr

What information must be recorded on a page fault?
• The PC of the offending instruction
• The offending address
• Other cause-related info

Page Fault in Action
[Figure: Processor (PC, ePC, IR, badVA, Regs, TLB, PTB), Page Table, Physical Memory, Disk, and the Program Virtual Address Space. The slides that follow step this figure through one page fault; only the state shown on each slide is listed.]
• PC = 0040 0010
• TLB: 0040 => 07
• Page Table: 0040 v: 07
• Physical Memory: frame 07 0000 holds page 0040

Inst Fetch: VA 0040xxxx => PA 07xxxx
• TLB: 0040 => 07

Inst Fetch: mem[07 0010] => IR
• IR = lw $3 20($4)

Exec: EA = 0053 1000 + 20
• Regs: 0053 1000

Exec: VA 00531020 => ??? TLB miss
• TLB: 0040 => 07 only

Exec: PT lookup(0053) => ??? Fault
• Page Table: 0040 v: 07, 0053 N: (not valid)

Exec: Trap to OS Page Fault Handler
• PC = 00001 FF00
• ePC = 0040 0010
• badVA = 0053 1020

Fetch and execute OS instructions
• Physical Memory: OS page
• IR = j flt_hndlr

Fetch and execute OS instructions
• PC = 000YY xxxx
• IR = handler instructions (placeholder on the slide)

Load page from Disk to Memory
• Physical Memory: page 0053 loaded from Disk

Update Page Table
• Physical Memory: frame 14 0000 holds page 0053
• Page Table: 0053 v: 14

ReturnFromException (RFE)
• PC = 0040 0010 (from ePC)
• IR = lw $3 20($4)

Exec: TLB Miss, PT lookup
• TLB: 0040 => 07, 0053 => 14

Exec: Read physical address
• Value shown for the load: 432

Paging the Page Table?
• 2^64 byte virtual address space
• 2^14 byte pages (16 KB)
• => 2^50 page table entries
• Large address spaces are used sparsely

Summary
• Caches are essential to performance
• Virtual Address translation permits modern operating systems and applications
• Requires caching
• Also requires special processor hardware support
• Also requires operating system support
• Works as long as page faults are rare
• Next Time: Andy lectures on “What’s an OS”
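
The CPI arithmetic from the Example slide can be checked with a few lines of Python. This is only a sketch of the slide's formula (CPI = Ideal_CPI + miss rate x penalty, applied separately to instruction fetches and to loads/stores); the function and variable names are made up for illustration and are not from the lecture.

```python
def effective_cpi(ideal_cpi, mem_frac, i_miss, d_miss, penalty):
    """Ideal CPI plus the average memory stall cycles per instruction."""
    i_stalls = 1.0 * i_miss * penalty        # every instruction is fetched once
    d_stalls = mem_frac * d_miss * penalty   # only loads/stores access data memory
    return ideal_cpi + i_stalls + d_stalls


# Numbers from the Example slide: ideal CPI 1.2, 30% load/store, 100-cycle penalty.
no_cache = effective_cpi(1.2, 0.30, i_miss=1.0, d_miss=1.0, penalty=100)
with_cache = effective_cpi(1.2, 0.30, i_miss=0.002, d_miss=0.03, penalty=100)

# Fraction of cycles spent waiting for memory when the cache is present.
stall_fraction = (with_cache - 1.2) / with_cache

print(f"CPI without cache: {no_cache:.1f}")
print(f"CPI with cache:    {with_cache:.1f}")
print(f"Stall fraction:    {stall_fraction:.0%}")
```

Setting both miss rates to 1.0 models the "Without Cache" case, where every access pays the full penalty.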
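
The "Page Fault in Action" walk-through can also be mimicked in software. The sketch below is a toy model, not how hardware or a real OS does it: the TLB and page table are Python dicts, the 16-bit page offset is inferred from the slide addresses (0040 0010 => 07 0010), and `PageFault`, `translate`, and `os_page_fault_handler` are hypothetical names. As on the slides, the `20` in `lw $3 20($4)` is treated as hex.

```python
PAGE_BITS = 16  # assumption: slide addresses use a 4-hex-digit page offset


class PageFault(Exception):
    """Raised when the page table has no valid entry for the page."""

    def __init__(self, bad_va):
        super().__init__(f"page fault at VA {bad_va:#010x}")
        self.bad_va = bad_va


def translate(va, tlb, page_table):
    """Translate a virtual address, consulting the TLB before the page table."""
    vpn = va >> PAGE_BITS
    offset = va & ((1 << PAGE_BITS) - 1)
    if vpn not in tlb:                    # TLB miss: look in the page table
        frame = page_table.get(vpn)       # None models an invalid (N) entry
        if frame is None:
            raise PageFault(va)
        tlb[vpn] = frame                  # refill the TLB
    return (tlb[vpn] << PAGE_BITS) | offset


def os_page_fault_handler(bad_va, page_table, free_frame):
    """Toy handler: the disk read is omitted, only the page table is updated."""
    page_table[bad_va >> PAGE_BITS] = free_frame


page_table = {0x0040: 0x07}
tlb = {0x0040: 0x07}

pc = 0x00400010
fetch_pa = translate(pc, tlb, page_table)      # instruction fetch hits in the TLB

ea = 0x00531000 + 0x20                         # effective address of the lw
try:
    load_pa = translate(ea, tlb, page_table)
except PageFault as fault:
    epc, bad_va = pc, fault.bad_va             # state saved by the trap
    os_page_fault_handler(bad_va, page_table, free_frame=0x14)
    load_pa = translate(ea, tlb, page_table)   # retry the lw after RFE

print(f"fetch PA = {fetch_pa:#x}, load PA = {load_pa:#x}")
```

The retry after the handler returns takes a TLB miss, finds the now-valid page table entry, and refills the TLB, which matches the last two steps of the walk-through.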
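
The sizes on the "Paging the Page Table?" slide follow from simple exponent arithmetic. The sketch below reproduces them; the 8-byte page table entry is an assumption added here to show why a flat table is impractical, and is not a figure from the lecture.

```python
VA_BITS = 64     # 2^64 byte virtual address space (from the slide)
PAGE_BITS = 14   # 2^14 byte pages (from the slide)
PTE_BYTES = 8    # assumption: size of one page table entry

page_size = 2 ** PAGE_BITS
entries = 2 ** (VA_BITS - PAGE_BITS)     # one entry per virtual page
flat_table_bytes = entries * PTE_BYTES   # size of a single-level table

print(f"page size: {page_size // 1024} KB")
print(f"entries:   2^{entries.bit_length() - 1}")
print(f"flat page table: 2^{flat_table_bytes.bit_length() - 1} bytes")
```

Under that assumption a flat table for one address space would itself be far larger than physical memory, which is why the page table has to be paged or otherwise stored sparsely.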