Interactions between Processor Design and Memory System Design David E. Culler

CS61CL
Nov 25, 2009
Lecture 12
11/4/25
UCB CS61CL F09 Lec 12
A Processor Centric View
[Figure: block diagram of Memory, Datapath, and Control]
Fundamental Mem. Design concepts
• Caches
• Virtual memory
• Without these, processing as we know it would
not be possible
A more balanced view
[Figure: Processor and a single Memory]
• “Princeton Architecture” – common instruction
and data memory
A more balanced view
[Figure: Processor with separate Instruction Memory and Data Memory]
• “Harvard Architecture” – separate instruction
and data memory
Or really
[Figure: Memory and Processor]
• Memory systems are extremely sophisticated
• Parallelism, caching, controllers, protocols, …
Pipeline design: I-miss handling
[Figure: pipelined datapath; labels: PC, imem, IR, A, B, Ci, +, Dmem, IR_ex, IR_mem, IR_wb]
• Insert a no-op “bubble” till i-fetch completes
Pipeline Design: D-miss
[Figure: pipelined datapath; labels: PC, imem, IR, A, B, Ci, +, Dmem, IR_ex, IR_mem, IR_wb]
• Stall entire pipeline behind mem stage for data
miss penalty
• Bubble the remainder (WB)
Performance “Iron Triangle”
• Execution Time = Seconds / Program
= (Seconds / Cycle) × (Cycles / Instruction) × (Instructions / Program)
= CycleTime × CPI × Inst.Count
• What primarily determines…
– Cycle Time?
– Instruction Count?
– CPI ?
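The product of the three factors can be checked with a few lines of Python. This is only a sketch: the function name and the sample numbers (a 1 ns cycle, a CPI of 2.0, one billion instructions) are assumptions chosen for illustration and do not come from the lecture.

```python
def execution_time(cycle_time_s, cpi, inst_count):
    """Seconds/Program = (Seconds/Cycle) * (Cycles/Instruction) * (Instructions/Program)."""
    return cycle_time_s * cpi * inst_count

# Assumed numbers: 1 GHz clock (1 ns cycle), CPI of 2.0, one billion instructions.
t = execution_time(1e-9, 2.0, 1_000_000_000)
print(f"{t:.2f} s")

# Halving any one factor halves the execution time.
print(f"{execution_time(1e-9, 1.0, 1_000_000_000):.2f} s")
```

Because the three terms multiply, an improvement in one corner of the triangle only pays off if it does not make another corner proportionally worse.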
[Figure: triangle with corners labeled CPI, Cycle Time, and Inst. Count]
Bringing Cache into the Picture
• Recall MAT = Time_hit + P_miss × Penalty_miss
• Time_hit < Cycle Time
• Penalty_miss = Pipeline Stalls/Bubbles during miss
• Ideal CPI is CPI with perfect memory system
• CPI = Ideal_CPI + P_miss × Penalty_miss
Example
• Instruction Mix:
– 50% arith, 30% load/store, 20% jumps/branches
• Pipeline hazards
– Ideal CPI = 1.2
• Cache behavior
– 0.2% instruction miss rate (99.8% hit rate)
– 3% data miss rate (97% hit rate)
– 100 cycle miss penalty
• Without Cache: CPI = 1.2 + 100 + 0.30 x 100 = 131.2
– processor pipeline is under 1% utilized!
• Cache: CPI = 1.2 + 1 x 0.002 x 100 + 0.30 x 0.03 x 100
= 1.2 + 0.2 + 0.9 = 2.3
– on average, about half the time is spent waiting for memory
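The arithmetic in this example can be reproduced with a short Python sketch. The function and parameter names are hypothetical; the numbers are the ones given on the slide. The "without cache" case is modeled by setting both miss rates to 1.0, so that every access pays the full penalty.

```python
def cpi_with_memory(ideal_cpi, mem_ops_per_inst, i_miss_rate, d_miss_rate, miss_penalty):
    """CPI = Ideal_CPI + stall cycles per instruction from I-fetch and data misses."""
    i_stalls = 1.0 * i_miss_rate * miss_penalty               # every instruction is fetched once
    d_stalls = mem_ops_per_inst * d_miss_rate * miss_penalty  # only loads/stores access data
    return ideal_cpi + i_stalls + d_stalls

no_cache = cpi_with_memory(1.2, 0.30, 1.0, 1.0, 100)          # every access misses
with_cache = cpi_with_memory(1.2, 0.30, 0.002, 0.03, 100)
print(round(no_cache, 1), round(with_cache, 1))

# Fraction of cycles spent stalled on memory when the cache is present.
print(round((with_cache - 1.2) / with_cache, 2))
```

The last line gives the share of cycles that are memory stalls, which is where the "about half" figure comes from.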
Administration
• Midterm II results
– Max: 99 Mean: 75.2 (without bonus)
– Max: 105.5 Mean 77
• HW 8 due 12/7 midnight
• Project 4 due 12/9 midnight
• Review Week
– review in Tu/W lab + optional threads
lab
– review in lecture
• Final Exam: Dec 15, 12:30-3:30
Virtual Memory
• Each Program runs in its own Virtual Address
Space (VAS)
• Distinct from the Physical Address Space (PAS)
of the machine
• Hardware transparently maps the Virtual
Address Spaces onto physical resources
• Only a small fraction of each VAS is in physical
memory at any time!
Timesharing, MultiProcessing,
Multitasking
Multiple Process Address Spaces in Mem
[Figure: process address spaces placed in Physical Memory; address labels 00000000, 00FD0000, FFFFFFFF]
With Virtual Memory
[Figure: address spaces and Physical Memory; address labels 00000, FFFFF, 00000000, 00FD0000, FFFFFFFF]
A Processor Supporting Virtual Memory
• Is able to access a Page Table to translate
Virtual Page Number => Physical Frame
• on EVERY memory reference
• Page Table lives in memory
• How many memory accesses per instruction?
– Instruction Fetch VA Translation
» PF = Mem[ PTbase + PC_page]
– Fetch the Actual Instructions
» IR = Mem[ PF + PC_offset]
– Load/Store VA Translation
» PF = Mem[ PTbase + (R[rs]+Sx)_page ]
– Load/Store the actual location
» R[rt] = Mem[ PF + (R[rs]+Sx)_offset ]
• How many cache accesses?
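One way to answer the counting question is to simulate it. The sketch below assumes a single-level page table held in memory and no TLB; the Memory class, the 16-bit page offset, and the sample mappings are all assumptions made for illustration, not the lecture's design.

```python
PAGE_BITS = 16  # assumption: 64 KB pages, i.e. a 4-hex-digit page offset

class Memory:
    """Toy memory that counts every access; the page table lives inside it."""
    def __init__(self, page_table):
        self.page_table = page_table   # virtual page number -> physical frame number
        self.data = {}                 # physical address -> word
        self.accesses = 0

    def read_pte(self, vpn):
        self.accesses += 1             # PF = Mem[PTbase + page]
        return self.page_table[vpn]

    def read(self, pa):
        self.accesses += 1             # Mem[PF + offset]
        return self.data.get(pa, 0)

def load(mem, va):
    """Translate a virtual address through the page table, then access it."""
    vpn, offset = va >> PAGE_BITS, va & ((1 << PAGE_BITS) - 1)
    frame = mem.read_pte(vpn)
    return mem.read((frame << PAGE_BITS) | offset)

mem = Memory({0x0040: 0x07, 0x0053: 0x14})
load(mem, 0x00400010)   # instruction fetch: translate, then fetch
load(mem, 0x00531020)   # lw data access: translate, then load
print(mem.accesses)
```

Under these assumptions a load or store instruction costs four memory accesses (two for the fetch, two for the data), and any other instruction costs two.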
TLB ????
• Translation Lookaside Buffer is a
specialized cache for the page
table
• It was invented (by Sir Maurice
Wilkes) to make virtual memory
possible
• He then realized it could be used
to make all memory accesses
faster.
• Should TLBs and caches be
different?
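The idea of a cache for page table entries can be sketched in Python. The TLB class, its capacity of four entries, the FIFO replacement policy, and the sample mappings are assumptions for illustration; real TLBs differ in size, associativity, and replacement.

```python
class TLB:
    """Tiny fully associative TLB in front of a page table (dict: VPN -> frame)."""
    def __init__(self, page_table, capacity=4):
        self.page_table = page_table
        self.capacity = capacity
        self.entries = {}              # cached VPN -> frame translations
        self.hits = self.misses = 0

    def translate(self, vpn):
        if vpn in self.entries:
            self.hits += 1             # no page-table memory access needed
            return self.entries[vpn]
        self.misses += 1               # consult the page table in memory
        frame = self.page_table[vpn]
        if len(self.entries) >= self.capacity:
            self.entries.pop(next(iter(self.entries)))   # evict the oldest entry (FIFO)
        self.entries[vpn] = frame
        return frame

tlb = TLB({0x0040: 0x07, 0x0053: 0x14})
for vpn in [0x0040, 0x0040, 0x0053, 0x0040, 0x0053]:
    tlb.translate(vpn)
print(tlb.hits, tlb.misses)
```

Only the first reference to each page goes to the page table; repeated references to the same page are translated without an extra memory access.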
What must happen in the processor
on a Page Fault?
• It could happen in instruction fetch, LW or SW
• The translation fails
• The actual page is out on disk
– 10 ms @ 3 GHz => 30 Million cycles to access it!
• We need to run a special program (The Operating
System) to go and get it
– allocate a frame in memory
– read the page from disk
» seek
» transfer, …
– update the page table
• But we are in the middle of an instruction…
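The three operating system steps listed above, and the cycle count for the disk access, can be sketched as follows. The function name, the dictionaries standing in for disk and memory, and the free-frame list are hypothetical; a real handler would also have to evict a page when no frame is free.

```python
def handle_page_fault(vpn, page_table, free_frames, disk, physical_memory):
    """Hypothetical OS handler: bring the page for `vpn` into memory and map it."""
    frame = free_frames.pop()                  # allocate a frame in memory
    physical_memory[frame] = disk[vpn]         # read the page from disk (seek + transfer)
    page_table[vpn] = frame                    # update the page table
    return frame

page_table = {0x0040: 0x07}
disk = {0x0053: "contents of page 0053"}
physical_memory = {0x07: "contents of page 0040"}
frame = handle_page_fault(0x0053, page_table, [0x14], disk, physical_memory)
print(hex(frame), page_table[0x0053] == frame)

cycles = 10e-3 * 3e9                           # 10 ms of disk latency at 3 GHz
print(f"{cycles:,.0f} cycles")
```

The cycle count shows why the processor cannot simply wait: tens of millions of cycles is enough time to run a great deal of other work.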
Page Fault
[Figure: pipelined datapath; labels: PC, imem, IR, A, B, Ci, +, Dmem, IR_ex, IR_mem, IR_wb]
• Cannot just stall the pipeline
• Must “trap” the current instruction
• Put it aside and start executing other (OS)
instructions
More Key Concepts
• Exception: unprogrammed transfer of control
• Interrupt
– asynchronous
– occurs between instructions
– used for efficient I/O
• Fault
– synchronous
– occurs within an instruction
• Preserve state associated with trap in special
registers
– EPC, BadVAddr, and Cause in MIPS
• Modify PC register to be exception handler
– PC := trapHandlerAddr
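The save-then-redirect sequence can be modeled in a few lines of Python. This is a sketch only: the CPU class, the handler address, and the cause code are assumed values for illustration, and the model ignores pipeline details such as squashing later instructions.

```python
from dataclasses import dataclass

TRAP_HANDLER_ADDR = 0x80000180   # assumed handler address, for illustration only

@dataclass
class CPU:
    pc: int = 0
    epc: int = 0        # PC of the faulting instruction
    bad_vaddr: int = 0  # address that caused the fault
    cause: int = 0      # code identifying the exception type

    def trap(self, cause, bad_vaddr):
        """Save the state tied to the exception, then redirect the PC."""
        self.epc = self.pc
        self.bad_vaddr = bad_vaddr
        self.cause = cause
        self.pc = TRAP_HANDLER_ADDR      # PC := trapHandlerAddr

    def return_from_exception(self):
        self.pc = self.epc               # resume at the faulting instruction

cpu = CPU(pc=0x00400010)
cpu.trap(cause=2, bad_vaddr=0x00531020)  # cause code 2 is a placeholder value
handler_pc = cpu.pc
cpu.return_from_exception()
print(hex(handler_pc), hex(cpu.pc), hex(cpu.bad_vaddr))
```

Saving the faulting PC is what lets the handler return to the same instruction and execute it again once the fault has been serviced.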
What information must be recorded on
a page fault?
• The PC of offending instruction
• The offending address
• other cause-related info
Page Fault in Action
[Figure: Processor (PC = 0040 0010; ePC, IR, badVA, Regs, PTB; TLB: 0040 => 07), Page Table (0040 v: 07), Physical Memory (frame 07 0000 holds page 0040), Disk, and the Program Virtual Address Space]
Inst Fetch: VA 0040xxxx => PA 07xxxx
[Figure: PC = 0040 0010; TLB: 0040 => 07; Page Table: 0040 v: 07; Physical Memory: frame 07 0000 holds page 0040; ePC, IR, badVA, Regs, PTB, Disk, and the Program Virtual Address Space also shown]
Inst Fetch: mem[07 0010] => IR
[Figure: PC = 0040 0010; IR = lw $3 20($4); TLB: 0040 => 07; Page Table: 0040 v: 07; Physical Memory: frame 07 0000 holds page 0040; ePC, badVA, Regs, PTB, Disk, and the Program Virtual Address Space also shown]
Exec: EA = 0053 1000 + 20
[Figure: PC = 0040 0010; IR = lw $3 20($4); Regs: 0053 1000; TLB: 0040 => 07; Page Table: 0040 v: 07; Physical Memory: frame 07 0000 holds page 0040; ePC, badVA, PTB, Disk, and the Program Virtual Address Space also shown]
Exec: VA 00531020 => ??? TLB miss
[Figure: PC = 0040 0010; IR = lw $3 20($4); Regs: 0053 1000; TLB: 0040 => 07 only; Page Table: 0040 v: 07; Physical Memory: frame 07 0000 holds page 0040; ePC, badVA, PTB, Disk, and the Program Virtual Address Space also shown]
Exec: PT lookup(0053) => ??? Fault
[Figure: PC = 0040 0010; IR = lw $3 20($4); Regs: 0053 1000; TLB: 0040 => 07; Page Table: 0040 v: 07, 0053 N: (no valid mapping); Physical Memory: frame 07 0000 holds page 0040; ePC, badVA, PTB, Disk, and the Program Virtual Address Space also shown]
Exec: Trap to OS Page Fault Handler
[Figure: PC = 00001 FF00 (previously 0040 0010); ePC = 0040 0010; badVA = 0053 1020; IR = lw $3 20($4); Regs: 0053 1000; TLB: 0040 => 07; Page Table: 0040 v: 07, 0053 N:; Physical Memory: frame 07 0000 holds page 0040; PTB, Disk, and the Program Virtual Address Space also shown]
Fetch and execute OS instructions
[Figure: PC = 00001 FF00; IR = j flt_hndlr; ePC = 0040 0010; badVA = 0053 1020; Regs: 0053 1000; TLB: 0040 => 07; Page Table: 0040 v: 07, 0053 N:; Physical Memory: OS page, frame 07 0000 holds page 0040; PTB, Disk, and the Program Virtual Address Space also shown]
Fetch and execute OS instructions
[Figure: PC = 000YY xxxx; IR = "jxzyxzyxz" (as drawn); ePC = 0040 0010; badVA = 0053 1020; Regs: 0053 1000; TLB: 0040 => 07; Page Table: 0040 v: 07, 0053 N:; Physical Memory: OS page, frame 07 0000 holds page 0040; PTB, Disk, and the Program Virtual Address Space also shown]
Load page from Disk to Memory
[Figure: page 0053 appears in Physical Memory alongside the OS page and page 0040 (frame 07 0000); PC = 00001 FF00; IR = j flt_hndlr; ePC = 0040 0010; badVA = 0053 1020; Regs: 0053 1000; TLB: 0040 => 07; Page Table: 0040 v: 07, 0053 N:; PTB, Disk, and the Program Virtual Address Space also shown]
Update Page Table
[Figure: Page Table: 0040 v: 07, 0053 v: 14; Physical Memory: OS page, frame 07 0000 holds page 0040, frame 14 0000 holds page 0053; PC = 00001 FF00; IR = j flt_hndlr; ePC = 0040 0010; badVA = 0053 1020; Regs: 0053 1000; TLB: 0040 => 07; PTB, Disk, and the Program Virtual Address Space also shown]
ReturnFromException (RFE)
[Figure: PC = 0040 0010 (restored from ePC = 0040 0010); IR = lw $3 20($4); Regs: 0053 1000; TLB: 0040 => 07; Page Table: 0040 v: 07, 0053 v: 14; Physical Memory: OS page, frame 07 0000 holds page 0040, frame 14 0000 holds page 0053; badVA, PTB, Disk, and the Program Virtual Address Space also shown]
Exec: TLB Miss, PT lookup
[Figure: PC = 0040 0010; IR = lw $3 20($4); Regs: 0053 1000; TLB: 0040 => 07 and, as drawn, 0053 => 07 (the Page Table entry for 0053 is v: 14); Page Table: 0040 v: 07, 0053 v: 14; Physical Memory: OS page, frame 07 0000 holds page 0040, frame 14 0000 holds page 0053; ePC, badVA, PTB, Disk, and the Program Virtual Address Space also shown]
Exec: Read physical address
[Figure: the value 432 is read; PC = 0040 0010; IR = lw $3 20($4); Regs: 0053 1000; TLB: 0040 => 07 and, as drawn, 0053 => 07 (the Page Table entry for 0053 is v: 14); Page Table: 0040 v: 07, 0053 v: 14; Physical Memory: OS page, frame 07 0000 holds page 0040, frame 14 0000 holds page 0053; ePC, badVA, PTB, Disk, and the Program Virtual Address Space also shown]
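The whole sequence (TLB miss, page table lookup, fault, handler, retry) can be replayed in a short Python sketch. Everything here is a simplified model under stated assumptions: 16-bit page offsets inferred from the 4-hex-digit offsets on the slides, dictionaries standing in for the TLB, page table, memory, and disk, and a retry loop standing in for the trap and return-from-exception. The sketch installs frame 14 in the TLB, following the page table entry.

```python
PAGE_BITS = 16   # assumption: the 4-hex-digit offsets on the slides imply 64 KB pages

class PageFault(Exception):
    pass

def translate(va, tlb, page_table):
    vpn, offset = va >> PAGE_BITS, va & ((1 << PAGE_BITS) - 1)
    if vpn not in tlb:                       # TLB miss: consult the page table
        if vpn not in page_table:
            raise PageFault(va)              # no valid mapping: the page is on disk
        tlb[vpn] = page_table[vpn]
    return (tlb[vpn] << PAGE_BITS) | offset

def load_word(va, tlb, page_table, memory, disk, free_frames):
    while True:
        try:
            return memory[translate(va, tlb, page_table)]
        except PageFault:
            # Stand-in for the OS handler: allocate a frame, read the page, update the table.
            vpn, frame = va >> PAGE_BITS, free_frames.pop()
            for off, word in disk[vpn].items():
                memory[(frame << PAGE_BITS) | off] = word
            page_table[vpn] = frame
            # Loop back: the faulting lw is executed again, as after the return from exception.

tlb, page_table = {0x0040: 0x07}, {0x0040: 0x07}
disk = {0x0053: {0x1020: 432}}               # page 0053 holds the word 432 at offset 1020
memory = {}
value = load_word(0x0053_1000 + 0x20, tlb, page_table, memory, disk, [0x14])
print(value, hex(tlb[0x0053]))
```

The second pass through the loop corresponds to the last two slides: the retried lw misses in the TLB, finds a valid page table entry, and reads the word from the newly loaded frame.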
Paging the Page Table?
• 2^64-byte virtual address space
• 2^14-byte pages (16 kB)
• => 2^50 page table entries
• Large address spaces are used sparsely
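The entry count follows directly from the two exponents, and a quick calculation shows why a flat table is impractical. The 8-byte page table entry size below is an assumption made for illustration; the slide does not give one.

```python
VA_BITS = 64
PAGE_BITS = 14                      # 2^14-byte (16 kB) pages
PTE_BYTES = 8                       # assumption: 8-byte page table entries

entries = 2 ** (VA_BITS - PAGE_BITS)
table_bytes = entries * PTE_BYTES
print(f"entries = 2^{entries.bit_length() - 1}")
print(f"flat table = {table_bytes // 2**50} PiB per address space")
```

A table that large cannot sit in physical memory, which motivates paging the page table itself and exploiting the fact that most of the address space is unused.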
Summary
• Caches are essential to performance
• Virtual Address translation permits modern
operating systems and applications
• Requires caching
• Also requires special processor hardware
support
• Also requires operating system support
• Works as long as page faults are rare
• Next Time: Andy lectures on “What’s an OS”