CS 152 Computer Architecture and Engineering Lecture 15 Virtual Memory

CS 152
Computer Architecture and Engineering
Lecture 15 – Virtual Memory
2005-10-20
John Lazzaro
(www.cs.berkeley.edu/~lazzaro)
TAs: David Marquardt and Udam Saini
www-inst.eecs.berkeley.edu/~cs152/
CS 152 L15: Virtual Memory
UC Regents Fall 2005 © UCB
Last Time: Practical Cache Design
Cache design control is done by many loosely coupled state machines, including ...
[Figure: cache datapath block diagram. Separate control state machines sit on the CPU-side Addr/Din/Dout ports, the Tags and Blocks arrays, and the Addr/Din/Dout ports to lower-level memory.]
State machines for bus control ....
For reads, your state machine must: (1) sense REQ, (2) latch Addr, (3) create Wait, (4) put Data Out on the bus.
[Figure: upper-level memory (small, fast, holding Blk X) sits between the processor and lower-level memory (large, slow, holding Blk Y); bus signals run from the CPU and to the CPU.]
An example interface ... there are other possibilities.
State machines for block fetch from DRAM
DRAM can be set up to request an N byte region starting at an arbitrary N+k within region.
One request ... many returns ...
[Figure 13: Consecutive READ Bursts. SDRAM timing diagram, T0 to T6, CAS latency = 2: READ (bank, col n) at T0 and READ (bank, col b) at T4, with NOPs in between; DQ returns DOUT n, n+1, n+2, n+3, then DOUT b. Background datasheet text, partly cut off: "... The 256Mb SDRAM uses a pipelined architecture and therefore does not require the 2n rule associated with a prefetch ..."]
State machine challenges: (1) setting up correct block read mode (2) delivering correct word direct to CPU (3) putting all words in cache in right place.
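To make the "N byte region starting at an arbitrary N+k" idea concrete, here is a minimal sketch of the order in which words come back from a burst. It assumes the DRAM is configured for a sequential burst that wraps around inside the aligned block; the function name `burst_order` and the word-granularity addresses are illustrative, not taken from the datasheet.

```python
def burst_order(start_addr, burst_len):
    """Return the word addresses delivered by one wrapping burst.

    Assumption: the burst covers the aligned block of burst_len words that
    contains start_addr, begins at start_addr, and wraps inside that block.
    """
    offset = start_addr % burst_len        # position k of the requested word
    base = start_addr - offset             # aligned start of the block
    return [base + (offset + i) % burst_len for i in range(burst_len)]


# The requested word arrives first, so it can go straight to the CPU,
# while every word still has to be placed in its own slot of the cache block.
print(burst_order(6, 4))
```

The state machine therefore needs the starting offset twice: once to forward the first returned word to the CPU, and again to compute the slot in the cache block for each later word.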
State machine for writeback to DRAM
One command ... many bytes written.
[Figure 47: Write - With Auto Precharge. SDRAM timing diagram, T0 to T7: an ACTIVE command (row, bank) is followed by a WRITE command (column m, bank, auto precharge enabled on A10), with NOPs in between; DQ carries DIN m, m + 1, m + 2, m + 3. Timing parameters shown include tCK, tRCD, tRAS, tWR, and tRC.]
State machine challenges: (1) putting cache block into correct location (2) what if a read or write wants to use DRAM before the burst is complete? Must stall ...
State machines to manage write buffer
Solution: add a “write buffer” to cache datapath
[Figure: Processor, Cache, and Lower Level Memory, with a Write Buffer on the path to lower-level memory. The write buffer holds data awaiting write-through to lower level memory.]
Q. Why a write buffer?
A. So CPU doesn’t stall.
Q. Why a buffer, why not just one register?
A. Bursts of writes are common.
Q. Are Read After Write (RAW) hazards an issue for write buffer?
A. Yes! Drain buffer before next read, or check write buffers.
On reads, state machine checks cache and write buffer. What if word was removed from cache before lower-level write? On writes, state machine stalls for full write buffer, handles write buffer duplicates.
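The write-buffer rules above can be sketched in a few lines. This is an illustrative software model, not the lecture's hardware design: the class name `WriteBuffer`, the capacity of 4, and the dictionary-based cache and memory are all assumptions.

```python
from collections import deque


class WriteBuffer:
    """FIFO of pending (address, data) writes to lower-level memory."""

    def __init__(self, capacity=4):          # capacity is an assumed value
        self.capacity = capacity
        self.entries = deque()               # oldest entry on the left

    def write(self, addr, data):
        """Queue a write. Returns False when the CPU must stall (buffer full)."""
        for i, (pending_addr, _) in enumerate(self.entries):
            if pending_addr == addr:         # duplicate: overwrite pending data
                self.entries[i] = (addr, data)
                return True
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append((addr, data))
        return True

    def lookup(self, addr):
        """Return pending data for addr, or None if no write is pending."""
        for pending_addr, data in self.entries:
            if pending_addr == addr:
                return data
        return None

    def drain_one(self, memory):
        """Retire the oldest pending write into lower-level memory."""
        if self.entries:
            addr, data = self.entries.popleft()
            memory[addr] = data


def read(addr, cache, wbuf, memory):
    """Read path: check the cache, then the write buffer, then memory."""
    if addr in cache:
        return cache[addr]
    pending = wbuf.lookup(addr)              # avoids the RAW hazard
    return pending if pending is not None else memory[addr]
```

Checking the buffer on a read is one of the two options the slide lists; the other, draining the buffer before the next read, trades a longer read latency for simpler control.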
Don’t design one big state machine!!!
Focus on the high-level state machine structure early!
[Figure: cache datapath block diagram with separate control state machines for the CPU-side ports, the Tags and Blocks arrays, and the ports to lower-level memory.]
Today’s Lecture - Virtual Memory
Virtual address spaces
Page table layout
TLB design options
The Limits of Physical Addressing
Where we are in CS 152 ...
[Figure: CPU wired directly to Memory by address lines A0-A31 and data lines D0-D31; the CPU issues “physical addresses” of memory locations.]
All programs share one address space: the physical address space.
Machine language programs must be aware of the machine organization.
No way to prevent a program from accessing any machine resource.
Apple II: A physically addressed machine
Apple ][ (1977)
CPU: 1000 ns
DRAM: 400 ns
[Photo: Steve Jobs and Steve Wozniak]
The Limits of Physical Addressing
Programming the Apple ][ ...
[Figure: CPU wired directly to Memory by address lines A0-A31 and data lines D0-D31; the CPU issues “physical addresses” of memory locations.]
All programs share one address space: the physical address space.
Machine language programs must be aware of the machine organization.
No way to prevent a program from accessing any machine resource.
Solution: Add a Layer of Indirection
[Figure: the CPU issues “virtual addresses” on A0-A31; address translation hardware converts them to “physical addresses” on A0-A31 for Memory. Data lines D0-D31 connect CPU and Memory.]
User programs run in a standardized virtual address space.
Address translation hardware, managed by the operating system (OS), maps virtual addresses to physical memory.
Hardware supports “modern” OS features: Protection, Translation, Sharing
MIPS R4000: Address Space Model
ASID = Address Space Identifier
[Figure: two 32-bit address spaces, Process A (ASID = 12) and Process B (ASID = 13). In each, the lower 2 GB runs from 0 up to 2^31; the region from 2^31 to 2^32 - 1 is labeled “Address Error” and may only be accessed by kernel/supervisor.]
Process A and B have independent address spaces.
When Process A writes its address 9, it writes to a different physical memory location than Process B’s address 9.
To let Process A and B share memory, OS maps parts of ASID 12 and ASID 13 to the same physical memory locations.
All address spaces use a standard memory map.
Still works (slowly!) if a process accesses more virtual memory than the machine has physical memory.
MIPS R4000: Who’s Running on the CPU?
[Manual excerpt, section 4.3 System Control Coprocessor:]
The System Control Coprocessor (CP0) is implemented as an integral part of the CPU, and supports memory management, address translation, exception handling, and other privileged operations. CP0 contains the registers shown in Figure 4-7 plus a 48-entry TLB. The sections that follow describe how the processor uses the memory management-related registers †.
System Control Registers
Each CP0 register has a unique number that identifies it; this number is referred to as the register number. For instance, the Page Mask register is register number 5.
[Figure 4-7 CP0 Registers and the TLB. Registers with their register numbers: Index (0), Random (1), EntryLo0 (2), EntryLo1 (3), Context (4), Page Mask (5), Wired (6), BadVAddr (8), Count (9), EntryHi (10), Compare (11), Status (12), Cause (13), EPC (14), PRId (15), Config (16), LLAddr (17), WatchLo (18), WatchHi (19), XContext (20), ECC (26), CacheErr (27), TagLo (28), TagHi (29), ErrorEPC (30). The figure marks which registers are used with the memory management system and which are used with exception processing (see Chapter 5 for details). The TLB is drawn with entries 0 to 47, each spanning bits 127 to 0; the low entries are “safe” entries (see Random Register, contents of TLB Wired).]
Status (12): Indicates user, supervisor, or kernel mode.
EntryHi (10): 8-bit ASID field codes virtual address space ID.
User cannot write supervisor/kernel bits. Supervisor cannot write kernel bit.
User cannot change address translation configuration.
† For a description of CP0 data dependencies and hazards, please see Appendix F.
(Source: MIPS R4000 Microprocessor User's Manual, page 80)
MIPS Address Translation: How it works
[Figure: the CPU issues “virtual addresses” on A0-A31; the Translation Look-Aside Buffer (TLB) converts them to “physical addresses” on A0-A31 for Memory. Data lines D0-D31 connect CPU and Memory.]
Translation Look-Aside Buffer (TLB): A small fully-associative cache of mappings from virtual to physical addresses.
What is the table of mappings that it caches?
TLB also contains ASID and kernel/supervisor bits for virtual address.
Fast common case: Virtual address is in TLB, process has permission to read/write it.
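A rough software model of the fast path described above: every TLB entry is compared against the current ASID and virtual page number, and permissions are checked on a match. The field names, the 4 KB page size, and the exception classes are assumptions made for this sketch; real hardware compares all entries in parallel rather than looping.

```python
from dataclasses import dataclass

PAGE_BITS = 12                       # assumed 4 KB pages


@dataclass
class TLBEntry:
    asid: int                        # address space identifier
    vpn: int                         # virtual page number
    pfn: int                         # physical frame number
    writable: bool
    kernel_only: bool


class TLBMiss(Exception):
    pass


class ProtectionFault(Exception):
    pass


def translate(tlb, asid, vaddr, is_write=False, kernel_mode=False):
    """Translate vaddr for the given ASID, or raise TLBMiss / ProtectionFault."""
    vpn = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    for entry in tlb:                # fully associative: any entry may match
        if entry.asid == asid and entry.vpn == vpn:
            if entry.kernel_only and not kernel_mode:
                raise ProtectionFault()
            if is_write and not entry.writable:
                raise ProtectionFault()
            return (entry.pfn << PAGE_BITS) | offset
    raise TLBMiss()


# Address 9 in two address spaces lands in two different physical frames.
tlb = [
    TLBEntry(asid=12, vpn=0, pfn=0x100, writable=True, kernel_only=False),
    TLBEntry(asid=13, vpn=0, pfn=0x200, writable=True, kernel_only=False),
]
print(hex(translate(tlb, 12, 9)), hex(translate(tlb, 13, 9)))
```

Tagging each entry with an ASID is what lets entries from several processes sit in the TLB at once without being confused with each other.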
Page tables encode virtual address spaces
A virtual address space is divided into blocks of memory called pages.
OS manages the page table for each ASID.
A page table is indexed by a virtual address.
A valid page table entry codes physical memory “frame” address for the page.
[Figure: a virtual address indexes a Page Table (one per ASID), whose entries point to frames in the Physical Memory Space.]
A machine usually supports pages of a few sizes (MIPS R4000):

Page Size     Bit 24   Bit 23
4 Kbytes      0        0
16 Kbytes     0        0
64 Kbytes     0        0
256 Kbytes    0        0
1 Mbyte       0        0
4 Mbytes      0        0
16 Mbytes     1        1

(Table excerpt from the MIPS R4000 Microprocessor User's Manual; the remaining bit columns and the surrounding manual text are cut off on the slide.)
The TLB caches page table entries
[Figure: a Virtual Address is split into a virtual page number and a 10-bit offset. The Page Table Base Reg for the current ASID, plus the virtual page number, gives an index into the Page Table, which is located in physical memory. Each entry holds V (valid), Access Rights, and PA (physical frame address). The Physical Address is the physical page number followed by the unchanged 10-bit offset. The TLB caches page table entries.]
In this example, physical and virtual pages must be the same size!
V=0 pages either reside on disk or have not yet been allocated. OS handles V=0: “Page fault”.
MIPS handles TLB misses in software (random replacement). Other machines use hardware.
What is virtual memory?
° Virtual memory => treat memory as a cache for the disk
° Terminology: blocks in this cache are called “Pages”
• Typical size of a page: 1K to 8K
° Page table maps virtual page numbers to physical frames
• “PTE” = Page Table Entry
(Embedded slide: CS152 / Kubiatowicz, Lec21.22, 4/19/04, ©UCB Spring 2004)
Page tables may not fit in memory!
A table for 4KB pages for a 32-bit address space has 1M entries.
Each process needs its own address space!
Large Address Spaces: Two-level Page Tables
32-bit virtual address: P1 index (bits 31-22, 10 bits), P2 index (bits 21-12, 10 bits), page offset (bits 11-0, 12 bits)
[Figure: the P1 index selects a 4-byte entry in the top-level table; the P2 index selects a 4-byte PTE in a second-level table of 1K PTEs; each PTE maps a 4KB page.]
Top-level table wired in main memory.
Subset of 1024 second-level tables in main memory; rest are on disk or unallocated.
° 2 GB virtual address space
° 4 MB of PTE2 (paged, holes)
° 4 KB of PTE1
What about a 48-64 bit address space?
(Embedded slides: CS152 / Kubiatowicz, Lec21.25 and Lec21.26, 4/19/04, ©UCB Spring 2004)
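The 10/10/12 split and the two-step lookup can be sketched as follows. Nested dictionaries stand in for the tables, a missing key stands in for a second-level table or page that is on disk or unallocated, and the names `split` and `walk` are illustrative.

```python
def split(vaddr):
    """Split a 32-bit virtual address into (P1 index, P2 index, page offset)."""
    p1 = (vaddr >> 22) & 0x3FF       # bits 31-22
    p2 = (vaddr >> 12) & 0x3FF       # bits 21-12
    offset = vaddr & 0xFFF           # bits 11-0
    return p1, p2, offset


def walk(top_level, vaddr):
    """Return the physical address, or None if the mapping is not present."""
    p1, p2, offset = split(vaddr)
    second_level = top_level.get(p1)
    if second_level is None:         # second-level table on disk or unallocated
        return None
    frame = second_level.get(p2)
    if frame is None:                # page on disk or unallocated
        return None
    return (frame << 12) | offset


# Table sizes quoted on the slide: 1K entries of 4 bytes is 4 KB per table,
# and 1024 second-level tables of 4 KB each add up to 4 MB.
ENTRIES, PTE_BYTES = 1024, 4
print(ENTRIES * PTE_BYTES, ENTRIES * ENTRIES * PTE_BYTES)

top = {1: {3: 0x7A}}                 # hypothetical mapping for one page
print(hex(walk(top, 0x00403025)))
```

Only the top-level table has to stay resident; a second-level table exists only for the 4 MB regions a process actually uses, which is where the space saving comes from.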
What if a page resides on disk?
[Figure: the virtual-to-physical translation diagram from the embedded “What is virtual memory?” slide (CS152 / Kubiatowicz, Lec21.22, 4/19/04, ©UCB Spring 2004): virtual page number and 10-bit offset, Page Table Base Reg, page table entries with V, Access Rights, and PA, with the table located in physical memory.]
V=0 pages either reside on disk or have not yet been allocated. OS handles V=0: “Page fault”.
Question: What to do when a TLB miss causes an access to a page table entry with V=0?
VM and Disk: Page replacement policy
Dirty bit: page written.
Used bit: set to 1 on any reference.
[Figure: the set of all pages in memory, with a page table column of dirty and used bits, swept by a tail pointer and a head pointer; freed pages go onto a free list.]
Tail pointer: Clear the used bit in the page table.
Head pointer: Place pages on free list if used bit is still clear. Schedule pages with dirty bit set to be written to disk.
On page fault: deallocate page table entry of a page on the free list.
Architect’s role: support setting dirty and used bits
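One possible reading of the two-pointer scheme, as a sketch: a clearing hand resets used bits, and a checking hand that trails it by a fixed gap frees any page whose used bit is still clear. The gap, the one-page-per-step pacing, and the name `clock_step` are assumptions; the slide does not specify them.

```python
def clock_step(pages, clear_pos, gap, free_list, writeback):
    """Advance both hands by one page and return the new clearing position.

    pages is a list of dicts with boolean 'used' and 'dirty' fields.
    The checking hand visits the page that the clearing hand visited
    gap steps earlier.
    """
    n = len(pages)
    check_pos = (clear_pos - gap) % n
    candidate = pages[check_pos]
    if not candidate["used"] and check_pos not in free_list:
        if candidate["dirty"]:
            writeback.append(check_pos)      # schedule write to disk
        free_list.append(check_pos)          # page is reclaimable
    pages[clear_pos]["used"] = False         # clearing hand resets the used bit
    return (clear_pos + 1) % n


pages = [{"used": True, "dirty": i == 1} for i in range(4)]
free_list, writeback = [], []
pos = 0
pos = clock_step(pages, pos, 2, free_list, writeback)
pos = clock_step(pages, pos, 2, free_list, writeback)
pages[0]["used"] = True                      # page 0 is referenced again
pos = clock_step(pages, pos, 2, free_list, writeback)
pos = clock_step(pages, pos, 2, free_list, writeback)
print(free_list, writeback)
```

Page 0 survives because it was referenced between the two hands passing over it; page 1 was not, so it is freed and, being dirty, queued for writeback. The hardware's part is only to set the used and dirty bits on references and writes.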
Friday: Design document deadlines
[Figure: the Pipelined CPU connects over the IM Bus to the Instruction Cache and over the DM Bus to the Data Cache; the caches connect over the IC Bus and DC Bus to the DRAM Controller, which drives DRAM.]
Define the timing diagrams and signal names for the IM, DM, IC, DC buses.
List the bugs you will target in test benches.
Other items ...
Also: Lab 3 Peer Evaluations ...
TLB Design Concepts
MIPS R4000 TLB: A closer look ...
[Manual excerpt, Memory Management, 32-bit Mode Address Translation:]
Figure 4-2 shows the virtual-to-physical-address translation of a 32-bit mode address.
• The top portion of Figure 4-2 shows a virtual address with a 12-bit, or 4-Kbyte, page size, labelled Offset. The remaining 20 bits of the address represent the VPN, and index the 1M-entry page table.
• The bottom portion of Figure 4-2 shows a virtual address with a 24-bit, or 16-Mbyte, page size, labelled Offset. The remaining 8 bits of the address represent the VPN, and index the 256-entry page table.
[Figure: Virtual Address with 1M (2^20) 4-Kbyte pages: ASID (8 bits, bits 39-32), VPN (20 bits = 1M pages, bits 31-12), Offset (12 bits, bits 11-0). The TLB performs the virtual-to-physical translation and produces a 36-bit Physical Address (bits 35-0) made of the PFN and the Offset. Overlaid: the CPU, TLB, and Memory block diagram with A0-A31 and D0-D31.]
ASID is checked against CP0 ASID.
Bits 31, 30 and 29 of the virtual address select user, supervisor, or kernel address spaces.
Offset passed unchanged to physical memory.
Physical space larger than virtual space!
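The field widths quoted from the manual follow directly from the page size. A small sketch of the arithmetic, assuming power-of-two page sizes and a 32-bit virtual address; the function names are illustrative.

```python
def offset_bits(page_size):
    """log2 of a power-of-two page size."""
    return page_size.bit_length() - 1


def vpn_and_offset(vaddr, page_size):
    """Split a virtual address into (VPN, offset) for the given page size."""
    bits = offset_bits(page_size)
    return vaddr >> bits, vaddr & (page_size - 1)


def page_table_entries(page_size, va_bits=32):
    """Number of pages, and so page table entries, in the address space."""
    return 1 << (va_bits - offset_bits(page_size))


def physical_address(pfn, offset, page_size):
    """The offset passes through unchanged; only the page number is replaced."""
    return (pfn << offset_bits(page_size)) | offset


KB, MB = 1024, 1024 * 1024
print(page_table_entries(4 * KB))      # 4-Kbyte pages: 20-bit VPN
print(page_table_entries(16 * MB))     # 16-Mbyte pages: 8-bit VPN
```

With 4-Kbyte pages and a 36-bit physical address, the PFN is 24 bits wide, four bits wider than the 20-bit VPN, which is how the physical space ends up larger than the virtual space.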
Can TLB and caching be overlapped?
[Figure: the virtual address is split into Virtual Page Number and Page Offset. The page offset supplies the cache Index and Byte Select, which read the Cache Tags, Valid bits, and Cache Data, while the TLB translates the virtual page number into the physical Cache Tag. A comparator (=) on the tags produces Hit, and the selected Cache Block provides Data out.]
This works, but ...
Q. What is the downside?
A. Inflexibility. VPN size locked to cache tag size.
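One way to quantify the inflexibility: if the cache is indexed while the TLB is still translating, the index and byte-select bits have to come entirely from the page offset, so each way of the cache can be no larger than a page. The sketch below checks that condition; the function names and the example sizes are assumptions for illustration.

```python
def can_overlap(cache_bytes, associativity, page_bytes):
    """True if index + byte-select bits fit inside the page offset."""
    bytes_per_way = cache_bytes // associativity
    return bytes_per_way <= page_bytes


def max_overlapped_cache(page_bytes, associativity):
    """Largest cache that can still be indexed with untranslated bits."""
    return page_bytes * associativity


KB = 1024
print(can_overlap(4 * KB, 1, 4 * KB))     # direct-mapped 4 KB cache, 4 KB pages
print(can_overlap(16 * KB, 1, 4 * KB))    # direct-mapped 16 KB cache is too big
print(can_overlap(16 * KB, 4, 4 * KB))    # 4-way 16 KB cache fits again
```

Under this constraint the only ways to grow the cache are to raise associativity or to use larger pages, so cache sizing and page sizing stop being independent decisions.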
Can we cache virtual addresses?
[Figure: the CPU sends virtual addresses (A0-A31) to a Virtual Cache; on a miss the TLB translates them to physical addresses (A0-A31) for Main Memory. Data lines D0-D31 connect CPU, cache, and main memory.]
Only use TLB on a cache miss!
Downside: a subtle, fatal problem. What is it?
A. Synonym problem. If two address spaces share a physical frame, data may be in cache twice. Maintaining consistency is a nightmare.
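A toy model of the synonym problem, assuming a write-back cache tagged by (ASID, virtual address) and two hypothetical mappings onto the same physical location. All addresses and names are invented for the illustration.

```python
def synonym_demo():
    memory = {0x5000: 1}                       # one shared physical location
    # Hypothetical mappings: both address spaces point at 0x5000.
    page_table = {(12, 0x1000): 0x5000, (13, 0x2000): 0x5000}
    cache = {}                                 # (asid, vaddr) -> data

    def load(asid, vaddr):
        key = (asid, vaddr)
        if key not in cache:                   # translate only on a miss
            cache[key] = memory[page_table[key]]
        return cache[key]

    def store(asid, vaddr, value):
        load(asid, vaddr)                      # allocate the line on a write miss
        cache[(asid, vaddr)] = value           # write-back: memory not updated yet

    before = load(13, 0x2000)                  # process B caches the shared word
    store(12, 0x1000, 42)                      # process A writes via its synonym
    seen_by_a = load(12, 0x1000)
    seen_by_b = load(13, 0x2000)               # B still hits its stale copy
    return before, seen_by_a, seen_by_b, len(cache)


print(synonym_demo())
```

The shared word ends up in the cache twice, and process B keeps reading its old copy because nothing ties the two cache lines to the same physical frame.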
Conclusions
VM: Uniform memory models, protection, sharing.
A TLB acts as a fast cache for recent address translations.
Operating systems manage the page table and (often) the TLB.
Tuesday: How ECC memory works ...
Detecting and correcting RAM bit errors
Replacing lost network packets, recovering from disk drive failure
Detecting arbitrary bit errors in network packets