Virtual Memory Topics Motivations for VM Address translation

advertisement
Virtual Memory
Topics



Motivations for VM
Address translation
Accelerating translation with TLBs
Motivations for Virtual Memory
Use Physical DRAM as a Cache for the Disk


Address space of a process can exceed physical memory size
Sum of address spaces of multiple processes can exceed
physical memory
Simplify Memory Management

Multiple processes resident in main memory.
 Each process with its own address space

Only “active” code and data is actually in memory
 Allocate more memory to process as needed.
Provide Protection

One process can’t interfere with another.
 because they operate in different address spaces.

User process cannot access privileged information
 different sections of address spaces have different permissions.
–2–
DRAM a “Cache” for Disk
Full address space is quite large:


32-bit addresses:
~4,000,000,000 (4
billion) bytes
64-bit addresses: ~16,000,000,000,000,000,000 (16
quintillion) bytes
Disk storage is ~300X cheaper than DRAM storage


80 GB of DRAM: ~ $33,000
80 GB of disk:
~ $110
To access large amounts of data in a cost-effective
manner, the bulk of the data must be stored on disk
1GB: ~$200
80 GB: ~$110
4 MB: ~$500
SRAM
–3–
DRAM
Disk
Levels in Memory Hierarchy
cache
CPU
regs
Register
size:
speed:
$/Mbyte:
line size:
32 B
1 ns
8B
8B
C
a
c
h
e
32 B
Cache
32 KB-4MB
2 ns
$125/MB
32 B
larger, slower, cheaper
–4–
virtual memory
Memory
Memory
1024 MB
30 ns
$0.20/MB
4 KB
4 KB
disk
Disk Memory
100 GB
8 ms
$0.001/MB
DRAM vs. SRAM as a “Cache”
DRAM vs. disk is more extreme than SRAM vs. DRAM

Access latencies:
 DRAM ~10X slower than SRAM
 Disk ~100,000X slower than DRAM

Importance of exploiting spatial locality:
 First byte is ~100,000X slower than successive bytes on disk
» vs. ~4X improvement for page-mode vs. regular accesses to
DRAM

Bottom line:
 Design decisions made for DRAM caches driven by enormous cost
of misses
SRAM
–5–
DRAM
Disk
Impact of Properties on Design
If DRAM was to be organized similar to an SRAM cache, how
would we set the following design parameters?

Line size?
 Large, since disk better at transferring large blocks

Associativity?
 High, to mimimize miss rate

Write through or write back?
 Write back, since can’t afford to perform small writes to disk
What would the impact of these choices be on:

miss rate
 Extremely low. << 1%

hit time
 Must match cache/DRAM performance

miss latency
 Very high. ~20ms

tag storage overhead
 Low, relative to block size
–6–
Locating an Object in a “Cache”
SRAM Cache


Tag stored with cache line
Maps from cache block to memory blocks
 From cached to uncached form
 Save a few bits by only storing tag


No tag for block not in cache
Hardware retrieves information
 can quickly match against multiple tags
Object Name
X
= X?
Tag
Data
0:
D
243
1:
X
•
•
•
J
17
•
•
•
105
N-1:
–7–
“Cache”
Locating an Object in “Cache”
DRAM Cache


Each allocated page of virtual memory has entry in page
table
Mapping from virtual pages to physical pages
 From uncached form to cached form

Page table entry even if page not in memory
 Specifies disk address
 Only way to indicate where to find page

OS retrieves information
“Cache”
Page Table
Location
Data
Object Name
D:
0
0:
243
X
J:
On Disk
1:
17
•
•
•
105
X:
–8–
•
•
•
1
N-1:
A System with Physical Memory Only
Examples:

most Cray machines, early PCs, nearly all embedded systems,
etc.
Memory
Physical
Addresses
0:
1:
CPU
N-1:

–9–
Addresses generated by the CPU correspond directly to bytes in
physical memory
A System with Virtual Memory
Examples:

workstations, servers, modern PCs, etc.
0:
1:
Page Table
Virtual
Addresses
0:
1:
Memory
Physical
Addresses
CPU
P-1:
N-1:
Disk

Address Translation: Hardware converts virtual addresses to
physical addresses via OS-managed lookup table (page table)
– 10 –
Page Faults (like “Cache Misses”)
What if an object is on disk rather than in memory?


Page table entry indicates virtual address not in memory
OS exception handler invoked to move data from disk into
memory
 current process suspends, others can resume
 OS has full control over placement, etc.
Before fault
After fault
Memory
Memory
Page Table
Virtual
Addresses
Physical
Addresses
CPU
Virtual
Addresses
Physical
Addresses
CPU
Disk
– 11 –
Page Table
Disk
Servicing a Page Fault
(1) Initiate Block Read
Processor Signals Controller

Read block of length P
starting at disk address X
and store starting at memory
address Y
Read Occurs


Direct Memory Access (DMA)
Under control of I/O
controller
I / O Controller Signals
Completion


– 12 –
Interrupt processor
OS resumes suspended
process
Processor
Reg
(3) Read
Done
Cache
Memory-I/O bus
(2) DMA
Transfer
Memory
I/O
controller
disk
Disk
disk
Disk
Memory Management
Multiple processes can reside in physical memory.
How do we resolve address conflicts?

what if two processes access something at the same
address?
kernel virtual memory
stack
%esp
Memory mapped region
forshared libraries
Linux/x86
process
memory
image
– 13 –
memory invisible to
user code
the “brk” ptr
runtime heap (via malloc)
0
uninitialized data (.bss)
initialized data (.data)
program text (.text)
forbidden
Separate Virt. Addr. Spaces

Virtual and physical address spaces divided into equal-sized
blocks
 blocks are called “pages” (both virtual and physical)

Each process has its own virtual address space
 operating system controls how virtual pages as assigned to
physical memory
0
Virtual
Address
Space for
Process 1:
Address Translation
0
VP 1
VP 2
PP 2
...
N-1
PP 7
Virtual
Address
Space for
Process 2:
– 14 –
Physical
Address
Space
(DRAM)
0
VP 1
VP 2
PP 10
...
N-1
M-1
(e.g., read/only
library code)
Contrast: Macintosh Memory Model
MAC OS 1–9

Does not use traditional virtual memory
Shared Address Space
P1 Pointer Table
Process P1
A
B
“Handles”
P2 Pointer Table
C
Process P2
D
E
All program objects accessed through “handles”


– 15 –
Indirect reference through pointer table
Objects stored in shared global address space
Macintosh Memory Management
Allocation / Deallocation

Similar to free-list management of malloc/free
Compaction

Can move any object and just update the (unique) pointer in
pointer table
P1 Pointer Table
Shared Address Space
B
Process P1
A
“Handles”
P2 Pointer Table
C
Process P2
D
– 16 –
E
Mac vs. VM-Based Memory
Allocating, deallocating, and moving memory:

can be accomplished by both techniques
Block sizes:

Mac: variable-sized
 may be very small or very large

VM: fixed-size
 size is equal to one page (4KB on x86 Linux systems)
Allocating contiguous chunks of memory:


Mac: contiguous allocation is required
VM: can map contiguous range of virtual addresses to
disjoint ranges of physical addresses
Protection

– 17 –
Mac: “wild write” by one process can corrupt another’s
data
MAC OS X
“Modern” Operating System


Virtual memory with protection
Preemptive multitasking
 Other versions of MAC OS require processes to voluntarily
relinquish control
Based on MACH OS

– 18 –
Developed at CMU in late 1980’s
Motivation #3: Protection
Page table entry contains access rights information

hardware enforces this protection (trap into OS if
violation occurs)
Page Tables
Read? Write?
Process i:
No
PP 9
VP 1: Yes
Yes
PP 4
No
XXXXXXX
No
•
•
•
•
•
•
Read? Write?
– 19 –
Physical Addr
VP 0: Yes
VP 2:
Process j:
Memory
•
•
•
Physical Addr
VP 0: Yes
Yes
PP 6
VP 1: Yes
No
PP 9
VP 2:
No
XXXXXXX
No
•
•
•
•
•
•
0:
1:
•
•
•
N-1:
VM Address Translation
Virtual Address Space

V = {0, 1, …, N–1}
Physical Address Space


P = {0, 1, …, M–1}
M < N
Address Translation


MAP: V  P U {}
For virtual address a:
 MAP(a) = a’ if data at virtual address a at physical
address a’ in P
 MAP(a) =  if data at virtual address a not in physical
memory
» Either invalid or stored on disk
– 20 –
VM Address Translation: Hit
Processor
a
virtual address
– 21 –
Hardware
Addr Trans
Mechanism
Main
Memory
a'
part of the
physical address
on-chip
memory mgmt unit (MMU)
VM Address Translation: Miss
page fault
fault
handler
Processor
a
virtual address
– 22 –
Hardware
Addr Trans
Mechanism

Main
Memory
Secondary
memory
a'
part of the
physical address
on-chip
memory mgmt unit (MMU)
OS performs
this transfer
(only if miss)
VM Address Translation
Parameters



P = 2p = page size (bytes).
N = 2n = Virtual address limit
M = 2m = Physical address limit
n–1
p p–1
virtual page number
0
virtual address
page offset
address translation
m–1
p p–1
physical page number
page offset
0
physical address
Page offset bits don’t change as a result of translation
– 23 –
Page Tables
Virtual Page
Number
Memory resident
page table
(physical page
Valid or disk address)
1
1
0
1
1
1
0
1
0
1
– 24 –
Physical Memory
Disk Storage
(swap file or
regular file system file)
Address Translation via Page
Table
page table base register
VPN acts
as
table index
if valid=0
then page
not in memory
virtual address
n–1
p p–1
virtual page number (VPN)
page offset
valid access physical page number (PPN)
m–1
p p–1
physical page number (PPN)
page offset
physical address
– 25 –
0
0
Page Table Operation
Translation


Separate (set of) page table(s) per process
VPN forms index into page table (points to a page table entry)
page table base register
VPN acts
as
table index
if valid=0
then page
not in memory
virtual address
n–1
p p–1
virtual page number (VPN)
page offset
valid access physical page number (PPN)
m–1
p p–1
physical page number (PPN)
page offset
physical address
– 26 –
0
0
Page Table Operation
Computing Physical Address

Page Table Entry (PTE) provides information about page
 if (valid bit = 1) then the page is in memory.
» Use physical page number (PPN) to construct address
 if (valid bit = 0) then the page is on disk
» Page fault
page table base register
VPN acts
as
table index
if valid=0
then page
not in memory
– 27 –
virtual address
n–1
p p–1
virtual page number (VPN)
page offset
0
valid access physical page number (PPN)
m–1
p p–1
physical page number (PPN)
page offset
physical address
0
Page Table Operation
Checking Protection

Access rights field indicate allowable access
 e.g., read-only, read-write, execute-only
 typically support multiple protection modes (e.g., kernel vs. user)

Protection violation fault if user doesn’t have necessary
permission
page table base register
VPN acts
as
table index
if valid=0
then page
not in memory
– 28 –
virtual address
n–1
p p–1
virtual page number (VPN)
page offset
0
valid access physical page number (PPN)
m–1
p p–1
physical page number (PPN)
page offset
physical address
0
Integrating VM and Cache
VA
CPU
miss
PA
Translation
Cache
Main
Memory
hit
data
Most Caches “Physically Addressed”




Accessed by physical addresses
Allows multiple processes to have blocks in cache at same
time
Allows multiple processes to share pages
Cache doesn’t need to be concerned with protection issues
 Access rights checked as part of address translation
Perform Address Translation Before Cache Lookup


– 29 –
But this could involve a memory access itself (of the PTE)
Of course, page table entries can also become cached
Speeding up Translation with a TLB
“Translation Lookaside Buffer” (TLB)



Small hardware cache in MMU
Maps virtual page numbers to physical page numbers
Contains complete page table entries for small number of
pages
hit
PA
VA
CPU
miss
TLB
Lookup
miss
Cache
hit
Translation
data
– 30 –
Main
Memory
Address Translation with a TLB
n–1
p p–1
0
virtual page number page offset
valid
.
virtual address
tag physical page number
.
TLB
.
=
TLB hit
physical address
tag
index
valid tag
byte offset
data
Cache
=
cache hit
– 31 –
data
Simple Memory System Example
Addressing



14-bit virtual addresses
12-bit physical address
Page size = 64 bytes
13
12
11
10
9
8
7
6
5
4
VPN
– 32 –
10
2
1
0
VPO
(Virtual Page Offset)
(Virtual Page Number)
11
3
9
8
7
6
5
4
3
2
1
PPN
PPO
(Physical Page Number)
(Physical Page Offset)
0
Simple Memory System Page Table

– 33 –
Only show first 16 entries
VPN
PPN
Valid
VPN
PPN
Valid
00
28
1
08
13
1
01
–
0
09
17
1
02
33
1
0A
09
1
03
02
1
0B
–
0
04
–
0
0C
–
0
05
16
1
0D
2D
1
06
–
0
0E
11
1
07
–
0
0F
0D
1
Simple Memory System TLB
TLB


16 entries
4-way associative
TLBT
13
12
11
10
TLBI
9
8
7
6
5
4
3
VPN
2
1
0
VPO
Set
Tag
PPN
Valid
Tag
PPN
Valid
Tag
PPN
Valid
Tag
PPN
Valid
0
03
–
0
09
0D
1
00
–
0
07
02
1
1
03
2D
1
02
–
0
04
–
0
0A
–
0
2
02
–
0
08
–
0
06
–
0
03
–
0
3
07
–
0
03
0D
1
0A
34
1
02
–
0
– 34 –
Simple Memory System Cache
Cache



16 lines
4-byte line size
Direct mapped
CI
CT
11
10
9
8
7
6
5
4
PPN
CO
3
2
1
0
PPO
Idx
Tag
Valid
B0
B1
B2
B3
Idx
Tag
Valid
B0
B1
B2
B3
0
19
1
99
11
23
11
8
24
1
3A
00
51
89
1
15
0
–
–
–
–
9
2D
0
–
–
–
–
2
1B
1
00
02
04
08
A
2D
1
93
15
DA
3B
3
36
0
–
–
–
–
B
0B
0
–
–
–
–
4
32
1
43
6D
8F
09
C
12
0
–
–
–
–
5
0D
1
36
72
F0
1D
D
16
1
04
96
34
15
6
31
0
–
–
–
–
E
13
1
83
77
1B
D3
7
– 35 –
16
1
11
C2
DF
03
F
14
0
–
–
–
–
Address Translation Example #1
Virtual Address 0x03D4
TLBT
13
12
11
TLBI
10
9
8
7
6
5
4
3
VPN
VPN ___
2
1
0
VPO
TLBI ___TLBT ____ TLB Hit? __ Page Fault? __ PPN: ____
Physical Address
CI
CT
11
10
9
8
7
6
PPN
Offset ___ CI___
– 36 –
CT ____
5
4
CO
3
2
PPO
Hit? __
Byte: ____
1
0
Address Translation Example #2
Virtual Address 0x0B8F
TLBT
13
12
11
TLBI
10
9
8
7
6
5
4
3
VPN
VPN ___
2
1
0
VPO
TLBI ___TLBT ____ TLB Hit? __ Page Fault? __ PPN: ____
Physical Address
CI
CT
11
10
9
8
7
6
PPN
Offset ___ CI___
– 37 –
CT ____
5
4
CO
3
2
PPO
Hit? __
Byte: ____
1
0
Address Translation Example #3
Virtual Address 0x0040
TLBT
13
12
11
TLBI
10
9
8
7
6
5
4
3
VPN
VPN ___
2
1
0
VPO
TLBI ___TLBT ____ TLB Hit? __ Page Fault? __ PPN: ____
Physical Address
CI
CT
11
10
9
8
7
6
PPN
Offset ___ CI___
– 38 –
CT ____
5
4
CO
3
2
PPO
Hit? __
Byte: ____
1
0
Multi-Level Page Tables
Level 2
Tables
Given:



4KB (212) page size
32-bit address space
4-byte PTE
Problem:

Would need a 4 MB page table!
Level 1
Table
 220 *4 bytes
Common solution


multi-level page tables
e.g., 2-level table (P6)
 Level 1 table: 1024 entries, each
of which points to a Level 2 page
table.
 Level 2 table: 1024 entries, each
of which points to a page
– 39 –
...
Main Themes
Programmer’s View

Large “flat” address space
 Can allocate large blocks of contiguous addresses

Processor “owns” machine
 Has private address space
 Unaffected by behavior of other processes
System View

User virtual address space created by mapping to set of
pages
 Need not be contiguous
 Allocated dynamically
 Enforce protection during address translation

OS manages many processes simultaneously
 Continually switching among processes
 Especially when one must wait for resource
» E.g., disk I/O to handle page fault
– 40 –
Intel P6
Internal Designation for Successor to Pentium

Which had internal designation P5
Fundamentally Different from Pentium


Out-of-order, superscalar operation
Designed to handle server applications
 Requires high performance memory system
Resulting Processors


PentiumPro (1996)
Pentium II (1997)
 Incorporated MMX instructions
» special instructions for parallel processing
 L2 cache on same chip

Pentium III (1999)
 Incorporated Streaming SIMD Extensions
» More instructions for parallel processing
– 41 –
P6 Memory System
32 bit address space
4 KB page size
L1, L2, and TLBs
DRAM

external
system bus
(e.g. PCI)
inst TLB
L2
cache
instruction
fetch unit
processor package
– 42 –
L1
i-cache

32 entries

8 sets
data TLB
cache bus
bus interface unit
4-way set
associative
inst
TLB
data
TLB
L1
d-cache

64 entries

16 sets
L1 i-cache and d-cache

16 KB

32 B line size

128 sets
L2 cache

unified

128 KB -- 2 MB
Linux Organizes VM as Collection of
“Areas”
process virtual memory
task_struct
mm
vm_area_struct
mm_struct
pgd
mmap
vm_end
vm_start
vm_prot
vm_flags
vm_next

pgd:
 page directory address

vm_prot:
 read/write permissions
for this area

processes or private to
this process
0x40000000
data
0x0804a020
vm_next
vm_flags
 shared with other
– 43 –
vm_end
vm_start
vm_prot
vm_flags
shared libraries
text
vm_end
vm_start
vm_prot
vm_flags
vm_next
0x08048000
0
Linux Page Fault Handling
Is the VA legal?
process virtual memory
vm_area_struct

vm_end
vm_start
r/o

shared libraries
vm_next
1
read
vm_end
vm_start
r/w
3
vm_next
2


write
vm_end
vm_start
r/o
text
i.e., can the process
read/write this area?
if not then signal
protection violation
(e.g., (2))
If OK, handle fault

vm_next
0
– 44 –
if not then signal
segmentation violation
(e.g. (1))
Is the operation legal?
read
data
i.e. is it in an area
defined by a
vm_area_struct?
e.g., (3)
Memory Mapping
Creation of new VM area done via “memory mapping”


create new vm_area_struct and page tables for area
area can be backed by (i.e., get its initial values from) :
 regular file on disk (e.g., an executable object file)
» initial page bytes come from a section of a file
 nothing (e.g., bss)
» initial page bytes are zeros

dirty pages are swapped back and forth between a special
swap file.
Key point: no virtual pages are copied into physical
memory until they are referenced!


– 45 –
known as “demand paging”
crucial for time and space efficiency
User-Level Memory Mapping
void *mmap(void *start, int len,
int prot, int flags, int fd, int offset)

map len bytes starting at offset offset of the file specified
by file description fd, preferably at address start (usually 0
for don’t care).
 prot: MAP_READ, MAP_WRITE
 flags: MAP_PRIVATE, MAP_SHARED


return a pointer to the mapped area.
Example: fast file copy
 useful for applications like Web servers that need to quickly copy
files.
 mmap allows file transfers without copying into user space.
– 46 –
mmap() Example: Fast File Copy
#include
#include
#include
#include
#include
<unistd.h>
<sys/mman.h>
<sys/types.h>
<sys/stat.h>
<fcntl.h>
int main() {
struct stat stat;
int i, fd, size;
char *bufp;
/* open the file & get its size*/
fd = open("./mmap.c", O_RDONLY);
fstat(fd, &stat);
size = stat.st_size;
/*
* mmap.c - a program that uses mmap
* to copy itself to stdout
*/
/* map the file to a new VM area */
bufp = mmap(0, size, PROT_READ,
MAP_PRIVATE, fd, 0);
/* write the VM area to stdout */
write(1, bufp, size);
}
– 47 –
Exec() Revisited
To run a new program p in the
current process using
exec():
process-specific data
structures
(page tables,
task and mm structs)

physical memory
same
for each
process
kernel code/data/stack
0xc0
%esp
stack
Memory mapped region
for shared libraries

kernel
VM
demand-zero
process
VM
brk
runtime heap (via malloc)
– 48 –
0
uninitialized data (.bss)
initialized data (.data)
program text (.text)
forbidden
 stack, bss, data, text,
shared libs.
 text and data backed by ELF
executable object file.
 bss and stack initialized to
zero.
.data
.text
libc.so
demand-zero
.data
.text
p
free vm_area_struct’s and
page tables for old areas.
create new vm_area_struct’s
and page tables for new
areas.

set PC to entry point in
.text
 Linux will swap in code and
data pages as needed.
Fork() Revisited
To create a new process using fork():

make copies of the old process’s mm_struct,
vm_area_struct’s, and page tables.
 at this point the two processes are sharing all of their pages.
 How to get separate spaces without copying all the virtual
pages from one space to another?
» “copy on write” technique.

copy-on-write
 make pages of writeable areas read-only
 flag vm_area_struct’s for these areas as private “copy-on-
write”.
 writes by either process to these pages will cause page
faults.
» fault handler recognizes copy-on-write, makes a copy of
the page, and restores write permissions.

Net result:
 copies are deferred until absolutely necessary (i.e., when one
of the processes tries to modify a shared page).
– 49 –
Memory System Summary
Virtual Memory

Supports many OS-related functions
 Process creation
» Initial
» Forking children
 Task switching
 Protection

Combination of hardware & software implementation
 Software management of tables, allocations
 Hardware access of tables
 Hardware caching of table entries (TLB)
– 50 –
Download