23 - Pages

advertisement
Virtual Memory
1
Outline
• Multilevel page tables
• Different points of view
• Pentium/Linux Memory System
• Memory Mapping
• Suggested reading: 10.6, 10.3, 10.7, 10.8
2
10.6 Address Translation
10.6.3 Multi Level Page Tables
3
Multi-Level Page Tables
• Given:
– 4KB (212) page size
– 32-bit address space
– 4-byte PTE
• Problem:
– Would need a 4 MB page table!
• 220 *4 bytes
4
Multi-Level Page Tables
Level 2
Tables
• Common solution
– multi-level page tables
– e.g., 2-level table (P6)
• Level 1 table: 1024 entries,
Level 1
Table
each of which points to a
Level 2 page table.
• Level 2 table: 1024 entries,
...
each of which points to a
page
5
Multi-Level Page Tables
Figure 10.18 P710
6
Multi-Level Page Tables
Figure 10.19 P711
7
Representation of Virtual Address Space
PT 3
Page Directory
P=1, M=1
P=1, M=1
P=0, M=0
P=0, M=1
•
•
•
•
PT 2
PT 0
P=1, M=1
P=0, M=0
P=1, M=1
P=0, M=1
P=1, M=1
P=0, M=0
P=1, M=1
P=0, M=1
P=0, M=1
P=0, M=1
P=0, M=0
P=0, M=0
•
•
•
•
•
•
•
•
•
•
•
•
• Simplified Example
– 16 page virtual address space
• Flags
– P: Is entry in physical memory?
– M: Has this part of VA space been
mapped?
Page 15
Page 14
Page 13
Page 12
Page 11
Page 10
Page 9
Page 8
Page 7
Page 6
Page 5
Page 4
Mem Addr
Page 3
Disk Addr
Page 2
In Mem
Page 1
Page 0
On Disk
Unmapped
8
10.4 VM as a Tool for Memory Management
9
A Tool for Memory Management
• Separate virtual address space
– Each process has its own virtual address space
• Simplify linking, sharing, loading, and memory
allocation
10
A Tool for Memory Management
0
Virtual
Address
Space for
Process 1:
PP 2
Physical
Address
Space
(DRAM)
PP 7
(e.g., read/only
library code)
Address Translation
0
VP 1
VP 2
...
N-1
Virtual
Address
Space for
Process 2:
0
VP 1
PP 10
VP 2
...
M-1
N-1
11
A Tool for Memory Management
kernel virtual memory
%esp
Linux/x86
process
memory
image
memory invisible to
user code
stack
Memory mapped region
forshared libraries
the “brk” ptr
runtime heap (via malloc)
uninitialized data (.bss)
initialized data (.data)
program text (.text)
forbidden
Figure 10.10 P702
12
10.5 VM as a Tool for Memory Protection
13
A Tool for Memory Protection
• Page table entry contains access rights
information
– hardware enforces this protection (trap into OS if
violation occurs)
14
A Tool for Memory Protection
Page Tables
Read? Write?
Process i:
Physical Addr
VP 0: Yes
No
PP 9
VP 1: Yes
Yes
PP 4
No
XXXXXXX
VP 2:
No
•
•
•
•
•
•
Read? Write?
Process j:
Memory
•
•
•
Physical Addr
VP 0: Yes
Yes
PP 6
VP 1: Yes
No
PP 9
VP 2:
No
XXXXXXX
No
•
•
•
•
•
•
0:
1:
N-1:
•
•
•
15
A Tool for Memory Protection
Figure 10.11 P704
16
10.7 Case Study: The Pentium/Linux
Memory System
17
P6 Memory System
DRAM
external
system bus
(e.g. PCI)
2
L2
cache
1
3
cache bus
bus interface unit
instruction
fetch unit
3) L1
i-cache
processor package
Figure 10.22 P716
1) inst
TLB
2) data
TLB
4) L1
d-cache
• 32 bit address space
• 4 KB page size
• L1, L2, and TLBs
• 4-way set associative
• inst TLB
• 32 entries
• 8 sets
• data TLB
• 64 entries
• 16 sets
• L1 i-cache and d-cache
• 16 KB
• 32 B line size
• 128 sets
• L2 cache
• unified
• 128 KB -- 2 MB
• 32 B line size
18
10.7.1 Pentium Address Translation
19
P6 Address Translation
32
result
CPU
20
VPN
12
virtual address (VA)
VPO
...
TLB (16 sets,
4 entries/set)
10 10
VPN1 VPN2
PDE
PDBR
PTE
Page tables
L1
miss
L1
hit
16
4
TLBT TLBI
TLB
miss
L2 and DRAM
L1 (128 sets, 4 lines/set)
TLB
hit
20
PPN
...
20
CT
12
PPO
7 5
CI CO
physical
address (PA)
Figure 10.23 P717
20
P6 Page Table
• Page directory
– 1024 4-byte page directory entries
(PDEs) that point to page tables
– one page directory per process.
– page directory must be in memory
when its process is running
– always pointed to by PDBR
• Page tables:
– 1024 4-byte page table entries
(PTEs) that point to pages.
– page tables can be paged in and out.
Up to
1024
page
tables
1024
PTEs
page
directory
...
1024
PDEs
1024
PTEs
...
1024
PTEs
Figure 10.24 P718
21
P6 page directory entry (PDE)
31
12 11
Page table physical base addr
9
Avail
8
7
G
PS
6
5
A
4
3
2
1
0
CD WT U/S R/W P=1
Page table physical base address: 20 most significant bits of physical
page table address (forces page tables to be 4KB aligned)
Avail: available for system programmers
G: global page (don’t evict from TLB on task switch)
PS: page size 4K (0) or 4M (1)
A: accessed (set by MMU on reads and writes, cleared by software)
CD: cache disabled (1) or enabled (0)
WT: write-through or write-back cache policy for this page table
U/S: user or supervisor mode access
Figure 10.25 (a) P719
R/W: read-only or read-write access
P: page table is present in memory (1) or not (0)
31
1
Available for OS (page table location in secondary storage)
0
P=0
22
P6 page table entry (PTE)
31
12 11
Page physical base address
9
Avail
8
7
6
5
G
0
D
A
4
3
2
1
0
CD WT U/S R/W P=1
Page base address: 20 most significant bits of physical page
address (forces pages to be 4 KB aligned)
Avail: available for system programmers
G: global page (don’t evict from TLB on task switch)
D: dirty (set by MMU on writes)
A: accessed (set by MMU on reads and writes)
CD: cache disabled or enabled
WT: write-through or write-back cache policy for this page
U/S: user/supervisor
Figure 10.25 (b) P719
R/W: read/write
P: page is present in physical memory (1) or not (0)
31
1
Available for OS (page location in secondary storage)
0
P=0
23
Page tables Translation
10
VPN1
10
VPN2
word offset into
page directory
12
VPO
word offset into
page table
page directory
word offset into
physical and virtual
page
page table
PTE
PDE
PDBR
physical address
of page directory
Virtual address
physical address
of page table base
(if P=1)
20
PPN
Figure 10.26 P720
physical
address
of page base
(if P=1)
12
PPO
Physical address
24
P6 TLB translation
32
result
CPU
20
VPN
12
virtual address (VA)
VPO
...
TLB (16 sets,
4 entries/set)
10 10
VPN1 VPN2
PDE
PDBR
PTE
Page tables
Figure 10.27 P720
L1
miss
L1
hit
16
4
TLBT TLBI
TLB
miss
L2 andDRAM
L1 (128 sets, 4 lines/set)
TLB
hit
20
PPN
...
20
CT
12
PPO
7 5
CI CO
physical
address (PA)
25
P6 TLB
• TLB entry (not all documented, so this is speculative):
–
–
–
–
32
16
1
1
PDE/PTE
Tag
PD
V
V: indicates a valid (1) or invalid (0) TLB entry
PD: is this entry a PDE (1) or a PTE (0)?
tag: disambiguates entries cached in the same set
PDE/PTE: page directory or page table entry
• Structure of the data TLB:
– 16 sets, 4 entries/set
entry
entry
entry
entry
entry
entry
entry
entry
entry
entry
entry
entry
entry
entry
set 0
set 1
set 2
entry
entry
set 15
...
26
Translating with the P6 TLB P720
CPU
20
VPN
1. Partition VPN into TLBT
16
4
TLBT TLBI
TLB
miss
and TLBI.
12 virtual address
VPO
2. Is the PTE for VPN
cached in set TLBI?
1
2
PDE
...
page table translation
3. Yes: then build
TLB
PTE hit
physical address.
3
20
PPN
12
PPO
4. No: then read PTE (and
PDE if not cached) from
physical
address
memory and build
4
physical address.
27
P6 page table translation
32
result
CPU
20
VPN
12
VPO virtual address (VA)
...
TLB (16 sets,
4 entries/set)
10 10
VPN1 VPN2
PDE
PDBR
PTE
L1
miss
L1
hit
16
4
TLBT TLBI
TLB
miss
L2 andDRAM
L1 (128 sets, 4 lines/set)
TLB
hit
20
PPN
...
20
CT
12
PPO
7 5
CI CO
physical
address (PA)
Page tables
28
Translating with the P6 page tables (case 1/1)
20
VPN
• Case 1/1: page
table and page
present.
12
VPO
20
PPN
VPN1 VPN2
12
PPO
Mem
PDE p=1
PTE p=1
data
Page
table
Data
page
PDBR
Page
directory
• MMU Action:
– MMU build
physical address
and fetch data
word.
• OS action
– none
Disk
29
Translating with the P6 page tables (case 1/0)
20
VPN
•
12
VPO
•
VPN1 VPN2
Mem
PDE p=1
PTE p=0
Page
directory
Page
table
PDBR
Disk
data
Data
page
Case 1/0: page table
present but page
missing.
MMU Action:
– page fault
exception
– handler receives
the following args:
• VA that caused
fault
• fault caused by
non-present page
or page-level
protection
violation
• read/write
• user/supervisor
30
Translating with the P6 page tables (case 1/0, cont)
20
VPN
• OS Action:
12
VPO
20
PPN
VPN1 VPN2
Mem
PDE p=1
PTE p=1
data
Page
directory
Page
table
Data
page
PDBR
Disk
12
PPO
– Check for a legal
virtual address.
– Read PTE through
PDE.
– Find free physical
page (swapping out
current page if
necessary)
– Read virtual page
from disk and copy
to virtual page
– Restart faulting
instruction by
returning from
exception handler.
31
Translating with the P6 page tables (case 0/1)
20
VPN
• Case 0/1: page table
missing but page
present.
12
VPO
• Introduces consistency
issue.
VPN1 VPN2
Mem
PDE p=0
data
Page
directory
Data
page
PDBR
Disk
PTE p=1
Page
table
– potentially every
page out requires
update of disk page
table.
• Linux disallows this
– if a page table is
swapped out, then
swap out its data
pages too.
32
Translating with the P6 page tables (case 0/0)
20
VPN
12
VPO
• Case 0/0: page
VPN1 VPN2
table and page
missing.
PDE p=0
Mem
• MMU Action:
PDBR
– page fault
Page
directory
Disk
exception
PTE p=0
data
Page
table
Data
page
33
Translating with the P6 page tables (case 0/0, cont)
20
VPN
12
VPO
• OS action:
– swap in page table.
– restart faulting
VPN1 VPN2
instruction by
returning from
Mem
PDE p=1
PTE p=0
Page
directory
Page
table
PDBR
Disk
handler.
• Like case 0/1 from
here on.
data
Data
page
34
P6 L1 cache access
32
result
CPU
20
VPN
12
virtual address (VA)
VPO
...
TLB (16 sets,
4 entries/set)
10 10
VPN1 VPN2
PDE
PDBR
PTE
Page tables
L1
miss
L1
hit
16
4
TLBT TLBI
TLB
miss
L2 andDRAM
L1 (128 sets, 4 lines/set)
TLB
hit
20
PPN
...
20
CT
12
PPO
7 5
CI CO
physical
address (PA)
35
L1 cache access
32
data
L2 andDRAM
L1
miss
L1
hit
L1 (128 sets, 4 lines/set)
...
20
CT
7 5
CI CO
• Partition physical
address into CO, CI,
and CT.
• Use CT to determine if
line containing word at
address PA is cached
in set CI.
• If no: check L2.
• If yes: extract word
at byte offset CO and
return to processor.
physical
address (PA)
36
Speeding Up L1 Access
Tag Check
20
CT
7 5
CI CO
PPN
PPO
Physical address (PA)
Addr.
Trans.
virtual
address (VA)
No
Change
VPN
VPO
20
12
CI
• Observation
–
–
–
–
Bits that determine CI identical in virtual and physical address
Can index into cache while address translation being performed
Then check with CT from physical address
“Virtually indexed, physically tagged”
37
Linux Virtual Memory System
process-specific data
structures
(page tables,
task and mm structs)
physical memory
same
for each
process
0xc0000000
%esp
kernel code/data/stack
stack
Memory mapped region
for shared libraries
0x40000000
brk
runtime heap (via malloc)
0x08048000
Figure 10.28 P720
 kernel
 virtual
 memory
0
uninitialized data (.bss)
initialized data (.data)
program text (.text)
forbidden

 virtual

process
memory
38
Linux organizes VM as a collection of “areas”
task_struct
mm
mm_struct
pgd
mmap
vm_area_struct
vm_end
vm_start
vm_prot
vm_flags
vm_next
•
•
•
pgd:
– page directory address
vm_prot:
– read/write permissions
for this area
vm_flags
– shared with other
processes or private to
this process
Figure 10.29 P722
vm_end
vm_start
vm_prot
vm_flags
process virtual memory
shared libraries
0x40000000
data
0x0804a020
vm_next
text
vm_end
vm_start
vm_prot
vm_flags
vm_next
0x08048000
0
39
Linux page fault handling
process virtual memory
•
vm_area_struct
vm_end
vm_start
r/o
shared libraries
vm_next
1
read
vm_end
vm_start
r/w
3
data
read
vm_next
vm_end
vm_start
r/o
2
write
text
•
vm_next
0
Figure 10.30 P723
•
Is the VA legal?
– i.e. is it in an area
defined by a
vm_area_struct?
– if not then signal
segmentation violation
(e.g. (1))
Is the operation legal?
– i.e., can the process
read/write this area?
– if not then signal
protection violation
(e.g., (2))
If OK, handle fault
– e.g., (3)
40
10.8 Memory Mapping
41
Memory mapping
• Creation of new VM area done via “memory mapping”
– create new vm_area_struct and page tables for area
– area can be backed by (i.e., get its initial values from) :
• regular file on disk (e.g., an executable object file)
– initial page bytes come from a section of a
file
• nothing (e.g., bss)
– initial page bytes are zeros
– dirty pages are swapped back and forth between a special
swap file.
42
Memory mapping
• Key point: no virtual pages are copied into
physical memory until they are referenced!
– known as “demand paging”
– crucial for time and space efficiency
Crucial:至关重要的
43
10.8.3 The execve Function Revisited
44
Exec() revisited
process-specific data
structures
(page tables,
task and mm structs)
physical memory
same
for each
process
0xc0
%esp
kernel code/data/stack
stack
Memory mapped region
for shared libraries
brk
runtime heap (via malloc)
0
uninitialized data (.bss)
initialized data (.data)
program text (.text)
forbidden
• To run a new program p in the
current process using exec():
– free vm_area_struct’s and
page tables for old areas.
– create new vm_area_struct’s
and page tables for new areas.
kernel
VM
• stack, bss, data, text,
demand-zero
shared libs.
process
• text and data backed by
VM
ELF executable object file.
.data
• bss and stack initialized to
.text
zero.
libc.so
– set PC to entry point in .text
• Linux will swap in code and
demand-zero
data pages as needed.
.data
.text
p
45
10.8.4 User-Level Memory Mapping
with the mmap Function
46
User-level memory mapping P728
• void *mmap(void *start, int len, int prot, int flags,
int fd, int offset)
• map len bytes starting at offset offset of the file specified by
file description fd, preferably at address start (usually 0 for
don’t care).
– prot: MAP_READ, MAP_WRITE
– flags: MAP_PRIVATE, MAP_SHARED
• return a pointer to the mapped area.
• Example: fast file copy
– useful for applications like Web servers that need to quickly copy files.
– mmap allows file transfers without copying into user space.
47
mmap() example: fast file copy
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
/*
* mmapcopy - uses mmap to copy
* file fd to stdout
*/
void mmapcopy(int fd, int size)
{
char *bufp;
/* mmapcopy driver */
int main(int argc, char **argv)
{
struct stat stat;
/* check for required command line
argument */
if ( argc != 2 ) {
printf(“usage: %s <filename>\n”,
argv[0]);
exit(0) ;
}
/* open the file and get its size*/
fd = open(argv[1], O_RDONLY, 0);
fstat(fd, &stat);
mmapcopy(fd, stat.st_size) ;
exit(0);
/* map the file to a new VM area */
bufp = Mmap(0, size, PROT_READ,
MAP_PRIVATE, fd, 0);
/* write the VM area to stdout */
write(1, bufp, size);
return ;
}
}
Practice Problem 10.5 P730 --- P772
48
10.8.2 The fork Function Revisited
49
Fork() revisted
• To create a new process using fork:
– make copies of the old process’s mm_struct,
vm_area_struct’s, and page tables.
• at this point the two processes are sharing all of their pages.
• How to get separate spaces without copying all the virtual
pages from one space to another?
– “copy on write” technique.
50
Fork() revisted
• To create a new process using fork:
– copy-on-write
• make pages of writeable areas read-only
• flag vm_area_struct’s for these areas as private “copy-onwrite”.
• writes by either process to these pages will cause page faults.
– fault handler recognizes copy-on-write, makes a copy of the
page, and restores write permissions.
51
10.8.1 Shared Objects Revisited
52
Shared Object P725
• shared object
– An object which is mapped into an area of virtual
memory of a process
– Any writes that the process makes to that area are
visible to any other processes that have also mapped
the shared object into their virtual memory
– The changes are also reflected in the original object
on disk.
• shared area
– A virtual memory area that a shared object is mapped
53
Private object
• private object
– As oppose to shared object
– Changes made to an area mapped to a private
object are not visible to other processes
– Any writes that the process makes to the area
are not reflected back to the object on disk.
• private area
– Similar to shared area
54
Figure 10.31 (a) P725
55
Figure 10.31 (b) P725
56
Figure 10.32 (a) P726
57
Figure 10.32 (b) P726
58
Copy-on-Write
• A private object begins life in exactly the
same way as a shared object, with only one
copy of the private object stored in physical
memory.
59
Copy-on-Write
• For each process that maps the private object
– The page table entries for the corresponding private
area are flagged as read-only
– The area struct is flagged as private copy-on-write
– So long as neither process attempts to write to its
respective private area, they continue to share a
single copy of the object in physical memory.
60
Copy-on-Write
• For each process that maps the private object
– As soon as a process attempts to write to some page
in the private area, the write triggers a protection
fault
– The fault handler notices that the protection
exception was caused by the process trying to write
to a page in a private copy-on-write area
61
Copy-on-Write
• For each process that maps the private object
– The fault handler
• Creates a new copy of the page in physical memory
• Updates the page table entry to point to the new copy
• Restores write permissions to the page
62
Fork() revisted
• To create a new process using fork:
– Net result:
– copies are deferred until absolutely necessary
(i.e., when one of the processes tries to
modify a shared page).
63
Download