Virtual Memory & Address Translation Vivek Pai Princeton University

General Memory Problem
• We have a limited (expensive) physical resource:
main memory
• We want to use it as efficiently as possible
• We have an abundant, slower resource: disk
Oct 9, 2001
Virtual Memory & Translation
2
Lots of Variants
• Many programs, total size less than memory
– Technically possible to pack them together
– Will programs know about each other’s existence?
• One program, using lots of memory
– Can you only keep part of the program in memory?
• Lots of programs, total size exceeds memory
– What programs are in memory, and how to decide?
History Versus Present
• History
– Each variant had its own solution
– Solutions have different hardware requirements
– Some solutions software/programmer visible
• Present – general-purpose microprocessors
– One mechanism used for all of these cases
• Present – less capable microprocessors
– May still use “historical” approaches
Many Programs, Small Total Size
• Observation: we can pack them into memory
• Requirements by segments
– Text: maybe contiguous
– Data: keep contiguous, “relocate” at start
– Stack: assume contiguous, fixed size
• Just set pointer at start, reserve space
– Heap: no need to make it contiguous
Many Programs, Small Total Size
• Software approach
– Just find appropriate space for data & code
segments
– Adjust any pointers to globals/functions in the code
– Heap, stack “automatically” adjustable
• Hardware approach
– Pointer to data segment
– All accesses to globals indirected
One Program, Lots of Memory
• Observations: locality
– Instructions in a function generally related
– Stack accesses generally in current stack frame
– Not all globals used all the time
• Goal: keep recently-used portions in memory
– Explicit: programmer/compiler reserves, controls part
of memory space – “overlays”
– Note: limited resource may be address space
Many Programs, Lots of Memory
• Software approach
– Keep only subset of programs in memory
– When loading a program, evict any programs that use the
same memory regions
– “Swap” programs in/out as needed
• Hardware approach
– Don’t permanently associate any address of any program to
any part of physical memory
• Note: doesn’t address problem of too few address bits
Why Virtual Memory?
• Use secondary storage($)
– Extend DRAM($$$) with reasonable performance
• Protection
– Programs do not step over each other
– Communications require explicit IPC operations
• Convenience
– Flat address space
– Programs have the same view of the world
How To Translate
• Must have some “mapping” mechanism
• Mapping must have some granularity
– Granularity determines flexibility
– Finer granularity requires more mapping info
• Extremes:
– Any byte to any byte: the mapping info is as large as the program itself
– Map whole segments: large segments are problematic
Translation Options
• Granularity
– Small # of big fixed/flexible regions – segments
– Large # of fixed regions – pages
• Visibility
– Translation mechanism integral to instruction set – segments
– Mechanism partly visible, external to processor – obsolete
– Mechanism part of processor, visible to OS – pages
Translation Overview
[Figure: the CPU issues a virtual address to the Translation unit (MMU), which produces a physical address for physical memory; an I/O device is also shown]
• Actual translation is in hardware (MMU)
• Controlled in software
• CPU view
– what program sees, virtual memory
• Memory view
– physical memory
Goals of Translation
• Implicit translation for each memory reference
• A hit should be very fast
• Trigger an exception on a miss
• Protected from user’s faults
[Figure: memory hierarchy with relative access costs: Registers, Cache(s) (10x), DRAM (100x), Disk (10Mx); the figure carries a "paging" label]
Base and Bound
[Figure: the virtual address is compared (>) against the bound register, which raises an error on violation; otherwise it is added (+) to the base register to form the physical address]
• Built in Cray-1
• A program can only access physical memory in [base, base+bound]
• On a context switch: save/restore base, bound registers
• Pros: Simple
• Cons: fragmentation, hard to share, and difficult to use disks
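The base-and-bound check can be sketched in a few lines of Python. This is an illustrative model only, not the Cray-1's actual logic; the names `translate_base_bound` and `TranslationError` are hypothetical, and the comparison follows the slide's inclusive range [base, base+bound].

```python
class TranslationError(Exception):
    """Raised when a virtual address falls outside the permitted range."""


def translate_base_bound(vaddr: int, base: int, bound: int) -> int:
    """Return the physical address for vaddr, or raise on a bound violation."""
    # Compare against the bound register first (the ">" box in the figure).
    if vaddr < 0 or vaddr > bound:
        raise TranslationError(
            f"virtual address {vaddr:#x} exceeds bound {bound:#x}"
        )
    # Relocate by adding the base register (the "+" box in the figure).
    return base + vaddr
```

A context switch in this model amounts to calling the function with a different (base, bound) pair, which is why only those two registers need to be saved and restored.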
Segmentation
[Figure: virtual address = (segment, offset); the segment number selects a (seg, size) entry from the table; the offset is compared (>) against size, raising an error on violation; otherwise seg + offset gives the physical address]
• Have a table of (seg, size)
• Protection: each entry has
– (nil, read, write, exec)
• On a context switch: save/restore the table or a pointer to the table in kernel memory
• Pros: Efficient, easy to share
• Cons: Complex management and fragmentation within a segment
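A minimal sketch of segment-table translation follows. The 16-bit offset field, the `Segment` record, and the use of a string such as "rw" for the protection bits are assumptions made for illustration; the slide does not fix any of them.

```python
from dataclasses import dataclass

OFFSET_BITS = 16  # assumed split: the low 16 bits of the address are the offset


class SegmentFault(Exception):
    """Raised on a bad segment number, an out-of-range offset, or a denied access."""


@dataclass
class Segment:
    base: int   # physical start address of the segment (the "seg" field)
    size: int   # segment length in bytes
    perms: str  # subset of "rwx"; the empty string models "nil"


def translate_segment(vaddr: int, table: list, access: str) -> int:
    """Translate vaddr for an access of kind 'r', 'w', or 'x'."""
    seg_num = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if seg_num >= len(table):
        raise SegmentFault(f"no segment {seg_num}")
    seg = table[seg_num]
    # Size check (the ">" box in the figure).
    if offset >= seg.size:
        raise SegmentFault(f"offset {offset:#x} outside segment {seg_num}")
    # Protection check against the entry's access bits.
    if len(access) != 1 or access not in seg.perms:
        raise SegmentFault(f"{access!r} access denied on segment {seg_num}")
    return seg.base + offset
```

Sharing falls out naturally in this model: two processes can hold table entries with the same `base` and `size` but different `perms`.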
Paging
[Figure: virtual address = (VPage #, offset); VPage # is compared (>) against the page table size, raising an error on violation, and indexes the page table; the selected entry's PPage # is combined with the offset to form the physical address]
• Use a page table to translate
• Various bits in each entry
• Context switch: similar to the segmentation scheme
• What should be the page size?
• Pros: simple allocation, easy to share
• Cons: big table & cannot deal with holes easily
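The single-level lookup can be sketched as below, assuming 4 KB pages and a page table modeled as a Python list indexed by virtual page number. The names `translate_page` and `PageFault` are hypothetical, and `None` stands in for an invalid entry.

```python
OFFSET_BITS = 12              # assumed 4 KB pages
PAGE_SIZE = 1 << OFFSET_BITS


class PageFault(Exception):
    """Raised when the virtual page has no valid mapping."""


def translate_page(vaddr: int, page_table: list) -> int:
    """Translate vaddr using a linear page table indexed by VPage #."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    # Check against the page table size (the ">" box in the figure).
    if vpn >= len(page_table):
        raise PageFault(f"VPage {vpn} beyond page table size")
    ppn = page_table[vpn]
    if ppn is None:
        raise PageFault(f"VPage {vpn} is not mapped")
    # The offset passes through unchanged; only the page number is replaced.
    return (ppn << OFFSET_BITS) | offset
```

Note that an unmapped page in the middle of the address space still occupies a slot in the list, which is one way to see why a linear table handles holes poorly.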
How Many PTEs Do We Need?
• Assume 4KB page
– Equals “low order” 12 bits
• Worst case for 32-bit address machine
– # of processes × 2^20
• What about 64-bit address machine?
– # of processes × 2^52
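The counts above follow from subtracting the 12 offset bits from the address width. The short calculation below reproduces them; the 4-byte PTE size used for the memory estimate is an assumption, not something the slide states, and `num_ptes` is a hypothetical helper that expects a power-of-two page size.

```python
def num_ptes(address_bits: int, page_size: int = 4096) -> int:
    """Worst-case PTEs per process for a flat (single-level) page table."""
    offset_bits = page_size.bit_length() - 1  # 12 for a 4 KB page
    return 1 << (address_bits - offset_bits)


PTE_BYTES = 4  # assumption: 4 bytes per page table entry

print(num_ptes(32))                         # 1048576, i.e. 2^20
print(num_ptes(32) * PTE_BYTES // 2**20)    # 4 (MiB of page table per process)
print(num_ptes(64))                         # 4503599627370496, i.e. 2^52
```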
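Segmentation with Paging
[Figure: virtual address = (Vseg #, VPage #, offset); Vseg # selects a (seg, size) entry from the segment table; VPage # is compared (>) against size, raising an error on violation, and indexes that segment's page table; the PPage # found there is combined with the offset to form the physical address]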
Multiple-Level Page Tables
[Figure: virtual address = (dir, table, offset); dir indexes the directory, whose entry points to a page table; table indexes that page table to find the pte]
What does this buy us? Sparse address spaces and easier paging
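A two-level walk can be sketched as follows. The 10/10/12 bit split of a 32-bit address is an assumption for illustration, as are the helper names; dictionaries stand in for the directory and the second-level tables so that only the tables actually in use are allocated.

```python
DIR_BITS, TABLE_BITS, OFFSET_BITS = 10, 10, 12  # assumed 32-bit split


class PageFault(Exception):
    """Raised when either level of the walk finds no valid entry."""


def split(vaddr: int):
    """Break a virtual address into (dir index, table index, offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    table_idx = (vaddr >> OFFSET_BITS) & ((1 << TABLE_BITS) - 1)
    dir_idx = (vaddr >> (OFFSET_BITS + TABLE_BITS)) & ((1 << DIR_BITS) - 1)
    return dir_idx, table_idx, offset


def map_page(directory: dict, vaddr: int, ppn: int) -> None:
    """Install a mapping, allocating the second-level table on demand."""
    dir_idx, table_idx, _ = split(vaddr)
    directory.setdefault(dir_idx, {})[table_idx] = ppn


def translate(directory: dict, vaddr: int) -> int:
    """Walk directory, then page table, then combine with the offset."""
    dir_idx, table_idx, offset = split(vaddr)
    table = directory.get(dir_idx)
    if table is None:
        raise PageFault(f"no page table for directory entry {dir_idx}")
    ppn = table.get(table_idx)
    if ppn is None:
        raise PageFault(f"no PTE at table index {table_idx}")
    return (ppn << OFFSET_BITS) | offset
```

Mapping one page near the bottom of the address space and one near the top allocates just two second-level tables, which is the sparse-address-space benefit the slide refers to.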
Inverted Page Tables
[Figure: virtual address = (pid, vpage, offset); (pid, vpage) is looked up in the inverted page table, which has entries 0 to n-1; a match at entry k yields physical address = (k, offset)]
• Main idea
– One PTE for each physical page frame
– Hash (Vpage, pid) to Ppage#
• Pros
– Small page table for large address space
• Cons
– Lookup is difficult
– Overhead of managing hash chains, etc
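The inverted table can be sketched as one entry per physical frame plus a hash from (pid, vpage) to the frame number k. In this illustrative model a Python dict plays the role of the hash chains, and the class and method names are hypothetical; 4 KB pages are assumed.

```python
OFFSET_BITS = 12  # assumed 4 KB pages


class PageFault(Exception):
    """Raised when (pid, vpage) is not present in the inverted table."""


class InvertedPageTable:
    def __init__(self, num_frames: int):
        # Entry k describes physical frame k: (pid, vpage) or None if free.
        self.frames = [None] * num_frames
        # Hash from (pid, vpage) to frame number k.
        self.index = {}

    def map(self, pid: int, vpage: int) -> int:
        """Place (pid, vpage) in the first free frame and return its number."""
        key = (pid, vpage)
        if key in self.index:
            return self.index[key]
        k = self.frames.index(None)  # raises ValueError if memory is full
        self.frames[k] = key
        self.index[key] = k
        return k

    def translate(self, pid: int, vaddr: int) -> int:
        vpage = vaddr >> OFFSET_BITS
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        k = self.index.get((pid, vpage))
        if k is None:
            raise PageFault(f"pid {pid} vpage {vpage:#x} not resident")
        return (k << OFFSET_BITS) | offset
```

The table size here depends only on `num_frames`, not on the size of any process's virtual address space, which is the advantage listed under Pros.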
Virtual-To-Physical Lookups
• Programs only know virtual addresses
• Each virtual address must be translated
– May involve walking hierarchical page table
– Page table stored in memory
– So, each program memory access requires several
actual memory accesses
• Solution: cache “active” part of page table
Translation Look-aside Buffer (TLB)
[Figure: virtual address = (VPage #, offset); VPage # is matched against the TLB's (VPage#, PPage#) entries; on a hit, the PPage # is combined with the offset to form the physical address; on a miss, the real page table is consulted]
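A TLB can be modeled as a small cache in front of the page table. The sketch below is a software illustration only: the `TLB` class is hypothetical, a dict stands in for the real page table, and LRU replacement is an assumption chosen for simplicity (real hardware more often uses random or pseudo-LRU).

```python
from collections import OrderedDict

OFFSET_BITS = 12  # assumed 4 KB pages


class TLB:
    def __init__(self, capacity: int, page_table: dict):
        self.capacity = capacity
        self.page_table = page_table   # the "real page table": vpn -> ppn
        self.entries = OrderedDict()   # cached vpn -> ppn, oldest first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr: int) -> int:
        vpn = vaddr >> OFFSET_BITS
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)         # mark as most recently used
        else:
            self.misses += 1
            ppn = self.page_table[vpn]            # KeyError models a page fault
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[vpn] = ppn
        return (self.entries[vpn] << OFFSET_BITS) | offset
```

Repeated accesses to the same page hit in `entries` and never touch `page_table`, which is the saving that motivates the TLB.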
Bits in A TLB Entry
• Common (necessary) bits
– Virtual page number: match with the virtual address
– Physical page number: translated address
– Valid
– Access bits: kernel and user (nil, read, write)
• Optional (useful) bits
– Process tag
– Reference
– Modify
– Cacheable
Hardware-Controlled TLB
• On a TLB miss
– Hardware loads the PTE into the TLB
• Need to write back if there is no free entry
– Generate a fault if the page containing the PTE is invalid
– VM software performs fault handling
– Restart the CPU
• On a TLB hit, hardware checks the valid bit
– If valid, pointer to page frame in memory
– If invalid, the hardware generates a page fault
• Perform page fault handling
• Restart the faulting instruction
Software-Controlled TLB
• On a miss in TLB
– Write back if there is no free entry
– Check if the page containing the PTE is in memory
– If no, perform page fault handling
– Load the PTE into the TLB
– Restart the faulting instruction
• On a hit in TLB, the hardware checks valid bit
– If valid, pointer to page frame in memory
– If invalid, the hardware generates a page fault
• Perform page fault handling
• Restart the faulting instruction
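The miss-handling steps can be sketched as a software handler. Everything here is a simplified model under stated assumptions: the `SoftTLB` class is hypothetical, FIFO victim selection is assumed, 1024 PTEs per page-table page is assumed, page fault handling is reduced to marking the page-table page resident, and the valid-bit check on a hit is omitted.

```python
OFFSET_BITS = 12       # assumed 4 KB pages
PTES_PER_PAGE = 1024   # assumption: 4-byte PTEs packed into 4 KB pages


class SoftTLB:
    def __init__(self, capacity, page_table, resident_pt_pages):
        self.capacity = capacity
        self.page_table = page_table       # vpn -> {"ppn": int, "dirty": bool}
        self.resident = resident_pt_pages  # page-table pages currently in memory
        self.entries = {}                  # cached copies of PTEs, oldest first
        self.log = []                      # record of handler actions

    def handle_miss(self, vpn):
        # Step 1: write back if there is no free entry.
        if len(self.entries) >= self.capacity:
            victim, entry = next(iter(self.entries.items()))
            self.page_table[victim]["dirty"] = entry["dirty"]
            del self.entries[victim]
            self.log.append(f"evict {victim}")
        # Steps 2-3: is the page containing the PTE in memory? If not, fault it in.
        pt_page = vpn // PTES_PER_PAGE
        if pt_page not in self.resident:
            self.resident.add(pt_page)     # stand-in for page fault handling
            self.log.append(f"page-in pt page {pt_page}")
        # Step 4: load the PTE into the TLB.
        self.entries[vpn] = dict(self.page_table[vpn])
        self.log.append(f"load {vpn}")

    def access(self, vaddr, write=False):
        vpn = vaddr >> OFFSET_BITS
        # Step 5: after the handler returns, the access is retried.
        while vpn not in self.entries:
            self.handle_miss(vpn)
        entry = self.entries[vpn]
        if write:
            entry["dirty"] = True          # modify bit lives in the TLB copy
        return (entry["ppn"] << OFFSET_BITS) | (vaddr & ((1 << OFFSET_BITS) - 1))
```

Because the modify bit is set only in the cached copy, the write-back in step 1 is what keeps the in-memory page table from losing that information when an entry is evicted.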
Hardware vs. Software Controlled
• Hardware approach
– Efficient
– Inflexible
– Need more space for page table
• Software approach
– Flexible
– Software can do mappings by hashing
• PP# → (Pid, VP#)
• (Pid, VP#) → PP#
– Can deal with large virtual address space
Cache vs. TLBs
• Similarities
– Both cache a portion of memory
– Both write back on a miss
• Differences
– Associativity
• TLB is usually fully set-associative
• Cache can be direct-mapped
– Consistency
• TLB does not deal with consistency with memory
• TLB can be controlled by software
• Combine L1 cache with TLB
– Virtually addressed cache
– Why wouldn’t everyone use virtually addressed caches?
Caches vs. TLBs
Similarities
• Both cache a portion of
memory
• Both read from memory on
misses
Differences
• Associativity
– TLBs generally fully associative
– Caches can be direct-mapped
• Consistency
– No TLB/memory consistency
– Some TLBs software-controlled
Combining L1 caches with TLBs
• Virtually addressed caches
• Not always used – what are their drawbacks?
Issues
• Which TLB entry should be replaced?
– Random
– Pseudo LRU
• What happens on a context switch?
– Process tag: change TLB registers and process register
– No process tag: Invalidate the entire TLB contents
• What happens when changing a page table entry?
– Change the entry in memory
– Invalidate the TLB entry
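The context-switch and invalidation cases above can be sketched with a toy TLB whose entries are optionally keyed by a process tag. The `TaggedTLB` class and its method names are hypothetical, and replacement policy is left out to keep the focus on tagging.

```python
class TaggedTLB:
    def __init__(self, use_process_tag: bool):
        self.use_process_tag = use_process_tag
        self.current_pid = 0   # models the "process register"
        self.entries = {}      # (tag, vpn) -> ppn

    def _key(self, pid, vpn):
        # Without tags every entry shares one namespace.
        return (pid if self.use_process_tag else None, vpn)

    def insert(self, vpn, ppn):
        self.entries[self._key(self.current_pid, vpn)] = ppn

    def lookup(self, vpn):
        """Return the cached ppn, or None on a miss."""
        return self.entries.get(self._key(self.current_pid, vpn))

    def context_switch(self, new_pid):
        self.current_pid = new_pid
        if not self.use_process_tag:
            # No tag: stale entries would alias, so drop everything.
            self.entries.clear()

    def invalidate(self, pid, vpn):
        """Call after changing the PTE for (pid, vpn) in memory."""
        self.entries.pop(self._key(pid, vpn), None)
```

With tags enabled, a process that is switched back in finds its earlier entries still cached; without tags, every switch starts from an empty TLB.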
Consistency Issues
• Snoopy cache protocols can maintain consistency with
DRAM, even when DMA happens
• No hardware maintains consistency between DRAM
and TLBs: you need to flush related TLBs whenever
changing a page table entry in memory
• On multiprocessors, when you modify a page table
entry, you need to do “TLB shoot-down” to flush all
related TLB entries on all processors
Issues to Ponder
• Everyone’s moving to hardware TLB
management – why?
• Segmentation was/is a way of maintaining
backward compatibility – how?
• For the hardware-inclined – what kind of
hardware support is needed for everything we
discussed today?