Virtual Memory - BYU Computer Science Students Homepage Index

CS 345
Virtual Memory
Chapter 8
Objectives
Topics to Cover…








Program Execution Patterns
Computer Memory
Virtual Memory
Paging
Segmentation
Performance
Replacement Algorithms
Paging Improvements
BYU CS 345
Virtual Memory
2
Program Execution

What are the characteristics of an executing program?

Characteristics of an executing program:
    has code that is unused
    is allocated more memory than is needed
    has features that are used rarely
A program’s instructions must be in main memory to execute, even though…
    the entire program is not always executing
    the address space of the program could be broken up across available frames (paging)

Computer Memory

What are the implications of lack of memory?

Lack of memory has serious implications
    What if a program “grows” while executing?
    What about moving to a new machine?
Execution of a program that is not ALL in physical memory would be advantageous.
    larger address space possible
    more programs could be in memory
    less I/O needed to get a process going
    unused modules would not be loaded

Early Memory Solutions

What were early solutions to lack of memory?

All larger programs had to contain logic for managing two-level storage.
    The non-volatile hard drive was used to store data and code.
    Programs were responsible for moving “overlays” back and forth from primary to secondary storage.
Multi-programming had to use “base and bounds registers” to manage, allocate, and reallocate memory.

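The base-and-bounds idea can be sketched in a few lines. This is only an illustration: the function name `relocate` and the register values are assumptions made for the example, not part of any particular machine.

```python
def relocate(logical_addr, base, bounds):
    """Relocate a logical address with hypothetical base/bounds registers.

    base   -- physical address where the program's memory starts
    bounds -- size of the program's memory; valid addresses are 0..bounds-1
    """
    if not 0 <= logical_addr < bounds:
        # Real hardware would raise a protection trap here.
        raise MemoryError(f"address {logical_addr:#x} is outside bounds {bounds:#x}")
    return base + logical_addr
```

Moving a program then only requires copying its memory and changing `base`; the program's own addresses stay the same.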
Virtual Memory to the Rescue!

1961 - First virtual memory machine, Atlas Computer project at the University of Manchester in the UK.
1962 - First commercial system, Burroughs B5000.
1972 - IBM introduces virtual memory in mainframes with System/370.
1979 - Unix uses virtual memory with 3BSD.
1993 - Microsoft introduces virtual memory into Windows NT 3.1.
All had challenges
    Specialized, hard-to-build hardware required
    Too much processor power required to do address translation

Virtual Memory

What is the difference between “real” and “virtual” memory?

Programs use only logical addresses
Hardware maps logical addresses to physical addresses
Only part of a process is loaded into memory
    process may be larger than main memory
    additional processes allowed in main memory
    memory loaded/unloaded as the programs execute
    generally implemented using demand paging
Real memory – The physical memory occupied by a program (frames)
Virtual memory – The larger memory space perceived by the program (pages)

Memory Hierarchy

Cache memory: provides illusion of very high speed
Main memory: reasonable cost, but slow & small
Virtual memory: provides illusion of very large size

[Figure: memory hierarchy, Registers / Cache / Main memory / Virtual memory]
    Registers and Cache exchange words (transferred explicitly via load/store)
    Cache and Main memory exchange lines (transferred automatically upon cache miss)
    Main memory and Virtual memory exchange pages (transferred automatically upon page fault)

Virtual Memory

Principle of Locality – A program tends to reference the same items; even if the same item is not used, nearby items will often be referenced
Resident Set – Those parts of the program being actively used (remaining parts of program on disk)
Thrashing – Constantly needing to get pages off secondary storage
    happens if the O.S. throws out a piece of memory that is about to be used
    can happen if the program scans a long array – continuously referencing pages not used recently
    O.S. must watch out for this situation!

Paging Hardware

Use the page number as an index into the page table, which then contains the physical frame holding that page
Typical flag bits: Present, Accessed, Modified, various protection-related bits

More Paging Hardware

Full page tables can be very large
    4G space with 4K pages = 1M entries
    some systems put page tables in virtual address space
Multilevel page tables
    top level page table has a Present bit; when it is clear, the entire range is not valid
    second level table only used if that part of the address space is used
    second level tables can also be used for shared libraries

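Two-Level Paging System

[Figure: two-level address translation in LC-3 main memory]
Virtual Address (bits 15…0): bits 15…11 = RPTE #, bits 10…6 = UPTE #, bits 5…0 = Frame Offset
Root Page Table (RPT, one per process): RPT + RPTE # selects an entry holding Flags / UPT #
User Page Table: UPT + UPTE # selects an entry holding Flags / Frame #
Physical Address (bits 15…0): bits 15…6 = Frame, bits 5…0 = Offset, formed as Frame<<6 + Offset
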
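The bit layout of the two-level LC-3 address can be sketched as follows, assuming a 16-bit virtual address with a 5-bit RPTE #, a 5-bit UPTE #, and a 6-bit frame offset. The function names are hypothetical and are not taken from the course simulator.

```python
def split_virtual_address(va):
    """Split a 16-bit LC-3 virtual address into (RPTE #, UPTE #, offset)."""
    rpte = (va >> 11) & 0x1F   # bits 15..11 index the root page table
    upte = (va >> 6) & 0x1F    # bits 10..6 index the user page table
    offset = va & 0x3F         # bits 5..0 are the offset within a 64-word frame
    return rpte, upte, offset


def physical_address(frame, offset):
    """Combine a frame number and an offset: (frame << 6) + offset."""
    return (frame << 6) + offset
```

For example, virtual address 0x3000 splits into RPTE 6, UPTE 0, offset 0, and frame 192 with offset 5 gives physical address 0x3005.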
MMUs

MMUs used to sit between the CPU and bus
    now they are typically integrated into the CPU
Page tables
    originally implemented in special very fast registers
    now they are stored in normal memory
    entries are cached in fast registers as they are used
Optional features
    separate page tables for each processor mode
    read/write access control, referenced/dirty bits

More Paging Hardware

To minimize the performance penalty of address translation, most modern CPUs include an on-chip memory management unit (MMU) and maintain a table of recently used virtual-to-physical translations, called a Translation Lookaside Buffer (TLB).

Segmentation

Programmer sees memory as a set of multiple segments, each with a separate address space
Growing data structures easier to handle
    O.S. can expand or shrink segment
Can alter one segment without modifying other segments
Easy to share a library
    share one segment among processes
Easy memory protection
    can set values for the entire segment

Segmentation (continued…)

Implementation:
    have a segment table for each process
    similar to one-level paging method
    status – present, modified, location, size
Combine with paging: No external fragmentation
    easier to manage memory since all items are the same size
Some processors have both (386)
    each segment broken up into pages
    address translation
        do segment translation
        translate that address using paging
    some internal fragmentation at the end of each segment

So…

What policy decisions do OS designers face?

Support paging, segmentation, or both?
    Windows/Unix/Linux
        Use paging for virtual memory
        Use segments only for privilege level (segment = address space)
Support virtual memory?
Which memory management algorithm?

Simple Paging Hardware

Use the page number as an index into the page table, which then contains the physical frame holding that page
[Figure: a Logical Address is translated through the page table into a Physical Address]

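A minimal sketch of the single-level lookup follows. It assumes 4K pages (12 offset bits) and represents the page table as a Python dict of hypothetical entries with `present` and `frame` fields; real hardware uses packed table entries.

```python
PAGE_BITS = 12  # assumed page size of 4K bytes


def translate(vaddr, page_table):
    """Translate a logical address using a one-level page table."""
    page = vaddr >> PAGE_BITS                  # page number indexes the table
    offset = vaddr & ((1 << PAGE_BITS) - 1)    # offset is copied unchanged
    entry = page_table.get(page)
    if entry is None or not entry["present"]:
        raise LookupError(f"page fault on page {page}")
    return (entry["frame"] << PAGE_BITS) | offset
```

With page 1 mapped to frame 5, logical address 0x1ABC translates to physical address 0x5ABC.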
Simple Paging Quiz

Consider a simple (1 level) byte addressable paging system with the following parameters: 2^24 bytes of physical memory; page/frame size of 2^11 bytes; 2^9 pages of logical address space.
a. How many bits are in a logical address?
b. How many bytes in a frame?
c. How many bits in the physical address specify the frame #?
d. What is the size of the logical address space?
e. How many bits in each page table entry? (Include valid, dirty, and pin bits.)
f. What is the size of a page table?

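One way to work the quiz arithmetic is sketched below. It reads the parameters as 2^24 bytes of physical memory, 2^11-byte pages, and 2^9 logical pages, and assumes a page table entry holds only the frame number plus the three flag bits; other entry layouts would give different answers to (e) and (f).

```python
physical_bits = 24      # 2^24 bytes of physical memory
offset_bits = 11        # 2^11-byte pages and frames
page_number_bits = 9    # 2^9 pages of logical address space
flag_bits = 3           # valid, dirty, pin

logical_bits = page_number_bits + offset_bits          # (a) 20 bits
frame_bytes = 1 << offset_bits                         # (b) 2048 bytes
frame_number_bits = physical_bits - offset_bits        # (c) 13 bits
logical_space_bytes = 1 << logical_bits                # (d) 1 MB
pte_bits = frame_number_bits + flag_bits               # (e) 16 bits
page_table_bytes = (1 << page_number_bits) * pte_bits // 8   # (f) 1024 bytes
```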
Virtual Memory

Paged memory combined with disk swapping
Processes reside in main/secondary memory
Demand Paging
    could also be termed lazy swapping
    bring pages into memory only when accessed
    allows us to over-allocate
What about at context switch time?
    could swap out entire process
    restore page state as remembered
    anticipate which pages are needed

Decisions about Virtual Memory

Fetch Policy
    When to bring a page in? When needed or in anticipation of need?
Placement
    Where to put it? Unused page?
Replacement
    What to unload to make room for a new page?
Resident Set Management
    How many process pages to keep in memory? Fixed or variable?
    Reassign pages to other processes?
Cleaning Policy
    When is a page written to disk?
Load Control

Page Replacement

Frame replacement – two page transfers
    select a frame (victim)
    write the victim frame to disk
    read in new frame from disk
    update page tables
    restart process
Reduce overhead using dirty bit
    dirty bit is set whenever a page is modified
    if dirty, write page, else just throw it out

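The dirty-bit saving can be sketched as below. The frame record (a dict with `page`, `data`, and `dirty` fields) and the `swap` dict standing in for disk are assumptions made for the illustration.

```python
def evict(frame, swap):
    """Free a frame, writing its page to swap only when it is dirty.

    Returns the number of pages written (0 or 1).
    """
    writes = 0
    if frame["dirty"]:
        swap[frame["page"]] = list(frame["data"])   # the only disk write
        writes = 1
    # A clean page already has an identical copy on disk, so it is just dropped.
    frame.update(page=None, data=None, dirty=False)
    return writes
```

A clean victim therefore costs one page transfer (the read of the new page) instead of two.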
Page Fault

[Figure: a logical address space of four pages (Page 0 to Page 3), a page table with entries 0:m1, 1:v0, 2:m3, 3:v1, and a memory of eight frames numbered 0 to 7]
Page fault is generated when an invalid page is accessed

Memory References

Check internal table
    valid or invalid frame (page fault)
    new or swapped page
Find a free frame
    get from frame pool
    unload a frame
If page defined, read in from disk
Update the page table
Restart the instruction
    process restarts from exact location
    state unchanged (as if not interrupted)

Paging Implementation

Extreme case
    start a process with no pages in memory
    pure demand paging
What about overhead of paging?
    locality
    thrashing
Hardware support
    page table
    secondary memory

Paging Implementation (continued…)

Must be able to restart a process at any time
    instruction fetch
    operand fetch
    operand store (any memory reference)
Consider simple instruction (VLIW or CISC)
    Add C,A,B (C = A + B)
    All operands on different pages
    Instruction not in memory
    4 possible page faults )-: slooooow :-(

Paging Performance

Paging Time…
    Disk latency          8 milliseconds
    Disk seek             15 milliseconds
    Disk transfer time    1 millisecond
    Total paging time     ~25 milliseconds
Could be longer due to
    device queueing time
    other paging overhead

Paging Performance (continued…)

Effective access time:
    EAT = (1 - p) × ma + p × pft
where:
    p is probability of page fault
    ma is memory access time
    pft is page fault time

Paging Performance (continued…)

Effective access time with 100 ns memory access and 25 ms page fault time:
    EAT = (1 - p) × ma + p × pft
        = (1 - p) × 100 + p × 25,000,000
        = 100 + 24,999,900 × p
What is the EAT if p = 0.001 (1 out of 1000)?
    100 + 24,999,900 × 0.001 ≈ 25,100 ns ≈ 25 microseconds
    250 times slowdown!
How do we get less than 10% slowdown?
    100 + 24,999,900 × p < 1.10 × 100 ns = 110 ns
    Less than 1 out of 2,500,000 accesses fault

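The effective access time arithmetic can be checked with a short sketch. The function names are illustrative; the defaults are the 100 ns memory access time and 25 ms page fault time used in the example, both expressed in nanoseconds.

```python
def effective_access_time(p, ma_ns=100, pft_ns=25_000_000):
    """EAT = (1 - p) * ma + p * pft, in nanoseconds."""
    return (1 - p) * ma_ns + p * pft_ns


def max_fault_rate(slowdown, ma_ns=100, pft_ns=25_000_000):
    """Largest p for which EAT <= slowdown * ma (solve the EAT equation for p)."""
    return (slowdown - 1) * ma_ns / (pft_ns - ma_ns)
```

`effective_access_time(0.001)` is about 25,100 ns, and `1 / max_fault_rate(1.10)` is just under 2,500,000, matching the one-fault-in-2,500,000 figure.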
Placement Policies

Where to put the page
    trivial in a paging system – can be placed anywhere
    Best-Fit, First-Fit, or Next-Fit can be used with segmentation
    is a concern with distributed systems
Frame Locking
    require a page to stay in memory
        O.S. kernel and interrupt handlers
        real-time processes
        other key data structures
    implemented by bit in data structures

Replacement Algorithms

Random (RAND)
    choose any page to replace at random
Belady’s Optimal Algorithm
    best but unrealizable – used for comparison
Least Recently Used (LRU)
    discard pages we have probably lost interest in
Least Frequently Used (LFU)
    strictly a straw-man for stack algorithms
First-In, First-Out (FIFO)
    for the dogs, forget ‘em
Clock – Not Recently Used (NRU)
    efficient software LRU

Belady’s Optimal Algorithm

Belady’s optimal replacement
    “Perfect knowledge” of the page reference stream.
    Select the page that will not be referenced for the longest time in the future.
    Rare for a system to know the reference stream for every thread in every process in advance.
        generally unrealizable
        few special cases: program to predict the weather
    Its theoretical behavior is used to compare the performance of realizable algorithms.

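Because the optimal policy needs the whole reference stream in advance, it is easy to simulate offline even though a running system cannot use it. The sketch below is one straightforward (not efficient) way to count its faults; `opt_faults` is a hypothetical helper name, and `num_frames` is assumed to be at least 1.

```python
def opt_faults(refs, num_frames):
    """Count page faults under Belady's optimal replacement."""
    frames = set()
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == num_frames:
            future = refs[i + 1:]

            def next_use(p):
                # Distance to the next reference; pages never used again rank last.
                return future.index(p) if p in future else len(future)

            # Evict the resident page whose next reference is farthest away.
            frames.remove(max(frames, key=next_use))
        frames.add(page)
    return faults
```

Running it on a reference string gives the baseline fault count against which FIFO, LRU, or clock can be compared.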
Replacement Quiz

Fill in the contents of the 3 frames after each reference.

Belady’s Optimal
    Ref:      0 1 2 3 2 3 2 0 4 3 2 1 2 1 0 1
    Frame 0:
    Frame 1:
    Frame 2:
Least Recently Used
    Ref:      0 1 2 3 2 3 2 0 4 3 2 1 2 1 0 1
    Frame 0:
    Frame 1:
    Frame 2:
Least Frequently Used
    Ref:      0 1 2 3 2 3 2 0 4 3 2 1 2 1 0 1
    Frame 0:
    Frame 1:
    Frame 2:

FIFO page replacement

Replace oldest page in memory
Intuition:
    First referenced long time ago, done with it now
Advantages:
    Fair: All pages receive equal residency
    Easy to implement (circular buffer)
Disadvantages:
    Some pages may always be needed
    Difficult to implement (time stamps)
Can we improve the performance by adding more frames?

FIFO

FIFO/3 Frames (16 page faults)
    Ref:      0 1 2 3 0 1 2 3 0 1 2 3 4 5 6 7
    Frame 0:  0 0 0 3 3 3 2 2 2 1 1 1 4 4 4 7
    Frame 1:  - 1 1 1 0 0 0 3 3 3 2 2 2 5 5 5
    Frame 2:  - - 2 2 2 1 1 1 0 0 0 3 3 3 6 6
FIFO/4 Frames (8 page faults)
    Ref:      0 1 2 3 0 1 2 3 0 1 2 3 4 5 6 7
    Frame 0:  0 0 0 0 0 0 0 0 0 0 0 0 4 4 4 4
    Frame 1:  - 1 1 1 1 1 1 1 1 1 1 1 1 5 5 5
    Frame 2:  - - 2 2 2 2 2 2 2 2 2 2 2 2 6 6
    Frame 3:  - - - 3 3 3 3 3 3 3 3 3 3 3 3 7

FIFO (continued…)

FIFO/3 Frames (13 page faults!)
    Ref:      0 1 2 3 0 1 4 0 1 2 3 4 0 1 2 3
    Frame 0:  0 0 0 3 3 3 4 4 4 4 4 4 0 0 0 3
    Frame 1:  - 1 1 1 0 0 0 0 0 2 2 2 2 1 1 1
    Frame 2:  - - 2 2 2 1 1 1 1 1 3 3 3 3 2 2
FIFO/4 Frames (14 page faults!)
    Ref:      0 1 2 3 0 1 4 0 1 2 3 4 0 1 2 3
    Frame 0:  0 0 0 0 0 0 4 4 4 4 3 3 3 3 2 2
    Frame 1:  - 1 1 1 1 1 1 0 0 0 0 4 4 4 4 3
    Frame 2:  - - 2 2 2 2 2 2 1 1 1 1 0 0 0 0
    Frame 3:  - - - 3 3 3 3 3 3 2 2 2 2 1 1 1

FIFO replacement performance

What is going on here? Belady’s Anomaly
[Figure: plot of Number of Page Faults (0 to 14) versus Number of Frames (0 to 8) for FIFO replacement]

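Belady's anomaly is easy to reproduce with a small FIFO simulation. The sketch below is illustrative (the name `fifo_faults` is an assumption, and `num_frames` must be at least 1); it counts faults for a reference string and a given number of frames.

```python
from collections import deque


def fifo_faults(refs, num_frames):
    """Count page faults under FIFO replacement."""
    frames = deque()          # oldest resident page is at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue          # a hit does not change FIFO order
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()  # evict the page that has been resident longest
        frames.append(page)
    return faults


refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 13 14
```

For this reference string, adding a fourth frame raises the fault count from 13 to 14, which is the anomaly.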
Least Recently Used

Replace page not used for longest time in past
    Intuition: Use past to predict the future
Advantages:
    With locality, LRU approximates OPT (Belady’s algorithm)
Disadvantages:
    harder to implement, must track which pages have been accessed (time stamp, page stack)
    does not handle all workloads well
Updates must occur at every memory access
    huge overhead
    few computers offer enough hardware support for LRU

Implementing LRU

Software Perfect LRU
    OS maintains ordered list of physical pages by reference time
    When page is referenced: Move page to front of list (top)
    When need victim: Pick page at back of list (bottom)
    Trade-off: Slow on memory reference, fast on replacement
Hardware Perfect LRU
    Associate register with each page
    When page is referenced: Store system clock in register
    When need victim: Scan through registers to find oldest clock
    Trade-off: Fast on memory reference, slow on replacement (especially as size of memory grows)
In practice, do not need to implement perfect LRU
    LRU is an approximation anyway, so approximate more
    Goal: Find an old page, but not necessarily the very oldest

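The ordered-list version of perfect LRU can be sketched with Python's `OrderedDict`. Note one difference from the description above: this sketch keeps the most recently used page at the end of the list and takes the victim from the front, which is the same policy with the list reversed. The name `lru_faults` is hypothetical and `num_frames` is assumed to be at least 1.

```python
from collections import OrderedDict


def lru_faults(refs, num_frames):
    """Count page faults under perfect LRU replacement."""
    frames = OrderedDict()                # least recently used page comes first
    faults = 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)      # every reference updates the ordering
            continue
        faults += 1
        if len(frames) == num_frames:
            frames.popitem(last=False)    # victim is the least recently used page
        frames[page] = True
    return faults
```

The `move_to_end` call on every hit is the per-reference bookkeeping that makes perfect LRU expensive in a real system.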
Implementing LRU (continued…)

Can use a reference bit
    cleared on loading
    set every time the page is referenced
Reference time tracking implemented in software
    periodically scan page tables
    note which pages referenced and modified
    reset the reference/modified bits
Clock algorithm – efficient software LRU
    all pages are in a circular list
    start scan where the previous scan left off
    replace the first un-referenced page you find

Second-Chance Algorithm

Often called “clock algorithm”
    keep circular list of all pages (RPT’s, UPT’s, Memory)
    clock pointer refers to next page to consider
If the reference bit is 1
    clear the reference bit
    move clock pointer to the next page
If reference bit is 0 and not pinned
    swap page out to disk (if dirty)
    move clock pointer to next page
    return page number
Could cycle through entire list before finding victim

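The victim search can be sketched as below. This is an illustration under stated assumptions, not the course simulator's code: frames are dicts with hypothetical `referenced` and `pinned` fields, the swap-out of a dirty victim is left to the caller, and the scan gives up after two full sweeps, which only happens when every frame is pinned.

```python
def clock_select_victim(frames, hand):
    """Return (victim_index, new_hand) using the second-chance rule."""
    n = len(frames)
    for _ in range(2 * n):            # one sweep may only clear bits; the second finds a victim
        frame = frames[hand]
        if frame["referenced"]:
            frame["referenced"] = False        # give the page a second chance
        elif not frame["pinned"]:
            return hand, (hand + 1) % n        # advance the hand past the victim
        hand = (hand + 1) % n
    raise RuntimeError("no victim found: every frame is pinned")
```

The caller keeps the returned hand position so that the next search starts where this one left off.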
Clock Replacement Quiz

Clock/3 Frames
    Ref:      7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0
    Frame 0:
    Frame 1:
    Frame 2:
Clock/4 Frames
    Ref:      7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0
    Frame 0:
    Frame 1:
    Frame 2:
    Frame 3:

CS 345 Virtual Memory Project
Project 4: Virtual Memory Guidelines

Verify a clean compilation of your LC-3 virtual memory simulator. Validate that the “crawler.hex” and “memtest.hex” programs execute properly.
Modify the getMemAdr() function to handle 2-level, paged virtual memory addressing.
Implement a clock page replacement algorithm to pick which frame is unloaded, if necessary, on a page fault.
Use the provided 1MB page swap table routine to simulate paged disk storage (8192 pages) or implement your own routine.
Use crawler.hex and memtest.hex to validate your virtual memory implementation. Use other routines (such as im) to debug your implementation.

Project 4: Virtual Memory Guidelines

Use the following CLI commands to verify and validate your virtual memory system. (Most of these routines are provided, but may require some adaptation to your system.)

    dfm <#>         Display LC3 memory frame <#>
    dft             Display frame allocation table
    dm <sa>,<ea>    Display physical LC3 memory from <sa> to <ea>
    dp <#>          Display page <#> in swap space
    dv <sa>,<ea>    Display virtual LC3 memory <sa> to <ea>
    im <#>          Init LC3/Set upper LC3 memory limit
    rpt <#>         Display task <#> root page table
    upt <p><#>      Display task <p> user page table <#>
    vma <a>         Access <a> and display RPTE’s and UPTE’s
    vms             Display LC3 statistics

Project 4: Virtual Memory Guidelines

Demonstrate that LC-3 tasks run correctly. Be able to dynamically change LC-3 memory size (im command) and chart resulting changes in page hits/faults. Memory accesses, hits and faults are defined as follows:
    Memory access (memAccess) = sum of memory hits (memHits) and memory faults (memPageFaults).
    Hit (memHits) = access to task RPT, UPT, or data frame. (Exclude accesses below 0x3000.)
    Fault (memPageFaults) = access to a task page that is undefined or not currently in a memory frame.
    Page Reads (pageReads) = # pages read from swap space into memory.
    Page Writes (pageWrites) = # pages written from memory to swap space.
    Swap Page (nextPage) = # of swap space pages currently allocated to swapped pages.

Chart to fill in:
                    Crawler                 Memtest
    Frames:         320     16      2       320     16      2
    Accesses:
    Hits:
    Faults:
    Page Reads:
    Page Writes:
    Swap Pages:

Project 4 Grading Criteria

REQUIRED:
    4 pts – Successfully execute crawler and memtest in 20k words (320 frames).
    3 pts – Successfully execute crawler and memtest in 1k words (16 frames).
    1 pt – Successfully execute 5 or more LC-3 tasks simultaneously in 16 frames of LC-3 memory.
    1 pt – Correctly use the dirty bit to only write altered or new memory frames to swap space.
    1 pt – Chart and submit the resulting memory access, hit, fault, and swap page statistics after executing crawler (and then memtest) in 320 and 16 frames.
BONUS:
    +1 point – early pass-off (at least one day before due date).
    +1 point – Add a task frame/swap page recovery mechanism of a terminated task.
    +1 point – Implement the advanced clock algorithm and chart the results.
    +1 point – Join the 2-frame club. (Successfully execute 5 or more LC-3 tasks simultaneously in 2 frames of LC-3 memory. Chart the memory accesses, hits, and faults.)
    –1 point penalty for each school day late.

Project 4: So…

1. Read and comprehend Stallings, Section 8.1.
2. Comprehend the lab specs. Discuss questions with classmates, the TA’s and/or the professor. Make sure you understand what the requirements are! It's a tragedy to code for 20 hours and then realize you're doing everything wrong.
3. Validate that the demo LC-3 simulator works for a single task with pass-through addressing (virtual equals physical) for the LC-3 by executing the commands “crawler” and “memtest”.
4. Design your MMU. Break the problem down into manageable parts.
5. Create and validate a “clock” mechanism that accesses all global root page tables, user page tables, and data frames.
6. Implement dirty bit last – use “write-through” for all swapping of a data frame to swap space.

Project 4: So… (continued)

7. Incrementally add support for the actual translation of virtual addresses to physical addresses with page fault detection as follows:
    a. Implement page fault frame replacement using available memory frames only. This should allow you to execute any test program in a full address space.
    b. Implement clock page replacement algorithm to unload data frames to swap pages and reload with a new frame or an existing frame from swap space. This should allow you to execute all the test programs in a 32k word address space (20k of paging frames).
    c. Implement clock page replacement algorithm to unload User Page Tables when there are no physical data frame references in the UPT. This will be necessary when running in a small physical space (16k words) with multiple tasks.
    d. Implement dirty bit to minimize writing frames to swap space.

Project 4: So… (continued)

8. Remember to always increment your clock after finding a replacement frame.
9. Use the vma function to access a single virtual memory location and then display any non-zero RPT and UPT entries. Implement various levels of debug trace to watch what is going on in your MMU. You may use the provided display functions.
10. When swapping a user page table to swap space, add some debug “sanity check” code to validate that the UPT does not have any entries with the frame bit set.

Paging Problems?

Recent revival in page replacement research.
    Size of primary storage has increased - algorithms that require a periodic check of each and every memory frame are becoming less and less practical.
    Memory hierarchies have grown taller - a CPU cache miss is far more expensive. This exacerbates the previous problem.
    Object-oriented programming techniques have weakened locality of reference.
    Sophisticated data structures like trees and hash tables and the advent of garbage collection have drastically changed the memory access behavior of applications.

Paging Improvements?

Disk access techniques
    use larger blocks
    separate swap space - no file table lookup
    binary boundaries
Demand paging
    load several consecutive sectors/pages rather than individual sectors due to seek, rotational latency
Better Working Set model
    Monitor program execution – minimize the number of pages each process needs for execution (locality)
Pre-paging
    bring in pages that are likely to be used in the near future
    easier to guess at program startup, but may load unnecessary pages

Clock Algorithm Enhancements?

Consider reference bit and dirty bit
4 possible cases (Macintosh scheme)
    (0,0) neither modified nor referenced
    (0,1) not recently used but modified
    (1,0) recently used but clean
    (1,1) recently used and modified
    Still use “clock algorithm”
        clear only reference bit upon consideration
Add additional reference bits - 3rd, 4th,… chance
At regular intervals, clear all reference bits
A process can be in RAM if and only if all of the pages that it is currently using can be in RAM.

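The four (referenced, modified) cases form a preference order for victims. The sketch below shows only that ordering; it ignores the clock hand and the clearing of reference bits, and the field names `referenced` and `modified` are assumptions made for the example.

```python
def pick_victim(frames):
    """Return the index of the first frame in the lowest (referenced, modified) class.

    Class order, best victim first: (0,0), (0,1), (1,0), (1,1).
    """
    return min(
        range(len(frames)),
        key=lambda i: (bool(frames[i]["referenced"]), bool(frames[i]["modified"])),
    )
```

A clean, unreferenced page is preferred because evicting it needs no write to swap space and it has not been used recently.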
Frame Allocation

Demand allocation
Options
    keep 3 empty frames, write out in background
Minimum number of frames
    what is the least number of frames to allocate
Allocation Algorithms
    equal allocation
    proportional to storage for executable
    priority

Global vs Local Allocation

Global Allocation
    replacement page is selected from among all pages in system
Local Allocation
    replacement page is selected only from the pages owned by the process
    Process controls its own page fault rate
    Number of pages for a process won’t grow
