Operating Systems: Lecture 9
Memory Management: Virtual Memory

Virtual Memory: LRU Approximation Algorithms
Many systems provide some help in the form of a reference bit.
The reference bit for a page is set by the hardware whenever that page is referenced (either a read or a write to any byte in the page).
A reference bit is associated with each entry in the page table.
Virtual Memory: LRU Approximation Algorithms (1)
Additional-Reference-Bits Algorithm
We can gain additional ordering information by recording the reference bits at regular intervals.
We can keep an 8-bit byte (a History Register, HR) for each page in a table in memory.
At regular intervals (say, every 100 milliseconds), a timer interrupt transfers control to the OS.
The OS shifts the reference bit for each page into the high-order bit of its HR, shifting the other bits right by 1 and discarding the low-order bit.
These HRs contain the history of page use for the last eight time periods.
If the HR contains 00000000, then the page has not been used for eight time periods;
a page that is used at least once in each period will have an HR value of 11111111.
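As a minimal sketch of the shift step, assuming a simple array-based table of reference bits and HRs (NUM_PAGES and the array names are illustrative, not from the lecture):

    #include <stdint.h>

    #define NUM_PAGES 256              /* illustrative page-table size */

    static uint8_t hr[NUM_PAGES];      /* one 8-bit History Register per page */
    static uint8_t ref_bit[NUM_PAGES]; /* reference bit, set on each access */

    /* Run by the OS on each timer interrupt. */
    void shift_reference_bits(void)
    {
        for (int p = 0; p < NUM_PAGES; p++) {
            /* reference bit -> high-order bit; low-order bit is discarded */
            hr[p] = (uint8_t)((hr[p] >> 1) | (ref_bit[p] << 7));
            ref_bit[p] = 0;            /* clear for the next interval */
        }
    }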
Virtual Memory: LRU Approximation Algorithms (1)
Additional-Reference-Bits Algorithm (example)
At each timer interrupt the new reference bit is shifted into the high-order position of the HR, for example:
<HR> = 11000100 -> (reference bit 1) -> <HR> = 11100010 -> (reference bit 0) -> <HR> = 01110001
A page with <HR> = 11000100 has been referenced more recently than a page with <HR> = 01110111.
If we interpret the HRs as unsigned integers, the page with the lowest number is the LRU page, and it can be replaced.
The number of bits in the HR can be varied (selection should be as fast as possible). In the extreme case, the number can be reduced to zero, leaving only the reference bit itself. This is the second-chance algorithm.

Virtual Memory: LRU Approximation Algorithms (2)
Second-Chance Algorithm (Clock Algorithm)
The basic algorithm of second-chance replacement is a FIFO replacement algorithm.
When a page has been selected, however, we inspect its reference bit.
If the value is 0, we proceed to replace this page.
If the reference bit is 1, however, we give that page a second chance and move on to select the next FIFO page.
When a page gets a second chance, its reference bit is cleared and its arrival time is reset to the current time.
Thus, a page that is given a second chance will not be replaced until all other pages are replaced.
In addition, if a page is used often enough to keep its reference bit set, it will never be replaced.
One way to implement this algorithm is as a circular queue.

[Figure: a circular queue of pages (a, b, c, d, e, f, g, h, ..., m) with their reference bits, shown before and after page "h" is referenced; the pointer sweeps the queue, clearing reference bits that are 1 until it reaches a page whose reference bit is 0.]
To summarize: if the page to be replaced (in clock order) has reference bit = 1, then:
  set the reference bit to 0;
  leave the page in memory;
  replace the next page (in clock order), subject to the same rules.
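A minimal sketch of this selection loop in C (the frame array, NUM_FRAMES, and the hand variable are illustrative):

    #define NUM_FRAMES 8                  /* illustrative number of frames */

    static unsigned char ref[NUM_FRAMES]; /* reference bit per frame */
    static int hand = 0;                  /* the clock pointer */

    /* Sweep the circular queue: clear reference bits that are 1 until a
     * frame with reference bit 0 is found; that frame is the victim. */
    int clock_select_victim(void)
    {
        for (;;) {
            if (ref[hand] == 0) {
                int victim = hand;
                hand = (hand + 1) % NUM_FRAMES; /* advance past the victim */
                return victim;
            }
            ref[hand] = 0;                      /* give a second chance */
            hand = (hand + 1) % NUM_FRAMES;
        }
    }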
Virtual Memory: LRU Approximation Algorithms (3)
Enhanced Second-Chance Algorithm
Use the modify bit, M, along with the reference bit, R. Depending on (R, M), pages are classified into four classes:
1) (0, 0): neither recently used nor modified - the best page to replace;
2) (0, 1): not recently used, but modified - not quite as good, because the page will need to be written out before replacement;
3) (1, 0): recently used, but clean - probably will be used again soon;
4) (1, 1): recently used and modified - probably will be used again, and a write-out will be needed before replacing it.
Class (1) has the lowest priority, while class (4) has the highest.
We replace the first page encountered in the lowest nonempty class.
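The class of a page can be computed directly from the two bits; a one-line sketch (the 1-4 numbering follows the slide):

    /* Classify a page by its (R, M) bits; class 1 is the best victim. */
    int page_class(int r, int m)
    {
        return 1 + ((r << 1) | m);  /* (0,0)->1, (0,1)->2, (1,0)->3, (1,1)->4 */
    }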
Virtual Memory: LRU Approximation Algorithms (4)
Counting Algorithms - keep a counter of the number of references that have been made to each page.
LFU Algorithm: the least frequently used (LFU) algorithm replaces the page with the smallest count.
MFU Algorithm: the most frequently used (MFU) algorithm is based on the argument that the page with the smallest count was probably just brought in and has yet to be used.
For both, implementation is expensive, and the approximation of OPT replacement is not very good.
Page-Buffering Algorithm - the desired page is read into a free frame from the pool before the victim is written out.
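As a rough sketch of LFU victim selection (the per-page counter array is illustrative):

    /* Return the page with the smallest reference count. */
    int lfu_victim(const unsigned long count[], int num_pages)
    {
        int victim = 0;
        for (int p = 1; p < num_pages; p++)
            if (count[p] < count[victim])
                victim = p;
        return victim;
    }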
Virtual Memory: Comparison of Page Replacement Algorithms
Factors which determine the page-fault rate:
  the program;
  memory size;
  the replacement algorithm;
  page size;
  the prefetch algorithm.

[Figure: page faults per 1000 references (0-40) versus number of page frames allocated (6-14) for OPT, LRU, Clock, and FIFO. From W. Stallings, Operating Systems, 1992, Macmillan Publ. Co., p. 289.]
Virtual Memory: Allocation of Frames
The simplest case of virtual memory is the single-user system. There are many variations for distributing frames.
A different problem arises when demand paging is combined with multiprogramming. Multiprogramming puts two (or more) processes in memory at the same time.
There is a minimum number of frames that must be allocated.
Obviously, as the number of frames allocated to each process decreases, the page-fault rate increases, slowing process execution.
Virtual Memory: Allocation of Frames
The minimum number of frames is defined by the instruction-set architecture. (Remember that, when a page fault occurs before an executing instruction is complete, the instruction must be restarted.)
Example: consider a machine in which all memory-reference instructions have only one memory address. We need at least one frame for the instruction and one frame for the memory reference. In addition, if one-level indirect addressing is allowed, then paging requires at least three frames per process.
IBM 370, MVC (Move Character) instruction (storage-to-storage): the instruction is 6 bytes and might span 2 pages; 2 pages to handle "from" and 2 pages to handle "to" => 6 pages to handle.

Virtual Memory: Allocation of Frames
Two major allocation schemes: fixed and priority allocation.
Fixed allocation:
Equal allocation => with 100 frames and 5 processes, give each process 20 frames.
Proportional allocation => allocate according to the size of the process:
  s_i = size of process p_i;
  S = Σ s_i;
  m = total number of frames;
  a_i = allocation for p_i = (s_i / S) x m.
Example: m = 64, s_1 = 10, s_2 = 127, a_1 = (10/137) x 64 ~ 5, a_2 = (127/137) x 64 ~ 59.
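The slide's proportional-allocation example, computed directly (a sketch; rounding to the nearest frame reproduces the ~5 and ~59 above):

    #include <stdio.h>

    int main(void)
    {
        int s[] = {10, 127};   /* process sizes s1, s2 from the example */
        int m = 64;            /* total number of frames */
        int S = 0;

        for (int i = 0; i < 2; i++) S += s[i];          /* S = 137 */
        for (int i = 0; i < 2; i++) {
            int a = (int)((double)s[i] * m / S + 0.5);  /* round to nearest */
            printf("a_%d = %d frames\n", i + 1, a);     /* prints 5 and 59 */
        }
        return 0;
    }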
Virtual Memory: Allocation of Frames
Priority allocation:
Use a proportional allocation scheme based on priorities rather than size.
If process P_i generates a page fault:
  select for replacement one of its own frames, or
  select for replacement a frame from a process with a lower priority number.

We can classify page-replacement algorithms into two broad categories:
Global replacement - a process selects a replacement frame from the set of all frames; one process can take a frame from another.
  Gives greater throughput, but may cause thrashing.
Local replacement - each process selects a replacement frame from only its own set of allocated frames.
  Not affected by other active processes.
Virtual Memory: Thrashing
High paging activity is called thrashing. A process is thrashing if it is spending more time paging than executing.
Thrashing = a process is busy swapping pages in and out.
If a process does not have "enough" pages, the page-fault rate is very high. Thrashing develops as follows:
  the OS monitors CPU utilization; if CPU utilization is too low, it increases the degree of multiprogramming by introducing a new process to the system;
  if a global-replacement algorithm is used, it replaces pages with no regard to the process to which they belong, stealing frames from active processes;
  active processes then have fewer pages and generate more page faults;
  CPU utilization decreases.
As a result, CPU utilization drops even further, and the CPU scheduler tries to increase the degree of multiprogramming even more. The page-fault rate increases tremendously.

[Figure: CPU utilization versus degree of multiprogramming; utilization climbs with the degree of multiprogramming until thrashing sets in, then drops sharply.]
Virtual Memory: Thrashing: How to Prevent It
To prevent thrashing, we must provide a process with as many frames as it needs. But how do we know how many frames it "needs"? There are several techniques.
The working-set strategy starts by looking at how many frames a process is actually using. This approach defines the locality model of process execution.
The locality model states that, as a process executes, it migrates from one locality to another. A locality is a set of pages that are actively used together. A program is generally composed of several different localities, which may overlap.
Localities are defined by the program structure and its data structures.
90-10 rule: programs spend 90% of their time in 10% of their code.

Virtual Memory: Thrashing: Working-Set Model
The working-set model is based on the assumption of locality.
The model uses a parameter, ∆ (a time unit), to define the working-set window. The idea is to examine the most recent ∆ page references.
The set of pages in the most recent ∆ page references is the working set.
If a page is in active use, it will be in the working set. If it is no longer being used, it will drop from the working set ∆ time units after its last reference. Thus, the working set is an approximation of the program's locality.
Example page-reference string:
... 2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4 1 3 2 3 4 4 4 3 4 4 4 ...
WS(t1) = {1,2,5,6,7}    WS(t2) = {3,4}
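A small sketch of the window computation on the reference string above (∆ = 10 is an assumption chosen to reproduce WS(t1); t1 and t2 are taken to be around references 10 and 26):

    #include <stdio.h>

    #define DELTA    10  /* assumed working-set window */
    #define MAX_PAGE 8

    /* Print the set of distinct pages among the last DELTA references. */
    void print_working_set(const int refs[], int t)
    {
        int in_ws[MAX_PAGE + 1] = {0};
        for (int i = (t >= DELTA ? t - DELTA : 0); i < t; i++)
            in_ws[refs[i]] = 1;               /* mark pages in the window */

        printf("WS(t=%d) = {", t);
        for (int p = 0, first = 1; p <= MAX_PAGE; p++)
            if (in_ws[p]) { printf(first ? " %d" : ", %d", p); first = 0; }
        printf(" }\n");
    }

    int main(void)
    {
        int refs[] = {2,6,1,5,7,7,7,7,5,1,6,2,3,4,1,2,3,4,4,4,
                      3,4,3,4,4,4,1,3,2,3,4,4,4,3,4,4,4};
        print_working_set(refs, 10);  /* {1,2,5,6,7}, the slide's WS(t1) */
        print_working_set(refs, 26);  /* {3,4}, the slide's WS(t2) */
        return 0;
    }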
Virtual Memory: Thrashing: Working-Set Model
The most important property of the working set is its size.
If we compute the working-set size, WSS_i (the total number of pages referenced in the most recent ∆), for each process in the system, we can then consider D = Σ WSS_i, where D is the total demand for frames. Each process is actively using the pages in its working set; thus, process i needs WSS_i frames.
If the total demand is greater than the total number of available frames (D > m), thrashing will occur, because some processes will not have enough frames.

Virtual Memory: Thrashing: Working-Set Strategy
Allocate each process its working-set size.
Policy: if D > m, then suspend one of the processes.
This prevents thrashing while keeping the degree of multiprogramming as high as possible.
The overhead of keeping track of the working set is non-trivial, so approximations have been used in actual implementations.
Approximation of the working set using the reference bits and an interval timer:
  set the interval timer to interrupt every T time units;
  when the timer interrupts, check the page frames: those with the reference bit set to 1 are considered to be in the working set.
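A sketch of that timer-based approximation (NUM_PAGES and the arrays are illustrative; ref_bit is assumed to be set by the hardware on each access):

    #define NUM_PAGES 256

    static unsigned char ref_bit[NUM_PAGES];        /* set by hardware on access */
    static unsigned char in_working_set[NUM_PAGES];

    /* Run every T time units from the interval timer. */
    void approximate_working_set(void)
    {
        for (int p = 0; p < NUM_PAGES; p++) {
            in_working_set[p] = ref_bit[p];  /* referenced since the last tick */
            ref_bit[p] = 0;                  /* reset for the next interval */
        }
    }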
Virtual Memory: Thrashing: Page-Fault Frequency
Thrashing has a high page-fault rate; thus, we want to control the page-fault rate. The page-fault frequency (PFF) strategy takes a more direct approach than the working-set model.
When the page-fault rate is too high, we know that the process needs more frames.
Similarly, if the page-fault rate is too low, then the process may have too many frames.
We can establish upper and lower bounds on the desired page-fault rate:
  if the actual page-fault rate exceeds the upper limit, we allocate that process another frame;
  if the page-fault rate falls below the lower limit, we remove a frame from that process.
Thus, we can directly measure and control the page-fault rate to prevent thrashing.

[Figure: page-fault rate versus number of frames; above the upper bound, increase the number of frames; below the lower bound, decrease the number of frames.]
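A sketch of the resulting control rule (the numeric bounds are assumptions, not values from the lecture):

    #define PFF_UPPER 0.10  /* assumed upper bound, faults per reference */
    #define PFF_LOWER 0.01  /* assumed lower bound */

    /* Called periodically with a process's measured page-fault rate;
     * returns the change to its frame allocation. */
    int pff_frame_adjustment(double fault_rate)
    {
        if (fault_rate > PFF_UPPER) return +1;  /* allocate another frame */
        if (fault_rate < PFF_LOWER) return -1;  /* take a frame away */
        return 0;                               /* within the desired band */
    }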
Virtual Memory: Page Size Consideration
Page sizes are usually hardware defined. There is no ideal choice:
  bigger pages reduce the size of the page table;
  smaller pages match localities better and reduce internal fragmentation;
  I/O is more efficient for bigger pages;
  smaller pages may reduce the total amount of I/O needed;
  larger pages reduce the number of page faults.
Most contemporary systems have settled on 4 kilobytes.

Virtual Memory: Program Structure
Demand paging is designed to be transparent to the user program. However, in many cases system performance can be improved by an awareness of the underlying demand paging.
Example: assume pages are 128 words in size, and a program initializes to 0 each element of a 128 x 128 array A(128,128):

    for i := 1 to 128 do
      for j := 1 to 128 do
        A(i,j) := 0;

How many page faults are generated?
Case 1: Pascal, C, PL/1 => the matrix is stored in row-major order: {A(1,1), A(1,2), ..., A(1,128)} are stored in the same page => 128 page faults.
Case 2: Fortran => the matrix is stored in column-major order: {A(1,1), A(2,1), ..., A(128,1)} are stored in the same page => 128 x 128 = 16384 page faults.
Virtual Memory: I/O Interlock
The problem with I/O in swapping systems still exists in paging systems.
The solution is usually to lock individual pages in memory (those with I/O pending) rather than locking the whole process in memory.
The overall impact is better than the overhead of copying in and out of OS buffers.

Virtual Memory: Demand Segmentation
Demand paging is the most efficient virtual-memory scheme, but its implementation is very complicated (in hardware).
Demand segmentation is used when there is insufficient hardware to implement demand paging.
OS/2 (on the 80286) allocates memory in segments, which it keeps track of through segment descriptors.
A segment descriptor contains a valid bit to indicate whether the segment is currently in memory:
  if the segment is in main memory, access continues;
  if it is not in memory, a segment fault occurs.
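A sketch of the valid-bit check on a segment access (the descriptor fields and segment_fault() are illustrative, not OS/2's actual layout):

    struct seg_desc {
        int      valid;  /* 1 if the segment is currently in main memory */
        unsigned base;   /* physical base address when resident */
    };

    extern void segment_fault(struct seg_desc *d);  /* hypothetical: loads the segment */

    /* Translate an offset within the segment to a physical address. */
    unsigned seg_access(struct seg_desc *d, unsigned offset)
    {
        if (!d->valid)
            segment_fault(d);     /* segment fault; handler sets d->valid */
        return d->base + offset;  /* access continues */
    }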
The End
Memory Management