LRU-K Page Replacement Algorithm

CSCI 485 Lecture Notes
Instructor: Prof. Shahram Ghandeharizadeh
Outline
• History
• Motivation for LRU-K
• Alternatives to LRU-K
• LRU-K
• Design and implementation
• Conclusion
History
• LRU-K is attributed to Elizabeth J. O’Neil,
Patrick E. O’Neil, and Gerhard Weikum:
– The LRU-K Page Replacement Algorithm for
Database Disk Buffering, ACM SIGMOD
1993, Washington D.C., pages 297-306.
Least Recently Used (LRU)
• When a new buffer page is needed, the buffer pool
manager drops the page in the buffer that has not
been accessed for the longest time.
• Originally designed for patterns of use in instruction
logic (Denning 1968).
• Limitation: decides which page to drop from the
buffer based on too little information, namely the
time of last reference only.
Pseudo-code for LRU
LRU (page p)
  If p is in the buffer then LAST(p) = current time;
  Else
    i)   min = current time + 1;
    ii)  For all pages q in the buffer do
           a) If (LAST(q) < min) then
                victim = q;
                min = LAST(q);
    iii) If victim is dirty then flush it to disk;
    iv)  Fetch p into the buffer frame held by victim;
    v)   LAST(p) = current time;
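
The same logic as a minimal, runnable Python sketch; the class name and the flush/fetch stubs are illustrative, and real frame management and disk I/O are stubbed out.

import time

class LRUBuffer:
    """Minimal LRU buffer pool: tracks only LAST(p) for resident pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.last = {}      # page id -> time of most recent reference
        self.dirty = set()  # page ids with unflushed modifications

    def reference(self, p):
        if p not in self.last:                    # page fault
            if len(self.last) >= self.capacity:
                # Victim: the resident page with the oldest LAST value.
                victim = min(self.last, key=self.last.get)
                if victim in self.dirty:          # flush before reuse
                    self.flush(victim)
                    self.dirty.discard(victim)
                del self.last[victim]
            self.fetch(p)                         # read p into a free frame
        self.last[p] = time.monotonic()           # LAST(p) = current time

    def flush(self, p):
        pass  # stub: write page p back to disk

    def fetch(self, p):
        pass  # stub: read page p from disk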
Example 1: LRU Limitation
• Consider a non-clustered, primary B-tree index on the
SS# attribute of the Employee table.
– t(Emp) = 20,000
– P(Emp) = 10,000 (2 records per disk page)
– lp(I, Emp) = 100
– Workload: queries that retrieve Emp records using exact-match
predicates on the SS# attribute, e.g., SS# = 940-98-7555.
• If the B-tree is one level deep (the root node, followed by
the 100 leaf pages), the pattern of access is: Ir, I1, D1, Ir, I2,
D2, Ir, I3, D3, ….
• Assuming your buffer pool consists of 101 frames, what is
the ideal way to assign leaf pages and data pages to
these frames?
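
A hedged way to see the answer, assuming exact-match keys are uniformly distributed so that each query touches the root, one of the 100 leaf pages, and one of the 10,000 data pages:

  P(root referenced by a query)       = 1
  P(a given leaf page referenced)     = 1/100
  P(a given data page referenced)     = 1/10,000

Pinning the root plus all 100 leaf pages in the 101 frames means every query costs exactly one disk I/O (its data page); a frame spent on a data page would be re-used only once in 10,000 queries on average.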
What will LRU do?
• Applying the LRU pseudo-code above to our example: data
pages compete with the leaf pages, swapping them out.
This causes more disk I/O than necessary.
Example 2: LRU Limitation
• A banking application with good locality of
shared page references: e.g., 5,000 buffered
pages out of one million disk pages receive
95% of the references.
• Once a few batch processes begin sequential
scans through all one million pages, the
scanned pages swap out the 5,000 hot
buffered pages.
Possible Approaches
• Page pool tuning
• Query execution plan analysis
• LRU-K
Page pool tuning
• DBA constructs page pools, separating
different reference patterns into different
buffer pools.
• Disadvantages:
– Requires human effort.
– What happens when new reference patterns
are introduced, or existing reference
patterns disappear?
Query execution plan analysis
• Query optimizer should provide hints
about the usage pattern of a query plan
– Buffer pool manager employs FIFO for pages
retrieved by a sequential scan.
– Buffer pool manager employs LRU for index
pages.
• In multi-user situations, query optimizer
plans may overlap in complicated ways.
• What happens with Example 1?
LRU-K
• The victim page (the page to be dropped) is
the one whose backward K-distance is the
maximum over all pages in the buffer.
• Definition of backward K-distance bt(p,K):
given a reference string known up to time t
(r1, r2, …, rt), bt(p,K) is the distance backward
from t to the Kth most recent reference to
page p. If p has been referenced fewer than
K times, bt(p,K) is infinite.
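
A small Python sketch of this definition; the function name and the list-based reference string are illustrative:

import math

def backward_k_distance(refs, p, K):
    """bt(p, K) for reference string refs = [r1, ..., rt] (page ids):
    the x such that r_(t-x) is the Kth most recent reference to page p,
    or infinity if p has been referenced fewer than K times."""
    t = len(refs)
    seen = 0
    for i in range(t - 1, -1, -1):   # scan backward; refs[i] is r_(i+1)
        if refs[i] == p:
            seen += 1
            if seen == K:
                return (t - 1) - i   # distance back from r_t
    return math.inf                  # fewer than K references to p so far

# Example: backward_k_distance(['a', 'b', 'a', 'c', 'a'], 'a', 2) == 2,
# and the LRU-K victim is the buffered page p maximizing this value.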
LRU-K (Cont…)
• Design limitations:
– Early page replacement: An unpopular page
may observe correlated references shortly
after being referenced for the first time.
– Extra memory: LRU-K retains the reference
history of pages even after they leave the
buffer. LRU does not have this limitation; its
memory requirement is well defined (one
LAST value per buffered page).
Early Page Replacement
Key observation
• Two correlated references are insufficient reason
to conclude that independent references will
occur.
• One solution: the system should not drop a
page immediately after its first reference.
Instead, it should keep the page around for a
short period, until the likelihood of a dependent
follow-up reference is minimal; then the page
can be dropped. Moreover, correlated references
should not affect the interarrival times between
requests as observed by LRU-K, as sketched below.
• Correlated reference period = timeout.
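
A sketch of how the timeout might gate history updates; the constants and structure names are illustrative, and this simplifies the paper's algorithm, which additionally shifts older history entries so that a correlated burst does not distort interarrival times:

K = 2                      # history depth for LRU-K
CORRELATED_PERIOD = 30.0   # seconds; illustrative timeout value

def record_reference(hist, last, p, now):
    """hist[p]: uncorrelated reference times, most recent first (at most K).
    last[p]: time of p's most recent reference of any kind."""
    if p in hist and now - last[p] <= CORRELATED_PERIOD:
        last[p] = now                          # burst member: refresh LAST only
    else:
        hist.setdefault(p, []).insert(0, now)  # a new uncorrelated reference
        del hist[p][K:]                        # retain only the K most recent
        last[p] = now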
Memory required by LRU-K
• Why not keep the last K references in the
header of each disk page (instead of main
memory)?
– After all, when the page is memory resident,
its last K references are available.
Memory required by LRU-K
• Forget the history of pages using the 5-minute
rule: pages that are not referenced during the
last 5 minutes lose their history.
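
A sketch of this pruning rule; the retention constant is illustrative:

RETENTION_PERIOD = 300.0   # five minutes, in seconds

def prune_history(hist, last, now):
    """Drop history for pages not referenced within the retention period,
    bounding the memory LRU-K spends on non-resident pages."""
    for p in [q for q in hist if now - last[q] > RETENTION_PERIOD]:
        del hist[p]
        del last[p]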
Pseudo-code of LRU-K
• HIST and LAST are main-memory data structures.
• Optimization: use a tree search to find the page
with the maximum backward K-distance.
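
A hedged sketch of victim selection driven by these structures, using a linear scan for clarity; a balanced tree keyed on backward K-distance would implement the optimization above:

import math

def backward_k_distance_at(hist, p, K, now):
    """bt(p, K) from HIST: time since the Kth most recent uncorrelated
    reference to p, or infinity with fewer than K references on record."""
    times = hist.get(p, [])
    return now - times[K - 1] if len(times) >= K else math.inf

def choose_victim(buffer_pages, hist, last, K, now, correlated_period=30.0):
    """Evict the buffered page with the maximum backward K-distance,
    skipping pages still inside their correlated reference period."""
    victim, victim_dist = None, -1.0
    for q in buffer_pages:
        if now - last.get(q, now) <= correlated_period:
            continue                 # referenced too recently to judge
        d = backward_k_distance_at(hist, q, K, now)
        if d > victim_dist:
            victim, victim_dist = q, d
    return victim                    # None if every page is in a correlated burst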
Performance Analysis
• Compare LRU (LRU-1) with LRU-2 and
LRU-3.
• Three different workloads
• Measured metrics:
– Cache hit ratio for a given buffer pool size.
– How much larger must the buffer pool be with
LRU-1 in order to perform the same as LRU-2?
This value is reported as B(1)/B(2).
Workload 1: Two Pool Experiments
• Designed to resemble the LRU limitation of
Example 1.
• Two pools of disk pages: N1 and N2.
• References alternate between the two pools; a
page within pool Ni is chosen uniformly at random.
• What is the probability of reference to a
page in Pool N1?
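
A hedged worked answer, assuming references strictly alternate between the pools and each reference picks a page uniformly within its pool:

  P(a given page of N1 is referenced) = 1 / (2 * N1)
  P(a given page of N2 is referenced) = 1 / (2 * N2)

With N1 much smaller than N2, every N1 page is referenced far more often than any N2 page; this is exactly the frequency difference a replacement policy should detect.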
Obtained results for Workload 1
Key Observations
• LRU-3 is identical or very close to optimal.
• Why would one not choose K=3?
– For evolving access patterns, LRU-3 is less
adaptive than LRU-2 because it needs more
references to adapt to dynamic changes in
reference frequencies.
– LRU-3 requires a larger number of requests to
forget the past.
• Recommendation: adopt LRU-2 as a generally
efficient policy.
Workload 2: Zipfian random access
• 1,000 pages accessed using a Zipfian
distribution of access.
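
A sketch of how such a reference string could be generated; the skew parameter s and function name are illustrative, and the paper's exact Zipf parameterization may differ:

import random

def zipf_reference_string(n_pages=1000, n_refs=100_000, s=1.0, seed=42):
    """Page references where page i is drawn with probability
    proportional to 1 / i**s (classic Zipfian skew)."""
    rng = random.Random(seed)
    pages = list(range(1, n_pages + 1))
    weights = [1.0 / (i ** s) for i in pages]
    return rng.choices(pages, weights=weights, k=n_refs)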
Workload 3: Trace driven
• Traces were gathered for one hour from an
OLTP system used by a large bank.
• The number of unique page references is
470,000.
• Key observation: LRU-2 is superior to
both LRU and LFU.
Results for Workload 3
Workload 3
• LRU-2 is superior to both LFU and LRU.
• With small buffer sizes (< 600 pages), LRU-2
improved the buffer hit ratio by more than
a factor of 2.
• LFU is surprisingly good. Why not LFU?
1. LFU never forgets previous references when
it compares the priorities of pages. Hence, it
cannot adapt to evolving access patterns.
LRU-K
• Advantages:
1. Discriminates well between page sets with different
levels of reference frequency, e.g., index versus
data pages (Example 1).
2. Detects locality of reference within query
executions, across multiple queries in the same
transaction, and across multiple transactions
executing simultaneously.
3. Does not require external hints.
4. Fairly simple and incurs little bookkeeping
overhead.