Dueling Segmented LRU Replacement Algorithm

Hongliang Gao
Chris Wilkerson
The Basic Ideas
• Auxiliary Directory:
– Evaluates “dueling” replacement algorithms.
• Segmented LRU list:
– Reference bit protects lines with good locality.
– Aging / Random Promotion.
• Adaptive Bypass:
– Protect cache contents by bypassing the cache completely.
Dueling Replacement Algorithms
Auxiliary Directory
• 32 sets sampled (static).
• 2 policies evaluated in each sampled set.
• 16-bit mini-tags.
• Counter updated when policies differ.
[Diagram: tag array (Set0 to Set7), auxiliary directory, saturating counter.]
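The counter update described above can be sketched as follows. This is only an illustration: the counter width, the midpoint threshold, and the names `PolicySelector`, `update`, and `winner` are assumptions, not details given on the slide.

```python
class PolicySelector:
    """Saturating counter that arbitrates between two dueling policies.

    The bit width and the midpoint decision rule are assumed for illustration.
    """

    def __init__(self, bits=10):
        self.max_value = (1 << bits) - 1
        self.value = self.max_value // 2  # start near the midpoint

    def update(self, p0_hit, p1_hit):
        # The counter only moves when the two policies disagree on a sampled access.
        if p0_hit == p1_hit:
            return
        if p0_hit:
            self.value = max(0, self.value - 1)               # policy 0 did better
        else:
            self.value = min(self.max_value, self.value + 1)  # policy 1 did better

    def winner(self):
        # Low half of the range favors policy 0, high half favors policy 1.
        return 0 if self.value <= self.max_value // 2 else 1


def mini_tag(tag):
    """Keep only the low 16 bits of a tag, as a 16-bit mini-tag would."""
    return tag & 0xFFFF
```

In this sketch the non-sampled sets would simply follow whichever policy `winner()` currently reports.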
Review of Segmented LRU
SLRU: Reference Bit
• 4 LRU bits per line track LRU position.
• Reference bit is marked when a line is referenced.
• Replace any non-referenced lines first.
• Replace global LRU if all lines are referenced.
[Diagram: each tag entry carries a reference bit and 4 LRU bits.]
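A minimal sketch of the victim selection rule above. The per-line dictionary layout and the choice of the oldest line among the non-referenced ones are assumptions; the slide only says non-referenced lines go first.

```python
def choose_victim(lines):
    """Pick the way to replace in one set.

    lines: list of dicts with 'lru' (0 = MRU, larger = older) and 'ref' (bool).
    Returns the index of the victim way.
    """
    non_referenced = [i for i, line in enumerate(lines) if not line["ref"]]
    # Prefer non-referenced lines; fall back to the whole set (global LRU).
    candidates = non_referenced if non_referenced else list(range(len(lines)))
    # Assumption: among the candidates, evict the one furthest down the LRU stack.
    return max(candidates, key=lambda i: lines[i]["lru"])
```

With 4 LRU bits per line this corresponds to a set of up to 16 ways.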
SLRU Features
• Random Promotion
– Reference bit is marked when referenced or when randomly promoted.
– E.g.: 1 in 32 newly allocated lines may be randomly selected for promotion.
• Aging
– Reference bits can be cleared as well as set.
– Line allocations cause the reference bit of the LRU line to be cleared.
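The two features can be sketched together in one allocation routine. The data layout, the function name `allocate`, and the choice to age the line that is LRU after the new line has been inserted are assumptions made for illustration.

```python
import random


def allocate(lines, victim, promote_prob=1 / 32, aging=True, rng=random):
    """Install a new line in way `victim` and apply random promotion and aging.

    lines: list of dicts with 'lru' (0 = MRU, larger = older) and 'ref' (bool).
    """
    old_pos = lines[victim]["lru"]
    for line in lines:
        if line["lru"] < old_pos:
            line["lru"] += 1  # lines younger than the victim move down one position
    # The new line enters at MRU; it is marked referenced only if randomly promoted.
    lines[victim] = {"lru": 0, "ref": rng.random() < promote_prob}
    if aging:
        # Aging (assumed ordering): the line now at the LRU position loses protection.
        oldest = max(range(len(lines)), key=lambda i: lines[i]["lru"])
        lines[oldest]["ref"] = False
```

Setting `promote_prob=0` or `aging=False` disables the corresponding feature, which is how two dueling SLRU variants could differ.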
Adaptive Bypass
• Misses result in allocation or bypass.
[Diagram: data structure mapped onto the cache. Without bypass: thrashing on the 4th way. With bypass: no thrashing.]
• Bypass based on a random probability.
– E.g.: 1, 1/2, 1/4, … 1/4096.
– Probability is doubled/halved according to the success of previous bypasses.
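The doubling and halving of the bypass probability can be sketched by storing the probability as a power-of-two exponent. The class name, the method names, and the initial value are assumptions, and any additional zero-probability state is left out of this sketch.

```python
import random


class BypassController:
    """Bypass probability restricted to 1, 1/2, 1/4, ..., 1/4096."""

    MIN_SHIFT = 0   # probability 2**-0  = 1
    MAX_SHIFT = 12  # probability 2**-12 = 1/4096

    def __init__(self, shift=6):
        self.shift = shift  # assumed starting point: probability 1/64

    @property
    def probability(self):
        return 2.0 ** -self.shift

    def should_bypass(self, rng=random):
        # On a miss, bypass with the current probability; otherwise allocate.
        return rng.random() < self.probability

    def reward(self):
        # A previous bypass turned out well: double the probability (saturating at 1).
        self.shift = max(self.MIN_SHIFT, self.shift - 1)

    def penalize(self):
        # A previous bypass turned out badly: halve the probability (saturating at 1/4096).
        self.shift = min(self.MAX_SHIFT, self.shift + 1)
```

Keeping the exponent rather than the probability itself makes both updates a saturating increment or decrement.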
SLRU w/ Adaptive Bypassing
SLRU: Reference Bit
• De-allocated line tracked by partial tag.
• Allocated line tracked by 4 bit pointer.
• Valid Bit
• Virtual Bypass Bit
16 bit partial tag for “out-of-cache” competitor
4 bit pointer for “in-cache” competitor
[Figure: Frequency of Bypass. x-axis: benchmark (with average); y-axis: % bypass, 0.00% to 100.00%; series: P0_BYPASSED, P1_BYPASSED.]
[Figure: DSB impact on MPKI vs TLRU. x-axis: benchmark (with gmean); plotted: MPKI for true LRU, MPKI w/ DSB, and % reduction; axis range: -10 to 70.]
[Figure: Speedup vs TLRU. x-axis: benchmark (with gmean); y-axis: speedup, 0 to 1.8; series: DSB, NoAge, NoByp.]
BACKUP
SLRU w/ Adaptive Bypassing
SLRU: Reference Bit
• Bypass
• Bypassed line tracked by partial tag.
• Incumbent line tracked by 4 bit pointer.
• Subsequent reference to bypassed line reduces bypass probability.
• Subsequent reference to incumbent increases bypass probability.
16 bit partial tag for “out-of-cache” competitor
4 bit pointer for “in-cache” competitor
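The feedback rule above can be sketched as a small per-set tracker. The class and method names are hypothetical, and clearing the valid bit once feedback has been produced is an assumption.

```python
class BypassTracker:
    """Per-set record of the last bypass: who stayed out and who stayed in."""

    def __init__(self):
        self.valid = False
        self.partial_tag = 0    # 16 bit partial tag of the "out-of-cache" competitor
        self.incumbent_way = 0  # 4 bit pointer to the "in-cache" competitor

    def record_bypass(self, tag, incumbent_way):
        self.valid = True
        self.partial_tag = tag & 0xFFFF
        self.incumbent_way = incumbent_way & 0xF

    def observe(self, tag, hit_way=None):
        """Return +1 to raise the bypass probability, -1 to lower it, 0 otherwise.

        hit_way is the way that hit, or None if the access missed.
        """
        if not self.valid:
            return 0
        if hit_way is not None and hit_way == self.incumbent_way:
            self.valid = False  # incumbent was reused: the bypass was a good call
            return 1
        if hit_way is None and (tag & 0xFFFF) == self.partial_tag:
            self.valid = False  # bypassed line was wanted again: the bypass was a bad call
            return -1
        return 0
```

Because only 16 bits of the tag are kept, two different lines can occasionally alias to the same partial tag; the sketch accepts that imprecision.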
Parameter                                             CONFIG 1   CONFIG 2   CONFIG 3
Enable bypassing for policy0                          True       True       True
Enable bypassing for policy1                          False      True       True
Random promotion probability for policy0              0          0          0
Random promotion probability for policy1              0          0          16
Aging for policy0                                     0          0          0
Aging for policy1                                     1          1          1
Virtual bypassing probability                         16         8          8
Initial bypassing probability                         64         64         8
Second minimum bypassing probability (minimum is 0)   1/256      1/4096     1/4096
Config2:
• 2 Policies: the auxiliary directory collects statistics on replacement policy performance and updates a policy selector counter.
• SLRU: 1 reference bit indicates whether each line is in the reference or non-reference list.
• 4 LRU bits per line track LRU position.
• Valid bits.
• Tracking bypass: 16 bit partial tag for the “out-of-cache” competitor; 4 bit pointer for the “in-cache” competitor.
[Diagram: tag array, Set0 to Set7.]