
Cache Replacement in Modern Processors
Prepared By:
Paul Kosinski and Bridget Johnston
Introduction

The replacement policy specifies which block
should be evicted when a new block must be
brought into an already full cache. It should be
chosen so that blocks likely to be referenced in
the near future are retained in the cache.
Common Cache Replacement Policies Used




Least Recently Used (LRU)
First In First Out (FIFO)
Last In First Out (LIFO)
Random (Rand)
Least Recently Used



Replaces the block in the cache which has not
been used for the longest period of time
Advantage: simplicity
Drawback: does not consider file sizes & latency
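
As a minimal sketch of the idea (not the hardware implementation), the following Python function counts misses for a small fully associative cache managed with LRU. The function name, the trace format (a list of block identifiers), and the capacity parameter are assumptions made for illustration.

```python
from collections import OrderedDict

def lru_misses(trace, capacity):
    """Count misses for a fully associative cache of `capacity` blocks under LRU."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)       # hit: mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return misses

print(lru_misses([1, 2, 3, 1, 4, 2], capacity=3))  # prints 5
```

In this trace the hit on block 1 makes block 2 the least recently used, so block 2 is evicted when block 4 arrives and misses again on its next reference.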
First In First Out


Replaces the oldest block (the one that has been in
the cache longest) rather than the least recently used.
Advantage: easier to implement, since nothing has to be updated on a hit
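
A rough sketch of FIFO replacement, assuming the same illustrative setup as a software simulation (a list of block identifiers as the trace and a fully associative cache). The names are hypothetical. Note that a hit changes no state, which is where the simplicity comes from.

```python
from collections import deque

def fifo_misses(trace, capacity):
    """Count misses for a fully associative cache of `capacity` blocks under FIFO."""
    queue = deque()   # resident blocks in arrival order, oldest on the left
    resident = set()  # same blocks, kept for fast membership tests
    misses = 0
    for block in trace:
        if block in resident:
            continue  # hit: FIFO does no bookkeeping here
        misses += 1
        if len(queue) >= capacity:
            resident.discard(queue.popleft())  # evict the oldest block
        queue.append(block)
        resident.add(block)
    return misses

print(fifo_misses([1, 2, 3, 1, 4, 1], capacity=3))  # prints 5
```

Block 1 is evicted when block 4 arrives even though it was just referenced, because FIFO looks only at arrival order.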
Last In First Out

Replaces the newest block (the one most recently brought into the cache)
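
A small sketch of LIFO replacement under the same illustrative assumptions (a trace of block identifiers, a fully associative cache, hypothetical function name). Once the cache is full, only the last slot ever changes.

```python
def lifo_misses(trace, capacity):
    """Count misses for a fully associative cache of `capacity` blocks under LIFO."""
    stack = []  # resident blocks in arrival order; the last element is the newest
    misses = 0
    for block in trace:
        if block in stack:
            continue        # hit: nothing to update
        misses += 1
        if len(stack) >= capacity:
            stack.pop()     # evict the newest resident block
        stack.append(block)
    return misses

print(lifo_misses([1, 2, 3, 4, 5, 1, 2], capacity=3))  # prints 5
```

Blocks 1 and 2 stay resident for the whole trace, while blocks 3, 4, and 5 take turns in the last slot.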
Random



The block to evict is selected at random.
Assumes all blocks are accessed with equal
probability
Used as a baseline for comparing other policies
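
A minimal sketch of random replacement, again as a software simulation with hypothetical names. The seed parameter is an addition for reproducibility only; the miss count depends on it, so no particular result is claimed here.

```python
import random

def random_misses(trace, capacity, seed=0):
    """Count misses for a fully associative cache under random replacement."""
    rng = random.Random(seed)  # private generator so runs are repeatable
    resident = []
    misses = 0
    for block in trace:
        if block in resident:
            continue  # hit: nothing to update
        misses += 1
        if len(resident) >= capacity:
            victim = rng.randrange(len(resident))  # every slot equally likely
            resident[victim] = block
        else:
            resident.append(block)
    return misses

print(random_misses([1, 2, 3, 4, 1, 2, 3, 4], capacity=2))
```

Running the same trace through this function and through another policy gives the kind of baseline comparison the slide describes.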
Problem Statement

We will show the cache replacement policy
used by the Pentium III and Pentium IV
processors and suggest why each policy was
chosen. We will use a simple matrix multiplication
program and VTune to gather our measurements.
Experimental Setup




Matrix Multiplication program
VTune
Intel Pentium III machine
Intel Pentium IV machine
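
The slides do not include the matrix multiplication program itself. As an assumption, a test program of this kind is typically the textbook triple loop, sketched below in Python for readability. The function name and the list-of-lists representation are illustrative, and Python lists do not share the contiguous row-major layout that a compiled version would use.

```python
def matmul(a, b):
    """Multiply two n x n matrices given as lists of rows (naive triple loop)."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                # a is read along a row, b is read down a column
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # prints [[19.0, 22.0], [43.0, 50.0]]
```

In a row-major compiled implementation the column walk over b touches a new cache block on almost every step once n is large, which is what makes this loop a useful stress test for a cache.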
Experimental Process

Ran the matrix multiplication program on the Pentium III and Pentium IV
with the following values of n as the dimensions of an n x n matrix:
Series 1: 128, 256, 512, 1024, 2048, 4096, 8192
Series 2: 160, 320, 640, 1280, 2560, 5120, 10240
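
The slides do not say why two input series were used. One plausible reason (an assumption, not stated by the authors) is that power-of-two dimensions make a column walk land on very few cache sets, while dimensions such as 160 spread it over more. The sketch below counts the distinct sets touched by one column of a row-major matrix. The element size, line size, and number of sets are hypothetical parameters chosen for illustration, not measured values.

```python
def sets_touched_by_column(n, elem_bytes=8, line_bytes=32, num_sets=128):
    """Distinct cache sets hit when walking one column of an n x n row-major matrix."""
    stride = n * elem_bytes  # bytes between vertically adjacent elements
    return len({(row * stride // line_bytes) % num_sets for row in range(n)})

for n in (128, 160, 256, 320):
    print(n, sets_touched_by_column(n))
# prints:
# 128 4
# 160 16
# 256 2
# 320 8
```

With these assumed parameters the n = 128 column competes for only 4 sets, so the replacement policy within each set is exercised heavily, whereas n = 160 spreads the same walk over 16 sets.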
Intel Pentium III results
[Chart: SB to DM Access and L2CM to DM Access (y-axis 0 to 1.2) versus matrix dimension n = 128, 256, 512, 1024, 2048, 4096, 8192]
Intel Pentium III results
[Chart: SB to DM Access and L2CM to DM Access (y-axis 0 to 1.4) versus matrix dimension n = 160, 320, 640, 1280, 2560, 5120, 10240]
Intel Pentium III results
[Chart: L1CM to DM Access (y-axis 0 to 70) versus matrix dimension n = 128, 256, 512, 1024, 2048, 4096, 8192]
Intel Pentium III results
[Chart: L1CM to DM Access (y-axis 0 to 120) versus matrix dimension n = 160, 320, 640, 1280, 2560, 5120, 10240]
Intel Pentium IV results
[Chart: L1CM to SR and L2CM to SR (y-axis 0 to 1000) versus matrix dimension n = 128, 256, 512, 1024, 2048, 4096, 8192]
Intel Pentium IV results
[Chart: L1CM to SR and L2CM to SR (y-axis 0 to 600) versus matrix dimension n = 160, 320, 640, 1280, 2560, 5120, 10240]
Conclusion



Results were inconclusive
Simple hardware is required due to timing constraints
This means only simple algorithms should be used:
 LRU
 FIFO
