CSCE614-2013-HW2

CSCE614 Computer Architecture (Fall 2013)
Assignment #2
Due: 10/7 11:20AM (the report must be submitted in class)
Objective
The goal of this project is to help you understand how a pseudo-associative (column-associative) cache works. You
will first analyze the sensitivity of L1 caches to changes in parameters. You will then implement the L1
data cache as a pseudo-associative cache in SimpleScalar and compare its performance with that of a normal
direct-mapped L1 data cache.
System Requirement
A Linux operating system is needed to use the pre-compiled little-endian Alpha ISA SPEC2000
binaries. Do not use Cygwin. If you do not have a Linux machine, please use linux.cs.tamu.edu with your
CS account. If you do not have a CS account, contact the HelpDesk located on the first floor.
Setting up the environment and installing SimpleScalar
1. Download and Install SimpleScalar 3.0.
(1) Download simplesim-3v0e.tgz from http://www.simplescalar.com/.
(2) Untar the downloaded file.
$ tar xzvf simplesim-3v0e.tgz
(3) Read the README file in the simplesim-3.0 directory you have just untarred.
(4) Compile the simulator according to the instructions.
$ make config-alpha
$ make
Note: Some versions of GCC may generate compilation errors. In that case, use the version of GCC that is already
installed on the department Linux machine, linux.cs.tamu.edu.
(5) After building the simulator, execute 'sim-outorder' to list all the configurable
parameters of the out-of-order simulator and their default values.
2. Get the benchmark.
(1) Download the Alpha binaries of the SPEC CPU2000 benchmarks from the following link.
http://students.cse.tamu.edu/rahulboyapati/spec2000binary.tgz
(2) Untar the downloaded file.
$ tar xzvf spec2000binary.tgz
3. Get run scripts and argument files.
(1) Download files from the following links.
http://students.cse.tamu.edu/rahulboyapati/spec2000args.tgz
http://students.cse.tamu.edu/rahulboyapati/runscripts.tgz
(2) Untar the files using tar command.
(3) Each run script contains the commands needed to run its benchmark.
(4) Each benchmark needs its own arguments, which are stored in the argument files.
(5) Select two integer and two floating-point benchmarks according to the last digit of your UIN.
Last digit of UIN   Integer          Floating Point
0                   bzip2, crafty    ammp, applu
1                   crafty, gap      applu, apsi
2                   gap, gcc         apsi, art
3                   gcc, gzip        art, equake
4                   gzip, mcf        equake, fma3d
5                   mcf, parser      fma3d, galgel
6                   parser, twolf    galgel, lucas
7                   twolf, vortex    lucas, mesa
8                   vortex, vpr      mesa, mgrid
9                   vpr, bzip2       mgrid, swim
4. Run benchmarks using compiled SimpleScalar binary.
(1) Copy the script to the directory where the argument files are stored.
Note: The script file and argument files must be in the same directory.
$ cp (script dir)/RUN(benchmark) (spec2000args dir)/(benchmark)
Ex) Assuming tar files are extracted in the current directory
$ cp runscripts/RUNequake spec2000args/equake
(2) Run the script
$ cd (spec2000args dir)/(benchmark)
$ ./RUN(benchmark) (simplescalar dir)/sim-outorder (spec2000bin dir)/(benchmark)00.peak.ev6 (simplescalar options)
Ex) Assuming tar files are extracted in the current directory
$ cd spec2000args/equake
$ ./RUNequake ../../simplesim-3.0/sim-outorder ../../spec2000binaries/equake00.peak.ev6 -max:inst 50000000 -fastfwd 20000000 -redir:sim output1.txt -bpred bimod -bpred:bimod 256 -bpred:ras 8 -bpred:btb 64 2
Procedure
Implement a pseudo-associative cache in L1 data cache in SimpleScalar. Run sim-outorder to compare
the performance to the normal direct-mapped L1 data cache using SPEC2000 benchmarks. Use the
integer and floating-point benchmarks according to the last digit of your UIN.
When running sim-outorder, use the following options as default.
-max:inst 50000000 -fastfwd 20000000 -redir:sim sim_output_file
-bpred 2lev -bpred:2lev 1 256 4 0 -bpred:ras 8 -bpred:btb 64 2
Because the assignment requires you to modify the L1 cache configurations, you can use a unified 64
KB L2 cache with a 64 B cache block and 2-way associativity.
If you are running SimpleScalar on linux.cse.tamu.edu, make sure you do not monopolize the computational
resources of the machine. Do not run more than one instance at a time on linux.cse.tamu.edu; doing so violates
section 3.3 of the Appropriate Use of Computer Science Computing Resources Policy, located here:
http://www.cse.tamu.edu/department/policies/resources
Do not run more than one instance of any benchmark simultaneously on the same machine, as this may cause
errors. Run one instance at a time per benchmark.
Assignment
Part A.
In the first part of the assignment you will be evaluating the sensitivity of L1 caches to changes in various
parameters like cache size, block size, associativity and replacement policy. You will need to run the
simulations on all the different configurations and analyze the effects of changing cache parameters on
the performance.
Configuration   Size    Associativity     Cache block size
1 (baseline)    4 KB    Direct mapped     32 B
2               4 KB    {4, 8, fully}     32 B
3               4 KB    Direct mapped     64 B
4               16 KB   Direct mapped     32 B
Replacement policy: {LRU, random}
Report the relevant cache performance results and analyze why you see this behavior, explaining the
behavioral patterns you observe in each of the configurations. Also explain how a change in L1 cache
performance affects the performance of the L2 cache.
You need to read up on the options required to simulate these cache configurations. They appear among
the configurable parameters printed when you execute sim-outorder, as in step 1.(5) of setting up the
simulator.
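The set counts for the configurations above follow from size = sets * block size * associativity. The sketch below works this out in C and formats a configuration string of the form <name>:<nsets>:<bsize>:<assoc>:<repl>, the form the dl1:128:32:1:l string in Part B follows. The helper names cache_nsets and cache_config are hypothetical and are not part of SimpleScalar; check the option format against the sim-outorder help output before relying on it.

```c
#include <assert.h>
#include <stdio.h>

/* Number of sets for a cache: nsets = size / (block size * associativity).
 * Hypothetical helper for illustration; not part of SimpleScalar. */
static unsigned cache_nsets(unsigned size_bytes, unsigned block_bytes,
                            unsigned assoc)
{
    assert(block_bytes > 0 && assoc > 0);
    assert(size_bytes % (block_bytes * assoc) == 0);
    return size_bytes / (block_bytes * assoc);
}

/* Writes "<name>:<nsets>:<bsize>:<assoc>:<repl>" into buf and returns the
 * value of snprintf (the length of the formatted string). */
static int cache_config(char *buf, size_t len, const char *name,
                        unsigned size_bytes, unsigned block_bytes,
                        unsigned assoc, char repl)
{
    return snprintf(buf, len, "%s:%u:%u:%u:%c", name,
                    cache_nsets(size_bytes, block_bytes, assoc),
                    block_bytes, assoc, repl);
}
```

For example, cache_nsets(4096, 32, 1) gives 128 sets for the baseline, cache_nsets(4096, 32, 128) gives 1 set for the fully associative case, and cache_nsets(65536, 64, 2) gives 512 sets for the 64 KB L2 cache.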
Part B.
0. Reading
(1) Anant Agarwal and Steven D. Pudar, “Column-Associative Caches: A Technique for Reducing the
Miss Rate of Direct-Mapped Caches,” ISCA 1993
1. Guideline
Direct-mapped caches are simple, easy to design, and have a short hit access time.
Their biggest drawback, however, is the large number of conflict misses.
Pseudo-associative caches resolve conflicts by allowing alternate hashing functions and achieve a much
higher hit rate than normal direct-mapped caches while maintaining almost the same hit access time.
Structurally, a pseudo-associative cache is the same as a direct-mapped cache. The fundamental idea is to
resolve conflicts by dynamically choosing different locations, which are accessed by different hashing
functions. When a conflict miss happens, the pseudo-associative cache tries to avoid it by relocating the
cache block using a second (rehashing) function. The simplest rehashing function is bit selection
with the highest-order index bit inverted, which is called bit flipping.
To avoid the secondary thrashing effect, which is explained in detail in the reference paper, each
cache block is extended with one extra bit, called the rehash bit, that indicates whether the
block is in a rehashed location.
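The lookup described above can be sketched as a small standalone model. This is a minimal sketch under stated assumptions, not SimpleScalar code: every name (toy_cache, toy_access, and so on) is hypothetical, the cache has only 8 sets, and addresses are block addresses (byte address divided by block size). It assumes that a hit in the rehashed location swaps the two blocks and that a block whose rehash bit is set is replaced immediately; compare this with the reference paper before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_INDEX_BITS 3u                    /* 8 sets: hypothetical toy size */
#define TOY_NSETS      (1u << TOY_INDEX_BITS)

struct toy_blk {
    bool     valid;
    bool     rehash;  /* true: frame is unused or holds a rehashed block */
    uint32_t tag;     /* tag bits plus the high-order index bit */
};

struct toy_cache {
    struct toy_blk blk[TOY_NSETS];
    unsigned hits, misses;
};

static void toy_init(struct toy_cache *c)
{
    for (unsigned i = 0; i < TOY_NSETS; i++) {
        c->blk[i].valid = false;
        c->blk[i].rehash = true;  /* rehash bit starts at 1 */
        c->blk[i].tag = 0;
    }
    c->hits = c->misses = 0;
}

/* Conventional bit selection on a block address. */
static unsigned toy_index(uint32_t baddr) { return baddr & (TOY_NSETS - 1u); }

/* Bit flipping: the same index with its highest-order bit inverted. */
static unsigned toy_rehash_index(unsigned idx) { return idx ^ (TOY_NSETS >> 1); }

/* Keep the high-order index bit in the tag so that the two blocks that can
 * share a frame through the two hash functions stay distinguishable. */
static uint32_t toy_tag(uint32_t baddr) { return baddr >> (TOY_INDEX_BITS - 1u); }

static bool toy_match(const struct toy_blk *b, uint32_t tag)
{
    return b->valid && b->tag == tag;
}

/* Returns true on a hit (first probe or rehash probe), false on a miss. */
static bool toy_access(struct toy_cache *c, uint32_t baddr)
{
    unsigned b1 = toy_index(baddr);
    unsigned b2 = toy_rehash_index(b1);
    uint32_t tag = toy_tag(baddr);
    struct toy_blk fresh = { true, false, tag };

    if (toy_match(&c->blk[b1], tag)) {       /* first-time hit */
        c->hits++;
        return true;
    }
    if (c->blk[b1].rehash) {                 /* rehashed block here: replace now */
        c->blk[b1] = fresh;
        c->misses++;
        return false;
    }
    bool hit = toy_match(&c->blk[b2], tag);  /* second probe at flipped index */
    c->blk[b2] = c->blk[b1];                 /* old b1 block moves to b2 ...   */
    c->blk[b2].rehash = true;                /* ... and is marked as rehashed  */
    c->blk[b1] = fresh;                      /* requested block now sits at b1 */
    if (hit) c->hits++; else c->misses++;
    return hit;
}

/* Replays a trace of block addresses on a fresh cache; returns the hit count. */
static unsigned toy_count_hits(const uint32_t *trace, unsigned n)
{
    struct toy_cache c;
    assert(n == 0 || trace);
    toy_init(&c);
    for (unsigned i = 0; i < n; i++)
        toy_access(&c, trace[i]);
    return c.hits;
}
```

Block addresses 0 and 8 both map to index 0, so a direct-mapped cache of this size misses on every access of the trace 0, 8, 0, 8. In this model the first two accesses miss and the last two hit, because the displaced block is kept at index 4.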
2. Design
Add a new CACHE_TAG_PSEUDOASSOC macro in cache.c to get a tag value with the high-order bit
of the index appended at the end.
#define CACHE_TAG_PSEUDOASSOC(cp, addr) …
Add one more variable to struct cache_blk_t for the rehash bit, as follows. The rehash bit must be
initialized to 1 when the pseudo-associative cache is first created in the cache_create() function in cache.c.
int rehash_bit;
You must modify the cache_access() function in cache.c to implement the pseudo-associative cache for
L1D. Because cache_access() is a general function used by all caches in the system and the
pseudo-associative cache applies only to L1D, the new pseudo-associative code must take effect only
for L1D.
3. Implementation
Add the following option for the pseudo-associative cache.
-pseudoassoc <true/false> # false # use pseudo-associative cache in L1D
4. Comparison
Compare performance of the two L1D cache configurations assuming the same size (128 sets * 32-byte
block size = 4KB).
(1) Normal direct-mapped L1D : -cache:dl1 dl1:128:32:1:l -pseudoassoc false
(2) Pseudo-associative L1D : -cache:dl1 dl1:128:32:1:l -pseudoassoc true
You do not need to model the different hit access times of the pseudo-associative cache. Focus only on
hit/miss rates (dl1.hits/misses/miss_rate in SimpleScalar results).
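When comparing the two runs, the miss rate and the relative reduction in misses can be computed from the reported hit and miss counts. The sketch below assumes that the number of accesses is hits + misses and that both runs execute the same instruction window, so their miss counts are directly comparable. The function names are hypothetical.

```c
#include <assert.h>

/* Miss rate as a fraction of all accesses (accesses = hits + misses). */
static double miss_rate(unsigned long hits, unsigned long misses)
{
    assert(hits + misses > 0);
    return (double)misses / (double)(hits + misses);
}

/* Fraction of the baseline's misses removed by the new configuration;
 * negative if the new configuration misses more often. */
static double miss_reduction(unsigned long base_misses,
                             unsigned long new_misses)
{
    assert(base_misses > 0);
    return 1.0 - (double)new_misses / (double)base_misses;
}
```

For example, if the direct-mapped run reports 200 misses and the pseudo-associative run reports 150, miss_reduction(200, 150) is 0.25, that is, a quarter of the misses were removed.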
Turn-in Instructions
1. Pack all your files, including modified source code, simulation results, and the report, into one zipped
file. We accept zip files only. If you send a different file format, you may receive 0 points for the
assignment.
Your report must contain simulation results and an analysis of them. (Include the SimpleScalar log files in
the zipped file, but do not put the whole log in your report.) You may use any result you consider
important. Only Microsoft DOC (DOCX) or PDF is acceptable for the report.
2. Send the zipped file to rahul_1@neo.tamu.edu with the following in the email’s subject line:
Assignment2 (Your Full Name)
3. “IMPORTANT” Be sure to turn in a hard copy of your report, including simulation results, in
class. It must be the same as the one submitted via email.
4. Penalty for late submission: 5% deduction per day