CMPE 110 – Fall 2006 – Homework 7
Caches

Name: _________________ Email: _________________ Partners: _______________

1. (6 points) In this problem, we are designing a computer with a physically addressable space of 16 GB of main memory. The CPU has an 8-byte word size. Figure out the parameters for a number of different off-CPU cache configurations. You may state powers of 2 (e.g., 2^10 instead of 1024) for your answers if you wish.

a) Direct-mapped 4 MB write-back cache with 16-word blocks:

   How many lines are in the cache? 2^15
      16 words per line * 8 bytes/word = 128 bytes per line (7 bits). 4 MB is 22 bits, and 22 bits - 7 bits = 15 bits.

   How many sets are in the cache? 2^15
      Direct mapped, so one set per line.

   How many bits are used for byte offset? 3
      8 bytes per word, so need 3 bits.

   How many bits are used for word offset? 4
      16 words per block, so need 4 bits.

   How many bits are in the index? 15
      Direct mapped, so the index selects one of the 2^15 lines.

   How many bits is the address tag? 12
      16 GB is 34 bits, minus 15, minus 4, minus 3 = 12 bits.

   Make a diagram showing which bits of the physical addr are used for which purposes in the cache. State the field name and bit length.
      | 12 bits tag | 15 bits index | 4 bits word offset | 3 bits byte offset |

   Make a diagram of a cache line. State the field name and number of bits.
      | 1 bit valid | 1 bit dirty | 12 bits tag | 16*8*8 bits data |

   How many bits total in the cache? 34013184
      4 MB * 8 + 2^15 * 14

   What percent is overhead? 1.3%
      2^15 * 14 / (2^25 + 2^15 * 14), where 2^25 is the 4 MB of data expressed in bits.

b) 2-way set associative 4 MB write-back cache with 8-word blocks. This cache has a least recently used replacement policy.

   How many blocks are in the cache? 2^16
      22 bits (4 MB) - 3 bits (8-word block) - 3 bits (8-byte word) = 16 bits.

   How many sets are in the cache? 2^15
      Two blocks per set, so 1 bit less.

   How many bits are used for byte offset? 3
      8 bytes per word, so need 3 bits.

   How many bits are used for word offset? 3
      8-word block, so need 3 bits.

   How many bits are in the index? 15
      The sets are indexed, so 15 bits.

   How many bits is the address tag?
      13 bits: 16 GB is 34 bits, minus 15, minus 3, minus 3 = 13 bits.

   Make a diagram showing which bits of the physical addr are used for which purposes in the cache. State the field name and bit length.
      | 13 bits tag | 15 bits index | 3 bits word offset | 3 bits byte offset |

   Make a diagram of a cache line. State the field name and number of bits.
      | 1 bit valid | 1 bit dirty | 1 bit LRU | 13 bits tag | 8*8*8 bits data |

   How many bits total in the cache? 34603008
      (8*8*8 + 16) * 2^16

   What percent is overhead? 3%
      16 / (16 + 8*8*8)

c) 4-way set associative 4 MB write-back cache with 4-word blocks. This cache has a LRU replacement policy.

   How many blocks are in the cache? 2^17
      22 bits (4 MB) - 2 bits (4-word block) - 3 bits (8-byte word) = 17 bits.

   How many sets are in the cache? 2^15
      Four blocks per set, so 2 bits less.

   How many bits are used for byte offset? 3
      8 bytes per word, so need 3 bits.

   How many bits are used for word offset? 2
      4-word block, so need 2 bits.

   How many bits are in the index? 15
      The sets are indexed, so 15 bits.

   How many bits is the address tag? 14
      16 GB is 34 bits, minus 15, minus 2, minus 3 = 14 bits.

   Make a diagram showing which bits of the physical addr are used for which purposes in the cache. State the field name and bit length.
      | 14 bits tag | 15 bits index | 2 bits word offset | 3 bits byte offset |

   Make a diagram of a cache line. State the field name and number of bits.
      | 1 bit valid | 1 bit dirty | 2 bits LRU | 14 bits tag | 4*8*8 bits data |

   How many bits total in the cache? 35913728
      (18 + 4*8*8) * 2^17

   What percent is overhead? 6.6%
      18 / (18 + 4*8*8)

2. (2 points) Answer the questions given the following line from a cache that uses a strict least recently used replacement policy, has 2^14 sets and an 8-word block:

      | 1 bit valid | 1 bit dirty | 4 bits LRU | 15 bits tag | 1024 bits data |

   How many way set associative is the cache? 16
      The number of LRU bits tells how many blocks are in a set (2^4 = 16).

   How many bits is the index of the cache? 14
      A set is indexed, so 14 bits.

   How many lines are in the cache?
      2^18: 14 bits (number of sets) plus 4 bits (number of lines per set).

   What is the data size of the cache? 2^5 MB
      2^18 lines * 1024 bits / 8 bits per byte; in powers of 2, 18 bits + 10 bits - 3 bits = 25 bits, and 25 bits - 20 bits (1 MB) = 5 bits.

   What is the word size of this machine? 16 B
      1024 bits / 8 words / 8 bits per byte; in powers of 2, 10 - 3 - 3 = 4, so 2^4 = 16 bytes per word.

   How large is the main memory of the machine? 64 GB
      15 bits tag + 14 bits index + 3 bits word offset + 4 bits byte offset = 36 bits, and 2^36 bytes = 64 GB.

   Make a diagram showing which bits of the physical addr are used for which purposes in the cache. State the field name and bit length.
      | 15 bits tag | 14 bits index | 3 bits word offset | 4 bits byte offset |

3. (2 points) Evaluate memory to go with a 4 GHz processor. Choose from two different kinds of memory:

      DDR400 (200 MHz clock) with timings of 2-2-2-5-T2 and an 8-byte-wide bus
      DDR533 (266 MHz clock) with timings of 3-3-3-8-T2 and an 8-byte-wide bus

   See the class notes, pg 13-28 and 13-29 for DDR memory timing. Only consider RAS to CAS delay, and the delay from CAS to data. Assume a burst length of 4 for both. Assume every cache miss causes one row address to be latched, and that cache blocks are aligned with column boundaries. The processor's cache has a 4-word block, and each word is 8 bytes.

   How many processor clock cycles does each kind of memory take to service a cache miss?

   DDR400: 110
      The 4 GHz processor cycle time is 1 / (4 * 10^9) s = 250 ps. The memory cycle time is 5 ns.
      There is one tRCD delay, which is the RAS to CAS delay. This is 2 memory cycles = 10 ns.
      Each transfer is 8 bytes, which is 1 word, so 4 transfers fill a cache block. With a burst length of 4, only a single CAS is needed.
      The CAS to data delay is 2 cycles = 10 ns.
      Then 1.5 cycles more until the end of the data burst (the remaining 3 transfers at two per clock), for 7.5 ns.
      Grand total: 10 + 10 + 7.5 = 27.5 ns.
      The number of processor clock cycles is 27.5 ns / 0.25 ns = 110.
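The DDR400 arithmetic above can be sketched as a small Python function. This is only an illustration under the problem's stated assumptions (one tRCD, one CAS latency, then the rest of a double-data-rate burst); the function name `miss_cycles` and its parameter names are hypothetical and do not come from the class notes.

```python
def miss_cycles(cpu_hz, mem_hz, t_rcd, t_cl, burst_length):
    """Processor cycles to service one cache miss.

    Assumes, as the problem does, that only RAS-to-CAS delay (t_rcd),
    CAS-to-data delay (t_cl), and the burst itself matter.
    """
    mem_cycle_ns = 1e9 / mem_hz
    cpu_cycle_ns = 1e9 / cpu_hz
    # DDR moves two transfers per memory clock. The first transfer arrives
    # at the end of t_cl, so the remaining ones take half a cycle each.
    burst_tail_cycles = (burst_length - 1) / 2
    total_ns = (t_rcd + t_cl + burst_tail_cycles) * mem_cycle_ns
    return total_ns / cpu_cycle_ns


# DDR400: 200 MHz clock, tRCD = 2, CL = 2, burst length 4, 4 GHz CPU
print(miss_cycles(4e9, 200e6, t_rcd=2, t_cl=2, burst_length=4))  # 110.0
```

Other timings can be checked by changing the arguments; the clock frequency passed in determines the memory cycle time, so a nominal value such as 266 MHz should be entered at whatever precision the comparison needs.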
   DDR533: 112.5
      Memory cycle time is 3.75 ns.
      tRCD is 3 memory cycles = 11.25 ns.
      tCL is 3 memory cycles = 11.25 ns.
      1.5 cycles to get the rest of the data = 5.625 ns.
      Grand total = 28.125 ns.
      Processor cycles is 28.125 ns / 0.25 ns = 112.5.

   Which memory will give the system higher performance?
      A: DDR400
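The address-field and overhead arithmetic in problem 1 follows the same pattern for all three configurations, so it can be checked with a short Python sketch. The names `log2_exact` and `cache_fields` are hypothetical helpers introduced here, and the per-line LRU bit counts are taken from the solutions above rather than derived.

```python
def log2_exact(n):
    """Return log2(n) for a power of two, without floating point."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1


def cache_fields(addr_bits, cache_bytes, ways, words_per_block,
                 bytes_per_word, lru_bits):
    """Address field widths and total storage bits for a write-back cache."""
    byte_offset = log2_exact(bytes_per_word)
    word_offset = log2_exact(words_per_block)
    block_bytes = words_per_block * bytes_per_word
    blocks = cache_bytes // block_bytes
    sets = blocks // ways
    index = log2_exact(sets)
    tag = addr_bits - index - word_offset - byte_offset
    overhead_bits = 1 + 1 + lru_bits + tag   # valid + dirty + LRU + tag
    data_bits = block_bytes * 8
    return {
        "blocks": blocks,
        "sets": sets,
        "byte_offset": byte_offset,
        "word_offset": word_offset,
        "index": index,
        "tag": tag,
        "total_bits": blocks * (overhead_bits + data_bits),
        "overhead_pct": 100 * overhead_bits / (overhead_bits + data_bits),
    }


MB = 2 ** 20
# Problem 1: 34-bit physical address (16 GB), 4 MB cache, 8-byte words
part_a = cache_fields(34, 4 * MB, ways=1, words_per_block=16, bytes_per_word=8, lru_bits=0)
part_b = cache_fields(34, 4 * MB, ways=2, words_per_block=8, bytes_per_word=8, lru_bits=1)
part_c = cache_fields(34, 4 * MB, ways=4, words_per_block=4, bytes_per_word=8, lru_bits=2)
for name, part in (("a", part_a), ("b", part_b), ("c", part_c)):
    print(name, part)
```

Using integer bit operations instead of a floating-point logarithm keeps every field width exact and rejects sizes that are not powers of two.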