CSCE614 Computer Architecture (Spring 2012)
Assignment #1
Due: 2/13 (Mon) 2:50PM

1. A new, faster machine is being announced. According to the announcement, the old machine and the new machine have the following specs:

A. The old machine: Clock rate of 2 GHz

   Instruction Class   CPI   Frequency
   A                   2     45%
   B                   3     30%
   C                   3     20%
   D                   8     5%

B. The new machine: Clock rate of 2.5 GHz

   Instruction Class   CPI   Frequency
   A                   2     35%
   B                   2     30%
   C                   3     25%
   D                   5     10%

(a) What is the CPI for each computer?
(b) What are the native MIPS ratings for those two machines?

    MIPS = Instruction Count / (10^6 * Execution Time)

(c) What is the speedup of the new machine?

2. Suppose we enhance a computer to make all floating-point instructions run 7 times faster. Let’s look at how speedup behaves when we incorporate the faster floating-point hardware. If the execution time of some benchmark before the floating-point enhancement is 15 seconds, what will the speedup be if 35% of the 15 seconds is spent executing floating-point instructions?

3. When making changes to optimize part of a processor, it is often the case that speeding up one type of instruction comes at the cost of slowing down something else. For example, if we put in a complicated fast floating-point unit, that takes space, and something might have to be moved farther away from the middle to accommodate it, adding an extra cycle of delay to reach that unit. The basic Amdahl’s law equation does not take this trade-off into account.
(a) If the new fast floating-point unit speeds up floating-point operations by 2 times, and those operations take 20% of the original program’s execution time, what is the overall speedup (ignoring the penalty to any other instructions)?
(b) Now assume that speeding up the floating-point unit slowed down data cache access, resulting in a 1.5 times slowdown. Data cache accesses consume 10% of the total execution time. What is the overall speedup now?

4.
The main reliability measure is mean time to failure (MTTF), and design decisions affect system reliability.
(a) We have a single processor with a failures in time (FIT) rating of 100. What is the MTTF for this system?
(b) If it takes 2 days to get the system running again, what is the availability of the system?

5. Your company has just bought a new Intel Core i5 dual-core processor, and you have been tasked with optimizing your software for this processor. You will run two applications on this system, but the resource requirements are not equal. The first application needs 80% of the resources, and the other only 20% of the resources.
(a) Given that 40% of the first application is parallelizable, how much speedup would you achieve with that application if run in isolation?
(b) Given that 99% of the second application is parallelizable, how much speedup would this application observe if run in isolation?
(c) Given that 60% of the first application is parallelizable, how much overall system speedup would you observe if you parallelized it, but not the second application?
(d) How much overall system speedup would you achieve if you parallelize both applications?

6. We have a program of 10^3 instructions in the format of “lw, add, lw, add, …” The add instruction only depends on the lw instruction right before it. The lw instruction also only depends on the add instruction right before it. If the program is executed on the pipelined datapath of Figure 1,
(a) What would be the actual CPI?
(b) Without forwarding, what would be the actual CPI?

7. Consider executing the following code on the pipelined datapath of Figure 1.

    add $2, $3, $1
    sub $4, $3, $5
    add $5, $3, $7
    add $7, $6, $1
    add $8, $2, $6

At the end of the fifth cycle of execution, which registers are being read and which register will be written?

Figure 1. Pipelined Datapath

8.
Your colleague at Intel suggests that, since the yield is so poor, you might make chips cheaper if you placed an extra core on the die and only threw out chips on which both processors had failed. We will solve this exercise by viewing the yield as a probability of no defects occurring in a certain area given the defect rate. Calculate probabilities based on each Intel Core i5 core separately (this may not be entirely accurate, since the yield equation is based on empirical evidence rather than a mathematical calculation relating the probabilities of finding errors in different portions of the chip).
(a) What is the probability that a defect will occur on no more than one of the two processor cores?
(b) If the old chip cost $15 per chip, what will the cost of the new chip be, taking into account the new area and yield?

9. The following table presents the power consumption of several computer system components. In this exercise, we will explore how the hard drive affects power consumption for the system.

   Component type   Product                  Performance   Power
   Processor        Sun Niagara 8-core       1.2 GHz       72-79W peak
                    Intel Pentium 4          2 GHz         48.9-66W
   DRAM             Kingston X64C3AD2 1 GB   184 pin       3.7W
                    Kingston D2N3 1 GB       240 pin       2.3W
   Hard drive       DiamondMax 16            5400 rpm      7.0W read/seek, 2.9W idle
                    DiamondMax Plus 9        7200 rpm      7.9W read/seek, 4.0W idle

(a) Assuming the maximum load for each component, and a power supply efficiency of 75%, what wattage must the server’s power supply deliver to a system with a Sun Niagara 8-core, 4 GB 240 pin Kingston DRAM, and one 7200 rpm hard drive?
(b) How much power will the 7200 rpm disk drive consume if it is idle roughly 40% of the time?
(c) Given that the time to read data off a 7200 rpm disk drive will be roughly 75% of a 5400 rpm disk, at what idle time of the 7200 rpm disk will the power consumption be equal, on average, for the two disks?

10. Imagine that your company is trying to decide between a single-processor system and a dual-processor system.
The table below gives the performance on two sets of benchmarks: a memory benchmark and a processor benchmark. You know that your application will spend 20% of its time on memory-centric computations, and 75% of its time on processor-centric computations.

   Chip           # of cores   Clock Frequency (MHz)   Memory Performance   Dhrystone Performance
   Athlon 64 X2   2            3,200                   2,940                17,100
   Pentium 4      1            2,800                   2,730                7,600

(a) Calculate the weighted performance of the benchmarks for the Pentium 4 and Athlon 64 X2.
(b) How much speedup do you anticipate getting if you move from using a Pentium 4 to an Athlon 64 X2 on a memory-intensive application suite?
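
Several of the problems above rest on the same few formulas: the weighted-average CPI, the native MIPS rating, and Amdahl's law. The following is a minimal Python sketch of those formulas for checking hand calculations. It is not part of the assignment: the function names (weighted_cpi, native_mips, amdahl_speedup) are hypothetical, and the sample numbers are made up and deliberately differ from the values in the problems. The Amdahl helper assumes the affected fractions do not overlap and treats a factor below 1 as a slowdown.

```python
def weighted_cpi(classes):
    """Average CPI from (cpi, frequency) pairs.

    Frequencies are fractions of the dynamic instruction count
    and are expected to sum to 1.
    """
    total = sum(freq for _, freq in classes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("instruction frequencies must sum to 1")
    return sum(cpi * freq for cpi, freq in classes)


def native_mips(clock_hz, cpi):
    """MIPS = instruction count / (10^6 * execution time).

    Since execution time = instruction count * CPI / clock rate,
    this simplifies to clock rate / (CPI * 10^6).
    """
    return clock_hz / (cpi * 1e6)


def amdahl_speedup(parts):
    """Overall speedup from (fraction, factor) pairs.

    fraction: share of the ORIGINAL execution time that is affected.
    factor:   > 1 means that share runs faster, < 1 means slower.
    Time not covered by any pair is assumed unchanged (assumption:
    the listed fractions do not overlap).
    """
    affected = sum(fraction for fraction, _ in parts)
    if affected > 1.0 + 1e-9:
        raise ValueError("affected fractions exceed the whole execution time")
    if any(factor <= 0 for _, factor in parts):
        raise ValueError("factors must be positive")
    new_time = (1.0 - affected) + sum(fraction / factor for fraction, factor in parts)
    return 1.0 / new_time


if __name__ == "__main__":
    # Hypothetical instruction mix, not the one from problem 1.
    mix = [(1, 0.5), (2, 0.3), (4, 0.2)]
    cpi = weighted_cpi(mix)
    print("CPI:", cpi)
    print("MIPS at 1 GHz:", native_mips(1e9, cpi))

    # Half of the time sped up 2x, nothing else changed.
    print("speedup:", amdahl_speedup([(0.5, 2.0)]))

    # Same enhancement, but a quarter of the time now runs 2x slower.
    print("speedup with penalty:", amdahl_speedup([(0.5, 2.0), (0.25, 0.5)]))
```

In the last call the gain on one half of the execution time is cancelled by the penalty on another quarter, which is the kind of trade-off problem 3 asks about.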