Tianjin University School of Computer Science and Technology (Computer Engineering)
Computer Architecture: Fall 2015
Homework #1

TJU Honor Pledge: "I have neither given nor received unauthorized aid on this test or assignment."

Student's signature or electronic signature: ____________________________
(if homework is submitted electronically, sign by typing your name)

Rules for all homeworks: All homeworks must be handed in electronically via Baidu Cloud. The required format is a single MS-Word or PDF file. The late policy is stated in the course syllabus.

Note: This homework assignment will be easier if you use a spreadsheet program, like Microsoft Excel. You can import Excel graphs into MS Word.

Note: You must show all your work to receive full credit for a given problem.

Amdahl's Law

1. [10 points] Amdahl's Law is about adding a new enhancement to a computer. This enhancement provides a certain speedup s if it is used 100% of the time. Unfortunately, it is only used a fraction f of the time.

   a. Assume that s is infinite. Plot the speedup predicted by Amdahl's Law vs. f for values of f = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 0.95.

   b. Assume that f = 0.8. Plot the speedup predicted by Amdahl's Law vs. s for values of s = 10^n for n = 2, 3, 4, 5, 6 and 7. (Note that the horizontal axis is log10(s), i.e., n.)

   c. For f = 0.9, what is the minimum s required to achieve 95% of the theoretical maximum speedup (i.e., the maximum speedup plotted in part (a) of this problem)?

2. [10 points] What value of s (speedup of the enhancement) most clearly shows that "performance improvement is limited by the part you cannot improve"? Explain your answer, including using math.

Summarizing Performance with Means

3. [10 points] Computer A executes Program P1 in 3 seconds, Program P2 in 16 seconds, and Program P3 in 7 seconds. Computer B executes Program P1 in 4 seconds, Program P2 in 5 seconds, and Program P3 in 14 seconds. Which computer is faster, A or B?

4. [10 points] On Computer A, Program P1 executes at a rate of 2.2 billion instructions per second (BIPS) and Program P2 executes at a rate of 3.1 BIPS. On Computer B, Program P1 executes at a rate of 1.5 BIPS and Program P2 executes at a rate of 8.0 BIPS. Which computer is faster, A or B? (Hint: P1 and P2 have the same instruction count.)

CPU Time Equation

5. [10 points] Consider a program with the following instruction mix for a RISC instruction set: 15% stores, 25% loads, 15% branches, 35% integer arithmetic, 5% integer shift, and 5% integer multiply. Given that loads require two cycles, stores require two cycles, branches require four cycles, integer ALU instructions (including shifts) require one cycle, and integer multiplies require ten cycles, compute the overall CPI.

6. [10 points] Given the parameters of Problem 5, consider a strength-reducing optimization that converts multiplies by a compile-time constant into a sequence of shifts and adds. For this instruction mix, 50% of the multiplies can be converted to shift-add sequences with an average length of three instructions. Assuming a fixed frequency, compute the change in instructions per program, the change in cycles per instruction, and the overall program speedup.

7. [10 points] Suppose you develop a technique that increases instructions-per-cycle (IPC) by a factor of 1.32. The extra circuit complexity causes a factor of 1.10 increase in processor cycle time. What is the overall speedup?

Moore's Law

8. [10 points] Moore's Law states that performance doubles every 18 months for a given cost. If you buy a computer today, then wait 18 months and buy another for the same amount of money, the second computer will be twice as fast. Provide a table showing how much faster a new computer will be than a current computer if it were bought 9 months from now, 18 months from now, etc., up to 9 years from now (i.e., provide a row in your table for every 9 months from now until 9 years from now).

Power Consumption

9. [20 points] Assuming an ideal voltage and frequency scaling relationship (i.e., an x% drop in voltage results in an x% drop in frequency), how much power can you save by using a three-core processor vs. a single-core processor for ideally parallelizable workloads? How much energy can you save by using a three-core processor vs. a single-core processor for ideally parallelizable workloads? Hint: We want to get the same performance by using 3 cores as using a single core.

Reliability

10. [10 points] Calculate FIT and MTTF for a computer system that has a CPU (365-day MTTF), a memory module (180-day MTTF), and the system software (30-day MTTF).
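
As a starting point for the reliability problem, the relationship between MTTF and FIT can be sketched in a few lines of Python. This is only a sketch under stated assumptions that the assignment does not spell out: FIT is taken to mean failures per billion (10^9) hours of operation, component lifetimes are assumed exponentially distributed, and components are assumed to fail independently so that their failure rates add. The function names and the example MTTF values are hypothetical and are deliberately not the values from Problem 10.

```python
from typing import Sequence

HOURS_PER_DAY = 24
FIT_HOURS = 1e9  # assumption: FIT = failures per billion hours of operation


def mttf_days_to_fit(mttf_days: float) -> float:
    """Convert one component's MTTF (in days) to a failure rate in FIT."""
    return FIT_HOURS / (mttf_days * HOURS_PER_DAY)


def system_fit(component_mttf_days: Sequence[float]) -> float:
    """Sum component failure rates.

    Assumes independent components where any single failure
    brings down the whole system.
    """
    return sum(mttf_days_to_fit(m) for m in component_mttf_days)


def fit_to_mttf_days(fit: float) -> float:
    """Convert a failure rate in FIT back to an MTTF in days."""
    return FIT_HOURS / fit / HOURS_PER_DAY


# Hypothetical components (not the assignment's values).
example_mttf_days = [100.0, 400.0]
total_fit = system_fit(example_mttf_days)
print(f"system FIT  = {total_fit:,.0f}")
print(f"system MTTF = {fit_to_mttf_days(total_fit):.1f} days")
```

Because the failure rates add, the system MTTF is always shorter than the MTTF of its least reliable component. Remember to show the corresponding hand calculation in your submission.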