Summer
CDA5155 Homework 1
Due: May 28th, 2010, 11:59pm
You are not allowed to give or receive help in completing this assignment. Submit a PDF
version of your submission to the e-Learning website before the deadline. Please include the
following sentence in bold at the top of your submission: “I have neither given nor received any
unauthorized aid on this assignment”.
Total Points: 70 pts
1. [10 points] Using the table below, answer the following questions:
Chip                  Num of Cores   Memory performance   Processor performance
Athlon 64 X2 4800+    2              3423                 20178
Pentium EE 840        2              3228                 18893
Pentium D 820         2              3000                 15220
Athlon 64 X2 3800+    2              2941                 17129
Pentium 4             1              2731                 7621
Athlon 64 3000+       1              2953                 7628
Pentium 4 570         1              3501                 11210
Processor X           1              7000                 5000
a. Create a table similar to the given table, except express the results as
normalized to the Pentium 4 for both memory performance and processor
performance.
b. Calculate the arithmetic mean of the performance of each processor using
both the original performance and your normalized performance in part a).
c. Given the answer from part b), are there any conflicting conclusions you can
make?
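The normalization and averaging steps in parts (a) and (b) can be sketched as follows. This is a minimal illustration, not a model answer: it reads part (b) as the mean of each chip's two scores, uses only three rows of the table, and the names `scores`, `normalize`, and `arithmetic_mean` are hypothetical.

```python
# Benchmark scores copied from the table: (memory performance, processor performance).
# Only three rows are shown; the remaining chips follow the same pattern.
scores = {
    "Athlon 64 X2 4800+": (3423, 20178),
    "Pentium 4": (2731, 7621),
    "Processor X": (7000, 5000),
}

def normalize(scores, reference):
    """Divide every score by the reference chip's score in the same column."""
    ref_mem, ref_cpu = scores[reference]
    return {chip: (mem / ref_mem, cpu / ref_cpu) for chip, (mem, cpu) in scores.items()}

def arithmetic_mean(pair):
    """Arithmetic mean of one chip's two scores (assumed reading of part b)."""
    return sum(pair) / len(pair)

normalized = normalize(scores, "Pentium 4")
for chip in scores:
    raw_mean = arithmetic_mean(scores[chip])
    norm_mean = arithmetic_mean(normalized[chip])
    print(f"{chip}: raw mean {raw_mean:.1f}, normalized mean {norm_mean:.3f}")
```

Comparing how the chips rank under the two means is the starting point for part (c).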
2. [15 points] Your company’s internal studies show that a single-core system is
sufficient for the demand on your processing power. You are exploring, however,
whether you could save power by using two cores.
a. Assume that your application is 90% parallelizable. By how much could you
decrease the frequency and get the same performance?
b. Assume that the voltage may be decreased linearly with the frequency. Using
the equation in Section 1.5, how much dynamic power would the dual-core
system require as compared to the single-core system?
c. Now assume that the voltage may not decrease below 30% of the original
voltage. This voltage is referred to as the “voltage floor,” and any voltage
lower than that will lose the state. Using the equation in Section 1.5, how
much dynamic power would the dual‐core system require from part (a)
compared to the single‐core system when taking into account the voltage floor?
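The reasoning in this problem can be set up as a short calculation. The sketch below assumes that the Section 1.5 equation is the usual CMOS relation (dynamic power proportional to capacitive load × voltage² × frequency) and that each core contributes an equal capacitive load; both are assumptions, and the function names are hypothetical.

```python
def frequency_scale(parallel_fraction, cores):
    """Fraction of the original clock frequency that keeps performance unchanged.

    Amdahl's law gives the speedup 1 / ((1 - p) + p / n); running the clock
    slower by that same factor cancels the speedup.
    """
    return (1 - parallel_fraction) + parallel_fraction / cores

def relative_dynamic_power(freq_scale, cores, voltage_floor=0.0):
    """Dynamic power relative to one core at full voltage and frequency.

    Assumes power is proportional to C * V^2 * f, that voltage scales linearly
    with frequency down to voltage_floor, and that each core adds an equal
    capacitive load (so C scales with the core count).
    """
    voltage_scale = max(freq_scale, voltage_floor)
    return cores * voltage_scale ** 2 * freq_scale

scale = frequency_scale(0.9, 2)
print("frequency scale:", round(scale, 4))
print("relative power, no floor:", round(relative_dynamic_power(scale, 2), 4))
print("relative power, 30% floor:", round(relative_dynamic_power(scale, 2, voltage_floor=0.3), 4))
```

The `max` call is where the voltage floor enters: it only changes the result when the scaled voltage would otherwise fall below the floor.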
3. [10 points] You are designing a 32-bit instruction-set architecture which needs to
support 100 opcodes, three source operands and two destination operands. All the
source and destination operands are registers. Moreover, all the operands should be
able to access all the registers. What is the maximum size of the register file that this
architecture can use (show your computations)?
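The bit budget behind this question can be sketched as below. It assumes a fixed-width encoding in which the opcode and the five register specifiers are the only fields, and that all specifiers have the same width; the function name `max_registers` is hypothetical.

```python
def max_registers(instruction_bits, opcodes, operands):
    """Largest register count whose specifiers fit beside the opcode field."""
    # Smallest field width such that 2**width >= opcodes.
    opcode_bits = (opcodes - 1).bit_length()
    # Split the remaining bits evenly across the register specifiers.
    bits_per_operand = (instruction_bits - opcode_bits) // operands
    return opcode_bits, bits_per_operand, 2 ** bits_per_operand

# 32-bit instructions, 100 opcodes, 3 source + 2 destination operands.
print(max_registers(32, 100, 3 + 2))  # (7, 5, 32)
```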
4. [15 points] In the load-store architecture of MIPS, the operands of arithmetic and logical
instructions must come from registers. For a typical integer program, the instruction
distribution and CPI of the 4 instruction groups are given in the following table.
Type      Frequency   CPI
ALU       50%         1
Load      25%         2
Store     15%         2
Branch    10%         4
a. Calculate the average CPI of the integer program.
b. Now, assume that a new set of memory-register arithmetic and logical
instructions is added to the ISA. Each memory-register ALU instruction
combines one Load and one original ALU instruction. It takes 4
cycles to execute this new type of instruction. Assume 60% of the load
instructions can be combined for the program; calculate the new CPI of the
integer program.
c. Assume the modification increases the overall cycle time by 5%. Is
this modification really worthwhile?
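The three parts of this problem reduce to weighted sums, which the sketch below sets up using instruction counts per 100 original instructions. It assumes one particular reading of part (b): each combined instruction replaces exactly one Load and one ALU instruction, so the total instruction count shrinks. The names `counts`, `cpi`, `MemALU`, and `average_cpi` are hypothetical.

```python
# Instruction counts per 100 original instructions, and cycles per instruction.
counts = {"ALU": 50, "Load": 25, "Store": 15, "Branch": 10}
cpi = {"ALU": 1, "Load": 2, "Store": 2, "Branch": 4, "MemALU": 4}

def average_cpi(counts, cpi):
    """Return (total cycles, instruction count, cycles per instruction)."""
    cycles = sum(counts[kind] * cpi[kind] for kind in counts)
    instructions = sum(counts.values())
    return cycles, instructions, cycles / instructions

# Assumed reading of part (b): 60% of the loads each fuse with one ALU instruction.
fused = counts["Load"] * 60 // 100
new_counts = dict(counts, ALU=counts["ALU"] - fused, Load=counts["Load"] - fused, MemALU=fused)

old_cycles, old_instructions, old_cpi = average_cpi(counts, cpi)
new_cycles, new_instructions, new_cpi = average_cpi(new_counts, cpi)

# Execution time = cycles * cycle time; the modified design has a 5% longer cycle.
time_ratio = (new_cycles * 1.05) / old_cycles
print(old_cpi, round(new_cpi, 3), round(time_ratio, 3))
```

A `time_ratio` above 1 means the modified design takes longer to run the same program. Note that CPI alone is not enough for part (c), because the instruction count changes as well.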
5. [20 points] Assume that values A, B, C and D reside in memory. Also assume that
instruction operation codes are represented in 8 bits, memory addresses are 64 bits
and register addresses are 8 bits. Assume all data are 32 bits wide, and the instruction
lengths are as given in the table below.
a. Write the code sequence for D = A + B*(A + C) for the following instruction set
architectures: 1) Stack; 2) Accumulator; 3) Register (Register-memory); 4)
Register (Load-Store). (You can refer to the class slides, or Figures B.1-B.2 on page
B-4 of Appendix B.)
b. Compute the total number of instructions and the code size for each sequence you obtain.
c. Compute how many bytes are transferred to or from memory in executing
the code sequences, including instruction fetches, data reads, and data writes.
ISA                         Stack     Accumulator   Register-memory   Load-Store
Instruction Length (bits)   8 or 72   72            32 or 80 or 88    32 or 80
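The bookkeeping for parts (b) and (c) can be sketched for one of the four styles. The load-store sequence below is only one possible encoding of D = A + B*(A + C); the mnemonics, register names, and the `summarize` helper are illustrative assumptions, and the same tally applies to the other three ISAs once their sequences are written.

```python
# Field widths in bits and data size in bytes, taken from the problem statement.
OPCODE, ADDRESS, REGISTER, DATA_BYTES = 8, 64, 8, 4

MEM = OPCODE + REGISTER + ADDRESS   # 80-bit load or store
ALU = OPCODE + 3 * REGISTER         # 32-bit three-register operation

# Each entry: (assembly text, instruction length in bits, data bytes moved to/from memory).
sequence = [
    ("Load  R1, A", MEM, DATA_BYTES),
    ("Load  R2, B", MEM, DATA_BYTES),
    ("Load  R3, C", MEM, DATA_BYTES),
    ("Add   R4, R1, R3", ALU, 0),
    ("Mul   R5, R2, R4", ALU, 0),
    ("Add   R6, R1, R5", ALU, 0),
    ("Store D, R6", MEM, DATA_BYTES),
]

def summarize(sequence):
    """Return (instruction count, code size in bytes, total memory traffic in bytes)."""
    code_bytes = sum(bits for _, bits, _ in sequence) // 8
    data_bytes = sum(data for _, _, data in sequence)
    # Memory traffic = every instruction fetched once + all data reads and writes.
    return len(sequence), code_bytes, code_bytes + data_bytes

print(summarize(sequence))  # (7, 52, 68)
```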