CDA3101 Assignment 1
Due 5/20
Submissions are due by the beginning of class on the specified due date. Handwritten or typed solutions are acceptable. If you do write your solutions by hand, be sure to write clearly. If the grader cannot read your answer, they cannot give you the points.
Late submissions will be accepted with a 10% penalty for each day they are late (up to
48 hours). You must show how you arrived at the answer or no credit will be given.
1. Consider three different processors P1, P2, and P3 executing the same instruction
set with the clock rates and CPIs given in the following table.
Processor   Clock Rate   CPI
P1          2.2 GHz      1.6
P2          1.7 GHz      1.2
P3          2.9 GHz      2.4
a. (5 pts) Which processor has the highest performance?
b. (10 pts) If the processors each execute a program in 12 seconds, find the number
of cycles and the number of instructions for each processor.
c. (10 pts) We are trying to reduce the current time of 12s by 30% but this leads to
an increase of 20% in the CPI. For each processor, what clock rate should we
have to achieve this time reduction?
d. (10 pts) Using the results above, explain why it is inappropriate to compare the
performance of each processor using the Clock Rate as a lone metric. What are
the three key factors that affect performance?
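As a reminder of how the quantities in this problem relate, the standard CPU performance equation is: CPU time = instruction count x CPI / clock rate. The Python sketch below illustrates that relationship using made-up numbers (a hypothetical 2.0 GHz processor with a CPI of 1.5, which is not one of P1, P2, or P3). The function names are illustrative assumptions and are not part of the assignment.

```python
def cpu_time(instructions, cpi, clock_hz):
    """CPU time in seconds = instruction count * CPI / clock rate."""
    return instructions * cpi / clock_hz


def instructions_per_second(cpi, clock_hz):
    """Throughput for a fixed instruction set: clock rate / CPI."""
    return clock_hz / cpi


def cycles_in(seconds, clock_hz):
    """Number of clock cycles that elapse in the given time."""
    return seconds * clock_hz


def required_clock_hz(instructions, cpi, target_seconds):
    """Clock rate needed to finish the instructions within target_seconds."""
    return instructions * cpi / target_seconds


# Hypothetical values, chosen only to illustrate the formulas.
clock = 2.0e9   # 2.0 GHz
cpi = 1.5
cycles = cycles_in(10, clock)      # cycles elapsed in a 10-second run
instructions = cycles / cpi        # instructions completed in that run

print("cycles:", cycles)
print("instructions:", instructions)
print("time check (s):", cpu_time(instructions, cpi, clock))
```

Note that the required clock rate depends on both the CPI and the target time, which is why a change in either one has to be accounted for together.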
2. The following table shows the number of instructions for a program.
Instruction type   Count
Arithmetic         600
Store              40
Load               120
Branch             40
Total              800
a. (5 pts) Assuming that arithmetic instructions take 1 cycle, load and store 5 cycles,
and branch 2 cycles, what is the execution time of the program in a 2.2 GHz
processor?
b. (5 pts) What is the CPI for the program?
c. (10 pts) If the number of load instructions can be reduced by one-half, what is the speed-up and the new CPI?
d. (10 pts) What is Amdahl’s law? Explain how the solution to part c supports this law.
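For reference, the average CPI of a program is the total cycle count divided by the total instruction count, and Amdahl's law is commonly written as speed-up = 1 / ((1 - f) + f / s), where f is the fraction of execution time affected by an improvement and s is the factor by which that fraction is sped up. The Python sketch below shows both calculations on a made-up instruction mix (not the table in this problem). The names `example_mix`, `average_cpi`, and `amdahl_speedup` are illustrative assumptions.

```python
def total_cycles(mix):
    """mix maps an instruction class to (instruction count, cycles per instruction)."""
    return sum(count * cpi for count, cpi in mix.values())


def average_cpi(mix):
    """Average CPI = total cycles / total instructions."""
    total_instructions = sum(count for count, _ in mix.values())
    return total_cycles(mix) / total_instructions


def amdahl_speedup(fraction_improved, improvement_factor):
    """Overall speed-up when only a fraction of execution time is improved."""
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / improvement_factor)


# Hypothetical instruction mix, chosen only to illustrate the formulas.
example_mix = {"alu": (500, 1), "memory": (100, 4), "branch": (50, 2)}

print("total cycles:", total_cycles(example_mix))
print("average CPI:", average_cpi(example_mix))
print("speed-up if 40% of the time runs 2x faster:", amdahl_speedup(0.4, 2))
```

The last call shows the point of the law: doubling the speed of 40% of the execution time yields an overall speed-up well below 2.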
3. The tables below show the instruction type breakdown of two applications, A and B,
executed on 1, 2, 4, or 8 processors. Using this data, you will explore the
speed-up of applications on parallel processors.
Application A
             Instructions per Processor          CPI
Processors   Arithmetic  Load/Store  Branch      Arithmetic  Load/Store  Branch
1            2800        1360        256         1           4           2
2            1400        680         128         1           4           2
4            700         340         64          1           4           2
8            350         170         32          1           4           2
Application B
             Instructions per Processor          CPI
Processors   Arithmetic  Load/Store  Branch      Arithmetic  Load/Store  Branch
1            2560        1280        256         1           4           2
2            1350        800         128         1           6           2
4            800         600         64          1           9           2
8            600         500         32          1           13          2
a. (5 pts) The tables above show the number of instructions required per processor to complete a program on a multiprocessor with 1, 2, 4, and 8 processors. For each of the configurations of applications A and B: What is the total number of instructions executed per processor? What is the aggregate number of instructions executed across all processors?
b. (10 pts) Given the CPI values above, find the total execution times of each application on 1, 2, 4, and 8 processors. Assume each processor has a 2.2 GHz clock frequency.
c. (10 pts) If the CPI of the arithmetic instructions were doubled, what would the impact be on the execution time of each program using 1, 2, 4, and 8 processors?
d. (10 pts) Based on your solutions, is it always advantageous to further parallelize an application? What might account for the trends observed in parallelizing each application? Specifically, why might it be that when we increase the number of processors we observe continual execution time improvements for A, but not B?
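One way to organize the calculations in this problem is to compute the per-processor cycle count for each configuration and then compare execution times against the single-processor case. The Python sketch below does this for a made-up two-configuration example (hypothetical instruction counts, CPIs, and a 1 GHz clock, not the values in the tables above). The names `exec_time`, `speedups`, and `configs` are illustrative assumptions.

```python
def exec_time(counts, cpis, clock_hz):
    """Per-processor execution time: sum(count * CPI) over classes / clock rate."""
    cycles = sum(n * c for n, c in zip(counts, cpis))
    return cycles / clock_hz


def speedups(times_by_procs):
    """Speed-up of each configuration relative to the 1-processor time."""
    base = times_by_procs[1]
    return {procs: base / t for procs, t in times_by_procs.items()}


# Hypothetical data: processors -> ((arithmetic, load/store, branch) counts,
#                                   (arithmetic, load/store, branch) CPIs).
# The load/store CPI rises in the 2-processor case to mimic memory contention.
configs = {
    1: ((1000, 400, 100), (1, 4, 2)),
    2: ((500, 200, 50), (1, 5, 2)),
}
clock = 1.0e9  # hypothetical 1 GHz clock

times = {p: exec_time(counts, cpis, clock) for p, (counts, cpis) in configs.items()}
for procs, ratio in speedups(times).items():
    print(procs, "processor(s): time", times[procs], "s, speed-up", ratio)
```

In this made-up example the per-processor instruction count halves, yet the speed-up on 2 processors is less than 2 because the load/store CPI grew.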