CDA3101 Assignment 1

Due 6/1
Submissions are due by the beginning of class on the specified due date. Handwritten
or typed solutions are acceptable. If you do write your solutions by hand, be sure to
write clearly. If the grader cannot read your answer, they cannot give you the points.
Late submissions will be accepted with a 10% penalty for each day they are late (up to
48 hours). You must show how you arrived at the answer and circle your final
answer where applicable!
1. Consider three different processors P1, P2, and P3 executing the *same instruction
set* with the clock rates and CPIs given in the following table.
Processor    Clock Rate    CPI
P1           2.2 GHz       1.6
P2           1.7 GHz       1.2
P3           2.9 GHz       2.4
a. Which processor has the highest performance (as defined in class)?
b. If the processors each execute a program in 12 seconds, find the number
of instructions and the number of cycles for each processor.
c. We are trying to reduce the current time of 12 s by 30%, but this leads to
an increase of 20% in the CPI. For each processor, what clock rate should we
have to achieve this time reduction?
d. Using the results above, explain why it is inappropriate to compare the
performance of each processor using the Clock Rate as a lone metric. What are
the three key factors that affect performance?
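The relations this question relies on can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, the processor values are hypothetical (deliberately not those in the table), and throughput in instructions per second is assumed as the performance measure, which may differ from the definition given in class.

```python
# Sketch of the usual CPU performance relations:
#   cycles       = clock_rate * time
#   instructions = cycles / CPI
#   clock_rate   = instructions * CPI / time

def instructions_per_second(clock_hz, cpi):
    """Throughput for a given clock rate (Hz) and average CPI."""
    return clock_hz / cpi

def cycles_and_instructions(clock_hz, cpi, seconds):
    """Cycles and instructions completed in `seconds` of execution."""
    cycles = clock_hz * seconds
    return cycles, cycles / cpi

def required_clock_hz(instructions, cpi, seconds):
    """Clock rate needed to finish `instructions` in `seconds` at this CPI."""
    return instructions * cpi / seconds

# Hypothetical processor (NOT from the table): 2.0 GHz, CPI 2.0, 10 s run.
clock_hz, cpi, seconds = 2.0e9, 2.0, 10.0
cycles, instructions = cycles_and_instructions(clock_hz, cpi, seconds)

# Same instruction count, CPI up by 20%, target time down by 30%.
new_clock = required_clock_hz(instructions, cpi * 1.2, seconds * 0.7)
print(cycles, instructions, new_clock / clock_hz)
```

Note that the required clock rate scales by the ratio of the CPI change to the time change, independent of the instruction count.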
2. The following table shows the number of instructions for a program.
Arithmetic   Store   Load   Branch   Total
600          40      120    40       800
a. Assuming that arithmetic instructions take 1 cycle, load and store 5 cycles,
and branch 2 cycles, what is the execution time of the program in a 2.2 GHz
processor?
b. What is the CPI for the program?
c. If the number of load instructions can be reduced by one-half, what is the speedup and the new CPI?
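A weighted-CPI calculation of this kind can be sketched as follows. The instruction mix, cycle counts, and clock rate below are hypothetical values chosen for illustration (not the ones in the table), and the helper names are made up for this sketch.

```python
def total_cycles(counts, cycles_per_class):
    """Sum of (instruction count * cycles per instruction) over all classes."""
    return sum(counts[k] * cycles_per_class[k] for k in counts)

def average_cpi(counts, cycles_per_class):
    """Total cycles divided by total instruction count."""
    return total_cycles(counts, cycles_per_class) / sum(counts.values())

def execution_time(counts, cycles_per_class, clock_hz):
    """Execution time in seconds at the given clock rate."""
    return total_cycles(counts, cycles_per_class) / clock_hz

# Hypothetical mix (NOT the assignment's numbers).
cycles_per_class = {"alu": 1, "mem": 4, "branch": 2}
before = {"alu": 500, "mem": 100, "branch": 50}
after = dict(before, mem=before["mem"] // 2)  # halve the memory instructions

# Same clock rate in both cases, so speedup is the ratio of cycle counts.
speedup = total_cycles(before, cycles_per_class) / total_cycles(after, cycles_per_class)
print(average_cpi(before, cycles_per_class),
      average_cpi(after, cycles_per_class),
      speedup)
```

Removing instructions changes both the cycle total and the instruction total, so the CPI has to be recomputed from scratch rather than scaled.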
3. Translate each of the following C statements below into MIPS assembly. Assume that
the variables f, g, and h are assigned to registers $s0, $s1, and $s2, respectively.
Assume that the base address of the arrays A and B are in registers $s6 and $s7,
respectively.
a. f = g - h + B[4];
b. f = g * A[B[3]];
4. For each of the following actions, indicate the translation phase during which the
action takes place (preprocessing, compiling, assembling, linking, or loading).
a. Translating i = i + 1 to addi $t0, $t0, 1.
b. Including the contents of <stdio.h>.
c. Placing the symbolic name printf in the symbol table and the call to printf in
the relocation table.
d. Allocating space for a.out in main memory.
e. Detecting the syntax error a * b =c;.
f. Creating a.out from main.o and frac.o.
g. Expansion of #define PI 3.14159 in the program text.
h. Detecting the semantic error a = b, where a is an int and b is a char array.
i. Translating addi $t0, $t0, 1 to 00100001000010000000000000000001.
j. Updating the symbol table entry for printf (patching external reference).
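To make the bit pattern in item i less opaque, here is a small Python sketch of how an I-type instruction word is packed. It assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate), the addi opcode 001000, and register number 8 for $t0. The helper name encode_i_type is hypothetical.

```python
def encode_i_type(opcode, rs, rt, immediate):
    """Pack an I-type MIPS instruction into a 32-bit binary string."""
    word = (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)
    return format(word, "032b")

ADDI_OPCODE = 0b001000  # assumed standard MIPS opcode for addi
T0 = 8                  # assumed register number of $t0

# addi $t0, $t0, 1
print(encode_i_type(ADDI_OPCODE, T0, T0, 1))
```

Masking the immediate with 0xFFFF keeps negative constants in 16-bit two's complement form.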
5. In class, we said that the logic equation for the result of an adder can be expressed in
the following way:
$\text{Sum} = (\bar{a} \cdot \bar{b} \cdot \text{CarryIn}) + (a \cdot \bar{b} \cdot \overline{\text{CarryIn}}) + (\bar{a} \cdot b \cdot \overline{\text{CarryIn}}) + (a \cdot b \cdot \text{CarryIn})$
Using only AND, OR, and NOT gates, design the hardware that will implement Sum.
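Before drawing the gates, it can help to confirm what the equation computes. The Python sketch below evaluates the four product terms using only AND, OR, and NOT on single bits and checks them against the arithmetic sum bit for all eight input combinations. The function name sum_bit is made up for this illustration, and the sketch is a sanity check only, not a hardware design.

```python
from itertools import product

def sum_bit(a, b, carry_in):
    """Evaluate the Sum equation with AND (&), OR (|), and NOT (1 - x)."""
    not_a, not_b, not_c = 1 - a, 1 - b, 1 - carry_in
    return ((not_a & not_b & carry_in)
            | (a & not_b & not_c)
            | (not_a & b & not_c)
            | (a & b & carry_in))

# Exhaustive check: Sum is 1 exactly when an odd number of inputs are 1.
for a, b, c in product((0, 1), repeat=3):
    assert sum_bit(a, b, c) == (a + b + c) % 2
print("Sum matches the low bit of a + b + CarryIn for all 8 inputs")
```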
6. Using the truth table below, write a logic equation for D in terms of the input values A,
B, and C. Your logic equation must be in canonical form as a sum-of-products.
A  B  C  D
0  0  0  1
0  0  1  0
0  1  0  0
0  1  1  1
1  0  0  0
1  0  1  1
1  1  0  1
1  1  1  0
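The general procedure for reading a canonical sum-of-products off a truth table (one product term per row whose output is 1, with each 0 input complemented) can be sketched in Python. The example uses a hypothetical two-input table rather than the one above. The helper name canonical_sop and the trailing apostrophe for NOT are conventions assumed for this sketch.

```python
def canonical_sop(names, rows):
    """Build a canonical sum-of-products string from truth-table rows.

    rows is a list of (input_bits, output_bit) pairs.
    """
    terms = []
    for inputs, output in rows:
        if output == 1:
            # Complement every variable whose input bit is 0 in this row.
            literals = [n if bit else n + "'" for n, bit in zip(names, inputs)]
            terms.append("(" + " . ".join(literals) + ")")
    return " + ".join(terms) if terms else "0"

# Hypothetical two-input table (XOR), NOT the table from this question.
xor_rows = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(canonical_sop(["X", "Y"], xor_rows))  # prints (X' . Y) + (X . Y')
```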
7. The tables below show the instruction type breakdown of two applications A and B
executed on 1, 2, 4, or 8 processors. Using this data, you will be exploring the speedup of applications on parallel processors.
Application A
            Instructions per Processor            CPI
Processors  Arithmetic  Load/Store  Branch        Arithmetic  Load/Store  Branch
1           2800        1360        256           1           4           2
2           1400        680         128           1           4           2
4           700         340         64            1           4           2
8           350         170         32            1           4           2

Application B
            Instructions per Processor            CPI
Processors  Arithmetic  Load/Store  Branch        Arithmetic  Load/Store  Branch
1           2560        1280        256           1           4           2
2           1350        800         128           1           6           2
4           800         600         64            1           9           2
8           600         500         32            1           13          2
a. The tables above show the number of instructions required per processor to
complete a program on a multiprocessor with 1, 2, 4, and 8 processors. For each of
the configurations of applications A and B: What is the total number of instructions
executed per processor? What is the total number of instructions executed across
all processors?
b. Given the CPI values above, find the execution time of each application on 1, 2, 4,
and 8 processors. Assume each processor has a 2.2 GHz clock frequency and
keep in mind that the processors run in parallel.
c. If the CPI of the arithmetic instructions were doubled, what would the impact be on
the execution time of each program using 1, 2, 4, and 8 processors?
d. Based on your solutions, is it always advantageous to further parallelize an
application? What might account for the trends observed in parallelizing each
application? Specifically, why might it be that when we increase the number of
processors we observe continual execution time improvements for A, but not B?
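Because the processors run in parallel, the execution time of a configuration is set by the cycles each processor spends on its own share of the instructions. A minimal Python sketch of that calculation follows. The configurations, clock rate, and function name are hypothetical and are not taken from the tables.

```python
def parallel_time(per_processor_counts, cpis, clock_hz):
    """Wall-clock time when every processor runs its share concurrently."""
    cycles = sum(n * c for n, c in zip(per_processor_counts, cpis))
    return cycles / clock_hz

CLOCK_HZ = 2.0e9  # hypothetical clock rate

# Hypothetical configurations (NOT from the tables):
# (processors, (arithmetic, load/store, branch) per processor, matching CPIs)
configs = [
    (1, (1000, 400, 100), (1, 4, 2)),
    (2, (500, 250, 50), (1, 6, 2)),
]

times = {p: parallel_time(counts, cpis, CLOCK_HZ) for p, counts, cpis in configs}
speedups = {p: times[1] / t for p, t in times.items()}
print(times, speedups)
```

Speedup is measured against the single-processor time, so it reflects changes in both the per-processor instruction counts and the CPI values.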