Key to chapter 1 questions

Assignment 1
Chapter 2: 1-6, 9, 13, 18, 26, 31, 32
John Visser
CS 2560 TA
1-13-2002
2.1
Which machine is faster for each program, and by how much?

Program    Time on M1    Time on M2
1          10 seconds    5 seconds
2          3 seconds     4 seconds

For program 1, M2 is twice (10s/5s = 2) as fast as M1. For program 2, M1 is 1.33 times (4s/3s =
1.33) as fast as M2.
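
The two ratios above can be checked with a short script. This is only a minimal sketch: the `times` dictionary and the `speedup` helper are illustrative names, not part of the assignment.

```python
# Execution times in seconds, taken from the table in 2.1.
times = {
    "M1": {"P1": 10.0, "P2": 3.0},
    "M2": {"P1": 5.0, "P2": 4.0},
}

def speedup(slower_time, faster_time):
    """Return how many times as fast the quicker machine is."""
    return slower_time / faster_time

p1_speedup = speedup(times["M1"]["P1"], times["M2"]["P1"])  # M2 relative to M1
p2_speedup = speedup(times["M2"]["P2"], times["M1"]["P2"])  # M1 relative to M2

print(f"Program 1: M2 is {p1_speedup:.2f} times as fast as M1")
print(f"Program 2: M1 is {p2_speedup:.2f} times as fast as M2")
```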
2.2
Find the instruction rate for each machine for program one given the following
information:
Program    Instructions on M1    Instructions on M2
1          200 x 10^6            160 x 10^6

Instructions per second for M1: 200 x 10^6 instr. / 10 seconds = 20 x 10^6 instr./second.
Instructions per second for M2: 160 x 10^6 instr. / 5 seconds = 32 x 10^6 instr./second.
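
As a quick check, the same division can be written out in Python. The variable names here are assumptions chosen for illustration.

```python
# Instruction counts for program 1 (from 2.2) and its run times (from 2.1).
instructions = {"M1": 200e6, "M2": 160e6}
seconds = {"M1": 10.0, "M2": 5.0}

# Instruction rate = instructions executed / elapsed seconds.
rates = {m: instructions[m] / seconds[m] for m in instructions}

for machine, rate in rates.items():
    print(f"{machine}: {rate / 1e6:.0f} x 10^6 instructions per second")
```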
2.3
Find the CPI for program 1 if the clock rate for M1 is 200 MHz and the clock rate for M2
is 300 MHz.
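$$\text{CPI on M1} = 10\ \text{s} \times \frac{200 \times 10^{6}\ \text{cycles}}{1\ \text{s}} \times \frac{1}{200 \times 10^{6}\ \text{instructions}} = 10\ \text{cpi}$$

$$\text{CPI on M2} = 5\ \text{s} \times \frac{300 \times 10^{6}\ \text{cycles}}{1\ \text{s}} \times \frac{1}{160 \times 10^{6}\ \text{instructions}} = 9.375\ \text{cpi}$$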
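
The CPI calculation can be sketched as a small function: total cycles are execution time multiplied by clock rate, and CPI is total cycles divided by instruction count. The function name `cpi` and its parameter names are illustrative assumptions.

```python
def cpi(exec_time_s, clock_hz, instruction_count):
    """Cycles per instruction = (execution time x clock rate) / instruction count."""
    total_cycles = exec_time_s * clock_hz
    return total_cycles / instruction_count

# Program 1: times from 2.1, instruction counts from 2.2, clock rates from 2.3.
cpi_m1 = cpi(10.0, 200e6, 200e6)
cpi_m2 = cpi(5.0, 300e6, 160e6)

print(f"CPI on M1: {cpi_m1}")
print(f"CPI on M2: {cpi_m2}")
```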
2.4
Assume that the CPI for M1 and M2 are the same for program 2 as they were for program
1. Find the instruction count for program 2 on M1 and M2.
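$$\text{M1 instruction count} = \frac{1\ \text{instruction}}{10\ \text{cycles}} \times \frac{200 \times 10^{6}\ \text{cycles}}{1\ \text{s}} \times 3\ \text{s} = 60 \times 10^{6}\ \text{instructions}$$

$$\text{M2 instruction count} = \frac{1\ \text{instruction}}{9.375\ \text{cycles}} \times \frac{300 \times 10^{6}\ \text{cycles}}{1\ \text{s}} \times 4\ \text{s} = 128 \times 10^{6}\ \text{instructions}$$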
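
Rearranging the CPI relation gives the instruction count directly. The sketch below assumes, as the question does, that the CPI values from program 1 carry over to program 2; the function name is illustrative.

```python
def instruction_count(exec_time_s, clock_hz, cpi):
    """Instruction count = (execution time x clock rate) / CPI."""
    return exec_time_s * clock_hz / cpi

# Program 2 times from 2.1, clock rates and CPI values from 2.3.
count_m1 = instruction_count(3.0, 200e6, 10.0)
count_m2 = instruction_count(4.0, 300e6, 9.375)

print(f"M1: {count_m1 / 1e6:.0f} x 10^6 instructions")
print(f"M2: {count_m2 / 1e6:.0f} x 10^6 instructions")
```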
2.5
Suppose that M1 costs $10,000 and M2 costs $15,000. If you are interested in running
program 1 a large number of times, which machine should you buy in large quantities?
Purchasing M2 would be more cost effective. Although M2 costs one and a half times as much as
M1, it runs program 1 twice as fast. For every M2 running program 1, two M1 machines would be
needed to match its throughput. Two M1 machines cost $20,000, which is $5,000 more than a
single M2. M2 should therefore be purchased in large quantities.
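
One way to make this comparison concrete is to compute the dollars paid per unit of throughput. This is a sketch under the assumption that throughput is measured as runs of program 1 per hour on a single machine; the variable names are illustrative.

```python
# Prices from 2.5 and program 1 run times from 2.1.
costs = {"M1": 10_000, "M2": 15_000}
p1_seconds = {"M1": 10.0, "M2": 5.0}

# Runs of program 1 that one machine completes in an hour.
runs_per_hour = {m: 3600 / p1_seconds[m] for m in costs}

# Purchase price per (run per hour) of capacity; lower is better.
dollars_per_run_per_hour = {m: costs[m] / runs_per_hour[m] for m in costs}

for m in costs:
    print(f"{m}: {runs_per_hour[m]:.0f} runs/hour, "
          f"${dollars_per_run_per_hour[m]:.2f} per run/hour of capacity")
```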
2.6
Could you use cost multiplied by execution time as a metric to help in a purchasing
decision? How about the cost divided by the execution time?
Multiplying the cost by the execution time is a valid metric because we are trying to
minimize both quantities, so we can simply choose the machine with the smallest product of
cost and execution time. Dividing cost by execution time, on the other hand, is not a valid
metric. If we divide and choose the highest value, we could select the most expensive
machine; if we divide and choose the lowest value, we could select the machine with the
longest execution time.
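
The argument can be illustrated numerically. In the sketch below, M1 and M2 use the prices and program 1 times from the earlier exercises, while M3 is a purely hypothetical machine (not in the assignment) that costs twice as much as M2 but is no faster.

```python
machines = {
    "M1": {"cost": 10_000, "time": 10.0},
    "M2": {"cost": 15_000, "time": 5.0},
    "M3": {"cost": 30_000, "time": 5.0},  # hypothetical, for illustration only
}

product = {name: m["cost"] * m["time"] for name, m in machines.items()}
quotient = {name: m["cost"] / m["time"] for name, m in machines.items()}

# Smallest cost x time picks the machine that is both cheap and fast.
best_by_product = min(product, key=product.get)

# Either way of reading cost / time rewards something undesirable.
highest_quotient = max(quotient, key=quotient.get)  # favors a high price
lowest_quotient = min(quotient, key=quotient.get)   # favors a long run time

print("smallest cost x time:", best_by_product)
print("highest cost / time: ", highest_quotient)
print("lowest cost / time:  ", lowest_quotient)
```

With these numbers the product selects M2, the highest quotient selects the overpriced M3, and the lowest quotient selects the slowest machine, M1.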
2.9
Assume that program 1 must be executed 200 times per hour and that the rest of the time
can be spent executing program 2. Which machine is faster for this workload? Which
machine is more cost effective? (Performance, in this case, is measured as the throughput
for program 2.)
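$$\text{executions of P2 on M1} = \frac{3600\ \frac{\text{s}}{\text{hour}} - 200 \times 10\ \frac{\text{s}}{\text{exec. of P1}}}{3\ \frac{\text{s}}{\text{exec. of P2}}} = \frac{1600}{3} \approx 533\ \text{per hour}$$

$$\text{executions of P2 on M2} = \frac{3600\ \frac{\text{s}}{\text{hour}} - 200 \times 5\ \frac{\text{s}}{\text{exec. of P1}}}{4\ \frac{\text{s}}{\text{exec. of P2}}} = \frac{2600}{4} = 650\ \text{per hour}$$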
so, M2 is 650/533 = 1.2 times faster than M1. Cost effectiveness is measured as throughput per
dollar, so:
M1 cost effectiveness = 533/10,000 = .053
M2 cost effectiveness = 650/15,000 = .043
M1 is more cost effective than M2 for this particular job.
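
The workload calculation can be sketched as follows: subtract the time consumed by the 200 required runs of program 1 from one hour, then divide what is left by the program 2 run time. The names used are illustrative assumptions.

```python
SECONDS_PER_HOUR = 3600
P1_RUNS_PER_HOUR = 200

p1_time = {"M1": 10, "M2": 5}   # seconds per run of program 1
p2_time = {"M1": 3, "M2": 4}    # seconds per run of program 2
cost = {"M1": 10_000, "M2": 15_000}

def p2_throughput(machine):
    """Runs of program 2 per hour after the required program 1 runs."""
    leftover = SECONDS_PER_HOUR - P1_RUNS_PER_HOUR * p1_time[machine]
    return leftover / p2_time[machine]

throughput = {m: p2_throughput(m) for m in cost}
per_dollar = {m: throughput[m] / cost[m] for m in cost}

for m in cost:
    print(f"{m}: {throughput[m]:.0f} runs of P2 per hour, "
          f"{per_dollar[m]:.3f} runs per hour per dollar")
```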
2.13
The clock rate of M1 is 400 MHz and the clock rate for M2 is 200 MHz, and the amount
that a compiler uses a particular class of instruction is shown below.

Class    CPI on M1    CPI on M2    C1 Usage    C2 Usage    3rd Party Usage
A        4            2            30%         30%         50%
B        6            4            50%         20%         30%
C        8            3            20%         50%         20%

Using compiler usage C1 on both M1 and M2, how much faster does M1 appear to be over M2?
(I assumed that the program was made up of 10 instructions for all calculations.)
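$$\text{M1 cycles} = (3\ \text{class A instr.} \times 4\ \text{CPI}) + (5\ \text{class B instr.} \times 6\ \text{CPI}) + (2\ \text{class C instr.} \times 8\ \text{CPI}) = 58\ \text{cycles}$$

$$\text{M1 exec. time} = 58\ \text{cycles} \times \frac{1\ \text{s}}{400 \times 10^{6}\ \text{cycles}} = 0.145 \times 10^{-6}\ \text{s} = 145\ \text{ns}$$

$$\text{M2 cycles} = (3\ \text{class A instr.} \times 2\ \text{CPI}) + (5\ \text{class B instr.} \times 4\ \text{CPI}) + (2\ \text{class C instr.} \times 3\ \text{CPI}) = 32\ \text{cycles}$$

$$\text{M2 exec. time} = 32\ \text{cycles} \times \frac{1\ \text{s}}{200 \times 10^{6}\ \text{cycles}} = 0.160 \times 10^{-6}\ \text{s} = 160\ \text{ns}$$

$$\frac{\text{M2 exec. time}}{\text{M1 exec. time}} = \frac{160\ \text{ns}}{145\ \text{ns}} = 1.10$$

The makers of M1 can say that it is 10% faster than M2.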
Using C2 on both M1 and M2, how much faster is M2 compared to M1?
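$$\text{M1 cycles} = (3\ \text{class A instr.} \times 4\ \text{CPI}) + (2\ \text{class B instr.} \times 6\ \text{CPI}) + (5\ \text{class C instr.} \times 8\ \text{CPI}) = 64\ \text{cycles}$$

$$\text{M1 exec. time} = 64\ \text{cycles} \times \frac{1\ \text{s}}{400 \times 10^{6}\ \text{cycles}} = 0.160 \times 10^{-6}\ \text{s} = 160\ \text{ns}$$

$$\text{M2 cycles} = (3\ \text{class A instr.} \times 2\ \text{CPI}) + (2\ \text{class B instr.} \times 4\ \text{CPI}) + (5\ \text{class C instr.} \times 3\ \text{CPI}) = 29\ \text{cycles}$$

$$\text{M2 exec. time} = 29\ \text{cycles} \times \frac{1\ \text{s}}{200 \times 10^{6}\ \text{cycles}} = 0.145 \times 10^{-6}\ \text{s} = 145\ \text{ns}$$

$$\frac{\text{M1 exec. time}}{\text{M2 exec. time}} = \frac{160\ \text{ns}}{145\ \text{ns}} = 1.10$$

The makers of M2 can say that it is 10% faster than M1.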
If you purchase M1, which compiler would you use? If you purchase M2, which compiler would
you use? Which machine would you purchase if we assume that all other criteria are identical,
including costs?
First, calculate the performance of M1 versus M2 with the 3rd party compiler:
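$$\text{M1 cycles} = (5\ \text{class A instr.} \times 4\ \text{CPI}) + (3\ \text{class B instr.} \times 6\ \text{CPI}) + (2\ \text{class C instr.} \times 8\ \text{CPI}) = 54\ \text{cycles}$$

$$\text{M1 exec. time} = 54\ \text{cycles} \times \frac{1\ \text{s}}{400 \times 10^{6}\ \text{cycles}} = 0.135 \times 10^{-6}\ \text{s} = 135\ \text{ns}$$

$$\text{M2 cycles} = (5\ \text{class A instr.} \times 2\ \text{CPI}) + (3\ \text{class B instr.} \times 4\ \text{CPI}) + (2\ \text{class C instr.} \times 3\ \text{CPI}) = 28\ \text{cycles}$$

$$\text{M2 exec. time} = 28\ \text{cycles} \times \frac{1\ \text{s}}{200 \times 10^{6}\ \text{cycles}} = 0.140 \times 10^{-6}\ \text{s} = 140\ \text{ns}$$

$$\frac{\text{M2 exec. time}}{\text{M1 exec. time}} = \frac{140\ \text{ns}}{135\ \text{ns}} = 1.037$$

According to the 3rd party compiler, M1 is just slightly (4%) faster than M2.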
Given this data, I would use the 3rd party compiler with BOTH machines one and two because
the programs from this compiler would run faster than the programs produced by any compilers
by the makers of M1 and M2. If M1 and M2 were identical in all other criteria, I would select
M1 because it is slightly faster (1.04 times).
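
All six machine and compiler combinations can be tabulated in one pass. The sketch below keeps the assumption of a 10-instruction program; the dictionary layout and function names are illustrative.

```python
# Clock rates in Hz and per-class CPI, taken from the table in 2.13.
clock_hz = {"M1": 400e6, "M2": 200e6}
cpi = {
    "M1": {"A": 4, "B": 6, "C": 8},
    "M2": {"A": 2, "B": 4, "C": 3},
}
# Instructions per class in the assumed 10-instruction program.
mix = {
    "C1": {"A": 3, "B": 5, "C": 2},
    "C2": {"A": 3, "B": 2, "C": 5},
    "3rd party": {"A": 5, "B": 3, "C": 2},
}

def cycles(machine, compiler):
    """Total cycles: sum over classes of instruction count x CPI."""
    return sum(mix[compiler][c] * cpi[machine][c] for c in "ABC")

def exec_time_ns(machine, compiler):
    """Execution time in nanoseconds = cycles / clock rate."""
    return cycles(machine, compiler) / clock_hz[machine] * 1e9

for compiler in mix:
    t1 = exec_time_ns("M1", compiler)
    t2 = exec_time_ns("M2", compiler)
    print(f"{compiler}: M1 {t1:.0f} ns, M2 {t2:.0f} ns")
```

Reading across the output for each machine shows which compiler gives it the shortest execution time.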
2.18
The instruction classes, CPI, and instruction frequency for two different machines are
shown below. If the first, Mbase, has a clock rate of 500 MHz, and the second, Mopt, has
a clock rate of 600 MHz, what is the CPI for each machine?
Mbase – 500 MHz
Instr. Class    CPI    Frequency
A               2      40%
B               3      25%
C               3      25%
D               5      10%

Mopt – 600 MHz
Instr. Class    CPI    Frequency
A               2      40%
B               2      25%
C               3      25%
D               4      10%

$$\text{Mbase CPI} = (2 \times 0.40) + (3 \times 0.25) + (3 \times 0.25) + (5 \times 0.10) = 2.8\ \text{cpi overall for Mbase}$$

$$\text{Mopt CPI} = (2 \times 0.40) + (2 \times 0.25) + (3 \times 0.25) + (4 \times 0.10) = 2.45\ \text{cpi overall for Mopt}$$
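
The overall CPI is a frequency-weighted average of the per-class CPI values. A minimal sketch, using the per-class values that appear in the two calculations above (the function and variable names are illustrative):

```python
# Instruction frequencies, identical for both machines.
freq = {"A": 0.40, "B": 0.25, "C": 0.25, "D": 0.10}

# Per-class CPI values as used in the Mbase and Mopt calculations.
cpi_base = {"A": 2, "B": 3, "C": 3, "D": 5}
cpi_opt = {"A": 2, "B": 2, "C": 3, "D": 4}

def average_cpi(class_cpi, class_freq):
    """Weighted average: sum of (class CPI x class frequency)."""
    return sum(class_cpi[c] * class_freq[c] for c in class_freq)

mbase_cpi = average_cpi(cpi_base, freq)
mopt_cpi = average_cpi(cpi_opt, freq)

print(f"Mbase CPI: {mbase_cpi:.2f}")
print(f"Mopt CPI:  {mopt_cpi:.2f}")
```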
2.26
Which machine is faster according to total execution time and by how much?
Program    FLOP           Exec. time in seconds
                          Computer A    Computer B    Computer C
1          10,000,000     1             10            20
2          100,000,000    1000          100           20

Total execution time of computer A is (1000+1) = 1001 seconds; computer B, (100+10) = 110
seconds; computer C, (20+20) = 40 seconds. Computer C is fastest. It’s (1001/40) = 25 times
faster than computer A and (110/40) = 2.75 times faster than computer B.
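
The totals and ratios can be reproduced with a few lines of Python. The data layout is an assumption for illustration; each list holds the times for programs 1 and 2.

```python
# Execution times in seconds for programs 1 and 2 on each computer.
exec_times = {"A": [1, 1000], "B": [10, 100], "C": [20, 20]}

totals = {name: sum(times) for name, times in exec_times.items()}
fastest = min(totals, key=totals.get)

print(f"Fastest by total execution time: computer {fastest}")
for name, total in totals.items():
    if name != fastest:
        ratio = total / totals[fastest]
        print(f"Computer {fastest} is {ratio:.2f} times faster than computer {name}")
```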
2.31
Assume that multiply instructions take 12 cycles and account for 10% of the instructions
in a typical program and that the other 90% of the instructions require an average of 4
cycles for each instruction. What percentage of time does the CPU spend doing
multiplication?
Assume 100 instructions; then the number of cycles will be 90 x 4 + 10 x 12 = 480 cycles. Of
these, 120 are spent doing multiplication, and therefore 120/480 = 25% of the time is spent
doing multiplication.
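
The same reasoning in code, keeping the assumption of a 100-instruction program (variable names are illustrative):

```python
# Instruction mix for an assumed 100-instruction program.
multiply_count, multiply_cycles = 10, 12
other_count, other_cycles = 90, 4

cycles_in_multiply = multiply_count * multiply_cycles
total_cycles = cycles_in_multiply + other_count * other_cycles

# Fraction of all cycles (and so of CPU time) spent multiplying.
multiply_fraction = cycles_in_multiply / total_cycles

print(f"{cycles_in_multiply} of {total_cycles} cycles: {multiply_fraction:.0%}")
```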
2.32
Your hardware engineering team has indicated that it would be possible to reduce the
number of cycles required for multiplication to 6 in exercise 2.31, but this will require a
20% increase in the cycle time. Nothing else will be affected. Should they proceed with
the modification?
Without modifications, it takes 480 cycles for 100 instructions. With the modification, it will
only take 90 x 4 + 10 x 6 = 420 cycles. However, the cycle time has increased by 20%, so we
have to conclude that we should NOT make the modification. The unmodified architecture is
(1.2 x 420) / 480 = 1.05 times faster than the “improved” architecture.