Uploaded by JK zhao

Solutions To Computer Engineering Textbooks Computer Organization and Design

Solutions To Computer Engineering Textbooks/Computer Organization and Design: The Hardware-Software Interface (5th Edition) (9780124077263)/Chapter 1
1.1
1. Personal computer.
2. Server.
3. Supercomputer.
4. Embedded computer.
1.2
| Idea from other field | Idea from computer architecture |
| --- | --- |
| a | Performance via Pipelining |
| b | Dependability via Redundancy |
| c | Performance via Prediction |
| d | Performance via Parallelism |
| e | Make the Common Case Fast |
| f | Hierarchy of Memories |
| g | Design for Moore’s Law |
| h | Use Abstraction to Simplify Design |
1.3
1. A special kind of program called a compiler reads the high-level source code and translates it into a program in assembly language.
2. Another program called an assembler transforms the program in assembly language into a program in machine language, which is what a computer understands and can execute directly.
Some compilers "cut out the middleman" and produce machine code directly.
1.4
a.
b.
1.5
a
Processor
Instructions per second
1
2
3
Thus, processor 2 has the highest performance in instructions per second.
b
Processor
Number of cycles
Number of instructions
1
2
3
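The two tables above follow from the relations instructions per second = clock rate / CPI and cycles = clock rate × time. A minimal sketch of that arithmetic follows. The clock rates, CPIs and the 10-second run time are recalled from the textbook exercise statement rather than taken from this page, so treat them as assumptions; the function names are illustrative only.

```python
# Clock rates, CPIs and run time are ASSUMED (recalled from the textbook
# exercise statement); this page does not list them.
PROCESSORS = {
    1: {"clock_hz": 3.0e9, "cpi": 1.5},
    2: {"clock_hz": 2.5e9, "cpi": 1.0},
    3: {"clock_hz": 4.0e9, "cpi": 2.2},
}
RUN_TIME_S = 10.0  # assumed run time for part b


def instructions_per_second(clock_hz, cpi):
    """Part a: instruction throughput = clock rate / CPI."""
    return clock_hz / cpi


def cycles_executed(clock_hz, seconds):
    """Part b: cycles = clock rate * elapsed time."""
    return clock_hz * seconds


def instructions_executed(clock_hz, cpi, seconds):
    """Part b: instructions = cycles / CPI."""
    return cycles_executed(clock_hz, seconds) / cpi


ips = {p: instructions_per_second(s["clock_hz"], s["cpi"])
       for p, s in PROCESSORS.items()}
fastest = max(ips, key=ips.get)

for p, s in PROCESSORS.items():
    cycles = cycles_executed(s["clock_hz"], RUN_TIME_S)
    instrs = instructions_executed(s["clock_hz"], s["cpi"], RUN_TIME_S)
    print(f"P{p}: {ips[p]:.3g} instr/s, {cycles:.3g} cycles, {instrs:.3g} instructions")
print(f"Highest instructions per second: processor {fastest}")
```

Under these assumed inputs, processor 2 comes out ahead, which agrees with the conclusion stated in part a.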
c
Let $N$ be the number of instructions executed. A 30% reduction in execution time can then be expressed by the following formula.
Thus, the new clock rate must be about 1.71 times the original. This represents a 71% increase in clock rate.
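One way the relation might be written out is sketched below. It assumes that the 30% time reduction comes with a 20% increase in CPI; that figure is recalled from the exercise statement and does not appear on this page. The symbols $T$, $T'$, $f$ and $f'$ (old and new execution time and clock rate) are introduced here for illustration only.

```latex
\begin{align*}
T  &= \frac{N \cdot \mathrm{CPI}}{f}, \qquad
T' = 0.7\,T = \frac{N \cdot 1.2\,\mathrm{CPI}}{f'} \\
f' &= \frac{1.2}{0.7}\,f \approx 1.71\,f
\end{align*}
```

The instruction count $N$ cancels, so the result does not depend on the program length.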
1.6
In order to find which implementation of the hypothetical Instruction Set Architecture is faster we need to find the execution time of the program under each processor.
The execution time of the program can be calculated as follows:
$$\text{Execution time} = \frac{\text{Clock cycles}}{\text{Clock rate}}$$
Since we know the clock rates of each processor, we need to find out how many clock cycles it takes each processor to execute the program. This number is given by:
$$\text{Clock cycles} = \sum_{i} CPI_i \times N_i$$
In the above formula, $CPI_i$ and $N_i$ are the CPI and instruction count, respectively, for each instruction class $i$ (A, B, C or D). From the problem description, we know how many instructions of class A, class B, class C and class D the program executes.
Thus, for processor P1 we have:
And for processor P2 we have:
Hence, the execution times for each processor are:
Therefore, processor P2 is faster.
a
Remembering that CPI refers to the average number of clock cycles per instruction for a program (or program segment), we can find the CPI for each processor by dividing the total number of clock cycles needed to execute the program by the number of instructions.
b
As calculated before,
, and
.
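The procedure described in this answer (sum the per-class cycles, divide by the clock rate, then divide total cycles by instruction count for the average CPI) can be sketched as follows. The instruction mix, per-class CPIs and clock rates are recalled from the textbook exercise statement and are not given on this page, so they are labeled as assumptions; the helper names are hypothetical.

```python
# ASSUMED inputs (recalled from the textbook exercise, not stated on this page):
# 10^6 instructions split 10% / 20% / 50% / 20% over classes A-D.
INSTRUCTION_COUNTS = {"A": 100_000, "B": 200_000, "C": 500_000, "D": 200_000}

PROCESSORS = {
    "P1": {"clock_hz": 2.5e9, "cpi": {"A": 1, "B": 2, "C": 3, "D": 3}},
    "P2": {"clock_hz": 3.0e9, "cpi": {"A": 2, "B": 2, "C": 2, "D": 2}},
}


def total_cycles(cpi_by_class, counts):
    """Clock cycles = sum over classes of CPI_i * N_i."""
    return sum(cpi_by_class[c] * n for c, n in counts.items())


def execution_time(cpi_by_class, counts, clock_hz):
    """Execution time = clock cycles / clock rate."""
    return total_cycles(cpi_by_class, counts) / clock_hz


def average_cpi(cpi_by_class, counts):
    """Average CPI = total clock cycles / total instruction count."""
    return total_cycles(cpi_by_class, counts) / sum(counts.values())


results = {}
for name, proc in PROCESSORS.items():
    results[name] = {
        "cycles": total_cycles(proc["cpi"], INSTRUCTION_COUNTS),
        "time_s": execution_time(proc["cpi"], INSTRUCTION_COUNTS, proc["clock_hz"]),
        "cpi": average_cpi(proc["cpi"], INSTRUCTION_COUNTS),
    }
    r = results[name]
    print(f"{name}: {r['cycles']} cycles, {r['time_s'] * 1e3:.3f} ms, CPI {r['cpi']:.2f}")

faster = min(results, key=lambda n: results[n]["time_s"])
print(f"Faster implementation: {faster}")
```

With these assumed inputs the sketch selects P2, which agrees with the conclusion stated above.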
1.7
a
To calculate the CPI generated by each compiler, we use the formula
$$CPI = \frac{\text{Execution time}}{\text{Instruction count} \times \text{Clock cycle time}}$$
b
Let's assume processor 1 is running compiler A's code and processor 2 is running compiler B's code. Applying the formula for the execution time of a program we get:
Since we know the execution times are equal, we can equate both sides and rearrange terms to get the following equation:
Thus, the clock of processor 2, which is running compiler B's code, is about 36% faster than the clock of processor 1 (equivalently, processor 1's clock is about 27% slower).
c
Let C be the new compiler. Then the execution time for compiler C's code will be:
The amount by which compiler C's code is faster is given by the ratio of the execution times:
Thus, compiler C's code is about 1.67 times faster than compiler A's code. Likewise, it is about 2.27 times faster than compiler B's code.
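The three parts of this answer can be checked with a short sketch. The instruction counts, execution times, 1 ns cycle time and the new compiler's parameters are recalled from the textbook exercise statement and are not present on this page, so they are assumptions; the function and variable names are illustrative.

```python
# ASSUMED inputs (recalled from the textbook exercise, not stated on this page).
CYCLE_TIME_S = 1e-9
COMPILERS = {
    "A": {"instructions": 1.0e9, "time_s": 1.1},
    "B": {"instructions": 1.2e9, "time_s": 1.5},
}
NEW_COMPILER = {"instructions": 6.0e8, "cpi": 1.1}  # compiler C, assumed


def cpi(time_s, instructions, cycle_time_s):
    """CPI = execution time / (instruction count * clock cycle time)."""
    return time_s / (instructions * cycle_time_s)


# Part a: CPI of the code produced by each compiler.
cpi_a = cpi(COMPILERS["A"]["time_s"], COMPILERS["A"]["instructions"], CYCLE_TIME_S)
cpi_b = cpi(COMPILERS["B"]["time_s"], COMPILERS["B"]["instructions"], CYCLE_TIME_S)

# Part b: with equal execution times, N_A * CPI_A / f_A = N_B * CPI_B / f_B,
# so f_B / f_A = (N_B * CPI_B) / (N_A * CPI_A).
clock_ratio_b_over_a = (COMPILERS["B"]["instructions"] * cpi_b) / (
    COMPILERS["A"]["instructions"] * cpi_a
)

# Part c: execution time of compiler C's code and the resulting speed-ups.
time_c = NEW_COMPILER["instructions"] * NEW_COMPILER["cpi"] * CYCLE_TIME_S
speedup_vs_a = COMPILERS["A"]["time_s"] / time_c
speedup_vs_b = COMPILERS["B"]["time_s"] / time_c

print(f"CPI_A = {cpi_a:.2f}, CPI_B = {cpi_b:.2f}")
print(f"f_B / f_A = {clock_ratio_b_over_a:.2f}")
print(f"Speed-up of C over A: {speedup_vs_a:.2f}, over B: {speedup_vs_b:.2f}")
```

Under these assumptions the speed-ups round to the 1.67 and 2.27 quoted in part c.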
1.8
1.8.1
The text explains that dynamic power is the one that depends on the overall capacitive load of each transistor. However, it only gives proportional formulas. Thus, we will use the following approximation, where $P$ is the dynamic power, $C$ is the capacitive load, $V$ is the voltage and $f$ is the switching frequency.
$$P \approx \frac{1}{2} C V^2 f$$
Rearranging, we have that the average capacitive load for the Pentium 4 Prescott processor is:
Notes on units:
A watt can be expressed as the product of current (in amperes) and voltage (V).
The ampere (A) is a unit of electrical current, given as a coulomb of charge per second.
A farad (F) is a unit of electrical capacitance, expressed as a coulomb of charge per volt.
Similarly, the average capacitive load for the Core i5 Ivy Bridge is:
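As a rough check of the rearrangement, the sketch below solves the approximation $P \approx \frac{1}{2} C V^2 f$ for $C$. The dynamic power, voltage and frequency figures are recalled from the textbook's table and are not listed on this page (only the 1.25 V figure appears later in this answer), so treat them as assumptions.

```python
def capacitive_load(dynamic_power_w, voltage_v, frequency_hz):
    """Solve P = 1/2 * C * V^2 * f for C, in farads."""
    return 2.0 * dynamic_power_w / (voltage_v ** 2 * frequency_hz)


# ASSUMED chip parameters (recalled from the textbook, not stated on this page).
CHIPS = {
    "Pentium 4 Prescott": {"dynamic_power_w": 90.0, "voltage_v": 1.25, "frequency_hz": 3.6e9},
    "Core i5 Ivy Bridge": {"dynamic_power_w": 40.0, "voltage_v": 0.9, "frequency_hz": 3.4e9},
}

loads = {name: capacitive_load(**spec) for name, spec in CHIPS.items()}
for name, farads in loads.items():
    print(f"{name}: C = {farads:.3e} F")
```

The units work out as in the notes above: W / (V² · Hz) = (A · V · s) / V² = C / V = F.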
1.8.2
Processor
% of static power
Ratio of static to dynamic power
Pentium 4 Prescott
Core i5 Ivy Bridge
1.8.3
First, we can consider the total power consumption as the sum of the static and dynamic power components:
$$P_{total} = P_{static} + P_{dynamic}$$
Since static power consumption is caused by leakage current, we can determine the latter through the following formula, where $I_{leak}$ is the leakage current:
$$P_{static} = V \times I_{leak}$$
Hence, for the Pentium 4 Prescott, $I_{leak}$ is equal to:
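We want an overall reduction of 10% in power consumption, which means the reduction must come from both the static and dynamic components. Thus, we need to find the new voltage $V_{new}$ such that the following equation holds, where $P_{total}$ is the original total power and $I_{leak}$ is the leakage current:
$$0.9 \times P_{total} = \frac{1}{2} C V_{new}^2 f + V_{new} I_{leak}$$
This boils down to the following quadratic equation:
$$\frac{1}{2} C f V_{new}^2 + I_{leak} V_{new} - 0.9 \times P_{total} = 0$$
We calculated the value of the capacitive load $C$ for the Pentium 4 Prescott in a previous step. Also, since the leakage current is to remain the same, we have all the necessary information to solve the quadratic: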
Choosing the positive solution of the quadratic equation, we find that $V_{new} \approx 1.18$ V, which represents a reduction of about 5.4% over the original 1.25 volts. Following similar operations for the Core i5 Ivy Bridge, we find a new voltage that represents a reduction of ~6.51%.
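The quadratic can also be solved numerically. The sketch below is one possible formulation: it recovers the coefficient $\frac{1}{2} C f$ from the original dynamic power as $P_{dynamic} / V^2$, so the clock frequency is not needed. The static and dynamic power figures and the Core i5 voltage are recalled from the textbook's table and are not listed on this page, so they are assumptions; `new_voltage` is a hypothetical helper name.

```python
import math


def new_voltage(static_w, dynamic_w, voltage_v, reduction=0.10):
    """Positive root of a*V^2 + b*V + c = 0 for the reduced-power voltage.

    a = 1/2 * C * f  (recovered as dynamic_w / voltage_v**2)
    b = leakage current, held constant (static_w / voltage_v)
    c = -(1 - reduction) * total power
    """
    a = dynamic_w / voltage_v ** 2
    b = static_w / voltage_v
    c = -(1.0 - reduction) * (static_w + dynamic_w)
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


# ASSUMED chip parameters (recalled from the textbook, not stated on this page).
CHIPS = {
    "Pentium 4 Prescott": {"static_w": 10.0, "dynamic_w": 90.0, "voltage_v": 1.25},
    "Core i5 Ivy Bridge": {"static_w": 30.0, "dynamic_w": 40.0, "voltage_v": 0.9},
}

voltage_drop_pct = {}
for name, spec in CHIPS.items():
    v_new = new_voltage(**spec)
    voltage_drop_pct[name] = 100.0 * (spec["voltage_v"] - v_new) / spec["voltage_v"]
    print(f"{name}: V_new = {v_new:.4f} V ({voltage_drop_pct[name]:.2f}% lower)")
```

Under these assumptions the voltage reductions land close to the 5.4% and ~6.5% figures quoted above.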
1.9
1.9.1
We can again use the following formula for the execution time of the program:
$$\text{Execution time} = \frac{\text{Clock cycles}}{\text{Clock rate}}$$
For one processor, the number of clock cycles required to process the program is given by the summation over the different instruction classes, as explained in the answer to exercise 1.6:
$$\text{Clock cycles} = \sum_{i} CPI_i \times N_i$$
For more than one processor ($p > 1$), the number of cycles is given by:
| Number of processors | Execution time | Relative speed-up over 1 processor |
| --- | --- | --- |
| 1 | 9.6 s | 1 |
| 2 | 7.04 s | 1.36 |
| 4 | 3.84 s | 2.5 |
| 8 | 2.24 s | 4.29 |
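The execution times in this part can be reproduced with a short sketch. The instruction counts, CPIs, 2 GHz clock and the rule that arithmetic and load/store instructions are divided by $0.7 \times p$ when $p > 1$ (branches are not parallelized) are all recalled from the textbook exercise statement and are not given on this page, so they are assumptions.

```python
# ASSUMED inputs (recalled from the textbook exercise, not stated on this page).
CLOCK_HZ = 2.0e9
COUNTS = {"arithmetic": 2.56e9, "load_store": 1.28e9, "branch": 0.256e9}
CPI = {"arithmetic": 1, "load_store": 12, "branch": 5}
PARALLEL_EFFICIENCY = 0.7  # assumed: parallel work is divided by 0.7 * p


def clock_cycles(processors):
    """Per-processor cycle count; only non-branch work is parallelized."""
    parallel = (COUNTS["arithmetic"] * CPI["arithmetic"]
                + COUNTS["load_store"] * CPI["load_store"])
    serial = COUNTS["branch"] * CPI["branch"]
    if processors > 1:
        parallel /= PARALLEL_EFFICIENCY * processors
    return parallel + serial


def execution_time(processors):
    """Execution time = clock cycles / clock rate."""
    return clock_cycles(processors) / CLOCK_HZ


baseline = execution_time(1)
for p in (1, 2, 4, 8):
    t = execution_time(p)
    print(f"p = {p}: {t:.2f} s, speed-up {baseline / t:.2f}")
```

Under these assumptions the sketch yields 9.6 s, 7.04 s, 3.84 s and 2.24 s, matching the execution times listed above.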
1.9.2
If the CPI for the arithmetic operations was doubled, the clock cycle counts would increase, giving the following execution times:

| Number of processors | New execution time | Relative slow-down |
| --- | --- | --- |
| 1 | 13.44 s | 1.4 |
| 2 | 9.78 s | 1.39 |
| 4 | 5.21 s | 1.36 |
| 8 | 2.93 s | 1.31 |
1.9.3
Since the clock rates are the same, we can compare the number of clock cycles directly. Thus, we need to find a value of the CPI such that the following equation is satisfied:
Hence, the new value of the CPI should be:
1.10
1.10.1
In order to use the yield equation we first obtain the approximate die areas.
We can now plug these values into the yield equation:
1.10.2
Since we have the yields, we can apply the formula for cost per die immediately:
1.10.3
For the first wafer:
For the second wafer:
1.10.4
Since the die area is 2 square centimeters, we find that the yield is given by:
Solving for the defect rate, we find:
Thus, the previous defect rate was
And the new one is
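The steps of exercise 1.10 can be sketched as below, using the textbook's yield model, $\text{Yield} = 1 / (1 + \text{defects per area} \times \text{die area} / 2)^2$, and cost per die = wafer cost / (dies per wafer × yield). The wafer diameters, die counts, defect rates, wafer costs and the yields 0.92 and 0.95 are recalled from the exercise statement and are not given on this page (only the 2 cm² die area is), so they are assumptions; the die area is approximated as wafer area divided by die count.

```python
import math


def die_area_cm2(wafer_diameter_cm, dies_per_wafer):
    """Approximate die area as wafer area / number of dies."""
    return math.pi * (wafer_diameter_cm / 2.0) ** 2 / dies_per_wafer


def die_yield(defects_per_cm2, area_cm2):
    """Yield = 1 / (1 + defects per area * die area / 2)^2."""
    return 1.0 / (1.0 + defects_per_cm2 * area_cm2 / 2.0) ** 2


def cost_per_die(wafer_cost, dies_per_wafer, yield_fraction):
    """Cost per die = wafer cost / (dies per wafer * yield)."""
    return wafer_cost / (dies_per_wafer * yield_fraction)


def defects_from_yield(yield_fraction, area_cm2):
    """Invert the yield equation for the defect rate (1.10.4)."""
    return 2.0 * (1.0 / math.sqrt(yield_fraction) - 1.0) / area_cm2


# ASSUMED wafer parameters (recalled from the textbook, not stated on this page).
WAFERS = {
    "wafer 1": {"diameter_cm": 15.0, "dies": 84, "defects": 0.020, "cost": 12.0},
    "wafer 2": {"diameter_cm": 20.0, "dies": 100, "defects": 0.031, "cost": 15.0},
}

yields = {}
for name, w in WAFERS.items():
    area = die_area_cm2(w["diameter_cm"], w["dies"])
    yields[name] = die_yield(w["defects"], area)
    cost = cost_per_die(w["cost"], w["dies"], yields[name])
    print(f"{name}: area {area:.2f} cm^2, yield {yields[name]:.4f}, cost/die {cost:.4f}")

# 1.10.4: die area of 2 cm^2 (from the text); yields 0.92 and 0.95 are assumed.
old_defects = defects_from_yield(0.92, 2.0)
new_defects = defects_from_yield(0.95, 2.0)
print(f"defect rate: {old_defects:.4f} -> {new_defects:.4f} per cm^2")
```

Inverting the yield equation, as `defects_from_yield` does, is the "solving for the defect rate" step described in 1.10.4.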
1.11
1.11.1
1.11.2
1.11.3
1.11.4
1.11.5
1.11.6
1.11.7
The change in CPI cannot be explained by the increase in clock rate alone. Since the clock rate increased 33% and the number of instructions decreased 15%, with an unchanged CPI we would have expected a reduction in execution time of approximately $1 - \frac{0.85}{1.33} \approx 36\%$, but the execution time only decreased 6.67%. Therefore, the CPI must have increased as well.
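The reasoning above follows from $T = N \times CPI / f$. A minimal sketch of the arithmetic, using only the percentages quoted in this answer, is given below; the helper names are hypothetical.

```python
def time_ratio(instr_change, clock_change, cpi_change=0.0):
    """new_time / old_time from T = N * CPI / f, given fractional changes."""
    return (1.0 + instr_change) * (1.0 + cpi_change) / (1.0 + clock_change)


def implied_cpi_factor(observed_time_ratio, instr_change, clock_change):
    """CPI multiplier needed to explain the observed execution time ratio."""
    return observed_time_ratio * (1.0 + clock_change) / (1.0 + instr_change)


# Figures quoted in the answer: clock +33%, instructions -15%, time -6.67%.
expected = time_ratio(instr_change=-0.15, clock_change=0.33)
cpi_factor = implied_cpi_factor(1.0 - 0.0667, instr_change=-0.15, clock_change=0.33)

print(f"Expected time reduction with unchanged CPI: {100 * (1 - expected):.1f}%")
print(f"CPI factor implied by the observed 6.67% reduction: {cpi_factor:.2f}")
```

A factor above 1 confirms that the CPI must have gone up for the observed execution time to be consistent with the other two changes.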
1.11.8
1.11.9
1.11.10
We assume the additional reduction in execution time is over the time obtained in
exercise 1.11.9, and thus use those parameters:
1.11.11
1.12
1.13
1.14
1.15
Retrieved from "https://en.wikibooks.org/w/index.php?title=Solutions_To_Computer_Engineering_Textbooks/Computer_Organization_and_Design:_The_HardwareSoftware_Interface_(5th_Edition)_(9780124077263)/Chapter_1&oldid=3305621"
This page was last edited on 30 September 2017, at 10:01.
Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy.