Computer Architecture - Assignment 01 - Alex Ajamian

Department of Electrical and Computer Engineering
Computer Architecture – Fall 2022
Instructor: Dr. Maria Abi Saad
Homework 1
Alex H. Ajamian 201804477
Table of Contents:
Table of Figures
Introduction
Textbook Assignments
    Problem 1
    Problem 2
    Problem 3
    Problem 4
    Problem 5
    Problem 6
    Problem 7
Non-Textbook Assignments
    Problem 1
    Problem 2
    Problem 3
    Problem 4
    Problem 5
Works Cited
Table of Figures:
Figure 1 : Given of problem 4
Figure 2 : Given of problem 5
Figure 3: Major Features and differences of CISC and RISC architectures
Figure 4 : Apples' chips' details
Introduction:
This document covers the first assignment of the Computer Architecture course. The
format is as follows:
Large titles: Times New Roman, size 16, blue, bold, underlined.
Small titles: Times New Roman, size 14, blue, bold, underlined.
Regular text: Times New Roman, size 12.
Line spacing is 1.5 throughout.
All calculations are rounded to two decimal places.
Kindly note that no particular referencing style (APA, MLA) was used; the reliable sources
from which information was drawn are listed in the Works Cited page at the end.
Textbook Assignments:
Problem 1:
Computer architecture refers to the attributes of a system that are visible to the
programmer, in other words, those attributes that have a direct impact on the logical execution of
a program. Computer organization, in contrast, refers to the operational units and their
interconnections that realize the architectural specifications.
For example, it is an architectural design issue whether a computer will have a multiply
instruction; it is an organizational issue whether this multiply
instruction is implemented using a dedicated block that multiplies or using adder blocks that add the value
repeatedly.
The decision on how the multiplication process is implemented in the organization
depends on multiple factors, such as how fast the multiplication must be and how
often we are going to have to multiply.
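The distinction above can be illustrated with a minimal sketch. Both functions below expose the same "multiply" operation to the programmer (the architecture), while the way the result is produced differs (the organization). The function names are hypothetical and chosen only for this illustration; Python's `*` operator stands in for a dedicated multiplier block.

```python
def multiply_with_multiplier(a: int, b: int) -> int:
    """Stand-in for a dedicated hardware multiplier: one operation."""
    return a * b


def multiply_with_repeated_addition(a: int, b: int) -> int:
    """Stand-in for an adder reused b times (b must be non-negative)."""
    result = 0
    for _ in range(b):
        result += a  # one pass through the adder per iteration
    return result


if __name__ == "__main__":
    # Same visible result, different number of internal steps.
    print(multiply_with_multiplier(6, 7))
    print(multiply_with_repeated_addition(6, 7))
```

The second version needs a number of additions that grows with the multiplier, which is why the required speed and the expected frequency of multiplications matter for the choice.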
Problem 2:
One notable performance advantage of the IBM 360 Models 65 and 75 is that the overall processing
speed is increased, since two words can be accessed from memory in a single fetch:
the addresses are staggered in even and odd placements inside the main
memory, so both words are available within a single cycle. In addition, this design allows us to add more
data lines instead of increasing the clock speed, and the former is easier to achieve than the latter.
Problem 3:
To begin with, the ENIAC machine, in which each register digit is represented by a ring of ten
vacuum tubes, is inefficient, since only one of those tubes can be ON at any time. To
explain further, the representation is wasteful because only four
vacuum tubes are needed to represent all of the decimal digits in binary; hence the other six tubes are somewhat
redundant.
As for the range of numbers that could be represented using all
ten vacuum tubes: each tube has two states, so with a binary representation we can
represent a total of 2^10 combinations. Considering that we also have to represent the decimal
digit 0, the machine can account for a range of
0 to 2^10 - 1, hence 0 to 1023.
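The two counts used above can be checked with a short sketch. The helper names are hypothetical; the only inputs are the ten decimal digits and the ten tubes per ring mentioned in the answer.

```python
DECIMAL_DIGITS = 10   # values 0..9 held by one ring
TUBES_PER_RING = 10   # tubes ENIAC used for one decimal digit


def tubes_needed_for(values: int) -> int:
    """Minimum number of two-state tubes that can encode `values` distinct values."""
    return (values - 1).bit_length()


def states_with_binary_coding(tubes: int) -> int:
    """Number of distinct values that `tubes` two-state tubes can encode."""
    return 2 ** tubes


if __name__ == "__main__":
    print(tubes_needed_for(DECIMAL_DIGITS))
    largest = states_with_binary_coding(TUBES_PER_RING) - 1
    print(f"0 to {largest}")
```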
Problem 4:
Given: clock rate of both machines 200 MHz
Figure 1 : Given of problem 4
We are asked to calculate the cycles per instruction (CPI), the execution time, and the MIPS rate.
Cycles per instruction calculation:
\[ CPI_A = \frac{(8 \times 1) + (4 \times 3) + (2 \times 4) + (4 \times 3)}{8 + 4 + 2 + 4} = 2.22 \text{ cycles per instruction} \]
\[ CPI_B = \frac{(10 \times 1) + (8 \times 2) + (2 \times 4) + (4 \times 3)}{10 + 8 + 2 + 4} = 1.92 \text{ cycles per instruction} \]
Execution time calculation:
\[ \text{Execution time} = \frac{\text{instructions}}{\text{program}} \times \frac{\text{clock cycles}}{\text{instruction}} \times \frac{\text{seconds}}{\text{clock cycle}} \]
\[ \text{Execution time of A} = (8 + 4 + 2 + 4) \times 10^6 \times 2.22 \times \frac{1}{200 \times 10^6} = 0.2 \text{ seconds} \]
\[ \text{Execution time of B} = (10 + 8 + 2 + 4) \times 10^6 \times 1.92 \times \frac{1}{200 \times 10^6} = 0.23 \text{ seconds} \]
MIPS calculation:
\[ MIPS = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6} \]
\[ MIPS_A = \frac{18 \times 10^6}{0.2 \times 10^6} = 90 \]
\[ MIPS_B = \frac{24 \times 10^6}{0.23 \times 10^6} = 104.35 \]
Conclusion:
Our primary indicator of performance is the execution time of the machines. Since,
at the same clock speed, machine A achieves an execution time that is lower by 0.03
seconds, we can conclude that machine A is more efficient than machine B.
It is important to note that machine B has a higher MIPS rate than machine A;
however, the execution time remains our main factor in determining the better of the
two, especially given that both clock rates are 200 MHz.
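The Problem 4 calculations can be reproduced with a minimal sketch. The instruction counts (in millions) and per-class cycle counts are read off the CPI formulas in this answer, since Figure 1 itself is not reproduced here; the function and constant names are hypothetical.

```python
CLOCK_HZ = 200e6  # both machines run at 200 MHz

# (instruction count in millions, cycles per instruction) for each instruction class
MACHINE_A = [(8, 1), (4, 3), (2, 4), (4, 3)]
MACHINE_B = [(10, 1), (8, 2), (2, 4), (4, 3)]


def cpi(mix):
    """Weighted average cycles per instruction for an instruction mix."""
    total_cycles = sum(count * cycles for count, cycles in mix)
    total_instructions = sum(count for count, _ in mix)
    return total_cycles / total_instructions


def execution_time(mix, clock_hz=CLOCK_HZ):
    """Execution time in seconds: total clock cycles divided by the clock rate."""
    total_cycles = sum(count * cycles for count, cycles in mix) * 1e6
    return total_cycles / clock_hz


def mips(mix, clock_hz=CLOCK_HZ):
    """Millions of instructions per second for the given mix."""
    instructions = sum(count for count, _ in mix) * 1e6
    return instructions / (execution_time(mix, clock_hz) * 1e6)


if __name__ == "__main__":
    for name, mix in (("A", MACHINE_A), ("B", MACHINE_B)):
        print(f"{name}: CPI={cpi(mix):.2f}, "
              f"time={execution_time(mix):.2f} s, MIPS={mips(mix):.2f}")
```

Keeping the mix as a list of pairs makes it easy to see that the machine with the higher MIPS rate is not necessarily the one with the shorter execution time.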
Problem 5:
Given:
Figure 2 : Given of problem 5
A) \[ MIPS = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6} \rightarrow \text{Instruction count} = MIPS \times \text{Execution time} \times 10^6 \]
\[ \text{Instruction count for VAX} = 12x \times 1 \times 10^6 \]
\[ \text{Instruction count for IBM} = x \times 18 \times 10^6 \]
If we have a code running on these two machines, the relative size of the instruction
count can be calculated as follows: relative size = 12x/18x = 2/3 = 0.67
B) \[ MIPS = \frac{\text{clock rate}}{CPI \times 10^6} \rightarrow CPI = \frac{\text{clock rate}}{MIPS \times 10^6} \]
\[ CPI \text{ for VAX} = \frac{5 \times 10^6}{1 \times 10^6} = 5 \]
\[ CPI \text{ for IBM} = \frac{25 \times 10^6}{18 \times 10^6} = 1.39 \]
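A short sketch of both parts of Problem 5 follows. The MIPS ratings, clock rates, and the 12x versus x run times are taken from the formulas in this answer (Figure 2 is not reproduced here); the unknown x cancels in the ratio, so it is set to 1. Function names are hypothetical.

```python
def instruction_count(mips_rating, execution_time_s):
    """Instruction count implied by a MIPS rating and an execution time."""
    return mips_rating * execution_time_s * 1e6


def cpi_from_mips(clock_hz, mips_rating):
    """Cycles per instruction implied by a clock rate and a MIPS rating."""
    return clock_hz / (mips_rating * 1e6)


if __name__ == "__main__":
    x = 1.0  # arbitrary: the ratio below does not depend on x
    vax_count = instruction_count(1, 12 * x)   # VAX: 1 MIPS, runs for 12x seconds
    ibm_count = instruction_count(18, x)       # IBM: 18 MIPS, runs for x seconds
    print(f"relative size = {vax_count / ibm_count:.2f}")
    print(f"CPI VAX = {cpi_from_mips(5e6, 1):.2f}")
    print(f"CPI IBM = {cpi_from_mips(25e6, 18):.2f}")
```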
Problem 6:
Problem 7:
Given: 32-bit microprocessor, 16-bit external data bus, 8MHz input clock
Microprocessor bus cycle = 4 * clock cycle
To find the maximum data transfer rate that the bus can sustain, we need to determine
the number of bytes that can be transferred per bus cycle.
Knowing that the frequency of the input clock is 8 MHz, we can determine the frequency of the
bus cycles as well, which will allow us to figure out how much data the bus is transferring.
The bus cycle frequency can be calculated as 8 MHz/4 = 2 MHz; given that the
external data bus is 16 bits wide, the maximum transfer occurs when we are using the
full external data bus capacity.
Therefore, the maximum transfer rate of the bus = 16 bits/cycle * 2 MHz = 32 Mbits/sec = 4
Mbytes/sec.
To increase performance, we can either widen the data bus, so that more data is transferred
per bus cycle, or raise the clock speed, so that the same data is transferred
at a faster pace. Doubling either one doubles the maximum transfer rate to 8 Mbytes/sec.
Increasing the size of the data bus to 32 bits is the more practical
choice, since we are simply widening it. Raising the clock speed is more difficult
to achieve, although it does have more benefits than increasing the
size of the data bus.
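The transfer-rate calculation, in bytes per second, can be sketched as follows. The function name and parameters are hypothetical; the figures (8 MHz input clock, 4 clock cycles per bus cycle, 16-bit external bus) are the givens of the problem.

```python
def max_transfer_rate_bytes_per_s(clock_hz, clocks_per_bus_cycle, bus_width_bits):
    """Peak transfer rate when every bus cycle moves one full bus width."""
    bus_cycles_per_s = clock_hz / clocks_per_bus_cycle
    return bus_cycles_per_s * bus_width_bits / 8  # 8 bits per byte


if __name__ == "__main__":
    base = max_transfer_rate_bytes_per_s(8e6, 4, 16)
    wider_bus = max_transfer_rate_bytes_per_s(8e6, 4, 32)      # 32-bit data bus
    faster_clock = max_transfer_rate_bytes_per_s(16e6, 4, 16)  # doubled clock
    print(base, wider_bus, faster_clock)
```

Under these assumptions the two improvements give the same peak rate, so the choice between them rests on implementation cost rather than on the rate itself.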
Non-Textbook Assignments:
Problem 1:
1) To assess which machine has the higher performance, we will look at
the run time of each machine and compare the two. The overall run time is
a reliable performance metric that will allow us to either prove or disprove the given
statement.
\[ P1 \text{ run time} = \frac{0.9 \text{ clocks per instruction} \times 5 \times 10^9 \text{ instructions}}{4 \times 10^9 \text{ clocks per second}} = 1.125 \text{ sec} \]
\[ P2 \text{ run time} = \frac{0.75 \text{ cycles per instruction} \times 10^9 \text{ instructions}}{3 \times 10^9 \text{ clocks per second}} = 0.25 \text{ sec} \]
We can clearly see that P2 has a better run time by quite a significant margin:
1.125/0.25 = 4.5 times faster than P1, even though P1 has the higher clock rate; hence
we prove that the statement is indeed a fallacy.
2) To begin with, the starting assumption that the processor or machine that goes
through the most instructions is the better performing one is not a valid
performance measure; however, we will compute the performance indicators that are
requested in the problem.
\[ P1 \text{ run time (1G instructions)} = \frac{0.9 \times 1 \times 10^9}{4 \times 10^9} = 0.225 \text{ seconds} \]
\[ \text{Number of instructions of } P2 \text{ using run time of } P1 = \frac{3 \times 10^9 \times 0.225}{0.75} = 9 \times 10^8 \text{ instructions} \]
As we can see from the outcome, the two processors execute a different number of
instructions despite having the same run time; the difference in clock
speed and CPI shows its effect here.
3) Find the MIPS calculations below:
\[ MIPS \text{ of } P1 = \frac{4 \times 10^9}{0.9 \times 10^6} = 4444.44 \]
\[ MIPS \text{ of } P2 = \frac{3 \times 10^9}{0.75 \times 10^6} = 4000 \]
We can clearly see that P1 has a higher MIPS, even though we established that P2 has a
far better run time than P1.
4) Find the MFLOPS calculations below:
\[ MFLOPS \text{ of } P1 = \frac{0.4 \times (4 \times 10^9)}{0.9 \times 10^6} = 1777.78 \]
\[ MFLOPS \text{ of } P2 = \frac{0.4 \times (3 \times 10^9)}{0.75 \times 10^6} = 1600 \]
We reach the same conclusion as in part 3: MFLOPS is not a good
performance measure, since it leads to the opposite of the run time conclusion.
5) When MIPS is used as a performance measure, the run time of the program is not
considered: MIPS only counts the number of instructions completed in
a second and disregards how many instructions the program needs, so a processor
with a higher MIPS rating can still have the longer run time.
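Parts 1 to 4 can be reproduced with the sketch below. The processor parameters (P1: 4 GHz, CPI 0.9, 5 x 10^9 instructions; P2: 3 GHz, CPI 0.75, 10^9 instructions) and the 40% floating-point fraction are read off the formulas in this answer; the function names are hypothetical.

```python
def run_time(cpi, instructions, clock_hz):
    """Run time in seconds."""
    return cpi * instructions / clock_hz


def instructions_in(time_s, cpi, clock_hz):
    """Instructions a processor completes in `time_s` seconds."""
    return clock_hz * time_s / cpi


def mips(clock_hz, cpi):
    """Millions of instructions per second."""
    return clock_hz / (cpi * 1e6)


def mflops(clock_hz, cpi, fp_fraction):
    """Millions of floating-point operations per second."""
    return fp_fraction * mips(clock_hz, cpi)


if __name__ == "__main__":
    t1 = run_time(0.9, 5e9, 4e9)
    t2 = run_time(0.75, 1e9, 3e9)
    print(f"run times: P1={t1:.3f} s, P2={t2:.3f} s")
    t1_1g = run_time(0.9, 1e9, 4e9)
    print(f"P2 instructions in {t1_1g:.3f} s: {instructions_in(t1_1g, 0.75, 3e9):.2e}")
    print(f"MIPS: P1={mips(4e9, 0.9):.2f}, P2={mips(3e9, 0.75):.2f}")
    print(f"MFLOPS: P1={mflops(4e9, 0.9, 0.4):.2f}, P2={mflops(3e9, 0.75, 0.4):.2f}")
```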
Problem 2:
1) To find the clock writing period in ns, we first calculate the clock period T and then
subtract 50 ns from it; we find T = 1/f = 1/(5 MHz) = 200 ns.
Hence our clock writing time is Tw = 200 ns - 50 ns = 150 ns (our window to write information reliably).
2) We simply need 20 ns more, since that is the duration of the writing signal that
remains after fall-off: Twf = 150 ns + 20 ns = 170 ns
3) We can only fit one 190 ns interval in 200 ns; hence we can only have one wait state if 190 ns
are needed for reliable data representation.
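The timing arithmetic in parts 1 and 2 can be sketched as follows. The 50 ns and 20 ns figures are simply the values used in this answer, and the function names are hypothetical.

```python
def clock_period_ns(frequency_hz):
    """Clock period in nanoseconds for a given clock frequency."""
    return 1e9 / frequency_hz


def write_window_ns(period_ns, unusable_ns):
    """Part of the clock period that remains available for writing."""
    return period_ns - unusable_ns


if __name__ == "__main__":
    period = clock_period_ns(5e6)             # 5 MHz clock
    window = write_window_ns(period, 50)      # 50 ns subtracted, as in part 1
    extended = window + 20                    # 20 ns added, as in part 2
    print(period, window, extended)
```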
Problem 3:
A) To find the percent increase in time, we must find the time it takes for each program
and then find the percent difference, or directly find the percent difference from the
number of bus cycles, since the difference will be the same.
Final = 4+2+3+2+3+2+3+2+3+3+2 = 29 bus cycles
Initial = 4+3+3+3+3+3 = 19 bus cycles
\[ \text{Difference} = \frac{\text{final} - \text{initial}}{\text{initial}} \times 100 = \frac{29 - 19}{19} \times 100 = 52.63\% \text{ increase in time due to waits} \]
B) part 2
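The percent increase in part A can be checked with a short sketch. The two cycle lists are copied from the sums in the answer; the names are hypothetical.

```python
# Bus cycles per step, copied from the "Final" and "Initial" sums in part A.
CYCLES_WITH_WAITS = [4, 2, 3, 2, 3, 2, 3, 2, 3, 3, 2]
CYCLES_WITHOUT_WAITS = [4, 3, 3, 3, 3, 3]


def percent_increase(initial, final):
    """Relative increase from `initial` to `final`, in percent."""
    return (final - initial) / initial * 100


if __name__ == "__main__":
    initial = sum(CYCLES_WITHOUT_WAITS)
    final = sum(CYCLES_WITH_WAITS)
    print(initial, final, f"{percent_increase(initial, final):.2f}%")
```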
Problem 4:
We will now discuss both the CISC and RISC architectures:
- Reduced Instruction Set Computer (RISC) is a type of processor in which the machine executes
one instruction per clock cycle.
- Complex Instruction Set Computer (CISC) is a type of processor in which the machine can have
special instructions as well as instructions that take more than one clock cycle.
- RISC is known as the streamlined version of CISC, since a single CISC instruction might take
several instructions to execute on a RISC machine.
Figure 3: Major Features and differences of CISC and RISC architectures
Some known examples of RISC processors are:
o MIPS, PowerPC, Atmel's AVR, RISC-V, and most modern processors
Some known examples of CISC processors are:
o VAX, the Motorola 68000 family, System/360, and the AMD and Intel x86 CPUs
Problem 5:
We will conduct a small research study on Apple's M1 microprocessor. It is important to note
that some of the concepts covered in this course appear in real-life applications.
One notable application is conducting such research, that is, using the
knowledge taught in this course to reach reasonable conclusions
regarding certain purchases or decisions.
A) Some notable features of Apple's M1 processor that may be
useful to us:
a. Clock speed
b. Amount of cache
c. Levels of cache
d. Power consumption
e. Typical uses
To begin with, the M1 processor has a clock speed of 3.2 GHz, with an 8-core CPU and a 7- or 8-core GPU;
compared to other processors, the M1 is reported to be almost 3.5x faster than a
quad-core 1.2 GHz Intel Core i7 processor.
Second, the Apple M1 processor has multiple cores with different caches at
different levels. The high-performance cores have a large
192 KB L1 instruction cache and a 128 KB L1 data cache, and share a 12 MB
L2 cache, whereas the more energy-efficient cores have a 128 KB L1 instruction cache
and a 64 KB L1 data cache, and share a 4 MB L2 cache. The CPU and GPU can both use
the 24 MB SLC (System Level Cache). Hence, we have a total of 7 caches in the Apple
M1, split between the high-performance cores and the more energy-efficient cores:
o 3 caches for high-performance cores (2 L1s and 1 L2)
o 3 caches for energy-efficient cores (2 L1s and 1 L2)
o 1 system level cache shared between the CPU and GPU
Before getting into the typical uses for the Apple M1, we will take a look at its power
consumption: at peak, the M1 chip consumes around 30-31
W of power.
A small table from a reliable source lists the chips' details and significant
figures:
Figure 4 : Apples' chips' details
B) The research states that the Apple M1 was designed to run in mobile devices. Apple
also offers the M1 Pro, M1 Max, and M1 Ultra; these other versions of the Apple M1 are
used for machines that have higher performance requirements or for larger machines such as
iPads and Macs.
C) Other processor brands in the market include Intel and AMD. Intel is an American
multinational corporation that sells microprocessors to international computer companies
including Dell and Hewlett Packard (HP). AMD also produces and
sells microprocessors and is considered to be Intel's main competitor. The
reason why Apple shifted from Intel to its own ARM-based chips (M1) is largely
due to three things: increasing performance, increasing efficiency, and maximizing
profits. According to Apple, Macs with M1 chips are approximately
3 times faster than the older versions. Moreover, the M1 also ensures longer
battery life. It is indeed a given that every new product released should be better and
more efficient than the preceding one. This same reasoning is also why Apple
shifted from M1 to M2. It is said that the M2 chip is 1.4 times faster than the M1. Also,
the M2 chip offers up to a 10-core GPU, while the M1 offers a maximum of 8 cores. These
improvements are only a few from the wide range of improvements that
have been seen in the shift from M1 to M2.
Works Cited:
(1) https://www.microcontrollertips.com/risc-vs-cisc-architectures-onebetter/#:~:text=RISC%2Dbased%20machines%20execute%20one,execute%20on%20a%20RISC%20machine.
(2) https://www.javatpoint.com/risc-vscisc#:~:text=Examples%20of%20CISC%3A%20VAX%2C%20Motorola,and%20the%20Intel%20x86%20CPUs.
(3) https://www.apple.com/euro/business/mac/pdf/Apple-at-Work-M1-Overview.pdf
(4) https://discussions.apple.com/thread/252061257
(5) https://www.notebookcheck.net/Apple-M1-Pro-Processor-Benchmarks-andSpecs.579915.0.html
(6) https://www.macrumors.com/guide/m1/
(7) https://www.investopedia.com/ask/answers/120114/who-are-intels-intc-maincompetitors.asp
(8) https://www.techtarget.com/searchenterprisedesktop/news/252491875/Apple-debutsMacBooks-Mac-mini-with-in-house-M1-chip
(9) https://www.apple.com/newsroom/2022/06/apple-unveils-m2-with-breakthroughperformance-and-capabilities/