Uploaded by James Pui

L1 - Introduction to Computer Architecture

TMF1214/TMC1214
Computer Architecture
(Semester 2 2017/2018)
Introduction to Computer Architecture
Reference: Chapter 1
William Stallings Computer Organization and Architecture 8th Edition
1
Computer System: User’s View
Image: http://www.coolnerds.com
2
Computer System Components: High Level View
Input
Keyboard
Mouse
Microphone
Computer
System unit
Output
Monitor
Speaker
3
Architecture & Organization
• Architecture is those attributes visible to the
programmer
—Instruction set, number of bits used for data
representation, I/O mechanisms, addressing
techniques.
—e.g. Is there a multiply instruction?
• Organization is how features are implemented
—Control signals, interfaces, memory technology.
—e.g. Is there a hardware multiply unit or is it done by
repeated addition?
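The multiply example above can be sketched in code. This is an illustrative sketch only: the function names are hypothetical, and Python's `*` operator merely stands in for a hardware multiply unit. Both functions offer the same programmer-visible operation (the architecture) while doing the work differently (the organization).

```python
def multiply_hardware(a: int, b: int) -> int:
    """Organization A: a dedicated multiply unit (modeled here by Python's * operator)."""
    return a * b


def multiply_repeated_addition(a: int, b: int) -> int:
    """Organization B: no multiplier; add `a` to an accumulator |b| times."""
    total = 0
    for _ in range(abs(b)):
        total += a
    # Restore the sign if the multiplier was negative.
    return total if b >= 0 else -total


# Same architectural result, different implementation underneath.
print(multiply_hardware(6, 7), multiply_repeated_addition(6, 7))
```

A program that only issues "multiply" cannot tell which version it is running on, apart from how long the operation takes.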
Architecture & Organization (Cont...)
• All Intel x86 family share the same basic
architecture
• The IBM System/370 family share the same
basic architecture
• This gives code compatibility
—At least backwards
• Organization differs between different versions
Structure & Function
• Structure is the way in which components relate
to each other
• Function is the operation of individual
components as part of the structure
6
Function
• All computer functions are:
—Data processing
—Data storage: even when data is processed on the fly (it comes
in, is processed, and the result goes out immediately), the
computer must temporarily store at least those pieces of data
that are being worked on at any given moment
—Data movement: input/output (I/O) with directly connected
peripherals vs data communication over longer distances
—Control: manages all of the functions above; exercised from
outside by the individual(s) who give the computer its
instructions and from within by the control unit (CU)
7
Functional view
Johari Abdullah, FCSIT
8
Operations (1) Data movement
Transferring data from
one peripheral or
communication line to
another
9
Operations (2) Storage
Transferring data from
external environment to
computer storage
(read) and vice versa
(write)
10
Operations (3) Processing from/to storage
Data processing
involving data in
storage
11
Operations (4) Processing from storage to I/O
Data processing involving
en route data between
storage and external
environment
12
Structure - Top Level (The Computer)
[Diagram: the Computer comprises the Central Processing Unit (CPU), Main Memory, Input/Output, and the Systems Interconnection/Bus]
13
Computer Structure
• CPU: controls the operation of the computer and
performs its data processing functions; often
simply referred to as the processor.
• Main Memory: stores data
• I/O: moves data between the computer and its
external environment
• System Interconnection: a mechanism that
provides for communication among CPU, main
memory and I/O. Example: the system bus
(consisting of a number of conducting wires to
which all the other components attach)
14
Structure - The CPU
[Diagram: the CPU (shown within the Computer alongside I/O, Memory and the System Bus) comprises the Arithmetic and Logic Unit (ALU), Registers, the Control Unit, and the Internal CPU Interconnection]
15
The CPU Structure
• Control Unit: controls the operation of the CPU
and hence the computer
• ALU: Performs the computer’s data processing
functions
• Registers: Provide internal storage to the CPU
• CPU interconnection/Internal bus: Some
mechanism that provides for communication
among the control unit, ALU and registers.
16
Structure - The Control Unit
[Diagram: the Control Unit (shown within the CPU alongside the ALU, Registers and Internal Bus) comprises Sequencing Logic, Control Unit Registers and Decoders, and Control Memory]
17
Recall Computer System: User’s View
Image: http://www.coolnerds.com
18
Recall Computer System Components: High Level View
Input
Keyboard
Mouse
Microphone
Computer
System unit
Output
Monitor
Speaker
19
CPU Motherboard
20
Computer Components: Interconnection
I/O
CPU
MEMORY
21
CPU
22
CPU Organization
Registers
ALU
CU
23
Memory
I/O
CPU
MEMORY
address
content
0000000000
01010101010010101
0000000001
01110101010010101
1111111110
01010101011110101
1111111111
11010111010010101
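The address/content table above can be modeled as a simple array. This is a minimal sketch, assuming the 10-bit addresses and 17-bit contents shown on the slide; the names `read` and `write` are hypothetical helpers, not part of any real machine.

```python
ADDRESS_BITS = 10                    # the slide shows 10-bit addresses
WORD_BITS = 17                       # assumption: contents on the slide are 17 bits wide
NUM_LOCATIONS = 2 ** ADDRESS_BITS    # 1024 addressable locations

memory = [0] * NUM_LOCATIONS


def write(address: int, value: int) -> None:
    """Store `value` at `address`, rejecting out-of-range addresses or values."""
    if not 0 <= address < NUM_LOCATIONS:
        raise ValueError("address out of range")
    if not 0 <= value < 2 ** WORD_BITS:
        raise ValueError("value does not fit in one word")
    memory[address] = value


def read(address: int) -> int:
    """Return the content stored at `address`."""
    if not 0 <= address < NUM_LOCATIONS:
        raise ValueError("address out of range")
    return memory[address]


# First and last rows of the slide's table.
write(0b0000000000, 0b01010101010010101)
write(0b1111111111, 0b11010111010010101)
print(format(read(0b1111111111), "017b"))
```

The number of address bits fixes how many locations exist; the word width fixes how much each location holds.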
24
Input/Output
CPU
I/O Module
I/O Devices
25
Computer Systems Hierarchy
A digital computer solves problems by carrying out instructions
Results
Instructions
Computer
A program:
A sequence of instructions describing
how to perform a certain task.
26
Computer Systems Hierarchy
Human
Language
Difficult to implement
Interpretation/Translation
Machine
Language
Computer
27
Computer Systems Hierarchy
Human
Language
Machine-like/Human-like language
Interpretation/Translation
Machine
Language
Computer
28
Computer Systems Hierarchy
Programmers
High-level language - C++, Java, VB
Assembly language
OS - UNIX, Windows NT
Systems
programmers
Instruction sets - Pentium, PowerPC
Micro programs
Hardware
29
TMC1214/TMC1213
Computer Architecture
(Semester 2 2017/2018)
Computer Evolution and Performance
Reference: Chapter 2
William Stallings Computer Organization and Architecture 8th Edition
30
A (Very) Brief History of Computers
The First Generation - Vacuum Tubes (1945 - 1955)
ENIAC (1943 - 1946)
Intended for calculating range tables for aiming artillery
Consisted of more than 18,000 vacuum tubes, occupied 1,500 square feet of floor space,
weighed 30 tons, and consumed 140 kW
Decimal machine
Each digit was represented by a ring of 10 vacuum tubes.
Designed for artillery range tables, but used to perform complex calculations to help
determine the feasibility of the hydrogen bomb, which showed it was a general-purpose computer
Programmed with multi-position switches and jumper cables.
John von Neumann (1945 - 1952) more later …
Originally involved in the ENIAC project as a consultant.
Argued for binary arithmetic in place of ENIAC's decimal arithmetic
Architecture consists of: Memory, ALU, Program control, Input, Output
Stored-program concept - main memory stores both data and instructions
31
A (Very) Brief History of Computers (Cont…)
Vacuum Tubes
ENIAC
32
A (Very) Brief History of Computers (Cont…)
The Second Generation - Transistors (1955 -1965)
Transistors
The transistor was invented at Bell Labs in 1947 (and announced in 1948) by John Bardeen,
Walter Brattain and William Shockley
TX-0 (Transistorized eXperimental computer 0), the first transistorized
computer, built at MIT Lincoln Labs
DEC PDP-1, the first affordable minicomputer ($120,000), with performance
half that of the IBM 7090 (the fastest computer in the world at that time,
which cost millions)
PDP-8, cheap ($16,000), the first to use a single bus
33
A (Very) Brief History of Computers (Cont…)
The Third Generation - Integrated Circuits (1965 -1980)
IBM System/360
Family of machines with same assembly language
Designed for both scientific and commercial computing
First to allow microprogramming
Very popular with universities
34
A (Very) Brief History of Computers (Cont…)
The Fourth Generation – VLSI (1980- ?)
• Very Large Scale Integration (VLSI) is the process of creating
integrated circuits by combining thousands of transistors into a
single chip
• Led to PC revolution
• High performance, low cost
35
Generations of Computer (Technology)
• Vacuum tube - 1946-1957
• Transistor - 1958-1964
• Small scale integration - 1965 on
—Up to 100 devices on a chip
• Medium scale integration - to 1971
—100-3,000 devices on a chip
• Large scale integration - 1971-1977
—3,000 - 100,000 devices on a chip
• Very large scale integration - 1978 -1991
—100,000 - 100,000,000 devices on a chip
• Ultra large scale integration - 1991 on
—Over 100,000,000 devices on a chip
Moore’s Law
Computers double in power roughly
every two years, but cost only half as
much
37
Moore’s Law
• Increased density of components on chip
• Gordon Moore - cofounder of Intel
• Number of transistors on a chip will double every year
• Since 1970’s development has slowed a little
— Number of transistors doubles every 18 months
• Cost of a chip has remained almost unchanged
• Higher packing density means shorter electrical paths,
giving higher performance
• Smaller size gives increased flexibility
• Reduced power and cooling requirements
• Fewer interconnections increases reliability
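The doubling rule above is simple exponential growth. The sketch below is an idealized projection, assuming a perfectly constant doubling period; the function name and the sample starting count are illustrative, and real chips only follow the trend roughly.

```python
def projected_transistors(start_count: float, years: float,
                          doubling_period_years: float) -> float:
    """Idealized Moore's Law projection: the count doubles once per period."""
    return start_count * 2 ** (years / doubling_period_years)


# Compare the two doubling periods mentioned on the slide over six years,
# starting from a hypothetical chip with 1,000 transistors.
every_year = projected_transistors(1_000, 6, 1.0)      # 6 doublings
every_18_months = projected_transistors(1_000, 6, 1.5)  # 4 doublings
print(every_year, every_18_months)
```

Over the same six years, doubling yearly gives four times as many transistors as doubling every 18 months, which is why the small change in period matters over decades.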
38
Growth in CPU Transistor Count
39
The IAS (von Neumann) Machine
1946 ~ 1952
John von Neumann
Princeton Institute for Advanced Studies
Stored Program concept
Main memory storing programs and data
ALU operating on binary data
Control unit interpreting instructions from memory and
executing
Input and output equipment operated by control unit
The Structure of IAS Computer
[Diagram: Main Memory, Arithmetic and Logic Unit, Program Control Unit, Input Output Equipment]
Almost all of today’s computers have the same general structure
as the IAS and are referred to as von Neumann machines.
40
The IAS Machine: Control Unit
The control unit operates the machine by fetching instructions
from memory and executing them ONE at a time.
[Diagram: the Central Processing Unit contains the Arithmetic and Logic Unit (Accumulator, MQ, Arithmetic & Logic Circuits, MBR) and the Program Control Unit (PC, IBR, MAR, IR, Control Circuits); it exchanges instructions & data and addresses with Main Memory and connects to the Input Output Equipment]
41
The IAS Machine: Instruction Cycle
The IAS operates by repetitively performing an instruction cycle.
Two sub-cycles:
• During the fetch cycle, the opcode of the NEXT instruction is loaded into
the IR and the address portion is loaded into the MAR
• Once the opcode is in the IR, the execute cycle is performed. Control
circuitry interprets the opcode and executes the instruction by sending out
appropriate control signals to cause data to be moved or an operation to
be performed by the ALU.
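The two sub-cycles can be sketched as a loop. This is a toy model, not the real IAS: the opcodes `LOAD`, `ADD`, `STORE` and `HALT` and the tuple encoding of instructions are hypothetical, chosen only to show fetch followed by execute. Register names (PC, IR, MAR, AC) follow the slide.

```python
def run(memory: dict) -> int:
    """Repeat fetch and execute until HALT; return the accumulator (AC)."""
    pc = 0   # program counter: address of the next instruction
    ac = 0   # accumulator
    while True:
        # Fetch cycle: opcode goes to IR, address portion goes to MAR.
        ir, mar = memory[pc]
        pc += 1
        # Execute cycle: interpret the opcode and carry it out.
        if ir == "LOAD":
            ac = memory[mar]
        elif ir == "ADD":
            ac += memory[mar]
        elif ir == "STORE":
            memory[mar] = ac
        elif ir == "HALT":
            return ac
        else:
            raise ValueError(f"unknown opcode: {ir}")


# Instructions and data share one memory (the stored-program concept).
program = {
    0: ("LOAD", 10), 1: ("ADD", 11), 2: ("STORE", 12), 3: ("HALT", 0),
    10: 5, 11: 7, 12: 0,
}
print(run(program), program[12])
```

Each pass through the loop handles exactly one instruction, matching the "ONE at a time" operation of the control unit.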
42
IAS - details
• 1000 x 40 bit words
—Binary number
—2 x 20 bit instructions
• Set of registers (storage in CPU)
—Memory Buffer Register (MBR)
—Memory Address Register (MAR)
—Instruction Register (IR)
—Instruction Buffer Register (IBR)
—Program Counter (PC)
—Accumulator (AC)
—Multiplier Quotient (MQ)
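The "2 x 20 bit instructions" per word can be illustrated by unpacking a 40-bit word. The slide does not give the layout inside each instruction; the sketch assumes the usual textbook description of the IAS, an 8-bit opcode followed by a 12-bit address. The function names and sample opcode values are illustrative.

```python
WORD_BITS = 40
INSTRUCTION_BITS = 20
ADDRESS_BITS = 12   # assumption: 8-bit opcode + 12-bit address per instruction


def decode(instruction: int) -> tuple:
    """Split one 20-bit instruction into (opcode, address)."""
    opcode = instruction >> ADDRESS_BITS
    address = instruction & ((1 << ADDRESS_BITS) - 1)
    return opcode, address


def split_word(word: int) -> tuple:
    """Split a 40-bit word into its left and right instructions."""
    if not 0 <= word < 1 << WORD_BITS:
        raise ValueError("word must fit in 40 bits")
    left = word >> INSTRUCTION_BITS
    right = word & ((1 << INSTRUCTION_BITS) - 1)
    return decode(left), decode(right)


# Build a word from two arbitrary (opcode, address) pairs, then unpack it.
word = (0x01 << 32) | (0x0FA << 20) | (0x05 << 12) | 0x0FB
print(split_word(word))
```

Packing two instructions per word is why the IAS needs an Instruction Buffer Register (IBR): it holds the right-hand instruction while the left-hand one executes.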
43
Structure of IAS – detail
44
Evolution of Intel Microprocessor
Source: http://www.intel.com/intel/museum/25anniv/hof/tspecs.htm
1970s Processors

                    4004           8008           8080          8086                8088
Introduced          11/15/71       4/1/72         4/1/74        6/8/78              6/1/79
Clock speeds        108KHz         200KHz         2MHz          5MHz, 8MHz, 10MHz   5MHz, 8MHz
Bus width           4 bits         8 bits         8 bits        16 bits             8 bits
Number of           2,300          3,500          6,000         29,000              29,000
transistors         (10 microns)   (10 microns)   (6 microns)   (3 microns)         (3 microns)
Addressable memory  640 bytes      16 KBytes      64 KBytes     1 MB                1 MB
Virtual memory      --             --             --            --                  --

Brief description
4004: First microcomputer chip, arithmetic manipulation
8008: Data/character manipulation
8080: 10X the performance of the 8008
8086: 10X the performance of the 8080
8088: Identical to 8086 except for its 8-bit external bus
45
Evolution of Intel Microprocessor
Source: http://www.intel.com/intel/museum/25anniv/hof/tspecs.htm
1980s Processors

                    80286             Intel386™ DX      Intel386™ SX      Intel486™ DX CPU
Introduced          2/1/82            10/17/85          6/16/88           4/10/89
Clock speeds        6MHz, 8MHz,       16MHz, 20MHz,     16MHz, 20MHz,     25MHz, 33MHz,
                    10MHz, 12.5MHz    25MHz, 33MHz      25MHz, 33MHz      50MHz
Bus width           16 bits           32 bits           16 bits           32 bits
Number of           134,000           275,000           275,000           1.2 million
transistors         (1.5 microns)     (1 micron)        (1 micron)        (1 micron; .8 micron with 50MHz)
Addressable memory  16 megabytes      4 gigabytes       16 megabytes      4 gigabytes
Virtual memory      1 gigabyte        64 terabytes      64 terabytes      64 terabytes

Brief description
80286: 3-6X the performance of the 8086
Intel386™ DX: First X86 chip to handle 32-bit data sets
Intel386™ SX: 16-bit address bus enabled low-cost 32-bit processing
Intel486™ DX CPU: Level 1 cache on chip
46
Evolution of Intel Microprocessor
Source: http://www.intel.com/intel/museum/25anniv/hof/tspecs.htm
1990s Processors

                    Intel486™ SX      Pentium®          Pentium® Pro      Pentium® II
Introduced          4/22/91           3/22/93           11/01/95          5/07/97
Clock speeds        16MHz, 20MHz,     60MHz, 66MHz      150MHz, 166MHz,   200MHz, 233MHz,
                    25MHz, 33MHz                        180MHz, 200MHz    266MHz, 300MHz
Bus width           32 bits           64 bits           64 bits           64 bits
Number of           1.185 million     3.1 million       5.5 million       7.5 million
transistors         (1 micron)        (.8 micron)       (0.35 micron)     (0.35 micron)
Addressable memory  4 gigabytes       4 gigabytes       64 gigabytes      64 gigabytes
Virtual memory      64 terabytes      64 terabytes      64 terabytes      64 terabytes

Brief description
Intel486™ SX: Identical in design to Intel486™ DX but without math coprocessor
Pentium®: Superscalar architecture brought 5X the performance of the 33-MHz Intel486™ DX processor
Pentium® Pro: Dynamic execution architecture drives high-performing processor
Pentium® II: Dual independent bus, dynamic execution, Intel MMX™ technology
47
Pentium Evolution (1)
• 8080
— first general purpose microprocessor
— 8 bit data path
— Used in first personal computer – Altair
• 8086
— much more powerful
— 16 bit
— instruction cache, prefetches a few instructions
— 8088 (8 bit external bus) used in first IBM PC
• 80286
— 16 Mbyte memory addressable
— up from 1 Mbyte
• 80386
— 32 bit
— Support for multitasking
48
Pentium Evolution (3)
• Pentium II
—MMX technology
—graphics, video & audio processing
• Pentium III
—Additional floating point instructions for 3D graphics
• Pentium 4
—Note Arabic rather than Roman numerals
—Further floating point and multimedia enhancements
• Itanium
—64 bit
—see chapter 15
• See Intel web pages for detailed information on
processors
49
Speeding it up
• Pipelining
• On board cache
• On board L1 & L2 cache
• Branch prediction
• Data flow analysis
• Speculative execution
50
Performance Mismatch
• Processor speed increased
• Memory capacity increased
• Memory speed lags behind processor speed
51
Logic and Memory Performance Gap
Solutions
• Increase number of bits retrieved at one time
—Make DRAM “wider” rather than “deeper”
• Change DRAM interface
—Cache
• Reduce frequency of memory access
—More complex cache and cache on chip
• Increase interconnection bandwidth
—High speed buses
—Hierarchy of buses
I/O Devices
• Peripherals with intensive I/O demands
• Large data throughput demands
• Processors can handle this
• Problem moving data
• Solutions:
—Caching
—Buffering
—Higher-speed interconnection buses
—More elaborate bus structures
—Multiple-processor configurations
Typical I/O Device Data Rates
Key is Balance
• Processor components
• Main memory
• I/O devices
• Interconnection structures
Improvements in Chip Organization and
Architecture
• Increase hardware speed of processor
—Fundamentally due to shrinking logic gate size
– More gates, packed more tightly, increasing clock rate
– Propagation time for signals reduced
• Increase size and speed of caches
—Dedicating part of processor chip
– Cache access times drop significantly
• Change processor organization and architecture
—Increase effective speed of execution
—Parallelism
Problems with Clock Speed and Logic Density
• Power
— Power density increases with density of logic and clock speed
— Dissipating heat
• RC delay
— Speed at which electrons flow limited by resistance and
capacitance of metal wires connecting them
— Delay increases as RC product increases
— Wire interconnects thinner, increasing resistance
— Wires closer together, increasing capacitance
• Memory latency
— Memory speeds lag processor speeds
• Solution:
— More emphasis on organizational and architectural approaches
Intel Microprocessor Performance
Increased Cache Capacity
• Typically two or three levels of cache between
processor and main memory
• Chip density increased
—More cache memory on chip
– Faster cache access
• Pentium chip devoted about 10% of chip area to
cache
• Pentium 4 devotes about 50%
More Complex Execution Logic
• Enable parallel execution of instructions
• Pipeline works like assembly line
—Different stages of execution of different instructions
at same time along pipeline
• Superscalar allows multiple pipelines within
single processor
—Instructions that do not depend on one another can
be executed in parallel
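The assembly-line analogy can be made concrete with a cycle count. This is an idealized sketch, assuming every stage takes one clock cycle and there are no stalls, hazards or branch penalties; the function names are illustrative.

```python
def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions: int, n_stages: int) -> int:
    """The first instruction fills the pipeline; after that one finishes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def speedup(n_instructions: int, n_stages: int) -> float:
    """Ideal speedup of the pipelined design over the unpipelined one."""
    return (cycles_unpipelined(n_instructions, n_stages)
            / cycles_pipelined(n_instructions, n_stages))


# 100 instructions on a 5-stage pipeline: 500 cycles vs 104 cycles.
print(cycles_unpipelined(100, 5), cycles_pipelined(100, 5))
```

As the instruction count grows, the ideal speedup approaches the number of stages; dependencies between instructions keep real pipelines below that bound.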
Diminishing Returns
• Internal organization of processors complex
—Can get a great deal of parallelism
—Further significant increases likely to be relatively
modest
• Benefits from cache are reaching limit
• Increasing clock rate runs into power dissipation
problem
—Some fundamental physical limits are being reached
New Approach – Multiple Cores
• Multiple processors on single chip
— Large shared cache
• Within a processor, increase in performance proportional
to square root of increase in complexity
• If software can use multiple processors, doubling
number of processors almost doubles performance
• So, use two simpler processors on the chip rather than
one more complex processor
• With two processors, larger caches are justified
— Power consumption of memory logic less than processing logic
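The trade-off above can be put into numbers. This is a rough sketch of the square-root rule stated on the slide; the `efficiency` parameter is a hypothetical stand-in for how well software uses the extra core ("almost doubles"), and complexity is measured in arbitrary units.

```python
import math


def single_core_performance(complexity: float) -> float:
    """Performance grows with the square root of complexity (rule from the slide)."""
    return math.sqrt(complexity)


def multi_core_performance(cores: int, complexity_per_core: float,
                           efficiency: float = 1.0) -> float:
    """Cores add up, scaled by how well software uses them (assumed parameter)."""
    return cores * efficiency * math.sqrt(complexity_per_core)


# Spend a doubled complexity budget two ways.
one_complex = single_core_performance(2.0)          # about 1.41x
two_simple = multi_core_performance(2, 1.0, 0.95)   # about 1.9x
print(round(one_complex, 2), round(two_simple, 2))
```

With the same budget, two simple cores come out ahead of one complex core whenever the software can keep both busy.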
Internet Resources
• http://www.intel.com/
—Search for the Intel Museum
• http://www.ibm.com
• http://www.dec.com
• Charles Babbage Institute
• PowerPC
• Intel Developer Home
64