Lecture Notes 3

Lecture 3: Computer Architectures
Basic Computer Architecture
[Figure: the Von Neumann architecture: a memory holding both instructions and data, connected to a processor (control unit, ALU, registers) with input and output units.]
Levels of Parallelism
• Bit level parallelism
  • Within arithmetic logic circuits
• Instruction level parallelism
  • Multiple instructions execute per clock cycle
• Memory system parallelism
  • Overlap of memory operations with computation
• Operating system parallelism
  • More than one processor
  • Multiple jobs run in parallel on SMP
  • Loop level
  • Procedure level
Levels of Parallelism: Bit Level Parallelism
• Within arithmetic logic circuits
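Bit level parallelism is easiest to see in word-wide bitwise operations: a single AND instruction combines all 64 bit pairs of two words at once. A minimal C sketch (not from the original notes; the bitset encoding is just for illustration):

```c
#include <stdint.h>
#include <stdio.h>

/* A 64-element set stored in one machine word: bit i set <=> i is a member.
 * A single AND instruction operates on all 64 bits in parallel. */
int main(void) {
    uint64_t a = 0, b = 0;
    for (int i = 0; i < 64; i += 2) a |= 1ULL << i;   /* even numbers    */
    for (int i = 0; i < 64; i += 3) b |= 1ULL << i;   /* multiples of 3  */

    uint64_t both = a & b;   /* intersection of 64 elements in one op */
    printf("multiples of 6 below 64: ");
    for (int i = 0; i < 64; i++)
        if (both & (1ULL << i)) printf("%d ", i);
    printf("\n");
    return 0;
}
```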
Levels of Parallelism: Instruction Level Parallelism (ILP)
• Multiple instructions execute per clock cycle (see the sketch below)
• Pipelining (instruction - data)
• Multiple issue (VLIW)
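To give a feel for ILP, the sketch below sums an array with four independent accumulators, so a superscalar or pipelined core has additions that do not depend on each other and can be in flight in the same cycle. This is an illustrative example, not from the original notes; whether the hardware actually overlaps the adds depends on the compiler and the microarchitecture.

```c
#include <stdio.h>

/* Four accumulators break the loop-carried dependence chain:
 * the adds in one iteration are mutually independent. */
double sum4(const double *x, int n) {
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    int i;
    for (i = 0; i + 3 < n; i += 4) {
        s0 += x[i];     /* these four additions do not     */
        s1 += x[i + 1]; /* depend on each other, so they   */
        s2 += x[i + 2]; /* can issue in the same cycle     */
        s3 += x[i + 3];
    }
    for (; i < n; i++) s0 += x[i];  /* leftover elements */
    return (s0 + s1) + (s2 + s3);
}

int main(void) {
    double x[1000];
    for (int i = 0; i < 1000; i++) x[i] = 1.0;
    printf("sum = %f\n", sum4(x, 1000));
    return 0;
}
```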
Levels of Parallelism: Memory System Parallelism
• Overlap of memory operations with computation
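One software-visible form of this overlap is prefetching: request data a few iterations ahead so the memory fetch proceeds while the current elements are being processed. A sketch using the GCC/Clang __builtin_prefetch extension; the prefetch distance of 16 is an untuned guess:

```c
#include <stddef.h>
#include <stdio.h>

/* While the current elements are multiplied, the lines needed
 * 16 iterations ahead are already being fetched from memory. */
double dot(const double *a, const double *b, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n) {
            __builtin_prefetch(&a[i + 16]);
            __builtin_prefetch(&b[i + 16]);
        }
        s += a[i] * b[i];   /* computation overlaps the prefetches */
    }
    return s;
}

int main(void) {
    double a[256], b[256];
    for (int i = 0; i < 256; i++) { a[i] = 1.0; b[i] = 2.0; }
    printf("dot = %f\n", dot(a, b, 256));
    return 0;
}
```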
Levels of Parallelism: Operating System Parallelism
• More than one processor
• Multiple jobs run in parallel on SMP
• Loop level (see the OpenMP sketch below)
• Procedure level
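Loop level parallelism is commonly expressed with OpenMP: the iterations of an independent loop are divided among threads. A minimal sketch, assuming a compiler with OpenMP support (compile with -fopenmp):

```c
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    /* Loop level parallelism: each thread handles a chunk of i values. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[N-1] = %f (threads available: %d)\n",
           c[N - 1], omp_get_max_threads());
    return 0;
}
```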
Flynn’s Taxonomy
• Single Instruction stream - Single Data stream (SISD)
• Single Instruction stream - Multiple Data stream (SIMD)
• Multiple Instruction stream - Single Data stream (MISD)
• Multiple Instruction stream - Multiple Data stream (MIMD)
Single Instruction stream - Single Data stream (SISD)
[Figure: the Von Neumann architecture: a single control unit (CU) and ALU, with one instruction stream and one data stream between processor and memory.]
Single Instruction stream - Multiple Data stream (SIMD)
• Instructions of the program are broadcast to more than one processor
• Each processor executes the same instruction synchronously, but using different data
• Used for applications that operate upon arrays of data
[Figure: one control unit (CU) broadcasts the instruction stream to several processing elements (PEs); each PE reads and writes its own data in memory.]
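Modern CPUs expose this model through vector instructions. The sketch below uses x86 SSE intrinsics, where one _mm_add_ps adds four float lanes at once: the same instruction applied to different data. An illustrative example assuming an x86 CPU with SSE:

```c
#include <immintrin.h>   /* x86 SSE intrinsics */
#include <stdio.h>

int main(void) {
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    float c[8];

    for (int i = 0; i < 8; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);          /* load 4 floats       */
        __m128 vb = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&c[i], _mm_add_ps(va, vb)); /* 4 adds, 1 instruction */
    }

    for (int i = 0; i < 8; i++) printf("%.0f ", c[i]);
    printf("\n");
    return 0;
}
```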
Multiple Instruction stream - Multiple Data stream (MIMD)
• Each processor has a separate program
• An instruction stream is generated for each program on each processor
• Each instruction operates upon different data
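A small MIMD illustration with POSIX threads: two threads run different code on different data, asynchronously. The worker functions are made up for the example; compile with -pthread.

```c
#include <pthread.h>
#include <stdio.h>

/* Instruction stream 1: sum an array. */
static void *sum_worker(void *arg) {
    int *v = arg; long s = 0;
    for (int i = 0; i < 100; i++) s += v[i];
    printf("sum thread: %ld\n", s);
    return NULL;
}

/* Instruction stream 2: find the maximum of a different array. */
static void *max_worker(void *arg) {
    int *v = arg; int m = v[0];
    for (int i = 1; i < 100; i++) if (v[i] > m) m = v[i];
    printf("max thread: %d\n", m);
    return NULL;
}

int main(void) {
    int a[100], b[100];
    for (int i = 0; i < 100; i++) { a[i] = i; b[i] = 100 - i; }

    pthread_t t1, t2;
    pthread_create(&t1, NULL, sum_worker, a);  /* different program...   */
    pthread_create(&t2, NULL, max_worker, b);  /* ...and different data  */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```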
Multiple Instruction stream - Multiple Data stream (MIMD)
• Shared memory
• Distributed memory
Shared vs Distributed Memory
• Shared memory
  • Single address space
  • All processes have access to the pool of shared memory
• Distributed memory
  • Each processor has its own local memory
  • Message-passing is used to exchange data between processors
[Figure: shared memory, processors on a common bus to a single memory; distributed memory, processor/memory pairs connected by a network.]
Distributed Memory
• Processors cannot directly access another processor’s memory
• Each node has a network interface (NI) for communication and synchronization
[Figure: four nodes, each a processor (P) with local memory (M) and a network interface (NI), connected by a network.]
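Message passing is the natural programming model here. A minimal sketch using MPI (assumes an MPI installation; build with mpicc, run with mpirun -np 2): rank 0 cannot read rank 1's variables, so the value travels as an explicit message over the network.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                       /* lives in rank 0's memory  */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);      /* a copy arrives by message */
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```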
Distributed Memory
• Each processor executes different instructions asynchronously, using different data
[Figure: each node pairs a control unit (CU) and processing element (PE) with its own memory (M); instruction and data streams stay local to the node, and nodes exchange data over the network.]
Shared Memory
• Each processor executes different instructions asynchronously, using different data
[Figure: several CU/PE pairs fetch their instruction and data streams from one shared memory.]
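In a shared memory system all threads see one address space, so they can update the same variable directly, which is convenient but requires synchronization. A POSIX threads sketch (compile with -pthread):

```c
#include <pthread.h>
#include <stdio.h>

static long counter = 0;                       /* shared by all threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);             /* serialize the update  */
        counter++;                             /* same address space:   */
        pthread_mutex_unlock(&lock);           /* no messages needed    */
    }
    return NULL;
}

int main(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    printf("counter = %ld (expected 400000)\n", counter);
    return 0;
}
```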
Shared Memory
• Uniform memory access (UMA)
  • Each processor has uniform access to memory (symmetric multiprocessor - SMP)
• Non-uniform memory access (NUMA)
  • Time for memory access depends on the location of the data (see the first-touch sketch below)
  • Local access is faster than non-local access
  • Easier to scale than SMPs
[Figure: UMA, processors sharing one bus and one memory; NUMA, several bus-connected processor/memory groups joined by a network.]
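Because access time depends on where data lives, placement matters on NUMA machines. On Linux, a page is typically allocated on the node of the thread that first writes it ("first touch"), so initializing data with the same parallel pattern as the later computation keeps accesses local. A sketch, assuming a Linux NUMA machine with a first-touch policy and OpenMP (-fopenmp):

```c
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N 10000000

int main(void) {
    double *a = malloc(N * sizeof *a);
    double s = 0.0;

    /* First touch: each thread writes its own chunk, so (under a
     * first-touch policy) those pages land on that thread's node. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    /* Same iteration split: each thread now reads mostly local pages. */
    #pragma omp parallel for reduction(+:s)
    for (int i = 0; i < N; i++)
        s += a[i];

    printf("s = %f\n", s);
    free(a);
    return 0;
}
```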
Distributed Shared Memory
• Makes the main memory of a cluster of computers look as if it were a single memory with a single address space
• Shared memory programming techniques can then be used
Multicore Systems
• Many general purpose processors
• GPU (Graphics Processing Unit)
• GPGPU (General Purpose GPU)
• Hybrid
• The trend is:
  • A board composed of multiple manycore chips sharing memory
  • A rack composed of multiple boards
  • A room full of these racks
Distributed Systems
• Clusters
  • Individual computers, tightly coupled by software in a local environment, that work together on single problems or on related problems
• Grid
  • Many individual systems, geographically distributed and tightly coupled by software, that work together on single problems or on related problems