Chapter One
Introduction to Pipelined Processors
Handler’s Classification
• Based on the level of processing, the pipelined
processors can be classified as:
1. Arithmetic Pipelining
2. Instruction Pipelining
3. Processor Pipelining
Arithmetic Pipelining
• The arithmetic logic units of a computer can
be segmented for pipelined operations in
various data formats.
• Example : Star 100
– It has two pipelines where arithmetic operations
are performed
– First: Floating Point Adder and Multiplier
– Second : Multifunctional
• All scalar instructions
• Floating point adder, multiplier and divider.
– Both pipelines are 64 bits wide and can be split into four
32-bit pipelines at the cost of precision
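The precision cost of halving the operand width can be illustrated with a minimal sketch. It assumes IEEE 754 64-bit and 32-bit floating-point formats, which are not the Star 100's own number formats; the helper name `to_single` is hypothetical.

```python
import struct

def to_single(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back.

    Assumption: IEEE 754 formats are used here only to illustrate the
    effect of a narrower operand; they are not the Star 100 formats.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

value = 0.1                      # held as a 64-bit float
narrow = to_single(value)        # same value squeezed through 32 bits
error = abs(value - narrow)

print(f"64-bit operand: {value!r}")
print(f"32-bit operand: {narrow!r}")
print(f"absolute error: {error:.3e}")
```

Values that are exactly representable in 32 bits (for example 0.5) pass through unchanged; the loss only shows up for operands that need the extra bits.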
Star 100 Architecture
Instruction Pipelining
• The execution of a stream of instructions can
be pipelined by overlapping the execution of
the current instruction with the fetch, decode
and operand fetch of the subsequent
instructions.
• It is also called instruction look-ahead
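The benefit of the overlap can be sketched with a simple cycle count. This is an idealized model, assuming four phases (fetch, decode, operand fetch, execute) that each take one clock cycle and never stall; the function names are hypothetical.

```python
STAGES = ["fetch", "decode", "operand fetch", "execute"]

def sequential_cycles(n_instructions: int, n_stages: int = len(STAGES)) -> int:
    """Cycles needed when each instruction finishes before the next starts."""
    return n_instructions * n_stages

def pipelined_cycles(n_instructions: int, n_stages: int = len(STAGES)) -> int:
    """Cycles needed when a new instruction enters the pipeline every cycle.

    The first instruction needs n_stages cycles; each later instruction
    completes one cycle after its predecessor (ideal case, no stalls).
    """
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1

for n in (1, 5, 100):
    print(n, sequential_cycles(n), pipelined_cycles(n))
```

Under these assumptions the two counts agree for a single instruction and diverge as the instruction stream gets longer.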
Example : 8086
• The organization of 8086 into a separate BIU
and EU allows the fetch and execute cycle to
overlap. This is called pipelining.
Processor Pipelining
• This refers to the processing of the same data
stream by a cascade of processors, each of
which performs a specific task
• The data stream passes the first processor
with results stored in a memory block which
is also accessible by the second processor
• The second processor then passes the refined
results to the third and so on.
Li and Ramamoorthy's Classification
• According to pipeline configurations and
control strategies, Li and Ramamoorthy classify
pipelines under three schemes
– Unifunction v/s Multi-function Pipelines
– Static v/s Dynamic Pipelines
– Scalar v/s Vector Pipelines
Unifunction v/s Multi-function Pipelines
Unifunctional Pipelines
• A pipeline unit with fixed and dedicated
function is called unifunctional.
• Example: Cray-1 (Supercomputer - 1976)
• It has 12 unifunctional pipelines described in
four groups:
– Address Functional Units:
• Address Add Unit
• Address Multiply Unit
– Scalar Functional Units
• Scalar Add Unit
• Scalar Shift Unit
• Scalar Logical Unit
• Population/Leading Zero Count Unit
– Vector Functional Units
• Vector Add Unit
• Vector Shift Unit
• Vector Logical Unit
– Floating Point Functional Units
• Floating Point Add Unit
• Floating Point Multiply Unit
• Reciprocal Approximation Unit
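The four groups listed above can be written down as a small data structure to confirm that they add up to 12 unifunctional pipelines. The dictionary name and key names are illustrative choices, not Cray terminology.

```python
# Functional units of the Cray-1 as listed above, grouped as in the slides.
CRAY1_FUNCTIONAL_UNITS = {
    "address": ["Address Add", "Address Multiply"],
    "scalar": [
        "Scalar Add",
        "Scalar Shift",
        "Scalar Logical",
        "Population/Leading Zero Count",
    ],
    "vector": ["Vector Add", "Vector Shift", "Vector Logical"],
    "floating point": [
        "Floating Point Add",
        "Floating Point Multiply",
        "Reciprocal Approximation",
    ],
}

# Each unit is a dedicated (unifunctional) pipeline, so counting the
# entries gives the number of pipelines.
total_units = sum(len(units) for units in CRAY1_FUNCTIONAL_UNITS.values())

for group, units in CRAY1_FUNCTIONAL_UNITS.items():
    print(f"{group}: {len(units)} units")
print("total:", total_units)
```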
Cray-1: Architecture
Multifunctional
• A multifunction pipe may perform different
functions, either at different times or at the same
time, by interconnecting different subsets of
stages in the pipeline.
• Example: 4X-TI-ASC (Supercomputer - 1973)
4X-TI ASC
• It has four multifunction pipeline processors,
each of which is reconfigurable for a variety of
arithmetic or logic operations at different
times.
• Its four-pipeline central processor comprises
nine units:
– one instruction processing unit
– four memory buffer units and
– four arithmetic units.
• Thus it provides four parallel execution
pipelines below the IPU.
• Any mixture of scalar and vector instructions
can be executed simultaneously in four pipes.
Architecture Overview of 4X-TI ASC
Static Vs Dynamic Pipeline
Static Pipeline
• It may assume only one functional
configuration at a time
• It can be either unifunctional or
multifunctional
• Static pipelines are preferred when
instructions of the same type are to be executed
continuously
• A unifunction pipe must be static.
Dynamic pipeline
• It permits several functional configurations to
exist simultaneously
• A dynamic pipeline must be multi-functional
• The dynamic configuration requires more
elaborate control and sequencing mechanisms
than static pipelining
Scalar Vs Vector Pipeline
Scalar Pipeline
• It processes a sequence of scalar operands
under the control of a DO loop
• Instructions in a small DO loop are often
prefetched into the instruction buffer.
• The required scalar operands are moved into
a data cache to continuously supply the
pipeline with operands
• Example: IBM System/360 Model 91
IBM System/360 Model 91
• In this computer, buffering plays a major role.
• Instruction fetch buffering:
– provides the capacity to hold program loops of
meaningful size.
– Upon encountering a loop which fits, the buffer locks
onto the loop and subsequent branching requires less
time.
• Operand fetch buffering:
– provides a queue into which storage can dump
operands and from which execution units can fetch operands.
– This improves operand fetching for storage-to-register and storage-to-storage instruction types.
Architecture overview of IBM System/360 Model 91
Vector Pipelines
• They are specially designed to handle vector
instructions over vector operands.
• Computers having vector instructions are called
vector processors.
• The design of a vector pipeline is expanded from
that of a scalar pipeline.
• The handling of vector operands in vector pipelines is
under firmware and hardware control.
• Example : Cray 1
Linear pipeline (Static & Unifunctional)
• In a linear pipeline, data flows from one stage
to the next, every stage is used exactly once in a
computation, and the pipeline performs one
functional evaluation.
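The "each stage used once" property can be made visible with a small space-time sketch. It assumes an ideal linear pipeline in which every stage takes one time unit and a new task enters at every time unit; `linear_space_time` is a hypothetical helper name.

```python
from typing import List, Optional

def linear_space_time(n_stages: int, n_tasks: int) -> List[List[Optional[int]]]:
    """Return rows[stage][time] = task index (or None if the stage is idle).

    Assumption: task i enters stage 0 at time i and advances one stage
    per time unit, with no feedback or feed-forward connections.
    """
    total_time = n_stages + n_tasks - 1 if n_tasks else 0
    rows: List[List[Optional[int]]] = [[None] * total_time for _ in range(n_stages)]
    for task in range(n_tasks):
        for stage in range(n_stages):
            rows[stage][task + stage] = task
    return rows

# Three stages, four tasks: each task appears once in every row.
for stage, row in enumerate(linear_space_time(3, 4), start=1):
    print(f"S{stage}: " + " ".join("--" if t is None else f"T{t}" for t in row))
```

Each task traces a single diagonal through the diagram, which is what distinguishes a linear pipeline from one with feedback.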
Non-linear pipeline
• In a floating-point adder, stages (2) and (4)
each need a shift register.
• If both use the same shift register,
there will be only 3 stages.
• This requires a feedback connection from the third
stage to the second stage.
• Furthermore, the same pipeline can be used to
perform fixed-point addition.
• A pipeline with feed-forward and/or feedback
connections is called non-linear.
Example: 3-stage non-linear pipeline
Figure: 3-stage non-linear pipeline (Input, stages Sa, Sb and Sc, Output A and Output B)
• It has 3 stages Sa, Sb and Sc, and latches.
• Multiplexers (crossed circles) can take more than
one input and pass one of the inputs to the
output.
• The outputs of the stages are tapped and used for
feedback and feed-forward.
3 stage non-linear pipeline
• The above pipeline can perform a variety of
functions.
• Each functional evaluation can be represented
by a particular sequence of usage of stages.
• Some examples are:
1. Sa, Sb, Sc
2. Sa, Sb, Sc, Sb, Sc, Sa
3. Sa, Sc, Sb, Sa, Sb, Sc
Reservation Table
• Each functional evaluation can be represented
using a diagram called a Reservation Table (RT).
• It is the space-time diagram of a pipeline
corresponding to one functional evaluation.
• X axis – time units
• Y axis – stages
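A reservation table can be derived mechanically from a stage-usage sequence. The following is a minimal sketch, assuming exactly one stage is used per time unit (as in the example sequences); the function names `reservation_table` and `render` are hypothetical.

```python
from typing import Dict, List, Sequence

def reservation_table(sequence: Sequence[str],
                      stages: Sequence[str] = ("Sa", "Sb", "Sc")) -> Dict[str, List[bool]]:
    """Build a reservation table from a stage-usage sequence.

    Assumption: the i-th entry of `sequence` is the single stage
    reserved at time unit i.
    """
    table = {stage: [False] * len(sequence) for stage in stages}
    for time, stage in enumerate(sequence):
        table[stage][time] = True
    return table

def render(table: Dict[str, List[bool]], mark: str = "X") -> str:
    """Format the table with time on the X axis and stages on the Y axis."""
    width = len(next(iter(table.values())))
    header = "     " + " ".join(f"{t:>2}" for t in range(width))
    rows = [
        f"{stage:<5}" + " ".join(f"{(mark if used else '-'):>2}" for used in row)
        for stage, row in table.items()
    ]
    return "\n".join([header] + rows)

table = reservation_table(["Sa", "Sb", "Sc", "Sb", "Sc", "Sa"])
print(render(table, mark="A"))
```

Storing one boolean row per stage mirrors the diagram directly: rows are stages, columns are time units.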
Reservation Table
• For the sequence Sa, Sb, Sc, Sb, Sc, Sa, called
function A, we have
Stage \ Time   0   1   2   3   4   5
Sa             A   -   -   -   -   A
Sb             -   A   -   A   -   -
Sc             -   -   A   -   A   -
Reservation Table
• For the sequence Sa, Sc, Sb, Sa, Sb, Sc,
called function B, we have
Stage \ Time   0   1   2   3   4   5
Sa             B   -   -   B   -   -
Sb             -   -   B   -   B   -
Sc             -   B   -   -   -   B
Function A
3 stage pipeline : Sa, Sb, Sc, Sb, Sc, Sa
• The reservation table is filled in one time unit
at a time: Sa at time 0, Sb at time 1, Sc at time 2,
Sb at time 3, Sc at time 4 and Sa at time 5.
Function B
3 stage pipeline: Sa, Sc, Sb, Sa, Sb, Sc
• The reservation table is filled in one time unit
at a time: Sa at time 0, Sc at time 1, Sb at time 2,
Sa at time 3, Sb at time 4 and Sc at time 5.
Reservation Table
• After starting a function, the stages need to be
reserved in the corresponding time units.
• Each function supported by a multifunction
pipeline is represented by its own RT.
• The time taken for one function evaluation, in units
of the clock period, is the compute time (for A and B,
it is 6).
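The compute time can be read off a reservation table as its number of columns. A minimal sketch, assuming each table is stored as a set of (stage, time) marks; the names `FUNCTION_A`, `FUNCTION_B` and `compute_time` are hypothetical.

```python
from typing import Set, Tuple

Marks = Set[Tuple[str, int]]

# Reservation tables for functions A and B, written as (stage, time) marks.
FUNCTION_A: Marks = {("Sa", 0), ("Sb", 1), ("Sc", 2), ("Sb", 3), ("Sc", 4), ("Sa", 5)}
FUNCTION_B: Marks = {("Sa", 0), ("Sc", 1), ("Sb", 2), ("Sa", 3), ("Sb", 4), ("Sc", 5)}

def compute_time(marks: Marks) -> int:
    """Number of clock periods spanned by one evaluation.

    Time units start at 0, so the span is the last used time unit plus one.
    """
    return max(time for _, time in marks) + 1

print("compute time of A:", compute_time(FUNCTION_A))
print("compute time of B:", compute_time(FUNCTION_B))
```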
Reservation Table
• Multiple marks in the same row => the stage is used
more than once
• Multiple marks in the same column => more than one
stage is used at the same time
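Both readings can be checked by counting marks per row and per column. The sketch below assumes a reservation table stored as a set of (stage, time) marks; the second table, `PARALLEL_EXAMPLE`, is a made-up illustration and does not come from the pipeline above.

```python
from collections import Counter
from typing import Set, Tuple

Marks = Set[Tuple[str, int]]

def reused_stages(marks: Marks) -> Set[str]:
    """Stages with more than one mark in their row."""
    counts = Counter(stage for stage, _ in marks)
    return {stage for stage, n in counts.items() if n > 1}

def parallel_times(marks: Marks) -> Set[int]:
    """Time units with more than one mark in their column."""
    counts = Counter(time for _, time in marks)
    return {time for time, n in counts.items() if n > 1}

# Function A from the slides: every stage is used twice, one stage per time unit.
FUNCTION_A: Marks = {("Sa", 0), ("Sb", 1), ("Sc", 2), ("Sb", 3), ("Sc", 4), ("Sa", 5)}

# Hypothetical table in which Sb and Sc are both busy at time 1.
PARALLEL_EXAMPLE: Marks = {("Sa", 0), ("Sb", 1), ("Sc", 1), ("Sa", 2)}

print(sorted(reused_stages(FUNCTION_A)), sorted(parallel_times(FUNCTION_A)))
print(sorted(reused_stages(PARALLEL_EXAMPLE)), sorted(parallel_times(PARALLEL_EXAMPLE)))
```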
Multifunction pipelines
• The hardware of a multifunction pipeline should be
reconfigurable.
• A multifunction pipeline can be static or
dynamic
• Static:
– Initially configured for one functional evaluation.
– For another function, the pipeline needs to be drained
and reconfigured.
– Two inputs for different functions cannot be in
the pipeline at the same time
• Dynamic:
– Can perform different functional evaluations at the same time.
– It is difficult to control, as we need to be sure that
there is no conflict in the usage of stages.
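The conflict check that a dynamic pipeline needs can be sketched by overlaying two reservation tables. This is an illustrative model, not a description of any real controller: it assumes tables stored as sets of (stage, time) marks and a second evaluation started a given number of clock periods after the first; `collides` is a hypothetical helper.

```python
from typing import Set, Tuple

Marks = Set[Tuple[str, int]]

# Reservation tables of functions A and B from the 3-stage example.
FUNCTION_A: Marks = {("Sa", 0), ("Sb", 1), ("Sc", 2), ("Sb", 3), ("Sc", 4), ("Sa", 5)}
FUNCTION_B: Marks = {("Sa", 0), ("Sc", 1), ("Sb", 2), ("Sa", 3), ("Sb", 4), ("Sc", 5)}

def collides(first: Marks, second: Marks, offset: int) -> bool:
    """True if some stage is needed by both evaluations in the same time unit.

    `second` is started `offset` clock periods after `first`, so its
    marks are shifted right by `offset` before the two tables are compared.
    """
    shifted = {(stage, time + offset) for stage, time in second}
    return bool(first & shifted)

# Offsets at which B can be started after A without a stage conflict.
safe_offsets = [d for d in range(7) if not collides(FUNCTION_A, FUNCTION_B, d)]
print("conflict-free start offsets for B after A:", safe_offsets)
```

A static pipeline avoids this check between different functions by draining before it is reconfigured; the dynamic case has to perform it for every pair of evaluations that may overlap.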