Chapter 6: Pipelining

Ch. 13 Pipelining
Comp Sci 251 -- pipelining
Pipelining
Performance of Pipeline
• One instruction completes every clock cycle (once the pipeline is full)
• n stages in pipeline → up to n times faster
• speedup < n because some instructions do not need every stage (and the stages are not perfectly balanced)
• Note: individual instructions are not faster
MIPS Pipeline
5 stages
• IF: instruction fetch
• ID: instruction decode (read registers)
• EX: instruction execution, address calc
• MEM: memory access
• WB: write back (to register)
Implementing a single-cycle pipeline
[Figure: three instructions, each passing through the stages IF, ID, EX, MEM, WB]
• Every stage takes the same time, whether there is work or not
• Each stage must be stretched to accommodate the slowest instruction
Total time for each instruction
Instr                              Fetch   Reg Read  ALU Operation  Data access  Register Write  Total Time
Load word (lw)                     200 ps  100 ps    200 ps         200 ps       100 ps          800 ps
Store word (sw)                    200 ps  100 ps    200 ps         200 ps       -               700 ps
R-format (add, sub, and, or, slt)  200 ps  100 ps    200 ps         -            100 ps          600 ps
Branch (beq)                       200 ps  100 ps    200 ps         -            -               500 ps
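The totals in the table, and the clock periods they imply, can be checked with a short sketch. The dictionary layout and function names are illustrative assumptions. Which stages each instruction class skips follows the usual MIPS datapath: sw writes no register, R-format does no data access, and beq does neither.

```python
# Stage latencies in picoseconds, copied from the table.
# None marks a stage that the instruction class does not use.
STAGES = ("fetch", "reg_read", "alu", "data_access", "reg_write")
LATENCY_PS = {
    "lw":       (200, 100, 200, 200, 100),
    "sw":       (200, 100, 200, 200, None),
    "R-format": (200, 100, 200, None, 100),
    "beq":      (200, 100, 200, None, None),
}


def total_time(instr):
    """Sum of the latencies of the stages the instruction actually uses."""
    return sum(t for t in LATENCY_PS[instr] if t is not None)


def single_cycle_clock():
    """A single-cycle clock must be long enough for the slowest instruction."""
    return max(total_time(name) for name in LATENCY_PS)


def pipelined_clock():
    """A pipelined clock must be long enough for the slowest single stage."""
    return max(t for row in LATENCY_PS.values() for t in row if t is not None)


if __name__ == "__main__":
    for name in LATENCY_PS:
        print(f"{name:10s} {total_time(name)} ps")
    print("single-cycle clock:", single_cycle_clock(), "ps")
    print("pipelined clock:   ", pipelined_clock(), "ps")
```

The single-cycle clock is set by lw (800 ps), while the pipelined clock is set by the slowest stage (200 ps). That is why the best-case gain here is 4x rather than 5x.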
Single-cycle non-pipelined execution
lw $7, 100($5)
lw $8, 200($5)
lw $9, 300($5)
[Figure: each lw passes through IR, R, ALU, MEM, R within one 800 ps cycle; the next lw starts only when the previous one has finished]
3 independent lw instructions will take 3 x 800 ps = 2400 ps
Pipelined execution
lw $7, 100($5)
lw $8, 200($5)
lw $9, 300($5)
[Figure: the three lw instructions overlap, each starting 200 ps after the previous one and passing through IF, R, ALU, MEM, R; time axis marked from 200 to 1400 ps]
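Pipelined, a new lw starts every 200 ps: the time between the 1st and 4th instruction is 3 x 200 ps = 600 ps instead of 3 x 800 ps = 2400 ps
4 times the throughput of the single-cycle non-pipelined execution
Including pipeline fill, these 3 lw instructions finish after 7 x 200 ps = 1400 ps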
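A minimal sketch of the timing arithmetic, assuming the 800 ps single-cycle period and the 200 ps, 5-stage pipeline from the slides, with no hazards. The function names are illustrative. It shows that the 4x figure is a steady-state limit, approached as the instruction count grows.

```python
def non_pipelined_time(n_instr, instr_time_ps=800):
    """Each instruction runs to completion before the next one starts."""
    return n_instr * instr_time_ps


def pipelined_time(n_instr, n_stages=5, stage_time_ps=200):
    """The first instruction needs n_stages cycles; each later one
    finishes one cycle after its predecessor (no stalls assumed)."""
    return (n_stages + n_instr - 1) * stage_time_ps


def speedup(n_instr):
    return non_pipelined_time(n_instr) / pipelined_time(n_instr)


if __name__ == "__main__":
    for n in (3, 100, 1_000_000):
        print(n, non_pipelined_time(n), pipelined_time(n), round(speedup(n), 3))
```

For 3 instructions the pipeline needs 1400 ps against 2400 ps. For a very long run of instructions the ratio approaches 800 / 200 = 4.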
Pipeline Hazards
Control Hazards
• caused by conditional branch instruction
• cannot decide which instruction is next until stage 3 (or stage 2 with beefed-up processor)
• pipeline wants to start next instruction during stage 2
Solutions to Control Hazards
• Stall:
  – start next instruction during stage 3 (assume branch is resolved in stage 2)
  – equivalent to placing a “nop” after every branch
• Predict:
  – if incorrect, flush the bad instruction
• Some prediction strategies
  – assume all branches not taken
  – static: assume some always taken, others never taken
  – dynamic: use past history, keep stats on branches
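One common way to "use past history" is a 2-bit saturating counter per branch. The slides do not name a specific scheme, so this is an illustrative assumption. The class name, the branch address `0x400`, and the loop-shaped outcome sequence are all hypothetical.

```python
class TwoBitPredictor:
    """One 2-bit saturating counter per branch address.
    Counter values 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counters = {}  # branch address -> counter in 0..3

    def predict(self, pc):
        # Unknown branches start at 1 (weakly not taken).
        return self.counters.get(pc, 1) >= 2

    def update(self, pc, taken):
        c = self.counters.get(pc, 1)
        self.counters[pc] = min(3, c + 1) if taken else max(0, c - 1)


def count_mispredictions(predictor, pc, outcomes):
    """Feed the actual outcomes of one branch through the predictor."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


if __name__ == "__main__":
    # A loop-closing branch: taken 9 times, then not taken; the loop runs twice.
    outcomes = ([True] * 9 + [False]) * 2
    dynamic_misses = count_mispredictions(TwoBitPredictor(), 0x400, outcomes)
    static_not_taken_misses = sum(outcomes)  # wrong whenever the branch is taken
    print("2-bit predictor misses:     ", dynamic_misses)
    print("always-not-taken misses:    ", static_not_taken_misses)
```

On this sequence the counter mispredicts only while warming up and at each loop exit. "Assume all branches not taken" is wrong on every taken iteration.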
Solutions to Control Hazards
• Delayed decision: (used in MIPS and SPARC)
  – instruction following branch always executes
  – branch takes place after this instruction
  – compiler or assembler fills “delay slots” with useful instructions (or nops)
• Change order of neighboring instructions, if logically acceptable
• Or nop
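The "change order if logically acceptable, or nop" rule can be sketched as a tiny scheduling step. This is a simplified illustration, not how a real assembler works. It looks only at the one instruction before the branch and checks only register dependences. The `Instr` record, `fill_delay_slot`, and the label `L1` are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str                 # assembly text, for display only
    dest: Optional[str]       # register written, or None
    srcs: Tuple[str, ...]     # registers read


NOP = Instr("nop", None, ())


def fill_delay_slot(before: Instr, branch: Instr) -> List[Instr]:
    """Return the scheduled sequence for `before` followed by `branch`.

    If the branch does not read the register that `before` writes,
    `before` can move into the delay slot (it always executes anyway).
    Otherwise the order is kept and the slot is filled with a nop.
    """
    if before.dest is None or before.dest not in branch.srcs:
        return [branch, before]
    return [before, branch, NOP]


if __name__ == "__main__":
    beq = Instr("beq $t2, $t3, L1", None, ("$t2", "$t3"))
    independent = Instr("add $s0, $t0, $t1", "$s0", ("$t0", "$t1"))
    dependent = Instr("add $t2, $t0, $t1", "$t2", ("$t0", "$t1"))
    for prev in (independent, dependent):
        print([i.text for i in fill_delay_slot(prev, beq)])
```

In the first case the add fills the slot at no cost. In the second the branch needs `$t2` from the add, so the slot is wasted on a nop.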
Data Hazards
• Assume register write happens in WB stage
    add $s0, $t0, $t1
    sub $t2, $s0, $t3
• Example requires three pipeline stalls
  – too costly to allow
  – too frequent for compiler to resolve
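The count of three stalls can be reproduced with a small cycle calculation. This is a sketch assuming the 5-stage pipeline above, registers read in ID (stage 2) and written in WB (stage 5), and no forwarding. The `same_cycle_write_read` flag is an added assumption: it models a register file that writes in the first half of a cycle and reads in the second, which the slide's figure does not assume.

```python
ID_STAGE = 2  # registers are read here
WB_STAGE = 5  # registers are written here


def stalls_without_forwarding(distance, same_cycle_write_read=False):
    """Stall cycles needed by a consumer `distance` instructions after
    the producer (distance 1 = the very next instruction)."""
    # The producer is fetched in cycle 1, so it writes back in cycle 5.
    write_cycle = WB_STAGE
    # Unstalled, the consumer would read its registers in this cycle.
    read_cycle = distance + ID_STAGE
    # The read must come after the write (or in the same cycle, if allowed).
    earliest_read = write_cycle if same_cycle_write_read else write_cycle + 1
    return max(0, earliest_read - read_cycle)


if __name__ == "__main__":
    for d in range(1, 5):
        print("distance", d, "->", stalls_without_forwarding(d), "stalls")
```

With `distance=1` this matches the add/sub pair above. The stalls disappear only when enough independent instructions separate the two.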
Solution to Data Hazards
Forwarding: getting the missing item early from internal resources
• sub gets $s0 value from ALU, not reg. file
• sometimes forwarding avoids stalls
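Solution to Data Hazards
• sometimes forwarding only reduces stalls (e.g. a lw followed immediately by an instruction that uses the loaded register still needs one stall)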
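Why forwarding sometimes removes a stall and sometimes only shortens it can be sketched with the same kind of cycle arithmetic. The assumptions are those of the standard 5-stage MIPS pipeline: an ALU result exists at the end of EX (stage 3), loaded data exists at the end of MEM (stage 4), and the consumer needs its operand at the start of its own EX stage. The names are illustrative.

```python
EX_STAGE = 3
# Stage at whose end the producer's result first exists.
RESULT_READY_STAGE = {"alu": 3, "load": 4}


def stalls_with_forwarding(producer_kind, distance):
    """Stall cycles for a consumer `distance` instructions after the
    producer when results are forwarded straight to the ALU inputs."""
    # The producer is fetched in cycle 1; its result exists at the end of this cycle.
    ready_cycle = RESULT_READY_STAGE[producer_kind]
    # Unstalled, the consumer would enter EX in this cycle.
    use_cycle = distance + EX_STAGE
    # The consumer's EX must start in a cycle after the result is ready.
    return max(0, ready_cycle + 1 - use_cycle)


if __name__ == "__main__":
    print("add then sub:", stalls_with_forwarding("alu", 1), "stalls")
    print("lw then use: ", stalls_with_forwarding("load", 1), "stall")
```

An ALU-to-ALU dependence needs no stall. A load followed immediately by its use still costs one cycle, because the data leaves memory only after the consumer would already have needed it.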
Advanced Pipelining Techniques
• Superpipelining: large number of stages
• Superscalar
  – multiple copies of each stage
  – several instructions started/finished per cycle
Advanced Pipelining Techniques
• Dynamic Pipeline Scheduling
Pentium Pro / Power PC 604