Ch. 13 Pipelining
Comp Sci 251 -- pipelining

Performance of Pipeline
• In steady state, one instruction completes every clock cycle
• n stages in pipeline: up to n times faster
• Speedup < n because some instructions do not need every stage, yet still pass through all of them
• Note: individual instructions are not faster; pipelining improves throughput, not the latency of one instruction

MIPS Pipeline
5 stages:
• IF: instruction fetch
• ID: instruction decode (read registers)
• EX: instruction execution, address calculation
• MEM: memory access
• WB: write back (to register)

Implementing a single-cycle pipeline
  IF  ID  EX  MEM WB
      IF  ID  EX  MEM WB
          IF  ID  EX  MEM WB
• Every stage takes the same time, whether there is work or not
• Each stage must be stretched to the length of the slowest stage, so that even the slowest instruction fits

Total time for each instruction
Instruction                         Instr Fetch  Reg Read  ALU Operation  Data Access  Register Write  Total Time
Load word (lw)                      200 ps       100 ps    200 ps         200 ps       100 ps          800 ps
Store word (sw)                     200 ps       100 ps    200 ps         200 ps       -               700 ps
R-format (add, sub, and, or, slt)   200 ps       100 ps    200 ps         -            100 ps          600 ps
Branch (beq)                        200 ps       100 ps    200 ps         -            -               500 ps

Single-cycle non-pipelined execution
  lw $7, 100($5)
  lw $8, 200($5)
  lw $9, 300($5)
• Each instruction occupies one 800 ps clock cycle; the next instruction starts only when the previous one has finished
• 3 independent lw instructions will take 3 x 800 ps = 2400 ps

Pipelined execution
Time (ps)         0-200      200-400    400-600    600-800    800-1000   1000-1200  1200-1400
lw $7, 100($5)    IF         R          ALU        MEM        R
lw $8, 200($5)               IF         R          ALU        MEM        R
lw $9, 300($5)                          IF         R          ALU        MEM        R
(R = register read in stage 2, register write in stage 5)
• Every stage takes 200 ps, so a new instruction starts every 200 ps
• Issuing 3 independent lw instructions takes 3 x 200 ps = 600 ps, versus 3 x 800 ps = 2400 ps without pipelining
• The third lw finishes at 1400 ps, because the pipeline needs 4 extra cycles to drain
• In steady state one instruction completes every 200 ps instead of every 800 ps: 4 times the throughput of single-cycle non-pipelined execution

Pipeline Hazards
Control Hazards
• Caused by conditional branch instructions
• Cannot decide which instruction is next until stage 3 (or stage 2 with a beefed-up processor)
• The pipeline wants to start the next instruction during stage 2

Solutions to Control Hazards
Stall:
– start next instruction during stage 3 (assume branch is resolved in stage 2)
– equivalent to placing a "nop" after every branch
Predict:
– if incorrect, flush the bad instruction
– some prediction strategies:
    • assume all branches not taken
    • static: assume some always taken, others never taken
    • dynamic: use past history, keep stats on branches

Solutions to Control Hazards
Delayed decision: (used in MIPS and SPARC)
– instruction following branch always executes
– branch takes place after this instruction
– compiler or assembler fills "delay slots" with useful instructions (or nops)
    • change order of neighboring instructions, if logically acceptable
    • or insert a nop

Data Hazards
• Assume register write happens in WB stage
    add $s0, $t0, $t1
    sub $t2, $s0, $t3
• Example requires three pipeline stalls: sub reads $s0 in its ID stage, but add does not write $s0 until its WB stage
– too costly to allow
– too frequent for compiler to resolve

Solution to Data Hazards
Forwarding: getting the missing item early from internal resources
• sub gets $s0 value from ALU, not reg. file
• sometimes forwarding avoids stalls

Solution to Data Hazards
• sometimes forwarding only reduces stalls (for example, when a load is followed immediately by an instruction that uses the loaded register)

Advanced Pipelining Techniques
• Superpipelining: large number of stages
• Superscalar
– multiple copies of each stage
– several instructions started/finished per cycle

Advanced Pipelining Techniques
• Dynamic Pipeline Scheduling

Pentium Pro / Power PC 604
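
Worked example: checking the timing slides

The arithmetic behind the "Single-cycle non-pipelined execution" and "Pipelined execution" slides can be checked with a short Python sketch. It is an illustration only, not part of the original slides: the function names (`instruction_time`, `single_cycle_time`, `pipelined_time`) are hypothetical, the stage latencies are copied from the "Total time for each instruction" table, and the model assumes an ideal 5-stage pipeline with no hazards or stalls.

```python
# Stage latencies in picoseconds, from the "Total time for each instruction" table.
# Order: instr fetch, reg read, ALU operation, data access, register write.
# None marks a stage the instruction does not use.
STAGE_PS = {
    "lw":  [200, 100, 200, 200, 100],
    "sw":  [200, 100, 200, 200, None],
    "add": [200, 100, 200, None, 100],   # stands in for any R-format instruction
    "beq": [200, 100, 200, None, None],
}
N_STAGES = 5


def instruction_time(op):
    """Total time one instruction needs if it only pays for the stages it uses."""
    return sum(t for t in STAGE_PS[op] if t is not None)


def single_cycle_time(program):
    """Non-pipelined, single-cycle design: every instruction takes one clock
    cycle, and the cycle is as long as the slowest instruction (lw, 800 ps)."""
    cycle = max(instruction_time(op) for op in STAGE_PS)
    return cycle * len(program)


def pipelined_time(program):
    """Ideal pipeline (assumption: no hazards): every stage is stretched to the
    slowest stage (200 ps); the first instruction needs N_STAGES cycles and
    each later instruction finishes one cycle after the previous one."""
    cycle = max(t for stages in STAGE_PS.values() for t in stages if t is not None)
    return cycle * (N_STAGES + len(program) - 1)


three_loads = ["lw", "lw", "lw"]
print(single_cycle_time(three_loads))   # 2400
print(pipelined_time(three_loads))      # 1400

# With a long program the fill/drain cost becomes negligible and the
# speedup approaches 800 / 200 = 4.
long_program = ["lw"] * 1_000_000
print(single_cycle_time(long_program) / pipelined_time(long_program))
```

For only three instructions the measured speedup is 2400 / 1400, about 1.7, because filling and draining the pipeline dominates. The factor of 4 quoted on the slide is the steady-state limit, which this model approaches as the program gets longer.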