CS 61C: Great Ideas in Computer Architecture (Machine Structures) Instructor:

CS 61C: Great Ideas in Computer Architecture (Machine Structures)
MIPS Control, MIPS Pipelining Intro
Instructor: Michael Greenbaum
6/27/2016, Summer 2011 -- Lecture #21

Levels of Representation/Interpretation

High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  | Compiler
Assembly Language Program (e.g., MIPS)
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
  | Assembler
Machine Language Program (MIPS)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  | Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
  | Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)

Anything can be represented as a number, i.e., data or instructions

Review: A Single Cycle Datapath

[Figure: single-cycle datapath. Inst Memory produces Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. Components: PC with two adders, PC Ext, and a mux; RegFile (Rw, Ra, Rb, busA, busB, busW); Extender; ALU with Zero output; Data Memory (WrEn, Adr, Data In). Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg.]

Agenda
• Overview of Control Signals
• Administrivia
• Control Implementation
• Break
• Pipelining Intro

Processor Design Process
• Five steps to design a processor:
  Step 1: Analyze instruction set to determine datapath requirements
  Step 2: Select set of datapath components & establish clocking methodology
  Step 3: Assemble datapath components to meet the requirements
  Step 4: Analyze implementation of each instruction to determine setting of control signals that implements the register transfer
  Step 5: Assemble the control logic

A Single Cycle Datapath

[Figure: the single-cycle datapath from the review slide, with the next-PC logic collapsed into an "instr fetch unit" block controlled by nPC_sel.]
• We have everything but the values of control signals

Values of the Control Signals
• nPC_sel ("n" = next):
  – "+4": 0 => PC <- PC + 4
  – "br": 1 => PC <- PC + 4 + {SignExt(Im16), 00}

[Figure: instruction fetch unit. PC, Inst Address, adder for PC + 4, PC Ext and second adder for the branch target, mux selected by nPC_sel.]

Values of the Control Signals
• ExtOp: zero, sign
• ALUsrc: 0 => reg; 1 => imm
• ALUctr: ADD, SUB, OR
• MemWr: 1 => write memory
• MemtoReg: 0 => alu; 1 => mem
• RegDst: 0 => rt; 1 => rd
• RegWr: 1 => write register

[Figure: datapath (RegFile, Extender, ALU, Data Memory) annotated with the control signals above.]

RTL: The Add Instruction

R-type format: op (bits 31-26, 6 bits), rs (25-21, 5 bits), rt (20-16, 5 bits), rd (15-11, 5 bits), shamt (10-6, 5 bits), funct (5-0, 6 bits)

add rd, rs, rt
– MEM[PC]: Fetch the instruction from memory
– R[rd] = R[rs] + R[rt]: The actual operation
– PC = PC + 4: Calculate the next instruction’s address

Single Cycle Datapath during Add
• R[rd] = R[rs] + R[rt]
• Control signal values: nPC_sel=+4, RegDst=1, RegWr=1, ExtOp=x, ALUSrc=0, ALUctr=ADD, MemWr=0, MemtoReg=0

[Figure: datapath with the signal values above filled in.]

Instruction Fetch Unit for Add
• PC = PC + 4
  – Same for all instructions except: Branch and Jump
• nPC_sel=+4

[Figure: instruction fetch unit with the PC + 4 input of the mux selected.]

Single Cycle Datapath during Or Immediate

I-type format: op (bits 31-26), rs (25-21), rt (20-16), immediate (15-0)

• R[rt] = R[rs] OR ZeroExt[Imm16]
• Control signal values to fill in: nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=

[Figure: datapath with control signal values left blank.]
Single Cycle Datapath during Or Immediate
• R[rt] = R[rs] OR ZeroExt[Imm16]
• Control signal values: nPC_sel=+4, RegDst=0, RegWr=1, ExtOp=zero, ALUSrc=1, ALUctr=OR, MemWr=0, MemtoReg=0

[Figure: datapath with the signal values above filled in.]

Single Cycle Datapath during Load

I-type format: op (bits 31-26), rs (25-21), rt (20-16), immediate (15-0)

• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
• Control signal values to fill in: nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=

[Figure: datapath with control signal values left blank.]

Single Cycle Datapath during Load
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
• Control signal values: nPC_sel=+4, RegDst=0, RegWr=1, ExtOp=sign, ALUSrc=1, ALUctr=ADD, MemWr=0, MemtoReg=1

[Figure: datapath with the signal values above filled in.]

Single Cycle Datapath during Branch

I-type format: op (bits 31-26), rs (25-21), rt (20-16), immediate (15-0)

• if (R[rs] - R[rt] == 0) then Zero = 1; else Zero = 0
• Control signal values to fill in: nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=

[Figure: datapath with control signal values left blank.]
Single Cycle Datapath during Branch
• if (R[rs] - R[rt] == 0) then Zero = 1; else Zero = 0
• Control signal values: nPC_sel=br, RegDst=x, RegWr=0, ExtOp=x, ALUSrc=0, ALUctr=SUB, MemWr=0, MemtoReg=x

[Figure: datapath with the signal values above filled in.]

Instruction Fetch Unit at the End of Branch
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4; else PC = PC + 4
• nPC_sel = 1 if branch instr.

  nPC_sel   zero?   MUX ctrl
  0         x       0
  1         0       0
  1         1       1

Q: What logic gate?

[Figure: instruction fetch unit. nPC_sel and Zero are combined to drive the select input of the PC mux.]

Summary: Datapath’s Control Signals
• ExtOp: “zero”, “sign”
• ALUsrc: 0 => regB; 1 => immed
• ALUctr: “ADD”, “SUB”, “OR”
• MemWr: 1 => write memory
• MemtoReg: 0 => ALU; 1 => Mem
• RegDst: 0 => “rt”; 1 => “rd”
• RegWr: 1 => write register

[Figure: complete single-cycle datapath annotated with all control signals.]

Administrivia
• HW3 due Wednesday at midnight.
• Project 2 Part 2 due Sunday.
  – Slides at end of July 12 lecture contain useful info.
  – Project 2 Extra Credit: If you have time after finishing Part 2, try to make your program as fast as possible! Top submissions in the class get lots of extra credit; substantial improvement will get some.
• Lab 12 cancelled!
  – Replaced with free study session where you can catch up on labs / work on Project 2.
  – The TAs will still be there.

cs61c in the News
• Use power dissipation from servers as source of heating?
• So-called “Data Furnaces.” Sell the heat to consumers.
• “Additionally, such a setup would also provide lower network latency as the storage and computation systems can be located closer to areas of high population density and therefore those using them.”

http://www.gizmag.com/data-furnaces-to-heat-the-home/19314/
http://research.microsoft.com/pubs/150265/heating.pdf

Processor Design Process
• Five steps to design a processor:
  Step 1: Analyze instruction set to determine datapath requirements
  Step 2: Select set of datapath components & establish clocking methodology
  Step 3: Assemble datapath components to meet the requirements
  Step 4: Analyze implementation of each instruction to determine setting of control signals that implements the register transfer
  Step 5: Assemble the control logic

Given Datapath: RTL => Control

[Figure: Inst Memory produces Instruction<31:0>. Fields Op <26:31>, Fun <0:5>, Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15> are extracted; Op and Fun feed the Control block, which drives nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg into the DATA PATH.]

Summary of the Control Signals (1/2)

  inst   Register Transfer
  add    R[rd] <- R[rs] + R[rt]; PC <- PC + 4
         ALUsrc=RegB, ALUctr=“ADD”, RegDst=rd, RegWr, nPC_sel=“+4”
  sub    R[rd] <- R[rs] – R[rt]; PC <- PC + 4
         ALUsrc=RegB, ALUctr=“SUB”, RegDst=rd, RegWr, nPC_sel=“+4”
  ori    R[rt] <- R[rs] OR zero_ext(Imm16); PC <- PC + 4
         ALUsrc=Im, Extop=“Z”, ALUctr=“OR”, RegDst=rt, RegWr, nPC_sel=“+4”
  lw     R[rt] <- MEM[R[rs] + sign_ext(Imm16)]; PC <- PC + 4
         ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemtoReg, RegDst=rt, RegWr, nPC_sel=“+4”
  sw     MEM[R[rs] + sign_ext(Imm16)] <- R[rt]; PC <- PC + 4
         ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemWr, nPC_sel=“+4”
  beq    if (R[rs] == R[rt]) then PC <- PC + 4 + {sign_ext(Imm16), 00} else PC <- PC + 4
         nPC_sel=“br”, ALUctr=“SUB”

Summary of the Control Signals (2/2)

See Appendix A

              add       sub       ori       lw        sw        beq       jump
  func        10 0000   10 0010   (We Don’t Care :-)
  op          00 0000   00 0000   00 1101   10 0011   10 1011   00 0100   00 0010
  RegDst      1         1         0         0         x         x         x
  ALUSrc      0         0         1         1         1         0         x
  MemtoReg    0         0         0         1         x         x         x
  RegWrite    1         1         1         1         0         0         0
  MemWrite    0         0         0         0         1         0         0
  nPCsel      0         0         0         0         0         1         ?
  Jump        0         0         0         0         0         0         1
  ExtOp       x         x         0         1         1         x         x
  ALUctr<2:0> Add       Subtract  Or        Add       Add       Subtract  x

Instruction formats:
  R-type: op (31-26), rs (25-21), rt (20-16), rd (15-11), shamt (10-6), funct (5-0): add, sub
  I-type: op (31-26), rs (25-21), rt (20-16), immediate (15-0): ori, lw, sw, beq
  J-type: op (31-26), target address (25-0): jump

Boolean Expressions for Controller

  RegDst    = add + sub
  ALUSrc    = ori + lw + sw
  MemtoReg  = lw
  RegWrite  = add + sub + ori + lw
  MemWrite  = sw
  nPCsel    = beq
  Jump      = jump
  ExtOp     = lw + sw
  ALUctr[0] = sub + beq
  ALUctr[1] = ori
  (assume ALUctr is 00: ADD, 01: SUB, 10: OR)

Where:
  rtype = ~op5 · ~op4 · ~op3 · ~op2 · ~op1 · ~op0
  ori   = ~op5 · ~op4 ·  op3 ·  op2 · ~op1 ·  op0
  lw    =  op5 · ~op4 · ~op3 · ~op2 ·  op1 ·  op0
  sw    =  op5 · ~op4 ·  op3 · ~op2 ·  op1 ·  op0
  beq   = ~op5 · ~op4 · ~op3 ·  op2 · ~op1 · ~op0
  jump  = ~op5 · ~op4 · ~op3 · ~op2 ·  op1 · ~op0

  add = rtype · func5 · ~func4 · ~func3 · ~func2 · ~func1 · ~func0
  sub = rtype · func5 · ~func4 · ~func3 · ~func2 ·  func1 · ~func0

How do we implement this in gates?

Controller Implementation

[Figure: opcode and func feed a block of “AND” logic that produces one line per instruction (add, sub, ori, lw, sw, beq, jump); those lines feed a block of “OR” logic that produces RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, nPCsel, Jump, ExtOp, ALUctr[0], and ALUctr[1].]

An Abstract View of the Implementation

[Figure: the Control block takes the Instruction and Conditions from the datapath and returns Control Signals. The Datapath contains the PC and Next Address logic, Ideal Instruction Memory (Instruction Address in, Instruction out), the Register File (Rw, Ra, Rb), the ALU (inputs A and B), and Ideal Data Memory (Data Addr, Data In, Data Out).]

Call home, we’ve made HW/SW contact!
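The control table and Boolean expressions above can be mirrored in software as a lookup: decode the instruction name from op/funct, then read off its row of control values. The following is a minimal Python sketch, not the hardware implementation; the names `decode`, `control`, `CONTROL_TABLE`, and the example instruction words are illustrative assumptions, while the opcodes, function codes, and signal values are transcribed from the Summary of the Control Signals (2/2) table ("x" is a don't-care, "?" is the entry the slide leaves open).

```python
# Opcode and funct encodings from the control summary table.
OPCODES = {0b001101: "ori", 0b100011: "lw", 0b101011: "sw",
           0b000100: "beq", 0b000010: "jump"}
FUNCTS = {0b100000: "add", 0b100010: "sub"}  # only used when op == 000000

SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite", "MemWrite",
           "nPCsel", "Jump", "ExtOp", "ALUctr")

# One row per instruction; "x" = don't care, "?" = left open on the slide.
CONTROL_TABLE = {
    "add":  (1,   0,   0,   1, 0, 0,   0, "x", "ADD"),
    "sub":  (1,   0,   0,   1, 0, 0,   0, "x", "SUB"),
    "ori":  (0,   1,   0,   1, 0, 0,   0, 0,   "OR"),
    "lw":   (0,   1,   1,   1, 0, 0,   0, 1,   "ADD"),
    "sw":   ("x", 1,   "x", 0, 1, 0,   0, 1,   "ADD"),
    "beq":  ("x", 0,   "x", 0, 0, 1,   0, "x", "SUB"),
    "jump": ("x", "x", "x", 0, 0, "?", 1, "x", "x"),
}


def decode(instr: int) -> str:
    """Return the instruction name for a 32-bit MIPS instruction word."""
    op = (instr >> 26) & 0x3F   # bits 31-26
    funct = instr & 0x3F        # bits 5-0
    name = FUNCTS.get(funct) if op == 0 else OPCODES.get(op)
    if name is None:
        raise ValueError(f"unsupported instruction word: {instr:#010x}")
    return name


def control(instr: int) -> dict:
    """Look up the control signal values for an instruction word."""
    return dict(zip(SIGNALS, CONTROL_TABLE[decode(instr)]))


if __name__ == "__main__":
    # Hypothetical example words: lw $t0, 0($2) and add $t0, $t1, $t2.
    for word in (0x8C480000, 0x012A4020):
        print(f"{word:#010x}", decode(word), control(word))
```

The "AND" logic of the controller corresponds to `decode` (one name per instruction), and the "OR" logic corresponds to reading a column of `CONTROL_TABLE` across all instructions.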
[Figure: the Levels of Representation/Interpretation diagram from the start of the lecture, repeated. High Level Language Program (e.g., C) => Compiler => Assembly Language Program (e.g., MIPS) => Assembler => Machine Language Program (MIPS) => Machine Interpretation => Hardware Architecture Description (e.g., block diagrams) => Architecture Implementation => Logic Circuit Description (Circuit Schematic Diagrams). Anything can be represented as a number, i.e., data or instructions.]

Agenda
• Overview of Control Signals
• Administrivia
• Control Implementation
• Break
• Pipelining Intro

Single Cycle Performance
• Assume time for actions are
  – 100ps for register read or write; 200ps for other events
• Clock rate is?

  Instr      Instr fetch   Register read   ALU op   Memory access   Register write   Total time
  lw         200ps         100ps           200ps    200ps           100ps            800ps
  sw         200ps         100ps           200ps    200ps                            700ps
  R-format   200ps         100ps           200ps                    100ps            600ps
  beq        200ps         100ps           200ps                                     500ps

• What can we do to improve clock rate?
• Will this improve performance as well?
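The totals in the table above follow from summing the delays of the steps each instruction uses, and a single-cycle clock must be long enough for the slowest instruction. The following is a minimal Python sketch of that arithmetic; the names `STAGE_PS`, `STAGES_USED`, `latency_ps`, `single_cycle_period_ps`, and `clock_rate_ghz` are illustrative assumptions, while the delays and the steps used per instruction come from the table.

```python
# Delay of each step in picoseconds, from the slide's assumptions.
STAGE_PS = {"instr_fetch": 200, "reg_read": 100, "alu": 200,
            "mem": 200, "reg_write": 100}

# Which steps each instruction class uses, read off the table rows.
STAGES_USED = {
    "lw":       ("instr_fetch", "reg_read", "alu", "mem", "reg_write"),
    "sw":       ("instr_fetch", "reg_read", "alu", "mem"),
    "R-format": ("instr_fetch", "reg_read", "alu", "reg_write"),
    "beq":      ("instr_fetch", "reg_read", "alu"),
}


def latency_ps(instr: str) -> int:
    """Total time for one instruction: sum of the steps it uses."""
    return sum(STAGE_PS[stage] for stage in STAGES_USED[instr])


def single_cycle_period_ps() -> int:
    """Single-cycle clock period: set by the slowest instruction."""
    return max(latency_ps(instr) for instr in STAGES_USED)


def clock_rate_ghz(period_ps: float) -> float:
    """Convert a period in picoseconds to a frequency in GHz."""
    return 1000.0 / period_ps


if __name__ == "__main__":
    for instr in STAGES_USED:
        print(f"{instr:9s} {latency_ps(instr)} ps")
    period = single_cycle_period_ps()
    print(f"period = {period} ps, clock = {clock_rate_ghz(period)} GHz")
```

Under these assumptions the period is set by lw, so every other instruction wastes part of each cycle.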
Want increased clock rate to mean faster programs

Pipeline Analogy: Doing Laundry
• Ann, Brian, Cathy, Dave (A, B, C, D) each have one load of clothes to wash, dry, fold, and put away
  – Washer takes 30 minutes
  – Dryer takes 30 minutes
  – “Folder” takes 30 minutes
  – “Stasher” takes 30 minutes to put clothes into drawers

Sequential Laundry

[Figure: timeline from 6 PM to 2 AM, task order A, B, C, D on the vertical axis. Each load occupies four consecutive 30-minute slots, and each load starts only after the previous one finishes.]

• Sequential laundry takes 8 hours for 4 loads

Pipelined Laundry

[Figure: same axes. Each load starts 30 minutes after the previous one, so the four loads overlap across seven 30-minute slots.]

• Pipelined laundry takes 3.5 hours for 4 loads!

Pipelining Lessons (1/2)

[Figure: pipelined laundry timeline, 6 PM to 9:30 PM, loads A through D.]

• Pipelining doesn’t help latency of single task, it helps throughput of entire workload
• Multiple tasks operating simultaneously using different resources
• Potential speedup = Number of pipe stages
• Time to fill pipeline and time to drain it reduce speedup: 2.3X vs. 4X in this example

Pipelining Lessons (2/2)

[Figure: pipelined laundry timeline, 6 PM to 9:30 PM, loads A through D.]

• Suppose new Washer takes 20 minutes, new Stasher takes 20 minutes. How much faster is pipeline?
• Pipeline rate limited by slowest pipeline stage
• Unbalanced lengths of pipe stages reduce speedup

Recall: Steps in Executing MIPS
1) IF: Instruction Fetch, Increment PC
2) ID: Instruction Decode, Read Registers
3) EX: Mem-ref: Calculate Address; Arith-log: Perform Operation
4) Mem: Load: Read Data from Memory; Store: Write Data to Memory
5) WB: Write Data Back to Register

Redrawn Single-Cycle Datapath

[Figure: datapath drawn left to right in five sections: 1. Instruction Fetch (PC, +4, instruction memory), 2. Decode/Register Read (registers; rs, rt, rd, imm), 3. Execute (ALU), 4. Memory (Data memory), 5. Write Back.]
Pipelined Datapath

[Figure: the redrawn datapath with the same five sections: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back.]

• Add registers between stages
  – Hold information produced in previous cycle

More Detailed Pipeline
[Figure only.]

IF for Load, Store, …
[Figure only.]

ID for Load, Store, …
[Figure only.]

EX for Load
[Figure only.]

MEM for Load
[Figure only.]

WB for Load
[Figure only. Annotation: Wrong register number.]

Corrected Datapath for Load
[Figure only.]

Pipelined Execution Representation

  Time ->
  IF  ID  EX  Mem WB
      IF  ID  EX  Mem WB
          IF  ID  EX  Mem WB
              IF  ID  EX  Mem WB
                  IF  ID  EX  Mem WB
                      IF  ID  EX  Mem WB

• Every instruction must take same number of steps, also called pipeline stages, so some will go idle sometimes

Graphical Pipeline Diagrams

[Figure: the five-section datapath: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back.]

• Use datapath figure below to represent pipeline

  IF: I$    ID: Reg    EX: ALU    Mem: D$    WB: Reg

Graphical Pipeline Representation
(In Reg, right half highlight read, left half write)

[Figure: time in clock cycles on the horizontal axis, instruction order (Load, Add, Store, Sub, Or) on the vertical axis. Each instruction passes through I$, Reg, ALU, D$, Reg, starting one cycle after the instruction above it.]

Pipeline Performance
• Assume time for stages is
  – 100ps for register read or write
  – 200ps for other stages
• What is pipelined clock rate?
  – Compare pipelined datapath with single-cycle datapath

  Instr      Instr fetch   Register read   ALU op   Memory access   Register write   Total time
  lw         200ps         100ps           200ps    200ps           100ps            800ps
  sw         200ps         100ps           200ps    200ps                            700ps
  R-format   200ps         100ps           200ps                    100ps            600ps
  beq        200ps         100ps           200ps                                     500ps

Student Roulette?

Pipeline Performance

[Figure: instruction timelines for the two designs: Single-cycle (Tc = 800ps) and Pipelined (Tc = 200ps).]

Pipeline Speedup
• If all stages are balanced
  – i.e., all take the same time
  – Time between instructions (pipelined) = Time between instructions (nonpipelined) / Number of stages
• If not balanced, speedup is less
• Speedup due to increased throughput
  – Latency (time for each instruction) does not decrease

Instruction Level Parallelism (ILP)
• ILP
  – e.g., Pipelined Instruction Execution
  – 5 stage pipeline => 5 instructions executing simultaneously, one at each pipeline stage

And in Conclusion, …
• Can determine control signal values for each instruction.
• Control implementation: Just combinational logic (at least for a single cycle CPU…)
• Pipelining
  – Reduce critical path by inserting registers between each stage
  – N-stage pipeline => N operations done in parallel!

Bonus slides
• Note: You are still responsible for knowing the material covered in these slides.
• The slides will appear in the order they would have in the normal presentation

Single Cycle Datapath during Store

I-type format: op (bits 31-26), rs (25-21), rt (20-16), immediate (15-0)

• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
• Control signal values to fill in: nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=

[Figure: datapath with control signal values left blank.]

Single Cycle Datapath during Store
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
• Control signal values: nPC_sel=+4, RegDst=x, RegWr=0, ExtOp=sign, ALUSrc=1, ALUctr=ADD, MemWr=1, MemtoReg=x

[Figure: datapath with the signal values above filled in.]

Single Cycle Datapath during Jump

J-type format: op (bits 31-26), target address (bits 25-0)

• New PC = { PC[31..28], target address, 00 }
• Control signal values to fill in: Jump=, nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=

[Figure: datapath with a Jump input and the 26-bit target address (TA26, Instruction<0:25>) routed to the Instruction Fetch Unit; control signal values left blank.]

Single Cycle Datapath during Jump
• New PC = { PC[31..28], target address, 00 }
• Control signal values: Jump=1, nPC_sel=?, RegDst=x, RegWr=0, ExtOp=x, ALUSrc=x, ALUctr=x, MemWr=0, MemtoReg=x

[Figure: datapath with the signal values above filled in.]

Instruction Fetch Unit at the End of Jump
• New PC = { PC[31..28], target address, 00 }

[Figure: branch-only instruction fetch unit. nPC_sel and Zero produce nPC_MUX_sel, which selects between PC + 4 and the branch target.]

How do we modify this to account for jumps?

Instruction Fetch Unit at the End of Jump
• New PC = { PC[31..28], target address, 00 }

[Figure: instruction fetch unit with a second mux, selected by Jump, that chooses between the output of the branch mux and the jump address formed from the 4 MSBs of the PC, the 26-bit target address (TA), and 00.]

Query
• Can Zero still get asserted?
• Does nPC_sel need to be 0?
• If not, what?
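The next-PC rules from the branch and jump slides can be written out as one function. The following is a minimal Python sketch, not the hardware design; `next_pc`, `sign_extend16`, and the argument names are illustrative assumptions. It assumes the Jump mux takes priority over the branch mux, which is one reading of the final fetch-unit figure, and it forms the jump address from PC[31..28] exactly as the slide states.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc: int, imm16: int, target26: int,
            n_pc_sel: int, zero: int, jump: int) -> int:
    """Compute the next PC from the fetch-unit inputs.

    jump     -> { PC[31..28], target address, 00 }
    branch   -> PC + 4 + SignExt[imm16]*4, taken only if nPC_sel and Zero
    default  -> PC + 4
    """
    pc_plus_4 = (pc + 4) & MASK32
    if jump:
        # Assumption: Jump overrides the branch mux, so nPC_sel and Zero
        # do not affect the result here.
        return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)
    if n_pc_sel and zero:
        # Branch offset is in words, so shift the sign-extended value by 2.
        return (pc_plus_4 + (sign_extend16(imm16) << 2)) & MASK32
    return pc_plus_4


if __name__ == "__main__":
    print(hex(next_pc(0x00400000, 3, 0, n_pc_sel=1, zero=1, jump=0)))
    print(hex(next_pc(0x00400000, 3, 0, n_pc_sel=1, zero=0, jump=0)))
    print(hex(next_pc(0x10000008, 0, 0x100, n_pc_sel=0, zero=0, jump=1)))
```

Calling `next_pc` with `jump=1` and different `n_pc_sel` and `zero` values is one way to explore the Query questions under this assumed mux ordering.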
