CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Lecture 30: Single-Cycle CPU Datapath Control Part 2
Instructor: Miki Garcia
http://inst.eecs.berkeley.edu/~cs61c

FUNCTIONAL MRI REVEALS AMOUNT OF PARALLEL PROCESSING IN BRAIN?
MIT Technology Review: Using independent component analysis on fMRI data recorded during a visuomotor task, researchers from Greece conclude that there are about 50 parallel processes (“cores”) in the brain.
http://science.slashdot.org/story/14/11/08/1913229/fmri-data-reveals-how-manyparallel-processes-run-in-the-brain

Design Steps – What we did
• MIPS light ISA:
  – ADDU, SUBU, ORI, LOAD, STORE, BEQ
[Figure: datapath with PC, +4 adder, instruction memory, registers (rs, rt, rd), imm, ALU, “== zero” comparison, and data memory, plus a Controller driven by opcode and funct]
6/27/2016 Fall 2011 -- Lecture #29

Review: Processor Design 5 steps
Step 1: Analyze instruction set to determine datapath requirements
  – Meaning of each instruction is given by register transfers
  – Datapath must include storage element for ISA registers
  – Datapath must support each register transfer
Step 2: Select set of datapath components & establish clock methodology
Step 3: Assemble datapath components that meet the requirements
Step 4: Analyze implementation of each instruction to determine setting of control points that realizes the register transfer
Step 5: Assemble the control logic

Add & Subtract Ctrl + data timing
[Timing diagram rows: Clk; PC; Rs, Rt, Rd, Op, Func; ALUctr; RegWr]
[Timing diagram rows, continued: busA, busB; busW. Each signal holds its Old Value until the delays ahead of it have elapsed, then takes its New Value. Delay labels, in order: Instruction Memory Access Time, Delay through Control Logic, Register File Access Time, ALU Delay. A marker reads “Register Write Occurs Here”.]
[Figure: RegFile (5-bit Rw, Ra, Rb fed by Rd, Rs, Rt; 32-bit busW in; 32-bit busA and busB out) feeding the ALU; control signals ALUctr and RegWr; clk]

3c: Logical Op (or) with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
  Instruction format: op (bits 31:26, 6 bits) | rs (25:21, 5 bits) | rt (20:16, 5 bits) | immediate (15:0, 16 bits)
  ZeroExt[imm16]: 0000000000000000 (bits 31:16, 16 bits) | immediate (15:0, 16 bits)
• Writing to Rt register (not Rd)!!
• What about Rt Read?
[Figure: RegDst mux (0: Rt, 1: Rd) selects Rw; RegFile reads Rs and Rt onto busA and busB; imm16 passes through ZeroExt (16 to 32 bits); ALUSrc mux (0: busB, 1: extended immediate) feeds the second ALU input; ALUctr selects the operation; RegWr enables the register write]

3d: Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]
  Example: lw rt,rs,imm16
  Instruction format: op (bits 31:26, 6 bits) | rs (25:21, 5 bits) | rt (20:16, 5 bits) | immediate (15:0, 16 bits)
[Figure: the datapath from 3c with the ZeroExt block replaced by an Extender controlled by ExtOp, plus a Data Memory whose Adr input comes from the ALU output and a MemtoReg mux (0: ALU result, 1: memory output) driving busW]

3e: Store Operations
• Mem[ R[rs] + SignExt[imm16] ] = R[rt]
  Ex.: sw rt, rs, imm16
  Instruction format: op (bits 31:26, 6 bits) | rs (25:21, 5 bits) | rt (20:16, 5 bits) | immediate (15:0, 16 bits)
[Figure: the datapath from 3d with busB wired to Data In of the Data Memory and MemWr driving its WrEn input; control signals RegDst, RegWr, ALUctr, ExtOp, ALUSrc, MemWr, MemtoReg]

3f: The Branch Instruction
  Instruction format: op (bits 31:26, 6 bits) | rs (25:21, 5 bits) | rt (20:16, 5 bits) | immediate (15:0, 16 bits)
beq rs, rt, imm16
– mem[PC]                  Fetch the instruction from memory
– Equal = R[rs] == R[rt]   Calculate branch condition
– if (Equal)               Calculate the next instruction’s address
  • PC = PC + 4 + ( SignExt(imm16) x 4 )
  else
  • PC = PC + 4

Datapath for Branch Operations
beq rs, rt, imm16
  Instruction format: op (bits 31:26, 6 bits) | rs (25:21, 5 bits) | rt (20:16, 5 bits) | immediate (15:0, 16 bits)
• Datapath generates condition (Equal)
• Already have mux, adder; need special sign extender for PC, need equal compare (sub?)
[Figure: PC feeds the Inst Address; one Adder computes PC + 4, a second Adder adds PC Ext(imm16) with “00” appended; a Mux controlled by nPC_sel picks the next PC; RegFile outputs busA and busB go to an “=” comparison that produces Equal]

Instruction Fetch Unit including Branch
  Instruction format: op (31:26) | rs (25:21) | rt (20:16) | immediate (15:0)
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4
• How to encode nPC_sel?
  – Direct MUX select?
  – Branch inst. / not branch inst.
• Let’s pick 2nd option

  nPC_sel  zero?  MUX
  0        x      0
  1        0      0
  1        1      1

  Q: What logic gate?
[Figure: Inst Memory addressed by PC outputs Instruction<31:0>; PC + 4 Adder, branch Adder with PC Ext(imm16) and “00”; Mux (inputs 0 and 1) whose select is derived from nPC_sel and Equal; clk]

Putting it All Together: A Single Cycle Datapath
[Figure: complete datapath. Inst Memory outputs Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Fetch unit: PC, +4 Adder, branch Adder with PC Ext, Mux selected via nPC_sel and Equal. RegDst mux (0: Rt, 1: Rd) into Rw; RegFile with busA, busB, busW; Extender (ExtOp) on imm16; ALUSrc mux; ALU (ALUctr) with “=” output Equal; Data Memory (WrEn from MemWr, Adr, Data In); MemtoReg mux back to busW]

Datapath Control Signals
• ExtOp:    “zero”, “sign”
• ALUsrc:   0 regB; 1 immed
• ALUctr:   “ADD”, “SUB”, “OR”
• MemWr:    1 write memory
• MemtoReg: 0 ALU; 1 Mem
• RegDst:   0 “rt”; 1 “rd”
• RegWr:    1 write register
[Figure: the single-cycle datapath annotated with these signals; the fetch-unit mux select is labeled “nPC_sel & Equal”]

Given Datapath: RTL Control
[Figure: Inst Memory (Adr) outputs Instruction<31:0>, split into Op <26:31>, Fun <0:5>, Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>; Op and Fun feed the Control block, whose outputs include nPC_sel, RegWr, RegDst, ExtOp]
[Figure, remaining Control outputs into the DATA PATH: ALUSrc, ALUctr, MemWr, MemtoReg]

RTL: The Add Instruction
  Instruction format: op (bits 31:26, 6 bits) | rs (25:21, 5 bits) | rt (20:16, 5 bits) | rd (15:11, 5 bits) | shamt (10:6, 5 bits) | funct (5:0, 6 bits)
add rd, rs, rt
– MEM[PC]                 Fetch the instruction from memory
– R[rd] = R[rs] + R[rt]   The actual operation
– PC = PC + 4             Calculate the next instruction’s address

Instruction Fetch Unit at the Beginning of Add
• Fetch the instruction from Instruction memory: Instruction = MEM[PC]
  – same for all instructions
[Figure: fetch unit with PC, Inst Memory (Inst Address in, Instruction<31:0> out), +4 Adder, branch Adder with PC Ext(imm16) and “00”, Mux controlled by nPC_sel, clk]

Single Cycle Datapath during Add
  Instruction format: op | rs | rt | rd | shamt | funct
• R[rd] = R[rs] + R[rt]
• Control settings: nPC_sel=+4, RegDst=1, RegWr=1, ExtOp=x, ALUSrc=0, ALUctr=ADD, MemWr=0, MemtoReg=0
[Figure: the single-cycle datapath with these settings applied; the instr fetch unit supplies Rs, Rt, Rd, Imm16 from Instruction<31:0>]

Instruction Fetch Unit at End of Add
• PC = PC + 4
  – Same for all instructions except: Branch and Jump
• nPC_sel=+4
[Figure: fetch unit as above, with the Mux selecting PC + 4]

P&H Figure 4.17

Summary of the Control Signals (1/2)
inst   Register Transfer
add    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“ADD”, RegDst=rd, RegWr, nPC_sel=“+4”
sub    R[rd] ← R[rs] – R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“SUB”, RegDst=rd, RegWr, nPC_sel=“+4”
ori    R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
       ALUsrc=Im, Extop=“Z”, ALUctr=“OR”, RegDst=rt, RegWr, nPC_sel=“+4”
lw     R[rt] ← MEM[ R[rs] + sign_ext(Imm16) ]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemtoReg, RegDst=rt, RegWr, nPC_sel=“+4”
sw     MEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemWr, nPC_sel=“+4”
beq    if (R[rs] == R[rt]) then PC ← PC + 4 + ( sign_ext(Imm16) || 00 ) else PC ← PC + 4
       nPC_sel=“br”, ALUctr=“SUB”

Summary of the Control Signals (2/2)
See Appendix A
        add      sub      ori      lw       sw       beq      jump
func    10 0000  10 0010  We Don’t Care :-)
op      00 0000  00 0000  00 1101  10 0011  10 1011  00 0100  00 0010
             add   sub       ori   lw    sw    beq       jump
RegDst       1     1         0     0     x     x         x
ALUSrc       0     0         1     1     1     0         x
MemtoReg     0     0         0     1     x     x         x
RegWrite     1     1         1     1     0     0         0
MemWrite     0     0         0     0     1     0         0
nPCsel       0     0         0     0     0     1         ?
Jump         0     0         0     0     0     0         1
ExtOp        x     x         0     1     1     x         x
ALUctr<2:0>  Add   Subtract  Or    Add   Add   Subtract  x

Instruction formats (bits 31..0):
R-type   op (31:26) | rs (25:21) | rt (20:16) | rd (15:11) | shamt (10:6) | funct (5:0)   add, sub
I-type   op (31:26) | rs (25:21) | rt (20:16) | immediate (15:0)                          ori, lw, sw, beq
J-type   op (31:26) | target address (25:0)                                               jump

Boolean Expressions for Controller
(“+” is OR, “·” is AND, “~” is NOT)
RegDst    = add + sub
ALUSrc    = ori + lw + sw
MemtoReg  = lw
RegWrite  = add + sub + ori + lw
MemWrite  = sw
nPCsel    = beq
Jump      = jump
ExtOp     = lw + sw
ALUctr[0] = sub + beq   (assume ALUctr is 00 ADD, 01 SUB, 10 OR)
ALUctr[1] = ori

Where:
rtype = ~op5 · ~op4 · ~op3 · ~op2 · ~op1 · ~op0
ori   = ~op5 · ~op4 ·  op3 ·  op2 · ~op1 ·  op0
lw    =  op5 · ~op4 · ~op3 · ~op2 ·  op1 ·  op0
sw    =  op5 · ~op4 ·  op3 · ~op2 ·  op1 ·  op0
beq   = ~op5 · ~op4 · ~op3 ·  op2 · ~op1 · ~op0
jump  = ~op5 · ~op4 · ~op3 · ~op2 ·  op1 · ~op0

add = rtype · func5 · ~func4 · ~func3 · ~func2 · ~func1 · ~func0
sub = rtype · func5 · ~func4 · ~func3 · ~func2 ·  func1 · ~func0

How do we implement this in gates?

Controller Implementation
[Figure: opcode and func feed the “AND” logic, which produces the per-instruction signals add, sub, ori, lw, sw, beq, jump; these feed the “OR” logic, which produces RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, nPCsel, Jump, ExtOp, ALUctr[0], ALUctr[1]]

Peer Instruction
1) We should use the main ALU to compute PC=PC+4 in order to save some gates
2) The ALU is inactive for memory reads (loads) or writes (stores).
      12
  a)  FF
  b)  FT
  c)  TF
  d)  TT
  e)  Help!

Summary: Single-cycle Processor
[Figure: Processor (Control, Datapath), Memory, Input, Output]
• Five steps to design a processor:
  1. Analyze instruction set to determine datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
  5.
Assemble the control logic
     • Formulate Logic Equations
     • Design Circuits

Bonus Slides
• How to implement Jump

Single Cycle Datapath during Jump
  J-type format: op (bits 31:26) | target address (25:0)   jump
• New PC = { PC[31..28], target address, 00 }
• Control settings to fill in: Jump= , nPC_sel= , RegDst= , RegWr= , ExtOp= , ALUSrc= , ALUctr= , MemWr= , MemtoReg=
[Figure: single-cycle datapath. The Instruction Fetch Unit (inputs nPC_sel, Jump, Zero, Clk) outputs Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>, TA26 <0:25>; RegDst Mux; 32 x 32-bit Registers with busA, busB, busW; Extender; ALUSrc Mux; ALU with Zero output; Data Memory (WrEn, Adr, Data In); MemtoReg Mux]

Single Cycle Datapath during Jump
  J-type format: op (bits 31:26) | target address (25:0)   jump
• New PC = { PC[31..28], target address, 00 }
• Control settings: Jump=1, nPC_sel=?, RegDst=x, RegWr=0, ExtOp=x, ALUSrc=x, ALUctr=x, MemWr=0, MemtoReg=x
[Figure: the same datapath with these settings applied]

Instruction Fetch Unit at the End of Jump
  J-type format: op (bits 31:26) | target address (25:0)   jump
• New PC = { PC[31..28], target address, 00 }
• How do we modify this to account for jumps?
[Figure: fetch unit before modification: PC, Inst Memory (Adr in, Instruction<31:0> out), +4 Adder, branch Adder with imm16 and “00”, Mux (inputs 0 and 1) selected by nPC_MUX_sel, which is derived from nPC_sel and Zero; Jump input not yet connected; Clk]

Instruction Fetch Unit at the End of Jump
  J-type format: op (bits 31:26) | target address (25:0)   jump
• New PC = { PC[31..28], target address, 00 }
[Figure: fetch unit after modification: a second Mux, controlled by Jump, selects the jump address formed from the 4 MSBs of PC, the 26-bit TA, and “00”; the branch Mux is still selected by nPC_MUX_sel, derived from nPC_sel and Zero]

Query
• Can Zero still get asserted?
• Does nPC_sel need to be 0?
  • If not, what?
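
The controller equations and the next-PC selection from the slides above can be sketched in Python. This is a minimal illustration under stated assumptions, not the course's reference implementation: the names control, next_pc, and sign_ext16 are hypothetical, don't-care (x) outputs are modeled as 0, ALUctr uses the encoding 00 ADD, 01 SUB, 10 OR, and the Jump mux is assumed to sit after the branch mux so that Jump=1 overrides the branch selection.

```python
MASK32 = 0xFFFFFFFF


def control(op, funct):
    """Decode a 6-bit opcode and funct into control signals (hypothetical helper)."""
    # "AND" logic: one signal per supported instruction.
    rtype = op == 0b000000
    add = rtype and funct == 0b100000
    sub = rtype and funct == 0b100010
    ori = op == 0b001101
    lw = op == 0b100011
    sw = op == 0b101011
    beq = op == 0b000100
    jump = op == 0b000010
    # "OR" logic: each control signal is an OR of instruction signals.
    # Don't-care (x) entries are modeled as 0 here; that is an assumption.
    return {
        "RegDst": int(add or sub),
        "ALUSrc": int(ori or lw or sw),
        "MemtoReg": int(lw),
        "RegWrite": int(add or sub or ori or lw),
        "MemWrite": int(sw),
        "nPCsel": int(beq),
        "Jump": int(jump),
        "ExtOp": int(lw or sw),
        # ALUctr[1] = ori, ALUctr[0] = sub + beq (00 ADD, 01 SUB, 10 OR).
        "ALUctr": (int(ori) << 1) | int(sub or beq),
    }


def sign_ext16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc, imm16, target26, npc_sel, zero, jump):
    """Compute the next PC for the fetch unit with branch and jump support."""
    pc_plus4 = (pc + 4) & MASK32
    branch_target = (pc_plus4 + (sign_ext16(imm16) << 2)) & MASK32
    # Branch mux select is nPC_sel AND Zero.
    after_branch = branch_target if (npc_sel and zero) else pc_plus4
    # Jump address: { PC[31..28], target address, 00 }, as on the slides.
    jump_target = (pc & 0xF0000000) | ((target26 & 0x3FFFFFF) << 2)
    # Assumed ordering: the Jump mux comes last and overrides the branch mux.
    return jump_target if jump else after_branch


if __name__ == "__main__":
    print(control(0b100011, 0))                       # lw
    print(hex(next_pc(0x1000, 0xFFFF, 0, 1, 1, 0)))   # taken beq, offset -1
    print(hex(next_pc(0x40001000, 0, 0x100, 0, 1, 1)))  # jump
```

Under the assumed mux ordering, the value of nPC_sel does not change the result when Jump=1, which is one way to read the Query slide; a design that places the Jump mux before the branch mux would need nPC_sel and Zero handled differently.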