CS 61C: Great Ideas in Computer Architecture
Single Cycle MIPS CPU: Part II
Instructors: Krste Asanovic, Randy H. Katz
http://inst.eecs.Berkeley.edu/~cs61c/fa12
6/27/2016 Fall 2012 -- Lecture #26

You are Here!
Harness Parallelism & Achieve High Performance
Software:
• Parallel Requests: assigned to computer, e.g., Search “Katz”
• Parallel Threads: assigned to core, e.g., Lookup, Ads
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
• Hardware descriptions: all gates @ one time
• Programming Languages
[Figure: hardware hierarchy from Warehouse Scale Computer and Smart Phone, through Computer (Core, Memory (Cache), Input/Output) and Core (Instruction Unit(s), Functional Unit(s) computing A0+B0 A1+B1 A2+B2 A3+B3, Cache Memory), down to Logic Gates, with a "Today" marker]

Levels of Representation/Interpretation
High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
(Compiler)
Assembly Language Program (e.g., MIPS)
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
(Assembler)
Machine Language Program (MIPS)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
(Machine Interpretation)
Hardware Architecture Description (e.g., block diagrams)
(Architecture Implementation)
Logic Circuit Description (Circuit Schematic Diagrams)
Anything can be represented as a number, i.e., data or instructions

Processor Design Process
• Five steps to design a processor:
  1. Analyze instruction set → datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
  5. Assemble the control logic
     • Formulate Logic Equations
     • Design Circuits
[Figure: computer organization: Processor (Control, Datapath), Memory, Input, Output]

Agenda
• Datapath Control
• Administrivia
• Control Implementation

The MIPS-lite Subset
• ADDU and SUBU
  – addu rd,rs,rt
  – subu rd,rs,rt
  Format: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | rd [15:11] (5 bits) | shamt [10:6] (5 bits) | funct [5:0] (6 bits)
• OR Immediate:
  – ori rt,rs,imm16
  Format: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | immediate [15:0] (16 bits)
• LOAD and STORE Word
  – lw rt,rs,imm16
  – sw rt,rs,imm16
  Format: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | immediate [15:0] (16 bits)
• BRANCH:
  – beq rs,rt,imm16
  Format: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | immediate [15:0] (16 bits)

Register Transfer Language (RTL)
• RTL gives the meaning of the instructions
• All start by fetching the instruction
    {op , rs , rt , rd , shamt , funct} ← MEM[ PC ]
    {op , rs , rt , Imm16} ← MEM[ PC ]
Inst    Register Transfers
ADDU    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
SUBU    R[rd] ← R[rs] - R[rt]; PC ← PC + 4
ORI     R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
LOAD    R[rt] ← MEM[ R[rs] + sign_ext(Imm16) ]; PC ← PC + 4
STORE   MEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]; PC ← PC + 4
BEQ     if ( R[rs] == R[rt] ) then PC ← PC + 4 + (sign_ext(Imm16) || 00)
        else PC ← PC + 4

RTL: The Add Instruction
Format: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | rd [15:11] (5 bits) | shamt [10:6] (5 bits) | funct [5:0] (6 bits)
add rd, rs, rt
  – MEM[PC]: Fetch the instruction from memory
  – R[rd] = R[rs] + R[rt]: The actual operation
  – PC = PC + 4: Calculate the next instruction’s address

Instruction Fetch Unit at the Beginning of Add
• Fetch the instruction from Instruction memory: Instruction = MEM[PC]
  – same for all instructions
[Figure: instruction fetch unit: the PC register (clk) addresses Inst Memory, which outputs Instruction<31:0>; one adder computes PC + 4, a second adder adds the extended imm16 (PC Ext, with 00 appended); a mux controlled by nPC_sel selects the next PC]

Single Cycle Datapath during Add
Instruction format (R-type): op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
• R[rd] = R[rs] + R[rt]
• Control settings: nPC_sel=+4, RegDst=1, RegWr=1, ExtOp=x, ALUSrc=0, ALUctr=ADD, MemWr=0, MemtoReg=0
[Figure: single-cycle datapath: the instruction fetch unit supplies Instruction<31:0> (fields Rs, Rt, Rd, Imm16); the RegDst mux selects Rt (0) or Rd (1) as the write register Rw; the register file drives busA and busB; the Extender widens imm16 to 32 bits; the ALUSrc mux selects busB (0) or the extended immediate (1); the ALU output addresses Data Memory; the MemtoReg mux selects the ALU result (0) or memory data (1) for busW]

Instruction Fetch Unit at End of Add
• PC = PC + 4
  – Same for all instructions except: Branch and Jump
[Figure: instruction fetch unit with nPC_sel=+4]

Single Cycle Datapath during Or Immediate
Instruction format (I-type): op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• R[rt] = R[rs] OR ZeroExt[Imm16]
• Student Roulette: what are nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg?
[Figure: single-cycle datapath with the control values left blank]

Single Cycle Datapath during Or Immediate
Instruction format (I-type): op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• R[rt] = R[rs] OR ZeroExt[Imm16]
• Control settings: nPC_sel=+4, RegDst=0, RegWr=1, ExtOp=zero, ALUSrc=1, ALUctr=OR, MemWr=0, MemtoReg=0
[Figure: single-cycle datapath annotated with these control values]

Single Cycle Datapath during Load
Instruction format (I-type): op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
• Student Roulette: what are nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg?
[Figure: single-cycle datapath with the control values left blank]
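To make the role of the control points concrete, here is a minimal Python sketch of the register-file, Extender, and ALU portion of the datapath. The signal names and the add/ori settings come from the Add and Or Immediate slides; the function names (extend, alu, execute), the dictionary encoding of the control signals, and the choice to leave out Data Memory and the MemtoReg mux are assumptions made for illustration, not part of the lecture.

```python
MASK32 = 0xFFFFFFFF

def extend(imm16, ext_op):
    """Extender: widen a 16-bit immediate to 32 bits ("zero" or "sign")."""
    imm16 &= 0xFFFF
    if ext_op == "sign" and imm16 & 0x8000:
        return imm16 | 0xFFFF0000  # replicate the sign bit into bits 31..16
    return imm16

def alu(a, b, alu_ctr):
    """ALU supporting the three operations the MIPS-lite subset needs."""
    if alu_ctr == "ADD":
        return (a + b) & MASK32
    if alu_ctr == "SUB":
        return (a - b) & MASK32
    if alu_ctr == "OR":
        return (a | b) & MASK32
    raise ValueError("unknown ALUctr: " + alu_ctr)

def execute(regs, rs, rt, rd, imm16, ctl):
    """One pass through the datapath (Data Memory is not modeled here)."""
    bus_a = regs[rs]
    bus_b = regs[rt]
    # ALUSrc mux: 0 selects busB, 1 selects the extended immediate.
    operand_b = extend(imm16, ctl["ExtOp"]) if ctl["ALUSrc"] else bus_b
    result = alu(bus_a, operand_b, ctl["ALUctr"])
    if ctl["RegWr"]:
        # RegDst mux: 0 selects rt, 1 selects rd as the write register.
        dest = rd if ctl["RegDst"] else rt
        regs[dest] = result  # hardwiring of register 0 to zero is not modeled
    return result

# Control settings as given on the Add and Or Immediate slides.
ADD_CTL = {"RegDst": 1, "RegWr": 1, "ExtOp": "x", "ALUSrc": 0, "ALUctr": "ADD"}
ORI_CTL = {"RegDst": 0, "RegWr": 1, "ExtOp": "zero", "ALUSrc": 1, "ALUctr": "OR"}
```

Running ori with an immediate whose top bit is set (say 0x8001) shows why ExtOp must be "zero" for this instruction: a sign extension would also set bits 31..16 of the result.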
Single Cycle Datapath during Load
Instruction format (I-type): op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
• Control settings: nPC_sel=+4, RegDst=0, RegWr=1, ExtOp=sign, ALUSrc=1, ALUctr=ADD, MemWr=0, MemtoReg=1
[Figure: single-cycle datapath annotated with these control values]

Single Cycle Datapath during Store
Instruction format (I-type): op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
• Student Roulette: what are nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg?
[Figure: single-cycle datapath with the control values left blank]

Single Cycle Datapath during Store
Instruction format (I-type): op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
• Control settings: nPC_sel=+4, RegDst=x, RegWr=0, ExtOp=sign, ALUSrc=1, ALUctr=ADD, MemWr=1, MemtoReg=x
[Figure: single-cycle datapath annotated with these control values]

Single Cycle Datapath during Branch
Instruction format (I-type): op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• if (R[rs] - R[rt] == 0) then Zero = 1 ; else Zero = 0
• What are nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg?
[Figure: single-cycle datapath with the control values left blank]

Single Cycle Datapath during Branch
Instruction format (I-type): op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• if (R[rs] - R[rt] == 0) then Zero = 1 ; else Zero = 0
• Control settings: nPC_sel=br, RegDst=x, RegWr=0, ExtOp=x, ALUSrc=0, ALUctr=SUB, MemWr=0, MemtoReg=x
[Figure: single-cycle datapath annotated with these control values]

Instruction Fetch Unit at the End of Branch
Instruction format (I-type): op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4
• What is encoding of nPC_sel?
  – Direct MUX select?
  – Branch inst. / not branch
• Let’s pick 2nd option
    nPC_sel   zero?   MUX
    0         x       0
    1         0       0
    1         1       1
• Q: What logic gate?
[Figure: instruction fetch unit in which nPC_sel and Zero feed the MUX ctrl logic that selects between PC + 4 and the branch target]

Summary: Datapath’s Control Signals
• ExtOp: “zero”, “sign”
• ALUsrc: 0 → regB; 1 → immed
• ALUctr: “ADD”, “SUB”, “OR”
• MemWr: 1 → write memory
• MemtoReg: 0 → ALU; 1 → Mem
• RegDst: 0 → “rt”; 1 → “rd”
• RegWr: 1 → write register
[Figure: complete single-cycle datapath (instruction fetch unit, register file, Extender, ALU, Data Memory) with the control points nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg labeled]

Agenda
• Datapath Control
• Administrivia
• Control Implementation

CS61c in the News

Agenda
• Datapath Control
• Administrivia
• Control Implementation

Given Datapath: RTL → Control
[Figure: Inst Memory (Adr) outputs Instruction<31:0>, split into Op <26:31>, Fun <0:5>, Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>; Op and Fun feed Control, which drives nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg into the DATA PATH]

Summary of the Control Signals (1/2)
inst   Register Transfer
add    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“ADD”, RegDst=rd, RegWr, nPC_sel=“+4”
sub    R[rd] ← R[rs] - R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“SUB”, RegDst=rd, RegWr, nPC_sel=“+4”
ori    R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
       ALUsrc=Im, Extop=“Z”, ALUctr=“OR”, RegDst=rt, RegWr, nPC_sel=“+4”
lw     R[rt] ← MEM[ R[rs] + sign_ext(Imm16) ]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemtoReg, RegDst=rt, RegWr, nPC_sel=“+4”
sw     MEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemWr, nPC_sel=“+4”
beq    if (R[rs] == R[rt]) then PC ← PC + 4 + (sign_ext(Imm16) || 00)
       else PC ← PC + 4
       nPC_sel=“br”, ALUctr=“SUB”

Summary of the Control Signals (2/2)
See Appendix A
               add       sub        ori       lw        sw        beq
func           10 0000   10 0010    We Don’t Care :-)
op             00 0000   00 0000    00 1101   10 0011   10 1011   00 0100
RegDst         1         1          0         0         x         x
ALUSrc         0         0          1         1         1         0
MemtoReg       0         0          0         1         x         x
RegWrite       1         1          1         1         0         0
MemWrite       0         0          0         0         1         0
nPCsel         0         0          0         0         0         1
Jump           0         0          0         0         0         0
ExtOp          x         x          0         1         1         x
ALUctr<2:0>    Add       Subtract   Or        Add       Add       Subtract
R-type: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]   (add, sub)
I-type: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]   (ori, lw, sw, beq)

Boolean Expressions for Controller
RegDst    = add + sub
ALUSrc    = ori + lw + sw
MemtoReg  = lw
RegWrite  = add + sub + ori + lw
MemWrite  = sw
nPCsel    = beq
Jump      = jump
ExtOp     = lw + sw
ALUctr[0] = sub + beq
ALUctr[1] = ori
(assume ALUctr is 00: ADD, 01: SUB, 10: OR)
Where (“+” is OR, “·” is AND, “~” is NOT):
rtype = ~op5 · ~op4 · ~op3 · ~op2 · ~op1 · ~op0
ori   = ~op5 · ~op4 ·  op3 ·  op2 · ~op1 ·  op0
lw    =  op5 · ~op4 · ~op3 · ~op2 ·  op1 ·  op0
sw    =  op5 · ~op4 ·  op3 · ~op2 ·  op1 ·  op0
beq   = ~op5 · ~op4 · ~op3 ·  op2 · ~op1 · ~op0
jump  = ~op5 · ~op4 · ~op3 · ~op2 ·  op1 · ~op0
How do we implement this in gates?
add = rtype · func5 · ~func4 · ~func3 · ~func2 · ~func1 · ~func0
sub = rtype · func5 · ~func4 · ~func3 · ~func2 ·  func1 · ~func0

Controller Implementation
[Figure: opcode and func feed the “AND” logic, which produces add, sub, ori, lw, sw, beq; these feed the “OR” logic, which produces RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, nPCsel, ExtOp, ALUctr[0], ALUctr[1]]

AND Control in Logisim
[Figure: Logisim circuit]

OR Control Logic in Logisim
[Figure: Logisim circuit]

Single Cycle Performance
• Assume the time for actions is
  – 100ps for register read or write; 200ps for other events
• Clock rate is?
    Instr      Instr fetch   Register read   ALU op   Memory access   Register write   Total time
    lw         200ps         100ps           200ps    200ps           100ps            800ps
    sw         200ps         100ps           200ps    200ps                            700ps
    R-format   200ps         100ps           200ps                    100ps            600ps
    beq        200ps         100ps           200ps                                     500ps
• What can we do to improve clock rate?
• Will this improve performance as well? Want increased clock rate to mean faster programs (Student Roulette?)

And in Conclusion, … Single-Cycle Processor
• Five steps to design a processor:
  1. Analyze instruction set → datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
  5. Assemble the control logic
     • Formulate Logic Equations
     • Design Circuits
[Figure: computer organization: Processor (Control, Datapath), Memory, Input, Output]
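As a rough illustration of step 5, the controller's “AND” logic and “OR” logic can be sketched in Python. The product terms and sum terms follow the Boolean Expressions for Controller slide; the helper names (bits, all_of, decode), the dictionary of outputs, and packing ALUctr[1:0] into one integer are assumptions made for this sketch. Because the equations only list the instructions that assert a signal, every don't-care (x) entry of the control table comes out as 0 here.

```python
def bits(value, width=6):
    """Return [bit0, bit1, ...] of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]

def all_of(*terms):
    """AND gate over 0/1 inputs."""
    return int(all(terms))

def decode(op, funct):
    """Map a 6-bit opcode and 6-bit funct field to the control signals."""
    o, f = bits(op), bits(funct)
    no = [1 - b for b in o]  # complemented opcode bits (~op)
    nf = [1 - b for b in f]  # complemented funct bits (~func)

    # "AND" logic: one product term per instruction.
    rtype = all_of(no[5], no[4], no[3], no[2], no[1], no[0])
    ori = all_of(no[5], no[4], o[3], o[2], no[1], o[0])
    lw = all_of(o[5], no[4], no[3], no[2], o[1], o[0])
    sw = all_of(o[5], no[4], o[3], no[2], o[1], o[0])
    beq = all_of(no[5], no[4], no[3], o[2], no[1], no[0])
    jump = all_of(no[5], no[4], no[3], no[2], o[1], no[0])
    add = all_of(rtype, f[5], nf[4], nf[3], nf[2], nf[1], nf[0])
    sub = all_of(rtype, f[5], nf[4], nf[3], nf[2], f[1], nf[0])

    # "OR" logic: each control signal is a sum of instruction terms.
    return {
        "RegDst": add | sub,
        "ALUSrc": ori | lw | sw,
        "MemtoReg": lw,
        "RegWrite": add | sub | ori | lw,
        "MemWrite": sw,
        "nPCsel": beq,
        "Jump": jump,
        "ExtOp": lw | sw,
        # ALUctr[1] = ori, ALUctr[0] = sub + beq (00: ADD, 01: SUB, 10: OR)
        "ALUctr": (ori << 1) | (sub | beq),
    }
```

For example, decode(0b100011, 0) models lw and asserts ALUSrc, MemtoReg, RegWrite, and ExtOp with ALUctr = 00 (ADD), matching the lw column of the control table. An opcode outside the subset matches no product term, so every output is 0.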