inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 26 – Single Cycle CPU Datapath II Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia 98,389 UC Identity thefts?! A laptop was stolen from Grad division with names, SS #, birth dates of almost 106 former grad students at UC Berkeley. The thief may not know what they have! Sensitive data allowed on portables? Good idea…NOT! newscenter.berkeley.edu/security/grad/ CS61C L26 Single Cycle CPU Datapath II (1) Garcia © UCB How to Design a Processor: step-by-step • 1. Analyze instruction set architecture (ISA) => datapath requirements • meaning of each instruction is given by the register transfers • datapath must include storage element for ISA registers • datapath must support each register transfer • 2. Select set of datapath components and establish clocking methodology • 3. Assemble datapath meeting requirements • 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. • 5. Assemble the control logic (hard part!) CS61C L26 Single Cycle CPU Datapath II (2) Garcia © UCB Step 3: Assemble DataPath meeting requirements • Register Transfer Requirements Datapath Assembly • Instruction Fetch • Read Operands and Execute Operation CS61C L26 Single Cycle CPU Datapath II (3) Garcia © UCB 3a: Overview of the Instruction Fetch Unit • The common RTL operations • Fetch the Instruction: mem[PC] • Update the program counter: - Sequential Code: PC = PC + 4 - Branch and Jump: PC = “something else” Clk PC Next Address Logic Address Instruction Memory CS61C L26 Single Cycle CPU Datapath II (4) Instruction Word 32 Garcia © UCB 3b: Add & Subtract • R[rd] = R[rs] op R[rt] Ex.: addU rd,rs,rt • Ra, Rb, and Rw come from instruction’s Rs, Rt, 26 21 16 11 6 and Rd fields 31 op 6 bits rs 5 bits rt 5 bits rd 5 bits shamt 5 bits funct 6 bits 0 • ALUctr and RegWr: control logic after decoding the instruction Rd Rs Rt RegWr 5 5 5 32 32-bit Registers busA 32 busB 32 ALU busW 32 Clk Rw Ra Rb ALUctr Result 32 • We’ve already defined register file, ALU CS61C L26 Single Cycle CPU Datapath II (5) Garcia © UCB Clocking Methodology Clk . . . . . . . . . . . . • Storage elements clocked by same edge • Being physical devices, flip-flops (FF) and combinational logic have some delays • Gates: delay from input change to output change • Signals at FF D input must be stable before active clock edge to allow signal to travel within the FF, and we have the usual clock-to-Q delay • “Critical path” (longest path through logic) determines length of clock period CS61C L26 Single Cycle CPU Datapath II (6) Garcia © UCB Register-Register Timing: One complete cycle Clk New Value PC Old Value Rs, Rt, Rd, Op, Func Old Value ALUctr Old Value RegWr Old Value busA, B Old Value busW Old Value Instruction Memory Access Time New Value Delay through Control Logic New Value New Value Register File Access Time New Value ALU Delay New Value Rd Rs Rt RegWr5 5 5 CS61C L26 Single Cycle CPU Datapath II (7) busA 32 busB 32 ALU busW 32 Clk Rw Ra Rb 32 32-bit Registers ALUctr Register Write Occurs Here Result 32 Garcia © UCB 3c: Logical Operations with Immediate • R[rt] = R[rs] op ZeroExt[imm16] ] 31 26 21 op rs 31 6 bits 5 bits 16 rt 5 bits 16 15 rd? 11 0 immediate 16 bits 0 ALU immediate 0000000000000000 Rd Rt RegDst 16 bits 16 bits Mux Rt register read?? Rs Rt? What about ALUct RegWr 5 5 5 r busA Rw Ra Rb busW 32 Result 32 32-bit 32 Registers 32 busB Clk 32 Mux 16 ZeroExt imm16 32 ALUSrc • Already defined 32-bit MUX; Zero Ext? CS61C L26 Single Cycle CPU Datapath II (8) Garcia © UCB 3d: Load Operations • R[rt] = Mem[R[rs] + SignExt[imm16]] Example: lw rt,rs,imm16 31 26 op rs 6 bits Rd RegDst Mux RegWr 5 32 Clk 0 rt 5 bits immediate 5 bits 16 bits Rt Rs Rt 5 5 Rw Ra Rb 32 32-bit Registers busA W_Src 32 32 ExtOp CS61C L26 Single Cycle CPU Datapath II (9) 32 MemWr ?? ALUSrc Data In 32 Clk Mu x busB 32 Mux 16 ALUctr Extender imm16 16 ALU busW 21 WrEn Adr Data Memory 32 Garcia © UCB 3e: Store Operations • Mem[ R[rs] + SignExt[imm16] ] = R[rt] Ex.: sw rt, rs, imm16 31 26 21 op rs 6 bits 5 bits Rd Rt RegDst Mux RegWr5 5 rt 5 bits W_Src 32 ExtOp CS61C L26 Single Cycle CPU Datapath II (10) 32 Data In32 Clk WrEn Adr 32 Data Memory Mux Extender 16 immediate 16 bits ALUctr MemWr ALU busA Rw Ra Rb 32 32 32-bit Registers busB 32 imm16 0 Rs Rt 5 Mux busW 32 Clk 16 ALUSrc Garcia © UCB 3f: The Branch Instruction 31 26 op 6 bits 21 rs 5 bits 16 rt 5 bits 0 immediate 16 bits • beq rs, rt, imm16 • mem[PC] Fetch the instruction from memory • Equal = R[rs] == R[rt] Calculate branch condition • if (Equal) Calculate the next instruction’s address - PC = PC + 4 + ( SignExt(imm16) x 4 ) else - PC = PC + 4 CS61C L26 Single Cycle CPU Datapath II (11) Garcia © UCB Datapath for Branch Operations • beq rs, rt, imm16 Datapath generates condition (equal) 26 op 6 bits 21 00 Adder 32 PC Mux Adder PC Ext imm16 0 rs rt immediate 5 bits 5 bits 16 bits Inst Address nPC_sel 4 16 Rs Rt 5 busA Rw Ra Rb 32 32 32-bit Registers busB 32 Cond RegWr 5 5 busW Clk Equal? 31 Clk • Already MUX, adder, sign extend, zero CS61C L26 Single Cycle CPU Datapath II (12) Garcia © UCB Putting it All Together:A Single Cycle Datapath Instruction<31:0> <0:15> <11:15> Rs <16:20> <21:25> Inst Memory Adr Rt Rd Imm16 RegDst ALUctr MemWr MemtoReg Equal Rt Rd 1 0 Rs Rt RegWr 5 5 5 busA Rw Ra Rb = busW 32 32 32-bit 0 32 32 Registers busB 0 32 Clk 32 WrEn Adr 1 1 Data In Data imm16 32 Clk 16 Clk Memory nPC_sel imm16 Mux ALU Extender PC Ext Adder Mux PC Mux Adder 00 4 ExtOp ALUSrc CS61C L26 Single Cycle CPU Datapath II (13) Garcia © UCB An Abstract View of the Implementation PC Clk Next Address ALU Control Ideal Instruction Instruction Control Signals Conditions Memory Rd Rs Rt 5 5 5 Instruction Address A Data Data 32 Address Rw Ra Rb 32 Ideal Out 32 32-bit 32 Data Data Registers B Memory In Clk 32 Clk Datapath CS61C L26 Single Cycle CPU Datapath II (14) Garcia © UCB Peer Instruction A. Our ALU is a synchronous device B. We could have used tri-state devices instead of a MUX to feed busW, the register write data line C. The ALU is inactive for memory reads or writes. CS61C L26 Single Cycle CPU Datapath II (15) 1: 2: 3: 4: 5: 6: 7: 8: ABC FFF FFT FTF FTT TFF TFT TTF TTT Garcia © UCB Summary: Single cycle datapath °5 steps to design a processor • 1. Analyze instruction set => datapath requirements • 2. Select set of datapath components & establish clock methodology • 3. Assemble datapath meeting the requirements • 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. Processor • 5. Assemble the control logic °Control is the hard part °Next time! CS61C L26 Single Cycle CPU Datapath II (16) Input Control Memory Datapath Output Garcia © UCB