inst.eecs.berkeley.edu/~cs61c
UC Berkeley CS61C : Machine Structures
Lecture 26 CPU Design: Designing a Single-cycle CPU, pt 2
2007-03-19
Lecturer SOE Dan Garcia
www.cs.berkeley.edu/~ddgarcia

3.6 TB DVDs? Maybe! Researchers at Harvard have found a way to use light-focusing “optical antennas” to get way past the diffraction limit and focus light smaller.
technologyreview.com/Infotech/18295

CS61C L26 CPU Design : Designing a Single-Cycle CPU II (1) Garcia, Spring 2007 © UCB

How to Design a Processor: step-by-step
1. Analyze instruction set architecture (ISA) => datapath requirements
  • meaning of each instruction is given by the register transfers
  • datapath must include storage element for ISA registers
  • datapath must support each register transfer
2. Select set of datapath components and establish clocking methodology
3. Assemble datapath meeting requirements
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
5. Assemble the control logic

Clocking Methodology
• Storage elements clocked by same edge
• Being physical devices, flip-flops (FF) and combinational logic have some delays
  • Gates: delay from input change to output change
  • Signals at FF D input must be stable before active clock edge to allow signal to travel within the FF (set-up time), and we have the usual clock-to-Q delay
• “Critical path” (longest path through logic) determines length of clock period

Register-Register Timing: One complete cycle
[Timing diagram, one clock cycle. After the Clk edge each signal goes from Old Value to New Value in turn:
  PC;
  Rs, Rt, Rd, Op, Func after Instruction Memory Access Time;
  ALUctr and RegWr after Delay through Control Logic;
  busA, B after Register File Access Time;
  busW after ALU Delay.
  Register Write Occurs Here: at the next Clk edge.]
[Datapath diagram: RegFile with Rw, Ra, Rb fed by Rd, Rs, Rt (5 bits each); busA and busB (32 bits) into the ALU (ALUctr); result on busW (32 bits); RegWr, clk]

3c: Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]

Bits:   31-26    25-21    20-16    15-0
Field:  op       rs       rt       immediate
Width:  6 bits   5 bits   5 bits   16 bits

ZeroExt[imm16]: bits 31-16 = 0000000000000000 (16 bits), bits 15-0 = immediate (16 bits)

But we’re writing to Rt register??
[Datapath diagram: the register-register datapath above, where Rw is still fed by Rd]

What about Rt register read??
[Datapath diagram: a RegDst mux selects Rd or Rt as Rw; imm16 passes through ZeroExt (16 to 32 bits); an ALUSrc mux selects busB or the extended immediate as the second ALU input; RegWr, ALUctr, clk]
• Already defined 32-bit MUX; Zero Ext?
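The 3c register transfer, R[rt] = R[rs] op ZeroExt[imm16], can be sketched in Python as a minimal model of the two muxes and the zero extender. The function names (`zero_ext16`, `mux2`, `ori_step`) are hypothetical, `ori` is picked as one example of "op", and the 0/1 mux encodings are assumptions read off the diagram labels rather than something the slides state outright.

```python
def zero_ext16(imm16):
    """Zero-extend a 16-bit immediate to 32 bits: bits 31-16 become 0."""
    return imm16 & 0xFFFF


def mux2(sel, in0, in1):
    """2-to-1 mux: pass in1 when sel is 1, otherwise in0."""
    return in1 if sel else in0


def ori_step(regs, rs, rt, rd, imm16):
    """One 'ori' through the datapath: R[rt] = R[rs] | ZeroExt[imm16]."""
    reg_dst = 0  # assumed encoding: 0 selects Rt as the write register, 1 selects Rd
    alu_src = 1  # assumed encoding: 1 selects the extended immediate, 0 selects busB

    rw = mux2(reg_dst, rt, rd)   # destination register number
    bus_a = regs[rs]             # first ALU operand
    bus_b = regs[rt]             # Rt is still read, but the ALUSrc mux ignores it
    alu_in_b = mux2(alu_src, bus_b, zero_ext16(imm16))
    bus_w = (bus_a | alu_in_b) & 0xFFFFFFFF
    regs[rw] = bus_w             # register write (RegWr asserted)


# Usage: the upper half of R[8] survives because the immediate is zero-extended.
regs = [0] * 32
regs[8] = 0x12340000
ori_step(regs, rs=8, rt=9, rd=0, imm16=0x8001)
```

Under these assumptions the sketch suggests one answer to both questions on the slide: the RegDst mux lets Rt be the destination, and the Rt value read onto busB is simply not selected by the ALUSrc mux.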
3d: Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]    Example: lw rt,rs,imm16
[Instruction format: op = bits 31-26 (6 bits), rs = bits 25-21 (5 bits), rt = bits 20-16 (5 bits), immediate = bits 15-0 (16 bits)]
[Datapath diagram, before: the 3c datapath with RegDst mux (Rd or Rt), ZeroExt on imm16 (16 to 32 bits), and ALUSrc mux (busB or extended immediate) feeding the ALU]
[Datapath diagram, after: ZeroExt is replaced by an Extender controlled by ExtOp; the ALU output drives Adr of the Data Memory (WrEn = MemWr, clk; Data In marked "?"); a MemtoReg mux selects the ALU output or the Data Memory output onto busW]

3e: Store Operations
• Mem[ R[rs] + SignExt[imm16] ] = R[rt]    Ex.: sw rt, rs, imm16
[Instruction format: op = bits 31-26 (6 bits), rs = bits 25-21 (5 bits), rt = bits 20-16 (5 bits), immediate = bits 15-0 (16 bits)]
[Datapath diagram: same as the load datapath, with busB (R[rt]) driving Data In of the Data Memory; control points RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg]

3f: The Branch Instruction
[Instruction format: op = bits 31-26 (6 bits), rs = bits 25-21 (5 bits), rt = bits 20-16 (5 bits), immediate = bits 15-0 (16 bits)]
beq rs, rt, imm16
• mem[PC]    Fetch the instruction from memory
• Equal = R[rs] == R[rt]    Calculate branch condition
• if (Equal)    Calculate the next instruction’s address
    PC = PC + 4 + ( SignExt(imm16) x 4 )
  else
    PC = PC + 4
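The next-address rule for beq can be checked with a short Python sketch. The helper names `sign_ext16` and `next_pc` are hypothetical, and wrapping the result to 32 bits is an assumption about how the PC register behaves rather than something stated on the slide.

```python
def sign_ext16(imm16):
    """Sign-extend a 16-bit immediate: negative when bit 15 is set."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc, equal, imm16):
    """Next PC for beq, following the register transfers on the slide."""
    pc_plus_4 = pc + 4
    if equal:
        # Branch taken: the offset counts instructions, so multiply by 4.
        return (pc_plus_4 + sign_ext16(imm16) * 4) & 0xFFFFFFFF
    return pc_plus_4 & 0xFFFFFFFF


# Usage: a forward branch of 3 instructions, then a backward branch of 1.
taken_forward = next_pc(0x1000, True, 3)        # 0x1004 + 12
taken_backward = next_pc(0x1000, True, 0xFFFF)  # 0x1004 - 4
not_taken = next_pc(0x1000, False, 3)           # 0x1004
```

Note that the offset is relative to PC + 4, not to the address of the branch itself, which is why an immediate of -1 branches back to the beq instruction.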
Datapath for Branch Operations
• beq rs, rt, imm16    Datapath generates condition (equal)
[Instruction format: op = bits 31-26 (6 bits), rs = bits 25-21 (5 bits), rt = bits 20-16 (5 bits), immediate = bits 15-0 (16 bits)]
[Datapath diagram, fetch side: PC (clk) supplies Inst Address with the low two bits fixed at 00; one Adder adds 4; PC Ext extends imm16 and a second Adder adds it; a Mux controlled by nPC_sel and Equal selects the next PC]
[Datapath diagram, register side: RegFile reads Rs and Rt (5 bits each) onto busA and busB (32 bits); a comparator (=) produces Equal; RegWr, ALUctr, busW, clk]
• Already have mux, adder, need special sign extender for PC, need equal compare (sub?)

Putting it All Together: A Single Cycle Datapath
[Datapath diagram: Inst Memory (Adr) outputs Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Fetch side: PC, two Adders, PC Ext, Mux (nPC_sel, Equal). Execute side: RegDst mux (Rd or Rt), RegFile (Rw, Ra, Rb; busA, busB), Extender (ExtOp), ALUSrc mux, ALU (ALUctr, =), Data Memory (WrEn = MemWr, Adr, Data In), MemtoReg mux onto busW; RegWr, clk]

An Abstract View of the Implementation
[Block diagram: Control takes the Instruction and Conditions and produces Control Signals. Datapath: Next Address logic, PC, Ideal Instruction Memory (Instruction Address in, 32-bit Instruction out), Register File (Rw, Ra, Rb fed by Rd, Rs, Rt, 5 bits each), ALU (inputs A and B, 32 bits), Ideal Data Memory (Data Addr, Data In, Data Out); storage elements clocked by clk]

An Abstract View of the Critical Path
[Block diagram: the datapath from the previous slide, with the load path highlighted]
(Assumes a fast controller)
Critical Path (Load Instruction) =
  Delay clock through PC (FFs)
  + Instruction Memory’s Access Time
  + Register File’s Access Time
  + ALU to Perform a 32-bit Add
  + Data Memory Access Time
  + Stable Time for Register File Write

Administrivia
• Friday will be an archived webcast lecture
• …no need to attend
class, but you can if you want to (I’ll be here)

Peer Instruction
A. Truth table for mux with 4-bits of signals has 2^4 rows
B. We could cascade N 1-bit shifters to make 1 N-bit shifter for sll, srl
C. If 1-bit adder delay is T, the N-bit adder delay would also be T

   ABC
0: FFF
1: FFT
2: FTF
3: FTT
4: TFF
5: TFT
6: TTF
7: TTT

Peer Instruction Answer
A. Truth table for mux with 4-bits of signals controls 16 inputs, for a total of 20 inputs, so truth table is 2^20 rows…FALSE
B. We could cascade N 1-bit shifters to make 1 N-bit shifter for sll, srl … TRUE
C. What about the cascading carry? FALSE

Summary: A Single Cycle Datapath
• We have everything except control signals
[Datapath diagram: instr fetch unit (nPC_sel, clk) outputs Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>; RegDst mux (Rd or Rt), RegFile (Rw, Ra, Rb; busA, busB), Extender (ExtOp), ALUSrc mux, ALU (ALUctr, zero), Data Memory (WrEn = MemWr, Adr, Data In), MemtoReg mux onto busW; RegWr, clk]
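One way to see what "everything except control signals" means is to model the summary datapath as a Python function that takes the control points as inputs. This is only a sketch under stated assumptions: the `Control` class, the `step` function, the dictionary-based memory, the string-valued ALUctr, and every 0/1 encoding are hypothetical choices, and the control values used for lw and beq are plausible settings rather than ones given in this lecture.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass
class Control:
    """Control points named on the datapath slide; encodings are assumptions."""
    nPC_sel: int   # 1: branch if Equal, 0: PC + 4
    RegDst: int    # 1: write Rd, 0: write Rt
    RegWr: int     # 1: write the register file
    ExtOp: int     # 1: sign-extend imm16, 0: zero-extend
    ALUSrc: int    # 1: extended immediate, 0: busB
    ALUctr: str    # "add", "sub", or "or"
    MemWr: int     # 1: write Data Memory
    MemtoReg: int  # 1: busW from memory, 0: busW from ALU


def extend(imm16, ext_op):
    """Extender: 16 to 32 bits, sign- or zero-extended per ExtOp."""
    imm16 &= 0xFFFF
    if ext_op and imm16 & 0x8000:
        return imm16 | 0xFFFF0000
    return imm16


def alu(op, a, b):
    """32-bit ALU with the three operations this sketch needs."""
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32
    if op == "or":
        return (a | b) & MASK32
    raise ValueError(op)


def step(pc, regs, mem, instr, ctl):
    """Run one instruction word through the datapath; return the next PC."""
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    rd = (instr >> 11) & 0x1F
    imm16 = instr & 0xFFFF
    bus_a, bus_b = regs[rs], regs[rt]
    equal = bus_a == bus_b
    alu_b = extend(imm16, ctl.ExtOp) if ctl.ALUSrc else bus_b
    alu_out = alu(ctl.ALUctr, bus_a, alu_b)
    if ctl.MemWr:
        mem[alu_out] = bus_b                      # store: Data In comes from busB
    bus_w = mem.get(alu_out, 0) if ctl.MemtoReg else alu_out
    if ctl.RegWr:
        regs[rd if ctl.RegDst else rt] = bus_w
    pc_plus_4 = (pc + 4) & MASK32
    if ctl.nPC_sel and equal:
        return (pc_plus_4 + (extend(imm16, 1) << 2)) & MASK32
    return pc_plus_4


# Usage with assumed control settings for lw and beq.
lw_ctl = Control(nPC_sel=0, RegDst=0, RegWr=1, ExtOp=1, ALUSrc=1,
                 ALUctr="add", MemWr=0, MemtoReg=1)
beq_ctl = Control(nPC_sel=1, RegDst=0, RegWr=0, ExtOp=1, ALUSrc=0,
                  ALUctr="sub", MemWr=0, MemtoReg=0)

regs = [0] * 32
regs[8] = 0x100
mem = {0x104: 0xDEADBEEF}                         # word stored at byte address 0x104
lw_instr = (0x23 << 26) | (8 << 21) | (9 << 16) | 4        # rs=8, rt=9, imm16=4
pc_after_lw = step(0, regs, mem, lw_instr, lw_ctl)
beq_instr = (0x04 << 26) | (1 << 21) | (2 << 16) | 0xFFFE  # rs=1, rt=2, imm16=-2
pc_after_beq = step(8, regs, mem, beq_instr, beq_ctl)
```

The datapath code never inspects the opcode; which instruction executes is decided entirely by the `Control` values passed in, which is the piece a control unit would have to supply.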