inst.eecs.berkeley.edu/~cs61c
UC Berkeley CS61C : Machine Structures
Lecture 26 CPU Design: Designing a Single-cycle CPU, pt 2
2010-03-31
Lecturer SOE Dan Garcia
www.cs.berkeley.edu/~ddgarcia

Hello to Vinay Kumar listening from Columbia Univ!

Apple to end AT&T monopoly?
The Wall Street Journal reports that Apple is developing a CDMA cell phone that could be used by Verizon and Sprint. Most carriers (incl AT&T) use GSM. Lots of people declining to comment, but if there’s any truth to this, it’s great!
apple.com/iphone

CS61C L26 CPU Design : Designing a Single-Cycle CPU II (1) Garcia, Spring 2010 © UCB

How to Design a Processor: step-by-step
1. Analyze instruction set architecture (ISA) => datapath requirements
   • meaning of each instruction is given by the register transfers
   • datapath must include storage element for ISA registers
   • datapath must support each register transfer
2. Select set of datapath components and establish clocking methodology
3. Assemble datapath meeting requirements
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
5. Assemble the control logic

Clocking Methodology
[Figure: clock waveform (Clk)]
• Storage elements clocked by same edge
• Being physical devices, flip-flops (FF) and combinational logic have some delays
   • Gates: delay from input change to output change
   • Signals at FF D input must be stable before active clock edge to allow signal to travel within the FF (set-up time), and we have the usual clock-to-Q delay
• “Critical path” (longest path through logic) determines length of clock period

Register-Register Timing: One complete cycle
[Figure: timing diagram for one clock cycle. Signals shown: Clk; PC; Rs, Rt, Rd, Op, Func; ALUctr; RegWr; busA, B; busW. Each signal changes from Old Value to New Value. Labeled delays, in order: Instruction Memory Access Time, Delay through Control Logic, Register File Access Time, ALU Delay. Annotation: "Register Write Occurs Here".]
[Figure: RegFile (Rw, Ra, Rb; inputs Rd, Rs, Rt, RegWr, clk; outputs busA, busB) feeding a 32-bit ALU controlled by ALUctr; the ALU result returns on busW.]

3c: Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
Instruction format: op (bits 31-26, 6 bits) | rs (bits 25-21, 5 bits) | rt (bits 20-16, 5 bits) | immediate (bits 15-0, 16 bits)
ZeroExt[imm16]: 0000000000000000 (bits 31-16, 16 bits) | immediate (bits 15-0, 16 bits)
But we’re writing to Rt register??
[Figure: the register-register datapath above (Rd, Rs, Rt into RegFile; busA, busB into the ALU; busW back to RegFile), unchanged.]

3c: Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
Instruction format: op (bits 31-26, 6 bits) | rs (bits 25-21, 5 bits) | rt (bits 20-16, 5 bits) | immediate (bits 15-0, 16 bits)
ZeroExt[imm16]: 0000000000000000 (bits 31-16, 16 bits) | immediate (bits 15-0, 16 bits)
What about Rt register read??
[Figure: datapath with a RegDst mux choosing Rd or Rt as the write register Rw, a ZeroExt unit widening imm16 from 16 to 32 bits, and an ALUSrc mux choosing busB or the extended immediate as the second ALU input.]
• Already defined 32-bit MUX; Zero Ext?
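The register transfer R[rt] = R[rs] op ZeroExt[imm16] can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the lecture: the names `zero_ext16` and `logical_imm` and the list-based register file are hypothetical, and the sketch ignores that register 0 is hardwired to zero in MIPS.

```python
MASK32 = 0xFFFFFFFF  # keep values within 32 bits


def zero_ext16(imm16):
    """Zero-extend a 16-bit field to 32 bits: the upper 16 bits are 0."""
    return imm16 & 0xFFFF


def logical_imm(regs, op, rs, rt, imm16):
    """R[rt] = R[rs] op ZeroExt[imm16]. The destination is rt, not rd."""
    ops = {
        "and": lambda a, b: a & b,
        "or": lambda a, b: a | b,
        "xor": lambda a, b: a ^ b,
    }
    regs[rt] = ops[op](regs[rs], zero_ext16(imm16)) & MASK32


# Hypothetical register file: a list of 32 words.
regs = [0] * 32
regs[8] = 0xFFFF0000
logical_imm(regs, "or", rs=8, rt=9, imm16=0x8001)
print(hex(regs[9]))  # 0xffff8001
```

Note that the immediate 0x8001 has its top bit set, yet the upper half of the second operand stays zero. That is the difference between zero extension here and the sign extension used later for loads, stores, and branches.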
3d: Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]
  Example: lw rt,rs,imm16
Instruction format: op (bits 31-26, 6 bits) | rs (bits 25-21, 5 bits) | rt (bits 20-16, 5 bits) | immediate (bits 15-0, 16 bits)
[Figure: the datapath from 3c (RegDst mux, RegFile, ZeroExt, ALUSrc mux, ALU), before load support is added.]

3d: Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]
  Example: lw rt,rs,imm16
Instruction format: op (bits 31-26, 6 bits) | rs (bits 25-21, 5 bits) | rt (bits 20-16, 5 bits) | immediate (bits 15-0, 16 bits)
[Figure: datapath with the ZeroExt unit replaced by an Extender controlled by ExtOp, and a Data Memory (Adr, Data In, WrEn driven by MemWr, clk) addressed by the ALU result. A MemtoReg mux selects the ALU result or the memory output to drive busW. The Data In input is marked "?".]

3e: Store Operations
• Mem[ R[rs] + SignExt[imm16] ] = R[rt]
  Ex.: sw rt, rs, imm16
Instruction format: op (bits 31-26, 6 bits) | rs (bits 25-21, 5 bits) | rt (bits 20-16, 5 bits) | immediate (bits 15-0, 16 bits)
[Figure: same datapath as for loads (RegDst mux, RegFile, Extender with ExtOp, ALUSrc mux, ALU, Data Memory with MemWr, MemtoReg mux), with Data In of the Data Memory now connected.]

3f: The Branch Instruction
Instruction format: op (bits 31-26, 6 bits) | rs (bits 25-21, 5 bits) | rt (bits 20-16, 5 bits) | immediate (bits 15-0, 16 bits)
beq rs, rt, imm16
• mem[PC]: Fetch the instruction from memory
• Equal = R[rs] == R[rt]: Calculate branch condition
• if (Equal): Calculate the next instruction’s address
    PC = PC + 4 + ( SignExt(imm16) x 4 )
  else
    PC = PC + 4
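The next-address rule for beq can be written out directly. The following is a minimal sketch, not the lecture's implementation: `sign_ext16` and `next_pc_beq` are hypothetical helper names, the register file is a plain Python list, and the example PC value is arbitrary.

```python
MASK32 = 0xFFFFFFFF  # PC is a 32-bit value


def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed two's-complement integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc_beq(pc, regs, rs, rt, imm16):
    """Return the next PC for 'beq rs, rt, imm16'."""
    equal = regs[rs] == regs[rt]  # branch condition
    if equal:
        # Taken: PC + 4 + (SignExt(imm16) x 4)
        return (pc + 4 + sign_ext16(imm16) * 4) & MASK32
    # Not taken: fall through to the next instruction
    return (pc + 4) & MASK32


regs = [0] * 32
regs[1], regs[2], regs[3] = 7, 7, 9

# imm16 = 0xFFFD encodes -3, so a taken branch goes back 3 words from PC + 4.
print(hex(next_pc_beq(0x00400010, regs, 1, 2, 0xFFFD)))  # 0x400008
print(hex(next_pc_beq(0x00400010, regs, 1, 3, 0xFFFD)))  # 0x400014
```

Multiplying by 4 is why the branch hardware can append two zero bits to the extended immediate instead of using a general multiplier: instruction addresses are word aligned.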
Datapath for Branch Operations
• beq rs, rt, imm16: Datapath generates condition (equal)
Instruction format: op (bits 31-26, 6 bits) | rs (bits 25-21, 5 bits) | rt (bits 20-16, 5 bits) | immediate (bits 15-0, 16 bits)
[Figure: next-address logic with the PC register (clk), an adder with constant 4, a PC Ext unit for imm16, a second adder, and a mux controlled by nPC_sel; the low two address bits are 00. The PC supplies Inst Address. The RegFile (Rs, Rt; busA, busB; RegWr, busW, clk) feeds an "=" comparison (ALU, ALUctr) that produces Equal.]
• Already have mux, adder; need special sign extender for PC; need equal compare (sub?)

Putting it All Together: A Single Cycle Datapath
[Figure: complete single-cycle datapath. Inst Memory (Adr) outputs Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. Next-address logic: PC, two adders, PC Ext, mux, constant 4, low bits 00, nPC_sel, Equal. Execution path: RegDst mux (Rd or Rt), RegFile (Rw, Ra, Rb; busA, busB, busW), Extender (imm16, ExtOp), ALUSrc mux, ALU with "=" output (ALUctr), Data Memory (Adr, Data In, WrEn, MemWr, clk), MemtoReg mux.]

An Abstract View of the Implementation
[Figure: Control block (inputs: Instruction, Conditions; outputs: Control Signals) above the Datapath. Datapath: Next Address logic, PC (clk), Ideal Instruction Memory (Instruction Address in, Instruction out), Register File (Rd, Rs, Rt; Rw, Ra, Rb; clk) with outputs A and B, 32-bit ALU, Ideal Data Memory (Data Addr, Data In, Data Out, clk).]

An Abstract View of the Critical Path
(Assumes a fast controller)
[Figure: the abstract datapath above, with the load instruction's path highlighted: PC, Ideal Instruction Memory, Register File, ALU, Ideal Data Memory, back to the Register File.]
Critical Path (Load Instruction) =
    Delay clock through PC (FFs)
  + Instruction Memory’s Access Time
  + Register File’s Access Time
  + ALU to Perform a 32-bit Add
  + Data Memory Access Time
  + Stable Time for Register File Write

Administrivia
• Any administrivia to announce?
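Returning to the critical-path slide: the load instruction's path is simply a sum of component delays, and that sum sets the minimum clock period for every instruction in a single-cycle design. The sketch below makes the arithmetic concrete. All delay values are invented for illustration (the lecture gives no numbers), and the dictionary keys are hypothetical names for the six terms on the slide.

```python
# Hypothetical component delays in picoseconds; none of these come from the lecture.
delays_ps = {
    "pc_clk_to_q": 30,                # delay of clock through PC (FFs)
    "instruction_memory_access": 250,
    "register_file_access": 150,
    "alu_32bit_add": 200,
    "data_memory_access": 250,
    "register_file_write_setup": 20,  # stable time for register file write
}


def load_critical_path_ps(delays):
    """Sum the six terms of the load instruction's critical path."""
    terms = [
        "pc_clk_to_q",
        "instruction_memory_access",
        "register_file_access",
        "alu_32bit_add",
        "data_memory_access",
        "register_file_write_setup",
    ]
    return sum(delays[name] for name in terms)


period_ps = load_critical_path_ps(delays_ps)
max_freq_ghz = 1000 / period_ps  # 1000 ps per ns, so 1000/period_ps is in GHz
print(period_ps)                  # 900
print(round(max_freq_ghz, 3))     # 1.111
```

With these made-up numbers the two memory accesses account for more than half of the period, but no single component equals the whole delay.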
Peer Instruction
1) In the worst case, the delay is the memory access time
2) With only changes to control, our datapath could write to memory and registers in one cycle.

   12
a) FF
b) FT
c) TF
d) TT

Summary: A Single Cycle Datapath
• We have everything except control signals
[Figure: the single-cycle datapath with the next-address logic collapsed into an "instr fetch unit" block (nPC_sel, zero, clk) that outputs Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. Remaining datapath: RegDst mux (Rd or Rt), RegFile (Rw, Ra, Rb; busA, busB, busW; RegWr), Extender (imm16, ExtOp), ALUSrc mux, ALU with "=" output (ALUctr), Data Memory (Adr, Data In, WrEn, MemWr, clk), MemtoReg mux.]