CS:APP Chapter 4
Computer Architecture
Sequential Implementation

Randal E. Bryant
Carnegie Mellon University
http://csapp.cs.cmu.edu


Y86 Instruction Set

  Instruction          Byte 0   Byte 1   Following bytes
  nop                  0 0
  halt                 1 0
  rrmovl rA, rB        2 0      rA rB
  irmovl V, rB         3 0      F  rB    V    (bytes 2-5)
  rmmovl rA, D(rB)     4 0      rA rB    D    (bytes 2-5)
  mrmovl D(rB), rA     5 0      rA rB    D    (bytes 2-5)
  OPl rA, rB           6 fn     rA rB
  jXX Dest             7 fn     Dest (bytes 1-4)
  call Dest            8 0      Dest (bytes 1-4)
  ret                  9 0
  pushl rA             A 0      rA F
  popl rA              B 0      rA F

  OPl function codes:  addl 6 0   subl 6 1   andl 6 2   xorl 6 3
  jXX function codes:  jmp 7 0   jle 7 1   jl 7 2   je 7 3   jne 7 4   jge 7 5   jg 7 6


Computing with Logic Gates

  And:  out = a && b
  Or:   out = a || b
  Not:  out = !a

  - Outputs are Boolean functions of inputs


Bit Equality

  [Figure: gate-level "Bit equal" circuit with inputs a and b and output eq]

  bool eq = (a&&b)||(!a&&!b)


Word Equality

  [Figure: 32 "Bit equal" circuits comparing a0..a31 with b0..b31, producing eq0..eq31, which are combined into Eq]

  - 32-bit word size
  - Word-level representation:

    bool Eq = (A == B)


Arithmetic Logic Unit

  [Figure: four copies of the ALU, one per control value; input A carries Y, input B carries X; outputs are the result and OF, ZF, CF]

  Control   Result
  0         X + Y
  1         X - Y
  2         X & Y
  3         X ^ Y

  - Combinational logic
    - Continuously responding to inputs
  - Control signal selects function computed
    - Corresponding to 4 arithmetic/logical operations in Y86
  - Also computes values for condition codes


Bit-Level Multiplexor

  [Figure: gate-level "Bit MUX" circuit with inputs s, a, b and output out]

  bool out = (s&&a)||(!s&&b)

  - Control signal s
  - Data signals a and b
  - Output a when s = 1, b when s = 0


Word Multiplexor

  [Figure: 32 bit-level multiplexors sharing control signal s, selecting between a0..a31 and b0..b31 to produce out0..out31]

  - Word-level representation:

    int Out = [ S = 1 : A; S = 0 : B; ];

  - Select input word A or B depending on control signal s


Random-Access Memory

  [Figure: register file with read ports A (srcA, valA) and B (srcB, valB), write port W (dstW, valW), and a clock input]

  - Stores multiple words of memory
    - Address input specifies which word to read or write
  - Register file
    - Holds values of program registers: %eax, %esp, etc.
    - Register identifier serves as address
      - ID 8 implies no read or write performed
  - Multiple ports
    - Can read and/or write multiple words in one cycle
      - Each has separate address and data input/output


Building Blocks

  [Figure: ALU, multiplexor, equality comparator, and clocked register file]

  - Combinational logic
    - Compute Boolean functions of inputs
    - Continuously respond to input changes
    - Operate on data and implement control
  - Storage elements
    - Store bits
    - Addressable memories
    - Non-addressable registers


SEQ Hardware Structure

  [Figure: SEQ datapath, bottom to top. Fetch: PC, instruction memory, PC increment, producing icode, ifun, rA, rB, valC, valP. Decode: register file with read ports A and B and write ports E and M, signals srcA, srcB, dstA, dstB, valA, valB. Execute: ALU and CC, signals aluA, aluB, valE, Bch. Memory: data memory, signals Addr, Data, valM. Write back: valE, valM. PC: newPC]

  - State
    - Program counter register (PC)
    - Condition code register (CC)
    - Register file
    - Memories
      - Access same memory space
      - Data: for reading/writing program data
      - Instruction: for reading instructions
  - Instruction flow
    - Read instruction at address specified by PC
    - Process through stages
    - Update program counter


SEQ Stages

  [Figure: the same SEQ datapath, with the stages labeled]

  - Fetch: read instruction from instruction memory
  - Decode: read program registers
  - Execute: compute value or address
  - Memory: read or write data
  - Write Back: write program registers
  - PC: update program counter


Instruction Decoding

  [Figure: instruction bytes "5 0 | rA rB | D" mapped to icode, ifun, rA, rB, valC; the register byte and the constant word are marked optional]

  - Instruction format
    - Instruction byte: icode:ifun
    - Optional register byte: rA:rB
    - Optional constant word: valC


Executing Arith./Logical Operation

  OPl rA, rB        encoding: 6 fn | rA rB

  - Fetch: read 2 bytes
  - Decode: read operand registers
  - Execute: perform operation; set condition codes
  - Memory: do nothing
  - Write back: update register
  - PC Update: increment PC by 2


Stage Computation: Arith/Log. Ops

  Stage        OPl rA, rB                Comment
  Fetch        icode:ifun ← M1[PC]       Read instruction byte
               rA:rB ← M1[PC+1]          Read register byte
               valP ← PC+2               Compute next PC
  Decode       valA ← R[rA]              Read operand A
               valB ← R[rB]              Read operand B
  Execute      valE ← valB OP valA       Perform ALU operation
               Set CC                    Set condition code register
  Memory
  Write back   R[rB] ← valE              Write back result
  PC update    PC ← valP                 Update PC

  - Formulate instruction execution as sequence of simple steps
  - Use same general form for all instructions


Executing rmmovl

  rmmovl rA, D(rB)  encoding: 4 0 | rA rB | D

  - Fetch: read 6 bytes
  - Decode: read operand registers
  - Execute: compute effective address
  - Memory: write to memory
  - Write back: do nothing
  - PC Update: increment PC by 6


Stage Computation: rmmovl

  Stage        rmmovl rA, D(rB)          Comment
  Fetch        icode:ifun ← M1[PC]       Read instruction byte
               rA:rB ← M1[PC+1]          Read register byte
               valC ← M4[PC+2]           Read displacement D
               valP ← PC+6               Compute next PC
  Decode       valA ← R[rA]              Read operand A
               valB ← R[rB]              Read operand B
  Execute      valE ← valB + valC        Compute effective address
  Memory       M4[valE] ← valA           Write value to memory
  Write back
  PC update    PC ← valP                 Update PC

  - Use ALU for address computation


Executing popl

  popl rA           encoding: B 0 | rA F

  - Fetch: read 2 bytes
  - Decode: read stack pointer
  - Execute: increment stack pointer by 4
  - Memory: read from old stack pointer
  - Write back: update stack pointer; write result to register
  - PC Update: increment PC by 2


Stage Computation: popl

  Stage        popl rA                   Comment
  Fetch        icode:ifun ← M1[PC]       Read instruction byte
               rA:rB ← M1[PC+1]          Read register byte
               valP ← PC+2               Compute next PC
  Decode       valA ← R[%esp]            Read stack pointer
               valB ← R[%esp]            Read stack pointer
  Execute      valE ← valB + 4           Increment stack pointer
  Memory       valM ← M4[valA]           Read from stack
  Write back   R[%esp] ← valE            Update stack pointer
               R[rA] ← valM              Write back result
  PC update    PC ← valP                 Update PC

  - Use ALU to increment stack pointer
  - Must update two registers
    - Popped value
    - New stack pointer


Executing Jumps

  jXX Dest          encoding: 7 fn | Dest

  [Figure: instruction stream with "fall thru: XX XX" (not taken) and "target: XX XX" (taken)]

  - Fetch: read 5 bytes; increment PC by 5
  - Decode: do nothing
  - Execute: determine whether to take branch based on jump condition and condition codes
  - Memory: do nothing
  - Write back: do nothing
  - PC Update: set PC to Dest if branch taken, or to incremented PC if not taken


Stage Computation: Jumps

  Stage        jXX Dest                  Comment
  Fetch        icode:ifun ← M1[PC]       Read instruction byte
               valC ← M4[PC+1]           Read destination address
               valP ← PC+5               Fall through address
  Decode
  Execute      Cnd ← Cond(CC,ifun)       Take branch?
  Memory
  Write back
  PC update    PC ← Cnd ? valC : valP    Update PC

  - Compute both addresses
  - Choose based on setting of condition codes and branch condition


Executing call

  call Dest         encoding: 8 0 | Dest

  [Figure: instruction stream with "return: XX XX" and "target: XX XX"]

  - Fetch: read 5 bytes; increment PC by 5
  - Decode: read stack pointer
  - Execute: decrement stack pointer by 4
  - Memory: write incremented PC to new value of stack pointer
  - Write back: update stack pointer
  - PC Update: set PC to Dest


Stage Computation: call

  Stage        call Dest                 Comment
  Fetch        icode:ifun ← M1[PC]       Read instruction byte
               valC ← M4[PC+1]           Read destination address
               valP ← PC+5               Compute return point
  Decode       valB ← R[%esp]            Read stack pointer
  Execute      valE ← valB + (-4)        Decrement stack pointer
  Memory       M4[valE] ← valP           Write return value on stack
  Write back   R[%esp] ← valE            Update stack pointer
  PC update    PC ← valC                 Set PC to destination

  - Use ALU to decrement stack pointer
  - Store incremented PC


Executing ret

  ret               encoding: 9 0

  [Figure: instruction stream with "return: XX XX"]

  - Fetch: read 1 byte
  - Decode: read stack pointer
  - Execute: increment stack pointer by 4
  - Memory: read return address from old stack pointer
  - Write back: update stack pointer
  - PC Update: set PC to return address


Stage Computation: ret

  Stage        ret                       Comment
  Fetch        icode:ifun ← M1[PC]       Read instruction byte
  Decode       valA ← R[%esp]            Read operand stack pointer
               valB ← R[%esp]            Read operand stack pointer
  Execute      valE ← valB + 4           Increment stack pointer
  Memory       valM ← M4[valA]           Read return address
  Write back   R[%esp] ← valE            Update stack pointer
  PC update    PC ← valM                 Set PC to return address

  - Use ALU to increment stack pointer
  - Read return address from memory


Computation Steps (OPl)

  Bracketed comments mark steps this instruction does not use.

  Stage        Values        OPl rA, rB               Comment
  Fetch        icode,ifun    icode:ifun ← M1[PC]      Read instruction byte
               rA,rB         rA:rB ← M1[PC+1]         Read register byte
               valC                                   [Read constant word]
               valP          valP ← PC+2              Compute next PC
  Decode       valA, srcA    valA ← R[rA]             Read operand A
               valB, srcB    valB ← R[rB]             Read operand B
  Execute      valE          valE ← valB OP valA      Perform ALU operation
               Cond code     Set CC                   Set condition code register
  Memory       valM                                   [Memory read/write]
  Write back   dstE          R[rB] ← valE             Write back ALU result
               dstM                                   [Write back memory result]
  PC update    PC            PC ← valP                Update PC

  - All instructions follow same general pattern
  - Differ in what gets computed on each step


Computation Steps (call)

  Bracketed comments mark steps this instruction does not use.

  Stage        Values        call Dest                Comment
  Fetch        icode,ifun    icode:ifun ← M1[PC]      Read instruction byte
               rA,rB                                  [Read register byte]
               valC          valC ← M4[PC+1]          Read constant word
               valP          valP ← PC+5              Compute next PC
  Decode       valA, srcA                             [Read operand A]
               valB, srcB    valB ← R[%esp]           Read operand B
  Execute      valE          valE ← valB + (-4)       Perform ALU operation
               Cond code                              [Set condition code reg.]
  Memory       valM          M4[valE] ← valP          Memory read/write
  Write back   dstE          R[%esp] ← valE           Write back ALU result
               dstM                                   [Write back memory result]
  PC update    PC            PC ← valC                Update PC

  - All instructions follow same general pattern
  - Differ in what gets computed on each step


Computed Values

  Stage     Value   Meaning
  Fetch     icode   Instruction code
            ifun    Instruction function
            rA      Instr. Register A
            rB      Instr. Register B
            valC    Instruction constant
            valP    Incremented PC
  Decode    srcA    Register ID A
            srcB    Register ID B
            dstE    Destination Register E
            dstM    Destination Register M
            valA    Register value A
            valB    Register value B
  Execute   valE    ALU result
            Cnd     Condition flag
  Memory    valM    Value from memory


SEQ Summary

  - Implementation
    - Express every instruction as series of simple steps
    - Follow same general flow for each instruction type
    - Assemble registers, memories, predesigned combinational blocks
    - Connect with control logic
  - Limitations
    - Too slow to be practical
    - In one cycle, must propagate through instruction memory, register file, ALU, and data memory
    - Would need to run clock very slowly
    - Hardware units only active for fraction of clock cycle
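The bit-level and word-level building blocks from the Bit Equality, Word Equality, and multiplexor slides can be mirrored in software to check the Boolean expressions. This is a minimal sketch in Python, not HCL: the function names and the `width` parameter are made up for illustration, and the word-level versions loop over bits where the hardware would use 32 parallel circuits.

```python
def bit_eq(a, b):
    """Bit equality: eq = (a && b) || (!a && !b)."""
    return (a and b) or (not a and not b)


def word_eq(A, B, width=32):
    """Word equality: Eq holds only if every pair of bits is equal."""
    return all(
        bit_eq(bool((A >> i) & 1), bool((B >> i) & 1)) for i in range(width)
    )


def bit_mux(s, a, b):
    """Bit-level multiplexor: out = (s && a) || (!s && b)."""
    return (s and a) or (not s and b)


def word_mux(s, A, B, width=32):
    """Word multiplexor: every output bit comes from a bit-level mux sharing s."""
    out = 0
    for i in range(width):
        if bit_mux(s, bool((A >> i) & 1), bool((B >> i) & 1)):
            out |= 1 << i
    return out


if __name__ == "__main__":
    print(word_eq(0xCAFE, 0xCAFE), word_eq(1, 3))        # prints: True False
    print(word_mux(True, 5, 9), word_mux(False, 5, 9))   # prints: 5 9
```

The word-level functions give the same answers as `A == B` and `A if s else B` on 32-bit values, which is what the word-level HCL expressions abstract.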
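The stage computations (fetch, decode, execute, memory, write back, PC update) can be traced in software. The following is a minimal Python sketch, not the SEQ hardware or its HCL control logic. It makes several assumptions the slides do not state: register IDs follow the usual Y86 numbering (%eax = 0, %ebx = 3, %esp = 4), the condition codes are modeled as (ZF, SF, OF), memory is little-endian, and the names `step`, `run`, `alu`, and `cond` are made up for illustration. Instruction codes follow the instruction set table in these slides (nop = 0, halt = 1).

```python
MASK = 0xFFFFFFFF
ESP, RNONE = 4, 0xF  # assumed IDs: %esp = 4, F = "no register"
NOP, HALT, RRMOVL, IRMOVL, RMMOVL, MRMOVL, OPL, JXX, CALL, RET, PUSHL, POPL = range(12)


def signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(fn, a, b):
    """Compute b OP a (0 add, 1 sub, 2 and, 3 xor); return (valE, (ZF, SF, OF))."""
    exact = [signed(b) + signed(a), signed(b) - signed(a), b & a, b ^ a][fn]
    val = exact & MASK
    overflow = fn < 2 and signed(val) != exact
    return val, (val == 0, signed(val) < 0, overflow)


def cond(cc, ifun):
    """Evaluate jump condition ifun (0 jmp .. 6 jg) against the condition codes."""
    zf, sf, of = cc
    lt = sf != of
    return [True, lt or zf, lt, zf, not zf, not lt, not lt and not zf][ifun]


def read4(mem, addr):
    return int.from_bytes(mem[addr:addr + 4], "little")


def write4(mem, addr, val):
    mem[addr:addr + 4] = (val & MASK).to_bytes(4, "little")


def step(s):
    """Run one instruction through the six stages; return False when halt is fetched."""
    mem, R, pc = s["mem"], s["R"], s["PC"]
    # Fetch
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    if icode == HALT:
        return False
    has_reg = icode in (RRMOVL, IRMOVL, RMMOVL, MRMOVL, OPL, PUSHL, POPL)
    has_valC = icode in (IRMOVL, RMMOVL, MRMOVL, JXX, CALL)
    rA, rB = (mem[pc + 1] >> 4, mem[pc + 1] & 0xF) if has_reg else (RNONE, RNONE)
    valC = read4(mem, pc + 1 + has_reg) if has_valC else 0
    valP = pc + 1 + has_reg + 4 * has_valC
    # Decode
    srcA = rA if icode in (RRMOVL, RMMOVL, OPL, PUSHL) else ESP if icode in (POPL, RET) else RNONE
    srcB = rB if icode in (RMMOVL, MRMOVL, OPL) else ESP if icode in (PUSHL, POPL, CALL, RET) else RNONE
    valA = R[srcA] if srcA != RNONE else 0
    valB = R[srcB] if srcB != RNONE else 0
    # Execute
    if icode in (RRMOVL, OPL):
        aluA = valA
    elif icode in (IRMOVL, RMMOVL, MRMOVL):
        aluA = valC
    elif icode in (CALL, PUSHL):
        aluA = -4 & MASK
    else:
        aluA = 4  # ret and popl; result is unused by nop and jXX
    valE, cc = alu(ifun if icode == OPL else 0, aluA, valB)
    if icode == OPL:
        s["CC"] = cc
    cnd = cond(s["CC"], ifun) if icode == JXX else False
    # Memory
    valM = 0
    if icode in (RMMOVL, PUSHL):
        write4(mem, valE, valA)
    elif icode == CALL:
        write4(mem, valE, valP)
    elif icode == MRMOVL:
        valM = read4(mem, valE)
    elif icode in (POPL, RET):
        valM = read4(mem, valA)
    # Write back
    dstE = rB if icode in (RRMOVL, IRMOVL, OPL) else ESP if icode in (PUSHL, POPL, CALL, RET) else RNONE
    dstM = rA if icode in (MRMOVL, POPL) else RNONE
    if dstE != RNONE:
        R[dstE] = valE
    if dstM != RNONE:
        R[dstM] = valM
    # PC update
    if icode == CALL or (icode == JXX and cnd):
        s["PC"] = valC
    elif icode == RET:
        s["PC"] = valM
    else:
        s["PC"] = valP
    return True


def run(program, mem_size=128, max_steps=1000):
    """Load program at address 0 and step until halt (or max_steps)."""
    s = {"mem": bytearray(mem_size), "R": [0] * 8, "PC": 0, "CC": (True, False, False)}
    s["mem"][:len(program)] = program
    for _ in range(max_steps):
        if not step(s):
            break
    return s


if __name__ == "__main__":
    prog = bytes.fromhex(
        "30f440000000"  # 0x00: irmovl $64, %esp
        "30f005000000"  # 0x06: irmovl $5, %eax
        "30f303000000"  # 0x0c: irmovl $3, %ebx
        "8018000000"    # 0x12: call 0x18
        "10"            # 0x17: halt
        "6130"          # 0x18: subl %ebx, %eax
        "90"            # 0x1a: ret
    )
    state = run(prog)
    print(state["R"][0], state["R"][4], hex(state["PC"]))  # prints: 2 64 0x17
```

In the example, `call` stores the return point 0x17 at address 60 and sets %esp to 60, `subl` leaves 5 - 3 = 2 in %eax, and `ret` reads the return address through the old stack pointer while the ALU restores %esp to 64. Every instruction passes through the same six stages; only the values selected at each stage differ.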