Score:_____ Date:____ Name:__________________Partner:_________________ ECE 3055A Lab Assignment #2 - Counts as two labs TA Signoff & Grades Part 1(25%)_______________________ Part 2 (75%)___________________________ Due Date Part 1: Thursday, September 22 Due Date Part 2:Thursday, October 6 Note: This assignment has two parts, the first part is simple, you just have to run an existing VHDL-based MIPS hardware simulation using the new Altera Quartus II CAD tools and understand the results. The second part, adding Pipelining to the model will take a significant effort, so be sure to start work early to be able to finish on time. The Altera Quartus II tools are available in the ECE PC labs or on the web using the link at http://www.ece.gatech.edu/~hamblen/3055 The MIPS VHDL source files are also available on a link on the class web page at: http://www.ece.gatech.edu/~hamblen/3055 Note: Get the new files for this Semester from this location only! – see the separate handout passed out earlier on how to run the new MIPS model using Quartus II. Avoid using Quartus versions newer than the ones you used in 2031 lab or those installed in the ECE PC clusters. In the newest ones, 10.0 and above, a totally different simulator is used that is quite a bit more complex. Older ones are still posted at Altera. 1. Use VHDL to synthesize the MIPS single clock cycle design in the top-level file MIPS.VHD. More information on this model can be found in the Chapter 14 of your ECE 2031 textbook and Chapter 6 also contains a review of using VHDL for synthesis models. After synthesis and simulation perform the following steps: Display and print the timing diagram from the simulation. Verify that the information on the timing diagram shows that the hardware is functioning correctly. Examine the test program in PROGRAM.MIF. Look at the program counter, instruction bus, register file, ALU outputs, and control signals on the timing diagram and carefully follow the execution of each instruction in the test program. Label the important values for each instruction on the timing diagram and attach a short write-up explaining in detail what the timing diagram shows relative to each instruction's execution and correct operation. Return to the simulator and run the simulation again. Examine the ALU output in the timing diagram window. Zoom in on the ALU output after execution of the third instruction and see what happens when it changes values. Explain exactly what is happening at this point. Hint: Real hardware like adders have timing delays. 2. Pipeline the MIPS VHDL simulation. Test your VHDL model by running a simulation of the example program using the hardware shown in Computer Organization and Design The Hardware/Software Interface. This detailed example is from the earlier third edition of the text, so the figures are attached at the end of this handout. To minimize changes, pipeline registers must be placed in the VHDL module that generates the input to the pipeline. As an example, all of the pipeline registers that store control signals must be placed in the control module. Synthesize and check the control module first, since it is simple to see if it works correctly when you add the pipeline flip-flops. Use the following notation to add new latched signals add a "D_" to the beginning of the signal name to indicate it is the input to a D flip-flop used in a pipeline register. Signals that need to go through two D flip-flops would be "DD_" and three would be "DDD_". As an example, instruction would be the registered (one cycle delay) version of the signal, D_instruction as seen in Figure 1. The existing signal name in this example, Instruction, is changed to D_Instruction and the output of the register becomes Instruction. This will minimize the changes needed in the top level module since the global signal names used to interconnect signals to the six modules will not change (i.e., global variable names stay the same). Before pipelining: Instruction After pipelining and changing signal names: D_Instruction Instruction D Pipeline Register Q Figure 1: The naming convention for new signals added during pipelining. Add pipeline registers inside the existing modules that generate the inputs to the pipeline registers shown in the text (i.e., no new modules). This will prevent adding more modules and will not require extensive changes to the top level MIPS.vhd module. Add signal and process statements to model the pipeline modules – see the PC in the ifetch.vhd module for an example of how this can work (the PC is already pipelined). A few muxes may have to be moved to different modules. You may want to change the inverted clock signal on data memory once you add pipelining. Note: The previously described way to add pipeline latches will minimize changes to the output ports of each module and the top-level mips.vhd file. However, some students in the past have found the nomenclature confusing. As an alternative, you can add a prefix “Q_” for every delay (“Q” output of D flip-flop). For example, ALU_RESULT would be the current output of the ALU before any pipeline latches as seen in Figure 2. Q_ALU_RESULT would be the ALU_RESULT generated in the previous clock cycle. QQ_ALU_RESULT would be from two clock cycles ago. This method will mean that any “QQ_” signal in any module will be from 2 cycles ago (has gone through 2 pipeline registers). The other scheme represents how many registers are left to go for the signal in question. The big trade-off is that you will have to rename all the output port signal names in each *.vhd file that you pipeline, and also modify the top level mips.vhd file in two places to reflect the new output port signal names using this approach (i.e., most global variable names will change). ALU_Result Q_ALU_Result D Pipeline Register Q Figure 2: The alternative naming convention for new signals added during pipelining. In Dmemory.vhd, the data memory address and data inputs are already pipelined inside the altsyncram function used for data memory (this is why it has a clock input). You will need to take this into account when you pipeline your design. High speed memory writes almost always require a clock and the design in the book skips over this point – since they do not have their design running on real hardware. As an example, in the Quartus software you can’t even have memory without a clock! Currently in the original single cycle design, data memory uses NOT CLOCK as the clock input so that there is time to get both the correct ALU result loaded into the memories internal address and data pipeline registers (first half of clock cycle) and write to memory (second half of clock cycle). Once you pipeline the model, you will probably want to have your data memory clock input use CLOCK instead of NOT CLOCK for the fastest clock cycle time. With NOT CLOCK you would be loading the ALU Result into the pipeline register in the middle of the clock cycle (not the end) – so it would slow down the clock cycle time on real hardware. Since there is already a built-in pipeline register in the data memory inputs, don’t add another one in the address or data input paths to data memory, if you switch NOT CLOCK to CLOCK. You will still need to delay the ALU result two clocks (with two pipeline registers) for the register file write back operation. The control module should contain all of the control pipeline registers – 1, 2, or 3 stages of pipeline registers for control signals. Some control signals must be reset to zero, so use a D flip-flop with a synchronous reset for these pipeline registers. This generates a flip-flop with a Clear input that will be tied to Reset. Critical pipeline registers with control signals such as regwrite or memwrite should be cleared at reset so that the pipeline starts up correctly. The MIPS instruction sll $0, $0, 0 is all zeros in machine code and does not modify any values in registers or memory. It is used to initialize the IF/ID pipeline at reset. You will need to put new machine code in PROGRAM.MIF for the test program. Put an initial data value of 0xAAAA5555 in word 21 and 0x5555AAAA in word 25 for your programs memory test data (used by lw instructions in test program). Initialize other data locations to all 0’s. Chapter 4 of Computer Organization and Design The Hardware/Software Interface contains additional background information on pipelining. Note that the VHDL input signal, function_opcode in execute.vhd can be tied to the low six-bits of the pipelined instruction sign-extend field as shown in the text’s diagrams. Your simulation should show the major busses just as in part one. You can add other signals as needed. To display internal signals in the simulation, you may also need to make them outputs from the top-level module. Print out the simulation results. Label the simulation output to indicate how the instructions move through the pipeline. Note: This program deliberately does not contain any data or branch hazards. Hardware for forwarding and branch flushing will be added in later labs – so don’t try to fix all of those problems in this lab! In Figure 3, the processor is seen running the following MIPS instructions through the basic unimproved pipeline for several clock cycles: lw sub and or add $10, 20($1) $11, $2, $3 $12, $4, $5 $13, $6, $7 $14, $8, $9 Note that Figure 3 shows the value of major busses and control signals for each clock cycle. It will be useful when testing your pipelined processor design and verifying that your pipeline simulation has the correct control signals and results. IF : lw $ 1 0 , 2 0 ($ 1 ) 0 M u x 1 ID : b e fo re < 1 > E X : b e f o re < 2 > IF /ID M E M : b e fo re < 3 > ID /E X 00 C o n tro l WB 0 00 0 000 E X /M E M M E M /W B 00 000 M W B : b e fo re < 4 > WB 0 00 EX 0 00 0 0 0 M 0 WB 0 A dd Add A d d r e su lt In stru ctio n m em ory MemWrite A d d re s s A LU S rc R ead re g iste r 1 Instruction PC B ra n ch S h ift le ft 2 R e ad d a ta 1 R ead re g iste r 2 R e g iste r s R e a d W rite d a ta 2 re g iste r MemtoReg RegWrite 4 Zero A LU 0 M u x 1 W rite d a ta A LU re s u lt A d d re ss D a ta m em ory R ead d a ta 1 M u x 0 W rite d a ta In stru ctio n [1 5 – 0 ] S ig n e x te n d ALU co n tro l In stru c tio n [2 0 – 1 6 ] 0 M u x 1 In stru c tio n [1 5 – 1 1 ] C lo c k 1 IF : s u b $ 1 1 , $ 2 , $ 3 0 M u x 1 M em R ea d A LU O p R egD s t ID : lw $ 1 0 , 2 0 ($ 1 ) E X : b e f o re < 1 > IF /ID M E M : b e fo re < 2 > ID /E X 11 lw C o n tro l WB 0 10 0 001 M W B : b e fo re < 3 > E X /M E M M E M /W B 00 000 WB 0 00 EX 0 M 00 0 0 0 0 WB 0 A dd A d d r e Asud ltd In stru ctio n m em ory 1 X A LU S rc R ead re g iste r 1 R e ad d a ta 1 R ead re g iste r 2 R e g iste r s R e a d W rite d a ta 2 re g iste r $1 MemtoReg A d d re s s B ra n ch S h ift le ft 2 MemWrite PC Instruction RegWrite 4 Zero $X W rite d a ta A LU 0 M u x 1 A LU re s u lt A d d re ss D a ta m em ory R ead d a ta W rite d a ta C lo c k 2 20 In stru ctio n [1 5 – 0 ] 10 In stru c tio n [2 0 – 1 6 ] X In stru c tio n [1 5 – 1 1 ] S ig n e x te n d 20 10 X ALU co n tro l 0 M u x 1 A LU O p R egD s t M em R ea d 1 M u x 0 IF : a n d $ 1 2 , $ 4 , $ 5 0 M u x 1 ID : s u b $ 1 1 , $ 2 , $ 3 E X : lw $ 1 0 , . . . IF /ID ID /E X 10 su b C o n tro l WB 0 00 1 100 W B : b e fo re < 2 > E X /M E M M E M /W B 11 010 M M E M : b e fo re < 1 > WB 0 00 EX 1 00 0 0 0 M 0 WB 0 A dd Add A d d r e su lt In stru ctio n m em ory 2 3 A LU S rc R ea d r e g iste r 1 R ead d a ta 1 R ea d r e g iste r 2 R e g iste r s R e a d W rite d a ta 2 r e g iste r $2 $1 MemtoReg A d d re ss B ra n ch S h ift le ft 2 MemWrite PC Instruction RegWrite 4 Z e ro $3 A LU 0 M u x 1 W rite d a ta A LU re s u lt A d d re ss D a ta m em ory R ead d a ta 1 M u x 0 W r ite d a ta X In stru ctio n [1 5 – 0 ] X In stru ctio n [2 0 – 1 6 ] 11 In stru ctio n [1 5 – 1 1 ] S ig n e x te n d X 20 X 10 11 C lo c k 3 IF : o r $ 1 3 , $ 6 , $ 7 0 M u x 1 ALU c o n tro l 0 M u x 1 M em R ea d A LU O p R egD s t ID : a n d $ 1 2 , $ 2 , $ 3 E X: sub $11, . . . IF /ID ID /E X 10 and C o n tro l WB 0 00 1 100 M M E M : lw $ 1 0 , . . . E X /M E M W B : b e fo re < 1 > M E M /W B 10 000 WB 1 10 EX 0 M 11 0 1 0 0 WB 0 A dd A d d r e Asud ltd In stru ctio n m em ory 4 5 A LU S rc R ea d r e g iste r 1 R ead d a ta 1 R ea d r e g iste r 2 R e g iste r s R e a d W rite d a ta 2 r e g iste r $4 $2 $5 $3 W rite d a ta A LU 0 M u x 1 Z e ro A LU re s u lt MemtoReg A d d re ss B ra n ch S h ift le ft 2 MemWrite PC Instruction RegWrite 4 R ead d a ta A d d re ss D a ta m e m o ry W r ite d a ta C lo c k 4 X In stru ctio n [1 5 – 0 ] X In stru ctio n [2 0 – 1 6 ] 12 In stru ctio n [1 5 – 1 1 ] S ig n e x te n d X ALU c o n tro l X 12 11 0 M u x 1 M em R ea d A LU O p 10 R egD s t 1 M u x 0 IF : a d d $ 1 4 , $ 8 , $ 9 0 M u x 1 ID : o r $ 1 3 , $ 6 , $ 7 EX : and $12, . . . IF /ID ID /E X 10 or C o n tro l WB 00 0 11 00 E X /M E M W B : lw $ 1 0 , . . . M E M /W B 10 0 00 M M E M : sub $11, . . . WB 1 10 EX 0 10 0 0 0 M 1 WB 1 Add A dd A d d re su lt In stru c tio n m e m o ry 6 7 10 A L U S rc R ead re g is te r 1 R e ad d a ta 1 R ead re g is te r 2 R e g iste rs R e a d W rite d a ta 2 re g is te r $6 $4 $7 $5 MemtoReg A d d re s s B ra n ch S h ift le ft 2 MemWrite PC Instruction RegWrite 4 Z e ro W rite d a ta ALU 0 M u x 1 A LU re s u lt A d d re s s D a ta m e m o ry R ead d a ta 1 M u x 0 W rite d a ta C lo c k 5 IF : a fte r< 1 > X In stru c tio n [1 5 – 0 ] X In stru c tio n [2 0 – 1 6 ] 13 In stru c tio n [1 5 – 1 1 ] X S ig n e x te n d A LU co n tro l X 13 12 A LU O p 11 10 R e g D st ID : a d d $ 1 4 , $ 8 , $ 9 0 M u x 1 0 M u x 1 M em R ead EX : or $13, . . . IF /ID M E M : and $12, . . . ID /E X 10 a dd C o n tro l WB 00 0 11 00 M E X /M E M W B : su b $11 , . . . M E M /W B 10 0 00 WB 1 10 EX 0 M 10 0 0 0 1 WB 0 Add A d d re Asudltd In stru c tio n m e m o ry 8 9 11 A L U S rc R ead re g is te r 1 R e ad d a ta 1 R ead re g is te r 2 R e g iste rs R e a d W rite d a ta 2 re g is te r $8 $6 $9 $7 MemtoReg A d d re s s B ra n ch S h ift le ft 2 MemWrite PC Instruction RegWrite 4 Z e ro W rite d a ta ALU 0 M u x 1 A LU re s u lt A d d re s s D a ta m e m o ry R ead d a ta 1 M u x 0 W rite d a ta C lo c k 6 X In stru c tio n [1 5 – 0 ] X In stru c tio n [2 0 – 1 6 ] 14 In stru c tio n [1 5 – 1 1 ] S ig n e x te n d X A LU co n tro l X 14 13 0 M u x 1 M em R ead A LU O p 12 R e g D st 11 IF : a fte r< 2 > ID : a fte r< 1 > 0 M u x 1 EX: add $14, . . . IF /ID ID /E X 00 C o n tro l 000 000 0 WB E X /M E M W B: and $12, . . . M E M /W B 10 00 0 M M EM : or $13, . . . WB 1 10 EX 0 10 0 0 0 M 1 WB 0 A dd A dd A d d re su lt In s tru c tio n m e m o ry 12 MemWrite A d d re ss A L U S rc R e ad re g is te r 1 Instruction PC B ra n c h S h ift le ft 2 $8 R ea d d a ta 1 R e ad re g is te r 2 R e g is te rs R e a d W rite d a ta 2 re g is te r MemtoReg RegWrite 4 Z e ro $9 A LU 0 M u x 1 W rite d a ta A LU re su lt A d d re ss D a ta m e m o ry R e ad d a ta 1 M u x 0 W rite d a ta In s tru ctio n [1 5 – 0 ] S ig n e xte n d A LU co n tro l In s tru ctio n [2 0 – 1 6 ] In s tru ctio n [1 5 – 1 1 ] 14 C lo c k 7 IF : a fte r< 3 > ALUOp 13 12 R e g D st ID : a fte r< 2 > 0 M u x 1 0 M u x 1 Me m R ead E X : a fte r< 1 > IF /ID M EM : add $14, . . . ID /E X 00 C o n tro l 000 000 0 WB M E X /M E M W B: or $13, . . . M E M /W B 00 00 0 WB 0 00 EX 0 M 10 0 0 0 1 WB 0 A dd A d d re Asudltd In s tru c tio n m e m o ry R e ad re g is te r 1 13 MemWrite A d d re ss Instruction PC B ra n c h S h ift le ft 2 A L U S rc R ea d d a ta 1 R e ad re g is te r 2 R e g is te rs R e a d W rite d a ta 2 re g is te r MemtoReg RegWrite 4 Z e ro A LU 0 M u x 1 W rite d a ta A LU re su lt A d d re ss D a ta m e m o ry R e ad d a ta 1 M u x 0 W rite d a ta In s tru ctio n [1 5 – 0 ] In s tru ctio n [2 0 – 1 6 ] In s tru ctio n [1 5 – 1 1 ] C lo c k 8 S ig n e xte n d A LU co n tro l 0 M u x 1 Me m R ead ALUOp 14 R e g D st 13 IF: after<4> ID: after<3> 0 M u x 1 EX: after<2> IF/ID ID/EX 00 Control 000 0000 WB M MEM: after<1> WB: add $14, . . . EX/MEM MEM/WB 00 000 WB 0 00 EX 0 M 00 0 0 0 1 WB 0 Add Add Add result Instruction memory 14 Read data 1 Read register 2 Registers Read Write data 2 register Write data Zero ALU ALU result 0 M u x 1 MemtoReg Address ALUSrc Read register 1 Instruction PC Branch Shift left 2 MemWrite RegWrite 4 Address Data memory Read data 1 M u x 0 Write data Instruction [15–0] Sign extend Instruction [20–16] Instruction [15–11] Clock 9 ALU control 0 M u x 1 MemRead ALUOp 14 RegDst Figure 3: The processor is seen running the following MIPS instructions through the basic unimproved pipeline for several clock cycles: lw sub and or add $10, 20($1) $11, $2, $3 $12, $4, $5 $13, $6, $7 $14, $8, $9 A similar version of Figure 3 can be found in the optional CDROM advanced content material for Chapter 4 of the textbook in Section 4.12 in Figures 4.12.11-15. Figure 4 shows the single clock cycle MIPS model running on the Quartus Simulator. Note that the values of major busses, control signals, and the register file are all displayed. For any new additional signals to be available in the simulation display they may need to be made outputs from the top level module. Figure 4: Simulation Results for Single Clock Cycle MIPS VHDL Model in Quartus.