CMPEN 331 - Final Project
By Robert Cassidy & Dan McMahon
Section 1
Dr. Almekkawy

Abstract

For the Final Project, we were tasked with creating the final two stages of the MIPS architecture, memory and write back, and implementing them together with the instruction fetch, instruction decode, and execution stages from the previous labs on an FPGA. Our design assumes pipelining, so we placed registers between the stages accordingly. To meet the design requirements laid out in the project description, we first created modules for the basic components of the circuit using behavioral Verilog, just as we did before. We then instantiated these modules in their respective main modules, which connect them structurally to realize each stage of the MIPS architecture. We initialized a number of the necessary modules with the same values we used in our previous labs. Finally, we connected all 5 stages in one large testbench to showcase how they work together. The resulting design was run on our Zynq-7000 XC7Z010-1CLG400C board, and the results, from the board itself to the waveforms and the reports, are all shown below.

Introduction

To implement the MIPS architecture, we first constructed a module for each of the stages (Instruction Fetch, Instruction Decode, Execute, Memory, and Writeback). The instruction fetch stage was responsible for retrieving a one-word (32-bit) instruction and passing it, along with the incremented program counter, to the decode stage. In the decode stage, the upper 6 bits of the instruction (the opcode, bits 31:26) were analyzed to determine which operation would be executed. Additionally, the lower 16 bits of the instruction (bits 15:0) were sign extended to 32 bits in this stage, and two operands were read from the register file based on the register fields of the instruction. These results, as well as two 5-bit portions of the instruction (bits 20:16 and 15:11), were passed to the next stage, the execute stage.
In the execute stage, the ALU, controlled by both the opcode and the function field of the instruction, operated on the two arguments passed to it. It output the result of the operation as well as a “Zero” flag denoting whether the arguments were equal (this is necessary for branching). Additionally, an adder added the sign-extended value from the previous stage to the incremented program counter to form the branch target. These results, along with one of the aforementioned 5-bit portions of the instruction, selected by a mux, were passed to the next stage. In this stage, memory, the result of the ALU was used to look up an address in data memory and either read or write a value, depending on the opcode. The value read, the ALU result, and the 5 bits were passed to the writeback stage. In this stage, a mux, operating in accordance with the opcode, chose which result to send back to the instruction decode stage. Additionally, the 5 bits were sent back to the instruction decode stage to act as the address of the register to be written.

While implementing this design, registers were used to separate the stages. This allowed for pipelining, the act of operating on multiple instructions at a time. This is particularly useful because, ideally, once the pipeline is full one instruction completes every clock cycle, so a set of instructions finishes in a fraction of the time it would take without pipelining. A diagram of the MIPS architecture is shown above; it will be discussed more later.
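The role of the registers between stages can be illustrated with a small stand-alone sketch. It is not part of the project code: the module names pipe_reg and pipe_reg_demo, the WIDTH parameter, and the test values are hypothetical and chosen only for illustration. Two registers in series show a value advancing one stage per rising clock edge while a second value follows one cycle behind it, which is the overlap that pipelining relies on.

```verilog
`timescale 1ns/1ps

// Hypothetical generic pipeline register (illustrative only; the project
// uses dedicated registers such as If_Id_REG and ID_EX instead).
module pipe_reg #(parameter WIDTH = 32) (
    input                  clk,
    input      [WIDTH-1:0] d,
    output reg [WIDTH-1:0] q
);
    initial q = {WIDTH{1'b0}};          // known value at time zero
    always @(posedge clk) q <= d;       // capture the input on each rising edge
endmodule

// Simulation-only demo: two registers in series stand in for two stage
// boundaries of a pipeline.
module pipe_reg_demo;
    reg         clk = 0;
    reg  [31:0] in  = 32'd0;
    wire [31:0] s1, s2;

    pipe_reg #(32) r1 (.clk(clk), .d(in), .q(s1));
    pipe_reg #(32) r2 (.clk(clk), .d(s1), .q(s2));

    always #5 clk = ~clk;               // 10 ns clock period

    initial begin
        $monitor("t=%0t in=%0d s1=%0d s2=%0d", $time, in, s1, s2);
        #2  in = 32'd7;                 // first value enters the pipeline
        #10 in = 32'd9;                 // second value enters one cycle later
        #30 $finish;
    end
endmodule
```

Running pipe_reg_demo in a simulator should show the first value reaching s1 after one rising edge and s2 after two, with the second value occupying s1 at the same time the first occupies s2.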
Design Code Stage 1: INSTRUCTION FETCH Mux `timescale 1ns/1ps module mux2to1 (sel, a0, a1, out); input sel; input [31:0] a0, a1; output reg [31:0] out; /*initial begin out <= 31'd0; #10 out <= 31'd1; #10 out <= 31'd2; end*/ always @ (sel or a0 or a1) begin case(sel) 0:out = a0; 1:out = a1; endcase end endmodule Program Counter `timescale 1ns/1ps module program_counter (NPCin, NPCout, clk); input clk; input [31:0] NPCin; output reg [31:0] NPCout; initial begin NPCout <= 0; end always @ (posedge clk) begin //#1 NPCout <= NPCin; end endmodule Adder `timescale 1ns/1ps module INCR(IncrOut, a); input [31:0] a; output reg [31:0] IncrOut; always @(a) begin IncrOut = a + 1; end endmodule Instruction Memory `timescale 1ns/1ps module Instruction_memory (address, data); input [31:0] address; output reg [31:0] data; reg [31:0] memory [0:127]; initial begin memory [0] <= 32'h002300AA; memory [1] <= 32'h10654321; memory [2] <= 32'h00100022; memory [3] <= 32'h8C123456; memory [4] <= 32'h8F123456; memory [5] <= 32'hAD654321; memory [6] <= 32'h13012345; memory [7] <= 32'hAC654321; memory [8] <= 32'h12012345; memory [9] <= 32'h002300AA; memory [10] <= 32'h10654321; memory [11] <= 32'h00100022; memory [12] <= 32'h8C123456; memory [13] <= 32'h8F123456; memory [14] <= 32'hAD654321; memory [15] <= 32'h13012345; memory [16] <= 32'hAC654321; memory [17] <= 32'h12012345; memory [18] <= 32'h002300AA; memory [19] <= 32'h10654321; memory [20] <= 32'h00100022; memory [21] <= 32'h8C123456; memory [22] <= 32'h8F123456; memory [23] <= 32'hAD654321; memory [24] <= 32'h13012345; memory [25] <= 32'hAC654321; memory [26] <= 32'h12012345; memory [27] <= 32'h002300AA; memory [28] <= 32'h10654321; memory [29] <= 32'h00100022; memory [30] <= 32'h8C123456; memory [31] <= 32'h8F123456; end always @(address) begin data <= memory [address]; end endmodule Instruction Fetch/Instruction Decode Register `timescale 1ns/1ps module If_Id_REG ( input clock, input wire [31:0] NPCin, instin, output reg [31:0] NPCout, 
instout ); /*initial begin instout <= 0; NPCout <= 0; end*/ always @(posedge clock) begin NPCout <= NPCin; instout <= instin; end endmodule Main Module `timescale 1ns/1ps module IF(clock, PC_choose, EX_MEM_NPC, IF_ID_NPC, IF_ID_IR); input clock, PC_choose; input [31:0] EX_MEM_NPC; output [31:0] IF_ID_NPC, IF_ID_IR; wire [31:0] a0, a1, a2, a3; mux2to1 IF_MUX( .sel(PC_choose), .a0(a1), .a1(EX_MEM_NPC), .out(a2) ); program_counter IF_PROG( .clk(clock), .NPCin(a2), .NPCout(a0) ); INCR IF_INC( .a(a0), .IncrOut(a1) ); Instruction_memory IF_MEM( .address(a0), .data(a3) ); If_Id_REG IF_LATCH( .clock(clock), .NPCin(a1), .instin(a3), .NPCout(IF_ID_NPC), .instout(IF_ID_IR) ); endmodule Stage 2: INSTRUCTION DECODE Control Module `timescale 1ns/1ps module CONTROL ( input[5:0] opcode, output reg [1:0] WB, output reg [2:0] M, output reg [3:0] EX ); always @ (opcode) begin case (opcode) 6'b000000: begin EX = 4'b1100; M = 3'b000; WB = 2'b10; end 6'b000100: begin EX = 4'b1010; M = 3'b100; WB = 2'b00; end 6'b100011: begin EX = 4'b0001; M = 3'b010; WB = 2'b11; end 6'b101011 :begin EX = 4'b0001; M = 3'b001; WB = 2'b00; end /*default: begin EX = 4'b1100; M = 3'b000; WB = 2'b10;*/ endcase end endmodule Register Module `timescale 1ns/1ps module REG( input [4:0] read1, read2, write1, input [31:0] writedata, input wexert, output reg [31:0] data1, data2 ); reg [31:0] memory [0:31]; reg [4:0] temp1, temp2; initial begin memory [0] <= 32'h002300AA; memory [1] <= 32'h10654321; memory [2] <= 32'h00100022; memory [3] <= 32'h8C123456; memory [4] <= 32'h8F123456; memory [5] <= 32'hAD654321; memory [6] <= 32'h13012345; memory [7] <= 32'hAC654321; memory [8] <= 32'h12012345; memory [9] <= 32'h002300AA; memory [10] <= 32'h10654321; memory [11] <= 32'h00100022; memory [12] <= 32'h8C123456; memory [13] <= 32'h8F123456; memory [14] <= 32'hAD654321; memory [15] <= 32'h13012345; memory [16] <= 32'hAC654321; memory [17] <= 32'h12012345; memory [18] <= 32'h002300AA; memory [19] <= 32'h10654321; memory [20] 
<= 32'h00100022; memory [21] <= 32'h8C123456; memory [22] <= 32'h8F123456; memory [23] <= 32'hAD654321; memory [24] <= 32'h13012345; memory [25] <= 32'hAC654321; memory [26] <= 32'h12012345; memory [27] <= 32'h002300AA; memory [28] <= 32'h10654321; memory [29] <= 32'h00100022; memory [30] <= 32'h8C123456; memory [31] <= 32'h8F123456; end always @(read1 or read2 or wexert) begin data1 = memory [read1]; data2 = memory [read2]; if (wexert) memory [write1] <= writedata; end endmodule Sign Extend Module `timescale 1ns/1ps module S_EXTEND( input [15:0] in, output reg [31:0] out ); reg [15:0] temp; always @ (in) begin if (in[15] == 1) begin temp = 16'hffff; end else begin temp = 16'h0000; end out[31:16] <= temp[15:0]; out [15:0] <= in[15:0]; end endmodule ID_EX Module `timescale 1ns/1ps module ID_EX ( input clock, input [1:0] ctlwb_out, input [2:0] ctlm_out, input [3:0] ctlex_out, input [31:0] npc, readdat1, readdat2, signext_out, input [4:0] instr_2016, instr_1511, output reg RegDst, ALUSrc, output reg [1:0] ALUOp, output reg [1:0] wb_ctlout, output reg [2:0] m_ctlout, //output reg [3:0] ex_ctlout, output reg [31:0] npcout, readdat1out, readdat2out, s_extendout, output reg [4:0] instrout_2016, instrout_1511 ); initial begin //wb_ctlout <= 0; //m_ctlout <= 0; //ex_ctlout <= 0; //RegDst <= 0; //ALUOp <= 0; //ALUSrc <= 0; //npcout <= 0; //readdat1out <= 0; //readdat2out <= 0; //s_extendout <= 0; //instrout_2016 <= 0; //instrout_1511 <= 0; end always @(posedge clock) begin wb_ctlout <= ctlwb_out; m_ctlout <= ctlm_out; RegDst <= ctlex_out[3]; ALUOp <= ctlex_out[2:1]; ALUSrc <= ctlex_out[0]; npcout <= npc; readdat1out <= readdat1; readdat2out <= readdat2; s_extendout <= signext_out; instrout_2016 <= instr_2016; instrout_1511 <= instr_1511; end endmodule Main Module `timescale 1ns/1ps module ID(clock, IF_ID_IR,IF_ID_NPC, RegWrite, Write_reg, Write_data, WB1, M1, RegDst, ALUop, ALUsrc, ADD10, ALU0, MUX0, IR, BOTMUX0, BOTMUX1 ); input clock; input [31:0] IF_ID_IR, IF_ID_NPC; 
input RegWrite;
input [4:0] Write_reg;
input [31:0] Write_data;
output [1:0] WB1;
output [2:0] M1;
output RegDst;
output [1:0] ALUop;
output ALUsrc;
output [31:0] ADD10, ALU0, MUX0, IR;
output [4:0] BOTMUX0, BOTMUX1;
wire [1:0] a0;
wire [2:0] a1;
wire [3:0] a2;
wire [31:0] a3, a4, a5;
CONTROL ID_con (
  .opcode (IF_ID_IR[31:26]), .WB (a0), .M (a1), .EX (a2)
);
REG ID_register (
  .read1 (IF_ID_IR[25:21]), .read2 (IF_ID_IR[20:16]),
  .write1 (Write_reg), .writedata (Write_data), .wexert (RegWrite),
  .data1 (a3), .data2 (a4)
);
S_EXTEND ID_signextend (
  .in (IF_ID_IR[15:0]), .out (a5)
);
ID_EX ID_LATCH (
  .clock (clock), .ctlwb_out (a0), .ctlm_out (a1), .ctlex_out (a2),
  .npc (IF_ID_NPC), .readdat1 (a3), .readdat2 (a4), .signext_out (a5),
  .instr_2016 (IF_ID_IR[20:16]), .instr_1511 (IF_ID_IR[15:11]),
  .wb_ctlout (WB1), .m_ctlout (M1),
  .RegDst (RegDst), .ALUSrc (ALUsrc), .ALUOp (ALUop),
  .npcout (ADD10), .readdat1out (ALU0),
  .readdat2out (MUX0), //TO EX_MUX0 AND EX_MEM_LATCH
  .s_extendout (IR), .instrout_2016 (BOTMUX0), .instrout_1511 (BOTMUX1)
);
endmodule

Stage 3: INSTRUCTION EXECUTE

ALU_Control Module

`timescale 1ns/1ps
module ALU_CONTROL ( input[1:0] ALUOp, input [5:0] funct , output reg [2:0] contin );
always @ (ALUOp or funct) begin
  case (ALUOp)
    2'b00: begin contin = 3'b010; end
    2'b01: begin contin = 3'b110; end
    2'b10: begin
      if (funct === 6'b100000) contin = 3'b010;
      else if (funct === 6'b100010) contin = 3'b110;
      else if (funct === 6'b100100) contin = 3'b000;
      else if (funct === 6'b100101) contin = 3'b001;
      else if (funct === 6'b101010) contin = 3'b111;
      /*case (funct)
      6'b100000 : contin = 3'b010;
      6'b100010 : contin = 3'b110;
      6'b100100 : contin = 3'b000;
      6'b100101 : contin = 3'b001;
      6'b101010 : contin = 3'b111;
      default : contin <= 3'b010;
      endcase*/
    end
    /*default: begin contin = 3'b010;*/
  endcase
end
endmodule

Adder Module

`timescale 1ns/1ps
module ADDER( input [31:0] add_in1, add_in2, output reg [31:0] add_out );
always
@(add_in1 or add_in2) begin
  add_out <= add_in1 + add_in2;
end
endmodule

Mux Module

`timescale 1ns/1ps
module mux2to1 (sel, a0, a1, out);
input sel;
input [31:0] a0, a1;
output reg [31:0] out;
/*initial begin out <= 31'd0; #10 out <= 31'd1; #10 out <= 31'd2; end*/
always @ (sel or a0 or a1) begin
  case(sel)
    0:out = a0;
    1:out = a1;
  endcase
end
endmodule

ALU Module

`timescale 1ns/1ps
module alu(ctl, a0, a1, Out, Zero);
input [2:0] ctl;
input [31:0] a0,a1;
output reg [31:0] Out;
output reg Zero;
initial begin
  //Zero = 0;
end
always @(ctl or a0 or a1) begin
  /*case (ctl)
  0: Out = a0 && a1;
  1: Out = a0 || a1;
  2: Out = a0 + a1;
  6: Out = a1 - a0;
  7: begin if (a0 < a1) Out <= 1; else Out <= 0; end
  //12: Out <= ~(a0 | a1);
  default: Out <= 0;
  endcase*/
  if (ctl === 3'd0) Out = a0 & a1;        // bitwise AND of the 32-bit operands
  else if (ctl === 3'd1) Out = a0 | a1;   // bitwise OR of the 32-bit operands
  else if (ctl === 3'd2) Out = a0 + a1;
  else if (ctl === 3'd6) begin
    Out = a0 - a1;
    if (Out === 0) Zero = 1'b1;           // operands are equal (Zero is 1 bit wide)
    else Zero = 0;
  end
  else if (ctl === 3'd7)
    if (a0 < a1) Out = 32'd1;
    else Out = 32'd0;
end
endmodule

5-bit mux

`timescale 1ns/1ps
module fivebitmux (sel, a0, a1, out);
input sel;
input [4:0] a0, a1;
output reg [4:0] out;
always @ (sel or a0 or a1) begin
  case(sel)
    0:out = a0;
    1:out = a1;
  endcase
end
endmodule

EX_MEM Module

`timescale 1ns/1ps
module EX_MEM (
  input clock,
  input [1:0] ctlwb_out,
  input [2:0] ctlm_out,
  input aluzero,
  input [31:0] addout, aluout, readdat2,
  input [4:0] muxout,
  output reg [1:0] wbout,
  output reg MemRead, MemWrite, MEM_Branch,
  output reg [31:0] add_result,
  output reg zero_out,
  output reg [31:0] alu_result,
  output reg [31:0] rdata2out,
  output reg [4:0] five_bit_muxout
);
reg temp;
initial begin
  //wbout <=0;
  //MemRead <= 0;
  //MemWrite <= 0;
  //MEM_Branch <= 0;
  add_result <= 0;
  temp <= 0;
  #16 temp <= 1;
  //zero_out <= 0;
  //alu_result <=0;
  //rdata2out <= 0;
  //five_bit_muxout <= 0;
end
always @(posedge clock) begin
  wbout <= ctlwb_out;
  MemRead <= ctlm_out[1];
  MemWrite <= ctlm_out[0];
  MEM_Branch <= ctlm_out[2];
  if (temp === 1)
add_result <= addout; zero_out <= aluzero; alu_result <= aluout; rdata2out <= readdat2; five_bit_muxout <= muxout; end endmodule Main Module `timescale 1ns/1ps module EX(clock, WB1, M1, RegDst, ALUop, ALUsrc, ADD10, ALU0, MUX0, IR, BOTMUX0, BOTMUX1, WB2, MEMwrite, MEMread, MEMbranch, EX_MEM_NPC, Zero, DataAdress, WriteData, muxout); input clock; input [1:0] WB1; input [2:0] M1; input RegDst; input [1:0] ALUop; input ALUsrc; input [31:0] ADD10, ALU0, MUX0, IR; input [4:0] BOTMUX0, BOTMUX1; output [1:0] WB2; output MEMwrite, MEMread, MEMbranch; output [31:0] EX_MEM_NPC; output Zero; output [31:0] DataAdress, WriteData; output [4:0] muxout; wire [31:0] a0, a1, a3; wire [2:0] a2; wire a4; wire [4:0] a5; ALU_CONTROL alucon ( .ALUOp ( ALUop ), .funct ( IR[5:0] ), .contin ( a2 ) ); ADDER adder1 //created from last lab's incrementer ( .add_in1 (ADD10 ), .add_in2 (IR ), .add_out (a0 ) ); mux2to1 ALU_MUX ( .sel ( ALUsrc ), .a0 ( MUX0 ), .a1 ( IR ), .out ( a1 ) ); alu ALU ( .ctl ( a2 ), .a0 ( ALU0 ), .a1 ( a1 ), .Out ( a3 ), .Zero ( a4 ) ); fivebitmux BOTTOM_MUX ( .sel ( RegDst ), .a0 ( BOTMUX0 ), .a1 ( BOTMUX1 ), .out ( a5 ) ); EX_MEM EX_MEM1 ( .clock (clock ), .ctlwb_out ( WB1 ), .ctlm_out ( M1 ), .addout ( a0 ), .aluzero ( a4 ), .aluout ( a3 ), .readdat2 ( MUX0 ), .muxout ( a5 ), .wbout ( WB2 ), .MemRead ( MEMread ), .MemWrite ( MEMwrite ), .MEM_Branch ( MEMbranch ), .add_result ( EX_MEM_NPC ), .zero_out ( Zero ), .alu_result ( DataAdress ), .rdata2out ( WriteData ), .five_bit_muxout ( muxout ) ); endmodule Stage 4: MEMORY DataMem_MEMWB `timescale 1ns/1ps module D_MEM( input [31:0] Address, Write_data, input MemWrite, MemRead, output reg [31:0] Read_data ); reg [31:0] Data_memory [0:255]; initial begin Data_memory[0] <= 32'h002300AA; Data_memory[1] <= 32'h10654321; Data_memory[2] <= 32'h00100022; Data_memory[3] <= 32'h8C123456; Data_memory[4] <= 32'h8F123456; Data_memory[5] <= 32'hAD654321; Data_memory[6] <= 32'h13012345; Data_memory[7] <= 32'hAC654321; Data_memory[8] <= 
32'h12012345; Data_memory[9] <= 32'h002300AA; Data_memory[10] <= 32'h10654321; Data_memory[11] <= 32'h00100022; Data_memory[12] <= 32'h8C123456; Data_memory[13] <= 32'h8F123456; Data_memory[14] <= 32'hAD654321; Data_memory[15] <= 32'h13012345; Data_memory[16] <= 32'hAC654321; Data_memory[17] <= 32'h12012345; Data_memory[18] <= 32'h002300AA; Data_memory[19] <= 32'h10654321; Data_memory[20] <= 32'h00100022; Data_memory[21] <= 32'h8C123456; Data_memory[22] <= 32'h8F123456; Data_memory[23] <= 32'hAD654321; Data_memory[24] <= 32'h13012345; Data_memory[25] <= 32'hAC654321; Data_memory[26] <= 32'h12012345; Data_memory[27] <= 32'h002300AA; Data_memory[28] <= 32'h10654321; Data_memory[29] <= 32'h00100022; Data_memory[30] <= 32'h8C123456; Data_memory[31] <= 32'h8F123456; /*Data_memory[32] <= 32'hAD654321; Data_memory[33] <= 32'h13012345; Data_memory[34] <= 32'hAC654321; Data_memory[35] <= 32'h12012345; Data_memory[36] <= 32'h002300AA; Data_memory[37] <= 32'h10654321; Data_memory[38] <= 32'h00100022; Data_memory[39] <= 32'h8C123456; Data_memory[40] <= 32'h8F123456; Data_memory[41] <= 32'hAD654321; Data_memory[42] <= 32'h13012345; Data_memory[43] <= 32'hAC654321; Data_memory[44] <= 32'h12012345; Data_memory[45] <= 32'h002300AA; Data_memory[56] <= 32'h10654321; Data_memory[47] <= 32'h00100022; Data_memory[48] <= 32'h8C123456; Data_memory[49] <= 32'h8F123456; Data_memory[50] <= 32'hAD654321; Data_memory[51] <= 32'h13012345; Data_memory[52] <= 32'hAC654321; Data_memory[53] <= 32'h12012345; Data_memory[54] <= 32'h002300AA; Data_memory[55] <= 32'h10654321; Data_memory[56] <= 32'h00100022; Data_memory[57] <= 32'h8C123456; Data_memory[58] <= 32'h8F123456; Data_memory[59] <= 32'hAD654321; Data_memory[60] <= 32'h13012345; Data_memory[61] <= 32'hAC654321; Data_memory[62] <= 32'h12012345; Data_memory[63] <= 32'h002300AA; Data_memory[64] <= 32'h10654321; Data_memory[65] <= 32'h00100022; Data_memory[66] <= 32'h8C123456; Data_memory[67] <= 32'h8F123456; Data_memory[68] <= 32'hAD654321; 
Data_memory[69] <= 32'h13012345; Data_memory[70] <= 32'hAC654321; Data_memory[71] <= 32'h12012345; Data_memory[72] <= 32'h002300AA; Data_memory[73] <= 32'h10654321; Data_memory[74] <= 32'h00100022; Data_memory[75] <= 32'h8C123456; Data_memory[76] <= 32'h8F123456; Data_memory[77] <= 32'hAD654321; Data_memory[78] <= 32'h13012345; Data_memory[79] <= 32'hAC654321; Data_memory[80] <= 32'h12012345; Data_memory[81] <= 32'h002300AA; Data_memory[82] <= 32'h10654321; Data_memory[83] <= 32'h00100022; Data_memory[84] <= 32'h8C123456; Data_memory[85] <= 32'h8F123456; Data_memory[86] <= 32'hAD654321; Data_memory[87] <= 32'h13012345; Data_memory[88] <= 32'hAC654321; Data_memory[89] <= 32'h12012345; Data_memory[90] <= 32'h002300AA; Data_memory[91] <= 32'h10654321; Data_memory[92] <= 32'h00100022; Data_memory[93] <= 32'h8C123456; Data_memory[94] <= 32'h8F123456; Data_memory[95] <= 32'hAD654321; Data_memory[96] <= 32'h13012345; Data_memory[97] <= 32'hAC654321; Data_memory[98] <= 32'h12012345; Data_memory[99] <= 32'h002300AA; Data_memory[100] <= 32'h10654321; Data_memory[101] <= 32'h00100022; Data_memory[102] <= 32'h8C123456; Data_memory[103] <= 32'h8F123456; Data_memory[104] <= 32'hAD654321; Data_memory[105] <= 32'h13012345; Data_memory[106] <= 32'hAC654321; Data_memory[107] <= 32'h12012345; Data_memory[108] <= 32'h002300AA; Data_memory[109] <= 32'h10654321; Data_memory[110] <= 32'h00100022; Data_memory[111] <= 32'h8C123456; Data_memory[112] <= 32'h8F123456; Data_memory[113] <= 32'hAD654321; Data_memory[114] <= 32'h13012345; Data_memory[115] <= 32'hAC654321; Data_memory[116] <= 32'h12012345; Data_memory[117] <= 32'h002300AA; Data_memory[118] <= 32'h10654321; Data_memory[119] <= 32'h00100022; Data_memory[120] <= 32'h8C123456; Data_memory[121] <= 32'h8F123456; Data_memory[122] <= 32'hAD654321; Data_memory[123] <= 32'h13012345; Data_memory[124] <= 32'hAC654321; Data_memory[125] <= 32'h12012345; Data_memory[126] <= 32'h002300AA; Data_memory[127] <= 32'h10654321; Data_memory[128] <= 
32'h00100022; Data_memory[129] <= 32'h8C123456; Data_memory[130] <= 32'h8F123456; Data_memory[131] <= 32'hAD654321; Data_memory[132] <= 32'h13012345; Data_memory[133] <= 32'hAC654321; Data_memory[134] <= 32'h12012345; Data_memory[135] <= 32'h002300AA; Data_memory[136] <= 32'h10654321; Data_memory[137] <= 32'h00100022; Data_memory[138] <= 32'h8C123456; Data_memory[139] <= 32'h8F123456; Data_memory[140] <= 32'hAD654321; Data_memory[141] <= 32'h13012345; Data_memory[142] <= 32'hAC654321; Data_memory[143] <= 32'h12012345;*/ end always @ (MemRead or Address or MemWrite or Write_data) begin if (MemRead) begin Read_data <= Data_memory[Address]; end else begin Read_data <= 32'd0; end if (MemWrite) begin Data_memory[Address] <= Write_data; end end endmodule AND `timescale 1ns/1ps module AND( input ALU_zero, MEM_LATCH, output reg PCSrc ); reg temp; initial begin PCSrc <= 0; end always @ (ALU_zero or MEM_LATCH) begin temp <= ALU_zero && MEM_LATCH; if (temp) begin PCSrc <= 1; end else begin PCSrc <=0; end end endmodule MEM_WB Register `timescale 1ns/1ps module MEM_WB ( input clk, input [1:0] control_wb_in, input [31:0] Read_data_in, ALU_result_in, input [4:0] Write_reg_in, output reg RegWrite, output reg Mem2Reg, output reg [31:0] Read_data, mem_ALU_result, output reg [4:0] mem_Write_reg ); /*initial begin RegWrite <= 1'd0; Mem2Reg <= 1'd0; Read_data <= 32'd0; mem_ALU_result <= 32'd0; mem_Write_reg <= 5'd0; end*/ always @ (posedge clk) begin RegWrite <= control_wb_in[1]; Mem2Reg <= control_wb_in[0]; Read_data <= Read_data_in; mem_ALU_result <= ALU_result_in; mem_Write_reg <= Write_reg_in; end endmodule Main `timescale 1ns/1ps module MEM(clock, WB2, MEMwrite, MEMread, MEMbranch, Zero, DataAdress, WriteData, muxout, PC_choose, MUXSEL, RegWrite, LMUX0, LMUX1, Write_reg ); input clock; input [1:0] WB2; input MEMwrite, MEMread, MEMbranch; input Zero; input [31:0] DataAdress, WriteData; input [4:0] muxout; output PC_choose, MUXSEL, RegWrite; output [31:0] LMUX0, LMUX1; output [4:0] 
Write_reg;
wire [31:0] a0, a1, a2;
D_MEM DataMemMEMWB (
  .Address (DataAdress), .Write_data (WriteData),
  .MemWrite (MEMwrite), .MemRead (MEMread), .Read_data (a0)
);
AND AndGate (
  .ALU_zero (Zero), .MEM_LATCH (MEMbranch), .PCSrc (PC_choose)
);
MEM_WB MEMWBRegister (
  .clk (clock), .control_wb_in (WB2), .Read_data_in (a0),
  .ALU_result_in (DataAdress), .Write_reg_in (muxout),
  .RegWrite (RegWrite), .Mem2Reg (MUXSEL), .Read_data (LMUX1),
  .mem_ALU_result (LMUX0), .mem_Write_reg (Write_reg)
);
endmodule

Stage 5: WRITE BACK

Mux

`timescale 1ns/1ps
module mux2to1 (sel, a0, a1, out);
input sel;
input [31:0] a0, a1;
output reg [31:0] out;
/*initial begin out <= 31'd0; #10 out <= 31'd1; #10 out <= 31'd2; end*/
always @ (sel or a0 or a1) begin
  case(sel)
    0:out = a0;
    1:out = a1;
  endcase
end
endmodule

Main

`timescale 1ns/1ps
module WB( MUXSEL, LMUX0, LMUX1, WB_mux);
input MUXSEL;
input [31:0] LMUX0, LMUX1;
output [31:0] WB_mux;
mux2to1 con (
  .sel (MUXSEL), .a0 (LMUX0), .a1 (LMUX1), .out (WB_mux)
);
endmodule

Main Module

`timescale 1ns/1ps
module Final_Testbench (onswitch, clk, LED1);
input onswitch;
input clk;
output reg LED1;
//IF output wires
wire [31:0] IF_ID_IR, IF_ID_NPC;
//ID output wires
wire [1:0] WB1;
wire [2:0] M1;
wire RegDst;
wire [1:0] ALUop;
wire ALUsrc;
wire [31:0] ADD10, ALU0, MUX0, IR;
wire [4:0] BOTMUX0, BOTMUX1;
// EX output wires
wire [1:0] WB2;
wire MEMwrite, MEMread, MEMbranch;
wire [31:0] EX_MEM_NPC;
wire Zero;
wire [31:0] DataAdress, WriteData;
wire [4:0] muxout;
// MEM output wires
wire PC_choose, MUXSEL, RegWrite;
wire [31:0] LMUX0, LMUX1;
wire [4:0] Write_reg;
//WB wires
wire [31:0] WB_mux;
IF if1 (
  .PC_choose(PC_choose), .EX_MEM_NPC(EX_MEM_NPC),
  .IF_ID_IR(IF_ID_IR), .IF_ID_NPC(IF_ID_NPC), .clock(clk)
);
ID id1 (
  .clock(clk), .IF_ID_IR(IF_ID_IR), .IF_ID_NPC(IF_ID_NPC),
  .RegWrite(RegWrite), .Write_reg(Write_reg), .Write_data(WB_mux),
  .WB1(WB1), .M1(M1), .RegDst(RegDst), .ALUop(ALUop), .ALUsrc(ALUsrc),
  .ADD10(ADD10), .ALU0(ALU0),
.MUX0(MUX0), .IR(IR), .BOTMUX0(BOTMUX0), .BOTMUX1(BOTMUX1) ); EX ex1 ( .clock(clk), .WB1(WB1), .M1(M1), .RegDst(RegDst), .ALUop(ALUop), .ALUsrc(ALUsrc), .ADD10(ADD10), .ALU0(ALU0), .MUX0(MUX0), .IR(IR), .BOTMUX0(BOTMUX0), .BOTMUX1(BOTMUX1), .WB2(WB2), .MEMwrite(MEMwrite), .MEMread(MEMread), .MEMbranch(MEMbranch), .EX_MEM_NPC( EX_MEM_NPC), .Zero(Zero), .DataAdress(DataAdress), .WriteData(WriteData), .muxout(muxout) ); MEM mem1 ( .clock(clk), .WB2(WB2), .MEMwrite(MEMwrite), .MEMread(MEMread), .MEMbranch(MEMbranch), .Zero(Zero), .DataAdress(DataAdress), .WriteData(WriteData), .muxout(muxout), .PC_choose(PC_choose), .MUXSEL(MUXSEL), .RegWrite(RegWrite), .LMUX0( LMUX0), .LMUX1(LMUX1), .Write_reg(Write_reg) ); WB wb1 ( .MUXSEL(MUXSEL), .LMUX0(LMUX0), .LMUX1(LMUX1), .WB_mux(WB_mux) ); always begin if (WB_mux) LED1 <= WB_mux[0]; end endmodule Constraints ## This file is a general .xdc for the ZYBO Rev B board ## To use it in a project: ## - uncomment the lines corresponding to used pins ## - rename the used signals according to the project ##Clock signal ##IO_L11P_T1_SRCC_35 set_property PACKAGE_PIN L16 [get_ports clk] set_property IOSTANDARD LVCMOS33 [get_ports clk] create_clock -add -name sys_clk_pin -period 8.00 -waveform {0 4} [get_ports clk] ##Switches ##IO_L19N_T3_VREF_35 set_property PACKAGE_PIN G15 [get_ports onswitch] set_property IOSTANDARD LVCMOS33 [get_ports onswitch] ##IO_L24P_T3_34 #set_property PACKAGE_PIN P15 [get_ports {sw[1]}] #set_property IOSTANDARD LVCMOS33 [get_ports {sw[1]}] ##IO_L4N_T0_34 #set_property PACKAGE_PIN W13 [get_ports {sw[2]}] #set_property IOSTANDARD LVCMOS33 [get_ports {sw[2]}] ##IO_L9P_T1_DQS_34 #set_property PACKAGE_PIN T16 [get_ports {sw[3]}] #set_property IOSTANDARD LVCMOS33 [get_ports {sw[3]}] ##Buttons ##IO_L20N_T3_34 #set_property PACKAGE_PIN R18 [get_ports {btn[0]}] #set_property IOSTANDARD LVCMOS33 [get_ports {btn[0]}] ##IO_L24N_T3_34 #set_property PACKAGE_PIN P16 [get_ports {btn[1]}] #set_property IOSTANDARD LVCMOS33 
[get_ports {btn[1]}]
##IO_L18P_T2_34
#set_property PACKAGE_PIN V16 [get_ports {btn[2]}]
#set_property IOSTANDARD LVCMOS33 [get_ports {btn[2]}]
##IO_L7P_T1_34
#set_property PACKAGE_PIN Y16 [get_ports {btn[3]}]
#set_property IOSTANDARD LVCMOS33 [get_ports {btn[3]}]
##LEDs
##IO_L23P_T3_35
set_property PACKAGE_PIN M14 [get_ports LED1]
set_property IOSTANDARD LVCMOS33 [get_ports LED1]

Waveform

In the waveform, it can be seen that most values are initially undefined within the system. This is because the information hasn’t yet rippled through the pipeline to populate the fields. Each chunk that changes from red to green is representative of a stage's outputs. Interestingly, after each field has been populated, we still experience some undefined segments. We believe that this is a result of the address passed to the data memory in the memory stage being too large: the data memory holds 256 words, of which only the first 32 are initialized, so the program tries to look up an address that isn’t defined. This carries over to the LED being undefined for this portion, because the LED is controlled by the least significant bit of the result of the writeback stage mux, which is an undefined value. Other than that, the values shown in the diagram seem to indicate that the system works as intended. Also from the graph, you can see that the PC select bit (PC_choose in the code) is initialized. This was done because the mux in the instruction fetch stage needs to know which address to pass for the first couple of cycles. Although the select value is not obtained from analyzing instructions during those cycles, its correctness is assured because it can safely be assumed that you would never want to jump before evaluating a branch command.

Design Schematic

This is the diagram of the MIPS architecture discussed previously. You can plainly see the stages of the pipeline marked by each of the five rectangles, and the variables being passed between each stage are clearly highlighted.
The only true input to the system is the clock, since the rest is a cyclical, self-sustaining system, at least in this case where memory addresses are predefined. You can also see an additional input and an output system. These are a result of putting the whole architecture onto the programmable board. The input is the switch we set in the constraint file, and the LED, likewise, is the output we set in the constraint file. The logic that leads from the WB stage to the LED is likely just a way to reduce the relevant data to the single bit required to drive the LED. This shows that it was actually integrated into the system, and not simply a separate program running in conjunction with the MIPS project file. This diagram is a pictorial representation of our code, and it’s also an exact replica of the architecture we discussed ad nauseam in class. This diagram is a surefire example of our success.

Snapshot of the Routed Design

This is the routed design created from the synthesis and implementation of our project. The lights show activity on our board: which portions are activated and in use. Previous labs only saw small sections of the board lit up, but when the whole system is connected together the effects on the actual hardware become obvious. We’ve created a complex and integrated system and successfully ran it on real-life hardware.

Reports

Timing Report

The code from the previous labs largely remained the same, with some rewiring to fix both issues we hadn’t noticed previously and new ones that cropped up in the process of connecting everything together. The old errors consisted primarily of typos, while the new errors came from connecting our new modules back to the old ones, replacing previously static values with new variables. Speaking of new modules: both the memory and writeback stages were created from scratch for this project.
They were fairly simple to create, with basic components and a register similar to those of the previous stages. These were then combined in a fairly straightforward testbench where we connected each module just as we learned in lecture. All of this code is included above.

This is where the constraint file came in. We had to implement all of this code on the circuit board we were provided, and the only way to show that it was implemented correctly was to properly link it with the constraint file. This wasn’t an overly complicated process, as we just followed the tutorial provided. We tied the clock established in the testbench to the board clock, and then chose an arbitrary switch as our input and an LED as our output. This brought our code to life and gave it a practical application beyond just learning the process of creation and practicing our coding skills. A photo proving the effectiveness of this implementation is shown directly above the conclusion.

Everything synthesized and implemented without error, creating all of the images and reports already detailed. Of note is the timing report, which we haven’t discussed in previous labs. Due to the extreme length of the report, I included a brief summary at the end to capture its highlights for those who want an idea of the timing without reading the whole thing. The short of it is that everything worked as intended. Overall, our design seems to be a success on all fronts, and our understanding of the material seems to have improved tremendously as a result of working on this project.
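As listed above, the top-level module drives LED1 from an always block that has no sensitivity list, and the onswitch port is declared but never used. The following is a minimal sketch of one possible alternative, assuming the intent was to sample the write-back result on each clock edge while the switch is on. The module name led_driver and its port names are hypothetical and do not appear in the project code.

```verilog
`timescale 1ns/1ps

// Hypothetical helper (illustrative only): registers the least significant
// bit of the write-back value onto the LED once per clock cycle.
module led_driver (
    input        clk,       // board clock
    input        enable,    // e.g. the board switch
    input [31:0] wb_value,  // write-back mux result
    output reg   led
);
    initial led = 1'b0;     // start with the LED off

    always @(posedge clk) begin
        // Update only when enabled and the value is nonzero, mirroring the
        // original "if (WB_mux)" condition; otherwise hold the last value.
        if (enable && (|wb_value))
            led <= wb_value[0];
    end
endmodule
```

Under these assumptions it could be instantiated in the top-level module as led_driver drv (.clk(clk), .enable(onswitch), .wb_value(WB_mux), .led(LED1)), with LED1 then declared as a plain output wire rather than a reg. Whether gating on the switch matches the intended board behavior is a design choice the report does not specify.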