Microprocessor Design Trainer
Lab Manual

Enoch Hwang, Ph.D.

Copyright © 2011 by Enoch Hwang

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. For information regarding permission, email to enoch@hwangs.net.

Table of Contents

1 Microprocessor Design Trainer
  1.1 Microprocessor Design Trainer Hardware
  1.2 System Requirements
  1.3 Quartus Development Software Installation
  1.4 Driver Installation
  1.5 Testing the Microprocessor Design Trainer Board
2 Microprocessor Circuits
  2.1 Datapath
  2.2 Control Unit
3 Datapath Design
  3.1 Register Transfer Level
  3.2 Problem Specification
  3.3 Selecting Registers
  3.4 Selecting Functional Units
  3.5 Data Transfer Methods
    3.5.1 Multiple Sources
    3.5.2 Multiple Destinations
    3.5.3 Tri-state Bus
  3.6 Generating Status Signals
  3.7 Control Words
  3.8 Examples of Datapath Design
    3.8.1 Example 1: Datapath for a simple IF-THEN-ELSE problem
    3.8.2 Example 2: Datapath for the counting from 1 to 10 problem
4 Control Unit Design
  4.1 The State Diagram
  4.2 Examples of Control Unit Design
    4.2.1 Example 3: Control unit for a simple IF-THEN-ELSE problem
    4.2.2 Example 4: Control unit for the counting from 1 to 10 problem
5 Microprocessor Design
  5.1 Examples of Microprocessor Design
    5.1.1 Example 5: Microprocessor for the two-statement problem
    5.1.2 Example 6: Microprocessor for the counting from 1 to 10 problem
6 Labs
  6.1 Lab 1: Quartus Development Software
    6.1.1 Starting Quartus II
    6.1.2 Creating a New Project
    6.1.3 Using the Block Editor
    6.1.4 Managing Files in a Project
    6.1.5 Creating and Using a Logic Symbol
    6.1.6 Experiments
  6.2 Lab 2: Implementing a Circuit in Hardware
    6.2.1 Analysis and Synthesis
    6.2.2 Mapping the I/O Signals to the FPGA Pins
    6.2.3 Full Compilation
    6.2.4 Programming the FPGA
    6.2.5 Testing the Circuit in Hardware
    6.2.6 Experiments
  6.3 Lab 3: Counting from 1 to 10
    6.3.1 Experiments
  6.4 Lab 4: Countdown from Input n
    6.4.1 Experiments
  6.5 Lab 5: Count and Sum
    6.5.1 Experiments
  6.6 Lab 6: Greatest Common Divisor
    6.6.1 Experiments
  6.7 Lab 7: Summing Input Numbers
    6.7.1 Experiments
  6.8 Lab 8: Finding the Largest Number
    6.8.1 Experiments
  6.9 Lab 9: Hi-Lo Number Guessing Game
    6.9.1 Experiments
  6.10 Lab 10: The EC-1 General-Purpose Microprocessor
    6.10.1 Instruction Set
    6.10.2 Datapath
    6.10.3 Control Unit
    6.10.4 EC-1 Microprocessor Circuit
    6.10.5 Sample Program
    6.10.6 Hardware Implementation
    6.10.7 Experiments
  6.11 Lab 11: The EC-2 General-Purpose Microprocessor
    6.11.1 Instruction Set
    6.11.2 Datapath
    6.11.3 Control Unit
    6.11.4 EC-2 Microprocessor Circuit
    6.11.5 Sample Program
    6.11.6 Hardware Implementation
    6.11.7 Experiments
7 Appendix A – FPGA Pin Mappings

1 Microprocessor Design Trainer

1.1 Microprocessor Design Trainer Hardware

The Microprocessor Design Trainer contains all of the tools you need to implement custom-designed microprocessor circuits. This trainer is the third in a three-part series of our digital logic and microprocessor trainer course. The reader is assumed to already have a good understanding of the materials presented in the Combinational Logic Design Trainer (DL-010) and Sequential Logic Design Trainer (DL-020).

DO NOT connect the Microprocessor Design Trainer to your PC until after you have installed the development software and the necessary drivers.
See Sections 1.3 and 1.4 for instructions on installing the software and drivers.

The layout of the Microprocessor Design Trainer is shown in Figure 1.

Figure 1: Microprocessor Design Trainer layout. [Board labels: Microprocessor Trainer DL-030, Global Specialties; USB to computer; 1x8 header; push buttons PB2 to PB0; 7-segment displays HEX2 to HEX0; LED15 to LED0; SW15 to SW0.]

The following is a list of all of the components on the trainer:
• Altera Cyclone III (EP3C16F256C8N) FPGA
• 16 MHz clock
• Three 7-segment LED displays
• 16 red LEDs
• 16 toggle switches
• Three push button switches
• 8-pin header for general I/O
• VCC and GND connection points
• General breadboard area with 270 tie points
• USB connector for connecting the trainer to the development computer

The main component of the Microprocessor Design Trainer is the Altera Cyclone III (EP3C16F256C8N) FPGA chip. This chip will be used to implement all of the microprocessor circuits developed in this courseware.

The 16 LEDs and the three 7-segment LED displays are active high, which means that a logic 1 will turn the light on, and a logic 0 will turn the light off. The three push buttons, PB0, PB1 and PB2, are also active high, so pressing the button will produce a logic 1 signal. All of the 16 switches, SW0 to SW15, are configured so that when the switch is in the up (on) position the output is a logic 1, and when the switch is in the down (off) position the output is a logic 0.

The 8-pin header for general I/O, in conjunction with the general breadboard area, allows you to connect components that are not available on the trainer to your circuit that is implemented on the FPGA chip.
1.2 System Requirements

The following are the requirements for the development computer:
• PC running Windows XP or higher
• 10 GB of free disk space for installation, and 4 GB of disk space to run the program after installation
• USB port

1.3 Quartus Development Software Installation

The Altera Quartus II development software is required in order to design and implement your digital logic circuits on the Microprocessor Design Trainer board. The Web Edition of the Quartus II software is on the DVD disc that is included with the trainer. Put the disc in the disc drive and go into the Quartus II folder. Double click on the setup.exe file to run the installation program.

To save disk space, in the Select components screen as shown in Figure 2, you can de-select everything except for the Quartus II Web Edition software (Free) and the Cyclone III and Cyclone III LS Families. The entire installation process will take approximately 30 minutes.

Figure 2: The Select components screen during the software installation process.

Run the Quartus software by clicking on Start > All Programs > Altera > Quartus II 10.0 Web Edition > Quartus II 10.0 Web Edition.

In the Getting Started With Quartus II Software screen as shown in Figure 3, you can select the Don’t show this screen again check box at the bottom left corner if you don’t want to see this window again the next time you run Quartus. Close the window by clicking on the × at the top right corner.

Figure 3: The Getting Started With Quartus II Software screen.

After Quartus is launched, you will see the main Quartus screen as shown in Figure 4. For now, just exit the program by selecting File > Exit from the main Quartus menu.

Figure 4: The main Quartus II software screen.
1.4 Driver Installation

Start the Microprocessor Design Trainer programmer installation program setup.exe located in the folder D:\Microprocessor Design Trainer\programmer. Substitute the drive letter D with the correct letter of your CD-ROM drive. In the Save As window, select the folder C:\altera\10.0. Create a new folder and type in the name driver. Go into this new folder by double clicking on it. Click on the Save button.

The following driver installation instructions are for Windows XP. Other versions of Windows will be slightly different.

Connect the Microprocessor Design Trainer USB cable to your PC. After a few seconds, the Found New Hardware Wizard screen as shown in Figure 5 should come up. Select the Yes, this time only option, and then click Next. In the second screen as shown in Figure 6, select the Install from a list or specific location (Advanced) option, and then click Next.

Figure 5: The Found New Hardware Wizard screen.

Figure 6: Selecting the Advanced option.

In the next screen as shown in Figure 7, where it asks for the location of the driver, browse to the folder D:\Microprocessor Design Trainer\driver, and then click Next. Substitute the drive letter D with the correct letter of your CD-ROM drive.

Figure 7: Specifying the location of the driver folder.

When you see the message The software you are installing for this hardware has not passed Windows Logo testing, just click on the Continue Anyway button. Click on the Finish button after the driver has been installed.

You will repeat this same driver installation process two more times, one for the Arrow USB-Blaster Port B and one for the USB Serial Port.

At this point, you should have successfully installed both the Quartus development software and the needed drivers for the Microprocessor Design Trainer.
1.5 Testing the Microprocessor Design Trainer Board

Start the Quartus program. From the Quartus main menu, select File, then Open Project. Browse to the directory D:\Circuits\Demo. Select the project file named Top.qpf, and click Open. Under the Project Navigator window pane, click on the Files tab as shown in Figure 8. Here you will see a listing of all of the source files used for this Demo project.

Figure 8: Files for the Demo project as shown in the Project Navigator window pane.

In the toolbar as shown in Figure 9, click on the Start Compilation icon.

Figure 9: Quartus II main toolbar.

You can watch the compilation progress in the Tasks window pane, and also the Flow Summary statistics. At the completion of the compilation, you will see in the Message window pane a message telling you that the Quartus II full compilation was successful with 0 errors and some warnings. In most situations, you can ignore the warnings.

Next, we will upload the circuit for the project onto the Microprocessor Design Trainer development board. Connect the USB cable from the Microprocessor Design Trainer development board to your PC if it is not already connected. Click on the Programmer button in the toolbar. The first time you open this programmer window, it will say No Hardware at the top-left corner of the window. Click on the Hardware Setup button next to it to bring up the Hardware Setup window as shown in Figure 10. Click on the down arrow next to the Currently selected hardware to open the drop-down box. Select Arrow-USB-Blaster [USB], and then click the Close button. Back in the Programmer window, click on the Start button to upload the circuit onto the trainer board as shown in Figure 11. When the circuit is uploaded onto the board, the progress bar at the top-right corner of the programmer window should show 100% in green.
If the programming failed, check the USB connection between your PC and the trainer board and then click on the Start button again. You may have to first close the Programmer window and repeat the process.

On the trainer board, you should see the three 7-segment hex digits counting, and the 16 LEDs moving from right to left. If you press the push button PB0, everything will stop. If you press the push button PB1, the counting on the hex digits will stop but the 16 LEDs will continue to move. If you turn on (move to the up position) any one of the 16 switches, the LED above that switch will stay on.

Figure 10: Selecting the hardware Arrow-USB-Blaster for programming.

Figure 11: Successful programming of the logic circuit onto the trainer board.

2 Microprocessor Circuits

There are generally two types of microprocessors: general-purpose microprocessors and dedicated microprocessors. General-purpose microprocessors, such as the Intel Pentium® CPU, can perform different tasks under the control of software instructions. General-purpose microprocessors are used in all personal computers.

Dedicated microprocessors, also known as microcontrollers, on the other hand, are designed to perform just one specific task. For example, inside your cell phone there is a dedicated microprocessor that controls its entire operation. The embedded microprocessor inside the cell phone does nothing but control the operation of the phone. Dedicated microprocessors are therefore usually much smaller, and not as complex as general-purpose microprocessors.

Regardless of whether it is a general-purpose microprocessor or a dedicated microprocessor, the concept of designing a microprocessor is the same.
A microprocessor circuit is a digital circuit that is composed of many different combinational circuits and many different sequential circuits. In part I of this three-part series of microprocessor design training courseware, you learned how to design combinational circuits. In part II, you learned how to design sequential circuits. Now, in part III, you will learn how to put these different combinational and sequential circuits together to make a real working microprocessor.

A microprocessor circuit, as shown in Figure 12, is divided into two main components, the datapath and the control unit.

Figure 12: Block diagram of a microprocessor circuit. [Diagram labels: Control Inputs, Data Inputs, Control Unit (Next-state Logic, State Memory, Output Logic), Datapath (MUX, ALU, Register), Control Signals, Status Signals, Control Outputs, Data Outputs.]

2.1 Datapath

The datapath is responsible for performing all of the data manipulations needed by the microprocessor. For example, if the microprocessor has to perform some additions, then the datapath must have at least one adder. The datapath, therefore, must have all of the functional units necessary to perform all of the data operations needed by the microprocessor.

In general, a datapath will include (1) functional units such as adders, shifters, multipliers, ALUs, and comparators for data manipulations, (2) registers and other memory elements for the temporary storage of data, and (3) buses, multiplexers, and tri-state buffers for the transfer of data between the different components in the datapath, and the external world. In Figure 12, the datapath has an ALU for performing arithmetic and logical operations, a register for storing results from the ALU, a mux and a tri-state buffer for controlling the movement of data, and an OR gate acting as a comparator for generating the status signal.

Many of the functional units used inside the datapath have control signals.
For example, the mux and the ALU both have select lines, and the register has write and clear lines. Thus, the operation of the datapath is determined by which control signals are asserted or de-asserted and at what time. When we tested these components individually, we connected these control signals directly to input switches and set or reset them manually. However, when these functional units are connected together and used inside a datapath, we want them to work on their own without requiring any manual user intervention. Their control signals must therefore be generated automatically. In order for the datapath to operate correctly on its own, control signals need to be generated correctly and at the right time. This is the work of the control unit.

2.2 Control Unit

The control unit is responsible for controlling the entire operation of the datapath, and therefore the entire operation of the microprocessor, by sending the correct control signals at the right time to the datapath. The control unit is a finite state machine (FSM) consisting of three sub-components: the next-state logic, the state memory, and the output logic. In part II, Sequential Logic Design Trainer, you have already learned how to design stand-alone FSMs. The state memory is a register for remembering the current state that the FSM is in. The next-state logic is a combinational circuit for determining the next state for the FSM to go to, and it is dependent on the current state of the FSM and the status signals. The output logic is also a combinational circuit for generating the control signals based on the current state of the FSM.

Some of the control signals generated by the control unit are dependent on the data that is being manipulated within the datapath.
For example, a control signal may depend on the result of a conditional test on a number that is stored in a register. Hence, in order for the control unit to generate these control signals correctly, the datapath needs to supply status signals to the control unit. These status signals usually come from the output of comparators. A comparator tests for a given logical condition between two data values in the datapath. These values are obtained either from memory elements or directly from the output of functional units, or are hardwired as constants. The status signals provide information for the control unit to determine what state to go to next. For example, in a conditional loop, the status signal provides the result of the condition being tested, and tells the control unit whether to repeat or exit the loop.

The state diagram for deriving the control unit is dependent on the sequence and ordering of the steps involved in solving a particular problem. The solution to the problem will be given in the form of an algorithm. Hence, the sequencing of the steps, the repetition of steps, and the branching decisions of what steps to take will be known from the given algorithm.

The control unit and the datapath are connected together via the control signals and the status signals. The control unit generates the control signals for controlling the functional units that are inside the datapath. In return, the datapath generates status signals for the control unit so that the control unit can make the right decision as to what to do next.

Both the control unit and the datapath may have external I/O signals. The control unit may have an external reset input signal to reset the microprocessor. The control unit may also generate a done signal to the external world at the completion of an operation. The datapath may have input signals for inputting external data into the datapath and output signals for outputting results to the external world.
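
The interplay between the three parts of the control unit and a status signal from the datapath can be sketched in software. The Python model below is only an illustration under stated assumptions, not a circuit taken from this manual: the state names INC, TEST, and DONE, the status signal count_ne_10, and the control signals load and done are all hypothetical names chosen for the example.

```python
# Hypothetical three-state control unit driving a one-register datapath.
# State names and signal names are illustrative assumptions.

def next_state_logic(state, count_ne_10):
    """Combinational: next state from the current state and a status signal."""
    if state == "INC":
        return "TEST"
    if state == "TEST":
        return "INC" if count_ne_10 else "DONE"
    return "DONE"  # DONE is a terminal state


def output_logic(state):
    """Combinational: control signals depend only on the current state."""
    return {"load": state == "INC", "done": state == "DONE"}


def simulate(max_cycles=100):
    """Run the control unit and datapath; return (final count, cycles used)."""
    state = "INC"   # state memory, holding its reset value
    count = 0       # datapath register
    for cycle in range(max_cycles):
        control = output_logic(state)
        if control["done"]:
            return count, cycle
        count_ne_10 = count != 10            # status signal from a comparator
        upcoming = next_state_logic(state, count_ne_10)
        # Active clock edge: the datapath register and the state memory
        # both update from values computed during this cycle.
        if control["load"]:
            count = count + 1
        state = upcoming
    raise RuntimeError("control unit did not reach DONE")
```

Calling simulate() steps the model one clock cycle at a time until the done signal is asserted. Separating the test into its own state is a design choice made here so that the comparator reads the register only after the incremented value has been stored.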
In the following sections of this courseware, you will learn how to design the datapath, the control unit, and finally connect them together to form a microprocessor.

3 Datapath Design

In part I, Combinational Logic Design Trainer, you learned how to design functional units for performing single, simple data operations, such as the adder for adding two numbers or the comparator for comparing two values. The next logical question to ask is how do we design a circuit for performing more complex data operations or operations that involve multiple steps? For example, how do we design a circuit for adding four numbers or a circuit for adding a million numbers? For adding four numbers, we can connect three adders together, as shown in Figure 13(a). However, for adding a million numbers, we really don’t want to connect a million minus one adders together like that. Instead, we want a circuit with just one adder and to use it a million times. A datapath circuit allows us to do just that: perform operations involving multiple steps. Figure 13(b) shows a simple datapath using one adder to add as many numbers as we want. In order for this to be possible, a register is needed to store the temporary result after each addition. The temporary result from the register is fed back to the input of the adder so that the next number can be added to the current sum.

Figure 13: Circuits to add several numbers: (a) combinational circuit to add four numbers; (b) datapath to add one million numbers. [Diagram labels: (a) Number 1, Number 2, Number 3, Number 4 feeding three adders; (b) input of 1 million numbers feeding one adder and a register.]

The goal for designing a datapath is to build a circuit for solving a specific problem. So if the problem requires the addition of two numbers, the datapath must contain an adder. If the problem requires the storage of three temporary variables, the datapath must have three registers.
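
The feedback arrangement of Figure 13(b), one adder reused once per clock cycle with a register holding the running sum, can be mimicked in software. The Python sketch below is a rough model rather than the manual's circuit: the Accumulator class and its clock method are hypothetical names, and Python integers are unbounded, whereas a real register has a fixed width.

```python
class Accumulator:
    """One adder plus one register, reused once per clock cycle."""

    def __init__(self):
        self.register = 0  # holds the running sum between clock cycles

    def clock(self, number):
        # The adder combines the register's current output with the new input.
        adder_out = self.register + number
        # The register captures the adder result at the active clock edge.
        self.register = adder_out
        return self.register


def add_all(numbers):
    """Feed the numbers in one per clock cycle and return the final sum."""
    acc = Accumulator()
    for n in numbers:  # one register-transfer operation per number
        acc.clock(n)
    return acc.register
```

For instance, add_all([1, 2, 3, 4]) uses the same adder four times where the circuit of Figure 13(a) would need three separate adders.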
However, even with these requirements, there are still many options as to what actually is implemented in the datapath. For example, an adder can be implemented as a single adder circuit, or as part of the ALU. These functional units can also be used more than once. Registers can be separate register units or combined in a register file. Furthermore, two temporary variables can share the same register if they are not needed at the same time.

In the design process, we need to answer the following questions:
• What kind of registers to use, and how many are needed?
• What kind of functional units to use, and how many are needed?
• Can a certain functional unit be shared between two or more operations?
• How are the registers and functional units connected together so that all of the data movements specified by the problem can be realized?

3.1 Register Transfer Level

Datapath design is also referred to as register-transfer level (RTL) design. In a register-transfer level design, we look at how data is transferred from one register to another, or back to the same register. If the same data is written back to a register without any modifications, then nothing has been accomplished. Therefore, before the data is written to a register, it usually passes through one or more functional units and gets modified. Recall that the writing of data into a register occurs at the active edge of the clock, which is at the end of a clock cycle. Valid data from the register is available for reading at the beginning of the next clock cycle, i.e., shortly after the active clock edge.

The sequence of RTL operations (reading a value from a register, modifying the value by one or more functional units, and writing the modified value back into the same or a different register) is referred to as a register-transfer operation. Every register-transfer operation must complete within one clock cycle.
In other words, the functional units must finish their calculations and have valid results at their outputs before the end of the clock cycle so that when the register writes the value at the active clock edge, it will be the correct value. This implies that in a single register-transfer operation, i.e., in one clock cycle, a functional unit cannot be used more than once, since the result from the functional unit is saved only once at the end of the clock cycle. However, the same functional unit can be used again in a different register-transfer operation, which will be in a different clock cycle.

3.2 Problem Specification

In designing the datapath for a microprocessor, we will start by specifying the problem in the form of an algorithm using C-style pseudocode. For example, the assignment statement:

A = A + 3

takes the value that is stored in the variable A, adds the constant 3 to it, and stores the result back into A. Note that the initial value of A is irrelevant here. In order for the datapath to perform the data operation specified by this statement, the datapath must have a register for storing the value A. Furthermore, there must be an adder for performing the addition. The constant 3 can be hardwired into the circuit as a binary value.

The next question to ask is how to connect the register, the adder, and the constant 3 together so that the execution of the assignment statement can be realized. Recall that a value stored in a register is available at the Q output of the register. Since we want to add A + 3, we connect the Q output of the register to the first operand input of the adder, and connect the constant 3 to the second operand input. We want to store the result of the addition back into A (i.e., back into the same register), so we connect the output of the adder to the D input of the register, as shown in Figure 14(a).
The storing of the adder result into the register is accomplished by asserting the Load control signal of the register (i.e., asserting ALoad). This ALoad signal is an example of what we have been referring to as a datapath control signal. This control signal controls the operation of this datapath. The control unit, which we will talk about in the next chapter, will control this signal by either asserting or de-asserting it. The actual storing of the value into the register, however, does not occur immediately when ALoad is asserted. Since the register is synchronous to the clock signal, the actual storing of the value occurs at the next active clock edge. Because of this, the new value of A is not available at the Q output of the register during the current clock cycle, but is available at the beginning of the next clock cycle.

As another example, the datapath shown in Figure 14(b) can execute the statement:

A = B + C

where B and C are two different variables stored in two separate registers, thus providing the two operand inputs to the adder. The output of the adder is connected to the D input of the A register for storing the result of the adder.

Figure 14: Sample datapaths: (a) for performing A = A + 3; (b) for performing A = B + C.

The execution of the statement is realized simply by asserting the ALoad signal, and the actual storing of the value for A occurs at the next active edge of the clock.
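The timing described above can be sketched in software. The following is a minimal Python model, not the trainer's circuit: the class name Register, the method clock_edge, and the two run_ functions are hypothetical names chosen for this illustration, and the 8-bit width follows Figure 14. It shows that asserting Load does not change Q until the clock edge arrives.

```python
class Register:
    """8-bit register with a Load enable; Q changes only on a clock edge."""

    def __init__(self, value=0, width=8):
        self.mask = (1 << width) - 1
        self.q = value & self.mask   # Q output: value visible to the circuit
        self.d = 0                   # D input: value waiting to be stored
        self.load = 0                # Load control signal

    def clock_edge(self):
        # The stored value changes only here, and only if Load is asserted.
        if self.load:
            self.q = self.d & self.mask


def run_figure_14a(initial_a):
    """Model of A = A + 3: Q feeds the adder, the adder output feeds D."""
    a = Register(initial_a)
    a.load = 1                       # assert ALoad
    a.d = a.q + 3                    # combinational adder output this cycle
    before_edge = a.q                # Q still holds the old value
    a.clock_edge()                   # active clock edge
    after_edge = a.q                 # new value, visible in the next cycle
    return before_edge, after_edge


def run_figure_14b(b_value, c_value):
    """Model of A = B + C with B and C held in their own registers."""
    a, b, c = Register(), Register(b_value), Register(c_value)
    a.load = 1                       # only ALoad is driven; B and C are not written
    a.d = b.q + c.q
    a.clock_edge()
    return a.q


print(run_figure_14a(10))            # (10, 13)
print(run_figure_14b(20, 22))        # 42
```

Because the model masks values to 8 bits, a sum that exceeds 255 wraps around, which is one way an 8-bit adder without a carry output could behave.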
During the current clock cycle, the adder will perform the addition of B and C, and the result from the adder must be ready and available at its output before the end of the current clock cycle so that, at the next active clock edge, the correct result will be written into A. Since we are not writing any values to register B or C, we do not need to control the two Load signals for them.

If we want a single datapath that can perform both of the statements:

A = B + C and A = A + 3

we will need to combine the two datapaths in Figure 14 together as follows. Since A is the same variable in the two statements, only one register for A is needed. However, register A now has two data sources: one from the first adder for B + C, and the second from the second adder for A + 3. The problem is that two or more data sources cannot be connected directly together to one destination, as shown in Figure 15(a), because the two signals will collide, resulting in incorrect values. The solution is to use a multiplexer to select which of the two sources to pass to register A. The correct datapath using the multiplexer is shown in Figure 15(b).

Both statements assign a value to A, so ALoad must be asserted for the execution of both statements. The actual value that is written into A, however, depends on the selection of the multiplexer. If Amux is asserted, then the result from the bottom adder (i.e., the result from A + 3) is stored into A; otherwise, the result from the top adder is stored into A. Since the two adders are combinational circuits and the value from a register is always available at its output, the results from the two additions are always available at the two inputs of the multiplexer. Depending on the Amux control signal, however, only one value will be passed through to the D input of the A register.

Notice that the datapath does not show which statement is going to be executed first.
The sequence in which these two statements are executed depends on whether the signal Amux is asserted first or de-asserted first. If this datapath is part of a microprocessor, then the control unit would determine when to assert or de-assert this Amux control signal, since it is the control unit that performs the sequencing of the datapath operations.

Furthermore, notice that these two statements cannot be executed within the same clock cycle. Since both statements write to the same register, and a register can only latch in one value at an active clock edge, only one result from one adder can be written into the register in one clock cycle. The other statement will have to be performed in another clock cycle, but not necessarily the next clock cycle.

Figure 15: Datapath for performing A = A + 3 and A = B + C: (a) without multiplexer (wrong); (b) with multiplexer (correct).

Let us redesign the datapath to execute the two statements:

A = B + C and A = A + 3

but this time using only one adder. In order to execute the first statement, the first operand input to the adder is from register B, and the second operand input to the adder is from register C. However, to execute the second statement, the two input operands to the adder are register A and the constant 3. Since both input operands to the adder have two different sources, again we must use a multiplexer for each of them. The outputs of the two multiplexers connect to the two adder input operands, as shown in Figure 16.
For both statements, the result of the addition is stored in register A; therefore, the output of the adder connects to the input of the A register. Notice that the two select lines for the two multiplexers can be connected together. This is possible because the two operands B and C for the first statement are connected to input 0 of the two multiplexers, respectively, and the two operands A and 3 for the second statement are connected to input 1 of the two multiplexers, respectively. Thus, de-asserting the Mux select signal will pass the two correct operands for the first statement, and likewise, asserting the Mux select signal will pass the two correct operands for the second statement. We want to reduce the number of control signals for the datapath as much as possible, because (as we will see in the next chapter) minimizing the number of control signals will minimize the size of the output circuit in the control unit.

Figure 16: Datapath for performing A = A + 3 and A = B + C using only one adder.

3.3 Selecting Registers

In most situations, one register is needed for each variable used by the algorithm. However, if two variables are not used at the same time, then they can share the same register. If two or more variables share the same register, then the data transfer connections leading to the register and out from the register usually become more complex, since the register now has more than one source and destination. Having multiple destinations is not too big of a problem, since we can connect all of the destinations to the same source (see footnote 1). However, having multiple sources will require a multiplexer to select one of the several sources to transfer to the destination.
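As a rough software illustration of multiplexers selecting among multiple sources, the following Python sketch models the single-adder datapath of Figure 16. The function names mux2 and datapath_step, the dictionary of register values, and the sample operands are assumptions made for this illustration; they do not come from the trainer hardware.

```python
def mux2(sel, in0, in1):
    """2-to-1 multiplexer: passes in1 when sel is asserted, otherwise in0."""
    return in1 if sel else in0


def datapath_step(regs, aload, mux):
    """One clock cycle of the single-adder datapath (8-bit values)."""
    left = mux2(mux, regs["B"], regs["A"])    # input 0: B, input 1: A
    right = mux2(mux, regs["C"], 3)           # input 0: C, input 1: constant 3
    total = (left + right) & 0xFF             # the one shared adder
    nxt = dict(regs)                          # registers keep their values...
    if aload:                                 # ...unless ALoad is asserted
        nxt["A"] = total
    return nxt


regs = {"A": 0, "B": 5, "C": 7}
regs = datapath_step(regs, aload=1, mux=0)    # A = B + C
regs = datapath_step(regs, aload=1, mux=1)    # A = A + 3
print(regs["A"])                              # 15
```

A single mux argument drives both multiplexers, mirroring the shared select line: with mux de-asserted the adder sees B and C, and with mux asserted it sees A and the constant 3.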
Figure 17 shows a circuit with a register having two sources: one from an external input and one from the output of an adder. A multiplexer is needed in order to select which one of these two sources is to be the input to the register.

Figure 17: Circuit of a register with two sources.

After deciding how many registers are needed, we still need to determine whether to use a single register file containing enough register locations, separate individual registers, or a combination of both for storing the variables. Furthermore, registers with built-in special functions, such as shift registers and counters, can also be used. For example, if the algorithm has a FOR loop statement, a single counter register can be used not only to store the count variable but also to increment the count. This way, not only do we reduce the component count, but the number of datapath connections between components is also reduced. Decisions for selecting the type of registers to use will affect how the data transfer connections between the registers and functional units are made.

Footnote 1: This is true only theoretically. In practice, there are fan-in (multiple sources with one destination) and fan-out (one source with multiple destinations) limits that must be observed.

3.4 Selecting Functional Units

It is fairly straightforward to decide what kind of functional units are required. For example, if the algorithm requires the addition of two numbers, then the datapath must include an adder. However, we still need to decide whether to use a dedicated adder, an adder–subtractor combination, or an ALU (which has the addition operation implemented). Of course, these questions can be answered by knowing what other data operations are needed by the algorithm. If the algorithm has only an addition and a subtraction, then you may want to use the adder–subtractor combination unit.
On the other hand, if the algorithm requires several addition operations, do we use just one adder or several adders? Using one adder may decrease the datapath size in terms of the number of functional units, but it may also increase the datapath size because more complex data transfer paths are needed. For example, suppose the algorithm contains the following two addition operations:

a = b + c
d = e + f

Using two separate adders will result in the datapath shown in Figure 18(a), whereas using one adder will require two extra 2-to-1 multiplexers to select which register supplies each adder operand, as shown in Figure 18(b). Furthermore, this second datapath requires two extra control signals for the two multiplexers. In terms of execution speed, the datapath in (a) can execute both addition statements simultaneously within the same clock cycle, since they are independent of each other. However, the datapath in (b) will have to execute these two additions sequentially in two different clock cycles, since there is only one adder available. The final decision as to which datapath to use is up to you as the designer.

Figure 18: Datapaths for realizing two addition operations: (a) using two separate adders; (b) using one adder.

3.5 Data Transfer Methods

There are several methods by which the registers and functional units can be connected together so that the correct data transfers between the different units can be made.

3.5.1 Multiple Sources

If the input to a unit has more than one source, then a multiplexer can be used to select which one of the multiple sources to use. The sources can be from registers, constant values, or outputs from other functional units. Figure 19 shows two such examples. In Figure 19(a), the left operand of the adder has four sources: two from two different registers, one from the constant 1, and one from the output of an ALU.
In Figure 19(b), register a has two sources: one from the constant 1 and one from the output of an adder.

Figure 19: Examples of multiple sources using multiplexers: (a) an adder operand having four sources; (b) a register having two sources.

3.5.2 Multiple Destinations

A source having multiple destinations does not require any extra circuitry. The one source can be connected directly to the different destinations, and all of the destinations where the data is not needed simply ignore the data source. For example, in Figure 18(b), the output of the adder has two destinations: register a and register d. If the output of the adder is for register a, then the Load line for register a is asserted, while the Load line for register d is not; and if the output of the adder is for register d, then the Load line for register d is asserted, while the Load line for register a is not. In either case, only the correct register will load in the data, while the other register simply ignores the data by not loading it.

This also works if one of the destinations is a combinational functional unit. In this case, the functional unit will take the source data and manipulate it. However, the output of the functional unit will not be used (that is, not stored in any register), so functionally, it does not matter that the functional unit worked on the source. It does, however, require power for the functional unit to manipulate the data, so if we want to reduce the power consumption, we would want the functional unit not to manipulate the data at all. This is a power optimization issue that is beyond the scope of this courseware.

3.5.3 Tri-state Bus

Another scheme where multiple sources and destinations can be connected to the same data bus is through the use of tri-state buffers.
The point to note here is that, when multiple sources are connected to the same bus, only one source can output to the bus at any one time. If two or more sources output to the same bus at the same time, then there will be data conflicts. A conflict occurs when one source outputs a 0 while another source outputs a 1. By using tri-state buffers to connect the various sources to the common data bus, we can make sure that only one tri-state buffer is enabled at any one time, while the rest of them are all disabled. Tri-state buffers that are disabled output high-impedance Z values, so no data conflicts can occur.

Figure 20 shows a tri-state bus with five units (three registers, an ALU, and an adder) connected to it. An advantage of using a tri-state bus is that the bus is bi-directional, so that data can travel in both directions on the bus. Connections for data going from a component into the bus need to be tri-stated, while connections for data going from the bus to a component need not be. Notice also that the data input and output of a register can both be connected to the same tri-state bus, whereas the input and output of a functional unit (such as the adder or ALU) cannot be connected to the same tri-state bus.

Figure 20: Multiple sources using tri-state buffers to share a common data bus.

3.6 Generating Status Signals

Although it is the control unit that is responsible for the sequencing of statement execution, the datapath must supply the results of the conditional tests to the control unit so that the control unit can determine what statement to execute next. Status signals are the results of the conditional tests that the datapath supplies to the control unit. Every conditional test that the algorithm has requires a corresponding status signal. These status signals usually are generated by comparators.
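As a point of reference for what such a comparator computes, here is a small Python sketch of a general equality comparator built in the usual gate-level way: a per-bit XNOR followed by an AND across all bit positions. The function name equality_comparator and the width parameter are assumptions made for this illustration.

```python
def equality_comparator(x, y, width=8):
    """Return 1 when x == y, using per-bit XNOR followed by an AND of all bits."""
    result = 1
    for i in range(width):
        xi = (x >> i) & 1
        yi = (y >> i) & 1
        xnor = 1 - (xi ^ yi)      # 1 when the two bits match
        result &= xnor            # AND reduction across all bit positions
    return result


print(equality_comparator(0b00000101, 0b00000101))   # 1
print(equality_comparator(0b00000101, 0b00000100))   # 0
```

When one operand is a fixed constant, the XNOR gates reduce to plain wires or inverters, which is why comparisons against constants can be built from a single gate.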
For example, if the algorithm has the following IF statement

IF (A = 0) THEN …

the datapath must have an equality comparator that compares the value from the A register with the constant 0, as shown in Figure 21(a). The output of the comparator is the status signal for the condition (A = 0). This status signal is a 1 when the condition (A = 0) is true; otherwise, it is a 0. Recall that the circuit for the equality comparator with the constant 0 is simply a NOR gate. We can therefore replace the black box for the comparator in Figure 21(a) with just an 8-input NOR gate, as shown in Figure 21(b).

Figure 21: Comparator for generating the status signal (A = 0).

There are times when an actual comparator is not needed for generating a status signal. For example, if we want a status signal for testing whether a number is an odd number, as in the following IF statement

IF (A is an odd number) THEN …

we can simply use the A0 bit of the 8-bit number from register A as the status signal for this condition, since all odd numbers have a 1 in the zero bit position. The generation of this status signal is shown in Figure 22.

Figure 22: Comparator for generating the status signal (A is an odd number).

3.7 Control Words

Any given datapath will have a number of control signals. By asserting or de-asserting these control signals at different times, the datapath can perform different register-transfer operations. Since the execution of an operation requires the correct assertion and de-assertion of all of the control signals together, we would like to think of all of them as a unit rather than as individual signals. All of the control signals for a datapath, when grouped together, are referred to as a control word.
Hence, a control word will have one bit for each control signal in the datapath. One register-transfer operation of a datapath, therefore, is determined by the values set in one control word, and so we can specify the datapath operation simply by specifying the bit string for the control word. Each control word will be executed in one clock cycle to perform one register-transfer operation. By combining multiple control words together in a certain sequence, the datapath will perform the specified operations in the order given.

The datapath in Figure 16, having the two control signals ALoad and Mux, was designed to execute the two statements A = A + 3 and A = B + C. The control word for this datapath, therefore, has two bits, one for each control signal. The ordering of these two bits at this point is arbitrary; however, once decided, we need to be consistent with the order. The two control words for performing the two statements are shown in Figure 23. Control word 1 specifies the control word bit string for executing the statement A = A + 3. This is accomplished by asserting both the ALoad and the Mux signals. Control word 2 is for executing the statement A = B + C, by asserting ALoad and de-asserting Mux.

Control Word   Instruction   ALoad   Mux
1              A = A + 3     1       1
2              A = B + C     1       0

Figure 23: Control words for the datapath in Figure 16 for performing the two statements: A = A + 3 and A = B + C.

3.8 Examples of Datapath Design

We will now illustrate the design of datapaths with two examples. The datapaths produced in the examples are by no means the only correct datapaths for solving each of the problems. Just like writing a computer program, there are many ways of doing it.

3.8.1 Example 1: Datapath for a simple IF-THEN-ELSE problem

In this example, we want to construct a 4-bit-wide dedicated datapath for solving a simple IF-THEN-ELSE algorithm as shown in Figure 24.
To create a datapath for the algorithm, we need to look at all of the data manipulation statements in the algorithm, since the datapath is responsible for manipulating data. These data manipulation instructions are the register-transfer operations. In most cases, one data manipulation instruction is equivalent to one register-transfer operation. However, some data manipulation instructions may require two or more register-transfer operations to realize.

The algorithm uses two variables, A and B; therefore, the datapath should have two 4-bit registers, one for each variable. Line 1 of the algorithm inputs a value into A. In order to realize this operation, we need to connect the data input signals to the input of register A, as shown in Figure 25. By asserting the ALoad signal, the data input value will be loaded into register A at the next active clock edge.

Line 2 of the algorithm compares the value of A with the constant 5. The datapath in Figure 25 uses a 4-input AND gate for the equality comparator, with the four input bits connected as 0101 to the four output bits of register A. Since 5 in decimal is 0101 in binary, bits 0 and 2 are not inverted for the two 1's in the bit string, while bits 1 and 3 are inverted for the two 0's. With this connection, the AND gate will output a 1 when the input is a 5. The output of this comparator is the 1-bit status signal for the condition (A = 5) that the datapath sends to the control unit.

1  INPUT A
2  IF (A = 5) THEN
3      B = 8
4  ELSE
5      B = 13
6  END IF
7  OUTPUT B

Figure 24: Algorithm for solving a simple IF-THEN-ELSE problem.

Figure 25: Datapath for solving the simple IF-THEN-ELSE problem.
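The AND-gate comparator described above can be checked with a short Python sketch. The function name status_a_equals_5 is a hypothetical name for this illustration; the gate wiring (bits 0 and 2 direct, bits 1 and 3 inverted) follows the description of Figure 25.

```python
def status_a_equals_5(a):
    """4-input AND gate wired for 0101: bits 1 and 3 are inverted."""
    a0 = a & 1
    a1 = (a >> 1) & 1
    a2 = (a >> 2) & 1
    a3 = (a >> 3) & 1
    return a0 & (1 - a1) & a2 & (1 - a3)


# Try every 4-bit value and list the ones for which the gate outputs 1.
matches = [a for a in range(16) if status_a_equals_5(a)]
print(matches)   # [5]
```

Exhaustively trying all 16 input values is practical for a 4-bit datapath and confirms that only the value 5 asserts the status signal.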
Given the status signal for the comparison (A = 5), the control unit will decide whether to execute line 3 or line 5 of the algorithm. This decision is made by the control unit and not by the datapath. The datapath is responsible only for the register-transfer operations.

Lines 3 and 5 require loading either an 8 or a 13 into register B. In order to select one data value from several sources, a multiplexer is needed; in this case, a 2-to-1 multiplexer is used. One input of the multiplexer is connected to the constant 8 and the other to the constant 13. The output of the multiplexer is connected to the input of register B, so that one of the two constants can be loaded into the register. Again, which constant is to be loaded into the register depends on the condition in line 2. Knowing the result of the test from the status signal (A = 5), the control unit will generate the correct signal for the multiplexer select line, Muxsel. The actual loading of the value into register B is accomplished by asserting the BLoad signal.

Finally, the algorithm outputs the value from register B in line 7. This is accomplished by connecting a tri-state buffer to the output of the B register. To output the value, the control unit asserts the enable line, Out, on the tri-state buffer, and the value from the B register will be passed to the data output lines.

Notice that the complete datapath shown in Figure 25 consists of two separate circuits. This is because the algorithm does not require the values of A and B to be used together.

A question you might ask is whether you can connect the output of the comparator to the multiplexer select signal so that the status signal (A = 5) directly controls Muxsel. Logically, this might be all right, since if the condition (A = 5) is true, then the status signal is a 1. Assigning a 1 to Muxsel will select the 1 input of the multiplexer, thus passing the constant 8 to register B.
Otherwise, if the condition (A = 5) is false, then Muxsel will get a 0 from the comparator, and the constant 13 will pass through the multiplexer. The advantage of doing this is that the datapath will generate one less status signal and require one less control signal from the control unit. However, in some situations, we need to be careful with the timing when we use status signals to directly control the control signals.

Figure 26 shows the control words for performing the algorithm in Figure 24 using the datapath in Figure 25. Control word 1 executes the instruction INPUT A. To do this, the ALoad signal is asserted, and the data value at the input port will be loaded into the register at the next active clock edge. For this instruction, we do not need to load a value into the B register; hence, BLoad is de-asserted for this control word. Because of this, it does not matter what the multiplexer outputs, so Muxsel can be a don't-care value. For control words 2 and 3, we want to load one of the two constants into B; therefore, BLoad is asserted for both of these control words, and the value for Muxsel determines which constant is loaded into B. When Muxsel is asserted, the constant 8 is passed to the input of the B register, and when it is de-asserted, the constant 13 is passed to the register. Control word 4 asserts the Out signal to enable the tri-state buffer, thus outputting the value from the B register.

Control Word   Instruction   ALoad   Muxsel   BLoad   Out
1              INPUT A       1       ×        0       0
2              B = 8         0       1        1       0
3              B = 13        0       0        1       0
4              OUTPUT B      0       ×        0       1

Figure 26: Control words for solving the simple IF-THEN-ELSE problem.

3.8.2 Example 2: Datapath for the counting from 1 to 10 problem

Construct a 4-bit-wide dedicated datapath to generate and output the numbers from 1 to 10. The algorithm for this counting problem is shown in Figure 27.
From the algorithm, we see that again we need a 4-bit register for storing the value for i. For line 3, an adder can be used for incrementing i. Both lines 1 and 3 write a value into i, thus providing two sources for the register. Our first inclination might be to use a 2-input multiplexer. However, notice that loading a 0 into a register is equivalent to clearing the register with the asynchronous Clear line, and this is all right as long as the timing is correct. The resulting datapath is shown in Figure 28(a). For line 1, we assert the Clear signal to initialize i to 0, and for line 3, we assert the iLoad signal to load in the result from the adder, which adds a 1 to the current value of i. Asserting Out will output i. The status signal for the conditional test (i ≠ 10) is realized by the 4-input NAND gate, where the four input bits of the NAND gate are connected to the four output lines from the register as 1010 binary for the constant decimal 10.

Alternatively, instead of using a separate register and adder, we can use a single 4-bit up counter to implement the entire algorithm, as shown in Figure 28(b). Again, initializing i to 0 is accomplished by asserting the Clear signal. To increment i for line 3, we simply assert the Count signal. Generating the status signal and outputting the count are the same as before.

The control words for the two datapaths in Figure 28 are shown in Figure 29(a) and (b), respectively. For both cases, asserting the Clear signal will initialize i to 0. To increment i for the datapath in Figure 28(a), we need to assert iLoad. This will load in the value from the output of the adder, which is the result for i + 1, since one operand of the adder is i, and the other operand is the constant 1. For the datapath in Figure 28(b), we simply have to assert Count to increment i. The internal counter will increment the content in the register.
Control word 3 asserts the Out signal, which asserts the enable signal on the tri-state buffer, thus passing the content of the register to the output port. Note that control words 2 and 3 (corresponding to lines 3 and 4 in the algorithm, respectively) must be executed ten times in order to output the ten numbers. The looping in the algorithm is implemented in the control unit, and we will see in the next chapter how it is done.

1  i = 0
2  WHILE (i ≠ 10) {
3      i = i + 1
4      OUTPUT i
5  }

Figure 27: Algorithm for solving the counting problem.

Figure 28: Datapath for solving the counting problem: (a) using a separate adder and register; (b) using a single up counter.

(a)
Control Word   Instruction   iLoad   Clear   Out
1              i = 0         0       1       0
2              i = i + 1     1       0       0
3              OUTPUT i      0       0       1

(b)
Control Word   Instruction   Count   Clear   Out
1              i = 0         0       1       0
2              i = i + 1     1       0       0
3              OUTPUT i      0       0       1

Figure 29: Control words for solving the counting problem: (a) using a separate adder and register datapath; (b) using a single up counter datapath.

4 Control Unit Design

In the last chapter, you learned how to construct and use a datapath to implement an algorithm. All data manipulation instructions in the algorithm are converted to control words, and each control word is executed in one clock cycle to perform one register-transfer operation. However, in order for the datapath to automatically execute all of the control words for the algorithm, a control unit is needed to generate the appropriate control word signals at each clock cycle. The control unit inside the microprocessor is a finite state machine (FSM).
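To make the role of the control unit concrete, the following Python sketch drives a model of the register-and-adder counting datapath of Figure 28(a) with the control words of Figure 29(a). A plain Python loop stands in for the sequencing that the FSM would perform, which is a simplification made for this sketch. The names step, status_not_ten, and run_counting are hypothetical, and the asynchronous Clear is modeled as simply taking effect within the step.

```python
# Control words from Figure 29(a), as (iLoad, Clear, Out) tuples.
CW_INIT = (0, 1, 0)   # i = 0
CW_INCR = (1, 0, 0)   # i = i + 1
CW_OUT = (0, 0, 1)    # OUTPUT i


def bit(value, n):
    """Return bit n of value."""
    return (value >> n) & 1


def status_not_ten(i):
    """4-input NAND wired for 1010: output is 0 only when i is 10."""
    return 1 - (bit(i, 3) & (1 - bit(i, 2)) & bit(i, 1) & (1 - bit(i, 0)))


def step(i, control_word, outputs):
    """Apply one control word for one clock cycle; return the new i."""
    i_load, clear, out = control_word
    if out:
        outputs.append(i)          # tri-state buffer enabled
    if clear:
        return 0                   # Clear forces the register to 0
    if i_load:
        return (i + 1) & 0xF       # load the adder result (4 bits)
    return i


def run_counting():
    outputs = []
    i = 7                          # arbitrary power-up value
    i = step(i, CW_INIT, outputs)
    while status_not_ten(i):       # this loop plays the control unit's role
        i = step(i, CW_INCR, outputs)
        i = step(i, CW_OUT, outputs)
    return outputs


print(run_counting())              # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The loop issues control words 2 and 3 ten times, consulting the (i ≠ 10) status signal before each pass, which is the sequencing decision that the datapath alone cannot make.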
In part II of this trainer, Sequential Logic Design Trainer, you learned how to design finite state machines. Recall that FSMs operate by moving from one state to another as specified by a state diagram, and in each state, different output signals are generated. For each state that the control unit is in, the output logic inside the control unit will generate all of the appropriate control signals in a control word for the datapath to perform one register-transfer operation. So every register-transfer operation must complete within one clock cycle, which is equivalent to one state of the FSM, since the FSM changes state at every clock cycle.

In addition to generating the control signals, the control unit is also needed to control the sequencing of the instructions in the algorithm. The datapath is responsible only for the manipulation of the data; it only performs the register-transfer operations. It is the control unit that determines when each register-transfer operation is to be executed and in what order. The sequencing done by the control unit is established during the derivation of the state diagram.

4.1 The State Diagram

The state diagram shows what register-transfer operation is executed in what state and the sequencing of the execution of these register-transfer operations. A state is created for each control word, and each state is executed in one clock cycle. The edges in the state diagram are determined by the sequence in which the instructions in the algorithm are executed. The sequential execution of instructions is represented by unconditional transitions between states (i.e., edges with no labels). Execution branches in the algorithm are represented by conditional transitions from a state with two outgoing edges: one with the label for when the condition is true and the other with the label for when the condition is false.
If there is more than one condition, then all possible combinations of these conditions must be labeled on the outgoing edges from every state. These conditions are the status signals generated by the datapath and passed to the next-state logic in the FSM.

By stepping through a sequence of states as specified in the state diagram, the control unit will be able to control the datapath to perform all of the register-transfer operations in the correct order as specified by the algorithm.

Once the state diagram is derived, the actual construction of the control unit circuit follows the same procedure as for the synthesis of FSM circuits discussed in part II of this trainer.

Let us now design the control unit for controlling the datapath shown in Figure 16 for executing the two statements A = B + C and A = A + 3 using only one adder. The control words for this datapath from Figure 23 are shown again next.

Control Word   Instruction   ALoad   Mux
1              A = A + 3     1       1
2              A = B + C     1       0

Since there are two control words, two states, s0 and s1, are needed. In state s0 the control unit will generate the control signals for executing control word 1 by setting both ALoad and Mux to a 1. From state s0 the FSM will unconditionally transition to state s1. In state s1 the control unit will set the control signals ALoad to a 1 and Mux to a 0. The state diagram for this FSM is shown in Figure 30(a). The reason for the unconditional edge that goes from state s1 and back to itself is that we want the FSM to stop in that state at the end of the execution. The corresponding next-state table and next-state equation are shown in Figure 30(b). Since there are only two states, only one D flip-flop is needed for the state memory to encode these two states. We will use the encoding 0 for state s0 and encoding 1 for state s1.
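The two-state diagram just described can be sketched in a few lines of Python. This is only an illustrative model, not part of the trainer software: the dictionary layout and the names STATE_DIAGRAM and walk are assumptions made for this sketch, while the states, control-word values, and transitions are taken from the text above.

```python
# Hypothetical model of the two-state diagram described in the text.
# Each state lists the control word it outputs and its single
# unconditional next state (s1 loops back to itself to halt).
STATE_DIAGRAM = {
    "s0": {"control_word": {"ALoad": 1, "Mux": 1}, "next": "s1"},  # control word 1
    "s1": {"control_word": {"ALoad": 1, "Mux": 0}, "next": "s1"},  # control word 2
}


def walk(cycles, start="s0"):
    """Return the (state, control word) pair seen in each clock cycle."""
    state = start
    trace = []
    for _ in range(cycles):
        trace.append((state, STATE_DIAGRAM[state]["control_word"]))
        state = STATE_DIAGRAM[state]["next"]
    return trace


if __name__ == "__main__":
    for state, word in walk(4):
        print(state, word)
```

Running the sketch for a few cycles shows the FSM leaving s0 after one clock cycle and then staying in s1, which is the halting behavior that the self-loop on s1 is meant to provide.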
The output table, as obtained directly from the Control Word table, is shown in Figure 30(c) along with the two output equations (which are the control signals). Finally, the complete control unit circuit is shown in Figure 30(d). The state memory simply consists of the one D flip-flop with asynchronous clear signal. The asynchronous clear signal is connected to the global Reset signal so that on reset, the FSM will start from state s0. The Clock signal to the flip-flop is always connected directly to the global system clock. Both the next-state logic circuit and the output logic circuit are combinational circuits and are constructed from the next-state equations and output equations, respectively.

(a) [State diagram not reproduced: state s0 (Control Word 1, A = A + 3) goes unconditionally to state s1 (Control Word 2, A = B + C); s1 loops back to itself.]

(b)
Current State Q0   Next State (Implementation) Q0next (D0)
0                  1
1                  1

D0 = 1

(c)
Current State Q0   ALoad   Mux
0                  1       1
1                  1       0

ALoad = 1
Mux = Q0'

(d) [Circuit not reproduced: one D flip-flop with D0 tied to '1', clocked by Clock and cleared by Reset; ALoad is tied to '1' and Mux is taken from Q0'.]

Figure 30: Construction of the control unit for the two-statement problem: (a) State diagram; (b) Next-state table and next-state equation; (c) Output table and output equations; (d) circuit for the control unit.

4.2 Examples of Control Unit Design

We will now illustrate the design of control units with two examples.

4.2.1 Example 3: Control unit for a simple IF-THEN-ELSE problem

In this example, we will construct the control unit for controlling the datapath from Example 1 for the simple IF-THEN-ELSE problem. The algorithm, datapath, and control words from Example 1 are repeated here in Figure 31 for convenience.

In Example 1, we designed the datapath for the algorithm shown in Figure 31(a). From the algorithm, we see that there are four data manipulation instructions: lines 1, 3, 5, and 7. Line 2 is not a data manipulation statement, but rather, it is a control statement. From these four data manipulation instructions, we have come up with the four corresponding control words.
The datapath and the control words are shown in Figure 31 (b) and (c).

(a)
1   INPUT A
2   IF (A = 5) THEN
3     B = 8
4   ELSE
5     B = 13
6   END IF
7   OUTPUT B

(b) [Datapath diagram not reproduced: a 4-bit register A (Load driven by ALoad) is loaded from Input, and bits A3 to A0 generate the status signal (A = 5); a 2-to-1 multiplexer controlled by Muxsel selects '8' (input 1) or '13' (input 0) for a 4-bit register B (Load driven by BLoad); the Out signal enables a tri-state buffer that drives Output.]

(c)
Control Word   Instruction   ALoad   Muxsel   BLoad   Out
1              INPUT A       1       ×        0       0
2              B = 8         0       1        1       0
3              B = 13        0       0        1       0
4              OUTPUT B      0       ×        0       1

Figure 31: The IF-THEN-ELSE problem from Example 1: (a) algorithm; (b) datapath; (c) control words.

To design the control unit for this problem, we start by assigning the four control words to four separate states in the state diagram, as shown in Figure 32(a). The states are given the symbolic names s0, s1, s2, and s3, and are annotated with the control word and instruction that each is assigned to execute. In state s0, we want the control unit to generate the control signals for control word 1 to execute the instruction INPUT A. In state s1, we want the control unit to generate the control signals for control word 2 to execute the instruction B = 8. In state s2, we want the control unit to generate the control signals for control word 3 to execute the instruction B = 13. State s3 executes the instruction OUTPUT B.

The sequence in which the states are connected follows the sequence of the instructions in the algorithm. The FSM starts from the reset state s0, which inputs a value into register A. After executing line 1 in the algorithm, the execution of lines 3 and 5 depends on the condition (A = 5) of the IF statement in line 2. This condition is represented by the two outgoing edges from state s0: one edge going to state s1 with the label (A = 5) for when the condition is true, and one edge going to state s2 with the label (A = 5)' for when the condition is false.
The execution of line 7 follows immediately after either line 3 or 5; hence, there is an unconditional edge from both states s1 and s2 going to state s3. After executing line 7, the algorithm terminates; this is represented by the unconditional edge from s3 going back to itself.

Having derived the state diagram, the actual construction of the control unit circuit is exactly the same as for constructing general FSM circuits. The next-state table is shown in Figure 32(b). Since there is a total of four states, two flip-flops are needed to encode them. For simplicity, the straight binary encoding scheme is used for encoding the states. Hence, state s0 is encoded as Q1Q0 = 00, state s1 is encoded as Q1Q0 = 01, and so on.

In the next-state table, these four states are assigned to four rows, each labeled with its symbolic state name and its encoding. In addition to the four current states listed down the rows of the table, the next state of the FSM is also dependent on the status signal for the test condition (A = 5). Thus, we have the last two columns in the table: one column with the label (A = 5)' for when the condition is false and one column with the label (A = 5) for when the condition is true. The two flip-flops and one status signal give us a total of three variables (or 2^3 = 8 different combinations) to consider in the next-state table. Each next-state entry in the table is obtained from the state diagram by looking at the corresponding current state and the edges leading out from that state to see what the next state is. For example, looking at the state diagram shown in Figure 32(a), the edge with the label (A = 5) leading out from state s0 goes to state s1. Correspondingly, in the next-state table, the next-state entry at the intersection of row s0 (00) and the column labeled (A = 5) has the value s1 (01).
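The table-filling procedure just described is mechanical, so it can be sketched in Python. The names ENCODING, EDGES, and next_state_table are assumptions made for this illustration; the edges and the straight binary encodings are the ones described in the text for the IF-THEN-ELSE state diagram.

```python
# Straight binary state encodings (Q1Q0) as given in the text.
ENCODING = {"s0": "00", "s1": "01", "s2": "10", "s3": "11"}

# Edges of the state diagram: (current state, value of (A = 5)) -> next state.
# An unconditional edge is listed once for each value of the condition.
EDGES = {
    ("s0", False): "s2", ("s0", True): "s1",
    ("s1", False): "s3", ("s1", True): "s3",
    ("s2", False): "s3", ("s2", True): "s3",
    ("s3", False): "s3", ("s3", True): "s3",
}


def next_state_table():
    """Map each current-state code to (next code if false, next code if true)."""
    table = {}
    for state, code in ENCODING.items():
        table[code] = tuple(ENCODING[EDGES[(state, cond)]] for cond in (False, True))
    return table


if __name__ == "__main__":
    print("Q1Q0  (A = 5)'  (A = 5)")
    for code, (when_false, when_true) in next_state_table().items():
        print(f"{code}    {when_false}        {when_true}")
```

Each row of the result gives the D1D0 values for one current state, with one entry per value of the status signal, which covers all eight combinations of the three variables.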
After completing the next-state table as shown in Figure 32(b), we can proceed to derive the next-state equations based on the next-state table. The next-state equations are used to derive the next-state circuit for generating the D inputs to the state memory flip-flops. Since we have used two D flip-flops, two next-state equations (one for D0, and one for D1) are needed. Visualize this next-state table as being two separate tables, where the leftmost bit in each entry in the table is for the D1 equation, and the rightmost bit is for the D0 equation. These two equations are dependent on the three variables, Q1, Q0, and (A = 5), which represent the current state and status signal, respectively. We have optionally used the K-map method to reduce the size of the equations as shown in Figure 32(c). The reduced next-state equations for D0 and D1 are also shown in Figure 32(c). Having derived the next-state equations, it is trivial to draw the next-state logic circuit based on these equations.

The output logic circuit for the FSM is derived from the control word signals and the states to which the control words are assigned. Recall that the control signals control the operation of the datapath, and now we are constructing the control unit to control the datapath. So what the control unit needs to do is to generate and output the appropriate control signals in each state to execute the instruction that is assigned to that state. In other words, the control signals for controlling the operation of the datapath are simply the output signals from the output logic circuit in the FSM.

To derive the output table, we take the control word table and replace all of the control word numbers with the actual encoding of the state to which that control word is assigned. For example, looking at the state diagram shown in Figure 32(a), control word 1 is assigned to state s0. So in the output table, we put in the value 00 instead of the control word number 1.
The value 00 is the encoding that we have given to state s0, and it represents the current state value for the two flip-flops, Q1 and Q0. The output table and the resulting output equations are shown in Figure 32(d) and (e), respectively.

Once we have derived the next-state equations and output equations, we can easily draw the control unit circuit shown in Figure 32(f). The state memory simply consists of the two D flip-flops with asynchronous clear signals. The two asynchronous clear signals to the flip-flops are connected to the global Reset signal so that on reset, the FSM will start from state s0. All of the Clock signals to the flip-flops are always connected directly to the global system clock. Both the next-state logic circuit and the output logic circuit are combinational circuits and are constructed from the next-state equations and output equations, respectively.

(a) [State diagram not reproduced: s0 (Control Word 1, INPUT A) goes to s1 (Control Word 2, B = 8) on (A = 5) and to s2 (Control Word 3, B = 13) on (A = 5)'; s1 and s2 go unconditionally to s3 (Control Word 4, OUTPUT B); s3 loops back to itself.]

(b)
                     Next State (Implementation) Q1next Q0next (D1D0)
Current State Q1Q0   (A = 5)'   (A = 5)
s0 00                s2 10      s1 01
s1 01                s3 11      s3 11
s2 10                s3 11      s3 11
s3 11                s3 11      s3 11

(c) [K-maps for D1 and D0 not reproduced.]

D1 = (A = 5)' + Q0 + Q1
D0 = (A = 5) + Q0 + Q1

(d)
Q1Q0   ALoad   Muxsel   BLoad   Out
00     1       ×        0       0
01     0       1        1       0
10     0       0        1       0
11     0       ×        0       1

(e)
ALoad = Q1'Q0'
Muxsel = Q1'
BLoad = Q1' ⊕ Q0'
Out = Q1Q0

(f) [Circuit not reproduced: the status signal (A = 5) from the datapath feeds the next-state logic; the state memory consists of two D flip-flops (D1/Q1 and D0/Q0) with Clock and Reset; the output logic generates the control signals ALoad, Muxsel, BLoad, and Out to the datapath.]

Figure 32: Construction of the control unit for the IF-THEN-ELSE problem: (a) state diagram; (b) next-state table; (c) K-maps and next-state equations; (d) output table; (e) output equations for the four control signals; (f) control unit circuit.

4.2.2 Example 4: Control unit for the counting from 1 to 10 problem

In this example, we will construct the control unit for controlling the datapath from Example 2 to generate and output the numbers from 1 to 10. The algorithm, datapath, and control words from Example 2 are repeated here in Figure 33 for convenience.

In Example 2, we designed the datapath for the algorithm. From the algorithm, we see that there are three data manipulation instructions: lines 1, 3, and 4. Line 2 is not a data manipulation statement, but rather, it is a control statement. From these three data manipulation instructions, we have come up with the three corresponding control words. The datapath and the control words are shown in Figure 33 (b) and (c).

(a)
1   i = 0
2   WHILE (i ≠ 10){
3     i = i + 1
4     OUTPUT i
5   }

(b) [Datapath diagram not reproduced: an adder adds '1' to the output of a 4-bit register i (Load driven by iLoad, Clear, Clock); bits i3 to i0 generate the status signal (i ≠ 10); the Out signal enables a tri-state buffer that drives the 4-bit Output.]

(c)
Control Word   Instruction   iLoad   Clear   Out
1              i = 0         0       1       0
2              i = i + 1     1       0       0
3              OUTPUT i      0       0       1

Figure 33: The counting problem from Example 2: (a) algorithm; (b) datapath; (c) control words.

To design the control unit for this problem, we start by assigning the three control words to three separate states in the state diagram, as shown in Figure 34(a). The states are given the symbolic names s0, s1, and s2, and are annotated with the control word and instruction that each is assigned to execute. In state s0, we want the control unit to generate the control signals for control word 1 to execute the instruction i = 0. In state s1, we want the control unit to generate the control signals for control word 2 to execute the instruction i = i + 1. State s2 executes the instruction OUTPUT i. We have also added a fourth state, s3, for exiting the WHILE loop and halting the execution of the algorithm.
The sequence in which the states are connected follows the sequence of the instructions in the algorithm. The FSM starts from the reset state s0, which initializes i to 0. After executing line 1 in the algorithm, the execution of line 3 depends on the condition in the WHILE loop. Since line 1 is executed in state s0 and line 3 is executed in state s1, transitioning from state s0 to s1 depends on the test condition (i ≠ 10). This condition is represented by the two outgoing edges from state s0: one edge going to state s1 with the label (i ≠ 10) for when the condition is true, and one edge going to state s3 with the label (i ≠ 10)' for when the condition is false. The execution of line 4 follows immediately after line 3; hence, there is an unconditional edge from state s1 to s2. From state s2, there are the same two conditional edges as from state s0 for testing whether to repeat the WHILE loop or not. If the condition (i ≠ 10) is true, then the FSM will go back to state s1; otherwise, the FSM will exit the WHILE loop and go to state s3. The FSM halts in state s3 by having an unconditional edge going back to itself. No register-transfer operation is assigned to state s3. Having derived the state diagram, the actual construction of the control unit circuit is exactly the same as for constructing general FSM circuits. The next-state table is shown in Figure 34(b). Since there is a total of four states, two flip-flops are needed to encode them. For simplicity, the straight binary encoding scheme is used for encoding the states. Hence, state s0 is encoded as Q1Q0 = 00, state s1 is encoded as Q1Q0 = 01, and so on. In the next-state table, these four states are assigned to four rows, each labeled with the state name and their encoding. In addition to the four current states listed down the rows of the table, the next state of the FSM is also dependent on the status signal for the test condition (i ≠ 10). 
Thus, we have the last two columns in the table: one column with the label (i ≠ 10)' for when the condition is false and one column with the label (i ≠ 10) for when the condition is true. The two flip-flops and one status signal give us a total of three variables (or 2^3 = 8 different combinations) to consider in the next-state table. Each next-state entry in the table is obtained from the state diagram by looking at the corresponding current state and the edges leading out from that state to see what the next state is. For example, looking at the state diagram shown in Figure 34(a), the edge with the label (i ≠ 10) leading out from state s0 goes to state s1. Correspondingly, in the next-state table, the next-state entry at the intersection of row s0 (00) and the column labeled (i ≠ 10) has the value s1 (01).

After completing the next-state table as shown in Figure 34(b), we can proceed to derive the next-state equations based on the next-state table. The next-state equations are used to derive the next-state circuit for generating the D inputs to the state memory flip-flops. Since we have used two D flip-flops, two next-state equations (one for D0, and one for D1) are needed. Visualize this next-state table as being two separate tables, where the leftmost bit in each entry in the table is for the D1 equation, and the rightmost bit is for the D0 equation. These two equations are dependent on the three variables, Q1, Q0, and (i ≠ 10), which represent the current state and status signal, respectively. We have optionally used the K-map method to reduce the size of the equations as shown in Figure 34(c). The reduced next-state equations for D0 and D1 are also shown in Figure 34(c). Having derived the next-state equations, it is trivial to draw the next-state logic circuit based on these equations.
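As a quick sanity check on the reduction step, the short Python sketch below compares the reduced equations given in Figure 34(c), D1 = (i ≠ 10)' + Q0 and D0 = Q1 + Q0', against every row of the next-state table. The table is transcribed from the state diagram described above; the names NEXT_STATE, d1, d0, and equations_match_table are assumptions made for this sketch.

```python
from itertools import product

# Next-state table for the counting problem, keyed by (Q1, Q0, (i != 10)),
# with the next state given as (D1, D0).
NEXT_STATE = {
    (0, 0, 0): (1, 1), (0, 0, 1): (0, 1),  # s0 -> s3 (exit) or s1 (loop)
    (0, 1, 0): (1, 0), (0, 1, 1): (1, 0),  # s1 -> s2 unconditionally
    (1, 0, 0): (1, 1), (1, 0, 1): (0, 1),  # s2 -> s3 (exit) or s1 (repeat)
    (1, 1, 0): (1, 1), (1, 1, 1): (1, 1),  # s3 -> s3 (halt)
}


def d1(q1, q0, neq10):
    """Reduced equation D1 = (i != 10)' + Q0."""
    return int((not neq10) or q0)


def d0(q1, q0, neq10):
    """Reduced equation D0 = Q1 + Q0'."""
    return int(q1 or (not q0))


def equations_match_table():
    """True if both equations reproduce all eight rows of the table."""
    return all(
        (d1(*row), d0(*row)) == NEXT_STATE[row]
        for row in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print("Equations match table:", equations_match_table())
```

An exhaustive comparison like this is practical here because three variables give only eight rows; it catches transcription slips in either the table or the K-map groupings.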
The output logic circuit for the FSM is derived from the control word signals and the states to which the control words are assigned. Recall that the control signals control the operation of the datapath, and now we are constructing the control unit to control the datapath. So what the control unit needs to do is to generate and output the appropriate control signals in each state to execute the instruction that is assigned to that state. In other words, the control signals for controlling the operation of the datapath are simply the output signals from the output logic circuit in the FSM.

To derive the output table, we take the control word table and replace all of the control word numbers with the actual encoding of the state to which that control word is assigned. For example, looking at the state diagram shown in Figure 34(a), control word 1 is assigned to state s0. So in the output table, we put in the value 00 instead of the control word number 1. The value 00 is the encoding that we have given to state s0, and it represents the current state value for the two flip-flops, Q1 and Q0. Since there is no control word or instruction assigned to the halting state, s3 (11), all of the control signals for this state can be de-asserted. The output table and the resulting output equations are shown in Figure 34(d) and (e), respectively.

Once we have derived the next-state equations and output equations, we can easily draw the control unit circuit shown in Figure 34(f). The state memory simply consists of the two D flip-flops with asynchronous clear signals. The two asynchronous clear signals to the flip-flops are connected to the global Reset signal so that on reset, the FSM will start from state s0. All of the Clock signals to the flip-flops are always connected directly to the global system clock.
Both the next-state logic circuit and the output logic circuit are combinational circuits and are constructed from the next-state equations and output equations, respectively.

(a) [State diagram not reproduced: s0 (Control Word 1, i = 0) goes to s1 (Control Word 2, i = i + 1) on (i ≠ 10) and to s3 (Halt) on (i ≠ 10)'; s1 goes unconditionally to s2 (Control Word 3, OUTPUT i); s2 goes back to s1 on (i ≠ 10) and to s3 on (i ≠ 10)'; s3 loops back to itself.]

(b)
                     Next State (Implementation) Q1next Q0next (D1D0)
Current State Q1Q0   (i ≠ 10)'   (i ≠ 10)
s0 00                s3 11       s1 01
s1 01                s2 10       s2 10
s2 10                s3 11       s1 01
s3 11                s3 11       s3 11

(c) [K-maps for D1 and D0 not reproduced.]

D1 = (i ≠ 10)' + Q0
D0 = Q1 + Q0'

(d)
Q1Q0   iLoad   Clear   Out
00     0       1       0
01     1       0       0
10     0       0       1
11     0       0       0

(e)
iLoad = Q1'Q0
Clear = Q1'Q0'
Out = Q1Q0'

(f) [Circuit not reproduced: the status signal (i ≠ 10) from the datapath feeds the next-state logic; the state memory consists of two D flip-flops (D1/Q1 and D0/Q0) with Clock and Reset; the output logic generates the control signals iLoad, Clear, and Out to the datapath.]

Figure 34: Construction of the control unit for the counting problem: (a) state diagram; (b) next-state table; (c) K-maps and next-state equations; (d) output table; (e) output equations for the three control signals; (f) control unit circuit.

5 Microprocessor Design

In building the final microprocessor, we simply have to combine the control unit together with the datapath. This involves the connection of all of the control signals and the status signals together between the two units. All of the clock signals from both the datapath and the control unit are connected together to the master clock signal, and all of the reset signals are connected together to a master reset switch.

5.1 Examples of Microprocessor Design

We will now illustrate the design of microprocessors with two examples.
5.1.1 Example 5: Microprocessor for the two-statement problem

For the two-statement problem from Section 3.2, we have derived the datapath shown in Figure 16 and repeated here in Figure 35(a) for convenience. To simplify the microprocessor schematic drawing, we will represent the datapath circuit with the logic symbol shown in Figure 35(b). The logic symbol simply consists of all of the input and output signals for this circuit.

[Figure 35 diagrams not reproduced. (a) Datapath with three 8-bit registers A, B, and C, a 2-to-1 multiplexer controlled by Mux with the constant '3' as one input, and one adder; register A is loaded under control of ALoad. (b) Datapath symbol DP with inputs ALoad, Mux, and Clock. (c) Control unit with one D flip-flop (D0 tied to '1', Clock, Reset) and outputs ALoad and Mux. (d) Control unit symbol CU with inputs Clock and Reset and outputs ALoad and Mux. (e) CU and DP connected through the control signals ALoad and Mux, sharing Clock, with Reset going to the CU.]

Figure 35: Construction of the microprocessor for the two-statement problem: (a) datapath; (b) symbol for the datapath; (c) control unit; (d) symbol for the control unit; (e) microprocessor circuit.

Similarly, the control unit for this datapath derived in Section 4 and repeated here in Figure 35(c) is represented by the logic symbol shown in Figure 35(d). The final microprocessor circuit is made by connecting these two components together using their corresponding signals as shown in Figure 35(e). In this example, there are no external data signals.

5.1.2 Example 6: Microprocessor for the counting from 1 to 10 problem

For the counting from 1 to 10 problem, we have derived the datapath shown in Figure 28(a) and the control unit shown in Figure 34(f). These two components are repeated here in Figure 36 (a) and (c) for convenience, and their respective logic symbols are shown in Figure 36 (b) and (d). The final microprocessor circuit is made by connecting these two components together using their corresponding signals as shown in Figure 36(e).
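To see the two components working together, the following Python sketch simulates the connected control unit and datapath one clock cycle at a time. It is only a behavioral model under stated assumptions: the Clear signal is treated as taking effect immediately within state s0 (asynchronous clear), the register's power-up value is arbitrary, and the function names are invented for this sketch. The next-state and output equations are the ones listed in Figure 34.

```python
def control_signals(q1, q0):
    """Output logic of the control unit: iLoad = Q1'Q0, Clear = Q1'Q0', Out = Q1Q0'."""
    i_load = (not q1) and q0
    clear = (not q1) and (not q0)
    out = q1 and (not q0)
    return i_load, clear, out


def next_state(q1, q0, neq10):
    """Next-state logic: D1 = (i != 10)' + Q0, D0 = Q1 + Q0'."""
    return (not neq10) or q0, q1 or (not q0)


def run_microprocessor(initial_i=0, max_cycles=100):
    """Clock the control unit and datapath together; return the values output."""
    q1, q0 = False, False        # Reset puts the FSM in state s0 (00)
    i = initial_i & 0xF          # 4-bit register i; power-up value is arbitrary
    outputs = []
    for _ in range(max_cycles):
        i_load, clear, out = control_signals(q1, q0)
        if clear:                # assumption: clear acts within the current state
            i = 0
        neq10 = i != 10          # status signal from the datapath
        if out:                  # tri-state buffer enabled: value reaches Output
            outputs.append(i)
        if q1 and q0:            # halting state s3 (11)
            break
        d1, d0 = next_state(q1, q0, neq10)
        if i_load:               # register loads i + 1 at the clock edge
            i = (i + 1) & 0xF
        q1, q0 = d1, d0          # flip-flops update at the same clock edge
    return outputs


if __name__ == "__main__":
    print(run_microprocessor())
```

The only signals crossing between the two halves of the model are iLoad, Clear, and Out in one direction and (i ≠ 10) in the other, which mirrors the wiring between the CU and DP symbols in the final circuit.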
[Figure 36 diagrams not reproduced. (a) Datapath: an adder adds '1' to the output of a 4-bit register i (Load driven by iLoad, Clear, Clock); bits i3 to i0 generate the status signal (i ≠ 10); the Out signal enables a tri-state buffer that drives the 4-bit Output. (b) Datapath symbol DP with inputs iLoad, Clear, Out, and Clock and outputs (i ≠ 10) and Output. (c) Control unit with two D flip-flops (D1/Q1 and D0/Q0), input (i ≠ 10), Clock, and Reset, and outputs iLoad, Clear, and Out. (d) Control unit symbol CU. (e) CU and DP connected through the control signals iLoad, Clear, and Out and the status signal (i ≠ 10), sharing Clock, with Reset going to the CU and Output as the data output.]

Figure 36: Construction of the microprocessor for the counting from 1 to 10 problem: (a) datapath; (b) symbol for the datapath; (c) control unit; (d) symbol for the control unit; (e) complete microprocessor circuit.

6 Labs

The following labs will teach you how to design and implement microprocessor circuits. Each lab will first show how the datapath is designed, followed by the control unit, and finally the complete microprocessor. You will then implement the microprocessor using the Quartus development software and then load it onto the Microprocessor Design Trainer to test and verify its operation.

The complete microprocessor projects for all of the labs are available on the DVD under the folder \Circuits. You can open these projects and load them directly onto the trainer board to see their operation without having to create them yourself. However, it is highly recommended that you do create them yourself, and only refer to these complete projects if you have difficulties.

6.1 Lab 1: Quartus Development Software

Purpose

In this lab you will learn how to use the Altera Quartus II development software to design a simple combinational circuit.
Introduction

The Altera Quartus II development software and the Microprocessor Design Trainer development board provide all of the necessary tools for implementing and trying out all of the circuits discussed in this courseware, including building the final general-purpose microprocessor. The Quartus II software offers a completely integrated development tool and an easy-to-use graphical user interface for the design and synthesis of digital logic circuits. Together with the Microprocessor Design Trainer development board, these circuits can actually be implemented in hardware.

The main component on the Microprocessor Design Trainer development board is a field programmable gate array (FPGA) chip which is capable of implementing very complex digital logic circuits. After synthesizing a circuit and downloading it onto the FPGA, you can see the operation of the circuit in hardware.

The Web Edition version of the Quartus II software is included on the accompanying DVD. This lab assumes that you are familiar with the Windows environment, and that the Quartus II software has already been installed on your computer. If you have not done so, go back and follow the instructions in Sections 1.3 and 1.4. The rest of this lab provides step-by-step instructions for the schematic entry of an 8-bit 2-to-1 multiplexer circuit.

6.1.1 Starting Quartus II

After the successful installation of the Quartus II software, there should be a link to the program under the Windows’ Start button named Quartus II 10.0 Web Edition. Click on this link to start the program. You should see the main Quartus II window similar to Figure 37.

Figure 37: The Quartus II main window.

6.1.2 Creating a New Project

Each circuit design in Quartus II is called a project. Each project should be placed in its own folder, since the program creates many associated working files for a project.
Perform the following steps to create a new project and a new folder for storing the project files.

From the Quartus II menu, select File > New Project Wizard. If the New Project Wizard Introduction screen appears and you don’t want to see it again the next time you start the new project wizard, you can select the check box that says Don’t show me this introduction again, and then click Next to go to the next screen.

You should see the New Project Wizard: Directory, Name, Top-Level Entity [page 1 of 5] window as shown in Figure 38.

Figure 38: The New Project Wizard: Directory, Name, Top-Level Entity window with the working directory, the project name, and the top-level entity name filled in.

Type in the directory for storing your project. You can also click on the icon next to it to browse to the directory.
• For this lab, type in c:\2x8mux to create a folder named 2x8mux in the root directory of the C drive.

You also need to give the project a name.
• For this lab, type in the project name mux.

A project may have more than one design file. Whether your project has one or more files, you need to specify which design file is the top-level design entity. The default name given is the same as the project name. However, you can use a different name.
• For this lab, leave the top-level file name as mux, and click Next to continue to the next window.

Since the directory c:\2x8mux does not yet exist, Quartus II will inform you of that and ask whether you want to create this new directory. Click Yes to create the directory.

In the New Project Wizard: Add Files [page 2 of 5] window, you can optionally add existing circuit source files associated with your project. For example, if you have a source file created in another project and want to use it in this project, you can specify that here.
• We are starting a new project and do not yet have any source files so click Next to continue to the next window.

In the New Project Wizard: Family & Device Settings [page 3 of 5] window as shown in Figure 39, we select the target FPGA device on which we will implement the circuit. The Microprocessor Design Trainer development board uses the EP3C16F256C8 FPGA chip.
• In the Device Family drop-down box, select Cyclone III.
• In the Available devices list, select the device EP3C16F256C8. If this device is not listed, then you need to reinstall the Quartus II program with the Cyclone III device family option checked.
• Click Next to continue to the next window.

Figure 39: The New Project Wizard: Family & Device Settings window with the device EP3C16F256C8 selected.

In the next New Project Wizard: EDA Tool Settings [page 4 of 5] window, we do not have any EDA tools to use for this project, so click Next to continue to the next window.

The final window is a summary of the choices that you have just made. Click Finish to create your new project.

6.1.3 Using the Block Editor

After creating a new project, we are now ready to start the Block Editor for manually drawing the schematic circuit.

Starting the Block Editor

From the Quartus II menu, select File > New. Under Design Files, select Block Diagram/Schematic File, and then click OK. You should see the Block Editor window similar to the one shown in Figure 40. Any circuit diagram can be drawn in this Block Editor window.

[Figure 40 screenshot not reproduced. Toolbar callouts: Detach window, Selection Tool, Zoom, Hand Tool, Text Tool, Symbol Tool, Node, Bus and Conduit Tool, Line and Shape Drawing Tool, Partial Line Selection, Rubberbanding, Flip and Rotate Tool.]

Figure 40: The Block Editor window with the graphics toolbar showing on the left.
Drawing Tools

In Figure 40, the tools for drawing circuits in the Block Editor are shown in the toolbar on the left side. The default location for this tool bar is at the top. There are the standard tools such as text writing, zoom, flip and rotate, and line and shape drawing.

The main tool that you will use is the Selection tool. This selection tool allows you to perform many different operations depending on the context in which it is used. Two main operations performed by this tool are selecting objects and making connections between logic symbols.

The Symbol tool allows you to select and use logic symbols from the library or from your own design files.

The six Node, Bus and Conduit tools allow you to draw connection lines that are not connected to another object.

The Partial Line Selection and Rubberbanding buttons turn on or off these functions. When rubberbanding is turned on, connection lines are adjusted automatically when symbols are moved from one location to another. When rubberbanding is turned off, moving a symbol will not affect the lines connected to it.

Inserting Logic Symbols

To insert a logic symbol, first select the Selection tool, and then double-click on an empty spot in the Block Editor window. You should see the Symbol window as shown in Figure 41.
• Alternatively, you can click on the Symbol tool icon in the toolbar to bring up the Symbol window.

Figure 41: The Symbol selector window.

Available symbol libraries are listed in the Libraries box. These libraries include the standard primitive gates, standard combinational and sequential components, and your own logic symbols located in the current project directory.

All of the basic logic gates, latches, flip-flops, and input and output connectors that you need are located in the primitives folder. If this folder is not listed, then click on the plus (+) sign to expand the libraries folder.
Within the primitives folder are several subfolders. The basic gates are in the logic subfolder; the latches and flip-flops are in the storage subfolder; and the input and output connectors are in the pin subfolder. Your own circuits, if there are any, that you want to reuse in building larger circuits will be listed in the Project folder.

Expand the logic subfolder by clicking on the plus sign next to it to see a list of logic gate symbols available in that library. The logic symbols are sorted in alphabetical order. Select the logic symbol name that you want to use, or alternatively, you can just type in the name of the logic symbol in the Name field. Click on the OK button to insert the symbol in the Block Editor. If the Repeat-insert mode box is checked, then you can insert several instances of the same symbol until you press the Esc key.

For this lab, insert the following symbols into the Block Editor:

• A 2-input AND gate (and2) found in the logic subfolder.
• A 2-input OR gate (or2) found in the logic subfolder.
• A NOT gate (not) found in the logic subfolder.
• An input signal connector (input) found in the pin subfolder.
• An output signal connector (output) found in the pin subfolder.

A unique number is assigned to each instance of a symbol and is written at the lower-left corner of the symbol. This number is used only as a reference number in the output netlist and report files. The numbers that you see may be different from those in the examples.

Selecting, Moving, Copying, and Deleting Logic Symbols

To select a logic symbol in the Block Editor, simply click on the symbol using the Selection tool. You can also select multiple symbols by holding down the Ctrl key while you select the symbols. An alternative method is to trace a rectangle around the objects that you want to select. All objects inside the rectangle will be selected.
To de-select a symbol, simply click on an empty spot in the Block Editor.

To move a symbol, simply drag the symbol.

To copy a symbol, first select it and then perform the Copy and Paste operations. An alternative method is to hold down the Ctrl key while you drag the symbol.

To delete a symbol, first select it and then press the Delete key.

To rotate a symbol, right-click on the symbol, select Rotate by Degrees from the pop-up menu, and select the angle to rotate the symbol. Alternatively, you can first select the symbol and then click on one of the Flip or Rotate buttons on the toolbar.

Perform the following operations for this lab:

• Make a copy of the 2-input AND gate.
• Make two more copies of the input signal connector.
• Position the symbols similar to Figure 42.

Figure 42: Symbol placements for the 2-to-1 multiplexer circuit.

Making and Naming Connections

To make a connection between two connection points, use the Selection tool and drag from one connection point to the second connection point. Notice that, when you position the pointer over a connection point, the arrow pointer changes to a crosshair. To change the direction of a connection line while dragging the line, simply release and press the mouse button again, and then continue to drag the connection line.

You can also make a connection between two connection points by moving a symbol so that its connection point touches the connection point of the second symbol. With rubberbanding turned on, you can now move one symbol away from the second symbol, and a connection line is automatically drawn between them.

If you want to make a connection line that does not start from a symbol connection point, you will need to use either the Orthogonal Node tool or the Diagonal Node tool instead of the Selection tool.
Do not use the Line tool to make connections; this tool is only for drawing lines and does not actually make a connection.

Once a connection is made to a symbol, you can move the symbol to another location, and the connection line is adjusted automatically if the rubberbanding function is turned on. However, if the rubberbanding function is turned off, the connection will be broken if the symbol is moved.

To make a connection between two lines that cross each other as shown in Figure 43, you need to use the Selection tool. Right-click on the junction point (i.e., the point where the two lines cross) and then select Toggle Connection Dot from the pop-up menu. You can repeat the same process to remove the connection point.

Figure 43: Making or deleting a connection point.

To select a line segment, simply single-click on it. To select the entire line (with several line segments connected in different directions), you need to double-click on it.

Use the Orthogonal Bus tool to draw a bus connection. To change a single node line to a bus line, right-click on the line and select Bus Line from the pop-up menu. Select Node Line from the pop-up menu to change it back to a node line.

A bus must also have a name and a width associated with it. To name a connection line, right-click on the line that you want to name. In the pop-up menu, select Properties and then type in the name and the width for the bus in the Name box. For example, data[7..0] is an 8-bit bus with the name data, as shown in Figure 44. To change the name, just double-click on the name and edit it.

To connect one line to a bus, connect a single line to the bus, and then give it the same name as the bus with the line index appended to it. For example, data[2] is bit 2 of the data bus, as shown in Figure 44.

Figure 44: A single line connected to an 8-bit bus with the name data.
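The bus-naming convention can be illustrated with a short Python sketch. The helper expand_bus below is hypothetical and is not part of Quartus II; it only assumes the name[high..low] notation described above and lists the single-line names that belong to a bus.

```python
import re

# Hypothetical helper for illustration only; it is not part of Quartus II.
# It assumes the bus notation name[high..low] used in this manual.
_BUS = re.compile(r"^(\w+)\[(\d+)\.\.(\d+)\]$")

def expand_bus(name):
    """Return the individual line names covered by a bus name."""
    match = _BUS.match(name)
    if match is None:
        return [name]  # a single node line such as "s"
    base = match.group(1)
    high, low = int(match.group(2)), int(match.group(3))
    step = -1 if high >= low else 1
    return [f"{base}[{i}]" for i in range(high, low + step, step)]

if __name__ == "__main__":
    print(expand_bus("data[7..0]"))
```

A single line named data[2] therefore refers to one of the eight names produced for data[7..0], which is how a line and a bus are joined by name.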
To check whether a name is attached correctly to a line, select the line, and the name that is attached to the line will also be selected.

All input and output signals in a circuit must be connected to an input and output signal connector respectively. To name an input or output signal connector, select its name label by single-clicking it, and then double-click it. You can now type in the new name. Pressing the Enter key will move the text entry cursor to the name label for the symbol below the current symbol. Alternatively, you can select the input or output connector and then double-click on it. The Properties window for that pin will open, which allows you to enter the pin name, among other things. A bus line connected to an input or output connector must have the same bus width as the connector.

For this lab, perform the following operations so that your circuit looks like Figure 45:

• Name the three input connectors d0[7..0], s, and d1[7..0].
• Name the output connector y[7..0].
• Connect and name the five bus lines d0[7..0], d1[7..0], and0[7..0], and1[7..0], and y[7..0].
• Connect the single lines from the input connector s to the inverter and to the two AND gates.

Select File > Save to save the design file. Type in mux for the filename. The default file extension is .bdf (for block design file). Recall that when we created the project, we had specified mux as the top-level filename. We will now use this file as the top-level source file.

Figure 45: Connections and names for the 2-to-1 multiplexer circuit.

Selecting, Moving and Deleting Connection Lines

To select a straight connection line segment, just single-click on it. To select an entire connection line with horizontal and vertical segments, just double-click on it. To select a portion of a line segment, first turn on the Use Partial Line Selection button, and then drag a rectangle around the line segment.
Only the portion of the line segment that is inside the rectangle will be selected.

After a line is selected, it can be moved by dragging. After a line is selected, it can be deleted by pressing the Delete key.

6.1.4 Managing Files in a Project

A project may have one or more design files associated with it.

Design Files in a Project

To see the files that are currently associated with a project, click on the Files tab in the Project Navigator window. The Project Navigator shown in Figure 46 shows that this project has only one file named mux.bdf.

Figure 46: Files associated with a project as shown in the Project Navigator window.

Opening a Design File

To open a design file, simply double-click on the file that is listed in the Project Navigator window. Depending on the type of file, the associated editor will be used. The Block Editor is used to edit a Block Diagram/Schematic File, and a text editor is used to edit a VHDL or Verilog text file.

Creating a New Design File

To create a new schematic drawing design file, select File > New from the Quartus II menu. In the Device Design Files tab, select Block Diagram/Schematic File. After you save this file, the file is automatically added into the project.

Adding Design Files to a Project

To add an existing design file to the current project, select Project > Add/Remove Files in Project from the Quartus II menu. Alternatively, you can right-click on the folder icon labeled Files in the Project Navigator window, and then select Add/Remove Files in Project from the pop-up menu. This will bring up the Files Category under the Settings window. From this window, you can choose additional files to be added into the project by either manually typing in the file name or browsing to the directory and then selecting it.
Click on the Add button to add individual files, or click on the Add All button to add all of the files in the selected directory.

Deleting Design Files from a Project

To delete a design file from a project, simply select it in the Project Navigator window, and then press the Delete key. Alternatively, you can right-click on the file that you want to delete, and then select Remove File from Project from the pop-up menu.

Setting the Top-Level Entity Design File

• When you first created a new project, you also had to specify the name of the top-level design file. If you want to change the top-level entity to another design file, you can do so by right-clicking on the file that you want to be the top-level entity in the Project Navigator window. From the pop-up menu, select Set as Top-Level Entity.

Saving the Project

Select File > Save Project to save the project and all of its associated files.

6.1.5 Creating and Using a Logic Symbol

If you want to use a circuit as part of another circuit in a schematic drawing, you can create a logic symbol for this circuit. Logic symbols are like black boxes that hide the details of a circuit. Only the input and output signals for the circuit are shown. The input and output signals for the logic symbol are obtained directly from the input and output signal connectors that are connected in the circuit.

To create a logic symbol for a circuit, first select the Block Editor window containing the circuit that you want as the active window. Select File > Create/Update > Create Symbol Files for Current File. The name of this symbol file will be the same as the name of the current active circuit diagram in the Block Editor, but with the file extension .bsf (for block symbol file).

You can view and edit the logic symbol by first opening the .bsf file. Select File > Open and type in the filename. Click on the Open button. A window similar to Figure 47 will open showing the logic symbol.
Figure 47: Logic symbol of the mux circuit.

The placements of the input and output signals can be moved to different locations by dragging the signal connection line around the symbol box. The signal label will also be moved. You can then drag the label to another location if you wish. The size of the symbol can also be changed by dragging the edges of the symbol box.

This new symbol can now be used in the Block Editor. It will show up in the Symbol window under the Project folder as shown in Figure 48. You can follow the same steps as discussed earlier for inserting built-in logic symbols to insert this logic symbol into another schematic circuit design.

To use a circuit that is represented by its logic symbol in another project, you need to first copy the .bsf symbol file and the corresponding .bdf circuit design file to the other project’s directory. It will then be available in the Symbol window inside the Project folder as shown in Figure 48.

Figure 48: Selecting the mux logic symbol to be inserted into another circuit design.

6.1.6 Experiments

1. Start a new project and draw the circuit for a 4-bit adder. Refer to the Combinational Logic Design Trainer for the adder circuit. Create a symbol for this circuit.

2. Start a new project and draw the circuit for a 4-bit register with load and asynchronous clear signals. Refer to the Sequential Logic Design Trainer for the register circuit. Instead of drawing the D flip-flop circuit from scratch, you can use the symbol named dffe from the Symbol library located in the folder primitives > storage. Create a symbol for this circuit.

3. Start a new project and draw the datapath circuit for the counting from 1 to 10 problem from Section 3.8.2 as shown next.
Use the 4-bit adder symbol that you created in Experiment 1 above, and use the 4-bit register symbol that you created in Experiment 2 above. Remember that in order to use these two symbols, you need to first copy their .bdf and .bsf files to your new project directory. The tri-state buffer in the Symbol library is called tri and is located in the folder primitives > buffer. To get a logic 1 signal, use the symbol named vcc located in the folder primitives > other. The symbol named gnd located in the same folder will give you a logic 0.

[Datapath diagram for the counting from 1 to 10 problem. Labels: constant '1' and adder (+); 4-bit Register i with inputs D3-0, Load (iLoad), Clear, and Clock and output Q3-0; status signal (i ≠ 10); Out control for the 4-bit Output.]

6.2 Lab 2: Implementing a Circuit in Hardware

Purpose

In this lab you will implement a simple combinational circuit on the Microprocessor Design Trainer development board to test and verify its operation.

Introduction

In Lab 1, you used the Block Editor to lay out the circuit for an 8-bit 2-to-1 multiplexer. In this lab you will implement this circuit on the Microprocessor Design Trainer development board. You will test the operation of the multiplexer in hardware and verify its operation.

6.2.1 Analysis and Synthesis

After drawing your circuit with the Block Editor, the next step is to analyze and synthesize it. During this step, Quartus II collects all of the necessary information about your circuit, and produces a netlist for it. A netlist is a description of all of the components used in the circuit and how these components are connected together.

From the Quartus II menu, select Processing > Start > Start Analysis & Synthesis to synthesize the circuit. Alternatively, you can click on the Analysis & Synthesis icon in the toolbar. If there are no errors in your circuit, you should see the message “Quartus II Analysis & Synthesis was successful” in the Message window at the bottom.
Most warnings can generally be ignored. If there are errors, then they will be reported in the Message window and highlighted in red. For some errors, you can double-click on the red error message to see where the error is in the circuit. Go back and check your circuit against the one shown in Figure 45 to correct all of the errors.

6.2.2 Mapping the I/O Signals to the FPGA Pins

Since we want to implement the circuit on an FPGA, we need to assign all of the I/O signals from the circuit to the actual pins on the FPGA. The Pin Planner is used to map each of the I/O signals from the circuit to the pins on the Cyclone III EP3C16F256C8 chip.

From the Quartus II menu, select Assignments > Pin Planner to bring up the Pin Planner as shown in Figure 49. Alternatively, you can click on the Pin Planner icon on the toolbar to bring up the Pin Planner.

The actual mappings between the I/O signals and the FPGA pins are shown in the bottom half of the Pin Planner window. The I/O signals are listed under the Node Name column, and the FPGA pin names are listed under the Location column. For each I/O signal name, double-click on the cell next to it under the Location column, and then click on the down arrow to bring up a pop-up list of all of the assignable pins from the FPGA. Select the pin number that you want to assign to that I/O signal. In Figure 49, the eight bits of signal d0 have already been assigned to the corresponding pins on the FPGA. For example, d0[7] (bit 7 of d0) is assigned to pin L15.

Figure 49: The Pin Planner showing the mapping of the I/O signals for the circuit to the pins on the EP3C16F256C8 chip.

Refer to Appendix A for the mappings between the actual I/O devices (i.e., the switches and lights on the development board) and the FPGA pins. Perform the following signal-to-pin assignments for the FPGA chip.
For your convenience, the pin mappings that you need to do are listed in Table 1.

• d0[7..0] to SWITCH[7..0]
• d1[7..0] to SWITCH[15..8]
• y[7..0] to LED[7..0]
• s to PB0

I/O Signal Name    I/O Device Name    FPGA Pin
d0[7]              SWITCH7            L15
d0[6]              SWITCH6            J15
d0[5]              SWITCH5            J13
d0[4]              SWITCH4            G11
d0[3]              SWITCH3            F13
d0[2]              SWITCH2            E10
d0[1]              SWITCH1            D12
d0[0]              SWITCH0            C11
d1[7]              SWITCH15           N15
d1[6]              SWITCH14           P15
d1[5]              SWITCH13           P14
d1[4]              SWITCH12           T15
d1[3]              SWITCH11           T14
d1[2]              SWITCH10           N14
d1[1]              SWITCH9            M11
d1[0]              SWITCH8            L13
s                  PB0                B10
y[7]               LED7               K12
y[6]               LED6               J14
y[5]               LED5               J12
y[4]               LED4               F14
y[3]               LED3               E11
y[2]               LED2               D14
y[1]               LED1               A13
y[0]               LED0               C9

Table 1: Pin mappings for the 2-to-1 multiplexer circuit.

An alternative and quicker method for mapping the pins is to edit the .qsf file directly. Instead of using the Pin Planner to do the pin mappings, you can edit the text file mux.qsf using a text editor such as Windows’ Notepad. (For this file name, mux is the name of the project.) For each signal to pin mapping, insert a line such as the following at the end of the file:

set_location_assignment PIN_L15 -to d0[7]

Replace L15 with the actual pin number that you want, and replace d0[7] with the signal name that you want. The ordering and actual locations of these lines in the file do not matter. After saving this text file, the pin numbers will be reflected in the Pin Planner.

6.2.3 Full Compilation

Now that we have synthesized and created the netlist for the circuit, and have mapped all of the I/O signals to the actual pins on the FPGA, the next step is to perform a full compilation to fit the netlist and the pin assignments to the FPGA chip.

From the Quartus II menu, select Processing > Start Compilation to start the full compilation of the circuit. Alternatively, you can click on the Start Compilation icon. You can watch the compilation progress in the Tasks window pane, and also the Flow Summary statistics.
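As an aside on the .qsf method from Section 6.2.2, the 25 assignment lines do not have to be typed by hand. The Python sketch below is one possible way to generate them; the script, the PINS dictionary, and the function name qsf_lines are illustrative assumptions and are not part of Quartus II. The pin names are copied from Table 1, and the line format is the one shown in Section 6.2.2.

```python
# Illustrative sketch, not a Quartus II tool: build the .qsf pin-assignment
# lines for the 2-to-1 multiplexer from the Table 1 mappings.

# Pin names copied from Table 1, listed from bit 7 down to bit 0.
PINS = {
    "d0": "L15 J15 J13 G11 F13 E10 D12 C11".split(),
    "d1": "N15 P15 P14 T15 T14 N14 M11 L13".split(),
    "y": "K12 J14 J12 F14 E11 D14 A13 C9".split(),
}
SELECT_PIN = "B10"  # push button PB0 drives the select signal s

def qsf_lines():
    """Return one set_location_assignment line per I/O signal."""
    lines = []
    for bus, pins in PINS.items():
        for bit, pin in zip(range(7, -1, -1), pins):
            lines.append(f"set_location_assignment PIN_{pin} -to {bus}[{bit}]")
    lines.append(f"set_location_assignment PIN_{SELECT_PIN} -to s")
    return lines

if __name__ == "__main__":
    # Paste the printed lines at the end of mux.qsf.
    print("\n".join(qsf_lines()))
```

The printed lines can be pasted at the end of mux.qsf. Because the ordering of these lines in the file does not matter, the order in which the script emits them is not significant.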
At the completion of the compilation, you will see in the Message window pane a message telling you that the Quartus II full compilation was successful with 0 errors and some warnings. In most situations, you can ignore the warnings.

6.2.4 Programming the FPGA

In the final step, we will upload the circuit onto the Microprocessor Design Trainer board by programming the FPGA chip. Click on the Programmer button in the toolbar to bring up the Programmer window. Click on the Start button to upload the circuit onto the trainer board as shown in Figure 50. When the circuit is uploaded onto the board, the progress bar at the top-right corner of the programmer window should show 100% in green.

Figure 50: Successful programming of the logic circuit onto the trainer board.

6.2.5 Testing the Circuit in Hardware

Test and verify the operation of the multiplexer on the trainer board. Switches 8 to 15 are the 8-bit input for d1, and switches 0 to 7 are the 8-bit input for d0. The 8-bit output y is connected to LEDs 0 to 7. The mux select signal is connected to push button PB0. The truth table for the mux is shown next.

Input s (PB0)    Output y (LED 0 to 7)
0                d0 (SWITCH 0 to 7)
1                d1 (SWITCH 8 to 15)

When PB0 is not pressed, the eight LEDs should reflect the settings of switches 0 to 7, and when PB0 is pressed, the LEDs should reflect the settings of switches 8 to 15.

6.2.6 Experiments

1. Currently the 8-bit output signal y is connected to LEDs 0 to 7. Re-map the 8-bit output signal y to LEDs 8 to 15 instead. Test the circuit with the new mappings on the trainer board.

2. Test and verify the operation of the 4-bit adder circuit from Experiment 1 of Lab 1 on the trainer board. Connect the adder inputs to the switches, and the adder output to the LEDs.

3.
Test and verify the operation of the 4-bit register circuit from Experiment 2 of Lab 1 on the trainer board. Connect the 4-bit register data inputs to four switches. Connect the 4-bit register data outputs to four LEDs. Connect the asynchronous Clear signal to PB0. Connect the Load signal to PB1. Connect the register Clock signal to the master clock on the trainer board, which is pin E2. Set the data input switches to some value. Press PB1 and you should see the value being displayed on the LEDs. Pressing PB0 will clear the LEDs.

4. Test and verify the operation of the counting from 1 to 10 problem datapath circuit from Experiment 3 of Lab 1 on the trainer board. Connect the Clear signal to PB0. Connect the iLoad signal to PB1. Connect the Clock signal to the master clock on the trainer board, which is pin E2. Connect the status signal (i ≠ 10) to LED15. Connect the Out signal to Switch 0. Connect the 4-bit Output signal to LEDs 0 to 3. Set Switch 0 to the on position so that you will always see the output. Each time you press PB1 (iLoad), the next value from the adder should be loaded into the register, and you should see the content of the register on the LEDs. Do you actually see each count from 1 to 10 as a binary number on the four LEDs? Describe what you see and explain why it is so. Hint: it has to do with the speed of the clock.

5. There is a clock divider circuit on the DVD inside the Circuits folder for slowing down the clock. Copy this circuit and the symbol to the project folder for Experiment 4. Add this clock divider circuit into the counting circuit of Experiment 4. Instead of connecting the master clock signal directly to the Clock input of the register, connect the clock divider circuit between them, so that the master clock signal is the input to the clock divider circuit, and the output from the clock divider circuit connects to the Clock input of the register. Now test this new circuit. Do you see each count from 1 to 10?
Does the count stop at 10, or does it continue? Why?

6.3 Lab 3: Counting from 1 to 10

Purpose

In this lab you will implement the full microprocessor for the counting problem that counts from 1 to 10. You will test this microprocessor on the Microprocessor Design Trainer development board and verify its operation.

Introduction

In Section 3.8.2 you designed the datapath for this microprocessor. In Section 4.2.2 you designed the control unit for this microprocessor. And finally in Section 5.1.2, you combined the datapath and the control unit together to make the complete microprocessor for solving the counting from 1 to 10 problem. Furthermore, in Lab 2, you tested the datapath manually on the trainer board. Now, we just need to actually draw the circuits for these components and connect them together to form the complete microprocessor so that it can automatically count from 1 to 10.

The circuits for the different components used in this microprocessor as drawn out in the Quartus Block Editor are shown in Figure 51.

Figure 51(a) shows the circuit for the full-adder (FA).

Figure 51(b) shows the circuit for the 4-bit adder where four FA logic symbols from Figure 51(a) are used. The second operand of this adder is a constant 1. This is obtained by connecting a 0 (GND) to the four yi inputs, and a 1 (via the NOT gate) to the carry-in signal of the first FA. Notice that the connections between the 4-bit input bus x[3..0] and its separate bit signals are not physically connected in the drawing. They are, however, connected by their names. Similarly, the 4-bit output bus f[3..0] is connected to its individual bit signals by their names.

Figure 51(c) shows the circuit for the 4-bit register containing the four D flip-flops with enable and asynchronous clear and set signals (DFFE). Both the asynchronous clear (CLRN) and set (PRN) signals are active low signals.
We do not need to use the set signals, so they are all connected to VCC. We want an active high clear signal, so it is connected to an input signal connector labeled Clear via an inverter. All the clock signals are connected in common to an input signal connector labeled Clock. All the load (ENA) signals are connected in common to an input signal connector labeled Load. The individual D input of each flip-flop is connected to the data input bus D[3..0], and the individual Q output of each flip-flop is connected to the data output bus Q[3..0].

Figure 51 (d), (e) and (f) show the datapath, control unit and microprocessor circuits respectively.

We created one more level, the top level circuit shown in Figure 51(g), where the microprocessor is used. A clock divider circuit is used to slow down the master clock signal before it is fed to the clock input of the microprocessor.

Figure 51: Circuits for the microprocessor for the counting from 1 to 10 problem: (a) full adder; (b) 4-bit adder; (c) 4-bit register; (d) datapath; (e) control unit; (f) microprocessor; (g) top level circuit.

You should already have drawn the datapath circuit and created the logic symbol for it. Open up that project and create a new Block Diagram/Schematic File with the Block Editor. Draw the control unit circuit as shown in Figure 51(e) in this new Block Diagram file. Add input and output pins to the I/O signals. Create a logic symbol for it.

Create another Block Diagram/Schematic File for the microprocessor circuit. Insert the datapath logic symbol and the control unit logic symbol into it. Connect these two components up as shown in Figure 51(f). Add input and output pins to the I/O signals. Create a logic symbol for it.
Copy the clock divider circuit source file Clockdiv.vhd and its symbol file Clockdiv.bsf from the folder Circuits/Clock Divider on the DVD into your project folder.

Create another Block Diagram/Schematic File for the top level circuit. Insert the microprocessor logic symbol and clock divider symbol into it. Connect these two components up as shown in Figure 51(g). Add input and output pins to the I/O signals.

Click on the Analysis & Synthesis icon to synthesize the circuit.

Map the microprocessor I/O signals to the correct FPGA pins. Connect the Clock signal to the master clock pin E2. Connect the Reset signal to PB0. Connect the 4-bit data output signal to LEDs 0 to 3.

Compile the circuit and upload it onto the trainer board. Verify that it counts from 1 to 10, and then stops.

6.3.1 Experiments

1. Remove the clock divider circuit from the top level circuit shown in Figure 51(g) and connect the master clock signal directly to the microprocessor clock input. Compile and upload the circuit to the trainer board. Do you see the counting? Explain what you see.

2. Open up the clock divider source file Clockdiv.vhd. Towards the top of the file there is a line that says

CONSTANT max: INTEGER := 10000000; -- 10000000; = 1sec

Change the number 10000000 in this line to 20000000. Compile and upload this new circuit. What happens? Try it with several other numbers.

3. Modify the circuit so that you can see the count in decimal rather than in binary. You will need to add a binary coded decimal converter to convert the 4-bit count output signal to drive a 7-segment LED display. You can copy this circuit and symbol from the folder Circuits/Bin2Dec on the DVD. There are six files in this folder and you need to copy all of them to your project folder.
6.4 Lab 4: Countdown from Input n

Purpose

In this lab you will design a dedicated microprocessor to input a number n, and then count from n down to 0. You will manually design the datapath and the control unit for this microprocessor circuit. The datapath and the control unit are then combined together to form the microprocessor. Finally, you will implement this microprocessor on the trainer and verify its operation.

Introduction

We will design a dedicated microprocessor to input a number n, and then count from n down to 0. The input number n is an 8-bit value. The datapath will automatically generate the numbers from n down to 0. A Done signal is asserted when the count reaches 0. The algorithm for solving this problem is shown in Figure 52.

1  INPUT n
2  OUTPUT n
3  WHILE (n = 0)' {
4      n = n - 1
5      OUTPUT n
6  }
7  Assert Done

Figure 52: Algorithm for solving the countdown from input n problem.

For the datapath, we will need one 8-bit register to store the initial input n and then the countdown of n. We will also need a subtractor for decrementing n by one. Since there are two sources for storing the value n into the register, one from line 1 and the second from line 4, we will need a multiplexer to select which source to use. For the two output statements, lines 2 and 5, we will not use any control to output the value n. Instead, the Q output of the register can be connected directly to the data output so that the data output will always have the current value stored in the register. Finally, a NOR gate is used for the (n = 0) comparator. The datapath circuit is shown in Figure 53.

[Datapath diagram. Labels: 8-bit Input; 2-to-1 multiplexer with select InSelect (input 1 from Input, input 0 from the decrementer); 8-bit Register n with inputs D7-0, Load (Load_n), Clear (Reset), and Clock and output Q7-0; decrement by 1 (-1); status signal (n = 0); 8-bit Output.]

Figure 53: Datapath for the countdown from input n problem.
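As a quick software check of the algorithm in Figure 52, the following Python sketch mimics what the datapath computes. It is only a behavioral model under stated assumptions: the function names countdown and n_is_zero are hypothetical, the 8-bit width is modeled by masking with 0xFF, and the comments refer to the line numbers of the algorithm.

```python
# Behavioral sketch of the countdown algorithm; names are illustrative only.

def n_is_zero(n):
    """Model the (n = 0) comparator as an 8-input NOR of the register bits."""
    return int(not any((n >> bit) & 1 for bit in range(8)))

def countdown(n):
    """Return the sequence of values output while counting from n down to 0."""
    n &= 0xFF                      # line 1: INPUT n (8-bit register)
    outputs = [n]                  # line 2: OUTPUT n
    while not n_is_zero(n):        # line 3: WHILE (n = 0)'
        n = (n - 1) & 0xFF         # line 4: n = n - 1
        outputs.append(n)          # line 5: OUTPUT n
    return outputs                 # line 7: Done would be asserted here

if __name__ == "__main__":
    print(countdown(5))
```

Note that the loop runs while n is not zero, which is why the condition in the algorithm is written as the complement (n = 0)'.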
The state diagram, shown in Figure 54, is derived from the algorithm. Line 1 from the algorithm is executed in state 00. After n is loaded into the register, its value will be available in the next clock cycle, so in state 01, the value n from the register is available for reading and no further action is needed. Depending on the outcome of the condition (n = 0), the FSM will go to either state 10 or state 11. Regardless of this decision, the subtractor will have decremented the value of n. If the FSM goes to state 10, then the newly decremented value of n will be ready for writing back into the register. This new value will again be available for reading in the next state, which is state 01. As you can see, although there are two output statements, lines 2 and 5, we have used only one state for executing both of them. The FSM terminates in state 11, where the Done signal is asserted.

[State diagram. Reset enters state 00 (InSelect = 1, Load_n = 1); state 00 goes to state 01 (OUTPUT n); state 01 goes to state 10 (InSelect = 0, Load_n = 1) when (n = 0)' and to state 11 (Done = 1) when (n = 0); state 10 returns to state 01.]

Figure 54: State diagram for the countdown from input n problem.

The next-state table and the two next-state equations as derived from the state diagram are shown in Figure 55. Two flip-flops are needed since there are four states to be represented. This results in the two next-state equations D1 and D0.

Current State    Next State Q1next Q0next (D1 D0)
Q1Q0             (n = 0) = 0    (n = 0) = 1
00               01             01
01               10             11
10               01             01
11               11             11

D1 = Q0
D0 = (n = 0) + Q1 + Q0'

Figure 55: Next-state table and next-state equations for the countdown from input n problem.

Finally, the output table and output equations are shown in Figure 56.

Current State    Outputs
Q1Q0             InSelect    Load_n    Done
00               1           1         0
01               0           0         0
10               0           1         0
11               0           0         1

InSelect = Q1'Q0'
Load_n = Q0'
Done = Q1Q0

Figure 56: Output table and output equations for the countdown from input n problem.
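The equations in Figures 55 and 56 can be checked in software before drawing the circuit. The Python sketch below is a cycle-by-cycle model under stated assumptions: the function names control_unit and run are hypothetical, Reset is assumed to put the FSM in state 00 with the register cleared, and the register and the state flip-flops are assumed to update together on each active clock edge.

```python
# Cycle-by-cycle sketch of the control unit and datapath working together.
# Function names and the reset assumptions are illustrative only.

def control_unit(q1, q0, n_is_zero):
    """Evaluate the next-state and output equations (all values are 0 or 1)."""
    d1 = q0                              # D1 = Q0
    d0 = n_is_zero | q1 | (1 - q0)       # D0 = (n = 0) + Q1 + Q0'
    in_select = (1 - q1) & (1 - q0)      # InSelect = Q1'Q0'
    load_n = 1 - q0                      # Load_n = Q0'
    done = q1 & q0                       # Done = Q1Q0
    return d1, d0, in_select, load_n, done

def run(n_input, max_cycles=1000):
    """Clock the model until Done is asserted; return the values seen in state 01."""
    q1 = q0 = 0                          # assumed: Reset puts the FSM in state 00
    n = 0                                # assumed: Reset clears the register
    outputs = []
    for _ in range(max_cycles):
        d1, d0, in_select, load_n, done = control_unit(q1, q0, int(n == 0))
        if (q1, q0) == (0, 1):           # state 01: register value is readable
            outputs.append(n)
        if done:
            return outputs
        # Active clock edge: the register and the state memory update together.
        if load_n:
            n = (n_input & 0xFF) if in_select else (n - 1) & 0xFF
        q1, q0 = d1, d0
    raise RuntimeError("Done was not asserted within max_cycles")

if __name__ == "__main__":
    print(run(3))
```

Under these assumptions, the model visits state 01 once for each value from n down to 0 and then settles in state 11, which matches the state diagram in Figure 54.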
Using the two next-state equations, the three output equations, and the fact that we need two D flip-flops for the state memory, we can derive the complete control unit circuit shown in Figure 57.

[Circuit diagram: two D flip-flops (D1/Q1 and D0/Q0) clocked by Clock and cleared by Reset, the status input (n = 0), and the outputs InSelect, Load_n, and Done.]

Figure 57: Control unit for the countdown from input n problem.

The final complete microprocessor circuit is shown in Figure 58.

[Block diagram: the control unit (CU) sends the control signals InSelect and Load_n to the datapath (DP) and receives the status signal (n = 0) from it. Clock and Reset go to both blocks. The 8-bit Input enters the DP, the 8-bit Output leaves the DP, and Done leaves the CU.]

Figure 58: Microprocessor circuit for the countdown from input n problem.

6.4.1 Experiments

1. Use the Block Editor in Quartus to draw the various component circuits for the countdown from input n problem. Draw each component in a new Block Diagram/Schematic File design file. Create symbols for these components. If you have already drawn a particular component in another lab, you can reuse it by copying the .bdf and .bsf files to this project's folder.

2. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the datapath using the component symbols that you have just created. Add input and output pins to the I/O signals. Create a symbol for this datapath circuit.

3. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the control unit using the component symbols that you have just created. Add input and output pins to the I/O signals. Create a symbol for this control unit circuit.

4. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the microprocessor circuit by connecting the control signals and status signals between the datapath component symbol and the control unit component symbol. Add input and output pins to the I/O signals. Create a symbol for this microprocessor circuit.

5. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the top-level interface circuit for connecting the microprocessor I/O signals to the FPGA pins as shown in Figure 59. Also add a clock divider circuit to the clock input of the microprocessor. This design file is the top-level file for your project.

[Block diagram: SWITCH[7..0] drives the 8-bit Input, PB[0] drives Reset, CLOCK passes through the clock divider (Clkin to Clkout) to the microprocessor Clock, the 8-bit Output drives LED[7..0], and Done drives LED[15].]

Figure 59: Hardware implementation for the countdown from input n problem.

6. Use the Pin Planner to map the microprocessor I/O signals to the FPGA I/O pins. Refer to Appendix A for the correct pin mappings.

7. Use the Programmer to upload the microprocessor circuit to the trainer board. Test and verify the operation of this microprocessor on the trainer board. Set SWITCH[7..0] to any binary number for the starting count n. Press PB[0] to start the countdown. Watch the countdown in binary on LED[7..0]. When the count reaches zero, LED[15] will light.

6.5 Lab 5: Count and Sum

Purpose

In this lab you will design a dedicated microprocessor that inputs a number n and then sums all of the numbers from n down to 1. You will manually design the datapath and the control unit for this microprocessor circuit. The datapath and the control unit are then combined to form the microprocessor. Finally, you will implement this microprocessor on the trainer and verify its operation.

Introduction

We will design a dedicated microprocessor that inputs a number n and then sums all of the numbers from n down to 1. The input number n is an 8-bit value. The datapath will automatically generate and add the numbers from n down to 1. When the calculation is completed, the datapath should output the sum of the numbers and assert a Done signal.
The algorithm for solving this problem is shown in Figure 60.

1  sum = 0
2  INPUT n
3  WHILE (n = 0)' {
4      sum = sum + n
5      n = n - 1
6  }
7  OUTPUT sum

Figure 60: Algorithm for solving the count and sum problem.

We first note that we need two 8-bit registers with a load function for storing the two variables, n and sum. The register for sum should also include a Clear function for initializing it to 0. A subtractor is needed for subtracting the constant 1 from n. A separate adder is used for the addition operation. The resulting dedicated datapath is shown in Figure 61.

To initialize sum, we simply assert the Clear line for that register. Asserting Load_n with InSelect asserted inputs a value for n. Asserting Load_Sum loads into register sum the value from the output of the adder, which is sum plus n. Decrementing n by 1 is accomplished by the subtractor and by asserting Load_n with InSelect de-asserted to store the result back into n. A 2-input mux is needed for the two sources, lines 2 and 5, to the n register. Finally, asserting Out enables the tri-state buffer, thus outputting the value from the sum register. The Out signal is also used as the Done signal to notify external devices that the calculation is completed. The comparator for the WHILE loop condition (n = 0) is an 8-input NOR gate. The NOR gate outputs a 1 when the 8-bit input is 0, and this serves as the status signal for the control unit.

[Datapath diagram: a 2-to-1 multiplexer selected by InSelect (input 1 = Input, input 0 = subtractor output) feeds the 8-bit register n (Load = Load_n). The output of register n drives the -1 subtractor, the (n = 0) NOR gate, and one input of the adder. The 8-bit register Sum (Load = Load_Sum) is loaded from the adder output, and its output drives the other adder input and the tri-state buffer enabled by Out/Done, which drives Output. Both registers are clocked by Clock and cleared by Reset.]

Figure 61: Datapath for the count and sum problem.

The control words for this control unit are shown in Figure 62.
Note that we do not need a control word for initializing sum to zero. Because the Clear line of the register is connected to the Reset signal, sum is cleared immediately on reset at the beginning. This way, there is one less control signal and one less state.

Control Word    Instruction         InSelect    Load_n    Load_Sum    Done/Out
1               INPUT n             1           1         0           0
2               sum = sum + n       0           0         1           0
3               n = n - 1           0           1         0           0
4               Done/OUTPUT sum     0           0         0           1

Figure 62: Control words for the count and sum problem.

The state diagram is shown in Figure 63. The next-state table and next-state equations are shown in Figure 64. The output table and equations are shown in Figure 65. The complete control unit circuit is shown in Figure 66. Finally, the complete microprocessor circuit is shown in Figure 67.

[State diagram: Reset enters state 00 (InSelect = 1, Load_n = 1), which goes to state 01 (InSelect = 0, Load_Sum = 1). From state 01 the FSM goes to state 10 (InSelect = 0, Load_n = 1) when (n = 0)' and to state 11 (Done = 1) when (n = 0). State 10 returns to state 01. State 11 loops to itself.]

Figure 63: State diagram for the count and sum problem.

Current State    Next State Q1next Q0next (D1 D0)
Q1Q0             (n = 0) = 0    (n = 0) = 1
00               01             01
01               10             11
10               01             01
11               11             11

D1 = Q0
D0 = (n = 0) + Q1 + Q0'

Figure 64: Next-state table and next-state equations for the count and sum problem.

Current State    Outputs
Q1Q0             InSelect    Load_n    Load_Sum    Done
00               1           1         0           0
01               0           0         1           0
10               0           1         0           0
11               0           0         0           1

InSelect = Q1'Q0'
Load_n = Q0'
Load_Sum = Q1'Q0
Done = Q1Q0

Figure 65: Output table and output equations for the count and sum problem.

[Circuit diagram: two D flip-flops (D1/Q1 and D0/Q0) clocked by Clock and cleared by Reset, the status input (n = 0), and the outputs InSelect, Load_n, Load_Sum, and Done.]

Figure 66: Control unit for the count and sum problem.

[Block diagram: the control unit (CU) sends the control signals InSelect, Load_n, and Load_Sum to the datapath (DP) and receives the status signal (n = 0) from it. Clock and Reset go to both blocks. The 8-bit Input enters the DP, the 8-bit Output leaves the DP, and Done is brought out as an output.]

Figure 67: Complete microprocessor circuit for the count and sum problem.
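As a quick check of Figures 64 and 65, the control unit and datapath can be simulated one clock cycle at a time. This Python sketch is an illustration only. The function names are hypothetical, Reset is assumed to clear both registers and the state memory, and the sum is truncated to 8 bits because the sum register is 8 bits wide.

```python
MASK = 0xFF  # assumption: both registers are 8 bits wide, so values wrap at 256


def control_outputs(q1, q0):
    """Output equations from Figure 65."""
    in_select = (1 - q1) & (1 - q0)
    load_n = 1 - q0
    load_sum = (1 - q1) & q0
    done = q1 & q0
    return in_select, load_n, load_sum, done


def next_state(q1, q0, n_is_zero):
    """Next-state equations from Figure 64."""
    return q0, n_is_zero | q1 | (1 - q0)


def run_count_and_sum(data_in, max_cycles=2000):
    """Clock the FSM and datapath until Done is asserted; return the sum register."""
    q1 = q0 = 0
    n = total = 0                                   # assumed state after Reset
    for _ in range(max_cycles):
        in_select, load_n, load_sum, done = control_outputs(q1, q0)
        if done:
            return total
        n_is_zero = int(n == 0)                     # status signal from the NOR gate
        d1, d0 = next_state(q1, q0, n_is_zero)
        if load_n:
            next_n = (data_in & MASK) if in_select else (n - 1) & MASK
        else:
            next_n = n
        next_total = (total + n) & MASK if load_sum else total
        # All registers and flip-flops change together on the clock edge.
        q1, q0, n, total = d1, d0, next_n, next_total
    raise RuntimeError("FSM did not reach the Done state")


print(run_count_and_sum(6))  # 21
```

Because of the 8-bit truncation assumed here, the largest input whose sum still fits is 22 (sum 253). Larger inputs wrap around in this model.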
6.5.1 Experiments

1. Use the Block Editor in Quartus to draw the various component circuits for the count and sum problem. Draw each component in a new Block Diagram/Schematic File design file. Create symbols for these components. If you have already drawn a particular component in another lab, you can reuse it by copying the .bdf and .bsf files to this project's folder.

2. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the datapath using the component symbols that you have just created. Create a symbol for this datapath circuit.

3. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the control unit using the component symbols that you have just created. Create a symbol for this control unit circuit.

4. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the microprocessor circuit by connecting the control signals and status signals between the datapath component symbol and the control unit component symbol. Create a symbol for this microprocessor circuit.

5. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the top-level interface circuit for connecting the microprocessor I/O signals to the FPGA pins. Also add a clock divider circuit as shown in Figure 68. This design file is the top-level file for your project.

[Block diagram: SWITCH[7..0] drives the 8-bit Input, PB[0] drives Reset, CLOCK passes through the clock divider (Clkin to Clkout) to the microprocessor Clock, the 8-bit Output drives LED[7..0], and Done drives LED[15].]

Figure 68: Hardware implementation for the count and sum problem.

6. Use the Pin Planner to map the microprocessor I/O signals to the FPGA I/O pins. Refer to Appendix A for the correct pin mappings.

7. Use the Programmer to upload the microprocessor circuit to the trainer board. Test and verify the operation of this microprocessor on the trainer board.
Set SWITCH[7..0] to a binary number for the starting count n, and then press PB[0] to reset the circuit and start the calculation. LED[7..0] will display the running sum of the numbers being counted. For example, if you set the input to binary 6, the LEDs will show binary 21 (10101) when it terminates.

6.6 Lab 6: Greatest Common Divisor

Purpose

In this lab you will design a dedicated microprocessor that inputs two numbers, X and Y, and then calculates the greatest common divisor (GCD) of these two numbers. You will manually design the datapath and the control unit for this microprocessor circuit. The datapath and the control unit are then combined to form the microprocessor. Finally, you will implement this microprocessor on the trainer and verify its operation.

Introduction

In this lab, you will manually design the complete dedicated microprocessor for evaluating the greatest common divisor (GCD) of two 8-bit unsigned numbers, X and Y. For example, GCD(3,5) = 1, GCD(6,4) = 2, and GCD(12,4) = 4. The algorithm for solving the GCD problem is listed in Figure 69. You will first design a dedicated datapath for the algorithm. Next, you will design the control unit for the datapath. You will use the Block Editor to implement these two components and the final complete dedicated microprocessor. Finally, you will test and verify its operation by running it on the trainer board.

1   INPUT X
2   INPUT Y
3   WHILE (X ≠ Y) {
4       IF (X > Y) THEN
5           X = X - Y
6       ELSE
7           Y = Y - X
8       END IF
9   }
10  OUTPUT X

Figure 69: Algorithm for solving the GCD problem.

The algorithm shown in Figure 69 has five data manipulation statements in lines 1, 2, 5, 7, and 10. There are two conditional tests in lines 3 and 4. We can conclude that the datapath requires two 8-bit registers, one for variable X and one for variable Y, and a subtractor for performing the two subtractions in lines 5 and 7 in two different clock cycles. The dedicated datapath is shown in Figure 70.
We need a 2-to-1 multiplexer at the input of each register, because each register is initially loaded with an input number and subsequently with the result from the subtractor. The two control signals, In_X and In_Y, select which of the two sources is loaded into registers X and Y, respectively. The two control signals, XLoad and YLoad, load a value into the respective register. The bottom two multiplexers, selected by the same XY signal, determine the sources of the two operands for the subtractor. When XY is asserted, the value from register X goes to the left operand of the subtractor, and the value from register Y goes to the right operand. When XY is de-asserted, Y goes to the left operand, and X goes to the right operand. This allows either of the two subtraction operations, X - Y or Y - X, to be performed with a single subtractor. Finally, a tri-state buffer is used for outputting the result from register X. The Out control signal enables the tri-state buffer for the output and is also used as the Done signal.

A comparator for testing the two conditions, equal-to and greater-than, generates the two needed conditional status signals. The comparator inputs come directly from the X and Y registers. There are two output signals from the comparator, (X = Y) and (X > Y). (X = Y) is asserted if X is equal to Y, and (X > Y) is asserted if X is greater than Y. The circuit for this 8-bit comparator is shown in Figure 71.

This datapath for solving the GCD problem requires six control signals, In_X, In_Y, XLoad, YLoad, XY, and Out, and generates two status signals, (X = Y) and (X > Y).
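The effect of the control signals on this datapath can be sketched in software. The Python model below is illustrative only: the function names are hypothetical, the subtractor is assumed to wrap at 8 bits, and the loop in `gcd` stands in for a control unit by choosing the control signals according to the algorithm in Figure 69. The guard against a single zero input is an addition of this sketch, because the subtraction algorithm never reaches X = Y in that case.

```python
MASK = 0xFF  # assumption: registers and subtractor are 8 bits wide


def gcd_datapath_step(x, y, input_x, input_y, in_x, in_y, xload, yload, xy):
    """Return (X, Y) register contents after one active clock edge."""
    left, right = (x, y) if xy else (y, x)      # bottom two multiplexers
    diff = (left - right) & MASK                # single shared subtractor
    mux_x = (input_x & MASK) if in_x else diff  # input multiplexer for X
    mux_y = (input_y & MASK) if in_y else diff  # input multiplexer for Y
    next_x = mux_x if xload else x
    next_y = mux_y if yload else y
    return next_x, next_y


def status(x, y):
    """Comparator outputs: ((X = Y), (X > Y))."""
    return int(x == y), int(x > y)


def gcd(input_x, input_y):
    """Drive the datapath by hand, following Figure 69."""
    if (input_x == 0) != (input_y == 0):
        raise ValueError("exactly one zero input: the loop would never end")
    # Lines 1 and 2: load both registers from the primary inputs.
    x, y = gcd_datapath_step(0, 0, input_x, input_y,
                             in_x=1, in_y=1, xload=1, yload=1, xy=0)
    while True:
        equal, greater = status(x, y)
        if equal:                               # line 3 fails: exit the loop
            return x                            # line 10: OUTPUT X
        if greater:                             # line 5: X = X - Y
            x, y = gcd_datapath_step(x, y, 0, 0,
                                     in_x=0, in_y=0, xload=1, yload=0, xy=1)
        else:                                   # line 7: Y = Y - X
            x, y = gcd_datapath_step(x, y, 0, 0,
                                     in_x=0, in_y=0, xload=0, yload=1, xy=0)


print(gcd(6, 4), gcd(3, 5), gcd(12, 4))  # 2 1 4
```

Only one register is loaded in each subtraction step, so the value of the other multiplexer select line does not matter in that step.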
[Datapath diagram: Input_X and Input_Y each feed input 1 of a 2-to-1 multiplexer, selected by In_X and In_Y respectively, with input 0 taken from the subtractor output. The multiplexers feed the 8-bit registers X (Load = XLoad) and Y (Load = YLoad), both clocked by Clock and cleared by Reset. The register outputs feed the comparator, which produces (X = Y) and (X > Y), and two more 2-to-1 multiplexers selected by XY, which supply the two operands of the subtractor. Register X also drives the tri-state buffer enabled by Out/Done, which drives Output. All data paths are 8 bits wide.]

Figure 70: Datapath for solving the GCD problem.

[Circuit diagram: (a) the 1-bit comparator circuit with inputs xi, yi, gin, and ein and outputs gout and eout; (b) its symbol; (c) eight 1-bit comparators for bit pairs x7,y7 down to x0,y0 chained through gout and eout to produce x > y and x = y.]

Figure 71: Comparator circuit for x > y and x = y: (a) circuit for 1-bit; (b) symbol for 1-bit; (c) 8-bit circuit.

The state diagram for the GCD algorithm requires five states, as shown in Figure 72(a). Four states are used for the five data manipulation statements, since one state performs both inputs. One no-operation state, 001, is used for the conditional tests, because the conditions must be tested on the updated values of X and Y. From state 001, we test for the two conditions, (X = Y) and (X > Y). If (X = Y) is true, then the next state is 100. If (X = Y) is false, then the next state is either 010 or 011, depending on whether the condition (X > Y) is true or false, respectively.

This state diagram does not have a Start signal, so in order for the resulting microprocessor to read the inputs correctly, we must first set up the input numbers and then assert the Reset signal to clear the state memory flip-flops to 0. This way, when the FSM starts executing from state 000, the two input numbers are ready to be read in.

The next-state table, as derived from the state diagram, is shown in Figure 72(b).
The table requires five variables: three to encode the five states, Q2, Q1, and Q0, and two for the status signals, (X = Y) and (X > Y). There are three unused state encodings: 101, 110, and 111. We have assumed that the next state from each of these three unused states is unconditionally state 000.

D flip-flops are used to implement the state memory. The K-maps and the next-state equations for D2, D1, and D0 are shown in Figure 72(c).

The control words and output table, having the six control signals, are shown in Figure 72(d). State 000 performs both inputs of X and Y. The two multiplexer select lines, In_X and In_Y, must be asserted so that the data comes from the two primary inputs. The two numbers are loaded into the two corresponding registers by asserting the XLoad and YLoad signals. State 001 is for testing the two conditions, so no operations are performed. The no-op is accomplished by not loading the two registers and not outputting a value. For states 010 and 011, the XY multiplexer select line selects which of the two subtraction operations is performed. Asserting XY performs the operation X - Y, whereas de-asserting XY performs the operation Y - X. The corresponding In_X or In_Y line is de-asserted to route the result from the subtractor back to the input of the register. The corresponding XLoad or YLoad line is asserted to store the result of the subtraction into the correct register. State 100 outputs the result from X by asserting the Out line.

The output equations, as derived from the output table, are shown in Figure 72(e). There is one equation for each of the six control signals. Each equation depends only on the current state (i.e., the current values in Q2, Q1, and Q0). We have assumed that the control signals have don't-care values in all of the unused states.

The complete control unit circuit is shown in Figure 72(f). The state memory consists of three D flip-flops.
The inputs to the flip-flops are the next-state circuits derived from the three next-state equations. The output circuits for the six control signals are derived from the six output equations. The two status signals, (X = Y) and (X > Y), come from the comparator in the datapath. The Out signal is also connected as the Done signal.

(a) [State diagram: state 000 (INPUT X, INPUT Y) goes to state 001. From state 001 the FSM goes to state 100 (OUTPUT X) when (X = Y), to state 010 (X = X - Y) when (X = Y)'(X > Y), and to state 011 (Y = Y - X) when (X = Y)'(X > Y)'. States 010 and 011 return to state 001.]

(b)
Current State    Next State Q2next Q1next Q0next (D2 D1 D0)
Q2Q1Q0           (X = Y), (X > Y) = 00    01     10     11
000                                 001   001    001    001
001                                 011   010    100    100
010                                 001   001    001    001
011                                 001   001    001    001
100                                 100   100    100    100
101 Unused                          000   000    000    000
110 Unused                          000   000    000    000
111 Unused                          000   000    000    000

(c) [K-maps for D2, D1, and D0 over Q2, Q1Q0, and (X = Y), (X > Y); the resulting equations are listed below.]

D2 = Q2Q1'Q0' + Q2'Q1'Q0 (X = Y)
D1 = Q2'Q1'Q0 (X = Y)'
D0 = Q2'Q0' + Q2' (X = Y)' (X > Y)' + Q2'Q1

(d)
Control Word    State Q2Q1Q0    Instruction         In_X    In_Y    XLoad    YLoad    XY    Out
0               000             INPUT X, INPUT Y    1       1       1        1        ×     0
1               001             No operation        ×       ×       0        0        ×     0
2               010             X = X - Y           0       ×       1        0        1     0
3               011             Y = Y - X           ×       0       0        1        0     0
4               100             OUTPUT X            ×       ×       0        0        ×     1

(e)
In_X = Q1'
In_Y = Q0'
XY = Q0'
XLoad = Q2'Q0'
YLoad = Q2'Q1'Q0' + Q2'Q1Q0
Out = Done = Q2

(f) [Circuit diagram: three D flip-flops (D2/Q2, D1/Q1, and D0/Q0) clocked by Clock and cleared by Reset, the status inputs (X = Y) and (X > Y), and the outputs In_X, In_Y, XLoad, YLoad, XY, Out, and Done.]

Figure 72: Control unit for solving the GCD problem: (a) state diagram; (b) next-state table; (c) K-maps and next-state equations; (d) control words and output table; (e) output equations; (f) circuit.

The final microprocessor can now be formed easily by connecting the control unit and the datapath together using the designated control and status signals, as shown in Figure 73.

[Block diagram: the control unit (CU) sends the control signals In_X, In_Y, XLoad, YLoad, XY, and Out to the datapath (DP) and receives the status signals (X = Y) and (X > Y) from it. Clock and Reset go to both blocks. The 8-bit Input X and Input Y enter the DP, the 8-bit Output leaves the DP, and Done is brought out as an output.]

Figure 73: Microprocessor for solving the GCD problem.

6.6.1 Experiments

1. Use the Block Editor in Quartus to draw the various component circuits for the GCD problem. Draw each component in a new Block Diagram/Schematic File design file. Create symbols for these components. If you have already drawn a particular component in another lab, you can reuse it by copying the .bdf and .bsf files to this project's folder.

2. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the datapath using the component symbols that you have just created. Create a symbol for this datapath circuit.

3. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the control unit using the component symbols that you have just created. Create a symbol for this control unit circuit.

4. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the microprocessor circuit by connecting the control signals and status signals between the datapath component symbol and the control unit component symbol. Create a symbol for this microprocessor circuit.

5. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the top-level interface circuit for connecting the microprocessor I/O signals to the FPGA pins. Also add a clock divider circuit as shown in Figure 74. This design file is the top-level file for your project.
[Block diagram: SWITCH[15..8] drives the 8-bit Input X, SWITCH[7..0] drives the 8-bit Input Y, PB[0] drives Reset, CLOCK passes through the clock divider (Clkin to Clkout) to the microprocessor Clock, the 8-bit Output drives LED[7..0], and Done drives LED[15].]

Figure 74: Hardware implementation for the GCD problem.

6. Use the Pin Planner to map the microprocessor I/O signals to the FPGA I/O pins. Refer to Appendix A for the correct pin mappings.

7. Use the Programmer to upload the microprocessor circuit to the trainer board. Test and verify the operation of this microprocessor on the trainer board. For example, for GCD(6,4) = 2, set SWITCH[15..8] to binary 6, SWITCH[7..0] to binary 4, and then press PB[0]. At the end, LED[7..0] will show binary 2.

8. Instead of using two separate mux select signals, In_X and In_Y, for the two input multiplexers, redesign the microprocessor so that it uses only one control signal to control both of these mux select lines.

6.7 Lab 7: Summing Input Numbers

Purpose

In this lab you will design a dedicated microprocessor that sums a sequence of unsigned integer inputs. You will manually design the datapath and the control unit for this microprocessor circuit. The datapath and the control unit are then combined to form the microprocessor. Finally, you will implement this microprocessor on the trainer and verify its operation.

Introduction

In this lab, we will manually design the complete dedicated microprocessor for inputting many 8-bit unsigned numbers through one input port and then outputting the sum of these numbers. The algorithm continues to input numbers as long as the number entered is not 0. Each number entered is also displayed on the output. When the number entered is 0, the algorithm stops and outputs the sum of all of the numbers entered. You will first design a dedicated datapath for the algorithm. Next, you will design the control unit for the datapath. You will use the Block Editor to implement the complete dedicated microprocessor.
Finally, you will test and verify it on the trainer board. The algorithm for solving this problem is listed in Figure 75.

1   sum = 0
2   BEGIN LOOP
3       INPUT X
4       sum = sum + X
5       IF (X = 0) THEN
6           EXIT LOOP
7       END IF
8       OUTPUT X
9   END LOOP
10  OUTPUT sum

Figure 75: Algorithm for solving the summing input numbers problem.

The algorithm shown in Figure 75 has five data manipulation statements in lines 1, 3, 4, 8, and 10. There is one conditional test in line 5. The algorithm requires an adder for summing and two 8-bit registers: one for variable X and one for variable sum. The dedicated datapath is shown in Figure 76.

Line 1 in the algorithm is performed by asserting the Reset signal, line 3 is performed by asserting the XLoad signal, and line 4 is performed by asserting the sumLoad signal. Instead of using a tri-state buffer to output X or sum at lines 8 and 10, the datapath continuously outputs either X or sum using a multiplexer. The Out signal is the select line of the 2-to-1 multiplexer and chooses one of the two sources: register X or register sum. The output of the multiplexer is always available at the output, so either X or sum is always shown at the output. The conditional test (X = 0) is generated by the 8-input NOR gate.

[Datapath diagram: Input feeds the 8-bit register X (Load = XLoad). The output of register X drives the (X = 0) NOR gate, one input of the adder, and input 0 of the output multiplexer. The 8-bit register sum (Load = sumLoad) is loaded from the adder output, and its output drives the other adder input and input 1 of the output multiplexer. The multiplexer is selected by Out and drives Output. Both registers are clocked by Clock and cleared by Reset.]

Figure 76: Datapath for solving the summing input numbers problem.

At first glance, this algorithm is very similar to the GCD algorithm in Lab 6. However, because of one requirement of this problem, the actual hardware implementation of this microprocessor is slightly more difficult. Specifically, the requirement that many different numbers be input through one input port requires careful timing considerations and an understanding of how mechanical switches behave.

As a first try, we begin with the state diagram shown in Figure 77(a).
Line 1 of the algorithm is performed by the asynchronous Reset, so it does not require a state to execute. Line 3 is performed in state 00, which is followed unconditionally by line 4 in state 01. The condition (X = 0) is then tested. If the condition is true, the loop is exited, and the FSM goes to state 11 to output the value for sum and stays in that state until reset. If the condition is false, the FSM goes to state 10 to output X, and the loop repeats back to state 00.

[State diagrams: (a) state 00 (INPUT X, OUTPUT X) goes to state 01 (sum = sum + X, OUTPUT X). From state 01 the FSM goes to state 10 (OUTPUT X) when (X = 0)' and to state 11 (OUTPUT sum) when (X = 0). State 10 returns to state 00. (b) The same diagram, except that state 00 loops to itself on Enter' and goes to state 01 on Enter.]

Figure 77: Incorrect state diagrams for solving the summing input numbers problem.

However, if you implement this circuit in hardware, it will not work correctly. The reason is that the FSM cycles through the three loop states (00, 01, and 10) very fast because of the fast clock speed (16 MHz). As a result, the FSM will have gone through state 00 to input a number many times (several million times a second) before you can even change the input to another number. Hence, the same number will be summed many times.

To correct this problem, we need to add another input signal that acts like the Enter switch. This way, the FSM will stay in state 00, waiting for the Enter signal to be asserted. This gives the user time to set up the input number before pressing the Enter switch. When the Enter signal is asserted, the FSM exits state 00 with the new number to be processed. This modified state diagram is shown in Figure 77(b).

There is still a slight timing problem with this modified state diagram because of the fast clock speed. After pressing the Enter switch, and before you have time to release it, the FSM will have cycled through the complete loop and is back at state 00.
But since you have not yet released the Enter switch, the FSM will continue on another loop with the same input number. What we need to do is to break the loop by waiting for the Enter switch to be released. This is shown in the state diagram in Figure 78(a). State 10 waits for the Enter switch to be released before continuing on and looping back to state 00.

This last state diagram is correct. However, there might be a problem with the operation of the mechanical switch used for the Enter signal. When a mechanical switch is pressed, it usually goes on and off several times before settling down in the on position. This is referred to as switch bounce. While the switch is fluctuating between the on and the off positions, the FSM may again go through the loop many times. What we need to do is to debounce the switch. This, however, need not be done within the FSM circuit itself but in an interface circuit between the microprocessor and the switch.

We will now construct the control unit circuit based on the state diagram shown in Figure 78(a). Four states are used for the five data manipulation statements. All of the states except for 11 output X. State 00 inputs X and waits for the Enter signal. This allows the user to set up the input number and then press the Enter switch. When the Enter switch is pressed, the FSM goes to state 01 to sum X and tests for the condition (X = 0). If the condition is true, the FSM terminates in state 11 and outputs sum; otherwise, it goes to state 10 to wait for the Enter signal to be de-asserted by the user releasing the Enter switch. After exiting state 10, the FSM continues on to repeat the loop in state 00.

The next-state table, as derived from the state diagram, is shown in Figure 78(b). The table requires four variables: two to encode the four states, Q1 and Q0, and two for the status signals, Enter and (X = 0). The state memory is implemented with two D flip-flops.
The K-maps and the next-state equations for D1 and D0 are shown in Figure 78(c).

The control words and output table for the three control signals are shown in Figure 78(d). State 00 performs line 3 of the algorithm in Figure 75 by asserting XLoad and line 8 by de-asserting Out. When Out is de-asserted, X is passed to the output. State 01 performs line 4 and line 8. Line 4 is executed by asserting sumLoad, and line 8 is executed by de-asserting Out. State 10 again performs line 8 by de-asserting Out. Finally, state 11 performs line 10 by asserting Out.

The output equations, as derived from the output table, are shown in Figure 78(e). There is one equation for each of the three control signals. Each equation depends only on the current state, i.e., the current values in Q1 and Q0.

The complete control unit circuit is shown in Figure 78(f). The state memory consists of two D flip-flops. The inputs to the flip-flops are the next-state circuit derived from the two next-state equations. The output circuit for the three control signals is derived from the three output equations. The status signal, (X = 0), comes from the comparator in the datapath.
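To see why the wait-for-release state matters, the control unit can be simulated against a key press that lasts many clock cycles. The Python sketch below is an illustration only. It uses the next-state equations D1 = Q0 + Q1 Enter and D0 = Q1Q0 + Q0 (X = 0) + Q1'Q0'Enter and the output equations XLoad = Q1'Q0', sumLoad = Q1'Q0, and Out = Q1Q0 given in Figure 78. The helper names `simulate` and `press`, the 8-bit truncation, the cleared registers after Reset, and the idealized bounce-free Enter signal are all assumptions of this sketch.

```python
MASK = 0xFF  # assumption: registers are 8 bits wide


def next_state(q1, q0, enter, x_is_zero):
    """Next-state equations from Figure 78(c)."""
    d1 = q0 | (q1 & enter)
    d0 = (q1 & q0) | (q0 & x_is_zero) | ((1 - q1) & (1 - q0) & enter)
    return d1, d0


def control_outputs(q1, q0):
    """Output equations from Figure 78(e)."""
    xload = (1 - q1) & (1 - q0)
    sumload = (1 - q1) & q0
    out = q1 & q0
    return xload, sumload, out


def simulate(stimulus):
    """Apply one (switch_value, enter) pair per clock cycle.

    Returns (q1, q0, x, total) after the last clock edge.
    """
    q1 = q0 = 0
    x = total = 0                                    # assumed state after Reset
    for switches, enter in stimulus:
        xload, sumload, _out = control_outputs(q1, q0)
        x_is_zero = int(x == 0)                      # status from the NOR gate
        d1, d0 = next_state(q1, q0, enter, x_is_zero)
        next_x = (switches & MASK) if xload else x
        next_total = (total + x) & MASK if sumload else total
        q1, q0, x, total = d1, d0, next_x, next_total
    return q1, q0, x, total


def press(value, held=10, released=3):
    """Idealized key press: Enter held for several cycles, then released."""
    return [(value, 1)] * held + [(value, 0)] * released


# Enter 5, then 7, then 0 to finish. Each press lasts 10 clock cycles.
print(simulate(press(5) + press(7) + press(0)))  # (1, 1, 0, 12)
```

Even though Enter stays asserted for ten cycles per press, each number is added once, and the FSM ends in state 11 with the sum 12.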
(a) [State diagram: state 00 (INPUT X, OUTPUT X) loops to itself on Enter' and goes to state 01 (sum = sum + X, OUTPUT X) on Enter. From state 01 the FSM goes to state 10 (OUTPUT X) when (X = 0)' and to state 11 (OUTPUT sum) when (X = 0). State 10 loops to itself on Enter and returns to state 00 on Enter'.]

(b)
Current State    Next State Q1next Q0next (D1 D0)
Q1Q0             Enter, (X = 0) = 00    01    10    11
00                                00    00    01    01
01                                10    11    10    11
10                                00    00    10    10
11                                11    11    11    11

(c) [K-maps for D1 and D0 over Q1Q0 and Enter, (X = 0); the resulting equations are listed below.]

D1 = Q0 + Q1 Enter
D0 = Q1Q0 + Q0 (X = 0) + Q1'Q0'Enter

(d)
Control Word    State Q1Q0    Instruction                 XLoad    sumLoad    Out
0               00            INPUT X, OUTPUT X           1        0          0
1               01            sum = sum + X, OUTPUT X     0        1          0
2               10            OUTPUT X                    0        0          0
3               11            OUTPUT sum                  0        0          1

(e)
XLoad = Q1'Q0'
sumLoad = Q1'Q0
Out = Q1Q0

(f) [Circuit diagram: two D flip-flops (D1/Q1 and D0/Q0) clocked by Clock and cleared by Reset, the inputs Enter and (X = 0), and the outputs XLoad, sumLoad, Out, and Done.]

Figure 78: Control unit for solving the summing input numbers problem: (a) state diagram; (b) next-state table; (c) K-maps and next-state equations; (d) control words and output table; (e) output equations; (f) circuit.

The final microprocessor can now be formed easily by connecting the control unit and the datapath together using the designated control and status signals, as shown in Figure 79.

[Block diagram: the control unit (CU) receives Enter, sends the control signals XLoad, sumLoad, and Out to the datapath (DP), and receives the status signal (X = 0) from it. Clock and Reset go to both blocks. The 8-bit Input enters the DP, the 8-bit Output leaves the DP, and Done is brought out as an output.]

Figure 79: Microprocessor for solving the summing input numbers problem.

6.7.1 Experiments

1. Use the Block Editor in Quartus to draw the various component circuits for the summing input numbers problem. Draw each component in a new Block Diagram/Schematic File design file. Create symbols for these components.
If you have already drawn a particular component in another lab, you can reuse it by copying its .bdf and .bsf files to this project's folder.
2. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the datapath using the component symbols that you have just created. Create a symbol for this datapath circuit.
3. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the control unit using the component symbols that you have just created. Create a symbol for this control unit circuit.
4. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the microprocessor circuit by connecting the control signals and status signals between the datapath component symbol and the control unit component symbol. Create a symbol for this microprocessor circuit.
5. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the top-level interface circuit for connecting the microprocessor I/O signals to the FPGA pins. Also add a clock divider circuit as shown in Figure 80. This design file is the top-level file for your project.

[Figure 80 block diagram: PB[0] drives Reset, PB[1] drives Enter, and SWITCH[7..0] drives the 8-bit Input of the summing input numbers microprocessor; CLOCK passes through the Clock Divider (Clkin to Clkout) to the microprocessor Clock; the 8-bit Output drives LED[7..0] and Done drives LED[15].]
Figure 80: Hardware implementation for the summing input numbers microprocessor.

6. Use the Pin Planner to map the microprocessor I/O signals to the FPGA I/O pins. Refer to Appendix A for the correct pin mappings.
7. Use the Programmer to upload the microprocessor circuit to the trainer board. Test and verify the operation of this microprocessor on the trainer board.

6.8 Lab 8: Finding the Largest Number

Purpose
In this lab you will design a dedicated microprocessor to find the largest number from user inputs. You will manually design the datapath and the control unit for this microprocessor circuit.
The datapath and the control unit are then combined to form the microprocessor. Finally, you will implement this microprocessor on the trainer and verify its operation.

Introduction
In this lab, we will design and implement the classic computer program of finding the largest number from a set of input numbers. We will assume that the numbers, entered through one input port, are 8-bit unsigned numbers. The current largest number is always displayed. The algorithm continues to input numbers as long as the number entered is not 0, and stops when the number entered is 0. You will first design a dedicated datapath for the algorithm. Next, you will design the control unit for the datapath. You will use the Block Editor to implement the complete dedicated microprocessor. Finally, you will test and verify it on the trainer board.

The algorithm for solving this problem is listed in Figure 81. The algorithm shows that there are five data manipulation operations in lines 1, 2, 5, 7, and 8. It requires two registers: an 8-bit register for storing the input X and an 8-bit register for storing Largest. No functional unit for performing arithmetic is needed.

1 Largest = 0                 // for storing the current largest number
2 INPUT X                     // enter first number
3 WHILE (X ≠ 0){
4   IF (X > Largest) THEN     // is new number greater?
5     Largest = X             // yes, remember new largest number
6   END IF
7   OUTPUT Largest
8   INPUT X                   // get next number
9 }
Figure 81: Algorithm for finding the largest number problem.

The dedicated datapath is shown in Figure 82.

[Figure 82 datapath diagram: the 8-bit Input feeds the 8-bit register X (Load = XLoad, Clear = Reset); the output of X feeds the (X = 0) test, the data input of the 8-bit register Largest (Load = LargestLoad, Clear = Reset), and one side of the greater-than comparator that generates (X > Largest); the output of Largest feeds the other side of the comparator and the 8-bit Output port. Both registers share Clock.]
Figure 82: Datapath for finding the largest number problem.
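As a cross-check for the hardware, the algorithm of Figure 81 can be written as a short software reference model. This is only an illustrative sketch: the function name find_largest, the list of displayed values, and the choice to treat an exhausted input list as an entered 0 are assumptions made here, not part of the lab design.

```python
def find_largest(numbers):
    """Software model of Figure 81; `numbers` is the sequence of entered values."""
    stream = iter(numbers)
    largest = 0                       # line 1
    displayed = []                    # values shown by OUTPUT Largest (line 7)
    x = next(stream, 0) & 0xFF        # line 2: inputs are 8-bit unsigned numbers
    while x != 0:                     # line 3
        if x > largest:               # line 4
            largest = x               # line 5
        displayed.append(largest)     # line 7
        x = next(stream, 0) & 0xFF    # line 8
    return largest, displayed
```

Entering 5, 3, 9, 2, and then 0 should leave 9 on the display; the same sequence can later be keyed in on the trainer board to compare against the hardware.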
Line 1 in the algorithm is performed by asserting the Reset signal, lines 2 and 8 are performed by asserting the XLoad signal, and line 5 is performed by asserting the LargestLoad signal. Instead of using a tri-state buffer to output the Largest number in line 7, we will continuously output Largest by connecting the Q output of the Largest register directly to the data Output port. With this solution, no additional control signal and, therefore, no additional state is required for line 7. The conditional test (X = 0) is generated by the 8-input NOR gate, and the conditional test (X > Largest) is generated by the greater-than comparator. This algorithm is very similar to the summing input numbers problem in Lab 7, especially the situation for handling the Enter switch for inputting each number through the same input port. The state diagram is shown in Figure 83(a). State 00 inputs X and waits for the Enter signal. This allows the user to set up the input number and then press the Enter switch. When the Enter switch is pressed, the FSM goes to state 01 to make two tests. If (X = 0) is true, the FSM terminates in state 11. If (X > Largest) is true, then the FSM goes to state 10 to assign X as the new largest number. If X is not 0 and not the largest, then the FSM will go back to state 00 to wait for another input number. Before going back to state 00 from either state 01 or 10, we need to wait for the release of the Enter switch as explained in Lab 7. The next-state table, as derived from the state diagram, is shown in Figure 83(b). The table requires five variables: two to encode the four states, Q1 and Q0, and three for the status signals, Enter, (X = 0), and (X > Largest). The state memory is implemented with two D flip-flops. The K-maps and the next-state equations for D1 and D0 are shown in Figure 83(c). The control words and output table for the three control signals are shown in Figure 83(d). 
State 00 performs lines 2 and 8 of the algorithm by asserting the XLoad signal. All of the states output Largest, and this action does not require any control signals. State 10 performs line 5 by asserting LargestLoad. State 11 outputs a Done signal to inform the user that the FSM has stopped. The output equations, as derived from the output table, are shown in Figure 83(e). There is one equation for each of the three control signals. The complete control unit circuit is shown in Figure 83(f). The state memory consists of two D flip-flops. The inputs to the flip-flops are the next-state circuit derived from the two next-state equations. The output circuit for the three control signals is derived from the three output equations. The status signal (X = 0) comes from the NOR-gate comparator in the datapath, and the status signal (X > Largest) comes from the greater-than comparator in the datapath.

Figure 83(a), state diagram:
State 00 (INPUT X, OUTPUT Largest): stays in state 00 on Enter'; goes to state 01 on Enter.
State 01 (OUTPUT Largest): goes to state 11 on (X = 0); goes to state 10 on (X = 0)'(X > Largest); stays in state 01 on Enter(X = 0)'(X > Largest)'; goes to state 00 on Enter'(X = 0)'(X > Largest)'.
State 10 (Largest = X, OUTPUT Largest): stays in state 10 on Enter; goes to state 00 on Enter'.
State 11 (OUTPUT Largest): final state.

Figure 83(b), next-state table. Each entry is Q1next Q0next (D1 D0).
Current State    Enter, (X = 0), (X > Largest)
Q1Q0             000   001   010   011   100   101   110   111
00               00    00    00    00    01    01    01    01
01               00    10    11    11    01    10    11    11
10               00    00    00    00    10    10    10    10
11               11    11    11    11    11    11    11    11

Figure 83(c), next-state equations obtained from the K-maps:
D1 = Q1Q0 + Q1Enter + Q0(X = 0) + Q0(X > Largest)
D0 = Q1Q0 + Q0(X = 0) + Q1'Q0'Enter + Q1'Enter(X > Largest)'

Figure 83(d), control words and output table:
Control Word   State Q1Q0   Instruction                    XLoad   LargestLoad   Done
0              00           INPUT X, OUTPUT Largest        1       0             0
1              01           OUTPUT Largest                 0       0             0
2              10           Largest = X, OUTPUT Largest    0       1             0
3              11           OUTPUT Largest                 0       0             1

Figure 83(e), output equations:
XLoad = Q1'Q0'
LargestLoad = Q1Q0'
Done = Q1Q0

Figure 83(f), circuit: [two D flip-flops (D1/Q1 and D0/Q0) sharing Clock and Reset; inputs Enter, (X = 0), and (X > Largest); outputs XLoad, LargestLoad, and Done]

Figure 83: Control unit for finding the largest number problem: (a) state diagram; (b) next-state table; (c) K-maps and next-state equations; (d) control words and output table; (e) output equations; (f) circuit.

Connecting the control unit and the datapath together using the control and status signals produces the final microprocessor, as shown in Figure 84.

[Figure 84 block diagram: the control unit (CU) receives Enter, Clock, Reset, and the status signals (X = 0) and (X > Largest), and drives the control signals XLoad and LargestLoad to the datapath (DP); the datapath provides the 8-bit Input and Output ports; Done is an output of the microprocessor.]
Figure 84: Microprocessor for finding the largest number problem.

6.8.1 Experiments
1. Use the Block Editor in Quartus to draw out the various component circuits for the largest number problem. Each component is to be drawn in a new Block Diagram/Schematic File design file. Create symbols for these components. If you have already drawn a particular component in another lab, you can reuse it by copying its .bdf and .bsf files to this project's folder.
2. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the datapath using the component symbols that you have just created. Create a symbol for this datapath circuit.
3. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the control unit using the component symbols that you have just created. Create a symbol for this control unit circuit.
4. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the microprocessor circuit by connecting the control signals and status signals between the datapath component symbol and the control unit component symbol. Create a symbol for this microprocessor circuit.
5. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the top-level interface circuit for connecting the microprocessor I/O signals to the FPGA pins as shown in Figure 85. Also add a clock divider circuit to the clock input of the microprocessor. This design file is the top-level file for your project.

[Figure 85 block diagram: PB[0] drives Reset, PB[1] drives Enter, and SWITCH[7..0] drives the 8-bit Input of the finding the largest number microprocessor; CLOCK passes through the Clock Divider (Clkin to Clkout) to the microprocessor Clock; the 8-bit Output drives LED[7..0] and Done drives LED[15].]
Figure 85: Hardware implementation for the finding the largest number problem.

6. Use the Pin Planner to map the microprocessor I/O signals to the FPGA I/O pins. Refer to Appendix A for the correct pin mappings.
7. Use the Programmer to upload the microprocessor circuit to the trainer board. Test and verify the operation of this microprocessor on the trainer board. Set up a number and press PB[1]. Repeat as many times as you want. Each time a number is entered, LED[7..0] will display the current largest number.

6.9 Lab 9: Hi-Lo Number Guessing Game

Purpose
In this lab we will design a dedicated microprocessor for playing a high-low number guessing game. We will manually design the datapath and the control unit for this microprocessor circuit. The datapath and the control unit are then combined to form the microprocessor. Finally, we will implement this microprocessor on the trainer and verify its operation.
Introduction
In this lab, we will manually design the complete dedicated microprocessor for playing a high-low number guessing game. The user picks a number between 0 and 99, and the microprocessor will use the binary search algorithm to guess the number. After each guess, the user tells the microprocessor whether the guess is high or low compared to the picked number. Two push-buttons, hi_button and lo_button, are used for the user to tell the computer whether the guess is too high, too low, or correct. The hi_button is pressed if the guess is too high, and the lo_button is pressed if the guess is too low. If the guess is correct, both buttons are pressed at the same time.

The algorithm for this high-low number guessing game is listed in Figure 86. The two boundary variables, Low and High, are initialized to 0 and 100, respectively. The loop from line 3 to line 11 keeps repeating until both buttons, hi_button and lo_button, are pressed. Inside the loop, line 4 calculates the next guess by finding the middle number between the lower and upper boundaries, and assigns it to the variable Guess. Line 5 outputs this new Guess. Lines 6 to 10 check which button is pressed. If the lo_button is pressed, the guess is too low, so line 7 changes the Low boundary to the current Guess. Otherwise, if the hi_button is pressed, the guess is too high, and line 9 changes the High boundary to the current Guess. The loop is then repeated with the calculation of the new Guess in line 4. When both buttons are pressed, the condition in line 11 is true, and the loop is exited. Lines 12 to 15 simply cause the display to blink the correct guess by turning it on and off until either one of the buttons is pressed again.
1  Low = 0                                                   // initialize Low boundary
2  High = 100                                                // initialize High boundary
3  REPEAT {
4    Guess = (Low + High) / 2                                // calculate guess using binary search
5    OUTPUT Guess
6    IF (lo_button = '1' AND hi_button = '0') THEN           // low button pressed
7      Low = Guess
8    ELSE IF (lo_button = '0' AND hi_button = '1') THEN      // hi button pressed
9      High = Guess
10   END IF
11 } UNTIL (lo_button = '1' AND hi_button = '1')             // repeat until both buttons are pressed
12 WHILE (lo_button = '0' AND hi_button = '0')               // blink correct guess
13   OUTPUT Guess
14   turn off display
15 END WHILE
Figure 86: Algorithm for the high-low number guessing game.

We will first design a dedicated datapath for the algorithm. Next, we will design the control unit for the datapath. We will use the Block Editor to implement the complete dedicated microprocessor. Finally, we will test and verify its operation on the trainer board.

The algorithm shown in Figure 86 has eight data manipulation operations in lines 1, 2, 4, 5, 7, 9, 13, and 14. The dedicated datapath for realizing this algorithm is shown in Figure 87. It requires three 8-bit registers (Low, High, and Guess) for storing the low and high range boundary values and the guess, respectively. Two 2-to-1 multiplexers are used at the inputs to the Low and High registers to select between the initialization values for lines 1 and 2 and the new Guess value for lines 7 and 9. The only arithmetic operations needed are the addition and division-by-2 in line 4. Hence, the outputs from the two registers, Low and High, go to the inputs of an adder for the addition, and the output of the adder goes to a shifter. The division-by-2 is performed by doing a right shift of 1 bit. The result from the shifter is stored in the register Guess.
Depending on the condition in line 6, the value in Guess is loaded into either the Low or the High register by de-asserting the Init signal to both of the multiplexers and asserting the corresponding load signal for that register. A 2-input by 8-bit AND gate is used to control the output of the Guess number. One 8-bit set of inputs is connected to the output of the Guess register. The other 8-bit set of inputs is connected together in common to the output enable Out signal. By asserting Out, the number from Guess is passed to the output port; otherwise, a zero is passed to the output port. To blink the output display in lines 13 and 14, we just toggle the Out line.

The datapath shown in Figure 87 requires five control signals: Init, LowLoad, HighLoad, GuessLoad, and Out. The Init signal controls the two multiplexers to determine whether to load in the initialization values or the new guess. The three load signals, LowLoad, HighLoad, and GuessLoad, control the writing of these three respective registers. Finally, Out controls the output of the guess value.

[Figure 87 datapath diagram: two 2-to-1 multiplexers, both selected by Init, feed the 8-bit registers Low (Load = LowLoad) and High (Load = HighLoad); input 1 of the multiplexers is the constant "00000000" for Low and "01100100" for High, and input 0 is the output of the Guess register; the outputs of Low and High go to an 8-bit adder followed by a 1-bit right shifter (>> 1), whose result is loaded into the 8-bit register Guess (Load = GuessLoad); the output of Guess is ANDed with Out to drive the 8-bit Output. All registers share Clock and are cleared by Reset.]
Figure 87: Datapath for the high-low number guessing game.

The state diagram for this algorithm requires six states, as shown in Figure 88(a). State 000 is the starting initialization state. State 001 executes lines 4 and 5 by calculating the new guess and outputting it. State 001 also waits for the user keypress. If only the lo_button is pressed, then the FSM goes to state 010 to assign the guess as the new low value. If only the hi_button is pressed, then the FSM goes to state 011 to assign the guess as the new high value.
If both buttons are pressed, then the FSM goes to state 100 to output the guess. From state 100, the FSM turns the output on and off by cycling between states 100 and 101 until a button is pressed. When a button is pressed from either state 100 or 101, the FSM goes back to the initialization state for a new game. With six states, three D flip-flops are needed, which leaves two unused states. The next-state table for this state diagram is shown in Figure 88(b). The three K-maps with the three corresponding next-state equations, D2, D1, and D0, are shown in Figure 88(c). The output table showing the five output signals (Init, LowLoad, HighLoad, GuessLoad, and Out) to be generated in each state is shown in Figure 88(d). The corresponding output equations derived from the output table are shown in Figure 88(e). Using the three next-state equations for deriving the next-state logic circuit, the three D flip-flops for the state memory, and the five output equations for deriving the output logic circuit, we get the complete control unit circuit for the high-low number guessing game, as shown in Figure 88(f).
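The control unit and datapath of this lab can be exercised together in software using the next-state and output equations given in Figure 88(c) and 88(e). The sketch below is only illustrative: the class HiLoProcessor and the functions next_state, control_word, and play are hypothetical names, and it assumes that each button press lasts exactly one clock cycle (on the trainer board, a press held across several slow clock cycles would be sampled more than once).

```python
def next_state(q, hi, lo):
    """Next-state equations D2, D1, D0 from Figure 88(c); q is (Q2, Q1, Q0)."""
    q2, q1, q0 = q
    n2, n1, n0, nhi, nlo = 1 - q2, 1 - q1, 1 - q0, 1 - hi, 1 - lo
    d2 = (q2 & n1 & nhi & nlo) | (n2 & n1 & q0 & hi & lo)
    d1 = (n2 & n1 & q0 & hi & nlo) | (n2 & n1 & q0 & nhi & lo)
    d0 = (n1 & n0 & nhi & nlo) | (n2 & n0) | (n2 & q1) | (n2 & nlo)
    return d2, d1, d0


def control_word(q):
    """Output equations from Figure 88(e)."""
    q2, q1, q0 = q
    n2, n1, n0 = 1 - q2, 1 - q1, 1 - q0
    return {
        "Init": n2 & n1 & n0,
        "HighLoad": (n2 & n1 & n0) | (n2 & q1 & q0),
        "LowLoad": (n2 & n1 & n0) | (n2 & q1 & n0),
        "GuessLoad": n2 & n1 & q0,
        "Out": n2 | n0,
    }


class HiLoProcessor:
    def __init__(self):
        self.state = (0, 0, 0)
        self.low = self.high = self.guess = 0    # registers cleared by Reset

    def output(self):
        """AND-gated output: Guess when Out is asserted, otherwise zero."""
        return self.guess if control_word(self.state)["Out"] else 0

    def clock(self, hi=0, lo=0):
        """One active clock edge: every register updates from the old values."""
        cw = control_word(self.state)
        mid = ((self.low + self.high) & 0xFF) >> 1     # adder, then right shift
        new_low = (0 if cw["Init"] else self.guess) if cw["LowLoad"] else self.low
        new_high = (100 if cw["Init"] else self.guess) if cw["HighLoad"] else self.high
        new_guess = mid if cw["GuessLoad"] else self.guess
        self.low, self.high, self.guess = new_low, new_high, new_guess
        self.state = next_state(self.state, hi, lo)


def play(secret, max_guesses=10):
    """Play one game against a simulated user who knows `secret`."""
    cpu = HiLoProcessor()
    guesses = []
    cpu.clock()                    # state 000: Low = 0, High = 100
    for _ in range(max_guesses):
        cpu.clock()                # state 001: Guess = (Low + High) / 2
        g = cpu.output()
        guesses.append(g)
        hi, lo = int(g >= secret), int(g <= secret)    # both pressed if correct
        cpu.clock(hi, lo)          # still state 001; buttons sampled for one clock
        if hi and lo:
            break
        cpu.clock()                # state 010 or 011: update Low or High
    return guesses
```

With the picked number 37, play(37) produces the guesses 50, 25, and 37, which is the binary search sequence expected from Figure 86.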
Figure 88(a), state diagram:
State 000 (Low = 0, High = 100): goes to state 001 unconditionally.
State 001 (Guess = (Low + High) / 2, OUTPUT Guess): stays in state 001 on (hi_button)'(lo_button)'; goes to state 010 on (hi_button)'(lo_button); goes to state 011 on (hi_button)(lo_button)'; goes to state 100 on (hi_button)(lo_button).
State 010 (Low = Guess): goes to state 001 unconditionally.
State 011 (High = Guess): goes to state 001 unconditionally.
State 100 (OUTPUT Guess): goes to state 101 on (hi_button)'(lo_button)'; goes to state 000 on (hi_button) + (lo_button).
State 101 (Turn Off LEDs): goes to state 100 on (hi_button)'(lo_button)'; goes to state 000 on (hi_button) + (lo_button).

Figure 88(b), next-state table. Each entry is Q2next Q1next Q0next (D2 D1 D0).
Current State    hi_button, lo_button
Q2Q1Q0           00     01     10     11
000              001    001    001    001
001              001    010    011    100
010              001    001    001    001
011              001    001    001    001
100              101    000    000    000
101              100    000    000    000
110 (unused)     000    000    000    000
111 (unused)     000    000    000    000

Figure 88(c), next-state equations obtained from the K-maps:
D2 = Q2Q1'(hi_button)'(lo_button)' + Q2'Q1'Q0(hi_button)(lo_button)
D1 = Q2'Q1'Q0(hi_button)(lo_button)' + Q2'Q1'Q0(hi_button)'(lo_button)
D0 = Q1'Q0'(hi_button)'(lo_button)' + Q2'Q0' + Q2'Q1 + Q2'(lo_button)'

Figure 88(d), control words and output table:
Control Word   State Q2Q1Q0   Instruction                 Init   HighLoad   LowLoad   GuessLoad   Out
0              000            Low = 0, High = 100         1      1          1         0           1
1              001            Guess = (Low + High) / 2    0      0          0         1           1
2              010            Low = Guess                 0      0          1         0           1
3              011            High = Guess                0      1          0         0           1
4              100            OUTPUT Guess                0      0          0         0           1
5              101            Turn off LEDs               0      0          0         0           0

Figure 88(e), output equations:
Init = Q2'Q1'Q0'
HighLoad = Q2'Q1'Q0' + Q2'Q1Q0
LowLoad = Q2'Q1'Q0' + Q2'Q1Q0'
GuessLoad = Q2'Q1'Q0
Out = Q2' + Q0'

Figure 88(f), circuit: [three D flip-flops (D2/Q2, D1/Q1, and D0/Q0) sharing Clock and Reset; inputs (lo_button) and (hi_button); outputs Init, HighLoad, LowLoad, GuessLoad, and Out]

Figure 88: Control unit for the high-low number guessing game: (a) state diagram; (b) next-state table; (c) K-maps and next-state equations; (d) control words and output table; (e) output equations; (f) circuit.

Connecting the datapath circuit shown in Figure 87 and the control unit circuit shown in Figure 88(f) together using the control and status signals produces the final microprocessor, as shown in Figure 89.

[Figure 89 block diagram: the control unit (CU) receives lo_button, hi_button, Clock, and Reset, and drives the control signals Init, HighLoad, LowLoad, GuessLoad, and Out to the datapath (DP); the datapath receives Clock and Reset and provides the 8-bit Output.]
Figure 89: Microprocessor for the high-low number guessing game.

6.9.1 Experiments
1. Use the Block Editor in Quartus to draw out the various component circuits for the high-low number guessing game. Each component is to be drawn in a new Block Diagram/Schematic File design file. Create symbols for these components. If you have already drawn a particular component in another lab, you can reuse it by copying its .bdf and .bsf files to this project's folder.
2. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the datapath using the component symbols that you have just created. Create a symbol for this datapath circuit.
3. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the control unit using the component symbols that you have just created. Create a symbol for this control unit circuit.
4. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the microprocessor circuit by connecting the control signals and status signals between the datapath component symbol and the control unit component symbol. Create a symbol for this microprocessor circuit.
5. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the top-level interface circuit for connecting the microprocessor I/O signals to the FPGA pins as shown in Figure 90. Also add a clock divider circuit to the clock input of the microprocessor. This design file is the top-level file for your project.

[Figure 90 block diagram: PB[0] drives Reset, PB[2] drives lo_button, and PB[1] drives hi_button of the high-low number guessing microprocessor; CLOCK passes through the Clock Divider (Clkin to Clkout) to the microprocessor Clock; the 8-bit Output goes through a Binary to BCD Decoder to HEX1 and HEX0.]
Figure 90: Hardware implementation for the high-low number guessing game.

6. Use the Pin Planner to map the microprocessor I/O signals to the FPGA I/O pins. Refer to Appendix A for the correct pin mappings.
7. Use the Programmer to upload the microprocessor circuit to the trainer board. Test and verify the operation of this microprocessor on the trainer board. On reset, it will display the number 50. Press PB[1] if the displayed number is larger than your number. Press PB[2] if the displayed number is smaller than your number. Press both PB[1] and PB[2] at the same time if the displayed number is equal to your number.

6.10 Lab 10: The EC-1 General-Purpose Microprocessor

Purpose
In this lab we will design the EC-1 general-purpose microprocessor. Using the instructions that are defined specifically for the EC-1 microprocessor, we will write a software program to execute on the EC-1 and see that it actually works! We will manually design the datapath and the control unit for this microprocessor circuit. The datapath and the control unit are then combined to form the microprocessor. Finally, we will implement this microprocessor on the trainer and verify its operation.

Introduction
A general-purpose microprocessor is often referred to as the central processing unit (CPU).
It differs from a dedicated microprocessor in that it can perform different functions under the direction of different software instructions, whereas a dedicated microprocessor is designed to perform just one specific function and does not involve the execution of software instructions. However, a general-purpose microprocessor can also be viewed as a dedicated microprocessor, because it is made to perform only one function: executing program instructions. In this sense, we can design and construct general-purpose microprocessors in the same way that we design dedicated microprocessors.

The operation of a general-purpose microprocessor basically involves three steps, which together are generally referred to as the instruction cycle. In step one, the control unit fetches an instruction from memory. The memory location to be fetched is determined by the content of the program counter (PC) register. The instruction fetched from memory is copied into the instruction register (IR). The PC is then incremented by 1 (assuming that each instruction occupies one memory location). In step two, the instruction that is in the IR is decoded. The control unit checks the encoding of the instruction to determine what instruction it is and then jumps to the state that is assigned for executing that instruction. Once in that particular state, the control unit performs the third and final step by simply asserting the appropriate control signals for controlling the datapath to execute that instruction.

The general-purpose EC-1 microprocessor that we will design in this lab is extremely small and very limited as to what it can do. Nevertheless, the building of this microprocessor demonstrates how a general-purpose microprocessor is designed and how the different components are put together.
In order to keep the manual design of the microprocessor manageable, we have to keep the number of variables small, since these variables determine the number of states and input signals for the finite state machine. After this lab, you will quickly appreciate the power of designing at a higher abstraction level using VHDL and the use of an automatic synthesizer.

6.10.1 Instruction Set
The first step in designing a general-purpose microprocessor is to define its instruction set and how the instructions are encoded and executed. The instructions that our EC-1 general-purpose microprocessor can execute and the corresponding encodings for these instructions are defined in Figure 91. The Instruction column shows the syntax and mnemonic to use for the instruction when writing a program in assembly language. The Encoding column shows the binary encoding defined for the instruction, and the Operation column shows the operation of the instruction.

As we can see from Figure 91, our microprocessor's instruction set has only five instructions. To encode five different instructions, the operation code (opcode) requires three bits, giving us eight different combinations. As shown in the Encoding column, the three most significant bits are the opcode given to an instruction. For example, the opcode for the IN A instruction is 011, the opcode for OUT A is 100, and so on. The three encodings 000, 001, and 010 are not defined and so can be used as a "no-operation" (NOP) instruction.

We have fixed the bit width of each instruction encoding to be eight bits. The remaining bits, in our case the last five bits in each encoding, normally are used as operand bits to specify what registers or other resources to use for data manipulation. In our case, because of the limited instruction set that we have, only the JNZ (Jump Not Zero) instruction uses the last four bits, designated as aaaa, to specify an address in the memory to jump to.
Instruction    Encoding     Operation                     Comment
IN A           011 ×××××    A ← Input                     Input to A
OUT A          100 ×××××    Output ← A                    Output from A
DEC A          101 ×××××    A ← A – 1                     Decrement A
JNZ address    110 ×aaaa    IF (A != 0) THEN PC = aaaa    Jump to address aaaa if A is not zero
HALT           111 ×××××    Halt                          Halt execution

Notations:
A = accumulator, a special-purpose register for data manipulation
PC = program counter, a special-purpose register for keeping track of the next instruction to be executed
aaaa = four bits for specifying a memory address
× = don't-cares
Figure 91: Instruction set for the EC-1.

The IN A instruction inputs an 8-bit value from the data input port, Input, and stores it into the accumulator (A). The accumulator is a special 8-bit register for performing many data operations. The OUT A instruction copies the content of the accumulator to the output port, Output. For the EC-1, the content of the accumulator is always available at the output port, so this OUT A instruction really is not necessary. It is included just because a program should have an output instruction. The DEC A instruction decrements the content of A by 1 and stores the result back into A. The JNZ (Jump Not Zero) instruction tests whether the value in A is equal to 0. If A is equal to 0, then nothing is done. If A is not equal to 0, then the last four bits (aaaa) of the instruction are loaded into the special program counter (PC) register. The PC is used to store the memory address of the next instruction to be fetched from memory and executed. After an instruction is fetched from memory, the PC is incremented by one to point to the next instruction in memory to be fetched. When a different value is loaded into the PC, we essentially are performing a jump to a different instruction at this new memory address. Finally, the HALT instruction halts the CPU by having the control unit stay in the Halt state indefinitely until reset.
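The encodings in Figure 91 can be illustrated with a small assembler and disassembler pair. This is a sketch under stated assumptions: the names OPCODES, MNEMONICS, encode, and decode are hypothetical, the mnemonics are shortened to IN, OUT, DEC, JNZ, and HALT, and the don't-care bits are arbitrarily set to 0.

```python
# 3-bit opcodes from Figure 91 (bits 7-5 of the 8-bit instruction).
OPCODES = {"IN": 0b011, "OUT": 0b100, "DEC": 0b101, "JNZ": 0b110, "HALT": 0b111}
MNEMONICS = {code: name for name, code in OPCODES.items()}


def encode(mnemonic, address=0):
    """Pack the opcode into bits 7-5 and a 4-bit address into bits 3-0.

    Don't-care bits are written as 0 (an assumption; any value would do).
    """
    if not 0 <= address <= 0b1111:
        raise ValueError("address must fit in four bits")
    return (OPCODES[mnemonic] << 5) | address


def decode(byte):
    """Return (mnemonic, address); the undefined opcodes 000-010 decode as NOP."""
    opcode = (byte >> 5) & 0b111
    return MNEMONICS.get(opcode, "NOP"), byte & 0b1111
```

For example, encode("JNZ", 2) gives the byte 11000010, and decode applied to that byte recovers the mnemonic JNZ and the address 2.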
6.10.2 Datapath
Once we have defined the instruction set, we can proceed to design a datapath that can execute all of the operations defined by the instructions in the instruction set. In addition, the datapath must also handle the operations involved in steps 1 and 2 of the instruction cycle, that is, fetching and decoding the instructions. The custom datapath for the EC-1 is shown in Figure 92.

The EC-1 datapath can be viewed as having three separate parts: (1) a part for performing the instruction cycle operations of fetching an instruction and incrementing or loading the PC, (2) the memory for storing the instructions, and (3) a part for performing the data operations for all of the instructions in the instruction set.

The portion of the datapath for performing the instruction cycle operations basically contains the instruction register (IR) and the program counter (PC). The bit width of the instructions determines the size of the IR, whereas the number of addressable memory locations determines the size of the PC. For this datapath, we want a memory with 16 locations, each being 8 bits wide, so we need a 4-bit (2^4 = 16) address. Hence, the PC is 4 bits wide, and the IR is 8 bits wide. A 4-bit increment unit is used to increment the PC by 1. The PC needs to be loaded with either the result of the increment unit or the address from the JNZ instruction. A 2-to-1 multiplexer is used for this purpose. One input of the multiplexer is from the increment unit, and the other input is from the four least significant bits of the IR, i.e., IR3-0.

To keep our design simple, instead of having external memory for storing the program instructions, we have included the memory as part of the datapath. In this design, the memory is a read-only memory (ROM) with 16 locations, each 8 bits wide. Since the instruction set does not have an instruction that writes to memory, we only need a read-only memory.
The output of the PC is connected directly to the 4-bit memory address lines, because the memory location always is determined by the content of the PC. The 8-bit memory output, Q7-0, is connected to the input of the IR for executing the instruction fetch operation (step 1 of the instruction cycle). The portion of the datapath for performing the instruction set operations includes the 8-bit accumulator A and an 8-bit decrement unit. A 2-to-1 multiplexer is used to select the input to the accumulator. For the IN A instruction, the input to the accumulator is from the data input port, Input; whereas for the DEC A instruction, the input is from the output of the decrement unit, which performs the decrement of A. The output of the accumulator is connected directly to the data output port, Output, hence the OUT A instruction does not require any specific datapath actions. Furthermore, with this direct connection, it is equivalent to always performing the OUT A instruction. The JNZ instruction requires an 8-input OR gate connected to the output of the accumulator to test for the condition (A ≠ 0). The output of this 8-input OR gate is the status signal, (A ≠ 0), to the control unit. The actual operation required by the JNZ instruction is to load the PC with the four least significant bits of the IR. The HALT instruction also does not require any specific datapath actions. The control word for this custom datapath has five control signals, IRload, PCload, INmux, Aload, and JNZmux. The datapath provides one status signal, (A ≠ 0), to the control unit.
[Figure 92 block diagram, drawn in three parts. Instruction Cycle Operations: the 8-bit IR (loaded by IRload, with outputs IR7-5 and IR3-0), the 4-bit PC (loaded by PCload), the 4-bit increment unit, and the 2-to-1 multiplexer selected by JNZmux. Memory: the 16 locations × 8 bits ROM with Address3-0 and output Q7-0. Instruction Set Operations: the 8-bit accumulator A (loaded by Aload), the 2-to-1 multiplexer selected by INmux, the 8-bit decrement unit, the (A ≠ 0) status signal, and the Input and Output ports. Clock and Reset are connected to the registers.]

Figure 92: Datapath for the EC-1.

6.10.3 Control Unit

The state diagram for the control unit is shown in Figure 93(a). States for executing the instructions are given the same names as the instruction mnemonics. The first state, Start (000), serves as the initial reset state. No action is performed in this state; however, it provides one extra clock cycle for instructions that require an extra clock cycle to complete their operation. Instructions that do not require this extra clock cycle could technically go straight back to the Fetch state, but we have made all instructions go back to the Start state so that the next-state table and the excitation equations are simpler.

Of the five instructions, only the JNZ instruction requires an extra clock cycle to complete its operation. This is because the PC must be loaded with a new address value if the condition tests true. This new address value, however, is loaded into the PC at the beginning of the next clock cycle. So, if we have the FSM go to the Fetch state in the next clock cycle, then the IR will be loaded with the memory content from the old address and not from the new address. By making the FSM go back to the Start state, we provide one extra clock cycle for the PC to be updated with the new address before the memory is accessed during the next Fetch state.

From the Start state, the control unit goes to the Fetch state unconditionally.
In the Fetch state, the IR is loaded with the memory content from the location specified by the PC by asserting the IRload signal. Furthermore, the PC is incremented by 1, and the result is loaded back into the PC by asserting the PCload signal. The Decode state tests the three most significant bits of the IR, IR7-5, and goes to the corresponding state as encoded by the 3-bit opcode for executing the instruction. In the five execute states corresponding to the five instructions, the appropriate control signals for the datapath are asserted to execute that instruction. For example, the IN A instruction requires setting the INmux signal to a 1 for the input multiplexer, and setting the Aload signal to a 1 to load the input value into A. Notice that, in order for the input instruction to read in the correct value, the input value must be set up first before resetting the CPU. Furthermore, since the Input state does not wait for an Enter key signal, only one value can be read in, even if there are multiple input statements. The DEC A instruction requires setting INmux to 0 and Aload to 1, so the output from the decrement unit is routed back to the accumulator and gets loaded in. The JNZ instruction asserts the JNZmux signal to route the four address bits from the IR, IR3-0, to the PC. Whether the PC actually gets loaded with this new address depends on the condition of the (A ≠ 0) status signal. Hence, the PCload control signal is asserted only if (A ≠ 0) is a 1. By asserting the PCload signal conditionally depending on the status signal (A ≠ 0), the state diagram will require one less state, thus making the finite state machine smaller, otherwise, the FSM will need two states for the JNZ instruction: one state for asserting the PCload signal when (A ≠ 0) is true, and one state for de-asserting the PCload signal when (A ≠ 0) is false. Once the FSM enters the Halt state, it unconditionally loops back to the Halt state, giving the impression that the CPU has halted. 
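The state sequence described above can be traced with a small cycle-level model. The following is a minimal sketch, assuming the state codes of Figure 93 and the countdown program of Figure 95(b); the function name `run_ec1`, the `max_cycles` guard, and the choice to record A only when the Output state is visited are assumptions of this sketch, not part of the design.

```python
# State codes as in Figure 93; an execute state has the same code as its opcode.
START, FETCH, DECODE, INPUT, OUTPUT, DEC, JNZ, HALT = range(8)

# Countdown program from Figure 95(b); unused ROM locations hold 0 (a NOP opcode).
ROM = [0b01100000, 0b10000000, 0b10100000, 0b11000001, 0b11111111] + [0] * 11


def run_ec1(input_value, rom=ROM, max_cycles=10_000):
    """Run until Halt; return (values of A seen in the Output state, final A)."""
    state, pc, ir, a = START, 0, 0, 0
    shown = []
    for _ in range(max_cycles):
        if state == START:
            state = FETCH                      # no action, one spare clock cycle
        elif state == FETCH:
            ir = rom[pc]                       # IRload
            pc = (pc + 1) & 0xF                # PCload with JNZmux = 0
            state = DECODE
        elif state == DECODE:
            opcode = ir >> 5                   # IR7-5
            state = opcode if opcode >= INPUT else START   # 000-010 are NOPs
        elif state == INPUT:
            a = input_value & 0xFF             # INmux = 1, Aload = 1
            state = START
        elif state == OUTPUT:
            shown.append(a)                    # A is always on the Output port
            state = START
        elif state == DEC:
            a = (a - 1) & 0xFF                 # INmux = 0, Aload = 1
            state = START
        elif state == JNZ:
            if a != 0:                         # PCload asserted only if (A != 0)
                pc = ir & 0xF                  # JNZmux = 1 routes IR3-0 to the PC
            state = START
        else:                                  # HALT loops to itself; stop here
            return shown, a
    raise RuntimeError("program did not reach the Halt state")


trace, final_a = run_ec1(3)
```

With an input of 3, this model visits the Output state with A equal to 3, 2, and 1, and halts with A equal to 0. Because the accumulator drives the Output port directly, the real display also shows the final 0.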
The next-state table for the state diagram and the three next-state equations, as derived from the table, are shown in Figure 93 (b) and (c), respectively. With eight states, three D flip-flops are needed for the state memory of the control unit circuit. Notice that the derivation of the next-state equations is fairly straightforward, since most of the entries in the table contain 0’s. Only the Decode state row contains different values. There are five control signals, IRload, PCload, INmux, Aload, and JNZmux, that the control unit needs to generate for controlling the datapath. There is an additional control output signal, Halt, that is asserted when the microprocessor executes the Halt instruction. The control words and output table for controlling this datapath are shown in Figure 93(d). The output equations, as derived from the output table, are shown in Figure 93(e). Finally, we can derive the complete circuit for the control unit based on the next-state equations and the output equations. The complete control unit circuit for the EC-1 general-purpose microprocessor is shown in Figure 93(f). 
(a) State diagram. Transitions:

Start (000)   -> Fetch (001)    unconditionally
Fetch (001)   -> Decode (010)   unconditionally
Decode (010)  -> Start (000)    if IR7-5 = 000, 001, or 010
Decode (010)  -> Input (011)    if IR7-5 = 011
Decode (010)  -> Output (100)   if IR7-5 = 100
Decode (010)  -> Dec (101)      if IR7-5 = 101
Decode (010)  -> Jnz (110)      if IR7-5 = 110
Decode (010)  -> Halt (111)     if IR7-5 = 111
Input (011), Output (100), Dec (101), Jnz (110) -> Start (000)
Halt (111)    -> Halt (111)

(b) Next-state table. Entries are the next state Q2next Q1next Q0next (D2 D1 D0).

Current State    IR7, IR6, IR5
Q2Q1Q0           000   001   010   011     100      101   110   111
                 NOP   NOP   NOP   INPUT   OUTPUT   DEC   JNZ   HALT
000 Start        001   001   001   001     001      001   001   001
001 Fetch        010   010   010   010     010      010   010   010
010 Decode       000   000   000   011     100      101   110   111
011 Input        000   000   000   000     000      000   000   000
100 Output       000   000   000   000     000      000   000   000
101 Dec          000   000   000   000     000      000   000   000
110 Jnz          000   000   000   000     000      000   000   000
111 Halt         111   111   111   111     111      111   111   111

(c) Next-state equations.

D2 = Q2Q1Q0 + Q2'Q1Q0'IR7
D1 = Q2Q1Q0 + Q2'Q1Q0'(IR6IR5 + IR7IR6) + Q2'Q1'Q0
D0 = Q2Q1Q0 + Q2'Q1Q0'(IR6IR5 + IR7IR5) + Q2'Q1'Q0'

(d) Control words and output table.

Control   State
Word      Q2Q1Q0        IRload   PCload                        INmux   Aload   JNZmux   Halt
0         000 Start     0        0                             0       0       0        0
1         001 Fetch     1        1                             0       0       0        0
2         010 Decode    0        0                             0       0       0        0
3         011 Input     0        0                             1       1       0        0
4         100 Output    0        0                             0       0       0        0
5         101 Dec       0        0                             0       1       0        0
6         110 Jnz       0        IF (A ≠ 0) THEN 1 ELSE 0      0       0       1        0
7         111 Halt      0        0                             0       0       0        1

(e) Output equations.

IRload = Q2'Q1'Q0
PCload = Q2'Q1'Q0 + Q2Q1Q0'(A ≠ 0)
INmux  = Q2'Q1Q0
Aload  = Q2'Q1Q0 + Q2Q1'Q0
JNZmux = Q2Q1Q0'
Halt   = Q2Q1Q0

(f) [Circuit diagram: three D flip-flops (D2/Q2, D1/Q1, D0/Q0) with Clock and Reset; inputs (A ≠ 0), IR7, IR6, and IR5; outputs IRload, PCload, INmux, Aload, JNZmux, and Halt.]

Figure 93: Control unit for the EC-1: (a) state diagram; (b) next-state table; (c) next-state equations; (d) control words and output table; (e) output equations; (f) circuit.
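As a cross-check, the next-state equations of Figure 93(c) can be evaluated for every combination of current state and opcode and compared with the next-state table of Figure 93(b). The sketch below does this exhaustively; the function names and the `mismatches` list are assumptions of this illustration.

```python
from itertools import product


def next_state_table(state, opcode):
    """Next state as listed in Figure 93(b); both arguments are 3-bit integers."""
    if state == 0b000:                       # Start -> Fetch
        return 0b001
    if state == 0b001:                       # Fetch -> Decode
        return 0b010
    if state == 0b010:                       # Decode: 000-010 are NOPs
        return opcode if opcode >= 0b011 else 0b000
    if state == 0b111:                       # Halt loops to itself
        return 0b111
    return 0b000                             # all execute states -> Start


def next_state_equations(state, opcode):
    """Next state computed from the D2, D1, D0 equations of Figure 93(c)."""
    q2, q1, q0 = (state >> 2) & 1, (state >> 1) & 1, state & 1
    ir7, ir6, ir5 = (opcode >> 2) & 1, (opcode >> 1) & 1, opcode & 1
    inv = lambda bit: 1 - bit                # logical NOT of a single bit
    halt = q2 & q1 & q0                      # Q2 Q1 Q0
    decode = inv(q2) & q1 & inv(q0)          # Q2' Q1 Q0'
    d2 = halt | (decode & ir7)
    d1 = halt | (decode & ((ir6 & ir5) | (ir7 & ir6))) | (inv(q2) & inv(q1) & q0)
    d0 = halt | (decode & ((ir6 & ir5) | (ir7 & ir5))) | (inv(q2) & inv(q1) & inv(q0))
    return (d2 << 2) | (d1 << 1) | d0


mismatches = [
    (state, opcode)
    for state, opcode in product(range(8), repeat=2)
    if next_state_table(state, opcode) != next_state_equations(state, opcode)
]
```

An empty `mismatches` list means that the equations reproduce all 64 table entries.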
6.10.4 EC-1 Microprocessor Circuit

The complete circuit for the EC-1 general-purpose microprocessor is constructed by connecting the datapath from Figure 92 and the control unit from Figure 93(f) together using the designated control and status signals, as shown in Figure 94.

[Figure 94 block diagram: the control unit (CU) sends the control signals IRload, PCload, INmux, Aload, and JNZmux to the datapath (DP); the datapath returns the status signals IR7-5 and (A ≠ 0). Clock and Reset go to both blocks. The 8-bit Input and Output ports connect to the datapath, and Halt is an output of the control unit.]

Figure 94: Complete circuit for the EC-1 general-purpose microprocessor.

6.10.5 Sample Program

Dedicated microprocessors have an algorithm built right into the hardware circuit of the microprocessor. General-purpose microprocessors, on the other hand, do not have an algorithm built into their circuitry. They are designed only to execute program instructions fetched from memory. Hence, in order for the EC-1 computer to work, it needs to execute a software program stored in its memory. The program, of course, can only use instructions that are defined in its instruction set.

There are only five instructions in the instruction set defined for the EC-1, as shown in Figure 91. For our sample program, we will use these five instructions to write a program that inputs a number and then counts down from this input number to 0. The program listing is shown in Figure 95(a). Since we do not have a compiler for the EC-1 that can automatically translate the program into binary executable code, we need to compile this program by hand. The binary executable code for this program is shown in Figure 95(b).

The binary code is obtained by replacing each instruction with its corresponding 3-bit opcode, as defined in Figure 91, followed by 5 bits for the operand. None of the instructions except JNZ uses these five operand bits, so for those instructions either a 0 or a 1 can be used in each operand bit.
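The hand-encoding rule just described can be sketched as a tiny encoder. This is an illustration only: the opcode values are the ones that appear in the decode conditions of Figure 93(a), while the function name `encode` and its `fill` parameter (which picks 0 or 1 for the unused operand bits) are assumptions of this sketch.

```python
# 3-bit opcodes, as they appear in the decode conditions of Figure 93(a).
OPCODES = {"IN": 0b011, "OUT": 0b100, "DEC": 0b101, "JNZ": 0b110, "HALT": 0b111}


def encode(mnemonic, address=0, fill=0):
    """Return the 8-bit encoding (as a bit string) of one EC-1 instruction.

    address is used only by JNZ and must fit in 4 bits (IR3-0).
    fill selects the value of the unused operand bits for other instructions.
    """
    opcode = OPCODES[mnemonic]
    if mnemonic == "JNZ":
        if not 0 <= address < 16:
            raise ValueError("JNZ address must be in the range 0..15")
        operand = address                    # bit 4 is unused and left at 0
    else:
        operand = 0b11111 if fill else 0     # don't-care operand bits
    return format((opcode << 5) | operand, "08b")


program = [encode("IN"), encode("OUT"), encode("DEC"),
           encode("JNZ", address=0b0001), encode("HALT", fill=1)]
```

For the countdown program, this yields the same five bytes as the hand-compiled listing in Figure 95(b).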
From Figure 91, we find that the opcode for the IN A instruction is 011; therefore, the encoding for this first instruction is 01100000. Similarly, the opcode for the OUT A instruction is 100; therefore, the encoding used is 10000000. For the JNZ instruction, the four least significant bits represent the memory address to jump to if the condition is true. In the example, we are assuming that the first instruction, IN A, is stored in memory location 0000. The JNZ instruction jumps to the second instruction, OUT A, which is stored in memory location 0001, so the four address bits for the JNZ instruction are 0001. The opcode for the JNZ instruction is 110; hence, the encoding for the complete JNZ instruction is 11000001.

        IN A        -- input a value into the A register
loop:   OUT A       -- output the value from the A register
        DEC A       -- decrement A by one
        JNZ loop    -- go back to loop if A is not zero
        HALT        -- halt

(a)

memory address    instruction encoding
0000              01100000;    -- IN A
0001              10000000;    -- OUT A
0010              10100000;    -- DEC A
0011              11000001;    -- JNZ loop
0100              11111111;    -- HALT

(b)

Figure 95: Countdown program to run on the EC-1: (a) assembly code; (b) binary executable code.

Normally, the program instructions are stored in memory that is external to the CPU, and the computer (with the help of the operating system) provides a means to independently load the instructions into the memory. However, to keep our design simple, we have included the memory as part of the CPU inside the datapath. Furthermore, we do not have an operating system for loading the instructions into the memory separately. Therefore, our program must be "loaded" into the memory before the synthesis of the datapath and the microprocessor. The memory circuit that we have used is from Quartus' LPM component library. Information on the usage of this component can be obtained from the Help menu in the Quartus II software.
This memory is initialized with the content of the text file named program.mif. Therefore, to load our memory with the program, we must enter the binary encoding of the program in this text file and then re-synthesize the microprocessor circuit. The content of this program.mif file for the countdown program is shown in Figure 96. Text after two hyphens (--) is a comment, and all of the capitalized words are keywords. Remember to re-synthesize your microprocessor every time you make changes to the program.mif file.

-- Content of the ROM memory in the file PROGRAM.MIF
DEPTH = 16;             -- Number of memory locations: 4-bit address
WIDTH = 8;              -- Data width of memory: 8-bit data
ADDRESS_RADIX = BIN;    -- Specifies the address values are in binary
                        -- Other valid radixes are HEX, DEC, OCT, BIN
DATA_RADIX = BIN;       -- Specifies the data values are in binary
-- Specify memory content.
-- Format of each memory location is
--   address : data
CONTENT
BEGIN
[0000..1111] : 00000000;    -- Initialize locations range 0-F to 0
-- Program to countdown from an input to 0
0000 : 01100000;    -- IN A
0001 : 10000000;    -- OUT A
0010 : 10100000;    -- DEC A
0011 : 11000001;    -- JNZ 0001
0100 : 11111111;    -- HALT
END;

Figure 96: The program.mif file containing the countdown program to run on the EC-1.

6.10.6 Hardware Implementation

Figure 97 shows the interface between the EC-1 microprocessor and the input and output devices on the trainer board. The input simply consists of eight switches, and the output is two 7-segment LED displays. Since the microprocessor outputs an 8-bit binary number, we need an 8-bit binary to 2-digit BCD (binary coded decimal) decoder, so that we can see the 8-bit binary number as two decimal digits on the two 7-segment LED displays. A single LED is used to show when the microprocessor has halted. A push button switch is used as the Reset key.
Finally, the 16 MHz clock is slowed down with a clock divider circuit. The slower clock speed lets us see the intermediate results on the display.

[Figure 97 block diagram: SWITCH[7..0] drives the 8-bit Input of the EC-1 microprocessor; PB[0] drives Reset; CLOCK passes through the clock divider (Clkin, Clkout) to the microprocessor Clock; the 8-bit Output goes through the binary to BCD decoder to HEX1 and HEX0; Halt drives LED[15].]

Figure 97: Hardware implementation for the EC-1.

6.10.7 Experiments

1. Use the Block Editor in Quartus to draw the various component circuits for the EC-1 computer. Each component is to be drawn in a new Block Diagram/Schematic File design file. Create symbols for these components. See the instructions below for creating the ROM. If you have already drawn a particular component in another lab, you can reuse it by copying the .bdf and .bsf files to this project's folder.

2. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the datapath using the component symbols that you have just created. Create a symbol for this datapath circuit.

3. To use the ROM from the Quartus library, open up the Symbol Tool window and click on the MegaWizard Plug-In Manager button as shown next.

On page 1 of the MegaWizard Plug-In Manager window, select Create a new custom megafunction variation, and then click Next as shown next.

Under the megafunction list on the left side, select Memory Compiler > ROM: 1-PORT. Append the name ROM to the end of the path in the name text field as shown next.

In the MegaWizard Plug-In Manager [page 1 of 5] window, select 8 bits for the ‘q’ output bus width, and select 16 words for the memory size as shown next.

In the MegaWizard Plug-In Manager [page 2 of 5] window, de-select the check mark in the ‘q’ output port as shown next.
In the MegaWizard Plug-In Manager [page 3 of 5] window, select Yes, use this file for the memory content data, and type in the filename program.mif as shown next.

You can skip the last two pages in the MegaWizard Plug-In Manager windows, and then click Finish. You can now insert this new symbol for the ROM into your Block Editor.

4. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the control unit using the component symbols that you have just created. Create a symbol for this control unit circuit.

5. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the microprocessor circuit by connecting the control signals and status signals together between the datapath component symbol and the control unit component symbol. Create a symbol for this microprocessor circuit.

6. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the top-level interface circuit for connecting the microprocessor I/O signals to the FPGA pins as shown in Figure 97. Also add a clock divider and a binary-to-BCD decoder to the interface circuit. This design file is the top-level file for your project.

7. Use a text editor such as Notepad to create the file program.mif as listed in Figure 96. Alternatively, you can just copy this file from the DVD under the folder \Circuits\Lab 11\EC-2. Save this file in your project folder using the name program.mif.

8. Use the Pin Planner to map the microprocessor I/O signals to the FPGA I/O pins. Refer to Appendix A for the correct pin mappings.

9. Use the Programmer to upload the microprocessor circuit to the trainer board. Test and verify the operation of this microprocessor on the trainer board. Set up SWITCH[7..0] for the starting input count n, then press PB[0] to start the count.
6.11 Lab 11: The EC-2 General-Purpose Microprocessor

Purpose

In this lab we will design the EC-2 general-purpose microprocessor. Using the instructions that are defined specifically for the EC-2 microprocessor, we will write a software program to execute on the EC-2 and see that it actually works! We will manually design the datapath and the control unit for this microprocessor circuit. The datapath and the control unit are then combined together to form the microprocessor. Finally, we will implement this microprocessor on the trainer and verify its operation.

Introduction

In this lab, we will design the EC-2, a second version of our EC general-purpose microprocessor. The main difference between the EC-1 and the EC-2 is that the EC-2 has a richer instruction set. With this instruction set, more complex programs can be written.

6.11.1 Instruction Set

The instruction set for the EC-2 general-purpose microprocessor has eight instructions, as shown in Figure 98. The reason for keeping this number at eight is so that we can still use only 3 bits to encode all of them. The LOAD instruction loads the content of the memory at the specified address into the accumulator, A. The address is specified by the five least significant bits of the instruction. The STORE instruction is similar to the LOAD instruction, except that it stores the value in A to the memory at the specified address. The ADD and SUB instructions, respectively, add and subtract the content of A with the content in a memory location and store the result back into A. The IN instruction inputs a value from the data input port, Input, and stores it into A. The JZ instruction loads the PC with the specified address if A is zero. Loading the PC with a new address simply causes the CPU to jump to this new memory location. The JPOS instruction loads the PC with the specified address if A is a positive number.
The value in A is taken as a two's complement signed number, so a positive number is one where the most significant bit of the number is a 0. Finally, the HALT instruction halts the CPU.

Instruction          Encoding     Operation                       Comment
LOAD A, address      000 aaaaa    A ← M[aaaaa]                    Load A with content of memory location aaaaa
STORE A, address     001 aaaaa    M[aaaaa] ← A                    Store A into memory location aaaaa
ADD A, address       010 aaaaa    A ← A + M[aaaaa]                Add A with M[aaaaa] and store the result back into A
SUB A, address       011 aaaaa    A ← A - M[aaaaa]                Subtract A with M[aaaaa] and store result back into A
IN A                 100 ×××××    A ← Input                       Input to A
JZ address           101 aaaaa    IF (A = 0) THEN PC = aaaaa      Jump to address if A is zero
JPOS address         110 aaaaa    IF (A ≥ 0) THEN PC = aaaaa      Jump to address if A is a positive number
HALT                 111 ×××××    Halt                            Halt execution

Notations:
A     = accumulator.
M     = memory.
PC    = program counter.
aaaaa = five bits for specifying a memory address.
×     = don't cares.

Figure 98: Instruction set for the EC-2.

6.11.2 Datapath

The custom datapath for the EC-2 is shown in Figure 99. The portion of the datapath for performing the instruction cycle operations is very similar to that of the previous EC-1, with the instruction register (IR), the program counter (PC), and the increment unit for incrementing the PC. The minor differences between the two are in the size of the PC and the increment unit. In this second design, we want a memory with 32 locations; hence, the address and, therefore, the PC and the increment unit must all be 5 bits wide.

The main modification to this portion of the datapath is the addition of a second 2-to-1 multiplexer that is connected between the output of the PC and the memory address input. One input of this multiplexer comes from the PC, and the other input comes from the five least significant bits of the IR, IR4-0.
This multiplexer is needed because there are now two different types of operations that access memory. The first is still the fetch operation, where the memory address is given by the content of the PC. The second type is for the four instructions LOAD, STORE, ADD, and SUB, which use the memory as an operand. Hence, the memory address for these four instructions is given by the five least significant bits of the IR, IR4-0. The select signal for this multiplexer is Meminst.

The memory size for the EC-2 is increased to 32 locations, thus requiring five address bits. The memory is still included as part of the datapath rather than as an independent external unit to the CPU. In order to accommodate the STORE instruction for storing the value of A into the memory, we need to use a RAM instead of the ROM used in the EC-1. To realize this operation, the output of the accumulator A is connected to the memory data input, D7-0. The signal MemWr, when asserted, causes the memory to write the value from register A into the location specified by the address in the instruction.

The output of the memory at Q7-0 is connected to both the input of the IR and the input of the accumulator, A, through a 4-to-1 multiplexer. The connection to the IR is for the fetch operation, just like in the EC-1 design. The connection to the accumulator is for performing the LOAD instruction, where the content of the memory is loaded into A. Since the memory is only one of three different sources that can be loaded into A, the multiplexer is needed.

The portion of the datapath for performing the instruction set operations includes the 8-bit accumulator A, an 8-bit adder-subtractor unit, and a 4-to-1 multiplexer. The adder-subtractor unit performs the ADD and SUB instructions. The Sub signal, when asserted, selects the subtraction operation, and when de-asserted, selects the addition operation. The 4-to-1 multiplexer allows the accumulator input to come from one of three different sources.
For the ADD and SUB instructions, the A input comes from the output of the adder-subtractor unit. For the IN instruction, the A input comes from the data input port, Input. For the LOAD instruction, the A input comes from the output of the memory, Q7-0. The selection of this multiplexer is through the two signal lines, Asel1-0. The fourth input of the multiplexer is not used. Similar to the previous EC-1 design, the output port is connected directly to the output of the accumulator, A. Therefore, the value of the accumulator is always available at the output port, and no specific output instruction is necessary to output the value in A. For the two conditional jump instructions, JZ and JPOS, the datapath provides the two status signals, (A = 0) and (A ≥ 0), respectively, that are generated from two comparators. The (A = 0) status signal outputs a 1 if the value in A is a 0, hence an 8-input NOR gate is used. The (A ≥ 0) status signal outputs a 1 if the value in A, which is treated as a two’s complement signed number, is a positive number. Since for a two’s complement signed number, a leading 0 means positive and a leading 1 means negative, hence, (A ≥ 0) is simply the negated value of A7 (the most significant bit of A). The actual operation required by the JZ and JPOS instructions is to load the PC with the five least significant bits of the IR. The HALT instruction does not require any specific datapath actions. The control word for this custom datapath has nine control signals, IRload, JMPmux, PCload, Meminst, MemWr, Asel1, Asel0, Aload, and Sub. The datapath provides two status signals, (A = 0) and (A ≥ 0), to the control unit. 
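The instruction semantics of Figure 98 and the two status signals described above can be sketched as an instruction-level model. This is a simplified illustration that ignores the clock-cycle timing of the control unit: the function name `run_ec2`, the `inputs` list that stands in for successive values on the Input port, and the `max_steps` guard are assumptions of this sketch. The `GCD` and `COUNT` lists are the encodings of two of the sample programs in Figure 102.

```python
def run_ec2(program, inputs, max_steps=100_000):
    """Execute an EC-2 program; return (final A, final memory contents)."""
    mem = list(program) + [0] * (32 - len(program))    # 32 locations x 8 bits
    inputs = iter(inputs)
    pc, a = 0, 0
    for _ in range(max_steps):
        ir = mem[pc]                                   # fetch
        pc = (pc + 1) & 0x1F
        opcode, addr = ir >> 5, ir & 0x1F              # IR7-5 and IR4-0
        a_eq_0 = int(a == 0)                           # 8-input NOR of A7-0
        a_ge_0 = 1 - ((a >> 7) & 1)                    # negated A7 (sign bit)
        if opcode == 0b000:                            # LOAD
            a = mem[addr]
        elif opcode == 0b001:                          # STORE
            mem[addr] = a
        elif opcode == 0b010:                          # ADD
            a = (a + mem[addr]) & 0xFF
        elif opcode == 0b011:                          # SUB
            a = (a - mem[addr]) & 0xFF
        elif opcode == 0b100:                          # IN
            a = next(inputs) & 0xFF
        elif opcode == 0b101:                          # JZ
            if a_eq_0:
                pc = addr
        elif opcode == 0b110:                          # JPOS
            if a_ge_0:
                pc = addr
        else:                                          # HALT
            return a, mem
    raise RuntimeError("program did not halt")


# GCD program from Figure 102 (variables x and y live at 11110 and 11111).
GCD = [
    0b10000000, 0b00111110, 0b10000000, 0b00111111,
    0b00011110, 0b01111111, 0b10110000, 0b11001100,
    0b00011111, 0b01111110, 0b00111111, 0b11000100,
    0b00011110, 0b01111111, 0b00111110, 0b11000100,
    0b00011110, 0b11111111,
]
# COUNT program from Figure 102 (the constant 1 is stored at 11111).
COUNT = [0b10000000, 0b01111111, 0b10100100, 0b11000001, 0b11111111] + [0] * 26 + [1]

gcd_result, _ = run_ec2(GCD, [12, 8])
count_result, _ = run_ec2(COUNT, [3])
```

With inputs 12 and 8, the GCD program in this model halts with 4 in the accumulator, and COUNT with an input of 3 halts with 0. Note that (A ≥ 0) is taken from the sign bit alone, so the JPOS condition is also true when A is zero.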
[Figure 99 block diagram, drawn in three parts. Instruction Cycle Operations: the 8-bit IR (loaded by IRload, with outputs IR7-5 and IR4-0), the 5-bit PC (loaded by PCload), the 5-bit increment unit, the 2-to-1 multiplexer selected by JMPmux, and the 2-to-1 multiplexer selected by Meminst that drives the memory address. Memory: the 32 locations × 8 bits RAM with Address4-0, data input D7-0, write enable WE driven by MemWr, and output Q7-0. Instruction Set Operations: the 8-bit accumulator A (loaded by Aload), the 4-to-1 multiplexer selected by Asel1-0, the adder-subtractor unit controlled by Sub, the status signals (A = 0) and (A ≥ 0) derived from A7-0 and A7, and the Input and Output ports. Clock and Reset are connected to the registers.]

Figure 99: Datapath for the EC-2.

6.11.3 Control Unit

The state diagram for the control unit is shown in Figure 100(a), and the actions that are executed, specifically the control signals that are asserted in each state, are shown in Figure 100(d). States for executing the instructions are given the same names as the instruction mnemonics. The first three states, Start, Fetch, and Decode, serve the same purpose as in the previous EC-1's control unit. The Decode state for this second design, however, needs to decode eight opcodes by branching to eight different states for executing the corresponding eight instructions. Like before, the decoding of the opcodes depends on the three most significant bits of the IR.

A very important timing issue for this control unit has to do with the memory accesses of the four instructions LOAD, STORE, ADD, and SUB. The problem here is that only after fetching these instructions will the address of the memory location for these instructions be available. Furthermore, only after decoding the instruction will the control unit know that the memory needs to be read. If we change the memory address during the Execute state, the memory will not have enough time to output the value for the instruction to operate on.

Normally, for instructions requiring a memory access for one of their operands, an extra clock cycle for a memory read state will be inserted between the Decode state and the Execute state.
This way, the memory will have one clock cycle to output the data for the instruction to operate on in the following clock cycle. This, of course, is assuming that the memory requires only one clock cycle for a read operation. If the memory is slower, then more clock cycles (states) must be inserted in between.

To minimize the number of states in our design, we have used the Decode state to also perform the memory read. This way, when the control unit gets to the Execute state, the memory will already have the data ready. Whether the data from the memory actually is used or not will depend on the instruction being executed. If the instruction does not require the data from the memory, it is simply ignored. On the other hand, if the instruction needs the data, then the data is there and ready to be used. This solution works in this design because it does not conflict with the operations for the rest of the instructions in our instruction set.

The memory read operation performed in the Decode state is accomplished by asserting the Meminst signal from this state. Looking at the output table in Figure 100(d), this is reflected by the 1 under the Meminst column for the Decode state.

The actual execution of each instruction is accomplished by asserting the correct control signals to control the operation of the datapath. This is shown by the assignments made for the respective rows in the output table in Figure 100(d). At this point, you should be able to understand why each assignment is made by looking at the operation of the datapath. For example, for the LOAD instruction, the Asel1 signal needs to be asserted and the Asel0 signal needs to be de-asserted in order to select input 2 of the multiplexer so that the output from the memory can pass to the input of the accumulator, A. The actual loading of A is done by asserting the Aload signal.
To perform the STORE instruction, the memory address is taken from the IR by asserting Meminst. The writing into memory takes place when MemWr is asserted.

The Input state for this state diagram waits for the Enter key signal before looping back to the Start state. In so doing, we can read in several values correctly by having multiple input statements in the program. Notice that after the Enter signal is asserted, there is no state that waits for the Enter signal to be de-asserted (i.e., for the Enter key to be released). Hence, the input device must resolve this issue by outputting exactly one clock pulse each time the Enter key is pressed. This is accomplished at the computer circuit level by using a one-shot circuit.

The next-state table for the state diagram and the four next-state equations, as derived from the next-state table, are shown in Figure 100 (b) and (c), respectively. To keep the table reasonably small, not all of the possible combinations of the input signals are listed. All of the states, except for the Input state, depend only on the three IR bits, IR7-5; whereas, the Input state depends only on the Enter signal. The blank entries in the table, therefore, can be viewed as having all 0's. With 11 states, four D flip-flops are needed for the state memory of the control unit circuit.

There are nine control signals, IRload, JMPmux, PCload, Meminst, MemWr, Asel1, Asel0, Aload, and Sub, that the control unit needs to generate for controlling the datapath. The datapath provides two status signals, (A = 0) and (A ≥ 0), to the control unit. The control words and output table for controlling this datapath are shown in Figure 100(d). The output equations shown in Figure 100(e) are derived directly from the output table in Figure 100(d).

Finally, we can derive the complete circuit for the control unit based on the next-state equations and the output equations.
The complete control unit circuit for the EC-2 general-purpose microprocessor is shown in Figure 100(f).

(a) State diagram. Transitions:

Start (0000)   -> Fetch (0001)    unconditionally
Fetch (0001)   -> Decode (0010)   unconditionally
Decode (0010)  -> Load (1000)     if IR7-5 = 000
Decode (0010)  -> Store (1001)    if IR7-5 = 001
Decode (0010)  -> Add (1010)      if IR7-5 = 010
Decode (0010)  -> Sub (1011)      if IR7-5 = 011
Decode (0010)  -> Input (1100)    if IR7-5 = 100
Decode (0010)  -> Jz (1101)       if IR7-5 = 101
Decode (0010)  -> Jpos (1110)     if IR7-5 = 110
Decode (0010)  -> Halt (1111)     if IR7-5 = 111
Input (1100)   -> Input (1100)    if Enter'
Input (1100)   -> Start (0000)    if Enter
Load (1000), Store (1001), Add (1010), Sub (1011), Jz (1101), Jpos (1110) -> Start (0000)
Halt (1111)    -> Halt (1111)

(b) Next-state table. Entries are the next state Q3next Q2next Q1next Q0next (D3 D2 D1 D0).

Current State    IR7, IR6, IR5                                              Enter
Q3Q2Q1Q0         000    001     010    011    100     101    110    111     0      1
                 LOAD   STORE   ADD    SUB    INPUT   JZ     JPOS   HALT
0000 Start       0001   0001    0001   0001   0001    0001   0001   0001
0001 Fetch       0010   0010    0010   0010   0010    0010   0010   0010
0010 Decode      1000   1001    1010   1011   1100    1101   1110   1111
1000 Load        0000   0000    0000   0000   0000    0000   0000   0000
1001 Store       0000   0000    0000   0000   0000    0000   0000   0000
1010 Add         0000   0000    0000   0000   0000    0000   0000   0000
1011 Sub         0000   0000    0000   0000   0000    0000   0000   0000
1100 Input                                                                  1100   0000
1101 Jz          0000   0000    0000   0000   0000    0000   0000   0000
1110 Jpos        0000   0000    0000   0000   0000    0000   0000   0000
1111 Halt        1111   1111    1111   1111   1111    1111   1111   1111

(c) Next-state equations.

D3 = Q3'Q2'Q1Q0' + Q3Q2Q1'Q0'Enter' + Q3Q2Q1Q0
D2 = Q3'Q2'Q1Q0'IR7 + Q3Q2Q1'Q0'Enter' + Q3Q2Q1Q0
D1 = Q3'Q2'Q1'Q0 + Q3'Q2'Q1Q0'IR6 + Q3Q2Q1Q0
D0 = Q3'Q2'Q1'Q0' + Q3'Q2'Q1Q0'IR5 + Q3Q2Q1Q0

(d) Control words and output table.

State
Q3Q2Q1Q0       IRload   JMPmux   PCload    Meminst   MemWr   Asel1,0   Aload   Sub   Halt
0000 Start     0        0        0         0         0       00        0       0     0
0001 Fetch     1        0        1         0         0       00        0       0     0
0010 Decode    0        0        0         1         0       00        0       0     0
1000 Load      0        0        0         0         0       10        1       0     0
1001 Store     0        0        0         1         1       00        0       0     0
1010 Add       0        0        0         0         0       00        1       0     0
1011 Sub       0        0        0         0         0       00        1       1     0
1100 Input     0        0        0         0         0       01        1       0     0
1101 Jz        0        1        (A = 0)   0         0       00        0       0     0
1110 Jpos      0        1        (A ≥ 0)   0         0       00        0       0     0
1111 Halt      0        0        0         0         0       00        0       0     1

(e) Output equations.

IRload  = Q3'Q2'Q1'Q0
JMPmux  = Q3Q2Q1'Q0 + Q3Q2Q1Q0'
PCload  = Q3'Q2'Q1'Q0 + Q3Q2Q1'Q0(A = 0) + Q3Q2Q1Q0'(A ≥ 0)
Meminst = Q3'Q2'Q1Q0' + Q3Q2'Q1'Q0
MemWr   = Q3Q2'Q1'Q0
Asel1   = Q3Q2'Q1'Q0'
Asel0   = Q3Q2Q1'Q0'
Aload   = Q3Q1'Q0' + Q3Q2'Q1
Sub     = Q3Q2'Q1Q0
Halt    = Q3Q2Q1Q0

(f) [Circuit diagram: four D flip-flops (D3/Q3, D2/Q2, D1/Q1, D0/Q0) with Clock and Reset; inputs Enter, (A = 0), (A ≥ 0), IR7, IR6, and IR5; outputs IRload, JMPmux, PCload, Meminst, MemWr, Asel1, Asel0, Aload, Sub, and Halt.]

Figure 100: Control unit for the EC-2: (a) state diagram; (b) next-state table; (c) next-state equations; (d) control words and output table; (e) output equations; (f) circuit.

6.11.4 EC-2 Microprocessor Circuit

The complete circuit for the EC-2 general-purpose microprocessor is constructed by connecting the datapath from Figure 99 and the control unit from Figure 100(f) together using the designated control and status signals, as shown in Figure 101.

[Figure 101 block diagram: the control unit (CU) receives Enter and sends the control signals IRload, JMPmux, PCload, Meminst, MemWr, Asel1-0, Aload, and Sub to the datapath (DP); the datapath returns the status signals IR7-5, (A = 0), and (A ≥ 0). Clock and Reset go to both blocks. The 8-bit Input and Output ports connect to the datapath, and Halt is an output of the control unit.]

Figure 101: Complete circuit for the EC-2 general-purpose microprocessor.

6.11.5 Sample Program

The memory for our computer is initialized with the content of the text file program.mif. A sample file is shown in Figure 102. The file contains three programs: GCD calculates the greatest common divisor of two input numbers; SUM evaluates the sum of all of the numbers between an input n and 1; and COUNT displays the count from input n down to 0. Only the last program (COUNT) listed in the file is executed. To try out another program, move the code for that program to the end of the file, then recompile the entire microprocessor circuit and upload it onto the trainer board.
-- Content of the RAM memory in the file PROGRAM.MIF
DEPTH = 32;             -- Depth of memory: 5-bit address
WIDTH = 8;              -- Width of memory: 8-bit data
ADDRESS_RADIX = BIN;    -- All values in binary (HEX, DEC, OCT, BIN)
DATA_RADIX = BIN;

-- Opcodes for the EC-2
-- 000 = LOAD A,aaaaa
-- 001 = STORE A,aaaaa
-- 010 = ADD A,aaaaa
-- 011 = SUB A,aaaaa
-- 100 = IN A
-- 101 = JZ aaaaa
-- 110 = JPOS aaaaa
-- 111 = HALT

-- Specify the memory content
-- Format of each memory location is
--    address : data
CONTENT
BEGIN
[00000..11111] : 00000000;  -- Initialize locations range 00-1F to 0

-------------------------------------------------------
-- There are three programs listed below: GCD, SUM, and COUNT
-- Only the program listed last is run.
-- To try out a different program,
-- move the code for the program that you want
-- to the end of the list immediately before the END statement.
-- Re-compile, and download to the FPGA

-------------------------------------------------------
-- GCD
-- Program to calculate the GCD of two numbers, x and y
00000 : 10000000;  -- IN A
00001 : 00111110;  -- STORE A,x
00010 : 10000000;  -- IN A
00011 : 00111111;  -- STORE A,y

00100 : 00011110;  -- loop: LOAD A,x     -- x=y?
00101 : 01111111;  -- SUB A,y
00110 : 10110000;  -- JZ out             -- x=y
00111 : 11001100;  -- JPOS xgty          -- x>y

01000 : 00011111;  -- LOAD A,y           -- y>x
01001 : 01111110;  -- SUB A,x            -- y-x
01010 : 00111111;  -- STORE A,y
01011 : 11000100;  -- JPOS loop

01100 : 00011110;  -- xgty: LOAD A,x     -- x>y
01101 : 01111111;  -- SUB A,y            -- x-y
01110 : 00111110;  -- STORE A,x
01111 : 11000100;  -- JPOS loop

10000 : 00011110;  -- out: LOAD A,x
10001 : 11111111;  -- HALT

11110 : 00000000;  -- storage for variable x
11111 : 00000000;  -- storage for variable y

-------------------------------------------------------
-- SUM
-- Program to sum n downto 1
00000 : 00011101;  -- LOAD A,one
00001 : 01111101;  -- SUB A,one
00010 : 00111110;  -- STORE A,sum        -- zero sum by doing 1-1

00011 : 10000000;  -- IN A
00100 : 00111111;  -- STORE A,n

00101 : 00011111;  -- loop: LOAD A,n     -- n + sum
00110 : 01011110;  -- ADD A,sum
00111 : 00111110;  -- STORE A,sum

01000 : 00011111;  -- LOAD A,n
01001 : 01111101;  -- SUB A,one          -- decrement A
01010 : 00111111;  -- STORE A,n

01011 : 10101101;  -- JZ out
01100 : 11000101;  -- JPOS loop
01101 : 00011110;  -- out: LOAD A,sum
01110 : 11111111;  -- HALT

11101 : 00000001;  -- storage for the constant 1
11110 : 00000000;  -- storage for variable sum
11111 : 00000000;  -- storage for variable n

-------------------------------------------------------
-- COUNT
-- Program to countdown from input n to 0
00000 : 10000000;  -- IN A
00001 : 01111111;  -- SUB A,11111
00010 : 10100100;  -- JZ 00100
00011 : 11000001;  -- JPOS 00001
00100 : 11111111;  -- HALT
11111 : 00000001;  -- storage for the constant 1
END;

Figure 102: The program.mif file containing three programs for the EC-2. Only the last program, COUNT, is executed.

6.11.6 Hardware Implementation

Figure 103 shows the interface between the EC-2 microprocessor and the input and output devices on the trainer board. The input simply consists of eight switches, and the output is the two 7-segment LED displays.
Since the microprocessor outputs an 8-bit binary number, we need an 8-bit binary to two-digit BCD (binary coded decimal) decoder so that we can see the 8-bit binary number as two decimal digits on the two 7-segment LED displays. A single LED is used to show when the microprocessor has halted. One push button switch is used as the Reset key, and a second push button switch is used as the Enter key. Finally, the 16 MHz clock is slowed down with a clock divider circuit so that we can see some intermediate results on the display.

[Block diagram: SWITCH[7..0] drives the 8-bit Input of the EC-2 microprocessor; PB[0] drives Reset; PB[1] drives Enter; CLOCK passes through the clock divider (Clkin to Clkout) to Clock; the 8-bit Output drives the binary-to-BCD decoder, which feeds HEX1 and HEX0; Halt drives LED[15].]

Figure 103: Hardware implementation for the EC-2.

6.11.7 Experiments

1. Use the Block Editor in Quartus to draw the various component circuits for the EC-2 computer. Draw each component in a new Block Diagram/Schematic File design file, and create a symbol for each component. See step 3 for creating the RAM. If you have already drawn a particular component in another lab, you can reuse it by copying its .bdf and .bsf files to this project's folder.

2. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the datapath using the component symbols that you have just created. See step 3 for creating the RAM. Create a symbol for this datapath circuit.

3. To use the RAM from the Quartus library, open the Symbol Tool window and click the MegaWizard Plug-In Manager button. Under the megafunction list on the left side, select Memory Compiler > RAM: 1-PORT. Append the name RAM to the end of the path in the name text field. In the MegaWizard Plug-In Manager [page 1 of 6] window, select 8 bits for the 'q' output bus width, and select 32 words for the memory size as shown next.
On page 2 of 6, clear the check mark for the 'q' output port as shown next. Accept the defaults on page 3 of 6. On page 4 of 6, select Yes, use this file for the memory content data, and type in the filename program.mif. You can skip the last two pages of the MegaWizard Plug-In Manager, and then click Finish. You can now insert this new symbol for the RAM into your Block Editor.

4. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the control unit using the component symbols that you have just created. Create a symbol for this control unit circuit.

5. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the microprocessor circuit by connecting the control signals and status signals between the datapath component symbol and the control unit component symbol. Create a symbol for this microprocessor circuit.

6. Create a new Block Diagram/Schematic File design file in the Block Editor to draw the top-level interface circuit for connecting the microprocessor I/O signals to the FPGA pins. Also add a clock divider and a binary-to-BCD decoder to the interface circuit as shown in Figure 103. This design file is the top-level file for your project.

7. Use a text editor such as Notepad to create the file program.mif as listed in Figure 102. Alternatively, you can copy this file from the DVD under the folder \Circuits\Lab 11\EC-2. Save this file in your project folder using the name program.mif.

8. Use the Pin Planner to map the microprocessor I/O signals to the FPGA I/O pins. Refer to Appendix A for the correct pin mappings.

9. Use the Programmer to upload the microprocessor circuit to the trainer board. Test and verify the operation of this microprocessor on the trainer board. Press PB[0] to reset the microprocessor. Set up SWITCH[7..0] with a number. Press PB[1] to enter the number.
Watch the countdown to zero.

10. There are three programs, GCD, SUM, and COUNT, listed in the file program.mif. Only the program listed last is run. To try out a different program, move the code for the program that you want to the end of the list immediately before the END statement. Recompile, and download to the FPGA.

7 Appendix A – FPGA Pin Mappings

    Group              I/O Device        FPGA Pin
    Toggle Switches    SWITCH15          N15
                       SWITCH14          P15
                       SWITCH13          P14
                       SWITCH12          T15
                       SWITCH11          T14
                       SWITCH10          N14
                       SWITCH9           M11
                       SWITCH8           L13
                       SWITCH7           L15
                       SWITCH6           J15
                       SWITCH5           J13
                       SWITCH4           G11
                       SWITCH3           F13
                       SWITCH2           E10
                       SWITCH1           D12
                       SWITCH0           C11
    Push Buttons       PB0               B10
                       PB1               A11
                       PB2               B11
    LEDs               LED15             N16
                       LED14             P16
                       LED13             R16
                       LED12             R14
                       LED11             N11
                       LED10             N12
                       LED9              L11
                       LED8              L14
                       LED7              K12
                       LED6              J14
                       LED5              J12
                       LED4              F14
                       LED3              E11
                       LED2              D14
                       LED1              A13
                       LED0              C9
    7-segment LED 0    HEX0 segment A    C15
                       HEX0 segment B    D16
                       HEX0 segment C    L16
                       HEX0 segment D    K16
                       HEX0 segment E    J16
                       HEX0 segment F    C16
                       HEX0 segment G    K15
    7-segment LED 1    HEX1 segment A    B16
                       HEX1 segment B    C14
                       HEX1 segment C    H16
                       HEX1 segment D    H15
                       HEX1 segment E    G16
                       HEX1 segment F    B14
                       HEX1 segment G    A15
    7-segment LED 2    HEX2 segment A    B13
                       HEX2 segment B    A14
                       HEX2 segment C    G15
                       HEX2 segment D    F15
                       HEX2 segment E    D15
                       HEX2 segment F    B12
                       HEX2 segment G    A12
    Clock              CLOCK             E2
    Header             HEADER0           T10
                       HEADER1           R10
                       HEADER2           T11
                       HEADER3           R11
                       HEADER4           R12
                       HEADER5           T13
                       HEADER6           R13
                       HEADER7           F16
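The manual assigns pins through the Pin Planner. As an alternative sketch only, the Python below prints Quartus location assignments (the set_location_assignment form used in a project's .qsf file) for the non-display signals of Figure 103. The port names on the left (SWITCH[7], PB[0], LED[15], CLOCK) are assumptions about how the top-level design names its pins and must match your own schematic. The pin values are read from the appendix table above. PIN_MAP and qsf_lines are hypothetical names, and the HEX segment pins are left out because their port names depend on how the decoder outputs are labeled.

```python
# Sketch: print .qsf location assignments for the Figure 103 signals.
# Port names are assumptions; pin values are copied from Appendix A.
PIN_MAP = {
    "SWITCH[7]": "L15",
    "SWITCH[6]": "J15",
    "SWITCH[5]": "J13",
    "SWITCH[4]": "G11",
    "SWITCH[3]": "F13",
    "SWITCH[2]": "E10",
    "SWITCH[1]": "D12",
    "SWITCH[0]": "C11",
    "PB[0]": "B10",      # Reset
    "PB[1]": "A11",      # Enter
    "LED[15]": "N16",    # Halt indicator
    "CLOCK": "E2",
}


def qsf_lines(pin_map):
    """Return one location assignment per port, in dictionary order."""
    return [
        f"set_location_assignment PIN_{pin} -to {port}"
        for port, pin in pin_map.items()
    ]


for line in qsf_lines(PIN_MAP):
    print(line)
```

The printed lines can be compared against what the Pin Planner writes into the project settings file after step 8.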