Lab Manual v1.2012

The State University of New York (SUNY) at Buffalo
Department of Electrical Engineering

Lab Manual v1.2012
EE 478/578 - HDL Based Digital Design with Programmable Logic
Cristinel Ababei

Copyleft © by Cristinel Ababei, 2012. Being a big supporter of open-source, this lab manual is free to use for educational purposes. However, you should credit the author. The advanced materials are available for $1, which will be donated to my preferred charity.

Table of Contents

Lab 1: Aldec Active-HDL Tutorial
Lab 1: Supplemental Material - A First Look at VHDL
Lab 2: Xilinx ISE WebPack Tutorial
Lab 2: Supplemental Material - Subprograms and Packages
Lab 3: Four-Bit Binary Counter
Lab 3: Supplemental Material - Testbenches
Lab 4: Finite State Machines
Lab 4: Supplemental Material - Writing VHDL code for synthesis
Lab 5: Memories: ROMs and BRAMs Internal to the FPGA
Lab 6: Memories: External SPI Flash and DDR2
Lab 7: Interfacing FPGA Spartan-6 with AC’97 Codec
Lab 7 Supplemental: PS2 Keyboard and UART
Lab 8: Interfacing FPGA Spartan-6 with Host Computer via USB
Lab 9: Video Interfaces: HDMI and DVI
Lab 10: PicoBlaze - an embedded microcontroller
Lab 11: Single Cycle Computer (SCC)

Lab 1: Aldec Active-HDL Tutorial

1. Objective

The objective of this tutorial is to introduce you to Aldec’s Active-HDL 9.1 Student Edition simulator by performing the following tasks on a 4-bit adder design example:
• Create a new design or add .vhd files to your design
• Compile and debug your design
• Run Simulation

Note: Active-HDL, developed by Aldec, is an alternative to Xilinx’s ISim (ISE Simulator). It is one of the most popular commercial HDL simulators today. In this course, we use the free student version of Active-HDL, which has some limitations (file sizes and computational runtime). You can download and install it on your own computer:
http://www.aldec.com/en/products/fpga_simulation/active_hdl_student

2. Introduction

Active-HDL is a Windows based integrated FPGA Design Creation and Simulation solution. Active-HDL includes a full HDL graphical design tool suite and RTL/gate-level mixed-language simulator. It supports industry leading FPGA devices from Altera, Atmel, Lattice, Microsemi (Actel), Quicklogic, Xilinx and more. The core of the system is an HDL simulator.
Along with debugging and design entry tools, it makes up a complete system that allows you to write, debug and simulate VHDL code. Based on the concept of a workspace (think of it as a design), Active-HDL allows you to organize your VHDL resources into a convenient and clear structure.

3. Procedure

Creating the 1-bit full adder

1. Start Aldec Active-HDL: Start->All Programs->Aldec->Active-HDL Student Edition
2. Select “Create New Workspace” and click OK
3. Enter fall2012_aldec as the name of the workspace and change the directory to where you want to save it (for example M:\UB\labs) and click OK
4. Select “Create an Empty Design” and click NEXT
5. Choose the block diagram configuration as “Default HDL Language” and default HDL language as “VHDL”. Select the target technology as Xilinx for vendor and SPARTAN6 for technology. Click NEXT
6. Enter fourbit_adder as the name of the design as well as the name of the default working library. Click NEXT
7. Click FINISH
You should now have the Design Browser as a window showing the current workspace and design contents.
8. Double-click on “Add New File” in the Design Browser window
9. Select “VHDL Source Code” and type in full_adder in the name field, click OK

The following is the VHDL code for the 1-bit full adder. Enter the code as seen below into the empty file.

    -- 1-bit full adder
    -- Declare the 1-bit full adder with the inputs and outputs
    -- shown inside the port().
    -- This adds two bits together (x,y) with a carry in (cin) and
    -- outputs the sum (sum) and a carry out (cout).
    LIBRARY IEEE;
    use IEEE.STD_LOGIC_1164.ALL;

    entity full_adder is
      port(x, y, cin: in std_logic;
           sum, cout: out std_logic);
    end full_adder;

    architecture my_dataflow of full_adder is
    begin
      sum <= (x xor y) xor cin;
      cout <= (x and y) or (x and cin) or (y and cin);
    end my_dataflow;

10. Select the File menu and choose Save.
11. To check the syntax of the newly created adder, right click on “full_adder.vhd” in the Design Browser window and select the Compile option.
The code should compile without any problems and you should see a green check mark next to the full_adder.vhd file. If you get any errors, check the code that you have typed against the code provided above.

Once you have the source file (or all the source files of the entire design) compiled, the design can be simulated for functional correctness.

Manual Simulation

Note: This type of simulation should be done only for small designs with few inputs and outputs. As design size increases you should use testbenches, described later.

1. Select menu Simulation->Initialize Simulation
After the simulation has been initialized, you have to open a new Waveform window.
2. Click the New Waveform toolbar button to invoke the Waveform window.
Now you need to assign stimulators to all the input signals.
3. In the Design Browser window select all signals (one by one, while holding the Control key pressed), then right click and choose Add to Waveform
Note: To add signals to the simulator we could also use the drag and drop feature. In the Structure pane/tab of the Design Browser window, select the design and while holding down the left button, drag it to the right section of the Waveform window and then release the mouse button. This is a standard drag-and-drop operation.
4. Go to the left pane/tab of the Waveform Editor window and select the “x” signal. Press the right button to invoke a context menu and choose the Stimulators… item. In the Stimulators dialog, choose Clock for Type. Leave the Frequency at the default value of 10 MHz. Click APPLY and then CLOSE
5. Repeat step 4 for input “y”. Choose Clock for Type but this time place the mouse pointer in the Frequency box and set the value to 5 MHz. Click the APPLY button to assign the stimulator, then CLOSE.
6. Repeat step 4 for input “cin”. Choose Formula for Type and, when the dialog appears, type the formula expression as follows: 0 0, 1 100000. Click APPLY and then CLOSE
7. Select Simulation->Run Until and enter 300 ns
8. Finish the simulation by selecting Simulation->End Simulation

At this time your Waveform viewer should look like this:

Investigate the waveforms to verify that your full_adder works correctly.

Testbench Based Simulation

The VHDL testbench is a VHDL program that describes simulation inputs in standard VHDL language. There is a wide variety of VHDL specific functions and language constructs designed to create simulation inputs. You can read the simulation data from a text file, create separate processes driving input ports, and more. The typical way to create a testbench is to create an additional VHDL file for the design that treats your actual VHDL design as a component (Design Under Test, DUT) and assigns specific values to this component’s input ports. It also monitors the output response of the DUT to verify correct operation.

The diagram below illustrates the relationship between the entity, architecture, and testbench:

1. Create a new file full_adder_testbench.vhd and save it under the current design’s “src” directory (for example M:\UB\labs\fall2012_aldec\fourbit_adder\src). The content of this file is in Appendix A at the end of this tutorial. You can create it using Aldec’s editor or any other editor (e.g., even Notepad).
2. Select the Design menu and choose “Add Files to Design” and add the newly created full_adder_testbench.vhd to the design.
3. Right click on full_adder_testbench.vhd in the Design Browser window and select the Compile option.
4. Left click on the plus (+) next to full_adder_testbench.vhd. This will reveal the TEST_FULL_ADDER entity
5. Right click on TEST_FULL_ADDER and choose Set as Top-Level
6. Select the File menu and choose the New option and pick New Waveform
7. In the Design Browser window select the Structure pane/tab at the bottom of the window
8. Select the Simulation menu and choose Initialize Simulation
9. Click on “+” next to TEST_FULL_ADDER (MY_TEST)
10. Click on U1:FULL_ADDER and drag all signals to the waveform window
11. Change the time for simulation to 400 ns by clicking on the up arrow
12. Select the Simulation menu and choose Run For
13. Inspect the simulation to verify that the 1-bit full adder functionality is indeed correct

At this time your Waveform viewer should look like this:

Creating and testing the 4-bit adder

1. Add the following two files to the design: fourbit_adder.vhd and fourbit_adder_testbench.vhd. Their source code is in Appendices B and C at the end of this tutorial.
2. Compile both files and use the testbench (fourbit_adder_testbench.vhd) to simulate the design for, say, 200 ns
3. View the simulation to verify that the 4-bit adder functionality is correct.

At this time your Waveform viewer should look like this:

4. Taking it further

While intuitive to use, Active-HDL has a lot of features. It is outside the scope of this tutorial to discuss all of them. You should spend some time searching and reading additional documentation on how to use Active-HDL. A few first examples:
• http://www.aldec.com/en/downloads/tutorials
• Once you have launched Active-HDL, select the Help menu and read the documentation available there
• Search Google for “Active-HDL tutorial”. You will find a lot of detailed tutorials (some written for older versions of the tool, but a lot of the concepts still apply), which have been kindly made public by the online community.

Note: As is the case with most electronic design automation (EDA) tools, there are multiple ways of achieving or performing something. If by reading the documentation or other tutorials you learn how to accomplish any of the steps described in this tutorial in a different way, that is OK. You should learn and use the methods you like the most and are most comfortable with.
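One possible way of taking the 4-bit adder test further: the testbench in Appendix C checks only two input combinations. The sketch below loops over all 256 combinations of a and b and compares the outputs against integer arithmetic. It is only an illustration, not part of the original lab files: the entity name TEST_FOURBIT_ADDER_ALL and architecture name MY_LOOP_TEST are made up here, and the sketch assumes that full_adder.vhd and fourbit_adder.vhd are already compiled into the working library and that the standard IEEE.NUMERIC_STD package is available.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical exhaustive testbench (names are assumptions for illustration).
entity TEST_FOURBIT_ADDER_ALL is
end TEST_FOURBIT_ADDER_ALL;

architecture MY_LOOP_TEST of TEST_FOURBIT_ADDER_ALL is
  signal a, b, z : STD_LOGIC_VECTOR(3 downto 0);
  signal cout    : STD_LOGIC;
begin
  -- Direct entity instantiation (VHDL-93); no component declaration needed.
  U1: entity WORK.fourbit_adder(MY_STRUCTURE)
    port map (a => a, b => b, z => z, cout => cout);

  process
    -- 5 bits: bit 4 is the expected carry out, bits 3..0 the expected sum.
    variable expected : unsigned(4 downto 0);
  begin
    for i in 0 to 15 loop
      for j in 0 to 15 loop
        a <= std_logic_vector(to_unsigned(i, 4));
        b <= std_logic_vector(to_unsigned(j, 4));
        wait for 10 ns;  -- let the outputs settle before checking
        expected := to_unsigned(i + j, 5);
        assert z = std_logic_vector(expected(3 downto 0))
          report "Wrong sum for a=" & integer'image(i) &
                 " b=" & integer'image(j)
          severity error;
        assert cout = expected(4)
          report "Wrong cout for a=" & integer'image(i) &
                 " b=" & integer'image(j)
          severity error;
      end loop;
    end loop;
    report "All 256 input combinations applied" severity note;
    wait;  -- suspend forever so the test does not repeat
  end process;
end MY_LOOP_TEST;
```

To try it, add the file to the design, compile it, set TEST_FOURBIT_ADDER_ALL as top-level, and run the simulation for at least 2560 ns (256 cases at 10 ns each). Any failed assertion should show up as an error message in the simulator console.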
Finally, while Active-HDL (of Aldec) and ModelSim (of Mentor Graphics) are arguably some of the most popular HDL simulators in industry, Xilinx has been improving their own simulator, ISim, which is part of the free ISE WebPack used in this course. You can read more about ISim here:
http://www.xilinx.com/support/documentation/sw_manuals/xilinx13_4/plugin_ism.pdf

Appendix A: VHDL source code of full_adder_testbench.vhd

    -- 1-bit full adder testbench
    -- A testbench is used to rigorously test a design that you have made.
    -- The output of the testbench should allow the designer to see if
    -- the design worked. The testbench should also report where the
    -- testbench failed.
    LIBRARY IEEE;
    use IEEE.STD_LOGIC_1164.ALL;

    -- Declare a testbench. Notice that the testbench does not have any input
    -- or output ports.
    entity TEST_FULL_ADDER is
    end TEST_FULL_ADDER;

    -- Describes the functionality of the testbench.
    architecture MY_TEST of TEST_FULL_ADDER is

      -- The object that we wish to test is declared as a component of
      -- the test bench. Its functionality has already been described elsewhere.
      -- This simply describes what the object's inputs and outputs are, it
      -- does not actually create the object.
      component FULL_ADDER
        port( x, y, cin : in STD_LOGIC;
              sum, cout : out STD_LOGIC );
      end component;

      -- Specifies which description of the adder you will use.
      for U1: FULL_ADDER use entity WORK.FULL_ADDER(MY_DATAFLOW);

      -- Create a set of signals which will be associated with both the inputs
      -- and outputs of the component that we wish to test.
      signal X_s, Y_s : STD_LOGIC;
      signal CIN_s : STD_LOGIC;
      signal SUM_s : STD_LOGIC;
      signal COUT_s : STD_LOGIC;

    -- This is where the testbench for the FULL_ADDER actually begins.
    begin

      -- Create a 1-bit full adder in the testbench.
      -- The signals specified above are mapped to their appropriate
      -- roles in the 1-bit full adder which we have created.
      U1: FULL_ADDER port map (X_s, Y_s, CIN_s, SUM_s, COUT_s);

      -- The process is where the actual testing is done.
      process
      begin
        -- We are now going to set the inputs of the adder and test
        -- the outputs to verify the functionality of our 1-bit full adder.

        -- Case 0 : 0+0 with carry in of 0.
        -- Set the signals for the inputs.
        X_s <= '0'; Y_s <= '0'; CIN_s <= '0';
        -- Wait a short amount of time and then check to see if the
        -- outputs are what they should be. If not, then report an error
        -- so that we will know there is a problem.
        wait for 10 ns;
        assert ( SUM_s = '0' ) report "Failed Case 0 - SUM" severity error;
        assert ( COUT_s = '0' ) report "Failed Case 0 - COUT" severity error;
        wait for 40 ns;

        -- Carry out the same process outlined above for the other 7 cases.

        -- Case 1 : 0+0 with carry in of 1.
        X_s <= '0'; Y_s <= '0'; CIN_s <= '1';
        wait for 10 ns;
        assert ( SUM_s = '1' ) report "Failed Case 1 - SUM" severity error;
        assert ( COUT_s = '0' ) report "Failed Case 1 - COUT" severity error;
        wait for 40 ns;

        -- Case 2 : 0+1 with carry in of 0.
        X_s <= '0'; Y_s <= '1'; CIN_s <= '0';
        wait for 10 ns;
        assert ( SUM_s = '1' ) report "Failed Case 2 - SUM" severity error;
        assert ( COUT_s = '0' ) report "Failed Case 2 - COUT" severity error;
        wait for 40 ns;

        -- Case 3 : 0+1 with carry in of 1.
        X_s <= '0'; Y_s <= '1'; CIN_s <= '1';
        wait for 10 ns;
        assert ( SUM_s = '0' ) report "Failed Case 3 - SUM" severity error;
        assert ( COUT_s = '1' ) report "Failed Case 3 - COUT" severity error;
        wait for 40 ns;

        -- Case 4 : 1+0 with carry in of 0.
        X_s <= '1'; Y_s <= '0'; CIN_s <= '0';
        wait for 10 ns;
        assert ( SUM_s = '1' ) report "Failed Case 4 - SUM" severity error;
        assert ( COUT_s = '0' ) report "Failed Case 4 - COUT" severity error;
        wait for 40 ns;

        -- Case 5 : 1+0 with carry in of 1.
        X_s <= '1'; Y_s <= '0'; CIN_s <= '1';
        wait for 10 ns;
        assert ( SUM_s = '0' ) report "Failed Case 5 - SUM" severity error;
        assert ( COUT_s = '1' ) report "Failed Case 5 - COUT" severity error;
        wait for 40 ns;

        -- Case 6 : 1+1 with carry in of 0.
        X_s <= '1'; Y_s <= '1'; CIN_s <= '0';
        wait for 10 ns;
        assert ( SUM_s = '0' ) report "Failed Case 6 - SUM" severity error;
        assert ( COUT_s = '1' ) report "Failed Case 6 - COUT" severity error;
        wait for 40 ns;

        -- Case 7 : 1+1 with carry in of 1.
        X_s <= '1'; Y_s <= '1'; CIN_s <= '1';
        wait for 10 ns;
        assert ( SUM_s = '1' ) report "Failed Case 7 - SUM" severity error;
        assert ( COUT_s = '1' ) report "Failed Case 7 - COUT" severity error;
        wait for 40 ns;
      end process;
    END MY_TEST;

Appendix B: VHDL source code of fourbit_adder.vhd

    -- 4-bit adder
    -- Structural description of a 4-bit adder. This device
    -- adds two 4-bit numbers together using four 1-bit full adders
    -- described above.

    -- This is just to make a reference to some common things needed.
    LIBRARY IEEE;
    use IEEE.STD_LOGIC_1164.ALL;

    -- This describes the black-box view of the component we are
    -- designing. The inputs and outputs are again described
    -- inside port(). It takes two 4-bit values as input (a and b)
    -- and produces a 4-bit output (z) and a carry out bit (cout).
    entity fourbit_adder is
      port( a, b : in STD_LOGIC_VECTOR(3 downto 0);
            z    : out STD_LOGIC_VECTOR(3 downto 0);
            cout : out STD_LOGIC );
    end fourbit_adder;

    -- Although we have already described the inputs and outputs,
    -- we must now describe the functionality of the adder (ie:
    -- how we produced the desired outputs from the given inputs).
    architecture MY_STRUCTURE of fourbit_adder is

      -- We are going to need four 1-bit adders, so include the
      -- design that we have already studied in full_adder.vhd.
      component FULL_ADDER
        port( x, y, cin : in STD_LOGIC;
              sum, cout : out STD_LOGIC );
      end component;

      -- Now create the signals which are going to be necessary
      -- to pass the outputs of one adder to the inputs of the next
      -- in the sequence.
      signal c0, c1, c2, c3 : STD_LOGIC;

    begin

      c0 <= '0';
      b_adder0: FULL_ADDER port map (a(0), b(0), c0, z(0), c1);
      b_adder1: FULL_ADDER port map (a(1), b(1), c1, z(1), c2);
      b_adder2: FULL_ADDER port map (a(2), b(2), c2, z(2), c3);
      b_adder3: FULL_ADDER port map (a(3), b(3), c3, z(3), cout);

    END MY_STRUCTURE;

Appendix C: VHDL source code of fourbit_adder_testbench.vhd

    -- 4-bit Adder Testbench
    -- A testbench is used to rigorously test a design that you have made.
    -- The output of the testbench should allow the designer to see if
    -- the design worked. The testbench should also report where the testbench
    -- failed.

    -- This is just to make a reference to some common things needed.
    LIBRARY IEEE;
    use IEEE.STD_LOGIC_1164.ALL;

    -- Declare a testbench. Notice that the testbench does not have any
    -- input or output ports.
    entity TEST_FOURBIT_ADDER is
    end TEST_FOURBIT_ADDER;

    -- Describes the functionality of the testbench.
    architecture MY_TEST of TEST_FOURBIT_ADDER is

      component fourbit_adder
        port( a, b : in  STD_LOGIC_VECTOR(3 downto 0);
              z    : out STD_LOGIC_VECTOR(3 downto 0);
              cout : out STD_LOGIC);
      end component;

      for U1: fourbit_adder use entity WORK.FOURBIT_ADDER(MY_STRUCTURE);

      signal a, b : STD_LOGIC_VECTOR(3 downto 0);
      signal z : STD_LOGIC_VECTOR(3 downto 0);
      signal cout : STD_LOGIC;

    begin

      U1: fourbit_adder port map (a,b,z,cout);

      process
      begin
        -- Case 1 that we are testing.
        a <= "0000";
        b <= "0000";
        wait for 10 ns;
        assert ( z = "0000" ) report "Failed Case 1 - z" severity error;
        assert ( Cout = '0' ) report "Failed Case 1 - Cout" severity error;
        wait for 40 ns;

        -- Case 2 that we are testing.
        a <= "1111";
        b <= "1111";
        wait for 10 ns;
        assert ( z = "1110" ) report "Failed Case 2 - z" severity error;
        assert ( Cout = '1' ) report "Failed Case 2 - Cout" severity error;
        wait for 40 ns;
      end process;
    END MY_TEST;

Lab 1: Supplemental Material - A First Look at VHDL

The objective of this supplemental material is to give you an early presentation of some of the most important concepts in VHDL. You should keep this document as a reference for future work on your course assignments.

    ----------------------------------------------------------------
    -- entity & architecture template
    ----------------------------------------------------------------
    library lib_name;
    use lib_name.package_name.all;

    entity entity_name is
      generic ( generic_name : type_name := default;
                generic_name : type_name := default );
      port ( port_name : in|out|inout|buffer|linkage type_name;
             port_name : in|out|inout|buffer|linkage type_name );
    end entity_name;

    architecture arch_name of entity_name is
      signal signal_name : type_name := default;
    begin
      concurrent assignments and processes;
    end arch_name;

    ----------------------------------------------------------------
    -- component declaration
    ----------------------------------------------------------------
    component component_name
      generic ( generic_name : type_name := default;
                generic_name : type_name := default );
      port ( port_name : in|out|inout|buffer|linkage type_name;
             port_name : in|out|inout|buffer|linkage type_name );
    end component;

    ----------------------------------------------------------------
    -- component instantiation
    ----------------------------------------------------------------
    instance_name : component_name
      generic map ( generic_name => value,
                    generic_name => value )
      port map ( port_name => value,
                 port_name => value );

    ----------------------------------------------------------------
    -- process template
    ----------------------------------------------------------------
    process_name : process( signal_port_name, signal_port_name )
      variable var_name : type_name := default;
    begin
      ...
    end process process_name;

    ----------------------------------------------------------------
    -- concurrent signal assignments
    ----------------------------------------------------------------
    signal_name <= value;
    signal_name <= transport value after time_value,
                             value after time_value;
    access_name <= new type_name ( initial_value );
    signal_name <= value1 when ( condition1 ) else
                   value2 when ( condition2 ) else
                   value3;
    with expression select
      signal_name <= value1 when choice1,
                     value2 when choice2,
                     value3 when others;

    ----------------------------------------------------------------
    -- type declarations
    ----------------------------------------------------------------
    type type_name is ( ENUM1, ENUM2, ENUM3 );
    type type_name is range low_integer to high_integer
      units
        base_unit;
        unit1 = integer base_unit;
        unit2 = integer unit1;
      end units;
    type type_name is array ( low_index to high_index ) of element_type;
    type type_name is array ( high_index downto low_index ) of element_type;
    type type_name is array ( scalar_type1 range <> ) of element_type;
    type type_name is array ( index1, index2 ) of element_type;
    type type_name is record
      element_name : type_name;
      element_name : type_name;
    end record;
    type record_type_name;
    type pointer_type_name is access record_type_name;
    type record_type_name is record
      next_record : pointer_type_name;
    end record;
    type file_type_name is file of type_name;
    subtype subtype_name is scalar_type range low to high;
    subtype subtype_name is array_type( left downto/to right );
    subtype subtype_name is resolution_fn type_name;

    ----------------------------------------------------------------
    -- signal/constant/variable declarations
    ----------------------------------------------------------------
    signal signal_name : type_name := default;
    constant const_name : type_name := value;
    variable var_name : type_name := default;
    file file_id : file_type is in/out file_name;
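As an illustration of how several of the templates above fit together, the following minimal sketch fills in the entity and architecture template with a subtype, a constant, a signal, a selected signal assignment, and a conditional signal assignment. It is not part of the original lab material; every name in it (template_demo, demo, nibble, ZERO, picked, any_high) is hypothetical and chosen only for this example.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical example entity: a 4-to-1 multiplexer plus a "not zero" flag.
entity template_demo is
  port ( sel      : in  STD_LOGIC_VECTOR(1 downto 0);
         d        : in  STD_LOGIC_VECTOR(3 downto 0);
         y        : out STD_LOGIC;
         any_high : out STD_LOGIC );
end template_demo;

architecture demo of template_demo is
  -- subtype declaration: a constrained array subtype
  subtype nibble is STD_LOGIC_VECTOR(3 downto 0);
  -- constant declaration
  constant ZERO : nibble := "0000";
  -- signal declaration
  signal picked : STD_LOGIC;
begin
  -- selected signal assignment: pick one bit of d according to sel
  with sel select
    picked <= d(0) when "00",
              d(1) when "01",
              d(2) when "10",
              d(3) when others;

  -- plain concurrent signal assignment
  y <= picked;

  -- conditional signal assignment: '1' unless all bits of d are '0'
  any_high <= '0' when ( d = ZERO ) else '1';
end demo;
```

The file should compile in Active-HDL in the same way as full_adder.vhd in Lab 1; the "when others" choice is what covers the remaining values of sel, including non-binary values such as 'U' or 'X'.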
    ----------------------------------------------------------------
    -- procedure & function declaration
    ----------------------------------------------------------------
    procedure proc_name ( constant/variable/signal param : in/out/inout type_name );
    procedure proc_name ( param1 : type_name; param2 : type_name );
    function fn_name ( constant/variable/signal param : in/out/inout type_name )
      return type_name;
    function fn_name ( param1 : type_name; param2 : type_name )
      return type_name;

    ----------------------------------------------------------------
    -- procedure & function body
    ----------------------------------------------------------------
    procedure proc_name ( constant/variable/signal param : in/out/inout type_name ) is
      variable var_name : type_name := default;
    begin
      statements;
    end proc_name;

    function fn_name ( constant/variable/signal param : in/out/inout type_name )
      return type_name is
      variable var_name : type_name := default;
    begin
      statements;
      return value;
    end fn_name;

    ----------------------------------------------------------------
    -- if statement
    ----------------------------------------------------------------
    if ( condition1 ) then
      statements;
    elsif ( condition2 ) then
      statements;
    else
      statements;
    end if;

    ----------------------------------------------------------------
    -- case statement
    ----------------------------------------------------------------
    case signal/variable is
      when value1 => statements;
      when value2 => statements;
      when others => statements;
    end case;

    ----------------------------------------------------------------
    -- while loop
    ----------------------------------------------------------------
    label : while ( condition ) loop
      statements;
    end loop label;

    ----------------------------------------------------------------
    -- for loop
    ----------------------------------------------------------------
    label : for var_name in left to/downto right loop
      statements;
    end loop label;

    ----------------------------------------------------------------
    -- assert statement
    ----------------------------------------------------------------
    assert ( condition ) report string_value severity severity_value;

    ----------------------------------------------------------------
    -- package declaration and body
    ----------------------------------------------------------------
    package pkg_name is
      declarations;
    end pkg_name;

    package body pkg_name is
      definitions;
    end pkg_name;

    ----------------------------------------------------------------
    -- configurations
    ----------------------------------------------------------------
    configuration cfg_name of entity_name is
      for arch_name
      end for;
    end cfg_name;

    configuration cfg_name of entity_name is
      for arch_name
        for instance_name : comp_name use entity entity_name ( architecture );
        for instance_name : comp_name for arch_name end for;
        for instance_name : comp_name use configuration cfg_name2;
        for others : comp_name use configuration cfg_name2;
        for all : comp_name use configuration cfg_name2;
      end for;
    end cfg_name;

Lab 2: Xilinx ISE WebPack Tutorial

1. Objective

To introduce you to Xilinx’s ISE WebPack by performing the following tasks on a 4-bit adder design example:
• Use Xilinx ISE WebPack software to:
  o Specify the type of FPGA to be programmed
  o Assign input and output signals to FPGA pins
  o Implement the design (producing a bit file)
  o Generate reports
• Use Digilent Adept software to:
  o Select the board to be programmed: Digilent ATLYS FPGA board
  o Select the bit file to be used
  o Program the FPGA board
• Test the design on the ATLYS board

2. Introduction

In this course, we use Xilinx ISE WebPack 14.1 to synthesize our designs. The target FPGA is Xilinx Spartan-6. This FPGA is mounted on a board called Atlys by Digilent. The Atlys circuit board is a complete, ready-to-use digital circuit development platform based on a Xilinx Spartan-6 LX45 FPGA. It offers a large on-board collection of high-end peripherals including Gbit Ethernet, HDMI Video, 128MByte 16-bit DDR2 memory, and USB and audio ports.
A typical design flow is illustrated in the next figure:

[Figure: typical design flow]
Previous tutorial (lab 1): Aldec Active-HDL or Xilinx ISim
  • Specify design functionality
  • Define inputs and outputs
  • Write VHDL files; create testbenches
  • Compile, simulate, and debug
  Output: VHDL file (e.g., MyFile.vhd)
This tutorial (lab 2): Xilinx ISE WebPack
  • Specify FPGA (Spartan-6, etc.)
  • Assign signals to pins
  • Implement design (synthesis, place, route)
  • Generate reports
  • Generate bitstream file (to program FPGA)
  Output: bit file (e.g., MyFile.bit)
Digilent Adept or Xilinx iMPACT: program to download the bit file to the FPGA on the Atlys board (via USB)

3. Procedure: Design Implementation with Xilinx ISE WebPack

3.1 Start Xilinx ISE
Launch Xilinx ISE using the shortcut on the desktop (or Start->All Programs->Xilinx Design Tools->ISE Design Suite 14.1->ISE Design Tools->Project Navigator)

3.2 Create a Project
--Click the “New Project” button or select File->New Project
--Enter the project name fourbit_adder and select the location where you want it to be saved. For example, M:\UB\labs\fall2012_ise.
--Select HDL for Top-Level source type and click Next. You should get the Project Settings window.

3.3 Specify the FPGA to be Used
--In the Project Settings window, select Spartan6 for Family, select XC6SLX45 for Device, select CSG324 for Package, and VHDL for Preferred Language. Leave the rest of the options unchanged (see figure below). Then, click Next.
--You should get a Project Summary window. Click Finish to create the project.

3.4 Add Existing Source Files to Project
--Select Project->Add Source and locate the vhd files for our design. In this example, we will use the full_adder.vhd and fourbit_adder.vhd files that we have already created in lab 1. So, go ahead and locate them, select them, and click Open to add them to the project.
--At this time, you should see the Design Overview - Summary being displayed.
3.5 Implement the Design
Design implementation is the process of translating, mapping, placing, routing, and generating a bitstream file for your design. The design implementation tools are embedded in the Xilinx ISE software for easy access and project management. The figure below illustrates the design implementation step within a typical FPGA design flow.

[Figure: Design Implementation]
  • Mapping
  • Placement
  • Routing
  • Bitstream generation

To perform the design implementation of our fourbit_adder follow these steps:
--In the Hierarchy window, select “fourbit_adder - MY_STRUCTURE (fourbit_adder.vhd)”
--In the Processes tab double-click Implement Design (or right-click on Implement Design and select Run).
During and after the run, you should see:
• Lots of information scrolling by in the Console window. If any errors occur, scroll back up to read the messages and figure out how to fix the errors.
• Green check marks next to the processes that have been run
• Information filled out in the Design Overview - Summary window. For example:
  o Note that this simple example only uses 4 out of 27,288 available LUTs
  o This example has 2 inputs (“a” and “b”, each has four bits) and 2 outputs (“z” has four bits and “cout” is a single bit), so only 13 of 218 input-output blocks (IOB) are used.

The next figure shows what the Project Navigator window looks like after the Implementation run has finished:

3.6 ATLYS Pinout
The Atlys board includes six pushbuttons, eight slide switches, and eight LEDs for basic digital input and output. One pushbutton has a red plunger and is labeled “reset” on the PCB. This button is no different than the other five, but it can be used as a reset input to processor systems. The buttons and slide switches are connected to the FPGA via series resistors to prevent damage from inadvertent short circuits.
The high efficiency LED anodes are connected to the FPGA via 390-ohm resistors, and they will illuminate when a logic high voltage is applied to their respective I/O pin. The next figure shows the connection of the pushbuttons, slide switches, and LEDs to the FPGA’s pins:

Note: Now is a good time to read through the reference manual of the Atlys board to get familiar with the rest of the pinouts. You can download it directly from Digilent:
http://www.digilentinc.com/Data/Products/ATLYS/Atlys_rm.pdf
Also, take some time to read through some of the documentation of the Spartan-6 FPGA:
http://www.xilinx.com/support/documentation/spartan-6.htm

Because our fourbit_adder design is pretty small, we can conveniently assign the eight slide switches to control the two inputs and use five LEDs to be driven by the outputs. We will use the first four slide switches (SW0-SW3) as input “a” and the last four slide switches (SW4-SW7) as input “b” of the fourbit_adder. The output “z” of the fourbit_adder will drive the first four LEDs (LD0-LD3) and the output “cout” will drive the last LED (LD7).

3.7 Assigning Pins
--Expand (+) User Constraints under the Processes tab
--Double-click on I/O Pin Planning (PlanAhead) - Post-Synthesis
--Select Yes to create a User Constraint File (UCF)
--The PlanAhead 14.1 window should now appear; it may take a few seconds though. If a Welcome window appears, you can simply close it. If a window appears asking if you would like to load software updates, select No.
--Expand (+) a(4), b(4), z(4), and Scalar ports under the I/O Ports tab to reveal all the inputs and outputs.
--Double-click on signal a[0] to open the I/O Port Properties window.
--Enter the desired pin number A10 for signal a[0] in the box labeled Site.
--Click Apply.
--Repeat for the rest of the signals a, b, z, and cout using the pin numbers determined earlier and shown in the previous figure.
--Check the pin numbers now listed in the I/O Ports tab to be sure that they are correct. The PlanAhead window should look like the next figure:

3.8 Printing the Package View

Note: This step is optional. It is described here for the sake of completeness. To save a tree, do not actually print.

Before printing the package view, change the background from black to white as follows:
--Bring the mouse pointer within the Package window, then right-click and select View – Options
--When the Options window opens, change the PlanAhead Default Theme to PlanAhead Light Theme under the Colors option. Click Apply. Notice that the package now has a light color background.
--At this time you could select File->Print and the Package View would print. However, do not do it. Instead, in the Package window zoom in to pin A10 and verify that it is assigned to a[0]: you should see a[0] written inside the cell at row A, column 10. Verify the correct assignment of the other pins as well.
--Select File->Save Project
--Select File->Exit
--Select OK

3.9 Re-Implementing the Design after Pins Assignment

--The Xilinx ISE screen should now appear again, and a question mark (?) should appear next to Implement Design, indicating that the design is no longer current (since we assigned pins).
--Double-click on Implement Design to implement the design again using the assigned pins.
--The (?) next to Implement Design should now have been replaced by a green check mark again.

3.10 Generating the Programming File

--Double-click on Generate Programming File in the Processes tab. This step will generate the bit file (fourbit_adder.bit in this example) that will be downloaded to the FPGA in a later step. A green check mark should appear after it has successfully run.

3.11 Viewing and Printing Reports

Note: Again, this step is described for completeness; do not actually do the printing.

--The Design Summary tab shows that several types of reports are available.
Click on the Summary, IOB Properties, Pinout Report, etc. reports and take some time to read through and understand them. For example, notice in the Static Timing report that the longest path delay of 14.427 ns is between input bit a<0> and output cout.
--You could print any of these reports by selecting File->Print.
--An alternative to printing the Pinout Report is to print the User Constraints File (UCF). Look in the project folder for a file with a ucf extension (fourbit_adder.ucf in this case). Open the file with any text editor. Notice that this file only contains information on pins that were assigned.
--A schematic can be printed as follows: Select Tools->Schematic Viewer->RTL. The first time you do this, you are asked to select the Viewer Startup Mode; leave it as “Start with the Explorer Wizard”. At this time you should get a new dialog window, “Create RTL Schematic”. Expand the (+) sign of the Signals area, select all of the signals, and click the “Add ->” button. Click the Create Schematic button. The schematic now appears as shown in the figure below. Double-click on the full_adder box to go lower into the hierarchy of the design. You could print the schematic by selecting File->Print (the background changes to white).

3.12 Save and Close the Project

--Select File->Save
--Select File->Close Project
--Select File->Exit to shut down Xilinx ISE

3.13 Opening an existing project

--If you need to open an existing project, look in the project folder for a file with an .xise extension (full_adder.xise in this example).
--If you modify the VHDL source code, you must run Implement Design and Generate Programming File again. There is no need to run PlanAhead and assign signals to pins, since the UCF file still exists, unless you added or changed inputs or outputs.

4. Procedure: FPGA Programming with Digilent Adept

Digilent Adept is a free program available from Digilent to download synthesized designs (bit files) onto Digilent FPGA boards.
4.1 Method 1: Direct programming via USB cable

To program the Atlys board using the Adept software, first set up the board and initialize the software:
--Plug in and attach the power supply
--Plug in the USB cable to the PC and to the USB port on the board: the one marked “PROG” on the board’s PCB (this is the so-called Adept USB port).
--Turn ON the Atlys power switch
--Start the Adept software
--Wait for the FPGA to be recognized. If everything is properly connected and powered up, the software should recognize the board as indicated in the figure below.
--Select the Browse… button next to the FPGA box and locate the bit file generated by the Xilinx ISE WebPack software. In this example, the file is fourbit_adder.bit, located in the project folder created in the previous section of this tutorial. Then click Program.
--If everything is ok, the Adept software should print the message “Programming Successful” once the programming is finished.
--Congratulations! You just programmed the Spartan-6 FPGA to implement the fourbit_adder design! Use the slide switches to set inputs “a” and “b” and watch the LEDs to verify that the adder works correctly. Try different combinations of input values. For example, to test a + b = 2 + 5 = 7, set input a = 0010 and b = 0101 via the slide switches; the output should turn the LEDs on as shown in the figure below.

[Figure: Atlys board with the slide switch groups labeled b[0-3] and a[0-3]]

4.2 Method 2: Programming from Flash memory

--Turn the Atlys FPGA board OFF and then ON again. Note that the design has been lost! Recall that the LUTs in an FPGA are essentially RAM and their contents are lost when power is turned off. The Atlys board also contains a 16 Mbyte x4 SPI Flash, which can be used to permanently store the configuration file of our design.
--To program the SPI Flash ROM, select the “Flash” tab in the Adept software window.
--In the “FPGA Programming File” section click Browse… and locate the fourbit_adder.bit file and then click Program.
If everything went ok, you should get the message “Flash configuration successful”.
--Turn the FPGA board OFF and then ON again. The configuration stored in the SPI Flash ROM is automatically transferred to the FPGA at power-on.
--Disconnect the USB cable and turn the power switch OFF and ON again. Note that the design should still work!

4.3 Other programming methods

As mentioned in the Reference Manual of the Atlys board (link provided earlier), the FPGA can also be programmed via the JTAG interface. In addition, the programming file can be transferred from a USB memory stick attached to the USB HID port (the one marked J13 on the board’s PCB). It is left as an assignment for you to search and read through the documentation to figure out exactly how programming from a USB memory stick can be done.

5. Taking it further

As you have already realized, Xilinx ISE WebPack is a sophisticated piece of software with lots of features. It is outside the scope of this tutorial to discuss all of them. You should spend time on your own to search for and read additional documentation and tutorials. A few first examples:
--Xilinx’s ISE In-Depth Tutorial: http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_1/ise_tutorial_ug695.pdf
--Digilent’s ISE WebPack VHDL Tutorial: http://www.digilentinc.com/Data/Documents/Tutorials/Xilinx%20ISE%20WebPACK%20VHDL%20Tutorial.pdf
--Digilent’s Adept Software Advanced Tutorial: http://digilentinc.com/Data/Documents/Tutorials/Adept%20Software%20Advanced%20Tutorial.pdf
--Once you have launched the ISE tool, select the Help menu and browse the built-in documentation.
--Google for “ISE WebPack tutorial”. You will find a lot of detailed tutorials (some written for older versions of the tool, but a lot of the concepts still apply), which have been kindly made public by the online community.

Note: As is the case with most electronic design automation (EDA) tools, there are multiple ways of achieving or performing something.
If by reading the documentation or other tutorials you learn how to accomplish any of the steps described in this tutorial in a different way, that is OK. You should learn and use the methods you like the most and are most comfortable with.

Lab 2: Supplemental Material - Subprograms and Packages

The objective of this supplemental material is to introduce you to the concepts of subprograms (functions and procedures) and packages in VHDL.

1. VHDL Functions

A function executes a sequential algorithm and returns a single value to the calling program. We can think of a function as a generalization of expressions. The syntax rule for a function declaration is:

[pure | impure] function identifier [(parameter_interface_list)] return type_mark is
    {subprogram declarations}
begin
    {sequential statements}
end [function] [identifier];

By default (i.e., if no keyword is given), functions are declared as pure. A pure function does not have access to a shared variable, because shared variables are declared in the declarative part of the architecture and pure functions do not have access to objects outside of their scope. Only parameters of mode 'in' are allowed in function calls and are treated as 'constant' by default.

Functions may be used wherever an expression is necessary within a VHDL statement. Subprograms themselves, however, are executed sequentially like processes. As in a process, it is also possible to declare local variables. These variables are initialized at each function call with the leftmost value of their type (boolean: false, bit: '0'). For integers, the leftmost value is the most negative value of the type, which is guaranteed to be -(2^31 - 1) or lower, so an integer variable that should start at zero must be explicitly initialized to 0 at the beginning of the function body. It is recommended to initialize all variables in order to enhance the clarity of the code.
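The point about initializing local variables can be illustrated with a minimal sketch. The entity ones_counter and the function count_ones below are hypothetical names chosen only for this illustration; they are not part of the lab design. The function is declared in the declarative part of an architecture and gives its local integer variable an explicit initial value.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical example entity (not part of the lab design).
entity ones_counter is
  port (v : in  std_logic_vector(3 downto 0);
        n : out integer range 0 to 4);
end ones_counter;

architecture demo of ones_counter is
  -- Pure function (the default): it only reads its 'in' parameter.
  function count_ones(x : std_logic_vector) return integer is
    -- Explicit initial value; it is applied again at every call.
    variable total : integer := 0;
  begin
    for i in x'range loop
      if x(i) = '1' then
        total := total + 1;
      end if;
    end loop;
    return total;
  end function count_ones;
begin
  -- A function call may appear wherever an expression is allowed.
  n <= count_ones(v);
end demo;
```

If the initial value were omitted, total would start at the leftmost value of the integer type (a large negative number) and the returned count would be wrong.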
Example 1: The following VHDL code describes a simple function that adds two 4-bit vectors and a carry in and returns a 5-bit sum:

function add4_func(a, b : std_logic_vector(3 downto 0);
                   carry : std_logic) return std_logic_vector is
    variable cout : std_logic;
    variable cin : std_logic;
    variable sum : std_logic_vector(4 downto 0);
begin
    cin := carry;
    sum := "00000";
    loop1 : for i in 0 to 3 loop
        cout := (a(i) and b(i)) or (a(i) and cin) or (b(i) and cin);
        sum(i) := a(i) xor b(i) xor cin;
        cin := cout;
    end loop loop1;
    sum(4) := cout;
    return sum;
end add4_func;

Question: what is the role of the statement: cin := cout; inside loop1?

2. VHDL Procedures

Procedures, in contrast to functions, are used like any other statement in VHDL. Consequently, they do not have a return value, although the keyword 'return' may be used to indicate the termination of the subprogram. Depending on their position within the VHDL code, either in an architecture or in a process, the procedure as a whole is executed concurrently or sequentially, respectively. Procedures facilitate decomposition of VHDL code into modules. They can return any number of values using output parameters. The default mode of a parameter is 'in'; the keyword 'out' or 'inout' is necessary to declare output signals/variables. The syntax rule for a procedure declaration is:

procedure identifier [(parameter_interface_list)] is
    {subprogram declarations}
begin
    {sequential statements}
end [procedure] [identifier];

Example 2: The following procedure does basically the same thing as the function in the previous example:

procedure add4_proc (a, b : in std_logic_vector(3 downto 0);
                     carry : in std_logic;
                     signal sum : out std_logic_vector(3 downto 0);
                     signal cout : out std_logic) is
    variable c : std_logic;
begin
    c := carry;
    for i in 0 to 3 loop
        sum(i) <= a(i) xor b(i) xor c;
        c := (a(i) and b(i)) or (a(i) and c) or (b(i) and c);
    end loop;
    cout <= c;
end add4_proc;

3.
Packages and libraries

Packages and libraries provide a convenient way of referencing frequently used functions and components. Packages are the only language mechanism to share objects among different design units. Usually, they are designed to provide standard solutions for specific problems (e.g., data types and corresponding subprograms like type conversion functions for a certain bus protocol, procedures and components (macros) for signal processing purposes, etc.).

A package consists of a package declaration and an optional package body. The package declaration contains a set of declarations, which may be shared by several design units (for example: types, signals, components, and function and procedure declarations). The package body usually contains the function and procedure bodies.

The syntax rule for a package declaration is:

package identifier is
    {package declarations}
end [package] [identifier];

A package is analyzed separately and placed in the working library by the analyzer. Each package declaration that includes function and/or procedure declarations must have a corresponding package body. The syntax rule for a package body is:

package body identifier is
    {package body declarations}
end [package body] [identifier];

Example 3: Simple package declaration and its corresponding body.
library IEEE;
use IEEE.std_logic_1164.all;

package my_package is
    function add4_func(a, b: std_logic_vector(3 downto 0);
                       carry : std_logic) return std_logic_vector;
    procedure add4_proc (a, b: in std_logic_vector(3 downto 0);
                         carry: in std_logic;
                         signal sum: out std_logic_vector(3 downto 0);
                         signal cout: out std_logic);
end package my_package;

Since the package contains subprogram declarations, we also declare the package body:

package body my_package is

    function add4_func(a, b: std_logic_vector(3 downto 0);
                       carry: std_logic) return std_logic_vector is
        variable cout: std_logic;
        variable cin: std_logic;
        variable sum: std_logic_vector(4 downto 0);
    begin
        cin := carry;
        sum := "00000";
        loop1: for i in 0 to 3 loop
            cout := (a(i) and b(i)) or (a(i) and cin) or (b(i) and cin);
            sum(i) := a(i) xor b(i) xor cin;
            cin := cout;
        end loop loop1;
        sum(4) := cout;
        return sum;
    end add4_func;

    procedure add4_proc (a, b: in std_logic_vector(3 downto 0);
                         carry: in std_logic;
                         signal sum: out std_logic_vector(3 downto 0);
                         signal cout: out std_logic) is
        variable c: std_logic;
    begin
        c := carry;
        for i in 0 to 3 loop
            sum(i) <= a(i) xor b(i) xor c;
            c := (a(i) and b(i)) or (a(i) and c) or (b(i) and c);
        end loop;
        cout <= c;
    end add4_proc;

end package body my_package;

Suppose the above package declaration and package body are saved as my_package.vhd (i.e., as a VHDL file). Normally, the file could be analyzed and placed in any library directory, for instance the MY_LIBRARY directory. Then, we can write other VHDL files (or library units) in which we reference items from the newly created library (i.e., MY_LIBRARY) using the "selected name". The "selected name" is formed by writing the library name, then the package name, and then the name of the item (or all if you want to use all items), all separated by dots. For example:

library MY_LIBRARY;
use MY_LIBRARY.my_package.all;

Example 4: Simple 8-bit adder using the above package. Use ISE WebPack to create a new project.
Add to your project the following two VHDL files, and then synthesize and implement the design.

-------------------------------------------------------------------
-- First VHDL file: has package declaration and package body.
-- Save it as my_package.vhd
library IEEE;
use IEEE.std_logic_1164.all;
-------------------------------------------------------------------
package my_package is
    function add4_func(a, b : std_logic_vector(3 downto 0);
                       carry : std_logic) return std_logic_vector;
    procedure add4_proc (a, b : in std_logic_vector(3 downto 0);
                         carry: in std_logic;
                         signal sum: out std_logic_vector(3 downto 0);
                         signal cout: out std_logic);
end package my_package;
-------------------------------------------------------------------
package body my_package is

    function add4_func(a, b : std_logic_vector(3 downto 0);
                       carry: std_logic) return std_logic_vector is
        variable cout: std_logic;
        variable cin: std_logic;
        variable sum: std_logic_vector(4 downto 0);
    begin
        cin := carry;
        sum := "00000";
        loop1: for i in 0 to 3 loop
            cout := (a(i) and b(i)) or (a(i) and cin) or (b(i) and cin);
            sum(i) := a(i) xor b(i) xor cin;
            cin := cout;
        end loop loop1;
        sum(4) := cout;
        return sum;
    end add4_func;

    procedure add4_proc (a, b : in std_logic_vector(3 downto 0);
                         carry: in std_logic;
                         signal sum: out std_logic_vector(3 downto 0);
                         signal cout: out std_logic) is
        variable c: std_logic;
    begin
        c := carry;
        for i in 0 to 3 loop
            sum(i) <= a(i) xor b(i) xor c;
            c := (a(i) and b(i)) or (a(i) and c) or (b(i) and c);
        end loop;
        cout <= c;
    end add4_proc;

end package body my_package;
-------------------------------------------------------------------

-------------------------------------------------------------------
-- Second VHDL file: simple 8-bit adder
-- Uses items from "my_package" created in WORK library directory
-- in your current project directory
-- Save it as bit8_adder.vhd
library IEEE;
use IEEE.std_logic_1164.all;
use WORK.my_package.all;

entity bit8_adder is
    port(a, b: in std_logic_vector(7 downto 0);
         ci:
in std_logic;
         y: out std_logic_vector(7 downto 0);
         co: out std_logic);
end bit8_adder;

architecture structural of bit8_adder is
    signal internal_carry : std_logic;
    signal sum1, sum2: std_logic_vector(4 downto 0);
begin
    sum1 <= add4_func(a(3 downto 0), b(3 downto 0), ci);
    sum2 <= add4_func(a(7 downto 4), b(7 downto 4), sum1(4));
    y <= sum2(3 downto 0) & sum1(3 downto 0);
    co <= sum2(4);
end;
-------------------------------------------------------------------

Lab 3: Four-Bit Binary Counter

1. Objective

The objective of this lab is to design and test a 4-bit binary counter. Aside from learning about the on-board clock signal and pushbuttons as well as about frequency dividers, this lab reinforces the design flow steps introduced in the previous labs.

2. Description

We design a 4-bit binary counter. Our counter has an output “Q” with four bits. During correct operation, the counter starts at “0000” and then counts up in binary to output “0001”, “0010”, “0011”, and so on until it outputs “1111”, after which it resets to “0000” and starts again. The first implementation of our counter has only one input: a clock signal CK. The clock signal is provided by the external (to the FPGA) clock generator. We use the output Q to drive the first four LEDs on the Atlys board. The block diagram of the simplest structural implementation of such a binary counter is shown in the next figure. This implementation is known as a ripple counter.

Figure 1 Block diagram of a 4-bit binary counter

Toggle Flip-Flop

As shown in the figure above, we use four Toggle Flip-Flops (TFFs). As you remember, the operation of a TFF is as follows:
--When the “T” input is logic “1”, the output “Q” will toggle on each clock transition.
--When the “T” input is logic “0”, the output “Q” will not change.

To start our design, we first create a new project by launching Xilinx ISE WebPack and following the steps discussed in lab 2.
Call the new project fourbit_counter and select the same location where you created the previous project. Create and add to the project a first VHDL file called tff.vhd with the following content:

-- tff.vhd
-- Toggle Flip-Flop with behavioral description
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity tff is
    Port ( T : in STD_LOGIC;
           CK : in STD_LOGIC;
           Q, QN : out STD_LOGIC);
end tff;

architecture My_behavioral of tff is
    signal mem : std_logic := '0';
begin
    process (CK) -- execute this process only when the clock changes
    begin
        if (CK'event and CK = '1') then -- rising edge of clock
            if T = '1' then
                mem <= not mem; -- T = 1, toggle stored value
            end if; -- T = 0: no toggle, so do nothing
        end if;
    end process;
    Q <= mem;
    QN <= not mem;
end;

Clock Divider

Our counter uses as its clock a signal generated by the on-board clock generator. This clock generator is a single 100 MHz CMOS oscillator on the Atlys board connected to pin L15 of the Spartan-6 FPGA. Because a frequency of 100 MHz is too high for the human eye to see how the counter output drives the LEDs, we must use a clock divider to lower the frequency to about 1 Hz. Create and add to the project a second VHDL file called ck_divider.vhd with the following content:

-- ck_divider.vhd
-- This is a clock divider. It takes as input a signal
-- of 100 MHz and generates an output signal with a frequency
-- of about 1 Hz.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity ck_divider is
    Port ( CK_IN : in STD_LOGIC;
           CK_OUT : out STD_LOGIC);
end ck_divider;

architecture Behavioral of ck_divider is
    constant TIMECONST : integer := 84;
    signal count0, count1, count2, count3 : integer range 0 to 1000;
    signal D : std_logic := '0';
begin
    process (CK_IN, D)
    begin
        if (CK_IN'event and CK_IN = '1') then
            count0 <= count0 + 1;
            if count0 = TIMECONST then
                count0 <= 0;
                count1 <= count1 + 1;
            elsif count1 = TIMECONST then
                count1 <= 0;
                count2 <= count2 + 1;
            elsif count2 = TIMECONST then
                count2 <= 0;
                count3 <= count3 + 1;
            elsif count3 = TIMECONST then
                count3 <= 0;
                D <= not D;
            end if;
        end if;
        CK_OUT <= D;
    end process;
end Behavioral;

Read the above code to understand its operation. It takes the 100 MHz external clock as input CK_IN and generates an output signal CK_OUT of about 1 Hz. The output frequency is adjustable according to the following formula:

Output Frequency = 100000000 / ( 2 * (TIMECONST ^ 4) )

With TIMECONST = 84, as in this case, 84 ^ 4 = 49,787,136, so the formula gives 100000000 / 99574272, or about 1.004 Hz.

Note: There are other ways of implementing the TFF or the clock divider. In time, as you accumulate more experience, you will develop your own VHDL programming style by adopting different coding techniques.

4-bit Binary Counter

Finally, let’s create a third VHDL file with the top-level description of our fourbit_counter design described in Figure 1. Create and add to the project the third VHDL file called fourbit_counter.vhd with the following content:

-- fourbit_counter.vhd
-- This is a simple 4-bit (Ripple) binary counter made up
-- of four T flip-flops. It also includes a clock divider
-- to bring down the input CK signal from 100 MHz to about 1 Hz.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity fourbit_counter is
    Port ( CK : in STD_LOGIC;
           Q : out STD_LOGIC_VECTOR (3 downto 0));
end fourbit_counter;

architecture Structural of fourbit_counter is
    component tff
        Port ( T : in STD_LOGIC;
               CK : in STD_LOGIC;
               Q, QN : out STD_LOGIC);
    end component;
    component ck_divider
        Port ( CK_IN : in STD_LOGIC;
               CK_OUT : out STD_LOGIC);
    end component;
    signal all_T, S0, S1, S2, S3, internal_ck : STD_LOGIC;
begin
    -- We use signal all_T set to logic '1' to drive
    -- input T of all T flip-flops to logic '1'.
    all_T <= '1';
    CLOCK: ck_divider port map (CK, internal_ck);
    TFF0: tff port map (all_T, internal_ck, Q(0), S0);
    TFF1: tff port map (all_T, S0, Q(1), S1);
    TFF2: tff port map (all_T, S1, Q(2), S2);
    TFF3: tff port map (all_T, S2, Q(3), S3);
end Structural;

Design Implementation

At this time, we have coded the entire design and its components. Before continuing to Design Implementation, we first take care of two things:
--Set the fourbit_counter as the top-level design (we need to do this because currently tff.vhd is the top level, since it was added first to the project). To do that, in the Hierarchy window, right-click on “fourbit_counter – Structural (fourbit_counter.vhd)” and select Set as Top Module.
--Pin assignment. As discussed earlier, we use the external clock signal connected to pin L15 of the Spartan-6 FPGA. So, we assign pin L15 to the input “CK” of our design. Also, we use the output “Q” of our design to drive the first four LEDs of the Atlys board. Now, do the pin assignment as learned in lab 2. After this step, your UCF file should have the following content:

# PlanAhead Generated physical constraints
NET "Q[0]" LOC = U18;
NET "Q[1]" LOC = M14;
NET "Q[2]" LOC = N14;
NET "Q[3]" LOC = L14;
NET "CK" LOC = L15;

We are now ready to implement the design: in the Processes tab double-click Implement Design (or right-click on Implement Design and select Run).
Generate the Programming File and Program the FPGA

Double-click on Generate Programming File in the Processes tab. Then, program the FPGA using the Adept software as learned in lab 2. Verify that our counter works correctly.

3. Lab assignment

Lab preparation

A major problem with the counter implemented in this lab is that the individual flip-flops do not all change state at the same time. Rather, each flip-flop is used to trigger the next one in the series. Thus, in switching from all 1s (count = 15) to all 0s (count wraps back to 0), we don’t see a smooth transition. Instead, output Q(0) falls first, changing the apparent count to 14. This triggers output Q(1) to fall, changing the apparent count to 12. This in turn triggers output Q(2), which leaves a count of 8 while triggering output Q(3) to fall. This last action finally leaves us with the correct output count of zero. We say that the change of state “ripples” through the counter from one flip-flop to the next. Therefore, this circuit is known as a “ripple counter”.

This causes no problem if the output is only to be read by human eyes; the ripple effect is too fast for us to see it. However, if the count is to be used as a selector by other digital circuits (such as a multiplexer or demultiplexer), the ripple effect can easily allow signals to get mixed together in an undesirable fashion. To prevent this, we need to devise a method of causing all of the flip-flops to change state at the same moment. That would be known as a “synchronous counter” because the flip-flops would be synchronized to operate in unison.

In this lab assignment, you must design a synchronous counter version of our fourbit_counter to arrive at a new block diagram, where all flip-flops are driven by the same clock signal. You should design this counter using the Karnaugh Maps method and use JK flip-flops instead of T flip-flops.
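As a possible starting point for the building block mentioned above, a JK flip-flop can be described behaviorally in the same style as tff.vhd. This is only a sketch under stated assumptions: the entity name jkff, its port names, and the choice of the rising clock edge are hypothetical and not prescribed by this lab.

```vhdl
-- jkff.vhd (hypothetical file name)
-- JK Flip-Flop with behavioral description, rising-edge triggered.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity jkff is
  Port ( J, K  : in  STD_LOGIC;
         CK    : in  STD_LOGIC;
         Q, QN : out STD_LOGIC);
end jkff;

architecture My_behavioral of jkff is
  signal mem : std_logic := '0';
begin
  process (CK)
    variable jk : std_logic_vector(1 downto 0);
  begin
    if rising_edge(CK) then
      jk := J & K;
      case jk is
        when "01"   => mem <= '0';      -- reset
        when "10"   => mem <= '1';      -- set
        when "11"   => mem <= not mem;  -- toggle
        when others => null;            -- "00" (hold) and non-binary values
      end case;
    end if;
  end process;
  Q  <= mem;
  QN <= not mem;
end My_behavioral;
```

In a synchronous counter every instance of such a flip-flop would receive the same clock, and the J and K inputs would be driven by the logic derived from your Karnaugh maps.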
In addition, the top-level design of the fourbit_counter should have an additional input, “RESET”, which when set to logic “1” forces the counter to the initial state “0000”. The RESET input should be controlled by one of the pushbuttons of the Atlys board.

Optional:
--Remove the clock divider entirely from the design. Instead of the 100 MHz clock signal, use a signal from one of the pushbuttons of the Atlys board. In this case, the counter will advance each time the pushbutton is pressed.
--Modify the counter such that it can be told to count up or down.

Lab report and demo

You must turn in a lab report, which should contain the following:
--Lab title
--Your name
--Introduction section: a brief description of the problem you solve in this lab assignment, outlining the goal and design requirements.
--Solution: details of your Karnaugh Maps method. Include all block diagrams and K-maps you need to illustrate each step. This section must be hand-written.
--VHDL code of your entire design. Use a smaller font to save space.
--Conclusion: describe your results and any issues you may have faced during this assignment and how you solved them.

For full credit, you must demo the correct operation of your counter to the TA during the next lab.

Lab 3: Supplemental Material - Testbenches

The objective of this supplemental material is to reinforce the concept of testbenches in VHDL.

1. Introduction

An alternative way to verify the correctness of a VHDL description of a design is to use testbenches. A testbench is an enclosing VHDL model. Its name comes from the analogy with a real hardware testbench, on which a Device Under Test (DUT) is stimulated with signal generators and observed with signal probes. A VHDL testbench consists of an architecture body containing an instance of the component to be tested and processes that generate sequences of values on signals connected to the component instance.
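The structure just described can be sketched with a minimal, self-contained example. The and2 entity and its testbench and2_tb are hypothetical names used only for illustration; they are not part of this lab. A single process drives the inputs of the instantiated component and uses assert statements to compare the observed output with the expected one.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical device under test: a 2-input AND gate.
entity and2 is
  port (x, y : in std_logic; f : out std_logic);
end and2;

architecture rtl of and2 is
begin
  f <= x and y;
end rtl;

library IEEE;
use IEEE.std_logic_1164.all;

-- A testbench entity has no ports.
entity and2_tb is
end and2_tb;

architecture test of and2_tb is
  signal x, y, f : std_logic := '0';
begin
  -- Instance of the component to be tested.
  dut : entity work.and2(rtl) port map (x => x, y => y, f => f);

  -- Stimulus process: applies input values and checks the output.
  stim : process
  begin
    x <= '0'; y <= '0'; wait for 10 ns;
    assert f = '0' report "0 and 0 should be 0" severity error;
    x <= '1'; y <= '0'; wait for 10 ns;
    assert f = '0' report "1 and 0 should be 0" severity error;
    x <= '1'; y <= '1'; wait for 10 ns;
    assert f = '1' report "1 and 1 should be 1" severity error;
    report "and2_tb finished" severity note;
    wait;  -- suspend forever so the simulation can end
  end process stim;
end test;
```

Each wait gives the design time to react before the assert samples the output; the final wait without a time-out prevents the process from restarting.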
The architecture body may also contain processes that test that the component instance produces the expected values on its output signals.

During this supplemental lab you will write the VHDL model for a registered ALU using a package, and test it using a testbench. Your ALU is capable of performing four operations on two operands as shown in Figure 1. The flag output is high (logic '1') whenever there is either an underflow or an overflow on the C bus.

[Figure 1: Simple ALU. Inputs a(3:0), b(3:0), func(1:0), clk, and reset; an ALU block followed by a REGISTER block; outputs flag (1 bit) and c(3:0) (4 bits)]

2. Writing the package

As you already learned, a VHDL package is an important way of grouping a collection of related declarations that serve a common purpose. Usually, a package is a set of subprograms that provide operations on a particular type of data, or it might be just the set of declarations needed to model a design. The important thing is that they can be collected together into a separate design unit that can be worked on independently and reused in different parts of a model or models.

The following VHDL code describes all the operations needed to implement the four basic operations of your simple ALU. Type it using any text editor (or using the VHDL editor of ISE WebPack) and save it as alupack.vhd.

---------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
---------------------------------------------
-- package declarations for procedures and constants
package addorsub is
    -- set the default bus size
    constant bussize : integer := 4;
    -- set up a type for a bus of size bussize
    subtype stdbus is signed (3 downto 0);
    subtype lrgbus is signed (4 downto 0);
    -- set the integer range for a bus of size bussize + 1
    subtype medint is integer range -32 to 31;
    -- extend performs a one-bit unsigned or signed extension based
    -- on the value of signex. signex=1 does a signed extension.
    procedure extend (signal inbus : in stdbus;
                      variable outbus : out lrgbus;
                      signex : in std_logic);
    -- usadd performs unsigned or signed addition of two busses of size
    -- bussize. the result is an unsigned or signed bus of size bussize
    -- depending on signex (signex = 1 produces a signed result). reportf
    -- indicates if there is an underflow or overflow.
    procedure usadd (signal abus, bbus : in signed(bussize-1 downto 0);
                     signal result : out signed(bussize-1 downto 0);
                     signex : in std_logic;
                     signal reportf : out std_logic);
    -- ussub performs unsigned or signed subtraction (abus - bbus)
    -- of two busses of size bussize (signex=1 causes signed subtraction).
    -- reportf = 1 if there is an underflow or overflow.
    procedure ussub (signal abus, bbus : in signed(bussize-1 downto 0);
                     signal result : out signed(bussize-1 downto 0);
                     signex : in std_logic;
                     signal reportf : out std_logic);
end addorsub;
---------------------------------------------
-- package body contains the procedure bodies.
package body addorsub is

    procedure extend (signal inbus : in stdbus;
                      variable outbus : out lrgbus;
                      signex : in std_logic) is
    begin
        outbus := (signex and inbus (bussize-1)) & inbus(bussize-1 downto 0);
    end;

    procedure usadd (signal abus, bbus : in signed(bussize-1 downto 0);
                     signal result : out signed(bussize-1 downto 0);
                     signex : in std_logic;
                     signal reportf : out std_logic) is
        variable tempr : medint;
        variable tempa : signed(bussize downto 0);
        variable tempb : signed(bussize downto 0);
    begin
        -- sign/unsign extend abus and bbus to a bus of size bussize + 1;
        extend(abus, tempa, signex);
        extend(bbus, tempb, signex);
        -- perform the addition on the extended operands
        tempr := to_integer(tempa) + to_integer(tempb);
        -- check for overflows dependent on type of addition
        if (signex = '0' and tempr > 15) then
            -- overflow of unsigned addition
            reportf <= '1';
        elsif (signex = '1' and (tempr > 7 or tempr < -8)) then
            -- overflow or underflow of signed addition
            reportf <= '1';
        else
            reportf <= '0';
        end if;
        result <= to_signed(tempr, bussize);
    end usadd;

    procedure ussub (signal abus, bbus : in signed(bussize-1 downto 0);
                     signal result : out signed(bussize-1 downto 0);
                     signex : in std_logic;
                     signal reportf : out std_logic) is
        variable tempr : medint;
        variable tempa : signed(bussize downto 0);
        variable tempb : signed(bussize downto 0);
    begin
        -- sign/unsign extend abus and bbus to a bus of size bussize+1;
        extend(abus, tempa, signex);
        extend(bbus, tempb, signex);
        -- perform the subtraction on the extended operands
        tempr := to_integer(tempa) - to_integer(tempb);
        -- check for overflows dependent on type of subtraction
        if (signex = '0' and tempr < 0) then
            -- underflow of unsigned subtraction
            reportf <= '1';
        elsif (signex = '1' and (tempr > 7 or tempr < -8)) then
            -- overflow or underflow of signed subtraction
            reportf <= '1';
        else
            reportf <= '0';
        end if;
        result <= to_signed(tempr, bussize);
    end ussub;

end addorsub; -- end of package body
---------------------------------------------

3. Writing the VHDL description of the ALU

The following VHDL code describes the ALU, which uses the procedures declared and implemented in the package addorsub (saved in alupack.vhd). The ALU has a register to latch the output. Type it using any text editor and save it as alu.vhd.
---------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
use WORK.addorsub.all;
---------------------------------------------
entity alu is
  port (a, b : in stdbus;
        func : in std_logic_vector(1 downto 0);
        clk, reset : in std_logic;
        flag : out std_logic;
        c : out stdbus);
end alu;
---------------------------------------------
architecture rtl of alu is
  signal intflag : std_logic;
  signal intbus : stdbus;
begin
  regp : process (clk, reset)
  begin
    if (reset = '1') then
      flag <= '0';
      c <= "0000";
    elsif (clk'event and clk = '0') then
      flag <= intflag;
      c <= intbus;
    end if;
  end process regp;

  alup : process (a, b, func)
  begin
    if func(1) = '0' then
      usadd(a, b, intbus, func(0), intflag);
    else
      ussub(a, b, intbus, func(0), intflag);
    end if;
  end process alup;
end rtl;
---------------------------------------------

4. Writing the testbench

The following VHDL code represents the testbench. It generates inputs for the ALU and monitors its outputs. The testbench compares the actual outputs with the expected outputs and reports whether each test succeeded. Note that you do not need a stimulus file when you work with testbenches; the design is driven by stimulus generated inside the testbench. Type the following VHDL code and save it as testbench.vhd.
---------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
use WORK.addorsub.all;
---------------------------------------------
entity testbench is
end testbench;
---------------------------------------------
architecture test of testbench is
  type table_type1 is array (0 to 5) of signed (3 downto 0);
  type table_type2 is array (0 to 3) of std_logic_vector (1 downto 0);
  constant inputa : signed := "0000";
  constant inputb : signed := "0000";
  constant outc : table_type1 := ("0001", "0011", "0101", "0111", "1001", "1011");
  constant outgen : table_type2 := ("00", "01", "10", "11");
  signal cbus : signed (3 downto 0);
  signal flag : std_logic;
  signal abus : signed (3 downto 0) := "0000";
  signal bbus : signed (3 downto 0) := "0000";
  signal clk : std_logic;
  signal reset : std_logic;
  signal sel : std_logic_vector (1 downto 0) := "00";

  component alu
    port (a, b : in stdbus;
          func : in std_logic_vector(1 downto 0);
          clk, reset : in std_logic;
          flag : out std_logic;
          c : out stdbus);
  end component;

  for alu_inst : alu use entity work.alu(rtl);
begin
  alu_inst : alu port map (abus, bbus, sel, clk, reset, flag, cbus);

  clkp : process
  begin
    clk <= '1', '0' after 50 ns;
    wait for 100 ns;
  end process clkp;

  rset : process
  begin
    reset <= '1', '0' after 100 ns;
    wait for 1 ms;
  end process rset;

  testp : process
  begin
    wait for 100 ns; -- this is needed for reset to finish
    for j in 0 to 1 loop -- test for unsigned & signed add
      sel <= outgen(j);
      for i in 0 to 5 loop
        abus <= inputa + TO_SIGNED(i, 4);
        bbus <= inputb + TO_SIGNED(i+1, 4);
        wait for 51 ns;
        assert (cbus = outc(i))
          report "Result is not correct" severity warning;
        wait for 49 ns;
      end loop;
    end loop;
    for j in 2 to 3 loop -- test for unsigned & signed sub
      sel <= outgen(j);
      for i in 0 to 5 loop
        abus <= inputa + TO_SIGNED(i, 4);
        bbus <= inputb + TO_SIGNED(i+1, 4);
        wait for 51 ns;
        assert (cbus = "1111")
          report "Result is not correct" severity warning;
        wait for 49 ns;
      end loop;
    end loop;
    assert
false report "Test Complete" severity error;
  end process testp;
end test;
---------------------------------------------

Read the above files thoroughly to understand the functionality of the testbench, then:
Use the Aldec Active-HDL simulator to simulate alu.vhd together with alupack.vhd. Create your own input signals (as in Lab 1) to stimulate the four basic operations performed by the ALU and verify its correctness.
Simulate testbench.vhd (together with alu.vhd and alupack.vhd) to verify the ALU. Notice that using testbenches saves you time.

5. Lab assignment

You are required to modify the ALU design such that it can be implemented with ISE WebPack and verified on the Atlys board. You must add a clock divider to provide a clock frequency of 1 Hz to the ALU unit. The clock divider uses as input the 100 MHz signal of the Atlys board. Use output c(3:0) to drive LEDs. The LEDs must display either a number between 0-15 for unsigned operations, or a number between 0-7 for the signed operations. The output "flag" should drive the leftmost LED. As inputs a(3:0) and b(3:0) use all eight slide switches. As func(1:0) use two of the push-buttons. Synthesize and implement this modified ALU and download its bitstream file to the board to configure the FPGA. Verify the correct operation.

Lab 4: Finite State Machines

1. Objective

The objective of this lab is to study several different ways of specifying and implementing finite state machines (FSMs). We also discuss finite state machines with datapath (FSMD).

2. Introduction

There are two basic types of sequential circuits: Mealy and Moore. Because these circuits transition among a finite number of internal states, they are referred to as finite state machines (FSMs). In a Mealy circuit, the outputs depend on both the present inputs and the present state. In a Moore circuit, the outputs depend only on the present state. The most common way of schematically representing a Mealy sequential circuit is shown in Fig.1.
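To make the Mealy/Moore distinction concrete, here is a minimal sketch that derives both kinds of output from the same two-state machine. The entity name mealy_vs_moore, the state names, and the port names are hypothetical and chosen only for illustration; they are not part of the lab files.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity (not part of the lab archive).
entity mealy_vs_moore is
  port (clk, rst, x      : in  std_logic;
        z_mealy, z_moore : out std_logic);
end mealy_vs_moore;

architecture sketch of mealy_vs_moore is
  type state_type is (IDLE, SEEN_ONE);
  signal state, next_state : state_type;
begin
  -- State register: updated on the rising clock edge.
  process (clk, rst)
  begin
    if rst = '1' then
      state <= IDLE;
    elsif rising_edge(clk) then
      state <= next_state;
    end if;
  end process;

  -- Next-state logic: remember whether x was '1' at the last clock edge.
  next_state <= SEEN_ONE when x = '1' else IDLE;

  -- Moore output: a function of the present state only.
  z_moore <= '1' when state = SEEN_ONE else '0';

  -- Mealy output: a function of the present state AND the present input.
  z_mealy <= '1' when (state = IDLE and x = '1') else '0';
end sketch;
```

In this sketch z_mealy can change as soon as x changes, whereas z_moore changes only after a clock edge has updated the state.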
Figure 1 State transition table and block diagram of a Mealy type sequential circuit (BCD to excess-3 converter)

The state register normally consists of D flip-flops (DFFs). However, other types of flip-flops can be utilized, such as JKFFs. The normal sequence of events is: (1) inputs X change to a new value, (2) after a delay, outputs Z and next state NS become stable at the output of the combinational circuit, (3) the next state signals NS are stored in the state register; that is, next state NS replaces present state PS at the output of the state register, which feeds back into the combinational circuit. At this time, a new cycle is ready to start. These operational cycles are synchronized with the clock signal CLK.

It is worth mentioning that some authors further classify sequential circuits into two categories. The first category, referred to as "regular sequential circuits", includes circuits like (shift) registers, FIFOs, and binary counters and variants. The second category, referred to as "finite state machines" (FSMs), includes circuits that typically do not exhibit a simple, repetitive pattern.

3. Example 1: MEALY machine design – BCD to Excess-3 code converter

In this example, we'll design a serial converter that converts a binary coded decimal (BCD) digit to an excess-3-coded decimal digit. Excess-3 binary-coded decimal (XS-3) code, also called biased representation or Excess-N, is a complementary BCD code and numeral system. It was used on some older computers with a pre-specified number N as a biasing value. It is a way to represent values with a balanced number of positive and negative numbers. In our example, the XS-3 code is formed by adding 0011 to the BCD digit.

The table and state graph in Fig.2 describe the functionality of our design. For details, please read pages 19-25 in the textbook.

Figure 2 Code converter: table and state graph

There are several ways to model this sequential machine.
One common approach is to use two processes to represent the two parts of the circuit: the combinational part and the state register. For clarity and flexibility, we use VHDL's enumerated data type to represent the FSM's states. The following VHDL code describes the converter (file code_conv_2processes.vhd):

-- Behavioral model of a Mealy state machine: code converter w/ 2 processes
-- It is based on its state table. The output (Z) and next state are
-- computed before the active edge of the clock. The state change
-- occurs on the rising edge of the clock.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity Code_Converter is
  port (enable : in std_logic;
        X, CLK : in std_logic;
        Z : out std_logic);
end Code_Converter;

architecture Behavioral of Code_Converter is
  type state_type is (S0, S1, S2, S3, S4, S5, S6);
  signal State, Nextstate : state_type;
  -- a different way: represent states as integer signals:
  -- signal State, Nextstate: integer range 0 to 6;
begin
  -- Combinational Circuit
  process (State, X)
  begin
    case State is
      when S0 =>
        if X = '0' then Z <= '1'; Nextstate <= S1;
        else Z <= '0'; Nextstate <= S2;
        end if;
      when S1 =>
        if X = '0' then Z <= '1'; Nextstate <= S3;
        else Z <= '0'; Nextstate <= S4;
        end if;
      when S2 =>
        if X = '0' then Z <= '0'; Nextstate <= S4;
        else Z <= '1'; Nextstate <= S4;
        end if;
      when S3 =>
        if X = '0' then Z <= '0'; Nextstate <= S5;
        else Z <= '1'; Nextstate <= S5;
        end if;
      when S4 =>
        if X = '0' then Z <= '1'; Nextstate <= S5;
        else Z <= '0'; Nextstate <= S6;
        end if;
      when S5 =>
        if X = '0' then Z <= '0'; Nextstate <= S0;
        else Z <= '1'; Nextstate <= S0;
        end if;
      when S6 =>
        if X = '0' then Z <= '1'; Nextstate <= S0;
        else Z <= '0'; Nextstate <= S0;
        end if;
      when others => null; -- should not occur
    end case;
  end process;

  -- State Register
  process (enable, CLK)
  begin
    if enable = '0' then
      State <= S0;
    elsif rising_edge (CLK) then
      State <= Nextstate;
    end if;
  end process;
end Behavioral;

Note that in each branch of the case statement, the output Z and Nextstate are
assigned values. The second process represents the state register, which is updated on the rising edge of the CLK signal.

To test this converter on the Atlys board, we'll design a circuit that uses two shift registers, the converter, and a clock divider, as shown in the diagram of Fig.3. The input is provided in parallel as four bits via four slide switches, while the output is displayed on four LEDs. We use a clock divider to generate a slower clock signal (about 1 Hz) to make it easier to monitor the operation of the whole system.

Create a new ISE project (let's call it lab4_fsm) and add to it the following VHDL files: code_conv_2processes.vhd, ck_divider.vhd, shift_register.vhd, and top_level.vhd. These files contain the declaration and description of all entities necessary to implement the system from Fig.3. These files, together with other useful files (e.g., the .ucf file), are included in the downloadable archive with all the data for this lab. Read top_level.vhd and figure out what exactly the "control" block in Fig.3 does.

Run the Implement Design step inside ISE WebPack to perform placement and routing. Generate the programming .bit file and program the FPGA. Verify the operation of your design. Observe and comment.

Figure 3 Block diagram of top-level design to test the BCD to XS3 converter

There are other ways to describe the behavioral model of the code converter:

One way is to use only a single process (rather than two processes as discussed above). In this case, the next state is not computed explicitly; instead, the state register is updated directly to the proper next-state value on the rising edge of the clock signal. You can see the VHDL code of such an approach in Fig. 2-56, page 106, in the textbook.

Another way is to use the so-called dataflow approach. This is based on using Boolean equations that implement the combinational part of the state machine. An example of this is shown in Fig. 2-57, page 107, in the textbook.
Because this method assumes that we already know these equations, it is not a preferred method.

Yet another approach to writing the VHDL code for the state machine is to create a structural model. The structural model describes all actual gates and flip-flops and their connectivity. An example of this is shown in Fig. 2-58, page 108, in the textbook.

Finally, there is yet another way of describing a state machine: a state machine editor. This, however, is possible only when using the Aldec Active-HDL tool. The State Diagram Editor of Aldec is a tool designed for the graphical editing of state diagrams of synchronous and asynchronous machines. Drawing a state diagram is an alternative approach to the modeling of a sequential device. Instead of writing the HDL code, one can enter the description of a logic block as a graphical state diagram. The tool will then automatically generate the HDL code based on the entered graphical description. Due to their intuitive graphic form, state diagrams are easy to learn and far more readable than HDL code [1]. We'll not use this in this course; it is mentioned here for the sake of completeness. For more information, you may want to check out [2,3].

The method using two processes is the recommended one because it is closer to how the hardware actually works and it is more readable as VHDL code.

4. Example 2: Finite state machine with datapath (FSMD) - bit difference calculator

A finite state machine with datapath (FSMD) combines an FSM and regular sequential circuits. The FSM, sometimes referred to as a control-path or controller, examines the external commands and status and generates control signals to specify operations of the regular sequential circuits, which are known collectively as a data-path [4]. The FSMD is used to implement systems described by RT (register transfer) methodology, where the system's functionality is specified as data manipulation and transfer among a collection of registers.
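As a small illustration of what a single register-transfer operation looks like in VHDL, the sketch below implements the transfer acc_reg <- acc_reg + a, enabled by a control signal that a controller FSM would drive. The entity name rt_accumulate, the 8-bit width, and all port names are assumptions made for this example only; they do not appear in the lab files.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity (not part of the lab archive).
entity rt_accumulate is
  port (clk, rst : in  std_logic;
        add_en   : in  std_logic;              -- control signal from the FSM
        a        : in  unsigned(7 downto 0);   -- data input
        acc      : out unsigned(7 downto 0));  -- current register contents
end rt_accumulate;

architecture sketch of rt_accumulate is
  signal acc_reg : unsigned(7 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      acc_reg <= (others => '0');
    elsif rising_edge(clk) then
      if add_en = '1' then
        -- RT operation: acc_reg <- acc_reg + a (wraps around on overflow)
        acc_reg <= acc_reg + a;
      end if;
    end if;
  end process;

  acc <= acc_reg;
end sketch;
```

The datapath here is one register and one adder; the controller's only job is to decide in which clock cycles add_en is asserted.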
Most realistic circuits combine a controller and a datapath to perform some computation. The use of the FSMD model is especially recommended whenever the structure of the datapath is important. For example, if you are creating a custom pipelined datapath for a specific application, specifying the structure of the pipeline is likely important.

The combination of a controller and datapath can be represented using several models in VHDL. In this lab, we'll look at two different models. To do that, we'll design and simulate a simple example: a bit difference calculator [5]. The design's description is as follows:

Given an input of a generic width, the entity calculates the difference between the number of 1s and 0s. If, for example, there are 3 more 1s than 0s, the output is 3. If there are 3 more 0s than 1s, the output is -3.

Implementation A: behavioral model using two processes

A simplified pseudocode description of the bit difference calculator is as follows:

Inputs: go, input (arbitrary width)
Outputs: output (arbitrary width), done (1 bit)

while (go == 0);
value = input; // Store input in a register called value.
diff = 0;
for width iterations {
  if bit0 of value == 1
    diff++;
  else
    diff--;
  value = shiftRight(value, 1);
}
output = diff;
done = 1;

One possible implementation as an FSMD is described by the state graph in Fig.4. The VHDL file top_level_bit_diff_impl_A.vhd describes the entity bit_diff and its architecture.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bit_diff is
  generic (width : positive := 16);
  port (clk : in std_logic;
        rst : in std_logic;
        go : in std_logic;
        input : in std_logic_vector(width-1 downto 0);
        output : out std_logic_vector(width-1 downto 0);
        done : out std_logic);
end bit_diff;

Figure 4 State graph of FSMD implementation

architecture FSMD_2P of bit_diff is
  type STATE_TYPE is (S_INIT, S_CHECK_BIT, S_STORE_OUTPUT, S_DONE);
  signal state, next_state : STATE_TYPE;
  signal value, next_value : std_logic_vector(width-1 downto 0);
  signal diff, next_diff : signed(width-1 downto 0);
  signal count, next_count : integer range 0 to width;
  signal output_s, next_output : std_logic_vector(width-1 downto 0);
begin
  -- this process defines all registers used in the FSMD
  process (clk, rst)
  begin
    if (rst = '1') then
      value <= (others => '0');
      count <= 0;
      diff <= (others => '0');
      output_s <= (others => '0');
      state <= S_INIT;
    elsif (clk'event and clk = '1') then
      -- these are the only registers used by the 2-process FSMD
      value <= next_value;
      count <= next_count;
      diff <= next_diff;
      output_s <= next_output;
      state <= next_state;
    end if;
  end process;

  -- combinational logic
  process (go, input, value, count, diff, output_s, state)
    variable temp : integer range 0 to width;
  begin
    next_count <= count;
    next_value <= value;
    next_diff <= diff;
    next_output <= output_s;
    next_state <= state;
    done <= '0';
    case state is
      when S_INIT =>
        next_count <= 0;
        next_diff <= (others => '0');
        next_value <= input;
        if (go = '1') then
          next_state <= S_CHECK_BIT;
        end if;
      when S_CHECK_BIT =>
        if (value(0) = '0') then
          next_diff <= diff - 1;
        elsif (value(0) = '1') then
          next_diff <= diff + 1;
        end if;
        next_value <= std_logic_vector(shift_right(unsigned(value), 1));
        temp := count + 1;
        next_count <= temp;
        if (temp = width) then
          next_state <= S_STORE_OUTPUT;
        end if;
      when S_STORE_OUTPUT =>
        next_output <= std_logic_vector(diff);
        next_state <= S_DONE;
      when S_DONE =>
        done <= '1';
        next_state
<= S_INIT;
      when others => null;
    end case;
  end process;

  output <= output_s;
end FSMD_2P;

At this time, you should create a simple testbench VHDL file (you can do it by modifying the testbench_top_level.vhd file from Example 1) and simulate the above entity using Aldec Active-HDL. Verify its operation and comment.

Implementation B: structural model using component instantiations for registers, muxes, adders, subtractors, etc.

The structural implementation is recommended when the exact structure of the datapath is important. In this model, we separate the controller and datapath from each other. Typically, we define the datapath structurally and then combine it with a corresponding controller (FSM) described using any of the models discussed in Example 1. For example, assume that we wanted to implement the datapath described in Fig.5. Then, the following files: top_level_bit_diff_impl_B.vhd, datapath.vhd, fsm.vhd, add.vhd, sub.vhd, reg.vhd, mux2x1.vhd, comp.vhd describe all the entities required for implementing the design. Read these files to understand the description. Then, use the same testbench that you created to simulate the previous implementation (implementation A) to verify the operation of this description as well.

Figure 5 Block diagram of datapath

5. Lab assignment

Design and code in VHDL the converter from Example 1, but as a Moore machine. Verify its operation using the Aldec Active-HDL simulator. The lab report should include the state diagram, VHDL code, description, and waveforms.

6. References

[1] Aldec state diagram editor. http://www.aldec.com/en/solutions/fpga_design/graphical_text_design_entry
[2] Getting Started with Active-HDL. http://www.aldec.com/en/support/resources/documentation/articles/1054
[3] Lab tutorial at TCC. http://faculty.tcc.edu/PGordy/EGR270/AldecEx2.pdf
[4] P.P. Chu, RTL Hardware Design Using VHDL: Coding for Efficiency, Portability and Scalability, Wiley-Interscience, 2006.
[5] Greg Stitt, University of Florida, VHDL tutorials. http://www.gstitt.ece.ufl.edu/vhdl

Lab 4: Supplemental Material – Writing VHDL code for synthesis

The objective of this supplemental material is to provide guidelines for writing VHDL code for synthesis.

1. Introduction

The quality of a synthesized design, in terms of area, performance, etc., depends directly on the VHDL description of the design. Generally, two different VHDL descriptions of the same design may result in two different final implemented circuits. The final implemented design also depends on which software tools you use. It is therefore desirable to adopt a VHDL coding style that leads to well-synthesized designs. During this lab you will learn how to write VHDL constructs that are synthesized efficiently.

2. Synthesis tools

Usually, there are several ways to express the functionality of a design. For example, the following VHDL code describes an edge-triggered D flip-flop in four different ways:

-- version 1
process (clk) is
begin
  if rising_edge(clk) then
    q <= d;
  end if;
end process;

-- version 2
process is
begin
  wait until rising_edge(clk);
  q <= d;
end process;

-- version 3
q <= d when rising_edge(clk) else q;

-- version 4
b: block (rising_edge(clk) and not clk'stable) is
begin
  q <= guarded d;
end block b;

It is unlikely that all the above descriptions will be synthesizable by the same tool. This depends on how the tool is constructed, i.e., what the tool "expects" for certain expressions. That is why it is recommended that you 1) read the documentation of your particular tool and 2) possibly change your coding style to conform to the requirements specific to your tool.

3. Potentially synthesizable

VHDL can be utilized to describe designs for the purposes of 1) simulation and 2) synthesis. There are constructs (closer to C programming constructs) included in the VHDL language which are intended for simulation and which cannot be synthesized into hardware.
File operations and assertion statements are examples of such constructs. They should be used for creating testbenches but not in the synthesizable sections of a model.

There are also constructs which are potentially synthesizable but are not handled correctly by some synthesis tools. If you use these constructs, the synthesized hardware may produce different results from the simulated model. For example, suppose you want to describe a registered comparator with two data inputs, a and b, a clock input, clk, and a data output, q. The device stores the result of comparing a and b on each rising edge of clk. The following are two possible ways to describe this circuit:

-- version 1
process (clk) is
  variable d : std_logic;
begin
  if a = b then
    d := '1';
  else
    d := '0';
  end if;
  if rising_edge(clk) then
    q <= d;
  end if;
end process;

-- version 2
process (clk) is
begin
  if rising_edge(clk) then
    if a = b then
      q <= '1';
    else
      q <= '0';
    end if;
  end if;
end process;

When you simulate the first version, it works correctly. When you synthesize it, you have to take into account the fact that the process is resumed on both rising and falling edges of clk. The variable d is updated in both cases, and in this way it is a function of clk. Some tools treat this as illegal and fail to synthesize the device. Others proceed to synthesize the device, but may not produce a correct circuit. Version 2 is a better description since it accurately reflects your intention that the comparison is performed only on rising edges of clk. In this case the process does not contain any unnecessary implied state.

4. "Doing it right" vs. "Doing it wrong"

a) "Doing it right"

y <= a or b; -- simple gate, easy to synthesize
y <= a when x = '1' else b; -- simple multiplexer, no process
                            -- statement necessary!

Extend VHDL code already written to describe new blocks.
The following example uses the description of a flip-flop to specify a counter:

-- flipflop description
ff2 : process (reset, clk) is
begin
  if reset = '1' then
    q <= '0';
  elsif rising_edge(clk) then
    if x = '1' then
      q <= a;
    else
      q <= b;
    end if;
  end if;
end process ff2;

-- flipflop extended to form a counter
constant terminal_count : integer := 2**6-1;
subtype counter_range is integer range 0 to terminal_count;
signal count : counter_range;
...
counter6 : process (reset, clk) is
begin
  if reset = '0' then
    count <= 0;
  elsif rising_edge(clk) then
    if count < terminal_count then
      count <= count + 1;
    else
      count <= 0;
    end if;
  end if;
end process counter6;

Describe Finite State Machines (FSMs) using the two-process model:

architecture behavioral of an_FSM is
  type state_type is (S0, S1, S2, S3);
  signal state, next_state : state_type;
begin
  combinational_part : process (input, state) is
  begin
    case state is
      when S0 =>
        if input = '1' then output <= '1'; next_state <= S0;
        else output <= '1'; next_state <= S1;
        end if;
      when S1 =>
        if input = '1' then output <= '0'; next_state <= S1;
        else output <= '0'; next_state <= S0;
        end if;
      when S2 =>
        if input = '1' then output <= '0'; next_state <= S1;
        else output <= '0'; next_state <= S3;
        end if;
      when S3 =>
        if input = '1' then output <= '0'; next_state <= S3;
        else output <= '1'; next_state <= S0;
        end if;
    end case;
  end process combinational_part;

  state_register : process (reset, clk) is
  begin
    if reset = '0' then
      state <= S0;
    elsif rising_edge(clk) then
      state <= next_state;
    end if;
  end process state_register;
end architecture behavioral;

With this type of description, the register which holds the current state is independent of the logic that determines the next state and the outputs. Synthesis tools work better with the state machine specified in this way. In the above example, the four states can be encoded using two, three, or four bits.
The best encoding depends on the synthesis target library, the required speed of the circuit, and the circuit area available. You can force the above machine to use "one hot" state encoding by modifying the state definition as follows:

-- forcing one hot encoding
subtype state_type is std_logic_vector (3 downto 0);
constant S0 : state_type := "0001";
constant S1 : state_type := "0010";
constant S2 : state_type := "0100";
constant S3 : state_type := "1000";

b) "Doing it wrong"

-- Wrong:
y <= a + b + c + d; -- will be synthesized as three stage circuit!
-- Correct:
y <= (a + b) + (c + d); -- will be synthesized as two stage circuit!

-- Wrong:
y <= a or b or c and d; -- wrong if you want (a+b)+(cd)!
                        -- VHDL logical operators share one precedence level, so
                        -- mixing "or" and "and" requires explicit parentheses.
-- Correct:
y <= (a or b) or (c and d);

Incomplete definitions:

-- Wrong version because there is uncertainty about what is x when a = 1
-- and about what is z when NOT(a = 1).
if (a = '1') then
  z <= f();
elsif (clk'event and clk = '1') then
  x <= g();
end if;

-- Correct:
if (a = '1') then
  z <= f();
  x <= x;
elsif (clk'event and clk = '1') then
  x <= g();
  z <= z;
end if;

-- Wrong construct, because there is an else after a clocked if. Try
-- to draw a schematic and figure out what is wrong.
if (clk'event and clk = '1') then
  x <= f();
  y <= g();
else
  z <= h();
end if;

Avoid putting too much on the sensitivity list of processes! Usually we put clocks, reset signals, and inputs on the sensitivity list. Do not include output signals in sensitivity lists!

5. Distinguishing when to use signals and when to use variables

The behavior of signals and variables can be completely different. Variables can be used only to store data temporarily within a process or subprogram.

-- Undesired construct:
signal int : std_logic;
begin
  process (a, b, c, d, int) is
  begin
    int <= a and b and c;
    q <= int and d;
  end process;
end;

It is undesired because we assign the signal int inside the process and then use it to assign q inside the same process.
Because int is updated only after a delta delay, in the current step int still has the old (incorrect) value. To get around this, int has to be on the sensitivity list, and thus the process will be activated again. But according to the previous guideline, you should avoid an overloaded sensitivity list! A better way to write the above construct is:

-- Better construct:
begin
  process (a, b, c, d) is
    variable int : std_logic;
  begin
    int := a and b and c;
    q <= int and d;
  end process;
end;

This also has the advantage of a faster simulation because, now, the process is executed only once! The advantage of the version with int declared as a signal is that int can be displayed as a waveform in the simulator. This is not possible if int is declared as a variable because no time is linked to a variable. This makes the variable example harder to debug than the signal example!

Rule: Use variables only when you want to store a value temporarily!

6. Others

Declaring vectors:

-- NOT recommended:
signal a : std_logic_vector (0 to 3);
-- Recommended: (because the MSB will be always the one with the highest index)
signal a : std_logic_vector (3 downto 0);

Counter synthesis: The following is the description of a counter without a reset line:

library IEEE;
use IEEE.Std_Logic_1164.all;

entity COUNTER is
  port (CLK : in std_ulogic;
        Q : out integer range 0 to 15);
end COUNTER;

architecture my_cool_arch of COUNTER is
  signal COUNT : integer range 0 to 15;
begin
  process (CLK)
  begin
    if (CLK'event and CLK = '1') then
      if (COUNT >= 9) then
        COUNT <= 0;
      else
        COUNT <= COUNT + 1;
      end if;
    end if;
  end process;

  Q <= COUNT;
end my_cool_arch;

Note the range constraint in the port declaration of the output. Here, only integers between 0 and 15 are allowed, which means that 4 bits are sufficient for the binary representation of the output port.
The port Q is replaced by the synthesis tool with a 4-bit signal (ultimately all types are transformed by the synthesis tools to std_logic types). Note also that Q, as an output port, can only be written; it cannot be read within the architecture (unless its mode is changed to buffer). Therefore, a signal COUNT must be declared within the architecture to be able to evaluate COUNT >= 9. The result of the counting is finally transferred to Q in a concurrent signal assignment, Q <= COUNT, which acts as an additional process. That means that each change of COUNT triggers the assignment Q <= COUNT. Thus, the inner IF statement describes the combinational circuit in front of the flip-flops (FFs). The number of FFs is derived from the width of the signal that receives an assignment inside the outer IF statement. In this example, the width is four for signal COUNT (because of its range 0 to 15). Finally, note that only signal CLK is on the sensitivity list.

7. Conclusion

The discussion in this supplemental material is not meant to be an exhaustive list of how VHDL code should be written for synthesis. Rather, the purpose is to provide a rough idea of what writing code for synthesis means. In particular, the intent is to make you aware of situations where a VHDL description performs correctly during simulation but does not after synthesis and implementation (or, even worse, where the code is not synthesizable in the first place)!

Lab 5: Memories: ROMs and BRAMs Internal to the FPGA

1. Objective

The objective of this lab is to illustrate the use of ROM and block RAM memories located inside the FPGA, a Spartan-6 in the case of our Atlys board. We'll learn how to use ISE's Core Generator tool to create BRAMs. Depending on what your course project will do, you may need to use such memories in your project.

2. Description

In this lab we'll create a project to implement the following design description: The circuit must contain two memories.
The first is a ROM created using a case statement and initialized to the desired values (such as the coefficients of a filter). This memory will be inferred as a distributed (LUT-based) memory by the Xilinx synthesis tool (XST). The second memory is a block RAM (BRAM) created using the Core Generator tool (part of ISE WebPack). The contents of these memories will be read continuously and displayed on the 8 LEDs. Slide switch SW(0) is used to select which of the two memory outputs drives the LEDs. A simplified representation of this functionality is shown in the block diagram in Fig.1.

Figure 1 Block diagram of desired circuit

3. ISE WebPack project

Create a new ISE project and add to it the VHDL files listed at the end of this document. These files contain the declaration and description of this lab's entities, including the clock divider, ROM, and top-level design. These files, together with other useful files (.ucf and .coe files), are also included in the downloadable archive with all the data for this lab.

To create a custom single-port block RAM using the Core Generator, inside your ISE project, follow these steps:

First create, using ISE's or any other text editor, a file named my_bram8x8.coe and save it in the main directory of your ISE project, with the following contents:

memory_initialization_radix=2;
memory_initialization_vector=
00000000,
01000000,
00100000,
00010000,
00001000,
00000100,
00000010,
00000001;

Select New Source->IP (CORE Generator and Architecture Wizard); name it my_bram8x8 and click Next.
In the New Source Wizard that pops up, select Memories and Storage Elements->RAMs & ROMs->Block Memory Generator and click Next, then Finish. This closes the New Source Wizard window and brings up the Block Memory Generator window.
Click Next, leave the Memory Type as "Single Port RAM" and click Next.
On Page 3 of 6, set Write Width to 8 and Write Depth to 8. Click Next.
On Page 4 of 6, select Load Init File and then browse to locate the file created earlier, my_bram8x8.coe, and then click Next.
Click Next again. Page 6 of 6 should look like Figure 2. Click Generate. This will generate the memory core, and in your ISE project’s Console you should get the message:

    Wrote CGP file for project 'my_bram8x8'.
    Core Generator create command completed successfully.

Figure 2

During the above process, several files are created and stored in the ipcore_dir/ folder of your ISE project main folder. Among them, you can find my_bram8x8.vhd. Open it and read the VHDL code. Identify the BRAM entity declaration, and use it to instantiate a component in the top level VHDL file of your project.

Do the pin assignment. After this, your UCF file should have the following contents:

    NET "clk_100MHz" LOC = L15;
    NET "switch" LOC = A10;
    NET "leds[0]" LOC = U18;
    NET "leds[1]" LOC = M14;
    NET "leds[2]" LOC = N14;
    NET "leds[3]" LOC = L14;
    NET "leds[4]" LOC = M13;
    NET "leds[5]" LOC = D4;
    NET "leds[6]" LOC = P16;
    NET "leds[7]" LOC = N12;

Run the Implement Design step inside ISE WebPack to perform placement and routing and observe the messages that the tool prints in the Console window. These messages provide useful information about the resource utilization on the FPGA as well as performance estimates.

Generate the programming .bit file and program the FPGA.

Verify the operation of your design; turn the first slide switch on and off. Observe and comment.

4. Lab assignment

Modify the project to be able to also write into the BRAM. The writing process should allow writing new words into the BRAM (as dictated by the status of the slide switches) during eight cycles. These cycles should be controlled via one of the push-buttons on the Atlys board (BTND P3). It is up to you how you want to utilize the remaining push-buttons to achieve the desired operation of the whole system.

5.
Credits and references

[1] XST User Guide for Virtex-6, Spartan-6, and 7 Series Devices - ROMs and ROM coding examples (page 247): http://www.xilinx.com/support/documentation/sw_manuals/xilinx13_1/xst_v6s6.pdf
[2] Spartan-6 Libraries Guide for HDL Designs: http://www.xilinx.com/support/documentation/sw_manuals/xilinx11/spartan6_hdl.pdf
[3] Spartan-6 FPGA Block RAM Resources: http://www.xilinx.com/support/documentation/user_guides/ug383.pdf

Appendix: Listing of VHDL code

my_modules.vhd

-- This is a ROM. XST tool (part of ISE WebPack tools) will
-- infer this and implement this declaration as a distributed
-- memory.
-- The contents of this memory (basically a LUT) will be utilized to drive
-- the 8 LEDs on the Atlys board. This should turn them on one-by-one
-- from right to left.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

ENTITY rom8x8 IS
  PORT (
    addr : in  std_logic_vector(2 downto 0);
    dout : out std_logic_vector(7 downto 0));
END rom8x8;

ARCHITECTURE behav OF rom8x8 IS
BEGIN
  PROCESS(addr)
  BEGIN
    CASE addr IS
      when "000" => dout <= "00000001";
      when "001" => dout <= "00000010";
      when "010" => dout <= "00000100";
      when "011" => dout <= "00001000";
      when "100" => dout <= "00010000";
      when "101" => dout <= "00100000";
      when "110" => dout <= "01000000";
      when "111" => dout <= "10000000";
      when others => NULL;
    END case;
  END process;
END behav;

-- This is a clock divider. It takes as input a signal of 100 MHz
-- and generates an output signal with a frequency of about 1 Hz.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity ck_divider is
  Port ( CK_IN  : in  STD_LOGIC;
         CK_OUT : out STD_LOGIC);
end ck_divider;

architecture Behavioral of ck_divider is
  constant TIMECONST : integer := 84;
  signal count0, count1, count2, count3 : integer range 0 to 1000;
  signal D : std_logic := '0';
begin
  -- Four cascaded counters; D toggles roughly every 50 million input
  -- clocks (about 0.5 s at 100 MHz), so CK_OUT runs at about 1 Hz.
  process (CK_IN, D)
  begin
    if (CK_IN'event and CK_IN = '1') then
      count0 <= count0 + 1;
      if count0 = TIMECONST then
        count0 <= 0;
        count1 <= count1 + 1;
      elsif count1 = TIMECONST then
        count1 <= 0;
        count2 <= count2 + 1;
      elsif count2 = TIMECONST then
        count2 <= 0;
        count3 <= count3 + 1;
      elsif count3 = TIMECONST then
        count3 <= 0;
        D <= not D;
      end if;
    end if;
    CK_OUT <= D;
  end process;
end Behavioral;

top_level.vhd

-- This is a simple design, in which we use two memories:
-- memory1: ROM created using a case statement and initialized to desired values
--          This should be inferred as a distributed RAM memory by the Xilinx tool
-- memory2: block RAM created using the Core Generator and then only instantiated
--          This memory is initialized using a .coe file
-- The contents of these memories will be displayed on the 8 LEDs of Atlys.
-- Slide switch SW(0) is used to select between the two memories.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;

-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity top_level is
  Port ( clk_100MHz : in  STD_LOGIC;  -- FPGA's external oscillator
         switch     : in  STD_LOGIC;  -- hooked to slide switch SW(0) on Atlys board
         leds       : out STD_LOGIC_VECTOR (7 downto 0));  -- drives all eight LEDs on board
end top_level;

architecture Structural of top_level is

  component ck_divider
    Port (CK_IN  : in  STD_LOGIC;
          CK_OUT : out STD_LOGIC);
  end component;

  -- Question: what would be different if, instead of using this component, we
  -- simply declared an array that would also need to be initialized with desired values?
  -- type ram_t is array (0 to 7) of std_logic_vector(7 downto 0);
  -- signal ram : ram_t := (others => (others => '0'));
  -- Exercise: Currently, if the memory selector (switch) changes, the address
  -- counter does not reset. Change the code such that each time the memory
  -- selector changes, the counter is reset to zero.
  component rom8x8
    PORT (addr : in  std_logic_vector(2 downto 0);
          dout : out std_logic_vector(7 downto 0));
  end component;

  -- This component is created using the Core Generator. Its VHDL description
  -- is inside ipcore_dir/my_bram8x8.vhd, which was created during the use of
  -- Core Generator as explained in the lab.
  component my_bram8x8
    PORT (
      clka  : IN  STD_LOGIC;
      wea   : IN  STD_LOGIC_VECTOR(0 DOWNTO 0);
      addra : IN  STD_LOGIC_VECTOR(2 DOWNTO 0);
      dina  : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
      douta : OUT STD_LOGIC_VECTOR(7 DOWNTO 0)
    );
  end component;

  signal clk_1Hz : STD_LOGIC;
  signal my_addr_counter : STD_LOGIC_VECTOR (2 downto 0) := "000";
  signal dout_rom8x8, dout_bram8x8 : STD_LOGIC_VECTOR (7 downto 0);
  -- for the time being, we'll only read from this block RAM, so
  -- let's set all data inputs to zero
  signal dina_null : STD_LOGIC_VECTOR (7 downto 0) := "00000000";
  signal wea_null : STD_LOGIC_VECTOR(0 DOWNTO 0) := "0";  -- no need for writing in this example

begin

  -- positional association: poor instantiation style
  clock_divider : ck_divider port map (clk_100MHz, clk_1Hz);

  -- named association: better instantiation style
  memory1 : rom8x8 port map (addr => my_addr_counter, dout => dout_rom8x8);

  -- Instantiate BRAM.
  memory2 : my_bram8x8 port map (
    clka  => clk_1Hz,          -- port A clock (used for both reads and writes)
    wea   => wea_null,         -- write enable signal for Port A
    addra => my_addr_counter,  -- 3 bit address for the RAM
    dina  => dina_null,        -- 8 bit data input to the RAM
    douta => dout_bram8x8);    -- 8 bit data output from the RAM

  multiplex_out : process (clk_1Hz) is
  begin
    if (clk_1Hz'event and clk_1Hz = '1') then
      case switch is
        when '0' => leds <= dout_rom8x8;
        when '1' => leds <= dout_bram8x8;
        when others => NULL;
      end case;
      my_addr_counter <= std_logic_vector( unsigned(my_addr_counter) + 1);
    end if;
  end process;

end Structural;

Lab 6: Memories: External SPI Flash and DDR2

1. Objective

The objective of this lab is to learn how to access memory chips from within your VHDL design. These memory chips are external to the FPGA, located on the Atlys board. The board has a 16Mbyte x4 SPI Flash for configuration and data storage and a 128Mbyte DDR2 with 16-bit wide data.

2. Description

The Atlys board uses a 128Mbit Numonyx N25Q128 Serial Flash memory device (16,777,216 bytes of 8 bits each) for non-volatile storage of FPGA configuration files.
The SPI Flash can be programmed with a .bit, .bin, or .mcs file using the Adept software. The Adept Flash programming application also allows user data files to be transferred to/from the Flash at user-specified addresses. The Read/Write tools of Adept allow data to be exchanged between files on the host PC and specified address ranges in Flash. As general-purpose flash, the SPI serial flash can also be used for any other non-volatile storage that you might require. One example could be to store MicroBlaze processor application code for bootloading.

In the first part of this lab we’ll create a project to implement the following design description: the circuit must read one byte (8 bits) from a specified location on the Flash memory chip and use it to drive the 8 LEDs on the Atlys board. A simplified representation of this functionality is shown in the block diagram in Fig.1.

Figure 1 Block diagram of system that reads one memory location and displays it on 8 LEDs

3. SPI Controller

The communication between the Spartan-6 FPGA and the Flash memory chip is done via the so-called Serial Peripheral Interface (SPI) communication method (see Fig.2). SPI is a synchronous serial interface used mainly for short-distance communication between a controller and peripheral chips such as Flash memories, sensors, and data converters. SPI uses a master/slave set-up and can run in full duplex mode (i.e., signals can be transmitted between the master and the slave simultaneously). There is no formal standard for the SPI protocol, so details such as clock polarity, clock phase, and word length vary from device to device and must be taken from the datasheet. SPI has some advantages over I2C (another type of serial data communication). SPI can communicate at much higher data rates than I2C. Furthermore, when multiple slaves are present, SPI requires no addressing to differentiate between them: each slave is selected with its own chip-select line. SPI has the additional benefit of requiring only simple wiring, when compared to parallel buses.
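
The full-duplex exchange described above can be pictured as one shift register that sends a bit on MOSI while it captures a bit from MISO. The following is only a minimal sketch under stated assumptions, not the spi_ctrl.vhd controller used in this lab: the entity name spi_byte_xfer and all port names are hypothetical, SPI mode 0 is assumed (serial clock idles low, data sampled on its rising edge), and the serial clock is simply the system clock divided by two.

```vhdl
-- Hypothetical sketch (not the lab's spi_ctrl.vhd): one-byte, full-duplex
-- SPI master, mode 0 assumed (SCLK idles low, data sampled on rising SCLK).
library ieee;
use ieee.std_logic_1164.all;

entity spi_byte_xfer is
  port (
    clk     : in  std_logic;                     -- system clock
    start   : in  std_logic;                     -- one-cycle pulse: begin a transfer
    tx_byte : in  std_logic_vector(7 downto 0);  -- byte to send to the slave
    rx_byte : out std_logic_vector(7 downto 0);  -- byte received (valid when busy = '0')
    busy    : out std_logic;                     -- '1' while a transfer is running
    cs_n    : out std_logic;                     -- active-low chip select
    sclk    : out std_logic;                     -- serial clock = clk / 2
    mosi    : out std_logic;                     -- master out, slave in
    miso    : in  std_logic                      -- master in, slave out
  );
end spi_byte_xfer;

architecture rtl of spi_byte_xfer is
  signal shreg   : std_logic_vector(7 downto 0) := (others => '0');
  signal miso_q  : std_logic := '0';
  signal bit_cnt : integer range 0 to 7 := 0;
  signal active  : std_logic := '0';
  signal sclk_i  : std_logic := '0';
begin
  cs_n    <= not active;   -- the slave is selected by its own line, not by an address
  sclk    <= sclk_i;
  mosi    <= shreg(7);     -- most significant bit is sent first
  busy    <= active;
  rx_byte <= shreg;

  process (clk)
  begin
    if rising_edge(clk) then
      if active = '0' then
        sclk_i <= '0';
        if start = '1' then
          shreg   <= tx_byte;   -- load the byte to transmit
          bit_cnt <= 0;
          active  <= '1';
        end if;
      elsif sclk_i = '0' then
        -- rising SCLK edge: the master samples the slave's bit
        sclk_i <= '1';
        miso_q <= miso;
      else
        -- falling SCLK edge: shift out the next bit, shift in the sampled one
        sclk_i <= '0';
        shreg  <= shreg(6 downto 0) & miso_q;
        if bit_cnt = 7 then
          active <= '0';        -- eight bits exchanged, release chip select
        else
          bit_cnt <= bit_cnt + 1;
        end if;
      end if;
    end if;
  end process;
end rtl;
```

After eight serial clock periods the register that initially held tx_byte holds the byte returned by the slave, which is why a read from a device such as the Flash still requires the master to clock out (dummy) bits. A real controller would chain several such byte transfers under one chip-select assertion (command, address, data) and respect the timing limits given in the device datasheet.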
Figure 2 SPI communication method

To access the Flash memory from within the Spartan-6 FPGA, we implement a finite state machine (an SPI controller) that is responsible for the SPI communication. The controller implements only a subset of all the commands that the Flash memory supports. The SPI controller utilized in this lab is a slightly modified version of the one developed by Johannes Hausensteiner and available at opencores.org [1]. You should read spi_ctrl.vhd and study the state diagram (included in Appendix A as well as in the downloadable archive for this lab) to understand how the SPI controller works. In addition, you will need to read the datasheet of the Numonyx N25Q128 Serial Flash memory device and understand how it works [2]. You should also search online and read more about SPI [3,4].

4. Active-HDL Simulation

To help you understand the SPI controller and the circuit designed to use it for reading one memory location, we’ll first use Active-HDL simulation to investigate the overall system operation. Following the procedure presented in lab#1, create a new design and add to it the source files testbench_spi_ctrl.vhd, top_level_spi_ctrl.vhd, and spi_ctrl.vhd. All these files are included in the downloadable archive for this lab.

Run your simulation for 6 us. Study the provided VHDL files and display the necessary waveforms to understand the operation of the circuit. An example of useful waveforms is shown in Fig.3.

In your assignment for this lab, you will need to modify these files and verify the correct functionality of your new design. Using Active-HDL first to do your VHDL coding and debugging will save you a lot of trouble and frustration, which you might otherwise experience if you went directly to the implementation of your design on the FPGA with ISE WebPack. In addition, another important thing you should be aware of is that simulation does not always give you the same results as the hardware implementation.
Most often, things appear to work in simulation but the hardware implementation fails, and we need to go back to the simulation stage and continue to debug our designs.

Figure 3 Zoom-in of the Active-HDL testbench simulation

5. ISE WebPack project

By now, you should have a good idea about how the top level circuit works. Create a new ISE project and add to it the VHDL files top_level_spi.vhd and spi_ctrl.vhd. Again, these files together with other useful files (such as the .ucf file) are included in the downloadable archive with all the data for this lab.

Do the pin assignment. Your UCF file should have the following contents:

    NET "clk_100MHz" LOC = L15;
    NET "reset_btn" LOC = F5;
    NET "spi_din" LOC = R13;
    NET "spi_dout" LOC = T13;
    NET "spi_cs" LOC = V3;
    NET "spi_clk" LOC = R15;
    NET "spi_wp_bar" LOC = T14;
    NET "spi_hold_bar" LOC = V14;
    NET "leds[0]" LOC = U18;
    NET "leds[1]" LOC = M14;
    NET "leds[2]" LOC = N14;
    NET "leds[3]" LOC = L14;
    NET "leds[4]" LOC = M13;
    NET "leds[5]" LOC = D4;
    NET "leds[6]" LOC = P16;
    NET "leds[7]" LOC = N12;

Run the “Implement Design” step inside ISE WebPack to perform placement and routing and observe the messages that the tool prints in the Console window. These messages provide useful information about the resource utilization on the FPGA as well as performance estimates.

Generate the programming .bit file and program the FPGA.

Verify the operation of your design. Observe and comment. Note that the provided top_level_spi.vhd file reads the content of the Flash memory location at address (or offset) x"18A230", which I selected randomly and hard-coded inside the VHDL code. To verify that what the design reads from the Flash memory (and displays on the 8 LEDs) is indeed the actual information stored inside the memory, I use Digilent’s Adept to first read the entire content of the Flash memory and save it into a temporary binary file.
Then, using a hex editor (such as HxD Hex Editor [5]), I see that, for example, the information at address x"18A230" is x"80" (shown in Fig.4), which corresponds to how the LEDs are turned on/off (only the leftmost LED is turned on)!

Figure 4 “Current” content of Flash memory at address 0018A230 is 80

6. Lab assignment

Modify the project to be able to also write into the Flash memory. Your design should be able to write ten consecutive memory locations starting at an arbitrary address (write numbers 1 through 10) and then read them back and display them using the 8 LEDs. Each of the numbers should be displayed for one second.

7. Credits and references

[1] Johannes Hausensteiner, SPI Controller in VHDL. http://opencores.org/project,spiflashcontroller
[2] Datasheet Numonyx N25Q128 Serial Flash memory. http://www.alldatasheet.com/datasheetpdf/pdf/353314/NUMONYX/N25Q128.html
[3] SPI description. http://www.ee.nmt.edu/~teare/ee308l/datasheets/S12SPIV3.pdf
[4] Google search for Serial Peripheral Interface (SPI).
[5] HxD - Freeware Hex Editor and Disk Editor. http://mh-nexus.de/en/hxd/

Appendix A: SPI controller state diagram, authored by Johannes Hausensteiner and available at http://opencores.org/project,spiflashcontroller

Lab 7: Interfacing FPGA Spartan-6 with AC’97 Codec

1. Objective

The objective of this lab is to demonstrate the use of the National Semiconductor LM4550 AC’97 audio codec (IC3), which is available on the Atlys board. We’ll code a driver in VHDL and implement it on the FPGA to communicate with and control the codec. The driver can select the input into the codec (e.g., microphone, line-in) and set the volume via the slide switches of the Atlys board.

2. Introduction

AC'97 (Audio Codec '97; also MC'97 for Modem Codec '97) is an audio codec standard developed by Intel Architecture Labs in 1997. The standard is used in motherboards, modems, and sound cards.
Read more about AC’97 here: http://www-inst.eecs.berkeley.edu/~cs150/Documents/ac97_r23.pdf

The Atlys board includes a National Semiconductor LM4550 AC’97 audio codec (IC3) with four 1/8” audio jacks for line-out (J5), headphone-out (J7), line-in (J4), and microphone-in (J6). Audio data at up to 18 bits and 48 kHz sampling is supported, and the audio in (record) and audio out (playback) sampling rates can be different. The microphone jack is mono; all other jacks are stereo. The headphone jack is driven by the audio codec's internal 50 mW amplifier. The LM4550 basically serves as an interface between the analog world of traditional audio components (e.g., headphones and microphones) and the digital world of the FPGA.

Read more about LM4550 here: http://www.ti.com/lit/ds/symlink/lm4550.pdf

3. VHDL driver

This is an example hardware driver used to interface the AC97 audio codec with an FPGA running at 100 MHz. The design can be scaled to other clock speeds by either scaling the internal counters or instantiating an on-chip PLL to attain a 100 MHz clock. The VHDL code and description of this controller are based on the work of Tony Storey and Scott Larson [1].

Figure 1: Block diagram of desired circuit (inside the Spartan-6 FPGA, the “AC97 controller” and the “AC97CMD” command state machine exchange cmd_addr (8 bits), cmd_data (16 bits), latching_cmd, and ready; external connections are the 100 MHz oscillator, the Atlys RESET push-button, the eight slide switches providing SOURCE (3 bits) and VOLUME (5 bits), and the LM4550 signals SDATA_IN, SDATA_OUT, SYNC, the 12.288 MHz bit clock, and RESET)

The inputs to the controller “AC97 controller” include the CLK (main FPGA oscillator), an active low reset, a serial data in line, a 12.288 MHz bit clock from the ac97 chip, a 3 bit source selector (slide switches SW7-5) and a 5 bit volume control (slide switches SW4-0). The controller’s outputs include a sync signal, serial data output, and an ac97 active low reset signal for initializing the ac97 (LM4550).
There are two internal signals to sync the main ac97 controller with the “command state machine AC97CMD” (a small FSM that sets up the codec's registers). One of these signals pulses every 20 us and the other is a signal used for error checking during the tag phase. Consult the LM4550 data sheet for details on the serial frame input/output.

The VHDL files can be downloaded from the course website. The downloadable archive contains additional files (datasheets), including the .ucf file that must be utilized to assign FPGA I/O pins correctly. Its content is listed here:

    # PlanAhead Generated physical constraints
    NET "SOURCE[2]" LOC = E4;
    NET "SOURCE[1]" LOC = T5;
    NET "SOURCE[0]" LOC = R5;
    NET "VOLUME[4]" LOC = P12;
    NET "VOLUME[3]" LOC = P15;
    NET "VOLUME[2]" LOC = C14;
    NET "VOLUME[1]" LOC = D14;
    NET "clk" LOC = L15;
    NET "BIT_CLK" LOC = L13;
    NET "SDATA_IN" LOC = T18;
    NET "SDATA_OUT" LOC = N16;
    NET "SYNC" LOC = U17;
    NET "AC97_n_RESET" LOC = T17;
    NET "n_reset" LOC = T15;
    NET "VOLUME[0]" LOC = A10;

4. Synthesis and FPGA programming

Use ISE WebPack to synthesize the entire design and then program the FPGA. Test the whole system using a microphone and the audio signal from your favorite YouTube music video, connected to the MIC and LINE IN jacks of the Atlys board. Use the slide switches to select between the two inputs and to vary the volume.

5. Lab assignment

Read the datasheets of AC97 and of LM4550 to get an understanding of the serial communication. Read the provided VHDL code and understand how it works; try to sketch the state graphs of the two FSMs from Fig.1 above.

Propose and implement a new VHDL design; you should reuse some or all of the VHDL code to do something different. The given VHDL design hierarchy simply routes the parallel outputs of the controller back to its parallel inputs. This makes the AC97 pass audio straight through from input to output. This process in the top level file can be replaced, for example, by port mapping user components for various signal processing tasks.
An excellent example is the following voice-recorder design: http://web.mit.edu/6.111/www/f2008/handouts/labs/lab4.html
The top-level plan is pretty simple: when recording, store the stream of incoming samples in a memory (inside the FPGA or on the Atlys board’s memory?); when playing back, feed the stored data stream back to the codec.

6. Credits and references

[1] Tony Storey and Scott Larson, AC’97 Codec Hardware Driver Example. http://eewiki.net/display/LOGIC/AC%2797+Codec+Hardware+Driver+Example
[2] http://www.javiervalcarce.eu/wiki/VHDL_Macro:_DC97#cite_note-0
[3] http://www-mtl.mit.edu/Courses/6.111/labkit/audio.shtml
[4] http://web.mit.edu/6.111/www/f2008/handouts/labs/lab4.html

Lab 7 Supplemental: PS2 Keyboard and UART

1. Objective

The objective of this lab is to learn how to connect a keyboard to the Atlys board, read pressed keys via a PS2 receiver, and send the key code to the host computer via a UART transmitter. The host computer displays the pressed key character in HyperTerminal. The PS2 receiver and UART transmitter are implemented on the Spartan-6 FPGA.

2. Description

To set up the communication with the host computer:
Download the Windows driver from www.exar.com. Type the EXAR part number "XR21V1410". Download the driver and install it.
To use Windows' HyperTerminal program, use port settings Bits per second: 19200, Data bits: 8, Parity: None, Stop bits: 1, Flow control: None

3. Credits and references

[1] P.P. Chu, FPGA Prototyping by VHDL Examples: Xilinx Spartan-3 Version, Wiley 2008.

Lab 8: Interfacing FPGA Spartan-6 with Host Computer via USB

1. Objective

The objective of this lab is to learn one method of implementing communication via USB between the FPGA (Spartan-6 on Atlys board) and the host computer. This method is based on using an excellent open source project called FPGALink [1]. Once this lab is completed, you should be able to extend this method and utilize it in any project where you require the host computer to exchange data with the FPGA.

2.
Introduction

The Universal Serial Bus (USB) is a specification developed (in the mid-1990s) by Compaq, Intel, Microsoft and NEC, joined later by Hewlett-Packard, Lucent and Philips. The USB was developed as a new means to connect a large number of devices to the PC, and eventually to replace the 'legacy' ports (serial ports, parallel ports, keyboard and mouse connections, joystick ports, midi ports, etc.). USB requires a shielded cable containing 4 wires.

The USB is based on a “tiered star topology” in which there is a single host controller and up to 127 “slave” devices. The host controller is connected to a hub, integrated within the PC, which allows a number of attachment points (referred to as ports).

The USB is intended as a bus for devices near the PC. For applications requiring distance from the PC, another form of connection is needed, such as Ethernet. Note, however, that USB is not a true bus: only the root hub sees every signal on the bus. This implies there is no method to monitor upstream communications from a downstream device.

There is a lot of online information describing the USB. As a start, you may want to read [2,3].

In this lab we’ll use one of the USB ports available on the Atlys board; that is, the so-called “Adept USB Port” (see Fig.1), marked as J8 on the board and on the schematic diagram [4]. The USB controller is a Cypress chip, the CY7C68013A-56 USB Microcontroller High-Speed USB Peripheral Controller.

Figure 1 (photo of the Atlys board with the Adept USB port J8 marked)

3. FPGALink Library

The FPGALink library was developed by Chris McClelland [1]. It provides an end-to-end solution capable of JTAG-programming the FPGA on a variety of USB-based hardware platforms (including the Atlys board). It also facilitates communication with the FPGA using a straightforward API on the host side and a standard FIFO interface on the FPGA side. The FPGALink library is just a C DLL. So, we would normally embed it in our application, for example developed in C/C++ or Python.
To get started and help you become familiar with the FPGALink library, the binary distribution archive also contains a utility (called "flcli") which provides straightforward command-line access to many of the library functions.

In this lab we will:
1) Use the "flcli" utility to demonstrate the host-FPGA communication using a slightly changed version of an example that comes with the FPGALink library.
2) Build a simple C++ application to utilize the FPGALink library.

3.1 Working Environment Setup

Notes:
-- Steps 1 and 2 are necessary only if you plan to compile the FPGALink library or you are doing this on your personal home computer. Because we'll use the provided downloadable binaries of this library, these steps can be skipped.
-- I have done this lab on Windows (though FPGALink can be used on Linux and Mac too). These steps refer to Windows.

1) Download and install "Visual C++ Express 2010"
http://www.microsoft.com/visualstudio/en-us/products/2010editions/express#Visual_Studio_2010_Express_Downloads

2) Download and install "Microsoft Visual C++ 2010 Redistributable Package (x86)"
http://www.microsoft.com/en-us/download/details.aspx?id=5555

3) Download "Build Infrastructure", Windows version. This is the environment where we’ll work with the FPGALink library binaries.
http://www.makestuff.eu/wordpress/software/build-infrastructure/
On Windows, unpack the downloaded archive makestuff-win32-20111211.zip in your own directory. In my case, I did this directly in C:\. This created C:\makestuff\.

4) Download and install "Console 2". Console is a Windows console window enhancement.
http://sourceforge.net/projects/console
Simply unpack the downloaded archive directly in C:\Program Files\
Then create a shortcut to C:\Program Files\Console2\Console.exe
Launch Console 2 and enter "C:\makestuff\msys\bin\sh.exe --login" in the "Shell" box at Edit->Settings->Console

5) Download the latest FPGALink library binaries (at the time of writing this lab, the latest version is "libfpgalink-20120621.tar.gz (Linux, MacOSX & Win32)"). This is basically the library that we’ll use. If your course project requires communication with the host, this will turn out to be very handy.
http://www.makestuff.eu/wordpress/software/fpgalink/
Unpack it in C:\makestuff\libs\

6) Download “LibUSB-Win32”. libusb-win32 is a port of the USB library libusb (http://sf.net/projects/libusb/) to 32/64bit Windows (2k, XP, 2003, Vista, Win7, 2008; 98SE/ME for v0.1.12.2). The library allows user space applications to access many USB devices on Windows.
http://sourceforge.net/projects/libusb-win32/
Plug in the Atlys board and turn the power on. Then run bin/inf-wizard.exe. Click “Next”, select your FPGA board, make a note of the vendor and product IDs (in our case, for the Atlys board, 1443 and 0007) and click “Next” twice. Choose a location for the driver and click “Save”. Click “Install Now”.

That's all. We are now ready to use the FPGALink library! You should now take the time to read the FPGALink manual:
http://www.swaton.ukfsn.org/docs/fpgalink/vhdl_paper.pdf

The FPGALink library comes with two nice examples. Please follow the steps from "README" (C:\makestuff\libs\libfpgalink-20120621\README) to run either of the examples.

3.2 EXAMPLE #1: Communication Host (flcli utility) - FPGA (simple VHDL design)

A) Description

In this simple example, our application implemented on the FPGA works primarily with four registers, referred to as R0, R1, R2, R3.
These registers provide the storage space for communicating with the host, and are associated with four different channels of the communication between host and FPGA. From the host, writes to R0 are simply displayed on the Atlys board’s eight LEDs. Reads from R0 return the state of the board’s eight slide switches. Writes to R1 and R2 are registered and may be read back. The circuit implemented on the FPGA simply multiplies R1 by R2 and places the result in R3; from the host side, R3 can only be read. A simplified block diagram of the entire system (host + FPGA) is shown in Fig.2 below.

Figure 2 Interfacing the host computer with the FPGA via FPGALink

B) VHDL coding and .xsvf programming file generation

The two VHDL source files (comm_fpga_fx2.vhdl and top_level.vhdl) together with the .UCF file required to implement the circuit on the FPGA are provided in the downloadable archive of this lab. These files are modified versions of the VHDL example files from the FPGALink library. top_level.vhdl is also included in Appendix A at the end of this document. First, please read these files to understand what they do. Then, create a new ISE WebPack project and add these files to your project. In my case, I called my new project lab8_usb_fpgalink. The entire directory of my ISE WebPack project is also included in the downloadable archive of this lab. Synthesize and implement the design.

Generate .xsvf: Method 1

Because we will be programming the FPGA using the flcli utility provided as part of the FPGALink library, we need to generate an .xsvf programming file. Recall that the FPGA can be programmed using different programming file formats including .bit, .svf, and .xsvf. To generate the .xsvf file, follow these steps:

Inside ISE WebPack, select “Manage Configuration Project (iMPACT)”, right-click and choose “Run”. The ISE iMPACT window should pop up after a few seconds.
Double click on “Boundary Scan” and then File->Initialize Chain.
You should get the xc6slx45 “instantiated” in the “Boundary Scan” panel like in this figure:

Figure 3

Right click on the chip and choose Set target device.
Assign a configuration file. This is usually a .BIT file, such as top_level.bit in our case. So, go ahead and select top_level.bit and assign it.
Select from the menu Output->XSVF File->Create XSVF File… Name it and then click OK to save it in your ISE project directory. In my case I named it lab8_usb_fpgalink.xsvf.
Right click on the chip and choose Program. The output will be saved to the .XSVF file, lab8_usb_fpgalink.xsvf. We’ll use this file to program the device.
Close ISE iMPACT. Also close ISE WebPack, but keep the Atlys board connected and powered on.

We now have lab8_usb_fpgalink.xsvf and so we’re ready to program the FPGA and to communicate with it via the flcli utility of the FPGALink library binaries distribution.

Generate .xsvf: Method 2

This is optional and meant for the curious. Until now, we’ve been using the Xilinx ISE WebPack tools via the graphical user interface, the actual ISE. However, these tools can be run using Makefiles at the command line too. This alternative approach is especially useful when we want to automate and thereby speed up the design process: all design steps can be executed via a single makefile. In addition, running without the graphical interface uses less memory and CPU.

It is left as an exercise for you to read Xilinx and other documentation [5] and to write the simplest makefile required to run the whole process of implementing the design of this lab and to finally generate the .XSVF programming file.

C) Testing and validation of the overall host-FPGA system

Before launching flcli, first create a new folder named gen_xsvf inside C:\makestuff\libs\libfpgalink-20120621\ and copy lab8_usb_fpgalink.xsvf to it. We’ll use the newly created folder, gen_xsvf, to store the .xsvf programming files of our own projects.

--Connect and power on the Atlys board if not already done.
--Start a terminal by launching Console 2.
We’ll use the flcli utility on the host side. flcli is a command-line utility which offers many of the FPGALink library’s features. It is useful for testing, etc. Read more about it in the FPGALink manual vhdl_paper.pdf: http://www.swaton.ukfsn.org/docs/fpgalink/vhdl_paper.pdf

--Use the flcli utility to program the FPGA. In the Console 2 terminal, do:
> cd libs/libfpgalink-20120621
> ./win32/rel/flcli -v 1443:0007 -i 1443:0007 -s -x gen_xsvf/lab8_usb_fpgalink.xsvf

--Use the flcli utility to connect to the FPGALink device 1443:0007 (that is, the USB controller on the Atlys board):
> ./win32/rel/flcli -v 1443:0007 -c
This enters the command-line mode, where we can use the flcli utility’s built-in functions to write and read the registers we have created on the FPGA. For example, try this:
> w0 13
And observe the LEDs on the Atlys board. They should be turned on/off accordingly. Or, for example, read the status of the slide switches:
> r0
Write into R1 and R2, then read back R1, R2, and the product in R3:
> w1 02;w2 03
> r1
> r2
> r3
If everything went OK, your Console 2 window should look like Fig.4. Quit the flcli utility:
> q

Figure 4 Snap-shot of Console 2 window

4. Lab assignment

Implement a project in which you utilize FPGALink from your own host-side application written in C/C++ or Python. Your project should open a file file_host2fpga.txt (the file format is one byte in hex format on each line), read its content line by line, and send it to the FPGA to drive the eight LEDs. Also, your application should read the eight slide switches and save their status, in the same format as above, in file_fpga2host.txt. Append a new line to this file each time the switches are changed.

To get started, first read the C example provided as part of the FPGALink binaries distribution. This example is located in: C:\makestuff\libs\libfpgalink-20120621\examples\c

5. Credits and references

[1] Chris McClelland, FPGALink: Easy USB to FPGA Communication.
http://www.makestuff.eu/wordpress/software/fpgalink
[2] USB Home: http://www.usb.org/home
[3] USB Made Simple: http://www.usbmadesimple.co.uk/index.html
[4] Atlys schematic diagram. http://www.digilentinc.com/Data/Products/ATLYS/Atlys_C2_sch.pdf
[5] Xilinx’s command line tools user guide. http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_1/devref.pdf
    Others: http://www.demandperipherals.com/docs/CmdLineFPGA.pdf
    http://outputlogic.com/xcell_using_xilinx_tools/74_xperts_04.pdf

Appendix A: Content of top_level.vhd file.

--
-- Copyright (C) 2009-2012 Chris McClelland
--
-- This program is free software: you can redistribute it and/or modify
-- it under the terms of the GNU Lesser General Public License as published by
-- the Free Software Foundation, either version 3 of the License, or
-- (at your option) any later version.
--
-- This program is distributed in the hope that it will be useful,
-- but WITHOUT ANY WARRANTY; without even the implied warranty of
-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-- GNU Lesser General Public License for more details.
--
-- You should have received a copy of the GNU Lesser General Public License
-- along with this program. If not, see <http://www.gnu.org/licenses/>.
--
-- Additional changes/comments by Cristinel Ababei, 2012
-- Description:
-- From the host, writes to R0 are simply displayed on the Atlys board's
-- eight LEDs. Reads from R0 return the state of the board's eight slide
-- switches. Writes to R1 and R2 are registered and may be read back.
-- The circuit implemented on the FPGA simply multiplies R1 with R2
-- and places the result in R3. Only reads, from host side, are allowed
-- from R3; that is, an attempt to write into R3 will have no effect.
-- When you input, from host side, data into R1 and R2, data should
-- represent numbers that can be represented on 4 bits only. Because
-- data will have to be input (will be done via the flcli application)
-- in hex, writing for example 07 or A7 into R1 will have the same effect
-- as writing 07 because the four MSB will be discarded inside the
-- VHDL application on FPGA.
--
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity top_level is
  port(
    -- FX2 interface -----------------------------------------------------------------------------
    fx2Clk_in     : in    std_logic;                     -- 48MHz clock from FX2
    fx2Addr_out   : out   std_logic_vector(1 downto 0);  -- select FIFO: "10" for EP6OUT, "11" for EP8IN
    fx2Data_io    : inout std_logic_vector(7 downto 0);  -- 8-bit data to/from FX2

    -- When EP6OUT selected:
    fx2Read_out   : out   std_logic;                     -- asserted (active-low) when reading from FX2
    fx2OE_out     : out   std_logic;                     -- asserted (active-low) to tell FX2 to drive bus
    fx2GotData_in : in    std_logic;                     -- asserted (active-high) when FX2 has data for us

    -- When EP8IN selected:
    fx2Write_out  : out   std_logic;                     -- asserted (active-low) when writing to FX2
    fx2GotRoom_in : in    std_logic;                     -- asserted (active-high) when FX2 has room for more data from us
    fx2PktEnd_out : out   std_logic;                     -- asserted (active-low) when a host read needs to be committed early

    -- Onboard peripherals -----------------------------------------------------------------------
    led_out       : out   std_logic_vector(7 downto 0);  -- eight LEDs
    slide_sw_in   : in    std_logic_vector(7 downto 0)   -- eight slide switches
  );
end top_level;

architecture behavioural of top_level is
  -- Channel read/write interface -----------------------------------------------------------------
  signal chanAddr : std_logic_vector(6 downto 0);  -- the selected channel (0-127)

  -- Host >> FPGA pipe:
  signal h2fData  : std_logic_vector(7 downto 0);  -- data lines used when the host writes to a channel
  signal h2fValid : std_logic;                     -- '1' means "on the next clock rising edge, please accept the data on h2fData"
  signal h2fReady : std_logic;                     -- channel logic can drive this low to say "I'm not ready for more data yet"

  -- Host << FPGA pipe:
  signal f2hData  : std_logic_vector(7 downto 0);  -- data lines used when the host reads from a channel
  signal f2hValid : std_logic;                     -- channel logic can drive this low to say "I don't have data ready for you"
  signal f2hReady : std_logic;                     -- '1' means "on the next clock rising edge, put your next byte of data on f2hData"
  -- ----------------------------------------------------------------------------------------------

  -- Needed so that the comm_fpga_fx2 module can drive both fx2Read_out and fx2OE_out
  signal fx2Read : std_logic;

  -- Registers implementing the channels
  signal reg0, reg0_next : std_logic_vector(7 downto 0) := x"00";
  signal reg1, reg1_next : std_logic_vector(7 downto 0) := x"00";
  signal reg2, reg2_next : std_logic_vector(7 downto 0) := x"00";
  signal reg3, reg3_next : std_logic_vector(7 downto 0) := x"00";
begin
  -- BEGIN_SNIPPET(registers)
  -- Infer registers
  process(fx2Clk_in)
  begin
    if ( rising_edge(fx2Clk_in) ) then
      --checksum <= checksum_next;
      reg0 <= reg0_next;
      reg1 <= reg1_next;
      reg2 <= reg2_next;
      reg3 <= reg3_next;
    end if;
  end process;

  -- Drive register inputs for each channel when the host is writing
  reg0_next <= h2fData when chanAddr = "0000000" and h2fValid = '1' else reg0;
  reg1_next <= h2fData when chanAddr = "0000001" and h2fValid = '1' else reg1;
  reg2_next <= h2fData when chanAddr = "0000010" and h2fValid = '1' else reg2;
  -- R3 is the 8-bit product of the low nibbles of R1 and R2
  reg3_next <= std_logic_vector(unsigned(reg1(3 downto 0)) * unsigned(reg2(3 downto 0)));

  -- Select values to return for each channel when the host is reading
  with chanAddr select f2hData <=
    slide_sw_in when "0000000",  -- return status of slide switches when reading R0
    reg1        when "0000001",
    reg2        when "0000010",
    reg3        when "0000011",
    x"00"       when others;

  -- Assert that there's always data for reading, and always room for writing
  f2hValid <= '1';
  h2fReady <= '1';

  -- CommFPGA module
  fx2Read_out <= fx2Read;
  fx2OE_out <= fx2Read;
  fx2Addr_out(1) <= '1';  -- Use EP6OUT/EP8IN, not EP2OUT/EP4IN.
  comm_fpga_fx2 : entity work.comm_fpga_fx2
    port map(
      -- FX2 interface
      fx2Clk_in      => fx2Clk_in,
      fx2FifoSel_out => fx2Addr_out(0),
      fx2Data_io     => fx2Data_io,
      fx2Read_out    => fx2Read,
      fx2GotData_in  => fx2GotData_in,
      fx2Write_out   => fx2Write_out,
      fx2GotRoom_in  => fx2GotRoom_in,
      fx2PktEnd_out  => fx2PktEnd_out,

      -- Channel read/write interface
      chanAddr_out   => chanAddr,
      h2fData_out    => h2fData,
      h2fValid_out   => h2fValid,
      h2fReady_in    => h2fReady,
      f2hData_in     => f2hData,
      f2hValid_in    => f2hValid,
      f2hReady_out   => f2hReady
    );

  -- LEDs
  led_out <= reg0;
end behavioural;
--END_SNIPPET(registers)

Lab 9: Video Interfaces: HDMI and DVI

1. Objective

The objective of this lab is to learn how to transmit High-Definition Multimedia Interface (HDMI) and Digital Visual Interface (DVI) data streams to HDMI and DVI capable monitors. The top-level design in this lab displays a simple colored pattern. In addition, we will learn how to create ISE WebPack projects that use both VHDL and Verilog source files.

2. Introduction

The Atlys board contains four HDMI ports (see Fig.1): two HDMI input/output ports (type A connector) buffered via TI’s TMDS141 buffers, one buffered HDMI output port (type D connector), and one unbuffered port that can be input or output [1].

Figure 1 Illustration of the four HDMI ports of the Atlys board (left). HDMI Male to DVI-D Female Rotating Adapter (top right). HDMI connectors (bottom right)

Since the HDMI and DVI systems use the same transition-minimized differential signaling (TMDS) standard, a simple adaptor, shown on the right-hand side of Fig.1 (available at most electronics stores such as TigerDirect [2]), can be used to drive a DVI connector from either of the HDMI output ports. The HDMI connector does not include VGA signals, so analog VGA displays cannot be driven. The Atlys board does not have any VGA connector.
For examples on how to drive VGA monitors from the Atlys board, please see the supplemental material of this lab on the course’s website.

In this lab we will drive HDMI and DVI capable monitors to display a colored pattern. For this we’ll use a Verilog project developed by Bob Feng of Xilinx [3]. We’ll modify Bob’s project by converting part of the Verilog code to VHDL. In this way, this lab becomes a good opportunity for you to create ISE projects that use both Verilog and VHDL source files. By comparing two files, Verilog and VHDL, that implement the same functionality, you also get a first-time exposure to Verilog.

Transition-minimized differential signaling (TMDS):
TMDS is a method for transmitting high-speed serial data and is used by the DVI and HDMI video interfaces, as well as other digital communication interfaces. It was developed by Silicon Image Inc. as a member of the Digital Display Working Group. The transmitter incorporates an advanced coding algorithm that reduces electromagnetic interference over copper cables and enables robust clock recovery at the receiver to achieve high skew tolerance.
- TMDS uses 4 channels: Red, Green, Blue, Clock.
- TMDS encoding is a two-stage process that converts an input of 8 bits into a 10-bit code.
  o TMDS signaling uses a twisted pair for noise reduction.
  o Current Mode Logic (CML), DC coupled and terminated to 3.3 Volts.
  o 3 twisted pairs are used to transfer video data - each a different RGB component
  o 8 bit data transmission plus 2 bits of control signals

3. Brief HDMI Description

What is HDMI:
- HDMI is the first and only industry-supported, uncompressed, all-digital audio/video interface.
- HDMI is a compact audio/video interface for transferring uncompressed digital audio/video data from an HDMI-compliant device ("the source") to a compatible digital audio device, computer monitor, video projector, or digital television.
- HDMI provides an interface between any A/V source, such as a set-top box, DVD player, or A/V receiver, and an audio and/or video monitor, such as a digital television (DTV), over a single cable.
- HDMI is a digital replacement for existing analog standards such as composite video, S-Video, SCART, component video, and VGA.
- HDMI supports standard, enhanced, or high-definition video, plus multi-channel digital audio on a single cable.
- It transmits all ATSC HDTV standards and supports 8-channel, 192kHz, uncompressed digital audio, all currently available compressed formats, and lossless digital audio formats, with bandwidth to spare to accommodate future enhancements and requirements.
- HDMI acts like Cat5: it passes a data signal, not an RF signal like CATV.
- DVI is HDMI without the audio - separate cable needed for audio!

HDMI communication channels (see Fig.2):
HDMI has three physically separate communication channels, which are the TMDS, the DDC, and the optional CEC:
- The HDMI cable and connectors carry four differential pairs that make up the TMDS data and clock channels.
  o Audio, video and auxiliary data is transmitted across the three TMDS data channels.
  o A TMDS clock, typically running at the video pixel rate, is transmitted on the TMDS clock channel.
- HDMI carries a VESA DDC (Display Data Channel) channel. The DDC is used for configuration and status exchange between a single transmitter and a single receiver.
  o The DDC is used by the transmitter to read the receiver’s Enhanced Extended Display Identification Data (E-EDID) in order to discover the receiver’s configuration and capabilities.
- The optional CEC (Consumer Electronics Control) protocol provides high-level control functions between all of the various audiovisual products in a user’s environment.
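
The two-stage TMDS coding mentioned in the Introduction (8 bits in, 10 bits out) is what carries the pixel data on the three TMDS data channels. Here is a minimal Python sketch of it, following the encoder flow published in the DVI 1.0 specification (see [4]). It covers only the video-data period; the control-period codes and the HDMI audio/auxiliary coding are left out. The function names tmds_encode and tmds_decode are hypothetical helpers chosen for this illustration and are not part of XAPP495.

```python
def tmds_encode(d, cnt):
    """Encode one 8-bit value d into a 10-bit TMDS word.

    cnt is the running disparity (ones minus zeros sent so far).
    Returns (word, new_cnt); bit 0 of word is the first bit sent.
    """
    bits = [(d >> i) & 1 for i in range(8)]
    ones = sum(bits)

    # Stage 1: minimize transitions by chaining XOR or XNOR.
    use_xnor = ones > 4 or (ones == 4 and bits[0] == 0)
    q_m = [bits[0]]
    for i in range(1, 8):
        x = q_m[i - 1] ^ bits[i]
        q_m.append(1 - x if use_xnor else x)
    q_m8 = 0 if use_xnor else 1  # bit 8 records which operation was used

    # Stage 2: DC balance by conditionally inverting the 8 data bits.
    n1 = sum(q_m)
    n0 = 8 - n1
    if cnt == 0 or n1 == n0:
        invert = (q_m8 == 0)
        cnt += (n0 - n1) if q_m8 == 0 else (n1 - n0)
    elif (cnt > 0 and n1 > n0) or (cnt < 0 and n0 > n1):
        invert = True
        cnt += 2 * q_m8 + (n0 - n1)
    else:
        invert = False
        cnt += -2 * (1 - q_m8) + (n1 - n0)

    data = [1 - b for b in q_m] if invert else q_m
    out = data + [q_m8, 1 if invert else 0]  # bit 9 flags the inversion
    return sum(bit << i for i, bit in enumerate(out)), cnt


def tmds_decode(word):
    """Recover the 8-bit value from a 10-bit TMDS data word."""
    q = [(word >> i) & 1 for i in range(10)]
    data = [1 - b for b in q[:8]] if q[9] else q[:8]
    bits = [data[0]]
    for i in range(1, 8):
        x = data[i] ^ data[i - 1]
        bits.append(x if q[8] else 1 - x)  # undo XOR or XNOR chaining
    return sum(bit << i for i, bit in enumerate(bits))
```

Because every pixel byte becomes a 10-bit word, each TMDS data pair carries ten bits per pixel-clock period, which is why the design serializes the data at a multiple of the pixel clock.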
Advantages of HDMI:
- Because HDMI is a digital interface, it provides the best video quality, since there are no lossy analog-to-digital conversions as are required for all analog connections (such as component or S-Video).
- Digital video will be sharper.
- Single cable for both video and audio is the most effective format!
- HDMI devices supporting High-bandwidth Digital Content Protection (HDCP) have the comfort of knowing they will have access to premium HD content now and in the future.

Figure 2 HDMI Signals

Compatibility with DVI:
HDMI is backward-compatible with single-link Digital Visual Interface digital video (DVI-D or DVI-I, but not DVI-A). No signal conversion is required when an adapter or asymmetric cable is used, so there is no loss of video quality. From a user's perspective, a DVI-D monitor would have the same level of basic interoperability unless there are content protection issues with High-bandwidth Digital Content Protection (HDCP), which is not supported by DVI, or the HDMI color encoding is in the component color space YCbCr, which is not supported by DVI, instead of RGB.

Because discussing HDMI is not the main purpose of this lab, you may want to take some time to search and read more about HDMI on the Internet. There is plenty of information out there. Here are some starting pointers [4].

4. ISE WebPack Project

Xilinx Application Note

One challenge of working with HDMI is that there are few or no design examples in the public domain. So, one would need to do many things from scratch, which becomes very challenging in the case of HDMI. However, one design example is XAPP495 [3]. This is a good start as it was created to work with the Atlys board. But there are a few issues. One is that, generally, HDMI design requires significant effort and attention to many details. Another is that it implements neither EDID (monitor identification data) nor audio. These will not stop us from using it. However, XAPP495 is written in Verilog while in this course we focus on VHDL. Hence, putting this lab together required some code conversion from Verilog to VHDL.

Before continuing, you should take some time now and read the XAPP495 paper [3] (also included in the downloadable archive of this lab).

Driving DVI and HDMI Monitors to Display a Colored Pattern

While XAPP495 provides examples of both DVI transmitters and receivers, in this lab we focus only on the transmitter part. Specifically, we create a design that drives HDMI and DVI monitors to display a colored bar pattern.

The design basically uses all Verilog files related to the transmitting part from the xapp495 archive (downloadable from Xilinx). The only primary exception is the top-level file, which I replaced with a VHDL version of it. During the Verilog-to-VHDL conversion process, I had to make some other minor changes inside syncro.v and serdes_n_to_1.v to work around the fact that apparently parameters in Verilog modules cannot be instantiated as generics in VHDL (at least not with the ISE WebPack?).

The VHDL top-level file is simply a VHDL counterpart of the vtc_demo.v file from the xapp495 archive. The name of the new VHDL top-level file is vtc_demo.vhd. At this time, you should open both files, vtc_demo.vhd (located in the lab9_files_ISE folder) and vtc_demo.v, to read and compare them. Notice similarities and differences between VHDL and Verilog. Also, notice how Verilog modules are declared as components and instantiated in the VHDL top-level file. Comments inserted inside vtc_demo.vhd provide additional information on the main elements of the design.

The block diagram of the design entity described in vtc_demo.vhd is shown in Fig.3. While reading the VHDL top-level file, try to identify the signals and components corresponding to those from Fig.3. The design is based on several IP cores that are available on the Spartan-6 FPGA [5]. These include (see [5] for a description of each):
- IBUF - input buffer
- BUFIO2 - Dual Clock Buffer and Strobe Pulse
- BUFG - Global Clock Buffer
- SRL16E - 16-Bit Shift Register Look-Up Table (LUT) with Clock Enable
- OSERDES2 - Dedicated IOB Output Serializer
- DCM_CLKGEN - Digital Clock Manager
- PLL_BASE - Basic Phase Locked Loop Clock Circuit
- BUFPLL - PLL Buffer
- OBUFDS - 3-State Differential Signaling I/O Buffer with Active Low Output Enable

Figure 3 Block diagram of the “SMPTE HD Color bar Generation with Programmable Video Timing” designed by Bob Feng of Xilinx.

--Create a new ISE WebPack project and add to it all the Verilog and VHDL files in the lab9_files_ISE folder. These files, together with other useful files (such as the .ucf file), are included in the downloadable archive with all the data for this lab.
--Synthesize and implement the design.
--Download the bitstream to the FPGA board and test.

To test the design, we need to attach a monitor to the HDMI OUT (J2) port of the Atlys board. An HDMI monitor can be connected directly using an HDMI cable. To connect a DVI monitor (like most of today’s monitors) we need an HDMI to DVI converter; I got mine from TigerDirect [2] for $10. For this lab, the TA will have one such converter for you to take turns using; however, if your project in this course involves using a monitor, you may want to buy your own converter. After setting everything up, you should see your monitor display the colored pattern shown in Fig.4.

Figure 4 HDMI (left, 7” TFT) and DVI (right, 20” LCD) monitors display colored bar pattern

5. Lab Assignment

Convert to VHDL the hdcolorbar module described in the hdclrbar.v file. You must create a new VHDL file, hdclrbar.vhd, inside which you describe the hdcolorbar design entity in VHDL (similarly to how I converted the top-level Verilog module to a top-level VHDL entity). Then use the VHDL file to replace the Verilog file in the ISE WebPack project.
Optional (this is very challenging; do not attempt before talking to the instructor): implement the whole design in VHDL. Moreover, design and describe in VHDL your own entities for as many IP cores as possible. That is, create your own VHDL entities to replace SRL16E, BUFPLL, etc. to improve the portability of the design to other FPGAs.

6. Credits and references

[1] Digilent Atlys board reference manual; http://www.digilentinc.com/Data/Products/ATLYS/Atlys_rm.pdf
[2] CTG HDMI Male to DVI-D Female Rotating Adapter; from TigerDirect; http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=3444838&CatId=467
[3] Bob Feng, Implementing a TMDS Video Interface in the Spartan-6 FPGA, Xilinx app note; http://www.xilinx.com/support/documentation/application_notes/xapp495_S6TMDS_Video_Interface.pdf
[4] HDMI and DVI pointers:
--HDMI resource center; http://www.hdmi.org/learningcenter/
--Wikipedia HDMI introduction; http://en.wikipedia.org/wiki/HDMI
--HDMI specification document Version 1.3; see file included in this lab archive;
--HDMI Hider: TI TMDS141 (datasheet of the chip on the Atlys board); http://www.ti.com/lit/ds/symlink/tmds141.pdf (also included in the archive of this lab);
--HDMI connectors A,B pinouts; http://pinouts.ru/Video/hdmi_pinout.shtml
--Wikipedia DVI introduction; http://en.wikipedia.org/wiki/Digital_visual_interface
--DVI 1.0; http://www.ddwg.org/lib/dvi_10.pdf
--DVI pinouts; http://pinouts.ru/Video/dvi_pinout.shtml
--Wikipedia TMDS introduction; http://en.wikipedia.org/wiki/Transition-minimized_differential_signaling
[5] Spartan-6 Libraries Guide for HDL Designs (BUFG, BUFIO2, SRL16E, PLL_BASE, etc.); http://www.xilinx.com/support/documentation/sw_manuals/xilinx12_4/spartan6_hdl.pdf (also included in the archive of this lab);

Lab 10: PicoBlaze – an embedded microcontroller

1. Objective

The objective of this lab is to use PicoBlaze, an 8-bit microcontroller embedded into the FPGA fabric, to implement a simple circuit that takes as input two 4-bit binary numbers, a and b (set via the slide switches of the Atlys board), and computes a² + b². The result is displayed on the 8 LEDs of the Atlys board. As part of this lab, you will learn assembly language, which is used to code the algorithm that PicoBlaze must execute.

2. Preparation

As part of this lab preparation you must fully read chapters 14 and 15 of Pong P. Chu’s book [1]. This lab is based on those two chapters. In addition, you should read the additional materials included in the downloadable archive with all the files for this lab [2,3]. Please allocate enough time to do this, especially if you do not have prior experience with microcontroller architectures and/or assembly languages.

The PicoBlaze processor is a compact 8-bit microcontroller core for Xilinx FPGA devices. It is freely provided as a cell-level HDL description (also referred to as a soft-core) and can be synthesized along with other logic as part of a bigger digital design. It is optimized for efficiency and occupies very little area. It is recommended for simple data-processing and control applications. Single or multiple copies of the PicoBlaze processor can be easily integrated into larger systems to add flexibility to FPGA-based designs.

The PicoBlaze design was originally named KCPSM, which stands for “Constant(K) Coded Programmable State Machine” (formerly “Ken Chapman's PSM”). Ken Chapman is the Xilinx systems designer who devised and implemented the microcontroller. You should also know that there is MicroBlaze, another soft processor core from Xilinx designed for Xilinx FPGAs. However, it is not free, and you would need Xilinx’s EDK (Embedded Development Kit) tool to be able to build MicroBlaze embedded processor systems in Xilinx FPGAs.

PicoBlaze is based on an 8-bit RISC architecture and can reach speeds up to 100 MIPS on the Virtex-4 FPGA family. The processors have an 8-bit address and data port for access to a wide range of peripherals. The latest version (available for download on Xilinx’s website for registered users) is KCPSM6. Its main characteristics include:
- Only 26 Slices plus program memory (BRAM).
- Performance 52 MIPS to 120 MIPS depending on device family and clock rate.
- Supports programs up to 4K instructions.
- 32 General Purpose Registers arranged in 2 banks.
- 256 General Purpose Input Ports.
- 256 General Purpose Output Ports.
- 16 Constant-Optimised Output Ports.
- 64-bytes of scratch pad memory expandable to 128 and 256-bytes (additional 2 and 6 Slices).
- Fully automatic CALL/RETURN stack supporting nested subroutines to 30 levels.
- Interrupt with user definable interrupt vector and maximum response time of 4 clock cycles.
- Power saving features including 'sleep' mode.
- Superset of KCPSM3 with high degree of code compatibility.

3. Lab Description

In this lab, we design a digital system that uses a PicoBlaze microcontroller to compute a² + b², where a and b are two 4-bit numbers input via the slide switches of the Atlys board. The result is displayed on the 8 LEDs of the Atlys board. To do this, we follow the steps described below and illustrated in Fig.1.

Figure 1 Block diagram of example1 design

Step 1: Determine the software-hardware partition

Decide on the structure of the design. In this example, the functionality of our design is very simple, so we only need a single instance of the PicoBlaze microcontroller. Basically it’s an all-software implementation.

Step 2: Develop the assembly program for the software portion

Because of its simplicity, PicoBlaze cannot effectively support high-level programming languages (such as C) and the code is generally developed in assembly language. Developing a complete assembly program consists of the following steps:
-1-Derive the pseudocode of the main program.
-2-Identify tasks in the main program and define them as subroutines. If needed, continue refining the complex subroutines and divide them into smaller subroutines.
-3-Determine the register and data RAM use.
-4-Write the assembly code for the subroutines.

The main program usually has the following structure:

  call initialization_routine
forever:
  call task_1_routine
  call task_2_routine
  ...
  call task_n_routine
  jump forever

The result of steps -1-, -2-, and -4- is the assembly program in file example1_sio_rom.psm that you find in the example1/ folder of the downloadable archive of this lab. Please read it thoroughly to see what it does and how it achieves all the tasks. The structure of the main program is:

  call clear_data_ram
forever:
  call read_switch
  call square
  call write_led
  jump forever

Step -3- above is unique to assembly code development because we must manually allocate the data storage in assembly code. In this example, the allocation of the data RAM is done as shown in Fig.2.

Figure 2 Allocation of data RAM

Step 3: Compiling with KCPSM6

The assembly code (file example1_sio_rom.psm in our example) is placed in the same folder (say example1/ in our case) as the assembler, kcpsm6.exe, which is part of the PicoBlaze files downloaded from Xilinx. In the same folder we must also place the file ROM_form.vhd (also part of the PicoBlaze files downloaded from Xilinx). This is a template used by kcpsm6.exe. Open a DOS window, navigate to the project directory, and run the program. Type:

  kcpsm6 example1_sio_rom.psm

After successful compilation, several files are created. The one that we need is the one that contains the block RAM VHDL entity that we’ll plug into our top-level design. In this example this file is example1_sio_rom.vhd.
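
Before running the design on the board, it can help to have a host-side reference model of the main program structure from Step 2 (read_switch, square, write_led) to check the LED values against. The Python sketch below is such a model under stated assumptions: a is taken from the lower four switches and b from the upper four, and a result wider than 8 bits is simply truncated to its low byte. Both choices are assumptions made for illustration; check example1_sio_rom.psm for what the provided program really does. Multiplication is written as shift-and-add because PicoBlaze has no multiply instruction.

```python
def mult_soft(x, y):
    """Shift-and-add multiply, as a multiplier-less CPU would do it."""
    product = 0
    for i in range(8):
        if (y >> i) & 1:        # for each set bit of y ...
            product += x << i   # ... add the shifted copy of x
    return product


def read_switch(sw):
    """Split the 8 switches into a and b (nibble assignment is an assumption)."""
    a = sw & 0x0F
    b = (sw >> 4) & 0x0F
    return a, b


def square(a, b):
    """Compute a*a + b*b; the full result can need 9 bits."""
    return mult_soft(a, a) + mult_soft(b, b)


def write_led(value):
    """Keep only the 8 bits that fit on the LEDs (truncation is an assumption)."""
    return value & 0xFF


def main_loop_once(sw):
    """One pass through the forever loop: read_switch, square, write_led."""
    a, b = read_switch(sw)
    return write_led(square(a, b))
```

With the switches set to 0x34 the model gives 4² + 3² = 25. With all switches on, 15² + 15² = 450 does not fit on 8 LEDs, so only the low byte (194) would be visible under the truncation assumption.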
Step 4: Create the ISE WebPack project and test

Use the following source files to create a new ISE project and then implement the top-level design.
- kcpsm6.vhd – the VHDL file of the PicoBlaze microcontroller (comes with PicoBlaze files from Xilinx)
- example1_sio_rom.vhd – the instruction ROM entity; contains basically our program to be executed
- example1_top_level.vhd – top-level description of our design
- example1_top_level.ucf – you should know what this is

Generate the bitstream file and download it to the FPGA. Test operation and comment.

4. Lab Assignment

Use the PicoBlaze microcontroller to design a circuit that implements a Binary-to-BCD converter. Write the assembly code, compile, create a top-level design, implement, and test on the Atlys board. Use the 8 slide switches to input an 8-bit binary number. The BCD code should drive the 8 LEDs.

Optional: Read also chapters 16 and 17 of P.P. Chu’s book and then implement some more complex design of your choice.

5. Credits and references

[1] Pong P. Chu, FPGA Prototyping by VHDL Examples: Xilinx Spartan-3 Version, Wiley 2008.
[2] Ken Chapman, PicoBlaze for Spartan-6, Virtex-6 and 7-Series (KCPSM6), USER GUIDE (comes with PicoBlaze files from Xilinx).
[3] Ken Chapman, PicoBlaze 8-Bit Microcontroller for Virtex-E and Spartan-II/IIE Devices. Xilinx Application Note. http://www.xilinx.com/support/documentation/application_notes/xapp213.pdf

Lab 11: Single Cycle Computer (SCC)

1. Objective

The objective of this lab is to design and use a completely functional single-cycle computer (SCC), both in simulation and implemented on the Atlys board. We implement in VHDL the SCC described in Chapter 9 of the Mano and Kime book [1]. The goal of this exercise is to gain insight into the design and operation of a simple computer (control + datapath) and to set the stage for more advanced exercises regarding processor pipelining.

2. Preparation

As part of this lab preparation you must read chapter 9 of the book by M. Morris Mano and Charles R. Kime [1]. You should read this chapter as well as the complete VHDL implementation provided as part of the downloadable archive of this lab. When reading the VHDL code, identify each block of the top-level block diagram of the SCC shown in Fig.1 below.

Figure 1 Top-level diagram of the Single-Cycle Computer (SCC)

3. Lab Description

In this lab, we verify the operation of the SCC both in simulation and in hardware on the Atlys board. More precisely, we set it up to execute the following program description:

-- bring 1 from data mem loc 0 (1 is hard coded inside data mem) and place it in R4;
-- then add it with constant 3 to get 4 and place 4 in R5;
-- then add R5 and R4 that is 4 + 3 = 7 and place it in R6;
-- finally store R6 into mem loc 1;
-- so, finally data mem at location 1 should contain the result 7;

This is accomplished by the following small program:

  LD   $4, $0
  ADDI $5, $4, 3
  ADD  $6, $5, $4
  ST   $4, $6

This corresponds to the following words, stored directly into the instruction memory:

  0010000100000000
  1000010101100011
  0000010110101100
  0100000100110000

Part 1: Aldec Active-HDL project

Download the lab files from the course website and create your own Aldec Active-HDL project to simulate and verify the operation of the SCC with the above program. You should be able to see waveforms like those shown in Fig.2.

Figure 2 Snapshot of simulation to show 7 as the final result.

Part 2: ISE WebPack project

Use the files from the folder corresponding to the ISE WebPack project (from the downloaded files) and create your own ISE project. Synthesize and implement the design. Generate the bitstream file and download it to the FPGA. Observe that the final result of the above program execution, 7, is also displayed on the LEDs. Test operation and comment.

4. Lab Assignment

While fully operational, the current single-cycle computer (SCC) is one of the simplest.
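
When you write your own programs for the instruction memory, it is handy to check the 16-bit words by splitting them back into fields. The Python sketch below decodes the four words listed in the Lab Description, assuming the instruction format of the Mano and Kime SCC: a 7-bit opcode in bits 15-9, followed by three 3-bit fields (DR, SA, and SB or an immediate operand). That field layout and the helper name decode are assumptions made for this illustration; confirm them against Chapter 9 of [1] and the provided VHDL.

```python
PROGRAM = [
    "0010000100000000",  # LD   $4, $0
    "1000010101100011",  # ADDI $5, $4, 3
    "0000010110101100",  # ADD  $6, $5, $4
    "0100000100110000",  # ST   $4, $6
]


def decode(word):
    """Split a 16-bit instruction string into its assumed 7/3/3/3 fields."""
    value = int(word, 2)
    return {
        "opcode": format((value >> 9) & 0x7F, "07b"),  # bits 15..9
        "dr": (value >> 6) & 0x7,                      # bits 8..6
        "sa": (value >> 3) & 0x7,                      # bits 5..3
        "sb_or_op": value & 0x7,                       # bits 2..0
    }


def dump(program):
    """Return one formatted text line per instruction word."""
    lines = []
    for word in program:
        f = decode(word)
        lines.append("%s  opcode=%s DR=%d SA=%d SB/OP=%d"
                     % (word, f["opcode"], f["dr"], f["sa"], f["sb_or_op"]))
    return lines
```

Calling print("\n".join(dump(PROGRAM))) lists the fields of each word, so the register numbers can be compared one by one with the operands in the assembly listing.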
Here is a list of suggested tasks that you may want to work on as a follow-up to this lab:
- Write a new program and place it into the instruction memory. Implement the whole design again and verify that it works correctly. The program should find the maximum among three numbers pre-stored in the first three locations of the data memory. Could such a program be implemented with fewer than 16 instructions (that is the current size of the instruction memory of the provided SCC)?
- Replace the data and instruction memories with BRAMs specific to the Spartan-6 FPGA. Read Lab 5 to recall how to do that.
- Implement a fully structural description of the Function Unit and then compare the best achievable operating frequency with the current one.
- Enhance the ISA of the SCC with new instructions and make the necessary changes in the control unit and/or datapath.
- Aside from really helping you to get a feel for how a single-cycle computer works, this VHDL implementation is very useful as a platform for gaining additional insight into pipelining. Modify the entire VHDL source code to implement a pipelined version of the SCC as discussed in the last part of Chapter 9 of the Mano and Kime book [1].
- Read also Chapters 10 and 11 of this book; I am sure you will have your own ideas about how to enhance this simple SCC.

5. Credits and references

[1] M. Morris Mano and Charles R. Kime, Logic and Computer Design Fundamentals, Pearson Prentice Hall, 4th Edition, 2008.
