The State University of New York (SUNY) at Buffalo
Department of Electrical Engineering
Lab Manual v1.2012
EE 478/578 - HDL Based Digital Design with Programmable Logic
Cristinel Ababei
Copyleft © by Cristinel Ababei, 2012. As a big supporter of open source, I make this lab manual free
to use for educational purposes. However, you should credit the author. The advanced materials
are available for $1, which will be donated to my preferred charity.
Table of Contents
Lab 1: Aldec Active-HDL Tutorial
Lab 1: Supplemental Material - A First Look at VHDL
Lab 2: Xilinx ISE WebPack Tutorial
Lab 2: Supplemental Material - Subprograms and Packages
Lab 3: Four-Bit Binary Counter
Lab 3: Supplemental Material - Testbenches
Lab 4: Finite State Machines
Lab 4: Supplemental Material - Writing VHDL code for synthesis
Lab 5: Memories: ROMs and BRAMs Internal to the FPGA
Lab 6: Memories: External SPI Flash and DDR2
Lab 7: Interfacing FPGA Spartan-6 with AC’97 Codec
Lab 7 Supplemental: PS2 Keyboard and UART
Lab 8: Interfacing FPGA Spartan-6 with Host Computer via USB
Lab 9: Video Interfaces: HDMI and DVI
Lab 10: PicoBlaze - an embedded microcontroller
Lab 11: Single Cycle Computer (SCC)
Lab 1: Aldec Active-HDL Tutorial
1. Objective
The objective of this tutorial is to introduce you to Aldec’s Active-HDL 9.1 Student Edition simulator by
performing the following tasks on a 4-bit adder design example:
Create a new design or add .vhd files to your design
Compile and debug your design
Run Simulation
Note: Active-HDL, developed by Aldec, is an alternative to Xilinx’s ISim (ISE Simulator). It is one of the
most popular commercial HDL simulators today. In this course, we use the free student version of
Active-HDL, which has some limitations (file sizes and computational runtime). You can
download and install it on your own computer:
http://www.aldec.com/en/products/fpga_simulation/active_hdl_student
2. Introduction
Active-HDL is a Windows-based integrated FPGA design creation and simulation solution. Active-HDL
includes a full HDL graphical design tool suite and an RTL/gate-level mixed-language simulator. It supports
industry-leading FPGA devices from Altera, Atmel, Lattice, Microsemi (Actel), Quicklogic, Xilinx, and
more.
The core of the system is an HDL simulator. Along with debugging and design entry tools, it makes up a
complete system that allows you to write, debug, and simulate VHDL code. Based on the concept of a
workspace (think of it as a design), Active-HDL allows you to organize your VHDL resources into a
convenient and clear structure.
3. Procedure
Creating the 1-bit full adder
1. Start Aldec Active-HDL: Start->All Programs->Aldec->Active-HDL Student Edition
2. Select “Create New Workspace” and click OK
3. Enter fall2012_aldec as the name of the workspace and change the directory to where you want to save
it (for example M:\UB\labs) and click OK
4. Select “Create an Empty Design” and click NEXT
5. Choose the block diagram configuration as “Default HDL Language” and default HDL language as
“VHDL”. Select the target technology as Xilinx for vendor and SPARTAN6 for technology. Click
NEXT
6. Enter fourbit_adder as the name of the design as well as the name of the default working library. Click
NEXT
7. Click FINISH
You should now have the Design Browser window showing the current workspace and design contents.
8. Double-click on “Add New File” in the Design Browser window
9. Select “VHDL Source Code” and type in full_adder in the name field, click OK
The following is the VHDL code for the 1-bit full adder. Enter the code as seen below into the empty file.
------
-- 1-bit full adder
-- Declare the 1-bit full adder with the inputs and outputs
-- shown inside the port(). This adds two bits together (x,y)
-- with a carry in (cin) and outputs the sum (sum) and a
-- carry out (cout).
LIBRARY IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity full_adder is
port(x, y, cin: in std_logic;
sum, cout: out std_logic);
end full_adder;
architecture my_dataflow of full_adder is
begin
sum <= (x xor y) xor cin;
cout <= (x and y) or (x and cin) or (y and cin);
end my_dataflow;
10. Select the File menu and choose Save.
11. To check the syntax of the newly created adder, right click on “full_adder.vhd” in the Design Browser
window and select the Compile option. The code should compile without any problems and you should
see a green check mark next to the full_adder.vhd file. If you get any errors, check the code that you
have typed against the code provided above.
Once you have the source file (or all the source files of the entire design) compiled, the design can be
simulated for functional correctness.
Manual Simulation
Note: This type of simulation should be done only for small designs with few inputs and outputs. As design
size increases you should use testbenches – described later.
1. Select menu Simulation->Initialize Simulation
After the simulation has been initialized, you have to open a new Waveform window.
2. Click the New Waveform toolbar button to invoke the Waveform window.
Now you need to assign stimulators to all the input signals.
3. In the Design Browser window select all signals (one by one while holding the Control key pressed), then
right click and choose Add to Waveform
Note: To add signals to the simulator we could also use the drag and drop feature. In the Structure
pane/tab of the Design Browser window, select the design and, while holding down the left button, drag it to
the right section of the Waveform window and then release the mouse button. This is a standard
drag-and-drop operation.
4. Go to the left pane/tab of the Waveform Editor window and select the “x” signal. Right-click to invoke
the context menu and choose Stimulators…; in the Stimulators dialog, select the “x” item and choose
Clock for Type. Leave the Frequency at the default value of 10 MHz. Click APPLY and then CLOSE
5. Repeat step 4 for input “y”. Choose Clock for Type but this time place the mouse pointer in the
Frequency box and set the value of 5 MHz. Click the APPLY button to assign the stimulator then
CLOSE.
6. Repeat step 4 for input “cin”. Choose Formula for Type and when the dialog appears, type formula
expression as follows: 0 0, 1 100000. Click APPLY and then CLOSE
7. Select Simulation->Run Until and enter 300 ns
8. Finish simulation by selecting the Simulation->End Simulation option in the Simulation menu
At this time your Waveform viewer should look like this:
Investigate the waveforms to verify that your full_adder works correctly.
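For reference, the three stimulators configured in steps 4 to 6 can also be written directly in VHDL. The following is only a sketch under stated assumptions: the entity name stim_sketch is hypothetical, both clocks are assumed to start at '0' with a 50% duty cycle, and the formula 0 0, 1 100000 is assumed to mean value 0 at time 0 and value 1 from 100000 ps (100 ns) onward. It also assumes that full_adder.vhd has already been compiled into the working library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stimulus-only testbench; the name stim_sketch is an
-- assumption and is not one of the lab files.
entity stim_sketch is
end stim_sketch;

architecture sketch of stim_sketch is
  signal x, y, cin : std_logic := '0';
  signal sum, cout : std_logic;
begin
  -- Device under test: the full_adder entity entered earlier.
  dut : entity work.full_adder(my_dataflow)
    port map (x => x, y => y, cin => cin, sum => sum, cout => cout);

  -- 10 MHz clock: 100 ns period, toggling every 50 ns (assumed phase).
  x <= not x after 50 ns;

  -- 5 MHz clock: 200 ns period, toggling every 100 ns (assumed phase).
  y <= not y after 100 ns;

  -- Assumed reading of the formula "0 0, 1 100000":
  -- '0' at time 0, then '1' from 100 ns onward.
  cin <= '0', '1' after 100 ns;
end sketch;
```

Running this for 300 ns should give stimulus comparable to the manual procedure, although the exact clock phase produced by the Active-HDL stimulators may differ.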
Testbench Based Simulation
The VHDL testbench is a VHDL program that describes simulation inputs in standard VHDL language.
There is a wide variety of VHDL specific functions and language constructs designed to create simulation
inputs. You can read the simulation data from a text file, create separate processes driving input ports, and
more. The typical way to create a testbench is to create an additional VHDL file for the design that treats
your actual VHDL design as a component (Design Under Test, DUT) and assigns specific values to this
component’s input ports. It also monitors the output response of the DUT to verify correct operation.
The diagram below illustrates the relationship between the entity, architecture, and testbench:
1. Create a new file full_adder_testbench.vhd and save it under the current design’s “src” directory (for
example M:\UB\labs\fall2012_aldec\fourbit_adder\src). The content of this file is in Appendix A at
the end of this tutorial. You can create it using Aldec’s editor or any other editor (e.g., Notepad).
2. Select the Design menu and choose “Add Files to Design” and add the newly created
full_adder_testbench.vhd to the design.
3. Right click on full_adder_testbench.vhd in the Design Browser window and select the Compile option.
4. Left click on the plus (+) next to full_adder_testbench.vhd. This will bring up the
TEST_FULL_ADDER entity
5. Right click on the TEST_FULL_ADDER and choose Set as Top-Level
6. Select the File menu and choose the New option and pick New Waveform
7. In the Design Browser window select the Structure pane/tab at the bottom of the window
8. Select the Simulation menu and choose Initialize Simulation
9. Click on “+” next to TEST_FULL_ADDER (MY_TEST)
10. Click on U1:FULL_ADDER and drag all signals to the waveform window
11. Change the time for simulation to 400 ns by clicking on the up arrow
12. Select the Simulation menu and choose Run For
13. Inspect the simulation to verify that the 1-bit full adder functionality is indeed correct
At this time your Waveform viewer should look like this:
Creating and testing the 4-bit adder
1. Add the following two files to the design: fourbit_adder.vhd and fourbit_adder_testbench.vhd.
Their source code is in Appendices B and C at the end of this tutorial.
2. Compile both files and use the testbench (fourbit_adder_testbench.vhd) to simulate the design for, say,
200 ns
3. View the simulation to verify that the 4-bit adder functionality is correct.
At this time your Waveform viewer should look like this:
4. Taking it further
While intuitive to use, Active-HDL has a lot of features. It is outside the scope of this tutorial to discuss all
of them. You should spend some time searching and reading additional documentation on how to use
Active-HDL. A few first examples:
http://www.aldec.com/en/downloads/tutorials
Once you have launched Active-HDL, select the Help menu and read the included documentation
Google for “Active-HDL tutorial”. You will find a lot of detailed tutorials (some written for older
versions of the tool but a lot of concepts still apply), which have been kindly made public by the online
community.
Note: As is the case with most electronic design automation (EDA) tools, there are multiple ways of
achieving or performing something. If by reading the documentation or other tutorials you learn how to
accomplish any of the steps described in this tutorial in a different way, that is OK. You should learn and
use the methods you like the most and are most comfortable with.
Finally, while Active-HDL (of Aldec) and ModelSim (of Mentor Graphics) are arguably some of the most
popular HDL simulators in industry, Xilinx has been improving their own simulator, ISim, which is part of
the free ISE WebPack used in this course. You can read more about ISim here:
http://www.xilinx.com/support/documentation/sw_manuals/xilinx13_4/plugin_ism.pdf
Appendix A: VHDL source code of full_adder_testbench.vhd
------
-- 1-bit full adder testbench
-- A testbench is used to rigorously test a design that you have made.
-- The output of the testbench should allow the designer to see if
-- the design worked. The testbench should also report where the testbench
-- failed.
LIBRARY IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Declare a testbench. Notice that the testbench does not have any input
-- or output ports.
entity TEST_FULL_ADDER is
end TEST_FULL_ADDER;
-- Describes the functionality of the testbench.
architecture MY_TEST of TEST_FULL_ADDER is
-- The object that we wish to test is declared as a component of
-- the test bench. Its functionality has already been described elsewhere.
-- This simply describes what the object's inputs and outputs are, it
-- does not actually create the object.
component FULL_ADDER
port( x, y, cin : in STD_LOGIC;
sum, cout : out STD_LOGIC );
end component;
-- Specifies which description of the adder you will use.
for U1: FULL_ADDER use entity WORK.FULL_ADDER(MY_DATAFLOW);
-- Create a set of signals which will be associated with both the inputs
-- and outputs of the component that we wish to test.
signal X_s, Y_s : STD_LOGIC;
signal CIN_s : STD_LOGIC;
signal SUM_s : STD_LOGIC;
signal COUT_s : STD_LOGIC;
-- This is where the testbench for the FULL_ADDER actually begins.
begin
-- Create a 1-bit full adder in the testbench.
-- The signals specified above are mapped to their appropriate
-- roles in the 1-bit full adder which we have created.
U1: FULL_ADDER port map (X_s, Y_s, CIN_s, SUM_s, COUT_s);
-- The process is where the actual testing is done.
process
begin
-- We are now going to set the inputs of the adder and test
-- the outputs to verify the functionality of our 1-bit full adder.
-- Case 0 : 0+0 with carry in of 0.
-- Set the signals for the inputs.
X_s <= '0';
Y_s <= '0';
CIN_s <= '0';
-- Wait a short amount of time and then check to see if the
-- outputs are what they should be. If not, then report an error
-- so that we will know there is a problem.
wait for 10 ns;
assert ( SUM_s = '0' ) report "Failed Case 0 - SUM" severity error;
assert ( COUT_s = '0' ) report "Failed Case 0 - COUT" severity error;
wait for 40 ns;
-- Carry out the same process outlined above for the other 7 cases.
-- Case 1 : 0+0 with carry in of 1.
X_s <= '0';
Y_s <= '0';
CIN_s <= '1';
wait for 10 ns;
assert ( SUM_s = '1' ) report "Failed Case 1 - SUM" severity error;
assert ( COUT_s = '0' ) report "Failed Case 1 - COUT" severity error;
wait for 40 ns;
-- Case 2 : 0+1 with carry in of 0.
X_s <= '0';
Y_s <= '1';
CIN_s <= '0';
wait for 10 ns;
assert ( SUM_s = '1' ) report "Failed Case 2 - SUM" severity error;
assert ( COUT_s = '0' ) report "Failed Case 2 - COUT" severity error;
wait for 40 ns;
-- Case 3 : 0+1 with carry in of 1.
X_s <= '0';
Y_s <= '1';
CIN_s <= '1';
wait for 10 ns;
assert ( SUM_s = '0' ) report "Failed Case 3 - SUM" severity error;
assert ( COUT_s = '1' ) report "Failed Case 3 - COUT" severity error;
wait for 40 ns;
-- Case 4 : 1+0 with carry in of 0.
X_s <= '1';
Y_s <= '0';
CIN_s <= '0';
wait for 10 ns;
assert ( SUM_s = '1' ) report "Failed Case 4 - SUM" severity error;
assert ( COUT_s = '0' ) report "Failed Case 4 - COUT" severity error;
wait for 40 ns;
-- Case 5 : 1+0 with carry in of 1.
X_s <= '1';
Y_s <= '0';
CIN_s <= '1';
wait for 10 ns;
assert ( SUM_s = '0' ) report "Failed Case 5 - SUM" severity error;
assert ( COUT_s = '1' ) report "Failed Case 5 - COUT" severity error;
wait for 40 ns;
-- Case 6 : 1+1 with carry in of 0.
X_s <= '1';
Y_s <= '1';
CIN_s <= '0';
wait for 10 ns;
assert ( SUM_s = '0' ) report "Failed Case 6 - SUM" severity error;
assert ( COUT_s = '1' ) report "Failed Case 6 - COUT" severity error;
wait for 40 ns;
-- Case 7 : 1+1 with carry in of 1.
X_s <= '1';
Y_s <= '1';
CIN_s <= '1';
wait for 10 ns;
assert ( SUM_s = '1' ) report "Failed Case 7 - SUM" severity error;
assert ( COUT_s = '1' ) report "Failed Case 7 - COUT" severity error;
wait for 40 ns;
end process;
END MY_TEST;
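The eight cases above can also be generated with a loop instead of being written out one by one. The following is a sketch of that idea, not a replacement for the testbench used in the lab: the entity name test_full_adder_loop is hypothetical, and the check relies on the observation that (cout, sum), read as a 2-bit number, equals the number of inputs that are '1'. It assumes full_adder.vhd is compiled into the working library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical alternative testbench; the entity name is an assumption.
entity test_full_adder_loop is
end test_full_adder_loop;

architecture sketch of test_full_adder_loop is
  signal x_s, y_s, cin_s : std_logic := '0';
  signal sum_s, cout_s   : std_logic;
begin
  -- Direct instantiation of the adder described in full_adder.vhd.
  u1 : entity work.full_adder(my_dataflow)
    port map (x => x_s, y => y_s, cin => cin_s, sum => sum_s, cout => cout_s);

  process
    variable inputs   : unsigned(2 downto 0);  -- bit 2 = x, bit 1 = y, bit 0 = cin
    variable ones     : natural range 0 to 3;  -- number of inputs that are '1'
    variable expected : unsigned(1 downto 0);  -- bit 1 = cout, bit 0 = sum
  begin
    for i in 0 to 7 loop
      -- Apply input combination number i.
      inputs := to_unsigned(i, 3);
      x_s   <= inputs(2);
      y_s   <= inputs(1);
      cin_s <= inputs(0);
      wait for 10 ns;

      -- Reference result: count the ones among the three inputs.
      ones := 0;
      for b in inputs'range loop
        if inputs(b) = '1' then
          ones := ones + 1;
        end if;
      end loop;
      expected := to_unsigned(ones, 2);

      assert sum_s = expected(0)
        report "Failed Case " & integer'image(i) & " - SUM" severity error;
      assert cout_s = expected(1)
        report "Failed Case " & integer'image(i) & " - COUT" severity error;
      wait for 40 ns;
    end loop;
    wait;  -- all cases applied; suspend this process forever
  end process;
end sketch;
```

The loop keeps the same timing as the listing above (10 ns before the check, 40 ns after), so the eight cases still fit in a 400 ns run.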
Appendix B: VHDL source code of fourbit_adder.vhd
-----
-- 4-bit adder
-- Structural description of a 4-bit adder. This device
-- adds two 4-bit numbers together using four 1-bit full adders
-- described above.
-- This is just to make a reference to some common things needed.
LIBRARY IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-----
-- This describes the black-box view of the component we are
-- designing. The inputs and outputs are again described
-- inside port(). It takes two 4-bit values as input (a and b)
-- and produces a 4-bit output (z) and a carry out bit (cout).
entity fourbit_adder is
port( a, b : in STD_LOGIC_VECTOR(3 downto 0);
z : out STD_LOGIC_VECTOR(3 downto 0);
cout : out STD_LOGIC );
end fourbit_adder;
-- Although we have already described the inputs and outputs,
-- we must now describe the functionality of the adder (ie:
-- how we produced the desired outputs from the given inputs).
architecture MY_STRUCTURE of fourbit_adder is
-- We are going to need four 1-bit adders, so include the
-- design that we have already studied in full_adder.vhd.
component FULL_ADDER
port( x, y, cin : in STD_LOGIC;
sum, cout : out STD_LOGIC );
end component;
-- Now create the signals which are going to be necessary
-- to pass the outputs of one adder to the inputs of the next
-- in the sequence.
signal c0, c1, c2, c3 : STD_LOGIC;
begin
c0 <= '0';
b_adder0: FULL_ADDER port map (a(0), b(0), c0, z(0), c1);
b_adder1: FULL_ADDER port map (a(1), b(1), c1, z(1), c2);
b_adder2: FULL_ADDER port map (a(2), b(2), c2, z(2), c3);
b_adder3: FULL_ADDER port map (a(3), b(3), c3, z(3), cout);
END MY_STRUCTURE;
Appendix C: VHDL source code of fourbit_adder_testbench.vhd
-- 4-bit Adder Testbench
-- A testbench is used to rigorously test a design that you have made.
-- The output of the testbench should allow the designer to see if
-- the design worked. The testbench should also report where the testbench
-- failed.
-- This is just to make a reference to some common things needed.
LIBRARY IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Declare a testbench. Notice that the testbench does not have any
-- input or output ports.
entity TEST_FOURBIT_ADDER is
end TEST_FOURBIT_ADDER;
-- Describes the functionality of the testbench.
architecture MY_TEST of TEST_FOURBIT_ADDER is
component fourbit_adder
port( a, b : in STD_LOGIC_VECTOR(3 downto 0);
z : out STD_LOGIC_VECTOR(3 downto 0);
cout : out STD_LOGIC );
end component;
for U1: fourbit_adder use entity WORK.FOURBIT_ADDER(MY_STRUCTURE);
signal a, b : STD_LOGIC_VECTOR(3 downto 0);
signal z : STD_LOGIC_VECTOR(3 downto 0);
signal cout : STD_LOGIC;
begin
U1: fourbit_adder port map (a,b,z,cout);
process
begin
-- Case 1 that we are testing.
a <= "0000";
b <= "0000";
wait for 10 ns;
assert ( z = "0000" ) report "Failed Case 1 - z" severity error;
assert ( Cout = '0' ) report "Failed Case 1 - Cout" severity error;
wait for 40 ns;
-- Case 2 that we are testing.
a <= "1111";
b <= "1111";
wait for 10 ns;
assert ( z = "1110" ) report "Failed Case 2 - z" severity error;
assert ( Cout = '1' ) report "Failed Case 2 - Cout" severity error;
wait for 40 ns;
end process;
END MY_TEST;
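The testbench above checks only two input combinations. Because the adder has just 256 possible input pairs, it is feasible to check all of them. The following is a sketch of such an exhaustive testbench; the entity name test_fourbit_adder_all is hypothetical, and it assumes that full_adder.vhd and fourbit_adder.vhd are compiled into the working library. The expected result is computed with integer addition and converted with the numeric_std package.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical exhaustive testbench; the entity name is an assumption.
entity test_fourbit_adder_all is
end test_fourbit_adder_all;

architecture sketch of test_fourbit_adder_all is
  signal a, b, z : std_logic_vector(3 downto 0);
  signal cout    : std_logic;
begin
  -- Direct instantiation of the 4-bit adder from Appendix B.
  u1 : entity work.fourbit_adder(my_structure)
    port map (a => a, b => b, z => z, cout => cout);

  process
    variable total : unsigned(4 downto 0);  -- bit 4 = expected cout
  begin
    for i in 0 to 15 loop
      for j in 0 to 15 loop
        a <= std_logic_vector(to_unsigned(i, 4));
        b <= std_logic_vector(to_unsigned(j, 4));
        wait for 10 ns;

        -- Reference result: i + j is at most 30, so it fits in 5 bits.
        total := to_unsigned(i + j, 5);

        assert z = std_logic_vector(total(3 downto 0))
          report "Wrong z for " & integer'image(i) & " + " & integer'image(j)
          severity error;
        assert cout = total(4)
          report "Wrong cout for " & integer'image(i) & " + " & integer'image(j)
          severity error;
      end loop;
    end loop;
    report "Exhaustive test finished" severity note;
    wait;  -- suspend forever once all 256 cases are done
  end process;
end sketch;
```

With 10 ns per case, the full sweep takes 2560 ns of simulated time, so the run length needs to be set accordingly.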
Lab 1: Supplemental Material - A First Look at VHDL
The objective of this supplemental material is to give you an early presentation of some of the most
important concepts in VHDL. You should keep this document as a reference for future work on your course
assignments.
----------------------------------------------------------------
-- entity & architecture template
----------------------------------------------------------------
library lib_name;
use lib_name.package_name.all;
entity entity_name is
generic (
generic_name : type_name := default;
generic_name : type_name := default
);
port (
port_name : in|out|inout|buffer|linkage type_name;
port_name : in|out|inout|buffer|linkage type_name
);
end entity_name;
architecture arch_name of entity_name is
signal signal_name : type_name := default;
begin
concurrent assignments and processes;
end arch_name;
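As a concrete instance of the entity and architecture template, here is a sketch of a 2-to-1 multiplexer with a generic bus width. The names mux2, WIDTH, sel, d0, d1, and q are all hypothetical and chosen only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity: a WIDTH-bit 2-to-1 multiplexer.
entity mux2 is
  generic (
    WIDTH : positive := 4   -- bus width; the default is arbitrary
  );
  port (
    sel : in  std_logic;
    d0  : in  std_logic_vector(WIDTH-1 downto 0);
    d1  : in  std_logic_vector(WIDTH-1 downto 0);
    q   : out std_logic_vector(WIDTH-1 downto 0)
  );
end mux2;

architecture dataflow of mux2 is
begin
  -- One concurrent assignment: q follows d1 when sel is '1', else d0.
  q <= d1 when sel = '1' else d0;
end dataflow;
```

Note that the last entry in the generic list and in the port list has no trailing semicolon, exactly as in the template.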
----------------------------------------------------------------
-- component declaration
----------------------------------------------------------------
component component_name
generic (
generic_name : type_name := default;
generic_name : type_name := default
);
port (
port_name : in|out|inout|buffer|linkage type_name;
port_name : in|out|inout|buffer|linkage type_name
);
end component;
----------------------------------------------------------------
-- component instantiation
----------------------------------------------------------------
instance_name : component_name
generic map (
generic_name => value,
generic_name => value
)
port map (
port_name => value,
port_name => value
);
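The component declaration and instantiation templates can be seen working together in the following sketch. The leaf entity inv_n (a WIDTH-bit inverter) and the wrapper inv_top are hypothetical names used only for illustration. The component is bound to the entity of the same name in the working library by default binding, so no configuration is needed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf cell: a WIDTH-bit inverter.
entity inv_n is
  generic ( WIDTH : positive := 1 );
  port ( a : in  std_logic_vector(WIDTH-1 downto 0);
         y : out std_logic_vector(WIDTH-1 downto 0) );
end inv_n;

architecture dataflow of inv_n is
begin
  y <= not a;
end dataflow;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper that instantiates an 8-bit inv_n.
entity inv_top is
  port ( din  : in  std_logic_vector(7 downto 0);
         dout : out std_logic_vector(7 downto 0) );
end inv_top;

architecture structural of inv_top is
  -- Component declaration: mirrors the generics and ports of inv_n.
  component inv_n
    generic ( WIDTH : positive := 1 );
    port ( a : in  std_logic_vector(WIDTH-1 downto 0);
           y : out std_logic_vector(WIDTH-1 downto 0) );
  end component;
begin
  -- Component instantiation with named association.
  -- There is no semicolon between the generic map and the port map.
  u_inv : inv_n
    generic map ( WIDTH => 8 )
    port map ( a => din, y => dout );
end structural;
```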
----------------------------------------------------------------
-- process template
----------------------------------------------------------------
process_name : process( signal_port_name, signal_port_name )
variable var_name : type_name := default;
begin
...
end process process_name;
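As an illustration of the process template, the following sketch describes a D flip-flop with an asynchronous reset. The entity name dff and its port names are hypothetical. The sensitivity list contains only clk and rst because those are the only signals whose changes can alter the output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: D flip-flop with asynchronous, active-high reset.
entity dff is
  port ( clk, rst, d : in  std_logic;
         q           : out std_logic );
end dff;

architecture behavioral of dff is
begin
  reg_proc : process ( clk, rst )
  begin
    if rst = '1' then
      q <= '0';                 -- reset takes effect immediately
    elsif rising_edge(clk) then
      q <= d;                   -- capture d on the rising clock edge
    end if;
  end process reg_proc;
end behavioral;
```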
----------------------------------------------------------------
-- concurrent signal assignments
----------------------------------------------------------------
signal_name <= value;
signal_name <= transport value after time_value,
value after time_value;
-- access types apply to variables only (sequential assignment):
access_name := new type_name'( initial_value );
signal_name <= value1 when ( condition1 ) else
value2 when ( condition2 ) else
value3;
with expression select
signal_name <= value1 when choice1,
value2 when choice2,
value3 when others;
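The conditional (when/else) and selected (with/select) forms can be compared in the following sketch. The entity concurrent_demo and its ports are hypothetical. The selected assignment builds a 4-to-1 multiplexer, while the conditional assignment builds a priority encoder in which earlier conditions win.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity showing both assignment forms.
entity concurrent_demo is
  port ( sel  : in  std_logic_vector(1 downto 0);
         d    : in  std_logic_vector(3 downto 0);
         req  : in  std_logic_vector(2 downto 0);
         y    : out std_logic;
         code : out std_logic_vector(1 downto 0) );
end concurrent_demo;

architecture dataflow of concurrent_demo is
begin
  -- Selected assignment: 4-to-1 multiplexer. "others" also covers
  -- non-binary values of sel such as 'X' or 'U'.
  with sel select
    y <= d(0) when "00",
         d(1) when "01",
         d(2) when "10",
         d(3) when others;

  -- Conditional assignment: priority encoder, req(2) has highest priority.
  code <= "11" when req(2) = '1' else
          "10" when req(1) = '1' else
          "01" when req(0) = '1' else
          "00";
end dataflow;
```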
----------------------------------------------------------------
-- type declarations
----------------------------------------------------------------
type type_name is ( ENUM1, ENUM2, ENUM3 );
type type_name is range low_integer to high_integer
units
base_unit;
unit1 = integer base_unit;
unit2 = integer unit1;
end units;
type type_name is array ( low_index to high_index ) of element_type;
type type_name is array ( high_index downto low_index ) of element_type;
type type_name is array ( scalar_type1 range <> ) of element_type;
type type_name is array ( index1, index2 ) of element_type;
type type_name is
record
element_name : type_name;
element_name : type_name;
end record;
type record_type_name;
type pointer_type_name is access record_type_name;
type record_type_name is
record
next_record : pointer_type_name;
end record;
type file_type_name is file of type_name;
subtype subtype_name is scalar_type range low to high;
subtype subtype_name is array_type( left downto/to right );
subtype subtype_name is resolution_fn type_name;
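A few of the type templates above are filled in below as a sketch. All names (types_demo, state_t, byte_t, rom_t, sample_t, digit_t) are hypothetical. The declarations are placed in a package so that the file compiles on its own.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package collecting example type declarations.
package types_demo is
  -- Enumeration type, e.g. for the states of a state machine.
  type state_t is ( IDLE, LOAD, RUN, DONE );

  -- Constrained array subtype: an 8-bit vector.
  subtype byte_t is std_logic_vector(7 downto 0);

  -- Array type: 16 bytes, indexed 0 to 15.
  type rom_t is array ( 0 to 15 ) of byte_t;

  -- Record type grouping a data byte with a valid flag.
  type sample_t is
    record
      valid : std_logic;
      data  : byte_t;
    end record;

  -- Scalar subtype with a restricted range.
  subtype digit_t is integer range 0 to 9;
end types_demo;
```

A design would make these visible with use work.types_demo.all; placed before its entity declaration.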
----------------------------------------------------------------
-- signal/constant/variable declarations
----------------------------------------------------------------
signal signal_name : type_name := default;
constant const_name : type_name := value;
variable var_name : type_name := default;
file file_id : file_type is in/out file_name;
----------------------------------------------------------------
-- procedure & function declaration
----------------------------------------------------------------
procedure proc_name ( constant/variable/signal param : in/out/inout type_name );
procedure proc_name ( param1 : type_name; param2 : type_name );
function fn_name ( constant/signal param : in type_name )
return type_name;
function fn_name ( param1 : type_name; param2 : type_name ) return type_name;
----------------------------------------------------------------
-- procedure & function body
----------------------------------------------------------------
procedure proc_name ( constant/variable/signal param : in/out/inout type_name )
is
variable var_name : type_name := default;
begin
statements;
end proc_name;
function fn_name ( constant/signal param : in type_name )
return type_name is
variable var_name : type_name := default;
begin
statements;
return value;
end fn_name;
----------------------------------------------------------------
-- if statement
----------------------------------------------------------------
if ( condition1 ) then
statements;
elsif ( condition2 ) then
statements;
else
statements;
end if;
----------------------------------------------------------------
-- case statement
----------------------------------------------------------------
case signal/variable is
when value1 => statements;
when value2 => statements;
when others => statements;
end case;
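The if and case templates are sequential statements, so they must appear inside a process (or a subprogram). The following sketch combines them in a 2-to-4 decoder with an enable input; the entity name decoder2to4 and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: 2-to-4 decoder with enable.
entity decoder2to4 is
  port ( en  : in  std_logic;
         sel : in  std_logic_vector(1 downto 0);
         y   : out std_logic_vector(3 downto 0) );
end decoder2to4;

architecture behavioral of decoder2to4 is
begin
  decode : process ( en, sel )
  begin
    if en = '1' then
      -- Exactly one output bit is set, chosen by sel.
      case sel is
        when "00"   => y <= "0001";
        when "01"   => y <= "0010";
        when "10"   => y <= "0100";
        when others => y <= "1000";
      end case;
    else
      y <= "0000";  -- disabled: all outputs low
    end if;
  end process decode;
end behavioral;
```

Because y is assigned on every path through the process, the description is purely combinational and no latch is implied.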
----------------------------------------------------------------
-- while loop
----------------------------------------------------------------
label : while ( condition ) loop
statements;
end loop label;
----------------------------------------------------------------
-- for loop
----------------------------------------------------------------
label : for var_name in left to/downto right loop
statements;
end loop label;
----------------------------------------------------------------
-- assert statement
----------------------------------------------------------------
assert ( condition )
report string_value
severity severity_value;
----------------------------------------------------------------
-- package declaration and body
----------------------------------------------------------------
package pkg_name is
declarations;
end pkg_name;
package body pkg_name is
definitions;
end pkg_name;
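The package, function, and for loop templates can be combined as in the following sketch of a parity function. The names parity_pkg and parity are hypothetical. The package declaration holds the function declaration, and the package body holds the function body.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package declaring a parity function.
package parity_pkg is
  -- Returns the XOR of all bits of v ('1' if the number of ones is odd).
  function parity ( v : std_logic_vector ) return std_logic;
end parity_pkg;

package body parity_pkg is
  function parity ( v : std_logic_vector ) return std_logic is
    variable p : std_logic := '0';
  begin
    -- v'range makes the loop work for any vector width and direction.
    for i in v'range loop
      p := p xor v(i);
    end loop;
    return p;
  end parity;
end parity_pkg;
```

A design that needs the function would add use work.parity_pkg.all; before its entity and could then write, for example, p_out <= parity(data); where p_out and data are signals of that design.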
----------------------------------------------------------------
-- configurations
----------------------------------------------------------------
configuration cfg_name of entity_name is
for arch_name
end for;
end cfg_name;
configuration cfg_name of entity_name is
for arch_name
for instance_name : comp_name use entity entity_name ( arch_name );
end for;
for instance_name : comp_name
for arch_name
end for;
end for;
for instance_name : comp_name use configuration cfg_name2;
end for;
for others : comp_name use configuration cfg_name2;
end for;
for all : comp_name use configuration cfg_name2;
end for;
end for;
end cfg_name;
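A configuration is mostly useful when an entity has more than one architecture. The following sketch shows this with hypothetical names throughout: an entity xor2 with two architectures (dataflow and gates), a wrapper xor_top that instantiates it as a component, and a configuration xor_top_gates that selects the gates architecture for instance u1.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf entity with two alternative architectures.
entity xor2 is
  port ( a, b : in  std_logic;
         y    : out std_logic );
end xor2;

architecture dataflow of xor2 is
begin
  y <= a xor b;
end dataflow;

architecture gates of xor2 is
begin
  y <= (a and not b) or (not a and b);
end gates;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper that uses xor2 as a component.
entity xor_top is
  port ( p, q : in  std_logic;
         r    : out std_logic );
end xor_top;

architecture structural of xor_top is
  component xor2
    port ( a, b : in  std_logic;
           y    : out std_logic );
  end component;
begin
  u1 : xor2 port map ( a => p, b => q, y => r );
end structural;

-- Configuration: bind instance u1 to architecture "gates" of xor2.
configuration xor_top_gates of xor_top is
  for structural
    for u1 : xor2
      use entity work.xor2(gates);
    end for;
  end for;
end xor_top_gates;
```

To simulate with this binding, the configuration xor_top_gates, rather than the entity xor_top, would be chosen as the top-level unit.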
Lab 2: Xilinx ISE WebPack Tutorial
1. Objective
To introduce you to Xilinx’s ISE WebPack by performing the following tasks on a 4-bit adder design
example:
Use Xilinx ISE WebPack software to:
o Specify the type of FPGA to be programmed
o Assign input and output signals to FPGA pins
o Implement the design (producing a bit file)
o Generate reports
Use Digilent Adept software to:
o Select the board to be programmed: Digilent ATLYS FPGA board
o Select the bit file to be used
o Program the FPGA board
Test the design on the ATLYS board
2. Introduction
In this course, we use Xilinx ISE WebPack 14.1 to synthesize our designs. The target FPGA is Xilinx
Spartan-6. This FPGA is mounted on a board called Atlys by Digilent. The Atlys circuit board is a
complete, ready-to-use digital circuit development platform based on a Xilinx Spartan-6 LX45 FPGA. It
offers a large on-board collection of high-end peripherals including Gbit Ethernet, HDMI Video, 128MByte
16-bit DDR2 memory, and USB and audio ports.
A typical design flow is illustrated in the next figure:
[Figure: typical design flow, in three stages linked by the files passed between them]
Previous tutorial (lab 1): Aldec Active-HDL or Xilinx ISim
o Specify design functionality
o Define inputs and outputs
o Write VHDL files; create testbenches
o Compile, simulate, and debug
VHDL file (e.g., MyFile.vhd)
This tutorial (lab 2): Xilinx ISE WebPack
o Specify FPGA (Spartan-6, etc.)
o Assign signals to pins
o Implement design (synthesis, place, route)
o Generate reports
o Generate bitstream file (to program FPGA)
bit file (e.g., MyFile.bit)
Digilent Adept or Xilinx iMPACT
o Program to download the bit file, via USB, to the FPGA on the Atlys board
3. Procedure: Design Implementation with Xilinx ISE WebPack
3.1 Start Xilinx ISE
Launch Xilinx ISE using the shortcut on the desktop (or Start->All Programs->Xilinx Design Tools->ISE
Design Suite 14.1->ISE Design Tools->Project Navigator)
3.2 Create a Project
--Click “New Project” button or Select File->New Project
--Enter the project name fourbit_adder and select the location where you want it to be saved. For example,
M:\UB\labs\fall2012_ise.
--Select HDL for Top-Level source type and click Next. You should get the Project Settings window.
3.3 Specify the FPGA to be Used
--In the Project Settings window, select Spartan6 for Family, select XC6SLX45 for Device, select CSG324
for Package, and VHDL for Preferred Language. Leave the rest of the options unchanged (see figure
below). Then, click Next.
--You should get a Project Summary window. Click Finish to create the project.
3.4 Add Existing Source Files to Project
--Select Project->Add Source and locate the vhd files for our design. In this example, we will use the
full_adder.vhd and fourbit_adder.vhd files that we have already created in lab 1. Locate and select
them, then click Open to add them to the project.
--At this time, you should see the Design Overview - Summary being displayed.
3.5 Implement the Design
Design implementation is the process of translating, mapping, placing, routing, and generating a bitstream
file for your design. The design implementation tools are embedded in the Xilinx ISE software for easy
access and project management. The figure below illustrates the design implementation step within a typical
FPGA design flow.
[Figure: the Design Implementation step of the FPGA design flow, consisting of mapping, placement,
routing, and bitstream generation]
To perform the design implementation of our fourbit_adder follow these steps:
--In the Hierarchy window, select “fourbit_adder – MY_STRUCTURE (fourbit_adder.vhd)”
--In the Processes tab double-click Implement Design (or right-click on Implement Design and select
Run). During and after the run, you should see:
Lots of information should scroll by in the Console window. If any errors occur, scroll back up to
read the messages and figure out how to fix the errors.
Green check marks appear next to the processes that have been run
Information filled out in the Design Overview – Summary window. For example:
o Note that this simple example only uses 4 out of 27,288 available LUTs
o This example has 2 inputs (“a” and “b”, each has four bits) and 2 outputs (“z” has four bits and
“cout” is a single bit), so only 13 of 218 input-output blocks (IOB) are used.
The next figure shows what the Project Navigator window looks like after the Implementation run has finished:
3.6 ATLYS Pinout
The Atlys board includes six pushbuttons, eight slide switches, and eight LEDs for basic digital input and
output. One pushbutton has a red plunger and is labeled “reset” on the PCB - this button is no different than
the other five, but it can be used as a reset input to processor systems. The buttons and slide switches are
connected to the FPGA via series resistors to prevent damage from inadvertent short circuits. The high
efficiency LED anodes are connected to the FPGA via 390-ohm resistors, and they will illuminate when a
logic high voltage is applied to their respective I/O pin. The next figure shows the connection of the
pushbuttons, slide switches, and LEDs to the FPGA’s pins:
Note: Now is a good time to read through the reference manual of the Atlys board to get
familiar with the rest of the pinouts. You can download it directly from Digilent:
http://www.digilentinc.com/Data/Products/ATLYS/Atlys_rm.pdf
Also, take some time to read through some of the documentation of Spartan-6 FPGA:
http://www.xilinx.com/support/documentation/spartan-6.htm
Because our fourbit_adder design is pretty small, we can actually conveniently assign the eight slide
switches to control the two inputs and use five LEDs to be driven by the outputs. We will use the first four
slide switches (SW0-SW3) as input “a” and the last four slide switches (SW4-SW7) as input “b” of the
fourbit_adder. The output “z” of the fourbit_adder will drive the first four LEDs (LD0-LD3) and the output
“cout” will drive the last LED (LD7).
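The tutorial assigns the FPGA pins directly to the ports a, b, z, and cout. An alternative way to express the same switch and LED mapping, shown here only as a sketch, is a small top-level wrapper whose ports correspond one-to-one to the eight slide switches and eight LEDs. The names adder_top, sw, and led are hypothetical and are not part of the lab files; the unused LEDs (LD4-LD6) are driven low, which is a design choice made for this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper; the names adder_top, sw, and led are assumptions.
entity adder_top is
  port ( sw  : in  std_logic_vector(7 downto 0);    -- SW7 downto SW0
         led : out std_logic_vector(7 downto 0) );  -- LD7 downto LD0
end adder_top;

architecture structural of adder_top is
begin
  -- SW0-SW3 drive input a, SW4-SW7 drive input b,
  -- LD0-LD3 show z, and LD7 shows cout.
  u_add : entity work.fourbit_adder(my_structure)
    port map ( a    => sw(3 downto 0),
               b    => sw(7 downto 4),
               z    => led(3 downto 0),
               cout => led(7) );

  -- LD4-LD6 are not used by the adder; keep them off.
  led(6 downto 4) <= "000";
end structural;
```

With such a wrapper, the pin assignment step would constrain sw and led instead of a, b, z, and cout; the rest of the flow would be unchanged.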
3.7 Assigning Pins
--Expand (+) User Constraints under the Processes tab
--Double-click on I/O Pin Planning (PlanAhead) - Post-Synthesis
--Select Yes to create a User Constraint File (UCF)
--The PlanAhead 14.1 window should now appear; it may take a few seconds though.
If a Welcome window appears, you can simply close it.
If a window appears asking if you would like to load software updates, select No.
--Expand (+) a(4), b(4), z(4), and Scalar ports under the I/O Ports tab to reveal all the inputs and outputs.
--Double-click on signal a(0) to open the I/O Port Properties window.
--Enter the desired pin number A10 for signal a[0] in the box labeled Site.
--Click Apply.
--Repeat for the rest of the signals a, b, z, and cout using the pin numbers determined earlier and shown in
the previous figure.
--Check the pin numbers now listed in the I/O Ports tab to be sure that they are correct.
The PlanAhead window should look like the next figure:
3.8 Printing the Package View
Note: This step is optional. It is described here for the sake of completeness. To save a tree, do not actually
print.
Before printing the package view, change the background from black to white as follows:
--Bring the mouse pointer within the Package window, then Right-click and select View – Options
--When the Options window opens, change the PlanAhead Default Theme to PlanAhead Light Theme under
Colors option. Click Apply. Notice that the package now has a light color background.
--At this time you could select File->Print and the Package View will print. However, do not do it. Instead,
in the Package window zoom-in to pin A10 and verify that it is assigned to a[0]: you should see a[0] written
inside the cell at location row A column 10. Verify the correct assignment of the other pins as well.
--Select File->Save Project
--Select File->Exit
--Select OK
3.9 Re-Implementing the Design after Pins Assignment
--The Xilinx ISE screen should now again appear and a question mark (?) should appear next to
Implement Design indicating that the design is no longer current (since we assigned pins).
--Double-click on Implement Design to implement the design again using the assigned pins.
--The (?) next to Implement Design should now have been replaced by a green check again.
3.10 Generating the Programming File
--Double-click on Generate Programming File in the Process tab. This step will generate the bit file
(fourbit_adder.bit in this example) that will be downloaded to the FPGA in a later step. A green check
mark should appear after it has successfully run.
3.11 Viewing and Printing Reports
Note: Again, this step is described for completeness; do not actually do the printing.
--The Design Summary tab shows that several types of reports are available. Click on the Summary, IOB
Properties, Pinout Report, and other reports and take some time to read through and understand them. For
example, notice in the Static Timing report that the longest path delay of 14.427 ns is between input bit
a<0> and output cout.
--You could print any of these reports by selecting File->Print.
--An alternative to printing the Pinout Report is to print the User Constraints File (UCF). Look in the
project folder for a file with a ucf extension (fourbit_adder.ucf in this case). Open the file with any text
editor. Notice that this file only contains information on pins that were assigned.
--A schematic can be printed as follows:
Select Tools->Schematic Viewer->RTL. The first time you do this, you are asked to select the
Viewer Startup Mode; leave it as “Start with the Explorer Wizard”. You should then get a
new dialog window, “Create RTL Schematic”.
Expand the (+) sign of the Signals area, select all of the signals, and click the “Add ->” button.
Click Create Schematic button. The schematic now appears as shown in the figure below. Double
click on the full_adder box to go lower into the hierarchy of the design.
You could print the schematic by selecting File->Print (the background changes to white).
3.12 Save and Close the Project
--Select File->Save
--Select File->Close Project
--Select File->Exit to shut down Xilinx ISE
3.13 Opening an existing project
--If you need to open an existing project, look in the project folder for a file with an .xise extension
(full_adder.xise in this example).
--If you modify the VHDL source code, you must run Implement Design and Generate Programming
File again. There is no need to run PlanAhead and assign signals to pins since the UCF file still exists, unless you
added/changed inputs or outputs.
4. Procedure: FPGA Programming with Digilent Adept
Digilent Adept is a free program available from Digilent to download synthesized designs (bit files) onto
Digilent FPGA boards.
4.1 Method 1: Direct programming via USB cable
To program the Atlys board using Adept software, first set up the board and initialize the software:
--Plug in and attach the power supply
--Plug in the USB cable to the PC and to the USB port on the board – the one marked “PROG” on the
board’s PCB (this is the so-called Adept USB port).
--Turn ON Atlys’ power switch
--Start the Adept software
--Wait for the FPGA to be recognized.
If everything is properly connected and powered-up, the software should recognize the board as indicated in
the figure below.
--Select the Browse… button next to the FPGA box and locate the bit file generated by the Xilinx ISE
WebPack software. In this example, the file is fourbit_adder.bit, located in the project folder created in the
previous section of this tutorial. Then click Program.
--If everything is ok, then the Adept software should print the message “Programming Successful” once the
programming is finished.
--Congratulations! You just programmed the Spartan-6 FPGA to implement the fourbit_adder
design!
Use the slide switches to set inputs “a” and “b” and watch the LEDs to verify that the adder works correctly.
Try different combinations of input values. For example, if we wanted to test a + b = 2 + 5 = 7 we would set
input a = 0010 and b = 0101 via the slide switches and the output should turn the LEDs on as shown in the
figure below.
[Figure: slide switches labeled a[0-3] and b[0-3]]
4.2 Method 2: Programming from Flash memory
--Turn the Atlys FPGA board OFF and then ON again. Note that the design has been lost!
Recall that the LUTs in an FPGA are essentially RAM and their contents are lost when power is turned off.
The Atlys board also contains a 16-Mbyte x4 SPI Flash, which can be used to permanently store the
configuration file of our design.
--To program the SPI Flash ROM, select the “Flash” tab in the Adept’s software window.
--In the “FPGA Programming File” section click Browse… and locate the fourbit_adder.bit file and then
click Program. If everything went ok, you should get the message “Flash configuration successful”.
--Turn the FPGA board OFF and then ON again. The configuration stored in the SPI Flash ROM is
automatically transferred to the FPGA at power-on.
--Disconnect the USB cable and turn the power switch OFF and ON again. Note that the design should still
work!
4.3 Other programming methods
As mentioned in the Reference Manual of the Atlys board (link to it provided earlier), the FPGA can be
programmed also via the JTAG interface. In addition, the programming file can be transferred from a USB
memory stick attached to the USB HID port (the one marked J13 on the board’s PCB).
It is left as an assignment for you to search and read through the documentation to figure out how exactly
programming from a USB memory stick can be done.
5. Taking it further
As you have already realized, Xilinx ISE WebPack is a sophisticated piece of software with lots of features. It is outside
the scope of this tutorial to discuss all of them. You should spend time on your own to search and read
additional documentation and tutorials. A few first examples:
Xilinx’s ISE In-Depth Tutorial:
http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_1/ise_tutorial_ug695.pdf
Digilent’s ISE WebPack VHDL Tutorial:
http://www.digilentinc.com/Data/Documents/Tutorials/Xilinx%20ISE%20WebPACK%20VHDL%20Tutorial.pdf
Digilent’s Adept Software Advanced Tutorial:
http://digilentinc.com/Data/Documents/Tutorials/Adept%20Software%20Advanced%20Tutorial.pdf
Once you have launched the ISE tool, open the Help menu and browse the documentation available there.
Google for “ISE WebPack tutorial”. You will find a lot of detailed tutorials (some written for older
versions of the tool but a lot of concepts still apply), which have been kindly made public by the online
community.
Note: As is the case with most electronic design automation (EDA) tools, there are multiple ways of
achieving or performing something. If by reading the documentation or other tutorials you learn how to
accomplish any of the steps described in this tutorial in a different way - that is OK. You should learn and
use the methods you like the most and are more comfortable with.
Lab 2: Supplemental Material - Subprograms and Packages
The objective of this supplemental material is to introduce you to the concepts of subprograms (functions
and procedures) and packages in VHDL.
1. VHDL Functions
A function executes a sequential algorithm and returns a single value to the calling program. We
can think of a function as a generalization of expressions. The syntax rule for a function
declaration is:
[pure | impure] function identifier [(parameter_interface_list)]
return type_mark is
{subprogram declarations}
begin
{sequential statements}
end [function] [identifier];
By default (i.e., if no keyword is given), functions are declared as pure. A pure function does not
have access to a shared variable, because shared variables are declared in the declarative part of the
architecture and pure functions do not have access to objects outside of their scope. Only
parameters of mode 'in' are allowed in function calls and are treated as 'constant' by default.
Functions may be used wherever an expression is necessary within a VHDL statement.
Subprograms themselves, however, are executed sequentially like processes. Similar to a process,
it is also possible to declare local variables. These variables are initialized with each function call
with the leftmost element of the type declaration (boolean: false, bit: '0'). For integers, the leftmost
value is the most negative one, which is guaranteed to be at most -(2^31 - 1) (i.e., a variable that should
start at zero must be explicitly initialized to 0 in its declaration or at the beginning of the function body).
It’s recommended to initialize all variables in order to enhance the clarity of the code.
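The point about initial values can be illustrated with a minimal sketch. The entity ones_counter and the function count_ones are hypothetical names introduced only for this illustration; they are not part of the lab designs.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical entity, for illustration only.
entity ones_counter is
  port (d     : in  std_logic_vector(3 downto 0);
        count : out integer range 0 to 4);
end ones_counter;

architecture behavioral of ones_counter is
  -- Counts the bits of v that are '1'.
  function count_ones(v : std_logic_vector) return integer is
    -- Without ":= 0", n would start at integer'low (the leftmost
    -- value of type integer) on every call.
    variable n : integer := 0;
  begin
    for i in v'range loop
      if v(i) = '1' then
        n := n + 1;
      end if;
    end loop;
    return n;
  end function count_ones;
begin
  -- A function call is an expression, so it can appear on the
  -- right-hand side of a concurrent signal assignment.
  count <= count_ones(d);
end behavioral;
```

Because the variable is re-initialized on each call, the function returns the same result for the same input no matter how many times it has been called before.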
Example 1: The following VHDL code describes a simple function that adds two 4-bit vectors and
a carry in and returns a 5-bit sum:
function add4_func(a, b : std_logic_vector(3 downto 0); carry: std_logic)
  return std_logic_vector is
  variable cout : std_logic;
  variable cin  : std_logic;
  variable sum  : std_logic_vector(4 downto 0);
begin
  cin := carry;
  sum := "00000";
  loop1 : for i in 0 to 3 loop
    cout := (a(i) and b(i)) or (a(i) and cin) or (b(i) and cin);
    sum(i) := a(i) xor b(i) xor cin;
    cin := cout;
  end loop loop1;
  sum(4) := cout;
  return sum;
end add4_func;
Question: what is the role of the statement: cin := cout; inside loop1?
2. VHDL Procedures
Procedures, in contrast to functions, are used like any other statement in VHDL. Consequently,
they do not have a return value, although the keyword 'return' may be used to indicate the
termination of the subprogram. Depending on their position within the VHDL code, either in an
architecture or in a process, the procedure as a whole is executed concurrently or sequentially,
respectively.
Procedures facilitate decomposition of VHDL code into modules. They can return any number of
values using output parameters. The default mode of a parameter is 'in'; the keyword 'out' or 'inout'
is necessary to declare output signals/variables. The syntax rule for a procedure declaration is:
procedure identifier [(parameter_interface_list)] is
{subprogram declarations}
begin
{sequential statements}
end [procedure] [identifier];
Example 2: The following procedure does basically the same thing as the function in the previous
example:
procedure add4_proc
  (a, b        : in  std_logic_vector(3 downto 0);
   carry       : in  std_logic;
   signal sum  : out std_logic_vector(3 downto 0);
   signal cout : out std_logic) is
  variable c : std_logic;
begin
  c := carry;
  for i in 0 to 3 loop
    sum(i) <= a(i) xor b(i) xor c;
    c := (a(i) and b(i)) or (a(i) and c) or (b(i) and c);
  end loop;
  cout <= c;
end add4_proc;
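As a sketch of how such a procedure can be called concurrently from an architecture, consider the following. The wrapper entity add4_demo and the architecture name concurrent_call are hypothetical; the procedure is repeated in the declarative part of the architecture only so that the sketch compiles on its own.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical wrapper entity, for illustration only.
entity add4_demo is
  port (a, b : in  std_logic_vector(3 downto 0);
        ci   : in  std_logic;
        s    : out std_logic_vector(3 downto 0);
        co   : out std_logic);
end add4_demo;

architecture concurrent_call of add4_demo is
  -- Same procedure as in Example 2, declared locally so that the
  -- sketch does not depend on a package.
  procedure add4_proc
    (a, b        : in  std_logic_vector(3 downto 0);
     carry       : in  std_logic;
     signal sum  : out std_logic_vector(3 downto 0);
     signal cout : out std_logic) is
    variable c : std_logic;
  begin
    c := carry;
    for i in 0 to 3 loop
      sum(i) <= a(i) xor b(i) xor c;
      c := (a(i) and b(i)) or (a(i) and c) or (b(i) and c);
    end loop;
    cout <= c;
  end add4_proc;
begin
  -- Concurrent procedure call: it is executed again whenever one of
  -- the signals associated with an 'in' parameter (a, b, ci) changes.
  adder : add4_proc(a, b, ci, s, co);
end concurrent_call;
```

The same call written inside a process would instead be executed sequentially, in order with the other statements of that process.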
3. Packages and libraries
Packages and libraries provide a convenient way of referencing frequently used functions and
components. Packages are the only language mechanism to share objects among different design
units. Usually, they are designed to provide standard solutions for specific problems (e.g., data
types and corresponding subprograms like type conversion functions for a certain bus protocol,
procedures and components (macros) for signal processing purposes, etc.).
A package consists of a package declaration and an optional package body. The package
declaration contains a set of declarations, which may be shared by several design units (for
example: types, signals, components, and function and procedure declarations). The package body
usually contains the function and procedure bodies.
The syntax rule for a package declaration is:
package identifier is
{package declarations}
end [package] [identifier];
A package is analyzed separately and placed in the working library by the analyzer. Each package
declaration that includes function and/or procedure declarations must have a corresponding
package body. The syntax rule for a package body is:
package body identifier is
{package body declarations}
end [package body] [identifier];
Example 3: Simple package declaration and its corresponding body.
library IEEE;
use IEEE.std_logic_1164.all;
package my_package is
function add4_func(a, b: std_logic_vector(3 downto 0); carry : std_logic)
return std_logic_vector;
procedure add4_proc
(a, b: in std_logic_vector(3 downto 0);
carry: in std_logic;
signal sum: out std_logic_vector(3 downto 0);
signal cout: out std_logic);
end package my_package;
Since the package contains subprogram declarations, we declare also the package body:
package body my_package is
function add4_func(a, b: std_logic_vector(3 downto 0); carry: std_logic)
return std_logic_vector is
variable cout: std_logic;
variable cin: std_logic;
variable sum: std_logic_vector(4 downto 0);
begin
cin := carry;
sum := "00000";
loop1: for i in 0 to 3 loop
cout := (a(i) and b(i)) or (a(i) and cin) or (b(i) and cin);
sum(i) := a(i) xor b(i) xor cin;
cin := cout;
end loop loop1;
sum(4) := cout;
return sum;
end add4_func;
procedure add4_proc
(a, b: in std_logic_vector(3 downto 0);
carry: in std_logic;
signal sum: out std_logic_vector(3 downto 0);
signal cout: out std_logic) is
variable c: std_logic;
begin
c := carry;
for i in 0 to 3 loop
sum(i) <= a(i) xor b(i) xor c;
c := (a(i) and b(i)) or (a(i) and c) or (b(i) and c);
end loop;
cout <= c;
end add4_proc;
end package body my_package;
Suppose the above package declaration and package body are saved as my_package.vhd (i.e., as
a VHDL file). The file can be analyzed into any library, for instance a library named
MY_LIBRARY. Then, we can write other VHDL files (or library units) that use items from the
newly created library (i.e., MY_LIBRARY) by means of a "selected name".
The "selected name" is formed by writing the library name, then the package name, and then the
name of the item (or all if you want to use all items), all separated by dots. For example:
library MY_LIBRARY;
use MY_LIBRARY.my_package.all;
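A selected name can also end in the name of a single item instead of all. The following is a minimal sketch; it assumes that my_package has been analyzed into the WORK library (rather than MY_LIBRARY), and the entity add4_wrapper is a hypothetical name used only for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
-- Make only add4_func visible, instead of everything in the package.
-- Assumption: my_package has been analyzed into the WORK library.
use WORK.my_package.add4_func;

-- Hypothetical entity, for illustration only.
entity add4_wrapper is
  port (a, b : in  std_logic_vector(3 downto 0);
        ci   : in  std_logic;
        s    : out std_logic_vector(4 downto 0));
end add4_wrapper;

architecture dataflow of add4_wrapper is
begin
  -- With the use clause above, the short name is enough. Without it,
  -- the full selected name WORK.my_package.add4_func(a, b, ci)
  -- could be written here instead.
  s <= add4_func(a, b, ci);
end dataflow;
```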
Example 4: Simple 8-bit adder using the above package.
Use ISE WebPack to create a new project. Add to your project the following two VHDL files, and
then synthesize and implement the design.
-------------------------------------------------------------------
-- First VHDL file: has package declaration and package body.
-- Save it as my_package.vhd
library IEEE;
use IEEE.std_logic_1164.all;
------------------------------------------------------------------
package my_package is
function add4_func(a, b : std_logic_vector(3 downto 0); carry : std_logic)
return std_logic_vector;
procedure add4_proc
(a, b : in std_logic_vector(3 downto 0);
carry: in std_logic;
signal sum: out std_logic_vector(3 downto 0);
signal cout: out std_logic);
end package my_package;
------------------------------------------------------------------
package body my_package is
function add4_func(a, b : std_logic_vector(3 downto 0); carry: std_logic)
return std_logic_vector is
variable cout: std_logic;
variable cin: std_logic;
variable sum: std_logic_vector(4 downto 0);
begin
cin := carry;
sum := "00000";
loop1: for i in 0 to 3 loop
cout := (a(i) and b(i)) or (a(i) and cin) or (b(i) and cin);
sum(i) := a(i) xor b(i) xor cin;
cin := cout;
end loop loop1;
sum(4) := cout;
return sum;
end add4_func;
procedure add4_proc
(a, b : in std_logic_vector(3 downto 0);
carry: in std_logic;
signal sum: out std_logic_vector(3 downto 0);
signal cout: out std_logic) is
variable c: std_logic;
begin
c := carry;
for i in 0 to 3 loop
sum(i) <= a(i) xor b(i) xor c;
c := (a(i) and b(i)) or (a(i) and c) or (b(i) and c);
end loop;
cout <= c;
end add4_proc;
end package body my_package;
-------------------------------------------------------------------
-------------------------------------------------------------------
-- Second VHDL file: simple 8-bit adder
-- Uses items from "my_package" created in WORK library directory
-- in your current project directory
-- Save it as bit8_adder.vhd
library IEEE;
use IEEE.std_logic_1164.all;
use WORK.my_package.all;
entity bit8_adder is
port(a, b: in std_logic_vector(7 downto 0);
ci: in std_logic;
y: out std_logic_vector(7 downto 0);
co: out std_logic);
end bit8_adder;
architecture structural of bit8_adder is
signal internal_carry : std_logic;
signal sum1, sum2: std_logic_vector(4 downto 0);
begin
sum1 <= add4_func(a(3 downto 0), b(3 downto 0), ci);
sum2 <= add4_func(a(7 downto 4), b(7 downto 4), sum1(4));
y <= sum2(3 downto 0) & sum1(3 downto 0);
co <= sum2(4);
end;
-------------------------------------------------------------------
Lab 3: Four-Bit Binary Counter
1. Objective
The objective of this lab is to design and test a 4-bit binary counter. Aside from learning about the on-board
clock signal and push-buttons as well as about frequency dividers, this lab reinforces the design flow steps
introduced in the previous labs.
2. Description
We design a 4-bit binary counter. Our counter has an output “Q” with four bits. During correct operation,
the counter starts at “0000” and then binary counts up to output “0001”, “0010”, “0011”, and so on until it
outputs “1111”, after which it resets to “0000” and starts again. The first implementation of our counter has
only one input: a clock signal CK. The clock signal is provided by the external (to the FPGA) clock
generator. We use the output Q to drive the first four LEDs on the Atlys board.
The block diagram of the simplest/basic structural implementation of such a binary counter is shown in the
next figure. This implementation is known as a ripple counter.
Figure 1 Block diagram of a 4-bit binary counter
Toggle Flip-Flop
As shown in the figure above, we use four Toggle Flip-Flops (TFF’s). As you remember, the operation of a
TFF is as follows: When the “T” input is logic “1”, the output “Q” will toggle on each clock transition.
When the “T” input is logic “0”, the output “Q” will not change.
To start our design, we first create a new project by launching Xilinx ISE WebPack and following the steps
discussed in lab 2. Call the new project fourbit_counter and select the same location where you created the
previous project.
Create and add to the project a first VHDL file called tff.vhd with the following content:
-- tff.vhd
-- Toggle Flip-Flop with behavioral description
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity tff is
  Port ( T     : in  STD_LOGIC;
         CK    : in  STD_LOGIC;
         Q, QN : out STD_LOGIC);
end tff;
architecture My_behavioral of tff is
  signal mem : std_logic := '0';
begin
  process (CK) -- execute this process only when the clock changes
  begin
    if (CK'event and CK = '1') then -- rising edge of clock
      if T = '1' then
        mem <= not mem; -- T = 1, toggle stored value
      end if; -- T = 0: no toggle, so do nothing
    end if;
  end process;
  Q <= mem;
  QN <= not mem;
end;
Clock Divider
Our counter uses as a clock a signal generated by the on-board clock generator. This clock generator is a
single 100 MHz CMOS oscillator on the Atlys board connected to pin L15 of the Spartan-6 FPGA. Because
the frequency of 100 MHz is too high for the human eye to be able to see how the counter output drives the
LEDs, we must utilize a clock divider to lower the frequency to about 1 Hz.
Create and add to the project a second VHDL file called ck_divider.vhd with the following content:
-- ck_divider.vhd
-- This is a clock divider. It takes as input a signal
-- of 100 MHz and generates an output signal with a frequency
-- of about 1 Hz.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity ck_divider is
  Port ( CK_IN  : in  STD_LOGIC;
         CK_OUT : out STD_LOGIC);
end ck_divider;
architecture Behavioral of ck_divider is
  constant TIMECONST : integer := 84;
  signal count0, count1, count2, count3 : integer range 0 to 1000;
  signal D : std_logic := '0';
begin
  process (CK_IN, D)
  begin
    if (CK_IN'event and CK_IN = '1') then
      count0 <= count0 + 1;
      if count0 = TIMECONST then
        count0 <= 0;
        count1 <= count1 + 1;
      elsif count1 = TIMECONST then
        count1 <= 0;
        count2 <= count2 + 1;
      elsif count2 = TIMECONST then
        count2 <= 0;
        count3 <= count3 + 1;
      elsif count3 = TIMECONST then
        count3 <= 0;
        D <= not D;
      end if;
    end if;
    CK_OUT <= D;
  end process;
end Behavioral;
Read the above code to understand its operation. It takes the 100 MHz external clock as input CK_IN and
generates an output signal CK_OUT of about 1 Hz. The output frequency is adjustable and is given approximately
by the following formula (TIMECONST = 84 in this case in order to get an output frequency of about 1 Hz):
Output Frequency = 100000000 / ( 2 * (TIMECONST ^ 4) )
Note: There are other ways of implementing the TFF or the clock divider. In time, by accumulating more
and more experience, you will develop your own VHDL programming style by adopting different coding
techniques.
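As one example of such an alternative, the following minimal sketch divides the clock with a single counter. It assumes the same 100 MHz input clock; the entity name ck_divider_simple and the constant HALF_PERIOD are hypothetical names chosen for this illustration, not part of the lab design.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical alternative to ck_divider, for illustration only.
entity ck_divider_simple is
  Port ( CK_IN  : in  STD_LOGIC;
         CK_OUT : out STD_LOGIC);
end ck_divider_simple;

architecture Behavioral of ck_divider_simple is
  -- Number of input clock cycles in half of the output period:
  -- 100,000,000 Hz / (2 * 1 Hz) = 50,000,000.
  constant HALF_PERIOD : integer := 50000000;
  signal count : integer range 0 to HALF_PERIOD - 1 := 0;
  signal D     : std_logic := '0';
begin
  process (CK_IN)
  begin
    if (CK_IN'event and CK_IN = '1') then
      if count = HALF_PERIOD - 1 then
        count <= 0;
        D <= not D;  -- toggle once every HALF_PERIOD input cycles
      else
        count <= count + 1;
      end if;
    end if;
  end process;
  CK_OUT <= D;
end Behavioral;
```

With a single counter, the output frequency is 100000000 / (2 * HALF_PERIOD), so changing HALF_PERIOD sets the output frequency directly.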
4-bit Binary Counter
Finally, let’s create a third VHDL file with the top-level description of our fourbit_counter design described
in Figure 1. Create and add to the project the third VHDL file called fourbit_counter.vhd with the
following content:
-- fourbit_counter.vhd
-- This is a simple 4-bit (Ripple) binary counter made up
-- of four T flip-flops. It also includes a clock divider
-- to bring down the input CK signal from 100 MHz to about 1 Hz.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity fourbit_counter is
  Port ( CK : in  STD_LOGIC;
         Q  : out STD_LOGIC_VECTOR (3 downto 0));
end fourbit_counter;
architecture Structural of fourbit_counter is
  component tff
    Port ( T     : in  STD_LOGIC;
           CK    : in  STD_LOGIC;
           Q, QN : out STD_LOGIC);
  end component;
  component ck_divider
    Port ( CK_IN  : in  STD_LOGIC;
           CK_OUT : out STD_LOGIC);
  end component;
  signal all_T, S0, S1, S2, S3, internal_ck : STD_LOGIC;
begin
  -- We use signal all_T set to logic '1' to drive
  -- input T of all T flip-flops to logic '1'.
  all_T <= '1';
  CLOCK: ck_divider port map (CK, internal_ck);
  TFF0: tff port map (all_T, internal_ck, Q(0), S0);
  TFF1: tff port map (all_T, S0, Q(1), S1);
  TFF2: tff port map (all_T, S1, Q(2), S2);
  TFF3: tff port map (all_T, S2, Q(3), S3);
end Structural;
Design Implementation
At this time, we have coded the entire design and its components. Before continuing to Design
Implementation, we first take care of two things:
--Set the fourbit_counter as the top-level design (we need to do this because currently tff.vhd is the
top-level because it was added first to the project). To do that, in the Hierarchy window, right-click on
“fourbit_counter – Structural (fourbit_counter.vhd)” and select Set as Top Module.
--Pin assignment. As discussed earlier, we use the external clock signal connected to pin L15 of the
Spartan-6 FPGA. So, we assign pin L15 to the input “CK” of our design. Also, we use the output “Q”
of our design to drive the first four LEDs of the Atlys board. Now, do the pin assignment as learned in
lab 2. After this step, your UCF file should have the following content:
# PlanAhead Generated physical constraints
NET "Q[0]" LOC = U18;
NET "Q[1]" LOC = M14;
NET "Q[2]" LOC = N14;
NET "Q[3]" LOC = L14;
NET "CK" LOC = L15;
We are now ready to implement the design: in the Processes tab double-click Implement Design (or
right-click on Implement Design and select Run).
Generate the Programming File and Program the FPGA
Double-click on Generate Programming File in the Process tab. Then, program the FPGA using the Adept
software as learned in lab 2. Verify that our counter works correctly.
3. Lab assignment
Lab preparation
A major problem with the counter implemented in this lab is that the individual flip-flops do not all change
state at the same time. Rather, each flip-flop is used to trigger the next one in the series. Thus, in switching
from all 1s (count = 15) to all 0s (count wraps back to 0), we don’t see a smooth transition. Instead, output
Q(0) falls first, changing the apparent count to 14. This triggers output Q(1) to fall, changing the apparent
count to 12. This in turn triggers output Q(2), which leaves a count of 8 while triggering output Q(3) to fall.
This last action finally leaves us with the correct output count of zero. We say that the change of state
“ripples” through the counter from one flip-flop to the next. Therefore, this circuit is known as a “ripple
counter”.
This causes no problem if the output is only to be read by human eyes; the ripple effect is too fast for us to
see it. However, if the count is to be used as a selector by other digital circuits (such as a multiplexer or
demultiplexer), the ripple effect can easily allow signals to get mixed together in an undesirable fashion. To
prevent this, we need to devise a method of causing all of the flip-flops to change state at the same moment.
That would be known as a “synchronous counter” because the flip-flops would be synchronized to operate
in unison.
In this lab assignment, you must design a synchronous counter version of our fourbit_counter to arrive at a
new block diagram, where all flip-flops are driven by the same clock signal. You should design this counter
using the Karnaugh Maps method and utilize JK flip-flops instead of T flip-flops. In addition, the top-level
design of the fourbit_counter should have an additional input, “RESET”, which when set to logic “1” forces
the counter to the initial state “0000”. The RESET input should be controlled by one of the pushbuttons of
the Atlys board.
Optional:
--Remove the clock divider entirely from the design. Instead of the 100 MHz clock signal, use a
signal from one of the pushbuttons of the Atlys board. In this case, the counter will advance each time
the pushbutton is pressed.
--Modify the counter such that it can be told to count up or down.
Lab report and demo
You must turn-in a lab report, which should contain the following:
--Lab title
--Your name
--Introduction section – a brief description of the problem you solve in this lab assignment, outlining the
goal and design requirements.
--Solution – details of your Karnaugh Maps method. Include all block diagrams and K-maps you need to
illustrate each step. This section must be hand-written.
--VHDL code – of your entire design. Use smaller font to save space.
--Conclusion – describe your results and any issues you may have faced during this assignment and how
you solved them.
For full credit, you must demo the correct operation of your counter to the TA during the next lab.
Lab 3: Supplemental Material - Testbenches
The objective of this supplemental material is to reinforce the concept of testbenches in VHDL.
1. Introduction
One alternative way to verify the correctness of a VHDL description of a design is to use
testbenches.
A testbench is an enclosing VHDL model. Its name comes from the analogy with a real hardware
testbench, on which a Device Under Test (DUT) is stimulated with signal generators and observed
with signal probes. A VHDL testbench consists of an architecture body containing an instance of
the component to be tested and processes that generate sequences of values on signals connected to
the component instance. The architecture body may also contain processes that test the component
instance produces the expected values on its output signals.
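The structure described above can be sketched with a deliberately small example. The device under test here is a hypothetical 2-input XOR gate (xor2_dut) with a hypothetical testbench (xor2_tb); both names are introduced only for this illustration. For brevity, a single process both generates the stimulus and checks the outputs.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical device under test: a 2-input XOR gate.
entity xor2_dut is
  port (x, y : in  std_logic;
        z    : out std_logic);
end xor2_dut;

architecture dataflow of xor2_dut is
begin
  z <= x xor y;
end dataflow;

library IEEE;
use IEEE.std_logic_1164.all;

-- A testbench entity has no ports.
entity xor2_tb is
end xor2_tb;

architecture test of xor2_tb is
  signal x, y : std_logic := '0';
  signal z    : std_logic;
begin
  -- Instance of the component to be tested.
  dut : entity work.xor2_dut port map (x => x, y => y, z => z);

  -- Generates input values and checks the resulting output.
  stimulus : process
  begin
    x <= '0'; y <= '0'; wait for 10 ns;
    assert z = '0' report "0 xor 0 should be 0" severity error;
    x <= '0'; y <= '1'; wait for 10 ns;
    assert z = '1' report "0 xor 1 should be 1" severity error;
    x <= '1'; y <= '0'; wait for 10 ns;
    assert z = '1' report "1 xor 0 should be 1" severity error;
    x <= '1'; y <= '1'; wait for 10 ns;
    assert z = '0' report "1 xor 1 should be 0" severity error;
    report "Testbench finished";
    wait;  -- suspend forever so that the simulation can end
  end process stimulus;
end test;
```

In larger testbenches it is common to split this into a stimulus process and one or more separate checking processes, as the description above suggests.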
During this supplemental lab you will write the VHDL model for a registered ALU using a
package, and test it using a testbench. Your ALU is capable of performing four operations on two
operands as shown in Fig.1. The flag output is high (logic '1') whenever there is either an
underflow or overflow on the C bus.
[Figure: block diagram with inputs a(3:0), b(3:0), func(1:0), clk, and reset; an ALU block feeding a REGISTER block; outputs flag and c(3:0)]
Figure 1 Simple ALU
2. Writing the package
As you already learned, a VHDL package is an important way of grouping a collection of related
declarations that serve a common purpose. Usually, a package is a set of subprograms that provide
operations on a particular type of data, or it might be just the set of declarations needed to
model a design.
The important thing is that they can be collected together into a separate design unit that can be
worked on independently and reused in different parts of a model or models.
The following VHDL code describes all the operations needed to implement the four basic
operations of your simple ALU. Type it using any text editor (or using the VHDL editor of ISE
WebPack) and save it as alupack.vhd.
--------------------------------------------library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
---------------------------------------------- package declarations for procedures and constants
package addorsub is
-- set the default bus size
constant bussize : integer := 4;
-- set up a type for a bus of size bussize
subtype stdbus is signed (3 downto 0);
subtype lrgbus is signed (4 downto 0);
-- set the integer range for a bus of size bussize + 1
subtype medint is integer range -32 to 31;
-- extend performs a one bit signed or signed bit extension based
-- on the value of signex. signex=1 does a signed extension.
procedure extend (signal inbus : in stdbus; variable outbus : out
lrgbus; signex : in std_logic);
-- usadd performs signed or signed addition of two busses of size
-- bussize. the result is a signed or signed bus of size bussize
-- depending on signex (signex = 1 produces a signed result). reportf
-- indicates if there is an underflow or overflow.
procedure usadd (signal abus, bbus : in signed(bussize-1 downto 0);
signal result : out signed(bussize-1 downto 0);
signex : in std_logic;
signal reportf : out std_logic);
-- ussub performs signed or signed subtraction (abus - bbus)
-- of two busses of size bussize (signex=1 causes signed subtraction).
-- reportf =1 if there is an underflow or overflow.
procedure ussub (signal abus, bbus : in signed(bussize-1 downto 0);
signal result : out signed(bussize-1 downto 0);
signex : in std_logic;
signal reportf : out std_logic);
end addorsub;
---------------------------------------------- package body contains the procedure bodies.
package body addorsub is
procedure extend (signal inbus : in stdbus; variable outbus : out
lrgbus; signex : in std_logic) is
begin
outbus := (signex and inbus (bussize-1)) & inbus(bussize-1 downto 0);
end;
procedure usadd (signal abus, bbus : in signed(bussize-1 downto 0);
signal result : out signed(bussize-1 downto 0);
signex : in std_logic;
signal reportf : out std_logic) is
variable tempr : medint;
variable tempa : signed(bussize downto 0);
variable tempb : signed(bussize downto 0);
begin
-- sign/unsign extend abus and bbus to a bus of size bussize + 1;
36
extend(abus, tempa, signex);
extend(bbus, tempb, signex);
--perform signed addition
tempr := to_integer(tempa)+ to_integer(tempb);
-- check for overflows dependent on type of addition
if (signex = ‘0’ and tempr > 15) then
--overflow of signed addition
reportf <= ‘1’;
elsif (signex = ‘1’ and (tempr > 7 or tempr < -8)) then
-- overflow or underflow of signed addition
reportf <= ‘1’;
else
reportf <= ‘0’;
end if;
result <= to_signed(tempr, bussize);
end usadd;
procedure ussub (signal abus, bbus : in signed(bussize-1 downto 0);
signal result : out signed(bussize-1 downto 0);
signex : in std_logic;
signal reportf : out std_logic) is
variable tempr : medint;
variable tempa : signed(bussize downto 0);
variable tempb : signed(bussize downto 0);
begin
-- sign/unsign extend abus and bbus to a bus of size bussize+1;
extend(abus, tempa, signex);
extend(bbus, tempb, signex);
-- perform the subtraction on the extended operands
tempr := to_integer(tempa)- to_integer(tempb);
-- check for overflows dependent on type of subtraction
if (signex = '0' and tempr < 0) then
-- underflow of unsigned subtraction
reportf <= '1';
elsif (signex = '1' and (tempr > 7 or tempr < -8)) then
-- overflow or underflow of signed subtraction
reportf <= '1';
else
reportf <= '0';
end if;
result <= to_signed(tempr, bussize);
end ussub;
end addorsub; -- end of package body
---------------------------------------------
3. Writing the VHDL description of the ALU
The following VHDL code describes the ALU, which uses the procedures declared and
implemented in the package addorsub (file alupack.vhd). The ALU has a register to latch the
output. Type it using any text editor and save it as alu.vhd.
---------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
use WORK.addorsub.all;
---------------------------------------------
entity alu is
port (a, b : in stdbus;
func : in std_logic_vector(1 downto 0);
clk, reset : in std_logic;
flag : out std_logic;
c : out stdbus);
end alu;
---------------------------------------------
architecture rtl of alu is
signal intflag : std_logic;
signal intbus : stdbus;
begin
regp : process (clk, reset)
begin
if (reset = '1') then
flag <= '0';
c <= "0000";
elsif (clk'event and clk = '0') then
flag <= intflag;
c <= intbus;
end if;
end process regp;
alup : process(a, b, func)
begin
if func(1) = '0' then
usadd(a, b, intbus, func(0), intflag);
else
ussub(a, b, intbus, func(0), intflag);
end if;
end process alup;
end rtl;
---------------------------------------------
4. Writing the testbench
The following VHDL code represents the testbench. It generates inputs for and monitors the
outputs from the ALU. The testbench compares the actual outputs with expected outputs and prints
out if a test is successful or not. Note that you do not need a stimulus file when you work with
testbenches; the design is stimulated with stimulus generated inside the testbench.
Type the following VHDL code and save it as testbench.vhd.
---------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
use WORK.addorsub.all;
---------------------------------------------
entity testbench is
end testbench;
---------------------------------------------
architecture test of testbench is
type table_type1 is array (0 to 5) of signed (3 downto 0);
type table_type2 is array (0 to 3) of std_logic_vector (1 downto 0);
constant inputa : signed := "0000";
constant inputb : signed := "0000";
constant outc : table_type1 := ("0001", "0011", "0101", "0111", "1001", "1011");
constant outgen : table_type2 := ("00", "01", "10", "11");
signal cbus : signed (3 downto 0);
signal flag : std_logic;
signal abus : signed (3 downto 0) := "0000";
signal bbus : signed (3 downto 0) := "0000";
signal clk : std_logic;
signal reset : std_logic;
signal sel : std_logic_vector (1 downto 0) := "00";
component alu
port (a, b : in stdbus;
func : in std_logic_vector(1 downto 0);
clk, reset : in std_logic;
flag : out std_logic;
c : out stdbus);
end component;
for alu_inst : alu use entity work.alu(rtl);
begin
alu_inst : alu port map (abus, bbus, sel, clk, reset, flag, cbus);
clkp : process
begin
clk <= '1', '0' after 50 ns;
wait for 100 ns;
end process clkp;
rset : process
begin
reset <= '1', '0' after 100 ns;
wait for 1 ms;
end process rset;
testp : process
begin
wait for 100 ns; -- this is needed for reset to finish
for j in 0 to 1 loop -- test for unsigned & signed add
sel <= outgen(j);
for i in 0 to 5 loop
abus <= inputa + TO_SIGNED(i, 4);
bbus <= inputb + TO_SIGNED(i+1, 4);
wait for 51 ns;
assert (cbus = outc(i))
report "Result is not correct"
severity warning;
wait for 49 ns;
end loop;
end loop;
for j in 2 to 3 loop -- test for unsigned & signed sub
sel <= outgen(j);
for i in 0 to 5 loop
abus <= inputa + TO_SIGNED(i, 4);
bbus <= inputb + TO_SIGNED(i+1, 4);
wait for 51 ns;
assert (cbus = "1111")
report "Result is not correct"
severity warning;
wait for 49 ns;
end loop;
end loop;
assert false
report "Test Complete"
severity error;
end process testp;
end test;
---------------------------------------------
Read the above files thoroughly to understand the functionality of the testbench, then:
Use the Aldec Active-HDL simulator to simulate alu.vhd together with alupack.vhd. Create your
own input signals (as in lab#1) to stimulate the four basic operations performed by the ALU and
verify its correctness.
Simulate testbench.vhd (together with alu.vhd and alupack.vhd) to verify the ALU. Notice that
using testbenches saves you time.
5. Lab assignment
You are required to modify the ALU design such that it can be implemented with ISE WebPack
and verified on the Atlys board. You must add a clock divider to provide a clock frequency of 1 Hz
to the ALU unit. The clock divider uses as input the 100 MHz signal of the Atlys board.
Use output c(3:0) to drive LEDs. The LEDs must display either a number between 0-15 for
unsigned operations, or a number between 0-7 for the signed operations. The output "flag" should
drive the leftmost LED. As inputs a(3:0) and b(3:0) use all eight slide switches. As func(1:0) use
the two push-buttons. Synthesize and implement this modified ALU and download its bitstream
file to the board to configure the FPGA. Verify the correct operation.
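One way to build the required clock divider can be sketched as follows. This is only a minimal sketch under stated assumptions: the entity name ck_divider_1hz, the generic HALF_PERIOD, and the port names are hypothetical and are not taken from the lab files. The circuit counts 50,000,000 cycles of the 100 MHz clock and then toggles its output, which gives an output period of about one second.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical clock divider: 100 MHz in, about 1 Hz out.
entity ck_divider_1hz is
  generic (
    HALF_PERIOD : positive := 50_000_000);  -- input cycles per half output period
  port (
    clk_in  : in  std_logic;   -- 100 MHz board clock (assumed)
    reset   : in  std_logic;   -- asynchronous reset, active high (assumed)
    clk_out : out std_logic);  -- divided clock
end ck_divider_1hz;

architecture rtl of ck_divider_1hz is
  signal count   : integer range 0 to HALF_PERIOD - 1 := 0;
  signal clk_int : std_logic := '0';
begin
  process (clk_in, reset)
  begin
    if reset = '1' then
      count   <= 0;
      clk_int <= '0';
    elsif rising_edge(clk_in) then
      if count = HALF_PERIOD - 1 then
        count   <= 0;
        clk_int <= not clk_int;  -- toggle once every HALF_PERIOD input cycles
      else
        count <= count + 1;
      end if;
    end if;
  end process;

  clk_out <= clk_int;
end rtl;
```

For simulation, it may be convenient to override HALF_PERIOD with a small value (for example 5) so that the divided clock toggles within a few hundred nanoseconds.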
Lab 4: Finite State Machines
1. Objective
The objective of this lab is to study several different ways of specifying and implementing finite state
machines (FSMs). We also discuss finite state machines with datapath (FSMD).
2. Introduction
There are two basic types of sequential circuits: Mealy and Moore. Because these circuits transition among a
finite number of internal states, they are referred to as finite state machines (FSMs). In a Mealy circuit, the
outputs depend on both the present inputs and state. In a Moore circuit, the outputs depend only on the
present state. The most common way of schematically representing a Mealy sequential circuit is shown in
Fig.1.
Figure 1 State transition table and block diagram of a Mealy type seq. circuit (BCD to excess-3 converter)
The state register normally consists of D flip-flops (DFFs). However, other types of flip-flops can be
utilized, such as JKFFs. The normal sequence of events is: (1) inputs X change to a new value, (2) after a
delay (shorter than the clock period), outputs Z and next state NS become stable at the output of the
combinational circuit, (3) on the active clock edge, the next state signals NS are stored in the state register;
that is, next state NS replaces present state PS at the output of the state register, which feeds back into the
combinational circuit. At this time, a new cycle is ready to start. These operational cycles are synchronized
with the clock signal CLK.
It is worth mentioning that some authors further classify sequential circuits into two categories. The first
category, referred to as “regular sequential circuits”, includes circuits like (shift) registers, FIFOs, and
binary counters and variants. The second category, referred to as “finite state machines” (FSMs), includes
circuits that typically do not exhibit a simple, repetitive pattern.
3. Example 1: Mealy machine design – BCD to Excess-3 code converter
In this example, we’ll design a serial converter that converts a binary coded decimal (BCD) digit to an
excess-3-coded decimal digit. Excess-3 (XS-3) code is a self-complementary BCD code and numeral
system. It is an instance of biased (excess-N) representation, which uses a pre-specified number N as a
biasing value and was used on some older computers. Biased representation is a way to represent values
with a balanced number of positive and negative numbers. In our example, the XS-3 code is formed by
adding 0011 to the BCD digit.
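As a worked example of the encoding, BCD 0101 (5) becomes 0101 + 0011 = 1000, and BCD 0100 (4) becomes 0111. These two code words are bitwise complements of each other, and 4 is the nine's complement of 5. The sketch below expresses the same rule as a parallel (not serial) VHDL function that could serve as a reference model in a testbench. The package name xs3_pkg and the function name bcd_to_xs3 are hypothetical and are not part of the lab files.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical reference model: parallel BCD to excess-3 conversion.
package xs3_pkg is
  function bcd_to_xs3 (bcd : unsigned(3 downto 0)) return unsigned;
end xs3_pkg;

package body xs3_pkg is
  function bcd_to_xs3 (bcd : unsigned(3 downto 0)) return unsigned is
  begin
    -- Valid BCD digits are 0 to 9, so the 4-bit result (3 to 12) cannot overflow.
    return bcd + 3;
  end bcd_to_xs3;
end xs3_pkg;
```

Note that the converter designed in this lab receives the BCD digit serially, one bit per clock cycle starting with the least significant bit, so this function only describes the expected final result.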
The table and state graph in Fig.2 describe the functionality of our design. For details, please read pages 19-25 in the textbook.
Figure 2 Code converter: table and state graph
There are several ways to model this sequential machine. One popular/common approach is to use two
processes to represent the two parts of the circuit: the combinational part and the state register. For clarity
and flexibility, we use VHDL’s enumerated data type to represent the FSM’s states. The following VHDL
code describes the converter (file code_conv_2processes.vhd):
-- Behavioral model of a Mealy state machine: code converter w/ 2 processes
-- It is based on its state table. The output (Z) and next state are
-- computed before the active edge of the clock. The state change
-- occurs on the rising edge of the clock.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity Code_Converter is
port(
enable: in std_logic;
X, CLK: in std_logic;
Z: out std_logic);
end Code_Converter;
architecture Behavioral of Code_Converter is
type state_type is (S0, S1, S2, S3, S4, S5, S6);
signal State, Nextstate: state_type;
-- a different way: represent states as integer signals:
-- signal State, Nextstate: integer range 0 to 6;
begin
-- Combinational Circuit
process(State, X)
begin
case State is
when S0 =>
if X = '0' then Z <= '1'; Nextstate <= S1;
else Z <= '0'; Nextstate <= S2; end if;
when S1 =>
if X = '0' then Z <= '1'; Nextstate <= S3;
else Z <= '0'; Nextstate <= S4; end if;
when S2 =>
if X = '0' then Z <= '0'; Nextstate <= S4;
else Z <= '1'; Nextstate <= S4; end if;
when S3 =>
if X = '0' then Z <= '0'; Nextstate <= S5;
else Z <= '1'; Nextstate <= S5; end if;
when S4 =>
if X = '0' then Z <= '1'; Nextstate <= S5;
else Z <= '0'; Nextstate <= S6; end if;
when S5 =>
if X = '0' then Z <= '0'; Nextstate <= S0;
else Z <= '1'; Nextstate <= S0; end if;
when S6 =>
if X = '0' then Z <= '1'; Nextstate <= S0;
else Z <= '0'; Nextstate <= S0; end if;
when others => null;
-- should not occur
end case;
end process;
-- State Register
process (enable, CLK)
begin
if enable = '0' then
State <= S0;
elsif rising_edge (CLK) then
State <= Nextstate;
end if;
end process;
end Behavioral;
Note that in each branch of the case statement, the output Z and Nextstate are assigned values. The second
process represents the state register, which is updated on the rising edge of the CLK signal.
To test this converter on the Atlys board, we’ll design a circuit that uses two shift-registers, the converter,
and a clock divider, as shown in the diagram of Fig.3. The input is provided in parallel as four bits via four
slide switches while the output is displayed on four LEDs. We use a clock divider to generate a slower
clock signal (about 1 Hz) to make it easier to monitor the operation of the whole system.
So, create a new ISE project (let’s call it lab4_fsm) and add to it the following VHDL files:
code_conv_2processes.vhd, ck_divider.vhd, shift_register.vhd, and top_level.vhd. These files contain
the declaration and description of all necessary entities to implement the system from Fig.3. These files
together with other useful files (e.g., .ucf file) are included in the downloadable archive with all the data for
this lab. Read top_level.vhd and figure out what exactly the “control” block in Fig.3 does.
Run the Implement Design step inside ISE WebPack to perform placement and routing. Generate the
programming .bit file and program the FPGA. Verify the operation of your design. Observe and comment.
Figure 3 Block diagram of top-level design to test the BCD to XS3 converter
Generally, there are other ways to describe the behavioral model for the code converter:
One way is to use only a single process (rather than two processes as discussed above). In this case, the
next-state is not computed explicitly, but the state register is updated directly to the proper next-state
value on the rising edge of the clock signal. You can see the VHDL code of such an approach in Fig. 2-56, page 106, in the textbook.
Another way is to use the so called dataflow approach. Basically, this is based on using Boolean
equations that implement the combinational part of the state machine. An example of this is shown in
Fig. 2-57, page 107, in the textbook. Because this method assumes that we know these equations, it is not a
preferred method.
Yet another approach to write the VHDL code for the state machine is to create a structural model.
The structural model describes all actual gates and flip-flops and their connectivity. An example of this
is shown in Fig. 2-58, page 108, in the textbook.
Finally, there is yet another way of describing a state machine: a state machine editor. However, this can
be done only when using the Aldec Active-HDL tool. The State Diagram Editor of Aldec is a tool designed for the
graphical editing of state diagrams of synchronous and asynchronous machines. Drawing a state
diagram is an alternative approach to the modeling of a sequential device. Instead of writing the HDL
code, one can enter the description of a logic block as a graphical state diagram. The tool will then
automatically generate the HDL code based on the entered graphical description. Due to the intuitive
graphic form, state diagrams are easy-to-learn and far more readable than the HDL code [1]. We’ll not
use this in this course. However, it is mentioned here for the sake of completeness. For more info you
may want to check out [2,3].
The method using two processes is the recommended one because it is closer to how the hardware actually
works and it is more readable as VHDL code.
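For comparison, the single-process style mentioned in the list above can be sketched as follows. This is not the textbook's listing; it is a minimal sketch that reuses the state table of the two-process code above, with a hypothetical entity name code_converter_1p. The state register is updated directly inside the clocked process, and the Mealy output Z is produced by a separate concurrent assignment so that it remains combinational.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical single-process variant of the BCD to XS-3 code converter.
entity code_converter_1p is
  port (
    enable : in  std_logic;   -- active-low asynchronous reset, as in Example 1
    X, CLK : in  std_logic;
    Z      : out std_logic);
end code_converter_1p;

architecture behavioral of code_converter_1p is
  type state_type is (S0, S1, S2, S3, S4, S5, S6);
  signal State : state_type;
begin
  -- The state register is updated directly; there is no Nextstate signal.
  process (enable, CLK)
  begin
    if enable = '0' then
      State <= S0;
    elsif rising_edge(CLK) then
      case State is
        when S0 => if X = '0' then State <= S1; else State <= S2; end if;
        when S1 => if X = '0' then State <= S3; else State <= S4; end if;
        when S2 => State <= S4;
        when S3 => State <= S5;
        when S4 => if X = '0' then State <= S5; else State <= S6; end if;
        when S5 => State <= S0;
        when S6 => State <= S0;
      end case;
    end if;
  end process;

  -- Mealy output: Z = not X in S0, S1, S4, S6 and Z = X in S2, S3, S5.
  Z <= '1' when ((State = S0 or State = S1 or State = S4 or State = S6) and X = '0')
             or ((State = S2 or State = S3 or State = S5) and X = '1')
       else '0';
end behavioral;
```

If Z were assigned inside the clocked process instead, it would be registered and would appear one clock cycle later than in the two-process model.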
4. Example 2: Finite state machine with datapath (FSMD) - bit difference calculator
A finite state machine with datapath (FSMD) combines an FSM and regular sequential circuits. The FSM,
sometimes referred to as a control-path or controller, examines the external commands and status and
generates control signals to specify operations of the regular sequential circuits, which are known
collectively as a data-path [4]. The FSMD is used to implement systems described by RT (register
transfer) methodology, where the system’s functionality is specified as data manipulation and transfer
among a collection of registers.
Most realistic circuits combine a controller and a datapath to perform some computation. The use of the
FSMD model is especially recommended whenever the structure of the datapath is important. For example,
if you are creating a custom pipelined datapath for a specific application, specifying the structure of the
pipeline is likely important.
The combination of a controller and datapath can be represented using several models in VHDL. In this lab,
we'll look at two different models. To do that, we’ll design and simulate a simple example: a bit difference
calculator [5]. The design’s description is as follows: Given an input of a generic width, the entity
calculates the difference between the number of 1s and 0s. If for example there are 3 more 1s than 0s, the
output is 3. If there are 3 more 0s than 1s, the output is -3.
Implementation A: behavioral model using two processes
A simplified pseudocode description of the bit difference calculator is as follows:
Inputs: go, input (arbitrary width)
Outputs: output(arbitrary width), done (1 bit)
while (go == 0);
value = input; // Store input in a register called value.
diff = 0;
for width iterations {
if bit0 of value == 1
diff++;
else
diff--;
value = shiftRight(value,1);
}
output = diff;
done = 1;
One possible implementation as an FSMD is described by the state graph in Fig.4.
The VHDL file top_level_bit_diff_impl_A.vhd contains the entity bit_diff and its architecture, which together describe the design.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity bit_diff is
  generic (
    width : positive := 16);
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;
    go     : in  std_logic;
    input  : in  std_logic_vector(width-1 downto 0);
    output : out std_logic_vector(width-1 downto 0);
    done   : out std_logic);
end bit_diff;
Figure 4 State graph of FSMD implementation
architecture FSMD_2P of bit_diff is
  type STATE_TYPE is (S_INIT, S_CHECK_BIT, S_STORE_OUTPUT, S_DONE);
  signal state, next_state     : STATE_TYPE;
  signal value, next_value     : std_logic_vector(width-1 downto 0);
  signal diff, next_diff       : signed(width-1 downto 0);
  signal count, next_count     : integer range 0 to width;
  signal output_s, next_output : std_logic_vector(width-1 downto 0);
begin
  -- this process defines all registers used in the FSMD
  process(clk, rst)
  begin
    if (rst = '1') then
      value    <= (others => '0');
      count    <= 0;
      diff     <= (others => '0');
      output_s <= (others => '0');
      state    <= S_INIT;
    elsif (clk'event and clk = '1') then
      -- these are the only registers used by the 2-process FSMD
      value    <= next_value;
      count    <= next_count;
      diff     <= next_diff;
      output_s <= next_output;
      state    <= next_state;
    end if;
  end process;
  -- combinational logic
  process(go, input, value, count, diff, output_s, state)
    variable temp : integer range 0 to width;
  begin
    -- default assignments: hold every register and keep done low
    next_count  <= count;
    next_value  <= value;
    next_diff   <= diff;
    next_output <= output_s;
    next_state  <= state;
    done        <= '0';
    case state is
      when S_INIT =>
        next_count <= 0;
        next_diff  <= (others => '0');
        next_value <= input;
        if (go = '1') then
          next_state <= S_CHECK_BIT;
        end if;
      when S_CHECK_BIT =>
        if (value(0) = '0') then
          next_diff <= diff - 1;
        elsif (value(0) = '1') then
          next_diff <= diff + 1;
        end if;
        next_value <= std_logic_vector(shift_right(unsigned(value), 1));
        temp := count + 1;
        next_count <= temp;
        if (temp = width) then
          next_state <= S_STORE_OUTPUT;
        end if;
      when S_STORE_OUTPUT =>
        next_output <= std_logic_vector(diff);
        next_state  <= S_DONE;
      when S_DONE =>
        done       <= '1';
        next_state <= S_INIT;
      when others => null;
    end case;
  end process;
  output <= output_s;
end FSMD_2P;
At this time, you should create a simple testbench VHDL file (you can do it by modifying the
testbench_top_level.vhd file from Example 1) and simulate the above entity using Aldec Active-HDL.
Verify its operation and comment.
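As a starting point, a testbench for bit_diff might look like the sketch below. It is a minimal sketch under stated assumptions: the names bit_diff_tb, TB_WIDTH, and finished are hypothetical, the width is reduced to 8 to keep the simulation short, and only one input vector is applied. The vector 11110110 contains six 1s and two 0s, so the expected result is +4.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bit_diff_tb is
end bit_diff_tb;

architecture test of bit_diff_tb is
  constant TB_WIDTH : positive := 8;  -- small width keeps the run short
  signal clk      : std_logic := '0';
  signal rst      : std_logic := '1';
  signal go       : std_logic := '0';
  signal input    : std_logic_vector(TB_WIDTH-1 downto 0) := (others => '0');
  signal output   : std_logic_vector(TB_WIDTH-1 downto 0);
  signal done     : std_logic;
  signal finished : boolean := false;
begin
  -- unit under test: the two-process FSMD architecture from this section
  uut : entity work.bit_diff(FSMD_2P)
    generic map (width => TB_WIDTH)
    port map (clk => clk, rst => rst, go => go,
              input => input, output => output, done => done);

  -- 10 ns clock that stops toggling once the stimulus process has finished
  clk <= not clk after 5 ns when not finished else '0';

  stim : process
  begin
    wait for 20 ns;          -- hold reset for two clock periods
    rst   <= '0';
    input <= "11110110";     -- six 1s, two 0s: expected difference +4
    go    <= '1';
    wait until done = '1';   -- done is high for one cycle in S_DONE
    go <= '0';
    assert to_integer(signed(output)) = 4
      report "bit_diff: unexpected result" severity error;
    report "bit_diff test finished";
    finished <= true;
    wait;                    -- suspend forever
  end process;
end test;
```

Further vectors (for example all 0s, which should give -8 for this width) can be added by repeating the stimulus sequence before finished is set.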
Implementation B: structural model using component instantiations for registers, muxes, adders,
subtracters, etc.
The structural implementation is recommended when the exact structure of the datapath is important. In this
model, we separate the controller and datapath from each other. Then, typically, we define the datapath
structurally and then combine it with a corresponding controller (FSM) described using any of the possible
models discussed in Example 1.
For example, assume that we really wanted to implement the datapath described in Fig.5. Then, the
following files: top_level_bit_diff_impl_B.vhd, datapath.vhd, fsm.vhd, add.vhd, sub.vhd, reg.vhd,
mux2x1.vhd, comp.vhd describe all the entities required for implementing the design. Read these files to
understand the description. Then, use the same testbench that you created to simulate the previous
implementation (implementation A) to verify the operation of this description as well.
Figure 5 Block diagram of datapath
5. Lab assignment
Design and code in VHDL the converter from Example 1 but as a Moore machine. Verify its operation
using the Aldec Active-HDL simulator. The lab report should include the state diagram, VHDL code,
description, and waveforms.
6. References
[1] Aldec state diagram editor. http://www.aldec.com/en/solutions/fpga_design/graphical_text_design_entry
[2] Getting Started with Active-HDL.
http://www.aldec.com/en/support/resources/documentation/articles/1054
[3] Lab tutorial at TCC. http://faculty.tcc.edu/PGordy/EGR270/AldecEx2.pdf
[4] P.P. Chu, RTL Hardware Design Using VHDL: Coding for Efficiency, Portability and Scalability,
Wiley-Interscience, 2006.
[5] Greg Stitt, University of Florida, VHDL tutorials. http://www.gstitt.ece.ufl.edu/vhdl
Lab 4: Supplemental Material – Writing VHDL code for synthesis
The objective of this supplemental material is to provide guidelines for writing VHDL code for synthesis.
1. Introduction
The quality of a synthesized design, in terms of area, performance, etc., depends directly on the
VHDL description of the design. Generally, two different VHDL descriptions of the same design
may result in two different final implemented circuits. Also, the final implemented design depends
on what software tools you use. Adopting a VHDL programming style, which ensures best
synthesized designs, is desirable. During this lab you will learn how to write VHDL constructs that
are efficiently synthesized.
2. Synthesis tools
Usually, there are several ways to express the functionality of a design. For example, the following
VHDL code describes an edge triggered D-flip-flop in four different ways:
-- version 1
process (clk) is
begin
if rising_edge(clk) then
q <= d;
end if;
end process;
-- version 2
process is
begin
wait until rising_edge(clk);
q <= d;
end process;
-- version 3
q <= d when rising_edge(clk) else q;
-- version 4
b: block (rising_edge(clk) and not clk'stable) is
begin
q <= guarded d;
end block b;
It is unlikely that all the above descriptions will be synthesizable by the same tool. This depends
on how the tool is constructed, i.e., what the "expectations" of the tool are for certain expressions.
That is why it is recommended that you 1) read the documentation of your particular tool and 2)
possibly change your programming style to conform to the particular requirements specific to your
tool.
3. Potentially synthesizable
VHDL can be utilized to describe designs for the purposes of 1) simulation and 2) synthesis. There
are constructs (closer to the C programming constructs) included in the VHDL language which are
intended for simulation and which cannot be synthesized into hardware. File operations and
assertion statements are examples of such constructs. They should be used for creating
testbenches but not for the synthesizable sections of a model.
There are constructs which are potentially synthesizable but are not handled correctly by some
synthesis tools. If you use these constructs, the synthesized hardware may produce different results
from the simulated model. For example, suppose you want to describe a registered comparator
with two data inputs, a and b, a clock input, clk, and a data output, q. The device stores the result
of comparing a and b on each rising edge of clk. The following are two possible ways to describe
this circuit:
-- version 1
process (clk) is
variable d : std_logic;
begin
if a=b then
d:='1';
else
d:='0';
end if;
if rising_edge(clk) then
q <= d;
end if;
end process;
-- version 2
process (clk) is
begin
if rising_edge(clk) then
if a=b then
q <= '1';
else
q <= '0';
end if;
end if;
end process;
When you simulate the first version it works correctly. When you synthesize it you have to take
into account the fact that the process is resumed on both rising and falling edges of clk. The
variable d is updated in both cases and in this way it is a function of clk. Some tools treat this as
illegal and fail to synthesize the device. Others proceed to synthesize the device, but may not
produce a correct circuit. Version 2 is a better description since it reflects accurately your intention
that the comparison is performed only on rising edges of clk. In this case the process does not
contain any unnecessary implied state.
4. "Doing it right" vs. "Doing it wrong"
a) "Doing it right"
y <= a or b; -- simple gate, easy to synthesize
y <= a when x = '1' else b; -- simple multiplexer, no process
-- statement necessary!
Extend VHDL code already written to describe new blocks. Example which uses the description of
a flip-flop to specify a counter:
-- flipflop description
ff2 : process (reset, clk) is
begin
if reset = '1' then
q <= '0';
elsif rising_edge(clk) then
if x = '1' then
q <= a;
else
q <= b;
end if;
end if;
end process ff2;
-- flipflop extended to form a counter
constant terminal_count : integer := 2**6-1;
subtype counter_range is integer range 0 to terminal_count;
signal count : counter_range;
...
counter6: process (reset, clk) is
begin
if reset = '0' then
count <= 0;
elsif rising_edge(clk) then
if count < terminal_count then
count <= count + 1;
else
count <= 0;
end if;
end if;
end process counter6;
Describe Finite State Machines (FSMs) using two-processes model:
architecture behavioral of an_FSM is
type state_type is (S0, S1, S2, S3);
signal state, next_state : state_type;
begin
combinational_part: process (input, state) is
begin
case state is
when S0 =>
if input = '1' then
output <= '1';
next_state <= S0;
else
output <= '1';
next_state <= S1;
end if;
when S1 =>
if input = '1' then
output <= '0';
next_state <= S1;
else
output <= '0';
next_state <= S0;
end if;
when S2 =>
if input = '1' then
output <= '0';
next_state <= S1;
else
output <= '0';
next_state <= S3;
end if;
when S3 =>
if input = '1' then
output <= '0';
next_state <= S3;
else
output <= '1';
next_state <= S0;
end if;
end case;
end process combinational_part;
state_register: process (reset, clk) is
begin
if reset = '0' then
state <= S0;
elsif rising_edge(clk) then
state <= next_state;
end if;
end process state_register;
end architecture behavioral;
With this type of description the register which holds the current state is independent of the logic
that determines the next state and the outputs. Synthesis tools work better with the state machine
specified in this way. In the above example, the four states can be encoded using two, three, or
four bits. The best encoding depends on the synthesis target library, the required speed of the
circuit, and the circuit area available. You can force the above machine to use “one hot” state
encoding by modifying the state definition as follows:
-- forcing one hot encoding
subtype state_type is std_logic_vector (3 downto 0);
constant S0 : state_type := "0001";
constant S1 : state_type := "0010";
constant S2 : state_type := "0100";
constant S3 : state_type := "1000";
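An alternative that keeps the enumerated state type is to attach an encoding attribute to it. The sketch below assumes a synthesis tool that honors the enum_encoding attribute described in the IEEE 1076.6 synthesis standard; tool support and the exact attribute name vary, so check your tool's documentation before relying on it. The package name fsm_types is hypothetical.

```vhdl
-- Hypothetical package: one-hot encoding requested through an attribute.
package fsm_types is
  type state_type is (S0, S1, S2, S3);

  -- Assumption: the synthesis tool recognizes enum_encoding. Simulators
  -- simply ignore it, so simulation behavior is unchanged.
  attribute enum_encoding : string;
  attribute enum_encoding of state_type : type is "0001 0010 0100 1000";
end fsm_types;
```

With this approach the case statement can still use the state names directly, and the simulator still displays S0 to S3 in the waveform instead of bit patterns.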
b) "Doing it wrong"
-- Wrong:
y <= a + b + c + d; -- will be synthesized as three stage circuit!
-- Correct:
y <= (a + b) + (c + d); -- will be synthesized as two stage circuit!
-- Wrong:
y <= a or b or c and d; -- illegal: VHDL gives "and" and "or" the same
-- precedence, so mixing them requires parentheses.
-- Correct:
y <= (a or b) or (c and d);
Incomplete definitions:
-- Wrong version because there is uncertainty about what is x when a = 1
-- and about what is z when NOT(a = 1).
if (a = '1' ) then
z <= f();
elsif (clk'event and clk = '1') then
x <= g();
end if;
-- Correct:
if (a = '1' ) then
z <= f(); x <= x;
elsif (clk'event and clk = '1') then
x <= g(); z <= z;
end if;
-- Wrong construct; because there is an else after a clocked if. Try
-- to draw a schematic and figure out what is wrong.
if (clk'event and clk = '1') then
x <= f();
y <= g();
else
z <= h();
end if;
Avoid putting too much on the sensitivity list of processes! Usually we put on the sensitivity list
clocks, reset signals and inputs. Do not include output signals in the sensitivity lists!
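For a purely combinational process, the sensitivity list must still contain every signal that the process reads; otherwise the simulation can differ from the synthesized circuit. VHDL-2008 offers the keyword all for this purpose. The sketch below is a hypothetical multiplexer (the entity name mux2 is an assumption) and only compiles with tools that accept VHDL-2008, which not every simulator or synthesis tool does.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 2-to-1 multiplexer written as a combinational process.
entity mux2 is
  port (a, b, sel : in  std_logic;
        y         : out std_logic);
end mux2;

architecture rtl of mux2 is
begin
  -- VHDL-2008 only: "all" stands for every signal read inside the process
  -- (here a, b, and sel), so the list cannot become incomplete.
  process (all)
  begin
    if sel = '1' then
      y <= a;
    else
      y <= b;
    end if;
  end process;
end rtl;
```

With an older language version, the equivalent list would be written out explicitly as process (a, b, sel).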
5. Distinguishing when to use signals and when to use variables
The behavior of signals and variables can be completely different. Variables can be used only to
store data temporarily in a process or subprogram.
-- Undesired construct:
signal int : std_logic;
begin
process(a, b, c, d, int) is
begin
int <= a and b and c;
q <= int and d;
end process;
end;
It is undesired because we assign the signal int inside the process and then use it to assign q inside
the same process. Because int is updated only after a delta delay, in the current step int still has the
old (incorrect) value. To get around this, int has to be on the sensitivity list, and thus the process
will be activated again. But according to the previous guideline, you have to avoid overloaded
sensitivity lists! A better option to write the above construct is:
-- Better construct:
begin
process(a, b, c, d) is
variable int : std_logic;
begin
int := a and b and c;
q <= int and d;
end process;
end;
This also has the advantage of a faster simulation because, now, the process will be
executed only once! The advantage of the version using int declared as a signal is that int can be
displayed as a waveform in the simulator. This is not possible if int is declared as a variable because no
time is linked to a variable. This makes it harder to debug the variable example than the signal
example!
Rule: Use variables only when you want to store a value temporarily!
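The difference in update timing can be observed with a small simulation-only experiment such as the one sketched below. The entity name sig_vs_var_demo is hypothetical. The assertions pass silently if signals and variables behave as described above: the variable changes immediately, while the signal changes only after a delta delay.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sig_vs_var_demo is
end sig_vs_var_demo;

architecture sim of sig_vs_var_demo is
  signal s : std_logic := '0';
begin
  process
    variable v : std_logic := '0';
  begin
    s <= '1';   -- scheduled: takes effect only after a delta delay
    v := '1';   -- takes effect immediately
    assert s = '0' report "signal s was updated immediately" severity error;
    assert v = '1' report "variable v was not updated" severity error;
    wait for 0 ns;  -- let one delta cycle elapse
    assert s = '1' report "signal s not updated after a delta" severity error;
    report "signal/variable demo finished";
    wait;           -- suspend forever
  end process;
end sim;
```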
6. Others
Declaring vectors:
-- NOT recommended:
signal a : std_logic_vector (0 to 3);
-- Recommended: (because the MSB will be always the one with the highest index)
signal a : std_logic_vector (3 downto 0);
Counter synthesis:
The following is the description of a counter without a reset line:
library IEEE;
use IEEE.Std_Logic_1164.all;
entity COUNTER is
port ( CLK : in std_ulogic;
Q : out integer range 0 to 15 );
end COUNTER;
architecture my_cool_arch of COUNTER is
signal COUNT : integer range 0 to 15 ;
begin
process (CLK)
begin
if (CLK'event and CLK = '1') then
if (COUNT >= 9) then
COUNT <= 0;
else
COUNT <= COUNT + 1;
end if;
end if;
end process;
Q <= COUNT;
end my_cool_arch;
Note the range assignment in the port declaration of the output. Here, only integers between 0 and
15 are allowed - which means that 4 bits are sufficient for the binary representation of the output
port. The port Q is replaced by the synthesis tool with a 4 bit signal (ultimately all types are
transformed by means of the synthesis tools to std_logic types).
Note also that Q, as an output port, can be only written and it cannot be read within the architecture
declaration (unless its mode is changed to buffer). Therefore, a signal COUNT must be declared
within the architecture to be able to query COUNT >= 9. The result of the counting is finally
transferred to Q in a concurrent signal assignment, Q<=COUNT, as an additional process. That
means that each change of COUNT triggers the assignment Q<=COUNT. Thus, the internal IF
statement describes the combinatorial circuit before the FF. The number of FFs is derived from
the width of the signal that receives an assignment inside the outer IF statement. In this
example, the width is four for signal COUNT (because of its range 0 to 15).
Finally note that only signal CLK is on the sensitivity list.
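If you prefer to make the register width explicit instead of relying on the integer range, the same counter can be sketched with the unsigned type from numeric_std. The entity name counter_vec is hypothetical, and the initial value on the signal is an assumption that works in simulation and on most FPGA targets but should be checked for your tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical variant of the counter above with an explicit 4-bit width.
entity counter_vec is
  port (clk : in  std_logic;
        q   : out std_logic_vector(3 downto 0));
end counter_vec;

architecture rtl of counter_vec is
  signal count : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if count >= 9 then
        count <= (others => '0');  -- wrap around after 9
      else
        count <= count + 1;
      end if;
    end if;
  end process;

  q <= std_logic_vector(count);    -- concurrent assignment to the output port
end rtl;
```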
7. Conclusion
The discussion in this supplemental material is not meant to be an exhaustive list of how VHDL
code should be written for synthesis. Rather, the purpose is to provide a rough idea about what
writing code for synthesis means. In particular, the intent is to make you aware of possible
situations where the VHDL description performs correctly during simulation but does not after
synthesis and implementation (or even worse: the code is not synthesizable in the first place)!
Lab 5: Memories: ROMs and BRAMs Internal to the FPGA
1. Objective
The objective of this lab is to illustrate the use of ROM and block RAM memories located inside the FPGA
– a Spartan-6 in the case of our Atlys board. We’ll learn how to use the ISE’s Core Generator tool to create
BRAMs. Depending on what your course project will do, you may need to use such memories in your
project.
2. Description
In this lab we’ll create a project to implement the following design description: The circuit must contain
two memories. The first is a ROM created using a case statement and initialized to the desired values (such
as the coefficients of a filter). This memory will be inferred as a distributed RAM memory by the Xilinx
synthesis tool (XST). The second memory is a block RAM (BRAM) created using the Core Generator tool
(part of ISE WebPack). The contents of these memories will be read continuously and displayed on the
eight LEDs. Slide switch SW(0) is used to select which of the two memory outputs drives the LEDs. A
simplified representation of this functionality is shown in the block diagram in Fig.1.
Figure 1 Block diagram of desired circuit
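A ROM described with a case statement can be sketched as follows. This is not the ROM file supplied with the lab; the entity name rom8x8_sketch, the port names, and the stored values are hypothetical and only illustrate the coding style.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-word by 8-bit ROM described with a case statement.
entity rom8x8_sketch is
  port (addr : in  std_logic_vector(2 downto 0);
        data : out std_logic_vector(7 downto 0));
end rom8x8_sketch;

architecture rtl of rom8x8_sketch is
begin
  process (addr)
  begin
    case addr is
      when "000"  => data <= "00000001";
      when "001"  => data <= "00000010";
      when "010"  => data <= "00000100";
      when "011"  => data <= "00001000";
      when "100"  => data <= "00010000";
      when "101"  => data <= "00100000";
      when "110"  => data <= "01000000";
      when others => data <= "10000000";  -- also covers non-binary values
    end case;
  end process;
end rtl;
```

Because every address is assigned a constant, the synthesis tool can implement the table in look-up tables without using a block RAM.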
3. ISE WebPack project
Create a new ISE project and add to it the VHDL files listed at the end of this document. These files
contain the declaration and description of this lab’s entities including clock divider, ROM, and top level
design. These files together with other useful files (.ucf and .coe files) are also included in the
downloadable archive with all the data for this lab.
To create a custom single-port block RAM using the Core Generator, inside your ISE project, follow these
steps:
First create, using ISE’s or any other text editor, a file named my_bram8x8.coe and save it in the main
directory of your ISE project, with the following contents:
memory_initialization_radix=2;
memory_initialization_vector=
00000000,
01000000,
00100000,
00010000,
00001000,
00000100,
00000010,
00000001;
Select New Source->IP (CORE Generator and Architecture Wizard); name it my_bram8x8 and click
Next.
In the New Source Wizard that pops up, select Memories and Storage Elements->RAMs & ROMs->Block Memory Generator and click Next, then Finish. This closes the New Source Wizard
window and brings up the Block Memory Generator window.
Click Next, leave the Memory Type as “Single Port RAM” and click Next.
On Page 3 of 6, set Write Width to 8 and Write Depth to 8. Click Next.
On Page 4 of 6, select Load Init File and then browse to locate the file created earlier,
my_bram8x8.coe, and then click Next. Click Next again. Page 6 of 6 should look like Figure 2.
Click Generate. This will generate the memory core and in your ISE project’s Console you should get
the message:
Wrote CGP file for project 'my_bram8x8'.
Core Generator create command completed successfully.
Figure 2
During the above process, several files are created and stored in the ipcore_dir/ folder inside your ISE project's main
folder. Among them, you can find my_bram8x8.vhd. Open it and read the VHDL code. Identify the
BRAM entity declaration, and use it to instantiate a component in the top level VHDL file of your project.
Do the pin assignment. After this, your UCF file should have the following contents:
NET "clk_100MHz" LOC = L15;
NET "switch" LOC = A10;
NET "leds[0]" LOC = U18;
NET "leds[1]" LOC = M14;
NET "leds[2]" LOC = N14;
NET "leds[3]" LOC = L14;
NET "leds[4]" LOC = M13;
NET "leds[5]" LOC = D4;
NET "leds[6]" LOC = P16;
NET "leds[7]" LOC = N12;
Run the Implement Design step inside ISE WebPack to perform placement and routing and observe the
messages that the tool prints in the Console window. These messages provide useful information about the
resource utilization on the FPGA as well as performance estimates.
Generate the programming .bit file and program the FPGA. Verify the operation of your design; turn on/off
the first slide-switch. Observe and comment.
4. Lab assignment
Modify the project so that it can also write into the BRAM. The writing process should allow new words
(as set by the slide switches) to be written into the BRAM during eight cycles. These cycles should
be controlled via one of the push-buttons on the Atlys board (BTND, pin P3). It is up to you how you want to
utilize the remaining push-buttons to achieve the desired operation of the whole system.
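As a possible starting point for the write control, here is a minimal sketch of a circuit that turns a push-button press into a single one-clock-cycle pulse. The entity name btn_pulse and its ports are hypothetical (they are not part of the files provided for this lab), and I assume that the push-button reads '1' while pressed. Such a pulse could, for example, drive bit 0 of the BRAM's wea input to write one word per press.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical helper (not part of the lab archive): produces a
-- one-clock-cycle pulse each time the push-button goes from '0' to '1'.
entity btn_pulse is
    Port ( clk   : in  STD_LOGIC;   -- same clock that drives clka of the BRAM
           btn   : in  STD_LOGIC;   -- raw push-button input, assumed active high
           pulse : out STD_LOGIC);  -- high for one clk cycle per press
end btn_pulse;

architecture Behavioral of btn_pulse is
    -- sync(0) and sync(1) form a two-flip-flop synchronizer;
    -- sync(2) holds the previous synchronized value for edge detection.
    signal sync : STD_LOGIC_VECTOR(2 downto 0) := "000";
begin
    process (clk)
    begin
        if rising_edge(clk) then
            sync <= sync(1 downto 0) & btn;
        end if;
    end process;

    -- Rising edge: the button is high now (sync(1)) but was low one cycle ago (sync(2)).
    pulse <= sync(1) and not sync(2);
end Behavioral;
```

Because this lab clocks the BRAM at about 1 Hz, the button must be held long enough to be sampled. With a much faster clock you would also need to debounce the button.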
5. Credits and references
[1] XST User Guide for Virtex-6, Spartan-6, and 7 Series Devices - ROMs and ROM coding examples
(page 247): http://www.xilinx.com/support/documentation/sw_manuals/xilinx13_1/xst_v6s6.pdf
[2] Spartan-6 Libraries Guide for HDL Designs:
http://www.xilinx.com/support/documentation/sw_manuals/xilinx11/spartan6_hdl.pdf
[3] Spartan-6 FPGA Block RAM Resources:
http://www.xilinx.com/support/documentation/user_guides/ug383.pdf
Appendix: Listing of VHDL code
my_modules.vhd
-- This is a ROM. XST tool (part of ISE WebPack tools) will
-- infer this and implement this declaration as a distributed
-- memory.
-- The contents of this ROM (basically a LUT) will be utilized to drive
-- the 8 LEDs on the Atlys board. This should turn them on one-by-one
-- from right to left.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
ENTITY rom8x8 IS
PORT (
  addr: in std_logic_vector(2 downto 0);
  dout: out std_logic_vector(7 downto 0));
END rom8x8;
ARCHITECTURE behav OF rom8x8 IS
BEGIN
  PROCESS(addr)
  BEGIN
    CASE addr IS
      when "000" => dout <= "00000001";
      when "001" => dout <= "00000010";
      when "010" => dout <= "00000100";
      when "011" => dout <= "00001000";
      when "100" => dout <= "00010000";
      when "101" => dout <= "00100000";
      when "110" => dout <= "01000000";
      when "111" => dout <= "10000000";
      when others => NULL;
    END case;
  END process;
END behav;
-- This is a clock divider. It takes as input a signal of 100 MHz
-- and generates an output signal with a frequency of about 1 Hz.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity ck_divider is
Port ( CK_IN : in STD_LOGIC;
CK_OUT : out STD_LOGIC);
end ck_divider;
architecture Behavioral of ck_divider is
constant TIMECONST : integer := 84;
signal count0, count1, count2, count3 : integer range 0 to 1000;
signal D : std_logic := '0';
begin
process (CK_IN, D)
begin
if (CK_IN'event and CK_IN = '1') then
count0 <= count0 + 1;
if count0 = TIMECONST then
count0 <= 0;
count1 <= count1 + 1;
elsif count1 = TIMECONST then
count1 <= 0;
count2 <= count2 + 1;
elsif count2 = TIMECONST then
count2 <= 0;
count3 <= count3 + 1;
elsif count3 = TIMECONST then
count3 <= 0;
D <= not D;
end if;
end if;
CK_OUT <= D;
end process;
end Behavioral;
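For comparison, the sketch below shows a different and commonly used way of obtaining a 1 Hz time base: a single counter that produces a one-cycle enable pulse instead of a divided clock. This is not the divider used in this lab; the entity name tick_1hz, its generic, and its ports are hypothetical. The assumption is a 100 MHz input clock, as on the Atlys board.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical alternative to ck_divider (not part of the lab archive):
-- a single counter that raises TICK for one CK_IN cycle once per second.
entity tick_1hz is
    Generic ( CYCLES_PER_TICK : positive := 100_000_000 );  -- 100 MHz / 1 Hz
    Port ( CK_IN : in  STD_LOGIC;
           TICK  : out STD_LOGIC := '0');
end tick_1hz;

architecture Behavioral of tick_1hz is
    signal count : integer range 0 to CYCLES_PER_TICK - 1 := 0;
begin
    process (CK_IN)
    begin
        if rising_edge(CK_IN) then
            if count = CYCLES_PER_TICK - 1 then
                count <= 0;      -- wrap around after one full second
                TICK  <= '1';    -- one-cycle enable pulse
            else
                count <= count + 1;
                TICK  <= '0';
            end if;
        end if;
    end process;
end Behavioral;
```

A process clocked by CK_IN can then advance only when TICK = '1', which keeps the whole design in a single clock domain.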
top_level.vhd
-- This is a simple design, in which we use two memories:
-- memory1: ROM created using a case statement and initialized to desired values
-- This should be inferred as a distributed memory by the Xilinx tool
-- memory2: block RAM created using the Core Generator and then only instantiated
-- This memory is initialized using a .coe file
-- The contents of these memories will be displayed on the 8 LEDs of Atlys.
-- Slide switch SW(0) is used to select between the two memories.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;
-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
entity top_level is
Port ( clk_100MHz : in STD_LOGIC; -- FPGA's external oscillator
switch : in STD_LOGIC; -- hooked to slide switch SW(0) on Atlys board
leds : out STD_LOGIC_VECTOR (7 downto 0)); -- drives all eight LEDs on board
end top_level;
architecture Structural of top_level is
component ck_divider
Port (CK_IN : in STD_LOGIC;
CK_OUT : out STD_LOGIC);
end component;
-- Question: what would be different if instead of using this component we would
-- simply declare an array that would also need to be initialized with desired values?
-- type ram_t is array (0 to 7) of std_logic_vector(7 downto 0);
-- signal ram : ram_t := (others => (others => '0'));
-- Exercise: Currently, if the memory selector (switch) changes, the counter does not reset. Change
-- the code such that each time the selector changes the counter is reset to
-- zero.
component rom8x8
PORT (addr : in std_logic_vector(2 downto 0);
dout : out std_logic_vector(7 downto 0));
end component;
-- This component is created using the Core Generator. Its VHDL description
-- is inside ipcore_dir/my_bram8x8.vhd, which was created by the
-- Core Generator as explained in the lab.
component my_bram8x8
PORT (
clka : IN STD_LOGIC;
wea : IN STD_LOGIC_VECTOR(0 DOWNTO 0);
addra : IN STD_LOGIC_VECTOR(2 DOWNTO 0);
dina : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
douta : OUT STD_LOGIC_VECTOR(7 DOWNTO 0)
);
end component;
signal clk_1Hz : STD_LOGIC;
signal my_addr_counter : STD_LOGIC_VECTOR (2 downto 0) := "000";
signal dout_rom8x8, dout_bram8x8 : STD_LOGIC_VECTOR (7 downto 0);
-- for the time being, we'll only read from this block RAM, so
-- let's set all data ins to zero;
signal dina_null : STD_LOGIC_VECTOR (7 downto 0) := "00000000";
signal wea_null : STD_LOGIC_VECTOR(0 DOWNTO 0) := "0"; -- no need for writing in this example
begin
clock_divider : ck_divider port map (clk_100MHz, clk_1Hz); -- positional association (error-prone)
memory1 : rom8x8 port map (addr => my_addr_counter, dout => dout_rom8x8); -- named association (preferred)
-- Instantiate BRAM.
memory2 : my_bram8x8 port map (
clka => clk_1Hz, -- clock for Port A
wea => wea_null, -- write enable signal for Port A
addra => my_addr_counter, -- 3 bit address for the RAM
dina => dina_null, -- 8 bit data input to the RAM
douta => dout_bram8x8); -- 8 bit data output from the RAM
multiplex_out : process (clk_1Hz) is
begin
if (clk_1Hz'event and clk_1Hz = '1') then
case switch is
when '0' =>
leds <= dout_rom8x8;
when '1' =>
leds <= dout_bram8x8;
when others => NULL;
end case;
my_addr_counter <= std_logic_vector( unsigned(my_addr_counter) + 1);
end if;
end process;
end Structural;
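Regarding the question in the comments of top_level.vhd about declaring an array instead of using a component, here is a minimal sketch of what such a description might look like. The entity name rom8x8_array and the choice of a registered (clocked) read are my assumptions; they are not part of the provided files. Whether the synthesis tool maps such an array onto LUTs or onto a block RAM depends on the memory size, the read style, and the tool settings, so check the synthesis report and the XST User Guide [1].

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical array-based memory (not part of the lab archive).
entity rom8x8_array is
    Port ( clk  : in  STD_LOGIC;
           addr : in  STD_LOGIC_VECTOR(2 downto 0);
           dout : out STD_LOGIC_VECTOR(7 downto 0));
end rom8x8_array;

architecture Behavioral of rom8x8_array is
    type rom_t is array (0 to 7) of STD_LOGIC_VECTOR(7 downto 0);
    -- Same contents as the case-statement ROM rom8x8.
    constant ROM : rom_t := (
        "00000001", "00000010", "00000100", "00001000",
        "00010000", "00100000", "01000000", "10000000");
begin
    -- Registered (synchronous) read: dout is updated one clock edge
    -- after addr changes.
    process (clk)
    begin
        if rising_edge(clk) then
            dout <= ROM(to_integer(unsigned(addr)));
        end if;
    end process;
end Behavioral;
```

Compared with the case statement, the contents here sit in one constant, which makes it easier to change the depth or to fill the table from a function.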
Lab 6: Memories: External SPI Flash and DDR2
1. Objective
The objective of this lab is to learn how to access memory chips from within your VHDL design. These
memory chips are external to the FPGA, located on the Atlys board. The board has a 16Mbyte x4 SPI Flash
for configuration and data storage and a 128Mbyte DDR2 with 16-bit wide data.
2. Description
The Atlys board uses a 128Mbit Numonyx N25Q128 Serial Flash memory device (16,777,216 bytes of 8 bits
each) for non-volatile storage of FPGA configuration files. The SPI Flash can be programmed with a .bit,
.bin, or .mcs file using the Adept software. The Adept Flash programming application also allows user
data files to be transferred to/from the Flash at user-specified addresses. The Read/Write tools of Adept
allow data to be exchanged between files on the host PC and specified address ranges in Flash.
As general-purpose flash, the SPI serial flash can also be used for any other non-volatile storage that you
might require. One example could be to store MicroBlaze processor application code for bootloading.
In the first part of this lab we’ll create a project to implement the following design description: the circuit
must read one byte (8 bits) from a specified location on the Flash memory chip and use it to drive the 8
LEDs on the Atlys board. A simplified representation of this functionality is shown in the block diagram in
Fig.1.
Figure 1 Block diagram of system that reads one memory location and displays it on 8 LEDS
3. SPI Controller
The communication between the Spartan-6 FPGA and the Flash memory chip is done via the so-called Serial
Peripheral Interface (SPI) communication method (see Fig.2). SPI is a synchronous serial interface used mainly
for short-distance communication between a controller and peripheral chips such as memories, sensors, and data converters.
It remains a useful communication tool for many applications. SPI runs using a master/slave
set-up and can run in full duplex mode (i.e., signals can be transmitted between the master and the slave
simultaneously). There is no single formal standard for SPI; details such as clock polarity, clock phase, and word length vary between devices.
SPI is still used to control some peripheral devices and has some advantages over I2C (another type of
serial data communication). SPI can communicate at much higher data rates than I2C. Furthermore, when
multiple slaves are present, SPI requires no addressing to differentiate between these slaves; each slave is selected through its own chip-select line. SPI has the
additional benefit of requiring only simple wiring, when compared to parallel buses.
Figure 2 SPI communication method
To access the Flash memory from within the Spartan-6 FPGA, we implement a finite state machine (an
SPI controller) that is responsible for the SPI communication. The controller implements only a subset of
all the commands that the Flash memory supports. The SPI controller utilized in this lab is a slightly
modified version of the one developed by Johannes Hausensteiner and available at opencores.org [1]. You
should read spi_ctrl.vhd and study the state diagram (included in Appendix A as well as in the
downloadable archive for this lab) to understand how the SPI controller works. You will also need to
read the datasheet of the Numonyx N25Q128 Serial Flash memory device and understand how it works [2]. Finally,
you should search online and read more about SPI [3,4].
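To make the full-duplex shifting more concrete, here is a minimal sketch of an SPI master that exchanges a single byte with a slave. It is not the controller by Johannes Hausensteiner used in this lab; the entity name spi_byte_master and all its ports are hypothetical. The sketch assumes the SPI mode in which the clock idles low and data is sampled on the rising edge of the serial clock, sends the most significant bit first, and generates a serial clock at half the frequency of clk.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical single-byte SPI master (not part of the lab archive).
-- SCLK idles low; MOSI changes when SCLK falls; MISO is sampled
-- when SCLK rises. Most significant bit first.
entity spi_byte_master is
    Port ( clk     : in  STD_LOGIC;                     -- system clock
           start   : in  STD_LOGIC;                     -- one-cycle pulse: begin transfer
           tx_data : in  STD_LOGIC_VECTOR(7 downto 0);  -- byte to send
           miso    : in  STD_LOGIC;                     -- serial data from slave
           sclk    : out STD_LOGIC;                     -- serial clock to slave
           mosi    : out STD_LOGIC;                     -- serial data to slave
           cs_n    : out STD_LOGIC;                     -- active-low chip select
           rx_data : out STD_LOGIC_VECTOR(7 downto 0);  -- received byte, valid when busy = '0'
           busy    : out STD_LOGIC);
end spi_byte_master;

architecture Behavioral of spi_byte_master is
    signal shreg   : STD_LOGIC_VECTOR(7 downto 0) := (others => '0');
    signal rx_bit  : STD_LOGIC := '0';
    signal bit_cnt : integer range 0 to 7 := 0;
    signal active  : STD_LOGIC := '0';
    signal sclk_i  : STD_LOGIC := '0';
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if active = '0' then
                if start = '1' then
                    shreg   <= tx_data;   -- load byte; its MSB appears on MOSI
                    bit_cnt <= 0;
                    active  <= '1';
                end if;
            elsif sclk_i = '0' then
                sclk_i <= '1';            -- rising SCLK edge
                rx_bit <= miso;           -- sample the slave's bit
            else
                sclk_i <= '0';            -- falling SCLK edge
                -- shift out the next bit and shift in the sampled bit
                shreg  <= shreg(6 downto 0) & rx_bit;
                if bit_cnt = 7 then
                    active <= '0';        -- eight bits exchanged
                else
                    bit_cnt <= bit_cnt + 1;
                end if;
            end if;
        end if;
    end process;

    sclk    <= sclk_i;
    mosi    <= shreg(7);
    cs_n    <= not active;
    busy    <= active;
    rx_data <= shreg;
end Behavioral;
```

A real Flash read keeps the chip select low across the command, the address, and the returned data bytes, so a complete controller needs a state machine around such a shifter. That is the role of the finite state machine in spi_ctrl.vhd.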
4. Aldec Active-HDL Simulation
To help you understand the SPI controller and the circuit designed to use it for reading one memory location,
we'll first use Aldec Active-HDL simulation to investigate the overall system operation.
Following the procedure presented in Lab 1, create a new design and add to it the source files
testbench_spi_ctrl.vhd, top_level_spi_ctrl.vhd, and spi_ctrl.vhd. All these files are included in the
downloadable archive for this lab. Run your simulation for 6 us. Study the provided VHDL files and display
the necessary waveforms to understand the operation of the circuit. An example of useful waveforms is shown
in Fig.3.
In your assignment for this lab, you will need to modify these files and verify the correct functionality of
your new design. Using Active-HDL first to do your VHDL coding and debugging will save you a lot of
trouble and frustration, which you might otherwise experience if you went directly to the
implementation of your design on the FPGA with ISE WebPack. Be aware, however, that
simulation does not always give you the same results as the hardware
implementation. Most often, things appear to work in simulation but the hardware implementation
fails, and we need to go back to the simulation stage and continue to debug our designs.
Figure 3 Zoom-in into the Aldec's testbench simulation
5. ISE WebPack project
By now, you should have a good idea about how the top level circuit works. Create a new ISE project and
add to it the VHDL files top_level_spi.vhd and spi_ctrl.vhd. Again, these files together with other useful
files (such as the .ucf file) are included in the downloadable archive with all the data for this lab.
Do the pin assignment. Your UCF file should have the following contents:
NET "clk_100MHz" LOC = L15;
NET "reset_btn" LOC = F5;
NET "spi_din" LOC = R13;
NET "spi_dout" LOC = T13;
NET "spi_cs" LOC = V3;
NET "spi_clk" LOC = R15;
NET "spi_wp_bar" LOC = T14;
NET "spi_hold_bar" LOC = V14;
NET "leds[0]" LOC = U18;
NET "leds[1]" LOC = M14;
NET "leds[2]" LOC = N14;
NET "leds[3]" LOC = L14;
NET "leds[4]" LOC = M13;
NET "leds[5]" LOC = D4;
NET "leds[6]" LOC = P16;
NET "leds[7]" LOC = N12;
Run the “Implement Design” step inside ISE WebPack to perform placement and routing and observe the
messages that the tool prints in the Console window. These messages provide useful information about the
resource utilization on the FPGA as well as performance estimates.
Generate the programming .bit file and program the FPGA. Verify the operation of your design. Observe
and comment.
Note that the provided top_level_spi.vhd file reads the content of the Flash memory location at address (or
offset) x"18A230", which I selected randomly and hard-coded inside the VHDL code. To verify that what
the design reads from the Flash memory (and displays on the 8 LEDs) is indeed the actual information
stored inside the memory, I use Digilent's Adept to first read the entire content of the Flash memory and
save it into a temporary binary file. Then, using a hex editor (such as HxD Hex Editor [5]), I see that, for
example, the information at address x"18A230" is x"80" (shown in Fig.4), which corresponds to how the
LEDs are turned on/off (only the leftmost LED is turned on)!
Figure 4 “Current” content of Flash memory at address 0018A230 is 80
6. Lab assignment
Modify the project to be able to also write into the Flash memory. Your design should be able to write ten
consecutive memory locations starting at an arbitrary address (write numbers 1 through 10) and then read
them back and display them using the 8 LEDs. Each of the numbers should be displayed for one second.
7. Credits and references
[1] Johannes Hausensteiner, SPI Controller in VHDL. http://opencores.org/project,spiflashcontroller
[2] Datasheet Numonyx N25Q128 Serial Flash memory. http://www.alldatasheet.com/datasheetpdf/pdf/353314/NUMONYX/N25Q128.html
[3] SPI description. http://www.ee.nmt.edu/~teare/ee308l/datasheets/S12SPIV3.pdf
[4] Google search for Serial Peripheral Interface (SPI).
[5] HxD - Freeware Hex Editor and Disk Editor. http://mh-nexus.de/en/hxd/
Appendix A: SPI controller state diagram – authored by Johannes Hausensteiner and available at
http://opencores.org/project,spiflashcontroller
Lab 7: Interfacing FPGA Spartan-6 with AC’97 Codec
1. Objective
The objective of this lab is to demonstrate the use of the National Semiconductor LM4550 AC‘97 audio
codec (IC3), which is available on the Atlys board. We’ll code in VHDL a driver and implement it on the
FPGA to communicate with and control the codec. The driver can select the input into the codec (e.g.,
microphone, line-in) and set the volume – via the slide switches of the Atlys board.
2. Introduction
AC'97 (Audio Codec '97; also MC'97 for Modem Codec '97) is an audio codec standard developed by Intel
Architecture Labs in 1997. The standard is used in motherboards, modems, and sound cards.
Read more about AC’97 here: http://www-inst.eecs.berkeley.edu/~cs150/Documents/ac97_r23.pdf
The Atlys board includes a National Semiconductor LM4550 AC‘97 audio codec (IC3) with four 1/8” audio
jacks for line-out (J5), headphone-out (J7), line-in (J4), and microphone-in (J6). Audio data at up to 18 bits
and 48 kHz sampling is supported, and the audio in (record) and audio out (playback) sampling rates can be
different. The microphone jack is mono, all other jacks are stereo. The headphone jack is driven by the
audio codec's internal 50mW amplifier. LM4550 basically serves as an interface between the analog world
of traditional audio components (e.g., headphones and microphones) and the digital world of the FPGA.
Read more about LM4550 here: http://www.ti.com/lit/ds/symlink/lm4550.pdf
3. VHDL driver
This is an example hardware driver used to interface the AC97 audio codec with an FPGA running at 100
MHz. The design can be scaled to other clock speeds by either scaling the internal counters, or instantiating
an onboard PLL to attain a 100 MHz clock. The VHDL code and description of this controller is based on
the work of Tony Storey and Scott Larson [1].
[Block diagram: inside the Spartan-6 FPGA, the AC97CMD command state machine exchanges cmd_addr (8 bits), cmd_data (16 bits), latching_cmd, and ready with the AC97 controller. Inputs: CLK from the 100 MHz oscillator, SOURCE (3 bits) and VOLUME (5 bits) from the eight slide switches, and Reset from the Atlys RESET push-button. SDATA_IN, SDATA_OUT, SYNC, the 12.288 MHz bit clock, and RESET connect to the LM4550 AC97 codec on the Atlys board.]
Figure 1: Block diagram of desired circuit
The inputs to the controller “AC97 controller” include the CLK (main FPGA oscillator), an active low
reset, a serial data in line, a 12.288 MHz bit clock from the ac97 chip, a 3 bit source selector (slide switches
SW7-5) and a 5 bit volume control (slide switches, SW4-0). The controller’s outputs include a sync signal,
serial data output, and an ac97 active low reset signal for initializing the ac97 (LM4550). There are two
internal signals to sync the main ac97 controller with the “command state machine AC97CMD” (a small
FSM to setup codec's registers). One of these signals pulses every 20us and the other is a signal used for
error checking during the tag phase. Consult the LM4550 data sheet for details on the serial frame
input/output.
The VHDL files can be downloaded from the course website. The downloadable archive contains additional
files (datasheets) as well as the .ucf file that must be utilized to assign FPGA I/O pins correctly. Its content
is listed here:
# PlanAhead Generated physical constraints
NET "SOURCE[2]" LOC = E4;
NET "SOURCE[1]" LOC = T5;
NET "SOURCE[0]" LOC = R5;
NET "VOLUME[4]" LOC = P12;
NET "VOLUME[3]" LOC = P15;
NET "VOLUME[2]" LOC = C14;
NET "VOLUME[1]" LOC = D14;
NET "clk" LOC = L15;
NET "BIT_CLK" LOC = L13;
NET "SDATA_IN" LOC = T18;
NET "SDATA_OUT" LOC = N16;
NET "SYNC" LOC = U17;
NET "AC97_n_RESET" LOC = T17;
NET "n_reset" LOC = T15;
NET "VOLUME[0]" LOC = A10;
4. Synthesis and FPGA programming
Use ISE WebPack to synthesize the entire design and then program the FPGA. Test the whole system using
a microphone and the audio signal from your favorite YouTube music video connected to the MIC and
LINE IN of the Atlys board. Use the slide switches to select between the two inputs and vary the volume.
5. Lab assignment
Read the datasheets of AC97 and of LM4550 to get an understanding of the serial communication. Read the
provided VHDL code and understand how it works – try to sketch the state graphs of the two FSM’s from
Fig.1 above.
Propose and implement a new VHDL design; you should reuse some or all of the provided VHDL code to do
something different. The given VHDL design hierarchy simply routes the parallel outputs of the controller
back to its parallel inputs. This makes the AC97 pass audio straight through from input to output. This process in the top
level file can be replaced by port mapping user components for various signal processing tasks, for example.
An excellent example is the following voice-recorder design:
http://web.mit.edu/6.111/www/f2008/handouts/labs/lab4.html
The top-level plan is pretty simple: when recording, store the stream of incoming samples in a memory
(inside the FPGA or in the Atlys board's external memory?); when playing back, feed the stored data stream back to the codec.
6. Credits and references
[1] Tony Storey and Scott Larson, AC’97 Codec Hardware Driver Example.
http://eewiki.net/display/LOGIC/AC%2797+Codec+Hardware+Driver+Example
[2] http://www.javiervalcarce.eu/wiki/VHDL_Macro:_DC97#cite_note-0
[3] http://www-mtl.mit.edu/Courses/6.111/labkit/audio.shtml
[4] http://web.mit.edu/6.111/www/f2008/handouts/labs/lab4.html
Lab 7 Supplemental: PS2 Keyboard and UART
1. Objective
The objective of this lab is to learn how to connect a keyboard to the Atlys board, read pressed keys via a
PS2 receiver, and send the key code to the host computer via a UART transmitter. The host computer
displays the pressed key character in HyperTerminal. The PS2 receiver and UART transmitter are
implemented on the Spartan-6 FPGA.
2. Description
To set-up the communication with the host computer:
Download Windows driver from www.exar.com. Type the EXAR part number "XR21V1410". Download
driver and install.
To use Windows' Hyperterminal program, use port settings Bits per second: 19200, Data bits: 8, Parity:
None, Stop bits: 1, Flow control: None
3. Credits and references
[1] P.P. Chu, FPGA Prototyping by VHDL Examples: Xilinx Spartan-3 Version, Wiley 2008.
Lab 8: Interfacing FPGA Spartan-6 with Host Computer via USB
1. Objective
The objective of this lab is to learn one method of implementing communication via USB between the
FPGA (Spartan-6 on Atlys board) and the host computer. This method is based on using an excellent open
source project called FPGALink [1]. Once this lab is completed you should be able to extend this method
and utilize it in any project where you require the computer host to exchange data with the FPGA.
2. Introduction
The Universal Serial Bus (USB) is a specification developed (in the mid-1990s) by Compaq, Intel,
Microsoft and NEC, joined later by Hewlett-Packard, Lucent and Philips. The USB was developed as a new
means to connect a large number of devices to the PC, and eventually to replace the 'legacy' ports (serial
ports, parallel ports, keyboard and mouse connections, joystick ports, midi ports, etc.). USB requires a
shielded cable containing 4 wires.
The USB is based on a “tiered star topology” in which there is a single host controller and up to 127 “slave”
devices. The host controller is connected to a hub, integrated within the PC, which allows a number of
attachment points (referred to as ports). The USB is intended as a bus for devices near to the PC. For
applications requiring distance from the PC, another form of connection is needed, such as Ethernet. Note
however, that USB is not a true bus: only the root hub sees every signal on the bus. This implies there is no
method to monitor upstream communications from a downstream device.
There is a lot of online information describing the USB. As a start, you may want to read [2,3].
In this lab we'll use one of the USB ports available on the Atlys board; that is, the so-called "Adept USB
Port" (see Fig.1), marked as J8 on the board and on the schematic diagram [4]. The USB controller is a
Cypress chip, the CY7C68013A-56 high-speed USB peripheral controller.
Figure 1
3. FPGALink Library
The FPGALink library was developed by Chris McClelland [1]. It provides an end-to-end solution capable
of JTAG-programming the FPGA on a variety of USB-based hardware platforms (including the Atlys board). It
also facilitates communication with the FPGA using a straightforward API on the host side and a standard
FIFO interface on the FPGA side.
The FPGALink library is just a C DLL. So, we would normally embed it in our application, for example
one developed in C/C++ or Python. To help you get started and become familiar with the FPGALink library, the
binary distribution archive also contains a utility (called "flcli") which provides straightforward command-line access to many of the library functions.
In this lab we will:
1) Use the "flcli" utility to demonstrate the host-FPGA communication using a slightly
modified version of an example that comes with the FPGALink library.
2) Build a simple C++ application to utilize the FPGALink library.
3.1 Working Environment Setup
Notes:
-- Steps 1 and 2 are necessary only if you plan to compile the FPGALink library yourself or you are doing this on your
personal home computer. Because we'll use the provided downloadable binaries of this library, these steps
can otherwise be skipped.
-- I have done this lab on Windows (though FPGALink can be used on Linux and Mac too). These steps
refer to Windows.
1) Download and install "Visual C++ Express 2010"
http://www.microsoft.com/visualstudio/en-us/products/2010editions/express#Visual_Studio_2010_Express_Downloads
2) Download and install "Microsoft Visual C++ 2010 Redistributable Package (x86)"
http://www.microsoft.com/en-us/download/details.aspx?id=5555
3) Download "Build Infrastructure", Windows version. This is the environment where we'll work with the
FPGALink library binaries.
http://www.makestuff.eu/wordpress/software/build-infrastructure/
On Windows, unpack the downloaded archive makestuff-win32-20111211.zip in a directory of your choice.
In my case, I did this directly in C:\. This created C:\makestuff\.
4) Download and install "Console 2". Console is a Windows console window enhancement.
http://sourceforge.net/projects/console
Simply unpack the downloaded archive directly in C:\Program Files\
Then create a shortcut to C:\Program Files\Console2\Console.exe
Launch Console 2 and enter "C:\makestuff\msys\bin\sh.exe --login" in the "Shell" box at Edit->Settings->Console
5) Download the latest FPGALink library binaries (at the time of writing this lab, the latest version is
"libfpgalink-20120621.tar.gz (Linux, MacOSX & Win32)"). This is basically the library that we’ll use. If
your course project will require communication with the host, this will turn out to be very handy.
http://www.makestuff.eu/wordpress/software/fpgalink/
Unpack it in C:\makestuff\libs\
6) Download "LibUSB-Win32". libusb-win32 is a port of the USB library libusb
(http://sf.net/projects/libusb/) to 32/64bit Windows (2k, XP, 2003, Vista, Win7, 2008; 98SE/ME for
v0.1.12.2). The library allows user space applications to access many USB devices on Windows.
http://sourceforge.net/projects/libusb-win32/
Plug in the Atlys board and turn the power on. Then run bin/inf-wizard.exe. Click "Next", select your
FPGA board, make a note of the vendor and product IDs (for the Atlys board, 1443 and 0007),
and click "Next" twice. Choose a location for the driver and click "Save". Click "Install Now".
That's all. We are now ready to use FPGALink library! You should now take the time to read the
FPGALink manual:
http://www.swaton.ukfsn.org/docs/fpgalink/vhdl_paper.pdf
FPGALink library comes with two nice examples. Please follow the steps from "README"
(C:\makestuff\libs\libfpgalink-20120621\README) to run either of the examples.
3.2 EXAMPLE #1: Communication Host (flcli utility) – FPGA (simple VHDL design)
A) Description
In this simple example, our application implemented on the FPGA works primarily with four registers,
referred to as R0, R1, R2, and R3. These registers provide the storage space for communicating with the host,
and are associated with four different channels of the communication between host and FPGA.
From the host, writes to R0 are simply displayed on the Atlys board's eight LEDs. Reads from R0 return the
state of the board's eight slide switches. Writes to R1, R2, and R3 are registered and may be read back. The
circuit implemented on the FPGA simply multiplies R1 by R2 and places the result in R3.
A simplified block diagram of the entire system (host + FPGA) is shown in Fig.2 below.
Figure 2 Interfacing the host computer with the FPGA via FPGALink
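The multiplication described above can be sketched in VHDL as follows. This is only an illustration and not the code of the provided top_level.vhdl: the entity name reg_mult and its ports are hypothetical, and I assume 8-bit registers with only the low byte of the 16-bit product kept.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical multiplier stage (not the provided top_level.vhdl).
entity reg_mult is
    Port ( clk : in  STD_LOGIC;
           r1  : in  STD_LOGIC_VECTOR(7 downto 0);   -- value written by the host to R1
           r2  : in  STD_LOGIC_VECTOR(7 downto 0);   -- value written by the host to R2
           r3  : out STD_LOGIC_VECTOR(7 downto 0));  -- value the host reads from R3
end reg_mult;

architecture Behavioral of reg_mult is
    -- An 8-bit by 8-bit unsigned multiplication yields a 16-bit result.
    signal product : unsigned(15 downto 0);
begin
    product <= unsigned(r1) * unsigned(r2);

    -- Register the low byte of the product (assumption: the high byte is dropped).
    process (clk)
    begin
        if rising_edge(clk) then
            r3 <= std_logic_vector(product(7 downto 0));
        end if;
    end process;
end Behavioral;
```

If the full product is needed, one option would be to expose the high byte through an additional register on another channel.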
B) VHDL coding and .xsvf programming file generation
The two VHDL source files (comm_fpga_fx2.vhdl and top_level.vhdl) together with the .UCF file
required to implement the circuit on the FPGA are provided in the downloadable archive of this lab. These
files are modified versions of the VHDL example files from the FPGALink library. top_level.vhdl is also
included in Appendix A at the end of this document.
First, please read these files to understand what they do. Then, create a new ISE WebPack project and add
these files to your project. In my case, I called my new project lab8_usb_fpgalink. The entire directory of
my ISE WebPack project is also included in the downloadable archive of this lab. Synthesize and
implement the design.
Generate .xsvf: Method 1
Because we will be programming the FPGA using the flcli utility provided as part of the FPGALink library,
we need to generate an .xsvf programming file. Recall that the FPGA can be programmed using different
programming file formats including .bit, .svf, and .xsvf. To generate the .xsvf file follow these steps:
Inside ISE WebPack, select “Manage Configuration Project (iMPACT)”, right-click and choose “Run”.
The ISE iMPACT window should pop-up after a few seconds.
Double click on "Boundary Scan" and then File->Initialize Chain. You should get the xc6slx45
"instantiated" in the "Boundary Scan" panel as shown in Fig.3:
Figure 3
Right click on the chip and choose Set target device. Assign a configuration file. This is usually a .BIT
file such as top_level.bit in our case. So, go ahead and select top_level.bit and assign it.
Select from the menu Output->XSVF File->Create XSVF File… Name it and then click OK to save it
in your ISE project directory. In my case I named it lab8_usb_fpgalink.xsvf.
Right click on the chip and choose Program.
The output will be saved to .XSVF file, lab8_usb_fpgalink.xsvf. We’ll use this file to program the
device.
Close ISE iMPACT. Close also the ISE WebPack but keep the Atlys board connected and powered-on.
We now have lab8_usb_fpgalink.xsvf, so we're ready to program the FPGA and to communicate with
it via the flcli utility of the FPGALink library binaries distribution.
Generate .xsvf: Method 2
This is optional and meant for the curious. Until now, we’ve been using Xilinx ISE WebPack tools via the
graphical user interface, the actual ISE. However, these tools can be run using Makefiles at the command
line too. This alternative approach is especially useful when we want to automate and thereby speed-up the
design process: all design steps can be executed via a single makefile. In addition, memory and CPU
resource utilization is better. It is left as an exercise for you to read Xilinx and other documentation [5] and
to write the simplest makefile required to run the whole process of implementing the design of this lab and
to finally generate the .XSVF programming file.
C) Testing and validation of the overall host-FPGA system
Before launching flcli, first create a new folder named gen_xsvf inside C:\makestuff\libs\libfpgalink-20120621\ and
copy lab8_usb_fpgalink.xsvf to it. We'll use the newly created folder, gen_xsvf, to store the .xsvf
programming files of our own projects.
--Connect and power on the Atlys board if it is not already.
--Start a terminal by launching Console 2. We’ll use the flcli utility on the host side.
flcli is a command-line utility, which offers many of the FPGALink library’s features. It is useful for
testing, etc. Read more about it in the FPGALink manual vhdl_paper.pdf:
http://www.swaton.ukfsn.org/docs/fpgalink/vhdl_paper.pdf
--Use flcli utility to program the FPGA. In the Console 2 terminal, do:
> cd libs/libfpgalink-20120621
> ./win32/rel/flcli -v 1443:0007 -i 1443:0007 -s -x gen_xsvf/lab8_usb_fpgalink.xsvf
--Use flcli utility to connect to the FPGALink device 1443:0007 (that is the USB controller on the Atlys
board):
> ./win32/rel/flcli -v 1443:0007 -c
This enters the command-line mode, where we can use the flcli utility’s built-in functions to write and
read the registers we have created on the FPGA. For example, try this:
> w0 13
Observe the LEDs on the Atlys board: they should turn on/off according to the byte written (hex 13 is
binary 00010011). As another example, read the status of the slide switches:
> r0
Write into R1 and R2, then read back R1, R2, and their product in R3:
> w1 02;w2 03
> r1
> r2
> r3
If everything went OK, your Console 2 window should look like Fig.4.
Quit the flcli utility:
> q
Figure 4 Snap-shot of Console 2 window
4. Lab assignment
Implement a project in which you use FPGALink from your own host-side application written in
C/C++ or Python. Your application should open a file file_host2fpga.txt (the file format is one byte in hex
format on each line), read its content line by line, and send each byte to the FPGA to drive the eight LEDs.
Your application should also read the eight slide switches and save their status, in the same format as above,
in file_fpga2host.txt. Append a new line to this file each time the switches change.
To get started, first read the C example provided as part of the FPGALink binaries distribution. This
example is located in: C:\makestuff\libs\libfpgalink-20120621\examples\c
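The file handling part of the assignment can be sketched independently of FPGALink. The Python fragment below is only a minimal sketch: the names parse_hex_lines, format_hex_line, and SwitchLogger are hypothetical, and the actual FPGALink calls that write to and read from channel 0 are deliberately left out (take those from the example shipped with the library).

```python
def parse_hex_lines(text):
    """Return the byte values found in text, one hex byte per line."""
    values = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        token = line.strip()
        if not token:                      # tolerate blank lines
            continue
        value = int(token, 16)             # raises ValueError on non-hex input
        if not 0 <= value <= 0xFF:
            raise ValueError(f"line {line_no}: {token!r} does not fit in one byte")
        values.append(value)
    return values


def format_hex_line(value):
    """Format one byte as a two-digit hex line, the same format as the input file."""
    return f"{value & 0xFF:02X}\n"


class SwitchLogger:
    """Collects a new output line only when the switch value changes."""

    def __init__(self):
        self.last = None
        self.lines = []

    def update(self, value):
        if value != self.last:
            self.lines.append(format_hex_line(value))
            self.last = value


if __name__ == "__main__":
    # Hypothetical content of file_host2fpga.txt
    for byte in parse_hex_lines("13\nA7\nff\n"):
        print(f"would write 0x{byte:02X} to channel 0 (LEDs)")
```

In a complete application, each value returned by parse_hex_lines would be written to channel 0, and each value read from channel 0 would be passed to SwitchLogger.update before appending the collected lines to file_fpga2host.txt.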
5. Credits and references
[1] Chris McClelland, FPGALink: Easy USB to FPGA Communication.
http://www.makestuff.eu/wordpress/software/fpgalink
[2] USB Home: http://www.usb.org/home
[3] USB Made Simple: http://www.usbmadesimple.co.uk/index.html
[4] Atlys schematic diagram. http://www.digilentinc.com/Data/Products/ATLYS/Atlys_C2_sch.pdf
[5] Xilinx’s command line tools user guide.
http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_1/devref.pdf
& others: http://www.demandperipherals.com/docs/CmdLineFPGA.pdf
http://outputlogic.com/xcell_using_xilinx_tools/74_xperts_04.pdf
Appendix A: Content of top_level.vhd file.
--
-- Copyright (C) 2009-2012 Chris McClelland
--
-- This program is free software: you can redistribute it and/or modify
-- it under the terms of the GNU Lesser General Public License as published by
-- the Free Software Foundation, either version 3 of the License, or
-- (at your option) any later version.
--
-- This program is distributed in the hope that it will be useful,
-- but WITHOUT ANY WARRANTY; without even the implied warranty of
-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-- GNU Lesser General Public License for more details.
--
-- You should have received a copy of the GNU Lesser General Public License
-- along with this program. If not, see <http://www.gnu.org/licenses/>.
--
-- Additional changes/comments by Cristinel Ababei, 2012
-- Description:
-- From the host, writes to R0 are simply displayed on the Atlys board's
-- eight LEDs. Reads from R0 return the state of the board's eight slide
-- switches. Writes to R1 and R2 are registered and may be read back.
-- The circuit implemented on the FPGA simply multiplies R1 by R2
-- and places the result in R3. Only reads, from the host side, are allowed
-- from R3; that is, an attempt to write into R3 will have no effect.
-- When you input, from the host side, data into R1 and R2, the data should
-- represent numbers that can be represented on 4 bits only. Because
-- data has to be input in hex (via the flcli application), writing
-- for example A7 into R1 has the same effect as writing 07, because
-- the four MSBs are discarded inside the VHDL application on the FPGA.
--
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity top_level is
    port(
        -- FX2 interface -----------------------------------------------------------------------------
        fx2Clk_in     : in    std_logic;                    -- 48MHz clock from FX2
        fx2Addr_out   : out   std_logic_vector(1 downto 0); -- select FIFO: "10" for EP6OUT, "11" for EP8IN
        fx2Data_io    : inout std_logic_vector(7 downto 0); -- 8-bit data to/from FX2
        -- When EP6OUT selected:
        fx2Read_out   : out   std_logic;                    -- asserted (active-low) when reading from FX2
        fx2OE_out     : out   std_logic;                    -- asserted (active-low) to tell FX2 to drive bus
        fx2GotData_in : in    std_logic;                    -- asserted (active-high) when FX2 has data for us
        -- When EP8IN selected:
        fx2Write_out  : out   std_logic;                    -- asserted (active-low) when writing to FX2
        fx2GotRoom_in : in    std_logic;                    -- asserted (active-high) when FX2 has room for more data from us
        fx2PktEnd_out : out   std_logic;                    -- asserted (active-low) when a host read needs to be committed early
        -- Onboard peripherals -----------------------------------------------------------------------
        led_out       : out   std_logic_vector(7 downto 0); -- eight LEDs
        slide_sw_in   : in    std_logic_vector(7 downto 0)  -- eight slide switches
    );
end top_level;
architecture behavioural of top_level is
    -- Channel read/write interface -----------------------------------------------------------------
    signal chanAddr : std_logic_vector(6 downto 0);  -- the selected channel (0-127)
    -- Host >> FPGA pipe:
    signal h2fData  : std_logic_vector(7 downto 0);  -- data lines used when the host writes to a channel
    signal h2fValid : std_logic;                     -- '1' means "on the next clock rising edge, please accept the data on h2fData"
    signal h2fReady : std_logic;                     -- channel logic can drive this low to say "I'm not ready for more data yet"
    -- Host << FPGA pipe:
    signal f2hData  : std_logic_vector(7 downto 0);  -- data lines used when the host reads from a channel
    signal f2hValid : std_logic;                     -- channel logic can drive this low to say "I don't have data ready for you"
    signal f2hReady : std_logic;                     -- '1' means "on the next clock rising edge, put your next byte of data on f2hData"
    -- ----------------------------------------------------------------------------------------------
    -- Needed so that the comm_fpga_fx2 module can drive both fx2Read_out and fx2OE_out
    signal fx2Read  : std_logic;
    -- Registers implementing the channels
    signal reg0, reg0_next : std_logic_vector(7 downto 0) := x"00";
    signal reg1, reg1_next : std_logic_vector(7 downto 0) := x"00";
    signal reg2, reg2_next : std_logic_vector(7 downto 0) := x"00";
    signal reg3, reg3_next : std_logic_vector(7 downto 0) := x"00";
begin
    -- BEGIN_SNIPPET(registers)
    -- Infer registers
    process(fx2Clk_in)
    begin
        if ( rising_edge(fx2Clk_in) ) then
            --checksum <= checksum_next;
            reg0 <= reg0_next;
            reg1 <= reg1_next;
            reg2 <= reg2_next;
            reg3 <= reg3_next;
        end if;
    end process;
    -- Drive register inputs for each channel when the host is writing
    reg0_next <= h2fData when chanAddr = "0000000" and h2fValid = '1' else reg0;
    reg1_next <= h2fData when chanAddr = "0000001" and h2fValid = '1' else reg1;
    reg2_next <= h2fData when chanAddr = "0000010" and h2fValid = '1' else reg2;
    reg3_next <= std_logic_vector(unsigned(reg1(3 downto 0)) * unsigned(reg2(3 downto 0)));
    -- Select values to return for each channel when the host is reading
    with chanAddr select f2hData <=
        slide_sw_in when "0000000", -- return status of slide switches when reading R0
        reg1        when "0000001",
        reg2        when "0000010",
        reg3        when "0000011",
        x"00"       when others;
    -- Assert that there's always data for reading, and always room for writing
    f2hValid <= '1';
    h2fReady <= '1';
    -- CommFPGA module
    fx2Read_out <= fx2Read;
    fx2OE_out <= fx2Read;
    fx2Addr_out(1) <= '1'; -- Use EP6OUT/EP8IN, not EP2OUT/EP4IN.
    comm_fpga_fx2 : entity work.comm_fpga_fx2
        port map(
            -- FX2 interface
            fx2Clk_in      => fx2Clk_in,
            fx2FifoSel_out => fx2Addr_out(0),
            fx2Data_io     => fx2Data_io,
            fx2Read_out    => fx2Read,
            fx2GotData_in  => fx2GotData_in,
            fx2Write_out   => fx2Write_out,
            fx2GotRoom_in  => fx2GotRoom_in,
            fx2PktEnd_out  => fx2PktEnd_out,
            -- Channel read/write interface
            chanAddr_out   => chanAddr,
            h2fData_out    => h2fData,
            h2fValid_out   => h2fValid,
            h2fReady_in    => h2fReady,
            f2hData_in     => f2hData,
            f2hValid_in    => f2hValid,
            f2hReady_out   => f2hReady
        );
    -- LEDs
    led_out <= reg0;
end behavioural;
--END_SNIPPET(registers)
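To predict what the flcli commands w0, w1, w2, r0 to r3 should return, the channel logic of top_level.vhd can be mirrored by a small behavioural model on the host. The Python sketch below is an illustration only: the class name TopLevelModel and its methods are hypothetical, clocking is abstracted away (R3 is treated as available immediately), and the model follows the description in the header comment of the listing.

```python
class TopLevelModel:
    """Host-side behavioural model of the four channels of top_level.vhd."""

    def __init__(self, switches=0x00):
        self.reg0 = 0x00          # drives the eight LEDs
        self.reg1 = 0x00
        self.reg2 = 0x00
        self.switches = switches & 0xFF

    @property
    def reg3(self):
        # Only the four LSBs of R1 and R2 take part in the multiplication.
        return (self.reg1 & 0x0F) * (self.reg2 & 0x0F)

    @property
    def leds(self):
        return self.reg0

    def write(self, chan, value):
        value &= 0xFF
        if chan == 0:
            self.reg0 = value
        elif chan == 1:
            self.reg1 = value
        elif chan == 2:
            self.reg2 = value
        # writes to channel 3 (and any other channel) have no effect

    def read(self, chan):
        if chan == 0:
            return self.switches  # reading R0 returns the slide switches
        if chan == 1:
            return self.reg1
        if chan == 2:
            return self.reg2
        if chan == 3:
            return self.reg3
        return 0x00


if __name__ == "__main__":
    model = TopLevelModel()
    model.write(1, 0x02)
    model.write(2, 0x03)
    print(f"r3 -> {model.read(3):02X}")
```

With R1 = 02 and R2 = 03, as in the flcli session of this lab, the model returns 06 for R3.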
Lab 9: Video Interfaces: HDMI and DVI
1. Objective
The objective of this lab is to learn how to transmit High-Definition Multimedia Interface (HDMI) and
Digital Visual Interface (DVI) data streams to HDMI and DVI capable monitors. The top-level design in
this lab displays a simple colored pattern. In addition, we will learn how to create ISE WebPack projects
that use both VHDL and Verilog source files.
2. Introduction
The Atlys board contains four HDMI ports (see Fig.1), including two buffered – via the TI’s TMDS141
buffers – HDMI input/output ports (type A connector), one buffered HDMI output port (type D connector),
and one unbuffered port that can be input or output [1].
Figure 1 Illustration of the four HDMI ports of the Atlys board (left). HDMI Male to DVI-D Female
Rotating Adapter (top right). HDMI connectors (bottom right)
Since the HDMI and DVI systems use the same transition-minimized differential signaling (TMDS)
standard, a simple adaptor, shown on the right-hand side of Fig.1 (available at most electronics stores such as
TigerDirect [2]), can be used to drive a DVI connector from either of the HDMI output ports. The HDMI
connector does not include VGA signals, so analog VGA displays cannot be driven. The Atlys board does
not have a VGA connector. For examples on how to drive VGA monitors from the Atlys board, please
see the supplemental material of this lab on the course’s website.
In this lab we will drive HDMI and DVI capable monitors to display a colored pattern. For this we’ll use a
Verilog project developed by Bob Feng of Xilinx [3]. We’ll modify Bob’s project by converting part of the
Verilog code to VHDL. In this way, this lab becomes a good opportunity for you to create ISE projects that
use both Verilog and VHDL source files. By comparing two files, Verilog and VHDL, that implement the
same functionality, you also get a first exposure to Verilog.
Transition-minimized differential signaling (TMDS):
- TMDS is a method for transmitting high-speed serial data and is used by the DVI and HDMI video
interfaces, as well as other digital communication interfaces.
- Developed by Silicon Image Inc. as a member of the Digital Display Working Group.
- The transmitter incorporates an advanced coding algorithm which reduces electromagnetic interference over
copper cables and enables robust clock recovery at the receiver to achieve high skew tolerance.
- TMDS uses 4 channels: Red, Green, Blue, Clock.
- TMDS encoding is a two-stage process that converts an input of 8 bits into a 10-bit code.
o TMDS signaling uses a twisted pair for noise reduction.
o Current Mode Logic (CML), DC coupled and terminated to 3.3 Volts.
o 3 twisted pairs are used to transfer video data, each carrying a different RGB component.
o 8-bit data transmission plus 2 bits of control signals.
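The two-stage 8-bit to 10-bit conversion listed above can be made concrete with a short software model. The Python sketch below follows the video-data encoding algorithm as published in the DVI 1.0 specification (stage 1 minimizes transitions with XOR/XNOR chaining, stage 2 conditionally inverts the byte to keep the running disparity near zero). It is a sketch for experimentation, not the code of the XAPP495 encoder; the function names are hypothetical and the control-period symbols are not modeled.

```python
def tmds_encode(byte, cnt=0):
    """Encode one 8-bit value; return (10-bit symbol, updated running disparity)."""
    d = [(byte >> i) & 1 for i in range(8)]
    ones = sum(d)

    # Stage 1: transition minimization (bit 8 records XOR = 1 or XNOR = 0).
    use_xnor = ones > 4 or (ones == 4 and d[0] == 0)
    q_m = [d[0]]
    for i in range(1, 8):
        bit = q_m[i - 1] ^ d[i]
        q_m.append(bit ^ 1 if use_xnor else bit)
    q_m.append(0 if use_xnor else 1)

    # Stage 2: DC balancing (bit 9 records whether bits 0..7 were inverted).
    n1 = sum(q_m[:8])
    n0 = 8 - n1
    if cnt == 0 or n1 == n0:
        invert = q_m[8] == 0
        cnt += (n0 - n1) if invert else (n1 - n0)
    elif (cnt > 0 and n1 > n0) or (cnt < 0 and n0 > n1):
        invert = True
        cnt += 2 * q_m[8] + (n0 - n1)
    else:
        invert = False
        cnt += -2 * (1 - q_m[8]) + (n1 - n0)

    low = [b ^ 1 if invert else b for b in q_m[:8]]
    bits = low + [q_m[8], int(invert)]
    return sum(b << i for i, b in enumerate(bits)), cnt


def tmds_decode(symbol):
    """Recover the original 8-bit value from a 10-bit data symbol."""
    q = [(symbol >> i) & 1 for i in range(10)]
    low = [b ^ q[9] for b in q[:8]]          # undo the stage-2 inversion
    d = [low[0]]
    for i in range(1, 8):
        bit = low[i] ^ low[i - 1]
        d.append(bit if q[8] else bit ^ 1)   # undo XOR or XNOR chaining
    return sum(b << i for i, b in enumerate(d))


if __name__ == "__main__":
    disparity = 0
    for value in (0x00, 0x55, 0xFF):
        symbol, disparity = tmds_encode(value, disparity)
        print(f"{value:02X} -> {symbol:010b}, running disparity {disparity}")
```

Decoding each symbol with tmds_decode returns the original byte, which is a convenient self-check when comparing a VHDL or Verilog encoder against this model.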
3. Brief HDMI Description
What is HDMI:
- HDMI is the first and only industry-supported, uncompressed, all-digital audio/video interface.
- HDMI is a compact audio/video interface for transferring uncompressed digital audio/video data from an
HDMI-compliant device ("the source") to a compatible digital audio device, computer monitor, video
projector, or digital television.
- HDMI provides an interface between any A/V source, such as a set-top box, DVD player, or A/V receiver,
and an audio and/or video monitor, such as a digital television (DTV), over a single cable.
- HDMI is a digital replacement for existing analog standards such as composite video, S-Video, SCART,
component video, and VGA.
- HDMI supports standard, enhanced, or high-definition video, plus multi-channel digital audio on a single
cable.
- HDMI transmits all ATSC HDTV standards and supports 8-channel, 192kHz, uncompressed digital audio, all
currently available compressed formats, and lossless digital audio formats, with bandwidth to spare to
accommodate future enhancements and requirements.
- Like Cat5, HDMI passes a data signal, not an RF signal as CATV does.
- DVI is HDMI without the audio: a separate cable is needed for audio.
HDMI communication channels (see Fig.2):
HDMI has three physically separate communication channels, which are the TMDS, DDC, and the optional
CEC:
- The HDMI cable and connectors carry four differential pairs that make up the TMDS data and clock
channels.
o Audio, video and auxiliary data are transmitted across the three TMDS data channels.
o A TMDS clock, typically running at the video pixel rate, is transmitted on the TMDS clock channel.
- HDMI carries a VESA DDC (Display Data Channel) channel. The DDC is used for configuration and
status exchange between a single transmitter and a single receiver.
o The DDC is used by the transmitter to read the receiver’s Enhanced Extended Display Identification
Data (E-EDID) in order to discover the receiver’s configuration and capabilities.
- The optional CEC (Consumer Electronics Control) protocol provides high-level control functions
between all of the various audiovisual products in a user’s environment.
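Since the TMDS clock runs at the pixel rate and each 8-bit value becomes a 10-bit code, the serial bit rate on each TMDS data channel is ten times the pixel clock. The short Python sketch below works this out; the function name is hypothetical, and the 720p60 numbers (1650 x 750 total pixels per frame, including blanking) are used only as a familiar example, not as the timing of this lab's design.

```python
def tmds_rates(h_total, v_total, refresh_hz):
    """Return (pixel clock in Hz, serial bit rate per data channel in bit/s)."""
    pixel_clock_hz = h_total * v_total * refresh_hz   # one TMDS clock per pixel
    bit_rate = 10 * pixel_clock_hz                    # 10-bit code per pixel
    return pixel_clock_hz, bit_rate


if __name__ == "__main__":
    # Example: 1280x720 at 60 Hz, with total frame size 1650 x 750
    pixel_clock, bit_rate = tmds_rates(1650, 750, 60)
    print(f"pixel clock: {pixel_clock / 1e6:.2f} MHz")
    print(f"bit rate per channel: {bit_rate / 1e6:.1f} Mb/s")
```

This is why the design needs a serializer and a clock several times faster than the pixel clock on the output side.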
Advantages of HDMI:
- Because HDMI is a digital interface, it provides the best video quality, since it avoids the lossy
analog-to-digital conversions required by all analog connections (such as component or S-Video).
Digital video will be sharper.
- A single cable carries both video and audio.
- HDMI devices supporting High-bandwidth Digital Content Protection (HDCP) have the comfort of
knowing they will have access to premium HD content now and in the future.
Figure 2 HDMI Signals
Compatibility with DVI:
HDMI is backward-compatible with single-link Digital Visual Interface digital video (DVI-D or DVI-I, but
not DVI-A). No signal conversion is required when an adapter or asymmetric cable is used, so there is no
loss of video quality. From a user's perspective, a DVI-D monitor would have the same level of basic
interoperability unless there are content protection issues with High-bandwidth Digital Content Protection
(HDCP), not supported by DVI, or the HDMI color encoding is in component color space YCbCr which is
not supported by DVI, instead of RGB.
Because discussing HDMI is not the main purpose of this lab, you may want to take some time to search
and read more about HDMI on the Internet. There is plenty of information out there. Here are some starting
pointers [4].
4. ISE WebPack Project
Xilinx Application Note
One challenge of working with HDMI is that there are few, if any, design examples in the
public domain. So, one would need to do many things from scratch, which becomes very challenging in the
case of HDMI. One available design example, however, is XAPP495 [3]. This is a good start as it was created to
work with the Atlys board. But there are a few issues. One is that, generally, HDMI design requires
significant effort and attention to many details. Another one is that it implements neither EDID (monitor
identification data) nor audio. These will not stop us from using it. However, XAPP495 is written in
Verilog, while in this course we focus on VHDL. Hence, putting this lab together required some code
conversion from Verilog to VHDL. Before continuing, you should take some time now and read the XAPP495
paper [3] (also included in the downloadable archive of this lab).
Driving DVI and HDMI Monitors to Display a Colored Pattern
While XAPP495 provides examples of both DVI transmitters and receivers, in this lab we focus only
on the transmitter part. Specifically, we create a design that drives HDMI and DVI monitors to display a
colored bar pattern. The design uses all Verilog files related to the transmitting part from the
xapp495 archive (downloadable from Xilinx). The main exception is the top-level file, which I
replaced with a VHDL version of it. During the Verilog-to-VHDL conversion, I had to make some
other minor changes inside syncro.v and serdes_n_to_1.v to work around the fact that apparently
parameters in Verilog modules cannot be instantiated as generics in VHDL (at least not with the ISE
WebPack?). The VHDL top-level file is simply a VHDL counterpart of the vtc_demo.v file from the
xapp495 archive. The name of the new VHDL top-level file is vtc_demo.vhd.
At this time, you should open both files vtc_demo.vhd (located in lab9_files_ISE folder) and vtc_demo.v
to read and compare them. Notice similarities and differences between VHDL and Verilog. Also, notice how
Verilog modules are declared as components and instantiated in the VHDL top-level file. Comments
inserted inside vtc_demo.vhd provide additional information on the main elements of the design.
The block diagram of the design entity described in vtc_demo.vhd is shown in Fig.3. While reading the
VHDL top-level file, try to identify the signals and components corresponding to those from Fig.3.
The design is based on several IP cores that are available on the Spartan-6 FPGA [5]. These include (see [5]
for description of each):
IBUF - input buffer
BUFIO2 - Dual Clock Buffer and Strobe Pulse
BUFG - Global Clock Buffer
SRL16E - 16-Bit Shift Register Look-Up Table (LUT) with Clock Enable
OSERDES2 - Dedicated IOB Output Serializer
DCM_CLKGEN - Digital Clock Manager
PLL_BASE - Basic Phase Locked Loop Clock Circuit
BUFPLL - PLL Buffer
OBUFDS - Differential Signaling Output Buffer
Figure 3 Block diagram of the “SMPTE HD Color bar Generation with Programmable Video Timing”
designed by Bob Feng of Xilinx.
Create a new ISE WebPack project and add to it all the Verilog and VHDL files in the lab9_files_ISE folder.
These files, together with other useful files (such as the .ucf file), are included in the downloadable archive
with all the data for this lab. Synthesize and implement the design. Download the bitstream to the FPGA board
and test. To test the design, we need to attach a monitor to the HDMI OUT (J2) port of the Atlys board. An
HDMI monitor can be connected directly using an HDMI cable. To connect a DVI monitor (like most of
today’s monitors) we need an HDMI to DVI converter; I got mine from TigerDirect [2] for $10. For this
lab, the TA will have one such converter for you to take turns and use; however, if your project in this
course involves using a monitor, you may want to buy your own converter.
After setting everything up, you should see your monitor display the colored pattern shown in Fig.4.
Figure 4 HDMI (left, 7” TFT) and DVI (right, 20” LCD) monitors display colored bar pattern
5. Lab Assignment
Convert to VHDL the hdcolorbar module described in the hdclrbar.v file. You must create a new VHDL file,
hdclrbar.vhd, inside which you must describe the hdcolorbar design entity in VHDL (similarly to how I
converted the top-level Verilog module to a top-level VHDL entity). Then use the VHDL file to replace the
Verilog file in the ISE WebPack project.
Optional (this is very challenging; do not attempt before talking to the instructor): implement the whole
design in VHDL. Moreover, design and describe in VHDL your own entities for as many IP cores as
possible. That is, create your own VHDL entities to replace SRL16E, BUFPLL, etc., to improve the
portability of the design to other FPGAs.
6. Credits and references
[1] Digilent Atlys board reference manual;
http://www.digilentinc.com/Data/Products/ATLYS/Atlys_rm.pdf
[2] CTG HDMI Male to DVI-D Female Rotating Adapter; from TigerDirect;
http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=3444838&CatId=467
[3] Bob Feng, Implementing a TMDS Video Interface in the Spartan-6 FPGA, Xilinx app note;
http://www.xilinx.com/support/documentation/application_notes/xapp495_S6TMDS_Video_Interface.pdf
[4] HDMI and DVI pointers:
--HDMI resource center; http://www.hdmi.org/learningcenter/
--Wikipedia HDMI introduction; http://en.wikipedia.org/wiki/HDMI
--HDMI specification document Version 1.3; see file included in this lab archive;
--HDMI Hider: TI TMDS141 (datasheet of the chip on the Atlys board);
http://www.ti.com/lit/ds/symlink/tmds141.pdf (also included in the archive of this lab);
--HDMI connectors A,B pinouts; http://pinouts.ru/Video/hdmi_pinout.shtml
--Wikipedia DVI introduction; http://en.wikipedia.org/wiki/Digital_visual_interface
--DVI 1.0 specification; http://www.ddwg.org/lib/dvi_10.pdf
--DVI pinouts; http://pinouts.ru/Video/dvi_pinout.shtml
--Wikipedia TMDS introduction; http://en.wikipedia.org/wiki/Transition-minimized_differential_signaling
[5] Spartan-6 Libraries Guide for HDL Designs (BUFG, BUFIO2, SRL16E, PLL_BASE, etc.);
http://www.xilinx.com/support/documentation/sw_manuals/xilinx12_4/spartan6_hdl.pdf (also included in
the archive of this lab);
Lab 10: PicoBlaze – an embedded microcontroller
1. Objective
The objective of this lab is to utilize PicoBlaze, an 8-bit microcontroller embedded into the FPGA fabric,
to implement a simple circuit that takes as input two 4-bit binary numbers, a and b (set via the slide
switches of the Atlys board), and computes a^2 + b^2; the result is displayed on the 8 LEDs of the Atlys
board. As part of this lab, you will learn assembly language, which is used to code the algorithm that
PicoBlaze must execute.
2. Preparation
As part of this lab preparation you must fully read chapters 14 and 15 of Pong P. Chu’s book [1].
This lab is created based on those two chapters. In addition, you should read additional materials included
in the downloadable archive with all the files for this lab [2,3]. Please allocate enough time to do this
especially if you do not have prior experience with microcontroller architectures and/or assembly
languages.
The PicoBlaze processor is a compact 8-bit microcontroller core for Xilinx FPGA devices. It is freely
provided as a cell-level HDL description (referred to also as a soft-core) and can be synthesized along with
other logic as part of a bigger digital design. It is optimized for efficiency and occupies very little area. It is
recommended to be utilized for simple data-processing and control applications. Single or multiple copies
of the PicoBlaze processor can be easily integrated into larger systems to add flexibility to FPGA-based
designs. The PicoBlaze design was originally named KCPSM which stands for “Constant(K) Coded
Programmable State Machine” (formerly “Ken Chapman's PSM”). Ken Chapman is the Xilinx systems
designer who devised and implemented the microcontroller.
Note that there is also MicroBlaze, another soft processor core designed by Xilinx for its
FPGAs. However, it is not free, and you would need Xilinx’s EDK (Embedded Development
Kit) tool to be able to build MicroBlaze embedded processor systems in Xilinx FPGAs.
PicoBlaze is based on an 8-bit RISC architecture and can reach speeds of up to 100 MIPS on the Virtex-4
FPGA family. The processor has 8-bit address and data ports for access to a wide range of peripherals.
The latest version (available for download on Xilinx’s website for registered users) is KCPSM6. Its main
characteristics include:
- Only 26 Slices plus program memory (BRAM).
- Performance 52 MIPS to 120 MIPs depending on device family and clock rate.
- Supports programs up to 4K instructions.
- 32 General Purpose Registers arranged in 2 banks.
- 256 General Purpose Input Ports.
- 256 General Purpose Output Ports.
- 16 Constant-Optimised Output Ports.
- 64-bytes of scratch pad memory expandable to 128 and 256-bytes (additional 2 and 6 Slices).
- Fully automatic CALL/RETURN stack supporting nested subroutines to 30 levels.
- Interrupt with user definable interrupt vector and maximum response time of 4 clock cycles.
- Power saving features including 'sleep' mode.
- Superset of KCPSM3 with high degree of code compatibility.
3. Lab Description
In this lab, we design a digital system that uses a PicoBlaze microcontroller to compute a^2 + b^2, where a and b
are two 4-bit numbers input via the slide switches of the Atlys board. The result is displayed on the 8 LEDs of
the Atlys board. To do this, we follow the steps described below and illustrated in Fig.1.
Figure 1 Block diagram of example1 design
Step 1: Determine the software-hardware partition
Decide about the structure of the design. In this example, the functionality of our design is very simple, so
we only need a single instance of the PicoBlaze microcontroller. Basically it’s an all software
implementation.
Step 2: Develop the assembly program for the software portion
Because of its simplicity, PicoBlaze cannot effectively support high-level programming languages (such as
C) and the code is generally developed in assembly language. Developing a complete assembly program
consists of the following steps:
-1-Derive the pseudocode of the main program.
-2-Identify tasks in the main program and define them as subroutines. If needed, continue refining the
complex subroutines and divide them into smaller subroutines.
-3-Determine the register and data RAM use.
-4-Write the assembly code for the subroutines.
The main program usually has the following structure:
call initialization_routine
forever:
call task_1_routine
call task_2_routine
...
call task_n_routine
jump forever
The result of steps 1, 2, and 4 is the assembly program in file example1_sio_rom.psm that you find in the
example1/ folder of the downloadable archive of this lab. Please read it thoroughly to see what it does and
how it achieves all the tasks. The structure of the main program is:
call clear_data_ram
forever:
call read_switch
call square
call write_led
jump forever
Step 3 above is unique for assembly code development because we must manually allocate the data storage
in assembly code. In this example, the allocation of the data RAM is done as shown in Fig.2.
Figure 2 Allocation of data RAM
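Before reading the assembly code, it can help to have a reference model of what the read_switch, square, and write_led loop is expected to produce. The Python sketch below is only an illustration under stated assumptions: it assumes a sits in the lower four switches and b in the upper four, it models the multiplication with a shift-and-add routine (a common approach on processors without a multiply instruction), and it assumes that only the lower 8 bits of the result reach the LEDs. Check example1_sio_rom.psm for what the actual program does.

```python
def mult_shift_add(x, y):
    """Unsigned multiplication using only shifts and additions."""
    product = 0
    while y:
        if y & 1:          # add the shifted multiplicand for each set bit of y
            product += x
        x <<= 1
        y >>= 1
    return product


def sum_of_squares(switches):
    """Compute a^2 + b^2 from one 8-bit switch value (assumed nibble order)."""
    a = switches & 0x0F           # assumption: a on the lower four switches
    b = (switches >> 4) & 0x0F    # assumption: b on the upper four switches
    return mult_shift_add(a, a) + mult_shift_add(b, b)


def led_value(switches):
    """Value visible on eight LEDs, assuming the upper bits are dropped."""
    return sum_of_squares(switches) & 0xFF


if __name__ == "__main__":
    for sw in (0x34, 0xFF):
        print(f"switches {sw:02X}: a^2+b^2 = {sum_of_squares(sw)}, LEDs = {led_value(sw):08b}")
```

Note that the largest possible result, 15^2 + 15^2 = 450, needs 9 bits, so some results cannot be shown in full on eight LEDs.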
Step 3: Compiling with KCPSM6
The assembly code (file example1_sio_rom.psm in our example) is placed in the same folder (say
example1/ in our case) with the assembler, which is kcpsm6.exe that is part of the PicoBlaze files
downloaded from Xilinx. Also in the same folder we must place the file ROM_form.vhd (also part of the
PicoBlaze files downloaded from Xilinx). This is a template used by kcpsm6.exe.
Invoke a DOS window, navigate to the project directory, and run the program. Type:
kcpsm6 example1_sio_rom.psm
After successful compilation, several files are created. The one that we need is the one that contains the
block RAM VHDL entity that we’ll plug into our top-level design. In this example this file is
example1_sio_rom.vhd.
Step 4: Create the ISE WebPack project and test
Use the following source files to create a new ISE project and then implement the top-level design.
kcpsm6.vhd – the VHDL file of the PicoBlaze microcontroller (comes with PicoBlaze files from Xilinx)
example1_sio_rom.vhd – the instruction ROM entity; contains basically our program to be executed
example1_top_level.vhd – top-level description of our design
example1_top_level.ucf – you should know what this is
Generate bitstream file and download it to the FPGA. Test operation and comment.
4. Lab Assignment
Use the PicoBlaze microcontroller to design a circuit that implements a Binary-to-BCD converter. Write the
assembly code, compile, create a top-level design, implement, and test on the Atlys board. Use 8 slide switches
to input an 8-bit binary number. The BCD code should drive the 8 LEDs.
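A software reference model is useful for checking the output of your converter. The Python sketch below uses the shift-and-add-3 ("double dabble") method, which is one common way to do the conversion with only shifts, comparisons, and additions; it is not necessarily the method you must use, and the function name is hypothetical. It returns up to three BCD digits packed four bits each; how to present them on eight LEDs is part of your design.

```python
def binary_to_bcd(value, digits=3):
    """Convert an 8-bit binary number to packed BCD (4 bits per decimal digit)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    bcd = 0
    for bit in range(7, -1, -1):
        # Before each shift, add 3 to every BCD digit that is 5 or more,
        # so that doubling it carries correctly into the next digit.
        for d in range(digits):
            if ((bcd >> (4 * d)) & 0xF) >= 5:
                bcd += 3 << (4 * d)
        # Shift left by one and bring in the next binary bit (MSB first).
        bcd = (bcd << 1) | ((value >> bit) & 1)
    return bcd


if __name__ == "__main__":
    for n in (9, 10, 99, 255):
        print(f"{n:3d} -> BCD {binary_to_bcd(n):03X}")
```

Printing the packed result in hex shows the decimal digits directly, which makes it easy to compare against the value read from the board.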
Optional: Read also chapters 16 and 17 of P.P. Chu’s book and then implement some more complex design
of your choice.
5. Credits and references
[1] Pong P. Chu, FPGA Prototyping by VHDL Examples: Xilinx Spartan-3 Version, Wiley 2008.
[2] Ken Chapman, PicoBlaze for Spartan-6, Virtex-6 and 7-Series (KCPSM6), USER GUIDE (comes with
PicoBlaze files from Xilinx).
[3] Ken Chapman, PicoBlaze 8-Bit Microcontroller for Virtex-E and Spartan-II/IIE Devices. Xilinx
Application Note. http://www.xilinx.com/support/documentation/application_notes/xapp213.pdf
Lab 11: Single Cycle Computer (SCC)
1. Objective
The objective of this lab is to design and utilize a completely functional single-cycle computer (SCC),
both in simulation and implemented on the Atlys board. We implement in VHDL the SCC
described in Chapter 9 of the Mano and Kime book [1]. The goal of this exercise is to get insights into the
design and operation of a simple computer (control + datapath) and to set the stage for more advanced
exercises regarding processor pipelining.
2. Preparation
As part of this lab preparation you must read Chapter 9 of the M. Morris Mano and Charles R. Kime
book [1]. You should read this chapter as well as the complete VHDL implementation provided as part of
the downloadable archive of this lab. When reading the VHDL code, identify each block of the top-level
block diagram of the SCC shown in Fig.1 below.
Figure 1 Top-level diagram of the Single-Cycle Computer (SCC)
3. Lab Description
In this lab, we verify the operation of the SCC both in simulation and in hardware on the Atlys board.
More precisely, we set it up to execute the following program description:
------
bring 1 from data mem loc 0 (1 is hard coded inside data mem)
and place it in R4; then add it with constant 3 to get 4 and
place 4 in R5; then add R5 and R4 that is 4 + 3 = 7 and place it
in R6; finally store R6 into mem loc 1; so, finally data mem
at location 1 should contain the result 7;
This is accomplished by the following small program:
LD   $4, $0
ADDI $5, $4, 3
ADD  $6, $5, $4
ST   $4, $6
Which corresponds to the following, stored directly into the instruction memory:
0010000100000000
1000010101100011
0000010110101100
0100000100110000
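To relate the four machine words to the assembly listing, each 16-bit word can be split into its fields. The Python sketch below assumes the instruction format of the Mano and Kime book [1]: a 7-bit opcode in bits 15-9, followed by three 3-bit fields DR, SA, and SB (SB doubles as the immediate operand OP). The function and table names are hypothetical, and the provided VHDL implementation remains the authority on how each field is actually used.

```python
# Opcodes of the four instructions used in this lab (book's encoding assumed)
MNEMONICS = {
    0b0010000: "LD",
    0b1000010: "ADDI",
    0b0000010: "ADD",
    0b0100000: "ST",
}


def decode(word_bits):
    """Split a 16-character binary string into (opcode, DR, SA, SB)."""
    if len(word_bits) != 16:
        raise ValueError("expected a 16-bit word")
    word = int(word_bits, 2)
    opcode = word >> 9          # bits 15..9
    dr = (word >> 6) & 0b111    # bits 8..6, destination register
    sa = (word >> 3) & 0b111    # bits 5..3, source register A
    sb = word & 0b111           # bits 2..0, source register B or immediate
    return opcode, dr, sa, sb


if __name__ == "__main__":
    program = [
        "0010000100000000",
        "1000010101100011",
        "0000010110101100",
        "0100000100110000",
    ]
    for word in program:
        opcode, dr, sa, sb = decode(word)
        name = MNEMONICS.get(opcode, "?")
        print(f"{word}  {name:<4} DR={dr} SA={sa} SB/OP={sb}")
```

Comparing the printed fields with the register numbers in the assembly listing is a quick way to check a hand-assembled program before placing it into the instruction memory.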
Part 1: Aldec Active-HDL project
Download the lab files from the course website and create your own Aldec Active-HDL project to simulate and
verify the operation of the SCC with the above program. You should be able to see waveforms like those
shown in Fig.2.
Figure 2 Snapshot of simulation to show 7 as the final result.
Part 2: ISE WebPack project
Use the files from the folder corresponding to the ISE WebPack project (from the downloaded files) and
create your own ISE project. Synthesize and implement the design. Generate the bitstream file and
download it to the FPGA. Observe the final result of the above program execution, 7, also displayed on the
LEDs. Test operation and comment.
4. Lab Assignment
While fully operational, the current single-cycle computer (SCC) is one of the simplest. Here is a list of
suggested tasks that you may want to work on as a follow-up to this lab:
- Write a new program and place it into the instruction memory. Implement the whole design again and
verify that it works correctly. The program should find the maximum among three numbers
pre-stored in the first three locations of the data memory. Could such a program be implemented with
fewer than 16 instructions (that is, the current size of the instruction memory of the provided SCC)?
- Replace the data and instruction memories with BRAMs specific to the Spartan-6 FPGA. Read Lab 5 to
recall how to do that.
- Implement a fully structural description of the Function Unit and then compare the best achievable
operating frequency with the current one.
- Enhance the ISA of the SCC with new instructions and make the necessary changes in the control unit
and/or datapath.
- Aside from helping you get a feel for how a single-cycle computer works, this VHDL
implementation is a useful platform for gaining additional insight into pipelining. Modify the
VHDL source code to implement a pipelined version of the SCC as discussed in the last part of
Chapter 9 of the Mano and Kime book [1].
- Read also Chapters 10 and 11 of this book and I am sure you will have your own ideas about how to
enhance this simple SCC.
5. Credits and references
[1] M. Morris Mano and Charles Kime, Logic and Computer Design Fundamentals, Pearson Prentice Hall,
4th Edition, 2008.