Creating a 12 x 8 MAC Using VHDL and the Xilinx CORE Generator

For Academic Use Only
Introduction
In this lab, you will create a 12-bit x 8-bit MAC (multiplier-accumulator) using a combination of
VHDL and the Xilinx CORE Generator. You will create a multiplier unit in VHDL and an
accumulator using the CORE Generator, and then connect them together in the top-level design. This
lab familiarizes you with the Xilinx CORE Generator and the Xilinx implementation tools by
having you generate the accumulator as an IP core. This lab is completed using the Xilinx ISE 6
software. You will use a typical VHDL flow to black-box (instantiate) the core in a top-level
piece of VHDL code, run a functional HDL simulation, synthesize your design with XST, and
take the synthesized design through the Xilinx implementation tools. You will then verify the
functionality of the design on-chip using Chipscope Pro.
Note: For this lab, you do not need to know VHDL because the top-level VHDL file is provided.
There is a completed example in c:\xup\dsp_flow\labs\lab2\lab1_soln.
Objectives
After completing this lab, you will be able to:
• Generate a CORE Generator macro
• Simulate a piece of VHDL containing a CORE Generator macro
• Synthesize the VHDL and black-box instantiations using XST
• Implement a synthesized design through the Xilinx implementation tools
Design Description
Use VHDL and the CORE Generator to create a 12 x 8 MAC that has the following characteristics:
• Multiplier input data widths of 12 bits and 8 bits, both signed
• Multiplier output width of 20 bits
• Accumulator output width of 27 bits
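The widths above can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the lab files; the helper names (signed_range, bits_needed) are hypothetical. It shows why a signed 12 x 8 product needs 20 bits and how many worst-case products a 27-bit accumulator can absorb before it overflows.

```python
def signed_range(bits):
    """Return (min, max) of a two's-complement number that is `bits` wide."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def bits_needed(lo, hi):
    """Smallest two's-complement width that can hold every value in [lo, hi]."""
    bits = 1
    while True:
        w_lo, w_hi = signed_range(bits)
        if w_lo <= lo and hi <= w_hi:
            return bits
        bits += 1

A_BITS, B_BITS, ACC_BITS = 12, 8, 27   # widths taken from the design description

a_min, a_max = signed_range(A_BITS)    # -2048 .. 2047
b_min, b_max = signed_range(B_BITS)    # -128 .. 127

# The extreme products occur at the corners of the two input ranges.
corners = [a * b for a in (a_min, a_max) for b in (b_min, b_max)]
p_min, p_max = min(corners), max(corners)

product_bits = bits_needed(p_min, p_max)      # 20

# Worst case: every product equals p_max = (-2048) * (-128) = 262144.
acc_min, acc_max = signed_range(ACC_BITS)
safe_accumulations = acc_max // p_max         # 255

print(product_bits, safe_accumulations)
```

Under these assumptions, the 7 extra accumulator bits allow at least 255 consecutive worst-case products before the 27-bit result can wrap.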
Procedure
This lab comprises nine primary steps: you will start the Project Navigator and open the project;
create a 12 x 8 multiplier unit using VHDL; generate an accumulator core using the CORE Generator;
add the CORE Generator macro to the provided VHDL code; synthesize the design using XST;
insert the ILA and ICON cores into the MAC design; implement the MAC design; use the Chipscope Pro Analyzer to configure the FPGA and specify match units and trigger conditions; and then
perform an on-chip verification. Below each general instruction for a given procedure, you will
find accompanying step-by-step directions and illustrated figures providing more detail for
performing the general instruction. If you feel confident about a specific instruction, feel free to
skip the step-by-step directions and move on to the next general instruction in the procedure.
Note: If you are unable to complete the lab at this time, you can download the lab files for this
module from the Xilinx University Program site at http://university.xilinx.com
Start the Project Navigator and Open the Project
Step 1
Launch the ISE Project Navigator and open the mac_cgen project.
• Open the Xilinx ISE 6 software: go to Start Menu → Programs → Xilinx ISE 6 → Project Navigator
• Open the mac_cgen project: in the Project Navigator, select File → Open Project
• Browse to c:\xup\dsp_flow\labs\lab1 using the pull-down arrow
• Open the mac_cgen folder and select the mac_cgen.npl project file
• Click OK
Generate the VHDL Code for the Multiplier
Step 2
Open the mac_cgen.vhd file and modify it to perform the 12 x 8 multiply
operation. Refer to the block diagram in Figure 21-1 to understand the provided code.
The comments in the code will guide you through this step. Spend 15 minutes
working on your VHDL code, then move on and use the solution provided in the
lab1_soln directory.
• Open the mac_cgen.vhd file: in the Sources in Project window, double-click mac_cgen.vhd
• Read through the VHDL file and add code to the following sections:
  “Generating the Multiplier”
• Select mac_cgen.vhd in the Sources in Project window
• In the Processes for Current Source window, expand Synthesis
• Double-click the Check Syntax option to perform a syntax check
• Fix any reported errors
Generate an Accumulator Using the CORE Generator
Step 3
Generate an accumulator by invoking the CORE Generator through the project.
Make sure that the input data is signed and the output width is 27 bits.
• Create a new source: select Project → New Source, or right-click and choose New Source
Figure 12-1. Adding a New Source to an ISE Project
• Select IP (CoreGen & Architecture Wizard), type accum in the File Name field, and click Next
Figure 12-2. Adding a CORE Generator to Your ISE Project.
• The Select Core Type dialog box is displayed. Expand Math Functions and then Accumulators
Figure 12-4. Selecting Multiply Accumulators function.
• Select Accumulator, click Next, and then click Finish
• Fill in the following options in the Accumulator GUI and click Generate to create the
accumulator:
  - Component Name: accum
  - Operation: Add
  - Port B Input Options: Port B Width 20; signed
  - Output Options: Width 27, Registered
      Register Options: Clock Enable and Asynchronous Clear
  - Create RPM: checked
  - Select Display Core Footprint (bottom right of the GUI)
Figure 12-6. Accumulator Options.
• You will see a pop-up window indicating that the accum core was generated successfully.
Click OK to invoke the Core Viewer
• The shape of the generated core should look like the following
Figure 12-9. Core Viewer of the Multiplier Accumulator.
1. Fill in the following information from the Core Viewer window:
Number of CLBs wide:
Number of CLBs tall:
Number of slices:
• Close the Core Viewer and the CORE Generator by clicking the DISMISS button
Note: For a detailed explanation of the output files, see Help → Online Documentation → CORE Generator Guide, Chapter 3, Using the CORE Generator.
The section listing inputs and outputs describes the input and output files in detail.
Note: An accum.xco file will be added to your project in the mac_cgen hierarchy.
Adding the CORE Generator Macro into VHDL Code
Step 4
Using the ISE Language Template, instantiate the accumulator macro,
accum, in the supplied top-level VHDL file mac_cgen.vhd.
• Double-click the VHDL file mac_cgen.vhd in the Sources in Project window
• Open the Language Template by clicking its icon, or select Edit → Language Template
• Expand the Coregen → VHDL folder, and select the accum template
A template similar to the one shown below appears:
Figure 12-10. Selecting the accum template.
• Using the template, add the component declaration between the architecture and begin
statements as indicated in the mac_cgen.vhd file
• Using the template, add the instance of accum to the mac_cgen.vhd file
• Change the instance name to U2
• Connect the ports of accum to the appropriate signals
• Check the syntax and correct any errors before proceeding to the next step
Synthesize the Design Using XST
Step 5
Synthesize the mac_cgen.vhd design using the Xilinx Synthesis Technology (XST)
tool with default options.
• Remove the my_mac.xco file from the project.
• Select the mac_cgen.vhd file in the Sources in Project window
• Run synthesis: right-click Synthesis in the Processes for Current Source window and
select the Run option
• If there are any errors, view the synthesis report: expand Synthesis, right-click View Synthesis Report,
and choose the View option
• Fix any errors and re-synthesize; otherwise, continue to the next step
Implement the MAC design
Step 6
Implement your mac_cgen.vhd design using the Xilinx implementation tools and
view the Post-Place & Route Static Timing Report. Make sure that the settings
are as follows:
  - Device Family: Spartan3
  - Device: xc3s200
  - Speed Grade: -4
  - Package: FT256
• Right-click Implement Design and choose the Run option, or double-click Implement Design
2. Which netlist files do the Xilinx implementation tools use for the accum black box?
• View the placed design in the FPGA Editor by selecting View/Edit Routed Design (FPGA Editor) under Place & Route
Figure 12-11. Opening the FPGA Editor.
• Close the FPGA Editor when you are finished
• Use the Place and Route report and the Text-based Post-Place & Route Static Timing Report to answer the following question:
3. Fill in the information requested below.
Number of Slices:
Number of Block Multipliers:
Number of Block RAMs:
Number of BUFGMUXs:
Number of external IOBs:
Maximum clock frequency:
Create New Chipscope-Pro Project
Step 7
Create a new Chipscope Pro project through the Project Navigator.
• Select Project → New Source in the Project Navigator to open the New Source dialog, click
Chipscope Definition and Connection, and enter the name mac_cs. Click <Next> to continue.
Figure 12-12. Add New Chipscope Source
• Select Chipscope Definition and Connection from the list, enter mac_cs as the file name,
and click <Next>.
• Select mac_cgen as the source. Click <Next> and then <Finish>. A Chipscope Pro source
will be added to the Sources in Project window.
Figure 12-13. Chipscope Definition and Connection
ILA Core Parameters and Connections
Step 8
Insert an ICON and ILA core into the design netlist using the Chipscope Pro Core Inserter.
Connect the output of the accumulator to the trigger and input data ports of the ILA core.
• Double-click the mac_cs.cdc file in the Sources in Project window to open the Core Inserter
project.
Figure 12-14. Chipscope Pro Core Inserter
Projects saved in the Core Inserter hold all relevant information about source files, destination
files, core parameters, and core settings.
• Click <Next>. Leaving the Disable JTAG Clock BUFG Insertion option unchecked, click New ILA
Unit. Notice in the left-hand window that an instance of the ILA core, U0:ILA, is added to the system.
Figure 12-15. Insert the New ILA Unit
Note: Disabling the JTAG clock BUFG insertion causes the ISE tools to route the JTAG clock
using normal routing resources instead of global clock routing resources. This option should
only be selected if global routing resources are scarce.
• Click <Next> to set up the trigger parameters
Each ILA or ILA/ATC core can have up to 16 separate trigger ports that can be set up
independently. Each trigger port is a bus made up of 1 to 256 individual signals (bits)
and can be connected to 1 to 16 match units. A
match unit is a comparator that is connected to a trigger port and is used to detect events on that
trigger port. The results of one or more match units are combined to form the overall
trigger condition event that controls the capturing of data. The comparisons, or
match functions, that a trigger port's match units can perform depend on the type of match
unit. The ILA and ILA/ATC cores support six types of match units.
• Set the ILA trigger parameters as follows and then click <Next>:
  Trigger Input and Match Unit Settings
    - Number of trigger ports: 2
    - TRIG0:
        Trigger width: 1
        # Match Units: 1
        Counter Width: disabled
        Match type: extended
    - TRIG1:
        Trigger width: 1
        # Match Units: 1
        Counter Width: disabled
        Match type: extended
  Trigger Condition Settings
    - Enable Trigger Sequencer: checked
    - Max Number of Sequencer Levels: 2
  Storage Qualification Condition Settings
    - Enable Storage Qualification: unchecked
Figure 12-16. Trigger Parameters
The maximum number of data sample words that the ILA core can store in the sample buffer is
called the data depth. The data depth determines the number of data width bits contributed by
each block RAM unit used by the ILA unit. The maximum number of data sample words that can
be captured depends on the number and size of block RAM, which varies according to device
family and density.
• Set the following options and click <Next>:
  - Data Depth: 512
  - Sample On: Rising clock edge
  - Data Same as Trigger Port: unchecked
  - Data Width: 47
Figure 12-17. Capture Parameters
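The relationship between data depth, data width, and block RAM usage can be estimated with a short calculation. This is a rough Python sketch under one stated assumption: that each Spartan-3 block RAM can be configured in the aspect ratios listed in the code (parity bits included). It ignores any extra bits the ILA core may store per sample, and the function name estimate_block_rams is hypothetical.

```python
import math

# Assumption (not stated in the lab): available depth x width configurations
# of a single Spartan-3 block RAM, with parity bits counted as data.
BRAM_ASPECT_RATIOS = [(512, 36), (1024, 18), (2048, 9),
                      (4096, 4), (8192, 2), (16384, 1)]

def estimate_block_rams(data_depth, data_width):
    """Estimate how many block RAMs a sample buffer of this shape needs."""
    for depth, width in BRAM_ASPECT_RATIOS:
        if data_depth <= depth:
            # Each block RAM contributes `width` bits of every sample word.
            return math.ceil(data_width / width)
    raise ValueError("data depth is larger than this simple model supports")

lab_estimate = estimate_block_rams(512, 47)      # settings used in this lab
deeper_estimate = estimate_block_rams(1024, 47)  # doubling the depth

print(lab_estimate, deeper_estimate)
```

With the lab settings the estimate is two block RAMs. Doubling the depth halves the bits each block RAM can contribute per sample, so the estimate rises to three.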
The net connections tab allows you to choose the signals to connect to the ILA or ILA/ATC core.
If trigger is separate from data, then clock, trigger, and data must be specified. Connections that
have not been made will appear in red.
Figure 12-18. Net Connections
• Click the Modify Connections tab
Figure 12-19. Net Connections
This dialog provides an easy interface for choosing nets to connect to the ILA, ILA/ATC, or ATC2
cores. The hierarchical structure of the design can be traversed using the Structure/Nets pane. All
the nets of the selected hierarchy level appear in the table in the lower left pane. The
Clock Signals and Trigger/Data Signals tabs show the net connections between the design and
the ILA core.
• With the Clock Signals tab under Net Selections selected, highlight the entry for clk_int and
click the Make Connections button to connect the clock signal in the design to the clock port
of the ILA core.
Figure 12-20. Connect the clock
• Click the Trigger/Data Signals tab and make the following connections in each of the subtabs:
  - TP0:CH:0 → nd_reg
  - TP1:CH:0 → clr_IBUF
Figure 12-21. Trigger Signal Connection
• Click the Data Signals tab, make the following connections, and then click <OK>:
  - CH:0 to CH:11 → a_reg<0> to a_reg<11>
  - CH:12 to CH:19 → b_reg<0> to b_reg<7>
  - CH:20 to CH:46 → Q_0_OBUF to Q_26_OBUF
Figure 12-22. Data Signal Connections
• You will notice that the Clock and Trigger ports under Net Connections are shown in
black, indicating valid connections. Click Return to Project Navigator and save the file.
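The data channel assignments made in this step add up to the 47-bit data width chosen earlier (12 + 8 + 27). As a minimal Python sketch, with hypothetical names (DATA_PORT_FIELDS, pack_sample, unpack_sample), the layout of one captured sample word can be modelled as follows; the field names A, B, and Q_int match the bus names used later in the Analyzer.

```python
# (bus name, first data channel, width in bits) for the 47-bit ILA data port
DATA_PORT_FIELDS = [("A", 0, 12), ("B", 12, 8), ("Q_int", 20, 27)]
DATA_WIDTH = sum(width for _, _, width in DATA_PORT_FIELDS)

def pack_sample(values):
    """Pack signed field values into one data-port word (channel 0 = bit 0)."""
    word = 0
    for name, lsb, width in DATA_PORT_FIELDS:
        word |= (values[name] & ((1 << width) - 1)) << lsb
    return word

def unpack_sample(word):
    """Split a data-port word back into signed (two's-complement) fields."""
    result = {}
    for name, lsb, width in DATA_PORT_FIELDS:
        raw = (word >> lsb) & ((1 << width) - 1)
        if raw & (1 << (width - 1)):      # sign bit set: value is negative
            raw -= 1 << width
        result[name] = raw
    return result

sample = {"A": -3, "B": 5, "Q_int": -15}
word = pack_sample(sample)
print(DATA_WIDTH, unpack_sample(word))
```

Interpreting each field as two's complement corresponds to setting the bus radix to signed decimal in the waveform viewer.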
Implement the MAC Design
Step 9
Implement your mac_cgen.vhd design using the Xilinx implementation tools to
generate a bitstream for downloading to the FPGA.
• Right-click Generate Programming File and choose the Rerun All option.
Figure 12-23. Generate the Programming File
Note: This runs the design through Place and Route. You will notice green check marks (or
warning exclamation marks) next to the processes that have finished successfully. It also runs
Post-Place & Route Static Timing and generates the static timing report.
• Use the Place and Route report and the Text-based Post-Place & Route Static Timing Report to answer the following question:
4. Fill in the information requested below.
Number of Slices:
Number of Block Multipliers:
Number of Block RAMs:
Number of BUFGMUXs:
Number of external IOBs:
Maximum clock frequency:
Setup Chipscope-Pro Analyzer Options
Step 10
The Chipscope Pro Analyzer tool interfaces directly to the ICON, ILA, ILA/ATC, IBA/OPB,
IBA/PLB, VIO, and ATC2 cores. You can configure your device, choose triggers, set up the
console, and view the results of the capture on the fly. The data views and triggers can be
manipulated in many ways, providing an easy and intuitive interface for determining the functionality
of the design. Using the Analyzer, you will configure the FPGA, specify the match units, and then
set up the trigger conditions.
• Open the Chipscope Pro Analyzer by going to Start → Programs → Chipscope Pro 6.3 → Chipscope
Pro Analyzer
• Connect the download cable to the PC parallel port and the JTAG connection of the Spartan-3
board, and then power up the board.
• Click the Open Cable/Search JTAG Chain button
Figure 12-24. Establish JTAG Connection
The Spartan-3 board contains two devices in the JTAG chain: The Spartan-3 XC3S200 and a
Platform Flash PROM XCF00S. Impact will detect these devices and list the device names along
with Instruction Register (IR) Lengths and Device ID Codes.
Figure 12-25. Impact Detects Devices in JTAG Chain
• Click <OK>. Right-click the Spartan-3 device, indicated as DEV: 0 MyDevice0
(XC3S200), and select Configure.
Figure 12-26. Download Program File to FPGA
• Click Select New File, browse to the project directory, and select the bitstream file
mac_cgen.bit.
The Chipscope Pro Analyzer interface consists of four parts:
  - Project Tree in the upper part of the split pane on the left side of the window
  - Signal Browser in the lower part of the split pane on the left side of the window
  - Message pane at the bottom of the window
  - Main window area
Figure 12-27. Chipscope Pro Analyzer
Each Chipscope Pro ILA, ILA/ATC, and IBA core has its own Trigger Setup window, which
provides a graphical interface for setting up triggers. The trigger mechanism inside
each Chipscope Pro core can be modified at run time without having to recompile the design.
There are three components to the trigger mechanism:
  - Match Functions: define the match or comparison value of each match unit
  - Trigger Conditions: define the overall trigger condition based on a binary
    equation or sequence of one or more match functions
  - Capture Settings: define how many samples to capture, how many capture
    windows to use, and the position of the trigger in those windows
In this design, you will set up the triggers to capture 256 samples of both inputs to the
multiplier and the output of the accumulator.
• Specify the Match Units as follows:
  - Radix (both trigger ports): binary
  - M0:TriggerPort0: Function ==; Value 1
  - M1:TriggerPort1: Function ==; Value 1
Figure 12-28. Setup the Match Units
You will now set up the trigger condition equation to capture samples after the following
conditions occur in this order:
1. clear is asserted
2. enable signal nd_reg is asserted
• Click the field under Trigger Condition Equation, select the Sequencer tab, and
specify the following options to generate the equation M1 → M0, and then click
<OK>.
  - Number of Levels: 2
  - Level 1: M1
  - Level 2: M0
Figure 12-29. Trigger Condition Equation
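The effect of the two-level sequence M1 → M0 can be illustrated with a small behavioural model. This is a Python sketch, not the ILA core's actual implementation. It assumes that the second level is only evaluated on samples after the one that satisfied the first level, and the function name and the sample waveforms are made up for illustration.

```python
def sequencer_trigger_index(m1_samples, m0_samples):
    """Return the sample index where the sequence M1 -> M0 completes, or None.

    m1_samples: per-sample result of match unit M1 (clr_IBUF == 1)
    m0_samples: per-sample result of match unit M0 (nd_reg == 1)
    """
    level = 0                      # 0: waiting for M1, 1: waiting for M0
    for index, (m1, m0) in enumerate(zip(m1_samples, m0_samples)):
        if level == 0 and m1:
            level = 1              # clear seen, now wait for the enable
        elif level == 1 and m0:
            return index           # enable seen after clear: trigger
    return None

# Hypothetical waveforms: the enable is already high before the clear pulse,
# is held in reset while clear is asserted, and rises again afterwards.
clr    = [0, 1, 1, 0, 0, 0]
nd_reg = [1, 0, 0, 1, 1, 1]

trigger_at = sequencer_trigger_index(clr, nd_reg)
print(trigger_at)
```

In this model the enable being high at sample 0 does not trigger a capture, because the sequencer is still waiting for the clear. The trigger fires at sample 3, the first enable after the clear.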
Perform on-chip Hardware Verification
Step 11
In the next steps, you will combine the signals into buses in the waveform viewer to make it
easier to view the results of on-chip debug. You will then perform on-chip verification and use
the waveform viewer to verify the operation of the MAC design.
• Perform the following actions to create buses that represent the A and B multiplier inputs
and the Q_int accumulator output:
  - Select signals DataPort[0] through DataPort[11] so that they are highlighted
  - Right-click the highlighted signals and select Add to Bus → New Bus
  - Right-click the newly created bus, BUS_0, and rename it to A
  - Select signals DataPort[12] through DataPort[19] so that they are highlighted
  - Right-click the highlighted signals and select Add to Bus → New Bus
  - Right-click the newly created bus, BUS_1, and rename it to B
  - Select signals DataPort[20] through DataPort[46] so that they are highlighted
  - Right-click the highlighted signals and select Add to Bus → New Bus
  - Right-click the newly created bus, BUS_2, and rename it to Q_int
Figure 12-30. Create Buses
• Right-click each of the buses and set the radix to signed decimal.
• Click the Apply Settings and Arm Trigger button
Figure 12-31. Apply Settings and Arm Trigger
• Push the switch SW0 on the Spartan-3 board to the “on” position. This switch
enables the design.
• Press and release button BTN0 to set off the trigger and capture data samples in the ILA
buffer. Pressing the button resets the following registers:
  - A and B registered inputs of the multiplier
  - Registered enable nd_reg
  - Accumulator
Releasing the reset button connects nd_reg to the nd input pin, which is connected to switch
SW0.
Figure 12-32. Verification Results
Once triggered, the ILA core captures the results in block RAM and the ICON core
routes the results back to the PC via the JTAG connection. The results are shown in the
waveform view. Notice the multiply-accumulate operations:
1 x 1 = 1
(1 x 1) + (2 x 2) = 5
(1 x 1) + (2 x 2) + (3 x 3) = 14
etc.
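The running sums above can be reproduced with a short behavioural model of the MAC. This is a Python sketch only, with hypothetical function names. It assumes that the accumulator wraps around in 27-bit two's complement on overflow and that the operand pairs are the ones shown in the listing.

```python
def wrap_signed(value, bits):
    """Wrap an integer into the range of a `bits`-wide two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value

def mac_sequence(pairs, acc_bits=27):
    """Return the accumulator value after each multiply-accumulate step."""
    acc = 0                                   # accumulator cleared by reset
    outputs = []
    for a, b in pairs:
        acc = wrap_signed(acc + a * b, acc_bits)
        outputs.append(acc)
    return outputs

results = mac_sequence([(1, 1), (2, 2), (3, 3)])
print(results)   # [1, 5, 14]
```

Comparing the Q_int bus in the waveform view against such a model is a quick way to check that the captured samples match the expected accumulation.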
Conclusion
In this lab, you learned the basic design flow for incorporating CORE Generator
macros into VHDL code. You generated a CORE Generator macro, simulated a design
that contains CORE Generator macros, and synthesized that design using XST. You then ran the
synthesized design through the Xilinx implementation tools and viewed how the core is
implemented using the FPGA Editor. During the last steps, you inserted the ILA and ICON cores
into the design and performed an on-chip verification using the Chipscope Pro Analyzer.
Answers
The Core Viewer Result:
Figure 12-10. Core Viewer Results.
1. Fill in the following information from the Core Viewer window:
Number of CLBs wide: 1
Number of CLBs tall: 7
Number of slices: 14
2. Which netlist files do the Xilinx implementation tools use for the accum black box?
The accum.edn (EDIF) netlist file, which is generated by the CORE Generator
3. Fill in the information requested below.
Number of Slices: 25
Number of Block Multipliers: 1
Number of Block RAMs: 0
Number of BUFGMUXs: 1
Number of external IOBs: 30
Maximum clock frequency: ~180 MHz
4. Fill in the information requested below.
Number of Slices: 304
Number of Block Multipliers: 1
Number of Block RAMs: 2
Number of BUFGMUXs: 2
Number of external IOBs: 30
Maximum clock frequency: ~112 MHz