Virtual Chess Opponent

ECE532 Project:
Virtual Chess Opponent
Mandheerej S. Nandra
Eugen Tkachenko
Tuesday, April 13, 2004
Table of Contents
1. Overview
1.1 Objective
1.2 Component Overview
1.2.1 Video Recognition System
1.2.2 Chess Control System
1.2.3 Chessboard
1.3 Overall System
1.4 Project Block Diagram
2. Video Recognition System
2.1 Vidcap core
2.1.1 Hardware design
2.1.2 Simulation
2.2 External Memory Controller
2.3 GPIO for Controlling Video Capture (2 bits)
2.4 GPIO’s for Chessboard Registers
2.5 Recognition Software
2.5.1 Camera Setup
2.5.2 Piece Recognition
3. Chess Control System
3.1 ZBT Controllers
3.2 GPIO’s for Chessboard Registers
3.3 GPIO for Controlling VGA Port
3.4 Timer and Interrupt Controller
3.5 Chess Control Software
4. Chessboard
4.1 Power Supply
4.2 Control Electronics
5. Project Outcomes
5.1 Video Recognition System Outcomes
5.2 Chessboard and Control System Outcomes
5.3 Unresolved Issues
5.4 Further Work
6. Design Tree Description
Appendix A: Video Capture Core
A1: vidcap.vhd code listing:
A2: vidcap_v2_1_0.mpd listing:
Appendix B: Video capture core simulation
B1: OPB bus functional language test script (vidcap.bfl)
B2: Modified device instantiation
B3: Video test sequence
B4: Sample simulation results:
Appendix C: Electronics
C1: L298 dual H-bridge IC
C2: LM741 op-amp
C3: VGA connector pinout
C4: Motor signal decoder circuit and operation
1. Overview
1.1 Objective
The goal of this project is to produce a fully autonomous chessboard. It will not
only be able to recognize when and how a human player physically moves a piece, but
will also compute a reply and move the appropriate piece(s) unassisted.
1.2 Component Overview
There are three main components to this project: the video recognition system,
which determines the user’s movement of a piece; the chess control system that
calculates the computer’s move and controls piece movement; and finally, the chessboard
itself.
1.2.1 Video Recognition System
The video recognition system consists of a camera and video capture hardware
coupled with an image recognition processor. Details of this system are in section 2, and
a diagram can be found in section 1.4.
A camera mounted above the chessboard sends a live video signal to the
multimedia board’s video decoder (Analog Devices ADV7185 [1]), which outputs YCbCr
data in digital format. The data is then sent to the video capture core, which writes the
data to off-chip ZBT memory via an On-chip Peripheral Bus (OPB #1).
The contents of the memory (the framebuffer) are analyzed by a Microblaze
processor that runs simple image recognition software, and the results are written to the
chessboard registers, which are in fact two instances of the GPIO IP [2] connected to OPB
#1.
1.2.2 Chess Control System
The chess control system consists of a processor with significant software as well
as the hardware needed to output the appropriate signals to the chessboard. Details of this
system are in section 3, and a diagram can be found in section 1.4.
The processor is a second instantiation of the Microblaze IP, with two banks of
ZBT memory connected to another OPB bus (OPB #2). A GPIO instance attached to this
bus is connected to the various pins of the VGA encoder (Fairchild Semiconductor
FMS3810 [3]). The software on this processor reads the chessboard registers from two
GPIO instances connected to OPB #2 and uses a third-party chess engine to determine the
computer’s move. The signals needed to move the chess pieces are then written to
the GPIO instance controlling the VGA encoder, and the process repeats when the
chessboard registers have changed.
[1] Datasheet can be found at http://www.analog.com/UploadedFiles/Data_Sheets/436811360ADV7185_0.pdf
[2] Datasheet can be found in the hw\XilinxProcessorIPLib\pcores\opb_gpio_v2_00_a\doc directory of a
Xilinx EDK 6.1 installation
[3] Datasheet can be found at http://www.fairchildsemi.com/ds/FM/FMS3810.pdf
1.2.3 Chessboard
The chessboard was built as part of a high-school project by Mandheerej Nandra
and his former classmate, Nelson Hu, in 2000. The majority of this project, then, is
focused on the previous two systems; nevertheless, some additional electronics were
required to accept input from the VGA port of the multimedia board, as the chessboard
was originally designed to interface with the parallel port of a PC. These are described in
section 4.
1.3 Overall System
Section 1.4 below contains the block diagram, illustrating the entire project and
how its components are interconnected. The first two components are implemented on
the Xilinx FF896 Microblaze and Multimedia Development Board (henceforth referred to
as the multimedia board). The board contains all the components necessary to implement
these two systems, but since it does not have any general purpose digital output ports, the
VGA output signals of the board were used to interface with the chessboard.
The video recognition system is entirely independent of the chess control
system and even uses a different bus. The two are connected only by four 32-bit
registers, each holding the required information for two rows of the chessboard (see
section 2.4 for the storage format). Only 2 bits are needed for each square, so
64 squares x 2 bits = 128 bits describe the standard 8x8 chessboard. The chess control
system polls these registers for any change.
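The polling step can be sketched in C as follows. This is only an illustration and not the project's code: the function name board_changed and the snapshot arrays are assumptions, and on the real system the current values would be read from the GPIO input registers on OPB #2.

```c
#include <stdint.h>
#include <stdbool.h>

#define NUM_BOARD_REGS 4  /* four 32-bit registers, two board rows each */

/* Compare a fresh snapshot of the chessboard registers with the previous
 * one. Returns true and updates `previous` if any register differs. */
bool board_changed(uint32_t previous[NUM_BOARD_REGS],
                   const uint32_t current[NUM_BOARD_REGS])
{
    bool changed = false;
    for (int i = 0; i < NUM_BOARD_REGS; i++) {
        if (previous[i] != current[i]) {
            previous[i] = current[i];
            changed = true;
        }
    }
    return changed;
}
```

A main loop would call board_changed() repeatedly and only start move detection when it returns true.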
1.4 Project Block Diagram
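[Figure: project block diagram. The Xilinx Microblaze and Multimedia Development board
carries a Xilinx Virtex-II XC2V2000 FPGA, the video decoder, and three banks of ZBT RAM
(one of them the framebuffer). Inside the FPGA, the video recognition system (video capture
core, GPIO, and the Microblaze1 video processor on OPB #1) and the chess control system
(the Microblaze2 chess processor on OPB #2) are linked by the chessboard registers (GPIO).
The video camera over the chessboard feeds the video input; the VGA output drives the
chessboard electronics, which drive the chessboard.]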
2. Video Recognition System
This system is based on the zbt_test system supplied to us by Leslie Shannon, and
is located in the vicap directory. A video capture core was created and added to the
system, as well as several GPIO instances (see sections 2.3 and 2.4). Finally, the program
code in the zbt_test system was replaced with image recognition software.
2.1 Vidcap core
The vidcap core is a custom core that takes the raw data from the video decoder
chip and stores the valid data in external memory that is attached to the same OPB bus as
this core.
2.1.1 Hardware design
The input video stream from the video decoder has special codes that identify
horizontal and vertical blanking signals, as well as whether the current field contains the
even or odd lines. The code in XAPP286 [4] from Xilinx can decode these
inserted codes, and hence was used as the base for this core. The video core was written
in VHDL, so this code was converted from its original Verilog format.
The video core is also required to write image data to memory, and to do this it
must be a master device that issues requests on the OPB and follows the OPB data
transfer protocol, as defined by IBM [5]. This is the main addition that was made to the
XAPP286 code.
Appendix A1 contains the VHDL code used in the video core, which is commented to
explain the various sections. Of particular note is that the Mn_request signal (used to
request a transfer on the OPB) will go high as soon as the data is ready, but will only go
low synchronous to the OPB clock. This is accomplished through the use of MstReq2 and
MstReq3, as explained in the code. This reduces the latency of the bus transfer. Since the
core tries to write one 32-bit value every 4 cycles of the video clock, this extra time can
be crucial. This code makes no attempt to start writing at the beginning of a frame and
stop at the end of one. This is because chess is a very slow game with a mostly static
image being captured. Whether or not the core writes the same image over itself several
times is irrelevant for the purposes of this project.
Appendix A2 shows the corresponding MPD file. This is used to let Xilinx
Platform Studio know how the signals in the VHDL file are to be connected with the rest
of the system. The C_FBADDR parameter is the only additional parameter in this core
beyond those needed to be an OPB device, and it specifies the address of the framebuffer.
For this project, it was set to 0x80105000.
[4] XAPP286 can be found in the following ZIP file:
http://www.xilinx.com/products/boards/multimedia/docs/examples/Video_Input_Processing_Examples.zip
[5] Found at http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/9A7AFA74DAD200D087256AB30005F0C8/$file/OpbBus.pdf
2.1.2 Simulation
The vidcap core was thoroughly simulated before it was downloaded to the FPGA
and tested. To do this, the IBM OPB Toolkit [6] was used to simulate the OPB bus
interface. This toolkit has a VHDL entity called “opb_device” that is a model for either
a master or a slave device. The model engages in OPB bus transactions as described by
commands in a simple bus functional language (BFL). A script converts these commands
to a ModelSim .do file, and the commands in this file are executed during simulation to
define the behaviour of each “opb_device” instance.
Appendix B1 shows the listing of the test BFL script “vidcap.bfl” that was used in
the simulation. This particular script initializes the arbiter as well as one OPB slave
device, namely “opb_device2”. The latter is then set up to accept many transactions, as
the vidcap core will be making many writes to memory. Different delays are used to
observe the behaviour of the hardware under various circumstances. After conversion to a
.do file, some additional commands were added to control the enabling of the
Appendix B2 shows a section of the “opb_complex”, the master testbench file
provided by the toolkit, that was modified to incorporate a “vidcap” instance as opposed
to another “opb_device” instance. There are some signals there that are either not part of
the Xilinx implementation of OPB or have no usage in this testing scenario, so they are
simply set to the appropriate constant.
Finally, the “opb_complex” entity was modified so that it would provide stimulus
to the “vidcap” instance through its “YCrCb_in” and “vidcap” ports. The stimulus was
generated through a custom Perl script that outputs VHDL code for each byte of input to
“YCrCb_in”. A portion of this code is listed in Appendix B3.
Running the simulation in ModelSim yielded results similar to those found in
Appendix B4. After some debugging, the various signals were found to behave as
expected, and when this core was connected to the system, it functioned without any
problems.
2.2 External Memory Controller
The video system employs an opb_emc [7] (external memory controller) core that
controls a single memory block (memory block 2) of external ZBT memory on the
multimedia board. In this project, the C_MEM0_BASEADDR parameter is set to
0x80100000, and the C_MEM0_HIGHADDR parameter is set to 0x801FFFFF. This
creates a 1 MB address space that is big enough to hold the framebuffer, which is about
0.8 MB. C_SYNCH_MEM0 is set to 1, and C_OPB_CLK_PERIOD_PS is set to 37037
to match the OPB clock.
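The arithmetic behind these parameters can be checked with a few lines of C. This is a hedged sketch: the helper names range_bytes and clock_hz are illustrative and are not part of the project sources.

```c
#include <stdint.h>

/* Size in bytes of an inclusive address range [base, high]. */
uint32_t range_bytes(uint32_t base, uint32_t high)
{
    return high - base + 1u;
}

/* Clock frequency in Hz for a clock period given in picoseconds. */
double clock_hz(uint32_t period_ps)
{
    return 1.0e12 / (double)period_ps;
}
```

With the values quoted above, range_bytes(0x80100000, 0x801FFFFF) gives 1,048,576 bytes (1 MB), range_bytes(0x80105000, 0x801FFFFF) gives the 1,028,096 bytes available from the framebuffer address onward, and a period of 37037 ps corresponds to roughly 27 MHz.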
[6] Acquired from http://www.idt.mdh.se/kurser/ct3410/ibm_cc_2_9/tk/OPB_TOOLKIT2X_052401.tar.gz
[7] Datasheet can be found in the hw\XilinxProcessorIPLib\pcores\opb_emc_v1_10_b\doc directory of a
Xilinx EDK 6.1 installation
A dcm_module [8] (digital clock manager module) core is also needed to
synchronize the memory’s clock with the data. Default parameters are used with the
exception of C_CLKFX_MULTIPLY, C_CLKIN_PERIOD, and
C_EXT_RESET_HIGH, which were set to 1, 37.037, and 0 respectively.
All the necessary pin assignments are located as usual in system.ucf.
2.3 GPIO for Controlling Video Capture (2 bits)
A 2-bit GPIO core (opb_gpio_1) is used by the program to control the video capture
core. The “GPIO_d_out” port is connected to the “inflags” port of the vidcap core.
Although only one bit is used to enable and disable the vidcap core, the vector definition
makes it easy to add more configuration flags should the need arise in the future.
Since this GPIO instance is used strictly for output, the C_TRI_DEFAULT
parameter must be set to 0, and the C_DOUT_DEFAULT should also be set to 0 to avoid
framebuffer writing on startup. Of course, C_WIDTH is set to 2 to make this 2 bits wide.
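Enabling and disabling capture from software could look like the following C sketch. It rests on two assumptions that the report does not state: that bit 0 of the 2-bit flag vector is the enable bit, and that the GPIO data register is reached through a pointer (on the Microblaze this would be the memory-mapped data register of opb_gpio_1, whose address is not given here). The names are illustrative.

```c
#include <stdint.h>

#define VIDCAP_ENABLE_BIT 0x1u  /* assumed: bit 0 enables the vidcap core */
#define VIDCAP_FLAG_MASK  0x3u  /* the GPIO is 2 bits wide (C_WIDTH = 2) */

/* Set or clear the assumed enable flag, leaving the other flag bit alone.
 * `gpio_data` points at the GPIO data register (a plain variable in tests). */
void vidcap_set_enabled(volatile uint32_t *gpio_data, int enabled)
{
    uint32_t flags = *gpio_data & VIDCAP_FLAG_MASK;
    if (enabled)
        flags |= VIDCAP_ENABLE_BIT;
    else
        flags &= ~VIDCAP_ENABLE_BIT;
    *gpio_data = flags;
}
```

Keeping the read-modify-write confined to the two flag bits means a second configuration flag can be added later without changing callers.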
2.4 GPIO’s for Chessboard Registers
Because chess is a game where only one piece can be moved at a time (except for
certain easily identifiable circumstances), only three states are necessary to describe each
square: black piece present, white piece present, or no piece present (empty). Adding an
additional error state, only two bits are required for each square:
State          Binary value
Empty          00
Black piece    01
White piece    10
Error          11
Table 2.1: Possible states for each square
A square has the error state associated with it when the program cannot recognize
the color of this particular square. This situation can arise when the user moves his own
piece and his hand is over the board. Categorizing each square into one of these four
categories is a simple task, and the algorithm is explained in the next section (2.5).
After all 64 squares are checked, the 2-bit values are packed into four 32-bit
variables. These values must be transferred to the independent chess control system. This
is achieved by using additional GPIO instances. Two are attached to the video system’s
OPB, and two are attached to the chess control system’s OPB. The GPIO_d_out ports of
the former are connected to the GPIO_in ports of the latter.
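The packing of 64 two-bit states into four 32-bit values can be sketched in C as shown here. The state encoding follows Table 2.1; the bit ordering (square 0 in the two least significant bits of word 0) and the function names are assumptions made for illustration, since the report does not spell out the ordering used in vidcap.c.

```c
#include <stdint.h>

enum square_state {      /* encoding from Table 2.1 */
    SQ_EMPTY = 0x0,
    SQ_BLACK = 0x1,
    SQ_WHITE = 0x2,
    SQ_ERROR = 0x3
};

/* Pack 64 two-bit square states into four 32-bit words, 16 squares
 * (two rows) per word. Assumed ordering: square 0 occupies the two
 * least significant bits of word 0. */
void pack_board(const uint8_t squares[64], uint32_t regs[4])
{
    for (int w = 0; w < 4; w++) {
        regs[w] = 0;
        for (int s = 0; s < 16; s++) {
            uint32_t state = squares[w * 16 + s] & 0x3u;
            regs[w] |= state << (2 * s);
        }
    }
}

/* Read back the state of one square (0..63) from the packed words. */
uint8_t unpack_square(const uint32_t regs[4], int square)
{
    return (uint8_t)((regs[square / 16] >> (2 * (square % 16))) & 0x3u);
}
```

The receiving side only needs unpack_square() with the same ordering to recover each square's state from the GPIO inputs.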
[8] Datasheet can be found in the hw\XilinxProcessorIPLib\pcores\dcm_module_v1_00_a\doc directory of a
Xilinx EDK 6.1 installation
Only two 32-bit GPIO instances (board_gpio1 and board_gpio2 in the design) are
required to store four 32-bit variables, as a single 32-bit GPIO instance holds two 32-bit
registers when the C_IS_DUAL parameter is set to 1. Figure 1 (obtained from the GPIO
datasheet) below shows how this parameter effectively doubles the GPIO hardware. The
C_TRI_DEFAULT and C_TRI_DEFAULT_2 parameters are also set to 0 to indicate
output ports.
2.5 Recognition Software
The video recognition software accesses the chessboard image that is located in
the ZBT memory at the address specified in the vidcap core parameter C_FBADDR.
Video data in the memory consists of 263 horizontal lines. The real picture has 525
horizontal lines in it, composed of 263 odd lines (lines 1, 3, 5, etc.) and 262 even lines
(lines 2, 4, 6 and so on). In the usual case, these lines are interlaced with each other to
form the final picture. In this project it is not necessary to store all 525 lines in memory.
Using only the odd lines results in an adequate picture of the board (Figure 2).
For the purposes of this project, each horizontal line consists of 360 pixels, with
each pixel containing four data elements: two Y (luminance) elements, one Cr (red
difference) element and one Cb (blue difference) element.
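Addressing one entry of the framebuffer can be sketched in C as follows. Two things here are assumptions rather than facts from the report: that the stored lines follow each other back to back, and the byte order inside each 32-bit word (Cb, Y0, Cr, Y1 from the most significant byte down), which in practice depends on how vidcap.vhd packs the bytes. The type and function names are illustrative.

```c
#include <stdint.h>

#define FB_WORDS_PER_LINE 360u  /* 360 four-byte entries per line */

struct ycbcr_entry {
    uint8_t y0, y1;  /* two luminance samples */
    uint8_t cb, cr;  /* shared blue- and red-difference samples */
};

/* Byte offset of entry `x` on stored line `line`, assuming lines are
 * stored back to back starting at the framebuffer base address. */
uint32_t fb_offset(uint32_t line, uint32_t x)
{
    return (line * FB_WORDS_PER_LINE + x) * 4u;
}

/* Split one 32-bit word into its four elements. The byte order used
 * here is an assumption and may need to be changed to match the core. */
struct ycbcr_entry unpack_entry(uint32_t word)
{
    struct ycbcr_entry e;
    e.cb = (uint8_t)(word >> 24);
    e.y0 = (uint8_t)(word >> 16);
    e.cr = (uint8_t)(word >> 8);
    e.y1 = (uint8_t)word;
    return e;
}
```

Adding fb_offset(line, x) to the C_FBADDR value gives the address of the word to read under these assumptions.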
Figure 2: Odd lines of a test image captured using the video capture core
2.5.1 Camera Setup
The camera is fixed to a slanted tripod so that it is directly over the chessboard
without interfering with its moving parts. Once fixed and oriented correctly, the axes of
the board are aligned with the axes of the camera image. This allows the vertical
boundaries of all squares in a row to be marked by top and bottom Y values, and likewise
the horizontal boundaries of all squares in a column are marked by left and right X
values.
In vidcap.c, the left and right boundaries for each column are stored in the arrays
xmin and xmax respectively, and the top and bottom boundaries are stored in the arrays
ymax and ymin respectively. These values do not cover the entire square so as to allow
for some movement of the board, and prevent a tall piece from overlapping another
square via perspective.
2.5.2 Piece Recognition
The piece recognition code scans the Y, Cr and Cb values of all dots in one square. Black
pieces have red squares attached at the top, and white pieces have green squares attached
to them. It is much easier to recognize green and red colors on a black-and-white board.
The color signals of each individual dot in the square are compared to preset Y, Cr and Cb
values for each type of square. The program does not compare all three signals for each
type. For example, to determine whether a black (red) piece is standing on the square, it is
only necessary to check the Cr values of the dots in the square. Cr represents the red
color-difference signal, so a square with a red marker in it has a much higher average Cr
value than other types of squares. Similarly, a green dot has a much higher luminance (Y)
value than other dots, and its Cr and Cb values are approximately equal.
Therefore, it is only necessary to check for a high Y value and for the difference between
the Cr and Cb values. Dots in empty squares are identified by checking a particular
range for all three video signals, because the beige (white) and dark-brown (black) colors of
the squares on the board are a combination of the three main signals.
A margin of error is also allowed when signal values are compared to the preset values.
This margin is required because the squares are lit unequally: squares of the same color
look different from each other because light from the lamp falls onto them at different
angles.
After the square is fully scanned, the program checks the counters of red, green, black and
white dots in this square. The square is then assigned a two-bit value based on which
counter prevails, and this value is shifted into a 32-bit variable. The program then moves
on to the next square until all 64 squares have been scanned. When scanning of the entire
board is complete, the four 32-bit variables containing information about all squares are
written out through the two 32-bit dual GPIO cores.
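The final decision step, choosing a state from the dot counters, can be sketched in C as follows. The struct, the function name, and in particular the rule for the error state (no counter reaching half of the scanned dots) are assumptions for illustration; the report does not give the exact rule or thresholds used in vidcap.c.

```c
#include <stdint.h>

struct dot_counts {
    unsigned red;    /* dots matching the red marker (black piece) */
    unsigned green;  /* dots matching the green marker (white piece) */
    unsigned black;  /* dots matching a dark-brown empty square */
    unsigned white;  /* dots matching a beige empty square */
};

/* Return the two-bit state from Table 2.1 for the prevailing counter.
 * Assumed rule: if no counter reaches half of the scanned dots, the
 * square is reported as an error (binary 11). */
uint8_t classify_square(const struct dot_counts *c, unsigned total_dots)
{
    unsigned best = c->red;
    uint8_t state = 0x1;                                    /* black piece */
    if (c->green > best) { best = c->green; state = 0x2; }  /* white piece */
    if (c->black > best) { best = c->black; state = 0x0; }  /* empty */
    if (c->white > best) { best = c->white; state = 0x0; }  /* empty */
    if (2u * best < total_dots)
        return 0x3;                                         /* error */
    return state;
}
```

A hand passing over a square would typically leave most dots unmatched, which under this assumed rule yields the error state rather than a wrong piece color.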
3. Chess Control System
The chess control system is, for the most part, a software system, and is located
in the ChessAI directory. Other than a Microblaze instantiation and external memory,
there are only a few GPIO instances to connect this system with the video recognition
system and chessboard. The greatest complexity of this system lies in the software.
This software, however, is complicated enough that it requires two banks of ZBT
memory to reside in, mostly occupied by the third party chess engine used to compute
moves. The MP3 decoder example from Xilinx was used as a base system for this reason,
and all the hardware added for MP3 decoding and sound capability was removed. Any
basic elements, such as a UART instantiation for remote debugging, that are present in a
system built with the “Base System Builder” feature of Xilinx Platform Studio were also
added and connected appropriately. The result is a basic system with two banks of ZBT
memory. Other peripherals that were added to this system are described in the following
sections.
However, this system was not tested at all, as effort was directed towards making
a much simpler system work alongside the video system. The chess engine was
removed and replaced with pre-programmed moves so that the ZBT memory would not
be necessary. The only significant code that remained was that controlling the board.
Unfortunately, it seems this effort was misguided as many unusual errors arose even
when this basic system was compiled in the presence of the video system.
The following sections describe the final untested chess control system (including
ZBT controllers, derived from the MP3 example) that was created late in the project
timeline, and not the basic test system.
3.1 ZBT Controllers
Two instances of the opb_zbt_controller core from Xilinx were created to control
the two banks of memory. The C_ZBT_ADDR_SIZE parameter size was set to 19 in
order to control 2 MB (512k x 32bit) of memory, and the C_EXTERNAL_DLL
parameter was set to 1 to indicate the use of a digital clock manager.
Two digital clock managers were used in this design, but they were not the same
as in the video system. Here, they are instances of the dcm_block core found in the
pcores directory. They serve the same function as those mentioned in section 2.2, and are
connected similarly.
3.2 GPIO’s for Chessboard Registers
As explained in section 2.4, GPIO instances were used to transfer the chessboard
information from the video system to this system. The two GPIO instances on this
system, however, only need to receive input. Thus the only parameters that need to be
set are C_ALL_INPUTS, C_ALL_INPUTS_2 and C_IS_DUAL, each to a value of 1.
3.3 GPIO for Controlling VGA Port
The multimedia board has an FMS3810 VGA encoder, and its input pins
(connected to the FPGA) control the voltage level of the red, blue, and green pins of the
VGA port (see Appendix C3). The h-sync and v-sync pins are directly connected to pins
on the FPGA. In the chess control system, all these pins are accessed by connecting them
to the GPIO_IO port of a GPIO instance.
This GPIO instance is 29 bits wide and is used strictly for output. For these reasons, the
parameters C_TRI_DEFAULT and C_WIDTH are set to 0 and 29 respectively. The
GPIO_IO port is added and made external so that it can connect to the required FPGA
pins.
3.4 Timer and Interrupt Controller
An instance of the opb_timer core is used to keep track of time. This is needed so
that the chess engine knows when it must finish calculating its move. An interrupt
controller (an instance of the opb_intc core), while not necessary, is added so that other
events in the future can trigger the chess engine to stop thinking.
3.5 Chess Control Software
The chess control software is the main component of this system. It has two main
parts: one controls the signals to the chessboard and the other controls the gameplay.
The code that controls the chessboard is contained in the class cBoardControl. It
borrows a lot of code from the original chessboard, but is rewritten in C++ for better
organization. The chess pieces used are small enough that they can move between each
other when moving from one part of the board to the other. This is especially useful in the
special cases of piece capture, knight moves, and castling. All these special cases are
accounted for in this class. The GPIO device controlling VGA output is written to in the
“UpdatePort()” function, and the values used to set the voltages on the red and green pins
are stored in the array named “step”. These values were determined experimentally so
that the circuit in Appendix C4 functioned correctly.
Two other classes that were created are a cBoard class and a cChessAI class. The
former is used to store a chessboard in a similar form as that mentioned in section 2.4. It
is also capable of determining which piece was moved by comparing two such boards.
The cChessAI class is simply a wrapper for the third party chess engine. It updates the
chessboard data structures in the chess engine, acquires computed moves, and runs the
“PonderPosition()” function while waiting for the user move. This will assist the
computer when it is time for its own move.
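The board comparison mentioned above can be illustrated with a small C sketch. It is not the cBoard implementation: the function name find_move is hypothetical, the boards are plain arrays of the two-bit states from section 2.4 (0 empty, 1 black, 2 white, 3 error), and only ordinary moves and captures are handled. Castling and en passant change more than two squares and are rejected here.

```c
#include <stdint.h>
#include <stdbool.h>

/* Find a single move between two 64-square boards. On success, *from is
 * the square that became empty and *to is the square the piece arrived
 * on (possibly capturing a piece of the other color). */
bool find_move(const uint8_t before[64], const uint8_t after[64],
               int *from, int *to)
{
    int from_sq = -1, to_sq = -1;
    for (int i = 0; i < 64; i++) {
        if (before[i] == after[i])
            continue;
        if (after[i] == 0) {              /* a piece left this square */
            if (from_sq >= 0) return false;
            from_sq = i;
        } else {                          /* a piece arrived here */
            if (to_sq >= 0) return false;
            to_sq = i;
        }
    }
    if (from_sq < 0 || to_sq < 0)
        return false;
    if (before[from_sq] != 1 && before[from_sq] != 2)
        return false;                     /* source was not a valid piece */
    if (after[to_sq] != before[from_sq])
        return false;                     /* color must match the mover */
    *from = from_sq;
    *to = to_sq;
    return true;
}
```

Because the registers only carry piece color, the identity of the moved piece has to come from the game state that the chess software already tracks.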
The chess engine used was Beowulf [9], a capable third-party open source chess
engine. It was chosen for its strength and ease of integration relative to other chess
engines. Many modifications had to be made to the roughly 20,000 lines of C code included
in this engine, most importantly the elimination of all file access. Since the engine
uses an opening book, this data had to be stored in a fixed array instead. Dynamic
memory allocation was also eliminated, since there is very limited memory on the
multimedia board and allocation errors were very likely. The user interface also
had to be removed completely so that the engine would work with the game flow of
BoardControl.cpp, the parent C++ file for this project that contains the program entry
point “main()”. Other system calls, such as those to the Windows timer, also needed
replacing. Overall, this was a very large task.
[9] Available at www.chessbrain.net/beowulf
This system was tested and debugged on a PC first. All writes to GPIO were
disabled, and the chessboard registers were manually changed in the debugger rather than
by the video system. Through the use of #ifdef _WIN32 … #endif constructs, minimal
code changes were necessary when porting the debugged PC version of the software to
the Microblaze system.
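The conditional-compilation pattern can be sketched in C as follows. The function name write_port and the last_written variable are hypothetical and only show the idea: the hardware store is compiled out on the PC build, while the rest of the logic stays identical.

```c
#include <stdint.h>

static uint32_t last_written;  /* last value passed to write_port() */

/* Write a value to the output port. On the PC build the hardware write
 * is compiled out, so the same game logic can run under a debugger. */
void write_port(volatile uint32_t *port, uint32_t value)
{
    last_written = value;
#ifdef _WIN32
    (void)port;      /* PC build: no hardware to write to */
#else
    *port = value;   /* target build: store to the memory-mapped GPIO */
#endif
}
```

Keeping the #ifdef inside one small wrapper means the calling code needs no platform checks of its own.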
4. Chessboard
The chessboard is constructed like a plotter beneath the playing surface. Stepper
motors control the x and y position of a strong magnet that is mounted on a solenoid.
When current passes through the solenoid, the magnet moves close to the board, allowing
it to hold and move the piece directly above it when the plotter moves. When there is no
current, it is far enough from the board that the plotter mechanism moves the magnet
without affecting the position of any pieces.
4.1 Power Supply
While the chessboard already had electronics to control the stepper motors, it used
an unorthodox setup needing both +12V and -12V power sources with significant current
supply to drive the motor coils in different directions. A better solution is to use
H-bridge circuits, which allow the motors to be driven from a single 12V supply,
easily obtainable from any computer power supply. This was accomplished through
the use of L298 H-bridge IC’s (see Appendix C1 for pinouts).
The power could now be supplied with a standard AT power supply. The newer
ATX power supply standard requires loading the +5V terminals in order to turn on,
adding unnecessary complication.
4.2 Control Electronics
The larger design issue is interfacing the chessboard to the multimedia board.
Since there are no general purpose digital output ports, the VGA port was used to deliver
the required information to the board. The VGA port consists of red, green, blue, h-sync
and v-sync signals. The chessboard needs two signals for each of the two motors, one
signal to control the magnet, and one signal to disable current through the motors. The
last signal is desired because stepper motors dissipate energy even when they aren’t
turning, and can thus get very hot, especially during a lengthy chess game.
Therefore six signals are needed, but only five are available. Furthermore, our
particular multimedia board had a malfunctioning blue signal. This problem is
circumvented by multiplexing two signals on the red and green signals. Using 4 voltage
levels on these pins, two bits of information can be carried on each. Appendix C4 shows
the details of the circuit used to decode the voltage from one of these pins into four
digital TTL-compatible signals. These output signals are connected to the corresponding
input pins of the L298 (pins 5, 7, 10, and 12). The stepper motors have two coils: one is
driven by pins 2 and 3, and the other by pins 13 and 14. The remaining pins are connected
as suggested in the upper diagram of Appendix C1. This entire circuit was duplicated for
the
The h-sync signal controls the magnet via a TIP29 power transistor, and the v-sync
signal controls the two enable inputs (pins 11 and 6 on the L298) of the two H-bridge
circuits. The VGA signals thus have full control of the chessboard.
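The idea of carrying two bits on one color pin with four voltage levels can be sketched in C as follows. The DAC codes below are evenly spaced placeholders, not the experimentally determined values from the project's "step" array, and the decoder is modelled with simple midpoint thresholds rather than the actual circuit of Appendix C4. All names are illustrative.

```c
#include <stdint.h>

/* Placeholder DAC codes for the four voltage levels; the real values
 * were found experimentally and are not reproduced here. */
static const uint8_t level_code[4] = { 0, 85, 170, 255 };

/* Combine two control bits into one of four levels for a color pin. */
uint8_t encode_pair(int bit_a, int bit_b)
{
    unsigned index = ((bit_a ? 1u : 0u) << 1) | (bit_b ? 1u : 0u);
    return level_code[index];
}

/* Recover the two bits from a level code, modelling the decoder with
 * thresholds halfway between neighbouring levels. */
void decode_pair(uint8_t code, int *bit_a, int *bit_b)
{
    unsigned index = (code > 212) ? 3u
                   : (code > 127) ? 2u
                   : (code > 42)  ? 1u : 0u;
    *bit_a = (int)(index >> 1);
    *bit_b = (int)(index & 1u);
}
```

Thresholds placed between the levels give the decoder some tolerance for voltage error, which is one reason the real codes had to be tuned against the circuit.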
5. Project Outcomes
All components of this system work well separately, and the connection between
the chess control system and chessboard electronics is also flawless. The major problem
encountered that prevented completion of this project occurred when merging the chess
control system with the video recognition system.
5.1 Video Recognition System Outcomes
The video capture and pattern recognition system works well. The image is
captured from the video input into external memory, and each square on the board is then
scanned and assigned the correct value based on the type of piece standing on it. These
values are packed correctly into the chessboard registers. One minor issue in this
system was given low priority; see section 5.3 for a description of this issue.
5.2 Chessboard and Control System Outcomes
The chess control system also functioned correctly, although the complete system
with AI was never fully tested on the multimedia board. As explained in section 4, efforts
were concentrated on getting the basic control system, without the chess engine or external
memory, to merge with the video recognition system, but to no avail. Nevertheless, the
VGA port of the multimedia board output the correct signals, and the software
controlled the chessboard correctly.
The chessboard itself was also completely functional. Although this is to be
expected, since the board was functional before this project began, the new
electronics added to it were by no means trivial. These circuits functioned as designed
and provided a successful workaround for the limited means of
outputting arbitrary signals from the multimedia board. This allowed the chess control
software to correctly control the movement of pieces on the chessboard. The only major
problem encountered was an apparent short circuit at the power supply plug during the
project presentation, which prevented us from properly demonstrating the piece
movement.
5.3 Unresolved Issues
As mentioned above, problems with adding another Microblaze system to the
video recognition system prevented further integration of all system components. We
were able to have two simple Microblaze systems work after hours of searching through
the many files generated by XPS as well as the online ISE documentation, but could not
do the same with the video recognition system.
There was also a minor problem with the video recognition system. The video capture
would only work if the vidcap core was enabled and disabled by manually stepping
through the code that asserted and deasserted the 2-bit GPIO signals (see section 2.3).
When a delay loop was used instead to allow the vidcap core enough time to capture a
frame, nothing was written to memory, even with long delay times. One theory is that the
delay loop causes the Microblaze to saturate the OPB while incrementing the counter,
thus blocking the video data from being written. Another theory is that the debugger
constantly polls the UART and saturates the system in that way. Unfortunately, there was
not enough time to test either theory, as other aspects of the project had higher
priority.
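For reference, the delay-loop variant that failed on the hardware has roughly the shape
sketched below. This is not the code from vidcap.c: the function name, the register
pointer passed in, and the bit position of the enable flag are hypothetical and are used
only to show the enable, wait, disable sequence described above.

```cpp
#include <cstdint>

// Hypothetical bit position of the capture-enable flag in the 2-bit GPIO.
const std::uint32_t kCaptureEnable = 0x1;

// Assert the enable bit, busy-wait for a fixed number of iterations so the
// vidcap core has time to write a frame, then deassert the enable bit.
// The other GPIO bit is left unchanged.
void capture_frame(volatile std::uint32_t* gpio, unsigned long delay_iterations) {
    *gpio = *gpio | kCaptureEnable;
    // The counter is volatile so the compiler cannot remove the empty loop.
    for (volatile unsigned long i = 0; i < delay_iterations; i = i + 1) {
    }
    *gpio = *gpio & ~kCaptureEnable;
}
```

On the target, gpio would point at the memory-mapped GPIO data register. Whether the
loop itself starves the vidcap core of bus access, as the first theory suggests, was not
verified.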
5.4 Further Work
The obvious direction for further work is to complete the integration of the video
recognition system with the chess control system. If the minor problem with the video
recognition system were also fixed, the resulting system would be a complete
chess-playing computer that can recognize the opponent’s moves, find the best reply
using the Beowulf chess engine, and move the required pieces on the board, thus satisfying
the project’s goals.
Should this task prove too difficult, it is possible to run the video capture
software and the chess AI software on one Microblaze processor, provided there is enough
external memory available. In that case there would be no pondering ability (see section
3.5), since there would be only a single execution thread. In hindsight, this would have
been a better course of action once it was discovered that multiple processors were
causing problems.
6. Design Tree Description
ChessAI\ – Contains the chess control system.
ChessAI\code\ – The source code for the software component of the chess control system
BoardController.cpp – Parent file containing the entry point
cBoardControl.cpp/h – Class controlling input to chessboard electronics
cBoard.cpp/h – Chessboard class
cChessAI.cpp/h – Chess engine wrapper class
Beowulf\ - Directory containing chess engine source code. The following files
were significantly modified from the original Beowulf source:
main.c – Added the initialization of the chess engine
parser.c – Added functions called by the wrapper class
computil.c – Modified time measurement and user input functions
vidcap\ – Contains the video recognition system
vidcap\sst2 – A TCL script used to download the framebuffer into a file.
vidcap\chdump.txt – A partial memory dump of the framebuffer showing a chessboard
vidcap\code\vidcap.c – The source code for the video recognition
FrameBufferViewer\ – Contains an application created for the PC to view a framebuffer
dump. “chdump.txt”, mentioned above, can be viewed with this application. Both
the source code and the executable are included in this directory.
sim\ – Contains simulation files
sim\OPB_TOOLKIT2X_052401\ – The IBM OPB Toolkit
sim\OPB_TOOLKIT2X_052401\vidcap.bfl – BFL script for OPB simulation
sim\OPB_TOOLKIT2X_052401\vidcap.do – Generated ModelSim .do script
sim\vidcap.do – ModelSim script used for simulation
sim\gen_vid_tb.pl – Perl script to generate video stream stimulus
sim\stimvhd.txt – Output of gen_vid_tb.pl
sim\opb_complex.vhd – Modified testbench using the vidcap core
sim\vidcap.wlf – ModelSim waveform containing simulation results
Appendix A: Video Capture Core
A1: vidcap.vhd code listing:
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;
entity vidcap is --USER-- change entity name
generic
(
C_OPB_AWIDTH : INTEGER := 32;
C_OPB_DWIDTH : INTEGER := 32;
C_FAMILY : string := "virtex2";
C_FBADDR : std_logic_vector(0 to 31) := X"00005000"
);
port
(
--Required OPB bus ports, do not add to or delete
Mn_ABus : out std_logic_vector(0 to C_OPB_AWIDTH - 1);
Mn_DBus : out std_logic_vector(0 to C_OPB_DWIDTH - 1);
Mn_request : out std_logic;
Mn_busLock : out std_logic;
Mn_select : out std_logic;
Mn_RNW : out std_logic;
Mn_BE : out std_logic_vector(0 to C_OPB_DWIDTH/8 - 1);
Mn_seqAddr : out std_logic;
OPB_Clk : in std_logic := '0';
OPB_Rst : in std_logic := '0';
OPB_DBus : in std_logic_vector(0 to C_OPB_DWIDTH - 1) := (others => '0');
OPB_MGrant : in std_logic := '0';
OPB_xferAck : in std_logic := '0';
OPB_errAck : in std_logic := '0';
OPB_retry : in std_logic := '0';
OPB_timeout : in std_logic := '0';
--User ports
led1 : out std_logic;
led2 : out std_logic;
YCrCb_in : in std_logic_vector(9 downto 0);
vid_clk : in std_logic;
inflags : in std_logic_vector(0 to 1)
);
end entity vidcap;
architecture imp of vidcap is
signal Mn_select_s : std_logic := '0';
signal Mn_request_s : std_logic := '0';
signal pixeldata : std_logic_vector (31 downto 0) := X"00000000";
signal vidcount : std_logic_vector (0 to 31);
signal cnt : std_logic_vector (0 to 31) := X"00000000";
signal write_ena : std_logic;
signal H_rg : std_logic_vector (4 downto 0);
signal H_rising : std_logic;
signal TRS : std_logic;
signal V_falling : std_logic;
signal V_rising : std_logic;
-- the next 5 signals act as a fifo for video data
signal YCrCb_rg1 : std_logic_vector (9 downto 0);
signal YCrCb_rg2 : std_logic_vector (9 downto 0);
signal YCrCb_rg3 : std_logic_vector (9 downto 0);
signal YCrCb_rg4 : std_logic_vector (9 downto 0);
signal YCrCb_rg5 : std_logic_vector (9 downto 0);
signal Fo : std_logic;
signal Ho : std_logic;
signal Vo : std_logic;
-- These signals are used to request a bus transfer asynchronously
-- wrt the OPB clock (i.e. as soon as video data is ready)
signal MstReq2 : std_logic;
signal MstReq3 : std_logic;
signal NewData : std_logic;
begin
-- These signals should be hooked up to LEDs
-- led1 is high when writing is enabled
-- led2 flashes about 7 times per second when a valid
-- video source is hooked up
led1 <= write_ena;
led2 <= cnt(10);
--inflags(0) enables video capture
write_ena <= inflags(0);
--constant signals
Mn_busLock <= '0';
Mn_RNW <= '0';
Mn_BE <= (others => '1');
Mn_seqAddr <= '0';
--the request and select signals of OPB
Mn_request <= Mn_request_s;
Mn_select <= Mn_select_s;
-- This process controls the bus signals and follows
-- the OPB transfer protocol. When MGrant is high,
-- deassert the request signal (via MstReq2) and assert
-- the select signal.
-- Deassert the select signal when any termination
-- signal (xferAck, timeout, or retry) is high.
process(OPB_Clk)
variable addr : std_logic_vector(0 to C_OPB_AWIDTH - 1);
begin
if OPB_Clk'event and OPB_Clk = '1' then
-- reset signals
if OPB_Rst = '1' then
Mn_select_s <= '0';
MstReq2 <= '0';
elsif OPB_MGrant = '1' then
MstReq2 <= '0';
Mn_select_s <= '1';
elsif MstReq3 = '0' then
-- Follow NewData at the clock edge
MstReq2 <= NewData;
end if;
if OPB_xferAck = '1' or OPB_timeout = '1' or OPB_retry = '1' then
Mn_select_s <= '0';
end if;
-- MstReq3 is the OPB clock synchronized version of NewData
MstReq3 <= NewData;
end if;
end process;
-- TRS (timing reference signal) recognizes the presence of
-- special codes embedded in the video stream
TRS <= '1' when YCrCb_rg2(9 downto 2) = "00000000"
and YCrCb_rg3(9 downto 2) = "00000000"
and YCrCb_rg4(9 downto 2) = "11111111" else '0';
-- The following line allows Mn_request to be asserted immediately after
-- NewData goes high, but allows Mn_request to be deasserted synchronous
-- to the OPB clock via MstReq2 above.
Mn_request_s <= ((NewData and not MstReq3) or MstReq2) and not Mn_select_s;
-- NewData cycles at 1/4 the speed of vidclk, but is low during the
-- horizontal blanking period
NewData <= write_ena and not (Ho or vidcount(30) or OPB_Rst);
-- The OPB arbiter ORs all data and address busses together, so these
-- signals must only be set high when this device controls the bus
Mn_DBus <= pixeldata when Mn_select_s = '1' else (others => '0');
-- Add the pixel count to the base address of the framebuffer
Mn_ABus <= ("0000000000" & (vidcount(10 to 29) & "00") + C_FBADDR)
when Mn_select_s = '1' else (others => '0');
-- Video signals
Ho <= H_rg(0) or H_rg(4) ;
H_rising <= H_rg(0) and not H_rg(1) ;
V_rising <= ( TRS and YCrCb_rg1(7) ) and not Vo ;
V_falling <= ( TRS and not YCrCb_rg1(7) ) and Vo ;
-- Process synchronous to the video clock
process (vid_clk)
begin
if (OPB_Rst = '1' ) then
YCrCb_rg1 <= (others => '0') ;
YCrCb_rg2 <= (others => '0') ;
YCrCb_rg3 <= (others => '0') ;
YCrCb_rg4 <= (others => '0') ;
YCrCb_rg5 <= (others => '0') ;
Fo <= '0' ;
Vo <= '0' ;
H_rg <= "00000";
vidcount <= (others => '0');
elsif vid_clk'event and vid_clk = '1' then
-- shift data down the FIFO
YCrCb_rg1 <= YCrCb_in ;
YCrCb_rg2 <= YCrCb_rg1 ;
YCrCb_rg3 <= YCrCb_rg2 ;
YCrCb_rg4 <= YCrCb_rg3 ;
YCrCb_rg5 <= YCrCb_rg4 ;
-- decode the timing information from the special signal
if ( TRS = '1' ) then
Fo <= YCrCb_rg1(8) ;
Vo <= YCrCb_rg1(7) ;
H_rg(4 downto 0) <= H_rg(4 downto 1) & YCrCb_rg1(6) ;
else
Fo <= Fo ;
Vo <= Vo ;
H_rg(4 downto 0) <= H_rg(3 downto 0) & H_rg(0) ;
end if;
-- reset and increment the pixel counter when appropriate
if V_falling = '1' and Fo = '0' then -- end of frame
vidcount <= (others => '0');
elsif H_rg(4) = '1' and H_rg(3) = '0' then
vidcount(30 to 31) <= "00";
elsif Ho = '1' then
vidcount(30) <= '1';
else
vidcount <= vidcount + 1;
end if;
-- counter for making led2 flash
cnt <= cnt + 1;
end if;
end process;
-- pack 4 values into a 32-bit register (pixeldata) every
-- 4 video clocks
process (NewData)
begin
if OPB_Rst = '1' then
pixeldata <= (others => '0');
elsif NewData'event and NewData = '1' then
pixeldata <= YCrCb_rg2(9 downto 2) &
YCrCb_rg3(9 downto 2) &
YCrCb_rg4(9 downto 2) &
YCrCb_rg5(9 downto 2);
end if;
end process;
end architecture imp;
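The combinational parts of the core can be checked in software with a small model such
as the one below. It is a sketch written for this purpose and is not part of the project
sources: the function names are hypothetical, and it assumes that index 31 of the VHDL
vector vidcount (declared 0 to 31) is the least significant bit, so that
vidcount(10 to 29) corresponds to bits 21 down to 2 of an unsigned 32-bit integer.

```cpp
#include <cstdint>

// Default value of the C_FBADDR generic.
const std::uint32_t kFrameBufferBase = 0x00005000;

// Mirrors the Mn_ABus expression: drop the two low counter bits (one 32-bit
// word is written every four video clocks), keep 20 bits, and add the base.
std::uint32_t framebuffer_address(std::uint32_t vidcount) {
    const std::uint32_t word_index = (vidcount >> 2) & 0xFFFFFu;
    return (word_index << 2) + kFrameBufferBase;
}

// Mirrors the pixeldata assignment: the top 8 bits of each of four 10-bit
// samples are packed into one word, with YCrCb_rg2 in the most significant byte.
std::uint32_t pack_pixels(std::uint16_t rg2, std::uint16_t rg3,
                          std::uint16_t rg4, std::uint16_t rg5) {
    const std::uint32_t b3 = static_cast<std::uint32_t>(rg2 >> 2) & 0xFFu;
    const std::uint32_t b2 = static_cast<std::uint32_t>(rg3 >> 2) & 0xFFu;
    const std::uint32_t b1 = static_cast<std::uint32_t>(rg4 >> 2) & 0xFFu;
    const std::uint32_t b0 = static_cast<std::uint32_t>(rg5 >> 2) & 0xFFu;
    return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
}

// Mirrors the TRS expression: the top 8 bits of rg2 and rg3 are all zeros
// and the top 8 bits of rg4 are all ones.
bool is_trs(std::uint16_t rg2, std::uint16_t rg3, std::uint16_t rg4) {
    return ((rg2 >> 2) & 0xFF) == 0x00 &&
           ((rg3 >> 2) & 0xFF) == 0x00 &&
           ((rg4 >> 2) & 0xFF) == 0xFF;
}
```

For example, the words 3FF, 000, 000 near the end of the test sequence in Appendix B3
satisfy is_trs once they have shifted into YCrCb_rg4, YCrCb_rg3 and YCrCb_rg2
respectively, at which point the following word (2D8) sits in YCrCb_rg1 and supplies the
F, V and H bits.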
A2: vidcap_v2_1_0.mpd listing:
###################################################################
##
## Microprocessor Peripheral Definition
##
###################################################################
BEGIN vidcap, IPTYPE = PERIPHERAL, IMP_NETLIST=TRUE
BUS_INTERFACE BUS = MOPB, BUS_STD = OPB, BUS_TYPE = MASTER
## Generics for VHDL or Parameters for Verilog
PARAMETER c_opb_awidth = 32, DT = integer
PARAMETER c_opb_dwidth = 32, DT = integer
PARAMETER c_family = virtex2, DT = string
PARAMETER c_fbaddr = 0x00005000, DT = std_logic_vector
## Ports
PORT mn_abus = M_ABus, DIR = OUT, VEC = [0:(c_opb_dwidth-1)], BUS = MOPB
PORT mn_dbus = M_DBus, DIR = OUT, VEC = [0:(c_opb_dwidth-1)], BUS = MOPB
PORT mn_request = M_request, DIR = OUT, BUS = MOPB
PORT mn_buslock = M_busLock, DIR = OUT, BUS = MOPB
PORT mn_select = M_select, DIR = OUT, BUS = MOPB
PORT mn_rnw = M_RNW, DIR = OUT, BUS = MOPB
PORT mn_be = M_BE, DIR = OUT, VEC = [0:((c_opb_dwidth/8)-1)], BUS = MOPB
PORT mn_seqaddr = M_seqAddr, DIR = OUT, BUS = MOPB
PORT opb_clk = "", DIR = IN, SIGIS = CLK, BUS = MOPB
PORT opb_rst = OPB_Rst, DIR = IN, BUS = MOPB
PORT opb_dbus = OPB_DBus, DIR = IN, VEC = [0:(c_opb_dwidth-1)], BUS = MOPB
PORT opb_mgrant = OPB_MGrant, DIR = IN, BUS = MOPB
PORT opb_xferack = OPB_xferAck, DIR = IN, BUS = MOPB
PORT opb_errack = OPB_errAck, DIR = IN, BUS = MOPB
PORT opb_retry = OPB_retry, DIR = IN, BUS = MOPB
PORT opb_timeout = OPB_timeout, DIR = IN, BUS = MOPB
PORT led1 = "", DIR = OUT, BUS = MOPB
PORT led2 = "", DIR = OUT, BUS = MOPB
PORT YCrCb_in = "", DIR = IN, VEC = [9:0], BUS = MOPB
PORT vid_clk = "", DIR = IN, BUS = MOPB
PORT inflags = "", DIR = IN, VEC = [0:1], BUS = MOPB
END
Appendix B: Video capture core simulation
B1: OPB bus functional language test script (vidcap.bfl)
set_device (path=/opb_complex/opb_arbiter,device_type=opb_arbiter)
-- Initialize arbiter
Configure(arbiter_mode=round)
set_device (path=/opb_complex/opb_device2,device_type=opb_device)
-- Initialize slave 2
configure(s_byte_enable=true)
mem_init(addr=00021000,data=AABBCCDD)
mem_init(addr=00021004,data=AABBCCDD)
mem_init(addr=00021008,data=AABBCCDD)
mem_init(addr=0002100C,data=AABBCCDD)
mem_init(addr=00021010,data=AABBCCDD)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=2,ack_type=normal,ack_size=4)
response(delay=4,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
B2: Modified device instantiation
Excerpted from the testbench “opb_complex”:
OPB_DEVICE0: vidcap
generic map (
C_FBADDR => C_FBADDR )
port map (
OPB_Clk => OPB_CLK ,
OPB_Rst => OPB_RESET ,
OPB_MGrant => OPB_MxGRANT(0) ,
OPB_xferAck => OPB_XFERACK ,
OPB_errAck => OPB_ERRACK ,
OPB_retry => OPB_RETRY ,
OPB_timeout => OPB_TIMEOUT ,
OPB_DBus => to_stdlogicvector(OPB_DBUS(0 to 31)),
Mn_request => Mx_REQUEST(0) ,
Mn_busLock => M0_BUSLOCK ,
Mn_select => Mx_SELECT(0) ,
Mn_RNW => M0_RNW ,
Mn_seqAddr => M0_SEQADDR ,
Mn_BE => Mn_BE ,
Mn_ABus => Mn_ABus ,
Mn_DBus => Mn_DBus ,
YCrCb_in => YCrCb_in ,
vid_clk => vid_clk ,
inflags => inflags );
inflags(0) <= write_ena;
M0_BEXFER <= Mx_SELECT(0);
M0_HWXFER <= Mx_SELECT(0);
M0_FWXFER <= Mx_SELECT(0);
M0_DWXFER <= '0';
M0_UABUS <= (others => '0');
SL0_XFERACK <= '0';
SL0_BE <= (others => '0');
SL0_BEACK <= '0';
SL0_HWACK <= '0';
SL0_FWACK <= '0';
SL0_DWACK <= '0';
SL0_ERRACK <= '0';
SL0_TOUTSUP <= '0';
SL0_RETRY <= '0';
M0_BE <= to_stdulogicvector(Mn_BE) & "0000";
M0_ABUS <= to_stdulogicvector(Mn_ABus);
M_DBUS_0 <= to_stdulogicvector(Mn_DBus) & X"00000000";
M_DBUS_EN(0) <= Mx_SELECT(0);
M_DBUS_EN32_63(0) <= '0';
B3: Video test sequence
Excerpted from the testbench “opb_complex”:
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "1011000100"; -- 2C4
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "0110011100"; -- 19C
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "0011000100"; -- 0C4
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "1000100000"; -- 220
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "1100001100"; -- 30C
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "0100000100"; -- 104
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "0011011100"; -- 0DC
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "0100111100"; -- 13C
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "1111001100"; -- 3CC
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "1011111000"; -- 2F8
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "0100001100"; -- 10C
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "1111111111"; -- 3FF
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "0000000000"; -- 000
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "0000000000"; -- 000
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
YCrCb_in <= "1011011000"; -- 2D8
wait for 9.5 ns;
vid_clk <= '1';
wait for 19 ns;
vid_clk <= '0';
wait for 9.5 ns;
B4: Sample simulation results:
Appendix C: Electronics
C1: L298 dual H-bridge IC
Source: http://www.st.com/stonline/books/pdf/docs/1773.pdf
C2: LM741 op-amp
Source: http://www.national.com/ds/LM/LM741.pdf
C3: VGA connector pinout
Pin   Signal
1     Red
2     Green
3     Blue
4     ID Bit
5     N/C
6     Red Return
7     Green Return
8     Blue Return
9     No Pin
10    Ground
11    ID Bit
12    ID Bit
13    H Sync
14    V Sync
15    ID Bit
C4: Motor signal decoder circuit and operation
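[Figure: schematic of the motor signal decoder. Labelled nodes: Vin, V1, currents I1 to I4, and Outputs 1 to 4. Labelled components and supplies: resistors of 1k, 10k, 17k, 100k, 150k and 250k (twice); +5V, -5V and -12V rails.]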
This circuit functions much like a 2-bit analog-to-digital converter. The upper op-amp
simply compares Vin with a reference voltage of 0.7 volts, and V1 is set to either
+5V (Vin below the reference) or -5V (Vin above it). The lower op-amp is a summing
amplifier. This op-amp does not saturate like the previous one, so its inverting input is
held at 0V. No appreciable current flows into that input, and I2 = -80 µA, so:
I4 = I1 + I2 + I3 = I1 + I3 - 80 µA
The following table describes the operation of the circuit for the 4 values of Vin
that will be used:
Vin    V1    Output1  Output2  I1      I3      I4      Output3  Output4
0.4 V  +5 V  +5 V     0 V      20 µA   40 µA   -20 µA  0 V      +5 V
0.6 V  +5 V  +5 V     0 V      20 µA   60 µA   0 µA    +5 V     0 V
0.8 V  -5 V  0 V      +5 V     -20 µA  80 µA   -20 µA  0 V      +5 V
1.0 V  -5 V  0 V      +5 V     -20 µA  100 µA  0 µA    +5 V     0 V
The inverters used in the circuit are from a 7404 hex inverter IC.
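The decoding described in this appendix can be reproduced numerically with the short
model below. It is a sketch, not a circuit simulation: the names are hypothetical, the
current values are taken directly from the table (I1 = ±20 µA, I3 = 100 µA per volt of
Vin, I2 = -80 µA), and the -10 µA decision threshold for Output4 is an assumption chosen
midway between the two tabulated values of I4.

```cpp
// Logic levels of the four decoder outputs (true = +5 V, false = 0 V).
struct DecoderOutputs {
    bool out1;
    bool out2;
    bool out3;
    bool out4;
};

// Current I4 in microamps for a given input voltage, using
// I4 = I1 + I2 + I3 with the values from the table.
double summing_current_uA(double vin_volts) {
    const double i1 = (vin_volts < 0.7) ? 20.0 : -20.0;  // V1 = +5 V or -5 V
    const double i2 = -80.0;
    const double i3 = vin_volts * 100.0;                 // 40 uA at 0.4 V
    return i1 + i2 + i3;
}

// Decode one of the four input voltages into the four digital outputs.
DecoderOutputs decode_level(double vin_volts) {
    DecoderOutputs out;
    out.out1 = vin_volts < 0.7;  // comparator against the 0.7 V reference
    out.out2 = !out.out1;        // inverted copy of Output1
    // Output4 is high when I4 is about -20 uA and low when it is about 0 uA.
    out.out4 = summing_current_uA(vin_volts) < -10.0;
    out.out3 = !out.out4;        // inverted copy of Output4
    return out;
}
```

Calling decode_level with 0.4, 0.6, 0.8 and 1.0 reproduces the four rows of the table:
Output1 and Output2 form one complementary pair and Output3 and Output4 the other.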