ECE532 Project: Virtual Chess Opponent

Mandheerej S. Nandra
Eugen Tkachenko

Tuesday, April 13, 2004

Table of Contents

1. Overview
  1.1 Objective
  1.2 Component Overview
    1.2.1 Video Recognition System
    1.2.2 Chess Control System
    1.2.3 Chessboard
  1.3 Overall System
  1.4 Project Block Diagram
2. Video Recognition System
  2.1 Vidcap core
    2.1.1 Hardware design
    2.1.2 Simulation
  2.2 External Memory Controller
  2.3 GPIO for Controlling Video Capture (2 bits)
  2.4 GPIO’s for Chessboard Registers
  2.5 Recognition Software
    2.5.1 Camera Setup
    2.5.2 Piece Recognition
3. Chess Control System
  3.1 ZBT Controllers
  3.2 GPIO’s for Chessboard Registers
  3.3 GPIO for Controlling VGA Port
  3.4 Timer and Interrupt Controller
  3.5 Chess Control Software
4. Chessboard
  4.1 Power Supply
  4.2 Control Electronics
5. Project Outcomes
  5.1 Video Recognition System Outcomes
  5.2 Chessboard and Control System Outcomes
  5.3 Unresolved Issues
  5.4 Further Work
6. Design Tree Description
Appendix A: Video Capture Core
  A1: vidcap.vhd code listing
  A2: vidcap_v2_1_0.mpd listing
Appendix B: Video capture core simulation
  B1: OPB bus functional language test script (vidcap.bfl)
  B2: Modified device instantiation
  B3: Video test sequence
  B4: Sample simulation results
Appendix C: Electronics
  C1: L298 dual H-bridge IC
  C2: LM741 op-amp
  C3: VGA connector pinout
  C4: Motor signal decoder circuit and operation

1. Overview

1.1 Objective

The goal of this project is to produce a fully autonomous chessboard.
It will not only be able to recognize when and how a human player physically moves a piece, but will also compute a reply and move the appropriate piece(s) unassisted.

1.2 Component Overview

There are three main components to this project: the video recognition system, which determines the user’s movement of a piece; the chess control system, which calculates the computer’s move and controls piece movement; and finally, the chessboard itself.

1.2.1 Video Recognition System

The video recognition system consists of a camera and video capture hardware coupled with an image recognition processor. Details of this system are in section 2, and a diagram can be found in section 1.4. A camera mounted above the chessboard sends a live video signal to the multimedia board’s video decoder (Analog Devices ADV7185 [1]), which outputs YCbCr data in digital format. The data is then sent to the video capture core, which writes it to off-chip ZBT memory via an On-chip Peripheral Bus (OPB #1). The contents of the memory (the framebuffer) are analyzed by a Microblaze processor that runs simple image recognition software, and the results are written to the chessboard registers, which are in fact two instances of the GPIO IP [2] connected to OPB #1.

1.2.2 Chess Control System

The chess control system consists of a processor with significant software as well as the hardware needed to output the appropriate signals to the chessboard. Details of this system are in section 3, and a diagram can be found in section 1.4. The processor is a second instantiation of the Microblaze IP, with two banks of ZBT memory connected to another OPB bus (OPB #2). A GPIO instance attached to this bus is connected to the various pins of the VGA encoder (Fairchild Semiconductor FMS3810 [3]).
The software on this processor reads the chessboard registers from two GPIO instances connected to OPB #2 and, using a third-party chess engine, determines the computer’s move. The signals needed to move the chess pieces are then written to the GPIO instance controlling the VGA encoder, and the process repeats when the chessboard registers have changed.

[1] Datasheet can be found at http://www.analog.com/UploadedFiles/Data_Sheets/436811360ADV7185_0.pdf
[2] Datasheet can be found in the hw\XilinxProcessorIPLib\pcores\opb_gpio_v2_00_a\doc directory of a Xilinx EDK 6.1 installation
[3] Datasheet can be found at http://www.fairchildsemi.com/ds/FM/FMS3810.pdf

1.2.3 Chessboard

The chessboard was built as part of a high-school project by Mandheerej Nandra and his former classmate, Nelson Hu, in 2000. The majority of this project, then, is focused on the previous two systems; nevertheless, some additional electronics were required to accept input from the VGA port of the multimedia board, as the chessboard was originally designed to interface with the parallel port of a PC. These can be found in section 4.

1.3 Overall System

Section 1.4 below contains the block diagram, illustrating the entire project and how its components are interconnected. The first two components are implemented on the Xilinx FF896 Microblaze and Multimedia Development Board (henceforth referred to as the multimedia board). The board contains all the components necessary to implement these two systems, but since it does not have any general purpose digital output ports, the VGA output signals of the board were used to interface with the chessboard.

The video recognition system is entirely independent of the chess control system, even using a different bus. The two are connected only by four 32-bit registers, each holding the required information for two rows of the chessboard (see section 2.4 for the storage format).
Only 2 bits are needed for each square; therefore, only 128 bits are required to describe the standard 8x8 chessboard. The chess control system polls these registers for any change.

1.4 Project Block Diagram

[Block diagram. Labels: video camera over chessboard; video input; video decoder; video capture core; GPIO; OPB #1; Microblaze 1 (video processor); ZBT RAM (framebuffer); video recognition system; chessboard registers (GPIO); OPB #2; Microblaze 2 (chess processor); ZBT RAM (two banks); chess control system; Xilinx Virtex-II XC2V2000 FPGA; Xilinx Microblaze and Multimedia Development board; VGA output; chessboard electronics; chessboard.]

2. Video Recognition System

This system is based on the zbt_test system supplied to us by Leslie Shannon, and is located in the vidcap directory. A video capture core was created and added to the system, as well as several GPIO instances (see sections 2.3 and 2.4). Finally, the program code in the zbt_test system was replaced with image recognition software.

2.1 Vidcap core

The vidcap core is a custom core that takes the raw data from the video decoder chip and stores the valid data in external memory that is attached to the same OPB bus as this core.

2.1.1 Hardware design

The input video stream from the video decoder has special codes that identify horizontal and vertical blanking signals, as well as whether the current field contains the even or odd lines. The code in XAPP286 [4] from Xilinx is able to decode these inserted codes, and hence was used as the base for this core. The video core was written in VHDL, so this code was converted from its original Verilog format.

The video core is also required to write image data to memory. To do this it must be a master device that issues requests on the OPB and follows the OPB data transfer protocol, as defined by IBM [5]. This is the main addition that was made to the XAPP286 code.

Appendix A1 contains the VHDL code used in the video core, commented to explain the various sections.
Of particular note is that the Mn_request signal (used to request a transfer on the OPB) goes high as soon as the data is ready, but goes low only synchronously with the OPB clock. This is accomplished through the use of MstReq2 and MstReq3, as explained in the code, and it reduces the latency of the bus transfer. Since the core tries to write one 32-bit value every 4 cycles of the video clock, this extra time can be crucial.

The code makes no attempt to start writing at the beginning of a frame and stop at the end of one. Chess is a very slow game and the captured image is mostly static, so whether or not the core writes the same image over itself several times is irrelevant for the purposes of this project.

Appendix A2 shows the corresponding MPD file, which tells Xilinx Platform Studio how the signals in the VHDL file are to be connected with the rest of the system. The C_FBADDR parameter is the only parameter in this core beyond those needed to be an OPB device, and it specifies the address of the framebuffer. For this project, it was set to 0x80105000.

[4] XAPP286 can be found in the following ZIP file: http://www.xilinx.com/products/boards/multimedia/docs/examples/Video_Input_Processing_Examples.zip
[5] Found at http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/9A7AFA74DAD200D087256AB30005F0C8/$file/OpbBus.pdf

2.1.2 Simulation

The vidcap core was thoroughly simulated before it was downloaded to the FPGA and tested. To do this, the IBM OPB Toolkit [6] was used to simulate the OPB bus interface. This toolkit has a VHDL entity called “opb_device” that is a model for either a master or a slave device. The model engages in OPB bus transactions as described by commands in a simple bus functional language (BFL). A script converts these commands to a ModelSim .do file, and the commands in this file are executed during simulation to define the behaviour of each “opb_device” instance.
Appendix B1 shows the listing of the test BFL script “vidcap.bfl” that was used in the simulation. This script initializes the arbiter as well as one OPB slave device, namely “opb_device2”. The latter is then set up to accept many transactions, as the vidcap core will be making many writes to memory. Different delays are used to observe the behaviour of the hardware under various circumstances. After conversion to a .do file, some additional commands were added to control the enabling of the vidcap core.

Appendix B2 shows a section of “opb_complex”, the master testbench file provided by the toolkit, that was modified to incorporate a “vidcap” instance in place of another “opb_device” instance. Some signals there are either not part of the Xilinx implementation of OPB or have no use in this testing scenario, so they are simply set to the appropriate constant.

Finally, the “opb_complex” entity was modified so that it would provide stimulus to the “vidcap” instance through its “YCrCb_in” and “vidcap” ports. The stimulus was generated by a custom Perl script that outputs VHDL code for each byte of input to “YCrCb_in”. A portion of this code is listed in Appendix B3.

Running the simulation in ModelSim yielded results similar to those found in Appendix B4. After some debugging, the various signals were found to behave as expected, and when this core was connected to the system, it functioned without any problems.

2.2 External Memory Controller

The video system employs an opb_emc [7] (external memory controller) core that controls a single memory block (memory block 2) of external ZBT memory on the multimedia board. In this project, the C_MEM0_BASEADDR parameter is set to 0x80100000, and the C_MEM0_HIGHADDR parameter is set to 0x801FFFFF. This creates a 1 MB address space that is big enough to hold the framebuffer, which is about 0.8 MB. C_SYNCH_MEM0 is set to 1, and C_OPB_CLK_PERIOD_PS is set to 37037 to match the OPB clock.
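The parameter values above can be sanity-checked with a few lines of arithmetic. The following is only a sketch in C: the constants are the ones quoted in the text, while the helper names (span_bytes, period_ps_to_khz) are hypothetical and not part of the project code.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the text for the opb_emc instance and the vidcap core. */
#define MEM0_BASEADDR     0x80100000u
#define MEM0_HIGHADDR     0x801FFFFFu
#define FB_ADDR           0x80105000u  /* C_FBADDR */
#define OPB_CLK_PERIOD_PS 37037u       /* C_OPB_CLK_PERIOD_PS */

/* Hypothetical helper: size in bytes of the inclusive range [base, high]. */
static uint32_t span_bytes(uint32_t base, uint32_t high)
{
    assert(high >= base);
    return high - base + 1u;
}

/* Hypothetical helper: clock frequency in kHz for a period in picoseconds.
 * There are 1e9 picoseconds in a millisecond, so 1e9 / period gives the
 * number of cycles per millisecond, which is the frequency in kHz. */
static uint32_t period_ps_to_khz(uint32_t period_ps)
{
    assert(period_ps > 0u);
    return 1000000000u / period_ps;
}
```

With these values, span_bytes(MEM0_BASEADDR, MEM0_HIGHADDR) is 0x100000 bytes (1 MB), span_bytes(FB_ADDR, MEM0_HIGHADDR) is 0xFB000 bytes available from the framebuffer address to the end of the block, and period_ps_to_khz(OPB_CLK_PERIOD_PS) is 27000, i.e. a 27 MHz OPB clock.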
[6] Acquired from http://www.idt.mdh.se/kurser/ct3410/ibm_cc_2_9/tk/OPB_TOOLKIT2X_052401.tar.gz
[7] Datasheet can be found in the hw\XilinxProcessorIPLib\pcores\opb_emc_v1_10_b\doc directory of a Xilinx EDK 6.1 installation

A dcm_module [8] (digital clock manager module) core is also needed to synchronize the memory’s clock with the data. Default parameters are used with the exception of C_CLKFX_MULTIPLY, C_CLKIN_PERIOD, and C_EXT_RESET_HIGH, which were set to 1, 37.037, and 0 respectively. All the necessary pin assignments are located as usual in system.ucf.

2.3 GPIO for Controlling Video Capture (2 bits)

A 2-bit GPIO core (opb_gpio_1) is used to control the video capture core. The “GPIO_d_out” port is connected to the “inflags” port of the vidcap core. Although only one bit is used to enable and disable the vidcap core, the vector definition makes it easy to add more configuration flags should the need arise in the future. Since this GPIO instance is used strictly for output, the C_TRI_DEFAULT parameter must be set to 0, and C_DOUT_DEFAULT should also be set to 0 to avoid framebuffer writes on startup. C_WIDTH is set to 2 to make the core 2 bits wide.

2.4 GPIO’s for Chessboard Registers

Because chess is a game where only one piece can be moved at a time (except for certain easily identifiable circumstances), only three states are necessary to describe each square: black piece present, white piece present, or no piece present (empty). With an additional error state, two bits are required for each square:

State         Binary value
Empty         00
Black piece   01
White piece   10
Error         11

Table 2.1: Possible states for each square

A square is assigned the error state when the program cannot recognize its color. This situation can arise when the user is moving a piece and their hand is over the board.
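The encoding in Table 2.1, together with the four 32-bit registers of two rows each mentioned in section 1.3, can be sketched in C as follows. The state values come from the table; the function names and, in particular, the bit position of each square inside a register are assumptions made for illustration, since the report does not state the actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Two-bit square states from Table 2.1. */
enum square_state {
    SQ_EMPTY = 0x0,
    SQ_BLACK = 0x1,
    SQ_WHITE = 0x2,
    SQ_ERROR = 0x3
};

/* ASSUMED layout: register (row / 2) holds rows 2k and 2k+1, and the
 * square at (row, col) starts at bit 2 * ((row % 2) * 8 + col). */
static unsigned square_shift(int row, int col)
{
    return 2u * (unsigned)((row % 2) * 8 + col);
}

/* Pack a row-major array of 64 states into four 32-bit registers. */
static void pack_board(const uint8_t board[64], uint32_t regs[4])
{
    for (int w = 0; w < 4; w++)
        regs[w] = 0u;
    for (int row = 0; row < 8; row++)
        for (int col = 0; col < 8; col++)
            regs[row / 2] |= (uint32_t)(board[row * 8 + col] & 0x3u)
                             << square_shift(row, col);
}

/* Read one square back out of the packed registers. */
static enum square_state get_square(const uint32_t regs[4], int row, int col)
{
    assert(row >= 0 && row < 8 && col >= 0 && col < 8);
    return (enum square_state)((regs[row / 2] >> square_shift(row, col)) & 0x3u);
}
```

Sixteen squares at two bits each fill one 32-bit register exactly, which is why four registers (128 bits) cover the whole board.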
Categorizing each square into one of these four categories is a simple task, and the algorithm is explained in the next section (2.5). After all 64 squares are checked, the 2-bit values are packed into four 32-bit variables. These values must be transferred to the independent chess control system. This is achieved by using additional GPIO instances. Two are attached to the video system’s OPB, and two are attached to the chess control system’s OPB. The GPIO_d_out ports of the former are connected to the GPIO_in ports of the latter.

[8] Datasheet can be found in the hw\XilinxProcessorIPLib\pcores\dcm_module_v1_00_a\doc directory of a Xilinx EDK 6.1 installation

Only two 32-bit GPIO instances (board_gpio1 and board_gpio2 in the design) are required to store four 32-bit variables, as a single 32-bit GPIO instance holds two 32-bit registers when the C_IS_DUAL parameter is set to 1. Figure 1 (obtained from the GPIO datasheet) below shows how this parameter effectively doubles the GPIO hardware. The C_TRI_DEFAULT and C_TRI_DEFAULT_2 parameters are also set to 0 to indicate output ports.

2.5 Recognition Software

The video recognition software accesses the chessboard image that is located in the ZBT memory at the address specified in the vidcap core parameter C_FBADDR. Video data in the memory consists of 263 horizontal lines. The real picture has 525 horizontal lines, composed of 263 odd lines (lines 1, 3, 5, etc.) and 262 even lines (lines 2, 4, 6, and so on). Normally, these lines are interlaced with each other to form the final picture. In this project it is not necessary to store all 525 lines in memory; using only the odd lines results in an adequate picture of the board (Figure 2). For the purposes of this project, each horizontal line consists of 360 pixels, with each pixel containing four data elements: two Y (luminance) elements, one Cr (red difference) element and one Cb (blue difference) element.
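A minimal sketch of how such a framebuffer could be indexed from C is given below. The line count and pixels per line come from the text. Everything else is an assumption made for illustration: that each four-element pixel occupies one 32-bit word, that lines are stored back to back without padding, and that the bytes within a word are ordered Cb, Y0, Cr, Y1 from most to least significant. The names are hypothetical and are not taken from vidcap.c.

```c
#include <assert.h>
#include <stdint.h>

#define FB_LINES          263  /* odd lines stored in memory */
#define FB_PIXELS_PER_LINE 360 /* four-element pixels per line */

/* One pixel as described in the text: two Y values, one Cb, one Cr. */
struct ycbcr_pixel {
    uint8_t y0, y1, cb, cr;
};

/* ASSUMED byte order inside a word, MSB first: Cb, Y0, Cr, Y1. */
static struct ycbcr_pixel unpack_pixel(uint32_t word)
{
    struct ycbcr_pixel p;
    p.cb = (uint8_t)(word >> 24);
    p.y0 = (uint8_t)(word >> 16);
    p.cr = (uint8_t)(word >> 8);
    p.y1 = (uint8_t)(word);
    return p;
}

/* ASSUMED layout: line after line, no padding between lines.
 * Returns the word index of pixel (x, y) relative to C_FBADDR. */
static uint32_t pixel_index(int x, int y)
{
    assert(x >= 0 && x < FB_PIXELS_PER_LINE);
    assert(y >= 0 && y < FB_LINES);
    return (uint32_t)y * FB_PIXELS_PER_LINE + (uint32_t)x;
}
```

Under these assumptions a pixel would be read as unpack_pixel(fb[pixel_index(x, y)]), where fb is a pointer to 32-bit words at the C_FBADDR address.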
Figure 2: Odd lines of a test image captured using the video capture core

2.5.1 Camera Setup

The camera is fixed to a slanted tripod so that it is directly over the chessboard without interfering with its moving parts. Once the camera is fixed and oriented correctly, the axes of the board are aligned with the axes of the camera image. This allows the vertical boundaries of all squares in a row to be marked by top and bottom Y values, and likewise the horizontal boundaries of all squares in a column to be marked by left and right X values. In vidcap.c, the left and right boundaries for each column are stored in the arrays xmin and xmax respectively, and the top and bottom boundaries are stored in the arrays ymax and ymin respectively. These values do not cover the entire square, which allows for some movement of the board and prevents a tall piece from overlapping another square through perspective.

2.5.2 Piece Recognition

The piece recognition code scans the Y, Cr and Cb values of all dots in one square. Black pieces have red squares attached at the top and white pieces have green squares attached to them, because green and red are much easier to recognize on a black-and-white board. The color signals of each individual dot in the square are compared to preset values of the Y, Cr and Cb signals for each type of square.

The program does not compare all three signals for each square. For example, to determine whether a black (red) piece is standing on the square, it is only necessary to check the Cr values of all dots in the square. The Cr value represents the red color signal, so a square with a red piece on it will have a much higher average Cr value than other types of squares. Similarly, a green dot has a much higher luminance (Y) value than other dots, and its Cr and Cb values are approximately equal. Therefore, it is only necessary to check for a high Y value and for the difference between the Cr and Cb values.
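The red and green tests just described can be sketched as a small C function. This is an illustration only: the threshold constants are placeholders chosen for the example, not the preset values used in vidcap.c, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum dot_kind { DOT_OTHER, DOT_RED, DOT_GREEN };

/* PLACEHOLDER thresholds for illustration; the real preset values
 * depend on the camera and lighting and are not given in the report. */
#define CR_RED_MIN     160  /* Cr at or above this counts as red      */
#define Y_GREEN_MIN    150  /* Y at or above this counts as bright    */
#define CRCB_MAX_DIFF   12  /* |Cr - Cb| at or below this is "equal"  */

/* Classify one dot using only the signals the text says are needed:
 * Cr alone for red; high Y plus a small Cr/Cb difference for green. */
static enum dot_kind classify_dot(uint8_t y, uint8_t cb, uint8_t cr)
{
    int diff = (int)cr - (int)cb;
    if (diff < 0)
        diff = -diff;

    if (cr >= CR_RED_MIN)
        return DOT_RED;
    if (y >= Y_GREEN_MIN && diff <= CRCB_MAX_DIFF)
        return DOT_GREEN;
    return DOT_OTHER;
}
```

A per-square scan would call classify_dot for every dot inside the square's xmin/xmax and ymin/ymax bounds and increment a counter for each kind returned.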
Dots in empty squares are identified by checking a particular range for all three video signals, because the beige (white) and dark-brown (black) colors of the squares on the board are combinations of all three signals. A margin of error is also applied when signal values are compared to the preset values. This margin is required because the squares are unevenly lit: squares of the same color look different from each other because light from the lamp falls onto them at different angles.

After the square is fully scanned, the program checks its counters of red, green, black and white dots. The square is then assigned a two-bit value based on which counter prevails, and this value is shifted into a 32-bit variable. The program then moves on to the next square until all 64 squares have been scanned. When scanning of the entire board is complete, the four 32-bit variables containing information about all squares are written out through the two 32-bit dual GPIO cores.

3. Chess Control System

The chess control system is, for the most part, just a software system, and is located in the ChessAI directory. Other than a Microblaze instantiation and external memory, there are only a few GPIO instances to connect this system with the video recognition system and the chessboard.

The greatest complexity of this system lies in the software, which is large enough that it requires two banks of ZBT memory to reside in, mostly occupied by the third-party chess engine used to compute moves. The MP3 decoder example from Xilinx was used as a base system for this reason, and all the hardware added for MP3 decoding and sound capability was removed. Any basic elements, such as a UART instantiation for remote debugging, that are present in a system built with the “Base System Builder” feature of Xilinx Platform Studio were also added and connected appropriately.
The result is a basic system with two banks of ZBT memory. Other peripherals that were added to this system are described in the following sections.

However, this system was not tested at all, as effort was directed towards making a much simpler system work alongside the video system. The chess engine was removed and replaced with pre-programmed moves so that the ZBT memory would not be necessary. The only significant code that remained was the code controlling the board. Unfortunately, it seems this effort was misguided, as many unusual errors arose even when this basic system was compiled in the presence of the video system. The following sections describe the final untested chess control system (including ZBT controllers, derived from the MP3 example) that was created late in the project timeline, and not the basic test system.

3.1 ZBT Controllers

Two instances of the opb_zbt_controller core from Xilinx were created to control the two banks of memory. The C_ZBT_ADDR_SIZE parameter was set to 19 in order to control 2 MB (512k x 32 bit) of memory, and the C_EXTERNAL_DLL parameter was set to 1 to indicate the use of a digital clock manager.

Two digital clock managers were used in this design, but they are not the same as in the video system. Here, they are instances of the dcm_block core found in the pcores directory. They serve the same function as those mentioned in section 2.2, and are connected similarly.

3.2 GPIO’s for Chessboard Registers

As explained in section 2.4, GPIO instances were used to transfer the chessboard information from the video system to this system. The two GPIO instances on this system, however, only need to receive input. Thus the only parameters that need to be set are C_ALL_INPUTS, C_ALL_INPUTS_2 and C_IS_DUAL, each to a value of 1.
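Section 1.3 notes that the chess control system polls these registers for any change. One way this comparison could look in C is sketched below, assuming only that each register packs sixteen 2-bit square states; the function name and the square numbering (register index times 16 plus position within the register) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compare two snapshots of the four chessboard registers and list the
 * squares whose 2-bit state changed. Squares are numbered 0..63 as
 * (register index * 16 + position in register); how that number maps
 * to a board row and column depends on the packing, which is ASSUMED.
 * Returns the total number of changed squares; at most max_out indices
 * are written to out[]. */
static int changed_squares(const uint32_t before[4], const uint32_t after[4],
                           int out[], int max_out)
{
    int count = 0;
    for (int w = 0; w < 4; w++) {
        uint32_t diff = before[w] ^ after[w];
        for (int s = 0; s < 16; s++) {
            if ((diff >> (2 * s)) & 0x3u) {
                if (count < max_out)
                    out[count] = w * 16 + s;
                count++;
            }
        }
    }
    return count;
}
```

A simple move shows up as exactly two changed squares, one becoming empty and one becoming occupied, whereas a return value of zero means the polling loop can keep waiting.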
3.3 GPIO for Controlling VGA Port

The multimedia board has an FMS3810 VGA encoder, and its input pins (connected to the FPGA) control the voltage level of the red, blue, and green pins of the VGA port (see Appendix C3). The h-sync and v-sync pins are directly connected to pins on the FPGA. In the chess control system, all these pins are accessed by connecting them to the GPIO_IO port of a GPIO instance. This GPIO instance is 29 bits wide and is used strictly for output. For these reasons, the parameters C_TRI_DEFAULT and C_WIDTH are set to 0 and 29 respectively. The GPIO_IO port is added and made external so that it can connect to the required FPGA pins.

3.4 Timer and Interrupt Controller

An instance of the opb_timer core is used to keep track of time. This is needed so that the chess engine knows when it must finish calculating its move. An interrupt controller (an instance of the opb_intc core), while not necessary, is added so that other events in the future can trigger the chess engine to stop thinking.

3.5 Chess Control Software

The chess control software is the main component of this system. It has two main parts: one controls the signals to the chessboard and the other controls the gameplay.

The code that controls the chessboard is contained in the class cBoardControl. It borrows a lot of code from the original chessboard, but is rewritten in C++ for better organization. The chess pieces used are small enough that a piece can pass between the others when moving from one part of the board to another. This is especially useful in the special cases of piece capture, knight moves, and castling. All these special cases are accounted for in this class. The GPIO device controlling VGA output is written to in the “UpdatePort()” function, and the values used to set the voltages on the red and green pins are stored in the array named “step”. These values were determined experimentally so that the circuit in Appendix C4 functioned correctly.
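Section 4.2 explains that four voltage levels on each of the red and green pins carry two bits of information per pin. The lookup below is a sketch in C of how two one-bit signals might select one of four output codes. The table contents are evenly spaced placeholders assuming an 8-bit code per color channel; the values actually stored in “step” were found experimentally and are not given in this report, and the names used here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* PLACEHOLDER codes for four voltage levels on one color channel.
 * Evenly spaced for illustration only; the real values were tuned
 * by experiment against the decoder circuit. */
static const uint8_t level_code[4] = { 0, 85, 170, 255 };

/* Pick the code that carries two one-bit signals on one channel.
 * sig_a becomes the high bit of the level index and sig_b the low bit. */
static uint8_t encode_channel(int sig_a, int sig_b)
{
    assert(sig_a == 0 || sig_a == 1);
    assert(sig_b == 0 || sig_b == 1);
    return level_code[(sig_a << 1) | sig_b];
}
```

With one such lookup for the red channel and one for the green channel, four signals travel over two pins, leaving h-sync and v-sync free for the remaining two signals.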
Two other classes that were created are a cBoard class and a cChessAI class. The former is used to store a chessboard in a form similar to that described in section 2.4. It is also capable of determining which piece was moved by comparing two such boards. The cChessAI class is simply a wrapper for the third-party chess engine. It updates the chessboard data structures in the chess engine, acquires computed moves, and runs the “PonderPosition()” function while waiting for the user’s move. This assists the computer when it is time for its own move.

The chess engine used was Beowulf [9], a capable third-party open source chess engine. It was chosen for its strength and ease of integration relative to other chess engines. Many modifications had to be made to the roughly 20,000 lines of C code included in this engine, most importantly the elimination of all file access. Since the chess engine requires an opening book, this data had to be stored in a fixed array instead. Dynamic memory allocation was also eliminated since there is very limited memory on the multimedia board, and allocation errors were very likely. The user interface also had to be completely removed so that the engine would work with the game flow of BoardControl.cpp, the parent C++ file for this project that contains the program entry point “main()”. Other system calls, such as those to the Windows timer, also needed replacing. Overall, this was a very large task.

[9] Available at www.chessbrain.net/beowulf

This system was tested and debugged on a PC first. All writes to GPIO were disabled, and the chessboard registers were changed manually in the debugger rather than by the video system. Through the use of #ifdef _WIN32 … #endif constructs, minimal code changes were necessary when porting the debugged PC version of the software to the Microblaze system.

4. Chessboard

The chessboard is constructed like a plotter beneath the playing surface.
Stepper motors control the x and y position of a strong magnet that is mounted on a solenoid. When current passes through the solenoid, the magnet moves close to the board, allowing it to hold and move the piece directly above it when the plotter moves. When there is no current, the magnet is far enough from the board that the plotter mechanism moves it without affecting the position of any pieces.

4.1 Power Supply

While the chessboard already had electronics to control the stepper motors, it used an unorthodox setup needing both +12V and -12V power sources with significant current capacity to drive the motor coils in different directions. A better solution is to use an H-bridge circuit, which allows the motors to be driven from a single 12V supply, easily obtainable from any computer power supply. This was accomplished through the use of L298 H-bridge ICs (see Appendix C1 for pinouts). The power could now be supplied by a standard AT power supply. The newer ATX power supply standard requires loading the +5V terminals in order to turn on, adding unnecessary complication.

4.2 Control Electronics

The larger design issue is interfacing the chessboard to the multimedia board. Since there are no general purpose digital output ports, the VGA port was used to deliver the required information to the board. The VGA port consists of red, green, blue, h-sync and v-sync signals. The chessboard needs two signals for each of the two motors, one signal to control the magnet, and one signal to disable current through the motors. The last signal is desired because stepper motors dissipate energy even when they are not turning, and can thus get very hot, especially during a lengthy chess game. Therefore six signals are needed, but only five are available. Furthermore, our particular multimedia board had a malfunctioning blue signal. This problem is circumvented by multiplexing two signals onto each of the red and green pins.
Using 4 voltage levels on these pins, two bits of information can be carried on each. Appendix C4 shows the details of the circuit used to decode the voltage from one of these pins into four digital TTL-compatible signals. These output signals are connected to the corresponding input pins of the L298 (pins 5, 7, 10, and 12). The stepper motors have two coils: one is driven by pins 2 and 3, and the other by pins 13 and 14. The remaining pins are connected as suggested in the upper diagram of Appendix C1. This entire circuit was duplicated for the second motor.

The h-sync signal controls the magnet via a TIP29 power transistor, and the v-sync signal controls the two enable inputs (pins 11 and 6 on the L298) of the two H-bridge circuits. The VGA signals thus have full control of the chessboard.

5. Project Outcomes

All components of this system work well separately, and the connection between the chess control system and chessboard electronics is also flawless. The major problem that prevented completion of this project occurred when merging the chess control system with the video recognition system.

5.1 Video Recognition System Outcomes

The video capture and pattern recognition system works well. The image is captured from the video input into external memory, and each square on the board is scanned and assigned the correct value based on the type of piece standing on it. The values are also packed correctly into the chessboard registers. There was a minor issue in this system that was given low priority; see section 5.3 for a description of this issue.

5.2 Chessboard and Control System Outcomes

The chess control system also functioned correctly, although the complete system with AI was never fully tested on the multimedia board. As explained in section 3, efforts were concentrated on getting the basic control system, without the chess engine or external memory, to merge with the video recognition system, but to no avail.
Nevertheless, the VGA port of the multimedia board output the correct signals, and the software controlled the chessboard perfectly.

The chessboard itself was also completely functional. Although this is to be expected, since the board was functional before this project began, the new electronics added to it were by no means trivial. These circuits functioned perfectly, and demonstrated a creative and successful workaround for the limited means of outputting arbitrary signals from the multimedia board. This allowed the chess control software to correctly control the movement of pieces on the chessboard. The only major problem encountered was an apparent short circuit at the power supply plug during the project presentation, which made it impossible to properly showcase the piece movement.

5.3 Unresolved Issues

As mentioned above, problems with adding another Microblaze system to the video recognition system prevented further integration of all system components. We were able to get two simple Microblaze systems working together after hours of searching through the many files generated by XPS as well as the online ISE documentation, but could not do the same with the video recognition system.

There was also a minor problem with the video recognition system. The video capture would only work if the vidcap core was enabled and disabled by manually stepping through the code that asserted and deasserted the 2-bit GPIO signals (see section 2.3). When a delay loop was used instead to allow the vidcap core enough time to capture a frame, nothing was written to memory, even with long delay times. One theory as to the cause of this problem is that the delay loop causes the Microblaze to saturate the OPB when incrementing the counter, thus blocking the video data from being written. Another theory is that the debugger constantly polls the UART and saturates the system in that way.
There was not enough time to test either theory, unfortunately, as other aspects of the project had higher priority.

5.4 Further Work

The obvious direction for further work is to complete the integration of the video recognition system with the chess control system. If the minor problem with the video recognition system were also fixed, the resulting system would be a complete chess-playing computer that can recognize the opponent’s moves, find the best reply using the Beowulf chess engine, and move the required pieces on the board, thus satisfying the project’s goals.

However, should this task prove too difficult, it is possible to run the video capture software and the chess AI software on one Microblaze processor, provided there is enough external memory available. In that case there would be no pondering ability (see section 3.5), since there would be only a single execution thread. In hindsight, this would have been a better course of action once it was discovered that multiple processors were causing problems.

6. Design Tree Description

ChessAI\ – Contains the chess control system.
ChessAI\code\ – The source code for the software component of the chess control system
    BoardController.cpp – Parent file containing the entry point
    cBoardControl.cpp/h – Class controlling input to chessboard electronics
    cBoard.cpp/h – Chessboard class
    cChessAI.cpp/h – Chess engine wrapper class
    Beowulf\ – Directory containing chess engine source code. The following files were significantly modified from the original Beowulf source:
        main.c – Added the initialization of the chess engine
        parser.c – Added functions called by the wrapper class
        computil.c – Modified time measurement and user input functions
vidcap\ – Contains the video recognition system
vidcap\sst2 – A TCL script used to download the framebuffer into a file.
vidcap\chdump.txt – A partial memory dump of the framebuffer showing a chessboard
vidcap\code\vidcap.c – The source code for the video recognition
FrameBufferViewer\ – Contains an application created for the PC to view a framebuffer dump. “chdump.txt”, mentioned above, can be viewed with this application. Both the source code and the executable are included in this directory.
sim\ – Contains simulation files
sim\OPB_TOOLKIT2X_052401\ – The IBM OPB Toolkit
sim\OPB_TOOLKIT2X_052401\vidcap.bfl – BFL script for OPB simulation
sim\OPB_TOOLKIT2X_052401\vidcap.do – Generated ModelSim .do script
sim\vidcap.do – ModelSim script used for simulation
sim\gen_vid_tb.pl – Perl script to generate video stream stimulus
sim\stimvhd.txt – Output of gen_vid_tb.pl
sim\opb_complex.vhd – Modified testbench using the vidcap core
sim\vidcap.wlf – ModelSim waveform containing simulation results

Appendix A: Video Capture Core

A1: vidcap.vhd code listing:

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;

entity vidcap is --USER-- change entity name
  generic (
    C_OPB_AWIDTH : INTEGER := 32;
    C_OPB_DWIDTH : INTEGER := 32;
    C_FAMILY     : string := "virtex2";
    C_FBADDR     : std_logic_vector(0 to 31) := X"00005000"
  );
  port (
    --Required OPB bus ports, do not add to or delete
    Mn_ABus     : out std_logic_vector(0 to C_OPB_AWIDTH - 1);
    Mn_DBus     : out std_logic_vector(0 to C_OPB_DWIDTH - 1);
    Mn_request  : out std_logic;
    Mn_busLock  : out std_logic;
    Mn_select   : out std_logic;
    Mn_RNW      : out std_logic;
    Mn_BE       : out std_logic_vector(0 to C_OPB_DWIDTH/8 - 1);
    Mn_seqAddr  : out std_logic;
    OPB_Clk     : in std_logic := '0';
    OPB_Rst     : in std_logic := '0';
    OPB_DBus    : in std_logic_vector(0 to C_OPB_DWIDTH - 1) := (others => '0');
    OPB_MGrant  : in std_logic := '0';
    OPB_xferAck : in std_logic := '0';
    OPB_errAck  : in std_logic := '0';
    OPB_retry   : in std_logic := '0';
    OPB_timeout : in std_logic := '0';
    --User ports
    led1     : out std_logic;
    led2     : out std_logic;
    YCrCb_in : in std_logic_vector(9 downto 0);
    vid_clk  : in std_logic;
    inflags  : in std_logic_vector(0 to 1)
  );
end entity vidcap;

architecture imp of vidcap is

  signal Mn_select_s  : std_logic := '0';
  signal Mn_request_s : std_logic := '0';

  signal pixeldata : std_logic_vector (31 downto 0) := X"00000000";
  signal vidcount  : std_logic_vector (0 to 31);
  signal cnt       : std_logic_vector (0 to 31) := X"00000000";
  signal write_ena : std_logic;

  signal H_rg      : std_logic_vector (4 downto 0);
  signal H_rising  : std_logic;
  signal TRS       : std_logic;
  signal V_falling : std_logic;
  signal V_rising  : std_logic;
  -- the next 5 signals act as a fifo for video data
  signal YCrCb_rg1 : std_logic_vector (9 downto 0);
  signal YCrCb_rg2 : std_logic_vector (9 downto 0);
  signal YCrCb_rg3 : std_logic_vector (9 downto 0);
  signal YCrCb_rg4 : std_logic_vector (9 downto 0);
  signal YCrCb_rg5 : std_logic_vector (9 downto 0);

  signal Fo : std_logic;
  signal Ho : std_logic;
  signal Vo : std_logic;

  -- These signals are used to request a bus transfer asynchronously
  -- wrt the OPB clock (i.e. as soon as video data is ready)
  signal MstReq2 : std_logic;
  signal MstReq3 : std_logic;
  signal NewData : std_logic;

begin

  -- These signals should be hooked up to LEDs
  -- led1 is high when writing is enabled
  -- led2 flashes about 7 times per second when a valid
  -- video source is hooked up
  led1 <= write_ena;
  led2 <= cnt(10);

  --inflags(0) enables video capture
  write_ena <= inflags(0);

  --constant signals
  Mn_busLock <= '0';
  Mn_RNW     <= '0';
  Mn_BE      <= (others => '1');
  Mn_seqAddr <= '0';

  --the request and select signals of OPB
  Mn_request <= Mn_request_s;
  Mn_select  <= Mn_select_s;

  -- This process controls the bus signals and follows
  -- the OPB transfer protocol. When MGrant is high,
  -- deassert the request signal (via MstReq2) and assert
  -- the select signal.
  -- Deassert the select signal when any termination
  -- signal (xferAck, timeout, or retry) is high.
  process(OPB_Clk)
    variable addr : std_logic_vector(0 to C_OPB_AWIDTH - 1);
  begin
    if OPB_Clk'event and OPB_Clk = '1' then
      -- reset signals
      if OPB_Rst = '1' then
        Mn_select_s <= '0';
        MstReq2 <= '0';
      elsif OPB_MGrant = '1' then
        MstReq2 <= '0';
        Mn_select_s <= '1';
      elsif MstReq3 = '0' then
        -- Follow NewData at the clock edge
        MstReq2 <= NewData;
      end if;
      if OPB_xferAck = '1' or OPB_timeout = '1' or OPB_retry = '1' then
        Mn_select_s <= '0';
      end if;
      -- MstReq3 is the OPB clock synchronized version of NewData
      MstReq3 <= NewData;
    end if;
  end process;

  -- TRS (timing reference signal) recognizes the presence of
  -- special codes embedded in the video stream
  TRS <= '1' when YCrCb_rg2(9 downto 2) = "00000000" and
                  YCrCb_rg3(9 downto 2) = "00000000" and
                  YCrCb_rg4(9 downto 2) = "11111111" else '0';

  -- The following line allows Mn_request to be asserted immediately after
  -- NewData goes high, but allows Mn_request to be deasserted synchronous
  -- to the OPB clock via MstReq2 above.
  Mn_request_s <= ((NewData and not MstReq3) or MstReq2) and not Mn_select_s;

  -- NewData cycles at 1/4 the speed of vidclk, but is low during the
  -- horizontal blanking period
  NewData <= write_ena and not (Ho or vidcount(30) or OPB_Rst);

  -- The OPB arbiter ORs all data and address busses together, so these
  -- signals must only be set high when this device controls the bus
  Mn_DBus <= pixeldata when Mn_select_s = '1' else (others => '0');
  -- Add the pixel count to the base address of the framebuffer
  Mn_ABus <= ("0000000000" & (vidcount(10 to 29) & "00") + C_FBADDR)
             when Mn_select_s = '1' else (others => '0');

  -- Video signals
  Ho        <= H_rg(0) or H_rg(4) ;
  H_rising  <= H_rg(0) and not H_rg(1) ;
  V_rising  <= ( TRS and YCrCb_rg1(7) ) and not Vo ;
  V_falling <= ( TRS and not YCrCb_rg1(7) ) and Vo ;

  -- Process synchronous to the video clock
  process (vid_clk)
  begin
    if (OPB_Rst = '1' ) then
      YCrCb_rg1 <= (others => '0') ;
      YCrCb_rg2 <= (others => '0') ;
      YCrCb_rg3 <= (others => '0') ;
      YCrCb_rg4 <= (others => '0') ;
      YCrCb_rg5 <= (others => '0') ;
      Fo <= '0' ;
      Vo <= '0' ;
      H_rg <= "00000";
      vidcount <= (others => '0');
    elsif vid_clk'event and vid_clk = '1' then
      -- shift data down the FIFO
      YCrCb_rg1 <= YCrCb_in ;
      YCrCb_rg2 <= YCrCb_rg1 ;
      YCrCb_rg3 <= YCrCb_rg2 ;
      YCrCb_rg4 <= YCrCb_rg3 ;
      YCrCb_rg5 <= YCrCb_rg4 ;
      -- decode the timing information from the special signal
      if ( TRS = '1' ) then
        Fo <= YCrCb_rg1(8) ;
        Vo <= YCrCb_rg1(7) ;
        H_rg(4 downto 0) <= H_rg(4 downto 1) & YCrCb_rg1(6) ;
      else
        Fo <= Fo ;
        Vo <= Vo ;
        H_rg(4 downto 0) <= H_rg(3 downto 0) & H_rg(0) ;
      end if;
      -- reset and increment the pixel counter when appropriate
      if V_falling = '1' and Fo = '0' then -- end of frame
        vidcount <= (others => '0');
      elsif H_rg(4) = '1' and H_rg(3) = '0' then
        vidcount(30 to 31) <= "00";
      elsif Ho = '1' then
        vidcount(30) <= '1';
      else
        vidcount <= vidcount + 1;
      end if;
      -- counter for making led2 flash
      cnt <= cnt + 1;
    end if;
  end process;

  -- pack 4 values into a 32-bit register (pixeldata) every
  -- 4 video clocks
  process (NewData)
  begin
    if OPB_Rst = '1' then
      pixeldata <= (others => '0');
    elsif NewData'event and NewData = '1' then
      pixeldata <= YCrCb_rg2(9 downto 2) & YCrCb_rg3(9 downto 2) &
                   YCrCb_rg4(9 downto 2) & YCrCb_rg5(9 downto 2);
    end if;
  end process;

end architecture imp;

A2: vidcap_v2_1_0.mpd listing:

###################################################################
##
## Microprocessor Peripheral Definition
##
###################################################################

BEGIN vidcap, IPTYPE = PERIPHERAL, IMP_NETLIST=TRUE

BUS_INTERFACE BUS = MOPB, BUS_STD = OPB, BUS_TYPE = MASTER

## Generics for VHDL or Parameters for Verilog
PARAMETER c_opb_awidth = 32, DT = integer
PARAMETER c_opb_dwidth = 32, DT = integer
PARAMETER c_family = virtex2, DT = string
PARAMETER c_fbaddr = 0x00005000, DT = std_logic_vector

## Ports
PORT mn_abus = M_ABus, DIR = OUT, VEC = [0:(c_opb_dwidth-1)], BUS = MOPB
PORT mn_dbus = M_DBus, DIR = OUT, VEC = [0:(c_opb_dwidth-1)], BUS = MOPB
PORT mn_request = M_request, DIR = OUT, BUS = MOPB
PORT mn_buslock = M_busLock, DIR = OUT, BUS = MOPB
PORT mn_select = M_select, DIR = OUT, BUS = MOPB
PORT mn_rnw = M_RNW, DIR = OUT, BUS = MOPB
PORT mn_be = M_BE, DIR = OUT, VEC = [0:((c_opb_dwidth/8)-1)], BUS = MOPB
PORT mn_seqaddr = M_seqAddr, DIR = OUT, BUS = MOPB
PORT opb_clk = "", DIR = IN, SIGIS = CLK, BUS = MOPB
PORT opb_rst = OPB_Rst, DIR = IN, BUS = MOPB
PORT opb_dbus = OPB_DBus, DIR = IN, VEC = [0:(c_opb_dwidth-1)], BUS = MOPB
PORT opb_mgrant = OPB_MGrant, DIR = IN, BUS = MOPB
PORT opb_xferack = OPB_xferAck, DIR = IN, BUS = MOPB
PORT opb_errack = OPB_errAck, DIR = IN, BUS = MOPB
PORT opb_retry = OPB_retry, DIR = IN, BUS = MOPB
PORT opb_timeout = OPB_timeout, DIR = IN, BUS = MOPB
PORT led1 = "", DIR = OUT, BUS = MOPB
PORT led2 = "", DIR = OUT, BUS = MOPB
PORT YCrCb_in = "", DIR = IN, VEC = [9:0], BUS = MOPB
PORT vid_clk = "", DIR = IN, BUS = MOPB
PORT inflags = "", DIR = IN, VEC = [0:1], BUS = MOPB

END

Appendix B: Video capture core simulation

B1: OPB bus functional language test script (vidcap.bfl)

set_device (path=/opb_complex/opb_arbiter,device_type=opb_arbiter)
-- Initialize arbiter
Configure(arbiter_mode=round)

set_device (path=/opb_complex/opb_device2,device_type=opb_device)
-- Initialize slave 2
configure(s_byte_enable=true)
mem_init(addr=00021000,data=AABBCCDD)
mem_init(addr=00021004,data=AABBCCDD)
mem_init(addr=00021008,data=AABBCCDD)
mem_init(addr=0002100C,data=AABBCCDD)
mem_init(addr=00021010,data=AABBCCDD)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=2,ack_type=normal,ack_size=4)
response(delay=4,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)
response(delay=1,ack_type=normal,ack_size=4)

B2: Modified device instantiation

Excerpted from the testbench “opb_complex”:

OPB_DEVICE0: vidcap
  generic map (
    C_FBADDR => C_FBADDR
  )
  port map (
    OPB_Clk     => OPB_CLK ,
    OPB_Rst     => OPB_RESET ,
    OPB_MGrant  => OPB_MxGRANT(0) ,
    OPB_xferAck => OPB_XFERACK ,
    OPB_errAck  => OPB_ERRACK ,
    OPB_retry   => OPB_RETRY ,
    OPB_timeout => OPB_TIMEOUT ,
    OPB_DBus    => to_stdlogicvector(OPB_DBUS(0 to 31)),
    Mn_request  => Mx_REQUEST(0) ,
    Mn_busLock  => M0_BUSLOCK ,
    Mn_select   => Mx_SELECT(0) ,
    Mn_RNW      => M0_RNW ,
    Mn_seqAddr  => M0_SEQADDR ,
    Mn_BE       => Mn_BE ,
    Mn_ABus     => Mn_ABus ,
    Mn_DBus     => Mn_DBus ,
    YCrCb_in    => YCrCb_in ,
    vid_clk     => vid_clk ,
    inflags     => inflags
  );

inflags(0) <= write_ena;
M0_BEXFER <= Mx_SELECT(0);
M0_HWXFER <= Mx_SELECT(0);
M0_FWXFER <= Mx_SELECT(0);
M0_DWXFER <= '0';
M0_UABUS <= (others => '0');
SL0_XFERACK <= '0';
SL0_BE <= (others => '0');
SL0_BEACK <= '0';
SL0_HWACK <= '0';
SL0_FWACK <= '0';
SL0_DWACK <= '0';
SL0_ERRACK <= '0';
SL0_TOUTSUP <= '0';
SL0_RETRY <= '0';
M0_BE <= to_stdulogicvector(Mn_BE) & "0000";
M0_ABUS <= to_stdulogicvector(Mn_ABus);
M_DBUS_0 <= to_stdulogicvector(Mn_DBus) & X"00000000";
M_DBUS_EN(0) <= Mx_SELECT(0);
M_DBUS_EN32_63(0) <= '0';

B3: Video test sequence

Excerpted from the testbench “opb_complex”:

vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "1011000100"; wait for 9.5 ns; -- 2C4
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "0110011100"; wait for 9.5 ns; -- 19C
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "0011000100"; wait for 9.5 ns; -- 0C4
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "1000100000"; wait for 9.5 ns; -- 220
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "1100001100"; wait for 9.5 ns; -- 30C
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "0100000100"; wait for 9.5 ns; -- 104
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "0011011100"; wait for 9.5 ns; -- 0DC
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "0100111100"; wait for 9.5 ns; -- 13C
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "1111001100"; wait for 9.5 ns; -- 3CC
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "1011111000"; wait for 9.5 ns; -- 2F8
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "0100001100"; wait for 9.5 ns; -- 10C
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "1111111111"; wait for 9.5 ns; -- 3FF
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "0000000000"; wait for 9.5 ns; -- 000
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "0000000000"; wait for 9.5 ns; -- 000
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns; YCrCb_in <= "1011011000"; wait for 9.5 ns; -- 2D8
vid_clk <= '1'; wait for 19 ns; vid_clk <= '0'; wait for 9.5 ns;

B4: Sample simulation results:

Appendix C: Electronics

C1: L298 dual H-bridge IC

Source: http://www.st.com/stonline/books/pdf/docs/1773.pdf

C2: LM741 op-amp

Source: http://www.national.com/ds/LM/LM741.pdf

C3: VGA connector pinout

Pin   Signal
1     Red
2     Green
3     Blue
4     ID Bit
5     N/C
6     Red Return
7     Green Return
8     Blue Return
9     No Pin
10    Ground
11    ID Bit
12    ID Bit
13    H Sync
14    V Sync
15    ID Bit

C4: Motor signal decoder circuit and operation

[Circuit diagram: two op-amp stages with input Vin, intermediate node V1, currents I1 to I4, and outputs Output 1 to Output 4. Component labels shown: resistors of 1k, 10k, 17k, 100k, 150k and 250k (twice); supply rails of +5V, -5V and -12V.]

This circuit functions much like a 2-bit analog-to-digital converter. The upper op-amp simply compares Vin with a reference voltage of 0.7 volts, and V1 is set to +5V when Vin is below the reference and to -5V when it is above. The lower op-amp is a summing amplifier. This op-amp doesn’t saturate like the previous one, so the inverting input will have a voltage of 0V. Since no appreciable current flows into the op-amp input itself, and I2 = -80 µA, it follows that:

I4 = I1 + I2 + I3 = I1 + I3 - 80 µA

The following table describes the operation of the circuit for the 4 values of Vin that will be used:

Vin      V1      Output1   Output2   I1        I3        I4        Output3   Output4
0.4 V    +5 V    +5 V      0 V       20 µA     40 µA     -20 µA    0 V       +5 V
0.6 V    +5 V    +5 V      0 V       20 µA     60 µA     0 µA      +5 V      0 V
0.8 V    -5 V    0 V       +5 V      -20 µA    80 µA     -20 µA    0 V       +5 V
1.0 V    -5 V    0 V       +5 V      -20 µA    100 µA    0 µA      +5 V      0 V

The inverters used in the circuit are from a 7404 hex inverter IC.
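
The decoding table above can be reproduced with a short C model. This is only a sketch under stated assumptions, not code from the project: the type and function names are hypothetical, the relations I1 = V1 / 250k and I3 = Vin / 10k are inferred from the table values rather than stated in the text, and the -10 µA threshold used to separate the two I4 cases is an arbitrary midpoint. The second function shows one way software could pick the voltage level for a desired pair of logic values, read directly off the table.

```c
#include <assert.h>
#include <stdbool.h>

/* Logic levels produced by the decoder for one VGA colour pin
   (hypothetical type, for illustration only). */
struct decoder_outputs {
    bool output1;
    bool output2;
    bool output3;
    bool output4;
};

/* Model of the Appendix C4 decoder using integer millivolts and microamps.
   Assumed relations (inferred from the table): I1 = V1 / 250k, I3 = Vin / 10k. */
struct decoder_outputs decode_vin_mv(int vin_mv)
{
    struct decoder_outputs out;
    assert(vin_mv >= 0);                      /* negative inputs are not modelled */

    bool v1_positive = vin_mv < 700;          /* comparator against the 0.7 V reference */
    int i1_ua = v1_positive ? 20 : -20;       /* +5 V or -5 V across 250k */
    int i3_ua = vin_mv / 10;                  /* Vin across 10k, in microamps */
    int i4_ua = i1_ua + i3_ua - 80;           /* I4 = I1 + I3 - 80 uA */

    out.output1 = v1_positive;
    out.output2 = !out.output1;               /* complement of Output 1 in the table */
    out.output3 = i4_ua > -10;                /* assumed threshold between -20 uA and 0 uA */
    out.output4 = !out.output3;               /* complement of Output 3 in the table */
    return out;
}

/* Inverse mapping taken from the table: choose the Vin level (in mV)
   that yields the requested Output 1 and Output 3 values. */
int encode_vin_mv(bool output1, bool output3)
{
    return 400 + (output1 ? 0 : 400) + (output3 ? 200 : 0);
}
```

For example, encode_vin_mv(false, true) selects the 1.0 V level, and passing 1000 to decode_vin_mv gives Output 1 low, Output 2 high, Output 3 high and Output 4 low, matching the last row of the table.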