2011 International Conference on Software and Computer Applications
IPCSIT vol.9 (2011) © (2011) IACSIT Press, Singapore

Video Processing Toolbox for FPGA Powered Hardware

Vladimir Kasik 1, Tomas Peterek 2
1,2 VSB – Technical University of Ostrava, FEECS, DMC, 17. listopadu 15, 708 33 Ostrava-Poruba, Czech Republic

Abstract. The presented work is aimed at image and video processing on a programmable FPGA platform. A set of image correction and effect functions implementable in hardware is introduced. A composite video signal is used as the real input and is converted to digital data in the ADV7180 device. The methods are tested on a Xilinx Spartan-3 series FPGA and verified with both a static image and a motion picture. The results are applicable in multimedia, embedded systems and biomedical engineering, especially in medical imaging.

Keywords: FPGA, Image Processing, Video Processing, Composite Video

1. Introduction

The standard approach to image processing is to use a conventional processor or the GPU on a graphics card. Using programmable logic for this purpose offers several advantages, the most important being fast data processing on specialized hardware. Since many applications use analog cameras for image capture, the system is designed for a composite video input. The signal is converted to digital RGB data and then passed to the FPGA device. The internal structure of the FPGA is designed to apply a few selected functions to the input image. The number of implemented image processing functions is restricted by the size and speed of the FPGA.

Fig. 1: Block Structure of the Video Kit.

The FPGA outputs are connected to D/A converters to obtain the final images. The video converter circuit is clocked by a 28.63636 MHz crystal, while the FPGA is driven by a 50 MHz crystal oscillator; internal FPGA clock signals with even higher frequencies are generated in the DCM (Digital Clock Manager).
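To illustrate how the higher internal clocks mentioned above can be derived from the 50 MHz board oscillator, the sketch below models the DCM's frequency synthesis. The multiply/divide factors shown are illustrative assumptions consistent with the fsys quoted later in Sec. 3.2, not values taken from the actual design.

```python
# Illustrative sketch (not the paper's actual configuration): a Spartan-3 DCM
# synthesizes f_out = f_in * M / D, where M (CLKFX_MULTIPLY) and
# D (CLKFX_DIVIDE) are chosen by the designer.

def dcm_clkfx(f_in_mhz: float, multiply: int, divide: int) -> float:
    """Return the DCM CLKFX output frequency in MHz."""
    return f_in_mhz * multiply / divide

# Example: deriving a 125 MHz system clock from the 50 MHz oscillator
# (M=5, D=2 are assumed factors matching the f_sys given in Sec. 3.2).
f_sys = dcm_clkfx(50.0, 5, 2)
print(f_sys)  # 125.0
```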
The entire video kit is controlled through a serial communication line connected to the FPGA. The logic system and all proposed features are designed for real-time image processing at a 25 Hz frame rate.

2. Videoconverter Module

The Analog Devices ADV7180 integrated circuit is used to convert the composite video signal to digital data. The circuit can itself detect the type of analog video input (composite, S-Video, YPrPb) and convert it to the 8-bit ITU-R BT.656 YCrCb 4:2:2 digital format. The analog baseband NTSC, PAL or SECAM video input enters the converter, where it first passes through an antialiasing filter. This is followed by a sample-and-hold circuit with a sampling frequency of 86 MHz and the 10-bit A/D converter itself. The digital data then proceed to a digital processing block (brightness and color correction, conversion to YCrCb), continue into a FIFO buffer, and are finally sent at specified intervals to the 8-bit parallel port P0 ÷ P7. The circuit also generates horizontal sync (HS) and vertical sync (VS) pulses and the output pixel data clock (LLC).

Fig. 2: Component Side Layer of the Videoconverter Module.

All functions and settings of the video converter can be controlled via the I2C bus with a 400 kHz clock signal. The videoconverter board is a separate module plugged into the FPGA unit via a connector.

Fig. 3: Composite Video Signal Measured in the Input Stage of the ADV7180 Device.

The electrical circuit of the video converter was largely taken from the manufacturer's recommended circuit for the 40-pin LFCSP version of the ADV7180. The circuit is designed with regard to EMC and video processing quality; therefore, four voltage regulators are used to supply the digital, analog and digital-signal parts separately. The outputs of the supply chains are filtered with the manufacturer's recommended circuits.
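Since the converter delivers YCrCb 4:2:2 data but the system works with RGB, a conversion step is needed in the pipeline. A minimal software sketch of the standard ITU-R BT.601 conversion for one BT.656-style 4:2:2 sample pair might look as follows; the function names are illustrative, not taken from the design.

```python
# Illustrative sketch of ITU-R BT.601 YCbCr -> RGB conversion for one
# 4:2:2 sample pair (Cb Y0 Cr Y1). Coefficients are the standard ones for
# studio-swing (16-235 luma / 16-240 chroma) inputs.

def clamp8(x: float) -> int:
    """Clamp a value to the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))

def ycbcr_to_rgb(y: int, cb: int, cr: int) -> tuple:
    """Convert one studio-swing YCbCr sample to 8-bit RGB (BT.601)."""
    c, d, e = y - 16, cb - 128, cr - 128
    r = clamp8(1.164 * c + 1.596 * e)
    g = clamp8(1.164 * c - 0.392 * d - 0.813 * e)
    b = clamp8(1.164 * c + 2.017 * d)
    return r, g, b

def decode_422_pair(cb: int, y0: int, cr: int, y1: int):
    """In 4:2:2, two luma samples share one Cb/Cr chroma pair."""
    return ycbcr_to_rgb(y0, cb, cr), ycbcr_to_rgb(y1, cb, cr)

# Black and white reference samples:
print(ycbcr_to_rgb(16, 128, 128))   # (0, 0, 0)
print(ycbcr_to_rgb(235, 128, 128))  # (255, 255, 255)
```

In the actual FPGA the same arithmetic would be implemented in fixed point, typically using the dedicated multipliers mentioned in Sec. 3.1.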
All voltage regulators are powered from a 5 V supply. All the necessary signals are fed to a 20-pin video converter connector leading to the FPGA board. In addition to the digital video signals (P0 ÷ P7, LLC, HS and VS), the connector also carries the I2C signals (SDA and SCL), the overall circuit reset (/RESET) and the signal for entering the low-power state (/PWRDWN). The PCB is designed as a double-sided board with combined SMT and THT mounting technology.

3. FPGA Unit

The FPGA unit is based on a Xilinx Spartan XC3S200 device. The basic design concept is shown in Fig. 4 and, for demonstration purposes, includes 3 image processing functions. Depending on the size of the chip, an appropriate number of function units (Image Processing Modules) can be chosen and used with the relevant functions. These modules are developed as VHDL modules and together constitute a toolbox for working with images in real time.

Fig. 4: Image Processing Architecture Inside the FPGA.

3.1. Developed Image Processing Modules

Currently the following Image Processing Modules / functions are developed:
• Tone Map Variation
• Emphasize Edges
• Find Edges
• Sharpen
• Tail
• Film Grain
• Spot Filter
• Enhance Lighting

Some of the results are shown in the following figures. An important parameter of the individual modules is their logic utilization. For the first three functions listed above, the logic utilization is given in Table 1 and amounts to about 83% of the total logic resources of the Spartan 3S200 FPGA.

Fig. 5: Test Image: a) Original, b) Tone Map Variation.

Fig. 6: Test Image: a) Emphasize Edges, b) Find Edges.

Most modules use specific features of the FPGA architecture, including dedicated multipliers or, in Virtex architectures, XtremeDSP blocks. A common way of implementing the individual image processing functions is to properly balance the image segmentation and its parallel processing on individual slices.
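As a software reference for what a module such as Find Edges computes, the sketch below applies a 3×3 Laplacian kernel to a grayscale frame and splits the work into segments, mirroring the segmentation/parallelism trade-off described above. The kernel choice is an assumption; the paper does not specify which edge operator is used.

```python
# Illustrative software model of a "Find Edges" pass: a 3x3 Laplacian
# convolution over a grayscale image, processed segment by segment the way
# the FPGA design splits a frame into W x H parallel processing units.
# The Laplacian kernel is an assumption; the operator is not named in the paper.

LAPLACIAN = [[ 0, -1,  0],
             [-1,  4, -1],
             [ 0, -1,  0]]

def find_edges_segment(img, r0, r1, c0, c1):
    """Convolve the kernel over rows [r0, r1) x cols [c0, c1) of img."""
    h, w = len(img), len(img[0])
    out = {}
    for r in range(max(r0, 1), min(r1, h - 1)):
        for c in range(max(c0, 1), min(c1, w - 1)):
            acc = sum(LAPLACIAN[i][j] * img[r - 1 + i][c - 1 + j]
                      for i in range(3) for j in range(3))
            out[(r, c)] = max(0, min(255, acc))  # clamp to 8-bit output
    return out

def find_edges(img, seg_w=4, seg_h=4):
    """Process the frame as seg_w x seg_h segments. Here the segments run
    serially; in the FPGA each segment has its own processing unit."""
    h, w = len(img), len(img[0])
    result = [[0] * w for _ in range(h)]
    for sr in range(seg_h):
        for sc in range(seg_w):
            part = find_edges_segment(img,
                                      sr * h // seg_h, (sr + 1) * h // seg_h,
                                      sc * w // seg_w, (sc + 1) * w // seg_w)
            for (r, c), v in part.items():
                result[r][c] = v
    return result

# A flat image produces no edge response; a single bright pixel does.
flat = [[10] * 8 for _ in range(8)]
print(all(v == 0 for row in find_edges(flat) for v in row))  # True
```

Note that pixels on segment boundaries still need neighbor pixels from adjacent segments; in hardware this corresponds to a small overlap of the data each processing unit reads.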
With increasing segmentation the processing speeds up, but the FPGA utilization also increases dramatically.

Table 1: Device Utilization Summary for the Set of Demonstrated Algorithms

Logic Utilization                         Used     Available   Utilization
Slice Flip-Flops                          2511     3840        65%
Number of 4 input LUTs                    2958     3840        77%
Number of occupied Slices                 3206     1920        83%
Number of Block RAMs                      12       12          100%
Total equivalent gate count for design    169654

3.2. Operational Speed

The operational speed of the FPGA in all algorithms depends mainly on the system clock speed, the picture/movie segmentation and, of course, the type of the selected algorithm. For example, with fin = 50 MHz, fsys = 125 MHz and a segmentation of W×H = 4×4 segments (16 processing units), the Find Edges algorithm takes 4680 TCLK = 37.44 µs. On the other hand, the Tone Map Variation can be synthesized as purely combinational logic, so the result is obtained with a latency below 100 ns.

4. Conclusion

The presented video processing methods are especially useful in applications that employ programmable logic in their embedded hardware. Some of the image processing algorithms have also been tested on medical imaging tasks. The advantages of the FPGA are processing speed and flexibility. Further modules using a processor core are in development; their concept is greatly expanded and requires an FPGA with at least 1M gates.

5. Acknowledgements

The work and the contribution were supported by the Ministry of Education of the Czech Republic under Project 1M0567 "Centre of Applied Electronics", the student grant agency project SV 4501141 "Biomedical Engineering Systems VII" and TACR project TA01010632 "SCADA system for control and measurement of process in real time". Also supported by project MSM6198910027 Consuming Computer Simulation and Optimization.