Morphological Operations for Streaming Videos on Enhanced Reconfigurable Processor 1 B Satya Sai Susmitha , C R S Hanuman2, P Rajeswara Mahidhar3 Department of Electronics and Communication Engineering, Sasi Institute of Technology and Engineering, Tadepalligudem, India 1 susmitha.b406@gmail.com Abstract—Binary Image Processing is a powerful tool in many image and video applications. A Reconfigurable Processor is presented for streaming video processing. The processor’s architecture is a combination of Reconfigurable Binary Processing module, Input and Output Image Control Units and peripheral circuits. The Reconfigurable Binary Processing Module, which consists of mixed-grained reconfigurable binary compute units and output control logic, performs binary image processing operations, especially Mathematical Morphology operations and implements related algorithms for a 1024×1024 image. The periphery circuits control the whole image processing and dynamic reconfiguration process. The results describes that the processor is suitable for real-time binary image processing applications. Keywords Binary image processing, fieldprogrammable gate array (FPGA), mathematical morphology, mixed-grained, real time, reconfigurable. I Introduction Binary image processing is highly useful in various areas, such as object recognition, tracking, motion detection, [1],[2],[3], video processing [4], and identification and authentication systems [5],[6]. Binary image processing has been commonly implemented using processors such as CPU or DSP. However, it is inefficient and difficult to use such processors for binary image processing [7], [8], [9]. Binary image processing chips have attracted much attention in the field of image processing. Application-specific chips and hardware have been reported for various applications. The major drawback of applicationspecific chips is the lack of flexibility. The continuous CMOS technology scaling, the importance of flexibility exceeds that of silicon area, especially in vision chips. The reconfigurable technique can bridge the gap between applicationspecific integrated circuits and flexibility. A vision system with high flexibility and performance, small size, and low power consumption can be implemented in a single chip. Vision can be used in image or video processing, target tracking, multimedia applications, and computer vision. Chips were presented to perform basic binary morphological operations, such as dilation, erosion, opening, and closing [7], [10],[11], . A binary image processor consists of a reconfigurable binary processing module including reconfigurable binary compute units, output control logic, input and output image control units and peripheral circuits. The reconfigurable binary compute units are of a mixed grained architecture, which has the characteristics of high flexibility, efficiency and performance. Basic mathematical morphology operations and complicated algorithms can easily be implemented on it. The processor has the merit of high speed, simple structure and wide application range. II Architecture The presented processor is designed for applications in image or video processing. Such systems should have a high flexibility and high performance processor for wide applications; therefore, the processor design is focused on high flexibility and speed. The architecture of the proposed processor is shown in Fig. 1. The core of the processor is a reconfigurable binary processing module consisting of binary compute units and output control logic. The processor also has two bus interfaces, the input and output control logic units, the process control unit, and a configuration register group. Fig. 1. Architecture of the binary image processor. Fig. 2. Diagram of the reconfigurable binary processing module. A. Reconfigurable Binary Processing Module The diagram of the reconfigurable binary processing module (RBPM) is given in Fig. 2. It can be divided into two main parts. The first part is the output control logic, which selects the output from all the binary compute unit outputs according to the given parameters and converts the series data of 1-bit binary images into parallel data. The second part consists of several binary compute units that perform binary logic and binary image operations at a high speed. architecture .The set operation element can perform binary set operations, such as union, intersection, complement, subtraction, addition, and straightthrough output. The inputs of the set operation element and the outputs of the binary compute unit are transmitted via two sets of multiplexers, respectively, which makes the unit architecture more flexible. The inputs transmitted to the set operation element via the multiplexers can be the operation results of the binary logic elements, the reduction result, and the median filtering result. The outputs of the binary compute unit transmitted via multiplexers can be the original input of the binary compute unit, the operation results of the binary logic elements, the reduction result, the median filtering result, and the operation result of the set operation element. The detailed architecture of the binary compute element is shown in Fig. 4(b). The binary compute element comprises two input control multiplexers, n binary logic elements, a binary reduction element, and a binary median filter. The input control multiplexer selects input data for the binary logic element from the line memories, the SDRAM, and the parameters in the register group. When a video image is processed, line memories are needed to buffer image signals before they are input to binary logic elements. When the block size of the image to be processed is n×n, n − 1 line memories with a depth equal to the image width are needed to buffer the image signals Fig.3.Different operations are performed on an image Binary compute units can execute binary image operations in a pipelined or parallel manner. The operation executed in a binary compute unit is decided by configurable registers, including logic operation parameters, image resolution parameters; mask sizes, input and output selection parameters, and auxiliary parameters. Fig. 3 shows Different operations are performed on an image. The reconfigurable architecture provides higher hardware utilization than the pipelined architecture. The architecture of the binary compute unit is shown in Fig. 4(a). Each binary compute unit, which has two binary compute elements and one set of operation elements, can perform logic, reduction, median filtering, and set operations. The binary compute unit has a mixed-grained Fig. 4. (a) Architecture of the binary compute unit. (b)Architecture of the binary compute element. When images other than videos are processed, the input data are selected from the parameters in the register group or SDRAM. The binary logic element can perform operations such as AND, OR, NOT, NAND, NOR, XOR, XNOR, and straightthrough output. The reduction element performs operations such as reduction AND, reduction OR, reduction NAND, reduction NOR, reduction XOR, reduction XNOR, and straight through output. The set element performs operations such as union, intersection, complement, subtraction, and XOR. All the operation results from the binary logic elements, the reduction element, and the binary median filter are synchronized and output via multiplexers to the next binary compute unit. B. Input and Output Control Logic Unit Fig. 5. Block diagram of the input control logic unit. Image signals need to be synchronized by the input control logic unit before being input to the reconfigurable binary processing module because one-to-one matching is needed between the pixels in different images. The input control logic unit selects and synchronizes the inputs from video images, SDRAM, and registers to the synchronization circuit. The block diagram of the input control logic unit is shown in Fig. 5. The unit contains four data converters and a synchronization circuit. Data converters 1 and 2 convert 1-bit image signals into 32-bit parallel data, which have the same format as the data from SDRAM and registers. Converters 3 and 4 convert the parallel data into 1-bit image signals, which are then synchronized by the synchronization circuit. To increase the processing rate, two down sampling circuits are added to down-sample image signals before they are processed by the data converters 1 and 2. The output control logic unit writes the selected parallel image data from the reconfigurable binary processing module into SDRAM through the bus interface 1. This process is for the video image signals, for the video images the data rate will be more so that, to compress the data without lossless. A typical video image compression scheme is shown in Fig 6. The system consists of three main components, namely, the source encoder, the quantizer, and the entropy encoder. Source encoder Quantizer Entropy encoder Fig .6. The computational block diagram of the functionalities of the compression system This process is for the video image signals, for the video images the data rate will be more so that, to compress the data without lossless. A typical video image compression scheme is shown in Figure 6. The system consists of three main components, namely, the source encoder, the quantizer, and the entropy encoder. source encoder: An encoder is the first major component of image compression system. A variety of linear transforms are available such as Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), and Discrete Wavelet Transform (DWT). The Discrete Wavelet Transform is main focus of our work. Quantizer : A quantizer reduces the precision of the values generated from the encoder and therefore reduces the number of bits required to save the transform coefficients. Entropy Encoder: An entropy encoder does further compression on the quantized values. This is done to achieve even better overall compression. The commonly used entropy encoders are the Huffman encoder, arithmetic encoder, and simple run-length encoder. C . Process Control Unit and Configuration Register Group The process control unit reads the configuration information in the configuration registers. It controls the operation process of the reconfigurable binary processing module. It also controls the input and output control logic units and bus interfaces during data access. After the processed image data is written to SDRAM, the process control unit transmits interrupt requests to complete the interaction of the processor with external systems. The configuration register group is an extremely important part in the proposed processor. It contains control parameters, reconfiguration information, operation parameters, and interaction information. Most of the registers in the configuration register group are written by an external CPU via the system bus, and the rest are written by the internal modules in the proposed processor. Fig. 7. Architecture of the binary image processing system. III Circuit Implementation In this section, an image processing system based on the proposed binary image .A fieldprogrammable gate array (FPGA) to verify the performance and feasibility of the processor in binary image processing. The following shows the main circuit blocks of the system. In this section, binary image processing operations in the proposed processor are discussed, including binary mathematical morphology operations are nothing but dilation, erosion, opening, closing, hit and miss, noise filtering, boundary extraction, and image differencing, A. Binary Compute Unit The logic element is set to be 32 bit wide. The maximum block size for image processing is 5 × 5. The two inputs of each binary computing element are the video image signal and register, respectively. B. Image Processing System The SDRAM1 is used as the main memory for the CPU. SDRAM2 is used to store images. The CPU is used as a controller. The register group 2 and interrupt controller are also used for the control of the system. The dynamic reconfiguration approach is applied for reconfiguration of the binary image processor. The reconfiguration parameters are reduced to 24×32 bits due to the mixed-grained architecture of the binary image processor. The reconfiguration time is less than 30 cycles. IV Binary Image Processing Applications Fig. 7. (a) Dilation. Fig. 7. (b) Erosion. The following examples are given to illustrate binary mathematical morphology operations and algorithms in the processor. The actual use of the processor is not confined to the examples given Fig. 8. (a) Opening. (b) Closing. A. Binary Mathematical Morphology Mathematical morphology is a powerful tool for image processing and analysis in a wide range of applications, including shape recognition [2], image processing video processing [4], [12], document authentication , [13],and computer vision The basic binary morphological operations are dilation and erosion. Either of the two operations has two operands: the input signal, which is usually an image, and the structuring element characterized by its shape, size, and center location. 0 0 1 0 0 0 0 1 0 0 1 1 1 1 1 0 0 1 0 0 0 0 1 0 0 Fig.9. Example of Structuring element Fig. 10. Hit-and-miss. Fig. 11. Noise filtering. The other binary morphological operations such as opening, closing, and hit-and-miss operation are based on various combinations of the two basic operations, dilation, and erosion. Assuming that A is the image and B is the structuring element, the dilation is defined by A ⊕ B = {x|[(ˆB)x ∩ A] ≠ ∅}----- (1) The erosion is defined by A ʘ B = {x|(B)x ⊆ A}----- (2) where (B)x is the translation of B by the vector x and (ˆB) denotes the symmetry of B. The implementation of binary dilation and erosion on the binary image processor is illustrated in Fig. 7(a), 7(b) Opening and closing operations, denoted with ◦ and •, are respectively expressed as A ◦ B = (A ʘB) ⊕ B----- (3) A • B = (A ⊕ B) ʘB-----. (4) The implementation of binary opening and closing is illustrated in Fig. 8(a),8(b) The hit-and-miss operation, denoted with ⊗, is expressed as A ⊗ B = (A ʘ B1)ʘ(Ac ʘ B2) ------(5) where Ac denotes the complement of A. The hitand-miss implementation is illustrated in Fig. 10.The noise filtering and boundary extraction are simple morphological algorithms. The noise filtering is described as {[(A ʘ B) ⊕ B] ⊕ B} ʘ B = (A ◦ B) • B ------(6) and the boundary extraction is expressed as A − (A ʘ B).------- (7) The implementation of noise filtering and boundary extraction is illustrated in Figs. 11 and 12, respectively. The binary morphology with spatially variant structuring elements can be implemented on the proposed processor [14]. A binary image is separated into different parts, which are processed with different structuring elements in the basic binary compute units. Then, the processing results are combined to form an entire image. B. Motion Detection and Feature Extraction Fig. 12. Boundary extraction. Image differencing is a common image processing technique used to estimate the shifted locations of an object [1], . The XOR calculates the location of the current frame relative to the previous frame. Two binary edge images are intersected to obtain a moving edge image. Dilation and erosion are successively applied to the moving edge image to get a closed curve and an enhanced moving edge image. Dilation and erosion are successively applied to the moving edge image to get a closed curve and an enhanced moving edge image. There configurable binary image processor can perform basic operations and various algorithms involved in the imaging processing, indicating that the proposed processor is suitable for binary image processing, especially for the morphological operations. Comparison: The structuring element is 5 × 5 and the image resolution is 1024 × 1024. The overall performance of the binary image processor is evaluated according to the number of the binary compute units. Memory usage are decrease compare to previous processors introduction in [10] and [11]. Architecture also 2-D MIMD array. Fig .13. parameters comparison of image processor Table-1 Comparison of image processors Reconfigurable processor for binary image processing Morphologic al operations for streaming videos on enhanced reconfigurab le processor Processor Reconfigurable morphol -ogical image processi ng Real time (ns) 75.02 60.00 47.00 Cpu time 76 68.03 46.22 Memory usage(KB) 381454 312624 256560 Architect -ure 1-D ASIP array 1-D MIMD array 2-D MIMD Array VI Result In Fig.14 (a) result we are using the double edge clocking frequency, at the rising and falling conditions we get the output. As long as the clock, enable and initialization signals are in active state, selection lines decides the which type of data accepts, that data is given as a inputs of morphological operations. If selection line is in ‘01’, then steaming video as an input or ’10’, image as an input otherwise the direct data as an input. The active high selection signal exhorts the dilation and erosion operations. The active low selection signal exhorts the opening and closing operations. The remaining morphological operations depend upon these four operations. Fig.14(a) Input of morphological operations Fig. 14.(b) Output of morphological operations The input matched with the structuring element ‘8’ different morphological operations will be performed. If the output is ‘1’ then the morphological operation is Exists, otherwise not exist. VII Conclusion In these work, morphological operations for streaming videos on enhanced reconfigurable processor was proposed to perform real-time binary image processing. The processor consists of a reconfigurable binary processing module, input and output image control units, and peripherals circuits. The reconfigurable binary processing module has a mixed-grained architecture with the characteristics of high throughput. The dynamic reconfiguration approach was used to increase the processor performance. Basic mathematical morphology operations and complicated algorithms can easily be implemented on it because of its simple structure. References [1] Y. Liu and C. Pomalaza-Raez, “A low-complexity algorithm for theon-chip moment computation of binary images,” in Proc. Int. Conf.Mechatron. Autom., 2009, pp. 1871–1876. [2] E. C. Pedrino, O. Morandin, Jr., and V. O. Roda, “Intelligent FPGAbased system for shape recognition,” in Proc. 7th Southern Conf.Programmable Logic, 2011, pp. 197–202. [3] M. F. Talu and I. Turkoglu, “A novel object recognition method basedon improved edge tracing for binary images,” in Proc. Int. Conf. Appl.Inform. Commun. Technol., 2009, pp. 1–5. [4] M. R. Lyu, J. Song, and M. Cai, “A comprehensive method formultilingual video text detection, localization, and extraction,” IEEETrans. Circuit Syst. Video Technol., vol. 15, no. 2, pp. 243–255, Feb.2005. [5] H. Yang and A. C. Kot, “Binary image authentication with tampering localization by embedding cryptographic signature and block identifier,”IEEE Signal Process. Lett., vol. 13, no. 12, pp. 741–744, Dec. 2006. [6] H. Yang and A. C. Kot, “Pattern-based data hiding for binary imageauthentication by connectivity-preserving,” IEEE Trans. Multimedia, vol. 9, no. 3, pp. 475–486, Apr. 2007. [7] W. Miao, Q. Lin, W. Zhang et al., “A programmable SIMD vision chip for real-time vision applications,” IEEE J. SolidState Circuits, vol. 43,no. 6, pp. 1470–1479, Jun. 2008. [8] A. Lopich and P. Dudek, “A SIMD cellular processor array vision chip with asynchronous processing capabilities,” IEEE Trans. Circuits Syst.I, vol. 58, no. 10, pp. 2420–2431, Oct. 2011. [9] K. M. Shaaban, S. A. Ali, and Y. B. Mahdy, “A chip design for binary and binary morphological operations,” in Proc. Int. Conf. Inform. Intell. Syst., 1999, pp. 554–559. [10] E. N. Malamas, A. G. Malamos, and T. A. Varvarigou, “Fast implementation of binary morphological operations on hardware-efficient systolic architectures,” J. VLSI Signal Process., vol. 25, no. 1, pp. 79–93, 2000. [11] J. Velten and A. Kummert, “Implementation of a highperformance hardware architecture for binary morphological image processing operations,” in Proc. 47th IEEE Int. Midwest Symp. Circuits Syst., Jul. 2004, pp. 241–244. [12] S. Chien and L. Chen, “Reconfigurable morphological image processing accelerator for video object segmentation,” J. Signal Process. Syst., vol. 62, no. 1, pp. 77–96, 2011. [13] H. Yang, A. C. Kot, and S. Rahardja, “Orthogonal data embedding for binary images in morphological transform domain: A high-capacity approach,” IEEE Trans. Multimedia, vol. 10, no. 3, pp. 339–351, Apr. 2008. [14]. Reconfigurable Processor for Binary Image Processing Bin Zhang, Kuizhi Mei, Member, IEEE, and Nanning Zheng, Fellow, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 23, NO. 5, MAY 2013.