International Journal of Research In Science & Engineering Volume: 1 Issue: 1 e-ISSN: 2394-8299 p-ISSN: 2394-8280 BLOCK MATCHING ALGORITHM FOR MOTION ESTIMATION Siddharth Joshi1, Shubham S.Kale2, Akshay Kalokhe3, Sanket Kulkarni4 1 Student, Computer Department, PCCOE, siddharth9527@gmail.com Student, Computer Department, DYPIEMR, shubham.kale27@gmail.com 3 Student, Computer Department, SRTTC, akshaykalokhe22@gmail.com 4 Student, Computer Department, SRTTC, sanket.kulkarni20@gmail.com 2 ABSTRACT This paper is a survey of the block matching algorithms used for motion estimation in video compression. It implements and compares 4 different types of block matching algorithms that range from the very basic Exhaustive Search to the recent fast adaptive algorithms like Adaptive Energy Model For Motion Estimation(AEME). The algorithms that are evaluated in this paper are widely acknowledged by the video compressing community and have been used in implementing various standards, ranging from MPEG1 to MPEG4 . The paper also presents a very brief introduction to the entire flow of video compression. Keywords: Block matching, motion estimation, video compression. ----------------------------------------------------------------------------------------------------------------------------1. INTRODUCTION With the onset of the multimedia age and the spread of Internet, video storage on CD/DVD and streaming video has been gaining a lot of popularity. The Moving Picture Experts Group (MPEG) video coding standards pertain towards compressed video storage on physical media like CD/DVD. Generally in any standard the basic flow of the entire compression decompression process is largely the same and is depicted in Fig below. The encoding side estimates the motion in the current frame with respect to a preceding frame. A motion compensated image for the current frame is then created that is built of blocks of image from the preceding frame. The motion vectors for blocks used for motion estimation are transmitted, as well as the variance of the compensated image with the current frame is also JPEG encoded and sent. The encoded image that is sent is then decrypted at the encoder and used as a reference frame for the subsequent frames. The decoder converses the process and creates a full frame. The whole idea behind motion estimation based video compression is to save on bits by sending JPEG encoded variance images which inherently have less energy and can be highly compressed as associated to sending a full frame that is JPEG encoded. Motion JPEG, where all frames are JPEG encoded. The most computationally classy and resource hungry operation in the entire compression process is motion estimation. Hence, this arena has seen the highest bustle and research interest in the past two decades. This paper evaluates the fundamental block matching algorithms such as Exhaustive Search(ES) , Three Step Search(TSS) , New Three Step Search(NTSS) , AEME. Section II explains block matching in general and then the above algorithms in detail. IJRISE| www.ijrise.org|editor@ijrise.org International Journal of Research In Science & Engineering Volume: 1 Issue: 1 e-ISSN: 2394-8299 p-ISSN: 2394-8280 2. BLOCK MATCHING ALGORITHM The underlying belief behind motion estimation is that the patterns corresponding to objects and background in a frame of video order move within the frame to form corresponding objects on the subsequent frame. The idea behindhand block matching is to split the current frame into a matrix of ‘macro blocks’ that are then compared with corresponding block and its neighbors in the previous frame to create a vector that stipulates the movement of a macro block from one location to another in the preceding frame. This movement calculated for all the macro blocks comprising a frame, organizes the motion estimated in the current frame. The search area for a good macro block match is constrained up to p pixels on all fours sides of the equivalent macro block in previous frame. This ‘p’ is called as the search parameter. Bigger motions require a larger p, and the larger the search parameter the more computationally expensive the method of motion estimation becomes. Usually the macro block is taken as a square of side 16 pixels, and the search parameter p is 7 pixels. The idea is represented in Figure. The matching of a macro block with another is based on the output of a cost function. The macro block that results in the minimum cost is the one that matches the closest to current block. 2.1. Exhaustive Search(ES) This algorithm, also very well known as Full Search, is the most computationally expensive block matching algorithm of all. This algorithm computes the cost function at each possible location in the search window. As a result of which it finds the best promising match and gives the highest PSNR amongst any block matching IJRISE| www.ijrise.org|editor@ijrise.org International Journal of Research In Science & Engineering Volume: 1 Issue: 1 e-ISSN: 2394-8299 p-ISSN: 2394-8280 algorithm. Fast block matching algorithms try to attain the same PSNR doing as little computation as possible. The obvious disadvantage to ES is that as larger the search, the window gets the more computations it requires. 2.2. Three Step Search(TSS) This is one of the initial attempts at fast block matching algorithms and dates back to mid 1980s. The general idea is symbolized in Figure below. It starts with the search location at the center and sets the ‘step size’ S = 4, for a usual search constraint value of 7. It then searches at eight locations +/- S pixels around location (0,0). From these nine positions searched so far it picks the one giving least cost and makes it the new search origin. It then sets the new step size S = S/2, and echoes similar search for two more iterations until S = 1. At that point it finds the position with the minimum cost function and the macro block at that location is the best match. The calculated motion vector is then saved for broadcast. It gives a flat reduction in computation by a factor of 9. So that for p = 7, ES will calculate cost for 225 macro blocks whereas TSS computes cost for 25 macro blocks. The idea behind TSS is that the fault surface due to motion in every macro block is unimodal. A unimodal surface is a bowl shaped surface such that the weights produced by the cost function increase monotonically from the global minimum. 2.3. New Three Step Search (NTSS) NTS improves on TSS results by providing a center biased searching scheme and having provisions for half way stop to reduce computational charge. It was one of the first widely accepted fast algorithms and frequently used for implementing earlier standards like MPEG 1. The TSS uses a uniformly allocated checking pattern for motion detection and is inclined to to missing small motions. The NTSS process is shown graphically in Figure. In the first step 16 points are checked in addition to the search source for lowest weight using a cost function. Of these additional search locations, 8 are a space of S = 4 away (similar to TSS) and the other 8 are at S = 1 away from the search origin. If the minimum cost is at the origin then the search is stopped right here and the motion vector is set as (0, 0). If the smallest weight is at any one of the 8 locations at S = 1, then we change the origin of the search to that point and check for loads adjacent to it. Depending on which point it is we might end up checking 5 points or 3 points. The position that gives the lowest weight is the closest match and motion vector is set to that location. On the other hand if the smallest weight after the first step was one of the 8 locations at S = 4, then we follow the normal TSS process. Hence although this process might need a minimum of 17 points to check every macro block, it also has the worst-case scenario of 33 locations to check. IJRISE| www.ijrise.org|editor@ijrise.org International Journal of Research In Science & Engineering Volume: 1 Issue: 1 e-ISSN: 2394-8299 p-ISSN: 2394-8280 2.4. AEME AEME extracts energy histograms to compare different blocks. After extracting the energy histograms for model’s blocks, neighbours’ histograms will be compared one by one with the current block’s histogram using the Bhattacharyya Coefficient . Then, three of the most similar neighboring blocks will be selected to enter the prediction process and the other two blocks of model will be omitted completely. They will be incorporated to gain the PMV by a weighting formula, which assigns the greatest weight to the Block with Highest Similarity (BHS) and the least weight to the Block with Lowest Similarity (BLS). The Block with Middle Similarity (BMS) will take the middle weight. Equation illustrates the AEME’s PMV calculator formula: PMV = a . MVBHS + b . MVBMS + c . MVBLS . Coefficients a, b and c have such a relation: a ˃ b ˃ c and will be specified empirically. Prediction step should be followed by a searching step, considering that PMV is not a completely precise one. But prediction helps to reduce the SW’s size and reduce the computational burden of the searching scheme. The proposed algorithm uses an adaptive two−step search algorithm which is very similar to the last to step of TSS algorithm and differs with it only in distance between search points of its first step. That is, when the initial prediction point specified, the searching scheme will evaluate this point and its 8 neighboring points which have d pixel distance with it, instead of TSS’s 2 pixel distance. Figure illustrates this searching scheme. Green circle presents the predicted point, 8 stars represent the first step’s neighbouring points which have d pixel far from the predicted point (in this Figure d equals to 2). IJRISE| www.ijrise.org|editor@ijrise.org International Journal of Research In Science & Engineering Volume: 1 Issue: 1 e-ISSN: 2394-8299 p-ISSN: 2394-8280 The blue star represents the selected point of the first step and finally this point with its 8 immediately neighbors which are specified by 8 triangles in Figure 2, will be evaluated to find the best match. It is specified by red triangle. All evaluations are done using SAD criteria. This algorithm adaptively sets the d value dependent on the size of frames. This scheme divides videos into four groups as for their size of frames so: small (frame sizes about 240×352), medium (frame sizes about 576×720) and large (frame sizes about 1080×1920), and initializes d empirically by 2, 3 and 4 respectively for them. 3. SIMULATION RESULTS In this section, AEME will be compared with ES, TSS and NTSS. For a fair comparison, NTSS will use the same searching scheme and model of support as AEME. According to the group of video, thresholds are set empirically as: T1=0.95, T2 =0.9 and T3=0.99 for the small group, T1=0.99, T2 =0.9 and T3=0.995 for the medium group and T1=0.95, T2 =0.9 and T3=0.999 for the large group. The constant factor in Equations 3 and 4, ε, is set to 10% of the maximum energy value among all the energies. The parameters in Equation 5 are given by a=2/P, b=1/P and c=0.25/P (with P=3.25). As mentioned, when there are multiple motions in the scene (i.e., motions with different directions and velocities near each other), the proposed algorithm will illustrate its ability in prediction compared with other predictive algorithms. ES and TSS may behave well too, but they depend on the scene’s color. If the overall colors of the scene are near to each other (i.e., objects and background have almost same color), ES and TSS may make terrible mistakes. Here 30 successive frames from:”Rihanna”. According to it, AEME’s average PSNR value is 1.25 dB greater than NTSS, 1.19 dB more than TSS and only 0.11 dB lower than ES. AEME presents greater PSNR improvements compared with others in “Rihanna” sequences, considering that they include areas with similar colors and multiple motions. AEME does only 8.8% further computations compared with NTSS. But it has been reduced 35.1% and 1116.2% of computational overheads presented by TSS and ES, respectively. Of course, the computational load of AEME should be considered a little more than TSS, when that terrible case occurs and algorithm uses TSS instead of its own procedure. As is clear, AEME has improved the average PSNR value with an acceptable computational burden. 4. SUMMARY The past two eras have seen the growth of wide acceptance of multimedia. Video compression plays an important role in archival of entertaining based video (CD/DVD) as well as real-time reconnaissance / video conferencing applications. While ISO MPEG sets the ordinary for the former types of application, ITU sets the standards for latter low bit rate uses. In the entire motion based video compression process motion estimation is the most computationally expensive and time-consuming procedure. The research in the past decade has focused on reducing both of these side effects of motion estimation. Block matching methods are the most popular and efficient of the various motion estimation techniques. This paper first calls the motion compensation based video compression in brief. It then illustrates and simulates 4 of the block matching algorithms, with their comparative IJRISE| www.ijrise.org|editor@ijrise.org International Journal of Research In Science & Engineering Volume: 1 Issue: 1 e-ISSN: 2394-8299 p-ISSN: 2394-8280 study at the end. Three more, very recent, block matching algorithms are studied in the end. Of the various algorithms studied or simulated during the course of this final project AEME turns out to be the best block matching algorithm. AEME is most useful in a video where background color is similar to objects and it (video) has multiple motions. 5. CONCLUSION Thus, we can conclude that different algorithms can be used in different scenarios. Such as TSS&NTSS are faster but applicable to videos with low bit rate.ES algorithm can be used only in worst case scenario. AEME algorithm can be termed as most useful because of its accuracy in prediction without any annoying computational overheads. 6. REFERENCES [1] Amir Ghahremani, Amir Mousavinia,” An Efficient Adaptive Energy Model Based Predictive Motionestimation Algorithm For Video Coding” [2] https://www.ece.cmu.edu/~ee899/project/deepak_mid.htm [3] en.wikipedia.org/wiki/Block-matching_algorithm [4] www.csee.wvu.edu/~xinl/courses/ee569/4ss.pdf IJRISE| www.ijrise.org|editor@ijrise.org