Computational reduction of motion estimation using quadrant search for fast video encoder

Jong-Nam Kim and Byung-Ha Ahn
Dept. of Mechatronics, K-JIST
1, Oryong-dong, Buk-ku, Kwangju 500-712, Korea
jnkim@moon.kjist.ac.kr, bayhay@kjist.ac.kr

Abstract – Among the many modified TSS algorithms, the new three-step search (NTSS) algorithm obtains good picture quality in predicted images with reduced computation on average. This paper proposes a new search algorithm that reduces computation further by combining the unimodal error surface assumption (UESA), partial distortion elimination (PDE), and a quadrant search strategy. By combining these concepts, the proposed algorithm removes useless computations while keeping the same predicted image quality as the conventional NTSS algorithm. Our algorithm uses the NTSS scheme and PDE to remove useless computations in finding the best matched block. Additionally, we apply a quadrant based search scheme, built on the concept of the UESA, to increase the efficiency of PDE. Unlike other fast modified NTSS algorithms, our algorithm is lossless: its predicted image has the same quality as that of the original NTSS algorithm. Experimentally, our algorithm checked fewer rows per block than the conventional NTSS while keeping the same PSNR. Our work will therefore be useful for applications that require a fast video encoder.

Keywords – NTSS, PDE, quadrant search

Introduction

Motion estimation plays a key role in the compression of moving picture data. Among the many estimation techniques, block matching algorithms (BMA) are the most appropriate for obtaining the motion vector in the framework of generation coding. A straightforward BMA is the full search (FS), which exhaustively searches for the most accurate motion vector. It examines all locations in a specified search range and selects the position where the residual error of block matching is minimized. The heavy computational load of FS, however, can be a significant problem in real-time video coding.
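As an illustration of the exhaustive search described above, the following minimal Python sketch evaluates the matching error at every displacement within ±p and keeps the smallest. The function names, the use of NumPy, the use of SAD as the error measure, and the rule of skipping candidates that fall outside the reference frame are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    diff = block_a.astype(np.int64) - block_b.astype(np.int64)
    return int(np.abs(diff).sum())


def full_search(cur, ref, top, left, block=16, p=7):
    """Exhaustive block matching (hypothetical helper).

    Returns (best_sad, (dy, dx)) for the block of `cur` whose top-left
    corner is (top, left), searching displacements in [-p, p] in `ref`.
    """
    height, width = ref.shape
    target = cur[top:top + block, left:left + block]
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            y, x = top + dy, left + dx
            # Assumption: candidates outside the reference frame are skipped.
            if y < 0 or x < 0 or y + block > height or x + block > width:
                continue
            cost = sad(target, ref[y:y + block, x:x + block])
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best
```

With p = 7 the two loops visit (2 x 7 + 1)^2 = 225 displacements per block, which is why FS is costly compared with the fast algorithms discussed next.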
Many fast algorithms that reduce this computation have been developed in the last decade. They can be classified into several groups: techniques based on the UESA, methods founded on multiresolution, methods built on the spatial/temporal correlation of motion vectors, fast algorithms that subsample the matching block, etc. [1]-[3]. The fast motion estimation techniques based on the UESA mainly limit the number of checking points by applying this assumption to the matching block. The UESA states that the residual error of block matching increases monotonically as the checking point moves away from the location of the global minimum error. Many algorithms based on this concept have been reported in the last decade, and they have been shown to reduce the computational requirement significantly. Their weakness is that the search can fall into a local minimum, which is not the optimal motion vector. Among these fast algorithms, TSS is the most popular and is recommended by RM8 of CCITT and SM3 of MPEG because of its simplicity, regularity and reasonable performance [2]-[6].

The goal of this work is to reduce the computational complexity of motion estimation while keeping the same error performance as NTSS, which has good error performance compared with TSS and modified TSS algorithms. This paper proposes a new search algorithm with a reduction of checking points, which uses the UESA, PDE (partial distortion elimination) [3], and a quadrant search strategy. By combining these concepts, the proposed algorithm removes useless computations and keeps the predicted image quality of the conventional NTSS algorithm. The new three-step search (NTSS) algorithm obtains good picture quality in predicted images with reduced computation on average.
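PDE can be illustrated with a short sketch. Assuming that SAD is accumulated one block row at a time (the unit this paper later counts as a "checking row"), the partial sum is compared with the smallest SAD found so far, and the candidate is abandoned as soon as it can no longer become the new minimum. The function name, its signature and the row-wise granularity are assumptions for illustration only.

```python
import numpy as np


def sad_with_pde(target, candidate, best_so_far):
    """Row-by-row SAD with early termination (hypothetical helper).

    Returns (sad_value, rows_checked). If the partial sum reaches
    `best_so_far` before all rows are processed, the loop stops and the
    returned value is only a partial SAD, which is still enough to
    reject the candidate.
    """
    total = 0
    rows_checked = 0
    rows = zip(target.astype(np.int64), candidate.astype(np.int64))
    for target_row, candidate_row in rows:
        total += int(np.abs(target_row - candidate_row).sum())
        rows_checked += 1
        if total >= best_so_far:
            break  # this candidate cannot beat the current minimum
    return total, rows_checked
```

Because a candidate is rejected only when its partial sum already equals or exceeds the current minimum, the selected minimum SAD is the same as without PDE; only the number of rows examined changes.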
To reduce computation further while keeping the same error performance as NTSS, this paper proposes a fast NTSS algorithm using the UESA, PDE and quadrant search. Our algorithm uses PDE to remove useless computations in finding the best matched block. Additionally, we apply a quadrant based search scheme to increase the efficiency of PDE. Experimentally, our algorithm checked fewer rows per block than the conventional NTSS while keeping the same PSNR.

This paper is organized as follows. Section 2 reviews conventional fast algorithms based on the UESA as previous work. Section 3 describes the proposed quadrant search algorithms, which use the UESA and PDE to reduce computational complexity while keeping the error performance of the conventional NTSS algorithm. Section 4 presents simulation results and discussion. Finally, Section 5 gives the conclusion.

Conventional Fast Algorithms Using UESA

Before developing our main algorithm based on the UESA, we review several popular algorithms that use it. They include the three-step search (TSS), the new three-step search (NTSS), the four-step search (FSS), the conjugate directional search (CDS), the one-at-a-time search (OTS), the 2-D logarithm search (2-DLOGS), the 1-D full search (1-DFS), the parallel hierarchical one-dimensional search (PHODS), the efficient-simple search (ESS), the fast TSS (FTSS), and their modified algorithms [2]-[13]. Of these, the NTSS algorithm is reported to have the best prediction quality with a reasonable number of checking points. We summarize a selection of these algorithms as background for this work.

TSS uses unnecessarily many checking points when the motion of the block is small. A new three-step search (NTSS) algorithm was therefore proposed for images with small motion by R. Li et al. [4]. NTSS uses a center-biased checking pattern, which, unlike TSS, employs seventeen checking points in the first step, and it adapts the search to the distribution of the motion vector. The adaptive search means that a half-stop is possible in the first or second step when the motion vector is in the range of (±1, ±1). However, the algorithm has unnecessary checking points for images with large motion; in that case, the nine inner checking points are consumed unnecessarily. Meanwhile, the eight outer checking points are used inefficiently when the motion is small. In the worst case, it uses thirty-three checking points for the search range P=7. It is reported that, because it is built on the center-biased motion vector distribution, the NTSS algorithm gives better error performance than other modified TSS algorithms [4].

FTSS keeps the structure of the three-step search while reducing the checking points among the candidates. FTSS uses fewer checking points than the ESS algorithm. To reduce the checking points efficiently, it considers more cases after checking three points in each step. Experimentally, the FTSS algorithm was shown to speed up TSS by a further factor of two and a half while keeping performance comparable with TSS. It was reported that the FTSS algorithm employs about ten points per block, which is the smallest number of checking points among all modified TSS algorithms. However, it gives degraded error performance in complex images compared with TSS [2].

Proposed Algorithm

As described previously, the new three-step search (NTSS) shows very good error performance with reasonable computation on average. In the worst case, however, it can require more checking points than TSS. Additionally, it often consumes useless checking points for blocks with somewhat large motion. We want to remove these useless computations while keeping the same prediction quality as NTSS in motion-compensated images.
To do so, we selectively check some initial candidates and then continue the search according to the result. J.N. Kim et al. [12]-[13] reduced the average number of checking points using a conditional search strategy. Their method first examines one or nine of the seventeen NTSS checking points against a fixed, predefined SAD threshold. The algorithm is very effective in reducing the average checking points, but its prediction quality depends heavily on the fixed SAD threshold. In addition, when only one point is checked first, the predicted image quality degrades to that of TSS and the average number of checking points increases. These problems originate from the fixed SAD threshold and from an initial set of checking points that is too small. We therefore propose an NTSS algorithm that is lossless with respect to the original NTSS algorithm. Of course, this approach gives less computational reduction than the methods in [12]-[13].

In the proposed algorithms, we want to find the minimum error point as early as possible. Because PDE rejects a candidate as soon as its partial error reaches the current minimum, a small minimum found early lets the remaining candidates be rejected after fewer rows. By doing so, we can eliminate unnecessary computations without any degradation of PSNR in the predicted image. From this concept, we propose an algorithm that first checks the candidates with the higher probability of being the minimum error point.

[Figure: nine candidate checking points, numbered 1 to 9, in a search window spanning -7 to 7 on each axis; the legend marks the 1st step]
Fig. 1. Candidate checking points in each step.

Fig. 1 shows the nine candidates for each step. Our proposed algorithm has almost the same average number of checking points as NTSS. The important thing in our algorithm is to find the minimum error point quickly in each step. To do that, we employ a search strategy other than the sequential search 1~9 of Fig. 1; that is, we examine the candidates in an order based on the probability of each being the minimum error point. Tables 1 to 3 show our search strategies, which are not the conventional sequential search. Table 1 describes the 2D cross search based NTSS algorithm.
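The idea of checking the likeliest candidates first can be sketched for a single nine-point step. The code below is a minimal illustration, not the authors' implementation: it assumes that the points of Fig. 1 are numbered row by row in a 3x3 grid with point 5 at the center, that SAD is accumulated row by row with PDE, and that every candidate lies inside the reference frame. The checking orders are those listed in Table 1; all function and variable names are hypothetical.

```python
import numpy as np

# First sub-step and second sub-step orders of the 2D cross search (Table 1).
CROSS_FIRST = (2, 4, 5, 6, 8)
CROSS_SECOND = {
    2: (1, 3, 7, 9),
    5: (1, 3, 7, 9),
    8: (7, 9, 1, 3),
    4: (1, 7, 3, 9),
    6: (3, 9, 1, 7),
}


def offset(point, step):
    """Displacement (dy, dx) of point 1..9, assuming a row-major 3x3 grid."""
    row, col = divmod(point - 1, 3)
    return (row - 1) * step, (col - 1) * step


def pde_sad(target, candidate, limit):
    """Row-wise SAD that stops once the partial sum reaches `limit`."""
    total, rows = 0, 0
    for t_row, c_row in zip(target.astype(np.int64), candidate.astype(np.int64)):
        total += int(np.abs(t_row - c_row).sum())
        rows += 1
        if total >= limit:
            break
    return total, rows


def cross_step(cur, ref, top, left, step, block=16):
    """One nine-point step; returns (best_point, best_sad, rows_checked)."""
    target = cur[top:top + block, left:left + block]
    best_point, best_sad, rows_total = None, float("inf"), 0

    def check(point):
        nonlocal best_point, best_sad, rows_total
        dy, dx = offset(point, step)
        cand = ref[top + dy:top + dy + block, left + dx:left + dx + block]
        cost, rows = pde_sad(target, cand, best_sad)
        rows_total += rows
        if cost < best_sad:  # a rejected candidate never satisfies this
            best_point, best_sad = point, cost

    for point in CROSS_FIRST:
        check(point)
    for point in CROSS_SECOND[best_point]:  # order depends on the current minimum
        check(point)
    return best_point, best_sad, rows_total
```

All nine points are still examined, so the minimum SAD of the step is the same as with a sequential 1~9 scan; only `rows_total`, the number of rows actually accumulated, depends on the order.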
This algorithm checks five candidates first, then checks the four remaining points in an order determined by the minimum point of the first sub-step. For example, if point 9 is the minimum error point, SAD(8) or SAD(6) is likely to be the minimum of the first sub-step, so point 9 is checked earlier than the corner points on the opposite side. Of course, if point 1 or point 3 turns out to be the minimum point, more computation time is needed. Based on probability, however, our algorithm will encounter the minimum point early in most cases. Table 2 shows the 1D cross NTSS based search algorithm, which first checks three points, 4, 5 and 6 in Fig. 1, and then continues checking the remaining candidates according to the result. Its search rule is simpler than that of Table 1. Table 3 presents the quadrant NTSS based search algorithm, which checks three points (5, 6 and 8) in the first sub-step. As shown in the table, it includes the algorithm of Table 1. Its search rule is more complex than those of Table 1 and Table 2.

Tables 4 to 6 apply the above concept to lossy TSS algorithms. Unlike the previous tables, which give lossless predicted images compared with the original NTSS, they degrade the prediction quality compared with the original TSS algorithm. In return, they reduce the checking points significantly, by about a factor of two.

Table 1. Lossless 2D Cross NTSS based search algorithm
  First sub-step: check SAD(2), SAD(4), SAD(5), SAD(6), and SAD(8)
  Second sub-step:
    if SAD(2) or SAD(5) is minimum then
      check the points in the order 1 -> 3 -> 7 -> 9 & select min{SAD(x)}
    elseif SAD(8) is minimum then
      check the points in the order 7 -> 9 -> 1 -> 3 & select min{SAD(x)}
    elseif SAD(4) is minimum then
      check the points in the order 1 -> 7 -> 3 -> 9 & select min{SAD(x)}
    elseif SAD(6) is minimum then
      check the points in the order 3 -> 9 -> 1 -> 7 & select min{SAD(x)}
    end

Table 2. Lossless 1D Cross NTSS based search algorithm
  First sub-step: check SAD(4), SAD(5), and SAD(6)
  Second sub-step:
    if SAD(4) is minimum then
      check the points in the order 1 -> 7 -> 2 -> 8 -> 3 -> 9 & select min{SAD(x)}
    elseif SAD(5) is minimum then
      check the points in the order 2 -> 8 -> 3 -> 9 -> 1 -> 7 & select min{SAD(x)}
    elseif SAD(6) is minimum then
      check the points in the order 3 -> 9 -> 2 -> 8 -> 1 -> 7 & select min{SAD(x)}
    end

Table 3. Lossless Quadrant NTSS based search algorithm
  First sub-step: check SAD(5), SAD(6), and SAD(8)
  Second sub-step:
    if SAD(8) < SAD(6) < SAD(5) then
      check the points in the order 9 -> 7 -> 3 -> 3 -> 2 -> 1 & select min{SAD(x)}
    elseif SAD(6) < SAD(8) < SAD(5) then
      check the points in the order 9 -> 3 -> 2 -> 1 -> 4 -> 7 & select min{SAD(x)}
    elseif SAD(5) < SAD(8) & SAD(5) < SAD(6) then
      apply the 2D Cross NTSS algorithm
    end

Table 4. Lossy 2D Cross TSS based search algorithm
  First sub-step: check SAD(2), SAD(4), SAD(5), SAD(6), and SAD(8)
  Second sub-step:
    if SAD(2) is minimum then
      check the points in the order 1 -> 3 & select min{SAD(x)}
    elseif SAD(8) is minimum then
      check the points in the order 7 -> 9 & select min{SAD(x)}
    elseif SAD(4) is minimum then
      check the points in the order 1 -> 7 & select min{SAD(x)}
    elseif SAD(6) is minimum then
      check the points in the order 3 -> 9 & select min{SAD(x)}
    end

Table 5. Lossy 1D Cross TSS based search algorithm
  First sub-step: check SAD(4), SAD(5), and SAD(6)
  Second sub-step:
    if SAD(4) is minimum then
      check the points in the order 1 -> 7 & select min{SAD(x)}
    elseif SAD(5) is minimum then
      check the points in the order 2 -> 8 & select min{SAD(x)}
    elseif SAD(6) is minimum then
      check the points in the order 3 -> 9 & select min{SAD(x)}
    end

Table 6. Lossy Quadrant TSS based search algorithm
  First sub-step: check SAD(5), SAD(6), and SAD(8)
  Second sub-step:
    if SAD(8) < SAD(6) < SAD(5) or SAD(6) < SAD(8) < SAD(5) then
      check the points in the order 9 & select min{SAD(x)}
    elseif SAD(8) < SAD(5) < SAD(6) then
      check the points in the order 7 -> 4 & select min{SAD(x)}
    elseif SAD(6) < SAD(5) < SAD(8) then
      check the points in the order 3 -> 2 & select min{SAD(x)}
    elseif SAD(5) < SAD(8) & SAD(5) < SAD(6) then
      apply the 2D Cross TSS algorithm
    end

Experimental Results and Discussions

To evaluate the performance of the proposed algorithm, we use 100 frames each of the “Foreman”, “Car phone”, “Claire”, “Trevor”, “Grandmother”, and “Akio” image sequences. Of these, “Foreman” and “Car phone” have large motion compared with the other sequences, while “Akio” is almost inactive. The proposed algorithms are compared with the original new three-step search (NTSS) algorithm, the three-step search (TSS) algorithm, its lossy modified algorithms, and the full search (FS) algorithm. The block size is 16 by 16 pixels and the search range is ±7 pixels. The image format is QCIF (176 by 144) for each sequence and only forward prediction is used. SAD is employed as the error criterion for finding the motion vector. The simulation results are reported in terms of PSNR, average checking points, and average checking rows.

Table 1. Experimental results for Average Checked Rows with various lossless NTSS based algorithms for 30 Hz

  Algorithm       Foreman  Car phone  Trevor  Claire  Akio    Grand
  Spiral NTSS     7.1744   6.9673     5.5742  6.2807  2.7952  5.3411
  1D Cross NTSS   7.0994   6.9732     5.4957  6.2741  2.7875  5.3436
  Cross NTSS      6.8486   6.7773     5.4499  6.2706  2.7897  5.2844
  Quadrant NTSS   7.0457   6.7732     5.4574  6.2803  2.7926  5.3265

Table 2. Experimental results for PSNR [dB] with various algorithms for 30 Hz

  Algorithm      Foreman  Car phone  Trevor   Claire   Akio     Grand
  FS             34.4384  33.4491    33.2838  41.2994  42.3456  42.2903
  TSS            33.8478  33.2423    33.2130  41.2954  42.3456  42.2799
  1D Cross TSS   33.3158  33.0072    33.0501  41.2624  42.3448  42.2453
  Cross TSS      33.5092  33.0767    33.0884  41.2800  42.3456  42.2721
  NTSS           34.3382  33.4086    33.2655  41.2994  42.3456  42.2882

Table 3. Experimental results for Average Checking Points with various algorithms for 30 Hz

  Algorithm      Foreman  Car phone  Trevor   Claire   Akio     Grand
  FS             225      225        225      225      225      225
  TSS            25       25         25       25       25       25
  1D Cross TSS   13       13         13       13       13       13
  Cross TSS      14.0778  14.0736    13.5160  13.0808  13.0375  13.1975
  NTSS           19.5257  19.5926    17.9649  17.1409  17.0505  17.4507

[Figure: two panels plotting average checking rows against frame number (0 to 100) at 30 Hz, with curves for only spiral NTSS, spiral & 1D cross NTSS, spiral & 2D cross NTSS, and spiral & quadrant NTSS]
Fig. 2. Average checking rows for each frame with (a) the ‘Car phone’ sequence and (b) the ‘Foreman’ sequence.

Table 1 shows the average checking rows of the various lossless NTSS based algorithms at 30 Hz. In the table, our algorithms achieve more computational reduction than the spiral NTSS algorithm in most cases. Of course, the spiral NTSS algorithm itself reduces computation more than the non-spiral NTSS algorithm. Among our algorithms, the “Cross NTSS” algorithm shows the least computation for motion estimation for most sequences. Table 2 shows the average PSNR of the various algorithms. As predicted, NTSS obtains better PSNR than the TSS variants, and FS has the best PSNR. Table 3 presents the average checking points of the various algorithms.
In particular, the Cross TSS algorithm performs well in terms of both PSNR and average checking points. Taken together with Table 2, NTSS achieves better PSNR with fewer checking points than the TSS algorithm. These results indicate that most sequences have motion concentrated around (0,0). From Table 1, we can see that our algorithm requires only about 18 ~ 43 % of the computation of the original NTSS algorithm (an average of about 2.8 to 6.8 checked rows out of the 16 rows of each block). Fig. 2 shows the average checking rows for each frame of the “Car phone” and “Foreman” sequences. In the figure, the 2D cross NTSS based search algorithm shows the largest reduction in motion estimation computation.

Conclusions

In this paper, we described an algorithm that reduces computational complexity using a UESA based quadrant search and PDE. Our algorithm first uses the NTSS scheme and PDE to remove useless computations in finding the best matched block. Then, we apply a quadrant based search scheme, built on the concept of the UESA, to increase the efficiency of PDE. Unlike conventional modified NTSS algorithms, our algorithm keeps the predicted image lossless with respect to the original NTSS algorithm. Experimentally, our algorithm checked fewer rows per block than the conventional NTSS while keeping the same PSNR. Our work will therefore contribute to applications that require a fast video encoder.

References

[1] F. Dufaux and F. Moscheni, “Motion Estimation Techniques for Digital TV: A Review and a New Contribution,” Proceedings of the IEEE, vol. 83, no. 6, pp. 858-876, June 1995.
[2] J.N. Kim et al., “A fast three-step search algorithm with minimum checking points using unimodal error surface assumption,” IEEE Trans. Consumer Electronics, vol. 44, no. 3, pp. 638-647, August 1998.
[3] J.N. Kim et al., “A fast motion estimation for software based real-time video coding,” IEEE Trans. Consumer Electronics, vol. 45, no. 2, pp. 417-426, May 1999.
[4] R. Li, B. Zeng, and M.L. Liou, “A new three-step search algorithm for block motion estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 4, no. 4, pp. 438-442, August 1994.
[5] L.G. Chen, W.T. Chen, Y.S. Jehng, and T.D. Chiueh, “An efficient parallel motion estimation algorithm for digital image processing,” IEEE Trans. Circuits Syst. Video Technol., vol. 1, no. 4, pp. 378-385, September 1991.
[6] L.M. Po and W.C. Ma, “A novel four-step search algorithm for fast block motion estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 3, pp. 313-317, June 1996.
[7] J. Lu and M.L. Liou, “A simple and efficient search algorithm for block-matching motion estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 2, pp. 429-433, April 1997.
[8] J.N. Kim et al., “Reduction of checking points using unimodal error surface assumption for fast motion estimation,” Proc. SPIE, pp. 124-134, July 1998.
[9] M.R. Pickering, J.F. Arnold, and M.R. Frater, “An adaptive search length algorithm for block matching motion estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 6, pp. 906-912, December 1997.
[10] B. Liu and A. Zaccarin, “New fast algorithms for the estimation of block motion vectors,” IEEE Trans. Circuits Syst. Video Technol., vol. 3, no. 2, pp. 148-157, April 1993.
[11] J.N. Kim et al., “Efficient and adaptive three-step search (EATSS) algorithm using adaptive search strategy, unequal subsampling and partial distortion elimination,” Proc. SPIE, pp. 666-675, July 1999.
[12] J.N. Kim et al., “A fast new three-step search algorithm for motion estimation using UESA and adjacent SADs,” Proc. IEEE ISCE, pp. 224-229, Nov. 1999.
[13] J.N. Kim et al., “Fast Block Matching Algorithm Using Threshold-based Half Stop, Cross Search and Partial Distortion Elimination,” Proc. SPIE, pp. 844-852, Feb. 2000.