Block_Matching_Algorithm_For_Motion_Estimation

advertisement
International Journal of Research In Science & Engineering
Volume: 1 Issue: 1
e-ISSN: 2394-8299
p-ISSN: 2394-8280
BLOCK MATCHING ALGORITHM FOR MOTION ESTIMATION
Siddharth Joshi1, Shubham S.Kale2, Akshay Kalokhe3, Sanket Kulkarni4
1
Student, Computer Department, PCCOE, siddharth9527@gmail.com
Student, Computer Department, DYPIEMR, shubham.kale27@gmail.com
3
Student, Computer Department, SRTTC, akshaykalokhe22@gmail.com
4
Student, Computer Department, SRTTC, sanket.kulkarni20@gmail.com
2
ABSTRACT
This paper is a survey of the block matching algorithms used for motion estimation in video compression. It
implements and compares 4 different types of block matching algorithms that range from the very basic
Exhaustive Search to the recent fast adaptive algorithms like Adaptive Energy Model For Motion
Estimation(AEME). The algorithms that are evaluated in this paper are widely acknowledged by the video
compressing community and have been used in implementing various standards, ranging from MPEG1 to
MPEG4 . The paper also presents a very brief introduction to the entire flow of video compression.
Keywords: Block matching, motion estimation, video compression.
----------------------------------------------------------------------------------------------------------------------------1. INTRODUCTION
With the onset of the multimedia age and the spread of Internet, video storage on CD/DVD and streaming
video has been gaining a lot of popularity. The Moving Picture Experts Group (MPEG) video coding standards
pertain towards compressed video storage on physical media like CD/DVD. Generally in any standard the basic
flow of the entire compression decompression process is largely the same and is depicted in Fig below. The
encoding side estimates the motion in the current frame with respect to a preceding frame. A motion compensated
image for the current frame is then created that is built of blocks of image from the preceding frame. The motion
vectors for blocks used for motion estimation are transmitted, as well as the variance of the compensated image with
the current frame is also JPEG encoded and sent. The encoded image that is sent is then decrypted at the encoder
and used as a reference frame for the subsequent frames. The decoder converses the process and creates a full frame.
The whole idea behind motion estimation based video compression is to save on bits by sending JPEG encoded
variance images which inherently have less energy and can be highly compressed as associated to sending a full
frame that is JPEG encoded. Motion JPEG, where all frames are JPEG encoded.
The most computationally classy and resource hungry operation in the entire compression process is
motion estimation. Hence, this arena has seen the highest bustle and research interest in the past two decades. This
paper evaluates the fundamental block matching algorithms such as Exhaustive Search(ES) , Three Step
Search(TSS) , New Three Step Search(NTSS) , AEME. Section II explains block matching in general and then the
above algorithms in detail.
IJRISE| www.ijrise.org|editor@ijrise.org
International Journal of Research In Science & Engineering
Volume: 1 Issue: 1
e-ISSN: 2394-8299
p-ISSN: 2394-8280
2. BLOCK MATCHING ALGORITHM
The underlying belief behind motion estimation is that the patterns corresponding to objects and
background in a frame of video order move within the frame to form corresponding objects on the subsequent frame.
The idea behindhand block matching is to split the current frame into a matrix of ‘macro blocks’ that are then
compared with corresponding block and its neighbors in the previous frame to create a vector that stipulates the
movement of a macro block from one location to another in the preceding frame. This movement calculated for all
the macro blocks comprising a frame, organizes the motion estimated in the current frame. The search area for a
good macro block match is constrained up to p pixels on all fours sides of the equivalent macro block in previous
frame. This ‘p’ is called as the search parameter. Bigger motions require a larger p, and the larger the search
parameter the more computationally expensive the method of motion estimation becomes. Usually the macro block
is taken as a square of side 16 pixels, and the search parameter p is 7 pixels. The idea is represented in Figure.
The matching of a macro block with another is based on the output of a cost function. The macro block that results
in the minimum cost is the one that matches the closest to current block.
2.1. Exhaustive Search(ES)
This algorithm, also very well known as Full Search, is the most computationally expensive block matching
algorithm of all. This algorithm computes the cost function at each possible location in the search window. As a
result of which it finds the best promising match and gives the highest PSNR amongst any block matching
IJRISE| www.ijrise.org|editor@ijrise.org
International Journal of Research In Science & Engineering
Volume: 1 Issue: 1
e-ISSN: 2394-8299
p-ISSN: 2394-8280
algorithm. Fast block matching algorithms try to attain the same PSNR doing as little computation as possible. The
obvious disadvantage to ES is that as larger the search, the window gets the more computations it requires.
2.2. Three Step Search(TSS)
This is one of the initial attempts at fast block matching algorithms and dates back to mid 1980s. The
general idea is symbolized in Figure below. It starts with the search location at the center and sets the ‘step size’ S =
4, for a usual search constraint value of 7. It then searches at eight locations +/- S pixels around location (0,0). From
these nine positions searched so far it picks the one giving least cost and makes it the new search origin. It then sets
the new step size S = S/2, and echoes similar search for two more iterations until S = 1. At that point it finds the
position with the minimum cost function and the macro block at that location is the best match. The calculated
motion vector is then saved for broadcast. It gives a flat reduction in computation by a factor of 9. So that for p = 7,
ES will calculate cost for 225 macro blocks whereas TSS computes cost for 25 macro blocks. The idea behind TSS
is that the fault surface due to motion in every macro block is unimodal. A unimodal surface is a bowl shaped
surface such that the weights produced by the cost function increase monotonically from the global minimum.
2.3. New Three Step Search (NTSS)
NTS improves on TSS results by providing a center biased searching scheme and having provisions for half
way stop to reduce computational charge. It was one of the first widely accepted fast algorithms and frequently used
for implementing earlier standards like MPEG 1. The TSS uses a uniformly allocated checking pattern for motion
detection and is inclined to to missing small motions. The NTSS process is shown graphically in Figure. In the first
step 16 points are checked in addition to the search source for lowest weight using a cost function. Of these
additional search locations, 8 are a space of S = 4 away (similar to TSS) and the other 8 are at S = 1 away from the
search origin. If the minimum cost is at the origin then the search is stopped right here and the motion vector is set
as (0, 0). If the smallest weight is at any one of the 8 locations at S = 1, then we change the origin of the search to
that point and check for loads adjacent to it. Depending on which point it is we might end up checking 5 points or 3
points. The position that gives the lowest weight is the closest match and motion vector is set to that location. On the
other hand if the smallest weight after the first step was one of the 8 locations at S = 4, then we follow the normal
TSS process. Hence although this process might need a minimum of 17 points to check every macro block, it also
has the worst-case scenario of 33 locations to check.
IJRISE| www.ijrise.org|editor@ijrise.org
International Journal of Research In Science & Engineering
Volume: 1 Issue: 1
e-ISSN: 2394-8299
p-ISSN: 2394-8280
2.4. AEME
AEME extracts energy histograms to compare different blocks. After extracting the energy histograms for
model’s blocks, neighbours’ histograms will be compared one by one with the current block’s histogram using the
Bhattacharyya Coefficient . Then, three of the most similar neighboring blocks will be selected to enter the
prediction process and the other two blocks of model will be omitted completely. They will be incorporated to gain
the PMV by a weighting formula, which assigns the greatest weight to the Block with Highest Similarity (BHS) and
the least weight to the Block with Lowest Similarity (BLS). The Block with Middle Similarity (BMS) will take the
middle weight. Equation illustrates the AEME’s PMV calculator formula:
PMV = a . MVBHS + b . MVBMS + c . MVBLS .
Coefficients a, b and c have such a relation: a ˃ b ˃ c
and will be specified empirically. Prediction step should be followed by a searching step, considering that PMV is
not a completely precise one. But prediction helps to reduce the SW’s size and reduce the computational burden of
the searching scheme. The proposed algorithm uses an adaptive two−step search algorithm which is very similar to
the last to step of TSS algorithm and differs with it only in distance between search points of its first step. That is,
when the initial prediction point specified, the searching scheme will evaluate this point and its 8 neighboring points
which have d pixel distance with it, instead of TSS’s 2 pixel distance. Figure illustrates this searching scheme. Green
circle presents the predicted point, 8 stars represent the first step’s neighbouring points which have d pixel far from
the predicted point (in this Figure d equals to 2).
IJRISE| www.ijrise.org|editor@ijrise.org
International Journal of Research In Science & Engineering
Volume: 1 Issue: 1
e-ISSN: 2394-8299
p-ISSN: 2394-8280
The blue star represents the selected point of the first step and finally this point with its 8 immediately
neighbors which are specified by 8 triangles in Figure 2, will be evaluated to find the best match. It is specified by
red triangle. All evaluations are done using SAD criteria. This algorithm adaptively sets the d value dependent on
the size of frames. This scheme divides videos into four groups as for their size of frames so: small (frame sizes
about 240×352), medium (frame sizes about 576×720) and large (frame sizes about 1080×1920), and initializes d
empirically by 2, 3 and 4 respectively for them.
3. SIMULATION RESULTS
In this section, AEME will be compared with ES, TSS and NTSS. For a fair comparison, NTSS will use the
same searching scheme and model of support as AEME. According to the group of video, thresholds are set
empirically as: T1=0.95, T2 =0.9 and T3=0.99 for the small group, T1=0.99, T2 =0.9 and T3=0.995 for the medium
group and T1=0.95, T2 =0.9 and T3=0.999 for the large group. The constant factor in Equations 3 and 4, ε, is set to
10% of the maximum energy value among all the energies. The parameters in Equation 5 are given by a=2/P, b=1/P
and c=0.25/P (with P=3.25). As mentioned, when there are multiple motions in the scene (i.e., motions with
different directions and velocities near each other), the proposed algorithm will illustrate its ability in prediction
compared with other predictive algorithms. ES and TSS may behave well too, but they depend on the scene’s color.
If the overall colors of the scene are near to each other (i.e., objects and background have almost same color), ES
and TSS may make terrible mistakes. Here 30 successive frames from:”Rihanna”. According to it, AEME’s average
PSNR value is 1.25 dB greater than NTSS, 1.19 dB more than TSS and only 0.11 dB lower than ES. AEME presents
greater PSNR improvements compared with others in “Rihanna” sequences, considering that they include areas with
similar colors and multiple motions. AEME does only 8.8% further computations compared with NTSS. But it has
been reduced 35.1% and 1116.2% of computational overheads presented by TSS and ES, respectively. Of course,
the computational load of AEME should be considered a little more than TSS, when that terrible case occurs and
algorithm uses TSS instead of its own procedure. As is clear, AEME has improved the average PSNR value with an
acceptable computational burden.
4. SUMMARY
The past two eras have seen the growth of wide acceptance of multimedia. Video compression plays an
important role in archival of entertaining based video (CD/DVD) as well as real-time reconnaissance / video
conferencing applications. While ISO MPEG sets the ordinary for the former types of application, ITU sets the
standards for latter low bit rate uses. In the entire motion based video compression process motion estimation is the
most computationally expensive and time-consuming procedure. The research in the past decade has focused on
reducing both of these side effects of motion estimation. Block matching methods are the most popular and efficient
of the various motion estimation techniques. This paper first calls the motion compensation based video
compression in brief. It then illustrates and simulates 4 of the block matching algorithms, with their comparative
IJRISE| www.ijrise.org|editor@ijrise.org
International Journal of Research In Science & Engineering
Volume: 1 Issue: 1
e-ISSN: 2394-8299
p-ISSN: 2394-8280
study at the end. Three more, very recent, block matching algorithms are studied in the end. Of the various
algorithms studied or simulated during the course of this final project AEME turns out to be the best block matching
algorithm. AEME is most useful in a video where background color is similar to objects and it (video) has multiple
motions.
5. CONCLUSION
Thus, we can conclude that different algorithms can be used in different scenarios. Such as TSS&NTSS are
faster but applicable to videos with low bit rate.ES algorithm can be used only in worst case scenario. AEME
algorithm can be termed as most useful because of its accuracy in prediction without any annoying computational
overheads.
6. REFERENCES
[1]
Amir Ghahremani, Amir Mousavinia,” An Efficient Adaptive Energy Model Based Predictive
Motionestimation Algorithm For Video Coding”
[2]
https://www.ece.cmu.edu/~ee899/project/deepak_mid.htm
[3]
en.wikipedia.org/wiki/Block-matching_algorithm
[4]
www.csee.wvu.edu/~xinl/courses/ee569/4ss.pdf
IJRISE| www.ijrise.org|editor@ijrise.org
Download