Analysis of Motion Estimation Algorithm (HEVC), using Multi-core processing

Shiba Kuanar

Shiba.kuanar@mavs.uta.edu

1000449352

EE5359 – Interim Report

CONTENTS:

• OVERVIEW OF HEVC

• SEARCH ALGORITHM

• RESULTS

• REFERENCES

HEVC

• High Efficiency Video Coding (HEVC) is the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group.

• The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards in the range of 50% bit-rate reduction for equal perceptual video quality.

• Video coding standards always involve a trade-off among:

1) Picture Quality

2) Compression Ratio

3) Computation complexity

Fig 1: HEVC Encoder

HEVC

• Partitioning – into non-overlapping blocks (coding tree units in HEVC, the counterpart of macroblocks in earlier standards)

• Prediction – forward/backward, based on the current, past and future frames

• Error signal - transformed and quantized

• Entropy coded

Figure 2: HEVC decoder block diagram [2]

Fig 3: Partitioning of a Macro block

CTB -> CBs and TBs

Solid lines indicate CB boundaries and dotted lines indicate TB boundaries

Corresponding QUAD TREE

Motion Estimation(ME)

• Block-based ME algorithms have been proposed to reduce computation time. The algorithms discussed in the current proposal are:

“Test Zone (TZ) search”

“Full search algorithm”

• The proposed work is to analyze the existing algorithms and try to reduce the motion estimation computation time using multicore programming.

• Motion estimation accounts for more than 84% of the encoding time and coding complexity of the encoder.

• HM13.0 – latest HEVC software.

• Block matching criteria used in BBME

- Mean squared error (MSE)

- Mean absolute difference (MAD); requires no multiplications

- Matching pixel count (MPC)
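The three matching criteria listed above can be sketched as follows. This is a minimal illustration, not code from HM: blocks are assumed to be plain lists of rows of integer pel values, and the function names and the matching-pixel threshold are hypothetical choices made for this example.

```python
def mse(block_a, block_b):
    """Mean squared error between two equally sized blocks."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(block_a, block_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def mad(block_a, block_b):
    """Mean absolute difference; needs no multiplications per pel."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(block_a, block_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def matching_pixel_count(block_a, block_b, threshold=2):
    """Number of pel pairs whose absolute difference is within the
    (assumed) threshold. Unlike MSE and MAD, a larger value is better."""
    return sum(1
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b)
               if abs(a - b) <= threshold)


current = [[10, 12], [14, 16]]
candidate = [[11, 12], [10, 20]]

print(mse(current, candidate))                   # 8.25
print(mad(current, candidate))                   # 2.25
print(matching_pixel_count(current, candidate))  # 2
```

MSE and MAD are minimized over the candidate blocks, whereas the matching pixel count is maximized.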

Figure 4: Percentage “execution time” distribution of the HEVC encoder steps for an HD sequence [22]

Block based Motion Estimation

• In HEVC video coding, motion estimation (ME) plays a vital role in reducing temporal redundancy between frames.

• In BBME, the current frame is divided into N×N pixel macro blocks (MBs), and for each MB a certain area of the reference frame is searched to minimize a block difference measure (BDM).

• The block difference measure is usually the sum of absolute differences (SAD) between the MB in the current frame and the candidate MB in the reference frame.

• The displacement within the search area that gives the minimum BDM value is called the motion vector (MV).

• MVs, together with the transformed and quantized block differences (residuals), are entropy coded into the video bit stream.

• To decrease this large computational burden, many fast BBME algorithms have been proposed. In the current proposal only two algorithms are analyzed: 1) an algorithm based on the number of search positions and 2) an algorithm based on fast full search.
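The block-based search described above can be sketched as an exhaustive (full) search that minimizes the SAD. This is an illustrative sketch only: frames are assumed to be lists of rows of pel values, and the function names, block position, block size and search range are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block at (bx, by) in the current frame and
    the block displaced by (dx, dy) in the reference frame."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def full_search(cur, ref, bx, by, n, d):
    """Test every displacement in [-d, d] x [-d, d] and return the
    motion vector with the minimum SAD, together with that SAD."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, float("inf")
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Synthetic 8x8 reference frame and a current frame whose content is the
# reference shifted by 2 pels horizontally and 1 row vertically.
ref = [[(7 * x + 13 * y) % 31 for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
        for x in range(8)] for y in range(8)]

print(full_search(cur, ref, bx=2, by=2, n=2, d=2))  # ((2, 1), 0)
```

The search visits every candidate in the window, which is what makes full search expensive and motivates the fast algorithms.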

Motion Estimation (ME)[19]

• “Pel by pel” or “block by block” motion algorithms are used for ME.

ME improves the prediction accuracy between current frame and reference frame.

In Block matching algorithm (BMA), motion of a block of pels (M*N) within a frame interval is estimated.

The range of the motion vector is constrained by the search window.

The best match between the (M*N) block in the current frame and the corresponding block in the previous frame is calculated within a search window of size $(M + 2m_2) \times (N + 2n_1)$ (Figure 5).

BMA minimizes a cost function, such as the mean absolute error or the mean squared error, to find the best match.

Figure 5: Motion estimation of an (M*N) block in the previous frame within a $(M + 2m_2) \times (N + 2n_1)$ search window [19]. The MV range is $[-n_1, n_1]$ pels and $[-m_2, m_2]$ rows. With a maximum displacement of d in each direction, the window contains $(2d + 1)^2$ search points (SPs).

Motion Estimation

• The cost function is computed for every pel and line displacement within the search window, and the results are compared to find the location of the optimal (minimum) cost.

• Finding the minimum cost function in Figure 5 requires $(2m_2 + 1)(2n_1 + 1)$ cost computations, one for every possible displacement allowed by the MV range.

• Once ME has been carried out at integer-pel MV resolution, the MV can be refined to half-pel/half-line resolution by interpolating the fractional pels surrounding the best match.

• ME is therefore computationally intensive, and many fast ME algorithms have been developed.
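To give a feel for the numbers involved, the short sketch below counts the integer-pel cost computations for a given MV range and lists the eight half-pel positions around an integer-pel MV. The function names and the example range of 64 are assumptions made for illustration, not values taken from the report.

```python
def integer_search_points(m2, n1):
    """Number of cost computations for an MV range of
    [-m2, m2] rows and [-n1, n1] pels."""
    return (2 * m2 + 1) * (2 * n1 + 1)


def half_pel_candidates(mv):
    """The eight half-pel positions surrounding an integer-pel MV (x, y)."""
    x, y = mv
    return [(x + 0.5 * dx, y + 0.5 * dy)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
            if (dx, dy) != (0, 0)]


# A hypothetical range of 64 in both directions needs 129 * 129 cost
# computations per block at integer-pel resolution.
print(integer_search_points(64, 64))       # 16641

# Half-pel refinement adds only eight more candidates around the best match.
print(len(half_pel_candidates((3, -2))))   # 8
```

The contrast between the two counts shows why the integer-pel stage dominates the cost and is the usual target of fast search algorithms.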

Figure 6: Motion estimation at the eight fractional pels surrounding the integer-pel location of the MV (pel a)

Fig 7: BBME Algorithm for a) 2DLOG, b) TSS and c) NTSS search

Fig 8: Logarithmic Search algorithm

Figure 9: Various search patterns: a) diamond search, b) square search, c) horizontal hexagonal search, d) vertical hexagonal search, e) rotating hexagon type 1 and f) rotating hexagon type 2 [17, 20]

Test zone (TZ) Search Algorithm Process (4 steps)

The TZ search algorithm is one of the fast search algorithms used for HEVC motion estimation; it reduces the encoding time.

• 1) Motion Vector Prediction: The TZS algorithm evaluates the median, left, up and upper-right predictors. The predictor with the minimum cost is selected as the starting location for the subsequent search steps.

• 2) Initial Grid Search: In this step, the algorithm searches the search window using diamond or square patterns with stride lengths ranging from 1 through 64, in powers of 2.

• 3) Raster Search: The raster search is a simple full search on a down-sampled version of the search window.

• 4) Raster/Star Refinement: The refinement step fine-tunes the motion vector obtained from step three.
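The steps above can be sketched in simplified form as follows. This is not the HM implementation: the raster search (step 3) is omitted, the refinement is reduced to repeated stride-1 diamond steps, SAD is assumed as the cost, and all function names, the predictor list and the synthetic frames are hypothetical.

```python
def sad(cur, ref, bx, by, mv, n):
    """SAD for candidate mv = (dx, dy); infinite if outside the frame."""
    dx, dy = mv
    height, width = len(ref), len(ref[0])
    if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
        return float("inf")
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def diamond(center, stride):
    """Four diamond points at the given stride around the center."""
    cx, cy = center
    return [(cx + stride, cy), (cx - stride, cy),
            (cx, cy + stride), (cx, cy - stride)]


def tz_search_sketch(cur, ref, bx, by, n, predictors, max_stride=64):
    def cost(mv):
        return sad(cur, ref, bx, by, mv, n)

    # Step 1: start from the predictor with the minimum cost.
    start = min(predictors, key=cost)
    best, best_cost = start, cost(start)

    # Step 2: initial grid search with strides 1, 2, 4, ..., max_stride.
    stride = 1
    while stride <= max_stride:
        for mv in diamond(start, stride):
            c = cost(mv)
            if c < best_cost:
                best, best_cost = mv, c
        stride *= 2

    # Step 3 (raster search) is omitted in this sketch.

    # Step 4: refine with stride-1 diamonds until nothing improves.
    improved = True
    while improved:
        improved = False
        for mv in diamond(best, 1):
            c = cost(mv)
            if c < best_cost:
                best, best_cost, improved = mv, c, True
    return best, best_cost


# Synthetic 20x20 ramp image; the current frame is the reference shifted
# by 5 pels horizontally and 1 row vertically.
ref = [[10 * x + y for x in range(20)] for y in range(20)]
cur = [[ref[y + 1][x + 5] if y + 1 < 20 and x + 5 < 20 else 0
        for x in range(20)] for y in range(20)]

result = tz_search_sketch(cur, ref, bx=6, by=6, n=2,
                          predictors=[(0, 0), (4, 0)])
print(result)  # ((5, 1), 0)
```

In this example the predictor (4, 0) is chosen as the start, the grid search moves to (5, 0), and the refinement reaches (5, 1), using far fewer cost evaluations than an exhaustive search of the same window.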

Figure 10 Flowchart of TZS Algorithm [28]

Test Sequence 1 - BasketballDrill [24]

Resolution = 832x480 and frame = 50

Test Sequence 2 - RaceHorses [24]

Resolution = 832x480 and frame = 50

BasketballDrill: TZ search (Fast Search algorithm), 4:2:0 video format

PSNR(Y, U, V) = [6 * PSNR(Y) + PSNR(U) + PSNR(V)] / 8

| QP | PSNR (dB) | Bitrate (Kbps) | Time (sec) |
|----|-----------|----------------|------------|
| 24 | 41.0574 | 16115.2160 | 1953.452 |
| 32 | 36.3168 | 5912.7360 | 1642.504 |
| 41 | 32.0832 | 2025.4720 | 1343.819 |
| 49 | 28.1364 | 621.3440 | 1190.627 |

BasketballDrill: Full search (Fast Search algorithm), 4:2:0 video format

PSNR(Y, U, V) = [6 * PSNR(Y) + PSNR(U) + PSNR(V)] / 8

| QP | PSNR (dB) | Bitrate (Kbps) | Time (sec) |
|----|-----------|----------------|------------|
| 24 | 40.957 | 16115.2160 | 1948.201 |
| 32 | 35.8839 | 5912.7360 | 1583.513 |
| 41 | 31.893 | 2025.4720 | 1341.228 |
| 49 | 27.936 | 621.3440 | 1190.400 |

Project Results

• The simulation was conducted with HM Software 13.0 on a single-core system, using the BasketballDrill and RaceHorses test sequences [24].

• The PSNR (dB), encoding time (sec) and bitrate (kbps) were measured while varying the quantization parameter (QP). Results were obtained for both the TZ search and Full search algorithms.

• The PSNR was calculated as PSNR(Y, U, V) = [6 * PSNR(Y) + PSNR(U) + PSNR(V)] / 8 for the 4:2:0 video format.

• The “PSNR – Bitrate” and “PSNR – Quantization Parameter” curves were plotted for TZ search and Full search.
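The weighted PSNR formula above can be written as a small helper. This is a sketch under stated assumptions: the per-component PSNR is assumed to follow the usual definition for 8-bit video (peak value 255), and the function names and sample values are hypothetical rather than measurements from the report.

```python
import math


def psnr_from_mse(mse, peak=255):
    """PSNR in dB for a given mean squared error (8-bit peak assumed)."""
    return 10 * math.log10(peak ** 2 / mse)


def combined_psnr(psnr_y, psnr_u, psnr_v):
    """Weighted PSNR for 4:2:0 video: [6*PSNR(Y) + PSNR(U) + PSNR(V)] / 8."""
    return (6 * psnr_y + psnr_u + psnr_v) / 8


# Hypothetical per-component values in dB.
print(combined_psnr(40.0, 42.0, 44.0))  # 40.75
```

The luma component carries six of the eight weights, so the combined figure stays close to PSNR(Y).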

Figure: BasketballDrill test sequence, PSNR – Bitrate plot (TZ search and Full search). X-axis: Bit Rate (Kbps); Y-axis: PSNR (dB).

Figure: BasketballDrill test sequence, QP – PSNR plot (TZ search and Full search). X-axis: Quantization Parameter (QP); Y-axis: PSNR (dB).

Future Work

• Further simulations will be conducted with HM Software 13.0 using multi-core processors for the TZ search and Full search algorithms.

• The PSNR (dB), encoding time (sec) and bitrate (kbps) will be calculated by varying the quantization parameter.

• The PSNR – Bitrate and PSNR – QP plots will be compared with the single-core results for the BasketballDrill and RaceHorses test sequences.

Acronyms

AVC: Advanced Video Coding

BBME: Block Based Motion Estimation

BDM: Block Difference Measure

BMA: Block matching algorithm

CABAC: Context Adaptive Binary Arithmetic Coding

CB: Coding Block

CSVT: Circuits and Systems for Video Technology

CTB: Coding Tree Block

CTU: Coding Tree Unit

CPU: Central Processing Unit

CU: Coding Unit

CUDA: Compute Unified Device Architecture

DCT: Discrete Cosine Transform

GPU: Graphics Processing Unit

HEVC: High Efficiency Video Coding

ISO: International Organization for Standardization

ITU-T: International Telecommunication Union – Telecommunication Standardization Sector

JCT-VC: Joint Collaborative Team on Video Coding

MC: Motion Compensation

MCP: Motion Compensated Prediction

ME: Motion Estimation

MPEG: Moving Picture Experts Group

OPENMP: Open Multiprocessing

PB: Prediction Block

PCM: Pulse Code Modulation

PSNR: Peak signal-to-noise ratio

PU: Prediction Unit

SAD: Sum of Absolute Differences

SAO: Sample Adaptive Offset

SIMD: Single Instruction Multiple Data

TB: Transform Block

TU: Transform Unit

VCEG: Video Coding Experts Group

VBSME: Variable Block Size Motion Estimation

2DLOG: Two dimensional logarithmic search procedure

TSS: Three step search

NTSS: New three step search

References

Reference thesis

• 1 Thesis by S. Gangavati, “Complexity reduction of H.264 using parallel programming”, 2012, which describes a significant speed-up in encoding time using a GPU (CUDA) combined with a CPU, compared with the CPU alone, through data and task parallelization.

http://www-ee.uta.edu/Dip/Courses/EE5359/Sudeep_Thesis_Draft_2.pdf

• 2 Thesis proposal by Pratik Mehta, “Complexity reduction for intra mode selection in HEVC using OpenMP”, http://www-ee.uta.edu/Dip/Courses/EE5359/Pratik_Mehta_ThesisProposal.pdf

• [1] G. J. Sullivan et al, “Overview of the high efficiency video coding (HEVC) standard”, IEEE Trans. CSVT, vol. 22, pp. 1649-1668, Dec. 2012.

• [2] G. J. Sullivan et al, “Standardized extensions of high efficiency video coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, pp. 1001-1016, Dec. 2013.

• [3] F. Bossen et al, “HEVC complexity and implementation analysis”, IEEE Trans. CSVT, vol. 22, pp. 1685-1696, Dec. 2012.

• [4] H. Zhang and Z. Ma, “Fast intra prediction for high efficiency video coding”, Pacific Rim Conf. on Multimedia, PCM2012, Singapore, Dec. 2012.

• [5] C.C. Chi et al, “Parallel scalability and efficiency of HEVC parallelization approaches”, IEEE Trans. CSVT, vol. 22, pp. 1827-1838, Dec. 2012.

• [6] Introduction to parallel computing, URL: https://computing.llnl.gov/tutorials/parallel_comp/#Whatis

• [7] T. Wiegand and G.J. Sullivan, "Overview of the H.264/AVC video coding standard," IEEE Trans. on Circ. Sys. for Video Tech., vol. 13, no. 7, pp. 560-576, July 2003.

• [8] Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG, "Comments on Motion Estimation Algorithms in Current JM Software (JVT-Q089)", Joint Video Team Document, 17th Meeting: Nice, FR, 14-21 Oct., 2005.

• [9] N. Purnachand, L.N. Alves and A. Navarro, “Improvements to TZ search motion estimation algorithm for multiview video coding”, 19th International Conference on Systems, Signals and Image Processing IWSSIP, pp. 388-391, 2012.

• [10] B. Bross et al, “High Efficiency Video Coding (HEVC) Text Specification Draft 10”, Document JCTVC-L1003, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Mar. 2013, available at http://phenix.itsudparis.eu/jct/doc_end_user/current_document.php?id=7243

• [11] J. Ascenso, C. Brites and F. Pereira, "Improving Frame Interpolation with Spatial Motion Smoothing for Pixel Domain Distributed Video Coding", in Proc. EURASIP Conference on Speech and Image Processing, Multimedia Communication and Services, Slovak Republic, June-July 2005.

• [12] W. Hong, “Coherent Block-Based Motion Estimation for Motion-Compensated Frame Rate Up-Conversion", IEEE International Conference on Consumer Electronics, pp. 165-166, Jan. 2010.

• [13] G. Bjontegaard, "Calculation of average PSNR difference between RD curves", VCEG-M33, 2001.

• [14] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) document JCTVC-J0292r1, July 2012.

• [15] P. Hanhart et al, “Subjective quality evaluation of the upcoming HEVC video compression standard”, SPIE Applications of digital image processing XXXV, vol. 8499, paper 8499-30, Aug. 2012.

• [16] M. Horowitz et al, “Informal subjective quality comparison of video compression performance of the HEVC and H.264/MPEG-4 AVC standards for low delay applications”, SPIE Applications of digital image processing XXXV, vol. 8499, paper 8499-31, Aug. 2012.

• [17] L.N.A. Alves and A. Navarro, "Fast Motion Estimation Algorithm for HEVC", Proc. IEEE International Conf. on Consumer Electronics - ICCE Berlin, Germany, vol. 11, pp. 11-14, Sep. 2012.

• [18] X. Wang et al, “Paralleling Variable Block Size Motion Estimation of HEVC on Multicore CPU plus GPU platform”, IEEE International Conference on Image Processing (ICIP 2013), Melbourne, Australia, Sep. 15-18, 2013.

• [19] K.R. Rao, D.N. Kim and J.J. Hwang, “Video Coding Standards”, Springer, 2014.

• [20] M. Jakubowski and G. Pastuszak, “Block-based motion estimation algorithms - a survey,” Journal of Opto-Electronics Review, vol. 21, pp. 86-102, Mar. 2013.

• [21] A. Abdelazim, W. Masri and B. Noaman, "Motion estimation optimization tools for the emerging high efficiency video coding (HEVC)", SPIE vol. 9029, Visual Information Processing and Communication V, 902905, Feb. 17, 2014.

• [22] M. Shafique et al, "An adaptive workload management scheme for HEVC encoding", IEEE International Conference on Image Processing (ICIP), pp. 1850-1854, Sept. 2013.

• [23] Software repository for HEVC: http://hevc.hhi.fraunhofer.de/

• [24] Video test sequences: http://forum.doom9.org/archive/index.php/t-135034.html or http://media.xiph.org/video/derf/

• [25] HM Software: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/ or https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-13.0rc1/

• [26] Y.S. Ho and K.J. Oh, “Overview of Multi-view Video Coding,” IWSSIP and EC-SIPMCS, Proc. 14th Int. Workshop on Systems, Signals and Image Processing and 6th EURASIP Conf. Focused on Speech and Image Processing, Multimedia Communications and Services, pp. 5-12, 2007.

• [27] X.L. Tang, S.K. Dai and C.H. Cai, “An Analysis of TZSearch Algorithm in JMVC,” 1st International Conference on Green Circuits and Systems ICGCS, pp. 516-520, 2010.

• [28] JVT of ISO/IEC MPEG, ITU-T VCEG, MVC Software Reference Manual - JMVC 8.5, Mar. 2011. http://phenix.itsudparis.eu/jct/doc_end_user/current_document.php?id=7243

• [29] HM Software Manual: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/

• [30] Special issue on emerging research and standards in next generation video coding, IEEE Transactions on CSVT, vol. 22, pp. 1646-1909, Dec. 2012.

• [31] Special issue on emerging research and standards in next generation video coding, IEEE Transactions on CSVT, vol. 23, pp. 2009-2142, Dec. 2013.

• [32] IEEE Journal of Selected Topics in Signal Processing, vol. 7, pp. 931-1151, Dec. 2013.

THANK YOU
