Srikanth Vasireddy
1001101538
Srikanth.Vasireddy@mavs.uta.edu
Multimedia Processing Lab, UTA

Outline
• Growing demand for video
• HEVC encoder and decoder
• Features of moving pictures
• Block-based motion estimation
• Different motion estimation algorithms
• Test sequences
• Simulation results
• Work done and in progress
• Acronyms
• References

HEVC Encoder and Decoder
Fig. 1: HEVC encoder [5]
• ME accounts for 84% of the coding complexity and encoding time [1] [5].
Fig. 2: HEVC decoder [1]

Features of Moving Pictures
• Moving images contain significant temporal redundancy: successive frames are very similar.

Video coding algorithms usually contain two coding schemes:
• Intraframe coding: does not exploit the correlation among adjacent frames and is therefore similar to still-image coding.
• Interframe coding: includes a motion estimation/compensation process to remove temporal redundancy.
“The amount of data can be reduced significantly if the previous frame is subtracted from the current frame.” [4]
Fig. 3: Motion estimation and motion compensation [4]

Source: M. Jakubowski and G. Pastuszak, “Block-based motion estimation algorithms - a survey,” Opto-Electronics Review, vol. 21, pp. 86-102, Mar. 2013 [11].

Block-Based Motion Estimation
The MPEG and H.26x standards [4] use a block-based technique for motion estimation and compensation. In this technique, each current frame is divided into equal-size blocks, called source blocks. Each source block is associated with a search region in the reference frame. The objective of block matching is to find the candidate block in the search region that best matches the source block. The displacement between a source block and a candidate block is called a motion vector.
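The block-matching idea described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the results in this deck. It assumes frames are plain lists of rows of luma values, uses the sum of absolute differences (SAD) as the matching cost, and takes the motion vector (u, v) to be the offset from the source block position to the candidate block in the reference frame. The names sad, full_search and make_demo_frames are hypothetical helpers introduced for this example.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """SAD between the n x n block of cur at (bx, by) and of ref at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Exhaustively test every offset (u, v) with |u|, |v| <= p.

    Returns ((u, v), cost) for the candidate block with the lowest SAD.
    Candidates that fall outside the reference frame are skipped
    (an assumption of this sketch; real encoders pad the frame).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, bx, by, n)
    for v in range(-p, p + 1):
        for u in range(-p, p + 1):
            rx, ry = bx + u, by + v
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if cost < best_cost:
                best_mv, best_cost = (u, v), cost
    return best_mv, best_cost


def make_demo_frames():
    """Synthetic 16 x 16 frames: a 4 x 4 patch at (4, 4) in the current frame
    appears at (6, 5) in the reference frame, i.e. displaced by (2, 1)."""
    cur = [[0] * 16 for _ in range(16)]
    ref = [[0] * 16 for _ in range(16)]
    for j in range(4):
        for i in range(4):
            cur[4 + j][4 + i] = 100 + 10 * j + i
            ref[5 + j][6 + i] = 100 + 10 * j + i
    return cur, ref


cur, ref = make_demo_frames()
mv, cost = full_search(cur, ref, bx=4, by=4, n=4, p=3)
print(mv, cost)  # prints (2, 1) 0
```

With a search range of p the two loops visit up to (2p+1)(2p+1) candidates, and each SAD evaluation touches all n x n pixels of the block, which is why exhaustive matching becomes expensive for large search windows.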
Fig. 4: Block-matching scenario [6]. X: source block for block matching (current frame); Bx: search area associated with X (reference frame); MV: motion vector.

Search area and candidate blocks
• Motion vector: (u, v)
• Search area: (n + 2p) x (n + 2p) for an n x n source block and a search range of p pixels

Different Motion Estimation Algorithms
• Full Search Algorithm
• Three Step Search Algorithm
• Four Step Search Algorithm
• Diamond Search Algorithm
• Hexagonal Search Algorithm

Full Search Algorithm
Fig. 5: Full search scenario [6][11]
• If p = 7, there are (2p+1)(2p+1) = 225 candidate blocks.
• To find the best-matching block in the reference frame, the current block must be compared with all candidate blocks in the search area.
• Full search motion estimation calculates the sum of absolute differences (SAD) at each possible location in the search window.
• Full search therefore evaluates every candidate block, which is computationally intensive for a large search window.
• Each SAD evaluation costs on the order of n^2 operations for a block size of n x n, so the total cost per block is on the order of (2p+1)^2 n^2.

3SS Algorithm
Fig. 6: Three step search scenario [6][11]
• The first step performs block matching at nine locations with 4-pel resolution (step size m). The location with the minimum cost becomes the new search center.
• The second step performs block matching with 2-pel resolution around the location determined by the first step (step size m/2).
• The third step repeats the process of the second step with 1-pel resolution. The position with the minimum cost gives the motion vector and the position of the best match.

4SS Algorithm
• The 4SS algorithm uses a center-biased search pattern with nine checking points on a 5 x 5 window instead of the 9 x 9 window used in 3SS.
• This algorithm reduces the number of search points compared to 3SS and is more robust.
• The minimum block distortion measure (BDM) point of each step is used as the center of the next step.
Fig. 7: Four step search scenario [6][11]

Diamond Search Algorithm
Fig. 8: Diamond search scenario for ME [7][11]
• The DS algorithm employs two search patterns.
• The large diamond search pattern (LDSP) comprises 9 checking points, of which eight surround the center point to form a diamond shape.
• The small diamond search pattern (SDSP) consists of 5 checking points that form a small diamond shape.
• The LDSP is used repeatedly until the minimum block distortion (MBD) occurs at the center point. The SDSP is then applied once around that point to refine the motion vector.

Hexagonal Search Algorithm
Fig. 9: Hexagonal search scenario for ME [7][11]
• In block motion estimation, the shape and size of the search pattern have a strong impact on search speed and distortion performance.
• The HEXBS algorithm can find the same motion vector with fewer search points than the DS algorithm. (It calculates the minimum cost at the 6 corner points of a hexagon.)
• Generally speaking, the larger the motion vector, the more search points the HEXBS algorithm can save.
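As an illustration of how a fast search visits far fewer points than the full search, the diamond search (DS) from this section can be sketched as follows. This is a minimal sketch under stated assumptions, not the HM encoder's search and not the MATLAB code mentioned later in the deck. Frames are lists of rows, the cost is SAD, candidates outside the frame or beyond the search range p are treated as infinitely costly, and the final SDSP refinement follows the usual description of DS. The names sad, diamond_search and make_demo_frames are hypothetical.

```python
# Offsets (du, dv) of the two patterns; the center point comes first so that
# ties are resolved in favor of staying at the current center.
LDSP = [(0, 0), (0, -2), (1, -1), (2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1)]
SDSP = [(0, 0), (0, -1), (1, 0), (0, 1), (-1, 0)]


def sad(cur, ref, bx, by, u, v, n):
    """SAD between the block of cur at (bx, by) and of ref at (bx + u, by + v)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[by + v + j][bx + u + i])
    return total


def diamond_search(cur, ref, bx, by, n, p):
    """Return ((u, v), cost, number_of_points_checked)."""
    height, width = len(ref), len(ref[0])
    costs = {}  # memo: each motion vector is evaluated at most once

    def cost(mv):
        u, v = mv
        if mv not in costs:
            inside = (abs(u) <= p and abs(v) <= p
                      and 0 <= bx + u and bx + u + n <= width
                      and 0 <= by + v and by + v + n <= height)
            costs[mv] = sad(cur, ref, bx, by, u, v, n) if inside else float("inf")
        return costs[mv]

    def best_around(center, pattern):
        best = center
        for du, dv in pattern:
            cand = (center[0] + du, center[1] + dv)
            if cost(cand) < cost(best):
                best = cand
        return best

    # Repeat the large pattern until the minimum stays at the center point.
    center = (0, 0)
    while True:
        best = best_around(center, LDSP)
        if best == center:
            break
        center = best

    # One pass of the small pattern refines the result.
    mv = best_around(center, SDSP)
    checked = sum(1 for c in costs.values() if c != float("inf"))
    return mv, costs[mv], checked


def make_demo_frames():
    """Synthetic 16 x 16 frames with a 4 x 4 patch displaced by (2, 1)."""
    cur = [[0] * 16 for _ in range(16)]
    ref = [[0] * 16 for _ in range(16)]
    for j in range(4):
        for i in range(4):
            cur[4 + j][4 + i] = 100 + 10 * j + i
            ref[5 + j][6 + i] = 100 + 10 * j + i
    return cur, ref


cur, ref = make_demo_frames()
mv, cost, checked = diamond_search(cur, ref, bx=4, by=4, n=4, p=3)
print(mv, cost)  # prints (2, 1) 0
print(checked, "points checked; a full search with p = 3 would check", (2 * 3 + 1) ** 2)
```

Because every move of the large pattern goes to a strictly lower cost, the loop always terminates. The trade-off is that DS follows the local cost gradient, so on content with several local minima it can settle on a different vector than the full search would.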
Test Sequences [23]
• RaceHorses_416x240_30.yuv
• KristenAndSara_1280x720_60.yuv
• BQMall_832x480_60.yuv
• ParkScene_1920x1080_24.yuv

Simulation Results

RaceHorses_416x240_30.yuv, number of frames encoded = 20

      Random Access profile (fast search)           Random Access profile (full search)
QP    PSNR (dB)   Bit rate (kbps)   Enc. time (s)   PSNR (dB)   Bit rate (kbps)   Enc. time (s)
22    39.5858     1504.992          120.810         39.5969     1494.096          1399.795
27    35.9841     769.740           101.093         35.9926     762.0240          1508.792
32    32.7600     391.116           88.600          32.7769     389.296           1306.499
37    30.1072     202.572           71.437          30.1423     202.3080          1231.534

Table 1: Results for the RaceHorses_416x240_30.yuv sequence in Random Access configuration

BQMall_832x480_60.yuv, number of frames encoded = 20

      Random Access profile (fast search)           Random Access profile (full search)
QP    PSNR (dB)   Bit rate (kbps)   Enc. time (s)   PSNR (dB)   Bit rate (kbps)   Enc. time (s)
22    40.6653     4538.160          308.840         40.7583     4528.440          4875.343
27    38.2347     2257.968          259.764         38.2446     2252.592          5214.150
32    35.4849     1200.864          244.682         35.4965     1196.352          4837.578
37    32.7343     656.088           224.326         32.7370     655.2240          5248.942

Table 2: Results for the BQMall_832x480_60.yuv sequence in Random Access configuration

KristenAndSara_1280x720_60.yuv and ParkScene_1920x1080_24.yuv, number of frames encoded = 20 each

      KristenAndSara, Random Access (fast search)   ParkScene, Random Access (fast search)
QP    PSNR (dB)   Bit rate (kbps)   Enc. time (s)   PSNR (dB)   Bit rate (kbps)   Enc. time (s)
22    44.3166     2676.456          533.038         40.7117     8421.072          1808.863
27    42.5646     1232.856          570.403         38.3703     3673.3632         1356.223
32    40.4650     667.344           481.128         36.0006     1697.606          1211.177
37    38.0715     381.720           444.487         33.7621     791.712           1107.963

Table 3: Results for the KristenAndSara_1280x720_60.yuv and ParkScene_1920x1080_24.yuv sequences in Random Access configuration

Testing Platform:
Processor:          Intel Core(TM) i5 CPU 4210U, 2.40 GHz
Number of cores:    2
Memory:             8 GB
Operating system:   Windows 8.1, 64 bit (x64-based processor)

Fig. 10: Snapshot of encoder_randomaccess_main.cfg [15]

Fig. 11: Bit rate (kbps) vs QP for all test sequences
Fig. 12: PSNR (dB) vs QP for all test sequences

Fig. 13: R-D plot (PSNR in dB vs bit rate in kbps, fast search and full search) for the RaceHorses_416x240_30.yuv sequence
Fig. 14: R-D plot (PSNR in dB vs bit rate in kbps, fast search and full search) for the BQMall_832x480_60.yuv sequence

Fig. 15: R-D plot (PSNR in dB vs bit rate in kbps) for the KristenAndSara_1280x720_60.yuv sequence
Fig. 16: R-D plot (PSNR in dB vs bit rate in kbps) for the ParkScene_1920x1080_24.yuv sequence

Fig. 17: Encoding time (s) vs QP, fast search vs full search, for the RaceHorses_416x240_30.yuv sequence
Fig. 18: Encoding time (s) vs QP, fast search vs full search, for the BQMall_832x480_60.yuv sequence
Fig. 19: Fast search encoding time (s) vs QP for all test sequences
Fig. 20: Full search encoding time (s) vs QP for RaceHorses and BQMall

Work Done and In Progress
• Analyzed Full Search, 3SS and Diamond Search.
• Developed functions in MATLAB to:
  • find the PSNR with respect to a reference image
  • compute the motion-compensated image
  • find the minimum distance among macroblocks
  • compute the mean absolute difference
• Developing functions for Full Search, 3SS and Diamond Search.

Acronyms
BBME: Block Based Motion Estimation.
BD-BR: Bjontegaard Delta Bitrate.
BD-PSNR: Bjontegaard Delta Peak Signal to Noise Ratio.
CABAC: Context Adaptive Binary Arithmetic Coding.
CTB: Coding Tree Block.
CTU: Coding Tree Unit.
CU: Coding Unit.
DBF: De-blocking Filter.
DCT: Discrete Cosine Transform.
fps: frames per second.
HEVC: High Efficiency Video Coding.
HM: HEVC Test Model.
ISO: International Organization for Standardization.
ITU-T: International Telecommunication Union - Telecommunication Standardization Sector.
JCT-VC: Joint Collaborative Team on Video Coding.
MAD: Mean Absolute Difference.
MC: Motion Compensation.
ME: Motion Estimation.
MPEG: Moving Picture Experts Group.
MSE: Mean Square Error.
PB: Prediction Block.
PSNR: Peak Signal to Noise Ratio.
QP: Quantization Parameter.
SAO: Sample Adaptive Offset.
TB: Transform Block.
TU: Transform Unit.
VCEG: Video Coding Experts Group.

References
[1] V. Sze and M.
Budagavi, “Design and Implementation of Next Generation Video Coding Systems (H.265/HEVC Tutorial)”, IEEE International Symposium on Circuits and Systems (ISCAS), Melbourne, Australia, June 2014. Available: http://www.rle.mit.edu/eems/publications/tutorials/
[2] HEVC tutorials: http://www.vcodex.com/h265.html
[3] G. J. Sullivan, J. Ohm, W. J. Han and T. Wiegand, “Overview of the High Efficiency Video Coding (HEVC) Standard”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
[4] I. E. Richardson, “Video Codec Design: Developing Image and Video Compression Systems”, Wiley, 2002.
[5] G. J. Sullivan et al., “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, pp. 1001-1016, Dec. 2013.
[6] L. C. Manikandan et al., “A new survey on Block Matching Algorithms in Video Coding”, International Journal of Engineering Research, vol. 3, pp. 121-125, Feb. 2014.
[7] L. N. A. Alves and A. Navarro, “Fast Motion Estimation Algorithm for HEVC”, Proc. IEEE International Conf. on Consumer Electronics (ICCE), Berlin, Germany, vol. 11, pp. 11-14, Sep. 2012.
[8] F. Bossen et al., “HEVC complexity and implementation analysis”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1685-1696, Dec. 2012.
[9] J. Ohm et al., “Comparison of the Coding Efficiency of Video Coding Standards - Including High Efficiency Video Coding (HEVC)”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1669-1684, Dec. 2012.
[10] K. Kim et al., “Block partitioning structure in the HEVC standard”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, pp. 1697-1706, Dec. 2012.
[11] M. Jakubowski and G. Pastuszak, “Block-based motion estimation algorithms - a survey”, Opto-Electronics Review, vol. 21, pp. 86-102, Mar. 2013. http://link.springer.com/article/10.2478%2Fs11772-013-0071-0#page-1
[12] A. Abdelazim, W. Masri and B. Noaman, “Motion estimation optimization tools for the emerging high efficiency video coding (HEVC)”, SPIE vol. 9029, Visual Information Processing and Communication V, 902905, Feb. 17, 2014, doi:10.1117/12.2041166
[13] Software repository for HEVC: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.0/
[14] HEVC white paper, Ittiam Systems: http://www.Ittiam.com/Downloads/en/documentation.aspx
[15] HM Software Manual: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/
[16] G. Bjontegaard, “Calculation of average PSNR difference between RD curves”, VCEG-M33, ITU-T SG 16/Q 6, Austin, TX, April 2001.
[17] Multimedia Processing Lab at UTA: http://www.uta.edu/faculty/krrao/dip/
     Analysis of Motion Estimation (ME) Algorithms, by Tuan Phan Minh Ho (Spring 2014).
     Comparative study of Motion Estimation (ME) Algorithms, by Khyati Mistry (Spring 2008).
[18] http://www.h265.net has information on developments in HEVC and NGVC (Next Generation Video Coding).
[19] Detailed Overview of HEVC/H.265 by Shevach Riabtsev: https://app.box.com/s/rxxxzr1a1lnh7709yvih
[20] W. Hong, “Coherent Block-Based Motion Estimation for Motion-Compensated Frame Rate Up-Conversion”, IEEE International Conference on Consumer Electronics, pp. 165-166, Jan. 2010.
[21] N. Purnachand, L. N. Alves and A. Navarro, “Improvements to TZ search motion estimation algorithm for multiview video coding”, 19th International Conference on Systems, Signals and Image Processing (IWSSIP), pp. 388-391, 2012.
[22] Video test sequences: http://forum.doom9.org/archive/index.php/t-135034.html or http://media.xiph.org/video/derf/ or ftp://ftp.kw.bbc.co.uk/hevc/hm-11.0-anchors/bitstreams/
[23] M. Wien, “High Efficiency Video Coding: Tools and Specification”, Springer, 2015.
[24] I. E. Richardson, “Coding Video: A Practical Guide to HEVC and Beyond”, Wiley, 11 May 2015.
[25] V. Sze, M. Budagavi and G. J. Sullivan, “High Efficiency Video Coding (HEVC): Algorithms and Architectures”, Springer, 2014.
[26] X. Li et al., “Rate-complexity-distortion evaluation for hybrid video coding”, IEEE International Conference on Multimedia and Expo (ICME), pp. 685-690, July 2010.
[27] G. Correa et al., “Performance and computational complexity assessment of high efficiency video encoders”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, pp. 1899-1909, Dec. 2012.