EE5359 – Multimedia Processing
“A performance comparison of fractional-pel interpolation filters in HEVC and H.264/AVC”
Under the guidance of Dr. K. R. Rao
EE 5359 - Multimedia Processing Project Updated Report
Submitted By: LOHITH SUBRAMANYA
UTA ID: 1000928742
E-mail: lohith.subramanya@mavs.uta.edu
Date of Submission: 19th February 2014
Spring 2014

List of Acronyms:
AIF: Adaptive Interpolation Filter
ALF: Adaptive Loop Filter
APEC: Adaptive Prediction Error Coding
AVC: Advanced Video Coding
AQMS: Adaptive Quantization Matrix Selection
CSVT: Circuits and Systems for Video Technology
DCT: Discrete Cosine Transform
DCTIF: Discrete Cosine Transform Interpolation Filter
DMVD: Decoder-side Motion Vector Derivation
DSP: Digital Signal Processing
EMS: Extended Macro-block Size
FIR: Finite Impulse Response
HEVC: High Efficiency Video Coding
IBDI: Internal Bit Depth Increasing
ITU-T: International Telecommunication Union – Telecommunication Standardization Sector
JCT-VC: Joint Collaborative Team on Video Coding
JPEG: Joint Photographic Experts Group
KLT: Karhunen-Loeve Transform
LTS: Larger Transform Size
MCP: Motion Compensated Prediction
MPEG: Moving Picture Experts Group
MV: Motion Vectors
RDO: Rate Distortion Optimization
SOC: System On Chip
SVN: Subversion
VCEG: Video Coding Experts Group
VCIP: Visual Communications and Image Processing

Objective:
The objective of this project is to compare and analyze the fractional-pel interpolation filters in HEVC [18] and H.264/AVC [17] based on their frequency responses, complexity, coding performance and performance gain. BD-PSNR [33] and BD-Bit Rate [33] are the metrics used to compare the two standards.

Introduction:
The fractional-pel interpolation filters (6-tap FIR [24] and average) adopted in H.264/AVC [17] improve motion compensation greatly.
Similarly, DCT-based fractional-pel interpolation filters (7-tap and 8-tap) are adopted in the HEVC [1] [10] standard. This project examines the differences between these fractional-pel interpolation filters. The derivations of the fractional-pel interpolation filters in HEVC [1] and H.264/AVC [17] are described in detail, and the filters are compared based on the properties of their frequency responses.

What is H.264 [17]?
H.264 is an industry standard for video compression, the process of converting digital video into a format that takes up less capacity when it is stored or transmitted. Video compression (or video coding) is an essential technology for applications such as digital television, DVD-Video, mobile TV, videoconferencing and internet video streaming. Standardizing video compression makes it possible for products from different manufacturers (e.g. encoders, decoders and storage media) to inter-operate. The encoder converts video into a compressed format and the decoder converts compressed video back into an uncompressed format. H.264 [17] defines a format (syntax) for compressed video and a method for decoding this syntax to produce a displayable video sequence. Figure 1 [17] shows the encoding and decoding processes and highlights the parts that are covered by the H.264 standard.

Figure 1: Block Diagram of H.264 [17]

How does H.264 [17] codec work?
An H.264 [17] video encoder carries out prediction, transform and encoding processes (Figure 1) [17] to produce a compressed H.264 [17] bitstream. An H.264 [17] video decoder carries out the complementary processes of decoding, inverse transform and reconstruction to produce a decoded video sequence.

What is HEVC [1]?
High Efficiency Video Coding (HEVC) [1] is the current joint video coding standardization project of the ITU-T Video Coding Experts Group (VCEG) (ITU-T Q.6/SG 16) and ISO/IEC Moving Picture Experts Group (MPEG) (ISO/IEC JTC 1/SC 29/WG 11).
The Joint Collaborative Team on Video Coding (JCT-VC) [25] has been established to work on this project. The Joint Collaborative Team on 3D Video Coding Extension Development (JCT-3V) [25] has been established to work on 3D video coding extensions of HEVC and other video coding standards. The block diagram of HEVC is shown in Figure 2 [6].

Figure 2: Block diagram of HEVC [6]

Why use HEVC?
All video technologies need encoding and decoding to ensure efficient transmission and storage. HEVC’s better use of bandwidth is designed to enable higher resolutions without crippling networks and overstuffing storage systems. Since the video and cinema industries are edging towards 8Kx4K [22] video resolutions, with images of more than 8 megapixels, the new HEVC standard could soon be a widely used technology. However, because numerous patents have been filed, its use will not be free. The patent filings by country are shown in Figure 3 [2] and the HEVC encoding efficiency for a sample (Kimono1) video sequence is shown in Figure 4 [5].

Figure 3: Patent filings of HEVC related patents by the five biggest patent holders [2]

Figure 4: HEVC encoding efficiency in comparison with the previous standards [5]

Features of HEVC [3]:
The JCT-VC is currently evaluating modifications [3] to current coding tools, such as:
ALF: Adaptive Loop Filter
EMS: Extended Macro-block Size
LTS: Larger Transform Size
IBDI: Internal Bit Depth Increasing
AQMS: Adaptive Quantization Matrix Selection
as well as new coding tools, such as:
Modified intra prediction
Modified de-block filter
DMVD: Decoder-side Motion Vector Derivation

The new features [3] proposed to meet the requirements are:
2-D non-separable AIF
Separable AIF
Directional AIF
"Super-macro-block" structure up to 64x64 with additional transforms
Adaptive prediction error coding (APEC) in spatial and frequency domains
Competition-based scheme for motion vector selection and coding
Mode-dependent KLT [21] for intra coding

It is speculated that these techniques are most beneficial with multi-pass encoding.

FIR Filters [27]:
FIR filters [27] are one of the two primary types of digital filters used in Digital Signal Processing (DSP) applications. "FIR" means "Finite Impulse Response". If an impulse, that is, a single "1" sample followed by many "0" samples, is fed into the filter, only zeroes come out once the "1" sample has made its way through the delay line of the filter. An N-Tap FIR filter is shown in Figure 5 [28].

Figure 5: N-Tap FIR filter [28]

The filter output is

\[ y(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k) \]

Here h(k) is the filter coefficient array and x(n-k) is the input data array to the filter. N is the number of taps of the filter and relates to the filter performance. An N-tap FIR filter requires N multiply-accumulate operations per output sample.

Why use interpolation in video coding?
Motion-compensated prediction (MCP) [8] is the key to the success of modern video coding standards, as it removes the temporal redundancy in video signals and reduces the size of bitstreams significantly. With MCP, the pixels to be coded are predicted from the temporally neighboring ones, and only the prediction errors and the motion vectors (MV) [8] are transmitted. However, due to the finite sampling rate, the actual position of the prediction in the neighboring frames may fall off the sampling grid, where the intensity is unknown. The intensities at the positions between the integer pixels, called sub-positions, must therefore be interpolated, and the resolution of the MV [8] is increased accordingly.

One aspect of HEVC [1] video compression involves interpolation among various pixels to determine brightness. Figure 6 [9] from the draft standard shows how this process takes place.
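As an illustration of the N-tap FIR sum and of the sub-position interpolation described above, the following is a minimal Python sketch. It assumes the H.264/AVC luma half-pel taps (1, -5, 20, 20, -5, 1) with a divisor of 32 and a rounded average for quarter-pel samples. The function names fir, clip8, half_pel_row and quarter_pel are hypothetical and are not taken from any reference software.

```python
def fir(x, h):
    """Direct-form FIR: y(n) = sum_k h(k) * x(n - k).

    Only outputs for which every tap overlaps the input are returned,
    so an N-tap filter over len(x) samples yields len(x) - N + 1 values.
    """
    n_taps = len(h)
    return [
        sum(h[k] * x[n - k] for k in range(n_taps))  # N multiply-accumulates
        for n in range(n_taps - 1, len(x))
    ]


# Assumed H.264/AVC luma half-pel taps, scaled by 32 (symmetric filter).
H264_HALF_PEL = (1, -5, 20, 20, -5, 1)


def clip8(value):
    """Clip a value to the 8-bit sample range [0, 255]."""
    return max(0, min(255, value))


def half_pel_row(row):
    """Half-pel samples along one row of integer pixels.

    Each output lies midway between the two central pixels of a
    6-sample window; +16 and >> 5 implement rounded division by 32.
    """
    return [clip8((acc + 16) >> 5) for acc in fir(row, H264_HALF_PEL)]


def quarter_pel(integer_sample, half_sample):
    """Quarter-pel sample as the rounded average of its two neighbours."""
    return (integer_sample + half_sample + 1) >> 1


if __name__ == "__main__":
    ramp = [0, 10, 20, 30, 40, 50]
    half = half_pel_row(ramp)[0]       # sample between 20 and 30
    print(half, quarter_pel(ramp[2], half))
```

For the linear ramp in the usage lines, the sketch places the half-pel sample at 25, midway between the integer samples 20 and 30, and the quarter-pel sample between 20 and that half-pel value at 23.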
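Figure 6: Integer pixel positions (shaded, upper-case letters) and fractional pixel positions (non-shaded, lower-case letters) for quarter-pel luma interpolation [9] (Credit: ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC))

The interpolation filters used in H.264 [17] are a 6-tap FIR filter for half-pel interpolation and an averaging filter for quarter-pel interpolation. Similarly, in HEVC [3], an 8-tap DCTIF is used for half-pel interpolation and a 7-tap DCTIF is used for quarter-pel interpolation.

DCT [10] is one of the popular transforms used in video signal processing applications, since it exhibits properties similar to those of the optimal KLT [21]. The type-II DCT [10] used in the image compression standard JPEG [31] is defined, in its orthonormal form, by:

\[ X(k) = \sqrt{\frac{2}{N}}\, c_k \sum_{n=0}^{N-1} x(n) \cos\!\left(\frac{(2n+1)\pi k}{2N}\right) \]

\[ x(n) = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} c_k\, X(k) \cos\!\left(\frac{(2n+1)\pi k}{2N}\right) \]

Here, X(k) is the k-th coefficient produced by the forward DCT and x(n) is the n-th sample recovered by the inverse DCT. Also,

\[ c_k = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & \text{otherwise} \end{cases} \]

By substituting the forward DCT equation into the inverse equation and evaluating it at the fractional position n + α, the interpolation formula is obtained:

\[ x(n+\alpha) = \frac{2}{N} \sum_{k=0}^{N-1} c_k^2 \cos\!\left(\frac{(2n+2\alpha+1)\pi k}{2N}\right) \sum_{m=0}^{N-1} x(m) \cos\!\left(\frac{(2m+1)\pi k}{2N}\right) \]

For even tap filters, with the samples at offsets m = -N/2+1, ..., N/2 from the reference pixel, the weight of sample m becomes:

\[ w_m(\alpha) = \frac{2}{N} \sum_{k=0}^{N-1} c_k^2 \cos\!\left(\frac{(2m-1+N)\pi k}{2N}\right) \cos\!\left(\frac{(2\alpha-1+N)\pi k}{2N}\right) \]

Similarly, for an odd tap filter, with the samples at offsets m = -(N-1)/2, ..., (N-1)/2, the weight is:

\[ w_m(\alpha) = \frac{2}{N} \sum_{k=0}^{N-1} c_k^2 \cos\!\left(\frac{(2m+N)\pi k}{2N}\right) \cos\!\left(\frac{(2\alpha+N)\pi k}{2N}\right) \]

The filter coefficients for the half-pel and quarter-pel filters are:
Half-pel (8-tap): {-1, 4, -11, 40, 40, -11, 4, -1} / 64
Quarter-pel (7-tap): {-1, 4, -10, 58, 17, -5, 1} / 64

The filter weights of the corresponding positions in HEVC [1] [10] are:

\[ a_{0,0} = (-A_{-3,0} + 4A_{-2,0} - 10A_{-1,0} + 58A_{0,0} + 17A_{1,0} - 5A_{2,0} + A_{3,0}) \gg (B-8) \]

\[ b_{0,0} = (-A_{-3,0} + 4A_{-2,0} - 11A_{-1,0} + 40A_{0,0} + 40A_{1,0} - 11A_{2,0} + 4A_{3,0} - A_{4,0}) \gg (B-8) \]

\[ c_{0,0} = (A_{-2,0} - 5A_{-1,0} + 17A_{0,0} + 58A_{1,0} - 10A_{2,0} + 4A_{3,0} - A_{4,0}) \gg (B-8) \]

Here, A_{i,0} are the integer samples in the same row of Figure 6, the constant B = 8 is the bit depth of the reference samples, and >> denotes arithmetic right shift. The magnitude responses of the half-pel interpolation filters are shown in Figure 7.

Figure 7: Magnitude response graphs of half-pel interpolation filters [10]

Here, the solid curve represents the 8-tap DCTIF response, the dashed curve represents the H.264/AVC filter response, and the dotted curve represents the 6-tap DCTIF response.

Plan of action:
Step 1: Based on the reference software HM13 [16], change the half-pel interpolation filter coefficients to the coefficients in H.264/AVC [17]. The fractional-pel pixels might be affected as a result.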
Step 2: Based on the reference software HM13 [16], change the half-pel interpolation filter coefficients to the coefficients of DCTIF [10].
Step 3: In addition to the change in Step 1, change the interpolation methods of the quarter-pel pixels in the horizontal and vertical directions to those in H.264/AVC [17]. The fractional-pel pixels might be affected again.
Step 4: Besides the changes in Step 3, change the interpolation methods of the remaining four quarter-pel pixels in the diagonal direction to those in H.264/AVC [17].

Figure 8: Representation of integer, half and quarter pel pixels [20]

Figure 8 shows the arrangement of the integer and fractional-pel pixels. The modified filter coefficients obtained from Steps 1-4 can be compared on coding performance, frequency response, performance gain and complexity, and assessed against the results reported in “A comparison of Fractional-Pel Interpolation Filters in HEVC and H.264/AVC” [10].

References:
1. Fraunhofer Heinrich Hertz Institute - http://hevc.hhi.fraunhofer.de/
2. Open Patents and Standards Platform - http://www.iplytics.com/en/tag/hevc/
3. HEVC Review Site http://telcogroup.ru/files/materialspdf/High_Efficiency_Video_Coding_H265.pdf
4. Overview of HEVC - http://iphome.hhi.de/wiegand/assets/pdfs/2012_12_IEEEHEVC-Overview.pdf
5. Extremetech Blog: http://www.extremetech.com/computing/162027-h-265benchmarked-does-the-next-generation-video-codec-live-up-to-expectations
6. Altera Technologies: http://www.altera.com/technology/systemdesign/articles/2013/tv-studio-system.html
7. I. Richardson, “Real time implementation of H.264 Video Coding”, IEEE International SOC Conference, PP: 390, Sept. 2008
8. H.265 Blog http://www.h265.net/2010/07/adaptive-interpolation-filter-for-videocoding.html
9. 
CNET Blog http://news.cnet.com/8301-11386_3-57566116-76/hevc-video-standardfinished-high-end-improvements-coming/
10. H. Lv, et al, “A comparison of fractional-pel interpolation filters in HEVC and H.264/AVC”, IEEE Conference on Visual Communications and Image Processing (VCIP), PP: 1-6, Nov. 2012
11. G. J. Sullivan, et al, “Overview of the HEVC Standard”, IEEE Transactions on Circuits and Systems for Video Technology (CSVT), Vol: 22, No: 12, PP: 1649-1668, Dec. 2012
12. B. Lee, et al, “Performance Comparison of various interpolation methods for color filter arrays”, IEEE Symposium on Industrial Electronics, Vol: 1, PP: 232-236, Jun. 2001
13. Y. Vatis and J. Ostermann, “Locally Adaptive Non-Separable Interpolation Filter for H.264/AVC”, IEEE International Conference on Image Processing, PP: 33-36, Oct. 2006
14. Video Test Sequences: http://trace.eas.asu.edu/yuv/
15. Tortoise SVN Downloadable Software Link: http://tortoisesvn.net/downloads.html
16. HM 13 Software Link: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-13.0+RExt-6.0rc1/
17. H.264 Advanced Video Coding Blog https://www.vcodex.com/h264.html
18. G. J. Sullivan, et al, “Standardized Extensions of HEVC”, IEEE Journal of Selected Topics in Signal Processing, Vol: 7, No: 6, PP: 1001-1016, Dec. 2013
19. K. R. Rao, D. N. Kim and J. J. Hwang, “Video coding standards”, Springer Publications, Jan. 2014: http://www.springer.com/physics/book/978-94-007-6741-6
20. SPIE Digital Library: http://electronicimaging.spiedigitallibrary.org/article.aspx?articleid=1730243
21. Karhunen-Loeve Transform: http://en.wikipedia.org/wiki/Karhunen%E2%80%93Lo%C3%A8ve_theorem
22. Sharp 8Kx4K TV: http://www.sound-news.net/index.php/the-novosti/hifi-avnovosti/item/552-sharp-8kx4k-tv
23. Institute of Computer and Communication Engineering: http://research.ncku.edu.tw/re/articles/e/20071102/2.html
24. FIR Filter: http://en.wikipedia.org/wiki/Finite_impulse_response
25. 
JCT-VC Document Management System: http://phenix.int-evry.fr/jct/
26. T. Wiegand, et al, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol: 13, No: 7, PP: 560-576, July 2003
27. Iowegian International DSP Site: http://www.dspguru.com/dsp/faqs/fir/basics
28. N-Tap FIR Filter: http://www.analog.com/static/importedfiles/seminars_webcasts/MixedSignal_Sect6.pdf
29. I. Richardson, “The H.264 Advanced Video Compression Standard”, Wiley Publications, Aug. 2010: http://www.wiley.com/WileyCDA/WileyTitle/productCd0470516925.html
30. HM 13 Software Reference Manual: http://mpeg.chiariglione.org/standards/mpegh/high-efficiency-video-coding/high-efficiency-video-coding-hevc-encoderdescription
31. JPEG: http://www.jpeg.org/
32. JM 18.6 Software Repository: http://iphome.hhi.de/suehring/tml/download/
33. BD-PSNR and BD-BR: http://www.mathworks.com/matlabcentral/fileexchange/27798-bjontegaardmetric/content/bjontegaard.m