Complexity Reduction Algorithm for Intra Mode Selection in H.264/AVC Video Coding
By Amruta Kulkarni
Student ID: 1000666836
Under the supervision of Dr. K. R. Rao

H.264/MPEG-4 Part 10 or AVC
H.264 is an advanced video compression standard, developed by the ITU-T Video Coding Experts Group together with the ISO/IEC Moving Picture Experts Group.
It is a widely used video codec in mobile applications, the internet (YouTube, Flash players), set-top boxes, TV, etc.
An H.264 encoder converts video into a compressed format (.264), and a decoder converts compressed video back into an uncompressed format.

How does the H.264 codec work?
An H.264 video encoder carries out prediction, transform and encoding processes to produce a compressed H.264 bit stream.
A decoder carries out the complementary processes of decoding, inverse transform and reconstruction to output a decoded video sequence.

Figure: H.264 encoder block diagram [7]

Figure: Profiles in H.264 [9]

Prediction
The H.264 encoder forms a prediction of the current macroblock:
a) based on the current frame, using intra (spatial) prediction;
b) based on previous frames that have already been coded, using inter prediction.

Intra prediction
Supports the following block sizes:
a) 16x16 (for luma):
   Mode 0 (vertical): extrapolation from upper samples (H).
   Mode 1 (horizontal): extrapolation from left samples (V).
   Mode 2 (DC): mean of upper and left-hand samples (H+V).
   Mode 3 (plane): a linear "plane" function is fitted to the upper and left-hand samples H and V. This works well in areas of smoothly varying luminance.
b) 8x8 (for chroma): DC, plane, horizontal, vertical
c) 4x4 (for luma)

Figure: 4x4 luma modes for intra prediction [2]

Introduction
H.264/AVC intra prediction is conducted for all block types: 4x4 luma blocks, 16x16 luma blocks, and 8x8 chroma blocks. The residual between the current block and its prediction is then transformed, quantized, and entropy coded.
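The 16x16 luma modes listed earlier can be illustrated with a minimal Python sketch. It covers only the vertical, horizontal and DC modes (the plane mode is omitted), assumes both the upper and left neighbours are available, and uses the rounded mean (sum + 16) >> 5 for DC. The function names `predict_16x16` and `residual` and the sample values are hypothetical and are not taken from the JM software.

```python
# Illustrative sketch only: function and variable names are hypothetical,
# not taken from the JM reference software.

def predict_16x16(mode, upper, left):
    """Return a 16x16 prediction block (list of rows).

    upper: the 16 reconstructed samples above the block (H)
    left:  the 16 reconstructed samples to the left of the block (V)
    """
    if mode == 0:    # vertical: copy the upper samples down every row
        return [list(upper) for _ in range(16)]
    if mode == 1:    # horizontal: copy each left sample across its row
        return [[left[y]] * 16 for y in range(16)]
    if mode == 2:    # DC: rounded mean of the 32 neighbouring samples
        dc = (sum(upper) + sum(left) + 16) >> 5
        return [[dc] * 16 for _ in range(16)]
    raise ValueError("mode 3 (plane) is not covered by this sketch")


def residual(block, prediction):
    """Sample-wise difference between the current block and its prediction."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]


upper = list(range(16))                  # made-up neighbouring samples
left = [10] * 16
block = [[12] * 16 for _ in range(16)]   # made-up current block
for mode in (0, 1, 2):
    res = residual(block, predict_16x16(mode, upper, left))
    # Print the sum of absolute residuals for each mode
    print(mode, sum(abs(v) for row in res for v in row))
```

The residual computed here is what the encoder would go on to transform, quantize and entropy code.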
To obtain the best mode among these modes, the H.264/AVC encoder performs rate-distortion optimization (RDO) for each macroblock.

RDO procedure for one macroblock
Set the macroblock parameters: QP (quantization parameter) and the Lagrangian multiplier (λ).
Calculate:
$\lambda_{MODE} = 0.85 \cdot 2^{(QP-12)/3}$
Then calculate the cost, which determines the best mode:
$Cost = D + \lambda_{MODE} \cdot R$
where D is the distortion and R is the bit rate at the given QP.
Distortion (D) is obtained as the SSD (sum of squared differences) between the original macroblock and its reconstructed block.
Bit rate (R) includes the bits for the mode information and the transform coefficients of the macroblock.
The quantization parameter (QP) can vary from 0 to 51.

RDO (rate-distortion optimization)
Considering the RDO procedure for intra mode selection in H.264/AVC, the number of mode combinations in one macroblock is N8 x (16 x N4 + N16), where
N8: number of modes of an 8x8 chroma block (4 modes)
N4: number of modes of a 4x4 luma block (9 modes)
N16: number of modes of a 16x16 luma block (4 modes)
N8 x (16 x N4 + N16) = 4 x (16 x 9 + 4) = 592
Thus, to select the best mode for one macroblock in intra prediction, the H.264/AVC encoder carries out 592 RDO calculations, which greatly increases the complexity of the encoder.

Introduction
This project uses the baseline profile, as it is simple to implement. Its important features are:
a) I and P slice coding
b) Enhanced error resilience tools: flexible macroblock ordering (FMO), arbitrary slice ordering (ASO) and redundant slices (RS)
c) Context-adaptive variable-length coding (CAVLC)
The baseline profile is primarily used in low-cost applications that need robustness to data loss.
The Joint Model (JM 17.2) implementation of the H.264 encoder is used in this project.
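The RDO quantities described earlier (the Lagrangian multiplier, the cost D + λMODE · R, and the count of 592 mode combinations) can be checked with a short Python sketch. The function names are hypothetical, and the distortion and rate values in the usage lines are made up for illustration; only the formulas come from the slides.

```python
# Illustrative sketch only: names are hypothetical; formulas follow the slides.

def lambda_mode(qp):
    """Lagrangian multiplier: 0.85 * 2^((QP - 12) / 3), with QP in 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return 0.85 * 2 ** ((qp - 12) / 3)


def ssd(original, reconstructed):
    """Sum of squared differences between two equally sized 2-D blocks."""
    return sum((o - r) ** 2
               for orow, rrow in zip(original, reconstructed)
               for o, r in zip(orow, rrow))


def rd_cost(distortion, rate_bits, qp):
    """Rate-distortion cost: D + lambda_MODE * R."""
    return distortion + lambda_mode(qp) * rate_bits


def rdo_combinations(n8=4, n4=9, n16=4):
    """Mode combinations per macroblock: N8 x (16 x N4 + N16)."""
    return n8 * (16 * n4 + n16)


print(rdo_combinations())                 # number of RDO calculations
print(lambda_mode(28))                    # multiplier at a mid-range QP
print(rd_cost(distortion=1500, rate_bits=96, qp=28))   # made-up D and R
```

Because λMODE grows exponentially with QP, the rate term weighs more heavily at coarse quantization, so cheaper-to-signal modes tend to win at high QP.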
Complexity reduction algorithm for intra prediction
This project will implement the complexity reduction algorithm for all three block sizes:
1) 16x16 luma
2) 4x4 luma
3) 8x8 chroma

Algorithm
The proposed intra mode selection algorithm for a 16x16 luma block is summarized as follows:
Step 1 - Examine the sizes of the adjacent blocks: if both blocks (the upper block and the left block) are 16x16, go to Step 2; otherwise go to Step 4.
Step 2 - Examine the modes of the adjacent blocks: if both modes are the same, go to Step 3; otherwise select as the best mode for the 16x16 luma block whichever of the two adjacent modes, modeA and modeB, gives the minimum SATD (sum of absolute transformed differences).
Step 3 - If both adjacent modes are DC mode, go to Step 4; otherwise select as the best mode for the 16x16 luma block whichever of the adjacent mode and DC mode gives the minimum SATD.
Step 4 - Let ΔV be the vertical difference between the upper boundary pixels of the current block and the boundary pixels of the upper block, and let ΔH be the horizontal difference between the left boundary pixels of the current block and the boundary pixels of the left block:
$\Delta V = \sum_{i=0}^{15} |u(i) - q(i)|$
$\Delta H = \sum_{i=0}^{15} |l(i) - r(i)|$
where
u(i): boundary pixels of the upper block
q(i): upper boundary pixels of the current block
l(i): boundary pixels of the left block
r(i): left boundary pixels of the current block
T2: threshold level 2 (T2 = 32)
Obtain the candidate modes from the two difference values ΔV and ΔH as follows, where T2 is a positive value:
if |ΔV − ΔH| is smaller than 2T2, the candidate modes are DC mode and plane mode;
if (ΔV − ΔH) is larger than T2, the candidate modes are DC mode and horizontal mode;
if (ΔV − ΔH) is smaller than −T2, the candidate modes are DC mode and vertical mode.
Finally, select the best mode among the candidate modes by choosing the one with the minimum SATD.

Figure: Calculation of ΔV & ΔH [8]

References
1. ITU-T Rec. H.264 | ISO/IEC 14496-10: Information Technology – Coding of Audio-visual Objects, Part 10: Advanced Video Coding, 2002.
2. T. Wiegand, G. Sullivan, G. Bjontegaard et al., "Overview of the H.264/AVC Video Coding Standard," IEEE Trans. Circuits and Syst. for Video Technol., Vol. 13, pp. 560-576, July 2003.
3. Z. Chen, Z. Zhou, Y. He, et al., "Fast Integer Pixel and Fractional Pixel Motion Estimation for JVT," Doc. #JVT-F017, Dec. 2002.
4. B. Hsieh, B. Huang, et al., "Fast Motion Estimation for H.264/MPEG4 AVC by Using Multiple Reference Frame Skipping Criteria," VCIP 2003, Proceedings of SPIE, Vol. 5150, pp. 1551-1560, Oct. 2003.
5. A. Puri et al., "Video coding using the H.264/MPEG-4 AVC compression standard," Signal Processing: Image Communication, Vol. 19, pp. 793-849, Oct. 2004.
6. H.264/AVC JM software: http://iphome.hhi.de/suehring/tml/
7. An overview of the H.264 encoder: www.vcodex.com
8. J. Kim, D. Kim, et al., "Complexity reduction algorithm for intra mode selection in H.264/AVC video coding," in J. Blanc-Talon et al. (Eds.): ACIVS 2006, LNCS 4179, pp. 454-465, Springer-Verlag Berlin Heidelberg, 2006.
9. I. Richardson, "The H.264 advanced video compression standard," second edition, Wiley, 2010.
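The four steps of the 16x16 luma mode selection can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the JM implementation: all names are hypothetical, block sizes are represented by plain integers, and the SATD computation is left to a caller-supplied function. As stated in the slides, the Step 4 conditions overlap when |ΔV − ΔH| lies between T2 and 2T2; the sketch assumes the conditions are tested in the order listed, so the first match wins.

```python
# Illustrative sketch only: all names are hypothetical, and block sizes are
# represented by plain integers (16 for a 16x16 block) as an assumption.

T2 = 32  # threshold level 2, as given in the slides

DC, VERTICAL, HORIZONTAL, PLANE = "DC", "vertical", "horizontal", "plane"


def boundary_differences(u, q, l, r):
    """Return (dV, dH): sums of absolute boundary differences, i = 0..15.

    u: boundary pixels of the upper block, q: upper boundary of current block
    l: boundary pixels of the left block,  r: left boundary of current block
    """
    dv = sum(abs(a - b) for a, b in zip(u, q))
    dh = sum(abs(a - b) for a, b in zip(l, r))
    return dv, dh


def step4_candidates(dv, dh, t2=T2):
    """Step 4: candidate modes, testing the conditions in the listed order."""
    d = dv - dh
    if abs(d) < 2 * t2:
        return [DC, PLANE]
    if d > t2:
        return [DC, HORIZONTAL]
    return [DC, VERTICAL]   # here d <= -2*t2, hence d < -t2


def candidates_16x16(upper_size, left_size, mode_a, mode_b, dv, dh):
    """Steps 1 to 4: reduce the four 16x16 modes to two candidates."""
    if upper_size == 16 and left_size == 16:    # Step 1
        if mode_a != mode_b:                    # Step 2
            return [mode_a, mode_b]
        if mode_a != DC:                        # Step 3
            return [mode_a, DC]
    return step4_candidates(dv, dh)             # Step 4


def best_mode(candidates, satd):
    """Pick the candidate with minimum SATD; satd maps a mode to its cost."""
    return min(candidates, key=satd)


# Usage with made-up boundary pixels and made-up SATD values.
dv, dh = boundary_differences([10] * 16, [18] * 16, [5] * 16, [5] * 16)
cands = candidates_16x16(16, 16, DC, DC, dv, dh)
made_up_satd = {DC: 410, VERTICAL: 520, HORIZONTAL: 300, PLANE: 450}
print(cands, best_mode(cands, made_up_satd.get))
```

In every branch only two SATD evaluations are needed for a 16x16 luma block, instead of an evaluation of all four modes.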