Optimizing Baseline Profile in H.264/AVC Video Coding by Parallel Programming and Fast Intra and Inter Predictions

BY VINOOTHNA GAJULA
ID 1000803103
MS Electrical Engineering
Under the Guidance of Dr. K. R. RAO

Objective
In this project, the computational complexity and encoding time of the H.264 baseline profile are reduced by:
1. Using parallel programming [1], [7].
2. Using the fast adaptive termination (FAT) algorithm in intra prediction [2], [8].
3. Using FAT inter prediction mode decision and motion estimation [3], [9], [20].

Introduction
H.264, also known as MPEG-4 Part 10 / AVC (Advanced Video Coding), was jointly published in 2003 by two international standards bodies, the International Telecommunication Union (ITU-T) and the International Organization for Standardization / International Electrotechnical Commission (ISO/IEC), working together as the Joint Video Team (JVT) [4]. Compared with the earlier standards MPEG-2 and MPEG-4, it offers significantly better rate-distortion efficiency (a larger bit rate reduction), better error resilience, and a more network-friendly design [4], [5].

Profiles in H.264
Fig. 1: Various profiles of H.264 [5]

Encoder-Decoder of H.264
Fig. 2: H.264 encoder block diagram [6]
Fig. 3: H.264 decoder block diagram [6]

Prediction modes
INTRA PREDICTION MODES:
Table 1: Various intra prediction block sizes and properties [4]

INTER PREDICTION MODES:
The macroblocks (MB) are split into four types [4] as shown:
a. One 16x16 MB partition.
b. Two 8x16 MB partitions.
c. Two 16x8 MB partitions.
d. Four 8x8 partitions, and
e. Combination of any of b, c and d.

Optimization of baseline profile:
H.264 provides the best compression but is computationally much more complex than any of the previous standards, which makes encoding too slow for real-time applications. To make H.264 suitable for practical applications, the encoding time must be reduced. In this project this is achieved by applying the following methods together:
1.
Parallel programming in baseline [7]
2. Fast algorithm for intra mode selection [8]
3. Fast algorithm for inter mode selection [9]
The Joint Model (JM 18.0) implementation of the H.264 encoder is used in this project [10].

1. Parallel programming in baseline [7]
Step 1. Partition the frames to be encoded into two equal sets. Ex.: if 30 frames are to be encoded, set 1 contains frames 1 to 15 and set 2 contains frames 16 to 30.
Step 2. Perform intra coding in parallel on the first frame of each set. Ex.: frame 1 and frame 16 are encoded together. Frame 1 can then be used as the reference frame for frame 2, frame 16 as the reference frame for frame 17, and so on.
Step 3. Perform inter coding on frame 2 and frame 17 by incorporating changes in the encoding algorithm. Repeat for frame 3 and frame 18, and so on until all the frames are encoded.

Fig. 4: Parallel processing of frames to reduce encoding time [7]
Set 1: FRAME 1 (INTRA), FRAME 2 (INTER), FRAME 3 (INTER), ..., FRAME 15 (INTER)
Set 2: FRAME 16 (INTRA), FRAME 17 (INTER), FRAME 18 (INTER), ..., FRAME 30 (INTER)
Corresponding frames of the two sets are encoded in parallel.

Rate distortion optimization [6], [8], [9]
Once the prediction is obtained and the residual is calculated for each mode, the H.264/AVC encoder performs rate-distortion optimization (RDO) for each macroblock to obtain the best mode.
Set the macroblock parameters: QP (quantization parameter) and the Lagrangian multiplier λ_MODE.
Calculate:
\lambda_{MODE} = 0.85 \cdot 2^{(QP - 12)/3}    (1)
Then calculate the cost, which determines the best mode:
Cost = D + \lambda_{MODE} \cdot R    (2)
where D is the distortion and R is the bit rate at the given QP.

Considering the RDO procedure for intra mode selection in H.264/AVC, the number of mode combinations in one macroblock is
N8 x (16 x N4 + N16) = 4 x (16 x 9 + 4) = 592
where
N8 = number of modes of an 8x8 chroma block (4)
N4 = number of modes of a 4x4 luma block (9)
N16 = number of modes of a 16x16 luma block (4)
The H.264/AVC encoder therefore carries out 592 RDO calculations to choose the best mode for each MB, which greatly increases the complexity of the encoder.

Fast algorithm for intra mode selection [8]
Proposed intra mode selection algorithm for a 4x4 luma block [2], [8]:
Fig. 5: Pixel indices and modes of adjacent blocks used in the proposed intra mode selection algorithm. (a) Indices used in (3) to (10) for a 4x4 luma block, (b) modes of upper and left blocks for additional candidate modes. [8]
Step 1 - For a 4x4 luma block, obtain the average (avg) and the sum of differences (S). [8]
Step 2a - If S is larger than a threshold T1 (set to 32), carry out the RDO procedure for at most 4 candidate modes: the two modes with the minimum and second minimum difference (Diff), and at most two modes from adjacent blocks. [8]
Step 2b - If S is smaller than the threshold T1, carry out the RDO procedure for at most 4 candidate modes: the one mode with minimum Diff, at most two modes from adjacent blocks, and DC mode. [8]

Proposed intra mode selection algorithm for a 16x16 luma block [2], [8]:
Step 1 - Examine the sizes of the adjacent blocks: if both blocks (upper block and left block) are 16x16, go to Step 2; otherwise go to Step 4. [8]
Step 2 - Examine the modes of the adjacent blocks: if both modes are the same, go to Step 3; otherwise select as the best mode for the 16x16 luma block whichever of the two adjacent modes, modeA and modeB, gives the minimum SATD (sum of absolute transformed differences). [8]
Step 3 - If both adjacent modes are DC mode, go to Step 4; otherwise select as the best mode for the 16x16 luma block whichever of the adjacent mode and DC mode gives the minimum SATD.
[8]
Step 4 - Let ΔV be the vertical difference between the upper boundary pixels of the current block and the boundary pixels of the upper block, and let ΔH be the horizontal difference between the left boundary pixels of the current block and the boundary pixels of the left block, as follows. [8]
\Delta V = \sum_{i=0}^{15} |u(i) - q(i)|
\Delta H = \sum_{i=0}^{15} |l(i) - r(i)|
where
u(i) -> boundary pixels of the upper block
q(i) -> upper boundary pixels of the current block
l(i) -> boundary pixels of the left block
r(i) -> left boundary pixels of the current block
Obtain the candidate modes from the two difference values ΔV and ΔH: if |ΔV − ΔH| is smaller than 2xT2, the candidate modes are DC mode and plane mode; if (ΔV − ΔH) is larger than the threshold T2 (set to 8), the candidate modes are DC mode and horizontal mode; if (ΔV − ΔH) is smaller than −T2, the candidate modes are DC mode and vertical mode, where T2 is a positive value. The threshold T2 is set equal to 32. Finally, select the best mode among the candidate modes by choosing the one with the minimum SATD.
Fig. 6: Calculation of ΔV and ΔH in a 16x16 luma block. [2], [8]

3. Fast algorithm for inter mode selection [9], [20]:
FAT for mode decision exploits the statistical similarity between the current macroblock and the predicted macroblock. The predicted mode is obtained from the spatially and temporally neighboring macroblocks. For accuracy, the rate-distortion cost is checked against adaptive Threshold I and adaptive Threshold II:
Adaptive Threshold I: RD_{thres} = RD_{pred} \times (1 - 8\beta)
Adaptive Threshold II: RD_{thres} = RD_{pred} \times (1 + 10\beta)
Such that ………. (4)
where β is the modulator, N is the number of rows and M is the number of columns of the N x M MB.
If the predicted mode is less than P8x8, the current macroblock is checked for homogeneity. If the current macroblock is not homogeneous, it is further partitioned into 8x4, 4x8 and 4x4 blocks.

FAT inter prediction algorithm [9]
Fig. 7: Flow chart for inter prediction [9], [20]

CONCLUSION
As proposed, the optimized baseline profile will be obtained by implementing parallel programming in the baseline profile together with the FAT algorithms for the intra and inter prediction modes, and it will be evaluated on numerous test sequences using quality measurements such as PSNR and SSIM. The performance of the optimized H.264 baseline profile will be compared with that of the unmodified H.264 baseline profile using these measurements, so that the gain in computation speed and the effect on video quality and bit rate can be calculated for the various test sequences.

REFERENCES:
[1] H. Kalva, “Parallel programming for multimedia applications”, Springer Science and Business Media, Florida Atlantic University, Florida, USA, Dec. 2010.
[2] J. Kim, D. Kim, and J. Jeong, “Complexity reduction algorithm for intra mode selection in H.264/AVC video coding”, J. Blanc-Talon et al. (Eds.): ACIVS 2006, LNCS 4179, pp. 454-465, Springer-Verlag Berlin Heidelberg, 2006.
[3] J. Ren, et al, “Computationally efficient mode selection in H.264/AVC video coding”, IEEE Trans. on Consumer Electronics, vol. 54, pp. 877-886, May 2008.
[4]
[5] I. E. G. Richardson, “H.264 and MPEG-4 video compression: video coding for next generation multimedia”, Wiley 2nd edition, Aug. 2010.
[6] D. Marpe, T. Wiegand and G. J. Sullivan, “The H.264/MPEG-4 AVC standard and its applications”, IEEE Communications Magazine, vol. 44, pp. 134-143, Aug. 2006.
[7] T. Saxena, “Reducing the encoding time of H.264 baseline profile using parallel programming techniques”, M.S. Thesis, EE, UTA, expected Dec. 2012.
[8] S. K. Muniyappa, “Implementation of complexity algorithm for intra mode selection in H.264/AVC video coding”, M.S. Thesis, EE, UTA, Dec. 2011.
[9] A. Kulkarni, “Implementation of fast inter-prediction mode decision algorithm in H.264/AVC video encoder”, M.S. Thesis, EE, UTA, May 2012.
[10] JM reference software, Fraunhofer Institute for Telecommunications Heinrich Hertz Institute. http://iphome.hhi.de/suehring/tml/.
I. Richardson, “The H.264 advanced video compression standard”, second edition, Wiley, 2010.
[11] G. Sullivan, P. Topiwala, and A. Luthra, “The H.264/AVC advanced video coding standard: overview and introduction to the fidelity range extensions”, SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, 2004.
[12] F. Pan et al, “Fast intra mode decision algorithm for H.264/AVC video coding”, in Proc. IEEE Int. Conf. Image Process., pp. 781-784, Singapore, Oct. 2004.
[13] I. E. G. Richardson, “H.264 and MPEG-4 video compression: video coding for next-generation multimedia”, Wiley, 2003.
[14] ISO/IEC 11172-5. Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbps. Nov. 1998.
[15] M. Jafari and S. Kasaei, “Fast intra- and inter-prediction mode decision in H.264 advanced video coding”, International Journal of Computer Science and Network Security, vol. 8, no. 5, pp. 16, May 2008.
[16] T. Stockhammer, D. Kontopodis, and T. Wiegand, “Rate-distortion optimization for H.26L video coding in packet loss environment”, in Proc. Packet Video Workshop 2002, Pittsburgh, PA, April 2002.
[17] Draft ITU-T Recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264/ISO/IEC 14496-10 AVC), Mar. 2003.
[18] YUV test video sequences: http://trace.eas.asu.edu/yuv/.
[19] T. Wiegand, et al, “Overview of the H.264/AVC video coding standard”, IEEE Trans. Circuits and Syst. for Video Technol., vol. 13, pp. 560-576, July 2003.
[20] D. Han, A. Kulkarni and K. R. Rao, “Fast inter-prediction mode decision algorithm for H.264 video encoder”, ECTI-CON 2012, Cha Am, Thailand, May 2012.
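
Supplementary worked example. The formulas from the parallel programming, rate distortion optimization and inter mode selection sections can be checked numerically with a minimal Python sketch. This is only an illustration under stated assumptions, not code from the JM software: all function names are hypothetical, the mode counts (4 chroma modes, 9 modes for 4x4 luma, 4 modes for 16x16 luma) are the standard H.264 intra mode counts, and the value of β used in the usage lines is an arbitrary placeholder because its defining equation is not given here. The frame pairing only lists which frames would be encoded together; it does not perform any parallel encoding.

```python
def partition_frames(total_frames):
    """Pair frame k of set 1 with frame k of set 2 (frames numbered from 1).

    Assumes an even frame count, as in the 30-frame example.
    """
    if total_frames % 2 != 0:
        raise ValueError("total_frames must be even to form two equal sets")
    half = total_frames // 2
    set1 = range(1, half + 1)
    set2 = range(half + 1, total_frames + 1)
    return list(zip(set1, set2))


def lambda_mode(qp):
    """Lagrangian multiplier of Eq. (1): 0.85 * 2^((QP - 12) / 3)."""
    return 0.85 * 2 ** ((qp - 12) / 3)


def rd_cost(distortion, rate, qp):
    """Rate-distortion cost of Eq. (2): D + lambda_MODE * R."""
    return distortion + lambda_mode(qp) * rate


def intra_mode_combinations(n8=4, n4=9, n16=4):
    """Number of intra mode combinations per macroblock: N8 * (16 * N4 + N16)."""
    return n8 * (16 * n4 + n16)


def adaptive_thresholds(rd_pred, beta):
    """Adaptive Threshold I and II for a predicted RD cost and modulator beta."""
    threshold_1 = rd_pred * (1 - 8 * beta)
    threshold_2 = rd_pred * (1 + 10 * beta)
    return threshold_1, threshold_2


if __name__ == "__main__":
    pairs = partition_frames(30)
    print(pairs[:3])                   # first frame pairs: (1, 16), (2, 17), (3, 18)
    print(lambda_mode(28))             # multiplier at QP = 28
    print(rd_cost(1500.0, 120.0, 28))  # cost for a hypothetical D and R
    print(intra_mode_combinations())   # 592
    print(adaptive_thresholds(1000.0, 0.01))  # beta = 0.01 is a placeholder
```

Because λ_MODE doubles for every increase of 3 in QP, the rate term of the cost weighs more heavily at coarse quantization, which is why the same D and R can favor different modes at different QP values.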