Fast mode decision for Inter Mode
Selection in H.264/AVC Video Coding
By
Amruta Kulkarni
Under Guidance of
DR. K.R. RAO
Contents
 Need for video compression
 Motivation
 Video coding standards, video formats and quality
 Overview of H.264
 Complexity reduction algorithm for inter mode selection
 Experimental results
 Conclusions
 References
Need for Video Compression
 It reduces both storage and bandwidth
demands.
 Resources are insufficient to handle uncompressed video.
 A better proposition is to send high-resolution compressed video than a low-resolution, uncompressed stream over a high bit-rate transmission channel.
Motivation [2]
 Removing redundancy in a video clip
 Only a small percentage of any
particular frame is new information
 Video encoding is a highly complex process
 Reducing the overall complexity makes encoding suitable for handheld devices
Timeline of Video Development [10]
 Inter-operability between encoders and decoders from different manufacturers
 Build a video platform which interacts with video codecs, audio codecs, transport protocols, and security and rights management in well-defined and consistent ways
OVERVIEW OF H.264 / AVC STANDARD
 Built on the concepts of earlier standards such as MPEG-2 and
MPEG-4 Visual
 Achieves substantially higher video compression and offers a network-friendly video representation
 50% reduction in bit-rate over MPEG-2
 Error resilience tools
 Supports various interactive (video telephony) and non-interactive
applications (broadcast, streaming, storage, video on demand)
H.264/MPEG-4 Part 10 or AVC [2, 5]
 It is an advanced video compression standard, developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG).
 It is a widely used video codec in mobile applications, the internet (YouTube, Flash players), set-top boxes, DTV, etc.
 An H.264 encoder converts video into a compressed format (.264) and a decoder converts the compressed video back into displayable form.
How does the H.264 codec work?
 An H.264 video encoder carries out prediction, transform
and encoding processes to produce a compressed H.264 bit
stream. The block diagram of the H.264 video encoder is
shown in Fig 1.
 A decoder carries out a complementary process by decoding,
inverse transform and reconstruction to output a decoded
video sequence. The block diagram of the H.264 video
decoder is shown in Fig 2.
H.264 encoder block diagram
Fig. 1 H.264 Encoder block diagram[7]
H.264 decoder block diagram
[Figure: bitstream input → entropy decoding → inverse quantization & inverse transform → reconstruction (+) with intra prediction or motion compensation selected by intra/inter mode selection → deblocking filter → picture buffering → video output]
Fig. 2 H.264 decoder block diagram [2]
Slice Types [3]
 I (intra) slice – contains references only to itself.
 P (predictive) slice – uses one or more recently decoded slices as a reference (or prediction) for picture construction.
 B (bi-predictive) slice – works like P slices, except that former and future I or P slices may also be used as reference pictures.
 SI and SP or “switching” slices may be used for transitions between two different H.264 video streams.
Profiles in H.264
 The H.264 standard defines sets of capabilities, referred to as “profiles”, targeting specific classes of applications (Fig. 3). Different features are supported in different profiles depending on the application. Table 1 lists some profiles and their applications.

Table 1. List of H.264 profiles and applications [2]

Profile     Applications
Baseline    Video conferencing, videophone
Main        Digital storage media, television broadcasting
High        Streaming video
Extended    Content distribution, post processing
Profiles in H.264
Fig. 3 Profiles in H.264 [9]
Intra Prediction
 I-pictures usually have a large amount of information present in the frame.
 The spatial correlation between adjacent macroblocks in a given frame is exploited.
 H.264 offers nine modes for intra prediction of 4x4 luminance blocks.
 H.264 offers four modes of intra prediction for 16x16 luminance blocks.
 H.264 supports four modes, similar to the 16x16 luminance case, for prediction of 8x8 chrominance blocks.
Intra prediction
Fig.4 16x16 intra prediction modes [11]
Fig. 5 4x4 Intra prediction modes [11]
Inter Prediction [5]
 Takes advantage of the temporal redundancies that exist
among successive frames.
 Temporal prediction in P frames involves predicting from
one or more past frames known as reference frames.
Motion Estimation/Compensation
 It includes motion estimation (ME) and motion
compensation (MC).
 ME/MC performs prediction: a predicted version of a rectangular block of pixels is generated by choosing another, similarly sized rectangular block of pixels from a previously decoded reference picture.
 The reference block is translated to the position of the current rectangular block; the displacement is given by a motion vector.
 Different sizes of block for luma: 4x4, 4x8, 8x4, 8x8, 16x8,
8x16, 16x16 pixels.
Inter prediction
Fig. 6 Partitioning of a MB for motion compensation [5]
Integer Transform and Quantization
 Transform:
 The prediction error block is expressed in the form of transform coefficients.
 H.264 employs a purely integer spatial transform, which is a
rough approximation of the DCT.
 Quantization:
 Significant portion of data compression takes place.
 Fifty-two different quantization step sizes can be chosen.
 Step sizes are increased at a compounding rate of approximately
12.5%.
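The compounding behaviour can be checked with a small sketch (not code from the thesis; the base step size below is illustrative): the step size doubles every 6 QP values, i.e. each QP increment scales it by 2^(1/6) ≈ 1.122, the roughly 12.5% rate quoted above.

```python
# Illustrative sketch: H.264 Qstep grows by a factor of 2**(1/6) per QP
# increment (~12.2%, commonly quoted as ~12.5%), doubling every 6 QP values.
def qstep(qp, base=0.625):
    # base: nominal step size at QP 0 (illustrative value, not normative)
    return base * 2 ** (qp / 6.0)

growth = qstep(29) / qstep(28)  # ratio between adjacent QPs, ~1.122
```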
De-blocking Filter and Entropy Coding
 De-blocking filter:
 Removes the blocking artifacts due to the block based encoding
pattern
 In-loop de-blocking filter
 Entropy coding:
 Assigning shorter code-words to symbols with higher
probabilities of occurrence, and longer code-words to symbols
with less frequent occurrences.
 CAVLC and CABAC
FAT (Fast Adaptive Termination) for Mode Selection [9]
 The proposed fast adaptive mode selection algorithm
includes the following:
 Fast mode prediction
 Adaptive rate distortion threshold
 Homogeneity detection
 Early Skip mode detection
Fast mode prediction
 In H.264/AVC, coding is performed on each frame by dividing the frame into small macroblocks, processed from top-left to bottom-right.
 Spatially neighboring macroblocks in the same frame generally have similar characteristics, such as motion and level of detail.
 For example, if most of the neighboring macroblocks have skip mode, the current macroblock is likely to have the same mode.
 Temporal similarity also exists between collocated macroblocks in the previously encoded frame.
Fast mode prediction
 Fig. 7 shows the spatial neighborhood: the current macroblock X has similar characteristics to its neighboring macroblocks A through H.
 Fig. 8 shows the temporal similarity between the current macroblock and the collocated macroblock PX in the previous frame, together with its neighbors.
Fig. 7 Spatial Neighboring blocks [8]
Fig. 8 Temporal Neighboring blocks [8]
Fast mode prediction
 A mode histogram is built from the spatial and temporal neighboring macroblocks; the best mode is selected as the index corresponding to the maximum value in the mode histogram.
 The average rate-distortion cost of the neighboring macroblocks that used this best mode is then taken as the prediction cost for the current macroblock.
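The neighbour-based prediction can be sketched as follows (a minimal illustration with a hypothetical list-of-(mode, cost) input, not the JM implementation):

```python
from collections import Counter

def predict_mode(neighbors):
    """neighbors: list of (mode, rd_cost) pairs for the spatial and temporal
    neighboring macroblocks (hypothetical representation).

    Returns the most frequent neighbor mode and the average RD cost of the
    neighbors that used it, i.e. the predicted cost for the current MB."""
    histogram = Counter(mode for mode, _ in neighbors)   # mode histogram
    best_mode, _ = histogram.most_common(1)[0]           # index of the maximum
    costs = [c for mode, c in neighbors if mode == best_mode]
    rd_pred = sum(costs) / len(costs)                    # average RD cost
    return best_mode, rd_pred
```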
Rate Distortion Optimization
•Rate–distortion optimization (RDO) is a method of improving video quality in
video compression. The name refers to the optimization of the amount of distortion (loss of
video quality) against the amount of data required to encode the video, the rate.
• Macroblock parameters: QP (quantization parameter) and Lagrange multiplier (λ)
• Calculate: λ_MODE = 0.85 × 2^((QP − 12)/3)
• Then calculate the cost, which determines the best mode:
RDcost = D + λ_MODE × R, where
D – distortion
R – bit rate with the given QP
λ – Lagrange multiplier
• Distortion (D) is obtained by the SAD (sum of absolute differences) between the original macroblock and its reconstructed block.
• Bit rate (R) includes the bits for the mode information and the transform coefficients of the macroblock.
• The quantization parameter (QP) can vary from 0 to 51.
• The Lagrange multiplier (λ) is a value representing the relationship between bit cost and quality.
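The two formulas above can be written down directly (a minimal sketch; in a real encoder D and R would come from the SAD and entropy-coding stages):

```python
def lagrange_mode(qp):
    # lambda_MODE = 0.85 * 2**((QP - 12) / 3)
    return 0.85 * 2 ** ((qp - 12) / 3.0)

def rd_cost(distortion, rate_bits, qp):
    # RDcost = D + lambda_MODE * R
    return distortion + lagrange_mode(qp) * rate_bits
```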
Adaptive Rate Distortion Threshold
 RDthres for early termination depends on RDpred, which is computed from spatial and temporal correlations.
 RDthres also depends on the value of the β modulator.
 Thus, the rate-distortion threshold is given by:
RDthres = (1 + β) × RDpred
 The β modulator provides a trade-off between computational efficiency and accuracy.
Threshold selection
 Adaptive Threshold I: RDthres = RDpred × (1 − 8×β)
 Adaptive Threshold II: RDthres = RDpred × (1 + 10×β)
 The threshold is adaptive as it depends on the predicted rate-distortion cost derived from spatial and temporal correlations.
 β is the modulation coefficient; it depends on two factors, namely the quantization step (Qstep) and the block size (N and M).
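A minimal sketch of the two thresholds (the derivation of β from Qstep and the block size is omitted here; β is simply passed in):

```python
def adaptive_thresholds(rd_pred, beta):
    # Threshold I tightens the predicted cost; Threshold II relaxes it.
    thr1 = rd_pred * (1 - 8 * beta)    # Adaptive Threshold I
    thr2 = rd_pred * (1 + 10 * beta)   # Adaptive Threshold II
    return thr1, thr2
```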
Homogeneity Detection
 Smaller block sizes like P4x8, P8x4 and P4x4 often correspond to detailed regions and thus require much more computation than larger block sizes.
 So, before checking the smaller block sizes, it is necessary to check whether a P8x8 block is homogeneous or not.
 The method adopted to detect homogeneity is based on edge
detection.
 An edge map is created for each frame using the Sobel
operator [27].
Homogeneity Detection
 For each pixel p(m,n), an edge vector D(m,n) = (dx(m,n), dy(m,n)) is obtained:
dx(m,n) = p(m-1,n+1) + 2·p(m,n+1) + p(m+1,n+1) − p(m-1,n-1) − 2·p(m,n-1) − p(m+1,n-1)   (1)
dy(m,n) = p(m+1,n-1) + 2·p(m+1,n) + p(m+1,n+1) − p(m-1,n-1) − 2·p(m-1,n) − p(m-1,n+1)   (2)
 Here dx(m,n) and dy(m,n) represent the differences in the vertical and horizontal directions, respectively.
 The amplitude Amp(D(m,n)) of the edge vector is given by:
Amp(D(m,n)) = |dx(m,n)| + |dy(m,n)|   (3)
 A homogeneous region is detected by comparing the sum of the amplitudes of the edge vectors over a region with predefined threshold values [30]. In the proposed algorithm, these thresholds are made adaptive, depending on the amplitudes of the left and up blocks and on the mode information.
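Equations (1)–(3) and the homogeneity test can be sketched in plain Python over a 2-D list of luma values (an illustration only; the adaptive threshold itself is passed in rather than derived):

```python
def edge_amplitude(p, m, n):
    # Sobel-style edge vector D(m,n) = (dx, dy), eqs. (1) and (2)
    dx = (p[m-1][n+1] + 2*p[m][n+1] + p[m+1][n+1]
          - p[m-1][n-1] - 2*p[m][n-1] - p[m+1][n-1])
    dy = (p[m+1][n-1] + 2*p[m+1][n] + p[m+1][n+1]
          - p[m-1][n-1] - 2*p[m-1][n] - p[m-1][n+1])
    return abs(dx) + abs(dy)           # Amp(D(m,n)), eq. (3)

def is_homogeneous(block, threshold):
    # Sum edge amplitudes over the interior pixels, compare to the threshold.
    h, w = len(block), len(block[0])
    total = sum(edge_amplitude(block, m, n)
                for m in range(1, h - 1) for n in range(1, w - 1))
    return total < threshold
```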
Homogeneity Detection
 The adaptive threshold is determined according to the following four cases:
 Case 1: the left block and the up block are both P8x8
 Case 2: the left block is P8x8 and the up block is not P8x8
 Case 3: the left block is not P8x8 and the up block is P8x8
 Case 4: neither the left block nor the up block is P8x8
[The per-case threshold formulas appeared as equation images on the original slides.]
FAT Algorithm [8]
Fig. 9 FAT algorithm [8]
FAT Algorithm
 Step 1: If the current macroblock belongs to an I slice, check intra prediction using I4x4 or I16x16 and go to step 10; else go to step 2.
 Step 2: If the current macroblock is the first macroblock in a P slice, check the inter and intra prediction modes and go to step 10; else go to step 3.
 Step 3: Compute the mode histogram from the neighboring spatial and temporal macroblocks; go to step 4.
 Step 4: Select the prediction mode as the index corresponding to the maximum of the mode histogram and obtain the values of Adaptive Threshold I and Adaptive Threshold II; go to step 5.
 Step 5: Always check the P16x16 mode and the conditions for skip mode; if the skip-mode conditions are satisfied, go to step 10, otherwise go to step 6.
FAT Algorithm
 Step 6: If the left, up, up-left and up-right blocks all have skip mode, check skip mode against Adaptive Threshold I; if the rate distortion is less than Adaptive Threshold I, the current macroblock is labeled as skip mode, go to step 10; otherwise go to step 7.
 Step 7 : First round check over the predicted mode; if the
predicted mode is P8x8, go to step 8; otherwise, check the rate
distortion cost of the predicted mode against Adaptive Threshold I.
If the RD cost is less than Adaptive Threshold I, go to step 10;
otherwise go to step 9.
 Step 8 : If a current P8x8 is homogeneous, no further partition is
required. Otherwise, further partitioning into smaller blocks
8x4,4x8, 4x4 is performed. If the RD of P8x8 is less than
Adaptive Threshold I , go to step 10; otherwise go to step 9.
FAT Algorithm
 Step 9: Second-round check over the remaining modes against Adaptive Threshold II: if the rate distortion is less than Adaptive Threshold II, go to step 10; otherwise continue checking all the remaining modes, then go to step 10.
 Step 10 : Save the best mode and rate distortion cost.
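A simplified control-flow sketch of steps 5–9 (a hypothetical interface, not the JM code: `rd_cost_of` stands in for the encoder's RD computation, and the intra checks of steps 1–2 and the homogeneity branch of step 8 are omitted for brevity):

```python
def fat_mode_decision(neighbors_all_skip, predicted_mode, thr1, thr2, rd_cost_of):
    """Sketch of the FAT early-termination flow.
    rd_cost_of: callable mapping a mode name to its RD cost."""
    best_mode, best_cost = "P16x16", rd_cost_of("P16x16")  # step 5: always check P16x16
    # Step 6: early SKIP detection when all four neighbours are SKIP
    if neighbors_all_skip and rd_cost_of("SKIP") < thr1:
        return "SKIP", rd_cost_of("SKIP")
    # Step 7: first round - check the predicted mode against Threshold I
    cost = rd_cost_of(predicted_mode)
    if cost < best_cost:
        best_mode, best_cost = predicted_mode, cost
    if cost < thr1:
        return best_mode, best_cost                        # early termination
    # Step 9: second round - remaining modes against Threshold II
    for mode in ("P16x8", "P8x16", "P8x8"):
        cost = rd_cost_of(mode)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        if cost < thr2:
            break                                          # early termination
    return best_mode, best_cost                            # step 10: save best
```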
CIF and QCIF sequences
 CIF (Common Intermediate Format) is a format used to standardize the
horizontal and vertical resolutions in pixels of Y, Cb, Cr sequences in video
signals, commonly used in video teleconferencing systems.
 QCIF means “Quarter CIF”. Having one fourth of the area, as “quarter” implies, means the height and width of the frame are halved.
 The differences in the Y, Cb, Cr resolutions of CIF and QCIF are shown in Fig. 10 [16].
Fig. 10 CIF and QCIF resolutions (Y, Cb, Cr)
Results
 The following QCIF and CIF sequences were used to test the complexity reduction algorithm [10]:
 Akiyo
 Foreman
 Car phone
 Hall monitor
 Silent
 News
 Container
 Coastguard
Test Sequences
[Thumbnails of the eight test sequences: Akiyo, Coastguard, News, Foreman, Car phone, Container, Hall monitor, Silent]
Experimental Results
 Baseline profile
 IPPP GOP structure.
 QP values of 22, 27, 32 and 37.
 QCIF – 30 frames; CIF – 30 frames.
 The results were compared with exhaustive search of JM in
terms of the change of PSNR, bit-rate, SSIM, compression
ratio, and encoding time.
 Intel Pentium Dual Core processor of 2.10GHz and 4GB
memory.
Experimental Results
 Computational efficiency is measured by the amount of time reduction, computed as:
ΔTime = (Time_JM17.2 − Time_new) / Time_JM17.2 × 100%
 Delta bit rate is measured by the amount of reduction, computed as:
ΔBit rate = (Bit rate_JM17.2 − Bit rate_new) / Bit rate_JM17.2 × 100%
 Delta PSNR (peak signal-to-noise ratio) is measured by the amount of reduction, computed as:
ΔPSNR = (PSNR_JM17.2 − PSNR_new) / PSNR_JM17.2 × 100%
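All three deltas are the same percentage-reduction formula applied to a different metric; a one-line sketch:

```python
def percent_reduction(ref_value, new_value):
    # e.g. percent_reduction(time_jm17_2, time_fat) for the encoding-time saving
    return (ref_value - new_value) / ref_value * 100.0
```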
Quality
 Visual quality must be specified, evaluated and compared.
 Visual quality is inherently subjective.
 Two types of quality measures:
 Objective quality measures – PSNR, MSE
 Structural quality measure – SSIM [29]
 PSNR is the most widely used objective quality measurement:
PSNRdB = 10 log10((2^n − 1)² / MSE)
where n = number of bits per pixel and MSE = mean square error.
 SSIM emphasizes that the human visual system is highly adapted to extract structural information from visual scenes; therefore, structural similarity measurement should provide a good approximation to perceptual image quality.
Results
Conclusions
 To achieve time complexity reduction in inter prediction, a fast adaptive
termination mode selection algorithm, named FAT [8] has been used.
 Experimental results reported on different video sequences and comparison
with open source code (JM17.2) indicate that the algorithm used achieves faster
encoding time with a negligible loss in video quality. Numbers are as shown
below:
 Encoding time: ~43% reduction for QCIF and ~40% reduction for CIF
 PSNR: ~0.15% reduction for QCIF and ~0.26% reduction for CIF
 Bit Rate: ~6% reduction for QCIF and ~9.5% reduction for CIF
 SSIM: ~0.077% reduction for QCIF and ~0.073% reduction for CIF
 These results show that a considerable reduction in encoding time is achieved using the FAT algorithm while not degrading the video quality.
References:
1. Open source article, “Intra frame coding”: http://www.cs.cf.ac.uk/Dave/Multimedia/node248.html
2. Open source article, “MPEG 4 new compression techniques”: http://www.meabi.com/wp-content/uploads/2010/11/21.jpg
3. Open source article, “H.264/MPEG-4 AVC”, Wikipedia: http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC
4. I. E. Richardson, “The H.264 advanced video compression standard”, 2nd edition, Wiley, 2010.
5. R. Schafer and T. Sikora, “Digital video coding standards and their role in video communications”, Proceedings of the IEEE, vol. 83, pp. 907-923, Jan. 1995.
6. G. Escribano et al., “Video encoding and transcoding using machine learning”, MDM/KDD ’08, Las Vegas, NV, USA, Aug. 2008.
7. D. Marpe, T. Wiegand and S. Gordon, “H.264/MPEG4-AVC fidelity range extensions: tools, profiles, performance, and application areas”, Proceedings of the IEEE International Conference on Image Processing 2005, vol. 1, pp. 593-596, Sept. 2005.
8. ITU-T Recommendation H.264, Advanced video coding for generic audio-visual services.
9. S. Kwon, A. Tamhankar and K. R. Rao, “Overview of H.264 / MPEG-4 Part 10”, J. Visual Communication and Image Representation, vol. 17, pp. 186-216, April 2006.
10. A. Puri et al., “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.
11. G. Sullivan, P. Topiwala and A. Luthra, “The H.264/AVC advanced video coding standard: overview and introduction to the fidelity range extensions”, SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, Aug. 2004.
12. K. R. Rao and P. C. Yip, “The transform and data compression handbook”, Boca Raton, FL: CRC Press, 2001.
13. T. Wiegand and G. J. Sullivan, “The H.264 video coding standard”, IEEE Signal Processing Magazine, vol. 24, pp. 148-153, March 2007.
14. I. E. Richardson, “H.264/MPEG-4 Part 10 white paper: inter prediction”, www.vcodex.com, March 2003.
15. JM reference software: http://iphome.hhi.de/suehring/tml/
16. G. Raja and M. Mirza, “In-loop de-blocking filter for H.264/AVC video”, Proceedings of the IEEE International Conference on Communication and Signal Processing 2006, Marrakech, Morocco, Mar. 2006.
17. M. Wien, “Variable block size transforms for H.264/AVC”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, pp. 604-613, July 2003.
18. A. Luthra, G. Sullivan and T. Wiegand, “Introduction to the special issue on the H.264/AVC video coding standard”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 557-559, July 2003.
19. H. Kim and Y. Altunbasak, “Low-complexity macroblock mode selection for H.264-AVC encoders”, IEEE International Conference on Image Processing, vol. 2, pp. 765-768, Oct. 2004.
20. “Editor's proposed draft text modifications for Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC), Draft 2”, JVT-E022d2, Geneva, Switzerland, 9-17 October 2002.
21. A. Tourapis, O. C. Au and M. L. Liou, “Highly efficient predictive zonal algorithm for fast block-matching motion estimation”, IEEE Trans. Circuits Syst. Video Technol., vol. 12, pp. 934-947, Oct. 2002.
22. Z. Chen, P. Zhou and Y. He, “Fast integer pel and fractional pel motion estimation for JVT”, JVT-F017r1.doc, Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG, 6th meeting, Awaji Island, JP, 5-13 December 2002.
23. A. M. Tourapis, “Enhanced predictive zonal search for single and multiple frame motion estimation”, Proceedings of Visual Communications and Image Processing 2002 (VCIP-2002), pp. 1069-1079, San Jose, CA, January 2002.
24. Y. Lin and S. Tai, “Fast full search block matching algorithm for motion compensated video compression”, IEEE Transactions on Communications, vol. 45, pp. 527-531, May 1997.
25. T. Uchiyama, N. Mukawa and H. Kaneko, “Estimation of homogeneous regions for segmentation of textured images”, Proceedings of IEEE ICPR, pp. 1072-1075, 2002.
26. X. Liu, D. Liang and A. Srivastava, “Image segmentation using local special histograms”, Proceedings of IEEE ICIP, pp. 70-73, 2001.
27. F. Pan, X. Lin, R. Susanto, K. Lim, Z. Li, G. Feng, D. Wu and S. Wu, “Fast mode decision for intra prediction”, Doc. JVT-G013, Mar. 2003.
30. D. Wu et al., “Fast intermode decision in H.264/AVC video coding”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 7, pp. 953-958, July 2005.
31. YUV test video sequences: http://trace.eas.asu.edu/yuv/
32. J. Ren et al., “Computationally efficient mode selection in H.264/AVC video coding”, IEEE Transactions on Consumer Electronics, vol. 54, no. 2, pp. 877-886, May 2008.
33. Z. Wang et al., “Image quality assessment: from error visibility to structural similarity”, IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.
34. A. Puri et al., “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.
35. Multi-view coding H.264/MPEG-4 AVC: http://mpeg.chiariglione.org/technologies/mpeg-4/mp04-mvc/index.htm
36. CIF and QCIF format: http://en.wikipedia.org/wiki/Common_Intermediate_Format
37. T. Wiegand et al., “Rate-constrained coder control and comparison of video coding standards”, IEEE Trans. Circuits Systems Video Technology, vol. 13, no. 7, pp. 688-703, July 2003.
38. T. Stockhammer, D. Kontopodis and T. Wiegand, “Rate-distortion optimization for H.26L video coding in packet loss environment”, Proc. Packet Video Workshop 2002, Pittsburgh, PA, April 2002.
39. K. R. Rao and J. J. Hwang, “Techniques and standards for digital image/video/audio coding”, Englewood Cliffs, NJ: Prentice Hall, 1996.
40. Open source article, “Blu-ray discs”: http://www.blu-ray.com/info/
41. Open source article, “Coding of moving pictures and audio”: http://mpeg.chiariglione.org/standards/mpeg-2/mpeg-2.htm
42. Open source article, “Studio encoding parameters of digital television for standard 4:3 and wide screen 16:9 aspect ratios”: http://www.itu.int/rec/R-REC-BT.601/
43. Integrated Performance Primitives, Intel: http://software.intel.com/en-us/articles/intel-ipp/#support, 2009.
44. T. Purushotham, “Low complexity H.264 encoder using machine learning”, M.S. Thesis, EE Dept., UTA, 2010.
45. S. Muniyappa, “Implementation of complexity reduction algorithm for intra mode selection in H.264/AVC”, M.S. Thesis, EE Dept., UTA, 2011.
46. R. Su, G. Liu and T. Zhang, “Fast mode decision algorithm for intra prediction in H.264/AVC with integer transform and adaptive threshold”, Signal, Image and Video Processing, vol. 1, no. 1, pp. 11-27, Apr. 2007.
47. D. Kim, K. Han and Y. Lee, “Adaptive single-multiple prediction for H.264/AVC intra coding”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 20, no. 4, pp. 610-615, April 2010.
48. G. J. Sullivan, “The H.264/MPEG-4 AVC video coding standard and its deployment status”, Proc. SPIE Conference on Visual Communications and Image Processing (VCIP), Beijing, China, July 2005.
49. D. Marpe, T. Wiegand and G. Sullivan, “The H.264/MPEG-4 advanced video coding standard and its applications”, IEEE Communications Magazine, vol. 44, no. 8, pp. 134-143, Aug. 2006.
50. T. Wiegand and G. Sullivan, “The picturephone is here. Really”, IEEE Spectrum, vol. 48, pp. 50-54, Sept. 2011.
Thank You !!
SSIM
 The difference with respect to the other techniques mentioned previously, such as MSE or PSNR, is that those approaches estimate perceived errors, whereas SSIM considers image degradation as a perceived change in structural information. Structural information is the idea that pixels have strong inter-dependencies, especially when they are spatially close. These dependencies carry important information about the structure of the objects in the visual scene.
 The SSIM metric is calculated on various windows of an image. The measure between two windows x and y of common size N×N is:
SSIM(x, y) = ((2 μx μy + C1)(2 σxy + C2)) / ((μx² + μy² + C1)(σx² + σy² + C2))
with
μx the average of x; μy the average of y;
σx² the variance of x; σy² the variance of y;
σxy the covariance of x and y;
C1 and C2 two variables that stabilize the division when the denominator is weak.
 To evaluate image quality, this formula is applied only on luma. The resulting SSIM index is a decimal value between −1 and 1, and the value 1 is only reachable for two identical sets of data. Typically it is calculated on window sizes of 8×8. The window can be displaced pixel-by-pixel over the image, but the authors propose to use only a subgroup of the possible windows to reduce the complexity of the calculation.
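A direct plain-Python transcription of the SSIM window formula (a sketch, not the reference implementation; C1 and C2 follow the common choice k1 = 0.01, k2 = 0.03 with dynamic range L = 255):

```python
def ssim(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """SSIM between two equal-size windows given as flat lists of luma values."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n
    var_y = sum((v - mu_y) ** 2 for v in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    return (((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2))
            / ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)))
```

Identical windows give an index of 1, matching the property stated above.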
RDO
Rate–distortion optimization (RDO) is a method of improving video quality in video
compression. The name refers to the optimization of the amount of distortion (loss of video quality)
against the amount of data required to encode the video, the rate. While it is primarily used by video
encoders, rate-distortion optimization can be used to improve quality in any encoding situation
(image, video, audio, or otherwise) where decisions have to be made that affect both file size and
quality simultaneously.
 Rate–distortion optimization solves the aforementioned problem by acting as a video quality
metric, measuring both the deviation from the source material and the bit cost for each possible
decision outcome. The bits are mathematically measured by multiplying the bit cost by the
Lagrangian, a value representing the relationship between bit cost and quality for a particular
quality level. The deviation from the source is usually measured as the mean squared error, in order
to maximize the PSNR video quality metric.
 Calculating the bit cost is made more difficult by the entropy encoders in modern video codecs,
requiring the rate-distortion optimization algorithm to pass each block of video to be tested to the
entropy coder to measure its actual bit cost. In MPEG codecs, the full process consists of a discrete
cosine transform, followed by quantization and entropy encoding. Because of this, rate-distortion
optimization is much slower than most other block-matching metrics, such as the simple sum of
absolute differences (SAD) and sum of absolute transformed differences (SATD). As such it is
usually used only for the final steps of the motion estimation process, such as deciding between
different partition types in H.264/AVC.
PSNR
The PSNR is most commonly used as a measure of the quality of reconstruction by lossy compression codecs (e.g., for image compression). The signal in this case is the original data, and the noise is the error introduced by compression. When comparing compression codecs it is used as an approximation to human perception of reconstruction quality; therefore, in some cases one reconstruction may appear closer to the original than another even though it has a lower PSNR (a higher PSNR would normally indicate a higher-quality reconstruction). One has to be extremely careful with the range of validity of this metric; it is only conclusively valid when used to compare results from the same codec (or codec type) and the same content.
PSNR is most easily defined via the mean squared error (MSE), which for two m×n monochrome images I and K, where one image is considered a noisy approximation of the other, is defined as:
MSE = (1/(m·n)) Σᵢ Σⱼ [I(i,j) − K(i,j)]²
The PSNR is then defined as:
PSNR = 10 log10(MAX_I² / MSE) = 20 log10(MAX_I / √MSE)
Here, MAX_I is the maximum possible pixel value of the image. When the pixels are represented using 8 bits per sample, this is 255. More generally, when samples are represented using linear PCM with B bits per sample, MAX_I is 2^B − 1. For color images with three RGB values per pixel, the definition of PSNR is the same except that the MSE is the sum over all squared value differences divided by the image size and by three. Alternatively, for color images the image is converted to a different color space and PSNR is reported against each channel of that color space, e.g., YCbCr or HSL.
Typical values for the PSNR in lossy image and video compression are between 30 and 50 dB, where higher is better. Acceptable values for wireless transmission quality loss are considered to be about 20 dB to 25 dB.
When the two images are identical, the MSE is zero; for this value the PSNR is undefined (division by zero).
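The MSE/PSNR definitions above in a short sketch (flat pixel lists, 8-bit samples by default):

```python
import math

def psnr(original, reconstructed, max_val=255):
    """PSNR in dB between two equal-size images given as flat pixel lists."""
    mse = (sum((a - b) ** 2 for a, b in zip(original, reconstructed))
           / len(original))
    if mse == 0:
        return float("inf")  # identical images: MSE is zero, PSNR unbounded
    return 10 * math.log10(max_val ** 2 / mse)
```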
Bit rate
 In telecommunications and computing, bit rate (sometimes written bitrate, data rate, or as a variable R) is the number of bits that are conveyed or processed per unit of time.
 In digital multimedia, bitrate represents the amount of information, or detail, that is stored per unit of time of a recording. The bitrate depends on several factors:
 The original material may be sampled at different frequencies
 The samples may use different numbers of bits
 The data may be encoded by different schemes
 The information may be digitally compressed by different algorithms or to different degrees
 Generally, choices about the above factors are made to achieve the desired trade-off between minimizing the bitrate and maximizing the quality of the material when it is played.
 If lossy data compression is used on audio or visual data, differences from the original signal will be introduced; if the compression is substantial, or lossy data is decompressed and recompressed, this may become noticeable in the form of compression artifacts. Whether these affect the perceived quality, and if so how much, depends on the compression scheme, encoder power, the characteristics of the input data, the listener's perceptions, the listener's familiarity with artifacts, and the listening or viewing environment.
 The most computationally expensive process in H.264 is motion estimation.
 For example, assuming full search (FS) and P block types, Q reference frames and a search range of M×N, M×N×P×Q computations are needed.
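The operation count for exhaustive full search is simply the product of the four factors; a one-line sketch (the numbers in the example are hypothetical):

```python
def full_search_operations(m, n, p, q):
    # M x N search positions, P block types, Q reference frames
    return m * n * p * q

ops = full_search_operations(32, 32, 7, 5)  # e.g. 32x32 range, 7 types, 5 refs
```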