EE 5359 TOPICS IN SIGNAL PROCESSING
Interim Report
ANALYSIS OF AVS-M FOR LOW PICTURE RESOLUTION MOBILE APPLICATIONS

Under Guidance of: Dr. K. R. Rao
Dept. of Electrical Engineering, UT Arlington
Submitted by: Aditya R Deshkar
aditya.deshkar@mavs.uta.edu
Student ID: 1000848085

OBJECTIVE
• Audio Video Standard for Mobile (AVS-M) [1] is the seventh part of the standard developed by the Audio Video coding Standard (AVS) workgroup of China.
• Provide insight into the AVS-M video coding standard (Jiben Profile) [2].
• Analyze its architecture, features and data formats for use in low complexity, low picture resolution mobile applications.

History [1]
Figure 1: Audio video coding standards history [1]

AVS CHINA PROFILES [4]
Table 1: AVS China Profiles and Applications [4]

Various AVS China Parts [3]
Table 2: AVS China Parts [3]

Main Characteristics of AVS Video standards:
• Streamlined and highly efficient video coder
• Provides a trade-off between absolute coding performance and complexity of implementation
• Designed to provide near optimum performance
• Provides low cost implementations

Data Formats Used in AVS [5]
1) Progressive scan format
• A method of storing and transmitting images in which all lines of each frame are drawn in sequence.
2) Interlace scan format
• It involves alternate drawing of odd and even lines.

Advantages of Progressive Scan Format
• More efficient motion estimation [11]
• Significantly lower bit rate required for encoding
• Less complexity involved in motion compensation [11]

Layered Structure [5]
Figure 2: Layered Structure of AVS China [5]

Sequence [3]
Figure 3: Video Sequence example [3]
The Sequence layer provides an entry point into the coded video. Sequence headers should be placed in the bit stream to support user access appropriately for the given distribution medium.

Picture [3]
• The picture layer provides the coded representation of a video frame.
It comprises a header with mandatory and optional parameters, and optionally user data.
• There are 3 types of pictures defined by AVS:
1) I-Pictures (Intra Pictures)
2) P-Pictures (Predicted Pictures)
3) B-Pictures (Interpolated Pictures)

Slices [3]
• A slice comprises a series of macroblocks.
• The slice structure provides the lowest-layer mechanism for re-synchronizing the bit stream in case of transmission error.
Figure 4: SLICE STRUCTURE FOR AVS PART 7 [3]

Macroblocks and Blocks [3]
• A picture is divided into macroblocks (MBs).
• The upper left sample of each MB should not exceed the picture boundary.
• Macroblock partitioning is used for motion compensation. The number in each rectangle specifies the order of appearance of motion vectors.
Figure 5: MACROBLOCK PARTITIONING [3]

AVS-M Encoder [5]
Figure 6: AVS-M Encoder [5]

AVS-M Decoder [5]
Figure 7: AVS-M Decoder [5]

Network Abstraction Layer (NAL) Unit [12]
• Packetization layer: prefixes certain headers to encoded bit streams.
• NAL is designed to:
1) Provide a network-friendly environment
2) Address video-related applications
3) Convert the AVS encoded raw bit stream into NAL units for secure transfer over the network
Figure 8: NAL Unit Syntax [12]
Table 3: NAL Unit types [13]

Intra Prediction [4], [13]
Intra prediction in AVS-M significantly reduces complexity while maintaining comparable performance. Two types of intra prediction are used:
• Intra_4x4 [13]
• Direct Intra Prediction (DIP) [4]

Intra_4x4 [13]
Figure 9: INTRA_4X4 PREDICTION [13]
• Prediction uses previously decoded samples in adjacent blocks.
• For each 4x4 block, one of the nine prediction modes can be utilized to exploit spatial correlation.
Figure 10: NINE INTRA_4X4 PREDICTION MODES OF AVS PART 7 [4]

Direct Intra Prediction [4]
Direct intra prediction mainly contains 5 steps.
Step 1: All 16 4×4 blocks in a MB use their MPMs to do Intra_4×4 prediction and calculate RDCost(DIP) of this MB.
Step 2: Perform the Intra_4×4 mode search, find the best intra prediction mode of each block, and calculate RDCost(Intra_4x4).
Step 3: Compare RDCost(DIP) and RDCost(Intra_4x4). If RDCost(DIP) is less than RDCost(Intra_4x4), set the DIP flag to 1 and go to Step 4; otherwise set the DIP flag to 0 and go to Step 5.
Step 4: Encode the MB using DIP and finish encoding of this MB.
Step 5: Encode the MB using ordinary Intra_4×4 and finish encoding of this MB.

Inter-frame Prediction [13]
• The positions of the integer, half and quarter pixel samples are shown in the figure.
• Capital letters indicate integer sample positions, while small letters indicate half and quarter sample positions.
Figure 11: The position of integer, half and quarter pixel samples [13]
• If half_pixel_mv_flag is equal to 1, the precision of the motion vector is up to ½ pixel; otherwise the precision of the motion vector is up to ¼ pixel.
• When half_pixel_mv_flag is not present in the bit stream, it shall be inferred to be 11.
• The interpolated values at half sample positions can be obtained using the 8-tap filter F1 = (-1, 4, -12, 41, 41, -12, 4, -1) and the 4-tap filter F2 = (-1, 5, 5, -1).

Further Addition
• De-Blocking Filter
• Entropy Coding
• Error Concealment

Experimental Results
• The software used for AVS China Part 7 is RM 3.3.7 [9].
• Microsoft Visual Studio Professional 2012 [14] has been used to build the codec project and run the code.
• Building the project generates two executables, encode.exe and decode.exe.
• These two executables are run with the necessary parameters to obtain the final result, a decoded file.
• The original file and the decoded file are then evaluated using the MSU video quality measurement tool. The values of PSNR [8], MSE and SSIM [3] are obtained from it.
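
The PSNR and MSE values reported by the quality measurement tool can be cross-checked by hand. The following is a minimal sketch, not the MSU tool's implementation. It assumes 8-bit samples and raw planar YUV 4:2:0 files (an assumption about the test sequences), and the function names mse, psnr and luma_plane are hypothetical, chosen only for illustration.

```python
import math


def mse(original: bytes, decoded: bytes) -> float:
    """Mean squared error between two equally sized 8-bit sample planes."""
    if len(original) != len(decoded) or not original:
        raise ValueError("planes must be non-empty and the same size")
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)


def psnr(original: bytes, decoded: bytes, peak: int = 255) -> float:
    """PSNR in dB for 8-bit samples; infinite when the planes are identical."""
    error = mse(original, decoded)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)


def luma_plane(yuv: bytes, frame_index: int,
               width: int = 176, height: int = 144) -> bytes:
    """Extract the Y plane of one frame from raw planar YUV data.

    Assumes 4:2:0 sampling, so each frame holds width*height luma samples
    followed by two quarter-size chroma planes. Defaults are QCIF.
    """
    frame_size = width * height * 3 // 2
    start = frame_index * frame_size
    return yuv[start:start + width * height]


# Tiny synthetic check: every decoded sample is off by 2, so MSE is 4.
ref = bytes([100] * 16)
dec = bytes([102] * 16)
print(mse(ref, dec))
print(round(psnr(ref, dec), 2))
```

To compare real files, one would read the original and decoded .yuv files in binary mode, call luma_plane for each frame index, and average the per-frame PSNR values. Whether the tool averages per frame or over the whole sequence is not stated here, so small differences from its reported numbers are possible.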

Software used for Quality Measurement [15]
Figure 12: Screenshot of MSU Video Quality Measurement Tool software

Input sequence: mother-daughter_qcif.yuv [16]
BIT RATE, PSNR
• Input sequence: mother-daughter_qcif.yuv
• Total number of frames: 30
• Original file size: 1139 KB
• Width: 176
• Height: 144
• Frame rate: 30 fps

Original Image
Video quality at various QP values: QP = 10, QP = 63, QP = 50

Results for miss america_qcif Sequence
Compressed file size, compression ratio, bit rate, PSNR and SSIM at various QP for mother-daughter_qcif sequence
PSNR vs Bit Rate
SSIM vs Bit Rate

Conclusion
• AVS Part 7 targets low complexity, low picture resolution mobile applications.
• The AVS encoder and decoder are implemented using the AVS-M software.
• Tests are carried out on various QCIF and CIF sequences.
• The performance of AVS China was analyzed by varying the quantization parameter (QP).
• The PSNR, bit rate and SSIM were calculated.

Acronyms
• AU: Access Unit
• AVS: Audio Video Standard
• AVS-M: Audio Video Standard for mobile
• B-Frame: Interpolated Frame
• CAVLC: Context Adaptive Variable Length Coding
• CBP: Coded Block Pattern
• CIF: Common Intermediate Format
• DIP: Direct Intra Prediction
• DPB: Decoded Picture Buffer
• EOB: End of Block
• HD: High Definition
• HHR: Horizontal High Resolution
• ICT: Integer Cosine Transform
• IDR: Instantaneous Decoding Refresh
• I-Frame: Intra Frame
• IMS: IP Multimedia Subsystem
• ITU-T: International Telecommunication Union
• MB: Macroblocks
• MPEG: Moving Picture Experts Group
• MPM: Most Probable Mode
• MV: Motion Vector
• NAL: Network Abstraction Layer
• P-Frame: Predicted Frame
• PIT: Prescaled Integer Transform
• PPS: Picture Parameter Set
• QCIF: Quarter Common Intermediate Format
• QP: Quantization Parameter
• RD Cost: Rate Distortion Cost
• SAD: Sum of Absolute Differences
• SD: Standard Definition
• SEI: Supplemental Enhancement Information
• SPS: Sequence Parameter Set
• VLC: Variable Length Coding

References:
[1] AVS working group official
website, http://www.avs.org.cn
[2] W. Gao et al, "AVS – the Chinese next-generation video coding standard," National Association of Broadcasters, Las Vegas, 2004.
[3] L. Fan et al, "Overview of AVS Video Standard", IEEE International Conference on Multimedia and Expo, Vol. 1, pp. 423-426, June 2004.
[4] B. Tang, Y. Chen and W. Ji, "AVS Encoder Performance and Complexity Analysis Based on Mobile Video Communication", 2009 International Conference on Communications and Mobile Computing.
[5] L. Fan, "Mobile Multimedia Broadcasting Standards", Springer US, 2009.
[6] AVS-M Reference Software, http://www.avs.org.cn/fruits/en/softList.asp
[7] Y. Cheng et al, "Analysis and application of error concealment tools in AVS-M decoder", Journal of Zhejiang University – Science A, vol. 7, pp. 54-58, Jan 2006.
[8] Website for PSNR, http://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio
[9] AVS China software: Part 7: ftp://124.207.250.92/incoming/video_codec/AVS1_P7
[10] S. Ma, S. Wang, W. Gao, "Overview of IEEE 1857 Video Coding Standards", IEEE ICIP, pp. 1500-1504, September 2013, Melbourne, Australia (Several papers related to AVS China are in IEEE ICIP, 2013).
[11] Lu Yu et al, "Overview of AVS-video coding standards", Signal Processing: Image Communication, pp. 247-262, Nov 2009.
[12] Y. Wang, "AVS_M: From standards to Applications", Journal of Computer Science and Technology - Special section on China AVS standard, Vol. 21, No. 3, pp. 332-344, May 2006.
[13] L. Yu, "AVS Project and AVS-Video Techniques", http://wwwee.uta.edu/dip/Courses/EE5351/ISPACSAVS.pdf, Dec. 13, 2005, ISPACS 2005.
[14] Microsoft Visual Studio Professional 2012: http://www.microsoft.com/en-us/download/details.aspx?id=34673
[15] MSU video quality measurement tool: http://www.softrecipe.com/Download/msu_video_quality_measurement_tool.html
[16] Test video sequences: http://trace.eas.asu.edu/yuv/
[17] M. Liu and Z. Wei, "A fast mode decision algorithm for intra prediction in AVS-M video coding", Vol.
1, ICWAPR '07, Issue 2-4, pp. 326-331, Nov. 2007.
[18] Y. Cheng et al, "Analysis and application of error concealment tools in AVS-M decoder", Journal of Zhejiang University – Science A, vol. 7, pp. 54-58, Jan 2006.
[19] S. Hu, X. Zhang and Z. Yang, "Efficient Implementation of Interpolation for AVS", Congress on Image and Signal Processing, 2008, Vol. 3, pp. 133-138, 27-30 May 2008.