Acknowledgement: Many of the materials are adapted from Prof. Jechang Jeong’s excellent presentation on international coding standards. Computer Vision – Coding Standards Hanyang University Jong-Il Park Topics to be covered International coding standards Background and brief history Key techniques in JPEG MPEG-1,2,4 * Only image/video coding techniques will be covered Department of Computer Science and Engineering, Hanyang University Multimedia Everywhere Towards Multimedia : Consumer Electronics Computer Multimedia TeleCommunication Broadcasting Department of Computer Science and Engineering, Hanyang University Still Picture Compression Standards 1980 : ITU-T T.4 : G3 FAX for PSTN Modified Huffman and Modified READ 1984 : ITU-T T.6 : G4 FAX for ISDN Modified MR 1992 : JPEG (ISO 10918, ITU-T T.81) : Color Still Pictures used for Color Fax, Electronic Still Camera, Color Printer, Computer Applications etc Lossless/Lossy Modes, Baseline/Extended Modes, Progressive/Sequential Modes DPCM + DCT + Q + RLE + Huffman/Arithmetic Codes Motion JPEG can be used for Moving Pictures. 1993 : JBIG (ISO 11544, ITU-T T.82) : Bi-level Pictures Improvement on T.4 and T.6 Recently: JPEG-LS, JBIG2, etc Department of Computer Science and Engineering, Hanyang University Moving Picture Compression Standards 1982 : ITU-R BT.601 : Studio Quality PCM Component Video Common to 525/60 and 625/50 Systems 13.5 MHz Sampling, 8 bit/sample, 4:2:2 Format 1990 : ITU-T H.261 : Video Phone/Conference Application via ISDN Bitrate = p x 64 kbps, p = 1-30 MC DPCM + DCT + Q + RLE + Huffman Codes Reference Model 1 - 8 1992 : MPEG-1 Video : DSM Applications (e.g. Video CD) Bitrate = 1.5 Mbps MC DPCM + DCT + Q + RLE + Huffman Codes GOP Structure for Random Access and Error Recovery (I, P, B Frames) Simulation Model 1 - 3 Department of Computer Science and Engineering, Hanyang University Moving Picture Compression Standards(Cont.) 1994 : MPEG-2 Video (ISO 13818-2, ITU-T H.262) : Generic Algorithm for Various Applications (Broadcasting, Communication, Network, DSM etc) 5 Profiles of Functionality (Simple, Main, Spatial Scalable, SNR Scalable, High) 4 Levels of Resolution (Low, Main, High-1440, High) Deals with Interlaced Scan as well as Progressive Scan Field/Frame ME & DCT, Dual Prime ME, Intra VLC, Alternate Scan, Nonuniform Q, etc 1993 : ITU-R CMTT.721 : 140 Mbps Contribution Quality Video Adaptive DPCM, Componentwise 1993 : ITU-R CMTT.723 : 34-45 Mbps Contribution Quality Video MC DPCM + DCT + Q + RLE + Huffman Codes Department of Computer Science and Engineering, Hanyang University Moving Picture Compression Standards(Cont.) 1995 : ITU-T H.263 : Videophone via PSTN Bitrate < 64 kbps (V.34 modem = 33.6 kbps, Recent modem = 56 kbps) Improved version of H.261 1998 : MPEG-4 Bitrates < 2 Mbps Targets: Multimedia data base access Wireless multimedia communication Components of H.263 are incorporated Content-based compression Synthetic and natural video/audio Multiple tools/algorithms/profiles => Flexibility 1999 : MPEG-4 Version 2, MPEG-7 Department of Computer Science and Engineering, Hanyang University Continuous-tone still image JPEG(Joint Photographic Experts Group) Applications : color FAX, digital still camera, multimedia computer, internet JPEG Standard consists of - a lossy baseline coding system - an extended coding system for greater compression, higher precision or progressive reconstruction applications - a lossless independent coding system for reversible compression References - ITU-T recommendation T.81, “Information Technology - Digital compression and Coding of Continuous-Tone Still Images Requirements and Guideline”, 92. 2 - K. R. Rao, J. J. Hwang, “Techniques & Standards for Image, Video & Audio Coding”, Prentice Hall PTR, 1996 Department of Computer Science and Engineering, Hanyang University Baseline system Baseline system : most widely used among JPEG standards Data precision - 8 bits for input and output - 11 bits for quantized DCT coefficients Algorithm - DCT + quantization + variable length coding Compression Guideline - 0.25 ~ 0.5 bits/pixel : moderate to good quality, some applications - 0.5 ~ 0.75 bits/pixel : good to very good quality, many applications - 0.75 ~ 1.5 bits/pixel : excellent quality, most applications - 1.5 ~ 2.0 bits/pixel : indistinguishable (visually lossless) quality, most demanding applications Department of Computer Science and Engineering, Hanyang University Block diagram of baseline system Baseline system encoder Baseline system decoder Department of Computer Science and Engineering, Hanyang University Quantization and inverse quant. Quantization table - No default values for quantization tables - Application may specify the tables - Q(u, v) : quantization table integer value from 1 to 255 F u , v : F Q u , v round Qu , v Dequantization : Ru , v F Q u , v Qu , v Quantization Department of Computer Science and Engineering, Hanyang University Example f (x,y) FQ (u,v) F (u,v) FDCT r (x,y) Quant. e (x,y) Inverse Q & IDCT Department of Computer Science and Engineering, Hanyang University Entropy coding DC Coefficient Coding Differential Coding DC coefficients of adjacent blocks are strongly correlated. VLC(Huffman Coding) Department of Computer Science and Engineering, Hanyang University Entropy coding(Cont.) AC coefficients Coding - Zigzag Scanning - VLC(Variable Length Coding, Huffman Coding) Department of Computer Science and Engineering, Hanyang University Eg. JPEG Compression Original image (24bpp) JPEG Compressed image (8:1 -- 3bpp) JPEG Compressed image ( 32:1 -0.75bpp ) JPEG Compressed image ( 128:1 -0.1875bpp ) Department of Computer Science and Engineering, Hanyang University MPEG Digital Video Technology MPEG-1( ISO/IEC 11172 ) and MPEG-2( ISO/IEC 13818 ) Applications : MPEG-1 : Digital Storage Media(CD-ROM…) MPEG-2 : Higher bit rates and broader generic applications ( Consumer electronics, Telecommunications, Digital Broadcasting, HDTV, DVD, VOD, etc. ) Coding scheme : Spatial redundancy : DCT + Quantization Temporal redundancy : Motion estimation and compensation Statistical redundancy : VLC References : - ISO/IEC 11172-2 (MPEG-1), ISO/IEC 13818-2 (MPEG-2) - K.R.RAO and J.J. HWANG, “TECHNIQUES & STANDARDS FOR IMAGE•VIDEO & AUDIO CODING,” Prentice Hall, 1996. Department of Computer Science and Engineering, Hanyang University MPEG Overview MPEG : - Motion Picture Experts Group - Specifies a standard compression, transmission, and decompression scheme for video and audio. - ISO/IEC 11172 : MPEG-1 - ISO/IEC 13818 : MPEG-2 - Consists of 3 parts. Part 1 : System Part 2 : Video Part 3 : Audio Department of Computer Science and Engineering, Hanyang University MPEG compression of video How to remove spectral, spatial, temporal, and statistical redundancy? Department of Computer Science and Engineering, Hanyang University Intra-frame compression Rate Control Quantization step size Video DCT No information loss No data reduction Entropy Coding Q Information loss Data reduction MUX Buffer RLE VLC Data reduction Data reducetion Run Length Coding Generates (Run, Level) symbols Variable Length Coding Use short words for most frequent symbols (like Morse code) 111 110 101 100 011 010 001 000 8-bit quantization Compressed Data 11 10 2-bit quantization 01 00 In pu t Val u e In pu t Val u e Quantizing Coefficients processing order to encourage runs of 0s Reduce the number of bits for each coefficient. Give preference to certain coefficients. Reduction can differ for each coefficient Department of Computer Science and Engineering, Hanyang University Removing spatial redundancy Pixel Coding using the DCT • As human eyes are insensitive to HF color changes, the R,G, B signal is converted into a luminance and two color difference signals. We can remove redundancy more on U, V than on Y. • The top left DCT component is taken as the dc datum for the block. • DCT coefficients to the right are increasingly higher horizontal spatial freqs. DCT coefficients below are higher vertical spatial frequencies. Department of Computer Science and Engineering, Hanyang University Inter-frame compression Activity calculator Rate control Field/Frame DCT selector SOURCE INPUT Frame reordering Field/Frame memory + + MQ Side informations DCT VLC MUX Q BUFFER CODED BITSTREAM De Q Motion estimator 1 IDCT + Adaptive predictor Field/Frame memory Motion estimator 2 Side informations Department of Computer Science and Engineering, Hanyang University Temporal redundancy Inter-frame prediction & motion estimation • This really reduces the overall bit rate from frame to frame! Department of Computer Science and Engineering, Hanyang University Motion estimation Department of Computer Science and Engineering, Hanyang University Putting it all together I, P, B Frames • The Intra Frames contain full picture information • Predicted(P) Frames are predicted from past I, or P frames • Bi-directional predicted frames offer the greatest compression and use past and future I & P frames for motion compensation. Department of Computer Science and Engineering, Hanyang University Building the elementary stream • This slide shows how the actual blocks, slices, frames etc. are all put together to form the elementary stream • Along with the actual picture data, header information is required to reconstruct the I, B, P frames. This header structure is shown. • The next stage is to take this ES and convert it into something that can be transmitted and decoded at the other end. Department of Computer Science and Engineering, Hanyang University Ordering frames Frame Reordering Department of Computer Science and Engineering, Hanyang University MPEG-4 MPEG-4( ISO/IEC 14496 ) Applications : Internet Multimedia Wireless Multimedia Communication Multimedia Contents for Computers and Consumer Electronics Interactive Digital TV Coding scheme : Spatial redundancy : DCT + Quantization, Wavelet Transform Temporal redundancy : Motion estimation and compensation Statistical redundancy : VLC (Huffman Coding, Arithmetic Coding) Shape Coding : Context-based Arithmetic Coding References : - ISO/IEC 14496 Department of Computer Science and Engineering, Hanyang University Interactive television Department of Computer Science and Engineering, Hanyang University Scene composition Department of Computer Science and Engineering, Hanyang University MPEG-4 Department of Computer Science and Engineering, Hanyang University MPEG-4: Background Department of Computer Science and Engineering, Hanyang University MPEG-4: Concept Department of Computer Science and Engineering, Hanyang University MPEG-4: Scene composition Department of Computer Science and Engineering, Hanyang University MPEG-4 Video: Summary Department of Computer Science and Engineering, Hanyang University MPEG-4 Decoder Department of Computer Science and Engineering, Hanyang University Sprite in MPEG-4 Department of Computer Science and Engineering, Hanyang University SNR Scalability Department of Computer Science and Engineering, Hanyang University Spatial Scalability Department of Computer Science and Engineering, Hanyang University