Image and Video Compression: An Overview
Jayanta Mukhopadhyay
Department of Computer Science & Engineering
Indian Institute of Technology, Kharagpur, 721302, India
jay@cse.iitkgp.ernet.in

Data Compression
• Alternative description of data requiring less storage and bandwidth.
• Example: Uncompressed 1 Mbyte; Compressed (JPEG) 50 Kbyte (20:1)

• IMAGE COMPRESSION: The process of obtaining a representation of the image with fewer bits than the original.
• DECOMPRESSION: The process of recovering the image from its compressed form to its (almost) original form.
• LOSSLESS COMPRESSION: Compression without any loss of information. There is no difference between the original image and the reconstructed one.
• LOSSY COMPRESSION: Compression with partial loss of information. The reconstructed image differs from the original one (within an accepted limit).

Compression / Decompression

Different Forms of Alternative Representation
• Image Transforms
  - DCT
  - DWT
• Fractal Representation
  - Partial Iterated Function System
• Multi-resolution Representation
  - Laplacian Pyramid
  - Wavelet Subbands
• Differential Representation
  - DPCM

JPEG Scheme
$$C_{2e}\{x(n)\} = X_{II}^{(N)}(k) = \sqrt{\frac{2}{N}}\,\alpha(k)\sum_{n=0}^{N-1} x(n)\cos\left(\frac{(2n+1)\pi k}{2N}\right), \quad 0 \le k \le N-1$$

Quantization Table

DC and AC encoding

JPEG: Code Structure

JPEG2000 Scheme

Dyadic Decomposition

JPEG2000 9/7 wavelet filters

      Analysis Filter Coefficients      Synthesis Filter Coefficients
  i   Lowpass H(i)   Highpass G(i)      Lowpass H'(i)   Highpass G'(i)
  0      0.6029        -1.1151             1.1151         -0.6029
  1      0.2669         0.5913             0.5913          0.2669
  2     -0.0782         0.0575            -0.0575          0.0782
  3     -0.0169        -0.0913            -0.0913         -0.0169
  4      0.0267           -                   -           -0.0267

JPEG2000 Code Structure

Fractals
• Infinite detailing
• Continuous at every point but nowhere differentiable
• Self-similarity: rules of production are similar at every scale
• The whole can be described from a part and vice versa

BASIC COMPRESSION SCHEME
[Figure: range block R1, transformation W1, domain blocks D1 and D2]
For every RANGE BLOCK (R), find a suitable DOMAIN BLOCK (D) so that R ≈ W(D), where W is the transformation applied on D.

ADAPTIVE SCHEMES
• RANGE & DOMAIN SIZE
  - QUADTREE PARTITION
• DOMAIN POOL
  - TRIANGULATED PARTITION
  - LATTICE OF SEPARATION
• TRANSFORMATION SET
  - IGNORE ISOMETRIES
  - NON-LINEAR CONTRACTIVE TRANSFORM

DECOMPRESSION ALGORITHM
• Start with any arbitrary image.
• In every iteration, for each Range Block, apply the transformation on the corresponding Domain Block.
• After a sufficiently large number of iterations, the result converges to the target image (a lossy version of the original image).

MPEG2
• Uses DCT-based, JPEG-like coding for spatial compression and motion compensation for temporal compression.
• Lossy video compression and lossy audio compression.
• Core of most digital television and DVD formats.
• Designed to code standard-definition television at bit rates from about 3-15 Mbit/s and high-definition television at 15-30 Mbit/s.

[Figure: Encoding (DCT, Quant., Motion Estimation & Compensation, VLC) of a sequence of INTRA frames and motion-compensated inter frames; Decoding (IDCT, IQuant., Inverse Motion Compensation, VLC)]

Frame Types
• I Frames (Intra Pictures)
  - encoded using only information within that frame.
  - used as reference frames.
  - provide the least amount of compression.
• P Frames (Predicted Pictures)
  - encoded with reference to a previous I or P frame.
  - used as reference frames.
  - provide moderate compression.
• B Frames (Bidirectional Pictures)
  - require both past and future reference frames.
  - never used as reference.
  - provide the highest amount of compression.

MPEG2: Coding Sequence
  Frame:  1  2  3  4  5  6  7  8  1
  Type:   I  B  B  B  P  B  B  B  I
[Figure: arrows marking Forward Prediction and Bidirectional Prediction between the frames]

MPEG2 Encoder
[Block diagram: Video in, subtractor, DCT, Q, VLC, Coded bitstream out; feedback path through IQ, IDCT, adder, MCP]
(I)DCT = (Inverse) Discrete Cosine Transform
(I)Q = (Inverse) Quantization
MCP = Motion Compensated Prediction
VLC = Variable Length Coder

MPEG2 Decoder
[Block diagram: Coded bitstream in, VLD, IQ, IDCT, adder, Decoded video out; MCP feeding the adder]
(I)DCT = (Inverse) Discrete Cosine Transform
(I)Q = (Inverse) Quantization
MCP = Motion Compensated Prediction
VLD = Variable Length Decoder

Intra Frame Encoding
For each 8x8 block: DCT -> Quant. -> Zig-Zag Scan -> RLE -> Huffman -> ……011000011010

Motion Estimation & Prediction to construct Inter Frame (P/B-frames)
[Figure: block m in the predicted frame P and its match m' in the Reference frame, each frame drawn as a grid of numbered blocks]
e = m - m'

P-Frame Encoding
[Figure: the Best Match in the Reference is subtracted from the target block; the Difference goes through DCT + Quant. + RLE; the Motion Vector and the coded difference pass to the Huffman coder, producing 01101100]

B-Frame Encoding
[Figure: Target - 0.5 x [Past reference + Future reference] = difference; the difference goes through DCT + Quant. + RLE; the Motion vectors and the coded difference pass to the Huffman coder, producing 01101100]

MPEG-4

MJPEG2000 (Motion JPEG2000)

MCJ2K codec Structure
[Block diagram: Video in; I frames go to JPEG2000 Compression; P or B frames go through Motion Estimation/Compensation and then JPEG2000 Compression; JPEG2000 Decompression of I or P frames supplies the references; output is the Compressed Stream]

H.264/AVC: Encoder

Image Transform
• MPEG-2 uses the 8x8 DCT (Discrete Cosine Transform).
• H.264/AVC uses a separable integer 4x4 transform that approximates the 4x4 DCT.
• It can be computed with only 16-bit additions, subtractions, and shifts.
• The transform is repeated on the DC coefficients for 8x8 chroma and 16x16 Intra luma blocks.

Intra-frame Prediction
• Two modes for luma blocks
  - Intra 4x4
    • 9 modes
    • Used in textured areas
  - Intra 16x16
    • 4 modes
    • Used in flat areas
• One mode for chroma blocks
  - Similar to Intra 16x16
• I_PCM: bypasses the prediction and transform coding

Intra 16x16 & Chroma prediction
• 4 modes

Intra 4x4
• 9 modes

Inter-Frame Prediction
  - Multiple reference pictures and decoupling of reference order from display order.
  - Quarter-sample-accurate motion vectors.

Macro Block and Block size
[Figure: a 16x16 Macro Block divided into four 8x8 blocks]

Performance comparison
[Figure: H264/AVC frame and MPEG-2 frame]

Performance comparison contd.
[Figure: MPEG-2 vs. H264/AVC Baseline profile (foreman.cif video)]
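
The Intra Frame Encoding steps listed earlier (DCT, quantization, zig-zag scan, RLE) can be illustrated with a minimal Python sketch. It is not the codec's actual implementation. Several details are assumptions made for illustration: alpha(k) is taken as 1/sqrt(2) for k = 0 and 1 otherwise, a single uniform quantization step replaces the quantization table, the DC coefficient is run-length coded together with the AC coefficients, and the Huffman stage is omitted. All function names are hypothetical.

```python
import math

def dct_1d(x):
    """Type-II DCT of a sequence, with sqrt(2/N) scaling and alpha(k)."""
    n_pts = len(x)
    out = []
    for k in range(n_pts):
        alpha = 1 / math.sqrt(2) if k == 0 else 1.0  # assumed normalisation
        s = sum(x[n] * math.cos((2 * n + 1) * math.pi * k / (2 * n_pts))
                for n in range(n_pts))
        out.append(math.sqrt(2 / n_pts) * alpha * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def quantize(coeffs, step=16):
    """Uniform quantization (assumption: real tables vary per coefficient)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]

def zigzag(block):
    """Read an NxN block along anti-diagonals, alternating direction."""
    n = len(block)
    order = sorted(((r, c) for r in range(n) for c in range(n)),
                   key=lambda rc: (rc[0] + rc[1],
                                   rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [block[r][c] for r, c in order]

def run_length(seq):
    """(zero_run, value) pairs for nonzero values, closed by an 'EOB' marker."""
    pairs, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block marker
    return pairs

def encode_block(block, step=16):
    """DCT -> quantization -> zig-zag scan -> RLE for one square block."""
    return run_length(zigzag(quantize(dct_2d(block), step)))

flat_block = [[128] * 8 for _ in range(8)]
symbols = encode_block(flat_block)
```

A flat block keeps only its DC coefficient after quantization, so the zig-zag scan is a single nonzero value followed by zeros, which is the case where run-length coding pays off most.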
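
The motion estimation step for inter frames, where a block m is predicted by its best match m' in a reference frame and the residual e = m - m' is what gets coded, can be sketched the same way. This is an illustrative sketch under stated assumptions: it uses an exhaustive search over a small window with the sum of absolute differences as the matching cost, neither of which is specified in the slides, and the function names, block size, and search range are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def crop(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def best_match(target, ref, top, left, size=4, search=2):
    """Exhaustive search in a +/-search window of the reference frame.

    Returns the motion vector (dy, dx) and the residual e = m - m'.
    """
    m = crop(target, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if 0 <= y <= len(ref) - size and 0 <= x <= len(ref[0]) - size:
                cand = crop(ref, y, x, size)
                cost = sad(m, cand)
                if best is None or cost < best[0]:
                    best = (cost, (dy, dx), cand)
    _, mv, m_pred = best
    residual = [[p - q for p, q in zip(rm, rp)] for rm, rp in zip(m, m_pred)]
    return mv, residual

# Reference frame with a distinct value at every pixel.
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
# Target frame: the block at (2, 2) is a copy of the reference block at (3, 1).
target = [[0] * 8 for _ in range(8)]
for i in range(4):
    for j in range(4):
        target[2 + i][2 + j] = ref[3 + i][1 + j]

mv, e = best_match(target, ref, top=2, left=2)
```

When the match is exact, the residual is all zeros and only the motion vector carries information, which is why P and B frames compress better than I frames.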