Chapter 11. Image compression — Digital image processing

11. IMAGE COMPRESSION
11.1. Introduction
11.2. Pixel-level coding
 - PCM coding
 - Entropy coding
 - Run-length coding
 - Arithmetic coding
11.3. Predictive coding techniques
 - Principles of predictive coding
 - Delta modulation
 - 1-D DPCM
 - 2-D DPCM
 - Comparison of the performances of various predictive coding methods
11.4. Transform-based image compression techniques
 - Bit allocation
 - Transform-based image compression
 - Zonal coding; threshold-based coding
 - Transform-based coding with visibility criteria
 - Adaptive transform-based coding
 - 2-D DCT-based image coding
 - Image compression algorithm based on the DCT
11.5. Hybrid coding; vectorial DPCM-based coding
11.6. Interframe coding
 - Principles of inter-frame coding
 - Predictive coding with motion compensation
 - Motion estimation algorithms
 - Hybrid inter-frame coding

11.1. INTRODUCTION
• Def. Compression = minimization of the number of bits needed to represent the visual information in the scene, without noticeable loss of information
• Applications:
 - image transmission
 - image storage in multimedia/image databases
The need for image compression:
• huge sizes of the data files in uncompressed form
• storage devices have relatively slow access => impossible to render/index image information in real time
• limited bandwidth of the communication channels => impossible to transmit images in real time

File sizes and bandwidth by media type:
• TEXT (ASCII, EBCDIC): 2 kB/page
• IMAGES (bitmap graphics, fax, still images): simple (320x200x8 bits) – 77 kB/image; complex (true color, 1100x900x24 bits) – 3 MB/image
• AUDIO (uncoded audio or voice sequences): voice/telephone, 8 kHz/8 bits (mono) – 6–44 kB/s; audio CD, 44.1 kHz, 16 bits, stereo – 176 kB/s (44.1 kHz x 2 ch x 16 bits)
• ANIMATION (sound and images at 15–19 frames/s): sequences at 16 bits/pixel, 16 frames/s – 6.5 MB/s (320x640x16 bits x 16 frames/s)
• VIDEO (analog TV, or images at 24–30 frames/s): 24-bit color sequences at 30 frames/s – 27.6 MB/s (640x480x24 bits x 30 frames/s)

Typical compression ratios (memory needed for storage before vs. after compression): text 2:1, color images 15:1, maps 10:1, stereo sound 6:1, animation 50:1, video 50:1.

• Shannon => the average amount of information per symbol (the entropy):
  H = −Σ_{i=1}^{L} p_i log2 p_i  [bits/symbol],  L = number of symbols
=> it is possible to losslessly encode a source having the entropy H on (H+ε) bits/symbol, ε>0, ε→0.
• The compression factor (ratio) C:
  C = (average number of bits/symbol of the uncompressed data) / (average number of bits/symbol of the compressed data) = B/(H+ε) ≈ B/H
• Maximal (ideal) compression can be achieved by vector quantization (theoretically). Practically – near-optimal only.

11.2. PIXEL-LEVEL CODING
11.2.1. PCM (pulse code modulation)
  u(t) → [sample & hold] → u(nT) → [zero-memory quantizer; encoder] → g_i(n) → transmission (digital modulation)
The minimal number of bits per pixel achievable by PCM:
  R_PCM = (1/2) log2(σ_u²/σ_q²)
σ_u² = the variance of the signal at the input of the quantizer; σ_q² = the variance of the quantization error.

11.2.2. Entropy coding (Huffman)
= coding one block of M pixels, M – small; without compression: B bits/pixel needed => MB bits/block => the source has L = 2^(MB) symbols => any block can be encoded on fewer than MB bits, according to:
  average bit rate = H + ε,  H = −Σ_{i=0}^{L−1} p_i log2 p_i
=> entropy coding principle: use a codebook with variable-length code words; the length of the code word differs from block to block, as follows:
 - the more frequent blocks (p_i large) => −log2 p_i smaller => shorter code word
 - the less frequent blocks (p_i small) => −log2 p_i larger => longer code word

• Huffman coding algorithm:
1) Order the probabilities p_i, i = 1, 2, …, L decreasingly and write them in a list of symbols.
2) Choose from the list the last 2 elements (those with the smallest probabilities, p_k, p_l). From the two => create a node with the probability p_k + p_l => insert it into the ordered list at the right position; erase p_k, p_l from the list.
3) Assign the codes: 0 to k; 1 to l.
4) Repeat steps (1)…(3).

Example (binary tree):
  Symbol | p_i   | −log2(p_i) | Code | Subtotal
  A      | 15/39 | 1.38       | 0    | 1*15
  B      |  7/39 | 2.48       | 111  | 3*7
  C      |  6/39 | 2.70       | 110  | 3*6
  D      |  6/39 | 2.70       | 101  | 3*6
  E      |  5/39 | 2.96       | 100  | 3*5
  Total: 87 bits => average code length = 87/39 = 2.23 bits per symbol (close to the entropy, H ≈ 2.19 bits/symbol)

Decoding exercise:
  Symbol: A B C D E;  Code: 0 111 110 101 100
  The received string: 1000100010101010110111 (22 bits). Decode!
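The Huffman construction above can be sketched in Python (illustrative, not part of the original course materials; the symbol counts are taken from the example table). The sketch also computes the entropy for comparison and decodes the exercise string with the slide's code table:

```python
import heapq
import math
from itertools import count

def huffman_code_lengths(freqs):
    """Build a Huffman tree bottom-up; return {symbol: code length}."""
    tick = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(f, next(tick), {s: 0}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)   # the two least probable nodes...
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tick), merged))  # ...are merged
    return heap[0][2]

freqs = {'A': 15, 'B': 7, 'C': 6, 'D': 6, 'E': 5}   # counts out of 39
lengths = huffman_code_lengths(freqs)
avg_len = sum(freqs[s] * lengths[s] for s in freqs) / 39        # 87/39 = 2.23
H = -sum(f / 39 * math.log2(f / 39) for f in freqs.values())    # entropy bound

# Decode the exercise string with the slide's (prefix-free) code table:
table = {'0': 'A', '111': 'B', '110': 'C', '101': 'D', '100': 'E'}
def decode(bits):
    out, buf = [], ''
    for b in bits:
        buf += b
        if buf in table:
            out.append(table[buf]); buf = ''
    return ''.join(out)

decoded = decode('1000100010101010110111')
```

The average code length (87/39 ≈ 2.23 bits/symbol) stays within one bit of the entropy H ≈ 2.19, as guaranteed for Huffman codes.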
• Versions of Huffman coding:
* the truncated Huffman code
 - the first L1 symbols – Huffman encoded;
 - the remaining L−L1 symbols – encoded using a prefix code followed by a fixed-length code
* the modified Huffman code
 - the integer "i" is represented as i = qL1 + j
 - encode the first L1 symbols with a Huffman code
 - if above L1 => encode q by a prefix code + a terminator code = the Huffman code of j
   i = qL1 + j,  0 ≤ q ≤ Int[(L−1)/L1],  0 ≤ j ≤ L1−1

11.2.3. RLC (run-length coding)
=> For binary images, 0 – white, 1 – black; p(white) >> p(black); encode the lengths of the white runs
=> applications: fax; sketches
p(white) = p(0) = p; p(black) = p(1) = q; p → 1.
Notations: M = maximum length of a run of zeros; M = 2^m − 1 => m bits to encode one run.

11.2.3. RLC – continued
The probability of appearance of a run of l symbols '0', 0 ≤ l ≤ M, followed by a '1':
  g(l) = p^l (1−p),  0 ≤ l ≤ M−1
  g(M) = p^M,        l = M
The average number of symbols per run:
  μ = Σ_{l=0}^{M−1} (l+1) p^l (1−p) + M p^M = (1−p^M)/(1−p)
The compression ratio:
  C = μ/m = (1−p^M)/[m(1−p)]
The average number of bits per pixel:
  B_a = m/μ  (m = the number of bits to encode a run of '0's terminated by the '1' bit; μ = the average number of '0' symbols contained in a run)
The coding efficiency: η = H/B_a; it can be increased by Huffman coding or by arithmetic coding of the run lengths (instead of the uniform m-bit encoding).

11.2.4. Arithmetic coding
=> A single codeword is used for an entire sequence of symbols.
=> The code = a number in a sub-range of [0; 1).

  Source symbol | Probability | Initial sub-range
  a1            | 0.2         | [0.0; 0.2)
  a2            | 0.2         | [0.2; 0.4)
  a3            | 0.4         | [0.4; 0.8)
  a4            | 0.2         | [0.8; 1.0)

The sequence to encode: a1 a2 a3 a3 a4. At each step the current interval is partitioned in proportion to the symbol probabilities; the successive intervals are:
  [0; 1) → [0; 0.2) → [0.04; 0.08) → [0.056; 0.072) → [0.0624; 0.0688) → [0.06752; 0.0688)
=> codeword: any number in the final interval, e.g. 0.068

* Decoding:
1) 0.068 ∈ [0.0; 0.2) => a1;  (0.068−0.0)/(0.2−0.0) = 0.34
2) 0.34 ∈ [0.2; 0.4) => a2;  (0.34−0.2)/(0.4−0.2) = 0.7
Arithmetic decoding – continued:
3) 0.7 ∈ [0.4; 0.8) => a3;   (0.7−0.4)/(0.8−0.4) = 0.75
4) 0.75 ∈ [0.4; 0.8) => a3;  (0.75−0.4)/(0.8−0.4) = 0.875
5) 0.875 ∈ [0.8; 1.0) => a4
=> The decoded sequence: a1 a2 a3 a3 a4

* Drawbacks of arithmetic coding:
 - limited by the representation precision for long sequences
 - an end-of-message flag is needed
* Alternative solutions: re-normalization and rounding

11.2.5. Bit-plane encoding
=> e.g., 8 planes of 1 bit, independently encoded (MSB … LSB)
=> disadvantage: sensitive to errors

11.3. PREDICTIVE CODING
11.3.1. DPCM coding principles
= Maximize image compression efficiency by exploiting the spatial redundancy present in an image!
The line-by-line luminance differences contain many values close to zero => spatial redundancy: the brightness almost repeats from one point to the next => there is no need to encode all the brightness information, only the new part!
=> one can "estimate" ("predict") the brightness of the subsequent spatial point based on the brightness of the previous (one or more) spatial points = PREDICTIVE CODING

11.3.1. DPCM coding principles – continued – BASIC FORMULATION OF DPCM:
Let {u(n)} be the image pixels, represented line by line as a vector. Assume u(0), u(1), …, u(n−1) have already been encoded; at the decoder only their decoded versions (original + coding error) are available, denoted u•(0), u•(1), …, u•(n−1).
=> currently to do: predictively encode u(n), the current sample:
(1) estimate (predict) the grey level of the n-th sample based on the previously encoded neighbor pixels:
  û(n) = ψ(u•(n−1), u•(n−2), …)
(2) compute the prediction error:
  e(n) = u(n) − û(n)
(3) quantize the prediction error e(n) and keep the quantized value e•(n); encode e•(n) by PCM and transmit it.
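The numeric example above, as a minimal Python sketch (illustrative; note that float precision limits it to short sequences, which is exactly the drawback of arithmetic coding mentioned above):

```python
# Model from the slides: a1..a4 with probabilities 0.2, 0.2, 0.4, 0.2
RANGES = {'a1': (0.0, 0.2), 'a2': (0.2, 0.4), 'a3': (0.4, 0.8), 'a4': (0.8, 1.0)}

def ar_encode(seq):
    """Shrink [0,1) once per symbol; any number in the final interval is a codeword."""
    low, high = 0.0, 1.0
    for s in seq:
        lo, hi = RANGES[s]
        low, high = low + (high - low) * lo, low + (high - low) * hi
    return low, high

def ar_decode(x, n):
    """Invert: locate x in a sub-range, emit its symbol, rescale x inside it."""
    out = []
    for _ in range(n):
        for s, (lo, hi) in RANGES.items():
            if lo <= x < hi:
                out.append(s)
                x = (x - lo) / (hi - lo)
                break
    return out

low, high = ar_encode(['a1', 'a2', 'a3', 'a3', 'a4'])   # final interval
decoded = ar_decode(0.068, 5)
```

The final interval is [0.06752; 0.0688), so 0.068 is a valid codeword, and decoding it reproduces a1 a2 a3 a3 a4.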
At the decoder:
1) decode => obtain e•(n);
2) build the prediction û(n);
3) construct u•(n) = û(n) + e•(n).
* The overall coding error:
  u(n) − u•(n) = e(n) − e•(n) = q(n),
i.e., exactly the quantization error of the prediction error.

Basic DPCM codec
[Fig.: DPCM codec: (a) with distortions; (b) without distortions]

11.3.2. Delta modulation
=> the simplest DPCM encoder: prediction from 1 sample, 1-bit quantizer.
  û(n) = u•(n−1);  e(n) = u(n) − u•(n−1)
=> alternatively, one can predict the value of the sample as:
  û(n) = ρ u•(n−1) + δ(n),
with δ(n) = a zero-mean error and ρ = the factor which minimizes the mean square error (the leak factor), given by the error covariance equation:
  E[δ(n)δ(m)] = (1 − ρ²) σ_u² ρ^|m−n|
=> the SNR can be estimated as:
  (SNR)_DM = 10 log10 { [1 − (2ρ−1) f(1)] / [2(1−ρ) f(1)] } dB
[Fig.: delta modulation result for images, quantization step q = 6.5, ρ = 1: original vs. compressed (compression ratio 8:1)]

11.3.3. 1-D DPCM (line-by-line DPCM)
=> the usual way of applying DPCM to digital images. The sequence to encode = one image line; considering the sequence {u(n)} stationary, one can use a p-th order linear predictor:
  u(n) = Σ_{k=1}^{p} a(k) u•(n−k) + e(n),
where the a(k) are chosen to minimize the mean square prediction error E[e(n)²].
** The equations of the 1-D DPCM codec:
(1) Predictor: û(n) = Σ_{k=1}^{p} a(k) u•(n−k)
(2) Quantizer's input: e(n) = u(n) − û(n)
(3) Quantizer's output: e•(n)
(4) Decoder: u•(n) = û(n) + e•(n) = Σ_{k=1}^{p} a(k) u•(n−k) + e•(n)
=> Estimation of the SNR:
  (SNR)_DPCM = 10 log10 { [1 − (2ρ−1) f(B)] / [2(1−ρ) f(B)] } [dB]
=> Average bit-rate reduction vs. PCM:
  R_PCM − R_DPCM = (1/2) log2 [1/(1−ρ²)] bits/pixel

11.3.3. 1-D DPCM – Comparison to PCM; the effect of error propagation:
[Figures: PCM on 8 bits; the error in DPCM on 3 bits; PCM on 3 bits; DPCM on 3 bits]
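The DPCM loop above — with the predictor driven by reconstructed samples, so the decoder can track the encoder — can be sketched as follows (a first-order predictor with a hypothetical uniform quantizer and made-up signal values, not the course's exact codec):

```python
def dpcm(signal, rho=0.95, step=4.0):
    """First-order DPCM: predict rho * previous reconstruction, then
    quantize the prediction error uniformly with the given step."""
    prev = 0.0
    codes, recon = [], []
    for u in signal:
        pred = rho * prev            # prediction from the reconstructed past
        e = u - pred                 # prediction error
        q = round(e / step)          # transmitted integer code
        u_rec = pred + q * step      # the decoder forms the same value
        codes.append(q)
        recon.append(u_rec)
        prev = u_rec                 # predict from u•(n), not u(n)
    return codes, recon

sig = [10.0, 12.0, 13.0, 11.0, 9.0, 9.5, 12.0]
codes, recon = dpcm(sig)
err = max(abs(u - r) for u, r in zip(sig, recon))
```

Because the predictor uses the reconstructed samples, the overall error u(n) − u•(n) equals the quantization error of e(n), so it stays bounded by step/2 and does not accumulate.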
1-D DPCM – Comparison to PCM; the effect of error propagation (continued).

11.3.4. 2-D DPCM
=> prediction as a function of one neighbor on the same line and 3 neighbors on the previous line (with n the column index and m the line index, the causal neighborhood is B C D on line m−1 and A on line m, X = the current pixel):
  û(m,n) = a1 u•(m−1,n) + a2 u•(m,n−1) + a3 u•(m−1,n−1) + a4 u•(m−1,n+1)
with the a_i chosen to minimize E[e(m,n)²]; this gives the normal equations:
  r(1,0)  = a1 r(0,0)  + a2 r(1,−1) + a3 r(0,1) + a4 r(0,1)
  r(0,1)  = a1 r(1,−1) + a2 r(0,0)  + a3 r(1,0) + a4 r(1,−2)
  r(1,1)  = a1 r(0,1)  + a2 r(1,0)  + a3 r(0,0) + a4 r(0,2)
  r(1,−1) = a1 r(0,1)  + a2 r(1,−2) + a3 r(0,2) + a4 r(0,0)
The minimum prediction-error variance:
  σ_e² = E[e²(m,n)] = r(0,0) − a1 r(1,0) − a2 r(0,1) − a3 r(1,1) − a4 r(1,−1)
where r(k,l) = the covariance of u(m,n). If r(k,l) is separable, r(k,l) = σ² ρ1^|k| ρ2^|l|, then:
  a1 = ρ1,  a2 = ρ2,  a3 = −ρ1ρ2,  a4 = 0,  σ_e² = σ² (1−ρ1²)(1−ρ2²)

11.3.4. 2-D DPCM – continued
** Equations of 2-D DPCM:
(1) Predictor: û(m,n) = a1 u•(m−1,n) + a2 u•(m,n−1) + a3 u•(m−1,n−1) + a4 u•(m−1,n+1)
(2) Quantizer's input: e(m,n) = u(m,n) − û(m,n)
(3) Quantizer's output: e•(m,n)
(4) Decoder: u•(m,n) = û(m,n) + e•(m,n)

** Comparison of the performances of the predictive coding methods:
[Fig.: SNR (dB) vs. rate (bits/sample, 1–4): 2-D DPCM (maximum) performs best, followed by 2-D DPCM with three-point prediction, line-by-line (1-D) DPCM, and PCM]

11.4. TRANSFORM-BASED IMAGE COMPRESSION ("block quantization")
• The essence of transform-based image compression:
(1) an image block is transformed using a unitary transform, so that a large percentage of the block's energy is packed into a small number of transform coefficients;
(2) the transform coefficients are independently quantized.
• The optimal transform-based encoder = the encoder which minimizes the mean square error of the decoded data, for a given number of bits = the Karhunen-Loeve based encoder.
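A minimal sketch of the energy-packing idea behind transform coding (assuming, for illustration only, a 2×2 orthonormal transform — a 45° rotation — rather than the KLT of a real image): a correlated pair of samples is rotated so that almost all of its energy lands in one coefficient, and since A is unitary, B = Aᵀ = A⁻¹ reconstructs exactly when no quantizer is in the loop:

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# A 45-degree rotation: a stand-in for a unitary decorrelating transform.
c = math.sqrt(0.5)
A = [[c, c], [-c, c]]
B = transpose(A)                 # for a real unitary A: B = A^T = A^-1

u = [3.0, 2.9]                   # two highly correlated "pixels"
v = matvec(A, u)                 # v[0] carries almost all of the energy
u_rec = matvec(B, v)             # exact reconstruction without a quantizer

energy_share = v[0] ** 2 / (v[0] ** 2 + v[1] ** 2)
```

Here v[1] is tiny, so it could be coarsely quantized (or dropped) with little distortion — which is exactly what the bit-allocation step below exploits.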
• Transform-based image coding algorithm:
u = a random vector, u[N×1], of zero mean; v = A∙u – a linear transform of u, such that the coefficients v(k) are uncorrelated => they can be quantized independently => v'(k); u' = B∙v' – the inverse transform.

11.4.1. Main issues to be solved in transform-based image coding:
(1) Finding the optimal transform matrices A[N×N], B[N×N], able to minimize the mean square distortion for a given number of bits:
  D = (1/N) Σ_{n=0}^{N−1} E[(u(n) − u'(n))²] = (1/N) E[(u − u')ᵀ(u − u')]
(2) Designing the optimal quantizers for v, to minimize D.
• Approaches to solve these issues:
(1) For a given quantizer and forward transform matrix A, the optimal reconstruction matrix B is:
  B = A⁻¹ Λ,  Λ = diag(λ_0, λ_1, …, λ_{N−1}),  λ_k = E[v(k) v'*(k)] / E[|v'(k)|²]
• The optimal matrix A from the maximum-decorrelation point of view = the KLT of u; the rows of A = the eigenvectors of the covariance matrix R of u. Then B = A⁻¹ = A*ᵀ. Unfortunately, the KLT does not have a fast computation algorithm => in compression it is typically replaced by the DCT, DST, DFT, Hadamard or Slant transforms – a good compromise between decorrelation performance and computational speed.

11.4.2. Bit allocation:
• The variance of v(k) is typically non-uniform => a different number of bits is needed for each coefficient to reach the same quantization error; v(k) → n_k bits
• The optimal design of the transform-based encoder: how to efficiently allocate the total available number of bits among the transform coefficients, so that D is minimized; for A – some unitary matrix, B = A⁻¹ = A*ᵀ:
  D = (1/N) Σ_{k=0}^{N−1} E[|v(k) − v'(k)|²] = (1/N) Σ_{k=0}^{N−1} σ_k² f(n_k)
where: σ_k² = the variance of v(k); n_k = the number of bits allocated for the quantization of v(k); f(n_k) = the distortion-rate function of the quantizer, monotonically decreasing, f(0) = 1, f(∞) = 0.
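The allocation problem above can be sketched with a simple greedy scheme (an assumption for illustration — f(n) = 2^(−2n) is used as the quantizer's distortion-rate model, and the variances are made up; one bit at a time goes to the coefficient whose distortion drops the most):

```python
def allocate_bits(variances, B):
    """Greedy integer bit allocation: at each step give one bit to the
    coefficient with the largest distortion decrease
    sigma_k^2 * (f(n_k) - f(n_k + 1)), with f(n) = 2**(-2n)."""
    N = len(variances)
    f = lambda n: 2.0 ** (-2 * n)
    n_k = [0] * N
    for _ in range(N * B):           # distribute N*B bits in total
        gains = [v * (f(n) - f(n + 1)) for v, n in zip(variances, n_k)]
        n_k[gains.index(max(gains))] += 1
    return n_k

variances = [16.0, 4.0, 1.0, 0.25]   # hypothetical coefficient variances
n_k = allocate_bits(variances, B=2)  # 8 bits total for 4 coefficients
```

As expected, the high-variance (low-frequency) coefficients receive most of the bits, and the smallest-variance coefficient may receive none — the integer counterpart of the threshold θ in the analytic solution.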
• The transform-based encoder rate is: R_A = (1/N) Σ_{k=0}^{N−1} n_k = B; B = the average number of bits/sample.
• Bit allocation = finding the optimal n_k, n_k ≥ 0, which minimize D for a given average number of bits/sample B.

» Bit-allocation algorithm:
(1) Define h(x) = f'⁻¹(x), where f'(x) = df(x)/dx.
(2) Find θ = the root of the non-linear equation:
  R_A = (1/N) Σ_{k: σ_k² > θ} h(θ f'(0)/σ_k²) = B     (11.30)
  Only the coefficients having σ_k² > θ are encoded.
(3) The number of bits n_k is:
  n_k = 0,                σ_k² ≤ θ
  n_k = h(θ f'(0)/σ_k²),  σ_k² > θ     (11.31)
=> The minimum mean distortion:
  D = (1/N) [ Σ_{k: σ_k² > θ} σ_k² f(n_k) + Σ_{k: σ_k² ≤ θ} σ_k² ]     (11.32)
• If instead of imposing B one imposes D = d: (1) solve (11.32) => θ; (2) solve (11.31) => n_k; (3) estimate B.

» Practical (simplified) algorithm for allocating integer n_k:
= iterative algorithm, j – the current iteration:
(1) Initialize: n_k⁰ = 0, 0 ≤ k ≤ N−1; j = 1.
(2) Compute: n_k^j = n_k^{j−1} + δ(k − I), where I = argmax_k { σ_k² [ f(n_k^{j−1}) − f(n_k^{j−1} + 1) ] }, i.e., one more bit goes to the coefficient yielding the largest distortion decrease.
(3) If Σ_{k=0}^{N−1} n_k^j = N∙B => stop the algorithm. Otherwise set j = j + 1 and repeat (2).

11.4.3. Transform-based image compression:
• The image U[M×N], U = {u(m,n)}, rearranged as a vector u[MN×1] => A[MN×MN], B[MN×MN], v[MN×1].
• In practice: use the DCT, DST, DFT, Slant or Hadamard transforms – fast algorithms.
• Practical implementation of transform-based image compression:
(1) divide the image into small blocks: U[M×N] => I = MN/(pq) blocks U_i[p×q], i = 0, 1, …, I−1, p << M, q << N;
(2) transform + independently encode each block U_i[p×q].
=> Advantages:
(a) memory-efficient implementation of the transform (matrices MN/(pq) times smaller);
(b) the number of operations needed to compute the transform decreases by a factor of log2(MN)/log2(pq).
=> Disadvantage: a lower compression performance than with full-size transforms.
[Fig. 11.11: compression rate r (bits/pixel) vs. block size (2x2 … 128x128) for a KLT-based compression scheme: r drops from ≈2.5 bits/pixel for 2x2 blocks to ≈0.5 bits/pixel for 128x128 blocks]
• Practical algorithm for transform-based image compression (Fig. 11.12 – 2-D transform image coding):
  Encoding: each block U_i[p×q] → V_i = A_p U_i A_qᵀ → quantizer → transmission/storage.
  Decoding: U'_i = A_p*ᵀ V'_i A_q*.
* The steps of the algorithm:
(1) Divide the input image into I = MN/(pq) blocks U_i and transform them (2-D) to get V_i, i = 0, …, I−1.
(2) Bit allocation: e.g., a bit-allocation matrix for the DCT of a 16x16-pixel block.
(3) Quantizer design.
(4) Encode the quantizer's output.
(5) Decode & inverse-quantize the coefficients – at the decoder:
  v~_i(k,l) = v'_i(k,l) (inverse-quantized) for (k,l) ∈ I_t;  v~_i(k,l) = 0 for (k,l) ∉ I_t.
=> One can compute the mean distortion and the mean bit rate:
  D = (1/pq) Σ_{k=0}^{p−1} Σ_{l=0}^{q−1} σ_{k,l}² f(n_{k,l}),   R = (1/pq) Σ_{k=0}^{p−1} Σ_{l=0}^{q−1} n_{k,l}
[Fig.: mean distortion (10⁻³ … 10⁻¹) vs. rate R (0.25 … 4.0 bits/pixel) for the Cosine, KL, Hadamard and Sine transforms]

11.4.4. Zonal coding and threshold-based coding:
• Zonal coding: according to the bit-allocation matrix => transmit just the N_t transform coefficients with large variances => the zonal mask = a mask with value 1 for the transmitted coefficients, 0 otherwise:
  m(k,l) = 1 for (k,l) ∈ I_t;  m(k,l) = 0 otherwise
• Threshold-based coding: transmit only the coefficients whose amplitude exceeds a threshold η (regardless of their variance); the set of transmitted coefficients:
  I'_t = { (k,l) : |v(k,l)| > η }
  m(k,l) = 1 for (k,l) ∈ I'_t;  m(k,l) = 0 otherwise  = the threshold-based mask

11.4.5. Transform-based coding with visibility criteria:
a. The encoding system: brightness/contrast transform f(∙) → L(m,n) → unitary transform (V = F U Fᵀ) → visual quantization → V' → inverse transform (U' = F*ᵀ V' F*) → inverse contrast/brightness transform f⁻¹(∙) → L'(m,n).
b. Visual quantization based on visibility criteria: v(k,l) → frequency filtering, w(k,l) = H(k,l) v(k,l) → MMSE quantizer → w'(k,l) → inverse frequency filtering, v'(k,l) = w'(k,l)/H(k,l).
Fig. 11.16 Transform-based codec with visibility criteria (F – the unitary Fourier transform matrix)

11.4.6. Adaptive transform-based coding:
• adapt the transform
• adapt the bit allocation
• adapt the quantization levels

11.4.7. 2-D DCT-based coding:
  v(k,l) = α(k) α(l) Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} u(m,n) cos[(2m+1)kπ/(2N)] cos[(2n+1)lπ/(2N)]
  u(m,n) = Σ_{k=0}^{N−1} Σ_{l=0}^{N−1} α(k) α(l) v(k,l) cos[(2m+1)kπ/(2N)] cos[(2n+1)lπ/(2N)]
  with α(0) = √(1/N) and α(k) = √(2/N) for 1 ≤ k ≤ N−1.
[Fig. 11.17: the influence of the different DCT coefficients (basis images)]

• DCT-based encoding algorithm: illustrated on the following slides (figures only in the original).
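The 2-D DCT pair above, written directly from the formulas (a sketch: O(N⁴) per block, which is fine for small blocks — real codecs use fast separable algorithms; the sample block values are hypothetical):

```python
import math

def dct2(u):
    """Direct 2-D DCT-II with the normalization above:
    alpha(0) = sqrt(1/N), alpha(k) = sqrt(2/N) otherwise."""
    N = len(u)
    a = lambda k: math.sqrt((1 if k == 0 else 2) / N)
    return [[a(k) * a(l) * sum(u[m][n]
                * math.cos((2 * m + 1) * k * math.pi / (2 * N))
                * math.cos((2 * n + 1) * l * math.pi / (2 * N))
                for m in range(N) for n in range(N))
             for l in range(N)] for k in range(N)]

def idct2(v):
    """Inverse 2-D DCT (the synthesis formula above)."""
    N = len(v)
    a = lambda k: math.sqrt((1 if k == 0 else 2) / N)
    return [[sum(a(k) * a(l) * v[k][l]
                * math.cos((2 * m + 1) * k * math.pi / (2 * N))
                * math.cos((2 * n + 1) * l * math.pi / (2 * N))
                for k in range(N) for l in range(N))
             for n in range(N)] for m in range(N)]

u = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
v = dct2(u)        # the energy concentrates in v[0][0], the DC coefficient
u_rec = idct2(v)   # the inverse transform recovers the block
```

With this normalization the transform is orthonormal, so v[0][0] equals the block sum divided by N, and forward + inverse reconstructs the block up to floating-point rounding.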