Data Compression: Advanced Topics Huffman Coding Algorithm Motivation Procedure Examples Unitary Transforms Definition Properties Applications EE465: Introduction to Digital Image Processing 1 Recall: Variable Length Codes (VLC) Recall: Self-information I ( p) log 2 p It follows from the above formula that a small-probability event contains much information and therefore worth many bits to represent it. Conversely, if some event frequently occurs, it is probably a good idea to use as few bits as possible to represent it. Such observation leads to the idea of varying the code lengths based on the events’ probabilities. Assign a long codeword to an event with small probability Assign a short codeword to an event with large probability EE465: Introduction to Digital Image Processing 2 Two Goals of VLC design • achieve optimal code length (i.e., minimal redundancy) For an event x with probability of p(x), the optimal code-length is –log2p(x) , where x denotes the smallest integer larger than x (e.g., 3.4=4 ) code redundancy: r l H ( X ) 0 Unless probabilities of events are all power of 2, we often have r>0 • satisfy uniquely decodable (prefix) condition EE465: Introduction to Digital Image Processing 3 “Big Question” How can we simultaneously achieve minimum redundancy and uniquely decodable conditions? D. Huffman was the first one to think about this problem and come up with a systematic solution. EE465: Introduction to Digital Image Processing 4 Huffman Coding (Huffman’1952) Coding Procedures for an N-symbol source Source reduction List all probabilities in a descending order Merge the two symbols with smallest probabilities into a new compound symbol Repeat the above two steps for N-2 steps Codeword assignment Start from the smallest source and work back to the original source Each merging point corresponds to a node in binary codeword tree EE465: Introduction to Digital Image Processing 5 Example-I Step 1: Source reduction symbol x S N E W p(x) 0.5 0.25 0.125 0.125 0.5 0.25 0.25 (EW) 0.5 0.5 (NEW) compound symbols EE465: Introduction to Digital Image Processing 6 Example-I (Con’t) Step 2: Codeword assignment symbol x p(x) NEW 0.5 0.5 0.5 S 0 N 1 0 S 0.25 0.25 0 0.5 EW 10 0 E 0.125 N 1 0 0.25 1 W 0.125 1 111 110 W E 1 0 EE465: Introduction to Digital Image Processing codeword 0 0 1 10 110 111 7 Example-I (Con’t) 1 0 NEW 0 1 0 S EW 10 N 1 0 110 W E 0 1 NEW 1 0 1 S or EW 01 N 1 0 000 001 W E The codeword assignment is not unique. In fact, at each merging point (node), we can arbitrarily assign “0” and “1” to the two branches (average code length is the same). EE465: Introduction to Digital Image Processing 8 Example-II Step 1: Source reduction symbol x e a i o u p(x) 0.4 0.4 0.2 0.2 0.1 0.1 0.2 0.2 0.2 (ou) 0.4 0.6 (aiou) 0.4 (iou) 0.2 0.4 compound symbols EE465: Introduction to Digital Image Processing 9 Example-II (Con’t) Step 2: Codeword assignment symbol x e a i o u p(x) 0.4 0.4 0.2 0.2 0.1 0.1 0.2 0.2 0.4 0.4 (iou) 0.2 0.6 0 (aiou) 0.4 1 0.2 (ou) codeword 1 01 000 0010 0011 compound symbols EE465: Introduction to Digital Image Processing 10 Example-II (Con’t) 0 1 (aiou) e 00 01 (iou) a 000 001 (ou) i 0010 0011 o u binary codeword tree representation EE465: Introduction to Digital Image Processing 11 Example-II (Con’t) symbol x e a i o u 5 p(x) 0.4 0.2 0.2 0.1 0.1 codeword length 1 1 01 2 3 000 0010 4 0011 4 l pi li 0.4 1 0.2 2 0.2 3 0.1 4 0.1 4 2.2bps i 1 5 H ( X ) pi log 2 pi 2.122bps r l H ( X ) 0.078bps i 1 If we use fixed-length codes, we have to spend three bits per sample, which gives code redundancy of 3-2.122=0.878bps EE465: Introduction to Digital Image Processing 12 Example-III Step 1: Source reduction compound symbol EE465: Introduction to Digital Image Processing 13 Example-III (Con’t) Step 2: Codeword assignment compound symbol EE465: Introduction to Digital Image Processing 14 Summary of Huffman Coding Algorithm Achieve minimal redundancy subject to the constraint that the source symbols be coded one at a time Sorting symbols in descending probabilities is the key in the step of source reduction The codeword assignment is not unique. Exchange the labeling of “0” and “1” at any node of binary codeword tree would produce another solution that equally works well Only works for a source with finite number of symbols (otherwise, it does not know where to start) EE465: Introduction to Digital Image Processing 15 Data Compression: Advanced Topics Huffman Coding Algorithm Motivation Procedure Examples Unitary Transforms Definition Properties Applications EE465: Introduction to Digital Image Processing 16 An Example of 1D Transform with Two Variables x2 y2 y1 (1,1) (1.414,0) x1 y1 1 1 1 x1 1 1 1 y 1 1 x y Ax , A 1 1 2 2 2 2 Transform matrix EE465: Introduction to Digital Image Processing 17 Decorrelating Property of Transform x2 y1 y2 x1 x1 and x2 are highly correlated p(x1x2) p(x1)p(x2) y1 and y2 are less correlated p(y1y2) p(y1)p(y2) Please use MATLAB demo program to help your understanding why it is desirable to have less correlation for image compression EE465: Introduction to Digital Image Processing 18 Transform=Change of Coordinates Intuitively speaking, transform plays the role of facilitating the source modeling Due to the decorrelating property of transform, it is easier to model transform coefficients (Y) instead of pixel values (X) An appropriate choice of transform (transform matrix A) depends on the source statistics P(X) We will only consider the class of transforms corresponding to unitary matrices EE465: Introduction to Digital Image Processing 19 Unitary Matrix Definition conjugate transpose A matrix A is called unitary if A-1=A*T Example 1 A 2 1 1 1 1 1 1 T , A A 1 1 1 1 2 Notes: transpose and conjugate can exchange, i.e., A*T=AT* For a real matrix A, it is unitary if A-1=AT EE465: Introduction to Digital Image Processing 20 Example 1: Discrete Fourier Transform (DFT) DFT Matrix: A N N a11 ... ... a N 1 ... ... a1N ... ... ... F ... ... ... ... ... a NN 1 akl WNkl , N WN e j 2 N ,WNN 1 N yk akl xl Im l 1 Re DFT: 1 yk N N xW l 1 l kl N EE465: Introduction to Digital Image Processing WN 21 Discrete Fourier Transform (Con’t) Properties of DFT matrix akl alk symmetry Proof: F FT unitary F 1 FT * F* Proof: If we denote P F FT * then we have j 2 ( k l ) n j 2 ( k l ) n N N 1 e 1 k l 1 1 * N pkl akn anl e j 2 ( k l ) n / N N n1 N 1 e n 1 0 k l 1 j 2kl / N N 1 n 1 r N 1 akl e r , (r 1) 1 r N n 0 P F FT * I (identity matrix) EE465: Introduction to Digital Image Processing 22 Example 2: Discrete Cosine Transform (DCT) A N N a11 ... ... a N 1 ... ... a1N ... ... ... C ... ... ... ... ... a NN 1 , k 1,1 l N N akl (2l 1)( k 1) 2 cos ,2 k N ,1 l N N 2N C C* , C1 CT real You can check it using MATLAB demo EE465: Introduction to Digital Image Processing 23 DCT Examples 1 C 2 N=2: N=4: C 1 1 1 1 Haar Transform 0.5000 0.5000 0.5000 0.5000 0.6533 0.2706 -0.2706 -0.6533 0.5000 -0.5000 -0.5000 0.5000 0.2706 -0.6533 0.6533 -0.2706 Here is a piece of MATLAB code to generate DCT matrix by yourself % generate DCT matrix with size of N-by-N Function C=DCT_matrix(N) for i=1:N; x=zeros(N,1);x(i)=1;y=dct(x);C(:,i)=y;end; end EE465: Introduction to Digital Image Processing 24 Example 3: Hadamard Transform 1 1 1 1 A n A1 A 1 1, A 2 n 2 2 n An H An H H* H 1 HT Here is a piece of MATLAB code to generate Hadamard matrix by yourself % generate Hadamard matrix N=2^{n} function H=hadamard(n) H=[1 1;1 -1]/sqrt(2); i=1; while i<n H=[H H;H -H]/sqrt(2); i=i+1; end EE465: Introduction to Digital Image Processing 25 1D Unitary Transform When the transform matrix A is unitary, the defined 1D transform y Ax is called unitary transform Forward Transform y Ax y1 a11 a1N x1 y x 2 2 y a a NN x N N N1 N yk akl xl j 1 Inverse Transform 1 *T xA yA y x1 a11* a*N 1 y1 x y2 2 * * x y a a N 1N N NN N xk alk* yl l 1 EE465: Introduction to Digital Image Processing 26 Basis Vectors y1 a11 a1N x1 y x 2 2 y a a NN x N N N1 N for for y xk bk , bk [ak1 ,..., akN ]T k 1 basis vectors corresponding to forward transform (column vectors of transform matrix A) x1 a11* a*N 1 y1 inv inv N x * * T y x y b , b [ a ,..., a ] 2 2 k k k 1k Nk k 1 * * x y N a1N aNN N basis vectors corresponding to inverse transform (column vectors of transform matrix A*T ) EE465: Introduction to Digital Image Processing 27 From 1D to 2D Do N 1D transforms in parallel YN N A N N X N N y11 y1N a11 a1N x11 x1N y N 1 y NN aN 1 aNN xN 1 xNN y1 | ... | yi | ... | yN yi Axi , i 1 N x1 | ... | xi | ... | xN T xi [ xi1 ,..., xiN ] , yi [ yi1 ,..., yiN ]T EE465: Introduction to Digital Image Processing 28 Definition of 2D Transform T Y A X A 2D forward transform N N N N N N N N y11 y1N a11 a1N x11 x1N a11 aN 1 y N 1 y NN aN 1 aNN xN 1 xNN a1N aNN 1D column transform EE465: Introduction to Digital Image Processing 1D row transform 29 2D Transform=Two Sequential 1D Transforms T Y AXA column transform row transform row transform column transform Y1 AX (left matrix multiplication first) Y Y1AT (AY1T )T Y2 XAT ( AXT )T(right matrix multiplication first) Y AY2 Conclusion: 2D separable transform can be decomposed into two sequential The ordering of 1D transforms does not matter EE465: Introduction to Digital Image Processing 30 Basis Images X T X δ ij [ kl ] Y AXA T T Y AXA T Bij T Bij bi b j , bi [ai1 ,..., aiN ]T 1 k i, l j kl 0 otherwise N1 8 8 X xijδij 1N 8 T i 1 j 1 8 Y xij Bij i 1 j 1 Basis image Bij can be viewed as the response of the linear system (2D transform) to a delta-function input ij EE465: Introduction to Digital Image Processing 31 Example 1: 8-by-8 Hadamard Transform 1 1 1 1 1 A 8 1 1 1 1 b1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 , 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 b8 j DC i Bij In MATLAB demo, you can generate these 64 basis images and display them EE465: Introduction to Digital Image Processing 32 Example 2: 8-by-8 DCT A88 a11 ... ... a81 j ... ... a18 ... ... ... ... ... ... ... ... a88 DC 1 , k 1,1 l 8 8 akl 1 (2l 1)( k 1) cos ,2 k 8,1 l 8 16 2 i In MATLAB demo, you can generate these 64 basis images and display them EE465: Introduction to Digital Image Processing 33 2D Unitary Transform Suppose A is a unitary matrix, forward transform YN N A N N X N N ATN N inverse transform X N N A *NT N YN N A*N N Proof 1 *T A A Since A is a unitary matrix, we have A*T YA * A*T (AXA T ) A* ( A*T A) X( AT A* ) ( A*T A) X( AT* A)T I X I T X ( AB )T BT AT EE465: Introduction to Digital Image Processing 34 Properties of Unitary Transforms Energy compaction: only a small fraction of transform coefficients have large magnitude Such property is related to the decorrelating capability of unitary transforms Energy conservation: unitary transform preserves the 2-norm of input vectors Such property essentially comes from the fact that rotating coordinates does not affect Euclidean distance EE465: Introduction to Digital Image Processing 35 Energy Compaction Property How does unitary transform compact the energy? Assumption: signal is correlated; no energy compaction can be done for white noise even with unitary transform Advanced mathematical analysis can show that DCT basis is an approximation of eigenvectors of AR(1) process (a good model for correlated signals such as an image) A frequency-domain interpretation Most transform coefficients would be small except those around DC and those corresponding to edges (spatially high-frequency components) Images are mixture of smooth regions and edges EE465: Introduction to Digital Image Processing 36 Energy Compaction Example in 1D Hadamard matrix 1 1 1 1 100 1 1 1 1 98 1 A ,x 98 2 1 1 1 1 1 1 1 1 100 significant 1 1 1 1 100 198 1 1 1 1 1 98 0 y Ax 2 1 1 1 1 98 0 1 1 1 1 100 2 A coefficient is called significant if its magnitude insignificant coefficients (th=64) is above a pre-selected threshold th EE465: Introduction to Digital Image Processing 37 Energy Compaction Example in 2D Example 1 1 1 1 1 1 1 1 1 A 2 1 1 1 1 1 1 1 1 0 5.5 1 100 100 98 99 391.5 T 100 100 94 94 Y AXA 2.5 2 4 . 5 2 X Y 98 97 96 100 1 0.5 2 0.5 100 99 97 94 2 1 . 5 0 1 . 5 A coefficient is called significant if its magnitude is above a pre-selected threshold th insignificant coefficients (th=64) EE465: Introduction to Digital Image Processing 38 Image Example low-frequency Original cameraman image X high-frequency Its DCT coefficients Y (2451 significant coefficients, th=64) Notice the excellent energy compaction property of DCT EE465: Introduction to Digital Image Processing 39 Counter Example Original noise image X Its DCT coefficients Y No energy compaction can be achieved for white noise EE465: Introduction to Digital Image Processing 40 Energy Conservation Property in 1D 1D case y Ax Proof A is unitary 2 2 || y || || x || 2 *T 2 *T || y || | yi | y y ( Ax ) ( Ax ) N i 1 *T *T *T N 2 2 x ( A A) x x x | xi | || x || i 1 EE465: Introduction to Digital Image Processing 41 Numerical Example 1 A 2 1 1 3 1 1, x 4 1 1 1 3 1 7 y Ax 1 1 4 1 2 2 Check: 2 2 2 2 7 1 || x || 3 42 25, || y ||2 25 2 EE465: Introduction to Digital Image Processing 42 Implication of Energy Conservation T y [ y1 ,..., y N ] Q ˆ y [ yˆ1 ,..., yˆ N ]T T-1 T x [ x1 ,..., x N ]T y Ax ˆ x [ xˆ1 ,..., xˆ N ]T Linearity of Transform ˆ ˆ y y A( x x ) ˆ ˆ y Ax A is unitary ˆ 2 ˆ 2 || y y || || x x || EE465: Introduction to Digital Image Processing 43 Energy Conservation Property in 2D 2-norm of a matrix X 2 X | xij | || xi || N 2 N N 2 i 1 j 1 Step 1: Proof: Y AX Y AX A unitary i 1 Y X 2 2 yi Axi , i 1 N Using energy conservation property in 1D, we have 2 2 || yi || || xi || , i 1 N 2 N 2 || y || || x || i i N i 1 i 1 Y X 2 2 EE465: Introduction to Digital Image Processing 44 Energy Conservation Property in 2D (Con’t) Step 2: Y AXA T Y X 2 A unitary 2 Hint: 2D transform can be decomposed into two sequential 1D transforms, e.g., column transform row transform Y1 AX Y Y1A (AY1 ) T T T Use the result you obtained in step 1 and note that EE465: Introduction to Digital Image Processing X T 2 X 2 45 Numerical Example 1 / 2 A 1 / 2 1 2 X 3 4 T 1/ 2 1/ 2 5 1 Y AXA 2 0 T Check: X 12 2 2 32 4 2 30 2 || Y ||2 52 22 12 02 30 EE465: Introduction to Digital Image Processing 46 Implication of Energy Conservation X T Y Q T-1 Ŷ Y AXA T X̂ ˆ A*T Y ˆ A* X Linearity of Transform ˆ A(X X ˆ )AT YY ˆ ||2 || X X ˆ ||2 || Y Y Similar to 1D case, quantization noise in the transform domain has the same energy as that in the spatial domain EE465: Introduction to Digital Image Processing 47 Why Energy Conservation? image X forward Y Transform ˆ || || X X ˆ || || Y Y 2 ^ image X inverse Transform s f entropy coding probability estimation 2 ^ Y f-1 s binary bit stream super channel entropy decoding EE465: Introduction to Digital Image Processing 48