The Discrete Wavelet Transform for Image Compression

Jing-De Huang
E-mail: r95942104@ntu.edu.tw
Graduate Institute of Communication Engineering
National Taiwan University, Taipei, Taiwan, ROC

Abstract

Although the DCT-based image compression method used in the JPEG standard has been very successful for several years, it still leaves room for improvement. A fundamental shift in the image compression approach came after the discrete wavelet transform (DWT) became popular, and the DWT has been adopted in the newer JPEG 2000 standard. In this paper, we introduce the DWT and explain why it can be used for image compression.

1. Subband Coding

In subband coding, an image is decomposed into a set of bandlimited components, called subbands, which can be reassembled to reconstruct the original image without error. Figure 1 shows the components of a two-band subband coding and decoding system. Since the bandwidth of each resulting subband, $y_0(n)$ or $y_1(n)$, is smaller than that of the original signal $x(n)$, the subbands can be downsampled without loss of information. The original signal is reconstructed by upsampling, filtering, and summing the individual subbands.

[Figure 1: Two-band filter bank for one-dimensional subband coding and decoding. In the analysis stage, $x(n)$ is filtered by $h_0(n)$ and $h_1(n)$ and each output is downsampled by 2 to give $y_0(n)$ and $y_1(n)$. In the synthesis stage, each subband is upsampled by 2, filtered by $g_0(n)$ or $g_1(n)$, and the results are summed to give $\hat{x}(n)$. $H_0(\omega)$ passes the low band and $H_1(\omega)$ the high band, split at $\pi/2$.]

According to the Z-transform and its sampling properties, we can express the output as

$$\hat{X}(z) = \tfrac{1}{2}\left[H_0(z)G_0(z) + H_1(z)G_1(z)\right]X(z) + \tfrac{1}{2}\left[H_0(-z)G_0(z) + H_1(-z)G_1(z)\right]X(-z) \tag{1}$$

where the second term, which contains the $-z$ dependence, represents the aliasing introduced by the downsampling-upsampling process. For error-free reconstruction of the input, that is, $\hat{X}(z) = X(z)$, we impose the following conditions:

$$\begin{aligned} H_0(-z)G_0(z) + H_1(-z)G_1(z) &= 0 \\ H_0(z)G_0(z) + H_1(z)G_1(z) &= 2 \end{aligned} \tag{2}$$

Eq. (2) reveals that the analysis and synthesis filters are cross-modulated: solving it for the synthesis filters makes $G_0(z)$ a function of $H_1(-z)$ and $G_1(z)$ a function of $H_0(-z)$.
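As a numerical illustration of the conditions in Eq. (2), the following sketch checks them for one concrete filter set and then runs a test signal through a two-band system like the one in Figure 1. The Haar filter pair, the helper names `modulate` and `two_band`, and the test signal are assumptions chosen for illustration; they do not come from the paper. Because these filters are causal, the distortion term comes out as $2z^{-1}$ rather than 2, so the output is the input delayed by one sample.

```python
import numpy as np

def modulate(h):
    """Return the coefficients of H(-z), i.e. h(n) multiplied by (-1)^n."""
    n = np.arange(len(h))
    return h * (-1.0) ** n

# Assumed example filters (Haar), stored as coefficients of z^0, z^-1.
s = 1.0 / np.sqrt(2.0)
h0 = s * np.array([1.0, 1.0])    # analysis low-pass
h1 = s * np.array([1.0, -1.0])   # analysis high-pass
g0 = s * np.array([1.0, 1.0])    # synthesis low-pass
g1 = s * np.array([-1.0, 1.0])   # synthesis high-pass

# Polynomial products in z^-1 are convolutions of coefficient vectors.
alias = np.convolve(modulate(h0), g0) + np.convolve(modulate(h1), g1)
transfer = np.convolve(h0, g0) + np.convolve(h1, g1)

def two_band(x):
    """Analysis (filter, keep even samples) then synthesis (zero-stuff, filter, sum)."""
    y0 = np.convolve(x, h0)[::2]
    y1 = np.convolve(x, h1)[::2]
    u0 = np.zeros(2 * len(y0)); u0[::2] = y0
    u1 = np.zeros(2 * len(y1)); u1[::2] = y1
    return np.convolve(u0, g0) + np.convolve(u1, g1)

x = np.array([3.0, 1.0, -2.0, 5.0, 0.0, 4.0, -1.0, 2.0])
x_hat = two_band(x)

print("alias term coefficients:   ", alias)
print("transfer term coefficients:", transfer)
print("delayed output matches input:", np.allclose(x_hat[1:len(x) + 1], x))
```

The alias coefficients vanish, which is the first line of Eq. (2), and the only nonzero transfer coefficient is the 2 at delay one, which is the second line up to a delay.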
For finite impulse response (FIR) filters, and ignoring the delay, we get

$$\begin{aligned} g_0(n) &= (-1)^{n} h_1(n) \\ g_1(n) &= (-1)^{n+1} h_0(n) \end{aligned} \tag{3}$$

Thus, FIR synthesis filters are cross-modulated copies of the analysis filters, with one (and only one) being sign reversed. Eq. (2) can also be used to demonstrate the biorthogonality of the analysis and synthesis filters. That is,

$$\begin{aligned} \langle g_0(k), h_0(2n-k)\rangle &= \delta(n), & \langle g_0(k), h_1(2n-k)\rangle &= 0 \\ \langle g_1(k), h_1(2n-k)\rangle &= \delta(n), & \langle g_1(k), h_0(2n-k)\rangle &= 0 \end{aligned} \tag{4}$$

or

$$\langle h_i(2n-k), g_j(k)\rangle = \delta(i-j)\,\delta(n), \quad i, j \in \{0, 1\} \tag{5}$$

Filter banks satisfying this condition are called biorthogonal. Moreover, the analysis and synthesis filter impulse responses of all two-band, real-coefficient, perfect reconstruction filter banks are subject to the biorthogonality constraint.

One class of solutions that satisfies the biorthogonality requirement of Eq. (5), and that is used in the development of the fast wavelet transform, is called orthonormal. It requires

$$\langle g_i(n), g_j(n+2m)\rangle = \delta(i-j)\,\delta(m), \quad i, j \in \{0, 1\} \tag{6}$$

which defines orthonormality for perfect reconstruction filter banks. The four filters are related by

$$g_1(n) = (-1)^{n} g_0(2K-1-n) \tag{7}$$

$$h_i(n) = g_i(2K-1-n), \quad i \in \{0, 1\} \tag{8}$$

where $2K$ denotes the number of coefficients in each filter. As can be seen, $g_1(n)$ is related to $g_0(n)$, and $h_0(n)$ and $h_1(n)$ are time-reversed versions of $g_0(n)$ and $g_1(n)$, respectively.

2. Multiresolution Analysis

In multiresolution analysis, a scaling function is used to create a series of approximations of a signal. A wavelet function is used to encode the difference in information between adjacent approximations.

A signal $f(x)$ can be analyzed as a linear combination of expansion functions

$$f(x) = \sum_k \alpha_k \varphi_k(x) \tag{9}$$

where the $\alpha_k$ are real-valued expansion coefficients and the $\varphi_k(x)$ are real-valued expansion functions. If the expansion is unique, the $\varphi_k(x)$ are called basis functions. The function space of the expansion set $\{\varphi_k(x)\}$ is

$$V = \overline{\operatorname{span}_k\{\varphi_k(x)\}} \tag{10}$$

and $f(x) \in V$ means that $f(x)$ is in the span of $\{\varphi_k(x)\}$ and can be written in the form of Eq. (9). The coefficients $\alpha_k$ are computed by taking the integral inner products of the dual functions $\tilde{\varphi}_k(x)$ and the function $f(x)$. That is,

$$\alpha_k = \langle \tilde{\varphi}_k(x), f(x)\rangle = \int \tilde{\varphi}_k^{*}(x)\, f(x)\, dx \tag{11}$$

If $\{\varphi_k(x)\}$ is an orthonormal basis for $V$, then $\varphi_k(x) = \tilde{\varphi}_k(x)$. If the $\varphi_k(x)$ are not orthonormal but are an orthogonal basis for $V$, then the basis functions and their duals are called biorthogonal. That is,

$$\langle \varphi_j(x), \tilde{\varphi}_k(x)\rangle = \delta_{jk} = \begin{cases} 0, & j \neq k \\ 1, & j = k \end{cases} \tag{12}$$

Now consider the set of expansion functions $\{\varphi_{j,k}(x)\}$ composed of integer translations and binary scalings of the real, square-integrable function $\varphi(x)$, where

$$\varphi_{j,k}(x) = 2^{j/2}\, \varphi(2^{j}x - k) \tag{13}$$

for all $j, k \in \mathbb{Z}$ and $\varphi(x) \in L^2(\mathbb{R})$. Here $k$ sets the position of $\varphi_{j,k}(x)$ and $j$ sets its width. Because the shape of $\varphi_{j,k}(x)$ changes with $j$, $\varphi(x)$ is called a scaling function. We denote the subspace spanned over $k$ for any $j$ as

$$V_j = \overline{\operatorname{span}_k\{\varphi_{j,k}(x)\}} \tag{14}$$

The scaling function must satisfy the four fundamental requirements of multiresolution analysis:

1. The scaling function is orthogonal to its integer translates.
2. The subspaces spanned by the scaling function at low scales are nested within those spanned at higher scales. That is,

$$V_{-\infty} \subset \cdots \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots \subset V_{\infty} \tag{15}$$

3. The only function that is common to all $V_j$ is $f(x) = 0$. That is,

$$V_{-\infty} = \{0\} \tag{16}$$

4. Any function can be represented with arbitrary precision. That is,

$$V_{\infty} = \{L^2(\mathbb{R})\} \tag{17}$$

The expansion functions of any subspace can be built from double-resolution copies of themselves. That is,

$$\varphi(x) = \sum_n h_\varphi(n)\, \sqrt{2}\, \varphi(2x - n) \tag{18}$$

where the $h_\varphi(n)$ coefficients are called scaling function coefficients.

Given a scaling function that meets the multiresolution requirements, we can define a wavelet function $\psi(x)$ that, together with its integer translates and binary scalings, spans the difference between any two adjacent scaling subspaces, $V_j$ and $V_{j+1}$. We define the set $\{\psi_{j,k}(x)\}$ of wavelets

$$\psi_{j,k}(x) = 2^{j/2}\, \psi(2^{j}x - k) \tag{19}$$

for all $k \in \mathbb{Z}$, which spans the space $W_j$, where

$$W_j = \overline{\operatorname{span}_k\{\psi_{j,k}(x)\}} \tag{20}$$

[Figure 2: The relationship between scaling and wavelet function spaces: $V_2 = V_1 \oplus W_1 = V_0 \oplus W_0 \oplus W_1$ and $V_1 = V_0 \oplus W_0$.]

The scaling and wavelet function subspaces in Figure 2 are related by

$$V_{j+1} = V_j \oplus W_j \tag{21}$$

We can now express the space of all measurable, square-integrable functions as

$$L^2(\mathbb{R}) = V_0 \oplus W_0 \oplus W_1 \oplus W_2 \oplus \cdots \tag{22}$$

or even

$$L^2(\mathbb{R}) = \cdots \oplus W_{-2} \oplus W_{-1} \oplus W_0 \oplus W_1 \oplus W_2 \oplus \cdots \tag{23}$$

Similar to the scaling function, the wavelet function can be expressed as a weighted sum of shifted, double-resolution scaling functions. That is,

$$\psi(x) = \sum_n h_\psi(n)\, \sqrt{2}\, \varphi(2x - n) \tag{24}$$

where the $h_\psi(n)$ are called the wavelet function coefficients. It can be shown that $h_\psi(n)$ is related to $h_\varphi(n)$ by

$$h_\psi(n) = (-1)^{n} h_\varphi(1 - n) \tag{25}$$

3. Discrete Wavelet Transform

We begin by defining the wavelet series expansion of a function $f(x) \in L^2(\mathbb{R})$ relative to the wavelet $\psi(x)$ and scaling function $\varphi(x)$. We can write

$$f(x) = \sum_k c_{j_0}(k)\, \varphi_{j_0,k}(x) + \sum_{j=j_0}^{\infty} \sum_k d_j(k)\, \psi_{j,k}(x) \tag{26}$$

where $j_0$ is an arbitrary starting scale, the $c_{j_0}(k)$ are normally called the approximation or scaling coefficients, and the $d_j(k)$ are called the detail or wavelet coefficients. The expansion coefficients are calculated as

$$c_{j_0}(k) = \langle f(x), \varphi_{j_0,k}(x)\rangle = \int f(x)\, \varphi_{j_0,k}(x)\, dx \tag{27}$$

$$d_j(k) = \langle f(x), \psi_{j,k}(x)\rangle = \int f(x)\, \psi_{j,k}(x)\, dx \tag{28}$$

If the function being expanded is a sequence of numbers, such as samples of a continuous function $f(x)$, the resulting coefficients are called the discrete wavelet transform (DWT) of $f(x)$. The series expansion defined in Eqs. (26) through (28) then becomes the DWT transform pair

$$W_\varphi(j_0, k) = \frac{1}{\sqrt{M}} \sum_{x=0}^{M-1} f(x)\, \varphi_{j_0,k}(x) \tag{29}$$

$$W_\psi(j, k) = \frac{1}{\sqrt{M}} \sum_{x=0}^{M-1} f(x)\, \psi_{j,k}(x) \tag{30}$$

for $j \geq j_0$, and

$$f(x) = \frac{1}{\sqrt{M}} \sum_k W_\varphi(j_0, k)\, \varphi_{j_0,k}(x) + \frac{1}{\sqrt{M}} \sum_{j=j_0}^{\infty} \sum_k W_\psi(j, k)\, \psi_{j,k}(x) \tag{31}$$

where $f(x)$, $\varphi_{j_0,k}(x)$, and $\psi_{j,k}(x)$ are functions of the discrete variable $x = 0, 1, 2, \ldots, M-1$.

4. The Fast Wavelet Transform

The fast wavelet transform (FWT) is a computationally efficient implementation of the discrete wavelet transform (DWT) that exploits the relationship between the coefficients of the DWT at adjacent scales. It is also called Mallat's herringbone algorithm. The FWT resembles the two-band subband coding scheme of Section 1.
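Before deriving the fast algorithm, it may help to see the DWT sums of Eqs. (29) and (30) evaluated directly. The sketch below does so for a four-sample sequence and then rebuilds the sequence with Eq. (31). Several things here are assumptions made for illustration and are not stated in the paper: the Haar scaling and wavelet functions, the convention that sample $x$ sits at position $x/M$ on the unit interval, a signal length that is a power of two, and the function names.

```python
import numpy as np

def haar_phi(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere (assumed example)."""
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t < 1), 1.0, 0.0)

def haar_psi(t):
    """Haar wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return (np.where((t >= 0) & (t < 0.5), 1.0, 0.0)
            - np.where((t >= 0.5) & (t < 1), 1.0, 0.0))

def haar_dwt_direct(f, j0=0):
    """Evaluate Eqs. (29) and (30) term by term; M = len(f) must be 2**J."""
    f = np.asarray(f, dtype=float)
    M = len(f)
    J = M.bit_length() - 1
    t = np.arange(M) / M            # assumed sampling positions in [0, 1)
    terms = []                      # (kind, j, k, coefficient, sampled basis)
    for k in range(2 ** j0):
        basis = 2 ** (j0 / 2) * haar_phi(2 ** j0 * t - k)
        terms.append(("phi", j0, k, f @ basis / np.sqrt(M), basis))
    for j in range(j0, J):
        for k in range(2 ** j):
            basis = 2 ** (j / 2) * haar_psi(2 ** j * t - k)
            terms.append(("psi", j, k, f @ basis / np.sqrt(M), basis))
    return terms

def haar_idwt_direct(terms):
    """Rebuild f(x) from the coefficients with Eq. (31)."""
    M = len(terms[0][4])
    return sum(c * basis for _, _, _, c, basis in terms) / np.sqrt(M)

f = [1.0, 4.0, -3.0, 0.0]
terms = haar_dwt_direct(f)
for kind, j, k, c, _ in terms:
    print(f"W_{kind}({j},{k}) = {c:.4f}")
print("reconstructed:", haar_idwt_direct(terms))
```

Each coefficient in this direct form needs a full pass over all $M$ samples, which is the cost that the fast algorithm of this section avoids by reusing the coefficients of the adjacent scale.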
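Consider again the multiresolution equation

$$\varphi(x) = \sum_n h_\varphi(n)\, \sqrt{2}\, \varphi(2x - n) \tag{32}$$

Scaling $x$ by $2^{j}$, translating it by $k$, and letting $m = 2k + n$ gives

$$\varphi(2^{j}x - k) = \sum_n h_\varphi(n)\, \sqrt{2}\, \varphi\big(2(2^{j}x - k) - n\big) = \sum_m h_\varphi(m - 2k)\, \sqrt{2}\, \varphi(2^{j+1}x - m) \tag{33}$$

Similarly,

$$\psi(2^{j}x - k) = \sum_m h_\psi(m - 2k)\, \sqrt{2}\, \varphi(2^{j+1}x - m) \tag{34}$$

Now consider the discrete wavelet transform. Assume an orthonormal basis, so that $\tilde{\varphi}(x) = \varphi(x)$ and $\tilde{\psi}(x) = \psi(x)$. Writing the scaling function of Eq. (13) out explicitly in Eq. (29), we get

$$W_\varphi(j_0, k) = \frac{1}{\sqrt{M}} \sum_x f(x)\, 2^{j_0/2}\, \varphi(2^{j_0}x - k) \tag{35}$$

Then, substituting Eq. (19) into Eq. (30), applying Eq. (34), and interchanging the order of summation, we get

$$\begin{aligned} W_\psi(j, k) &= \frac{1}{\sqrt{M}} \sum_x f(x)\, 2^{j/2}\, \psi(2^{j}x - k) \\ &= \frac{1}{\sqrt{M}} \sum_x f(x)\, 2^{j/2} \left[ \sum_m h_\psi(m - 2k)\, \sqrt{2}\, \varphi(2^{j+1}x - m) \right] \\ &= \sum_m h_\psi(m - 2k) \left[ \frac{1}{\sqrt{M}} \sum_x f(x)\, 2^{(j+1)/2}\, \varphi(2^{j+1}x - m) \right] \end{aligned} \tag{36}$$

where the bracketed quantity in the last line is identical to Eq. (35) with $j_0 = j + 1$ and $k = m$. Therefore,

$$W_\psi(j, k) = \sum_m h_\psi(m - 2k)\, W_\varphi(j+1, m) \tag{37}$$

Similarly,

$$W_\varphi(j, k) = \sum_m h_\varphi(m - 2k)\, W_\varphi(j+1, m) \tag{38}$$

Eqs. (37) and (38) reveal a remarkable relationship between the DWT coefficients of adjacent scales. Both $W_\varphi(j, k)$ and $W_\psi(j, k)$, the scale $j$ approximation and detail coefficients, can be computed by convolving $W_\varphi(j+1, k)$, the scale $j+1$ approximation coefficients, with the time-reversed scaling and wavelet vectors, $h_\varphi(-n)$ and $h_\psi(-n)$, and subsampling the results. We can rewrite Eqs. (37) and (38) as

$$W_\psi(j, k) = h_\psi(-n) * W_\varphi(j+1, n) \Big|_{n = 2k,\; k \geq 0} \tag{39}$$

$$W_\varphi(j, k) = h_\varphi(-n) * W_\varphi(j+1, n) \Big|_{n = 2k,\; k \geq 0} \tag{40}$$

which are illustrated by the block diagram of Figure 3.

[Figure 3: An FWT analysis filter bank. The input $W_\varphi(j+1, n)$ is filtered by $h_\psi(-n)$ and by $h_\varphi(-n)$, and each output is downsampled by 2 to give $W_\psi(j, n)$ and $W_\varphi(j, n)$.]

If the function $f(x)$ is sampled above the Nyquist rate, its samples are good approximations of the scaling coefficients at the sampling resolution and can be used as the starting high-resolution scaling coefficient inputs. Therefore, no wavelet or detail coefficients are needed at the sampling scale.

The inverse fast wavelet transform ($\mathrm{FWT}^{-1}$) uses the level $j$ approximation and detail coefficients to generate the level $j+1$ approximation coefficients.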
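The filter-and-downsample recursion of Eqs. (39) and (40), together with the inverse step just described, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses the Haar scaling vector (with the wavelet vector obtained from Eq. (25)), an even-length input at every level, and hypothetical function names. Longer filters would also need a boundary-extension rule, which is not covered here.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)
H_PHI = np.array([1.0, 1.0]) / SQRT2    # assumed Haar scaling vector h_phi(n)
H_PSI = np.array([1.0, -1.0]) / SQRT2   # h_psi(n) = (-1)^n h_phi(1 - n), Eq. (25)

def fwt_step(w_next):
    """One analysis stage: convolve with the time-reversed filters, keep n = 2k."""
    L = len(H_PHI)
    # Index L-1 of the full convolution aligns the reversed filter with n = 0.
    approx = np.convolve(w_next, H_PHI[::-1])[L - 1::2]
    detail = np.convolve(w_next, H_PSI[::-1])[L - 1::2]
    return approx, detail

def ifwt_step(approx, detail):
    """One synthesis stage: upsample by 2, filter with h_phi and h_psi, and sum."""
    up_a = np.zeros(2 * len(approx)); up_a[::2] = approx
    up_d = np.zeros(2 * len(detail)); up_d[::2] = detail
    full = np.convolve(up_a, H_PHI) + np.convolve(up_d, H_PSI)
    return full[:len(up_a)]

def fwt(f, levels):
    """Iterate the analysis stage; details are returned finest scale first."""
    approx = np.asarray(f, dtype=float)
    details = []
    for _ in range(levels):
        approx, d = fwt_step(approx)
        details.append(d)
    return approx, details

f = np.array([1.0, 4.0, -3.0, 0.0])     # samples used as W_phi(2, n)
approx, details = fwt(f, levels=2)
print("W_phi(0, k):", approx)
print("W_psi(1, k):", details[0])
print("W_psi(0, k):", details[1])

# Undo the two stages, coarsest scale first.
rec = approx
for d in reversed(details):
    rec = ifwt_step(rec, d)
print("reconstructed:", rec)
```

Each stage touches every input sample a fixed number of times and halves the length of the approximation, so the total work across all levels stays proportional to the signal length.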
Noting the similarity between the FWT analysis filter bank in Figure 3 and the two-band subband analysis portion of Figure 1, the $\mathrm{FWT}^{-1}$ has the synthesis filter bank of Figure 4.

[Figure 4: An $\mathrm{FWT}^{-1}$ synthesis filter bank. $W_\psi(j, n)$ and $W_\varphi(j, n)$ are each upsampled by 2, filtered by $h_\psi(n)$ and $h_\varphi(n)$ respectively, and summed to give $W_\varphi(j+1, n)$.]

By the subband coding results of Section 1, perfect reconstruction for two-band orthonormal filters requires $g_i(n) = h_i(-n)$ for $i \in \{0, 1\}$. That is, the synthesis and analysis filters must be time-reversed versions of one another. Since the FWT analysis filters are $h_0(n) = h_\psi(-n)$ and $h_1(n) = h_\varphi(-n)$, the required $\mathrm{FWT}^{-1}$ synthesis filters are $g_0(n) = h_0(-n) = h_\psi(n)$ and $g_1(n) = h_1(-n) = h_\varphi(n)$.

5. Wavelet Transforms in Two Dimensions

In two dimensions, a two-dimensional scaling function, $\varphi(x, y)$, and three two-dimensional wavelets, $\psi^{H}(x, y)$, $\psi^{V}(x, y)$, and $\psi^{D}(x, y)$, are required. Each is the product of a one-dimensional scaling function $\varphi$ and corresponding wavelet $\psi$:

$$\varphi(x, y) = \varphi(x)\, \varphi(y) \tag{41}$$

$$\psi^{H}(x, y) = \psi(x)\, \varphi(y) \tag{42}$$

$$\psi^{V}(x, y) = \varphi(x)\, \psi(y) \tag{43}$$

$$\psi^{D}(x, y) = \psi(x)\, \psi(y) \tag{44}$$

where $\psi^{H}$ measures variations along columns (like horizontal edges), $\psi^{V}$ responds to variations along rows (like vertical edges), and $\psi^{D}$ corresponds to variations along diagonals.

Like the one-dimensional discrete wavelet transform, the two-dimensional DWT can be implemented using digital filters and downsamplers. With separable two-dimensional scaling and wavelet functions, we simply take the one-dimensional FWT of the rows of $f(x, y)$, followed by the one-dimensional FWT of the resulting columns. Figure 5 shows the process in block diagram form.

[Figure 5: The two-dimensional FWT analysis filter bank. The input $W_\varphi(j+1, m, n)$ passes through a stage of filters $h_\psi(-n)$ and $h_\varphi(-n)$ with downsampling by 2 (columns), followed by a stage of filters $h_\psi(-m)$ and $h_\varphi(-m)$ with downsampling by 2 (rows), producing $W_\psi^{D}(j, m, n)$, $W_\psi^{V}(j, m, n)$, $W_\psi^{H}(j, m, n)$, and $W_\varphi(j, m, n)$.]

[Figure 6: Two-scale two-dimensional decomposition. The input $W_\varphi(j+1, m, n)$ is split into the four quarter-size subimages $W_\varphi(j, m, n)$, $W_\psi^{H}(j, m, n)$, $W_\psi^{V}(j, m, n)$, and $W_\psi^{D}(j, m, n)$.]

[Figure 7: The two-dimensional FWT synthesis filter bank. $W_\psi^{D}(j, m, n)$, $W_\psi^{V}(j, m, n)$, $W_\psi^{H}(j, m, n)$, and $W_\varphi(j, m, n)$ are upsampled by 2 (rows), filtered by $h_\psi(m)$ or $h_\varphi(m)$, and summed in pairs; the two results are upsampled by 2 (columns), filtered by $h_\psi(n)$ and $h_\varphi(n)$, and summed to give $W_\varphi(j+1, m, n)$.]

The single-scale filter bank of Figure 5 can be "iterated" by tying the approximation output to the input of another filter bank to produce an arbitrary-scale transform. As in the one-dimensional case, the image $f(x, y)$ is used as the first-scale input, and the filter bank outputs four quarter-size subimages: $W_\varphi$, $W_\psi^{H}$, $W_\psi^{V}$, and $W_\psi^{D}$. These subimages are shown in the middle of Figure 6. Two iterations of the filtering process produce the two-scale decomposition at the right of Figure 6. Figure 7 shows the synthesis filter bank that reverses the process described above.

6. Image Compression

After an image has been processed by the DWT, the total number of transform coefficients is equal to the number of samples in the original image, but the important visual information is concentrated in a few coefficients. To reduce the number of bits needed to represent the transform, all the subbands are quantized. Quantization of the DWT subbands is one of the main sources of information loss. In the JPEG 2000 standard, quantization is performed by uniform scalar quantization with a dead-zone about the origin. In a dead-zone scalar quantizer with step size $\Delta_j$, the width of the dead-zone is $2\Delta_j$, as shown in Figure 8. The standard supports separate quantization step sizes for each subband. The quantization step size $\Delta_j$ for subband $j$ is calculated based on the dynamic range of the subband values. The formula for uniform scalar quantization with a dead-zone is

$$q_j(m, n) = \operatorname{sign}\big(W_j(m, n)\big) \left\lfloor \frac{\left|W_j(m, n)\right|}{\Delta_j} \right\rfloor \tag{45}$$

where $W_j(m, n)$ is a DWT coefficient in subband $j$ and $\Delta_j$ is the quantization step size for subband $j$.
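Eq. (45) can be sketched in a few lines. The function names, the sample coefficients, and the step size below are hypothetical. The reconstruction rule in `deadzone_dequantize` (placing each nonzero value a fraction `r` of a step into its interval) is an assumption added only to make the information loss visible; the paper does not specify a dequantizer.

```python
import numpy as np

def deadzone_quantize(w, step):
    """Eq. (45): q = sign(W) * floor(|W| / step), returned as signed integers."""
    w = np.asarray(w, dtype=float)
    return (np.sign(w) * np.floor(np.abs(w) / step)).astype(int)

def deadzone_dequantize(q, step, r=0.5):
    """Assumed reconstruction rule: sign(q) * (|q| + r) * step, and 0 for q = 0."""
    q = np.asarray(q)
    return np.sign(q) * (np.abs(q) + r) * step

# Hypothetical subband coefficients and step size.
w = np.array([-7.3, -1.9, -0.4, 0.0, 0.9, 1.0, 3.5])
step = 2.0

q = deadzone_quantize(w, step)
w_rec = deadzone_dequantize(q, step)
print("indices:      ", q)
print("reconstructed:", w_rec)
```

With a step size of 2, every coefficient in the open interval from -2 to 2 maps to index 0, so the dead-zone is twice as wide as the other quantization intervals.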
All the resulting quantized DWT coefficients $q_j(m, n)$ are signed integers.

[Figure 8: Dead-zone quantization about the origin. The quantizer indices run from -3 to 3; every interval has width $\Delta_j$ except the dead-zone around 0, which has width $2\Delta_j$.]

After quantization, the quantized DWT coefficients are entropy coded to remove coding redundancy.

7. Simulation Results

Finally, we compare wavelet-based image compression with DCT-based image compression at similar compression ratios. In the figures that follow, the left column shows the DCT-based images and the right column shows the wavelet-based images. Both use progressively larger quantization step sizes to produce progressively higher compression ratios, and we observe the difference between the two compression methods.

I1: Original image with width W and height H
C: Encoded JPEG stream from I1
I2: Decoded image from C

I1 -> JPEG encoder -> C -> JPEG decoder -> I2

CR (compression ratio) = sizeof(I1) / sizeof(C)

$$\mathrm{RMS} = \sqrt{\frac{\sum_{y=1}^{H} \sum_{x=1}^{W} \left[ I_1(x, y) - I_2(x, y) \right]^2}{H \times W}}$$

where RMS is the root mean square error.

[Figures: the original image, followed by three pairs of compressed images with the measurements below.]

DCT-based image compression      Wavelet-based image compression
CR = 11.2460, RMS = 4.1316       CR = 10.3565, RMS = 4.0104
CR = 27.7401, RMS = 6.9763       CR = 26.4098, RMS = 6.8480
CR = 53.4333, RMS = 10.9662      CR = 51.3806, RMS = 9.6947

We can see that the results of the two methods are similar when the compression ratio is low. As the compression ratio increases, the DCT-based method produces increasingly visible "blocky and blurry" artifacts, while the wavelet-based method does not. In fact, the RMS errors of the two methods are similar at similar compression ratios, but the wavelet-based method looks better to the human eye than the DCT-based one.

Conclusion

JPEG 2000, which uses wavelet-based image compression, is the new standard for still image compression. It provides a new framework and an integrated toolbox to better address increasing needs for compression. It also provides a wide range of functionalities for still image applications.
Lossless and lossy coding, embedded lossy-to-lossless coding, progressive transmission by resolution and quality, high compression efficiency, error resilience, and lossless color transformations are some of its characteristics. Comparative results have shown that JPEG 2000 is indeed superior to existing still image compression standards. Work is still needed to optimize its implementation performance.