Applications of the wavelet transform in image processing

Øyvind Ryan
Department of Informatics, University of Oslo
e-mail: oyvindry@ifi.uio.no

12 Nov 2004

Abstract

Mathematical methods applied in the most recent image formats are presented. First, the application of the wavelet transform in JPEG2000 is covered. JPEG2000 is a standard established by the same group which created the widely used JPEG standard, and it was designed to address some of the shortcomings of JPEG. Also presented are other recently established image formats having wavelet transforms as part of their codecs. Other components of modern image compression systems are covered as well, together with the mathematical and statistical methods they use.

1 Preliminaries

All image formats covered here use an image transform, quantization and coding, and all three are described for each of the formats in question. The transforms mentioned below can be extended separably to two dimensions for application to image processing; we therefore state our results in one dimension only, for the sake of simplicity, and describe the separation process later.

We use the value m for the block dimension (or, more precisely, the number of channels); for our wavelet transforms we will always have m = 2. We will associate a transform with m filters, so that for m = 2 we have only two filters. In signal processing terms, these are to be interpreted as low-pass and high-pass filters.

If we denote the signal by x, the block dimension (or number of channels) describes the size of the block partitioning of the signal. JPEG applies only block transforms, meaning that if the signal is split into blocks $x = \{x[i]\}$ (each $x[i]$ a vector of dimension m), the transformed signal $y = \{y[i]\}$ is given by $y[i] = A^* x[i]$ for some $m \times m$ matrix A. This way we lose influence from neighbouring blocks, for instance for pixels on block boundaries. This is what gives rise to the blockiness artifact in JPEG.
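The block transform $y[i] = A^* x[i]$ can be sketched as follows (an illustrative Python fragment of ours, not part of any standard codec; the function name is our own):

```python
def block_transform(x, A):
    """Split x into blocks of dimension m = len(A) and apply y[i] = A* x[i].
    A is assumed real here, so the adjoint A* is just the transpose."""
    m = len(A)
    y = []
    for start in range(0, len(x), m):
        block = x[start:start + m]
        # y_q = sum_j (A*)_{q,j} x_j = sum_j A_{j,q} x_j
        y.extend(sum(A[j][q] * block[j] for j in range(m)) for q in range(m))
    return y
```

Note that each block is transformed in complete isolation, which is exactly the source of the blockiness artifact mentioned above.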
Important block transforms are sketched below.¹

¹ Sponsored by the Norwegian Research Council, project no. 160130/V30.

1.1 KLT (Karhunen-Loève Transform)

The KLT is the unique transform which decorrelates its input. To be precise, define the covariance matrix $C_X$ of a random vector X by
$$C_X = E\left((X - \mu_X)(X - \mu_X)^*\right).$$
If the KLT transform is called K, then the random vector $Y = K^* X$ has uncorrelated components, i.e. $C_Y = K^* C_X K$ is a diagonal matrix. This transform is what gives rise to principal component analysis (PCA). Among linear transforms, the KLT minimizes the MSE when keeping a given number of its principal components (when the principal components are ranked in decreasing order). The drawback of the KLT is that we need to recompute the transform each time the statistics of the source change. By its nature, it cannot be separable either.

1.2 DFT (Discrete Fourier Transform)

The DFT is defined by the transform $A = \left(e^{\frac{2\pi i}{m} pq}\right)_{p,q}$, which is unitary after normalization. The DFT has an efficient implementation through the FFT, and it is separable. One drawback of the DFT is that the transform works badly when the end points ($x_0$ and $x_{m-1}$) are far apart: if the full Fourier transform were applied in this case, many higher Fourier components would be introduced to compensate.

1.3 DCT

The DCT is defined by $A = \left(c_q \cos\left(2\pi f_q \left(p + \frac{1}{2}\right)\right)\right)_{q,p}$, where
$$c_q = \begin{cases}\sqrt{1/m} & \text{if } q = 0, \\ \sqrt{2/m} & \text{if } q \neq 0,\end{cases} \qquad f_q = \frac{q}{2m}.$$
The DCT can be constructed from the DFT by symmetrically extending the input sequence about the last point, applying the 2m-point DFT, and recovering the first m points afterwards. It is separable since the DFT is, and the FFT can be used as an efficient implementation of the DCT. There is no need to adapt the transform to the statistics of the source material (as with the KLT), and the DCT is a robust approximation to the KLT for natural image sources. It is used in JPEG, MPEG, and CCITT H.261. One drawback is that the DCT cannot be used for lossless compression, since the outputs of the transform are not integers.
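To make the DCT definition above concrete, the following sketch (illustrative Python of ours, not an optimized FFT-based implementation) builds the $m \times m$ DCT matrix directly from the formula and applies it to one block:

```python
import math

def dct_matrix(m):
    """Orthonormal DCT-II matrix: A[q][p] = c_q cos(2*pi*f_q*(p + 1/2)),
    with f_q = q/(2m), c_0 = sqrt(1/m) and c_q = sqrt(2/m) otherwise.
    Note 2*pi*f_q*(p + 1/2) = pi*q*(p + 1/2)/m."""
    A = []
    for q in range(m):
        c = math.sqrt((1.0 if q == 0 else 2.0) / m)
        A.append([c * math.cos(math.pi * q * (p + 0.5) / m)
                  for p in range(m)])
    return A

def apply_block_transform(A, x):
    """One transformed block: y[q] = sum_p A[q][p] x[p]."""
    return [sum(Aq[p] * x[p] for p in range(len(x))) for Aq in A]
```

For a constant block, only the first (DC) output is nonzero, which is the behaviour JPEG exploits.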
2 JPEG (baseline)

2.1 Transform

The DCT is used as the transform of the input signal, after the input has been level shifted. If the elements of the input signal are 8 bits, level shifting means subtracting 128 from numbers in [0, 256), producing numbers in [−128, 128). The block dimension is always 8.

2.2 Quantization

Uniform midtread quantization (midtread means that 0 is the midpoint of the quantization interval about 0) is used. A quantization table, consisting of the step sizes of each coefficient quantizer, is used (the table has size 8 × 8). This table is emitted with the data itself; smaller values in the table give less loss during encoding. The coefficients after quantization are called labels. The first label in a block is called a DC coefficient, the rest are called AC coefficients. The higher AC coefficients are typically rounded to 0, which provides good compression.

2.3 Coding

AC and DC coefficients are coded differently: differences between successive DC labels are coded, instead of the DC labels themselves, while no such differencing is used for AC labels. Labels are partitioned into categories 0; −1, 1; −3, −2, 2, 3; ..., of sizes $2^0, 2^1, 2^2, \ldots$, numbered 0, 1, 2, 3, .... Category numbers are Huffman coded. Coding is done in zig-zag scan order. Each label is coded as the Huffman code of its category number, followed by the value within the category (the number of bits required equals the category number). The zig-zag scan order ensures that many coefficients are zero near the end of the traversal; these are skipped with an end-of-block code.

Drawbacks and lacks of generality in the JPEG standard are
1. blockiness, due to splitting the image into independent parts,
2. blocks are always processed sequentially (no way to obtain other types of progression),
3. a lossy version only, since quantization is always performed on floating point values.

3 JPEG2000

3.1 Wavelet transform basics

We follow the introduction to subband transforms and wavelets used in [4].
A lighter introduction with more examples can be found in [5]; going through these two references together can be very helpful.

Instead of applying a block transform, one can attempt a transform where one block influences many other (surrounding) blocks. This may reduce the blockiness, even if the transformed signal is in the end partitioned into independent blocks anyway. We will consider a subband transform as our candidate for such a transform:

Definition 1 An (analysis) subband transform for the set of $m \times m$ matrices $\{A[i]\}_{i\in\mathbb{Z}}$ is defined by
$$y[n] = \sum_{i\in\mathbb{Z}} A^*[i]\, x[n-i].$$

Definition 2 A (synthesis) subband transform for the set of $m \times m$ matrices $\{S[i]\}_{i\in\mathbb{Z}}$ is defined by
$$x[n] = \sum_{i\in\mathbb{Z}} S[i]\, y[n-i].$$

[Figure 1: One dimensional convolutional transform.]

These definitions can be thought of as one dimensional convolutional transforms, as shown in figure 1. The analysis transform produces a transformed signal from an input signal, while the synthesis transform should recover the input signal from the transformed signal. We say that we have perfect reconstruction if there exists a synthesis transform exactly inverting the analysis transform. JPEG2000 applies subband transforms with only two channels (m = 2), as opposed to JPEG's block transform with eight channels. So, artifacts like blocking may be removed when using subband transforms, even though the number of channels is decreased.

3.2 Expressing transforms with filter banks

One can write
$$y[nm+q] = (x \star h_q)[mn], \quad 0 \le q < m, \qquad (1)$$
where the filter bank $\{h_q\}_{0\le q<m}$ is defined by $h_q[mi-j] = (A^*[i])_{q,j}$. This expresses the analysis operation through filter banks. One can also write
$$x = \sum_{q=0}^{m-1} (\tilde{y}_q \star g_q),$$
where the filter bank $\{g_q\}_{0\le q<m}$ is defined by $g_q[mi+j] = (S[i])_{j,q}$, and where
$$\tilde{y}_q[i] = \begin{cases} y_q[i/m] & \text{if } m \text{ divides } i, \\ 0 & \text{otherwise.} \end{cases}$$
This expresses the synthesis operation through filter banks. When constructing subband transforms from wavelets, we will construct the transform by first finding a filter bank from the scaling function of the wavelet.

3.3 Expressing transforms in terms of vectors

Let $\langle\cdot,\cdot\rangle$ be the inner product on $\ell^2(\mathbb{Z})$. One can write $y_q[n] = \langle x, a_q^{(n)}\rangle$, where
$$a_q[k] = h_q^*[-k], \qquad a_q^{(n)}[k] = a_q[k - mn]$$
are the analysis vectors. This expresses the analysis operation in terms of the analysis vectors. One can also write $x = \sum_{q=0}^{m-1} \sum_n y_q[n]\, s_q^{(n)}$, where
$$s_q[k] = g_q[k], \qquad s_q^{(n)}[k] = s_q[k - mn]$$
are the synthesis vectors. This expresses the synthesis operation in terms of the synthesis vectors.

3.4 Orthonormal and biorthogonal transforms

Definition 3 An orthonormal subband transform is a transform for which the synthesis vectors are orthonormal.

It is easy to show that for orthonormal subband transforms, the analysis and synthesis vectors are equal ($s_q = a_q$ for all q), and the analysis and synthesis matrices are reversed versions of one another, $A[i] = S[-i]$. Orthonormal subband transforms are the natural extension of orthonormal (unitary) block transforms.

If the analysis system is given by filters $h_0, h_1$, and the synthesis system is given by filters $g_0, g_1$, one can calculate the end-to-end transfer function of analysis combined with synthesis. In order to avoid aliasing, one will find that
$$\hat{h}_0(\omega+\pi)\hat{g}_0(\omega) + \hat{h}_1(\omega+\pi)\hat{g}_1(\omega) = 0, \qquad (2)$$
and if we in addition want perfect reconstruction,
$$\hat{h}_0(\omega)\hat{g}_0(\omega) + \hat{h}_1(\omega)\hat{g}_1(\omega) = 2. \qquad (3)$$

Example 1 Let us take a look at a popular definition of orthonormal subband transforms through filter banks. This is an alternative definition of what are called Quadrature Mirror Filters (QMF). Given a low-pass prototype f,
$$h_0[k] = g_0[-k] = f[k],$$
$$h_1[k] = g_1[-k] = (-1)^{k+1} f(-(k+1)).$$
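Before deriving further relations, here is a numerical sketch of the two-channel machinery of equation (1) with the Haar prototype $f = (1/\sqrt{2}, 1/\sqrt{2})$ (illustrative Python of ours; we use a delay-normalized causal variant of the filters above so that all supports are $\{0, 1\}$, and circular convolution to avoid boundary issues):

```python
import math

def circ_conv(x, h):
    """Circular convolution: (x * h)[n] = sum_k h[k] x[(n - k) mod N]."""
    N = len(x)
    return [sum(h[k] * x[(n - k) % N] for k in range(len(h)))
            for n in range(N)]

def analysis(x, h0, h1):
    """Two-channel analysis, equation (1): y_q[n] = (x * h_q)[2n]."""
    return circ_conv(x, h0)[::2], circ_conv(x, h1)[::2]

def synthesis(y0, y1, g0, g1):
    """Upsample each subband by 2, filter with g_q, and sum.
    The causal length-2 filters below introduce a one-sample circular
    delay, which the final rotation undoes."""
    N = 2 * len(y0)
    def up(y):
        return [y[i // 2] if i % 2 == 0 else 0.0 for i in range(N)]
    total = [a + b for a, b in zip(circ_conv(up(y0), g0),
                                   circ_conv(up(y1), g1))]
    return [total[(n + 1) % N] for n in range(N)]

s = 1 / math.sqrt(2)        # Haar low-pass prototype f = (s, s)
h0, h1 = [s, s], [s, -s]    # analysis low- and high-pass
g0, g1 = [s, s], [-s, s]    # synthesis low- and high-pass
```

With these filters the transform is orthonormal, so the subband coefficients carry exactly the same energy as the input, and synthesis recovers the input perfectly.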
Note that
$$g_1[n] = (-1)^{n+1} g_0[-(n-1)], \qquad (4)$$
or $\hat{g}_1(\omega) = e^{-i\omega}\, \hat{g}_0(\omega+\pi)^*$ in the Fourier domain. Note also that
$$h_1[n] = (-1)^{n+1} h_0[-(n+1)], \qquad (5)$$
or $\hat{h}_1(\omega) = e^{i\omega}\, \hat{h}_0(\omega+\pi)^*$ in the Fourier domain. These relations will be used when we construct biorthogonal transforms below. It is not hard to see that the alias condition (2) is satisfied, and that perfect reconstruction (3) is satisfied if
$$|\hat{h}_0(\omega)|^2 + |\hat{h}_0(\omega \pm \pi)|^2 = 2. \qquad (6)$$
It is also not hard to show that the vectors $\{h_0[i-2n]\}_n, \{h_1[i-2n]\}_n$ are orthonormal and give rise to an orthonormal subband transform if $f = h_0$ satisfies this.

Example 2 Lapped orthogonal transform with a cosine modulated filter bank: the analysis vectors are defined by
$$a_q[k] = c_q[k] = \begin{cases} \frac{1}{\sqrt{m}} \cos\left(2\pi f_q\left(k - \frac{m-1}{2}\right)\right) & \text{if } -m \le k < m, \\ 0 & \text{otherwise,}\end{cases}$$
where the cosine frequencies are $f_q = \frac{q + \frac{1}{2}}{2m}$, $0 \le q < m$. It is easy to verify that these analysis vectors give rise to an orthonormal subband transform, where all analysis matrices are 0 except for A[0] and A[1]. Such transforms are called lapped transforms. One can get a more general family of lapped transforms by defining $a_q[k] = c_q[k]\, w[k]$, where the windowing sequence w satisfies
$$w[k] = w[-1-k], \quad 0 \le k < m,$$
$$w^2[k] + w^2[m-1-k] = 2, \quad 0 \le k < m.$$
One can show that any window sequence satisfying these assumptions gives rise to a new lapped orthonormal transform. These transforms work well, and by choosing the windowing sequence wisely, one can obtain very good frequency discrimination between the subbands.

The only thing we miss in the above example is linear phase (linear phase means that the filter sequence is symmetric or anti-symmetric about some point). Linear phase ensures that filter applications preserve the support of the filter, which is a very nice property to use in an implementation. As it turns out, we cannot get this in addition to orthonormality: one can show that there exist no nontrivial two-channel (m = 2), linear phase, finitely supported orthonormal subband transforms.
We therefore extend our transforms to the following class.

Definition 4 A biorthogonal subband transform is a transform for which
$$\langle s_{q_1}^{(n_1)}, a_{q_2}^{(n_2)} \rangle = \delta[q_1 - q_2]\,\delta[n_1 - n_2], \quad 0 \le q_1, q_2 < m,\ n_1, n_2 \in \mathbb{Z}. \qquad (7)$$

Contrary to the case for orthonormal transforms, there exist nontrivial two-channel, linear phase, finitely supported biorthogonal subband transforms. Biorthogonal transforms are important in image compression also because they may approximate orthonormal transforms well. It is not hard to see that biorthogonality is equivalent to perfect reconstruction.

Example 3 We will construct biorthogonal subband transforms in the following way: we start with filters $h_0, g_0$, and construct filters $h_1, g_1$ using equations (5) and (4). To get a biorthogonal transform, we must construct $h_0, g_0$ jointly so that equation (7) is satisfied. Alias cancellation and perfect reconstruction in this case reduce to
$$\hat{h}_0(\omega+\pi)\hat{g}_0(\omega) = \hat{h}_0(\omega)^*\,\hat{g}_0(\omega+\pi)^*,$$
$$\hat{h}_0(\omega)\hat{g}_0(\omega) + \hat{h}_0(\omega+\pi)^*\,\hat{g}_0(\omega+\pi)^* = 2.$$
These equations are normally specified in another way, using functions associated with wavelets: $\hat{m}_0(\omega) = \frac{1}{\sqrt{2}}\hat{g}_0(\omega)$, $\hat{\tilde{m}}_0(\omega) = \frac{1}{\sqrt{2}}\hat{h}_0(-\omega)$.

3.5 Multi-resolution analysis (MRA)

We turn now to the concept of constructing biorthogonal/orthonormal transforms from wavelets.

Definition 5 A multi-resolution analysis (MRA) on $L^2(\mathbb{R})$ is a set of subspaces
$$\cdots \subset V^{(2)} \subset V^{(1)} \subset V^{(0)} \subset V^{(-1)} \subset V^{(-2)} \subset \cdots$$
satisfying the following properties:
(MR-1) $\overline{\bigcup_{m\in\mathbb{Z}} V^{(m)}} = L^2(\mathbb{R})$.
(MR-2) $\bigcap_{m\in\mathbb{Z}} V^{(m)} = \{0\}$.
(MR-3) $x(t) \in V^{(0)} \iff x(2^{-m}t) \in V^{(m)}$.
(MR-4) $x(t) \in V^{(0)} \iff x(t-n) \in V^{(0)}$.
(MR-5) There exists an orthonormal basis $\{\phi_n\}_{n\in\mathbb{Z}}$ for $V^{(0)}$ such that $\phi_n(t) = \phi(t-n)$.

The function $\phi(t)$ is called the scaling function. Since $V^{(0)} \subset V^{(-1)}$, MR-3 and MR-4 show that we can write
$$\phi(t) = \sqrt{2} \sum_{n=-\infty}^{\infty} g_0[n]\, \phi(2t - n)$$
for some sequence $g_0$, which is to be thought of as a low-pass prototype. This equation is called the two-scale equation.
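The two-scale equation can be made concrete with the Haar example, where $\phi$ is the indicator function of $[0, 1)$ and $g_0 = (1/\sqrt{2}, 1/\sqrt{2})$. The following sketch (illustrative Python of ours) verifies the equation pointwise:

```python
def phi(t):
    """Haar scaling function: the indicator of [0, 1)."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0

def two_scale_rhs(t):
    """Right-hand side sqrt(2) * sum_n g0[n] phi(2t - n), Haar g0."""
    s = 2 ** -0.5
    g0 = [s, s]          # Haar low-pass prototype, support {0, 1}
    return sum(2 ** 0.5 * g0[n] * phi(2 * t - n) for n in range(2))
```

Every t gives `two_scale_rhs(t) == phi(t)` (up to rounding), confirming that $\phi \in V^{(-1)}$: the unit-width box is the sum of its two half-width translates.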
From the MR properties one can deduce, via the two-scale equation, that the vectors $\{g_0[i-2n]\}_n$ are orthonormal. From example 1, we can associate $g_0$ with a prototype f, and obtain an orthonormal subband transform this way. The high-pass filter obtained in this way, $g_1$, can be used to construct a function
$$\psi(t) = \sqrt{2} \sum_{n=-\infty}^{\infty} g_1[n]\, \phi(2t - n).$$
This function is called the mother wavelet, and has the nice property that its translated dilations $\psi_n^{(m)}(t) = \psi(2^{-m}t - n)$ are orthogonal functions spanning $L^2(\mathbb{R})$.

3.5.1 Interpretation of MRA in image processing

MRA has the following interpretation with respect to image processing. The input signal is represented as an element in $V^{(0)}$, by using the components of the signal as coefficients for the translates of the scaling function:
$$x(t) = \sum_{n\in\mathbb{Z}} y_0^{(0)}[n]\, \phi(t-n).$$
Define $W^{(m)}$ as the span of the $\{\psi_n^{(m)}(t)\}_n$. It is not difficult to show that
1. $V^{(m)}$ and $W^{(m)}$ are orthogonal subspaces,
2. $V^{(m)} = V^{(m+1)} \oplus W^{(m+1)}$,
3. the coefficients in such a decomposition can be obtained by filtering with $h_0$ and $h_1$, respectively.

Note that points 1 and 2 imply that $V^{(0)} = \bigoplus_{i>0} W^{(i)}$. We need to explain point 3 further. Equation (1) produces, through filtering with $h_0, h_1$, two sequences (i.e. the two polyphase components of the transformed signal) from an input signal. We let $y_0^{(0)}$ be the input signal, and let the two sequences produced be $y_0^{(1)}$ and $y_1^{(1)}$. Then one can show (by also using the two-scale equation) that
$$\underbrace{\sum_{n\in\mathbb{Z}} y_0^{(0)}[n]\,\phi(t-n)}_{\in V^{(0)}} = \underbrace{\sum_{n\in\mathbb{Z}} y_0^{(1)}[n]\,\phi_n^{(1)}(t)}_{\in V^{(1)}} + \underbrace{\sum_{n\in\mathbb{Z}} y_1^{(1)}[n]\,\psi_n^{(1)}(t)}_{\in W^{(1)}},$$
which explains point 3. This can be done iteratively, by writing
$$\underbrace{\sum_{n\in\mathbb{Z}} y_0^{(1)}[n]\,\phi_n^{(1)}(t)}_{\in V^{(1)}} = \underbrace{\sum_{n\in\mathbb{Z}} y_0^{(2)}[n]\,\phi_n^{(2)}(t)}_{\in V^{(2)}} + \underbrace{\sum_{n\in\mathbb{Z}} y_1^{(2)}[n]\,\psi_n^{(2)}(t)}_{\in W^{(2)}},$$
and so on. Therefore, we obtain the wavelet coefficients (i.e. coefficients in $W^{(m)}$) by iterative applications of the filters $h_0, h_1$.
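The iterative filtering of point 3 can be sketched as follows (illustrative Python of ours, using the orthonormal Haar filters for concreteness). Each level splits the current low-pass content of $V^{(m)}$ into $V^{(m+1)}$ and $W^{(m+1)}$, and only the low-pass branch is iterated:

```python
def haar_step(x):
    """One analysis level: orthonormal Haar low-pass (averages, V^(m+1))
    and high-pass (differences, W^(m+1))."""
    s = 2 ** -0.5
    low = [s * (x[2 * i] + x[2 * i + 1]) for i in range(len(x) // 2)]
    high = [s * (x[2 * i] - x[2 * i + 1]) for i in range(len(x) // 2)]
    return low, high

def wavelet_decompose(x, levels):
    """Iterate only on the low-pass branch: V(0) -> V(1)+W(1) -> ..."""
    bands = []
    low = list(x)
    for _ in range(levels):
        low, high = haar_step(low)
        bands.append(high)   # wavelet coefficients, one band per level
    return low, bands        # coarse approximation plus all detail bands
```

On a piecewise constant signal all detail bands vanish, which is exactly why smooth image regions compress so well after the wavelet transform.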
The interpretation of the wavelet subspaces $W^{(i)}$ in image processing is in terms of resolution: wavelet subspaces at higher indices can be thought of as image content at lower resolution, and the subspace $V^{(m)}$ for high m can be thought of as a basis for obtaining a low resolution approximation of the image. If one writes
$$V^{(0)} = V^{(2)} \oplus W^{(2)} \oplus W^{(1)},$$
and decomposes a signal $x = y_0^{(2)} + y_1^{(2)} + y_1^{(1)}$ into components in these subspaces, one has in addition two approximations to the signal: $y_0^{(2)} + y_1^{(2)}$ and $y_0^{(2)}$, where the first one is a better approximation than the last one. One can view these approximations as versions of the signal with the higher frequencies dropped, since the coefficients are obtained through filtering with $h_0$, which can be viewed as a low-pass filter. The effect of dropping high frequencies in the approximation can be seen especially at sharp edges in an image: these get more blurred, since they cannot be represented exactly at lower frequencies.

Transforms in image processing are two-dimensional, so we need a few comments on how we implement a separable transform. When a two-dimensional transform is separable, we can calculate it by applying the corresponding one-dimensional transform to the columns first, and then to the rows. When filtering, we have four possibilities:
1. low-pass filter to rows, followed by low-pass filter to columns (LL coefficients),
2. low-pass filter to rows, followed by high-pass filter to columns (HL coefficients),
3. high-pass filter to rows, followed by low-pass filter to columns (LH coefficients),
4. high-pass filter to rows, followed by high-pass filter to columns (HH coefficients).

When a separable transform is applied, only the LL coefficients may need further decomposition. When this decomposition is done at many levels, we get the subband decomposition in figure 2. A similar type of decomposition is sketched for FBI fingerprint compression in section 3.12.
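One level of the separable two-dimensional transform can be sketched as follows (illustrative Python of ours, again with the orthonormal Haar filters): the 1D step is applied along every row, then along every column, which places the four coefficient types in the four quadrants:

```python
def haar_1d(v):
    """One-level 1D Haar step: low-pass half followed by high-pass half."""
    s = 2 ** -0.5
    half = len(v) // 2
    return ([s * (v[2 * i] + v[2 * i + 1]) for i in range(half)] +
            [s * (v[2 * i] - v[2 * i + 1]) for i in range(half)])

def transpose(M):
    return [list(col) for col in zip(*M)]

def separable_2d(M):
    """Apply the 1D transform to every row, then to every column.
    For one level, entry [0][0] of the result is the LL coefficient."""
    rows_done = [haar_1d(row) for row in M]
    return transpose([haar_1d(col) for col in transpose(rows_done)])
```

For a 2 × 2 input each quadrant is a single coefficient; for larger images only the LL quadrant would be fed back into `separable_2d` to build the multi-level structure of figure 2.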
The wavelet subspace decomposition in two dimensions has a similar form:
$$V^{(m)} = V^{(m+1)} \oplus W_{0,1}^{(m+1)} \oplus W_{1,0}^{(m+1)} \oplus W_{1,1}^{(m+1)},$$
and the mother wavelet basis functions are expressed in terms of the synthesis filters by
$$\psi_{0,1}(s_1, s_2) = 2 \sum_{n_1,n_2\in\mathbb{Z}} g_0[n_1]\, g_1[n_2]\, \phi(2s_1 - n_1, 2s_2 - n_2),$$
$$\psi_{1,0}(s_1, s_2) = 2 \sum_{n_1,n_2\in\mathbb{Z}} g_1[n_1]\, g_0[n_2]\, \phi(2s_1 - n_1, 2s_2 - n_2),$$
$$\psi_{1,1}(s_1, s_2) = 2 \sum_{n_1,n_2\in\mathbb{Z}} g_1[n_1]\, g_1[n_2]\, \phi(2s_1 - n_1, 2s_2 - n_2).$$

[Figure 2: Passband structure for a two dimensional subband transform with three levels (subbands LL3, HL3, LH3, HH3, HL2, LH2, HH2, HL1, LH1, HH1).]

Example 4 With three different resolutions, our subband decomposition can be written
$$V^{(0)} = V^{(2)} \oplus W_{0,1}^{(2)} \oplus W_{1,0}^{(2)} \oplus W_{1,1}^{(2)} \oplus W_{0,1}^{(1)} \oplus W_{1,0}^{(1)} \oplus W_{1,1}^{(1)}$$
$$= LL_2 \oplus HL_2 \oplus LH_2 \oplus HH_2 \oplus HL_1 \oplus LH_1 \oplus HH_1.$$
Contributions from these subspaces appear in the same order as above in a JPEG2000 file, and including more of the subspaces results in higher resolution. We demonstrate this here with a computer generated file. Figures 3 through 9 show file sizes and images at all decomposition levels in this case. The subbands which are dropped are also shown graphically; these are blacked out. We see that we gradually lose resolution when dropping more and more wavelet subband spaces, but even at the lowest resolution the image is recognizable, even though the file size is reduced from 105 kb to 17 kb. This is a very nice property for usage in web browsers. Note that these file sizes are calculated by replacing the contents of the dropped subbands with zeroes; they are close to the number of bytes obtained if the subbands were dropped in their entirety.

If we can find a wavelet with nice properties, most wavelet coefficients are close to 0, and can thus be dropped in a lossy compression scheme.

3.5.2 Biorthogonal wavelets

Given that orthonormality in (MR-5) is replaced with linear independence, we can follow the same reasoning as for orthonormality to create a mother wavelet function.
The wavelet coefficient subspaces will not be orthogonal in this case. When iterating the filters $g_0, g_1$, we decompose into subspaces spanned by the scaling function and mother wavelet $\phi, \psi$. When we iterate the filters $h_0, h_1$, we decompose into subspaces spanned by the dual scaling function $\tilde\phi$ and dual mother wavelet $\tilde\psi$. Scaling and dual scaling functions differ only in the biorthogonal case. We can deduce a biorthogonal subband transform under appropriate conditions. Expressed with mother wavelets, the criterion for constructing a biorthogonal wavelet becomes
$$\langle \psi_n^{(m)}, \tilde\psi_{\tilde n}^{(\tilde m)} \rangle = \delta[n - \tilde n]\,\delta[m - \tilde m], \quad \forall n, m, \tilde n, \tilde m.$$

[Figure 3: File with no loss, i.e. all wavelet subband spaces included. Its size is 105 kb.]
[Figure 4: We then remove the $W_{1,1}^{(1)}$ coefficients. Its size is 94 kb.]
[Figure 5: We then remove the $W_{1,0}^{(1)}$ coefficients also. Its size is 84 kb.]
[Figure 6: We then remove the $W_{0,1}^{(1)}$ coefficients also. Its size is 73 kb.]
[Figure 7: We then remove the $W_{1,1}^{(2)}$ coefficients also. Its size is 57 kb.]
[Figure 8: We then remove the $W_{1,0}^{(2)}$ coefficients also. Its size is 37 kb.]
[Figure 9: Finally, we remove the $W_{0,1}^{(2)}$ coefficients also. Its size is 17 kb.]

Since we assume linear phase, we will work with biorthogonal wavelets from now on. Not all biorthogonal subband transforms and orthonormal transforms give rise to wavelets. In order for a filter bank to give rise to a wavelet, one can show that $\hat{m}_0$ (defined in example 3) must have a zero at $\pi$, and that the number of zeroes affects the smoothness of the scaling function: more zeroes means a smoother scaling function. Daubechies found all FIR filters with N zeroes at $\pi$ which give rise to orthonormal wavelets. Using similar calculations, Cohen, Daubechies, and Feauveau [3] found FIR filters of odd length, linear phase and with delay-normalized transforms, with $N, \tilde N$ zeroes at $\pi$ (both must be even), which give rise to biorthogonal wavelets.
These are:
$$\hat{m}_0(\omega) = \left(\cos\frac{\omega}{2}\right)^N p_0(\cos\omega), \qquad \hat{\tilde{m}}_0(\omega) = \left(\cos\frac{\omega}{2}\right)^{\tilde N} \tilde{p}_0(\cos\omega),$$
where $p_0(x)\tilde{p}_0(x)$ is an arbitrary factorization of the polynomial (set $M = \frac{N+\tilde N}{2}$)
$$P(x) = \sum_{n=0}^{M-1} \binom{M+n-1}{n} \left(\frac{1-x}{2}\right)^n.$$
The factorization of the polynomial P(x) is not completely arbitrary, since we must group complex conjugate roots together to get real-valued filter coefficients.

Example 5 If we set $p_0(x) \equiv 1$, we obtain biorthogonal wavelets with filter banks consisting of dyadic fractions only. If we in addition set $N = \tilde N = 2$, we obtain the Spline 5/3 transform, which is used by JPEG2000 for lossless compression. With the even-indexed samples in the first component of each block and the low-pass channel first, one can show that definition 1 simplifies to
$$y[n] = \sqrt{2}\left( \begin{pmatrix} -\frac18 & 0 \\ -\frac14 & 0 \end{pmatrix} x[n+1] + \begin{pmatrix} \frac34 & \frac14 \\ -\frac14 & \frac12 \end{pmatrix} x[n] + \begin{pmatrix} -\frac18 & \frac14 \\ 0 & 0 \end{pmatrix} x[n-1] \right)$$
in this case. Similarly, definition 2 simplifies to
$$x[n] = \sqrt{2}\left( \begin{pmatrix} 0 & 0 \\ \frac14 & -\frac18 \end{pmatrix} y[n+1] + \begin{pmatrix} \frac12 & -\frac14 \\ \frac14 & \frac34 \end{pmatrix} y[n] + \begin{pmatrix} 0 & -\frac14 \\ 0 & -\frac18 \end{pmatrix} y[n-1] \right).$$

Example 6 If we split the zeroes at $\pi$ as equally as we can, and the factors in $p_0(x)$ and $\tilde{p}_0(x)$ equally also, and also set $N = \tilde N = 4$, we obtain the wavelet JPEG2000 uses for lossy compression. This is the CDF 9/7 transform. For the lossless transform above, only A[−1], A[0], A[1], S[−1], S[0], S[1] were nonzero. For this lossy transform, only A[−2], A[−1], A[0], A[1], A[2], S[−2], S[−1], S[0], S[1], S[2] are nonzero. The names 5/3 and 9/7 above refer to the numbers of nonzero coefficients in the corresponding filters.

3.6 Transform

JPEG2000 uses the wavelet transform (DWT), with different wavelet kernels. The transform may be skipped in its entirety (typically done so in lossless compression), and it may also be applied fully or only partially. It has an efficient implementation, in both the lossy and the lossless case.

3.7 Quantization

Deadzone scalar quantization is the quantization method used by JPEG2000. Extensions to the standard open up for other quantization schemes.

3.8 Coding

MQ coding is used. The image is split into tiles (not smaller blocks as in JPEG), with typical size 512 × 512.
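Returning to example 5: the reversible integer-to-integer lifting realization of the 5/3 transform, which is what makes lossless coding possible, can be sketched as follows (an illustrative Python fragment of ours with whole-sample symmetric boundary extension, not production JPEG2000 code):

```python
def mirror(i, n):
    """Whole-sample symmetric extension of index i into [0, n)."""
    while i < 0 or i >= n:
        i = -i if i < 0 else 2 * (n - 1) - i
    return i

def fwd53(x):
    """One level of the reversible 5/3 transform via lifting.
    Odd samples become details, even samples become averages."""
    n, y = len(x), list(x)
    for i in range(1, n, 2):   # predict: detail = odd - mean of even neighbours
        y[i] -= (x[mirror(i - 1, n)] + x[mirror(i + 1, n)]) // 2
    for i in range(0, n, 2):   # update: smooth the even samples
        y[i] += (y[mirror(i - 1, n)] + y[mirror(i + 1, n)] + 2) // 4
    return y

def inv53(y):
    """Exact inverse: undo the lifting steps in reverse order."""
    n, x = len(y), list(y)
    for i in range(0, n, 2):   # undo update
        x[i] -= (y[mirror(i - 1, n)] + y[mirror(i + 1, n)] + 2) // 4
    for i in range(1, n, 2):   # undo predict
        x[i] += (x[mirror(i - 1, n)] + x[mirror(i + 1, n)]) // 2
    return x
```

Because every lifting step only adds or subtracts an integer quantity that the inverse can recompute exactly, integer inputs map to integer outputs and back without any loss, in contrast to the floating point matrices of example 5.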
Each tile is decomposed into constituent parts, using particular filter banks. Demands on the code-stream are:
1. spatial random access into the bitstream,
2. a distortion scalable bitstream,
3. progression scalability,
4. resolution scalability.

3.9 Applications to video compression

MPEG4.

3.10 Applications to speech recognition

PCA (principal component analysis) is a common technique in speech recognition.

3.11 Applications to face recognition

Elastic bunch graph matching. Gabor wavelets.

[Figure 10: Decomposition structure employed in the FBI fingerprint compression standard.]

3.12 Applications to fingerprint compression and matching

The FBI uses its own standard [1] for compressing fingerprint images. Compression algorithms with wavelet-based transformations were selected in competition with compression using fractal transformations. The FBI's standard has similarities with the JPEG2000 standard, and especially with an extension to the JPEG2000 standard. It uses another subband decomposition, demonstrated in figure 10. Further decomposition of the LH, HL and HH bands like this may improve compression somewhat, since the effect of the filter bank application may be thought of as an "approximative orthonormalization process". The extension to the JPEG2000 standard also opens up for this type of more general subband decomposition.

In the FBI's standard one may also use many different wavelets, with the coefficients of the corresponding filter banks signalled in the code-stream. The only constraint on the filters is that there should be no more than 32 nonzero coefficients. This is much longer than for lossy compression in JPEG2000 (9 nonzero coefficients), and may be necessary, since fingerprint images have many more sharp edges than most natural images. The FBI has its own set of recommended filters. The JPEG2000 extension opens up for user-definable wavelet kernels also.

The coding is done differently in the FBI's standard.
They use Huffman coding, with tables calculated on a per-image basis.

It has so far turned out to be impossible to find a lossless compression algorithm for fingerprint images with a compression ratio of more than 2:1. Similar phenomena can be observed with JPEG2000: if an image with many sharp edges is wavelet-transformed, the compressed data may be larger than when the image is not wavelet-transformed (many small coefficients are obtained after wavelet transformation, and we would obtain compression in the lossy case only, since these would be quantized to 0 by the lossy transformation). JPEG2000 solves this by not making the wavelet transform mandatory: it can be applied down to a given level only, or skipped altogether.

3.13 Other applications of wavelets

Blending of multiple images [2]. If several lighting sources are combined, the result may be obtained by combining a set of basis images. The combination can be done very fast in the wavelet domain.

References

[1] WSQ Gray-scale Fingerprint Image Compression Specification.

[2] I. Drori and D. Lischinski. Fast multiresolution image operations in the wavelet domain. IEEE Transactions on Visualization and Computer Graphics, 9(3):395–412, 2003.

[3] A. Cohen, I. Daubechies, and J.-C. Feauveau. Biorthogonal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 45(5):485–560, June 1992.

[4] David S. Taubman and Michael W. Marcellin. JPEG2000: Image Compression Fundamentals, Standards and Practice. Kluwer Academic Publishers, 2002.

[5] Khalid Sayood. Introduction to Data Compression. Academic Press, 2000.