The Discrete Wavelet Transform
for Image Compression
Jing-De Huang
E-mail: r95942104@ntu.edu.tw
Graduate Institute of Communication Engineering
National Taiwan University, Taipei, Taiwan, ROC
Abstract
Although the DCT-based image compression method used in the JPEG standard
has been very successful for many years, it still leaves room for
improvement. A fundamental shift in the approach to image compression came after the
discrete wavelet transform (DWT) became popular, and the DWT has been adopted in the
newer JPEG 2000 standard. In this paper, we introduce the DWT and explain why it can
be used for image compression.
1. Subband Coding
In subband coding, an image is decomposed into a set of bandlimited
components, called subbands, which can be reassembled to reconstruct the original
image without error. Figure 1 shows the components of a two-band subband coding and
decoding system. Since the bandwidth of each resulting subband, $y_0(n)$ or $y_1(n)$, is
smaller than that of the original signal $x(n)$, the subbands can be downsampled without
loss of information. The original signal is reconstructed by upsampling,
filtering, and summing the individual subbands.
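[Figure 1 block diagram: the analysis bank filters $x(n)$ with $h_0(n)$ and $h_1(n)$ and downsamples each branch by 2 to give $y_0(n)$ and $y_1(n)$; the synthesis bank upsamples each branch by 2, filters with $g_0(n)$ and $g_1(n)$, and sums the results to give $\hat{x}(n)$. The spectrum plot shows $H_0(\omega)$ as the low band and $H_1(\omega)$ as the high band, split at $\omega = \pi/2$.]
Figure 1 Two-band filter bank for one-dimensional subband coding and decoding.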
According to the Z-transform and its sampling theorem, we can express the
output as
$$\hat{X}(z) = \tfrac{1}{2}\left[H_0(z)G_0(z) + H_1(z)G_1(z)\right]X(z) + \tfrac{1}{2}\left[H_0(-z)G_0(z) + H_1(-z)G_1(z)\right]X(-z) \tag{1}$$
where the second term, which contains the $-z$ dependence, represents the
aliasing introduced by the downsampling-upsampling process.
For error-free reconstruction of the input, that is, $\hat{X}(z) = X(z)$, we impose the
following conditions:
$$\begin{aligned} H_0(-z)G_0(z) + H_1(-z)G_1(z) &= 0 \\ H_0(z)G_0(z) + H_1(z)G_1(z) &= 2 \end{aligned} \tag{2}$$
Eq. (2) reveals that the analysis and synthesis filters are cross-modulated. For finite
impulse response (FIR) filters, and ignoring the delay, we get
$$\begin{aligned} g_0(n) &= (-1)^n h_1(n) \\ g_1(n) &= (-1)^{n+1} h_0(n) \end{aligned} \tag{3}$$
Thus, FIR synthesis filters are cross-modulated copies of the analysis filters, with
one (and only one) being sign reversed.
Eq. (2) can also be used to demonstrate the biorthogonality of the analysis and
synthesis filters. That is,
$$\begin{aligned} \langle g_0(k), h_0(2n-k)\rangle &= \delta(n), & \langle g_0(k), h_1(2n-k)\rangle &= 0 \\ \langle g_1(k), h_1(2n-k)\rangle &= \delta(n), & \langle g_1(k), h_0(2n-k)\rangle &= 0 \end{aligned} \tag{4}$$
or
$$\langle h_i(2n-k), g_j(k)\rangle = \delta(i-j)\,\delta(n), \quad i, j \in \{0, 1\} \tag{5}$$
Filter banks satisfying this condition are called biorthogonal. Moreover, the analysis
and synthesis filter impulse responses of all two-band, real-coefficient, perfect
reconstruction filter banks are subject to the biorthogonality constraint.
One solution that satisfies the biorthogonality requirement of Eq. (5), and that is
used in the development of the fast wavelet transform, is called orthonormal. It requires
$$\langle g_i(n), g_j(n+2m)\rangle = \delta(i-j)\,\delta(m), \quad i, j \in \{0, 1\} \tag{6}$$
which defines orthonormality for perfect reconstruction filter banks. The four filters
are related by
$$g_1(n) = (-1)^n g_0(2K-1-n) \tag{7}$$
$$h_i(n) = g_i(2K-1-n), \quad i \in \{0, 1\} \tag{8}$$
where $2K$ denotes the number of coefficients in each filter. As can be seen, $g_1(n)$ is
related to $g_0(n)$, and both $h_0(n)$ and $h_1(n)$ are time-reversed versions of $g_0(n)$
and $g_1(n)$, respectively.
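The relations in Eqs. (7) and (8) can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes the 4-tap Daubechies low-pass filter as the prototype $g_0(n)$ (so $K = 2$), treats signals as finite sequences with full linear convolution, and keeps the even-indexed samples when downsampling. The helper names are illustrative only. Because Eq. (8) keeps the filters causal, the output is expected to equal the input delayed by $2K - 1$ samples.

```python
import math

def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def downsample(x):
    """Keep the even-indexed samples (downsampling by 2)."""
    return x[::2]

def upsample(x):
    """Insert a zero after every sample (upsampling by 2)."""
    out = []
    for v in x:
        out.extend([v, 0.0])
    return out

def orthonormal_bank(g0):
    """Build g1, h0, h1 from the prototype g0 using Eqs. (7) and (8)."""
    length = len(g0)  # length = 2K, assumed even
    g1 = [(-1) ** n * g0[length - 1 - n] for n in range(length)]
    h0 = g0[::-1]  # h0(n) = g0(2K - 1 - n)
    h1 = g1[::-1]  # h1(n) = g1(2K - 1 - n)
    return h0, h1, g1

def analyze_then_synthesize(x, g0):
    """Run x through the two-band analysis and synthesis banks of Figure 1."""
    h0, h1, g1 = orthonormal_bank(g0)
    y0 = downsample(convolve(x, h0))
    y1 = downsample(convolve(x, h1))
    v0 = convolve(upsample(y0), g0)
    v1 = convolve(upsample(y1), g1)
    return [a + b for a, b in zip(v0, v1)]

# Assumed prototype: 4-tap Daubechies low-pass filter.
s3 = math.sqrt(3.0)
g0 = [c / (4.0 * math.sqrt(2.0)) for c in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]

x = [1.0, 4.0, -3.0, 0.0, 2.0, 5.0]
x_hat = analyze_then_synthesize(x, g0)
delay = len(g0) - 1  # 2K - 1 samples
recovered = x_hat[delay:delay + len(x)]
print([round(v, 6) for v in recovered])
```

The samples of `x_hat` before and after the recovered window should be zero up to rounding, which reflects the cancellation of the aliasing term in Eq. (1).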
2. Multiresolution Analysis
In multiresolution analysis, a scaling function is used to create a series of
approximations of a signal. A wavelet function is used to encode the difference in
information between adjacent approximations.
A signal $f(x)$ can be analyzed as a linear combination of expansion functions
$$f(x) = \sum_k \alpha_k \varphi_k(x) \tag{9}$$
where the $\alpha_k$ are real-valued expansion coefficients, and the $\varphi_k(x)$ are real-valued
expansion functions. If the expansion is unique, the $\varphi_k(x)$ are called basis functions.
The function space of the expansion set $\{\varphi_k(x)\}$ is
$$V = \overline{\operatorname{span}_k \{\varphi_k(x)\}} \tag{10}$$
and $f(x) \in V$ means that $f(x)$ is in the span of $\{\varphi_k(x)\}$ and can be written in the
form of Eq. (9). The coefficients $\alpha_k$ are computed by taking the integral inner
products of the dual $\tilde{\varphi}_k(x)$'s and the function $f(x)$. That is,
$$\alpha_k = \langle \tilde{\varphi}_k(x), f(x) \rangle = \int \tilde{\varphi}_k^*(x) f(x)\,dx \tag{11}$$
If $\{\varphi_k(x)\}$ is an orthonormal basis for $V$, then $\varphi_k(x) = \tilde{\varphi}_k(x)$. If the $\varphi_k(x)$ are not
orthonormal but are an orthogonal basis for $V$, then the basis functions and their duals
are called biorthogonal. That is,
$$\langle \varphi_j(x), \tilde{\varphi}_k(x) \rangle = \delta_{jk} = \begin{cases} 0, & j \neq k \\ 1, & j = k \end{cases} \tag{12}$$
Now consider the set of expansion functions $\{\varphi_{j,k}(x)\}$ composed of integer
translations and binary scalings of the real, square-integrable function $\varphi(x)$, where
$$\varphi_{j,k}(x) = 2^{j/2} \varphi(2^j x - k) \tag{13}$$
for $k \in \mathbf{Z}$ and $\varphi(x) \in L^2(\mathbf{R})$. Because the shape of $\varphi_{j,k}(x)$ changes with $j$, $\varphi(x)$
is called a scaling function. We denote the subspace spanned over $k$ for any $j$ as
$$V_j = \overline{\operatorname{span}_k \{\varphi_{j,k}(x)\}} \tag{14}$$
The scaling function must satisfy four fundamental requirements of multiresolution
analysis:
1. The scaling function is orthogonal to its integer translates.
2. The subspaces spanned by the scaling function at low scales are nested within
those spanned at higher scales. That is,
$$V_{-\infty} \subset \cdots \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots \subset V_{\infty} \tag{15}$$
3. The only function that is common to all $V_j$ is $f(x) = 0$. That is,
$$V_{-\infty} = \{0\} \tag{16}$$
4. Any function can be represented with arbitrary precision. That is,
$$V_{\infty} = \{L^2(\mathbf{R})\} \tag{17}$$
The expansion functions of any subspace can be built from double-resolution copies
of themselves. That is,
$$\varphi(x) = \sum_n h_\varphi(n) \sqrt{2}\, \varphi(2x - n) \tag{18}$$
where the $h_\varphi(n)$ coefficients are called scaling function coefficients.
Given a scaling function that meets the multiresolution requirements, we can
define a wavelet function $\psi(x)$ that spans the difference between any two adjacent
scaling subspaces, $V_j$ and $V_{j+1}$. We can define the set $\{\psi_{j,k}(x)\}$ of wavelets
$$\psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k) \tag{19}$$
for all $k \in \mathbf{Z}$ that spans the space $W_j$, where
$$W_j = \overline{\operatorname{span}_k \{\psi_{j,k}(x)\}} \tag{20}$$
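[Figure 2 diagram: nested function spaces with $V_0$ innermost, surrounded by $W_0$ and then $W_1$, where $V_1 = V_0 \oplus W_0$ and $V_2 = V_1 \oplus W_1 = V_0 \oplus W_0 \oplus W_1$.]
Figure 2 The relationship between scaling and wavelet function spaces.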
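The scaling and wavelet function subspaces in Fig. 2 are related by
$$V_{j+1} = V_j \oplus W_j \tag{21}$$
We can now express the space of all measurable, square-integrable functions as
$$L^2(\mathbf{R}) = V_0 \oplus W_0 \oplus W_1 \oplus W_2 \oplus \cdots \tag{22}$$
or even
$$L^2(\mathbf{R}) = \cdots \oplus W_{-2} \oplus W_{-1} \oplus W_0 \oplus W_1 \oplus W_2 \oplus \cdots \tag{23}$$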
Similar to the scaling function, the wavelet function can be expressed as a weighted
sum of shifted, double-resolution scaling functions. That is,
$$\psi(x) = \sum_n h_\psi(n) \sqrt{2}\, \varphi(2x - n) \tag{24}$$
where the $h_\psi(n)$ are called the wavelet function coefficients. It can be shown that
$h_\psi(n)$ is related to $h_\varphi(n)$ by
$$h_\psi(n) = (-1)^n h_\varphi(1 - n) \tag{25}$$
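As a small illustration of Eq. (25), the sketch below derives wavelet function coefficients from scaling function coefficients. It is not from the original paper: the Haar and 4-tap Daubechies scaling coefficients are assumed example inputs, and the function name is illustrative. Because Eq. (25) uses the index $1 - n$, a scaling vector defined on $n = 0, \ldots, N-1$ yields wavelet coefficients on $n = 2-N, \ldots, 1$, so the result is stored as a dictionary keyed by $n$.

```python
import math

def wavelet_coefficients(h_phi):
    """Apply Eq. (25): h_psi(n) = (-1)^n * h_phi(1 - n).

    h_phi holds h_phi(0), ..., h_phi(N-1). The result maps each index n in
    2-N, ..., 1 to h_psi(n), since 1 - n must fall inside 0, ..., N-1.
    """
    count = len(h_phi)
    h_psi = {}
    for n in range(2 - count, 2):
        sign = -1.0 if n % 2 else 1.0
        h_psi[n] = sign * h_phi[1 - n]
    return h_psi

# Assumed example 1: Haar scaling coefficients.
haar_phi = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
haar_psi = wavelet_coefficients(haar_phi)

# Assumed example 2: 4-tap Daubechies scaling coefficients.
s3 = math.sqrt(3.0)
d4_phi = [c / (4.0 * math.sqrt(2.0)) for c in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]
d4_psi = wavelet_coefficients(d4_phi)

# Simple sanity checks on the Daubechies wavelet coefficients.
total = sum(d4_psi.values())                  # a wavelet filter sums to zero
energy = sum(v * v for v in d4_psi.values())  # unit energy
cross = sum(d4_phi[n] * d4_psi[n] for n in d4_psi if 0 <= n < len(d4_phi))

print(haar_psi)
print(round(total, 12), round(energy, 12), round(cross, 12))
```

For the Haar case this gives $h_\psi(0) = 1/\sqrt{2}$ and $h_\psi(1) = -1/\sqrt{2}$.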
3. Discrete Wavelet Transform
We begin by defining the wavelet series expansion of a function $f(x) \in L^2(\mathbf{R})$
relative to the wavelet $\psi(x)$ and scaling function $\varphi(x)$. We can write
$$f(x) = \sum_k c_{j_0}(k)\, \varphi_{j_0,k}(x) + \sum_{j=j_0}^{\infty} \sum_k d_j(k)\, \psi_{j,k}(x) \tag{26}$$
where $j_0$ is an arbitrary starting scale, the $c_{j_0}(k)$'s are normally called the
approximation or scaling coefficients, and the $d_j(k)$'s are called the detail or wavelet
coefficients. The expansion coefficients are calculated as
$$c_{j_0}(k) = \langle f(x), \tilde{\varphi}_{j_0,k}(x) \rangle = \int f(x)\, \tilde{\varphi}_{j_0,k}(x)\,dx \tag{27}$$
$$d_j(k) = \langle f(x), \tilde{\psi}_{j,k}(x) \rangle = \int f(x)\, \tilde{\psi}_{j,k}(x)\,dx \tag{28}$$
If the function being expanded is a sequence of numbers, such as samples of a
continuous function $f(x)$, the resulting coefficients are called the discrete wavelet
transform (DWT) of $f(x)$. The series expansion defined in Eqs. (26) to (28) then
becomes the DWT transform pair
$$W_\varphi(j_0, k) = \frac{1}{\sqrt{M}} \sum_{x=0}^{M-1} f(x)\, \tilde{\varphi}_{j_0,k}(x) \tag{29}$$
$$W_\psi(j, k) = \frac{1}{\sqrt{M}} \sum_{x=0}^{M-1} f(x)\, \tilde{\psi}_{j,k}(x) \tag{30}$$
for $j \geq j_0$ and
$$f(x) = \frac{1}{\sqrt{M}} \sum_k W_\varphi(j_0, k)\, \varphi_{j_0,k}(x) + \frac{1}{\sqrt{M}} \sum_{j=j_0}^{\infty} \sum_k W_\psi(j, k)\, \psi_{j,k}(x) \tag{31}$$
where $f(x)$, $\varphi_{j_0,k}(x)$, and $\psi_{j,k}(x)$ are functions of the discrete variable $x = 0, 1, 2, \ldots, M-1$.
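To make the transform pair concrete, the sketch below evaluates Eqs. (29) to (31) directly for a four-sample sequence. Several choices here are assumptions rather than statements from the paper: the Haar scaling function and wavelet are used (so the functions equal their duals), the basis functions are sampled at $x/M$ over the unit interval, $M$ is a power of 2, $j_0 = 0$, and the sample sequence $1, 4, -3, 0$ is an arbitrary example.

```python
import math

def haar_phi(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0

def haar_psi(t):
    """Haar wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def scaled(fn, j, k, t):
    """Eqs. (13) and (19): 2^(j/2) * fn(2^j * t - k)."""
    return 2.0 ** (j / 2.0) * fn(2.0 ** j * t - k)

def dwt(f, j0=0):
    """Direct evaluation of Eqs. (29) and (30); len(f) must be a power of 2."""
    size = len(f)
    top = size.bit_length() - 1  # J such that size = 2^J
    norm = 1.0 / math.sqrt(size)
    approx = {
        k: norm * sum(f[x] * scaled(haar_phi, j0, k, x / size) for x in range(size))
        for k in range(2 ** j0)
    }
    detail = {
        (j, k): norm * sum(f[x] * scaled(haar_psi, j, k, x / size) for x in range(size))
        for j in range(j0, top)
        for k in range(2 ** j)
    }
    return approx, detail

def idwt(approx, detail, size, j0=0):
    """Direct evaluation of Eq. (31)."""
    norm = 1.0 / math.sqrt(size)
    out = []
    for x in range(size):
        t = x / size
        value = sum(w * scaled(haar_phi, j0, k, t) for k, w in approx.items())
        value += sum(w * scaled(haar_psi, j, k, t) for (j, k), w in detail.items())
        out.append(norm * value)
    return out

f = [1.0, 4.0, -3.0, 0.0]
approx, detail = dwt(f)
restored = idwt(approx, detail, len(f))
print(approx, detail)
print(restored)
```

For this sequence the coefficients work out to $W_\varphi(0,0) = 1$, $W_\psi(0,0) = 4$, and $W_\psi(1,0) = W_\psi(1,1) = -1.5\sqrt{2}$, and Eq. (31) returns the original samples.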
4. The Fast Wavelet Transform
The fast wavelet transform (FWT) is a computationally efficient implementation
of the discrete wavelet transform (DWT) that exploits the relationship between the
coefficients of the DWT at adjacent scales. It is also called Mallat's herringbone
algorithm. The FWT resembles the two-band subband coding scheme of Section 1.
Consider again the multiresolution equation
$$\varphi(x) = \sum_n h_\varphi(n) \sqrt{2}\, \varphi(2x - n) \tag{32}$$
Scaling $x$ by $2^j$, translating it by $k$, and letting $m = 2k + n$ gives
$$\varphi(2^j x - k) = \sum_n h_\varphi(n) \sqrt{2}\, \varphi\big(2(2^j x - k) - n\big) = \sum_m h_\varphi(m - 2k) \sqrt{2}\, \varphi(2^{j+1} x - m) \tag{33}$$
Similarly,
$$\psi(2^j x - k) = \sum_m h_\psi(m - 2k) \sqrt{2}\, \varphi(2^{j+1} x - m) \tag{34}$$
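Now consider the discrete wavelet transform. Assume $\varphi(x) = \tilde{\varphi}(x)$ and $\psi(x) = \tilde{\psi}(x)$.
Substituting Eq. (13), and then Eq. (33), into Eq. (29) gives
$$W_\varphi(j_0, k) = \frac{1}{\sqrt{M}} \sum_x f(x)\, 2^{j_0/2} \varphi(2^{j_0} x - k) = \frac{1}{\sqrt{M}} \sum_x f(x)\, 2^{j_0/2} \sum_m h_\varphi(m - 2k) \sqrt{2}\, \varphi(2^{j_0+1} x - m) \tag{35}$$
Substituting Eq. (19), and then Eq. (34), into Eq. (30) gives
$$\begin{aligned} W_\psi(j, k) &= \frac{1}{\sqrt{M}} \sum_x f(x)\, 2^{j/2} \psi(2^j x - k) \\ &= \frac{1}{\sqrt{M}} \sum_x f(x)\, 2^{j/2} \left[ \sum_m h_\psi(m - 2k) \sqrt{2}\, \varphi(2^{j+1} x - m) \right] \\ &= \sum_m h_\psi(m - 2k) \left[ \frac{1}{\sqrt{M}} \sum_x f(x)\, 2^{(j+1)/2} \varphi(2^{j+1} x - m) \right] \end{aligned} \tag{36}$$
where the bracketed quantity in the last line is identical to the first expression in Eq. (35) with $j_0 = j + 1$ and $k = m$. Therefore,
$$W_\psi(j, k) = \sum_m h_\psi(m - 2k)\, W_\varphi(j + 1, m) \tag{37}$$
Similarly,
$$W_\varphi(j, k) = \sum_m h_\varphi(m - 2k)\, W_\varphi(j + 1, m) \tag{38}$$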
Eqs. (37) and (38) reveal a remarkable relationship between the DWT
coefficients of adjacent scales. Both $W_\varphi(j, k)$ and $W_\psi(j, k)$, the scale $j$
approximation and detail coefficients, can be computed by convolving
$W_\varphi(j + 1, k)$, the scale $j + 1$ approximation coefficients, with the time-reversed scaling
and wavelet vectors, $h_\varphi(-n)$ and $h_\psi(-n)$, and subsampling the results. We can
rewrite Eqs. (37) and (38) as
$$W_\psi(j, k) = h_\psi(-n) * W_\varphi(j + 1, n)\,\Big|_{n = 2k,\ k \geq 0} \tag{39}$$
$$W_\varphi(j, k) = h_\varphi(-n) * W_\varphi(j + 1, n)\,\Big|_{n = 2k,\ k \geq 0} \tag{40}$$
which can be illustrated with the block diagram of Figure 3.
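[Figure 3 block diagram: $W_\varphi(j + 1, n)$ is filtered by $h_\psi(-n)$ and downsampled by 2 to give $W_\psi(j, n)$, and filtered by $h_\varphi(-n)$ and downsampled by 2 to give $W_\varphi(j, n)$.]
Figure 3 An FWT analysis filter bank.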
If function f ( x) is sampled above the Nyquist rate, its samples are good
approximations of the scaling coefficients and can be used as the starting
high-resolution scaling coefficient inputs. Therefore, no wavelet or detail coefficients
are needed at the sampling scale.
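The recursion from scale $j + 1$ to scale $j$ can be sketched in a few lines. This is an illustrative example under stated assumptions, not the paper's implementation: it uses Haar filters indexed from $n = 0$, evaluates the sums $\sum_n h(n)\, W_\varphi(j+1, 2k + n)$ (the filter-and-downsample step written with $n = m - 2k$), treats the input samples as the starting scaling coefficients as described above, and wraps indices periodically at the boundary (which never triggers for the two-tap Haar filters). The sample sequence and function names are hypothetical.

```python
import math

def fwt_step(w, h_phi, h_psi):
    """One analysis stage: scale j+1 approximation -> scale j approximation and detail."""
    size = len(w)
    approx, detail = [], []
    for k in range(size // 2):
        # Substituting n = m - 2k turns sum_m h(m - 2k) W(j+1, m)
        # into sum_n h(n) W(j+1, 2k + n).
        approx.append(sum(c * w[(2 * k + n) % size] for n, c in enumerate(h_phi)))
        detail.append(sum(c * w[(2 * k + n) % size] for n, c in enumerate(h_psi)))
    return approx, detail

def fwt(samples, h_phi, h_psi, levels):
    """Iterate the analysis stage; detail lists are returned coarsest scale first."""
    approx = list(samples)
    details = []
    for _ in range(levels):
        approx, d = fwt_step(approx, h_phi, h_psi)
        details.insert(0, d)
    return approx, details

a = 1.0 / math.sqrt(2.0)
h_phi = [a, a]    # assumed Haar scaling coefficients
h_psi = [a, -a]   # assumed Haar wavelet coefficients

f = [1.0, 4.0, -3.0, 0.0]
approx, details = fwt(f, h_phi, h_psi, levels=2)
print(approx, details)
```

With this input the coarsest approximation is 1, the coarsest detail is 4, and the finer details are both $-1.5\sqrt{2}$, using only additions and multiplications proportional to the signal length.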
The inverse fast wavelet transform ($\mathrm{FWT}^{-1}$) uses the level $j$ approximation and
detail coefficients to generate the level $j + 1$ approximation coefficients. Noting the
similarity between the FWT analysis filter bank in Figure 3 and the two-band subband
analysis portion of Figure 1, the $\mathrm{FWT}^{-1}$ has the synthesis filter bank of Figure 4.
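[Figure 4 block diagram: $W_\psi(j, n)$ is upsampled by 2 and filtered by $h_\psi(n)$, $W_\varphi(j, n)$ is upsampled by 2 and filtered by $h_\varphi(n)$, and the two results are summed to give $W_\varphi(j + 1, n)$.]
Figure 4 An $\mathrm{FWT}^{-1}$ synthesis filter bank.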
By the subband coding theory of Section 1, perfect reconstruction for two-band
orthonormal filters requires $g_i(n) = h_i(-n)$ for $i \in \{0, 1\}$. That is, the synthesis and
analysis filters must be time-reversed versions of one another. Since the FWT analysis
filters are $h_0(n) = h_\psi(-n)$ and $h_1(n) = h_\varphi(-n)$, the required $\mathrm{FWT}^{-1}$ synthesis filters
are $g_0(n) = h_0(-n) = h_\psi(n)$ and $g_1(n) = h_1(-n) = h_\varphi(n)$.
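The synthesis stage can be sketched in the same spirit: each coefficient is upsampled and filtered with $h_\varphi(n)$ or $h_\psi(n)$, and the two branches are summed. The example below is a sketch under assumptions that are not in the paper: Haar filters indexed from $n = 0$, periodic index wrapping at the boundary, and a hypothetical set of input coefficients (the two-scale Haar coefficients of the sequence $1, 4, -3, 0$).

```python
import math

def ifwt_step(approx, detail, h_phi, h_psi):
    """One synthesis stage: scale j approximation and detail -> scale j+1 approximation."""
    size = 2 * len(approx)
    out = [0.0] * size
    for k in range(len(approx)):
        # Upsampling places coefficient k at position 2k; filtering spreads it
        # over positions 2k + n with weight h(n).
        for n, c in enumerate(h_phi):
            out[(2 * k + n) % size] += c * approx[k]
        for n, c in enumerate(h_psi):
            out[(2 * k + n) % size] += c * detail[k]
    return out

def ifwt(approx, details, h_phi, h_psi):
    """Iterate the synthesis stage; details are expected coarsest scale first."""
    for d in details:
        approx = ifwt_step(approx, d, h_phi, h_psi)
    return approx

a = 1.0 / math.sqrt(2.0)
h_phi = [a, a]    # assumed Haar scaling coefficients
h_psi = [a, -a]   # assumed Haar wavelet coefficients

coarse = [1.0]
details = [[4.0], [-1.5 * math.sqrt(2.0), -1.5 * math.sqrt(2.0)]]
restored = ifwt(coarse, details, h_phi, h_psi)
print([round(v, 6) for v in restored])
```

The output recovers the sequence $1, 4, -3, 0$ up to rounding, which is the perfect reconstruction property that the time-reversal condition guarantees for orthonormal filters.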
5. Wavelet Transforms in Two Dimensions
In two dimensions, a two-dimensional scaling function, $\varphi(x, y)$, and three
two-dimensional wavelets, $\psi^H(x, y)$, $\psi^V(x, y)$, and $\psi^D(x, y)$, are required. Each is
the product of a one-dimensional scaling function $\varphi$ and corresponding wavelet $\psi$.
$$\varphi(x, y) = \varphi(x)\,\varphi(y) \tag{41}$$
$$\psi^H(x, y) = \psi(x)\,\varphi(y) \tag{42}$$
$$\psi^V(x, y) = \psi(y)\,\varphi(x) \tag{43}$$
$$\psi^D(x, y) = \psi(x)\,\psi(y) \tag{44}$$
where $\psi^H$ measures variations along columns (like horizontal edges), $\psi^V$ responds
to variations along rows (like vertical edges), and $\psi^D$ corresponds to variations
along diagonals.
Like the one-dimensional discrete wavelet transform, the two-dimensional DWT
can be implemented using digital filters and downsamplers. With separable
two-dimensional scaling and wavelet functions, we simply take the one-dimensional
FWT of the rows of f (x, y), followed by the one-dimensional FWT of the resulting
columns. Figure 5 shows the process in block diagram form.
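[Figure 5 block diagram: $W_\varphi(j + 1, m, n)$ passes through a first stage of two $h(-n)$ filters (labeled Columns), each followed by downsampling by 2, and then a second stage of four $h(-m)$ filters (labeled Rows), each followed by downsampling by 2, giving the outputs $W_\psi^D(j, m, n)$, $W_\psi^V(j, m, n)$, $W_\psi^H(j, m, n)$, and $W_\varphi(j, m, n)$.]
Figure 5 The two-dimensional FWT: the analysis filter bank.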
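[Figure 6 layout: $W_\varphi(j + 1, m, n)$ is replaced by four quarter-size quadrants, with $W_\varphi(j, m, n)$ at the top left, $W_\psi^H(j, m, n)$ at the top right, $W_\psi^V(j, m, n)$ at the bottom left, and $W_\psi^D(j, m, n)$ at the bottom right.]
Figure 6 Two-scale two-dimensional decomposition.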
WD ( j, m, n)
2
h (m)
Rows
WV ( j, m, n)
2
2
+
h (  m )
h (n)
Columns
Rows
WH ( j, m, n)
2
+
h (m)
2
2
+
Rows
W ( j , m, n)
W ( j  1, m, n)
h ( n)
Columns
h (  m )
Rows
Figure 7 The two-dimensional FWT  the synthesis filter bank.
The single-scale filter bank of Figure 5 can be "iterated" by tying the
approximation output to the input of another filter bank to produce an arbitrary-scale
transform. As in the one-dimensional case, the image $f(x, y)$ is used as the first scale input,
and the outputs are four quarter-size subimages: $W_\varphi$, $W_\psi^H$, $W_\psi^V$, and $W_\psi^D$. These
subimages are shown in the middle of Figure 6. Two iterations of the filtering process
produce the two-scale decomposition at the right of Figure 6. Figure 7 shows the
synthesis filter bank that reverses the process described above.
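The separable row-then-column procedure can be sketched for one scale as follows. This is an illustrative example, not the paper's code: it assumes Haar filters, an image stored as a list of rows with even width and height, and hypothetical function names. The subband named `horizontal` below is low-pass along each row and high-pass down each column, so it responds to horizontal edges; mapping it to $W_\psi^H$ is a reading of the description of $\psi^H$ above.

```python
import math

def haar_1d(values):
    """One-level Haar analysis of a sequence of even length."""
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail

def transpose(matrix):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(row) for row in zip(*matrix)]

def filter_columns(matrix):
    """Apply haar_1d down every column; return (low-pass, high-pass) matrices."""
    low, high = [], []
    for column in transpose(matrix):
        a, d = haar_1d(column)
        low.append(a)
        high.append(d)
    return transpose(low), transpose(high)

def haar_2d(image):
    """One scale of the separable 2-D transform: rows first, then columns."""
    low_rows, high_rows = [], []
    for row in image:
        a, d = haar_1d(row)
        low_rows.append(a)
        high_rows.append(d)
    approx, horizontal = filter_columns(low_rows)
    vertical, diagonal = filter_columns(high_rows)
    return approx, horizontal, vertical, diagonal

# Hypothetical 4x4 test image with a horizontal edge between rows 0 and 1.
image = [[0.0] * 4, [10.0] * 4, [10.0] * 4, [10.0] * 4]
approx, horizontal, vertical, diagonal = haar_2d(image)
print(approx)
print(horizontal)
print(vertical, diagonal)
```

Each output is a quarter-size subimage. For this test image only the approximation and the horizontal-edge subband are nonzero, and feeding `approx` back into `haar_2d` would give the second scale of the decomposition.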
6. Image Compression
When an image has been processed by the DWT, the total number of transform
coefficients is equal to the number of samples in the original image, but the important
visual information is concentrated in a few coefficients. To reduce the number of bits
needed to represent the transform, all the subbands are quantized. Quantization of the
DWT subbands is one of the main sources of information loss. In the JPEG2000
standard, quantization is performed by uniform scalar quantization with a dead zone
about the origin. In a dead-zone scalar quantizer with step size $\Delta_j$, the width of the
dead zone is $2\Delta_j$, as shown in Figure 8. The standard supports a separate quantization
step size for each subband. The quantization step size $\Delta_j$ for subband $j$ is calculated
from the dynamic range of the subband values. The formula for uniform scalar
quantization with a dead zone is
$$q_j(m, n) = \operatorname{sign}\big(W_j(m, n)\big) \left\lfloor \frac{|W_j(m, n)|}{\Delta_j} \right\rfloor \tag{45}$$
where $W_j(m, n)$ is a DWT coefficient in subband $j$ and $\Delta_j$ is the quantization step size
for subband $j$. All the resulting quantized DWT coefficients $q_j(m, n)$ are signed
integers.
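Eq. (45) translates directly into code. The sketch below is a minimal illustration rather than the JPEG2000 reference behavior: the function names, the sample coefficients, and the step size of 2 are assumptions chosen for the example.

```python
import math

def deadzone_quantize(coefficient, step):
    """Eq. (45): sign(W) * floor(|W| / step)."""
    sign = (coefficient > 0) - (coefficient < 0)
    return sign * math.floor(abs(coefficient) / step)

def quantize_subband(subband, step):
    """Quantize every coefficient of a subband stored as a list of rows."""
    return [[deadzone_quantize(w, step) for w in row] for row in subband]

# Hypothetical subband and step size.
subband = [[-3.7, 1.9], [4.0, -1.99]]
indices = quantize_subband(subband, step=2.0)
print(indices)  # [[-1, 0], [2, 0]]
```

Every coefficient with magnitude below the step size maps to index 0, so the zero bin spans $(-\Delta_j, \Delta_j)$ and is twice as wide as the other bins.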
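[Figure 8 number line: decision intervals of width $\Delta_j$ on either side of a central dead zone of width $2\Delta_j$, labeled with quantizer indices $-3, -2, -1, 0, 1, 2, 3$.]
Figure 8 Dead-zone quantization about the origin.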
After quantization, the quantized DWT coefficients are entropy coded
to remove coding redundancy.
7. Simulation Results
Finally, we compare wavelet-based image compression with DCT-based
image compression at similar compression ratios. In the figures that follow, the
left column shows DCT-based images and the right column shows wavelet-based images.
Both methods use progressively larger quantization step sizes to reach progressively higher
compression ratios, and we observe the difference between the two compression methods.
I1: Original image with width W and height H
C: Encoded JPEG stream from I1
I2: Decoded image from C
[Block diagram: I1 -> JPEG encoder -> C -> JPEG decoder -> I2]
CR (compression ratio) = sizeof(I1) / sizeof(C)
RMS (root mean square error) =
$$\sqrt{\frac{1}{H\,W} \sum_{y=1}^{H} \sum_{x=1}^{W} \big[ I_1(x, y) - I_2(x, y) \big]^2}$$
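The two measures used in the comparison can be computed as in the sketch below. It is illustrative only: images are assumed to be lists of rows of pixel values, sizes are assumed to be given in bytes, and the function names and sample values are hypothetical rather than taken from the experiment.

```python
import math

def compression_ratio(original_size, coded_size):
    """CR = sizeof(I1) / sizeof(C), with both sizes in the same unit."""
    return original_size / coded_size

def rms_error(image1, image2):
    """Root mean square error between two images of identical dimensions."""
    height = len(image1)
    width = len(image1[0])
    total = sum(
        (image1[y][x] - image2[y][x]) ** 2
        for y in range(height)
        for x in range(width)
    )
    return math.sqrt(total / (height * width))

# Hypothetical 2x2 images and file sizes.
original = [[10, 20], [30, 40]]
decoded = [[12, 20], [30, 36]]
print(compression_ratio(65536, 8192))  # 8.0
print(rms_error(original, decoded))    # sqrt(5), about 2.236
```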
[Images: the original image, followed by three pairs of reconstructions, DCT-based on the left and wavelet-based on the right.]

              DCT-based                Wavelet-based
Image pair    CR         RMS           CR         RMS
1             11.2460    4.1316        10.3565    4.0104
2             27.7401    6.9763        26.4098    6.8480
3             53.4333    10.9662       51.3806    9.6947
We can see that the results of the two methods are similar when the compression ratio is
low. As the compression ratio increases, the DCT-based method produces
increasingly visible 'blocky and blurry' artifacts, while the wavelet-based
method does not. In fact, the RMS errors of the two methods are similar at the same
compression ratio, but the wavelet-based result looks better to the human eye than the
DCT-based one.
Conclusion
The wavelet-based image compression method used in JPEG2000 is the new
standard for still image compression. It provides a new framework and an integrated
toolbox to better address increasing needs for compression. It also provides a wide
range of functionalities for still image applications. Its characteristics include
lossless and lossy coding, embedded lossy-to-lossless coding, progressive transmission
by resolution and quality, high compression efficiency, error resilience, and lossless
color transformations. Comparative results have shown that JPEG2000 is indeed superior to
existing still image compression standards. Work is still needed to optimize its
implementation performance.