A Comparative Study of DCT, ... DWT-based Image Coders Warit Wichakool

A Comparative Study of DCT, LOT, and
DWT-based Image Coders
by
Warit Wichakool
Submitted to the Department of Electrical Engineering and Computer
Science
in partial fulfillment of the requirements for the degrees of
Bachelor of Science in Electrical Engineering and Computer Science
and
Master of Engineering in Electrical Engineering and Computer Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2001
© Warit Wichakool, MMI. All rights reserved.
The author hereby grants to MIT permission to reproduce and
distribute publicly paper and electronic copies of this thesis document
in whole or in part.
Author ......
Department of Electrical Engineering and Computer Science
May 23, 2001
Certified by.....
K.S. Thyagarajan
Principal Engineer
VI-A Company Thesis Supervisor
Certified by...
David H. Staelin
Professor of Electrical Engineering
M.I.T. Thesis Supervisor
Accepted by...........
Arthur C. Smith
Chairman, Department Committee on Graduate Students
A Comparative Study of DCT, LOT, and DWT-based Image
Coders
by
Warit Wichakool
Submitted to the Department of Electrical Engineering and Computer Science
on May 23, 2001, in partial fulfillment of the
requirements for the degrees of
Bachelor of Science in Electrical Engineering and Computer Science
and
Master of Engineering in Electrical Engineering and Computer Science
Abstract
Ten transform coding systems were compared in terms of their system complexity
and peak signal-to-noise ratios (PSNR). The results showed that the relationship
between PSNR and bit rate was affected by the combination of the quantizer and
the encoder, but was not affected by the type of transform. The performance of the
embedded zerotree wavelet (EZW) system was affected by the number of discrete
wavelet transform (DWT) levels. In comparison with the 2-level EZW system at a
given bit rate, the EZW system improved PSNR by up to 4 dB at low bit rates as
the number of levels increased from two to three, and gained another 1 dB as the
number of levels increased from three to four. Four of the ten systems were studied
further: the discrete cosine transform (DCT) baseline JPEG, the lapped orthogonal
transform (LOT) version of baseline JPEG, the visual threshold wavelet with the
run-length Huffman coder, and the EZW with the adaptive Huffman coder. The
PSNR values for the Lena image at 0.5 bit/pixel for the four systems were 34.56,
34.43, 34.97, and 34.52 dB, respectively. In comparison with the DCT JPEG, the
LOT JPEG provided 0.5 dB better PSNR and also reduced the image blockiness, but
it introduced small ringing artifacts in areas with sharp edges. The visual threshold
wavelet yielded better PSNR than the DCT system at the same bit rate, but the
reconstructed image suffered from blurriness. Finally, the EZW system performed
comparably to the DCT system. Although the reconstructed image exhibited no
blockiness, it clearly lost some details.
VI-A Company Thesis Supervisor: K.S. Thyagarajan
Title: Principal Engineer
M.I.T. Thesis Supervisor: David H. Staelin
Title: Professor of Electrical Engineering
Acknowledgments
I would like to take this opportunity to express my gratitude to the many people
who helped me over the course of my thesis. First of all, I would like to thank
K.S. Thyagarajan for his advice and guidance. I also would like to thank Professor
Staelin for his guidance and comments on my thesis, and Henrique Malvar for
providing the programs and references of the LOT system for my simulation.
I also would like to thank all members of Digital Cinema at QUALCOMM INCORPORATED
for their support during my research at the company. In addition, I would like to
thank Peter Agboh and Songpon Deechongkit for their comments on my research.
Finally, I have to thank Yui (Siraprapha Sanchatjate), my family, and all my friends
for all the mental support and encouragement they have given me throughout the
years at MIT.
Contents
1 Introduction
2 Background
  2.1 Transform
    2.1.1 Discrete Cosine Transform
    2.1.2 Lapped Orthogonal Transform
    2.1.3 Discrete Wavelet Transform
  2.2 Quantization
    2.2.1 Optimal Uniform Quantizer
    2.2.2 JPEG Uniform Quantizer
    2.2.3 Visual Threshold Wavelet Quantizer
    2.2.4 Embedded Zerotree Wavelet Quantizer
  2.3 Entropy Coding
    2.3.1 Huffman Coding
    2.3.2 Adaptive Huffman Coding
    2.3.3 Run-Length Huffman Coding
3 Simulation Methods
  3.1 Part I: System Complexity
  3.2 Part II: System Performance
    3.2.1 Part II-A: Effect of Number of Levels on DWT Systems
    3.2.2 Part II-B: Effect of Transform
    3.2.3 Part II-C: Effect of Quantizer
    3.2.4 Part II-D: Effect of Entropy Coder
    3.2.5 Part II-E: Overall System Performance
  3.3 Part III: Visual Quality
  3.4 Test Images
4 Results and Discussions
  4.1 Part I: System Complexity
    4.1.1 Transform
    4.1.2 Quantization
    4.1.3 Entropy Coding
  4.2 Part II: System Performance
    4.2.1 Part II-A: Effect of Number Levels on DWT Systems
    4.2.2 Part II-B: Effect of Transform
    4.2.3 Part II-C: Effect of Quantizer
    4.2.4 Part II-D: Effect of Entropy Coder
    4.2.5 Part II-E: Overall System Performance
  4.3 Part III: Visual Quality
5 Summary
A General Thesis Release Letter and Classification Review Letter
List of Figures
2-1 Basic transform coding system
2-2 Block diagram for a separable 2-D transform
2-3 Flowgraph conventions
2-4 1-D DCT basis functions
2-5 Fast DCT for M=8
2-6 Fast IDCT for M=8
2-7 General structure of the LOT
2-8 LOT basis functions
2-9 Type-I, fast LOT for a block size of 16
2-10 Type-I, fast ILOT for a block size of 16
2-11 Type-I, fast LOT for the finite length signal
2-12 2-level, 2-band analysis and synthesis for the 1-D DWT
2-13 Organization of 3-level DWT coefficients
2-14 Locations of parent-descendants of the 3-level EZW
2-15 Zig-zag scan of the run-length Huffman coder
2-16 Flowgraph of the run-length Huffman coder
4-1 Effect of number of DWT levels on the PSNR performance of the DWT systems using the EZW quantizer and the adaptive Huffman coder
4-2 Efficiency test for the adaptive Huffman coder
4-3 bfrag1: original image at 8 bpp
4-4 bfrag2: DCT + JPEG + Run-Length Huffman at 0.50 bpp
4-5 bfrag3: 3-level DWT + EZW + Adaptive Huffman at 0.50 bpp
4-6 bfrag4: 4-level DWT + EZW + Adaptive Huffman at 0.50 bpp
4-7 bfrag5: 3-level DWT + Visual Threshold + Run-Length Huffman at 0.50 bpp
4-8 bfrag6: LOT + JPEG + Run-Length Huffman at 0.50 bpp
4-9 lena1: original image at 8 bpp
4-10 lena2: DCT + JPEG + Run-Length Huffman at 0.50 bpp
4-11 lena3: 3-level DWT + EZW + Adaptive Huffman at 0.50 bpp
4-12 lena4: 4-level DWT + EZW + Adaptive Huffman at 0.50 bpp
4-13 lena5: LOT + JPEG + Run-Length Huffman at 0.50 bpp
4-14 lena6: 3-level DWT + Visual Threshold + Run-Length Huffman at 0.50 bpp
List of Tables
2.1 Computational costs of the fast DCT
2.2 Computational costs of the type-I, fast LOT
2.3 Coefficients of 9/7-tap Villasenor biorthogonal filters
2.4 Computational costs of the DWT
2.5 Basis function amplitudes for 9/7-tap biorthogonal filters
2.6 Quantization levels for 9/7-tap biorthogonal filters
3.1 List of tested transform coding systems
3.2 List of systems for comparing the effect of number of DWT levels on the PSNR performance
3.3 List of systems for comparing the effect of transforms on the PSNR performance using the optimal uniform quantizer, the Huffman coder, and the run-length Huffman coder
3.4 List of systems for comparing the effect of quantizers on the PSNR performance using the run-length Huffman coder
3.5 List of systems for comparing the effect of entropy coders on the PSNR performance using Huffman and run-length Huffman coders
3.6 List of selected systems for comparing the PSNR performance
3.7 List of test images
4.1 Computational costs of transform algorithms
4.2 Normalized costs of transforming a 512x512 image
4.3 List of systems for comparing the effect of DWT levels on the PSNR performance of the DWT systems
4.4 Effect of DWT levels on the PSNR performance of the DWT systems using the optimal uniform quantizer and the run-length Huffman coder
4.5 Effect of DWT levels on the PSNR performance of the DWT systems using the optimal uniform quantizer and the run-length Huffman coder (cont.)
4.6 Effect of DWT levels on the PSNR performance of the DWT systems using the visual threshold quantizer and the run-length Huffman coder
4.7 Effect of DWT levels on the PSNR performance of the DWT systems using the visual threshold quantizer and the run-length Huffman coder (cont.)
4.8 Effect of DWT levels on the PSNR performance of the DWT systems using the EZW quantizer and the adaptive Huffman coder
4.9 Effect of DWT levels on the PSNR performance of the DWT systems using the EZW quantizer and the adaptive Huffman coder (cont.)
4.10 List of systems for comparing the effect of transforms on the PSNR performance
4.11 Effect of transforms on the PSNR performance using the optimal uniform quantizer and the Huffman coder
4.12 Effect of transforms on the PSNR performance using the optimal uniform quantizer and the Huffman coder (cont.)
4.13 Effect of transforms on the PSNR performance using the optimal uniform quantizer and the run-length Huffman coder
4.14 Effect of transforms on the PSNR performance using the optimal uniform quantizer and the run-length Huffman coder (cont.)
4.15 List of systems for comparing the effect of quantizers on the PSNR performance using the run-length Huffman coder
4.16 Effect of quantizers on the PSNR performance of the DCT systems using the run-length Huffman coder
4.17 Effect of quantizers on the PSNR performance of the DCT systems using the run-length Huffman coder (cont.)
4.18 Effect of quantizers on the PSNR performance of the LOT systems using the run-length Huffman coder
4.19 Effect of quantizers on the PSNR performance of the LOT systems using the run-length Huffman coder (cont.)
4.20 Effect of quantizers on the PSNR performance of the DWT systems using the run-length Huffman coder
4.21 Effect of quantizers on the PSNR performance of the DWT systems using the run-length Huffman coder (cont.)
4.22 List of systems for comparing the effect of entropy coders on the PSNR performance using the optimal uniform quantizer
4.23 Effect of entropy coders on the PSNR performance of the DCT systems using the optimal uniform quantizer
4.24 Effect of entropy coders on the PSNR performance of the DCT systems using the optimal uniform quantizer (cont.)
4.25 Effect of entropy coders on the PSNR performance of the LOT systems using the optimal uniform quantizer
4.26 Effect of entropy coders on the PSNR performance of the LOT systems using the optimal uniform quantizer (cont.)
4.27 Effect of entropy coders on the PSNR performance of the DWT systems using the optimal uniform quantizer
4.28 Effect of entropy coders on the PSNR performance of the DWT systems using the optimal uniform quantizer (cont.)
4.29 List of selected systems for the comparison in PSNR performance
4.30 Comparison of the PSNR performance of DCT, LOT, and DWT systems
4.31 Comparison of the PSNR performance of DCT, LOT, and DWT systems (cont.)
4.32 Result images and the corresponding PSNRs
Chapter 1
Introduction
Modern media are saturated with graphics such as images and movies. Constraints
on bandwidth and memory space create trade-offs between the size and quality of
images.
One solution is to compress the image using a transform coding system.
The basic transform coding system consists of three blocks: a transform block, a
quantization block, and an entropy coding block. At a given bit rate, different systems
yield different image quality. In addition, the systems differ in algorithmic
complexity. Within a system, the complexity and the image quality may be influenced
mainly by a single functional block or by a combination of all three blocks. It is
therefore hard to determine the best system to use.
One of the most popular compression systems is the discrete cosine transform
(DCT) system recommended by the Joint Photographic Experts Group (JPEG). The
simplest JPEG system is the baseline JPEG. This system combines the DCT with a
uniform scalar quantizer and a run-length Huffman coder [1], [9]. The DCT system
provides high quality images at reasonable bit rates, but exhibits blockiness at low
bit rates. This artifact is caused by the independent processing of the transform
blocks and the discontinuity of the DCT basis functions [7], [14]. This artifact can be
reduced by increasing the bit rate or using more complex systems. However, the gain
in quality may not justify the additional complexity of the system.
Another solution to reduce the blocking effect is to change the transform block
to the lapped orthogonal transform (LOT). This transform solves the problem by
overlapping and modifying the DCT basis functions to eliminate the independent
transformation of image blocks and the discontinuity of the basis functions [7]. The
LOT is similar to the DCT because it is a block transform. Furthermore, the LOT
uses the DCT bases as its building blocks for new basis functions [7]. However, there
is no standard quantizer or entropy coder for the LOT system. Since the LOT is
similar to the DCT, the LOT may benefit from the JPEG-like quantizer and encoder
in the same way as the DCT. Therefore, the blocking effect can be reduced by
replacing the transform block with the LOT and updating the parameters of the
quantizer and the encoder accordingly. This solution may be the simplest way to
improve the visual quality of the image.
Recently, wavelet-based systems have been extensively studied and developed for
image compression applications. Many of the candidates for the JPEG-2000 standard
are wavelet-based [6]. The discrete wavelet transform (DWT) is a form of sub-band
coding. One implementation of the DWT system employs a filter bank to separate the
signal into a number of frequency bands. Each band is then quantized and encoded
in a way that depends on the system. These filter banks operate on the whole image
instead of on a block, as the DCT and the LOT do. As a result, the DWT should
eliminate the image blockiness completely. In addition, the DWT system can gain
compression from a special structure called a zerotree. Sample encoders based on
this structure are the embedded zerotree wavelet (EZW) and the set partitioning in
hierarchical trees (SPIHT). Both encoders use a bit-plane coding scheme and the
zerotree structure to compress the information and are claimed to provide higher
PSNR than the baseline DCT JPEG at given bit rates [10], [11]. In addition, both the
EZW and the SPIHT are embedded systems that can achieve the targeted bit rate
exactly. Furthermore,
the encoders do not have to send a code table in the encoded file because the decoders
will generate the code table during the decoding process [11]. These properties allow
both systems to perform progressive coding. Another wavelet system is the visual
threshold system. It uses a predefined quantization level for each wavelet band [16].
This quantization scheme is similar to the baseline JPEG. Therefore, this system
can be adapted to use the run-length coder of the baseline JPEG system with some
modifications.
The choices above present alternatives for image compression. All transform coding systems may achieve the same PSNR and bit rate but at different costs. Since
the system consists of three functional blocks: the transform, quantizer, and entropy
coder, the performance of each system may depend on one particular block or the
combination of the blocks. It is useful to compare the PSNR gain for each type
of transform, quantizer, and entropy coder independently. The PSNR gain of the
transform block can be compared by using the same type of the quantizer and the
entropy coder for all transforms. The PSNR gain, as a result, is influenced mainly
by the transform block. The comparison of quantizers can be done by restricting the
choice of transform and entropy coder to be the same for all compared systems. The
comparison of entropy coders can be done in a similar manner. The results of these
comparisons should reveal the optimal encoding scheme for each transform.
This thesis compared several transform coding systems in terms of their system
complexity and peak signal-to-noise ratios (PSNR) at given bit rates. The tested
transforms included the discrete cosine transform (DCT), the lapped orthogonal
transform (LOT), and the discrete wavelet transform (DWT). The tested quantizers included the optimal uniform quantizer, the JPEG uniform quantizer, the visual
threshold quantizer, and the embedded zerotree wavelet (EZW). Finally, the tested
entropy coders included the Huffman coder, the adaptive Huffman coder, and the run-length Huffman coder. The system complexity was compared in terms of the number of
additions and multiplications. This thesis focused primarily on the transform block
for the complexity comparison.
However, comparisons of quantizers and entropy
coders were also included. Selected combinations of these three components were
used to study the influence of each component on the PSNR performance. Finally,
four systems were studied further to compare overall system performance including
the visual quality of reconstructed images. They were the baseline DCT JPEG, the
baseline LOT JPEG, the visual threshold wavelet with the run-length Huffman coder,
and the EZW with the adaptive Huffman coder. In all comparisons, seven standard
test images were used, including Lena and Barbara.
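
The two quantities compared throughout, PSNR and bit rate, can be sketched as follows. This is a minimal illustration rather than the code used in this thesis: it assumes the common PSNR definition for 8-bit images with a peak value of 255, and the function names `psnr` and `bits_per_pixel` are hypothetical.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB (assumes an 8-bit peak of 255 by default)."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((original - reconstructed) ** 2)  # mean squared error
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def bits_per_pixel(compressed_bytes, rows, cols):
    """Bit rate of a compressed image in bits per pixel."""
    return 8.0 * compressed_bytes / (rows * cols)
```

Under these assumptions, a 512x512 image compressed to 16384 bytes has a bit rate of 0.5 bit/pixel, the rate quoted for the Lena results in the abstract.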
This thesis is organized as follows. Chapter 2 presents the background and algorithm
implementation of the functional blocks used in this thesis. Chapter 3 lists the
comparison methods and the corresponding tested systems. In Chapter 4, the results
are presented and discussed. Finally, the thesis is concluded and further studies are
proposed in Chapter 5.
Chapter 2
Background
This chapter presents brief background and the algorithm implementation of all components of the still image compression systems used in this thesis. A simple transform coding
system consists of three functional blocks: transform, quantization, and entropy coding. A block diagram of a simple encoder/decoder system is shown in Figure 2-1
below.
Figure 2-1: Block diagram of a basic transform coding system for still
image compression. (a) The encoding part of the system: original image -> Transform -> Quantization -> Entropy Coding -> compressed image. (b) The decoding counterpart of the system: compressed image -> Entropy Decoding -> Dequantization -> Inverse Transform -> reconstructed image.
An original image is passed through the transform block that outputs transform
coefficients. The quantizer then reduces the coefficients to fewer numbers or symbols.
Finally, the entropy coder translates those numbers or symbols into a stream of binary
bits in order to be stored or transmitted. The components used in the simulations
in this thesis are listed below.
* Transform:
  (a) Discrete Cosine Transform (DCT)
  (b) Lapped Orthogonal Transform (LOT)
  (c) Discrete Wavelet Transform (DWT)
* Quantization:
  (a) Optimal Uniform Quantizer
  (b) JPEG Uniform Quantizer
  (c) Visual Threshold Wavelet Quantizer
  (d) Embedded Zerotree Wavelet Quantizer (EZW)
* Entropy Coding:
  (a) Huffman
  (b) Adaptive Huffman
  (c) Run-Length Huffman
The quantizers and coders listed above may be specifically designed for particular
transform coefficients.
Some modifications are required in order to use the same
quantizer or entropy coder for different transforms. The modifications are described
in the background section of that particular algorithm.
2.1 Transform
The transform block is used to remove redundancy of the information of the input
signal, an image in this case [1], [4], [8]. The image signal is projected onto a particular set of basis functions. In general, the basis functions are orthonormal in order
to minimize the redundancy in the representations. In this thesis, all transforms are
either orthogonal or nearly orthogonal because of the advantages of fast algorithms.
Furthermore, all transforms are separable, meaning the transform can be performed
on each dimension independently. This property reduces the complexity of the implementation significantly. The separable 2-D transform can be computed according
to the diagram shown in Figure 2-2 below. There are three transforms used in this
thesis: the DCT, the LOT, and the DWT. The details of each transform are discussed
in the sections below. The flowgraph conventions used in this thesis are shown in
Figure 2-3 below.
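Figure 2-2: Block diagram for a separable 2-D transform. (a) Forward transform: input image -> Row Transform -> Transpose -> Row Transform -> Transpose -> 2-D transformed image. (b) Inverse transform: 2-D transformed image -> Inverse Row Transform -> Transpose -> Inverse Row Transform -> Transpose -> reconstructed image.

Figure 2-3: Flowgraph conventions (sum a+b and difference a-b of two inputs a and b, and multiplication of an input a by a constant c, giving c*a)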
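
The row-transform and transpose sequence of Figure 2-2 can be sketched as follows. This is an illustrative sketch only: the matrix `A` is a random orthogonal stand-in for any of the 1-D transforms, and the function names are hypothetical.

```python
import numpy as np

def row_transform(image, A):
    """Apply the 1-D transform A to every row of the image."""
    return image @ A.T

def separable_2d(image, A):
    """Row transform, transpose, row transform, transpose."""
    after_rows = row_transform(image, A)
    after_cols = row_transform(after_rows.T, A)
    return after_cols.T

rng = np.random.default_rng(0)
# Stand-in orthogonal 1-D transform (an assumption; any orthogonal matrix works here).
A, _ = np.linalg.qr(rng.standard_normal((8, 8)))
image = rng.standard_normal((8, 8))

coeffs = separable_2d(image, A)       # forward 2-D transform
restored = separable_2d(coeffs, A.T)  # inverse uses the transposed matrix
```

For an orthogonal `A`, the same routine with `A.T` serves as the inverse, which mirrors the symmetry between parts (a) and (b) of the figure.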
2.1.1 Discrete Cosine Transform
The discrete cosine transform (DCT) is an orthogonal block transform. It projects
each block of the input signal onto a set of truncated, sampled cosine waveforms. The
computation of the DCT coefficients can be done as follows. Let M be the block
length, k be the index of the basis function, and n be the sample index. The basis
functions of the 1-D DCT are given by

$$a_{nk} = c[k] \sqrt{\frac{2}{M}} \cos\left[\left(n + \frac{1}{2}\right) \frac{k\pi}{M}\right]; \quad 0 \le k \le M-1, \tag{2.1}$$

where

$$c[k] = \begin{cases} \dfrac{1}{\sqrt{2}}; & k = 0 \\ 1; & \text{otherwise.} \end{cases} \tag{2.2}$$
The first four basis functions of the 1-D DCT are shown in Figure 2-4 below.

Figure 2-4: First four basis functions of the 1-D DCT for M=8, shown in panels (a) through (d)
The 2-D DCT is a separable transform, so the 2-D transform can be computed as
previously shown in Figure 2-2 above. In the one-dimensional case, the DCT
coefficients, X[k], can be calculated as follows. Let x[n] be an input signal indexed
by n and M be the DCT block length. X[k] is given by

$$X[k] = c[k] \sqrt{\frac{2}{M}} \sum_{n=0}^{M-1} x[n] \cos\left[\left(n + \frac{1}{2}\right) \frac{k\pi}{M}\right]; \quad 0 \le k \le M-1, \tag{2.3}$$

where

$$c[k] = \begin{cases} \dfrac{1}{\sqrt{2}}; & k = 0 \\ 1; & \text{otherwise.} \end{cases}$$
The inverse discrete cosine transform (IDCT) is given by

$$\hat{x}[n] = \sqrt{\frac{2}{M}} \sum_{k=0}^{M-1} c[k] \hat{X}[k] \cos\left[\left(n + \frac{1}{2}\right) \frac{k\pi}{M}\right]; \quad 0 \le n \le M-1, \tag{2.4}$$
where $\hat{x}[n]$ is the reconstructed signal and $\hat{X}[k]$ is the dequantized DCT coefficient. All
other symbols are the same as those in the DCT algorithm shown above. In this
thesis, the transform block has a length of 8 for the 1-D transform and a size of 8x8
for the 2-D transform. The fast algorithms for the DCT and IDCT are shown below in
Figures 2-5 and 2-6, respectively. The DCT algorithms can be found in [6].
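
The transform pair of Equations 2.3 and 2.4 can be checked numerically with a direct (non-fast) implementation. The sketch below is only an illustration of the two formulas, not the fast flowgraph algorithm used in this thesis; the function names and the sample values are hypothetical.

```python
import numpy as np

def c(k):
    """Scaling factor c[k] of Equation 2.2."""
    return 1.0 / np.sqrt(2.0) if k == 0 else 1.0

def dct_1d(x):
    """Direct evaluation of Equation 2.3 for a block of length M."""
    M = len(x)
    n = np.arange(M)
    return np.array([
        c(k) * np.sqrt(2.0 / M) * np.sum(x * np.cos((n + 0.5) * k * np.pi / M))
        for k in range(M)
    ])

def idct_1d(X):
    """Direct evaluation of Equation 2.4 for a block of length M."""
    M = len(X)
    k = np.arange(M)
    ck = np.where(k == 0, 1.0 / np.sqrt(2.0), 1.0)
    return np.array([
        np.sqrt(2.0 / M) * np.sum(ck * X * np.cos((n + 0.5) * k * np.pi / M))
        for n in range(M)
    ])

# Illustrative 8-sample block (arbitrary values).
x = np.array([52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0])
X = dct_1d(x)
x_rec = idct_1d(X)
```

Without quantization, the IDCT recovers the input up to floating-point error, and the energy of the coefficients equals the energy of the samples because the basis is orthonormal.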
Finally, the computational costs of the 1-D and 2-D DCT algorithms used in this
thesis are summarized in Table 2.1 below. The costs of the IDCT algorithms are
the same as those of the DCT. The computational costs shown below assume that the
image dimension is RxC, where R is the number of rows and C is the number of columns.
In addition, it is assumed that R and C are divisible by 8 in this thesis.
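
As a worked example of these costs, the sketch below counts the operations for a 2-D DCT of an RxC image from the 1-D figures of Table 2.1 (26 additions and 16 multiplications per 8-point DCT). It assumes each 8x8 block needs 8 row and 8 column transforms; the function name `dct_2d_cost` is hypothetical.

```python
def dct_2d_cost(rows, cols):
    """Additions and multiplications for the 8x8 block 2-D DCT of a rows x cols image."""
    assert rows % 8 == 0 and cols % 8 == 0, "R and C are assumed divisible by 8"
    blocks = (rows // 8) * (cols // 8)
    transforms_per_block = 16  # 8 row DCTs + 8 column DCTs per 8x8 block
    additions = blocks * transforms_per_block * 26
    multiplications = blocks * transforms_per_block * 16
    return additions, multiplications

adds, mults = dct_2d_cost(512, 512)
```

For a 512x512 image this gives 1,703,936 additions and 1,048,576 multiplications, which agrees with the closed forms 13RC/2 and 4RC.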
Figure 2-5: Flowgraph of the fast DCT for M=8. Ci and Si represent cosine and sine factors, respectively.

Figure 2-6: Flowgraph of the fast IDCT for M=8. Ci and Si represent cosine and sine factors, respectively.

Table 2.1: Computational costs of the fast DCT, M=8

                     # Additions    # Multiplications
1-D, 8-point DCT     26             16
RxC, 2-D DCT         13RC/2         4RC

2.1.2 Lapped Orthogonal Transform

Although the DCT has been employed extensively in still image compression,
the DCT system exhibits blocking artifacts at low bit rates. The lapped orthogonal
transform (LOT) was developed to minimize this artifact. The causes of the blocking
effect can be divided into two categories. First, the DCT bases are short,
non-overlapped, and discontinuous at the edges. Second, the DCT system transforms
each block independently. The LOT eliminates this artifact by forcing the bases to
decay to zero at both edges and to overlap with a neighboring transform block [7],
[14]. In this thesis, the overlapped part was half of the block length and started at
the middle of the preceding block. The general structure of the LOT is shown in
Figure 2-7.

In order to make the LOT an orthogonal transform, all the bases must form an
orthonormal set. Furthermore, the overlapped portion must be orthogonal to all bases
of the neighboring block [5]. It is not practical to use the original version of the LOT
because no fast algorithm was available for it. However, the nearly orthogonal lapped
transform has a fast algorithm. This type of LOT, called the type-I, fast LOT, uses
the DCT bases as building blocks for its bases. This algorithm requires few additional
computations beyond the existing DCT algorithm [7].
The LOT coefficients can be calculated as follows. Let M be the length of a
transform block and V be the overlapped length, where M = 2V. The transform pair is
given by

$$X = P^T x \tag{2.5}$$

and

$$x = P X, \tag{2.6}$$

where P is the basis function matrix of size MxV and x is an input vector of
length M.

Figure 2-7: General structure of the LOT. In this structure, the overlapped
portion is a half of the block length. $P^T$ and P are the transform and the inverse
transform operators, respectively.

The basis function matrix, P, for the fast algorithm is given by

$$P = P_0 Z, \tag{2.7}$$

where $P_0$ is an MxV matrix and Z is an orthogonal matrix of size VxV. The matrix
$P_0$ is given by

$$P_0 = \frac{1}{2} \begin{bmatrix} D_e - D_o & D_e - D_o \\ J(D_e - D_o) & -J(D_e - D_o) \end{bmatrix}. \tag{2.8}$$

$D_e$ and $D_o$ are the matrices of the even and odd basis functions of the DCT, and J
is an anti-identity matrix. Z is approximated in the fast algorithm as

$$Z \approx \begin{bmatrix} I & 0 \\ 0 & \tilde{Z} \end{bmatrix}, \tag{2.9}$$

where $\tilde{Z}$ is a matrix of size V/2 by V/2. The matrix $\tilde{Z}$ is defined as

$$\tilde{Z} = T_0 T_1 \cdots T_{V/2-2}, \tag{2.10}$$

where each matrix $T_i$ is a V/2 by V/2 matrix. It is defined as

$$T_i = \begin{bmatrix} I & 0 & 0 \\ 0 & Y(\theta_i) & 0 \\ 0 & 0 & I \end{bmatrix}. \tag{2.11}$$

$Y(\theta_i)$ is a rotation matrix at the position (i, i) of the matrix $T_i$. Y is a 2x2 matrix
and is defined as

$$Y(\theta_i) = \begin{bmatrix} \cos\theta_i & \sin\theta_i \\ -\sin\theta_i & \cos\theta_i \end{bmatrix}, \tag{2.12}$$

where $\theta_i$ is a rotation angle.
In this thesis, the type-I, fast LOT had a block length, M, of 16 and an overlapped length, V, of 8 because the 8-point fast DCT can be used as part of the LOT
algorithm. The matrix $\tilde{Z}$ is equal to $T_0 T_1 T_2$. The three angles for the matrices $Y(\theta_i)$
are $[\theta_0, \theta_1, \theta_2] = [0.13\pi, 0.16\pi, 0.13\pi]$ for the optimal coding gain, and
$[\theta_0, \theta_1, \theta_2] = [0.145\pi, 0.17\pi, 0.16\pi]$ for the QR-based quasi-optimal LOT [5]. This thesis used
the angles for the optimal coding gain. The first four basis functions of the type-I,
fast LOT for optimal coding gain with the block length of 16 samples and overlapped
portion of 8 samples are shown in Figure 2-8 below. The implementations of the type-I,
fast LOT block transform and inverse transform are shown in Figures 2-9 and 2-10,
respectively. The LOT programs can be found in [6]. In addition, all figures of the
LOT system were reproduced with the permission of Henrique Malvar.
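
The construction of P from Equations 2.7 through 2.12 can be sketched by building the matrices explicitly. This is a direct matrix construction for illustration, not the fast flowgraph implementation, and it makes several assumptions: the DCT basis functions are stored as matrix columns, the even functions come first in the column order of P, and the angles are those quoted above for the optimal coding gain. All function names are hypothetical.

```python
import numpy as np

def dct_basis(V):
    """Columns are the orthonormal DCT basis functions of length V (Equation 2.1)."""
    n = np.arange(V).reshape(-1, 1)
    k = np.arange(V)
    D = np.sqrt(2.0 / V) * np.cos((n + 0.5) * k * np.pi / V)
    D[:, 0] /= np.sqrt(2.0)
    return D

def rotation(theta):
    """2x2 rotation matrix Y(theta) of Equation 2.12."""
    return np.array([[np.cos(theta), np.sin(theta)],
                     [-np.sin(theta), np.cos(theta)]])

def fast_lot_basis(V=8, angles=(0.13 * np.pi, 0.16 * np.pi, 0.13 * np.pi)):
    """Build P = P0 Z for M = 2V (Equations 2.7 to 2.11)."""
    D = dct_basis(V)
    De, Do = D[:, 0::2], D[:, 1::2]   # even and odd DCT basis functions
    J = np.fliplr(np.eye(V))          # anti-identity matrix
    A = De - Do
    P0 = 0.5 * np.block([[A, A], [J @ A, -J @ A]])
    h = V // 2
    Z_tilde = np.eye(h)
    for i, theta in enumerate(angles):
        T = np.eye(h)
        T[i:i + 2, i:i + 2] = rotation(theta)  # Y(theta_i) at position (i, i)
        Z_tilde = Z_tilde @ T
    Z = np.block([[np.eye(h), np.zeros((h, h))],
                  [np.zeros((h, h)), Z_tilde]])
    return P0 @ Z

P = fast_lot_basis()  # shape (16, 8): 8 basis functions of length 16
```

In this sketch the columns of P come out orthonormal, and the first half of each basis function is orthogonal to the second half of every basis function, which is the overlap condition between neighboring blocks described earlier.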
Figure 2-8: First four basis functions of the type-I fast LOT, for M = 2V,
V = 8, shown in panels (a) through (d)

In the case of a finite length signal, the beginning block and the ending block must
be modified to make up for the non-existent overlapping part. This is done by
reflecting a few samples of the beginning and the ending blocks outside the signal.
The reflected parts are then treated as part of the signal. In this thesis, only the
first and last four samples were reflected in order to compute the 8-point DCT of the
first and the last block. As a result, only the even DCT coefficients are non-zero and
the algorithm uses only the non-zero coefficients to compute the LOT coefficients.
The type-I, fast LOT of a finite length signal is shown in Figure 2-11.
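
The claim that only the even DCT coefficients survive the reflection can be checked with a small numerical sketch. It assumes the reflection is a simple mirror of the four edge samples, so that the resulting 8-sample block is symmetric; the sample values and the function name are hypothetical.

```python
import numpy as np

def dct_1d(x):
    """Orthonormal 1-D DCT of a block, computed with an explicit basis matrix."""
    M = len(x)
    n = np.arange(M)
    k = n.reshape(-1, 1)
    A = np.sqrt(2.0 / M) * np.cos((n + 0.5) * k * np.pi / M)
    A[0, :] /= np.sqrt(2.0)
    return A @ x

edge = np.array([10.0, 12.0, 15.0, 11.0])   # first four samples (illustrative values)
block = np.concatenate([edge[::-1], edge])  # mirrored samples followed by the originals
coeffs = dct_1d(block)

even_coeffs = coeffs[0::2]
odd_coeffs = coeffs[1::2]   # vanish because the block is symmetric
```

The odd-indexed basis functions are antisymmetric about the block center, so their inner product with a symmetric block is zero, which is why the half-length DCT suffices at the image edges.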
The computational costs for the type-I, fast LOT are summarized in Table 2.2,
where R and C are the image dimensions. In this thesis, both R and C are assumed to
be divisible by 8. The computational costs for the whole image take into account the
half-length DCT at all edges of the transformed image.
Figure 2-9: Flowgraph of the type-I fast LOT, for M = 2V, V = 8
2.1.3 Discrete Wavelet Transform
The discrete wavelet transform (DWT) uses the wavelet expansion to represent the
signal at different time scales or spatial resolutions. The DWT is a series expansion
of a finite-length signal. The expansion consists of an approximation and a wavelet, or
detail. The wavelet expansion can be implemented by a filter bank. The derivation
and a detailed explanation can be found in [2], [13], and [12]. Filter bank algorithms
have been well known for a long time. Therefore, the development cost of a DWT
algorithm based on a filter bank is low in practice.
Figure 2-10: Flowgraph of the type-I fast ILOT, for M = 2V, V = 8

Figure 2-11: Flowgraph of the type-I, fast LOT for a finite length signal.
The LOT runs from left to right without the factor 2 after H_e and JH_e. The inverse
LOT runs from right to left. E is the set of even DCT coefficients and O is the set of odd
DCT coefficients.

There are two stages in the DWT system. The transform process of the DWT is
called the analysis stage. The inverse DWT is called the synthesis stage. The analysis
stage separates the input signal into a set of octave-band frequency bands, whereas
the synthesis stage reconstructs the signal from the DWT coefficients. The DWT coefficients
can be calculated as follows. The input signal, x[n], is passed through two filters:
a low-pass filter and a high-pass filter. Ideally, the cut-off frequency of both filters
should be at $\pi/2$. Then the output signals of both filters are decimated by a factor
of two. To further divide the frequency band, the output of the first stage can be
passed through the same set of filters. However, the DCT coefficients of a typical
image consist mostly of low-frequency components. Therefore, only the
output of the low-pass filter from the first stage is used as the input to the second-stage
filter bank. At the second stage, the length of the input signal is half of the original
length. The process can be repeated as long as the length of the input
signal is longer than or equal to the length of the longest filter. Furthermore, the
computational costs of the DWT increase with the number of levels or stages
that the image is passed through. During the synthesis, the synthesis filters undo
the analysis part by combining the highest level first, and then reconstructing the
signal at the lower level. The structure of the 2-level analysis and synthesis DWT
and the corresponding frequency bands are shown in Figure 2-12 below. In addition,
the 2-D DWT is also separable. Therefore, the 2-D DWT can be calculated in each
dimension independently, as previously shown in Figure 2-2 above. After the 2-D
transformation of the image, the coefficients are organized by level and frequency band. An
example of the organization of the 3-level DWT coefficients is shown in Figure 2-13
below.
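The analysis and synthesis stages described above can be sketched in a few lines. This is a minimal illustration, not the WaveLab-derived code used in the thesis: the two-tap Haar filters are swapped in for the 9/7-tap pair to keep the example short, the input signal is made up, and all function names are hypothetical.

```python
import math

# Haar filters stand in for the 9/7-tap pair (assumption made for brevity).
S = 1.0 / math.sqrt(2.0)

def analyze(x):
    """One analysis stage: low-pass and high-pass filtering, then decimation by two."""
    low = [S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def synthesize(low, high):
    """One synthesis stage: exactly undoes analyze()."""
    x = []
    for a, d in zip(low, high):
        x.extend([S * (a + d), S * (a - d)])
    return x

def dwt(x, levels):
    """Octave-band DWT: only the low-pass branch is split again at each level."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, d = analyze(approx)
        details.append(d)
    return approx, details          # details[0] is the first (finest) level

def idwt(approx, details):
    """Combine the highest level first, then work down to the first level."""
    for d in reversed(details):
        approx = synthesize(approx, d)
    return approx

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
c2, (d1, d2) = dwt(signal, 2)       # names follow the C2, D1, D2 labels of Figure 2-12
restored = idwt(c2, [d1, d2])
```

For the 8-sample input, `d1` has four coefficients while `d2` and `c2` have two each, which reflects the halving of the signal length at every stage.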
Table 2.2: Computational costs of the type-I, fast LOT, M = 2V, V = 8

                                            # Additions         # Multiplications
Single block, 1-D type-I, fast LOT, M=16    48                  28
RxC image, 2-D type-I, fast LOT             12RC - 18(R + C)    7RC - 4(R + C)

The DWT coefficients also depend on the type of filter used. There are two kinds of
filters: orthogonal and biorthogonal. The orthogonal filters are asymmetric and have
non-linear phase in the frequency response. The non-linear phase creates distortion
or artifacts in the reconstructed image, which is undesirable for image processing. On
the other hand, the biorthogonal filters are symmetric and have linear or zero phase
in the frequency domain. Therefore, it is desirable to use linear-phase FIR filters for
image processing. This thesis used the 9/7-tap Villasenor biorthogonal filters [15]. The
coefficients of both filters for the low-pass version are shown in Table 2.3 below.

Table 2.3: Coefficients of 9/7-tap Villasenor biorthogonal filters

Length    Coefficients
9         0.03828, -0.023849, -0.110624, 0.377402, 0.852699,
          0.377402, -0.110624, -0.023849, 0.03828
7         -0.064539, -0.040689, 0.418092, 0.788486,
          0.418092, -0.040689, -0.064539
Figure 2-12: Structure of 2-level, 2-band analysis and synthesis for the
1-D DWT and the frequency band to which the output of each filter bank
corresponds. g_a is a high-pass analysis filter. h_a is a low-pass analysis filter. g_s is
a high-pass synthesis filter. h_s is a low-pass synthesis filter.

The orthogonality in the biorthogonal system is preserved by the relationships
between the analysis filters and the synthesis filters. The relationships among these
four filters are

$$g_a[n] = (-1)^n h_s[1 - n] \qquad (2.13)$$

and

$$g_s[n] = (-1)^n h_a[1 - n] \qquad (2.14)$$

where $g_a$ is a high-pass analysis filter, $h_a$ is a low-pass analysis filter, $g_s$ is a high-pass
synthesis filter, and $h_s$ is a low-pass synthesis filter [2].
Figure 2-13: Organization of 3-level DWT coefficients (subbands LL3, HL3, LH3, HH3,
HL2, LH2, HH2, HL1, LH1, and HH1)

In this thesis, the DWT algorithm was derived from the WaveLab802 MATLAB
package from Stanford University. The programs can be found in [3]. Parts of the programs
were translated into C to reduce the simulation time. The MATLAB
programs were used for verification purposes.
The computational costs for the algorithm are given for two cases: the circular
convolution algorithm and the minimum limit. The first is the circular convolution
algorithm, which was used in this thesis. The second is the minimum limit, which
has lower computational costs because it exploits the symmetry of the biorthogonal filters.
The computations, especially the multiplications, can be reduced significantly by reusing
the previous results. Given the filter length, F, only $(F + 1)/2$ values are distinct
for the odd-length symmetric filters. Let the lengths of the two filters be $F_1$ and $F_2$, where
$F = F_1 + F_2$. The computational costs for both cases are summarized in Table 2.4
below. These costs were calculated for the odd-length biorthogonal wavelet filters
only. However, the cost of the even-length filters should be similar.
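The level-dependent factors in these costs follow from a geometric series: each additional level processes an LL-band with a quarter of the samples of the previous one. The sketch below reproduces that arithmetic with exact fractions. It assumes the definitions Ka = RC(F - 2), Kb = RCF, and Kc = RC(F + 2) from the caption of Table 2.4, and that the minimum case needs Kc/2 multiplications at the first level. The image size and function names are illustrative only.

```python
from fractions import Fraction

def level_factor(levels):
    """Sum of (1/4)^(l-1) for l = 1..levels, i.e. (4/3) * (1 - (1/4)^levels)."""
    return sum(Fraction(1, 4) ** l for l in range(levels))

def dwt_costs(R, C, F1, F2, levels):
    """Return (thesis adds, thesis mults, minimum adds, minimum mults)."""
    F = F1 + F2
    Ka, Kb, Kc = R * C * (F - 2), R * C * F, R * C * (F + 2)
    s = level_factor(levels)
    return s * Ka, s * Kb, s * Ka, s * Kc / 2

# Hypothetical 512x512 image, 9/7-tap filters, 3 levels.
adds, mults, min_adds, min_mults = dwt_costs(512, 512, 9, 7, 3)
```

With these inputs the circular convolution case needs 21/16 of Kb multiplications, while the symmetric (minimum) case needs 21/32 of Kc.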
Table 2.4: The computational costs of L-level DWT using biorthogonal
filters for image size RxC. Define variables Ka = RC(F - 2), Kb = RCF, and
Kc = RC(F + 2).

         Thesis                                           Minimum
Level    Additions              Multiplications           Additions              Multiplications
1        Ka                     Kb                        Ka                     (1/2) Kc
2        (5/4) Ka               (5/4) Kb                  (5/4) Ka               (5/8) Kc
3        (21/16) Ka             (21/16) Kb                (21/16) Ka             (21/32) Kc
4        (85/64) Ka             (85/64) Kb                (85/64) Ka             (85/128) Kc
L        (4/3)(1 - (1/4)^L) Ka  (4/3)(1 - (1/4)^L) Kb     (4/3)(1 - (1/4)^L) Ka  (2/3)(1 - (1/4)^L) Kc

2.2 Quantization
The quantization block is used to limit the range of transform coefficients. This
block gives the compression gain for the system. However, it introduces loss in the
system. There are many quantization schemes used in lossy image compression.
This thesis used four quantizers, listed as follows.
(a) Optimal Uniform Quantizer
(b) JPEG Uniform Quantizer
(c) Visual Threshold Uniform Quantizer
(d) Embedded Zerotree Wavelet Quantizer (EZW)
The following sections explain the algorithms, advantages, and disadvantages of the above
quantizers. It should be noted that certain quantizers, such as the JPEG uniform quantizer,
are specifically designed for block-based DCT coefficients. Some modifications
were made in order to adapt these quantizers to work with LOT coefficients. The
modifications are explained in more detail in the section on each particular quantizer.
2.2.1 Optimal Uniform Quantizer
This quantizer is designed for a specific set of transform coefficients in order to
minimize the mean square error (MSE). This quantizer is constructed from the statistics of the transform coefficients. Theoretically, it should give the smallest MSE which
indicates the best performance according to the PSNR value used in this thesis. This
quantizer was used in order to provide a fair comparison among all transform coding
schemes because it is not optimally designed for any particular transform. In order
to apply this quantizer to the coefficients of the DCT, LOT, and DWT, three
questions must be considered. First, how many quantizers are necessary? Second, which
quantizer does each coefficient use? Finally, how should the bit resources be distributed
among the quantizers?
To answer these questions, let us look at the derivation and the organization of
the transform coefficients. The DCT and the LOT are block transforms whereas the
DWT is not. The DCT and the LOT coefficients are arranged in blocks of size 8
by 8 in this thesis. Each coefficient in an 8x8 block corresponds to a different basis
function of the 2-D transform. Furthermore, the coefficients that are located at
the same position in different 8x8 blocks correspond to the same basis
function. Therefore, it is reasonable to use the same quantizer for all coefficients of
the same basis function. As a result, there are 64 different quantizers for the DCT and
LOT coefficients because there are 64 distinct frequencies. In the case of
the DWT, the coefficients are organized in frequency bands. Similarly, wavelet
coefficients in different bands use different quantizers. Given an L-level DWT, there are
3L + 1 distinct quantizers.
The bit distribution problem can be solved by the rate-distortion theory as described
in [1]. Given the average bit rate, B, the number of bits allocated to the ith quantizer,
$b_i$, is

$$b_i = B + \frac{1}{2} \log_2 \frac{\sigma_i^2}{\left( \prod_{k=1}^{N} \sigma_k^2 \right)^{1/N}} \qquad (2.15)$$

where $\sigma_i^2$ is the variance of the coefficients in the ith group, and N is the total
number of groups, i.e. 64 for the DCT and the LOT. Next, the width of each bin or
the quantization level, $q_i$, can be computed by

$$q_i = \frac{k \max(|A_i|)}{2^{b_i}} \qquad (2.16)$$

where $\max(|A_i|)$ is the maximum magnitude of the ith group, $b_i$ is the number of bits
allocated for the ith group, and k is the scaling factor, which can be used to change
the final bin width and adjust the bit rate. This scaling factor does not affect PSNR
but enables the simulation to adjust the bit rate close to the desired value. Finally,
the quantized coefficient, $\hat{c}_i$, is calculated by dividing each input coefficient by the
corresponding quantization level, i.e.

$$\hat{c}_i = \frac{c_i}{q_i} \qquad (2.17)$$
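Equations (2.15) to (2.17) can be tried out on a toy example. The sketch below is a hedged illustration, not the thesis code: the group variances and maximum magnitudes are invented, rounding the quotient in (2.17) to the nearest integer is an assumption, and the function names are hypothetical. Note that (2.15) can return fractional or even negative bit counts, which a practical allocator would have to clip or round.

```python
import math

def allocate_bits(variances, avg_bits):
    """Equation (2.15): b_i = B + 0.5 * log2(var_i / geometric mean of all variances)."""
    n = len(variances)
    log_geo_mean = sum(math.log2(v) for v in variances) / n
    return [avg_bits + 0.5 * (math.log2(v) - log_geo_mean) for v in variances]

def bin_width(max_abs, bits, k=1.0):
    """Equation (2.16): q_i = k * max|A_i| / 2^b_i."""
    return k * max_abs / (2.0 ** bits)

def quantize(c, q):
    """Equation (2.17); rounding to the nearest integer is assumed here."""
    return int(round(c / q))

# Hypothetical statistics for four coefficient groups.
variances = [400.0, 100.0, 25.0, 6.25]
max_abs = [60.0, 30.0, 15.0, 7.5]
bits = allocate_bits(variances, avg_bits=2.0)
widths = [bin_width(m, b) for m, b in zip(max_abs, bits)]
```

Groups with larger variance receive more bits, and the allocated bits always average to B because the logarithmic terms sum to zero.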
The advantage of using this quantizer is that the quantizer is symmetric around
zero, also called flat-zero [1]. This property would benefit the system if the coefficients
have zero mean value because the distribution of the coefficients would be symmetrical
about zero. There are encoders that can take advantage of this kind of distribution.
Details of the coding schemes are presented in the next sections.
2.2.2 JPEG Uniform Quantizer
This quantizer is similar to the optimal uniform quantizer, but it is specifically
designed for block-based DCT coefficients. The JPEG standard recommends the
quantization level for each frequency in the transformed block. These levels are
derived from experimental results based on many images and subjects in order to
give the best visual quality for a given bit rate [1], [9]. This thesis used the following
quantization table for the black and white image.

$$Q = \begin{bmatrix}
16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\
14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{bmatrix} \qquad (2.18)$$

The top left-hand corner corresponds to the quantization level for the DC component.
The bottom right-hand corner is the quantization level for the highest frequency
component of the transform coefficients in 2-D. This table contains the recommended
quantization levels for the luminance component according to the JPEG standard [9].
Although this quantizer is specifically designed for the DCT coefficients, the LOT
coefficients might be able to use this quantization matrix because the LOT uses the
DCT as a building block for its bases. Therefore, this thesis compared the
performance of the DCT system and the LOT system using this quantizer. The
quantized coefficient, $\hat{c}_i$, is computed by a division operation as in equation (2.17).
2.2.3 Visual Threshold Wavelet Quantizer
This quantizer is intended to preserve the visual quality of the reconstructed image
for the DWT coding system. The development of this quantizer was similar to that
of the JPEG uniform quantizer. The quantization levels depend on the experimental
results of the DWT coefficients. They also depend on the type of wavelet filter and
the number of levels of the DWT structure. In this thesis, only 9/7-tap biorthogonal filters
were used. The DWT coefficients that belong to the same frequency band share the
same quantizer. There are three bands at each DWT level with the exception of the
coarsest level, which has four bands. In other words, given the number of DWT levels,
L, there are 3L + 1 different quantizers. The quantization level, $Q_{L,O}$, for each DWT
orientation and each DWT level is given as follows. The subscript 'O' refers to the
orientation, which is the same as the frequency band {LL, LH, HL, HH}.

$$Q_{L,O} = \frac{2k}{A_{L,O}} \, a \, 10^{\, k_0 \left[ \log_{10} \left( \frac{2^L f_0 g_O}{r} \right) \right]^2} \qquad (2.19)$$

where
* $A_{L,O}$: the basis function amplitude
* $a$: empirical value = 0.495
* $k_0$: empirical value = 0.466
* $f_0$: empirical value = 0.401
* $g_O$: orientation parameter, $g_{HL} = g_{LH} = 1$, $g_{LL} = 1.501$, and $g_{HH} = 0.534$
* $r$: output resolution (pixel/degree)
* $L$: DWT level
* $k$: scaling factor for adjusting quantization levels and bit rate
In the case of the 9/7-tap biorthogonal filters, values of $A_{L,O}$ and $Q_{L,O}$ are shown in
Tables 2.5 and 2.6, respectively.
Table 2.5: The basis function amplitudes, A_{L,O}, for 9/7-tap biorthogonal
wavelet filters

               Level
Orientation    1           2           3           4
LL             0.62171     0.345374    0.18004     0.0914012
LH             0.672341    0.413174    0.227267    0.117925
HL             0.672341    0.413174    0.227267    0.117925
HH             0.727095    0.494284    0.286881    0.152145

Table 2.6: Quantization levels for 9/7-tap biorthogonal filters. The output
resolution is set at 32 pixel/degree.

               Level
Orientation    1         2         3         4
LL             14.049    11.106    11.363    14.500
LH             23.028    14.685    12.707    14.156
HL             23.028    14.685    12.707    14.156
HH             58.756    28.408    19.540    17.864
The quantized coefficient is computed by dividing the DWT coefficient by the
corresponding quantization level. According to the quantization table given above,
the coefficients in the higher frequency bands are quantized more heavily than the
low-frequency coefficients. This property is similar to the quantization for the JPEG
system. Models and development processes of this quantizer can be found in [16].
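The entries of Table 2.6 can be recomputed from equation (2.19) and the amplitudes of Table 2.5. The sketch below is offered as a consistency check under stated assumptions: the logarithm is taken as base 10, the constant 0.466 is the one multiplying the squared logarithm, and the scaling factor k is 1. The function and variable names are hypothetical.

```python
import math

# Empirical constants quoted in the text; K0 = 0.466 multiplies the squared log.
A_PARAM, K0, F0 = 0.495, 0.466, 0.401
G = {"LL": 1.501, "LH": 1.0, "HL": 1.0, "HH": 0.534}

# Basis function amplitudes from Table 2.5, indexed by DWT level.
AMPLITUDE = {
    1: {"LL": 0.62171, "LH": 0.672341, "HL": 0.672341, "HH": 0.727095},
    2: {"LL": 0.345374, "LH": 0.413174, "HL": 0.413174, "HH": 0.494284},
    3: {"LL": 0.18004, "LH": 0.227267, "HL": 0.227267, "HH": 0.286881},
    4: {"LL": 0.0914012, "LH": 0.117925, "HL": 0.117925, "HH": 0.152145},
}

def quantization_level(level, band, r=32.0, k=1.0):
    """Equation (2.19), assuming a base-10 logarithm."""
    x = math.log10((2 ** level) * F0 * G[band] / r)
    threshold = A_PARAM * 10.0 ** (K0 * x * x)
    return k * 2.0 * threshold / AMPLITUDE[level][band]

table = {lvl: {band: round(quantization_level(lvl, band), 3) for band in G}
         for lvl in AMPLITUDE}
```

At r = 32 pixel/degree the computed values agree with Table 2.6 to within a fraction of a percent, the small differences presumably coming from rounding of the published constants.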
2.2.4 Embedded Zerotree Wavelet Quantizer
The embedded zerotree wavelet (EZW) quantizer combines bit-plane encoding
and the zerotree structure to compress the data. The zerotree structure captures
relationships among the magnitudes of the DWT coefficients across different levels. The
bit-plane part extracts each plane of bits, and the encoder searches for zerotree
structures and encodes them into symbols. The combination of the EZW and an adaptive
arithmetic coder has been claimed to achieve higher compression than the baseline
JPEG algorithm [11]. Due to the time constraint and the unavailability of an arithmetic
coder, the adaptive Huffman coder was used in this thesis. The implementation of
the EZW is described below. More details can be found in [11]. The terms used in
the EZW algorithm are given as follows.
parents-descendants: The relationship of the spatial location between the DWT
coefficients of different levels. Given the parent's coordinate (i, j), an L-level
DWT, and an image size of RxC, the positions of the descendants are
$\{(2i, 2j), (2i + 1, 2j), (2i, 2j + 1), (2i + 1, 2j + 1)\}$ if (i, j) is not in the LL-band, and
$\{(i + R/2^L, j), (i, j + C/2^L), (i + R/2^L, j + C/2^L)\}$ if (i, j) is in the LL-band,
where L is the number of DWT levels. Examples of parent-descendants of the 3-level DWT are
shown in Figure 2-14.
significant: The DWT coefficient X[k] is significant if and only if $|X[k]| \geq T$, where
T is the current threshold.
insignificant: The DWT coefficient X[k] is insignificant if $|X[k]| < T$, where T is
the current threshold.
zerotree root (Z): The DWT coefficient X[k] is a zerotree root if X[k] and all its
descendants are insignificant.
isolated zero (I): The DWT coefficient X[k] is an isolated zero if X[k] is insignificant
and at least one of its descendants is significant.
positive (P): The DWT coefficient X[k] is positive if X[k] is significant and positive.
negative (N): The DWT coefficient X[k] is negative if X[k] is significant and negative.
dominant list: List of DWT coefficients that have not been found significant.
subordinate list: List of DWT coefficients that have been found significant.
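The definitions above can be made concrete with a small sketch. It is not the thesis implementation: it assumes the usual EZW layout with the LL-band in the top-left corner, a parent at (i, j) outside the LL-band having its four children at (2i, 2j), (2i + 1, 2j), (2i, 2j + 1), and (2i + 1, 2j + 1), and it classifies a single coefficient against a single threshold without tracking the dominant and subordinate lists. The 4x4 coefficient array and all function names are hypothetical.

```python
def children(i, j, R, C, L):
    """Children of (i, j) in an R x C coefficient array from an L-level DWT."""
    r0, c0 = R >> L, C >> L                  # size of the LL-band
    if i < r0 and j < c0:                    # parent in the LL-band: three children
        return [(i + r0, j), (i, j + c0), (i + r0, j + c0)]
    if 2 * i >= R or 2 * j >= C:             # first-level bands have no children
        return []
    return [(2 * i, 2 * j), (2 * i + 1, 2 * j),
            (2 * i, 2 * j + 1), (2 * i + 1, 2 * j + 1)]

def descendants(i, j, R, C, L):
    """All children, grandchildren, and so on of (i, j)."""
    out = []
    for a, b in children(i, j, R, C, L):
        out.append((a, b))
        out.extend(descendants(a, b, R, C, L))
    return out

def classify(X, i, j, T, L):
    """Return 'P', 'N', 'Z' (zerotree root), or 'I' (isolated zero) for threshold T."""
    R, C = len(X), len(X[0])
    v = X[i][j]
    if abs(v) >= T:
        return "P" if v > 0 else "N"
    if all(abs(X[a][b]) < T for a, b in descendants(i, j, R, C, L)):
        return "Z"
    return "I"

# Hypothetical 2-level DWT coefficients of a 4x4 image (the LL-band is X[0][0]).
X = [[34, -20, 3, 1],
     [10, 6, 2, -1],
     [4, -2, 1, 0],
     [18, 3, -1, 2]]
```

With a threshold of 16, the coefficient 10 at (1, 0) is an isolated zero because its descendant 18 is significant, whereas the coefficient 6 at (1, 1) is a zerotree root.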
Figure 2-14: Locations of parent-descendants of the 3-level EZW. The
parent coefficient a, in the LL-band, has b, c, and d as its descendants. The coefficient
b is the parent of e, f, g, and h. All of them reside in the HL-band but at different
levels.

The process of the EZW can be divided into two steps: the dominant pass and the
subordinate pass. The dominant pass compares all coefficients in the dominant list
with the current threshold and searches for zerotree structures. On the other hand,
the subordinate pass transmits the next bit of the coefficients that have been found
significant. The coding begins with the dominant pass by selecting the initial threshold,
$T_0$, such that $\max(|X[k]|) < 2T_0$. The EZW starts comparing each coefficient
from the highest level down to the lowest level. Within the same level, the EZW
scans through all frequency bands in the following order: {LL, HL, LH,
HH}. The output symbols are coded as one of the following five symbols: {P, N, I, Z,
EOB}. The first four symbols are defined above. The EOB symbol is the end-of-block
symbol, which is the same as in the JPEG system. It can be used to indicate the
termination of the encoding process. If a coefficient is significant, it is removed from
the dominant list and put in the subordinate list. Once all coefficients are compared
with the current threshold, the subordinate pass sends the next lower resolution bit of
all coefficients in the subordinate list. The output of the subordinate pass is a binary
bit, 0 or 1. After the subordinate pass finishes, the current threshold is divided by a
factor of two. This threshold is used in the next dominant pass. More details about
the EZW can be found in [11].
According to the statistics, if the parent is insignificant, it is likely that all its
descendants are also insignificant [11]. At a high enough threshold, it is likely that
many zerotree root symbols are found. As a result, the run-length encoder could take advantage
of the long streams of zerotree root symbols. Furthermore, there are only seven
symbols in the system, namely {P, N, I, Z, EOB, 0, 1}. The adaptive arithmetic
encoder may be more efficient due to the small set of symbols [11].
The EZW may achieve high compression, but the zerotree search and the
re-computation of the code table are the bottlenecks of the algorithm used in this thesis.
Future work could improve the EZW. An improved version of the
EZW is called set partitioning in hierarchical trees (SPIHT), which can be found
in [10]. Due to the time constraint, the SPIHT was excluded from this thesis.
2.3 Entropy Coding
The entropy coder translates values or symbols into a string of binary bits. Then
the output can be stored or transmitted in a digital format. Three encoders used in
this thesis are
(a) Huffman
(b) Adaptive Huffman
(c) Run-length Huffman
The following sections describe each encoder briefly.
2.3.1 Huffman Coding
Huffman coding is a variable-length coding method. It computes the minimum number of bits
to represent each symbol according to its probability. This coding method can give
an expected code length close to the entropy of the system [1]. In this thesis, the
Huffman coder is used for all transform coefficients. For the DCT and the LOT, there are
two Huffman tables, one for the DC coefficients and the other for the AC coefficients.
In the case of the DWT, one Huffman table is used for the coefficients in the LL-band,
whereas a second table is used for the rest of the coefficients. Each Huffman table is
computed directly from the statistics of the coefficients. Therefore, it is image specific.
This coder was used to compare the effect of the transform and the quantizer because
it is the common coder for all transforms.
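A code table computed "directly from the statistics of the coefficients" can be sketched as follows. This is a generic Huffman construction using the Python standard library, shown only to illustrate how the expected code length compares with the entropy. It is not the coder used in the thesis, and the list of quantized coefficients is made up.

```python
import heapq
import math
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length} for a Huffman code built from symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:                      # a single symbol still needs one bit
        return {next(iter(counts)): 1}
    # Heap items: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, n, {s: 0}) for n, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)        # two least probable subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

quantized = [0, 0, 0, 0, 1, 1, -1, 2]         # hypothetical quantized coefficients
lengths = huffman_code_lengths(quantized)
counts = Counter(quantized)
n = len(quantized)
avg_len = sum(counts[s] * lengths[s] for s in counts) / n
entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
```

In this example the symbol probabilities are powers of one half, so the average code length equals the entropy exactly. For general statistics it lies between the entropy and the entropy plus one bit.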
2.3.2 Adaptive Huffman Coding
The adaptive Huffman coder is an adaptive version of the Huffman coder. The
general concepts are the same as the Huffman coding, but the code table is updated
as the coding progresses. This algorithm could improve the compression ability of
the Huffman coder because the coder has the updated statistics when the symbol
is coded. However, the disadvantage of this algorithm is speed. The adaptiveness
requires the re-computation of the code words. Therefore, it slows the system down
as mentioned in the EZW section.
2.3.3 Run-Length Huffman Coding
The run-length Huffman coder combines the regular Huffman coding with the
modifications of the quantized coefficients to achieve higher compression. This coder
is the standard coder used in the baseline JPEG. This coder separates the coefficients
into two parts: DC and AC. Each type is coded with different methods. The general
concepts can be described as follows.
The DC coefficients have a very large range. For example, for 8-bit data, the 8x8
block DCT can give DC coefficients in a range of [-2047, 2047]. This range requires 12
bits to represent. However, the values of the DC coefficients of adjacent blocks are very
close to each other because most images are smooth and do not change abruptly.
The DC coder uses differential pulse code modulation (DPCM) to compute
the first difference, and then encodes these differences. The Huffman code word for
the DC coefficients can be separated into two parts: the category and the residue.
The category determines the range of the magnitude, while the residue determines the
sign and the resolution of the coefficients [1], [9]. The Huffman table for the DC
coefficients of the DCT JPEG can be found in [9].
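The category and residue split can be illustrated with a short sketch that follows the baseline JPEG convention: the category is the number of bits needed for the magnitude of the difference, and negative differences are stored as the low-order bits of the difference minus one. The starting predictor of zero, the DC values, and the function names are assumptions for illustration. The Huffman coding of the category itself is omitted.

```python
def dc_category_and_residue(diff):
    """Split a DC difference into (category, residue bit string)."""
    category = abs(diff).bit_length()          # number of bits of the magnitude
    if category == 0:
        return 0, ""                           # zero difference needs no residue
    # Negative values are offset so that the leading residue bit is 0.
    value = diff if diff > 0 else diff + (1 << category) - 1
    return category, format(value, "0{}b".format(category))

def encode_dc(dc_values):
    """DPCM: code each DC coefficient as its difference from the previous one."""
    prev = 0                                   # assumed initial predictor
    out = []
    for dc in dc_values:
        out.append(dc_category_and_residue(dc - prev))
        prev = dc
    return out

# Hypothetical DC coefficients of four adjacent blocks.
codes = encode_dc([120, 123, 121, 121])
```

Only the first block pays for the full magnitude. The following differences of 3, -2, and 0 fall in categories 2, 2, and 0.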
According to the statistics, the quantized AC coefficients of the DCT are likely
to be zero at high frequencies [1], [9]. The run-length coder counts the
number of consecutive zeroes before encountering a non-zero coefficient. The scanning method
used for the AC coefficients is called the zig-zag scan [1], [9]. The zig-zag scan is shown in
Figure 2-15 below. The longest zero run covered by a single code word is 16. The
end-of-block (EOB) symbol is used if the rest of the scanned coefficients are zeroes. The run-length
Huffman table for the AC coefficients can be found in [9]. The JPEG run-length
Huffman coder is shown in Figure 2-16 below.
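The zig-zag scan and the zero-run counting can be sketched as follows. This is a simplified illustration: it produces (zero run, value) pairs plus the symbols 'ZRL' for a run of 16 zeroes and 'EOB', and leaves out the size categories and Huffman tables of [9]. The quantized block and the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) pairs of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                     # s = row + col of one diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                             # even diagonals run upward
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order

def run_length(ac):
    """Turn zig-zag ordered AC values into (zero_run, value) pairs, 'ZRL', and 'EOB'."""
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    out, run = [], 0
    for v in ac[:last + 1]:
        if v == 0:
            run += 1
            if run == 16:                          # one code word covers 16 zeroes
                out.append("ZRL")
                run = 0
        else:
            out.append((run, v))
            run = 0
    if last < len(ac) - 1:                         # only zeroes remain
        out.append("EOB")
    return out

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 50, 5, -3  # hypothetical quantized block
scan = [block[r][c] for r, c in zigzag_order()]
symbols = run_length(scan[1:])                     # the DC coefficient is coded separately
```

For this block the 63 AC coefficients reduce to three symbols, which is where the compression gain of the run-length stage comes from.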
This coder is designed for the DCT coefficients. However, the LOT coefficients
share a similar organization and similar properties. Therefore, the run-length coder can be
adjusted for the LOT system. A few modifications must be made to accommodate the
larger range of the LOT coefficients in both the DC and AC coefficients.
In the case of the DWT system, the coefficients in the LL-band are treated as the
DC coefficients, and the other bands are treated as the AC coefficients. This coder
was used with the optimal uniform quantizer and the visual threshold quantizer for
the DWT system. The code table for the DWT system is computed from the statistics
of the coefficients of the transformed image. Although it is not a universal encoder,
it provides a method to compare the visual threshold system with the current JPEG
system.
Figure 2-15: Zig-zag scan of the run-length Huffman coder

Figure 2-16: Flowgraph of the run-length Huffman coder (DC coefficients: DPCM, then
category-residue Huffman coder; AC coefficients: zig-zag scan, then run-length Huffman
coder; both outputs are multiplexed into the encoded bit stream)
Chapter 3
Simulation Methods
This chapter explains the simulation and comparison processes of all transform coding
systems used in this thesis. The simulation imitated all functional blocks in the
systems, which consist of the transform, the quantizer, and the entropy coder. A list
of systems used in this thesis is shown in Table 3.1 below.
Table 3.1: List of tested transform coding systems

#     Transform    Quantizer                    Entropy Coder
1     DCT          Optimal Uniform              Huffman
2     LOT          Optimal Uniform              Huffman
3     DWT          Optimal Uniform              Huffman
4     DCT          Optimal Uniform              Run-length Huffman
5     LOT          Optimal Uniform              Run-length Huffman
6     DWT          Optimal Uniform              Run-length Huffman
7     DCT          JPEG Uniform                 Run-length Huffman
8     LOT          JPEG Uniform                 Run-length Huffman
9     DWT          Visual Threshold             Run-length Huffman
10    DWT          Embedded Zerotree Wavelet    Adaptive Huffman
The simulation program can be configured to run all ten systems. Details of each
component in the systems can be found in Chapter 2. However, modifications
were made so that each type of quantizer and entropy coder works with the coefficients
of each transform. Brief descriptions of each system are summarized as follows.
(a) The first three systems, (1-3), used the same type of quantizer and entropy
coder. The quantizer and the entropy coder were computed according to the
statistics of the coefficients and the quantized coefficients, respectively. Although
these systems are not practical, they are helpful for investigating the effect of
the quantizer and the coder on the PSNR performance. Lists of comparisons
are shown in the next sections.
(b) The next three systems, (4-6), differed from the first three systems in the encoder
block. The thesis used this difference to test the effects of the coder on the
PSNR. In addition, the run-length Huffman coder for the DCT is used in the
baseline JPEG standard as described in [9]. In the case of the LOT system, the
code table was expanded to accommodate a larger range of coefficients. On the
other hand, the run-length coder for the DWT system was computed from the
statistics of the quantized coefficients because there was no standardized run-length
coder for the DWT system.
(c) The next three systems, (7-9), used the same type of quantizer and encoder:
the universal uniform quantizer and the run-length Huffman coder, respectively.
Together with the run-length coder, the DCT system imitated the baseline
JPEG, while the LOT system attempted to simulate a baseline JPEG-like
coder. The DWT system that used the visual threshold quantizer and the
run-length coder represented the baseline JPEG-like coder for the wavelet.
(d) Finally, the EZW system used a different approach to compress an image. One
systematic comparison between this system and the other systems was to use the
PSNR and bit rate. This system was compared against the baseline JPEG-like
coders to compare the PSNR performance of the last four systems.
The system evaluations were divided into three parts: the system complexity, the
system performance in terms of the PSNR and bit rate, and the visual quality. Each
comparison investigated the effects of one or more functional blocks in the system.
Details of each comparison are given in the sections below.
3.1 Part I: System Complexity
The system complexity is an indicator of the cost of running the system. The
system consists of three functional blocks: transform, quantizer, and entropy coder.
The complexity of the transform was measured by the number of additions and
multiplications. The number of operations includes all additions and multiplications for
the transform of the whole image because only the DCT and the LOT are block
transforms, whereas the DWT is not. The total costs show the amount of processing
required for each transform. The complexity of the quantizers and encoders was
measured by the cost of running the algorithms. In this thesis, emphasis was placed
on the complexity of the transform block. The quantizer block and the entropy
encoder are briefly discussed to complete the system comparisons.
3.2 Part II: System Performance
This part of the simulation evaluated the performance in terms of bit rate and
PSNR. The simulation compared different transform coding systems that differed in
the choice of transform, quantizer, and entropy coder. The purpose of this comparison
was to evaluate the effect of each component of the system on the PSNR. The PSNR
for the 8-bit image is given by

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{x}_i)^2} \qquad (3.1)$$

where N is the total number of pixels, $x_i$ is the original data, and $\hat{x}_i$ is the
reconstructed signal. The PSNR is related to the mean square error (MSE). Although
the PSNR may not represent the visual quality directly, the PSNR is a systematic
way to compare the error of the reconstructed signal and can be computed exactly.
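Equation (3.1) translates directly into code. The sketch below operates on flat lists of pixel values, with the mean square error taken over all N pixels. The function name and the sample values are illustrative only.

```python
import math

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB for 8-bit data: 10 * log10(peak^2 / MSE)."""
    n = len(original)
    mse = sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / n
    if mse == 0:
        return math.inf                 # identical signals: no error
    return 10.0 * math.log10(peak * peak / mse)

# Hypothetical pixels: every reconstructed value is off by one grey level.
value = psnr([100, 100, 100, 100], [101, 99, 101, 99])
```

An error of one grey level on every pixel gives an MSE of 1 and a PSNR of about 48.13 dB.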
The system performance attempted to evaluate
(a) the effect of number of levels on DWT systems,
(b) the effect of transform,
(c) the effect of quantizer,
(d) the effect of entropy coder,
(e) and the overall system performance.
In order to evaluate these quantities, the comparisons were divided into multiple
sections according to the above list. Details of each set of comparisons are explained
below.
3.2.1 Part II-A: Effect of Number of Levels on DWT Systems
This part compared how the number of levels of wavelet expansion affected the
system performance. Comparisons were done to determine the trade-offs between the
computational costs of the DWT transform and the PSNR gain of the DWT system.
A list of systems in this part is shown in Table 3.2 below.

Table 3.2: List of systems for comparing the effect of number of DWT
levels on the PSNR performance

Transform      Quantizer           Entropy Coding
2-level DWT    Optimal Uniform     Huffman
3-level DWT    Optimal Uniform     Huffman
4-level DWT    Optimal Uniform     Huffman
2-level DWT    Visual Threshold    Run-Length Huffman
3-level DWT    Visual Threshold    Run-Length Huffman
4-level DWT    Visual Threshold    Run-Length Huffman
2-level DWT    EZW                 Adaptive Huffman
3-level DWT    EZW                 Adaptive Huffman
4-level DWT    EZW                 Adaptive Huffman

3.2.2 Part II-B: Effect of Transform
This part compared the PSNR performance of different transforms by restricting
the choice of quantizer and entropy coder. Because of these constraints, the PSNR
performance should reflect the effectiveness of each transform in terms of its ability to
extract necessary information from the image. In this section, the optimal uniform
quantizer was used because it was the only common quantizer among the three transforms.
In addition, all transforms used either the Huffman or the run-length Huffman
coder. As described in the background section, the optimal uniform quantizer gave
the best PSNR performance for a given set of coefficients. A list of systems in this
section is shown in Table 3.3 below.
Table 3.3: List of systems for comparing the effect of transforms on the
PSNR performance using the optimal uniform quantizer, the Huffman
coder, and the run-length Huffman coder
Transform      Quantizer          Entropy Coder
DCT            Optimal Uniform    Huffman
LOT            Optimal Uniform    Huffman
3-level DWT    Optimal Uniform    Huffman
DCT            Optimal Uniform    Run-length Huffman
LOT            Optimal Uniform    Run-length Huffman
3-level DWT    Optimal Uniform    Run-length Huffman
3.2.3
Part II-C: Effect of Quantizer
This part compared the effect of each quantizer on the PSNR performance.
Comparisons were done by changing the quantizer block while keeping the
transform and the entropy coder the same. This thesis compared the optimal
uniform quantizer with the JPEG uniform quantizer for the DCT and the LOT. In
the case of the DWT, the thesis compared the optimal uniform quantizer with the
visual threshold uniform quantizer. The only encoder common to all of these
systems was the run-length Huffman coder. A list of systems compared is shown in
Table 3.4 below. In addition, this thesis compared the simulated baseline DCT
JPEG with the actual baseline JPEG using XV version 3.10a on the Unix platform
to validate the simulated DCT JPEG system.
Table 3.4: List of systems for comparing the effect of quantizers on the
PSNR performance using the run-length Huffman coder
Transform      Quantizer          Entropy Coder
DCT            Optimal Uniform    Run-length Huffman
DCT            JPEG Uniform       Run-length Huffman
LOT            Optimal Uniform    Run-length Huffman
LOT            JPEG Uniform       Run-length Huffman
3-level DWT    Optimal Uniform    Run-length Huffman
3-level DWT    Visual Threshold   Run-length Huffman
3.2.4
Part II-D: Effect of Entropy Coder
This part compared the effect of the entropy coders on the PSNR performance.
Comparisons were done by restricting the choice of transform and quantizer.
Given the same symbols to be coded, the output bit rate should reflect the
efficiency of the coder. Only two coders were compared in this thesis: the
regular Huffman coder and the run-length Huffman coder. The implementation
details are in the background chapter. A list of systems compared is shown in
Table 3.5 below.
Table 3.5: List of systems for comparing the effect of entropy coders on
the PSNR performance using Huffman and run-length Huffman coders
Transform      Quantizer          Entropy Coder
DCT            Optimal Uniform    Huffman
DCT            Optimal Uniform    Run-length Huffman
LOT            Optimal Uniform    Huffman
LOT            Optimal Uniform    Run-length Huffman
3-level DWT    Optimal Uniform    Huffman
3-level DWT    Optimal Uniform    Run-length Huffman
3.2.5
Part II-E: Overall System Performance
This part compared the current JPEG standard, which uses the baseline DCT
JPEG, with the baseline LOT JPEG system, the visual threshold wavelet system,
and the EZW system. The comparisons were intended to test the DWT systems
against the existing JPEG standard and to determine the trade-offs among the
four systems. The four selected systems are shown in Table 3.6 below.
Table 3.6: List of selected systems for comparing the PSNR performance
Transform      Quantizer          Entropy Coding
DCT            JPEG Uniform       Run-Length Huffman
LOT            JPEG Uniform       Run-Length Huffman
3-level DWT    Visual Threshold   Run-Length Huffman
4-level DWT    EZW                Adaptive Huffman
3.3
Part III: Visual Quality
This comparison discussed the visual artifacts of each system. This thesis was
not intended to provide a method for evaluating the visual quality of the
reconstructed image. Instead, the thesis pointed out the dominant artifacts of
each system at low bit rates, because at high bit rates (1 bit/pixel or more)
the reconstructed image exhibits no artifacts.
3.4
Test Images
Seven images were used in this simulation to reduce the image dependency of the
comparisons. These images are black and white (grayscale) with eight bits per
pixel. A list of images is shown in Table 3.7 below.
Table 3.7: List of test images
Image             Dimension (pixels x pixels)
baboon            512x512
barbara2          720x576
bfrag(barbara)    512x512
boat              256x256
goldhill          720x576
lena              512x512
peppers           512x512
Chapter 4
Results and Discussions
This chapter presents the results and discussions of the simulated systems according
to the list of system comparisons described in the last chapter.
4.1
Part I: System Complexity
The evaluation of the system complexity was done in terms of computational costs
of the algorithms. The evaluation was done for all functional blocks: the transform,
the quantizer, and the entropy coder. This thesis focused primarily on the transform
block. However, the quantizer and the entropy coder were also evaluated to complete
the system.
4.1.1
Transform
The transform complexities were calculated in terms of the number of additions
and multiplications. The algorithm used for each transform is presented in the
Background chapter. Given an $R \times C$ image, the computational costs of each
transform are shown in Table 4.1. In the DWT system, $F$ is the sum of the
lengths of both filters, $F = F_1 + F_2$. The DWT filters used in this thesis
were the 9/7-tap biorthogonal filters. These costs were calculated for the
odd-length biorthogonal filters only. However, the costs for the even-length
biorthogonal filters should be similar to those for the odd-length filters.
The values in Table 4.1 are closed-form expressions of the costs. In order to
compare the systems more specifically, the costs of transforming a 512x512
image are summarized in Table 4.2 below. The DCT costs were normalized to 1.00.
Table 4.1: Computational costs of transform algorithms
Transform                           # Additions                                    # Multiplications
8x8 DCT                             $\frac{13}{2}RC$                               $4RC$
Type-I, fast LOT, M=16, V=8         $12RC - 18(R + C)$                             $7RC - 4(R + C)$
L-level Circular Convolution DWT    $\frac{4}{3}(1 - (\frac{1}{4})^L)RC(F - 2)$    $\frac{4}{3}(1 - (\frac{1}{4})^L)RCF$
L-level DWT, Minimum Cost           $\frac{4}{3}(1 - (\frac{1}{4})^L)RC(F - 2)$    $\frac{2}{3}(1 - (\frac{1}{4})^L)RC(F + 2)$
Table 4.2: Normalized costs of transforming a 512x512 image
Transform                              # Additions   # Multiplications
DCT                                    1.00          1.00
LOT                                    1.84          1.74
Circular Convolution, 2-Level DWT      2.69          5.00
Circular Convolution, 3-Level DWT      2.83          5.25
Circular Convolution, 4-Level DWT      2.86          5.31
Minimum cost, 2-Level DWT              2.69          2.82
Minimum cost, 3-Level DWT              2.83          2.95
Minimum cost, 4-Level DWT              2.86          2.98
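The normalized entries of Table 4.2 can be approximately reproduced by evaluating the closed forms of Table 4.1 for R = C = 512 and F = 9 + 7 = 16. The sketch below is illustrative only: the grouping of terms in the DWT expressions is an assumption read from Table 4.1 and checked against Table 4.2, and the function names are hypothetical. The factor (4/3)(1 - (1/4)^L) is the sum of 1, 1/4, ..., (1/4)^(L-1), which reflects that each additional level filters an LL band one quarter the size of the previous one.

```python
def dct_cost(R, C):
    """(additions, multiplications) for the 8x8 DCT on an R x C image."""
    return 13 * R * C / 2, 4 * R * C

def lot_cost(R, C):
    """(additions, multiplications) for the fast LOT."""
    return 12 * R * C - 18 * (R + C), 7 * R * C - 4 * (R + C)

def level_factor(L):
    """Sum of (1/4)^l for l = 0 .. L-1, written in closed form."""
    return (4 / 3) * (1 - 0.25 ** L)

def dwt_circular_cost(R, C, L, F=16):
    """Circular convolution DWT; F is the total filter length (assumed 9 + 7)."""
    g = level_factor(L)
    return g * R * C * (F - 2), g * R * C * F

def dwt_minimum_cost(R, C, L, F=16):
    """Minimum cost DWT that exploits the symmetry of the filters."""
    g = level_factor(L)
    return g * R * C * (F - 2), g * R * C * (F + 2) / 2

def normalized(cost, reference):
    """Divide additions and multiplications by those of the reference system."""
    return cost[0] / reference[0], cost[1] / reference[1]
```

For example, `normalized(dwt_circular_cost(512, 512, 2), dct_cost(512, 512))` gives roughly 2.69 additions and 5.00 multiplications relative to the DCT. The other rows agree with Table 4.2 to within about 0.01.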
According to the normalized costs above, the implementation of the DWT in this
thesis was the most costly of the three transforms. The costs of the DWT were
concentrated in the multiplications of the circular convolution process.
Compared to the minimum cost DWT, the algorithm used in this thesis can be
greatly improved. One solution is to take advantage of the symmetry of the
biorthogonal filters. The number of multiplications can be reduced to almost
one-half of the original cost by keeping a record of the previous multiplication
results. Other algorithms, such as the lifting algorithm [2], can also reduce
the cost of computing the DWT coefficients. On the other hand, the costs of the
LOT were only about 75-85% more than those of the DCT system.
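The saving from filter symmetry can be illustrated with a small sketch. It is not the implementation used in this thesis: instead of caching earlier products, it uses the equivalent idea of adding the two input samples that share a mirrored tap before multiplying. The filter is assumed to be odd-length and symmetric, and the boundary is handled circularly; both function names are hypothetical.

```python
def circular_filter_direct(x, h):
    """Circular filtering centred on the middle tap.
    Uses len(h) multiplications per output sample."""
    n, K = len(x), len(h) // 2
    return [sum(h[K + k] * x[(i - k) % n] for k in range(-K, K + 1))
            for i in range(n)]

def circular_filter_folded(x, h):
    """Same output for a symmetric h (h[K + k] == h[K - k]).
    Uses K + 1 multiplications per output sample."""
    n, K = len(x), len(h) // 2
    out = []
    for i in range(n):
        acc = h[K] * x[i]
        for k in range(1, K + 1):
            # Mirrored taps share one multiplication.
            acc += h[K + k] * (x[(i - k) % n] + x[(i + k) % n])
        out.append(acc)
    return out
```

For the 9-tap and 7-tap filters this gives 5 + 4 = 9 multiplications per pair of output samples instead of 16, which is consistent with the reduction to almost one-half described above.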
Another aspect of the algorithm comparisons was the similarity of the
algorithms. The LOT algorithm reuses the DCT block transform algorithm.
Therefore, the additional cost of implementing the LOT lies in the butterfly
structure for the overlapped parts. In contrast, the DWT algorithm is completely
different from the DCT algorithm. It does not use the block transform. As a
result, the cost of changing from the DCT to the DWT is higher than the cost of
changing from the DCT to the LOT for the transform block. In addition, the
entropy coder may need modifications. Effectively, the whole system must be
changed. The costs of changing to another system may or may not be acceptable
for practical applications. The change may be made if the improvement after
changing can justify the cost. The effects of the other parts of the system are
presented in the following sections.
4.1.2
Quantization
The complexity of the quantizers can be compared in a manner similar to the
transform comparison. The quantizers were evaluated in terms of their
implementation, such as the cost of the algorithm. In this thesis, all
quantizers can be classified into three groups as shown below.
(a) Optimal uniform
(b) Universal uniform: JPEG uniform quantizer and visual threshold
(c) Embedded zerotree wavelet
The optimal uniform quantizer was used to compare the systems from the
theoretical perspective only. This quantizer gives a fair comparison of
different transforms because it maximizes the signal-to-noise ratio (SNR) of the
quantized coefficients of a particular image. It is also useful because only the
DCT system has a standard quantizer. Developing quantizers for the LOT system
and the DWT system would require separate research and may take considerable
time. Therefore, the optimal quantizer gives a fast way to compare the systems
fairly. However, this quantizer uses a recursive algorithm to compute the
quantization levels, which is more computationally expensive than other types of
quantizers. Therefore, this quantizer may not be attractive for practical
systems.
On the other hand, the universal uniform quantizer uses a predefined set of quantization levels for all images. These levels are usually tested and verified to give good
visual quality. The advantage of this type of quantizer is that the quantizer and the dequantizer can perform their functions properly regardless of the image as long as they
both agree on the quantization levels. The calculation of the quantized coefficient of
the uniform quantizer, both the optimal one and the universal one, involves only a
division operation, which can be implemented in either hardware or software.
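The division-based uniform quantizer can be sketched as follows. This is a minimal illustration that assumes a single step size and rounding to the nearest level. The step values of the optimal, JPEG, and visual threshold quantizers are not reproduced here, and the function names are hypothetical.

```python
import math

def quantize(coefficient, step):
    """Map a coefficient to an integer index: divide by the step and round."""
    magnitude = math.floor(abs(coefficient) / step + 0.5)
    return int(math.copysign(magnitude, coefficient)) if magnitude else 0

def dequantize(index, step):
    """Reconstruct the coefficient at the centre of its quantization interval."""
    return index * step
```

The quantizer and the dequantizer only need to agree on `step`. With rounding to the nearest level, the reconstruction error of any coefficient is at most half a step.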
Finally, the EZW quantizer combines the bit-plane coding and the zerotree searching together in an efficient way. The compression is achieved by coding the tree structure as one symbol instead of coding the elements of the tree individually. Another
advantage of the EZW quantizer is that the EZW algorithm can achieve a desired bit
rate exactly, which might be useful for certain applications. Despite these advantages,
the zerotree search algorithm used in this thesis was repetitive and slow because once
a zerotree was found for some threshold, the whole tree was tested again against
successively smaller thresholds until all elements in the tree were found significant or the bit budget was
exhausted. The version of EZW in this thesis needs improvement in order
to compete with other quantizers and coders. Other algorithms have been developed to optimize the zerotree search. For example, set partitioning in
hierarchical trees (SPIHT) might be another alternative for the DWT system [10].
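The repeated tree test described above can be sketched as follows. This is a simplified illustration, not the EZW implementation of this thesis: it assumes a square coefficient array in which the children of position (i, j) are the four positions (2i..2i+1, 2j..2j+1), and it assumes that the initial threshold is the largest power of two not exceeding the largest coefficient magnitude. All names are hypothetical.

```python
def descendants(i, j, size):
    """Yield every descendant of (i, j) under the assumed dyadic parent-child rule."""
    stack = [(i, j)]
    while stack:
        r, c = stack.pop()
        for cr in (2 * r, 2 * r + 1):
            for cc in (2 * c, 2 * c + 1):
                if cr < size and cc < size and (cr, cc) != (r, c):
                    yield (cr, cc)
                    stack.append((cr, cc))

def is_zerotree_root(coeffs, i, j, threshold):
    """True if (i, j) and all of its descendants are insignificant at this threshold."""
    if abs(coeffs[i][j]) >= threshold:
        return False
    return all(abs(coeffs[r][c]) < threshold
               for r, c in descendants(i, j, len(coeffs)))

def initial_threshold(coeffs):
    """Largest power of two not exceeding the largest coefficient magnitude."""
    peak = max(abs(v) for row in coeffs for v in row)
    t = 1
    while 2 * t <= peak:
        t *= 2
    return t
```

A tree that is a zerotree at one threshold has to be walked again after the threshold is halved, which is the repetition that makes a direct search slow.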
4.1.3
Entropy Coding
The entropy coder allows the computation of the bit rate of each system. This
thesis did not emphasize the choice of a particular type of coder. To complete
the system, the coders used in this thesis are briefly compared below.
Three coders were used in this thesis: Huffman, adaptive Huffman, and
run-length Huffman. Compared to the Huffman coder, the adaptive Huffman coder
is more costly in terms of computation. The adaptive Huffman coder required
recomputation of the code words at regular intervals. The update of the code
table increased the overhead of the EZW algorithms used in this thesis.
Nevertheless, the adaptiveness of the coder allows it to be locally optimized to
code particular regions of the image.
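How a static Huffman coder turns a symbol stream into a bit rate can be sketched as follows. This is a minimal illustration with hypothetical names. It counts only the code bits and ignores the cost of transmitting the code table, which is an assumption and not a statement about how the bit rates in this thesis were computed.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a static Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (count, unique tie-breaker, symbols in this subtree).
    heap = [(count, idx, [sym]) for idx, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to each symbol below it
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths

def bits_per_pixel(symbols, num_pixels):
    """Average bit rate when every symbol is sent with its Huffman code."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols) / num_pixels
```

An adaptive coder would repeat the table construction as the statistics change, which is the source of the extra computation noted above.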
On the other hand, the run-length Huffman coder is easy to implement. This coder
changes the definition of the symbols from single symbols to strings of symbols.
This coder is particularly useful for systems that produce long runs of the same
value, such as the zeros in the DCT JPEG and the LOT JPEG systems. In the case
of the DWT, the size of each band varies according to the DWT level. Therefore,
the DWT system may require multiple sets of Huffman tables in order to compress
the data efficiently. Future study may be done to find a universal run-length
coder for the visual threshold system and the EZW system.
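The change from single symbols to (run, category) symbols can be sketched as follows. This is a simplified, JPEG-like illustration, not the coder of this thesis: the category of a coefficient is taken to be the number of bits of its magnitude, runs longer than `max_run` are split with an escape symbol, and a trailing run of zeros is replaced by a single end marker. These conventions and the function names are assumptions.

```python
def category(value):
    """Number of bits needed for |value| (0 for a zero coefficient)."""
    return abs(value).bit_length()

def run_length_symbols(coefficients, max_run=15):
    """Convert quantized coefficients to (zero run, category) symbols."""
    symbols = []
    run = 0
    for v in coefficients:
        if v == 0:
            run += 1
            continue
        while run > max_run:
            symbols.append((max_run, 0))  # escape: stands for max_run + 1 zeros
            run -= max_run + 1
        symbols.append((run, category(v)))
        run = 0
    if run:
        symbols.append((0, 0))  # end marker: only zeros remain
    return symbols
```

The resulting symbols, rather than the individual coefficients, are then Huffman coded, so a long run of zeros costs one code word instead of one code word per zero.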
4.2
Part II: System Performance
The performance comparison of all systems was done in terms of the PSNR and the
bit rate. The comparisons were divided into five parts as mentioned in the previous
chapter. The results and discussions of those five parts are presented below.
4.2.1
Part II-A: Effect of Number of Levels on DWT Systems
As previously described in the background section, the DWT coefficients are
affected by the number of DWT levels in the system. Specifically, the size and
the values of the LL band change according to the number of filter bank stages.
These comparisons investigated the effect of the number of levels on the PSNR
performance for different combinations of quantizers and entropy coders.
Furthermore, these comparisons were used to find the optimal level for the DWT
systems, which were then compared with the DCT and the LOT systems in the
subsequent comparisons. Three combinations of quantizer and entropy coder were
used in these comparisons. The systems are listed in Table 4.3 below. The
optimal uniform quantizer used a scaling factor of 0.35, and the run-length
Huffman coder limited the maximum category size to 11 and the maximum run size
to 15. This coder is similar to the baseline JPEG run-length Huffman coder.
Table 4.3: List of systems for comparing the effect of DWT levels on the
PSNR performance of the DWT systems
Transform      Quantizer          Entropy Coder
2-Level DWT    Optimal Uniform    Run-Length Huffman
3-Level DWT    Optimal Uniform    Run-Length Huffman
4-Level DWT    Optimal Uniform    Run-Length Huffman
2-Level DWT    Visual Threshold   Run-Length Huffman
3-Level DWT    Visual Threshold   Run-Length Huffman
4-Level DWT    Visual Threshold   Run-Length Huffman
2-Level DWT    EZW                Adaptive Huffman
3-Level DWT    EZW                Adaptive Huffman
4-Level DWT    EZW                Adaptive Huffman
The results of the simulation on all seven images are summarized in Tables 4.4 -
4.9 below. The average bit rates presented in the tables were {0.50, 0.75, 1.00,
1.25, 1.50, 1.75} bits per pixel. In the tables, the PSNR of the reference
system is given in parentheses, and the ΔPSNR columns give the PSNR difference
relative to that reference. The PSNR values were linearly interpolated at the
desired average bit rates because, with the exception of the EZW system, the
simulation could not fix the bit rate exactly. Graphs for the EZW systems are
shown in Figure 4-1 below.
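The interpolation step can be sketched as follows. The measured points in the usage note are hypothetical and are not results from this thesis; the function name is illustrative, and the rates are assumed to be sorted and strictly increasing.

```python
def interpolate_psnr(rates, psnrs, target_rate):
    """Linearly interpolate the PSNR at target_rate.
    rates must be sorted in strictly increasing order."""
    points = list(zip(rates, psnrs))
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if r0 <= target_rate <= r1:
            return p0 + (p1 - p0) * (target_rate - r0) / (r1 - r0)
    raise ValueError("target rate outside the measured range")
```

For example, with hypothetical measurements of 30.0 dB at 0.42 bits/pixel and 32.0 dB at 0.58 bits/pixel, the interpolated PSNR at 0.50 bits/pixel is 31.0 dB.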
According to the results, the systems using the optimal uniform quantizer and
the visual threshold quantizer were unaffected by the number of DWT levels. In
fact, as the output resolution of the visual threshold was varied from 32 to 16
or 64 pixels per degree, the PSNR performance remained unaffected. On the other
hand, the PSNR performance of the EZW system was significantly improved as the
number of levels was changed
from two to three. The EZW improved the PSNR about 0.5 to 4 dB as the level was
Table 4.4: Effect of DWT levels on the PSNR performance of the DWT
systems using the optimal uniform quantizer and the run-length Huffman
coder
Image      Bit Rate   2 Levels     3 Levels      4 Levels
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
baboon     0.50       (23.35)       0.65          0.65
           0.75       (25.95)       0.21          0.21
           1.00       (27.21)       0.30          0.26
           1.25       (28.86)       0.09          0.10
           1.50       (30.11)       0.23          0.22
           1.75       (31.63)       0.31          0.22
barbara2   0.50       (27.82)       1.43          1.56
           0.75       (30.85)       0.54          0.63
           1.00       (32.94)       0.94          1.00
           1.25       (34.99)       0.41          0.49
           1.50       (36.35)       0.42          0.49
           1.75       (37.67)       0.39          0.45
bfrag      0.50       (28.08)       1.33          1.46
           0.75       (32.06)       0.68          0.76
           1.00       (33.97)       0.34          0.38
           1.25       (35.13)       0.99          0.98
           1.50       (37.66)       0.42          0.48
           1.75       (38.97)       0.68          0.65
btfrag     0.50       (27.75)       0.97          0.98
           0.75       (30.79)       0.72          0.88
           1.00       (33.53)       0.66          0.64
           1.25       (35.46)       0.29          0.23
           1.50       (38.51)       0.18          0.17
           1.75       (39.94)       0.18          0.08
Table 4.5: Effect of DWT levels on the PSNR performance of the DWT
systems using the optimal uniform quantizer and the run-length Huffman
coder (cont.)
Image      Bit Rate   2 Levels     3 Levels      4 Levels
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
goldhill   0.50       (31.09)       0.72          0.85
           0.75       (33.79)       0.46          0.53
           1.00       (35.47)       0.49          0.47
           1.25       (36.93)       0.19          0.27
           1.50       (37.76)       0.53          0.53
           1.75       (39.23)       0.20          0.10
lena       0.50       (33.36)       1.61          1.67
           0.75       (36.27)       0.85          0.90
           1.00       (38.19)       0.31          0.33
           1.25       (39.38)       0.26          0.20
           1.50       (40.65)       0.11          0.16
           1.75       (41.66)       0.47          0.58
peppers    0.50       (31.94)       1.80          1.91
           0.75       (35.44)       0.35          0.37
           1.00       (36.59)       0.27          0.29
           1.25       (37.76)       0.12          0.14
           1.50       (38.76)       0.21          0.24
           1.75       (40.04)       0.00         -0.93
Table 4.6: Effect of DWT levels on the PSNR performance of the DWT
systems using the visual threshold quantizer and the run-length Huffman
coder
Image      Bit Rate   2 Levels     3 Levels      4 Levels
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
baboon     0.50       (23.80)       0.20          0.22
           0.75       (25.75)       0.18          0.17
           1.00       (27.23)       0.11          0.10
           1.25       (28.59)       0.13          0.12
           1.50       (29.75)       0.06          0.07
           1.75       (30.97)       0.06          0.05
barbara2   0.50       (29.02)       0.42          0.44
           0.75       (31.53)       0.39          0.41
           1.00       (33.71)       0.25          0.27
           1.25       (35.35)       0.27          0.28
           1.50       (36.70)       0.11          0.13
           1.75       (37.84)       0.28          0.31
bfrag      0.50       (28.83)       0.59          0.68
           0.75       (31.91)       0.50          0.50
           1.00       (33.94)       0.33          0.28
           1.25       (35.78)       0.22          0.18
           1.50       (37.27)       0.34          0.37
           1.75       (38.92)       0.14          0.09
btfrag     0.50       (28.22)       0.24          0.20
           0.75       (31.60)       0.32          0.28
           1.00       (34.48)       0.07          0.07
           1.25       (36.65)       0.16          0.21
           1.50       (38.35)       0.13         -0.04
           1.75       (37.75)       0.20          0.13
Table 4.7: Effect of DWT levels on the PSNR performance of the DWT
systems using the visual threshold quantizer and the run-length Huffman
coder (cont.)
Image      Bit Rate   2 Levels     3 Levels      4 Levels
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
goldhill   0.50       (32.30)       0.13          0.13
           0.75       (34.26)       0.17          0.14
           1.00       (35.73)       0.11          0.14
           1.25       (37.05)       0.16          0.09
           1.50       (38.27)       0.07         -0.06
           1.75       (39.22)       0.10          0.08
lena       0.50       (34.64)       0.32          0.32
           0.75       (36.86)       0.19          0.22
           1.00       (38.26)       0.22          0.19
           1.25       (39.37)       0.31          0.28
           1.50       (40.64)       0.14          0.09
           1.75       (41.77)       0.17          0.11
peppers    0.50       (33.97)       0.24          0.19
           0.75       (35.63)       0.13          0.05
           1.00       (36.82)      -0.03         -0.05
           1.25       (37.71)       0.05          0.03
           1.50       (38.50)      -0.03         -0.05
           1.75       (39.34)       0.15         -0.01
Table 4.8: Effect of DWT levels on the PSNR performance of the DWT
systems using the EZW quantizer and the adaptive Huffman coder
Image      Bit Rate   2 Levels     3 Levels      4 Levels
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
baboon     0.50       (22.20)       1.20          1.68
           0.75       (23.73)       1.01          1.25
           1.00       (24.81)       2.04          2.19
           1.25       (26.99)       0.65          0.77
           1.50       (27.70)       0.77          0.99
           1.75       (28.55)       1.15          1.56
barbara2   0.50       (24.57)       3.02          3.94
           0.75       (27.05)       3.20          3.84
           1.00       (28.76)       3.28          4.70
           1.25       (30.30)       3.30          3.60
           1.50       (32.68)       2.53          3.03
           1.75       (34.50)       2.37          2.99
bfrag      0.50       (23.95)       3.32          4.54
           0.75       (26.93)       3.65          4.21
           1.00       (28.72)       3.55          4.97
           1.25       (31.29)       3.59          3.95
           1.50       (33.05)       2.93          3.43
           1.75       (35.21)       2.53          3.14
btfrag     0.50       (26.36)       2.69          3.69
           0.75       (28.48)       2.99          3.55
           1.00       (31.06)       3.52          3.77
           1.25       (32.72)       2.80          3.14
           1.50       (35.09)       2.03          2.73
           1.75       (36.09)       3.30          3.46
Table 4.9: Effect of DWT levels on the PSNR performance of the DWT
systems using the EZW quantizer and the adaptive Huffman coder (cont.)
Image      Bit Rate   2 Levels     3 Levels      4 Levels
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
goldhill   0.50       (27.44)       3.60          4.38
           0.75       (29.79)       3.18          4.24
           1.00       (31.48)       2.98          3.41
           1.25       (32.95)       2.56          3.01
           1.50       (34.55)       2.26          3.08
           1.75       (35.50)       2.79          3.01
lena       0.50       (28.50)       4.69          6.02
           0.75       (31.67)       4.19          4.72
           1.00       (33.92)       3.21          3.93
           1.25       (35.91)       2.91          3.29
           1.50       (37.01)       2.63          2.89
           1.75       (38.59)       1.63          1.92
peppers    0.50       (26.40)       3.60          4.40
           0.75       (30.02)       3.18          4.24
           1.00       (32.60)       2.98          3.41
           1.25       (34.09)       2.56          3.01
           1.50       (35.64)       2.26          3.08
           1.75       (36.34)       2.79          3.01
[Figure 4-1 panels: baboon.raw, barbara2.raw, bfrag.raw, goldhill.raw, lena.raw, peppers.raw; horizontal axis: Average Bit Rate (bit per pixel); curves: 2-Level, 3-Level, 4-Level]
Figure 4-1: Effect of number of DWT levels on the PSNR performance
of the DWT systems using the EZW quantizer and the adaptive Huffman
coder
changed from two to three, and gained about 1 dB as the level was changed from
three to four.
This result clearly indicates that the number of levels affects the PSNR
performance of the EZW system, but does not affect the DWT system with the
uniform quantizer. Therefore, the number of DWT levels can be limited to three
for the systems that use a uniform quantizer, such as the optimal uniform
quantizer or the visual threshold. The EZW system may gain an advantage over
other systems by using wavelet expansions of four or more levels. However, the
trade-off between the cost of computing the additional DWT level and the gain in
image quality must be considered.
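The cost side of this trade-off can be estimated with a short worked calculation. It assumes that the DWT cost scales with the level factor $1 - (1/4)^L$ of the closed forms in Table 4.1, so the ratio of the 4-level cost to the 3-level cost is independent of the image size and filter length.

```latex
\frac{\text{cost}(L = 4)}{\text{cost}(L = 3)}
  = \frac{1 - (1/4)^{4}}{1 - (1/4)^{3}}
  = \frac{255/256}{63/64}
  = \frac{255}{252}
  \approx 1.012
```

Under this assumption the fourth level adds only about 1.2% to the transform cost, which agrees with the normalized entries of Table 4.2 (2.83 to 2.86 additions and 5.25 to 5.31 multiplications).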
4.2.2
Part II-B: Effect of Transform
This section compared the effect of the choice of the transform block on the
PSNR performance. Comparisons were done by restricting the choice of quantizer
and entropy coder. Because the only difference in the system is the transform block,
the differences in the PSNR should be mainly influenced by the transform block. The
list of systems tested is shown in Table 4.10.
Table 4.10: List of systems for comparing the effect of transforms on the
PSNR performance
Transform      Quantizer          Entropy Coder
DCT            Optimal Uniform    Huffman
LOT            Optimal Uniform    Huffman
3-Level DWT    Optimal Uniform    Huffman
DCT            Optimal Uniform    Run-Length Huffman
LOT            Optimal Uniform    Run-Length Huffman
3-Level DWT    Optimal Uniform    Run-Length Huffman
The DWT system with the optimal uniform quantizer used 3 levels because the
results from the previous section showed that there was no significant
improvement from increasing the DWT level to four. The optimal uniform quantizer
used the scaling factor of 0.35 for all transforms. Finally, the run-length
Huffman table for the DWT system was generated from the statistics of the
quantized coefficients because there is no standard coder for the DWT system.
The PSNR performance of all systems listed above is summarized in Tables 4.11 -
4.14 below.
Table 4.11: Effect of transforms on the PSNR performance using the optimal uniform quantizer and the Huffman coder
Image      Bit Rate   DCT          LOT           DWT
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
baboon     0.50       (23.66)       0.31         -0.20
           0.75       (25.29)       0.32         -0.17
           1.00       (26.82)       0.32          0.14
           1.25       (28.20)       0.30         -0.06
           1.50       (29.53)       0.25         -0.03
           1.75       (30.82)       0.22         -0.12
barbara2   0.50       (28.05)       0.70         -0.96
           0.75       (30.47)       0.47         -0.75
           1.00       (32.25)       0.59         -0.48
           1.25       (33.88)       0.61         -0.80
           1.50       (35.48)       0.59         -0.71
           1.75       (37.00)       0.49         -1.51
bfrag      0.50       (27.16)       0.98         -0.11
           0.75       (30.00)       1.23          0.28
           1.00       (32.70)       1.25          0.55
           1.25       (35.17)       0.80         -0.55
           1.50       (37.02)       0.54         -1.34
           1.75       (38.51)       0.38         -1.75
btfrag     0.50       (28.79)       0.55          0.16
           0.75       (31.56)       0.67         -0.67
           1.00       (33.92)       0.59         -0.87
           1.25       (35.82)       0.42         -0.94
           1.50       (37.43)       0.47         -1.24
           1.75       (38.92)       0.28         -0.94
According to the results in Tables 4.11 - 4.14, when the optimal uniform
quantizer and the Huffman coder were used, the LOT performed the best of the
three transforms across all images. The PSNR gain of the LOT over the DCT was as
high as 1 dB in some cases. On the other hand, the 3-level DWT system had lower
PSNR than the other two systems. In some images, the PSNR of the DWT system was
as much as 1.75 dB lower than that of the DCT system.
Table 4.12: Effect of transforms on the PSNR performance using the optimal uniform quantizer and the Huffman coder (cont.)
Image      Bit Rate   DCT          LOT           DWT
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
goldhill   0.50       (31.48)       0.41         -0.40
           0.75       (33.27)       0.37         -1.21
           1.00       (34.73)       0.29         -1.69
           1.25       (36.04)       0.33         -0.93
           1.50       (37.28)       0.33         -0.47
           1.75       (38.54)       0.20         -1.16
lena       0.50       (32.71)       0.59          0.15
           0.75       (35.00)       0.44         -0.54
           1.00       (36.61)       0.22         -0.33
           1.25       (37.81)       0.31         -0.70
           1.50       (39.00)       0.31         -0.47
           1.75       (39.99)       0.27         -0.77
peppers    0.50       (31.89)       0.41         -0.32
           0.75       (33.81)       0.26         -1.07
           1.00       (35.05)       0.26         -0.96
           1.25       (36.02)       0.23         -0.21
           1.50       (36.91)       0.31         -0.65
           1.75       (37.84)       0.27          0.06
Table 4.13: Effect of transforms on the PSNR performance using the optimal uniform quantizer and the run-length Huffman coder
Image      Bit Rate   DCT          LOT           DWT
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
baboon     0.50       (23.28)       0.17          0.73
           0.75       (25.01)       0.23          1.15
           1.00       (26.69)       0.21          0.81
           1.25       (28.16)       0.14          0.80
           1.50       (29.50)       0.07          0.83
           1.75       (30.80)       0.18          1.14
barbara2   0.50       (28.72)       0.49          0.52
           0.75       (31.29)       0.48          0.10
           1.00       (33.48)       0.48          0.40
           1.25       (35.31)       0.25          0.09
           1.50       (36.79)       0.30         -0.02
           1.75       (38.07)       0.11         -0.01
bfrag      0.50       (27.99)       1.17          1.42
           0.75       (31.82)       1.00          0.92
           1.00       (34.63)       0.65         -0.32
           1.25       (36.64)       0.59         -0.52
           1.50       (38.29)       0.35         -0.21
           1.75       (39.60)       0.21          0.05
btfrag     0.50       (29.16)      -1.46         -0.44
           0.75       (32.08)      -1.06         -0.57
           1.00       (34.97)      -0.99         -0.78
           1.25       (36.96)      -1.04         -1.21
           1.50       (38.72)      -0.76         -0.03
           1.75       (40.12)      -0.55          0.00
Table 4.14: Effect of transforms on the PSNR performance using the optimal uniform quantizer and the run-length Huffman coder (cont.)
Image      Bit Rate   DCT          LOT           DWT
                      PSNR (dB)    ΔPSNR (dB)    ΔPSNR (dB)
goldhill   0.50       (32.06)       0.21         -0.24
           0.75       (34.10)       0.22          0.16
           1.00       (35.51)       0.21          0.45
           1.25       (36.77)       0.20          0.35
           1.50       (37.91)       0.07          0.39
           1.75       (38.80)       0.20          0.63
lena       0.50       (34.26)       0.08          0.72
           0.75       (36.55)       0.07          0.56
           1.00       (37.97)      -0.05          0.53
           1.25       (38.88)       0.07          0.76
           1.50       (39.76)       0.15          1.00
           1.75       (40.68)       0.18          1.45
peppers    0.50       (33.54)      -0.29          0.19
           0.75       (35.30)      -0.32          0.49
           1.00       (35.71)      -0.35          1.15
           1.25       (36.07)      -0.12          1.81
           1.50       (37.10)      -0.18          1.87
           1.75       (38.28)       0.00          1.76
As the coder was changed to the run-length Huffman coder, the PSNR results of
all transforms were comparable. The LOT system gave a slightly higher PSNR than
the DCT system, up to 0.5 dB in some cases. For the btfrag and peppers images,
the DCT system performed slightly better than the LOT system, by about 1 dB in
the case of the btfrag image. The 3-level DWT system performed better than the
DCT system in all images except btfrag. The performance gain of the DWT system
was as high as 1.80 dB for the peppers image.
This comparison shows a small positive PSNR gain of the LOT system with the
optimal uniform quantizer over the DCT system in most cases. On the other hand,
the performance of the DWT depends on the type of encoder. The gain in the LOT
system is too small to confirm that it is obtained from the transform block. In
the case of the DWT system, the gain clearly does not depend on the transform
block. The two comparisons above suggest that the DWT system gains PSNR
performance at least in part from the entropy encoder block. The influence of
the quantizer block and the entropy coder block is shown in the next
comparisons.
4.2.3
Part II-C: Effect of Quantizer
The following comparisons investigated the effect of the choice of the quantizer
on the PSNR performance. Specifically, the comparison focused on the optimal
quantizer and the universal quantizer because practical systems usually use the
universal quantizer. By constraining the choice of the transform and the entropy
coder, the comparison isolated the effect on PSNR due to the quantizer alone.
The systems tested are listed in Table 4.15. The comparison results are
summarized in Tables 4.16 - 4.21 below.
The optimal uniform quantizer used the scaling factor of 0.35 for all systems.
The visual threshold quantizer in the DWT system used an output resolution of 32
pixels/degree. The output resolution did not significantly change the PSNR
performance: experiments with 16 pixels/degree and 64 pixels/degree gave almost
the same results as the system that used 32 pixels/degree. Therefore, the system
with the output resolution of 32 pixels/degree was used to represent
Table 4.15: List of systems for comparing the effect of quantizers on the
PSNR performance using the run-length Huffman coder
Transform      Quantizer          Entropy Coder
DCT            Optimal Uniform    Run-Length Huffman
DCT            JPEG Uniform       Run-Length Huffman
LOT            Optimal Uniform    Run-Length Huffman
LOT            JPEG Uniform       Run-Length Huffman
3-Level DWT    Optimal Uniform    Run-Length Huffman
3-Level DWT    Visual Threshold   Run-Length Huffman
the visual threshold system in all comparisons. The EZW system was not compared
in this section because the encoder used in the EZW system was not the same as
that of the visual threshold system. However, the comparison was made at the
system level in a later section.
According to the results, the optimal uniform quantizer systems had better PSNR
than the universal uniform quantizer systems across most images, as expected.
However, the difference between the two quantizers was small for all transforms,
about 0.5 dB. As a result, the universal uniform quantizer can be used in a
practical system in place of the optimal uniform quantizer without losing much
of the PSNR performance.
4.2.4
Part II-D: Effect of Entropy Coder
The comparisons of the effect of the choice of entropy coder on the PSNR
performance were done by choosing different encoding schemes for the quantizer
output. The coders compared in this thesis were the Huffman coder and the
run-length Huffman coder. The complete list of the tested systems is shown in
Table 4.22. The results of the compared systems are summarized in Tables 4.23 -
4.28. The optimal uniform quantizer in all systems used the scaling factor of
0.35.
According to the results, the run-length Huffman coder gave slightly better PSNR
performance than the Huffman coder for the DCT system and the LOT system with
the optimal uniform quantizer. The PSNR gain was about 0.5-1.0 dB. On the other
hand, the visual threshold wavelet
Table 4.16: Effect of quantizers on the PSNR performance of the DCT
systems using the run-length Huffman coder
Image      Bit Rate   Optimal Uniform   JPEG Uniform
                      PSNR (dB)         ΔPSNR (dB)
baboon     0.50       (23.28)            0.38
           0.75       (25.01)            0.17
           1.00       (26.69)           -0.29
           1.25       (28.16)           -0.61
           1.50       (29.50)           -0.85
           1.75       (30.80)           -1.02
barbara2   0.50       (28.72)            0.25
           0.75       (31.29)            0.00
           1.00       (33.48)           -0.44
           1.25       (35.31)           -0.78
           1.50       (36.79)           -0.91
           1.75       (38.07)           -0.94
bfrag      0.50       (27.99)            0.19
           0.75       (31.82)           -0.54
           1.00       (34.63)           -0.86
           1.25       (36.64)           -0.90
           1.50       (38.29)           -0.93
           1.75       (39.60)           -0.85
btfrag     0.50       (29.16)            0.36
           0.75       (32.08)            0.27
           1.00       (34.97)           -0.36
           1.25       (36.96)           -0.48
           1.50       (38.72)           -0.65
           1.75       (40.12)           -0.66
Table 4.17: Effect of quantizers on the PSNR performance of the DCT
systems using the run-length Huffman coder (cont.)
Image      Bit Rate   Optimal Uniform   JPEG Uniform
                      PSNR (dB)         ΔPSNR (dB)
goldhill   0.50       (32.06)            0.29
           0.75       (34.10)            0.04
           1.00       (35.51)            0.04
           1.25       (36.77)            0.08
           1.50       (37.91)           -0.17
           1.75       (38.80)           -0.17
lena       0.50       (34.26)            0.29
           0.75       (36.55)           -0.09
           1.00       (37.97)           -0.20
           1.25       (38.88)            0.01
           1.50       (39.76)            0.00
           1.75       (40.68)           -0.05
peppers    0.50       (33.54)            0.24
           0.75       (35.30)           -0.03
           1.00       (35.71)            0.57
           1.25       (36.07)            0.98
           1.50       (37.10)            0.65
           1.75       (38.28)            0.11
Table 4.18: Effect of quantizers on the PSNR performance of the LOT
systems using the run-length Huffman coder
Image      Bit Rate   Optimal Uniform   JPEG Uniform
                      PSNR (dB)         ΔPSNR (dB)
baboon     0.50       (23.45)            0.52
           0.75       (25.24)            0.24
           1.00       (26.90)           -0.19
           1.25       (28.30)           -0.44
           1.50       (29.58)           -0.61
           1.75       (30.98)           -0.94
barbara2   0.50       (29.21)            0.48
           0.75       (31.77)            0.08
           1.00       (33.96)           -0.38
           1.25       (35.56)           -0.49
           1.50       (37.09)           -0.70
           1.75       (38.18)           -0.61
bfrag      0.50       (29.15)            0.51
           0.75       (32.83)           -0.22
           1.00       (35.28)           -0.47
           1.25       (37.64)           -0.62
           1.50       (38.64)           -0.56
           1.75       (39.81)           -0.50
btfrag     0.50       (27.70)            2.17
           0.75       (31.02)            1.63
           1.00       (33.98)            0.83
           1.25       (35.92)            0.73
           1.50       (37.96)            0.22
           1.75       (39.58)           -0.05
Table 4.19: Effect of quantizers on the PSNR performance of the LOT
systems using the run-length Huffman coder (cont.)
Image      Bit Rate   Optimal Uniform   JPEG Uniform
                      PSNR (dB)         ΔPSNR (dB)
goldhill   0.50       (32.26)            0.45
           0.75       (34.33)            0.16
           1.00       (35.72)            0.10
           1.25       (36.97)           -0.01
           1.50       (37.98)           -0.03
           1.75       (39.00)           -0.14
lena       0.50       (34.34)            0.58
           0.75       (36.62)            0.07
           1.00       (37.92)            0.02
           1.25       (38.96)            0.05
           1.50       (39.91)            0.02
           1.75       (40.86)           -0.07
peppers    0.50       (33.25)            0.65
           0.75       (34.98)            0.33
           1.00       (35.36)            0.93
           1.25       (35.95)            1.12
           1.50       (36.93)            0.85
           1.75       (38.29)            0.17
Table 4.20: Effect of quantizers on the PSNR performance of the DWT
systems using the run-length Huffman coder
Image      Bit Rate   Optimal Uniform   Visual Threshold
                      PSNR (dB)         ΔPSNR (dB)
baboon     0.50       (24.00)            0.00
           0.75       (26.16)           -0.23
           1.00       (27.51)           -0.17
           1.25       (28.96)           -0.24
           1.50       (30.34)           -0.53
           1.75       (31.94)           -0.91
barbara2   0.50       (29.24)            0.21
           0.75       (31.39)            0.53
           1.00       (33.88)            0.08
           1.25       (35.40)            0.21
           1.50       (36.77)            0.04
           1.75       (38.06)            0.06
bfrag      0.50       (29.41)            0.02
           0.75       (32.74)           -0.33
           1.00       (34.31)           -0.04
           1.25       (36.12)           -0.12
           1.50       (38.08)           -0.47
           1.75       (39.65)           -0.59
btfrag     0.50       (28.72)           -0.26
           0.75       (31.51)            0.41
           1.00       (34.19)            0.35
           1.25       (35.75)            1.06
           1.50       (38.69)           -0.22
           1.75       (40.12)           -0.18
Table 4.21: Effect of quantizers on the PSNR performance of the DWT
systems using the run-length Huffman coder (cont.)
Image      Bit Rate   Optimal Uniform   Visual Threshold
           (bpp)      PSNR (dB)         ΔPSNR (dB)
goldhill   0.50       (31.82)            0.63
           0.75       (34.26)            0.17
           1.00       (35.96)           -0.12
           1.25       (37.12)            0.09
           1.50       (38.30)            0.04
           1.75       (39.43)           -0.12
lena       0.50       (34.98)           -0.02
           0.75       (37.12)           -0.06
           1.00       (38.50)           -0.03
           1.25       (39.64)            0.04
           1.50       (40.75)            0.03
           1.75       (42.13)           -0.36
peppers    0.50       (33.73)            0.48
           0.75       (35.79)           -0.04
           1.00       (36.86)           -0.06
           1.25       (37.88)           -0.12
           1.50       (38.97)           -0.50
           1.75       (40.05)           -0.55
Table 4.22: List of systems for comparing the effect of entropy coders on
the PSNR performance using the optimal uniform quantizer
Transform     Quantizer         Entropy Coder
DCT           Optimal Uniform   Huffman
DCT           Optimal Uniform   Run-Length Huffman
LOT           Optimal Uniform   Huffman
LOT           Optimal Uniform   Run-Length Huffman
3-level DWT   Optimal Uniform   Huffman
3-level DWT   Optimal Uniform   Run-Length Huffman
system gained about 2-3 dB above the DWT system with the Huffman coder across
most images. The comparisons in the previous section showed that the choice
between the optimal uniform quantizer and the universal uniform quantizer had
very little effect on the PSNR performance. Therefore, these results should
remain valid for systems that use the universal uniform quantizer as well. The
results suggest that the DWT system can gain a significant increase in PSNR
when using the run-length Huffman coder with the universal uniform quantizer.
Although this simulation computed the run-length Huffman table for the DWT
system from the statistics of the coefficients, an optimized universal
run-length coder should give similar or slightly lower PSNR performance for
the DWT system with the uniform quantizer. An actual system combining the
visual threshold wavelet quantizer and the run-length coder may achieve at
least the same PSNR performance as the current baseline JPEG system.
4.2.5 Part II-E: Overall System Performance
This section compared the overall performance of the complete systems. Each
transform was allowed to be combined with any choice of quantizer and entropy
coder to optimize its performance. According to the earlier comparisons, the
universal uniform quantizer gave similar PSNR performance to the optimal
uniform quantizer for all transforms. In addition, the universal quantizer
represented the practical system, which was the main focus of most
applications. As a result, we decided to exclude the systems with the optimal
uniform quantizer from this comparison.
Table 4.23: Effect of entropy coders on the PSNR performance of the DCT
systems using the optimal uniform quantizer
Image      Bit Rate   Huffman     Run-Length Huffman
           (bpp)      PSNR (dB)   ΔPSNR (dB)
baboon     0.50       (23.66)     -0.38
           0.75       (25.29)     -0.28
           1.00       (26.82)     -0.12
           1.25       (28.20)     -0.04
           1.50       (29.53)     -0.03
           1.75       (30.82)     -0.01
barbara2   0.50       (28.05)      0.67
           0.75       (30.47)      0.82
           1.00       (32.25)      1.23
           1.25       (33.88)      1.43
           1.50       (35.48)      1.32
           1.75       (37.00)      1.07
bfrag      0.50       (27.16)      0.83
           0.75       (30.00)      1.82
           1.00       (32.70)      1.93
           1.25       (35.17)      1.46
           1.50       (37.03)      1.26
           1.75       (38.51)      1.10
btfrag     0.50       (28.79)      0.37
           0.75       (31.56)      0.51
           1.00       (33.91)      1.05
           1.25       (35.82)      1.14
           1.50       (37.43)      1.29
           1.75       (38.92)      1.20
Table 4.24: Effect of entropy coders on the PSNR performance of the DCT
systems using the optimal uniform quantizer (cont.)
Image      Bit Rate   Huffman     Run-Length Huffman
           (bpp)      PSNR (dB)   ΔPSNR (dB)
goldhill   0.50       (31.48)      0.57
           0.75       (33.27)      0.83
           1.00       (34.73)      0.78
           1.25       (36.04)      0.73
           1.50       (37.28)      0.63
           1.75       (38.54)      0.26
lena       0.50       (32.71)      1.55
           0.75       (35.00)      1.55
           1.00       (36.61)      1.36
           1.25       (37.81)      1.07
           1.50       (39.00)      0.76
           1.75       (39.99)      0.69
peppers    0.50       (31.89)      1.66
           0.75       (33.81)      1.49
           1.00       (35.05)      0.65
           1.25       (36.02)      0.05
           1.50       (36.91)      0.20
           1.75       (37.84)      0.45
Table 4.25: Effect of entropy coders on the PSNR performance of the LOT
systems using the optimal uniform quantizer
Image      Bit Rate   Huffman     Run-Length Huffman
           (bpp)      PSNR (dB)   ΔPSNR (dB)
baboon     0.50       (23.98)     -0.52
           0.75       (25.61)     -0.37
           1.00       (27.13)     -0.23
           1.25       (28.49)     -0.19
           1.50       (29.78)     -0.20
           1.75       (31.03)     -0.05
barbara2   0.50       (28.74)      0.47
           0.75       (30.95)      0.82
           1.00       (32.84)      1.12
           1.25       (34.50)      1.06
           1.50       (36.07)      1.02
           1.75       (37.49)      0.70
bfrag      0.50       (28.14)      1.01
           0.75       (31.23)      1.59
           1.00       (33.96)      1.32
           1.25       (35.97)      1.25
           1.50       (37.57)      1.07
           1.75       (38.89)      0.92
btfrag     0.50       (29.34)     -1.63
           0.75       (32.23)     -1.21
           1.00       (34.51)     -0.53
           1.25       (36.24)     -0.32
           1.50       (37.90)      0.06
           1.75       (39.20)      0.37
Table 4.26: Effect of entropy coders on the PSNR performance of the LOT
systems using the optimal uniform quantizer (cont.)
Image      Bit Rate   Huffman     Run-Length Huffman
           (bpp)      PSNR (dB)   ΔPSNR (dB)
goldhill   0.50       (31.89)      0.37
           0.75       (33.63)      0.69
           1.00       (35.02)      0.70
           1.25       (36.37)      0.60
           1.50       (37.61)      0.37
           1.75       (38.75)      0.26
lena       0.50       (33.30)      1.04
           0.75       (35.44)      1.18
           1.00       (36.82)      1.10
           1.25       (38.12)      0.84
           1.50       (39.31)      0.59
           1.75       (40.26)      0.60
peppers    0.50       (32.29)      0.96
           0.75       (34.07)      0.90
           1.00       (35.31)      0.04
           1.25       (36.24)     -0.29
           1.50       (37.22)     -0.29
           1.75       (38.11)      0.18
Table 4.27: Effect of entropy coders on the PSNR performance of the
DWT systems using the optimal uniform quantizer
Image      Bit Rate   Huffman     Run-Length Huffman
           (bpp)      PSNR (dB)   ΔPSNR (dB)
baboon     0.50       (23.47)      0.54
           0.75       (25.12)      1.04
           1.00       (26.96)      0.54
           1.25       (28.13)      0.82
           1.50       (29.50)      0.83
           1.75       (30.70)      1.24
barbara2   0.50       (27.09)      2.15
           0.75       (29.72)      1.67
           1.00       (31.77)      2.12
           1.25       (33.08)      2.32
           1.50       (34.77)      2.00
           1.75       (35.49)      2.57
bfrag      0.50       (27.04)      2.36
           0.75       (30.28)      2.46
           1.00       (33.26)      1.05
           1.25       (34.62)      1.50
           1.50       (35.69)      2.39
           1.75       (36.75)      2.90
btfrag     0.50       (28.95)     -0.23
           0.75       (30.90)     -0.61
           1.00       (33.04)      1.15
           1.25       (34.88)      0.87
           1.50       (36.19)      2.50
           1.75       (37.98)      2.14
Table 4.28: Effect of entropy coders on the PSNR performance of the
DWT systems using the optimal uniform quantizer (cont.)
Image      Bit Rate   Huffman     Run-Length Huffman
           (bpp)      PSNR (dB)   ΔPSNR (dB)
goldhill   0.50       (31.12)      0.70
           0.75       (32.06)      2.20
           1.00       (33.04)      2.92
           1.25       (35.11)      2.01
           1.50       (36.81)      1.49
           1.75       (37.39)      2.05
lena       0.50       (32.86)      2.12
           0.75       (34.46)      2.65
           1.00       (36.28)      2.23
           1.25       (37.11)      2.53
           1.50       (38.53)      2.23
           1.75       (39.22)      2.91
peppers    0.50       (31.57)      2.16
           0.75       (32.75)      3.04
           1.00       (34.09)      2.76
           1.25       (35.81)      2.07
           1.50       (36.26)      2.71
           1.75       (37.89)      2.15
The list of the compared systems in this section is shown in Table 4.29 below.
Table 4.29: List of selected systems for the comparison in PSNR performance
Transform     Quantizer          Entropy Coder
DCT           JPEG Uniform       Run-Length Huffman
LOT           JPEG Uniform       Run-Length Huffman
3-Level DWT   Visual Threshold   Run-Length Huffman
4-Level DWT   EZW                Adaptive Huffman
The DCT system was the simulated baseline JPEG system, which is one of the
JPEG standards for lossy image compression. The LOT system in this comparison
was similar to the baseline JPEG system. The visual threshold system is the
JPEG-like system for the DWT. The output resolution of this system was set to
32 pixels/degree. Following the earlier comparison in this thesis, the EZW
system used the 4-level wavelet expansion because this number of levels
increased the compression ability and PSNR performance of the system. The
results are summarized in Tables 4.30 and 4.31 below.
According to the results, the baseline LOT JPEG-like system performed better
than the DCT system for all images. On average, the LOT JPEG system provided
0.5 dB PSNR above the baseline DCT JPEG system, and it provided up to 1.5 dB
PSNR above the DCT system in some cases. Therefore, it can be concluded that
the LOT JPEG system generally provides better PSNR than the DCT JPEG system.
For the wavelet system, the visual threshold system provided slightly better
PSNR than the DCT system. However, the optimal run-length encoder was used
instead of the universal coder, and this coder was the reason for the PSNR
gain of the visual threshold system. An efficient universal coder should
reduce the PSNR slightly, at which point the visual threshold system should
perform comparably to the DCT JPEG system.
On the other hand, the 4-level EZW system resulted in about 0.5 dB lower PSNR
than the other systems. This PSNR loss may be caused by inefficiency of the
coder. In other words, the coder might use the bit budget inefficiently such
that there
Table 4.30: Comparison of the PSNR performance of DCT, LOT, and
DWT systems
Image      Bit Rate   DCT         LOT          VT DWT       EZW DWT
           (bpp)      PSNR (dB)   ΔPSNR (dB)   ΔPSNR (dB)   ΔPSNR (dB)
baboon     0.50       (23.65)      0.32         0.35         0.03
           0.75       (25.18)      0.31         0.75        -0.19
           1.00       (26.41)      0.30         0.94         0.59
           1.25       (27.55)      0.32         1.17         0.21
           1.50       (28.66)      0.31         1.15         0.04
           1.75       (29.78)      0.26         1.25         0.33
barbara2   0.50       (28.98)      0.71         0.47        -0.46
           0.75       (31.29)      0.56         0.63        -0.40
           1.00       (33.04)      0.54         0.93         0.43
           1.25       (34.53)      0.54         1.09         0.04
           1.50       (35.88)      0.51         0.93        -0.18
           1.75       (37.13)      0.44         0.99         0.35
bfrag      0.50       (28.17)      1.49         1.25         0.32
           0.75       (31.28)      1.32         1.13        -0.14
           1.00       (33.77)      1.04         0.50        -0.08
           1.25       (35.74)      0.86         0.26        -0.50
           1.50       (37.36)      0.72         0.25        -0.88
           1.75       (38.75)      0.57         0.32        -0.40
btfrag     0.50       (29.53)      0.35        -1.07         0.53
           0.75       (32.34)      0.30        -0.42        -0.31
           1.00       (34.64)      0.16        -0.10         0.19
           1.25       (36.48)      0.17         0.33        -0.61
           1.50       (38.07)      0.11         0.40        -0.24
           1.75       (39.46)      0.07         0.49         0.09
Table 4.31: Comparison of the PSNR performance of DCT, LOT, and
DWT systems (cont.)
Image      Bit Rate   DCT         LOT          VT DWT       EZW DWT
           (bpp)      PSNR (dB)   ΔPSNR (dB)   ΔPSNR (dB)   ΔPSNR (dB)
goldhill   0.50       (32.35)      0.37         0.10        -0.51
           0.75       (34.14)      0.34         0.28        -0.12
           1.00       (35.55)      0.27         0.29        -0.66
           1.25       (36.69)      0.27         0.52        -0.72
           1.50       (37.73)      0.22         0.60        -0.10
           1.75       (38.63)      0.23         0.69        -0.13
lena       0.50       (34.56)      0.37         0.41        -0.04
           0.75       (36.46)      0.23         0.59        -0.07
           1.00       (37.77)      0.18         0.71         0.09
           1.25       (38.89)      0.12         0.79         0.31
           1.50       (39.76)      0.16         1.02         0.15
           1.75       (40.63)      0.16         1.14        -0.12
peppers    0.50       (33.78)      0.12         0.43         0.18
           0.75       (35.27)      0.04         0.49         0.25
           1.00       (36.28)      0.01         0.52         0.08
           1.25       (36.05)      0.03         0.72         0.42
           1.50       (37.76)      0.02         0.71         0.68
           1.75       (38.39)      0.07         1.11         0.64
are not enough bits left to send more image information. The efficiency of
the adaptive Huffman coder in the EZW system was tested by comparing the
theoretical bit rate needed to encode a file with the actual bit rate. The
theoretical bit rate is the minimum bit rate needed to encode a particular
EZW file. The EZW encoded file contains two sections of information: the
header section and the data section. The header section includes the image
dimensions, the number of DWT levels, and an initial threshold. The data
section contains the coefficient information, stored using the following six
symbols: {P, N, T, Z, 0, 1}. Definitions of all symbols can be found in
Chapter 2. The first four symbols are transmitted during the dominant pass,
whereas the binary digits, 0 and 1, are sent during the subordinate pass.
Most of the information is in the data section. Therefore, in this
calculation, the header section is ignored. Assuming the symbols to be
independent, the average entropy is the weighted sum of the entropy of the
dominant pass and the entropy of the subordinate pass. For the dominant pass,
the entropy, I(x), is given by
I(x) = -\sum_{i=1}^{N} p_i(x) \log_2 p_i(x),    (4.1)
where
p_i(x) = \frac{N_{x_i}}{N_D},
N_{x_i} is the number of occurrences of symbol x_i in the file, where x_i can
be one of the six symbols above, N is the number of distinct symbols in the
pass, and N_D is the total number of symbols in the dominant pass. The
logarithm is taken to base 2 so that I(x) is measured in bits per symbol. The
entropy of the subordinate pass can be calculated in the same way. The
average number of bits needed to represent each symbol in the dominant pass
and in the subordinate pass was calculated. The theoretical total number of
bits needed to encode the EZW file is therefore given by
Total bits = I_D(x) N_D + I_S(x) N_S,    (4.2)
where I_D(x) and I_S(x) represent the entropy of the dominant and the
subordinate pass, respectively, and N_S represents the number of symbols in
the subordinate pass. The theoretical bit rate can be calculated by dividing
the theoretical total bits by the number of pixels in the image. The graphs
of the actual bit rate against the theoretical bit rate of the 4-level DWT
with the EZW system are shown in Figure 4-2.
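The calculation of Equations 4.1 and 4.2 can be illustrated with a minimal Python sketch. It is not the code used in this thesis: the function names, and the representation of each pass as a plain sequence of symbols, are assumptions made for illustration. The entropy is computed in bits per symbol (base-2 logarithm, with the leading minus sign) so that the total is a bit count.

```python
import math
from collections import Counter


def entropy_bits(counts):
    """First-order entropy in bits/symbol from a mapping symbol -> count."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


def theoretical_bit_rate(dominant_symbols, subordinate_symbols, num_pixels):
    """Theoretical bit rate (bpp) of an EZW data section, header ignored.

    dominant_symbols and subordinate_symbols are hypothetical inputs: the
    symbol streams emitted during the dominant and subordinate passes.
    """
    total_bits = 0.0
    for stream in (dominant_symbols, subordinate_symbols):
        counts = Counter(stream)
        n = sum(counts.values())
        if n > 0:
            # Per-pass contribution: entropy (bits/symbol) times symbol count.
            total_bits += entropy_bits(counts) * n
    return total_bits / num_pixels
```

For example, a dominant pass containing four equally likely symbols costs 2 bits per symbol and a balanced binary subordinate pass costs 1 bit per symbol, so 16 symbols of each over a 64-pixel image give (32 + 16) / 64 = 0.75 bpp.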
[Figure 4-2: six panels (baboon.raw, barbara2.raw, bfrag.raw, goldhill.raw,
lena.raw, peppers.raw), each plotting the actual bit rate (bpp) against the
theoretical bit rate (bpp) over the range 0 to 2 bpp.]
Figure 4-2: Efficiency test for the adaptive Huffman coder using the 4-level
DWT with the EZW system. The solid line represents the 45-degree line. The
second line represents the relationship between the theoretical bit rate and
the actual bit rate.
In Figure 4-2, the 45-degree line indicates the theoretical bit rate that the
coder can achieve. The other line represents the actual bit rate. The room
for improvement of the adaptive Huffman coder can be judged from the gap
between the two lines. The results in Figure 4-2 show small gaps across all
images. Therefore, given the definition of the symbols, the adaptive Huffman
coder encoded the file efficiently. This encoding scheme is only one way to
encode the symbols of the EZW system. Other encoding schemes may provide
better compression than the adaptive Huffman coder used in this thesis.
However, the results show that the EZW system has the potential to achieve at
least the same performance as the current JPEG system.
4.3 Part III: Visual Quality
Visual quality is the most important aspect of image compression. The results
in this section show the reconstructed images at 0.5 bit/pixel for selected
transform coding systems, in an attempt to identify the artifacts of each
system. Two of the seven images, bfrag and lena, were tested and are
presented in this thesis. According to the simulation results, the compressed
image at 1 bit/pixel could not be distinguished from the original image. The
artifacts become dominant as the bit rate decreases. The PSNR values are
shown in Table 4.32. For each image, the original image is listed first and
the DCT JPEG system next, as references. The other systems are listed in
order of increasing PSNR. The images are shown in Figures 4-3 to 4-14.
According to the resulting images, at 0.5 bit/pixel, the reconstructed images
of the DCT JPEG system displayed a strong blocking effect. At the same bit
rate, the LOT JPEG system significantly reduced the image blockiness.
However, the LOT system introduced ringing in areas of sharp edges, although
it was not prominent. The ringing artifact of the LOT system can be compared
with the blocking artifact of the DCT system to see which artifact is more
objectionable to viewers.
On the other hand, the DWT systems suffered from blurriness for both the
visual threshold system and the EZW system. The EZW system lost more detail
than the visual threshold system. As the number of DWT levels increased from
three to four, more detail was recovered, but the image was still not as
sharp as those of the baseline DCT JPEG and the baseline LOT JPEG systems.
The loss of detail in the EZW system was caused by exhaustion of the bit
budget at the encoder stage.
Table 4.32: Result images and the corresponding PSNRs
Name     System                                   PSNR (dB)   ΔPSNR (dB)
bfrag1   Original                                 N/A         N/A
bfrag2   DCT + JPEG + Run-Length Huffman          28.19        0.00
bfrag3   3-Level DWT + EZW + Adaptive Huffman     27.27       -0.92
bfrag4   4-Level DWT + EZW + Adaptive Huffman     28.50        0.31
bfrag5   3-Level DWT + VT + Run-Length Huffman    29.43        1.24
bfrag6   LOT + JPEG + Run-Length Huffman          29.67        1.48
lena1    Original                                 N/A         N/A
lena2    DCT + JPEG + Run-Length Huffman          34.58        0.00
lena3    3-Level DWT + EZW + Adaptive Huffman     33.19       -1.73
lena4    4-Level DWT + EZW + Adaptive Huffman     34.51       -0.07
lena5    LOT + JPEG + Run-Length Huffman          34.92        0.34
lena6    3-Level DWT + VT + Run-Length Huffman    34.98        0.60
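The PSNR and ΔPSNR columns of Table 4.32 can be related by a minimal Python sketch. It assumes the usual PSNR definition with a peak value of 255, which matches the 8 bpp original images, and it treats an image as a flat list of pixel values. The function names are hypothetical and this is not the code used in this thesis.

```python
import math


def psnr_db(original, reconstructed, peak=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel values.

    Assumes PSNR = 10 * log10(peak^2 / MSE), with peak = 255 for 8 bpp data.
    """
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(peak * peak / mse)


def delta_psnr(system_psnr, reference_psnr):
    """PSNR of a system relative to the reference system, in dB."""
    return system_psnr - reference_psnr
```

With the DCT JPEG system as the reference, the bfrag6 entry follows as delta_psnr(29.67, 28.19), which is 1.48 dB after rounding to two decimals.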
The system comparisons in this chapter showed that the relationship between
PSNR and bit rate was mainly influenced by the combination of quantizer and
entropy coder. The transform did not have any impact on the PSNR performance.
In the comparison between the DCT JPEG system, the LOT JPEG system, the
visual threshold wavelet system with the run-length Huffman coder, and the
EZW system with the adaptive Huffman coder, the LOT JPEG system gave the best
performance in terms of both PSNR and visual quality. The LOT JPEG system
provided about 0.5 dB better PSNR than the DCT JPEG system. In addition, it
reduced blocking artifacts. However, it introduced slight ringing in areas of
sharp edges. Furthermore, the cost of changing from the DCT system to the LOT
system was negligible because only the transform block was affected. A few
adjustments are needed to update the quantization levels and code tables. The
computational cost of the LOT system was only 75-85% higher than that of the
DCT system. Therefore, the LOT system is an economical choice to improve the
existing DCT system.
On the other hand, the DWT system introduced new ways to compress the image.
The PSNR performance of the DWT systems was comparable to that of the DCT
JPEG system. The visual threshold quantizer worked well together with the
optimal run-length Huffman coder. The reconstructed image suffered from
blurriness due to severe quantization of the high-frequency bands. The
3-level EZW system with the adaptive Huffman coder yielded the worst PSNR
performance of all four systems. Nevertheless, it had the potential to
improve by using a different coder, as mentioned earlier. This system
exhibited no blockiness but, like the visual threshold system, suffered from
blurriness in the reconstructed image. In the case of the EZW system, the
blurriness occurred because the bit budget ran out. The cost of the wavelet
system was mainly the computational cost of the wavelet coefficients. The
algorithm used in this thesis was circular convolution, which was not an
optimized algorithm. Other algorithms, such as the lifting algorithm, may
reduce the computational cost. The quantization and coding cost is not the
main problem with the visual threshold system. However, in the case of the
EZW system, the zerotree searching algorithm increases the overhead of the
system. A newer version of the zerotree algorithm, such as SPIHT, may work
better. In addition, the EZW system in this thesis used the adaptive Huffman
coder to translate one symbol at a time. This encoding scheme may not be
optimal. According to the statistics, there were many zerotree root symbols
in the encoded file. A run-length coder can take
advantage of this situation.
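How a run-length coder could exploit long runs of zerotree root symbols can be sketched in Python as follows. This is only an illustration, not a coder evaluated in this thesis. The single-letter symbol stream, with "T" standing for the zerotree root, is an assumed representation, and the (symbol, run length) pairs would still need an entropy coder such as a Huffman coder.

```python
from itertools import groupby


def run_length_encode(symbols):
    """Collapse consecutive repeats into (symbol, run_length) pairs."""
    return [(sym, sum(1 for _ in group)) for sym, group in groupby(symbols)]


def run_length_decode(pairs):
    """Expand (symbol, run_length) pairs back into the symbol string."""
    return "".join(sym * count for sym, count in pairs)


# Hypothetical dominant-pass stream dominated by zerotree roots ("T").
stream = "PTTTTNTTTZ"
pairs = run_length_encode(stream)
# pairs is [("P", 1), ("T", 4), ("N", 1), ("T", 3), ("Z", 1)]
```

Ten symbols collapse into five pairs in this example. The benefit grows with the length of the runs, which is why a stream with many consecutive zerotree roots is a natural candidate for this kind of coder.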
Figure 4-3: bfrag1: original image at 8 bpp
Figure 4-4: bfrag2: DCT + JPEG + Run-Length Huffman at 0.50 bpp
Figure 4-5: bfrag3: 3-level DWT + EZW + Adaptive Huffman at 0.50 bpp
Figure 4-6: bfrag4: 4-level DWT + EZW + Adaptive Huffman at 0.50 bpp
Figure 4-7: bfrag5: 3-level DWT + Visual Threshold + Run-Length Huffman at
0.50 bpp. The visual threshold quantizer used the output resolution of 32
pixels/degree.
Figure 4-8: bfrag6: LOT + JPEG + Run-Length Huffman at 0.50 bpp
Figure 4-9: lena1: original image at 8 bpp
Figure 4-10: lena2: DCT + JPEG + Run-Length Huffman at 0.50 bpp
Figure 4-11: lena3: 3-level DWT + EZW + Adaptive Huffman at 0.50 bpp
Figure 4-12: lena4: 4-level DWT + EZW + Adaptive Huffman at 0.50 bpp
Figure 4-13: lena5: LOT + JPEG + Run-Length Huffman at 0.50 bpp
Figure 4-14: lena6: 3-level DWT + Visual Threshold + Run-Length Huffman at
0.50 bpp. The visual threshold quantizer used the output resolution of 32
pixels/degree.
Chapter 5
Summary
Ten variants of three transform coding systems were compared in terms of their
system complexity and peak signal-to-noise ratios (PSNR). Seven black & white
images were used, including Lena and Barbara. The transforms tested included
the DCT, the LOT, and the DWT. The quantizers tested included the optimal
uniform quantizer, the JPEG uniform quantizer, the visual threshold
quantizer, and the EZW quantizer. The entropy coders tested included the
Huffman coder, the adaptive Huffman coder, and the run-length Huffman coder.
It was found that the relationship between PSNR and bit rate was mainly
influenced by the choice of quantizer and entropy coder. In addition, the EZW
system can take advantage of the number of DWT levels to provide better
compression. In terms of visual quality at the same bit rate, the LOT JPEG
system ranked first, the DCT JPEG system and the visual threshold wavelet
system tied for second, and the EZW system ranked last. However, the EZW
system may be more desirable when an exact bit rate is required.
This simulation investigated only a few combinations of transform coding
systems. There are other kinds of quantizers and entropy coders left to be
compared, such as the arithmetic coder. In the case of the LOT system, the
quantizer and the entropy coder used in this thesis are specifically designed
for the DCT JPEG system. Therefore, it may be possible to improve the LOT
system by designing the quantizer and the encoder specifically for the LOT
coefficients.
Compared to the DCT system, the wavelet system is relatively new in this
field. Up to now, there is no standardized wavelet system for image
compression. Many studies have been conducted on wavelet compression, but
further studies should be done to improve both the computational cost and the
PSNR performance of the DWT system. One possible study is to design the
optimal, universal run-length Huffman coder for the visual threshold system.
In the case of the EZW system, different encoding schemes, such as the
run-length coder or the arithmetic coder, may be used to improve the PSNR of
the system because there are many zerotree root symbols in the encoded file.
Page 107 is missing from the Archives copy.
This is the most complete version available.
Bibliography
[1] Vasudev Bhaskaran and Konstantinos Konstantinides. Image and Video
Compression Standards: Algorithms and Architectures. Kluwer Academic
Publishers, Boston, 1995.
[2] C.S. Burrus, Ramesh A. Gopinath, and Haitao Guo. Introduction to Wavelets
and Wavelet Transforms: A Primer. Prentice Hall, Upper Saddle River, NJ,
1998.
[3] David Donoho et al. Wavelab 802 for MATLAB5.x. [Online Document], October
1999. Available HTTP: http://www-stat.stanford.edu/~wavelab/.
[4] Jae S. Lim. Two-Dimensional Signal and Image Processing. Prentice Hall,
Upper Saddle River, NJ, 1990.
[5] Henrique S. Malvar. Signal Processing with Lapped Transforms. Artech House,
Boston, MA, 1992.
[6] Henrique S. Malvar. Lapped transform algorithms. [Online Document], October
2000. Available HTTP: http://www.research.microsoft.com/~malvar/.
[7] Henrique S. Malvar and David H. Staelin. The LOT: Transform coding without
blocking effects. IEEE Transactions on Acoustics, Speech and Signal
Processing, 37(4):553-559, April 1989.
[8] Alan V. Oppenheim, Ronald W. Schafer, and John R. Buck. Discrete-Time
Signal Processing. Prentice Hall, Upper Saddle River, NJ, 2 edition, 1999.
[9] William B. Pennebaker and Joan L. Mitchell. JPEG: Still Image Data
Compression Standard. Van Nostrand Reinhold, New York, 1993.
[10] A. Said and W.A. Pearlman. A new, fast, and efficient image codec based on
set partitioning in hierarchical trees. IEEE Transactions on Circuits and
Systems for Video Technology, 6(3):243-250, June 1996.
[11] J.M. Shapiro. Embedded image coding using zerotrees of wavelet
coefficients. IEEE Transactions on Signal Processing, 41(12):3445-3462,
December 1993.
[12] Gilbert Strang. Wavelets from filter banks. In Gordon Erlebacher,
M. Yousuff Hussaini, and Leland M. Jameson, editors, Wavelets: Theory and
Applications. Oxford University Press, New York, 1996.
[13] Gilbert Strang and Truong Nguyen. Wavelets and Filter Banks.
Wellesley-Cambridge Press, Wellesley, MA, 1996.
[14] T.D. Tran and T.Q. Nguyen. A progressive transmission image coder using
linear phase uniform filter banks as block transforms. IEEE Transactions on
Image Processing, 8(11):1493-1507, November 1999.
[15] J.D. Villasenor, B. Belzer, and J. Liao. Wavelet filter evaluation for
image compression. IEEE Transactions on Image Processing, 4(8):1053-1060,
August 1995.
[16] Andrew B. Watson, Gloria Y. Yang, Joshua A. Solomon, and John D.
Villasenor. Visual thresholds for wavelet quantization error. In Bernice E.
Rogowitz and Jan P. Allebach, editors, Proceedings of SPIE: Human Vision and
Electronic Imaging, volume 2657, pages 382-392, April 1996.