Case Study on Dithered Images Using Lossless Compression

Basar Koc #1, Ziya Arnavut *2, Huseyin Kocak #
1 Department of Computer Science, University of Miami, Miami, FL 33146 USA (Email: b.koc@umiami.edu)
2 Department of Computer Science, SUNY Fredonia, Fredonia, NY 14063, USA
Abstract— In order to display high-bit-resolution images on low-bit-resolution displays, bit resolution needs to be reduced. This problem is vital especially for low-cost or small (mobile) devices. To untangle the bit-reduction problem, special color quantization algorithms, called dithering, are employed on high-bit-resolution images. The dithering process helps to provide a solution to the problem, but it does not help much in terms of storage and transmission of images. To reduce storage and to lower data transmission, numerous special compression techniques have been proposed over the last several decades. While well-known compression algorithms, such as gzip, help lower image file sizes, they are not usually adequate. To improve the compression gain, special compression techniques that take into account the structure of image data must be developed. We propose to implement the pseudo-distance technique (PDT) for dithered images, which has been reported in the literature earlier [6]. The compression results of this approach will be compared with the results of the graphics interchange format (GIF) and portable network graphics (PNG) formats.

Keywords— Lossless image compression, lossless compression techniques, pseudo-distance metric, GIF, PNG

I. INTRODUCTION

An image is essentially a 2-D signal processed by the human visual system. The signals representing images are usually in analog form. However, for processing, storage, and transmission by computer applications, they are converted from analog to digital form. A digital image is basically a 2-dimensional array of pixels. In a raw state, images can occupy a large amount of memory, both in RAM and in storage. Image compression aims to reduce the irrelevance and redundancy of the image data in order to store or transmit the data in an efficient form. Compression is achieved by the removal of one or more of three basic data redundancies: (1) coding redundancy, which is present when less than optimal (i.e., the smallest-length) code words are used; (2) interpixel redundancy, which results from correlations between the pixels of an image; and (3) psychovisual redundancy, which is due to data that is ignored by the human visual system (i.e., visually nonessential information). Most existing image coding algorithms are based on the correlation between adjacent pixels, and therefore the compression ratio is not high.

Image compression may be lossy or lossless. Lossless compression is preferred for archival purposes and often for medical imaging, technical drawings, clip art, or comics. This is because lossy compression methods, especially when used at low bit rates, introduce compression artifacts. Lossy methods are especially suitable for natural images such as photographs, in applications where a minor (sometimes imperceptible) loss of fidelity is acceptable to achieve a substantial reduction in bit rate. It is possible to compress many types of digital data in a way that reduces the size of the computer file needed to store it, or the bandwidth needed to stream it, with no loss of the full information contained in the original file. The original contains a certain amount of information, and there is a lower limit to the size of a file that can carry all of it. As an intuitive example, most people know that a compressed ZIP file is smaller than the original file, but repeatedly compressing the file will not reduce its size to nothing and will in fact usually increase it. [7]

II. COMPRESSION TECHNIQUES

Compression techniques come in two general flavors: lossless and lossy. As the name states, when lossless data is decompressed, the resulting image is identical to the original. Lossy
A) Lossless Compression: In a lossless compression scheme, the reconstructed image, after compression, is numerically identical to the original image. Lossless compression is used in many applications, such as the ZIP file format and the UNIX tool gzip. It is important whenever the original and the decompressed data must be identical. Some image file formats, like PNG or GIF, use only lossless compression. Most lossless compression programs do two things in sequence: the first step generates a statistical model for the input data, and the second step uses this model to map input data to bit sequences in such a way that "probable" (i.e., frequently encountered) data produces shorter output than "improbable" data.
B) Lossy Compression: In order to achieve higher rates of compression, we give up exact reconstruction and consider lossy compression techniques. We therefore need a way to measure how good a compression technique is, i.e., how close the reconstructed data is to the original data. A distortion measure is a mathematical quantity that specifies how close an approximation is to its original:
 It is difficult to find a measure that corresponds to our perceptual distortion.
 The average pixel difference is given by the mean square error (MSE).
 The size of the error relative to the signal is given by the signal-to-noise ratio (SNR).
 Another common measure is the peak signal-to-noise ratio (PSNR).
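As an illustration (not from the paper), a short NumPy sketch of these three measures for 8-bit images might look as follows; the function names are ours:

import numpy as np

def mse(original, reconstructed):
    """Mean square error: average squared pixel difference."""
    o = original.astype(np.float64)
    r = reconstructed.astype(np.float64)
    return np.mean((o - r) ** 2)

def snr_db(original, reconstructed):
    """Signal-to-noise ratio: signal power over error power, in dB."""
    o = original.astype(np.float64)
    return 10 * np.log10(np.mean(o ** 2) / mse(original, reconstructed))

def psnr_db(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio, with peak = 255 for 8-bit images."""
    return 10 * np.log10(peak ** 2 / mse(original, reconstructed))

# Example: compare an image with a coarsely quantized copy of itself.
img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
quantized = (img // 16) * 16          # crude lossy "compression"
print(mse(img, quantized), snr_db(img, quantized), psnr_db(img, quantized))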
Lossless image representation formats — true color image [3]: This image has a 24-bit depth. We compressed it with TIFF and PNG. The uncompressed image is in BMP form and has a size of 696 KB. This image is very well suited to lossless algorithms because it has many areas of homogeneous color, so both TIFF and PNG perform very well, with PNG outperforming TIFF. We also compressed it with JPEG to see what size a lossy algorithm would produce. The compression ratio for TIFF is around 2:1, for PNG around 2.7:1, and for JPEG we obtained a compression ratio of 16:1.

Format      Size
BMP         696 KB
TIFF-LZW    378 KB
PNG         258 KB
JPEG        43.3 KB
III. DITHERED IMAGES

Dithering distributes quantization errors among neighboring pixels. Three dithering methods are as follows (a sketch of the third is given after the list):
1. Random method
2. Ordered method
3. Error diffusion method
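As a concrete illustration of error diffusion, below is a minimal sketch of the Floyd–Steinberg method [4] for reducing an 8-bit grayscale image to 1 bit; the code and its names are ours, not taken from the paper:

import numpy as np

def floyd_steinberg(gray):
    """Threshold each pixel and diffuse the error to unprocessed neighbours."""
    img = gray.astype(np.float64).copy()
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 255.0 if old >= 128 else 0.0
            out[y, x] = int(new)
            err = old - new
            # Standard Floyd–Steinberg weights: 7/16, 3/16, 5/16, 1/16.
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h and x > 0:
                img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:
                img[y + 1, x] += err * 5 / 16
            if y + 1 < h and x + 1 < w:
                img[y + 1, x + 1] += err * 1 / 16
    return out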
IV. LOSSLESS COMPRESSION TECHNIQUES
There
are
five
compression techniques as follows [13]:
lossless
A) Run-Length Encoding: RLE is a simple form of lossless data compression in which the data is viewed as a series of runs. A run is a sequence in which the same data value occurs in many consecutive data elements; it is stored as a single data value and a count, rather than as the original run. RLE may also refer to an early graphics file format. It does not work well at all on continuous-tone images such as photographs, although JPEG uses it quite effectively on the coefficients that remain after transforming and quantizing image blocks.
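A minimal illustrative RLE sketch in Python (our own; the function names are hypothetical) might look like this:

def rle_encode(data):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(v, c) for v, c in runs]

def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

print(rle_encode("AAAABBBCCD"))  # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]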
B) Entropy Encoding: An entropy encoding is a coding scheme that assigns codes to symbols so as to match code lengths to the probabilities of the symbols. Typically, entropy encoders compress data by replacing symbols represented by equal-length codes with codes whose lengths are proportional to the negative logarithm of the probability; therefore, the most common symbols use the shortest codes.
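This principle can be illustrated with a few lines of Python (ours, not the paper's): the ideal code length of a symbol with probability p is -log2(p) bits, so frequent symbols get short codes.

import math
from collections import Counter

message = "abracadabra"
counts = Counter(message)
total = len(message)
for symbol, count in counts.most_common():
    p = count / total
    # Ideal code length in bits for this symbol.
    print(symbol, f"p={p:.3f}", f"ideal bits={-math.log2(p):.2f}")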
C) Huffman Coding: Huffman's algorithm generates minimum-redundancy codes compared to other algorithms. Huffman coding has been used effectively in text, image, and video compression, and in conferencing systems such as JPEG, MPEG-2, MPEG-4, and H.263. The Huffman coding technique collects the unique symbols from the source image, calculates the probability value of each symbol, and sorts the symbols by probability. Then, from the lowest-probability symbol to the highest, two symbols are combined at a time to form a binary tree. Starting from the root of the tree, zero is allocated to the left node and one to the right node. To obtain the Huffman code for a particular symbol, all the zeros and ones are collected from the root to that particular node, in that order.
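A minimal Python sketch of this procedure (our own illustration; a production coder would also serialize the tree) is:

import heapq
from collections import Counter

def huffman_codes(message):
    """Merge the two lowest-count nodes repeatedly, then read codes
    off root-to-leaf paths (0 = left, 1 = right), as described above."""
    heap = [[count, i, symbol]
            for i, (symbol, count) in enumerate(Counter(message).items())]
    heapq.heapify(heap)
    tick = len(heap)                      # unique tie-breaker for the heap
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        heapq.heappush(heap, [lo[0] + hi[0], tick, (lo[2], hi[2])])
        tick += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):       # internal node: recurse both ways
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                             # leaf: assign the accumulated code
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

print(huffman_codes("abracadabra"))  # frequent symbols get shorter codes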
D) Arithmetic Encoding: Arithmetic coding (AC) is the most powerful technique for statistical lossless encoding and has attracted much attention in recent years. It provides more flexibility and better efficiency than the celebrated Huffman coding. The aim of AC is to define a method that provides code words of ideal length. As with every other entropy coder, the probability of the appearance of the individual symbols must be known. AC is the most efficient method to code symbols according to the probability of their occurrence; the average code length comes very close to the minimum possible given by information theory. AC assigns an interval to each symbol, whose size reflects the probability of the appearance of that symbol. The code word of a symbol is an arbitrary rational number belonging to the corresponding interval.
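The interval-narrowing idea can be sketched as follows; this floating-point toy (ours) ignores the integer renormalization a practical arithmetic coder needs:

def narrow_interval(message, probs):
    """Each symbol narrows [low, high) in proportion to its probability;
    any number in the final interval identifies the whole message."""
    ranges, start = {}, 0.0
    for symbol, p in probs.items():
        ranges[symbol] = (start, start + p)   # cumulative ranges
        start += p
    low, high = 0.0, 1.0
    for symbol in message:
        s_low, s_high = ranges[symbol]
        width = high - low
        low, high = low + width * s_low, low + width * s_high
    return low, high

lo, hi = narrow_interval("aab", {"a": 0.6, "b": 0.4})
print(lo, hi)   # any value in [lo, hi) encodes "aab"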
E) LZW Coding: The LZW algorithm works on the occurrence multiplicity of character sequences in the string to be encoded. Its principle consists of substituting patterns with an index code by progressively building a dictionary. The dictionary is initialized with the 256 values of the ASCII table. The file to be compressed is split into strings of bytes (thus, for monochrome images coded on 1 bit, this compression is not very effective); each of these strings is compared with the dictionary and is added if not found there. In the encoding process, the algorithm goes over the stream of information, coding it: the current string is extended until it is no longer found in the dictionary, at which point the index of the longest matching string is transmitted and the extended string is added to the dictionary. In the decoding process, the algorithm rebuilds the dictionary in the opposite direction; the dictionary thus does not need to be stored.
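A minimal sketch of the LZW encoder described above (our illustration; the variable-width packing of output codes is omitted):

def lzw_encode(data: bytes):
    """Emit the index of the longest dictionary match, then add that
    match plus the next byte as a new dictionary entry."""
    dictionary = {bytes([i]): i for i in range(256)}  # single-byte strings
    current = b""
    output = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate                        # keep extending
        else:
            output.append(dictionary[current])         # emit longest match
            dictionary[candidate] = len(dictionary)    # new entry
            current = bytes([byte])
    if current:
        output.append(dictionary[current])
    return output

print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))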
V. ALGORITHMS

A. Encoding Algorithm

The encoder writes the index table (palette) of the original image, emits the first index value raw, and then, for each subsequent pixel with index x, emits the error signal e = P(a, x), where a is the index of the previously coded pixel and P is the pseudo-distance matrix described in Section VI.

B. Decoding Algorithm
The decoder uses a procedure similar to that used by the encoding algorithm. The encoded image file consists of the index table of the original image and the error signals. Hence, the decoder can easily construct the Euclidean-distance and pseudo-distance matrices. The original image is reconstructed as follows. First, the pseudo-distance matrix is reconstructed from the index table. The index number of the first pixel at the top-left corner, call it a, is encoded with its original index value. The process then proceeds as follows: (1) receive the error signal e of the next pixel of the encoded image E; (2) in row a of the pseudo-distance matrix, search for the error signal value e; (3) emit the corresponding column value x as the original index value of the reconstructed image, as shown in Figure 6; (4) let a be x, and repeat from step 1 until the end of the file is reached. Clearly, in the decoding process we obtain the original index values without any loss.
As an example, the top-left corner index value (3) is first copied to the top-left corner of the error matrix E. Then this index value, together with its right-neighboring index value (3), is used to determine the error value 0 from the pseudo-distance matrix (P(3, 3) = 0). The error value (0) is then written next to the top-left corner element (3) of the matrix E. This process is repeated along each row, except that the error value of the leftmost element of each row in E is predicted using the pixel immediately above that element in I.
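Reading the description above, the encode/decode loop can be sketched as follows. This is our illustration, not the authors' code: it assumes a static pseudo-distance matrix P (Section VI) given as a list of rows, treats the image as a flat list of palette indices with a single previous-pixel predictor, and omits the adaptive matrix updates discussed in the conclusions:

def pdt_encode(indices, P):
    """Transmit the first index raw, then e = P[a][x] for each pixel."""
    errors = [indices[0]]                 # top-left index stored raw
    for a, x in zip(indices, indices[1:]):
        errors.append(P[a][x])            # error signal for pixel x
    return errors

def pdt_decode(errors, P):
    """Invert the encoder: in row a of P, find the column holding e."""
    indices = [errors[0]]
    for e in errors[1:]:
        a = indices[-1]
        x = P[a].index(e)   # each row of P is a permutation, so x is unique
        indices.append(x)
    return indices

The round trip is exact, which is why the decoding process recovers the original index values without any loss.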
In "Lossless compression of images" [1], the compression ratio is defined as the size of the original image to the size of the compressed image, and the bit rate as the number of bits per pixel required by the compressed image. For example, a 256 x 256, 8-bits-per-pixel image requires 256 x 256 x 8 bits = 65,536 x 8 bits = 65,536 bytes. If the compressed image requires 32,768 bytes, then the compression ratio is 2.0. Since the image has 256 x 256 = 65,536 pixels, the compressed file needs 32,768 x 8 / 65,536 = 4 bits per pixel on average, so the bit rate is 4. Compression ratios are reported for a number of images using compression algorithms such as gzip, and it has been reported that a further improvement was obtained by using the Huffman coding method. The compression ratios of the images are given in the following table:

Image     Compression Ratio
Lenna     1.09
Man       1.09
Chall     1.17
Coral     1.16
Shuttle   1.18
Sphere    1.25
Brain     1.60

In "Color quantization of images" [2], the reported PSNR and quantization error are as follows: for the Girl image, the PSNR is 30.6 dB and the quantization error is 15.13 using the Linde–Buzo–Gray (LBG) algorithm; for the Lena image, the PSNR is 30.15 dB and the quantization error is 5.37 using the LBG algorithm. The following table gives the PSNR and quantization error obtained with the LBG algorithm:

Image     PSNR (dB)   Quantization Error
Girl      30.6        15.13
Lena      30.15       5.37
Pepper    28.92       8.72

In "An efficient re-indexing algorithm for color-mapped images" [8], the entropy value for 2-to-32-color palettes is reported as 2.1, and for 33-to-256-color palettes as 3.3. It has been reported that the entropy was decreased further: entropy values of 2.0 for 2-to-32-color palettes and 2.5 for 33-to-256-color palettes were obtained by using Zeng's method.

In "A survey on palette reordering methods for improving the compression of color-indexed images" [12], the entropy value for the original image is reported as 7.62. Entropy values of 7.41 for the original image and 5.21 for the reordered image were obtained by using Zeng's method. With Zeng's method, the compression was 4%–5% worse for JPEG-LS and 3%–5% worse for lossless JPEG2000.

In "Lossless compression of dithered images with the pseudo-distance technique" [11], the pseudo-distance technique (PDT) is used to achieve better compression gain than well-known image compression schemes (GIF, PNG, and JPEG2000). The reported compression gain is 46.8% over GIF and 14.2% over PNG, and a further improvement of 3.9% was obtained by using the PDT. This indicates that the PDT is promising for image compression.

VI. STUDY ON PSEUDO IMAGE MATRIX, EUCLIDEAN DISTANCE AND PSEUDO-DISTANCE MATRIX

1. Pseudo Image Matrix: A pseudo-color (color-mapped) image is a matrix of indices, where each index number i corresponds to a triplet (Ri, Gi, Bi) in the color-mapped table of the image. In the case of a bit-mapped image, its color-mapped table is found after the image header and before the image matrix.
Since an index can be represented with a byte, color-mapped image data can be stored more compactly than the corresponding unprocessed RGB image. For instance, an image of size 512 x 512 pixels can be stored with 54 bytes of header information, 1024 bytes for the color-mapped table, and 512 x 512 bytes of index values, instead of 512 x 512 x 3 bytes. The size of a pseudo-color-mapped image is thus about one third the size of the original image.
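As a quick check of these figures (our arithmetic, using the byte counts quoted above):

header, table = 54, 1024                    # bytes, as stated in the text
indexed = header + table + 512 * 512        # one index byte per pixel
raw_rgb = 512 * 512 * 3                     # three bytes per pixel
print(indexed, raw_rgb, indexed / raw_rgb)  # ratio ~0.34: about one third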
2. Euclidean Distance: We define the Euclidean distance between two color indices a and b (0 ≤ a, b ≤ 255) from the color-mapped table of an image as follows:

E[a, b] = sqrt(eR^2 + eG^2 + eB^2)

where
• E[a, b]: distance between index a and index b,
• eR: difference between the R values of a and b,
• eG: difference between the G values of a and b,
• eB: difference between the B values of a and b.

3. Pseudo-Distance Matrix: A new pseudo-distance matrix P is formed, where each row element P(a, b) gets a unique value c (0 ≤ c ≤ 255) based on the rank of D(a, b) in the sorted row a of the Euclidean-distance matrix D.
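As an illustration of the two definitions above, the following sketch (ours; tie-breaking by index is one arbitrary choice, since the paper does not specify it) builds D and P from a small palette:

import math

def build_matrices(palette):
    """D[a][b] is the Euclidean RGB distance; P[a][b] is the rank of
    D[a][b] within the sorted row a, so each row of P is a permutation."""
    n = len(palette)
    D = [[math.dist(palette[a], palette[b]) for b in range(n)]
         for a in range(n)]
    P = [[0] * n for _ in range(n)]
    for a in range(n):
        # Rank the columns of row a by distance, smallest first.
        order = sorted(range(n), key=lambda b: (D[a][b], b))
        for rank, b in enumerate(order):
            P[a][b] = rank
    return D, P

palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (255, 255, 255)]
D, P = build_matrices(palette)
print(P[0])   # pseudo-distances from palette entry 0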
VII. STUDY ON GIF AND PNG
1. Graphics Interchange Format (GIF): The GIF format was developed in 1987 by CompuServe Incorporated, primarily for use on the Internet. The current version, GIF89a, was released in 1990. It supports colour depths from 1-bit (monochrome) to 8-bit (256 colours) and always stores images in compressed form, using lossless LZW compression. Other supported features include interlacing and transparency. GIF is a proprietary format. In addition, the patent for the LZW algorithm is owned by Unisys Corporation, which licensed its use by CompuServe. Despite the controversy that surrounded the application of license fees to GIF readers and writers, and the development of alternative open formats such as PNG, the GIF format remains one of the most widespread in use, particularly on the Internet. The US patent expired in June 2003 and the UK patent expired in June 2004.
2. Portable Network Graphics (PNG): The Portable Network Graphics (PNG) format was developed by the PNG Development Group in 1996 to provide an open alternative to GIF and the associated licensing issues with LZW compression (see VII.1). It supports colour depths from 1-bit to 48-bit. Image data is always stored compressed, using a variation of the lossless LZ77-based deflate compression algorithm, which is not patented and is therefore free to use. Support for interlacing and transparency is also provided, and the format incorporates a number of error-detection mechanisms. The PNG specification is entirely public domain and free to use. It also has a very rich feature set. However, it has not yet achieved very wide-scale adoption.
VIII. CONCLUSIONS
The PDT is an efficient lossless image compression method for color-palette images, since the PDT runs in linear time and requires only one pass [11]. In our previous work [11], we showed that, when the PDT is used in conjunction with a context-model BAC, we obtain better compression results than well-known image compressors such as GIF, PNG, JPEG-LS, and JPEG2000 on dithered images. In this paper, we have shown that further compression gains are possible when we update one more neighbor of the predicted pixel in the PDT matrix. Indeed, with this modification, on three sets of dithered images we have achieved on average a 3.9% improvement over our previous work, resulting in total gains of 46.8% over GIF and 14.2% over PNG.
REFERENCES

[1] Lossless compression of dithered images, June 2013.
[2] M. T. Orchard and C. A. Bouman, "Color quantization of images," IEEE Trans. Signal Process., vol. 39, no. 12, pp. 2677–2690, Dec. 1991.
[3] [Online]. Available: http://www.visgraf.impa.br/Courses/ip00/proj/Dithering1/
[4] R. W. Floyd and L. Steinberg, "An adaptive algorithm for spatial grey scale," in Proc. Soc. Inf. Display, 1976, vol. 17, pp. 75–77.
[5] L. Akarun, Y. Yardunci, and A. E. Cetin, "Adaptive methods for dithering color images," IEEE Trans. Image Process., vol. 6, no. 7, pp. 950–955, Jul. 1997.
[6] C. Alasseur, A. G. Constantinides, and L. Husson, "Colour quantisation through dithering techniques," in Proc. ICIP, Sep. 14–17, 2003, vol. 1, pp. I-469–I-472.
[7] [Online]. Available: http://www.google.com/Lossless compression/Wikipedia.
[8] S. Battiato, G. Gallo, G. Impoco, and F. Stanco, "An efficient re-indexing algorithm for color-mapped images," IEEE Trans. Image Process., vol. 13, no. 11, pp. 1419–1423, Nov. 2004.
[9] B. Koc and Z. Arnavut, "Gradient adjusted predictor with pseudo-distance technique for lossless compression of color-mapped images," in Proc. FIT, Dec. 19–21, 2011, pp. 275–280.
[10] B. Koc and Z. Arnavut, "A new context-model for the pseudo-distance technique in lossless compression of color-mapped images," in Proc. SPIE Opt. Eng. + Appl., Appl. Digit. Image Process. XXXV, Aug. 12–16, 2012, p. 84991O.
[11] B. Koc, Z. Arnavut, and H. Kocak, "Lossless compression of dithered images with the pseudo-distance technique," in Proc. 9th Int. Conf. HONET, Dec. 12–14, 2012, pp. 147–151.
[12] A. J. Pinho and A. J. R. Neves, "A survey on palette reordering methods for improving the compression of color-indexed images," IEEE Trans. Image Process., vol. 13, no. 11, pp. 1411–1418, Nov. 2004.
[13] M. Kaur and G. Kaur, "A survey of lossless and lossy image compression techniques," IEEE Trans., vol. 3, issue 2, Feb. 2013.