Case Study on Dithered Images Using Lossless Compression

Basar Koc #1, Ziya Arnavut *2, Huseyin Kocak #1
1 Department of Computer Science, University of Miami, Miami, FL 33146, USA (Email: b.koc@umiami.edu)
2 Department of Computer Science, SUNY Fredonia, Fredonia, NY 14063, USA

Abstract- In order to display high-bit-resolution images on low-bit-resolution displays, the bit resolution needs to be reduced. This problem is vital especially for low-cost or small (mobile) devices. To untangle the bit-reduction problem, special color quantization algorithms, called dithering, are employed on high-bit-resolution images. The dithering process helps to solve the display problem, but it does not help much in terms of storage and transmission of the images. To reduce storage and to lower data transmission costs, numerous special compression techniques have been proposed over the last several decades. While well-known compression algorithms, such as gzip, help lower image file sizes, they are usually not adequate. To improve the compression gain, special compression techniques that take the structure of the image data into account must be developed. We propose to apply the pseudo-distance technique (PDT), which has been reported in the literature earlier [6], to dithered images. The compression results of this approach are compared with those of the Graphics Interchange Format (GIF) and the Portable Network Graphics (PNG) format.

Keywords—Lossless image compression, lossless compression techniques, pseudo-distance metric, GIF, PNG

I. INTRODUCTION

An image is essentially a 2-D signal processed by the human visual system. The signals representing images are usually in analog form; however, for processing, storage, and transmission by computer applications, they are converted from analog to digital form. A digital image is basically a 2-dimensional array of pixels. In a raw state, images can occupy a large amount of memory, both in RAM and in storage. The aim of image compression is to reduce the irrelevance and redundancy of the image data in order to store or transmit the data in an efficient form. Compression is achieved by the removal of one or more of three basic data redundancies: (1) coding redundancy, which is present when less-than-optimal (i.e., longer than the smallest-length) code words are used; (2) interpixel redundancy, which results from correlations between the pixels of an image; and (3) psychovisual redundancy, which is due to data that is ignored by the human visual system (i.e., visually nonessential information). Most existing image coding algorithms are based on the correlation between adjacent pixels alone, and therefore their compression ratios are not high.

Image compression may be lossy or lossless. Lossless compression is preferred for archival purposes, and often for medical imaging, technical drawings, clip art, and comics, because lossy compression methods, especially when used at low bit rates, introduce compression artifacts. Lossy methods are especially suitable for natural images, such as photographs, in applications where a minor (sometimes imperceptible) loss of fidelity is acceptable to achieve a substantial reduction in bit rate.

It is possible to compress many types of digital data in a way that reduces the size of the file needed to store it, or the bandwidth needed to stream it, with no loss of the information contained in the original file. The original contains a certain amount of information, and there is a lower limit to the size of a file that can carry all of that information. As an intuitive example, most people know that a compressed ZIP file is smaller than the original file, but repeatedly compressing the file will not reduce its size to nothing and will in fact usually increase it [7].
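This lower limit can be made concrete with zero-order Shannon entropy: a stream whose byte values occur with probabilities p_i cannot, on average, be losslessly coded in fewer than H = -sum_i p_i log2(p_i) bits per byte. The following sketch is our illustration of this bound (the function name is ours, and real compressors also exploit correlations between bytes, which this zero-order estimate ignores):

import math
from collections import Counter

def entropy_bits_per_byte(data):
    # Zero-order Shannon entropy: a lower bound, in bits per byte,
    # for any lossless code that treats the bytes as independent.
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

sample = b"aaaaabbbccd"
h = entropy_bits_per_byte(sample)
print("%.3f bits/byte; at least %.1f bytes needed for %d input bytes"
      % (h, h * len(sample) / 8, len(sample)))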
II. COMPRESSION TECHNIQUES

Compression techniques come in two general flavors: lossless and lossy. As the name states, when losslessly compressed data is decompressed, the resulting image is identical to the original. Lossy compression algorithms result in a loss of data, and the decompressed image is not exactly the same as the original. In this work we use lossless compression [7].

A) Lossless Compression: In a lossless compression scheme, the reconstructed image, after compression, is numerically identical to the original image. Lossless compression is used in many applications, such as the ZIP file format and the UNIX tool gzip, and it is important whenever the original and the decompressed data must be identical. Some image file formats, like PNG and GIF, use only lossless compression. Most lossless compression programs do two things in sequence: the first step generates a statistical model for the input data, and the second step uses this model to map the input data to bit sequences in such a way that "probable" (e.g., frequently encountered) data will produce shorter output than "improbable" data.

B) Lossy Compression: In order to achieve higher rates of compression, we give up perfect reconstruction and consider lossy compression techniques. We then need a way to measure how good a compression technique is, that is, how close the reconstructed data is to the original. A distortion measure is a mathematical quantity that specifies how close an approximation is to its original; it is difficult to find a measure that corresponds well to perceptual distortion. The most common measures are:
• the mean square error (MSE), which gives the average pixel difference;
• the signal-to-noise ratio (SNR), which gives the size of the error relative to the signal;
• the peak signal-to-noise ratio (PSNR).

As an example of lossless image representation formats, consider a true-color image [3] with a bit depth of 24. The uncompressed image, in BMP format, has a size of 696 KB. This image is a very good candidate for lossless compression because it has many areas of homogeneous color, so both TIFF and PNG perform very well, with PNG being more powerful than TIFF. We also compressed the image with JPEG to see what size a lossy algorithm would produce. The compression ratio is around 2:1 for TIFF, around 2.7:1 for PNG, and 16:1 for JPEG:

Format      Size
BMP         696 KB
TIFF-LZW    378 KB
PNG         258 KB
JPEG        43.3 KB

III. DITHERED IMAGES

Dithering distributes quantization errors among neighboring pixels. Three dithering methods are as follows:
1. The random method
2. The ordered method
3. The error-diffusion method (see the sketch below)
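The error-diffusion method can be made concrete with the classic Floyd-Steinberg filter [4], in which each pixel's quantization error is pushed onto its not-yet-visited neighbors. The sketch below is our minimal single-channel illustration (quantizing an 8-bit grayscale image to black and white); it is not the exact procedure used to prepare the test images in this study:

def floyd_steinberg(img, width, height):
    # img: row-major list of 8-bit grayscale values. Returns a dithered
    # copy containing only 0 and 255. Each pixel's quantization error is
    # diffused to the right, lower-left, lower, and lower-right
    # neighbors with the classic 7/16, 3/16, 5/16, 1/16 weights.
    out = [float(v) for v in img]
    for y in range(height):
        for x in range(width):
            i = y * width + x
            old = out[i]
            new = 255.0 if old >= 128 else 0.0
            out[i] = new
            err = old - new
            if x + 1 < width:
                out[i + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    out[i + width - 1] += err * 3 / 16
                out[i + width] += err * 5 / 16
                if x + 1 < width:
                    out[i + width + 1] += err * 1 / 16
    return [int(v) for v in out]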
IV. LOSSLESS COMPRESSION TECHNIQUES

There are five lossless compression techniques, as follows [13]:

A) Run-Length Encoding: RLE is a simple form of lossless data compression in which the data contains runs, that is, sequences in which the same data value occurs in many consecutive data elements; each run is stored as a single data value and a count, rather than as the original run. RLE may also be used to refer to an early graphics file format. It does not work well at all on continuous-tone images such as photographs, although JPEG uses it quite effectively on the coefficients that remain after transforming and quantizing image blocks.

B) Entropy Encoding: An entropy encoding is a coding scheme that assigns codes to symbols so as to match code lengths with the probabilities of the symbols. Typically, entropy encoders compress data by replacing symbols represented by equal-length codes with codes whose lengths are proportional to the negative logarithm of the probability; therefore, the most common symbols use the shortest codes.

C) Huffman Coding: Huffman's algorithm generates minimum-redundancy codes compared to other algorithms. Huffman coding has been used effectively in text, image, and video compression and in conferencing systems such as JPEG, MPEG-2, MPEG-4, and H.263. The Huffman coding technique collects the unique symbols of the source image, calculates a probability value for each symbol, and sorts the symbols by their probability values. Then, from the lowest-probability symbol to the highest, two symbols are combined at a time to form a binary tree. Zero is allocated to each left node and one to each right node, starting from the root of the tree. To obtain the Huffman code for a particular symbol, the zeros and ones are collected along the path from the root to that symbol's node, in order.

D) Arithmetic Encoding: Arithmetic coding (AC) is the most powerful technique for statistical lossless encoding and has attracted much attention in recent years. It provides more flexibility and better efficiency than the celebrated Huffman coding. The aim of AC is to define a method that provides code words of ideal length. As with every other entropy coder, the probability of the appearance of each individual symbol must be known. AC is the most efficient method for coding symbols according to the probability of their occurrence; the average code length comes very close to the minimum possible given by information theory. AC assigns to each symbol an interval whose size reflects the probability of that symbol's appearance, and the code word of a symbol is an arbitrary rational number belonging to the corresponding interval.

E) LZW Coding: The LZW algorithm works on the occurrence multiplicity of character sequences in the string to be encoded. Its principle consists in substituting patterns with an index code while progressively building a dictionary. The dictionary is initialized with the 256 values of the ASCII table. The file to be compressed is split into strings of bytes (thus, for monochrome images coded on 1 bit, this compression is not very effective); each of these strings is compared with the dictionary and added to it if not found there. During encoding, the algorithm goes over the stream of information and codes it: the longest string found in the dictionary is replaced by its index, which is transmitted. During decoding, the algorithm rebuilds the dictionary in the opposite direction, so the dictionary does not need to be stored.
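To make the dictionary mechanism concrete, the following sketch, which is our illustration (production GIF and TIFF coders add variable-width output codes and dictionary resets, omitted here), encodes a byte string using a dictionary initialized with the 256 single-byte values:

def lzw_encode(data):
    # The dictionary starts with the 256 single-byte strings. Each
    # emitted code is the index of the longest dictionary match; then
    # (match + next byte) is learned as a new dictionary entry.
    dictionary = {bytes([i]): i for i in range(256)}
    w = b""
    codes = []
    for byte in data:
        wb = w + bytes([byte])
        if wb in dictionary:
            w = wb                             # keep extending the match
        else:
            codes.append(dictionary[w])        # emit the longest match
            dictionary[wb] = len(dictionary)   # learn the new pattern
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes

print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))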
V. ALGORITHMS

A. Encoding Algorithm

First, the top-left corner index value (3 in the running example) is copied to the top-left corner of the error matrix E. Then this index value, along with its right-neighboring index value (3), is used to determine the error value 0 from the pseudo-distance matrix (P(3, 3) = 0). After that, the error value (0) is written next to the top-left corner element (3) of the matrix E. This process is repeated similarly along each row, except that the error value of the leftmost element of each row of E is predicted using the pixel immediately above that element in the index image I.

B. Decoding Algorithm

The decoder part of the application uses a procedure similar to that used by the encoding algorithm. The encoded image file consists of the index table of the original image and the error signals; hence, the decoder can easily construct the Euclidean-distance and pseudo-distance matrices. The original image is reconstructed as follows. First, the pseudo-distance matrix is reconstructed from the index table; the index number of the first pixel at the top-left corner, call it a, was encoded with its original index value. Then the process is applied as follows: (1) receive the error signal e of the next pixel of the encoded image E; (2) in row a of the pseudo-distance matrix, search for the error signal value e; (3) emit the corresponding column value x as the original index value of the reconstructed image, as shown in Figure 6; (4) let x be a, and repeat from step (1) until the end of the file is reached. Clearly, the decoding process recovers the original index values without any loss.
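Under this description, each row of the pseudo-distance matrix assigns a unique value to every column, so each row is a permutation of 0..255 and can be inverted. The following decoding sketch is our reading of the procedure above (the names are ours, and we assume, as in the encoding example, that the top-left entry of E stores the original index verbatim and that the first pixel of each row is predicted from the pixel above it):

def pdt_decode(E, P):
    # E: error matrix (list of rows) produced by the PDT encoder, with
    # the original top-left index stored verbatim in E[0][0].
    # P: 256x256 pseudo-distance matrix; every row is a permutation of
    # 0..255, so P[a][x] = e can be inverted uniquely.
    inv = [[0] * 256 for _ in range(256)]
    for a in range(256):
        for x in range(256):
            inv[a][P[a][x]] = x   # inv[a][e] is the column x with P[a][x] == e
    height, width = len(E), len(E[0])
    I = [[0] * width for _ in range(height)]
    I[0][0] = E[0][0]             # the first pixel is not predicted
    for y in range(height):
        for x in range(width):
            if y == 0 and x == 0:
                continue
            # Predictor: the left neighbor, or the pixel above for the
            # first pixel of a row.
            a = I[y][x - 1] if x > 0 else I[y - 1][0]
            I[y][x] = inv[a][E[y][x]]   # recover the original index
    return I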
In "A survey on palette reordering methods for improving the compression of color-indexed images" [12], the entropy value for the original image is reported as 7.62. It has been reported that further improvement, with entropy values of 7.41 for the original image and 5.21 for the reordered image, was obtained by using Zeng's method; the compression was 4%-5% worse for JPEG-LS and 3%-5% worse for lossless JPEG2000 when Zeng's method was used.

In "Lossless compression of images" [1], the compression ratio is defined as the ratio of the size of the original image to the size of the compressed image, and the bit rate is the number of bits per pixel required by the compressed image. For example, a 256 x 256 image at 8 bits per pixel requires 256 x 256 x 8 bits = 65,536 bytes. If the compressed image requires 32,768 bytes, then the compression ratio is 2.0; since the image has 256 x 256 = 65,536 pixels, the compressed file needs 32,768 x 8 / 65,536 = 4 bits per pixel, i.e., the average bit rate is 4. Compression ratios are reported there for a set of images using a general-purpose compression algorithm such as gzip, with further improvement obtained by using the Huffman coding method. The reported compression ratios are as follows:

Image     Compression Ratio
Lenna     1.09
Man       1.09
Chall     1.17
Coral     1.16
Shuttle   1.18
Sphere    1.25
Brain     1.60

In "Color quantization of images" [2], the PSNR and quantization error obtained with the Linde-Buzo-Gray (LBG) algorithm are reported: for the Girl image, the PSNR is 30.6 dB and the quantization error is 15.13; for the Lena image, the PSNR is 30.15 dB and the quantization error is 5.37. The following table gives the PSNR and quantization error obtained using LBG:

Image     PSNR (dB)   Quantization Error
Girl      30.6        15.13
Lena      30.15       5.37
Pepper    28.92       8.72

In "An efficient re-indexing algorithm for color-mapped images" [8], the entropy value is reported as 2.1 for 2-to-32-color palettes and 3.3 for 33-to-256-color palettes. It has been reported that the entropy was decreased further, to 2.0 for 2-to-32-color palettes and 2.5 for 33-to-256-color palettes, by using Zeng's method.

In "Lossless compression of dithered images with the pseudo-distance technique" [11], the pseudo-distance technique (PDT) is used to achieve better compression gains than well-known image compression schemes (GIF, PNG, and JPEG2000); the compression gains reported are 46.8% over GIF and 14.2% over PNG. It has been reported that a further improvement of 3.9% was obtained with the PDT. This indicates a promising feature of the PDT for image compression.

VI. STUDY ON THE PSEUDO IMAGE MATRIX, EUCLIDEAN DISTANCE, AND PSEUDO-DISTANCE MATRIX

1. Pseudo Image Matrix: A pseudo-color (color-mapped) image is a matrix of indices, where each index number i corresponds to a triplet (Ri, Gi, Bi) in the color-mapped table of the image. In the case of a bit-mapped image, the color-mapped table is found after the image header and before the image matrix. Since an index can be represented with a byte, color-mapped image data can be stored more compactly than the corresponding unprocessed RGB image. For instance, an image of size 512 x 512 pixels can be stored with 54 bytes of header information, 1024 bytes for the color-mapped table, and 512 x 512 bytes of index values, instead of 512 x 512 x 3 bytes; the size of a pseudo-color-mapped image is thus about one third the size of the original image.

2. Euclidean Distance: We define the Euclidean distance between two color indices a and b (0 ≤ a, b ≤ 255) from the color-mapped table of an image as

E[a, b] = sqrt(eR^2 + eG^2 + eB^2),

where
• E[a, b] is the distance between index a and index b,
• eR is the difference between the R values of a and b,
• eG is the difference between the G values of a and b,
• eB is the difference between the B values of a and b.

3. Pseudo-Distance Matrix: A new pseudo-distance matrix P is formed, where each row element P(a, b) is assigned a unique value c (0 ≤ c ≤ 255) based on the rank of D(a, b) in the sorted row a of the matrix D of Euclidean distances defined above.
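Both constructions can be summarized in a short sketch. This is our illustration: we assume the color-mapped table is given as a list of (R, G, B) triplets, and we break distance ties by column index so that every value in a row of P is unique, a rule the text above does not spell out:

import math

def build_matrices(palette):
    # D[a][b]: Euclidean distance between palette entries a and b.
    # P[a][b]: rank of D[a][b] within sorted row a, i.e. a unique
    # value in 0..n-1 for every column of that row.
    n = len(palette)
    D = [[math.dist(palette[a], palette[b]) for b in range(n)]
         for a in range(n)]
    P = [[0] * n for _ in range(n)]
    for a in range(n):
        order = sorted(range(n), key=lambda b: (D[a][b], b))
        for rank, b in enumerate(order):
            P[a][b] = rank
    return D, P

Encoding a pixel whose predictor is a and whose value is x then emits e = P[a][x]. The intent, as we read it, is that neighboring pixels of a dithered image tend to map to nearby palette colors, so the emitted values are biased toward small numbers that a subsequent entropy coder can exploit.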
VII. STUDY ON GIF AND PNG

1. Graphics Interchange Format (GIF): The GIF format was developed in 1987 by CompuServe Incorporated, primarily for use on the Internet. The current version, GIF89a, was released in 1990. It supports colour depths from 1 bit (monochrome) to 8 bits (256 colours) and always stores images in compressed form, using lossless LZW compression. Other supported features include interlacing and transparency. GIF is a proprietary format; in addition, the patent for the LZW algorithm was owned by Unisys Corporation, which licensed its use by CompuServe. Despite the controversy that surrounded the application of license fees to GIF readers and writers, and the development of alternative open formats such as PNG, the GIF format remains one of the most widespread in use, particularly on the Internet. The US patent expired in June 2003 and the UK patent in June 2004.

2. Portable Network Graphics (PNG): The PNG format was developed by the PNG Development Group in 1996 to provide an open alternative to GIF and the licensing issues associated with LZW compression (see the discussion of GIF above). It supports colour depths from 1 bit to 48 bits. Image data is always stored compressed, using a variation of the lossless LZ77-based deflate compression algorithm, which is not patented and is therefore free to use. Support for interlacing and transparency is also provided, and the format incorporates a number of error-detection mechanisms. The PNG specification is entirely public domain and free to use, and the format has a very rich feature set; however, it has not yet achieved very wide-scale adoption.

VIII. CONCLUSIONS

The PDT is an efficient lossless image compression method for color-palette images, since it runs in linear time and requires only one pass [11]. In our previous work [11], we showed that when the PDT is used in conjunction with a context-model binary arithmetic coder (BAC), we obtain better compression results on dithered images than well-known image compressors such as GIF, PNG, JPEG-LS, and JPEG2000. In this paper, we have shown that further compression gains are possible when one more neighbor of the predicted pixel is updated in the PDT matrix. Indeed, with this modification, on three sets of dithered images we achieved on average a 3.9% improvement over our previous work, resulting in total gains of 46.8% over GIF and 14.2% over PNG.

REFERENCES

[1] Lossless compression of dithered images, June 2013.
[2] M. T. Orchard and C. A. Bouman, "Color quantization of images," IEEE Trans. Signal Process., vol. 39, no. 12, pp. 2677–2690, Dec. 1991.
[3] [Online]. Available: http://www.visgraf.impa.br/Courses/ip00/proj/Dithering1/
[4] R. W. Floyd and L. Steinberg, "An adaptive algorithm for spatial grey scale," in Proc. Soc. Inf. Display, 1976, vol. 17, pp. 75–77.
[5] L. Akarun, Y. Yardunci, and A. E. Cetin, "Adaptive methods for dithering color images," IEEE Trans. Image Process., vol. 6, no. 7, pp. 950–955, Jul. 1997.
[6] C. Alasseur, A. G. Constantinides, and L. Husson, "Colour quantisation through dithering techniques," in Proc. ICIP, Sep. 14–17, 2003, vol. 1, pp. I-469–I-472.
[7] [Online]. Available: http://www.google.com/Lossless compression/Wikipedia
[8] S. Battiato, G. Gallo, G. Impoco, and F. Stanco, "An efficient re-indexing algorithm for color-mapped images," IEEE Trans. Image Process., vol. 13, no. 11, pp. 1419–1423, Nov. 2004.
[9] B. Koc and Z. Arnavut, "Gradient adjusted predictor with pseudo-distance technique for lossless compression of color-mapped images," in Proc. FIT, Dec. 19–21, 2011, pp. 275–280.
[10] B. Koc and Z. Arnavut, "A new context-model for the pseudo-distance technique in lossless compression of color-mapped images," in Proc. SPIE Opt. Eng. + Appl., Appl. Digit. Image Process. XXXV, Aug. 12–16, 2012, p. 84991O.
[11] B. Koc, Z. Arnavut, and H. Kocak, "Lossless compression of dithered images with the pseudo-distance technique," in Proc. 9th Int. Conf. HONET, Dec. 12–14, 2012, pp. 147–151.
[12] A. J. Pinho and A. J. R. Neves, "A survey on palette reordering methods for improving the compression of color-indexed images," IEEE Trans. Image Process., vol. 13, no. 11, pp. 1411–1418, Nov. 2004.
[13] Manjinder Kaur and Gaganpreet Kaur, "A survey of lossless and lossy image compression techniques," vol. 3, issue 2, Feb. 2013.