Abstract

Image compression addresses the problem of reducing the amount of data required to represent a digital image. The underlying basis of the reduction process is the removal of redundant data. From a mathematical viewpoint, this amounts to transforming a 2-D pixel array into a statistically uncorrelated data set. The transformation is applied prior to storage or transmission of the image. At some later time, the compressed image is decompressed to reconstruct the original image or an approximation of it. Currently image compression is recognized as an "enabling technology". Image compression is the natural technology for handling the increased spatial resolution of today's imaging sensors and evolving broadcast television standards. Image compression plays a major role in important and diverse applications, including televideo-conferencing, remote sensing, document and medical imaging, facsimile transmission, and the control of remotely piloted vehicles in military, space and hazardous waste management applications. The JPEG 2000 image compression standard makes use of the DWT (Discrete Wavelet Transform). The DWT can be used to reduce the image size without losing much resolution: the transform coefficients are computed, and values less than a pre-specified threshold are discarded. It thus reduces the amount of memory required to represent a given image. Wavelet-based coding provides substantial improvement in picture quality at high compression ratios, mainly due to the better energy compaction property of wavelet transforms. The wavelet transform partitions a signal into a set of functions called wavelets. Wavelets are obtained from a single prototype wavelet, called the mother wavelet, by dilations and shifts. The wavelet transform is computed separately for different segments of the time-domain signal at different frequencies.

Contents:
Chapter 1: Introduction
Chapter 2: Image Compression
2.1 What is the need of image compression?
2.2 Fundamentals of image compression
2.3 Different classes of compression techniques
2.4 Image compression and reconstruction
2.5 A typical image coder
Chapter 3: Image Compression Using DWT
3.1 What is wavelet transform?
3.2 Subband coding
3.3 Compression steps
Chapter 4: Implementation
4.1 Choosing the images
4.2 Collecting results
4.3 Choosing threshold values
4.4 Collecting results for all images
4.5 Using the Database Toolbox
4.6 Methods of analysis
4.7 Defining the best result
Chapter 5: Discrete Wavelet Transform
5.1 DWT results
5.2 Results for DWT based on various performance parameters
5.3 Conclusion

Chapter 1 Introduction:
Image compression is very important for efficient transmission and storage of images. Demand for communication of multimedia data through telecommunication networks and for accessing multimedia data through the internet is growing explosively [2]. With the use of digital cameras, the requirement for storage, manipulation and transfer of digital images has grown explosively. These files can be very large and can occupy a lot of memory. A grayscale image that is 256×256 pixels has 65,536 elements to store. Downloading these files from the internet can be a very time-consuming task. Image data comprise a significant portion of multimedia data, and they occupy the major portion of the communication bandwidth for multimedia communication. Therefore, development of efficient techniques for image compression has become quite necessary [3]. A common characteristic of most images is that neighboring pixels are highly correlated and therefore contain highly redundant information. The basic objective of image compression is to find an image representation in which the pixels are less correlated. The two fundamental principles used in image compression are redundancy and irrelevancy.
Redundancy reduction removes redundancy from the signal source, and irrelevancy reduction omits pixel values that are not noticed by the human eye. JPEG and JPEG 2000 are two important techniques used for image compression. Image compression brings many benefits, such as (1) easier exchange of image files between different devices and applications; (2) reuse of existing hardware and software for a wider array of products; (3) existence of benchmarks and reference data sets for new and alternative developments.

Chapter 2 Image Compression:
2.1 What is the need of image compression?
The need for image compression becomes apparent when the number of bits per image resulting from typical sampling rates and quantization methods is computed. For example, the amount of storage required for a given image is: (1) a low-resolution, TV-quality, color video image of 512×512 pixels/color, 8 bits/pixel and 3 colors consists of approximately 6×10^6 bits; (2) a 24×36 mm negative photograph scanned at 12×10^-6 m (3000×2000 pixels/color, 8 bits/pixel, 3 colors) contains nearly 144×10^6 bits; (3) a 14×17 inch radiograph scanned at 70×10^-6 m (5000×6000 pixels, 12 bits/pixel) contains nearly 360×10^6 bits. Thus storage of even a few images could cause a problem. As another example of the need for image compression, consider the transmission of a low-resolution 512×512×8 bits/pixel×3-color video image over telephone lines. Using a 9600 baud (bits/sec) modem, the transmission would take approximately 11 minutes for just a single image, which is unacceptable for most applications.

2.2 Fundamentals of image compression
The term data compression refers to the process of reducing the amount of data required to represent a given quantity of information.
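The storage and transmission figures above follow from simple arithmetic. The report's own scripts are in Matlab, so the short Python check below is purely illustrative; it reproduces the TV-image example:

```python
# Back-of-the-envelope storage and transmission figures from Section 2.1.
# All quantities (512x512 pixels, 8 bits/pixel, 3 colors, 9600 baud)
# are taken directly from the text.

def image_bits(width, height, bits_per_pixel, colors=1):
    """Raw storage in bits for an uncompressed image."""
    return width * height * bits_per_pixel * colors

# Low-resolution TV-quality color image.
tv_bits = image_bits(512, 512, 8, 3)
print(f"TV image: {tv_bits / 1e6:.2f} Mbit")        # about 6 x 10^6 bits

# Transmission over a 9600 baud (bits/sec) modem.
minutes = tv_bits / 9600 / 60
print(f"Transmission time: {minutes:.1f} minutes")   # roughly 11 minutes
```

The same helper reproduces the photograph (144×10^6 bits) and radiograph (360×10^6 bits) examples by substituting the corresponding dimensions.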
There are three types of redundancies: (1) spatial redundancy, which is due to the correlation or dependence between neighboring pixel values; (2) spectral redundancy, which is due to the correlation between different color planes or spectral bands; (3) temporal redundancy, which is present because of the correlation between different frames in a sequence of images. Image compression research aims to reduce the number of bits required to represent an image by removing the spatial and spectral redundancies as much as possible. Data redundancy is a central issue in digital image compression. If n1 and n2 denote the number of information-carrying units in two data sets that represent the same information, the relative data redundancy RD of the first data set can be defined as

RD = 1 - 1/CR

where CR, commonly called the compression ratio, is

CR = n1/n2

For the case n2 = n1, CR = 1 and RD = 0, indicating that the first representation of the information contains no redundant data. When n2 << n1, CR → ∞ and RD → 1, implying significant compression and highly redundant data. When n2 >> n1, CR → 0 and RD → -∞, indicating that the second data set contains much more data than the original representation. Generally, CR = 10 (written 10:1) means that the first data set has 10 information-carrying units for every 1 unit in the second, or compressed, data set. The corresponding redundancy of 0.9 means that 90 percent of the data in the first data set is redundant with respect to the second one.

2.3 Different Classes of Compression Techniques:
Lossless and lossy compression: In lossless compression schemes, the reconstructed image, after compression, is numerically identical to the original image. However, lossless compression can only achieve a modest amount of compression. Lossless compression is preferred for archival purposes and is often used for medical imaging, technical drawings, clip art or comics. This is because lossy compression methods, especially when used at low bit rates, introduce compression artifacts.
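The relationship between CR and RD can be sketched in a few lines of Python (illustrative only; n1 and n2 are the information-carrying unit counts from the definitions above):

```python
# Compression ratio CR = n1/n2 and relative redundancy RD = 1 - 1/CR,
# as defined in Section 2.2.

def compression_ratio(n1, n2):
    return n1 / n2

def relative_redundancy(n1, n2):
    return 1.0 - 1.0 / compression_ratio(n1, n2)

# The 10:1 example from the text: 10 units in the original for every
# 1 unit in the compressed set gives a redundancy of 0.9 (90 percent).
print(compression_ratio(10, 1))        # 10.0
print(relative_redundancy(10, 1))      # 0.9

# n2 == n1: CR = 1 and RD = 0 (no redundant data).
print(relative_redundancy(100, 100))   # 0.0
```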
An image reconstructed following lossy compression contains degradation relative to the original, because the compression scheme completely discards redundant information. However, lossy schemes are capable of achieving much higher compression. Lossy methods are especially suitable for natural images such as photos, in applications where a minor loss of fidelity is acceptable to achieve a substantial reduction in bit rate. Lossy compression that produces imperceptible differences can be called visually lossless [1].

Predictive and transform coding: In predictive coding, information already sent or available is used to predict future values, and the difference is coded. Since this is done in the image or spatial domain, it is relatively simple to implement and is readily adapted to local image characteristics. Differential Pulse Code Modulation (DPCM) is one particular example of predictive coding. Transform coding, on the other hand, first transforms the image from its spatial domain representation to a different type of representation using some well-known transform, and then codes the transformed values (coefficients). This method provides greater data compression compared to predictive methods, although at the expense of greater computational requirements.

2.4 Image Compression and Reconstruction:
Three basic data redundancies can be categorized in the image compression standard:
1. Spatial redundancy due to correlation between neighboring pixels.
2. Spectral redundancy due to correlation between the color components.
3. Psycho-visual redundancy due to properties of the human visual system.
The spatial and spectral redundancies are present because certain spatial and spectral patterns between the pixels and the color components are common to each other, whereas the psycho-visual redundancy originates from the fact that the human eye is insensitive to certain spatial frequencies.
The principle of an image compression algorithm is (1) reducing the redundancy in the image data and (2) producing a reconstructed image from the original image with the introduction of error that is insignificant to the intended application. The aim here is to obtain an acceptable representation of a digital image while preserving the essential information contained in that particular data set [4].

Fig. 1: Image compression system (original image → transform → quantization → lossless coding → compressed image).

The problem faced by image compression is easy to define, as in figure 1 above. First, the original digital image is usually transformed into another domain, where it is highly decorrelated by some transform. This decorrelation concentrates the important image information into a more compact form. The compressor then removes the redundancy in the transformed image and stores it in a compressed file or data stream. In the second stage, the quantization block reduces the accuracy of the transformed output in accordance with some pre-established fidelity criterion. This stage also reduces the psycho-visual redundancy of the input image. The quantization operation is an irreversible process and thus must be omitted when error-free or lossless compression is needed. In the final stage of the data compression model, the symbol coder creates a fixed- or variable-length code to represent the quantizer output and maps the output in accordance with the code. Generally a variable-length code is used to represent the mapped and quantized data set. It assigns the shortest code words to the most frequently occurring output values and thus reduces coding redundancy. This operation is in fact a reversible one. Decompression reverses the compression process to produce the recovered image, as shown in figure 2. The recovered image may have lost some information due to the compression, and may have error or distortion compared to the original image.
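The transform → quantize → code pipeline and its inverse can be sketched on a 1-D toy signal. This is a minimal Python illustration, not the method used later in the report: a one-level Haar averaging/differencing transform stands in for the decorrelating transform, and the entropy-coding stage is omitted because, being lossless, it does not affect the reconstruction error:

```python
# Minimal sketch of transform -> quantize -> dequantize -> inverse transform.
# The Haar transform and the step size are illustrative stand-ins.

def haar_forward(x):
    """One level of the Haar transform: pairwise averages and differences."""
    avg = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    dif = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return avg + dif

def haar_inverse(c):
    """Invert haar_forward exactly (the transform itself is lossless)."""
    h = len(c) // 2
    out = []
    for a, d in zip(c[:h], c[h:]):
        out += [a + d, a - d]
    return out

def quantize(c, step):
    return [round(v / step) for v in c]   # the irreversible stage

def dequantize(q, step):
    return [v * step for v in q]          # only approximates the input

signal = [52, 54, 53, 55, 60, 58, 57, 59]
step = 2
rec = haar_inverse(dequantize(quantize(haar_forward(signal), step), step))
mse = sum((a - b) ** 2 for a, b in zip(signal, rec)) / len(signal)
print(rec, mse)
```

Skipping the quantizer (`haar_inverse(haar_forward(signal))`) recovers the signal exactly, which is precisely the lossless/lossy distinction drawn in the text: all of the error comes from the quantization stage.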
Fig. 2: Image decompression system (compressed image → lossless decoding → de-quantization → inverse transform → reconstructed image).

2.5 A Typical Image Coder:
An image compression model consists of a transformer, a quantizer and an encoder.

Transformer: It transforms the input data into a format that reduces inter-pixel redundancies in the input image. Transform coding techniques use a reversible, linear mathematical transform to map the pixel values onto a set of coefficients, which are then quantized and encoded. The key factor behind the success of transform-based coding schemes is that many of the resulting coefficients for most natural images have small magnitudes and can be quantized without causing significant distortion in the decoded image. Transform coding algorithms usually start by partitioning the original image into sub-images of small size (usually 8×8). For each block the transform coefficients are calculated, effectively converting the original 8×8 array of pixel values into an array of coefficients within which the coefficients closer to the top-left corner usually contain most of the information needed to quantize and encode the image with little perceptual distortion. The resulting coefficients are then quantized, and the output of the quantizer is used by symbol encoding techniques to produce the output bit stream representing the encoded image. In the image decompression model at the decoder's side, the reverse process takes place, with the obvious difference that the de-quantization stage will only generate an approximated version of the original coefficient values; i.e., whatever loss was introduced by the quantizer in the encoder stage is not reversible.

Quantizer: It reduces the accuracy of the transformer's output in accordance with some pre-established fidelity criterion, and reduces the psycho-visual redundancies of the input image. This operation is not reversible and must be omitted if lossless compression is desired.
The quantization stage is at the core of any lossy image encoding algorithm. Quantization, at the encoder side, means partitioning of the input data range into a smaller set of values. There are two main types of quantizers: scalar quantizers and vector quantizers. A scalar quantizer partitions the domain of input values into a smaller number of intervals. If the output intervals are equally spaced, which is the simplest way to do it, the process is called uniform scalar quantization; otherwise, for reasons usually related to minimization of total distortion, it is called non-uniform scalar quantization. One of the most popular non-uniform quantizers is the Lloyd-Max quantizer. Vector quantization techniques extend the basic principles of scalar quantization to multiple dimensions.

Entropy Encoder: It creates a fixed- or variable-length code to represent the quantizer's output and maps the output in accordance with the code. In most cases, a variable-length code is used. An entropy encoder further compresses the values obtained by the quantizer to provide more efficient compression. The most important types of encoders used in lossy image compression techniques are the arithmetic encoder, the Huffman encoder and the run-length encoder.

Chapter 3: Image Compression Using Discrete Wavelet Transform:
The wavelet transform has gained widespread acceptance in signal processing and image compression. Because of their inherent multi-resolution nature, wavelet coding schemes are especially suitable for applications where scalability and tolerable degradation are important. Recently the JPEG committee released its new image coding standard, JPEG 2000, which is based upon the DWT [5].

3.1 What is wavelet transform?
Wavelets are functions defined over a finite interval and having an average value of zero. The basic idea of the wavelet transform is to represent any arbitrary function f(t) as a superposition of a set of such wavelets or basis functions.
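The zero-average property of wavelets can be checked numerically with the Haar mother wavelet and its dilated/translated copies (Haar is chosen here purely for convenience; the text does not fix a particular mother wavelet):

```python
# The Haar mother wavelet: psi(t) = 1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere.
# Baby wavelets psi_{j,k}(t) = 2**(j/2) * psi(2**j * t - k) are contracted
# (scaled) and translated (shifted) copies of it.

def haar_mother(t):
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def baby_wavelet(j, k):
    """psi_{j,k}: contracted by 2**j, shifted by k, renormalised."""
    return lambda t: 2.0 ** (j / 2) * haar_mother(2.0 ** j * t - k)

def mean_value(f, a=0.0, b=1.0, n=10000):
    """Crude Riemann-sum average of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + i * h) for i in range(n)) * h / (b - a)

# Every wavelet in the family averages to zero, as required by the text.
print(mean_value(haar_mother))
print(mean_value(baby_wavelet(2, 1)))
```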
These basis functions, or baby wavelets, are obtained from a single prototype wavelet called the mother wavelet, by dilations or contractions (scaling) and translations (shifts). The discrete wavelet transform of a finite-length signal x(n) having N components, for example, is expressed by an N×N matrix.

3.2 Subband coding:
A signal is passed through a series of filters to calculate the DWT. The procedure starts by passing the signal through a half-band digital low-pass filter with impulse response h(n). Filtering a signal is numerically equal to convolution of the signal with the impulse response of the filter:

x[n] * h[n] = Σ_k x[k] h[n-k]

A half-band low-pass filter removes all frequencies that are above half of the highest frequency in the signal. The signal is then also passed through a half-band high-pass filter g(n). The two filters of length L are related to each other as

h[L-1-n] = (-1)^n g[n]

Filters satisfying this condition are known as quadrature mirror filters. After filtering, half of the samples can be eliminated, since the signal now has a highest frequency of half the original. The signal can therefore be subsampled by 2, simply by discarding every other sample. This constitutes 1 level of decomposition and can mathematically be expressed as

y1[n] = Σ_k x[k] h[2n-k]
y2[n] = Σ_k x[k] g[2n+1-k]

where y1[n] and y2[n] are the outputs of the low-pass and high-pass filters, respectively, after subsampling by 2. This decomposition halves the time resolution, since only half the number of samples now characterizes the whole signal. The frequency resolution has doubled, because each output has half the frequency band of the input. This process is called subband coding. It can be repeated further to increase the frequency resolution, as shown by the filter bank.

Fig.: Filter bank.

3.3 Compression steps:
1. Digitize the source image into a signal s, which is a string of numbers.
2. Decompose the signal into a sequence of wavelet coefficients w.
3.
Use thresholding to modify the wavelet coefficients from w to w'.
4. Use quantization to convert w' to a sequence q.
5. Apply entropy encoding to convert q into a sequence e.

Digitization: The image is digitized first. The digitized image can be characterized by its intensity levels, or scales of gray, which range from 0 (black) to 255 (white), and its resolution, or how many pixels per square inch [6].

Thresholding: In certain signals, many of the wavelet coefficients are close or equal to zero. Through thresholding these coefficients are modified so that the sequence of wavelet coefficients contains long strings of zeros. In hard thresholding, a threshold is selected; any wavelet coefficient whose absolute value falls below the threshold is set to zero, the goal being to introduce many zeros without losing a great amount of detail.

Quantization: Quantization converts the sequence of floating-point numbers w' to a sequence of integers q. The simplest form is to round to the nearest integer. Another method is to multiply each number in w' by a constant k, and then round to the nearest integer. Quantization is called lossy because it introduces error into the process, since the conversion of w' to q is not a one-to-one function [6].

Entropy Encoding: With this method, an integer sequence q is changed into a shorter sequence e, with the numbers in e being 8-bit integers. The conversion is made by an entropy encoding table. Strings of zeros are coded by the numbers 1 through 100, 105 and 106, while the non-zero integers in q are coded by 101 through 104 and 107 through 254.

Chapter 4: Implementation:
4.1 Choosing the Images:
The images used were selected from a range of over 40 images provided in Matlab. A script was programmed to calculate the entropy He of an image using the equations:

P(k) = h(k)/(MN)
He = -Σ_k P(k) log2 P(k)

The script read through the image matrix, noting the frequency h(k) of each intensity level k.
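The entropy calculation above can be sketched in a few lines. The report's actual script (imageEntropy) is written in Matlab; this Python stand-in implements the same two equations:

```python
# Entropy of an image from its histogram, following the equations above:
# P(k) = h(k) / (M*N) and He = -sum over k of P(k) * log2(P(k)).
import math
from collections import Counter

def image_entropy(pixels):
    """pixels: 2-D list (M rows x N columns) of intensity levels."""
    flat = [p for row in pixels for p in row]
    mn = len(flat)
    hist = Counter(flat)                      # h(k): frequency of level k
    return sum(-(h / mn) * math.log2(h / mn) for h in hist.values())

# A constant image carries no information: entropy 0 bits.
print(image_entropy([[7, 7], [7, 7]]))
# Two equally likely levels give exactly 1 bit per pixel.
print(image_entropy([[0, 255], [255, 0]]))
```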
Images that were not already in matrix format had to be converted using the imread function in Matlab. For example, to convert an image file flower.tif to a Matlab matrix, the following code could be used:

X = double(imread('flower.tif'));

The entropy could then be calculated using:

He = imageEntropy(X);

The entropy results showed a range of entropies from 0.323 to 7.7804, and a sample of 9 images was selected to represent the full range with intervals as even as possible. Two further images were added to act as controls and help confirm or disprove the effect of image entropy on compression. Since the image entropy of control 1 ('Saturn') was similar to that of the image 'Spine', similar results would tend to confirm the significance of image entropy. The same applied to control 2 ('Cameraman') and 'Tire'.

Image                   Image entropy
Text                    0.323
Circbw                  0.9996
Wifs                    1.9754
Julia                   3.1812
Spine                   4.0935
Saturn (control 1)      4.114
Woman                   5.003
Cameraman (control 2)   7.0097

4.2 Collecting Results:
The results collected
The results that were collected were values for the percentage energy retained and the percentage number of zeros. These values were calculated for a range of threshold values on all the images, decomposition levels and wavelets used in the investigation. The energy retained describes the amount of image detail that has been kept; it is a measure of the quality of the image after compression. The number of zeros is a measure of compression: a greater percentage of zeros implies that higher compression rates can be obtained.

How the results were collected
Results were collected using the wdencmp function from the Matlab Wavelet Toolbox, which returns the L2 recovery (energy retained) and the percentage of zeros when given the following inputs:
1. 'gbl' or 'lvd' for global or level-dependent thresholding.
2. An image matrix.
3. A wavelet name.
4. A level of decomposition.
5. Threshold value(s).
6. 's' or 'h' for soft or hard thresholding.
7.
Whether the approximation values should be thresholded.
An automated script was written which took as input an image, a wavelet and a level. It calculated 10 appropriate threshold values to use and then collected the energy retained and percentage of zeros using wdencmp with the given image, wavelet and decomposition level for each of the 10 threshold values.

4.3 Choosing the Threshold Values:
There are a number of different options for thresholding. These include:
1. The approximation signal is thresholded or not thresholded.
2. Level-dependent or global threshold values.
3. Thresholding different areas of an image with different threshold values.

Thresholding for results set 1
The first set of results used the simplest combination, which was to use global thresholding and not to threshold the approximation signals. To get an even spread of energy values, the function 'calcres' first calculated the decomposition coefficients up to the level given as input. The script then calculated 10 threshold values spread evenly along the range 0 to max_c, where max_c is the maximum value in the coefficients matrix.

(Fig.: the spread of coefficients, with the 10 sampled points.)

However, it was observed that the results for energy retained and percentage of zeros only changed significantly between the first 4 or 5 thresholds. The most dramatic compression and energy loss occurred before reaching a threshold of 0.25·max_c. The reason for this must be that a large percentage of the coefficients and energy occurs at the lower values; setting these values to zero therefore changes a lot of the coefficients and increases the compression rate. Setting the higher values to zero had little effect on compression because there weren't many coefficients to affect. Therefore the threshold calculations were changed to zoom in on the first quarter of possible thresholds.
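The global threshold sweep described above can be sketched in plain Python. This stands in for the Matlab wdencmp call; the coefficient values below are arbitrary illustrative numbers, not data from the study:

```python
# Sweep 10 evenly spaced hard thresholds over [0, max_c] and record the
# percentage of zeros and the percentage of energy retained at each step.

def sweep(coeffs, nsteps=10):
    flat = [c for row in coeffs for c in row]
    max_c = max(abs(c) for c in flat)
    total_energy = sum(c * c for c in flat)
    results = []
    for i in range(nsteps):
        t = max_c * i / (nsteps - 1)                       # 0 .. max_c
        kept = [c if abs(c) > t else 0.0 for c in flat]    # hard threshold
        zeros = 100.0 * kept.count(0.0) / len(kept)
        energy = 100.0 * sum(c * c for c in kept) / total_energy
        results.append((t, zeros, energy))
    return results

# Illustrative coefficient matrix: a few large values, many small ones,
# mimicking the energy compaction of a wavelet decomposition.
coeffs = [[120.0, 3.0, -1.5, 0.5],
          [-2.0, 40.0, 0.2, -0.1],
          [1.0, -0.4, 15.0, 2.5],
          [0.3, 0.6, -7.0, 1.2]]

for t, z, e in sweep(coeffs):
    print(f"t = {t:6.1f}  zeros = {z:6.2f}%  energy = {e:6.2f}%")
```

Note how nearly all of the zeros appear before the threshold reaches 0.25·max_c: this is exactly the behaviour that motivated zooming in on the first quarter of the threshold range.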
The idea of this was to get a better view of how the energy was lost with thresholding; the optimal compromise between energy loss and compression, if any were possible, was more likely to be contained within this section.

Thresholding for results set 2
This used local thresholding. The function 'calcres2' was written to calculate the coefficient matrices for each level and direction (horizontal, vertical and diagonal), and the maximum value for each was noted. Again, 10 samples of threshold values were taken from the range 0 to max_c for each local value. The first thresholds were all zero; the 10th set contained the maximum values for each. Each level and direction sub-signal can be thresholded differently. For example, the level 3 horizontal sub-signal would be tested with thresholds in the range 0 to 880, but the level 1 diagonal only requires a range of 0 to 120.

4.4 Collecting results for all images
The function 'calcres' was used to get one set of results for a particular combination of image, wavelet and decomposition level. However, results were required for many images with many wavelets at many decomposition levels. Therefore an automated script was written (getwaveletresults.m), which follows this algorithm:

for each image
    for each wavelet
        for level = 1 to 5
            RES = calcres(image, wavelet, level)
            save RES
        end
    end
end

The list of images and wavelets to use is loaded from a database, using the Database Toolbox. The results are then saved in the database.

4.5 Using the Database Toolbox
The Database Toolbox was used to provide a quick and relatively simple solution to the problems involved with saving the results. Originally the results were saved to a large matrix, but in Matlab the matrix couldn't include strings, so the names of images and wavelets could not be included in the matrix. The initial solution was to give each image and wavelet a numeric identifier that could be put into the matrix.
The image names and wavelet names could then be written in separate files, with the ID values linked to their position in these files. However, this led to concurrency problems: if the image file was changed, the results pointed to the wrong images and thus were wrong. The second idea was to save a matrix of strings in the same file as the results matrix; however, if the strings are of different lengths they cannot go into the same matrix. The Database Toolbox allows Matlab to communicate with a database, e.g. an MS Access database, in order to import and export data. An advantage of saving results into a database of this sort is that SQL statements can be used to retrieve certain sets of data, which makes the analysis of the results a lot easier. The database contained 3 tables: Images, Results and Wavelets.

4.6 Method of Analysis
In order to analyze the results, graphs were plotted of energy retained against percentage of zeros. In an ideal situation an image would be compressed by 100% whilst retaining 100% of the energy, and the graph would be a straight line. The best threshold to use would be one that produced 100% zeros whilst retaining 100% of the energy, corresponding to the idealized point E. However, since energy retained decreases as the number of zeros increases, it is impossible for the image to have the same energy if the values have been changed by thresholding. Thus all the graphs produced had the same general form. Important things to analyze about the curves were:
1. The starting point: the x co-ordinate gives the percentage of zeros with no thresholding, and therefore shows how much the image could be compressed without thresholding.
2. The end point: this shows how many zeros could be achieved with the investigated thresholds, and how much energy is lost by doing so.
3. The changing gradient of the curve: this shows how rapidly energy is lost when trying to compress the image.
It is the ability of wavelets to retain energy during compression that is important, so energy being lost quickly suggests that a wavelet is not a good one to use.

4.7 Defining the Best Result
In the figure, the starting point indicates the best energy retention and the end point indicates the best compression. In practice, compressing an image requires a trade-off between compression and energy loss. It may not be better to compress by an extra small amount if this would cause a dramatic loss in energy.

The distance, D
For the purposes of this study it was decided to define the best trade-off to be the point T on the curve which was closest to the idealized point E. This was calculated using Pythagoras' theorem with the ten known points on each graph. For the rest of the study, the distance between each point T and point E will be known simply as D. By calculating D for each result we can take the minimum to be the best, in other words the most efficient at retaining energy while compressing the signal. This will not necessarily be the best to use for all situations, but it is the most efficient found in the results.

Energy loss per percentage zero
A second possible measure of the best result is the ratio of energy loss per percentage zero. This can be calculated for each result by the following equation:

Energy loss per percentage zero = (100% - %Energy Retained) / %Zeros

This gives a measure of how much energy is lost by compressing the image, so a smaller value of the ratio means the compression is less lossy, and therefore better.

Chapter 5
5.1 DWT Results:
Results obtained with the Matlab code [7] are shown below. Figure 5.1 shows the original Lena image. Figures 5.2 to 5.4 show compressed images for various threshold values. As the threshold value increases, blurring of the image continues to increase.
Figure 5.1: Original Lena image.
Figure 5.2: Compressed image for threshold value 1.
Figure 5.3: Compressed image for threshold value 2.
Figure 5.4: Compressed image for threshold value 5.

5.2 Results for DWT based on various performance parameters:
Mean square error (MSE) is defined as the mean of the squared differences between the corresponding pixel values of the two images. For DWT-based image compression, figure 5.2.1 shows that the MSE first decreases with increase in window size and then starts to increase slowly, finally attaining a constant value. Compression decreases with increase in window size for DWT.

Figure 5.2.1: Mean squared error vs. window size for DWT.
Figure 5.2.2: Compression vs. window size for DWT.

5.3 Conclusion:
The DWT is used as the basis for transformation in the JPEG 2000 standard. The DWT provides high-quality compression at low bit rates. The use of larger DWT basis functions or wavelet filters produces blurring near edges in images. The DWT performs better than the DCT in that it avoids the blocking artifacts which degrade reconstructed images. The DWT provides lower quality than JPEG at low compression rates and requires longer compression time.

References:
1. http://en.wikipedia.org/Image_Compression
2. http://www.3.interscience.wiley.com
3. Grey Ames, "Image Compression," Dec. 7, 2002.
4. David Salomon, Data Compression: The Complete Reference, 2nd ed., Springer-Verlag, 1998.
5. R. C. Gonzalez and R. E. Woods, Digital Image Processing, Reading, MA: Addison-Wesley, 2004.
6. Grey Ames, "Image Compression," Dec. 7, 2002.
7. http://www.spelman.edu/%7Ecolm/wav.html