International Journal of Engineering Trends and Technology (IJETT) – Volume 17, Number 10, Nov 2014

Hybrid Compression Scheme for Scanned Documents

Fathima Najeeb (P G Scholar), Rahul R Nair (Assistant Professor)
Electronics and Communication Engineering Department, M G University Caarmel Engineering College, Pathanamthitta, Kerala, India

Abstract—Compression reduces irrelevant or redundant data in order to save storage space or transmission time. Scanned documents are compressed to reduce storage and bandwidth requirements. This paper proposes a hybrid coding method for scanned documents. The hybrid codec combines a lossy block based compression scheme called Absolute Moment Block Truncation Coding (AMBTC), a block based pattern matching, and a lifting based discrete wavelet transform. The AMBTC encoder outputs a bitmap and two mean values for each block. For greyscale document images the bitmap carries most of the information, so a bitmap interpolation and a lossless block matching scheme are applied to the bitmap to reduce its size. The higher and lower mean sub images from the AMBTC encoder are coded separately by the DWT. The proposed method combines the advantages of AMBTC and DWT, achieving higher compression with high quality reconstructed versions. The proposed work is evaluated using Peak Signal to Noise Ratio (PSNR) and compression ratio (CR), and the results show that the proposed method achieves both high PSNR and a high compression ratio.

Keywords—AMBTC, DWT, Scanned Document Compression, Block based pattern matching.

I. INTRODUCTION

Image compression algorithms aim to remove redundancy in data in a way that makes image reconstruction possible. In other words, they exploit redundancies in the data: they determine which data must be kept in order to reconstruct the original image and which data can be "thrown away". By removing the redundant data, the image can be represented in a smaller number of bits and hence compressed.

Scanned documents in digital form play an important role in everyday life; they also provide a way to preserve historically important books and records. Compressing compound documents is more demanding than compressing natural images: in a natural image, moderate lossy distortion is often imperceptible to the human visual system and does not affect the overall content, but in a document containing text even slight degradation is easily noticed by a human observer, so the compressed result may not satisfy users even though its size is reduced.

Compression of scanned documents can therefore be tricky. A scanned document is either compressed as a continuous tone picture, or it is binarised before compression. The binary document can then be compressed using any available two level lossless compression algorithm (such as JBIG [1] and JBIG2 [2]), or it may undergo character recognition [3]. Binarization may cause strong degradation to object contours and textures, so continuous-tone compression is preferred whenever possible [2]. In single or multi-page document compression, each page may be individually encoded by a continuous-tone image compression algorithm such as JPEG [4] or JPEG2000 [5]. Multi-layer approaches such as the mixed raster content (MRC) imaging model [6] are also challenged by soft edges in scanned documents, often requiring pre- and post-processing.
In this paper, we propose a novel method of encoding a scanned document using Absolute Moment Block Truncation Coding (AMBTC) [7], a lifting based Discrete Wavelet Transform (DWT) [8] and a block based pattern matching [9] to achieve a significant improvement in scanned document image compression performance. AMBTC requires very few computations and preserves edges, but achieves only a moderate compression ratio. We therefore investigate the implementation of the DWT and block match encoding to improve the compression without losing quality.

AMBTC is an improved version of block truncation coding and is a block based lossy compression algorithm; the encoded output contains one bitmap and two mean values for each block. The size of the bitmap is compressed by bitmap interpolation and a block matching method. The pattern based block matching exploits the redundancies in scanned documents: text characters repeat, so a document image contains many identical pixel patterns, and these redundancies can be eliminated by block matching. The two mean values of each block, the higher mean and the lower mean, are arranged into two sub images, which are compressed by a lifting based DWT [19] to remove the correlation between neighbouring pixel values. The techniques used in the proposed method are described in the following sections.

II. ABSOLUTE MOMENT BLOCK TRUNCATION CODING

AMBTC is an improved version of BTC [9] that preserves absolute moments rather than standard moments. As in BTC, the digitized image is divided into blocks of n x n pixels, and each block is quantized in such a way that the resulting block has the same sample mean and the same first absolute central moment as the original block [10]. Similar to BTC, the AMBTC algorithm used here involves four separate steps while coding a single block of size n x n: quad tree segmentation [11], bit plane omission, bit plane coding using 32 visual patterns, and interpolative bit plane coding [12].

The steps of the AMBTC algorithm are:
Step 1: The original image is segmented into blocks of size 4 x 4, 8 x 8 or higher for processing.
Step 2: Find the mean of each block, along with the means of its higher and lower ranges.
Step 3: Based on these means, the bit plane of each block is calculated.
Step 4: The image block is reconstructed by replacing the 1s with the higher mean and the 0s with the lower mean.

At the encoder side, the image is divided into non overlapping blocks. The size of each block may be 4 x 4, 8 x 8, etc. For ease of exposition the compression is described for a 4 x 4 block; the same procedure is followed for a block of any size. The average gray level of the block is calculated as

$\bar{x} = \frac{1}{16}\sum_{i=1}^{16} x_i$

where $x_i$ represents the i-th pixel in the block. The pixels in the image block are then classified into two ranges of values: the upper range contains the gray levels greater than or equal to the block average gray level, and the remaining pixels form the lower range. The mean of the higher range, $x_H$, and of the lower range, $x_L$, are calculated as

$x_H = \frac{1}{k}\sum_{x_i \ge \bar{x}} x_i, \qquad x_L = \frac{1}{16-k}\sum_{x_i < \bar{x}} x_i$

where k is the number of pixels whose gray level is greater than or equal to $\bar{x}$. A binary block, denoted by b, is also used to represent the pixels: "1" represents a pixel whose gray level is greater than or equal to $\bar{x}$ and "0" represents a pixel whose gray level is less than $\bar{x}$.
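As a concrete illustration of the per-block AMBTC encoding described above, the following minimal sketch computes the block mean, the two range means and the bitmap for a single block, and reconstructs the block by substituting the means back. It is written in Python/NumPy purely for illustration (the paper's implementation is in MATLAB), and the function names are ours, not the authors'.

```python
import numpy as np

def ambtc_encode_block(block):
    """Encode one n x n block: return bitmap, higher mean x_H and lower mean x_L."""
    x_bar = block.mean()                      # block average gray level
    bitmap = block >= x_bar                   # '1' where pixel >= block mean
    k = int(bitmap.sum())                     # number of high-range pixels
    x_h = block[bitmap].mean() if k > 0 else x_bar
    x_l = block[~bitmap].mean() if k < block.size else x_bar
    return bitmap, x_h, x_l

def ambtc_decode_block(bitmap, x_h, x_l):
    """Reconstruct a block: 1 -> higher mean, 0 -> lower mean."""
    return np.where(bitmap, x_h, x_l)

# Example on a random 4 x 4 block
block = np.random.randint(0, 256, (4, 4)).astype(float)
bitmap, x_h, x_l = ambtc_encode_block(block)
print(ambtc_decode_block(bitmap, x_h, x_l))
```

The decoder needs only the bitmap and the two means per block, which is what makes the bitmap the dominant part of the bit budget for greyscale documents.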
A. Quad Tree Segmentation

The quad tree segmentation technique divides the given image into a set of variable sized blocks using a threshold value. Here the absolute difference between the higher mean and the lower mean of the AMBTC method is used as the controlling value to segment the given image [13]. The steps for quad tree segmentation are:
Step 1: Divide the image into non overlapping blocks; let the block size be m1 x m1.
Step 2: For each block, if the absolute difference between the higher mean and the lower mean is less than the threshold, apply AMBTC; otherwise divide the block into 4 sub blocks.
Step 3: Repeat the above step until the condition is satisfied.
Step 4: Check the above two steps for every block in the image.
The image is divided as shown in Fig. 1, depending on the condition under consideration.

Fig. 1: quad tree segmentation

B. Bit Plane Omission

In the bit plane omission technique, the bit plane obtained by the AMBTC method is omitted from the encoding procedure if the absolute difference between the higher mean and the lower mean of that block is less than a threshold value; only the block mean is retained. At decoding time, the omitted blocks are replaced by a block filled with the block mean.

C. Bit Plane Coding

A bit plane of a digital discrete signal (such as an image or sound) is the set of bits having the same position in the respective binary numbers. For example, for a 16-bit data representation there are 16 bit planes: the first bit plane contains the set of most significant bits and the 16th contains the least significant bits. The first bit plane gives the roughest but most critical approximation of the signal, and the higher the number of the bit plane, the smaller its contribution to the final result. Thus, adding bit planes gives a progressively better approximation [14].

D. Interpolative Technique

When there is a correlation among neighbouring pixels, there is also a correlation among the neighbouring bits in the bit plane. Based on this idea, the alternate bits in the AMBTC bit plane are dropped and the remaining 8 bits are preserved; the dropped bits are then predicted from the neighbouring bits. In this technique half (8 of the 16 bits) of the bit plane of a 4 x 4 AMBTC block is dropped.
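To make the interpolative technique of Section D concrete, the sketch below drops the alternate (checkerboard) half of a 4 x 4 AMBTC bit plane and rebuilds the dropped bits from their retained neighbours. The paper does not specify the exact prediction rule, so a simple majority vote over the available 4-neighbours is assumed here purely for illustration; the function names are ours.

```python
import numpy as np

def interpolative_drop(bitmap):
    """Keep only the checkerboard half of a 4x4 AMBTC bit plane (8 of 16 bits)."""
    keep = (np.indices(bitmap.shape).sum(axis=0) % 2) == 0   # checkerboard mask
    return bitmap[keep], keep                                # retained bits + mask

def interpolative_predict(kept_bits, keep):
    """Rebuild the bit plane, predicting each dropped bit from retained neighbours
    by majority vote (assumed rule, for illustration only)."""
    h, w = keep.shape
    plane = np.zeros((h, w), dtype=int)
    plane[keep] = kept_bits
    for r in range(h):
        for c in range(w):
            if not keep[r, c]:
                neigh = [plane[rr, cc]
                         for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                         if 0 <= rr < h and 0 <= cc < w and keep[rr, cc]]
                plane[r, c] = int(sum(neigh) * 2 >= len(neigh))  # majority vote
    return plane

bit_plane = np.random.randint(0, 2, (4, 4))
bits, mask = interpolative_drop(bit_plane)
print(interpolative_predict(bits, mask))
```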
III. WAVELET TRANSFORMATION

The output of the AMBTC encoder contains an interpolated bitmap and two sub sampled images. The higher and lower mean sub sampled images are separately encoded using a lifting based discrete wavelet transform, which can efficiently encode them by removing the correlations among the higher and lower mean pixels. The power of wavelets comes from the use of multi resolution analysis: rather than examining the entire signal through the same window, different parts of the signal are viewed through different sized windows, where the high frequency parts of the signal use a small window to give good time resolution while the low frequency parts use a big window to extract good frequency information [8]. Wavelet analysis represents an image as a sum of wavelet functions with different locations and scales. Wavelet transforms exhibit high decorrelation and energy compaction, problems like blocking artifacts and mosquito noise are absent in wavelet based image coders, and distortion due to aliasing is eliminated through the design of proper filters. These properties make wavelets suitable for image compression. Like other linear transforms, wavelet transforms reduce the entropy of an image, i.e. the wavelet coefficient matrix has lower entropy than the image itself, so the coefficient matrix can be encoded more efficiently.

Wim Sweldens developed the lifting scheme for the construction of biorthogonal wavelets [15], [16]. The main feature of the lifting scheme is that all constructions are derived in the spatial domain; it does not require the complex mathematical calculations of traditional methods and does not depend on Fourier transforms. The lifting scheme is a simple and efficient way to calculate wavelet transforms, and it is used to generate second-generation wavelets, which are not necessarily translations and dilations of one particular function. It started as a method to improve a given discrete wavelet transform to obtain specific properties, and later became an efficient algorithm to calculate any wavelet transform as a sequence of simple lifting steps. Digital signals are usually sequences of integer numbers, while wavelet transforms produce floating point numbers. In [17], Calderbank et al. showed how to use the lifting scheme presented in [18], where Sweldens showed that the computational complexity of convolution based biorthogonal wavelet transforms can be reduced by a lifting based implementation. For an efficient reversible implementation, it is of great importance to have a transform algorithm that converts integers to integers. Fortunately, a lifting step can be modified to operate on integers while preserving reversibility; thus the lifting scheme became a method to implement reversible integer wavelet transforms.

A lifting based DWT is applied on the sub sampled images in order to reduce the inter pixel redundancy. As a result of the decomposition, most of the coefficients in the high frequency, low scale region are either zero or close to zero. Hence most of the significant coefficients can be extracted and coded by strategies such as designing a JPEG-like quantization table or applying a threshold. The threshold value is chosen intuitively, based on experimentation and a satisfactory visual quality of the reconstructed image, as reported in [26]. The significance of a coefficient is related to its magnitude as well as to its sub band after wavelet decomposition.

The lifting based wavelet transform consists of splitting, lifting and scaling modules, and the wavelet transform is treated as a prediction error decomposition; it provides a complete spatial interpretation of the DWT. In Fig. 2, let X denote the input signal and X_LI and X_HI the decomposed output signals, obtained through the following three modules of the lifting based 1-D DWT.

Fig. 2: lifting based wavelet transform.

(1) Splitting: the original signal X is divided into two disjoint parts, x_e(n) = x(2n) and x_o(n) = x(2n+1), which denote the even indexed and odd indexed samples of X, respectively.
(2) Lifting: the prediction operation P is used to estimate x_o(n) from x_e(n), which results in an error signal d(n) representing the detail part of the original signal. Then d(n) is passed through the update operation U and the result is combined with x_e(n) to obtain s(n), the smooth part of the original signal.
(3) Scaling: a normalization factor is applied to d(n) and s(n). The smooth signal s(n) is multiplied by a normalization factor K_e to produce the wavelet sub band X_LI, and the detail signal d(n) is multiplied by K_o to obtain the wavelet sub band X_HI.
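The splitting, lifting (predict/update) and scaling steps can be made concrete with an integer lifting wavelet. The paper does not name the particular wavelet it uses, so the sketch below assumes the well-known 5/3 (LeGall) integer lifting steps on a 1-D signal, with a simple periodic boundary instead of the symmetric extension normally used, and with the normalization factors taken as 1 to stay integer valued. It illustrates a reversible integer-to-integer lifting DWT, not the authors' exact transform.

```python
import numpy as np

def lifting_53_forward(x):
    """One level of an integer 5/3 lifting DWT on a 1-D signal of even length."""
    x = np.asarray(x, dtype=int)
    xe, xo = x[0::2].copy(), x[1::2].copy()          # split: even / odd samples
    d = xo - ((xe + np.roll(xe, -1)) >> 1)           # predict: detail d(n)
    s = xe + ((np.roll(d, 1) + d + 2) >> 2)          # update: smooth s(n)
    return s, d                                      # K_e = K_o = 1 (integer case)

def lifting_53_inverse(s, d):
    """Undo the lifting steps exactly (integer-to-integer, fully reversible)."""
    xe = s - ((np.roll(d, 1) + d + 2) >> 2)
    xo = d + ((xe + np.roll(xe, -1)) >> 1)
    x = np.empty(xe.size + xo.size, dtype=int)
    x[0::2], x[1::2] = xe, xo
    return x

signal = np.random.randint(0, 256, 16)
s, d = lifting_53_forward(signal)
assert np.array_equal(lifting_53_inverse(s, d), signal)   # perfect reconstruction
```

Because the inverse subtracts exactly the same integer predictions that the forward transform added, reconstruction is exact, which is the reversibility property the text relies on for coding the mean sub images without additional loss.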
IV. BLOCK BASED PATTERN MATCHING

A block based pattern matching [9] is applied on the bitmap image. In a grey scale scanned document, the bitmap conveys most of the information, so losses in the bitmap details can deteriorate the complete document. Therefore an efficient block based pattern matching compression that loses no data is applied to the bitmap. Fig. 3 shows an example of a scanned document and its bitmap image; it is clear that the bitmap conveys most of the information of a grey scale document image. Since the bitmap from the AMBTC encoder is produced block by block, every small detail can be recovered from the decompressed high and low mean sub images.

Fig. 3: original and bitmap images.

The block matching algorithm used here is simple and accurate: only one hundred percent matches are accepted, and all others are rejected. Such tight matching is applied because the bitmap has already been compressed by bitmap interpolation. To compare two blocks, an exclusive OR operation between them is performed; if the result is zero, the blocks are considered matched. In the proposed system this block matching algorithm involves more computation and consumes more time than the other stages. The steps for block based pattern matching on the bitmap are as follows (a small illustrative sketch follows the list):
Step 1: Bitmap blocks of the same size are grouped together for matching. Typical block sizes are 16 x 16, 8 x 8, 4 x 4, etc.
Step 2: Blocks are fed to the encoder in order, and each block is compared with the previously entered blocks.
Step 3: If a match is found, the encoder writes only the block's position and an identification of the reference block.
Step 4: Only bitmap blocks with exactly the same pattern are considered matched; blocks differing in even a single pixel are not.
Step 5: The compressed bitmap contains the distinct patterned blocks and the positions of the repeated blocks.
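The tight, exact-match criterion in the steps above can be sketched as follows: each bitmap block is looked up by its content, a new pattern is stored once, and a repeated pattern is stored only as a reference index. A dictionary keyed on the block bytes is used here instead of pairwise XOR scans for brevity, but the acceptance criterion is the same (a match only when two blocks are bit-for-bit identical, i.e. their XOR is all zeros). The function name and the 8 x 8 block size are illustrative assumptions.

```python
import numpy as np

def match_bitmap_blocks(bitmap, n=8):
    """Exact (100%) block matching: return the unique n x n patterns and, for every
    block position, the index of the pattern it repeats."""
    h, w = bitmap.shape
    patterns, index_of = [], {}          # unique blocks and a lookup by content
    refs = []                            # one pattern index per block position
    for r in range(0, h - h % n, n):
        for c in range(0, w - w % n, n):
            block = bitmap[r:r+n, c:c+n]
            key = block.tobytes()        # content equality, same as XOR == 0
            if key not in index_of:      # unseen pattern: store it once
                index_of[key] = len(patterns)
                patterns.append(block.copy())
            refs.append(index_of[key])   # repeated pattern: store only a reference
    return patterns, refs

bitmap = np.random.randint(0, 2, (64, 64), dtype=np.uint8)
patterns, refs = match_bitmap_blocks(bitmap)
print(len(patterns), "unique patterns for", len(refs), "blocks")
```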
V. IMPLEMENTATION OF PROPOSED METHOD

In the proposed method scanned documents are the inputs, and a hybrid coder named AMBTC-DWT is used for the compression. The encoder works as follows; the block diagram of the proposed system is shown in Fig. 4. The encoder divides the scanned document input into non overlapping blocks.

Fig. 4: Proposed system block diagram.

For each block, the mean of the whole block, the mean of the higher grey levels and the mean of the lower grey levels are calculated. The block means are then arranged into two distinct sub images, one containing the lower grey level pixels and the other the higher grey level pixels. These sub images are processed by the lifting wavelet transform, and at the output both are combined. The size of the bitmap is reduced first by the bitmap interpolation technique and then by block based pattern matching.

The decoder works as the reverse of the encoder. Blocks are predicted using the reference blocks and position vectors, the dropped bits in the bit plane are predicted from neighbouring bits, and the IDWT is applied on the compressed sub sampled images. In the bit plane all the 1's are replaced with higher means and all the 0's are replaced with lower means. In this scheme only AMBTC is lossy; the other two methods are almost lossless, so the reconstructed version achieves the same quality as the AMBTC encoded output but with a high compression ratio, and the original image can be recovered with good quality. Fig. 5 shows the size reduction flow of the proposed system; it describes the contribution of each technique to the compression and the overall compression ratio achieved by this hybrid algorithm, which can exceed 1:42.

Fig. 5: size reduction flow.

VI. SIMULATION RESULTS

The proposed system was implemented in MATLAB and was simulated and tested using several scanned images. In our tests, different scanned pages are compressed using the existing scheme (JPEG 2000) and the proposed method, and PSNR and compression ratio are calculated for the two schemes and plotted against bits per pixel. The results show that the proposed algorithm gives a high PSNR together with a better compression ratio, and the reconstructed versions show good quality.

Fig. 6: a. original document, b. reconstructed version.

Fig. 6 shows a zoomed portion of the original and reconstructed images. There is an acceptable amount of lossy content in the reconstructed version, but it does not affect readability.

Fig. 7: PSNR comparison (PSNR for images 1-6, existing system vs. proposed system).

Fig. 7 shows the Peak Signal to Noise Ratio comparison between JPEG 2000 and the proposed system. PSNR is calculated for various images with both methods. The proposed system shows a high PSNR at high compression ratios, which reveals that the reconstructed versions have good quality and good readability. Table I shows the compression ratio for the proposed system and JPEG 2000; the proposed system shows a significant improvement over the existing system.

TABLE I: COMPRESSION RATIO COMPARISON

Input image    JPEG 2000    Proposed system
1              28.98        35.19
2              32.76        43.65
3              30.98        42.14
4              25.73        34.69
5              29.94        40.32
6              27.91        36.98
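The paper reports quality and size using PSNR and compression ratio but does not spell out the formulas. The sketch below uses the standard definitions for 8-bit images (PSNR = 10 log10(255^2 / MSE), CR = original size / compressed size) and is only an illustration of how such numbers are typically computed; it is not taken from the paper.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak Signal to Noise Ratio in dB (standard definition for 8-bit images)."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def compression_ratio(original_bytes, compressed_bytes):
    """CR = original size / compressed size (e.g. 42 corresponds to 1:42)."""
    return original_bytes / compressed_bytes

# Toy example with a synthetic 'reconstruction'
img = np.random.randint(0, 256, (64, 64)).astype(np.uint8)
rec = np.clip(img.astype(int) + np.random.randint(-3, 4, img.shape), 0, 255).astype(np.uint8)
print(f"PSNR = {psnr(img, rec):.2f} dB, CR = {compression_ratio(img.size, 97):.1f}:1")
```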
VII. CONCLUSIONS

The presented hybrid algorithm combines the simple computation and edge preservation properties of AMBTC with the high compression ratio and PSNR of DWT, and it may be implemented with significantly lower coding delay than DWT alone. The first encoding stage, AMBTC, is a lossy compression scheme; the other compression stages in the procedure are block matching and a lifting DWT. Block matching is applied on the bitmap because, for document images, the bitmap carries most of the information. The lifting based DWT used here allows us to implement reversible integer wavelet transforms: a conventional scheme involves floating point operations, which introduce rounding errors due to floating point arithmetic, while with the lifting scheme perfect reconstruction is possible for lossless compression, and it is easier to store and process integer numbers than floating point numbers. A tight block match encoding is used, i.e. no tolerance between two blocks is accepted for a match, so after AMBTC encoding the quality is not altered while the compression is increased. The method can also be applied to continuous tone images.

REFERENCES

[1] Information Technology—Coded Representation of Picture and Audio Information, Lossy/Lossless Coding of Bi-level Images, ITU-T Standard T.88, Mar. 2000.
[2] R. L. de Queiroz, "Compressing compound documents," in The Document and Image Compression Handbook, M. Barni, Ed. New York, USA: Marcel Dekker, 2005.
[3] S. Mori, C. Y. Suen, and K. Yamamoto, "Historical review of OCR research and development," Proc. IEEE, vol. 80, no. 7, pp. 1029–1058, Jul. 1992.
[4] W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standard, London, U.K.: Chapman & Hall, 1993.
[5] Information Technology—JPEG2000 Image Coding System—Part 1: Core Coding System, ISO/IEC Standard 15444-1, 2000.
[6] R. L. de Queiroz, R. Buckley, and M. Xu, "Mixed raster content (MRC) model for compound image compression," Proc. SPIE, vol. 3653, pp. 1106–1117, Jan. 1999.
[7] W. J. Chen and S. C. Tai, "Bit rate reduction techniques for Absolute Moment Block Truncation Coding," Journal of the Chinese Institute of Engineers, vol. 22, no. 5, pp. 661–667, 1999.
[8] A. R. Calderbank, I. Daubechies, W. Sweldens, and B. L. Yeo, "Wavelet transforms that map integers to integers," Appl. Comput. Harmon. Anal., vol. 5, pp. 332–369, 1998.
[9] A. Zaghetto and R. L. de Queiroz, "Scanned document compression using block-based hybrid video codec," IEEE Transactions on Image Processing, vol. 22, no. 6, Jun. 2013.
[10] S. Vimala, M. Sathya, and K. Kowsalya Devi, "Enhanced AMBTC for image compression using block classification and interpolation," International Journal of Computer Applications, vol. 51, no. 20, Aug. 2012.
[11] G. Lu and T. L. Yew, "Image compression using quad tree partitioned iterated function systems," Electronics Letters, vol. 30, no. 1, pp. 23–24, Jan. 1994.
[12] S. Kurshid Jinna and L. Ganesan, "Reversible image watermarking using bit plane coding and lifting wavelet transform," IJCSNS International Journal of Computer Science and Network Security, vol. 9, no. 11, Nov. 2009.
[13] Y. C. Hu, "Low complexity and low bit-rate image compression scheme based on Absolute Moment Block Truncation Coding," vol. 42, no. 7, pp. 1964–1975, 2003.
[14] Y. V. Ramana and H. Eswaran, "A new algorithm for BTC image bit plane coding," IEEE Trans. on Commun., vol. 32, no. 10, pp. 1148–1157, 1984.
[15] S. G. Chang, B. Yu, and M. Vetterli, "Adaptive wavelet thresholding for image denoising and compression," IEEE Transactions on Image Processing, vol. 9, no. 9, Sep. 2000.
[16] F. Belgassem, E. Rhoma, and A. Dziech, "Performance evaluation of interpolative BTC image coding algorithms," in Proc. ICSES 2008, International Conference on Signals and Electronic Systems, 14–17 Sep. 2008.
[17] O. Rioul and M. Vetterli, "Wavelets and signal processing," IEEE SP Magazine, pp. 14–37, Oct. 1991.
[18] A. Said and W. A. Pearlman, "An image multiresolution representation for lossless and lossy compression," IEEE Trans. on Image Processing, vol. 5, pp. 1303–1310, 1996.
[19] A. A. Elrowayati and Z. S. Zubi, "IBTC-DWT hybrid coding of digital images," Latest Trends in Applied Informatics and Computing.