Hybrid Compression Scheme for Scanned Documents Fathima Najeeb

International Journal of Engineering Trends and Technology (IJETT) – Volume 17 Number 10 – Nov 2014
Hybrid Compression Scheme for Scanned Documents
Fathima Najeeb #1, Rahul R Nair *2
P G Scholar, Assistant Professor, Electronics and Communication Engineering Department, M G University
Caarmel Engineering College, Pathanamthitta, Kerala, India
Abstract—Compression is used to reduce the irrelevant or
redundant data, in order to save space or transmission time.
Scanned documents are compressed to reduce storage space and
bandwidth requirements. This paper proposes a hybrid
coding method for scanned documents. The hybrid codec
combines a lossy block based compression scheme called Absolute
Moment Block Truncation Coding (AMBTC), block based
pattern matching and a lifting based discrete wavelet transform
(DWT). The AMBTC encoder outputs a bitmap and two mean values
for each block. For greyscale document images the bitmap contains
most of the information. Bitmap interpolation and a lossless block
matching scheme are applied to the bitmap to reduce its size. The
higher and lower mean sub images from the AMBTC encoder are
coded separately by the DWT. The proposed method combines the
advantages of AMBTC and DWT and achieves higher
compression with high quality reconstructed versions.
The proposed work is evaluated by Peak Signal to Noise Ratio
(PSNR) and compression ratio (CR). The results show that the
proposed method achieves a high PSNR and a high compression ratio.
Keywords—AMBTC, DWT, Scanned Document Compression,
Block based pattern matching.
Image compression algorithms aim to remove redundancy
in data in a way which makes image reconstruction possible.
This basically means that image compression algorithms try to
exploit redundancies in the data; they calculate which data
needs to be kept in order to reconstruct the original image and
therefore which data can be ’thrown away’. By removing the
redundant data, the image can be represented in a smaller
number of bits, and hence can be compressed.
Scanned documents in digital form play an important role
in everyday life. They also provide a way to store historically
important books and data. Compressing compound
documents needs more care than compressing natural
images. If a compressed image has lost some
information, the human visual system often cannot
detect the loss, or the loss does not affect the overall image
content. In a document that contains text, however, even a
small loss in quality is easily noticed by a human
observer and we can’t satisfy the users even if the data is
Compression of scanned documents can be tricky. The
scanned document is either compressed as a continuous tone
picture, or it is binarised before compression. The binary
document can then be compressed using any available two
level lossless compression algorithm (such as JBIG [1] and
JBIG2 [2]), or it may undergo character recognition [3].
Binarization may cause strong degradation to object contours
ISSN: 2231-5381
and textures, such that, whenever possible, continuous-tone
compression is preferred [2]. In single/multi-page document
compression, each page may be individually encoded by some
continuous-tone image compression algorithm, such as JPEG
[4] or JPEG2000 [5]. Multi-layer approaches such as the
mixed raster content (MRC) imaging model [6] are also
challenged by soft edges in scanned documents, often
requiring pre- and post-processing.
In this paper, we propose a novel method of encoding a
scanned document using Absolute Moment Block
Truncation Coding (AMBTC) [7], a lifting based Discrete
Wavelet Transform (DWT) [8] and block based pattern
matching [9] to achieve a significant improvement in scanned
document image compression performance. AMBTC requires very
few computations and preserves edges, but achieves only a medium
compression ratio. We investigate the optimal implementation
of DWT and block match encoding to improve the
compression without losing quality. AMBTC is an
improved version of Block Truncation Coding (BTC).
AMBTC is a block based lossy compression algorithm. The
encoded output contains one bitmap and two mean values for
each block. The size of the bitmap is reduced by bitmap
interpolation and a block matching method. Pattern based
block matching exploits the redundancies in
scanned documents. Text characters have redundancies in
their pixel elements, so a document image contains a large
amount of redundant data. These redundancies can be eliminated by
block matching methods. The two mean values of each block are the
higher mean and the lower mean. These means are arranged into
two sub images, which are compressed by means of a
lifting based DWT [19]. This removes the correlations
between the pixel values. The techniques used in the proposed
method are described as follows:
AMBTC is an improved version of BTC [9] that preserves
absolute moments rather than standard moments. As in BTC, a
digitized image is divided into blocks of n x n pixels. Each
block is quantized in such a way that the resulting block has
the same sample mean and the same sample first absolute
central moment as the original block [10]. In the AMBTC
algorithm, similar to BTC, there are four separate steps while
coding a single block of size n x n. They are quad tree
segmentation [11], bit plane omission, bit plane coding using
32 visual patterns and interpolative bit plane coding [12].
Steps for AMBTC algorithm are:
Step1: The original image is segmented into blocks of size 4x4,
8x8 or larger for processing.
Step2: Find the mean of each block, and also the means of the
higher range and the lower range.
Step3: Based on these means calculated for each block, the bit
plane is calculated.
Step4: The image block is reconstructed by replacing the 1s with
the higher mean and the 0s with the lower mean.
At the encoder side an image is divided into non-overlapping blocks. The size of each block
may be (4 x 4) or (8 x 8), etc. For ease of understanding, the
procedure is described for a 4 x 4 block; the same procedure is
followed for a block of any size. The average gray level
of the block is calculated as:
$$\bar{X} = \frac{1}{16} \sum_{i=1}^{16} X_i$$
where $X_i$ represents the ith pixel in the block. Pixels in the
image block are then classified into two ranges of values. The
upper range contains the gray levels which are greater than or
equal to the block average gray level, and the remaining pixels
fall into the lower range. The mean of the higher range $X_H$ and
of the lower range $X_L$ are calculated as:
$$X_H = \frac{1}{k} \sum_{X_i \ge \bar{X}} X_i$$
$$X_L = \frac{1}{16-k} \sum_{X_i < \bar{X}} X_i$$
where k is the number of pixels whose gray level is greater
than or equal to $\bar{X}$. A binary block, denoted by b, is also
used to represent the pixels. We use “1” to represent a pixel whose
gray level is greater than or equal to $\bar{X}$ and “0” to
represent a pixel whose gray level is less than $\bar{X}$.
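The encoding and decoding of a single block described above can be sketched in Python. This is a minimal illustration rather than the authors' MATLAB implementation: the function names are hypothetical, the block is passed as a flat list of pixel values, the means are rounded to integers, and a uniform block (where the lower range is empty) reuses the higher mean as the lower mean, which is an assumption the paper does not discuss.

```python
def ambtc_encode_block(block):
    """Encode one block (flat list of gray levels) into (bitmap, x_h, x_l)."""
    n = len(block)
    mean = sum(block) / n
    # Bitmap: 1 where the pixel is >= the block mean, 0 otherwise.
    bitmap = [1 if x >= mean else 0 for x in block]
    high = [x for x, b in zip(block, bitmap) if b == 1]
    low = [x for x, b in zip(block, bitmap) if b == 0]
    # The maximum pixel is always >= the mean, so `high` is never empty.
    x_h = round(sum(high) / len(high))
    # Assumption: for a uniform block the lower range is empty; reuse x_h.
    x_l = round(sum(low) / len(low)) if low else x_h
    return bitmap, x_h, x_l


def ambtc_decode_block(bitmap, x_h, x_l):
    """Rebuild the block: 1s become the higher mean, 0s the lower mean."""
    return [x_h if b == 1 else x_l for b in bitmap]


# A 4 x 4 block with a dark left half and a bright right half.
block = [12, 14, 200, 210,
         10, 15, 205, 198,
         11, 13, 202, 207,
         9, 12, 199, 203]
bitmap, x_h, x_l = ambtc_encode_block(block)
reconstructed = ambtc_decode_block(bitmap, x_h, x_l)
```

If each mean is stored in 8 bits (an assumption), a 4 x 4 block of 8-bit pixels costs 16 + 8 + 8 = 32 bits instead of 128 bits, i.e. 2 bits per pixel before any of the later stages are applied.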
A. Quad Tree Segmentation
The quad tree segmentation technique divides the given
image into a set of variable sized blocks using a threshold value.
Here we use the absolute difference between the higher mean
and the lower mean of the AMBTC method as the controlling value
to segment the given image [13]. Steps for quad tree
segmentation are:
Step1: Divide the image into non overlapping blocks, suppose
the block size to be m1 x m1.
Step2: For each block, if the absolute difference between the
higher mean and the lower mean is less than the mean then apply
AMBTC, else divide the block into 4 sub images.
Step3: Repeat the above step until the condition gets satisfied.
Step4: Check the above two steps for every block in the image.
The image is divided as shown below depending on the
condition that is taken into consideration.
Fig. 1 quad tree segmentation
B. Bit Plane Omission
In the bit plane omission technique, the bit plane obtained by
the AMBTC method is omitted from the encoding procedure if the
absolute difference between the higher mean $X_H$ and the
lower mean $X_L$ of that block is less than a threshold value;
only the block mean is retained. At the time of decoding, the bit
plane omitted blocks are replaced by a block filled with the
block mean.
C. Bit Plane Coding
A bit plane of a digital discrete signal (such as an image or
sound) is the set of bits having the same position in the
respective binary numbers. For example, for a 16-bit data
representation there are 16 bit planes: the first bit plane
contains the set of most significant bits and the 16th
contains the least significant bits. The
first bit plane gives the roughest but the most critical
approximation of the values of a medium, and the higher the
number of the bit plane, the smaller its contribution to the final
stage. Thus, adding a bit plane gives a better approximation.
D. Interpolative Technique
When there is a correlation among neighbouring pixels,
there is also a correlation among the neighbouring bits in the bit
plane. Based on this idea, the alternate bits in the
AMBTC bit plane are dropped and the remaining 8 bits are
preserved. The dropped bits are then predicted using the
neighbouring bits. In this technique half (8 bits) of
the bits in the AMBTC bit plane are dropped.
The output of the AMBTC encoder contains an interpolated
bitmap and two sub sampled images. The higher and lower sub
sampled images are separately encoded using a lifting discrete
wavelet transform. The wavelet transform can efficiently
encode the sub sampled images by removing the correlations
among the higher and lower mean pixels. The power of wavelets
comes from the use of multiresolution: rather than examining
entire signals through the same window, different parts of the
wave are viewed through different sized windows. The high
frequency parts of the signal use a small window to give good
time resolution, while the low frequency parts use a big
window to extract good frequency information [8].
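The interpolated bitmap mentioned above can be illustrated with a short Python sketch. The paper does not say which bits are dropped or how they are predicted, so both choices here are assumptions: bits are dropped in a checkerboard pattern, and each dropped bit is predicted by a majority vote over its kept horizontal and vertical neighbours, with ties resolved to 1. The function names are hypothetical.

```python
def drop_alternate_bits(bitmap):
    """Keep only the bits where (row + col) is even, in raster order.

    `bitmap` is a list of rows, each a list of 0/1 values.
    """
    return [bit
            for r, row in enumerate(bitmap)
            for c, bit in enumerate(row)
            if (r + c) % 2 == 0]


def restore_bitmap(kept, size=4):
    """Rebuild a size x size bitmap from the kept bits."""
    grid = [[None] * size for _ in range(size)]
    kept_iter = iter(kept)
    # Put the kept bits back on the even checkerboard positions.
    for r in range(size):
        for c in range(size):
            if (r + c) % 2 == 0:
                grid[r][c] = next(kept_iter)
    # Predict each dropped bit from its kept 4-neighbours (assumed rule).
    for r in range(size):
        for c in range(size):
            if (r + c) % 2 == 1:
                neighbours = [grid[rr][cc]
                              for rr, cc in ((r - 1, c), (r + 1, c),
                                             (r, c - 1), (r, c + 1))
                              if 0 <= rr < size and 0 <= cc < size]
                # Majority vote; a tie resolves to 1.
                grid[r][c] = 1 if 2 * sum(neighbours) >= len(neighbours) else 0
    return grid


# A 4 x 4 bit plane containing a vertical edge.
edge = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1]]
kept_bits = drop_alternate_bits(edge)
restored = restore_bitmap(kept_bits)
```

For this edge pattern the 8 kept bits are enough to restore the bit plane exactly. For arbitrary patterns the prediction is not guaranteed to be exact, which is why the choice of predictor matters.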
Wavelet transform analysis represents image as a sum of
wavelet functions with different locations and scales. Wavelet
transforms exhibit high decorrelation and energy compaction
properties. Problems like blocking artifacts and mosquito
noise are absent in wavelet based image coders, also distortion
due to aliasing is totally eliminated through design of proper
filters. These properties made wavelets suitable for image
compression. Similar to other linear transforms, wavelet
transforms reduce the entropy of an image i.e. the wavelet
coefficient matrix has lower entropy than the image itself and
hence coefficient matrix is more efficiently encoded.
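The entropy claim can be checked on a toy signal. The sketch below assumes first-order Shannon entropy as the measure and uses one level of an integer Haar transform as a stand-in wavelet; neither choice comes from the paper, the function names are hypothetical, and the result for a smooth ramp does not carry over to every image.

```python
import math
from collections import Counter


def entropy_bits(values):
    """First-order Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def haar_level(signal):
    """One level of an integer Haar transform on an even-length list."""
    even, odd = signal[0::2], signal[1::2]
    detail = [o - e for e, o in zip(even, odd)]
    # `>> 1` is floor division by 2, also for negative values.
    smooth = [e + (d >> 1) for e, d in zip(even, detail)]
    return smooth, detail


ramp = list(range(64))                 # a smooth, highly correlated signal
smooth, detail = haar_level(ramp)
original_entropy = entropy_bits(ramp)
coefficient_entropy = entropy_bits(smooth + detail)
```

For the ramp, every sample value is distinct, while half of the coefficients collapse to the same detail value, so the coefficient entropy is lower than the entropy of the signal itself.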
Wim Sweldens developed the lifting scheme for the
construction of biorthogonal wavelets [15]-[16]. The main
feature of the lifting scheme is that all constructions are
derived in the spatial domain. It does not require the complex
mathematical calculations that are required in traditional
methods. The lifting scheme is a simple and efficient algorithm to
calculate wavelet transforms. It does not depend on Fourier
transforms. The lifting scheme is used to generate second-generation
wavelets, which are not necessarily translations and
dilations of one particular function.
It started as a method to improve a given discrete
wavelet transform to obtain specific properties. Later it
became an efficient algorithm to calculate any wavelet
transform as a sequence of simple lifting steps. Digital signals
are usually sequences of integers, while wavelet
transforms result in floating point numbers. In [17],
Calderbank et al. showed how to use the lifting scheme
presented in [18], where Sweldens showed that the
computational complexity of convolution based biorthogonal
wavelet transforms can be reduced by implementing a lifting
based scheme. For an efficient reversible implementation, it is
of great importance to have a transform algorithm that
converts integers to integers. Fortunately, a lifting step can be
modified to operate on integers, while preserving the
reversibility. Thus, the lifting scheme became a method to
implement reversible integer wavelet transforms.
A lifting based DWT is applied to the sub sampled images
in order to reduce the inter pixel redundancy. As a result of the
decomposition, most of the coefficients in the high frequency,
low scale region are either zero or close to zero.
Hence, most of the significant coefficients can be extracted and
coded by applying strategies such as designing a JPEG like
quantization table or applying a threshold. The threshold parameter
value is chosen intuitively, based on experimentation and
a satisfactory visual effect in the reconstructed image, as
reported in [26]. The significance of a coefficient is directly
related to its magnitude as well as its sub band after wavelet
decomposition. The lifting based wavelet transform consists
of splitting, lifting, and scaling modules, and the wavelet
transform is treated as a prediction error decomposition. It
provides a complete spatial interpretation of the DWT. In Fig. 2,
let X denote the input signal and X_LI and X_HI be the
decomposed output signals, which are obtained through
the following three modules of the lifting based 1D-DWT:
Fig. 2 lifting based wavelet transform.
(1) Splitting: In this module, the original signal X is divided
into two disjoint parts, Xe(n) = x(2n) and Xo(n) = x(2n+1),
which denote all even-indexed and odd-indexed
samples of X, respectively.
(2) Lifting: In this module, the prediction operation P is
used to estimate Xo(n) from Xe(n), which results in an error
signal d(n) that represents the detail part of the original
signal. Then d(n) is passed through the update
operation U and the resulting signal is combined with Xe(n) to
estimate s(n), which represents the smooth part of the original
signal.
(3) Scaling: A normalization factor is applied to d(n) and
s(n), respectively. The even-indexed part s(n) is multiplied by a
normalization factor Ke to produce the wavelet sub band X_LI.
Similarly, in the odd-indexed part, the error signal d(n) is
multiplied by Ko to obtain the wavelet sub band X_HI.
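The three modules above can be sketched as one level of a reversible integer lifting transform in Python. The paper does not state which prediction and update filters are used, so this sketch assumes the integer 5/3 filter pair known from lossless JPEG 2000, repeats the edge sample at the signal boundaries, and takes the scaling factors Ke and Ko as 1 so that the transform stays in integers. The function names are hypothetical.

```python
def lifting_forward(x):
    """One level of integer 5/3 lifting on an even-length list of ints.

    Returns (s, d): the smooth and detail sub bands.
    """
    xe, xo = x[0::2], x[1::2]                      # (1) splitting
    n = len(xe)
    # (2) predict: detail = odd sample minus the floor-average of its
    # two even neighbours (edge sample repeated at the right boundary).
    d = [xo[i] - ((xe[i] + xe[min(i + 1, n - 1)]) >> 1) for i in range(n)]
    # (2) update: smooth = even sample plus a rounded quarter of the
    # two neighbouring details (edge detail repeated at the left boundary).
    s = [xe[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(n)]
    # (3) scaling is skipped: Ke = Ko = 1 is assumed for the integer case.
    return s, d


def lifting_inverse(s, d):
    """Undo lifting_forward exactly by reversing the steps."""
    n = len(s)
    xe = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(n)]
    xo = [d[i] + ((xe[i] + xe[min(i + 1, n - 1)]) >> 1) for i in range(n)]
    x = [0] * (2 * n)
    x[0::2], x[1::2] = xe, xo                      # merge even and odd samples
    return x


signal = [10, 12, 14, 16, 100, 102, 104, 106]
smooth_band, detail_band = lifting_forward(signal)
recovered = lifting_inverse(smooth_band, detail_band)
```

Because the inverse subtracts exactly what the forward pass added, using the same integer rounding, the original samples are recovered without any loss. The detail band is zero wherever the signal is locally linear and large only at the jump.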
Block based pattern matching [9] is applied to the bitmap
image. In a grey scale scanned document, the bitmap
conveys most of the information, so losses in the bitmap
details can deteriorate the complete document. Therefore an
efficient block based pattern matching compression that does not
lose data is applied to the bitmap. Fig 3 shows an example
of scanned documents and their bitmap images. From this it is
clear that a bitmap image can convey most of the information
of a grey scale document image. Since the bitmap from the
AMBTC encoder is produced block by block,
every small detail can be recovered from the
decompressed high and low mean sub images.
Fig. 3: original and bitmap images.
The block matching algorithm used here is simple and
accurate. Only hundred percent matches are considered as
matches; others are not considered. A tight matching is
applied because the bitmap is already compressed by means of
bitmap interpolation. To compare two blocks, an exclusive
OR operation between the blocks can be performed. If the result
after matching is zero, those blocks are considered as matched.
In the proposed system this block matching algorithm
involves more computations and consumes more time.
Steps for block based pattern matching for the bitmap are as
follows:
Step1: Same sized bitmap blocks are grouped together for
matching. Different block sizes are 16X16, 8X8, 4X4 etc.
Step2: Blocks are written into the encoder in order and
each block is compared with the previously entered blocks.
Step3: If a match is found the encoder writes only its position
and the identification of the reference block.
Step4: Only bitmap blocks with the same pattern are considered
as matched; blocks that differ by even a single pixel are not
considered as matched.
Step5: The compressed bitmap contains the distinct patterned blocks
and the positions of the repeated blocks.
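The steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: blocks are flat lists of bits of one common size, the position of a block is implied by its order in the output stream, and the function names and the `("new", id)` / `("match", id)` entries are hypothetical.

```python
def blocks_match(a, b):
    """Exact match test: the XOR of every pair of bits must be zero."""
    return len(a) == len(b) and all((x ^ y) == 0 for x, y in zip(a, b))


def encode_bitmap_blocks(blocks):
    """Return (references, stream) for a sequence of same sized blocks.

    `references` holds each distinct pattern once; `stream` holds one
    entry per input block naming the reference block it equals.
    """
    references = []
    stream = []
    for block in blocks:
        for ref_id, ref in enumerate(references):
            if blocks_match(block, ref):
                stream.append(("match", ref_id))   # repeated pattern
                break
        else:
            references.append(list(block))         # first occurrence
            stream.append(("new", len(references) - 1))
    return references, stream


def decode_bitmap_blocks(references, stream):
    """Rebuild every block, in order, from its reference block."""
    return [list(references[ref_id]) for _, ref_id in stream]


bitmap_blocks = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0], [0, 1, 1, 1]]
references, stream = encode_bitmap_blocks(bitmap_blocks)
decoded = decode_bitmap_blocks(references, stream)
```

The third block repeats the first, so only three patterns are stored. The fourth block differs from the first in a single bit and is therefore stored as a new pattern, which keeps this stage lossless.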
In our proposed method scanned documents are the inputs.
A hybrid coder named AMBTC-DWT is used for the
compression. The block diagram of the proposed system is shown
in Fig 4. The encoder of the proposed method works as
follows. The encoder divides the scanned document input into non
overlapping blocks.
Fig. 4: Proposed system- block diagram.
For each block, the mean of the whole block, the mean of the
higher grey levels and the mean of the lower grey levels are
calculated. The block means are then arranged into two distinct
images such that one contains the lower grey level pixels and the
other contains the higher grey level pixels. These sub
images are then processed by the lifted wavelet transform and at
the output both are summed together. The size of the bitmap is
first reduced by the bitmap interpolation technique and then by
block based pattern matching. The decoder works as the reverse of
the encoder. Repeated blocks are restored using the reference
blocks and position vectors. Dropped bits in the bit plane are
predicted from neighbouring bits. The IDWT is applied to the
compressed subsampled images. In the bit plane all the 1s are
replaced with higher means and all the 0s are replaced with lower
means. In this scheme only AMBTC is lossy and the other two
methods are almost lossless. So the reconstructed version can
achieve the same quality as the AMBTC encoded output,
with a high compression ratio. The original image can be recovered
with good quality.
Fig 5 shows the size reduction flow of the proposed system.
It describes the contribution of each technique to the
compression and the overall compression ratio that can be achieved
by this hybrid algorithm. It can achieve a compression ratio
higher than 1:42.
Fig. 5: size reduction flow
The proposed system was implemented in MATLAB and was
simulated and tested using several scanned images. In our test
different scanned pages are compressed using ADC and the
proposed method. PSNR and compression ratio are
calculated for the two schemes and the result is plotted against
bits per pixel. The results show that the proposed algorithm
gives a high PSNR with a better compression ratio.
The reconstructed version shows good quality.
Fig. 6: a. original document.
b. reconstructed version
Fig 6 shows the zoomed version of the original and
reconstructed images. The reconstructed version contains an
acceptable amount of loss, but it does not affect the readability.
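The two figures of merit used in this evaluation can be computed as in the Python sketch below. The paper does not give its formulas, so the sketch assumes the usual definitions: PSNR as 10 log10(peak^2 / MSE) with a peak value of 255 for 8-bit grey levels, and compression ratio as original size divided by compressed size. The function names are hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf                # identical images
    return 10 * math.log10(peak ** 2 / mse)


def compression_ratio(original_bits, compressed_bits):
    """Original size divided by compressed size (larger is better)."""
    return original_bits / compressed_bits


# Two pixels, each off by one grey level: MSE = 1.
example_psnr = psnr([10, 20], [11, 19])
# A 4 x 4 block of 8-bit pixels (128 bits) coded in 32 bits.
example_cr = compression_ratio(128, 32)
```

A higher PSNR means the reconstructed page is closer to the original, and a higher compression ratio means fewer bits are stored, so a codec is preferable when it raises both at once.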
Fig. 7: PSNR comparison (existing system and proposed system, image 1 to image 6)
Fig 7 shows the Peak Signal to Noise Ratio comparison
between JPEG 2000 and the proposed system. PSNR is
calculated for various images with both methods. The proposed
system shows a high PSNR at high compression ratios.
A high PSNR indicates that the reconstructed versions have
good quality and good readability. Table 1 shows the compression
ratio for the proposed system and JPEG 2000. The proposed system
shows a significant improvement over the existing system.
Table 1: Compression ratio (JPEG 2000 and proposed system)
The presented hybrid algorithm combines the simple
computation and edge preservation properties of AMBTC with the
high compression ratio and PSNR of DWT, and may be
implemented with significantly lower coding delay than DWT
alone. The first stage of encoding, AMBTC, is a lossy
compression scheme. The other compression stages involved in the
procedure are block matching and a lifting DWT. Block
matching is applied to the bitmap because, for
document images, the bitmap contains most of the information.
The lifting based DWT used here allows us to implement
reversible integer wavelet transforms. The conventional scheme
involves floating point operations, which introduce
rounding errors. With the
lifting scheme, perfect reconstruction is possible for lossless
compression. It is easier to store and process integer
numbers than floating point numbers. A tight block
match encoding is used here, i.e. blocks that differ within any
tolerance are not considered as matched blocks. So after AMBTC
encoding the quality is not altered but compression can be
increased. This method can be implemented for continuous tone
images also.
Information Technology: Coded Representation of Picture and Audio Information, Lossy/Lossless Coding of Bi-level Images, ITU-T Standard T.88, Mar. 2000.
R. L. de Queiroz, "Compressing Compound Documents," in The Document and Image Compression Handbook, M. Barni, Ed. New York, USA: Marcel Dekker, 2005.
S. Mori, C. Y. Suen, and K. Yamamoto, "Historical review of OCR research and development," Proc. IEEE, vol. 80, no. 7, pp. 1029–1058, Jul. 1992.
W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standard. London, U.K.: Chapman & Hall, 1993.
Information Technology: JPEG2000 Image Coding System, Part 1: Core Coding System, ISO/IEC Standard 15444-1, 2000.
R. L. de Queiroz, R. Buckley, and M. Xu, "Mixed raster content (MRC) model for compound image compression," Proc. SPIE, vol. 3653, pp. 1106–1117, Jan. 1999.
Wen Jan Chen and Shen Chuan Tai, "Bit rate reduction techniques for Absolute Block Truncation Coding," Journal of the Chinese Institute of Engineers, vol. 22, no. 5, pp. 661-667, 1999.
A. R. Calderbank, I. Daubechies, W. Sweldens, and B. L. Yeo, "Wavelet transforms that map integers to integers," Appl. Comput. Harmon. Anal., vol. 5, pp. 332-369, 1998.
Alexandre Zaghetto and Ricardo L. de Queiroz, "Scanned Document Compression Using Block-Based Hybrid Video Codec," IEEE Transactions on Image Processing, vol. 22, no. 6, June 2013.
S. Vimala, M. Sathya, and K. Kowsalya Devi, "Enhanced AMBTC for Image Compression using Block Classification and Interpolation," International Journal of Computer Applications (0975 – 8887), vol. 51, no. 20, August 2012.
G. Lu and T. L. Yew, "Image compression using quad tree partitioned iterated function systems," Electronic Letters, vol. 30, no. 1, pp. 23-24, Jan. 1994.
S. Kurshid Jinna and L. Ganesan, "Reversible Image Watermarking using Bit Plane Coding and Lifting Wavelet Transform," IJCSNS International Journal of Computer Science and Network Security, vol. 9, no. 11, November 2009.
Yu-Chen Hu, "Low complexity and low bit-rate image compression scheme based on Absolute Moment Block Truncation Coding," vol. 42, no. 7, pp. 1964-1975, 2003.
Y. V. Ramana and H. Eswaran, "A new algorithm for BTC image bit plane coding," IEEE Trans. on Comm., 32(10), pp. 1148-1157, 1984.
S. G. Chang, B. Yu, and M. Vetterli, "Adaptive Wavelet Thresholding for Image Denoising and Compression," IEEE Transactions on Image Processing, vol. 9, no. 9, September 2000.
F. Belgassem, E. Rhoma, and A. Dziech, "Performance evaluation of interpolative BTC image coding algorithms," ICSES 2008, International Conference on Signals and Electronic Systems, 14-17 Sep. 2008.
Rioul and M. Vetterli, "Wavelet and signal Processing," IEEE SP Magazine, pp. 14-37, Oct. 1991.
A. Said and W. A. Pearlman, "An image multiresolution representation for lossless and lossy compression," IEEE Trans. on Image Processing, vol. 5, pp. 1303-1310, 1996.
Ali A. Elrowayati and Zakaria Suliman Zubi, "IBTC-DWT Hybrid Coding of Digital Images," Latest Trends in Applied Informatics and Computing.