International Journal of Applied Engineering Research ISSN 0973-4562 Volume 11, Number 7 (2016) pp 4849-4856
© Research India Publications. http://www.ripublication.com
Detection and Removal of Graphical Components in Pre-Printed Documents
N. Shobha Rani
Department of Computer Science, Amrita Vishwa Vidyapeetham,
Amrita University, Mysuru campus, Karnataka, India.
Vineeth P
Department of Computer Science, Amrita Vishwa Vidyapeetham,
Amrita University, Mysuru campus, Karnataka, India.
Deeptha Ajith
Department of Computer Science, Amrita Vishwa Vidyapeetham,
Amrita University, Mysuru campus, Karnataka, India.
Abstract
Pre-processing is one of the most intensive operations in the analysis of pre-printed document images, because the recognition of text in such documents is highly sensitive to the graphical components that coexist with it. In this paper we address the detection and removal of graphical components such as logos, emblems and other symbolic entities, so that the subsequent stages of Optical Character Recognition (OCR) can proceed with fewer errors. Graphical entities are detected using Zernike moments and histogram of gradient (HOG) features. Lines are then detected and removed by masking the image with a vertical line structuring element and computing the region covered by the convex hull within the area of the structuring element. The line detection step also addresses characters that overlap lines, so that such characters are retained when the lines are eroded from the image. In our experiments the emblem detection algorithm achieves an accuracy of around 97%, and line detection and removal achieves around 92%.
Keywords: emblem detection, graphical components, pre-printed documents, line detection, moments and HOG features.
Introduction
Efficient optical recognition systems enhance the document image before processing the Region of Interest (ROI). Document images fall into varied categories, ranging from simple text to documents with complex gradient details. Simple text documents are composed of either printed or handwritten text, whereas some documents contain both. There are also hybrid documents that consist of both graphical and textual components; documents of this type are termed pre-printed documents [1]. Pre-printed documents are printed by government or private organizations according to the variety of job requirements associated with their tasks. These documents have a pre-structured layout indicating the various fields for data entry. They also carry information such as the company name, purpose, captions, logos and symbolic entities indicating details of the organization, department, etc. In most documents these graphical elements are overlaid with text during data entry.
The detection and removal of all these graphical elements may lead to error-free subsequent processing, that is, segmentation, feature extraction and classification, and finally to accurate character recognition by OCR. Text consists of very fine gradient information that is sensitive to noisy content in the image. When textual portions are bounded or overlaid by graphical entities such as horizontal or vertical lines, logos, symbols or photos, accurately resolving the textual components becomes harder. Accurate resolution of textual components is mostly achieved on purely textual images. It is therefore important to have the image free from all the graphical entities mentioned above.
The present work focuses on pre-processing of pre-printed document images. The pre-printed documents used in this work come from the Anantapur district of Andhra Pradesh state. Figure 1 depicts one such pre-printed document.
Figure 1: The pre-printed document
The pre-printed document in Figure 1 consists of printed components, handwritten components and graphical components such as horizontal or vertical edges, symbols and logos. The graphical entities that we propose to work on are depicted in Figure 2.
Figure 2: Types of graphical entities
The graphical entities in Figure 2 may obstruct text recognition, which requires separating the graphical entities from the textual portions. It is thus crucial to detect and remove them. Numerous studies in the literature address pre-processing of document images; moreover, pre-printed documents differ from one type of organization to another. Some of the studies reviewed are discussed below.
Gatos et al. [2] proposed an algorithm for automatic table detection in documents using line length and line width estimation with edge detection operators. Yefeng et al. [3] contributed an algorithm to detect severely broken parallel lines in handwritten document images based on a directional single connected chain method with three parameters: skew angle, vertical line gap and vertical translation. The experiments produced results of around 94% for Arabic documents. Shobha et al. [4] proposed a generic line elimination methodology for removing horizontal grid-like structures from application form images using a circular structuring element, and achieved an accuracy of more than 90%. Ping et al. [5] proposed a face detection system using hybrid feature extraction, in which three sets of face features are extracted; the system achieved an accuracy of 95%. Bilal et al. [6] proposed an adaptive local binarization method for document images, evaluated with two types of experiment: a visual experiment that provides a clear idea of the image, and an analytical test that provides a statistical measurement based on a benchmark dataset and evaluation measure. The method is reported to give the best performance.
Subhadip et al. [7] developed a framework based on the Hough transform for recognizing postal codes in Latin, Devanagari, Bangla and Urdu script from multi-script postal address blocks; this work achieved around 98% postal-code localization accuracy. Manjunath et al. [8] described a study of robust text detection in color and regular images. The first stage combines the wavelet transform and a Gabor filter to extract sharpened edges and textural features of the image; in the second stage wavelet entropy is applied to obtain further experimental values. They achieved 97.9% accuracy. Battista et al. [9] presented a comprehensive survey and categorization of the computer vision and pattern recognition techniques proposed against image spam, with an experimental analysis and comparison of some of them on real, publicly available data sets. Alvaro et al. [10] contributed a robust method to localize and recognize text in natural images using a connected-component (CC) based approach that extracts and discards basic letter candidates using a series of features that are easy and fast to compute. Rohan et al. [11] presented a fully automated way to detect brain tumors: a bounding box method using symmetry locates the tumor, and they achieved good accuracy.
Aswini et al. [12] proposed a system that uses SURF to extract local features from logos and to match them, resulting in a simple and compact SURF algorithm. Mrunalinee et al. [13] developed an improved approach for logo detection and recognition using SIFT and CDS to extract features and match the logo image. Amrapali et al. [14] extended the CDS method into a scalable and highly effective method for logo detection. Firoj et al. [15] used bounding boxes obtained by morphological dilation for the segmentation of Arabic words; they tested the methods on documents in Arabic script and obtained encouraging results. Victor et al. [16] discussed how the bounding box can be used to impose a powerful topological prior, which prevents the solution from excessive shrinking and ensures that the user-provided box bounds the segmentation sufficiently tightly. Thawar et al. [17] evaluated three kinds of moments (geometrical, Zernike and Legendre) for classifying 3D object images using a nearest neighbor classifier. Subhajit et al. [18] proposed an efficient algorithm for recognizing palm prints for biometric identification of individuals using complex Zernike moments, which are constructed from a set of complex polynomials. Jyotsnarani et al. [19] presented a reconstruction of the basic characters in Oriya text, which can handle different font sizes and font types, using Hu's seven moments and Zernike moments. Dipti et al. [20] discussed the form image registration, image masking and image improvement techniques implemented in their system as part of the character image extraction process.
To the best of our knowledge, the works reported in the literature focus on graphical component detection specific to the type of documents considered in their research, and none of them addresses emblem detection for e-seva documents from the regions of Andhra Pradesh state. We therefore propose to work on the detection of emblems and lines inherent in pre-printed e-seva documents.
Proposed Methodology
The proposed methodology for detection and removal of graphical components in pre-printed documents comprises two stages. Stage one processes graphical components such as emblems and logos. Stage two handles the horizontal and vertical lines overlaid with the text. The block diagram of the proposed methodology is depicted in Figure 3.
Figure 3: Block diagram of graphical entity detection system
The two subsections on methodology describe logo and emblem detection and the detection of horizontal and vertical lines, respectively.
Figure 4: Flow chart of proposed algorithm
Detection of graphical entities - logos and emblems
Stage 1 addresses the detection and removal of graphical entities such as emblems and logos from pre-printed application form images. Removing emblems from application form images reduces the computational conflicts during segmentation and classification of characters and leads to more accurate recognition by OCR. The algorithm first prompts the user for a pre-printed document image as input. The acquired input is pre-processed to obtain a transformed and enhanced binary image. The binary image is then processed to connect all the broken gradient details using the morphological bridge operation [21].
Emblems are detected by tracking all the objects in the image with bounding boxes and filtering them further to identify only the required graphical entities. Finally, the Histogram of Gradient (HOG) and Zernike moment features are computed to detect the bounding boxes that contain emblems or logos. Figure 4 depicts the flow chart of the algorithm for emblem or logo detection.
Let $D_i$ denote a pre-processed image from which the various objects are to be captured. The objects in $D_i$ are captured using the bounding box construct, a rectangle that encloses a set of fully connected pixels up to its borders. Each set of fully connected pixels represents an object, which may be a graphical component such as an emblem or logo, or a character image.
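The bounding box construct can be illustrated with a short sketch. The Python code below is an illustration under stated assumptions, not the authors' implementation: the binary image is taken to be a list of rows with 1 for foreground pixels, objects are 8-connected, and the function name `bounding_boxes` is hypothetical. Each box is reported as (top, left, bottom, right) in pixel indices.

```python
from collections import deque


def bounding_boxes(binary):
    """Return (top, left, bottom, right) for each 8-connected foreground object."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over the 8-neighbourhood of each pixel.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy image (assumed data): a 2 x 2 blob and a separate vertical stroke.
page = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
print(bounding_boxes(page))
```

On a real scanned form the same idea would normally be delegated to an image-processing library; the pure-Python version is only meant to make the notion of an object and its enclosing rectangle concrete.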
Let $Obj_1, Obj_2, Obj_3, \ldots, Obj_n$ be the objects captured by applying bounding boxes to the pre-processed image $D_i$. Each object $Obj_i$ is interpreted to identify whether $Obj_i \in Class(Ch)$ or $Obj_i \in Class(Gc)$, where $i = 1, 2, 3, \ldots, n$, and $Class(Ch)$ and $Class(Gc)$ represent the sets of objects with textual components and graphical components, respectively. Figures 5 and 6 present the pre-processed image and the objects captured within the image using bounding boxes.
Once the objects are captured in the image, each object $Obj_i$ is inspected to check whether it has a maximum area bounding box. Maximum area bounding boxes occur for objects containing graphical entities such as logos, photos and emblems; these are termed max area objects $M_{obj}$, as shown in Figure 7.
Figure 7: maximum area objects
The max area objects $M_{obj}$ are filtered from the other objects, i.e., objects with textual components. If $H$ is the filter applied to each object, then the filtering of max area objects is given by equation (1):
$$M_{obj}(i) = H(Obj_i) \tag{1}$$
The filter $H$ is a transformation that decides whether an object is a max area object. It is associated with a criterion given by equation (2).
(2)
The filter returns the top two maximum length bounding boxes, which are usually called nested objects. The outcome of the filtering transformation is shown in Figure 8.
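A minimal sketch of the filtering step follows. Because the criterion of equation (2) is not reproduced here, the code assumes that boxes are ranked by area and that the two largest are kept; `box_area` and `filter_max_area` are hypothetical names, and boxes use the (top, left, bottom, right) convention.

```python
def box_area(box):
    """Area in pixels of a (top, left, bottom, right) box with inclusive bounds."""
    top, left, bottom, right = box
    return (bottom - top + 1) * (right - left + 1)


def filter_max_area(boxes, keep=2):
    """Assumed stand-in for the filter H: keep the `keep` largest boxes by area."""
    return sorted(boxes, key=box_area, reverse=True)[:keep]


# Assumed example boxes: an emblem outline, its inner detail, and two characters.
boxes = [(0, 0, 9, 9), (2, 2, 7, 7), (0, 20, 3, 22), (12, 0, 13, 1)]
print(filter_max_area(boxes))
```

Ranking by area is one plausible reading of "maximum area bounding box"; ranking by box length or width would only change the `key` function.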
Figure 5: Pre-processed image
Figure 6: Image with objects detected using bounding boxes
Figure 8: The result of filtering transformation
Once the nested objects are detected, each nested object $N_{obj}$ undergoes the concatenation transformation, which converts a nested object into a simple object. The concatenation transformation $CT$ combines all the smaller bounding boxes into a single bounding box of maximum length and width, as given by equation (3):
$$Obj_i = CT(N_{obj}) \tag{3}$$
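The concatenation transformation can be sketched as the union of the nested boxes. This reading, in which $CT$ returns the smallest rectangle enclosing every nested box, is an assumption, and `concatenate_boxes` is a hypothetical name.

```python
def concatenate_boxes(boxes):
    """Merge (top, left, bottom, right) boxes into the smallest box enclosing all."""
    tops, lefts, bottoms, rights = zip(*boxes)
    return (min(tops), min(lefts), max(bottoms), max(rights))


# Assumed nested object: an outer outline and an inner part that sticks out.
nested = [(0, 0, 9, 9), (2, 2, 7, 12)]
print(concatenate_boxes(nested))
```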
Figure 9 shows the simple object $Obj_i$ obtained after applying the concatenation transformation $CT$.
Figure 9: Result of concatenation transformation CT
The "phi" values computed for the emblem and the other graphical components are plotted in Figure 10.
The transformed nested objects are forwarded for classification into the various graphical entities, which include logos, photos and emblems. Classification in the proposed methodology uses histogram of gradient (HOG) [22] and Zernike moment features, and is performed by thresholding the extracted feature values. Once the biggest bounding box is obtained after the concatenation transformation, the algorithm verifies that it has detected a logo and nothing else. This detection stage relies on the moment and HOG values. From the experimental results (refer to Table 1) for the fourth order Zernike moment, the degree of rotation is negative (i.e., anti-clockwise) for emblems. Similarly, the HOG descriptor is used for identifying the "e-seva" emblem: the proposed algorithm finds the range of HOG values for this specific type of emblem from a set of 30 emblems, and this range is then used for further classification.
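The thresholding rule described above can be sketched as follows. Several points are assumptions made for illustration: the HOG descriptor is summarised by a single scalar, the learned range is taken as the minimum and maximum over the training emblems, both conditions must hold together, and the names `learn_hog_range` and `is_emblem` as well as all sample values are hypothetical.

```python
def learn_hog_range(sample_values):
    """Assumed rule: the accepted range spans the values seen on known emblems."""
    return min(sample_values), max(sample_values)


def is_emblem(phi_degrees, hog_value, hog_range):
    """Accept when the Zernike phase is negative and the HOG value is in range."""
    low, high = hog_range
    return phi_degrees < 0 and low <= hog_value <= high


# Hypothetical HOG summaries of known emblems (the source uses a set of 30).
training_values = [0.22, 0.31, 0.27]
hog_range = learn_hog_range(training_values)

print(is_emblem(-42.0, 0.27, hog_range))  # negative phase, HOG inside the range
print(is_emblem(15.0, 0.27, hog_range))   # positive phase
print(is_emblem(-42.0, 0.50, hog_range))  # HOG outside the range
```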
Figure 10: Features of Zernike moments in degrees
Here a negative value indicates that the graphical component is an emblem, whereas a positive value indicates another graphical component.
HOG Descriptors
The main objective of histograms of oriented gradients (HOG) is object detection. The basic idea is that local shape information is often well described by the distribution of intensity gradients or edge directions, even without precise information about the location of the edges themselves. HOG features differ greatly between a bounding box containing an emblem and a bounding box containing simple text, so we employ HOG features in our work.
The HOG features computed for various types of emblems show a great dissimilarity between graphical and non-graphical components. Table 2 gives an overview of the HOG descriptor features of the various graphical components detected.
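To make the idea of a gradient orientation distribution concrete, the sketch below computes a single magnitude-weighted orientation histogram for a small grayscale patch. It is a simplification of the full HOG descriptor of [22], which additionally divides the window into cells and normalises over overlapping blocks; the 9 unsigned bins, the centred differences and the name `orientation_histogram` are assumptions of this sketch.

```python
import math


def orientation_histogram(gray, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 deg)."""
    rows, cols = len(gray), len(gray[0])
    hist = [0.0] * bins
    width = 180.0 / bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gray[r][c + 1] - gray[r][c - 1]  # centred horizontal difference
            gy = gray[r + 1][c] - gray[r - 1][c]  # centred vertical difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle // width), bins - 1)] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))  # L2 normalisation
    return [v / norm for v in hist] if norm else hist


# Assumed patch with one vertical edge: all gradient energy falls in one bin.
patch = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
hist = orientation_histogram(patch)
print([round(v, 3) for v in hist])
```

A text region mixes strokes in many directions and so spreads its energy over several bins, which is the kind of difference the thresholding step relies on.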
Zernike Moments
In general, moments describe numeric quantities measured at some distance from a reference point or axis. The main advantages of Zernike moments are better accuracy and simple rotation invariance. Zernike moments are used here to find the "phi" value (degree of rotation) of the emblem detected in the application form images. Table 1 shows a few observed phi values.
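The phase angle of a Zernike moment can be computed directly from its definition, $Z_{nm} = \frac{n+1}{\pi} \sum f(x, y)\, R_{nm}(\rho)\, e^{-jm\theta}$ over the unit disk. The sketch below is a plain-Python illustration under several assumptions: the image is square and mapped onto $[-1, 1] \times [-1, 1]$ with the y axis pointing up, the repetition $m = 2$ is chosen arbitrarily because the text specifies only the order $n = 4$, and the function names are hypothetical.

```python
import cmath
import math


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho); needs n - |m| even and |m| <= n."""
    m = abs(m)
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def zernike_phase_degrees(gray, n=4, m=2):
    """Phase angle in degrees of the Zernike moment Z_nm of a square image."""
    size = len(gray)
    moment = 0j
    for r in range(size):
        for c in range(size):
            # Map pixel centres onto [-1, 1] x [-1, 1]; row 0 is the top (y > 0).
            x = (2 * c + 1 - size) / size
            y = (size - 1 - 2 * r) / size
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue  # only pixels inside the unit disk contribute
            theta = math.atan2(y, x)
            moment += gray[r][c] * radial_poly(n, m, rho) * cmath.exp(-1j * m * theta)
    moment *= (n + 1) / math.pi * (2.0 / size) ** 2  # pixel area in unit-disk units
    return math.degrees(cmath.phase(moment))


# Assumed toy image: a single bright pixel up and to the right of the centre.
image = [[0] * 5 for _ in range(5)]
image[1][3] = 1
print(round(zernike_phase_degrees(image), 3))
```

The sign of the returned angle depends on the axis orientation and on the chosen repetition $m$, so any threshold on it has to be calibrated with the same conventions that produced the reference values.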
Table 2: HOG descriptor values
Table 1: Phase angle of the moment in degree
Figure 11: HOG descriptor features
Figure 11 shows the graphical representation of the HOG descriptor. After an emblem is detected in an application form image, its pixels are converted into background pixels, which removes it from the image. The result of the proposed algorithm on application form images is shown in Figure 12.
Figure 13: Detection of horizontal and vertical lines and removal
To remove a vertical line from an application form image, the mask moves through the identified row with the origin of the mask as the target pixel. The 2 x 11 mask determines the presence of black pixels; if a continuous run of black pixels longer than 20 percent of the row length is encountered to the right of the target pixel, the target pixel is replaced with background. The same method is repeated with the 11 x 2 mask for the removal of horizontal lines.
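The run-length rule in the paragraph above can be sketched in a few lines. This is a simplified illustration rather than the masked procedure itself: it works on whole rows instead of a sliding 2 x 11 or 11 x 2 mask, it assumes black pixels are stored as 1 and background as 0, it does not try to retain characters that overlap a line, and the function names are hypothetical. Only the 20 percent threshold is taken from the text.

```python
def remove_horizontal_lines(binary, min_fraction=0.2):
    """Clear runs of black (1) pixels longer than min_fraction of the row length."""
    out = [row[:] for row in binary]
    for row in out:
        limit = min_fraction * len(row)
        start = None
        # The extra iteration (c == len(row)) closes a run that touches the edge.
        for c in range(len(row) + 1):
            black = c < len(row) and row[c] == 1
            if black and start is None:
                start = c
            elif not black and start is not None:
                if c - start > limit:
                    row[start:c] = [0] * (c - start)  # replace with background
                start = None
    return out


def remove_vertical_lines(binary, min_fraction=0.2):
    """Vertical lines are horizontal runs of the transposed image."""
    transposed = [list(col) for col in zip(*binary)]
    cleaned = remove_horizontal_lines(transposed, min_fraction)
    return [list(row) for row in zip(*cleaned)]


# Assumed toy form: one ruled line and a few short strokes of text.
form = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 0, 1, 1, 0],
]
for row in remove_horizontal_lines(form):
    print(row)
```

Short strokes survive because their runs stay below the threshold, while the ruled line is replaced with background along its full length.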
Experimental Analysis
The experimental analysis of the proposed system is conducted on a dataset of around 80 pre-printed documents collected from the e-seva centers of Andhra Pradesh. Accuracy is defined individually for stage 1 and stage 2. The accuracy of emblem detection is the ratio of the number of emblems correctly detected, $D_c$, to the total number of graphical components originally detected, $D$, as given by equation (4):
$$\text{Accuracy} = \frac{D_c}{D} \tag{4}$$
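The accuracy measure is a plain ratio and can be computed as below. The counts in the example are hypothetical and are not results from the experiments; the helper name `accuracy` is likewise an assumption.

```python
def accuracy(correct, total):
    """Fraction of correctly detected items, e.g. D_c / D for emblem detection."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


# Hypothetical counts, for illustration only.
print(f"{accuracy(45, 50):.1%}")
```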
The accuracy in stage 2 is the ratio of the number of lines detected correctly, $L_c$, to the total number of lines present in the image, $L$, as given by equation (5):
$$\text{Accuracy} = \frac{L_c}{L} \tag{5}$$
The experimental outcomes of the proposed system are depicted in Figure 14.
Figure 12: Application form image before and after applying the algorithm
Detection of horizontal and vertical lines
In this second stage, the proposed algorithm focuses on the detection and removal of horizontal and vertical lines from the pre-printed application form images. The application form images first undergo pre-processing operations such as binarization and noise reduction. In the binarized image, the count of continuous black pixels locates the position of a horizontal or vertical line.
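Locating lines by counting continuous black pixels can be sketched as follows. The code assumes black pixels are stored as 1, flags a row or column when its longest run exceeds a fraction of the image width or height, and uses the hypothetical names `longest_run` and `find_line_positions`; the fraction used in the example is chosen only to suit the toy image.

```python
def longest_run(values):
    """Length of the longest run of consecutive black (1) pixels."""
    best = current = 0
    for v in values:
        current = current + 1 if v == 1 else 0
        best = max(best, current)
    return best


def find_line_positions(binary, min_fraction=0.2):
    """Return (row indices, column indices) that contain a long black run."""
    rows, cols = len(binary), len(binary[0])
    horizontal = [r for r in range(rows)
                  if longest_run(binary[r]) > min_fraction * cols]
    columns = list(zip(*binary))  # column-wise view of the image
    vertical = [c for c in range(cols)
                if longest_run(columns[c]) > min_fraction * rows]
    return horizontal, vertical


# Assumed toy grid: one horizontal line, one vertical line, one isolated pixel.
grid = [
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
print(find_line_positions(grid, min_fraction=0.5))
```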
Rectangular element mask
After the horizontal and vertical lines are identified, a masking operation is applied using a rectangular structuring element. An 11 x 2 rectangular element mask with its middle row as the target is used to detect horizontal lines, and in the same way a 2 x 11 rectangular element mask with its middle column as the target is used to detect vertical lines. Figure 13 gives an overview of the algorithm based on the rectangular structuring element.
Figure 14: Accuracy of proposed methodology
Conclusion
The proposed algorithm for emblem detection employs bounding boxes for the detection of objects, and Zernike moment and HOG descriptor features for the detection and removal of emblems in application form images. Bounding boxes detect the objects in the application form images efficiently, and the features employed are consistent and adequate for detecting objects containing emblems. For the specific "e-seva" emblem the proposed algorithm works with high efficiency, and for the other graphical components it provides satisfactory results. The algorithm proposed for line detection and removal works efficiently for both horizontal and vertical lines. In future, the work can be extended to remove lines where text overlaps them. The proposed work is applicable to pre-printed documents in various languages. Dynamic selection of the detection threshold values for emblem and line detection can be considered as future work.
References
[1] Akram, S., Dar, M.D. and Quyoum, A., 2010. "Document Image Processing - A Review". International Journal of Computer Applications, 10(5), pp.35-40.
[2] Gatos, B., Danatsas, D., Pratikakis, I. and Perantonis, S.J., 2005. "Automatic table detection in document images". In Pattern Recognition and Data Mining (pp. 609-618). Springer Berlin Heidelberg.
[3] Zheng, Y., Li, H. and Doermann, D., 2003, August. "A model-based line detection algorithm in documents". In Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on (pp. 44-48). IEEE.
[4] Shobha Rani N, Vasudev T., 2014. "A Generic Line Elimination Methodology using Circular Masks for Printed and Handwritten Document Images". Proceedings of second international conference on emerging research in computing, information, communication and applications, ERCICA, ISBN: 9789351072638.
[5] Zhang, P. and Guo, X., 2012. "A cascade face recognition system using hybrid feature extraction". Digital Signal Processing, 22(6), pp.987-993.
[6] Bataineh, B., Abdullah, S.N.H.S. and Omar, K., 2011. "An adaptive local binarization method for document images based on a novel thresholding method and dynamic windows". Pattern Recognition Letters, 32(14), pp.1805-1813.
[7] Basu, S., Das, N., Sarkar, R., Kundu, M., Nasipuri, M. and Basu, D.K., 2010. "A novel framework for automatic sorting of postal documents with multi-script address blocks". Pattern Recognition, 43(10), pp.3507-3521.
[8] Aradhya, V.M., Pavithra, M.S. and Naveena, C., 2012. "A robust multilingual text detection approach based on transforms and wavelet entropy". Procedia Technology, 4, pp.232-237.
[9] Biggio, B., Fumera, G., Pillai, I. and Roli, F., 2011. "A survey and experimental evaluation of image spam filtering techniques". Pattern Recognition Letters, 32(10), pp.1436-1446.
[10] González, Á. and Bergasa, L.M., 2013. "A text reading algorithm for natural images". Image and Vision Computing, 31(3), pp.255-274.
[11] Kaus, M.R., Warfield, S.K., Nabavi, A., Black, P.M., Jolesz, F.A. and Kikinis, R., 2001. "Automated segmentation of MR images of brain tumors". Radiology, 218(2), pp.586-591.
[12] C. Aswini, D. Chitra, 2014. "Enhanced Logo Matching and Recognition using SURF Descriptor". International Journal of Engineering Research & Technology (IJERT), Vol. 3.4, ISSN: 2278-0181.
[13] Prof. Mrunalinee Patole, Meera Sambhaji Sawalkar, 2014. "Improved approach for logo detection and recognition". International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), Vol. 3.6, ISSN 2278-6856.
[14] Amrapali A. Dudhgaonkar, Prof. N.N. Thune, 2014. "Novel and Scalable Solution for Logo Detection and Recognition using CDS method". International Journal of Engineering Research & Technology (IJERT), Vol. 3.6, ISSN: 2278-0181.
[15] Parwej, F., 2013. "A Perceptive Method for Arabic Word Segmentation using Bounding Boxes by Morphological Dilation". International Journal of Computer Applications, 71(1).
[16] Lempitsky, V., Kohli, P., Rother, C. and Sharp, T., 2009, September. "Image segmentation with a bounding box prior". In Computer Vision, 2009 IEEE 12th International Conference on (pp. 277-284). IEEE.
[17] Arif, T., Shaaban, Z., Krekor, L. and Baba, S., 2009. "Object classification via geometrical, zernike and legendre moments". Journal of Theoretical and Applied Information Technology, 7(1), pp.31-37.
[18] Karar, Subhajit, and Ranjan Parekh, 2012. "Palm Print Recognition using Zernike Moments". International Journal of Computer Applications, 55.16.
[19] Tripathy, J., 2010. "Reconstruction of oriya alphabets using Zernike moments". International Journal of Computer Applications (0975-8887), 8(8).
[20] Deodhare, D., Suri, N.R. and Amit, R., 2005. "Preprocessing and Image Enhancement Algorithms for a Form-based Intelligent Character Recognition System". IJCSA, 2(2), pp.131-144.
[21] Dougherty, Edward R., Roberto A. Lotufo, and The International Society for Optical Engineering SPIE, 2003. "Hands-on morphological image processing". Vol. 71. Washington: SPIE Optical Engineering Press.
[22] Dalal, N. and Triggs, B., 2005, June. "Histograms of oriented gradients for human detection". In Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on (Vol. 1, pp. 886-893). IEEE.