International Journal of Engineering Trends and Technology (IJETT) – Volume 24 Number 1- June 2015
An Approach of K-Means and ART Network for Character Recognition
Ankush Goyal¹, Shallu²
¹ Asst. Prof. (CSE), Sri Ram College of Engg., Palwal, India
² M.Tech (CSE), Sri Ram College of Engg., Palwal, India
Abstract- The main purpose of a character recognition system is to classify digital and optical patterns so that the corresponding alphanumeric characters are obtained. Recognition is carried out through a series of operations such as segmentation, feature extraction and classification. Scanning also covers human-written characters and text so that the characters can be detected effectively. The presented work has three stages. In the first stage, the image is improved by removing noise. In the second stage, feature extraction is performed using the K-Means approach to identify the character ROI and the feature points. In the final stage, image classification and recognition are performed using the ART network approach. The results obtained from the system show an effective recognition rate.
Keywords: OCR, KNN, ART Network, Feature Extraction
I. INTRODUCTION
Much human work that was once done in writing is now performed on computer systems. Character recognition has grown into application areas such as reading signatures on cheques, reading vehicle number plates and reading electricity meters. The broad areas associated with handwritten character recognition include reading characters from printed media and converting them to textual form, recognizing characters and textual information present on printed media, and enhancing the digital representation of characters. Character recognition is one of the applications of neural networks.
Figure 1 : Applications of Neural Network
Image compression needs information for processing, and neural networks can receive and process a wide range of information at once. Character recognition is a widely used field for the recognition of digital and handwritten characters, and neural networks help in recognizing them. Feature extraction is concerned with extracting information from data or images, and the multilayer perceptron neural network is highly useful in this field. Classification can be done on the basis of different patterns, and neural networks provide various architectures, such as ART, for this purpose.
Optical character recognition (OCR) is the commonly used term for character recognition, the conversion of digital or handwritten images into computer-readable form. It is a field of research in pattern recognition, artificial intelligence and machine vision. The goal of OCR is to classify optical patterns (often contained in a digital image) corresponding to alphanumeric or other characters. The process of OCR involves several steps, including segmentation, feature extraction and classification.
OCR works as its name suggests: it recognizes characters in a document that has been scanned into a computer. Optical Word Recognition (OWR), on the other hand, recognizes words rather than characters. OWR accomplishes this by comparing and contrasting the results of several OCR engines, from which it evaluates and then identifies each word. In our studies, whose results we will review, OWR has proved more effective than OCR. Intelligent Character Recognition (ICR) can recognize and extract printed handwritten characters as well as cursive handwritten characters. An ICR system does not give highly accurate results every time, because handwritten characters vary in style, font and cursiveness. Every individual has their own style of writing, which makes it difficult to recognize all characters with the same accuracy in the same system. Most commonly used ICR software includes a self-learning system that trains itself on new inputs and adapts automatically to different inputs. Intelligent Word Recognition (IWR) works on handwritten words or phrases instead of character by character.
IWR technology matches handwritten words to a user-defined dictionary, significantly reducing the character errors encountered in typical character-based recognition engines.
Figure 2 : Applications of Character Recognition
Process automation is an application area concerned with controlling a particular process. The general approach is to gather all the available information and use the postcode as a redundancy check. Signature verification and identification is useful for banking purposes. The identity of the writer is established without reading the handwriting; the pattern to be matched is simply a signature, compared against the signatures collected in a database. Automatic cartography helps in recognizing characters from maps, where graphics and symbols are mixed and different fonts and styles can be present. Automatic number plate readers are used mainly for vehicles. Here the input image must be captured by a fast camera, and it is unlike other bilevel images, which makes recognition complex.
II. RELATED WORK
Character recognition involves image processing and is also an important part of neural networks. The work already done by different researchers in this area is discussed in this section.
Tim J. Klassen [1] presented an effective recognition process for Arabic characters. The author defined the work for online and offline character recognition, presented a SOM-based heuristic approach to perform feature analysis on online data so that effective recognition is obtained, and presented a genetic approach to improve the recognition process.
Yuefeng Chen [2] defined an artificial immune system based approach to handwritten character recognition. The author analyzed the optimization of rate and time for recognition. The approach is based on the biological principle, with memory cell based analysis. The author presented experiments on a UCI dataset. The adaptive algorithm provided by the author improved the speed and accuracy, and it was able to handle pattern recognition and abnormality detection.
S. Nagaprasad [3] presented a data mining based neural network model for soil image classification and processing. In this paper the author implemented spatial image processing mining for soil classification using diversified domains such as digital image processing, neural networks and soil fundamentals. The three most important algorithms used in the implementation are the Back Propagation Network (BPN), Adaptive Resonance Theory 1 (ART) and Simplified Fuzzy ARTMAP, for soil classification as well as spatial image recognition. The author is further extending the research by combining visual data mining with spatial data mining algorithms, such as spatial clustering, spatial association rules and self-organizing maps, in order to detect patterns in the data even more effectively.
Dan C. Ciresan [4] defined a flexible, high-performance neural network approach for image classification. The author presents a fast, fully parameterizable GPU implementation of Convolutional Neural Network variants. The presented feature extractors are neither carefully designed nor pre-wired, but rather learned in a supervised way.
Munish Kumar [5], in 2011, presented KNN-based handwritten Gurmukhi character recognition. In this work, information about the character is first extracted by creating a skeleton of the character. Character features in terms of diagonal and transition features are computed, and the Euclidean distance is calculated to find the nearest neighbor. The presented work showed an accuracy of 94.12% in recognition.
Puttipong Mahasukhon [6] presented fuzzy theory based hand-printed English character recognition. The work is divided into two main stages, feature extraction and pattern recognition. Position, size and shape are the parameters that create variation in recognition. The system was tested on the 26 lowercase hand-printed English characters from different writers.
Nadine Hajj [7], in 2012, presented a system for isolated letter handwriting recognition. The work is organized in two stages: feature extraction using pen trajectory modeling, and classification using Support Vector Machines (SVM). The best recognition rate achieved was 89.15%, using k-nearest neighbor with k=3 and Dynamic Time Warping.
R. Arnold [8] proposed work using MATLAB's Neural Network Toolbox to recognize printed and handwritten characters by projecting them onto grids of different sizes. The character recognition match depends on the resolution of the character projection, and it was found that this resolution is necessary for evaluating the match. The results showed that not every writing style can be recognized using the same network with the same precision.
Another author, E.J. Bellagarda [9], showed work using a bank of multilayer feedforward neural networks for handwritten character recognition. He used preclassification on the segmentation concept, which is taken as the basic building block of handwriting. The second element is the connectionist approach: a set of parallel networks is used instead of a single network. The results are evaluated on the basis of similarly shaped characters and upper case characters in a discrete manner.
K. Toscano [10] worked on the recognition of cursive handwriting and tested the system's ability to work like a human being. Feature extraction is done using SALOM, a natural spline function, and the steepest descent method is used for optimization. The recognition phase had two sub-phases, global feature classification and local feature classification.
Another work, on dictionary based analysis to perform character recognition, was done by Shinji Tsuruoka [11]. The author defined a separate library set for each writer to identify writing similarity. The author defined character-specific analysis along with feature space generation so that an effective covariance matrix is generated. The work was defined on Japanese characters.
III. Proposed Approach
In this section, the proposed hybrid model for recognition is presented. In the initial stage, the training set is defined, feature extraction is performed on it, and the feature dataset is generated. Once the feature dataset is obtained, the noisy input image is captured for the recognition process. The pre-processing stage includes the denoising algorithm and the identification of the character area in the image. To remove the image noise, a Gaussian filter is applied, and to perform the image segmentation, a combination of mathematical filters is applied. These mathematical filters include the convolution filter and morphological filters. Based on these filters, the character area is extracted from the image. At the final stage, the K-Means ART network approach is applied to perform the recognition and classification. K-Means is used for clustering and feature extraction, from which the vigilance vector of the recognition process is obtained. The vigilance ratio match is then performed using the ART network to identify the character class based on the dataset classes. The basic model of the presented work is shown in Figure 3.
[Block diagram with the blocks: Input Noisy Image, Denoising Algorithm, Feature Extraction, K-Means Based Clustering, ART Based Recognition, Recognition]
Figure 3 : Character Image Recognition
The basic algorithmic approaches used in this work are given below. The complete work is divided into two main algorithmic stages: Gaussian filter based denoising and the hybrid recognition algorithm. These approaches are defined in this section.
A) Gaussian Filter
The presented work performs recognition on noisy input images, so denoising is performed first using a Gaussian filter. The algorithmic approach for the Gaussian filter is shown in Figure 4. The Gaussian filter is more robust compared to the mean and median filters. For comparison, in a median filter a single very unrepresentative pixel in a neighborhood does not affect the median value significantly. Because the median must actually be the value of one of the pixels in the neighborhood, the median filter does not create new unrealistic pixel values when it straddles an edge. For this reason the median filter is much better at preserving sharp edges than the mean filter, and these advantages also help median filters remove uniform noise from an image. In the presented work, an FFT based process is used to obtain the signal values. The denoising approach is effective for additive noise as well as multiplicative noise. The flowchart of the work is shown in Figure 4.
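The frequency-domain denoising steps named in this subsection (forward FFT, Gaussian filtering, inverse FFT) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `gaussian_fft_denoise`, the parameter `sigma` and its values, and the test image are assumptions, since the paper does not give them. The example noise is multiplicative, in the spirit of the speckle noise reported for the dataset.

```python
import numpy as np

def gaussian_fft_denoise(image, sigma=1.0):
    """Smooth a 2-D grayscale image with a Gaussian low-pass filter in the frequency domain.

    sigma is the standard deviation, in pixels, of the equivalent spatial Gaussian
    (an assumed parameter; the paper does not state one).
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    # Frequency grids in cycles per pixel, laid out like the output of np.fft.fft2.
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    # Fourier transform of a unit-area spatial Gaussian with standard deviation sigma.
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    spectrum = np.fft.fft2(image)                  # forward FFT of the input image
    filtered = np.fft.ifft2(spectrum * transfer)   # attenuate high frequencies, then invert
    return filtered.real                           # imaginary part is rounding error only

# Example: a bright square corrupted by multiplicative (speckle-like) noise.
rng = np.random.default_rng(0)
clean = np.zeros((64, 64))
clean[16:48, 16:48] = 1.0
noisy = clean * (1.0 + 0.1 * rng.standard_normal(clean.shape))
denoised = gaussian_fft_denoise(noisy, sigma=1.5)
```

Because the filter is applied through the FFT, the image is treated as periodic, so content near one border can bleed slightly into the opposite border; padding the image before filtering is a common remedy.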
[Flowchart: Start -> Read the input image -> Define the Gaussian noise level, called Leveli -> Implement the FFT on the input image -> Implement the Gaussian adaptive filter -> Perform the inverse FFT on the derived image -> Derive the result image -> Stop]
Figure 4 : Gaussian Filter
B) Recognition
Recognition is performed using K-Means and the ART network. It is defined as the feature based vigilance match of the input image with the dataset images. The dataset is trained at the initial stage, the vigilance values are obtained, and the vigilance value dataset is generated. The algorithmic approach for the recognition process is shown in Table 1.
Table 1 : Recognition Algorithm
Given: a trained ART network with N classes
Input: image img
Define the vigilance vector V
matchratio = 0
p = null                                /* initialize the match image */
for c = 1 to N
{
    img1 = GetImage(c)
    Diff = img1 - img                   /* feature difference */
    M = Matchingratio(img, img1)
    if Diff >= V and M > matchratio     /* keep the best match found so far */
    {
        matchratio = M
        p = img1
    }
}
if (p == null)
{
    Print "No Match Image Found"
}
else
{
    Print "Image Detected " + p
}
IV. RESULTS
The presented work is applied to cursive alphanumeric characters in grayscale. A sample of Dataset 1 is shown in Figure 5.
Figure 5 : DataSet Sample
The properties of the dataset are shown in Table 2.
Table 2: Dataset Properties
Parameter            Value
Number of Images     26
Color                No
Images Type          Alphabet
Image Size           100x100
Image Format         BMP
Image Fault/Noise    Yes (Level .2)
Noise Intensity      .1
Noise Type           Speckle
Image Filtration     Gaussian
Recognition          K-Means Art Network
Input Image          A(1)
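As one possible reading of how K-Means can locate the character ROI, the sketch below clusters pixel intensities into two groups and takes the bounding box of the darker group. It assumes a dark character on a light background; `kmeans_1d`, `character_roi`, the choice k=2 and the evenly spaced initial centroids are illustrative assumptions, not details given in the paper.

```python
import numpy as np

def kmeans_1d(values, k=2, iterations=20):
    """Plain Lloyd's K-Means on scalar values; returns (labels, centroids)."""
    values = np.asarray(values, dtype=float).ravel()
    # Deterministic start (assumption): spread the centroids evenly over the value range.
    centroids = np.linspace(values.min(), values.max(), k)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = values[labels == j]
            if members.size:            # keep the old centroid if a cluster is empty
                centroids[j] = members.mean()
    labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
    return labels, centroids

def character_roi(image):
    """Bounding box (top, bottom, left, right), inclusive, of the darker intensity cluster."""
    image = np.asarray(image, dtype=float)
    labels, centroids = kmeans_1d(image, k=2)
    dark = int(np.argmin(centroids))                 # index of the darker cluster
    mask = (labels == dark).reshape(image.shape)
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1])

# Example: a dark 3x5 block on a light 10x10 background.
page = np.full((10, 10), 255.0)
page[2:5, 3:8] = 10.0
roi = character_roi(page)   # (2, 4, 3, 7)
```

Clustering on intensity alone ignores pixel position, so isolated dark noise pixels would enlarge the box; this is one reason to denoise before this step.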
Recognition is defined at class level under the vigilance vector so that effective recognition is performed. The recognition properties are shown in Table 3.
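The vigilance-gated matching loop of Table 1 can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: `matching_ratio` is a hypothetical stand-in for the paper's Matchingratio, pixel values are assumed to lie in [0, 1], and the vigilance test is applied to the match ratio itself, as in a standard ART vigilance check.

```python
import numpy as np

def matching_ratio(image, template):
    """Hypothetical match score in [0, 1]: one minus the mean absolute pixel difference."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    return 1.0 - float(np.mean(np.abs(image - template)))

def recognize(image, templates, vigilance=0.9):
    """Return (label, ratio) of the best class that passes the vigilance test, else (None, 0.0)."""
    best_label, best_ratio = None, 0.0
    for label, template in templates.items():
        ratio = matching_ratio(image, template)
        # Accept a class only if it passes vigilance and beats the best match so far.
        if ratio >= vigilance and ratio > best_ratio:
            best_label, best_ratio = label, ratio
    return best_label, best_ratio

# Example with two tiny 4x4 class templates (toy data, not the paper's dataset).
template_a = np.zeros((4, 4))
template_a[0, :] = 1.0
template_s = np.zeros((4, 4))
template_s[:, 0] = 1.0
templates = {"A": template_a, "S": template_s}

query = template_a.copy()
query[3, 3] = 0.5                        # one corrupted pixel
label, ratio = recognize(query, templates)
```

Raising `vigilance` makes the matcher stricter: fewer noisy inputs are accepted, but those that are accepted lie closer to their class template.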
Table 3 : Recognition Properties
Properties                               Values
Number of Training Images                26
Number of Test Images                    12
Noisy Images                             5
Correctly Detected                       11
Noisy Correctly Detected                 4
Non Noisy Correctly Detected             7
Recognition Rate Non Noisy Images        100%
Recognition Rate Noisy Images            80%
Matching Ratio of Input Image (A)        99.7339%
Matching Ratio of Input Image (S)        99.7722%
The results are also shown as a plot for the recognition of character A, as shown in Figure 6.
Figure 6 : Matching Ratio Plot Graph
Figure 7 : Histogram for input image (A)
V. CONCLUSION
In this paper, a K-Means ART network approach is defined to perform recognition. The work is defined for English alphanumeric characters and is effective for noisy images. The recognition rate obtained shows the effective detection of characters.
References
[1] Tim J. Klassen, "Towards the On-line Recognition of Arabic Characters", 0-7803-7278-6/02@2002 IEEE
[2] Yuefeng Chen, "A Handwritten Character Recognition Algorithm based on Artificial Immune", International Conference on Computer Application and System Modeling, vol 12, pp 273-276, 2010
[3] S. Nagaprasad, "Spatial Data Mining Using Novel Neural Networks for Soil Image Classification and Processing", International Journal of Engineering Science and Technology
[4] Dan C. Ciresan, "Flexible, High Performance Convolution Neural Networks for Image Classification", Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
[5] Munish Kumar, "k-nearest neighbor based offline handwritten Gurmukhi character recognition", 978-1-61284-859-4@2011 IEEE
[6] Puttipong Mahasukhon, "Hand Printed English Character Recognition based on Fuzzy Theory", 978-1-4673-0819-9@2012 IEEE
[7] Nadine Hajj, "Isolated Handwriting Recognition Via Multi-Stage Support Vector Machines", 978-1-4673-2276-8@2012 IEEE
[8] R. Arnold, "Character recognition using neural networks", 978-1-4244-9279-4@2010 IEEE
[9] E.J. Bellagarda, "On-line handwritten character recognition using parallel neural networks", 0-7803-1775-0@1994 IEEE
[10] K. Toscano, "Cursive Character Recognition System", 0-7695-2569-5@2006 IEEE
[11] Shinji Tsuruoka, "Personal Dictionaries for Handwritten Character Recognition Using Characters Written by a Similar Writer", 12th International Conference on Frontiers in Handwriting Recognition, pp 599-604, 2010
ISSN: 2231-5381
http://www.ijettjournal.org