Comparison of Handwritten Character Accuracy Using
Different Feature Extraction Methods
Thin Nu Nu Lwin, Thandar Soe
Department of Information Technology, Mandalay Technological University
Thinnunulwin@gmail.com
Abstract – Feature extraction techniques are important in character recognition because they can improve recognition efficiency compared with pixel-based approaches. This study investigates feature extraction techniques for representing handwritten characters. In this system, three feature extraction methods (Gradient, DCT and DWT) are compared for their effectiveness in recognizing handwritten characters. The system starts by acquiring an image containing characters. The characters are processed through several phases, including binarization, noise filtering, normalization and feature extraction, before recognition. A multilayer neural network is used in the recognition phase; the feed-forward back-propagation algorithm is applied to train the network. The purpose of this paper is to compare the different feature extraction methods in terms of recognition accuracy and training time.
Keywords—Handwritten English characters (A-Z),
Gradient, DCT, DWT, Multilayer neural network, Feed
forward back propagation, Recognition accuracy
I. INTRODUCTION
Handwriting recognition is one of the most desirable computer capabilities for enhancing communication between humans and computers. Handwriting recognition was developed to allow computers to read and understand human language in written form. The field of handwriting recognition is divided into off-line and on-line recognition. The on-line approach uses a tracking device, such as a tablet digitizer or pen device, to collect the time, position and action of the writing strokes; the sequences of writing positions are stored in temporal order. The off-line approach uses a light-sensitive device, e.g., a scanner or a digital camera, to read a written document. The off-line data acquisition and recognition approach is the focus of this study.
Common processes of off-line handwriting recognition systems are preprocessing, feature extraction and recognition. The feature extraction process extracts the relevant information, known as feature vectors, which is used to identify the input image in the recognition step. The recognition process uses these features to find the class most compatible with the input. The main objective of feature extraction is to reduce data dimensionality by extracting the most important features from the character image [1].
Moreover, feature extraction is a significant factor in obtaining high accuracy in character recognition systems, especially when little training data is available. The present study compares the performance of Gradient, DCT and DWT features for handwritten English characters.
The gradient feature provides high resolution of both the magnitude and the angle of the directional strokes, which improves the character recognition rate. The gradient feature represents the local characteristics of a character image: the features extracted from handwritten characters are the directions of pixels with respect to their neighboring pixels. This approach increases the information content and gives a better
recognition rate with reduced recognition time. The DCT is a popular signal transformation method that makes use of cosine functions of different frequencies. DCT transform coding compresses image data by representing the original signal with a small number of transform coefficients. It exploits the fact that, for typical images, a large amount of signal energy is concentrated in a small number of coefficients. The goal of DCT transform coding is to minimize the number of retained transform coefficients while keeping distortion at an acceptable level. For that reason, the DCT has become the most widely used transform coding technique. The Wavelet Transform is a powerful technique for representing data at different scales and frequencies. The discrete wavelet transform maps a time-domain signal into the time-frequency domain; the resulting values are called wavelet coefficients. The Discrete Wavelet Transform (DWT) is based on sub-band coding and yields a fast computation of the Wavelet Transform. It is easy to implement and reduces the computation time and resources required.
The purpose of this work is to ascertain the effectiveness of each feature extraction technique in capturing useful information and hence producing more accurate recognition results. The remainder of the paper is organized as follows: Section II briefly reviews prior work on feature extraction for handwritten characters. The proposed system components are given in Section III. Section IV explains the materials and methods used in the current system. The results of our experiments are described in Section V, and conclusions are drawn in Section VI.
II. RELATED WORKS
Handwriting recognition is one of the oldest and most challenging problems in computer-related research. The challenge of handwriting recognition is how to implement computer systems that can read like humans. Many researchers have worked on off-line handwritten character recognition.
Kumar [2] compared the performance of five feature-extraction methods on handwritten Devanagari characters. The features covered are Kirsch directional edges, distance transform, chain code, gradient and directional distance distribution. From that experimentation, it was found that Kirsch directional edges perform worst and the gradient performs best with SVM classifiers. With a multilayer perceptron (MLP), the performance of the gradient and the directional distance distribution is almost the same. The chain-code-based feature is better than Kirsch directional edges and the distance transform.
Lawgali [3] compared the effectiveness of the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT) to
capture discriminative features of Arabic handwritten characters. A new database containing 5600 characters covering all shapes of Arabic handwritten characters was also developed. DCT and DWT techniques were used for feature extraction of the characters, and the coefficients of both techniques were fed into an ANN for classification. That experiment demonstrated that feature extraction by DCT gives a higher recognition rate than DWT.
Olarik Surinta et al. [4] proposed a novel feature extraction technique called the hotspot technique for representing handwritten characters and digits. In the hotspot technique, the distance values between the closest black pixels and the hotspots in each direction are used as the representation of a character. The hotspot technique was applied to three data sets: Thai handwritten characters (65 classes), Bangla numerals (10 classes) and MNIST (10 classes). The data sets were then classified by the k-Nearest Neighbors algorithm, using the Euclidean distance to compute distances between data points. In that study, the classification rates obtained from the hotspot, mark direction and chain-code direction techniques were compared. The results showed that the hotspot technique provides the highest average classification rate.
Dayashankar Singh et al. [5] presented a feature extraction technique that calculates only twelve directional feature inputs based on the gradients. A total of 500 handwritten samples, including handwritten Hindi characters, English characters and some special characters, were used in that experiment. The features extracted from handwritten characters are the directions of pixels with respect to their neighboring pixels. These inputs are given to a back-propagation neural network with one hidden layer and one output layer. The experimental results showed that the approach provides better results than other techniques in terms of recognition accuracy, training time and classification time.
Amir Mowlaei et al. (2002) [6] presented feature extraction using the wavelet transform for Farsi/Arabic characters and numerals. The DWT is used to produce the wavelet coefficients, and the Haar wavelet is used during feature extraction. The experiment was carried out using 480 samples per digit and 190 samples per character; both sample sets were divided into training and test sets. Both sets achieved high recognition rates, between 91% and 99%.
Wunsch and Laine [7] proposed wavelet descriptors for the recognition of handwritten characters. Their experimental results showed that wavelet descriptors are an efficient representation. In that paper, a feature extraction method based on the two-dimensional discrete wavelet transform was proposed for off-line recognition of unconstrained handwritten numerals, using a back-propagation neural network as the classifier. To verify the performance of the approach, 1500 handwritten numerals written by 30 persons were collected as the database; 750 numerals were used as the training set and the other 750 as the testing set. Classification is accomplished by a three-layer back-propagation neural network. The recognition rates for the training set and the testing set are 99.1% and 96.8%, respectively. The experimental results show that the method is a simple and efficient representation for unconstrained handwritten numeral recognition requiring less image preprocessing.
From the above literature survey, it is clear that feature extraction is an integral part of any recognition system and that the selection of feature extraction techniques is an important step toward higher recognition accuracy. In this work, Gradient, DCT and DWT features are used and compared for all shapes of handwritten English characters, with a neural network in the recognition stage.
III. SYSTEM COMPONENTS
The entire system can be divided into four parts:
A. Image Acquisition
B. Preprocessing
C. Feature extraction
D. Recognition
Figure 1: Block Diagram of Training System

Figure 2: Block Diagram of Recognition System
IV. MATERIALS AND METHODS
The steps of the proposed comparison algorithms based on Gradient, DCT and DWT are described in Fig. 1 and Fig. 2.
A. Image acquisition
An image is acquired as input to the system. The image should have a specific format, for example, PNG. It can be acquired through a scanner, a digital camera or another digital input device. A scanner is the most common device because it introduces less noise during the imaging process than the other devices. The input images are scanned at a resolution of 300 dpi (dots per inch) and stored as gray-scale images, as shown in Fig. 3.
Fig. 3: Input Image
B. Preprocessing
Before the image is given to the recognition system, it needs to be brought into a format that is standard and acceptable to the neural network as input. The blocks of the preprocessing step are as follows:
(i) Binarization: In image binarization, the gray-scale text image is converted into a binary image, with each pixel taking a value of 0 or 1 depending on a threshold value. The technique most commonly employed for determining the threshold involves analyzing the histogram of gray levels in the digitized image. The scanned image and its binarized output are shown in Fig. 4.
Fig. 4: Input Image and Binary Image of “A”
(ii) Noise removal: Noise removal means reducing the noise in an image. For off-line recognition, noise may come from the writing style or from the optical device that captures the image. The presence of noise can reduce the efficiency of the character recognition system, so it should be removed from the image as much as possible to avoid confusing the recognizer. Median filtering is used in this paper. An example of a noisy image and its filtered version is shown in Fig. 5.
Fig. 5: Noisy image and noise-removed image
(iii) Normalization: Normalization is used to standardize the character size within the image. The size of handwritten characters varies from person to person and even for the same person from time to time. Therefore, the characters are scaled to a standardized matrix to make the recognition process independent of the writing size and to obtain better recognition accuracy. In this paper, each character is normalized to 32×32 pixels. An example of size normalization is illustrated in Fig. 6.
Fig. 6: Normalization of the character “A”
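For concreteness, the following Python sketch shows one possible implementation of this preprocessing chain (binarization with a histogram-based Otsu threshold, median filtering and 32×32 size normalization). The library choices and helper names are our own assumptions for illustration, not the authors' original code.

```python
# Minimal sketch of the preprocessing pipeline described above:
# histogram-based binarization, median filtering, 32x32 normalization.
import numpy as np
from scipy import ndimage


def otsu_threshold(gray):
    """Pick the gray level that maximizes between-class variance of the histogram."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    total = gray.size
    cum_count = np.cumsum(hist)
    cum_sum = np.cumsum(hist * np.arange(256))
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0 = cum_count[t - 1]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = cum_sum[t - 1] / w0
        mu1 = (cum_sum[-1] - cum_sum[t - 1]) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def resize_nn(img, out_h=32, out_w=32):
    """Nearest-neighbor resize to a fixed 32x32 grid (size normalization)."""
    rows = np.linspace(0, img.shape[0] - 1, out_h).astype(int)
    cols = np.linspace(0, img.shape[1] - 1, out_w).astype(int)
    return img[np.ix_(rows, cols)]


def preprocess(gray):
    """gray: 2-D uint8 array of a scanned character (0..255)."""
    binary = (gray < otsu_threshold(gray)).astype(float)  # binarization: dark ink -> 1
    cleaned = ndimage.median_filter(binary, size=3)       # noise removal
    return resize_nn(cleaned)                             # 32x32 normalization


if __name__ == "__main__":
    demo = np.random.randint(0, 256, size=(120, 90)).astype(np.uint8)
    print(preprocess(demo).shape)  # (32, 32)
```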
C. Feature Extraction
In printed and handwritten text, the features capture the information extracted from the characters. This information is passed on to the matcher to assist in the classification process. In this research, Gradient, DCT and DWT features are adopted to extract the characteristics of the characters; all three are widely used in digital signal processing applications.
(i) Gradient
The gradient measures the magnitude and direction of the greatest change in intensity in a small neighborhood of each pixel. Gradients are computed by means of the Sobel operator. The Sobel templates are used to compute the horizontal (X) and vertical (Y) components of the gradient; the templates are shown in Fig. 7.
Fig. 7: Sobel operator templates (horizontal and vertical)
The two gradient components at location (i, j) are calculated by:
Gx(i, j) = f(i-1, j+1) + 2f(i, j+1) + f(i+1, j+1) - f(i-1, j-1) - 2f(i, j-1) - f(i+1, j-1)
Gy(i, j) = f(i-1, j-1) + 2f(i-1, j) + f(i-1, j+1) - f(i+1, j-1) - 2f(i+1, j) - f(i+1, j+1)
The gradient strength and direction are calculated as:
G(i, j) = √(Gx(i, j)² + Gy(i, j)²)
Θ(i, j) = tan⁻¹(Gy(i, j) / Gx(i, j))
After computing the gradient at each pixel of the character, the gradient values are mapped onto 18 direction values, with an angle span of 10 degrees between any two adjacent direction values.
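The sketch below illustrates how such an 18-dimensional gradient feature might be built in Python: Sobel gradients, quantization of the gradient angle into 18 bins, and accumulation of the gradient magnitude per bin. The use of scipy's Sobel filter and the exact angular range of the bins are our assumptions, not the authors' implementation.

```python
# Illustrative sketch of the gradient feature: Sobel gradients, angle
# quantized into 18 direction bins, magnitude accumulated per bin.
import numpy as np
from scipy import ndimage


def gradient_feature(char_img, n_bins=18):
    """char_img: 2-D float array (e.g. the normalized 32x32 character)."""
    gx = ndimage.sobel(char_img, axis=1)   # horizontal gradient component
    gy = ndimage.sobel(char_img, axis=0)   # vertical gradient component
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)             # direction in (-pi, pi]
    # Assign each pixel's direction to one of n_bins equally spaced bins.
    bins = ((angle + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    feature = np.zeros(n_bins)
    np.add.at(feature, bins.ravel(), magnitude.ravel())
    norm = np.linalg.norm(feature)
    return feature / norm if norm > 0 else feature   # normalized 18-D histogram


if __name__ == "__main__":
    demo = np.zeros((32, 32))
    demo[8:24, 15:17] = 1.0                # a vertical stroke
    print(gradient_feature(demo).round(3))
```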
(ii) Discrete Cosine Transform (DCT)
The discrete cosine transform (DCT) is a technique for
converting a signal into elementary frequency components.
The DCT technique involves three steps: transformation, quantization and encoding. The Discrete Cosine Transform converts the image data into its elementary frequency components. Quantization is the process of reducing the number of possible values of a quantity, thereby reducing the number of bits needed to represent it. Entropy encoding is a technique for representing the quantized data as compactly as possible. First, the image is divided into 8×8 blocks. The DCT clusters the lowest-frequency components in the upper-left corner and the highest-frequency components in the bottom-right corner of the coefficient array. The 2-D DCT of an M×N block is given by:
F(u, v) = α(u) α(v) Σ_{m=0}^{M-1} Σ_{n=0}^{N-1} f(m, n) cos[(2m+1)uπ / 2M] cos[(2n+1)vπ / 2N]

α(u) = 1/√M for u = 0, and α(u) = √(2/M) for 1 ≤ u ≤ M-1
α(v) = 1/√N for v = 0, and α(v) = √(2/N) for 1 ≤ v ≤ N-1
where f(m, n) is the pixel value at the (m, n) coordinate position in the image and F(u, v) is the DCT-domain representation of f(m, n); u and v represent the vertical and horizontal frequencies. The DCT coefficients are then quantized. After quantization, the quantized coefficients are extracted in a zigzag fashion and stored in a vector sequence, as shown in Fig. 8. These coefficients are encoded for efficient transmission of the image. Therefore, these coefficients are used as the features of the character image.
Fig.8: Zig Zag Sequence
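The following Python sketch illustrates the DCT feature extraction described above: a 2-D DCT of the normalized character, a zigzag scan of the coefficients, and retention of the first low-frequency terms (the paper uses 16 DCT input features). For simplicity the transform is applied to the whole normalized character rather than to 8×8 blocks; that choice and the helper names are our assumptions.

```python
# Sketch of DCT-based feature extraction: 2-D DCT, zigzag scan,
# keep the first low-frequency coefficients.
import numpy as np
from scipy.fft import dctn


def zigzag_indices(n):
    """(row, col) pairs of an n x n array in JPEG-style zigzag order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[1] if (rc[0] + rc[1]) % 2 == 0 else rc[0]))


def dct_feature(char_img, n_coeffs=16):
    """char_img: square 2-D float array (e.g. the normalized 32x32 character)."""
    coeffs = dctn(char_img, type=2, norm="ortho")       # 2-D DCT-II
    n = char_img.shape[0]
    zz = [coeffs[r, c] for r, c in zigzag_indices(n)]   # low to high frequency
    return np.asarray(zz[:n_coeffs])                    # keep the first coefficients


if __name__ == "__main__":
    demo = np.zeros((32, 32))
    demo[8:24, 15:17] = 1.0
    print(dct_feature(demo).round(2))
```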
(iii) Discrete Wavelet Transform (DWT)
The Wavelet Transform (WT) is a way to represent a signal in time-frequency form, and the DWT provides a detailed picture of the signal being analyzed. The DWT applies a low-pass filter (LPF) and a high-pass filter (HPF) to decompose the image along its rows and columns, and the result of each filter is down-sampled by two. Each of the sub-signals is then high- and low-pass filtered again, and the result is again down-sampled by two. At each decomposition level, the DWT separates an image into one low-frequency sub-band (LL) and three high-frequency sub-bands (LH, HL, HH): LL is the approximation sub-band, LH the horizontal detail sub-band, HL the vertical detail sub-band and HH the diagonal detail sub-band. The one-level decomposition is shown in Fig. 9. The low-frequency coefficients of the LL sub-band are close to the original image and retain most of its information. Therefore, these coefficients are used as the features of the character image.
Fig. 9: DWT decomposition at one level
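A minimal Python sketch of this DWT feature extraction is shown below, using the PyWavelets package. The wavelet family ("haar") and the number of decomposition levels are assumptions on our part, since the paper does not state them; only the LL (approximation) sub-band is kept, as described above.

```python
# Sketch of DWT-based feature extraction: repeated 2-D decomposition,
# keeping only the LL (approximation) sub-band as the feature vector.
import numpy as np
import pywt


def dwt_feature(char_img, wavelet="haar", levels=2):
    """char_img: 2-D float array (e.g. the normalized 32x32 character)."""
    ll = np.asarray(char_img, dtype=float)
    for _ in range(levels):
        ll, (lh, hl, hh) = pywt.dwt2(ll, wavelet)   # discard LH, HL, HH detail sub-bands
    return ll.ravel()                               # flatten LL into a feature vector


if __name__ == "__main__":
    demo = np.zeros((32, 32))
    demo[8:24, 15:17] = 1.0
    print(dwt_feature(demo).shape)   # (64,) for two levels on a 32x32 image
```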
D. Recognition
Template matching, structural analysis and neural networks have traditionally been popular classification methods for character recognition, but neural networks increasingly offer better and more reliable accuracy for handwriting recognition [8].
Architecture: The most popular neural network architecture used in English character recognition has three layers: an input layer, a hidden layer and an output layer. Fig. 10 depicts an example of such a three-layer neural network. The input layer is fed with the features of the characters, so the number of nodes in this layer depends on the number of input features. The last layer is the output layer, and the number of its nodes is based on the desired outputs. The hidden layer lies between the input and output layers. The training set consists of 208 characters from different writers. For the gradient features, this system uses 18 input features, 400 neurons in the hidden layer and 26 neurons in the output layer. For DCT, it uses 16 input features, 400 hidden neurons and 26 output neurons. The number of nodes in the hidden layer governs the variety of samples that can be accurately and correctly recognized by the network; if the network has trouble learning, more neurons can be added to this layer.
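As a rough sketch of this architecture, the snippet below configures a comparable network with scikit-learn's MLPClassifier: 18 gradient features in, one hidden layer of 400 neurons, and 26 output classes. The library, its hyper-parameters and the dummy data are our assumptions, not the authors' setup.

```python
# Minimal stand-in for the 3-layer network described above.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X_train = rng.random((208, 18))           # 208 training characters, 18 gradient features
y_train = rng.integers(0, 26, size=208)   # class labels for 'A'..'Z'

net = MLPClassifier(hidden_layer_sizes=(400,),  # single hidden layer of 400 neurons
                    activation="logistic",      # sigmoid units, as in classic back-propagation
                    solver="sgd",
                    max_iter=500)
net.fit(X_train, y_train)                 # trained with gradient-based back-propagation
print(net.predict(X_train[:5]))
```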
Training phase: Neural networks are commonly trained so that a particular input leads to a specific target output. The network is adjusted, based on a comparison of the output and the target, until the network output matches the target. The system applies the feed-forward back-propagation algorithm.
The back-propagation algorithm consists of three stages. The first is the forward phase, in which the inputs are propagated from the input layer through the hidden layer to the output layer to produce the outputs. The second is the backward stage, in which the associated error is calculated at the output layer and propagated back through the hidden layer to the input layer. The third stage is the adjustment of the weights and biases.
The backward stage is similar to the forward stage except that error values are propagated back through the network to determine how the weights are to be changed during training. During training, each input pattern has an associated target pattern. After training, applying the network involves only the computations of the feed-forward stage.
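The numpy sketch below illustrates these three stages for a 3-layer sigmoid network with the gradient-feature configuration mentioned above (18 inputs, 400 hidden neurons, 26 outputs). It is an illustration of standard back-propagation under our own assumptions (learning rate, initialization, dummy data), not the authors' implementation.

```python
# Illustrative back-propagation loop: forward phase, backward error
# propagation, and weight/bias adjustment.
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 18, 400, 26        # gradient-feature configuration from the paper
W1 = rng.normal(0, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_hidden, n_out)); b2 = np.zeros(n_out)

X = rng.random((208, n_in))                              # dummy training features
T = np.eye(n_out)[rng.integers(0, n_out, size=208)]      # dummy one-hot targets
lr = 0.1

for epoch in range(50):
    # Stage 1: forward phase - propagate inputs to the output layer.
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)

    # Stage 2: backward stage - propagate the output error back.
    delta_out = (y - T) * y * (1 - y)                    # error at the output layer
    delta_hidden = (delta_out @ W2.T) * h * (1 - h)      # error at the hidden layer

    # Stage 3: adjust the weights and biases.
    W2 -= lr * h.T @ delta_out;   b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * X.T @ delta_hidden; b1 -= lr * delta_hidden.sum(axis=0)

print("training error:", float(np.mean((y - T) ** 2)))
```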
Fig. 10: Example of the architecture of a 3-layer neural network
V. EXPERIMENTAL RESULTS
Experiments were carried out using 286 isolated English characters from 11 independent writers. These characters are divided into two data sets: 208 characters for training and 78 for testing. A comparison of Gradient, DCT and DWT in terms of recognition accuracy is summarized in Table 1.
TABLE 1
RECOGNITION ACCURACY USING DIFFERENT FEATURE EXTRACTION TECHNIQUES

Feature method | Train images | Test images | Known images | Unknown images | Accuracy
Gradient       | 208          | 78          | 68           | 10             | 87.2%
DCT            | 208          | 78          | 65           | 13             | 83.3%
DWT            | 208          | 78          | 60           | 18             | 77.5%
On the test data set, the gradient-based system achieved 87.2% correct readings (12.8% incorrect), the DCT-based system 83.3% correct (16.7% incorrect) and the DWT-based system 77.5% correct (22.5% incorrect). The results show that feature extraction based on the gradient yields a higher recognition rate than the other two methods, and that DCT gives a slightly higher recognition rate than its DWT counterpart.
VI. CONCLUSION
This paper has compared three feature extraction techniques (Gradient, DCT and DWT) for handwritten English characters. All techniques were used with an ANN for classification of the characters. The recognition rates of the Gradient, DCT and DWT techniques are 87.2%, 83.3% and 77.5%, respectively. The results demonstrate that feature extraction by the gradient yields the highest recognition rate for handwritten English characters. A possible reason is that the gradient feature captures the local directional information of the character strokes, which makes it more effective for pattern recognition applications.
REFERENCES
[1] F. Lauer, C. Y. Suen and G. Bloch, "A trainable feature extractor for handwritten digit recognition," Pattern Recognition, vol. 40, no. 6, pp. 1816-1824, 2007.
[2] Kumar and Singh, "Performance comparison of features on Devanagari hand print dataset," International Journal of Recent Trends, vol. 1, no. 2, pp. 33-37, 2009.
[3] A. Lawgali, "Handwritten Arabic Character Recognition: Which Feature Extraction Methods?," School of Computing, Engineering and Information Sciences, Northumbria University, Newcastle upon Tyne, UK.
[4] Olarik Surinta, Lambert Schomaker and Marco Wiering, "Handwritten Character Classification Using the Hotspot Feature Extraction Technique," Department of Artificial Intelligence, University of Groningen, Nijenborgh 9, Groningen, The Netherlands.
[5] Dayashankar Singh, Sanjay Kr. Singh and Maitreyee Dutta, "Handwritten character recognition using twelve directional feature input and neural network," International Journal of Computer Applications (0975-8887), vol. 1, no. 3, 2010.
[6] Amir Mowlaei et al. (2002), "Handwritten Arabic Character Recognition: Which Feature Extraction Method?," A. Lawgali, A. Bouridane, M. Angelova and Z. Ghassemlooy, School of Computing, Engineering and Information Sciences, Northumbria University, Newcastle upon Tyne, UK, ahmed.lawgali@northumbria.ac.uk.
[7] Wunsch and Laine, "Handwritten Script Recognition using DCT and Wavelet Features at Block Level," G. G. Rajput, Department of Computer Science, Gulbarga University, Gulbarga-585106, Karnataka, India.