A Novel Invariant Mapping Applied to Hand-written Arabic Character Recognition

Nawwaf Kharma & Rabab Ward
E.C.E. Department, (#289) 2366 Main Mall, University of British Columbia, Vancouver BC, Canada. V6T 1Z4.
Nawwaf@ieee.org & RababW@cs.ubc.ca

Keywords
Invariant Mapping, Arabic Character Recognition, Pattern Recognition.

Abstract
This paper describes an application of a novel mapping, one that is intended for use in on-line hand-written character recognition. This mapping produces the same output pattern regardless of the orientation, position, and size of the input pattern. The mapping has the advantage of being simple. This makes it computationally efficient and fast, which in turn makes it appropriate for on-line implementations. To demonstrate the usefulness of this mapping, a recognition system utilising it has been developed for hand-written Arabic characters. The performance of this system is shown to be comparable to that of existing on-line Arabic character recognition systems.

1 Background
Cursive hand-written character recognition is a wide field both academically and commercially. Commercially, Apple Computer was the first major company to introduce Graffiti as a system for hand-written (English) character recognition in hand-held computers. More recently, 3Com has been selling the Palm III as a similar product. On the other hand, research in the academic arena has been intensive and varied. To obtain a meaningful and relevant survey, we limited the scope of our review to recent work that uses rotation, position, and size (or simply: RPS-) invariant mappings, regardless of the language that these mappings were applied to.

The main RPS-invariant techniques may be divided into four types:

A. Moment Invariant Techniques
These techniques [3-6,14-15,17,21-22] use functions of moments, which take in a curve as input, and produce a single number as output. Their complexity order increases with the complexity of the moments that they use. In general their complexity is O(nm), where n is the number of points, and m is the highest power of the moments they use.

B. Fourier Descriptors (FDs)
Here, Fourier Analysis is used (e.g. [12,13]) to find the coefficients of Sine and Cosine functions that can best fit a curve represented as a series of points along its outline. To achieve this, a discrete Fast Fourier Transform (or FFT) is carried out. This has a complexity order of O(n log n), which is slightly higher than O(n).
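As an illustration of technique B, the sketch below computes Fourier descriptors of a closed outline and normalises them for position, size and rotation. It is a minimal sketch under stated assumptions, not the method of [12,13]: the outline is assumed to be traced counter-clockwise and sampled at a power-of-two number of points, the normalisation shown is one common textbook choice, and the names fft and fourier_descriptors are hypothetical.

```python
import cmath
import math


def fft(values):
    """Recursive radix-2 FFT, O(n log n); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fourier_descriptors(points, count=8):
    """Return `count` normalised coefficient magnitudes of an outline.

    Assumes len(points) is a power of two and at least count + 2.
    """
    coeffs = fft([complex(x, y) for x, y in points])
    # coeffs[0] only encodes position, so it is skipped.
    # Dividing by |coeffs[1]| removes the effect of size.
    # Taking magnitudes removes rotation and the choice of starting point.
    scale = abs(coeffs[1])
    return [abs(c) / scale for c in coeffs[2:2 + count]]
```

Under these assumptions, an outline and a rotated, scaled and translated copy of it yield the same list of descriptors, up to rounding error.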

C. Boundary-Based Techniques
Our own mapping falls under these techniques. They include functions that map the distance from the centroid of a character against the length of the character, and others that map the angle that a line (connecting a boundary point and the centroid) makes against the points, etc. For a summary of such systems see [7]. These techniques (typically) have complexity order O(n), which is the best possible order other than O(1) (which is impossible with on-line recognition, even with parallel architectures).

D. Other Techniques
For example: Vector Analysis [22]. In this method the series of points making up a character is the basis for the character vector. Position normalisation is carried out by translating the character, so that its centroid moves to the origin used. Dilation and rotation normalisation are not explicitly carried out.

What all the above techniques have in common is that:
• They normalise the size of the character, by dividing certain size-related features by the total length of the character.
• They normalise with respect to position by moving the centre of co-ordinates to a point, which is at a fixed position on the character: either the centroid, or the starting point of that character.
• Normalising the orientation of the character is more complicated. It is done in fundamentally varied ways. For example, FDs are inherently rotation insensitive, while Moments are not.

In what follows the novel mapping is described. This mapping has the advantage of being simple. In addition, it takes as input the trace of the pattern that the writer is forming, as it is being written, as opposed to the final form of the pattern as it (finally) appears on paper. This works in conjunction with the RPS-invariant nature of the mapping to produce similar output patterns for characters that are formed in a similar way, but which may appear different.
The simplicity of the mapping makes it fast, while its RPS-invariant nature makes it more tolerant to differences in character shapes. The faster a system is, and the more tolerant it is of individual writing styles, the more successful it is likely to be in real-time use on standard (i.e. not very fast) hardware. The details of the mapping are provided in section 2. In section 3 we apply the mapping to the recognition of Arabic characters.

2 A Simple RPS-invariant Mapping

Figure 1 lies here

Fig.1 above shows a possible input pattern (similar to the Latin ‘S’). The mapping
produces an output pattern, which plots a certain rotation R against a certain length L, at a point (say pt. B) along the line (that makes up the character).
Rotation (R) is calculated using formula 1, while the length is found using formula 2, below.

R_{n+1} = R_n + dR    … formula 1
L = (length of line) / L_t    … formula 2

Where, R_{n+1} and R_n are the rotations at (measured) points n+1 and n, respectively. dR is the signed difference between the angle of the tangent at point n+1 and the angle of the tangent at point n. A counter-clockwise rotation is considered positive and a clockwise one negative. R_0 (of the first point along the line) is zero, by definition. R (at any point) is calculated incrementally, and may go up (or down) to any value. Length of line, at any point, is measured from the first point, while L_t is the total length of the (finished) line making up a stroke. This has the effect of normalising L, so that it runs from 0 at the first point to 1 at the last.
It is worth noting that slight roughness of the line, both at the ends and in the middle of a stroke, would not cause a significant change in the corresponding output pattern. The reason is that, in actual implementation, two pre-processing procedures are carried out; one clips a couple of points off the ends of a stroke, and the other applies a (3-point) moving average window to the whole of the stroke.

2.1 Application of Mapping
Fig.2a & b present experimental (simulation) results that show how patterns with different rotations, positions and sizes, produce similar output patterns.

Figure 2a lies here

Figure 2b lies here

Figures 2a and b above clearly show that the output patterns produced for the different ‘S’ patterns are almost identical (with respect to a Euclidean distance measure of difference). The only reason why they are not exactly identical is the slight variations in the way they were formed, caused by the fact that they were written by hand.
Before we proceed to using the mapping as part of a complete system for classifying Arabic characters, we also (as an example) present the input and output patterns for two Arabic characters (the starting Hhaa, and the starting Seen).

Figure 3 lies here

Figure 4 lies here

3 Application of Mapping to Hand-written Arabic Characters
The aim of the application is to recognize Arabic characters as they are being written on a pen pad, in real-time. The characters are
written by hand in boxes, such that exactly one character (though connected), with all of its appendages, falls wholly in one box. The input from each of the boxes is processed by a program, which then produces as output a code identifying the letterform. If any (additional) pre-processing functions are required, in addition to the pre-processing embedded in the invariant mapping, then they should be introduced prior to the application of the mapping. However, it is crucial that any additional pre-processing functions are introduced by an agent who fully understands the details of the mapping.

3.1 Features
This section describes all the features used to classify the characters of the (hand-written) Arabic alphabet.
We use two groups of features in classification. The first is the output of our RPS-invariant mapping. The second group is a mixed group of features related (mostly) to special characteristics of the Arabic alphabet.

3.1.1 RPS-invariant Mapping related Features
Three main features are extracted from the character (output) function, which results from applying the mapping to the original character. The features chosen are those that are easily identifiable, that are constant regardless of the style of writing, and that are characteristic of the letter being classified. Three features are central here. One is the existence (or lack of it) of at least one substantial loop (any loop, closed or open, is characterised by a straight line in the mapped input). The other feature is the number of cusps such as the ones in the middle ‘Noon’, as well as the ‘Seen’ (in Fig. 4). And, the third feature is the achieved total rotation.

3.1.2 Non-Mapping-Related Features
A. Dots etc.
Many Arabic characters have the same written form but are distinguished from one another solely by the presence, number and position of dots. Some characters have one, two or three dots above them, or one or two below. They may also have other separate marks such as a short Arabic ‘Alif’ above them (in the case of the Ttaa; see Fig. 5). This also applies to the ‘Hamza’ (shown below), which looks like a starting ‘Ain’, except that it is normally smaller in size, and is written either above, below, or immediately next to a proper letter.

Figure 5 lies here

B. Connectivity
An Arabic character may connect to one side (either right or left), to both sides, or to neither side. For example, the ‘Alif’ connects to the right but never to the left. This information offers a clear-cut method of (usually) validating or increasing the confidence level in the classification of some characters. For example, if a ‘Waw’ was mixed with a ‘Meem’, the conflict could easily be resolved to the ‘Meem’s’ advantage if the suspected character was connected to the left, for a ‘Waw’ never connects to the left.

C. Absolute Features
These features are termed absolute because they refer to features that are related to the original (unmapped) character. These are: the vertical extension of a character and the angle of its starting segment.
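The total-rotation feature of section 3.1.1 and the two absolute features can be sketched directly from formulas 1 and 2 of section 2. What follows is a minimal sketch under stated assumptions, not the implementation used in the experiments: a stroke is assumed to be a list of (x, y) samples with the y axis pointing up, the tangent at a point is approximated by the segment leaving it, two points are clipped from each end, and all function names are hypothetical. Loop and cusp detection are not shown.

```python
import math


def smooth(points):
    """3-point moving average; the two end points are kept unchanged."""
    if len(points) < 3:
        return list(points)
    inner = [((x0 + x1 + x2) / 3.0, (y0 + y1 + y2) / 3.0)
             for (x0, y0), (x1, y1), (x2, y2)
             in zip(points, points[1:], points[2:])]
    return [points[0]] + inner + [points[-1]]


def rps_mapping(points, clip=2):
    """Return the output pattern [(L, R), ...] of one stroke."""
    pts = smooth(points[clip:len(points) - clip])
    segs = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    segs = [s for s in segs if s != (0.0, 0.0)]  # skip repeated samples
    if not segs:
        return []
    lengths = [math.hypot(dx, dy) for dx, dy in segs]
    total = sum(lengths)  # L_t in formula 2
    angles = [math.atan2(dy, dx) for dx, dy in segs]
    pattern = [(0.0, 0.0)]  # R_0 = 0 by definition
    rotation, travelled = 0.0, 0.0
    for k in range(1, len(segs)):
        d = angles[k] - angles[k - 1]
        d_r = math.atan2(math.sin(d), math.cos(d))  # signed turn in (-pi, pi]
        rotation += d_r  # formula 1
        travelled += lengths[k - 1]
        pattern.append((travelled / total, rotation))  # formula 2
    pattern.append((1.0, rotation))
    return pattern


def total_rotation(points):
    """Mapping-related feature: R at the end of the stroke."""
    pattern = rps_mapping(points)
    return pattern[-1][1] if pattern else 0.0


def vertical_extension(points):
    """Absolute feature: height of the unmapped stroke."""
    ys = [y for _, y in points]
    return max(ys) - min(ys)


def start_angle(points, span=3):
    """Absolute feature: direction of the chord over the first samples."""
    (x0, y0), (x1, y1) = points[0], points[min(span, len(points) - 1)]
    return math.atan2(y1 - y0, x1 - x0)
```

In this sketch a counter-clockwise semicircle gives a total rotation somewhat below π, because clipping removes part of the arc, and writing the same stroke in the opposite direction flips the sign. The output of rps_mapping is unchanged when the stroke is rotated, scaled or translated, whereas the two absolute features are not.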
3.2 Classification
A decision tree was used for classifying Arabic characters. The tree is quite shallow (4 levels deep), which is better for speed. Also, the tests used for classification are themselves simple. This combination, together with multiple testing features, is designed to ensure speed of execution, as well as certainty of classification.

4 Test Results
For the purpose of testing, Arabic characters that differ only in dot number/position (such as the ‘Baa’ and the ‘Taa’) were tested together as part of one group. All the letterforms (e.g. starting, middle, or end) of each of the characters were included (on an equal basis) in the test.
We used the handwritten samples of one hundred individuals for each letterform. This means the total number of individual test files exceeded 9000. The results are displayed in Fig. 6. CR stands for character recognition rate, which is reported for each of the character groups. The overall CR rate is 92.75%.

Figure 6 lies here

With respect to speed of execution of the program on standard hardware (a Pentium 166MHz with 64M bytes of RAM was used), the program runs in real-time. More specifically, the component of the program implementing the RPS-invariant mapping consistently produces its output within less than 0.1 second of completion of input. Also, this (wait) time did not increase as a function of the number of points in the stroke inputted. This entails that this implementation of the RPS-mapping is functioning not only in real-time, but also in linear-time.

4.1 Analysis and Future Work
In Fig. 6 the recognition rates for the various letter groups are displayed. The overall (average) character recognition rate, for all the groups, was about 92.75%. The reasons for the recognition failures vary, but come under the following conceptual headings:
• Disallowed letterforms, such as the (rarely used) flower-like middle ‘Haa’.
• Imperfectly or incompletely written characters, such as a ‘Meem’ with an open loop.



• Radically embellished characters, such as an ending ‘Ain’ with a loopy flourish at the end.
• Characters misclassified because of similarity to other characters. The most common example is recognising a ‘Faa’ to be an ‘Ain’, and vice versa.
• Badly deformed characters.

It is suggested that the above deficiencies are dealt with in the following way. Include all the disallowed forms stated above (except the other ‘LamAlif’ form) in the decision tree. Deal with the ‘Ain’ and ‘Faa’ in the same manner that we deal with Raa/Daal. We decided to refine further the definition of the ‘Hamza’, to allow for a wide degree of variance from the norm. Finally, if similar characters (such as the Raa & Daal) are, in the future, forced into separate classes, while keeping all other conditions constant, then the character recognition rate will fall to a wholly unacceptable level, for those characters. Hence, it is necessary to add new features to the features set before attempting any such modification.
It is believed that after incorporating the above refinements, a learning module (especially for the similarly-written Raa & Daal as well as Ain & Faa), and more clearly stated restrictions on writing style, it would be possible to significantly improve the overall recognition rate. But that is for future work to demonstrate.

4.2 Comparison to other On-line Arabic Character Recognition Techniques
The character recognition system as a whole is compared to other on-line Arabic character recognition systems. Amin proposed several systems. The best character recognition rate (CR) achieved was 95.4% [1], which is slightly better than the preliminary CR obtained here. Badi [2] used structural features, just as [1] and we did. However, unlike Amin, who used a nearest-neighbor classification technique, Badi used a decision tree technique, similar to ours. Badi’s system gave a CR of 90%. El-Wakil in his 1989 paper [11] applied his mechanism to isolated characters (as done here), utilized structural features (as done here) in a chain code, but unlike our work, used a nearest-neighbor method for classification. His method yielded a CR of 93%, which is practically identical to ours.
The best CR claimed for any work in on-line Arabic Character Recognition is the 99.6% stated in [9]. This work also used structural features and a decision tree for classification. The system was applied to the recognition of characters as well as mathematical formulae. However, the system was tried by only one ‘uncooperative’ individual, though extensively. Hence, there is a lack of information about how well the system would perform if it were tested by a number of individuals, with different writing styles.
In summary, we can also claim that the (preliminary) CR rate achieved with our system is among the best reported, but still can be
improved significantly. However, it is crucial that we reach that high CR rate, without sacrificing, too much, the simplicity of the system, nor indeed, its real-time nature.

5 Conclusion
In this paper we presented a simple and computationally efficient mapping, which can be used for character recognition. We applied it to hand-written Arabic characters, and explained that it (taken with other features of the character) can be used to produce an effective recognition system to identify each character uniquely.
The relatively simple system was applied to more than 9000 handwritten samples produced by 100 different individuals. It returned an average recognition rate of 92.75%. This is comparable to the CR rates of some of the most sophisticated Arabic recognition systems available.

References
[1] Amin, A., Kaced, A., Haton, J., and Mohr, R.
(1980). Hand written Arabic Character
Recognition by the I.R.A.C. system. Proceedings
of the Fifth Int. Conference on Character
Recognition, Miami, FL. Pp. 729-731.
[2] Badi, K. & Shimura, M. (1982). Machine
Recognition of Arabic Cursive Scripts. In
Transactions of the Institute of Electronics &
Communications Engrs. Japan, Vol. E65, pp.
107-114.
[3] Bailey R R and Srinath M (1996).
Orthogonal Moment Features for Use with
Parametric and Non-Parametric Classifiers. In
IEEE Transactions on Pattern Analysis and
Machine Intelligence. Vol.18, No.4.
[4] Belkasim S O et al (1989). Shape Recognition
Using Zernike Moment Invariants. In Asilomar
conference on circuits. Vol.1, pp. 161-171.
[5] Belkasim S O et al (1991). Pattern
Recognition with Moment Invariants: A
Comparative Study and New Results. In Pattern
Recognition, Vol.24, pp.1117-1138.
[6] Desai M and Cheng H D (1994). Pattern
Recognition by Local Radial Moments. In
Proceedings of the International Conference on
Pattern Recognition. Pp. 168-172.
[7] Di Zenzo S et al (1992). Optical Recognition
of Hand-Printed Characters of any Size,
Position, and Orientation. In IBM Journal of
Research and Development, Vol. 36, No. 3, pp.
487-501.
[8] El-Desouky, A, Salem, M., and Arafat, H.
(1992). A Handwritten Arabic Character
Recognition Technique for Machine Reader. Int.
Journal for Mini Microcomputer, Vol. 14, No. 2,
pp. 57-61.
[9] El-Sheikh, T.S. & El-Taweel, S.G. (1989).
Real-time Arabic Handwritten Character
Recognition. In the Proceedings of the 3rd Int.
Conference on Image Processing and its
Applications, pp. 212-216. Held in Warwick,
UK, IEE. London, UK
[10] El-Sheikh, T.S. (1990). Recognition of
Handwritten Arabic Mathematical Formulas. In
the Proceedings of the UK IT 1990 Conference,
pp. 344-351. Southampton, UK.
[11] El-Wakil, M.S. & Shoukry, A (1989). Online
Recognition of handwritten Arabic
Characters. Pattern Recognition, Vol. 22, No. 2,
pp. 97-105.
[12] Granlund G H (1972). Fourier
Preprocessing for Hand Print Character
Recognition. In IEEE Transactions on
Computers, February 1972.
[13] Kauppinen, et al (1995). An experimental
comparison of autoregressive and Fourier-based
descriptors in 2D shape classification. In IEEE
Transactions on Pattern Analysis and Machine
Intelligence, Vol.17, No.2.
[14] Khotanzad A and Hong Y H (1990).
Invariant Image Recognition by Zernike
Moments. In IEEE Transactions on Pattern
Analysis and Machine Intelligence, Vol.12, No.5.
[15] Kim W and Yuan P (1994). A Practical
Pattern Recognition System for Translation,
Scale and Rotation Invariance. In proceedings of
the IEEE Society Conference on Computer
Vision and Pattern Recognition. Pp. 391-396.
[16] Nishimura M and Van der Spiegel J (1995).
Pattern Recognition Based on Orientation and
Linestops Using an Orientation Sensor and
Multilayered Neural Network. In proceedings
of SPIE’95.
[17] Perantonis S J and Lisboa P J G (1992).
Translation, Rotation, and Scale Invariant
Pattern Recognition by High-Order Neural
Networks and Moment Classifiers. In IEEE
Transactions on Neural Networks. Vol.3, No.2.
[18] Persoon E (1977). Shape Discrimination
using Fourier Descriptors. In IEEE
Transactions on Systems, Man and Cybernetics.
Vol.smc-7, No.3.
[19] Sanossian H Y Y (1996). An Arabic
Character Recognition System Using Neural
Network. In proceedings of the IEE workshop
on Neural Networks for Signal Processing. Pp.
340-348.
[20] Simon J-C (1994). Uncertainty versus
Computational Complexity. In Artificial
Intelligence in Mathematics. Johnson J H,
McKee S, and Vella (eds), 1994. Oxford
University Press.
[21] Wang, Dayong and Xie, Weixin (1996).
Invariant Image Recognition by Neural
Networks and Modified Moment Invariants. In
proceedings of SPIE’96.
[22] Wilfong G, et al (1996). On-Line
Recognition of Handwritten Symbols. In IEEE
Transactions on Pattern Analysis and Machine
Intelligence, Vol.18, No. 9.
[23] Wong W-H et al. (1995). Generation of
Moment Invariants and their uses for Character
Recognition.