A Novel Invariant Mapping Applied to Hand-written Arabic Character Recognition

Nawwaf Kharma & Rabab Ward
E.C.E. Department, (#289) 2366 Main Mall, University of British Columbia, Vancouver BC, Canada. V6T 1Z4.
Nawwaf@ieee.org & RababW@cs.ubc.ca

Keywords
Invariant Mapping, Arabic Character Recognition, Pattern Recognition.

Abstract
This paper describes an application of a novel mapping, one that is intended for use in on-line hand-written character recognition. This mapping produces the same output pattern regardless of the orientation, position, and size of the input pattern. The mapping has the advantage of being simple. This makes it computationally efficient and fast, which in turn makes it appropriate for on-line implementations. To demonstrate the usefulness of this mapping, a recognition system utilising it has been developed for hand-written Arabic characters. The performance of this system is shown to be comparable to that of existing on-line Arabic character recognition systems.

1 Background

Cursive hand-written character recognition is a wide field both academically and commercially. Commercially, Apple Computer was the first major company to introduce Graffiti as a system for hand-written (English) character recognition in hand-held computers. More recently, 3Com has been selling the Palm III as a similar product. On the other hand, research in the academic arena has been intensive and varied. To obtain a meaningful and relevant survey, we limited the scope of our review to recent work that uses rotation, position, and size (or simply: RPS-) invariant mappings, regardless of the language that these mappings were applied to.

The main RPS-invariant techniques may be divided into four types:

A. Moment Invariant Techniques
These techniques [3-6,14-15,17,21-22] use functions of moments, which take in a curve as input, and produce a single number as output. Their complexity order increases with the complexity of the moments that they use. In general their complexity is O(nm), where n is the number of points, and m is the highest power of the moments they use.

B. Fourier Descriptors (FDs)
Here, Fourier Analysis is used (e.g. [12,13]) to find the coefficients of Sine and Cosine functions that can best fit a curve represented as a series of points along its outline. To achieve this, a discrete Fast Fourier Transform (or FFT) is carried out. This has a complexity order of O(n log n), which is slightly higher than O(n).

C. Boundary-Based Techniques
Our own mapping falls under these techniques. They include functions that map the distance from the centroid of a character against the length of the character, and others that map the angle that a line (connecting a boundary point and the centroid) makes against the points, etc. For a summary of such systems see [7]. These techniques (typically) have complexity order O(n), which is the best possible order other than O(1) (which is impossible with on-line recognition, even with parallel architectures).

D. Other Techniques
For example: Vector Analysis [22]. In this method the series of points making up a character is the basis for the character vector. Position normalisation is carried out by translating the character, so that its centroid moves to the origin used. Dilation and rotation normalisation are not explicitly carried out.

What all the above techniques have in common is that:
- They normalise the size of the character, by dividing certain size-related features by the total length of the character.
- They normalise with respect to position by moving the centre of co-ordinates to a point which is at a fixed position on the character: either the centroid, or the starting point of that character.

Normalising the orientation of the character is more complicated. It is done in fundamentally varied ways. For example, FDs are inherently rotation insensitive, while Moments are not.

In what follows the novel mapping is described. This mapping has the advantage of being simple. In addition, it takes as input the trace of the pattern that the writer is forming, as it is being written, as opposed to the final form of the pattern as it (finally) appears on paper. This works in conjunction with the RPS-invariant nature of the mapping to produce similar output patterns for characters that are formed in a similar way, but which may appear different. The simplicity of the mapping makes it fast, while the RPS-invariant nature of the mapping makes it more tolerant of differences in character shapes. The faster a system is, and the more tolerant it is of individual writing styles, the more successful it is likely to be in real-time use on standard (i.e. not very fast) hardware. The details of the mapping are provided in section 2. In section 3 we apply the mapping to the recognition of Arabic characters.

2 A Simple RPS-invariant Mapping

Figure 1 lies here

Fig.1 above shows a possible input pattern (similar to the Latin ‘S’). The mapping produces an output pattern, which plots a certain rotation R against a certain length L, at a point (say pt. B) along the line that makes up the character.

Rotation (R) is calculated using formula 1, while the length is found using formula 2, below.

    R_{n+1} = R_n + dR                 … formula 1
    L = (length of line) / L_t         … formula 2

Here, R_{n+1} and R_n are the rotations at (measured) points n+1 and n, respectively. dR is the signed difference between the angle of the tangent at point n+1 and the angle of the tangent at point n. A counter-clockwise rotation is considered positive and a clockwise one negative. R_0 (of the first point along the line) is zero, by definition. R (at any point) is calculated incrementally, and may go up (or down) to any value. The length of line, at any point, is measured from the first point, while L_t is the total length of the (finished) line making up a stroke. This has the effect of normalising L so that it runs from 0 to 1.

It is worth noting that slight roughness of the line, both at the ends and in the middle of a stroke, would not cause a significant change in the corresponding output pattern. The reason is that, in actual implementation, two pre-processing procedures are carried out; one clips a couple of points off the ends of a stroke, and the other applies a (3-point) moving average window to the whole of the stroke.

2.1 Application of Mapping

Fig.2a & b present experimental (simulation) results that show how patterns with different rotations, positions and sizes produce similar output patterns.

Figure 2a lies here

Figure 2b lies here

Figures 2a and b above clearly show that the output patterns produced for the different ‘S’ patterns are almost identical (with respect to a Euclidean distance measure of difference). The only reason why they are not exactly identical is the slight variations in the way they were formed, caused by the fact that they were written by hand.

Before we proceed to using the mapping as part of a complete system for classifying Arabic characters, we also (as an example) present the input and output patterns for two Arabic characters (the starting Hhaa, and the starting Seen).

Figure 3 lies here

Figure 4 lies here

3 Application of Mapping to Hand-written Arabic Characters

The aim of the application is to recognize Arabic characters as they are being written on a pen pad, in real-time. The characters are written by hand in boxes, such that exactly one character (though connected), with all of its appendages, falls wholly in one box. The input from each of the boxes is processed by a program, which then produces as output a code identifying the letterform. If any (additional) pre-processing functions are required, in addition to the pre-processing embedded in the invariant mapping, then they should be introduced prior to the application of the mapping. However, it is crucial that any additional pre-processing functions are introduced by an agent who fully understands the details of the mapping.

3.1 Features

This section describes all the features used to classify the characters of the (hand-written) Arabic alphabet. We use two groups of features in classification. The first is the output of our RPS-invariant mapping. The second group is a mixed group of features related (mostly) to special characteristics of the Arabic alphabet.

3.1.1 RPS-invariant Mapping related Features

Three main features are extracted from the character (output) function, which results from applying the mapping to the original character. The features chosen are those that are easily identifiable, that are constant regardless of the style of writing, and that are characteristic of the letter being classified. The first is the existence (or lack of it) of at least one substantial loop (any loop, closed or open, is characterised by a straight line in the mapped input). The second is the number of cusps, such as the ones in the middle ‘Noon’, as well as the ‘Seen’ (in Fig. 4). The third is the achieved total rotation.

3.1.2 Non-Mapping-Related Features

A. Dots etc.
Many Arabic characters have the same written form but are distinguished from one another solely by the presence, number and position of dots. Some characters have one, two or three dots above them, or one or two below. They may also have other separate marks such as a short Arabic ‘Alif’ above them (in the case of the Ttaa; see Fig. 5). This also applies to the ‘Hamza’ (shown below), which looks like a starting ‘Ain’, except that it is normally smaller in size, and is written either above, below, or immediately next to a proper letter.

Figure 5 lies here

B. Connectivity
An Arabic character may connect to one side (either right or left), to both sides, or to neither side. For example, the ‘Alif’ connects to the right but never to the left. This information offers a clear-cut method of (usually) validating or increasing the confidence level in the classification of some characters. For example, if a ‘Waw’ was confused with a ‘Meem’, the conflict could easily be resolved to the ‘Meem’s’ advantage if the suspected character was connected to the left, for a ‘Waw’ never connects to the left.

C. Absolute Features
These features are termed absolute because they refer to features that are related to the original (unmapped) character. These are: the vertical extension of a character and the angle of its starting segment.

3.2 Classification

A decision tree was used for classifying Arabic characters. The tree is quite shallow (4 levels deep), which is better for speed. Also, the tests used for classification are themselves simple. This combination, together with multiple testing features, is designed to ensure speed of execution, as well as certainty of classification.

4 Test Results

For the purpose of testing, Arabic characters that differ only in dot number/position (such as the ‘Baa’ and the ‘Taa’) were tested together as part of one group. All the letterforms (e.g. starting, middle, or end) of each of the characters were included (on an equal basis) in the test.

We used the handwritten samples of one hundred individuals for each letterform. This means the total number of individual test files exceeded 9000. The results are displayed in Fig. 6. CR stands for character recognition rate, which is reported for each of the character groups. The overall CR rate is 92.75%.

Figure 6 lies here

With respect to speed of execution of the program on standard hardware (a Pentium 166MHz with 64M bytes of RAM was used), the program runs in real-time. More specifically, the component of the program implementing the RPS-invariant mapping consistently produces results within less than 0.1 second of completion of input. Also, this (wait) time did not increase as a function of the number of points in the stroke inputted. This entails that this implementation of the RPS-mapping is functioning not only in real-time, but also in linear-time.

4.1 Analysis and Future Work

In Fig. 6 the recognition rates for the various letter groups are displayed. The overall (average) character recognition rate, for all the groups, was about 92.75%. The reasons for misrecognition vary, but come under the following conceptual headings:
- Disallowed letterforms, such as the (rarely used) flower-like middle ‘Haa’.
- Imperfectly or incompletely written characters, such as a ‘Meem’ with an open loop.
- Radically embellished characters, such as an ending ‘Ain’ with a loopy flourish at the end.
- Misclassified characters because of similarity to other characters. The most common example is recognising a ‘Faa’ to be an ‘Ain’, and vice versa.
- Badly deformed characters.

It is suggested that the above deficiencies are dealt with in the following way. Include all the disallowed forms stated above (except the other ‘LamAlif’ form) in the decision tree. Deal with the ‘Ain’ and ‘Faa’ in the same manner that we deal with Raa/Daal. We decided to refine further the definition of the ‘Hamza’, to allow for a wide degree of variance from the norm. Finally, if similar characters (such as the Raa & Daal) are, in the future, forced into separate classes, while keeping all other conditions constant, then the character recognition rate will fall to a wholly unacceptable level, for those characters. Hence, it is necessary to add new features to the features set before attempting any such modification.

It is believed that after incorporating the above refinements, a learning module (especially for the similarly-written Raa & Daal as well as Ain & Faa), and more clearly stated restrictions on writing style, it would be possible to significantly improve the overall recognition rate. But that is for future work to demonstrate.

4.2 Comparison to other On-line Arabic Character Recognition Techniques

The character recognition system as a whole is compared to other on-line Arabic character recognition systems. Amin proposed several systems. The best character recognition rate (CR) achieved was 95.4% [1], which is slightly better than the preliminary CR obtained here. Badi [2] used structural features, just as [1] and we did. However, unlike Amin, who used a nearest-neighbor classification technique, Badi used a decision tree technique, similar to ours. Badi’s system gave a CR of 90%. El-Wakil in his 1989 paper [11] applied his mechanism to isolated characters (as done here), utilized structural features (as done here) in a chain code, but unlike our work, used a nearest-neighbor method for classification. His method yielded a CR of 93%, nearly identical to ours.

The best CR claimed for any work in on-line Arabic Character Recognition is the 99.6% stated in [9]. This work also used structural features and a decision tree for classification. The system was applied to the recognition of characters as well as mathematical formulae. However, the system was tried by only one ‘uncooperative’ individual, though extensively. Hence, there is a lack of information about how well the system would perform if it were tested by a number of individuals, with different writing styles.

In summary, we can claim that the (preliminary) CR rate achieved with our system is among the best reported, but can still be improved significantly. However, it is crucial that we reach that high CR rate without sacrificing too much of the simplicity of the system or, indeed, its real-time nature.

5 Conclusion

In this paper we presented a simple and computationally efficient mapping, which can be used for character recognition. We applied it to hand-written Arabic characters, and explained that it (taken with other features of the character) can be used to produce an effective recognition system to identify each character uniquely. The relatively simple system was applied to more than 9000 handwritten samples produced by 100 different individuals. It returned an average recognition rate of 92.75%. This is comparable to the CR rates of some of the most sophisticated Arabic recognition systems available.

References

[1] Amin, A., Kaced, A., Haton, J., and Mohr, R. (1980). Hand written Arabic Character Recognition by the I.R.A.C. system. Proceedings of the Fifth Int. Conference on Character Recognition, Miami, FL. Pp. 729-731.
[2] Badi, K. & Shimura, M. (1982). Machine Recognition of Arabic Cursive Scripts. In Transactions of the Institute of Electronics & Communications Engrs. Japan, Vol. E65, pp. 107-114.
[3] Bailey R R and Srinath M (1996). Orthogonal Moment Features for Use with Parametric and Non-Parametric Classifiers. In IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol.18, No.4.
[4] Belkasim S O et al (1989). Shape Recognition Using Zernike Moment Invariants. In Asilomar conference on circuits. Vol.1, pp. 161-171.
[5] Belkasim S O et al (1991). Pattern Recognition with Moment Invariants: A Comparative Study and New Results. In Pattern Recognition, Vol.24, pp.1117-1138.
[6] Desai M and Cheng H D (1994). Pattern Recognition by Local Radial Moments. In Proceedings of the International Conference on Pattern Recognition. Pp. 168-172.
[7] Di Zenzo S et al (1992). Optical Recognition of Hand-Printed Characters of any Size, Position, and Orientation. In IBM Journal of Research and Development, Vol. 36, No. 3, pp. 487-501.
[8] El-Desouky, A, Salem, M., and Arafat, H. (1992). A Handwritten Arabic Character Recognition Technique for Machine Reader. Int. Journal for Mini Microcomputer, Vol. 14, No. 2, pp. 57-61.
[9] El-Sheikh, T.S. & El-Taweel, S.G. (1989). Real-time Arabic Handwritten Character Recognition. In the Proceedings of the 3rd Int. Conference on Image Processing and its Applications, pp. 212-216. Held in Warwick, UK. IEE, London, UK.
[10] El-Sheikh, T.S. (1990). Recognition of Handwritten Arabic Mathematical Formulas. In the Proceedings of the UK IT 1990 Conference, pp. 344-351. Southampton, UK.
[11] El-Wakil, M.S. & Shoukry, A (1989). Online Recognition of handwritten Arabic Characters. Pattern Recognition, Vol. 22, No. 2, pp. 97-105.
[12] Granlund G H (1972). Fourier Preprocessing for Hand Print Character Recognition. In IEEE Transactions on Computers, February 1972.
[13] Kauppinen, et al (1995). An experimental comparison of autoregressive and Fourier-based descriptors in 2D shape classification. In IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.17, No.2.
[14] Khotanzad A and Hong Y H (1990). Invariant Image Recognition by Zernike Moments. In IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.12, No.5.
[15] Kim W and Yuan P (1994). A Practical Pattern Recognition System for Translation, Scale and Rotation Invariance. In proceedings of the IEEE Society Conference on Computer Vision and Pattern Recognition. Pp. 391-396.
[16] Nishimura M and Van der Spiegel J (1995). Pattern Recognition Based on Orientation and Linestops Using an Orientation Sensor and Multilayered Neural Network. In proceedings of SPIE’95.
[17] Perantonis S J and Lisboa P J G (1992). Translation, Rotation, and Scale Invariant Pattern Recognition by High-Order Neural Networks and Moment Classifiers. In IEEE Transactions on Neural Networks. Vol.3, No.2.
[18] Persoon E (1977). Shape Discrimination using Fourier Descriptors. In IEEE Transactions on Systems, Man and Cybernetics. Vol.smc-7, No.3.
[19] Sanossian H Y Y (1996). An Arabic Character Recognition System Using Neural Network. In proceedings of the IEE workshop on Neural Networks for Signal Processing. Pp. 340-348.
[20] Simon J-C (1994). Uncertainty versus Computational Complexity. In Artificial Intelligence in Mathematics. Johnson J H, McKee S, and Vella (eds), 1994. Oxford University Press.
[21] Wang, Dayong and Xie, Weixin (1996). Invariant Image Recognition by Neural Networks and Modified Moment Invariants. In proceedings of SPIE’96.
[22] Wilfong G, et al (1996). On-Line Recognition of Handwritten Symbols. In IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.18, No. 9.
[23] Wong W-H et al. (1995). Generation of Moment Invariants and their uses for Character Recognition.
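
Illustrative sketch of the mapping of section 2

The rotation-against-length mapping (formulas 1 and 2) and its two pre-processing steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation that was tested in this paper. The function names, the default of clipping two points per end, the approximation of the tangent at a point by the direction of the segment leaving it, the use of a y-axis that points upward (so that counter-clockwise is positive), and the resampling of R at 32 evenly spaced values of L before a Euclidean distance is taken are all hypothetical choices made for illustration.

```python
import math


def preprocess(points, clip=2):
    """Clip `clip` points off each end, then smooth with a 3-point moving average.

    Assumption: the two end points of the clipped stroke are kept unsmoothed.
    """
    pts = list(points[clip:len(points) - clip]) if clip else list(points)
    if len(pts) < 3:
        return pts
    smoothed = [pts[0]]
    for (x0, y0), (x1, y1), (x2, y2) in zip(pts, pts[1:], pts[2:]):
        smoothed.append(((x0 + x1 + x2) / 3.0, (y0 + y1 + y2) / 3.0))
    smoothed.append(pts[-1])
    return smoothed


def rps_mapping(points):
    """Return a list of (L, R) pairs: cumulative rotation against normalised length."""
    seg_len, seg_ang = [], []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        d = math.hypot(dx, dy)
        if d == 0.0:
            continue  # repeated sample: no tangent direction, skip it
        seg_len.append(d)
        seg_ang.append(math.atan2(dy, dx))  # assumed tangent angle
    total = sum(seg_len)  # L_t, total length of the stroke
    if total == 0.0:
        return []
    pattern = []
    rotation = 0.0  # R_0 = 0 by definition
    travelled = 0.0
    prev = seg_ang[0]
    for d, ang in zip(seg_len, seg_ang):
        # dR: signed change of tangent angle, wrapped into [-pi, pi)
        d_r = (ang - prev + math.pi) % (2.0 * math.pi) - math.pi
        rotation += d_r            # formula 1
        travelled += d
        pattern.append((travelled / total, rotation))  # formula 2
        prev = ang
    return pattern


def resample(pattern, k=32):
    """Linearly interpolate R at k evenly spaced values of L in (0, 1]."""
    if not pattern:
        return [0.0] * k
    samples, j = [], 0
    for i in range(1, k + 1):
        target = i / k
        while pattern[j][0] < target and j < len(pattern) - 1:
            j += 1
        l1, r1 = pattern[j]
        if j == 0 or l1 <= target:
            samples.append(r1)
        else:
            l0, r0 = pattern[j - 1]
            samples.append(r0 + (r1 - r0) * (target - l0) / (l1 - l0))
    return samples


def pattern_distance(a, b, k=32):
    """Euclidean distance between two output patterns after resampling."""
    ra, rb = resample(a, k), resample(b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ra, rb)))
```

Because only differences of tangent angles and ratios of lengths enter the computation, rotating, translating, or uniformly scaling a stroke should leave the output pattern unchanged up to rounding, so `pattern_distance` between the original and the transformed stroke should be close to zero. Each point is visited once, which is in keeping with the O(n) order stated for boundary-based techniques.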