International Journal of Science, Engineering and Technology Research (IJSETR) Volume 1, Issue 1, July 2012

Comparison of Handwritten Character Recognition Accuracy Using Different Feature Extraction Methods

Thin Nu Nu Lwin, Thandar Soe
Department of Information Technology, Mandalay Technological University
Thinnunulwin@gmail.com

Abstract – Feature extraction techniques are important in character recognition because they can improve recognition efficiency compared with pixel-based approaches. This study investigates feature extraction techniques for representing handwritten characters. In this system, three feature extraction methods (Gradient, DCT and DWT) are compared on handwritten characters. The system starts by acquiring an image containing characters. The characters are processed through several phases, such as binarization, noise filtering, normalization and feature extraction, before recognition. A multilayer neural network is used for the recognition phase; the feed-forward back-propagation algorithm is applied for training the network. The purpose of this paper is to compare the different feature extraction methods in terms of recognition accuracy and training time.

Keywords – Handwritten English characters (A-Z), Gradient, DCT, DWT, Multilayer neural network, Feed forward back propagation, Recognition accuracy

I. INTRODUCTION

Handwriting recognition is one of the most desirable computer features for enhancing communication between humans and computers. According to initial research, handwriting recognition was developed to allow computers to read and understand human language in written form. The field of handwriting recognition is divided into off-line and on-line recognition. The on-line approach uses a tracking device, i.e., a tablet digitizer or pen device, to collect the time-position-action of writing strokes; the sequences of writing positions are stored in temporal order. The off-line approach uses a light-sensitive device, e.g., a scanner or a digital camera, to read a written document. The off-line data acquisition and recognition approach is the interest of this study.

Common processes of off-line handwriting recognition systems are preprocessing, feature extraction and recognition. The feature extraction process extracts the relevant information, known as feature vectors, which can be used to identify the input image in the recognition step. The recognition process uses these features to find the class most compatible with the input. The main objective of feature extraction is to reduce data dimensionality by extracting the most important features from the character image [1]. Moreover, feature extraction is a significant factor in obtaining high accuracy in character recognition systems, especially when little training data is available.

The present study aims to compare the performance of the Gradient, DCT and DWT features for handwritten English characters. The gradient feature provides higher resolution on both the magnitude and angle of directional strokes, which improves the character recognition rate. The gradient feature represents local characteristics of a character image; the features extracted from handwritten characters are the directions of pixels with respect to their neighboring pixels. This approach increases the information content and gives a better recognition rate with reduced recognition time.
The DCT is a popular signal transformation method that makes use of cosine functions of different frequencies. DCT-based transform coding compresses image data by representing the original signal with a small number of transform coefficients. It exploits the fact that, for typical images, a large amount of the signal energy is concentrated in a small number of coefficients. The goal of DCT transform coding is to minimize the number of retained transform coefficients while keeping distortion at an acceptable level. For that reason, the DCT has become the most widely used transform coding technique.
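To make the energy-compaction claim above concrete, the short sketch below computes the 2-D DCT of a synthetic 8x8 image block and reports the share of the total energy carried by the 16 lowest-frequency coefficients. This is an illustrative example added for this comparison, not code from the original system; it assumes NumPy and SciPy are available.

```python
# Minimal illustration of DCT energy compaction (assumes NumPy and SciPy).
import numpy as np
from scipy.fftpack import dct

def dct2(block):
    """2-D type-II DCT with orthonormal scaling, applied along rows then columns."""
    return dct(dct(block, norm='ortho', axis=0), norm='ortho', axis=1)

# A smooth synthetic 8x8 block standing in for typical image data.
x, y = np.meshgrid(np.arange(8), np.arange(8))
block = 128 + 40 * np.cos(0.3 * x) + 30 * np.sin(0.2 * y)

coeffs = dct2(block)
energy = coeffs ** 2
total = energy.sum()

# Energy held by the 16 coefficients nearest the top-left (low-frequency) corner.
order = np.argsort(x.ravel() + y.ravel())      # rough low-to-high frequency ordering
low_freq_energy = energy.ravel()[order[:16]].sum()
print(f"share of energy in 16 low-frequency coefficients: {low_freq_energy / total:.4f}")
```

Because nearly all of the energy sits in those few low-frequency coefficients, discarding the rest loses little information, which is why a short vector of DCT coefficients can serve as a compact image descriptor.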
The Wavelet Transform is a powerful technique for representing data at different scales and frequencies. The Discrete Wavelet Transform (DWT) maps a time-domain signal into the time-frequency domain; the resulting values are called wavelet coefficients. Because the DWT is based on sub-band coding, it yields a fast computation of the Wavelet Transform, is easy to implement, and reduces the computation time and resources required.

The purpose of this work is to ascertain how effectively each feature extraction technique captures useful information and hence leads to more accurate recognition results. The remainder of the paper is organized as follows: Section II briefly reviews prior work on feature extraction for handwritten characters. The proposed system components are given in Section III. Section IV explains the materials and methods used in the current system. The results of our experiments are described in Section V and conclusions are drawn in Section VI.

II. RELATED WORKS

Handwriting recognition is one of the oldest and most challenging problems in computer-related research. The challenge is how to implement computer systems that can read like humans. Many researchers have worked on off-line handwritten character recognition.

Kumar [2] compared the performance of five feature extraction methods on handwritten Devanagari characters. The features covered are Kirsch directional edges, distance transform, chain code, gradient and directional distance distribution. That experiment found that Kirsch directional edges perform worst and the gradient performs best with SVM classifiers. With a multilayer perceptron (MLP), the performance of the gradient and directional distance distribution is almost the same, and the chain-code feature is better than Kirsch directional edges and the distance transform.

Lawgali [3] compared the effectiveness of the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT) in capturing discriminative features of Arabic handwritten characters. A new database containing 5600 characters covering all shapes of Arabic handwritten characters was also developed. DCT and DWT techniques were used for feature extraction, and the coefficients of both techniques were fed into an ANN for classification of the characters. That experiment demonstrated that feature extraction by DCT gives a higher recognition rate than DWT.

Olarik Surinta et al. [4] proposed a novel feature extraction technique called the hotspot technique for representing handwritten characters and digits. In the hotspot technique, the distance values between the closest black pixels and the hotspots in each direction are used as the representation of a character. The hotspot technique was applied to three data sets: Thai handwritten characters (65 classes), Bangla numerals (10 classes) and MNIST (10 classes). The data sets were then classified by the k-Nearest Neighbors algorithm using the Euclidean distance for computing distances between data points. In that study, the classification rates obtained from the hotspot, mark direction and chain-code direction techniques were compared; the results showed that the hotspot technique provides the largest average classification rate.

Dayashankar Singh et al. [5] presented a feature extraction technique that computes only twelve directional feature inputs based on gradients. A total of 500 handwritten samples, including handwritten Hindi characters, English characters and some special characters, were used in that experiment. The features extracted from the handwritten characters are the directions of pixels with respect to their neighboring pixels. These inputs are given to a back-propagation neural network with one hidden layer and one output layer. The experimental results showed that the approach gives better results than other techniques in terms of recognition accuracy, training time and classification time.

Amir Mowlaei et al. (2002) [6] presented feature extraction using the wavelet transform for Farsi/Arabic characters and numerals. The DWT is used to produce the wavelet coefficients, with the Haar wavelet used during feature extraction. The experiment was carried out with 480 samples per digit and 190 samples per character, each divided into training and test sets; both sets reach high recognition rates between 91% and 99%.

Wunsch and Laine [7] proposed wavelet descriptors for the recognition of handwritten characters, and their experimental results showed that wavelet descriptors are an efficient representation. In that paper, a feature extraction method based on the two-dimensional discrete wavelet transform was proposed for off-line recognition of unconstrained handwritten numerals, with a back-propagation neural network as the classifier. To verify the performance of the approach, 1500 handwritten numerals written by 30 persons were collected as the database; 750 numerals were used as the training set and the other 750 as the testing set. Classification was accomplished by a three-layer back-propagation neural network. The recognition rates on the training set and the testing set were 99.1% and 96.8%, respectively. The experimental results show that the method is a simple and efficient representation for unconstrained handwritten numeral recognition requiring little image preprocessing.

From the above literature survey, it is clear that feature extraction is an integral part of any recognition system and that the selection of feature extraction techniques is an important step toward higher recognition accuracy. In this work, Gradient, DCT and DWT are used and compared for feature extraction of all shapes of handwritten English characters, with a neural network in the recognition stage.

III. SYSTEM COMPONENTS

The entire system can be divided into four parts:
A. Image Acquisition
B. Preprocessing
C. Feature Extraction
D. Recognition
Figure 1: Block diagram of the training system (image acquisition; preprocessing: noise removal, binarization, normalization; feature extraction: Gradient, DCT or DWT; training with a neural network into the train database)

Figure 2: Block diagram of the recognition system (image acquisition; preprocessing; feature extraction: Gradient, DCT or DWT; classification by the corresponding trained neural network; saving and comparison of the recognition accuracies)

IV. MATERIALS AND METHODS

The steps of the proposed comparison, based on the Gradient, DCT and DWT features, are described in Fig. 1 and Fig. 2.

A. Image acquisition

An image is given to the system as input. The image should have a specific format, for example PNG. It can be acquired through a scanner, a digital camera or another digital input device. The scanner is the most common device for obtaining the image because less noise occurs during its imaging process than with other devices. The input images are scanned at a resolution of 300 dpi (dots per inch) and stored as grayscale images, as shown in Fig. 3.

Fig. 3: Input image

B. Preprocessing

Before the image is given to the recognition system, it needs to be brought into a standard format acceptable to the neural network as input. The blocks of the preprocessing step are as follows:

(i) Binarization: In image binarization, the grayscale text image is converted into a binary image, with each pixel taking a value of 0 or 1 depending on a threshold. The technique most commonly employed for determining the threshold involves analyzing the histogram of gray levels in the digitized image. The scanned image and its binarized output are shown in Fig. 4.

Fig. 4: Input image and binary image of "A"

(ii) Noise removal: Noise removal means reducing noise in an image. For off-line recognition, the noise may come from the writing style or from the optical device that captures the image. The presence of noise can reduce the efficiency of the character recognition system, so it should be eliminated from the image as much as possible to avoid confusing the recognizer. Median filtering is used in this paper. A noisy image and its filtered version are shown in Fig. 5.

Fig. 5: Noisy image and noise-removed image

(iii) Normalization: Normalization is used to standardize the font size within the image. The size of handwritten characters varies from person to person, and even for the same person from time to time. Therefore, the characters should be scaled to a standardized matrix to make the recognition process independent of the writing size and to achieve better recognition accuracy. In this paper, each character is normalized to 32 × 32 pixels. An example of size normalization is illustrated in Fig. 6.

Fig. 6: Normalization of the character "A"
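As a rough illustration of the preprocessing chain described above, the sketch below uses OpenCV to binarize a scanned grayscale character with Otsu's threshold, clean it with a 3 × 3 median filter, and normalize it to a 32 × 32 matrix. This is a minimal sketch under stated assumptions, not the authors' implementation; the helper name preprocess and the file name character_A.png are placeholders.

```python
# Sketch of the preprocessing chain: binarize -> median filter -> 32x32 normalization.
# Assumes OpenCV and NumPy; "character_A.png" is a hypothetical input file.
import cv2
import numpy as np

def preprocess(path, size=32):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)            # scanned grayscale character
    if gray is None:
        raise FileNotFoundError(path)
    # Otsu's method picks the threshold from the gray-level histogram;
    # THRESH_BINARY_INV makes the dark ink the foreground.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    denoised = cv2.medianBlur(binary, 3)                      # 3x3 median filter removes speckle
    normalized = cv2.resize(denoised, (size, size),
                            interpolation=cv2.INTER_NEAREST)  # scale to a standard matrix
    return (normalized > 0).astype(np.uint8)                  # fixed-size 0/1 matrix

if __name__ == "__main__":
    char = preprocess("character_A.png")
    print(char.shape)   # (32, 32)
```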
C. Feature Extraction

In printed and handwritten text, the features capture the information extracted from the characters. This information is passed to the matcher to assist in the classification process. In this research, Gradient, DCT and DWT features are adopted to describe the characters; all are widely used in digital signal processing applications.

(i) Gradient

The gradient measures the magnitude and direction of the greatest change in intensity in a small neighborhood of each pixel. Gradients are computed by means of the Sobel operator, whose templates give the horizontal (X) and vertical (Y) components of the gradient. The templates are shown in Fig. 7.

Fig. 7: Sobel operator templates (horizontal and vertical)

The two gradient components at location (i, j) are calculated as

G_x(i, j) = f(i-1, j+1) + 2f(i, j+1) + f(i+1, j+1) - f(i-1, j-1) - 2f(i, j-1) - f(i+1, j-1)

G_y(i, j) = f(i-1, j-1) + 2f(i-1, j) + f(i-1, j+1) - f(i+1, j-1) - 2f(i+1, j) - f(i+1, j+1)

The gradient strength and direction are calculated as

G(i, j) = \sqrt{G_x(i, j)^2 + G_y(i, j)^2}

\Theta(i, j) = \tan^{-1}\big( G_x(i, j) / G_y(i, j) \big)

After computing the gradient at each pixel of the character, the gradient directions are mapped onto 18 direction values, with an angle span of 10 degrees between any two adjacent direction values.

(ii) Discrete Cosine Transform (DCT)

The discrete cosine transform (DCT) is a technique for converting a signal into elementary frequency components. The DCT technique involves three steps: transformation, quantization and encoding. The transform converts the image data into its elementary frequency components. Quantization reduces the number of possible values of a quantity, thereby reducing the number of bits needed to represent it. Entropy encoding represents the quantized data as compactly as possible.

First, the image is divided into 8 × 8 blocks. The transform clusters the lowest-frequency components in the upper-left corner of the coefficient array and the highest-frequency components in the bottom-right corner. For an M × N image f(m, n), the DCT coefficients are

F(u, v) = \alpha(u)\,\alpha(v) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m, n) \cos\!\left[\frac{\pi(2m+1)u}{2M}\right] \cos\!\left[\frac{\pi(2n+1)v}{2N}\right]

with

\alpha(u) = \begin{cases} 1/\sqrt{M}, & u = 0 \\ \sqrt{2/M}, & 1 \le u \le M-1 \end{cases}
\qquad
\alpha(v) = \begin{cases} 1/\sqrt{N}, & v = 0 \\ \sqrt{2/N}, & 1 \le v \le N-1 \end{cases}

where f(m, n) is the pixel value at the (m, n) coordinate position in the image and F(u, v) is its DCT-domain representation, with u and v representing the vertical and horizontal frequencies. The DCT coefficients are then quantized. After quantization, the quantized coefficients are read out in a zigzag fashion and stored in a vector sequence, as shown in Fig. 8. In image coding these coefficients are encoded for efficient transmission; here they are used as the features of the character image.

Fig. 8: Zigzag sequence

(iii) Discrete Wavelet Transform (DWT)

The Wavelet Transform (WT) is a way to represent a signal in time-frequency form, and the DWT provides a detailed picture of the signal being analyzed. The DWT applies a low-pass filter (LPF) and a high-pass filter (HPF) to decompose the image along the rows and columns, and the output of each filter is down-sampled by two. Each of the sub-signals is then again high- and low-pass filtered and again down-sampled by two. At each decomposition level, the DWT separates an image into one low-frequency sub-band (LL) and three high-frequency sub-bands (LH, HL, HH). LL is the approximation sub-band, LH is the horizontal detail sub-band, HL is the vertical detail sub-band and HH is the diagonal detail sub-band. The decomposition at one level is shown in Fig. 9. The low-frequency coefficients of the LL sub-band are close to the original image and retain most of its information. Therefore, these coefficients are used as the features of the character image.

Fig. 9: DWT decomposition at one level
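The sketch below is a hypothetical implementation consistent with the three descriptions above, not the authors' code: an 18-bin gradient-direction histogram built from Sobel gradients, the first 16 zigzag-ordered 2-D DCT coefficients, and the flattened LL sub-band of a one-level Haar DWT, all computed on a 32 × 32 binary character such as the one produced by the preprocessing sketch in Section IV-B. NumPy, SciPy and PyWavelets are assumed, and the function names are illustrative.

```python
# Hypothetical feature extractors matching the descriptions above
# (assumes NumPy, SciPy and PyWavelets; input is a 32x32 binary character).
import numpy as np
from scipy.ndimage import sobel
from scipy.fftpack import dct
import pywt

def gradient_features(img, bins=18):
    """Magnitude-weighted histogram of Sobel gradient directions,
    18 bins of 10 degrees over a 180-degree span."""
    img = img.astype(float)
    gx = sobel(img, axis=1)                      # horizontal (X) component
    gy = sobel(img, axis=0)                      # vertical (Y) component
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180   # fold directions into [0, 180)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 180), weights=mag)
    return hist / (hist.sum() + 1e-9)            # normalized 18-element vector

def dct_features(img, n_coeffs=16):
    """First 16 coefficients of the 2-D DCT taken in standard zigzag order."""
    c = dct(dct(img.astype(float), norm='ortho', axis=0), norm='ortho', axis=1)
    n = img.shape[0]
    zigzag = sorted(((i, j) for i in range(n) for j in range(n)),
                    key=lambda p: (p[0] + p[1],
                                   p[0] if (p[0] + p[1]) % 2 else p[1]))
    return np.array([c[i, j] for i, j in zigzag[:n_coeffs]])

def dwt_features(img):
    """Flattened LL (approximation) sub-band of a one-level Haar DWT."""
    ll, (lh, hl, hh) = pywt.dwt2(img.astype(float), 'haar')
    return ll.ravel()                            # 16x16 -> 256 values for a 32x32 input

if __name__ == "__main__":
    demo = np.zeros((32, 32))
    demo[8:24, 14:18] = 1                        # crude vertical stroke
    print(gradient_features(demo).shape, dct_features(demo).shape, dwt_features(demo).shape)
```

The 18-element gradient histogram and the 16 DCT coefficients match the network input sizes quoted in the next subsection; for a 32 × 32 input the LL sub-band would give a 16 × 16 (256-element) vector.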
D. Recognition (Classification)

Template matching, structural analysis and neural networks have traditionally been popular classification methods for character recognition, but neural networks increasingly prove to offer better and more reliable accuracy for handwriting recognition [8].

Architecture: The most popular neural network architecture used in English character recognition is a network with three layers: an input layer, a hidden layer and an output layer. Fig. 10 depicts an example of the architecture of a 3-layer neural network. The input layer is fed the features of the characters, so the number of nodes in this layer depends on the number of input features of the network. The last layer is the output layer, and the number of its nodes is determined by the desired outputs. The hidden layer lies between the input and output layers. The system is trained on 208 characters from different writers. For the gradient features, the number of inputs is 18, the number of neurons in the hidden layer is 400 and the number of neurons in the output layer is 26. For DCT, the number of inputs is 16, with the same 400 hidden neurons and 26 output neurons. The number of nodes in the hidden layer governs the variety of samples that can be accurately and correctly recognized by the network; if the network has trouble learning, neurons can be added to this layer.

Fig. 10: Example of the architecture of a 3-layer neural network

Training phase: Neural networks are commonly trained so that a particular input leads to a specific target output. The network is adjusted, based on a comparison of the output and the target, until the network output matches the target. The system applies the feed-forward back-propagation neural network algorithm. The back-propagation algorithm consists of three stages. The first is the forward phase, which propagates the inputs from the input layer through the hidden layer to the output layer to produce outputs. The second is the backward phase, which calculates the error at the output layer and propagates it back through the hidden layer toward the input layer. The third stage is the adjustment of the weights and biases. The backward stage is similar to the forward stage except that error values are propagated back through the network to determine how the weights are to be changed during training. During training, each input pattern has an associated target pattern. After training, applying the network involves only the computations of the feed-forward stage.
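The three stages just described can be illustrated with a small NumPy sketch of one training step for the gradient configuration (18 inputs, 400 hidden neurons, 26 outputs). This is an illustrative reconstruction under assumed details (sigmoid units, squared-error updates, learning rate 0.1), not the authors' implementation.

```python
# Minimal NumPy sketch of one feed-forward back-propagation training step
# (18 gradient features -> 400 hidden -> 26 output classes).
# Illustrative reconstruction; activation and learning rate are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 18, 400, 26
W1 = rng.normal(0, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_hidden, n_out)); b2 = np.zeros(n_out)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, target, lr=0.1):
    """One input pattern and its 26-element one-hot target: the three back-propagation stages."""
    global W1, b1, W2, b2
    # 1) forward phase: propagate the input through the hidden layer to the output layer
    h = sigmoid(x @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # 2) backward phase: propagate the output error back toward the input layer
    delta_out = (y - target) * y * (1 - y)
    delta_hidden = (delta_out @ W2.T) * h * (1 - h)
    # 3) adjustment of the weights and biases
    W2 -= lr * np.outer(h, delta_out); b2 -= lr * delta_out
    W1 -= lr * np.outer(x, delta_hidden); b1 -= lr * delta_hidden
    return y

# Toy usage: one random feature vector with target class "A" (index 0).
x = rng.random(n_in)
t = np.zeros(n_out); t[0] = 1.0
for _ in range(10):
    out = train_step(x, t)
print(int(out.argmax()))   # the winning class drifts toward index 0 after a few updates
```

For the DCT features the input layer would simply shrink to 16 units; the rest of the network and the training loop stay the same.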
V. EXPERIMENTAL RESULTS

Experiments were carried out using 286 isolated English characters from 11 independent writers. These characters were divided into two data sets: 208 characters for training and 78 characters for testing. Comparative results for Gradient, DCT and DWT in terms of recognition accuracy are summarized in Table 1.

TABLE 1
RECOGNITION ACCURACY USING DIFFERENT FEATURE EXTRACTION TECHNIQUES

Feature method   Train images   Test images   Known (correct)   Unknown (incorrect)   Accuracy
Gradient         208            78            68                10                    87.2%
DCT              208            78            65                13                    83.3%
DWT              208            78            60                18                    77.5%

On the test data set, the gradient-based system achieved 87.2% correct readings (12.8% incorrect), the DCT-based system 83.3% correct readings (16.7% incorrect) and the DWT-based system 77.5% correct readings (22.5% incorrect). The results show that feature extraction based on the gradient yields a higher recognition rate than the other two methods, and that DCT gives a slightly higher recognition rate than its DWT counterpart.

VI. CONCLUSION

This paper has compared three feature extraction techniques (Gradient, DCT and DWT) for handwritten English characters. All techniques were used with an ANN for classification of the characters. The recognition rates of the Gradient, DCT and DWT techniques are 87.2%, 83.3% and 77.5%, respectively. The results demonstrate that feature extraction by gradient gives the highest recognition rate for handwritten English characters. A reason may be that the ability of the gradient feature to compress the image data into a compact directional representation makes it more efficient for pattern recognition applications.

REFERENCES

[1] Lauer, F., Suen, C. Y., and Bloch, G. (2007). A trainable feature extractor for handwritten digit recognition. Pattern Recognition, 40(6): 1816-1824.
[2] Kumar, Singh, "Performance comparison of features on Devanagari hand print dataset", International Journal of Recent Trends, vol. 1, no. 2, pp. 33-37, 2009.
[3] Lawgali, A., Bouridane, A., Angelova, M., and Ghassemlooy, Z., "Handwritten Arabic Character Recognition: Which Feature Extraction Method?", School of Computing, Engineering and Information Sciences, Northumbria University, Newcastle upon Tyne, UK.
[4] Surinta, O., Schomaker, L., and Wiering, M., "Handwritten Character Classification using the Hotspot Feature Extraction Technique", Department of Artificial Intelligence, University of Groningen, Nijenborgh 9, Groningen, The Netherlands.
[5] Singh, D., Singh, S. K., and Dutta, M., "Hand written character recognition using twelve directional feature input and neural network", International Journal of Computer Applications (0975-8887), vol. 1, no. 3, 2010.
[6] Mowlaei, A., et al., feature extraction with the wavelet transform for handwritten Farsi/Arabic characters and numerals, 2002.
[7] Wunsch, P., and Laine, A. F., "Wavelet descriptors for multiresolution recognition of handprinted characters", Pattern Recognition, 1995.

All Rights Reserved © 2012 IJSETR