Recognition of On-Line Handwritten Mathematical Expressions in the E-Chalk System - An Extension

Ernesto Tapia and Raúl Rojas
Freie Universität Berlin, Institut für Informatik
Takustr. 9, D-14195 Berlin, Germany
{tapia,rojas}@inf.fu-berlin.de

Abstract

This article introduces the new version of our system for the recognition of on-line handwritten mathematical expressions. The previous version was able to recognize the most common mathematical expressions, such as expressions of calculus, following the usual mathematical conventions. This version improves and extends our technique, allowing the recognition of structures described by tabular and multiline arrangements, such as matrices and systems of equations. The recognition results are stored in a composite of symbols, which is used by computer algebra systems for algebraic computations and function plotting. The system is embedded as an intelligent assistant in the Electronic Chalkboard (E-Chalk), a system for the live transmission and storage of lectures.

1. Introduction

Traditionally, there has been a lack of tools for developing systems specialized in the recognition of on-line handwritten mathematics, due to the complexity of two-dimensional mathematical notation, which includes formulas, arrays of symbols, diagrams, etc. Programs running on personal digital assistants (PDAs) and Tablet PCs are able to recognize digital ink as text or as simple one-dimensional arithmetical expressions, e.g. using the Graffiti alphabet for the former, or libraries such as PenReader¹ for the latter. More recently, Microsoft has offered a special version of the Windows operating system for the Tablet PC, which includes libraries for processing, storing, and recognizing digital ink. However, that system is also specialized towards the recognition of Latin letters and cannot easily be extended to recognize other symbols. In the meantime, a few prototypes of on-line formula recognizers have appeared.
The Freehand Formula Entry System [11] is a pen-based equation editor distributed under the GNU General Public License. The Natural Log system [8] is a user-dependent system written in Java, available on-line as an applet. The recognition result of the former is a LaTeX representation of the mathematical expression, whereas the latter also offers a MathML representation. The Infty Editor [3] is a commercial system specialized for creating mathematical documents. It uses an interesting approach for the recognition of isolated symbols, based on the classification of extended and non-extended characters. MathJournal [14] is another commercial system, developed for the Tablet PC version of the Windows platform. Its “solution engines” process the recognized expressions in numeric, graphic, or symbolic formats. Its recognition of symbols is limited because it uses the recognizer provided by Microsoft.

¹ Developed by Paragon Software, http://www.penreader.com

The recognition system described in this article is a tool embedded in the Electronic Chalkboard (E-Chalk), a system for the live transmission and storage of lectures [7]. E-Chalk uses a contact-sensitive screen to process and store the lecturer’s handwriting, which is processed by intelligent agents working in the background (see Fig. 1). The aim of our system is to give lecturers the possibility to interpret and process handwritten mathematics.

The purpose of this article is to introduce the new version of our recognition system. It is organized as follows. Section 2 describes the components of the recognition system. Section 3 shows how the system is used as an intelligent tool in E-Chalk. We conclude with some comments in Sec. 4.

2. System description

We follow a two-step approach for the recognition of on-line handwritten mathematical expressions.
The first step is the recognition of isolated handwritten symbols, and the second step is the layout analysis of the symbols in the expression. In the next sections we describe the methods we developed to carry out these main steps. A detailed description of our techniques can be found in [12].

Figure 1. The E-Chalk system. Four contact-sensitive screens are used with four rear projectors.

Figure 2. DAG for four classes.

Proceedings of the 2005 Eight International Conference on Document Analysis and Recognition (ICDAR’05) 1520-5263/05 $20.00 © 2005 IEEE

2.1. Symbol recognition

Symbols are divided into three groups: symbols consisting of one stroke, of two strokes, and of three or more strokes. Strokes are processed using known techniques to reduce irregularities, variability, and the amount of data: point clustering, dehooking, resampling and interpolation, grouping and reordering of strokes, and scaling. After preprocessing, features are extracted to form the input vector used for classification. These include local features, such as the coordinates of stroke points and the sine and cosine of turning angles, and global features related to topological properties of the stroke.

We use Support Vector Machines (SVM) as the default classifier for on-line isolated symbols. In our experiments it was the best classifier when compared against other popular classification techniques. The results of our experiments with the 1a subset (digits) of the UNIPEN database [4] are shown in Table 1. For support vector classification we used the SVM Directed Acyclic Graph (SVM-DAG) technique [10], with a radial basis kernel at each classification node of the graph (see Fig. 2). We used libsvm version 2.71 [2] to train the classification nodes. The parameters C and γ of our support vector classifier were tuned using the tool included in libsvm.
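A minimal Python sketch of how a decision DAG over pairwise classifiers can be evaluated is given below. It is an illustration under stated assumptions, not the code of the system: the pairwise nodes are stand-in nearest-prototype deciders rather than RBF-kernel SVMs trained with libsvm, the names `dag_predict` and `make_prototype_decider` are hypothetical, and the two-dimensional prototypes are invented for the example. Each evaluated node eliminates one candidate class, so a decision over k classes visits k - 1 of the k(k - 1)/2 nodes.

```python
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple

# A pairwise decider takes a feature vector and returns the winning class label.
Decider = Callable[[Sequence[float]], int]


def dag_predict(classes: Sequence[int],
                nodes: Dict[Tuple[int, int], Decider],
                x: Sequence[float]) -> int:
    """Evaluate a decision DAG: every visited node removes one candidate class."""
    remaining = sorted(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = nodes[(first, last)](x)
        if winner == first:
            remaining.pop()      # 'last' lost, drop it
        else:
            remaining.pop(0)     # 'first' lost, drop it
    return remaining[0]


def make_prototype_decider(a: int, b: int,
                           prototypes: Dict[int, Sequence[float]]) -> Decider:
    """Stand-in for a trained binary SVM (assumption): pick the nearer prototype."""
    def decide(x: Sequence[float]) -> int:
        dist_a = sum((xi - pi) ** 2 for xi, pi in zip(x, prototypes[a]))
        dist_b = sum((xi - pi) ** 2 for xi, pi in zip(x, prototypes[b]))
        return a if dist_a <= dist_b else b
    return decide


# Invented example data: four classes, one prototype each.
prototypes = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.0, 1.0), 4: (1.0, 1.0)}
classes = sorted(prototypes)
nodes = {(a, b): make_prototype_decider(a, b, prototypes)
         for a, b in combinations(classes, 2)}

predicted = dag_predict(classes, nodes, (0.9, 0.8))
print(predicted)  # 4
```

In a real setting each entry of `nodes` would wrap a trained binary classifier; the elimination loop itself does not depend on how the pairwise decisions are made.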
The parameter values used by the other classifiers shown in Table 1 are the default values of Weka version 3.4.3 [15].

Approach                   Error
Support Vector             1.3480 %
Nearest neighbor           1.4799 %
Neural network             2.5021 %
Perceptron [9]             3.0000 %
Hidden Markov Model [6]    3.2000 %
Fuzzy-Geometric [5]        3.7000 %
Decision tree              4.0070 %
Naive Bayes                8.3841 %

Table 1. Comparison of different classification approaches using the UNIPEN 1a (digits) subset. The error rates were estimated using ten-fold cross validation.

2.2. Structural analysis of mathematical expressions

A mathematical formula is described as a composite whose main components are symbols and baselines. Baselines are symbol composites representing symbols arranged horizontally in the expression [16]. Symbols are parents of baselines. Baselines carry labels that describe their position relative to their parent symbol. These labels are used to establish mathematical relationships between parent symbols and child baselines (see Fig. 3). The relative positions correspond to regions around symbols: above-left, above, superscript, right, subscript, below, below-left, and subexpression (see Fig. 4).

Figure 3. Above: a mathematical expression. Below: composite of four baselines representing the expression.

Figure 4. Regions around symbols. Circles represent the attractor points of the symbol.

The main idea of our method is to use the semantic information given by symbols to group them and then construct the baseline tree which best describes the expression. For this purpose, symbols are considered as nodes of a weighted graph, and a Minimum Spanning Tree (MST) of this graph is constructed using an optimized version of Prim’s algorithm. The weight of an edge joining two symbols is distance-based: if the symbols satisfy some mathematical relationship, as operators or operands in terms of semantic information, the weight is the minimum distance between attractor points of the symbols (see Fig. 4). If no mathematical relationship is satisfied, the weight is the distance between the centers of the symbols’ bounding boxes. Details about the mathematical relationships satisfied by symbols, as well as the location of attractor points, can be found in [13]. Our method allows:

1. Fraction Localization. This MST construction locates fractions in order to handle irregular horizontal structures. Consider the argument of the exponential function in Fig. 5(a). In such a structure, the σ in the fraction could be interpreted as lying to the right of the e, while it actually belongs to the denominator of the fraction. Fraction localization can also be used to locate symbols such as ‘overbrace’ and ‘underbrace’ and to determine their attributes.

2. Symbol Clustering. The MST construction for symbol clustering associates groups of symbols with the regions, in order to define mathematical relationships between symbols and groups of symbols. This MST construction can easily be extended to recognize expressions that contain, for example, structures such as ${}_{n}C_{k}$, or other operators with similar layouts. It can also be extended to recognize arguments of roots, such as the n in the structure $\sqrt[n]{\ }$.

3. Matrix Recognition. The MST matrix construction is activated for symbols enclosed by square brackets, see Fig. 5(b). This MST construction is also used to recognize structures which describe equation systems, as well as functions defined by cases. For this purpose, we use the symbol ‘{’ to start the recognition of the tabular arrangement, see Fig. 6(b). The construction can also be used to recognize stacked arguments of sum-like operators (i.e. sum, integral, and product operators). In contrast to the previous cases, no symbol is used to start the tabular analysis; instead, the matrix mode is set by default when processing the symbols located in the above and below regions of sum-like operators.

Figure 5. (a) Example of an MST construction for symbol clustering. The MST is initialized with the main baseline (−, , e, d, x). Boxes are used to locate regions defining fractions. (b) MST construction for tabular layouts.

The composite representing the expression is constructed by applying the MST constructions explained above recursively: the MST for symbol clustering is initialized with the main baseline. After that, the MST construction is applied to the groups found in the previous construction. Figure 5(a) shows the first step of the recursion.

3. Implementation in the E-Chalk System

3.1. Java implementation

Using the techniques described in the previous section, we wrote a Java library for the recognition of on-line mathematical expressions in E-Chalk. We use the composite pattern to describe the baseline tree structure. Symbols are stroke composites, whereas baselines are symbol composites. Both symbols and baselines accept visitors. A Java class implementing a general interface, which specifies the FormatVisitor data type, converts the composite structure into a string representation that depends on which computer algebra system (Mathematica or Maple) is used to compute the operations indicated in the formula.

3.2. Handwriting processing

The recognition mode in E-Chalk is activated by selecting a reserved stroke color from the palette.
During the recognition mode, strokes are processed and grouped as described in Sec. 2.1. If the classification yields the ‘end’ symbol (a lower-right corner), the layout analysis is activated and the recognized structure is interpreted according to the computation defined by the gesture encountered in the handwriting. The handwriting processing that allows the recognition of matrices also extends to recognizing and solving systems of equations. In this case, the activation gesture is the symbol ‘{’ at the beginning of the expression. The variables used to find the solution of the equation system are enclosed with the end symbol, see Fig. 6(b). The system is also able to recognize structures defining parametric plots, as shown in Fig. 6(c). Intervals for parameter values are specified using an “equals-colon” notation, such as x = −1 : 3. Parameters without intervals take values in the interval [−1, 1].

Figure 6. Examples of expressions recognized by the system. (a) Multiplication of matrices. (b) An equation system and its solution. (c) Function plotting.

3.3. Classification wizard

New symbols are introduced and trained using the interface shown in Fig. 7. A classification model is stored in a compressed file with the tree structure shown in Fig. 7. A directory called “data” stores the files containing the symbols drawn in the right panel. Once the user has finished writing the new symbols, pressing the “Next” button opens a new page, which starts the training of the classifiers using the parameters for support vector classification obtained in [12]. Figure 2 shows a DAG originally used for three classes, where the shaded nodes are the new nodes introduced into the structure. The structure of the directed acyclic graph of the classifier allows an easy introduction of new data. When introducing a new class of symbols, one only has to train a subset of the classification nodes.
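The bookkeeping behind this extension can be sketched in Python as follows. The sketch only enumerates node labels and trains nothing. It assumes one classification node per unordered pair of classes, as in the DAG of Fig. 2, and the function names `dag_nodes`, `nodes_to_add`, and `support_vector_sources` are hypothetical, not part of the system's library or of libsvm.

```python
from itertools import combinations
from typing import Iterable, List, Tuple

Node = Tuple[int, int]  # a node labelled 'a vs. b' with a < b


def dag_nodes(classes: Iterable[int]) -> List[Node]:
    """All pairwise classification nodes for the given classes."""
    return list(combinations(sorted(classes), 2))


def nodes_to_add(existing: Iterable[int], new_class: int) -> List[Node]:
    """Nodes that must be created and trained when one class is added."""
    return [tuple(sorted((c, new_class))) for c in sorted(existing)]


def support_vector_sources(existing: Iterable[int], cls: int) -> List[Node]:
    """Existing nodes whose stored support vectors can carry label `cls`."""
    return [node for node in dag_nodes(existing) if cls in node]


existing_classes = [1, 2, 3]          # the original three-class model of Fig. 2
new_nodes = nodes_to_add(existing_classes, 4)
sources_for_1 = support_vector_sources(existing_classes, 1)

print(new_nodes)      # [(1, 4), (2, 4), (3, 4)]
print(sources_for_1)  # [(1, 2), (1, 3)]
```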
If the classification model originally has n classes, only n classification nodes have to be added to the graph and trained. If new examples of classes already included in the model are introduced, only the corresponding nodes are retrained with the new data. The data used to train the nodes are the support vectors stored in the classification models. Using the support vectors allows fast training of the new nodes, since they constitute a small fraction of representative examples from the original database. For example, consider Fig. 2. There, the data used to train the node labelled ‘1 vs. 4’ are the new data belonging to class ‘4’ and the support vectors with label ‘1’ stored in the nodes containing the label ‘1’, i.e. the nodes labelled ‘1 vs. 3’ and ‘1 vs. 2’.

Figure 7. The interface used for storing and training of new data.

4. Comments and further work

In this article we introduced a new version of our recognition system for on-line handwritten mathematical expressions. The method is able to recognize matrices and other tabular arrangements. It allows the processing of on-line handwriting for matrix computations, for solving systems of equations, and for function plotting. The symbol recognition component is able to grow “on-line”, using the support vectors stored in the original model to train new classification models. Some authors argue that using only the support vectors to train new classifiers is not the best choice, because classification rates can get worse. To overcome such difficulties, we plan to implement some of the algorithms for incremental learning in support vector classification [1].

We also plan to include in the wizard an automatic method for calculating the parameters that delimit the regions around symbols used for the structural analysis of mathematical expressions. We think that such parameters are strongly user-dependent, but a learning procedure included in the wizard could help to find the optimal parameters for each user.

We do not provide a comparison with other methods, because a benchmark for the recognition of mathematical expressions does not exist. We will work towards defining such a benchmark.

References

[1] G. Cauwenberghs and T. Poggio. Incremental and Decremental Support Vector Machine Learning. In Advances in Neural Information Processing Systems (NIPS*2000), volume 13, 2001.
[2] C. Chang and C. Lin. LIBSVM: A Library for Support Vector Machines, 2001. http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[3] M. Fujimoto, T. Kanahori, and M. Suzuki. Infty Editor - A Mathematics Typesetting Tool with a Handwriting Interface and a Graphical Front-End to OpenXM Servers. In Proceedings of the RIMS Workshop “Computer Algebra - Algorithms, Implementations and Applications”, volume 1335, pages 217–226, 2003.
[4] I. Guyon, L. Schomaker, R. Plamondon, M. Liberman, and S. Janet. UNIPEN Project for On-Line Data Exchange and Recognizer Benchmarks. In Proc. of the 12th Conference on Pattern Recognition (ICPR94), pages 29–33, 1994.
[5] J. Hebert, M. Parizeau, and N. Ghazzali. A New Fuzzy Geometric Representation for On-Line Isolated Character Recognition. In Proceedings of the 14th International Conference on Pattern Recognition, pages 33–40, 1998.
[6] J. Hu, S. Lim, and M. Brown. HMM Based Writer Independent On-Line Handwritten Character and Word Recognition. In Proceedings of IWFHR-6, pages 143–155, 1998.
[7] L. Knipping. An Electronic Chalkboard for Classroom and Distance Teaching. PhD thesis, Freie Universität Berlin, Institut für Informatik, February 2005.
[8] N. Matsakis. Recognition of Handwritten Mathematical Expressions.
Master’s thesis, Massachusetts Institute of Technology, Cambridge, MA, May 1999.
[9] M. Parizeau, A. Lemieux, and C. Gagné. Character Recognition Experiments using Unipen Data. In Proceedings of the 6th International Conference on Document Analysis and Recognition (ICDAR), pages 481–485, 2001.
[10] J. C. Platt, N. Cristianini, and J. Shawe-Taylor. Large Margin DAGs for Multiclass Classification. Advances in Neural Information Processing Systems, 12:547–553, 2000.
[11] S. Smithies, K. Novins, and J. Arvo. A Handwriting-Based Equation Editor. In Proceedings of Graphics Interface, 1999.
[12] E. Tapia. Understanding Mathematics: A System for the Recognition of On-Line Handwritten Mathematical Expressions. PhD thesis, Freie Universität Berlin, Institut für Informatik, December 2004.
[13] E. Tapia and R. Rojas. Recognition of On-Line Handwritten Mathematical Expressions using a Minimum Spanning Tree Construction and Symbol Dominance. In J. Lladós and Y.-B. Kwon, editors, Graphics Recognition, volume 3088 of LNCS. Springer, July 2004.
[14] L. Wenzel and H. Dillner. MathJournal – An Interactive Tool for the Tablet PC. http://www.xthink.com.
[15] I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools with Java Implementations. Morgan Kaufmann, San Francisco, 2000.
[16] R. Zanibbi, D. Blostein, and J. Cordy. Recognizing Mathematical Expressions Using Tree Transformation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(11), November 2002.