MIT LIBRARIES OUPl 3 TOflO OObTfiflSb 2 DSWt POCT ALFRED P. T 1990 WORKING PAPER SLOAN SCHOOL OF MANAGEMENT AN IMPROVED STRUCTURAL TECHNIQUE FOR AUTOMATED RECOGNITION OF HANDWRITTEN INFORMATION Patrick S-P Wang Amar Gupta August 1990 WP# 3162-90 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 50 MEMORIAL DRIVE CAMBRIDGE, MASSACHUSETTS 02139 AN IMPROVED STRUCTURAL TECHNIQUE FOR AUTOMATED RECOGNITION OF HANDWRITTEN INFORMATION Patrick S-P Wang Amar Gupta August 1990 WP# 3162-90 AN IMPROVED STRUCTURAL TECHNIQUE FOR AUTOMATED RECOGNITION OF HANDWRITTEN INFORMATION Wang Amar Gupta Patrick S-P IFSRC No. 123-90 Comments and suggestions relating to the contents of this paper should be directed to the principal investigator: Dr. Amar Gupta Principal Research Associate Sloan School of Management MIT Room E40-265 > Cambridge, 02139 Telephone: 617-253-8906 MA Abstract This paper examines several line-drawing pattern recognition methods for handwritten character recognition such as picture descriptive language(PDL), Berthod and Maroy's method(BM), extended Freeman's chain code(EFC), tree error transformation(ET)method, grammar(TG), and array grammar(AG). Then a new character recognition scheme that uses improved extended octal codes as the primitives fers advantages of handing flexible size, orientation, and is introduced. This scheme variations, need for fewer learning samples needed and lower degree of ambiguity. Finally the simulation of recognition by their real time on-line counterpart Keywords : is of- off-line character investigated. pattern recognition, character recognition, syntactic approach, structural approach, grammars, on-line vs off-line, learning, degrees of recognizability, learnability and ambiguity, 1. artificial intelligence Introduction During the past quarter century there has been a growing interest entists, engineers and researchers in the function and machine simulation of in this area and a large number for computer problems related to the replication of human reading. of technical papers sci- human Intensive research has been done and reports on line-drawing pattern recognition (including character recognition) can be found in the literature[l,2,3,8,10,1216,23-25,27-32]. This subject has attracted such an because it is immense interest very interesting and challenging, but also because automated postal code reading, check verification, office of banking, business, industrial, engineering and it and effort not only provides a strategy for automation and a large variety scientific applications[8,25,27]. Recently, the state of the art in line-drawing pattern recognition has advanced from the use of primitive schemes for the recognition of machine-printed alphanumerals to the application of sophisticated techniques for the recognition of a wide variety of handwrit- ten characters and symbols [1,27,32,34]. It is human interesting to see that even reading devices (eyes), beings, who 5-percent mistakes make about possess the best trained optical when reading in the absence of context [24]. These errors are mainly caused by infinite variations of shapes resulting from the writing habit, styles, education, region of origin, social environment, and other conditions of the writer. For character recognition mood, health by computers, the error rate could be even higher because of the reasons described above as well as other factors such as the writing instrument, writing surface, scanning mechanisms, learning techniques and machine recognition algorithms[3,8]. This paper begins with an analysis and comparison of several line-drawing pattern recognition methods. namely (1) syntactic These approaches can be basically two categories, approach (such as PDL, tree grammars and array grammars) and structural approach (such as BM The reason we concentrate on and structural methods rather than conventional they are hierarchical in nature, subpatterns according to subpaterns and so on, its till (2) method and extended Freeman's chain code method). Their advantages and disadvantages are discussed. tactic classified into i.e. statistical methods, is syn- because they divide a large, complicated pattern into smaller structure, which in turn can be further divided into smaller primitives are found which can no longer be divided. Using such structural and syntactic approaches, one can directly take advantages of the powerful data structures using grammatical rules, trees and directed labeled graphs, which are widely used in dealing with linguistic problems [6,7,11, 18,20,21]. Such approaches are similar to the powerful artificial intelligence or complicated problem is problem solving str tegies in which a large reduced to smaller subproblems [17,19,26,35,36]. Further, it has been shown that syntactic and structural approaches can overcome some disadvantages found in classic decision theoretic (statistical) approach, which has difficulty in distinguishing between two very similar patterns (characters) [6,18]. structural and syntactic approaches, we ognizability are normally intuition of human more natural beings, handwritten symbols. who Among the various believe that those with a higher degree of rec- in that they are more accurate and possess the best pattern recognition skill for closer to the dealing with A new is character recognition scheme using improved extended octal code as primitives This scheme provides certain advantages such as flexible discussed. variations, need for fewer learning samples size, orientation, of ambiguity and lower degree and higher de- gree of recognizability. Finally, an off-line character recognition technique that simulates on-line real time techniques 2. studied. is Notations, Terminologies, Definitions, and Fundamentals " Frequently we hear people say, But what constitutes a level of character, category 1 Your handwriting It is hard to recognize." well recognizable or a hardly recognizable handwriting at the Looking at Table word or even a sentence? recognizable, category 2 is is terrible. is 2.1, one can see that recognizable with some effort, category 3 is recognizable with great difficulty (and perhaps with a high risk of mis-recognition) and category 4 is beyond more concrete of human While recognition. this is intuitively true, it is necessary to define a criterion (or criteria) of "recognizable" characters for machine simulation reading. Figure 2.1 shows a block diagram of a typical character recognition scheme. Definition 2.1 The degree of ambiguity of a character B can be defined problem domain D and the encoding scheme DegAmb(B,D,E)=max. E as follows E DegRec{B,D,E) = terms of the : no. of interpretations of B in D Definition 2.2 The degree of recognizability of a character B, under the encoding scheme in under E. in the domain of D, can be defined as follows: l/DegAmb(B,D,E) if B is in the dictionary < otherwise Conventionally one would conjecture that the easier a character is to recognize it and vice versa. However, this is is not always true. to learn the easier We it do find examples T*bl« 2.1 S«T«raI lavala of racognizability The party begins. ^^ <£A^v-*- <f(S&u*-<J? (ZCSi^Ji. 2 drinks later. £&~- dj^^r^i_ 6JZ^_ J j^. ^c t+<£*W <€^Sr. C-H. £ /s ->l 1 Input 1- Data 1 Figure 2.2 Continuous transformation between I and Y, C and e f like i 0. which are easier to learn instance. While '0' is but harder to recognize and vice versa. Take '0' and 'Q' for comparatively easier to learn (syntactically) than 'Q', to recognize (semantically). behaved printed letters, DegAmb(X,D,E)> 2 harder Notice that Definitions 2.1-2.2 are valid not only for well but also for handwriting variations. For instance, in Figure and DegAmb(0,D,E)> For more examples, please see Figure 2.3. Syntactic Methods: 2.2, 2. In general, there are 8 possible relations be- tween learning and recognition as shown in Figure 3. it is 2.4. PDL, Tree Grammar and Array Grammar This section investigates several syntactic methods for line-drawing pattern recognition, namely, picture descriptive language, tree grammar and array grammar, with an analysis and comparison between them. S.l. The The picture descriptive language picture descriptive language itive is labeled at PDL. (PDL) was developed by two distinguished points, a tail and a head. Shaw[7,20,21]. A primitive can be linked or concatenated to other primitives only at its tail and/or head. Primitive Structure Interpretation Description a+b axb head(a) CAT tail(b) Each prim- Graph — - head(a) CAT head(b) a-b h t (tail(a) CAT tail(b)) a*b a (head(a) CAT head(b)) t ^ I_* h b The grammar that generates sentences in G=(Vn, where b is a context-free grammar Vt, P, S) Vn={S,Sl}, may be any PDL Vt={b,+,X,-,*,~,/,(,),l} primitive (including the "null point primitive" which has identical c, tail and head) and P: S S — — >b, >/SL, SL—-»/SL, T— -» The S — >STS, SL >S, T^ +, S— SL, S—^~S, SL —>SLTSL, T— SL — ~SL, T— x, » -, *. primitives can be as follows: hi - h2 — dl / d2 f gl d3 / vl \ g2 T v2 \ I g3 v3 Then, we can express the English characters as follows[6]. A- (d2+((d2+g2)»h2)+g2) B— ((v2+((v2+h2+gl+(~(dl+vl)))*h2)+gl+(~(cll+vl)))*h2) C-» (((~gl)+v2+dl+hl+gl+(~vl))x(hl+((dl+vl) xA))) D^ (h2*(v3+h2+gl+(~(dl+v2)))) E— ((v2+((v2+h2)xhl))xh2) F— ((v2+((v2+h2)xhl))xA) G— (((~gl)+v2+dl+hl+gl+(~vl))x(hl+((dl+vl+vl-hl)xA))) H^ (v2+(v2x(h2 + (v2x(~v2))))) I— (v3xA) J^ ((((~gl)+vl)xhl)+((dl+v3)xA) K— (v2 + (v2xd2xg2)) L— (v3xh2) M-+ (v3+g3+d3+(~v3)) N-» (v3+g3+(v3xA)) O— (hl*((~gl)+v2+dl+hl+gl+(~(dl+v2)))) P- ((v2+((v2+h2+gl+(~(dl+vl)))*h2))xA) Q-> (hl*((~gl)+v2+dl+hl+gl+(~dl+((~gl)xgl)+v2)))) R- (v2+(h2*(v2+h2+gl+(~(dl+vl)))+g2) S-»((((~gl)+vl)xhl)+((dl+vl+(~(gl+hl+gl))+vl+hl+gl+(~vl))xA)) T— ((v3+(hlx(~hl)))xA) U- ((((~gl)+v3)xhl)+((dl+v3)xA)) V— ((~ g 3)xd3xA) W^ (((~ g 3)+d3+g3)+(d3xA)) X- (d2+((~g2)xd2xg2)) Y- ((v2+((~g2)xd2))xA) Z— ((d3-h2)xh2) 10 For example, the character "A" can be described as follows verbatim : \ I \ I \ I Triangle + c \ a * b c Tree Grammars[6,ll,20] 3.2. By extension of one- dimensional concatenation to multidimensional concatenation, strings are generalized to trees. Naturally, if a + tree, it Let a pattern can be conveniently described by can be easily generated by a tree grammar. N+ be the Let set of strictly positive integers. U be N+ semigroup with identity element "0" generated by free The depth of a6 < b if only and only if a domain ^ if . denoted d(a) and defined as is ,andb £ j £ U there exists x 6 and only D, and (2)a A b if U D if and ranked alphabet is d(0)=0, d(a : x=b . and a binary operation . . i)=d(a)+l, ie a and b are incomparable Figure 3.1 represents the universal tree domain U. a. D such that a the universal tree domain (the is a finite i < j in a pair subset of N+ (£ r > U implies a r) :£ satisfying(l) b € . where £) — N 11 = i D and D N+ a finite set of U {0} N+ if . a and a tree a < b implies a G € D. is is "."). symbols and For a 6 X!, r(a) is called the rank of a. Let £ = '-» n A tree over £ (i- e --> over (5Z,r)) is a function o / I \ 12 / \ 1.1 1.1.2 1.1.1 Figure 3.1 D is is, domain .. Universal tree domain a tree domain and r[a(a)] that 1.2 ... \ / such that 3 = max{i \ a.i 6 D} the rank of a label at "a" must be equal to the at a. trees over The domain ^. The depth of a tree of a is a is denoted by D(a) or Da . of branches in the tree Let l\p be the set of defined as d(a)=max{d(a)|a 6 D(q)}. Let and "a" be a member of D(a). a/a, a subtree of a a/a = {(b,x) | For example, please refer to Fig. number at "a", (a.b,x) G 3.2. GA = (VA ,r A ,PA ,A) 12 a} is denned as a be a all tree where VA = {A,Al,A2,Na ,Nd ,Ne },VTA = r A {i) = {2}, r A (a) = r A (c) {0, 2}, = {i,a,c,d} r A {d) {0, 1}, = {0} Pa: l A --> / \ A2 11 $ a Al —> / a / \ Hb Ha c —> / c \ / a A2 \ / \ d c \ / \ I Ic Ba --> a, Id — > d, He --> c A tree grammar for character "A" Figure 3.2 3.3. Array Grammar( AG)[4,20,22,33] Definition 3.1 An isometric array grammar is a quintuple G= where Vff is a finite nonempty terminal symbols, Vr n Vj is a finite nonempty set of (VN ,VT ,P,S,#) nonterminal symbols, Vj =0 and # ^ (Vjv U Vj>) is set of rewriting rules of the 13 is a finite nonempty set of the background or blank symbol. form a — »/3, where array a and array P /3 are geometrically identical over V)v We a U Vy U { # } say that array x directly generates array — >/3, x contains a as a subarray and y : y, is a not all denoted the set of symbol all S in =*. The x=> SgN y, if is the starting symbol. there is a rewriting rule identical to x except that the subarray replaced with the corresponding symbols of the array sitive closure of #'s. /3. a is Let =^>* be the reflexive tran- language generated by an array grammar G, denoted L(G), is arrays of terminal symbols and #'s that can be generated from the starting a field of #'s. Definition 3.2 The distance between two arrays as the smallest Example number 3.1: X and Y over Vt, denoted as d(X, of error transformations required to derive Given two arrays over a, Y Y) can be defined from X. b b aaaaa aaaa a a X = Y = and a a a a a b we have b aaaaa X = b b aaaaa a aaa a aaa a Ti a Td a Ts a a |- a I- a I- a a a a a a a a b 14 = Y where the symbol I- The minimum number fore, d(X,Y) = means "becomes". * * * * * * * * * 3.2 6. a is 3.2, it can be seen that Therefore one would classify (a) and while (c) * * not very satisfactory as can be seen from the following example. Anyone who knows English T * Three arrays over '«*" Given the 3 arrays in Fig. (c). * * * Definition 3.2 is There- * * Figure 3.3 (b) 3. * * * rather than to Y is ****** * * * cluster. * * * = * * * d[(a), (c)] to 3. ********* Example X of error transformations required to transform (c), instead of (a) will easily recognize that (a) In fact, a survey of 18 persons showed that is d[(a), (b)] all and more is of = 18 (b), into and one similar to (b) them thought that probably an F or an E. The above example shows that the direct distance measure between two arrays satisfactory for clustering analysis. This motivates trial of the following 15 method. is not Assume R that every array array 3, generated by an array are augmented generate 9ft. T? Because Gc to , a variation or a noisy counterpart of a pure, noiseless is grammar resulting in Ga For instance, in Fig. 3.3, the person T T called 'core , called why must have learned a typical, pure or noiseless of this model for , Gc grammar'. If 'augmented grammar', then : "T has a Ga can also does a person recognize (b), instead of be a this to T before, (c) as a and one can assume that was used as a model when he or she learned could be additional rules it. A description horizontal stroke with a vertical stroke attached below the center of the horizontal ones, and both strokes are of similar length". In other words, when a T learned as it is grammar can be considered one's brain. This a certain kind of learned, is flexibility, that is if there were an array as a core new augmented grammar, one can T's, such as, , and I also recognize built into characterizing T. with some extra rules attached to resulting in a I grammar grammar this core With grammar some variations or noisy . J Definition 3.3 The array distance between an array grammar G* parsing of Ga da (G a is 8? and a group of arrays characterized by an augmented = minimum number ,8?) of 8? symbols not covered by the . Definition 3.4 The distance between an array grammar ofG a + G c is d,. (G c , 8?) 8? and a group of arrays characterized by a core array = minimum number of 3? parsing steps using non-core rules da {Ga »). Notice that , if there does not exist a parsing sequence for 3?, then both values are either infinite or undefined. In Fig. 3.2, for instance, da [D 3 ,(a)} = 0, <*,[(?,, (a)] 16 = 0,da [G 3 ,{b)] = 0, its da and de = de [G 2 ,(b)} We now 9,da [G 3 ,(c)} propose an algorithm that = -,de (G2 ,(c)) = -. produce an augmented grammar from a core will grammar. Without 2-point normal form being used, one can assume that each core grammar loss of generality, [4,9], we have a is in Chomsky Notice that because a different definition of connectedness slightly different Chomsky normal form, in that every rule is obeys one of the following: C # A# -> BC, -> B A C # -> , A -> A A #A -> C B, B B, B B A -> -> C # C # # C * -> B A , C or A ->a where A G VN B,C , G VN U VT and a G VT . In general, a rule can be represented by a triple (A, BC,3), where instance, A# -> BC 7. For can be represented by (A, BC,0), # A < d < C —> B can be represented by (A, BC, 1), etc., and A -» a is represented by (A, a, -). Notice that this representation coincides with the Freeman's chain code octal primitives as to 17 be shown in Fig. 4.2. of next section. Algorithm Ai( Core-to-augmented) Input, k core grammar G = i {VN . > VTi ,Pi,S ,#),i=l...k i Output, k augmented grammars Step 1. Step 2. For i=l to k do step 2 and P? = Pi;U For convenience, {(A,BC,(d+ if in l) mod6 ),(A,BC,(d+ 7) Pi the jth rule is (A, BC, d), ntM ) | (A,BC,d) € Pi} then add j.(d+ l)mo<i8 and j. (d+7) m<xf8 into Pf. Step We 3. are now ready Cluster Pass Input. A = {G 1 ,G 2 I set of , to propose a two-pass clustering procedure. n ...,G k }, digitized patterns where L(Gi) Output. The assignment (1 X={x\,Z2, L{Gj) = for i / j, i, and a set of k core grammars Z j=l,...,k. of Xi, i=l,...,n to k clusters characterized by Gi, i= CORE-TO-AUGMENTED Step 1. Call Step 2. For i = 1 to Step 3. For j = 1 to k do step 4-5 n do ...,i„}, step 3-5 18 1, ..., k. Step 4. Compute da (Gj,Xi) Step 5. Assign p with z, to cluster da (Gp ,Xi) - minj=1 Notice that H if more than one should be assigned to all k da (Gj,Xi). ... have the same clusters minimum distance with x t then these clusters. has been assigned to more than one cluster, then use Cluster Pass If i, , II. Cluster Pass II For Input. n, which are assigned to more than one all Xi 1 ,Xi J ,...,Xi rn X{ i and 1< and t\ < its k assigned grammars Gi,, ...,Gi k where < j=m < 1 do the following. n, 1 <h< k, 1< ij < . Output. Assignment of sc,-. to a proper cluster. Step 1. Compute dc (Gi K ,Xi i ) Step 2. Assign z 1; to cluster X{- cluster, for all G,- 4 to i q where <f c which Xi- has been assigned. = min de (Gi f ,i,y (Gi k ,*i,) for all Gi k to which has been assigned. The English characters adapted from set of Fu and Lu[7] is illustrated using the above two-pass clustering procedure. There are 51 characters from nine different classes: P, X, D, U, F, Y, K, H, and respectively. different shown is D and P, H and K, from the other eight in Fig. 3.4. the bottom. In characterized by GO, Gl, G2, G3, G4, G5, G6, G7, and G8, Eight of the nine classes are selected from four groups, each with similar structures; that is V The Fu and U classes. and V and X and Y. The Each character is an array on a 20*20 Generated in a top down fashion, each pattern results of 9 clusters are satisfactory as Lu[7], the use of a similarity patterns in terms of 19 shown measure using grammar transformations is class of character is F grid, as parsed upwards from in [29]. tree grammar for syntactic proposed and a nearest neighborhood 25 • I 10 13 12 26 27 28 —— ; 29 30 31 32 33 34 33 36 rule and a clustering procedure are described. This clustering procedure ful for this quite success- is two-dimensional pattern analysis, especially for English characters. However, in procedure, every character should be first encoded into a one- dimensional string via Freeman's chain code[5] and Shaw's PDL[21]. Further, pattern 47^(Figure when weighted transformations accurately classified into group 2, D, even threshold value as high as proposed and is In Lu[ll], a tree-to-tree distance 3. first Please note that although the purposes of for not are used with a measurement algorithm is encoded into a tree representation and, further, £ and some confusions between As{f\) and E$( sense that they is applied to clustering analysis for 2-dimensional patterns. Again, in this algorithm, every pattern should be there are 3.4.) all ftim at the same target two-dimensional patterns), the model used in[29] is [29] [29] (i.e. ) and they between £<(/ somewhat [7,11] are all ) and C${r ). similar, in the try to solve the clustering problem employs an entirely different approach. For instance, "isometric array grammar" — the time such a model has been first tried for two-dimensional pattern clustering analysis. Also, the definition of distance be- tween two patterns introduced in Section error transformation method. distance between an array paper does not use the conventional we use the concept Instead, and a 3 of this class of arrays characterized by a Context-Free Array Grammar(CFAG)[4]. Even though the same input data adapted from test our two-pass clustering algorithm, differences discussed above, we we obtain a different result. also notice that pattern 7 *j through our algorithm, in contrast to an A survey of 18 persons showed that 7 of them thought of 'Tas an X, it was an F, while 3 of them thought was trained to write and the array it grammar depends on BM Structural Methods: a. Berthod and Maroy's primitives. denote Berthod and Maroy's primitives character is the algorithm of Clearly it used to is classified as Fu and 8 of an Lu[7,ll]. them thought depends on how one is that training. approach and Extented Freeman Approach[EF] 4. We neither. [7,11] is In addition to the of Fig. 3.4 X F through and define the of parsing [3] and encoding scheme encoded into a string of the 5 primitives shown 21 in Table 4.1. as BM. Every <=T: straight line P: positive curve (counterclockwise) M: minus curve P (clockwise) L: penlift R: cusp J- Table 4.1 The five Berthod and Maroy primitives Examples of codes corresponding Let the domain D be all to various characters are English characters. ters (strings of primitives) resulting tablet is A shown in Figure 4.1. dictionary consisting of about 26 charac- from pre-processing of characters drawn on a graphic compiled. Code M Symbols Code Symbols PT TRTT TRMLT TTTT PRT Figure 4.1 Examples of codes corresponding to characters. 4 * « Figure 4.2 Freeman's octal chain code primitives Figure 4.3. (a) Coded word of "C" is 34567011 and (b) Coded word of "H" 30. 24 is 6 16 could be recognized as an "E", since both have and b. "TTLTLT", which is in the dictionary unambiguous. is Extented Primitives We adapt Freeman's chain code[5] and the 8 direction vectors shown in Figure 4.2 as primitives. Each character is encoded into a string of primitives, but only those local extrema points (points which are tangent to one of the 8 vectors) are extracted. The letter "C" could be encoded as 3456701 as consider the vector between 'pen-up' and octal primitive with a bar on shown its in Figure 4.3(a). successive 'pen-down' penlift, and choose the we closest Therefore the letter "H" could be encoded as 6165 it. call the above described encoding scheme EF(for Extended Freeman) and English characters. Using the same sample data words To handle as in Figure 4.3(b). Let us D= shown in > compiled in an ordering tionary, only 7 Further, this 1 > ... words are ambiguous and DegAmb(B, method shown in Table 4.3. In this dic- Dm, EF)=2, Be{C,0,D,P,X,Y,T}. eliminates the possibility of misrecognizing a \/ A I I * be interpreted 0462050 a dictionary consisting of about 48 >6>7as unacceptable symbols. For example, will not , \ V \ as the valid character "E". let number of obviously 3456701 0462050 E 076526 P 046 T 04620 F 0462050 E 0464 J 04640 I 161 N 21 V 2121 W 245670123406 G 2761 M 3456701 CO 4560 C 4620 F 0462050 E 4675 S 560 C 5670123 CO 567012367 Q 575 S 51630 A 51715 N 51730 A 527 XY 61 L 602 U 6157 K 620657 R 62065765 B 6275 DP 6420 J 670 L 67012 U 6157 K 61630 H 620657 R 62065765 B 620765 DP 6271 N 62715 M 6275 DP 627575 B 630 T 7171 W 72 V 715 Y 725 X Table 6. 5.1. An DZF dictionary Off-Line Character Recognition In this section, off-line character recognition simulating real time on-line technique is discussed and an algorithm for transforming two-dimensional line drawing patterns to parsing sequences based on the "universal array 27 grammar" is constructed. In conventional syntactic pattern recognition, each class of patterns is represented by a grammar[6]. In order to classify or recognize an input pattern given different classes of patterns characterized by different grammars, the input pattern Then to a pattern representation through preprocessing. grammars and is such a distance and classified into the class minimum with the is hierarchical. It divides a rather complicated pattern into normally transformed compared with all given distance measure, provided Such an approach within a certain threshold. is it is is basically structural subpattems, which in turn are further divided into subpattems and so on, until primitives are found which can no longer be divided. sound and can overcome some of the difficulties of It is structurally the decision theoretical (classic) approach discussed earlier. However, when the number of classes in a pattern recognition problem pattern matching and classification involve too consuming to many grammars and be practical. Besides, grammatical inference Mainly because of this, the line line-drawing patterns, is it is very high, becomes too time shown N-P complete hard. concept of "universal grammar" was proposed in [32] for on- with classification determined by the parsing sequence rather than by grammars. Such concept of "universal grammar" We use the model of " array methods explored for is grammar" because it off-line line-drawing patterns. offers several advantages over other in dealing with two-dimensional patterns as described in Section 3. * * * * » * * * * *** (a), * * **** character 'c' (b) . digit '2' Figure 6.1. patterns 'c' and '2' 28 > Example 6.1. Gl=(Vn,Vt ,P1 , Example 6.2. G2=(Vn.Vt ,P2,S ,#) ,S ,#) Vn={S ,S1 ,S2,S3 ,S4,SB ,S6 Vn={S,Sl.S2,S3,S4,SB,S6, S7,S8,S9,S10,S11> S7.S8.S9.S10} P2: PI: SI 0. # S —> SI * SI 1. S2 S2 2. —> * 1. SI # —> * 2. S2 ~> * S3 S3 # 53 3. * —> # S4 S3 3. —> 54 * ~> # SB S4 4. —> SB * —> # S6 SB 6. —> # S6 6. * S6 * # —> S7 S6 6. — S7 # ~> * S8 7. * S7 # 7. * SB # B. * S4 # 4. S2 * «> # S * —> # 0. S7 — # * S8 8. S8 # ~> * S9 8. S8 # — * S9 9. S9 # —> * S10 9. S9 # —> * S10 S10 —> * * Sll 10. 10. S10 # 29 —> 11. Sll —> * It can be seen from Examples character array 'C and grammars ble, to construct array it 6.1 and digit '2' respectively. 6.2 that arrays generated Generally speaking, one should construct different to represent different patterns. adequate grammars It is for all patterns. grammar (ULAG) was proposed extremely Therefore, in grammar itself, (Vn. Vt , S, P, #) , whore Vn = {S}. Vt = {a} and P: 0. S # —> a S # 1. S —> S # 2. a S —> S # a S 3. S —> a 4. # S —> S a a S B. --> # 6. —> # S a S # 7. 8. S a S S S —> [32] to distinguish classes of patterns. = difficult, if not impossi- a universal line to generate all on-line patterns. For recognition, utilizes the parsing sequence, not the ULAG Gu by Gl and G2 are the a 30 between different > In this universal line array grammar, a code associated with each production rule is except the terminal ride, and will be produced as part of the parsing sequence corresponding rule Using ULAG, is P = 'C is 4, 5, 5, 6, 6, 6, 7, 0, 0, 0, 8, '2' is 2, 0, 7, 6, 5, 5, 5, 6, 0, 0, 0, 8. The above concept can be extended OAG G (Vn, Vt, P, S. #) , ohere Vn to off-line character recognition. = {S, SI}, Vt = {a} and = # 0. S # —> # 2. S SI S 1. S —> S « s —> SI s S 3. SI — SI SI S 4. # S —> S SI B. SI S the successfully applied. the parsing sequence of parsing sequence of if —> # SI S S. # —> S 7. 8. SI —> S 9. # S S —> —> S a Example 6.3. We can use UAG above to get the parsing sequence for English letter E: 31 and the a a a a a a as follows. 4 8 4 9 8 8 9 # S ==> S SI ==> S S ==> S a ==> S SI a ====> S a a ==> SI a a S 89 890 6 ====> a a a ==> a a a ====> a a a S ====> a a a 8 ====> a a a ====> a a a SI a a a S SIS Sla S a 80898089 86898689 6 9 ========> a a a =========> a a a a a a a a a a a a SI a a SI a a a In the above example, we used UAG these sequences are rather long because parsing sequence to represent a pattern, but some nonterminal to nonterminal parsing terminal rules were involved. Now, we propose an algorithm, PATSEQ, which an unique shorter parsing sequence from a pattern. In order to describe some produces this algorithm, definitions are required. Definition p2, and ..., 6.1. The neighbors p8 shown in Figure of a pixel, pO, are identified by the eight directors, pi, 6.2. 32 pB p6 p7 p4 pO p8 p3 p2 pi Figure 6.2. Pixel pO and its neighbors Definition The segment 6.2. of a pixel, pO, is a set of neighbors of pO which are consecutive. In the example below, there are two segments, Si and S2, for the following pixel pO, where Si =(p5,p6,p7), S2 = (p2,p3). p5 p8 p7 pO p3 p2 Later, to the we use number Si to indicate the segment of elements in the segment The length of a segment, i. Si. length(Si), For example, length(Sl) = is equal length(S2) 4, = 2. Definition 6.3. ii) pi Definition 8). We 2 < 6.4. The = perfect if it satisfies the following conditions: and 4, in center pixel of a perfect segment, Ci, 1 length(Si) then Ci < 3 mod 8. (k=2 or 4 or 6 or one segment, where i=2,4,6,8. j=(i+2) is the only element in then Ci=pk, pk is is as follows: Si. one of the elements of Si also call Ci as the next pixel of pO. Definition the < is and pj are not If length(Si) ii) if segment length(Si) i) i) A number 6.5. The parsing code from the current pixel, pO, to one of of the rule successfully applied. Algorithm PATSEQ: transfer a pattern to a parsing sequence. 33 its next pixels is Input: a) Initial position (10, TO). b) Digitized pattern in which all segments are perfect. Output: A parsing sequence. Method: Step 1: push(XO); push(YO) Q:=empty;(Q stores the passed pixels). ; Step 2: repeat if stack is empty Step 3: then step 7. T:=pop; I:=pop; if (I,Y) is in Q then step 3; push all next pixels of (I.Y) and their parsing codes Step 4: into stack if no next pixel for P0 then push(-l); Step Delete current pixel from pattern. 5: z:=pop; if z=-l then step 6 else print (z ) Step 6: Step 7: until stack is empty. The following example illustrates the use of the algorithm PATSEQ. 1 2 3 4 E 6 7 > x 1 2 3 b 4 d B i h e f g 6 v We use character a,b,..., to indicate positions (4,2) 34 , (4,3) (5,3) , The parsing process of character Pixel 1 Stack ' r ' is as follows. Output 7. Discussions and Conclusions We introduced the basic concept of degrees of recognizability and ambiguity. Their Although there are many other factors such as human relationship has been discussed. factors, social environment, writing instruments, software and hardware environment as well as algorithm-oriented characteristics such as segmentation, data reduction, resolution, primitive selection, thresholding and quantization, nevertheless, the degree of ambiguity plays an inherent role as an important criterion of recognizability for hand- written symbols. Two different categories of experiments representing different recognition schemes for on-line handwritten symbols were conducted. The first used basic structural shapes as primitives while the second used octal chain codes as primitives. vided advantages of flexible samples. method, It it is A summary also possesses less likely to is I pro- and the need for fewer learning an inherently lower degree of ambiguity. Besides, in this sizes, orientation, variations, mis-recognize an obviously unacceptable symbol as a valid one. depicted in Table 7.1. I | I | The second method Grammar 1 | Structure 1 I Accuracy (degrees of ambiguity and recognizability) References 1. and Suen, C.Y., Computer Recognition of Totally Unconstrained Handwritten Zip Codes, Int. J. of Pattern Recognition and Artificial Intelli- Ahmed, P. gence (IJPRAI), 2. Components of an Omnifont Reader, Symp. on Pattern Recognition, John Wiley and Sons, 343-362 (1986). Baird, H., Simon, K., and Pavlidis, T., Int. 3. vol. 1, no. 1, 1-15 (1987). Berthod,M. and Maroy, J.P., Drawn on a Graphic Tablet. Learning in Syntactic Recognition of Symbols . Corn-put. Graphics and Image Processing 9, 166- 182, (1979). 4. Cook, C. R. and Wang, P. S. P., A Chomsky Hierarchy of Isotonic Array Grammars and Languages, Comput. Graphics Image Process, vol. 8, no. 4, 144-152 (1978). 5. On Freeman, H. the Encoding of Arbitrary Geometric Configurations, IRE Trans. Electronic Comput. 10, 206-268 (1968). 6. Fu, K. S., Syntactic Pattern Recognition and Applications, Englewood Cliffs, N.J., Prentice-Hall, (1982). 7. Fu, K.S. and Lu, S.Y., Trans. Systems 8. A Clustering Procedure for Syntactic Patterns. Man. Cybernet, 7, IEEE 734-742 (1977). Gupta, A., Hazarika, S., Kallel, M., and Srivastava, P., Optical Image Scanners and Character Recognition Devices: A Survey and New Taxonomy, MIT D7SRC Report No. 107-89, (1989) 9. Hopcroft, J.E. and Ullman, J.D.. Introduction to Automata Theory, Languages, and Ccomputation. Addison Wesley, New York, (1979). 10. M. Handprinted Hebrew character using Features Selected in the Hough Transform Space, Pattern Recognition, vol. 18. no. 2, Kushnir, et al., Recognition of 103-114, (1985). 11. Lu, S. Y., IEEE A Trans. Tree-to- Tree Distance Pattern Anal. Mach. and Its IntelL Application to Cluster Analysis, PAMI-1, vol. 1, no. 2, 219-224 (1979). 12. Mantas, J., An Overview of Character Recognition Methodologies, Pattern Recognition, vol. 19, no. 6, 122-134 (1986). 37 13. Masuda, I. et al, Approach IEEE-CUPR Smart Document Reader System, to Proc. 550-557, (1985). 14. Mori, 15. Research on Machine Recognition of Handprinted Characters, S. et al., IEEE-PAMI. Morishita, T., Ooura, gence(IJPRAI), Ishii, A Nagahashi, H. Nakatsugama,M., of Structural Character. 17. Int. vol. 2, no. 1, Nandhakumar, N., Pattern Recognition 18, no. 6, Y., Pattern Description and Generation Method vol. 8, no. 1, 112-117 (1986). IEEE-PAMI. and Aggarwal, - A Kanji Recognition Method which J. of Pattern Recognition and Artificiallntelli181-195 (1988). M. and Detects Writing Errors, 16. 386-405 (1984). vol.6, no. 4, A J., The Artificial Intelligence Approach to Perspective and Overview, Pattern Recognition, vol. 383-389 (1985). 18. Pavlidis, T., Structural Pattern Recognition, Springer- Verlag,New 19. Rich, E., Artificial Intelligence, York, (1980). McGraw-Hill, (1983) A, Picture Languages: Formal Models for Picture Recognition, Academic Press, New York (1979). 20. Rosenfeld, 21. Shaw, A.C., A Formal Picture Description Scheme as A Basis For Picture Pro- cessing Systems, Inf. Control vol. 14, 9-52 (1969). 22. Siromoney, R., Array Languages and Lindenmayer Systems - A Survey, The Book of L, Rozenberg, G. and Salomaa, A. (Ed.), Springer Verlag (1985). 23. Suen, C.Y., Factors Affecting the Recognition of Handprinted Characters. Proc. Int. 24. Conf. Cybernetics and Society, 174-175 (1973). Suen, C.Y., Berthod, M. and Mori, Characters 25. Stern, R., - S., The-state-of-the-art, Proc. Automatic Recognition of Handprinted IEEE vol. From Intelligent Character Recognition cessing, Proc. of Int. Electronic 68, no. 4, 469-487 (1980). to Intelligent Document Pro- Imaging Exposition and Conference, Prentice- Hall, 236-245 (1987). 26. Tanimoto, S., The Elements of Artificial Intelligence, Computer Science Press, (1987). 27. Tappert, C.C., Suen, C.Y. A Survey, Proc. ICPR'88, and Wakahara, T., Rome, 1123-1132 38 On-Line Handwriting Recognition(1988). Advances in Character Recognitions, Reconition. Ed. by K. S. Fu, CRC 197-236 (1982). 28. Ullmann, 29. Wang, J., P.S.P., An Grammars Application of Array in Applications of Pattern to Clustering Analysis. Pat- tern Recognition, vol. 17, 403-410 (1984). 30. Dual Level Pattern Recognition System US Patent PN 4521909 (1985). Wang, P.S.P., - A New OCR Technique, 31. Wang, P.S.P., A New Character Recognition Scheme With Lower Ambiguity and Higher Recognizability. Pattern Recognition Letters 3 431-436 (1985). 32. Wang, P.S.P., On-line Chinese Character Recognition, International Conference on Electronic Imaging 209-214. (1988). 33. Wang, P.S.P.(Ed.) Array Patterns, Grammars and Recognizers World Scientific Publishing Co., (1989). 34. Wang, A More Natural Approach for Recognition of Line- Drawing Patterns, Image pattern recognition: algorithm implementations, techniques and P.S.P., technology, Proc. SPIE, vol. 755, 141-160, (1987). 35. Winston, P. H., 36. Winston, P.H.(ed) with Shellard S., Artificial Intelligence Frontiers, MIT Press, to appear (1990). Artificial Intelligence, 39 Addison- Wesley, (1984). at MIT - Expanding Date Due Mil 3 TOfiO LIBRARIES OOblflAZb 2