Bigger Context and Better Understanding -- Expectations of Future MT Technology

Zhendong Dong
E-mail: dzddong@public.bta.net.cn

Abstract

This paper discusses new technology for next-generation machine translation. After a brief review of the problems in the current state of the art of MT, the discussion focuses on two aspects: a common-sense knowledge database and multi-sentential processing. The construction and application of new knowledge resources are emphasized, and a concise presentation is given of HowNet, a Chinese-English knowledge system. The paper gives a practical description of how understanding-based MT can be implemented.

Key words: machine translation, MT, discourse, text understanding, parsing, sense disambiguation, HowNet, knowledge dictionary

1. Motivation

Searching the Internet nowadays for the topic of machine translation, we can find a great deal of interesting material: free MT services, commercial products on sale, short-term courses on MT, calls for papers for the MT Summit and regional conferences, continental associations in the Americas, Asia and Europe, and publications including professional books, journals and academic papers. However, we also see some rather different pictures, among which the most striking one to me is a new web site named madtrans@usa.net. The name of the site suggests, as its participants intend, that MT should stand for Mad Translation rather than Machine Translation. The participants of the site claim, "The purpose of this site is to re-establish the truth about the real (im)possibilities of Machine Translation systems (also called automatic translators). This is a quite complete synthesis with automatic translation examples, analyses from specialists and even expectations about the future of this technology." In their article they present many funny examples produced by MT systems, quote at length the reasons why computers cannot translate as human translators do, and urge, "If you choose to purchase an automatic translator in spite of our advice and the examples we have shown you, be prepared to waste your money."

As a researcher and developer who has unfortunately been stuck in MT for nearly 30 years, I am really reluctant to accept their views in general, and their "expectations about the future of this technology" in particular. However, this paper is not intended to argue, but to discuss the future of MT technology. The discussion will cover what the innovations of future MT technology should be. The innovations involve three aspects: (1) single-sentence processing will be replaced by sentence-group (discourse) processing; (2) the depth of source language parsing (or, more accurately, analysis) will change; (3) a novel design for target language generation will be adopted.

2. Real Problems

I have said that I do not agree with some conclusions of the madtrans@usa.net participants, but I admit that the problems in MT they enumerate do exist, though they may not be the key problems. What, then, are the real key problems in MT?

Firstly, let's imagine a human translator who handles one and only one sentence at a time. When he translates a sentence, he has no idea at all about the previous sentence he has just dealt with, nor about the next one he is going to handle. Have you ever come across such a human translator? If there were one, could he do a good translation job? Would you want to hire a translator like this? The answer would definitely be negative. In reality, no human translator translates on the basis of a single-sentence context. Unfortunately, this is exactly the case with MT, and it is one of the key problems that make it "mad".

Secondly, let's take the following sentence as an example. As far as I know, most English-Chinese MT systems available in China are able to translate correctly the English sentence "Mr. Nixon was sick and sent to hospital this morning". However, I am sure that none of these systems can give a correct answer even to a simple question like "Who was ill?" or "Where was Mr. Nixon sent?". Can you imagine the same situation with a human translator? It is really unbelievable that translation can be done without understanding.

To summarize, the present technology of MT has two principal problems: one is single-sentence context processing; the other is the lack of a basis for understanding. In comparison with these two problems, the controversy about "mainstream linguistics", "universal grammar", "statistical modeling", "derivation trees", "parsing algorithms", "interlingua", "transfer", etc. is merely a minor one. We take these two problems as the typical "traditional patterns which will clearly have to be broken if any real progress is to be made" [Kay 1996]. To expand the scope of processing from a single sentence to multiple sentences or a sentence-group, to achieve understanding of the source language, and to achieve real re-composition in target language generation so as to escape the shadow of the source language's syntax: these techniques, we believe, will be the ice-breakers for a new era of MT.

3. Sentence-group Processing and Understanding

3.1 Sentence-group

By sentence-group we mean more than one sentence in series in a text. A sentence-group may be a paragraph or an even bigger context, or just two or three sentences within a paragraph. The number of sentences to be processed is decided by the system designer, and we consider it flexible or adjustable, to cater for the needs of different kinds of texts. The choice between single-sentence and sentence-group processing is by no means a question of quantity, or an issue of computer memory; it is an issue of overall innovation for next-generation MT technology. Within the traditional patterns, the parser of an MT system builds a syntactic tree for each sentence it processes, one tree per sentence. The trees of adjacent sentences are totally unrelated, and no relations can be formed between them even if the parser keeps the parsing results for all sentences. What relations could we build between syntactic trees? Can a parser specify any relation between the subject (NP) of one sentence and the predicate (VP) of the next? Yet in reality there are meaningful links among the sentences of a sentence-group or a paragraph, unless the text is awkwardly or incoherently written. These links of meaning are critical for human translators to gain a good understanding of the text. Without them, even a human translator will sometimes be at a loss when tackling an isolated (deliberately constructed) sentence like "He met John near the bank". Only with a bigger context can we achieve more understanding.
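The adjustable processing window described above can be realized very simply. The following is a minimal sketch in Python; the function name, the stride parameter and the assumption that the input text is already segmented into sentences are ours, purely for illustration.

# A minimal sketch of the adjustable sentence-group window: the window
# size is a designer-chosen parameter, and sentence segmentation is
# assumed to have been done already.

def sentence_groups(sentences, size=3, stride=1):
    """Yield sentence-groups of a designer-chosen, adjustable size."""
    if len(sentences) <= size:
        yield list(sentences)
        return
    for start in range(0, len(sentences) - size + 1, stride):
        yield sentences[start:start + size]

# The two example sentences used in 3.2 below, processed as one group.
TEXT = ["In the past few years the Tanakas bought one or two toys for their "
        "children every time they travelled abroad.",
        "It is really a pity that now they lost nearly all of them."]
for group in sentence_groups(TEXT, size=2):
    print(group)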
3.2 Links of meaning

It is not easy to expand the context from one sentence to several and thereby gain better understanding. The key is to build up the links of meaning between sentences; if we cannot achieve this, a bigger context will be useless. What kind of links are we to build, then?

The links of meaning include three kinds of relations: (1) relations between concepts; (2) relations between events (usually denoted by verbs, especially main verbs); (3) the shifting of the roles of the events. Let's take the following sentences as an example:

In the past few years the Tanakas bought one or two toys for their children every time they travelled abroad. It is really a pity that now they lost nearly all of them.

After the "parsing" of these sentences by a future MT system, the three kinds of relations will be obtained as follows:

(1) Relations between concepts (main nodes only):

bought (buy|) --- main-event1
years (time|) --- duration
Tanakas (human|) --- agent
toys (tool|) --- possession
children (human|) --- beneficiary
every time (time|) --- time
travelled (tour|) --- main-event1.1
they1 (human|) --- agent
abroad (location|) --- location
lost (lose|) --- main-event2
It is really a pity that (comment|) --- comment
they2 (human|) --- relevant
them3 (?) --- possession

(2) Relations between events:

i. have| = result of buy|
ii. pay| = precondition of buy|
iii. have| = precondition of lose|

(3) Role shifting rules (RSR):

a. beneficiary of buy| --> relevant of have|
b. agent of buy| --> relevant of have| {if no other beneficiary}
c. possession of buy| --> possession of have|
d. relevant of have| --> relevant of lose|
e. possession of have| --> possession of lose|

Applying these rules, the analyzer resolves the pronouns:

they2 = relevant of lose|
  --> relevant of have| (by RSR-d)
  --> beneficiary of buy| (by RSR-a), i.e. the children

them3 = possession of lose|
  --> possession of have| (by RSR-e)
  --> possession of buy| (by RSR-c), i.e. the toys

Note: 1. Items a to e above are the rules which control role shifting. 2. A word followed by the symbol "|" denotes a concept.

The role shifting process gives plausible answers to the questions of "who" and "what". This is a reliable approach to solving the anaphora problem, and it will be an equally reliable approach to ellipsis. Incidentally, if the above example were written in idiomatic Chinese, the problem would turn into one of ellipsis, that is, both the subject and the object of "lost" would be omitted.
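To make the mechanism concrete, here is a minimal sketch in Python of how an analyzer might chain the role shifting rules backwards to resolve they2 and them3. The rule table follows items a to e above; the data layout, the function names and the way rule b's side condition is handled are our own illustrative assumptions, not a definitive implementation.

# The role shifting rules (RSR) as a table: each rule maps a (role, event)
# pair to the (role, event) pair it shifts into.
RSR = {
    ("beneficiary", "buy"): ("relevant", "have"),    # rule a
    ("agent", "buy"):       ("relevant", "have"),    # rule b {if no other beneficiary}
    ("possession", "buy"):  ("possession", "have"),  # rule c
    ("relevant", "have"):   ("relevant", "lose"),    # rule d
    ("possession", "have"): ("possession", "lose"),  # rule e
}

# Invert the table so that a role of lose| can be traced back towards buy|.
# Iterating in reverse lets rule a (beneficiary) win over rule b (agent)
# when both shift into the same slot, honouring rule b's side condition.
INVERSE = {target: source for source, target in reversed(list(RSR.items()))}

def trace_back(role, event, fillers):
    """Follow the inverted role-shifting links until a filled role is hit."""
    while (role, event) not in fillers and (role, event) in INVERSE:
        role, event = INVERSE[(role, event)]
    return fillers.get((role, event), "?")

# Roles actually filled by the parse of the first Tanaka sentence.
FILLERS = {("agent", "buy"): "Tanakas",
           ("possession", "buy"): "toys",
           ("beneficiary", "buy"): "children"}

print(trace_back("relevant", "lose", FILLERS))    # they2 -> children
print(trace_back("possession", "lose", FILLERS))  # them3 -> toys

Run on the Tanaka example, this prints "children" for they2 and "toys" for them3, mirroring the derivation above.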
4. New Technology

For the new technology of next-generation MT, our discussion will focus on two points: a general knowledge database and multi-sentential processing. The discussion in 3.2 clearly shows that we need some new resources. The traditional MT dictionary, which carries syntactic information and only fragmentary semantic information, i.e. is confined to language knowledge and contains very little world knowledge, will not be able to meet the needs of the operations described in 3.2. New technology needs new resources, among which the critical one is a general knowledge database. We would like to take HowNet as an example to give a comprehensive picture of a general knowledge database and its application to machine translation. HowNet was released on the site www.how-net.com in March 1999.

4.1 HowNet – A General Knowledge Database

HowNet is a general knowledge database which describes relations between concepts and relations between the attributes of concepts. The concepts HowNet defines are denoted by words and phrases in Chinese and English. We regard HowNet as a Chinese-English bilingual common-sense knowledge system.

4.1.1 Sub-databases of HowNet

HowNet is mainly composed of nine sub-databases. They are as follows:

(1) Chinese-English Bilingual Knowledge Dictionary (CEKD)
(2) Main Features of Concepts (1) (MFC-1)
(3) Main Features of Concepts (2) (MFC-2)
(4) Secondary Features of Concepts (1) (SFC-1)
(5) Secondary Features of Concepts (2) (SFC-2)
(6) Secondary Features of Concepts (3) (SFC-3)
(7) Event Roles and Features (ERF)
(8) List of Antonymous Relations (LAR)
(9) List of Converse Relations (LCR)

Apart from these sub-databases, HowNet contains a few descriptive files, covering, for example, the pointers and their usage and the parts of speech, as well as a maintenance toolkit. HowNet also has a Chinese GB Knowledge Dictionary, a Chinese Big5 Knowledge Dictionary and an English Knowledge Dictionary (though not as comprehensive as the Chinese one), all extracted from the bilingual Knowledge Dictionary.

The Knowledge Dictionary is the core component of HowNet. The size of HowNet depends mainly on the size of its Chinese-English Bilingual Knowledge Dictionary, which is measured by the number of word forms and the number of concepts (meanings). The size of HowNet 1.0a is as follows:

Word forms     Chinese   English
Total           50220     55422
N-category      26037     28876
V-category      16657     16706
A-category       9768     10716

Concepts       Chinese   English
Total           62174     72994
N-category      29787     36770
V-category      20468     21203
A-category      11173     14339

Note: the categories N, V and A do not exactly correspond to the parts of speech noun, verb and adjective.

Main Features of Concepts (1) (MFC-1) is composed of 800 main features of events arranged in a hierarchy. Each main feature has a role framework, in which the absolutely necessary roles of that feature are specified. Main Features of Concepts (2) (MFC-2) is composed of 140 main features of things (including physical and mental substances, facts, attributes, space and time), also arranged in a hierarchy. In HowNet, each main feature indicates what is universal to all the concepts of its category. Concepts are thus well defined by some 1400 main and secondary features, with the help of pointers and the HowNet Concept Definition Markup Language.

4.1.2 Types of Concept Relations

HowNet is powerful in describing relations between concepts. It describes not only relations within the same category, but also cross-category relations. The types of relations described by HowNet are mainly as follows:

a. superordinate-subordinate (by means of MFC-1 and MFC-2)
b. synonym (by means of DEF and bilingual equivalents)
c. antonym (by means of LAR)
d. converse (by means of LCR)
e. part-whole (coded with pointer %, e.g. "heart", "CPU", etc.)
f. attribute-host (coded with pointer &, e.g. "color", "speed", etc.)
g. material-product (coded with pointer ?, e.g. "cloth", "flour", etc.)
h. agent-event (coded with pointer *, e.g. "doctor", "employer", etc.; the role may also be "experiencer" or "relevant", depending on the type of event)
i. patient-event (coded with pointer $, e.g. "patient", "employee", etc.; the role may also be "content", "possession", etc., depending on the type of event)
j. instrument-event (coded with pointer *, e.g. "watch", "computer", etc.)
k. location-event (coded with pointer @, e.g. "bank", "hospital", "shop", etc.)
l. time-event (coded with pointer @, e.g. "holiday", "pregnancy", etc.)
m. value-attribute (coded without a pointer, e.g. "blue", "slow", etc.)
n. entity-value (coded without a pointer, e.g. "dwarf", "fool", etc.)
o. event-role (coded with role names, e.g. "wail", "shopping", "bulge", etc.)
p. related concepts (coded with pointer #, e.g. "cereal", "coalfield", etc.)
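Most of the pointer-coded relations above can be read directly off a concept's DEF string (the DEF format is illustrated in 4.1.3 below). The following Python sketch shows the idea; the pointer table is copied from the list above, while the function names and the output format are our own assumptions.

# A minimal sketch of recovering relation types from the pointer codes in a
# HowNet DEF. DEFs are assumed to be comma-separated features of the form
# "[pointer]feature|Chinese", as in the KD entries shown in 4.1.3.

POINTER_RELATIONS = {
    "%": "part-whole",
    "&": "attribute-host",
    "?": "material-product",
    "*": "agent/instrument-event",
    "$": "patient-event",
    "@": "location/time-event",
    "#": "related concept",
}

def parse_def(def_string):
    """Split a DEF into (pointer, feature) pairs; None means no pointer."""
    pairs = []
    for feature in def_string.split(","):
        pointer = feature[0] if feature[0] in POINTER_RELATIONS else None
        pairs.append((pointer, feature.lstrip("%&?*$@#")))
    return pairs

# The DEF of "hospital" (see 4.1.3): a place (@) where diseases (#) are cured.
for pointer, feature in parse_def("InstitutePlace|场所,@cure|医治,#disease|疾病,medical|医"):
    print(feature, "->", POINTER_RELATIONS.get(pointer, "main/secondary feature"))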
4.1.3 Concept Description in the Knowledge Dictionary (KD)

Every concept entered in HowNet has the following description items for each language:

W_X= word or phrase
G_X= part of speech
E_X= examples
DEF= concept definition

Let's look at some examples and see how concepts are defined in the KD of HowNet (each record is condensed to one line here, and the empty example fields E_C= and E_E= are omitted):

NO.=040995  W_C=教  G_C=V  W_E=teach  G_E=V  DEF=teach|教,education|教育
NO.=016525  W_C=大学  G_C=N  W_E=university  G_E=N  DEF=InstitutePlace|场所,@teach|教,@study|学,education|教育
NO.=041046  W_C=教授  G_C=N  W_E=professor  G_E=N  DEF=human|人,*teach|教,education|教育
NO.=089130  W_C=学生  G_C=N  W_E=student  G_E=N  DEF=human|人,*study|学,education|教育
NO.=052920  W_C=论文  G_C=N  W_E=paper  G_E=N  DEF=text|语文,#research|研究
NO.=006112  W_C=博士  G_C=N  W_E=doctor  G_E=N  DEF=human|人,*research|研究,*study|学,education|教育
NO.=092249  W_C=医  G_C=V  W_E=treat  G_E=V  DEF=cure|医治
NO.=092291  W_C=医院  G_C=N  W_E=hospital  G_E=N  DEF=InstitutePlace|场所,@cure|医治,#disease|疾病,medical|医
NO.=092273  W_C=医生  G_C=N  W_E=doctor  G_E=N  DEF=human|人,*cure|医治,medical|医
NO.=034930  W_C=患者  G_C=N  W_E=patient  G_E=N  DEF=human|人,*SufferFrom|罹患,$cure|医治,#medical|医

Note: "human|人", "InstitutePlace|场所", "text|语文", "disease|疾病", etc. are N-category main features; "SufferFrom|罹患", "cure|医治", "research|研究", etc. are V-category main features; "education|教育" and "medical|医" are secondary features; "*", "@", "$", etc. are pointers. The DEF of "hospital" thus means that a hospital is a place where (expressed by @) diseases are cured, and that it belongs to the medical domain.

4.1.4 Event Role Framework

As mentioned before, each of the 800 main features of events carries a role framework, in which the absolutely necessary roles for all events of that main feature are specified. By absolutely necessary roles we mean those which definitely take part once an event happens, whether or not they are expressed in actual communication. A few lines of MFC-1 are cited below:

V1.02 possession|领属关系
  BelongTo|属于 {relevant,possessor}
  own|有 {relevant,possession}
  OwnNot|无 {relevant,possession}
  lose|失去 {relevant,possession}
V2.02 AlterPossession|变领属 {agent,possession}
  take|取 {agent,possession,source}
  earn|赚 {agent,possession,source}
  buy|买 {agent,possession,source,cost,~beneficiary}[commercial|商]
  collect|收 {agent,possession,source}

Whenever an event with the main feature own|有, i.e. the category for "possess", "have", etc., happens, "who" (possesses) and "what" (is possessed) will definitely exist. As for the event buy|买, there will be "who" (bought), "what" (was bought), "where" (it was bought from), "how much" (was paid for it), and "for whom" (it was bought, possibly the buyer himself).

4.2 Application of HowNet

4.2.1 For Better Understanding

Let's take the example sentences in 3.2 to illustrate the new process of source language parsing:

In the past few years the Tanakas bought one or two toys for their children every time they travelled abroad. It is really a pity that now they lost nearly all of them.

Unlike the present technique, the new analysis technique will, in addition to traditional syntactic parsing, perform slot-filling whenever it comes across a verb in the sentence. In the above sentences, the verbs the analyzer will find are "buy", "travel" and "lose". It will consult HowNet's MFC-1, and may then construct a table such as the following:

"buy"                    "travel"                  "lose"
agent        Tanakas     agent        they         relevant    they
possession   toys        location     abroad       possession  them
source       ?           direction    ?
cost         ?           LocationIni  ?
beneficiary  children    LocationFin  ?
time         travel
duration     years

The slot-filling process is repeated over a selected stretch of the text (the sentence-group), because a role left unknown in the sentence currently being processed might be detected in the next sentence or in the previous one. After slot-filling, the rules of "relations between events" and "role shifting" are applied; as shown in 3.2, they can help solve the problems of anaphora and ellipsis in a very reasonable way. In the above example, with the help of the role shifting rules, it is easy for the analyzer to give a correct answer to the question "Who lost what?".
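A minimal sketch of this slot-filling step is given below. The role frames for buy|买 and lose|失去 follow the MFC-1 excerpt in 4.1.4; the frame for "travel" is read off the table above (its main feature is not in the excerpt), and the hand-coded parser output stands in for a real syntactic analysis.

# A minimal sketch of slot-filling over a sentence-group. Every verb opens
# one '?' slot per role in its framework; fillers found anywhere in the
# group are written in, and the remaining '?' slots await later sentences.

ROLE_FRAMES = {
    "buy":    ["agent", "possession", "source", "cost", "beneficiary"],
    "travel": ["agent", "location", "direction", "LocationIni", "LocationFin"],
    "lose":   ["relevant", "possession"],
}

def fill_slots(verbs, parser_output):
    """Build the role table of 4.2.1 from (verb, role, filler) triples."""
    table = {v: {role: "?" for role in ROLE_FRAMES[v]} for v in verbs}
    for verb, role, filler in parser_output:
        table[verb][role] = filler
    return table

# Stand-in for real parser output over the two Tanaka sentences.
PARSED = [("buy", "agent", "Tanakas"), ("buy", "possession", "toys"),
          ("buy", "beneficiary", "children"), ("travel", "agent", "they"),
          ("travel", "location", "abroad"), ("lose", "relevant", "they"),
          ("lose", "possession", "them")]

for verb, slots in fill_slots(["buy", "travel", "lose"], PARSED).items():
    print(verb, slots)

Peripheral roles found in the text, such as the time and duration of "buy", can simply be added to a verb's slots when the parser encounters them.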
4.2.2 For Sense Disambiguation

In the traditional patterns, sense disambiguation in MT is mainly done by rules; as a rule, one sense has to be determined by one or more rules. Since the processing is confined to the scope of a single sentence, the traditional techniques are rather incompetent. An innovation can also be made in sense disambiguation, on the condition that new knowledge resources like HowNet are used. The new approach takes a sentence-group as its testing field and calculates the semantic distance between the senses to be disambiguated and the other senses in the field. The calculation is based mainly on two sets of data in the knowledge resources; in HowNet these two sets of data are coded in DEF and in E_E or E_C (the examples). For the disambiguation of most solid (content) words, such as "doctor", "bank" and "table", we will not depend on rules in the traditional patterns. The sharper the contrast between two senses to be disambiguated, the easier the disambiguation will be. In 4.1.3 the contrast between the two senses of the word "doctor" is shown by their DEFs. So when we disambiguate them, we consult the DEFs of the other solid words in the sentence-group to see how many features are similar to each of the senses. One of the advantages of this approach is that its algorithm is language-independent and system-independent: a sense disambiguation tool built for MT can also be used in other applications.
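As an illustration of the idea, the sketch below scores the two senses of "doctor" from 4.1.3 by counting the DEF features they share with the DEFs of the other solid words in the sentence-group. The paper does not fix a distance formula, so the simple feature-overlap measure and all the names here are our own assumptions.

# A minimal sketch of DEF-based sense disambiguation over a sentence-group.
# The DEFs are copied from the KD entries in 4.1.3; the overlap count is an
# illustrative stand-in for a proper semantic distance measure.

def features(def_string):
    """A DEF as a set of features, with the pointer marks stripped."""
    return {f.lstrip("%&?*$@#") for f in def_string.split(",")}

# The two senses of "doctor" (KD entries NO.=006112 and NO.=092273).
SENSES = {
    "doctor (PhD)":       features("human|人,*research|研究,*study|学,education|教育"),
    "doctor (physician)": features("human|人,*cure|医治,medical|医"),
}

def disambiguate(context_defs):
    """Pick the sense sharing the most DEF features with the context words."""
    context = set().union(*(features(d) for d in context_defs))
    return max(SENSES, key=lambda sense: len(SENSES[sense] & context))

# A sentence-group also mentioning "hospital" and "patient" (DEFs from
# 4.1.3) pulls "doctor" towards its medical sense.
print(disambiguate(["InstitutePlace|场所,@cure|医治,#disease|疾病,medical|医",
                    "human|人,*SufferFrom|罹患,$cure|医治,#medical|医"]))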
5. Conclusion

Translation is really very difficult. It needs not only linguistic knowledge but also general knowledge, and in a sense it is a kind of art, for it needs the art of manipulating words. MT, although sometimes mad, gives us some help in technical translation or in browsing the Internet. The two factors we propose in this paper, i.e. bigger-context processing and the use of new knowledge resources, may be the key criteria for next-generation MT technology. Hopefully we will not have a disappointing future.

References

Chang, Jing-shin and Keh-yih Su (1997) Corpus-based Statistics-oriented (CBSO) Machine Translation Researches in Taiwan. MT Summit VI Proceedings.
Dong, Zhendong (1998) 未来机器翻译研究的展望 (Prospects for Future Machine Translation Research). ComputerWorld.
Gerber, Laurie (1997) R&D for Commercial MT. MT Summit VI Proceedings.
Hovy, Eduard and Laurie Gerber (1997) MT at the Paragraph Level: Improving English Synthesis in SYSTRAN. TMI '97.
Kay, Martin (1996) Machine Translation: The Disappointing Past and Present. Survey of the State of the Art in Human Language Technology.
Nagao, Makoto (1997) Machine Translation Through Language Understanding. MT Summit VI Proceedings.