Document 13704314

From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. The Future of MTis Now andBar.Hillel was(almostentirely) Right EIliott Macklovitch Centrefor InformationTechnology Innovation(CITI) 1575 Chomedey Boulevard Laval, Quebec,Canada 1. Introduction A seer, accordingto onedictionary definition, is a personwhocanpredict eventsor developments; someone, in other words, whocan foretell the future. As far as I know, Yehoshua Bar-Hillel never actually claimedto be a seer. Hewasfirst andforemosta scientist (a logician andphilosopherof science),not a prophet.But in his manywritings on machine translation in the 1950’sandearly 60’s, Bar-Hillel did makea number of clear andimportantpredictions about the future of MT;andbaseduponhis assessment of what waspossiblein the field, he suggested certain priorities for future development. In this paper,I shall summarize someof the morestriking of thoseearly predictions andattempt to evaluatethemin light of the currentstate of the art of MT,some thirty-five yearslater. Asthe title of mypapersuggests,I believethat Bar-Hillel’s predictionshavebeenlargely borne out; andyet his suggestionsfor morereasonablewaysof putting the powerof computersat the service of translators havegoneby andlarge unheeded.In the final section of this paper;I will describea systemcurrently underdevelopment at the CITI called TransCheck,which is a novel kind of support tool for humantranslators. The approachto translation automationthat underpinsTransCheck is wholly consistent with Bar-Hillel’s owncall for a modestandjudicious useof mechanical aids as an alternative to classical MT,andas suchI amquite confidenthe wouldhaveendorsedit. At the CITI, weare convinced it is the wayof the future. Q Onthe Nonfeasibility of Fully Automatic,HighQuality Translation (or FAHQT) Thecelebratedargumentthat Bar-Hillei publishedin 1960on the nonfeasibility of fully automatic,high quality machinetranslation is nearly as well-knownas the acronym he coinedfor it, andcertainly doesnot bear repeatingin extenso.As Bar-HUlelhimself pointed out, the argumentdoes not even concern translation proper; rather, it demonstrates the inescapableneedfor extra-linguistic knowledge in order to determine the meaning = andhencethe translation - of a polysemous wordlike "pen" in a perfectly innocuoussentencelike "Theboxis in the pen". As it turns out, the knowledge required Mackloviteh 137 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. to disambiguate "pen" in this context concernsnot just common-sense expectationsabout the relative sizes of writing instrumentsandchildren’s enclosures;also requiredis the ability to reasonaboutthis knowledge, viz. if an object X is in someotherobject Y, X is normally smaller than y.1 Thesekinds of reasoningcapabilities andextra-linguistic knowledge wereobviouslynot available to existing machine translation systems thirty-five yearsago. Moreimportantfor us today, however, is the manner in whichBar-Hillel reacted to the suggestionthat suchinformationmighteventuallybe put at the disposalof later MT systems: ’~¢Vhatsucha suggestionamounts to, if takenseriously, is the requirement that a translation machineshouldnot only be suppliedwith a dictionary but also with a universal encyclopedia.This is surely utterly chimerical andhardly deservesany further discussion."(Bar-Hillel [2], p.176) Still, Bar-Hillel did accordthe suggestion somefurther attention, elaboratingon why he consideredthe idea of equipping an MTsystemwith a universal encyclopediaso preposterous. "Thenumberof facts wehuman beingsknowis, in a certain very pregnantsense, infinite. Knowing,for instance, that at a certain moment there are exactly eight chairs in a certain room,wealso knowthat there are morethan five chairs, less than 9, 10, 11, 12 andso on, ad infinitum... Weknowall theseadditional facts by inferencesweare able to perform,andit is clear that they are not, in anyserious sense,storedin our memory." (Bar-Hillel [2], p.177) At the core of Bar-Hillel’s argument, therefore,is not just the fact that translation routinely requires encyclopedicknowledge;i.e. knowledge not about the properties of languagebut aboutthe real world. Rather,the nubof the problemis the fact that humans caninstantaneously accessinfinite amounts of suchknowledge, as a result of their ability to infer. Andwhile Bar-Hillel wasable to envision a translating machinethat might eventually performcertain inferences, he found it inconceivablethat such a machine would be able to do so in the same spontaneous manner or under the same circumstances as anyintelligent human can, andas translators unconsciously do all the time. All in all, Bar-Hillel’s celebratedargument coversno morethana pagein the original text; andyet there are at least two quite different waysin whichit canbe interpreted.On the first andnarrowerinterpretation, it is a logically impeccable demonstration of the unattainability of FAHQT, seennot as a strawmanbut rather as the actual, thoughoften unstated,goal of manyof the groupsworkingin MTat that time. For the purposesof this demonstration,it is not necessaryfor Bar-Hillel to quantify the frequencywith which I. Hence,given that boxes are generally larger than writing instruments, the box in question is most likely within a kind of enclosure, such as a playpen.Barringindications to the contrary, this at least is a moreplausible interpretation of the sentence than the one in which"pen" is a writing instrument. 138 BISFAI-95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. sentenceslike thoseof his simple exampleoccurin various types of documents; as long as anyinstance of this kind of ambiguityappears,requiring extra-linguistic knowledge and/orreasoning for its resolution,it is sufficient to scuttle the fully automatic part of the FAHQT ideal. In other words, as soonas a human post-editor has to be called uponto resolve the ambiguity of one polysemousword like "pen", the MTsystem involved necessarilybecomes less than fully automatic.Moreinterestingly, perhaps,oncethe need for humanintervention in the translation process is admitted and accepted,one can envisagea widerangeof possiblemodesof cooperation,or divisions of labour, between manandmachine.To find the mostproductiveor cost-effective arrangement was,for BarHillel, an empirical questionthat demanded careful study. Unfortunately, it wasnot a questionthat seemsto haveinterested the majority of MTworkersat the time, manyof whomcontinued to confuse the aims and methodsof MTas an area of fundamental researchwith thoseof MTas a practical endeavour. Bar-Hillel recognizedthe validity of both pursuits; whathe deploredwastheir confusion.Wereturn to this issue in section 6 below. 3. The Future of MT Thereis anotherpossibleinterpretation of Bar-Hillel’s famousdemonstration,one that is broader and morepessimistic, in which he can be seen as arguing for the unattainability of FAHQT not merelyin the short term,but altogether.Thatthis in fact was the real intention of his argument is suggested somewhat passinglyin the original article, whereBar-Hi,el states =that no existing or imaginableprogramwill enablean electronic computerto determinethe meaningof the word"pen" in the given sentence..." (p.175; emphasis added).But he reinforces this interpretation andmakesit perfectly clear in piece he publishedin 1962in the TimesLiterary Supplement, entitled =TheFuture of MachineTranslation." In it, Bar-Hillel declares that "MThas reachedan impassefrom whichit is not likely to emerge withouta radical change in the wholeapproach..." "...with all the progressmade in hardware,programming techniquesandlinguistic insight, the quality of fully autonomous mechanicaltranslation, even when restricted to scientific andtechnologicalmaterial, will neverapproachthat of qualified humantranslators and therefore MTwilt only undervery exceptional circumstances be able to compete with human translation." (Bar-Hillel [3], p.182; again, myemphasis) Tojustifythis pessimisticprognostic,Bar-Hillel points to the syntactical(or scoping) ambiguityof the adjective in a simple phraselike "slow neutronsand protons." The examplehe choosesis different, but in essence,itis the sameconciseargumentas he invokedin the celebrated1960article, basedon the fact that human translators routinely makeuse of their vast backgroundknowledge,no counterpart of which he says could conceivablystand at the disposal of computers.Bar-Hillel leaves it to the reader to Macklovitch 139 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. generalize the argumentto the "innumerablesemanticalambiguitieswhichnothing but plain, factual knowledge or considerations of truthfulnessandconsistency will resolve..." (ibid [3], p.182). Thefinal conclusionsBar-Hillel arrives at in ’~he Futureof MachineTranslation" musthaveappearedapocalyptic to MTworkersat the time; those of the ALPAC report pale in comparison: "1 would say that there is no prospect whatsoeverthat the employmentof electronicdigital computers in the field of translationwill lead to anyrevolutionary changes. A complete automation of the activity is whollyutopian.... Thequickerthis is understood, the better are the chances that moreattentionwill be paid to finding efficient waysof improving the statusof scientific andtechnological translation-... including judicious andmodest useof mechanical aids." (Bar-Hillel [3], p.183) 4. The Future is Now It hasnowbeenover thirty yearssince Bar-Hillel publishedtheseprovocativeviews on the future of machine translation. Of course,the future is by definition boundless. But suppose wedecidedto call in the bets today,andarbitrarily decreed that the future is now. HowwouldBar-Hillel’s predictionsfare in light of the currentstate of machine translation? In particular, someof the questionswewouldlike to considerare the following: WasBarHillel right in declaringthat high quality andfull automation are almostalwaysmutually exclusive in machine translation?2 Hastime shownthat he wascorrect, or just short of imagination, in asserting that no imaginableprogramwouldever allow a computerto perform sensedisambiguationson polysemouswordslike "pen" in underdetermined contextslike that of his famousexample? Is it reasonable to characterizeas exceptional those attested cases in which MThas been able to competefavourably with human translation? Moregenerally, has machinetranslation yet emergedfrom the impassein whichBar-Hillel sawit miring in 1962?Andfinally, has anythinglike a radical change occurredin the dominantapproachto the wholeproblemof translation automation,like the oneBar-Hillel calledfor in the early sixties? It is no easytask to attemptto summarize the current state of the art in a field as diverse andebullient as MTis today; andso inevitably, mypersonalassessment will appearto somepeopleas incompleteor tendentious.Still, it seemsto methat there are certain indisputablefacts aboutthe current state of machinetranslation whichanyone, regardlessof his or her particular bias, mustnecessarilyaccept.Andoneof theseis that the growinguseof computers over the last ten or fifteen yearshasnot beenaccompanied by anythinglike the revolutionin translation that early MTpractitionershadhopedfor. On this point at least, Bar-Hillel hasturnedout to be undeniably correct. Whileaccuratedata 2. "Thosewhoare interested in MTas a primarily practical device must realize that full automationof the translation process is incompatiblewith high quality." (Bar-Hillel [5], p. 167) 140 BISFAI-95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. on the worldtranslation marketare notoriouslydifficult to obtain, I amawareof no studies or estimatesthat accordmachine translation morethan 5%of that market-andthat figure appearsto meto be generous.Whyis it that in 1995MTstill occupiessucha marginal role in professionaltranslation worldwide? Theunavoidable answer,in myview, is that in the vast majority of translation situations, currently available MTsystemsare simplynot 3 Andin mostcases, this is becausea able to satisfy users’ needsand/or expectations. significant gapcontinuesto exist between the quality of the "raw" machine output andthe quality requirements of the endusers, suchthat the cost of the post-editing necessary to makethe machine output usableturns out to be prohibitive. Heretoo, it wouldseemthat Bar-Hillel wasquite correct in assertinga generalincompatibilitybetween high quality and fully automaticmachine translation. Whatabout those cases whereMTdoes prove cost-effective and so can compete favourablywith human translation? Thesewouldappearto fall into oneof two categories. In the first, the applicationdomain is so narrowthat the developers havebeenable to craft a specialisedsystemcapableof producingtranslations of acceptablequality, principally becausethe restrictions on the languageusedin the domaineffectively reducethe full range of linguistic ambiguities to manageable proportions.4 In the other class of successful MTapplications, the end users agree to accept less than top quality translations- either because this is the only wayof obtaininga translationat all, or as a cost savingmeasure for texts that will not receivewideor public distribution. Overall, however,the translation situations in whicheither of these two conditions obtains are relatively rare, or as Bar-Hillel qualified them,"exceptional". 5. MT and AI Turningnowto the moretechnical questionof the capacityof current MTsystemsto performlexico-semanticaldisambiguations like thosethat Bar-Hillel illustrated with his s famous"box in the pen" example, let mefirst remarkhowappropriateit is that this questionbe consideredat a symposium on the foundationsof artificial intelligence. For the issue raised by suchexamples directly links machine translation to the broaderfields of AI and natural languageunderstanding.As mentionedabove,the knowledgerequired to disambiguate"pen" in Bar-Hillel’s example is not primarily linguistic; rather, it involves common-sense expectationsaboutreal-world objects, as well as the ability to reasonand 3. Otherexplanationsare possible, at least in principle. For example,adequate MTsystemsmightexist but simplybe too expensivefor the majority of potential users. Currently, however,this is certainly not the case; on the contrary, inexpensive PC-basedsystems have madeM’r morewidely available than ever. The problemlies in the deficiency of the technology,regardless of whatone is preparedto pay for it. 4. Or, in the absence of naturally occurring sublanguages(like that exploited by Canada’sMETEO system), artificial restrictions nay be imposedon the languagein whichtechnical documentationis drafted, in order to simplify the input to an MTsystemfurther downline.There has been an increase in the use of controlled languagefor MTin recent years. 5. Or, for that matter, syntactical disambiguationslike the one he illustrated with the "slow neutrons and protons" example,since both require a capacity to reason over extra-linguistic knowledge. Macklovitch 141 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. draw inferences over this kind of extra-linguistic information. Here, it wouldbe disingenuousnot to acknowledge the substantial progressthat hasbeenachievedsince the early 1960’s.For onething, the mainthrust of Bar-Hillel’s argument is nowa generally acceptedtruth in the field: everyoneinvolved in MTresearch anddevelopmenttoday recognizesthat in order to correctly translate unrestricted text, an MTsystemmustbe suppliednot just with dictionaries andgrammars, but with someportion of encyclopedic knowledgeas well. Overthe last three decades,numerous researchershaveattempted to design MTsystemsin such a wayas to accommodate this fundamentalfact about translation. This is not the placefor an exhaustivesurveyof theseefforts; but wewould like to attempta generalevaluationof their overall success. Terry Winogradis a prominentpioneer in artificial intelligence, whoseSHRDLU system in the late 60’s is still cited as providinga striking illustration of naturallanguage understanding.In 1984,Winogradpublishedan article in Scientific Americanentitled "ComputerSoftware for Workingwith Language,"in which he raised the following question: "Is there softwarethat really dealswith meaning - softwarethat exhibits the kind of reasoningthat a personwoulduse in carrying out tasks suchas translating, summarizingor answering a question? Suchsoftware has been the goal of researchprojectsin artificial intelligencesince the mid-1960’s,whenthe necessary computerhardwareand programmingtechniques beganto appear even as the impracticability of machinetranslation wasbecoming apparent..." (Winograd [6], p.136) Winograd briefly reviewssomeof the better knownNLPprojects in the 70’s and80’s that attemptedto encodeknowledge of the world in a form that a programcould use to drawinferences(e.g. Schank’sscripts). Theproblem,he points out, is that all of these programs workonly in highly limited, somewhat artificial domains,andit is not at all obvioushow- or whether- they canbe extended.In his conclusion,the reply he himself providesto his earlier questionis certainly not very encouraging;in fact, it is quite reminiscentof the conclusions that Bar-Hillel hadarrived at twenty-fiveyearsearlier. ’q’he limitations on the formalizationof contextualmeaning makeit impossibleat present - andconceivablyforever- to designcomputerprogramsthat comeclose to full mimicryof human languageunderstanding."(Winograd[6], p.142) But doesthis necessarily entail that computerswill never be able to translate adequately?Perhapsit might still be possible to design programsthat wouldallow a machine to producerelatively high quality translations withouthavingto simulatethe full extent of human languageunderstanding;if not all the time, at least often enoughfor human post-editingto be cost-effective. Actually, Bar-Hillel himselfthoughtso for a time. Referringbackto the expectationshe first held for MT,he observed: ’I knewthen that nothingcorresponding to items (3) and(4) [i.e. goodgeneral 142 BISFAI-95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. background knowledge andexpertnessin the field] could be expectedof electronic computersbut ... entertained somehopesthat by exploiting the redundance of natural languagetexts better than human readers usually do, weshould perhaps be in a position to enablethe computers to overcome, at least partly, their lack of knowledge andunderstanding."(Bar-Hillel [4], p.213) For Bar-Hillel, however,thosehopeswereshort-lived. In 1962,he wrote: ’9"houghit is undoubtedlythe case that somereduction of ambiguity can be obtainedthroughbetter attention to certain formal clues ... it shouldby nowbe perfectly clear that there are limits to whattheserefinements[of purely formal methods]can achieve, limits that definitely block the wayto autonomous, highquality, machine translation." (ibid, p.213) Other AI researchershavebeenless pessimistic than Winogradand Bar-Hillel. Sergei Nirenburg, a Iongtime stalwart of the knowledge-based approachto machine translati.on, is oneof these. HeandKennethGoodman publisheda bravearticle [7] in 1990, in whichthey systematically take up manyof the criticisms that are frequently directed against meaning-based (or interlingual) MT, andexposethe doublestandards andmanyof the inconsistenciesin the argumentsthat are often employedto repudiate and even disparage the interlingual paradigm. But in the face of widespreadand persistent scepticism about the overall feasibility of developinga completeset of language-independent meaning representationsfor all possible linguistic expressionsin all humanlanguages,NirenburgandGoodman can do little morethan protest that ’~qe are makinginroadsinto theseandother difficult areas."(p.184)In the end,they are forced to recognizethat the only real wayto silence the critics andconvincethe scepticsis to demonstrate the practical utility of the approach by actually building a productionsystem prototype. Until suchtime as weseethe results of sucha prototype, however,it seemsto me wehaveevery reasonto remainsceptical. For as Winograd andothers havebeencareful to point out, all the interlingual systemsthat havebeendocumented or demonstrated to 6 date havebeenmoreor less toy systems. To the extent that they have beenable to simulatesomemeasure of inferencingin order to arrive at a correct translation, theseAIbasedsystemsmayperhapshave shownBar-Hillel to be wrong, at least in a narrow sense.7 Nonetheless,I doubt that Bar-Hillel wouldhavebeenvery impressedwith such demonstrations. For though an accomplishedtheorist in other domains, he never remainedsolely a theorist in matters of translation, but alwaysexhibited a genuine 6. As Hutchins [15] puts it in the summary of his chapter on AI-basedMTsystems: "It needs to be stressed, however, that none of the AI workersare expecting their workto result in the near future in ’operational’ MTsystems."(p.284) 7. Thougheven this is not entirely obvious. As mentionedabove, Bar-Hillel was able to envision a machinecapable of inferencing; what he found more difficult to imagine was "a schemewhich would make a machine perform such inferences in the same or similar circumstances under which an intelligent humanbeing wouldperform them." (BarHillel [2], p. 177) Andof course, he also discountedany ad hoc procedure, mountedsolely for the case at hand, "whose futility wouldshowitself in the next example."(ibid, p. 174) Macklovitch 143 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. concern with the practical problemsof translation in the real world. And here, demonstrations of the possibility of machinereasoningwithin toy domainsare of little consequence. Themajor obstacle to fully automatic,cost-effective machinetranslation todayremainsexactly the sameas it was35 yearsago, to wit, the vast andunpredictable rangeof the knowledge that is requiredto allow a machineto achievean understanding of a sourcetext that is sufficient for the purposesof translation. AsHutchinsandSomers put it in their Introductionto Machine Translation: "Theproblemfor MTsystemsis that it is at presentimpossiblein practice to code andincorporateall the potential (real world) knowledge that mightbe required resolveall possibleambiguitiesin a particular system,evenin systems restricted to relatively narrowrangesof contexts and applications. Despite advancesin Artificial Intelligence andin computing technology,the situation is unlikely to improvein the near future: the sheercomplexityandintractibility of real world knowledge are the principal impediments to quick solutions." (Hutchins& Somers [8], p.93) Hence,it seemsfairly safe to say that Bar-Hillel wouldnot havesubstantially modifiedhis viewson the feasibility of FAHQT, hadhe lived beyond1975to witnesssome of the impressivesuccesses in artificial intelligence. Withoutwantingto diminishthe importof the advances of the last twentyyears, it doesappearto be generallytrue that AI andits daughterdiscipline MTare similar, in that their mostimpressiveapplicationshave beenachievedin relatively restricted domains; in both fields, depthof understanding and breadthof coverageremainby andlarge mutually exclusive. Moreover,Bar-Hillel was alreadyfamiliar with some of the early successes of AI research,e.g. the workon pattern recognition (or perceptrons) and programsto play checkers. Given his remarkable foresight, wewouldbe rash to disregardthe warninghe formulatedin 1962:"it wouldbe disastrousto extrapolatefromtheseprimitive exhibitions of artificial intelligence to something like translation." (Bar-Hillel [4], p.214) 6. A Radical Changeof Approach In summary: an objective assessment of the state of the art in MTwouldseemto suggestthat the field is still in fact miredin the same impasse that Bar-Hillel described in the early 60’s. Thequestion wenowwant to consider is whetherwehavebegunto see anything like a radical changein the dominantapproachto the whole problemof translation automation that Bar-Hillel called for backthen. Myowninclination is to answer in the negative(with an importantqualification, to be specifiedbelow).Althoughthey may not publicly admitit, the researchers on mostcurrentmachine translationprojectsare still striving to achieveFAHQT, just as they werein Bar-Hillel’s time. Granted,the techniques wenowemploymayhave evolved; newapproaches,or perhapswhole newparadigms, haveemerged overthe last decade,whichappearat first glanceto be radically innovative, 144 BISFAI-95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. e.g. Statistical MachineTranslation and Example-Based MachineTranslation.8 But the goal remainsessentially the samenowas it wasthen: to developa fully automatic translating machine capableof producingtarget texts of a quality comparable to that of a human translator - in other words,a translating robot. Noonecan object to this as an entirely worthwhilepursuit for a long-termresearchprogramme. Theproblemis that many of thoseworkingtowardthis objective continueto maintainthe unspoken assumption that their researchefforts - evenif they do eventuallyfall short of their ultimate goal - will neverthelessprove useful in providing short-term solutions to the practical problems besettinga translation professionthat is overwhelmed andunableto meetthe demand for its services.This is far froma self-evidenttruth, however, as wewill arguebelow.In his pre-ALPAC articles, Bar-Hillel deploredjust this confusionbetweenthe aimsandmethods of MTas an area of fundamental researchandMTas a practical endeavour.Perhapsit is time wefinally learnedthe lessonsof history andaccepted the fact that classical MT-and by this I mean anyapproach in whichthe initiative in the translation processis givenover to the machineso that it can autonomously producea target version of the sourcetext will never contribute morethan marginally to satisfying the ever-growingdemand for translation, or at least not for many,manyyearsto come. For those whoremainconcemed with the practical problemsof workingtranslators, doesBar-Hillel indicate the direction he thought that a genuinelyradical changeof approachshould take? The citation reproducedat the end of section 3 abovedoes provideoneclue, whereBar-Hillel mentionsthe possibility of a"judicious andmodestuse of mechanical aids." (Herethe emphasis is in the original.) In his "AimsandMethods Machine Translation"[5], Bar-HiUelis evenmoreexplicit: ’qhe only reasonableaim, then, for short-rangeresearchinto MTseemsto be that of finding somemachine-post-editor partnership that would be commercially competitive with existing humantranslation, and then try to improve the commercialcompetitivenessof this partnership by improvingthe programming in order to delegateto the machine moreandmoreoperationsin the total translation processwhichit canperformmoreeffectively than the human post-editor." (p.172) A partnership betweenmachineandhumantranslator/post-editor that takes as its starting point a judicious andmodestuseof mechanical aids: this wouldseemto point to whatis nowgenerally called machine-aided human translation (or MAHT), as distinct from 9. classical MTor human-aided machinetranslation MAHT has not beena popular avenue of researchoverthe last thirty years. Tobe sure, it hashadits isolated champions; most notably, perhaps, Martin Kay (cf Kay [11]). But generally speaking, MAHT has not 8. The standard reference for S/VII" is Brownet al. [9], and for EBMT, Sato and Nagao[I0]. The former approach, of course, does have historical antecedents, on which Bar-Hille! also had somevery interesting things to say. See especiallyBai-Hillel [5], p. 17 I. 9. Again, the distinction between MAHT and IIAMTmay be framed in terms of which of the two - manor machine - retains the initiative in the translation process..In MAHT, it is the humantranslator, and the machineis viewedas a tool that maybe called uponto amplify humancapabilities, but only on such tasks that can be automatedreliably. Macklovitch 145 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. succeeded in attracting anythinglike the attention andfundingthat hasbeenpouredinto classical MT,no doubtbecause,as Bar-Hillel suggested,the latter is perceivedas an intellectually morechallengingpursuit. Onlyin the last few years, as moreandmore researchershavestarted exploring corpus-basedapproachesto NLP,has MAHT begun 1° to receivethe attention it deserves. MAHT is the philosophicalcornerstone of the CITI’s machine-aided translation program.In this final section, I wouldlike to illustrate our approachto MAHT andbriefly discuss whyweconsiderit to be a radical departurefrom traditional responses to the problemof translation automation,by describingoneof the translator supporttools weare currently developing. Theproject in question is called TransCheck, andit is documented morefully in Macklovitch [13]. As its namesuggests, TransCheckis intended to be used as a translation checker,somewhat like a spelling checker.But wherethe latter verifies certain (orthographic)propertiesof a single monolingual text, TransCheck is designed to validate certain properties that normallymusthold betweentwo texts that are in a translation relation. To do so, the systemincorporatesan alignmentalgorithmthat automaticallylinks segments (currently sentences)in the target text to their corresponding segments in the sourcetext. Oncethe draft translation is completed,the translator submitsthe two languagefiles to TransCheck,andthe systemthen verifies the aligned segmentsto ensurethat they do not contain anydeceptivecognates,caiques,illicit borrowings,or certain other commonly occurring translation errors. Whenit does detect an error describedin its database of prohibitedtranslations, TransCheck flags it andprovidesthe user with informationon the correct target languageformthat shouldbe used.Preliminary tests of the first prototype haveproducedencouraging results, confirmingthe general viability of a translation checkerbasedon a sentencealignmentprogramanda part-ofspeech tagger;again,see[13] for further discussion.Ultimately,of course,it is hopedthat TransCheck will be able to detect moresubtle types of translation errors, andthat end userswill be able to modifythe contentsof the databaseso that it reflects their own translation norms.Here,however,I wouldlike to focus on anotherprojectedextensionto TransCheck which has not yet beenfully implemented,but whichillustrates, I think particularly well, the interest of whatBar-Hillel called a judicious andmodestuseof mechanicalaids. Oneof the mostboring tasks for a translation reviser (whomaybe the translator himself) is to verify that all numericalexpressionsin a sourcetext havebeencorrectly renderedin the target. Textsin domainssuchas economics or statistics can be packed full of suchexpressions,andthe smallesterror in onedigit is tantamount to a serious mistranslation: not only is it extremelyembarrassing for the translator, but it can 10. For moreon this paradigmshift, see Isabelle [12], wherethe author outlines someof the reasons whythe classical rule-based approachto MThas producedso few useful results in the wayof translator support tools, and why,on the contrary, the corpus-basedapproach seemsto lend itself so well to MAHT. 146 BISFAI-95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. undermine the credibility of the entire text. Thereasonwhyhuman revisers find this task so boringis that the "translation" of numericalexpressions is so straightforwardthat it requires almostno intellectual effort (althoughbetween certain languages,there maybe someminor differences of syntax); andyet every numbermuststill be checkedfor the possibility of an error of transcriptionthat doesoccasionallyoccur.Is this not exactlythe kind of mechanicaloperationfor whichcomputers are better suited than humans? Onthe basis of the alignedsentencesit has paired, a systemlike TransCheck shouldbe able to verify that for every source segment containing a numerical expression, the correspondingtarget segmentcontains the equivalent numericalexpression; andwhere it doesn’t,that pair shouldbe broughtto the reviser’s attention. In actualfact, the problem is not as trivial as I havesuggestedhere11;nonetheless,weare convincedthat evena rudimentarynumericalcomponent within TransCheck should be able to validate a large proportion of the numerical expressions in most texts. Moreover, an important characteristicof this approach to translation validation is that evenif the systemis less thanfully exhaustive,whatever errors it doesdetect will still contributeto improvingthe quality of the final text; just asthe errors detectedby a spelling checkerimprovethe final product, eventhoughnoneof those systemsis fully exhaustiveeither. Notice, however, that this is not generallythe casewith classical MTsystems:the kind of partially correct translations generatedby such systemsdo not always result in a reduction of the translator’s workload,as manydisenchanted MTusers havetestified over the years. 7. Conclusion TransCheck is oneof a newgenerationof translation supporttools currently being developed at the CITI, all of whichare basedon the conceptof translation analysis, as 12 In our view, translation opposed to the classical MTapproach of translation generation. analysis doesconstitute a radical departurefrom traditional responsesto the whole problemof howbest to automatethe translation process, although it is not as yet anywhereclose to becomingthe dominantapproachin the field. In its favour, it has allowedfor the development, within a remarkablyshort period of time, of promisingnew types of translation support tools, which are wholly consistent with the modestand practical strategy that Bar-Hillel advocated over 30 yearsago. Whetherthesetools will actually live up to their promiseandprovemoreuseful to translatorsthan classical MThas to date, only the future will tell. Andno onebut a seercanpredict the future. 11. Toillustrate just a fewof the complicationsfrequentlyencountered,there is the obviousproblemof numerical expressionsthat are writtenaccordingto differentstandards,e.g. "7 p.m."vs. "19 h"; whichis whyall suchexpressions will haveto translated into a normalizedformbefore beingcompared.Less obviously,the text in onelanguagemay use a numeral,e.g. the date "1994",wherethe translation properly refers to the sameperiod by meansof a nonnumericalnounphraselike "last year". 12. SeeIsabelle et al. [14] for a descriptionof someof the CITI’sother projects, anda fuller discussionof the differencesbetweentranslation analysis andtranslation generation. Macklovitch 147 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. References [1] Bar-Hillel, Yehoshua. Language andInformation: SelectedEssayson their The0ry andApplication, Addison-Wesley Publishing Company, Inc., 1964. of the Nonfeasibility of Fully AutomaticHighQuality [2] Bar-Hi,el, Y. "A Demonstration Translation," reproduced in [1], pp. 174-179. Translation,"reproduced in [1], pp. 180-184. [3] Bar-Hillel, Y. ’’The Futureof Machine Translation,"in [1], [4] Bar-Hillel, Y. "FourLectureson AlgebraicLinguisticsandMachine pp. 185-218. in Machine Translation," in [1], pp. 166-173. [5] Bar-HiUel,Y. "AimsandMethods [6] Winograd, Terry. "Computer Software for Working with Language," ~ American,no.251(3), September 1984. "Treatment of Meaningin MTSystems," [7] Nirenburg, Sergei & KennethGoodman. Proceedinos of the Third International Conference on Theoretical and v MethodoloaicalIssues in MachineTranslation, Austin, Texas,June1990. [8] Hutchins, W. John& Harold L. Somers.An Introduction to MachineTranslation, AcademicPress, 1992. [9] Brown, Peter et al. "A Statistical L~, no.16:2, pp.79-85, 1990. Approach to Machine Translation," ~ Translation," EB.g.CB..(;;IJOg~ [10] Sato, Satoshi& MakotoNagao."TowardMemory-Based of COLING-90, vol.3, pp.247-252,1990. [11] Kay, Martin. "The Proper Place of Menand Machinesin LanguageTranslation," CSL-80-11,Xerox PARC,1980. [12] Isabelle, Pierre, "Machine-AidedHumanTranslation and the ParadigmShift," Proceedingsof the Fourth MachineTranslation Summit,Kobe,Japan,July 1993. [13] Macklovitch,Elliott. "UsingBi-textual Alignmentfor Translation Validation: the TransCheck system," Proceedingsof the First Conferenceof the Association for MachineTranslation in the Americas,Columbia,Maryland, pp.157-168,October 1994. [14] Isabelle, Pierre et al. "Translation Analysis and Translation Automation," Proceedings of the Fifth International Conference on TheoreticalandMethodological Issues in MachineTranslation, Kyoto, Japan,1993. [15] Hutchins, W. John. MachineTranslation; P~t. Present. Future, Ellis Horwood, Chichester,1986. 148 BISFAI-95

Document 13704314

Related documents

Products

Support

Document 13704314

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib