Machine Translation 2011/March

What Is Machine Translation?
http://www.diplomacy.edu/language/Translation/machine.htm

Machine translation (MT) is the use of computer software to translate text or speech from one natural language into another. Like translation done by humans, MT is not simply a matter of substituting words in one language for words in another; it requires the application of complex linguistic knowledge: morphology (how words are built from smaller units of meaning), syntax (grammar), semantics (meaning), and an understanding of concepts such as ambiguity.

Research and development of machine translation has been going on since the 1950s, engaging some of the best minds in computing, linguistics and artificial intelligence. Steve Silberman writes:

The dream of translation by computer is older than the high tech industry itself. Before email, before word processing, before command-line interfaces, machine translation - or MT - was one of the first two computer applications designed to act upon words instead of numbers (the other was code breaking)… But it turns out that really good MT is so hard to pull off that the task exhausted the top-end computing resources of every generation attempting it. Regardless, machine translation R&D (research and development) is going stronger than ever, fired up by the globalization of the Net. Today, all over the world, software designers, programmers, hardware engineers, neural-network experts, AI specialists, linguists, and cognitive scientists are enlisted in the effort to teach computers how to port words and ideas from language to language. ("Hello, World," Wired, May 2000)

As our environment becomes more networked and connected internationally, the call for MT increases. Researchers predict that in the very near future English will no longer be the mother tongue of the majority of Internet users.
Already the amount of material needed in different language versions is too vast for human translation alone, according to Systran, one of the oldest machine translation companies.

MT is a long way from being able to replace human translation, and many experts feel it may never do so. But it can reduce the amount of work for human translators by taking over translations where accuracy is not essential, and by assisting humans with more important translation jobs.

MT offers some real advantages. According to Systran, MT is much faster than human translation (humans can translate 2,000 to 3,000 words a day, while Systran’s MT software can translate 3,700 words a minute). MT is much cheaper than human translation. MT software has a better memory than human translators: it can store translated documents and re-use phrases that have already been translated.

The accuracy of MT is much lower than that of competent human translation, but it can be improved in certain ways, for example by ensuring that spelling and punctuation are correct in the original text. When MT is used in conjunction with human translators, to provide a first draft which is then given to a human for polishing, it can save time and money.

The following resources offer a good general introduction to machine translation:

Steve Silberman, "Talking to Strangers," Wired, May 2000. A good history of the conception, development and current state of machine translation.
"Machine Translation’s Past and Future," Wired, May 2000. A timeline of the history and future of machine translation.
"Universal Translators," Wired, May 2000. A listing of machine translation research and development hubs worldwide.
D.J. Arnold, Lorna Balkan, Siety Meijer, R. Lee Humphreys and Louisa Sadler, Machine Translation: an Introductory Guide, London: Blackwells-NCC, 1994. A comprehensive book about machine translation, available online.
Links on MT. Research centers, products and software.
Links on MT.
Research centers, companies and articles online.

Machine translation has proved useful primarily in two fields: as an aid for human translators, and for translating material on a restricted subject matter.

First, as an aid for human translators working on material which must be accurately translated, MT can save time by producing a first draft.

Second, MT can produce fairly accurate translations when the domain of discourse is highly restricted, that is, when syntax is simplified, vocabulary is predictable and each word is likely to mean one and only one thing, as in technical documents, equipment maintenance manuals, weather reports, etc.

“The classic example of MT that works is the Météo system, developed in Montreal, which has been translating Canada's weather bulletins between English and French on a daily basis since 1977. In the world of Météo discourse, ‘front’ always means a weather system. The translation of forecasts was so boring that before Météo took over, the Canadian government had a hard time keeping translators on the job for more than a couple of months.” (Steve Silberman, "Talking to Strangers," Wired, May 2000)

Machine translation provides fast but potentially error-prone text translations. (http://www-01.ibm.com/software/globalization/topics/machinetranslation/index.jsp)

John Hutchins:

“For many years, MT with human assistance has been a cost-effective option for multinational corporations and other multilingual bodies (e.g. the European Union). MT systems produce rough translations which are then revised (post-edited) by translators. But post-editing to an acceptable quality can be expensive, and many organizations reduce costs and improve MT output by the use of ‘controlled’ languages. In this way, translation processes are closely linked to technical writing and integrated in the whole documentation workflow, making possible further savings in time and costs.
It is widely agreed that where translation has to be of publishable quality, both human translation and MT have their roles. Machine translation is demonstrably cost-effective for large scale and/or rapid translation of technical documentation and software localization materials.” (http://www.hutchinsweb.me.uk/main.htm)

Problems with machine translation

1- Machine translation works quite well for translating predictable technical texts, that is, texts which never go beyond the expected domain of discourse. But this is little help in the domains where people want translation the most: spontaneous conversations, in person, on the telephone, and on the Internet.

2- Computers do not have the ability to deal adequately with the various complexities of language that humans handle naturally: ambiguity, syntactic irregularity, multiple word meanings and the influence of context. A classic example is the following pair of sentences:

Time flies like an arrow.
Fruit flies like an apple.

The sentence construction is parallel, but the meanings are entirely different: the first is a figure of speech involving a metaphor and the second is a literal description. And the identical words in the sentences, flies and like, belong to different grammatical categories in each. A computer can be programmed to understand either of these examples, but not to distinguish between them.

3- Computers not only lack the knowledge of the world needed to deal with word choice, but they also lack the knowledge necessary for cultural sensitivity. Melby writes that translation needs to be “sensitive to total context, including the intended audience of the translation.
Meaning is not some abstract object that is independent of people and culture.” As an example of the damage that can be done by culturally ignorant and insensitive translation, even by humans, he describes his investigation of the translation of a remark made by Nikita Khrushchev in Moscow on November 19, 1956:

Khrushchev was then the head of the Soviet Union and had just given a speech on the Suez Canal crisis. Nasser of Egypt threatened to deny passage through the canal. The United States and France moved to occupy the canal. Khrushchev complained loudly about the West. Then, after the speech, Khrushchev made an off-hand remark to a diplomat in the back room. That remark was translated “We will bury you” and was burned into the minds of my generation as a warning that the Russians would invade the United States and kill us all if they thought they had a chance of winning… Several months ago, I became curious to find out what Russian words were spoken by Khrushchev and whether they were translated appropriately… In Soviet Communist rhetoric, it is common to claim that history is on the side of Communism, referring back to Marx who argued that Communism was historically inevitable. Khrushchev then added that Communism does not need to go to war to destroy Capitalism. Continuing with the thought that Communism is a superior system and that Capitalism will self-destruct, he said, rather than what was reported by the press, something along the lines of ‘Whether you like it or not, we will be present at your burial,’ clearly meaning that he was predicting that Communism would outlast Capitalism. Although the words used by Khrushchev could be literally translated as “We will bury you,” (and, unfortunately, were translated that way) we have already seen that the context must be taken into consideration.
The English translator who did not take into account the context of the remark, but instead assumed that the Russian word for “bury” could only be translated one way, unnecessarily raised tensions between the United States and the Soviet Union and perhaps needlessly prolonged the Cold War. ("Why Can’t a Computer Translate More Like a Person?")

Examine the following MT systems and report your feedback (which languages are involved; whether the service is free or needs a subscription; the quality of the output):

http://my.ajeeb.com/
http://www.aramedia.com/aschome.htm
http://www.almisbar.com/
http://www.systranet.com/
http://www.systran.co.uk/
http://babelfish.yahoo.com/
http://www.languageweaver.com/page/home/
http://www.trident.com.ua/us/
http://www.bultra.com/mtlinks.htm
http://www.reverso.net/text_translation.asp?lang=EN
http://www.promt.com/
http://www.freetranslation.com/
http://technology.timesonline.co.uk/tol/news/tech_and_web/personal_tech/article7017831.ece
http://www.translatum.gr/dics/machine-translation.htm
http://www.worldlingo.com/en/products_services/worldlingo_translator.html
http://www.foreignword.com/
http://www.word2word.com/mt.html
http://www.toggletext.com/
http://translation2.paralink.com/
http://www.apptek.com/index.php/product-demonstrations
http://www.lingvosoft.com/
http://www.lingo24.com/free-translation-online.html
http://www.translate4me.com/machine-translation.html
(READ: http://meta.wikimedia.org/wiki/Wikipedia_Machine_Translation_Project)
(READ: Evaluation; http://isl.ira.uka.de/fileadmin/publication-files/1_149.pdf)
(READ: Evaluate: http://www.globalwatchtower.com/2007/10/30/mt-shootout/)
http://free-translation.imtranslator.com/
http://translation.babylon.com/
http://translate.reference.com/
http://www.google.com.au/language_tools
(READ: http://www.ibm.com/developerworks/library/us-mt/)
(READ: http://www.h-online.com/open/news/item/EU-machine-translation-project912735.html)
http://wareseeker.com/free-fast-machine-translation/
(READ: https://webgate.ec.europa.eu/mt/ecmt/html/help_en.html;jsessionid=KvvPLfLJHz6LtVbn8gGbTJMq2t8fM9Gml0WvJmY0vMqP9vT17nh0!654224107)

************************************************

Evaluation of MT:

Around 1949, MT projects were launched first in the US, and soon thereafter in the USSR. They were motivated by the growing need for intelligence gathering, and they gave rise to the first MT screening systems. The goal of such systems is to produce large volumes of rough translations automatically, quickly and cheaply. The quality of the rough translations is not essential: the output is used to get an idea of the content. If the user wants a good translation of a part which looks interesting, the user simply asks a human translator (who will generally judge the machine output too poor to be worth revising).

There are various means for evaluating the performance of machine-translation systems. The oldest is the use of human judges to assess a translation's quality. Even though human evaluation is time-consuming, it is still the most reliable way to compare different systems such as rule-based and statistical systems. Automated means of evaluation include BLEU, NIST and METEOR.

Relying exclusively on unedited machine translation ignores the fact that communication in human language is context-embedded and that it takes a person to comprehend the context of the original text with a reasonable degree of probability. It is certainly true that even purely human-generated translations are prone to error. Therefore, to ensure that a machine-generated translation will be useful to a human being and that publishable-quality translation is achieved, such translations must be reviewed and edited by a human.
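To make the automated metrics named above more concrete: BLEU compares the word sequences (n-grams) in a machine translation with those in a human reference translation, and penalizes output that is too short. The Python sketch below is a rough illustration only, not the official BLEU implementation. The function names (ngrams, modified_precision, simple_bleu), the use of a single reference, plain whitespace tokenization and n-grams of length 1 and 2 are all assumptions made for this example; standard BLEU uses n-grams up to length 4 and is normally computed over a whole test corpus.

```python
from collections import Counter
import math


def ngrams(tokens, n):
    """Return all n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference.

    Each n-gram is credited at most as often as it occurs in the
    reference ("clipping"), so repeating a correct word does not help.
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


def simple_bleu(candidate, reference, max_n=2):
    """Simplified, sentence-level BLEU-style score between 0 and 1."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    precisions = [modified_precision(cand, ref, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions.
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: output shorter than the reference is penalized.
    if len(cand) >= len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))
    return geo_mean * brevity


reference = "time flies like an arrow"
print(simple_bleu("time flies like an arrow", reference))  # 1.0
print(simple_bleu("time flies as an arrow", reference))
```

A score of this kind only measures overlap with one particular human translation. A perfectly acceptable translation that uses different wording scores low, which is one reason human judges remain the more reliable means of evaluation.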
(PSU/CW 072 students translated: http://ar.wikipedia.org/wiki/%D8%AA%D8%B1%D8%AC%D9%85%D8%A9_%D8%A2%D9%84%D9%8A%D8%A9)

Assignment #3: Groups submitting feedback on MT sites (class work). March 9, 2011