Machine Translation 2011/March

What Is Machine Translation?
http://www.diplomacy.edu/language/Translation/machine.htm
Machine translation (MT) is the use of computer software to translate text or
speech from one natural language into another. Like translation done by
humans, MT involves not simply substituting words in one language for those of
another, but applying complex linguistic knowledge: morphology (how
words are built from smaller units of meaning), syntax (grammar), semantics
(meaning), and an understanding of phenomena such as ambiguity.
Research and development of machine translation has been going on since the
1950s, engaging some of the best minds in computing, linguistics and artificial
intelligence. Steve Silberman writes:
The dream of translation by computer is older than the high tech industry itself.
Before email, before word processing, before command-line interfaces,
machine translation - or MT - was one of the first two computer applications
designed to act upon words instead of numbers (the other was code
breaking)…But it turns out that really good MT is so hard to pull off that the task
exhausted the top-end computing resources of every generation attempting it.
Regardless, machine translation R&D (research and development) is going
stronger than ever, fired up by the globalization of the Net. Today, all over the
world, software designers, programmers, hardware engineers, neural-network
experts, AI specialists, linguists, and cognitive scientists are enlisted in the effort
to teach computers how to port words and ideas from language to language.
("Hello, World," Wired, May 2000)
As our environment becomes more networked and connected
internationally, the call for MT increases. Researchers predict that in
the very near future English will no longer be the mother tongue of
the majority of Internet users.
Already the amount of material needed in different language versions is
too vast for human translation alone, according to Systran, one of the
oldest machine translation companies. MT is a long way from being able
to replace human translation, and many experts feel it may never do so.
But it can reduce the amount of work for human translators by taking
over translations where accuracy is not essential, and by assisting
humans with more important translation jobs.
MT offers some real advantages: according to Systran, MT is much faster
than human translation (a human can translate 2,000 to 3,000 words a day,
while Systran’s MT software can translate 3,700 words a minute). MT is much
cheaper than human translation. MT software has a better memory than
human translators: it can store translated documents and re-use phrases
that have already been translated.
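The idea of re-using phrases that have already been translated can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class TranslationMemory, the function translate_with_memory and the translate_fn argument are hypothetical names invented here, not part of Systran's software, and only exact matches (ignoring case and extra spaces) are re-used.

```python
class TranslationMemory:
    """Toy translation memory: stores segment pairs and re-uses exact matches."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _normalize(segment):
        # Collapse whitespace and ignore case so trivial variants still match.
        return " ".join(segment.lower().split())

    def add(self, source, target):
        self._store[self._normalize(source)] = target

    def lookup(self, source):
        # Returns None when the segment has not been translated before.
        return self._store.get(self._normalize(source))


def translate_with_memory(segments, memory, translate_fn):
    """Re-use stored translations; call translate_fn only for new segments.

    translate_fn is a placeholder for whatever MT engine or human
    translator produces a translation for a previously unseen segment.
    """
    results = []
    for segment in segments:
        target = memory.lookup(segment)
        if target is None:
            target = translate_fn(segment)
            memory.add(segment, target)
        results.append(target)
    return results
```

In a sketch like this, a sentence repeated throughout a maintenance manual is translated only once; commercial tools typically also offer approximate ("fuzzy") matching, which is not shown here.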
The accuracy of MT is much lower than competent human translation, but
can be improved in certain ways – for example, by ensuring that spelling
and punctuation are all correct in the original text.
When used in conjunction with human translators, to provide a first draft
which is then given to a human for polishing, MT can save time and money.
The following resources offer a good general introduction to machine translation:
- Steve Silberman, "Talking to Strangers," Wired, May 2000. A good history of the
  conception, development and current state of machine translation.
- "Machine Translation’s Past and Future," Wired, May 2000. A timeline of the history and
  future of machine translation.
- "Universal Translators," Wired, May 2000. A listing of machine translation research and
  development hubs worldwide.
- D.J. Arnold, Lorna Balkan, Siety Meijer, R. Lee Humphreys and Louisa Sadler, Machine
  Translation: an Introductory Guide, London: Blackwells-NCC, 1994. A comprehensive
  book about machine translation, available online.
- Links on MT. Research centers, products and software.
- Links on MT. Research centers, companies and articles online.
Machine translation has proved useful in two fields primarily: as an aid for
human translators, and for translating material on a restricted subject
matter. First, as an aid for human translators working on material which must be accurately
translated, MT can save time by producing a first draft. Second, MT can produce fairly accurate
translations when the domain of discourse is highly restricted: when syntax is simplified,
vocabulary is predictable and each word is likely to mean one and only one thing, as in technical
documents, equipment maintenance manuals and weather reports. “The classic example of MT
that works is the Météo system, developed in Montreal, which has been translating Canada's
weather bulletins between English and French on a daily basis since 1977. In the world of Météo
discourse, ‘front’ always means a weather system. The translation of forecasts was so boring that
before Météo took over, the Canadian government had a hard time keeping translators on the job
for more than a couple of months.” (Steve Silberman, "Talking to Strangers," Wired, May 2000)
Machine translation provides fast but potentially error-prone text translations.
http://www-01.ibm.com/software/globalization/topics/machinetranslation/index.jsp

John Hutchins
“For many years, MT with human assistance has been a cost-effective option for
multinational corporations and other multilingual bodies (e.g. the European
Union). MT systems produce rough translations which are then revised
(post-edited) by translators. But post-editing to an acceptable quality can be expensive,
and many organizations reduce costs and improve MT output by the use of
‘controlled’ languages. In this way, translation processes are closely linked to
technical writing and integrated in the whole documentation workflow, making
possible further savings in time and costs.
It is widely agreed that where translation has to be of publishable quality, both human
translation and MT have their roles. Machine translation is demonstrably cost-effective
for large scale and/or rapid translation of technical documentation and software
localization materials.”
(http://www.hutchinsweb.me.uk/main.htm)
Problems with machine translation
1- Machine translation works quite well for translating predictable technical texts – texts
which never go beyond the expected domain of discourse. But this is little help in the
domains where people want translation the most: for spontaneous conversations, in
person, on the telephone, and on the Internet.
2- Computers simply do not have the ability to deal adequately with the various
complexities of language that humans handle naturally: ambiguity, syntactic
irregularity, multiple word meanings and the influence of context. A classic
example is illustrated in the following pair of sentences:
Time flies like an arrow.
Fruit flies like an apple.
The sentence construction is parallel, but the meanings are entirely different: the first is a figure of
speech (a simile) and the second is a literal description. And the identical words in the
sentences - flies and like - belong to different grammatical categories: in the first sentence flies is a
verb and like a preposition, while in the second flies is part of a noun phrase and like is a verb.
A computer can be programmed to understand either of these examples, but not to distinguish between them.
3- Computers not only lack the knowledge of the world to deal with word choice,
but they also lack the knowledge necessary for cultural sensitivity. Melby writes
that translation needs to be “sensitive to total context, including the intended audience of the
translation. Meaning is not some abstract object that is independent of people and culture.” As an
example of the damage that can be done by culturally ignorant and insensitive translation, even
by humans, he describes his investigation of the translation of a remark made by Nikita
Khrushchev in Moscow on November 19, 1956:
Khrushchev was then the head of the Soviet Union and had just given a speech on the
Suez Canal crisis. Nassar of Egypt threatened to deny passage through the canal. The
United States and France moved to occupy the canal. Khrushchev complained loudly
about the West. Then, after the speech, Khrushchev made an off-hand remark to a
diplomat in the back room. That remark was translated “We will bury you” and was
burned into the minds of my generation as a warning that the Russians would invade the
United States and kill us all if they thought they had a chance of winning…Several
months ago, I became curious to find out what Russian words were spoken by
Khrushchev and whether they were translated appropriately…In Soviet Communist
rhetoric, it is common to claim that history is on the side of Communism, referring back to
Marx who argued that Communism was historically inevitable. Khrushchev then added
that Communism does not need to go to war to destroy Capitalism. Continuing with the
thought that Communism is a superior system and that Capitalism will self-destruct, he
said, rather than what was reported by the press, something along the lines of ‘Whether
you like it or not, we will be present at your burial,’ clearly meaning that he was predicting
that Communism would outlast Capitalism. Although the words used by Khrushchev
could be literally translated as “We will bury you,” (and, unfortunately, were translated
that way) we have already seen that the context must be taken into consideration. The
English translator who did not take into account the context of the remark, but instead
assumed that the Russian word for “bury” could only be translated one way,
unnecessarily raised tensions between the United States and the Soviet Union and
perhaps needlessly prolonged the Cold War. ("Why Can’t a Computer Translate More
Like a Person?")
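
The "Time flies like an arrow" / "Fruit flies like an apple" pair discussed under problem 2 can be made concrete with a small Python sketch. The word list LEXICON and its category labels are a hand-written assumption for this illustration, not taken from any real MT system. The function simply lists every combination of categories the word list allows, which shows that looking words up one by one leaves several readings open for each sentence.

```python
from itertools import product

# Hypothetical mini-lexicon: each word maps to the categories it may take.
LEXICON = {
    "time": {"NOUN", "VERB"},
    "fruit": {"NOUN"},
    "flies": {"NOUN", "VERB"},
    "like": {"VERB", "PREP"},
    "an": {"DET"},
    "arrow": {"NOUN"},
    "apple": {"NOUN"},
}


def candidate_taggings(sentence):
    """Return every word/category assignment the lexicon permits."""
    words = sentence.lower().rstrip(".").split()
    options = [sorted(LEXICON[word]) for word in words]
    return [list(zip(words, tags)) for tags in product(*options)]


for text in ("Time flies like an arrow.", "Fruit flies like an apple."):
    readings = candidate_taggings(text)
    print(text, "->", len(readings), "candidate readings")
```

Choosing the intended reading among these candidates is the step that needs context and knowledge of the world, for example that fruit flies exist and "time flies" (as insects) do not.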

Examine the following MT systems and report your feedback:
(Which languages are covered? Is the service free or does it need a subscription? How good is the output?)
http://my.ajeeb.com/
http://www.aramedia.com/aschome.htm
http://www.almisbar.com/
http://www.systranet.com/
http://www.systran.co.uk/
http://babelfish.yahoo.com/
http://www.languageweaver.com/page/home/
http://www.trident.com.ua/us/
http://www.bultra.com/mtlinks.htm
http://www.reverso.net/text_translation.asp?lang=EN
http://www.promt.com/
http://www.freetranslation.com/
http://technology.timesonline.co.uk/tol/news/tech_and_web/personal_tech/article7017831.ece
http://www.translatum.gr/dics/machine-translation.htm
http://www.worldlingo.com/en/products_services/worldlingo_translator.html
http://www.foreignword.com/
http://www.word2word.com/mt.html
http://www.toggletext.com/
http://translation2.paralink.com/
http://www.apptek.com/index.php/product-demonstrations
http://www.lingvosoft.com/
http://www.lingo24.com/free-translation-online.html
http://www.translate4me.com/machine-translation.html
(READ: http://meta.wikimedia.org/wiki/Wikipedia_Machine_Translation_Project)
(READ: Evaluation: http://isl.ira.uka.de/fileadmin/publication-files/1_149.pdf)
(READ: Evaluation: http://www.globalwatchtower.com/2007/10/30/mt-shootout/)
http://free-translation.imtranslator.com/
http://translation.babylon.com/
http://translate.reference.com/
http://www.google.com.au/language_tools
(READ: http://www.ibm.com/developerworks/library/us-mt/)
(READ: http://www.h-online.com/open/news/item/EU-machine-translation-project912735.html)
http://wareseeker.com/free-fast-machine-translation/
(READ: https://webgate.ec.europa.eu/mt/ecmt/html/help_en.html;jsessionid=KvvPLfLJHz6LtVbn8gGbTJMq2t8fM9Gml0WvJmY0vMqP9vT17nh0!654224107)
************************************************
Evaluation of MT systems:
Around 1949, MT projects were launched first in the US, and soon thereafter in the
USSR. They were motivated by the growing need for intelligence gathering and gave
rise to the first MT screening systems. The goal of such systems is to produce
large volumes of rough translations automatically, quickly and cheaply. The quality of
the rough translations is not essential: the output is used to get an idea of
the content. If the user wants a good translation of a part which looks interesting, he or she
simply asks a human translator (who will generally judge the machine output too
poor to be worth revising).
There are various means for evaluating the performance of machine-translation systems.
The oldest is the use of human judges to assess a translation's quality. Even though
human evaluation is time-consuming, it is still the most reliable way to compare different
systems such as rule-based and statistical systems. Automated means of evaluation
include BLEU, NIST and METEOR.
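
To give a rough idea of what an automated measure such as BLEU computes, here is a simplified, BLEU-style score in Python. It is a sketch under stated assumptions, not the official metric: the function names are invented for this illustration, only one reference translation is used, word sequences (n-grams) are counted only up to length two, and a single sentence is scored. Real BLEU is usually computed over a whole test corpus, with n-grams up to length four and possibly several reference translations.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-word sequences in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Share of candidate n-grams also found in the reference.

    Each n-gram is credited at most as often as it occurs in the
    reference ("clipping"), so repeating a correct word does not help.
    """
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu_like(candidate, reference, max_n=2):
    """Geometric mean of the n-gram precisions times a brevity penalty."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    precisions = [modified_precision(cand, ref, n) for n in range(1, max_n + 1)]
    if not cand or min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Penalise machine output that is shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_mean)
```

A score of this kind only measures word overlap with a human reference translation; it cannot judge whether a differently worded translation is equally good, which is one reason human evaluation remains the more reliable comparison.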
Relying exclusively on unedited machine translation ignores the fact that communication
in human language is context-embedded and that it takes a person to comprehend the
context of the original text with a reasonable degree of probability. It is certainly true
that even purely human-generated translations are prone to error. Therefore, to
ensure that a machine-generated translation will be useful to a human being and that
publishable-quality translation is achieved, such translations must be reviewed and edited
by a human.

(PSU/CW 072 students translated:
http://ar.wikipedia.org/wiki/%D8%AA%D8%B1%D8%AC%D9%85%D8%A9_%D8%A2%D9%84%D9%8A%D8%A9)

Assignment #3:
Groups submitting feedback on MT sites. (Class work.)
March 9, 2011