PLIN019 – Machine Translation

Basic information
Introduction to Translation
Introduction to Machine Translation
Outline of Machine Translation History

Bibliography
- John Hutchins – Machine translation: past, present, future
- John Hutchins – An introduction to machine translation
- Philipp Koehn – Statistical Machine Translation
- Sergei Nirenburg et al. – Readings in Machine Translation
- Jiří Levý – Umění překladu
- Jiří Levý – České theorie překladu

Translation I
Translation: a transfer of a text from a source language to a target language.
Interpreting: oral translation of spoken language.
Translation is like a woman: either faithful or beautiful.

Translation II
- technical translation × literary translation
- exact reproduction × loose translational rephrasing

Maimonides, 12th century
The context is crucial for translation.

Werner Winter (1923–2010)
Each word is an element pulled out from a complex language system and its relations to other segments of the system differ in different languages. Each meaning (sense) is an element from a complex system of segments which a speaker divides reality into.

In Mojave language: a woman's father ≠ a man's father

Which properties of a source should be preserved? – J.
Levý

Translation (Levý)
- must reproduce
  - words of the original
  - ideas of the original
- should be able to be read as the original
- is to be read as a translation
- should
  - reflect style of the original
  - show translator's style
- should be read as a text falling into the period of
  - the original
  - the translator
- can add or skip something from the original
- should never add or skip something from the original

Translatology
- deals with translation of texts between languages and semiotic systems
- questions of accuracy (fidelity), translatability
- translation between cultural areas, various periods
- descriptive branch (criticism, history) × applied branch (practice)
- formed in the 60's–70's, linguistic orientation
- 80's – close to theory of literature
- 90's – focus turned to the translator him/herself

Translator
What a good translator should know (Levý):
- source language
- target language
- factual content of the text: facts of a period, the field (domain, for technical translation)

Levý on artistic translation
Translation should give the impression of a work of art.

Machine translation and artistic translation – Levý
The goal of machine translation is to break a sentence down into the simplest comparable elements; the goal of artistic translation is the opposite: to transfer the highest units.

Types of translations according to Roman Jakobson
- interlingual – transfer between different languages
- intralingual – transfer within a language, e.g. to a different dialect, to a standard language etc.
- intersemiotic – transfer between different semiotic systems (e.g. sign language)

Questions
- Is accurate translation between languages possible at all?
- What is easier: to translate from or to your mother tongue?
- How do we know that w1 is a translational equivalent of w2?
- English wind types: airstream, breeze, crosswind, dust devil, easterly, gale, gust, headwind, jet stream, mistral, monsoon, prevailing wind, sandstorm, sea breeze, sirocco, southwester, tailwind, tornado, trade wind, turbulence, twister, typhoon, whirlwind, wind, windstorm, zephyr
- How should we translate words like alkáč, večerníček, telka, čoklbuřt, knížečka, ČSSD . . . ?
- And what about: matka, macecha, mamka, máma, maminka, matička, máti, mama, mamča, mamina
- Navajo Code movie – language as a cipher

Linguistic relativity
- language properties substantially affect our view of the world
- properties of different languages differ significantly
- → their speakers live in different, incompatible worlds

Ludwig Wittgenstein
The limits of my language mean the limits of my world.

Fritz Mauthner
If Aristotle had spoken Chinese or Dakota he would have arrived at a totally different logic.

Linguistic relativity – dualism
- mould theories: language and thinking are the same, we think in our language
- cloak theories: language is on the surface; behind it lies a complex maze of thoughts

Where does linguistic relativity belong?
On Intelligence – Jeff Hawkins (see TED)
Le Ton beau de Marot – Douglas Hofstadter (Jabberwocky, Žvahlav, palindromes)

Sapir-Whorf hypothesis
- important theory of psycholinguistics
- language determines thought
- 30's of the 20th century, Edward Sapir; derived from linguistic relativity
- comparison of concepts in American-Indian and European languages
- elaborated by Benjamin Lee Whorf
- later criticized: tests of a falsifiable form of the hypothesis (concepts for colours) showed the opposite to be true

Machine Translation – definition
A discipline of computational linguistics dealing with the design, implementation and application of automatic systems (software) for translating texts with minimal human intervention.
For example, translating with the help of an electronic dictionary does not count as machine translation.
Machine translation – object of study
We consider only technical / specialized texts:
- web pages
- technical manuals
- scientific documents, papers
- leaflets, catalogues
- law texts
- in general: texts from narrow domains
Nuances on the various language levels in literary texts are out of scope of current MT systems.

Machine translation – problems
In practice, the output of MT is always revised. We distinguish pre-editing and post-editing. Sometimes this is necessary even for human translation.
MT systems make different types of errors than humans do.
These mistakes are typical of humans:
- wrong prepositions (I am in school)
- missing determiners (I saw man)
- wrong tense (Uviděl jsem – I was seeing), . . .
For computers, errors in meaning are typical: Kiss me honey. → Polib mi med.

Lexical choice
A choice of a proper translational equivalent:
- homonymy – pila, baby, ženu; byte, ate
- polysemy – take, run, line; klíč, kohout, mít
- synonymy – kluk, chlapec, hoch; dívka, holka, děvče

Word order I

Word order II – free word order
Word order rule
The more morphologically rich a language is, the freer its word order.

Katka snědla kousek koláče.
- Kati megevett egy szelet tortát → Katie eating a piece of cake
- Egy szelet tortát Kati evett meg → Katie ate a piece of cake
- Kati egy szelet tortát evett meg → Katie ate a piece of cake
- Egy szelet tortát evett meg Kati → Katie ate a piece of cake
- Megevett egy szelet tortát Kati → Katie eating a piece of cake
- Megevett Kati egy szelet tortát → Katie ate a piece of cake

Direct methods for improving MT quality
- limit input to a:
  - sublanguage (indicative sentences)
  - domain (informatics)
  - document type (patents)
- text pre-processing (e.g. manual syntactic analysis)

Basic terms
- accuracy (precision)
- intelligibility
- fluency
- source language, SL, L1
- target language, TL, L2
- corpus, corpora
- ambiguity, polysemy
- ...
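
The lexical-choice error quoted above (Kiss me honey. → Polib mi med.) can be reproduced with a minimal sketch of word-for-word translation. Everything in it is an assumption made for illustration: the three-word lexicon, the ordering of the candidates and the function names are hypothetical and do not describe any real MT system.

```python
# Toy English-Czech lexicon (hypothetical). Each word maps to a list of
# candidate equivalents; the ordering is an assumption of this sketch.
LEXICON = {
    "kiss": ["polib"],
    "me": ["mi"],
    "honey": ["med", "miláčku"],  # food sense first, term of endearment second
}


def tokenize(sentence):
    """Lowercase the sentence, drop final punctuation and split on spaces."""
    return sentence.lower().rstrip(".!?").split()


def translate_word_for_word(sentence, lexicon=LEXICON):
    """Pick the first listed equivalent of every word, ignoring context."""
    return " ".join(lexicon.get(word, [word])[0] for word in tokenize(sentence))


def ambiguous_words(sentence, lexicon=LEXICON):
    """Return the words for which the lexicon offers more than one equivalent."""
    return [w for w in tokenize(sentence) if len(lexicon.get(w, [])) > 1]


if __name__ == "__main__":
    print(translate_word_for_word("Kiss me honey."))
    print(ambiguous_words("Kiss me honey."))
```

The sketch always takes the first candidate, so it selects the food sense of "honey" regardless of the sentence. Choosing "miláčku" instead would require information about the context, which is exactly what the lexical choice slide is about.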
Classification based on approach
- rule-based, knowledge-based – RBMT, KBMT
  - transfer
  - with interlingua
- statistical machine translation – SMT
- hybrid machine translation – HMT, HyTran

Vauquois's triangle
[Figure: Vauquois's triangle. Levels from base to apex: direct, syntactic transfer, semantic transfer, interlingua. Ascending side: source language analysis. Descending side: target language generation.]

Classification based on interaction with a user
- (human, manual translation)
- machine-aided human translation – MAHT
- human-aided machine translation – HAMT
- fully automated high-quality (M)T – FAHQT
HAMT and MAHT together: CAT – computer-aided translation.

Classification according to direction and arity
Arity:
- bilingual systems
- multilingual systems
Direction:
- unidirectional
- bidirectional

Systems of Machine Translation
- Apertium (RBMT, open-source)
- Babelfish (Yahoo)
- Caitra (CAT system)
- ČESILKO (Czech-Slovak translation)
- EuroTra (ambitious EC project)
- Google Translate
- Logos (OpenLogos, one of the oldest MT systems)
- METEO (translation of weather forecasts, English, French)
- Moses (open-source MT system)
- Pangloss (example-based MT)
- Rosetta (contains a logic analysis of propositions)
- Systran (one of the oldest MT systems)
- Trados (translation memory, CAT system)
- Verbmobil (translation of speech↔speech among German, English and Japanese)
- matecat (open-source online CAT system)
- . . .
Conferences, workshops, institutions
- ACL – Annual meetings of the Association for Computational Linguistics
- NIST – National Institute of Standards and Technology
- Translating and the Computer (London)
- RANLP – Recent Advances in Natural Language Processing
- Workshop on Machine Translation (WMT)
- The Conference of the Association for Machine Translation in the Americas
- LREC – Language Resources and Evaluation Conferences
- www.wikicfp.com

(Electronic) resources
- links on PLIN019 web page
- MT Archive
- www.statmt.org
- ACL Anthology

Institutions
- IAMT – International Association for Machine Translation:
  - EAMT – European Association for Machine Translation
  - AMTA – The Association for MT in the Americas
  - AAMT – The Asian-Pacific Association for MT
- META-NET – unites European MT departments
- British Computer Society Natural Language Translation Group
- UK MFF ÚFAL
- Obec překladatelů (literary translators)
- Jednota tlumočníků a překladatelů
- Ústav translatologie, FF UK

Motivations for MT
- period of information boom
  - 1922 – regular BBC radio broadcast
  - 1923 – radio broadcast in CR
  - 1936 – regular BBC TV broadcast
  - 1953 – TV broadcast in CR
- computer development
  - generation zero – Z1–3, Colossus, ABC, Mark I,II
  - first generation – ENIAC (Electronic Numerical Integrator And Computer, 1945), MANIAC
In 1947 RAM could store 100 numbers and a + b took 1/8 s!

Early MT beliefs
- translation is a repetitive activity – it was believed that computers could take it over
- computers were successful in deciphering war codes: would they be useful also for MT?

Warren Weaver
When I look at an article in Russian, I say: This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.

First impulses
In 1949 Weaver sent a memorandum to 200 addressees in which he outlined some problems of MT.
- polysemy (ambiguity) is a common phenomenon
- intersection of logic and language
- connections with cryptography
- universal properties of languages
Early interest in MT arose at several departments: first at the University of London (Andrew D. Booth), soon after at MIT, University of Washington, University of California, Harvard, . . .

Topics and first exchanges of experience
- morphologic and syntactic analysis
- meaning and knowledge representation
- creating and working with electronic dictionaries
- 1952 – first public conference at MIT
- 1954 – first showcase of a working MT system

Alan Turing
Test
Using language as humans do is a sufficient operational test for intelligence.

Georgetown experiment
The first working prototype of MT.
- IBM, New York
- first public demonstration of MT
- a computer applied to a non-numerical task
- over 60 sentences (probably carefully selected)
- a dictionary with 250 words
- from Russian to English
- grammar for Russian contained 6 rules
The experiment provoked enthusiasm. MT was obviously possible (despite being presented fraudulently). Many new projects arose afterwards, mainly in the USA and Russia.
Progress in 50's
- MT provoked development in these fields:
  - theoretical linguistics (Chomsky)
  - computational linguistics
  - artificial intelligence
- with higher coverage, the quality of MT decreased
- even the best systems (GAT, Georgetown, Ru→En) provided unsatisfying results
- generating random love poems (1952)

Progress in 50's
- a first PhD thesis on MT defended (1954)
- Journal of Machine Translation (1954)
- First international MT conference held in London (1956)
- Noam Chomsky: Syntactic Structures (1957)
- MT research in USSR, Japan
- first book about MT (Introduction), Paris (1959)

60's, Disappointments from poor results
- despite rather poor results, optimism prevailed
- Yehoshua Bar-Hillel wrote a critique of the state of MT in 1959
- he claimed computers are not capable of lexical disambiguation
- fully automated high-quality translation (FAHQT) unreachable

Yehoshua Bar-Hillel – an example for disambiguation
Little John was looking for his toy box. Finally, he found it. The box was in the pen. John was very happy.

Funding for MT projects began to decrease.

Progress in 60's
- MT in USSR focused on En scientific paper abstracts
- Association for MT in USA (1962)
- Peter Toma leaves Georgetown MT, develops AUTOTRAN, later Systran

ALPAC report, 1966
- Automatic Language Processing Advisory Committee
- an institution under the U.S.
National Academy of Sciences
- it carried out analyses and evaluations of MT quality and usability
- recommended reducing expenditure on MT support
- negative impact on MT as a scientific field
- the problem was a strong underestimation of the complexity of natural language understanding
- MT development continued uninterrupted in Europe, the USSR and Japan
- it took MT in the USA another 15 years to regain its previous respect and status

TAUM, METEO
TAUM
- Traduction Automatique à l'Université de Montréal
- Université de Montréal in 1965
- prototypes of MT systems: TAUM-73, TAUM-METEO
- first MT systems incorporating analysis of SL and synthesis of TL
- EN → FR
- TAUM Aviation (cancelled)
METEO
- 1981–2001 used for weather forecast translation
- author John Chandioux, Canada

Systran
- one of the oldest MT companies (1968)
- very popular translation system
- basis for Yahoo Babelfish
- until 2007 used even by Google
- RBMT; since 2010 hybrid translation
- from 1976 the official MT system used by the EEC

Renaissance – 70's
- First Soviet MT program: AMPAR (En→Ru)
- Systran installed at EC (1978)
- Xerox uses Systran
- a project proposes using Esperanto as interlingua (rejected)

Renaissance – 80's
- development of rule-based systems with interlingua
- Rosetta project started (1980, logical interlingua)
- first data-driven systems (Example-based MT)
- boom of commercial MT systems
- EUROTRA project (EU funded) began
- IBM introduces 8-bit ASCII (1983)
- Trados – the first company to develop CAT, Stuttgart
- Unicode project (1987)
- World Wide Web proposal (1989)

Renaissance – 90's
- research on statistical MT (IBM)
- SDL (CAT market leader) founded in UK (1992)
- Verbmobil project (1992–99)
- rule-based systems still dominating the field
- AltaVista Babelfish (1997), 500,000 requests/day
- first commercial online MT service: iTranslator

Renaissance – 00's
- statistical systems dominate the field
- quality of rule-based systems improved by statistical methods (hybrid systems)
- new
translation pairs
- NIST launches first round of MT system benchmarking (2001)
- EuroMatrix – a large scale EC funded project (2006)
- Moses (open source statistical MT engine, 2007)

Too optimistic prognosis

Machine Translation nowadays I
- unprecedented computational power, data structures
- enabled work with billions of words instantly
- Google 1 PB sort (2008)
  - 10 trillion 100 B records
  - 6 hours; 4,000 PCs; 48,000 discs
  - MapReduce technique
- Google Ngrams
- development of MT systems for everyone
- number of parallel corpora steadily increasing
- focus on under-resourced languages (LREC)
- MT quality improves slowly but steadily

Machine Translation nowadays II
- SMT rules
- intensive acquisition of parallel (and comparable) data
- development of MT systems based on evaluation metric outputs
- USA: interest mainly in English as TL
- EU: translation between the 23 official languages of the EU (EuroMatrix): English, Bulgarian, Czech, Danish, Estonian, Finnish, French, Irish, Italian, Lithuanian, Latvian, Hungarian, Maltese, German, Dutch, Polish, Portuguese, Romanian, Greek, Slovak, Slovene, Spanish and Swedish.

Machine translation nowadays III
- big companies (Microsoft) focused on English as SL
- large pairs (En↔Sp, En↔Fr): very good translation quality
- SMT enriched with syntax
- Google Translate as a gold standard
- morphologically rich languages neglected
- En↔* and *↔En pairs prevail

Motivation in 21st century
- translation of web pages for gisting (grasping the main message)
- methods for speeding up human translation substantially (translation memories)
- cross-language extraction of facts and search for information
- instant translation of e-communication
- translation on mobile devices

Conclusion
- MT is among the AI-complete problems
- immense computational power at our disposal
- commercial (market) potential is bigger than ever
- there is always something to improve in MT
- statistical methods seem to be more convenient (fast, cheap)
- new ideas most welcome! (theses)
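
The slides on machine translation nowadays mention the MapReduce technique and n-gram counts. Below is a minimal single-machine sketch of the idea, assuming whitespace tokenization and a tiny in-memory corpus. The function names are hypothetical, and the code only imitates the map and reduce phases; it is not Google's implementation.

```python
from collections import Counter
from functools import reduce


def map_ngrams(sentence, n=2):
    """Map step: count the n-grams of a single sentence."""
    tokens = sentence.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def reduce_counts(left, right):
    """Reduce step: merge two partial counts by summing them per n-gram."""
    return left + right


def count_ngrams(sentences, n=2):
    """Run the map step on every sentence and fold the partial results."""
    partial_counts = (map_ngrams(sentence, n) for sentence in sentences)
    return reduce(reduce_counts, partial_counts, Counter())


if __name__ == "__main__":
    corpus = ["the box was in the pen", "the box was empty"]
    for ngram, count in count_ngrams(corpus).most_common(3):
        print(" ".join(ngram), count)
```

Because each sentence is mapped independently and the merge is associative, the map calls could in principle run on different machines and be combined in any order. That property is what makes counting over very large corpora feasible.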