Invited Lecture
CS 4705: Introduction to Natural Language Processing, Fall 2004

Machine Translation: Challenges and Approaches
Nizar Habash
Post-doctoral Fellow, Center for Computational Learning Systems, Columbia University

Sounds like Faulkner?
• "It lay on the table a candle burning at each corner upon the envelope tied in a soiled pink garter two artificial flowers."
  – William Faulkner, "The Sound and the Fury"
• "Not hit a man in glasses."
  – Machine Translation
• "It was once a shade, which was in all beautiful weather under a tree and varied like the branches in the wind."
  – Machine translation (by Systran) of Helmut Wördemann, "Der unzufriedene Schatten" ("The Discontented Shadow"): "Es war einmal ein Schatten, der lag bei jedem schönen Wetter unter einem Baum und schwankte wie die Zweige im Wind."
• http://www.ee.ucla.edu/~simkin/sounds_like_faulkner.html

Progress in MT: Statistical MT Example
• 2002 system output: "insistent Wednesday may recurred her trips to Libya tomorrow for flying. Cairo 6-4 (AFP) - an official announced today in the Egyptian lines company for flying Tuesday is a company 'insistent for flying' may resumed a consideration of a day Wednesday tomorrow her trips to Libya of Security Council decision trace international the imposed ban comment."
• 2003 system output: "Egyptair Has Tomorrow to Resume Its Flights to Libya. Cairo 4-6 (AFP) said an official at the Egyptian Aviation Company today that the company egyptair may resume as of tomorrow, Wednesday, its flights to Libya after the International Security Council resolution to the suspension of the embargo imposed on Libya."
• Human translation: "Egypt Air May Resume its Flights to Libya Tomorrow. Cairo, April 6 (AFP) - An Egypt Air official announced, on Tuesday, that Egypt Air will resume its flights to Libya as of tomorrow, Wednesday, after the UN Security Council had announced the suspension of the embargo imposed on Libya."
(From a talk by Charles Wayne, DARPA)

Road Map
• Why Machine Translation (MT)?
• Multilingual Challenges for MT
• MT Approaches
• MT Evaluation

Why (Machine) Translation?
• Languages in the world
  – 6,800 living languages
  – 600 with a written tradition
  – 95% of the world's population speaks 100 languages
• Translation market
  – $8 billion global market
  – Doubling every five years
  – (Donald Barabé, invited talk, MT Summit 2003)

Why Machine Translation?
• Full translation
  – Domain-specific, e.g. weather reports
• Machine-aided translation
  – Translation dictionaries
  – Translation memories
  – Requires post-editing
• Cross-lingual NLP applications
  – Cross-language IR
  – Cross-language summarization

Road Map
• Why Machine Translation (MT)?
• Multilingual Challenges for MT
  – Orthographic variations
  – Lexical ambiguity
  – Morphological variations
  – Translation divergences
• MT Paradigms
• MT Evaluation

Multilingual Challenges
• Orthographic variations
  – Ambiguous spelling: Arabic كتب الاولاد اشعارا ("the boys wrote poems") may be written with or without diacritics
  – Ambiguous word boundaries
• Lexical ambiguity
  – English "bank": Arabic بنك (financial) vs. ضفة (river)
  – English "eat": German essen (human) vs. fressen (animal)
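The "bank" example above can be made concrete with a toy context-based lexical chooser. Everything here (the cue-word lists and the helper name `translate_bank`) is invented purely for illustration; real MT systems use far richer context models.

```python
# Toy sense-based lexical choice for "bank": Arabic uses different
# words for the two senses, so context must pick the translation.
# Cue-word lists below are invented for illustration only.
SENSES = {
    "financial": {"translation": "بنك", "cues": {"money", "account", "loan"}},
    "river":     {"translation": "ضفة", "cues": {"river", "water", "shore"}},
}

def translate_bank(sentence):
    words = set(sentence.lower().split())
    # pick the sense whose cue words overlap the sentence the most
    best = max(SENSES.values(), key=lambda s: len(s["cues"] & words))
    return best["translation"]

print(translate_bank("he opened an account at the bank"))   # بنك
print(translate_bank("they sat on the bank of the river"))  # ضفة
```

Even this caricature shows why word-for-word dictionaries are not enough: the right target word depends on the source word's neighbors.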
Multilingual Challenges: Morphological Variations
• Affixation (English) vs. root+pattern (Arabic):
  – write → written: كتب → مكتوب
  – kill → killed: قتل → مقتول
  – do → done: فعل → مفعول
• Tokenization
  – English: "and the cars" (conjunction + article + plural noun)
  – French: "et les voitures"
  – Arabic: one word, والسيارات, tokenized as w+ Al+ SyArAt

Multilingual Challenges: Translation Divergences
• How languages map semantics to syntax
• Appear in 35% of sentences in the TREC El Norte corpus (Dorr et al., 2002)
• Divergence types:
  – Categorial (X tener hambre ↔ X be hungry) [98%]
  – Conflational (X dar puñaladas a Z ↔ X stab Z) [83%]
  – Structural (X entrar en Y ↔ X enter Y) [35%]
  – Head swapping (X cruzar Y nadando ↔ X swim across Y) [8%]
  – Thematic (X gustar a Y ↔ Y like X) [6%]

Translation Divergences: Conflation
• English: "I am not here"
• French: "Je ne suis pas ici" (I not be not here)
• Arabic: لست هنا (I-am-not here); the verb and the negation conflate into a single word

Translation Divergences: Categorial, Thematic and Structural
• English: "I am cold"
• Spanish: "tengo frío" (I-have cold)
• Hebrew: קר לי (cold for-me)
• Arabic: انا بردان (I cold)

Translation Divergences: Head Swap and Categorial
• English: "I swam across the river quickly"
• Arabic: اسرعت عبور النهر سباحة (I-sped crossing the-river swimming)
• Hebrew: חציתי את הנהר בשחיה במהירות (I-crossed obj river in-swim speedily)

Translation Divergences: Orthography + Morphology + Syntax
• English: "mom's car" (car possessed-by mom)
• Chinese: 妈妈的车 (mama de che)
• Arabic: سيارة ماما (sayyArat mama)
• French: "la voiture de maman"
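The tokenization challenge above showed Arabic والسيارات splitting into w+ Al+ SyArAt. A toy segmenter over Buckwalter-style transliteration illustrates the idea; the length heuristics are invented, and real Arabic tokenizers are far more involved (they must handle suffixes, ambiguity, and normalization).

```python
# Toy Arabic clitic segmenter over Buckwalter-style transliteration.
# Strips a conjunction proclitic (w/f) and the definite article (Al).
# Length guards (invented heuristics) avoid stripping short stems.
CONJ = ("w", "f")
ARTICLE = "Al"

def segment(token):
    parts = []
    if token[:1] in CONJ and len(token) > 3:
        parts.append(token[0] + "+")
        token = token[1:]
    if token.startswith(ARTICLE) and len(token) > 4:
        parts.append(ARTICLE + "+")
        token = token[2:]
    parts.append(token)
    return parts

print(segment("wAlSyArAt"))  # ['w+', 'Al+', 'SyArAt']  ("and the cars")
print(segment("ktb"))        # ['ktb']  (too short to strip anything)
```

This is exactly the mismatch the slide points at: one Arabic orthographic word corresponds to three English (or French) words.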
Road Map
• Why Machine Translation (MT)?
• Multilingual Challenges for MT
• MT Approaches
  – Gisting / Transfer / Interlingua
  – Statistical / Symbolic / Hybrid
  – Practical considerations
• MT Evaluation

MT Approaches: MT Pyramid
• Source side: word → syntax → meaning (analysis); target side: meaning → syntax → word (generation)
• Gisting maps source words directly to target words; transfer maps at the syntax level; interlingua maps at the meaning level

MT Approaches: Gisting Example
• Spanish: "Sobre la base de dichas experiencias se estableció en 1988 una metodología."
• Word-for-word gist: "Envelope her basis out speak experiences them settle at 1988 one methodology."
• Human translation: "On the basis of these experiences, a methodology was arrived at in 1988."

MT Approaches: Transfer Example
• A transfer lexicon maps source-language structure to target-language structure:
  – Spanish: poner (:subj X) (:obj mantequilla) (:mod en (:obj Y))
  – English: butter (:subj X) (:obj Y)
  – "X puso mantequilla en Y" ↔ "X buttered Y"

MT Approaches: Interlingua Example
• Lexical Conceptual Structure (Dorr, 1993)

MT Approaches: Resources Needed at Each Level of the Pyramid
• Interlingua: interlingual lexicons
• Transfer: transfer lexicons
• Word level: dictionaries / parallel corpora

MT Approaches: Statistical vs. Symbolic
• Both statistical and symbolic techniques can operate at each level of the pyramid

MT Approaches: Noisy Channel Model
• (Portions from http://www.clsp.jhu.edu/ws03/preworkshop/lecture_yamada.pdf)

MT Approaches: IBM Model (Word-based Model)
• (http://www.clsp.jhu.edu/ws03/preworkshop/lecture_yamada.pdf)
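The noisy-channel slides referenced above come from Yamada's lecture notes; the core formulation (standard in statistical MT, though not spelled out in the slide text) is to pick the English sentence ê = argmax_e P(e) · P(f|e), where P(e) is a language model scoring fluency and P(f|e) a translation model scoring fidelity. A minimal sketch, with all probabilities invented for illustration:

```python
# Toy noisy-channel decoder: ê = argmax_e  P(e) * P(f|e).
# Both tables below are made-up numbers for illustration only.
lm = {  # language model P(e): how fluent is the English candidate?
    "the house is small": 0.30,
    "small the is house": 0.01,
    "the home is little": 0.20,
}
tm = {  # translation model P(f|e): fidelity to "la casa es pequeña"
    "the house is small": 0.25,
    "small the is house": 0.25,
    "the home is little": 0.10,
}

def decode(candidates):
    # rank candidates by the product of fluency and fidelity scores
    return max(candidates, key=lambda e: lm[e] * tm[e])

print(decode(list(lm)))  # "the house is small"
```

Note how the word salad "small the is house" is killed by the language model even though the translation model likes it, while "the home is little" loses on fidelity: the two models divide the labor.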
MT Approaches: Statistical vs. Symbolic vs. Hybrid
• Hybrid systems combine statistical and symbolic components at different levels of the pyramid

MT Approaches: Hybrid Example: GHMT
• Generation-Heavy Hybrid Machine Translation
• Lexical transfer but NO structural transfer
• "Maria puso la mantequilla en el pan."
  – poner → lay, locate, place, put, render, set, stand
  – mantequilla → butter, bilberry
  – en → on, in, into, at
  – pan → bread, loaf

MT Approaches: Hybrid Example: GHMT
• LCS-driven expansion; conflation example:
  – PUT_V [CAUSE GO] (Agent: MARIA, Theme: BUTTER_N, Goal: BREAD)
  – → BUTTER_V [CAUSE GO] (Agent: MARIA, Goal: BREAD), by categorial variation

MT Approaches: Hybrid Example: GHMT
• Structural overgeneration:
  – put Maria butter on bread
  – lay Maria butter at loaf
  – render Maria butter into loaf
  – …

MT Approaches: Hybrid Example: GHMT
• Target statistical resources:
  – Structural n-gram model: long-distance, over lexemes ("buy John car a red")
  – Surface n-gram model: local, over surface forms ("John bought a red car")

MT Approaches: Hybrid Example: GHMT
• Linearization & ranking (log-probability scores):
  – Maria buttered the bread        -47.0841
  – Maria butters the bread         -47.2994
  – Maria breaded the butter        -48.7334
  – Maria breads the butter         -48.835
  – Maria buttered the loaf         -51.3784
  – Maria butters the loaf          -51.5937
  – Maria put the butter on bread   -54.128

MT Approaches: Practical Considerations
• Resource availability
  – Parsers and generators: input/output compatibility
  – Translation lexicons: word-based vs. transfer/interlingua
  – Parallel corpora: domain of interest; bigger is better
• Time availability
  – Statistical training, resource building

MT Approaches: Resource Poverty
• No parser? No translation dictionary?
• Use a parallel corpus with a resource-rich language:
  – Align with the rich language; extract a dictionary
  – Parse the rich side; infer parses; build a statistical parser
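GHMT's linearization-and-ranking step, shown above, scores overgenerated candidates with target-language n-gram models and keeps the best. A toy illustration of the idea, using a bigram model with add-one smoothing trained on a tiny invented corpus (the corpus and the resulting scores are mine, not the slide's):

```python
import math
from collections import Counter

# Toy surface-bigram ranker in the spirit of GHMT's linearization step.
# The three-sentence training "corpus" is invented for illustration.
corpus = [
    "maria buttered the bread",
    "john buttered the toast",
    "maria ate the bread",
]

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    toks = ["<s>"] + sent.split() + ["</s>"]
    unigrams.update(toks[:-1])
    bigrams.update(zip(toks[:-1], toks[1:]))

V = len(unigrams) + 1  # vocabulary size for add-one smoothing

def logprob(sentence):
    toks = ["<s>"] + sentence.split() + ["</s>"]
    # add-one smoothing gives unseen bigrams small but nonzero probability
    return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + V))
               for a, b in zip(toks[:-1], toks[1:]))

candidates = ["maria buttered the bread",
              "maria breaded the butter",
              "put maria butter on bread"]
best = max(candidates, key=logprob)
print(best)  # "maria buttered the bread"
```

The fluent candidate wins because its bigrams were actually observed; the overgenerated alternatives pay for every unseen transition, which is exactly how the slide's ranked list (buttered-the-bread first, put-the-butter-on-bread last) comes about.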
Road Map
• Why Machine Translation (MT)?
• Multilingual Challenges for MT
• MT Approaches
• MT Evaluation

MT Evaluation
• More art than science
• Wide range of metrics/techniques: interface, …, scalability, …, faithfulness, …, space/time complexity, … etc.
• Automatic vs. human-based: dumb machines vs. slow humans

MT Evaluation Metrics (Church and Hovy, 1993)
• System-based metrics
  – Count internal resources: size of lexicon, number of grammar rules, etc.
  – Easy to measure
  – Not comparable across systems
  – Not necessarily related to utility

MT Evaluation Metrics
• Text-based metrics
  – Sentence-based metrics
    • Quality: accuracy, fluency, coherence, etc.
    • 3-point to 100-point scales
  – Comprehensibility metrics
    • Comprehension, informativeness
    • x-point scales, questionnaires
    • Most related to utility, but hard to measure

MT Evaluation Metrics
• Text-based metrics (cont'd)
  – Amount of post-editing
    • Number of keystrokes per page
    • Not necessarily related to utility
• Cost-based metrics
  – Cost per page
  – Time per page

Human-based Evaluation Example: Accuracy Criteria
• 5: contents of original sentence conveyed (might need minor corrections)
• 4: contents of original sentence conveyed BUT errors in word order
• 3: contents of original sentence generally conveyed BUT errors in relationships between phrases, tense, singular/plural, etc.
• 2: contents of original sentence not adequately conveyed; portions of original sentence incorrectly translated; missing modifiers
• 1: contents of original sentence not conveyed; missing verbs, subjects, objects, phrases or clauses

Human-based Evaluation Example: Fluency Criteria
• 5: clear meaning; good grammar, terminology and sentence structure
• 4: clear meaning BUT bad grammar, bad terminology or bad sentence structure
• 3: meaning graspable BUT ambiguities due to bad grammar, bad terminology or bad sentence structure
• 2: meaning unclear BUT inferable
• 1: meaning absolutely unclear

Fluency vs. Accuracy
• [Chart: the fluency vs. accuracy plane, with regions for FAHQMT (fully automatic high-quality MT), professional MT, and informational MT]
Automatic Evaluation Example: Bleu Metric
• Bleu: BiLingual Evaluation Understudy (Papineni et al., 2001)
  – Modified n-gram precision with length penalty
  – Quick, inexpensive and language-independent
  – Correlates highly with human evaluation
  – Biased against synonyms and inflectional variations

Automatic Evaluation Example: Bleu Metric
• Test sentence: "colorless green ideas sleep furiously"
• Gold-standard references:
  – "all dull jade ideas sleep irately"
  – "drab emerald concepts sleep furiously"
  – "colorless immature thoughts nap angrily"
• Unigram precision = 4 / 5 = 0.8 (colorless, ideas, sleep, furiously each match some reference; green matches none)
• Bigram precision = 2 / 4 = 0.5 ("ideas sleep" and "sleep furiously" match)
• Bleu score = (p1 × p2 × … × pn)^(1/n) = (0.8 × 0.5)^(1/2) = 0.6325, i.e. 63.25
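The worked example above can be reproduced in a few lines. This is a simplified sketch of Bleu (clipped n-gram precisions combined by geometric mean, with a brevity penalty against the closest-length reference); production implementations add smoothing and corpus-level aggregation.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    # clip each candidate n-gram's count by its max count in any reference
    cand = Counter(ngrams(candidate, n))
    max_ref = Counter()
    for ref in references:
        for g, c in Counter(ngrams(ref, n)).items():
            max_ref[g] = max(max_ref[g], c)
    clipped = sum(min(c, max_ref[g]) for g, c in cand.items())
    return clipped / max(1, sum(cand.values()))

def bleu(candidate, references, max_n=2):
    ps = [modified_precision(candidate, references, n)
          for n in range(1, max_n + 1)]
    if min(ps) == 0:
        return 0.0
    # brevity penalty: punish candidates shorter than the closest reference
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c >= r else math.exp(1 - r / c)
    return bp * math.exp(sum(math.log(p) for p in ps) / max_n)

test = "colorless green ideas sleep furiously".split()
refs = [r.split() for r in [
    "all dull jade ideas sleep irately",
    "drab emerald concepts sleep furiously",
    "colorless immature thoughts nap angrily",
]]
print(round(bleu(test, refs), 4))  # 0.6325, matching the slide
```

With only unigrams and bigrams this is the geometric mean of 4/5 and 2/4 (the brevity penalty is 1 here, since the candidate matches the closest reference length), which recovers the slide's 0.6325.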