Dallin Hardcastle
LING 480
11/14/2012

RbMT or SMT?

For this paper, I investigated the major differences between Rule-Based Machine Translation (RbMT) and Statistical Machine Translation (SMT) to determine which system is currently superior and which represents the future of machine translation. While each system can be very beneficial, I have concluded that neither is currently superior, and neither alone is the answer for a more efficient translation service. The answer lies in an effective mix of both approaches: a blend of RbMT and SMT, or Hybrid Machine Translation, which I believe will lead to the biggest advances in machine translation since the invention of the computer.

RbMT uses grammars, morphological rules, and other linguistic principles to perform translations. Such systems offer tremendous upside once they have been thoroughly developed. SYSTRAN is a company that has had a very successful past in RbMT, dating back to 1968, when it was founded by Dr. Peter Toma. It was one of the few translation companies to survive the major funding cuts that followed the ALPAC report. Its system helped the United States translate millions of documents during the Cold War and was the foundation of the free online translation service Yahoo! Babel Fish. The drawback of RbMT is that rapid translation is not feasible unless extensive grammatical rules have already been established for the language pair in question. Developing these rules is costly and slow.

Serious SMT study began in the early 1990s, when the United States government, specifically the Defense Advanced Research Projects Agency (DARPA), funded an IBM project called CANDIDE. The idea was to develop algorithms that statistically analyze an extensive set of bilingual corpora to produce accurate, fluent-sounding translations. The project promised only 80% accuracy from its algorithms and so could not guarantee accurate translations.
DARPA eventually rated SYSTRAN’s system higher on the accuracy scale, and funding for CANDIDE was cut. SMT does have certain advantages, however, as it produces translations very rapidly. Google Translate is probably the most widely used free MT service online today, and it is statistically based. However, if one needs to translate a complex sentence that requires knowledge of grammatical structure, SMT is not a reliable solution.

Dr. Sabine Hunsicker, a researcher at the German Research Center for Artificial Intelligence, compares the two MT systems: “While SMT systems suffer from a lack of grammatical structure, resulting in ungrammatical sentences, RbMT systems have to deal with a lack of lexical coverage” (Hunsicker, 312). Both systems have serious shortcomings. Dr. Yorick Wilks, a professor of Artificial Intelligence at Sheffield University in the U.K., believes that a hybrid system is the answer for the future (Wilks, 89). Ironically, IBM and SYSTRAN are two of the companies heading the development of such technology. SYSTRAN released a new hybrid technology in 2010, and IBM has teamed up with LinguaSys and Google Translate to develop its own.

It will be interesting to see whether these new hybrid MT systems prove superior to their predecessors. The question is whether Hybrid MT will consistently deliver FAHQUT. My hypothesis is that while the number of full-time translators may decline, the number of full-time technical support staff will increase. The industry will continue to grow and more jobs will be available, but they will be available to those with technological training, not just to translators.

Works Cited

Hunsicker, Sabine. “Machine Learning for Hybrid Machine Translation.” Proceedings of the 7th Workshop on Statistical Machine Translation. Montréal, Canada: Association for Computational Linguistics, June 8, 2012. 312–316.

Wilks, Yorick. Machine Translation: Its Scope and Limits. New York; London: Springer, 2008. Print.