Using Dynamic Language Corpora in Web-based Translation: An Analysis of Error Factors

Document Type: Research Paper

Abstract

Corpus linguistics is a branch of linguistics that has made remarkable progress thanks to technological advances in the communication and networking industries. Use of the World Wide Web is no longer limited to searching for information or providing data, and the growing range of technologies offered on the internet has created new fields of interest. Online translation is one of these fields and is still in the early stages of its development. Its most significant feature is the use of dynamic linguistic corpora, which are interactive, statistics-based, and constantly updated. After examining the mechanisms of web-based translation using dynamic corpora, the present study surveys the factors that cause errors in web-based translation from Persian into English, with particular attention to cultural factors. Although the findings reveal serious problems with this approach, the more advanced algorithms now appearing in standard translation software seem likely to take promising steps toward improving the technology.
