Word Sense Based Hindi-Tamil Statistical Machine Translation

Vimal Kumar K., Divakar Yadav
ISBN13: 9781799809517 | ISBN10: 179980951X | EISBN13: 9781799809524
DOI: 10.4018/978-1-7998-0951-7.ch021

MLA

Kumar K., Vimal, and Divakar Yadav. "Word Sense Based Hindi-Tamil Statistical Machine Translation." Natural Language Processing: Concepts, Methodologies, Tools, and Applications, edited by Information Resources Management Association, IGI Global, 2020, pp. 410-421. https://doi.org/10.4018/978-1-7998-0951-7.ch021

APA

Kumar K., V. & Yadav, D. (2020). Word Sense Based Hindi-Tamil Statistical Machine Translation. In I. Management Association (Ed.), Natural Language Processing: Concepts, Methodologies, Tools, and Applications (pp. 410-421). IGI Global. https://doi.org/10.4018/978-1-7998-0951-7.ch021

Chicago

Kumar K., Vimal, and Divakar Yadav. "Word Sense Based Hindi-Tamil Statistical Machine Translation." In Natural Language Processing: Concepts, Methodologies, Tools, and Applications, edited by Information Resources Management Association, 410-421. Hershey, PA: IGI Global, 2020. https://doi.org/10.4018/978-1-7998-0951-7.ch021

Abstract

Corpus-based natural language processing has seen great success in recent years. It is used not only for languages such as English, French, Spanish, and Hindi but also widely for languages such as Tamil and Telugu. This paper aims to increase the accuracy of machine translation from Hindi to Tamil by considering each word's sense as well as its part of speech. The system translates word by word from Hindi to Tamil, using additional information: the preceding words, the current word's part of speech, and the word's sense. Such a translation system requires the frequency of words occurring in the corpus, the part-of-speech tags of the input words, and the probability of the word preceding each tagged word. WordNet is used to identify the synonyms of each source-language word. Among these synonyms, the one most relevant to the source-language word is selected for translation into the target language. Introducing this additional information (part-of-speech tags, preceding-word information, and semantic analysis) has greatly improved the accuracy of the system.
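
The word-by-word selection outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the romanized word lists, the counts, the scoring formula, and the names `ALIGN_COUNTS`, `BIGRAM_COUNTS`, `candidates`, and `translate` are all hypothetical, chosen only to show how a part-of-speech tag and the preceding word might together narrow the choice among candidate translations.

```python
from collections import Counter

# Hypothetical toy data (romanized); none of it comes from the chapter.
# Counts of (hindi_word, pos_tag, tamil_word) alignments in a parallel corpus.
ALIGN_COUNTS = Counter({
    ("sona", "NOUN", "thangam"): 8,          # "gold"
    ("sona", "NOUN", "pon"): 2,              # "gold" (synonym)
    ("sona", "VERB", "thoongu"): 6,          # "to sleep"
    ("mehnga", "ADJ", "vilaiyuyarntha"): 5,  # "expensive"
})

# Counts of (previous_tamil_word, tamil_word) pairs in a target-side corpus.
BIGRAM_COUNTS = Counter({
    ("vilaiyuyarntha", "thangam"): 4,
    ("vilaiyuyarntha", "pon"): 1,
})


def candidates(word, pos):
    """Return {tamil_word: count} for a Hindi word with the given POS tag."""
    return {t: c for (h, p, t), c in ALIGN_COUNTS.items()
            if h == word and p == pos}


def translate(tagged_sentence):
    """Translate a list of (hindi_word, pos_tag) pairs word by word."""
    output = []
    prev = None  # previously emitted target word, if any
    for word, pos in tagged_sentence:
        cands = candidates(word, pos)
        if not cands:
            # Unknown word/tag pair: pass the source word through unchanged.
            output.append(word)
            prev = word
            continue
        total = sum(cands.values())

        def score(target):
            # Relative frequency of this translation for the tagged word.
            lexical = cands[target] / total
            # Add-one smoothed bigram count. The normalizer is the same for
            # every candidate sharing `prev`, so it is omitted for ranking.
            context = BIGRAM_COUNTS[(prev, target)] + 1
            return lexical * context

        best = max(cands, key=score)
        output.append(best)
        prev = best
    return output


print(translate([("mehnga", "ADJ"), ("sona", "NOUN")]))
print(translate([("sona", "VERB")]))
```

In this toy setup the part-of-speech tag separates the "gold" and "sleep" readings of the same source word, and the preceding target word then ranks the remaining synonyms. A real system would estimate the counts from a parallel corpus and draw the synonym candidates from WordNet rather than from a hand-written table.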
