Machine Learning Approaches for Bangla Statistical Machine Translation

Maxim Roy
ISBN13: 9781466639706|ISBN10: 1466639709|EISBN13: 9781466639713
DOI: 10.4018/978-1-4666-3970-6.ch004
Cite Chapter

MLA

Roy, Maxim. "Machine Learning Approaches for Bangla Statistical Machine Translation." Technical Challenges and Design Issues in Bangla Language Processing, edited by M. A. Karim, et al., IGI Global, 2013, pp. 79-95. https://doi.org/10.4018/978-1-4666-3970-6.ch004

APA

Roy, M. (2013). Machine Learning Approaches for Bangla Statistical Machine Translation. In M. Karim, M. Kaykobad, & M. Murshed (Eds.), Technical Challenges and Design Issues in Bangla Language Processing (pp. 79-95). IGI Global. https://doi.org/10.4018/978-1-4666-3970-6.ch004

Chicago

Roy, Maxim. "Machine Learning Approaches for Bangla Statistical Machine Translation." In Technical Challenges and Design Issues in Bangla Language Processing, edited by M. A. Karim, M. Kaykobad, and M. Murshed, 79-95. Hershey, PA: IGI Global, 2013. https://doi.org/10.4018/978-1-4666-3970-6.ch004

Abstract

Machine Translation (MT) from Bangla to English has recently become a priority task for the Bangla Natural Language Processing (NLP) community. Statistical Machine Translation (SMT) systems require a large amount of bilingual data for a language pair to achieve good translation accuracy. However, because Bangla is a low-density language, such resources are not available for it. In this chapter, the authors discuss how machine learning approaches can help to improve translation quality within an SMT system without requiring a huge increase in resources. They provide a novel semi-supervised learning and active learning framework for SMT, which utilizes both labeled and unlabeled data. The authors discuss sentence selection strategies in detail and evaluate them experimentally. In the semi-supervised setting, the reversed model approach outperformed all other approaches for Bangla-English SMT, and in the active learning setting, the geometric 4-gram and geometric phrase sentence selection strategies proved most useful, based on BLEU score results compared with baseline approaches. Overall, the authors demonstrate that for a low-density language like Bangla, these machine learning approaches can improve translation quality.
