English to Hindi Paraphrase Convention for Translating Homoeopathy Literature

Sanjay K. Dwivedi (School of Information Science and Technology, Department of Computer Science, Babasaheb Bhimrao Ambedkar University (A Central University), Lucknow, Uttar Pradesh, India) and Pramod P. Sukhadeve (School of Information Science and Technology, Department of Computer Science, Babasaheb Bhimrao Ambedkar University (A Central University), Lucknow, Uttar Pradesh, India)
Copyright: © 2012 |Pages: 9
DOI: 10.4018/ijalr.2012100105

Abstract

The rule-based approach to machine translation (MT) relies on grammatical rules relating the source and the target language, with the goal of constructing grammatical translations between the language pair. In this paper, we describe the structural representation of the English stemmer and POS tagging, and design transfer rules that can generate a Hindi sentence from the structural representation of the English sentence. Because of the specific terminology of homoeopathic sentences and the linguistic gap between the two languages, translating this literature from English to Hindi is a challenging task. The rule sets are used to bridge the gap between the two languages. Further, rule sets are described for mapping prepositions, verbs, nouns, etc. Finally, a system architecture is proposed for the translation of homoeopathy literature from English to Hindi. The system accuracy has been evaluated using the BLEU score, which was found to be 0.7501, and the accuracy percentage of the system is 82.23%.
Article Preview

Introduction

Machine translation (MT) has been defined as the process that uses computer software to translate text from one natural language to another. Different approaches are used to translate the source language (SL) into the target language (TL). The major ones are the rule-based, statistical, and example-based approaches. The MT system described in this paper uses the rule-based approach, which is one of the effective techniques (Vaishali & Devale, 2010).

In recent years, research in MT has gained momentum. Many language pairs (including languages of Indian origin) have been chosen for developing MT systems, with good results. Jain, Bahadur and Chauhan (2012) outlined the target language generation mechanism for the English to Sanskrit language pair, using a rule-based machine translation technique for their system “Etrans”. For rule-based MT from English to Bangla, Francisca, Mamun Mia and Rahman (2011, Jun-Jul) proposed a language translation model that relies on rule-based methodologies, particularly fuzzy rules. Most of these systems translate from English to Hindi in a specific domain, with the exceptions of a Hindi to English system (Sinha & Thakur, 2005) and an English to Kannada system (Kumar & Murthy, 2006). Indian machine translation systems that translate English to Hindi (Naskar & Bandyopadhyay, 2002) include the following. The English to Hindi Anusaaraka system follows the basic principles of information preservation (Bharti, Chaitanya, Kulkarni, & Sangal, 1997). The system makes text in one Indian language accessible in another Indian language. The Anubharti 2004 approach (Sinha, 2004) to machine-aided translation is a hybridized example-based machine translation approach that combines example-based and corpus-based approaches with some elementary grammatical analysis.

In this work, we describe an English-Hindi MT system for homoeopathy documents. In our proposed system, sentences of homoeopathy literature (in English) are first passed to the stemmer (Dwivedi & Sukhadeve, 2011, May). Next, the part-of-speech (PoS) tagger (Dwivedi & Sukhadeve, 2011, July) tags all the words with the help of our tagging rules and corpora (English and Hindi). These steps are assisted by the English to Hindi online dictionary Shabdkosh (शब्दकोश). Afterwards, we compare the structural representations of both languages for mapping and design assembly rules for translation.
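
The pipeline described above (stemming, PoS tagging, dictionary lookup, then structural reordering) can be illustrated with a minimal sketch. This is not the authors' implementation. The lexicon entries, the suffix list, the tag names, and the single reordering rule (move verbs to the end, reflecting the English subject-verb-object versus Hindi subject-object-verb order) are all hypothetical assumptions chosen only to make the flow concrete.

```python
# Toy rule-based English-to-Hindi pipeline: stem -> tag -> reorder -> look up.
# All entries and rules below are hypothetical and purely illustrative.

# Hypothetical lexicon: English stem -> (PoS tag, Hindi equivalent).
LEXICON = {
    "the": ("DET", ""),            # Hindi has no articles, so map to nothing
    "patient": ("NOUN", "रोगी"),
    "medicine": ("NOUN", "दवा"),
    "take": ("VERB", "लेता है"),
}

# Hypothetical suffix list for a dictionary-backed stemmer.
SUFFIXES = ("ing", "es", "s")


def stem(word):
    """Return the lexicon stem of a word, or the lowercased word if none fits."""
    word = word.lower()
    if word in LEXICON:
        return word
    for suffix in SUFFIXES:
        candidate = word[: -len(suffix)]
        if word.endswith(suffix) and candidate in LEXICON:
            return candidate
    return word


def tag(stems):
    """Pair each stem with its PoS tag; unknown stems are tagged UNK."""
    return [(s, LEXICON[s][0] if s in LEXICON else "UNK") for s in stems]


def reorder_svo_to_sov(tagged):
    """Transfer rule: move verbs to the end of the sentence (SVO -> SOV)."""
    non_verbs = [item for item in tagged if item[1] != "VERB"]
    verbs = [item for item in tagged if item[1] == "VERB"]
    return non_verbs + verbs


def translate(sentence):
    """Run the toy pipeline on one English sentence."""
    tokens = sentence.strip(".").split()
    tagged = tag([stem(t) for t in tokens])
    reordered = reorder_svo_to_sov(tagged)
    # Look up Hindi words; unknown stems are passed through unchanged.
    hindi = [LEXICON[s][1] if s in LEXICON else s for s, _ in reordered]
    return " ".join(w for w in hindi if w)


print(translate("The patient takes medicine"))
```

A real system would need far richer rules, for example for gender and number agreement, postpositions, and homoeopathy-specific terminology, none of which this sketch attempts.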
