Introduction
Machine translation (MT) is the use of computer software to translate text from one natural language to another. Several approaches are used to translate a source language (SL) into a target language (TL); the major ones are the rule-based, statistical, and example-based approaches. The MT system described in this paper uses the rule-based approach, which is one of the effective techniques (Vaishali & Devale, 2010).
In recent years, research in MT has gained momentum. MT systems have been developed for many language pairs, including languages of Indian origin, with good results. Jain, Bahadur and Chauhan (2012) outline the target language generation mechanism for the English to Sanskrit language pair, using the rule-based machine translation technique in their system “Etrans”. For rule-based MT from English to Bangla, Francisca, Mamun Mia and Rahman (2011, Jun-Jul) proposed a translation model that relies on rule-based methodologies, particularly fuzzy rules. Most of these systems translate from English to Hindi within a specific domain, with the exceptions of a Hindi to English (Sinha & Thakur, 2005) and an English to Kannada (Kumar & Murthy, 2006) machine translation system. Indian machine translation systems that translate English to Hindi (Naskar & Bandyopadhyay, 2002) include the following. The English to Hindi Anusaaraka system follows the basic principle of information preservation (Bharti, Chaitanya, Kulkarni, & Sangal, 1997); it makes text in one Indian language accessible in another Indian language. The Anubharti 2004 (Sinha, 2004) approach to machine-aided translation is a hybridized example-based machine translation approach that combines example-based and corpus-based approaches with some elementary grammatical analysis.
In this work, we describe an English-Hindi MT system for Homoeopathy documents. In our proposed system, sentences of Homoeopathy literature (in English) are first passed to a stemmer (Dwivedi & Sukhadeve, 2011, May). Next, the part-of-speech (PoS) tagger (Dwivedi & Sukhadeve, 2011, July) tags all the words with the help of our tagging rules and corpora (English and Hindi). These steps are assisted by the English to Hindi online dictionary Shabdkosh (शब्दकोश). Afterwards, we compare the structural representations of the two languages for mapping and design assembly rules for translation.
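The order of these stages can be illustrated with a minimal Python sketch. It is not the system described in this paper. The suffix list, tag table, three-word lexicon, and the single reordering rule (moving the verb to the end, since English is subject-verb-object and Hindi is subject-object-verb) are toy assumptions chosen only to show how stemming, tagging, dictionary lookup, and assembly could fit together. All function names are hypothetical.

```python
# Toy data: every entry here is an illustrative assumption, not taken from the paper.
SUFFIXES = ("ing", "es", "s")
TAGS = {"patient": "NOUN", "medicine": "NOUN", "take": "VERB", "the": "DET"}
LEXICON = {"patient": "रोगी", "medicine": "दवा", "take": "लेता है"}


def stem(word):
    """Strip a suffix only when the remainder is a known root (toy stemmer)."""
    word = word.lower()
    for suffix in SUFFIXES:
        candidate = word[: -len(suffix)]
        if word.endswith(suffix) and candidate in TAGS:
            return candidate
    return word


def tag(stems):
    """Attach a PoS tag to each stem by table lookup; unknown stems get 'UNK'."""
    return [(s, TAGS.get(s, "UNK")) for s in stems]


def assemble(tagged):
    """Toy assembly rule: drop articles and move verbs to the end (SVO to SOV)."""
    content = [(s, t) for s, t in tagged if t != "DET"]
    non_verbs = [s for s, t in content if t != "VERB"]
    verbs = [s for s, t in content if t == "VERB"]
    return non_verbs + verbs


def translate(sentence):
    """Run the stages in order: stem, tag, assemble, then dictionary lookup."""
    tokens = sentence.rstrip(".").split()
    ordered = assemble(tag([stem(tok) for tok in tokens]))
    # Words missing from the toy lexicon are passed through unchanged.
    return " ".join(LEXICON.get(s, s) for s in ordered)


print(translate("The patient takes the medicine."))
```

A real system would replace each table with the corresponding resource named above (the stemmer, the rule-based tagger with its corpora, and the Shabdkosh dictionary) and would need many assembly rules, including agreement in gender and number, rather than one.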