Deep Learning for Trilingual Character Recognition

M. Yashodha (Dept. of ISE, Bahubali College of Engineering, India), SK Niranjan (JSS Science and Technology University, Mysore, India) and V. N. Manjunath Aradhya (Sri Jayachamarajendra College of Engineering, Mysore, India)
Copyright: © 2019 | Pages: 7
DOI: 10.4018/IJNCR.2019010104

Abstract

India is a multilingual country: Hindi is used at the national level, while each state retains its own regional language. Government offices prefer the regional language and Hindi for communication and for maintaining files and ledgers. Corporate offices and private organizations mainly prefer English, alongside the regional language, for recording documents and ledgers. As a result, a single document in India often contains text in several languages, which creates a need for a multilingual OCR system. In this article, a trilingual OCR system is developed using deep learning to support English, Hindi, and Kannada, the regional language of the state of Karnataka.

A document containing more than one language is considered a multilingual document. Multilingual documents commonly take the form of application forms, such as railway reservation forms, bank challans, and job application forms, where information has to be provided in more than one language. India is a multilingual country in which many official documents contain at least three languages: the official language of the local state, Hindi, and English, the common language of official communication. Documents of this kind are more challenging for document analysis and subsequent recognition tasks than single-script documents, which are known as monolingual documents. Processing multilingual documents relies on script identification algorithms to determine the language of the text involved. Multilingual document analysis can be divided into two categories: printed documents and handwritten documents. The literature contains many attempts at analyzing printed multilingual documents, but considerable scope remains for the analysis of hand-printed multilingual documents.
