1. Introduction
The human eye detects and recognizes objects with ease, aided by its ability to perceive the world in three dimensions. Human vision takes in every aspect of a scene, including angles, shadows, background, and lighting. People recognize objects just as easily in images of real scenes, and they can read the text in those images, whether the image shows a printed page, a cheque, a bill, or text in a natural scene. Humans also cope well with irregular input such as non-uniform printing, handwriting, heavy glare, or degraded images. The human eye handles all of these cases accurately, but the same work is far from easy for a computer.
Computers must expend considerably more effort on such computer vision tasks. In recent years, many algorithms have been proposed for image processing and text recognition tasks such as optical character recognition, bill and receipt recognition, scene text recognition, bank cheque recognition (Srivastava et al., 2019), and handwriting recognition in different languages, among others. Owing to these text recognition algorithms and the notable growth in the ability of computer vision to recognize text in images, the field of Optical Character Recognition (OCR) (Das et al., 2020) is attracting many researchers, and as a result recent text detection software can perform nearly as well as human vision.
1.1. Text Identification in an OCR System
Text identification in an OCR system means detecting and recognizing the text in printed, handwritten, or scene text images. OCR systems perform these tasks using image processing methods, which ease the cognitive work of reading. To identify text in images, many researchers use deep learning algorithms. Figure 1 illustrates the text identification process. The next subsection presents the basic architecture of OCR.
Figure 1. Text extraction: (a) input image; (b) extremal regions extracted; (c) non-character components removed; (d) misclassified characters marked with red flags; (e) output results (Zheng et al., 2017)
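Steps (b) and (c) of Figure 1 can be illustrated with a minimal sketch. It is not the method of Zheng et al. (2017): plain connected components on an already binarized toy image stand in for extremal regions, and the function names, the `IMAGE` grid, and the `min_area` and `max_aspect` thresholds are all hypothetical choices made for illustration.

```python
from collections import deque

# Hypothetical binarized input: 1 = foreground (ink), 0 = background.
# It holds two character-like blobs, a one-pixel speck, and a long ruled line.
IMAGE = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]


def connected_components(grid):
    """Group foreground pixels into 4-connected components (lists of (row, col))."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component, inclusive."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def looks_like_character(pixels, min_area=3, max_aspect=3.0):
    """Heuristic filter: reject tiny specks and very elongated shapes.

    Both thresholds are illustrative assumptions, not values from the article.
    """
    top, left, bottom, right = bounding_box(pixels)
    height = bottom - top + 1
    width = right - left + 1
    aspect = max(height, width) / min(height, width)
    return len(pixels) >= min_area and aspect <= max_aspect


def candidate_character_boxes(grid):
    """Extract regions, drop non-character components, return boxes left to right."""
    boxes = [bounding_box(p) for p in connected_components(grid)
             if looks_like_character(p)]
    return sorted(boxes, key=lambda box: box[1])
```

Calling `candidate_character_boxes(IMAGE)` keeps the two character-like blobs and discards the speck and the ruled line. A real system would replace the geometric heuristic with a trained classifier and would then pass the surviving regions to a recognizer.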