Multilingual Scene Text Detection Using Gradient Morphology

Dibyajyoti Dhar (Jadavpur University, India), Neelotpal Chakraborty (Jadavpur University, India), Sayan Choudhury (Jadavpur University, India), Ashis Paul (Jadavpur University, India), Ayatullah Faruk Mollah (Aliah University, India), Subhadip Basu (Jadavpur University, India) and Ram Sarkar (Jadavpur University, India)
Copyright: © 2020 | Pages: 13
DOI: 10.4018/IJCVIP.2020070103

Text detection in natural scene images is an interesting problem in the field of information retrieval. Several methods have been proposed over the past few decades for scene text detection. However, the robustness and efficiency of these methods degrade because they are highly sensitive to various image complexities. Also, in a multilingual environment where text may occur in multiple languages, a given method may not be suitable for detecting scene text in certain languages. To counter these challenges, a gradient morphology-based method is proposed in this paper that proves to be robust against image complexities and efficiently detects scene text irrespective of language. The method is validated using low-quality images from standard multilingual datasets such as MSRA-TD500 and MLe2e. Its performance is compared with that of some state-of-the-art methods, and comparatively better results are observed.
Article Preview

Literature Survey

Current methodologies for scene text detection are based on detecting stable homogeneous components with nearly similar intensities and refining them on the basis of their stroke properties. Some of these works also use morphological operations to eliminate spurious components while retaining as many text components as possible.
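
As an illustration of how a morphological operation can eliminate spurious components, the following is a minimal sketch of binary opening (erosion followed by dilation) in plain Python. It is not the method proposed in this paper or in any of the surveyed works; the function names, the 3 x 3 square structuring element, and the toy image are assumptions chosen only for illustration.

```python
from typing import Callable, List

Grid = List[List[int]]  # binary image: 1 = foreground, 0 = background


def _filter(img: Grid, k: int, reducer: Callable[[List[int]], int]) -> Grid:
    """Slide a k x k square window over img and reduce each neighbourhood.

    Windows are clipped at the image border (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            ]
            out[y][x] = reducer(window)
    return out


def erode(img: Grid, k: int = 3) -> Grid:
    """Keep a pixel only if every pixel in its window is foreground."""
    return _filter(img, k, min)


def dilate(img: Grid, k: int = 3) -> Grid:
    """Set a pixel if any pixel in its window is foreground."""
    return _filter(img, k, max)


def opening(img: Grid, k: int = 3) -> Grid:
    """Erosion followed by dilation: removes blobs smaller than the window."""
    return dilate(erode(img, k), k)


# Toy image: a 3 x 3 block (a stand-in for a text component) plus one
# isolated noise pixel at row 5, column 5.
noisy: Grid = [[0] * 7 for _ in range(7)]
for row in range(1, 4):
    for col in range(1, 4):
        noisy[row][col] = 1
noisy[5][5] = 1

cleaned = opening(noisy)
```

In this toy case the isolated pixel disappears because it is smaller than the structuring element, while the 3 x 3 block survives the erosion at its centre and is restored by the dilation. In practice the size of the structuring element has to be chosen relative to the expected stroke thickness, since thin strokes can be removed along with the noise.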

The concept of detecting Maximally Stable Extremal Regions (MSER) was first introduced by Matas et al. (2004), in which stable blobs of text regions having similar intensity values are detected. Epshtein et al. (2010) introduced the Stroke Width Transform (SWT), in which a text component is considered to be a well-defined stroke depicting some meaning. Yao et al. (2012) used this concept to localize multi-oriented text regions in complex natural scenes, retaining text components on the basis of the stroke properties of candidate components. Since the MSER method is highly sensitive to blur, Chen et al. (2011) designed a Canny edge-based MSER technique and applied SWT to discard the non-text MSER components. This combination of MSER and SWT has become a popular way of identifying text candidates. Gomez & Karatzas (2013) used this combination to detect multilingual scene text and performed perceptual clustering to localize the text regions. Yin et al. (2015) extracted MSER blobs and applied morphology-based grouping to localize multi-oriented and multilingual scene text.
