Segmentation Free Word Spotting for Handwritten Documents Using Bag of Visual Words Based on Co-HOG Descriptor

Thontadari C. (Kuvempu University, Shimoga, India) and Prabhakar C. J. (Department of Computer Science, Kuvempu University, Shimoga, India)
Copyright: © 2019 | Pages: 17
DOI: 10.4018/IJIRR.2019040105

In this article, the authors propose a segmentation-free word spotting method for handwritten document images using a Bag of Visual Words (BoVW) framework based on the co-occurrence histogram of oriented gradients (Co-HOG) descriptor. Initially, the handwritten document is represented using visual word vectors, which are obtained from the frequency of occurrence of Co-HOG descriptors within local patches of the document. The visual word representation does not record the spatial location of the visual words, yet spatial information helps to distinguish locations that would be perceived as the same when judged by visual information alone. Hence, to add the spatial distribution of visual words to the otherwise unstructured BoVW framework, the authors adopt the spatial pyramid matching (SPM) technique. The performance of the proposed method is evaluated on popular datasets, and the results confirm that the authors' method outperforms existing segmentation-free word spotting techniques.

1. Introduction

Nowadays, document digitization has become more popular than traditional paper documents for storage and transmission. In order to access the content of digitized documents, the manuscripts are transcribed into a machine-understandable format so that users can perform textual searches. When dealing with huge collections of handwritten documents, automatic transcription is carried out using Optical Character Recognition (OCR) strategies. Automatic recognition of poor-quality handwritten text is not feasible with traditional OCR approaches, which are mainly adequate for modern printed documents with simple layouts and known fonts. Most of the constraints encountered by OCR systems for handwritten documents stem from difficulties in segmenting characters or words, the variability of the handwriting, and the open vocabulary. In order to overcome the drawbacks of OCR, researchers have developed word spotting techniques, which have become an essential tool for retrieving historical and modern handwritten documents based on the information a user is interested in. Word spotting can be defined as the pattern recognition task aimed at locating and retrieving a particular word from a document image collection without explicitly transcribing the whole corpus.

Researchers have proposed techniques for word spotting in handwritten documents that work either with or without segmentation of the documents. The main drawback of segmentation-based word spotting techniques is that they need a segmentation step to select candidate words. Any segmentation errors affect the subsequent steps, such as word representation and matching, so it is desirable to avoid segmenting the documents. This has motivated researchers in the word spotting domain to move towards segmentation-free methods. In segmentation-free methods (Leydier et al., 2005; Gatos et al., 2009), the document images are represented by a feature descriptor such as the Scale Invariant Feature Transform (SIFT). Then, sliding-window or patch-based approaches are used to locate the document regions that are most similar to the query word (Rusinol et al., 2015; Shekhar et al., 2012; Rothacker et al., 2013; Zhang et al., 2013). The drawbacks of SIFT-based word spotting are that it is memory intensive, that the window size cannot be adapted to the length of the query, and that the descriptors are relatively slow to compute and match. In order to avoid matching all the key points against one another, the Bag of Visual Words (BoVW) technique has been used for word spotting in handwritten documents (Rusinol et al., 2011; Shekhar et al., 2012). BoVW-based word spotting methods yield a holistic, fixed-length image representation while keeping the discriminative power of the local descriptors.
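The fixed-length property of a BoVW representation can be illustrated with a minimal sketch. It assumes that local descriptors have already been extracted and that a codebook of visual words is available (for example from clustering, which is not shown). The function names, the toy codebook, and the L1 normalisation are illustrative assumptions, not details taken from the article.

```python
import numpy as np


def assign_visual_words(descriptors, codebook):
    """Return, for each descriptor, the index of its nearest codebook entry."""
    # Squared Euclidean distance between every descriptor and every codeword.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    return np.argmin(dists, axis=1)


def bovw_histogram(descriptors, codebook):
    """Build an L1-normalised histogram of visual-word frequencies.

    The output length equals the codebook size, regardless of how many
    descriptors the image region contains.
    """
    words = assign_visual_words(descriptors, codebook)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist


# Toy data (hypothetical): a 3-word codebook and 4 two-dimensional descriptors.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.7, 5.2]])

hist = bovw_histogram(descriptors, codebook)
print(hist)  # one bin per visual word
```

Two regions with different numbers of descriptors map to vectors of the same length, so they can be compared with a single distance computation instead of matching key points pairwise.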

Almazan et al. (2014) proposed an unsupervised segmentation-free word spotting method based on the HOG descriptor. Document images are represented through a grid of HOG descriptors, and a sliding-window approach is used to locate the document regions that are most similar to the query. The HOG feature descriptor captures the orientation of isolated pixels only, whereas the spatial information of neighboring pixels is ignored. In order to capture the spatial information of neighboring pixels, we propose using the Co-occurrence Histogram of Oriented Gradients (Co-HOG) descriptor (Watanabe et al., 2009) for word spotting in handwritten documents. Co-HOG is an extension of the HOG descriptor that encodes the gradient orientations of neighboring pixel pairs and accordingly captures more spatial and relative information, making it better suited to representing character shapes precisely and effectively.
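The idea of counting gradient orientations over pixel pairs rather than single pixels can be sketched as follows. This is a simplified illustration, not the authors' implementation: the number of orientation bins, the four pixel offsets, and the function names are assumptions, gradient magnitude is ignored, and pixels with zero gradient simply fall into whichever bin corresponds to an angle of zero.

```python
import numpy as np


def quantized_orientations(patch, n_bins=8):
    """Quantise the gradient orientation of every pixel into n_bins bins."""
    gy, gx = np.gradient(patch.astype(float))
    angle = np.arctan2(gy, gx)  # values in [-pi, pi]
    bins = np.floor((angle + np.pi) / (2 * np.pi) * n_bins).astype(int)
    return bins % n_bins  # angle == pi wraps around to bin 0


def cooccurrence_matrix(bins, offset, n_bins=8):
    """Count orientation pairs (pixel, pixel shifted by offset)."""
    dy, dx = offset
    h, w = bins.shape
    # Crop two views so that b[i, j] is the neighbour of a[i, j] at (dy, dx).
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    a = bins[y0:y1, x0:x1]
    b = bins[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    mat = np.zeros((n_bins, n_bins), dtype=np.int64)
    np.add.at(mat, (a.ravel(), b.ravel()), 1)  # unbuffered accumulation
    return mat


def cohog_descriptor(patch, offsets=((0, 1), (1, 0), (1, 1), (1, -1)), n_bins=8):
    """Concatenate one flattened co-occurrence matrix per offset."""
    bins = quantized_orientations(patch, n_bins)
    return np.concatenate(
        [cooccurrence_matrix(bins, off, n_bins).ravel() for off in offsets]
    )


# Toy 8x8 patch (hypothetical data) standing in for a local document patch.
patch = np.arange(64).reshape(8, 8) % 7
desc = cohog_descriptor(patch)
print(desc.shape)  # 4 offsets x 8 x 8 orientation pairs = 256 values
```

A plain HOG histogram over the same patch would have only `n_bins` entries. The co-occurrence matrices additionally record which orientations appear next to each other and in which direction, which is the extra spatial information the paragraph above refers to.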
