Understanding Digital Documents Using Gestalt Properties of Isothetic Components

Shyamosree Pal, Partha Bhowmick, Arindam Biswas, Bhargab B. Bhattacharya
ISBN13: 9781466609006|ISBN10: 1466609001|EISBN13: 9781466609013
DOI: 10.4018/978-1-4666-0900-6.ch010
MLA

Pal, Shyamosree, et al. "Understanding Digital Documents Using Gestalt Properties of Isothetic Components." Multimedia Storage and Retrieval Innovations for Digital Library Systems, edited by Chia-Hung Wei, et al., IGI Global, 2012, pp. 183-207. https://doi.org/10.4018/978-1-4666-0900-6.ch010

APA

Pal, S., Bhowmick, P., Biswas, A., & Bhattacharya, B. B. (2012). Understanding Digital Documents Using Gestalt Properties of Isothetic Components. In C. Wei, Y. Li, & C. Gwo (Eds.), Multimedia Storage and Retrieval Innovations for Digital Library Systems (pp. 183-207). IGI Global. https://doi.org/10.4018/978-1-4666-0900-6.ch010

Chicago

Pal, Shyamosree, et al. "Understanding Digital Documents Using Gestalt Properties of Isothetic Components." In Multimedia Storage and Retrieval Innovations for Digital Library Systems, edited by Chia-Hung Wei, Yue Li, and Chih-Ying Gwo, 183-207. Hershey, PA: IGI Global, 2012. https://doi.org/10.4018/978-1-4666-0900-6.ch010

Abstract

This paper shows how Gestalt properties can be used to identify the various components of a document image. It demonstrates that the holistic, rather than piecemeal, approach the human mind takes to vision is useful for document analysis. Since the major constituent components (textual or non-textual) of a document page are arranged in a rectilinear fashion, the different components on the page are given a rectilinear (isothetic) decomposition. The page is first represented as a feature set of polygonal covers corresponding to its distinct regions of interest. Each polygon is then iteratively decomposed into sub-polygons that tightly enclose the corresponding sub-components, capturing both the overall information and the necessary details to the desired level of precision. Subsequently, these components and sub-components are analyzed using Gestalt laws, which are explained in detail in the context of this work. Text regions, tabular structures, and various graphic objects readily admit some of the Gestalt properties. We have tested our algorithm on several benchmark datasets, and we present some relevant results to demonstrate the effectiveness and elegance of the proposed method.
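
The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a binary image stored as a list of lists, approximates each isothetic cover by the set of grid cells of side g that contain foreground pixels, and summarizes each 4-connected group of cells by its grid-aligned bounding box, which is looser than a true isothetic polygon. The function name cover_boxes and the sample image are hypothetical.

```python
from collections import deque


def cover_boxes(image, g):
    """Return grid-aligned boxes (top, left, bottom, right), one per
    4-connected group of occupied g-by-g cells. Assumption: a box stands
    in for the tighter isothetic polygon described in the abstract."""
    # Mark every grid cell that contains at least one foreground pixel.
    cells = {
        (r // g, c // g)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value
    }
    boxes = []
    while cells:
        seed = cells.pop()
        region = [seed]
        queue = deque([seed])
        # Breadth-first search over 4-connected occupied cells.
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in cells:
                    cells.remove(nb)
                    region.append(nb)
                    queue.append(nb)
        rows = [r for r, _ in region]
        cols = [c for _, c in region]
        # Convert cell indices back to pixel coordinates (exclusive end).
        boxes.append((min(rows) * g, min(cols) * g,
                      (max(rows) + 1) * g, (max(cols) + 1) * g))
    return sorted(boxes)


# Hypothetical 4x4 page: a 2x2 blob and a 2x1 blob separated by one column.
image = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

for g in (4, 2, 1):  # coarse to fine
    print(g, cover_boxes(image, g))
```

At the coarsest grid size the two blobs fall into a single cover; only at g = 1 does the empty column separate them into two sub-components. A grouping step based on Gestalt laws such as proximity would then operate on these boxes, which this sketch leaves out.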
