Automated Text Detection and Recognition in Annotated Biomedical Publication Images


Soumya De (Missouri University of Science and Technology, USA), R. Joe Stanley (Missouri University of Science and Technology, USA), Beibei Cheng (Missouri University of Science and Technology, USA), Sameer Antani (National Institutes of Health, USA), Rodney Long (National Institutes of Health, USA) and George Thoma (National Institutes of Health, USA)
Copyright: © 2017 | Pages: 33
DOI: 10.4018/978-1-5225-0571-6.ch018


Images in biomedical publications often convey important information related to an article's content. When referenced properly, these images aid in clinical decision support. Annotations such as text labels and symbols, as provided by medical experts, are used to highlight regions of interest within the images. These annotations, if extracted automatically, could be used in conjunction with either the image caption text or the image citations (mentions) in the articles to improve biomedical information retrieval. In the current study, automatic detection and recognition of text labels in biomedical publication images were investigated. This paper presents both image analysis and feature-based approaches to extract and recognize specific regions of interest (text labels) within images in biomedical publications. Experiments were performed on 6515 characters extracted from text labels present in 200 biomedical publication images. These images are part of the data set from ImageCLEF 2010. Automated character recognition experiments were conducted using geometry-, region-, exemplar-, and profile-based correlation features and Fourier descriptors extracted from the characters. A correct recognition rate as high as 92.67% was obtained with a support vector machine classifier, compared to a 75.90% correct recognition rate with a benchmark Optical Character Recognition technique.
Chapter Preview

1. Introduction

Essential information is often conveyed through images in biomedical publications. Images such as diagrams, tables, histograms, and flowcharts are typically rich in content, summarizing the important results and methods presented in an article. Such images, when used in conjunction with either the image caption text or the image citations (mentions) in the publications, can enhance the performance of Clinical Decision Support (CDS) systems (Demner-Fushman, 2008, 2009; Deserno, 2009). In previous studies, the retrieval of biomedical information for CDS has been primarily text-based, limited mainly to bibliographic information. Traditional Content-Based Image Retrieval (CBIR), in turn, provides automated indexing and retrieval of large image collections. Biomedical images for a given modality (e.g., MRI, histology, or X-ray) are, however, very similar in nature. Therefore, existing CBIR techniques based only on the visual features (texture/shape) of images are not sufficient for accurate retrieval of biomedical images (Pfund, 2002; Müller, 2004; Tang, 1999). In addition to text (image captions/citations) and visual features, characters retrieved from biomedical images can serve as part of a broader process to obtain complementary information for enhanced CBIR.

In the context of CBIR, regions of interest (ROIs) within biomedical images are those that contain illustrations such as arrows, symbols, and text labels. Commonly used methods for CBIR, however, do not utilize these ROIs. The semantic gap in biomedical image analysis can be reduced by characterizing the ROIs rather than analyzing the image only as a single entity (Demner-Fushman, 2009; Deserno, 2009). Lehmann et al. proposed that three additional semantic abstraction levels are required of CBIR systems to understand complex medical knowledge (Lehmann, 2004). These include low-level medical information to understand the imaging modality, mid-level information obtained from ROIs, and high-level information obtained from the spatial relationships of ROIs (Lehmann, 2004, 2005).

Images in biomedical articles are generally of two types: medical images and analytical images. Medical images include MRIs, CT scans, X-rays, photographs, and so forth. Analytical images, such as diagrams, statistical charts, flowcharts, and tables, are created either to illustrate biomedical concepts or to allow for biomedical data analysis. In previous studies involving CBIR, classification of analytical images into their various modalities has been successfully performed (Rahman, 2008; Pourghassem, 2008; Stanley, 2011). The information present within these analytical images must be extracted, however, to support both multimodal (image + text) biomedical information retrieval and CDS (Demner-Fushman, 2007; Cheng, 2011). The study presented in this paper is focused on enhancing the retrieval of textual information from analytical images.

As previously stated, authors often include several forms of annotations with their images. These annotations include, but are not limited to, text, text labels (e.g., A, B, and C), pointers (e.g., arrows and arrowheads), and symbols (e.g., asterisks). Such annotations are used to identify an ROI in the image. In previous CBIR-based research, arrow detection has been found to be successful in several types of biomedical images (Cheng, 2011; Dov, 1999; Park, 2008; Herold, 2010; Hearst, 2007). The integration of semantic annotation and information visualization was performed by Herold et al. to analyze fluorescence micrographs of tissue samples for CBIR applications (Herold, 2010). Previous studies have analyzed biomedical images with text-like characteristics for both the extraction and recognition of textual characters (Hearst, 2007; Wu, 1999; Xu, 2008, 2010; You, 2009, 2010).
