Image Analysis and Understanding Based on Information Theoretical Region Merging Approaches for Segmentation and Cooperative Fusion

Felipe Calderero (Universitat Pompeu Fabra (UPF), Spain) and Ferran Marqués (Technical University of Catalonia (UPC), Spain)
DOI: 10.4018/978-1-4666-2518-1.ch004

Abstract

This chapter addresses the automatic creation of simplified versions of the image, known as image segmentation or partition, preserving the most semantically relevant information of the image at different levels of analysis. From a semantic and practical perspective, image segmentation is a first and key step for image analysis and pattern recognition since region-based image representations provide a first level of abstraction and a reduction of the number of primitives, leading to a more robust estimation of parameters and descriptors. The proposed solution is based on an important class of hierarchical bottom-up segmentation approaches, known as region merging techniques. These approaches naturally provide a bottom-up hierarchy, more suitable when no a priori information about the image is available, and an excellent compromise between efficiency of computation and representation. The chapter is organized in two parts dealing with the following objectives: (i) provide an unsupervised solution to the segmentation of generic images; (ii) design a generic and scalable scheme to automatically fuse hierarchical segmentation results that increases the robustness and accuracy of the final solution.

Introduction

So easy, yet so difficult. Image analysis and understanding is an immediate and almost innate task for humans, but a challenging and surprisingly difficult task for artificial vision systems and computers (Levine, 1985). Because the neurological and cognitive processes involved in human vision are not yet understood in full detail, not only the high-level processes but also the lowest and most fundamental stages of visual processing remain a stimulating and wide-open problem. Nevertheless, psychological studies (Sternberg, 2003; Bruce, 1996) agree on some key principles, such as the importance of perceptual grouping as a preliminary stage in the visual processing chain that leads to scene recognition and understanding.

Perceptual grouping (Lowe, 1985) refers to the human visual ability to extract significant image relations from lower-level primitive image features, without any knowledge of the image content, and to group them to obtain meaningful higher-level structures. Research in perceptual grouping was started in the 1920s by Gestalt psychologists, whose goal was to discover the underlying principle that would unify the various grouping phenomena of human perception. Gestalt psychologists observed the tendency of the human visual system to perceive configurational wholes, with rules that govern the uniformity of psychological grouping for perception and recognition, as opposed to recognition by analysis of discrete primitive image features (Geisler, 2000).

Inspired by this perceptual foundation (Desolneux, 2004), the image processing community has worked tirelessly on image segmentation for more than four decades. Image segmentation refers to the process of partitioning a digital image into a set of disjoint groups of connected pixels, known as regions. The benefit of this region-based image representation, or image partition, is twofold. From a semantic point of view, image segmentation is a first level of abstraction that provides an image representation closer to the object representation than the set of pixels (Ballard, 1982). Ideally, we expect objects to be formed by a single region or by a union of regions (adjacent or not). From a practical point of view, a region-based representation of the image reduces the number of elementary primitives and allows a more robust estimation of parameters and descriptors, especially when compared with a direct pixel-based representation, and facilitates later processing, storage, and retrieval. In other words, segmentation simplifies the image into primitives that are more semantically meaningful and easier to analyze (Forsyth, 2003). However, it is a critical step, since the results of the segmentation have considerable influence over all the subsequent processes of image analysis (Zhang, 1995), such as object representation and description, feature measurement, and even higher-level tasks such as object classification or scene interpretation.

Apart from the intrinsic difficulty of discovering and emulating the perceptual grouping principles and mechanisms of human vision, the image segmentation problem is ill-posed in a double sense. The first type of ill-posedness refers to the fact that, in a large number of cases, a unique solution to the image segmentation problem does not exist. Instead of a single optimal partition, it is possible to find different region-based explanations of an image at different levels of analysis or detail (Bertero, 1988). Hence, the desired segmentation result depends on the image itself and on the particular application or purpose of the analysis to be performed. For instance, consider the example shown in Figure 1. If we are interested in counting the number of tigers in the image or in detecting the presence of an animal in the scene, the partition in Figure 1(b) would be appropriate. However, if our purpose is to recognize the animal or to count the number of stripes of the tiger, we would find the partition in Figure 1(c) a more valuable segmentation.

Figure 1.

Non-uniqueness of the solution to the image segmentation problem: counting tigers or stripes? (a) Original image extracted from the Berkeley Segmentation Dataset (Martin, 2001). (b) Image partition into 2 regions. (c) Image partition into 100 regions. Both partitions were computed using one of the techniques presented in this chapter.
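
To make the idea of partitions at different levels of detail concrete, the following is a minimal sketch of a greedy bottom-up region merging procedure. It is not the information theoretical technique developed in this chapter. As illustrative assumptions, the region model is the mean intensity of each region, the merging criterion is a size-weighted squared difference of region means, and the merging order always picks the cheapest pair of 4-adjacent regions. The function name `region_merging` and its parameters are hypothetical.

```python
from itertools import product


def region_merging(image, n_regions):
    """Greedily merge 4-adjacent regions of a grayscale image (list of rows)
    until `n_regions` remain. Returns a label map of the same shape."""
    h, w = len(image), len(image[0])
    if not 1 <= n_regions <= h * w:
        raise ValueError("n_regions must be between 1 and the number of pixels")

    # Region model (assumed): pixel count and intensity sum, i.e. the mean.
    label = {(r, c): r * w + c for r, c in product(range(h), range(w))}
    size = {lab: 1 for lab in label.values()}
    total = {label[(r, c)]: float(image[r][c]) for (r, c) in label}

    # Region adjacency graph built from 4-connectivity (right and down links).
    adj = {lab: set() for lab in size}
    for r, c in product(range(h), range(w)):
        for dr, dc in ((0, 1), (1, 0)):
            rr, cc = r + dr, c + dc
            if rr < h and cc < w:
                a, b = label[(r, c)], label[(rr, cc)]
                adj[a].add(b)
                adj[b].add(a)

    def cost(a, b):
        # Merging criterion (assumed stand-in): size-weighted squared
        # difference between the two region means.
        mean_a, mean_b = total[a] / size[a], total[b] / size[b]
        weight = size[a] * size[b] / (size[a] + size[b])
        return weight * (mean_a - mean_b) ** 2

    while len(size) > n_regions:
        # Merging order: cheapest adjacent pair first (ties broken by label).
        pairs = ((a, b) for a in adj for b in adj[a] if a < b)
        a, b = min(pairs, key=lambda p: (cost(*p), p))

        # Merge region b into region a and update the model and the graph.
        size[a] += size.pop(b)
        total[a] += total.pop(b)
        for n in adj.pop(b):
            adj[n].discard(b)
            if n != a:
                adj[n].add(a)
                adj[a].add(n)
        for pixel, lab in label.items():
            if lab == b:
                label[pixel] = a

    return [[label[(r, c)] for c in range(w)] for r in range(h)]
```

Calling the function on the same image with a small and a large `n_regions` yields a coarse and a fine partition of the same merging hierarchy, in the spirit of the 2-region and 100-region partitions of Figure 1. The exhaustive search for the cheapest pair keeps the sketch short; a practical implementation would more likely maintain a priority queue of merging costs.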

Key Terms in this Chapter

Region Merging: This term refers to a family of hierarchical bottom-up image segmentation techniques. Starting from an initial partition or from the collection of pixels, regions are iteratively merged until a certain criterion is fulfilled (for instance, a single region is obtained). They are characterized by: a region model, a merging criterion, and a merging order.

Information Fusion: Combining data from multiple sources and gathering that information into discrete, actionable items in order to draw inferences that are more efficient and more narrowly tailored than those drawn from the separate sources. In computer vision, image fusion is the process of combining relevant information from two or more images into a single image. The resulting image is intended to be more informative than any of the input images.

Information Diversity: In the context of pattern classifier combination and classifier ensembles, it refers to the specific and valuable knowledge that each classifier provides to the ensemble. Theoretically, the larger the diversity in an ensemble, the richer the information available to the system and the higher its expected performance.

Image Partition: Result of the segmentation process applied to a digital image.

Region: Set of connected (adjacent) pixels of a digital image. Connectivity of pixels is usually defined in terms of 4-connectivity (top, bottom, left, and right neighbor pixels) or 8-connectivity (which also includes the four diagonal neighbors).

Image Segmentation: Process of partitioning a digital image into a set of disjoint groups of connected pixels (known as regions).
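
The difference between the 4-connectivity and 8-connectivity mentioned in the Region definition can be illustrated with a short sketch that counts the connected groups of foreground pixels in a binary mask. The function name `count_regions` and the offset tables are hypothetical, chosen only for this illustration.

```python
from collections import deque

# Neighbor offsets (row, column) for the two usual connectivity definitions.
OFFSETS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
OFFSETS_8 = OFFSETS_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_regions(mask, connectivity=4):
    """Count connected groups of truthy pixels in a 2D mask (list of rows)."""
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    offsets = OFFSETS_4 if connectivity == 4 else OFFSETS_8
    h, w = len(mask), len(mask[0])
    seen = set()
    count = 0
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or (r, c) in seen:
                continue
            # New region found: flood it with a breadth-first search.
            count += 1
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and mask[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
    return count


diagonal = [[1, 0],
            [0, 1]]
print(count_regions(diagonal, connectivity=4))  # 2: the pixels only touch diagonally
print(count_regions(diagonal, connectivity=8))  # 1: diagonal neighbors are connected
```

The same pair of pixels therefore forms two regions or one, depending on the connectivity chosen, which is why a segmentation method needs to state which definition it uses.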
