Bio-Inspired Scheme for Classification of Visual Information

Le Dong (University of Electronic Science and Technology of China, China), Ebroul Izquierdo (University of London, UK) and Shuzhi Ge (University of Electronic Science and Technology of China, China)
Copyright: © 2011 |Pages: 25
DOI: 10.4018/978-1-60960-024-2.ch014


This chapter presents research on visual information classification based on biologically inspired visually selective attention with knowledge structuring. The research objective is to develop visual models and corresponding algorithms that automatically extract features from selected essential areas of natural images and, ultimately, achieve knowledge structuring and classification within a structural description scheme. The proposed scheme consists of three main aspects: biologically inspired visually selective attention, knowledge structuring, and classification of visual information. Biologically inspired visually selective attention closely follows the mechanisms of the visual “what” and “where” pathways in the human brain. The proposed visually selective attention model uses a bottom-up approach to generate essential areas based on low-level features extracted from natural images. The model also exploits a low-level top-down selective attention mechanism that decides on objects of interest through human interaction expressing preference or refusal. Knowledge structuring automatically creates a relevance map from the essential areas generated by visually selective attention. The developed algorithms derive a set of well-structured representations from low-level descriptions to drive the final classification. Knowledge structuring relies on human knowledge to produce suitable links between low-level descriptions and high-level representations on a limited training set. Its backbone is a distribution mapping strategy involving two novel modules: structured low-level feature extraction using a convolutional neural network, and topology preservation based on sparse representation and an unsupervised learning algorithm. Classification is achieved by simulating high-level top-down visual information perception and classification using an incremental Bayesian parameter estimation method.
The utility of the proposed scheme for solving relevant research problems is validated. The proposed modular architecture offers straightforward expansion to include user relevance feedback, contextual input, and multimodal information if available.
Chapter Preview


In this chapter, well-established attention models are exploited to build a model for image analysis and classification that follows human perception and interpretation of natural images. The proposed approach aims to mimic the human visual system to some extent and to use this to achieve higher accuracy in image classification. Low-level features are used to generate an essential area in the image of concern. A method is developed to generate a topology representation based on the structured low-level features. Using this method, the preservation of new objects from a previously perceived ontology, in conjunction with the color and texture perceptions, can be processed autonomously and incrementally. The topology representation network structure consists of the posterior probability and the prior frequency distribution map of each image cluster conveying a given semantic concept. The proposed framework uses a biologically inspired visually selective attention model and knowledge structuring techniques to approximate human-like inference.

In contrast to related work in the conventional literature, the proposed framework exploits known fundamental properties of biological visual systems and a suitable knowledge structuring model to achieve classification of natural images.

Based on the roles of the brain structures involved in visual information processing, a biologically inspired framework for visually selective attention has been developed. The system is implemented by mimicking the functions and connections of these brain structures, including the visual pathway in the human brain. It has human-like mechanisms such as incremental learning, a social function with human interaction processing, an autonomous mechanism for visually selective attention, and low-level visual information perception.
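
The bottom-up selection of an essential area can be illustrated with a minimal sketch. It assumes a single intensity channel and a plain center-surround contrast, which is one common way to build a bottom-up saliency map and is not necessarily the model used in the chapter. The function names, window radii, and toy image are hypothetical.

```python
def box_mean(img, r, c, radius):
    """Mean intensity in a square window around (r, c), clipped to the image."""
    rows, cols = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]
    return sum(vals) / len(vals)


def saliency_map(img, center_radius=0, surround_radius=2):
    """Center-surround contrast per pixel: |center mean - surround mean|."""
    return [[abs(box_mean(img, r, c, center_radius)
                 - box_mean(img, r, c, surround_radius))
             for c in range(len(img[0]))]
            for r in range(len(img))]


def most_salient(img):
    """Return (row, col) of the highest-contrast location (assumed heuristic)."""
    sal = saliency_map(img)
    _, row, col = max((sal[r][c], r, c)
                      for r in range(len(sal))
                      for c in range(len(sal[0])))
    return row, col


# Toy 7x7 image: dark background with one bright pixel.
image = [[0.0] * 7 for _ in range(7)]
image[3][4] = 1.0
print(most_salient(image))
```

A fuller model would combine several feature channels (for example color and orientation) at multiple scales before selecting an area; the sketch keeps one channel and one scale for brevity.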

An object non-specific perception model is used in the knowledge structuring; it can represent an arbitrary object by using the sparse coded features of a convolutional neural network (CNN). A generative model based on Bayes’ theorem and Gaussian mixtures can classify an arbitrary object in a feature space generated by a CNN [Vailaya and Jain 1999]. Moreover, the developed model helps perceive an arbitrary object in a natural scene by recognizing its category with the maximum likelihood (ML) method. A key feature of the proposed biologically inspired framework is that the training object area is decided automatically by the proposed selective attention model rather than by hand.
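
As a rough illustration of Bayes-rule classification with Gaussian mixtures, the sketch below scores a feature vector under one diagonal-covariance mixture per class and picks the best class. The two-dimensional features, class names, and mixture parameters are invented for the example; in the chapter the features come from a CNN and the parameters would be learned from data.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def log_mixture(x, components):
    """Log-likelihood of x under a mixture of (weight, mean, var) triples."""
    terms = [math.log(w) + log_gaussian(x, m, v) for w, m, v in components]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify(x, models, priors=None):
    """Pick the class maximizing log prior + log likelihood.

    With priors=None the priors are uniform, so this reduces to a
    maximum likelihood decision.
    """
    def score(label):
        prior = 0.0 if priors is None else math.log(priors[label])
        return prior + log_mixture(x, models[label])
    return max(models, key=score)


# Hypothetical class models: two mixture components per class.
MODELS = {
    "sky": [(0.6, [0.2, 0.8], [0.02, 0.02]), (0.4, [0.3, 0.9], [0.02, 0.02])],
    "grass": [(0.5, [0.7, 0.3], [0.02, 0.02]), (0.5, [0.8, 0.2], [0.02, 0.02])],
}

print(classify([0.25, 0.85], MODELS))
```

Passing a `priors` dictionary turns the decision into a maximum a posteriori one, which is where class frequencies would enter.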

An important contribution of the presented work is the dynamic preservation of high-level representation of visual information based on the visually selective attention of a natural scene.

Another important feature of the proposed framework is the constant evaluation of the confidence and support measures used in the classification of visual information. As a result, the associations and frequencies for each class are continually updated according to the inference rules.
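
One way to picture continually updated support and confidence is with running counts. The sketch below assumes the standard association-rule definitions for a rule "cluster -> class" (support is the fraction of all observations matching both, confidence is the fraction of the cluster's observations with that class); the chapter's exact measures may differ, and the class and method names are hypothetical.

```python
from collections import Counter


class RuleStats:
    """Running support and confidence for rules of the form cluster -> class."""

    def __init__(self):
        self.total = 0
        self.cluster_counts = Counter()
        self.pair_counts = Counter()

    def update(self, cluster, label):
        """Fold one newly classified observation into the counts."""
        self.total += 1
        self.cluster_counts[cluster] += 1
        self.pair_counts[(cluster, label)] += 1

    def support(self, cluster, label):
        """Fraction of all observations that match both cluster and class."""
        if self.total == 0:
            return 0.0
        return self.pair_counts[(cluster, label)] / self.total

    def confidence(self, cluster, label):
        """Fraction of the cluster's observations that carry this class."""
        seen = self.cluster_counts[cluster]
        return self.pair_counts[(cluster, label)] / seen if seen else 0.0


stats = RuleStats()
for cluster, label in [("c1", "tiger"), ("c1", "tiger"), ("c1", "cat"), ("c2", "car")]:
    stats.update(cluster, label)

print(stats.support("c1", "tiger"), stats.confidence("c1", "tiger"))
```

Because only counts are stored, each new observation updates the measures in constant time, which suits an incremental setting.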

Last but not least, the presented framework integrates the visually selective attention model with a graphical model-based topology representation, thus providing suitable low- and high-level features and a reasonable premise to drive the final classification.

These novel features, together with an open and modular architecture, enable important extensions to include user relevance feedback, contextual input, and multimodal information if available. These extensions are the subject of ongoing implementation work targeting enhanced robustness and classification accuracy.
