A Generic Framework for Feature Representations in Image Categorization Tasks

Adam Csapo (Budapest University of Technology and Economics, Hungary), Barna Resko (Hungarian Academy of Sciences, Hungary), Morten Lind (NTNU, Dept. of Production and Quality Engineering, Norway), Peter Baranyi (Budapest University of Technology and Economics, Hungary, & Hungarian Academy of Sciences, Hungary) and Domonkos Tikk (Budapest University of Technology and Economics, Hungary)
Copyright: © 2012 | Pages: 22
DOI: 10.4018/978-1-4666-0261-8.ch029


The computerized modeling of cognitive visual information has attracted considerable research interest over the past several decades. The field is interesting not only from a biological perspective, but also from an engineering point of view, when systems are developed to achieve goals similar to those of biological cognitive systems. This paper introduces a general framework for the extraction and systematic storage of low-level visual features. The applicability of the framework is investigated in both unstructured and highly structured environments. In a first experiment, a linear categorization algorithm originally developed for the classification of text documents is used to classify natural images taken from the Caltech 101 database. In a second experiment, the framework is used to provide an automatically guided vehicle with obstacle detection and auto-positioning functionalities in highly structured environments. Results demonstrate that the model is highly applicable in structured environments and also shows promise in certain cases when used in unstructured environments.
The Visual Feature Array

The Visual Feature Array Concept

At the heart of the proposed concept is a Visual Feature Array (VFA) model, which is a cognitive information processing model that uses the information processing structures defined in the VFA concept. The VFA model obtains information from the environment, performs various operations on it, and then supplies higher-order models of cognitive informatics with its output. The proposed VFA concept allows for the implementation of cognitive functions analogous to those performed in the primary visual cortex (Figure 1).

Figure 1. The VFA concept


Information Processing Structures in the VFA Concept

The information processing structures of the VFA concept describe the units available and their relationships in constructing a VFA model. VFA models can be composed of data arrays and operations that can be performed on them. The models constructed in the VFA concept are functionally similar to the information processing structures in the brain, where a large number of neurons (data arrays in the concept) affect the output of other large numbers of neurons, according to the topology and strength of their connections (operations in the concept).

Data Arrays

The data arrays are multidimensional arrays whose values can be of any type. The arrays were conceived to represent the output of individual neurons responsible for visual features in the modeled neural structures. The values contained in data arrays will be subsequently referred to as (artificial) neurons or computational elements.

Operations


An operation takes a data array of $n$ dimensions as its operand, and places its result in a data array of $m$ dimensions. The operations of the VFA concept are of SIMD type, and each of them can have static and/or running parameters. Running parameters assume discrete values taken from a limited interval.

Let $O^k$ denote an operation, where $k$ is the number of running parameters. Let $A^n$ denote the input data array, and $B^m$ the output data array, where $n$ and $m$ are the numbers of dimensions of $A$ and $B$, respectively. The performed operation can then be written as $B^m = O^k(A^n)$.

Three basic operation types are defined:

  • Filtering operations

  • Lateral operations

  • Projective operations

Filtering Operations

Filtering operations take a data array $A^n$ and output a data array $B^m$, where $m \geq n$. Given a filtering operation $F^k$, the relationship between the data array dimensions can be written as $m = n + k$. The subspace of the last $k$ dimensions of $B^m$ defines the results of the filtering operation with respect to the parameter values of $F^k$.
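
The following minimal sketch illustrates how a filtering operation can add output dimensions for its running parameters. It is not taken from the chapter: the function name `threshold_filter`, the thresholding rule, and the sample values are hypothetical, and the sketch assumes that each running parameter contributes exactly one output dimension. A 2-dimensional input with one running parameter therefore yields a 3-dimensional output whose last axis indexes the parameter values.

```python
def threshold_filter(image, thresholds):
    """Hypothetical filtering operation with one running parameter.

    image:      2-D nested list (the input data array).
    thresholds: discrete values the running parameter may assume.
    Returns a 3-D nested list; the last axis holds one result per
    threshold value, computed identically for every element (SIMD style).
    """
    return [
        [[1 if pixel >= t else 0 for t in thresholds] for pixel in row]
        for row in image
    ]


image = [[0, 2, 4],
         [1, 3, 5]]
thresholds = [1, 3, 5]  # assumed discrete values from a limited interval

response = threshold_filter(image, thresholds)
print(response[0][1])  # pixel value 2 -> [1, 0, 0]
print(response[1][2])  # pixel value 5 -> [1, 1, 1]
```

Reading `response[i][j]` gives the behaviour of one input element across all parameter values, which is the role the text assigns to the last dimensions of the output array.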

Lateral Operations

Lateral operations take a data array $A^n$ and output a data array $B^m$, where $m = n$. Furthermore, the input data array is replaced by the output data array. Lateral operations therefore allow for the definition of recurrent functionalities.
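
A lateral operation can be sketched as an update that preserves the shape of the array and writes its result back over the input, so that repeated application behaves recurrently. The update rule below (each cell takes the maximum of itself and its immediate neighbours) and the name `lateral_step` are illustrative assumptions, not the chapter's own operation.

```python
def lateral_step(cells):
    """One hypothetical lateral update on a 1-D data array.

    Each cell becomes the maximum of itself and its immediate
    neighbours; the output has the same length as the input.
    """
    n = len(cells)
    return [max(cells[max(i - 1, 0):min(i + 2, n)]) for i in range(n)]


cells = [0, 0, 1, 0, 0]
history = [list(cells)]
for _ in range(2):
    cells[:] = lateral_step(cells)  # the output replaces the input in place
    history.append(list(cells))

print(history)
# [[0, 0, 1, 0, 0], [0, 1, 1, 1, 0], [1, 1, 1, 1, 1]]
```

Because each step reads the values written by the previous one, activity spreads one neighbour further per iteration, which is the kind of recurrent behaviour the replacement of the input array makes possible.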

Projective Operations

Projective operations take a data array $A^n$ and output a data array $B^m$, where $m < n$. Such operations perform the same calculations on all values along the given $n - m$ dimensions of the input data array. The result is stored in an output data array that contains all of the input data array dimensions, except for the $n - m$ dimensions along which the projective operation was performed.
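
As a rough illustration, a projective operation resembles a reduction along one or more axes. The sketch below collapses the last axis of a 2-dimensional array into a 1-dimensional result; the helper name `project`, the choice of `max` and `sum` as reductions, and the sample values are assumptions made for the example.

```python
def project(array2d, reduce_fn):
    """Hypothetical projective operation on a 2-D nested list.

    Applies the same reduction to all values along the last axis and
    returns a 1-D list, so the projected axis no longer appears.
    """
    return [reduce_fn(row) for row in array2d]


responses = [[0, 2, 4],
             [1, 3, 5]]

strongest = project(responses, max)  # [4, 5]
total = project(responses, sum)      # [6, 9]
print(strongest, total)
```

The output keeps the first axis of the input and drops the axis along which the reduction ran, matching the description of the output array given above.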
