A Generic Framework for Feature Representations in Image Categorization Tasks

Adam Csapo (Budapest University of Technology and Economics, Hungary), Barna Resko (Hungarian Academy of Sciences, Hungary), Morten Lind (Norwegian University of Science and Technology, Norway) and Peter Baranyi (Budapest University of Technology and Economics, Hungary & Hungarian Academy of Sciences, Hungary)
DOI: 10.4018/jssci.2009062503


The computerized modeling of cognitive visual information has been a research field of great interest for the past several decades. The field is interesting not only from a biological perspective, but also from an engineering point of view, when systems are developed to achieve goals similar to those of biological cognitive systems. This article introduces a general framework for the extraction and systematic storage of low-level visual features. The applicability of the framework is investigated in both unstructured and highly structured environments. In the first experiment, a linear categorization algorithm originally developed for the classification of text documents is used to classify natural images taken from the Caltech 101 database. In the second experiment, the framework is used to provide an automatically guided vehicle with obstacle detection and auto-positioning functionalities in highly structured environments. Results demonstrate that the model is highly applicable in structured environments and that it shows promising results in certain cases when used in unstructured environments.
The Visual Feature Array

The Visual Feature Array Concept

At the heart of the proposed concept is a Visual Feature Array (VFA) model, which is a cognitive information processing model that uses the information processing structures defined in the VFA concept. The VFA model obtains information from the environment, performs various operations on it, and then supplies higher-order models of cognitive informatics with its output. The proposed VFA concept allows for the implementation of cognitive functions analogous to those performed in the primary visual cortex (Figure 1).

Figure 1. The VFA concept


Information Processing Structures in the VFA Concept

The information processing structures of the VFA concept describe the units available and their relationships in constructing a VFA model. VFA models can be composed of data arrays and operations that can be performed on them. The models constructed in the VFA concept are functionally similar to the information processing structures in the brain, where a large number of neurons (data arrays in the concept) affect the output of other large numbers of neurons, according to the topology and strength of their connections (operations in the concept).

Data Arrays

The data arrays are multidimensional arrays whose values can be of any type. The arrays were conceived to represent the output of individual neurons responsible for visual features in the modeled neural structures. The values contained in data arrays will be subsequently referred to as (artificial) neurons or computational elements.


Operations

An operation takes a data array of jssci.2009062503.m01 dimensions as its operand, and places its result in a data array of jssci.2009062503.m02 dimensions. The operations of the VFA concept are of SIMD type, and each of them can have static and/or running parameters. Running parameters assume discrete values taken from a limited interval.

Let jssci.2009062503.m03 denote an operation, where jssci.2009062503.m04 is the number of running parameters. Let jssci.2009062503.m05 denote the input data array, and jssci.2009062503.m06 the output data array, where jssci.2009062503.m07 and jssci.2009062503.m08 are the dimensions of jssci.2009062503.m09 and jssci.2009062503.m10 respectively. The performed operation can then be written as jssci.2009062503.m11.
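The distinction between static and running parameters can be illustrated with a minimal Python sketch. The class `Operation`, the method `parameter_grid`, and the example parameter names (`kernel_size`, `angle_deg`, `scale`) are hypothetical and do not come from the article; the sketch only assumes that each running parameter takes a finite list of discrete values and that the operation is evaluated once for every combination of those values.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Operation:
    """Hypothetical description of a VFA-style operation (illustrative names only)."""
    name: str
    # Static parameters keep a single fixed value.
    static_params: dict = field(default_factory=dict)
    # Running parameters map a name to its list of discrete values.
    running_params: dict = field(default_factory=dict)

    def parameter_grid(self):
        """Return every combination of running-parameter values as a list of dicts."""
        names = list(self.running_params)
        value_lists = (self.running_params[n] for n in names)
        return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Illustrative operation with one static and two running parameters.
edge_op = Operation(
    name="oriented_edge",
    static_params={"kernel_size": 3},
    running_params={"angle_deg": [0, 45, 90, 135], "scale": [1, 2]},
)
grid = edge_op.parameter_grid()
```

With four angles and two scales, `grid` holds eight parameter combinations, each of which would index one slice of the output data array.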

Three basic operation types are defined:

  • Filtering operations
  • Lateral operations
  • Projective operations

Filtering operations.

Filtering operations take a data array jssci.2009062503.m12 and output a data array jssci.2009062503.m13, where jssci.2009062503.m14. Given a filtering operation jssci.2009062503.m15, the relationship between the data array dimensions can be written as jssci.2009062503.m16. The subspace of the last k dimensions of jssci.2009062503.m17 defines the results of the filtering operation with respect to the parameter values of jssci.2009062503.m18.
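As a rough illustration, the following Python sketch applies a filtering-style operation with a single running parameter to a small two-dimensional array. The threshold test, the function name `threshold_filter`, and the sample values are assumptions made for this example. The sketch also assumes, based on the description above, that the output gains one trailing dimension per running parameter; the article's own formula for this relationship is not reproduced here.

```python
def threshold_filter(image, thresholds):
    """Apply the same threshold test to every pixel for each running-parameter value.

    image:      2-D nested list indexed as [row][col].
    thresholds: discrete values of the single running parameter.
    Returns a 3-D nested list indexed as [row][col][threshold_index].
    """
    return [
        [[1 if pixel >= t else 0 for t in thresholds] for pixel in row]
        for row in image
    ]


image = [[0, 5], [9, 3]]
responses = threshold_filter(image, thresholds=[2, 4, 8])
```

Here the input has two dimensions and the output has three: `responses[r][c]` lists the result for pixel `(r, c)` under each of the three threshold values.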

Lateral operations.

Lateral operations take a data array jssci.2009062503.m19 and output a data array jssci.2009062503.m20, where jssci.2009062503.m21. Furthermore, the input data array is replaced by the output data array. Lateral operations therefore allow for the definition of recurrent functionalities.
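A minimal sketch of a lateral-style update in Python follows. The neighbour-averaging rule and the name `lateral_step` are hypothetical choices for illustration; the only properties taken from the text are that the output has the same shape as the input and that it overwrites the input, so the step can be applied repeatedly.

```python
def lateral_step(values):
    """Replace each element with the mean of itself and its immediate neighbours.

    The whole update is computed from the old values first, then written back
    into the same list, so the input array is replaced by the output array.
    """
    updated = []
    for i in range(len(values)):
        window = values[max(0, i - 1): i + 2]
        updated.append(sum(window) / len(window))
    values[:] = updated  # overwrite the input in place
    return values


activations = [0.0, 3.0, 0.0]
lateral_step(activations)
```

Calling `lateral_step(activations)` again would operate on the already updated values, which is the recurrent behaviour the text refers to.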

Projective operations.

Projective operations take a data array jssci.2009062503.m22 and output a data array jssci.2009062503.m23, where jssci.2009062503.m24. Such operations perform the same calculations on all values along the given jssci.2009062503.m25 dimensions of the input data array. The result is stored in an output data array that contains all of the input data array dimensions, except for the jssci.2009062503.m26 dimensions along which the projective operation was performed.
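The dimension-reducing behaviour can be sketched in Python as follows. The function name `project_last_axis`, the choice of `sum` as the reducing calculation, and the sample data are assumptions for illustration; the sketch projects along a single dimension (the last one), whereas the text allows several.

```python
def project_last_axis(array3d, reducer=max):
    """Collapse the last dimension of a 3-D nested list with the given reducer.

    The same calculation is applied to every vector along the last axis, and
    the result keeps only the remaining [row][col] dimensions.
    """
    return [[reducer(cell) for cell in row] for row in array3d]


# Illustrative 2 x 2 x 3 input array.
stack = [[[0, 0, 0], [1, 1, 0]], [[1, 1, 1], [1, 0, 0]]]
projected = project_last_axis(stack, reducer=sum)
```

The input has three dimensions and `projected` has two, since the dimension along which the calculation was performed is dropped.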
