Computational Information-Maximization Models

Mitja Peruš (University of Ljubljana, Slovenia) and Chu Kiong Loo (Multimedia University, Malaysia)
DOI: 10.4018/978-1-61520-785-5.ch003

3.1 Maximal Preservation Of Information (“Infomax”)

Linsker’s infomax-net. The early “infomax” perception model by Linsker (1988) was based on the observation that Hebbian adjustment of connections can evolve feature-selective neurons which collectively preserve information “as much as possible” during input-to-output processing. Linsker (1988) used a multi-layer network with local feed-forward connections that are Hebb-like, i.e. determined by covariance. Here, local means that each neuron in layer l+1 receives inputs from neurons in a confined circular region of layer l. The “infomax” ideas used in the classical PCA-like¹ self-organizing net by Linsker (1988, and later works) were subsequently developed much further in the direction of ICA.
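The link between a covariance-driven Hebbian rule and PCA-like behaviour can be illustrated with a small sketch. It is not Linsker's network: it assumes a single linear neuron trained with Oja's normalized Hebbian rule (a swapped-in, standard stand-in), and the function names, learning rate and toy data are hypothetical choices made only for illustration.

```python
import math
import random

def oja_neuron(samples, lr=0.005, epochs=20, seed=0):
    """Train one linear neuron with Oja's normalized Hebbian rule (illustrative)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Centre the data so the Hebbian term tracks covariance, not raw correlation.
    mean = [sum(x[i] for x in samples) / len(samples) for i in range(dim)]
    centred = [[x[i] - mean[i] for i in range(dim)] for x in samples]
    w = [rng.gauss(0.0, 0.1) for _ in range(dim)]
    for _ in range(epochs):
        for x in centred:
            y = sum(wi * xi for wi, xi in zip(w, x))  # neuron output
            # Hebbian growth y*x minus a decay y^2*w that keeps |w| close to 1.
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

def make_correlated_inputs(n=500, seed=1):
    """Toy two-channel input: a shared signal plus small independent noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)
        data.append([s + rng.gauss(0.0, 0.3), s + rng.gauss(0.0, 0.3)])
    return data

weights = oja_neuron(make_correlated_inputs())
norm = math.sqrt(sum(w * w for w in weights))
direction = [w / norm for w in weights]
print("learned weight direction:", direction)
```

Under these assumptions the weight vector should settle near the direction of largest input variance, here the diagonal shared by both channels, which is the sense in which such a Hebbian neuron is PCA-like.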

From second- to higher-order statistics. As mentioned in section 1.5, ICA is thought to be important for cortical image processing because it takes into account statistics of the input data not only of second order but also of higher order.² Correlation and convolution, for example, are second-order “learning (i.e., memory storage) rules”. Such mathematical expressions, used in unsupervised learning models, have been named Hebbian (directly or in a generalized sense, e.g. the covariance “learning rule” and PCA). ICA goes beyond the second-order statistics of PCA because it also processes higher-order terms in a series of statistical quantities: moments or cumulants. This is related to neuronal interactions of higher order, i.e. more than two neurons may interact at the same time because they are directly connected (e.g., having synapses close to each other and affecting one another), or because their activity is coupled in another way (e.g., as in coherent oscillations).
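A minimal sketch of why second-order statistics can miss structure that higher-order cumulants detect: the two toy signals below are constructed (as an assumption of this example, not taken from the chapter) to share the same mean and variance, so they look alike to second-order measures, yet their normalized fourth-order cumulant (excess kurtosis) differs. All names are hypothetical.

```python
import random

def central_moment(xs, order):
    """Sample central moment of the given order."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** order for x in xs) / len(xs)

def excess_kurtosis(xs):
    """Fourth-order cumulant divided by variance squared; about 0 for a Gaussian."""
    var = central_moment(xs, 2)
    return central_moment(xs, 4) / var ** 2 - 3.0

rng = random.Random(0)
n = 20000
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
# A uniform variable on [-sqrt(3), sqrt(3)] also has zero mean and unit variance.
half_width = 3 ** 0.5
uniform = [rng.uniform(-half_width, half_width) for _ in range(n)]

var_g, var_u = central_moment(gaussian, 2), central_moment(uniform, 2)
kurt_g, kurt_u = excess_kurtosis(gaussian), excess_kurtosis(uniform)
print("variances:", var_g, var_u)        # nearly identical (second order)
print("excess kurtosis:", kurt_g, kurt_u)  # clearly different (fourth order)
```

The theoretical excess kurtosis of the uniform distribution is -1.2, while that of the Gaussian is 0; a method restricted to variances and covariances has no access to this difference.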

In the case of oscillatory activity in neurons and subcellular structures, including dendritic fields, the relevant variables are local phases and their relations (King et al., 2000). Oscillations are described by complex exponential functions (the exponent, without the imaginary unit i, is the phase).³
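As a hedged illustration of this description, an oscillation can be written in the following generic form; the symbols A (amplitude), ω (angular frequency), φ₀ (initial phase) and the indices j, k are assumptions of this sketch and do not come from the chapter.

```latex
% Generic complex-exponential form of an oscillation (illustrative notation)
z(t) = A\, e^{i\varphi(t)} = A \left[ \cos\varphi(t) + i \sin\varphi(t) \right],
\qquad \varphi(t) = \omega t + \varphi_0
% Phase relation between two oscillators j and k
\Delta\varphi_{jk}(t) = \varphi_j(t) - \varphi_k(t)
```

In this notation the phase φ(t) is the exponent with the imaginary unit i removed, and a relation between two local phases is simply their difference.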

Phases needed for edges. Phase information is needed for a successful approximation of V1 filters, which were found to be localized, oriented and band-pass (i.e., selective to structure at different spatial scales). The local angle of orientation is described by local phase. Such filters are needed to trace segments of edges, which are themselves oriented. Experiments show that individual simple cells of V1, with their specific receptive fields, act as such filters, i.e. as selectors of edge-segments that respond maximally to specifically oriented stimuli.
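The three filter properties named here (localized, oriented, band-pass) can be sketched with a Gabor function, a common idealization of such receptive fields. This is an assumption of the example rather than a model given in the text, and the kernel size, wavelength and width below are arbitrary, hypothetical values.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, phase=0.0):
    """Gaussian envelope (localized) times a sinusoidal carrier (oriented, band-pass)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate along the direction in which the carrier varies.
            u = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * u / wavelength + phase))
        kernel.append(row)
    return kernel

def grating(size, wavelength, theta):
    """Sinusoidal test stimulus with a given spatial scale and orientation."""
    half = size // 2
    return [[math.cos(2.0 * math.pi
                      * (x * math.cos(theta) + y * math.sin(theta)) / wavelength)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

def response(kernel, patch):
    """Magnitude of the inner product between filter and stimulus."""
    return abs(sum(k * p for krow, prow in zip(kernel, patch)
                   for k, p in zip(krow, prow)))

kernel = gabor_kernel(size=21, wavelength=6.0, theta=0.0, sigma=4.0)
matched = response(kernel, grating(21, 6.0, 0.0))             # same orientation and scale
orthogonal = response(kernel, grating(21, 6.0, math.pi / 2))  # rotated by 90 degrees
coarse = response(kernel, grating(21, 24.0, 0.0))             # same orientation, coarser scale
print(matched, orthogonal, coarse)
```

The filter responds strongly only to the stimulus that matches both its orientation and its spatial scale, which is the kind of selectivity described for simple cells above.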

An edge (of an object), as a crucial element of an image, manifests specific relationships among many pixels encoded in neurons or receptors, not only between two (neighboring) ones. Second-order statistics (as in PCA) are sensitive only to pair-wise relationships, like the correlations encoded in the Hebb rule. Higher-order statistics (as in ICA) are sensitive to multi-neuronal relationships, reflecting multi-pixel gestalt-structures, and thus go beyond the two-pixel (or two-neuron, respectively) relations of PCA. In PCA, gestalts are formed from feature-segments by global organization from local bilateral connections (usual simple synapses). In ICA, however, gestalts are formed directly by semi-local multilateral connections (complex synaptic structures⁴). This allows ICA to detect edges with filters that are substantially more localized (wavelet-like) than would be achievable with PCA, which performs global (Fourier-like) spatial and spatial-frequency analysis.

Bell & Sejnowski (1996, pp. 261-262) write: “[Hebbian models] reflect only the amplitude spectrum of the signal and ignore the phase spectrum where most of the suspicious local coincidences in natural signals take place. An edge in an image, for example, is a coincidence in the phase spectrum, since if we were to Fourier analyse it, we would see many sine waves of different frequencies, all aligned in phase where the edge occurred. Correlation-based methods cannot learn edge-detectors, though they often may seem to be doing so by local-windowing of the learnt Fourier components, turning them into Gabor-like filters…”
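The point of this quotation can be checked numerically with a small sketch. It assumes a one-dimensional periodic step signal as the "edge" and a naive discrete Fourier transform; the function names and the phase-scrambling procedure are hypothetical choices for illustration, not a method from the chapter. Randomizing the phases leaves the amplitude spectrum, and therefore the autocorrelation, unchanged, yet the sharp edge disappears.

```python
import cmath
import math
import random

def dft(signal):
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def scramble_phases(spectrum, seed=0):
    """Keep every amplitude, draw new random phases (conjugate-symmetric)."""
    rng = random.Random(seed)
    n = len(spectrum)
    out = list(spectrum)
    for k in range(1, (n + 1) // 2):
        phase = rng.uniform(-math.pi, math.pi)
        out[k] = abs(spectrum[k]) * cmath.exp(1j * phase)
        out[n - k] = out[k].conjugate()  # keeps the inverse transform real
    return out

def circular_autocorrelation(signal):
    """Second-order statistic: products of pairs of samples at each lag."""
    n = len(signal)
    return [sum(signal[t] * signal[(t + lag) % n] for t in range(n))
            for lag in range(n)]

def largest_jump(signal):
    n = len(signal)
    return max(abs(signal[(t + 1) % n] - signal[t]) for t in range(n))

edge = [0.0] * 32 + [1.0] * 32  # step edge: all Fourier components aligned in phase
spectrum = dft(edge)
scrambled = [v.real for v in idft(scramble_phases(spectrum))]

amp_diff = max(abs(abs(a) - abs(b)) for a, b in zip(spectrum, dft(scrambled)))
acf_diff = max(abs(a - b) for a, b in zip(circular_autocorrelation(edge),
                                          circular_autocorrelation(scrambled)))
print("amplitude spectrum difference:", amp_diff)
print("autocorrelation difference:", acf_diff)
print("largest jump:", largest_jump(edge), "vs", largest_jump(scrambled))
```

Both signals are indistinguishable to any measure built from pairwise correlations, while only the original contains the sharp transition, which is consistent with the claim that the edge lives in the phase spectrum.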
