Mathematical Foundations Modeled after Neo-Cortex for Discovery and Understanding of Structures in Data

Shubha Kadambe (Rockwell Collins, USA)
DOI: 10.4018/978-1-4666-2539-6.ch010

Abstract

Even though there are distinct areas for different functionalities in the mammalian neo-cortex, it seems to use the same algorithm to understand a large variety of input modalities. In addition, it appears that the neo-cortex effortlessly identifies the correlation among many sensor modalities and fuses information obtained from them. The question then is, can we discover the brain’s learning algorithm and approximate it for problems such as computer vision and automatic speech recognition that the mammalian brain is so good at? The answer is: it is an orders of magnitude problem, i.e., not a simple task. However, we can attempt to develop mathematical foundations based on the understanding of how a human brain learns. This chapter is focused along that direction. In particular, it is focused on the ventral stream (the “what pathway”) and describes common algorithms that can be used for representation and classification of signals from different sensor modalities such as auditory and visual. These common algorithms are based on dictionary learning with a beta process, hierarchical graphical models, and embedded hidden Markov models.

1. Introduction

In this chapter, we present results on the applicability of dictionary learning to image processing, namely filling in missing pixels (inpainting) and enhancing noisy images. Some example applications of hierarchical graphical models are also provided. Moreover, we demonstrate that the same learning algorithm can be used to represent both visual and audio signals. This indicates that it is possible to approximate how the neo-cortex processes multi-sensory data. The results provided in this chapter are promising, and the described algorithms may constitute a step in the right direction toward approximating the brain's learning, understanding, and inference algorithms.
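The inpainting idea can be sketched with a generic sparse-coding approach. The sketch below is a minimal, hedged illustration only: it substitutes a fixed overcomplete DCT dictionary and orthogonal matching pursuit for the chapter's beta-process dictionary learning, and the "image patch" is a synthetic 1-D signal; the function names and parameters are invented for this example.

```python
import numpy as np

# Hedged sketch of sparse-coding inpainting over a fixed overcomplete DCT
# dictionary. The chapter's actual method learns the dictionary with a
# beta process; a fixed analytic dictionary is used here only to keep the
# illustration short and self-contained.

def dct_dictionary(n, n_atoms):
    """Overcomplete DCT dictionary: each column is a unit-norm cosine atom."""
    t = np.arange(n)
    D = np.cos(np.pi * np.outer(t + 0.5, np.arange(n_atoms)) / n_atoms)
    return D / np.linalg.norm(D, axis=0)

def omp_inpaint(D, y, mask, n_nonzero):
    """Orthogonal matching pursuit fit on the observed entries only;
    the resulting sparse code then reconstructs the missing entries."""
    Dm, ym = D[mask], y[mask]
    residual, idx = ym.copy(), []
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(Dm.T @ residual)))   # best-matching atom
        if k in idx:
            break
        idx.append(k)
        coef, *_ = np.linalg.lstsq(Dm[:, idx], ym, rcond=None)
        residual = ym - Dm[:, idx] @ coef
    x = np.zeros(D.shape[1])
    x[idx] = coef
    return D @ x   # full reconstruction, including the masked positions

# A patch that is 2-sparse in the dictionary, with ~30% of samples missing.
n = 64
D = dct_dictionary(n, 128)
truth = 2.0 * D[:, 5] - 1.5 * D[:, 40]
mask = np.random.default_rng(0).random(n) > 0.3   # True = observed
recon = omp_inpaint(D, truth, mask, n_nonzero=5)
rel_err = np.linalg.norm(recon - truth) / np.linalg.norm(truth)
```

Because the synthetic patch is exactly sparse in the dictionary, the pursuit recovers the missing samples almost perfectly; real image patches are only approximately sparse, which is where a learned (rather than fixed) dictionary pays off.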

Even though the neo-cortex of the mammalian brain has distinct areas for different functionalities, it seems to use the same algorithm to understand a large variety of input modalities (Mountcastle, 1978; Hawkins & Blakeslee, 2004). For example, in the ferret experiments conducted by Roe et al. (1992), it was shown that the auditory cortex learned to “see” when the outputs of vision sensors were plugged into the auditory part of the brain. Similarly, it has been shown that sensory remapping in the human brain can alter the functionality of a given brain region. For example, by remapping touch sensors to the visual cortex, it was shown that blind persons can obtain visual perception through touch (Sadato et al., 1996). In addition, it appears that the neo-cortex effortlessly identifies the correlation among many sensor modalities and fuses information obtained from them. The question then is, can we discover the brain’s learning algorithm and approximate it for solving problems such as computer vision and automatic speech recognition that the mammalian brain is so good at? The answer is: it is an orders of magnitude problem, i.e., not a simple task. (a) The human brain has about 10^14 synapses; (b) humans live approximately 10^9 seconds; (c) if each synapse takes just one bit to parameterize, humans would need to learn 10^14 bits in 10^9 seconds. That is, humans have to learn at a rate of 10^5 bits per second (Hinton, n.d.). This tremendous amount of information is learned by humans mostly in unsupervised fashion. To achieve the same feat with a computer is an orders of magnitude problem indeed! However, we can attempt to develop mathematical foundations based on our understanding of how the human brain learns. This chapter is focused along that direction.
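The learning rate in (c) is simply the quotient of (a) and (b):

```latex
\[
\frac{10^{14}\ \text{bits}}{10^{9}\ \text{s}} = 10^{5}\ \text{bits per second}
\]
```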

The visual cortex of the mammalian brain is the part of the cerebral cortex responsible for processing visual information. It is located in the occipital lobe at the back of the brain, with one visual cortex in each hemisphere. The left hemisphere’s visual cortex receives signals from the right visual field and vice versa. The primary visual cortex V1, which is a small portion of the brain as can be seen in Figure 1, is located in and around the calcarine fissure in the occipital lobe. Each hemisphere’s V1 transmits information along two primary pathways: the dorsal and ventral streams. The dorsal stream begins at V1, goes through visual area V2, then to the dorsomedial area next to visual area V5, and then to the posterior parietal cortex. This dorsal pathway is sometimes referred to as the “where pathway” or “how pathway”. It is associated with motion, representation of object locations, and control of the eyes and arms, especially when visual information is used to guide saccades or reaching (Milner, 1992). The ventral stream begins at V1 and goes through V2, V4, and the Inferior Temporal (IT) cortex. This ventral stream is sometimes referred to as the “what pathway”. It is associated with form recognition and object representation, and is also involved in the storage of long-term memory. In this chapter we focus on the ventral stream, as we are interested in the common algorithm that the brain applies for recognition and representation. The regions in the ventral stream (V1, V2, V4, and IT) are believed to be connected hierarchically with both feed-forward and feedback pathways, as depicted in simplified form in Figure 2.

Figure 1.

Primary visual cortex V1

Figure 2.

Hierarchical connections of the regions in the visual processing along the ventral stream (“what pathway”)
