Stochastic Neural Network Classifiers

Eitan Gross
Copyright © 2015 | Pages: 10
DOI: 10.4018/978-1-4666-5888-2.ch026

Chapter Preview

Main Focus Of The Article

For an ideal information-processing system, the information-theoretic distance dY between the output responses to two stimuli, α0 and α1, can be no greater than the corresponding distance dX between the input responses to those stimuli, i.e., dY(α0, α1) ≤ dX(α0, α1). The choice of distance measure is not always trivial and will normally depend on the type of stimulus (continuous or discrete) and on how the chosen distance scales with the size of the information-processing system (here, the number of neurons).
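
As a hedged illustration (not taken from the chapter), the short Python sketch below pushes two hypothetical input response distributions through the same noisy channel and checks that their Kullback-Leibler distance can only shrink; the distributions and the channel matrix are arbitrary choices made for the example.

```python
# Minimal sketch of the data-processing idea above: passing two input
# distributions through the same noisy channel cannot increase their
# Kullback-Leibler distance.
import numpy as np

def kl(p, q):
    """Kullback-Leibler distance D(p||q) in nats."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

# Hypothetical input response distributions to two stimuli alpha_0, alpha_1.
p_x0 = np.array([0.7, 0.2, 0.1])
p_x1 = np.array([0.1, 0.3, 0.6])

# Row-stochastic channel matrix W[x, y] = P(Y = y | X = x), chosen arbitrarily.
W = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

p_y0 = p_x0 @ W   # output response to alpha_0
p_y1 = p_x1 @ W   # output response to alpha_1

d_x = kl(p_x0, p_x1)
d_y = kl(p_y0, p_y1)
print(f"dX = {d_x:.4f} nats, dY = {d_y:.4f} nats, dY <= dX: {d_y <= d_x}")
```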

Key Terms in this Chapter

Multi-Layer Perceptron (MLP): An artificial neural network model with a feed-forward architecture that iteratively maps sets of input data onto a set of desired outputs through the process of learning. An MLP consists of an input layer of neurons, one or more hidden layers of neurons, and an output layer of neurons, where each layer is fully connected to the next.
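
As an illustration only, the following Python sketch builds the forward pass of a small fully connected MLP with one hidden layer; the layer sizes and the sigmoid activation are assumptions made for the example, not taken from the chapter.

```python
# Sketch of an MLP forward pass: input layer -> hidden layer -> output layer.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes: 4 inputs, 5 hidden neurons, 3 output neurons (arbitrary choice).
W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)   # input  -> hidden weights
W2, b2 = rng.normal(size=(5, 3)), np.zeros(3)   # hidden -> output weights

def mlp_forward(x):
    h = sigmoid(x @ W1 + b1)   # hidden-layer activations
    y = sigmoid(h @ W2 + b2)   # output-layer activations
    return y

x = rng.normal(size=4)         # one input pattern
print(mlp_forward(x))
```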

Backpropagation Algorithm: A supervised learning algorithm used to train artificial neural networks, where the network learns from many inputs, similar to the way a child learns to identify a bird from examples of birds and their attributes.
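
The sketch below is a minimal, hedged rendering of the backpropagation idea for a two-layer network like the MLP sketch above; the variable names, the squared-error loss, and the learning rate are choices made for the example rather than the chapter's notation.

```python
# Backpropagation sketch: propagate the output error backwards and adjust
# each weight by its gradient (gradient descent on a squared-error loss).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
W1, b1 = 0.5 * rng.normal(size=(4, 5)), np.zeros(5)
W2, b2 = 0.5 * rng.normal(size=(5, 3)), np.zeros(3)
x, t = rng.normal(size=4), np.array([1.0, 0.0, 0.0])  # one input and its target
lr = 0.5                                               # learning rate

for _ in range(1000):
    # Forward pass
    h = sigmoid(x @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # Backward pass: error signals (deltas) for the squared-error loss
    delta_out = (y - t) * y * (1 - y)              # output-layer delta
    delta_hid = (delta_out @ W2.T) * h * (1 - h)   # hidden-layer delta
    # Gradient-descent weight updates
    W2 -= lr * np.outer(h, delta_out)
    b2 -= lr * delta_out
    W1 -= lr * np.outer(x, delta_hid)
    b1 -= lr * delta_hid

print("trained output:", np.round(sigmoid(sigmoid(x @ W1 + b1) @ W2 + b2), 3))
```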

Artificial Neural Networks: Computer models of interconnected neurons that can be trained to carry out pattern recognition and other low-level cognitive functions through supervised or unsupervised learning.

Error Function: A mathematically differentiable function related to the difference between the desired output of the neural network and the actual output. During the learning phase the network minimizes the error function by dynamically changing the strength (weight) of the connections between the neurons in the network.
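
A toy example, assuming the common squared-error choice, of such a differentiable error function and its gradient with respect to the actual network output:

```python
# Squared-error function and its gradient with respect to the actual output.
import numpy as np

def squared_error(actual, desired):
    return 0.5 * np.sum((actual - desired) ** 2)

def squared_error_gradient(actual, desired):
    return actual - desired   # dE / d(actual)

actual = np.array([0.8, 0.3])
desired = np.array([1.0, 0.0])
print(squared_error(actual, desired), squared_error_gradient(actual, desired))
```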

Gaussian Process: A non-parametric regression method that places probability distributions over functions.
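
As a rough illustration of "probability distributions over functions", the sketch below draws sample functions from a Gaussian-process prior with a squared-exponential (RBF) covariance; the kernel and length scale are assumptions made for the example.

```python
# Draw sample functions from a Gaussian-process prior on a grid of inputs.
import numpy as np

def rbf_kernel(xa, xb, length_scale=0.5):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d2 = (xa[:, None] - xb[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale ** 2)

x = np.linspace(0.0, 1.0, 50)
K = rbf_kernel(x, x) + 1e-8 * np.eye(len(x))   # jitter for numerical stability

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(np.zeros(len(x)), K, size=3)
print(samples.shape)   # (3, 50): three sampled functions evaluated on the grid
```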

Gradient Descent: An optimization algorithm used to find a local minimum of a differentiable function. The backpropagation learning algorithm searches for a minimum of the error function in neuronal weight space using the method of gradient descent.
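
A minimal sketch of gradient descent on a one-dimensional quadratic that stands in for a network's error surface; the function, starting point, and learning rate are arbitrary choices for the example.

```python
# Gradient descent on f(w) = (w - 3)^2, which has its minimum at w = 3.
def f(w):
    return (w - 3.0) ** 2

def grad_f(w):
    return 2.0 * (w - 3.0)   # derivative of f

w, lr = 0.0, 0.1             # starting point and learning rate
for _ in range(100):
    w -= lr * grad_f(w)      # step downhill along the negative gradient
print(round(w, 4))           # ~3.0
```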

Information-Theoretic Distance: The distance in probability space between any two arbitrary probability distributions. In stochastic neural networks, an information-theoretic distance is used as the error function to be minimized during learning.

Kullback-Leibler Distance: An information-theoretic measure (also known as relative entropy) used to analytically define the distance between two arbitrary probability distributions. Note that while the Kullback-Leibler distance is not symmetric, it is still an additive quantity.
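
The small numerical check below (an illustration, not from the chapter) exhibits both properties: the Kullback-Leibler distance between two example distributions is asymmetric, yet it adds up over independent (product) distributions.

```python
# Check asymmetry and additivity of the Kullback-Leibler distance.
import numpy as np

def kl(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

p, q = np.array([0.9, 0.1]), np.array([0.5, 0.5])

# Asymmetry: D(p||q) differs from D(q||p).
print(kl(p, q), kl(q, p))

# Additivity: for independent pairs, D(p1*p2 || q1*q2) = D(p1||q1) + D(p2||q2).
p2, q2 = np.array([0.3, 0.7]), np.array([0.6, 0.4])
joint_p = np.outer(p, p2).ravel()
joint_q = np.outer(q, q2).ravel()
print(kl(joint_p, joint_q), kl(p, q) + kl(p2, q2))
```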
