Information geometry is one of the most effective tools for investigating stochastic learning models. In information geometry, stochastic learning models are regarded as manifolds from the viewpoint of differential geometry. Amari applied it to Boltzmann Machines, which are stochastic learning models. The purpose of this chapter is to apply information geometry to complex-valued Boltzmann Machines. First, we construct the complex-valued Boltzmann Machines. Next, we describe information geometry and introduce some of its important notions: exponential families, mixture families, the Kullback-Leibler divergence, connections, geodesics, the Fisher metric, potential functions and so on. Finally, we apply information geometry to complex-valued Boltzmann Machines. We investigate the structure of the complex-valued Boltzmann manifold and derive its connections and Fisher metric. Moreover, we obtain an effective learning algorithm, the so-called em algorithm, for complex-valued Boltzmann Machines with hidden neurons.
Introduction
These days we can obtain massive amounts of information, and it is hard to deal with them without computers. Machine learning enables computers to manage such massive information. Machine learning uses various learning machine models, for instance, decision trees, Bayesian Networks, Support Vector Machines, Hidden Markov Models, normal mixture distributions, neural networks and so on. Some of them are constructed stochastically.
The neural network is one of the learning machine models. It consists of many units, which are called neurons. We often use binary neurons, each of which takes only two states; the set of neurons, however, takes many states. Various types of neural networks have been proposed; feed-forward types and symmetric types are the main models. Feed-forward neural networks are often applied to pattern recognition and are very useful. Symmetric neural networks are often applied as associative memories. The Hopfield Network is one of the most famous models, and Boltzmann Machines are stochastic versions of Hopfield Networks.
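To make this relation concrete, below is a minimal Python sketch (with hypothetical variable names, not notation from this chapter) contrasting the deterministic Hopfield update of a binary ±1 neuron with the stochastic Boltzmann update, in which each neuron is resampled with a sigmoid probability that depends on its local field and a temperature parameter T.

```python
import numpy as np

rng = np.random.default_rng(0)

def hopfield_update(s, W, i):
    """Deterministic Hopfield update of neuron i: the sign of its local field."""
    h = W[i] @ s
    return 1 if h >= 0 else -1

def boltzmann_update(s, W, i, T=1.0):
    """Stochastic Boltzmann update: neuron i becomes +1 with sigmoid probability."""
    h = W[i] @ s
    p = 1.0 / (1.0 + np.exp(-2.0 * h / T))   # P(s_i = +1 | other neurons)
    return 1 if rng.random() < p else -1

# Toy run: 4 binary neurons, symmetric weights with zero diagonal.
n = 4
W = rng.normal(size=(n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
s = rng.choice([-1, 1], size=n)
for _ in range(20):
    i = rng.integers(n)
    s[i] = boltzmann_update(s, W, i, T=1.0)
```

As T approaches zero the stochastic rule approaches the deterministic one, which is the sense in which Boltzmann Machines are stochastic versions of Hopfield Networks.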
Neurons take only two states in most cases. McEliece, Posner, Rodemich and Venkatesh (1987) is a highly recognized critique of the low storage capacity of the Hopfield memory. Baldi and Hornik (1989) rigorously showed the existence of numerous local minima in the objective function for learning a nonlinear perceptron. The representation capacity of a binary neuron is poor, so neuron models with multiple states are desirable. Some researchers have proposed such models. The multi-level neuron is one of them (Zurada, Cloete & Poel, 1996). Complex-valued neurons are also multi-state neuron models. Several models of complex-valued neurons have been proposed, for example, phasor neurons (Noest, 1988a), discrete-state phasor neurons (Noest, 1988b), amplitude-phase types of complex-valued neurons (Hirose, 1992; Kuroe, 2003), real part – imaginary part types of complex-valued neurons (Benvenuto & Piazza, 1992; Nitta & Furuya, 1991; Nitta, 1997) and so on (Nemoto & Kubono, 1996; Nemoto, 2003). In this chapter, we deal with phasor neurons and discrete phasor neurons, and we call the former continuous phasor neurons to distinguish them clearly from discrete phasor neurons.
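As a concrete illustration of the two state sets (a sketch only; the resolution parameter K is a generic choice, not notation from the cited papers): a continuous phasor neuron takes any state on the complex unit circle, while a discrete phasor neuron is restricted to the K-th roots of unity.

```python
import numpy as np

def continuous_phasor_state(theta):
    """Continuous phasor neuron: any point e^{i*theta} on the unit circle."""
    return np.exp(1j * theta)

def discrete_phasor_states(K):
    """Discrete phasor neuron: the state set is the K-th roots of unity."""
    return np.exp(2j * np.pi * np.arange(K) / K)

print(discrete_phasor_states(4))  # approximately [1, 1j, -1, -1j]
```

Note that K = 2 recovers the ordinary binary neuron with states +1 and -1.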
Several types of neural networks, such as feed-forward neural networks, Hopfield Networks and Boltzmann Machines, have been extended to complex-valued neural networks. Noest proposed complex-valued Hopfield Networks (Noest, 1988a; Noest, 1988b), called continuous phasor and discrete phasor neural networks. Hirose proposed a back-propagation learning algorithm for the amplitude-phase type of complex-valued neural networks (Hirose, 1992). Benvenuto and Piazza (1992) and Nitta and Furuya (1991) independently proposed back-propagation learning algorithms for the real part – imaginary part type of complex-valued neural networks. Boltzmann Machines were also extended to complex-valued Boltzmann Machines (Zemel, Williams & Mozer, 1993; Zemel, Williams & Mozer, 1995). The complex-valued Boltzmann Machines proposed by Zemel et al. (1993) are continuous models; Kobayashi and Yamazaki (2003) proposed the discrete version of complex-valued Boltzmann Machines.
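The following sketch illustrates, under assumed conventions, how the stochastic update generalizes to discrete phasor states: the neuron is resampled from a Gibbs distribution over the K-th roots of unity, with a Hermitian complex weight matrix. This illustrates the general idea only; it is not necessarily the exact formulation of the models cited above.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_phasor_update(x, W, i, K, T=1.0):
    """Resample neuron i of a discrete phasor network from a Gibbs
    distribution over the K-th roots of unity.

    Assumed (illustrative) energy: E = -(1/2) * sum_{j,k} Re(conj(x_j) * W[j,k] * x_k),
    with W Hermitian and zero diagonal.
    """
    states = np.exp(2j * np.pi * np.arange(K) / K)   # candidate states
    h = W[i] @ x                                     # complex local field of neuron i
    local_energy = -np.real(np.conj(states) * h)     # energy of each candidate state
    p = np.exp(-local_energy / T)
    p /= p.sum()
    return rng.choice(states, p=p)
```

With K = 2 and real weights this reduces to the binary Boltzmann update sketched earlier.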