On the Equivalence between Ordinary Neural Networks and Higher Order Neural Networks

Mohammed Sadiq Al-Rawi (University of Aveiro, Portugal) and Kamal R. Al-Rawi (Petra University, Jordan)
DOI: 10.4018/978-1-61520-711-4.ch006

Abstract

In this chapter, we study the equivalence between multilayer feedforward neural networks, referred to as Ordinary Neural Networks (ONNs), that contain only summation (Sigma) activation units, and multilayer feedforward Higher Order Neural Networks (HONNs) that contain Sigma and product (PI) activation units. Since they were introduced by Giles and Maxwell (1987), HONNs have been used in many supervised classification and function approximation tasks. Up to the time of writing this chapter, the most cited HONN article in the ISI Thomson Web of Knowledge is the work of Kosmatopoulos et al. (1995), in which a recurrent HONN model was introduced. A simple comparison with ONNs is usually performed in order to demonstrate the performance of a newly introduced HONN architecture. Is it true that HONNs outperform ONNs? How much do they differ, and how much do they have in common? Does an equivalence exist between a HONN and an ONN? Is it possible to convert a HONN to an equivalent ONN? And how is neural network equivalence defined? This chapter tries to answer most of these questions. Given the huge number of neural network architectures in the literature, the authors believe that equivalence studies are necessary to provide abstract definitions and unified approaches, which might lead to a better understanding of HONN performance and design. In contrast to most previous works, where HONN weights are non-negative integers, HONNs are given in this chapter in a form in which the weights are adjustable real-valued numbers. As a result, HONNs may have more expressive power, but there is also an increased probability of producing complex-valued neuron outputs. To enable the use of real-valued weights that may result in complex-valued neuron outputs, we introduce a normalization of the input data as well as a modification of the neuron activation functions.
Using simple mathematics and the proposed normalization of the input data, we show that HONNs are equivalent to ONNs. The converted equivalent ONN possesses the features of the HONN, and the two have exactly the same functionality and output. The proposed conversion of a HONN to an ONN permits using the huge number of existing optimization algorithms to speed up the convergence of the HONN and/or to find a better topology. Recurrent HONNs, cascade-correlation HONNs, or any other complicated HONN can simply be defined via their equivalent ONNs and then trained with backpropagation, scaled conjugate gradient, the Levenberg-Marquardt algorithm, brain damage algorithms (Duda et al., 2000), etc. Using the developed equivalence model, this chapter also gives an easy bottom-up approach to convert a HONN to its equivalent ONN. Results on XOR and function approximation problems show that ONNs obtained from their corresponding HONNs converged well to a solution. Different optimization training algorithms were tested on equivalent ONNs having a feedforward and/or cascade-correlation structure, where the latter showed outstanding function approximation results.

1. Introduction

In our daily life we face several classification problems that are considered nonlinear, i.e., one cannot separate two categories using simply a line for two-dimensional patterns, a plane for three-dimensional patterns, or a hyperplane for multi-dimensional patterns. Inspired by the biological neuronal system, computationally intelligent classification systems were developed in the past few decades and are widely known as computational neural networks. These computational networks possess powerful nonlinear classification ability, and they are also known in the literature by other proximate names, such as artificial neural networks and statistical neural networks. In this work, we choose to call multilayer feedforward neural networks Ordinary Neural Networks (ONNs) in order to distinguish them from Higher Order Neural Networks (HONNs). The reason is that both HONNs and ONNs are artificial, computational, multilayer feedforward neural networks. Nonetheless, other terminologies for ONNs exist, such as first order neural networks (Giles et al., 1988) or multilayer perceptrons (Minsky and Papert, 1969).

In order to solve nonlinear classification problems, an ONN with one or more hidden layers can be employed. Determining the proper number of hidden layers and the number of units in each hidden layer is accomplished by trial and error or by dynamic adaptive algorithms, e.g., brain surgeon and brain damage algorithms (Duda et al., 2000). Several studies have used HONNs rather than ONNs in order to obtain better performance (Thimm, 1998; Thimm & Fiesler, 1997; Spirkovska & Reid, 1993; Rovithakis et al., 2004). To what degree can we rely on these outperformance results? When we investigate the literature, we see that ONNs have only Sigma (summation) activation units; for example, the output of an ONN is given by:

$$y = f\Big(\sum_{j} w_{j}\, f\Big(\sum_{i} w_{ji}\, x_i\Big)\Big) \qquad (1)$$

where the $w$ terms are the weights that connect the different layers of the ONN and $x_i$ is the value of the ith input taken from the input pattern. In contrast, a HONN must have at least one PI (product) unit, as was shown by Giles and Maxwell (1987); hence, the output of an up-to-second-order HONN is given by:

$$y = f\Big(w_0 + \sum_{i} w_i\, x_i + \sum_{i}\sum_{j} w_{ij}\, x_i\, x_j\Big) \qquad (2)$$
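As a minimal numerical sketch of the two unit types, the following code contrasts a Sigma unit (a weighted sum passed through an activation) with one common definition of a PI unit (a product of weighted inputs passed through an activation). The weights and inputs are hypothetical, chosen only for illustration:

```python
import math

def sigmoid(z):
    """Logistic activation, a common choice for f in Eqs. (1) and (2)."""
    return 1.0 / (1.0 + math.exp(-z))

def sigma_unit(w, x):
    """Ordinary (first-order) neuron: activation of a weighted sum."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def pi_unit(w, x):
    """PI unit: activation of the product of the weighted inputs
    (one common definition; HONN variants differ in detail)."""
    p = 1.0
    for wi, xi in zip(w, x):
        p *= wi * xi
    return sigmoid(p)

x = [0.5, -1.0]           # hypothetical input pattern
w = [0.8, 0.2]            # hypothetical weights
print(sigma_unit(w, x))   # lies in (0, 1)
print(pi_unit(w, x))      # lies in (0, 1)
```

The only structural difference between the two functions is the reduction applied before the activation: `sum` for the Sigma unit, repeated multiplication for the PI unit.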

Thus, the major difference between HONNs and ONNs is the way the activation is calculated, i.e., only Sigma units are used to construct ONNs, while Sigma and PI units, or just PI units, are used to construct HONNs. Does this matter? In computer architecture, a multiplication operation can be implemented via an algorithm that performs several addition operations (Knuth, 1997; Kulisch, 2002). In fact, multiplication is defined for the whole numbers in terms of repeated addition, and even multiplication of real numbers can be defined by a systematic generalization of this basic idea. With this in mind, HONNs could be converted to very complex, large, constrained ONNs. The hypothetical large ONN that is equivalent to a HONN might justify the power of a moderately sized HONN. Nonetheless, it is unfair to compare the computational cost of a HONN to that of an ONN having the same number of units and synaptic connections. Moreover, it is also unfair to compare the expressive power of a HONN to that of an ONN when they have the same number of units and synaptic connections. The reason is that the computational architecture and computational complexity of a HONN are much higher than those of an ONN. To overcome this dilemma, it is necessary to develop a mathematical model for converting a HONN to its equivalent ONN; further studies can then be performed to answer questions about the expressive power and the computational complexity of both architectures.
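The conversion idea can be sketched numerically: feeding an ordinary Sigma unit the original inputs augmented with all pairwise products $x_i x_j$ reproduces the output of a second-order HONN unit exactly. This is a sketch under the assumption of a single second-order unit with sigmoid activation; all weight values are hypothetical:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def honn2(w0, w1, w2, x):
    """Second-order HONN unit: f(w0 + sum_i w1[i] x_i + sum_{i<=j} w2[i,j] x_i x_j)."""
    n = len(x)
    s = w0 + sum(w1[i] * x[i] for i in range(n))
    s += sum(w2[(i, j)] * x[i] * x[j] for i in range(n) for j in range(i, n))
    return sigmoid(s)

def augment(x):
    """Extend the input vector with all second-order products x_i x_j (i <= j)."""
    n = len(x)
    return list(x) + [x[i] * x[j] for i in range(n) for j in range(i, n)]

def onn(w0, w, z):
    """Ordinary Sigma unit applied to the (augmented) input z."""
    return sigmoid(w0 + sum(wi * zi for wi, zi in zip(w, z)))

# Hypothetical weights for a 2-input second-order unit.
x = [0.7, -1.2]
w0 = 0.1
w1 = [0.5, -0.3]
w2 = {(0, 0): 0.2, (0, 1): -0.4, (1, 1): 0.6}

# Flatten the HONN weights in the same order augment() emits the products.
w_flat = w1 + [w2[(0, 0)], w2[(0, 1)], w2[(1, 1)]]

print(abs(honn2(w0, w1, w2, x) - onn(w0, w_flat, augment(x))) < 1e-12)  # True
```

The price of the conversion is the size of the augmented input: $n$ original inputs become $n + n(n+1)/2$ inputs for a second-order unit, which illustrates why the equivalent ONN grows quickly with the order of the HONN.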
