Abstract
One of the most successful tools in the Artificial Intelligence (AI) world is the Artificial Neural Network (ANN). This technique is a powerful tool used in many different environments and for many different purposes, such as classification, clustering, signal modelling, or regression (Haykin, 1999). Although ANNs are very easy to use, their creation is not a simple task, because the expert has to invest considerable effort and time in it. The development of ANNs can be divided into two parts: architecture development, and training and validation. Architecture development determines not only the number of neurons of the ANN but also the type of the connections among those neurons. Training determines the connection weights for that architecture. The architecture design task is usually performed by means of a manual process, meaning that the expert has to test different architectures in order to find the one able to achieve the best results. Each architecture trial means training and validating it, a process that can require many computational resources, depending on the complexity of the problem. Therefore, the expert has a large degree of participation in the whole ANN development, although techniques for the relatively automatic creation of ANNs have recently been developed.
Background
ANN development is a research topic that has attracted many researchers from the world of evolutionary algorithms (Nolfi & Parisi, 2002) (Cantú-Paz & Kamath, 2005). These techniques follow the general strategy of an evolutionary algorithm: an initial population of different genotypes, each encoding different parameters – commonly, the connection weights and/or the architecture of the network and/or the learning rules – is randomly created and repeatedly induced to evolve.
The most direct application of Evolutionary Computation (EC) tools in the ANN world is to perform the evolution of the connection weights. This process starts from an ANN with an already determined topology. In this case, the problem to be solved is the training of the connection weights, attempting to minimise the network error. Most training algorithms, such as the backpropagation (BP) algorithm (Rumelhart, Hinton & Williams, 1986), are based on gradient minimisation, which presents several inconveniences. The main disadvantage is that, quite frequently, the algorithm gets stuck in a local minimum of the fitness function and is unable to reach the global minimum. One of the options for overcoming this situation is the use of an evolutionary algorithm, so that the training process is done by means of the evolution of the connection weights within the environment defined by both the network architecture and the task to be solved. In such cases, the weights can be represented either as a concatenation of binary values or as real numbers in a genetic algorithm (GA) (Greenwood, 1997), as in the sketch below.
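As an illustrative sketch only – the XOR task, the 2-2-1 topology, the tournament selection, and the operator rates are assumptions made for this example, not the setup of any cited work – a simple genetic algorithm can evolve the real-valued weights of a fixed-topology feed-forward network as follows:

import numpy as np

rng = np.random.default_rng(0)

# Toy task (XOR) and a fixed 2-2-1 topology; both are illustrative assumptions.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)
N_WEIGHTS = 2 * 2 + 2 + 2 * 1 + 1   # W1 (2x2) + b1 (2) + W2 (2x1) + b2 (1)

def forward(weights, x):
    # Feed-forward pass of the fixed-topology 2-2-1 network.
    W1, b1 = weights[0:4].reshape(2, 2), weights[4:6]
    W2, b2 = weights[6:8].reshape(2, 1), weights[8]
    h = np.tanh(x @ W1 + b1)
    return np.tanh(h @ W2 + b2).ravel()

def fitness(weights):
    # Negative mean squared error: larger values are better.
    return -np.mean((forward(weights, X) - y) ** 2)

def evolve(pop_size=50, generations=200, mut_sigma=0.3):
    # Each individual is the concatenation of all real-valued weights.
    pop = rng.normal(0.0, 1.0, size=(pop_size, N_WEIGHTS))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        new_pop = [pop[scores.argmax()].copy()]                 # elitism
        while len(new_pop) < pop_size:
            # Tournament selection of two parents.
            a = rng.integers(pop_size, size=2)
            b = rng.integers(pop_size, size=2)
            p1 = pop[a[np.argmax(scores[a])]]
            p2 = pop[b[np.argmax(scores[b])]]
            # Uniform crossover followed by Gaussian mutation.
            mask = rng.random(N_WEIGHTS) < 0.5
            child = np.where(mask, p1, p2) + rng.normal(0.0, mut_sigma, N_WEIGHTS)
            new_pop.append(child)
        pop = np.array(new_pop)
    scores = np.array([fitness(ind) for ind in pop])
    return pop[scores.argmax()]

best = evolve()
print("evolved outputs:", np.round(forward(best, X), 2))

Because selection only needs the fitness value, no gradient of the error function is required, which is why such a scheme is less prone to getting stuck in the local minima that affect gradient-based training.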
The evolution of architectures consists of the generation of the topological structure, i.e., establishing the connectivity and the transfer function of each neuron. To achieve this goal with an evolutionary algorithm, it is necessary to choose how to encode the genotype of a given network so that it can be used by the genetic operators.
The most typical approach is called direct encoding. In this technique there is a one-to-one correspondence between each gene and a particular part of the network. A binary matrix represents an architecture in which every element reveals the presence or absence of a connection between two nodes (Alba, Aldana & Troya, 1993), as in the sketch below.
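As a hedged sketch of direct encoding – the number of nodes, the feed-forward restriction, and the mutation rate are illustrative assumptions of this example – the genotype can be the binary connection matrix itself, so the genetic operators act directly on individual connections:

import numpy as np

rng = np.random.default_rng(1)

N_NODES = 5   # illustrative network size

def random_genotype(n=N_NODES):
    # Direct encoding: a binary matrix where entry (i, j) == 1 means that a
    # connection exists from node i to node j.  The upper-triangular mask
    # keeps the decoded network feed-forward.
    return np.triu(rng.integers(0, 2, size=(n, n)), k=1)

def decode(genotype):
    # One-to-one mapping: every gene corresponds to exactly one connection.
    return [(i, j) for i in range(genotype.shape[0])
            for j in range(genotype.shape[1]) if genotype[i, j] == 1]

def mutate(genotype, rate=0.05):
    # Bit-flip mutation acting directly on the matrix entries.
    flips = np.triu((rng.random(genotype.shape) < rate).astype(genotype.dtype), k=1)
    return np.bitwise_xor(genotype, flips)

g = random_genotype()
print("connections:", decode(g))
print("after mutation:", decode(mutate(g)))

Note that the length of such a genotype grows quadratically with the number of nodes, which is the usual criticism of direct encoding for large networks.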
In contrast to direct encoding, there are several indirect encoding methods. In these methods, only some characteristics of the architecture are encoded in the chromosome. These methods admit several types of representation.
Firstly, parametric representations describe the network as a group of parameters such as the number of hidden layers, the number of nodes in each layer, the number of connections between two layers, etc. (Harp, Samad & Guha, 1989); a small sketch of this kind of genotype is given below. Another indirect representation type is based on a system of grammatical rules (Kitano, 1990), shaped as production rules that build a matrix representing the network.
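As an illustrative sketch of a parametric representation – the particular parameters, their bounds, and the decoding rule are assumptions of this example, not those of Harp, Samad & Guha (1989) – the genotype stores only a few coarse architectural characteristics instead of individual connections:

from dataclasses import dataclass
import random

random.seed(2)

@dataclass
class ParametricGenotype:
    # Indirect (parametric) encoding: only coarse architectural
    # characteristics are stored, not individual connections.
    hidden_layers: int          # number of hidden layers
    nodes_per_layer: int        # nodes in each hidden layer
    connection_density: float   # fraction of possible inter-layer links kept

def random_genotype():
    return ParametricGenotype(
        hidden_layers=random.randint(1, 3),
        nodes_per_layer=random.randint(2, 10),
        connection_density=random.uniform(0.3, 1.0),
    )

def decode(genotype, n_inputs=4, n_outputs=1):
    # Expand the compact genotype into a concrete layer list and an
    # approximate connection count (the decoding rule is an assumption).
    layers = ([n_inputs]
              + [genotype.nodes_per_layer] * genotype.hidden_layers
              + [n_outputs])
    links = sum(round(a * b * genotype.connection_density)
                for a, b in zip(layers, layers[1:]))
    return layers, links

g = random_genotype()
print(g)
print("layers and number of connections:", decode(g))

The chromosome is therefore much shorter than a direct encoding of the same network, at the cost of being unable to describe arbitrary connection patterns.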
Another type of encoding is given by the growing methods. In this case, the genotype contains a group of instructions for building up the network (Nolfi & Parisi, 2002), along the lines of the sketch below.
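Purely as a hedged illustration of the idea – the two-instruction repertoire and its semantics are assumptions of this example, not the developmental encoding used by Nolfi & Parisi (2002) – the genotype can be read as a small program that is executed to grow the network:

import random

random.seed(3)

def random_genotype(length=8, max_nodes=6):
    # Growing (developmental) encoding: the genotype is a list of
    # building instructions rather than a description of the network.
    ops = []
    for _ in range(length):
        if random.random() < 0.5:
            ops.append(("ADD_NODE",))
        else:
            ops.append(("CONNECT", random.randrange(max_nodes),
                        random.randrange(max_nodes)))
    return ops

def grow(genotype, n_inputs=2):
    # Execute the instructions in order to build up the network.
    nodes = list(range(n_inputs))          # start from the input nodes only
    edges = set()
    for op in genotype:
        if op[0] == "ADD_NODE":
            nodes.append(len(nodes))
        else:                              # ("CONNECT", i, j)
            _, i, j = op
            if i in nodes and j in nodes and i != j:
                edges.add((i, j))
    return nodes, sorted(edges)

g = random_genotype()
print("genotype:", g)
print("grown network:", grow(g))

Variation operators then manipulate the instruction list, and the network only exists once the genotype is decoded, which decouples the genotype length from the size of the resulting network.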
Key Terms in this Chapter
Population: Pool of individuals exhibiting equal or similar genome structures, which allows the application of the genetic operators.
Genetic Programming: Machine learning technique that uses an evolutionary algorithm in order to optimise a population of computer programs according to a fitness function that determines the capability of a program to perform a given task.
Artificial Neural Networks: Interconnected set of many simple processing units, commonly called neurons, that use a mathematical model representing an input/output relation.
Search Space: Set of all possible states in which the problem that we want to solve could ever be.
Back-Propagation Algorithm: Supervised learning technique used by ANNs, which iteratively modifies the connection weights of the network so that the error obtained by comparing the network outputs with the desired ones decreases.
Phenotype: Expression of the properties coded by the individual’s genotype.
Genotype: The representation of an individual as a collection of genes to which the crossover and mutation operators are applied.
Evolutionary Computation: Set of Artificial Intelligence techniques used in optimisation problems, which are inspired by biological mechanisms such as natural evolution.