Evolutionary Development of ANNs for Data Mining

Daniel Rivero (University of A Coruña, Spain)
Copyright: © 2009 | Pages: 7
DOI: 10.4018/978-1-60566-010-3.ch128


Artificial Neural Networks (ANNs) are learning systems from the Artificial Intelligence (AI) world that have been used to solve complex problems in tasks such as classification, clustering, or regression (Haykin, 1999), although they have been especially applied in Data Mining. Due to their interesting characteristics, these systems are powerful techniques used by researchers in many different environments (Rabuñal, 2005). Nevertheless, the use of ANNs entails certain problems, mainly related to their development process. The development of an ANN can be divided into two parts: architecture development, and training and validation. Architecture development determines not only the number of neurons of the ANN, but also the type of the connections among those neurons. Training then determines the connection weights for that architecture. Traditionally, given that the architecture of the network depends on the problem to be solved, architecture design has been a manual process: the expert must train and test different architectures in order to determine which one achieves the best results. This is slow precisely because architecture determination is done by hand, although techniques for the relatively automatic creation of ANNs have recently been developed. This work presents various techniques for the development of ANNs that require much less human participation.
Chapter Preview


The development of ANNs has been widely addressed with very different techniques in AI. The world of evolutionary algorithms is no exception, as shown by the large number of works published in this area using several techniques (Nolfi, 2002; Cantú-Paz, 2005). These techniques follow the general strategy of an evolutionary algorithm: an initial population of genotypes encoding different parameters (commonly the connection weights, the architecture of the network, and/or the learning rules) is randomly created. This population is evaluated to determine the fitness of every individual. Subsequently, the population is repeatedly made to evolve by means of different genetic operators (replication, crossover, mutation) until a certain termination criterion is fulfilled (for instance, obtaining a sufficiently good individual, or completing a predetermined number of generations).
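The general strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the chapter: the function name, parameters, and the toy fitness function are all illustrative. It uses truncation selection (which implicitly preserves the best individuals), one-point crossover, and Gaussian mutation over a real-valued genotype.

```python
import random

def evolve(fitness, genome_len, pop_size=30, generations=100,
           mutation_rate=0.2, seed=0):
    """Minimal evolutionary loop: evaluate, select, cross over, mutate."""
    rng = random.Random(seed)
    # Randomly created initial population of real-valued genotypes.
    pop = [[rng.uniform(-1, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate every individual (lower fitness is better here).
        scored = sorted(pop, key=fitness)
        survivors = scored[:pop_size // 2]        # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, genome_len)    # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(genome_len):           # Gaussian mutation
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0, 0.3)
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

# Toy fitness: squared distance of the genotype from the all-ones vector.
best = evolve(lambda g: sum((x - 1.0) ** 2 for x in g), genome_len=4)
```

Here the termination criterion is simply a fixed number of generations; in practice it could equally be reaching a target fitness, as the text notes.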

As a general rule, the field of ANN generation using evolutionary algorithms is divided into three main groups: the evolution of weights, of architectures, and of learning rules.

The evolution of weights starts from an ANN with an already determined topology. In this case, the problem to be solved is the training of the connection weights, attempting to minimise the network error. Most training algorithms, such as the backpropagation (BP) algorithm (Rumelhart, 1986), are based on gradient minimisation, which presents several inconveniences (Sutton, 1986). The main one is that, quite frequently, the algorithm gets stuck in a local minimum of the fitness function and is unable to reach the global minimum. One option for overcoming this situation is the use of an evolutionary algorithm, so that training is carried out by evolving the connection weights within the environment defined by both the network architecture and the task to be solved. In such cases, the weights can be represented in a genetic algorithm (GA) either as a concatenation of binary values or of real numbers (Greenwood, 1997). The main disadvantage of this type of encoding is the permutation problem: since hidden neurons can be reordered without changing the network's behaviour, equivalent networks may correspond to completely different chromosomes, which makes the crossover operator inefficient.
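As a concrete sketch of weight evolution (again illustrative code, not taken from the chapter; the 2-2-1 topology, function names, and parameters are assumptions), the nine weights of a fixed-architecture network can be concatenated into a real-valued chromosome and evolved against the XOR task, with the mean squared error as the quantity to minimise instead of a gradient:

```python
import math
import random

# Fixed 2-2-1 topology: chromosome = 9 real-valued weights
# (4 input-to-hidden weights, 2 hidden biases, 2 hidden-to-output
# weights, 1 output bias), concatenated into one vector.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def forward(w, x):
    # Clamp pre-activations to avoid math.exp overflow on large weights.
    sig = lambda v: 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, v))))
    h0 = sig(w[0] * x[0] + w[1] * x[1] + w[2])
    h1 = sig(w[3] * x[0] + w[4] * x[1] + w[5])
    return sig(w[6] * h0 + w[7] * h1 + w[8])

def mse(w):
    # Network error over the whole training set (the fitness to minimise).
    return sum((forward(w, x) - y) ** 2 for x, y in XOR) / len(XOR)

def evolve_weights(pop_size=40, generations=200, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(-2, 2) for _ in range(9)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=mse)
        elite = pop[:pop_size // 4]               # keep the best quarter
        # Refill the population with mutated copies of elite parents.
        pop = elite + [
            [g + rng.gauss(0, 0.4) for g in rng.choice(elite)]
            for _ in range(pop_size - len(elite))
        ]
    return min(pop, key=mse)

best = evolve_weights()
```

Because selection only compares fitness values, no gradient is required, which is precisely what lets this approach sidestep the local-minimum trap of gradient descent; the evolved network should at least beat the constant-output baseline (MSE 0.25 on XOR). Note also that swapping the two hidden neurons (w[0:3] with w[3:6], and w[6] with w[7]) yields an equivalent network with a different chromosome, which is the permutation problem mentioned above.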
