Towards an Improved Ensemble Learning Model of Artificial Neural Networks: Lessons Learned on Using Randomized Numbers of Hidden Neurons

Fatai Anifowose (Universiti Malaysia Sarawak, Malaysia), Jane Labadin (Universiti Malaysia Sarawak, Malaysia) and Abdulazeez Abdulraheem (King Fahd University of Petroleum and Minerals, Saudi Arabia)
DOI: 10.4018/978-1-5225-0159-6.ch031

Abstract

Artificial Neural Networks (ANN) have been widely applied in petroleum reservoir characterization. Despite their wide use, their performance is very unstable. Ensemble machine learning is capable of improving the performance of such unstable techniques. One of the challenges of using ANN is choosing the appropriate number of hidden neurons. Previous studies have proposed ANN ensemble models with a maximum of 50 hidden neurons in the search space, thereby leaving room for further improvement. This chapter presents extended versions of those studies with increased search spaces, using a linear search and randomized assignment of the number of hidden neurons. Using standard model evaluation criteria and novel ensemble combination rules, the results of this study suggest that having a large number of “unbiased” randomized guesses of the number of hidden neurons beyond 50 performs better than very few occurrences of those that were optimally determined.
Chapter Preview

Introduction

Artificial Neural Networks (ANN) have become a “household” technique in the Computational Intelligence (CI) and data mining application community. They are the most popular and commonly used technique for most predictive modeling tasks. Since ANN is readily available as a toolbox in the MATLAB software (Demuth et al., 2009), it is easily applied to non-linear and highly challenging academic and industrial problems. The journey of the application of CI techniques in petroleum engineering has been interesting. It started with the derivation of empirical equations for the estimation of most petroleum reservoir properties such as porosity and permeability. These equations were used to establish linear relationships between certain observed parameters and the target reservoir property. Later, these equations were found to be inferior to linear and multivariate regression tools in terms of predictive performance. When the capabilities of CI techniques, especially ANN, were discovered by petroleum engineers, the focus shifted away from the linear and multivariate regression tools, as they could not compete with the CI techniques (Eskandari et al. 2004). Consequently, CI became well embraced in petroleum reservoir characterization research and has been reported to perform excellently (Zahedi et al. 2009; El-Sebakhy, 2009). Owing to its convenient graphical user interface and ease of use, ANN became popular and commonly used among petroleum engineers.

However, despite the common use of this technique, it poses a number of challenges, one of which is the determination of the appropriate network architecture. One of the important parameters in the ANN architecture is the number of hidden neurons. Determining the optimal number of neurons in the hidden layer has remained an open challenge in the ANN application literature (Bodgan, 2009). So far, two methods have been employed to handle this problem: the age-old trial-and-error method (Petrus et al., 1995) and optimization using evolutionary algorithms (Hassan et al., 2005; Maertens et al., 2006).

Each of these methods has its limitations and disadvantages. The trial-and-error method, on one hand, requires a great deal of time and effort. Even so, it often ends up caught in a local optimum rather than the global one. Since this method involves trying different numbers of hidden neurons consecutively, and sometimes haphazardly, it is very easy to miss a global optimum lying between two chosen points and settle for a sub-optimal value. The method is also often affected by the “human factor”: the analyst gets tired after a few trials and settles for the best of the set. However, this best of the set may not be the global best but simply the best among the tried possibilities.
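The way a handful of manual trials can skip over the best setting may be illustrated with a minimal Python sketch. The `validation_error` curve, the list of manual trials and the 1 to 50 search range are all hypothetical, invented here purely for illustration; they are not data or results from the chapter.

```python
import math

def validation_error(n_hidden):
    """Hypothetical error curve (invented for illustration only):
    a shallow dip near 10 hidden neurons and a deeper, narrow dip near 37."""
    shallow = 0.30 * math.exp(-((n_hidden - 10) / 6.0) ** 2)
    deep = 0.60 * math.exp(-((n_hidden - 37) / 2.0) ** 2)
    return 1.0 - shallow - deep

def best_of(candidates):
    """Return the candidate number of hidden neurons with the lowest error."""
    return min(candidates, key=validation_error)

# A few hand-picked trials, as in the trial-and-error method (assumed values).
manual_trials = [5, 10, 20, 30, 40, 50]
# A linear search that visits every value in the same range.
linear_search = range(1, 51)

print(best_of(manual_trials))   # 10: the best of the tried set only
print(best_of(linear_search))   # 37: the optimum that lay between 30 and 40
```

In this toy setting the manual trials settle on 10 neurons because the deeper optimum at 37 falls between the two tried points 30 and 40, which is the failure mode described above.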

The use of evolutionary algorithms to automatically optimize the number of hidden neurons, on the other hand, leads to high computational complexity, enormous memory consumption and increased execution time. Because evolutionary algorithms are based on exhaustive heuristic search, some of them have also been reported to end up in local optima (Anifowose et al., 2013a; Bies et al., 2006; Gao, 2012). There is therefore a need to look elsewhere for a solution. CI hybrid techniques have also been studied in the literature. However, like the ANN technique, hybrid models are only able to handle one hypothesis at a time, and hence would not be an appropriate solution to this problem. In view of the limitations of these two conventional methods, we propose in this chapter a novel ensemble methodology that utilizes the power of unbiased random guesses in the assignment of the number of hidden neurons.
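The general idea of assigning hidden-neuron counts at random can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the chapter's implementation: the base learner `train_member` is a simplified stand-in (fixed random tanh hidden layer, output layer fitted by gradient descent) rather than a fully trained ANN, the 1 to 100 range is an assumed search space, and simple averaging is an assumed combination rule, since the preview does not specify the chapter's novel combination rules.

```python
import math
import random

def train_member(xs, ys, n_hidden, rng, epochs=50, lr=0.05):
    """Hypothetical stand-in learner: a tanh hidden layer with fixed random
    weights; only the linear output layer is fitted, by gradient descent."""
    w = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    out = [0.0] * n_hidden
    bias = 0.0

    def hidden(x):
        return [math.tanh(wi * x + bi) for wi, bi in zip(w, b)]

    feats = [hidden(x) for x in xs]
    for _ in range(epochs):
        for h, y in zip(feats, ys):
            pred = bias + sum(o * hi for o, hi in zip(out, h))
            # Step scaled by the layer width to keep the updates stable.
            step = lr * (pred - y) / n_hidden
            for j in range(n_hidden):
                out[j] -= step * h[j]
            bias -= step

    def predict(x):
        return bias + sum(o * hi for o, hi in zip(out, hidden(x)))

    return predict

def build_ensemble(xs, ys, n_members, low, high, seed=0):
    """Draw each member's number of hidden neurons uniformly at random
    from [low, high], then train one member per drawn size."""
    rng = random.Random(seed)
    sizes = [rng.randint(low, high) for _ in range(n_members)]
    members = [train_member(xs, ys, n, rng) for n in sizes]
    return sizes, members

def ensemble_predict(members, x):
    """Combine member outputs by simple averaging (an assumed rule)."""
    preds = [m(x) for m in members]
    return sum(preds) / len(preds)

# Toy regression data, invented for illustration.
xs = [i / 10 for i in range(-10, 11)]
ys = [x * x for x in xs]
sizes, members = build_ensemble(xs, ys, n_members=5, low=1, high=100, seed=1)
print(sizes)
print(ensemble_predict(members, 0.5))
```

Each member handles one hypothesis about the number of hidden neurons, and the ensemble combines several such hypotheses at once instead of committing to a single tuned value.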
