Swarm-Based Nature-Inspired Metaheuristics for Neural Network Optimization

Swathi Jamjala Narayanan, Boominathan Perumal, Jayant G. Rohra
DOI: 10.4018/978-1-5225-2857-9.ch002


Nature-inspired algorithms have been applied productively to train neural network architectures. Other mechanisms for optimizing the parameters of neural networks include gradient descent, second-order methods, and the Levenberg-Marquardt method. Compared with gradient-based methods, nature-inspired algorithms are found to be less sensitive to the initial weights and less likely to become trapped in local optima. Despite these benefits, some nature-inspired algorithms also suffer from stagnation when applied to neural networks. Another challenge in applying nature-inspired techniques to neural networks is handling a high-dimensional, correlated weight space. Hence, there is a need for scalable nature-inspired algorithms for high-dimensional neural network optimization. This chapter studies the characteristics of nature-inspired techniques for optimizing neural network architectures, along with their applicability, advantages, and limitations.
Chapter Preview


Swarm-based optimization is a strategy in which several agents work collectively to achieve a goal as efficiently as possible. Nature-inspired techniques may model these agents as a flock of birds, a school of fish, a swarm of bees, and so on. Metaheuristic approaches draw an analogy between natural and computational systems and implement the relevant natural behavior as a paradigm for performing the required task. Since 1990, several nature-inspired metaheuristic techniques have been proposed. Metaheuristics such as swarm-based and evolutionary optimization algorithms play a vital role in several application areas. Several NP-hard optimization problems, such as the Traveling Salesman Problem, the Quadratic Assignment Problem, and graph problems, have also been addressed using nature-inspired techniques.
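The idea of agents searching collectively can be illustrated with a minimal sketch of particle swarm optimization (PSO), one well-known swarm-based metaheuristic inspired by bird flocking. The text above does not name a specific algorithm, so the choice of PSO is an assumption. The function names, the parameter values (inertia `w`, acceleration coefficients `c1` and `c2`), and the sphere test objective are likewise illustrative rather than taken from the chapter.

```python
import random


def pso(objective, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` over `dim` real variables with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Each particle keeps a position, a velocity, and its own best position so far.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    # The swarm shares the best position found by any particle so far.
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, the particle's memory, and the swarm's memory.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


def sphere(x):
    """Toy objective (an assumption for illustration): sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso(sphere, dim=3)
```

In a neural network setting, the position vector would hold the network's weights and the objective would be the training error. No gradient of the objective is required.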

Classification is the task of assigning an object to a pre-defined class or group (Duda, 1973). A classifier can be considered a mapping function of the form $Y_i = f(X_i)$, where $f(\cdot)$ is the classifier that maps the object $X_i$ to the class $Y_i$ based on parameters that are related attributes of the object $X_i$. Classification is widely used in business, science, industry, and medicine, and addresses many real-world problems such as bankruptcy prediction, credit scoring, medical diagnosis, handwritten character recognition, and speech recognition.

In traditional statistical classifiers, the classification decision depends on the posterior probability, which is derived from assumptions about the underlying probability model. The prior knowledge required about data properties and model capabilities limits the scope of statistical classifiers in many real-world problems. The emergence of neural networks, nonlinear models capable of modelling complex real-world problems, provides an alternative to conventional statistical classifiers. The advantage of neural networks lies in the following theoretical aspects. First, neural networks are data-driven, self-adaptive methods: they can adjust themselves to the data without any explicit specification of a functional or distributional form for the underlying model. Second, they are universal function approximators: neural networks can approximate any function with arbitrary accuracy (Cybenko, 1989; Hornik, 1991; Hornik et al., 1989).

Since any classification procedure seeks a functional relationship between group membership and the attributes of the object, accurate identification of this underlying function is important. Third, neural networks are nonlinear models, which makes them flexible in modelling complex real-world relationships. Finally, neural networks are able to estimate posterior probabilities, which provide the basis for establishing classification rules and performing statistical analysis. In short, neural networks are data-driven, self-adaptive methods and universal function approximators that can estimate posterior probabilities.

Training improves the performance of a neural network by optimizing its parameters. Several authors (Werbos, 1990, 1994; Williams et al., 1986; Gupta & Sexton, 1999; Wilamowski, 2002) note that back propagation with gradient descent is the most widely used method for optimizing neural network parameters in supervised learning. In recent years, many improved learning algorithms have been developed that aim to remove the shortcomings of gradient descent based systems.
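As a point of comparison for the nature-inspired methods discussed in this chapter, the following is a minimal sketch of back propagation with full-batch gradient descent on a small feed-forward network with one hidden layer. The network size, learning rate, sigmoid activation, and the logical-OR toy dataset are all assumptions chosen for illustration. None of them comes from the chapter.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Return hidden activations and the single output for input vector x."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y


def mse(data, W1, b1, W2, b2):
    """Mean squared error between network outputs and targets."""
    return sum((forward(x, W1, b1, W2, b2)[1] - t) ** 2 for x, t in data) / len(data)


def train(data, n_hidden=2, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in, n = len(data[0][0]), len(data)
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    history = [mse(data, W1, b1, W2, b2)]
    for _ in range(epochs):
        # Accumulate the gradient of the squared error over the whole dataset.
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [0.0] * n_hidden
        gb2 = 0.0
        for x, t in data:
            h, y = forward(x, W1, b1, W2, b2)
            # Error signal at the output pre-activation (chain rule through the sigmoid).
            d_out = 2.0 * (y - t) * y * (1.0 - y)
            gb2 += d_out
            for j in range(n_hidden):
                # Propagate the error signal back through the output weight W2[j].
                d_hid = d_out * W2[j] * h[j] * (1.0 - h[j])
                gW2[j] += d_out * h[j]
                gb1[j] += d_hid
                for k in range(n_in):
                    gW1[j][k] += d_hid * x[k]
        # Gradient descent step: move each parameter against its averaged gradient.
        for j in range(n_hidden):
            W2[j] -= lr * gW2[j] / n
            b1[j] -= lr * gb1[j] / n
            for k in range(n_in):
                W1[j][k] -= lr * gW1[j][k] / n
        b2 -= lr * gb2 / n
        history.append(mse(data, W1, b1, W2, b2))
    return (W1, b1, W2, b2), history


# Toy dataset (assumption): logical OR of two binary inputs.
or_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
params, history = train(or_data)
```

Here `history[0]` is the error before training and `history[-1]` the error after the last epoch. Because each update follows the local gradient, the result depends on the random initial weights, which is the sensitivity that motivates the alternatives studied in this chapter.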

Key Terms in this Chapter

Feed Forward Neural Network: A network whose connections form no cycles. Information moves forward from the input layer to the hidden layer and from the hidden layer to the output layer.

Nature Inspired Optimization: Optimization techniques modelled on behavior observed in nature, including the behavior of ants, birds, bees, bats, cats, cuckoos, fireflies, etc.

Deep Learning (DL): A branch of machine learning comprising algorithms that model high-level abstractions in data using deep graphs with multiple processing layers.

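Gradient: The vector of first-order partial derivatives of a function. It gives the rate of increase or decrease of the function when moving from one point to another and points in the direction of steepest increase.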

Hessian: The square matrix of second-order partial derivatives of a scalar-valued function.

Neural Network (NN): A system modelled on the working mechanism of the human brain and nervous system.

Back Propagation (BP): A commonly used method for training artificial neural networks in which output errors are propagated backward through the network to compute the gradient of the error with respect to the weights.

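Mean Square Error (MSE): A commonly used objective function for evaluating the performance of learning algorithms. It is defined as the average of the squared differences between predicted and target values. For an estimator, it equals the variance plus the squared bias, so it reduces to the variance only when the estimator is unbiased.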
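For reference, the standard textbook forms of the gradient, Hessian, and mean square error can be written as follows. The symbols are assumptions introduced here for illustration: $E(w)$ is an error function of the weight vector $w = (w_1, \dots, w_n)$, $Y_i$ and $\hat{Y}_i$ are the target and predicted values for the $i$-th of $N$ samples, and $\hat{\theta}$ is an estimator of a quantity $\theta$.

```latex
\nabla E(w) = \left( \frac{\partial E}{\partial w_1}, \dots, \frac{\partial E}{\partial w_n} \right)^{\top},
\qquad
H_{jk} = \frac{\partial^2 E}{\partial w_j \, \partial w_k}

\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( Y_i - \hat{Y}_i \right)^2

\mathrm{MSE}(\hat{\theta}) = \mathbb{E}\left[ (\hat{\theta} - \theta)^2 \right]
  = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta})^2
```

The last identity shows that the mean square error of an estimator coincides with its variance only when the bias is zero.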
