1. Introduction
During the past two decades, Artificial Neural Networks (ANNs) have attracted overwhelming attention in the domain of time series modeling and forecasting. ANNs are widely popular due to their nonlinear, nonparametric, data-driven and self-adaptive nature (Zhang, 2003; Khashei & Bijari, 2010). Many traditional forecasting methods suffer from one or more major limitations. For example, the well-known Box-Jenkins model (Box & Jenkins, 1970) requires that the associated time series be linear in nature, which is rarely true for real-world data. In contrast, the usual nonlinear forecasting methods have high mathematical complexity and depend explicitly on knowledge of the underlying data generating process (Hamzaçebi, et al., 2009). ANNs, however, have the remarkable ability to model both linear and nonlinear time series without requiring any preliminary information. They adaptively learn from successive training patterns, use the acquired knowledge to formulate an adequate model, and then generalize this experience to forecast future observations. Additionally, ANNs are universal approximators, i.e. they can approximate any continuous function to any desired degree of precision (Hornik, et al., 1989). These distinctive features make them more general as well as more flexible than many other traditional forecasting methods. In spite of these unique strengths, designing an appropriate ANN model is in general quite tedious and requires resolving a variety of challenging issues (Zhang, et al., 1998; Zhang, 2003). The most crucial among them is the selection of an appropriate training algorithm. ANN training is an unconstrained nonlinear minimization problem, and so far Backpropagation (BP) is the most recognized method in this regard. However, the standard BP algorithm requires large computational time, has a slow convergence rate and often gets stuck in local minima (Zhang, et al., 1998; Kamruzzaman, et al., 2006).
These inherent drawbacks have not been entirely eliminated, despite the development of several modifications of the BP algorithm in the literature.
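To make the preceding point concrete, the following is a minimal illustrative sketch (not taken from the article) of BP training viewed as unconstrained nonlinear minimization: a one-hidden-layer network is fitted to a toy sine series by plain gradient descent on the mean squared error. All architecture choices, data and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy autoregressive task: predict x[t] from the previous 3 values of a sine series.
series = np.sin(np.linspace(0, 8 * np.pi, 203))
X = np.array([series[i:i + 3] for i in range(200)])   # lagged inputs, shape (200, 3)
y = series[3:203].reshape(-1, 1)                      # one-step-ahead targets

# One hidden layer (tanh) with a linear output unit; small random initial weights.
W1 = rng.normal(0, 0.5, (3, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.05                                             # learning rate (illustrative)

losses = []
for epoch in range(500):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2
    err = out - y
    losses.append(float(np.mean(err ** 2)))
    # Backpropagation: chain rule from the output error back to each weight.
    g_out = 2 * err / len(X)
    g_W2 = h.T @ g_out; g_b2 = g_out.sum(0)
    g_h = g_out @ W2.T * (1 - h ** 2)                 # derivative of tanh
    g_W1 = X.T @ g_h; g_b1 = g_h.sum(0)
    # Gradient-descent update: the "training" is just nonlinear minimization.
    W2 -= lr * g_W2; b2 -= lr * g_b2
    W1 -= lr * g_W1; b1 -= lr * g_b1

print(losses[0], losses[-1])   # training error decreases over the epochs
```

The slow, purely local descent visible here is exactly what makes BP prone to the long training times and local minima noted above.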
In recent years, the Particle Swarm Optimization (PSO) technique (Kennedy & Eberhart, 1995; Kennedy, et al., 2001) has gained notable popularity in the field of nonlinear optimization. PSO is a population-based evolutionary computation method, originally inspired by the social behavior of bird flocks. The central aim of the PSO algorithm is to ultimately cluster all swarm particles in the vicinity of the desired global optimal solution. It is governed by the principle that the individual members of a social system benefit from the intelligent information that iteratively emerges through the cooperation of all members (Jha, et al., 2009). Although PSO and the Genetic Algorithm (GA) have many similarities, they differ in some fundamental respects. GA depends on Darwin's principle of survival of the fittest, whereas PSO is based on cooperative swarm behavior (Leung, et al., 2003; Jha, et al., 2009). In GA, the population size successively decreases as the weakest solutions are eliminated; in PSO, the population size remains constant throughout. The two central operations of GA, viz. crossover and mutation, do not exist in PSO. Conversely, the concept of personal and global best positions, which is fundamental to PSO, is irrelevant in GA. Over the past few years, PSO has found prolific applications in neural network training due to its many influential properties, e.g. high search power in the state space, fast convergence rate and the ability to provide a globally optimal solution (Jha, et al., 2009; Chen, et al., 2011). PSO can thus be a much better alternative to the standard BP training method. However, the existing literature on PSO-based neural networks for time series forecasting is quite scarce, and this topic surely needs more research attention (Chen, et al., 2006; Jha, et al., 2009; Adhikari & Agrawal, 2011).
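The PSO mechanics described above, each particle tracking its personal best, the swarm sharing a global best, and a population of fixed size, can be sketched as follows. This is an illustrative minimal implementation on a toy sphere objective, not the article's method; the inertia and acceleration coefficients are common textbook values, chosen here as assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def sphere(x):
    # Toy objective to minimize; its global optimum is 0 at the origin.
    return float(np.sum(x ** 2))

n_particles, dim = 20, 5
pos = rng.uniform(-5, 5, (n_particles, dim))      # particle positions
vel = np.zeros((n_particles, dim))                # particle velocities
pbest = pos.copy()                                # personal best positions
pbest_val = np.array([sphere(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()          # global best position

w, c1, c2 = 0.7, 1.5, 1.5   # inertia weight and acceleration coefficients (assumed)
for _ in range(200):
    r1, r2 = rng.random((2, n_particles, dim))
    # Velocity update: inertia + pull toward personal best + pull toward global best.
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    # Update personal bests where the new position improved; population size is fixed.
    vals = np.array([sphere(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved] = pos[improved]
    pbest_val[improved] = vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print(sphere(gbest))   # the swarm clusters near the global optimum
```

For neural network training, the objective `sphere` would simply be replaced by the network's training error as a function of its flattened weight vector, so no gradients are needed, which is precisely why PSO sidesteps BP's local-minima issue.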