The optimization of a cost function with many local minima is a relevant problem in many important fields. For instance, determining the weights of learning machines generally depends on the solution of global optimization tasks (Haykin, 1999). A feature shared by almost all of the most common deterministic and stochastic algorithms for continuous non-linear optimization is that their performance is strongly affected by their starting conditions. Depending on the algorithm, the selection of an initial point or set of points has direct consequences for the efficiency, or even the possibility, of finding the global minima. Of course, an adequate selection of seeds implies prior knowledge of the structure of the optimization task. In the absence of prior information, a natural choice is to draw seeds from a uniform density defined over the search space. Knowledge of the problem can then be gained through the exploration of this space. This contribution presents a method to estimate probability densities that describe the asymptotic behavior of general stochastic search processes over continuously differentiable cost functions. The relevance of such densities is that they describe the residence times over the different regions of the search space after an infinitely long exploration. The preferred regions are those that minimize the cost globally, and this preference is reflected in the asymptotic densities. As a first application, the resulting densities can be used to draw populations of points that are consistent with the global properties of the associated optimization tasks.
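The uninformed seeding strategy described above can be sketched as a simple multistart procedure. The code below is an illustrative assumption, not the method of this contribution: seeds are drawn uniformly over a search box and a plain gradient descent is run from each one, with a Rastrigin-style cost standing in for a generic continuously differentiable function with many local minima.

```python
import numpy as np

# Illustrative multimodal cost (Rastrigin-style): global minimum 0 at the
# origin, local minima near every integer lattice point.
def cost(x):
    return np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x)) + 10.0 * x.size

def grad(x):
    return 2.0 * x + 20.0 * np.pi * np.sin(2.0 * np.pi * x)

rng = np.random.default_rng(0)
# In the absence of prior information: draw seeds uniformly over the box.
seeds = rng.uniform(-5.12, 5.12, size=(200, 2))

best_x, best_f = None, np.inf
for s in seeds:
    x = s.copy()
    for _ in range(500):          # plain gradient descent, small step size
        x -= 1e-3 * grad(x)
    f = cost(x)
    if f < best_f:
        best_x, best_f = x, f

print(best_x, best_f)
```

Each seed converges only to the local minimum of its own basin; only the uniform coverage of the box gives the procedure a chance at the global one, which is exactly the dependence on starting conditions discussed above.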
Stochastic strategies for optimization are essential to most of the heuristic techniques used to deal with complex, unstructured global optimization problems (Pardalos, 2004). The roots of such methods can be traced back to the Metropolis algorithm (Metropolis, Rosenbluth, Rosenbluth, Teller & Teller, 1953), introduced in the early days of scientific computing to simulate the evolution of a physical system toward thermal equilibrium. This process is the basis of the simulated annealing technique (Kirkpatrick, Gelatt & Vecchi, 1983), which exploits the convergence to a global minimum of the configurational energy observed in physical systems at thermal equilibrium as the temperature goes to zero.
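The Metropolis acceptance rule behind simulated annealing can be sketched in a few lines. The one-dimensional cost function, proposal width, and geometric cooling schedule below are illustrative assumptions; the essential point is that uphill moves are accepted with probability exp(-ΔE/T), which vanishes as the temperature T goes to zero.

```python
import math
import random

# Illustrative multimodal cost; its global minimum lies near x = -0.51.
def cost(x):
    return x**2 + 10.0 * math.sin(3.0 * x)

random.seed(1)
x = 4.0          # arbitrary starting point
T = 5.0          # initial "temperature"
for step in range(20000):
    # Metropolis move: propose a Gaussian perturbation and accept it with
    # probability min(1, exp(-dE / T)), so uphill moves survive at high T.
    y = x + random.gauss(0.0, 0.5)
    dE = cost(y) - cost(x)
    if dE <= 0.0 or random.random() < math.exp(-dE / T):
        x = y
    T *= 0.9995   # geometric cooling toward T -> 0

print(x, cost(x))
```

As T shrinks, the chain increasingly behaves like a pure descent and freezes into a deep well, mirroring the physical convergence to minimal configurational energy at vanishing temperature.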
The method presented in this contribution is rooted in physical principles similar to those on which simulated annealing type algorithms are based. However, in contrast with other approaches (Suykens, Verrelst & Vandewalle, 1998; Gidas, 1995; Parpas, Rustem & Pistikopoulos, 2006), the proposed method considers a density of points instead of Markov transitions of individual points. The technique is based on the interplay between the Langevin and Fokker-Planck frameworks for stochastic processes, which is well known in the study of out-of-equilibrium physical systems (Risken, 1984; Van Kampen, 1992). The Fokker-Planck equation has already been proposed for application in search algorithms in several contexts. For instance, it has been used to directly study the convergence of populations of points to global minima (Suykens, Verrelst & Vandewalle, 1998), as a tool to demonstrate the convergence of simulated annealing type algorithms (Parpas, Rustem & Pistikopoulos, 2006; Geman & Hwang, 1986), and as a theoretical framework for Boltzmann type learning machines (Movellan & McClelland, 1993; Kosmatopoulos & Christodoulou, 1994). In the context of global optimization by populations of points, it has been proposed that the populations evolve under the time-dependent version of the Fokker-Planck equation, following a schedule for the reduction of the diffusion constant D (Suykens, Verrelst & Vandewalle, 1998).
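The Langevin side of this interplay can be illustrated by evolving a population of points under the discretized overdamped Langevin equation, x ← x - V'(x) dt + sqrt(2 D dt) ξ, whose stationary density is proportional to exp(-V(x)/D); the empirical density of the population then tracks the Fokker-Planck solution. The double-well cost V and the schedule for reducing the diffusion constant D below are illustrative assumptions in the spirit of the population approach cited above, not the method of this contribution.

```python
import numpy as np

# Double-well cost with minima at x = +1 and x = -1.
def V(x):
    return (x**2 - 1.0)**2

def dV(x):
    return 4.0 * x * (x**2 - 1.0)   # gradient of V

rng = np.random.default_rng(0)
pop = rng.uniform(-2.0, 2.0, size=1000)   # population of search points
dt, D = 1e-3, 1.0
for step in range(20000):
    # Euler-Maruyama step of the overdamped Langevin equation; the
    # stationary density at fixed D is proportional to exp(-V(x) / D).
    noise = rng.standard_normal(pop.size)
    pop = pop - dV(pop) * dt + np.sqrt(2.0 * D * dt) * noise
    D *= 0.9997                           # slowly reduce the diffusion constant

print(pop.mean(), np.abs(pop).mean())
```

With D large the population spreads over the whole search interval; as D is reduced, the density exp(-V/D) concentrates on the global minimizers, and the points accumulate near x = ±1.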