Introduction
The estimation of distribution algorithm (EDA) is a relatively new type of evolutionary algorithm (Mühlenbein & Paaß, 1996). An EDA evolves a population in a more principled manner than its predecessor, the genetic algorithm (GA) (Goldberg, 1989; Mitchell, 1996). GA uses randomness in many steps, such as selecting highly fit individuals and creating a new population from the selected individuals. GA creates a new individual by crossing over two selected individuals at a randomly chosen crossover point and then randomly mutating the result. In contrast, an EDA generates a new population from a probabilistic model built from highly fit chromosomes. Although generating a new population in an EDA also involves randomness, the sampling is guided by the knowledge about relationships among variables that is embedded in the model.
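The model-build-then-sample loop described above can be sketched with the simplest kind of model, a vector of per-bit probabilities. This is a minimal illustration of the univariate EDA idea, not the implementation of any specific published algorithm; the function name and parameters are illustrative.

```python
import random

def univariate_eda(fitness, n_bits, pop_size=50, n_select=25, generations=100):
    """Minimal univariate EDA sketch: the model is one probability per bit."""
    probs = [0.5] * n_bits  # initial model: each bit equally likely 0 or 1
    best = None
    for _ in range(generations):
        # Sample a new population from the probabilistic model
        pop = [[1 if random.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        if best is None or fitness(pop[0]) > fitness(best):
            best = pop[0]
        # Rebuild the model from the highly fit individuals only
        selected = pop[:n_select]
        probs = [sum(ind[i] for ind in selected) / n_select
                 for i in range(n_bits)]
    return best

# OneMax: fitness is simply the number of 1-bits, so sum() works directly
solution = univariate_eda(sum, n_bits=32)
```

Note that there is no crossover point and no mutation operator; the randomness enters only through sampling the model, which is exactly the contrast with GA drawn above.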
Building a model for a large problem (i.e., a problem whose solution is encoded with a large number of bits) is very time consuming. For example, the time for building a model is O(n^2) for MIMIC and O(n^2·N + n^3) for BOA, where n is the chromosome length and N is the number of individuals used to build the model. Several approaches can reduce the size of the search space in evolutionary computation (Chen & Smith, 1999; Yong & Sannomiya, 2001). The authors have investigated compression as a search-space reduction method for evolutionary computation. The authors’ earliest compressed encoding is similar to run-length encoding (Suwannik et al., 2005). The results demonstrate that the compressed GA requires 805 times fewer fitness evaluations than the simple GA when solving the 128-bit OneMax problem. Later, the authors proposed c2GA, which combines the compact genetic algorithm (cGA) with compressed encoding (Watchanupaporn et al., 2006). The performance of c2GA is superior to that of cGA on the OneMax and RoyalRoad problems (Mitchell, 1994).
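To make the run-length idea concrete, the following sketch decodes a chromosome that alternates a value bit with a fixed-width repetition count. The exact encoding used by Suwannik et al. (2005) may differ; this layout is an assumption chosen to show why the width of the run-length field (`run_bits`) must be fixed in advance, which is the limitation discussed next.

```python
def decompress_run_length(chromosome, run_bits):
    """Decode a (value bit, fixed-width run length) chromosome into plain bits.

    Illustrative layout only: each gene is one value bit followed by
    run_bits bits giving the repetition count in binary.
    """
    bits = []
    i = 0
    while i + run_bits < len(chromosome):
        value = chromosome[i]
        # the next run_bits bits encode how many times to repeat the value
        run = int("".join(map(str, chromosome[i + 1:i + 1 + run_bits])), 2)
        bits.extend([value] * run)
        i += 1 + run_bits
    return bits

# [1, 0,1,1] = "three 1s"; [0, 0,1,0] = "two 0s"
decoded = decompress_run_length([1, 0, 1, 1, 0, 0, 1, 0], run_bits=3)
# decoded == [1, 1, 1, 0, 0]
```

Because `run_bits` caps the longest run at 2^run_bits − 1, choosing it poorly either wastes chromosome bits or forces long runs to be split, which is the weakness the LZW encoding below removes.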
The compression algorithm used in the compressed GA and c2GA is not very effective. Moreover, the number of bits used to store the repetition count (the run length) must be specified in advance. To overcome these problems, LZW-compressed chromosome encoding was proposed by Kunasol et al. (2006). An LZWGA chromosome is an array of integers that can be decompressed using the Lempel-Ziv-Welch (LZW) decompression algorithm (Welch, 1984). Because a compressed chromosome is typically smaller than its decompressed counterpart, LZWGA searches a smaller search space. Consequently, it can be used to solve very large problems. Kunasol et al. (2006) conducted experiments on one-million-bit problems. Solving a problem of this size with any canonical GA is not practical. However, LZWGA solved the one-million-bit OneMax and Trap problems within 160 generations. A one-million-bit problem has a search space of 2^1,000,000, or about 9.90×10^301,029, possibilities.
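The decompression step at the heart of this encoding can be sketched as standard LZW decoding over the binary alphabet. The mapping of arbitrary gene values into the valid code range (the modulo step) is an assumption for illustration; LZWGA's actual handling of gene values may differ.

```python
def lzw_decompress(codes, max_len=None):
    """Decode an integer array into a bit string with LZW.

    The dictionary starts with the two single-bit strings "0" and "1";
    every decoded code adds a new, longer entry, so a short integer
    array can expand into a very long bit string.
    """
    dictionary = {0: "0", 1: "1"}
    prev = dictionary[codes[0] % 2]      # first code must be a single bit
    out = [prev]
    for code in codes[1:]:
        # assumed mapping: fold arbitrary gene values into the valid range
        code %= len(dictionary) + 1
        if code in dictionary:
            entry = dictionary[code]
        else:
            # standard LZW special case: code refers to the entry being built
            entry = prev + prev[0]
        out.append(entry)
        dictionary[len(dictionary)] = prev + entry[0]
        prev = entry
    bits = "".join(out)
    return bits if max_len is None else bits[:max_len]

bit_string = lzw_decompress([1, 0, 2, 3])
# bit_string == "101001"
```

The key property for search-space reduction is visible here: later codes index ever-longer dictionary entries, so the decompressed length can grow much faster than the number of genes.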
LZW-compressed encoding was also applied to EDAs. Watchanupaporn and Suwannik (2010) combined LZW-compressed encoding with cGA, a univariate EDA. Unlike cGA, their algorithm does not use a probability vector to represent the entire population. Instead, it uses a probability matrix, because the generated LZW chromosome is an array of integers. Watchanupaporn et al. (2012) proposed another method of combining LZW with cGA. Unlike the previous method, the core of the cGA algorithm does not have to be modified, so a probability vector can still be used to represent the entire population. This method can easily be applied to various EDAs; for example, LZWMIMIC (Watchanupaporn & Suwannik, 2011) combines LZW-compressed encoding with MIMIC, a bivariate EDA (De Bonet et al., 1997). The proposed algorithm outperforms the original MIMIC algorithm.
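For reference, the probability-vector mechanism of the underlying cGA can be sketched as follows. This is a minimal textbook-style cGA over plain bits; in the LZW variants above, each gene would instead be an integer code that is LZW-decompressed before fitness evaluation. The parameter names are illustrative.

```python
import random

def compact_ga(fitness, n_genes, pop_size=100, max_steps=5000):
    """Minimal cGA sketch: a probability vector stands in for the population."""
    p = [0.5] * n_genes
    for _ in range(max_steps):
        # Sample two individuals from the probability vector
        a = [1 if random.random() < q else 0 for q in p]
        b = [1 if random.random() < q else 0 for q in p]
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        # Shift the vector toward the winner where the two samples disagree
        for i in range(n_genes):
            if winner[i] != loser[i]:
                p[i] += (1 / pop_size) if winner[i] == 1 else -(1 / pop_size)
                p[i] = min(1.0, max(0.0, p[i]))
        if all(q in (0.0, 1.0) for q in p):  # vector has fully converged
            break
    return [round(q) for q in p]

# OneMax again: sum() counts the 1-bits
result = compact_ga(sum, n_genes=16)
```

The contrast drawn in the paragraph above is between keeping this one-probability-per-gene vector intact (the 2012 method) and replacing it with a matrix of per-gene distributions over integer codes (the 2010 method).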