Article Preview
Top1. Introduction
According to the No-Free-Lunch theorem (Wolpert & Macready, 1997), any elevated performance over one class of problems is compensated by reduced performance over another class. This implies that techniques that sample the space of computer programs and incorporate problem-specific knowledge can be effective for their chosen class of problems. A fundamental challenge is how to find the appropriate composition of knowledge and algorithm. Decades of research on evolutionary computation have addressed this problem, but the theory around it remains incomplete and fragile.
Evolutionary algorithms rely on the paradigm of trial-and-error learning and are widely used to solve many real-world problems. Memetic algorithms were introduced in 1989 to formalize the computational idea of an evolutionary algorithm directly modifying genetic material based on specific information from the environment (Moscato, 1989). Memetic algorithms combine evolutionary, population-based, global search and heuristic, problem-specific or knowledge-based, local search, which provides an intermediate for connecting knowledge and genetic search procedures (Eiben & Smith, 2003). Under this backdrop, this paper aims to investigate the use of problem-specific knowledge in genetic search procedures and suggest that a combination of memetic algorithms and domain knowledge might help solve complex problems.
This study proposes a new memetic algorithm called Knowledge-based Memetic Algorithm (KBMA), in which domain knowledge is combined with the initialization and reproduction operators. The hypothesis is that incorporating knowledge into a memetic algorithm can systematically influence the traversal of the search space. The stock classification problem is selected as the test bed. Knowledge is represented in a semantic network of financial attributes. Two sets of experiments are implemented to test the hypothesis. The first set of experiments focuses on incorporating domain knowledge into the initialization phase of evolution and tests whether using domain knowledge will cause the convergence to be faster by preserving building blocks. The building blocks are identified with aid of the semantic distance between the components of an individual. The second set of experiments deals with combining domain knowledge with the reproduction operators, in which two case studies are considered. On one hand, the knowledge is used to constrain the reproduction such that offspring are semantically similar to parents. On the other hand, the knowledge is applied to constrain the reproduction so that offspring are semantically dissimilar to parents.
This study contributes to the theory of systematically incorporating domain knowledge into evolutionary algorithms, which is an important, first step towards improving effectiveness and efficiency of evolutionary computation. First, this study provides an innovative, intuitive perspective of constructing ‘meaningful’ building blocks in genetic search procedures. Components are sorted based on their semantic relationships so that the building blocks might be preserved during evolution. Second, this study proposes a new method of measuring the distance between children and their parents. Much published work has dealt with highly constrained problems in which what is evolved is a numeric parameter (Rowe, Vose, & Wright, 2004). The reproductive operators have made changes in the numbers that are measured by Hamming distance or Euclidean distance. This study measures the distance between parents and offspring from a different, knowledge-oriented perspective. Semantic distance is used to measure the distance between individuals. The experimental results show how semantic distance can correspond to distance between evolutionary individuals.
The remainder of the paper is structured as follows. Section 2 provides background information on evolutionary algorithms, knowledge in evolution, and its financial applications. Section 3 presents the framework, in which the algorithm description, problem statement, and domain knowledge are presented. Section 4 presents the experiments and results on incorporating knowledge into the initialization and reproduction. Section 5 discusses the results and presents the limitations and future directions. Section 6 concludes the paper.