Recursive Learning of Genetic Algorithms with Task Decomposition and Varied Rule Set

Recursive Learning of Genetic Algorithms with Task Decomposition and Varied Rule Set

Lei Fang (Xi’an Jiaotong-Liverpool University, China), Sheng-Uei Guan (Xi’an Jiaotong-Liverpool University, China) and Haofan Zhang (Xi’an Jiaotong-Liverpool University, China)
DOI: 10.4018/978-1-4666-3628-6.ch014
OnDemand PDF Download:
$37.50

Abstract

Rule-based Genetic Algorithms (GAs) have been used in the application of pattern classification (Corcoran & Sen, 1994), but conventional GAs have weaknesses. First, the time spent on learning is long. Moreover, the classification accuracy achieved by a GA is not satisfactory. These drawbacks are due to existing undesirable features embedded in conventional GAs. The number of rules within the chromosome of a GA classifier is usually set and fixed before training and is not problem-dependent. Secondly, conventional approaches train the data in batch without considering whether decomposition solves the problem. Thirdly, when facing large-scale real-world problems, GAs cannot utilise resources efficiently, leading to premature convergence. Based on these observations, this paper develops a novel algorithmic framework that features automatic domain and task decomposition and problem-dependent chromosome length (rule number) selection to resolve these undesirable features. The proposed Recursive Learning of Genetic Algorithm with Task Decomposition and Varied Rule Set (RLGA) method is recursive and trains and evolves a team of learners using the concept of local fitness to decompose the original problem into sub-problems. RLGA performs better than GAs and other related solutions regarding training duration and generalization accuracy according to the experimental results.
Chapter Preview
Top

Introduction

Pattern classification problem plays a major role in various fields of computer science and engineering, such as data mining and image processing has been attracting growing interests. The task of pattern classification is to assign patterns into one out of several predefined classes within which the assigned patterns share some common properties. Assume we have a c-class problem in an n-dimensional pattern space. And q real vectors Xi=(xi1 xi2xin) i = 1,2, …,q, are given as training patterns from the c classes (c<<q). Normally, a GA classifier, for example, will train its chromosomes against a set of training patterns (with known classes) to discover the relationship between the attributes and classes. The discovered rules can be evaluated by classification accuracy or error rate either on the training data or test data. Classification problem has also been widely included among machine learning problems while much research has been done on machine learning techniques by using classification problem as a vehicle.

Many soft computing techniques have been applied to solve the classification problem, including artificial neural networks (ANN) (Guan & Li, 2001; Su, Guan, & Yeo, 2001; Anand, Mehrotra, Mohan, & Ranka, 2002), fuzzy logic (Ishibuchi, Nakashima, & Murata, 2002; Setnes & Roubos, 2002), and evolutionary algorithms (De Jong & Spears, 1991; Zhu & Guan, 2005). Among them, rule-based genetic algorithms (GAs), a member of evolutionary algorithms, is one of the most successful approaches for classification, either supervised or unsupervised (Corcoran & Sen, 1994). There are two main streams of rule-based GAs classifiers (Smith, 1980; De Jong, 1988; Cordón, Herrera, Hoffmann, & Magdalena, 2001). One is the Pittsburgh approach, and the other the Michigan approach. These names follow the names of the universities at which these approaches were firstly proposed. In the Pittsburgh (Pitt) approach, each chromosome is encoded as a complete set of rules. Therefore, one individual chromosome can solve the given classification problem. On the other hand, the Michigan approach uses instead the whole population as the solution-set, which means the whole set of chromosomes should work together to solve the classification problem. In this paper, the Pitt approach is applied.

Despite of the popularity of GAs, there still exist problems within GAs with reference to solving pattern classification problem. It is usually a time consuming task for GA-based classifiers to evolve a qualified set of solutions. Moreover, the classification accuracy finally achieved by GAs, is not always satisfactory, especially when solving problems in higher dimensions. Time and accuracy are two of the most important metrics that can be used to measure the performance of a pattern classifier. A good pattern classifier should be both efficient in time and effective in accuracy. Besides, GAs also suffer from early convergence such that the evolution process cannot gain further improvement when the solution (population) is still premature.

Conventionally, most GA-based solutions in the literature work in the static mode, where all the parameters of GAs (e.g., rule number, chromosome length in the Pitt approach, population size in the Michigan approach, the attributes, classes and training data, etc.), are determined before the training process starts. However, learning should be an ad hoc process in which all the parameters, especially the rule number, should be determined at run-time according to different classification problems. It is intuitive to see that a larger set of rules may fit for hard problems, on the other hand, a smaller one for simple problems. Therefore, if GA can be equipped with the capability to decide the problem-dependent rule-number parameter, it will be better than its static counterpart.

Complete Chapter List

Search this Book:
Reset