Feature selection is fundamental to tasks such as classification, data mining, image processing, and conceptual learning. It is usually used to achieve the same or better performance with fewer features. It can be treated as an optimization problem: find an optimal subset of the available features according to a given criterion function. The clonal selection algorithm is a good choice for solving optimization problems. It introduces the mechanisms of affinity maturation, cloning, and memorization, and the corresponding operations are characterized by rapid convergence and good global search capability. In this study, the clonal selection algorithm's rapid convergence to the global optimum is exploited to speed up the search for the most appropriate feature subset among a huge number of possible feature combinations. Compared with traditional genetic algorithm-based feature selection, clonal selection algorithm-based feature selection can find a better feature subset for classification. Experimental results on datasets from the UCI machine learning repository, on classification of 16 types of Brodatz textures, and on classification of synthetic aperture radar (SAR) images demonstrate the effectiveness and good performance of the method in applications.
Feature selection is an active research area in pattern recognition, machine learning, and data mining. It was studied extensively in the NIPS 2003 workshop on feature extraction and the feature selection challenge, and NIPS 2006 also held a workshop on feature selection. FSDM 2006 is likewise an international workshop on feature selection for data mining. A great deal of research on feature selection has been carried out to date. Feature selection is defined as the process of choosing a subset of the original predictive variables by eliminating redundant features and those with little or no predictive information. If we extract as much information as possible from a given dataset while using the smallest number of features, we not only save a great amount of computing time and cost, but also improve generalization to unseen points.
The majority of classification problems require supervised learning, where the underlying class probabilities and class-conditional probabilities are unknown and each instance is associated with a class label. In these situations, the relevant features are often unknown a priori, so many candidate features are introduced to better represent the domain. Unfortunately, many of these are either partially or completely irrelevant to the target concept. Reducing the number of irrelevant features drastically reduces the running time of a learning algorithm and yields a more general concept. This helps in gaining better insight into the underlying concept of a real-world classification problem (Kohavi, & Sommerfield, 1995; Koller, & Sahami, 1994). Feature selection methods try to pick a subset of features that are relevant to the target concept (Blum, & Langley, 1997).
Recently, natural computation algorithms have been widely applied in feature selection (Yang, & Honavar, 1998) and feature synthesis (Li, Bhanu, & Dong, 2005; Lin, & Bhanu, 2005) to improve performance and reduce the feature dimension. Among them, the genetic algorithm (GA) is one of the most widely used in feature selection (Oh, Lee, & Moon, 2004; Raymer, Punch, Goodman, Kuhn, & Jain, 2000; Zio, Baraldi, & Pedroni, 2006). In this chapter, instead of using GA to search for the optimal feature subset for classification, an effective global optimization technique from artificial immune systems (AISs), the clonal selection algorithm (de Castro, & Von Zuben, 1999, 2000, 2002; Du, Jiao, & Wang, 2002), is applied to feature selection. AISs are proving to be a very general and applicable form of bio-inspired computing. To date, AISs have been applied to various areas (Bezerra, de Castro, & Zuben, 2004; Dasgupta, & Gonzalez, 2002; de Castro, & Timmis, 2002; de Castro, & Zuben, 2002; Forrest, Perelson, Allen, & Cherukuri, 1994; Nicosia, Castiglione, & Motta, 2001; Timmis, & Neal, 2001; Zhang, Tan, & Jiao, 2004) such as machine learning, optimization, bioinformatics, robotic systems, network intrusion detection, fault diagnosis, computer security, and data analysis. The clonal selection algorithm was proposed as a computational realization of the clonal selection principle for pattern matching and optimization, and it has become perhaps the most popular algorithm in the field of AISs. This chapter investigates the performance of the clonal selection algorithm in feature selection.
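To make the idea of clonal selection over feature subsets concrete, the following is a minimal sketch in Python, not the method evaluated in this chapter. It assumes that each candidate subset (antibody) is encoded as a binary mask and that a criterion function (affinity) scores a mask. The function names, the parameter values, the rank-based cloning and mutation rules, and the toy criterion are all illustrative assumptions; in practice the criterion would typically be a classifier's accuracy on the selected features.

```python
import random


def random_mask(rng, n_features):
    """Random binary mask: 1 keeps a feature, 0 drops it."""
    return [rng.randint(0, 1) for _ in range(n_features)]


def clonal_selection(affinity, n_features, pop_size=20, n_select=5,
                     clone_factor=2.0, n_replace=4, generations=50, seed=0):
    """Illustrative clonal-selection-style search over feature masks.

    All parameter names and defaults are assumptions made for this sketch.
    Requires pop_size >= n_select + n_replace.
    """
    rng = random.Random(seed)
    population = [random_mask(rng, n_features) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: rank antibodies by affinity, best first.
        population.sort(key=affinity, reverse=True)
        matured = []
        for rank, antibody in enumerate(population[:n_select], start=1):
            # Cloning: better-ranked antibodies receive more clones.
            n_clones = max(1, round(clone_factor * n_select / rank))
            # Hypermutation: worse-ranked antibodies flip bits more often.
            rate = min(0.5, rank / n_features)
            clones = [[1 - bit if rng.random() < rate else bit
                       for bit in antibody]
                      for _ in range(n_clones)]
            # Memory: keep the parent unless a clone scores higher.
            matured.append(max(clones + [antibody], key=affinity))
        # Diversity: replace the worst antibodies with random newcomers.
        survivors = population[n_select:pop_size - n_replace]
        newcomers = [random_mask(rng, n_features) for _ in range(n_replace)]
        population = matured + survivors + newcomers
    return max(population, key=affinity)


# Hypothetical toy criterion: features 0, 3 and 5 are "relevant", and every
# selected feature carries a small cost, so smaller subsets are preferred.
RELEVANT = {0, 3, 5}


def toy_affinity(mask):
    hits = sum(mask[i] for i in RELEVANT)
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    best = clonal_selection(toy_affinity, n_features=8)
    print("selected features:", [i for i, bit in enumerate(best) if bit])
    print("affinity:", toy_affinity(best))
```

Because each parent is kept unless one of its clones scores higher, the best affinity in the population never decreases from one generation to the next. The random newcomers are one simple way to preserve diversity and limit premature convergence.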