Feature Selection in Data Mining


Yong Seong Kim (University of Iowa, USA), W. Nick Street (University of Iowa, USA) and Filippo Menczer (University of Iowa, USA)
Copyright: © 2003 | Pages: 26
DOI: 10.4018/978-1-59140-051-6.ch004


Feature subset selection is an important problem in knowledge discovery, not only for the insight gained from determining relevant modeling variables, but also for the improved understandability, scalability, and, possibly, accuracy of the resulting models. The purpose of this chapter is to provide a comprehensive analysis of feature selection via evolutionary search in supervised and unsupervised learning. To this end, we first discuss a general framework for feature selection based on a new search algorithm, the Evolutionary Local Selection Algorithm (ELSA). The search is formulated as a multi-objective optimization problem to examine the trade-off between the complexity of the generated solutions and their quality. ELSA considers multiple objectives efficiently while avoiding computationally expensive global comparison. We combine ELSA with Artificial Neural Networks (ANNs) and Expectation-Maximization (EM) algorithms for feature selection in supervised and unsupervised learning, respectively. Further, we provide a new two-level evolutionary algorithm, Meta-Evolutionary Ensembles (MEE), in which feature selection is used to promote diversity among classifiers in the same ensemble.
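The trade-off between solution complexity and solution quality mentioned above can be illustrated with a small sketch. This is not ELSA and not the chapter's method: it enumerates every feature subset exhaustively (feasible only for a handful of features, which is why the chapter turns to evolutionary search) and keeps the subsets that no other subset beats on both objectives. The names `RELEVANCE`, `toy_quality`, `evaluate_all`, and `pareto_front` are hypothetical, and the integer relevance scores stand in for a real model-quality measure such as classifier accuracy.

```python
from itertools import product

# Hypothetical per-feature usefulness scores (assumption, not from the chapter).
# Features 0 and 1 are informative; features 2 and 3 are irrelevant.
RELEVANCE = [5, 3, 0, 0]


def toy_quality(mask):
    """Stand-in for model quality: sum of the relevance of the selected features."""
    return sum(r for r, bit in zip(RELEVANCE, mask) if bit)


def evaluate_all(n_features=4):
    """Score every feature subset on two objectives.

    Returns a list of (mask, quality, complexity) tuples, where complexity
    is simply the number of selected features.
    """
    return [
        (mask, toy_quality(mask), sum(mask))
        for mask in product((0, 1), repeat=n_features)
    ]


def pareto_front(candidates):
    """Keep candidates that are not dominated by any other candidate.

    A candidate is dominated if another one has quality at least as high and
    complexity at least as low, and is strictly better on one of the two.
    """
    front = []
    for c in candidates:
        dominated = any(
            o[1] >= c[1] and o[2] <= c[2] and (o[1] > c[1] or o[2] < c[2])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


front = pareto_front(evaluate_all())
for mask, quality, complexity in sorted(front, key=lambda c: c[2]):
    print(mask, "quality:", quality, "features used:", complexity)
```

In this toy setting the non-dominated subsets are the empty set, feature 0 alone, and features 0 and 1 together; adding either irrelevant feature raises complexity without raising quality, so those subsets drop out.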

Complete Chapter List

Table of Contents
John Wang
Chapter 1
Stefan Arnborg
This chapter reviews the fundamentals of inference, and gives a motivation for Bayesian analysis. The method is illustrated with dependency tests in...
A Survey of Bayesian Data Mining
Chapter 2
William H. Hsu
In this chapter, I discuss the problem of feature subset selection for supervised inductive learning approaches to knowledge discovery in databases...
Control of Inductive Bias in Supervised Learning Using Evolutionary Computation: A Wrapper-Based Approach
Chapter 3
Herna Viktor, Eric Paquet, Gys le Roux
Data mining concerns the discovery and extraction of knowledge chunks from large data repositories. In a cooperative datamining environment, more...
Cooperative Learning and Virtual Reality-Based Visualization for Data Mining
Chapter 4
Yong Seong Kim, W. Nick Street, Filippo Menczer
Feature subset selection is an important problem in knowledge discovery, not only for the insight gained from determining relevant modeling...
Feature Selection in Data Mining
Chapter 5
Massimo Coppola, Marco Vanneschi
We consider the application of parallel programming environments to develop portable and efficient high performance data mining (DM) tools. We first...
Parallel and Distributed Data Mining through Parallel Skeletons and Distributed Objects
Chapter 6
Jerzy W. Grzymala-Busse, Wojciech Ziarko
The chapter is focused on the data mining aspect of the applications of rough set theory. Consequently, the theoretical part is minimized to...
Data Mining Based on Rough Sets
Chapter 7
Marvin L. Brown, John F. Kros
Data mining is based upon searching the concatenation of multiple databases that usually contain some amount of missing data along with a variable...
The Impact of Missing Data on Data Mining
Chapter 8
Hsin-Chang Yang, Chung-Hong Lee
Recently, many approaches have been devised for mining various kinds of knowledge from texts. One important application of text mining is to...
Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps
Chapter 9
John Wang, Alan Oppenheim
Although Data Mining (DM) may often seem a highly effective tool for companies to be using in their business endeavors, there are a number of...
The Pitfalls of Knowledge Discovery in Databases and Data Mining
Chapter 10
Marvin D. Troutt, Donald W. Gribbin, Murali S. Shanker, Aimao Zhang
Data mining is increasingly being used to gain competitive advantage. In this chapter, we propose a principle of maximum performance efficiency...
Maximum Performance Efficiency Approaches for Estimating Best Practice Costs
Chapter 11
Eitel J.M. Lauria, Giri Kumar Tayi
One of the major problems faced by data-mining technologies is how to deal with uncertainty. The prime characteristic of Bayesian methods is their...
Bayesian Data Mining and Knowledge Discovery
Chapter 12
Vladimir A. Kulyukin, Robin Burke
Knowledge of the structural organization of information in documents can be of significant assistance to information systems that use documents as...
Mining Free Text for Structure
Chapter 13
Michael Johnson, Farshad Fotouhi, Sorin Draghici
This chapter presents three systems that incorporate document structure information into a search of the Web. These systems extend existing Web...
Query-By-Structure Approach for the Web
Chapter 14
Tomas Eklund, Barbro Back, Hannu Vanharanta, Ari Visa
Performing financial benchmarks in today’s information-rich society can be a daunting task. With the evolution of the Internet, access to massive...
Financial Benchmarking Using Self-Organizing Maps - Studying the International Pulp and Paper Industry
Chapter 15
Fay Cobb Payton
Recent attention has turned to the healthcare industry and its use of voluntary community health information network (CHIN) models for e-health and...
Data Mining in Health Care Applications
Chapter 16
Lori K. Long, Marvin D. Troutt
This chapter focuses on the potential contributions that Data Mining (DM) could make within the Human Resource (HR) function in organizations. We...
Data Mining for Human Resource Information Systems
Chapter 17
Yao Chen, Joe Zhu
Information technology (IT) has become the key enabler of business process expansion if an organization is to survive and continue to prosper in a...
Data Mining in Information Technology and Banking Performance
Chapter 18
Jack S. Cook, Laura L. Cook
This chapter highlights both the positive and negative aspects of Data Mining (DM). Specifically, the social, ethical, and legal implications of DM...
Social, Ethical and Legal Issues of Data Mining
Chapter 19
Christian Bohm, Maria R. Galli, Omar Chiotti
The aim of this work is to present a data-mining application to software engineering. Particularly, we describe the use of data mining in different...
Data Mining in Designing an Agent-Based DSS
Chapter 20
Jeffrey Hsu
Every day, enormous amounts of information are generated from all sectors, whether it be business, education, the scientific community, the World...
Critical and Future Trends in Data Mining: A Review of Key Data Mining Technologies/Applications
About the Authors