Control of Inductive Bias in Supervised Learning Using Evolutionary Computation: A Wrapper-Based Approach

Copyright: © 2003 | Pages: 28
ISBN13: 9781591400516 | ISBN10: 1591400511 | EISBN13: 9781591400950
DOI: 10.4018/978-1-59140-051-6.ch002
Cite Chapter

MLA

Hsu, William H. "Control of Inductive Bias in Supervised Learning Using Evolutionary Computation: A Wrapper-Based Approach." Data Mining: Opportunities and Challenges, edited by John Wang, IGI Global, 2003, pp. 27-54. https://doi.org/10.4018/978-1-59140-051-6.ch002

APA

Hsu, W. H. (2003). Control of Inductive Bias in Supervised Learning Using Evolutionary Computation: A Wrapper-Based Approach. In J. Wang (Ed.), Data Mining: Opportunities and Challenges (pp. 27-54). IGI Global. https://doi.org/10.4018/978-1-59140-051-6.ch002

Chicago

Hsu, William H. "Control of Inductive Bias in Supervised Learning Using Evolutionary Computation: A Wrapper-Based Approach." In Data Mining: Opportunities and Challenges, edited by John Wang, 27-54. Hershey, PA: IGI Global, 2003. https://doi.org/10.4018/978-1-59140-051-6.ch002

Abstract

In this chapter, I discuss the problem of feature subset selection for supervised inductive learning approaches to knowledge discovery in databases (KDD) and examine this and related problems in the context of controlling inductive bias. I survey several combinatorial search and optimization approaches to this problem, focusing on data-driven, validation-based techniques. In particular, I present a wrapper approach that uses genetic algorithms for the search component and a validation criterion, based upon model accuracy and problem complexity, as the fitness measure. Next, I focus on the design and configuration of high-level optimization systems (wrappers) for relevance determination and constructive induction, and on integrating these wrappers with elicited knowledge on attribute relevance and synthesis. I then discuss the relationship between this model selection criterion and criteria from the minimum description length (MDL) family. I present results on several synthetic problems in task-decomposable machine learning and on two large-scale commercial data-mining and decision-support projects: crop condition monitoring and loss prediction for insurance pricing. Finally, I report experiments using the Machine Learning in Java (MLJ) and Data to Knowledge (D2K) Java-based visual programming systems for data mining and information visualization, along with several commercial and research tools. Test set accuracy using the genetic wrapper is significantly higher than that of decision tree inducers alone and is comparable to that of the best extant search-space-based wrappers.
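
As a rough illustration of the wrapper idea summarized above, the sketch below evolves a population of feature-subset bit strings with a genetic algorithm, scoring each subset by the cross-validated accuracy of a decision tree on the selected features minus a small penalty on subset size (a stand-in for the accuracy-plus-complexity fitness criterion). This is a minimal Python/scikit-learn sketch written for this summary, not the chapter's MLJ/D2K implementation; the penalty weight, GA parameters, and synthetic data are illustrative assumptions.

```python
# Minimal sketch of a genetic wrapper for feature subset selection.
# Illustration only; the chapter's experiments use MLJ/D2K, not this code.
import random
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def fitness(mask, X, y, complexity_penalty=0.01):
    """Validation-based fitness: cross-validated decision-tree accuracy on the
    selected features, minus a penalty for subset size (assumed weight)."""
    if not mask.any():
        return 0.0
    acc = cross_val_score(DecisionTreeClassifier(random_state=0),
                          X[:, mask], y, cv=5).mean()
    return acc - complexity_penalty * mask.sum()

def genetic_wrapper(X, y, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Evolve boolean feature masks with truncation selection,
    one-point crossover, and bit-flip mutation."""
    rng = random.Random(seed)
    n = X.shape[1]
    pop = [np.array([rng.random() < 0.5 for _ in range(n)]) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        parents = scored[:pop_size // 2]              # keep the better half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)                 # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child = np.array([not g if rng.random() < p_mut else g for g in child])
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, X, y))

if __name__ == "__main__":
    # Synthetic data with a few informative features among many irrelevant ones.
    X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                               n_redundant=5, random_state=0)
    best = genetic_wrapper(X, y)
    print("Selected feature indices:", np.flatnonzero(best))
```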
