A Review on Evolutionary Prototype Selection: An Empirical Study of Performance and Efficiency

Salvador García, José Ramón Cano, Francisco Herrera
DOI: 10.4018/978-1-60566-798-0.ch005

Abstract

Evolutionary algorithms have been successfully used in different data mining problems. Given that the prototype selection problem could be seen as a combinatorial problem, evolutionary algorithms have been used to solve it with promising results. This chapter presents an evolutionary data mining application known as evolutionary prototype selection. Various approaches have been proposed in the literature following two strategies on the use of evolutionary algorithms: general evolutionary models and models specific to prototype selection problem. In this chapter, the authors review the representative evolutionary prototype selection algorithms proposed, give their description and analyze their performance in terms of efficiency and effectiveness. They study their performance considering different sizes of the data sets, and analyze their behavior when the database scales up. The results are statistically contrasted in order to argue the benefits and drawbacks of each model.

1. Introduction

In supervised classification problems, we usually have a training set of samples (prototypes, instances) in which each example is labeled with a given class. Among supervised classifiers we find the Nearest Neighbor (NN) rule (Papadopoulos & Manolopoulos, 2004; Shakhnarovich et al., 2006), which predicts the class of a new prototype by computing a similarity measure (Pekalska et al., 2006) between this prototype and all prototypes in the training set. In the k-Nearest Neighbors (k-NN) classifier, the k nearest neighbors vote to decide the class of the new instance. A specific case is 1-NN, where only the nearest neighbor is considered when making the decision.
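As an illustration, the voting scheme just described can be sketched in a few lines of Python (a minimal sketch: the toy data, the choice of Euclidean distance, and the function name `knn_classify` are illustrative assumptions, not part of the chapter):

```python
from collections import Counter
import math

def knn_classify(train, labels, x, k=3):
    """Predict the class of x by majority vote among its k nearest
    training prototypes, using Euclidean distance."""
    dists = sorted((math.dist(p, x), y) for p, y in zip(train, labels))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

train = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
labels = ["a", "a", "b", "b"]
print(knn_classify(train, labels, (0.2, 0.1), k=3))  # prints: a
```

With k=1 the same function reduces to the 1-NN rule, since only the single nearest prototype casts a vote.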

Recent studies show that k-NN classifiers can be improved through numerous procedures. Among them, we can cite proposals for instance reduction (Liu & Motoda, 2002; Wilson & Martinez, 2000), for incorporating weights to improve classification (Paredes & Vidal, 2006), and for accelerating the classification task (Gómez-Ballester et al., 2006).

Prototype Selection (PS) is an instance reduction process that consists of keeping the instances most relevant to the classification task of the k-NN algorithm and removing the redundant ones. It attempts to reduce the number of rows in the data set without any loss in classification accuracy, and may even improve the classifier. Various PS algorithms have been proposed in the literature; see Wilson & Martinez (2000) and Grochowski & Jankowski (2004) for reviews. Another process used to reduce the number of instances in the training data is prototype generation, which builds new examples by combining the original data or computing metrics over them, and adds these new examples to the training subset (Lozano et al., 2006).
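A selected prototype set can be represented simply as the subset of training instances that survive the reduction, e.g. via a binary mask. The sketch below (toy data and illustrative names, not from the chapter) shows that dropping redundant near-duplicates need not change a 1-NN decision:

```python
import math

def one_nn(prototypes, labels, x):
    """Classify x by its single nearest prototype."""
    _, y = min((math.dist(p, x), y) for p, y in zip(prototypes, labels))
    return y

def apply_mask(train, labels, mask):
    """Keep only the prototypes whose mask bit is 1."""
    kept = [(p, y) for p, y, m in zip(train, labels, mask) if m]
    return [p for p, _ in kept], [y for _, y in kept]

train = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (1.1, 0.9)]
labels = ["a", "a", "b", "b"]
# drop one redundant prototype per class; the 1-NN decision is unchanged
sub_p, sub_y = apply_mask(train, labels, [1, 0, 1, 0])
print(one_nn(sub_p, sub_y, (0.2, 0.2)))  # prints: a
```

Here the reduced set classifies the query exactly as the full set would, while storing half as many rows.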

Evolutionary Algorithms (EAs) have been successfully used in different data mining problems (see Freitas, 2002; Ghosh & Jain, 2005; García-Pedrajas & Ortiz-Boyer, 2007). Given that the PS problem can be regarded as a combinatorial problem, EAs have been used to solve it with promising results (Cano et al., 2003). We term such approaches Evolutionary Prototype Selection (EPS).
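A common EPS formulation encodes each candidate subset as a binary chromosome and evaluates it with a fitness that balances 1-NN accuracy against the reduction rate, in the spirit of Cano et al. (2003). The following is a hedged sketch of one such generational GA; all function names, the alpha weighting, and the particular genetic operators are illustrative assumptions, and the algorithms reviewed in the chapter differ in their details:

```python
import math
import random

def one_nn_accuracy(sel, train, labels):
    """Leave-one-out 1-NN accuracy of the selected subset over the full set."""
    correct = 0
    for i, (x, y) in enumerate(zip(train, labels)):
        cands = [(math.dist(train[j], x), labels[j])
                 for j in range(len(train)) if sel[j] and j != i]
        if cands and min(cands)[1] == y:
            correct += 1
    return correct / len(train)

def fitness(sel, train, labels, alpha=0.5):
    """Weighted mix of classification accuracy and reduction rate."""
    reduction = 1 - sum(sel) / len(sel)
    return alpha * one_nn_accuracy(sel, train, labels) + (1 - alpha) * reduction

def evolve(train, labels, pop=20, gens=30, seed=0):
    """Evolve a binary chromosome (1 = keep prototype) with one-point
    crossover, bit-flip mutation, and elitist replacement."""
    rng = random.Random(seed)
    n = len(train)
    P = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        P.sort(key=lambda s: fitness(s, train, labels), reverse=True)
        kids = []
        while len(kids) < pop // 2:
            a, b = rng.sample(P[:pop // 2], 2)   # mate within the better half
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]            # one-point crossover
            child[rng.randrange(n)] ^= 1         # bit-flip mutation
            kids.append(child)
        P = P[:pop - len(kids)] + kids           # elitist replacement
    return max(P, key=lambda s: fitness(s, train, labels))
```

The reduction term rewards small subsets, so the search pressure favors chromosomes that keep few prototypes yet still classify the training set well.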

The increase in the number of training samples is a well-known problem in PS (the Scaling Up Problem). It produces excessive storage requirements, increases time complexity, and affects generalization accuracy. These drawbacks are also present in EPS, where larger data sets increase the chromosome size and execution time and reduce the convergence capabilities of the EA. Traditional EPS approaches generally suffer from excessively slow convergence because they fail to exploit local information, which often limits the practicality of EAs on large-scale problems where computation time is a crucial consideration. A first approach to the use of EAs when the problem scales up can be found in Cano et al. (2005).

The aim of this chapter is to review the EPS algorithms proposed in the literature, describe them, and analyze their performance in terms of efficiency and effectiveness. We study their performance for different data set sizes in order to analyze their behavior as the database scales up. The results are statistically contrasted in order to argue the benefits and drawbacks of each model.

This chapter is organized as follows. Section 2 formally presents the PS task and the scaling problem that arises as the data set grows. A review of EPS is given in Section 3. In Section 4 we describe the EPS methods studied. Details of the empirical experiments and the results obtained are reported in Section 5. Section 6 contains a brief summary of the work and the conclusions reached.
