An Evolving System in the Text Classification Problem

Elias Oliveira, Patrick Marques Ciarelli, Evandro Ottoni Teatini Salles

Source Title: Handbook of Research on Computational Intelligence for Engineering, Science, and Business

DOI: 10.4018/978-1-4666-2518-1.ch018

Abstract

Traditional machine learning techniques yield good results when the data are stable over time. However, in many cases these techniques may be inefficient for data that are constantly expanding and changing. To address this problem, new learning techniques have been proposed in the literature. In this chapter, the authors discuss some improvements to their technique, called Evolving Probabilistic Neural Network (ePNN), which is based on Probabilistic Neural Networks, and present the main aspects of this recent learning paradigm. They compare ePNN against two competitive techniques from the literature: Incremental Probabilistic Neural Network (IPNN) and Evolving Fuzzy Neural Network (EFuNN). The authors present and discuss a series of experiments that demonstrate the efficiency of ePNN relative to both the IPNN and EFuNN approaches.

Chapter Preview

Introduction

The volume of registered information in both the public and private domains has increased steadily over the past decade. The information may be in textual, image, or audio format, and information in any of these formats can be considered data to be processed. In many contexts, the data analysis is performed manually (Mitra et al., 2002). Examples in which information is registered mainly as text include the multitude of medical records that physicians access each day, the numerous documents that governmental authorities must analyze within a month to make important decisions, and the many academic essays that are analyzed and graded during a semester. In such cases, specialists analyze the data and make decisions based on the information acquired from the analysis.

As the amount of data grows, the time required for analysis, the demand for specialized labor, and the processing costs all increase. Specialists therefore increasingly use computational technologies to automate some processes and make them less time-consuming and costly. Machine learning has been applied in this context. The machines usually learn from data sets prepared by humans and are subsequently able to infer knowledge from incoming data. As time goes by, the machines' capacity for extracting knowledge from new data decreases, and a new training procedure is needed. From time to time, therefore, these algorithms are retrained from scratch using a new human-prepared data set. However, such techniques tend to become inadequate or inefficient for data that are constantly expanding and/or changing over time (Salles et al., 2010).

An off-line model has the advantage of reaching an optimized structure from a previously obtained training data set, but its performance may decline suddenly when certain features of the environment change. Although this approach produces good results in many contexts, an off-line model usually needs to be completely redesigned when new circumstances occur, for example a new class or substantial changes in one or more features of the environment. Another drawback is that the performance of an off-line model is directly related to the quality of the available data set. Learning in these models assumes that the whole data set is already available during the training phase, and therefore does not consider the possibility of accommodating new knowledge when new data are acquired. However, acquiring adequate data that are representative of the problem is often onerous and time-consuming; furthermore, the data usually become available in small quantities over a period of time (Bhattacharyya et al., 2008).

New intelligent algorithms have therefore been developed to overcome these difficulties of the previous learning approaches. A new paradigm in the field of computational intelligence, based on building models from incremental algorithms and data flows, emerged around a decade ago. The new models, called evolutionary models, not only mitigate the problem of storing large amounts of data and processing them all at once, but also offer important characteristics for modeling non-linear adaptive processes. The main characteristics of the evolutionary models are continuous learning, self-organization, and the ability to adapt to unknown environments (Watts, 2009). To attain these goals, the training processes of these models must be able to obtain new knowledge (plasticity) without forgetting previously acquired knowledge (stability). Thus, a trade-off between stability and plasticity is necessary (Polikar et al., 2001).
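To make the idea of learning from a data flow concrete, the following is a minimal sketch of an incremental classifier. It is not one of the techniques discussed in this chapter: it assumes a simple nearest-centroid rule, and the class name `IncrementalCentroidClassifier` and its methods are hypothetical names chosen for illustration. Each sample updates a per-class running mean and is then discarded, so no training set has to be stored, and a previously unseen class can be added at any time.

```python
from collections import defaultdict


class IncrementalCentroidClassifier:
    """Toy incremental learner (illustrative assumption, not ePNN):
    keeps one running-mean centroid per class."""

    def __init__(self):
        self.centroids = {}             # label -> list of feature means
        self.counts = defaultdict(int)  # label -> number of samples seen

    def learn_one(self, x, label):
        """Update the model with a single sample; the sample is not stored."""
        self.counts[label] += 1
        n = self.counts[label]
        if label not in self.centroids:
            # A new class can appear at any point in the stream.
            self.centroids[label] = list(x)
            return
        c = self.centroids[label]
        for i, xi in enumerate(x):
            # Running-mean update with step size 1/n.
            c[i] += (xi - c[i]) / n

    def predict(self, x):
        """Return the label whose centroid is closest to x."""
        def sq_dist(c):
            return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        return min(self.centroids, key=lambda lbl: sq_dist(self.centroids[lbl]))


model = IncrementalCentroidClassifier()
model.learn_one([0, 0], "a")
model.learn_one([2, 0], "a")
model.learn_one([10, 10], "b")
print(model.predict([1, 1]))

# Later, a class that was never seen before arrives in the stream.
model.learn_one([5, 5], "c")
print(model.predict([5, 4]))
```

In this sketch the step size 1/n shrinks as more samples of a class are seen, which favors stability; replacing it with a constant step size would favor plasticity instead. This is one simple way to picture the trade-off described above.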

In this chapter, we discuss a new evolutionary-model technique, called Evolving Probabilistic Neural Network (ePNN), based on a technique presented by Vlassis et al. (1999). Experimental evaluations showed that the proposed technique is highly competitive with other techniques in the literature.
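For readers unfamiliar with the Probabilistic Neural Network on which ePNN is based, the following is a minimal sketch of a basic PNN classifier. It is not the ePNN algorithm, whose details are not given in this preview. It assumes Gaussian kernels with a single shared smoothing parameter `sigma` and equal class priors; the class name `SimplePNN` and its methods are hypothetical. Each stored pattern contributes one kernel, the kernel outputs are averaged per class, and the class with the largest average is chosen.

```python
import math
from collections import defaultdict


class SimplePNN:
    """Basic PNN sketch (illustrative assumption, not the authors' ePNN)."""

    def __init__(self, sigma=0.5):
        self.sigma = sigma                  # shared kernel width (assumed value)
        self.patterns = defaultdict(list)   # label -> stored training patterns

    def add_pattern(self, x, label):
        """Pattern layer: store one unit per training sample."""
        self.patterns[label].append(tuple(x))

    def class_scores(self, x):
        """Summation layer: average Gaussian kernel response per class.
        The kernel's normalizing constant is omitted because sigma is shared,
        so it does not affect which class scores highest."""
        scores = {}
        for label, pts in self.patterns.items():
            total = 0.0
            for p in pts:
                d2 = sum((a - b) ** 2 for a, b in zip(x, p))
                total += math.exp(-d2 / (2.0 * self.sigma ** 2))
            scores[label] = total / len(pts)
        return scores

    def predict(self, x):
        """Decision layer: pick the class with the highest score."""
        scores = self.class_scores(x)
        return max(scores, key=scores.get)


pnn = SimplePNN(sigma=0.5)
pnn.add_pattern([0.0, 0.0], "sports")
pnn.add_pattern([0.0, 1.0], "sports")
pnn.add_pattern([5.0, 5.0], "politics")
print(pnn.predict([0.2, 0.3]))
print(pnn.predict([4.8, 5.1]))
```

Because this basic form keeps every training pattern, its memory use and prediction cost grow with the amount of data, which is one motivation for incremental and evolving variants of the network.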

Key Terms in this Chapter

Plasticity: The ability of a system to continually learn new knowledge. It does not guarantee that previously acquired knowledge is preserved.

Incremental Learning: A machine learning technique in which the training and learning steps are performed continuously over time and never end.

Evolving Layers: Layers of an evolving neural network that undergo the most changes during the learning process.

Stability: The ability of a system to retain acquired knowledge without suddenly forgetting it. It does not guarantee that new information can be learned.

Data Analysis: An activity that examines collected data sets to search for patterns, trends, relationships, or any other kind of useful information.

Off-line Model: A machine learning technique whose (re)training is usually very slow and is not meant to be performed frequently to track the dynamics of a process.

Stopwords: Words that are irrelevant in search and classification systems because they contribute no useful information. They are normally discarded in these systems.
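As an illustration of the last term, the following sketch discards stopwords from a text before classification. The stopword set and the function name `remove_stopwords` are assumptions made for this example; real systems typically use a much longer, language-specific list and a more careful tokenizer than whitespace splitting.

```python
# Hypothetical, deliberately short stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}


def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in stopwords]


print(remove_stopwords("The analysis of the data is performed manually"))
```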
