Active Learning with SVM

Jun Jiang; Horace H.S. Ip

doi:10.4018/978-1-59904-849-9.ch001

Special Offers
- IGI Global’s New Emerging Topic e-Book Collections
  Acquire highly focused and affordable Cutting-Edge Peer-Reviewed Research Content through a selection of 17 topic-focused e-Book Collections discounted up to 90%, compared to list prices. Collection topics include Artificial Intelligence, Data Science, Language Learning, Marketing and Customer Relations, Sustainability, and many more. Hosted on the InfoSci^® platform, these collections feature no DRM, no additional cost for multi-user licensing, no embargo of content, full-text PDF & HTML format, and more.
  Learn More
- Open Access Book (Free Access) - Encyclopedia of Information Science and Technology, Sixth Edition (ISBN: 9781668473665)
  The Encyclopedia of Information Science and Technology, Sixth Edition) continues the legacy set forth by the first five editions by providing comprehensive coverage and up-to-date definitions of the most important issues, concepts, and trends pertaining to technological advancements and information management within a variety of settings and industries. The entire book is being published under open access.
  Read Now
- Open Access Book (Free Access) - Food Sustainability, Environmental Awareness, and Adaptation and Mitigation Strategies for Developing Countries (ISBN: 9781668456293)
  Food Sustainability, Environmental Awareness, and Adaptation and Mitigation Strategies for Developing Countries provides information on the recent technology, mitigation, and environmental protection that must be applied for food sustainability in developing countries. This book is being published under Platinum Open Access through funding from Diponegoro University, Indonesia.
  Read Now
- Open Access Book (Free Access) - New Models of Higher Education: Unbundled, Rebundled, Customized, and DIY (ISBN: 9781668438091)
  The Walmart Corporation and the Lumina Foundation have provided funding to make New Models of Higher Education: Unbundled, Rebundled, Customized, and DIY fully open access, completely removing any paywall between scholars in education and the latest research on new models for the future of higher education.
  Read Now
- Open Access Book (Free Access) - Handbook of Research on the Global View of Open Access and Scholarly Communications (ISBN: 9781799898054)
  Through a collaboration between IGI Global and the University of North Texas, the Handbook of Research on the Global View of Open Access and Scholarly Communications has been published as fully open access, completely removing any paywall between researchers of any field, and the latest research on the equitable and inclusive nature of Open Access and all of its complications.
  Read Now
Books
- - Books by Subject
  - Business, Administration, & Management
  - Scientific, Technical, & Medical (STM)
  - Education
  - Books by Field
Journals
- - Journals
  - OnDemand Journal Articles
  - Journals by Subject
  - Business, Administration, & Management
  - Scientific, Technical, & Medical (STM)
  - Education
  - Journals by Field
e-Collections
Open Access
- View All Open Access Opportunities
  Search across all of IGI Global’s available open access publishing opportunities to unleash your research potential.
  Find an Open Access Journal for Your Next Manuscript
  Search across all of IGI Global’s available open access publishing opportunities to unleash your research potential.
  Submit an Open Access Book Proposal
  Learn more about open access book publishing and how it can propel your research forward in the field.
  Convert Your Work to Open Access
  Already published? You can convert your work to open access to increase its impact through IGI Global’s Restrospective Open Access Program.
  Utilize Open Access Collection Database
  Open up your research potential by utilizing our open access content or integrating the open access collection into your library
  Consider Open Access Agreements
  For Libraries: consider no-cost or investment-level open access agreements with IGI Global to support your faculty's research endeavors.
  Search Funding Resources
  Looking for additional funding resources to support your open accesss endeavors? View industry resources compiled by our open access team.
  Review Open Access Policies & Ethical Guidelines
  Considering IGI Global to publish your work under open access? Review IGI Global’s open access policies and ethical guidelines
Publish with Us
Resources
- - Instructors
  - Course Adoption
  - Teaching Cases
  - K-12 Online Learning Collection
  - Authors and Editors
  - eEditorial Discovery^® System
  - Peer Review Process
  - Ethics and Malpractice
  - COPE Membership
  - Fair Use Policy
  - Open Access Publishing
  - FAQ
Catalogs
About Us
Newsroom

Active Learning with SVM

Jun Jiang, Horace H.S. Ip

Source Title: Encyclopedia of Artificial Intelligence

DOI: 10.4018/978-1-59904-849-9.ch001

OnDemand:

(Individual Chapters)

Available

$37.50

Current Special Offers

No Current Special Offers

Abstract

Jiang, Jun; IP Horace H. S. With the increasing demand of multimedia information retrieval, such as image and video retrieval from the Web, there is a need to find ways to train a classifier when the training dataset is combined with a small number of labelled data and a large number of unlabeled one. Traditional supervised or unsupervised learning methods are not suited to solving such problems particularly when the problem is associated with data in a high-dimension space. In recent years, many methods have been proposed that can be broadly divided into two groups: semi-supervised and active learning (AL). Support Vector Machine (SVM) has been recognized as an efficient tool to deal with high-dimensionality problems, a number of researchers have proposed algorithms of Active Learning with SVM (ALSVM) since the turn of the Century. Considering their rapid development, we review, in this chapter, the state-of-the-art of ALSVM for solving classification problems.

Chapter Preview

Top

Background

The general framework of AL can be described as in Figure 1. It can be seen clearly that its name – active learning – comes from the fact that the learner can improve the classifier by actively choosing the “optimal” data from the potential query set Q and adding it into the current labeled training set L after getting its label during the processes. The key point of AL is its sample selection criteria.

Figure 1.

Framework of active learning

AL in the past was mainly used together with neural network algorithm and other learning algorithms. Statistical AL is one classical method, in which the sample minimizing either the variance (D. A. Cohn, Ghahramani, & Jordan, 1996), bias (D. A. Cohn, 1997) or generalisation error (Roy & McCallum, 2001) is queried to the oracle. Although these methods have strong theoretical foundation, there are two common problems limiting their application: one is how to estimate the posterior distribution of the samples, and the other is its prohibitively high computation cost. To deal with the above two problems, a series of version space based AL methods, which are based on the assumption that the target function can be perfectly expressed by one hypothesis in the version space and in which the sample that can reduce the volume of the version space is chosen, have been proposed. Examples are query by committee (Freund, Seung, Shamir, & Tishby, 1997), and SG AL (D. Cohn, Atlas, & Ladner, 1994). However the complexity of version space made them intractable until the version space based ALSVMs have emerged.

The success of SVM in the 90s has prompted researchers to combine AL with SVM to deal with the semi-supervised learning problems, such as distance-based (Tong & Koller, 2001), RETIN (Gosselin & Cord, 2004) and Multi-view (Cheng & Wang, 2007) based ALSVMs. In the following sections, we summarize existing well-known ALSVMs under the framework of version space theory, and then briefly describe some mixed strategies. Lastly, we will discuss the research trends for ALSVM and give conclusions for the chapter.

Top

Version Space Based Active Learning With Svm

The idea of almost all existing heuristic ALSVMs is explicitly or implicitly to find the sample which can reduce the volume of the version space. In this section, we first introduce their theoretical foundation and then review some typical ALSVMs.

Version Space Theory

Based on the Probability Approximation Correct learning model, the goal of machine learning is to find a consistent classifier which has the lowest generalization error bound. The Gibbs generalization error bound (McAllester, 1998) is defined as

where P_H denotes a prior distribution over hypothesis space H, V(z) denotes the version space of the training set z, m is the number of z and δ is a constant in [0, 1]. It follows that the generalization error bound of the consistent classifiers is controlled by the volume of the version space if the distribution of the version space is uniform. This provides a theoretical justification for version space based ALSVMs.

Key Terms in this Chapter

Version Space: The subset of the hypothesis space which is consistent with the training set.

Unsupervised Learning: The set of learning algorithms in which the samples in training dataset are all unlabelled.

Hypothesis Space: The set of all hypotheses in which the objective hypothesis is assumed to be found.

Statistical Active Learning: The set of active learning algorithms in which the sample selection criteria is based on some statistical objective function, such as minimization of generalisation error, bias and variance. Statistical active learning is usually statistically optimal.

Heuristic Active Learning: The set of active learning algorithms in which the sample selection criteria is based on some heuristic objective function. For example, version space based active learning is to select the sample which can reduce the size of the version space.

Supervised Learning: The set of learning algorithms in which the samples in the training dataset are all labelled.

Semi-Supervised Learning: The set of learning algorithms in which both labelled and unlabelled data in the training dataset are directly used to train the classifier.

Complete Chapter List

Search this Book:

Reset

MLA

APA

Chicago

Export Reference

Active Learning with SVM

Abstract

Background

Version Space Based Active Learning With Svm

Version Space Theory

Key Terms in this Chapter

Complete Chapter List