InfoScipedia
A Free Service of IGI Global Publishing House

What is Two-Step Strategy of PU Learning

Handbook of Research on Text and Web Mining Technologies
The first step of PU learning is to identify a set of reliable negative documents (set RN) from the unlabeled set U; the second step is to build a classifier using the positive set P, the reliable negative set RN, and the remaining unlabeled set U’ (U’ = U - RN).
Published in Chapter:
Partially Supervised Text Categorization
Xiao-Li Li (Institute for Infocomm Research, A* STAR, Singapore)
Copyright: © 2009 |Pages: 21
DOI: 10.4018/978-1-59904-990-8.ch005
Abstract
In traditional text categorization, a classifier is built using labeled training documents from a set of predefined classes. This chapter studies a different problem: partially supervised text categorization. Given a set P of positive documents of a particular class and a set U of unlabeled documents (which contains both hidden positive and hidden negative documents), we build a classifier using P and U to classify the data in U as well as future test data. The key feature of this problem is that there are no labeled negative documents, which makes traditional text classification techniques inapplicable. In this chapter, we introduce the main techniques, S-EM, PEBL, Roc-SVM and A-EM, for solving the partially supervised problem. In many application domains, partially supervised text categorization is preferred because it saves the labor-intensive effort of manually labeling negative documents.
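
The two-step strategy defined above can be illustrated with a minimal sketch in Python. It is not an implementation of any technique from the chapter. As assumptions made only for illustration, step 1 treats an unlabeled document as a reliable negative when it is closer (by cosine similarity) to the centroid of U than to the centroid of P, and step 2 uses a simple nearest-centroid classifier built from P and RN, which is then applied to the remaining set U’. The function names (vectorize, centroid, cosine, two_step_pu) and the toy documents are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts, scaled to unit length."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()}


def centroid(vectors):
    """Average of a list of sparse vectors (empty dict if the list is empty)."""
    total = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_step_pu(P, U):
    """Return (indices of RN in U, indices of U' in U, classifier function)."""
    p_vecs = [vectorize(doc) for doc in P]
    u_vecs = [vectorize(doc) for doc in U]

    # Step 1 (assumed heuristic): a document in U is a reliable negative
    # if it looks more like the unlabeled centroid than the positive one.
    c_pos = centroid(p_vecs)
    c_unl = centroid(u_vecs)
    rn_idx = [i for i, v in enumerate(u_vecs)
              if cosine(v, c_unl) > cosine(v, c_pos)]
    rn_set = set(rn_idx)
    rest_idx = [i for i in range(len(U)) if i not in rn_set]  # U' = U - RN

    # Step 2 (assumed stand-in classifier): nearest centroid over P and RN.
    c_neg = centroid([u_vecs[i] for i in rn_idx])

    def classify(text):
        v = vectorize(text)
        return "positive" if cosine(v, c_pos) >= cosine(v, c_neg) else "negative"

    return rn_idx, rest_idx, classify


# Toy data, invented for this sketch: P is about sports, U is mixed.
P = ["football match goal team",
     "team wins football league",
     "goal scored in match"]
U = ["football team scores goal",
     "stock market shares fall",
     "bank raises interest rates",
     "market interest in bank shares",
     "league match football"]

rn_idx, rest_idx, classify = two_step_pu(P, U)
labels_for_rest = [classify(U[i]) for i in rest_idx]
```

On this toy data, the finance documents end up in RN and the two sports documents remain in U’, where the step-2 classifier labels them. A fuller treatment would typically refine the classifier iteratively using U’ rather than classify it in a single pass.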