Multi-Label Classification

Jesse Read (Universidad Carlos III, Spain) and Albert Bifet (Yahoo! Research Barcelona, Spain)
Copyright: © 2014 | Pages: 4
DOI: 10.4018/978-1-4666-5202-6.ch142

Multi-label classification arises naturally. For example, a movie can belong to both the “Action” and “Comedy” genres. In the physical world, this often requires physical duplication: a video store has to put copies of such a movie in two different sections (“Action” and “Comedy”) or create a single combined category (“Action-Comedy”). In a digital environment there are no such restrictions, which is probably why the multi-label concept has risen so rapidly in popularity. Consider how the once-familiar ‘folder’ metaphor is being replaced by labels or tags in many everyday applications: email, picture, document, and media collections, and so forth.

The typical goal of multi-label learning is to learn a model that predicts or recommends labels for data points automatically, and thus reduces or even eliminates the need for time-consuming manual labeling of data and document collections. Such collections may include e-mails and text documents, images, and audio and video collections, as well as certain biological data, for example where genes can be associated with multiple functions. For a detailed introduction to and review of multi-label classification, see for example (Tsoumakas, Katakis, & Vlahavas, 2010).

Multi-label classification has borrowed heavily from the established domain of traditional single-label classification, that is, the task of assigning a single ‘class’ to each data element. Methods can be grouped roughly into two categories:

Problem/Data Transformation

In this approach, the multi-labeled data is transformed into one or a series of single-label problems, to which standard off-the-shelf single-label classifiers can then be applied. One typical transformation creates one binary problem for each label (to predict whether the label is relevant or not), as in (Read et al., 2011). Another creates a multi-class problem in which combinations of labels are represented as atomic, mutually exclusive classes (“Action-Comedy,” in the movie example, would be considered a single class), as in (Tsoumakas, Katakis, & Vlahavas, 2011). Perhaps the earliest work of this type with specific reference to multi-label classification is (Boutell et al., 2004), where images were assigned scene labels.
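The two transformations described above can be sketched in a few lines of plain Python. This is only a minimal illustration: the toy data set, the feature values, the extra “Romance” label, and the function names `binary_relevance` and `label_powerset` are assumptions made for this example and are not taken from the cited works.

```python
# Toy multi-label data: (feature vector, set of relevant labels).
# Features and labels are invented purely for illustration.
data = [
    ([1.0, 0.2], {"Action", "Comedy"}),
    ([0.9, 0.1], {"Action"}),
    ([0.1, 0.8], {"Comedy"}),
    ([0.2, 0.9], {"Comedy", "Romance"}),
]


def binary_relevance(data):
    """Create one binary data set per label: 1 if the label is relevant, else 0."""
    labels = sorted(set().union(*(y for _, y in data)))
    return {
        label: [(x, int(label in y)) for x, y in data]
        for label in labels
    }


def label_powerset(data):
    """Create one multi-class data set: each label combination is one atomic class."""
    return [(x, "-".join(sorted(y))) for x, y in data]


br = binary_relevance(data)  # dict: label -> single-label (binary) data set
lp = label_powerset(data)    # list: single-label (multi-class) data set

print(sorted(br))            # the labels, one binary problem each
print([c for _, c in lp])    # the atomic classes, e.g. "Action-Comedy"
```

Either result can be handed to any ordinary single-label learner: one binary classifier per entry of `br`, or one multi-class classifier for `lp`.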

Algorithm Adaptation

In this approach, existing single-label algorithms are adapted to deal with the multi-label problem directly. Examples include artificial neural networks that have multiple outputs (Zhang & Zhou, 2006) and decision trees that predict multiple labels in the leaves (Vens et al., 2008; and more recently, Kocev, 2013), among many other adaptations.
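As a rough illustration of the decision-tree case, the sketch below shows how a single leaf might predict several labels at once: it returns every label that is relevant to at least a given fraction of the training examples that reached the leaf. The function name `leaf_prediction`, the 0.5 threshold, and the toy label sets are assumptions for this example; it is a simplification, not the procedure of the cited papers.

```python
from collections import Counter


def leaf_prediction(label_sets, threshold=0.5):
    """Predict each label relevant to at least `threshold` of the leaf's examples."""
    counts = Counter(label for y in label_sets for label in y)
    n = len(label_sets)
    return {label for label, c in counts.items() if c / n >= threshold}


# Label sets of the (hypothetical) training examples that ended up in one leaf.
leaf = [
    {"Action", "Comedy"},
    {"Action"},
    {"Action", "Comedy"},
    {"Comedy", "Romance"},
]

predicted = leaf_prediction(leaf)
print(sorted(predicted))
```

A single-label tree would store only one majority class in the leaf; the adaptation here is that the leaf keeps a relevance frequency per label and can therefore output a set of labels.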

Multi-label classification software offering a variety of methods for multi-label learning and evaluation includes:

The choice of method depends on the size of the data collection, the number of labels, and the peculiarities of the data, that is, the application domain. A good recent review of some of the best-known methods so far is given in (Zhang & Zhou, 2013).

Key Terms in this Chapter

Active Learning: A setting in which a method chooses its own training data by requesting manually assigned labels for particular data points.

Data Stream: A dynamic and evolving stream of data, such as e-mail and news, usually considered to grow too large to hold in memory.

Folksonomy: A labeling scheme that arises from the collaboration of different users applying labels, usually online, such as in a forum.

Transfer Learning: Using the model created for one task to help with a different task.

Semi-Supervised Learning: Learning to label new data using both labeled training data and unlabeled data.
