Zero-Shot Feature Selection via Transferring Supervised Knowledge

Zheng Wang, Qiao Wang, Tingzhang Zhao, Chaokun Wang, Xiaojun Ye
Copyright: © 2021 |Pages: 20
DOI: 10.4018/IJDWM.2021040101

Abstract

Feature selection, an effective technique for dimensionality reduction, plays an important role in many machine learning systems. Supervised knowledge can significantly improve feature selection performance. However, faced with the rapid growth of newly emerging concepts, existing supervised methods may suffer from a scarcity of valid labeled training data. In this paper, the authors study the problem of zero-shot feature selection (i.e., building a feature selection model that generalizes well to "unseen" concepts with limited training data of "seen" concepts). Specifically, they adopt class-semantic descriptions (i.e., attributes) as supervision for feature selection, so as to utilize the supervised knowledge transferred from the seen concepts. To obtain more reliable discriminative features, they further propose the center-characteristic loss, which encourages the selected features to capture the central characteristics of seen concepts. Extensive experiments conducted on various real-world datasets demonstrate the effectiveness of the method.

1. Introduction

The problem of feature selection (Guyon & Elisseeff, 2003; Wang, Ye, Wang, & Yu, 2019) has been widely investigated due to its importance for pattern recognition and image processing systems. This problem can be formulated as follows: identify an optimal feature subset that provides the best tradeoff between its size and its relevance for a given task. The identified features not only provide an effective solution for the task but also offer a dimensionally reduced view of the underlying data.

Supervised knowledge (e.g., labels or pair-wise relationships) associated with data can significantly improve the performance of feature selection methods (Chandrashekar & Sahin, 2014). However, existing supervised feature selection methods face an enormous challenge: the generation of reliable supervised knowledge cannot catch up with the rapid growth of newly emerging concepts and multimedia data. In practice, it is costly to annotate sufficient training data for new concepts in a timely manner, and impractical to retrain the feature selection model whenever a new concept emerges. As illustrated in Figure 1, traditional methods perform well on the seen concepts for which correct guidance is available, but they may easily fail on unseen concepts that have never been observed, such as the newly invented product "quadrotor". Therefore, the problem of Zero-shot Feature Selection (ZSFS), i.e., building a feature selection model that generalizes well to unseen concepts with limited training data of seen concepts, deserves great attention. However, few studies have considered this problem.

The major challenge in the ZSFS problem is how to deduce knowledge of unseen concepts from seen concepts. In fact, the primary reason why existing studies fail to handle unseen concepts is that they only consider the discrimination among seen concepts (like the 0/1-form class labels illustrated in Figure 1), such that little knowledge can be deduced for unseen concepts. To address this, as illustrated in Figure 2, we adopt class-semantic descriptions (i.e., attributes) as supervision for feature selection. This idea is inspired by the recent development of Zero-shot Learning (ZSL) (Farhadi, Endres, Hoiem, & Forsyth, 2009; Guo, Ding, Han, & Gao, 2017), which has demonstrated that the capacity of inferring attributes allows us to describe, compare, or even categorize unseen objects.
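To make the idea of attribute supervision concrete, the following is a minimal illustrative sketch, not the authors' actual model: features are scored by learning a linear map from the raw features to per-instance attribute vectors, and ranking each feature by the norm of its corresponding row of the learned map. The toy data, the attribute matrix, and the linear-regression formulation are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical): n samples, d raw features, k class attributes.
n, d, k = 100, 6, 3
X = rng.normal(size=(n, d))

# In this toy example only the first two features actually drive the
# attribute descriptions; the rest are irrelevant.
M_true = np.zeros((d, k))
M_true[0] = [1.0, 0.5, -0.3]
M_true[1] = [-0.7, 1.2, 0.4]
A = X @ M_true + 0.01 * rng.normal(size=(n, k))  # per-instance attributes

# Learn a linear map W: features -> attributes via least squares, then
# score each feature by the L2 norm of its row of W (a larger norm means
# the feature carries more attribute-relevant information).
W, *_ = np.linalg.lstsq(X, A, rcond=None)
scores = np.linalg.norm(W, axis=1)
selected = np.argsort(scores)[::-1][:2]
print(sorted(selected.tolist()))  # -> [0, 1]
```

Because the supervision lives in attribute space rather than label space, the same learned relevance scores can, in principle, transfer to unseen concepts that are described by the same attribute vocabulary.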

An attendant problem is how to identify reliable discriminative features from attributes that may be inaccurate and noisy (Jayaraman & Grauman, 2014). To alleviate this, we further propose a novel loss function (named the center-characteristic loss) which encourages the selected features to capture the central characteristics of seen concepts. Theoretically, this loss function is a variant of the center loss (Wen, Zhang, Li, & Qiao, 2016), which has proven effective for learning discriminative and generalized features for categorizing unseen objects.
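The center loss that this idea builds on penalizes the distance between each sample's feature vector and the mean (center) of its class. Below is a minimal sketch of that center-loss-style penalty applied to selected features; the function name, the toy data, and the use of class means as centers are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def center_characteristic_loss(X_sel, labels, centers):
    """Mean squared distance between each sample's selected features
    and the center of its class (a center-loss-style penalty)."""
    diffs = X_sel - centers[labels]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

# Toy check: two well-separated classes in a 4-dim selected-feature space.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 50)
X_sel = rng.normal(size=(100, 4)) + labels[:, None] * 5.0
centers = np.stack([X_sel[labels == c].mean(axis=0) for c in (0, 1)])

loss = center_characteristic_loss(X_sel, labels, centers)
```

Minimizing such a term pushes the selected features of each seen class toward a compact cluster around its center, which is one way to make the features discriminative without relying solely on possibly noisy attributes.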

Figure 1. The Zero-shot Feature Selection Problem

Figure 2. Overview of the proposed method

We evaluate the performance of the proposed method on several real-world datasets, including SUN, aPY, and CIFAR10. It is worth noting that the attributes of CIFAR10 are automatically generated from a public Wikipedia text corpus (Shaoul, 2010) by a well-known NLP tool (Huang, Socher, Manning, & Ng, 2012). The experimental evidence shows that, with both manually and automatically generated attributes, our method generalizes well to unseen concepts.

We summarize our main contributions as follows:
