1. Introduction
In machine learning, the training data can often be enhanced by selecting or extracting features from the raw data. Choosing informative, discriminative, and independent features is a critical step in machine learning tasks. Generally speaking, there are two ways of constructing feature representations: with domain knowledge or by learning from samples, and each has its own pros and cons. Using domain knowledge makes it easier to find proper representations for a specific task; however, even for tasks that differ only slightly, domain knowledge must be consulted anew before feature extraction. Moreover, in many image understanding problems (e.g., age estimation based on facial images), using domain knowledge to generate geometric or texture features (Todd et al., 1980) may not be feasible at all.
Recently, feature learning, which obviates manual feature engineering, has become a common approach to learning proper representations of the inputs from samples (Bengio et al., 2013). For a given problem, the learning procedure optimizes the representations based on labeled or unlabeled data and produces robust features for classification or regression tasks. In addition, fine-tuning models established for similar tasks can improve performance and mitigate drawbacks such as over-fitting or insufficient training data.
Ranking, usually formalized as “learning to rank”, is an application of machine learning used mostly in the field of information retrieval. Ranking algorithms typically map inputs to ordinal relationships and can generally be divided into three categories (Liu, 2009): pointwise, pairwise, and listwise. In the literature, it was generally taken for granted that features extracted with domain knowledge, or learned from data for classification tasks, can also be used for ranking. However, ranking possesses some distinctive properties that common classification approaches lack, e.g., the ordinal relationships among the class labels. These differences indicate the necessity of developing new feature learning procedures dedicated to ranking.
Figure 1. Ranking-CNN for facial image-based age estimation
In this paper, we propose a feature learning approach that provides and interprets the best features for ranking problems. Our framework is based on the ranking-CNN (Chen et al., 2017) and can be utilized in a variety of ranking applications. Specifically, ranking-CNN contains a series of basic binary CNNs, each of which has a sequence of convolutional layers, sub-sampling layers, and fully connected layers. The basic CNNs are initialized with unsupervised learning and fine-tuned with ordinal labels through supervised learning. Their binary outputs are then aggregated to predict the final ranking. Figure 1 shows an illustration of the ranking-CNN model on facial image-based age estimation. The main contributions of this paper are summarized as follows:
- We illustrate why ranking-CNN can learn a feature representation that is most suitable for establishing the relative order of the labels;
- We present a case study of our model on facial image-based age estimation. Through extensive experiments on real datasets, we show that ranking-CNN outperforms other state-of-the-art feature extractors and age estimators;
- We propose an innovative way of understanding human aging patterns from a machine perspective. By visualizing activations in feature maps using histograms and a deconvolution algorithm, we present a clear illustration of the deep aging patterns learned by ranking-CNN.
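To make the aggregation step described above concrete, the following is a minimal sketch of how a series of binary outputs can be combined into a single rank. It assumes an ordinal decomposition in which the k-th basic CNN estimates whether the input's rank exceeds k; the function name `aggregate_rank` and the threshold value are illustrative choices, not part of the original model specification.

```python
import numpy as np

def aggregate_rank(binary_scores, threshold=0.5):
    """Combine K-1 binary classifier outputs into one rank in 1..K.

    binary_scores[k] is assumed to be the k-th basic CNN's score that
    the input's rank exceeds k.  The predicted rank is then 1 plus the
    number of positive binary decisions.
    """
    decisions = np.asarray(binary_scores) > threshold  # boolean vector
    return 1 + int(decisions.sum())

# Example with 4 binary classifiers (5 rank groups): the first two
# classifiers fire, so the predicted rank is 3.
print(aggregate_rank([0.9, 0.8, 0.3, 0.1]))  # → 3
```

Summing independent binary decisions in this way is what lets each basic CNN specialize in one ordinal boundary while the ensemble still yields a consistent rank prediction.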