Introduction
Dimensionality reduction is an important problem in machine learning and data mining. Techniques that can significantly reduce the dimensionality of high-dimensional data have been used extensively in areas such as computer vision, data mining, machine learning and pattern recognition (Duda, Hart, & Stork, 2001; Hastie, Tibshirani, & Friedman, 2001; Feng, Li, Du, & Zhang, 2017). In general, the goal of most dimensionality reduction techniques is to map the high-dimensional data points in a data set to a set of low-dimensional data points while preserving the features that are most important for recognition and classification. A successful dimensionality reduction technique can help avoid the problems that arise from high dimensionality without adversely affecting the accuracy of recognition and classification.
So far, a large number of approaches have been developed to process high-dimensional data and reduce its dimensionality. For example, Principal Component Analysis (PCA) (Fukunaga, 1990) is a traditional dimensionality reduction technique for unlabelled data. It reduces the dimensionality of a data set by seeking linear projections that maximize the global variance of the projected data points. PCA can thus preserve the maximum amount of the global information contained in a data set. However, PCA is not guaranteed to achieve optimal performance on classification problems and is therefore often not used to process labelled data.
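The variance-maximizing projections that PCA computes can be sketched in a few lines of NumPy: centre the data, eigendecompose the sample covariance matrix, and project onto the top-k eigenvectors. This is a minimal illustration of the classical method, not a production implementation (the function name `pca_project` and the toy data are ours):

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the k directions of maximum variance."""
    Xc = X - X.mean(axis=0)                 # centre each feature
    cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    W = eigvecs[:, ::-1][:, :k]             # top-k principal directions
    return Xc @ W                           # low-dimensional coordinates

# toy data: 100 points in 5 dimensions, with most variance in the first axis
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) * np.array([10.0, 2.0, 1.0, 0.5, 0.1])
Z = pca_project(X, 2)
print(Z.shape)  # (100, 2)
```

Note that the projection keeps the highest-variance directions first, so the first column of `Z` carries more variance than the second; nothing in this objective, however, uses class labels, which is why PCA can discard directions that matter for classification.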
As another example, Linear Discriminant Analysis (LDA) is a classical supervised dimensionality reduction technique for labelled data (Fisher, 1936; Fukunaga, 1990). The goal of LDA is to maximize the extent to which data points in different classes are separated from one another in the space of reduced dimensionality. It computes the directions along which the ratio of the between-class distance to the within-class distance is maximized. LDA has been used extensively for dimensionality reduction in a variety of applications, such as microarray data analysis and face recognition (Duda, Hart, & Stork, 2001; Dudoit, Fridlyand, & Speed, 2002). However, LDA is only able to capture and preserve the global geometric structure of a data set; the local geometric structure might be lost after the dimensionality of the data is reduced (Chen, Ye, & Li, 2007).
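For the two-class case, the direction that maximizes the between-class to within-class ratio has the well-known closed form w = Sw⁻¹(m₁ − m₀), where Sw is the pooled within-class scatter matrix and m₀, m₁ are the class means. The sketch below illustrates this two-class special case only (the function name `fisher_direction` and the synthetic data are ours, not from the source):

```python
import numpy as np

def fisher_direction(X0, X1):
    """Fisher's linear discriminant direction for two classes."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # pooled within-class scatter matrix
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)        # w = Sw^{-1} (m1 - m0)
    return w / np.linalg.norm(w)

# two Gaussian classes separated along the first axis
rng = np.random.default_rng(1)
X0 = rng.normal(loc=[0.0, 0.0], size=(50, 2))
X1 = rng.normal(loc=[3.0, 0.0], size=(50, 2))
w = fisher_direction(X0, X1)
gap = (X1 @ w).mean() - (X0 @ w).mean()     # class separation along w
```

Because the objective involves only the class means and pooled scatter, the solution depends purely on these global statistics of each class, which is precisely why LDA cannot see local neighbourhood structure.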
Recent research has shown that local geometric structure is an important feature of a data set and may affect the recognition accuracy of certain classifiers (Belkin & Niyogi, 2001; Chen, Ye, & Li, 2007; Silva & Tenenbaum, 2003; Harandi, Salzmann, & Hartley, 2017). Given a data set, many classifiers classify a data point by analyzing the part of the data set in the vicinity of that point; the local geometry of the data set is thus important for such classifiers to classify a data point correctly. Based on this observation, a variety of approaches have been developed to reduce the dimensionality of a data set while preserving its local geometric structure. For example, Laplacian Eigenmaps (LE) (Belkin & Niyogi, 2001) and Locality Preserving Projection (LPP) (He & Niyogi, 2002) reduce the dimensionality by minimizing an objective defined in terms of the graph Laplacian matrix. Locally Linear Embedding (LLE) (Roweis & Saul, 2000) models the local geometric structure as linear dependence among neighbouring points and maps the data into a low-dimensional manifold that preserves this dependence with minimum error. These dimensionality reduction techniques preserve the local geometric structure of a data set. However, most of them do not consider the global features of a data set, and important global features might be lost during dimensionality reduction.
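The graph-Laplacian objective used by LE can be sketched concretely: build a symmetric k-nearest-neighbour graph, form the unnormalised Laplacian L = D − W, and embed each point using the eigenvectors of L with the smallest non-zero eigenvalues. This is a simplified sketch, assuming binary edge weights and a dense eigensolver (the original method also allows heat-kernel weights and a generalized eigenproblem); the function name and toy spiral data are ours:

```python
import numpy as np

def laplacian_eigenmaps(X, k_neighbors, dim):
    """Embed X with unnormalised Laplacian Eigenmaps (binary kNN weights)."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:k_neighbors + 1]       # skip self at position 0
        W[i, idx] = 1.0
    W = np.maximum(W, W.T)                               # symmetrise the kNN graph
    L = np.diag(W.sum(axis=1)) - W                       # unnormalised graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)                 # ascending eigenvalues
    return eigvecs[:, 1:dim + 1]                         # drop the constant eigenvector

# toy data: 80 points along a 2-D spiral, to be unrolled into 1 dimension
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0, 3 * np.pi, 80))
X = np.c_[t * np.cos(t), t * np.sin(t)]
Y = laplacian_eigenmaps(X, k_neighbors=5, dim=1)
print(Y.shape)  # (80, 1)
```

The objective penalizes placing graph neighbours far apart, so only local adjacency enters the embedding; distances between points that are far apart on the graph are unconstrained, which is the global information such methods can lose.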