This chapter is a brief introduction to biometric discriminant analysis technologies, the subject of Section I of the book. Section 2.1 describes two kinds of linear discriminant analysis (LDA) approaches: classification-oriented LDA and feature extraction-oriented LDA. Section 2.2 discusses LDA for solving small sample size (SSS) pattern recognition problems. Section 2.3 outlines the organization of Section I.
Linear Discriminant Analysis
The linear discriminant analysis (LDA) method has been widely studied and successfully applied to biometric recognition tasks such as face, fingerprint, and palm print identification or verification.
The essence of LDA is to construct a linear discriminant criterion that can be used to build a binary classifier or a feature extractor. To differentiate LDA for binary classification from LDA for feature extraction, we hereafter call the former classification-oriented LDA and the latter feature extraction-oriented LDA.
Classification-Oriented Linear Discriminant Analysis
Linear discriminant analysis was initially developed for binary classification in Fisher's seminal work (Fisher, 1936). Among the various discriminant criteria, one of the best known is the Fisher discriminant criterion (FDC) for binary linear discriminant analysis. FDC seeks an optimal projection direction such that, when samples from two distinct classes are projected onto it, the between-class variance is maximized while the within-class variance is minimized. Besides FDC, other linear discriminant criteria exist for binary classification; the perceptron and minimum squared-error (MSE) criteria (Duda, Hart, & Stork, 2001) are two well-known examples.
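The idea behind FDC can be illustrated with a minimal NumPy sketch. It assumes the standard closed-form solution in which the optimal direction is proportional to the inverse within-class scatter matrix times the difference of the class means. The function names fisher_direction and fisher_criterion, and the synthetic Gaussian data, are illustrative assumptions and do not come from the text.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Return a unit vector that maximizes the Fisher criterion for two classes.

    X1, X2: arrays of shape (n_samples, n_features), one array per class.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    # Closed-form maximizer: w is proportional to Sw^{-1} (m1 - m2).
    w = np.linalg.solve(Sw, m1 - m2)
    return w / np.linalg.norm(w)

def fisher_criterion(w, X1, X2):
    """Ratio of between-class to within-class variation along direction w."""
    p1, p2 = X1 @ w, X2 @ w
    between = (p1.mean() - p2.mean()) ** 2
    within = ((p1 - p1.mean()) ** 2).sum() + ((p2 - p2.mean()) ** 2).sum()
    return between / within

# Synthetic two-class data with correlated features (an assumption for the demo).
rng = np.random.default_rng(0)
cov = np.array([[3.0, 2.0], [2.0, 3.0]])
X1 = rng.multivariate_normal([0.0, 0.0], cov, size=200)
X2 = rng.multivariate_normal([2.0, 0.0], cov, size=200)

w = fisher_direction(X1, X2)
naive = X1.mean(axis=0) - X2.mean(axis=0)
naive = naive / np.linalg.norm(naive)

print("Fisher direction:", w)
print("criterion along Fisher direction:", fisher_criterion(w, X1, X2))
print("criterion along mean difference: ", fisher_criterion(naive, X1, X2))
```

Because the features are correlated, the direction joining the two class means is not the best one to project onto; the Fisher direction attains a criterion value at least as large as that of any other direction.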
The mathematical form of each linear discriminant criterion can be characterized as an optimization model that is used to calculate the weight vector and, in some cases, the bias of a binary classifier. Similarly, the linear support vector machine (LSVM) (Burges, 1998) is formulated as an optimization model that yields the weight vector and bias of a binary classifier. Thus, LSVM can be viewed as a kind of classification-oriented LDA method.
Classification-oriented LDA methods are in fact binary linear classifiers and cannot be applied directly to multiclass pattern classification tasks. To handle such tasks, they must be combined with one of the implementation strategies described in Section 3.1.3.
Building on Fisher’s work, Wilks (1962) extended the concept of an optimal projection direction to a set of discriminant vectors by generalizing the Fisher discriminant criterion to the multiple Fisher discriminant criterion. While the former is used to calculate the weight vector of the Fisher classifier, the latter is used to compute a set of Fisher discriminant vectors. By using these discriminant vectors as the columns of a transformation matrix, Wilks reduced a complicated classification problem in a high-dimensional input space to a simpler one in a low-dimensional feature space. The procedure that compresses data from the input space into a feature space by means of a transformation matrix is called linear feature extraction. The feature extraction method that uses Fisher discriminant vectors is called Fisher linear discriminant (FLD).
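Linear feature extraction with Fisher discriminant vectors can be sketched as follows. The sketch assumes the common formulation in which the discriminant vectors solve the generalized eigenproblem S_b w = lambda S_w w, where S_b and S_w are the between-class and within-class scatter matrices. The helper names scatter_matrices and fld_fit and the synthetic three-class data are hypothetical, and the within-class scatter matrix is assumed to be nonsingular.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return the within-class (Sw) and between-class (Sb) scatter matrices."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    return Sw, Sb

def fld_fit(X, y, n_components):
    """Return a (n_features, n_components) matrix of Fisher discriminant vectors."""
    Sw, Sb = scatter_matrices(X, y)
    # Whiten with the Cholesky factor of Sw so the eigenproblem becomes symmetric.
    L = np.linalg.cholesky(Sw)            # Sw = L @ L.T; fails if Sw is singular
    Linv = np.linalg.inv(L)
    M = Linv @ Sb @ Linv.T
    evals, evecs = np.linalg.eigh(M)      # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]
    # Map the eigenvectors back to the input space.
    return Linv.T @ evecs[:, order]

# Three synthetic classes in a 4-dimensional input space (demo assumption).
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0], [3, 0, 0, 0], [0, 3, 0, 0]], dtype=float)
X = np.vstack([rng.normal(c, 1.0, size=(50, 4)) for c in centers])
y = np.repeat(np.arange(3), 50)

W = fld_fit(X, y, n_components=2)   # transformation matrix
Z = X @ W                           # extracted low-dimensional features
print(W.shape, Z.shape)
```

With c classes, the between-class scatter matrix has rank at most c - 1, so at most c - 1 discriminant vectors carry discriminant information; here three classes give a two-dimensional feature space.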
In general, Fisher discriminant vectors are not necessarily orthogonal to each other. Many researchers believed that the discriminant capability of FLD could be enhanced by removing the linear dependence among Fisher discriminant vectors. Based on this intuition, the Foley-Sammon discriminant (FSD), a feature extraction method that uses a set of orthogonal discriminant vectors, was subsequently developed (Sammon, 1970; Foley & Sammon, 1975; Duchene & Leclercq, 1988).
LDA for Solving the Small Sample Size Problems
The Small Sample Size Pattern Recognition Problems
Since FDC, MSE, FLD, and FSD all involve computing the inverse of one or several scatter matrices of the sample data, these matrices must be nonsingular. In small sample size (SSS) pattern recognition problems such as appearance-based face recognition, the dimensionality of the input space is so large relative to the number of samples that the matrices involved are all singular. As a result, standard LDA methods cannot be applied directly to these SSS problems. Because of its prospective applications in biometric identification and computer vision, LDA for solving SSS problems has become one of the most active research topics in pattern recognition.
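The singularity can be checked numerically. The following sketch uses made-up sizes (20 random samples from 5 classes in a 100-dimensional space, standing in for vectorized face images) and computes the rank of the within-class scatter matrix. With N samples and c classes, the rank is at most N - c, which is far below the dimensionality here, so the matrix cannot be inverted.

```python
import numpy as np

# Hypothetical SSS setting: far fewer samples than input dimensions.
rng = np.random.default_rng(0)
n_classes, per_class, dim = 5, 4, 100
n_samples = n_classes * per_class
X = rng.normal(size=(n_samples, dim))
y = np.repeat(np.arange(n_classes), per_class)

# Within-class scatter matrix, accumulated class by class.
Sw = np.zeros((dim, dim))
for c in range(n_classes):
    Xc = X[y == c]
    centered = Xc - Xc.mean(axis=0)
    Sw += centered.T @ centered

rank = np.linalg.matrix_rank(Sw)
print("dimension:", dim)
print("rank of Sw:", rank, "(upper bound N - c =", n_samples - n_classes, ")")
print("Sw is singular:", rank < dim)
```

Centering each class removes one degree of freedom per class, which is why the bound is N - c rather than N.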