Linear Discriminant Function and Linear Classifier
Let $\{x_i\}_{i=1}^{N}$ be a set of training samples from two classes $\omega_1$ and $\omega_2$, with $N_j$ samples from $\omega_j$ ($j = 1, 2$), and let $y_i \in \{1, -1\}$ be their corresponding class labels. Here $y_i = 1$ means that $x_i$ belongs to $\omega_1$, whereas $y_i = -1$ means that $x_i$ belongs to $\omega_2$. A linear discriminant function is a linear combination of the components of a feature vector $x \in \mathbb{R}^d$, which can be written as

$$g(x) = w^{T} x + b, \tag{1}$$

where the vector $w \in \mathbb{R}^d$ and the scalar $b$ are called the weight and the bias, respectively. The hyperplane $g(x) = 0$ is a decision surface used to separate samples with positive class labels from samples with negative ones.
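As a small illustration of the linear discriminant function, the sketch below evaluates a weighted sum of the feature components plus a bias and shows that its sign tells on which side of the decision hyperplane a point lies. The helper name `discriminant` and the 2-D weight, bias, and test points are hypothetical values chosen only for this example.

```python
from typing import Sequence


def discriminant(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Evaluate the linear discriminant function g(x) = w^T x + b."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same dimension")
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Hypothetical 2-D example: the hyperplane x1 + x2 - 1 = 0.
w = [1.0, 1.0]
b = -1.0

g_pos = discriminant(w, b, [2.0, 2.0])  # point on the positive side
g_neg = discriminant(w, b, [0.0, 0.0])  # point on the negative side
g_on = discriminant(w, b, [0.5, 0.5])   # point lying on the hyperplane
```

Only the sign of the value matters for separation; its magnitude is proportional to the distance from the hyperplane.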
A linear discriminant criterion is an optimization model used to find the weight $w$ of a linear discriminant function. The chief goal of classification-oriented LDA is to set up an appropriate linear discriminant criterion and to calculate the optimal projection direction, i.e. the weight. Here “optimal” means that after the samples are projected onto the weight, the resulting projections of samples from the two distinct classes $\omega_1$ and $\omega_2$ are fully separated.
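Once the weight $w$ has been derived from a certain linear discriminant criterion, the corresponding bias $b$ can be computed using

$$b = -w^{T} m, \tag{2}$$

or

$$b = -\frac{1}{2} w^{T} (m_1 + m_2), \tag{3}$$

where $m$ and $m_j$ are, respectively, the mean of all training samples and the mean of the training samples from the class $\omega_j$. They are defined as

$$m = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{4}$$

and

$$m_j = \frac{1}{N_j} \sum_{x_i \in \omega_j} x_i, \quad j = 1, 2. \tag{5}$$

For simplicity, we calculate the bias using Eq. (2) throughout this chapter.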
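The two ways of computing the bias can be compared numerically. The following is a minimal sketch, assuming that Eq. (2) places the hyperplane through the mean of all training samples (bias equal to minus the projection of that mean) and that Eq. (3) places it through the midpoint of the two class means. The function names and the toy data are hypothetical and serve only to show that the two choices differ when the classes have unequal sizes.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def mean_vector(samples):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(samples)
    d = len(samples[0])
    return [sum(s[j] for s in samples) / n for j in range(d)]


def bias_overall_mean(w, class1, class2):
    """Assumed form of Eq. (2): b = -w^T m, with m the mean of all samples."""
    return -dot(w, mean_vector(class1 + class2))


def bias_midpoint(w, class1, class2):
    """Assumed form of Eq. (3): b = -w^T (m1 + m2) / 2."""
    m1 = mean_vector(class1)
    m2 = mean_vector(class2)
    return -0.5 * dot(w, [a + c for a, c in zip(m1, m2)])


# Hypothetical unbalanced data: three samples in class 1, one in class 2.
class1 = [[2.0, 2.0], [4.0, 2.0], [3.0, 5.0]]
class2 = [[0.0, 0.0]]
w = [1.0, 1.0]  # weight assumed to be already given by some criterion

b_overall = bias_overall_mean(w, class1, class2)
b_mid = bias_midpoint(w, class1, class2)
```

With balanced classes the overall mean coincides with the midpoint of the class means, so both functions return the same bias; with the unbalanced data above, the overall-mean bias shifts the hyperplane toward the larger class.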
Let $\tilde{m}_j = w^{T} m_j$ denote the mean of the projected training samples from the class $\omega_j$. Thus, the binary linear classifier based on the weight $w$ and the bias $b$ is defined as follows:

$$f(x) = \operatorname{sgn}\big((\tilde{m}_1 - \tilde{m}_2)(w^{T} x + b)\big), \tag{6}$$

which assigns a class label $f(x)$ to an unknown sample $x$. Here, $\operatorname{sgn}(\cdot)$ is the sign function. That is, once the weight in a linear discriminant function has been worked out, the corresponding binary linear classifier is fixed.
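The classification rule can be sketched in a few lines. The code below is an illustration under a stated assumption, not the chapter's definitive formula: it assumes the classifier takes the sign of the discriminant value multiplied by the difference of the projected class means, so that the predicted label does not depend on which of the two opposite directions a criterion returns as the weight. All names and numbers are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def sign(value):
    """Sign function: 1 for positive, -1 for negative, 0 for zero."""
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0


def classify(w, b, m1, m2, x):
    """Assumed rule: sgn((w^T m1 - w^T m2) * (w^T x + b))."""
    orientation = dot(w, m1) - dot(w, m2)  # difference of projected class means
    return sign(orientation * (dot(w, x) + b))


# Hypothetical class means; with equal class sizes the overall mean is [1.5, 1.5].
m1 = [3.0, 3.0]
m2 = [0.0, 0.0]

# The same direction with both orientations; the bias is minus the
# projection of the overall mean in each case.
w_a, b_a = [1.0, 1.0], -3.0
w_b, b_b = [-1.0, -1.0], 3.0

label_near_class1_a = classify(w_a, b_a, m1, m2, [4.0, 4.0])
label_near_class1_b = classify(w_b, b_b, m1, m2, [4.0, 4.0])
label_near_class2_a = classify(w_a, b_a, m1, m2, [0.0, 1.0])
label_near_class2_b = classify(w_b, b_b, m1, m2, [0.0, 1.0])
```

Both orientations of the weight give label 1 to the sample near the first class mean and label -1 to the sample near the second, which is the sense in which the classifier is fixed once the weight is known.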