Linear Discriminant Function and Linear Classifier
Let $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^d$ be a set of training samples from two classes $\omega_1$ and $\omega_2$, with $n_i$ samples from $\omega_i$ ($i = 1, 2$), and let $y_1, \ldots, y_n \in \{+1, -1\}$ be their corresponding class labels. Here $y_j = +1$ means that $x_j$ belongs to $\omega_1$, whereas $y_j = -1$ means that $x_j$ belongs to $\omega_2$. A linear discriminant function is a linear combination of the components of a feature vector $x$, which can be written as:
$$f(x) = w^T x + b, \quad (1)$$
where the vector $w \in \mathbb{R}^d$ and the scalar $b$ are called the weight and the bias, respectively. The hyperplane $w^T x + b = 0$ is a decision surface which is used to separate samples with positive class labels from samples with negative ones.
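As a quick illustration of Eq. (1), the sketch below evaluates $f(x) = w^T x + b$ for an arbitrary weight, bias, and sample; all of the numeric values are invented for the example, and the sign of $f(x)$ indicates on which side of the decision surface the sample lies.

```python
import numpy as np

# Illustrative values only; none of these come from the chapter.
w = np.array([2.0, -1.0])   # weight vector w
b = 0.5                     # bias b
x = np.array([1.0, 1.0])    # a feature vector x

f = w @ x + b               # Eq. (1): f(x) = w^T x + b = 2 - 1 + 0.5
print(f)                    # 1.5 > 0, so x lies on the positive side
                            # of the decision surface w^T x + b = 0
```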
A linear discriminant criterion is an optimization model used to seek the weight of a linear discriminant function. The chief goal of classification-oriented LDA is to set up an appropriate linear discriminant criterion and to calculate the optimal projection direction, i.e., the weight. Here "optimal" means that after the samples are projected onto the weight, the resulting projections of samples from the two distinct classes $\omega_1$ and $\omega_2$ are separated as well as possible.
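As one concrete, and merely illustrative, instance of a linear discriminant criterion, the sketch below computes the classical Fisher direction $w \propto S_w^{-1}(m_1 - m_2)$, where $S_w$ is the within-class scatter matrix. This is not necessarily the criterion developed later in this chapter; the function name and row-wise array layout are our own assumptions.

```python
import numpy as np

def fisher_weight(X1, X2):
    """Weight from the classical Fisher criterion, w ∝ Sw^{-1}(m1 - m2).

    X1, X2 hold the training samples of classes ω1 and ω2, one sample
    per row (shapes (n1, d) and (n2, d)).
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter: sum of the two class scatter matrices.
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    # Solve Sw w = m1 - m2 rather than forming an explicit inverse.
    return np.linalg.solve(Sw, m1 - m2)
```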
Once the weight $w$ has been derived from a certain linear discriminant criterion, the corresponding bias $b$ can be computed using:
$$b = -w^T m, \quad (2)$$
or
$$b = -\frac{1}{2} w^T (m_1 + m_2), \quad (3)$$
where $m$ and $m_i$ are respectively the mean of all training samples and the mean of the training samples from class $\omega_i$. They are defined as
$$m = \frac{1}{n} \sum_{j=1}^{n} x_j, \quad (4)$$
and
$$m_i = \frac{1}{n_i} \sum_{x_j \in \omega_i} x_j. \quad (5)$$
For simplicity, we calculate the bias using Eq. (2) throughout this chapter.
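Both bias choices reduce to one line once the means are at hand. The sketch below assumes the samples are stored row-wise in NumPy arrays (X for all training samples, X1 and X2 per class); the function names are ours.

```python
import numpy as np

def bias_global_mean(w, X):
    """Eq. (2): b = -w^T m, with m the mean of all training samples."""
    return -w @ X.mean(axis=0)

def bias_class_means(w, X1, X2):
    """Eq. (3): b = -(1/2) w^T (m1 + m2)."""
    return -0.5 * w @ (X1.mean(axis=0) + X2.mean(axis=0))
```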
Let $\tilde{m}_1 = w^T m_1$ denote the mean of the projected training samples from the class $\omega_1$. Thus, the binary linear classifier based on the weight $w$ and the bias $b$ is defined as follows:
$$y = \operatorname{sign}(\tilde{m}_1 + b) \cdot \operatorname{sign}(w^T x + b), \quad (6)$$
which assigns a class label $y \in \{+1, -1\}$ to an unknown sample $x$. Here, $\operatorname{sign}(\cdot)$ is the sign function, and the factor $\operatorname{sign}(\tilde{m}_1 + b)$ orients the rule so that samples falling on the same side of the hyperplane as the mean of $\omega_1$ receive the label $+1$. That is, once the weight in a linear discriminant function has been worked out, the corresponding binary linear classifier is fixed.
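Putting the pieces together, the end-to-end sketch below draws toy two-class data, computes a weight (reusing the illustrative Fisher direction from the earlier sketch), the bias of Eq. (2), and the classifier of Eq. (6) as reconstructed here. The data, seed, and names are all invented for the example.

```python
import numpy as np

# Toy two-class data; purely illustrative.
rng = np.random.default_rng(0)
X1 = rng.normal(loc=+1.0, size=(50, 2))   # class ω1, labels +1
X2 = rng.normal(loc=-1.0, size=(50, 2))   # class ω2, labels -1

# Weight from the illustrative Fisher criterion w ∝ Sw^{-1}(m1 - m2).
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
w = np.linalg.solve(Sw, m1 - m2)

# Bias from Eq. (2): b = -w^T m, with m the mean of all samples.
b = -w @ np.vstack([X1, X2]).mean(axis=0)

# Classifier of Eq. (6): y = sign(w^T m1 + b) * sign(w^T x + b).
orientation = np.sign(w @ m1 + b)
def predict(x):
    return orientation * np.sign(w @ x + b)

print(predict(np.array([0.8, 1.2])))   # expected +1.0: this point
                                       # lies on the ω1 side
```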