Data classification is a supervised learning strategy that organizes and categorizes data into distinct classes. Classification methods generally use a training set in which every object is already associated with a known class label. A data classification algorithm works on this set, using the input attributes to build a model that classifies new objects. In other words, the algorithm predicts the value of the output attribute, which is categorical (Roiger & Geatz, 2003). Data classification is an important problem with applications in a diverse set of areas, including finance, health care, sports, engineering, science, and bioinformatics (Chen, Han, & Yu, 1996; Edelstein, 2003; Jagota, 2000). The majority of data classification methods are developed for classifying data into two groups. Because multi-group data classification problems are very common but not widely studied, we focus on developing a new multi-group data classification approach based on mixed-integer linear programming.
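To make the train-then-predict workflow concrete, the following minimal sketch fits a nearest-centroid classifier to a small labeled training set and predicts a categorical label for new objects. It is only an illustration: the nearest-centroid rule, the function names `fit_centroids` and `predict`, and the toy data are assumptions made here, not the multi-group approach developed in this work.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples, labels):
    """Build a model: one mean vector (centroid) per class label."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, x):
    """Return the categorical label whose centroid is closest to x."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


# Hypothetical training set: two input attributes, three known class labels.
train_x = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8), (9.0, 1.0), (8.8, 1.2)]
train_y = ["A", "A", "B", "B", "C", "C"]

model = fit_centroids(train_x, train_y)
print(predict(model, (1.1, 0.9)))  # object near the class "A" training points
print(predict(model, (8.5, 1.5)))  # object near the class "C" training points
```

The model here is just the dictionary of class centroids; any other classifier follows the same pattern of building a model from labeled objects and then assigning a class to unlabeled ones.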
A broad range of methods exists for the data classification problem, including Decision Tree Induction, Bayesian Classifiers, Neural Networks (NN), Support Vector Machines (SVM) and Mathematical Programming (MP) (Roiger & Geatz, 2003; Jagota, 2000; Adem & Gochet, 2006). A critical review of some of these methods is provided in this section. A major shortcoming of the neural network approach is the lack of explanation of the constructed model. Other important handicaps of neural network-based methods are the possibility of a non-convergent solution due to a poor choice of initial weights and the possibility of a non-optimal solution due to the local minima problem (Roiger & Geatz, 2003). In recent years, SVM has been considered one of the most efficient methods for two-group classification problems (Cortes & Vapnik, 1995; Vapnik, 1998). The SVM method has two important drawbacks in multi-group classification problems: a combination of SVMs has to be used to solve a multi-group problem, and approximation algorithms are needed to reduce the computational time of SVM training on large-scale data.
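The need to combine several two-group classifiers can be illustrated by counting the binary problems that a multi-group data set produces. The sketch below uses two common decomposition schemes, one-versus-rest and one-versus-one; the text above does not name a particular scheme, so both schemes, the function names, and the toy labels are assumptions made for illustration.

```python
from itertools import combinations


def one_vs_rest_tasks(labels):
    """One binary relabeling per class: +1 for that class, -1 for all others."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def one_vs_one_tasks(labels):
    """One binary task per pair of classes, restricted to objects of those two classes."""
    classes = sorted(set(labels))
    tasks = {}
    for a, b in combinations(classes, 2):
        idx = [i for i, y in enumerate(labels) if y in (a, b)]
        tasks[(a, b)] = (idx, [1 if labels[i] == a else -1 for i in idx])
    return tasks


# Hypothetical labels for six objects drawn from four groups.
labels = ["A", "A", "B", "C", "C", "D"]
ovr = one_vs_rest_tasks(labels)
ovo = one_vs_one_tasks(labels)
print(len(ovr), len(ovo))  # 4 one-versus-rest tasks, 6 pairwise tasks
```

With K groups, one-versus-rest requires K two-group classifiers and one-versus-one requires K(K-1)/2, each of which must be trained separately before their outputs are combined.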
There have been numerous attempts to solve classification problems using mathematical programming (Joachimsthaler & Stam, 1990). The mathematical programming approach to data classification was first introduced in the early 1980s. Since then, numerous mathematical programming models with different objective functions have appeared in the literature (Erenguc & Koehler, 1990). Most of these methods model data classification as a linear programming (LP) problem that optimizes a distance function. In addition to LP problems, mixed-integer linear programming (MILP) problems that minimize the number of misclassifications on the design data set have also been widely studied (Bajgier & Hill, 1982; Gehrlein, 1986; Littschwager, 1978; Stam & Joachimsthaler, 1990). Since MILP methods suffer from computational difficulties, these efforts have mainly focused on efficient solutions for two-group supervised classification problems. Although it is possible to solve a multi-group data classification problem by solving several two-group problems, such approaches also have drawbacks, including high computational complexity that results in long computational times (Tax & Duin, 2002).
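As a rough illustration of what minimizing misclassifications looks like as an MILP in the two-group case, one generic formulation can be sketched as follows. It is not the model of any specific work cited above. The notation is assumed here: $x_i$ is the attribute vector of object $i$, $G_1$ and $G_2$ are the index sets of the two groups in the design data set, $w$ and $b$ define a separating hyperplane, $z_i$ is a binary variable that equals 1 when object $i$ is allowed to be misclassified, and $M$ is a sufficiently large constant.

```latex
\begin{align*}
\min_{w,\,b,\,z} \quad & \sum_{i \in G_1 \cup G_2} z_i \\
\text{s.t.} \quad & w^{\top} x_i + b \ge 1 - M z_i, && i \in G_1, \\
& w^{\top} x_i + b \le -1 + M z_i, && i \in G_2, \\
& z_i \in \{0, 1\}, && i \in G_1 \cup G_2 .
\end{align*}
```

When $z_i = 0$ the corresponding constraint forces object $i$ onto the correct side of the hyperplane, and when $z_i = 1$ the constraint is relaxed, so the objective counts the misclassified design objects. The one binary variable per object is what makes such models computationally demanding as the data set grows.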