Decision Rule Extraction for Regularized Multiple Criteria Linear Programming Model

DongHong Sun (Tsinghua University, China), Li Liu (University of Technology, Sydney, Australia), Peng Zhang (Chinese Academy of Sciences, China), Xingquan Zhu (University of Technology, Sydney, Australia) and Yong Shi (Chinese Academy of Sciences, China and University of Nebraska at Omaha, USA)
Copyright: © 2011 | Pages: 14
DOI: 10.4018/jdwm.2011070104

Abstract

Due to the flexibility of multi-criteria optimization, Regularized Multiple Criteria Linear Programming (RMCLP) has received attention in decision support systems. Numerous theoretical and empirical studies have demonstrated that RMCLP is effective and efficient in classifying large scale data sets. However, a possible limitation of RMCLP is poor interpretability and low comprehensibility for end users and experts. This deficiency has limited RMCLP’s use in many real-world applications where both accuracy and transparency of decision making are required, such as in Customer Relationship Management (CRM) and Credit Card Portfolio Management. In this paper, the authors present a clustering based rule extraction method to extract explainable and understandable rules from the RMCLP model. Experiments on both synthetic and real world data sets demonstrate that this rule extraction method can effectively extract explicit decision rules from RMCLP with only a small compromise in performance.

1. Introduction

With the development of large storage equipment and high performance computing technology, we are now able to collect large volumes of data from different sources. Discovering hidden patterns and useful knowledge in such large volumes of data to support decision making has become a pressing task for modern intelligent systems.

To meet this requirement, a new discipline called Data Mining (Olson & Shi, 2007; Peng, Kou, Shi, & Chen, 2008) has emerged, in which a number of learning methods are proposed to extract knowledge from large scale databases. Depending on the data characteristics and mining objectives, existing data mining models can be categorized into three types: association rule mining, clustering unlabeled data, and generating prediction models from labeled data (Zhang, Zhu, & Shi, 2008; Zhu, Zhang, Lin, & Shi, 2010; Qin, Zhang, & Zhang, 2010).

In the domain of classification, many effective models have been proposed in recent years, such as the decision tree model (Breiman, Friedman, Olshen, & Stone, 1984), Artificial Neural Networks (ANN) (Aleksander & Morton, 1990), and Support Vector Machines (SVMs) (Vapnik, 1998). Depending on whether their decision logic is exposed to the user, these models can be further categorized into two types: transparent models and non-transparent models.

Transparent models derive explicit decision logic (such as decision rules or decision trees) from training samples, so that predictions are highly understandable for end users. Transparent decision making is, in fact, required in many business-related applications and medical systems in which decisions must be understandable by domain experts.

Non-transparent models such as SVMs and ANNs, on the other hand, behave like black boxes whose decision logic is not interpretable by human experts. Although users are able to obtain a prediction for a given example, they are nevertheless unable to know the logic or the reasons why such a prediction is made. Compared to transparent models, which are widely used in human society, non-transparent black-box models are often used in machine society, where explanation of the mining results is less important.

A family of Multiple Criteria Mathematical Programming (MCMP) based classification models (Shi, Liu, Yan, & Chen, 2008c) has recently been proposed for data classification.

Shi (2001), Shi, Wise, Lou, and Lin (2001), Shi, Peng, Xu, and Tang (2002), and Kou, Liu, and Peng (2003) proposed the use of Multiple Criteria Linear Programming (MCLP) for credit card fraud detection. Based on the MCLP model, He, Liu, and Shi (2004) and He, Shi, and Xu (2004) further proposed a Fuzzy Multiple Criteria Linear Programming (FMCLP) model and a Multiple Criteria Nonlinear Programming (MCNP) model for credit card analysis.

Kou, Peng, Shi, and Chen (2006c) proposed a Multiple Criteria Quadratic Programming (MCQP) model by replacing the linear objective functions of MCLP with quadratic ones. Kou, Peng, Shi, and Chen (2006a, 2006b) also proposed a multiple-group MCLP model to handle classification problems with more than two groups.

Following promising results from MCQP, Kou, Peng, Chen, and Shi (2009) proposed a kernel-based MCQP method that extends MCQP to nonlinear classification problems. In a similar vein, Zhang and Tian (2007) proposed a kernelized MCLP by adapting the inner product form of SVM to MCLP, a popular way of extending a linear classifier to a non-linear one.
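The inner product form mentioned above can be illustrated with a minimal sketch. It is not the formulation of Kou et al. or of Zhang and Tian; it only shows the general idea that a linear score written as a weighted sum of inner products with training points can be made non-linear by swapping the inner product for a kernel. The coefficients `alphas`, the `bias`, the toy points, and all function names are hypothetical values chosen for illustration, not the solution of any optimization problem.

```python
import math

def linear_kernel(u, v):
    # Plain inner product <u, v>.
    return sum(a * b for a, b in zip(u, v))

def rbf_kernel(u, v, gamma=1.0):
    # Gaussian (RBF) kernel: exp(-gamma * ||u - v||^2).
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def decision_value(x, points, labels, alphas, bias, kernel):
    # Inner product form: f(x) = sum_i alpha_i * y_i * k(x_i, x) + b.
    return sum(a * y * kernel(p, x)
               for a, y, p in zip(alphas, labels, points)) + bias

def explicit_weights(points, labels, alphas):
    # Explicit weight vector w = sum_i alpha_i * y_i * x_i (linear case only).
    dim = len(points[0])
    return [sum(a * y * p[j] for a, y, p in zip(alphas, labels, points))
            for j in range(dim)]

# Hypothetical toy data and coefficients (assumed, not learned).
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
labels = [1, -1, 1]
alphas = [0.5, 1.0, 0.5]
bias = 0.1
x = (0.5, 1.5)

# With the linear kernel, the inner product form matches w . x + b.
w = explicit_weights(points, labels, alphas)
primal = linear_kernel(w, x) + bias
dual = decision_value(x, points, labels, alphas, bias, linear_kernel)

# Swapping in the RBF kernel gives a non-linear score with the same code.
nonlinear = decision_value(x, points, labels, alphas, bias, rbf_kernel)

print(primal, dual, nonlinear)
```

With the linear kernel the two scores agree, which is why the substitution is valid; with the RBF kernel no explicit weight vector is formed, and the classifier depends on the data only through kernel evaluations.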

Zhang, Zhu, Zhang, and Shi (2010) and Zhang, Zhang, and Shi (2007) proposed an MQLC model for VIP e-mail analysis. Based on rough set theory, Zhang, Shi, Zhang, and Gao (2008) proposed a rough set-based MCLP model and reported its efficiency on several UCI benchmark datasets.
