
M. A.H. Farquad (Institute for Development and Research in Banking Technology (IDRBT) and University of Hyderabad, India), V. Ravi (Institute for Development and Research in Banking Technology (IDRBT), India) and Raju S. Bapi (University of Hyderabad, India)

Source Title: Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods, and Techniques

Copyright: © 2010
Pages: 23
DOI: 10.4018/978-1-60566-766-9.ch019

Chapter Preview

Support Vector Machines (SVMs) (Vapnik 1995, 1998) and other linear classifiers are popular methods for building hyperplane-based classifiers from data sets and have been shown to have excellent generalization performance in a variety of applications. The SVM is a learning algorithm based on the statistical learning theory developed by Vapnik (1995) and his team at AT&T Bell Labs, and it can be seen as an alternative training technique for Polynomial, Radial Basis Function and Multi-Layer Perceptron classifiers (Cortes & Vapnik 1995, Edgar et al. 1997). The scheme used by the SVM is that a linear weighted sum of the explanatory variables is compared with a pre-specified threshold: a sum below (or above) the threshold indicates that the given sample is classified into one class or the other. Even though such a scheme works well, it is completely non-intuitive to human experts in that it does not reveal the knowledge learnt during training in a simple, comprehensible and transparent way. Therefore, SVMs are also treated as “Black Box” models, just like artificial neural networks (ANNs).
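As a minimal sketch of the decision scheme just described, the snippet below compares a weighted sum of the explanatory variables with a threshold. The weights, bias and sample values are hypothetical and chosen only for illustration; in a real SVM they would come from training.

```python
def linear_classify(weights, bias, x):
    """Return +1 if the weighted sum w.x + bias is non-negative, else -1."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Hypothetical weights and bias for two explanatory variables
# (assumed values, not taken from the chapter).
weights = [0.8, -0.5]
bias = -0.1

print(linear_classify(weights, bias, [1.0, 0.2]))  # weighted sum is positive
print(linear_classify(weights, bias, [0.1, 1.0]))  # weighted sum is negative
```

The output is only a class label; nothing in it explains which variables drove the decision, which is the opacity that rule extraction tries to address.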

Many techniques exist for extracting the knowledge embedded in trained neural networks in the form of *if-then-else* rules (Tickle et al. 1998). The process of converting opaque models into transparent models is often called *Rule Extraction*. The resulting transparent models are useful for understanding the nature of the problem and interpreting its solution. Using the extracted rules, one can better understand how a prediction is made. Gallant (1988) initiated work on rule extraction from neural networks, expressing the knowledge learnt in the form of *if-then* rules.

The advantages of rule extraction algorithms are:

• Provision of user explanation capability (Gallant 1988). Davis et al. (1977) argue that even limited explanation can positively influence the acceptance of the system by the user.

• Data exploration and the induction of scientific theories. A learning system might discover salient features in the input data whose importance was not previously recognized (Craven & Shavlik 1994).

• Improved generalization.

A taxonomy of the techniques used to extract symbolic rules from neural networks was proposed by Andrews et al. (1995). In general, rule extraction techniques are divided into two major groups: decompositional and pedagogical. Decompositional techniques view the model at its finest level of granularity (at the level of hidden and output units in the case of an ANN). Rules are first extracted at the level of individual units, and these subsets of rules are then aggregated to form a global relationship. Pedagogical techniques extract the global relationship between the input and the output directly, without analyzing the detailed characteristics of the underlying solution. A third group of rule extraction techniques, eclectic, combines the advantages of the decompositional and pedagogical approaches.
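The pedagogical idea can be illustrated with a small sketch: the trained model is queried only through its predictions, and a single threshold rule is fitted to those predictions. The `black_box` function, its coefficients and the grid of samples are hypothetical stand-ins, and the one-rule search is a deliberately simple substitute for the rule learners used in practice; it is not the method of any work cited here.

```python
def black_box(x):
    """Stand-in for an opaque trained model (hypothetical linear scorer)."""
    return 1 if x[0] + 0.5 * x[1] - 1.0 >= 0 else -1


def extract_stump(model, samples):
    """Fit one rule 'IF x[j] >= t THEN +1 ELSE -1' to the model's own
    predictions, without ever inspecting the model's internals."""
    labels = [model(x) for x in samples]
    best = None  # (agreement count, feature index, threshold)
    for j in range(len(samples[0])):
        for t in sorted({x[j] for x in samples}):
            agree = sum((1 if x[j] >= t else -1) == y
                        for x, y in zip(samples, labels))
            if best is None or agree > best[0]:
                best = (agree, j, t)
    return best


# Hypothetical query points on a 5 x 5 grid over [0, 1] x [0, 1].
samples = [[a / 4, b / 4] for a in range(5) for b in range(5)]
agree, feature, threshold = extract_stump(black_box, samples)
print(f"IF x[{feature}] >= {threshold} THEN class +1 ELSE class -1 "
      f"(agrees with the model on {agree}/{len(samples)} samples)")
# prints: IF x[0] >= 0.75 THEN class +1 ELSE class -1 (agrees with the model on 22/25 samples)
```

A decompositional method would instead examine the model's internal components (for example, individual hidden units) and combine rules derived from each of them.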

Earlier work (Andrews et al. 1995, Towell & Shavlik 1993) and more recent work (Kurfess 2000, Darbari 2001) address rule extraction from neural networks. Hayashi (1990) incorporated fuzzy sets with expert systems and proposed the rule extraction algorithm FNES (Fuzzy Neural Expert System). FNES relies on the involvement of an expert at the input phase to transform the input data into the required format. This transformation process was later automated in Fuzzy-MLP (Mitra 1994). Mitra & Hayashi (2000) subsequently presented a survey of fuzzy rule extraction from neural networks. However, not much work has been done on extracting rules from SVMs.

Fidelity: A rule set is considered to display a high level of fidelity if it can mimic the behavior of the machine learning technique from which it was extracted.
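One common way to quantify this, shown here as an assumption rather than as the chapter's own definition, is the fraction of samples on which the extracted rules and the original model predict the same class. The prediction lists below are hypothetical.

```python
def fidelity(rule_predictions, model_predictions):
    """Fraction of samples where the rule set agrees with the model."""
    if len(rule_predictions) != len(model_predictions):
        raise ValueError("prediction lists must have the same length")
    if not rule_predictions:
        raise ValueError("at least one prediction is required")
    matches = sum(r == m for r, m in zip(rule_predictions, model_predictions))
    return matches / len(rule_predictions)


# Hypothetical predictions on five samples.
rules_say = [1, 1, -1, -1, 1]
model_says = [1, -1, -1, -1, 1]
print(fidelity(rules_say, model_says))  # 4 of 5 agree -> 0.8
```

Note that fidelity compares the rules with the model, not with the true labels, so a rule set can have high fidelity and still inherit the model's mistakes.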

Fuzzy Rule Based Systems: Fuzzy rules are linguistic IF-THEN constructions that have the general form “IF A THEN B”, where A and B are (collections of) propositions containing linguistic variables. A is called the premise and B the consequence of the rule.
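A minimal sketch of how the premise of such a rule can be evaluated follows. It assumes triangular membership functions and the minimum operator for AND, which are common choices but are not stated in the source; the linguistic terms and numeric ranges are hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak.
    Assumes left < peak < right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def rule_strength(memberships):
    """Firing strength of 'IF A1 AND A2 ... THEN B', using min as AND."""
    return min(memberships)


# Hypothetical rule: IF income is low AND debt is high THEN risk is high.
income_is_low = triangular(25.0, 0.0, 20.0, 40.0)     # (40 - 25) / 20 = 0.75
debt_is_high = triangular(70.0, 50.0, 100.0, 150.0)   # (70 - 50) / 50 = 0.4
print(rule_strength([income_is_low, debt_is_high]))   # 0.4
```

The firing strength expresses the degree to which the premise A holds for a given input, which in turn scales the contribution of the consequence B.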

Radial Basis Function Network (RBF): RBF is a feed-forward neural network and has both unsupervised and supervised phases. In the unsupervised phase input data are clustered and cluster details are sent to hidden neurons, where radial basis functions of the inputs are computed by making use of the center and the standard deviation of the clusters.
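The hidden-unit computation can be sketched as follows, assuming a Gaussian radial basis function, which is a common choice but is not specified in the definition above. The center and standard deviation would normally come from the clustering phase; the values used here are hypothetical.

```python
import math


def gaussian_rbf(x, center, sigma):
    """Gaussian RBF activation: exp(-||x - center||^2 / (2 * sigma^2))."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical cluster center and standard deviation.
center = [1.0, 2.0]
sigma = 0.5

print(gaussian_rbf([1.0, 2.0], center, sigma))  # input at the center -> 1.0
print(gaussian_rbf([1.5, 2.0], center, sigma))  # one sigma away -> exp(-0.5)
```

The activation is largest for inputs at the cluster center and decays with distance, at a rate set by the cluster's standard deviation.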

Support Vector Machine (SVM): The SVM is a powerful learning algorithm based on recent advances in statistical learning theory. SVMs are learning systems that use a hypothesis space of linear functions in a high dimensional space trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory.

Decision Tree (DT): A “divide-and-conquer” approach to the problem of learning from a set of independent instances leads naturally to a style of representation called a decision tree.

Adaptive Network-based Fuzzy Inference Systems (ANFIS): Using a given input/output data set, the toolbox function ANFIS constructs a fuzzy inference system (FIS) whose membership function parameters are tuned (adjusted) using either a backpropagation algorithm alone or in combination with a least-squares type of method.

Rule Extraction: Rule extraction is the procedure of representing the knowledge learnt by the model during training in the form of IF-THEN rules.
