Correlation Analysis in Classifiers

Vincent Lemaire (France Télécom, France), Carine Hue (GFI Informatique, France) and Olivier Bernier (France Télécom, France)
DOI: 10.4018/978-1-60566-906-9.ch011

This chapter presents a new method for analyzing the link between the probabilities produced by a classification model and variations in its input values. The goal is to increase the predictive probability of a given class by exploring the possible values of each input variable, taken independently. The method is first presented in a general framework and then detailed for naive Bayesian classifiers. We also demonstrate the importance of “lever variables”: variables that can conceivably be acted upon to obtain specific results, as represented by class probabilities, and that can consequently be the target of specific policies. Applying the method to several data sets shows that this approach can yield useful indicators.
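As a rough illustration of the idea described above, the sketch below trains a small categorical naive Bayes model and then, for one example, tries every observed value of a single variable while holding the others fixed, ranking the values by the resulting probability of a target class. This is only a minimal sketch under stated assumptions, not the chapter's actual algorithm: the function names (`fit_naive_bayes`, `predict_proba`, `explore_variable`), the Laplace smoothing parameter `alpha`, and the toy data are all hypothetical.

```python
from collections import Counter

def fit_naive_bayes(rows, labels, alpha=1.0):
    """Estimate class priors and smoothed conditional tables P(x_j = v | c)."""
    classes = sorted(set(labels))
    n_vars = len(rows[0])
    # Observed values of each variable; these are the values we will explore.
    values = [sorted({r[j] for r in rows}) for j in range(n_vars)]
    class_count = Counter(labels)
    joint = Counter((j, r[j], y) for r, y in zip(rows, labels) for j in range(n_vars))
    prior = {c: class_count[c] / len(labels) for c in classes}
    cond = {
        (j, v, c): (joint[(j, v, c)] + alpha) / (class_count[c] + alpha * len(values[j]))
        for j in range(n_vars) for v in values[j] for c in classes
    }
    return {"classes": classes, "prior": prior, "cond": cond, "values": values}

def predict_proba(model, x):
    """Return P(c | x) for every class c under the naive independence assumption."""
    scores = {}
    for c in model["classes"]:
        s = model["prior"][c]
        for j, v in enumerate(x):
            s *= model["cond"][(j, v, c)]
        scores[c] = s
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

def explore_variable(model, x, j, target):
    """Try each observed value of variable j (others fixed); rank by P(target | x')."""
    results = []
    for v in model["values"][j]:
        x_mod = list(x)
        x_mod[j] = v
        results.append((v, predict_proba(model, x_mod)[target]))
    return sorted(results, key=lambda t: t[1], reverse=True)

# Hypothetical toy data: variable 0 is a plan type, variable 1 a contact action.
rows = [("basic", "none"), ("basic", "none"), ("basic", "call"),
        ("premium", "none"), ("premium", "call"), ("premium", "call")]
labels = ["churn", "churn", "stay", "churn", "stay", "stay"]

model = fit_naive_bayes(rows, labels)
x = ("basic", "none")
baseline = predict_proba(model, x)["stay"]
ranked = explore_variable(model, x, j=1, target="stay")
print(baseline, ranked)
```

In this toy setting, variable 1 plays the role of a lever variable: it is something one could plausibly act upon, and the ranking shows which of its values is associated with the highest probability of the target class for this particular example.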
Chapter Preview

Variable Importance

Whatever the method and the model, the goal is often to analyze the behavior of the model in the absence of one input variable, or of a set of input variables, and to deduce from this the importance of the input variables over all examples. A large survey can be found in (Guyon, 2005). Measuring the importance of the input variables allows a subset of relevant variables to be selected for a given problem. This selection increases the robustness of models and makes their results easier to understand. Because supervised learning methods come from both the statistical and the artificial intelligence communities, importance indicators are often specific to each model (linear regression, artificial neural networks, ...).
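One simple way to make "behavior in the absence of one input variable" concrete is sketched below for a categorical naive Bayes model, where leaving a variable out amounts to omitting its factor from the product. The indicator used here, the mean absolute change in the predicted probability of a target class over all examples, is an assumption chosen for illustration and is not taken from the chapter or from the cited survey; the names `fit`, `proba`, and `drop_importance` and the toy data are likewise hypothetical.

```python
from collections import Counter

def fit(rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes model with Laplace smoothing."""
    classes = sorted(set(labels))
    n_c = Counter(labels)
    n_vars = len(rows[0])
    n_values = [len({r[j] for r in rows}) for j in range(n_vars)]
    joint = Counter((j, r[j], y) for r, y in zip(rows, labels) for j in range(n_vars))
    prior = {c: n_c[c] / len(labels) for c in classes}

    def cond(j, v, c):
        # Smoothed estimate of P(x_j = v | c).
        return (joint[(j, v, c)] + alpha) / (n_c[c] + alpha * n_values[j])

    return classes, prior, cond

def proba(model, x, target, skip=None):
    """P(target | x); if skip is a variable index, that variable is left out."""
    classes, prior, cond = model
    scores = {}
    for c in classes:
        s = prior[c]
        for j, v in enumerate(x):
            if j != skip:
                s *= cond(j, v, c)
        scores[c] = s
    return scores[target] / sum(scores.values())

def drop_importance(model, rows, target):
    """Mean absolute change in P(target | x) when each variable is omitted."""
    n_vars = len(rows[0])
    return [
        sum(abs(proba(model, x, target) - proba(model, x, target, skip=j)) for x in rows)
        / len(rows)
        for j in range(n_vars)
    ]

# Hypothetical toy data: variable 0 separates the classes, variable 1 does not.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["pos", "pos", "neg", "neg"]

model = fit(rows, labels)
importance = drop_importance(model, rows, target="pos")
print(importance)
```

Variables whose removal barely moves the predictions receive a score near zero and are natural candidates to drop during variable selection; other models would need their own way of simulating a missing input.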
