Attribute Relevance Analysis

DOI: 10.4018/978-1-4666-6288-9.ch007

Abstract

This chapter explains how attributes in structured problems are quantified by relevance. The most common methods of relevance analysis, along with different strength and sensitivity measures, are explained so that practitioners can easily use them and start experimenting. Contrary to the belief that powerful hardware and sophisticated software can substitute for it, attribute relevance analysis is an important part of every analysis that operates with a target variable. Recognizing the most important variables, those with the greatest impact on the target variable, reduces redundancy and uncertainty at the model development stage and makes the model more robust and reliable. Attribute relevance analysis also evaluates attribute characteristics, which includes measuring the impact of attribute values on the target variable. This helps in understanding the relations and logic between the most important predictors and the target variable, and among the most important predictors from the perspective of the target variable. Making models relevant, and being able to prove that they are, is almost as important as the ability to build them. After completing this chapter, analysts (readers) are ready to start projects.

7.1 Introduction And Basic Concepts

A robust and stable predictive model incorporates only a few attributes, typically the 6 to 10 most predictive ones. The initial data sample, however, can contain hundreds of potential predictors. Some of them are original variables from databases, such as sociodemographic values assigned to each customer; others are behavioral characteristics defined by experts and extracted from existing transactional data.

Attribute relevance analysis has two important functions:

  • Recognition of the most important variables, those with the greatest impact on the target variable.

  • Understanding the relations and logic between the most important predictors and the target variable, and among the most important predictors from the perspective of the target variable.

Contrary to the belief that powerful hardware and sophisticated software can substitute for it, attribute relevance analysis is an important part of every analysis that operates with a target variable. Recognizing the most important variables, those with the greatest impact on the target variable, reduces redundancy and uncertainty at the model development stage and makes the model more robust and reliable. Besides measuring importance, attribute relevance analysis evaluates attribute characteristics, which includes measuring the impact of attribute values on the target variable. This helps in understanding the relations and logic between the most important predictors and the target variable, and among the most important predictors from the perspective of the target variable. After the attribute relevance analysis stage, the analyst has an initial picture of the churner profile and behavior. This stage often raises many additional questions about the revealed relations, and sometimes leads to the construction of new behavioral (derived) variables, which should also pass through the attribute relevance analysis process.
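As an illustration of measuring both attribute importance and the impact of individual attribute values, the following minimal sketch computes weight of evidence (WoE) per attribute value and information value (IV) per attribute against a binary churn flag. This excerpt does not name the measure the chapter uses, so IV is an assumption chosen here as one commonly used option. The function name `information_value`, the `smoothing` parameter, and the toy customer data are all hypothetical.

```python
import math
from collections import Counter


def information_value(values, target, smoothing=0.5):
    """Return (iv, woe_by_value) for one categorical attribute.

    values    -- attribute value of each customer
    target    -- 1 if the customer churned, 0 otherwise
    smoothing -- hypothetical additive constant that avoids log(0)
                 when a value occurs in only one target class
    """
    if len(values) != len(target):
        raise ValueError("values and target must have the same length")

    churned = Counter(v for v, t in zip(values, target) if t == 1)
    stayed = Counter(v for v, t in zip(values, target) if t == 0)
    categories = sorted(set(values))

    # Smoothed class totals, so the smoothed shares still sum to 1.
    churned_total = sum(churned.values()) + smoothing * len(categories)
    stayed_total = sum(stayed.values()) + smoothing * len(categories)

    iv = 0.0
    woe_by_value = {}
    for category in categories:
        share_churned = (churned[category] + smoothing) / churned_total
        share_stayed = (stayed[category] + smoothing) / stayed_total
        # WoE: impact of this attribute value on the target variable.
        woe = math.log(share_churned / share_stayed)
        woe_by_value[category] = woe
        # IV: overall strength of the attribute, summed over its values.
        iv += (share_churned - share_stayed) * woe
    return iv, woe_by_value


# Hypothetical toy sample: 8 customers, two candidate attributes.
churn = [1, 1, 1, 0, 0, 0, 0, 1]
attributes = {
    "contract_type": ["prepaid"] * 4 + ["postpaid"] * 4,
    "region": ["a", "b"] * 4,
}

# Rank candidate attributes by IV, strongest first.
ranking = sorted(
    ((name, information_value(vals, churn)[0]) for name, vals in attributes.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, iv in ranking:
    print(f"{name}: IV = {iv:.3f}")
```

In this toy sample `contract_type` ranks above `region`, because churners are concentrated in one contract type while both regions have the same churn share. The per-value WoE figures are what give the analyst a first picture of the churner profile: a positive WoE marks a value over-represented among churners under the sign convention assumed here.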

From the perspective of predictive churn modeling, there are two basic types of data sample for model development:

  • A data sample with a binomial target variable.

  • A data sample with a multinomial target variable.

7.2 Binomial Target Variable Versus Multinomial Target Variable

A data sample with a binomial target variable contains a target variable with two finite states. In churn modeling these states can be marked as “Yes” and “No”, indicating whether the customer committed churn within some period (the outcome period). They can also be coded numerically in the data sample, e.g. as “1” and “0”. This is the most common type of data sample in predictive churn modeling. It is mostly used for hard churn modeling, where an interrupted contract proves that the customer has churned. Where no contract exists, as in retail, churn criteria must be defined, as presented earlier, and a customer who meets the defined criteria is marked in the data sample with churn status = “Yes”.

A data sample with a multinomial target variable contains a target variable with more than two finite states. In churn modeling these states can be marked, e.g., as “Yes”, “No”, “Soft Churn”, “Unknown”, etc. The enumerated states determine churn status within some period (the outcome period). They can also be coded numerically in the data sample, e.g. as “1”, “0”, “2”, “3”, etc.
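The numeric coding of both target types can be sketched as a simple lookup. The pairing of labels with codes below follows the order in which the text lists them, which is an assumption, and the names `BINOMIAL_CODES`, `MULTINOMIAL_CODES`, and `encode_target` are hypothetical.

```python
# Assumed label-to-code pairings, read off the order used in the text.
BINOMIAL_CODES = {"Yes": 1, "No": 0}
MULTINOMIAL_CODES = {"Yes": 1, "No": 0, "Soft Churn": 2, "Unknown": 3}


def encode_target(labels, codes):
    """Map churn status labels to numeric codes.

    Raises ValueError on any label missing from `codes`, so an
    unexpected state cannot slip into the data sample unnoticed.
    """
    unexpected = sorted(set(labels) - set(codes))
    if unexpected:
        raise ValueError(f"unexpected target states: {unexpected}")
    return [codes[label] for label in labels]


binomial_target = encode_target(["Yes", "No", "No", "Yes"], BINOMIAL_CODES)
multinomial_target = encode_target(
    ["No", "Soft Churn", "Yes", "Unknown"], MULTINOMIAL_CODES
)
print(binomial_target)
print(multinomial_target)
```

The multinomial codes are identifiers for states, not an ordered scale, so a model or relevance measure should treat them as categories rather than as magnitudes.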
