With the explosion in the amount of data produced in commercial environments, organizations face the challenge of collecting, analyzing, and managing very large volumes of data. As a consequence, they have to rely upon new technologies to manage this process efficiently and automatically. Data mining is one such technology: it can help to discover hidden knowledge in an organization’s databases with a view to making better business decisions (Changchien & Lu, 2001). Data mining, or knowledge discovery from databases (KDD), is the search for valuable information within large volumes of data (Hand, Mannila & Smyth, 2001), which can then be used to predict, model or identify interrelationships within the data (Urtubia, Perez-Correa, Soto & Pszczolkowski, 2007). By utilizing data mining techniques, organizations can gain the ability to predict future trends in both markets and customer behaviors. By providing detailed analyses of current markets and customers, data mining gives organizations the opportunity to better meet the needs of their customers. With such significance in mind, this chapter aims to investigate how data mining techniques can be applied in customer relationship management (CRM).
This chapter is organized as follows. First, an overview is given of the main functionalities that data mining technologies can provide. The following section presents application examples where data mining is commonly applied within the domain, with supporting evidence as to how each enhances CRM processes. Finally, current issues and future research trends are discussed before the main conclusions are presented.
Data mining methods can generally be grouped into four categories: classification, clustering, association rules and information visualization. The following subsections will describe these in further detail.
Databases are full of hidden information that can help to make important business decisions. Classification involves using an algorithm to find a model that describes a data class or concept (Han & Kamber, 2006). Given a set of predefined class labels, items can be categorized into classes according to their attributes (e.g., age or income). Thus, it is a useful technique for identifying the characteristics of a new item. For example, in the case of a bank loan clerk, classification is useful for predicting whether a loan applicant is a “safe” or “risky” investment for the bank, based on the class to which the applicant belongs. Popular classification techniques include Decision Trees and Bayesian Networks.
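The loan example can be illustrated with a minimal sketch of a one-level decision tree (a "decision stump"), the simplest member of the decision tree family. The applicant data, the attribute choice (age and income) and the function names below are hypothetical and serve only to illustrate the idea. A practical classifier would grow deeper trees and be validated on held-out data.

```python
from collections import Counter

# Hypothetical training data: (age, annual income in thousands, class label).
TRAINING = [
    (25, 18, "risky"),
    (42, 65, "safe"),
    (35, 30, "risky"),
    (51, 80, "safe"),
    (29, 48, "safe"),
    (60, 22, "risky"),
]


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows):
    """Find the single (attribute, threshold) split with the fewest training errors."""
    best = None
    n_attrs = len(rows[0]) - 1  # the last column is the class label
    for attr in range(n_attrs):
        values = sorted({row[attr] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [row[-1] for row in rows if row[attr] <= threshold]
            right = [row[-1] for row in rows if row[attr] > threshold]
            left_label, right_label = majority(left), majority(right)
            errors = sum(label != left_label for label in left)
            errors += sum(label != right_label for label in right)
            if best is None or errors < best[0]:
                best = (errors, attr, threshold, left_label, right_label)
    return best[1:]  # (attribute index, threshold, left label, right label)


def predict(stump, item):
    """Assign a class label to a new item using the learned split."""
    attr, threshold, left_label, right_label = stump
    return left_label if item[attr] <= threshold else right_label


stump = fit_stump(TRAINING)
new_applicant = (33, 70)  # hypothetical applicant: age 33, income 70k
print(stump, predict(stump, new_applicant))
```

On this toy data the learned split is on income rather than age, because income alone separates the two classes without error.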
Whereas classification is a supervised learning technique because it uses a set of predefined class labels, clustering is an unsupervised learning technique. Because no assumptions are made about the structure of the data, clustering can uncover previously hidden and unexpected trends or patterns. Clustering involves grouping items into “natural” clusters based on their similarities (Hand et al., 2001). Each item is similar to the other items in its own cluster, but dissimilar to the items in other clusters. In this way, clustering is commonly used to identify customer affinity groups with the aim of targeting them with specialized marketing promotions (section 3.2.2). Common clustering techniques include K-means and Kohonen Networks.
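As an illustration of how clustering groups similar items, the following is a minimal sketch of K-means using only the Python standard library. The customer data (annual spend, store visits per month) is hypothetical, and seeding the centroids with the first k points is an assumption made here for reproducibility. Practical implementations usually use randomized seeding and scale the attributes first.

```python
import math

# Hypothetical customers: (annual spend, store visits per month).
CUSTOMERS = [
    (200, 2), (220, 3), (250, 2),
    (900, 10), (950, 12), (1000, 11),
]


def kmeans(points, k, iterations=100):
    """Partition points into k clusters by alternating assignment and update steps."""
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments have stabilized
            break
        centroids = new_centroids
    return centroids, clusters


centroids, clusters = kmeans(CUSTOMERS, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

Here the two resulting groups correspond to low-spend and high-spend customers, which is the kind of affinity group a marketing promotion might target.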
Association rules are mainly used to find relationships between two or more items in a database. An association rule is expressed in the form (X→Y), where X and Y are items or sets of items. In a set of transactions, this means that transactions containing X tend also to contain Y. Such a rule is usually measured by its support and confidence: the support is the percentage of all transactions that contain both X and Y, and the confidence is calculated by dividing the number of transactions supporting the rule (those containing both X and Y) by the number of transactions supporting the rule body (those containing X) (Zhang, Gong & Kawamura, 2004). For example, this technique is commonly used to identify which items are regularly purchased together or to identify the navigational paths of users through an online store. The discovery of such relationships can help in many business decisions, such as customer shopping behavior analysis, recommendations, and catalog design (Han & Kamber, 2006).
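The support and confidence measures can be made concrete with a short sketch. The shopping baskets and the function names below are hypothetical. The calculation follows the definitions given above and does not attempt to mine rules efficiently, as algorithms such as Apriori do.

```python
# Hypothetical shopping baskets, one set of items per transaction.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def support(transactions, body, head):
    """Fraction of all transactions that contain both the rule body and head."""
    return count(transactions, set(body) | set(head)) / len(transactions)


def confidence(transactions, body, head):
    """Transactions supporting the rule divided by those supporting its body."""
    return count(transactions, set(body) | set(head)) / count(transactions, body)


# Rule: bread -> milk
rule_support = support(TRANSACTIONS, {"bread"}, {"milk"})
rule_confidence = confidence(TRANSACTIONS, {"bread"}, {"milk"})
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

In this toy data, three of the five baskets contain both bread and milk, and four contain bread, so the rule bread→milk has a support of 3/5 and a confidence of 3/4.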
Key Terms in this Chapter
Decision Tree: A visual representation of a decision problem that forms tree-like predictive models with the aim of helping people to make better decisions.
Self Organizing Map (SOM): An unsupervised neural network algorithm that results in a clustered neuron structure, where neurons with similar properties (values) are arranged in related areas on the map.
K-Means: A nonhierarchical clustering technique which splits data sets into K (a given number) subsets in a way that each subset is maximally compact.
Kohonen Networks: A type of unsupervised neural network used for finding patterns in input data without human intervention, consisting of Vector Quantization, Self-Organizing Map, and Learning Vector Quantization.
Neural Network: A computational approach inspired by simple models of the brain, consisting of nodes or neurons connected together in some sort of network.
Bayesian Classification: A type of classification algorithm, based on the statistical probability of a class and the features associated with that class.