Data mining has been widely applied in many areas over the past two decades. In marketing, many firms collect large amounts of customer data to understand customers' needs and predict their future behavior. This chapter discusses some of the key data mining problems in marketing and provides solutions and research opportunities.
Analytics and data mining are becoming more important than ever in business applications, as described by Davenport and Harris (2007) and Baker (2006). Marketing analytics has two major areas: market research and database marketing. The former addresses strategic marketing decisions through survey data analysis, and the latter handles campaign decisions through analysis of behavioral and demographic data. Because survey samples are small, market research is normally not considered data mining. This chapter focuses on database marketing, where data mining is used extensively by large corporations and consulting firms to maximize marketing return on investment.
The simplest tool is RFM, where historical purchase recency (R), frequency (F), and monetary (M) value are used for targeting. Other tools include profiling by pre-selected variables to understand customer behavior, segmentation to group customers with similar characteristics, and association rules to explore purchase relationships among products; see Rud (2001) and Berry and Linoff (2000). More advanced marketing involves predictive modeling to improve targeting and maximize returns. For example, marketing-mix analysis has been used for three decades to optimize advertising dollars, see Dekimpe and Hanssens (2000); attrition modeling is used to identify customers at risk of attrition, see Rud (2001); and long-term value is used to prioritize marketing and services, see Peppers and Rogers (1997, 1999).
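As an illustration of RFM, the following minimal sketch derives recency, frequency, and monetary value from a transaction list and converts each into a rank-based score. The function names, the three-bin scoring, and the sample transactions are assumptions made for this example only; practitioners often use quintiles and handle ties more carefully.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    `transactions` is an iterable of (customer_id, purchase_date, amount).
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: ((as_of - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


def rank_scores(values, higher_is_better=True, bins=3):
    """Map each customer to a score in 1..bins by rank (bins = best).

    Ties are broken by input order, which is a simplification.
    """
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    return {cid: 1 + (i * bins) // n for i, cid in enumerate(ordered)}


def rfm_scores(transactions, as_of, bins=3):
    """Return {customer_id: (R score, F score, M score)}."""
    table = rfm_table(transactions, as_of)
    # A small recency (a recent purchase) is better, so the ranking is reversed.
    r = rank_scores({c: v[0] for c, v in table.items()}, higher_is_better=False, bins=bins)
    f = rank_scores({c: v[1] for c, v in table.items()}, bins=bins)
    m = rank_scores({c: v[2] for c, v in table.items()}, bins=bins)
    return {c: (r[c], f[c], m[c]) for c in table}


# Hypothetical purchase history for three customers.
sample_transactions = [
    ("A", date(2024, 1, 5), 50.0),
    ("A", date(2024, 3, 1), 70.0),
    ("A", date(2024, 3, 20), 30.0),
    ("B", date(2023, 11, 2), 20.0),
    ("C", date(2024, 2, 10), 200.0),
    ("C", date(2024, 2, 25), 100.0),
]

if __name__ == "__main__":
    for cid, scores in sorted(rfm_scores(sample_transactions, date(2024, 4, 1)).items()):
        print(cid, scores)
```

Customers with high scores on all three dimensions would typically be targeted first; how the three scores are combined into a single priority is a business decision that this sketch leaves open.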
To improve 1:1 marketing campaigns (e.g., direct mail, outbound), response modeling to identify likely responders is now standard practice in larger corporations. As summarized in Figure 1, a previous campaign provides data on the ‘dependent variable’ (responded or not), which is merged with individual characteristics. A response model is then developed to predict the response rate given those characteristics. The model is used to score the population, that is, to predict a response rate for every individual. Finally, the individuals with the highest predicted response rates are targeted in the next campaign in order to maximize effectiveness and minimize expense.
Figure 1. Response modeling process
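The four steps of the response modeling process (merge, model, score, select) can be sketched as follows. This is a deliberately naive stand-in rather than the method of any cited author: the "model" is simply the observed response rate within each combination of characteristics, with the overall rate as a fallback for unseen combinations. In practice a logistic regression or similar predictive model would usually take its place. All function names and data below are hypothetical.

```python
from collections import defaultdict


def merge_campaign(responses, characteristics):
    """Step 1: join response flags (id -> 0/1) with characteristics (id -> tuple)."""
    return [
        (characteristics[cid], flag)
        for cid, flag in responses.items()
        if cid in characteristics
    ]


def fit_segment_rates(training_rows):
    """Step 2: estimate the response rate per segment (assumed naive model)."""
    if not training_rows:
        raise ValueError("no training data from the previous campaign")
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, contacted]
    for segment, responded in training_rows:
        counts[segment][0] += responded
        counts[segment][1] += 1
    overall = sum(c[0] for c in counts.values()) / sum(c[1] for c in counts.values())
    rates = {segment: r / n for segment, (r, n) in counts.items()}
    return rates, overall


def score_population(population, rates, overall):
    """Step 3: predict a response rate for every individual in the population."""
    return {cid: rates.get(segment, overall) for cid, segment in population.items()}


def select_targets(scores, list_size):
    """Step 4: keep the individuals with the highest predicted response rates."""
    ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return ranked[:list_size]


# Hypothetical previous campaign: six contacted individuals.
characteristics = {
    1: ("young", "urban"), 2: ("young", "urban"),
    3: ("senior", "rural"), 4: ("senior", "rural"),
    5: ("young", "rural"), 6: ("young", "rural"),
}
responses = {1: 1, 2: 1, 3: 0, 4: 0, 5: 1, 6: 0}

# Hypothetical population to score for the next campaign.
population = {
    101: ("young", "urban"),
    102: ("senior", "rural"),
    103: ("young", "rural"),
    104: ("senior", "urban"),  # unseen segment, falls back to the overall rate
}

rates, overall = fit_segment_rates(merge_campaign(responses, characteristics))
scores = score_population(population, rates, overall)
targets = select_targets(scores, list_size=2)
```

With so few observations per segment the estimated rates are unstable, which is one reason a parametric model that pools information across characteristics is generally preferred.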
Response modeling can be applied in the following activities via any marketing channel, see Rud (2001); Berry and Linoff (1997):
Acquisition: Which prospects are most likely to become customers.
Development: Which customers are most likely to purchase additional products (cross-selling) or add monetary value (up-selling).
Retention: Which customers are most retainable; this can be relationship or value retention.
In this chapter, we describe important problems that are infrequently mentioned in the academic literature but frequently faced by marketing analysts. These problems are embedded in various components of the campaign process, from campaign design to response modeling to campaign optimization (see Figure 2). Each problem is described in the Problem-Solution-Opportunity format.
Figure 2. Database marketing campaign process