Finding differences among two or more groups is an important data-mining task. For example, a retailer might want to know how customer purchasing behavior during a sale differs from that on a normal trading day. With this information, the retailer may gain insight into the effects of holding a sale and may factor that into future campaigns. Another possibility would be to investigate what is different about customers who have a loyalty card compared to those who don’t. This could allow the retailer to better understand loyalty cardholders, to increase loyalty revenue, or to attempt to make the loyalty program more appealing to non-cardholders.
This article gives an overview of such group mining techniques. First, we discuss two data-mining methods designed specifically for this purpose: Emerging Patterns and Contrast Sets. We then discuss how these two methods relate and how other methods, such as exploratory rule discovery, can also be applied to this task. Exploratory data-mining techniques, such as those used to find group differences, can present the user with a large number of models. Filter mechanisms are therefore a useful way to automatically remove models that are unlikely to be of interest to the user. In this article, we examine a number of such filter mechanisms that can be used to reduce the number of models with which the user is confronted.
Two main approaches to the group discovery problem have emerged from two different schools of thought. The first, Emerging Patterns, evolved as a classification method, while the second, Contrast Sets, grew as an exploratory method. The algorithms of both approaches are based on the Max-Miner rule discovery system (Bayardo Jr., 1998), so we first briefly describe rule discovery.
Rule discovery is the process of finding rules that best describe a dataset. A dataset is a collection of records in which each record contains one or more discrete attribute-value pairs (or items). A rule is simply a combination of conditions that, if true, can be used to predict an outcome. A hypothetical rule about consumer purchasing behaviors, for example, might be IF buys_milk AND buys_cookies THEN buys_cream.
Association rule discovery (Agrawal, Imielinski & Swami, 1993; Agrawal & Srikant, 1994) is a popular rule-discovery approach. In association rule mining, rules are sought specifically in the form A → C, where the antecedent group of items (or itemset), A, implies the consequent itemset, C. Of particular interest are the rules where the probability of C is increased when the items in A also occur. Association rule-mining systems often restrict the consequent itemset to a single item, as this reduces the complexity of finding the rules.
In association rule mining, we often search for rules that satisfy a minimum support criterion, minsup, and a minimum confidence criterion, minconf. Support is defined as the frequency with which A and C co-occur:
support(A → C) = frequency(A ∪ C)
Confidence is defined as the frequency with which A and C co-occur, divided by the frequency with which A occurs throughout all the data:
confidence(A → C) = frequency(A ∪ C) / frequency(A)
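The two measures can be illustrated with a minimal Python sketch. It assumes that frequency is measured as the fraction of records containing an itemset; the function names and the toy transactions below are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, Sequence

Transaction = FrozenSet[str]


def frequency(itemset: Iterable[str], transactions: Sequence[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def support(antecedent: Iterable[str], consequent: Iterable[str],
            transactions: Sequence[Transaction]) -> float:
    """support(A -> C): how often A and C occur together."""
    return frequency(frozenset(antecedent) | frozenset(consequent), transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: Sequence[Transaction]) -> float:
    """confidence(A -> C): co-occurrence of A and C relative to A alone."""
    freq_a = frequency(antecedent, transactions)
    if freq_a == 0:
        return 0.0  # assumption: treat rules with an unseen antecedent as confidence 0
    return support(antecedent, consequent, transactions) / freq_a


# Hypothetical toy data: each record is the set of items in one purchase.
transactions = [
    frozenset({"milk", "cookies", "cream"}),
    frozenset({"milk", "cookies"}),
    frozenset({"milk", "cream"}),
    frozenset({"bread"}),
]

rule_support = support({"milk", "cookies"}, {"cream"}, transactions)
rule_confidence = confidence({"milk", "cookies"}, {"cream"}, transactions)
```

A rule would then be retained only if `rule_support` is at least minsup and `rule_confidence` is at least minconf.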
The association rules discovered through this process are then sorted according to some user-specified interestingness measure before they are displayed to the user.
Another type of rule discovery is k-most interesting rule discovery (Webb, 2000). In contrast to the support-confidence framework, there is no minimum support or confidence requirement. Instead, k-most interesting rule discovery focuses on the discovery of up to k rules that maximize some user-specified interestingness measure.
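The idea can be sketched as follows. This is a naive illustration, not the search procedure of Webb (2000): it enumerates every candidate rule with a single-item consequent and keeps the k highest-scoring ones. Leverage is used as the interestingness measure purely as an assumption, and the function names, the `max_antecedent_size` limit, and the toy data are hypothetical.

```python
import heapq
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence, Tuple

Transaction = FrozenSet[str]
ScoredRule = Tuple[float, Tuple[str, ...], str]  # (score, antecedent, consequent)


def frequency(itemset: FrozenSet[str], transactions: Sequence[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def leverage(antecedent: FrozenSet[str], consequent: FrozenSet[str],
             transactions: Sequence[Transaction]) -> float:
    """frequency(A ∪ C) - frequency(A) * frequency(C); one possible measure."""
    return (frequency(antecedent | consequent, transactions)
            - frequency(antecedent, transactions) * frequency(consequent, transactions))


def k_most_interesting(
    transactions: Sequence[Transaction],
    k: int,
    measure: Callable[[FrozenSet[str], FrozenSet[str], Sequence[Transaction]], float] = leverage,
    max_antecedent_size: int = 2,
) -> List[ScoredRule]:
    """Return up to k rules with the highest score; no minsup or minconf is applied."""
    items = sorted(set().union(*transactions))
    candidates: List[ScoredRule] = []
    for size in range(1, max_antecedent_size + 1):
        for antecedent in combinations(items, size):
            a = frozenset(antecedent)
            for c in items:
                if c in a:
                    continue  # the consequent must not repeat an antecedent item
                score = measure(a, frozenset({c}), transactions)
                candidates.append((score, antecedent, c))
    return heapq.nlargest(k, candidates)


# Hypothetical toy data: each record is the set of items in one purchase.
transactions = [
    frozenset({"milk", "cookies", "cream"}),
    frozenset({"milk", "cookies"}),
    frozenset({"milk", "cream"}),
    frozenset({"bread"}),
]

top_rules = k_most_interesting(transactions, k=3)
```

Exhaustive enumeration like this is only feasible for very small item sets; a practical system would prune the search space rather than score every candidate.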