Anamika Gupta (University of Delhi, India), Shikha Gupta (University of Delhi, India) and Naveen Kumar (University of Delhi, India)

Source Title: Handbook of Research on Innovations in Database Technologies and Applications: Current and Future Trends

Copyright: © 2009
|Pages: 10
DOI: 10.4018/978-1-60566-242-8.ch057

Chapter Preview

TopLet *D* denotes the database of *N* transactions and *I* denotes the set of *n* items in *D*. A set of one or more items belonging to set *I* is termed as an itemset. A k-itemset is an itemset of cardinality k. A transaction *T*∈ *D* contains an itemset and is associated with a unique identifier *TID.* The probability of an itemset *X* being contained in a transaction is termed as *support* of *X*.

An itemset having support greater than the user specified support threshold (*ms*) is termed as *frequent itemset (FI)*.

An *association rule* is an implication of the form *X →Y* where *X* ⊂ *I, Y*⊂ *I* and *X*∩*Y =*∅. *Support* and *Confidence* are rule evaluation metrics of an *association rule*. *Support* of a rule *X → Y* in D is ‘sup’ if *sup*% of transactions in *D* contain *X* ∪ *Y*. It is computed as:

Confidence of a rule *X → Y* in *D* is ‘*conf*’ if *conf*% of transactions in *D* that contain *X*, also contain *Y*. It is computed, as the conditional probability that *Y* occurs in a transaction given *X* is present in the same transaction, i.e.

A rule generated from frequent itemsets is *strong* if its *confidence* is greater than the user specified confidence threshold (*mc*).

Three independent groups of researchers (Pasquier N., Bastide Y., Taouil R. & Lakhal L. 1999a), (Zaki M.J. & Hsiao C. J. 1999), (Stumme G., 1999) introduced the notion of mining FCI instead of FI and have given the following definitions of FCI:

Zaki et al. defines *closed itemset* as an itemset whose support is not equal to support of any of its proper superset (Closure Property) (Zaki M. J. & Hsiao C. J. 1999). In other words *X* is a closed itemset if there exists no proper superset *X’* of *X* such that support(*X*) = support(*X’*). Closed itemsets with support greater than the user specified support threshold (ms) are *frequent closed itemsets (FCI)*.

Data Mining: Extraction of interesting, non-trivial, implicit, previously unknown and potentially useful information or patterns from data in large databases

Formal Concept Analysis (FCA): It is a branch of mathematics based on Concept and Concept Hierarchies.

Galois Closure Operator: Let D = (O,I,R) be a data mining context where O and I are finite sets of objects (transactions) and items respectively. ? ? ? x ? is a binary relation between objects and items. For O ? O, and I ? I, we define: f(O): 2O ? 2If(O) = (i? I | ?o ? O, (o,i) ? R} g(I): 2I ? 2Og(I) = (o? O | ?i ? I, (o,i) ? R} f(O) associates with O the items common to all objects o ? O and g(I) associates with I the objects related to all items i ? I. The operators h = f o g in 2I and h’ = g o f in 2O are Galois closure operators. An itemset C ? I from D is a closed itemset iff h(C) = C.

Association Rule: An Association rule is an implication of the form X?Y where X ? I, Y? I and XnY =Ø, I denotes the set of items.

Formal Concept: A formal context K = (G,M,I) consists of two sets G (objects) and M (attributes) and a relation I between G and M. For a set A?G of objects A’={meM | gIm for all geA} (the set of all attributes common to the objects in A). Correspondingly, for a set B of attributes we define B’ = {geG | gIm for all meB} (the set of objects common to the attributes in B). A formal concept of the context (G,M,I) is a pair (A,B) with A?G,B?M, such that A’=B and B’=A A is called the extent and B is the intent of the Formal Concept (A,B).

Closure: For any itemset C ? I, h(C) is the closure of C where h = f o g is the Galois Closure Operator.

Frequent Itemset (FI): A set of one or more items belonging to set I is termed as an itemset. An itemset X is a frequent itemset if support of X is greater than the user specified support threshold (ms).

Non-Redundant Association Rules: Let Ri denote the rule X1i?X2i, where X1,X2 ? I. Rule R1 is more general than rule R2 provided R2 can be generated by adding additional items to either the antecedent or consequent of R1. Rules having the same support and confidence as more general rules are the redundant association rules. Remaining rules are non-redundant rules.

Generator Itemset: A generator p of a closed itemset c is one of the smallest itemsets such that h(p) = c.

Frequent Closed Itemset (FCI): An itemset X is a closed itemset if there exists no itemset X’ such that i. X’ is a proper superset of X, ii. Every transaction containing X also contains X’. A closed itemset X is frequent if its support exceeds the given support threshold.

Search this Book:

Reset