Formal concept analysis (FCA) can be used to derive conceptual structures, analyze complex structures, and discover data dependencies (Wille, 1989; Wille, 2005). FCA is useful in data mining in two ways. First, it provides tools for formal representation of knowledge in an efficient manner. Second, it helps to formalize the conceptual knowledge discovery for different data mining tasks. FCA is increasingly applied in conceptual clustering, data analysis, information retrieval, knowledge discovery, and ontology engineering. Though different from first order logic, FCA emphasizes inter-subjective communication and argumentation. FCA also facilitates importation of the notion of a concept into the modeling of knowledge discovery in databases (KDD).
Formal concept analysis is based on the notions of formal context and formal concept. A formal context is a binary relation between a set of objects and a set of attributes. A formal context provides logic representation of a data set and is used to extract formal concepts.
A formal concept is a pair of intent and extent (Saquer & Deogun, 1999). Intent is a set of features possessed by each object. The extent represents the set of all objects that belong to the concept. These objects share the features from intent. Given a set of features in intent, we can find objects that share the set or subset of features that are shared by the candidates in the extent. There may exist some indiscernible objects in the extent; such objects can be classified using concept learning from formal concept analysis.
Formal Context
A formal context is defined by a triplet (O,A,R), where O and A are two finite and nonempty sets, namely the object set and the attribute set. The relationships between objects and attributes are described by a binary relation R between O and A, which is a subset of the Cartesian product O×A. If an object Ox possesses an attribute Ay, we denote it as (Ox,Ay)∈R, or OxRAy.
Based on the definition of formal context, we know that an object Ox∈O has a set of attributes:
and an attribute
Ay is possessed by the set of objects:
To perform FCA, we first define a set-theoretic operator “*” to associate the subset of objects and attributes mutually in a formal context (O,A,R).
This shows that the “*” operator associates a subset of attributes X* to the subset of objects X. Similarly, for any subset of attributes , we can associate a subset of objects as follows:
The “*” operation induces the following attributes: for and ,
(1)(2)(3)(4)A pair of mappings is called a Galois connection if it satisfies (1) and (2), and hence (3). By definition, is the set of attributes possessed by Ox, and is the set of objects having attributes Ay. For a set of objects X, X* is the maximal set of attributes shared by all objects in X. Similarly, for a set of attributes Y, Y* is the maximal set of objects that have all attributes in Y (Yao & Chen, 2006).