Xenia Alexandre Naidenova (Research Centre of Military Medical Academy – Saint Petersburg, Russia)

Copyright: © 2018
|Pages: 11

DOI: 10.4018/978-1-5225-2255-3.ch017

Chapter Preview

TopBy classification we mean partition of a given object’s set into disjoint blocks or classes. We assume that objects are described by a set U of symbolic or numeric attributes and each object can have one and only one value of each attribute. Then each attribute generates, by its values, partition of a given set of objects into mutually disjoint classes the number of which is equal to the number of values of this attribute. To give a target classification of objects, we use an additional attribute KL not belonging to U. In Table 1, we have two classes: KL+ (positive objects) and KL− (negative objects).

By classification learning we mean approximation of given object classification in terms of attributes names or values of attributes (Naidenova, 2012). This approximation is reduced to extracting logical rules in the form of functional or implicative dependencies from observable datasets. These dependencies allow to distinguish between classes of given classification. For our example in Table 1, we have some rules based on implicative (ID) and functional dependencies (FD): Color_of_Hairs, Color_of_Eyes → KL (FD); if Blond, Blue, then KL = “+”; if Hazel, then KL = “−”; if Brown, Blue, then KL = “−” (IDs).

The task of classification learning based on inferring implicative rules is equivalent to the task of concept formation (Banerji, 1969, Ganter & Wille, 1999). The goal of this task is to describe/classify new objects according to description/classification of existing objects. Inferring good diagnostic (classification) tests (GDTs) is the formation of the best descriptions of a given object class KL+ against the objects not belonging to this class (KL−).

Let M = (∪dom(attr), attr ∈ U), where dom(attr) is the set of all values of attr. Let X ⊆ M and G be the set of indices of objects considered (objects for short), G = G+ ∪ G−, where G+ and G− the sets of positive and negative objects, respectively. Denote by d(g) the description of object g ∈ G. Let P(X) = {g | g ∈ G, X ⊆ d(g)). We call P(X) the interpretation of X in the power set 2^{G}. If P(X) contains only positive objects and the number of these objects more than 2, then we call X a description of some positive objects and (P(X), X) a test for positive objects. Let us define a good test or good description of objects.

Example of classification

Index of Example | Height | Color of Hair | Color of Eyes | KL |

1 | Low | Blond | Blue | + |

2 | Low | Brown | Blue | − |

3 | Tall | Brown | Hazel | − |

4 | Tall | Blond | Hazel | − |

5 | Tall | Brown | Blue | − |

6 | Low | Red | Blue | − |

7 | Tall | Red | Blue | + |

8 | Tall | Blond | Blue | + |

Good Irredundant Classification Test (GIRT): A good test is irredundant if deleting any attribute’s value from it changes its property “to be test” into the property “not to be a test”.

Incremental Learning: Incremental learning is a machine learning paradigm where the learning process takes place whenever new example(s) or new attribute(s) (attribute value(s)) merge or must be deleted from dataset and the solutions already obtained are only modified.

The Subtask of the Second Kind: For a given set of positive and negative examples and a non-empty collection of attributes’ values such that it does not correspond to a test for the set of positive examples, find all GMRTs (GIRTs) containing it.

Diagnostic or Classification Test: Assume that we have two sets of objects’ examples called positive and negative examples, respectively. A test for a subset of positive examples is a collection of attributes’ values describing this subset of examples, i.e. it is common or general feature for all examples of this subset and, simultaneously, none of the negative examples is described by it.

Good Classification Test: A classification test describing a given set of positive examples is good if this set of positive examples is maximal in the sense that if we add to it any positive example not belonging to it, then the collection of attributes’ values describing the obtained set will describe at least one negative example.

The Subtask of the First Kind: Assume that we have two sets of positive and negative examples and a positive example. The subtask of the first kind is to find all the collections of attributes’ values that are included in the description of this example and correspond to the good tests (GMRTs or GIRTs) for the set of positive examples.

Good Maximally Redundant Classification Test: ( GMRT): A good test is a maximally redundant one if extending it by any attribute’s value not belonging to it changes its property “to be a good test” into the property “to be a test but not a good one”.

Search this Book:

Reset

Copyright © 1988-2020, IGI Global - All Rights Reserved