Basic Principles of Data Mining

Karl-Ernst Erich Biebler (Ernst-Moritz-Arndt-University, Germany)
DOI: 10.4018/978-1-60566-196-4.ch015


This chapter gives a summary of data types, mathematical structures, and associated methods of data mining. Topological, order-theoretical, algebraic, and probability-theoretical mathematical structures are introduced. The n-dimensional Euclidean space, the most widely used model for data, is defined. It is briefly explained that the treatment of higher-dimensional random variables and related data is problematic. Since topological concepts are less well known than statistical concepts, many examples of metrics are given. Related classification concepts are defined and explained, and ways of assessing their quality are discussed. One example each is given for topological cluster analysis and topological discriminant analysis.
Chapter Preview

Data Types

Observations of objects are recorded as data. These observations can be obtained as measurements, numbers, or verbal descriptions, for example. Sometimes they concern a single property, often several. More complicated facts about the objects, such as relations, can also be included. It is therefore necessary to distinguish data types. The data types relevant for data analysis are described in the following.

Data types are also familiar from programming languages. Those are not treated here.

A set in the set-theoretical sense consists of elements. The index set of these elements may be finite or infinite, and accordingly one distinguishes finite and infinite sets. Listing an element more than once does not change a set in the set-theoretical sense. This means that all elements of a set are distinct.

Data sets are collections of elements of a set. In contrast to sets, data sets that differ only in how often an element occurs have to be distinguished, because the same element of a set can appear repeatedly in a data set.
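The distinction between a set and a data set can be illustrated with a minimal Python sketch. The observation values below are invented for illustration and do not come from the chapter. The choice of `set` for the set-theoretical view and `collections.Counter` for the data set view is an assumption made here, not the author's notation.

```python
from collections import Counter

# Hypothetical observations; the values are invented for illustration only.
observations_a = ["red", "blue"]
observations_b = ["red", "red", "blue"]

# Viewed as sets, repeated elements collapse, so both collections are equal.
same_as_sets = set(observations_a) == set(observations_b)

# Viewed as data sets, the number of occurrences of each element matters.
same_as_data_sets = Counter(observations_a) == Counter(observations_b)

print(same_as_sets)       # True
print(same_as_data_sets)  # False
```

A `Counter` records how often each element occurs, which is the information that is lost when the same observations are reduced to a set.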

String data are symbols or character strings (e.g. letters, words, abstract words). Numerical data are numbers (e.g. 3, 324, 2.1482). Dates are not regarded as numerical data; they form a type of their own.
