Data Mining and Knowledge Discovery in Databases

Ana Azevedo (Polytechnic Institute of Porto, Portugal)
DOI: 10.4018/978-1-5225-7598-6.ch037

Abstract

The term knowledge discovery in databases (KDD) was coined in 1989 to refer to the broad process of finding knowledge in data and to emphasize the “high-level” application of particular data mining (DM) methods. The DM phase mainly concerns the means by which patterns are extracted and enumerated from data. Nowadays, the two terms are often used interchangeably. Efforts are under way to create standards and rules in the field of DM, with particular attention given to inductive databases and, within that context, to the so-called DM languages. This chapter explores DM in KDD.
Chapter Preview

Data Mining And The Knowledge Discovery In Databases Process

“The KDD process, as presented in (Fayyad, Piatetsky-Shapiro, & Smyth, 1996), is the process of using DM methods to extract what is considered knowledge according to the specification of measures and thresholds, using a database along with any required preprocessing, sub sampling, and transformation of the database. There are five stages considered, namely, selection, preprocessing, transformation, data mining, and interpretation/evaluation as presented in Figure 1:

  • Selection: This stage consists on creating a target data set, or on focusing in a subset of variables or data samples, on which discovery is to be performed;

  • Preprocessing: This stage consists on the target data cleaning and preprocessing in order to obtain consistent data;

  • Transformation: This stage consists on the transformation of the data using dimensionality reduction or transformation methods;

  • Data Mining: This stage consists on the searching for patterns of interest in a particular representational form, depending on the DM objective (usually, prediction);

  • Interpretation/Evaluation: This stage consists on the interpretation and evaluation of the mined patterns.” (Azevedo & Santos, 2008, p. 183)
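The five stages quoted above can be illustrated with a minimal Python sketch. It is a toy example under stated assumptions, not the method of the cited authors: the records, the function names (select, preprocess, transform, mine, evaluate, run_kdd), the single-threshold rule used as the "pattern", and the 0.8 accuracy threshold are all hypothetical choices made for illustration.

```python
# Toy walk-through of the five KDD stages.
# All data, names, and thresholds are hypothetical, chosen only for illustration.

RAW = [
    {"id": 1, "age": 25, "income": 20.0, "buys": 0},
    {"id": 2, "age": 32, "income": 35.0, "buys": 0},
    {"id": 3, "age": 47, "income": None, "buys": 1},  # missing value
    {"id": 4, "age": 51, "income": 80.0, "buys": 1},
    {"id": 5, "age": 38, "income": 60.0, "buys": 1},
    {"id": 6, "age": 29, "income": 30.0, "buys": 0},
]


def select(records, variables):
    """Selection: keep only the variables on which discovery is performed."""
    return [{v: r[v] for v in variables} for r in records]


def preprocess(records):
    """Preprocessing: drop records with missing values to obtain consistent data."""
    return [r for r in records if all(v is not None for v in r.values())]


def transform(records, feature):
    """Transformation: min-max scale one feature to the range [0, 1]."""
    values = [r[feature] for r in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant feature
    return [{**r, feature: (r[feature] - lo) / span} for r in records]


def mine(records, feature, target):
    """Data mining: search for the threshold rule 'feature >= t predicts 1'
    with the highest accuracy on the given records (a prediction objective)."""
    best_t, best_acc = None, -1.0
    for t in sorted({r[feature] for r in records}):
        hits = sum((r[feature] >= t) == bool(r[target]) for r in records)
        acc = hits / len(records)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


def evaluate(accuracy, min_accuracy=0.8):
    """Interpretation/evaluation: accept the pattern only if the chosen
    measure (accuracy) meets the chosen threshold."""
    return accuracy >= min_accuracy


def run_kdd(raw):
    """Chain the five stages in the order given in the text."""
    selected = select(raw, ["income", "buys"])
    clean = preprocess(selected)
    scaled = transform(clean, "income")
    threshold, accuracy = mine(scaled, "income", "buys")
    return threshold, accuracy, evaluate(accuracy)


threshold, accuracy, accepted = run_kdd(RAW)
print(f"rule: scaled income >= {threshold:.3f}, accuracy={accuracy:.2f}, accepted={accepted}")
```

Each function corresponds to one stage, so the "measures and thresholds" mentioned in the quotation appear explicitly as the accuracy measure and the min_accuracy parameter. A realistic application would evaluate the mined pattern on held-out data rather than on the records used to find it.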

Figure 1. The KDD process
