The Knowledge Discovery (KD) process model was first discussed in 1989. Different models have been suggested since, starting with the process model of Fayyad et al. (1996). The common factor of all data-driven discovery processes is that knowledge is their final outcome. In this chapter, the authors analyze most of the KD process models suggested in the literature. The chapter discusses in detail the KD process models that have innovative life cycle steps, and it proposes a categorization of the existing KD models. It also analyzes the strengths and weaknesses of the leading KD process models, together with the commercial systems that support them, their reported applications, and a matrix of their characteristics.
Knowledge Discovery Process Modeling Categorization
The following are the proposed categories for Knowledge Discovery Process (KDP) modeling:
Traditional KDP Approach. This approach is used by most KDP modeling innovators. Starting with the KDD process model of Fayyad et al. (1996), many KDP models have used the same process flow, which includes most of the following steps: business understanding, data understanding, data processing, data mining/modeling, model evaluation, and deployment/visualization.
Ontology-based KDP Approach. This approach integrates ontology engineering with the steps of the traditional KDP approach. Three directions have been identified in this approach: Ontology for KDP, KDP for Ontology, and the integration of both directions (Gottgtroy 2007).
Web-based KDP Approach. This approach mainly deals with web log analysis. It is largely similar to the traditional KDP approach, but it has some unique steps for handling web log data; see Pabarskaite and Raudys (2007) and Buchner et al. (1999).
Agile-based KDP Approach. This approach integrates agile methodologies with traditional KDP methodologies (Alnoukari et al. 2008).
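The step sequence named for the traditional KDP approach above can be sketched as a simple ordered pipeline. This is only an illustrative sketch: the step names come from the text, but the `run_kdp` function, the handler dictionary, and the `state` dictionary are hypothetical constructs introduced here, not part of any of the cited models.

```python
from typing import Callable, Dict, List, Tuple

# Step names as listed for the traditional KDP approach; the order
# simply follows the order in which the text enumerates them.
TRADITIONAL_KDP_STEPS: List[str] = [
    "business understanding",
    "data understanding",
    "data processing",
    "data mining/modeling",
    "model evaluation",
    "deployment/visualization",
]

# Hypothetical handler type: takes the current project state, returns the new one.
Handler = Callable[[dict], dict]


def run_kdp(handlers: Dict[str, Handler], state: dict) -> Tuple[dict, List[str]]:
    """Run the supplied handlers in step order.

    Steps without a handler are skipped, reflecting that a given model
    includes "most of" the steps rather than necessarily all of them.
    Returns the final state and the list of steps actually executed.
    """
    executed: List[str] = []
    for step in TRADITIONAL_KDP_STEPS:
        handler = handlers.get(step)
        if handler is not None:
            state = handler(state)
            executed.append(step)
    return state, executed
```

A model that covers only some of the steps would pass handlers for those steps alone, and the returned list records which steps were actually carried out.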
The Leading KDP Models
The authors have chosen the following leading KDP models based on their innovative steps and their applications in both academia and industry:
Knowledge Discovery in Databases (KDD) Process by Fayyad et al. (1996).
Information Flow in a Data Mining Life Cycle by Ganesh et al. (1996).
SEMMA by SAS Institute (1997).
Refined KDD paradigm by Collier et al. (1998).
Knowledge Discovery Life Cycle (KDLC) Model by Lee and Kerschberg (1998).
CRoss-Industry-Standard Process for Data Mining (CRISP-DM) by CRISP-DM (2000).
Generic Data Mining Life Cycle (DMLC) by Hofmann (2003).
Ontology Driven Knowledge Discovery Process (ODKD) by Gottgtroy (2007).
Adaptive Software Development-Data Mining (ASD-DM) Process Model by Alnoukari et al. (2008).