The purpose of this literature review chapter is to discuss integrated techniques of knowledge discovery, identify gaps in them, and derive the research objectives of this work. The chapter first discusses techniques for extracting patterns from large datasets, for example a data warehouse, followed by pattern prediction techniques. Pattern extraction and prediction are reviewed on the basis of knowledge independence, multi-level mining ability, advanced evaluation of results, and visualization ability. Finally, a summary of the issues in current research is presented, followed by the research objectives.
2.2 Hybrid Techniques Of Knowledge Discovery
Knowledge discovery from large datasets has remained a domain of interest for researchers. Analysts have been interested in finding hidden and interesting patterns in large datasets (Frawley, Piatetsky-Shapiro, & Matheus, 1992; Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Goebel & Gruenwald, 1999). Data mining and data warehousing have each played a major role in knowledge discovery independently (Cios, Pedrycz, & Swiniarski, 1998; Wang, 1999; Palpanas, 2000). Data mining has been widely used for knowledge discovery in domains such as business, medicine, education, and science, as it aims to extract patterns from large datasets (Han, Kamber, & Chiang, 1997; Hand, Mannila, & Smyth, 2001). Data warehouses, on the other hand, have been used by researchers for exploratory analysis, and earlier work has advanced data warehousing in various directions (Kimball & Caserta, 2011; Berson & Smith, 1997; Chaudhuri & Dayal, 1997). More recently, however, researchers have started combining the two domains in a hybrid fashion (You, Dillon, & Liu, 2001; Z. Liu & Guo, 2001; Chen et al., 2006; Hsu & Chien, 2007; Messaoud, Rabaséda, Boussaid, & Missaoui, 2006; Messaoud, Rabaséda, Rokia, & Boussaid, 2007; Korikache & Yahia, 2014). These hybrid techniques are generally used for knowledge discovery and have strengthened the process.
Han, Pei, and Kamber (2011) define several characteristics of a mining process that works in a multidimensional environment. First, the authors suggest that the mining process needs to be domain independent, so that analysts can use it without domain knowledge. Second, a good mining process provides facilities for mining different subsets of data at varying levels of abstraction in a multidimensional environment. Third, the extracted knowledge needs to be validated through advanced measures of interestingness. Finally, the process is strengthened by adding a visualization component.
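Two of these characteristics, mining at varying levels of abstraction and evaluating patterns with interestingness measures, can be illustrated with a minimal Python sketch. It is not taken from Han, Pei, and Kamber (2011): the sales facts, the city-to-country hierarchy, and the function names are hypothetical, and support, confidence, and lift are used only as simple, widely known measures standing in for the more advanced ones the authors have in mind.

```python
# Hypothetical fact table: (city, product, bought_warranty).
FACTS = [
    ("Leeds", "laptop", True),
    ("Leeds", "laptop", True),
    ("Leeds", "phone", False),
    ("York", "laptop", True),
    ("York", "phone", False),
    ("Lyon", "laptop", False),
    ("Lyon", "phone", False),
    ("Lyon", "phone", True),
]

# Hypothetical concept hierarchy for the location dimension.
CITY_TO_COUNTRY = {"Leeds": "UK", "York": "UK", "Lyon": "France"}


def roll_up(facts):
    """Replace each city by its country (a coarser level of abstraction)."""
    return [(CITY_TO_COUNTRY[city], product, warranty)
            for city, product, warranty in facts]


def rule_measures(facts, location, product):
    """Return (support, confidence, lift) for the rule
    (location, product) -> bought_warranty."""
    n = len(facts)
    antecedent = [f for f in facts if f[0] == location and f[1] == product]
    both = [f for f in antecedent if f[2]]
    consequent_rate = sum(1 for f in facts if f[2]) / n

    support = len(both) / n
    confidence = len(both) / len(antecedent) if antecedent else 0.0
    lift = confidence / consequent_rate if consequent_rate else 0.0
    return support, confidence, lift


# The same rule evaluated at two levels of the location hierarchy.
city_level = rule_measures(FACTS, "Leeds", "laptop")
country_level = rule_measures(roll_up(FACTS), "UK", "laptop")
print("city level:   ", city_level)
print("country level:", country_level)
```

In this toy data the rule gains support after the roll-up from city to country, which is the kind of effect an analyst can only observe when the mining process works across abstraction levels.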