This chapter introduces the general context, aims, and rationale of the book, followed by a brief review of the contents of its chapters. It presents the motivation for multi-level pattern extraction and prediction, and then identifies the limitations that prevent existing techniques from satisfying the requirements of multi-level pattern extraction and prediction.
Chapter Preview

1.1 Introduction: Pattern Extraction And Prediction In Large Datasets

A large volume of data is present in all domains today. Apart from handling the storage of data, researchers have also been interested in using this data for knowledge extraction. Techniques have been adapted to find hidden and interesting patterns in large data sets (Frawley, Piatetsky-Shapiro, & Matheus, 1992; Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Goebel & Gruenwald, 1999). Data mining and data warehousing have each played a major role in knowledge discovery, independently of one another (Cios, Pedrycz, & Swiniarski, 1998; Wang, 1999; Palpanas, 2000). Data mining has been widely used for knowledge discovery in domains such as business, medicine, education, and science, as it aims to extract patterns from large data sets (Han, Kamber, & Chiang, 1997; Hand, Mannila, & Smyth, 2001). Data warehouses, on the other hand, have been used by researchers for exploratory analysis, and past work has advanced data warehousing in various directions (Kimball & Caserta, 2011; Berson & Smith, 1997; Chaudhuri & Dayal, 1997). More recently, however, researchers have started using both domains in a hybrid fashion (You, Dillon, & Liu, 2001; Liu & Guo, 2001; Chen et al., 2006; Hsu & Chien, 2007; Messaoud, Rabaséda, Boussaid, & Missaoui, 2006; Messaoud, Rabaséda, Rokia, & Boussaid, 2007; Usman, Asghar, & Fong, 2009; Korikache & Yahia, 2014). These hybrid techniques are generally used for knowledge discovery and have strengthened the process. They include pattern prediction and extraction in a multidimensional environment, and work in this area continues to advance.

Han, Pei, and Kamber (2011) have defined several characteristics of a mining process working in a multidimensional environment. First, the authors suggest that the mining process needs to be domain independent, so that analysts can use it without domain knowledge. Second, a good mining process provides facilities for mining on different subsets of data and at varying levels of abstraction in a multidimensional environment. Third, the extracted knowledge needs to be validated through advanced measures of interestingness. Finally, the process is strengthened by adding a visualization component.
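
As a rough illustration of what "different subsets of data" and "varying levels of abstraction" can mean in practice, the following minimal Python sketch aggregates a tiny fact table at the country level, at the finer city level, and over a subset restricted to one country. The table, its column names, its values, and the `roll_up` helper are all hypothetical and invented for this example; they are not taken from Han, Pei, and Kamber (2011) or from any technique cited in this chapter.

```python
from collections import defaultdict

# Hypothetical fact rows (made-up values): (country, city, month, sales).
COLUMNS = ("country", "city", "month", "sales")
FACTS = [
    ("NZ", "Auckland", "2015-01", 120),
    ("NZ", "Auckland", "2015-02", 80),
    ("NZ", "Wellington", "2015-01", 50),
    ("PK", "Islamabad", "2015-01", 70),
    ("PK", "Lahore", "2015-02", 30),
]


def roll_up(rows, group_by, where=None):
    """Sum the 'sales' measure per combination of the chosen dimension levels.

    group_by: list of column names that define the level of abstraction.
    where:    optional predicate selecting a subset of rows.
    """
    key_positions = [COLUMNS.index(name) for name in group_by]
    measure_position = COLUMNS.index("sales")
    totals = defaultdict(int)
    for row in rows:
        if where is not None and not where(row):
            continue  # row falls outside the requested subset
        key = tuple(row[i] for i in key_positions)
        totals[key] += row[measure_position]
    return dict(totals)


# Coarse level of abstraction: one total per country.
by_country = roll_up(FACTS, ["country"])

# Finer level: one total per (country, city) pair.
by_city = roll_up(FACTS, ["country", "city"])

# Subset of the data: monthly totals for a single country only.
nz_by_month = roll_up(FACTS, ["month"], where=lambda row: row[0] == "NZ")
```

The same helper serves every level because the level of abstraction is just the list of grouping columns; a real multidimensional environment would typically delegate this to a data warehouse or OLAP engine rather than to in-memory loops.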

The hybrid techniques discussed above have been used for knowledge discovery in almost all domains in the past (Bogdanova & Georgieva, 2005; Yadav & Pal, 2015; McNamara, Crossley, Roscoe, Allen, & Dai, 2015; Messaoud et al., 2006; Messaoud et al., 2007; Ordonez & Chen, 2009; Bodin-Niemczuk, Messaoud, Rabaséda, & Boussaid, 2008). In some cases it is important for knowledge discovery to be guided by a domain expert, and techniques have been proposed to deal with such cases (Kamber, Han, & Chiang, 1997; Messaoud et al., 2006; Messaoud et al., 2007; Ordonez & Chen, 2009; Sarawagi, Agrawal, & Megiddo, 1998). In most cases, however, knowledge discovery aims to find hidden and interesting patterns. Some approaches mine over a data warehouse in an automated way (Usman, Pears, & Fong, 2013; Ordonez & Chen, 2009; Azeem, Usman, & Ahmad, 2014).
