The objective of data mining is to discover new and useful knowledge in order to gain a better understanding of nature. This is, in fact, the goal of scientists carrying out scientific research, regardless of their disciplines. This goal-oriented view enables us to re-examine data mining in the wider context of scientific research. An immediate consequence of comparing scientific research with data mining is that an explanation discovery and evaluation task is added to the existing data mining framework. In this chapter, we elaborate on the basic concerns and methods of explanation discovery and evaluation. Explanation-oriented association mining is employed as a concrete example to illustrate the whole framework.
Scientific research and data mining have much in common in terms of their goals, tasks, processes and methodologies. As a recently emerged multi-disciplinary field, data mining and knowledge discovery can benefit from the long-established study of scientific research and investigation (Martella et al., 1999). By viewing data mining in the wider context of scientific research, we can gain insight into the necessity and benefits of explanation discovery. The model of explanation-oriented data mining is a recent result of such an investigation (Yao et al., 2003). The basic idea of explanation-oriented data mining has drawn attention from many researchers (Lin & Chalupsky, 2004; Yao, 2003) since its introduction.
Common Goals of Scientific Research and Data Mining
Scientific research is shaped by the perceptions and purposes of science. Martella et al. (1999) summarized the main purposes of science as follows: to describe and predict, to improve or manipulate the world around us, and to explain our world. The results of the scientific research process provide a description of an event or a phenomenon. The knowledge obtained from research helps us to predict what will happen in the future. Research findings also help us to improve the subject matter under study, and they can be used to determine the best or most effective interventions for bringing about desirable changes. Finally, scientists develop models and theories to explain why a phenomenon occurs.
Goals similar to those of scientific research have been discussed by many researchers in data mining. For example, Fayyad et al. (1996) identified two high-level goals of data mining as prediction and description. Prediction involves the use of some variables to predict the values of some other variables, and description focuses on patterns that describe the data. Ling et al. (2002) studied the issue of manipulation and action based on the discovered knowledge. Yao et al. (2003) introduced the notion of explanation-oriented data mining, which focuses on constructing models for the explanation of data mining results.
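The distinction between description and prediction drawn above can be made concrete with a small sketch. The toy records, the function names `describe` and `predict`, and the choice of support and confidence as descriptive measures are all assumptions made for illustration; they are not taken from Fayyad et al. (1996) or from the framework developed in this chapter.

```python
from collections import Counter

# Hypothetical toy records of the form (weather, plays_outside).
records = [
    ("sunny", "yes"),
    ("sunny", "yes"),
    ("sunny", "no"),
    ("rainy", "no"),
    ("rainy", "no"),
    ("rainy", "yes"),
    ("rainy", "no"),
]


def describe(records, condition, outcome):
    """Description: summarize the pattern 'condition -> outcome'.

    Returns (support, confidence), where support is the fraction of all
    records matching both parts, and confidence is the fraction of records
    with the condition that also have the outcome.
    """
    total = len(records)
    with_condition = sum(1 for c, _ in records if c == condition)
    with_both = sum(1 for c, o in records if c == condition and o == outcome)
    support = with_both / total if total else 0.0
    confidence = with_both / with_condition if with_condition else 0.0
    return support, confidence


def predict(records, condition):
    """Prediction: guess the outcome of a new case from one known variable.

    Uses the most frequent outcome observed with the given condition;
    returns None when the condition was never observed.
    """
    outcomes = Counter(o for c, o in records if c == condition)
    return outcomes.most_common(1)[0][0] if outcomes else None


support, confidence = describe(records, "sunny", "yes")
print(f"sunny -> yes: support={support:.2f}, confidence={confidence:.2f}")
print("predicted outcome for a rainy day:", predict(records, "rainy"))
```

In this sketch, `describe` only characterizes the data already collected, whereas `predict` uses one variable to guess the value of another for an unseen case. Neither function says why the pattern holds, which is the gap that explanation-oriented data mining is meant to address.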