System Uncertainty Based Data-Driven Knowledge Acquisition

Jun Zhao (Chongqing University of Posts & Telecommunications, P.R. China) and Guoyin Wang (Chongqing University of Posts & Telecommunications, P.R. China)
Copyright: © 2012 | Pages: 15
DOI: 10.4018/978-1-4666-0261-8.ch026


In the three-layered framework for knowledge discovery, the technique layer needs data-driven algorithms, whose knowledge acquisition process requires no prior domain knowledge or external information and is advantageous for that reason. System uncertainty can direct such a data-driven knowledge acquisition process, so it is crucial for this framework to measure system uncertainty reasonably and precisely. To find a suitable measure, various uncertainty measures based on rough set theory are studied comprehensively: their algebraic characteristics and quantitative relations are established, their performance is compared in a series of experimental tests, and the optimal measure is determined. A new data-driven knowledge acquisition algorithm is then developed based on the optimal uncertainty measure and Skowron’s algorithm for mining propositional default decision rules. Results of simulation experiments show that the proposed algorithm clearly outperforms several other algorithms of the same kind.
Chapter Preview


Cognitive Informatics (CI) is a new transdisciplinary research field that combines cognitive science with informatics. The combination captures both the understanding of intelligence in minds and its implementation in computers, a characteristic that has made CI develop rapidly in recent years (Kinsner, 2005; Wang, 2002; Wang, 2002; Wang, 2003; Wang, 2003; Wang, 2004; Wang & Kinsner, 2006; Wang, 2007). The general ideas of CI can be applied to related problems. In the field of knowledge discovery in databases, a three-layered framework can be established on those ideas (Yao, 2004). The three layers, from the inner to the outer, are the philosophy layer, the technique layer and the application layer. Each layer deals with problems in a different context, i.e. in mind, in computer and in application, respectively, and each provides services for the outer ones. According to this framework, the philosophy layer conceptually provides prior domain knowledge for the technique layer. However, prior knowledge may well be biased or even incorrect in some real-life applications. Worse still, it may be entirely unavailable in cases that have not yet been well explored. Therefore, both theoretically and practically, the technique layer needs algorithms that are independent of prior knowledge. Such algorithms are completely data-driven: their knowledge acquisition process is directed by the information systems themselves, so prior knowledge or external information is no longer necessary.

Essentially, the knowledge acquisition process can be regarded as a kind of knowledge transformation. In an original information system, knowledge exists in the form of primitive data. Making that knowledge more understandable and applicable may involve several data mining steps. For example, in rough set theory, a typical processing sequence includes discretization, reduction and rule acquisition. Accordingly, knowledge is transformed in sequence from original data to discretized data, to reduced forms, and finally to decision rules or decision trees. Conceptually, the knowledge should remain essentially undisturbed throughout this sequence of transformations. Data-driven methods therefore have to use some special features of knowledge to direct the mining process, and one of the most interesting issues in data-driven knowledge acquisition is to find such features. They must be intrinsic and common to knowledge in all of its forms, and they must be quantitatively measurable and comparable. System uncertainty is a good candidate. Because uncertain factors are inherently present, system uncertainty is an intrinsic feature common to information systems and the knowledge systems induced from them, and hence becomes an essential link between the two. In fact, data-driven algorithms for acquiring decision rules have already been implemented successfully based on the minimum degree of the local certainty of information systems (Wang & He, 2002; Wang & He, 2003). A data-driven knowledge acquisition framework based on system uncertainty must, first of all, measure and handle system uncertainty as reasonably and precisely as possible: if an improper uncertainty measure is applied in such an algorithm, its ultimate performance may suffer.
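To make the idea of a measurable system uncertainty concrete, the following minimal Python sketch computes two quantities commonly used in rough set theory for a small decision table: one minus the positive-region ratio (the share of objects whose condition class is not decision-consistent) and the conditional entropy of the decision given the condition attributes. It is an illustration only and is not claimed to be the measure the chapter selects as optimal; the function names, the tuple-based table layout and the toy data are assumptions made for this example.

```python
from collections import Counter, defaultdict
from math import log2


def partition(rows, cond_idx):
    """Group row indices into the equivalence classes induced by the
    condition attributes at positions cond_idx (indiscernibility relation)."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[j] for j in cond_idx)].append(i)
    return list(blocks.values())


def uncertainty(rows, cond_idx, dec_idx):
    """Return (1 - positive-region ratio, conditional entropy H(D | C)).

    Both values are 0 for a fully consistent decision table and grow as
    more objects with equal condition values carry different decisions.
    """
    n = len(rows)
    positive = 0      # objects lying in decision-consistent classes
    entropy = 0.0
    for block in partition(rows, cond_idx):
        counts = Counter(rows[i][dec_idx] for i in block)
        if len(counts) == 1:
            positive += len(block)
        for c in counts.values():
            p = c / len(block)
            entropy -= (len(block) / n) * p * log2(p)
    return 1 - positive / n, entropy


# Hypothetical toy table: (temperature, cough, diagnosis)
TABLE = [
    ("high", "yes", "flu"),
    ("high", "yes", "flu"),
    ("high", "no", "flu"),
    ("high", "no", "cold"),
    ("low", "no", "cold"),
]

if __name__ == "__main__":
    print(uncertainty(TABLE, (0, 1), 2))  # both condition attributes
    print(uncertainty(TABLE, (0,), 2))    # after dropping the second attribute
```

Comparing the two calls shows how such a measure could guide a transformation step: dropping an attribute that raises the measured uncertainty disturbs the knowledge in the table, whereas dropping one that leaves it unchanged does not.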
