Classification on Top of Data Cube

Lixin Fu
Copyright: © 2005 | Pages: 19
DOI: 10.4018/978-1-59140-414-9.ch009

Currently, data classification is performed on data stored either in relational databases or in flat files. For large data sets, these approaches often require multiple scans of the original data, which makes them infeasible in many applications. In this chapter, we propose to deploy classification on top of OLAP (online analytical processing) and data cube systems. First, we compute statistics for various combinations of the attributes; these aggregates are known as data cubes. The statistics are then used to derive classification models. In this way, the original data is scanned only once, which improves classification performance significantly. Furthermore, our new classifiers provide “free” classification by eliminating the dominating I/O overhead of scanning the massive original data. An architecture that integrates database, data cube, and data mining is given, and three new cube-based classifiers are presented and evaluated.
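
The idea of scanning the data once and then classifying from aggregates alone can be illustrated with a minimal sketch. It is not one of the chapter's three classifiers: it assumes a naive Bayes model with Laplace smoothing as a stand-in, and the names `build_cube` and `classify`, the in-memory `Counter` layout, and the toy rows are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cube(rows, n_attrs, max_dims=1):
    """Scan the rows once and count every group-by cell, crossed with the label.

    Each row is (attr_0, ..., attr_{n_attrs-1}, label). Counts are kept for
    every combination of up to max_dims attributes (an assumed cap that keeps
    the sketch small; a full cube would use max_dims = n_attrs).
    """
    cube = Counter()
    for row in rows:  # the only pass over the original data
        attrs, label = row[:n_attrs], row[n_attrs]
        cube[((), label)] += 1  # class totals: no attributes grouped
        for k in range(1, max_dims + 1):
            for dims in combinations(range(n_attrs), k):
                cell = tuple((d, attrs[d]) for d in dims)
                cube[(cell, label)] += 1
    return cube


def classify(cube, attrs, labels, domain_sizes):
    """Pick the label with the highest naive Bayes score, using cube counts only."""
    total = sum(cube[((), c)] for c in labels)
    best, best_score = None, -1.0
    for c in labels:
        class_count = cube[((), c)]
        score = class_count / total  # prior P(c)
        for d, v in enumerate(attrs):
            # Laplace-smoothed estimate of P(attr_d = v | c)
            score *= (cube[(((d, v),), c)] + 1) / (class_count + domain_sizes[d])
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical toy data: (outlook, temperature, label)
rows = [
    ("sunny", "hot", "no"),
    ("sunny", "mild", "no"),
    ("rainy", "mild", "yes"),
    ("overcast", "hot", "yes"),
    ("rainy", "cool", "yes"),
]
cube = build_cube(rows, n_attrs=2)
prediction = classify(cube, ("sunny", "hot"), labels=["yes", "no"], domain_sizes=[3, 3])
print(prediction)
```

Once `build_cube` has run, `classify` never touches `rows` again, which mirrors the point above: the expensive scan happens once, and any number of models or predictions can then be derived from the much smaller set of counts.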
