Improved Data Partitioning for Building Large ROLAP Data Cubes in Parallel
Ying Chen (Dalhousie University, Canada), Frank Dehne (Carleton University, Canada), Todd Eavis (Concordia University, Canada) and A. Rau-Chaplin (Dalhousie University, Canada)
Copyright: © 2008
This paper presents an improved parallel method for generating ROLAP data cubes on a shared-nothing multiprocessor, based on a novel, optimized data partitioning technique. Since no shared disk is required, the method can be used on highly scalable processor clusters consisting of standard PCs with local disks only, connected via a data switch. Experiments show that the improved parallel method provides optimal (linear) speedup for at least 32 processors. The approach, which uses a ROLAP representation of the data cube, is well suited to large data warehouses and high-dimensional data, and supports the generation of both fully materialized and partially materialized data cubes.
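To make the terms "ROLAP representation", "fully materialized" and "partially materialized" concrete, the following is a minimal sequential sketch in Python. It is not the parallel partitioning method of the paper. It only illustrates that a data cube over d dimensions consists of 2^d group-by views, each stored as an ordinary relational table, and that a partial cube materializes only a chosen subset of them. The names `group_by`, `build_cube`, `DIMS` and `ROWS`, and the sample data, are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def group_by(rows, dims, view):
    """Sum the measure (last field of each row) over the dimensions in `view`."""
    idx = [dims.index(d) for d in view]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]
    return dict(totals)


def build_cube(rows, dims, views=None):
    """Materialize the given views; by default all 2^d group-bys (full cube)."""
    if views is None:
        views = [c for k in range(len(dims), -1, -1)
                 for c in combinations(dims, k)]
    # Each view is kept as its own table (here: a dict), as in a ROLAP layout.
    return {view: group_by(rows, dims, view) for view in views}


# Hypothetical fact table: (store, product, sales)
DIMS = ("store", "product")
ROWS = [
    ("s1", "p1", 10),
    ("s1", "p2", 5),
    ("s2", "p1", 7),
]

full_cube = build_cube(ROWS, DIMS)                              # 2^2 = 4 views
partial_cube = build_cube(ROWS, DIMS, views=[("store",), ()])   # 2 selected views
```

In this sketch every view is computed directly from the raw rows. Practical cube generation methods typically derive coarser views from finer ones and, in the parallel setting described above, distribute the data across processors; the sketch does not attempt either.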