A Workload Assignment Strategy for Efficient ROLAP Data Cube Computation in Distributed Systems

Ilhyun Suh, Yon Dohn Chung

Source Title: International Journal of Data Warehousing and Mining (IJDWM) 12(3)

DOI: 10.4018/IJDWM.2016070104

Abstract

The data cube plays a key role in the analysis of multidimensional data. The explosive growth of multidimensional data has made distributed solutions important for data cube computation. Among the architectures for distributed processing, the shared-nothing architecture is known to have the best scalability. However, frequent and massive network communication among the processors can be a performance bottleneck in shared-nothing distributed processing. Therefore, reducing the amount of data transmitted among the processors can be an effective strategy for improving overall performance. In addition, it is important to divide the workload and distribute it evenly among the processors. In this paper, the authors present a distributed algorithm for data cube computation that can be adopted in shared-nothing systems. The proposed algorithm gains efficiency by adopting a workload assignment strategy that simultaneously reduces the total network cost and allocates the workload evenly to each processor.

Introduction

The data cube is an essential part of analytical processing, and it is widely used for analyzing multidimensional data (Gray et al., 1997). A data cube allows users to explore multidimensional data from various perspectives and at different hierarchical summarization levels. By exploring a data cube, users can easily gain insights from multidimensional data. For this reason, the data cube plays a key role in On-line Analytical Processing (OLAP) systems (Chaudhuri & Dayal, 1997).
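
To make the notion concrete, a data cube can be viewed as the set of aggregates for every possible group-by over the dimensions. The following is a minimal single-machine sketch of that idea, not the distributed algorithm of this paper. The function name `compute_cube`, the choice of SUM as the aggregate, and the sample sales records are assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import combinations


def compute_cube(rows, dims, measure):
    """Naively compute a full data cube with SUM as the aggregate.

    Returns a dict that maps each group-by (a tuple of dimension names)
    to a dict of {group key: summed measure}.
    """
    cube = {}
    # Enumerate every subset of the dimensions, from () up to all of them.
    for k in range(len(dims) + 1):
        for group_by in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group_by)
                totals[key] += row[measure]
            cube[group_by] = dict(totals)
    return cube


# Hypothetical sales records, invented for this example.
sales = [
    {"region": "east", "product": "pen", "year": 2015, "amount": 10},
    {"region": "east", "product": "ink", "year": 2015, "amount": 5},
    {"region": "west", "product": "pen", "year": 2016, "amount": 7},
]

cube = compute_cube(sales, ("region", "product", "year"), "amount")
```

Here `cube[()]` holds the grand total, `cube[("region",)]` holds the totals per region, and so on up to the finest group-by over all three dimensions. The sketch rescans the input once per group-by, which is exactly the kind of cost that practical cube algorithms try to avoid.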

In OLAP systems, multidimensional data are usually gathered from sources that generate a large amount of data. Good examples of such sources are sales recording systems, transaction logging systems, and sensors that report their measurements periodically. The generated data are gathered and integrated into an underlying tier called the data warehouse (Han, Kamber, & Pei, 2011). As the size of the data warehouse grows, the complexity of data cube computation becomes a performance bottleneck and makes it difficult for users to perform analysis within a tolerable time. This is even more problematic today because the amount of data being generated is growing explosively.

Since data cube computation has a high computational cost that grows exponentially with the number of dimensions, a single machine is unlikely to cope with massive data. In this context, some distributed solutions have been proposed (Chen, Dehne, Eavis, & Rau-Chaplin, 2004a, 2004b, 2008; Lee, Jo, & Kim, 2015; Lee, Kim, Moon, & Lee, 2012; Nandi, Yu, Bohannon, & Ramakrishnan, 2012; Sergey & Yury, 2009). These studies exploit distributed computing power using a set of processors in order to achieve high performance. In distributed computing, a task is divided into multiple sub-tasks that are distributed to the processors. With regard to sub-task assignment, having the workload evenly distributed among the processors is a key factor for achieving maximal parallelism and scalability.
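
The exponential growth mentioned above follows from simple counting: a cube over d dimensions contains one group-by per subset of the dimensions, that is, 2^d group-bys. When a dimension has a hierarchy of L levels, it can appear at any of those levels or be aggregated away, which gives L + 1 choices per dimension. The short sketch below only illustrates this standard counting argument; the function names are hypothetical and do not come from the paper.

```python
from math import prod


def num_cuboids(num_dims):
    """Number of group-bys in a full cube without hierarchies: 2^d."""
    return 2 ** num_dims


def num_cuboids_with_hierarchies(levels_per_dim):
    """Number of group-bys when dimension i has levels_per_dim[i] levels.

    Each dimension contributes (levels + 1) choices: one per hierarchy
    level, plus the fully aggregated "all" level.
    """
    return prod(levels + 1 for levels in levels_per_dim)


# Adding a dimension doubles the number of group-bys to compute.
growth = [num_cuboids(d) for d in range(1, 11)]
```

With ten flat dimensions there are already 1,024 group-bys, and every additional dimension doubles that number.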

There are three architectural choices for building a distributed system: shared-disk, shared-memory, and shared-nothing (DeWitt & Gray, 1992). The shared-nothing architecture is known to have the best scalability among the three because it can scale the I/O bandwidth (Babu & Herodotou, 2013). Typically, limited I/O bandwidth becomes a performance bottleneck in data-intensive jobs such as data cube computation. Despite the scalable I/O, frequent network communication and massive remote data access between processors can be a critical overhead in shared-nothing distributed systems. This is because network communication is the slowest of the operations involved in distributed processing. Thus, reducing network cost can be an effective strategy for improving the overall performance of distributed processing.

In this paper, we present an efficient algorithm for computing relational OLAP (ROLAP) data cubes in shared-nothing distributed systems. We focus on ROLAP cube computation because it can be easily incorporated into existing DBMSs. In addition, we focus on the shared-nothing architecture since it has proven its scalability and is widely used in distributed processing systems. The main contributions of this paper are as follows:

• We introduce a method for dividing the entire cube computation task into independent sub-tasks.
• We present an algorithm for assigning the sub-tasks to the processors for efficient data cube computation. By reducing the inter-processor network transmission and achieving good load balance among the processors, our algorithm gains efficiency.
• We present additional optimization techniques that can be applied to our algorithm.
• We evaluate the performance of the proposed algorithm with extensive experiments.
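
The two goals named in the contributions, low network transmission and good load balance, can pull in different directions, so an assignment strategy has to weigh them against each other. The sketch below is a generic greedy heuristic written only to illustrate that trade-off. It is not the algorithm proposed in this paper, whose details are not part of this preview. The function `assign_subtasks`, the weight `alpha`, the cost model, and the sample sub-tasks are all assumptions.

```python
def assign_subtasks(tasks, num_procs, alpha=1.0):
    """Greedily assign sub-tasks to processors.

    tasks: list of (name, work, local) tuples, where `work` is an estimated
           computation cost and `local[p]` is the amount of the sub-task's
           input data already stored on processor p (hypothetical model).
    alpha: assumed weight of network transfer relative to computation.

    Returns (assignment, loads, total_transfer).
    """
    loads = [0.0] * num_procs
    assignment = {}
    total_transfer = 0.0
    # Place the heaviest sub-tasks first, as in classic greedy load balancing.
    for name, work, local in sorted(tasks, key=lambda t: -t[1]):
        total = sum(local)
        # Cost of choosing p: resulting load plus weighted remote data volume.
        best = min(
            range(num_procs),
            key=lambda p: loads[p] + work + alpha * (total - local[p]),
        )
        assignment[name] = best
        loads[best] += work
        total_transfer += total - local[best]
    return assignment, loads, total_transfer


# Hypothetical sub-tasks on two processors, invented for this example.
tasks = [
    ("A", 4, [4, 0]),
    ("B", 3, [0, 3]),
    ("C", 2, [2, 0]),
    ("D", 2, [1, 1]),
]
assignment, loads, total_transfer = assign_subtasks(tasks, num_procs=2)
```

In this toy input, sub-tasks A, B, and C are placed where their data already resides, and D, whose data is split evenly, goes to the less loaded processor. Raising `alpha` favors data locality at the expense of balance, while lowering it does the opposite.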
