Building Textual OLAP Cubes Using Real-Time Intelligent Heterogeneous Approach

Haytham Alzeini (IIUM, Kuala Lumpur, Malaysia), Shihab A. Hameed (IIUM, Kuala Lumpur, Malaysia) and Mohamed Hadi Habaebi (IIUM, Kuala Lumpur, Malaysia)
Copyright: © 2018 | Pages: 26
DOI: 10.4018/IJIIT.2018070105

Abstract

This article describes how the ever-growing amount of data calls for innovative solutions to capture, process, and store information. OLAP is regarded as a powerful analytical technology that enables analysts to gain insight into data and to view information from diverse perspectives. Consequently, OLAP has been used in a broad spectrum of sensitive industrial applications, and it remains at the forefront of information technology research as needs evolve. One need that has attracted researchers' attention is providing real-time answers, which in particular cases means processing billions of records in a few seconds or less. Limited processing capacity has emerged as a major hurdle to achieving this aim. Although numerous improvements have been suggested, few have considered the heterogeneous computing approach, which has achieved a quantum leap in response time, albeit in most cases using only numerical data. In this article, the authors introduce a novel heterogeneous OLAP approach that targets the aggregation of textual OLAP cubes and can be used efficiently in OLAP-based pattern recognition problems. In this context, the approach (a) exploits the GPU along with the CPU to process textual data; (b) stores the hash table of query aggregations in global memory so that higher aggregation levels are answered in a shorter time; and (c) introduces an intelligent self-evaluating mechanism (ISEM) that evaluates resource efficiency on a per-query basis by deciding which resource (CPU or GPU+CPU) is more reliable for processing each query. The authors' empirical results show a gain of up to thirty-two fold over the parallel CPU-based counterpart. Furthermore, their approach demonstrates that adopting aggregation-memory optimization significantly improves the performance of high-level textual aggregations.
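
Point (b) of the abstract can be illustrated with a small CPU-only sketch. It is not the authors' implementation: the fact rows, the `aggregate_base` and `roll_up` helpers, and the use of Python's `Counter` as the hash table are all assumptions made for illustration. The idea shown is only that a coarser aggregation can be derived from a cached finer one without rescanning the raw rows.

```python
from collections import Counter

# Hypothetical fact rows: (country, city, term). All values are invented.
rows = [
    ("MY", "Kuala Lumpur", "fever"),
    ("MY", "Kuala Lumpur", "cough"),
    ("MY", "Penang", "fever"),
    ("SG", "Singapore", "fever"),
    ("MY", "Kuala Lumpur", "fever"),
]

def aggregate_base(rows):
    """Scan the raw rows once and count occurrences per (country, city, term)."""
    return Counter(rows)

def roll_up(cached, keep):
    """Answer a coarser query from a cached aggregation instead of the raw rows.

    `keep` lists the positions of the key tuple that survive the roll-up.
    """
    out = Counter()
    for key, count in cached.items():
        out[tuple(key[i] for i in keep)] += count
    return out

city_level = aggregate_base(rows)             # finest level, computed once
country_level = roll_up(city_level, (0, 2))   # keys become (country, term)
term_level = roll_up(country_level, (1,))     # keys become (term,)
```

Each roll-up reads only the entries of the previous hash table, which is typically far smaller than the raw data; that is the intuition behind answering higher aggregation levels faster.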
Article Preview

1. Introduction

Online analytical processing (OLAP Council, 2001) is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user. OLAP functionality is characterized by dynamic multidimensional analysis of consolidated enterprise data supporting end-user analytical and navigational activities, including calculations and modeling applied across dimensions, through hierarchies and/or across members, trend analysis over sequential time periods, slicing subsets for on-screen viewing, drill-down to deeper levels of consolidation, and rotation to new dimensional comparisons in the viewing area. In practice, OLAP is integrated into data warehouses whose size ranges from gigabytes to, in certain instances, petabytes. Reading and analyzing data at this scale demands substantial processing capacity. In addition, response time is usually a critical factor in OLAP applications (e.g., medical and financial applications). That is, processors are expected to read, analyze and deliver answers to queries in a very short time. Many enhancements have been introduced to improve performance. These enhancements, which stem from different diagnoses of the problem, fall into two major streams: software-based solutions and hardware-based solutions. Materialization has been studied so extensively over the last decade that it has become an iconic software-based solution. However, it was eliminated early in the authors' study because it failed to meet the real-time requirement (Alzeini, Hameed, & Habaebi, 2014). Thus, in this research, only hardware-based solutions have been taken into consideration.
This article argues that all hardware-based OLAP solutions discussed in the literature can be categorized into three main approaches: multi-core CPUs, GPUs, and the heterogeneous approach. The latter has shown very positive results, under certain circumstances and when particular conditions are met, for achieving real-time OLAP answers.
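
The navigational operations named in the definition above (slicing, drill-down, rotation) can be sketched on a toy in-memory cube. This is a minimal illustration, not part of the article: the cube contents, the dimension names in `DIMS`, and the helpers `slice_cube` and `group_by` are hypothetical.

```python
from collections import defaultdict

# Hypothetical cube cells: (year, region, product) -> sales. Values are invented.
cube = {
    (2016, "North", "A"): 10,
    (2016, "South", "A"): 5,
    (2017, "North", "A"): 7,
    (2017, "North", "B"): 3,
    (2017, "South", "B"): 8,
}
DIMS = ("year", "region", "product")

def slice_cube(cube, dim, value):
    """Keep only the cells whose `dim` coordinate equals `value`."""
    i = DIMS.index(dim)
    return {k: v for k, v in cube.items() if k[i] == value}

def group_by(cube, dims):
    """Sum the measure over every dimension not listed in `dims`."""
    idx = [DIMS.index(d) for d in dims]
    out = defaultdict(int)
    for key, value in cube.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)

by_year = group_by(cube, ["year"])                   # consolidated view
by_year_region = group_by(cube, ["year", "region"])  # drill-down to a deeper level
by_region_year = group_by(cube, ["region", "year"])  # rotation of the same view
only_2017 = slice_cube(cube, "year", 2017)           # slice on one member
```

Every operation here is a full scan of the cells, which is harmless on five rows but indicates why warehouse-scale cubes put pressure on processing capacity.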

In heterogeneous computing systems, the GPU works in harmony with the CPU to improve response time while also reducing monetary cost (Govindaraju, Lloyd, Wang, Lin, & Manocha, 2004). Yet several practical limitations and barriers must be considered when designing or optimizing an algorithm for a heterogeneous system. Most of the difficulties arise because the physical structure and design purpose of the GPU differ considerably from those of the CPU. That is, the GPU is designed mainly to process integers rather than other data types (e.g., strings), and data transfer between the two components remains a persistent constraint.
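
One common way to work around the string-processing limitation described above is dictionary encoding: each distinct string is replaced by a small integer code, so that the heavy aggregation runs over integers only. The preview does not state that the authors use this technique; the sketch below is an assumption for illustration, it runs entirely on the CPU, and the helper names `dictionary_encode` and `count_codes` are hypothetical.

```python
def dictionary_encode(values):
    """Map each distinct string to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for v in values:
        # setdefault assigns the next free code only when v is new.
        code = codes.setdefault(v, len(codes))
        encoded.append(code)
    return encoded, codes

def count_codes(encoded, n_codes):
    """Aggregate over integer codes only; this is the part suited to a GPU kernel."""
    counts = [0] * n_codes
    for c in encoded:
        counts[c] += 1
    return counts

# Invented sample of a textual column.
terms = ["fever", "cough", "fever", "rash", "cough", "fever"]
encoded, codes = dictionary_encode(terms)
counts = count_codes(encoded, len(codes))

# Translate the integer result back to text for the analyst.
decoded = {term: counts[code] for term, code in codes.items()}
```

The encoded column is also more compact than the original strings, which matters because the data must cross the constrained link between CPU and GPU.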
