
A Framework for Evaluating Design Methodologies for Big Data Warehouses: Measurement of the Design Process

Francesco Di Tria, Ezio Lefons, Filippo Tangorra

Source Title: International Journal of Data Warehousing and Mining (IJDWM) 14(1)

DOI: 10.4018/IJDWM.2018010102

Abstract

This article describes how the evaluation of modern data warehouses considers the new solutions adopted to face the radical changes caused by the need to reduce storage volume while increasing the velocity of multidimensional design and data elaboration, even in the presence of unstructured data that are useful for providing qualitative information. The aim is to set up a framework for evaluating the physical and methodological characteristics of a data warehouse, built by considering the factors that affect the data warehouse's lifecycle when the Big Data issues (Volume, Velocity, Variety, Value, and Veracity) are taken into account. The contribution is the definition of a set of criteria for classifying Big Data Warehouses on the basis of their methodological characteristics. Based on these criteria, the authors define a set of metrics for measuring the quality of Big Data Warehouses with respect to the design specifications. They show through a case study how the proposed metrics can check the eligibility of methodologies falling into different classes in the Big Data context.

1. Introduction

The design of data warehouses in the context of Big Data requires new solutions for meeting the challenges and taking advantage of the opportunities introduced by novel data sources, such as social networks, that also provide companies with qualitative information (Value issue) about user preferences (Waters & Jamal, 2011). Indeed, these data are generated daily (Velocity issue) in massive quantities (Volume issue) (Chen et al., 2014) and usually appear in both structured and unstructured forms (Variety issue) (Buneman et al., 1997; Rehman et al., 2012). To be used effectively for business analytics and decision making, these data must be validated according to a data quality model that checks their degree of reliability (Veracity issue). Each of these issues is addressed by emerging methods for data warehouse design.

First, the Value issue concerns the realization of a good-quality schema, where all the data sources contribute to the data warehouse in the same way. A good-quality schema is one that allows the extraction of all the information the decision makers are interested in and that does not violate the constraints in the data sources. To achieve this, hybrid methodologies are adopted, because they combine the best features of traditional methodologies. By applying such methodologies, the designer produces a multidimensional schema that not only agrees with the data sources but also misses no requirement and discards no data source. On the other hand, the workflow of these methodologies is quite complex because they integrate and reconcile both the requirement-oriented and the data-oriented approaches (Mazón & Trujillo, 2009; Mazón et al., 2007; Di Tria et al., 2015; Di Tria et al., 2012).
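The three conditions a hybrid methodology aims to satisfy (agreement with the data sources, no missed requirement, no discarded data source) can be illustrated with a minimal Python sketch. The function `check_schema`, the flat attribute-name representation of schemas, and the sample sources are hypothetical simplifications introduced here for illustration; they are not the authors' metrics or method.

```python
def check_schema(schema, requirements, sources):
    """Report how a candidate schema relates to requirements and sources.

    schema       -- attribute names in the multidimensional schema
    requirements -- attribute names the decision makers asked for
    sources      -- mapping: source name -> attribute names it provides
    (All three representations are illustrative assumptions.)
    """
    schema_set = set(schema)
    provided = set()
    for attrs in sources.values():
        provided.update(attrs)

    return {
        # Requirements that the schema does not cover.
        "missed_requirements": sorted(set(requirements) - schema_set),
        # Schema attributes that no data source can feed.
        "unsupported_attributes": sorted(schema_set - provided),
        # Sources that contribute nothing to the schema.
        "discarded_sources": sorted(
            name for name, attrs in sources.items()
            if not schema_set & set(attrs)
        ),
    }


report = check_schema(
    schema=["sales_amount", "product", "store"],
    requirements=["sales_amount", "product", "customer_sentiment"],
    sources={
        "orders_db": ["sales_amount", "product", "store"],
        "social_feed": ["customer_sentiment"],
    },
)
```

In this toy case the schema agrees with the sources but misses the `customer_sentiment` requirement and leaves `social_feed` unused, which is the kind of gap a hybrid approach is meant to prevent.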

The Velocity issue is related to the need to integrate new data sources as soon as possible and to accept new business requirements without performing a complete redesign. The aim is to modify an existing schema quickly so as to provide updated and accurate information in a timely manner with respect to the most recent business goals. This aim can be reached using automatic and agile techniques: the former simulate the reasoning of an expert designer, avoiding repetitive tasks and human errors (Di Tria et al., 2014; Phipps & Davis, 2002), while the latter introduce adjustments to a multidimensional schema, letting the data warehouse evolve as business requirements change (Corr with Stagnitto, 2011).
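As a rough illustration of adjusting an existing schema rather than redesigning it, the following Python sketch uses the standard `sqlite3` module to add a new analysis attribute to a populated fact table. The table `fact_sales`, the column `channel`, and the sample rows are hypothetical, and a real agile methodology involves far more than a single schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Initial fact table, already loaded with data.
conn.execute("CREATE TABLE fact_sales (product TEXT, amount REAL)")
conn.execute("INSERT INTO fact_sales VALUES ('laptop', 1200.0)")

# New business requirement: analyse sales by channel.
# The schema is adjusted in place; existing rows are kept and
# receive the default value instead of being reloaded.
conn.execute(
    "ALTER TABLE fact_sales ADD COLUMN channel TEXT DEFAULT 'unknown'"
)
conn.execute("INSERT INTO fact_sales VALUES ('phone', 500.0, 'web')")

rows = conn.execute(
    "SELECT product, amount, channel FROM fact_sales ORDER BY product"
).fetchall()
```

The previously loaded row remains queryable under the evolved schema, so analyses that predate the change keep working.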

The Volume issue concerns the problem of building a data warehouse without importing, replicating, and storing tens of terabytes through the ETL process. The solution is based on a virtual data warehouse, where the movement of data among systems is avoided. As a further consequence, the delays of the import phase for feeding the data warehouse are eliminated (Farooq & Sarwar, 2010) and the data to be used in the analytical phase are immediately available. As an alternative to the virtual data warehouse approach, emergent non-relational models adopted in NoSQL databases provide more flexibility, since they allow denormalized and join-less schemas that can be exploited for analysing data according to novel paradigms, besides the traditional OLAP operators (Dehdouh et al., 2014). As a result, non-relational models are actually replacing traditional logical models (viz., ROLAP and MOLAP) (Chevalier et al., 2015).
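The idea behind a virtual data warehouse can be sketched in Python with the standard `sqlite3` module, under the simplifying assumption that the operational source and the analytical layer live in the same database. The analytical layer is defined as a view over the source, so no rows are copied and new source data are visible without a load step. The names `source_sales` and `sales_by_product` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_sales (product TEXT, region TEXT, amount REAL);
    INSERT INTO source_sales VALUES
        ('laptop', 'north', 1200.0),
        ('laptop', 'south',  800.0),
        ('phone',  'north',  500.0);

    -- Virtual analytical layer: a view, so no data is replicated.
    CREATE VIEW sales_by_product AS
        SELECT product, SUM(amount) AS total
        FROM source_sales
        GROUP BY product;
""")


def totals(connection):
    """Return {product: total} as computed by the view at query time."""
    cursor = connection.execute(
        "SELECT product, total FROM sales_by_product ORDER BY product"
    )
    return dict(cursor.fetchall())


before = totals(conn)

# A new operational record arrives in the source system.
conn.execute("INSERT INTO source_sales VALUES ('phone', 'south', 300.0)")

# The view reflects it immediately, with no import phase.
after = totals(conn)
```

The trade-off is that every analytical query is computed against the source at query time, which is why this sketch says nothing about performance on large volumes.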

To address the Variety issue, recent papers have introduced a semantic level in multidimensional design, on the basis of an ontological approach (He et al., 2011; Vranesic & Rovan, 2009; Di Tria et al., 2013; Khouri & Bellatreche, 2011; Thenmozhi & Vivekanandan, 2013). Since an ontology is a machine-processable conceptual representation of a domain of interest, it is used for automatically resolving syntactic and semantic inconsistencies in the schema integration process, even in the presence of unstructured data.
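A very reduced Python sketch of ontology-driven schema integration follows. It assumes the ontology can be approximated by a dictionary that maps each concept to the terms that denote it; real ontologies are far richer (hierarchies, relations, constraints), and the names `ONTOLOGY`, `concept_of`, and `integrate`, as well as the sample schemas, are hypothetical.

```python
# Toy "ontology": concept -> terms that denote it (an assumption;
# a real ontology would also carry relations and constraints).
ONTOLOGY = {
    "customer": {"customer", "client", "cust_name", "buyer"},
    "revenue": {"revenue", "amount", "sales_total"},
}


def concept_of(attribute, ontology=ONTOLOGY):
    """Return the concept an attribute name denotes, or None."""
    key = attribute.strip().lower()  # resolves purely syntactic differences
    for concept, terms in ontology.items():
        if key in terms:  # resolves synonymy between sources
            return concept
    return None


def integrate(schemas, ontology=ONTOLOGY):
    """Group attributes from several source schemas by shared concept."""
    unified = {}
    for source, attributes in schemas.items():
        for attribute in attributes:
            concept = concept_of(attribute, ontology)
            if concept is not None:
                unified.setdefault(concept, []).append((source, attribute))
    return unified


unified = integrate({
    "crm": ["Client", "Amount"],
    "erp": ["cust_name", "sales_total", "warehouse"],
})
```

Here `Client` and `cust_name` are reconciled under the single concept `customer`, while `warehouse`, which the toy ontology does not cover, is left out for a designer to review.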
