Hengam: A MapReduce-Based Distributed Data Warehouse for Big Data

Mohammadhossein Barkhordari, Mahdi Niamanesh

Source Title: International Journal of Artificial Life Research (IJALR) 8(1)

DOI: 10.4018/IJALR.2018010102

Abstract

When working with a volume of information that grows exponentially, the authors are confronted with big data. At this scale, data retrieval and analytics become important issues. There have been many attempts to solve data analytics problems using distributed platforms, but the main problem with the proposed methods is that they do not observe data locality. In this article, a MapReduce-based method called Hengam is proposed. In this method, unifying the data format gives each node data independence. The unified format increases information retrieval speed and prevents data exchange between nodes. The proposed method was evaluated using data items from an ICT company, and its information retrieval time was much better than that of other open-source distributed data warehouse software.

1. Introduction

When confronted with a high volume of data records generated by software systems, sensors, social networks, mobile devices, etc., we need systems to manage and utilize this data more than ever before. Current database management systems (DBMSs) cannot manage this amount of information, so a change is needed in this area. For several years, managing huge amounts of data has been known as big data management, where data volume is one dimension of big data. Other dimensions include data veracity, velocity, and variety, which are outside the scope of this paper.

One of the areas needing big data management is data warehousing. When working with a high volume of information generated by online transaction processing (OLTP) systems, creating a data warehouse for this information is critical. Information retrieval is one of the most important factors in data warehousing. This amount of data cannot be stored on a single server, so it must be distributed over several nodes. There are two architectures for distributed solutions: shared memory and storage, and shared nothing. In the shared memory and storage architecture, for example, Oracle Real Application Clusters (RAC) servers share memory and storage area network (SAN) storage. Such systems have complex configurations and high maintenance costs, and the limit on the number of nodes is another big problem. The other group of distributed solutions is shared nothing. These solutions usually use a distributed platform to store and retrieve information. One of the most popular groups of shared nothing solutions is Not Only SQL (NoSQL), which supports both structured and non-structured data. Usually, users' queries are converted to MapReduce tasks by the NoSQL interface. MapReduce (Dean et al., 2008) is a programming model that can solve big data problems using distributed and scalable solutions. The data entry speed in NoSQL data warehouses is very high because they do not have to observe DBMS constraints. NoSQL data warehouses usually do not have DBMS facilities such as indexes and a variety of data types.
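
As an illustration of how a query can be expressed as MapReduce tasks, the following minimal sketch runs the equivalent of SELECT region, SUM(amount) ... GROUP BY region over three shared nothing nodes. It is plain Python rather than any particular NoSQL interface, and the data, the function names, and the query itself are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical sales records spread over three shared-nothing nodes.
NODES = [
    [("north", 10), ("south", 5)],
    [("north", 7), ("east", 3)],
    [("south", 8), ("east", 2)],
]

def map_phase(records):
    """Emit (key, value) pairs; the grouping column becomes the key."""
    for region, amount in records:
        yield region, amount

def shuffle(mapped_per_node):
    """Group emitted values by key, as a framework does between the phases."""
    groups = defaultdict(list)
    for mapped in mapped_per_node:
        for key, value in mapped:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Aggregate the values of each key; SUM is the query's aggregate."""
    return {key: sum(values) for key, values in groups.items()}

def run_query(nodes):
    """Run map on every node, then shuffle and reduce the emitted pairs."""
    mapped = [list(map_phase(records)) for records in nodes]
    return reduce_phase(shuffle(mapped))

totals = run_query(NODES)
print(totals)
```

Note that the shuffle step is where records with the same key are brought together from different nodes, which is the point at which data exchange between nodes occurs.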

However, the main problem with distributed data warehouses is not the lack of DBMS facilities. The main problem is data locality: the data needed for a processing task often does not reside on the node that performs the processing. Many attempts have been made to overcome this problem, but to the best of our knowledge, there is no method that solves it completely.
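
To make the locality problem concrete, the following minimal sketch counts how many records would have to cross the network under two placement strategies. The records, the key-to-node assignment, and the function names are hypothetical and only illustrate the general idea, not any specific system.

```python
# Hypothetical records (key, value) and the node that processes each key.
RECORDS = [("a", 1), ("a", 2), ("b", 3), ("b", 4), ("c", 5), ("c", 6)]
KEY_TO_NODE = {"a": 0, "b": 1, "c": 2}

def place_round_robin(records, n_nodes):
    """Store records in arrival order, ignoring which node will process them."""
    nodes = [[] for _ in range(n_nodes)]
    for i, record in enumerate(records):
        nodes[i % n_nodes].append(record)
    return nodes

def place_by_key(records, n_nodes, key_to_node):
    """Store every record on the node that will process its key."""
    nodes = [[] for _ in range(n_nodes)]
    for record in records:
        nodes[key_to_node[record[0]]].append(record)
    return nodes

def records_moved(nodes, key_to_node):
    """Count records stored on a node other than the one processing their key."""
    return sum(
        1
        for node_id, records in enumerate(nodes)
        for record in records
        if key_to_node[record[0]] != node_id
    )

moved_rr = records_moved(place_round_robin(RECORDS, 3), KEY_TO_NODE)
moved_key = records_moved(place_by_key(RECORDS, 3, KEY_TO_NODE), KEY_TO_NODE)
print(moved_rr, moved_key)
```

With key-aware placement no record has to move before processing, whereas placement that ignores the processing node forces most records across the network.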

In this paper, we introduce the Hengam method, which offers a scalable and distributable data warehouse for big data; this method is based on MapReduce. In the proposed method, the data locality problem is solved completely, and traditional DBMSs can be used on distributed nodes. The proposed method was evaluated with the data items of an ICT company.
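
The preview does not describe Hengam's internals, so the following is only a sketch of the general idea of running a traditional DBMS on each node and merging small partial results. SQLite stands in for the per-node DBMS, and the table name, columns, and data are assumptions made for this example, not the authors' implementation.

```python
import sqlite3
from collections import Counter

# Hypothetical schema shared by all nodes (one unified format per node).
SCHEMA = "CREATE TABLE traffic (customer TEXT, volume INTEGER)"
QUERY = "SELECT customer, SUM(volume) FROM traffic GROUP BY customer"

def make_node(rows):
    """Create an independent in-memory database holding one partition."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO traffic VALUES (?, ?)", rows)
    return conn

def local_query(conn):
    """Map step: each node answers the query from its own data only."""
    return conn.execute(QUERY).fetchall()

def merge(partials):
    """Reduce step: combine the per-node partial sums into the final answer."""
    totals = Counter()
    for rows in partials:
        for customer, total in rows:
            totals[customer] += total
    return dict(totals)

nodes = [
    make_node([("alice", 100), ("bob", 50)]),
    make_node([("alice", 25), ("carol", 70)]),
]
result = merge(local_query(conn) for conn in nodes)
print(result)
```

In this sketch raw rows never leave their node; only the aggregated partial results are sent to the merging step.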
