Evaluating NoSQL Databases for Big Data Processing within the Brazilian Ministry of Planning, Budget, and Management

Ruben C. Huacarpuma, Daniel da C. Rodrigues, Antonio M. Rubio Serrano, João Paulo C. Lustosa da Costa, Rafael T. de Sousa Júnior, Lizane Leite, Edward Ribeiro, Maristela Holanda, Aleteia P. F. Araujo
Copyright: © 2015 | Pages: 18
DOI: 10.4018/978-1-4666-8147-7.ch011

Abstract

The Brazilian Ministry of Planning, Budget, and Management (MP) manages enormous amounts of data that is generated on a daily basis. Processing all of this data more efficiently can reduce operating costs, thereby making better use of public resources. In this chapter, the authors construct a Big Data framework to deal with data loading and querying problems in distributed data processing. They evaluate the proposed Big Data processes by comparing them with the current centralized process used by MP in its Integrated System for Human Resources Management (in Portuguese: Sistema Integrado de Administração de Pessoal – SIAPE). This study focuses primarily on a NoSQL solution using HBase and Cassandra, which is compared to the relational PostgreSQL implementation used as a baseline. The inclusion of Big Data technologies in the proposed solution noticeably increases the performance of loading and querying time.
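
As a rough illustration of the kind of loading operation the chapter benchmarks (not the authors' actual code), the sketch below inserts personnel-style payroll records into a Cassandra table with the DataStax Python driver; the contact point, keyspace, table, and column names are hypothetical stand-ins, not the real SIAPE schema.

```python
# Minimal sketch: loading payroll-style records into Cassandra.
# All identifiers (keyspace, table, columns) are illustrative only.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])           # contact point of the Cassandra cluster
session = cluster.connect("siape_demo")    # hypothetical keyspace

insert = session.prepare(
    "INSERT INTO payroll_records (servant_id, ref_month, agency, gross_pay) "
    "VALUES (?, ?, ?, ?)"
)

records = [
    (1001, "2014-01", "MP", 7500.00),
    (1002, "2014-01", "MP", 8200.50),
]

for servant_id, ref_month, agency, gross_pay in records:
    # Writes are routed by the partition key (servant_id in this sketch),
    # so the load is spread across the nodes of the cluster.
    session.execute(insert, (servant_id, ref_month, agency, gross_pay))

cluster.shutdown()
```

A comparable baseline load into PostgreSQL would issue the same inserts against a single server, which is the contrast the chapter's loading experiments explore.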
Chapter Preview

Big Data

The data managed today is highly diverse and complex. It stems from social network interactions, blog posts, tweets, photos, and other shared content. Devices continuously emit messages about what they or their users are doing, and scientists gather detailed measurements of the world around us from sensors embedded in mobile telephones, tablets, watches, cars, computers, and other equipment. The Internet itself has become a data source of colossal dimensions (Marz, 2013).

Big Data exceeds the capacity of conventional database systems: the data is too big, moves too fast, or does not fit into existing database architectures (Dumbill, 2012). Although the literature usually defines Big Data in terms of size alone, in this work we follow Russom (2011) and characterize it by the so-called 3Vs: Volume, Variety, and Velocity.
