NoSQL Database Phenomenon

Copyright: © 2018 | Pages: 60
DOI: 10.4018/978-1-5225-3385-6.ch002

Abstract

The chapter explains that NoSQL databases emerged as an attempt to resolve the limitations of relational databases in coping with Big Data. The issue of Big Data is related to extensive demands for the storage and management of complex, dynamic, evolving, distributed, and heterogeneous data from different sources and platforms. The chapter provides an overview of the technologies, including Google File System (GFS), MapReduce, Hadoop, and Hadoop Distributed File System (HDFS), which were the first responses to Big Data challenges and the main driving forces for the development of NoSQL databases. The chapter also asserts that NoSQL is an umbrella term covering numerous databases with different architectures and purposes, which can be classified into four basic categories: key-value, column-family, document, and graph stores. The chapter discusses the general features of NoSQL databases, as well as the specific features of each of the four basic categories.

Big Data Challenges

Since 2012, the term Big Data has become increasingly mainstream. However, many different (and sometimes unclear) definitions exist. According to Wu, Buyya, and Ramamohanarao (2016), Big Data definitions can be grouped based on:

  • Domains (Vs): One of the earliest definitions was based on 3Vs: (1) volume; (2) velocity; and (3) variety (Laney, 2001). Volume represents continuous and cumulative data growth. Velocity is the speed of data transfer from one point to another (real-time data streaming, YouTube uploads, social media posts, e-mail, etc.). Variety refers to the different types of data formats. IBM (2016) added a fourth domain, veracity, to refer to data uncertainty (e.g., the trustworthiness and quality of data). Microsoft (2016) added variability and visibility, which extended the number of domains to six. Variability relates to data complexity (e.g., a huge number of data attributes). Visibility refers to the need for a complete picture of the data in the decision-making process. The number of domains (Vs) has grown to as many as 11 (Elliott, 2013).

  • Technology: This type of definition focuses on technological support to Big Data (Hadoop, MapReduce, Spark, etc.).

  • Application: This type of definition focuses on applications based on Big Data (machine learning, data mining, social media analytics, etc.).

  • Signals: Signal definitions relate to application definitions. However, signals focus on timing and finding new signal patterns in Big Data sets.

  • Opportunities: This definition focuses on the potential of Big Data in fields of human work and living, especially as a driving force of developments in new technologies.

  • Metaphor: Metaphor views Big Data as an extension of the human brain.

  • New Term for Old Stuff: This view treats the Big Data phenomenon as a buzzword for existing concepts (e.g., business intelligence, data mining, social media analytics).

These definitions, however, do not focus on the use of data to resolve business problems. Big Data should be used as a powerful tool in the decision-making process. This requires viewing Big Data through the following aspects of domain knowledge (Wu, Buyya, & Ramamohanarao, 2016):

  • Data domain (searching for patterns)

  • Business intelligence domain (making predictions)

  • Statistical domain (making assumptions). (p. 11)
