NoSQL Databases

Manoj Manuja (Education and Research Department, Infosys Ltd, India) and Neeraj Garg (Education and Research Department, Infosys Ltd, India)
Copyright: © 2015 | Pages: 13
DOI: 10.4018/978-1-4666-5888-2.ch037




Over the years, companies have made critical business and financial decisions based on the transactional data stored in their relational databases. Beyond this business-critical data, there is a treasure trove of non-traditional, non-relational, less structured data, such as Weblogs, emails, social media files, sensor outputs, and audio and video files, that can be mined for useful information. Over the last few years, companies have started storing this data because the cost of both data storage and computing power has fallen. This data has three primary characteristics:

  • Data is generated in huge Volumes.

  • Data is available in a Variety of data formats.

  • Data is gathered with high Velocity.

These three Vs are what characterize Big Data: data that is huge in volume, arrives in a variety of formats and is generated at high velocity. There are broadly two ways to store and manage big data: relational databases and non-relational databases. Most existing DBMSs (Database Management Systems) are relational in nature, scale vertically and have predefined schemas. Most importantly, they lack a mature way to analyze unstructured data and recognize data patterns at run time. Organizations are therefore looking for a DBMS that can handle structured as well as unstructured data and can scale horizontally as well as vertically. It should accommodate changes to the database schema during the project life cycle and provide the flexibility to update the schema as and when required. Non-relational (NoSQL) databases have these characteristics and support the 3Vs (Volume, Variety and Velocity) of big data along with the normal operational functionalities of SQL (Structured Query Language) (Lai, 2010; LaValle, et al., 2011).
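The contrast between a predefined schema and a flexible one can be illustrated with a minimal sketch. It is not taken from the chapter: Python's built-in sqlite3 module stands in for a relational DBMS, a plain list of dictionaries stands in for a document collection, and the table, field and customer names are hypothetical.

```python
import sqlite3

# Relational side: the schema is declared before any data is stored.
# (sqlite3 is only a stand-in for an RDBMS such as MySQL.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Asha')")

try:
    # 'twitter_handle' is not a declared column, so the insert is rejected
    # until the schema is altered.
    conn.execute(
        "INSERT INTO customers (id, name, twitter_handle) "
        "VALUES (2, 'Ravi', '@ravi')"
    )
    relational_accepted = True
except sqlite3.OperationalError:
    relational_accepted = False

# Document side: a list of dicts stands in for a document collection.
# Each document carries its own fields, so a new field needs no schema change.
collection = []
collection.append({"_id": 1, "name": "Asha"})
collection.append({"_id": 2, "name": "Ravi", "twitter_handle": "@ravi"})

# Collect every field name that appears in at least one document.
fields_seen = sorted({key for doc in collection for key in doc})

print(relational_accepted)
print(fields_seen)
```

In the relational case the new attribute requires a schema change first, whereas the document-style collection simply stores the extra field alongside the existing records.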


Main Focus Of The Article

In this paper, we explore and examine the characteristics of relational and non-relational databases in the context of managing big data. The next section gives a brief overview of relational and non-relational databases, with MySQL as the example RDBMS (Relational Database Management System) and MongoDB as the example non-relational (NoSQL) database. The three sections that follow compare these two databases in detail on three critical characteristics: “High Performance and Flexible Schema,” “Replication” and “Sharding.” The section “Analysis and Discussion” provides a consolidated feature-by-feature analysis. The last two sections provide the conclusion and the references used.

Key Terms in this Chapter

Replication: Creating and keeping a copy of the data.

NoSQL: A class of data stores well suited to storing unstructured data. It supports a variety of dynamic data models, such as document-based storage and graph-based storage.

MongoDB: A document-based NoSQL database management system used to store both operational data and humongous data. It provides schema flexibility, increased performance and high availability of data.

Big Data: Data that is generated at rates of terabytes or petabytes per second and is so large in volume that it does not fit in existing traditional relational database storage.

Horizontal Scalability: When data is expected to grow at a very high rate, accommodating that growth by adding new nodes (new commodity hardware) rather than extending the capacity of existing hardware (adding RAM, hard disk, etc.) is called horizontal scalability.

Sharding: Another name for database partitioning, used by some NoSQL database vendors such as MongoDB.

Relational Database Management Systems (RDBMS): Used to store related sets of data in tables consisting of rows and columns, ensuring the Atomicity, Consistency, Isolation and Durability (ACID) of data for high-end applications involving a large number of frequent financial transactions.
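The Replication, Horizontal Scalability and Sharding terms above can be made concrete with a toy in-memory sketch. It is an illustration under stated assumptions, not MongoDB's actual mechanism: the class name ShardedStore, the hash-modulo placement rule and the user keys are all hypothetical, and each "node" is just a Python dictionary.

```python
import hashlib


class ShardedStore:
    """Toy store: data is partitioned across shards, and each shard
    keeps several replicas (copies) of its partition."""

    def __init__(self, num_shards, replicas_per_shard=2):
        # shards[i] is a list of replica dicts that hold the same data.
        self.shards = [
            [{} for _ in range(replicas_per_shard)] for _ in range(num_shards)
        ]

    def _shard_index(self, key):
        # Assumed placement rule: hash the key, then take it modulo the
        # number of shards, so keys spread across the nodes.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        # Replication: write a separate copy to every replica of the shard.
        for replica in self.shards[self._shard_index(key)]:
            replica[key] = dict(value)

    def get(self, key, replica=0):
        # Any replica of the owning shard can serve the read.
        return self.shards[self._shard_index(key)][replica].get(key)


# Horizontal scaling here means choosing more shards (nodes), not a bigger one.
store = ShardedStore(num_shards=3)
for i in range(100):
    store.put(f"user{i}", {"n": i})

# Number of keys held by the first replica of each shard.
sizes = [len(shard[0]) for shard in store.shards]
print(sizes)
print(store.get("user7", replica=0), store.get("user7", replica=1))
```

Each key lives on exactly one shard, and every replica of that shard holds its own copy, so a read can be served even if one copy is unavailable. With a simple modulo rule like this one, changing the number of shards moves most keys to a different shard, which is one reason production systems use more elaborate placement schemes.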
