Query Languages in NoSQL Databases


Maristela Holanda (University of Brasilia, Brazil) and Jane Adriana Souza (University of Brasilia, Brazil)
DOI: 10.4018/978-1-4666-8767-7.ch015

Abstract

This chapter investigates how NoSQL (Not Only SQL) databases provide query languages and data retrieval mechanisms. Users attest to many advantages in using NoSQL databases for specific applications; however, they also report that querying and retrieving data easily remains a problem. NoSQL operations require that, during project design, queries be planned as part of the application code. The authors intend to contribute to the investigation of querying by considering different types of NoSQL databases.

Introduction

Edgar Codd introduced the relational model in 1970, with the main objective of supporting data independence and integrity (Silva, 2011). Because the relational model has a strong theoretical foundation, it has been widely accepted in academia. Over the past few decades, the relational database has become the dominant database model.

Padhy et al. (2011) note that, since relational databases first appeared, several other database models have been introduced, such as object databases (db4o, Velocity) and XML databases (BaseX, Berkeley DB XML). Recently, a new database model known as NoSQL has emerged and promises to dominate the market by guaranteeing high performance, handling modern workloads, and processing large volumes of unstructured data.

Database systems must provide data structures and advanced techniques to improve access and retrieval activities for data management (Elmasri & Navathe, 2011). Traditional relational database systems usually have an interactive interface to run SQL (Structured Query Language) commands. The interactive interface is very convenient for ad hoc queries. It is also possible to interact with the database through application programs, using an API (Application Programming Interface) or SQL embedded in the application. SQL helps minimize the impedance mismatch problem, which arises from differences between the database model and the programming language model.

SQL can be considered one of the main reasons for the success of relational databases, because it is a widely used language with instructions for data definition, queries, and updates. In addition, SQL provides a declarative, high-level interface, so users can state what result they want without describing how to compute it.
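The two points above, access through an application-level API and the declarative nature of SQL, can be illustrated with a minimal sketch. It uses Python's standard sqlite3 module and an in-memory database; the author table, its columns, and the helper authors_with_min_chapters are hypothetical names chosen only for this illustration.

```python
import sqlite3


def authors_with_min_chapters(conn, min_chapters):
    """Return author names with at least min_chapters chapters, sorted by name."""
    # The query states WHAT is wanted; the engine decides HOW to retrieve it.
    cur = conn.execute(
        "SELECT name FROM author WHERE chapters >= ? ORDER BY name",
        (min_chapters,),
    )
    return [row[0] for row in cur.fetchall()]


# Hypothetical schema and data, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (name TEXT, chapters INTEGER)")
conn.executemany(
    "INSERT INTO author (name, chapters) VALUES (?, ?)",
    [("Ana", 3), ("Bruno", 1), ("Carla", 2)],
)

result = authors_with_min_chapters(conn, 2)
print(result)
```

The application code handles only the connection and the result rows; filtering and ordering are expressed declaratively in the SQL string.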

Research in cloud computing predicts new architectures for data management, alongside a rise in structured and unstructured data. NoSQL databases are used in cloud computing environments. How to use the new database engine architectures and declarative programming languages, and how structured and unstructured data interact, remain points to investigate (Elmasri & Navathe, 2011; Agrawal et al., 2009).

According to Hecht and Jablonski (2011), with the increasing amount of data generated by Web 2.0, applications have storage requirements that exceed the capacities and possibilities of traditional relational databases. NoSQL databases emerged to store and process massive data, and their queries run over huge amounts of data. The possibility of providing user-friendly query languages and presenting results efficiently has been the object of academic study on NoSQL databases. Recently, studies on a data query language to unify NoSQL query interfaces have appeared as a possibility (Bach & Werner, 2014; Nasholm, 2012). One idea is to create a simple interface where users define a query independently of the underlying NoSQL database. This remains a challenge because there are many NoSQL databases with different data models. This chapter highlights the data query and retrieval mechanisms of some NoSQL databases and identifies how the different NoSQL databases handle these operations.

The organization of this chapter is as follows. Section 2 presents background on NoSQL databases. Section 3 describes NoSQL query languages. Section 4 presents a comparative analysis of NoSQL languages, and the chapter closes with recommendations and future research directions.


Background

The need to manage big data encouraged the development of new database models. Given the limitations of traditional relational databases, several factors made this development inevitable (Indrawan-Santiago, 2012):

  • The advent of cloud computing, with easy-to-access parallel computing;

  • The proliferation of Web 2.0 applications and the interest of academia in e-science applications;

  • The need to deal with large amounts of data;

  • The need to manage both structured and unstructured data.

Based on these factors, NoSQL databases were designed and developed.

Key Terms in this Chapter

SQL Language: The query language for traditional relational databases. It contains instructions for data definition, data manipulation, and querying. It usually offers a high-level declarative interface in which end users can write their queries.

MapReduce: A processing engine whose idea is to divide a job into many tasks. Based on how a table is distributed, the algorithm divides an ad hoc query into sub-queries that run at the same time; with replicas, one sub-query is mapped to k+1 sub-queries. It consists of map and reduce functions that execute these tasks.
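As a rough illustration of the map and reduce functions mentioned above, the following single-process sketch counts words across documents. It only shows the shape of the three phases (map, group by key, reduce); a real MapReduce engine runs them in parallel across nodes and handles replicas, which this sketch does not attempt. All function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run map, shuffle, and reduce sequentially over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


counts = map_reduce(["NoSQL stores data", "SQL queries data"])
print(counts)
```

Each call to map_phase depends on a single document and each call to reduce_phase on a single key, which is what allows an engine to distribute these calls as independent tasks.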

Lucene: A search engine library that provides an API for indexing and searching documents. It is written in Java and is open source software from the Apache Software Foundation, released under the Apache License.

Big Data: A term for very large volumes of data. Informally, it refers to data that cannot fit on a single machine.

JSON: JavaScript Object Notation is a lightweight text-based format to represent data such as lists (arrays), maps (objects), strings, Booleans, and numbers. It has no native date type, so dates are usually encoded as strings.
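A small round trip with Python's standard json module shows how these data types appear in JSON text. The record and its field names are hypothetical, and the date is stored as an ISO-style string by assumption, since JSON itself defines no date type.

```python
import json

# Hypothetical record covering a list, a map, a Boolean, and two kinds of numbers.
record = {
    "name": "sensor-1",
    "tags": ["nosql", "query"],
    "location": {"room": "lab", "floor": 2},
    "reading": 21.5,
    "active": True,
    "created": "2015-01-01",  # date kept as a string by convention
}

text = json.dumps(record, sort_keys=True)  # Python object -> JSON text
restored = json.loads(text)                # JSON text -> Python object

print(text)
print(restored == record)
```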

Sharding: The technique of dividing a table across many nodes. The data are spread over the nodes according to a chosen criterion, such as a hash or a range of a key.
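One common criterion is a hash of the record key. The sketch below assumes that choice and a fixed number of shards; shard_for and distribute are hypothetical helper names, and production systems often use more elaborate schemes (for example range partitioning or consistent hashing) that are not shown here.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a string key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(keys, num_shards):
    """Assign every key to exactly one shard; returns {shard index: [keys]}."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


keys = ["user:1", "user:2", "user:3", "user:4", "user:5"]
placement = distribute(keys, 3)
print(placement)
```

A stable hash such as SHA-256 is used instead of Python's built-in hash(), whose value for strings can change between interpreter runs and would therefore send the same key to different shards.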

BigTable: A sparse, distributed, multidimensional sorted map. A storage system created by Google to manage data at very large scale.

Erlang: A programming language from Ericsson for distributed, fault-tolerant applications that must run in real-time environments. It is considered well suited to programming distributed systems.

CAP: A theorem formulated by Professor Eric Brewer that names three properties: Consistency, Availability, and Partition Tolerance. The theorem implies that a database design can guarantee only two of the three properties.
