1. Introduction
NoSQL databases are now widely used to store and process Big Data, mainly because of their high performance, scalability, and flexibility (Schram & Anderson, 2012; Tudorica & Bucur, 2011; Pokorny, 2013). These benefits stem largely from the ease of data distribution and sharding in NoSQL systems. Relational Database Management Systems (RDBMS) are known for guaranteeing the ACID properties; however, their performance and scalability degrade once the database size starts to grow dramatically. Many applications do not require all of the ACID properties. For example, social network applications can tolerate relaxed consistency. Moreover, NoSQL databases do not need a strict schema, which gives users the flexibility to update data easily (Okman, Gal-Oz, Gonen, Gudes, & Abramov, 2011). NoSQL databases support availability, partition tolerance, and high performance, but they relax data consistency compared to RDBMS. Most NoSQL databases support eventual consistency instead. This relaxation is acceptable in most cases, because any change in the data propagates to all nodes within a few milliseconds and the data quickly becomes consistent. NoSQL databases are therefore attractive and are used in many large-scale applications.
One of the major disadvantages of NoSQL databases is their weak security support, which raises concerns for users. Specifically, NoSQL databases store data in flat files, which increases security concerns for data at rest. The concern becomes more severe when NoSQL-based solutions are deployed on untrusted servers, particularly in the cloud, because any unauthorized access easily breaches data confidentiality. Some NoSQL databases provide data encryption methods to ensure data security and confidentiality (Nafi, Kar, Hoque, & Hashem, 2013; Meyer & Schwenk, 2013). However, data at rest and data in transit remain vulnerable because these systems do not support query processing on encrypted data. DES (Data Encryption Standard), AES (Advanced Encryption Standard), and hashing techniques (MD5, SHA) are commonly used to protect data at rest, but the data must be decrypted before the database can be queried. Additionally, these systems require transport-layer security protocols and methods, e.g., SSL, IPSec, TLS, and SSH, to secure data in transit. These techniques add processing and communication overhead to server-to-server and server-to-client communication. Moreover, plaintext data exists in memory, because query processing over encrypted data is not supported.
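The limitation described above, that conventionally encrypted records must be decrypted before a query predicate can be evaluated, can be illustrated with a minimal sketch. It is not the approach of any particular NoSQL system. The toy stream cipher built from SHA-256 is only a stand-in for a real cipher such as AES (Python's standard library does not include one) and must not be used in practice. All names (`keystream`, `encrypt`, `decrypt`, `find_by_city`) and the sample documents are hypothetical.

```python
import hashlib
import json
import os


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode (illustration only, not AES)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt one record; a fresh random nonce is prepended to the ciphertext."""
    nonce = os.urandom(16)
    ks = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, ks))


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Recover the plaintext record from nonce + ciphertext."""
    nonce, body = blob[:16], blob[16:]
    ks = keystream(key, nonce, len(body))
    return bytes(c ^ k for c, k in zip(body, ks))


def find_by_city(key: bytes, stored: list, city: str) -> list:
    """Equality query over encrypted records.

    The predicate cannot be evaluated on ciphertext, so every record is
    decrypted first and its plaintext sits in memory during the scan.
    """
    matches = []
    for blob in stored:
        doc = json.loads(decrypt(key, blob))  # plaintext exposed here
        if doc["city"] == city:
            matches.append(doc)
    return matches


# Hypothetical sample data: documents are stored only in encrypted form.
key = os.urandom(32)
docs = [{"name": "alice", "city": "Oslo"}, {"name": "bob", "city": "Lima"}]
stored = [encrypt(key, json.dumps(d).encode()) for d in docs]
result = find_by_city(key, stored, "Oslo")
```

In this sketch the query has to decrypt every stored record, including those that do not match. This makes both the processing overhead and the in-memory plaintext exposure noted above visible.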
NoSQL databases were originally designed and developed without a focus on data security. As a result, third-party tools and services have been used to secure them (Okman, Gal-Oz, Gonen, Gudes, & Abramov, 2011). Moreover, the distributed architecture of NoSQL databases exposes them to various security vulnerabilities, e.g., accidental modifications, illegal use or unauthorized access, loss of administrative or logical controls, and insecure communication channels (Zahid, Masood, & Shibli, 2014; Baccam, 2010). This brings new challenges in protecting sensitive and critical data in NoSQL databases from unauthorized access and illegal usage. Classic cryptographic techniques can be employed to protect against many of these vulnerabilities and to preserve data security during storage and communication. However, the original data remains vulnerable during query processing in the NoSQL database, especially when processing personal information related to healthcare and personal financial transactions (Lin, Tsai, & Lin, 2014). Moreover, NoSQL databases are often deployed on public clouds to leverage adaptive resource provisioning (Iqbal W., Dailey, Carrera, & Janecek, 2011) and pay-as-you-go pricing, which makes their data vulnerable and easy for unintended users to access.