Introduction
The technological revolution, which integrates multiple information sources and extends computer science into different sectors, has led to an explosion in data quantities, reflected in the scaling of volumes, numbers, and types of data. These massive increases have driven the development of new techniques for locating and accessing data. The latest steps in this evolution have produced new technologies: Cloud Computing and Big Data (Gantz & Reinsel, 2012).
Several studies show that the volume of stored computer data grows by about 30% per year, moving through successive orders of magnitude (giga, tera, peta, exa, zetta ...) (Lyman, 2003). This “data deluge” is mainly due to the plethora of sources that create digital data (e.g., the Internet, computers, mobile phones, digital cameras, and intelligent devices). Faced with this deluge, it is necessary to measure information quality and to extract hidden predictive information from these large volumes of data (Batini et al., 2016; Katerattanakul & Siau, 1999; Lee & Siau, 2001).
The scaling in volume and the geographical distribution of data are already current problems, compounded by the diversity of types in these masses of data, which are created and accumulated without organization or structure. The large and growing share of unstructured data among all digital data is one of the factors that led to the new concept of Big Data: recent studies put the percentage of unstructured data at 90% (Gantz & Reinsel, 2012). Consequently, extending management and processing to multi-structured data (structured, semi-structured, or unstructured) is considered a fundamental issue for both Big Data and Cloud.
Data management systems have evolved constantly since the early days of computer science. Since the nineties, computer designs have used data warehouses (Inmon, 2005), which are generally centralized on servers connected to storage arrays that are hard to scale. These architectures suffer from scalability issues, since they cannot easily add capacity on demand. In fact, the growing volume of data, the wide variety of multi-structured data (e.g., video, text, web), and the velocity or frequency at which these data are generated impose new challenges in terms of response time and processing delay.
This scientific revolution has confronted researchers in the field with new problems, which have led to the development of new technologies to host and manage these large volumes of data. This new computing era has forced researchers to gradually abandon traditional database management systems, limited by ACID constraints (Atomicity, Consistency, Isolation, Durability), and to search for better solutions to adapt their systems to these changes. These changes led to new Big Data concepts and contributed to the emergence of the NoSQL (Cattell, 2011; Oussous et al., 2017) and NewSQL (Aslett, 2011; Piekos, 2015) movements.
Currently, many users of traditional relational DataBase Management Systems (DBMS), referred to as “SQL”, want to migrate to these new non-relational “NoSQL” solutions in order to anticipate the future explosion of their data and to support unstructured data (Hsieh et al., 2017). However, the risk of this changeover must be measured and concerns must be addressed in order to justify a possible transition from SQL to NoSQL.