Introduction and Problem Statement
The concept of data-driven decision-making is now widely known around the world, especially with the advent of big data and its promises. There is great enthusiasm for this technology because of its goals, even though a wide gap currently remains between its theoretical concepts and their realization; people upload more than 2.5 quintillion bytes of data per year.
To give the reader a better understanding of our work, we first provide an overview of big data and its related concepts, in particular cloud computing. Simply put, cloud computing is the outsourcing of data from local machines to a set of remote virtual servers, with the processing of that data exposed as web services. This yields savings both in the hardware capacity required for processing and in financial terms, following the pay-per-use model. In other words, cloud computing is a model for sharing resources over the web through web services so as to enable convenient, on-demand access: applications can be developed without installing a dedicated platform, and data can be stored without requiring large local disk space, using storage as a service (Greer, 2012; Li, 2012).
Concerning big data, the authors in (Jinbao, 2013) define it as an abstraction layer that provides a visual way to manage stored data of multiple structures and formats across global storage devices, in an architecture like the one shown in Figure 1.
Figure 1. General architecture of big data (Jinbao, 2013)
Privacy, timeliness, and scalability are the most important problems big data faces, starting from the first step of data acquisition. Searching for information in big data is like searching for a needle in a haystack because of the sheer volume of data, and retrieving relevant information has become a crucial problem. That is why building an efficient private information retrieval (PIR) model is a central challenge in computer science, and most classical methods present several problems, such as:
- Cryptographic schemes, where randomized encryption is necessary to secure the ciphertext but can degrade the retrieval model
- Quality of performance and of the returned results
- The choice of parameters (representation method, similarity measure)
- Response time
- The multiplicity of data
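To make the PIR setting concrete, the following toy sketch shows the classic information-theoretic two-server, XOR-based scheme; this is only an illustrative example of the general PIR idea (neither server alone learns the queried index), not the protocol studied in this paper:

```python
import secrets

def two_server_pir_query(db_size, index):
    """Build the two query vectors: server 1 receives a uniformly random
    bit vector, server 2 receives the same vector with the desired index
    flipped, so each vector alone reveals nothing about `index`."""
    q1 = [secrets.randbelow(2) for _ in range(db_size)]
    q2 = q1.copy()
    q2[index] ^= 1
    return q1, q2

def server_answer(db, query):
    """Each server XORs together the records selected by its query vector."""
    ans = 0
    for bit, rec in zip(query, db):
        if bit:
            ans ^= rec
    return ans

# Toy database of integer records.
db = [7, 13, 42, 99]
q1, q2 = two_server_pir_query(len(db), index=2)
a1 = server_answer(db, q1)
a2 = server_answer(db, q2)
# XORing both answers cancels every record except the queried one.
recovered = a1 ^ a2
```

The communication cost here is linear in the database size, which is exactly the kind of inefficiency that motivates optimized PIR models for big data.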
Our goal is to handle and settle the problems cited above; integrating optimization algorithms is an important step toward obtaining scalable and fast services (Yaga, 2012).
Optimizing retrieval models is a wide area with various techniques, which can target query optimization or any other step of the recovery process. In this paper, we apply a set of meta-heuristic algorithms to optimize a secure retrieval protocol, PIR (Private Information Retrieval), in big data, in order to improve its efficiency. We also include a clear protocol, that is, a retrieval model without any security layer, to allow a fair comparison and to study the influence of cryptosystems on retrieval models.
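As a minimal sketch of how a meta-heuristic can tune a retrieval model, the example below applies simulated annealing (one possible choice among many meta-heuristics) to a single hypothetical retrieval parameter, such as a similarity threshold; the objective function and its optimum at 0.6 are illustrative assumptions, not measurements from the protocol studied here:

```python
import math
import random

def simulated_annealing(score, lo, hi, steps=500, t0=1.0, seed=0):
    """Maximize `score` over one real parameter in [lo, hi].
    `score` stands for a retrieval-quality measure (e.g., precision
    as a function of a similarity threshold)."""
    rng = random.Random(seed)
    x = rng.uniform(lo, hi)
    cur_s = score(x)
    best_x, best_s = x, cur_s
    for k in range(1, steps + 1):
        t = t0 / k  # cooling schedule: temperature shrinks over time
        # Gaussian perturbation, clipped to the search interval.
        cand = min(hi, max(lo, x + rng.gauss(0, 0.1 * (hi - lo))))
        s = score(cand)
        # Always accept improvements; accept worse moves with
        # Boltzmann probability to escape local optima.
        if s > cur_s or rng.random() < math.exp((s - cur_s) / t):
            x, cur_s = cand, s
            if s > best_s:
                best_x, best_s = cand, s
    return best_x, best_s

# Hypothetical quality curve peaking at threshold 0.6.
quality = lambda th: -(th - 0.6) ** 2 + 1.0
th, q = simulated_annealing(quality, 0.0, 1.0)
```

The same loop applies unchanged whether the objective measures result quality, response time, or a weighted combination, which is why such meta-heuristics are attractive for tuning retrieval protocols.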
Many research efforts are currently in progress to ensure greater scalability and speed of PIR protocols in big data. In this section, we present a set of published works concerning retrieval and PIR models in big data and cloud computing.