1. Introduction
Advances in big data have opened several opportunities for research in the coming years. Big data is used to discover knowledge across different sectors of society. It comprises vast volumes of data that are generated through digital processes and shared among many individuals over the web. Big data has paved the way for better-informed decision making, and this decision support has motivated many users to keep their data online (Xuezhen, et al., 2014). Sharing data, however, raises several security concerns. The ability to store personal information is a major issue in the context of privacy preservation (Karle & Vora, 2017). Because big data handles the data of a large number of users, preserving privacy is an essential task in protecting that data (Yang, et al., 2014), (Youke, et al., 2020). Numerous applications are designed to let users access data under trust management (Denglong et al., 2020). Privacy and security are major challenges in big data, and big data will not be accepted unless both are addressed. Scalability (S. Atiewi et al., 2020) is another major issue when conventional preservation techniques are applied to big data. Although several techniques have been developed for privacy preservation, most of them cannot preserve privacy efficiently because they fail to handle different attacks (Antony & Antony, 2016). Big data requires large storage capacity and computational power for preserving the data. Hence, it relies on large distributed systems that store data at various locations and allow easy retrieval (Geetha, et al., 2017).
Because preserving privacy is an important issue in processing big data, it affects academia as well as the IT industry. The key requirement when sharing data is to preserve privacy while simultaneously maintaining data utility. The purpose of extracting useful data from large datasets is to obtain data that cannot be misused. Several techniques have been devised for privacy preservation, but most of them are ineffective at addressing the security problems that arise during privacy preservation (Thanamani, 2017). Previously used privacy preservation techniques fall into two categories: the first hides the identity of the user, and the second preserves the user’s important data. Big data processing must also account for communication overhead and computational cost because of the large volume, velocity, and variety of the data (Guan & Si, 2017). Information transfer can be secured if the privacy of the database is preserved. The parameters considered for privacy preservation while processing big data are categorized as integrity, controllability, preservability, and confidentiality. Algorithms based on privacy preservation have improved in performance because of their strong ability to protect big data. However, such techniques do not account for untrustworthy users gaining access to data through data loss and data leakage. Privacy can be preserved through input privacy and output privacy. The performance of anonymization-based algorithms can be improved if optimization-based algorithms are adopted (Tang, et al., 2016).