An Enhanced Data Anonymization Approach for Privacy Preserving Data Publishing in Cloud Computing Based on Genetic Chimp Optimization

Sahana Lokesh R., H.R. Ranganatha
Copyright: © 2022 |Pages: 20
DOI: 10.4018/IJISP.300326

Abstract

Data privacy is the biggest challenge in the medical field when sharing and publishing sensitive information about individuals to cloud infrastructure. It is therefore essential to protect patients’ information with strong security and data privacy. In this paper, a novel technique based on Mondrian-based k-anonymization combined with a Genetic-Chimp Optimization Algorithm is proposed to protect the privacy of patients. The optimization algorithm uses the average equivalence value and the generalized information loss to calculate the fitness value. Moreover, a DNA-Genetic Algorithm based encryption technique is applied after the anonymization process to give extra protection to the anonymized database. The performance of the proposed privacy preservation technique is evaluated with respect to information loss, privacy and utility. The proposed approach shows better results than the other techniques compared and preserves the privacy of medical databases efficiently.

1. Introduction

Cloud computing is a modern computing paradigm for storing, managing and retrieving information such as individual data, medical records and financial transactions. The cloud environment is used by millions of people all over the world (Zhan et al., 2015). It holds sensitive information about individuals, including quasi-identifiers (QI) and personal details such as name, date of birth, address, contact number, zip code, passwords, emails, and health, financial, treatment and medical records (Kundalwal et al., 2019). Publishing data that contains such sensitive information puts the privacy of individuals at risk. Privacy preserving data publishing (PPDP) techniques allow data to be published while preserving data privacy. Many data anonymization techniques have been proposed for PPDP (Lokesh and Ranganatha, 2019; Romanou, 2018; Arava and Lingamgunta, 2019; Andrew et al., 2019; Ashkouti et al., 2020).

Data anonymization is the process of protecting the sensitive information in data records before they are published in the cloud (Romanou, 2018). k-anonymization is one of the most popular anonymization methods; it guarantees that each data record is indistinguishable from at least k-1 other data records (Arava and Lingamgunta, 2019). Suppression and generalization are the two main operations of the k-anonymization process. Although this technique performs well at preserving individual privacy, it is ineffective for high-dimensional and multi-dimensional data (Andrew et al., 2019). The Mondrian-based k-anonymization technique, by contrast, can handle multi-dimensional databases and is capable of partitioning high-dimensional numerical data (Ashkouti et al., 2020).
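To make the partitioning idea concrete, the following is a minimal sketch of a Mondrian-style split on numeric quasi-identifiers. It is not the authors' implementation: the function names `mondrian` and `generalize`, the sample records, and the use of the raw (unnormalized) value range to choose the split attribute are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: every record is a tuple of numeric quasi-identifier values.
Record = Tuple[int, ...]


def mondrian(records: List[Record], k: int) -> List[List[Record]]:
    """Recursively split records at the median of the widest attribute."""
    if len(records) < 2 * k:
        return [records]  # too small to yield two groups of size >= k
    dims = range(len(records[0]))
    # Try attributes from the widest raw value range to the narrowest.
    by_width = sorted(
        dims,
        key=lambda d: max(r[d] for r in records) - min(r[d] for r in records),
        reverse=True,
    )
    for d in by_width:
        ordered = sorted(records, key=lambda r: r[d])
        median = ordered[len(ordered) // 2][d]
        left = [r for r in ordered if r[d] < median]
        right = [r for r in ordered if r[d] >= median]
        # Accept the cut only if both halves still satisfy k-anonymity.
        if len(left) >= k and len(right) >= k:
            return mondrian(left, k) + mondrian(right, k)
    return [records]  # no allowable cut on any attribute


def generalize(partition: List[Record]) -> Tuple[Tuple[int, int], ...]:
    """Replace each attribute by the (min, max) range of its partition."""
    dims = range(len(partition[0]))
    return tuple(
        (min(r[d] for r in partition), max(r[d] for r in partition))
        for d in dims
    )


# Hypothetical (age, zip code) records, for illustration only.
sample = [(25, 47677), (27, 47602), (29, 47678),
          (41, 47905), (45, 47909), (49, 47906)]
groups = mondrian(sample, k=3)
ranges = [generalize(g) for g in groups]
```

Each resulting group holds at least k records, so once the quasi-identifiers are replaced by the group's ranges, every record shares its published values with at least k-1 others.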

Preserving both the utility and the privacy of the anonymized data simultaneously is the main challenge in PPDP. In this research, a metaheuristic optimization technique is integrated with the Mondrian-based k-anonymization method to improve privacy and utility. The optimization algorithm adopted in this paper combines the genetic algorithm (GA) (McCall, 2005) and the chimp optimization algorithm (ChOA) (Khishe and Mosavi, 2020). ChOA is inspired by the sexual motivation and individual intelligence of chimps in their group hunting. GA is a powerful optimization algorithm that has proved its reliability in solving many complex real-world problems and is hence used in PPDP (Tahir et al., 2020). Although anonymized data protects sensitive information when published to the cloud, data in non-encrypted form can easily be read by the public and by cloud users (Bibal Benifa and Venifa Mini, 2020). Data encryption techniques can protect the anonymized data in the cloud. GA-based encryption shows good results in data encryption (Tahir et al., 2020). To improve the performance of GA, DNA computing can be integrated with it because of the natural connection between GA and DNA computing (Zang et al., 2018). Thus, a DNA-Genetic Algorithm (DNA-GA) encryption technique is used in the proposed approach to protect the anonymized data in the cloud (Mousa, 2016). This encryption technique is more stable because of its genetic and multi-iteration operations. The performance of the proposed anonymization approach, optimization algorithm and encryption technique is evaluated on several datasets (the Adult database, Indian Liver Patient Records, the Early Stage Diabetes Risk Prediction Dataset and a COVID-19 patient pre-condition dataset) in terms of generalized information loss, average equivalence value, running time and throughput.
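The two anonymization metrics named above can be sketched as follows. The formulas are the definitions commonly used in the k-anonymity literature (average equivalence class size normalized by k, and generalized information loss for numeric attributes); this preview does not show the paper's own equations, so treating them as equivalent is an assumption. The function names and the sample partitions are hypothetical.

```python
from typing import List, Tuple

Record = Tuple[int, ...]  # assumption: numeric quasi-identifiers only


def average_equivalence_class_size(partitions: List[List[Record]], k: int) -> float:
    """(records / number of classes) / k; 1.0 is the best possible value."""
    n = sum(len(p) for p in partitions)
    return (n / len(partitions)) / k


def generalized_information_loss(partitions: List[List[Record]]) -> float:
    """Mean ratio of each generalized range to the attribute's full domain.

    0.0 means no generalization; 1.0 means every value was generalized
    to the whole domain of its attribute.
    """
    records = [r for p in partitions for r in p]
    dims = range(len(records[0]))
    lo = [min(r[d] for r in records) for d in dims]
    hi = [max(r[d] for r in records) for d in dims]
    total = 0.0
    for p in partitions:
        for d in dims:
            domain = hi[d] - lo[d]
            if domain > 0:  # a constant attribute contributes no loss
                width = max(r[d] for r in p) - min(r[d] for r in p)
                total += len(p) * width / domain
    return total / (len(records) * len(dims))


# Hypothetical (age, zip code) equivalence classes, for illustration only.
classes = [
    [(25, 47677), (27, 47602), (29, 47678)],
    [(41, 47905), (45, 47909), (49, 47906)],
]
c_avg = average_equivalence_class_size(classes, k=3)
loss = generalized_information_loss(classes)
```

An optimizer such as the one described here would search for partitions that keep both values low; how the paper combines the two into a single fitness value is not specified in this preview.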
