A Survey on Privacy Preserving Dynamic Data Publishing

Salheddine Kabou (LabRI Laboratory, Ecole Superieure en Informatique, Sidi Bel-Abbes, Algeria), Sidi Mohamed Benslimane (LabRI Laboratory, Ecole Superieure en Informatique, Sidi Bel-Abbes, Algeria) and Mhammed Mosteghanemi (Ecole Nationale Supérieure d'Informatique, Bab Ezzouar, Algeria)
DOI: 10.4018/IJOCI.2018100101

Abstract

Many organizations, especially small and medium business (SMB) enterprises, need to collect and share data containing personal information. The privacy of this data must be preserved before the data is outsourced or released to the public. Privacy-preserving data publishing (PPDP) refers to the process of publishing useful information while preserving data privacy. A variety of approaches have been proposed to ensure privacy by applying traditional anonymization models, which focus only on a single publication of a dataset. In practical applications, data publishing is more complicated: organizations publish multiple times for different recipients, or republish after modifications to provide up-to-date data. Privacy-preserving dynamic data publication (PPDDP) is a new process in privacy preservation that addresses the anonymization of data published for different purposes. In this survey, the authors systematically evaluate and summarize different studies on PPDDP, clarify the differences and requirements between the scenarios that can exist, and propose future research directions.

1. Introduction

Nowadays, government regulations and many organizations, especially small and medium business (SMB) enterprises, require the collection, exchange, and sharing of enormous repositories of digital information. When this information contains personally identifiable information, such data sharing is subject to constraints imposed by the security and privacy of the data owners (Chang et al., 2016a). Fung et al. (2010) report that 87% of the population of the United States can be uniquely identified from a publicly released dataset, which illustrates how severe privacy violations can be in the publishing scenario. In August 2006, America Online (AOL) published 20 million search query logs collected from 658,000 users to facilitate information retrieval research for academic purposes, after anonymizing them by mapping each user to a randomly generated identifier (Adeel, 2013). The privacy of this data, which is the number one security factor according to the opinions of 400 IT professionals (Chang et al., 2016b), must be preserved; that is, sensitive information should not be disclosed, so that an individual's private information cannot be inferred directly from the dataset. Maintaining and improving security and privacy is also essential for all users and services, such as those in the Internet of Things and Big Data paradigms (Yang et al., 2018; Kuo et al., 2018).

The most important task is to develop methods and tools for publishing data that strike the right balance between data utility and information privacy. This area of research is called privacy-preserving data publishing (PPDP), which can be considered a technical answer that complements existing privacy approaches. Data anonymization is one of the privacy-preserving techniques; it transforms the information so that the original data is worthless to anybody except its owners (Kabou and Benslimane, 2015).

Anonymization has been widely discussed in the literature, with models such as k-anonymity (Samarati and Sweeney, 1998; Sweeney, 2002), l-diversity (Machanavajjhala et al., 2006), and k-concealment (Tassa et al., 2012). Since the appearance of k-anonymity, several privacy-preserving models have been proposed. They are generally known as Privacy-Preserving Static Data Publishing (PPSDP) models, and they ensure privacy protection only up to a certain level, i.e., they focus on a single publication of a dataset.
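As a rough illustration of what the k-anonymity model asks of a single release, the following Python sketch checks whether every combination of quasi-identifier values occurs in at least k records. The table, the attribute names, and the helper `is_k_anonymous` are hypothetical and chosen only for this example; they are not taken from the cited works.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if each quasi-identifier combination occurs in >= k records."""
    # Group records by their quasi-identifier values and count group sizes.
    groups = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


# Hypothetical, already generalized table (assumed data for illustration).
table = [
    {"age": "20-29", "zip": "220**", "disease": "flu"},
    {"age": "20-29", "zip": "220**", "disease": "asthma"},
    {"age": "30-39", "zip": "221**", "disease": "flu"},
    {"age": "30-39", "zip": "221**", "disease": "diabetes"},
]

two_anonymous = is_k_anonymous(table, ["age", "zip"], 2)
three_anonymous = is_k_anonymous(table, ["age", "zip"], 3)
print(two_anonymous, three_anonymous)
```

In this assumed table each group contains exactly two records, so the check succeeds for k = 2 and fails for k = 3.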

In practical applications, data publishing is more complicated. For example, an organization can publish a dataset multiple times for different recipients, either statically or after modifications (insertions, deletions, or updates) to provide up-to-date data. Each time, the data is anonymized differently for different purposes, or the data is published incrementally as new data is collected. In the dynamic data publication problem, the above-mentioned models can provide protection only for a single release. This need opens a new era in privacy preservation called privacy-preserving dynamic data publication (Adeel, 2013).
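To make the single-release limitation concrete, here is a minimal Python sketch under simplifying assumptions: two hypothetical releases are each 2-anonymous on the attribute `age`, the attacker knows the target is 27 years old and appears in both releases, and the target's sensitive value does not change between releases. The data and the helper `candidate_values` are illustrative assumptions, not a method from the surveyed literature.

```python
def candidate_values(release, matches, sensitive):
    """Sensitive values of all records the attacker cannot rule out."""
    return {r[sensitive] for r in release if matches(r)}


# Two hypothetical releases, each 2-anonymous on "age" when viewed alone.
release_1 = [
    {"age": "20-29", "disease": "flu"},
    {"age": "20-29", "disease": "asthma"},
]
release_2 = [
    {"age": "25-34", "disease": "flu"},
    {"age": "25-34", "disease": "diabetes"},
]

# The attacker knows the target is 27, so both age ranges match.
c1 = candidate_values(release_1, lambda r: r["age"] == "20-29", "disease")
c2 = candidate_values(release_2, lambda r: r["age"] == "25-34", "disease")

# Intersecting the candidate sets narrows the target's value.
narrowed = c1 & c2
print(narrowed)
```

Each release on its own leaves two candidate values for the target, but under the stated assumptions their intersection leaves only one.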
