(p+, α, t)-Anonymity Technique Against Privacy Attacks

Sowmyarani C. N., Veena Gadad, Dayananda P.
Copyright: © 2021 | Pages: 19
DOI: 10.4018/IJISP.2021040104

Abstract

Privacy preservation is a major concern today, as enormous amounts of data are collected and published for analysis. These data may contain sensitive information about the individuals who own them. If the data are published in their original form, they may lead to privacy disclosure, which threatens privacy requirements. Hence, the data should be anonymized before publishing so that it becomes challenging for intruders to obtain sensitive information by means of any privacy attack model. Popular data anonymization techniques such as k-anonymity, l-diversity, p-sensitive k-anonymity, (l, m, d) anonymity, and t-closeness are vulnerable to the different privacy attacks discussed in this paper. The proposed technique, called (p+, α, t)-anonymity, aims to anonymize the data in such a way that even an intruder with sufficient background knowledge of the target individual will not be able to infer or breach private information. The anonymized data also provide sufficient data utility by allowing various data analytics to be performed.

1. Introduction

Privacy preservation can be achieved by many methods, such as securing private information with cryptographic methods, access control methods, etc. However, these techniques (Narayan S, 2010; Qian H, 2015; Wang J. H, 2010) do not provide data utility. Data anonymization techniques provide data utility along with privacy preservation. Hence, there is considerable scope for anonymization techniques that allow the data to be used for research, statistical analysis for decision making, and forecasting (Fung, 2010). Personal data, also called Personally Identifiable Information (PII), is either information that relates to an identifiable living individual or pieces of information that, collected together, can lead to the identification of a particular person. Examples of the former are a name and surname, home address, personal email address, identification card number, Internet Protocol (IP) address, cookie ID, location data, etc. Examples of the latter are data held by a hospital or doctor, census data, data provided at the workplace, etc. Data anonymization is the process of information sanitization whose main aim is to preserve the privacy of the personally identifiable information present in the dataset (Graham, 2009; L. Willenborgand, 1996). Anonymization becomes meaningless if the utility of the data is not considered: raw data has full utility but no privacy, whereas completely anonymized data has perfect privacy but no utility.

Privacy Preserving Data Publishing (PPDP) is a set of methods and tools whose objective is to publish data that remain practically useful while preserving individual privacy (C M Fung, 2010; Sattar, et al., 2013; Xu, Feng, et al., 2019). It is one of the important research issues in the domains of data privacy and security, network security, cyber-physical systems, information security, etc. Most data today are published in the form of microdata (Winkelmann R, 2006). A microdata file contains n records, and each record may contain m variables, also called attributes, of the individual about whom the information is collected. Let T be the original microdata table. At the basic level of PPDP, the identifiers are removed from the published microdata and anonymization methods are applied to the quasi-identifiers; the resulting table is of the form T' (Quasi-Identifiers, Sensitive Attributes). From the literature (Samarati P, 2001), the attributes in the microdata file can be classified and defined as follows:

  1. Quasi-identifiers (QIDs): attributes that can be used to identify individuals, but not uniquely, for example a person's age, zip code, etc.

  2. Confidential/sensitive attributes (SA): a person's sensitive information that needs to be secured, for example diagnosis report, community, disease, salary, occupation, investments, opinion polls, etc.

The microdata file is published after removing the identifiers that directly identify an individual; however, the quasi-identifiers (QIDs) and sensitive attributes (SA) are retained in the published data, as they contain values useful for analytics, study, and research. These values are subjected to anonymization to make sure that the published microdata is safe from possible attacks such as the background knowledge attack, homogeneity attack, attribute linkage attack, skewness attack, similarity attack, etc. A complete list of privacy attacks on published data is provided in (Sowmyarani C N, 2017).
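The basic publishing step described above can be sketched in Python. This is a minimal illustration and not the authors' implementation: the `Name` identifier column, the sample records, and the function name `remove_identifiers` are assumptions introduced here.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def remove_identifiers(table: Iterable[Record],
                       identifiers: Iterable[str]) -> List[Record]:
    """Return T' = (QIDs, SA): each record of T without its direct identifiers."""
    drop = set(identifiers)
    return [{k: v for k, v in rec.items() if k not in drop} for rec in table]


# Hypothetical original table T; names and values are made up for illustration.
T = [
    {"Name": "Person A", "Age": 30, "ZipCode": "560001", "Disease": "Flu"},
    {"Name": "Person B", "Age": 32, "ZipCode": "560002", "Disease": "Asthma"},
]

# Only the direct identifier is removed; QIDs and SA are kept unchanged.
T_prime = remove_identifiers(T, identifiers=["Name"])
for rec in T_prime:
    print(rec)
```

Dropping identifiers is only the first step: the quasi-identifier values left in `T_prime` are still the original ones, which is why anonymization methods are applied to them afterwards.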

Table 1. Original table

Job            Gender  Age  Salary
Engineer       Male    30   50,000
Engineer       Male    32   50,000
Doctor         Female  35   60,000
Choreographer  Female  45   35,000
Dancer         Male    40   35,000
Dancer         Male    42   35,000
Doctor         Male    38   60,000
Choreographer  Male    48   35,000
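
To see why the values in Table 1 need anonymization, the following Python sketch groups its records by a chosen set of quasi-identifiers and reports the size of the smallest group (the k of k-anonymity) and the groups in which every record shares the same salary. Treating Job, Gender, and Age as quasi-identifiers and Salary as the sensitive attribute is an assumption made here, as are the function names; the sketch only inspects the table and is not the (p+, α, t)-anonymity technique itself.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Records of Table 1 as (Job, Gender, Age, Salary).
COLUMNS = ("Job", "Gender", "Age", "Salary")
TABLE_1 = [
    ("Engineer", "Male", 30, 50000),
    ("Engineer", "Male", 32, 50000),
    ("Doctor", "Female", 35, 60000),
    ("Choreographer", "Female", 45, 35000),
    ("Dancer", "Male", 40, 35000),
    ("Dancer", "Male", 42, 35000),
    ("Doctor", "Male", 38, 60000),
    ("Choreographer", "Male", 48, 35000),
]


def equivalence_classes(rows: Sequence[tuple],
                        qids: Sequence[str],
                        sensitive: str = "Salary") -> Dict[Tuple, List]:
    """Map each combination of QID values to the sensitive values of its records."""
    qid_idx = [COLUMNS.index(q) for q in qids]
    sa_idx = COLUMNS.index(sensitive)
    classes: Dict[Tuple, List] = defaultdict(list)
    for row in rows:
        classes[tuple(row[i] for i in qid_idx)].append(row[sa_idx])
    return dict(classes)


def smallest_class_size(classes: Dict[Tuple, List]) -> int:
    """Size of the smallest equivalence class, i.e. the k of k-anonymity."""
    return min(len(values) for values in classes.values())


def homogeneous_classes(classes: Dict[Tuple, List]) -> List[Tuple]:
    """Classes whose records all carry the same sensitive value."""
    return [key for key, values in classes.items() if len(set(values)) == 1]


by_all = equivalence_classes(TABLE_1, ("Job", "Gender", "Age"))
by_job = equivalence_classes(TABLE_1, ("Job",))
print(smallest_class_size(by_all), smallest_class_size(by_job))
print(homogeneous_classes(by_job))
```

With all three assumed quasi-identifiers, every record of Table 1 is unique, so the smallest class has size 1. Grouping by Job alone yields four classes of two records each, yet every class contains a single salary value, so knowing a person's job is enough to learn the salary. This is the situation a homogeneity attack exploits.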
