Privacy Preserving Data Mining Using Time Series Data Aggregation

Sivaranjani Reddi (ANITS, Bheemunipatnam, India)
DOI: 10.4018/IJSITA.2017100101

Abstract

This article proposes a mechanism for protecting the privacy of mined results, assuming that the data is distributed across many nodes. The first objective is for each node in a cluster to mine the query results and communicate them to the cluster head, which aggregates the data collected from all the cluster nodes and communicates the aggregate to the group controller. The second objective is to incorporate privacy at each level: the cluster node, the cluster head, and the group controller. The final objective is to support a dynamic network, in which nodes can join or leave the distributed network without disrupting its functionality. The proposed algorithm was implemented in Java and evaluated for its performance in terms of communication cost and computational complexity.

Introduction

Many real-life applications of data mining face problems in preserving the privacy of the data (Anderson, 2010; Acs & Castelluccia, 2011; Dansana, 2012; van Dijk et al., 2010; Chowdhuri, 2014; Sarkar et al., 2017). First, certain attributes of the data might leak personally identifiable information. Second, the data can be split across multiple nodes, either horizontally or vertically, and a node may not be allowed to transfer its data to another party. Finally, use of the data model may be restricted, since some rules could violate the law if they are used to profile individuals. Privacy-preserving data mining (PPDM) (Agrawal, 1994) has arisen to address these issues. Most PPDM techniques are modified versions of standard data mining algorithms, where the modification adds cryptographic mechanisms that guarantee privacy for the application. In many cases, the constraints on PPDM are preserving data accuracy and retaining the performance of the mining process while satisfying the privacy restrictions. The many methodologies used in PPDM can be summarized along the following dimensions:

  • Data Distribution: This dimension concerns how the data is distributed. Approaches adopt either centralized or decentralized data distribution. The distribution can generally be categorized as horizontal or vertical. Horizontal data splitting is discussed in detail in the forthcoming sections; vertical distribution stores the values of different attributes in different places.

  • Data Alteration: This changes the actual data into another form before it is released to the public, in order to achieve data privacy. Data modification mechanisms include perturbation, blocking, aggregation (Chen et al., 2014; Won et al., 2014), swapping, and sampling.

  • Privacy Preservation: This ensures that data is delivered to the intended data miner only after data alteration has been applied. The data is distributed among more than one node without revealing the data held at any individual site. In the classification phase, the results are given to a designated node, which performs the classification and checks for the occurrence of certain rules without disclosing them.
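
The difference between horizontal and vertical data distribution described above can be illustrated with a minimal Java sketch. The record layout (`id`, `age`, `income`), the class name `PartitionSketch`, and the round-robin assignment of rows to nodes are assumptions made for illustration only; they are not taken from the article.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: attribute names and helper methods are hypothetical.
class PartitionSketch {

    // Horizontal partitioning: every node keeps all attributes,
    // but only a subset of the records (rows are dealt out round-robin).
    static List<List<Map<String, Integer>>> horizontal(
            List<Map<String, Integer>> rows, int nodes) {
        List<List<Map<String, Integer>>> parts = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            parts.get(i % nodes).add(rows.get(i));
        }
        return parts;
    }

    // Vertical partitioning: a node keeps every record,
    // but only the attributes assigned to it.
    static List<Map<String, Integer>> vertical(
            List<Map<String, Integer>> rows, List<String> attributes) {
        List<Map<String, Integer>> part = new ArrayList<>();
        for (Map<String, Integer> row : rows) {
            Map<String, Integer> projected = new LinkedHashMap<>();
            for (String attribute : attributes) {
                projected.put(attribute, row.get(attribute));
            }
            part.add(projected);
        }
        return part;
    }

    // Builds one hypothetical record.
    static Map<String, Integer> record(int id, int age, int income) {
        Map<String, Integer> row = new LinkedHashMap<>();
        row.put("id", id);
        row.put("age", age);
        row.put("income", income);
        return row;
    }

    // Four made-up records used by the demo below.
    static List<Map<String, Integer>> sample() {
        List<Map<String, Integer>> rows = new ArrayList<>();
        rows.add(record(1, 34, 52000));
        rows.add(record(2, 45, 61000));
        rows.add(record(3, 29, 48000));
        rows.add(record(4, 52, 75000));
        return rows;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> rows = sample();
        System.out.println("Horizontal, node 0: " + horizontal(rows, 2).get(0));
        System.out.println("Horizontal, node 1: " + horizontal(rows, 2).get(1));
        System.out.println("Vertical (id, age): " + vertical(rows, List.of("id", "age")));
    }
}
```

In the horizontal case each node sees complete records for some individuals; in the vertical case a node sees every individual but only some attributes, so a shared key such as `id` is needed to relate the pieces.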

Many authors have proposed techniques to provide confidentiality in data mining (Aggarwal, 2010; Oliveira, 2004; Rawat, 2015; Fouad & Hassan, 2016). Elaine Shi et al. (2011) proposed a time-series-based aggregation mechanism to attain data privacy, in which group participants periodically upload encrypted data to the group aggregator (GA), which is responsible for summing the data in every time period. The mechanism allows group users to submit encoded values to the data aggregator, which then computes the sum of the participants' values in each period without learning the individual values. This technique achieves strong privacy.
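
The idea of an aggregator that learns only the per-period sum can be sketched with a toy additive-masking scheme. This is a simplified stand-in, not the construction of Shi et al. (2011) and not the article's algorithm, and it is not cryptographically secure. The class name `MaskedSumSketch`, the modulus 2^64, the hash-derived per-period value, and the trusted key setup are all assumptions chosen for illustration. Each participant masks its value with its secret key times a public per-period value; because all keys (including the aggregator's) sum to zero modulo M, the masks cancel only when every ciphertext for the same period is added together.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Toy additive masking, for illustration only. Not a secure scheme.
class MaskedSumSketch {

    // Modulus 2^64; readings are assumed non-negative with a sum below 2^63.
    static final BigInteger M = BigInteger.ONE.shiftLeft(64);

    // Public per-period value h(t), derived by hashing the period number.
    static BigInteger periodHash(long period) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(
                    Long.toString(period).getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest).mod(M);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Trusted setup (an assumption): n participant keys in slots 0..n-1 and
    // the aggregator key in slot n, chosen so that all keys sum to 0 mod M.
    static BigInteger[] generateKeys(int n) {
        SecureRandom rnd = new SecureRandom();
        BigInteger[] keys = new BigInteger[n + 1];
        BigInteger sum = BigInteger.ZERO;
        for (int i = 0; i < n; i++) {
            keys[i] = new BigInteger(64, rnd);
            sum = sum.add(keys[i]);
        }
        keys[n] = sum.negate().mod(M);
        return keys;
    }

    // Participant side: c = x + key * h(t) mod M.
    static BigInteger encrypt(long value, BigInteger key, long period) {
        return BigInteger.valueOf(value)
                .add(key.multiply(periodHash(period)))
                .mod(M);
    }

    // Aggregator side: adding its own mask cancels all participant masks,
    // leaving only the sum of the values for that period.
    static long aggregate(BigInteger[] ciphertexts, BigInteger aggregatorKey, long period) {
        BigInteger total = aggregatorKey.multiply(periodHash(period));
        for (BigInteger c : ciphertexts) {
            total = total.add(c);
        }
        return total.mod(M).longValueExact();
    }

    public static void main(String[] args) {
        long period = 5L;
        long[] readings = {12, 7, 30};
        BigInteger[] keys = generateKeys(readings.length);
        BigInteger[] ciphertexts = new BigInteger[readings.length];
        for (int i = 0; i < readings.length; i++) {
            ciphertexts[i] = encrypt(readings[i], keys[i], period);
        }
        System.out.println("Sum for period " + period + ": "
                + aggregate(ciphertexts, keys[readings.length], period));
    }
}
```

A single ciphertext is offset by a mask that depends on a key the aggregator does not hold, and a new period changes the mask, so ciphertexts from different periods cannot be combined to cancel it. A practical scheme would additionally need a hardness assumption and, typically, noise added to the values, which this sketch omits.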
