Trajectory Data Publication Based on Differential Privacy

Zhen Gu, Guoyin Zhang
Copyright: © 2023 |Pages: 15
DOI: 10.4018/IJISP.315593

Abstract

Analyzing trajectory data can support services that improve people's quality of life. However, publishing trajectory data directly leaks private information. The authors propose a trajectory data publication method based on differential privacy (TDDP). The TDDP method consists of two stages. In the location generalization stage, the locations at each timestamp are first clustered into classes by the k-means++ algorithm, and the representative location of each class is then selected using the exponential mechanism. In the generalized trajectory data publication stage, the authors design a sampling mechanism that samples locations from the representative locations at different timestamps to form the generalized trajectories. The TDDP method avoids generating non-semantic representative locations and ensures that the generalized trajectories can resist filtering attacks. The experimental results show that the trajectory data released by the TDDP method achieve a good balance between privacy protection and data availability.

Introduction

In the era of big data, location-aware technologies such as mobile communications and sensing devices digitize the geographic locations of people and objects and thereby generate a large amount of trajectory data. Location data reflect characteristics of human behavior, so analyzing and mining trajectory data makes it possible to provide better services to people (Yang et al., 2019). For example, analyzing trajectory data helps plan urban traffic and avoid congestion (Yuan et al., 2012; Yuan et al., 2013). However, trajectory data contain a lot of sensitive personal information, such as home address, work address, and physical health status. Releasing location or trajectory data directly leads to privacy leakage (Wernke et al., 2014; Gursoy et al., 2019; Ding et al., 2020) and, in serious cases, can even threaten people's personal safety and property. Research on trajectory data privacy protection falls mainly into two types. The first is trajectory data privacy protection in offline mode, in which a specific organization collects trajectory data for analysis and mining in order to provide useful information to specific customers (Abul et al., 2008; Hua et al., 2015; Li et al., 2017; Ma et al., 2021). The second is online trajectory data privacy protection, as in location-based services: the real-time trajectory data of moving objects must be uploaded to the service provider, so privacy protection is required in this case as well (Zhang et al., 2017; Zhang et al., 2018). In this paper, the authors mainly study the privacy protection of trajectory data in offline mode.

The existing trajectory data privacy protection methods mainly include the k-anonymity method (Sweeney et al., 2002), the encryption method, and the random perturbation method. The k-anonymity method is vulnerable to attacks that use background knowledge. The encryption method is not commonly used because of its high computational cost. Among the random perturbation methods, trajectory data publishing based on differential privacy has become a popular research topic (Hua et al., 2015; Liu et al., 2021). Differential privacy (Dwork et al., 2017) is regarded as the strongest unconditional privacy protection technology currently known, because it can resist attacks based on arbitrary background knowledge. However, current research on trajectory data publishing based on differential privacy still has aspects that need to be improved:

  • (a) Some current methods require that any two trajectories have the same start and end times, or assume that the raw trajectories contain the same prefix or a common subsequence. However, trajectory data collected in practice rarely have these characteristics.

  • (b) Some current methods cluster the locations of all trajectories at each timestamp, use the cluster center of each class as the representative of all locations in that class, and finally use the cluster centers to generate the generalized trajectories. However, cluster centers sometimes carry no semantic information, and non-semantic representative locations can even appear in multiple clusters, which allows an adversary to identify and filter the published trajectories.
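
To illustrate the alternative to cluster centers mentioned in the abstract, the following is a minimal Python sketch of selecting a representative location from the actual members of one cluster with the exponential mechanism. It is not the authors' implementation. The names `exponential_mechanism` and `representative_location`, the utility function (negative Euclidean distance to the cluster centroid), and the `sensitivity` value are assumptions made here for illustration; this preview does not state the utility function or sensitivity used by TDDP, and a real privacy analysis would have to bound the sensitivity of whatever utility is chosen.

```python
import math
import random


def exponential_mechanism(candidates, utility, epsilon, sensitivity, rng=random):
    """Return one candidate, chosen with probability proportional to
    exp(epsilon * utility(c) / (2 * sensitivity))."""
    scores = [utility(c) for c in candidates]
    top = max(scores)
    # Subtracting the maximum score keeps exp() from overflowing and
    # leaves the normalized selection probabilities unchanged.
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def representative_location(cluster, epsilon, sensitivity=1.0, rng=random):
    """Pick a representative from the real locations in `cluster`.

    Assumption: utility is the negative distance to the centroid, so
    locations near the center are more likely to be chosen. The default
    sensitivity of 1.0 is a placeholder, not a derived bound.
    """
    cx = sum(x for x, _ in cluster) / len(cluster)
    cy = sum(y for _, y in cluster) / len(cluster)

    def utility(point):
        return -math.hypot(point[0] - cx, point[1] - cy)

    return exponential_mechanism(cluster, utility, epsilon, sensitivity, rng)


# Hypothetical cluster of (x, y) locations observed at one timestamp.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
rep = representative_location(cluster, epsilon=1.0, rng=random.Random(0))
print(rep)  # always one of the real locations in the cluster
```

Because the output is always drawn from locations that were actually observed, it cannot be a synthetic point such as a centroid lying in the middle of a lake or a building. A smaller `epsilon` makes the choice closer to uniform over the cluster, while a larger `epsilon` concentrates it on the locations nearest the centroid.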
