De-Anonymization Techniques

DOI: 10.4018/978-1-5225-5158-4.ch007

Abstract

Most operators provide some privacy controls, so many online networks restrict access to information about individual members and their relationships. In a paper in 2006, the authors claim to have developed a two-stage de-anonymisation algorithm that can re-identify the original network from an anonymised network obtained by using any of the anonymisation algorithms developed up to that time. However, anonymisation techniques for social networks were then in their infancy. Several powerful anonymisation algorithms have been developed since, as explained in the previous chapters. It seems that anonymisation algorithms and de-anonymisation algorithms have been developed alternately. The authors present some further de-anonymisation algorithms developed subsequently, as late as 2017.

Introduction

As we have seen in the previous chapters, preserving the privacy of respondents in a social network before it is published has become of the utmost importance. Even in online networks that are completely open, there is a disconnect between users’ willingness to share information and their reaction to unintended parties viewing or using that information (Carthy, 2007). As a consequence, most operators provide some privacy controls, and many online networks restrict access to information about individual members and their relationships.

However, network owners share this information with advertising partners and other third parties, and published networks are very often used for research purposes. So, as discussed earlier, the networks are anonymised before publication. In several high-profile cases of data sharing, anonymity has been treated as equivalent to privacy.

Narayanan and Shmatikov (2009), which the authors claim to be the first paper of its kind, demonstrates that it is feasible to de-anonymise real-world social networks. For this purpose, the following steps were carried out.

  • The current state of data sharing in social networks is surveyed, covering the purpose of such sharing, the resulting privacy risks and the availability of auxiliary information that an attacker can use for de-anonymisation.

  • Privacy in social networks and its relation to node anonymity are defined formally. Attackers are categorised according to the type of attack, which is determined by their resources and the auxiliary information available to them. A methodology for measuring the extent of privacy breaches in social networks is provided.

  • Most importantly, a generic re-identification algorithm for anonymised social networks is developed. The algorithm uses only the network structure and does not make any additional assumptions about membership overlap between multiple networks. The functionality of the algorithm is illustrated on two large real-world online social networks, Flickr and Twitter.
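
As a rough illustration of what re-identification from network structure alone can look like, the following Python sketch extends a small set of already matched nodes by comparing neighbourhoods in the anonymised graph and in an auxiliary graph. It is a simplified toy and not the algorithm of Narayanan and Shmatikov (2009): the function name `propagate`, the parameter `min_margin`, the assumption that a few seed matches are already known, and the two small graphs are all hypothetical choices made for this example.

```python
def propagate(anon, aux, seeds, min_margin=1):
    """Extend a seed mapping (anonymised node -> auxiliary node).

    anon and aux are adjacency dicts mapping each node to a set of neighbours.
    Only graph structure is used; no profile attributes are consulted.
    """
    mapping = dict(seeds)
    changed = True
    while changed:
        changed = False
        used = set(mapping.values())
        for u in sorted(anon):
            if u in mapping:
                continue
            # Score each unused auxiliary node by how many already mapped
            # neighbours of u are mapped onto one of its neighbours.
            scores = {}
            for n in anon[u]:
                if n in mapping:
                    for v in aux[mapping[n]]:
                        if v not in used:
                            scores[v] = scores.get(v, 0) + 1
            if not scores:
                continue
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))
            best, best_score = ranked[0]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            # Accept a match only if it clearly beats the next candidate
            # (a crude stand-in for a proper ambiguity test).
            if best_score - runner_up >= min_margin:
                mapping[u] = best
                used.add(best)
                changed = True
    return mapping


# Hypothetical auxiliary graph with known identities.
aux = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"carol", "erin"},
    "erin": {"dave"},
}
# Hypothetical anonymised release of the same network (names replaced by numbers).
anon = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}

# Assumed seed knowledge: two nodes the attacker has already identified.
seeds = {1: "alice", 2: "bob"}
result = propagate(anon, aux, seeds)
print(result)
```

In this toy case the two graphs are identical apart from the labels, so every node is recovered from two seeds. Real anonymised releases are perturbed and overlap only partly with the auxiliary data, so a practical attack would need a more careful scoring and acceptance rule than the simple margin used here.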

As discussed at the beginning and in earlier chapters, publishing social networks is necessary for various purposes, and owners publish such networks either willingly or because circumstances compel them to. The various reasons for this inclination are:

Academic and Government Data Mining

Phone call networks are commonly used to detect illicit activity such as calling fraud, and for national security purposes. Sociologists, epidemiologists and health-care professionals collect data about geographic, friendship and family networks to study disease propagation and risk. Even data obtained from public websites still presents privacy risks once it is released, because attackers who lack the resources to collect it themselves can use it. Sometimes pseudo-anonymised profiles are provided, which are similar to an anonymised network.

Advertising

Because social network data makes commerce much more profitable, network operators are inclined to share their graphs with advertising partners to enable better social targeting of advertisements. For example, Facebook explicitly states that users’ profiles may be shared for the purpose of personalising advertisements and promotions, as long as the individual is not explicitly identified (Facebook, 2007).

Third-Party Applications

Third-party applications do not respect privacy policies, and the data provided to them is usually not anonymised, even though most applications would be able to function on anonymised profiles. As a result, data from multiple applications can be aggregated and used for targeted advertising. It has also been observed that a malicious third-party application can learn about members of a social network even if it obtains the data in anonymised form.

Aggregation

Aggregation of information from multiple social networks potentially presents a greater threat to individual privacy than one-time data releases. Aggregated networks are an excellent source of auxiliary information for attacks.
