Article Preview
TopIntroduction
A social network is a community consisting of a number of entities like individuals or organizations and ties among these entities (Backstrom et al., 2007). The growth of social networking sites such as Facebook, Twitter, MySpace etc. leads to an exponential increase in the number of registered users on these sites (Statistics of Social Media Sites by Digital Media Ramblings, 2015). In various situations social network providers publish the acquired user’s information to different customers like researchers, community behavior analysts, advertisers etc. (Rosenblum, 2007). However, it is very important that published social networks data should not disclose the sensitive information of users.
In tabular micro data, an individual can be identified using quasi identifiers which in turn can disclose their sensitive information (Samarati & Sweeney, 2001). In one of the attack scenarios individuals can be identified by relating the published tabular data with some external data sources which are publically available (Samarati & Sweeney, 2001). Similar is the case with social networks data.
A social network is a graph consisting of nodes as vertices and edges as labels of relationship between them. When data of social network users is published then the entire graphical structure which contains sensitive information of users is also published. Privacy can be breached via a social network graph if the involved vertices or edges leak private information (Liu & Terzi, 2008). Lot of work has been done in preserving privacy of micro data (Samarati & Sweeney, 2001; Machanavajjhala et. al, 2007; Ninghui et. al, 2007; Sun et. al, 2008; Xiao & Tao, 2007; Chung et. al, 2007). Recently, a number of distinct models and anonymization algorithms have been developed to preserve the privacy of social network data (Blosser & Zhan, 2008; Wei & Lu, 2008; Campan & Truta, 2008; Zou et. al, 2009; Panda et. al, 2010). Attackers can use structure information like degree of nodes, subgraphs to identify the nodes in the network. To prevent node identification through structure attacks, graph should be K-anonymized (Samarati & Sweeney, 2001).