Introduction
Over the last decade, Online Social Networks (OSNs) such as Facebook and Twitter have become extremely popular, with more than a billion users worldwide. OSNs allow a user to publish data to all the friends in the user's friend circle.
Current OSN platforms are typically centralized: users store their data on central servers deployed by the OSN service providers. The service providers can analyze these data to learn users' private information, such as interests and personal affairs, and in the worst case may sell this information to third parties. Current Centralized Online Social Networks (COSNs) have therefore raised serious privacy concerns (Krishnamurthy, Wills et al. 2008, Krishnamurthy, Wills et al. 2010, Krishnamurthy 2013, Zhang, Sun et al. 2010).
To address the data privacy issue, Decentralized Online Social Networks (DOSNs) have recently been proposed (Buchegger, Schiöberg et al. 2009, Yeung, Liccardi et al. 2009). Although DOSN products (“Diaspora” n.d.) are not as popular as centralized OSNs, DOSNs are under active development (Wilson, Gosling et al. 2012, Koll, Li et al. 2013). To protect data privacy, DOSNs bypass the centralized servers: the data published by a user are stored and disseminated only within that user's friend circle (Li et al. 2013). Although DOSNs help protect data privacy, maintaining data availability becomes a major challenge, because when a friend of the user is offline, the data stored on that friend's device cannot be accessed by the other friends.
To achieve good data availability in DOSNs, data replication has been widely used. In this approach, a certain number of replicas are created for each data item published by a user, and these replicas are stored within the user's friend circle. If a friend is offline, the data held by that friend can still be accessed through the replicas stored by other friends. Consequently, data availability is improved.
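The effect of replication on availability can be illustrated with a minimal sketch. It is not the quantitative model developed in this paper; it assumes, purely for illustration, that each friend holding a replica is online independently with a known probability, so that a data item is available whenever at least one replica holder is online. The function names `availability` and `replicas_needed` are hypothetical.

```python
from math import prod


def availability(online_probs):
    """Probability that at least one replica holder is online.

    Assumes (for illustration only) that friends are online
    independently, with the given per-friend probabilities.
    """
    return 1.0 - prod(1.0 - p for p in online_probs)


def replicas_needed(p_online, target):
    """Smallest replica count k such that 1 - (1 - p)^k >= target,
    assuming every replica holder is online with the same probability p.
    """
    if not 0.0 < p_online < 1.0:
        raise ValueError("p_online must be strictly between 0 and 1")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    k = 0
    while availability([p_online] * k) < target:
        k += 1
    return k


if __name__ == "__main__":
    # Two replica holders, each online half of the time.
    print(availability([0.5, 0.5]))
    # Replicas required to reach 90% availability under the same assumption.
    print(replicas_needed(0.5, 0.9))
```

Under this simplified assumption, each additional replica reduces the probability that an item is unavailable by a constant factor, which also means each replica consumes storage that the friend circle must supply.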
Existing work on data replication in DOSNs typically assumes that the friends of a user can always contribute sufficient storage capacity to store all the published data (Li et al. 2013, Olteanu and Pierre 2012). This assumption no longer holds well. On one hand, increasingly more data are being generated on OSNs. On the other hand, users now often access OSN services from mobile devices, such as mobile phones, whose storage capacity is much more limited than that of the desktop computers used in the “old fashioned” style of accessing OSNs. Moreover, the number of friends in a friend circle is limited (Ugander, Karrer et al. 2011). Together, these factors cause a storage shortage in DOSNs. It is therefore desirable to know what level of data availability can be achieved given the total storage capacity contributed by the friend circle. However, existing work on DOSNs has not yet studied this question quantitatively. This paper aims to address this issue by building a quantitative model that captures the relation between the total storage capacity contributed by the friends and the level of data availability in the DOSN.
Moreover, friends go online and offline dynamically in a DOSN, and data availability drops when the number of online friends decreases. This paper therefore also proposes a novel method to predict the level of data availability on the fly.