1. Introduction
With the development of wired and wireless communication, people can easily access the Internet at any time and from anywhere. Social Network Services (SNSs) have become a significant medium for propagating and sharing information. An SNS aims to build online social networks and social relationships among users who are willing to share interests or activities. Interactions among people of different regions and ages are becoming increasingly flexible and convenient.
Online microblog services are now widespread as a broadcast medium, especially since the launch of Twitter in 2006 and the explosive growth in the number of active SNS participants. Unlike traditional blog services (Yang and Counts, 2010; Zhao and Jiang, 2011), the content of a microblog is typically short and concise, which makes it easy to post and browse. Because content length is restricted, information spreads quickly and flexibly. A traditional SNS uses a bidirectional friend model as its social graph structure to reflect real-life social relationships, whereas a microblog service creates a unidirectional subscription structure. This unidirectional subscription mechanism triggers a celebrity effect: users can follow celebrities who have a huge influence on them, without any personal acquaintance between the two. Innovations of this kind open up a vast space for social exchange and speed up interactions.
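The contrast between the bidirectional friend model and the unidirectional subscription structure can be illustrated with a small directed graph. The sketch below is only an illustration under stated assumptions: the class `FollowGraph`, its method names, and the sample user names are hypothetical and do not come from any microblog platform or from the data set discussed in this article. A friendship is modeled as two opposite follow edges, while a subscription is a single edge that the target need not reciprocate.

```python
from collections import defaultdict


class FollowGraph:
    """Hypothetical directed graph: an edge u -> v means 'u follows v'."""

    def __init__(self):
        self.following = defaultdict(set)  # user -> accounts the user follows
        self.followers = defaultdict(set)  # user -> accounts that follow the user

    def follow(self, user, target):
        """Add a one-way subscription; the target does not have to reciprocate."""
        self.following[user].add(target)
        self.followers[target].add(user)

    def befriend(self, a, b):
        """Emulate a bidirectional friendship as two opposite follow edges."""
        self.follow(a, b)
        self.follow(b, a)

    def is_mutual(self, a, b):
        """Return True only if a follows b and b follows a."""
        return b in self.following.get(a, ()) and a in self.following.get(b, ())

    def follower_count(self, user):
        """In-degree of the user, i.e. the number of subscribers."""
        return len(self.followers.get(user, ()))


if __name__ == "__main__":
    graph = FollowGraph()
    # Three ordinary users subscribe to one celebrity account (one-way edges).
    for fan in ("alice", "bob", "carol"):
        graph.follow(fan, "celebrity")
    # Two of them are also friends in the traditional, bidirectional sense.
    graph.befriend("alice", "bob")

    print(graph.follower_count("celebrity"))
    print(graph.is_mutual("alice", "celebrity"))
    print(graph.is_mutual("alice", "bob"))
```

In this toy graph the celebrity account accumulates followers without following anyone back, which is the asymmetry that a strictly bidirectional friend model cannot represent.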
Microblog services are especially popular in China. Sina Weibo, one of the largest microblog websites in China, has attracted more than 0.6 billion registered users, and its number of daily active users reached 60 million in September 2013. The rapid development of microblogs has attracted considerable attention and provides ample data and content for research. Researchers have made numerous contributions to the study of microblogs, drawing on statistical physics, sociophysics, econophysics, complex networks, human dynamics, and various interdisciplinary applications (Boccaletti and Latora, 2006; Abramson and Kuperman, 2001). Several theoretical models (Barabasi, 2005; Yan and Li, 2012) have been proposed to reproduce the process of human activities and to predict human decisions. However, models are seldom built on the basis of large-scale data statistics and analyses (Xiong and Hu, 2012; Xiao and Wang, 2012). As a result, these models are not consistent with empirical results. It may therefore be of value to collect large-scale data and investigate potential patterns in microblog user behavior. This can be helpful in further studies on the promotion of human activities.
One significant limitation on collecting large-scale data through the Sina Weibo Application Programming Interface (Sina Weibo API) is that the API severely restricts the access rate from a single IP address. To overcome this limitation, we used a browser simulation technique to simulate user operations such as login, navigation, and search. With this approach, we were able to collect an abundance of data from Sina Weibo on following relationships, including user profiles, social graph structure, and the release of microblog postings.