Emergent Data Mining Tools for Social Network Analysis

Dhiraj Murthy (Bowdoin College, USA), Alexander Gross (Bowdoin College, USA) and Alex Takata (Bowdoin College, USA)
Copyright: © 2013 | Pages: 18
DOI: 10.4018/978-1-4666-4213-3.ch003

This chapter identifies a number of the most common data mining toolkits and evaluates their utility in the extraction of data from heterogeneous online social networks. It not only introduces the complexities of scraping the diverse forms of data found in these sources, but also critically evaluates currently available tools. This analysis is followed by a presentation and discussion of the development of a hybrid system, which builds upon the work of the open-source Web-Harvest framework, for the collection of information from online social networks. This tool, VoyeurServer, attempts to address the weaknesses of the tools identified in earlier sections, as well as to prototype the implementation of key functionalities thought to be missing from commonly available data extraction toolkits. The authors conclude the chapter with a case study and subsequent evaluation of the VoyeurServer system itself. This evaluation presents future directions, remaining challenges, and additional extensions thought to be important to the effective development of data mining tools for the study of online social networks.
Chapter Preview


With the increased pervasiveness of the internet, society has seen exponential growth in the digital data made available on global public networks. With this rise of ‘Big Data,’ researchers have seen the need to identify, organize, collect, and extract this information back out of the system and into useful forms (Hammer, Garcia-Molina, Cho, Aranha, & Crespo, 1997). The fields of data mining and web-content extraction are critical to this process and have remained active areas of research, as the types and forms of data available on the Web have continued to grow and evolve. The continued growth of information on the Web - due in part to more recent trends of fully online, social, and context-aware computing - has made more types of data available, which are of potential use in a highly interdisciplinary range of fields. Many disciplines are looking at ‘Big Data,’ and at ways to mine and analyze these data, as the key to everything from solving technical problems to better understanding social interactions. For example, large sets of tweets mined from Twitter have been analyzed to detect natural disasters (Doan, Vo, & Collier, 2011; Hughes, Palen, Sutton, Liu, & Vieweg, 2008; Murthy & Longwell, in press), predict the stock market (Bollen & Mao, 2011), and track the timing of our daily rituals (Golder & Macy, 2011). As our use of blogs, social networks, and social media continues to increase, so does our creation of web-based hyperlinked data. The successful extraction of these web-based data is of considerable research and commercial value.

Data mining often goes beyond simple information retrieval and has moved towards a meta-discovery of structures and entities hidden in seas of data. As our social interactions become increasingly mediated by Internet-based technologies, the potential to use web-based data for understanding social structures and interactions will continue to increase.

Online social networks are defined as ‘web-based services that allow individuals to (1) construct a public or semi-public profile within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system’ (Boyd & Ellison, 2008). Individuals interact within online social networks through portals such as Facebook, which create a personalized environment and interaction space for the user by combining knowledge of one user’s online activity and relationships with information about other networked individuals. It is through data mining algorithms that Twitter, for example, determines recommendations for users to follow or topics that may be of potential interest. One way to study social networks is by examining relationships between users and the attributes of these relationships. However, data on a blog, Facebook, or Twitter is not directly translatable into network-based data that would be useful within research praxis, and this is where the ability to perform effective data mining becomes important. Social networks typically provide each individual with portal access only to their own egocentric network. Put in the language of social network analysis (SNA), the visible network is constructed in relation to ego (the individual being studied), and the relations of ego, known as ‘alters,’ are seen (e.g., Facebook friends). However, in a restricted profile environment, the alters’ relationships are not revealed. In order to understand network structure (which is key to a systems perspective), the researcher must use methods like data mining to gather information about all users and interactions by iterating over the data. A variety of tools have been developed, for a wide array of purposes, to collect this web-based information, and the majority of them have been commercially released.
Some of these tools can be used to construct profiles of individuals based on data from multiple sources. Given issues of privacy, these tools should be used strictly within ethical bounds (Van Wel & Royakkers, 2004).
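
The idea of recovering network structure by iterating over individual egocentric networks can be illustrated with a minimal sketch. It is not the chapter's VoyeurServer or Web-Harvest code; the names EGO_NETWORKS, fetch_alters, and crawl_network are hypothetical, and the in-memory dictionary merely stands in for whatever scraper or API call would return a user's visible alters. Starting from one seed user, a breadth-first crawl visits each ego in turn and records ego-alter ties as undirected edges:

```python
from collections import deque

# Hypothetical toy data: each key is an ego, each value lists that ego's
# visible alters. A real study would obtain these from a scraper or API.
EGO_NETWORKS = {
    "ana": ["ben", "cho"],
    "ben": ["ana", "dev"],
    "cho": ["ana"],
    "dev": ["ben", "eli"],
    "eli": ["dev"],
}


def fetch_alters(user):
    """Stand-in for a data collection call; returns the user's visible alters."""
    return EGO_NETWORKS.get(user, [])


def crawl_network(seed, max_users=100):
    """Breadth-first crawl over ego networks starting from `seed`.

    Returns the set of users whose ego network was fetched and the set of
    undirected ties (each tie is a frozenset of two user names).
    """
    visited = set()
    edges = set()
    queue = deque([seed])
    while queue and len(visited) < max_users:
        ego = queue.popleft()
        if ego in visited:
            continue
        visited.add(ego)
        for alter in fetch_alters(ego):
            # frozenset makes (ego, alter) and (alter, ego) the same tie.
            edges.add(frozenset((ego, alter)))
            if alter not in visited:
                queue.append(alter)
    return visited, edges


if __name__ == "__main__":
    users, ties = crawl_network("ana")
    print(sorted(users))
    print(sorted(tuple(sorted(tie)) for tie in ties))
```

The max_users cap is an assumption added to keep a crawl bounded; in practice, rate limits, restricted profiles, and the ethical constraints noted above would shape how far such a crawl may go.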
