Research on social networks has advanced significantly thanks to the wide variety of online social websites and highly popular Web 2.0 applications. Social network analysis views social relationships in terms of network and graph theory: nodes (individual actors within the network) and ties (relationships between the actors). Using web mining techniques and social network analysis, it is possible to process and analyze large amounts of social data (such as blog tagging, online game playing, instant messaging etc.) and thereby discover valuable information. In this way, we can understand social structure, social relationships and social behaviors. This approach is also called social network mining. Its algorithms differ from the established set of data mining algorithms developed to analyze individual records, because social network datasets are relational: the relations among entities are central. This chapter also sets out a process for applying web mining.
A social network is a social structure made of individuals (or organizations, companies etc.), also called nodes, connected by links that represent relationships and interactions between them; it offers rich relational interdependency and content for mining. Figure 1 shows an example of a social network.
Figure 1. An example of a social network: nodes represent individuals and links represent relationships
An online social network focuses on building Internet communities of people who share interests and/or activities, who are interested in exploring the interests and activities of others, or who want to communicate, interact and share. Such networks are a very popular kind of Web 2.0 application. Some well-known social networking websites are Facebook as a general network, LinkedIn and Viadeo as business social networks, and Flickr for photo sharing. Social networks are thus a relevant part of human life (Fu, 2007; Goth, 2008).
This chapter describes how to use web mining techniques and online social network analysis to study user profiles and behaviour, yielding information suitable for marketing, sociology etc. The analyses focus on web resources such as user content, network structures, user behaviour on the network website and how users build their networks.
This chapter is structured as follows: section 1 presents social network analysis, section 2 presents web mining algorithms and techniques that can be used for social network analysis, section 3 presents techniques and process, and section 4 discusses some applications. Finally, future challenges and research directions are explained.
Social Network Analysis
Social network analysis (SNA) is a mathematical technique developed in modern sociology to understand the structure and behaviour of members of social systems, to map relationships between individuals in a social network, and to provide business intelligence about the ties. Social network analysis is related to network theory and graph theory, so the network topology helps to determine a network's usefulness to its individuals. Its objectives can be classified as static, to find community structures, and dynamic, to monitor the evolution of community structure and to spot abnormal individuals or abnormal time-stamps.
Evaluating the location of people in the network, that is, the centrality of a node, is important for understanding networks and their participants. The measures provided by social network analysis give insight into the various roles and groupings: who are the connectors, leaders, bridges and isolates, which clusters exist and who is in them, who is in the core of the network and who is on the periphery.
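As a rough illustration of how centrality can separate roles such as connectors and isolates, the following minimal sketch computes normalized degree centrality for a small undirected network. Degree centrality is only one of several possible measures and is chosen here as an assumption for simplicity; the function names (`degree_centrality`, `isolates`, `most_connected`) and the five-person sample network are hypothetical and not taken from the chapter.

```python
def degree_centrality(nodes, ties):
    """Normalized degree centrality of each node in an undirected network.

    A node's score is its number of distinct neighbours divided by the
    maximum possible number of neighbours (n - 1).
    """
    neighbours = {node: set() for node in nodes}
    for a, b in ties:
        if a == b:
            continue  # ignore self-ties
        neighbours[a].add(b)
        neighbours[b].add(a)
    denom = max(len(nodes) - 1, 1)  # avoid division by zero for a single node
    return {node: len(adj) / denom for node, adj in neighbours.items()}


def isolates(centrality):
    """Nodes with no ties at all."""
    return sorted(node for node, score in centrality.items() if score == 0)


def most_connected(centrality):
    """Nodes with the highest degree centrality (candidate connectors)."""
    top = max(centrality.values())
    return sorted(node for node, score in centrality.items() if score == top)


# Hypothetical sample network: five people, four ties.
nodes = ["Ann", "Bob", "Cara", "Dan", "Eve"]
ties = [("Ann", "Bob"), ("Ann", "Cara"), ("Ann", "Dan"), ("Bob", "Cara")]

centrality = degree_centrality(nodes, ties)
print(centrality)
print("isolates:", isolates(centrality))
print("most connected:", most_connected(centrality))
```

In this sample, Ann is tied to three of the four other people and so has the highest score, while Eve has no ties and is reported as an isolate. Identifying bridges or the core of the network would need other measures, which this sketch does not attempt.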
Social network analysis has traditionally been descriptive rather than predictive: because it is built with only a few global parameters, it is not useful for predicting the future behaviour of a network. This is due to the limited availability of networks, scarce information about each node and lack of data. In the Web 2.0 age, by contrast, very large social networks generate massive quantities of data, and substantial information is available at the level of individual nodes, suitable for building statistical models of individuals. The main difficulty is how to extract social data from a set of very different communication resources (Jin, 2007; Matsuo, 2007).
An adjacency matrix is a simple way to represent a network by recording which vertices of a graph are adjacent to which other vertices: if persons i and j are connected by a direct link, then (i,j)=1, and (i,j)=0 otherwise. Using matrix algebra on the adjacency matrix, it is possible to evaluate numerical properties useful for SNA, such as the intensity of the relation between persons (Kolaczyk, 2009; Scott, 2000; Wasserman, 1994), based on: