1. Introduction
Web sites have adopted a collection of refined techniques known as Web 2.0, and social tagging is one of its important features. Social tagging systems allow users to annotate resources with free-form tags (Guandong Xu, 2013). Tagging is a popular way to interpret the content of Web 2.0 websites. In a social bookmarking website, tags are collected according to the user's favorites or interests. In social bookmarking systems, tagging can be seen as the act of linking entities such as users, resources and tags (Caimei Lu, 2011). When a user applies a tag to a resource in the system, a multilateral relationship between the user, the resource and the tag is formed. This relationship helps users better understand and distribute their collections of interesting objects (Gupta M et al., 2010) (Figure 1).
Figure 1. Multilateral relationship
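As a minimal sketch of the relationship described above, each tagging act can be stored as a (user, resource, tag) triple and indexed from each side. The user names, URLs, tags and the function name `build_indexes` are hypothetical and used only for illustration.

```python
from collections import defaultdict

# Hypothetical tag assignments: one (user, resource, tag) triple per tagging act.
assignments = [
    ("alice", "http://example.org/a", "python"),
    ("alice", "http://example.org/a", "tutorial"),
    ("bob", "http://example.org/a", "python"),
    ("bob", "http://example.org/b", "news"),
]


def build_indexes(triples):
    """Index the three-way relation so it can be queried from any side."""
    tags_by_resource = defaultdict(set)
    users_by_tag = defaultdict(set)
    resources_by_user = defaultdict(set)
    for user, resource, tag in triples:
        tags_by_resource[resource].add(tag)
        users_by_tag[tag].add(user)
        resources_by_user[user].add(resource)
    return tags_by_resource, users_by_tag, resources_by_user


tags_by_resource, users_by_tag, resources_by_user = build_indexes(assignments)
```

Keeping the triples as the primary record means that no side of the relationship (user, resource or tag) is lost when the data is later projected onto two dimensions.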
Feature Selection (FS) is an essential part of knowledge discovery. It is a process that attempts to select the most informative features (Velayutham, 2011). FS is divided into supervised and unsupervised categories. When class labels are available for the data, supervised feature selection is used; otherwise, unsupervised feature selection is applicable (Jothi, 2012). In this paper, the feature selection technique is applied to Bookmark Selection (BMS). The goal of bookmark selection is to find a minimal subset of bookmarked URLs from Web 2.0 data while retaining suitably high accuracy in representing the original bookmarks (Kumar SS, 2013; Inbarani et al., 2014b). BMS is necessary because of the profusion of noisy, irrelevant or misleading bookmarks added to Web 2.0 sites. BMS is also used to increase clustering accuracy and reduce the computational time of clustering algorithms. Web 2.0 user-generated tagged bookmark data usually contains some irrelevant bookmarks, which should be removed before knowledge extraction. In this paper, the Unsupervised Quick Reduct (USQR) method is used to select tagged bookmarks, since there are no class labels for Web 2.0 tagged bookmark data.
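This section does not give the exact USQR formulation, so the following is only a rough sketch of a greedy rough-set reduct in the spirit of Unsupervised Quick Reduct. It assumes that tag weights have been discretized, that rows are tags and columns are bookmarks, and that the dependency degree is averaged over all attributes because no class label exists. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def dependency(rows, subset, target):
    """Rough-set dependency of column `target` on the columns in `subset`:
    the fraction of rows whose `subset`-equivalence class has one target value."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[i] for i in subset)].append(row[target])
    consistent = sum(len(v) for v in groups.values() if len(set(v)) == 1)
    return consistent / len(rows)


def mean_dependency(rows, subset):
    """Average dependency of every column on `subset` (no class label needed)."""
    n_attrs = len(rows[0])
    return sum(dependency(rows, subset, a) for a in range(n_attrs)) / n_attrs


def unsupervised_quick_reduct(rows):
    """Greedily add the column that raises the mean dependency the most,
    until the subset is as informative as the full column set."""
    all_attrs = list(range(len(rows[0])))
    goal = mean_dependency(rows, all_attrs)
    reduct = []
    while mean_dependency(rows, reduct) < goal:
        candidates = [a for a in all_attrs if a not in reduct]
        best = max(candidates, key=lambda a: mean_dependency(rows, reduct + [a]))
        reduct.append(best)
    return sorted(reduct)


# Toy data: column 2 duplicates column 0 and column 3 is constant,
# so neither adds information beyond columns 0 and 1.
toy_rows = [
    (0, 0, 0, 1),
    (0, 1, 0, 1),
    (1, 0, 1, 1),
    (1, 1, 1, 1),
]
selected = unsupervised_quick_reduct(toy_rows)
```

On the toy data the redundant and constant columns are left out of the selected subset, which is the behaviour bookmark selection aims for with noisy or irrelevant bookmarks.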
Clustering is one of the important tasks in data mining. It is an approach for finding similarities in data and putting similar data into groups (Selvakumar, 2013). Clustering partitions a dataset into several groups such that the similarity within a group is larger than that among groups (Hammouda, 2000). Tag clustering is the process of grouping similar tags into the same cluster and is important for the success of social systems. The goal of clustering tags is to find frequently used tags from the tagged bookmarks. In tag clustering, similar tags are clustered based on the tag weights associated with bookmarks (Kumar SS, 2013).
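The clustering criterion stated above (similarity within a group larger than among groups) can be checked with a small sketch. Cosine similarity over tag weight vectors is an assumption made here for illustration, not necessarily the measure used in the paper, and the vectors, labels and function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two weight vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def within_and_between(vectors, labels):
    """Mean pairwise similarity inside clusters and across clusters."""
    within, between = [], []
    for i, j in combinations(range(len(vectors)), 2):
        sim = cosine(vectors[i], vectors[j])
        (within if labels[i] == labels[j] else between).append(sim)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(within), mean(between)


# Hypothetical tag vectors: one row per tag, one weight per bookmark.
tag_vectors = [
    [3, 1, 0, 0],
    [2, 1, 0, 0],
    [0, 0, 1, 4],
    [0, 0, 2, 3],
]
cluster_labels = [0, 0, 1, 1]
within_sim, between_sim = within_and_between(tag_vectors, cluster_labels)
```

A partition satisfies the criterion when `within_sim` exceeds `between_sim`; in this toy case the two clusters share no bookmarks, so the between-cluster similarity is zero.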
In this paper, we propose a TRS-PSO clustering algorithm for social systems. The techniques are implemented and tested on various social tagging datasets, and their performance is compared using 'goodness of clustering' evaluation measures.
The proposed work consists of the following steps:
- Data Extraction: Data is fetched from social systems and the dataset is converted into a matrix representation. "Delicious" (del.icio.us) is a well-known social bookmarking web service for storing, sharing, and discovering web bookmarks.
- Data Formatting: Tags and bookmarks are mapped to each other based on tag weights and represented in matrix format.
- Bookmark Selection: BMS is the process of selecting the more useful tagged bookmarks from a set of bookmarks associated with tags.
- Tag Clustering: Relevant tags are clustered based on the tag weights associated with the selected bookmarks.
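The data formatting step can be illustrated with a minimal sketch that turns (user, bookmark, tag) records into a tag-by-bookmark weight matrix. This section does not define how tag weights are computed, so the sketch assumes that a weight is the number of distinct users who applied a tag to a bookmark. The records and the function name `tag_bookmark_matrix` are hypothetical.

```python
from collections import Counter


def tag_bookmark_matrix(triples):
    """Build a tags x bookmarks matrix of weights.

    Assumed weight: number of distinct users who applied the tag
    to the bookmark (duplicate records are counted once)."""
    counts = Counter((tag, bookmark) for _user, bookmark, tag in set(triples))
    tags = sorted({tag for tag, _ in counts})
    bookmarks = sorted({bookmark for _, bookmark in counts})
    matrix = [[counts[(tag, bookmark)] for bookmark in bookmarks] for tag in tags]
    return tags, bookmarks, matrix


# Hypothetical records: (user, bookmark, tag); the last one is a duplicate.
records = [
    ("u1", "b1", "python"),
    ("u2", "b1", "python"),
    ("u1", "b2", "news"),
    ("u1", "b1", "python"),
]
tags, bookmarks, matrix = tag_bookmark_matrix(records)
```

Each row of the matrix is then a tag described by its weights over bookmarks, which is the form that both bookmark selection (choosing columns) and tag clustering (grouping rows) operate on.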
The rest of this paper is organized as follows. Section 2 presents related work on Web 2.0 tag clustering and feature selection. Section 3 presents the methodology of this research work. Section 4 reports the experimental results, and Section 5 concludes the paper.