1. Introduction
Collaborative tagging has emerged as a common and successful solution for labelling and organising huge amounts of digital content, and has been adopted by many well-known sites such as YouTube, Flickr, Last.fm, and Delicious (Marlow, Naaman, Boyd, & Davis, 2006). In collaborative tagging, users assign a number of free-form, semantically meaningful textual labels (tags) to information resources. These tags can then be used for many purposes, including retrieval, browsing and categorisation (Bischoff, Firan, Nejdl, & Paiu, 2008). For instance, they can be used for matching user queries against resource tags, or for building tag clouds to navigate across resources. Such uses are especially important for platforms that share multimedia content such as videos, images, or audio, since such content cannot be indexed as directly as textual data like books or web pages (Bischoff, Firan, Nejdl, & Paiu, 2008). Because of this importance, collaborative tagging systems have been widely researched in the last few years. In particular, research has focused on collaborative tagging dynamics and user behaviour (Marlow, Naaman, Boyd, & Davis, 2006; Halpin, Robu, & Shepard, 2006; Golder & Huberman, 2006; Farooq, Kannampallil, Song, Ganoe, Carroll, & Giles, 2007) and on automatic tag classification methods based on user motivations ([5, 6]).
Nevertheless, collaborative tagging systems suffer from a number of well-known issues (Halpin, Robu, & Shepard, 2006; Cantador, Konstas, & Jose, 2011), which include tag scarcity, the use of different labels to refer to a single concept (synonymy), ambiguity in the meaning of certain labels (polysemy), frequent typographical errors, the use of user-specific naming conventions, and even the use of different languages. One strategy for overcoming these problems, and thus obtaining more comprehensive and consistent tag assignments, is to use tag recommendation systems that help users in the tagging process (Jäschke, Marinho, Hotho, Schmidt-Thieme, & Stumme, 2007). When users are labelling online resources, such systems automatically suggest new tags that may also be meaningful or relevant for the resource being described. In this way, tag recommendation serves the purpose of consolidating the tag vocabulary among users in a collaborative tagging system (Jäschke, Marinho, Hotho, Schmidt-Thieme, & Stumme, 2007). In addition, tag recommendation systems can be used in an off-line mode to extend the descriptions of information resources by automatically adding new tags.
Here we describe a general scheme for tag recommendation in large-scale collaborative tagging systems. Our approach is folksonomy-based, meaning that we do not perform any content analysis of the information resources for which we recommend tags, but rely solely on the tag co-occurrence information that can be derived from the folksonomy itself. A particularly interesting aspect of our tag recommendation scheme is a step that automatically selects the number of tags to recommend, given a list of candidates with assigned scores. Other tag recommendation methods found in the literature generally do not consider this aspect and instead evaluate their solutions at several fixed numbers of recommended tags (see Related work). Moreover, as the scheme we describe relies only on tag information derived from a folksonomy, it is largely domain-independent and could be easily adapted to other collaborative tagging systems, either alone or as a complement to more specific content-based strategies. We believe that a tag recommendation method such as the one we propose here can be useful for obtaining more comprehensive and coherent descriptions of tagged resources, and can help the emergence of less noisy and more consistent folksonomies. This can greatly benefit the organisation, browsing and reuse of online content, and can also increase the value of folksonomies as reliable sources for knowledge mining (Al-Khalifa & Davis, 2007; Limpens, Gandon, & Buffa, 2009).
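To make the idea of folksonomy-based recommendation more concrete, the following minimal Python sketch shows one way that candidate tags could be scored from co-occurrence counts alone, and how the number of recommended tags could be chosen from the scores rather than fixed in advance. It is an illustration under stated assumptions and not the scheme described in this article: the function names, the toy folksonomy, the summed-count scoring, and the relative threshold `ratio` are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(resources):
    """Count how often each pair of tags is assigned to the same resource."""
    cooc = defaultdict(Counter)
    for tags in resources:
        for a, b in combinations(sorted(set(tags)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def score_candidates(cooc, input_tags):
    """Score each candidate by summing its co-occurrences with the input tags.

    Assumption: a plain sum of raw counts; other aggregations are possible.
    """
    scores = Counter()
    for tag in input_tags:
        for candidate, count in cooc.get(tag, {}).items():
            if candidate not in input_tags:
                scores[candidate] += count
    return scores


def select_tags(scores, ratio=0.6):
    """Keep candidates whose score is at least `ratio` times the top score.

    Assumption: this relative threshold is a hypothetical stand-in for an
    automatic selection step; the number of tags returned depends on the
    score distribution instead of being a fixed N.
    """
    if not scores:
        return []
    top = max(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, score in ranked if score >= ratio * top]


# Toy folksonomy: each inner list holds the tags of one resource.
resources = [
    ["guitar", "acoustic", "strum"],
    ["guitar", "acoustic", "chord"],
    ["guitar", "electric", "distortion"],
    ["piano", "chord"],
]

cooc = build_cooccurrence(resources)
scores = score_candidates(cooc, {"guitar", "acoustic"})
recommended = select_tags(scores)
print(recommended)
```

With this toy data, the candidates that co-occur with both input tags pass the threshold, while those that co-occur with only one of them are dropped. No content analysis of the resources is involved at any point; only tag assignments are used.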