Web Summarization and Browsing Through Semantic Tag Clouds

Antonio M. Rinaldi (Università degli Studi di Napoli Federico II, Napoli, Italy)
Copyright: © 2019 | Pages: 23
DOI: 10.4018/IJIIT.2019070101

Abstract

Managing electronic documents remains an open issue in the digital era. It is especially challenging on the internet, where the large amount of data calls for more efficient and effective methods and techniques for mining and representing information. In this context, document summarization, browsing processes and visualization techniques have had a great impact on several dimensions of user information perception. At the same time, the use of ontologies for knowledge representation has grown rapidly in recent years in several application domains, together with social-based techniques such as tag clouds. This form of visualization is becoming particularly useful in the interaction between users and social applications, where a huge amount of data requires effective and efficient interfaces. In this article, the authors propose a novel methodology based on a combination of ontologies and tag clouds for browsing and summarizing web document collections; they call this tool Semantic Tag Cloud.

1. Introduction

The extremely rapid growth of user-centered information on the “Social Web” requires novel methodologies and techniques to assist users during their information searching and browsing. In this context, people use several tagging services to manage, organize and discover useful information.

Indeed, tagging is simple: it does not require a lot of thinking, and it is very useful for finding relevant objects. People tag pictures, videos, and other resources with a couple of keywords to easily retrieve and share them at a later time. There are several ways to help users in these tasks, and in recent years new techniques have been proposed. One of these approaches is based on the creation of tag clouds. Tag clouds are visual representations of a set of terms which represent several document dimensions.

They show a set of terms in which text features (e.g. size, color, weight) are used to represent relevant properties of the words and the collected documents. They can be arranged according to different aggregations of visual features, such as (i) a tag for the frequency of each item; (ii) a global tag cloud, where the frequencies are aggregated over all items and users; (iii) a cloud that contains categories, with size indicating the number of subcategories. Different visual representations may affect people’s performance in extracting information out of keyword summaries (Felix et al., 2018).
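
As an illustration of how a text feature can encode a property of the terms, the following minimal sketch maps tag frequency to font size for a global tag cloud. It is not the method proposed in the article: the function name `tag_cloud_sizes`, the linear scaling, the size range of 10 to 40 and the sample tags are all assumptions made for the example.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_size=10, max_size=40):
    """Map each tag to a font size that grows linearly with its frequency.

    Assumption: linear scaling between min_size and max_size; real tag
    cloud tools may use logarithmic or rank-based scaling instead.
    """
    counts = Counter(tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # If every tag is equally frequent, all tags get the smallest size.
        scale = (n - lo) / span if span else 0.0
        sizes[tag] = round(min_size + scale * (max_size - min_size))
    return sizes


# Hypothetical tags pooled over all items and users (a "global" cloud).
tags = ["web", "ontology", "web", "tag", "web", "ontology"]
sizes = tag_cloud_sizes(tags)
print(sizes)
```

The most frequent tag receives the largest size and the least frequent the smallest; other features named above, such as color or weight, could be driven by other properties in the same way.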

Tag clouds arise from the collaborative tagging paradigm (Mathes, 2004; Hammond et al., 2005) used in several social software websites. In these systems, users annotate contents with free keywords (tags), defining associated metadata without any need to use existing, pre-defined and authoritative indexing structures; this classification system is called folksonomy (Wal, 2007). Folksonomies have a high impact on user tasks and are in strong contrast with other forms of term classification (e.g. thesauri and ontologies). As a visualization tool, the tag cloud implies less cognitive and physical workload than thinking of a search tag that defines the thematic field one would like to explore and entering it into the search field (Sinclair and Cardew-Hall, 2008); for example, after finding an initial tag and the associated resources, users can start browsing using tags or make use of related tag lists.

Social and collaborative systems have greatly increased the popularity of this type of visualization, but several problems arise from their knowledge base structures. In fact, even if tag clouds have been shown to help users get a high-level understanding of the data and to support people in casual exploration (Rivadeneira et al., 2007), the completely free choice of tags entails several problems for users. For example, it is hard to get a full impression of the tags used in the whole system; users often have to deal with general linguistic problems related to folksonomies (Golder and Huberman, 2006; Grahl et al., 2007); structured ways of exploration are hardly provided; and user interfaces of folksonomy systems often fail to support users in finding appropriate search tags and creating efficient queries for discovering interesting contents. In addition, the use of different word forms to represent the same concept, or of the same word form to represent different concepts (i.e. synonymy and polysemy), makes it hard to recognize the right topic in a document.

Moreover, as discussed in (Hassan-Montero and Herrero-Solana, 2006; Begelman et al., 2006), if visible tags are selected only by their usage frequency, there might be a problem of high semantic density, which means that very few topics and their prominent tags tend to dominate the whole visualization while less important items fade out (Hearst and Rosner, 2008; Rinaldi, 2012). Generally speaking, data visualization techniques, and in particular user interfaces for information access, have a great influence on user perception (Yang et al., 2014; Mayer et al., 2014).
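
The effect of frequency-only selection can be seen on a toy example. The sketch below is only an illustration under stated assumptions: the tags, their usage counts and their topic labels are invented, and counting topics among the visible tags is a simple stand-in, not a measure of semantic density defined in the cited works.

```python
from collections import Counter

# Hypothetical data: tag -> (usage count, topic label).
TAGS = {
    "python": (120, "programming"),
    "java": (95, "programming"),
    "code": (90, "programming"),
    "tutorial": (80, "programming"),
    "recipe": (30, "cooking"),
    "pasta": (25, "cooking"),
    "hiking": (12, "travel"),
}


def top_k_by_frequency(tags, k):
    """Return the k most used tags, most frequent first."""
    ranked = sorted(tags, key=lambda t: tags[t][0], reverse=True)
    return ranked[:k]


def topic_coverage(tags, selected):
    """Count how many of the selected tags fall under each topic."""
    return Counter(tags[t][1] for t in selected)


visible = top_k_by_frequency(TAGS, 4)
coverage = topic_coverage(TAGS, visible)
print(visible)
print(coverage)
```

With these invented counts, all four visible tags belong to a single topic, while the cooking and travel tags do not appear at all, which is the kind of dominance described above.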
