Ranking Algorithm for Semantic Document Annotations


Syarifah Bahiyah Rahayu (Universiti Kebangsaan Malaysia, Malaysia)
Copyright: © 2012 | Pages: 10
DOI: 10.4018/ijirr.2012010101


A semantic annotation represents the metadata of a document based on a domain ontology. The purpose of this paper is to develop a ranking algorithm for semantic document annotations and to evaluate its performance in a Semantic Web (SW) application. The evaluation compares the ranking algorithm against other algorithms. For this purpose, all the algorithms are applied in the SW application, a research prototype retrieval engine called PicoDoc. The system framework of PicoDoc is based on the OCAS2008 ontology. For the experiments, a real-life dataset drawn from a corpus of news articles from the ABC and BBC websites is selected. The experiments show promising results in retrieving related information using the ranking algorithm.


The Semantic Web (SW) is a web with meaning. Thus, the SW helps computers understand the meaning of content. For example, with the help of the SW, a computer may distinguish whether the word “pitch” refers to music or to sport. To assist computers, the SW is built on statements. These statements are known as semantic annotations. A semantic annotation represents a summary of the document based on a domain ontology, and it contains no duplicate statements. Thus, the meaning of the content is defined as knowledge. For instance, when the query “who hits a pitch shot?” is submitted, the computer knows that it refers to the sport domain. Hence, the ranked results are related to golf. The whole process seems a good alternative for users to obtain relevant results. Crucial to the process is the degree of relevance of the ranked results.
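As a rough illustration of the idea above, an annotation can be pictured as a duplicate-free collection of statements. The following is a minimal sketch, not the representation used in the paper: the `Statement` type, the predicate names `is_a` and `belongs_to`, and the helper `domain_of` are all hypothetical, chosen only to mirror the “pitch shot” example.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One subject-predicate-object statement (hypothetical structure)."""
    subject: str
    predicate: str
    obj: str


def annotate(statements):
    """Return a duplicate-free annotation, keeping first-seen order."""
    return list(dict.fromkeys(statements))


def domain_of(concept, annotation):
    """Follow is_a / belongs_to links upward until no further link exists."""
    parents = {
        s.subject: s.obj
        for s in annotation
        if s.predicate in ("is_a", "belongs_to")
    }
    seen = set()
    while concept in parents and concept not in seen:
        seen.add(concept)  # guard against cyclic links
        concept = parents[concept]
    return concept


# Illustrative statements only; they are not taken from the paper's dataset.
raw = [
    Statement("golfer", "hits", "pitch_shot"),
    Statement("pitch_shot", "is_a", "golf_stroke"),
    Statement("golfer", "hits", "pitch_shot"),  # duplicate, dropped below
    Statement("golf_stroke", "belongs_to", "sport"),
]
annotation = annotate(raw)
```

With these illustrative statements, the duplicate collapses and `domain_of("pitch_shot", annotation)` resolves to the sport domain rather than music.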

Researchers have paid attention to several aspects of the SW, such as ontology creation (Davulcu et al., 2004; Chung et al., 2002; Kashyap et al., 2005) and annotation automation (Belhajjame et al., 2008; Cimiano et al., 2004; Reeve & Han, 2005; Valarakos et al., 2004). However, semantic retrieval encompasses several unexplored dimensions that have lately attracted research attention in other disciplines, such as ontology alignment (Euzenat et al., 2004), lexicographic databases (Li et al., 2003), context-dependent similarity (Rodriguez et al., 2004), distance metrics (Lin, 1998) and description logics (Borgida et al., 2005; d’Amato et al., 2006; Hu et al., 2007). The purpose of semantic retrieval is similar to that of Information Retrieval (IR). However, semantic retrieval focuses on retrieving information (knowledge) in the SW environment, which can be done by exploiting the advantages of semantic document annotation. Some of these unexplored dimensions of semantic retrieval appear to be important and worthy of investigation in the context of ranking semantic document annotations. However, little research has been carried out on ranking semantic document annotations. An investigation of these issues is important because it scores and ranks semantic document annotations based on their degree of knowledge. This measurement establishes a ranking list according to the degree of relevant knowledge in each semantic document annotation. Furthermore, previous empirical research has focused primarily on retrieving information. In traditional IR, documents are scored and ranked based on their content rather than on the meaning of that content. This study seeks to extend semantic relevance by addressing the gaps in scoring and ranking semantic document annotations. Hence, the ranking algorithm applies concept spreading in order to capture the knowledge effectively.
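The preview does not give the details of the ranking algorithm, so the following is only a generic sketch of how spreading a query's concepts over an ontology graph could be used to score annotations. It assumes that "concept spreading" resembles classic spreading activation; the function names, the `decay` and `max_hops` parameters, the toy graph and the document names are all hypothetical.

```python
def spread_activation(graph, seeds, decay=0.5, max_hops=2):
    """Spread activation from seed concepts to their neighbours.

    Each hop passes on `decay` times the sender's energy; a concept keeps
    the highest activation it has received.
    """
    activation = {concept: 1.0 for concept in seeds}
    frontier = dict(activation)
    for _ in range(max_hops):
        next_frontier = {}
        for concept, energy in frontier.items():
            for neighbour in graph.get(concept, ()):
                passed = energy * decay
                if passed > activation.get(neighbour, 0.0):
                    activation[neighbour] = passed
                    next_frontier[neighbour] = passed
        frontier = next_frontier
    return activation


def rank_annotations(annotations, graph, query_concepts):
    """Score each annotation by summed activation, highest first."""
    activation = spread_activation(graph, query_concepts)
    scores = {
        doc: sum(activation.get(c, 0.0) for c in set(concepts))
        for doc, concepts in annotations.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy ontology links and annotations, for illustration only.
graph = {
    "pitch_shot": ["golf"],
    "golf": ["sport", "pitch_shot"],
    "sport": ["golf"],
    "pitch_tone": ["music"],
    "music": ["pitch_tone"],
}
annotations = {
    "doc_music": ["music", "pitch_tone"],
    "doc_sport": ["sport"],
    "doc_golf": ["pitch_shot", "golf"],
}
ranking = rank_annotations(annotations, graph, ["pitch_shot"])
```

In this toy setting, a query about a pitch shot ranks the golf annotation first, a general sport annotation second, and the music annotation last, because activation only reaches concepts connected to the query concept in the graph.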

The purpose of this study was to develop a ranking algorithm for semantic document annotations. More specifically, the study aimed to achieve the following research objectives:

  1. To develop a ranking algorithm for semantic document annotation.

  2. To evaluate performance of the proposed algorithm in the SW application.

Thus, this study presents an alternative method to score and rank documents based on semantic annotation. The method captures and exploits document richness in order to score and rank semantic document annotations. At this stage, the work focuses on document-level semantic relevance. The objects of evaluation are semantic annotations of a corpus of news articles from the ABC and BBC websites. The unit of analysis is the ranking list of semantic document annotations.
