A New Dynamic Neighbourhood-Based Semantic Dissimilarity Measure for Ontology

Sathiya Balasubramanian (College of Engineering, Guindy, Anna University, Chennai, India) and Geetha T. V. (College of Engineering, Guindy, Anna University, Chennai, India)
Copyright: © 2019 | Pages: 18
DOI: 10.4018/IJIIT.2019070102

Abstract

The semantic web is a global initiative which employs ontologies to offer rich, semantic-based knowledge representation. Concepts in these ontologies are compared using (dis)similarity measures to find (dis)similarities between them. Despite the existence of numerous (dis)similarity measures, none has dynamically determined the amount of information required to discover (dis)similarities between concepts. In this article, a new, efficient, feature-based semantic dissimilarity measure is proposed whose prime novelty lies in the dynamic selection of the semantic neighbourhood (features) of the concepts. The neighbourhood is selected dynamically in accordance with the local density of the concept and the density of the ontology, as determined by the proposed Density Coefficient. Further, the proposed measure scales down the dissimilarity value in accordance with the depth of the concept pair, using the novel Depth Coefficient.

1. Introduction

To accomplish the goals of the semantic web, any representation of data must be rich, well-structured and connected. Such a representation is provided by the ontology, an abstract model that describes a domain of interest with a set of concepts and rich relationships (IS-A and other relations) among those concepts. With the development of the semantic web, there has been a remarkable growth in the number of ontologies available. Systems that process these ontologies need a basic understanding of the underlying information in the ontology to facilitate improved retrieval, management and exploitation.

Assessing similarities between ontological concepts is a basic step towards understanding the underlying information (Sánchez et al., 2012), and it is carried out by the various similarity measures reported in the literature. In general, a similarity measure takes as input two concepts from the same ontology or from different ontologies and returns a real value, usually ranging from 0 to 1.

Similarity measures are mainly used in the following application areas: 1) ontology and document clustering to discover similar concepts (Do & Rahm, 2007; Hamdi et al., 2010; Hu, Qu & Cheng, 2008; Sridevi & Nagaveni, 2011), 2) ontology-matching systems for semantic interoperability (Euzenat & Shvaiko, 2007), and 3) ontology mapping for semantic integration. In these areas, the measures are used to eliminate heterogeneity among different ontologies of the same domain, thus enabling semantic interoperability and integration among the ontologies.

They are also used in a variety of other applications, such as classification (Im et al., 2018), query expansion (Singh & Kumar, 2017), similar concept discovery in the biomedical field (Pesquita et al., 2009; Zhang et al., 2008), recommendation (Likavec, Osborne & Cena, 2015), e-learning (Deborah, Baskaran & Kannan, 2012), web service discovery (Fellah, Malki & Elci, 2016), and assorted natural language processing tasks such as spelling error detection and correction (Budanitsky & Hirst, 2001), information retrieval (Hliaoutakis, 2006; Hwang & Kim, 2009), cross-lingual processing (Huang & Kuo, 2010), detection of synonyms (Lin, 1998), and word sense disambiguation (Patwardhan, Banerjee & Pedersen, 2003).

As a general rule, each similarity measure exploits a different view of the information in the ontology: linguistic, semantic (structural), or extensional (instance-based). Linguistic similarity measures consider the information about a concept in terms of its name, label, comment, annotation, synonyms, etc. Structural similarity measures use the structural information of the concept, such as its depth, Information Content (IC), neighbourhood, synset (synonym set), and the path length between two concepts, to compute the similarity. Extensional measures, such as the Jaccard similarity and the Hamming distance, use the instances of the concepts to compute similarity values.
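As a small illustration of the extensional view, the following Python sketch computes the Jaccard similarity and a normalised Hamming distance over the instance sets of two concepts. The function names, the instance identifiers and the convention of returning 0.0 for empty inputs are assumptions made for this example; they are not taken from the article.

```python
def jaccard_similarity(instances_a, instances_b):
    """Shared instances divided by all instances of either concept."""
    a, b = set(instances_a), set(instances_b)
    union = a | b
    if not union:
        # Assumed convention: two concepts without instances score 0.0.
        return 0.0
    return len(a & b) / len(union)


def normalized_hamming_distance(instances_a, instances_b, universe):
    """Fraction of instances in `universe` that belong to exactly one concept.

    Each concept is treated as a binary membership vector over `universe`;
    the Hamming distance counts the positions where the two vectors differ.
    """
    a, b = set(instances_a), set(instances_b)
    items = list(universe)
    if not items:
        return 0.0  # assumed convention for an empty universe
    mismatches = sum((x in a) != (x in b) for x in items)
    return mismatches / len(items)


# Hypothetical instance sets for two concepts.
car = {"i1", "i2", "i3"}
vehicle = {"i2", "i3", "i4", "i5"}
all_instances = car | vehicle

print(jaccard_similarity(car, vehicle))                          # 2 shared of 5
print(normalized_hamming_distance(car, vehicle, all_instances))  # 3 differ of 5
```

Note that the Hamming distance is a dissimilarity: larger values mean the concepts share fewer instances, so it is typically converted (for example as one minus the normalised distance) when a similarity value is needed.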

In this paper, we concentrate on measures which exploit semantic information. Semantic similarity measures in the literature are classified into categories based on the type of information used: path-based, depth-based, information-content (IC) based, hybrid, and feature-based ones (Sathiya et al.)
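To make two of the categories listed above concrete, the sketch below computes a simple path-based similarity (edge counting) and a depth-based similarity in the widely used Wu and Palmer form over a toy IS-A hierarchy. The hierarchy, the function names and the choice of assigning depth 1 to the root are assumptions made for illustration only; this is not the measure proposed in the article.

```python
# Hypothetical IS-A hierarchy, stored as child -> parent (root has no parent).
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "bicycle": "vehicle",
    "animal": "entity",
    "dog": "animal",
}


def path_to_root(concept):
    """Return the concept followed by all of its ancestors up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Number of nodes from the concept to the root (assumed: root depth 1)."""
    return len(path_to_root(concept))


def lowest_common_subsumer(c1, c2):
    """Deepest concept that is an ancestor of (or equal to) both concepts."""
    ancestors_1 = set(path_to_root(c1))
    for candidate in path_to_root(c2):
        if candidate in ancestors_1:
            return candidate
    raise ValueError("concepts do not share a root")


def path_length(c1, c2):
    """Number of IS-A edges on the path between the two concepts."""
    lcs = lowest_common_subsumer(c1, c2)
    return depth(c1) + depth(c2) - 2 * depth(lcs)


def path_similarity(c1, c2):
    """Path-based: shorter paths give values closer to 1."""
    return 1.0 / (1.0 + path_length(c1, c2))


def wu_palmer_similarity(c1, c2):
    """Depth-based: a deeper common subsumer gives values closer to 1."""
    lcs = lowest_common_subsumer(c1, c2)
    return 2.0 * depth(lcs) / (depth(c1) + depth(c2))


for pair in [("car", "bicycle"), ("car", "dog")]:
    print(pair, path_similarity(*pair), wu_palmer_similarity(*pair))
```

In this toy hierarchy, "car" and "bicycle" share the subsumer "vehicle" and therefore score higher under both measures than "car" and "dog", whose only common subsumer is the root.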
