A Semantic Information Content Based Method for Evaluating FCA Concept Similarity


Hongtao Huang (Engineering Research Center of Henan Provincial Universities for Educational Information, Henan Normal University, Xinxiang, China), Cunliang Liang (Engineering Research Center of Henan Provincial Universities for Educational Information, Henan Normal University, Xinxiang, China) and Haizhi Ye (Engineering Research Center of Henan Provincial Universities for Educational Information, Henan Normal University, Xinxiang, China)
DOI: 10.4018/IJCINI.2018040106

Abstract

The probability information content-based method for computing FCA concept similarity relies on the frequency of concepts in a corpus. Because it uses only occurrence probability as the information content metric, its accuracy is limited. This article introduces a semantic information content-based method for FCA concept similarity evaluation. In addition to occurrence probability, it uses the superordinate and subordinate semantic relationships of concepts to measure information content, which captures the degree of generality and specificity of concepts more accurately. The semantic information content similarity can then be calculated with the help of an ISA hierarchy derived from the domain ontology. Unlike probability information content, the evaluation of semantic information content is independent of the corpus. Semantic information content is then used for FCA concept similarity evaluation, and a weighted bipartite graph is used to improve the efficiency of the similarity evaluation. The experimental results show that this semantic information content-based method improves on the accuracy of the probabilistic information content-based method without loss of time performance.
Article Preview

Information content describes the basic dimension, size and volume of a concept; it represents the information capacity of a concept in a specific environment. A specific or specialized entity contains more information than a general or abstract entity. This basic principle of information content has been widely used in evaluating semantic similarity between concepts (A. Formica, 2008; Jiang, Bai, Zhang, & Hu, 2017; Resnik, 1997). In this work, information content is associated with the scale of the subordinate tree of a concept. No matter how many internal concepts are introduced into the hierarchy, these concepts can be described by the leaf nodes on their subordinate trees, and they can be distinguished from other concepts that have different leaf node sets. The leaf node set is defined as follows.

  • Definition 1: Let $C$ be the concept set of domain ontology $O$. Then, for any concept $c \in C$, the leaf node set $L(c)$ is defined by:

    $L(c) = \{\, l \in C \mid l \in \mathrm{hypo}(c) \wedge \mathrm{hypo}(l) = \emptyset \,\}$    (1)

where $l$ is a leaf node on the hierarchy of $c$, $\mathrm{hypo}(c)$ is the subordinate word set of $c$, and $l$ is a leaf node if and only if $\mathrm{hypo}(l) = \emptyset$. Multiple inheritance among the internal nodes of the subordinate word tree leads to multiple paths to the same leaf node. To avoid this redundancy, each leaf node is counted only once when $L(c)$ is created. The more leaf nodes there are on the subordinate tree of a concept, the more general the concept is. All leaf nodes share the same maximum information content value, because they are specific enough to be distinguished from other nodes.
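The construction of the leaf node set L(c) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `leaf_set`, the `children` mapping, and the toy vehicle hierarchy are all hypothetical, and the sketch assumes the ISA hierarchy is available as a mapping from each concept to its direct subordinates. It covers only the leaf collection step, including the rule that a leaf reachable along several paths is counted once; the information content formula itself is not part of this preview and is not reproduced.

```python
from typing import Dict, List, Set


def leaf_set(concept: str, children: Dict[str, List[str]]) -> Set[str]:
    """Collect the leaves strictly below `concept`, each counted once.

    `children` maps a concept to its direct subordinates (hypothetical
    representation of the ISA hierarchy). A concept with no entry, or
    with an empty list, is treated as a leaf.
    """
    leaves: Set[str] = set()
    visited: Set[str] = set()
    stack = list(children.get(concept, []))
    while stack:
        node = stack.pop()
        if node in visited:
            # Reached again through another parent (multiple inheritance).
            continue
        visited.add(node)
        subordinates = children.get(node, [])
        if subordinates:
            stack.extend(subordinates)
        else:
            # Empty subordinate set: the node is a leaf.
            leaves.add(node)
    return leaves


# Hypothetical toy hierarchy: "amphibious_car" inherits from both
# "car" and "boat", so two paths lead to it from "vehicle".
children = {
    "vehicle": ["car", "boat"],
    "car": ["amphibious_car", "sedan"],
    "boat": ["amphibious_car", "canoe"],
}

if __name__ == "__main__":
    for name in ("vehicle", "car", "sedan"):
        print(name, sorted(leaf_set(name, children)))
```

In this toy hierarchy, "vehicle" has a larger leaf node set than "car", which matches the statement that more leaf nodes indicate a more general concept. Using sets makes the count-once rule automatic, regardless of how many paths lead to a leaf.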
