Information content describes the basic dimension, size, and volume of a concept; it represents the information capacity of a concept in a specific environment. The basic principle of information content is that a specific or specialized entity contains more information than a general or abstract one, and this principle has been widely used in evaluating semantic similarity between concepts (Formica, 2008; Jiang, Bai, Zhang, & Hu, 2017; Resnik, 1997). In this work, information content is associated with the scale of the subordinate tree of a concept. No matter how many internal concepts are introduced into the hierarchy, these concepts can be described by the leaf nodes of their subordinate trees. Furthermore, they can be distinguished from other concepts that have different leaf node sets. The following defines the leaf nodes of a concept.
$$L(c) = \{\, l \mid l \in \mathrm{Hypo}(c) \wedge \mathrm{Hypo}(l) = \emptyset \,\}$$

where $l$ is a leaf node in the hierarchy of $c$, $\mathrm{Hypo}(l)$ is the subordinate word set of $l$, and $l$ is a leaf node if and only if $\mathrm{Hypo}(l) = \emptyset$. Multiple inheritance among the internal nodes of the subordinate word tree leads to multiple paths to the same leaf node. To avoid this redundancy, each leaf node is counted only once when $L(c)$ is created. The more leaf nodes there are in the subordinate tree of a concept, the more general the concept is. All leaf nodes have the same, maximum information content value, because they are specific enough to be distinguished from other nodes.
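The construction of L(c) described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `leaf_set`, the dictionary `hyponyms` (mapping each concept to its direct subordinate concepts), and the toy hierarchy `TOY` are all hypothetical and chosen only for illustration. The sketch assumes that a node is a leaf when it has no subordinate concepts, and it uses a set so that a leaf reachable through several parents is counted only once.

```python
from typing import Dict, List, Set


def leaf_set(concept: str, hyponyms: Dict[str, List[str]]) -> Set[str]:
    """Collect the leaf nodes in the subordinate tree of `concept`.

    `hyponyms` maps a concept to its direct subordinate concepts
    (an assumed representation, not taken from the article).
    """
    leaves: Set[str] = set()   # a set, so each leaf is counted only once
    visited: Set[str] = set()  # avoids re-walking nodes with several parents
    stack = list(hyponyms.get(concept, []))
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        children = hyponyms.get(node, [])
        if not children:
            # No subordinate concepts: the node is a leaf.
            leaves.add(node)
        else:
            stack.extend(children)
    return leaves


# Hypothetical toy hierarchy with multiple inheritance:
# "dog" is reachable through both "mammal" and "pet".
TOY: Dict[str, List[str]] = {
    "animal": ["mammal", "pet"],
    "mammal": ["dog", "whale"],
    "pet": ["dog", "goldfish"],
}

animal_leaves = leaf_set("animal", TOY)
mammal_leaves = leaf_set("mammal", TOY)
```

In this toy hierarchy, "dog" appears once in `animal_leaves` even though two paths lead to it, and the more general concept "animal" has more leaves than "mammal", which matches the relationship between leaf count and generality stated above. Under the reading assumed here, a concept that is itself a leaf gets an empty leaf set.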