An Ontological Structure for Semantic Retrieval Based on Description Logics

Hongwei Wang (Tongji University, China), James N. K. Liu (The Hong Kong Polytechnic University, China) and Wei Wang (Fudan University, China)
DOI: 10.4018/978-1-61520-819-7.ch006


Current information retrieval either relies on an encoding process to describe a given item or performs a full-text analysis to search for user-specified words. However, these syntax-based methods reflect only part of the content, so they can hardly ensure content matching. An ontology is an explicit specification of a conceptualization, which can explicitly and formally express the semantics of concepts and their relationships. A domain ontology is therefore better suited to information retrieval at the semantic level. The structure of an ontology is determined by its specific applications. Although several views on ontological structures already exist in different contexts, word association derived by reasoning over the semantic relations between terms, the key point for semantic retrieval, has not yet been solved properly. This chapter proposes a six-element ontological structure for semantic retrieval and uses description logic to semantically describe atomic terms, complex terms, instances, instance descriptions, attribute assignments and axioms. The new structure is then evaluated against Gruber’s criteria, including explicitness and objectivity, consistency, extensibility, minimal encoding bias and minimal ontological commitment. Based on the new structure, we propose two reasoning mechanisms for semantic retrieval, namely terms-oriented and instances-oriented reasoning. Conversion mechanisms and determining algorithms are also proposed, which enable reasoning over various relations in a specific area according to rules made by domain experts. Finally, we put forward four kinds of rules for information retrieval and analyze the applications of the new structure in semantic retrieval.
Chapter Preview


The contents of the Internet change every day. We are witnessing exponential growth in the data accumulated within organizations and systems. Autonomous data repositories storing different types of data are becoming available. This makes it impossible for users to be aware of the locations, structures, query languages and semantics of the data in the various repositories. Without efficient retrieval tools, users have no way to pinpoint where the information they need is located, so they have to search every possible server blindly, which can be time-consuming. Current retrieval methods are mainly based on manually built subject directories or keyword matching. These syntax-based methods are limited in their ability to reveal semantic information and reflect only part of the content, so they can hardly ensure content matching (Zghal, Aufaure, & Mustapha, 2007; Köhler, Philippi, Specht, & Rüegg, 2006).

Figure 1 shows the general process of Chinese information retrieval, which includes segmentation, word association, database search and information integration. Of these, segmentation, database search and information integration have already been studied in depth (Foo & Li, 2004; Zhang, Lu, & Zou, 2004; Feng, Hu, Zhao, & Yi, 2006; Fu, Kit, & Webster, 2008; Wang & Du, 2003; Shah, Finin, Joshi, Cost, & Matfield, 2002). In recent years, research efforts have focused on word association, which relates a keyword Ti to a term set {ti1, ti2, …, tim}. For this step, most search engines use syntactic matching: they set up a data dictionary and then search by strict matching and combination. This method, however, often returns crude results because it lacks semantic understanding. For example, a search for “computer” may return “computer mall”, “portable computer” and so on, but rarely or never “laptop”. If word association were available at the semantic level, the efficiency of information retrieval would improve (Gruber, 1993a; Guarino, Masolo, Vetere, & Council, 1999; Freitas & Bittencourt, 2000; Finin, Ding, Pan, Joshi, & Kolari, 2005).

Figure 1. The general process of information retrieval
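
The contrast between syntactic matching and semantic word association for the “computer” example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the documents, the association table and both function names are hypothetical, and the table is written by hand here, whereas the chapter aims to derive such term sets by reasoning over an ontology.

```python
# Hypothetical toy collection, for illustration only.
DOCUMENTS = [
    "computer mall opening hours",
    "portable computer reviews",
    "laptop battery replacement guide",
]

# Assumed word associations: keyword Ti -> term set {ti1, ti2, ..., tim}.
# Hand-written here; not derived by any reasoning mechanism.
ASSOCIATIONS = {
    "computer": {"computer", "portable computer", "laptop"},
}


def syntactic_search(keyword, documents):
    """Return the documents that literally contain the keyword."""
    return [doc for doc in documents if keyword in doc]


def semantic_search(keyword, documents, associations):
    """Expand the keyword to its associated term set, then match any term."""
    terms = associations.get(keyword, {keyword})
    return [doc for doc in documents if any(term in doc for term in terms)]


if __name__ == "__main__":
    print(syntactic_search("computer", DOCUMENTS))
    print(semantic_search("computer", DOCUMENTS, ASSOCIATIONS))
```

With this toy data, the syntactic search misses the “laptop” document because the string “computer” never occurs in it, while the expanded search returns all three documents.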

Ontology is a useful tool to support the process of word association. In the context of computer and information sciences, an ontological model defines a set of representational primitives with which to model a domain of knowledge or discourse. The representational primitives are typically classes (or sets), attributes (or properties), and relationships (or relations among class members). The definitions of the representational primitives include information about their meaning and constraints on their logically consistent application. In the context of database systems, an ontology can be viewed as a level of abstraction of data models, analogous to hierarchical and relational models, but intended for modeling knowledge about individuals, their attributes, and their relationships to other individuals. Ontologies are typically specified in languages that allow abstraction away from data structures and implementation strategies; in practice, the languages of ontologies are closer in expressive power to first-order logic than the languages used to model databases. For this reason, ontologies are said to be at the semantic level, whereas database schemas are models of data at the syntactical or physical level. Because they are independent of lower-level data models, ontologies are used for integrating heterogeneous databases, enabling interoperability among disparate systems, and specifying interfaces to independent, knowledge-based services.
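
The three kinds of representational primitives named above (classes, attributes and relationships among class members) can be made concrete with a minimal Python sketch. All names here (`OntologyClass`, `Individual`, `Ontology` and the example data) are hypothetical and chosen for illustration; this is not the six-element structure proposed in the chapter, and the two checks that raise `ValueError` stand in only loosely for constraints on logically consistent application.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    """A class (set) with its direct superclasses and declared attributes."""
    name: str
    parents: set = field(default_factory=set)
    attributes: set = field(default_factory=set)


@dataclass
class Individual:
    """A member of a class, with values for some attributes."""
    name: str
    class_name: str
    values: dict = field(default_factory=dict)


class Ontology:
    def __init__(self):
        self.classes = {}
        self.individuals = {}
        self.relations = set()  # (subject, relation, object) triples

    def add_class(self, name, parents=(), attributes=()):
        # Constraint: a superclass must be declared before it is used.
        for parent in parents:
            if parent not in self.classes:
                raise ValueError(f"unknown superclass: {parent}")
        self.classes[name] = OntologyClass(name, set(parents), set(attributes))

    def ancestors(self, class_name):
        """Return all direct and indirect superclasses of a class."""
        seen, stack = set(), [class_name]
        while stack:
            for parent in self.classes[stack.pop()].parents:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def add_individual(self, name, class_name, **values):
        # Constraint: only attributes declared on the class or inherited
        # from one of its superclasses may be assigned.
        allowed = set(self.classes[class_name].attributes)
        for ancestor in self.ancestors(class_name):
            allowed |= self.classes[ancestor].attributes
        unknown = set(values) - allowed
        if unknown:
            raise ValueError(f"undeclared attributes: {sorted(unknown)}")
        self.individuals[name] = Individual(name, class_name, values)

    def relate(self, subject, relation, obj):
        """Record a relationship between two existing individuals."""
        for individual in (subject, obj):
            if individual not in self.individuals:
                raise ValueError(f"unknown individual: {individual}")
        self.relations.add((subject, relation, obj))

    def is_instance_of(self, individual_name, class_name):
        """Membership check that follows the superclass hierarchy."""
        own = self.individuals[individual_name].class_name
        return own == class_name or class_name in self.ancestors(own)


onto = Ontology()
onto.add_class("Computer", attributes={"weight_kg"})
onto.add_class("Laptop", parents={"Computer"}, attributes={"battery_hours"})
onto.add_class("Person")
onto.add_individual("laptop_1", "Laptop", weight_kg=1.4, battery_hours=10)
onto.add_individual("alice", "Person")
onto.relate("alice", "owns", "laptop_1")
```

Because `Laptop` is declared as a subclass of `Computer`, `onto.is_instance_of("laptop_1", "Computer")` holds even though the individual was never labeled a computer directly. A flat keyword dictionary cannot draw this kind of inference.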
