The Semantic Web for Knowledge and Data Management

Zongmin Ma (Northeastern University, China) and Huaiqing Wang (City University of Hong Kong, Hong Kong)
Indexed In: SCOPUS
Release Date: August, 2008|Copyright: © 2009 |Pages: 386
ISBN13: 9781605660288|ISBN10: 1605660280|EISBN13: 9781605660295|DOI: 10.4018/978-1-60566-028-8

Description

While the current Web provides access to an enormous amount of information, that information is largely only human-readable. In response to this problem, the Semantic Web allows for explicit representation of the semantics of data so that it is machine-interpretable.

Semantic Web for Knowledge and Data Management: Technologies and Practices provides a single record of technologies and practices of the Semantic approach to the management, organization, interpretation, retrieval, and use of Web-based data. This groundbreaking collection offers state-of-the-art information to academics, researchers, and industry practitioners involved in the study, use, design, and development of advanced and emerging Semantic Web technologies.

Topics Covered

The many academic areas covered in this publication include, but are not limited to:

  • Automatic semantic annotation
  • Contextual hierarchy driven ontology learning
  • Data retrieval
  • Design diagrams
  • Domain knowledge management and applications
  • E-Tourism
  • Fuzzy Models
  • Machine Learning
  • Modeling
  • Ontological sources
  • Ontologies and intelligent agents
  • Ontology extraction
  • Ontology utilization
  • Probabilistic models for the semantic Web
  • Semantic models
  • Semantic ontologies
  • Semantic overlay networks
  • Semantic Web
  • Semantics-based approach
  • Software asset reuse
  • Storage concept improvement
  • SWARMS
  • Ubiquitous mobile communications
  • XML-based P2P information systems

Reviews and Testimonials

This book presents the latest research and application results in the Semantic Web.

– Zongmin Ma, Northeastern University, China

Students, researchers, and practitioners will benefit from the discussions of key theoretical concepts combined with guidelines for practical application and implementation. The chapters are accompanied by sections on future research directions, additional reading and questions for discussion, making it an ideal resource for coming to grips with the latest developments.

– Online Information Review, Vol. 33, No. 3

Table of Contents and List of Contributors

Preface

The World Wide Web (WWW) has drastically changed the availability of electronically accessible information. The WWW is currently the biggest information repository in the world and the most convenient means of sharing and exchanging information. While the current Web provides access to an enormous amount of information, that information is largely only human-readable, and it is increasingly difficult to find, access, present and maintain the information required by a wide variety of users. In response to this problem, the Semantic Web was proposed by Tim Berners-Lee, Director of the World Wide Web Consortium. The Semantic Web is defined as “an extension of the current World Wide Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation”. The Semantic Web allows the explicit representation of the semantics of data so that it is machine-interpretable. The Semantic Web will therefore enable a knowledge-based web and facilitate the use of agent-based technology to better mine and filter the information needs expressed by information consumers.

The Semantic Web is generally built on syntaxes that use URIs to represent data, usually in triple-based structures: many triples of URI data that can be held in databases or interchanged on the World Wide Web using a set of syntaxes developed especially for the task. These syntaxes are called Resource Description Framework (RDF) syntaxes. The layer above the syntax is a simple datatyping model: RDF Schema (RDFS) is designed to be a simple datatyping model for RDF. The Web Ontology Language (OWL) is an ontology language based upon RDF. OWL takes RDF Schema a step further by providing more in-depth properties and classes. The next step in the architecture of the Semantic Web is trust and proof.
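The triple-based structure described above can be illustrated with a minimal sketch. It is not an RDF library: the `Triple` type, the `match` helper and the `http://example.org/` namespace are assumptions made for illustration, while the `rdf:type` and `rdfs:subClassOf` URIs are the standard W3C ones.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement: subject, predicate and object, each identified by a URI."""
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace, for illustration only
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# A tiny in-memory collection of triples (illustrative data).
triples: List[Triple] = [
    Triple(EX + "book1", RDF_TYPE, EX + "Book"),
    Triple(EX + "Book", RDFS_SUBCLASS_OF, EX + "Publication"),
    Triple(EX + "book1", EX + "editedBy", EX + "personA"),
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


# Everything stated about book1.
for t in match(triples, s=EX + "book1"):
    print(t.predicate, "->", t.obj)
```

Real deployments would normally use an RDF toolkit and a serialization such as RDF/XML or Turtle; the sketch only shows that data, schema-level statements and typing all share the same three-part shape.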

As a next-generation Internet technology, the Semantic Web is typically application-oriented. With advances in, and in-depth applications of, computer and Internet technologies in data- and knowledge-intensive domains, the Semantic Web for knowledge and data management is emerging as a new discipline, and research and development in this area are receiving increasing attention. The requirements of large-scale deployment and interoperability of the Semantic Web represent a major challenge to data and knowledge management. They raise a number of issues regarding how to represent, create, manage and use not only ontologies as shared knowledge representations, but also the large volumes of metadata records used to annotate Web resources of diverse kinds. The Semantic Web for knowledge and data management is therefore a field that must be investigated by academic researchers together with developers and users from the database, artificial intelligence, and software and knowledge engineering areas.

This book focuses on the following issues of the Semantic Web: the theory of the Semantic Web and ontologies, data management and processing in the Semantic Web, ontology and knowledge management, and Semantic Web-based applications. It aims to provide a single account of technologies and practices in the Semantic Web for knowledge and data management. The objective of the book is to provide state-of-the-art information to academics, researchers and industry practitioners who are involved or interested in the study, use, design and development of advanced and emerging Semantic Web technologies, with the ultimate aim of empowering individuals and organizations to build competencies for exploiting the opportunities of the data and knowledge society. This book presents the latest research and application results in the Semantic Web. The chapters have been contributed by different authors and provide possible solutions for different types of technological problems concerning the Semantic Web for knowledge and data management.

Introduction
This book, which consists of twelve chapters, is organized into three major sections. The first section, comprising the first six chapters, discusses the Semantic Web and ontologies. The next three chapters, covering the Semantic Web for data and knowledge management, make up the second section. The third section, containing the final three chapters, focuses on applications of the Semantic Web.

First of all, we take a look at the issues of the Semantic Web and ontologies.

Research in ontology learning has traditionally separated ontology building from ontology evaluation. Moreover, it has used, for example, a sentence, a syntactic structure or a set of words to establish the context of a word, without accounting for the structure of the document or the relations between contexts. Lobna Karoui combines these elements to generate an appropriate context definition for each word. Based on this context, she proposes an unsupervised hierarchical clustering algorithm that extracts and evaluates the ontological concepts at the same time. The results show that her concept discovery approach improves the conceptual quality and relevance of the extracted ontological concepts, supports domain experts and eases the evaluation task for them.

In the Semantic Web context, information should be retrieved, processed, shared, reused and aligned as automatically as possible. Experience with such applications has shown that these tasks are rarely a matter of true or false; rather, they are procedures that require degrees of relatedness, similarity, or ranking. Apart from the wealth of applications that are inherently imprecise, information itself is often imprecise or vague. In order to represent and reason with this type of information in the Semantic Web, different general approaches for extending Semantic Web languages with the ability to represent imprecision and uncertainty have been explored. Hailong Wang et al. focus their attention on fuzzy extension approaches, which are based on fuzzy set theory. They review the existing proposals for extending the theoretical counterpart of the Semantic Web languages, description logics (DLs), and the languages themselves. They discuss the expressive power of the fuzzy DL formalism, its syntax and semantics, its knowledge bases, the decidability and computational complexity of the tableaux algorithm, and the fuzzy extension to OWL.
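The idea of degrees rather than true or false can be sketched with the classical Zadeh operators of fuzzy set theory (minimum for conjunction, maximum for disjunction, one minus the degree for negation). This is an illustration only: the concept names and membership degrees are invented, and the fuzzy DLs reviewed in the chapter may adopt other semantics.

```python
from typing import Dict


def fuzzy_and(a: float, b: float) -> float:
    """Zadeh conjunction: degree of membership in an intersection."""
    return min(a, b)


def fuzzy_or(a: float, b: float) -> float:
    """Zadeh disjunction: degree of membership in a union."""
    return max(a, b)


def fuzzy_not(a: float) -> float:
    """Zadeh negation: degree of membership in the complement."""
    return 1.0 - a


# Hypothetical membership degrees of one individual in two fuzzy concepts.
hotel: Dict[str, float] = {"Cheap": 0.75, "Central": 0.5}

cheap_and_central = fuzzy_and(hotel["Cheap"], hotel["Central"])
cheap_or_central = fuzzy_or(hotel["Cheap"], hotel["Central"])
not_cheap = fuzzy_not(hotel["Cheap"])

print(cheap_and_central, cheap_or_central, not_cheap)  # 0.5 0.75 0.25


def holds_at_least(degree: float, threshold: float) -> bool:
    """A fuzzy assertion is often checked against a threshold instead of
    being evaluated as strictly true or false."""
    return degree >= threshold


print(holds_at_least(cheap_and_central, 0.4))  # True
```

A crisp concept is the special case in which every degree is exactly 0 or 1, where the three operators reduce to ordinary Boolean logic.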

Ontologies are increasingly common, but little consideration is still given to how to store them efficiently. Edgar R. Weippl, Markus D. Klemen and Stefan Raffeiner present an improved database schema for storing ontologies. More specifically, they propose an intuitive and efficient way of storing arbitrary relationships, show that their database schema is well suited to storing both RDF and Topic Maps, and explain why it is more efficient by comparing it to other approaches. The proposed approach is built on reliable and efficient relational database management systems (RDBMS). It can easily be implemented for other systems and, because it is vendor-independent, existing data can be migrated from one RDBMS to another relatively easily.
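To make the general idea of keeping arbitrary relationships in a relational database concrete, here is a minimal sketch using Python's standard `sqlite3` module. It is a generic two-table statement layout chosen for illustration, not the schema proposed in the chapter; the table names, the helper functions and the `ex:` identifiers are all assumptions.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Every subject, predicate and object is stored once as a resource.
CREATE TABLE resource (
    id  INTEGER PRIMARY KEY,
    uri TEXT UNIQUE NOT NULL
);
-- A statement links three resources, so any relationship fits one table.
CREATE TABLE statement (
    subject_id   INTEGER NOT NULL REFERENCES resource(id),
    predicate_id INTEGER NOT NULL REFERENCES resource(id),
    object_id    INTEGER NOT NULL REFERENCES resource(id),
    PRIMARY KEY (subject_id, predicate_id, object_id)
);
""")


def resource_id(uri: str) -> int:
    """Return the id for a URI, inserting the resource if it is new."""
    conn.execute("INSERT OR IGNORE INTO resource (uri) VALUES (?)", (uri,))
    row = conn.execute("SELECT id FROM resource WHERE uri = ?", (uri,)).fetchone()
    return row[0]


def add_statement(subject: str, predicate: str, obj: str) -> None:
    """Store one subject-predicate-object relationship."""
    conn.execute(
        "INSERT OR IGNORE INTO statement VALUES (?, ?, ?)",
        (resource_id(subject), resource_id(predicate), resource_id(obj)),
    )


def objects_of(subject: str, predicate: str) -> List[str]:
    """List the objects related to a subject through a given predicate."""
    rows = conn.execute(
        """
        SELECT o.uri
        FROM statement st
        JOIN resource s ON s.id = st.subject_id
        JOIN resource p ON p.id = st.predicate_id
        JOIN resource o ON o.id = st.object_id
        WHERE s.uri = ? AND p.uri = ?
        ORDER BY o.uri
        """,
        (subject, predicate),
    ).fetchall()
    return [r[0] for r in rows]


# Illustrative data only.
add_statement("ex:Dog", "ex:subClassOf", "ex:Animal")
add_statement("ex:rex", "ex:type", "ex:Dog")
print(objects_of("ex:rex", "ex:type"))  # ['ex:Dog']
```

Because the layout uses only standard SQL tables and keys, the same design could in principle be moved between relational systems, which is the kind of vendor independence the paragraph above refers to.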

The emerging form of information with computer-processable meaning (semantics), as presented in the framework of the Semantic Web (SW), enables machines to access information more efficiently. Information is semantically annotated in order to ease the discovery and retrieval of knowledge. Ontologies are the basic element of the SW: they carry knowledge about a domain and enable interoperability between different resources. Another technology that draws considerable attention today is that of intelligent agents. Intelligent agents act on behalf of a user to complete tasks and may adapt their behavior to achieve their objectives. Kostas Kolomvatsos and Stathes Hadjiefthymiades provide an exhaustive description of the fundamentals of combining SW and intelligent agent technologies.

Recently, there has been an increasing interest in formalisms for representing uncertain information on the Semantic Web. This interest is triggered by the observation that knowledge on the web is not always crisp and we have to be able to deal with incomplete, inconsistent and vague information. The treatment of this kind of information requires new approaches for knowledge representation and reasoning on the web as existing Semantic Web languages are based on classical logic which is known to be inadequate for representing uncertainty in many cases. While different general approaches for extending Semantic Web languages with the ability to represent uncertainty are explored, Livia Predoiu and Heiner Stuckenschmidt focus their attention on probabilistic approaches. They survey existing proposals for extending semantic web languages or formalisms underlying Semantic Web languages in terms of their expressive power, reasoning capabilities as well as their suitability for supporting typical tasks associated with the Semantic Web.

The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. However, the lack of annotated semantic data is a bottleneck in making the Semantic Web vision a reality, so it is necessary to automate the process of semantic annotation. The past few years have seen a rapid expansion of activity in the semantic annotation area, and many methods have been proposed for automating the annotation process. However, due to the heterogeneity and lack of structure of Web data, automated discovery of targeted or unexpected knowledge still presents many challenging research problems. Jie Tang et al. study the problems of semantic annotation and introduce the state-of-the-art methods for dealing with them. They also give a brief survey of the systems developed on the basis of these methods. Several real-world applications of semantic annotation are introduced as well.

The next section takes a look at data and knowledge management with the Semantic Web and ontologies.

In a Peer-to-Peer (P2P) system, a Semantic Overlay Network (SON) models a network of peers whose connections are influenced by the peers' content, so that semantically related peers connect with each other. This is very common in P2P communities, where peers share common interests, and a peer can belong to more than one SON, depending on its own interests. Querying such systems is not an easy task: the retrieval of relevant data cannot rely on flooding approaches, which forward a query to the whole network. A way of selecting the peers that are more likely to provide relevant answers is necessary to support more efficient and effective query processing strategies. Federica Mandreoli et al. present a semantic infrastructure for routing queries effectively in a network of SONs. Peers are semantically rich, in that each peer's content is modelled with a schema on its local data, and peers are related to each other through semantic mappings defined between their schemas. A query is routed through the network by means of a sequence of reformulations, according to the semantic mappings encountered along the routing path. As reformulations may lead to semantic approximations, the authors define a fully distributed indexing mechanism that summarizes the semantics underlying whole subnetworks, in order to locate the semantically best directions in which to forward a query. They demonstrate through a rich set of experiments that their routing mechanism outperforms algorithms that are limited to knowledge of the peers directly connected to the querying peer, and that their approach is successful in a SON scenario.

Jie Tang et al. describe the architecture and the main features of SWARMS, a platform for domain knowledge management. The platform aims at providing services for (1) efficiently storing and accessing the ontological information; (2) visualizing the networking structure in the ontological data; (3) searching and mining the semantic data. One advantage of the system is that it provides a suite of components for not only supporting efficient semantic data storage but also searching and mining the semantics. Another advantage is that the system supports visualization in the process of search and mining, which would greatly help a normal user to understand the knowledge inside the ontological data. SWARMS can be easily customized to adapt to different domains. The system has been applied to several domains, such as News, Software, and Social Network. The authors present the performance evaluations of the system.

Knowledge representation and management techniques can be used efficiently to improve the data modeling and information retrieval (IR) functionalities of P2P Information Systems, which have recently attracted a lot of attention from both industrial and academic research communities. These functionalities can be achieved by pushing semantics into both data and queries, and by exploiting the derived expressiveness to improve the file sharing primitives and lookup mechanisms made available by first-generation P2P systems. XML-based P2P Information Systems are a more specific instance of this class of systems, in which the overall data domain is composed of very large, Internet-like distributed XML repositories from which users extract useful knowledge by means of IR methods implemented on top of XML join queries against the repositories. Alfredo Cuzzocrea first focuses on the definition and formalization of the class of XML-based P2P Information Systems, also deriving interesting properties of such systems. He then presents a framework based on knowledge representation and management, enriched via semantics, that allows knowledge to be processed efficiently and supports advanced IR techniques in XML-based P2P Information Systems, thus arriving at the definition of so-called Semantically-Augmented XML-based P2P Information Systems.

In the third section, we see the application aspects of the Semantic Web.

Traditional E-Tourism applications store data internally in a form that is not interoperable with similar systems. Hence, tourist agents spend plenty of time updating data about vacation packages in order to provide good service to their clients. Their clients, in turn, spend plenty of time searching for the 'perfect' vacation package, as the data about tourist offers are not integrated and are available from different spots on the Web. Danica Damljanović and Vladan Devedžić develop Travel Guides, a prototype system for tourism management, to illustrate how Semantic Web technologies combined with traditional E-Tourism applications help integrate tourism sources dispersed on the Web and enable the creation of sophisticated user profiles. Maintaining quality user profiles enables system personalization and adaptivity of the content shown to the user. The core of this system lies in ontologies: they enable machine-readable and machine-understandable representation of the data and, more importantly, reasoning.

Computing is becoming ubiquitous, and mobile communication platforms are becoming oriented towards integration with the Web, benefiting from the large amount of information available there and enabling new types of value-added services. Semantic and ontology technologies are seen as able to advance the seamless integration of the mobile and Web worlds. Anna V. Zhdanova, Ning Li and Klaus Moessner present the overall state of the art in ontology-related developments in mobile communication systems, namely the work towards the construction, sharing and maintenance of ontologies for mobile communications, and the reuse and application of ontologies and existing Semantic Web technologies in prototypes. Social, collaborative and technical challenges experienced in the project showcase the need to align the work of ontology experts across mobile communication projects in order to establish best practices in the area and drive standardization efforts. They indicate certain milestones in the integration of Semantic Web-based intelligence with mobile communications, such as performing ontology construction, matching, and evolution in mobile service systems, and alignment with existing heterogeneous data models.

Ontologies are a basic building block of the Semantic Web. An active line of Semantic Web research focuses on how to build and evolve ontologies using information from the different ontological sources inherent in a domain. A large part of the IT industry uses software engineering methodologies to build software solutions that solve real-world problems. For these organizations, reusing previously built software as much as possible, instead of creating solutions from scratch, is a business imperative today. As part of their projects, they use design diagrams to capture various facets of the software development process. Kalapriya Kannan and Biplav Srivastava discuss how Semantic Web technologies can help solution-building organizations achieve software reuse by first learning ontologies from the design diagrams of existing solutions and then using them to create design diagrams for new solutions. Their technique, called OntExtract, extracts domain ontology information (entities and their relationships) from class diagrams and further refines the extracted information using diagrams that express dynamic interactions among entities, such as sequence diagrams. A proof-of-concept implementation was also developed as a plug-in for a commercial development environment, IBM’s Rational Software Architect.

Author(s)/Editor(s) Biography

Zongmin Ma (Z. M. Ma) received his Ph.D. degree from the City University of Hong Kong in 2001 and is currently a Full Professor in the College of Information Science and Engineering at Northeastern University, China. His current research interests include intelligent database systems, knowledge representation and reasoning, the Semantic Web and XML, knowledge-based systems, and semantic image retrieval. He has published over 80 papers in international journals, conferences, and books in these areas since 1999. He has also authored and edited several scholarly books published by Springer-Verlag and IGI Global. He has served as a member of the international program committees for several international conferences and as a reviewer for several journals. Dr. Ma is a senior member of the IEEE.
Huaiqing Wang is a Professor at the Department of Information Systems, City University of Hong Kong. He is also the Honorary Dean and a Guest Professor of the School of Information Engineering, Wuhan University of Technology, China. He received his PhD in Computer Science from the University of Manchester, UK, in 1987. Dr. Wang specializes in research and development of business intelligence systems, intelligent agents and their applications (such as multi-agent supported financial information systems, virtual learning systems, knowledge management systems, conceptual modeling and ontology). He has published more than 40 articles in international refereed journals.

Indices