Ontology Learning and Knowledge Discovery Using the Web: Challenges and Recent Advances

Wilson Wong (University of Western Australia, Australia), Wei Liu (University of Western Australia, Australia) and Mohammed Bennamoun (The University of Western Australia, Australia)
Release Date: May, 2011|Copyright: © 2011 |Pages: 358
ISBN13: 9781609606251|ISBN10: 1609606256|EISBN13: 9781609606268|DOI: 10.4018/978-1-60960-625-1


Ontologies form an indispensable part of the Semantic Web standard stack. While the Semantic Web is still our vision into the future, ontologies have already found a myriad of applications such as document retrieval, image retrieval, agent interoperability and document annotation.

Ontology Learning and Knowledge Discovery Using the Web: Challenges and Recent Advances provides relevant theoretical foundations, and disseminates new research findings and expert views on the remaining challenges in ontology learning. This book is an invaluable resource as a library or personal reference for graduate students, researchers, and industrial practitioners. Readers who are looking for future research directions and carving out their own niche area will find this book particularly useful due to its detailed scope and wide coverage, which informs any discussion of artificial intelligence, knowledge acquisition, knowledge representation and reasoning, text mining, information extraction, and ontology learning.

Topics Covered

The many academic areas covered in this publication include, but are not limited to:

  • Applications of ontologies
  • Artificial Intelligence
  • Concept Formation
  • Information Extraction
  • Knowledge Acquisition
  • Knowledge Representation and Reasoning
  • Ontology Learning
  • Taxonomy Construction
  • Text Mining
  • Text Processing

Reviews and Testimonials

The time is ripe for this book. Techniques of ontology learning and knowledge discovery are beginning to converge. Prototypes are becoming stronger. Industry practitioners are beginning to realize the need for ontology learning. Wilson, Wei, and Mohammed bring together recent work in the construction and application of ontologies and knowledge bases. They introduce a wide range of techniques that utilize unstructured and semi-structured Web data for learning and discovery. [...] The interdisciplinary nature of ontology learning and knowledge discovery is reflected in this book. It will appeal to advanced undergraduates, postgraduate students, academic researchers and practitioners. I hope that it will lead to a world in which we can all live more effectively, a world in which the ready availability of information is balanced by our enhanced ability to process it.

– Ian H. Witten

Converting today's onslaught of information into usable knowledge is one of the main challenges facing us in the modern era. This volume covers a wide range of topics, from ontology learning to data mining of comparable patents and is edited by three faculty members at the University of Western Australia.

– Book News, Reference - Research Book News - August 2011


It has become something of a cliché to mention how rapidly textual information is growing and how the World Wide Web has assisted in this growth. This, however, does not overshadow the fact that such explosive growth will only intensify for years to come, and that new challenges and opportunities will arise. Advances in fundamental areas such as information retrieval, machine learning, data mining, natural language processing, and knowledge representation and reasoning have provided us with some relief by uncovering and representing facts and patterns in text to ease the management, retrieval, and interpretation process. Information retrieval, for instance, provides various algorithms to analyse associations between components of a text using vectors, matrices, and probabilistic theorems. Machine learning and data mining, on the other hand, offer the ability to learn rules and patterns from massive datasets in a supervised or unsupervised manner based on extensive statistical analysis. Natural language processing provides the tools for analysing natural language text at various language levels (e.g. morphology, syntax, semantics) to uncover manifestations of concepts and relations through linguistic cues. Knowledge representation and reasoning enable the extracted knowledge to be formally specified and represented such that new knowledge can be deduced.

The realization that a more systematic way was needed to consolidate the discovered facts and patterns into an organised, higher-level construct, in order to enhance everyday applications (e.g. Web search) and enable intelligent systems (e.g. Semantic Web), eventually gave rise to ontology learning and knowledge discovery. Ontologies are effectively formal and explicit specifications, in the form of concepts and relations, of shared conceptualisations, while knowledge bases can be obtained by populating the ontologies with instances. Occasionally, ontologies contain axioms for validation and constraint definition. As an analogy, consider an ontology as a cupcake mould and knowledge bases as the actual cupcakes of assorted colours, tastes, and so on. Ontology learning from text is then essentially the process of deriving the high-level concepts and relations from textual information. From this perspective, knowledge discovery can refer to two things: the first is the uncovering of relevant instances from data to populate the ontologies (also known as ontology population), and the second, more general sense is the searching of data for useful patterns. In this book, knowledge discovery can mean either of the two.
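The distinction between an ontology and a knowledge base can be illustrated with a small sketch. The Python below is only an illustration under stated assumptions, not a method from the book: the class names, the `add_instance` and `add_fact` helpers, and the cupcake data are all hypothetical, and the domain and range check merely stands in for the kind of constraint an axiom might express.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Ontology:
    """The 'mould': concepts plus typed relations between them (hypothetical sketch)."""
    concepts: set[str] = field(default_factory=set)
    # relation name -> (domain concept, range concept)
    relations: dict[str, tuple[str, str]] = field(default_factory=dict)


@dataclass
class KnowledgeBase:
    """The 'cupcakes': instances and facts that populate one ontology."""
    ontology: Ontology
    instances: dict[str, str] = field(default_factory=dict)  # instance -> concept
    facts: set[tuple[str, str, str]] = field(default_factory=set)

    def add_instance(self, name: str, concept: str) -> None:
        # Ontology population: an instance must belong to a known concept.
        if concept not in self.ontology.concepts:
            raise ValueError(f"unknown concept: {concept}")
        self.instances[name] = concept

    def add_fact(self, subject: str, relation: str, obj: str) -> None:
        # A simple constraint check: both ends must match the relation's types.
        if relation not in self.ontology.relations:
            raise ValueError(f"unknown relation: {relation}")
        domain, range_ = self.ontology.relations[relation]
        if self.instances.get(subject) != domain or self.instances.get(obj) != range_:
            raise ValueError(f"{subject} {relation} {obj} violates domain/range")
        self.facts.add((subject, relation, obj))


# Illustrative data only, echoing the cupcake analogy.
mould = Ontology(
    concepts={"Cupcake", "Flavour"},
    relations={"hasFlavour": ("Cupcake", "Flavour")},
)
kb = KnowledgeBase(mould)
kb.add_instance("cupcake_1", "Cupcake")
kb.add_instance("vanilla", "Flavour")
kb.add_fact("cupcake_1", "hasFlavour", "vanilla")
```

In this reading, ontology learning would correspond to deriving the contents of `mould` from text, while ontology population would correspond to the calls that fill `kb`.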

Being a young and exciting field, ontology learning has witnessed relatively fast progress due to its adoption of established techniques from the related areas discussed above. Aside from the inherent challenges of processing natural language, one of the remaining obstacles preventing the large-scale deployment of ontology learning systems is the bottleneck in handcrafting structured knowledge sources (e.g. dictionaries, taxonomies, knowledge bases) and training data (e.g. annotated text corpora). It is gradually becoming apparent that in order to minimize human effort in the learning process, and to improve the scalability and robustness of the system, static and expert-crafted resources may no longer be adequate. An increasing amount of research effort is being directed towards harnessing collective intelligence on the Web in an attempt to address this major bottleneck. At the same time, as with many fields before ontology learning, the process of maturing has triggered an increased awareness of the difficulties in automatically discovering all components of an ontology, i.e. terms, concepts, relations, and especially axioms. This gives rise to the question of whether the ultimate goal of automatically learning full-fledged formal ontologies can be achieved. While some individuals dwell on the question, many others have moved on with a more pragmatic goal, which is to focus on learning lightweight ontologies first, and extend them later if possible. With high hopes and achievable aims, we are now witnessing a growing interest in ontologies across different domains that require interoperability of semantics and a touch of intelligence in their applications.

This book brings together some of the latest work on three popular research directions in ontology learning and knowledge discovery today, namely, (1) the use of Web data to address the knowledge and training data preparation bottleneck, (2) the focus on lightweight ontologies, and (3) the application of ontologies in different domains and across different languages. Section I of the book contains chapters covering the use of a wide range of existing, adapted and emerging techniques for extracting terms, concepts and relations to construct ontologies and knowledge bases. For instance, in addition to the traditional clustering techniques reported in Chapter III, a new topic extraction technique is devised in Chapter IV to offer alternative ways of discovering concepts. Chapter II, on the other hand, promotes the new application of existing deep semantic analysis methods for ontology learning in general. The use of semi-structured Web data such as Wikipedia for named entity recognition, and the question of how this can be applied to ontology learning, are also investigated in Chapter V. The focus of Chapter I is on the construction of practical, lightweight ontologies for three domains. In Chapters VI and VII, the authors mainly investigate the use of a combination of data sources, both local and from the Web, to discover hierarchical and non-taxonomic relations. In Section II, the authors look at how ontologies and knowledge bases are currently being applied across different domains. The domains covered by the chapters in this section include biomedicine (Chapters VIII and IX), the humanities (Chapter X) and enterprise knowledge management (Chapter XI). The book ends with Section III, whose chapters cover the use of social data (Chapter XII) and parallel texts (Chapters XIII and XIV), which may or may not come from the Web, for learning social ontologies, incorporating trust into ontologies, and improving the process of learning ontologies.

This volume is both a valuable standalone reference and a great complement to the existing books on ontology learning that have been published since the turn of the millennium. Some of the previous books focus mainly on techniques and evaluations, while others look at more abstract concerns such as ontology languages, standards, and engineering environments. While the background discussions on the techniques and evaluations are indispensable, the focal point of this book remains on emerging research directions involving the use of Web data for ontology learning, the learning of lightweight as well as cross-language ontologies, and the involvement of ontologies in real-world applications. We are certain that the content of this book will be of interest to a wide-ranging audience. From a teaching viewpoint, the book is intended for undergraduate students at the final year level, or postgraduate students who wish to learn about the basic techniques for ontology learning. From a researcher’s and practitioner’s point of view, this volume will be an excellent addition that outlines the most recent progress and complements basic references in ontology learning. A basic familiarity with natural language processing, probability and statistics, and some fundamental Web technologies such as wikis and search engines is beneficial to the understanding of this text.

Wilson Wong
Wei Liu
Mohammed Bennamoun

Author(s)/Editor(s) Biography

Wilson Wong is a Postdoctoral Research Associate at the University of Western Australia (UWA) working on the application of text mining and natural language processing across different domains such as healthcare. Wilson was an Endeavour IPRS Scholar for his PhD study at UWA. His doctoral dissertation investigates the use of Web data for automatically acquiring knowledge from natural language texts across different domains. Wilson also has a BIT (First Class Honours) (Data Communications) degree, and an MSc (Information and Communication Technology) by research degree in the field of natural language processing from Malaysia. Wilson has close to 30 publications in book chapters, reputable conferences (e.g. IJCNLP, IJCAI, PACLING), and high-impact journals (e.g. DMKD, IDA). His areas of interest include text mining, natural language processing, Web technologies, and health informatics.
Wei Liu is an Assistant Professor at the University of Western Australia, and currently the Lab Coordinator for Adaptive System Group. She obtained her PhD on multi-agent belief revision from the University of Newcastle, Australia, in 2003. Her current research interest is ontology learning to bootstrap agent knowledge bases. Dr. Wei Liu’s research strength lies in ontology learning and data-driven ontology change. She has led the work on developing automatic and semi-automatic ontology learning systems since 2004, which addresses the cold-start issues (labour-intensive and time-consuming) of manual ontology engineering. The research contributes significantly to the investigation of emergent semantics through text mining. The first paper reporting the techniques and the system modules won one of the best paper awards at the conference and was invited for journal publication. The techniques developed include co-occurrence analysis to identify taxonomic and non-taxonomic relations, and spreading activation to identify the core, extended, and peripheral concepts. Information network analysis and clustering algorithms have also been developed to measure the evolution of an ontology in both temporal and spatial scope.
Mohammed Bennamoun received his PhD from Queen's University, Canada / Queensland University of Technology (QUT), Australia in the area of Computer Vision. He has been a full Professor and the Head of the School of Computer Science and Software Engineering (CSSE) at the University of Western Australia (UWA) since 2007. Prior to this, he was an Associate Professor at CSSE, a Senior Lecturer at QUT, and a Lecturer at Queen’s. He was an Erasmus Mundus Scholar at the University of Edinburgh in 2006. He was also a Visiting Professor at several other institutions including CNRS (Centre National de la Recherche Scientifique), Telecom Lille1, Helsinki University of Technology, University of Bourgogne and University of Paris 13. He is the co-author of the book “Object Recognition: Fundamentals and Case Studies” published by Springer-Verlag. He has published over 140 journal and conference publications, and has served as a guest editor for several special issues in international journals. His areas of interest include control theory, robotics, obstacle avoidance, object recognition, artificial neural networks, signal/image processing, computer vision, and, lately, the development of tools for combining text and image analysis.


Editorial Board

  • Christopher Brewster, Aston University, UK
  • Chunyu Kit, City University of Hong Kong, China
  • Philipp Cimiano, University of Bielefeld, Germany
  • Sophia Ananiadou, University of Manchester, UK
  • Tharam Dillon, Curtin University of Technology, Australia
  • Venkata Subramaniam, IBM India Research, India