Deep Model Framework for Ontology-Based Document Clustering

U. K. Sridevi, P. Shanthi, N. Nagaveni

Source Title: Handbook of Research on Investigations in Artificial Life Research and Development

DOI: 10.4018/978-1-5225-5396-0.ch019

Abstract

Searching for relevant documents on the web has become more challenging due to the rapid growth of information. Although an enormous amount of information is available online, most documents are uncategorized. Browsing through a large number of documents to find information on a specific topic is time-consuming for users. Automatic clustering of these documents has great potential to improve the efficiency of information seeking. To address this issue, the authors propose a deep ontology-based approach to document clustering. The implementation uses annotation rules, and the results obtained are encouraging. The work compares the information extraction capabilities of the annotation framework with and without an ontology. The F-measure increases when the ontology is used as the distance measure: an improvement of 11% is achieved with the ontology in comparison with keyword search.

Chapter Preview

Introduction

The growth of text documents on the Web is a great challenge for information retrieval systems. Searching and indexing systems are available for accessing information, but retrieving relevant information is still a problem. One current problem of information retrieval is that it is not really possible to extract relevant documents automatically. An information retrieval system uses indexing, and the system’s performance depends on the quality of the indexing. The two main challenges in indexing are to create representative internal descriptions of documents and to organize these descriptions for fast retrieval. Descriptions of documents in information retrieval are supposed to reflect the documents’ content and establish the foundation for retrieving information when users request it. In indexing, documents are marked with these descriptions for easy retrieval.
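As a minimal sketch of the keyword indexing described above, the following Python builds an inverted index (term to document ids) and answers boolean AND queries. The function names, the tokenizer, and the sample documents are illustrative assumptions, not taken from the chapter.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphabetic tokens."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, used only for illustration.
docs = {
    "d1": "Ontology based document clustering",
    "d2": "Keyword indexing for document retrieval",
    "d3": "Deep learning for text clustering",
}
index = build_inverted_index(docs)
matches = search(index, "document clustering")
```

A keyword index of this kind only finds exact term matches, which is the limitation the ontology-based approach is meant to address.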

An ontology provides a good representation of conceptual structure and can be combined with knowledge representation. The model makes use of annotation and indexing. The ontology model depends on semantic index terms, whereas the vector space model depends on a keyword index. The semantics of the concepts are used to build a concept term representation. The ontology similarity measure improves the concept relevance score. Semantically related terms gain more weight, which improves term importance in the indexing process. The semantic analysis must recognize concepts in the documents and then map them into the ontologies. The indexing process maps information found in documents into the ontology, identifying concepts and their positions in the ontology. Information in queries can similarly be mapped into the ontology; thus, in addition to retrieving exact matches, the structure of the ontology can be used to retrieve semantically related documents. Semantic similarity and indexing focus on the similarity measure using the ontology. The work also compares the vector space model with the semantic information retrieval model. The methods are integrated to find concept relation information, whereas concepts are considered independent in the term vector space method. The cosine similarity between concepts is measured using the ontology similarity method given in Euzenat and Shvaiko (2007). Term reweighting approaches based on ontology are used in information retrieval applications (Varelas et al., 2005). The semantic annotation process includes the creation of a domain ontology and the mapping of the ontology onto the concept terms of the documents. In this model, the weight of a concept is computed using its semantic similarity to other concepts in the document. The concept vector is generated in the document annotation process, and the concept index is built. To improve the recognition of important indexing terms, it is possible to weight the concepts of a document in different ways (Valkeapaa et al., 2007).
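The concept reweighting idea can be sketched as follows. This is an illustration under stated assumptions, not the chapter's implementation: the toy is-a hierarchy, the choice of Wu-Palmer similarity as the ontology measure, and the additive reweighting rule are all hypothetical stand-ins for the method the authors cite.

```python
import math

# Hypothetical toy ontology: each concept points to its is-a parent.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def ancestors(concept):
    """Return the path from a concept up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth in the hierarchy, counting the root as depth 1."""
    return len(ancestors(concept))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    anc_b = set(ancestors(b))
    lcs = next(c for c in ancestors(a) if c in anc_b)  # lowest common subsumer
    return 2 * depth(lcs) / (depth(a) + depth(b))


def reweight(counts):
    """Assumed rule: add to each concept's count the similarity-weighted
    counts of the other concepts in the same document."""
    return {
        c: counts[c] + sum(wu_palmer(c, o) * n for o, n in counts.items() if o != c)
        for c in counts
    }


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors (dicts)."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Concept counts for two hypothetical annotated documents.
doc_a = reweight({"dog": 2, "cat": 1, "car": 1})
doc_b = reweight({"dog": 1, "cat": 2})
similarity = cosine(doc_a, doc_b)
```

In `doc_a`, "cat" and "car" have the same raw count, but "cat" ends up with the higher weight because it is semantically closer to "dog" in the hierarchy. This mirrors the claim that semantically related terms gain more weight during indexing.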

Text mining algorithms can handle real-world data that come in a diversity of forms and can be tremendously bulky (Pankaj et al., 2015). The work provides an ontology framework based on text analytics and social media analytics. Social tagging systems improve personalized document clustering. The knowledge gained from a social tagging system should be a tremendous asset for conducting and improving various business intelligence applications (Yang et al., 2015).

In text clustering, some issues remain to be tackled, such as feature extraction and data dimension reduction. To overcome these problems, Yi et al. (2017) presented a novel approach named deep-learning vocabulary network, in which deep learning is used to extract the features of text documents in the process of clustering. Yan et al. (2015) used semantic representation and a deep belief network for document classification and retrieval. However, there are very few publications addressing semantic indexing with deep learning. Yan et al. (2016) addressed semantic indexing in biomedical literature by including a vast number of semantic labels obtained from automatic annotation of MeSH terms for MEDLINE.
