An Approach of Documents Indexing Using Summarization

Rida Khalloufi, Rachid El Ayachi, Mohamed Biniz, Mohamed Fakir, Muhammad Sarfraz

Source Title: Critical Approaches to Information Retrieval Research

DOI: 10.4018/978-1-7998-1021-6.ch005

Abstract

Document indexing is an active research area that attracts many researchers. It is generally used in information retrieval systems. Document indexing encompasses a set of approaches that can be applied to index a document using a corpus. Indexing has several advantages, such as accelerating the search process, finding the pertinent content related to a query, and reducing storage space. Using the entire document in the indexing process affects several parameters, such as indexing time, search time, and storage space. The focus of this chapter is to improve all of these parameters by proposing a new indexing approach. The proposed approach uses summarization to reduce the size of documents without affecting their meaning.

Chapter Preview

Introduction

There is an enormous amount of textual material, and it is growing all the time. Think of the internet, which comprises web pages, news articles, status updates, blogs, and much more. This data is unstructured, and the best we can do to navigate it is to search and skim the results.

There is a great need to reduce much of this text data to shorter, focused summaries that capture the salient details, so that we can navigate it more effectively and check whether the larger documents contain the information we are looking for. We cannot possibly create summaries of all of this text manually, so automatic methods are needed.

There are many reasons why we need automatic text summarization tools. Here are some of them:

• Summaries reduce reading time.
• When researching documents, summaries make the selection process easier.
• Automatic summarization improves the effectiveness of indexing.
• Automatic summarization algorithms are less biased than human summarizers.
• Personalized summaries are useful in question-answering systems as they provide personalized information.
• Using automatic or semi-automatic summarization systems enables commercial abstract services to increase the number of texts they are able to process (Torres & Juan, 2014).

The rest of the chapter is organized as follows. Section 2 describes automatic text summarization. Section 3 presents the principle of document indexing and its steps. Section 4 proposes a new indexing approach based on summarization, which reduces the size of the document while preserving its meaning. Section 5 is devoted to the experimental results and the criteria used in the evaluation. Finally, the conclusion is given in Section 6.

Automatic Text Summarization

Automatic text summarization is the process of creating a short and coherent version of a longer document. Humans are generally good at this type of task, as it involves first understanding the meaning of the source document and then distilling that meaning and capturing the salient details in a new description. As such, the goal of automatic summarization is to produce summaries that are as good as those written by humans.

It is not enough to just generate words and phrases that capture the gist of the source document. The summary should be accurate and should read fluently as a new standalone document. Text summarization can generally be categorized by input type (single or multi-document), purpose (generic, domain-specific, or query-based), and output type (extractive or abstractive) (Kumar, Goh, Basiron, Choon, & Suppiah, 2016).

There are two main approaches to summarizing text documents: extractive methods and abstractive methods. Extractive text summarization (Gupta & Lehal, 2010) involves selecting phrases and sentences from the source document to make up the new summary. Techniques involve ranking the relevance of phrases in order to choose only those most relevant to the meaning of the source.
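To make the extractive idea concrete, the following is a minimal sketch rather than the method used in this chapter. It assumes a simple word-frequency relevance score, a naive punctuation-based sentence splitter, and a tiny stop-word list; the function names (`split_sentences`, `tokenize`, `summarize`) and the `num_sentences` parameter are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it",
              "that", "for", "on", "are", "as", "by"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]


def summarize(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document act as a crude relevance signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average frequency of the sentence's words; empty sentences score 0.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence positions by score, then restore document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return [sentences[i] for i in keep]
```

Calling `summarize(document_text, num_sentences=3)` returns three sentences copied verbatim from the source, which is what distinguishes an extractive summary from an abstractive one. Averaging the word frequencies, rather than summing them, is one way to avoid favoring long sentences; other scoring schemes are equally plausible.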

Abstractive text summarization (Kasture, Yargal, Nityan, Kulkarni, & Mathur, 2014) involves generating entirely new phrases and sentences to capture the meaning of the source document. This is a more challenging approach but is also the approach ultimately used by humans. Classical methods operate by selecting and compressing contents from the source document.

Classically, most successful text summarization methods have been extractive because extraction is the easier approach. However, abstractive approaches hold the hope of more general solutions to the problem (Nallapati, Zhou, Santos, Gulcehre, & Xiang, 2016).
