Classification of Sentence Ranking Methods for Multi-Document Summarization

Sean Sovine (Marshall University, USA) and Hyoil Han (Marshall University, USA)
DOI: 10.4018/978-1-4666-5019-0.ch001


Modern information technology allows text to be produced and disseminated at a very rapid pace. This situation leads to the problem of information overload, in which users face a very large body of text relevant to an information need, with no efficient and effective way to locate the specific information they require within it. In one such scenario, a user might be given a collection of digital news articles relevant to a particular current event and need to rapidly generate a summary of the essential information those articles contain. In extractive multi-document summarization (MDS), the most fundamental task is to select a subset of the sentences in the input document set to form a summary of that set. An essential component of this task is sentence ranking, in which sentences from the original document set are ranked in order of importance for inclusion in a summary. The purpose of this chapter is to analyze the most successful sentence ranking methods employed in recent MDS work. To this end, the authors classify sentence ranking methods into six classes and present and discuss specific approaches within each class.
Chapter Preview


Automatic text summarization is one attempt to solve the problem of information overload. It consists of the study of automated techniques for extracting key information from a body of text and using that information to form a concise summary. The ideal of automatic summarization work is to develop techniques by which a machine can generate summaries that successfully imitate those produced by human beings. The category of automatic summarization actually contains a wide range of variations on the basic summarization task. This variety arises from the many different purposes for generating a summary, the different possible definitions of what a text summary is, and the great variety of possible input data sources for a summarization algorithm.

The types of automatic summarization task can be divided along several axes. First, the input data to be summarized may be known to belong to a specific domain, or the task may be generic and predominantly domain-independent. The input data may come from a single source document or from multiple source documents. When the input consists of a set of documents from different sources sharing a common topic, the task is referred to as multi-document summarization (MDS). The summarization task may be informative, in which the algorithm attempts to determine the key information in the input using features of the input data set alone. Alternatively, the task may be focused summarization, in which the consumer of the summary has a particular question or specific topic that guides and motivates the summarization process. Many current summarization systems incorporate aspects of both informative and focused summarization. Finally, summarization may be abstractive or extractive (Nenkova & McKeown, 2011; Radev, Hovy, & McKeown, 2002; Sparck Jones, 1999).

Abstractive summaries are like those typically created by human summarizers, where the summary is composed of language that is generated specifically for the purpose of the summary. Extractive summaries, by contrast, are composed of sentences or parts of sentences that are extracted from the text of the input documents—and possibly rearranged or compressed—to form the final summary, with few other modifications (Nenkova & McKeown, 2011; Radev, Hovy, & McKeown, 2002). This chapter addresses extractive MDS systems. Currently, most summarization systems developed and tested for research purposes are extractive in nature.
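To make the extractive pipeline concrete, the sketch below scores and ranks sentences by the average TF-IDF weight of their words, one of the simplest frequency-based ranking signals. This is a minimal, hypothetical illustration of sentence ranking in general, not an implementation of any specific system discussed in this chapter; the function names and the naive sentence splitter are the author's own assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Hypothetical, simplistic tokenizer: lowercase alphabetic word spans.
    return re.findall(r"[a-z]+", text.lower())

def rank_sentences(documents):
    """Rank all sentences in a document set by average TF-IDF word weight.

    A toy stand-in for the sentence-ranking step of extractive MDS:
    sentences whose words are frequent locally but rare across the
    sentence collection score higher.
    """
    # Naive sentence splitting on terminal punctuation (illustrative only).
    sentences = [s.strip()
                 for doc in documents
                 for s in re.split(r"(?<=[.!?])\s+", doc)
                 if s.strip()]
    # Document frequency of each word, counting sentences as "documents."
    df = Counter()
    for s in sentences:
        df.update(set(tokenize(s)))
    n = len(sentences)

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        tf = Counter(tokens)
        # Sum of tf * idf over distinct words, normalized by sentence length.
        return sum(tf[w] * math.log(n / df[w]) for w in tf) / len(tokens)

    return sorted(sentences, key=score, reverse=True)
```

A greedy extractive summarizer built on this sketch would take the top-ranked sentences until a length budget is met, typically with a redundancy check so near-duplicate sentences are not selected twice.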

Most current summarization research is focused on a generic multi-document summarization task that also features a query-focused component. This is largely due to conventions developed during the course of the Document Understanding Conferences (DUC) and Text Analysis Conferences (TAC) (NIST 2011; NIST 2013). The evaluation tasks developed during the DUC/TAC conferences are by far the most widely used methodologies for evaluating automatic summarization systems. We discuss the DUC/TAC conferences and their evaluation methodologies further in the section Evaluation Methodologies.

Systems developed for the DUC/TAC evaluation tasks are largely domain-independent but are designed to be tested on corpora of newswire documents containing multiple topic-focused document sets. These systems are often intended to generate summaries that are both informative and focused, though some are exclusively informative or exclusively focused in approach. Experimental evidence suggests that, while current MDS systems continue to achieve higher levels of summary quality, their performance has not yet reached the theoretical maximum of the extractive approach: the extractive summary containing the optimal set of document sentences (Bysani, Reddy, & Varma, 2009).
