Story Summarization Using a Question-Answering Approach

Sanah Nashir Sayyed (Dr. Babasaheb Ambedkar Marathwada University, India) and Namrata Mahender C. (Dr. Babasaheb Ambedkar Marathwada University, India)
DOI: 10.4018/978-1-7998-4730-4.ch003

Abstract

Summarization is the process of selecting representative data to produce a reduced version of the given data with minimal loss of information; it is generally applied to text, images, videos, and speech data. The chapter covers the concepts of text summarization (types, stages, issues, and criteria) as well as its applications. The two main categories of approaches generally used in text summarization (i.e., abstractive and extractive) are discussed. Abstractive techniques use linguistic methods to interpret the text; they produce understandable, semantically equivalent sentences of shorter length. Extractive techniques mostly rely on statistical methods to extract essential sentences from the given text. In addition, the authors explore the SACAS model to exemplify the process of summarization. The SACAS system analyzed 50 stories, and its evaluation is presented in terms of a new measurement based on question-answering MOS, which is also introduced in this chapter.

1. Introduction

It is easy to envision a text summarizer that conveys everything essential for understanding a document in far less time and space than the original. Many such systems are available, but their performance falls short of expectations. Summarization is needed in a wide variety of domains, and its applications are crucial: medical history summarization, research document summarization, law case history summarization, official document summarization, search information summarization, and many more. So let’s first understand the concept of summarization, its categories, and its components.

Summarization is the procedure of selecting the essential data, which carries the important information of the whole set, and presenting the result in the form of a summary. Summarization can be broadly categorized as follows:

  • Text summarization: Text summarization is the production of summaries by selecting important sentences without changing the meaning of the original document (Bhatia & Jaiswal, 2016).

  • Image summarization: In image summarization, images are summarized to give an accurate impression of the original scene; examples include picture collages, 3D collages, etc. (Chen, Cafarella, & Adar, 2011).

  • Video summarization: Video summarization is the production of a short video by selecting visual data from the main video. Applications of video summarization include video browsing, action recognition, and the creation of a visual diary (Lee, Ghosh, & Grauman, 2012).

  • Speech summarization: Speech summarization is the procedure of extracting vital sentences from a spoken document, considering the linguistic aspects of the spoken language (McKeown, Hirschberg, Galley, & Maskey, 2005).

The focus of this chapter is text summarization. An extractive technique based on the SACAS model is used for story summarization, and the major criteria and challenges encountered are discussed and highlighted. A new question-based MOS measuring technique is also specified.
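
To make the idea of extractive summarization concrete, the following is a minimal sketch of a generic frequency-based sentence selector. It is not the SACAS model, whose details are not given in this preview; the function names, the tiny stop-word list, and the scoring rule (average word frequency per sentence) are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it",
              "was", "on", "that", "he", "she", "his", "her"}


def split_sentences(text):
    """Naively split text at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep the words that are not stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences in their original order.

    A sentence's score is the average corpus frequency of its
    non-stop words (an assumed, purely statistical criterion).
    """
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, then restore story order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    story = ("The fox saw the crow. The crow held cheese. "
             "The weather was mild.")
    for sentence in summarize(story, max_sentences=2):
        print(sentence)
```

Because the sketch only selects existing sentences, the output never rewrites the story's wording, which is the defining property of an extractive approach; an abstractive system would instead generate new sentences.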

2. Text Summarization, Types And Components

Text summarization is the art of presenting the gist of a given text by selecting its important aspects or concepts without changing its form, context, or meaning, so that the derived text is clearly readable and understandable. Work on automated summarization has been under way since the 1950s, and it remains a goal for many researchers today. Many elementary tasks are involved, which keeps automated text summarization an unresolved challenge. Some important points are listed here (Okumura, Fukushima, & Nanba, 2001):

  • i. The main challenge is to decide which information is relevant.

  • ii. The system needs to understand the context of the text before summarizing it.

  • iii. Sentence revision, which emphasizes elimination, combination, or substitution.

  • iv. Sentence fusion, which must consider more than one sentence, some of which carry overlapping information.

  • v. Generation of text to convey the essential information (rather than just extracting it).

  • vi. Summarization also depends on the requirements of the user. For example, a company’s annual report does not provide the same information to every user or category of employee: the requirements of the marketing/sales team differ from those of the production team, although some requirements are common to both. Each requirement is different even when the content and the overall aim (more sales for more profit) are the same; the challenge is that the same text should be summarized according to the needs of each user.

  • vii. Evaluation metrics should be developed for judging summaries.

  • viii. There is no standard benchmark for assessing the performance or accuracy of summarization.
