Bio-Inspired Algorithms for Text Summarization: A Review

Rasmita Rautray (Siksha ‘O’ Anusandhan University, India) and Rakesh Chandra Balabantaray (IIIT Bhubaneswar, India)
Copyright: © 2017 |Pages: 22
DOI: 10.4018/978-1-5225-2375-8.ch003

Abstract

Over the last few decades, bio-inspired algorithms (BIAs) have gained significant popularity for handling hard, complex real-world optimization problems. As their scope grows, BIAs continue to open new application areas and computing opportunities. This paper presents a review whose objective is to bring a better understanding of, and to motivate research on, BIA-based text summarization. The techniques that have been used for text summarization include the genetic algorithm (GA), particle swarm optimization (PSO), differential evolution (DE), and harmony search (HS).

1 Introduction

Text summarization (TS) is the process of automatically creating a compressed version of a given text that provides useful information to users (Aliguliyev, 2009). It offers a solution to the information overload problem (Hahn & Mani, 2000; Mani & Maybury, 1999). Producing a summary is the main objective of a summarization method (Alguliev & Aliguliyev, 2005), and three aspects characterize research on automatic summarization: (i) summaries may be produced from a single document or from multiple documents; (ii) summaries should preserve important information; and (iii) summaries should be short. Summary generation evaluates each unit of the document (paragraph, sentence, word) to decide whether to keep it, and then reformulates the retained units into the output. Conceptually, summarization is a three-stage process (Gholamrezazadeh, Salehi & Gholamzadeh, 2009; Hovy & Lin, 1998; Lin & Hovy, 1997): topic identification (text representation), interpretation (compaction into a summary representation), and summary generation. The first stage transforms the text document into a source representation by identifying its main theme. The second stage interprets the meaning, distinguishes relevant from irrelevant information, and compacts the result into a summary representation. The last stage merges the previously identified information to generate the summary. To achieve this goal, TS addresses both the problem of selecting a subset of the most important sentences, or portions of sentences, from the original documents (extractive summarization) and the problem of generating coherent summaries by composing novel sentences not seen in the original sources (abstractive summarization) (Fattah & Ren, 2009). Depending on the number of input documents, a summary can be a single-document summary or a multi-document summary.
A summary produced for a single document is called a single-document summary, and a summary produced for multiple documents is called a multi-document summary (Mani & Maybury, 1999; Fattah & Ren, 2009; Alguliev, Aliguliyev & Isazade, 2012). A summary is generated with respect to several aspects, such as content coverage, redundancy, length, readability, cohesion, diversity, and relevancy; summarization is therefore treated as a multi-objective optimization problem. The aspects that represent summary objectives are discussed below.

  • Content Coverage: This is the most important objective of summarization. A summary should contain the salient sentences that cover as many of the important subtopics, and as much of the content, of the original document as possible (Alguliev, Aliguliyev & Isazade, 2012; Alguliev, Aliguliyev & Hajirahimova, 2012; Alguliev, Aliguliyev, & Isazade, 2013; Mendoza et al., 2014; Wei, Li & Liu, 2010).

  • Redundancy or Diversity: The summary should consist of non-redundant sentences, i.e., sentences with the same meaning should be minimized (Alguliev, Aliguliyev & Hajirahimova, 2012; Alguliev et al., 2011; Alguliev, Aliguliyev & Mehdiyev, 2011).

  • Length: A summary should be bounded in length (Alguliev, Aliguliyev & Hajirahimova, 2012; Alguliev, Aliguliyev, & Isazade, 2013; Alguliev et al., 2011; Alguliev, Aliguliyev & Mehdiyev, 2011).

  • Readability: The ease with which a reader can understand the generated text or summary (Nandhini & Balasundaram, 2014).

  • Cohesion: The degree of relatedness among the sentences that make up a summary (Mendoza et al., 2014; Alguliev, Aliguliyev & Mehdiyev, 2011).

  • Relevancy: A summary should contain information that is relevant to the source as well as to the user (Alguliev, Aliguliyev & Isazade, 2012; Alguliev et al., 2011).
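
To make the optimization view concrete, the following is a minimal sketch of how three of the objectives above (content coverage, redundancy, and length) might be combined into a single fitness score for a candidate extractive summary. It is an illustration only, not the formulation of any cited work. The names `bag_of_words`, `cosine`, and `summary_fitness`, the weights `w_cov` and `w_red`, the whitespace tokenization, and the use of cosine similarity over term frequencies are all assumptions made for this example. A candidate is encoded as a list of 0/1 flags, one per sentence, which is one common way a population-based search such as GA or PSO could represent a solution.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector (crude: splits on whitespace only)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summary_fitness(sentences, selected, max_words, w_cov=1.0, w_red=1.0):
    """Score a candidate extractive summary (hypothetical objective).

    sentences: list of sentence strings from the source document(s).
    selected:  list of 0/1 flags, one per sentence (the candidate solution).
    max_words: length bound; candidates exceeding it are rejected.
    """
    chosen = [s for s, flag in zip(sentences, selected) if flag]
    # Length objective, treated here as a hard constraint (an assumption).
    if not chosen or sum(len(s.split()) for s in chosen) > max_words:
        return float("-inf")

    # Coverage: similarity between the whole summary and the whole source.
    coverage = cosine(bag_of_words(" ".join(chosen)),
                      bag_of_words(" ".join(sentences)))

    # Redundancy: mean pairwise similarity among the selected sentences.
    vecs = [bag_of_words(s) for s in chosen]
    pairs = [(i, j) for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
    redundancy = (sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)
                  if pairs else 0.0)

    return w_cov * coverage - w_red * redundancy
```

Under these assumptions, a selection that repeats the same sentence scores lower than one that picks two different sentences, and any selection over the word limit is discarded. Readability, cohesion, and relevancy are left out of the sketch because they would need additional modeling choices that the text above does not specify.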
