CSMDSE-Cuckoo Search Based Multi Document Summary Extractor: Cuckoo Search Based Summary Extractor

Rasmita Rautray (Department of Computer Science and Engineering, Siksha ‘O' Anusandhan, Deemed to be University, Bhubaneswar-751030, Odisha, India), Rakesh Chandra Balabantaray (Department of Computer Science, IIIT, Bhubaneswar, Odisha, India), Rasmita Dash (Department of Computer Science and Engineering, Siksha ‘O' Anusandhan, Deemed to be University, Bhubaneswar-751030, Odisha, India) and Rajashree Dash (Department of Computer Science and Engineering, Siksha ‘O' Anusandhan, Deemed to be University, Bhubaneswar-751030, Odisha, India)
DOI: 10.4018/IJCINI.2019100103

Abstract

Managing the useful information on the web has become a challenging issue because a large amount of information from many fields is now online. Text summarization is considered one solution for extracting pertinent text from vast document collections. Hence, a novel Cuckoo Search-based multi-document summary extractor (CSMDSE) is presented to handle the multi-document summarization (MDS) problem. The proposed CSMDSE is compared with several other swarm-based summary extractors: Cat Swarm Optimization based Extractor (CSOE), Particle Swarm Optimization based Extractor (PSOE), Improved Particle Swarm Optimization based Extractor (IPSOE) and Ant Colony Optimization based Extractor (ACOE). The comparison is carried out by simulation on traditional benchmark datasets for the summarization problem. The experimental analysis indicates that CSMDSE performs better than the other summary extractors discussed in this study.

1. Introduction

In recent years, the web has provided an enormous amount of information in every field, and managing this overload of data is a crucial task. Moreover, the growth of information on the web and in advanced libraries leads to recurring issues. One answer to this problem is to shorten the data into a concise document through summarization. The main objective of this task is to reduce the original text without losing its main content (Aliguliyev, 2009). The concise document, called a summary, gives a quick reference that creates interest, helps in decision making, and saves readers time. Summarization is categorized by how the summary is generated, as either extractive or abstractive. Extractive summarization extracts important parts of the document, such as sentences or paragraphs, whereas abstractive summarization requires linguistic analysis to generate the summary (Binwahlan, Salim, & Suanmali, 2009; Ježek, 2008; Lloret, 2012; Mendoza et al., 2014; Oliveira et al., 2016). Both extractive and abstractive summaries can be either generic or query-based. A query-based summary expresses the main theme with respect to a query, whereas a generic summary expresses the major content of the documents without any additional information (Mani, 1999; Wan, 2010).

Based on the number of documents considered, the summarization problem can be divided into single-document summarization and MDS (Fattah, 2009; Rautray, Balabantaray, & Bhardwaj, 2015). Generating a summary from a single document or from a document set is called single-document or multi-document summarization, respectively. Because a document set includes many similar or distinct documents, MDS is considered an extension of single-document summarization. Due to the large search space in MDS, extracting relevant sentences is a more critical task. Thus, MDS is recognized as an optimization problem whose main objective is to produce an optimal, informative summary of the original contents. Swarm-based optimization techniques are suitable options for addressing this optimization problem. In the recent past, various meta-heuristic techniques such as particle swarm optimization (PSO) (Binwahlan, Salim, & Suanmali, 2009; Alguliev, Aliguliyev & Mehdiyev, 2011; Alguliev et al., 2011; Asgari, Masoumi, & Sheijani, 2014; Rautray, Balabantaray & Bhardwaj, 2015; Rautray & Balabantaray, 2015), differential evolution (DE) (Aliguliyev, 2009; Alguliev, Aliguliyev & Mehdiyev, 2011; Alguliev, Aliguliyev, & Hajirahimova, 2012; Alguliev, Aliguliyev, & Isazade, 2012; Alguliev, Aliguliyev & Hajirahimova, 2012; Alguliev, Aliguliyev & Isazade, 2013; Nandhini & Balasundaram, 2014), harmony search (HS) (Shareghi & Hassanabadi, 2008), ant colony optimization (ACO) (Mosa, Anwar, & Hamouda, 2018; Hassan, 2015), cuckoo search (CS) (Mirshojaei & Masoomi, 2015) and genetic algorithm (GA) (Gordon, 1988; López‐Pujalte, Guerrero‐Bote, & de Moya‐Anegón, 2003; García, de Moya Anegón, & Zarco, 2000; Alguliev & Aliguliyev, 2005; Fattah & Ren, 2009; He et al., 2006; Zhao & Tang, 2010; Kogilavani & Balasubramanie, 2010) have been applied to both single- and multi-document summarization. Inspired by the different applications of the cuckoo search algorithm, the authors present a cuckoo search algorithm-based summary extractor. Further, the model is compared with PSOE, IPSOE, CSOE and ACOE. The summaries generated by the different models are analyzed in terms of sentence-to-sentence similarity, ROUGE score, and a readability metric. The experiments are carried out on the DUC (Document Understanding Conference) datasets, and it is observed that CSMDSE shows significantly better results than the PSO-, IPSO-, ACO- and CSO-based summary extractors.
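
As an illustration of the ROUGE score mentioned above, the following is a minimal sketch of ROUGE-1 recall, the fraction of reference-summary unigrams that also appear in a candidate summary. The function name `rouge_1_recall`, the lowercasing, the whitespace tokenization, and the two example sentences are assumptions made here for illustration; this is not the evaluation setup used in the article, and standard ROUGE tooling adds options such as stemming and stop-word removal.

```python
from collections import Counter


def rouge_1_recall(candidate: str, reference: str) -> float:
    """ROUGE-1 recall: clipped unigram overlap divided by reference length.

    Tokenization is a plain lowercase whitespace split (an assumption for
    this sketch; real ROUGE tooling tokenizes more carefully).
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    total_ref = sum(ref_counts.values())
    if total_ref == 0:
        # An empty reference has nothing to recall.
        return 0.0
    # Each reference token is matched at most as often as it occurs
    # in the candidate (clipped counts).
    overlap = sum(min(count, cand_counts[token])
                  for token, count in ref_counts.items())
    return overlap / total_ref


# Hypothetical example sentences, not taken from the DUC datasets.
reference = "the cat sat on the mat"
candidate = "the cat lay on a mat"
print(round(rouge_1_recall(candidate, reference), 4))  # 4 of 6 reference unigrams matched -> 0.6667
```

Recall is measured against the reference summary, so a longer candidate can only raise or hold this score; evaluations therefore usually fix a summary length limit or report precision and F-measure alongside it.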
