A New Biomimetic Method Based on the Power Saves of Social Bees for Automatic Summaries of Texts by Extraction

Mohamed Amine Boudia (GeCoDe Laboratory, Departments of Computer Science, Dr. Moulay Tahar University of Saïda, Saïda, Algeria), Reda Mohamed Hamou (Department of Computer Science, Dr. Moulay Tahar University of Saïda, Saïda, Algeria), Abdelmalek Amine (GeCoDe Laboratory, Dr. Moulay Tahar University of Saïda, Saïda, Algeria) and Amine Rahmani (Department of Computer Science, Dr. Moulay Tahar University of Saïda, Saïda, Algeria)
DOI: 10.4018/IJSSCI.2015010102

Abstract

In this paper, the authors propose a new approach to automatic text summarization by extraction, based on a Saving Energy Function. The first step uses two extraction techniques: scoring of phrases, and similarity, which aims to eliminate redundant phrases without losing the theme of the text. The second step optimizes the results of the previous layer with a metaheuristic based on the Bee Algorithm. The objective function of the optimization is to maximize the sum of similarities between phrases of the candidate summary, in order to keep the theme of the text, and to minimize the sum of scores, in order to increase the summarization rate. This optimization also yields candidate summaries in which the order of the phrases may differ from that of the original text. The third and final layer chooses the best summary from the candidate summaries generated by the bee optimization; the authors opted for voting by simple majority.
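
The three layers described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the frequency-based scoring, the bag-of-words cosine similarity, the use of all phrase pairs, and the combination of the two objectives as a single difference are all assumptions made for illustration. The function names (`phrase_scores`, `cosine_similarity`, `objective`, `majority_vote`) are hypothetical, and the bee optimization itself is not shown; only the quantity it would optimize and the final vote are.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(phrase):
    """Lowercase a phrase and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", phrase.lower())


def phrase_scores(phrases):
    """Assumed scoring: mean document-wide frequency of a phrase's words."""
    doc_freq = Counter(w for p in phrases for w in tokenize(p))
    scores = []
    for p in phrases:
        words = tokenize(p)
        total = sum(doc_freq[w] for w in words)
        scores.append(total / len(words) if words else 0.0)
    return scores


def cosine_similarity(a, b):
    """Assumed similarity: cosine between bag-of-words count vectors."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def objective(candidate, phrases, scores):
    """Value to maximize for a candidate summary (a tuple of phrase indices).

    Assumption: the two goals from the abstract (maximize the sum of
    similarities, minimize the sum of scores) are merged as a difference.
    """
    sim_sum = sum(
        cosine_similarity(phrases[i], phrases[j])
        for i, j in combinations(candidate, 2)
    )
    score_sum = sum(scores[i] for i in candidate)
    return sim_sum - score_sum


def majority_vote(candidates):
    """Return the candidate summary proposed most often."""
    return Counter(candidates).most_common(1)[0][0]
```

In this sketch each bee would propose one candidate (a tuple of phrase indices, possibly reordered), `objective` would rank the proposals during optimization, and `majority_vote` would pick the final summary among the candidates returned.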

1. Introduction And Problematic

Every day, the mass of electronic textual information increases dramatically, making it more and more difficult to access relevant information without special tools. In addition, rapid and effective access to the content of texts has become a necessity.

A summary is an effective way to represent the content of a text and allows quick access to its semantic content. The purpose of summarization is to produce an abridged text covering most of the content of the source text.

“We cannot imagine our daily life, one day without summaries” (Mani, I., 2001). Newspaper headlines, the first paragraph of a newspaper article, newsletters, weather reports, tables of sports results and library catalogues are all summaries. Even in research, the authors of scientific articles must accompany them with a summary that they write themselves.

Automatic summaries can be used to reduce the time spent searching for relevant documents, or to reduce the processing of long texts by identifying the key information.

To produce an automatic summary, the current literature presents three approaches:

  • Summarization by extraction;

  • Summarization by understanding;

  • Summarization by classification.

Our current work uses automatic summarization by extraction because it is simple to implement and gives good results. In previous work, however, summarization by extraction uses only one technique at a time (scoring of phrases, similarity between phrases, or prototype) and respects the order of the phrases in the original document. Thus, our work seeks to answer the following questions:

  • What does using two summarization methods at the same time contribute to the quality of the summary?

  • Can the bio-inspired method based on the Energy Savings of Bees bring more to automatic summarization and increase the quality of the summary?

2. State Of The Art

Automatic summarization appeared early as a field of research in computer science, within NLP (natural language processing). In 1958, H. P. Luhn proposed a first approach to producing automatic abstracts by extracting phrases. In the early 1960s, H. P. Edmundson and other participants in the TRW (Thompson Ramo Wooldridge Inc.) project proposed a new automatic summarization system that combined several criteria to assess the relevance of the phrases to extract.

This work identified the fundamental issues in automatic summarization, such as the problems caused by building summaries through extraction (redundancy, incompleteness, breaks in continuity, etc.), the theoretical inadequacy of statistics, and the difficulty of understanding a text (through semantic analysis) in order to summarize it.

From the 1980s, theories emerged to describe the various processes involved in the human cognitive system during reading and text comprehension, in particular the model of Kintsch and Van Dijk, which explains in more detail the construction of a summary.

These theories greatly inspired the architecture of the automatic summarization systems of the time. The influence of psychological theories marked a new step in automatic summarization compared to previous techniques: henceforth, systems “understand” the text, using knowledge from deeper cognitive structures such as scripts, schemas and frames. One of the early works inspired by research in psychology was that of G. DeJong with the FRUMP system. Other important works continued to appear at that time, such as SUSY, TOPIC, SCISSOR and PAULINE.

Systems for automatic summarization by understanding in this era were strongly influenced by the work done on reading comprehension and knowledge representation in cognitive psychology and artificial intelligence.

In the late 1990s, the search for information began to grow considerably, especially with the advent of the Internet and search engines. The amount of information to be processed became so large and heterogeneous that the need for information retrieval exploded, especially for automatic summarization. Needs then shifted toward rapid, broadly applicable methods; that is to say, methods independent of domain.
