A New Metric of Validation for Automatic Text Summarization by Extraction


Ahmed Chaouki Lokbani (Department of Computer Science, Dr. Tahar Moulay University of Saida, Saida, Algeria)
DOI: 10.4018/IJSITA.2017070102

Abstract

In this article, the author proposes a new metric for evaluating automatic text summaries: an adaptation of the F-measure that yields a hybrid evaluation method, at once extrinsic and intrinsic. The article begins by studying the feasibility of adapting the F-measure to the evaluation of automatic summarization. The author then defines how to compute the F-measure for a candidate summary: a text is represented as a vector of terms, where a term can be a word or a phrase, weighted either by binary presence or by occurrence count. Finally, to assess the accuracy of the F-measure as an evaluation of automatic summarization by extraction, its correlation with the ROUGE evaluation is computed.

Introduction

A summary gives a short overview, or the main points, of a longer document, under the constraint of preserving its semantics. The purpose of this operation is to help readers identify the information that interests them without reading the entire document.

“We cannot imagine our daily life, a day without summary…” says Inderjeet Mani (2001). Headlines, the first paragraph of a newspaper article, newsletters, weather forecasts, tables of sports results, and library catalogs are all summaries. Even in research, authors must accompany their scientific papers with abstracts written by themselves.

The volume of electronic textual information keeps increasing, making access to information difficult. Producing a summary may facilitate that access, but it is also a complex task because it requires language skills.

The other problem of automatic summarization lies in evaluating the quality of the produced summary; the Natural Language Processing community does not yet have an exact solution to this problem and offers only partial ones. In reality, there is no “ideal” summary.

In the literature there are two ways of evaluating an automatic summary:

  • Extrinsic

  • Intrinsic

Our work is based on adapting the F-measure to evaluating the quality of an automatic summary. This adaptation yields a hybrid evaluation method (intrinsic and extrinsic). Overall, our work seeks to answer the following questions:

  • Can we use the F-measure to evaluate automatic summaries?

  • How should the F-measure be adapted to evaluate automatic summaries?

  • Does the F-measure evaluation result really reflect the quality of a summary?
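The adaptation the questions refer to can be sketched in a few lines. The sketch below is illustrative, not the author's implementation: it assumes terms are single lowercased words with binary weighting (one of the two representations mentioned in the abstract), and the function name `f_measure` is our own.

```python
def f_measure(candidate: str, reference: str, beta: float = 1.0) -> float:
    """F-measure between a candidate and a reference summary.

    Each summary is represented as a binary term vector, here the set of
    its unique lowercased words (a term could equally be a phrase).
    """
    cand_terms = set(candidate.lower().split())
    ref_terms = set(reference.lower().split())
    common = cand_terms & ref_terms
    if not common:
        return 0.0
    precision = len(common) / len(cand_terms)  # shared terms / candidate terms
    recall = len(common) / len(ref_terms)      # shared terms / reference terms
    # Standard F_beta combination of precision and recall
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
```

With phrase-level terms or occurrence weighting, only the vector construction would change; the precision/recall combination stays the same.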


Materials And Methods

To determine the accuracy of the F-measure as an evaluation of automatic summarization by extraction, we computed its correlation with another evaluation metric that has already been verified and approved.

The Linear Correlation Coefficient of Bravais-Pearson (Artusi, Verderio, & Marubini, 2002)

This coefficient is used to detect the presence or absence of a linear relationship between two continuous quantitative variables. To calculate it, we must first calculate the covariance: the average of the products of the deviations from the means.

$$\mathrm{cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \quad (1)$$

The linear correlation coefficient of two variables X and Y is equal to the covariance of X and Y divided by the product of the standard deviations of X and Y:

$$r = \frac{\mathrm{cov}(X, Y)}{\sigma_X \, \sigma_Y} \quad (2)$$

Thus, the sign of r indicates the direction of the correlation, while the absolute value of r indicates the intensity of the relationship, that is, the ability to predict the values of Y from those of X.
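Equations (1) and (2) can be sketched directly in code; the function name is illustrative, and population (not sample) standard deviations are assumed, matching the definition of covariance above.

```python
import math

def pearson_r(x: list[float], y: list[float]) -> float:
    """Bravais-Pearson linear correlation coefficient of two samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Equation (1): covariance as the mean of products of deviations
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    # Population standard deviations of X and Y
    std_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)
    std_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / n)
    # Equation (2): r = cov(X, Y) / (sigma_X * sigma_Y)
    return cov / (std_x * std_y)
```

For example, a perfectly linear positive relationship yields r = 1, and a perfectly linear negative one yields r = -1.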

Table 1.
Explanation of correlation values

Correlation | Negative      | Positive
Strong      | -1.0 to -0.5  | 0.5 to 1.0
Weak        | -0.5 to 0.0   | 0.0 to 0.5
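This interpretation follows the usual convention that an absolute value of r closer to 1 indicates a stronger linear relationship; it can be encoded in a small helper (the function name is illustrative):

```python
def correlation_strength(r: float) -> str:
    """Classify a correlation coefficient: |r| >= 0.5 is strong, else weak;
    the sign of r gives the direction of the relationship."""
    sign = "positive" if r >= 0 else "negative"
    strength = "strong" if abs(r) >= 0.5 else "weak"
    return f"{strength} {sign}"
```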
