Introducing Word's Importance Level-Based Text Summarization Using Tree Structure

Nitesh Kumar Jha (Siksha 'O' Anusandhan (Deemed to be University), Bhubaneswar, India) and Arnab Mitra (Siksha 'O' Anusandhan (Deemed to be University), Bhubaneswar, India)
Copyright: © 2020 | Pages: 21
DOI: 10.4018/IJIRR.2020010102

Abstract

Text summarization plays a significant role in quick knowledge acquisition from any text-based knowledge resource. To enhance the text-summarization process, this article presents a new approach to automatic text summarization that is based on levels (word importance factors). During processing of the input text with WordNet, an equivalent tree is produced from the directed graph. Detailed investigations indicate that the execution time of the proposed automatic text summarization grows linearly with the volume of input. Further investigation of its performance indicates that the proposed approach outperforms several existing text-summarization approaches.

1. Introduction

Modern life is greatly enriched by the Internet. Most importantly, the Internet helps us acquire and enhance our knowledge. Because of the ever-increasing volume of text documents, it is very hard to read every single line of every single document. For this reason, text summarization plays a crucial role in knowledge acquisition from available text documents (Pokojski et al., 2018).

Text summarization makes use of keywords. Keywords provide a compact representation of the contents of a document, and keyword extraction is considered the primary task in the automatic summarization of documents. Several text mining applications that use keywords have been presented, e.g., 'just-in-time (JIT)' information retrieval, automatic classification, summarization, and filtering (Zhang, 2008; Reddivari et al., 2018). Manual keyword extraction from a text document is a time-consuming, costly, and tedious task. Furthermore, the ever-increasing number of online documents makes manual processing even less practical. For this reason, automated text summarization and keyword extraction have attracted the attention of investigators over the past years (Beliga et al., 2015).
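The idea of keywords as a compact representation of a document can be illustrated with a minimal sketch. It assumes the simplest possible criterion, raw term frequency after stop-word removal, and is not the level-based method proposed in this article. The function name `extract_keywords`, the tiny `STOP_WORDS` list, and the sample sentence are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "of", "to", "and", "is", "in", "for", "on"}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stop-word tokens in text."""
    # Lowercase the text and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


sample = ("Graph based summarization builds a graph of words. "
          "The graph links words.")
print(extract_keywords(sample, top_n=2))
```

Frequency alone is a crude importance signal; it is used here only because it keeps the sketch self-contained.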

Automatic text summarization is a text mining task that facilitates a quick grasp of the overall sense of a text document (Thakkar et al., 2010; Bharti et al., 2017). A summary may be produced either as an abstractive summary or as an extractive summary. Abstractive summaries are typically produced after learning an internal representation of the article, and their quality is similar to that of summaries produced by a human being (https://rare-technologies.com/text-summarization-in-python-extractive-vs-abstractive-techniques-revisited/). An extractive summary, on the other hand, extracts details from the input article and presents the result to the user (Bharti et al., 2017). In our studies, we found that extractive summarization based on keyword extraction is the most popular. For this reason, our present research focuses on keyword-based extractive summarization. We also found that the graph-based approach is popular for text summarization (Thakkar et al., 2010). Hence, a brief description of graphs, following (Ruohonen, 2013), is presented next.

A graph $G$ is a pair containing a set of vertices $V$ and a set of edges $E$, with a relation associated with each edge. Mathematically, the graph $G$, as found in (Ruohonen, 2013), is restated in Equation 1:

$$G = (V, E) \qquad (1)$$

where $G$ denotes the graph, $V$ denotes the set of vertices, and $E$ denotes the edges formed by pairs of vertices; i.e., an arc or edge between vertex $u$ and vertex $v$ is described as $(u, v)$.
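The definition above, a graph as a set of vertices together with a set of edges given by vertex pairs, can be sketched in code as follows. This is a minimal illustration under stated assumptions rather than the data structure used in the article: the class name `DirectedGraph`, its methods, and the word labels on the vertices are all hypothetical. Edges are stored as ordered pairs because the abstract refers to a directed graph.

```python
class DirectedGraph:
    """A graph stored as a vertex set and a set of ordered vertex pairs."""

    def __init__(self):
        self.vertices = set()  # the vertex set
        self.edges = set()     # the edge set; each edge is an ordered pair (u, v)

    def add_edge(self, u, v):
        """Add the arc (u, v), registering both endpoints as vertices."""
        self.vertices.update((u, v))
        self.edges.add((u, v))

    def successors(self, u):
        """Return, in sorted order, every vertex v such that (u, v) is an edge."""
        return sorted(v for (a, v) in self.edges if a == u)


# Hypothetical word vertices, chosen only for illustration.
g = DirectedGraph()
g.add_edge("text", "summarization")
g.add_edge("text", "keyword")
g.add_edge("keyword", "extraction")

print(sorted(g.vertices))
print(g.successors("text"))
```

Storing the edges as a set of pairs mirrors the mathematical definition directly; an adjacency list would usually be preferred when successor lookups dominate.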
