Data Text Mining Based on Swarm Intelligence Techniques: Review of Text Summarization Systems

Mohamed Atef Mosa (Institute of Public Administration, Department of Information Technology, Riyadh, Saudi Arabia)
Copyright: © 2020 | Pages: 37
DOI: 10.4018/978-1-5225-9373-7.ch004

Abstract

Given the rapid growth of data on the web, mining it to extract the most informative content as a concise brief would benefit many users. Consequently, there is great interest in developing automatic text summarization approaches. In this chapter, the authors highlight the use of swarm intelligence (SI) optimization techniques for the first time in solving the text summarization problem. In addition, a justification is presented for why nature-heuristic algorithms, especially ant colony optimization (ACO), are the best algorithms for solving complicated optimization tasks. Moreover, it has been observed that the text summarization problem had not previously been formalized as a multi-objective optimization (MOO) task, even though many conflicting objectives need to be achieved. SI has also not previously been employed to support real-time tasks. Therefore, a novel framework for short text summarization is proposed to address this gap. Ultimately, this chapter aims to encourage researchers to give further consideration to SI algorithms for solving summarization tasks.

Types Of Text Summarization

Summaries fall into three important categories: single-document, multi-document, and short text. Generating a brief from many documents is more complicated than extracting information from a single document. The main difficulty arises when several documents are summarized together, particularly with a large volume of short texts. To deal with redundancy, some researchers suggest picking the sentences at the beginning of each paragraph and then measuring their similarity with later sentences to select the best one (Sarkar, 2019). Likewise, the Maximal Marginal Relevance (MMR) approach is employed by Mosa, Hamouda, and Marei (2017b) to reduce redundancy. To produce optimal results in multi-document and short text summarization, several researchers have investigated diverse systems and algorithms (Mosa et al., 2017a; Gambhir & Gupta, 2017; Liu, Chen, & Tseng, 2015; Al-Dhelaan, 2015).
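To illustrate how an MMR-style selection can trade relevance against redundancy, the following is a minimal sketch of the generic greedy MMR idea. It is not the implementation used in the cited work: the function names (cosine, mmr_select), the bag-of-words representation, the query-based relevance score, and the trade-off parameter lam are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, query, k=2, lam=0.7):
    """Greedily pick k sentences, balancing relevance and redundancy.

    Hypothetical helper: lam=1.0 ranks by relevance only; lower values
    penalize similarity to sentences that were already selected.
    """
    vectors = [Counter(s.lower().split()) for s in sentences]
    query_vec = Counter(query.lower().split())
    selected = []                      # indices chosen so far
    candidates = list(range(len(sentences)))

    def score(i):
        relevance = cosine(vectors[i], query_vec)
        # Redundancy is the highest similarity to any selected sentence.
        redundancy = max(
            (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

With two identical sentences and one distinct sentence as input, a lower lam tends to skip the duplicate in favor of the distinct sentence, whereas lam=1.0 ignores redundancy entirely. In practice, the whitespace tokenizer and raw term counts would likely be replaced by a proper tokenizer and TF-IDF or embedding-based similarity.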

Key Terms in this Chapter

Natural Language Processing: The field concerned with the interactions between computers and human natural languages, in particular how machines process and analyze large amounts of natural language data.

Short Text Summarization: A process that aims to select the most important short texts written about the same topic.

Multi-Document Summarization: An automatic process aimed at extracting information from multiple documents written about the same topic.

Swarm Intelligence Techniques: SI systems typically consist of a population of agents interacting with each other within their environment. These interactions between the agents lead to the emergence of “intelligent” global behavior, unknown to the individual agents.

Text Mining: The process of extracting and deriving high-quality, important information from text.

Automatic Summarization: A process that reduces a large amount of data to a short set of words that reveals the core of the full text.

Nature Heuristic Techniques: Techniques designed to solve complicated problems, especially optimization problems, more quickly when traditional methods fail to find an exact solution.
