A Brief Review of Metaheuristics for Document or Text Clustering

Sinem Büyüksaatçı (Istanbul University, Turkey) and Alp Baray (Istanbul University, Turkey)
Copyright: © 2016 | Pages: 13
DOI: 10.4018/978-1-5225-0075-9.ch012


Document clustering, which draws on concepts from information retrieval, automatic topic extraction, natural language processing, and machine learning, is one of the most popular research areas in data mining. Because so much information now exists in electronic form, fast and high-quality cluster analysis plays an important role in helping users navigate, summarize, and organize that information effectively. The literature offers a number of techniques that solve the document clustering problem efficiently. Over the last decade, however, researchers have turned to metaheuristic algorithms because of the limitations of traditional clustering algorithms. This chapter gives a brief review of research papers that apply different metaheuristic algorithms to document or text clustering.
Chapter Preview


Real-life problems can have numerous solutions, and sometimes an infinite number of solutions is possible. In such a case, if the problem admits a single solution, that solution is realized only with a unique set of parameter values, and traditional optimization approaches cannot be applied (Antoniou & Lu, 2007). Moreover, the size of the solution space, which prevents an exhaustive search, together with the complexity and difficult constraints of these problems, has made approximate methods popular.

Metaheuristic algorithms, a class of approximate methods, emerged in the 1980s. The word “heuristic” originates from the ancient Greek word “heuriskein”, meaning the art of discovering new strategies (rules) to solve problems. The prefix “meta”, also Greek, means upper-level methodology. Fred Glover (1986) first introduced the term “metaheuristic” in the paper “Future paths for integer programming and links to artificial intelligence” (Talbi, 2009, p. 1).
