Introduction
In the last decade, the emergence of the Internet and the development of electronic communication have turned the world into a global village. Since the mid-1990s, the Web has become an increasingly crucial source of information, which has renewed interest in information retrieval. Meanwhile, data sources have multiplied and the amount of information available in electronic format keeps growing, especially unstructured data (free text written in natural language).
Some years ago, information retrieval systems (IRSs) were used by only a small group of people, but today they are used by billions of people around the world. An IRS responds to a user's need (expressed as a query) by selecting, from a large collection, the subset of documents that are closest to this query. IRSs are used in application domains such as the Web, forums, business, and social networks. Unfortunately, in the current digital society, information overload and relevance have become major issues, and most IRSs face several drawbacks in terms of:
• The Quality of Relevance and Usefulness: Access to relevant information tailored to the needs of a specific user has become both difficult and necessary.
• The Choice of Parameters: Most information retrieval systems depend on parameters such as the similarity measure and the data representation technique. A poor choice of these parameters may degrade the quality of the results.
• The Ambiguity of Natural Language: This is a major problem because the documents are written by humans.
• The Multiplicity of Data: The content of documents comes from different disciplines (biology, multimedia, sports, computer science, etc.).
• User Interface: Most current search engines rely on a basic interaction that displays the results as an ordered list. The user typically views only the first one or two pages and ignores the rest, although the desired information may appear on the fifth or sixth page.
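To make the definition of an IRS and the role of its parameters concrete, the following is a minimal sketch of query-based retrieval. It assumes a bag-of-words term-frequency representation and cosine similarity as the similarity measure; both are illustrative choices, and the function names (`tf_vector`, `cosine_similarity`, `retrieve`) and sample documents are hypothetical rather than taken from the system studied here.

```python
import math
from collections import Counter


def tf_vector(text):
    """Represent a text as a bag of words with raw term frequencies."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=2):
    """Return (document index, score) pairs, best first, dropping zero scores."""
    q = tf_vector(query)
    scored = [(cosine_similarity(q, tf_vector(d)), i)
              for i, d in enumerate(documents)]
    # Sort by decreasing score; ties are broken by document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, s) for s, i in scored[:top_k] if s > 0]


# Hypothetical toy collection, for illustration only.
documents = [
    "fireworks algorithm for optimization",
    "information retrieval with similarity measures",
    "football results and sports news",
]
results = retrieve("information retrieval", documents)
print(results)
```

Swapping the representation (for example, weighting terms differently) or the similarity measure changes which documents are ranked first, which is why the choice of these parameters matters for result quality.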
Engineers and decision makers are confronted daily with increasingly complex (NP-hard) problems that affect nearly all sectors (the design of mechanical systems, image pre-processing, information retrieval, clustering, etc.), where classical techniques are unable to find an effective solution.
Recent years have seen the rise of a new family of paradigms known as meta-heuristic methods, which have demonstrated their strength on optimization problems by searching for the optimum of a function among a finite number of candidate solutions. These techniques translate the process of a phenomenon from daily life into an algorithm that a machine can execute.
The aim of our work is to shed light on the limits detailed above by adapting a recent meta-heuristic technique called the fireworks algorithm (FWA), introduced by Tan (Tan, 2010), to the problem of information retrieval, together with a new visualization tool. This study pursues several objectives:
• Developing a new version of the fireworks algorithm to solve the information retrieval problem.
• Studying the influence of each parameter (number of iterations, number of locations, mutation probability, distance measure, text representation, selection method, and fitness function) in order to achieve the best performance.
• Comparing the obtained results with those of other works in the literature.
• Building a modern user/information interface with better human/machine interaction, using two visualisation methods (3D cube and silky structure).
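For readers unfamiliar with the fireworks algorithm, the general idea can be sketched as follows. This is not the information retrieval adaptation developed in this work; it is a simplified, generic version that minimizes a continuous test function (the sphere function, an assumption made for illustration). Better fireworks produce more sparks with a smaller explosion amplitude, a few Gaussian sparks add diversity, and the next generation keeps the best location. The selection step here picks the remaining locations uniformly at random, which is a simplification of the distance-based selection usually described for FWA. All names and default parameter values are hypothetical.

```python
import random


def sphere(x):
    """Assumed test objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def clip(x, low, high):
    """Keep a location inside the search bounds."""
    return [min(max(v, low), high) for v in x]


def fireworks_minimize(fitness, dim=2, n_fireworks=5, n_sparks=20,
                       n_gaussian=5, max_amplitude=2.0, low=-5.0, high=5.0,
                       iterations=30, seed=0):
    rng = random.Random(seed)
    eps = 1e-12  # avoids division by zero when all scores are equal
    fireworks = [[rng.uniform(low, high) for _ in range(dim)]
                 for _ in range(n_fireworks)]
    history = []  # best fitness at the start of each iteration
    for _ in range(iterations):
        scores = [fitness(f) for f in fireworks]
        best, worst = min(scores), max(scores)
        history.append(best)
        spark_weight = sum(worst - s for s in scores) + eps
        amp_weight = sum(s - best for s in scores) + eps
        candidates = [list(f) for f in fireworks]
        for f, s in zip(fireworks, scores):
            # Better (lower) fitness: more sparks, smaller amplitude.
            count = max(1, round(n_sparks * (worst - s + eps) / spark_weight))
            amplitude = max_amplitude * (s - best + eps) / amp_weight
            for _ in range(count):
                spark = list(f)
                shift = amplitude * rng.uniform(-1, 1)
                for d in rng.sample(range(dim), rng.randint(1, dim)):
                    spark[d] += shift
                candidates.append(clip(spark, low, high))
        for _ in range(n_gaussian):
            # Gaussian sparks: scale random dimensions of a random firework.
            spark = list(rng.choice(fireworks))
            g = rng.gauss(1.0, 1.0)
            for d in rng.sample(range(dim), rng.randint(1, dim)):
                spark[d] *= g
            candidates.append(clip(spark, low, high))
        # Simplified selection: keep the best, fill the rest at random.
        candidates.sort(key=fitness)
        fireworks = [candidates[0]] + rng.sample(candidates[1:], n_fireworks - 1)
    best_location = min(fireworks, key=fitness)
    history.append(fitness(best_location))
    return best_location, history


best_location, history = fireworks_minimize(sphere)
print(best_location, history[-1])
```

Because the best candidate is always carried over, the best fitness never gets worse from one iteration to the next. Parameters such as the number of iterations, the number of locations, and the selection method correspond to the kind of settings whose influence is studied in this work, although their exact definitions there may differ from this sketch.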