A Fireworks Algorithm for Modern Web Information Retrieval with Visual Results Mining

Hadj Ahmed Bouarara (Dr. Tahar Moulay University of Saida, Algeria), Reda Mohamed Hamou (Dr. Tahar Moulay University of Saida, Algeria), Abdelmalek Amine (Dr. Tahar Moulay University of Saida, Algeria) and Amine Rahmani (Dr. Tahar Moulay University of Saida, Algeria)
DOI: 10.4018/978-1-4666-9562-7.ch034

Abstract

The spread of computers, the growing number of electronic documents available online and offline, and the explosion of electronic communication have profoundly changed the relationship between people and information. We are now awash in a rising tide of information, and the web affects almost every aspect of our lives. The development of automatic tools for efficient access to this huge amount of digital information has therefore become a necessity. This paper presents a new web information retrieval system based on the fireworks algorithm (FWA-IR). It relies on a random explosion of fireworks and a set of operators (displacement, mapping, mutation, and selection). Each firework explosion is a potential solution to the user's need (the query): it generates a set of sparks (documents) with two locations (relevant and irrelevant). The authors' experiments were performed on the MEDLARS dataset using the validation measures recall, precision, F-measure, silence, noise, and accuracy. They study the sensitive parameters of the technique (number of initial locations, number of iterations, mutation probability, fitness function, selection method, text representation, and distance measure) in order to show the benefit of this approach compared with other methods from the literature (tabu search, simulated annealing, and a naïve method). Finally, a result-mining tool was built to display the outcome in graphical form (3D cube and cobweb) with greater realism, using zooming and rotation functions.
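
The validation measures named above can be illustrated with a short sketch. The preview does not define them, so the code assumes the usual set-based definitions, with noise taken as the share of returned documents that are irrelevant and silence as the share of relevant documents that are missed. The function name and arguments are hypothetical and are not taken from the chapter.

```python
def retrieval_measures(retrieved, relevant, collection_size):
    """Set-based evaluation of one query (assumed standard definitions).

    retrieved, relevant: sets of document ids.
    collection_size: total number of documents in the collection.
    """
    tp = len(retrieved & relevant)        # relevant documents returned
    fp = len(retrieved - relevant)        # irrelevant documents returned
    fn = len(relevant - retrieved)        # relevant documents missed
    tn = collection_size - tp - fp - fn   # irrelevant documents left out

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        # Assumption: noise and silence are the complements of
        # precision and recall, computed here from the raw counts.
        "noise": fp / (tp + fp) if tp + fp else 0.0,
        "silence": fn / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / collection_size,
    }


measures = retrieval_measures({1, 2, 3, 4}, {2, 4, 5}, collection_size=10)
print(measures)
```

In this toy case two of the four returned documents are relevant and one relevant document is missed, so precision is 0.5, recall is 2/3, and accuracy is 0.7.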

Introduction

In the last decade, the emergence of the internet and the development of electronic communication have turned the world into a global village. Since the mid-1990s, the Web has become an increasingly crucial source of information, which has renewed interest in information retrieval. Meanwhile, data sources are numerous and the amount of information available in electronic format keeps growing, especially unstructured data (free text written in natural language).

Some years ago, information retrieval systems (IRSs) were used by only a small group of people, but today they are used by billions of people around the world. An IRS responds to a user's need (query) by selecting, from a large collection, the subset of documents closest to that query. IRSs are used in application domains such as the web, forums, business, and social networks. Unfortunately, in the current digital society, information overload and relevance have become major issues, and most IRSs face several drawbacks in terms of:

  • The Quality of Relevance and Usefulness: Access to relevant information tailored to the needs of a specific user has become both difficult and necessary.

  • The Parameters Choice: Most information retrieval systems depend on parameters such as the similarity measure and the data representation technique. A poor choice of these parameters may degrade the quality of the results.

  • The Ambiguity of Natural Language: Ambiguity is a major problem because the documents are written by humans in natural language.

  • The Multiplicity of Data: The content of documents comes from different disciplines (biology, multimedia, sports, computer science, etc.).

  • User Interface: Most current search engines rely on a basic interaction that displays the results as an ordered list. The user typically views only the first one or two pages and ignores the rest, although the desired information may appear on the fifth or sixth page.
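
To make the idea of selecting the documents "closest" to a query concrete, and to show where the two parameters mentioned above enter, here is a minimal sketch. It assumes a raw term-frequency bag-of-words representation and cosine similarity; both are illustrative choices, not necessarily the ones studied in the chapter, and all function names are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    # Text representation (assumed): raw term frequencies of
    # whitespace-separated, lower-cased tokens.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    # Similarity measure (assumed): cosine of the angle between two vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    # Return (score, document index) pairs, best match first.
    q = tf_vector(query)
    scored = [(cosine_similarity(q, tf_vector(d)), i)
              for i, d in enumerate(documents)]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


docs = [
    "heart disease treatment",
    "football match results",
    "treatment of heart failure",
]
for score, index in rank("heart treatment", docs):
    print(f"{score:.3f}  {docs[index]}")
```

Swapping `tf_vector` or `cosine_similarity` for another representation or distance changes the ranking, which is why a poor choice of these parameters can degrade the results.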

Engineers and decision makers are confronted daily with increasingly complex (NP-hard) problems that affect nearly all sectors (the design of mechanical systems, image pre-processing, information retrieval, clustering, etc.), where classical techniques are unable to find an effective solution.

Recent years have seen the rise of new paradigms known as meta-heuristic methods, which have proved effective on optimization problems by searching for the optimum of a function among a finite number of candidate solutions. These techniques translate a phenomenon observed in everyday life into an algorithm that a machine can execute.

This work addresses the limits detailed above by adapting a recent meta-heuristic technique, the fireworks algorithm (FWA) introduced by Tan (Tan, 2010), to the information retrieval problem, together with a new visualization tool. The study pursues several objectives:

  • Developing a new version of the fireworks algorithm to solve the information retrieval problem.

  • Studying the influence of each parameter (iteration number, location number, mutation probability, distance measure, text representation, selection method, and fitness function) in order to achieve the best performance.

  • Comparing the obtained results with those of other works in the literature.

  • Building a modern user/information interface with better human/machine interaction using two types of visualization methods (3D cube and silky structure).
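
For readers unfamiliar with the fireworks algorithm, the sketch below shows a simplified, generic version that minimizes a toy continuous function. It is not the authors' FWA-IR, whose details are not given in this preview. It only illustrates the operators named in the abstract (explosion with displacement, mapping, mutation, and selection). The function names, default parameter values, the random choice of non-elite fireworks, and the sphere test function are all assumptions made for illustration.

```python
import random


def sphere(x):
    # Toy fitness function (assumption): minimum 0 at the origin.
    return sum(v * v for v in x)


def map_to_bounds(v, low, high):
    # Mapping operator: fold an out-of-range coordinate back into [low, high].
    if low <= v <= high:
        return v
    return low + abs(v) % (high - low)


def fireworks_minimize(f, dim=2, low=-5.0, high=5.0, n_fireworks=5,
                       total_sparks=30, max_amplitude=2.0, n_mutations=3,
                       iterations=50, seed=0):
    rng = random.Random(seed)
    eps = 1e-12
    fireworks = [[rng.uniform(low, high) for _ in range(dim)]
                 for _ in range(n_fireworks)]
    history = []
    for _ in range(iterations):
        fitness = [f(x) for x in fireworks]
        worst, best = max(fitness), min(fitness)
        spread_worst = sum(worst - y for y in fitness) + eps
        spread_best = sum(y - best for y in fitness) + eps
        candidates = [list(x) for x in fireworks]
        for x, y in zip(fireworks, fitness):
            # Explosion: better fireworks get more sparks and a smaller amplitude.
            n_sparks = max(1, round(total_sparks * (worst - y + eps) / spread_worst))
            amplitude = max_amplitude * (y - best + eps) / spread_best
            for _ in range(n_sparks):
                spark = list(x)
                # Displacement: shift a random subset of dimensions.
                shift = amplitude * rng.uniform(-1.0, 1.0)
                for d in rng.sample(range(dim), rng.randint(1, dim)):
                    spark[d] = map_to_bounds(spark[d] + shift, low, high)
                candidates.append(spark)
        for _ in range(n_mutations):
            # Mutation: scale some dimensions of a random firework by a Gaussian factor.
            spark = list(rng.choice(fireworks))
            factor = rng.gauss(1.0, 1.0)
            for d in rng.sample(range(dim), rng.randint(1, dim)):
                spark[d] = map_to_bounds(spark[d] * factor, low, high)
            candidates.append(spark)
        # Selection (simplified): keep the best candidate, pick the rest at random.
        candidates.sort(key=f)
        elite, rest = candidates[0], candidates[1:]
        fireworks = [elite] + rng.sample(rest, n_fireworks - 1)
        history.append(f(elite))
    return fireworks[0], history


best, history = fireworks_minimize(sphere)
print(best, history[-1])
```

Because the best candidate always survives selection, the best fitness recorded in `history` never gets worse from one iteration to the next. Adapting such a scheme to retrieval would require a fitness function that scores candidate document sets against the query, which is one of the parameters the study sets out to examine.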
