Using Association Rules for Query Reformulation

Ismaïl Biskri, Louis Rompré

Source Title: Next Generation Search Engines: Advanced Models for Information Retrieval

DOI: 10.4018/978-1-4666-0330-1.ch013

Abstract

This chapter presents research on the combination of two data mining methods: text classification and maximal association rules. Text classification has long been a focus of interest for many researchers. However, its results take the form of lists of words (classes) that people often do not know what to do with. The use of maximal association rules brings two advantages: (1) the detection of dependencies and correlations between the relevant units of information (words) of different classes, and (2) the extraction of hidden, often relevant, knowledge from a large volume of data. The authors show how this combination can improve the process of information retrieval.

Introduction

The ever-increasing penetration of the internet and the growing volume of electronic documents have made information retrieval a major discipline in computer science. At the same time, access to relevant information has become difficult: the tide of available information is at times reduced to little more than noise.

Information retrieval consists of selecting, from a document database, the documents or segments of text likely to meet the needs of a user. This operation is carried out with numerical tools that are sometimes combined with linguistic tools in order to refine the granularity of the results according to certain points of view (Desclés & Djioua, 2009), with logical tools in question-answer format, or even with tools specific to the Semantic Web. However, we deliberately omit a presentation of the contributions of these linguistic methods, logical methods, and the Semantic Web so as not to weigh down this chapter, since we are primarily interested in the numerical side.

Formally, three main elements stand out with regard to information retrieval:

1. The group of documents.
2. The information needs of the users.
3. The relevance of the documents or segments of text that an information retrieval system returns given the needs expressed by the user.

The last two aspects necessarily rely on the user. Not only does the user define their needs, but they also validate the relevance of the documents returned. To express their needs, a user formulates a query that often (but not always) takes the form of keywords submitted to an information retrieval system based on a Boolean model, a vector model, or a probabilistic model (Boughanem & Savoy, 2008). However, it is often difficult for a user to find keywords that express their exact needs. In many cases, the user faces, on the one hand, a lack of knowledge about the subject of their search and, on the other hand, results that may be biased, as is the case with search engines on the Web. Thus, retrieving relevant documents on the first search is almost impossible. There is therefore a need to reformulate the query, either by using completely different keywords or by expanding the initial query with new keywords (El Amrani et al., 2004).
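
As a rough illustration of keyword querying and of reformulation by expansion, the following minimal sketch runs a conjunctive keyword query against a toy Boolean index. The documents, the helper names `build_index` and `boolean_and`, and the whitespace tokenization are assumptions made for the example; they are not taken from the chapter or from any particular retrieval system.

```python
from typing import Dict, Iterable, Set


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercase term to the set of document ids containing it."""
    index: Dict[str, Set[str]] = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def boolean_and(index: Dict[str, Set[str]], keywords: Iterable[str]) -> Set[str]:
    """Return the ids of documents that contain every keyword."""
    postings = [index.get(k.lower(), set()) for k in keywords]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical toy collection, used only for illustration.
docs = {
    "d1": "jaguar car engine review",
    "d2": "jaguar habitat in the rainforest",
    "d3": "rainforest conservation news",
}
index = build_index(docs)

# The initial query is ambiguous and matches both the car and the animal.
initial = boolean_and(index, ["jaguar"])
# Expanding the query with one more keyword narrows the result set.
expanded = boolean_and(index, ["jaguar", "rainforest"])
```

In this toy setting the added keyword is what separates the intended documents from the rest; the difficulty described above is that the user frequently does not know which keyword to add.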

In the case of expanding the query, two variants are possible:

1. The first is manual. The user strengthens the query with terms they judge relevant, taken from documents they also judge relevant. This strategy is simple and is the least computationally expensive. However, it does not give a general view of the group of documents returned by the retrieval system, because their number is too large for a person to review. Quite often, the user consults only the first few documents and judges only those.
2. The second is semi-automatic. The terms added to the initial query are chosen by the user from a thesaurus (which may be constructed manually) or from similarity classes of documents and co-occurrences of terms. These classes are obtained by applying a classification to the group of documents returned for the initial request, as in clustering engines. A process of classifying textual data from web sites can help the user of a search engine to better identify the target site or to better formulate a query. Indeed, the lexical units that co-occur with the keywords submitted to the search engine can provide more detail about the documents the user wants to reach. However, interpreting similarity classes is a nontrivial exercise. They are usually presented as lists of words that occur together, and these lists are often very large and their vocabulary very noisy.
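
The semi-automatic variant can be sketched as follows: collect the terms that co-occur with the query keywords in the returned documents, then present the most frequent ones to the user as candidate expansion terms. This is a simplified stand-in that uses plain document-level co-occurrence counts, not the classification or the maximal association rules discussed in the chapter. The function name `cooccurring_terms`, the regular-expression tokenizer, and the sample documents are all assumptions for the example.

```python
import re
from collections import Counter
from typing import FrozenSet, Iterable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cooccurring_terms(
    documents: Iterable[str],
    query_terms: Iterable[str],
    stopwords: FrozenSet[str] = frozenset(),
    top_k: int = 5,
) -> List[Tuple[str, int]]:
    """Rank terms by the number of documents in which they co-occur
    with at least one query term."""
    query = {t.lower() for t in query_terms}
    counts: Counter = Counter()
    for text in documents:
        terms = set(tokenize(text))
        if query & terms:
            counts.update(terms - query - stopwords)
    # Sort by descending count, then alphabetically, for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical documents standing in for the results of an initial query.
results = [
    "Jaguar car engine speed",
    "Jaguar rainforest habitat",
    "Jaguar rainforest predator",
    "Python snake",
]
suggestions = cooccurring_terms(results, ["jaguar"], top_k=3)
```

Even on four short documents, most suggested terms are tied at a single co-occurrence. On real result sets such lists grow long and noisy, which is the interpretation problem noted above.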

In this chapter we will show how maximal association rules can improve the semi-automatic reformulation of a query in order to access target documents more quickly.
