A Systematic Study of Feature Selection Methods for Learning to Rank Algorithms


Mehrnoush Barani Shirzad (Data Mining Laboratory, Department of Computer Engineering, Alzahra University, Tehran, Iran) and Mohammad Reza Keyvanpour (Department of Computer Engineering, Alzahra University, Tehran, Iran)
Copyright: © 2018 |Pages: 22
DOI: 10.4018/IJIRR.2018070104


This article describes how feature selection for learning to rank algorithms has become an important issue. Noisy and irrelevant features degrade performance and cause overfitting in ranking systems; reducing the number of features by eliminating irrelevant and noisy ones is a solution. Several studies have applied feature selection to learning to rank, improving both the efficiency and the effectiveness of ranking models. As the number of features, and consequently the number of irrelevant and noisy features, keeps growing, a systematic review of feature selection methods for learning to rank is required. In this article, a framework for examining research on feature selection for learning to rank (FSLR) is proposed. Under this framework, the authors review state-of-the-art methods and suggest several criteria for analyzing them. FSLR offers a structured classification of current algorithms that future research can use to: a) select appropriate strategies from existing algorithms using certain criteria, or b) find ways to extend existing methodologies.
Article Preview


Ranking is a problem observed in many information retrieval systems. Learning to rank is the application of machine learning methods to construct a model that automatically predicts the ranking of a list of documents. Learning to rank has been applied in document retrieval, multimedia retrieval, and collaborative filtering. A large number of algorithms have been proposed for learning to rank (Liu, 2011) in document retrieval applications. In document ranking, training data contain a list of queries and a list of documents related to each query, where relevance judgments are provided for each query-document pair. Each instance is a query-document pair represented by a feature vector. A learning algorithm is applied to the training data to learn a ranking model.
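To make the training-data representation concrete, the following sketch parses one instance in the common LETOR text format (an assumption for illustration; the article does not name a specific dataset format). Each line is one query-document pair: a graded relevance label, a query id, and the feature vector.

```python
# Sketch assuming the common LETOR text format (illustrative, not
# specified in the article): "<label> qid:<id> <feat>:<value> ... #comment"

def parse_letor_line(line):
    """Parse one LETOR-style line into (label, qid, feature dict)."""
    parts = line.split("#")[0].split()  # drop the trailing comment
    label = int(parts[0])               # graded relevance judgment
    qid = parts[1].split(":")[1]        # query identifier
    features = {}
    for tok in parts[2:]:               # "index:value" feature tokens
        idx, val = tok.split(":")
        features[int(idx)] = float(val)
    return label, qid, features

line = "2 qid:10 1:0.89 2:0.03 3:0.45 #docid = GX008-86"
print(parse_letor_line(line))
# → (2, '10', {1: 0.89, 2: 0.03, 3: 0.45})
```

Each parsed instance is exactly the query-document feature vector plus relevance judgment described above; a dataset is simply a list of such instances grouped by qid.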

Learning to rank algorithms follow three major approaches: pointwise, pairwise, and listwise, which differ in their input and output spaces, hypotheses, and loss functions. In the pointwise approach, the input space contains a single query-document pair, the hypothesis is a scoring function that predicts the relevance degree of a document, and the loss function evaluates the accuracy of the prediction for each document. The output space depends on the algorithm used. In regression-based algorithms, the output is a relevance score, which is a real value. Some studies treat ranking as a classification problem, either binary or multiclass; the output space in these works contains binary relevance judgments (0 or 1) or multiclass labels. Studies that treat ranking as an ordinal regression problem output ordered categories. In the pairwise approach, the input space includes pairs of documents represented by feature vectors, and the hypothesis is usually a scoring function. The loss function measures the inconsistency between the ground truth and the model output, and the output space contains a preference between each pair of documents. Many algorithms have been presented for this approach, including neural network models (Burges & Shaked, 2005), boosting methods (Freund & Iyer, 2003), and SVM models (Joachims, 2006). In the listwise approach, the input space includes a list of documents associated with a query and the hypothesis is a scoring function; two kinds of loss function have been proposed: the first is directly based on an evaluation measure, while the second is not tied to a specific measure (Cao & Qin, 2007). The first kind is realized in three ways: approximating a measure, optimizing a bound on a measure, or directly optimizing the evaluation measure (Xu & Li, 2007). The output space contains a ranked list of documents.
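The pairwise approach above can be sketched with a minimal example: a linear scoring function and a hinge loss over labelled document pairs, in the spirit of the SVM-based methods cited (Joachims, 2006). The weights, features, and data below are toy values for illustration, not from the article.

```python
# Minimal pairwise-ranking sketch: a pair contributes zero loss when the
# preferred document outscores the other by a margin of at least 1.

def score(w, x):
    """Linear scoring function f(x) = w . x (the hypothesis)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pairwise_hinge_loss(w, pairs):
    """Sum of hinge losses over labelled document pairs.

    pairs: list of (x_i, x_j, y) with y = +1 if x_i should rank
    above x_j for the query, and y = -1 otherwise.
    """
    total = 0.0
    for x_i, x_j, y in pairs:
        margin = y * (score(w, x_i) - score(w, x_j))
        total += max(0.0, 1.0 - margin)
    return total

# Toy example: two features per document (say, BM25 and PageRank).
w = [1.0, 0.5]
pairs = [
    ([3.0, 1.0], [1.0, 0.5], +1),  # correctly ordered: zero loss
    ([2.0, 1.0], [0.2, 0.1], -1),  # misordered: positive loss
]
print(pairwise_hinge_loss(w, pairs))
# → 3.25
```

Training would minimize this loss over w; the output space is then the preference induced between every pair of documents, as described above.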

Learning to rank algorithms use the outputs of standard ranking models, including BM25, language models, and PageRank, as features. The number of features in these datasets is considerable, making computation time-consuming and the training process complicated. Moreover, the presence of irrelevant, noisy, and redundant features can reduce the accuracy of the ranking algorithm. Irrelevant features are those that have little effect on the output, while a redundant feature plays the same role as another feature. Feature selection methods can effectively address these problems.

In recent years, several works have applied feature selection to learning to rank. Feature selection methods used for classification are categorized into three principal groups (Guyon & Elisseeff, 2003), all of which have been applied to learning to rank: filter, wrapper, and embedded methods. Filter methods are preprocessing algorithms independent of the learning phase (Geng & Liu, 2007; Naini & Altingovde, 2014; Gupta & Rosso, 2012; Shirzad & Keyvanpour, 2015; Gigli & Lucchese, 2016; Yu & Oh, 2009; Hua & Zhang, 2010). Wrapper methods constitute another preprocessing strategy, conducted based on the ranking algorithm itself (Pan & Converse, 2009; Dang & Croft, 2010; Pahikkala & Airola, 2010; Sousa & Canuto, 2016; Yu & Oh, 2009; Hua & Zhang, 2010). Embedded methods run feature selection while learning the ranker (Sun & Qin, 2009; Lai & Pan, 2012; Lai & Pan, 2013; Lai & Pan, 2014; Chang & Zheng, 2009; Krasotkina & Mottl, 2015). Feature selection for learning to rank can be investigated from two aspects: the ranking model and the feature selection model. Learning to rank algorithms have been intensively studied (Liu, 2011), so we focus on feature selection strategies. Given the importance of feature selection in learning to rank, and the studies that have shown its effectiveness and efficiency in ranking, in this paper we review and classify most existing studies.
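A filter method of the kind described above can be illustrated with a small sketch: score each feature independently of any learner and keep the top k. The per-feature score here, the fraction of document pairs that the feature alone orders consistently with the relevance labels, is one crude proxy for the importance measures used in filter methods such as Geng & Liu (2007); the data and names are illustrative, not from the article.

```python
# Hedged sketch of a filter-style selector: rank features by how well
# each one, in isolation, agrees with the relevance labels, then keep
# the k best. No ranking model is trained, which is what makes it a
# filter (preprocessing) method rather than a wrapper or embedded one.

def feature_agreement(values, labels):
    """Fraction of document pairs whose order by this single feature
    agrees with the order implied by the relevance labels."""
    agree = comparable = 0
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                continue  # equally relevant: pair carries no preference
            comparable += 1
            if (values[i] - values[j]) * (labels[i] - labels[j]) > 0:
                agree += 1
    return agree / comparable if comparable else 0.0

def select_top_k(feature_matrix, labels, k):
    """feature_matrix[d][f] = value of feature f for document d.
    Returns the indices of the k features with highest agreement."""
    n_features = len(feature_matrix[0])
    scores = []
    for f in range(n_features):
        column = [row[f] for row in feature_matrix]
        scores.append((feature_agreement(column, labels), f))
    scores.sort(reverse=True)
    return [f for _, f in scores[:k]]

# Toy data: 4 documents, 3 features, graded relevance labels.
X = [[0.9, 0.1, 0.3],
     [0.7, 0.8, 0.2],
     [0.4, 0.2, 0.9],
     [0.1, 0.9, 0.5]]
y = [2, 2, 1, 0]
print(select_top_k(X, y, 2))  # → [0, 2]
```

A wrapper method would instead evaluate candidate subsets by training the ranker on each, and an embedded method would fold the selection into the ranker's own objective; this sketch covers only the filter case.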
