A Survey on Data Mining Techniques in Research Paper Recommender Systems

Benard Magara Maake, Sunday O. Ojo, Tranos Zuva

Source Title: Research Data Access and Management in Modern Libraries

DOI: 10.4018/978-1-5225-8437-7.ch006

Abstract

In this chapter, the authors give an overview of the main data mining techniques that are utilized in the context of research paper recommender systems. These techniques refer to mathematical models and tools that are utilized in discovering patterns in data. Data mining is a term used to describe a collection of techniques that infer recommendation rules and build models from research paper datasets. The authors briefly describe how research paper recommender systems' data is processed, analyzed, and then, finally, interpreted using these techniques. They review different distance measures, sampling techniques, and dimensionality reduction methods employed in computing research paper recommendations. They also review the various clustering, classification, and association rule-mining methods employed to mine for hidden information. Finally, they highlight the major data mining issues that are affecting research paper recommender systems.

Chapter Preview

1. Introduction

Recommender systems have lately gained a significant role in information filtering and search. In the field of research paper recommender systems, various data mining techniques have been used to perform a range of tasks. This chapter highlights the data mining and associated methods that have been used in research paper recommendation. We partly adopt the data mining steps and methods for recommender systems described by Amatriain, Jaimes, Oliver, and Pujol (2011) in the Recommender Systems Handbook (Ricci, Rokach, & Shapira, 2011) to represent the data mining methods and technologies employed at various levels of computing research paper recommendations. Data mining in this context consists of three main steps: data preprocessing, data analysis, and result interpretation. The separation and categorization of some methods and algorithms is not always crisp, since many of them overlap.

This review chapter is organized as follows. Section 1 presents the introduction and overview. Section 2 summarizes the data preprocessing methods and measures used in research paper recommender systems. Section 3 highlights the classification algorithms used by these systems. Section 4 presents clustering algorithms, while Section 5 presents other approaches to classification. Section 6 presents the main data mining issues facing research paper recommendation, and Section 7 concludes the chapter.

Figure 1. Data Mining in RPRS

Figure 1 highlights the data mining features, approaches, and processes used in research paper recommender systems (RPRS). It shows the three main data mining steps, which are applied consecutively during the processing of data: data preprocessing, data analysis, and result interpretation. This chapter, however, dwells mostly on the first two steps, data preprocessing and data analysis, since they actively use various data mining techniques.

2. Data Preprocessing in RPRS

Data preprocessing is an important step in machine learning and information retrieval because it screens data for problems that could otherwise produce misleading results after processing. Real-world datasets in the field of RPRS are generally incomplete (Gupta & Varma, 2017), noisy (Bogers & Van den Bosch, 2008; Bollen & Van de Sompel, 2006; Dong, Tokarchuk, & Ma, 2009; J. He, Nie, Lu, & Zhao, 2012; Y. Liang, Li, & Qian, 2011; McNee et al., 2002; Torres, McNee, Abel, Konstan, & Riedl, 2004; Tran, Huynh, & Hoang, 2015; Wu, Hua, Li, & Pei, 2012; Xue, Guo, Lan, & Cao, 2014) and inconsistent (Capocci & Caldarelli, 2008), and thus require tasks that transform them (Nascimento, Laender, da Silva, & Gonçalves, 2011). These preprocessing tasks include data cleaning (Ferrara, Pudota, & Tasso, 2011), data integration (Hwang, Hsiung, & Yang, 2003; Mönnich & Spiering, 2008; Wu et al., 2012; Zarrinkalam & Kahani, 2012), data transformation (Joran Beel & Gipp, 2009), data reduction, and data discretization. Data cleaning ensures that missing values are filled, noisy data is smoothed, outliers are removed (T.-P. Liang, Yang, Chen, & Ku, 2008), and all inconsistencies are resolved. Data integration combines all necessary files or databases (Zarrinkalam & Kahani, 2012). Data transformation normalizes and aggregates the data that will be used for analysis. Data discretization replaces some numerical attributes with nominal ones when the need arises.
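The cleaning, transformation, and discretization tasks described above can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names, the mean-imputation strategy, the min-max scaling, the equal-width bins, and the citation counts are all assumptions chosen for illustration, and a real RPRS pipeline may use quite different choices.

```python
from statistics import mean


def fill_missing(values):
    """Data cleaning (assumed strategy): replace None with the mean of observed values."""
    observed = [v for v in values if v is not None]
    fallback = mean(observed) if observed else 0.0
    return [fallback if v is None else v for v in values]


def min_max_normalize(values):
    """Data transformation (assumed strategy): rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, labels=("low", "medium", "high")):
    """Data discretization: map values in [0, 1] to equal-width nominal bins."""
    n = len(labels)
    return [labels[min(int(v * n), n - 1)] for v in values]


# Hypothetical citation counts for five papers; None marks a missing value.
citations = [10, None, 30, 0, 40]

cleaned = fill_missing(citations)        # missing value filled with the mean
normalized = min_max_normalize(cleaned)  # values rescaled to [0, 1]
levels = discretize(normalized)          # numeric values replaced by nominal labels

print(cleaned)
print(normalized)
print(levels)
```

Each function covers one task only, so the steps can be reordered or swapped for other strategies, for example median imputation instead of the mean.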

Key Terms in this Chapter

Data Mining: The practice of examining large pre-existing databases in order to generate new information.

Classification: The action or process of categorizing or grouping something.

Similarity Measure: A measure of how alike two data objects are. In a data mining context, it is usually expressed as a distance whose dimensions represent the features of the objects.

Algorithm: A process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer.

Recommender System: A subclass of information filtering system that seeks to predict the rating or preference a user would give to an item.
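To make the Similarity Measure entry above concrete, the following Python sketch compares papers represented as feature vectors. It is illustrative only: the preview does not say which measures the chapter reviews, so Euclidean distance and cosine similarity are used here as two common choices, and the term-count vectors are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical term counts for three papers over the same three terms.
paper_a = [3, 0, 4]
paper_b = [6, 0, 8]  # same term proportions as paper_a, but a longer text
paper_c = [0, 5, 0]  # shares no terms with paper_a

print(euclidean_distance(paper_a, paper_b))
print(cosine_similarity(paper_a, paper_b))
print(cosine_similarity(paper_a, paper_c))
```

The two measures can disagree: paper_a and paper_b are far apart in Euclidean terms yet point in the same direction, which is why cosine similarity is often preferred when document length varies.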
