Enhancing Academic Recommendation Regarding Common Coauthors' Publication Records

DOI: 10.4018/978-1-7998-0961-6.ch006

Abstract

In this chapter, the authors investigate whether paper recommendation can be improved by recommending papers similar to an input paper from the publication record of the first author. Although there are numerous approaches for recommending academic papers, they do not consider recommending papers based on the publication record of common coauthors. The authors are therefore motivated to address this shortcoming by recommending scholarly papers based on the similarity of textual references to visual features, which considers the similarity of text fragments in one's publication record to any of their visual features (i.e., tables and figures). According to the evaluation results, the proposed enhancement increases the mean precision, the recall, and, accordingly, the F-measure. In addition, it improves the position of the relevant papers in the returned list of documents.
Chapter Preview

Using Collaborative and Content-Based Filtering in a Digital Library

(A. Vellino 2009) suggested a collaborative system for recommending research papers that produces numerical ratings rather than the Boolean ratings that TechLens+ (R. Torres et al. 2004) produces. To this end, the system uses PageRank values in its algorithm. The expectation was that this would enhance the recommendation results for research papers. However, the author reported that the evaluation results showed that PageRank values notably decreased the quality of recommendation.

Docear (J. Beel et al. 2013) is a research paper recommender that uses a content-based filtering (CBF) approach over its digital library. This recommender requires users to build a mind map in order to enrich the recommender's information repository. By applying various CBF techniques to this mind map, the system retrieves a list of related papers.
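The content-based idea above can be illustrated with a minimal sketch. It assumes that the text of a user's mind map is flattened into a single profile string and that papers are ranked by TF-IDF cosine similarity to that profile. This is a generic CBF technique chosen for illustration, not Docear's actual implementation; the function names, the sample papers, and the profile text are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so that every term keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(profile_text, papers, k=2):
    """Rank papers (id -> text) by similarity to the user's profile text."""
    docs = [tokenize(profile_text)] + [tokenize(t) for t in papers.values()]
    vecs = tfidf_vectors(docs)
    profile_vec, paper_vecs = vecs[0], vecs[1:]
    ranked = sorted(
        zip(papers, paper_vecs),
        key=lambda pair: cosine(profile_vec, pair[1]),
        reverse=True,
    )
    return [paper_id for paper_id, _ in ranked[:k]]

# Hypothetical toy library and mind-map text.
papers = {
    "a": "collaborative filtering for research paper recommendation",
    "b": "protein folding simulation",
    "c": "content based filtering digital library",
}
profile = "mind map recommendation research paper"
top = recommend(profile, papers, k=1)
```

In this toy run only paper "a" shares terms with the profile, so it is ranked first. A real system would need a better tokenizer and a much larger corpus for the IDF weights to be meaningful.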

CiteULike (T. Bogers et al. 2008) is a search engine that applies two collaborative filtering (CF) algorithms, called user-based filtering and item-based filtering. In the former, the system tries to match the active user with neighboring users; in the latter, filtering is done by finding neighboring (similar) items.
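The difference between the two CF variants can be sketched as follows. This is a minimal illustration under stated assumptions: ratings are stored as a user-to-paper dictionary, neighbors are weighted by cosine similarity, and no neighborhood size limit is applied. The data and the function names are hypothetical and are not taken from the cited work.

```python
from math import sqrt

# Hypothetical toy data: user -> {paper: rating}.
ratings = {
    "u1": {"p1": 1.0, "p2": 1.0},
    "u2": {"p1": 1.0, "p2": 1.0, "p3": 1.0},
    "u3": {"p3": 1.0, "p4": 1.0},
}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def user_based_scores(active, ratings):
    """Score unseen papers using similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == active:
            continue
        sim = cosine(ratings[active], other_ratings)
        for paper, r in other_ratings.items():
            if paper not in ratings[active]:
                scores[paper] = scores.get(paper, 0.0) + sim * r
    return scores

def item_based_scores(active, ratings):
    """Score unseen papers by similarity to papers the active user already rated."""
    # Transpose the data to paper -> {user: rating}.
    items = {}
    for user, user_ratings in ratings.items():
        for paper, r in user_ratings.items():
            items.setdefault(paper, {})[user] = r
    scores = {}
    for candidate, vec in items.items():
        if candidate in ratings[active]:
            continue
        scores[candidate] = sum(
            cosine(vec, items[seen]) * r for seen, r in ratings[active].items()
        )
    return scores

ub = user_based_scores("u1", ratings)
ib = item_based_scores("u1", ratings)
```

For user u1, both variants rank p3 above p4: u2 is a close neighbor of u1 and has p3, while p3 is also co-rated with p1 and p2. With only three users the similarity estimates are very coarse, which is the small-user-base weakness of CF in miniature.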

In general, a CF recommender can hardly reach the expected performance when the number of users is small. Moreover, it has been shown (Liang, T.-P. et al. 2008; Bogers, T. et al. 2009) that a CBF recommender performs better than a CF recommender.

Using Citation Scores

(B. Gipp et al. 2009) made an effort to improve classical keyword-based searching by introducing a hybrid recommender system that uses more factors for document preference, such as citation analysis, author analysis, source analysis, implicit ratings, and explicit ratings. The recommender accepts six inputs, i.e., text, references, authors, sources, ratings, and documents, at least one of which must be provided by the user. Using these inputs, the proposed system provides users with a list of relevant papers. The main drawback of this system is the user engagement it requires for recommendation.

A co-training approach (C. Caragea et al. 2015) is suggested for topic classification based on the citations and text of a research paper. The main task of the proposed approach is to classify papers that are bound in a citation network. To evaluate the performance of their system, the authors of the paper categorized their dataset into five groups, i.e., AI, IR, DB, ML, and HCI. They then evaluated the system using the text and citation information of each paper.
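The general co-training idea over a text view and a citation view can be sketched as follows. This is a simplified illustration only: the per-view classifier is a toy feature-count scorer rather than the classifier used in the cited work, each view nominates a single paper per round, and all names and data are hypothetical.

```python
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (feature_set, label). Returns label -> feature counts."""
    model = defaultdict(Counter)
    for feats, label in examples:
        model[label].update(feats)
    return model

def predict(model, feats):
    """Return (label, confidence); confidence is the margin over the runner-up."""
    scores = {label: sum(counts[f] for f in feats) for label, counts in model.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return best_label, best - runner_up

def co_train(labeled, unlabeled, rounds=3):
    """labeled: (text_feats, cite_feats, label); unlabeled: (text_feats, cite_feats)."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        text_model = train([(t, y) for t, _, y in labeled])
        cite_model = train([(c, y) for _, c, y in labeled])
        picks = {}
        # Each view nominates the unlabeled paper it is most confident about.
        for view, model in ((0, text_model), (1, cite_model)):
            best = max(range(len(pool)), key=lambda i: predict(model, pool[i][view])[1])
            label, conf = predict(model, pool[best][view])
            if conf > 0:
                picks.setdefault(best, label)
        if not picks:
            break
        # Move the nominated papers, with their predicted labels, to the labeled set.
        for i in sorted(picks, reverse=True):
            t, c = pool.pop(i)
            labeled.append((t, c, picks[i]))
    text_model = train([(t, y) for t, _, y in labeled])
    cite_model = train([(c, y) for _, c, y in labeled])
    return text_model, cite_model

# Hypothetical toy data with two of the five topics.
labeled = [
    ({"neural", "learning"}, {"c1"}, "ML"),
    ({"query", "index"}, {"c2"}, "DB"),
]
unlabeled = [
    ({"learning", "kernel"}, {"c1", "c3"}),
    ({"kernel"}, {"c3"}),
]
text_model, cite_model = co_train(labeled, unlabeled)
```

In this toy run the term "kernel" never appears in the initial labeled set, so the text classifier alone cannot place it. After the first unlabeled paper is pseudo-labeled as ML, both views can label the second paper, which is the benefit co-training aims for.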

Although this work does not explicitly target paper recommendation, classified papers can enhance recommendation. With this in mind, the evaluation over their dataset shows that the AI topic is the hardest to classify, because it is too general and has few instances in the dataset. These two drawbacks greatly affect the usefulness of such a system.
