Robust Classification Based on Correlations Between Attributes
Alexandros Nanopoulos (Aristotle University of Thessaloniki, Greece), Apostolos N. Papadopoulos (Aristotle University of Thessaloniki, Greece), Yannis Manolopoulos (Aristotle University of Thessaloniki, Greece) and Tatjana Welzer-Druzovec (University of Maribor, Slovenia)
Copyright: © 2008
Noise in the data significantly degrades classification accuracy. In this article, we develop novel classification algorithms that handle noise efficiently. To this end, we draw an analogy between k-nearest-neighbor (kNN) classification and user-based collaborative filtering: both find a neighborhood of similar past data and process its contents to make a prediction about new data. Item-based collaborative filtering algorithms, which rely on similarities between items instead of transactions, were recently developed to address the sensitivity of user-based methods to noise in recommender systems. For this reason, we adopt the item-based paradigm, rather than the kNN approach, to improve robustness against noise in classification. We propose two new item-based algorithms and evaluate them experimentally against kNN. Our results show that, in terms of precision, the proposed methods outperform kNN classification by up to 15%, whereas compared to other methods, such as the C4.5 system, the improvement exceeds 30%.
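To make the neighborhood idea concrete, the following is a minimal sketch of plain kNN classification by majority vote, not the item-based algorithms proposed in the article. The function name `knn_predict`, the toy data, and the choice of Euclidean distance are assumptions made purely for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    training examples. `train` is a list of (feature_vector, label) pairs.
    Euclidean distance is an illustrative assumption."""
    # Find the neighborhood: the k training examples closest to the query.
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    # Process the neighborhood: count labels and return the most frequent one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two well-separated classes.
clean = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]

# The same data plus one mislabeled (noisy) point inside class "a".
noisy = clean + [((0.2, 0.2), "b")]

query = (0.2, 0.1)
print(knn_predict(clean, query, k=3))  # a
print(knn_predict(noisy, query, k=1))  # b  (the noisy point is the nearest neighbor)
print(knn_predict(noisy, query, k=3))  # a  (the vote outweighs the noisy point)
```

In this toy setting, a single mislabeled example close to the query decides the prediction when k = 1, which hints at why neighborhood-based prediction can be sensitive to noise.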