Influence of Domain and Model Properties on the Reliability Estimates' Performance

Zoran Bosnic (University of Ljubljana, Slovenia) and Igor Kononenko (University of Ljubljana, Slovenia)
Copyright: © 2009 | Pages: 19
DOI: 10.4018/jdwm.2009080704

In machine learning, reliability estimates for individual predictions provide more information about the individual prediction error than the average accuracy of the predictive model (e.g. relative mean squared error). Such reliability estimates may be decisive in risk-sensitive applications of machine learning (e.g. medicine, engineering, and business), where they enable users to distinguish between more and less reliable predictions. In their previous work, the authors proposed eight reliability estimates for individual examples in regression and evaluated their performance. The results showed that the performance of each estimate varies strongly with the properties of the domain and the regression model. In this paper, they empirically analyze how the performance of the reliability estimates depends on the properties of the data set and the model. Their results show that the reliability estimates perform better when used with more accurate regression models, in domains with a greater number of examples, and in domains with less noisy data.
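To make the contrast between an average accuracy measure and per-prediction error concrete, the following is a minimal sketch, not code from the article. It assumes the common definition of relative mean squared error (the model's squared error divided by the squared error of always predicting the mean of the true values); the function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def relative_mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Relative mean squared error (assumed definition): squared error of
    the model divided by the squared error of a baseline that always
    predicts the mean of the true values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / n
    sse_model = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sse_baseline = sum((t - mean_y) ** 2 for t in y_true)
    if sse_baseline == 0:
        raise ValueError("true values are constant; the ratio is undefined")
    return sse_model / sse_baseline


def squared_errors(y_true: Sequence[float], y_pred: Sequence[float]) -> List[float]:
    """Squared error of each individual prediction."""
    return [(t - p) ** 2 for t, p in zip(y_true, y_pred)]


# Hypothetical toy data: three exact predictions and one poor one.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]

print(relative_mse(y_true, y_pred))    # a single number for the whole model
print(squared_errors(y_true, y_pred))  # one value per prediction
```

The single averaged value cannot show that the whole error comes from one prediction, which is the kind of information a per-prediction reliability estimate is meant to approximate when the true value is unknown.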
Article Preview

To give users of classification and regression models more insight into the reliability of individual predictions, various methods have been developed for this task. Some of these methods extend the formalizations of existing predictive models, enabling them to make predictions together with adjoined reliability estimates. The other group of methods focuses on model-independent approaches, which are more general but harder to evaluate analytically with individual models. In the following, we present the related work from both groups of approaches.

The idea of reliability estimation for individual predictions originated in statistics, where confidence values and intervals are used to express the reliability of estimates. In machine learning, the statistical properties of predictive models were used to extend predictions with adjoined reliability estimates, e.g. with support vector machines (Gammerman, Vovk, & Vapnik, 1998; Saunders, Gammerman, & Vovk, 1999), ridge regression (Nouretdinov, Melluish, & Vovk, 2001), and the multilayer perceptron (Weigend & Nix, 1994). Since these approaches are bound to a particular model formalism, their reliability estimates can be interpreted probabilistically, which makes them confidence measures (0 represents the confidence of the most inaccurate prediction and 1 the confidence of the most accurate one). However, since not all approaches offer a probabilistic interpretation, we use a more general term, the reliability estimate, for a measure that provides information about the trust in the accuracy of an individual prediction.
