Model Assessment with ROC Curves

Lutz Hamel
Copyright: © 2009 |Pages: 8
ISBN13: 9781605660103|ISBN10: 1605660108|EISBN13: 9781605660110
DOI: 10.4018/978-1-60566-010-3.ch204

MLA

Hamel, Lutz. "Model Assessment with ROC Curves." Encyclopedia of Data Warehousing and Mining, Second Edition, edited by John Wang, IGI Global, 2009, pp. 1316-1323. https://doi.org/10.4018/978-1-60566-010-3.ch204

APA

Hamel, L. (2009). Model Assessment with ROC Curves. In J. Wang (Ed.), Encyclopedia of Data Warehousing and Mining, Second Edition (pp. 1316-1323). IGI Global. https://doi.org/10.4018/978-1-60566-010-3.ch204

Chicago

Hamel, Lutz. "Model Assessment with ROC Curves." In Encyclopedia of Data Warehousing and Mining, Second Edition, edited by John Wang, 1316-1323. Hershey, PA: IGI Global, 2009. https://doi.org/10.4018/978-1-60566-010-3.ch204

Abstract

Classification models, and binary classification models in particular, are ubiquitous in many branches of science and business. Consider, for example, classification models in bioinformatics that classify catalytic protein structures as being in an active or an inactive conformation. In medical informatics, a classification model might, given the parameters of a tumor, classify it as malignant or benign. Finally, a classification model in a bank might be used to distinguish legal from fraudulent transactions. Central to constructing, deploying, and using classification models is the question of model performance assessment (Hastie, Tibshirani, & Friedman, 2001). Traditionally, this is accomplished using metrics derived from the confusion matrix or contingency table. However, it has been recognized that (a) a single scalar is a poor summary of a model's performance, particularly when deploying non-parametric models such as artificial neural networks or decision trees (Provost, Fawcett, & Kohavi, 1998), and (b) some performance metrics derived from the confusion matrix are sensitive to data anomalies such as class skew (Fawcett & Flach, 2005). Recently, it has been observed that Receiver Operating Characteristic (ROC) curves visually convey the same information as the confusion matrix in a much more intuitive and robust fashion (Swets, Dawes, & Monahan, 2000). Here we look at model performance metrics derived from the confusion matrix, highlight their shortcomings, and illustrate how ROC curves can be deployed for model assessment to provide a much deeper and perhaps more intuitive analysis of the models. We also briefly address the problem of model selection.
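The contrast between skew-sensitive scalar metrics and ROC coordinates can be illustrated with a small sketch. It is not taken from the chapter: the function names `confusion_metrics` and `roc_points` and all counts and scores are hypothetical, chosen only to show that accuracy and precision shift when the class ratio changes, while the true positive rate and false positive rate (the two axes of a ROC plot) do not.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Scalar metrics derived from a 2x2 confusion matrix (illustrative helper)."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": tp / (tp + fp),
        "tpr": tp / (tp + fn),  # true positive rate: ROC y-axis
        "fpr": fp / (fp + tn),  # false positive rate: ROC x-axis
    }


def roc_points(scores, labels):
    """Return (fpr, tpr) pairs obtained by thresholding at each distinct score.

    Assumes labels are 0/1 and both classes are present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical classifier that detects 80% of positives and
# misfires on 10% of negatives, evaluated on two class ratios.
balanced = confusion_metrics(tp=80, fn=20, fp=10, tn=90)    # 100 pos, 100 neg
skewed = confusion_metrics(tp=80, fn=20, fp=100, tn=900)    # 100 pos, 1000 neg

for name, m in (("balanced", balanced), ("skewed", skewed)):
    print(name, {k: round(v, 3) for k, v in m.items()})

# Hypothetical scores for four instances, two per class.
print(roc_points([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

In this made-up setting precision falls from roughly 0.89 to roughly 0.44 as negatives become ten times more common, although the classifier's behavior on each class is unchanged, whereas the (FPR, TPR) pair stays at (0.1, 0.8) in both cases.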
