Cost-Sensitive Learning in Medicine

Alberto Freitas (University of Porto and CINTESIS, Portugal), Pavel Brazdil (University of Porto and LIAAD - INESC Porto L.A., Portugal) and Altamiro Costa-Pereira (University of Porto, Portugal and CINTESIS, Portugal)
DOI: 10.4018/978-1-60566-218-3.ch003

This chapter introduces cost-sensitive learning and its importance in medicine. Health managers and clinicians often need models that try to minimize several types of costs associated with healthcare, including attribute costs (e.g. the cost of a specific diagnostic test) and misclassification costs (e.g. the cost of a false negative test). In fact, as in other professional areas, both diagnostic tests and their associated misclassification errors can have significant financial or human costs, including the use of unnecessary resources and patient safety issues. This chapter presents some concepts related to cost-sensitive learning and cost-sensitive classification and their application to medicine. Different types of costs are also presented, with an emphasis on diagnostic test costs and misclassification costs. In addition, an overview of research in the area of cost-sensitive learning is given, including current methodological approaches. Finally, current methods for the cost-sensitive evaluation of classifiers are discussed.
Chapter Preview

Cost-Sensitive Classification

Classification is one of the main tasks in knowledge discovery and data mining (Mitchell, 1997). It has been an object of study in areas such as machine learning, statistics and neural networks. There are many approaches to classification, including decision trees, Bayesian classifiers, neural classifiers, discriminant analysis, support vector machines, and rule induction, among many others.

The goal of classification is to correctly assign examples to one of a finite number of classes. In classification problems, the performance of a classifier is usually measured by its error rate. The error rate is the proportion of misclassified instances among all instances and is an indicator of the global performance of the classifier. A large number of classification algorithms assume that all errors have the same cost and, because of that, are normally designed to minimize the number of errors (the zero-one loss). In these cases, using the error rate is equivalent to assigning the same cost to all classification errors. For instance, in binary classification, false positives and false negatives would have equal cost. Nevertheless, in many situations, each type of error may have a different associated cost.
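To make the difference between error rate and cost concrete, the following minimal Python sketch compares two classifiers that make the same number of errors but of different types. The class labels, the cost values (a false negative costing ten times a false positive) and the function names are illustrative assumptions, not figures or methods taken from the chapter.

```python
# Hypothetical cost matrix, indexed as (actual class, predicted class).
# The values are assumptions chosen for illustration only.
COSTS = {
    ("pos", "pos"): 0.0,
    ("neg", "neg"): 0.0,
    ("neg", "pos"): 1.0,   # false positive
    ("pos", "neg"): 10.0,  # false negative
}


def error_rate(actual, predicted):
    """Proportion of misclassified instances (zero-one loss)."""
    errors = sum(a != p for a, p in zip(actual, predicted))
    return errors / len(actual)


def average_cost(actual, predicted, costs):
    """Mean misclassification cost per instance under a cost matrix."""
    total = sum(costs[(a, p)] for a, p in zip(actual, predicted))
    return total / len(actual)


actual = ["pos", "pos", "neg", "neg", "neg"]
classifier_a = ["neg", "pos", "neg", "neg", "neg"]  # one false negative
classifier_b = ["pos", "pos", "pos", "neg", "neg"]  # one false positive

for name, predicted in [("A", classifier_a), ("B", classifier_b)]:
    print(name, error_rate(actual, predicted),
          average_cost(actual, predicted, COSTS))
```

Both classifiers have the same error rate, yet under the assumed cost matrix classifier A is ten times as costly as classifier B, which is exactly the distinction the error rate cannot express.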

In fact, in most everyday situations, decisions have distinct costs, and a bad decision may have serious consequences. It is therefore important to take into account the different costs associated with decisions, i.e., classification costs.

In this context, we speak of cost-sensitive classification when costs are ignored during the learning phase of a classifier and are used only when predicting new cases. On the other hand, we speak of cost-sensitive learning when costs are considered during the learning phase, whether or not they are also used when predicting new cases. In general, cost-sensitive learning is the better option, giving better results, because it takes costs into account while the classifier is being generated. Cost-sensitive learning is the sub-area of machine learning concerned with situations in which costs are non-uniform.
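The two settings can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's own method: the cost values, the function names `predict_min_cost` and `training_weights`, and the choice of instance weighting as the way to bring costs into learning are all hypothetical. The first function uses costs only at prediction time (cost-sensitive classification); the second derives per-example weights that a learner could use during training (one possible form of cost-sensitive learning).

```python
# Hypothetical cost matrix, indexed as (actual class, predicted class).
COSTS = {
    ("pos", "pos"): 0.0,
    ("neg", "neg"): 0.0,
    ("neg", "pos"): 1.0,   # false positive
    ("pos", "neg"): 10.0,  # false negative
}
CLASSES = ["pos", "neg"]


def predict_min_cost(probabilities, costs, classes=CLASSES):
    """Costs used only at prediction time: given class probabilities from
    any cost-blind classifier, pick the class with the lowest expected cost."""
    def expected_cost(predicted):
        return sum(probabilities[actual] * costs[(actual, predicted)]
                   for actual in classes)
    return min(classes, key=expected_cost)


def training_weights(labels, costs, classes=CLASSES):
    """Costs used during learning: weight each training example by the
    highest cost of misclassifying its class (an assumed weighting scheme)."""
    def worst_cost(actual):
        return max(costs[(actual, predicted)]
                   for predicted in classes if predicted != actual)
    return [worst_cost(label) for label in labels]


# A patient judged 80% likely to be negative is still predicted positive,
# because a missed positive is assumed to be ten times as costly.
print(predict_min_cost({"pos": 0.2, "neg": 0.8}, COSTS))

# Weights that a weight-aware learner could consume during training.
print(training_weights(["pos", "neg", "neg"], COSTS))
```

In the first case the underlying model is unchanged and only the decision rule shifts; in the second, the costs influence which model is built in the first place.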
