Machine Learning Based Program to Prevent Hospitalizations and Reduce Costs in the Colombian Statutory Health Care System

Alvaro J. Riascos, Natalia Serna
Copyright: © 2018 | Pages: 21
DOI: 10.4018/IJKDB.2018070103

Health care systems that rely on hospitalization for early patient treatment pose a financial concern for governments. In this article, the authors propose a hospitalization prevention program in which the decision of whether to intervene on a patient depends on a simple decision model and on a machine learning prediction of the patient's annual length-of-stay. The results show that the prevention program achieves significant cost savings relative to several base scenarios for program efficacies greater than or equal to 40% and intervention costs per patient of 100,000 to 700,000 Colombian pesos (i.e., approximately 14% to 100% of the average cost per patient in Colombia's statutory health care system). The article also shows that tree-based methods outperform linear regressions when predicting annual length-of-stay, and that the final model achieves a lower out-of-sample error than those reported for the Heritage Health Prize.

Avoidable hospitalizations are a source of increased health expenditures in many health systems. Prolonged length-of-stay is costly for providers, insurers, and patients because it is associated with greater health service consumption and with the development of health-endangering conditions during the hospital stay. In the Colombian public health care system, the increase in health costs due to avoidable hospitalizations has raised questions about whether insurers are implementing prevention programs and whether such programs are effective. In this context, prediction of patient annual length-of-stay (LOS) is an important tool for allocating resources and improving patient health outcomes. Accordingly, the objectives of this paper are to predict the annual length-of-stay of users in the public health care system in Colombia and to estimate the potential cost savings of a preventive program whose main input is the annual LOS prediction.
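
To make the second objective concrete, the following Python sketch shows one way a decision rule driven by the LOS prediction could be written. It is not the model used in the paper, which this preview does not specify. It assumes a break-even rule (intervene when the expected avoided hospital cost exceeds the intervention cost), and the names `ProgramParams`, `expected_net_saving`, `should_intervene`, and `program_savings`, as well as the cost per hospital day, are hypothetical. Only the efficacy threshold (40%) and the range of intervention costs (100,000 to 700,000 Colombian pesos) come from the abstract.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ProgramParams:
    """Illustrative parameters of a prevention program (all costs in COP)."""
    efficacy: float           # share of predicted hospital days avoided, e.g. 0.40
    intervention_cost: float  # cost of intervening on one patient
    cost_per_day: float       # ASSUMPTION: average cost of one hospital day


def expected_net_saving(predicted_los_days: float, params: ProgramParams) -> float:
    """Expected avoided hospital cost minus the cost of intervening."""
    avoided_cost = params.efficacy * predicted_los_days * params.cost_per_day
    return avoided_cost - params.intervention_cost


def should_intervene(predicted_los_days: float, params: ProgramParams) -> bool:
    """Assumed break-even rule: intervene only if the expected net saving is positive."""
    return expected_net_saving(predicted_los_days, params) > 0


def program_savings(predicted_los: Iterable[float], params: ProgramParams) -> float:
    """Total expected net saving over the patients the rule selects."""
    return sum(
        expected_net_saving(days, params)
        for days in predicted_los
        if should_intervene(days, params)
    )


if __name__ == "__main__":
    # Efficacy and intervention cost are taken from the abstract's ranges;
    # the cost per hospital day is a made-up value for illustration only.
    params = ProgramParams(efficacy=0.40, intervention_cost=100_000, cost_per_day=250_000)
    predictions = [0.0, 0.5, 2.0, 6.0]  # hypothetical predicted annual LOS, in days
    for days in predictions:
        print(days, should_intervene(days, params))
    print(program_savings(predictions, params))
```

Under this assumed rule, a higher intervention cost or a lower efficacy raises the predicted LOS at which intervening breaks even, which is consistent with the abstract reporting savings only for efficacies of at least 40% within the stated cost range.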

Most of the literature on prediction of annual LOS has been developed from the providers' perspective rather than from the insurers' perspective. Many authors predict LOS using a sample of patients with specific acute conditions or physiological traits that are often unobserved by the insurer. For example, Chang et al. (2002) study individuals with cerebrovascular accident, Tu & Guerriere (1993) study patients who are admitted to the intensive care unit after cardiac surgery, Chertow, Burdick, Honour, Bonventre, & Bates (2005) focus on patients with renal failure, and Clague, Craddock, Andrew, Horan, & Pendleton (2002) analyze patients with hip fracture. Our study differs from the previous ones in that we predict annual LOS using information that is symmetric across insurers, providers, and the government. We do not focus on users with particular health conditions but analyze a representative sample of individuals in the public health care system with heterogeneous demographic and morbidity characteristics. We also lack data on specific patient physiological traits. Finally, we extend our analysis to measure the potential cost savings of a prevention program in which the intervention decision is based on the patient's predicted LOS. With regard to the empirical techniques for predicting annual LOS, we use machine learning approaches similar to the ones used by Rezaei, Ahmadi, Alizadeh, & Sadoughi (2013) and Walsh et al. (2004), which include boosted trees, random forests, and artificial neural networks.

The remainder of this paper is structured as follows: after this introduction, section II describes the Colombian public health care system, section III provides the empirical framework, section IV describes our database and the data preprocessing, section V presents the results of machine learning techniques, section VI presents the impact of LOS on health costs, and section VII concludes.
