Data Mining and Knowledge Discovery in Healthcare Organizations: A Decision-Tree Approach
Murat Caner Testik (Cukurova University, Turkey), George C. Runger (Arizona State University, USA), Bradford Kirkman-Liff (Arizona State University, USA) and Edward A. Smith (Arizona State University, USA and Translational Genomics Research Institute, USA)
Copyright: © 2008
Healthcare organizations are struggling to find new ways to cut healthcare utilization and costs while improving quality and outcomes. Predictive models developed to forecast global utilization for a healthcare organization cannot be used to predict the behavior of individuals. On the other hand, massive amounts of healthcare data are available in databases and can be explored for patterns, and therefore for knowledge discovery. The diversity and complexity of healthcare data require careful attention to the choice of statistical methods. By nature, healthcare data are multivariate, which makes the analysis difficult as well as interesting. In this chapter, our intention is to classify individuals who will be future high utilizers of healthcare. In particular, we ask whether a mathematical model can be generated from a large claims database that will predict which individuals in a yet untested database, who are not currently using a health service, will be high utilizers of that service in the future. For this purpose, an integrated dataset drawn from enrollment, medical claims, and pharmacy databases, containing more than 150 million medical and pharmacy claim line items for over four million patients, is analyzed for knowledge discovery. A modern data-mining tool, namely decision trees, which may have a broad range of applications in healthcare organizations, is used in our analyses, and a discussion of this valuable tool is provided. The results and managerial aspects are discussed, and several approaches are proposed for the use of this technique depending on the health plan.
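As a rough illustration of the kind of decision-tree classification described above, the following minimal sketch grows a small tree on a handful of made-up records. Everything in it is an assumption for illustration only: the two features (prior-year claim count and number of chronic conditions), the toy data, the Gini impurity splitting criterion, and the function names are hypothetical and are not taken from the chapter, which does not specify its algorithm or variables here.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best split, or None."""
    best, n = None, len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send records to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

def build_tree(rows, labels, depth=0, max_depth=2):
    """Grow a tree; a leaf is a class label, an internal node is a 4-tuple."""
    split = None
    if depth < max_depth and gini(labels) > 0:
        split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # majority-class leaf
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical records: [prior_year_claims, chronic_condition_count]
rows = [[2, 0], [3, 0], [1, 1], [4, 0], [12, 2], [15, 3], [9, 2], [20, 1]]
# Hypothetical labels: 1 = high utilizer in the following year, 0 = not
labels = [0, 0, 0, 0, 1, 1, 1, 1]

tree = build_tree(rows, labels)
print(tree)
print(predict(tree, [10, 1]), predict(tree, [2, 2]))
```

On real claims data a tree would of course be fitted with an established library and validated on held-out members; the point of the sketch is only that each internal node is a readable rule on a single variable, which is part of what makes decision trees attractive for managerial use.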