This chapter reviews the current policies of tuberculosis control programs for the diagnosis of tuberculosis. The international standard for tuberculosis control is the World Health Organization’s DOT (Direct Observation of Therapy) strategy, which aims to reduce transmission of the infection through prompt diagnosis and effective treatment of symptomatic tuberculosis patients who present at health care facilities. Physicians are concerned about the poor specificity of diagnostic methods and the increase in the notification of relapse cases. This work describes a data-mining project that uses DOT data to analyze the relationship between different variables and the tuberculosis diagnostic category registered for each patient.
Advances in technology have increased both the volume and the variety of data, and the amount of data grows exponentially over time. As a consequence, manual analysis of this data is complex and prone to error. When the amount of data to be analyzed exploded in the mid-1990s, knowledge discovery emerged as an important analytical tool. The process of extracting useful knowledge from large volumes of data is known as knowledge discovery in databases (Fayyad, 1996). Its major objective is to identify valid, novel, potentially useful, and understandable patterns in data. Knowledge discovery is supported by three technologies: massive data collection, powerful multiprocessor computers, and data mining (Turban, 2005).
Data mining derives its name from the similarity between searching for valuable business information in a large database and mining a mountain for a vein of valuable ore. Data mining can generate new business opportunities by automatically predicting trends and behaviors and by discovering previously unknown patterns.