Development of a Classification Model for CD4 Count of HIV Patients Using Supervised Machine Learning Algorithms: A Comparative Analysis

Peter Adebayo Idowu (Obafemi Awolowo University, Nigeria) and Jeremiah Ademola Balogun (Obafemi Awolowo University, Nigeria)
Copyright: © 2019 | Pages: 28
DOI: 10.4018/978-1-5225-7467-5.ch006


This chapter presents a predictive model for classifying the CD4 count level of HIV patients receiving ART/HAART treatment in Nigeria. Following a review of the literature, the factors that determine CD4 count were identified and validated by experts, and historical data describing the relationship between these factors and CD4 count level was collected. The predictive model for CD4 count level was formulated from the identified factors using C4.5 decision tree (DT), support vector machine (SVM), and multi-layer perceptron (MLP) classifiers, which were implemented and validated in the WEKA software. The decision tree algorithm revealed five (5) important variables, namely age group, white blood cell count, viral load, time of diagnosing HIV, and age of the patient. The MLP had the best performance, with an accuracy of 100%, followed by the SVM with an accuracy of 91.1%; both outperformed the DT algorithm.
Chapter Preview


HIV is the human immunodeficiency virus. If not treated, it can lead to acquired immunodeficiency syndrome, or AIDS (Lakshmi and Isakki, 2017). HIV is spread primarily by unprotected sex, contaminated blood transfusions, hypodermic needles, and from mother to child during pregnancy, delivery, or breastfeeding. HIV attacks the body’s immune system, specifically the CD4 cells (T cells), a type of white blood cell that helps the immune system fight off infections. Untreated, HIV reduces the number of CD4 cells (T cells) in the body, making the person more likely to get other infections or infection-related cancers. Antiretroviral treatment is one of the best treatments for HIV patients. It can slow the course of the disease and may lead to a near-normal life expectancy (Kama and Prem, 2013).

There is no cure for HIV, but it is managed with antiretroviral drugs (ARV) and highly active antiretroviral therapy (HAART), which is the optimal combination of ARV (Rosma et al., 2012). ARV does not kill the virus but slows down its growth (Ojunga et al., 2014). Antiretroviral therapy (ART) and highly active antiretroviral therapy (HAART) are the mechanisms for treating retroviral infections with drugs (Brain et al., 2006). Monitoring the progression of the disease is made even more important by the emergence of HIV drug resistance, especially in developing countries with limited resources. HIV drug resistance refers to the inability of an ARV drug to reduce the viral reproduction rate sufficiently. Poor management of HIV drug resistance leads to opportunistic infections that make treatment of HIV more difficult and may even lead to fatalities.

Common clinical markers of disease progression are weight loss, mucocutaneous manifestations, bacterial infections, chronic fever, chronic diarrhea, herpes zoster, oral candidiasis, and pulmonary tuberculosis (Morgan et al., 2002). One of the best available surrogate markers for HIV progression is the CD4 cell count (Post et al., 1996). Although CD4 monitoring is also the standard of care in developing countries, measuring the CD4 cell count requires complex and expensive flow cytometric procedures that burden the minimal resources available. There have been previous attempts to predict CD4 cell count using cheaper chemical assays, and even by correlating a patient’s total lymphocyte count (TLC) with CD4 cell count using logistic and linear regression (Schechter et al., 1994; Mwamburi et al., 2005).

The CD4 cell count remains the strongest predictor of HIV-related complications, even after the initiation of therapy. The baseline pretreatment value is informative: lower CD4 counts are associated with smaller and slower improvements in counts. However, precise thresholds that define treatment failure in patients starting at various CD4 levels are not yet established. As a general rule, new and progressive severe immunodeficiency is demonstrated by declining longitudinal CD4 cell counts, which should trigger a switch in therapy. Another problem associated with the CD4 count is the frequent failure of CD4 counting machines, which makes it very difficult to take CD4 counts regularly at the scheduled times.

Machine learning algorithms provide a means of objectively uncovering unseen patterns in evidence-based information, especially in the public health care sector. These techniques have allowed not only for substantial improvements to existing clinical decision support systems, but also for improved patient-centered outcomes through the development of personalized prediction models tailored to a patient’s medical history and current condition (Moudani et al., 2011a). Predictive research aims to predict future events or outcomes based on patterns within a set of variables and has become increasingly popular in medical research (Olayemi et al., 2016). Accurate predictive models can inform patients and physicians about the future course of an illness or the risk of developing illness and thereby help guide decisions on screening and/or treatment (Waijee et al., 2013).
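
As a rough illustration of how a C4.5-style decision tree, one of the classifiers named in the abstract, can rank candidate variables, the sketch below computes the gain ratio of each attribute with respect to a CD4 count level. It is a minimal sketch under stated assumptions, not the chapter's WEKA workflow: the records, the attribute names (viral_load, age_group, cd4_level), and their values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of a categorical attribute with respect to the target."""
    total = len(rows)
    # Group the target labels by the value the attribute takes.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (NOT data from the chapter).
ROWS = [
    {"viral_load": "high", "age_group": "adult", "cd4_level": "low"},
    {"viral_load": "high", "age_group": "youth", "cd4_level": "low"},
    {"viral_load": "low", "age_group": "adult", "cd4_level": "high"},
    {"viral_load": "low", "age_group": "youth", "cd4_level": "high"},
]

# Rank the candidate attributes from most to least informative.
RANKING = sorted(
    ["age_group", "viral_load"],
    key=lambda name: gain_ratio(ROWS, name, "cd4_level"),
    reverse=True,
)

if __name__ == "__main__":
    for name in RANKING:
        print(name, round(gain_ratio(ROWS, name, "cd4_level"), 3))
```

In this toy data viral_load separates the two CD4 levels perfectly while age_group carries no information, so viral_load ranks first. A full C4.5 implementation also handles numeric attributes, missing values, and pruning, none of which are shown here.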
