Predicting Patients' Satisfaction With Doctors in Online Medical Communities: An Approach Based on XGBoost Algorithm

Yunhong Xu, Guangyu Wu, Yu Chen
Copyright: © 2022 |Pages: 17
DOI: 10.4018/JOEUC.287571


Online medical communities have revolutionized the way patients obtain medical information and services. Investigating which factors influence patients’ satisfaction with doctors, and predicting that satisfaction, can help patients narrow down their choices and increase their loyalty towards online medical communities. Because the dataset collected from Good Doctor is imbalanced, we integrated the XGBoost and SMOTE algorithms to examine which factors matter and how these factors can be used to predict patient satisfaction. SMOTE addresses the imbalance by oversampling the minority class with synthetic samples. XGBoost is an ensemble of decision trees in which new trees correct the errors of existing trees. The experimental results demonstrate that SMOTE combined with XGBoost can achieve better performance. We further analyzed the role that features play in satisfaction prediction at two levels: the individual feature level and the feature combination level.
Article Preview


With the large-scale popularization of the Internet, the concept of “Internet +” has gradually penetrated the medical industry, and online medical communities have emerged (Li et al., 2019). Online medical communities have revolutionized the way patients obtain medical information and services. They enable patients to contact doctors without constraints of time and place, making it more convenient to receive medical treatment at a lower cost (Sims, 2016). In addition to convenience and cost, patients can obtain diverse medical suggestions from different doctors, which allows them to make better medical decisions. Doctors can also obtain social and economic benefits from engaging in online medical communities, e.g., social recognition from patients and financial returns.

Given the benefits that patients and doctors can obtain from online medical communities, more and more of them participate in these communities (Li et al., 2018). Under the traditional medical mode, patients who lack professional medical knowledge habitually choose high-level hospitals or doctors with a strong medical background (Liu et al., 2018). Medical treatment, however, is a multi-dimensional and complex process. In addition to doctors’ medical background, the way doctors communicate with their patients, how they respond to patients, and other patients’ experiences can also influence patients’ satisfaction with doctors. These variables are difficult to trace under the traditional medical mode. Online medical communities can record several features that occur during or after medical treatment, which provides a holistic framework for predicting patient satisfaction with doctors.

Patients’ satisfaction with doctors can be defined as the extent to which patients are content with the medical services they receive from doctors. Understanding which factors drive patient satisfaction and how these factors can be used to predict it can help patients make better decisions and encourage doctors to provide better medical services. Previous studies have shown that patients’ evaluations of their medical service providers tend to be positive; that is, patients are more willing to give doctors a satisfactory evaluation (Marcinowicz et al., 2009). Moreover, evaluations expressing dissatisfaction may be more important than those expressing satisfaction when studying patients’ satisfaction (Vaida & Osmo, 2003). Therefore, we should consider both satisfaction and dissatisfaction data when predicting patients’ satisfaction. When both kinds of data are considered, there are 6933 satisfactory samples and 77 unsatisfactory samples in our dataset collected from Good Doctor. The data is imbalanced in that the samples classified as positive (patients who are not satisfied with their doctors) and negative (patients who are satisfied with their doctors) are not equally distributed. In particular, positive samples account for only about 1.1% of the total sample. Data imbalance degrades the performance of machine learning algorithms, because most of them ignore or perform poorly on the minority class (Wang et al., 2020). There are two ways to deal with an imbalanced dataset: data resampling and algorithms based on ensemble learning. For data resampling, the Synthetic Minority Over-sampling Technique (SMOTE) (Chawla et al., 2002) is used in this research to synthesize new samples from the minority class. For ensemble learning, XGBoost is a large-scale ensemble algorithm proposed by Chen et al. (2016), which performs a second-order Taylor expansion of the cost function, so that both the first-order and the second-order derivatives are used. Previous research has demonstrated that SMOTE and XGBoost can achieve better results when confronted with imbalanced datasets (He et al., 2021; Meng et al., 2020; Wang et al., 2020). In this research, we integrate XGBoost and SMOTE to predict patients’ satisfaction with doctors.
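
To make the resampling idea concrete, the following is a minimal sketch of SMOTE-style oversampling written in plain Python. It is not the implementation used in this study. The function name `smote`, the toy points, the choice of `k`, and the goal of fully balancing the two classes are all assumptions made for illustration. Only the sample counts (6933 and 77) come from the text.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between a minority sample
    and one of its k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Class counts reported in the text.
n_satisfied, n_unsatisfied = 6933, 77
minority_share = n_unsatisfied / (n_satisfied + n_unsatisfied)

# Assumption: oversample until both classes have the same size.
n_needed = n_satisfied - n_unsatisfied

# Hypothetical two-feature minority samples, for illustration only.
toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(toy_minority, n_synthetic=6, k=2)
```

Each synthetic point is a convex combination of two existing minority samples, so the new data stays inside the region already occupied by the minority class instead of duplicating records exactly. In practice, oversampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic samples.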
