Death and Morbidity Prediction Using Data Mining in Perforated Peptic Ulcers

Hugo Peixoto, Lara Silva, Soraia Pereira, Tiago Jesus, Vitor Neves Lopes, António Carlos Abelha
Copyright: © 2020 |Pages: 13
DOI: 10.4018/IJRQEH.2020010104


Perforation is not the most common complication of peptic ulcer disease, but it stands out as the complication with the highest mortality rate. Several scoring systems based on clinical and biochemical parameters, such as the Boey and PULP scoring systems, have been developed to predict the probability of mortality. In this study, a data mining process is applied to the available medical data in order to evaluate how these scoring systems perform when predicting mortality and patient complications. Furthermore, the paper compares the two scoring systems to determine which one outperforms the other. On the one hand, PULP scoring allows better mortality prediction, achieving above 90% accuracy. On the other hand, regarding complications, the Boey system achieves better results.
Article Preview


Peptic ulcers are defined as defects in the gastrointestinal mucosa that extend through the muscularis mucosae (Sandler et al., 2002). Complications of peptic ulcer disease include bleeding, perforation, penetration, and gastric outlet obstruction. The incidence of these complications, defined as the rate of new (or newly diagnosed) cases, varies geographically. In developed countries, hemorrhage is the most common complication (up to 73%), followed by perforation (9%) and obstruction (3%). Although it is not the most common complication, perforation stands out as the complication with the highest mortality rate (Wang et al., 2010). The figures differ in developing countries, as a review from Nigeria demonstrates: there, obstruction is the most common complication (56%), followed by perforation (30%) and bleeding (10%) (Irabor, 2005). Evidence indicates that the primary causes of complicated peptic ulcer disease are infection with H. pylori, a common bacterium, and nonsteroidal anti-inflammatory drugs. This work is based on a dataset that records several clinical and biochemical parameters in patients diagnosed with peptic ulcer perforation, in order to classify them under two scoring systems, Peptic Ulcer Perforation (PULP) (Lohsiriwat et al., 2009) and Boey (Møller et al., 2012), previously established to determine the prognosis of patients with this pathology.

The PULP (Peptic Ulcer Perforation) score is computed from 11 different variables and predicts 30-day mortality in patients operated on for perforated peptic ulcer (PPU). The 11 variables carry different weights in the total score, which allows the perforation to be quantified. The Boey score, in contrast, is computed from three variables that carry equal weight in the final result. It allows the prognosis of patients with PPU who undergo surgery to be assessed, helping in the assessment of mortality and morbidity (Agarwal et al., 2016).
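The difference between an equally weighted score and a weighted one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the factor names and the weights below are hypothetical placeholders chosen for the example, not the published Boey or PULP definitions, which should be taken from the cited sources.

```python
from typing import Mapping

# Hypothetical binary risk factors for an equal-weight (Boey-style) score.
# The names are illustrative assumptions; the text above does not list them.
EQUAL_WEIGHT_FACTORS = (
    "preoperative_shock",
    "major_medical_illness",
    "perforation_over_24h",
)

# Hypothetical weights for a weighted (PULP-style) score.
# These are NOT the published PULP weights; they only show the mechanism.
EXAMPLE_WEIGHTS = {"factor_a": 3, "factor_b": 2, "factor_c": 1}


def equal_weight_score(patient: Mapping[str, bool],
                       factors=EQUAL_WEIGHT_FACTORS) -> int:
    """Add one point for every risk factor that is present."""
    return sum(1 for name in factors if patient.get(name, False))


def weighted_score(patient: Mapping[str, bool],
                   weights: Mapping[str, int] = EXAMPLE_WEIGHTS) -> int:
    """Add the weight of every risk factor that is present."""
    return sum(w for name, w in weights.items() if patient.get(name, False))


if __name__ == "__main__":
    patient = {"preoperative_shock": True, "perforation_over_24h": True,
               "factor_a": True, "factor_c": True}
    print(equal_weight_score(patient))
    print(weighted_score(patient))
```

In the equal-weight case every present factor moves the total by the same amount, whereas in the weighted case a single heavily weighted factor can dominate the total.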

The main objective of this work is to establish a relation between the scoring systems (PULP and Boey) in patients with perforated peptic ulcer and the outcome (mortality and morbidity) (Sanchez-Delgado et al., 2011). The first goal is to understand which of the two scores better predicts the outcomes under study (mortality and complications). At the same time, the work aims to understand which of the parameters underlying the two scores are most closely related to those outcomes.
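One simple way to compare how well two scores predict a binary outcome is to turn each score into a prediction with a cut-off and measure the accuracy. The sketch below is a generic illustration and not the modeling procedure used in the paper; the function names and the toy values are made up for the example.

```python
from typing import Sequence, Tuple


def accuracy(scores: Sequence[float], outcomes: Sequence[bool],
             threshold: float) -> float:
    """Fraction of patients whose outcome matches 'score >= threshold'."""
    predictions = [s >= threshold for s in scores]
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)


def best_threshold(scores: Sequence[float],
                   outcomes: Sequence[bool]) -> Tuple[float, float]:
    """Try every observed score as a cut-off; return (cut-off, accuracy)."""
    candidates = sorted(set(scores))
    return max(((t, accuracy(scores, outcomes, t)) for t in candidates),
               key=lambda pair: pair[1])


if __name__ == "__main__":
    # Synthetic toy data, invented purely to exercise the functions.
    toy_scores = [0, 1, 2, 3]
    toy_outcomes = [False, False, True, True]
    print(best_threshold(toy_scores, toy_outcomes))
```

Choosing the cut-off on the same data used to report the accuracy gives an optimistic estimate, so in practice the cut-off would be selected on one part of the data and evaluated on another.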

To achieve these goals, it was necessary to follow a detailed process for analyzing large amounts of data and discovering patterns in them. This process is called “Data Mining” and involves methods from machine learning, statistics, and database systems. To make the method easier to understand, the concept of Data Mining and the main features of the process are first explored briefly. Then some of the work already done in this area is reviewed. Only after that does the Data Mining process itself begin, passing through the six phases that constitute it: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Finally, the results are analyzed extensively and conclusions relevant to the purpose of the work are drawn.
