Improved ICU mortality prediction based on SOFA scores and gastrointestinal parameters

Yehudit Aperstein (Data curation, Formal analysis, Investigation, Methodology, Software, Supervision, Writing – original draft),¹ Lidor Cohen (Data curation, Formal analysis),¹ Itai Bendavid (Visualization, Writing – original draft, Writing – review & editing),²,* Jonathan Cohen (Formal analysis, Writing – review & editing),² Elad Grozovsky (Data curation, Formal analysis),² Tammy Rotem (Data curation, Formal analysis),¹ and Pierre Singer (Conceptualization, Investigation, Methodology, Project administration, Supervision, Writing – review & editing)²

¹ Department of Industrial Engineering and Management, Afeka Academic College of Engineering, Tel Aviv, Israel

² Department of General Intensive Care and Institute for Nutrition Research, Rabin Medical Center, Beilinson Hospital, Petah Tikva, Israel

AUC = \int_{\infty}^{-\infty} \mathrm{TPR}(T) \, \mathrm{FPR}'(T) \, dT

The model with the maximal AUC was considered the most favorable. In addition to AUC, we also compared sensitivity, specificity, accuracy, negative predictive value (NPV) and positive predictive value (PPV), all of which are common performance indicators for comparison of predictive models.
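The indicators listed above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are assumptions made for illustration only. The AUC is estimated here with the rank-based (Mann-Whitney) formulation, which is equivalent to the area under the empirical ROC curve.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_score: Sequence[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn); label 1 is the positive class (here: died)."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy, PPV and NPV from confusion counts."""
    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc_rank(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
metrics = summary_metrics(*confusion_counts(labels, scores, threshold=0.5))
auc = auc_rank(labels, scores)
```

Unlike the other indicators, the AUC does not depend on the chosen threshold, which is one reason it is convenient as the primary criterion for ranking models.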

Results

The case records of 4,500 patients were included in our analysis. In the first stage of modeling, we evaluated each family of classification algorithms (ANN, SVM, etc.) independently and selected the best-performing model from each family. Overall, the fusion of logistic and linear regression provided the best results (AUC of 0.9113). For SVM, we inspected three kernels (linear, radial and polynomial) and selected the best model using 8-fold cross-validation. This process is further detailed in S1 File. Table 4 presents the performance of the SVM model trained with each kernel; the highest AUC was achieved with the polynomial kernel.

Table 4

Support Vector Machines (SVMs) results.

	Linear SVM	Radial SVM	Polynomial SVM
Area under Curve (AUC)	0.9061	0.8825	0.9066
Accuracy	0.8323	0.8291	0.8228
Sensitivity	0.6632	0.6526	0.6316
Specificity	0.9050	0.9050	0.9050
FPR	0.0950	0.0950	0.0950

The results of SVM methods using different kernel functions are presented. As the highest AUC was achieved using a polynomial kernel function, this method was assessed to be the superior SVM and only it was used later for comparison with the other models. SVM: Support Vector Machine; FPR: False Positive Rate

After the best SVM model was selected, we compared it with the other models we built, namely the ANN and the logistic regression model. For a graphical comparison, we used ROC curves to assess which model performs best on the available data. Fig 1 presents the ROC curves of all models plotted together for ease of comparison.

Fig 1

A comparison of classifiers on ROC curve.

The receiver operating characteristic (ROC) curves of three different classifiers are presented. All three methods (logistic regression, SVM with a polynomial kernel and ANN) produced similar curves, with AUCs of approximately 0.89 to 0.91 (Table 5), a range generally considered highly accurate for classification, and with only minute differences between them.

As the performance of the different classifiers was similar according to their ROC curves, we decided to employ ensembles of the different models to further improve predictive ability. We constructed ensembles from combinations of the aforementioned models. Table 5 displays the performance of all single classifiers and ensemble classifiers; the best AUC is achieved with the ensemble of logistic and linear regression. This finding is somewhat intuitive given the ordinal nature of the input scores we used (both SOFA and gastrointestinal scores are on an ordinal scale).
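To make the idea of fusing classifiers concrete, here is a minimal sketch that combines per-patient risk scores from several models by unweighted averaging. The fusion rule is an assumption made for illustration; the text does not state how the ensembles were combined (details are deferred to S1 File), and the function names and example scores below are hypothetical.

```python
from typing import List, Sequence


def average_ensemble(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the risk scores of several models, patient by patient.

    model_scores[i][j] is the score that model i assigns to patient j.
    Unweighted averaging is an illustrative choice, not the study's method.
    """
    if not model_scores:
        raise ValueError("at least one model is required")
    n_patients = len(model_scores[0])
    if any(len(scores) != n_patients for scores in model_scores):
        raise ValueError("all models must score the same patients")
    n_models = len(model_scores)
    return [sum(per_patient) / n_models for per_patient in zip(*model_scores)]


def classify(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn fused scores into binary predictions (1 = predicted death)."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical scores from two models for three patients.
logistic_scores = [0.2, 0.8, 0.55]
linear_scores = [0.4, 0.6, 0.35]
fused = average_ensemble([logistic_scores, linear_scores])
predictions = classify(fused)
```

Averaging tends to help most when the individual models make partly different errors, which is consistent with the small gains reported for models whose ROC curves are already very similar.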

Table 5

Full results comparison (without GI parameter).

Model	Area under Curve (AUC)
ANN	0.8875
SVM (Polynomial kernel)	0.9066
Linear Regression	0.9070
Logistic Regression	0.9070
Ensemble 1: ANN + Linear Regression	0.9101
Ensemble 2: Logistic + Linear Regression	0.9113
Ensemble 3: ANN + SVM + Linear Regression	0.9072
Ensemble 4: ANN + SVM + Linear + Logistic Regression	0.9081

A comparison of the performance of the different models as well as ensemble methods, i.e. combinations of single methods, shows that the ensemble of logistic and linear regression produced the highest AUC. GI: gastrointestinal. AUC: area under the curve. ANN: artificial neural networks. SVM: support vector machine.

After finding the best performing ensemble, we looked at improving results with the addition of the GI dysfunction score. We used a penalty function to correct the SOFA score when the actual outcome did not accord with the score.

At this point, using the three latest SOFA scores of a patient, we reached a level of overall accuracy higher than past findings in the literature, but there were still misclassified cases that we wanted to minimize. These cases were in fact false positives (patients who survived their ICU stay but whom the model classified as unlikely to survive). It became evident from the data that in the majority of these cases the last three SOFA scores were rising, implying a worsening of the patient's condition, even though the patient survived. We hoped that the gastrointestinal system could shed some light on these errors by explaining the survival of these patients through their nutritional condition, thereby improving model performance. We compared three input sets: the three latest SOFA scores only (SOFA), the three latest SOFA scores with the Zb value (SOFA + Zb), and the three latest SOFA scores with gastrointestinal scores and Zb values. We evaluated these inputs on our single and ensemble models and found that the combination of the latest three SOFA scores, the GI failure tool and the penalty function (Zb) yielded the best results (AUC = 0.9146). This performance analysis is presented in Table 6.
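The error pattern described above (survivors flagged as likely to die whose last three SOFA scores were rising) can be sketched as a simple filter. This is an illustrative sketch only: the record layout, the field names and the strict definition of "rising" are assumptions, and the Zb penalty function itself (defined in S2 File) is not reproduced here.

```python
from typing import Dict, List, Sequence


def is_rising(last_three_sofa: Sequence[int]) -> bool:
    """True if the three scores (oldest first) strictly increase.

    Strict monotonic increase is an assumed definition of "rising".
    """
    if len(last_three_sofa) != 3:
        raise ValueError("exactly three SOFA scores are expected")
    first, second, third = last_three_sofa
    return first < second < third


def rising_false_positives(records: List[Dict]) -> List[Dict]:
    """Select survivors predicted to die whose last three SOFA scores rose.

    Each record is a dict with hypothetical keys:
      'sofa'            - last three SOFA scores, oldest first
      'died'            - observed ICU outcome (bool)
      'predicted_death' - model prediction (bool)
    """
    return [
        r for r in records
        if r["predicted_death"] and not r["died"] and is_rising(r["sofa"])
    ]


# Hypothetical records, not study data.
patients = [
    {"id": "A", "sofa": [6, 8, 10], "died": False, "predicted_death": True},
    {"id": "B", "sofa": [9, 9, 7], "died": False, "predicted_death": True},
    {"id": "C", "sofa": [5, 7, 9], "died": True, "predicted_death": True},
    {"id": "D", "sofa": [3, 4, 6], "died": False, "predicted_death": False},
]
flagged_ids = [r["id"] for r in rising_false_positives(patients)]
```

Cases selected this way are the ones for which an additional signal, such as a GI dysfunction score, would need to carry information that the SOFA trend alone does not.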

Table 6

Performance of all inspected inputs (with GIF).

# models	ANN	Poly SVM	Linear Reg.	Logistic Reg.	SOFA	SOFA + Zb	SOFA + Gastrointestinal with Zb
1	✓				0.8875	0.9077	0.9024
1		✓			0.9066	0.9076	0.9146
1			✓		0.9070	0.9087	0.9036
1				✓	0.9070	0.8855	0.8645
2	✓		✓		0.9101	0.8960	0.9033
2			✓	✓	0.9113	0.9096	0.9020
2		✓	✓		0.9102	0.9093	0.9080
3	✓	✓	✓		0.9072	0.9098	0.9100
4	✓	✓	✓	✓	0.9081	0.9086	0.9046

A comparison of the inspected models, single as well as ensembles, before and after the addition of a GI dysfunction tool. It reveals better predictive capabilities for the addition of the GI dysfunction score to the SOFA score with a penalty function (Zb). # MODELS: 1 signifies a single model, 2 to 4 signify ensembles. GIF: gastrointestinal failure; SVM: Support Vector Machine; ANN: artificial neural networks; SOFA: Sequential organ failure assessment; Reg.: regression.

Discussion

There is an ongoing effort to improve prediction models for patient outcome in the ICU. In this study we tested the efficacy of using a patient’s latest SOFA scores to represent the change in condition throughout the ICU stay for the purpose of predicting ICU mortality. We first examined the ability of the SOFA score to predict mortality using the data from our ICU. The need to use sub-scores dictates larger input vectors [ 9 ]; thus, in this work we examined new ways to achieve a comparable level of accuracy with more compact inputs. Several machine learning algorithms showed good performance using the SOFA score, with AUCs mostly above 0.9. We then assessed several ensemble methods and found that the combination of logistic and linear regression slightly improved prediction. Furthermore, across the many models and methodologies examined, the AUC fell within a relatively tight interval, between 0.8875 and 0.9113. This narrow interval, obtained despite using four different algorithms, ensembles and input combinations, suggests robust results whose accuracy is not expected to decline drastically when further tested on new data, possibly from mixed-center populations (i.e., patients from other hospitals/countries). The next step was to incorporate a GI failure score with the SOFA score to further improve prediction accuracy. We used descriptive decision trees to discover GI parameters that may be able to reduce the prediction error of classifiers based solely on SOFA. In the aforementioned study by Reintam et al. [ 17 ], a GI dysfunction score was developed in an effort to further improve the performance of the SOFA score; however, the results were equivocal [ 16 ]: although the number of GI symptoms was significantly higher in non-survivors, no symptom could be used as an independent predictor for mortality. Furthermore, incorporating the combination of SOFA and the GI failure scale in this new heterogeneous population failed to improve performance. The final conclusion drawn from these past studies was that a new approach to the problem was required.

Several obstacles appear to hinder the incorporation of the GI system into severity scoring systems, including the wide diversity of clinical manifestations of gastrointestinal disorders in the ICU [ 25 ], the lack of an accepted definition for GI failure [ 26 ], insufficient validation of laboratory markers, mainly citrulline [ 27 ], and the scarcity of strong-level evidence. Feeding intolerance, an important manifestation and defining factor for GI failure, is by itself not yet well defined [ 28 ], as it may be based solely on GRV measurements, the amount of enteral nutrition delivered or GI symptom lists. Understanding of the intricate interrelation between acute GI dysfunction, feeding intolerance and intra-abdominal hypertension, and of their wide areas of overlap, is still evolving [ 29 ].

We devised a completely new approach for the incorporation of GI abnormalities into prognostic methods. Our machine learning prediction model combines integrated gastrointestinal disturbances with a well-established organ failure severity score. The model significantly improved the prediction capabilities of the standard SOFA score. Moreover, the model analyzes the dynamics of change in these parameters over time, making it a dynamic score (i.e., adding the important element of time). The time series approach allows for a significant improvement in mortality risk prediction compared to a single SOFA score reading. Our research shows that this approach allows the design of a model with improved prediction accuracy for ICU mortality risk, potentially advancing towards the addition of a GI component to the SOFA score, thus improving its predictive abilities.

Conclusions

Our data analysis models yielded strong evidence for the accuracy of the SOFA-based scoring system. When incorporating the time element by looking at three consecutive SOFA scores and adding a seventh component, the GI dysfunction score, we demonstrated a yet more accurate predictive ability of the model. We believe this represents a step towards the inclusion of the GI system in SOFA-based scoring systems and helps bridge the evidence gap in this field.

Supporting information

S1 File

Machine learning algorithms.

(DOCX)

S2 File

Penalty functions and descriptive regression trees.

(DOCX)

Abbreviations

SOFA	sequential organ failure assessment
ICU	intensive care unit
APACHE	acute physiology and chronic health evaluation
SAPS	simplified acute physiology score
GI	gastrointestinal
IAH	intra-abdominal hypertension
REE	resting energy expenditure
ROC	receiver operating characteristic curve
AUC	area under the curve
LR	logistic regression
SVM	support vector machines
ANN	artificial neural networks

Funding Statement

The authors received no specific funding for this work.

Data Availability

The public sharing of the data underlying this study is restricted as per the policy of the data guardian, Clalit Health Services, as the data contain sensitive patient information. Although the authors cannot make their study’s data publicly available at the time of publication, all authors commit to make the data underlying the findings described in this study fully available without restriction to those who request the data, in compliance with the PLOS Data Availability policy. For data sets involving personally identifiable information or other sensitive data, data sharing is contingent on the data being handled appropriately by the data requester and in accordance with all applicable local requirements. Data access queries may be directed to Dr. Itai Bendavid (itbd@walla.com) or Prof. Pierre Singer (psinger@clalit.org.il). Per the requirements of Clalit Health Services, data requesters can only access the hospital's dataset locally (i.e., physically in our hospital) under the supervision of Dr. Bendavid and Prof. Singer, and the data cannot be exported, even in anonymized form.

References

1. Vincent JL, Moreno R, Takala J, Willatts S, De Mendonça A, Bruining H, et al. The SOFA (sepsis-related organ failure assessment) score to describe organ dysfunction/failure. On behalf of the working group on sepsis-related problems of the European society of intensive care medicine. Intensive Care Med 1996;22:707–10. doi: 10.1007/bf01709751

2. Ferreira FL, Bota DP, Bross A, Mélot C, Vincent JL. Serial evaluation of the SOFA score to predict outcome in critically ill patients. JAMA 2001;286:1754–8. doi: 10.1001/jama.286.14.1754

3. Raith EP, Udy AA, Bailey M, McGloughlin S, MacIsaac C, Bellomo R, et al. Prognostic accuracy of the SOFA score, SIRS criteria and qSOFA score for in-hospital mortality among patients with suspected infection admitted to the intensive care unit. JAMA 2017;317:290–300. doi: 10.1001/jama.2016.20328

4. Wong LS, Young JD. A comparison of ICU mortality prediction using the APACHE II scoring system and artificial neural networks. Anaesthesia 1999;54:1048–54. doi: 10.1046/j.1365-2044.1999.01104.x

5. Moreno R, Vincent JL, Matos R, De Mendonça A, Cantraine F, Thijs L, et al. The use of maximum SOFA score to quantify organ dysfunction/failure in intensive care. Results of a prospective, multicenter study. Working group on sepsis related problems of ESICM. Intensive Care Med 1999;25:686–96. doi: 10.1007/s001340050931

6. Toma T, Abu-Hanna A, Bosman RJ. Discovery and inclusion of SOFA score episodes in mortality prediction. J Biomed Inform 2007;40:649–60. doi: 10.1016/j.jbi.2007.03.007

7. Minne L, Abu-Hanna A, de Jonge E. Evaluation of SOFA-based models for predicting mortality in the ICU: a systematic review. Crit Care 2008;12:R161. doi: 10.1186/cc7160

8. Sandri M, Berchialla P, Baldi I, Gregori D, De Blasi RA. Dynamic Bayesian networks to predict sequences of organ failures in patients admitted to ICU. J Biomed Inform 2014;48:106–13. doi: 10.1016/j.jbi.2013.12.008

9. Houthooft R, Ruyssinck J, van der Herten J, Stijven S, Couckuyt I, Gadeyne B, et al. Predictive modelling of survival and length of stay in critically ill patients using sequential organ failure scores. Artif Intell Med 2015;63:191–207. doi: 10.1016/j.artmed.2014.12.009

10. Jain A, Palta S, Saroa R, Palta A, Sama S, Gombar S. Sequential organ failure assessment scoring and prediction of patient's outcome in intensive care unit of a tertiary care hospital. J Anaesthesiol Clin Pharmacol 2016;32:364–8. doi: 10.4103/0970-9185.168165

11. Clark JA, Coopersmith CM. Intestinal crosstalk: a new paradigm for understanding the gut as the "motor" of critical illness. Shock 2007;28:384–93. doi: 10.1097/shk.0b013e31805569df

12. Mittal R, Coopersmith CM. Redefining the gut as the motor of critical illness. Trends Mol Med 2014;20:214–23. doi: 10.1016/j.molmed.2013.08.004

13. Patel JJ, Rosenthal MD, Miller KR, Martindale RG. The gut in trauma. Curr Opin Crit Care 2016;22:339–46. doi: 10.1097/MCC.0000000000000331

14. Alverdy JC, Chang EB. The re-emerging role of the intestinal microflora in critical illness and inflammation: why the gut hypothesis of sepsis syndrome will not go away. J Leukoc Biol 2008;83:461–6. doi: 10.1189/jlb.0607372

15. Vincent JL, de Mendonça A, Cantraine F, Moreno R, Takala J, Suter PM, et al. Use of the SOFA score to assess the incidence of organ dysfunction/failure in intensive care units: result of a multicenter, prospective study. Working group on "sepsis-related problems" on behalf of the European society of intensive care medicine. Crit Care Med 1998;26:1793–800.

16. Reintam Blaser A, Poeze M, Malbrain ML, Björck M, Oudemans-van Straaten HM, Starkopf J; Gastrointestinal failure trial group. Gastrointestinal symptoms during the first week of intensive care are associated with poor outcome: a prospective multicenter study. Intensive Care Med 2013;39:899–909. doi: 10.1007/s00134-013-2831-1

17. Reintam A, Parm P, Kitus R, Starkopf J, Kern H. Gastrointestinal failure score in critically ill patients: a prospective observational study. Crit Care 2008;12:R90. doi: 10.1186/cc6958

18. Abed N, Mohammed L, Metwaly A, et al. Gastrointestinal failure score in combination with SOFA score in the assessment of the critically ill patients. Crit Care 2011;15(Suppl 1):P509.

19. Sun JK, Li WQ, Ni HB, Ke L, Tong ZH, Li N, et al. Modified gastrointestinal failure score for patients with severe acute pancreatitis. Surg Today 2013;43:506–13.

20. Reintam Blaser A, Malbrain ML, Starkopf J, Fruhwald S, Jakob SM, De Waele J, et al. Gastrointestinal function in intensive care patients: terminology, definitions and management. Recommendations of the ESICM working group on abdominal problems. Intensive Care Med 2012;38:384–94. doi: 10.1007/s00134-011-2459-y

21. Hu B, Sun R, Wu A, Ni Y, Liu J, Guo F, et al. Severity of acute gastrointestinal injury grade is a predictor of all-cause mortality in critically ill patients: a multicenter, prospective, observational study. Crit Care 2017;21:188. doi: 10.1186/s13054-017-1780-4

22. Guillén J, Jiankun L, Furr M, Wang T, Strong S, Moore CC, et al. Predictive Models for Severe Sepsis in Adult ICU Patients. 2015 Systems and Information Engineering Design Symposium. doi: 10.1109/SIEDS.2015.7116970

23. Amato F, López A, Peña-Méndez EM, Vaňhara P, Hampl A, Havel J. Artificial neural networks in medical diagnosis. J Appl Biomed 2013;11:47–58.

24. Faisy C, Guerot E, Diehl JL, Labrousse J, Fagon JY. Assessment of resting energy expenditure in mechanically ventilated patients. Am J Clin Nutr 2003;78:241–9. doi: 10.1093/ajcn/78.2.241

25. Reintam Blaser A, Starkopf J, Malbrain ML. Abdominal signs and symptoms in intensive care patients. Anaesthesiol Intensive Ther 2015;47:379–87. doi: 10.5603/AIT.a2015.0022

26. Reintam Blaser A, Jakob SM, Starkopf J. Gastrointestinal failure in the ICU. Curr Opin Crit Care 2016;22:128–41. doi: 10.1097/MCC.0000000000000286

27. Piton G, Manzon C, Cypriani B, Carbonnel F, Capellier G. Acute intestinal failure in critically ill patients: is plasma citrulline the right marker? Intensive Care Med 2011;37:911–7. doi: 10.1007/s00134-011-2172-x

28. Reintam Blaser A, Starkopf L, Deane AM, Poeze M, Starkopf J. Comparison of different definitions of feeding intolerance: a retrospective observational study. Clin Nutr 2015;34:956–61. doi: 10.1016/j.clnu.2014.10.006

29. Reintam Blaser A, Malbrain MLNG, Regli A. Abdominal pressure and gastrointestinal function: an inseparable couple? Anaesthesiol Intensive Ther 2017;49:146–58.

Articles from PLOS ONE are provided here courtesy of PLOS