
Machine learning vs. traditional regression analysis for fluid overload prediction in the ICU

Springer Nature
Scientific Reports
Authors:
  • The University of Georgia College of Pharmacy

Abstract

Fluid overload, while common in the ICU and associated with serious sequelae, is hard to predict and may be influenced by ICU medication use. Machine learning (ML) approaches may offer advantages over traditional regression techniques to predict it. We compared the ability of traditional regression techniques and different ML-based modeling approaches to identify clinically meaningful fluid overload predictors. This was a retrospective, observational cohort study of adult patients admitted to an ICU ≥ 72 h between 10/1/2015 and 10/31/2020 with available fluid balance data. Models to predict fluid overload (a positive fluid balance ≥ 10% of the admission body weight) in the 48–72 h after ICU admission were created. Potential patient and medication fluid overload predictor variables (n = 28) were collected at either baseline or 24 h after ICU admission. The optimal traditional logistic regression model was created using backward selection. Supervised, classification-based ML models were trained and optimized, including a meta-modeling approach. Area under the receiver operating characteristic (AUROC), positive predictive value (PPV), and negative predictive value (NPV) were compared between the traditional and ML fluid prediction models. A total of 49 of the 391 (12.5%) patients developed fluid overload. Among the ML models, the XGBoost model had the highest performance (AUROC 0.78, PPV 0.27, NPV 0.94) for fluid overload prediction. The XGBoost model performed similarly to the final traditional logistic regression model (AUROC 0.70; PPV 0.20, NPV 0.94). Feature importance analysis revealed severity of illness scores and medication-related data were the most important predictors of fluid overload. In the context of our study, ML and traditional models appear to perform similarly to predict fluid overload in the ICU. Baseline severity of illness and ICU medication regimen complexity are important predictors of fluid overload.
Scientic Reports | (2023) 13:19654 | https://doi.org/10.1038/s41598-023-46735-3
www.nature.com/scientificreports
Andrea Sikora1, Tianyi Zhang2, David J. Murphy3, Susan E. Smith1, Brian Murray4, Rishikesan Kamaleswaran5,6, Xianyan Chen2, Mitchell S. Buckley7, Sandra Rowe8 & John W. Devlin9,10*
1Department of Clinical and Administrative Pharmacy, University of Georgia College of Pharmacy, 1120 15th Street, HM-118, Augusta, GA 30912, USA. 2Department of Statistics, University of Georgia Franklin College of Arts and Sciences, Athens, GA, USA. 3Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Emory University, Atlanta, GA, USA. 4Department of Pharmacy, University of North Carolina Medical Center, Chapel Hill, NC, USA. 5Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA. 6Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA, USA. 7LaJolla Pharmaceuticals, Waltham, USA. 8Department of Pharmacy, Oregon Health and Science University, Portland, OR, USA. 9Northeastern University School of Pharmacy, Boston, MA, USA. 10Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, MA, USA. *email: j.devlin@northeastern.edu
Fluid overload, a frequent and unintended consequence of the resuscitation process in critically ill adults, may result in increased rates of acute kidney injury and invasive mechanical ventilation initiation, prolonged intensive care unit (ICU) stay, and mortality1,2. Timely de-resuscitation to remove excess fluid is associated with improved outcomes3–6. While the predictors of volume responsiveness are well-established7,8, particularly in patients with sepsis9,10, the predictors for ICU fluid overload remain unclear. Development of rigorous fluid overload prediction algorithms could shorten the time to the implementation of fluid overload mitigation strategies [e.g., concentration of intravenous (IV) fluid products, discontinuation of maintenance fluids, administration of diuretics] and improve outcomes.
Non-diuretic ICU medication use may affect fluid overload risk; preliminary data suggest the medication regimen complexity-ICU (MRC-ICU) score is associated with both fluid overload and fluid balance11. This score has also been shown to predict mortality, length of stay, and the medication interventions needed to optimize a patient’s pharmacotherapy regimen12–19. Therefore, quantifying patient-specific, medication-related data is likely an important consideration in the prediction of fluid overload in critically ill adults2,20,21.
Event prediction in the ICU remains a perennial area of research given the many challenges that exist for clinicians to accurately predict clinical outcomes in the highly complex and dynamic critical care environment22,23. Artificial intelligence and machine learning techniques have been proposed as a method to improve ICU clinical outcome prediction given their unique ability to handle multi-dimensional problems and identify novel patterns within the vast troves of continuously-generated patient data21,24–26. However, to some ICU clinicians, the use of artificial intelligence/machine learning approaches to predict clinical events may have a ‘black-box’ effect, which can ultimately preclude implementation. The rigorous evaluation of whether artificial intelligence-based approaches predict clinical events better than traditional regression models (or clinical expertise alone) remains a key question in critical care practice27–31.
In this study, we sought to compare the ability of machine learning approaches and traditional regression models to predict fluid overload and to identify the individual predictors for its occurrence in critically ill adults. We hypothesized that advanced machine learning techniques perform better than traditional regression models to predict fluid overload and that the predictors for fluid overload identified through machine learning approaches may be different.
Methods
We conducted a retrospective, observational study of adults admitted to ICUs at the University of North Carolina Health System (UNCHS), an integrated health system, who had fluid overload data available. The protocol for this study was approved with waivers of informed consent and HIPAA authorization granted by the UNCHS Institutional Review Board (approval number: Project00001541; approval date: October 2021). Procedures followed in the study were in accordance with the ethical standards of the UNCHS Institutional Review Board and the Helsinki Declaration of 1975, as most recently amended32. The reporting of this study adheres to the STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) statement33.
Population
A random sample of 1000 adults (≥ 18 years) admitted to an ICU at UNCHS between October 2015 and October 2020 was generated. Patients on their index ICU admission with fluid balance data available for the first 72 h were included (Supplemental Digital Content (SDC) Fig. S1). Patients were excluded if the admission was not their index ICU admission.
Data collection and outcomes
De-identified UNCHS electronic health record (EHR) data (Epic Systems, Verona, WI) housed in the Carolina Data Warehouse (CDW) were extracted by a trained CDW data analyst. The primary outcome was the presence of fluid overload at 48–72 h (i.e., day 3) after ICU admission. Fluid overload was defined as a positive fluid balance in milliliters (mL) greater than or equal to 10% of the patient’s admission body weight in kilograms (kg)2,34. For example, a patient with a body weight of 100 kg at ICU admission having a positive fluid balance at 72 h of 12,000 mL (or 12 kg) would be considered to have fluid overload. A secondary outcome was the amount of fluid overload as a function of body weight. For example, the aforementioned patient would have a fluid overload amount of 12%.
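The outcome definition above can be sketched in a few lines of Python. This is an illustrative sketch, not the study's code: the function names are hypothetical, and it assumes that 1 mL of fluid is counted as 1 g, which is how the worked example in the text equates 12,000 mL with 12 kg.

```python
def fluid_overload_percent(net_balance_ml: float, admission_weight_kg: float) -> float:
    """Net fluid balance expressed as a percentage of admission body weight.

    Assumption: 1 mL of fluid is counted as 1 g, so 1,000 mL equals 1 kg.
    """
    if admission_weight_kg <= 0:
        raise ValueError("admission weight must be positive")
    balance_kg = net_balance_ml / 1000.0
    return 100.0 * balance_kg / admission_weight_kg


def has_fluid_overload(net_balance_ml: float, admission_weight_kg: float,
                       threshold_percent: float = 10.0) -> bool:
    """True when the fluid balance is at least 10% of admission body weight."""
    return fluid_overload_percent(net_balance_ml, admission_weight_kg) >= threshold_percent


# Worked example from the text: 100 kg patient with +12,000 mL at 72 h.
percent = fluid_overload_percent(12_000, 100)
print(percent, has_fluid_overload(12_000, 100))  # prints: 12.0 True
```

The first function corresponds to the secondary outcome (amount of fluid overload) and the second to the primary, dichotomous outcome.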
Following a literature review, and through investigator consensus, potential predictor variables for fluid overload were defined2,35–38. A total of 28 potential predictors were identified: 1) ICU baseline: age ≥ 65 years, sex, admission to a medical (vs. surgical) ICU, primary ICU admission diagnosis (i.e., cardiac, chronic kidney disease, heart failure, hepatic, pulmonary, sepsis, trauma), and select co-morbidities (i.e., chronic kidney disease, heart failure); 2) 24 h after ICU admission: APACHE II and SOFA score (using worst values in the 24 h period), use of supportive care devices (i.e., renal replacement therapy, invasive mechanical ventilation), serum laboratory values (i.e., albumin < 3 mg/dL, bicarbonate < 22 mEq/L or > 29 mEq/L, chloride ≥ 110 mEq/L, creatinine ≥ 1.5 mg/dL, lactate ≥ 2 mmol/L, potassium ≥ 5.5 mEq/L, sodium ≥ 148 mEq/L or < 134 mEq/L), fluid balance (mL), and presence of acute kidney injury (as defined by need for renal replacement therapy or serum creatinine greater than or equal to two times baseline); 3) Medication data at 24 h: MRC-ICU score, vasopressor use in the first 24 h, use of continuous medication infusions, and the number of continuous medication infusions.
Data analysis
Data missingness
Due to the hypothesis-generating nature of our study and the lack of published data on ICU fluid overload prediction, no attempt was made to estimate a study sample size. The 391 patients were split into training and testing datasets using an 80:20 ratio. We assumed data were missing at random (MAR) (i.e., related to observed, not unobserved values) and therefore chose Multiple Imputation by Chained Equations (MICE), rather than complete case analysis or simple imputation, as the most appropriate approach to address missingness. Ten imputations per variable were therefore applied for all missing data in the training and testing datasets to generate multiple imputed training and testing datasets (SDC Fig. S1).
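As a rough illustration of the 80:20 split described above, the following Python sketch shuffles patient identifiers and holds out one fifth for testing. The function name, the fixed seed, and the use of 100 placeholder identifiers are assumptions made for the example; the source does not state how the split was randomized, and the imputation step itself is not shown here.

```python
import random


def split_train_test(patient_ids, test_fraction=0.2, seed=0):
    """Randomly split patient identifiers into training and testing lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    # The first n_test shuffled identifiers become the test set.
    return ids[n_test:], ids[:n_test]


# Hypothetical cohort of 100 patient identifiers.
train_ids, test_ids = split_train_test(range(1, 101))
print(len(train_ids), len(test_ids))  # prints: 80 20
```

Splitting before imputation, as the text describes, keeps the imputed training and testing datasets as separate collections.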
Machine learning models
We employed Random Forest, SVM, and XGBoost for the task of modeling the presence of fluid overload39–41. During the model training on each of the ten imputed training sets, fivefold cross validation was applied for Random Forest, SVM and XGBoost, using their most appropriate R package42–44, to choose the hyperparameters for these machine learning models that resulted in the highest prediction accuracy. Each of these models was then fitted on the corresponding imputed training set, and predictions for probability of fluid overload were made on each of the ten imputed testing sets using the corresponding optimal model. During this phase, hyperparameters were tuned. For Random Forest, two hyperparameters were tuned (number of trees and number of variables randomly sampled as candidates at each split). For SVM, linear kernel and cost of constraints violation were tuned. For XGBoost, two hyperparameters were tuned (maximum depth of a tree and maximum number of boosting iterations). For each model, ten different predictions were generated on ten different imputed test sets. These predictions of the probability for fluid overload were averaged as the final prediction.
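The final averaging step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the probabilities are made up, and only two imputations are shown for brevity where the study used ten.

```python
def average_predictions(predictions_by_imputation):
    """Average per-patient predicted probabilities across imputed test sets.

    predictions_by_imputation: list of lists; each inner list holds the
    predicted probabilities for the same patients in the same order.
    """
    n_patients = len(predictions_by_imputation[0])
    if any(len(p) != n_patients for p in predictions_by_imputation):
        raise ValueError("every imputation must score the same patients")
    m = len(predictions_by_imputation)
    return [sum(p[i] for p in predictions_by_imputation) / m
            for i in range(n_patients)]


# Made-up probabilities for two patients across two imputed test sets.
final = average_predictions([[0.25, 0.5], [0.75, 0.5]])
print(final)  # prints: [0.5, 0.5]
```

Averaging probabilities rather than class labels keeps the final prediction continuous, so a single threshold can still be applied afterwards.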
For the degree of fluid overload, we built models with the amount of fluid overload at 72 h. Since this is a continuous variable, we employed the regression versions of the above machine learning models: Random Forest regression, SVM regression, and XGBoost regression. For XGBoost, feature importance was measured as the frequency a feature was used in the trees. For Random Forest, feature importance was measured by mean decrease in node impurity. Because ten different models were used on each imputed dataset, ten different feature importance lists were generated for each. A subsequent analysis modeling fluid overload as a continuous variable (percent of net milliliters of fluid by body weight, instead of dichotomous presence or absence of fluid overload) was performed (see SDC Additional Methods S1).
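One way to summarize the ten feature importance lists is to count how often each feature ranks among the top few across imputations. The sketch below is an assumption about how such a tally could be made, not the authors' procedure; the function name and the importance values are invented for illustration.

```python
from collections import Counter


def top_k_frequency(importance_by_imputation, k=5):
    """Count how often each feature ranks in the top k across imputations.

    importance_by_imputation: list of dicts mapping feature name -> importance.
    """
    counts = Counter()
    for importances in importance_by_imputation:
        ranked = sorted(importances, key=importances.get, reverse=True)
        counts.update(ranked[:k])
    return counts


# Invented importance values for two imputed datasets, top 2 features each.
lists = [
    {"fluid_balance_24h": 0.40, "sofa_24h": 0.30, "mrc_icu_24h": 0.20},
    {"fluid_balance_24h": 0.35, "mrc_icu_24h": 0.33, "sofa_24h": 0.10},
]
print(top_k_frequency(lists, k=2).most_common())
```

Features that appear in the top k for most imputations are less likely to owe their rank to a single imputed dataset.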
Traditional regression models
Subsequently, a full logistic regression model was built for the presence of fluid overload for each of the ten complete training sets. We then applied backward elimination to select the final model. The initial set of variables for the variable selection were determined by the significance of variables in the ten full models by multivariate Wald testing45. We built our linear regression models so that the degree of fluid overload was similar to that of the ten completed training sets. In order to compare these models with the MRC-ICU only model, we also built logistic regression and linear regression models with MRC-ICU as the sole predictor in the ten training sets. After model fitting, model fits were pooled using Rubin’s method46. Using the pooled models, odds ratios (OR) and their 95% confidence intervals (CI) were reported.
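Rubin's method combines a coefficient estimated on each imputed dataset into one pooled estimate whose variance reflects both within-imputation and between-imputation uncertainty. The Python sketch below illustrates the standard pooling formulas for a single log-odds coefficient. The function name and input numbers are hypothetical, and the confidence interval uses a normal approximation (z = 1.96) as a simplification; Rubin's rules properly use a t reference distribution with adjusted degrees of freedom.

```python
import math


def pool_rubin(estimates, variances, z=1.96):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: per-imputation log-odds coefficients.
    variances: per-imputation squared standard errors.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances, m >= 2")
    q_bar = sum(estimates) / m                               # pooled estimate
    u_bar = sum(variances) / m                               # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)   # between-imputation variance
    total = u_bar + (1 + 1 / m) * b                          # total variance
    se = math.sqrt(total)
    return {
        "log_or": q_bar,
        "se": se,
        "or": math.exp(q_bar),
        # Normal-approximation interval (simplifying assumption).
        "ci": (math.exp(q_bar - z * se), math.exp(q_bar + z * se)),
    }


# Hypothetical coefficients from two imputations (the study used ten).
print(pool_rubin([0.4, 0.6], [0.04, 0.04]))
```

Exponentiating the pooled log-odds and its interval bounds gives the odds ratio and confidence interval on the scale reported in the text.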
For each regression model, ten different predictions were generated on ten different imputed test sets as well. These predictions of the probability for fluid overload were averaged as the final prediction. We compared the variables selected through backward selection with the top five variables chosen by Random Forest (see SDC Additional Methods S1). To further evaluate our results in those patients with high APACHE II (≥ 25) and high SOFA (≥ 10) scores, we generated predictions using the backward selection model (see SDC Additional Methods S1).
Ethical approval
The protocol for this study was approved with waivers of informed consent and HIPAA authorization granted by the UNCHS Institutional Review Board (approval number: Project00001541; approval date: October 2021).
Results
A total of 49 (12.5%) of the 391 included patients had fluid overload on ICU day 3. The degree of day 3 fluid overload was significantly greater in the fluid overload (vs. non-overload) patients (16.6% vs. 2.2%, p < 0.01). Overall, the mean APACHE II score was 15.7 ± 6.6, mean SOFA score was 8.3 ± 3.3, and MRC-ICU score was 11.8 ± 8.7. A significantly greater proportion of fluid overload patients (vs. those without) had an elevated serum lactate ≥ 2 mmol/L (32.7% vs. 14.9%, p = 0.01) and AKI (28.6% vs. 10.5%, p < 0.001) at 24 h and positive fluid balance (1,840 mL vs. 390 mL, p < 0.001) on ICU day 3. All model covariates are summarized in Table 1. At ICU day 3, patients with fluid overload (vs. those without) were more likely to be dead (20.4% vs. 7.3%, p = 0.01), have AKI (34.7% vs. 15.8%, p < 0.001), and remain on mechanical ventilation (12.7% vs. 4.2%, p = 0.05).
Among the machine learning models, XGBoost demonstrated the highest AUROC (0.78) compared to SVM (0.69) and RF (0.76) and was associated with a PPV of 0.27 and NPV of 0.94. Notably, all models tested had relatively poor PPV. In comparison, stepwise logistic regression had an AUROC of 0.70, PPV of 0.26, and NPV of 0.94. Full results are reported in Table 2, and AUROC curves for all models are provided in SDC Supplemental Fig. S2. Results of the full logistic regression are reported in SDC Supplemental Table S1. Stepwise regression resulted in a more parsimonious model (7 variables vs. 31 variables) but demonstrated similar performance to the machine learning models (SDC Supplementary Table S2). The stepwise regression model comprised presence of sepsis, male sex, the SOFA score at 24 h, and the 24 h serum sodium and bicarbonate (Table 2). In an analysis of MRC-ICU as a single predictor for fluid overload, the model had an AUROC of 0.74 (0.60–0.84), sensitivity of 0.62 (0.35–0.85), specificity of 0.70 (0.63–0.77), PPV of 0.16 (0.08–0.27), and NPV of 0.96 (0.90–0.98).
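For readers less familiar with AUROC, it can be read as the probability that a randomly chosen patient with fluid overload receives a higher predicted score than a randomly chosen patient without it. The following Python sketch computes it by direct pairwise comparison; the function name and toy data are illustrative and not drawn from the study, and the quadratic loop is chosen for clarity rather than speed.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    labels: 1 for the event (e.g., fluid overload), 0 otherwise.
    scores: predicted probabilities or any score where higher means riskier.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # prints: 0.75
```

Because AUROC depends only on ranking, it does not change with outcome prevalence, unlike the PPV and NPV reported alongside it.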
Feature importance graphs were plotted for XGBoost (Fig. 1), RF (SDC Supplemental Fig. S3) and SVM (SDC Supplemental Fig. S4). Among the 10 different feature importance lists generated for each model, differences between top features were noted. For example, for two of the machine learning models, XGBoost (Fig. 2) and RF, the top five most important features were fluid balance at 24 h, SOFA score at 24 h, MRC-ICU at 24 h, APACHE II at 24 h, and the number of continuous infusions at 24 h. While the stepwise regression model found fluid balance at 24 h and APACHE II at 24 h to be top features, the SOFA score at 24 h, the MRC-ICU at 24 h and the number of continuous infusions were not found to be model features. The full regression results for predicting the amount of fluid overload at 72 h are reported in SDC Supplemental Table S3. For stepwise regression, twelve variables were included with fluid balance, laboratory values, and severity of illness being significant predictors (SDC Supplemental Table S4). All models demonstrated similar performance as measured by MSE (SDC Supplemental Table S5). Feature importance graphs are presented in SDC Supplemental Figs. S5–S7.
Table 1. Study cohort characteristics by presence of fluid overload within 72 h of ICU admission. Data are presented as n (%) unless otherwise stated.
Characteristic | All (n = 391) | Fluid overload (n = 49) | No fluid overload (n = 342) | p-value
ICU baseline
Age ≥ 65 years | 202 (51.7) | 19 (38.8) | 183 (53.5) | 0.08
Male sex | 213 (54.5) | 23 (46.9) | 190 (55.6) | 0.33
Chronic comorbidities
Chronic kidney disease | 13 (3.3) | 1 (2.0) | 12 (3.5) | 0.06
Heart failure | 19 (4.9) | 2 (4.1) | 17 (4.9) | 0.06
Admission to medical ICU | 156 (39.9) | 24 (48.9) | 132 (38.6) | 0.22
Primary ICU admission diagnosis (p-value for the group) | | | | 0.06
Cardiac | 81 (20.7) | 3 (6.1) | 78 (22.8) |
Chronic kidney disease | 13 (3.3) | 1 (2.0) | 12 (3.5) |
Hepatic | 6 (1.5) | 1 (2.0) | 5 (1.5) |
Pulmonary | 58 (14.8) | 8 (16.3) | 50 (14.6) |
Sepsis/septic shock | 29 (7.4) | 7 (14.3) | 22 (6.4) |
Trauma | 10 (2.6) | 3 (6.1) | 7 (2.0) |
24 h after ICU admission
Severity of illness, mean (SD)
APACHE II Score | 15.7 (6.6) | 17.5 (7.0) | 15.4 (6.6) | 0.06
SOFA Score | 8.3 (3.3) | 9.9 (4.6) | 8.2 (3.1) | 0.07
Supportive devices
Any renal replacement therapy | 5 (1.3) | 1 (2.0) | 4 (1.2) | 1.00
Any mechanical ventilation | 140 (35.8) | 21 (42.9) | 119 (34.8) | 0.53
Serum laboratory values
Albumin < 3 mg/dL | 88 (22.5) | 18 (36.7) | 70 (20.5) | 0.02
Bicarbonate < 22 mEq/L | 74 (18.9) | 14 (28.6) | 60 (17.5) | 0.16
Bicarbonate > 29 mEq/L | 64 (16.4) | 6 (12.2) | 58 (16.9) |
Creatinine ≥ 1.5 mg/dL | 28 (7.2) | 7 (14.3) | 21 (6.1) | 0.02
Chloride ≥ 110 mEq/L | 125 (31.9) | 19 (38.8) | 106 (30.9) | 0.33
Potassium ≥ 5.5 mEq/L | 19 (4.9) | 5 (10.2) | 14 (4.1) | 0.12
Lactate ≥ 2 mmol/L | 67 (17.1) | 16 (32.7) | 51 (14.9) | 0.01
Sodium ≥ 148 mEq/L | 22 (5.6) | 6 (12.2) | 16 (4.7) | 0.01
Sodium < 134 mEq/L | 33 (8.4) | 4 (8.1) | 29 (8.5) |
Fluid balance (mL), mean (SD) | 570 (1960) | 1840 (301) | 390 (168) | < 0.001
Acute kidney injury | 50 (12.8) | 14 (28.6) | 26 (10.5) | < 0.001
Medications
MRC-ICU, mean (SD) | 11.8 (8.7) | 13.4 (8.4) | 11.5 (8.7) | 0.06
Any vasopressor | 119 (30.4) | 16 (32.6) | 103 (30.1) | 0.85
Any continuous infusions | 249 (63.6) | 34 (69.3) | 215 (62.8) | 0.47
Infusions/patient, mean (SD) | 2.29 (3.3) | 1.98 (2.2) | 2.33 (3.4) | 0.35
Table 2. Performance of presence of fluid overload prediction models, mean (confidence interval). AUROC area under the receiver operating characteristic, PPV positive predictive value, NPV negative predictive value.
Model | AUROC | Accuracy | Sensitivity | Specificity | PPV | NPV
Traditional regression
All variables | 0.70 (0.53, 0.82) | 0.82 (0.76, 0.87) | 0.43 (0.19, 0.70) | 0.85 (0.79, 0.89) | 0.20 (0.08, 0.37) | 0.94 (0.89, 0.97)
Stepwise selected regression | 0.70 (0.52, 0.82) | 0.86 (0.80, 0.90) | 0.43 (0.19, 0.70) | 0.89 (0.84, 0.93) | 0.26 (0.11, 0.47) | 0.94 (0.90, 0.97)
Supervised machine learning models
Random forest | 0.76 (0.62, 0.86) | 0.83 (0.77, 0.88) | 0.56 (0.29, 0.80) | 0.8571 (0.80, 0.90) | 0.25 (0.12, 0.43) | 0.95 (0.91, 0.98)
Support vector machine | 0.69 (0.51, 0.82) | 0.80 (0.74, 0.86) | 0.50 (0.24, 0.75) | 0.82 (0.76, 0.88) | 0.21 (0.09, 0.36) | 0.94 (0.90, 0.97)
XGBoost | 0.78 (0.62, 0.87) | 0.87 (0.81, 0.91) | 0.37 (0.15, 0.64) | 0.91 (0.86, 0.94) | 0.27 (0.10, 0.50) | 0.94 (0.89, 0.97)
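The threshold-based metrics in Table 2 all derive from a confusion matrix. The sketch below shows the standard definitions in Python; the function name is hypothetical, and the counts are invented to illustrate how a low event rate can produce a low PPV alongside a high NPV. They are not the study's actual test-set counts, and the function assumes every denominator is nonzero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    tp/fn: patients with the event predicted positive/negative.
    fp/tn: patients without the event predicted positive/negative.
    Assumes every denominator is nonzero.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Invented counts: 15 events among 200 patients (7.5% event rate).
metrics = classification_metrics(tp=5, fp=15, tn=170, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts only a quarter of positive predictions are correct, yet nearly all negative predictions are, which mirrors the qualitative pattern seen across the models in Table 2.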
When variables selected through backward selection were compared with the variables chosen by the Random Forest model, we found MRC-ICU at 24 h to be highly correlated with male sex; the number of IV continuous infusions to be highly correlated with male sex and age ≥ 65; and fluid balance at 24 h (mL) to be highly correlated with an admission diagnosis of sepsis/septic shock, serum bicarbonate, and age ≥ 65 (SDC Supplemental Tables S6 and S7). These results indicate high explanatory power exists between the backward selection and random forest variables. The vast majority of cases of fluid overload occur in patients with both high APACHE II and SOFA scores (SDC Supplemental Table S8).
Discussion
Although machine learning models have been shown to outperform traditional regression models in a variety of settings47,48, the potential benefits of machine learning in critical care remain an open field of exploration, in part due to a current lack of rigorous comparison in high quality ICU datasets29,49,50. Our analysis represents the first published comparison of machine learning approaches with traditional regression methods to predict fluid overload using a novel dataset with granular medication data.
Figure 1. Feature importance for presence of fluid overload prediction with XGBoost.
Figure 2. Most common features for presence of fluid overload prediction with XGBoost imputations.
We report that machine learning and logistic regression analyses demonstrate a similar predictive power to identify patients with fluid overload on day 3 of their ICU stay. Although use of machine learning did not appear to improve predictive performance over regression analysis, it expanded the number of variables critical to fluid overload prediction and highlights the importance of further artificial intelligence-based exploration in this area. This analysis of individual predictors may help bedside clinicians better understand how the machine learning models work and may help overcome their ‘black box’ hesitancy to trust machine learning-generated results51,52. For example, feature importance graphs for the machine learning analyses found complexity of the daily ICU medication regimen (i.e., MRC-ICU score), which includes the number of intravenous medication infusions (the primary method to administer medications in this population and a primary source of fluids to a patient), to be an important contributor to fluid overload. In comparison, in the traditional multivariable regression, the MRC-ICU score was not associated with fluid overload. This may be because machine learning analyses better account for severity of illness and for the response of clinicians to this severity, namely administering more medication infusions, which leads to a more complex daily medication regimen; however, the methods applied, including feature importance, preclude causal inference at this juncture. As such, our results highlight the unique power of machine learning to identify complex relationships that can be further elucidated via machine-learning based causal inference modeling and other designs aimed at causation2,20.
Optimizing fluid management (or fluid stewardship) has been previously defined by the ROSE model of Resuscitation, Optimization, Stabilization, and dE-resuscitation35. After an initial 24–48 h period characterized by overt volume resuscitation (e.g., a crystalloid bolus) and IV medication initiation (e.g., antibiotics), and the associated fluid administration, the care priority shifts from volume administration to volume removal. While comprehensive fluid stewardship management strategies including reduced fluid use and diuretic administration can effectively reduce fluid overload and its sequelae, they are often deployed too late1,2. Interestingly, some reports have indicated ‘hidden fluids’ (defined as blood products, enteral nutrition, flushes, and intravenous medications) were significantly associated with the development of fluid overload. Although many of these ‘hidden fluids’ are necessary during critical illness (e.g., blood products), intravenous medications account for over 40% of total fluid intake in this analysis; interventions such as concentrating intravenous medications, employing oral formulations when feasible, careful evaluation of maintenance fluids, and antibiotic de-escalation are therefore potentially still viable ways to reduce this complication, even in high illness severity. However, weighing risks and benefits associated with these interventions in this context may be aided by more quantitative prediction data56,57. Overall, de-resuscitation and fluid stewardship can be deceptively complex53. In a patient with shock, balancing the dueling forces of volume responsiveness assessment and timely volume resuscitation with the risks associated with fluid overload represents a highly complex Goldilocks scenario that requires clinicians to have high clinical precision, essentially pivoting ‘on a dime’ from a strategy of aggressive volume expansion to one of rapid volume removal36,54,55.
Despite the complexities of this decision process, limited prediction tools for fluid overload are available to assist clinicians at the ICU bedside. As such, real-time recognition identifying when to make the shift from resuscitation to de-resuscitation has the potential to improve bedside management. However, to go beyond the hourly assessment of ‘Ins and Outs’ would require accurate prediction of future fluid overload risk and the adverse events associated with it, in the time-dependent context of intervention delivery (e.g., diuretics). In such a scenario, an algorithm would be able to accurately interpret a septic patient who is 3 L positive 24 h after fluid resuscitation initiation as being in a ‘green zone’ (i.e., appropriately resuscitated). However, 24 h later, if the same patient is 4 L positive while off vasopressors and with down-trending sepsis markers, the algorithm could alert clinicians that the patient is now in a ‘yellow zone’ where interventions like diuretic therapy and fluid reductions are required to reduce acute kidney injury and intubation risk. This type of real-time predictive capability could support continuous clinician decision-making but requires evaluation outside the scope of our current study.
Fluid overload also presents an important test case for exploring and adapting artificial intelligence methods to ICU problems, particularly those related to ICU medication use. Fluid overload represents a uniquely intervenable event in the ICU. Intervenable events share three key characteristics: they are predictable, preventable, and otherwise associated with poor outcomes. The results of our study, and others, indicate that fluid overload can be predicted with modeling of some kind, especially given its ability to be quantitatively defined56–58. Fluid overload has been associated with poor outcomes including acute kidney injury, delirium, poor respiratory outcomes, prolonged length of stay, and potentially increasing mortality2,37,59–62. Evidence demonstrates the timely recognition and management of fluid overload is feasible and is associated with reduced mortality and time in the ICU3,5,63,64. Notably, fluid stewardship has been adopted by critical care pharmacists as a key component of comprehensive medication management5,6,65. As such, these results may support other investigations as they identify patients in whom it is safe to initiate de-resuscitation or, importantly, who never needed that degree of fluid volume initially. At the bedside, they may prompt clinicians to be more targeted in therapies initiated or more aggressive in curtailing early ‘hidden’ fluids to avoid the complications of fluid overload and/or the need for a highly interventional period of de-resuscitation (e.g., diuretics, dialysis). Artificial intelligence may be particularly well suited to bolster these efforts, and thus while feature importance analyses cannot provide a foundation for causal inference, they may guide such future investigations.
Our study has limitations. Our patient sample may have been too small to demonstrate superiority of the
machine learning approaches compared to traditional regression, and no validation in a separate, external data-
set was undertaken at this juncture66. Future studies applying this approach to alternative, larger datasets (e.g.,
MIMIC-III) should be considered to examine the external validity of our ndings. Although MICE is the estab-
lished approach to address missingness in cohort studies that includes variables that are a composite of several
individual patient-specic values (e.g. SOFA), it is possible that some of the values in the imputed datasets that
Content courtesy of Springer Nature, terms of use apply. Rights reserved
7
Vol.:(0123456789)
Scientic Reports | (2023) 13:19654 | https://doi.org/10.1038/s41598-023-46735-3
www.nature.com/scientificreports/
represented our new ground truth may not have been accurate67. Bias may exist due to which patients had fluid
balance data available. Other predictors for fluid overload not included in our models may exist68. By relying on
prediction data derived in the first 24 h of ICU admission, we did not fully capture the dynamic nature of critical
illness over the entire three-day ICU period before fluid overload occurred. Future time-dependent evaluations
of changing features employing unsupervised learning techniques may yield novel insights.
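The sample-size limitation noted above can be made concrete with a rough calculation. The sketch below is not part of the study's analysis: it applies the Hanley and McNeil (1982) approximation for the standard error of an AUROC to the figures reported in the abstract (49 of 391 patients with fluid overload; AUROC 0.78 for XGBoost and 0.70 for logistic regression). It assumes, optimistically, that all 391 patients were used to evaluate each model; a held-out test split would give wider intervals. The function names are illustrative only.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUROC (Hanley & McNeil, 1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def wald_ci(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation 95% interval, clipped to [0, 1]."""
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return (max(0.0, auc - z * se), min(1.0, auc + z * se))


# Counts from the abstract; using the full cohort as the evaluation
# set is an assumption made for this illustration only.
N_POS = 49
N_NEG = 391 - 49

xgb_ci = wald_ci(0.78, N_POS, N_NEG)  # XGBoost
lr_ci = wald_ci(0.70, N_POS, N_NEG)   # logistic regression

# True when the two intervals share at least one point.
intervals_overlap = xgb_ci[0] <= lr_ci[1] and lr_ci[0] <= xgb_ci[1]

print("XGBoost 95%% CI: %.3f to %.3f" % xgb_ci)
print("Logistic regression 95%% CI: %.3f to %.3f" % lr_ci)
print("Intervals overlap:", intervals_overlap)
```

Under these assumptions the two intervals overlap, which is consistent with the observation that a cohort with 49 events may be too small to separate the models. Overlapping intervals are not a formal test: both models were scored on the same patients, so a paired comparison such as the DeLong test would be the more appropriate tool.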
Conclusion
Fluid overload is an important, intervenable event in the ICU population. Incorporation of medication-related
variables and artificial intelligence has demonstrated promise to improve prediction that may ultimately guide
timely intervention and mitigation of this ICU complication; however, further evaluation of their comparative
advantages over traditional modeling techniques remains warranted.
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on
reasonable request.
Received: 1 June 2023; Accepted: 4 November 2023
References
1. Carr, J. R. et al. Fluid stewardship of maintenance intravenous fluids. J. Pharm. Pract. 897, 190 (2021).
2. Hawkins, W. A. et al. Fluid stewardship during critical illness: A call to action. J. Pharm. Pract. 33(6), 863–873 (2020).
3. Bissell, B. D. et al. Impact of protocolized diuresis for de-resuscitation in the intensive care unit. Crit. Care 24(1), 70 (2020).
4. Jones, T. W. et al. Early diuretics for de-resuscitation in septic patients with left ventricular dysfunction. Clin. Med. Insights Cardiol.
16, 11795468221095876 (2022).
5. Hawkins, W. A. et al. From theory to bedside: implementation of fluid stewardship in a medical ICU pharmacy practice. Am. J.
Health Syst. Pharm. 79(12), 984–992 (2022).
6. Bissell, B. D. et al. A narrative review of pharmacologic de-resuscitation in the critically ill. J. Crit. Care 59, 156–162 (2020).
7. Messmer, A. S. et al. Fluid overload phenotypes in critical illness-a machine learning approach. J. Clin. Med. 11(2), 1 (2022).
8. Zhang, Z., Ho, K. M. & Hong, Y. Machine learning for the prediction of volume responsiveness in patients with oliguric acute
kidney injury in critical care. Crit. Care 23(1), 112 (2019).
9. Raghu, A., Komorowski, M., Celi, L. A., Szolovits, P., & Ghassemi, M. Continuous state-space models for optimal sepsis treatment:
A deep reinforcement learning approach. In Machine Learning for Healthcare Conference 2017; pp. 147–163.
10. Komorowski, M., Celi, L. A., Badawi, O., Gordon, A. C. & Faisal, A. A. The artificial intelligence clinician learns optimal treatment
strategies for sepsis in intensive care. Nat. Med. 24(11), 1716–1720 (2018).
11. Olney, W. J. et al. Medication regimen complexity score as an indicator of fluid balance in critically ill patients. J. Pharm. Pract.
897, 190 (2021).
12. Sikora, A. et al. Impact of pharmacists to improve patient care in the critically ill: A large multicenter analysis using meaningful
metrics with the medication regimen complexity-ICU (MRC-ICU) score. Crit. Care Med. 50(9), 1318–1328 (2022).
13. Newsome, A. S. et al. Multicenter validation of a novel medication-regimen complexity scoring tool. Am. J. Health Syst. Pharm.
77(6), 474–478 (2020).
14. Newsome, A. S. et al. Characterization of changes in medication complexity using a modified scoring tool. Am. J. Health Syst.
Pharm. 76(Supplement 4), S92–S95 (2019).
15. Gwynn, M. E. et al. Development and validation of a medication regimen complexity scoring tool for critically ill patients. Am. J.
Health Syst. Pharm. 76(Suppl 2), S34–S40 (2019).
16. Al-Mamun, M. A., Brothers, T. & Newsome, A. S. Development of machine learning models to validate a medication regimen
complexity scoring tool for critically ill patients. Ann. Pharmacother. 55(4), 421–429 (2021).
17. Smith, S. E., Shelley, R. & Sikora, A. Medication regimen complexity vs patient acuity for predicting critical care pharmacist
interventions. Am. J. Health Syst. Pharm. 79(8), 651–655 (2022).
18. Webb, A. J., Rowe, S. & Newsome, A. S. A descriptive report of the rapid implementation of automated MRC-ICU calculations in
the EMR of an academic medical center. Am. J. Health Syst. Pharm. 79(12), 979–983 (2022).
19. Newsome, A. S. et al. Medication regimen complexity is associated with pharmacist interventions and drug-drug interactions: A
use of the novel MRC-ICU scoring tool. J. Am. Coll. Clin. Pharm. 3(1), 47–56 (2020).
20. Sanchez, P. et al. Causal machine learning for healthcare and precision medicine. R Soc. Open Sci. 9(8), 220638 (2022).
21. Iwase, S. et al. Prediction algorithm for ICU mortality and length of stay using machine learning. Sci. Rep. 12(1), 12912 (2022).
22. Beil, M. et al. On predictions in critical care: The individual prognostication fallacy in elderly patients. J. Crit. Care 61, 34–38
(2021).
23. Lovejoy, C. A., Buch, V. & Maruthappu, M. Artificial intelligence in the intensive care unit. Crit. Care 23(1), 7 (2019).
24. Gutierrez, G. Artificial intelligence in the intensive care unit. Crit. Care 24(1), 101 (2020).
25. Goh, K. H. et al. Artificial intelligence in sepsis early prediction and diagnosis using unstructured data in healthcare. Nat. Commun.
12(1), 711 (2021).
26. Hyland, S. L. et al. Early prediction of circulatory failure in the intensive care unit using machine learning. Nat. Med. 26(3), 364–373
(2020).
27. DeGrave, A. J., Janizek, J. D. & Lee, S. I. AI for radiographic COVID-19 detection selects shortcuts over signal. medRxiv 1, 1 (2020).
28. Nguyen, D., Ngo, B. & vanSonnenberg, E. AI in the intensive care unit: Up-to-date review. J. Intensive Care Med. 36(10), 1115–1123
(2021).
29. Yoon, J. H., Pinsky, M. R. & Clermont, G. Artificial intelligence in critical care medicine. Crit. Care 26(1), 75 (2022).
30. Farion, K. J. et al. Comparing predictions made by a prediction model, clinical score, and physicians: Pediatric asthma exacerba-
tions in the emergency department. Appl. Clin. Inf. 4(3), 376–391 (2013).
31. Feng, J. Z. et al. Comparison between logistic regression and machine learning algorithms on survival prediction of traumatic
brain injuries. J. Crit. Care 54, 110–116 (2019).
32. World Medical Association. World Medical Association Declaration of Helsinki: Ethical principles for medical research involving
human subjects. JAMA 312, 2191–2194 (2023).
33. Von Elm, E. A. D. et al. STROBE Initiative: Strengthening the reporting of observational studies in epidemiology (STROBE) state-
ment: Guidelines for reporting observational studies. BMJ 335, 806–808 (2007).
34. Bouchard, J. et al. Fluid accumulation, survival and recovery of kidney function in critically ill patients with acute kidney injury.
Kidney Int. 76(4), 422–427 (2009).
35. Carr, J. R. et al. Fluid stewardship of maintenance intravenous fluids. J. Pharm. Pract. 35(5), 769–782 (2022).
36. Malbrain, M. et al. Principles of fluid management and stewardship in septic shock: It is time to consider the four D’s and the four
phases of fluid therapy. Ann. Intensive Care 8(1), 66 (2018).
37. Claure-Del Granado, R. & Mehta, R. L. Fluid overload in the ICU: Evaluation and management. BMC Nephrol. 17(1), 109 (2016).
38. O’Connor, M. E. & Prowle, J. R. Fluid overload. Crit. Care Clin. 31(4), 803–821 (2015).
39. Chen, T., & Guestrin, C. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining [Internet]. New York, NY, USA: ACM; 2016. p. 785–94. Available from:
https://doi.org/10.1145/2939672.2939785.
40. Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20(3), 273–297 (1995).
41. Ho, T. K. Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition. p. 278–82
(1995).
42. Liaw, A. & Wiener, M. Classification and regression by Random Forest. R News 2(3), 18–22 (2002).
43. Meyer, D., Dimitriadou, E., & Hornik, K., et al. Miscellaneous functions of the Department of Statistics and Probability Theory
Group (2023). R package version 1.7–13. https://CRAN.R-project.org/package=e1071.
44. Chen, T., He, T., Benesty, M., et al. XGBoost: Extreme gradient boosting. R package version 1.7.5.1.
https://CRAN.R-project.org/package=xgboost.
45. Li, K. H., Raghunathan, T. E. & Rubin, D. B. Large-sample significance levels from multiply imputed data using moment-based
statistics and an F reference distribution. J. Am. Stat. Assoc. 86, 1065–1073 (1991).
46. Rubin, D. B. Multiple imputation for nonresponse in surveys (Wiley-Interscience, Hoboken, NJ, 2004).
47. Topol, E. J. Deep medicine: how artificial intelligence can make healthcare human again. First edition (Basic
Books, New York, 2019).
48. Kahneman, D., Sibony, O., & Sunstein, C. R. Noise: A flaw in human judgment. First edition (Little, Brown Spark, New
York, 2021).
49. D’Hondt, E. et al. Identifying and evaluating barriers for the implementation of machine learning in the intensive care unit. Com-
mun. Med. (Lond) 2(1), 162 (2022).
50. van de Sande, D. et al. Moving from bytes to bedside: A systematic review on the use of artificial intelligence in the intensive care
unit. Intensive Care Med. 47(7), 750–760 (2021).
51. Moss, L. et al. Demystifying the black box: The importance of interpretability of predictive models in neurocritical care. Neurocrit.
Care 37(Suppl 2), 185–191 (2022).
52. The Lancet Respiratory Medicine. Opening the black box of machine learning. Lancet Respir. Med. 6(11), 801 (2018).
53. Malbrain, M., Martin, G. & Ostermann, M. Everything you need to know about deresuscitation. Intensive Care Med. 48(12),
1781–1786 (2022).
54. Gelbart, B. et al. Fluid accumulation in mechanically ventilated, critically ill children: retrospective cohort study of prevalence and
outcome. Pediatr. Crit. Care Med. 23(12), 990–998 (2022).
55. National Heart, Lung, and Blood Institute Acute Respiratory Distress Syndrome (ARDS) Clinical Trials Network, Wiedemann, H. P. et al. Comparison of
two fluid-management strategies in acute lung injury. N. Engl. J. Med. 354(24), 2564–2575 (2006).
56. Gamble, K. C. et al. Hidden fluids in plain sight: Identifying intravenous medication classes as contributors to intensive care unit
fluid intake. Hosp. Pharm. 57(2), 230–236 (2022).
57. Branan, T. et al. Association of hidden fluid administration with development of fluid overload reveals opportunities for targeted
fluid minimization. SAGE Open Med. 8, 2050312120979464 (2020).
58. Mitchell, K. H. et al. Volume overload: Prevalence, risk factors, and functional outcome in survivors of septic shock. Ann. Am.
Thorac. Soc. 12(12), 1837–1844 (2015).
59. Ouchi, A. et al. Association between fluid overload and delirium/coma in mechanically ventilated patients. Acute Med. Surg. 7(1),
e508 (2020).
60. Murphy, C. V. et al. The importance of fluid management in acute lung injury secondary to septic shock. Chest 136(1), 102–109
(2009).
61. Boyd, J. H. et al. Fluid resuscitation in septic shock: A positive fluid balance and elevated central venous pressure are associated
with increased mortality. Crit. Care Med. 39(2), 259–265 (2011).
62. Woodward, C. W. et al. Fluid overload associates with major adverse kidney events in critically ill patients with acute kidney injury
requiring continuous renal replacement therapy. Crit. Care Med. 47(9), e753–e760 (2019).
63. Silversides, J. A., Perner, A. & Malbrain, M. Liberal versus restrictive fluid therapy in critically ill patients. Intensive Care Med.
45(10), 1440–1442 (2019).
64. Goldstein, S. et al. Pharmacological management of fluid overload. Br. J. Anaesth. 113(5), 756–763 (2014).
65. Silversides, J. A. et al. Fluid management and deresuscitation practices: A survey of critical care physicians. J. Intensive Care Soc.
21(2), 111–118 (2020).
66. Burkov, A. The hundred-page machine learning book (Andriy Burkov, Quebec City, Canada, 2019).
67. O’Keefe, A. G., Farewell, D. M., Tom, B. D. M. & Farewell, V. T. Multiple imputation of missing composite outcomes in longitudinal
data. Stat. Biosci. 8(2), 310–332 (2016).
68. Qin, X. et al. A deep learning model to identify the fluid overload status in critically ill patients based on chest X-ray images. Pol.
Arch. Intern. Med. 133(2), 1 (2023).
Acknowledgements
Data acquisition was supported by NC TraCS, funded by Grant Number UL1TR002489 from the National
Center for Advancing Translational Sciences at the National Institutes of Health, and by Data Analytics at the
University of North Carolina Medical Center Department of Pharmacy.
Author contributions
A.S. was responsible for project execution, design, and initial manuscript writing. J.D., D.M., and R.K. provided
critical revisions of the manuscript, data interpretation, and senior-level oversight. M.Y., T.Z., and X.C. handled data
pre-processing and analysis (M.Y., T.Z.) and methodology support and data interpretation (X.C., R.K.). B.M.
served as site coordinator for all data validation and procurement as well as manuscript revisions and data
interpretation. S.S., M.B., and S.R. provided clinical interpretation, results interpretation, and manuscript revision.
Funding
Funding for Drs. Devlin, Murphy, Sikora, Smith, and Kamaleswaran was provided by the Agency for Healthcare
Research and Quality through R21HS028485 and R01HS029009.
Competing interests
The authors declare no competing interests.
Additional information
Supplementary Information e online version contains supplementary material available at https:// doi. org/
10. 1038/ s41598- 023- 46735-3.
Correspondence and requests for materials should be addressed to J.W.D.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access is article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. e images or other third party material in this
article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.
© e Author(s) 2023
... The incorporation of medication data into prediction models for critically ill patients has improved results for prediction of fluid overload and prolonged duration of mechanical ventilation, particularly with the application of supervised machine learning approaches. [9,10] Additionally, a pilot study of six machine learning methods also showed that incorporation of medication data and the medication regimen complexity-intensive care unit (MRC-ICU) score improved mortality prediction, and adding MRC-ICU to severity of illness improved traditional regression as well. [11] These examples offer credence to the concept that incorporating information on medication regimens is useful in predicting both shot-term and long-term outcomes for ICU patients. ...
... These methods have been previously published. [9,17] ...
... Despite the potential impact of AI and machine learning approaches, it is important to consider whether performance of these models significantly improves upon traditional models that are simpler and more transparent, considering the resources required to implement these approaches. [9,42] This study also highlights the need for prediction models to benchmark against existing clinical standards. ...
Preprint
Full-text available
Background: In critically ill patients, complex relationships exist among patient disease factors, medication management, and mortality. Considering the potential for nonlinear relationships and the high dimensionality of medication data, machine learning and advanced regression methods may offer advantages over traditional regression techniques. The purpose of this study was to evaluate the role of different modeling approaches incorporating medication data for mortality prediction. Methods: This was a single-center, observational cohort study of critically ill adults. A random sample of 991 adults admitted ≥ 24 hours to the intensive care unit (ICU) from 10/2015 to 10/2020 were included. Models to predict hospital mortality at discharge were created. Models were externally validated against a temporally separate dataset of 4,878 patients. Potential mortality predictor variables (n=27, together with 14 indicators for missingness) were collected at baseline (age, sex, service, diagnosis) and 24 hours (illness severity, supportive care use, fluid balance, laboratory values, MRC-ICU score, and vasopressor use) and included in all models. The optimal traditional (equipped with linear predictors) logistic regression model and optimal advanced (equipped with nature splines, smoothing splines, and local linearity) logistic regression models were created using stepwise selection by Bayesian information criterion (BIC). Supervised, classification-based ML models [e.g., Random Forest, Support Vector Machine (SVM), and XGBoost] were developed. Area under the receiver operating characteristic (AUROC), positive predictive value (PPV), and negative predictive value (NPV) were compared among different mortality prediction models. Results: A model including MRC-ICU in addition to SOFA and APACHE II demonstrated an AUROC of 0.83 for hospital mortality prediction, compared to AUROCs of 0.72 and 0.81 for APACHE II and SOFA alone. 
Machine learning models based on Random Forest, SVM, and XGBoost demonstrated AUROCs of 0.83, 0.85, and 0.82, respectively. Accuracy of traditional regression models was similar to that of machine learning models. MRC-ICU demonstrated a moderate level of feature importance in both XGBoost and Random Forest. Across all ten models, performance was lower on the validation set. Conclusions: While medication data were not included as a significant predictor in regression models, addition of MRC-ICU to severity of illness scores (APACHE II and SOFA) improved AUROC for mortality prediction. Machine learning methods did not improve model performance relative to traditional regression methods.
... Compared with logistic regression, random forest tends to yield lower recall and higher precision but achieves much better overall discrimination ability based on AUC ROC. Overall, the XGBoost model likely exhibits advantages, consistent with previous studies (e.g., [ 25 ]), over the other models through learning complex nonlinear relationships, employing regularization to prevent overfitting, and better opportunities for finetuning. Also notable is that the final tuned XGBoost models' characteristics suggest that the predictive ability is better for Asian, Black, and Hispanic individuals than for non-Hispanic white individuals. ...
Article
Full-text available
Background: Acute ischemic stroke is a leading cause of death in the United States. Identifying patients with stroke at high risk of mortality is crucial for timely intervention and optimal resource allocation. This study aims to develop and validate machine learning-based models to predict in-hospital mortality risk for intensive care unit (ICU) patients with acute ischemic stroke and identify important associated factors. Methods: Our data include 3,489 acute ischemic stroke admissions to the ICU for patients not discharged or dead within 48 h from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database. Demographic, hospitalization type, procedure, medication, intake (intravenous and oral), laboratory, vital signs, and clinical assessment [e.g., Glasgow Coma Scale Scores (GCS)] during the initial 48 h of admissions were used to predict in-hospital mortality after 48 h of ICU admission. We explored 3 machine learning models (random forests, logistic regression, and XGBoost) and applied Bayesian optimization for hyperparameter tuning. Important features were identified using learned coefficients. Results: Experiments show that XGBoost tuned for area under the receiver operating characteristic curve (AUC ROC) was the best performing model (AUC ROC 0.86, F1 0.52), compared to random forests (AUC ROC 0.85, F1 0.47) and logistic regression (AUC ROC 0.75, F1 0.40). Top features include GCS, blood urea nitrogen, and Richmond RASS score. The model also demonstrates good fairness for males versus females and across racial/ethnic groups. Conclusions: Machine learning has shown great potential in predicting in-hospital mortality risk for people with acute ischemic stroke in the ICU setting. However, more ethical considerations need to be applied to ensure that performance differences across different racial/ethnic groups will not exacerbate existing health disparities and will not harm historically marginalized populations.
... In recent years, there has been growing interest in integrating dynamic variables and Machine Learning (ML) approaches to improve mortality prediction in the ICU [15]. The ML models offer advantages over classical statistical models in handling complex, high-dimensional data typical of ICU environments, providing more accurate and dynamic predictions by capturing nonlinear relationships and interactions that traditional models may miss [16]. ...
Article
Full-text available
Background: A machine learning prognostic mortality scoring system was developed to address challenges in patient selection for clinical trials within the Intensive Care Unit (ICU) environment. The algorithm incorporates Red blood cell Distribution Width (RDW) data and other demographic characteristics to predict ICU mortality alongside existing ICU mortality scoring systems like Simplified Acute Physiology Score (SAPS). Methods: The developed algorithm, defined as a Mixed-effects logistic Random Forest for binary data (MixRFb), integrates a Random Forest (RF) classification with a mixed-effects model for binary outcomes, accounting for repeated measurement data. Performance comparisons were conducted with RF and the proposed MixRFb algorithms based solely on SAPS scoring, with additional evaluation using a descriptive receiver operating characteristic curve incorporating RDW’s predictive mortality ability. Results: MixRFb, incorporating RDW and other covariates, outperforms the SAPS-based variant, achieving an area under the curve of 0.882 compared to 0.814. Age and RDW were identified as the most significant predictors of ICU mortality, as reported by the variable importance plot analysis. Conclusions: The MixRFb algorithm demonstrates superior efficacy in predicting in-hospital mortality and identifies age and RDW as primary predictors. Implementation of this algorithm could facilitate patient selection for clinical trials, thereby improving trial outcomes and strengthening ethical standards. Future research should focus on enriching algorithm robustness, expanding its applicability across diverse clinical settings and patient demographics, and integrating additional predictive markers to improve patient selection capabilities.
... It involves splitting the dataset into training and testing samples to iteratively train the model and evaluate its performance. These algorithms can quickly identify variable features and create intricate predictive models, leading to more accurate predictions of the target variable (Sikora et al., 2023). ...
Article
Full-text available
Despite extensive research on the impact of individual and environmental factors on negative risk-taking behaviors, the understanding of these factors’ influence on positive risk-taking, and how it compares to negative risk taking, remains limited. This research employed machine-learning techniques to identify shared and unique predictors across individual, family, and peer domains. Participants (N = 1012; 44% girls; Mage = 14.60 years, SD = 1.16 years) were drawn from three public middle schools in a large city in southern China (with 49.2% in grade 7 and 50.8% in grade 11). The findings indicate that positive risk-taking is significantly associated with general risk propensity, self-control, and negative parenting by father, while negative risk-taking is correlated with self-control, deviant peer affiliations, and peer victimization. Paternal negative parenting triggered positive risk-taking in boys, whereas self-control had a greater impact on girls. For negative risk-taking, boys were more affected by peer victimization, while girls were more influenced by deviant peer affiliations. This study further demonstrates that as progress from junior to senior high school, peer influence grows more significant in predicting positive risk taking; deviant peer affiliations exert a persistent pivotal influence, future positive time perspective replaces life satisfaction, and paternal negative parenting becomes increasingly impactful in predicting negative risk taking.
... Equidade em Aprendizado de Máquina e Sistemas de Recomendação: A consciência sobre os impactos sociais negativos oriundos da aplicação de algoritmos de aprendizado de máquina em contextos decisórios tem crescido [Sudakov andTitov 2023, Emirhüseyinoglu et al. 2023]. Como resposta, várias métricas e conceitos de justiça foram propostos para tarefas de aprendizado de máquina, englobando classificação [Dargan et al. 2024, Feng et al. 2023, regressão [Xue et al. 2024, Sikora et al. 2023, ordenação [Bhargava et al. 2022, Zehlike et al. 2022], e escolha de conjuntos [Vashistha et al. 2023, Celis et al. 2016]. Essas propostas se dividem em duas categorias principais: avaliações de justiça em nível individual e em nível de grupo [Dwork et al. 2011]. ...
Conference Paper
Este estudo investiga a equidade em sistemas de recomendação utilizando o dataset MovieLens, aplicando estratégias de filtragem colaborativa: ALS, KNN e NMF. Avaliamos a injustiça em diferentes configurações de agrupamento: Gênero, Idade, Avaliações e Aglomerativo. Os resultados indicam variações significativas de injustiça entre as estratégias, com o método Aglomerativo destacando-se por apresentar os maiores níveis de injustiça do grupo na maioria das abordagens. Esta análise sugere a necessidade de uma seleção cuidadosa da estratégia de filtragem e do método de agrupamento para promover sistemas de recomendação mais justos e inclusivos, destacando a importância de considerar múltiplas dimensões de injustiça na concepção destes sistemas.
Article
Background Fluid overload (FO) in the intensive care unit (ICU) is common, serious, and may be preventable. Intravenous medications (including administered volume) are a primary cause for FO but are challenging to evaluate as a FO predictor given the high frequency and time‐dependency of their use and other factors affecting FO. We sought to employ unsupervised machine learning methods to uncover medication administration patterns correlating with FO. Methods This retrospective cohort study included 927 adults admitted to an ICU for ≥72 h. FO was defined as a positive fluid balance ≥7% of admission body weight. After reviewing medication administration record data in 3‐h periods, medication exposure was categorized into clusters using principal component analysis (PCA) and Restricted Boltzmann Machine (RBM). Medication regimens of patients with and without FO were compared within clusters to assess their temporal association with FO. Results FO occurred in 127 (13.7%) of 927 included patients. Patients received a median (interquartile range) of 31(13–65) discrete intravenous medication administrations over the 72‐h period. Across all 47,803 intravenous medication administrations, 10 unique medication clusters, containing 121 to 130 medications per cluster, were identified. The mean number of Cluster 7 medications administered was significantly greater in the FO cohort compared with patients without FO (25.6 vs.10.9, p < 0.0001). A total of 51 (40.2%) of 127 unique Cluster 7 medications were administered in more than five different 3‐h periods during the 72‐h study window. The most common Cluster 7 medications included continuous infusions, antibiotics, and sedatives/analgesics. 
Addition of Cluster 7 medications to an FO prediction model including the Acute Physiology and Chronic Health Evaluation (APACHE) II score and receipt of diuretics improved model predictiveness from an area under the receiver operating characteristic (AUROC) curve of 0.719 to 0.741 (p = 0.027). Conclusions Using machine learning approaches, a unique medication cluster was strongly associated with FO. Incorporation of this cluster improved the ability to predict FO compared to traditional prediction models. Integration of this approach into real‐time clinical applications may improve early detection of FO to facilitate timely intervention.
Article
Objective Common data models provide a standard means of describing data for artificial intelligence (AI) applications, but this process has never been undertaken for medications used in the intensive care unit (ICU). We sought to develop a common data model (CDM) for ICU medications to standardize the medication features needed to support future ICU AI efforts. Materials and Methods A 9-member, multi-professional team of ICU clinicians and AI experts conducted a 5-round modified Delphi process employing conference calls, web-based communication, and electronic surveys to define the most important medication features for AI efforts. Candidate ICU medication features were generated through group discussion and then independently scored by each team member based on relevance to ICU clinical decision-making and feasibility for collection and coding. A key consideration was to ensure the final ontology both distinguished unique medications and met Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles. Results Using a list of 889 ICU medications, the team initially generated 106 different medication features, and 71 were ranked as being core features for the CDM. Through this process, 106 medication features were assigned to 2 key feature domains: drug product-related (n = 43) and clinical practice-related (n = 63). Each feature included a standardized definition and suggested response values housed in the electronic data library. This CDM for ICU medications is available online. Conclusion The CDM for ICU medications represents an important first step for the research community focused on exploring how AI can improve patient outcomes and will require ongoing engagement and refinement.
Article
Introduction: Recent studies have highlighted adverse outcomes of fluid overload in critically ill patients. Therefore, early recognition is essential for the management of these patients. Objectives: Our purpose is to propose a deep learning (DL) model to explore noninvasive chest X-ray (CXR) image information associated with fluid overload status. Patients and methods: We collected study data from the Medical Information Mart for Intensive Care IV (MIMIC-IV, v1.0) and MIMIC Chest X-Ray (v2.0.0) databases for modeling, and from our hospital for testing. Extravascular lung water index (ELWI) > 10 mL/kg and global end diastolic volume index (GEDI) > 700 mL/m² were used as threshold values for fluid overload status. A DL model with a transfer learning strategy was proposed to predict fluid overload status through CXR images in comparison with clinical and semantic label models. Results: In both the primary cohort and the test cohort, the DL models showed relatively strong performance for predicting the ELWI status (AUROC: 0.896, 95% CI 0.819-0.972 and 0.718, 0.594-0.822, respectively) and GEDI status (AUROC: 0.814, 95% CI 0.699-0.930 and 0.649, 0.510-0.787, respectively), which were better than clinical and semantic label models. Additionally, a visualization technique to determine the important areas of features in the input images was adopted. Conclusions: As CXR is routinely used in the intensive care unit, a simple, fast, low-cost, and noninvasive DL model can be regarded as an effective supplementary tool for identifying fluid overload, and should be widely adopted in clinical applications, especially when invasive hemodynamic monitoring is not available.
Article
Background Despite apparent promise and the availability of numerous examples in the literature, machine learning models are rarely used in practice in ICUs. This mismatch suggests that there are poorly understood barriers preventing uptake, which we aim to identify. Methods We begin with a qualitative study with 29 interviews of 40 Intensive Care Unit-, hospital- and MedTech company staff members. As a follow-up to the study, we attempt to quantify some of the technical issues raised. To perform experiments we selected two models based on criteria such as medical relevance. Using these models we measure the loss of performance in predictive models due to drift over time, change of available patient features, scarceness of data, and deploying a model in a different context to the one it was built in. Results The qualitative study confirms our assumptions on the potential of AI-driven analytics for patient care, as well as showing the prevalence and type of technical blocking factors that are responsible for its slow uptake. The experiments confirm that each of these issues can cause important loss of predictive model performance, depending on the model and the issue. Conclusions Based on the qualitative study and quantitative experiments we conclude that more research on practical solutions to enable AI-driven innovation in Intensive Care Units is needed. Furthermore, the generally poor situation with respect to public, usable implementations of predictive models would appear to limit the possibilities for both the scientific repeatability of the underlying research and the transfer of this research into practice.
Article
Causal machine learning (CML) has experienced increasing popularity in healthcare. Beyond the inherent capabilities of adding domain knowledge into learning systems, CML provides a complete toolset for investigating how a system would react to an intervention (e.g. outcome given a treatment). Quantifying effects of interventions allows actionable decisions to be made while maintaining robustness in the presence of confounders. Here, we explore how causal inference can be incorporated into different aspects of clinical decision support systems by using recent advances in machine learning. Throughout this paper, we use Alzheimer’s disease to create examples for illustrating how CML can be advantageous in clinical scenarios. Furthermore, we discuss important challenges present in healthcare applications such as processing high-dimensional and unstructured data, generalization to out-of-distribution samples and temporal relationships, that despite the great effort from the research community remain to be solved. Finally, we review lines of research within causal representation learning, causal discovery and causal reasoning which offer the potential towards addressing the aforementioned challenges.
Article
Machine learning can predict outcomes and determine variables contributing to precise prediction, and can thus classify patients with different risk factors of outcomes. This study aimed to investigate the predictive accuracy for mortality and length of stay in intensive care unit (ICU) patients using machine learning, and to identify the variables contributing to the precise prediction or classification of patients. Patients (n = 12,747) admitted to the ICU at Chiba University Hospital were randomly assigned to the training and test cohorts. After learning using the variables on admission in the training cohort, the area under the curve (AUC) was analyzed in the test cohort to evaluate the predictive accuracy of the supervised machine learning classifiers, including random forest (RF) for outcomes (primary outcome, mortality; secondary outcome, length of ICU stay). The rank of the variables that contributed to the machine learning prediction was confirmed, and cluster analysis of the patients with risk factors of mortality was performed to identify the important variables associated with patient outcomes. Machine learning using RF revealed a high predictive value for mortality, with an AUC of 0.945 (95% confidence interval [CI] 0.922–0.977). In addition, RF showed high predictive value for short and long ICU stays, with AUCs of 0.881 (95% CI 0.876–0.908) and 0.889 (95% CI 0.849–0.936), respectively. Lactate dehydrogenase (LDH) was identified as a variable contributing to the precise prediction in machine learning for both mortality and length of ICU stay. LDH was also identified as a contributing variable to classify patients into sub-populations based on different risk factors of mortality. The machine learning algorithm could predict mortality and length of stay in ICU patients with high accuracy. LDH was identified as a contributing variable in mortality and length of ICU stay prediction and could be used to classify patients based on mortality risk.
Article
Introduction De-resuscitation practices in septic patients with heart failure (HF) are not well characterized. This study aimed to determine if diuretic initiation within 48 hours of intensive care unit (ICU) admission was associated with a positive fluid balance and patient outcomes. Methods This single-center, retrospective cohort study included adult patients with an established diagnosis of HF admitted to the ICU with sepsis or septic shock. The primary outcome was the incidence of positive fluid balance in patients receiving early (<48 hours) versus late (>48 hours) initiation of diuresis. Secondary outcomes included hospital mortality, ventilator-free days, and hospital and ICU length of stay. Continuous variables were assessed using independent t-test or Mann-Whitney U, while categorical variables were evaluated using the Pearson Chi-squared test. Results A total of 101 patients were included. Positive fluid balance was significantly reduced at 72 hours (−139 mL vs 4370 mL, P < .001). The duration of mechanical ventilation (4 vs 5 days, P = .129), ventilator-free days (22 vs 18.5 days, P = .129), and in-hospital mortality (28 [38%] vs 12 [43%], P = .821) were similar between groups. In a subgroup analysis excluding patients not receiving renal replacement therapy (RRT) (n = 76), early diuretic use was associated with lower incidence of mechanical ventilation (41 [73.2%] vs 20 [100%], P = .01) and reduced duration of mechanical ventilation (4 vs 8 days, P = .018). Conclusions Diuretic use within 48 hours of ICU admission in septic patients with HF resulted in a lower incidence of positive fluid balance. Early diuresis in this unique patient population warrants further investigation.
Article
Neurocritical care patients are a complex patient population, and to aid clinical decision-making, many models and scoring systems have previously been developed. More recently, techniques from the field of machine learning have been applied to neurocritical care patient data to develop models with high levels of predictive accuracy. However, although these recent models appear clinically promising, their interpretability has often not been considered and they tend to be black box models, making it extremely difficult to understand how the model came to its conclusion. Interpretable machine learning methods have the potential to provide the means to overcome some of these issues but are largely unexplored within the neurocritical care domain. This article examines existing models used in neurocritical care from the perspective of interpretability. Further, the use of interpretable machine learning will be explored, in particular the potential benefits and drawbacks that the techniques may have when applied to neurocritical care data. Finding a solution to the lack of model explanation, transparency, and accountability is important because these issues have the potential to contribute to model trust and clinical acceptance, and, increasingly, regulation is stipulating a right to explanation for decisions made by models and algorithms. To ensure that the prospective gains from sophisticated predictive models to neurocritical care provision can be realized, it is imperative that interpretability of these models is fully considered.
Article
Objectives: To describe the prevalence, patterns, explanatory variables, and outcomes associated with fluid accumulation (FA) in mechanically ventilated children. Design: Retrospective cohort study. Setting: Tertiary PICU. Patients: Children mechanically ventilated for greater than or equal to 24 hours. Interventions: None. Measurements and main results: Between July 2016 and July 2021, 1,636 children met eligibility criteria. Median age was 5.5 months (interquartile range [IQR], 0.7-46.5 mo), and congenital heart disease was the most common diagnosis. Overall, by day 7 of admission, the median maximum cumulative FA, as a percentage of estimated admission weight, was 7.5% (IQR, 3.3-15.1) occurring at a median of 4 days after admission. Overall, higher FA was associated with greater duration of mechanical ventilation (MV) (mean difference, 1.17 [95% CI, 1.13-1.22]; p < 0.001), longer intensive care length of stay (LOS) (mean difference, 1.16 [95% CI, 1.12-1.21]; p < 0.001), longer hospital LOS (mean difference, 1.19 [95% CI, 1.13-1.26]; p < 0.001), and increased mortality (odds ratio, 1.31 [95% CI, 1.08-1.59]; p = 0.005). However, these associations depended on the effects of children with extreme values, and there was no increase in risk up to 20% FA, overall, in children following cardiopulmonary bypass and in children in the general ICU. When excluding children with maximum FA of >10%, there was no association with duration of MV (mean difference, 0.99 [95% CI, 0.94-1.04]; p = 0.64) and intensive care or hospital LOS (mean difference, 1.01 [95% CI, 0.96-1.06]; p = 0.70 and 1.01 [95% CI, 0.95-1.08]; p = 0.79, respectively) but an association with reduced mortality (0.71 [95% CI, 0.53-0.97]; p = 0.03). Conclusions: In mechanically ventilated critically ill children, greater maximum FA was associated with longer duration of MV, intensive care LOS, hospital LOS, and mortality.
However, these findings were driven by extreme values of FA of greater than 20%, and up to 10%, there was reduced mortality and no signal of harm.
Article
Objectives: Despite the established role of the critical care pharmacist on the ICU multiprofessional team, critical care pharmacist workloads are likely not optimized in the ICU. Medication regimen complexity (as measured by the Medication Regimen Complexity-ICU [MRC-ICU] scoring tool) has been proposed as a potential metric to optimize critical care pharmacist workload but has lacked robust external validation. The purpose of this study was to test the hypothesis that MRC-ICU is related to both patient outcomes and pharmacist interventions in a diverse ICU population. Design: This was a multicenter, observational cohort study. Setting: Twenty-eight ICUs in the United States. Patients: Adult ICU patients. Interventions: Critical care pharmacist interventions (quantity and type) on the medication regimens of critically ill patients over a 4-week period were prospectively captured. MRC-ICU and patient outcomes (i.e., mortality and length of stay [LOS]) were recorded retrospectively. Measurements and main results: A total of 3,908 patients at 28 centers were included. Following analysis of variance, MRC-ICU was significantly associated with mortality (odds ratio, 1.09; 95% CI, 1.08-1.11; p < 0.01), ICU LOS (β coefficient, 0.41; 95% CI, 0.37-0.45; p < 0.01), total pharmacist interventions (β coefficient, 0.07; 95% CI, 0.04-0.09; p < 0.01), and a composite intensity score of pharmacist interventions (β coefficient, 0.19; 95% CI, 0.11-0.28; p < 0.01). In multivariable regression analysis, increased patient:pharmacist ratio (indicating more patients per clinician) was significantly associated with increased ICU LOS (β coefficient, 0.02; 95% CI, 0.00-0.04; p = 0.02) and reduced quantity (β coefficient, -0.03; 95% CI, -0.04 to -0.02; p < 0.01) and intensity of interventions (β coefficient, -0.05; 95% CI, -0.09 to -0.01).
Conclusions: Increased medication regimen complexity, defined by the MRC-ICU, is associated with increased mortality, LOS, intervention quantity, and intervention intensity. Further, these results suggest that increased pharmacist workload is associated with decreased care provided and worsened patient outcomes, which warrants further exploration into staffing models and patient outcomes.