NORMA eResearch @NCI Library

Machine Learning Models to Predict in-Hospital Mortality of Heart Failure Patients in Intensive Care Units: Technical Report

Hennouni, Sophia (2022) Machine Learning Models to Predict in-Hospital Mortality of Heart Failure Patients in Intensive Care Units: Technical Report. Undergraduate thesis, Dublin, National College of Ireland.

Heart failure (HF) is a highly prevalent disorder worldwide, but assessing the mortality risk of HF patients remains a challenge because the predictors are still poorly understood.

This project aimed to apply machine learning techniques to Electronic Health Records (EHRs) to develop a prediction model for in-hospital mortality of HF patients admitted to the Intensive Care Unit (ICU). The main objective was to determine whether a machine learning approach could improve on the reliability of well-established scoring systems such as the Get With The Guidelines-Heart Failure (GWTG-HF) score.

The data used to train and validate the prediction model came from the Medical Information Mart for Intensive Care III (MIMIC-III) database. The analysis was carried out on a cohort of 1,177 ICU patients, with an in-hospital mortality rate of 13.52%.

The feature selection phase followed three approaches: correlation analysis, Principal Component Analysis (PCA) and a decision tree algorithm. The dataset was divided into a training set (70%) and a testing set (30%). Eight machine learning techniques were applied to the three sets of features: k-means clustering, k-nearest neighbours (k-NN), logistic regression, naïve Bayes, support vector machines (SVM), random forests, extreme gradient boosting ensembles (XGBoost) and repeated incremental pruning to produce error reduction (RIPPER). The area under the ROC curve (AUC), recall and F1 scores were computed for each feature set with each technique using stratified 10-fold cross-validation, and the best-performing combination of feature set and model was chosen.
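The abstract does not say how the split, the folds or the metrics were implemented. As an illustration only, the following standard-library Python sketch shows one way a stratified 70/30 split, stratified 10-fold assignment, and the AUC, recall and F1 metrics can be computed. All function names are hypothetical, and the labels are synthetic (1,000 records, 135 positives, chosen only to resemble an imbalanced outcome); none of this is taken from the thesis.

```python
import random
from collections import defaultdict


def _indices_by_class(labels):
    """Group record indices by their class label."""
    groups = defaultdict(list)
    for i, y in enumerate(labels):
        groups[y].append(i)
    return groups


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) with similar class proportions in both."""
    rng = random.Random(seed)
    train, test = [], []
    for idx in _indices_by_class(labels).values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def stratified_folds(labels, k=10, seed=0):
    """Deal each class out round-robin so every fold keeps the class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for idx in _indices_by_class(labels).values():
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def auc(y_true, scores):
    """AUC as the chance a positive is scored above a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_f1(y_true, y_pred):
    """Recall and F1 for the positive class (label 1)."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1


# Synthetic, illustrative labels only: 135 positives among 1,000 records.
labels = [1] * 135 + [0] * 865
train_idx, test_idx = stratified_split(labels, test_fraction=0.3)
folds = stratified_folds([labels[i] for i in train_idx], k=10)
```

In a cross-validation loop, each fold in turn would serve as the validation set while a model is fitted on the remaining nine, and the three metrics would be averaged over the ten runs. Stratification matters with an imbalanced outcome because an unstratified fold could contain very few positive cases, which makes recall and AUC estimates unstable.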

The best performance was obtained with logistic regression on the features selected by the decision tree. The AUC score was 0.7625, which was marginally lower than that of the GWTG-HF scoring system (AUC = 0.7743). The results of this research are therefore largely comparable to an existing gold standard for heart failure risk evaluation. Moreover, anion gap was consistently identified as an important feature although it is not part of the GWTG-HF calculator, which makes it a potentially useful additional predictor.

Overall, the machine learning models achieved predictive performance very close to that of a well-established scoring system. Incorporating the newly identified variable (anion gap) could also improve the prediction accuracy of existing models such as GWTG-HF.

Item Type: Thesis (Undergraduate)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
R Medicine > Healthcare Industry
Divisions: School of Computing > Bachelor of Science (Honours) in Computing
Depositing User: Clara Chan
Date Deposited: 30 Aug 2022 15:55
Last Modified: 30 Aug 2022 15:55
