NORMA eResearch @NCI Library

Machine Learning Driven Recovery Recommendations for Defaulted Loans: A SHAP-Based Decision Support Framework

Natarajan, Sreeram (2025) Machine Learning Driven Recovery Recommendations for Defaulted Loans: A SHAP-Based Decision Support Framework. Masters thesis, Dublin, National College of Ireland.

Abstract

Loan defaults and their recovery directly affect the profitability of financial institutions. Identifying the right recovery strategy for each borrower remains challenging with traditional methods. This research proposes a machine learning (ML)-driven recovery decision support framework that helps the recovery teams of financial institutions make informed and transparent decisions for each defaulted loan. Since publicly available datasets with recovery action details are scarce, this study uses a dataset derived from Lending Club via Kaggle and derives a rule-based recovery action to serve as the target variable. This target variable enables the supervised models to learn recovery patterns from borrower characteristics and loan statuses. To ensure transparency and enhance trust in decision making, the framework integrates SHAP (SHapley Additive exPlanations) to provide borrower-centric explanations of the key features influencing recovery outcomes. Several ML classifiers, including Logistic Regression, Random Forest, and XGBoost, were evaluated alongside imbalance-handling techniques such as SMOTE, class balancing, and undersampling. Among these, XGBoost with class balancing was identified as the best-performing model, with an Accuracy of 0.9986, a Balanced Accuracy of 0.9927, a Macro ROC AUC of 0.9999, and a Macro F1-score of 1.00. Random Forest with class balancing also performed well across metrics. Fit status analysis, carried out by comparing training and test performance, showed no signs of underfitting or overfitting in any of the models. While this study focuses on the evaluation of ML models and explainability using SHAP, future work could involve developing a front-end loan recovery dashboard, deploying the framework in real-time banking environments, and testing its scalability with real-world data from financial institutions.
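The abstract reports Accuracy, Balanced Accuracy, and Macro F1 side by side and mentions class balancing. The following is a minimal standard-library sketch of how these quantities are commonly defined, intended only to show why they can differ on imbalanced data. It is not the thesis's code: the labels `action_a`, `action_b`, `action_c`, the toy predictions, and all function names are hypothetical, and the weighting formula is the common "balanced" heuristic (n_samples / (n_classes * class_count)), which the thesis may or may not have used.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Common 'balanced' heuristic: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def _recall_and_f1(y_true, y_pred, cls):
    """One-vs-rest recall and F1 for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes in y_true."""
    classes = sorted(set(y_true))
    return sum(_recall_and_f1(y_true, y_pred, c)[0] for c in classes) / len(classes)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all observed classes."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(_recall_and_f1(y_true, y_pred, c)[1] for c in classes) / len(classes)


# Hypothetical, heavily imbalanced toy data (not from the thesis).
y_true = ["action_a"] * 90 + ["action_b"] * 8 + ["action_c"] * 2
y_pred = (
    ["action_a"] * 90                      # all majority cases correct
    + ["action_b"] * 6 + ["action_a"] * 2  # 2 of 8 'action_b' missed
    + ["action_c"] + ["action_a"]          # 1 of 2 'action_c' missed
)

print("weights:", balanced_class_weights(y_true))
print("accuracy:", accuracy(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

On this toy data, accuracy is 0.97 while balanced accuracy is 0.75, because the errors fall almost entirely on the two rare classes. Reporting the macro-averaged metrics alongside accuracy, as the abstract does, guards against that kind of masking.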

Item Type: Thesis (Masters)
Supervisors: Onwuegbuche, Faithful Chiagoziem (email: UNSPECIFIED)
Subjects: H Social Sciences > HG Finance > Credit. Debt. Loans.
H Social Sciences > HG Finance > Fintech
T Technology > T Technology (General) > Information Technology > Fintech
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in FinTech
Depositing User: Ciara O'Brien
Date Deposited: 24 Jun 2026 10:45
Last Modified: 24 Jun 2026 10:45
URI: https://norma.ncirl.ie/id/eprint/9394
