NORMA eResearch @NCI Library

Combining Random Forest and SHAP for Interpretability and Insights in Predictive Maintenance

Alves Garcia, Bruna Rafaela (2024) Combining Random Forest and SHAP for Interpretability and Insights in Predictive Maintenance. Masters thesis, Dublin, National College of Ireland.


Abstract

This study investigates the integration of Random Forest models with SHAP (SHapley Additive exPlanations) to address key challenges in Predictive Maintenance (PdM), particularly within resource-constrained environments. The study employs the MetroPT3 dataset, comprising time-series sensor data from a metro air compressor unit, to identify machine failure patterns and potentially inform maintenance decisions. Given the dataset’s highly imbalanced nature and the challenges associated with preserving temporal dependencies, data preprocessing techniques such as rolling statistics, feature engineering, class weighting and correlation-based feature selection were employed to prepare the dataset for analysis. The initial Random Forest model demonstrated good predictive capability, but the integration of SHAP provided feature-level insights that enhanced transparency and informed further refinements. Threshold tuning further optimised the model’s performance, achieving a precision of 0.94, recall of 0.74, and F1-score of 0.83 for the minority class. SHAP analysis highlighted key predictors, such as pressure-related and motor current features, offering actionable insights for maintenance decisions. This work underscores the importance of balancing predictive accuracy with interpretability, addressing both technical and practical challenges in Predictive Maintenance. The findings demonstrate the potential of combining machine learning with Explainable AI to deliver transparent and actionable insights. Limitations related to class imbalance and dataset specificity are discussed, along with proposals for future research.
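As a rough illustration of the threshold-tuning step mentioned in the abstract, the sketch below sweeps candidate decision thresholds over predicted failure probabilities and keeps the one with the highest minority-class F1. It is a minimal sketch, not the thesis code: the function names, the toy labels and scores, and the choice of F1 as the selection criterion are all assumptions made for illustration. The last lines check that the reported precision (0.94) and recall (0.74) are consistent with the reported F1-score via F1 = 2PR / (P + R).

```python
def precision_recall_f1(y_true, scores, threshold):
    """Precision, recall and F1 for the positive (failure) class at one threshold."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted_failure = score >= threshold
        if predicted_failure and label == 1:
            tp += 1
        elif predicted_failure and label == 0:
            fp += 1
        elif not predicted_failure and label == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def best_threshold(y_true, scores, candidates):
    """Candidate threshold with the highest F1 (ties keep the earliest candidate)."""
    return max(candidates, key=lambda t: precision_recall_f1(y_true, scores, t)[2])


# Hypothetical toy data: 1 = failure (minority class), 0 = normal operation.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.55, 0.45, 0.60, 0.15, 0.90]
candidates = [0.3, 0.4, 0.5, 0.6, 0.7]

best = best_threshold(y_true, scores, candidates)
print("best threshold:", best)

# Consistency check on the figures reported in the abstract.
reported_f1 = 2 * 0.94 * 0.74 / (0.94 + 0.74)
print("F1 from reported precision and recall:", round(reported_f1, 2))  # 0.83
```

In practice the scores would come from a fitted classifier's predicted probabilities on a held-out split, and the selection criterion could favour precision or recall depending on the relative cost of false alarms and missed failures.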

Item Type: Thesis (Masters)
Supervisors: Chikkankod, Arjun (email: UNSPECIFIED)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 01 Sep 2025 13:51
Last Modified: 01 Sep 2025 13:51
URI: https://norma.ncirl.ie/id/eprint/8670
