Singh, Kajal (2024) Evaluating Machine Learning Models for Defect Rate Prediction and Maintenance Classification in Industrial Systems. Masters thesis, Dublin, National College of Ireland.
Abstract
This research explores the application of machine learning (ML) techniques to enhance manufacturing and supply chain efficiency by addressing critical aspects such as defect rate reduction and predictive maintenance. Challenges faced by industries, including logistical inefficiencies, supply chain disruptions, and equipment defects, significantly impact operational costs, revenue, and customer satisfaction. This study adopts a data-driven framework that integrates both regression and classification models, including Linear Regression, Random Forest, Support Vector Regression (SVR), and an ensemble technique that combines XGBoost and decision tree regression in a hybrid model. These models are used to predict defect rates, which can help identify maintenance requirements and optimize supply chain logistics.
To ensure model robustness, preprocessing techniques such as imputation, scaling, and the Synthetic Minority Over-sampling Technique (SMOTE) were employed to address data inconsistencies and class imbalances. The findings reveal that baseline models, particularly Random Forest and Linear Regression, outperformed ensemble methods in predicting defect rates, demonstrating superior generalization with R² values of 0.18% and 0.86%, respectively. Ensemble models performed adequately for smaller logistical datasets, achieving an R² value of 0.18%. However, overfitting in supply chain models led to high training accuracy but poor testing accuracy, underscoring the importance of careful model selection and feature engineering when predicting inspection results.
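The oversampling idea behind SMOTE can be illustrated with a minimal sketch: new minority-class points are placed on line segments between an existing minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, its parameters, and the tuple-based data layout are assumptions made for illustration; this is not the thesis's implementation, which is not described in the abstract.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating between neighbours.

    minority: list of equal-length numeric tuples (hypothetical data layout).
    n_synthetic: number of new samples to generate.
    k: number of nearest minority neighbours to choose from.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a base sample by index so duplicates are handled correctly.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        u = rng.random()
        synthetic.append(tuple(b + u * (n - b) for b, n in zip(base, neighbour)))
    return synthetic
```

Because each synthetic point lies between two real minority samples, the new data stays inside the region the minority class already occupies. In practice, oversampling of this kind is usually applied to the training split only, so that the test set still reflects the original class balance.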
Manufacturing defect status predictions achieved a moderately high accuracy of 74%. Conversely, supply chain models faced generalization challenges, achieving only 50% test accuracy. Additionally, K-Nearest Neighbors (KNN) classification underperformed on the logistics dataset, with an R² value of 25% and critical features missing. These results highlight the pivotal role of feature engineering, data quality, and robust model selection in delivering reliable predictions across industrial applications.
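For readers unfamiliar with the metrics quoted above, the following sketch shows how R² and classification accuracy are conventionally computed, and how a gap between training and test scores signals overfitting. The function names are hypothetical and the sketch uses no data from the thesis. Note that R² is conventionally a unitless score with a maximum of 1, where 0 corresponds to always predicting the mean.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(labels, predictions) if t == p)
    return correct / len(labels)


def generalization_gap(train_score, test_score):
    """Difference between training and test scores; a large positive
    value suggests the model has overfitted the training data."""
    return train_score - test_score
```

A model that scores much higher on the data it was trained on than on held-out data, as the abstract reports for the supply chain models, would show a large positive `generalization_gap`.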
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Haycock, Barry (email unspecified) |
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning; H Social Sciences > HD Industries. Land use. Labor > Business Logistics > Supply Chain Management |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Ciara O'Brien |
Date Deposited: | 05 Sep 2025 10:47 |
Last Modified: | 05 Sep 2025 10:47 |
URI: | https://norma.ncirl.ie/id/eprint/8816 |