Ingle, Piyush (2023) Optimizing Diabetes Predictive Modeling: A Study of Data Balancing and Advanced Algorithms. Masters thesis, Dublin, National College of Ireland.
Abstract
Diabetes is a chronic illness and a global health problem, since it can affect anyone. Also known as diabetes mellitus, the disorder interferes with how the body regulates blood sugar levels. The goal of this study is to apply two data balancing techniques, ADASYN (Adaptive Synthetic Sampling) and SMOTE (Synthetic Minority Over-sampling Technique), to improve the accuracy of diabetes prediction models. In addition to addressing the inherent class imbalance in diabetes datasets, the study examines how these techniques affect the predictive performance of five conventional machine learning algorithms: k-Nearest Neighbors (KNN), AdaBoost, Decision Tree, Logistic Regression, and Gaussian Naïve Bayes. The study also evaluates three deep learning algorithms: Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), and Recurrent Neural Network (RNN). Through a comparative analysis, this work attempts to identify which combinations of machine learning or deep learning algorithm and data balancing technique are most effective for diabetes prediction. The findings indicate how model performance can be improved in healthcare applications and how predictive modelling techniques interact with data preparation techniques in diabetes diagnosis. Comparing all results, we found that SMOTE gave the better results, with Decision Tree as the best performer at an accuracy of 82% and an F1 score of 0.80.
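To illustrate the kind of oversampling the abstract refers to, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours. It is not the thesis's implementation. The function name `smote`, its parameters, and the toy data are assumptions made for illustration; in practice a library such as imbalanced-learn, which provides `SMOTE` and `ADASYN` classes, would normally be used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch only: `minority` is a list of numeric feature
    vectors; `k` is the number of nearest minority neighbours considered.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours and interpolate towards it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical values, not taken from the thesis).
majority = [[float(i), float(i % 3)] for i in range(10)]
minority = [[1.0, 5.0], [2.0, 6.0], [3.0, 5.5]]

# Generate enough synthetic points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

ADASYN follows the same interpolation step but generates more synthetic points around minority samples that have many majority-class neighbours, so the sketch above covers only the SMOTE variant. In either case, oversampling is normally applied to the training split only, so that evaluation data contains no synthetic samples.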
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Rustam, Furqan (email: UNSPECIFIED) |
Uncontrolled Keywords: | Diabetes Prediction; ADASYN; SMOTE; Machine Learning; Deep Learning |
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; R Medicine > RA Public aspects of medicine > RA0421 Public health. Hygiene. Preventive Medicine; R Medicine > Healthcare Industry; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Ciara O'Brien |
Date Deposited: | 09 May 2025 08:10 |
Last Modified: | 09 May 2025 08:10 |
URI: | https://norma.ncirl.ie/id/eprint/7529 |