Hybrid Reinforcement Learning for Personalized Diabetes Care

Rathore, Shayshank

Rathore, Shayshank (2025) Hybrid Reinforcement Learning for Personalized Diabetes Care. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 1MB
PDF (Configuration Manual), 1MB

Abstract

Personalized diabetes care needs decision support that adapts quickly to each patient while remaining transparent to clinicians. Two major challenges in the medical field are the lack of explainability of the actions taken by Artificial Intelligence (AI) systems and their heavy dependence on large amounts of training data. We present a hybrid reinforcement learning (RL) approach that combines reward decomposition, which separates clinically meaningful objectives such as glycaemic control, avoidance of adverse events and intervention burden, with a lightweight meta-learning routine for fast per-patient adaptation. Using a custom Gymnasium environment derived from the UCI diabetes time-series, we train Proximal Policy Optimization (PPO) and compare it to a rule-based controller. Evaluation focuses on Time-in-Range (TIR), hypoglycaemic events, insulin usage and sample efficiency. In our current configuration, PPO eliminates hypoglycaemic steps and reduces insulin use by ∼86% relative to the rule-based baseline. A pilot meta-learning step yields small per-task reward gains, indicating potential for personalization with limited data. We discuss why control remains suboptimal (coarse action space, reward balance, largely deterministic dynamics) and outline concrete remedies. Overall, the method delivers interpretable trade-offs and data-efficient adaptation, offering a pragmatic path toward trustworthy RL-based decision support in diabetes care.
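
As a rough illustration of the reward decomposition and the Time-in-Range metric named in the abstract, the following minimal Python sketch keeps the three clinical objectives as separate reward components and combines them only at the end. It is not the thesis implementation: the thresholds (70 to 180 mg/dL target range, hypoglycaemia below 70 mg/dL), the component weights, and the names `RewardComponents`, `decompose_reward` and `time_in_range` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Assumed thresholds in mg/dL (widely used clinical targets, not taken from the thesis).
HYPO_THRESHOLD = 70.0
TARGET_LOW, TARGET_HIGH = 70.0, 180.0


@dataclass(frozen=True)
class RewardComponents:
    """Per-step reward split into separately inspectable objectives."""
    glycaemic_control: float
    adverse_events: float
    intervention_burden: float

    def total(self, weights=(1.0, 1.0, 0.1)):
        # Hypothetical weights; the scalar reward is a weighted sum of the components.
        w_g, w_a, w_b = weights
        return (w_g * self.glycaemic_control
                + w_a * self.adverse_events
                + w_b * self.intervention_burden)


def decompose_reward(glucose, insulin_units):
    """Score one step; each objective keeps its own term."""
    in_range = TARGET_LOW <= glucose <= TARGET_HIGH
    return RewardComponents(
        glycaemic_control=1.0 if in_range else 0.0,
        adverse_events=-1.0 if glucose < HYPO_THRESHOLD else 0.0,
        intervention_burden=-float(insulin_units),
    )


def time_in_range(glucose_trace):
    """Fraction of readings inside the assumed target range."""
    if not glucose_trace:
        return 0.0
    hits = sum(TARGET_LOW <= g <= TARGET_HIGH for g in glucose_trace)
    return hits / len(glucose_trace)
```

Keeping the components separate until the final weighted sum is what would let a clinician see which objective drove a given action, for example by logging `decompose_reward(glucose, insulin_units)` at each step alongside the scalar from `total()`.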

Item Type:	Thesis (Masters)
Supervisors:	Shahid, Abdul (email unspecified)
Uncontrolled Keywords:	Reinforcement Learning; Personalized Diabetes Care; Reward Decomposition; Meta-Learning; Proximal Policy Optimization (PPO)
Subjects:	Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence; R Medicine > Diseases > Endocrine glands - Diseases > Diabetes
Divisions:	School of Computing > Master of Science in Artificial Intelligence
Depositing User:	Ciara O'Brien
Date Deposited:	04 Jun 2026 15:10
Last Modified:	04 Jun 2026 15:10
URI:	https://norma.ncirl.ie/id/eprint/9342
