Kalungepatil, Sakshi Kacheshwar (2025) Customer Churn Prediction using RAG-Based Sentiment Analysis with LLMs and CatBoost. Masters thesis, Dublin, National College of Ireland.
PDF (Master of Science), 1MB
PDF (Configuration Manual), 992kB
Abstract
In today’s competitive e-commerce era, with vast quantities of sentiment-rich textual and behavioral customer data available, predicting customer churn is crucial to understanding customer behavior for business sustainability, user engagement, and rapid market growth. Achieving this requires integrating advanced techniques that extract meaningful insights for data-driven decision-making. This research investigates a hybrid approach for predicting early signs of customer churn by integrating a fine-tuned large language model (LLM) with an ensemble machine learning technique. The study explores combining the Retrieval-Augmented Generation (RAG) framework with an instruction-following fine-tuned Large Language Model Meta AI (LLaMA) for sentiment analysis of customer review data. To boost predictive performance, sentiment-driven features obtained from the RAG module are combined with structural features such as verified purchase and review length, and are passed to the CatBoost model for final churn prediction. The research used a Kaggle dataset consisting of Amazon customer reviews 2023, which combines textual and behavioral characteristics. The RAG-based fine-tuned LLaMA model achieved an accuracy of 86.75%, and the CatBoost model achieved an accuracy of 75.9%. Adopting such a hybrid approach validates the effectiveness of combining sentiment-rich textual data with structural features for churn prediction in real-world applications.
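The feature-combination step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Review` class, the `SENTIMENT_SCORE` mapping, the `build_features` helper, and the choice of word count as the measure of review length are all assumptions made for illustration, and the sentiment label is assumed to come from an upstream model.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical mapping from a sentiment label to a numeric feature.
SENTIMENT_SCORE = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


@dataclass
class Review:
    text: str
    verified_purchase: bool
    sentiment_label: str  # assumed output of an upstream sentiment model


def build_features(review: Review) -> List[float]:
    """Combine one sentiment-derived feature with two structural features."""
    return [
        SENTIMENT_SCORE[review.sentiment_label],
        1.0 if review.verified_purchase else 0.0,
        float(len(review.text.split())),  # review length in words (assumed unit)
    ]


reviews = [
    Review("Stopped working after a week", True, "negative"),
    Review("Great value, would buy again", False, "positive"),
]

# One row per review: [sentiment score, verified purchase flag, review length]
feature_matrix = [build_features(r) for r in reviews]
```

Rows of this shape could then be passed to a gradient-boosted classifier such as CatBoost. The abstract does not specify the actual feature set or encoding used in the thesis.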
| Item Type: | Thesis (Masters) |
|---|---|
| Supervisors: | Haque, Rejwanul (email: UNSPECIFIED) |
| Subjects: | Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing; H Social Sciences > HF Commerce > Marketing > Consumer Behaviour; H Social Sciences > HF Commerce > Electronic Commerce |
| Divisions: | School of Computing > Master of Science in Data Analytics |
| Depositing User: | Ciara O'Brien |
| Date Deposited: | 01 Jul 2026 11:03 |
| Last Modified: | 01 Jul 2026 11:03 |
| URI: | https://norma.ncirl.ie/id/eprint/9430 |