Bhatgaonkar, Sneha (2024) Decoding Online Pharmacy Trends: Clustering, Prediction and Business Insights. Masters thesis, Dublin, National College of Ireland.
PDF (Master of Science), 1MB
PDF (Configuration Manual), 2MB
Abstract
The online pharmacy market in India is growing significantly, driven by increasing consumer demand for convenient and accessible healthcare services. In this competitive market, it is essential to extract insights from available data, identify areas for improvement, and enhance business operations and services. This study aims to uncover patterns in online pharmacy data sourced from the Kaggle platform. Analysing biomedical text is challenging, but domain-specific transformer-based models are effective at extracting relevant entities from it. Accordingly, named entity recognition (NER) models such as Med7 and Clinical-AI-Apollo (Medical-NER) are used to extract features. The data is analysed using K-Means clustering, and a classification model is built to predict product reviews using the supervised machine learning models Random Forest, XGBoost, and Easy Ensemble. For vectorization, transformer-based models such as BioBERT, BioFormer-16L, and Clinical-AI-Apollo are used. Evaluation results show that the Easy Ensemble classifier with an XGBoost estimator handles class imbalance effectively and, when combined with the Clinical-AI-Apollo model for vectorization, outperforms the other models in terms of ROC AUC. This work contributes to a deeper understanding of the data, providing business insights into disease profiles, side effects, medicine forms, composition, and manufacturers, along with the review-based performance probability of products. These insights can inform better strategies for improving underperforming products.
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Hasanuzzaman, Mohammed (email: UNSPECIFIED) |
Uncontrolled Keywords: | Clustering; NER; vectorization; classification |
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; R Medicine > RS Pharmacy and materia medica; H Social Sciences > HF Commerce > Electronic Commerce |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Ciara O'Brien |
Date Deposited: | 01 Sep 2025 15:17 |
Last Modified: | 01 Sep 2025 15:17 |
URI: | https://norma.ncirl.ie/id/eprint/8682 |