NORMA eResearch @NCI Library

Classification of Online Patient Reviews Based on Effectiveness Using Machine Learning Algorithms

Anvekar, Srinivas Prakash (2020) Classification of Online Patient Reviews Based on Effectiveness Using Machine Learning Algorithms. Masters thesis, Dublin, National College of Ireland.



Over the past decade, the use of expedited review approaches has expanded and, together with improvements in the field of medicine, has considerably reduced the time to market for drugs. As a result, drugs undergo less testing in clinical trials, which leads to Adverse Drug Reactions when they are used in real-world settings. Adverse Drug Reactions have been one of the leading causes of mortality over the past decade, so monitoring them has become a concern. The boom in online forums and social media in recent years has produced a wealth of patient-submitted information about drugs and their reactions, and machine learning techniques can extract valuable knowledge from it. This research processes textual data about drugs from online patient reviews and classifies the reviews by effectiveness using SVM, kNN, Random Forest, XGBoost Classifier and Logistic Regression, then evaluates which model best meets that objective. The analysis is performed on both in-domain and cross-domain data. A transfer learning approach can be used to find similarity across domains and is a promising technique in the field of review analysis. This study also tries to explain the correlation between the sentiment of a review and its effectiveness. Among the models, the XGBoost Classifier performed best across the different approaches, which demonstrates the viability of this research and leaves room for further improvement of the model in future.
Keywords: Drug reviews, Sentiment Analysis, machine learning algorithms, feature extraction, Tf-Idf, Correlation.
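
The pipeline summarised in the abstract (Tf-Idf feature extraction followed by a classifier) can be illustrated with a minimal sketch. This is not the thesis's implementation: the six reviews and their labels are invented for illustration, the function names are hypothetical, and kNN with cosine similarity is shown only because it is one of the listed algorithms that can be written with the Python standard library alone. The other models named in the abstract (SVM, Random Forest, XGBoost Classifier, Logistic Regression) would consume the same Tf-Idf vectors.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for each training term."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each term once per document
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(doc, idf):
    """Return an L2-normalised sparse Tf-Idf vector; unseen terms are ignored."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two L2-normalised sparse vectors (equals cosine similarity)."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())


def knn_predict(query, train_vecs, train_labels, idf, k=3):
    """Majority vote among the k training reviews most similar to the query."""
    q = tfidf_vector(query, idf)
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(q, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, NOT taken from the thesis dataset.
train_reviews = [
    "this drug worked great and my pain is gone",
    "very effective relief within days",
    "great results highly effective",
    "no improvement at all and bad side effects",
    "did not work useless and made me feel worse",
    "useless no relief and terrible side effects",
]
train_labels = ["effective"] * 3 + ["ineffective"] * 3

idf = fit_idf(train_reviews)
train_vecs = [tfidf_vector(r, idf) for r in train_reviews]

pred_positive = knn_predict("great results very effective", train_vecs, train_labels, idf)
pred_negative = knn_predict("useless no improvement bad side effects", train_vecs, train_labels, idf)
print(pred_positive, pred_negative)
```

In a cross-domain setting of the kind the abstract mentions, the same fitted `idf` table would be applied to reviews from a different source, so terms unseen during training contribute nothing to the vector.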

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
R Medicine > R Medicine (General)
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Dan English
Date Deposited: 11 Jun 2020 12:33
Last Modified: 11 Jun 2020 12:33
