Mahendran Joseph Solomon, Ashlyn Nivita (2023) Deciphering Sarcasm in Textual Data: A Comparative Study of Machine Learning and Deep Learning Methods and a Nuanced Dive into Topic Modeling. Masters thesis, Dublin, National College of Ireland.
PDF (Master of Science), 2MB
PDF (Configuration Manual), 1MB
Abstract
In digital communication, accurately identifying sarcasm is a complex challenge. This study addresses that challenge by integrating advanced deep learning architectures in Natural Language Processing (NLP), with the aim of improving the precision and context-awareness of sarcasm detection within specialized textual domains. The dataset consists of news headlines from two professional sources, The Onion and HuffPost; because both follow structured, professional journalistic standards, the headlines are largely free of noise. Latent Dirichlet Allocation (LDA) was first used for topic modeling to categorize the headlines. Because of the low coherence score of 0.4418, the features extracted by LDA were found to be irrelevant to sarcasm detection and offered no practical utility in this context. For sarcasm detection, a hybrid deep learning model was developed. It combines Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) with Bidirectional Long Short-Term Memory (BiLSTM) and Gated Recurrent Units (GRU), alongside GloVe embeddings. Trained for 10 epochs, the model achieved an accuracy of 82.19% on the test data, outperforming traditional machine learning models such as Random Forest and Decision Tree, which recorded accuracies of 78.97% and 67.61% respectively.
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Subhnil, Shubham (email: UNSPECIFIED) |
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; H Social Sciences > HM Sociology > Information Science > Communication; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Ciara O'Brien |
Date Deposited: | 16 May 2025 10:24 |
Last Modified: | 16 May 2025 10:24 |
URI: | https://norma.ncirl.ie/id/eprint/7562 |