NORMA eResearch @NCI Library

Contextual Hate Speech Detection Leveraging RoBERTa: Overcoming Challenges in Online Communication

Bhalerao, Varun Atul (2024) Contextual Hate Speech Detection Leveraging RoBERTa: Overcoming Challenges in Online Communication. Masters thesis, Dublin, National College of Ireland.


Abstract

The growth of online communication has been accompanied by a sharp rise in hate speech, creating both an urgent need and serious challenges for any company responsible for keeping digital environments safe and inclusive. Traditional detection methods, such as those based on simple keyword matching, often fail to capture the creative, context-dependent nature of hate speech, which manifests through slang, code words, and euphemisms. This work applies RoBERTa, a robustly optimized variant of BERT, to improve the accuracy and resilience of hate speech detection. Training on larger datasets for longer periods, together with dynamic masking, makes RoBERTa considerably better at understanding and processing human language in its diverse and subtle contexts. In particular, the work investigates whether RoBERTa can overcome the shortcomings of earlier models, including traditional machine learning approaches and early deep learning models such as CNNs and RNNs, in detecting hate speech efficiently across languages and cultural contexts. Comparative analysis shows that RoBERTa outperforms traditional approaches at identifying contextually nuanced hate speech, especially when combined with other techniques such as model ensembling or emotion recognition. The research aims to develop an accurate and versatile hate speech detection system that works across different languages and adapts to changing linguistic patterns. The results point to the potential of transformer-based models, as exemplified by RoBERTa, to improve online safety and inclusivity.
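
The dynamic masking mentioned in the abstract can be illustrated with a minimal sketch. This is not code from the thesis: the function names are hypothetical, and the masking is simplified to replacing about 15% of tokens with a mask symbol (the 15% rate and the `<mask>` symbol are assumptions made for illustration). It only contrasts a mask pattern fixed once with a pattern re-sampled every time a sequence is seen.

```python
import random

MASK = "<mask>"  # assumed mask symbol, for illustration only


def mask_tokens(tokens, rng, mask_prob=0.15):
    """Return a copy of tokens with about mask_prob of positions replaced by MASK."""
    n_to_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_to_mask))
    return [MASK if i in positions else tok for i, tok in enumerate(tokens)]


def static_masking(tokens, epochs, seed=0):
    """Static masking: one pattern is chosen up front and reused every epoch."""
    masked = mask_tokens(tokens, random.Random(seed))
    return [list(masked) for _ in range(epochs)]


def dynamic_masking(tokens, epochs, seed=0):
    """Dynamic masking: a fresh pattern is sampled each time the sequence is seen."""
    rng = random.Random(seed)
    return [mask_tokens(tokens, rng) for _ in range(epochs)]


if __name__ == "__main__":
    sentence = "the model sees a different masked view of this sentence every epoch".split()
    for epoch, view in enumerate(dynamic_masking(sentence, epochs=3), start=1):
        print(epoch, " ".join(view))
```

With static masking the model is shown the same masked positions on every pass, whereas with dynamic masking the masked positions differ between passes, so the same sentence yields several distinct training views.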

Item Type: Thesis (Masters)
Supervisors: Siddig, Abubakr (email: UNSPECIFIED)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites > Online social networks
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites > Online social networks
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 07 Aug 2025 10:52
Last Modified: 07 Aug 2025 10:52
URI: https://norma.ncirl.ie/id/eprint/8463
