NORMA eResearch @NCI Library

Automated Detection of Fake News in Urdu Language Using Pre-Trained Transformer Models

Lohano, Jackay (2024) Automated Detection of Fake News in Urdu Language Using Pre-Trained Transformer Models. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 1MB
PDF (Configuration Manual), 764kB

Abstract

The propagation of misinformation across different languages and domains on social media platforms is of grave concern for societies and individuals because of its wide-ranging consequences. Although researchers have addressed this challenge using advanced deep learning (DL) models, fake news detection in low-resource languages such as Urdu is still at a nascent stage. Previous studies have applied traditional machine learning (ML) models to very small, domain-restricted Urdu datasets. This study explores fake news classification in Urdu using three state-of-the-art (SOTA) pre-trained multilingual transformer models, i.e. mBERT, DistilmBERT and mT5, on a large, multidomain Urdu dataset. The models are evaluated using accuracy, precision, recall and F1-score. The results show that the DistilmBERT model achieves promising results, with an accuracy of 89%, compared with its larger counterpart mBERT and with mT5. The findings reveal the potential of the DistilmBERT model for real-world fake news identification in memory-constrained environments. This research contributes to the ongoing efforts to combat misinformation in resource-scarce languages by building a reliable Urdu fake news detection model, addressing the complexities of information dissemination and manipulation in the modern age.
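As an illustration of the four evaluation metrics named in the abstract, the following is a minimal sketch of how they can be computed for a binary fake/real classifier. It is not taken from the thesis: the function name `binary_classification_metrics`, the convention that label 1 means "fake", and the toy label lists are all assumptions made for the example, and the numbers it produces are unrelated to the reported results.

```python
from typing import Dict, Sequence


def binary_classification_metrics(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    positive: int = 1,  # assumption: 1 = "fake", 0 = "real"
) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy labels, for illustration only (not thesis data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_classification_metrics(y_true, y_pred)
print(metrics)
```

Precision and recall are reported alongside accuracy because accuracy alone can be misleading when the fake and real classes are imbalanced.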

Item Type: Thesis (Masters)
Supervisors: Niculescu, Hamilton (Email: UNSPECIFIED)
Uncontrolled Keywords: Fake News Detection; Urdu Language; Transformer models; Low resource language; Transfer Learning
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites > Online social networks
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites > Online social networks
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 03 Sep 2025 11:41
Last Modified: 03 Sep 2025 11:41
URI: https://norma.ncirl.ie/id/eprint/8739
