Multimodal Fake News Detection: Integrating OCR and Deep Learning Models for Text and Image Analysis

Vennapu, Jayaram

Vennapu, Jayaram (2024) Multimodal Fake News Detection: Integrating OCR and Deep Learning Models for Text and Image Analysis. Masters thesis, Dublin, National College of Ireland.

Abstract

Identifying fake news is an essential task today, particularly given how widely news now circulates through multimedia channels. This work proposes a dual Optical Character Recognition (OCR) method, supplemented by a multimodal deep learning model, to distinguish real news from fake news. The dataset comprises images containing raw textual news content and is split into a training set (88%), a validation set (8%), and a test set (4%). Data augmentation consists of in-plane rotation, changes to saturation and exposure, and random cuts to the image; the images are resized to 640 × 640 pixels. For text analysis, text is extracted from the images with OCR 2.5 and tokenized with the BERT-base-uncased tokenizer. A pre-trained BERT-based transformer model is then fine-tuned on the extracted text for three epochs, reaching a test accuracy of 95.12% and an F1-score of 95.12%. For image analysis, a CNN-based ResNet-18 model pre-trained on ImageNet is fine-tuned to classify the images, achieving a test accuracy of 97.56% and a test F1-score of 97.56%. The accuracy, precision, recall, and F1-score measures support the effectiveness of the proposed system. The overall approach combines text and image recognition, using OCR and deep learning to recognize fake news, and provides a stable solution with strong test results.
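The split proportions and evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the helper names `split_counts` and `binary_metrics` are hypothetical, the labels in the usage note are toy values, and the F1-score is assumed to be the binary (positive-class) variant, since the abstract does not say which averaging was used.

```python
from typing import Dict, Sequence, Tuple


def split_counts(n_items: int, percents: Sequence[int] = (88, 8, 4)) -> Tuple[int, ...]:
    """Convert integer split percentages into item counts that sum to n_items."""
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    counts = [n_items * p // 100 for p in percents]
    # Items lost to rounding down are assigned to the first (training) split.
    counts[0] += n_items - sum(counts)
    return tuple(counts)


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical dataset size, used only to show the 88/8/4 proportions.
    print(split_counts(1000))
    # Toy labels (1 = fake, 0 = real); not data from the thesis.
    print(binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0]))
```

With the toy labels above, one false positive and one false negative make precision and recall equal, so accuracy and F1 coincide. This is one way a classifier can report identical accuracy and F1 values, as in the abstract.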

Item Type:	Thesis (Masters)
Supervisors:	Subhnil, Shubham (email: UNSPECIFIED)
Uncontrolled Keywords:	Fake News Detection; Optical Character Recognition (OCR); BERT-based Transformer; CNN-based ResNet; Image Classification
Subjects:	Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning; Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites > Online social networks; T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites > Online social networks
Divisions:	School of Computing > Master of Science in Data Analytics
Depositing User:	Ciara O'Brien
Date Deposited:	05 Sep 2025 13:44
Last Modified:	05 Sep 2025 13:44
URI:	https://norma.ncirl.ie/id/eprint/8833
