NORMA eResearch @NCI Library

An AI Approach to Investigate the Identification of Fake News in Brazilian Portuguese

Fischer, Marcelo (2021) An AI Approach to Investigate the Identification of Fake News in Brazilian Portuguese. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science)
PDF (Configuration manual)

Abstract

The spread of fake news is damaging to society as a whole. Given the amount of news produced each day, it is vital to automate the process of fake news identification. However, this remains a challenge, especially for languages other than English. This research proposes an AI stacking approach (an ensemble method), three deep neural networks, and mBERT to identify fake news in Brazilian Portuguese. The stacking approach combines seven machine learning models. The dataset used is the Fake.Br Corpus, with 7,200 news articles (3,600 real and 3,600 fake). Truncated texts were used to avoid bias from text length. The best-performing model in this scenario, using term frequency (TF) features, was LinearSVC with almost 97% recall, followed by the stacking model with 96.3% recall. The deep learning architectures were not able to outperform LinearSVC or the stacking model. The mBERT model achieved 99.4% recall. The results shown here further extend the understanding of fake news identification in Brazilian Portuguese.
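As a rough illustration of three ideas the abstract mentions (text truncation, term frequency, and recall), the following is a minimal Python sketch and not the thesis's implementation. The function names, the toy texts, the whitespace tokenisation, the rule of truncating to the shorter text's length, and the choice of "fake" as the positive class are all assumptions made for the example.

```python
from collections import Counter


def truncate_tokens(text: str, max_tokens: int) -> list[str]:
    """Lowercase, split on whitespace, and keep only the first max_tokens tokens."""
    return text.lower().split()[:max_tokens]


def term_frequency(tokens: list[str]) -> Counter:
    """Raw term frequency: how many times each token occurs in one document."""
    return Counter(tokens)


def recall(y_true: list[int], y_pred: list[int], positive: int = 1) -> float:
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy texts, not taken from the Fake.Br Corpus.
short_text = "governo anuncia nova medida"
long_text = "governo anuncia nova medida e promete mais detalhes na semana que vem"

# Assumed truncation rule: cut both texts to the length of the shorter one,
# so that text length alone cannot separate the two classes.
limit = min(len(short_text.split()), len(long_text.split()))
tf_short = term_frequency(truncate_tokens(short_text, limit))
tf_long = term_frequency(truncate_tokens(long_text, limit))

# Hypothetical labels (1 = fake, 0 = real) and hypothetical model predictions.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
fake_recall = recall(y_true, y_pred)
```

In a real pipeline the term-frequency vectors would feed a classifier such as LinearSVC, and recall would be computed on held-out predictions rather than on hand-written labels.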

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Clara Chan
Date Deposited: 25 Nov 2021 16:10
Last Modified: 25 Nov 2021 16:10
URI: https://norma.ncirl.ie/id/eprint/5147
