NORMA eResearch @NCI Library

Comparative Analysis of Machine learning Algorithms using NLP Techniques in Automatic Detection of Fake News on Social Media Platforms

Murugesan, Manoj Kumar (2019) Comparative Analysis of Machine learning Algorithms using NLP Techniques in Automatic Detection of Fake News on Social Media Platforms. Masters thesis, Dublin, National College of Ireland.

Files: PDF (Master of Science), 713kB
PDF (Configuration manual), 1MB

Abstract

The widening popularity of social media platforms and their increasing number of users fuel the spread of fake news, which creates chaos and tension in people’s peaceful lives. Detecting fake news is therefore of vital interest, since it has enormous potential to disrupt people’s healthy growth. Traditional non-machine-learning detection approaches, such as linguistic, network, and user profile analysis, were deficient for dynamic and sophisticated social media networks. Those conventional methods involved humans, who are prone to errors and take a lot of time. This research addresses this limitation using Natural Language Processing techniques along with machine learning algorithms. Our proposed system aims to detect fake news accurately, efficiently, early, and with a low false-positive rate. The first phase of the methodology involves cleaning noise from the dataset and pre-processing it into a document-term matrix format. The literature review highlighted a few of the best-performing machine learning algorithms: Decision Tree, Random Forest, AdaBoost, XGBoost, and LightGBM, which are well-known models for accurate and efficient text classification. The second phase of the research involved evaluating the classification models in terms of metrics such as accuracy, precision, recall, F1 score, and AUC. LightGBM with the bag-of-words technique performed strongly, with an AUC of 96.3% and an accuracy of 93.1%. Lastly, an API was designed around the LightGBM model and deployed to achieve our goal of accurate detection with a low false-positive rate.
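The two phases named in the abstract, building a document-term matrix and scoring predictions, can be illustrated with a minimal sketch. It is written in plain Python and is not the thesis's implementation: the abstract does not specify the tooling, the function names (`tokenize`, `document_term_matrix`, `binary_metrics`) and the sample headlines are hypothetical, and no classifier is trained here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simple cleaning step)."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term vocab[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def binary_metrics(y_true, y_pred):
    """Compute common metrics for labels in {0, 1}, where 1 means 'fake'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Share of genuine items wrongly flagged as fake.
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical headlines, used only to show the matrix shape.
    docs = [
        "Breaking: aliens landed in Dublin!",
        "Council approves new budget",
        "Aliens endorse new budget",
    ]
    vocab, matrix = document_term_matrix(docs)
    print(vocab)
    for row in matrix:
        print(row)
    # Hypothetical labels and predictions.
    print(binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
```

Each row of the matrix is a bag-of-words vector that a classifier such as those listed in the abstract could take as input. The false-positive rate is reported alongside the other metrics because the abstract names a low false-positive rate as a goal.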

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites > Online social networks
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites > Online social networks
Divisions: School of Computing > Master of Science in Cyber Security
Depositing User: Caoimhe Ní Mhaicín
Date Deposited: 02 Apr 2020 11:02
Last Modified: 02 Apr 2020 11:02
URI: https://norma.ncirl.ie/id/eprint/4157
