NORMA eResearch @NCI Library

DisasterTweetsEnsemble: Ensemble for Disaster Tweets Classification

Sharma, Shreedhar (2023) DisasterTweetsEnsemble: Ensemble for Disaster Tweets Classification. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science)
PDF (Configuration Manual)

Abstract

Analyzing social media data can significantly impact society, and one important use case is the response to natural calamities. During a natural calamity, responders need a live feed from the disaster-affected area in order to provide aid, and social media is at the forefront of supplying that data. This research shows how Tweets posted during disasters can support aid efforts by filtering the data into relevant and non-relevant content. At the time of a disaster, tweets and other social media data are generated at high velocity, and many of them are random posts that are not useful. Hence, this research classifies tweets as informative or non-informative based on a labelled dataset. It trains various machine learning models on three datasets from CRISISNLP: Nepal, Queensland, and Crisis. The study follows the KDD methodology, from cleaning the data with the help of natural language understanding to building different machine learning and deep learning algorithms. It applies different vectorization techniques to the text data and builds models such as Logistic Regression, Naïve Bayes, XGBoost and LSTM. The results show that the count vectorizer performs well across different combinations of algorithms, with a maximum accuracy of 95% on the Queensland dataset. The study also builds a state-of-the-art embedded model to go one step further in this direction, which achieves an accuracy of 95.9% on the Queensland dataset.
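
The count-vector approach mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the tokenizer, the NaiveBayes class, and the six toy tweets are all hypothetical and invented for illustration, and a hand-written multinomial Naive Bayes stands in for whatever library models the thesis actually used. It only shows the general idea of turning tweets into word counts and labelling them as informative or non-informative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_vector(text):
    """Bag-of-words representation: token -> number of occurrences."""
    return Counter(tokenize(text))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical helper class for illustration only.
    """

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        # Per-class totals of how often each token was seen in training.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(count_vector(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the smoothed log likelihood of each known token.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for token, n in count_vector(text).items():
                if token in self.vocab:  # tokens unseen in training are skipped
                    score += n * math.log((self.word_counts[c][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Invented toy tweets, NOT taken from the CRISISNLP datasets.
train_texts = [
    "Flood waters rising in Brisbane, roads closed and families need rescue",
    "Earthquake destroyed houses, people need food and shelter urgently",
    "Emergency shelter open for flood victims, volunteers needed",
    "Watching the cricket tonight with friends",
    "Great coffee this morning, feeling happy",
    "New phone arrived today, love the camera",
]
train_labels = ["informative"] * 3 + ["non-informative"] * 3

model = NaiveBayes().fit(train_texts, train_labels)
pred_a = model.predict("Families need rescue from flood waters")
pred_b = model.predict("Love this coffee with friends")
print(pred_a, pred_b)
```

On this toy data the first query shares its vocabulary with the disaster-related examples and the second with the everyday ones, so they fall into different classes. Accuracy figures such as those reported in the abstract cannot be inferred from a sketch of this size.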

Item Type: Thesis (Masters)
Supervisors: Moldovan, Arghir Nicolae (email: UNSPECIFIED)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites > Online social networks
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites > Online social networks
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 21 May 2025 11:54
Last Modified: 21 May 2025 11:54
URI: https://norma.ncirl.ie/id/eprint/7607
