Sharma, Shreedhar (2023) DisasterTweetsEnsemble: Ensemble for Disaster Tweets Classification. Masters thesis, Dublin, National College of Ireland.
Abstract
Analyzing social media data can have a significant impact on society, and one important use case is natural calamities. During a natural calamity, responders need a live feed from the disaster-affected area to provide aid, and social media is at the forefront of providing that data. This research examines how tweets posted during disasters can support aid efforts by filtering them as relevant or non-relevant. During a disaster, tweets and other social media data are generated at high velocity, and many of them are random and not useful. Hence, this research classifies tweets as informative or non-informative based on a labelled dataset. It trains various machine learning models on three datasets from CRISISNLP: Nepal, Queensland, and Crisis. The study follows the KDD methodology, from cleaning the data with the help of natural language understanding to building different machine learning and deep learning models. It applies different vectorization techniques to the text data and builds models such as Logistic Regression, Naïve Bayes, XGBoost and LSTM. The results show that the count vectorizer performs well across different combinations of algorithms, with a maximum accuracy of 95% on the Queensland dataset. The study also builds a state-of-the-art embedded model to go one step further in this direction, which achieves an accuracy of 95.9% on the Queensland dataset.
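As a rough illustration of the count-vector approach described in the abstract, the following is a minimal sketch of a bag-of-words Naïve Bayes classifier written in plain Python. It is not the thesis implementation: the class name `CountNaiveBayes`, the tokenizer, the add-one smoothing, and the six example tweets are all assumptions made for illustration, and the toy data is not drawn from the CRISISNLP datasets.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class CountNaiveBayes:
    """Multinomial Naive Bayes over raw token counts (add-one smoothing).

    Hypothetical helper for illustration only.
    """

    def fit(self, texts, labels):
        # Number of training tweets per class, used for the class priors.
        self.class_counts = Counter(labels)
        # Per-class token counts: the "count vector" of each class.
        self.token_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.token_counts[label].update(tokenize(text))
        # Vocabulary is every token seen in training.
        self.vocab = set()
        for counts in self.token_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.class_counts[label] / total_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy tweets (assumption: not taken from any real dataset).
train_texts = [
    "Bridge collapsed near the river, people trapped and need rescue",
    "Flood water rising fast, families need food and shelter",
    "Earthquake damaged the hospital, urgent medical aid required",
    "Watching a movie tonight with friends",
    "My thoughts and prayers are with everyone",
    "Great coffee this morning, feeling good",
]
train_labels = ["informative"] * 3 + ["non_informative"] * 3

model = CountNaiveBayes().fit(train_texts, train_labels)
print(model.predict("Families trapped by flood water need rescue"))
```

In practice a library vectorizer and classifier, trained on the full labelled datasets with a held-out test split, would replace this hand-rolled version; the sketch only shows how token counts alone can separate informative from non-informative tweets.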