Ora, Anchal (2020) Spam Detection in Short Message Service Using Natural Language Processing and Machine Learning Techniques. Masters thesis, Dublin, National College of Ireland.
Abstract
As mobile phone usage grew, use of the Short Message Service (SMS) increased significantly. Because text messages are cheap, people began using them for promotional purposes and unethical activities. As a result, the ratio of spam messages increased exponentially, leading to the loss of personal and financial data. To prevent data loss, it is crucial to detect spam messages as quickly as possible. This research therefore aims to classify spam messages not only efficiently but also with low latency. Machine learning models such as XGBoost, LightGBM, and Bernoulli Naïve Bayes, which are proven to be very fast and to have low time complexity, were implemented. Message length was used as an additional feature, and features were extracted using unigrams, bigrams, and a TF-IDF matrix. Chi-square feature selection was applied to further reduce space complexity. The results showed that, with the TF-IDF matrix, Bernoulli Naïve Bayes achieved the highest accuracy of 96.5% in 0.157 seconds, followed by LightGBM with 95.4% in 1.708 seconds.
Keywords: Spam SMS, Text Classification, Natural Language Processing, Machine Learning, Bernoulli Naïve Bayes, LightGBM, XGBoost
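The classification approach named in the abstract can be illustrated with a minimal sketch, assuming a Bernoulli Naïve Bayes model over binary unigram and bigram presence features. This is not the thesis implementation: the class `BernoulliNB`, the helper `ngrams`, and the toy messages are hypothetical, and the TF-IDF weighting, message-length feature, and chi-square selection described above are omitted for brevity.

```python
import math
from collections import defaultdict


def ngrams(text, n_max=2):
    """Return the set of unigrams (and bigrams) present in a message."""
    tokens = text.lower().split()
    feats = set(tokens)
    if n_max >= 2:
        feats.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return feats


class BernoulliNB:
    """Illustrative Bernoulli Naive Bayes with Laplace smoothing (hypothetical helper)."""

    def fit(self, messages, labels, alpha=1.0):
        self.classes = sorted(set(labels))
        docs = [ngrams(m) for m in messages]
        self.vocab = sorted(set().union(*docs))
        n_docs = {c: 0 for c in self.classes}
        counts = {c: defaultdict(int) for c in self.classes}
        for feats, label in zip(docs, labels):
            n_docs[label] += 1
            for f in feats:
                counts[label][f] += 1
        total = len(labels)
        self.log_prior = {c: math.log(n_docs[c] / total) for c in self.classes}
        # P(feature present | class), smoothed so it is never exactly 0 or 1.
        self.p = {
            c: {f: (counts[c][f] + alpha) / (n_docs[c] + 2 * alpha) for f in self.vocab}
            for c in self.classes
        }
        return self

    def predict(self, message):
        feats = ngrams(message)
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for f in self.vocab:
                p = self.p[c][f]
                # The Bernoulli model also scores the absence of each feature.
                score += math.log(p if f in feats else 1.0 - p)
            scores[c] = score
        return max(scores, key=scores.get)


# Hypothetical toy data, not the dataset used in the thesis.
train_messages = [
    "win a free prize now",
    "free cash claim now",
    "are we meeting for lunch",
    "see you at lunch tomorrow",
]
train_labels = ["spam", "spam", "ham", "ham"]
model = BernoulliNB().fit(train_messages, train_labels)
```

Calling `model.predict(...)` on a new message returns the label with the higher log-posterior. Because the Bernoulli model only needs per-class presence counts, both training and prediction are a single pass over the vocabulary, which is consistent with the low latency the abstract reports for this family of models.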
Item Type: | Thesis (Masters) |
---|---|
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; Q Science > QA Mathematics > Computer software; T Technology > T Technology (General) > Information Technology > Computer software; Q Science > QA Mathematics > Computer software > Mobile Phone Applications; T Technology > T Technology (General) > Information Technology > Computer software > Mobile Phone Applications |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Dan English |
Date Deposited: | 15 Jun 2020 12:27 |
Last Modified: | 15 Jun 2020 12:27 |
URI: | https://norma.ncirl.ie/id/eprint/4286 |