NORMA eResearch @NCI Library

Improving the detection of email spam filter using LGS-Count model

Bonu, Shiva Prasad (2022) Improving the detection of email spam filter using LGS-Count model. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 1MB
PDF (Configuration manual), 1MB

Abstract

As the web has grown in popularity, both its usage and the volume of data exchanged among end users have increased. In this setting, e-mail has become one of the most secure media for online transactions, both for communication and for transferring data. Its convenience has driven a significant shift away from conventional communication systems. However, the main drawback of e-mail is the spread of unwanted and harmful messages known as spam. Spam messages are deceptive e-mails sent intentionally to harm the end user, so a detection method is needed. Spam is generally detected with machine learning (ML) and natural language processing (NLP) techniques, and this thesis therefore applies TF-IDF and stemming algorithms to identify indicative words and classify messages as spam (unwanted) or ham (valid). The implementation is carried out on the CSDMC 2010 dataset, on which the proposed method is trained and tested. The thesis focuses on developing an enhanced spam detection framework based on a count vectorizer and a TF-IDF vectorizer. Lastly, the classification of spam and ham is evaluated using a range of ML algorithms, and results are reported using ROC curves and confusion matrices.
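
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of how stemming, count vectorization and TF-IDF weighting fit together. It is not the thesis's implementation: the suffix-stripping `stem` function is a toy stand-in for a real stemmer, the three messages are invented rather than taken from CSDMC 2010, and the TF-IDF formula used here (relative term frequency times the natural log of N over document frequency) is one common variant chosen as an assumption.

```python
import math
from collections import Counter

def stem(word):
    """Toy suffix stripper (assumption: stands in for a real stemming algorithm)."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic tokens, and stem them."""
    return [stem(w) for w in text.lower().split() if w.isalpha()]

def count_vectors(docs):
    """Count-vectorizer view: one term -> raw count mapping per message."""
    return [Counter(tokenize(d)) for d in docs]

def tfidf_vectors(docs):
    """TF-IDF view: relative term frequency scaled by log(N / document frequency)."""
    counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: in how many messages each stemmed term occurs.
    doc_freq = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append(
            {t: (f / total) * math.log(n_docs / doc_freq[t]) for t, f in c.items()}
        )
    return vectors

# Invented example messages (hypothetical, not from the thesis dataset).
docs = [
    "click now for free offers",
    "free offer clicking here",
    "meeting notes attached",
]

counts = count_vectors(docs)
tfidf = tfidf_vectors(docs)
```

In this sketch, stemming maps "offers" and "offer" to the same term, so both spam-like messages share features. TF-IDF then gives a term that occurs in only one message, such as "now", a higher weight than a term shared across messages, such as "free". A classifier would be trained on vectors like these.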

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software > Computer Security
T Technology > T Technology (General) > Information Technology > Computer software > Computer Security
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > Electronic Mail
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > Electronic Mail
Divisions: School of Computing > Master of Science in Cyber Security
Depositing User: Clara Chan
Date Deposited: 30 Nov 2022 19:09
Last Modified: 30 Nov 2022 19:09
URI: https://norma.ncirl.ie/id/eprint/5949
