NORMA eResearch @NCI Library

Comparative Analysis of Graph Attention Networks and LSTM Models for Enhanced Email Phishing Detection: An Ensemble Approach

Siddiqui, Zaid (2024) Comparative Analysis of Graph Attention Networks and LSTM Models for Enhanced Email Phishing Detection: An Ensemble Approach. Masters thesis, Dublin, National College of Ireland.

Files:
PDF (Master of Science), 955kB
PDF (Configuration Manual), 836kB

Abstract

Phishing attacks are a serious cybersecurity threat that exploits the human factor to obtain sensitive information. Traditional rule-based systems have fallen short in detecting them, and intelligent methods are needed to counter the evolving tactics of phishing attacks. This paper examines machine learning and deep learning models in detail, using the SpamAssassin dataset to establish their performance. A set of models (RF, LSTM, GRU, and GAT) is compared along two axes: with and without metadata augmentation. A key finding is that features derived from metadata substantially improve detection accuracy. The GRU model, with tuned hyperparameters and metadata, achieved an almost perfect F1-score, significantly outperforming the text-only methods. Data balancing techniques such as SMOTE also perform well: they ensure balanced class representation during training, which counteracts the intrinsic class imbalance of phishing datasets. The results are supported by ROC curves, confusion matrices, and feature importance plots, which confirm the gain in model accuracy and robustness provided by metadata integration. Future work will test the system on larger datasets and with hybrid models for real-time detection.
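
The class balancing step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function name, the toy two-feature values, and the parameter choices are all hypothetical, and the code is a simplified pure-Python approximation of the SMOTE idea (creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours).

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Simplified illustration only; hypothetical helper, not a library API."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 6 legitimate emails vs 3 phishing emails,
# each described by two made-up numeric features.
legit = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (0.3, 0.3), (0.2, 0.4), (0.1, 0.0)]
phish = [(0.9, 0.8), (0.8, 1.0), (1.0, 0.9)]

# Generate just enough synthetic phishing samples to match the majority class.
extra = smote_like_oversample(phish, n_new=len(legit) - len(phish), k=2)
balanced_phish = phish + extra
print(len(legit), len(balanced_phish))
```

Because each synthetic point lies on a line segment between two real minority samples, the new samples stay inside the region already occupied by the minority class. Oversampling of this kind is normally applied to the training split only, so that evaluation data contains real samples alone.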

Item Type: Thesis (Masters)
Supervisors: Nagahamulla, Harshani (email: UNSPECIFIED)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > Electronic Mail
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > Electronic Mail
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 05 Sep 2025 10:42
Last Modified: 05 Sep 2025 10:42
URI: https://norma.ncirl.ie/id/eprint/8815
