NORMA eResearch @NCI Library

Detecting Real Time Phishing URL’s using URL Cleaning Technique

Naidugari, Vikas (2024) Detecting Real Time Phishing URL’s using URL Cleaning Technique. Masters thesis, Dublin, National College of Ireland.

Abstract

This paper seeks to establish the best approach to detecting phishing URLs in a constantly shifting cybersecurity environment, in order to make online operations safer. In this report, URLs are processed and analysed using data classification and feature extraction techniques to detect phishing threats. The dataset contains URL-related features such as URL length, domain structure, and other attributes needed to evaluate phishing. The fundamental purpose of this work is to increase the reliability of phishing filters, which are crucial for preserving confidentiality and users’ trust. The model includes a comprehensive preprocessing stage in which relevant features are derived from raw URLs. Subsequently, Logistic Regression, Support Vector Machine, K-Nearest Neighbors, Random Forest, Decision Tree, and Naïve Bayes models are used to classify websites as either legitimate or phishing. The analysis shows that these algorithms classify phishing and genuine URLs in the processed dataset with high accuracy. The research shows that the feature extraction methods used were successful, and it demonstrates the strong potential of machine learning for enhancing online security through improved phishing detection. This report will be a rich source of ideas for future investigations and applications in this field.
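
The preprocessing stage described in the abstract, where features are derived from raw URLs, might look roughly like the following minimal Python sketch. Only URL length and domain structure are named in the abstract; every other feature here, as well as the function names clean_url and extract_features, are hypothetical illustrations and not the thesis's actual feature set or implementation.

```python
import re
from urllib.parse import urlsplit

# Matches a URL scheme prefix such as "http://" or "ftp://".
SCHEME_PREFIX = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")
# Matches hosts written as dotted-decimal IPv4 addresses.
IPV4_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def clean_url(raw: str) -> str:
    """Trim whitespace and add a default scheme so the host can be parsed.

    Hypothetical cleaning step; the thesis may clean URLs differently.
    """
    url = raw.strip()
    if not SCHEME_PREFIX.match(url):
        url = "http://" + url
    return url


def extract_features(raw: str) -> dict:
    """Derive simple lexical features from one raw URL (illustrative only)."""
    url = clean_url(raw)
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),                      # named in the abstract
        "host_length": len(host),                    # domain structure
        "num_dots_in_host": host.count("."),         # subdomain depth proxy
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": int("@" in url),            # assumed feature
        "host_is_ipv4": int(bool(IPV4_HOST.match(host))),  # assumed feature
        "uses_https": int(parts.scheme == "https"),  # assumed feature
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
        "num_digits": sum(ch.isdigit() for ch in url),
    }


if __name__ == "__main__":
    for sample in ["example.com/login", "http://user@evil-site.example.com"]:
        print(sample, extract_features(sample))
```

Each resulting dictionary could then be turned into a numeric row and passed to any of the classifiers listed in the abstract; which features actually drive the reported accuracy is documented in the thesis itself, not here.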

Item Type: Thesis (Masters)
Supervisors: Mustafa, Raza Ul (email unspecified)
Uncontrolled Keywords: Phishing Detection; URL Cleaning; Machine Learning; Naïve Bayes; Random Forest; Decision Tree
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software > Computer Security
T Technology > T Technology (General) > Information Technology > Computer software > Computer Security
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Cyber Security
Depositing User: Ciara O'Brien
Date Deposited: 30 Jul 2025 13:25
Last Modified: 30 Jul 2025 13:25
URI: https://norma.ncirl.ie/id/eprint/8346
