NORMA eResearch @NCI Library

Malicious URL(s) classification

Dsouza, Rohan (2020) Malicious URL(s) classification. Masters thesis, Dublin, National College of Ireland.

Malicious URLs are a serious threat to online security and one of the most fundamental ways to attack an online user. URLs can serve as a primary channel for distributing malware and viruses over the internet, which has created a growing need for URL classification. To protect users, various anti-virus companies rely on blacklisting and block such URLs at the client end. However, millions of malicious URLs are generated every day, and adding them all to a blacklist database is tedious. Furthermore, a blacklist tends to miss newly generated URLs. To address these problems, machine learning has attracted attention in recent years as a way to find hidden patterns in a dataset of URLs. Although it shows promise, it appears to be inefficient when the dataset is extremely large. This motivates the use of big data technologies, where machine learning algorithms are applied in a distributed environment. In this research, we critically compare the performance of traditional machine learning technologies with that of modern distributed machine learning technologies using Spark MLlib. We use Logistic Regression and Support Vector Machine algorithms in our model to determine the credibility of a URL. Our results show that each technique's performance depends on the size of the data it is working on.

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
Divisions: School of Computing > Master of Science in Cloud Computing
Depositing User: Caoimhe Ní Mhaicín
Date Deposited: 23 Mar 2020 15:57
Last Modified: 23 Mar 2020 15:57
