NORMA eResearch @NCI Library

PlagCaps: Prediction of Plagiarised Text on a Corpus Dataset using Deep Learning Algorithms

Shukla, Prathmesh (2021) PlagCaps: Prediction of Plagiarised Text on a Corpus Dataset using Deep Learning Algorithms. Masters thesis, Dublin, National College of Ireland.

Abstract

Plagiarism detection in education and research is a challenging and tedious task. Recent machine learning approaches focus mainly on string-level analysis and comparison. An unreliable or faulty plagiarism detection model is of little benefit to the education and research sector. Advances in deep learning, supported by pre-trained models, make it possible to process large text and image datasets quickly. This research proposes two deep learning models for text classification, a state-of-the-art Capsule Network (PlagCap) and a Long Short-Term Memory (LSTM) network, and evaluates them on the IMDB movie reviews and Quora Question Pairs datasets. The Natural Language Toolkit (NLTK) is used for data pre-processing, and GloVe, a pre-trained word vector model, is used for word embedding. Both models perform well on large corpus-based text data. The LSTM achieves an accuracy of 85.70% and the Capsule Network 94.99%, outperforming previous research. These models can be used in real-world applications to improve the accuracy of plagiarism detection techniques.
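
As an illustration of the word-embedding step mentioned in the abstract, the following is a minimal sketch in Python, not the thesis code. It assumes the plain-text layout used by pre-trained word vector files (one token per line followed by its space-separated components). The names `TOY_GLOVE`, `load_vectors`, `tokenize`, and `embed` are hypothetical, the vector values are made up, and the regular-expression tokenizer is a simple stand-in for the toolkit-based pre-processing.

```python
import re

# Toy stand-in for a pre-trained word vector file (assumed layout: a token
# followed by its space-separated components). Values are invented.
TOY_GLOVE = """\
the 0.1 0.2 0.3
movie 0.4 0.5 0.6
great 0.7 0.8 0.9
"""


def load_vectors(text):
    """Parse 'token v1 v2 ...' lines into a {token: [float, ...]} dict."""
    vectors = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def tokenize(sentence):
    """Lowercase the text and keep alphabetic runs (simplified tokenizer)."""
    return re.findall(r"[a-z]+", sentence.lower())


def embed(sentence, vectors, dim, max_len):
    """Turn a sentence into a fixed-size list of word vectors.

    Unknown words map to a zero vector; the sequence is truncated or
    zero-padded to exactly max_len rows.
    """
    rows = [list(vectors.get(tok, [0.0] * dim)) for tok in tokenize(sentence)]
    rows = rows[:max_len]
    rows += [[0.0] * dim for _ in range(max_len - len(rows))]
    return rows


vectors = load_vectors(TOY_GLOVE)
matrix = embed("The movie was GREAT!", vectors, dim=3, max_len=5)
```

A fixed-size matrix of this kind is the usual input to a sequence classifier such as an LSTM or a capsule network; the zero vector for unknown words and the padding length are assumptions of this sketch.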

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
Z Bibliography. Library Science. Information Resources > Z004 Books. Writing. Paleography
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Clara Chan
Date Deposited: 14 Dec 2021 13:56
Last Modified: 14 Dec 2021 13:56
URI: https://norma.ncirl.ie/id/eprint/5222
