Hernández, Edith (2023) An Ensemble model to predict the classification of goods using text description. Masters thesis, Dublin, National College of Ireland.
Abstract
The main purpose of this research is to tackle the challenge of assigning 6-digit HS codes to goods based on their product descriptions. To achieve this, we propose an approach that combines NLP techniques, pre-trained word embeddings and similarity search libraries.
There is a growing need for effective methods to categorise products from large datasets, especially for customs authorities. The experiment aims to improve the accuracy and efficiency of categorising imported goods by leveraging advancements in Natural Language Processing (NLP) and deep learning. The research process involves data collection, analysis, and experimental assessment. Each step follows the CRISP-DM model.
Integrating FAISS into the proposed experiment improves the accuracy of RoBERTa classification, which reaches 80%. By contrast, combining FAISS with DistilBERT classification achieved less than 1%.
The expected outcomes include an understanding of the challenges and possibilities associated with classifying goods, as well as a practical solution that can be applied in various contexts.
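As a rough illustration of the similarity-search idea described in the abstract, the following minimal sketch assigns a code to a new description by finding its nearest labelled neighbour. It is not the thesis implementation: the bag-of-words `embed` function stands in for a pre-trained embedding model, the brute-force `max` stands in for a FAISS index, and the example descriptions, code labels and function names are all hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a pre-trained embedding: bag-of-words token counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical labelled examples (illustrative only, not data from the thesis).
LABELLED = [
    ("raw coffee beans not roasted", "090111"),
    ("cotton knitted t-shirt", "610910"),
    ("smartphone with touch screen", "851713"),
]

def predict_code(description):
    """Return the 6-digit label of the most similar labelled description."""
    query = embed(description)
    best = max(LABELLED, key=lambda item: cosine(query, embed(item[0])))
    return best[1]

print(predict_code("green coffee beans"))
```

In a setup closer to the one the abstract describes, `embed` would presumably be replaced by vectors from a transformer model and the linear scan by an approximate or exact nearest-neighbour index, which matters once the labelled set grows large.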
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Haque, Rejwanul (email: UNSPECIFIED) |
Uncontrolled Keywords: | Natural Language Processing; HS code classification; RoBERTa; Similarity search |
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; H Social Sciences > Economics > Business; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Ciara O'Brien |
Date Deposited: | 08 May 2025 16:11 |
Last Modified: | 08 May 2025 16:11 |
URI: | https://norma.ncirl.ie/id/eprint/7526 |