NORMA eResearch @NCI Library

E-commerce Product Similarity Match Detection using Product Text and Images

Hari Krishnan, Kannan (2021) E-commerce Product Similarity Match Detection using Product Text and Images. Masters thesis, Dublin, National College of Ireland.

Online merchants and users supply rich data to e-commerce websites every day, and this volume keeps growing, widening the scope for research on product similarity match detection. The text descriptions of two similar products on e-commerce websites may differ only slightly, while their pictures can vary widely. E-commerce platforms have tended to rely on either text-based comparison or image comparison alone, but advances in technology make it possible to explore combining images and title descriptions. This research project uses deep learning methods to detect identical e-commerce products from product titles and images. A deep Residual Network (ResNet) in a Siamese (twin) configuration is used to model the images, and a TF-IDF vectorizer is used to model the text. Experiments with image augmentation, embeddings and text processing are carried out to identify identical products, and a combination of TF-IDF and ResNet-18 serves as the base model. The cross-validation score is used to measure model accuracy, and computational time is recorded to measure model performance. The work evaluates the implementation results and presents a comparison study using real-world data from one of the leading e-commerce providers.
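
To make the text side of the approach concrete, the following is a minimal sketch of TF-IDF matching on product titles, written with the Python standard library only. It is not the thesis implementation: the tokenizer, the smoothed IDF formula, the sample titles, and the helper names (`tfidf_vectors`, `cosine`, `combined_score`) are all assumptions made for illustration. In particular, the abstract does not say how the text and image signals are combined, so the weighted average in `combined_score` is only one plausible choice, and the image similarity it takes is assumed to come from a separate model such as a ResNet-based embedding.

```python
import math
from collections import Counter


def tokenize(title):
    # Assumed tokenizer: lowercase and split on whitespace only.
    return title.lower().split()


def tfidf_vectors(titles):
    """Return one sparse TF-IDF vector (dict of term -> weight) per title."""
    docs = [tokenize(t) for t in titles]
    n_docs = len(docs)
    # Document frequency: the number of titles that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed smoothed IDF variant: log((1 + n) / (1 + df)) + 1.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0
           for term, count in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({term: freq * idf[term] for term, freq in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_score(text_sim, image_sim, text_weight=0.5):
    # Hypothetical fusion rule: a weighted average of the two similarities.
    return text_weight * text_sim + (1.0 - text_weight) * image_sim


# Hypothetical product titles, not taken from the thesis data.
titles = [
    "red cotton shirt men",
    "red cotton shirt for men",
    "wireless bluetooth speaker",
]
vecs = tfidf_vectors(titles)
near_match = cosine(vecs[0], vecs[1])
non_match = cosine(vecs[0], vecs[2])
```

Here `near_match` is higher than `non_match` because the first two titles share most of their terms, while the third shares none. A pair would be flagged as a candidate match when its score exceeds a threshold chosen on validation data.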

Item Type: Thesis (Masters)
Uncontrolled Keywords: e-commerce; ResNet; Siamese; TF-IDF; match detection
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
H Social Sciences > HF Commerce > Electronic Commerce
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Clara Chan
Date Deposited: 06 Dec 2021 10:20
Last Modified: 06 Dec 2021 10:20
