NORMA eResearch @NCI Library

Text and Image Based Multi-model Fashion Image Retrieval system

Khan, Aafaq Iqbal (2021) Text and Image Based Multi-model Fashion Image Retrieval system. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 1MB
PDF (Configuration manual), 743kB

Abstract

Interactive image retrieval is an emerging research topic that plays a significant role in the success of a wide variety of applications, especially in the fashion domain. However, as fashion product catalogues have grown in size and the number of features per product has increased, it has become more challenging for users to express their needs effectively. In traditional fashion e-stores, users may not be able to specify the details of an outfit accurately using a text-based query. Therefore, we focus on a multi-model image retrieval system that integrates the query image with a text query describing the visual differences between the query image and the search target image. To tackle this task, we investigate a similarity metric between a target image and a candidate image (query image) plus a text query. Both the target images and the query image are encoded into feature vectors with EfficientNet or ResNet-50 (only one at a time), and the caption text is encoded into a text feature vector using an LSTM. We then compose the query image vector and the text feature vector into a single vector, which is expected to be biased toward the target image vector, using the state-of-the-art TIRG method (Vo et al., 2019). The compositional query-based TIRG achieved a higher average recall (29.2) than the other methods: text only (21.93), image only with EfficientNet (8.74), and image only with ResNet-50 (8.75). TIRG also outperforms the text-only, image-only EfficientNet, and image-only ResNet-50 methods in terms of batch-based classification training loss, with a value of 0.192 compared with 0.42 (65% more), 0.91 (79% more), and 0.52 (63% more), respectively.
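
The pipeline described in the abstract (compose an image vector with a text vector, then rank target images by similarity) can be illustrated with a small sketch. This is not the thesis implementation: the real encoders (EfficientNet or ResNet-50, and an LSTM) are replaced by hand-written toy vectors, the gated-residual composition is reduced to single linear layers with hypothetical weights, and cosine similarity and recall at K are assumed as the similarity metric and evaluation measure, since the abstract does not define either precisely. All function and variable names are illustrative.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def compose(image_vec, text_vec, gate_weights, residual_weights):
    """Simplified gated-residual composition in the spirit of TIRG.

    Assumption: one linear layer per branch with hand-set weights;
    the published method uses learned, deeper networks.
    """
    joint = image_vec + text_vec  # list concatenation of both feature vectors
    gate = [sigmoid(dot(row, joint)) for row in gate_weights]
    residual = [dot(row, joint) for row in residual_weights]
    # Gated copy of the image features plus a text-driven residual.
    return [g * x + r for g, x, r in zip(gate, image_vec, residual)]


def rank_targets(query_vec, target_vecs):
    """Return target indices sorted by cosine similarity, best first."""
    q = normalize(query_vec)
    scores = [dot(q, normalize(t)) for t in target_vecs]
    return sorted(range(len(target_vecs)), key=lambda i: scores[i], reverse=True)


def recall_at_k(queries, target_vecs, true_indices, k):
    """Fraction of queries whose true target appears in the top k results."""
    hits = sum(
        1
        for q, true_idx in zip(queries, true_indices)
        if true_idx in rank_targets(q, target_vecs)[:k]
    )
    return hits / len(queries)


# Toy data: 2-d image features and 2-d text features (hypothetical values).
image_vec = [1.0, 0.0]
text_vec = [0.0, 1.0]
gate_weights = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]      # gate = 0.5
residual_weights = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]  # text shifts axis 2

targets = [[1.0, 0.0], [0.4, 1.0], [-1.0, 0.0]]
true_index = 1  # the image the user is actually looking for

composed = compose(image_vec, text_vec, gate_weights, residual_weights)
composed_recall = recall_at_k([composed], targets, [true_index], k=1)
image_only_recall = recall_at_k([image_vec], targets, [true_index], k=1)
```

In this toy setup the image-only query ranks the visually identical target first, while the composed query ranks the intended target first, which is the behaviour the compositional approach aims for.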

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
T Technology > TR Photography
H Social Sciences > HD Industries. Land use. Labor > Specific Industries > Fashion Industry
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Tamara Malone
Date Deposited: 21 Feb 2023 13:25
Last Modified: 21 Feb 2023 13:25
URI: https://norma.ncirl.ie/id/eprint/6201
