Khan, Aafaq Iqbal (2021) Text and Image Based Multi-model Fashion Image Retrieval system. Masters thesis, Dublin, National College of Ireland.
Abstract
Interactive image retrieval is an emerging research topic that plays a significant role in the success of a wide variety of applications, especially in the fashion domain. However, as fashion product catalogues have grown in size and the number of features per product has increased, it has become more challenging for users to express their needs effectively. In traditional fashion e-stores, users may not be able to specify the details of an outfit accurately with a text-based query alone. Therefore, we focus on a multi-model image retrieval system that integrates the query image with a text query describing the visual differences between the query image and the search target image. To tackle this task, we investigate a similarity metric between a target image and a candidate image (query image) plus a text query. Both the target images and the query image are encoded into feature vectors with Efficient-Net or ResNet-50 (only one at a time), and the caption text is encoded into a text feature vector using an LSTM. We then compose the query image vector and the text feature vector into a single vector, which is expected to be biased toward the target image vector, with the help of the state-of-the-art TIRG (Vo et al., 2019). The compositional query-based TIRG achieved a higher average recall (29.2) than the other methods: text only (21.93), image only with Efficient-Net (8.74), and image only with ResNet-50 (8.75). The TIRG also outperforms the text-only, image-only Efficient-Net, and image-only ResNet-50 methods in terms of batch-based classification training loss, with values of 0.192, 0.42 (65% more), 0.91 (79% more), and 0.52 (63% more), respectively.
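The composition and retrieval steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the image and text encoders (Efficient-Net or ResNet-50, and the LSTM) have already produced fixed-length feature vectors, it uses plain dense layers with random, untrained weights in place of the trained TIRG layers, and every name in it (`GatedResidualComposer`, `rank_candidates`, `recall_at_k`, the weights `w_gate` and `w_res`) is hypothetical. The gated-residual form follows the general idea of TIRG, where a gate modulates the query image feature and a residual term adds the text-driven modification.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def l2_normalize(v, eps=1e-12):
    # Normalize along the last axis so dot products become cosine similarities.
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)


class GatedResidualComposer:
    """TIRG-style composer with random, untrained weights (illustration only)."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        # Two-layer dense maps for the gate and residual paths (assumed shapes).
        self.Wg1 = rng.normal(0.0, 1.0 / np.sqrt(2 * dim), (2 * dim, dim))
        self.Wg2 = rng.normal(0.0, 1.0 / np.sqrt(dim), (dim, dim))
        self.Wr1 = rng.normal(0.0, 1.0 / np.sqrt(2 * dim), (2 * dim, dim))
        self.Wr2 = rng.normal(0.0, 1.0 / np.sqrt(dim), (dim, dim))
        # Scalars balancing the two paths; learnable in a trained model, fixed here.
        self.w_gate = 1.0
        self.w_res = 0.1

    def compose(self, img_feat, txt_feat):
        joint = np.concatenate([img_feat, txt_feat], axis=-1)
        # Gate path: decides how much of each image feature to keep.
        gate = sigmoid(np.maximum(joint @ self.Wg1, 0.0) @ self.Wg2)
        # Residual path: the change suggested by the text query.
        residual = np.maximum(joint @ self.Wr1, 0.0) @ self.Wr2
        return self.w_gate * gate * img_feat + self.w_res * residual


def rank_candidates(query_vec, target_vecs):
    # Sort target indices by descending cosine similarity to the composed query.
    sims = l2_normalize(target_vecs) @ l2_normalize(query_vec)
    return np.argsort(-sims)


def recall_at_k(ranked_lists, true_indices, k):
    # Fraction of queries whose true target appears among the top-k results.
    hits = sum(
        1 for ranked, true in zip(ranked_lists, true_indices)
        if true in list(ranked[:k])
    )
    return hits / len(true_indices)
```

In a trained system the composed vector would be pulled toward the target image vector by the training loss; here the weights are random, so the sketch only shows the data flow from the two feature vectors to a ranked list and a recall score.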
Item Type: | Thesis (Masters) |
---|---|
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; T Technology > TR Photography; H Social Sciences > HD Industries. Land use. Labor > Specific Industries > Fashion Industry |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Tamara Malone |
Date Deposited: | 21 Feb 2023 13:25 |
Last Modified: | 21 Feb 2023 13:25 |
URI: | https://norma.ncirl.ie/id/eprint/6201 |