NORMA eResearch @NCI Library

Enhancing Image Caption Quality using an Ensemble of Deep Learning Models

-, Harisankar (2024) Enhancing Image Caption Quality using an Ensemble of Deep Learning Models. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 738kB
PDF (Configuration Manual), 752kB

Abstract

This research analyses how effectively various CNN architectures can be combined with LSTM models and ensemble methods to enhance image captioning performance. The study examines three CNN architectures (VGG16, ResNet50 and Xception) and their use in CNN-LSTM architectures for captioning tasks. In addition, the ensemble methods Bagging and Boosting were applied on top of these base models. Performance was evaluated with BLEU and METEOR scores. ResNet50 was the best-performing model overall: it attained the highest BLEU and METEOR scores, indicating that its generated captions were the most contextually relevant. VGG16 and Xception also achieved notable results, but both showed weaknesses in handling more complicated multi-word phrases. Bagging appears to have benefited from the diversity of the models, but it introduced grammatical mistakes into the captions, possibly because it combines dissimilar outputs. Boosting, in contrast, produced good results by assigning weights and iterating over the base models, generating captions that are relevant to the images.
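For readers unfamiliar with BLEU, one of the two metrics named in the abstract, the following is a minimal illustrative sketch of how such a score is computed from clipped n-gram precision and a brevity penalty. It is not taken from the thesis, which may well have used a library implementation. The function names (`ngrams`, `modified_precision`, `bleu`) and the default `max_n=2` are assumptions made for brevity; standard BLEU typically uses up to 4-grams, and no smoothing is applied here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in any single reference caption."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=2):
    """Unsmoothed BLEU with uniform weights over 1..max_n grams
    (max_n=2 is an illustrative default, not the thesis setting)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    c = len(candidate)
    # Effective reference length: the reference closest in length to the candidate.
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


if __name__ == "__main__":
    generated = "a dog runs on the grass".split()
    human = ["a dog runs across the grass".split(),
             "the dog is running on grass".split()]
    print(round(bleu(generated, human), 4))
```

The clipping step is what prevents a caption that repeats one common word from scoring well, and the brevity penalty discourages captions that are much shorter than the references.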

Item Type: Thesis (Masters)
Supervisors: Basilio, Jorge (email: UNSPECIFIED)
Uncontrolled Keywords: Convolutional Neural Network; image caption; ensemble methods; Long Short-Term Memory
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 01 Sep 2025 11:48
Last Modified: 01 Sep 2025 11:48
URI: https://norma.ncirl.ie/id/eprint/8664
