NORMA eResearch @NCI Library

Enhancing Guava Fruit Disease Detection and Localization through a Hybrid Vision Transformer and Convolutional Neural Network Architecture

Lala, Gokul (2024) Enhancing Guava Fruit Disease Detection and Localization through a Hybrid Vision Transformer and Convolutional Neural Network Architecture. Masters thesis, Dublin, National College of Ireland.


Abstract

This research proposes a novel hybrid architecture that combines vision transformers (ViTs) and convolutional neural networks (CNNs) to address the challenging problem of guava fruit disease detection and localization. The central idea is to exploit the complementary strengths of the two architectures to make disease detection in agriculture more efficient.

The proposed ViT-CNN hybrid model is implemented in PyTorch and developed on a dataset of 6,549 guava fruit images spanning nine disease classes. A pre-trained ViT extracts global features, which are then passed through custom CNN layers for local feature refinement and on to a classification head.
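The ViT-then-CNN design described above can be sketched in PyTorch as follows. This is a minimal illustration, not the thesis implementation: the class name `HybridViTCNN`, the layer widths, the depth, and the 224x224 input size are all assumptions, and a small randomly initialised transformer encoder stands in for the pre-trained ViT backbone so that the example runs without downloading weights.

```python
import torch
import torch.nn as nn


class HybridViTCNN(nn.Module):
    """Illustrative ViT-then-CNN classifier (hypothetical sizes throughout)."""

    def __init__(self, num_classes=9, image_size=224, patch_size=16,
                 embed_dim=192, depth=2, num_heads=3):
        super().__init__()
        self.grid = image_size // patch_size  # patches per image side
        # Patch embedding: a strided convolution cuts the image into tokens.
        self.patch_embed = nn.Conv2d(3, embed_dim, kernel_size=patch_size,
                                     stride=patch_size)
        self.pos_embed = nn.Parameter(torch.zeros(1, self.grid ** 2, embed_dim))
        # Stand-in for the pre-trained ViT: self-attention gives every patch
        # a view of the whole image (global features).
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads,
                                           dim_feedforward=embed_dim * 4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # CNN stage: 3x3 convolutions refine local structure on the token grid.
        self.cnn = nn.Sequential(
            nn.Conv2d(embed_dim, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        # Classification head: one logit per disease class.
        self.head = nn.Linear(64, num_classes)

    def forward(self, x):
        tokens = self.patch_embed(x)                  # (B, D, G, G)
        tokens = tokens.flatten(2).transpose(1, 2)    # (B, G*G, D)
        tokens = self.encoder(tokens + self.pos_embed)
        # Fold the token sequence back into a 2-D feature map for the CNN.
        fmap = tokens.transpose(1, 2).reshape(x.size(0), -1,
                                              self.grid, self.grid)
        feats = self.cnn(fmap).flatten(1)             # (B, 64)
        return self.head(feats)                       # (B, num_classes)


model = HybridViTCNN().eval()
with torch.no_grad():
    logits = model(torch.randn(2, 3, 224, 224))  # two random RGB images
print(tuple(logits.shape))
```

In practice the stand-in encoder would be replaced by a pre-trained ViT backbone that exposes its patch tokens; the reshape into a grid and the CNN stage would stay the same in outline.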

The model reaches a validation accuracy of 95.63%. Per-class F1-scores range from 0.88 to 1.00, indicating good handling of class imbalance, and training converges quickly. The hybrid approach thus overcomes some of the limitations of standalone ViTs in capturing the fine-grained features relevant to disease identification.
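For readers unfamiliar with the metrics quoted above, the sketch below shows how accuracy and per-class F1 are computed from true and predicted labels. The function name `per_class_f1` and the toy labels are hypothetical and use three classes for brevity; they are not data from the thesis.

```python
def per_class_f1(y_true, y_pred, num_classes):
    """Return one F1 score per class: F1 = 2*TP / (2*TP + FP + FN)."""
    scores = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class that never appears and is never predicted scores 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Toy labels for illustration only (not thesis data).
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 0]

f1 = per_class_f1(y_true, y_pred, num_classes=3)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f1, accuracy)
```

Reporting F1 per class alongside overall accuracy matters on imbalanced data, because a rare class with poor recall barely moves the accuracy figure but shows up directly in its own F1 score.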

The work offers a tool for early and accurate guava fruit disease detection and contributes to the broader use of computer vision in agriculture. The experiments suggest that the model performs well across most disease classes, which makes it promising for crop management and yield improvement in guava cultivation. Future work could include multimodal integration, temporal modelling of disease progression, and adaptation to different cultivation conditions.

Item Type: Thesis (Masters)
Supervisors: Yaqoob, Abid
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
S Agriculture > S Agriculture (General)
Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence > Computer vision
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence > Computer vision
H Social Sciences > HD Industries. Land use. Labor > Specific Industries > Food Industry
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 20 Aug 2025 09:56
Last Modified: 20 Aug 2025 09:57
URI: https://norma.ncirl.ie/id/eprint/8585
