NORMA eResearch @NCI Library

Document Image Classification using Convolutional Neural Network and Transfer Learning Technique

Ibushe, Asim Arif (2024) Document Image Classification using Convolutional Neural Network and Transfer Learning Technique. Masters thesis, Dublin, National College of Ireland.

Abstract

In today's digital age, efficient document classification and pattern recognition are essential for structuring big data on the server side. This requirement is especially important in domains such as banking and government, where document processing and classification are core tasks. This study proposes an efficient, automated approach that uses deep learning and transfer learning to predict target labels for document images and so helps automate the image classification task. We performed a comparative study on two document image datasets to determine which learning method performs best. The first dataset contains 6 document image target labels; on it we implemented a CNN, which improved accuracy by 10% compared with traditional machine learning. The second dataset, "Tobacco-3482", is a valuable resource containing 3482 document images with varied patterns and structures, organised into 10 directories, one per target label. The research shows an incremental increase in prediction accuracy from traditional methods such as K-Nearest Neighbour and Support Vector Classifier to convolutional and transfer learning techniques. InceptionV3 and VGG-16 were compared using pretrained ImageNet weights. In addition, the proposed system includes a data-deduplication Python script that continuously monitors the integrity of document records and optimises directory storage. According to the post-analysis accuracy scores, InceptionV3 and VGG-16 are the top performers at 40 epochs. Validation accuracy for InceptionV3 and VGG-16 is 79.60% and 81.61% respectively, which shows that VGG-16 (Visual Geometry Group-16) performs best on the "Tobacco-3482" dataset.
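
The abstract mentions a data-deduplication Python script but does not describe how it works. As an illustration only, the sketch below shows one common way such a check could be done. It assumes that duplicates are byte-identical files and that comparing SHA-256 digests is sufficient. The function names (`file_digest`, `find_duplicates`, `remove_duplicates`) are hypothetical and are not taken from the thesis.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-identical.

    Assumption: exact content equality defines a duplicate. Near-duplicate
    scans of the same page would need a perceptual hash instead.
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


def remove_duplicates(root: Path) -> list[Path]:
    """Keep the first file of each duplicate group and delete the rest."""
    removed: list[Path] = []
    for paths in find_duplicates(root):
        for extra in paths[1:]:
            extra.unlink()
            removed.append(extra)
    return removed
```

Calling `find_duplicates(Path("dataset"))` on a hypothetical label-per-directory layout reports the duplicate groups without deleting anything, which is a safer first step than calling `remove_duplicates` directly.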

Item Type: Thesis (Masters)
Supervisors: Agarwal, Bharat (email: UNSPECIFIED)
Uncontrolled Keywords: Image Convolution; Image Classification; Transfer Learning; Data Deduplication; KNN; SVM; Convolutional Neural Network; InceptionV3; VGG-16
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Algebra > Algorithms > Computer algorithms
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 05 Jun 2025 14:43
Last Modified: 05 Jun 2025 14:43
URI: https://norma.ncirl.ie/id/eprint/7766
