Pawar, Chandan Vijay (2024) InferTextIQ: Multimodal Document Analysis and Question Answering System with Model Selection. Masters thesis, Dublin, National College of Ireland.
PDF (Master of Science), 1MB
PDF (Configuration Manual), 916kB
Abstract
In recent years, AI-driven document analytics has advanced rapidly, with large language models (LLMs) increasingly applied to complex document processing. As organizations face growing volumes of diverse document types, there is an urgent need for sophisticated multimodal analytical tools. Although a few state-of-the-art models, such as GPT-3.5 and Google Gemini, have emerged, a wide gap remains in comparative analyses of their performance across different document formats. This paper addresses this gap by introducing InferTextIQ, a novel multimodal document analysis and question-answering system designed to benchmark GPT-3.5 and Google Gemini in processing complex documents such as PDFs and CSVs.
The findings show that GPT-3.5 achieved an accuracy of 70% in analyzing PDFs, while Gemini achieved 53.33%. In CSV processing, Gemini had a slight advantage, achieving 60% accuracy compared to GPT-3.5's 50%. Both models therefore performed unsatisfactorily in handling textual data in CSV files, which points to an area for improvement in multimodal document analysis. The work informs the research community about the strengths and limitations of GPT-3.5 and Gemini in multimodal document processing and opens the way for further studies on adaptive model selection, domain-specific fine-tuning, and the development of more robust AI-driven multimodal systems for document analysis.
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Qayum, Abdul (email: UNSPECIFIED) |
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Ciara O'Brien |
Date Deposited: | 25 Aug 2025 09:56 |
Last Modified: | 25 Aug 2025 09:56 |
URI: | https://norma.ncirl.ie/id/eprint/8611 |